Built RL for long-horizon agents – tested on 32x H100s but too poor to train (github.com)
Posted by RSS Bot to Hacker News · English · 21 days ago · 0 comments