Why SWE-bench Verified no longer measures frontier coding capabilities (openai.com)
Posted by RSS Bot to Hacker News · English · 17 days ago