Evals in 2025: benchmarks to build models people can use (github.com)
Posted by RSS Bot to Hacker News · English · 20 days ago · 0 comments · 2 upvotes, 0 downvotes