Lemmy: Bestiverse
RSS Bot [M][B] to Hacker News · English · 2 days ago

Our LLM-controlled office robot can't pass butter

andonlabs.com


Butter-Bench: Evaluating LLM Controlled Robots for Practical Intelligence | Andon Labs
andonlabs.com
Can LLMs control robots? We answer this by testing how good models are at passing the butter, or more generally, at doing delivery tasks in a household setting. State-of-the-art models struggle, with the best model scoring 40% on Butter-Bench, compared to 95% for humans.


Hacker News

hackernews

Community locked: only moderators can create posts. You can still comment on posts.

Posts from the RSS Feed of HackerNews.

The feed sometimes contains ads and posts that have been removed by the mod team at HN.

Visibility: Public

This community can be federated to other instances and be posted/commented in by their users.

  • 325 users / day
  • 1.33K users / week
  • 3.65K users / month
  • 9.45K users / 6 months
  • 2 local subscribers
  • 2.9K subscribers
  • 34.7K Posts
  • 15K Comments
  • mods: patrick, RSS Bot