I don’t understand why today’s LLMs are so large. Some of the smallest models getting coverage sit at 2.7B parameters, but even that seems pretty big to me.
If you need generalizability, I totally get it. Things like chat applications require a high level of semantic awareness, and the model has to respond in a manner that’s convincing to its users. In cases where you want the LLM to produce something human-like, it makes sense that the brains would need to be a little juiced up.