• ikt@aussie.zone · 3 days ago

    The current models depend on massive investment into server farms

    I hate to tell you this, but your knowledge of AI appears to be limited to 2023 ;)

    You missed the entire DeepSeek fiasco, which basically put an end to the “just order more chips” strategy of AI.

    Chinese company DeepSeek made waves earlier this year when it revealed it had built models comparable to OpenAI’s flagship products for a tiny fraction of the training cost. Likewise, researchers from Seattle’s Allen Institute for AI (Ai2) and Stanford University claim to have trained a model for as little as US$50.

    https://theconversation.com/microsoft-cuts-data-centre-plans-and-hikes-prices-in-push-to-make-users-carry-ai-costs-250932

    Or if you’d like to read an absolutely mega article on it: https://stratechery.com/2025/deepseek-faq/

    And no, self-hosted models aren’t going to make up for it. They aren’t as powerful, and more importantly, they will never be able to drive mass market adoption.

    Both Samsung and Apple have on-device AI already. You’ve not seen the Apple ad? https://www.youtube.com/watch?v=iL88A5F9V3k

    They’re only planning more and more features that use it.

    They aren’t as powerful

    We’ve had insane gains in local LLMs since 2022, including but not limited to this from the other day: https://lemmy.world/post/30442991

    And every few weeks a newer and improved model comes out. I’ve never seen tech so amazing or progress so fast as I have with AI.

      • ikt@aussie.zone · 3 days ago

        Why are you linking me to articles I read ages ago?

        Nor is smartphone AI going to do the things people want AI to do. It won’t let the CEO take your job.

        You think AI is only useful if it’s taking someone’s job?

        • frezik@midwest.social · 3 days ago

          Why are you linking me to articles I read ages ago?

          Perhaps because you didn’t understand what they said.

          You think AI is only useful if it’s taking someone’s job?

          It’s why companies are dumping billions into it.

          If the models were actually getting substantially more efficient, we wouldn’t be talking about bringing new nuclear reactors online just to run it.

          • ikt@aussie.zone · 3 days ago

            Perhaps because you didn’t understand what they said.

            They repeat what I said; did you read them? Previously, AI model training was based entirely on simply buying more chips as fast and as hard as possible. DeepSeek changed that.

            From your own article:

            Is it impressive that DeepSeek-V3 cost half as much as Sonnet or 4o to train? I guess so. But OpenAI and Anthropic are not incentivized to save five million dollars on a training run, they’re incentivized to squeeze every bit of model quality they can. DeepSeek are obviously incentivized to save money because they don’t have anywhere near as much.

            https://www.seangoedecke.com/is-deepseek-fast/

            The revelations regarding its cost structure, GPU utilization, and innovative capabilities position DeepSeek as a formidable player.

            https://www.yahoo.com/news/research-exposes-deepseek-ai-training-165025904.html

            ^ FYI, that article you linked to is an AI summary of a semianalysis.com article; maybe AI is useful after all ;)

            If the models were actually getting substantially more efficient, we wouldn’t be talking about bringing new nuclear reactors online just to run it.

            YouTube uses a fuck ton of power but is an incredibly efficient video delivery service.

            The growth and popularity of AI and its uses are simply outpacing the efficiency gains.

            • frezik@midwest.social · 3 days ago

              They repeat what I said; did you read them? Previously, AI model training was based entirely on simply buying more chips as fast and as hard as possible. DeepSeek changed that.

              Yes, and it says exactly what I claimed. DeepSeek is an improvement, but not to the level initially reported. Not even close.

              YouTube uses a fuck ton of power but is an incredibly efficient video delivery service.

              What a colossally stupid thing to say. We’re not looking at starting up new nuclear reactors to run YouTube.

              • ikt@aussie.zone · 3 days ago

                DeepSeek is an improvement, but not to the level initially reported.

                🫠 I cannot be any clearer:

                Previously, AI model training was based entirely on simply buying more chips as fast and as hard as possible. DeepSeek changed that.