• ikt@aussie.zone · 3 days ago

    The current models depend on massive investment into server farms

    I hate to tell you this, but your knowledge of AI appears to be limited to 2023 ;)

    You missed the entire DeepSeek fiasco, which basically put an end to the “just order more chips” strategy of AI.

    Chinese company DeepSeek made waves earlier this year when it revealed it had built models comparable to OpenAI’s flagship products for a tiny fraction of the training cost. Likewise, researchers from Seattle’s Allen Institute for AI (Ai2) and Stanford University claim to have trained a model for as little as US$50.

    https://theconversation.com/microsoft-cuts-data-centre-plans-and-hikes-prices-in-push-to-make-users-carry-ai-costs-250932

    Or if you’d like to read an absolutely mega article on it: https://stratechery.com/2025/deepseek-faq/

    And no, self-hosted models aren’t going to make up for it. They aren’t as powerful, and more importantly, they will never be able to drive mass market adoption.

    Both Samsung and Apple have on-device AI already. You’ve not seen the Apple ad? https://www.youtube.com/watch?v=iL88A5F9V3k

    They’re only planning more and more features that use it.

    They aren’t as powerful

    We’ve had insane gains in local LLMs since 2022, including but not limited to this from the other day: https://lemmy.world/post/30442991

    And every few weeks a newer and improved model comes out. I’ve never seen tech so amazing or progress so fast as I have with AI.

      • ikt@aussie.zone · 3 days ago

        Why are you linking me to articles I read ages ago?

        Nor is smartphone AI going to do the things people want AI to do. It won’t let the CEO take your job.

        You think AI is only useful if it’s taking someone’s job?

        • frezik@midwest.social · 3 days ago

          Why are you linking me to articles I read ages ago?

          Perhaps because you didn’t understand what they said.

          You think AI is only useful if it’s taking someone’s job?

          It’s why companies are dumping billions into it.

          If the models were actually getting substantially more efficient, we wouldn’t be talking about bringing new nuclear reactors online just to run it.

          • ikt@aussie.zone · 3 days ago

            Perhaps because you didn’t understand what they said.

            They repeat what I said; did you read them? Previously, AI model training was based entirely on simply buying more chips as fast and as hard as possible. DeepSeek changed that.

            From your own article:

            Is it impressive that DeepSeek-V3 cost half as much as Sonnet or 4o to train? I guess so. But OpenAI and Anthropic are not incentivized to save five million dollars on a training run, they’re incentivized to squeeze every bit of model quality they can. DeepSeek are obviously incentivized to save money because they don’t have anywhere near as much.

            https://www.seangoedecke.com/is-deepseek-fast/

            The revelations regarding its cost structure, GPU utilization, and innovative capabilities position DeepSeek as a formidable player.

            https://www.yahoo.com/news/research-exposes-deepseek-ai-training-165025904.html

            ^ FYI, that article you linked to is an AI summary of a semianalysis.com article; maybe AI is useful after all ;)

            If the models were actually getting substantially more efficient, we wouldn’t be talking about bringing new nuclear reactors online just to run it.

            YouTube uses a fuck ton of power but is an incredibly efficient video delivery service.

            The growth and popularity of AI and its uses are simply outpacing the efficiency gains.

            • frezik@midwest.social · 3 days ago

              They repeat what I said; did you read them? Previously, AI model training was based entirely on simply buying more chips as fast and as hard as possible. DeepSeek changed that.

              Yes, and it says exactly what I claimed. DeepSeek is an improvement, but not to the level initially reported. Not even close.

              YouTube uses a fuck ton of power but is an incredibly efficient video delivery service.

              What a colossally stupid thing to say. We’re not looking at starting up new nuclear reactors to run YouTube.

              • ikt@aussie.zone · 3 days ago

                DeepSeek is an improvement, but not to the level initially reported.

                🫠 I cannot be any clearer:

                Previously, AI model training was based entirely on simply buying more chips as fast and as hard as possible. DeepSeek changed that.