I prefer Waterfox; OpenAI can keep its Chat chippy tea browser.

  • brucethemoose@lemmy.world
    4 hours ago

    > Net might get to where you need AI

    I hate to say it, but we’re basically there, and AI doesn’t help a ton. If the net is slop and trash, there’s not a lot it can do.

    > Hopefully by then they will have figured out a way to make it free.

    Fortunately, self-hosting is 100% taking off. Getting a (free) local agent to sift through the net’s sludge will be about as easy as tweaking Firefox before long.

    You can already do it. I already do it (and am happy to ramble about how when asked), but for now it’s more of an enthusiast/tinkerer thing.
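
    For the curious, here’s a minimal sketch of the kind of thing I mean, assuming a local OpenAI-compatible server (llama.cpp’s llama-server, TabbyAPI, vLLM, whatever) is already running on localhost; the URLs, model name, and prompt are placeholders, not my exact setup:

    ```python
    # Minimal "sift the sludge" sketch: fetch pages, ask a local model to
    # judge/summarize them. Assumes an OpenAI-compatible server is already
    # running at localhost:8080; URLs and model name are placeholders.
    import requests
    from openai import OpenAI

    client = OpenAI(base_url="http://localhost:8080/v1", api_key="not-needed")

    urls = ["https://example.com/some-article"]  # placeholder reading list

    for url in urls:
        text = requests.get(url, timeout=10).text[:8000]  # crude truncation
        reply = client.chat.completions.create(
            model="local",  # most local servers ignore or remap this name
            messages=[
                {"role": "system",
                 "content": "Summarize the page in three bullets and say "
                            "whether it is substantive or SEO filler."},
                {"role": "user", "content": text},
            ],
        )
        print(url, "\n", reply.choices[0].message.content, "\n")
    ```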

    • MagicShel@lemmy.zip
      4 hours ago

      Local is also slower and… less robust in capability. But it’s getting there. I run local AI, and I’m really impressed with the gains in both speed and capability. It’s just that there’s still a big gap.

      We’re headed in a good direction here, but I’m afraid local may be gated by the ability to afford expensive hardware.

      • brucethemoose@lemmy.world
        4 hours ago

        Not anymore.

        I can run GLM 4.6 on a Ryzen desktop with a single RTX 3090 at 7 tokens/s, and it blows lesser API models away. For more utilitarian cases, I can run 14B–49B models (or GLM Air) that do just fine.

        And when needed, I can reach for free or dirt-cheap APIs, called from the same local tooling.

        But again, it’s all ‘special-interest tinkerer’ tier. You can’t do that with ollama run; you have to mess with exotic libraries, tweaked setups, and RAG chains to squeeze out that kind of performance. But it’s inevitable that all of that gets simplified.
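
        To give a flavor of the “reach for cheap APIs” part, here’s a rough sketch (not my exact setup; the endpoints, model names, and key are placeholders) of preferring the local server and falling back to a hosted OpenAI-compatible endpoint when the local one is down:

        ```python
        # Rough "local first, cheap API as fallback" sketch. Endpoints, model
        # names, and the key are placeholders, not a real configuration.
        from openai import OpenAI

        LOCAL = OpenAI(base_url="http://localhost:8080/v1", api_key="none")
        REMOTE = OpenAI(base_url="https://api.example.com/v1", api_key="YOUR_KEY")

        def ask(prompt: str) -> str:
            backends = ((LOCAL, "glm-4.6-local"), (REMOTE, "some-cheap-hosted-model"))
            for client, model in backends:
                try:
                    out = client.chat.completions.create(
                        model=model,
                        messages=[{"role": "user", "content": prompt}],
                        timeout=120,
                    )
                    return out.choices[0].message.content
                except Exception:
                    continue  # local server busy/offline -> try the next backend
            raise RuntimeError("no backend answered")

        print(ask("Say hi in one sentence."))
        ```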

        • MagicShel@lemmy.zip
          4 hours ago

          I’ll look into it. OAI’s 30B model is the most I can run on my MacBook, and it’s decent. I don’t think I can even run that on my desktop with a 3060 GPU. I have access to GLM 4.6 through a service, but that’s the ~350B-parameter model, and I’m pretty sure that’s not what you’re running at home.

          It’s pretty reasonable in capability. I want to play around with setting up RAG pipelines for specific domain knowledge, but I’m just getting started.
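
          A bare-bones sketch of the kind of RAG pipeline I’m thinking about, assuming a local OpenAI-compatible server and using sentence-transformers for retrieval; the endpoint, model names, and toy documents are placeholders:

          ```python
          # Bare-bones RAG sketch: embed domain docs, retrieve the closest ones,
          # and stuff them into the prompt of a local OpenAI-compatible model.
          # Endpoint, model names, and the toy documents are placeholders.
          from openai import OpenAI
          from sentence_transformers import SentenceTransformer, util

          docs = [
              "Widget v2 requires firmware 1.4 or later.",
              "The warranty covers water damage for 90 days.",
          ]  # placeholder domain knowledge

          embedder = SentenceTransformer("all-MiniLM-L6-v2")
          doc_vecs = embedder.encode(docs, convert_to_tensor=True)
          client = OpenAI(base_url="http://localhost:8080/v1", api_key="none")

          def answer(question: str, k: int = 2) -> str:
              q_vec = embedder.encode(question, convert_to_tensor=True)
              hits = util.semantic_search(q_vec, doc_vecs, top_k=k)[0]
              context = "\n".join(docs[h["corpus_id"]] for h in hits)
              out = client.chat.completions.create(
                  model="local",
                  messages=[
                      {"role": "system",
                       "content": "Answer using only this context:\n" + context},
                      {"role": "user", "content": question},
                  ],
              )
              return out.choices[0].message.content

          print(answer("Does the warranty cover water damage?"))
          ```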