ChatGPT's new browser has potential, if you're willing to pay

themachinestops@lemmy.dbzer0.com · edit-2 25 days ago

ChatGPT's new browser has potential, if you're willing to pay

brucethemoose@lemmy.world · edit-2 25 days ago

Net might get to where you need AI

I hate to say it, but we’re basically there, and AI doesn’t help a ton. If the net is slop and trash, there’s not a lot it can do.

Hopefully by then they will have figured out a way to make it free.

Fortunately self hosting is 100% taking off. Getting a (free) local agent to sift through the net’s sludge will be about as easy as tweaking Firefox before long.

You can already do it. I already do it (and am happy to ramble about how when asked), but it’s more of an enthusiast/tinkerer thing now.

MagicShel@lemmy.zip · 25 days ago

Local is also slower and… less robust in capability. But it’s getting there. I run local AI and I’m really impressed with gains in both. It’s just still a big gap.

We’re headed in a good direction here, but I’m afraid local may be gated by ability to afford expensive hardware.

brucethemoose@lemmy.world · edit-2 25 days ago

Not anymore.

I can run GLM 4.6 on a Ryzen/single RTX 3090 desktop at 7 tokens/s, and it blows lesser API models away. I can run 14-49Bs (or GLM Air) in more utilitarian cases that do just fine.

And I can reach for free/dirt cheap APIs called locally when needed.

But again, it’s all ‘special interest tinkerer’ tier. You can’t do that with ollama run, you have to mess with exotic libraries and tweaked setups and RAG chains to squeeze out that kind of performance. But all that getting simplified is inevitable.

MagicShel@lemmy.zip · 25 days ago

I’ll look into it. OAI’s 30B model is the most I can run in my MacBook and it’s decent. I don’t think I can even run that on my desktop with a 3060 GPU. I have access to GLM 4.6 through a service but that’s the ~350B parameter model and I’m pretty sure that’s not what you’re running at home.

It’s pretty reasonable in capability. I want to play around with setting up RAG pipelines for specific domain knowledge, but I’m just getting started.

brucethemoose@lemmy.world · edit-2 25 days ago

I have access to GLM 4.6 through a service but that’s the ~350B parameter model and I’m pretty sure that’s not what you’re running at home.

It is. I’m running this model, with hybrid CPU+GPU inference, specifically: https://huggingface.co/Downtown-Case/GLM-4.6-128GB-RAM-IK-GGUF

You can likely run GLM Air on your 3060 desktop if you have 48GB+ RAM, or a smaller MoE easily. Heck. I’ll make a quant just for you, if you want.

Depending on the use case, I’d recommend ERNIE 4.5 21B (or 28B for vision) on your Macbook, or a Qwen 30B variant. Look for DWQ MLX quants, specifically: https://huggingface.co/models?sort=modified&search=dwq

MagicShel@lemmy.zip · 25 days ago

I’m going to upgrade my ram shortly because I found a bad stick and I’m down to 16GB currently. I’ll see if I can swing that order this weekend.

brucethemoose@lemmy.world · edit-2 25 days ago

To what?

64G would be good, as that’s enough to fit GLM Air. There are some good 2x64GB 6000Mhz kits for 128GB as well.

MagicShel@lemmy.zip · 25 days ago

I’ll see about 128, then, but I’ll probably do 64. Just depends on cost. Any recs?

brucethemoose@lemmy.world · edit-2 25 days ago

For DDR5? Depends how much you care about latency:

https://pcpartpicker.com/products/memory/#ff=ddr5&Z=131072002&B=1000000000%2C1250000000&sort=price&page=1

The $342 Crucial kit is kinda a no-brainer for 128. Its timings aren’t great when overclocked, but it’s 5600 MHz out of the box, low voltage, and significantly cheaper per gigabyte than many 64GB/96GB kits. See for yourself:

https://www.igorslab.de/en/when-size-is-all-that-matters-crucial-2x-64-gb-ddr5-5600-kit-with-new-micron-32-gbit-ics-in-test-incl-gaming-and-overclocking/4/

The overclockability matters even less if you are on a 7000 series CPU.

I got the 1.25V Flare X5 kit because I wanted tighter timings for sim games, albeit at a MUCH lower price ($390) than it is now.

RAM prices seem to be rising (hence the price of my kit spiked by $200), so now is not a bad time to buy.

MagicShel@lemmy.zip · 25 days ago

I’m have to check. It’s a pro, not air, but I think it’s only 40GB total. I’m really new to Macs so the memory situation is unclear. I requested it at work specifically for its capability to run local AI.

ChatGPT's new browser has potential, if you're willing to pay

ChatGPT's new browser has potential, if you're willing to pay

I tried ChatGPT's Atlas browser to rival Google - here's what I found