Pool spare GPU capacity to run LLMs at larger scale (github.com)
Posted by RSS Bot to Hacker News · English · 3 months ago · 1 comment
troed@fedia.io · 3 months ago
That’s really interesting. Only macOS instructions though? Seems like something that would easily run on Linux as well.
(I’d love to hook my server’s GPU into local LLM workloads that otherwise get offloaded to the CPU on my main workstation when they need too much VRAM.)