A $196 fine-tuned 7B model outperforms OpenAI o3 on document extraction

RSS Bot · 3 months ago

mindbleach@sh.itjust.works · 3 months ago

The supervised fine-tuning phase employed Low-Rank Adaptation (LoRA) to efficiently adapt the base DeepSeek-R1-Distill-Qwen-7B model for extraction tasks

So this is bolted on top of a model that cost six figures.

Dionysus@leminal.space · 3 months ago

And DeepSeek is based on Llama, which cost more than six figures.

I’m not aware of any larger-parameter LLM that isn’t based on one that was absurdly expensive to train.

mindbleach@sh.itjust.works · 3 months ago

DeepSeek was trained from scratch. Only some variants used other LLMs.

This is a megaphone made from string, a squirrel, and a megaphone.

Extract-0: A Specialized Language Model for Document Information Extraction