BERT Is Just a Single Text Diffusion Step

nathan.rs

A while back, Google DeepMind unveiled Gemini Diffusion, an experimental language model that generates text using diffusion. Unlike traditional GPT-style models, which generate one token at a time, Gemini Diffusion produces whole blocks of text by refining random noise step by step. I read the paper Large Language Diffusion Models and was surprised to find that discrete language diffusion is just a generalization of masked language modeling (MLM), something we’ve been doing since 2018. My first thought was, “Can we fine-tune a BERT-like model to do text generation?” Out of curiosity, I decided to try a quick proof of concept.
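
To make the "generalization of MLM" claim concrete, here is a minimal sketch of the corruption step only, not the code from the post or the paper. It assumes a simplified view: BERT-style MLM masks tokens at one fixed rate (about 15% in the original BERT setup, with the 80/10/10 replacement rule omitted here), while masked text diffusion samples a fresh masking rate for every training example. The names `mask_tokens`, `bert_style_corruption`, and `diffusion_style_corruption` are hypothetical helpers for illustration.

```python
import random

MASK = "[MASK]"  # placeholder mask symbol, chosen for illustration


def mask_tokens(tokens, mask_rate, rng):
    """Independently replace each token with MASK with probability mask_rate."""
    return [MASK if rng.random() < mask_rate else tok for tok in tokens]


def bert_style_corruption(tokens, rng):
    # Assumption: a single fixed masking rate, roughly what BERT-style MLM uses.
    return mask_tokens(tokens, 0.15, rng)


def diffusion_style_corruption(tokens, rng):
    # Assumption: the noise level t is drawn uniformly per example, so the model
    # sees everything from lightly masked to almost fully masked text.
    t = rng.random()
    return t, mask_tokens(tokens, t, rng)


if __name__ == "__main__":
    rng = random.Random(0)
    tokens = "the quick brown fox jumps over the lazy dog".split()

    print("fixed rate:", bert_style_corruption(tokens, rng))
    for _ in range(3):
        t, noisy = diffusion_style_corruption(tokens, rng)
        print(f"t={t:.2f}:", noisy)
```

In both cases the model is trained to predict the original tokens at the masked positions. Seen this way, the fixed-rate version is one point on the range of noise levels that the diffusion version covers, which is the sense in which BERT resembles a single diffusion step.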
