Hi all, tldr; I’ve vibecoded a new library, here’s how it went. Background A while back the NLP community rediscovered a 20-year old clustering algorithm based on compression (see Cilibrasi & Vitanyi 2003), and it made quite a splash because now we don’t have to train neural networks to produce embeedings anymore! Indeed, it works quite well and you can find various implementations online. Many implementations however use an all-to-all distance computation which you can guess to be both slo...
slop