• brucethemoose@lemmy.world
    link
    fedilink
    English
    arrow-up
    1
    ·
    edit-2
    2 hours ago

    Alt attention is here.

    I wonder what OpenAI/Claude are using internally these days? GTP-OSS was just sliding window attention (a relatively primitive mechanism).