• m532@lemmygrad.ml

    But does it need Flash Attention, Sage Attention, or plain scaled dot-product attention?
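
    For anyone unfamiliar with the third option: scaled dot-product attention is the baseline operation that Flash Attention and Sage Attention are optimized kernels for, computing softmax(QKᵀ/√d_k)V. A minimal pure-Python sketch (illustrative only, not any library's implementation; real backends fuse and tile this for speed):

    ```python
    import math

    def scaled_dot_product_attention(Q, K, V):
        """Baseline attention: softmax(Q K^T / sqrt(d_k)) V,
        with Q, K, V given as lists of row vectors."""
        d_k = len(K[0])
        # scores[i][j] = dot(Q_i, K_j) / sqrt(d_k)
        scores = [[sum(q * k for q, k in zip(qr, kr)) / math.sqrt(d_k)
                   for kr in K] for qr in Q]
        out = []
        for row in scores:
            m = max(row)  # subtract row max before exp for numerical stability
            exps = [math.exp(s - m) for s in row]
            total = sum(exps)
            weights = [e / total for e in exps]
            # weighted sum of value vectors
            out.append([sum(w * v[j] for w, v in zip(weights, V))
                        for j in range(len(V[0]))])
        return out
    ```

    Flash/Sage Attention compute the same result; the question is really which kernel the model's runtime supports or requires, not a different mathematical operation.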