But they admit this boost is only seen in ‘an obscure filter’.

  • hisao@ani.social
    English
    39 upvotes
    6 days ago

    AVX512, SIMD

    It’s not just “handwritten assembly”, it’s all intrinsics, again. That’s the reason a lot of tech that needs fast matrix algebra (or fast numeric math in general) tries to use the same small set of libraries, each tightly optimized for those instruction sets.
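
To make the intrinsics point concrete, here is a rough sketch of the same float reduction written as a plain loop and with SIMD intrinsics. It assumes plain AVX (256-bit, eight floats per register) rather than AVX512 so it builds on more machines, and the function names `sum_scalar` and `sum_simd` are made up for illustration, not taken from any library. Without AVX enabled at compile time (e.g. `-mavx` on GCC/Clang) it falls back to the scalar loop.

```c
#include <stddef.h>
#if defined(__AVX__)
#include <immintrin.h>
#endif

/* Plain scalar loop: one float addition per iteration. */
float sum_scalar(const float *x, size_t n) {
    float acc = 0.0f;
    for (size_t i = 0; i < n; i++) {
        acc += x[i];
    }
    return acc;
}

/* Same reduction with AVX intrinsics: eight floats per addition.
   Hypothetical helper; falls back to sum_scalar when AVX is not enabled. */
float sum_simd(const float *x, size_t n) {
#if defined(__AVX__)
    __m256 acc = _mm256_setzero_ps();      /* eight running partial sums */
    size_t i = 0;
    for (; i + 8 <= n; i += 8) {
        /* unaligned load of 8 floats, added lane by lane */
        acc = _mm256_add_ps(acc, _mm256_loadu_ps(x + i));
    }
    float lanes[8];
    _mm256_storeu_ps(lanes, acc);          /* spill the partial sums */
    float total = 0.0f;
    for (int k = 0; k < 8; k++) {
        total += lanes[k];
    }
    for (; i < n; i++) {                   /* leftover tail elements */
        total += x[i];
    }
    return total;
#else
    return sum_scalar(x, n);
#endif
}
```

Note the SIMD version adds in a different order, so on arbitrary floats the two results can differ in the last bits. Handling details like that, plus alignment, tails, and per-CPU dispatch, is a big part of why people lean on a few well-tuned libraries instead of rolling their own.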