• humanspiral@lemmy.caOP
    1 month ago

    int4 would be faster on CPU than fp4. They show benchmarks claiming better "accuracy" (less degradation) than other 4-bit quantization methods, all of which are fp4 variants. int4 and fp4 have the same memory requirement. I don't think they claim that the actual post-quantization transformation process takes fewer resources than the fp4 alternatives.
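
To put rough numbers on the memory point, here is a minimal sketch. It assumes plain symmetric int4 with a single per-tensor scale, which is my own simplification and not the method being discussed; the function names (`weight_bytes`, `quantize_int4`, `pack_int4`) are made up for illustration. The storage cost depends only on the bit width, so the same arithmetic applies to any 4-bit format, fp4 included.

```python
def weight_bytes(n_params: int, bits: int = 4) -> int:
    """Bytes for the packed weights alone (ignores scales and other metadata)."""
    return (n_params * bits + 7) // 8


def quantize_int4(values, scale):
    """Symmetric int4 (assumed scheme): divide by scale, round, clamp to [-8, 7]."""
    return [max(-8, min(7, round(v / scale))) for v in values]


def dequantize_int4(codes, scale):
    """Map int4 codes back to approximate real values."""
    return [c * scale for c in codes]


def pack_int4(codes):
    """Pack two signed 4-bit codes per byte, low nibble first."""
    codes = list(codes)
    if len(codes) % 2:
        codes.append(0)  # pad to an even count
    out = bytearray()
    for lo, hi in zip(codes[0::2], codes[1::2]):
        out.append((lo & 0xF) | ((hi & 0xF) << 4))
    return bytes(out)


codes = quantize_int4([0.0, 0.5, -1.0, 2.0], scale=0.25)
packed = pack_int4(codes)
print(codes, len(packed), weight_bytes(7_000_000_000))
```

The value 2.0 falls outside the representable range at this scale and gets clamped to the top code, which is the kind of error the accuracy benchmarks are measuring.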