Convert: mixed k-quant with legacy quant fallback #447

stduhpf · 2024-10-25T13:33:22Z

Adds a new CLI argument: --fallback-type.

If tensors cannot be quantized to a k-quant because of block size issues, the fallback type will be used instead of full precision.
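A minimal sketch of the selection rule described above, not the code from this PR. It assumes the "block size issue" is a row length that is not a multiple of the type's block size (32 weights for legacy quants, 256 for k-quants in ggml). The `pick_type` helper, its parameters, and the final f16 fallthrough are hypothetical and only for illustration.

```python
# Weights per block in ggml: legacy quants use 32, k-quants use 256.
BLOCK_SIZE = {"f16": 1, "q4_0": 32, "q8_0": 32, "q4_k": 256, "q6_k": 256}


def pick_type(row_len, wanted, fallback, full_precision="f16"):
    """Hypothetical helper: choose the storage type for one tensor.

    Try the requested type first, then the fallback type, and keep the
    tensor at full precision only if neither block size divides the row.
    """
    for candidate in (wanted, fallback):
        if row_len % BLOCK_SIZE[candidate] == 0:
            return candidate
    return full_precision


print(pick_type(4096, "q4_k", "q4_0"))  # q4_k: 4096 is a multiple of 256
print(pick_type(2432, "q4_k", "q4_0"))  # q4_0: multiple of 32 but not of 256
print(pick_type(100, "q4_k", "q4_0"))   # f16: neither block size fits
```

Without a fallback type, the second tensor would stay at full precision, which is where most of the file size would go.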

Very useful for SD3.5 models, because 90% of SD3.5 8B weights can't be quantized to k-quants.

--type q4_k --fallback-type q4_0 always has exactly the same output size as --type q4_0, but with less degradation.
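The size claim follows from the block layouts: in ggml, a q4_0 block stores 32 weights in 18 bytes and a q4_K super-block stores 256 weights in 144 bytes, so both cost 4.5 bits per weight. A quick check of that arithmetic (the `LAYOUT` table and helper names are illustrative, not part of the converter):

```python
# (weights per block, bytes per block) for the two ggml types compared above.
LAYOUT = {"q4_0": (32, 18), "q4_k": (256, 144)}


def bits_per_weight(type_name):
    """Average storage cost of one weight, in bits."""
    weights, nbytes = LAYOUT[type_name]
    return nbytes * 8 / weights


def tensor_bytes(n_weights, type_name):
    """Bytes needed for a tensor whose size is a multiple of the block size."""
    weights, nbytes = LAYOUT[type_name]
    assert n_weights % weights == 0, "tensor does not fit this block size"
    return n_weights // weights * nbytes


print(bits_per_weight("q4_0"))  # 4.5
print(bits_per_weight("q4_k"))  # 4.5
print(tensor_bytes(4096 * 4096, "q4_0") == tensor_bytes(4096 * 4096, "q4_k"))  # True
```

Since every tensor ends up at 4.5 bits per weight under either type, mixing the two does not change the total file size.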

Somewhat addresses #446

stduhpf · 2024-10-25T14:38:47Z

I'm currently uploading quantized weights to HF, but on my cellular data connection it's taking a very long time.

here:

Convert: mixed k-quant with legacy fallback

63c10d1
