Apologies if this has been answered previously, or has an implied answer. Do we need to build separate Core ML versions of quantized models, or can we use quantized models via their .bin files once the Core ML model has already been built for the corresponding unquantized model? For reference, the following did not seem to accomplish anything:
CoreML is used only for the encoder; decoding always runs through GGML. CoreML doesn't support these quantization formats, but because the decoder runs through GGML, it is still feasible to run the decoder from a quantized model. In other words:
Unquantized CoreML model + quantized GGML model
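A minimal sketch of that combination, assuming whisper.cpp derives the Core ML encoder path from the ggml model filename (replacing .bin with -encoder.mlmodelc); the model names and paths below are illustrative:

```sh
# Core ML encoder built from the unquantized base.en model lives at:
#   models/ggml-base.en-encoder.mlmodelc
# whisper.cpp looks for <ggml model name>-encoder.mlmodelc next to the
# .bin file, so give the quantized model a matching link to the same encoder:
ln -s ggml-base.en-encoder.mlmodelc models/ggml-base.en-q5_0-encoder.mlmodelc

# Run with the quantized GGML model: the encoder goes through Core ML,
# the quantized decoder goes through GGML.
./main -m models/ggml-base.en-q5_0.bin -f samples/jfk.wav
```

A copy works as well as a symlink; note that .mlmodelc is a directory, so use cp -r if copying.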