Apologies if this has been answered previously, or has an implied answer. Do we need to build separate Core ML versions of quantized models, or can we use quantized models via their .bin files once the Core ML model has already been built for the corresponding unquantized model? For reference, the following did not seem to accomplish anything:
CoreML is used only for the encoder; decoding always runs through GGML. CoreML doesn't support these quantization formats, but because the decoder runs through GGML, it is still feasible to run the decoder from a quantized model. In other words:
Unquantized CoreML model + quantized GGML model
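A minimal sketch of that combination, assuming whisper.cpp derives the Core ML encoder path from the ggml model filename (replacing .bin with -encoder.mlmodelc); the model names and paths below are illustrative:

```sh
# Core ML encoder built from the unquantized base.en model lives at:
#   models/ggml-base.en-encoder.mlmodelc
# whisper.cpp looks for <ggml model name>-encoder.mlmodelc next to the
# .bin file, so give the quantized model a matching link to the same encoder:
ln -s ggml-base.en-encoder.mlmodelc models/ggml-base.en-q5_0-encoder.mlmodelc

# Run with the quantized GGML model: the encoder goes through Core ML,
# the quantized decoder goes through GGML.
./main -m models/ggml-base.en-q5_0.bin -f samples/jfk.wav
```

A copy works as well as a symlink; note that .mlmodelc is a directory, so use cp -r if copying.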