From edea2e7c3a4cf4e6a6034f3c855834429ff6fba1 Mon Sep 17 00:00:00 2001
From: Ettore Di Giacinto
Date: Sun, 14 Jul 2024 12:16:04 +0200
Subject: [PATCH] docs: add a note on benchmarks (#2857)

Add a note on LocalAI defaults and benchmarks in our FAQ section.

See also https://github.com/mudler/LocalAI/issues/2780

Signed-off-by: Ettore Di Giacinto
---
 docs/content/docs/faq.md | 4 ++++
 1 file changed, 4 insertions(+)

diff --git a/docs/content/docs/faq.md b/docs/content/docs/faq.md
index 9b2a54792ce0..c1dc24ec7759 100644
--- a/docs/content/docs/faq.md
+++ b/docs/content/docs/faq.md
@@ -16,6 +16,10 @@ Here are answers to some of the most common questions.
 
 Most gguf-based models should work, but newer models may require additions to the API. If a model doesn't work, please feel free to open up issues. However, be cautious about downloading models from the internet directly onto your machine, as there may be security vulnerabilities in llama.cpp or ggml that could be maliciously exploited. Some models can be found on Hugging Face: https://huggingface.co/models?search=gguf, and models from gpt4all are compatible too: https://github.com/nomic-ai/gpt4all.
 
+### Benchmarking LocalAI and llama.cpp shows different results!
+
+LocalAI applies a set of defaults when loading models with the llama.cpp backend. One of these defaults is mirostat sampling: while it tends to produce better results, it also slows down inference. You can disable it by setting `mirostat: 0` in the model config file. See the advanced section ({{%relref "docs/advanced/advanced-usage" %}}) for more information, as well as [this issue](https://github.com/mudler/LocalAI/issues/2780).
+
 ### What's the difference with Serge, or XXX?
 
 LocalAI is a multi-model solution that doesn't focus on a specific model type (e.g., llama.cpp or alpaca.cpp); it handles all of these internally for faster inference and is easy to set up locally and deploy to Kubernetes.
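
For illustration, disabling the mirostat default in a model config file might look like the sketch below. Only `mirostat: 0` is the setting discussed in this patch; the other fields and file name are hypothetical placeholders, so check the advanced-usage docs for the exact schema.

```yaml
# Sketch of a LocalAI model config (e.g. models/my-model.yaml).
# Only `mirostat: 0` is the setting discussed in this patch; the
# remaining fields are illustrative placeholders.
name: my-model          # name the model is served under (hypothetical)
parameters:
  model: my-model.gguf  # gguf file loaded by the llama.cpp backend (hypothetical)
mirostat: 0             # disable LocalAI's mirostat sampling default
```

With a config like this, benchmark numbers should be more directly comparable to a plain llama.cpp run that uses the same sampling settings.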