The Brain module for Linguflex handles natural language processing, using both local language models and OpenAI's models.
This module supports:

- Generating textual responses based on user input.
- Integrating both local language models and OpenAI's GPT models.
- Handling conversation history and state.

Example input: "Tell me a joke about computers."
### OpenAI API Key

To use OpenAI models with this module, an OpenAI API key must be set as the environment variable `OPENAI_API_KEY`.
To obtain an API key:

- Create an account on the OpenAI signup page.
- Click on your name at the top right and select "View API keys".
- Click on "Create new secret key" and generate a new API key.
### Switching to a Local Language Model

To switch from OpenAI's models to a local language model, follow these steps (a configuration sketch follows the list):

1. **Enable the local LLM:** Set the parameter `local_llm/use_local_llm` to `true` in the `settings.yaml` file.
2. **Choose a provider:** Select between two providers for local models: Ollama and Llama.cpp.
   - Ollama: Recommended for faster inference. Install it from Ollama's website.
   - Llama.cpp: If you choose this provider, download the model you intend to use and place it in the directory specified under `local_llm/model_path`.
3. **Configure the provider:**
   - Set the provider in the `settings.yaml` file under `local_llm/model_provider`. Allowed values are `"ollama"` or `"llama.cpp"`.
   - Specify the model name under `local_llm/model_name`. For instance, use `"Starling-LM-7B-beta-Q8_0.gguf"` for Llama.cpp or `"llama3"` for Ollama.
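As a sketch, the relevant part of `settings.yaml` for a local model could look like this. The model names are the illustrative examples from the steps above, and the `models/` directory is an assumed example path:

```yaml
local_llm:
  use_local_llm: true        # switch from OpenAI to a local model
  model_provider: "ollama"   # or "llama.cpp"
  model_name: "llama3"       # e.g. "Starling-LM-7B-beta-Q8_0.gguf" for llama.cpp
  model_path: "models/"      # assumed example; directory llama.cpp searches for the model file
```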
### Configuration

Configure the Brain module by editing the `settings.yaml` file.

#### Section: general (at the top)

- `openai_model`: Specifies the OpenAI model to use.
- `max_history_messages`: Maximum number of history messages to keep.
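For instance, the top of `settings.yaml` might contain entries like the following (assuming the section is keyed `general` as named above; the values shown are illustrative assumptions, not shipped defaults):

```yaml
general:
  openai_model: "gpt-4"      # OpenAI model used for responses
  max_history_messages: 12   # conversation turns kept in the history
```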
#### Section: local_llm

- `gpu_layers`: Number of GPU layers to use.
- `model_path`: Path to the local language model directory.
- `model_name`: Filename of the local language model.
- `max_retries`: Maximum number of retries for LLM requests.
- `context_length`: Context length for the language model.
- `max_tokens`: Maximum tokens in a single response.
- `repeat_penalty`: Penalty for repeated content.
- `temperature`: Controls randomness.
- `top_p`: Top probability for token selection.
- `top_k`: Top K tokens to consider for generation.
- `tfs_z`: Tail-free sampling parameter.
- `mirostat_mode`: Mirostat sampling mode.
- `mirostat_tau`: Mirostat time constant.
- `mirostat_eta`: Mirostat learning rate.
- `verbose`: Verbose logging.
- `threads`: Number of threads for processing.
- `rope_freq_base`: RoPE frequency base.
- `rope_freq_scale`: RoPE frequency scale.
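As a sketch, the main tuning parameters in this section might be set like this (the values are assumptions for illustration; tune them to your hardware and model):

```yaml
local_llm:
  gpu_layers: 17          # layers offloaded to the GPU; 0 for CPU-only
  context_length: 2048    # prompt + response window in tokens
  max_tokens: 1024        # cap on a single response
  temperature: 0.8        # higher values produce more random output
  top_p: 1.0
  top_k: 0
  repeat_penalty: 1.2
  threads: 6              # CPU threads used for inference
```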
#### Section: see

- `vision_model`: Specifies the vision model for image-related tasks.
- `vision_max_tokens`: Maximum tokens for vision tasks.
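For example (the model name and token limit are illustrative assumptions):

```yaml
see:
  vision_model: "gpt-4-vision-preview"   # assumed example vision model
  vision_max_tokens: 1000                # cap on tokens for image-related replies
```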