
Compatibility with llama-cpp-python and add documentation #1779


@0x33taji I've managed to get it working -- the key was using http://localhost:8000/v1 for baseURL

Run command

python3 -m llama_cpp.server \
    --model "<path-to-models>/nous-hermes-2-mixtral-8x7B-dpo/Nous-Hermes-2-Mixtral-8x7B-DPO.Q5_K_M.gguf" \
    --chat_format chatml --n_gpu_layers -1 --n_ctx 8192
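
A quick sanity check before pointing LibreChat at the server is to list the models it exposes over the OpenAI-compatible API. Below is a minimal sketch, assuming the openai Python package (v1+) is installed and the server above is listening on its default port 8000; by default the server does not enforce an API key, so any placeholder string works:

# Sketch: list the models served by llama_cpp.server via its OpenAI-compatible API
from openai import OpenAI

client = OpenAI(base_url="http://localhost:8000/v1", api_key="1234")
for model in client.models.list():
    print(model.id)  # should print the loaded GGUF model name

If this prints the GGUF model name, the /v1 endpoints are up and the same baseURL will work from LibreChat.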

librechat.yaml

# Configuration version (required)
version: 1.0.2

cache: true

# Definition of custom endpoints
endpoints:
  custom:
    - name: "Local LLAMA"
      apiKey: "1234"
      baseURL: "http://localhost:8000/v1"
      models:
        default: ["Nous-Hermes-2-Mixtral-8x7B-DPO.Q5_K_M.gguf"]
        fetch: true
      titleConvo: true
      titleModel: "Nous-Hermes-2-Mixtral-8x7B-DPO.Q5_K_M.gguf"
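
With that configuration, LibreChat talks to the endpoint through standard OpenAI chat completions. To test the server independently of LibreChat, here is a minimal sketch of the equivalent request, again assuming the openai Python package and the server started above:

# Sketch: send a chat completion through the same baseURL LibreChat uses
from openai import OpenAI

client = OpenAI(base_url="http://localhost:8000/v1", api_key="1234")
response = client.chat.completions.create(
    model="Nous-Hermes-2-Mixtral-8x7B-DPO.Q5_K_M.gguf",
    messages=[{"role": "user", "content": "Say hello in one sentence."}],
)
print(response.choices[0].message.content)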

Answer selected by danny-avila

This discussion was converted from issue #1769 on February 12, 2024 08:57.