Releases: NexaAI/nexa-sdk
v0.0.8.9-cu124
Improvements 🚀
- Added multiprocessing support to speed up model evaluation tasks (#175)
  - Use the --num_workers flag to specify the number of parallel processes.
  - Example:
    nexa eval phi3 --tasks ifeval --num_workers 4
- Added support for Python 3.13 (#172)
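A reasonable starting point for --num_workers is to derive it from the machine's CPU count. The sketch below composes the eval command shown above; the halving heuristic is an illustrative assumption, not an SDK recommendation:

```python
import os
import shlex

# Leave headroom for the main process: use roughly half the CPU cores.
num_workers = max(1, (os.cpu_count() or 1) // 2)

# Compose the eval invocation from the release notes; shlex.join keeps quoting safe.
cmd = shlex.join(["nexa", "eval", "phi3", "--tasks", "ifeval",
                  "--num_workers", str(num_workers)])
print(cmd)
```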
Upgrade Guide 📝
To upgrade the Nexa SDK, use the command for your system:
CPU
pip install -U nexaai --prefer-binary --index-url https://nexaai.github.io/nexa-sdk/whl/cpu --extra-index-url https://pypi.org/simple --no-cache-dir
GPU (Metal)
For the GPU version supporting Metal (macOS):
CMAKE_ARGS="-DGGML_METAL=ON -DSD_METAL=ON" pip install -U nexaai --prefer-binary --index-url https://nexaai.github.io/nexa-sdk/whl/metal --extra-index-url https://pypi.org/simple --no-cache-dir
GPU (CUDA)
For Linux:
CMAKE_ARGS="-DGGML_CUDA=ON -DSD_CUBLAS=ON" pip install -U nexaai --prefer-binary --index-url https://nexaai.github.io/nexa-sdk/whl/cu124 --extra-index-url https://pypi.org/simple --no-cache-dir
For Windows PowerShell:
$env:CMAKE_ARGS="-DGGML_CUDA=ON -DSD_CUBLAS=ON"; pip install -U nexaai --prefer-binary --index-url https://nexaai.github.io/nexa-sdk/whl/cu124 --extra-index-url https://pypi.org/simple --no-cache-dir
For Windows Command Prompt:
set CMAKE_ARGS="-DGGML_CUDA=ON -DSD_CUBLAS=ON" & pip install -U nexaai --prefer-binary --index-url https://nexaai.github.io/nexa-sdk/whl/cu124 --extra-index-url https://pypi.org/simple --no-cache-dir
For Windows Git Bash:
CMAKE_ARGS="-DGGML_CUDA=ON -DSD_CUBLAS=ON" pip install -U nexaai --prefer-binary --index-url https://nexaai.github.io/nexa-sdk/whl/cu124 --extra-index-url https://pypi.org/simple --no-cache-dir
GPU (ROCm)
For Linux:
CMAKE_ARGS="-DGGML_HIPBLAS=on" pip install -U nexaai --prefer-binary --index-url https://nexaai.github.io/nexa-sdk/whl/rocm621 --extra-index-url https://pypi.org/simple --no-cache-dir
For detailed installation instructions, please refer to the Installation section in the README.
v0.0.8.9
Improvements 🚀
- Added multiprocessing support to speed up model evaluation tasks (#175)
  - Use the --num_workers flag to specify the number of parallel processes.
  - Example:
    nexa eval phi3 --tasks ifeval --num_workers 4
- Added support for Python 3.13 (#172)
Upgrade Guide 📝
To upgrade the Nexa SDK, use the command for your system:
CPU
pip install -U nexaai --prefer-binary --index-url https://nexaai.github.io/nexa-sdk/whl/cpu --extra-index-url https://pypi.org/simple --no-cache-dir
GPU (Metal)
For the GPU version supporting Metal (macOS):
CMAKE_ARGS="-DGGML_METAL=ON -DSD_METAL=ON" pip install -U nexaai --prefer-binary --index-url https://nexaai.github.io/nexa-sdk/whl/metal --extra-index-url https://pypi.org/simple --no-cache-dir
GPU (CUDA)
For Linux:
CMAKE_ARGS="-DGGML_CUDA=ON -DSD_CUBLAS=ON" pip install -U nexaai --prefer-binary --index-url https://nexaai.github.io/nexa-sdk/whl/cu124 --extra-index-url https://pypi.org/simple --no-cache-dir
For Windows PowerShell:
$env:CMAKE_ARGS="-DGGML_CUDA=ON -DSD_CUBLAS=ON"; pip install -U nexaai --prefer-binary --index-url https://nexaai.github.io/nexa-sdk/whl/cu124 --extra-index-url https://pypi.org/simple --no-cache-dir
For Windows Command Prompt:
set CMAKE_ARGS="-DGGML_CUDA=ON -DSD_CUBLAS=ON" & pip install -U nexaai --prefer-binary --index-url https://nexaai.github.io/nexa-sdk/whl/cu124 --extra-index-url https://pypi.org/simple --no-cache-dir
For Windows Git Bash:
CMAKE_ARGS="-DGGML_CUDA=ON -DSD_CUBLAS=ON" pip install -U nexaai --prefer-binary --index-url https://nexaai.github.io/nexa-sdk/whl/cu124 --extra-index-url https://pypi.org/simple --no-cache-dir
GPU (ROCm)
For Linux:
CMAKE_ARGS="-DGGML_HIPBLAS=on" pip install -U nexaai --prefer-binary --index-url https://nexaai.github.io/nexa-sdk/whl/rocm621 --extra-index-url https://pypi.org/simple --no-cache-dir
For detailed installation instructions, please refer to the Installation section in the README.
v0.0.8.8-rocm621
Improvements 🚀
- The nexa eval command now supports evaluating memory usage, latency, and energy consumption (#166)
Upgrade Guide 📝
To upgrade the Nexa SDK, use the command for your system:
CPU
pip install -U nexaai --prefer-binary --index-url https://nexaai.github.io/nexa-sdk/whl/cpu --extra-index-url https://pypi.org/simple --no-cache-dir
GPU (Metal)
For the GPU version supporting Metal (macOS):
CMAKE_ARGS="-DGGML_METAL=ON -DSD_METAL=ON" pip install -U nexaai --prefer-binary --index-url https://nexaai.github.io/nexa-sdk/whl/metal --extra-index-url https://pypi.org/simple --no-cache-dir
GPU (CUDA)
For Linux:
CMAKE_ARGS="-DGGML_CUDA=ON -DSD_CUBLAS=ON" pip install -U nexaai --prefer-binary --index-url https://nexaai.github.io/nexa-sdk/whl/cu124 --extra-index-url https://pypi.org/simple --no-cache-dir
For Windows PowerShell:
$env:CMAKE_ARGS="-DGGML_CUDA=ON -DSD_CUBLAS=ON"; pip install -U nexaai --prefer-binary --index-url https://nexaai.github.io/nexa-sdk/whl/cu124 --extra-index-url https://pypi.org/simple --no-cache-dir
For Windows Command Prompt:
set CMAKE_ARGS="-DGGML_CUDA=ON -DSD_CUBLAS=ON" & pip install -U nexaai --prefer-binary --index-url https://nexaai.github.io/nexa-sdk/whl/cu124 --extra-index-url https://pypi.org/simple --no-cache-dir
For Windows Git Bash:
CMAKE_ARGS="-DGGML_CUDA=ON -DSD_CUBLAS=ON" pip install -U nexaai --prefer-binary --index-url https://nexaai.github.io/nexa-sdk/whl/cu124 --extra-index-url https://pypi.org/simple --no-cache-dir
GPU (ROCm)
For Linux:
CMAKE_ARGS="-DGGML_HIPBLAS=on" pip install -U nexaai --prefer-binary --index-url https://nexaai.github.io/nexa-sdk/whl/rocm621 --extra-index-url https://pypi.org/simple --no-cache-dir
For detailed installation instructions, please refer to the Installation section in the README.
v0.0.8.8-metal
Improvements 🚀
- The nexa eval command now supports evaluating memory usage, latency, and energy consumption (#166)
Upgrade Guide 📝
To upgrade the Nexa SDK, use the command for your system:
CPU
pip install -U nexaai --prefer-binary --index-url https://nexaai.github.io/nexa-sdk/whl/cpu --extra-index-url https://pypi.org/simple --no-cache-dir
GPU (Metal)
For the GPU version supporting Metal (macOS):
CMAKE_ARGS="-DGGML_METAL=ON -DSD_METAL=ON" pip install -U nexaai --prefer-binary --index-url https://nexaai.github.io/nexa-sdk/whl/metal --extra-index-url https://pypi.org/simple --no-cache-dir
GPU (CUDA)
For Linux:
CMAKE_ARGS="-DGGML_CUDA=ON -DSD_CUBLAS=ON" pip install -U nexaai --prefer-binary --index-url https://nexaai.github.io/nexa-sdk/whl/cu124 --extra-index-url https://pypi.org/simple --no-cache-dir
For Windows PowerShell:
$env:CMAKE_ARGS="-DGGML_CUDA=ON -DSD_CUBLAS=ON"; pip install -U nexaai --prefer-binary --index-url https://nexaai.github.io/nexa-sdk/whl/cu124 --extra-index-url https://pypi.org/simple --no-cache-dir
For Windows Command Prompt:
set CMAKE_ARGS="-DGGML_CUDA=ON -DSD_CUBLAS=ON" & pip install -U nexaai --prefer-binary --index-url https://nexaai.github.io/nexa-sdk/whl/cu124 --extra-index-url https://pypi.org/simple --no-cache-dir
For Windows Git Bash:
CMAKE_ARGS="-DGGML_CUDA=ON -DSD_CUBLAS=ON" pip install -U nexaai --prefer-binary --index-url https://nexaai.github.io/nexa-sdk/whl/cu124 --extra-index-url https://pypi.org/simple --no-cache-dir
GPU (ROCm)
For Linux:
CMAKE_ARGS="-DGGML_HIPBLAS=on" pip install -U nexaai --prefer-binary --index-url https://nexaai.github.io/nexa-sdk/whl/rocm621 --extra-index-url https://pypi.org/simple --no-cache-dir
For detailed installation instructions, please refer to the Installation section in the README.
v0.0.8.8-cu124
Improvements 🚀
- The nexa eval command now supports evaluating memory usage, latency, and energy consumption (#166)
Upgrade Guide 📝
To upgrade the Nexa SDK, use the command for your system:
CPU
pip install -U nexaai --prefer-binary --index-url https://nexaai.github.io/nexa-sdk/whl/cpu --extra-index-url https://pypi.org/simple --no-cache-dir
GPU (Metal)
For the GPU version supporting Metal (macOS):
CMAKE_ARGS="-DGGML_METAL=ON -DSD_METAL=ON" pip install -U nexaai --prefer-binary --index-url https://nexaai.github.io/nexa-sdk/whl/metal --extra-index-url https://pypi.org/simple --no-cache-dir
GPU (CUDA)
For Linux:
CMAKE_ARGS="-DGGML_CUDA=ON -DSD_CUBLAS=ON" pip install -U nexaai --prefer-binary --index-url https://nexaai.github.io/nexa-sdk/whl/cu124 --extra-index-url https://pypi.org/simple --no-cache-dir
For Windows PowerShell:
$env:CMAKE_ARGS="-DGGML_CUDA=ON -DSD_CUBLAS=ON"; pip install -U nexaai --prefer-binary --index-url https://nexaai.github.io/nexa-sdk/whl/cu124 --extra-index-url https://pypi.org/simple --no-cache-dir
For Windows Command Prompt:
set CMAKE_ARGS="-DGGML_CUDA=ON -DSD_CUBLAS=ON" & pip install -U nexaai --prefer-binary --index-url https://nexaai.github.io/nexa-sdk/whl/cu124 --extra-index-url https://pypi.org/simple --no-cache-dir
For Windows Git Bash:
CMAKE_ARGS="-DGGML_CUDA=ON -DSD_CUBLAS=ON" pip install -U nexaai --prefer-binary --index-url https://nexaai.github.io/nexa-sdk/whl/cu124 --extra-index-url https://pypi.org/simple --no-cache-dir
GPU (ROCm)
For Linux:
CMAKE_ARGS="-DGGML_HIPBLAS=on" pip install -U nexaai --prefer-binary --index-url https://nexaai.github.io/nexa-sdk/whl/rocm621 --extra-index-url https://pypi.org/simple --no-cache-dir
For detailed installation instructions, please refer to the Installation section in the README.
v0.0.8.8
Improvements 🚀
- The nexa eval command now supports evaluating memory usage, latency, and energy consumption (#166)
Upgrade Guide 📝
To upgrade the Nexa SDK, use the command for your system:
CPU
pip install -U nexaai --prefer-binary --index-url https://nexaai.github.io/nexa-sdk/whl/cpu --extra-index-url https://pypi.org/simple --no-cache-dir
GPU (Metal)
For the GPU version supporting Metal (macOS):
CMAKE_ARGS="-DGGML_METAL=ON -DSD_METAL=ON" pip install -U nexaai --prefer-binary --index-url https://nexaai.github.io/nexa-sdk/whl/metal --extra-index-url https://pypi.org/simple --no-cache-dir
GPU (CUDA)
For Linux:
CMAKE_ARGS="-DGGML_CUDA=ON -DSD_CUBLAS=ON" pip install -U nexaai --prefer-binary --index-url https://nexaai.github.io/nexa-sdk/whl/cu124 --extra-index-url https://pypi.org/simple --no-cache-dir
For Windows PowerShell:
$env:CMAKE_ARGS="-DGGML_CUDA=ON -DSD_CUBLAS=ON"; pip install -U nexaai --prefer-binary --index-url https://nexaai.github.io/nexa-sdk/whl/cu124 --extra-index-url https://pypi.org/simple --no-cache-dir
For Windows Command Prompt:
set CMAKE_ARGS="-DGGML_CUDA=ON -DSD_CUBLAS=ON" & pip install -U nexaai --prefer-binary --index-url https://nexaai.github.io/nexa-sdk/whl/cu124 --extra-index-url https://pypi.org/simple --no-cache-dir
For Windows Git Bash:
CMAKE_ARGS="-DGGML_CUDA=ON -DSD_CUBLAS=ON" pip install -U nexaai --prefer-binary --index-url https://nexaai.github.io/nexa-sdk/whl/cu124 --extra-index-url https://pypi.org/simple --no-cache-dir
GPU (ROCm)
For Linux:
CMAKE_ARGS="-DGGML_HIPBLAS=on" pip install -U nexaai --prefer-binary --index-url https://nexaai.github.io/nexa-sdk/whl/rocm621 --extra-index-url https://pypi.org/simple --no-cache-dir
For detailed installation instructions, please refer to the Installation section in the README.
v0.0.8.7-rocm621
What's New ✨
- Support for running models from a local path (#151)
  - See details in the CLI doc and Server doc
  - Run an NLP model from a local path:
    nexa run ../models/gemma-1.1-2b-instruct-q4_0.gguf -lp -mt NLP
  - Start a multimodal model server from a local directory:
    nexa server ../models/llava-v1.6-vicuna-7b/ -lp -mt MULTIMODAL
- Embedding models support (#159)
  - See details in the nexa embed documentation
  - Quick example:
    nexa embed nomic "Advancing on-device AI, together." >> generated_embeddings.txt
    (This generates embeddings for the given text with the Nomic model and appends the result to generated_embeddings.txt.)
- VLM support in /v1/chat/completions (#154)
  - See details in the Server doc
- Support for running model evaluation on your device (#150)
Improvements 🚀
- Customizable maximum context window via --nctx for NLP and VLM models (#155 and #158)
- CV models now supported with the -hf flag (#151)
  - Pull and run a CV model from Hugging Face:
    nexa run -hf Steward/lcm-dreamshaper-v7-gguf -mt COMPUTER_VISION
Fixes 🐞
- Fixed streaming issues with /v1/chat/completions (#152)
- Resolved download problems on macOS and Windows (#146)
Upgrade Guide 📝
To upgrade the Nexa SDK, use the command for your system:
CPU
pip install -U nexaai --prefer-binary --index-url https://nexaai.github.io/nexa-sdk/whl/cpu --extra-index-url https://pypi.org/simple --no-cache-dir
GPU (Metal)
For the GPU version supporting Metal (macOS):
CMAKE_ARGS="-DGGML_METAL=ON -DSD_METAL=ON" pip install -U nexaai --prefer-binary --index-url https://nexaai.github.io/nexa-sdk/whl/metal --extra-index-url https://pypi.org/simple --no-cache-dir
GPU (CUDA)
For Linux:
CMAKE_ARGS="-DGGML_CUDA=ON -DSD_CUBLAS=ON" pip install -U nexaai --prefer-binary --index-url https://nexaai.github.io/nexa-sdk/whl/cu124 --extra-index-url https://pypi.org/simple --no-cache-dir
For Windows PowerShell:
$env:CMAKE_ARGS="-DGGML_CUDA=ON -DSD_CUBLAS=ON"; pip install -U nexaai --prefer-binary --index-url https://nexaai.github.io/nexa-sdk/whl/cu124 --extra-index-url https://pypi.org/simple --no-cache-dir
For Windows Command Prompt:
set CMAKE_ARGS="-DGGML_CUDA=ON -DSD_CUBLAS=ON" & pip install -U nexaai --prefer-binary --index-url https://nexaai.github.io/nexa-sdk/whl/cu124 --extra-index-url https://pypi.org/simple --no-cache-dir
For Windows Git Bash:
CMAKE_ARGS="-DGGML_CUDA=ON -DSD_CUBLAS=ON" pip install -U nexaai --prefer-binary --index-url https://nexaai.github.io/nexa-sdk/whl/cu124 --extra-index-url https://pypi.org/simple --no-cache-dir
GPU (ROCm)
For Linux:
CMAKE_ARGS="-DGGML_HIPBLAS=on" pip install -U nexaai --prefer-binary --index-url https://nexaai.github.io/nexa-sdk/whl/rocm621 --extra-index-url https://pypi.org/simple --no-cache-dir
For detailed installation instructions, please refer to the Installation section in the README.
v0.0.8.7-metal
What's New ✨
- Support for running models from a local path (#151)
  - See details in the CLI doc and Server doc
  - Run an NLP model from a local path:
    nexa run ../models/gemma-1.1-2b-instruct-q4_0.gguf -lp -mt NLP
  - Start a multimodal model server from a local directory:
    nexa server ../models/llava-v1.6-vicuna-7b/ -lp -mt MULTIMODAL
- Embedding models support (#159)
  - See details in the nexa embed documentation
  - Quick example:
    nexa embed nomic "Advancing on-device AI, together." >> generated_embeddings.txt
    (This generates embeddings for the given text with the Nomic model and appends the result to generated_embeddings.txt.)
- VLM support in /v1/chat/completions (#154)
  - See details in the Server doc
- Support for running model evaluation on your device (#150)
Improvements 🚀
- Customizable maximum context window via --nctx for NLP and VLM models (#155 and #158)
- CV models now supported with the -hf flag (#151)
  - Pull and run a CV model from Hugging Face:
    nexa run -hf Steward/lcm-dreamshaper-v7-gguf -mt COMPUTER_VISION
Fixes 🐞
- Fixed streaming issues with /v1/chat/completions (#152)
- Resolved download problems on macOS and Windows (#146)
Upgrade Guide 📝
To upgrade the Nexa SDK, use the command for your system:
CPU
pip install -U nexaai --prefer-binary --index-url https://nexaai.github.io/nexa-sdk/whl/cpu --extra-index-url https://pypi.org/simple --no-cache-dir
GPU (Metal)
For the GPU version supporting Metal (macOS):
CMAKE_ARGS="-DGGML_METAL=ON -DSD_METAL=ON" pip install -U nexaai --prefer-binary --index-url https://nexaai.github.io/nexa-sdk/whl/metal --extra-index-url https://pypi.org/simple --no-cache-dir
GPU (CUDA)
For Linux:
CMAKE_ARGS="-DGGML_CUDA=ON -DSD_CUBLAS=ON" pip install -U nexaai --prefer-binary --index-url https://nexaai.github.io/nexa-sdk/whl/cu124 --extra-index-url https://pypi.org/simple --no-cache-dir
For Windows PowerShell:
$env:CMAKE_ARGS="-DGGML_CUDA=ON -DSD_CUBLAS=ON"; pip install -U nexaai --prefer-binary --index-url https://nexaai.github.io/nexa-sdk/whl/cu124 --extra-index-url https://pypi.org/simple --no-cache-dir
For Windows Command Prompt:
set CMAKE_ARGS="-DGGML_CUDA=ON -DSD_CUBLAS=ON" & pip install -U nexaai --prefer-binary --index-url https://nexaai.github.io/nexa-sdk/whl/cu124 --extra-index-url https://pypi.org/simple --no-cache-dir
For Windows Git Bash:
CMAKE_ARGS="-DGGML_CUDA=ON -DSD_CUBLAS=ON" pip install -U nexaai --prefer-binary --index-url https://nexaai.github.io/nexa-sdk/whl/cu124 --extra-index-url https://pypi.org/simple --no-cache-dir
GPU (ROCm)
For Linux:
CMAKE_ARGS="-DGGML_HIPBLAS=on" pip install -U nexaai --prefer-binary --index-url https://nexaai.github.io/nexa-sdk/whl/rocm621 --extra-index-url https://pypi.org/simple --no-cache-dir
For detailed installation instructions, please refer to the Installation section in the README.
v0.0.8.7-cu124
What's New ✨
- Support for running models from a local path (#151)
  - See details in the CLI doc and Server doc
  - Run an NLP model from a local path:
    nexa run ../models/gemma-1.1-2b-instruct-q4_0.gguf -lp -mt NLP
  - Start a multimodal model server from a local directory:
    nexa server ../models/llava-v1.6-vicuna-7b/ -lp -mt MULTIMODAL
- Embedding models support (#159)
  - See details in the nexa embed documentation
  - Quick example:
    nexa embed nomic "Advancing on-device AI, together." >> generated_embeddings.txt
    (This generates embeddings for the given text with the Nomic model and appends the result to generated_embeddings.txt.)
- VLM support in /v1/chat/completions (#154)
  - See details in the Server doc
- Support for running model evaluation on your device (#150)
Improvements 🚀
- Customizable maximum context window via --nctx for NLP and VLM models (#155 and #158)
- CV models now supported with the -hf flag (#151)
  - Pull and run a CV model from Hugging Face:
    nexa run -hf Steward/lcm-dreamshaper-v7-gguf -mt COMPUTER_VISION
Fixes 🐞
- Fixed streaming issues with /v1/chat/completions (#152)
- Resolved download problems on macOS and Windows (#146)
Upgrade Guide 📝
To upgrade the Nexa SDK, use the command for your system:
CPU
pip install -U nexaai --prefer-binary --index-url https://nexaai.github.io/nexa-sdk/whl/cpu --extra-index-url https://pypi.org/simple --no-cache-dir
GPU (Metal)
For the GPU version supporting Metal (macOS):
CMAKE_ARGS="-DGGML_METAL=ON -DSD_METAL=ON" pip install -U nexaai --prefer-binary --index-url https://nexaai.github.io/nexa-sdk/whl/metal --extra-index-url https://pypi.org/simple --no-cache-dir
GPU (CUDA)
For Linux:
CMAKE_ARGS="-DGGML_CUDA=ON -DSD_CUBLAS=ON" pip install -U nexaai --prefer-binary --index-url https://nexaai.github.io/nexa-sdk/whl/cu124 --extra-index-url https://pypi.org/simple --no-cache-dir
For Windows PowerShell:
$env:CMAKE_ARGS="-DGGML_CUDA=ON -DSD_CUBLAS=ON"; pip install -U nexaai --prefer-binary --index-url https://nexaai.github.io/nexa-sdk/whl/cu124 --extra-index-url https://pypi.org/simple --no-cache-dir
For Windows Command Prompt:
set CMAKE_ARGS="-DGGML_CUDA=ON -DSD_CUBLAS=ON" & pip install -U nexaai --prefer-binary --index-url https://nexaai.github.io/nexa-sdk/whl/cu124 --extra-index-url https://pypi.org/simple --no-cache-dir
For Windows Git Bash:
CMAKE_ARGS="-DGGML_CUDA=ON -DSD_CUBLAS=ON" pip install -U nexaai --prefer-binary --index-url https://nexaai.github.io/nexa-sdk/whl/cu124 --extra-index-url https://pypi.org/simple --no-cache-dir
GPU (ROCm)
For Linux:
CMAKE_ARGS="-DGGML_HIPBLAS=on" pip install -U nexaai --prefer-binary --index-url https://nexaai.github.io/nexa-sdk/whl/rocm621 --extra-index-url https://pypi.org/simple --no-cache-dir
For detailed installation instructions, please refer to the Installation section in the README.
v0.0.8.7
What's New ✨
- Support for running models from a local path (#151)
  - See details in the CLI doc and Server doc
  - Run an NLP model from a local path:
    nexa run ../models/gemma-1.1-2b-instruct-q4_0.gguf -lp -mt NLP
  - Start a multimodal model server from a local directory:
    nexa server ../models/llava-v1.6-vicuna-7b/ -lp -mt MULTIMODAL
- Embedding models support (#159)
  - See the list of supported embedding models in the model hub
  - See details in the nexa embed documentation
  - Quick example:
    nexa embed nomic "Advancing on-device AI, together." >> generated_embeddings.txt
    (This generates embeddings for the given text with the Nomic model and appends the result to generated_embeddings.txt.)
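Once embeddings have been written out, a common downstream use is similarity comparison. The vectors below are toy values standing in for generated embeddings, not real nexa embed output; the cosine computation itself is standard:

```python
import math

def cosine_similarity(a, b):
    # cos(theta) = (a . b) / (|a| * |b|)
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

# Toy 4-dimensional vectors standing in for generated embeddings.
v1 = [0.1, 0.3, 0.5, 0.7]
v2 = [0.1, 0.3, 0.5, 0.7]
v3 = [0.7, 0.5, 0.3, 0.1]

print(cosine_similarity(v1, v2))  # identical vectors score highest
print(cosine_similarity(v1, v3))
```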
- VLM support in /v1/chat/completions (#154)
  - See details in the Server doc
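Servers that expose /v1/chat/completions for VLMs typically accept OpenAI-style messages in which image content rides alongside text. The payload below follows that convention as an assumption; consult the Server doc for the exact schema Nexa expects:

```python
import json

# Assumed OpenAI-style request body for a multimodal chat completion;
# the URL and prompt are placeholders.
payload = {
    "messages": [
        {
            "role": "user",
            "content": [
                {"type": "text", "text": "What is in this image?"},
                {"type": "image_url",
                 "image_url": {"url": "https://example.com/cat.png"}},
            ],
        }
    ],
    "stream": False,
}

body = json.dumps(payload)
print(body)
```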
- (Beta) Support for running model evaluation on your device (#150)
Improvements 🚀
- Customizable maximum context window via --nctx for NLP and VLM models (#155 and #158)
- CV models now supported with the -hf flag (#151)
  - Pull and run a CV model from Hugging Face:
    nexa run -hf Steward/lcm-dreamshaper-v7-gguf -mt COMPUTER_VISION
Fixes 🐞
- Fixed streaming issues with /v1/chat/completions (#152)
- Resolved download problems on macOS and Windows (#146)
Upgrade Guide 📝
To upgrade the Nexa SDK, use the command for your system:
CPU
pip install -U nexaai --prefer-binary --index-url https://nexaai.github.io/nexa-sdk/whl/cpu --extra-index-url https://pypi.org/simple --no-cache-dir
GPU (Metal)
For the GPU version supporting Metal (macOS):
CMAKE_ARGS="-DGGML_METAL=ON -DSD_METAL=ON" pip install -U nexaai --prefer-binary --index-url https://nexaai.github.io/nexa-sdk/whl/metal --extra-index-url https://pypi.org/simple --no-cache-dir
GPU (CUDA)
For Linux:
CMAKE_ARGS="-DGGML_CUDA=ON -DSD_CUBLAS=ON" pip install -U nexaai --prefer-binary --index-url https://nexaai.github.io/nexa-sdk/whl/cu124 --extra-index-url https://pypi.org/simple --no-cache-dir
For Windows PowerShell:
$env:CMAKE_ARGS="-DGGML_CUDA=ON -DSD_CUBLAS=ON"; pip install -U nexaai --prefer-binary --index-url https://nexaai.github.io/nexa-sdk/whl/cu124 --extra-index-url https://pypi.org/simple --no-cache-dir
For Windows Command Prompt:
set CMAKE_ARGS="-DGGML_CUDA=ON -DSD_CUBLAS=ON" & pip install -U nexaai --prefer-binary --index-url https://nexaai.github.io/nexa-sdk/whl/cu124 --extra-index-url https://pypi.org/simple --no-cache-dir
For Windows Git Bash:
CMAKE_ARGS="-DGGML_CUDA=ON -DSD_CUBLAS=ON" pip install -U nexaai --prefer-binary --index-url https://nexaai.github.io/nexa-sdk/whl/cu124 --extra-index-url https://pypi.org/simple --no-cache-dir
GPU (ROCm)
For Linux:
CMAKE_ARGS="-DGGML_HIPBLAS=on" pip install -U nexaai --prefer-binary --index-url https://nexaai.github.io/nexa-sdk/whl/rocm621 --extra-index-url https://pypi.org/simple --no-cache-dir
For detailed installation instructions, please refer to the Installation section in the README.