
Releases: NexaAI/nexa-sdk

v0.0.8.9-cu124

22 Oct 19:14
41e2d40

Improvements 🚀

  • Added multiprocessing support to speed up model evaluation tasks (#175)
    • Use the --num_workers flag to specify the number of parallel processes
    • Example: nexa eval phi3 --tasks ifeval --num_workers 4
  • Added support for Python 3.13 (#172)
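
As a rough starting point for --num_workers, one worker per CPU core, capped, is a sensible default, since each worker may hold its own copy of the model in memory. This sketch shows the idea; the cap value is an illustrative assumption, not a Nexa recommendation:

```python
import os

# Heuristic for --num_workers: one worker per CPU core, capped to
# limit memory pressure from parallel model processes (the cap of 4
# is an assumption for illustration).
def pick_num_workers(cap: int = 4) -> int:
    cpus = os.cpu_count() or 1
    return max(1, min(cpus, cap))

print(f"nexa eval phi3 --tasks ifeval --num_workers {pick_num_workers()}")
```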

Upgrade Guide 📝

To upgrade the Nexa SDK, use the command for your system:

CPU

pip install -U nexaai --prefer-binary --index-url https://nexaai.github.io/nexa-sdk/whl/cpu --extra-index-url https://pypi.org/simple --no-cache-dir

GPU (Metal)

For the GPU version supporting Metal (macOS):

CMAKE_ARGS="-DGGML_METAL=ON -DSD_METAL=ON" pip install -U nexaai --prefer-binary --index-url https://nexaai.github.io/nexa-sdk/whl/metal --extra-index-url https://pypi.org/simple --no-cache-dir

GPU (CUDA)

For Linux:

CMAKE_ARGS="-DGGML_CUDA=ON -DSD_CUBLAS=ON" pip install -U nexaai --prefer-binary --index-url https://nexaai.github.io/nexa-sdk/whl/cu124 --extra-index-url https://pypi.org/simple --no-cache-dir

For Windows PowerShell:

$env:CMAKE_ARGS="-DGGML_CUDA=ON -DSD_CUBLAS=ON"; pip install -U nexaai --prefer-binary --index-url https://nexaai.github.io/nexa-sdk/whl/cu124 --extra-index-url https://pypi.org/simple --no-cache-dir

For Windows Command Prompt:

set CMAKE_ARGS="-DGGML_CUDA=ON -DSD_CUBLAS=ON" & pip install -U nexaai --prefer-binary --index-url https://nexaai.github.io/nexa-sdk/whl/cu124 --extra-index-url https://pypi.org/simple --no-cache-dir

For Windows Git Bash:

CMAKE_ARGS="-DGGML_CUDA=ON -DSD_CUBLAS=ON" pip install -U nexaai --prefer-binary --index-url https://nexaai.github.io/nexa-sdk/whl/cu124 --extra-index-url https://pypi.org/simple --no-cache-dir

GPU (ROCm)

For Linux:

CMAKE_ARGS="-DGGML_HIPBLAS=on" pip install -U nexaai --prefer-binary --index-url https://nexaai.github.io/nexa-sdk/whl/rocm621 --extra-index-url https://pypi.org/simple --no-cache-dir

For detailed installation instructions, please refer to the Installation section in the README.

Full Changelog - v0.0.8.8...v0.0.8.9

v0.0.8.9

22 Oct 19:40

The release notes and upgrade commands for this build are identical to those of v0.0.8.9-cu124 above.

Full Changelog - v0.0.8.8...v0.0.8.9

v0.0.8.8-rocm621

18 Oct 07:11
fb34fb5

Improvements 🚀

  • The nexa eval command now supports measuring memory usage, latency, and energy consumption (#166)
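
The kind of measurement nexa eval now automates can be sketched generically with stdlib tools: perf_counter for latency and tracemalloc for peak Python-level memory. This illustrates the metrics, not Nexa's implementation:

```python
import time
import tracemalloc

def measure(fn, *args):
    # Returns (result, latency_seconds, peak_bytes) for one call.
    # tracemalloc only sees Python-level allocations, so peak_bytes
    # is a rough proxy for real process memory.
    tracemalloc.start()
    t0 = time.perf_counter()
    result = fn(*args)
    latency = time.perf_counter() - t0
    _, peak = tracemalloc.get_traced_memory()
    tracemalloc.stop()
    return result, latency, peak

res, latency, peak = measure(sum, range(100_000))
```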

Upgrade Guide 📝

The upgrade commands are identical to those listed under v0.0.8.9-cu124 above.

Full Changelog - v0.0.8.7...v0.0.8.8

v0.0.8.8-metal

18 Oct 06:54
fb34fb5

The release notes and upgrade commands for this build are identical to those of v0.0.8.8-rocm621 above.

Full Changelog - v0.0.8.7...v0.0.8.8

v0.0.8.8-cu124

18 Oct 08:06
fb34fb5

The release notes and upgrade commands for this build are identical to those of v0.0.8.8-rocm621 above.

Full Changelog - v0.0.8.7...v0.0.8.8

v0.0.8.8

18 Oct 08:14

The release notes and upgrade commands are identical to those of v0.0.8.8-rocm621 above.

Full Changelog - v0.0.8.7...v0.0.8.8

v0.0.8.7-rocm621

13 Oct 20:31

What's New ✨

  • Support for running models from a local path (#151)

    • See details in the CLI doc and Server doc
    • Run an NLP model from a local path: nexa run ../models/gemma-1.1-2b-instruct-q4_0.gguf -lp -mt NLP
    • Start a multimodal model server from a local directory: nexa server ../models/llava-v1.6-vicuna-7b/ -lp -mt MULTIMODAL
  • Embedding model support (#159)

    • See details in nexa embed
    • Quick example: nexa embed nomic "Advancing on-device AI, together." >> generated_embeddings.txt (generates embeddings for the quoted text with the Nomic model and appends them to generated_embeddings.txt)
  • VLM support in /v1/chat/completions (#154)

  • Support for running model evaluation on your device (#150)
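
With VLM support in /v1/chat/completions, requests can mix text and image content in the OpenAI-style message format. A minimal sketch of building such a request body follows; the field names mirror the common OpenAI convention and are an assumption here, not confirmed by these notes:

```python
import json

# Sketch of an OpenAI-style multimodal chat request body for a local
# `nexa server` instance (field layout is an assumption based on the
# OpenAI chat format the /v1/chat/completions route emulates).
def vlm_chat_body(question: str, image_url: str, stream: bool = False) -> str:
    return json.dumps({
        "messages": [{
            "role": "user",
            "content": [
                {"type": "text", "text": question},
                {"type": "image_url", "image_url": {"url": image_url}},
            ],
        }],
        "stream": stream,
    })
```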

Improvements 🚀

  • Customizable maximum context window (--nctx) for NLP and VLM models (#155, #158)

  • CV models are now supported when running with the -hf flag (#151)

    • Pull and run a CV model from Hugging Face: nexa run -hf Steward/lcm-dreamshaper-v7-gguf -mt COMPUTER_VISION

Fixes 🐞

  • Fixed streaming issues with /v1/chat/completions (#152)

  • Resolved download problems on macOS and Windows (#146)

Upgrade Guide 📝

The upgrade commands are identical to those listed under v0.0.8.9-cu124 above.

Full Changelog - v0.0.8.6.1...v0.0.8.7

v0.0.8.7-metal

13 Oct 20:14

The release notes and upgrade commands for this build are identical to those of v0.0.8.7-rocm621 above.

Full Changelog - v0.0.8.6.1...v0.0.8.7

v0.0.8.7-cu124

13 Oct 21:26

The release notes and upgrade commands for this build are identical to those of v0.0.8.7-rocm621 above.

Full Changelog - v0.0.8.6.1...v0.0.8.7

v0.0.8.7

13 Oct 21:34

What's New ✨

  • Support for running models from a local path (#151)

    • See details in the CLI doc and Server doc
    • Run an NLP model from a local path: nexa run ../models/gemma-1.1-2b-instruct-q4_0.gguf -lp -mt NLP
    • Start a multimodal model server from a local directory: nexa server ../models/llava-v1.6-vicuna-7b/ -lp -mt MULTIMODAL
  • Embedding model support (#159)

    • List of embedding models we support: model hub embedding models
    • See details in nexa embed
    • Quick example: nexa embed nomic "Advancing on-device AI, together." >> generated_embeddings.txt (generates embeddings for the quoted text with the Nomic model and appends them to generated_embeddings.txt)
  • VLM support in /v1/chat/completions (#154)

  • (Beta-testing) Support for running model evaluation on your device (#150)

Improvements 🚀

  • Customizable maximum context window (--nctx) for NLP and VLM models (#155, #158)

  • CV models are now supported when running with the -hf flag (#151)

    • Pull and run a CV model from Hugging Face: nexa run -hf Steward/lcm-dreamshaper-v7-gguf -mt COMPUTER_VISION

Fixes 🐞

  • Fixed streaming issues with /v1/chat/completions (#152)

  • Resolved download problems on macOS and Windows (#146)

Upgrade Guide 📝

The upgrade commands are identical to those listed under v0.0.8.9-cu124 above.

Full Changelog - v0.0.8.6.1...v0.0.8.7