From 5ccfc87ecc42ce0cf49d36641b5bf77c928ed75f Mon Sep 17 00:00:00 2001
From: Stephen Baione <109226581+stbaione@users.noreply.github.com>
Date: Fri, 15 Nov 2024 12:47:00 -0600
Subject: [PATCH] Set upper Numpy Version (#540)

# Description

Found a bug when walking through the shortfin llm docs using the latest `nightly` sharktank. gguf is currently incompatible with numpy >= 2.0, which breaks `sharktank.examples.export_paged_llm_v1` on linux. The gguf issue is filed [here](https://github.com/ggerganov/llama.cpp/issues/9021). It was closed due to inactivity, but isn't actually solved and has a PR open for the fix.

## Repro Steps

On linux:

### Before re-pinning

Create a virtual environment:

```bash
python -m venv --prompt sharktank .venv
source .venv/bin/activate
```

Install dependencies and sharktank:

```bash
pip install -r pytorch-cpu-requirements.txt
pip install -r requirements.txt -e sharktank/
```

Show the numpy version (before re-pinning):

```bash
pip show numpy | grep Version
Version: 2.1.3
```

Try running `export_paged_llm_v1`:

```bash
python -m sharktank.examples.export_paged_llm_v1 --gguf-file=$PATH_TO_GGUF --output-mlir=./temp/model.mlir --output-config=./temp/config.json --bs=1,4
```

You'll see this error:

```text
Traceback (most recent call last):
  File "<frozen runpy>", line 198, in _run_module_as_main
  File "<frozen runpy>", line 88, in _run_code
  File "/home/stbaione/repos/SHARK-Platform/sharktank/sharktank/examples/export_paged_llm_v1.py", line 336, in <module>
    main()
  File "/home/stbaione/repos/SHARK-Platform/sharktank/sharktank/examples/export_paged_llm_v1.py", line 67, in main
    dataset = cli.get_input_dataset(args)
              ^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/stbaione/repos/SHARK-Platform/sharktank/sharktank/utils/cli.py", line 104, in get_input_dataset
    return Dataset.load(data_files["gguf"], file_type="gguf")
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/stbaione/repos/SHARK-Platform/sharktank/sharktank/types/theta.py", line 347, in load
    ds = _dataset_load_helper(path, file_type=file_type, mmap=mmap)
         ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/stbaione/repos/SHARK-Platform/sharktank/sharktank/types/theta.py", line 536, in _dataset_load_helper
    return gguf_interop.load_file(path)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/stbaione/repos/SHARK-Platform/sharktank/sharktank/types/gguf_interop/base.py", line 117, in load_file
    reader = GGUFReader(gguf_path)
             ^^^^^^^^^^^^^^^^^^^^^
  File "/home/stbaione/repos/SHARK-Platform/.venv_2/lib/python3.12/site-packages/gguf/gguf_reader.py", line 87, in __init__
    if self._get(offs, np.uint32, override_order = '<')[0] != GGUF_MAGIC:
       ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/stbaione/repos/SHARK-Platform/.venv_2/lib/python3.12/site-packages/gguf/gguf_reader.py", line 137, in _get
    .newbyteorder(override_order or self.byte_order)
     ^^^^^^^^^^^^
AttributeError: `newbyteorder` was removed from the ndarray class in NumPy 2.0. Use `arr.view(arr.dtype.newbyteorder(order))` instead.
```

### After re-pinning

Create a virtual environment:

```bash
python -m venv --prompt sharktank .venv
source .venv/bin/activate
```

Install dependencies and sharktank:

```bash
pip install -r pytorch-cpu-requirements.txt
pip install -r requirements.txt -e sharktank/
```

Show the numpy version:

```bash
pip show numpy | grep Version
Version: 1.26.3
```

Run `export_paged_llm_v1`:

```bash
python -m sharktank.examples.export_paged_llm_v1 --gguf-file=$PATH_TO_GGUF --output-mlir=./temp/model.mlir --output-config=./temp/config.json --bs=1,4
```

With re-pinning, we get the desired output:

```text
Exporting decode_bs1
Exporting prefill_bs4
Exporting decode_bs4
GENERATED!
Saving to './temp/model.mlir'
```

---------

Co-authored-by: Marius Brehler
---
 sharktank/requirements.txt | 3 +--
 1 file changed, 1 insertion(+), 2 deletions(-)

diff --git a/sharktank/requirements.txt b/sharktank/requirements.txt
index 19e48f825..6b533d977 100644
--- a/sharktank/requirements.txt
+++ b/sharktank/requirements.txt
@@ -2,8 +2,7 @@ iree-turbine
 
 # Runtime deps.
 gguf==0.6.0
-numpy==1.26.3; sys_platform == 'win32'
-numpy; sys_platform != 'win32'
+numpy<2.0
 
 # Needed for newer gguf versions (TODO: remove when gguf package includes this)
 # sentencepiece>=0.1.98,<=0.2.0
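For context on the underlying API change: NumPy 2.0 removed `ndarray.newbyteorder`, which gguf's reader calls when reinterpreting raw file bytes under an explicit byte order. A minimal sketch (illustrative only, not sharktank or gguf code; the array values are made up) of the removed call and the forward-compatible replacement named in the `AttributeError`, which works on both NumPy 1.x and 2.x:

```python
import numpy as np

# An explicitly little-endian uint32 array standing in for bytes read
# from a file, similar to how GGUFReader checks the GGUF magic number.
raw = np.array([1, 2, 3], dtype="<u4")

# NumPy 1.x allowed: raw.newbyteorder(">")  -- removed in NumPy 2.0.
# The replacement reinterprets the same bytes under the swapped-order
# dtype (dtype.newbyteorder still exists in 2.x; only the ndarray
# method was removed):
swapped = raw.view(raw.dtype.newbyteorder(">"))

# The little-endian bytes 01 00 00 00 read as big-endian uint32 give
# 0x01000000 = 16777216; the original array is untouched.
print(int(swapped[0]))
```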