[Bug] mlc_llm serve error on Mac M1 (git clone failed with error 128) #2938

Closed
pchalasani opened this issue Sep 24, 2024 · 8 comments
Labels: bug (Confirmed bugs)

pchalasani commented Sep 24, 2024

🐛 Bug

mlc_llm serve on a Mac M1 fails with "git clone failed with error 128" (see title; full trace below).

To Reproduce

Steps to reproduce the behavior:

  1. Follow the instructions to install MLC-LLM from the nightly wheels (NOT using conda, just in my venv)
  2. Run: mlc_llm serve HF://mlc-ai/Qwen2.5-32B-Instruct-q4f32_1-MLC

Expected behavior

The model should download and the server should start.

Environment

  • Platform: Mac M1 Max
  • Operating system: macOS Sonoma 14.2.1
  • How you installed MLC-LLM (conda, source): ran this with my venv activated (did NOT use conda, not sure if that matters):
    python -m pip install --pre -U -f https://mlc.ai/wheels mlc-llm-nightly-cpu mlc-ai-nightly-cpu
  • How you installed TVM-Unity (pip, source): did not install this
  • Python version (e.g. 3.10): 3.11

Additional context

error trace:

[2024-09-24 09:45:08] INFO auto_device.py:88: Not found device: cuda:0
[2024-09-24 09:45:09] INFO auto_device.py:88: Not found device: rocm:0
[2024-09-24 09:45:10] INFO auto_device.py:79: Found device: metal:0
[2024-09-24 09:45:10] INFO auto_device.py:88: Not found device: vulkan:0
[2024-09-24 09:45:11] INFO auto_device.py:88: Not found device: opencl:0
[2024-09-24 09:45:11] INFO auto_device.py:35: Using device: metal:0
[2024-09-24 09:45:11] INFO download_cache.py:227: Downloading model from HuggingFace: HF://mlc-ai/Qwen2.5-32B-Instruct-q4f32_1-MLC
[2024-09-24 09:45:11] INFO download_cache.py:29: MLC_DOWNLOAD_CACHE_POLICY = ON. Can be one of: ON, OFF, REDO, READONLY
[2024-09-24 09:45:11] INFO download_cache.py:56: [Git] Cloning https://huggingface.co/mlc-ai/Qwen2.5-32B-Instruct-q4f32_1-MLC.git to /var/folders/dx/39xz0fk938zftc3djhbm78bc0000gn/T/tmpjpli117j/tmp
Traceback (most recent call last):
  File "/Users/pchalasani/Git/langroid/.venv/lib/python3.11/site-packages/mlc_llm/support/download_cache.py", line 57, in git_clone
    subprocess.run(
  File "/opt/homebrew/Cellar/python@3.11/3.11.9/Frameworks/Python.framework/Versions/3.11/lib/python3.11/subprocess.py", line 571, in run
    raise CalledProcessError(retcode, process.args,
subprocess.CalledProcessError: Command '['git', 'clone', 'https://huggingface.co/mlc-ai/Qwen2.5-32B-Instruct-q4f32_1-MLC.git', '.tmp']' returned non-zero exit status 128.

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
  File "/Users/pchalasani/Git/langroid/.venv/bin/mlc_llm", line 8, in <module>
    sys.exit(main())
             ^^^^^^
  File "/Users/pchalasani/Git/langroid/.venv/lib/python3.11/site-packages/mlc_llm/__main__.py", line 49, in main
    cli.main(sys.argv[2:])
  File "/Users/pchalasani/Git/langroid/.venv/lib/python3.11/site-packages/mlc_llm/cli/serve.py", line 204, in main
    serve(
  File "/Users/pchalasani/Git/langroid/.venv/lib/python3.11/site-packages/mlc_llm/interface/serve.py", line 55, in serve
    async_engine = engine.AsyncMLCEngine(
                   ^^^^^^^^^^^^^^^^^^^^^^
  File "/Users/pchalasani/Git/langroid/.venv/lib/python3.11/site-packages/mlc_llm/serve/engine.py", line 896, in __init__
    super().__init__(
  File "/Users/pchalasani/Git/langroid/.venv/lib/python3.11/site-packages/mlc_llm/serve/engine_base.py", line 590, in __init__
    ) = _process_model_args(models, device, engine_config)
        ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/Users/pchalasani/Git/langroid/.venv/lib/python3.11/site-packages/mlc_llm/serve/engine_base.py", line 171, in _process_model_args
    model_args: List[Tuple[str, str]] = [_convert_model_info(model) for model in models]
                                        ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/Users/pchalasani/Git/langroid/.venv/lib/python3.11/site-packages/mlc_llm/serve/engine_base.py", line 171, in <listcomp>
    model_args: List[Tuple[str, str]] = [_convert_model_info(model) for model in models]
                                         ^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/Users/pchalasani/Git/langroid/.venv/lib/python3.11/site-packages/mlc_llm/serve/engine_base.py", line 125, in _convert_model_info
    model_path = download_cache.get_or_download_model(model.model)
                 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/Users/pchalasani/Git/langroid/.venv/lib/python3.11/site-packages/mlc_llm/support/download_cache.py", line 228, in get_or_download_model
    model_path = download_and_cache_mlc_weights(model)
                 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/Users/pchalasani/Git/langroid/.venv/lib/python3.11/site-packages/mlc_llm/support/download_cache.py", line 180, in download_and_cache_mlc_weights
    git_clone(git_url, tmp_dir, ignore_lfs=True)
  File "/Users/pchalasani/Git/langroid/.venv/lib/python3.11/site-packages/mlc_llm/support/download_cache.py", line 70, in git_clone
    raise ValueError(
ValueError: Git clone failed with return code 128: None. The command was: ['git', 'clone', 'https://huggingface.co/mlc-ai/Qwen2.5-32B-Instruct-q4f32_1-MLC.git', '.tmp']
Exception ignored in: <function MLCEngineBase.__del__ at 0x127f94540>
Traceback (most recent call last):
  File "/Users/pchalasani/Git/langroid/.venv/lib/python3.11/site-packages/mlc_llm/serve/engine_base.py", line 654, in __del__
    self.terminate()
  File "/Users/pchalasani/Git/langroid/.venv/lib/python3.11/site-packages/mlc_llm/serve/engine_base.py", line 661, in terminate
    self._ffi["exit_background_loop"]()
    ^^^^^^^^^
AttributeError: 'AsyncMLCEngine' object has no attribute '_ffi'
pchalasani added the bug (Confirmed bugs) label on Sep 24, 2024
@shahizat

Hello, I am experiencing the same issue on the NVIDIA Jetson AGX Orin 64GB Developer Kit.

[2024-09-29 17:09:43] INFO auto_device.py:79: Found device: cuda:0
[2024-09-29 17:09:45] INFO auto_device.py:88: Not found device: rocm:0
[2024-09-29 17:09:47] INFO auto_device.py:88: Not found device: metal:0
[2024-09-29 17:09:49] INFO auto_device.py:88: Not found device: vulkan:0
[2024-09-29 17:09:51] INFO auto_device.py:88: Not found device: opencl:0
[2024-09-29 17:09:51] INFO auto_device.py:35: Using device: cuda:0
[2024-09-29 17:09:51] INFO download_cache.py:227: Downloading model from HuggingFace: HF://mlc-ai/Llama-3.2-1B-Instruct-q4f32_1-MLC
[2024-09-29 17:09:51] INFO download_cache.py:29: MLC_DOWNLOAD_CACHE_POLICY = ON. Can be one of: ON, OFF, REDO, READONLY
[2024-09-29 17:09:51] INFO download_cache.py:56: [Git] Cloning https://huggingface.co/mlc-ai/Llama-3.2-1B-Instruct-q4f32_1-MLC.git to /tmp/tmp39iovrkp/tmp
Traceback (most recent call last):
  File "/usr/local/lib/python3.10/dist-packages/mlc_llm/support/download_cache.py", line 57, in git_clone
    subprocess.run(
  File "/usr/lib/python3.10/subprocess.py", line 526, in run
    raise CalledProcessError(retcode, process.args,
subprocess.CalledProcessError: Command '['git', 'clone', 'https://huggingface.co/mlc-ai/Llama-3.2-1B-Instruct-q4f32_1-MLC.git', '.tmp']' returned non-zero exit status 128.

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
  File "/usr/local/bin/mlc_llm", line 8, in <module>
    sys.exit(main())
  File "/usr/local/lib/python3.10/dist-packages/mlc_llm/__main__.py", line 49, in main
    cli.main(sys.argv[2:])
  File "/usr/local/lib/python3.10/dist-packages/mlc_llm/cli/serve.py", line 204, in main
    serve(
  File "/usr/local/lib/python3.10/dist-packages/mlc_llm/interface/serve.py", line 55, in serve
    async_engine = engine.AsyncMLCEngine(
  File "/usr/local/lib/python3.10/dist-packages/mlc_llm/serve/engine.py", line 896, in __init__
    super().__init__(
  File "/usr/local/lib/python3.10/dist-packages/mlc_llm/serve/engine_base.py", line 590, in __init__
    ) = _process_model_args(models, device, engine_config)
  File "/usr/local/lib/python3.10/dist-packages/mlc_llm/serve/engine_base.py", line 171, in _process_model_args
    model_args: List[Tuple[str, str]] = [_convert_model_info(model) for model in models]
  File "/usr/local/lib/python3.10/dist-packages/mlc_llm/serve/engine_base.py", line 171, in <listcomp>
    model_args: List[Tuple[str, str]] = [_convert_model_info(model) for model in models]
  File "/usr/local/lib/python3.10/dist-packages/mlc_llm/serve/engine_base.py", line 125, in _convert_model_info
    model_path = download_cache.get_or_download_model(model.model)
  File "/usr/local/lib/python3.10/dist-packages/mlc_llm/support/download_cache.py", line 228, in get_or_download_model
    model_path = download_and_cache_mlc_weights(model)
  File "/usr/local/lib/python3.10/dist-packages/mlc_llm/support/download_cache.py", line 180, in download_and_cache_mlc_weights
    git_clone(git_url, tmp_dir, ignore_lfs=True)
  File "/usr/local/lib/python3.10/dist-packages/mlc_llm/support/download_cache.py", line 70, in git_clone
    raise ValueError(
ValueError: Git clone failed with return code 128: None. The command was: ['git', 'clone', 'https://huggingface.co/mlc-ai/Llama-3.2-1B-Instruct-q4f32_1-MLC.git', '.tmp']
Exception ignored in: <function MLCEngineBase.__del__ at 0xffff36b43130>
Traceback (most recent call last):
  File "/usr/local/lib/python3.10/dist-packages/mlc_llm/serve/engine_base.py", line 654, in __del__
    self.terminate()
  File "/usr/local/lib/python3.10/dist-packages/mlc_llm/serve/engine_base.py", line 661, in terminate
    self._ffi["exit_background_loop"]()
AttributeError: 'AsyncMLCEngine' object has no attribute '_ffi'

rickzx (Contributor) commented Sep 30, 2024

Hi @pchalasani @shahizat, I was not able to reproduce the same error on my Mac. I suspect this is due to a git configuration issue. Can you try directly running:

git clone https://huggingface.co/mlc-ai/Qwen2.5-32B-Instruct-q4f32_1-MLC.git

and see if that works?
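
If the direct clone also fails, a quick way to narrow this down (a diagnostic sketch, not something suggested in the thread) is to confirm git and git-lfs are usable and re-run the clone with git's trace output enabled:

git --version
git lfs version
GIT_TRACE=1 git clone https://huggingface.co/mlc-ai/Qwen2.5-32B-Instruct-q4f32_1-MLC.git

A missing or broken git-lfs setup is one common way a clone of an LFS-backed repository can exit with status 128.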

@pchalasani (Author)

@rickzx yes, the git clone works. Can we use mlc_llm serve and directly point it to the local cloned model, rather than the HF... argument?


ptrkstr commented Oct 1, 2024

@pchalasani you may be missing the dependencies in step 1 here

@pchalasani (Author)

@pchalasani you may be missing the dependencies in step 1 here

Thanks, but I'm not using it on iOS. Please let me know if these deps are needed for my scenario
(I followed the docs precisely, and this was not mentioned).


ptrkstr commented Oct 1, 2024

Ahh, apologies. I received a similar error and installing git-lfs is what worked for me, but it sounds like that's not applicable to you.

@MasterJH5574 (Member)

Can we use mlc_llm serve and directly point it to the local cloned model, rather than the HF... argument?

@pchalasani Yes, of course! You can do the following:

git clone https://huggingface.co/mlc-ai/Qwen2.5-32B-Instruct-q4f32_1-MLC
mlc_llm serve ./Qwen2.5-32B-Instruct-q4f32_1-MLC
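
One caveat (an assumption about the HuggingFace repo layout, not stated above): the weight shards in these MLC repos are stored with Git LFS, so the plain clone only fetches them when git-lfs is installed and initialized. If the cloned directory contains small pointer files instead of the actual weights, pulling them explicitly should help:

cd Qwen2.5-32B-Instruct-q4f32_1-MLC
git lfs pull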


shahizat commented Nov 6, 2024

Hello, I believe this issue is caused by Git LFS not being installed. Install it and see if that resolves the problem.

sudo apt-get install git-lfs
git lfs install
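
On macOS (the platform in the original report), the Homebrew equivalent should be:

brew install git-lfs
git lfs install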
