PyTorch "Undefined symbol" error when importing SAM ONNX models to cluster #161

marias65 · 2024-04-14T19:18:13Z

I am trying to follow the Segment Anything notebook sentinel2_segmentation.ipynb. When I import SAM's ONNX models to the cluster with ! python ../../scripts/export_sam_models.py --models vit_b, I get the following error: "ImportError: /home/msbksan/micromamba/envs/segment_anything_cpu/lib/python3.8/site-packages/torch/lib/libtorch_cpu.so: undefined symbol: iJIT_NotifyEvent"


rafaspadilha · 2024-04-15T13:37:40Z

Hi, @marias65. I couldn't reproduce your error on my machine, but I found a few similar issues here and here suggesting that the cause might be installing PyTorch via conda, and that a possible solution is to point to the CPU wheel during installation.

Quick question: are you able to import PyTorch in the segment_anything_cpu environment?

$ python -c "import torch; print(torch.__version__)"
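If the import fails, it can also help to see which torch installation Python resolves and which installer put it there, since a conda-installed build is the suspected cause. The following is a minimal sketch using only the standard library. The helper name describe_package is hypothetical, and the INSTALLER metadata file it reads is usually "pip" for wheels but may be absent or differ in some environments.

```python
import importlib
import importlib.util
from importlib import metadata


def describe_package(name):
    """Report where a top-level package resolves from and whether it imports."""
    info = {"name": name, "location": None, "version": None,
            "installer": None, "import_error": None}

    # find_spec locates a top-level package without executing it, so this
    # still works when the import itself fails on a shared library.
    spec = importlib.util.find_spec(name)
    if spec is None:
        info["import_error"] = "not installed"
        return info
    info["location"] = spec.origin

    try:
        dist = metadata.distribution(name)
        info["version"] = dist.version
        installer = dist.read_text("INSTALLER")
        info["installer"] = installer.strip() if installer else None
    except metadata.PackageNotFoundError:
        pass  # e.g. standard-library modules have no distribution metadata

    try:
        importlib.import_module(name)
    except ImportError as exc:  # "undefined symbol" errors surface here
        info["import_error"] = str(exc)
    return info


if __name__ == "__main__":
    for key, value in describe_package("torch").items():
        print(f"{key}: {value}")
```

A location under the environment's site-packages together with a non-empty import_error would point at the installed build itself rather than at a path mix-up.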

I was able to set up a new environment with the latest version of PyTorch and run the script to export the model to ONNX files. Could I ask you to try on your end as well?

Please change env_cpu.yaml, commenting out the PyTorch and pip lines as below:

name: new_segment_anything_cpu
channels:
  - pytorch
  - nvidia
  - conda-forge
  - defaults
dependencies:
  - python==3.8.*
  - geopandas~=0.11.1
  - ipython~=8.5.0
  - ipywidgets~=8.0.2
  - jupyter~=1.0.0
  - matplotlib~=3.6.0
  - numpy~=1.23.3
  # - pytorch=2.0.0=py3.8_cpu_0
  # - torchvision=0.15.0=py38_cpu
  # - torchaudio=2.0.0=py38_cpu
  - pip~=22.2.0
  - pandas~=1.5.0
  - rasterio~=1.3.2
  - shapely~=1.8.4
  - tqdm~=4.64.1
  - scikit-image~=0.20.0
  # - pip:
  #     - git+https://github.com/facebookresearch/segment-anything.git
  #     - ../../src/vibe_core
  #     - cartopy~=0.21.0
  #     - xarray~=2022.10.0
  #     - ipympl~=0.9.3
  #     - onnx~=1.14.0
  #     - onnxruntime~=1.15.0

Once the env is created, please activate it and install the pip packages:

$ micromamba activate new_segment_anything_cpu
$ pip install torch torchvision torchaudio --index-url https://download.pytorch.org/whl/cpu
$ pip install git+https://github.com/facebookresearch/segment-anything.git 
$ pip install ../../src/vibe_core 
$ pip install cartopy~=0.21.0 xarray~=2022.10.0 ipympl~=0.9.3 onnx~=1.14.0 onnxruntime~=1.15.0

Make sure the path to src/vibe_core in the $ pip install ../../src/vibe_core command is correct relative to your current directory.

Please let me know if you are able to run the export script in this new environment.

marias65 · 2024-04-16T19:34:53Z

Thank you for your response! I was not able to run $ python -c "import torch; print(torch.__version__)" as it gave me the same iJIT_NotifyEvent error while in the segment_anything_cpu environment.

I was able to create the new_segment_anything_cpu environment and install all the pip packages you listed but when I attempted to run $ python -c "import torch; print(torch.__version__)" or ! python ../../scripts/export_sam_models.py --models vit_b I still came across the same iJIT_NotifyEvent error.

rafaspadilha · 2024-04-19T14:25:28Z

Hi, @marias65. I was able to replicate your issue.

Installing PyTorch 2.1 from the CPU wheel index inside the segment anything environment solved the problem for me.

In summary, what I did was:

1. Create the segment_anything_cpu environment with the yaml that is currently available in the repo.
2. Run pip install torch~=2.1.0 --index-url https://download.pytorch.org/whl/cpu

After that, I was able to import torch:

$ python -c "import torch; print(torch.__version__)"
2.1.2+cpu
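The reported 2.1.2+cpu is consistent with the torch~=2.1.0 requirement: the ~= (compatible release) operator accepts any 2.1.x release at or above 2.1.0. A simplified sketch of that rule follows. The function name is hypothetical, it only handles plain numeric versions with an optional +local suffix, and pip's own resolver remains the authority on what gets installed.

```python
def is_compatible_release(version, spec):
    """Simplified check of `version ~= spec` for purely numeric versions."""
    # Drop a local label such as "+cpu"; it is ignored in this comparison.
    public = version.split("+", 1)[0]
    got = tuple(int(part) for part in public.split("."))
    want = tuple(int(part) for part in spec.split("."))
    if len(want) < 2:
        raise ValueError("~= needs at least two version components")
    # Pad the candidate with zeros so that "2.1" compares like "2.1.0".
    got += (0,) * (len(want) - len(got))
    # ~=X.Y.Z means: >= X.Y.Z and == X.Y.*
    prefix = want[:-1]
    return got >= want and got[:len(prefix)] == prefix


print(is_compatible_release("2.1.2+cpu", "2.1.0"))  # True
print(is_compatible_release("2.2.0", "2.1.0"))      # False
```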

Please, could you let me know if this works for you?

I will fix the environment yaml files in the next release.

rafaspadilha · 2024-04-19T15:02:32Z

Another possibility that worked for me (and won't change the pytorch version) was creating the environment with the following yaml:

name: segment_anything_cpu
channels:
  - pytorch
  - nvidia
  - conda-forge
  - defaults
dependencies:
  - python==3.8.*
  - geopandas~=0.11.1
  - ipython~=8.5.0
  - ipywidgets~=8.0.2
  - jupyter~=1.0.0
  - matplotlib~=3.6.0
  - numpy~=1.23.3
  - pip~=22.2.0
  - pandas~=1.5.0
  - rasterio~=1.3.2
  - shapely~=1.8.4
  - tqdm~=4.64.1
  - scikit-image~=0.20.0
  - pip:
      - --extra-index-url https://download.pytorch.org/whl/cpu
      - torch~=2.0.0
      - torchvision~=0.15.0
      - torchaudio~=2.0.0
      - git+https://github.com/facebookresearch/segment-anything.git
      - ../../src/vibe_core
      - cartopy~=0.21.0
      - xarray~=2022.10.0
      - ipympl~=0.9.3
      - onnx~=1.14.0
      - onnxruntime~=1.15.0

by running:

$ micromamba env create -f notebooks/segment_anything/env_cpu.yaml

With the environment activated:

$ python -c "import torch; print(torch.__version__)"
2.0.1+cpu
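The +cpu suffix in that output is the local version label carried by wheels from the CPU index. As a rough sanity check that the environment picked up a CPU-only build, something like the sketch below could be used. The helper name cpu_build_report is hypothetical, and the check assumes Linux wheels from download.pytorch.org/whl/cpu (builds for other platforms may not carry the +cpu label).

```python
def cpu_build_report(version, cuda_version):
    """Summarize a torch build from its version string and torch.version.cuda."""
    # Everything after "+" is the local version label, e.g. "cpu" or "cu118".
    label = version.partition("+")[2] or None
    return {
        "version": version,
        "local_label": label,
        "cuda": cuda_version,
        # Heuristic: CPU wheels report the "cpu" label and no CUDA version.
        "looks_cpu_only": label == "cpu" and cuda_version is None,
    }


if __name__ == "__main__":
    try:
        import torch
    except ImportError as exc:
        print(f"torch could not be imported: {exc}")
    else:
        print(cpu_build_report(torch.__version__, torch.version.cuda))
```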

marias65 · 2024-04-19T20:43:24Z

Thank you! I rebuilt farmvibes-ai and followed your latest solution, and that seems to have helped!

Right now, I receive this message, but looking into it further suggests that it is due to limited memory on the machine I am currently using. Otherwise, I would say that it worked, thank you.

rafaspadilha · 2024-04-23T12:09:07Z

I'm glad that error is fixed.

For this new one, the script doesn't require that much memory, especially with the vit_b model. What are your specs (memory and disk space)?

The script also logs a few messages (e.g., when it is able to load the encoder/decoder model and when it starts converting them), but these didn't show up, which I find odd.

Are you able to import onnxruntime and onnx?

import onnx
import onnxruntime
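Since limited memory is the suspected cause, one option is to run each import in a child interpreter, so that a crash or an out-of-memory kill shows up as an exit status instead of taking down the notebook kernel. This is a minimal sketch with the standard library only; the helper name probe_import is hypothetical.

```python
import subprocess
import sys


def probe_import(module):
    """Import `module` in a child interpreter and report how it ended."""
    result = subprocess.run(
        [sys.executable, "-c", f"import {module}"],
        capture_output=True,
        text=True,
    )
    if result.returncode == 0:
        status = "ok"
    elif result.returncode < 0:
        # On POSIX a negative code means the child was killed by a signal,
        # e.g. -9 (SIGKILL), which is what an out-of-memory kill looks like.
        status = f"killed by signal {-result.returncode}"
    else:
        status = "failed"
    # The last stderr line usually carries the exception type and message.
    tail = result.stderr.strip().splitlines()[-1:]
    return {"module": module, "status": status,
            "detail": tail[0] if tail else ""}


if __name__ == "__main__":
    for name in ("onnx", "onnxruntime"):
        print(probe_import(name))
```

A status of "failed" with an ImportError in detail would point at the installation, while "killed by signal 9" would be more consistent with the memory limit.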

rafaspadilha · 2024-05-17T16:34:15Z

Closing this issue for now. @marias65, let me know if you are still facing this error.

github-actions bot added the "triage" label (Issues still not triaged by team) on Apr 14, 2024

rafaspadilha self-assigned this on Apr 15, 2024

rafaspadilha added the "notebooks" (Issues encountered while running the notebooks) and "local cluster" (Issues encountered in local cluster) labels and removed the "triage" label on Apr 15, 2024

rafaspadilha changed the title from "Error when importing SAM ONNX models to cluster" to "PyTorch "Undefined symbol" error when importing SAM ONNX models to cluster" on Apr 15, 2024

rafaspadilha added the "bug" label (Something isn't working) on Apr 19, 2024

rafaspadilha closed this as completed May 17, 2024
