This repository has been archived by the owner on Jun 10, 2024. It is now read-only.

feat: add Cupy samples #518

Open · wants to merge 9 commits into master

Conversation

royinx (Contributor) commented on Aug 15, 2023

hello world,
this PR does the following:

  • added a sample for CuPy + TensorRT
  • added a build config and test case for CI/CD
  • fixed an issue in SampleTensorRTResnet.py where the first batch's output was not synchronized

please review

@royinx force-pushed the master branch 3 times, most recently from fbddfa3 to 0b1f36b on August 15, 2023 at 19:04
Comment on lines +16 to +22
def get_cupy() -> str:
    CUDA_VERSION = os.environ.get("CUDA_VERSION", None)
    if CUDA_VERSION >= "11.2":  # CUDA 11.2+ uses one wheel per major version, e.g. cupy-cuda11x
        cupy_pack = f"cupy-cuda{CUDA_VERSION[:2]}x"
    else:
        cupy_pack = f"cupy-cuda{CUDA_VERSION[:4].replace('.','')}"
    return cupy_pack

Collaborator

This fails if the env variable is not set. Can we add a check so that, in that case, we either don't install any cupy at all or just install the latest version?
Also, you could do a fall-through check: env variable -> nvcc subprocess -> nvidia-smi.
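
A minimal sketch of such a fall-through check (the helper name, regexes, and error handling here are assumptions for illustration, not part of this PR):

import os
import re
import subprocess
from typing import Optional

def detect_cuda_version() -> Optional[str]:
    # 1. Environment variable (set by the nvidia/cuda base images).
    version = os.environ.get("CUDA_VERSION")
    if version:
        return version
    # 2. nvcc, if a CUDA toolkit is present in the image.
    try:
        out = subprocess.run(["nvcc", "--version"],
                             capture_output=True, text=True, check=True).stdout
        match = re.search(r"release (\d+\.\d+)", out)
        if match:
            return match.group(1)
    except (FileNotFoundError, subprocess.CalledProcessError):
        pass
    # 3. nvidia-smi, as a last resort (reports the driver's CUDA version, not the toolkit's).
    try:
        out = subprocess.run(["nvidia-smi"],
                             capture_output=True, text=True, check=True).stdout
        match = re.search(r"CUDA Version:\s*(\d+\.\d+)", out)
        if match:
            return match.group(1)
    except (FileNotFoundError, subprocess.CalledProcessError):
        pass
    return None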

royinx (Contributor, Author) commented on Aug 16, 2023

I considered the env var, nvcc, and nvidia-smi approaches, and each of them runs into issues:

  1. the env var may not be set
  2. nvcc is not available in runtime images, e.g. nvidia/cuda:11.7.1-runtime-ubuntu22.04
  3. nvidia-smi always shows the host driver's CUDA version (e.g. 12.2) even while the Docker container is using 11.7,
     and in some cases I have seen there is no nvidia-smi at all.

Could I instead list all CUDA versions under the /usr/local/cuda* directories?
I have no idea about CUDA on Windows, though.
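
For illustration, a minimal sketch of that /usr/local/cuda* listing on Linux (the glob pattern and parsing are assumptions):

import glob
import os

def list_local_cuda_versions() -> list:
    # Versioned toolkit installs usually live at /usr/local/cuda-<major>.<minor> on Linux.
    versions = []
    for path in glob.glob("/usr/local/cuda-*"):
        name = os.path.basename(path)          # e.g. "cuda-11.7"
        version = name.replace("cuda-", "", 1)
        if version and version[0].isdigit():
            versions.append(version)
    return sorted(versions)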

Collaborator

Yeah, as usual Windows is the annoying part. This is why I thought you could do a fall-through approach: try one method after the other, and if nothing works, just assume a version?

royinx (Contributor, Author)

For Linux, I check /usr/local/cuda-* to extract the version.
For Windows, I implemented the fall-through approach (nvcc > nvidia-smi).
This approach should handle most of the cases.

There are still some potential issues with the fall-through approach:

  • nvcc may not be available.
  • utility is not listed in NVIDIA_DRIVER_CAPABILITIES=compute,video (which also conflicts with Dockerfile.tensorrt)
  • the CUDA version reported by nvidia-smi may not match the CUDA that is actually installed.

Since the cupy packages conflict with each other, assuming a version is not a good idea. I would rather not install it at all if the CUDA version cannot be determined reliably; see the sketch below.
If the complexity keeps growing, I think we should just let the user install it manually.
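
A minimal sketch of that "skip if unknown" behaviour, assuming a detect_cuda_version() helper like the fall-through sketch earlier in this thread (the wheel names follow CuPy's published packages):

def get_cupy():
    # Returns the cupy wheel name for the detected CUDA version, or None to skip installation.
    version = detect_cuda_version()  # hypothetical helper from the fall-through sketch above
    if version is None:
        # CUDA version could not be determined reliably; let the user install cupy manually.
        return None
    major, minor = (int(x) for x in version.split(".")[:2])
    if (major, minor) >= (11, 2):
        return f"cupy-cuda{major}x"      # e.g. cupy-cuda11x, cupy-cuda12x
    return f"cupy-cuda{major}{minor}"    # e.g. cupy-cuda102, cupy-cuda111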
