install failure of TensorRT #3721

Open
Tom-Teamo opened this issue Mar 18, 2024 · 11 comments
Labels: triaged (Issue has been triaged by maintainers)

Comments

@Tom-Teamo

Tom-Teamo commented Mar 18, 2024

Description

When I install TensorRT on Ubuntu 22.04, following step 1 of the setup instructions, the build always fails right after "extracting: ngc-cli.md5". I do not know how to fix this.

... 
9.464   inflating: ngc-cli/opentelemetry/proto/common/v1/common_pb2.pyi  
9.464   inflating: ngc-cli/opentelemetry/proto/version.py  
9.464    creating: ngc-cli/opentelemetry/util/
9.464   inflating: ngc-cli/opentelemetry/util/_once.py  
9.464    creating: ngc-cli/opentelemetry/util/__pycache__/
9.464   inflating: ngc-cli/opentelemetry/util/__pycache__/_once.cpython-39.pyc  
9.464   inflating: ngc-cli/opentelemetry/util/__pycache__/_providers.cpython-39.pyc  
9.464   inflating: ngc-cli/opentelemetry/util/__pycache__/re.cpython-39.pyc  
9.464   inflating: ngc-cli/opentelemetry/util/__pycache__/_importlib_metadata.cpython-39.pyc  
9.464   inflating: ngc-cli/opentelemetry/util/__pycache__/types.cpython-39.pyc  
9.464   inflating: ngc-cli/opentelemetry/util/_importlib_metadata.py  
9.464   inflating: ngc-cli/opentelemetry/util/_providers.py  
9.464   inflating: ngc-cli/opentelemetry/util/re.py  
9.464   inflating: ngc-cli/opentelemetry/util/types.py  
9.464  extracting: ngc-cli.md5             
71.87 Connection failed; retrying... (Retries left: 5)
137.0 Connection failed; retrying... (Retries left: 4)
202.1 Connection failed; retrying... (Retries left: 3)
267.3 Connection failed; retrying... (Retries left: 2)
332.4 Connection failed; retrying... (Retries left: 1)
332.4 Error: Request timed out.
------
ubuntu-20.04.Dockerfile:109
--------------------
 107 |     
 108 |     # Download NGC client
 109 | >>> RUN cd /usr/local/bin && wget https://ngc.nvidia.com/downloads/ngccli_cat_linux.zip && unzip ngccli_cat_linux.zip && chmod u+x ngc-cli/ngc && rm ngccli_cat_linux.zip ngc-cli.md5 && echo "no-apikey\nascii\n" | ngc-cli/ngc config set
 110 |     
 111 |     # Set environment and working directory
--------------------
ERROR: failed to solve: process "/bin/bash -c cd /usr/local/bin && wget https://ngc.nvidia.com/downloads/ngccli_cat_linux.zip && unzip ngccli_cat_linux.zip && chmod u+x ngc-cli/ngc && rm ngccli_cat_linux.zip ngc-cli.md5 && echo \"no-apikey\\nascii\\n\" | ngc-cli/ngc config set" did not complete successfully: exit code: 1

Environment

TensorRT Version: latest

NVIDIA GPU: 3090

NVIDIA Driver Version: 12.2

CUDA Version: 12.1

Operating System: Ubuntu22.04

Steps To Reproduce

Follow step 1 of the setup instructions; the build always fails right after "extracting: ngc-cli.md5".

@lix19937

This may be a network problem; you may need a stable VPN.

@YongWookHa

I'm facing the same problem at extracting: ngc-cli.md5.
@lix19937, did you test it?

@lix19937

lix19937 commented Mar 19, 2024

You can download ngccli_cat_linux.zip offline, unzip it, and then run:
chmod u+x ngc-cli/ngc && echo "no-apikey\nascii\n" | ngc-cli/ngc config set

@YongWookHa

YongWookHa commented Mar 19, 2024

@lix19937 It's not a network problem.
The command gets stuck at the echo "no-apikey\nascii\n" | ngc-cli/ngc config set part.

@YongWookHa

YongWookHa commented Mar 20, 2024

I've just resolved the problem with the steps below.

  1. Split the line in the Dockerfile

RUN cd /usr/local/bin && wget https://ngc.nvidia.com/downloads/ngccli_cat_linux.zip && unzip ngccli_cat_linux.zip && chmod u+x ngc-cli/ngc && rm ngccli_cat_linux.zip ngc-cli.md5 && echo "no-apikey\nascii\n" | ngc-cli/ngc config set
to

# Download NGC client
RUN cd /usr/local/bin && wget https://ngc.nvidia.com/downloads/ngccli_cat_linux.zip
RUN cd /usr/local/bin && unzip ngccli_cat_linux.zip
RUN cd /usr/local/bin && chmod u+x ngc-cli/ngc
RUN cd /usr/local/bin && rm -f ngccli_cat_linux.zip ngc-cli.md5
RUN cd /usr/local/bin && echo "no-apikey\nascii\n" | ngc-cli/ngc config set
  2. Comment out the command that causes the problem
# RUN cd /usr/local/bin && echo "no-apikey\nascii\n" | ngc-cli/ngc config set
  3. Build the Docker image
> ./docker/build.sh --file docker/ubuntu-20.04.Dockerfile --tag tensorrt-ubuntu20.04-cuda12.1
  4. Launch the container
> ./docker/launch.sh --tag tensorrt-ubuntu20.04-cuda12.1 --gpus all
  5. Run the command manually inside the container
> ngc-cli/ngc config set
Enter API key [no-apikey]. Choices: [<VALID_APIKEY>, 'no-apikey']: no-apikey
Enter CLI output format type [ascii]. Choices: ['ascii', 'csv', 'json']: ascii
Validating configuration...
Successfully validated configuration.
Saving configuration...
Successfully saved NGC configuration to /home/trtuser/.ngc/config

zerollzeng added the "triaged" label (Issue has been triaged by maintainers) on Mar 22, 2024

@guiyuliu

Hello, do you know why this step gets stuck? echo "no-apikey\nascii\n" | ngc-cli/ngc config set
I encountered the same problem here.

@bashirmindee

Doing echo "no-apikey\nascii\n" | ... is really weird.
I suggest doing: /usr/local/bin/ngc-cli/ngc config set --format_type ascii
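
One possible reason the piped input looks odd, offered as a sketch rather than a diagnosis: bash's builtin echo does not expand \n by default unless -e is given, so the original command may hand ngc a single line containing literal backslashes instead of two separate answers. printf expands the escapes in any POSIX shell. The helper name ngc_answers is made up for illustration, and whether this change alone avoids the hang has not been tested here.

```shell
#!/bin/sh
# Hypothetical helper: emit the two answers "ngc config set" prompts for,
# one per line. printf expands \n in every POSIX shell, whereas bash's
# builtin echo needs -e for that by default.
ngc_answers() {
    printf 'no-apikey\nascii\n'
}

# Show what would be piped into the CLI.
ngc_answers

# Intended use inside the image (assumes ngc-cli is unpacked in /usr/local/bin):
#   ngc_answers | /usr/local/bin/ngc-cli/ngc config set
```

Passing --format_type ascii on the command line, as suggested above, sidesteps the interactive prompts altogether.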

@bashirmindee

For some reason, you have to run ngc-cli/ngc config twice for it to work!
The first time it blocks and doesn't work. The second time it works correctly.
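
The "run it twice" observation could be scripted roughly as follows. This is only a sketch under assumptions: run_with_retry is a hypothetical helper, it relies on the coreutils timeout command being present in the image, and the 60-second limit in the usage comment is arbitrary. It has not been verified against ngc itself.

```shell
#!/bin/sh
# Hypothetical helper: run a command with a time limit and, if the first
# attempt fails or times out, run it exactly once more.
# Usage: run_with_retry SECONDS COMMAND [ARGS...]
run_with_retry() {
    limit="$1"
    shift
    timeout "$limit" "$@" || timeout "$limit" "$@"
}

# Intended use inside the image (60 is an arbitrary limit):
#   run_with_retry 60 /usr/local/bin/ngc-cli/ngc config set --format_type ascii
```

The function returns the exit status of the last attempt, so a Dockerfile RUN step that uses it still fails if both attempts fail.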

@bashirmindee

I finally found it.
ngc config ... generates /root/.ngc/meta_data if it doesn't exist. However, it blocks the first time because the file doesn't exist yet.
By generating dummy data you can bypass this:

RUN mkdir -p /root/.ngc
RUN echo -e ";WARNING - This is a machine generated file.  Do not edit manually.\n\n[COMMAND_MAP]\n\n[UPGRADE]\nlast_upgrade_msg_date = 2024-03-26::09:43:31\n\n[USER_ROLES]\n\n[PRODUCT_NAMES]\n" > /root/.ngc/meta_data
RUN /usr/local/bin/ngc-cli/ngc config set --format_type ascii

This works directly. I can open a PR to fix it.
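
The same workaround can be written as a small shell function, which avoids relying on echo -e (that flag only behaves as intended when the RUN shell is bash). This is a sketch: seed_ngc_meta_data is a hypothetical name, the file contents are copied from the comment above, and the claim that ngc blocks without this file is the commenter's report, not something verified here.

```shell
#!/bin/sh
# Hypothetical helper: pre-create the meta_data file that, per the comment
# above, "ngc config" otherwise blocks on during its first run.
# The file contents are copied from that comment.
# Usage: seed_ngc_meta_data [DIR]   (DIR defaults to $HOME/.ngc)
seed_ngc_meta_data() {
    dir="${1:-$HOME/.ngc}"
    mkdir -p "$dir" || return 1
    cat > "$dir/meta_data" <<'EOF'
;WARNING - This is a machine generated file.  Do not edit manually.

[COMMAND_MAP]

[UPGRADE]
last_upgrade_msg_date = 2024-03-26::09:43:31

[USER_ROLES]

[PRODUCT_NAMES]

EOF
}

# Intended use inside the image, running as root:
#   seed_ngc_meta_data /root/.ngc
#   /usr/local/bin/ngc-cli/ngc config set --format_type ascii
```

Taking the directory as a parameter keeps the function usable for a non-root user such as trtuser, whose config lives under /home/trtuser/.ngc.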

@Goodluckhf

Goodluckhf commented Apr 3, 2024

It started happening after a new CLI version was released. I installed a specific previous version and that fixed it.

You can change this line

RUN cd /usr/local/bin && wget https://ngc.nvidia.com/downloads/ngccli_cat_linux.zip && unzip ngccli_cat_linux.zip && chmod u+x ngc-cli/ngc && rm ngccli_cat_linux.zip ngc-cli.md5 && echo "no-apikey\nascii\n" | ngc-cli/ngc config set

to:

RUN cd /usr/local/bin && wget --content-disposition https://api.ngc.nvidia.com/v2/resources/nvidia/ngc-apps/ngc_cli/versions/3.38.0/files/ngccli_linux.zip -O ngccli_linux.zip && unzip ngccli_linux.zip && chmod u+x ngc-cli/ngc && rm ngccli_linux.zip ngc-cli.md5 && echo "no-apikey\nascii\n" | ngc-cli/ngc config set

P.S. You can choose a specific version here: https://org.ngc.nvidia.com/setup/installers/cli
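
To make the pinned version easier to change, the URL can be built from a variable. This is a sketch only: ngc_cli_url and NGC_CLI_VERSION are hypothetical names, and it assumes other versions follow the same URL pattern as the 3.38.0 link above, which is not confirmed in this thread.

```shell
#!/bin/sh
# Hypothetical helper: build the download URL for a pinned NGC CLI version,
# following the URL pattern quoted in the comment above.
ngc_cli_url() {
    version="$1"
    printf 'https://api.ngc.nvidia.com/v2/resources/nvidia/ngc-apps/ngc_cli/versions/%s/files/ngccli_linux.zip\n' "$version"
}

# Version from the comment above; override with NGC_CLI_VERSION=... if needed.
NGC_CLI_VERSION="${NGC_CLI_VERSION:-3.38.0}"
ngc_cli_url "$NGC_CLI_VERSION"

# Intended use inside the image:
#   wget "$(ngc_cli_url "$NGC_CLI_VERSION")" -O ngccli_linux.zip
```

Pinning the version keeps the image build reproducible when a newer CLI release changes behavior, which is what this comment reports happened.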

@Nghiauet

@Goodluckhf Thanks, it works for me.
