Greetings, I have been looking for a way to run kohya_ss on the Jetson AGX Orin inside the NVIDIA container so that it uses the GPU. After cloning this repository and running the docker compose command, I received the following output:
user@ubuntu:~/kohya_ss-docker$ docker compose --profile kohya up --build
[+] Building 1.1s (22/27) docker:default
=> [kohya internal] load build definition from Dockerfile 0.0s
=> => transferring dockerfile: 5.35kB 0.0s
=> [kohya] resolve image config for docker.io/docker/dockerfile:1 0.3s
=> CACHED [kohya] docker-image://docker.io/docker/dockerfile:1@sha256:ac 0.0s
=> [kohya internal] load metadata for docker.io/library/python:3.10-slim 0.2s
=> [kohya internal] load .dockerignore 0.0s
=> => transferring context: 2B 0.0s
=> [kohya internal] load build context 0.0s
=> => transferring context: 187B 0.0s
=> [kohya base 1/7] FROM docker.io/library/python:3.10-slim@sha256:4bd9a 0.0s
=> CACHED [kohya base 2/7] RUN <<EOF (# apt for general container depend 0.0s
=> CACHED [kohya base 3/7] RUN <<EOF (# apt for extensions/custom script 0.0s
=> CACHED [kohya base 4/7] RUN <<EOF (# apt configurations...) 0.0s
=> CACHED [kohya base 5/7] RUN <<EOF (# cuda configurations...) 0.0s
=> CACHED [kohya base 6/7] COPY ./scripts/install-container-dep.sh /dock 0.0s
=> CACHED [kohya base 7/7] RUN <<EOF (# cuda cudnn + cutlass + tensorrt. 0.0s
=> CACHED [kohya kohya_base 1/8] RUN <<EOF (git clone https://github.com 0.0s
=> CACHED [kohya kohya_base 2/8] WORKDIR /koyah_ss 0.0s
=> CACHED [kohya kohya_base 3/8] RUN <<EOF (# Build requirements...) 0.0s
=> CACHED [kohya kohya_base 4/8] RUN <<EOF (# tensorflow...) 0.0s
=> CACHED [kohya kohya_base 5/8] RUN <<EOF (# torch, torchvision, torcha 0.0s
=> CACHED [kohya kohya_base 6/8] RUN <<EOF (# xformers...) 0.0s
=> CACHED [kohya kohya_base 7/8] RUN <<EOF (# deepspeed...) 0.0s
=> CACHED [kohya kohya_base 8/8] RUN <<EOF (#jax/tpu...) 0.0s
=> ERROR [kohya kohya_cuda 1/2] RUN <<EOF (# Hotfix for libnvinfer7...) 0.3s
[kohya kohya_cuda 1/2] RUN <<EOF (# Hotfix for libnvinfer7...):
0.276 + ln -s /venv/lib/python3.10/site-packages/tensorrt/libnvinfer.so.8 /venv/lib/python3.10/site-packages/tensorrt/libnvinfer.so.7
0.278 ln: failed to create symbolic link '/venv/lib/python3.10/site-packages/tensorrt/libnvinfer.so.7': No such file or directory
failed to solve: process "/bin/bash -ceuxo pipefail # Hotfix for libnvinfer7\nln -s $TENSORRT_PATH/libnvinfer.so.8 $TENSORRT_PATH/libnvinfer.so.7\nln -s $TENSORRT_PATH/libnvinfer_plugin.so.8 $TENSORRT_PATH/libnvinfer_plugin.so.7\n" did not complete successfully: exit code: 1
Any thoughts on getting past this, so we can move on with some training?
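A note on the failure itself: `ln -s` will happily create a link to a target that does not exist, so "No such file or directory" here points at the link's parent directory, i.e. `/venv/lib/python3.10/site-packages/tensorrt` appears to be missing from the image. That would suggest the tensorrt package did not end up at that path in one of the earlier (cached) steps, though the log above does not confirm why. As a minimal sketch of a workaround, assuming the hotfix is optional for your use case, the step could be guarded so it only links when the files are actually present. `link_if_present` is a hypothetical helper name, not something from this repository, and the default path is simply the one shown in the log.

```shell
#!/usr/bin/env bash
set -euo pipefail

# Hypothetical helper (not part of the repository): create DST as a symlink
# to SRC only when DST's directory and SRC both exist; otherwise skip.
link_if_present() {
  local src="$1" dst="$2"
  local dir
  dir="$(dirname "$dst")"
  if [ ! -d "$dir" ]; then
    echo "skip: directory $dir does not exist" >&2
    return 0
  fi
  if [ ! -e "$src" ]; then
    echo "skip: $src not found" >&2
    return 0
  fi
  ln -sf "$src" "$dst"
}

# TENSORRT_PATH is the variable used by the failing build step;
# the fallback value is the path printed in the error message.
TENSORRT_PATH="${TENSORRT_PATH:-/venv/lib/python3.10/site-packages/tensorrt}"

link_if_present "$TENSORRT_PATH/libnvinfer.so.8" "$TENSORRT_PATH/libnvinfer.so.7"
link_if_present "$TENSORRT_PATH/libnvinfer_plugin.so.8" "$TENSORRT_PATH/libnvinfer_plugin.so.7"
```

Skipping the links only gets the build past this step. If the directory really is missing, TensorRT will most likely be unavailable inside the container, so it is still worth rebuilding the earlier stages with `docker compose build --no-cache` and checking whether the tensorrt install step reports anything.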
I am getting the same error on my Arch Linux desktop with an RTX 4090. I have the NVIDIA Container Toolkit installed, and the card works fine in other Docker containers.