RuntimeError: Triton Error [CUDA]: invalid argument by run_dnabert2.sh #36

shiro-kur opened this issue Aug 31, 2023 · 4 comments

@shiro-kur

What is the problem here, and how can it be solved? Running run_dnabert2.sh produces the output below.

The provided data_path is /home/shiro/DNABERT_2/finetune
2023-08-31 17:57:18.856636: W tensorflow/stream_executor/platform/default/dso_loader.cc:59] Could not load dynamic library 'libcudart.so.10.1'; dlerror: libcudart.so.10.1: cannot open shared object file: No such file or directory; LD_LIBRARY_PATH: /usr/local/lib:/usr/local/cuda/lib64:
2023-08-31 17:57:18.856685: I tensorflow/stream_executor/cuda/cudart_stub.cc:29] Ignore above cudart dlerror if you do not have a GPU set up on your machine.
WARNING:root:Perform single sequence classification...
WARNING:root:Perform single sequence classification...
WARNING:root:Perform single sequence classification...
huggingface/tokenizers: The current process just got forked, after parallelism has already been used. Disabling parallelism to avoid deadlocks...
To disable this warning, you can either:
- Avoid using tokenizers before the fork if possible
- Explicitly set the environment variable TOKENIZERS_PARALLELISM=(true | false)
Some weights of the model checkpoint at zhihan1996/DNABERT-2-117M were not used when initializing BertForSequenceClassification: ['cls.predictions.decoder.weight', 'cls.predictions.transform.LayerNorm.bias', 'cls.predictions.transform.dense.bias', 'cls.predictions.transform.LayerNorm.weight', 'cls.predictions.transform.dense.weight', 'cls.predictions.decoder.bias']

- This IS expected if you are initializing BertForSequenceClassification from the checkpoint of a model trained on another task or with another architecture (e.g. initializing a BertForSequenceClassification model from a BertForPreTraining model).
- This IS NOT expected if you are initializing BertForSequenceClassification from the checkpoint of a model that you expect to be exactly identical (initializing a BertForSequenceClassification model from a BertForSequenceClassification model).
Some weights of BertForSequenceClassification were not initialized from the model checkpoint at zhihan1996/DNABERT-2-117M and are newly initialized: ['classifier.weight', 'bert.pooler.dense.weight', 'bert.pooler.dense.bias', 'classifier.bias']
You should probably TRAIN this model on a down-stream task to be able to use it for predictions and inference.
Using cuda_amp half precision backend
***** Running training *****
Num examples = 36,496
Num Epochs = 5
Instantaneous batch size per device = 8
Total train batch size (w. parallel, distributed & accumulation) = 32
Gradient Accumulation steps = 4
Total optimization steps = 5,700
Number of trainable parameters = 117,070,851
0%| | 0/5700 [00:00<?, ?it/s]Traceback (most recent call last):
File "<string>", line 21, in _bwd_kernel
KeyError: ('2-.-0-.-0-1e8410f206c822547fb50e2ea86e45a6-2b0c5161c53c71b37ae20a9996ee4bb8-c1f92808b4e4644c1732e8338187ac87-42648570729a4835b21c1c18cebedbfe-12f7ac1ca211e037f62a7c0c323d9990-5c5e32ff210f3b7f56c98ca29917c25e-06f0df2d61979d629033f4a22eff5198-0dd03b0bd512a184b3512b278d9dfa59-d35ab04ae841e2714a253c523530b071', (torch.float16, torch.float16, torch.float16, torch.float32, torch.float16, torch.float32, torch.float16, torch.float16, torch.float32, torch.float32, 'fp32', 'i32', 'i32', 'i32', 'i32', 'i32', 'i32', 'i32', 'i32', 'i32', 'i32', 'i32', 'i32', 'i32', 'i32', 'i32', 'i32', 'i32', 'i32', 'i32', 'i32', 'i32', 'i32', 'i32', 'i32', 'i32', 'i32', 'i32', 'i32', 'i32', 'i32', 'i32'), ('matrix', False, 64, False, False, False, True, 128, 128), (True, True, True, True, True, True, True, True, True, True, (False,), (True, False), (True, False), (True, False), (True, False), (True, False), (True, False), (True, False), (True, False), (True, False), (True, False), (True, False), (True, False), (True, False), (True, False), (True, False), (True, False), (True, False), (True, False), (True, False), (True, False), (True, False), (True, False), (True, False), (True, False), (False, False), (True, False), (True, False), (True, False), (True, False), (False, False), (False, False)))

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
File "train_modified.py", line 332, in
train()
File "train_modified.py", line 314, in train
trainer.train()
File "/home/shiro/miniconda3/envs/dnabert2/lib/python3.8/site-packages/transformers/trainer.py", line 1664, in train
return inner_training_loop(
File "/home/shiro/miniconda3/envs/dnabert2/lib/python3.8/site-packages/transformers/trainer.py", line 1940, in _inner_training_loop
tr_loss_step = self.training_step(model, inputs)
File "/home/shiro/miniconda3/envs/dnabert2/lib/python3.8/site-packages/transformers/trainer.py", line 2745, in training_step
self.scaler.scale(loss).backward()
File "/home/shiro/miniconda3/envs/dnabert2/lib/python3.8/site-packages/torch/_tensor.py", line 487, in backward
torch.autograd.backward(
File "/home/shiro/miniconda3/envs/dnabert2/lib/python3.8/site-packages/torch/autograd/init.py", line 197, in backward
Variable._execution_engine.run_backward( # Calls into the C++ engine to run the backward pass
File "/home/shiro/miniconda3/envs/dnabert2/lib/python3.8/site-packages/torch/autograd/function.py", line 267, in apply
return user_fn(self, *args)
File "/home/shiro/.cache/huggingface/modules/transformers_modules/zhihan1996/DNABERT-2-117M/81ac6a98387cf94bc283553260f3fa6b88cef2fa/flash_attn_triton.py", line 1041, in backward
_flash_attn_backward(do,
File "/home/shiro/.cache/huggingface/modules/transformers_modules/zhihan1996/DNABERT-2-117M/81ac6a98387cf94bc283553260f3fa6b88cef2fa/flash_attn_triton.py", line 949, in _flash_attn_backward
_bwd_kernel[grid]( # type: ignore
File "/home/shiro/miniconda3/envs/dnabert2/lib/python3.8/site-packages/triton/runtime/jit.py", line 106, in launcher
return self.run(*args, grid=grid, **kwargs)
File "/home/shiro/miniconda3/envs/dnabert2/lib/python3.8/site-packages/triton/runtime/autotuner.py", line 73, in run
timings = {config: self._bench(*args, config=config, **kwargs)
File "/home/shiro/miniconda3/envs/dnabert2/lib/python3.8/site-packages/triton/runtime/autotuner.py", line 73, in
timings = {config: self._bench(*args, config=config, **kwargs)
File "/home/shiro/miniconda3/envs/dnabert2/lib/python3.8/site-packages/triton/runtime/autotuner.py", line 63, in _bench
return do_bench(kernel_call)
File "/home/shiro/miniconda3/envs/dnabert2/lib/python3.8/site-packages/triton/testing.py", line 140, in do_bench
fn()
File "/home/shiro/miniconda3/envs/dnabert2/lib/python3.8/site-packages/triton/runtime/autotuner.py", line 62, in kernel_call
self.fn.run(*args, num_warps=config.num_warps, num_stages=config.num_stages, **current)
File "/home/shiro/miniconda3/envs/dnabert2/lib/python3.8/site-packages/triton/runtime/autotuner.py", line 200, in run
return self.fn.run(*args, **kwargs)
File "", line 43, in _bwd_kernel
RuntimeError: Triton Error [CUDA]: invalid argument
0%| | 0/5700 [00:00<?, ?it/s]
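
For what it's worth, the failure happens while Triton's autotuner benchmarks launch configurations for the flash-attention backward kernel (_bwd_kernel): a CUDA "invalid argument" at launch time generally means a requested configuration (block shape, warps, or shared-memory footprint) is not supported by the GPU. The crash can be reproduced outside the Trainer with a single fp16 backward pass. A minimal sketch, assuming the hub repo exposes the sequence-classification auto class the way the finetune script loads it (num_labels=3 and the dummy sequence are placeholders):

import torch
from transformers import AutoTokenizer, AutoModelForSequenceClassification

# Load the model the same way the fine-tuning script does; trust_remote_code
# pulls in flash_attn_triton.py, whose backward kernel is where the crash occurs.
tok = AutoTokenizer.from_pretrained("zhihan1996/DNABERT-2-117M", trust_remote_code=True)
model = AutoModelForSequenceClassification.from_pretrained(
    "zhihan1996/DNABERT-2-117M", trust_remote_code=True, num_labels=3  # num_labels is a placeholder
).cuda()

inputs = tok("ACGTACGTACGTACGT", return_tensors="pt").to("cuda")  # dummy sequence
labels = torch.tensor([0], device="cuda")

# Mirror the Trainer's cuda_amp half-precision backend: fp16 forward, scaled backward.
scaler = torch.cuda.amp.GradScaler()
with torch.autocast(device_type="cuda", dtype=torch.float16):
    loss = model(**inputs, labels=labels).loss
scaler.scale(loss).backward()  # the Triton _bwd_kernel launches here
print("backward pass completed")

If this small script dies with the same "Triton Error [CUDA]: invalid argument", the problem lies in the model's Triton attention kernel rather than in the fine-tuning setup.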

@shiro-kur (Author)

Here are my installed packages.

(dnabert2) shiro@GTUNE:~/DNABERT_2/finetune$ pip list
Package Version
------- -------
absl-py 1.0.0
accelerate 0.22.0
anndata 0.7.6
antlr4-python3-runtime 4.9.3
appdirs 1.4.4
astor 0.8.1
astunparse 1.6.3
autograd 1.4
autograd-gamma 0.5.0
biopython 1.79
biothings-client 0.2.6
bleach 5.0.1
Brotli 1.0.9
cachetools 5.0.0
certifi 2023.7.22
charset-normalizer 2.0.12
click 8.1.2
cmake 3.27.2
coloredlogs 15.0.1
cycler 0.11.0
dash 2.0.0
dash-core-components 2.0.0
dash-dangerously-set-inner-html 0.0.2
dash-html-components 2.0.0
dash-table 5.0.0
docutils 0.19
einops 0.6.1
filelock 3.12.3
Flask 2.1.1
Flask-Compress 1.11
fonttools 4.32.0
formulaic 0.2.4
fsspec 2023.6.0
future 0.18.2
gast 0.3.3
google-auth 2.6.5
google-auth-oauthlib 0.4.6
google-pasta 0.2.0
grpcio 1.44.0
h5py 2.10.0
huggingface-hub 0.16.4
humanfriendly 10.0
idna 3.4
importlib-metadata 4.11.3
interface-meta 1.3.0
itsdangerous 2.1.2
Jinja2 3.1.1
joblib 1.1.0
Keras-Preprocessing 1.1.2
kiwisolver 1.4.2
lifelines 0.26.4
lit 17.0.0rc3
llvmlite 0.36.0
Markdown 3.3.6
markdown-it-py 2.1.0
MarkupSafe 2.1.1
matplotlib 3.5.1
mdurl 0.1.2
mhcflurry 2.0.5
mhcgnomes 1.7.0
mygene 3.2.2
natsort 8.1.0
np-utils 0.6.0
numba 0.53.0
numpy 1.18.5
nvidia-cublas-cu11 11.10.3.66
nvidia-cuda-nvrtc-cu11 11.7.99
nvidia-cuda-runtime-cu11 11.7.99
nvidia-cudnn-cu11 8.5.0.96
omegaconf 2.3.0
opt-einsum 3.3.0
packaging 21.3
pandas 1.3.4
patsy 0.5.2
peft 0.3.0
Pillow 9.1.0
pip 23.2.1
pkginfo 1.9.6
plotly 5.4.0
protobuf 3.20.0
psutil 5.9.5
Pygments 2.14.0
pynndescent 0.5.6
pyparsing 3.0.8
python-dateutil 2.8.2
pytz 2022.1
PyYAML 6.0.1
readme-renderer 37.3
regex 2023.8.8
requests 2.26.0
requests-oauthlib 1.3.1
requests-toolbelt 0.10.1
rfc3986 2.0.0
rich 13.2.0
rsa 4.8
safetensors 0.3.3
scikit-learn 1.0.2
scipy 1.4.1
seaborn 0.11.2
serializable 0.2.1
setuptools 68.0.0
six 1.16.0
SNAF 0.5.2
statsmodels 0.13.1
tenacity 8.0.1
tensorboard 2.8.0
tensorboard-data-server 0.6.1
tensorboard-plugin-wit 1.8.1
tensorboardX 2.6.2.2
tensorflow 2.3.0
tensorflow-estimator 2.3.0
termcolor 1.1.0
threadpoolctl 3.1.0
tokenizers 0.13.3
torch 1.13.0
torchaudio 0.13.0
torchvision 0.14.0
tqdm 4.62.3
transformers 4.29.2
triton 2.0.0.dev20221202
twine 4.0.2
typechecks 0.1.0
typing_extensions 4.7.1
umap-learn 0.5.2
urllib3 1.26.14
webencodings 0.5.1
Werkzeug 2.0.2
wheel 0.38.4
wrapt 1.14.0
xlrd 1.2.0
xmltodict 0.12.0
xmltramp2 3.1.1

@shiro-kur (Author)

Here is my GPU environment.

(dnabert2) shiro@GTUNE:~$ nvidia-smi -L
GPU 0: NVIDIA GeForce RTX 3070 Laptop GPU (UUID: GPU-776edd0d-aef5-ab3a-3750-32bfa854fecf)

(dnabert2) shiro@GTUNE:~$ /usr/local/cuda/bin/nvcc --version
nvcc: NVIDIA (R) Cuda compiler driver
Copyright (c) 2005-2022 NVIDIA Corporation
Built on Tue_Mar__8_18:18:20_PST_2022
Cuda compilation tools, release 11.6, V11.6.124
Build cuda_11.6.r11.6/compiler.31057947_0

(dnabert2) shiro@GTUNE:~$ dpkg -l | grep cudnn
ii cudnn-local-repo-ubuntu2004-8.9.4.25 1.0-1 amd64 cudnn-local repository configuration files
ii libcudnn8 8.9.4.25-1+cuda11.8 amd64 cuDNN runtime libraries
ii libcudnn8-dev 8.9.4.25-1+cuda11.8 amd64 cuDNN development libraries and headers
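
A point worth checking: the RTX 3070 Laptop GPU is consumer Ampere (compute capability sm_86), which exposes less shared memory per block than the A100-class (sm_80) cards that Triton flash-attention kernels are typically tuned on, so an autotuner configuration that launches fine there can fail here with "invalid argument". A quick probe of the relevant properties, as a minimal sketch using the standard torch.cuda API:

import torch
import triton

# Report the device and the library versions that matter for the Triton kernel.
p = torch.cuda.get_device_properties(0)
print(f"{p.name}: sm_{p.major}{p.minor}, "
      f"{p.total_memory / 2**30:.1f} GiB, {p.multi_processor_count} SMs")
print("torch", torch.__version__, "| CUDA (torch build)", torch.version.cuda)
print("triton", triton.__version__)

On this machine it should report sm_86, which is the detail the autotuner's 128x128 block configurations can trip over.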

@jiaojiaoguan

I hit the same error: an identical KeyError from _bwd_kernel, exactly as in the traceback above.

Have you solved it? Thanks!

@shiro-kur (Author)

shiro-kur commented Feb 1, 2024

I just gave up..... Sorry.
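
For anyone landing here later: a workaround reported for this class of Triton flash-attention failures is to force the model onto its plain PyTorch attention path. The sketch below assumes (unverified) that the cached remote code guards its flash_attn_triton import with a try/except and falls back to standard attention when that import fails; uninstalling triton from the environment (pip uninstall triton) is the code-free version of the same idea:

# Workaround sketch: hide triton so the remote model code cannot import its
# flash-attention module and (assuming an import guard exists) falls back to
# standard PyTorch attention. Do this before the model is loaded.
import sys
sys.modules["triton"] = None  # makes any "import triton" raise ImportError

from transformers import AutoModelForSequenceClassification

model = AutoModelForSequenceClassification.from_pretrained(
    "zhihan1996/DNABERT-2-117M", trust_remote_code=True, num_labels=3  # num_labels is a placeholder
)

The fallback path is slower and uses more memory, but it avoids launching the kernel that fails on this GPU.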
