AssertionError: Outputs not close enough in tensor in test_numerics.py #1165

sirutBuasai · 2024-09-06T05:30:53Z

Hi,

I am seeing the following sanity test failures when running with PyTorch 2.4.0 + CUDA 12.4 + cuDNN 9.1.0.

=================================================================================================== short test summary info ====================================================================================================
FAILED TransformerEngine/tests/pytorch/test_numerics.py::test_layernorm_linear_accuracy[True-LayerNorm-126m-1-dtype0] - AssertionError: Outputs not close enough in tensor at idx=0. Location of the maximum difference: 359 with 0.2375284731388092 vs 0.23826351761817932 (diff 0.0007350444793701172).
FAILED TransformerEngine/tests/pytorch/test_numerics.py::test_layernorm_linear_accuracy[True-LayerNorm-126m-2-dtype0] - AssertionError: Outputs not close enough in tensor at idx=0. Location of the maximum difference: 2663 with 0.2375284731388092 vs 0.23826351761817932 (diff 0.0007350444793701172).
FAILED TransformerEngine/tests/pytorch/test_numerics.py::test_layernorm_linear_accuracy[True-RMSNorm-126m-1-dtype0] - AssertionError: Outputs not close enough in tensor at idx=0. Location of the maximum difference: 614 with 0.12908415496349335 vs 0.12974251806735992 (diff 0.0006583631038665771).
FAILED TransformerEngine/tests/pytorch/test_numerics.py::test_layernorm_linear_accuracy[True-RMSNorm-126m-2-dtype0] - AssertionError: Outputs not close enough in tensor at idx=0. Location of the maximum difference: 1602 with 0.10864763706922531 vs 0.10934028774499893 (diff 0.0006926506757736206).
FAILED TransformerEngine/tests/pytorch/test_numerics.py::test_layernorm_linear_accuracy[False-LayerNorm-126m-1-dtype0] - AssertionError: Outputs not close enough in tensor at idx=0. Location of the maximum difference: 359 with 0.2375284731388092 vs 0.23826351761817932 (diff 0.0007350444793701172).
FAILED TransformerEngine/tests/pytorch/test_numerics.py::test_layernorm_linear_accuracy[False-LayerNorm-126m-2-dtype0] - AssertionError: Outputs not close enough in tensor at idx=0. Location of the maximum difference: 2663 with 0.2375284731388092 vs 0.23826351761817932 (diff 0.0007350444793701172).
FAILED TransformerEngine/tests/pytorch/test_numerics.py::test_layernorm_linear_accuracy[False-RMSNorm-126m-1-dtype0] - AssertionError: Outputs not close enough in tensor at idx=0. Location of the maximum difference: 614 with 0.12908415496349335 vs 0.12974251806735992 (diff 0.0006583631038665771).
FAILED TransformerEngine/tests/pytorch/test_numerics.py::test_layernorm_linear_accuracy[False-RMSNorm-126m-2-dtype0] - AssertionError: Outputs not close enough in tensor at idx=0. Location of the maximum difference: 1602 with 0.10864763706922531 vs 0.10934028774499893 (diff 0.0006926506757736206).
FAILED TransformerEngine/tests/pytorch/test_numerics.py::test_layernorm_mlp_accuracy[LayerNorm-srelu-126m-2-dtype0] - AssertionError: Outputs not close enough in tensor at idx=1152. Location of the maximum difference: 608 with 21.20404052734375 vs 21.217058181762695 (diff 0.013017654418945312).
FAILED TransformerEngine/tests/pytorch/test_numerics.py::test_layernorm_mlp_accuracy[RMSNorm-srelu-126m-2-dtype0] - AssertionError: Outputs not close enough in tensor at idx=1152. Location of the maximum difference: 608 with 21.316505432128906 vs 21.329757690429688 (diff 0.01325225830078125).
============================================================================= 10 failed, 477 passed, 80 skipped, 663 warnings in 98.25s (0:01:38) =============================================================================
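To put the reported differences in perspective, here is a minimal sketch that recomputes the absolute and relative differences from the value pairs printed in the failure messages above. It is plain Python and does not use the test suite's own comparison helper. The short labels and the choice to normalize by the larger magnitude of each pair are my own assumptions, not something taken from test_numerics.py.

```python
# Value pairs copied from the failure messages above.
# The labels are shorthand for this sketch, not real test IDs.
reported = [
    ("layernorm_linear LayerNorm", 0.2375284731388092, 0.23826351761817932),
    ("layernorm_linear RMSNorm-1", 0.12908415496349335, 0.12974251806735992),
    ("layernorm_linear RMSNorm-2", 0.10864763706922531, 0.10934028774499893),
    ("layernorm_mlp LayerNorm", 21.20404052734375, 21.217058181762695),
    ("layernorm_mlp RMSNorm", 21.316505432128906, 21.329757690429688),
]


def differences(a, b):
    """Return (absolute difference, relative difference) for two values.

    The relative difference is normalized by the larger magnitude of the
    pair. That is an assumption of this sketch, since the failure message
    does not say which value is the reference.
    """
    abs_diff = abs(a - b)
    rel_diff = abs_diff / max(abs(a), abs(b))
    return abs_diff, rel_diff


for name, a, b in reported:
    abs_diff, rel_diff = differences(a, b)
    print(f"{name:28s} abs={abs_diff:.3e} rel={rel_diff:.3e}")
```

The layernorm_mlp pairs have a larger absolute difference but a smaller relative one than the layernorm_linear pairs, so which tolerance (absolute or relative) is being exceeded may differ between the two test groups.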

The tests run inside a Docker container on a single AWS p4d.24xlarge instance with A100 GPUs.

The tests are run with:

PYTORCH_JIT=0 NVTE_TORCH_COMPILE=0 NVTE_ALLOW_NONDETERMINISTIC_ALGO=0 pytest -v -s $TE_PATH/tests/pytorch/test_numerics.py

TE is installed with:

pip install --no-cache-dir git+https://github.com/NVIDIA/TransformerEngine.git@release_v1.9

Installed libraries:

Package                 Version
----------------------- -------------
absl-py                 2.1.0
accelerate              0.34.2
annotated-types         0.7.0
apex                    0.1
archspec                0.2.3
asttokens               2.4.1
awscli                  1.34.12
blis                    0.7.11
boltons                 24.0.0
boto3                   1.35.12
botocore                1.35.12
Brotli                  1.1.0
cached-property         1.5.2
catalogue               2.0.10
certifi                 2024.8.30
cffi                    1.17.1
charset-normalizer      3.3.2
click                   8.1.7
cloudpathlib            0.19.0
colorama                0.4.6
coloredlogs             15.0.1
comm                    0.2.2
conda                   24.7.1
conda-libmamba-solver   24.1.0
conda-package-handling  2.2.0
conda_package_streaming 0.9.0
confection              0.1.5
contourpy               1.3.0
cryptography            43.0.1
cycler                  0.12.1
cymem                   2.0.8
Cython                  3.0.11
debugpy                 1.8.5
decorator               5.1.1
distro                  1.9.0
docutils                0.16
einops                  0.8.0
exceptiongroup          1.2.2
executing               2.1.0
expecttest              0.2.1
fastai                  2.7.17
fastcore                1.7.4
fastdownload            0.0.7
fastprogress            1.0.3
filelock                3.15.4
flash_attn              2.4.2
flatbuffers             24.3.25
fonttools               4.53.1
frozendict              2.4.4
fsspec                  2024.9.0
grpcio                  1.66.1
h5py                    3.11.0
huggingface-hub         0.24.6
humanfriendly           10.0
idna                    3.8
importlib_metadata      8.4.0
iniconfig               2.0.0
ipykernel               6.29.5
ipython                 8.27.0
jedi                    0.19.1
Jinja2                  3.1.4
jmespath                1.0.1
joblib                  1.4.2
jsonpatch               1.33
jsonpointer             3.0.0
jupyter_client          8.6.2
jupyter_core            5.7.2
kiwisolver              1.4.7
langcodes               3.4.0
language_data           1.2.0
libmambapy              1.5.8
mamba                   1.5.8
marisa-trie             1.2.0
Markdown                3.7
markdown-it-py          3.0.0
MarkupSafe              2.1.5
matplotlib              3.9.2
matplotlib-inline       0.1.7
mdurl                   0.1.2
menuinst                2.1.2
mpi4py                  4.0.0
mpmath                  1.3.0
murmurhash              1.0.10
mypy-extensions         1.0.0
nest_asyncio            1.6.0
networkx                3.3
ninja                   1.11.1.1
numpy                   1.26.4
onnx                    1.16.2
onnxruntime             1.19.2
opencv-python           4.10.0.84
packaging               24.0
pandas                  2.2.2
parso                   0.8.4
pexpect                 4.9.0
pickleshare             0.7.5
pillow                  10.4.0
pip                     24.0
platformdirs            4.2.0
pluggy                  1.5.0
preshed                 3.0.9
prompt_toolkit          3.0.47
protobuf                5.28.0
psutil                  6.0.0
ptyprocess              0.7.0
pure_eval               0.2.3
pyasn1                  0.6.0
pybind11                2.13.5
pybind11_global         2.13.5
pycosat                 0.6.6
pycparser               2.22
pydantic                2.9.0
pydantic_core           2.23.2
Pygments                2.18.0
pyOpenSSL               24.2.1
pyparsing               3.1.4
pyre-extensions         0.0.30
PySocks                 1.7.1
pytest                  8.2.1
python-dateutil         2.9.0
pytz                    2024.1
PyYAML                  6.0.2
pyzmq                   26.2.0
requests                2.32.3
rich                    13.8.0
rsa                     4.7.2
ruamel.yaml             0.18.6
ruamel.yaml.clib        0.2.8
s3transfer              0.10.2
safetensors             0.4.5
scikit-learn            1.5.1
scipy                   1.14.1
setuptools              73.0.1
shellingham             1.5.4
six                     1.16.0
smart-open              7.0.4
spacy                   3.7.6
spacy-legacy            3.0.12
spacy-loggers           1.0.5
srsly                   2.4.8
stack-data              0.6.2
sympy                   1.13.2
tabulate                0.9.0
tensorboard             2.17.1
tensorboard-data-server 0.7.2
thinc                   8.2.5
threadpoolctl           3.5.0
torch                   2.4.0+cu124
torchaudio              2.4.0+cu124
torchtext               0.18.0+cu124
torchtnt                0.2.4
torchvision             0.19.0+cu124
tornado                 6.4.1
tqdm                    4.66.5
traitlets               5.14.3
transformer_engine      1.9.0+e79d915
triton                  3.0.0
truststore              0.8.0
typer                   0.12.5
typing_extensions       4.12.2
typing-inspect          0.9.0
tzdata                  2024.1
urllib3                 1.26.19
wasabi                  1.1.3
wcwidth                 0.2.13
weasel                  0.4.1
Werkzeug                3.0.4
wheel                   0.43.0
wrapt                   1.16.0
zipp                    3.20.1
zstandard               0.23.0

ptrendx added the "bug" label on Sep 7, 2024.
