
To benchmark, run either of the following commands (the second adds --mixed):

TF_XLA_FLAGS="--tf_xla_auto_jit=2" XLA_FLAGS="--xla_gpu_enable_cublaslt=true" python gpt_transformer.py --fp8

TF_XLA_FLAGS="--tf_xla_auto_jit=2" XLA_FLAGS="--xla_gpu_enable_cublaslt=true" python gpt_transformer.py --mixed --fp8
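
The two commands above can also be driven from one Python script. This is a minimal sketch, not part of the repository: the helper names (build_env, build_command, run_all) are hypothetical, and it assumes gpt_transformer.py is in the current directory and accepts exactly the flags shown above. The wall-clock time it prints includes interpreter startup and XLA compilation, so treat it as a rough figure only.

```python
import os
import subprocess
import sys
import time

# Environment flags copied verbatim from the commands above.
XLA_ENV = {
    "TF_XLA_FLAGS": "--tf_xla_auto_jit=2",
    "XLA_FLAGS": "--xla_gpu_enable_cublaslt=true",
}

# The two configurations listed above.
CONFIGS = [["--fp8"], ["--mixed", "--fp8"]]


def build_env(base=None):
    """Return a copy of the environment with the XLA flags set."""
    env = dict(os.environ if base is None else base)
    env.update(XLA_ENV)
    return env


def build_command(args, script="gpt_transformer.py"):
    """Return the argv list for one benchmark run (script path is an assumption)."""
    return [sys.executable, script, *args]


def run_all():
    """Run each configuration in turn and print its total wall-clock time."""
    for args in CONFIGS:
        start = time.perf_counter()
        subprocess.run(build_command(args), env=build_env(), check=True)
        elapsed = time.perf_counter() - start
        print(f"{' '.join(args)}: {elapsed:.1f}s")
```

Calling run_all() launches both runs sequentially; check=True stops at the first run that exits with an error.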
