Example code for mixed-precision training in TensorFlow and PyTorch.
It's good to have layer dimensions as multiples of 8 to utilize the TensorCores in Volta GPUs (a small sketch follows this list):
- Convolutions: number of input channels, output channels, and batch size should be multiples of 8
- GEMM: the M, N, and K dimensions should be multiples of 8
- Fully connected layers: input features, output features, and batch size should be multiples of 8
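As a rough illustration (not code from this repo), one way to follow the guideline is to round the dimensions you control, such as batch size and hidden width, up to the next multiple of 8; the `round_up_to_multiple_of_8` helper below is hypothetical:

```python
import torch
import torch.nn as nn

def round_up_to_multiple_of_8(n):
    # hypothetical helper: pad a dimension up to the next multiple of 8
    return ((n + 7) // 8) * 8

batch_size = round_up_to_multiple_of_8(100)     # 100 -> 104
hidden_units = round_up_to_multiple_of_8(1000)  # 1000 is already a multiple of 8

# 784 (= 28*28 MNIST pixels) happens to be a multiple of 8 already; the number of
# classes (10) is not, but the large GEMMs are the ones that matter most.
layer = nn.Linear(784, hidden_units).cuda().half()
x = torch.randn(batch_size, 784, device="cuda", dtype=torch.float16)
y = layer(x)  # fp16 GEMM with M = batch_size, N = hidden_units, K = 784
```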
- mnist_softmax.py - the simple softmax MNIST classification example from the TensorFlow sources
- mnist_softmax_fp16_naive.py - naive fp16 implementation - it just works
- mnist_softmax_deep.py - softmax MNIST classification with one hidden layer
- mnist_softmax_deep_fp16_naive.py - naive fp16 implementation of mnist_softmax_deep.py - it doesn't work
- mnist_softmax_deep_fp16_advanced.py - mixed-precision implementation of mnist_softmax_deep.py - trains faster by utilizing the TensorCores in Volta GPUs and uses less memory; experiment with the number of hidden units to see how it affects TensorCore utilization and training speed (the general recipe is sketched after this list)
- mnist_softmax_deep_conv_fp16_advanced.py - mixed-precision implementation of a convolutional neural network for MNIST classification; experiment with the convolutional filter sizes to see whether they affect TensorCore utilization and training speed
- pytorch - corresponding PyTorch implementations of the examples above
- Run the program with nvprof and check the log output - if there are kernel calls whose names contain "884", TensorCores are being used. Example:
```
nvprof python mnist_softmax_deep_conv_fp16_advanced.py
```
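The fp16_advanced examples follow the usual mixed-precision recipe: run the forward and backward passes in fp16 (so the GEMMs and convolutions can use TensorCores), keep an fp32 master copy of the weights for the optimizer update, and scale the loss so that small gradients stay representable in fp16. The PyTorch sketch below is a minimal illustration of that recipe, not the repository's code; the model and hyperparameters are placeholders:

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

loss_scale = 128.0  # the "default" value used by the examples here

# fp16 model for the forward/backward pass; dimensions chosen as multiples of 8
model = nn.Sequential(nn.Linear(784, 1024), nn.ReLU(), nn.Linear(1024, 10)).cuda().half()

# fp32 master copy of the weights; this is what the optimizer actually updates
master_params = [p.detach().clone().float().requires_grad_(True) for p in model.parameters()]
optimizer = torch.optim.SGD(master_params, lr=0.01)

def train_step(images, labels):
    model.zero_grad()
    logits = model(images.cuda().half())
    loss = F.cross_entropy(logits.float(), labels.cuda())  # compute the loss in fp32
    (loss * loss_scale).backward()                          # scale the loss before backward
    for master, param in zip(master_params, model.parameters()):
        master.grad = param.grad.detach().float() / loss_scale  # unscale into fp32 grads
    optimizer.step()                                        # fp32 weight update
    with torch.no_grad():
        for master, param in zip(master_params, model.parameters()):
            param.copy_(master.half())                      # copy fp32 weights back into the fp16 model
    return loss.item()
```

Computing the loss in fp32 and dividing the gradients by the scale before the weight update are what keep this numerically close to an fp32 run.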
The "default" loss-scaling value of 128 works for all the examples here. However, in a case it doesn't work, it's advised to choose a large value and gradually decrease it until sucessful. apex is a easy-to-use mixed-precision training utilities for PyTorch, and it's loss-scaler does that.