TL;DR:
`nn/lib/THNN/generic/BatchNormalization.c`, line 53 (commit `8726825`):
The variable `eps` is a double, so `THTensor_(get1d)(running_var, f)` is promoted to double, the whole `1 / sqrt(...)` expression is evaluated in double, and only the final result is converted back to `real`, because `invstd` is `real`. If `eps` is converted to `float` first, the output should match the expected behavior.
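A minimal numpy sketch of the mixed-precision path vs. the all-float32 path (illustrative values only; numpy's float64 promotion stands in for the implicit float-to-double promotion in the C code):

```python
import numpy as np

rng = np.random.default_rng(0)
var = rng.random(1_000_000).astype(np.float32)  # stand-in running_var values
eps = 1e-5  # a Python float, i.e. a C double, as in THNN

# Mixed path (current THNN behavior): the double eps promotes the variance
# to float64, 1 / sqrt(...) is evaluated in float64, and only the final
# result is truncated back to float32 (the cast to `real`).
invstd_promoted = (1.0 / np.sqrt(var.astype(np.float64) + eps)).astype(np.float32)

# All-float32 path (what Caffe / a plain float32 numpy port computes):
# eps is cast to float32 first, so every intermediate stays float32.
invstd_f32 = np.float32(1.0) / np.sqrt(var + np.float32(eps))

# The two disagree in the last bits for a fraction of inputs.
print("differing invstd values:", np.count_nonzero(invstd_promoted != invstd_f32))
```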
I was trying to convert a PyTorch model to Caffe on CPU. All conv layers worked flawlessly and produced equal output tensors, but on the BatchNorm layers something broke and the output was very different. After some investigation I found that the Caffe implementation produced the same output as a direct numpy equivalent in float32, but PyTorch didn't. After setting the types of the values described above to float64, I was able to reproduce the PyTorch batchnorm output with numpy.
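For reference, a sketch of the two numpy variants being compared, assuming the standard inference-mode formula `y = (x - mean) / sqrt(var + eps) * weight + bias` (the helper names and test data are made up for illustration):

```python
import numpy as np

def batchnorm_f32(x, mean, var, weight, bias, eps=1e-5):
    # All-float32 invstd: matches Caffe and a plain float32 numpy port.
    invstd = np.float32(1.0) / np.sqrt(var + np.float32(eps))
    return (x - mean) * invstd * weight + bias

def batchnorm_f64_invstd(x, mean, var, weight, bias, eps=1e-5):
    # invstd computed in float64 and cast back, mimicking the THNN path
    # described above; this variant reproduces the PyTorch output.
    invstd = (1.0 / np.sqrt(var.astype(np.float64) + eps)).astype(np.float32)
    return (x - mean) * invstd * weight + bias

rng = np.random.default_rng(0)
x = rng.standard_normal((1, 256, 8, 8)).astype(np.float32)  # NCHW input
mean = x.mean(axis=(0, 2, 3)).reshape(1, -1, 1, 1)  # per-channel stats
var = x.var(axis=(0, 2, 3)).reshape(1, -1, 1, 1)
weight = np.ones((1, 256, 1, 1), dtype=np.float32)
bias = np.zeros((1, 256, 1, 1), dtype=np.float32)

out_f32 = batchnorm_f32(x, mean, var, weight, bias)
out_f64 = batchnorm_f64_invstd(x, mean, var, weight, bias)
print("max abs diff:", np.abs(out_f32 - out_f64).max())
```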
This issue breaks direct conversion of PyTorch models to Caffe, and also seems like genuinely unexpected behavior, since everything is expected to be float32 during both training and inference.