3.GEMM Optimization
Rafael CF Sousa edited this page May 25, 2022 · 4 revisions
For this task, you will implement the convolutions as im2col + GEMM.
For im2col, you can use any implementation available on the internet. A suggested implementation can be found at the following link: https://github.com/pytorch/pytorch/blob/7e4730d017e3da94fe3b7b5f504df0563516367f/caffe2/quantization/server/im2col_dnnlowp.h#L186
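
To make the shape of the im2col output concrete, here is a minimal sketch for a single image. The function name `im2col_nhwc_sketch`, the NHWC input layout, and the choice of storing one patch per row of `data_col` are assumptions made for this illustration; they are not taken from the linked implementation, whose actual layout you should check yourself. It assumes T = float, group = 1, dilation = 1, and padding filled with 0.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper (not the linked PyTorch code): im2col for one NHWC
 * image with T = float, group = 1, dilation = 1. Padded positions are filled
 * with 0.0f (zero_point = 0). Row r of data_col holds the patch of output
 * pixel r, so data_col is (out_h * out_w) x (kernel_h * kernel_w * channels),
 * stored row-major. */
static void im2col_nhwc_sketch(const float *data_im, int channels, int height,
                               int width, int kernel_h, int kernel_w,
                               int pad_t, int pad_l, int pad_b, int pad_r,
                               int stride_h, int stride_w, float *data_col) {
    assert(stride_h > 0 && stride_w > 0);
    const int out_h = (height + pad_t + pad_b - kernel_h) / stride_h + 1;
    const int out_w = (width + pad_l + pad_r - kernel_w) / stride_w + 1;
    size_t idx = 0; /* next write position in data_col */
    for (int oh = 0; oh < out_h; ++oh) {
        for (int ow = 0; ow < out_w; ++ow) {
            for (int kh = 0; kh < kernel_h; ++kh) {
                for (int kw = 0; kw < kernel_w; ++kw) {
                    /* Input coordinates covered by this kernel tap. */
                    const int ih = oh * stride_h - pad_t + kh;
                    const int iw = ow * stride_w - pad_l + kw;
                    const int inside =
                        ih >= 0 && ih < height && iw >= 0 && iw < width;
                    for (int c = 0; c < channels; ++c) {
                        data_col[idx++] = inside
                            ? data_im[((size_t)ih * width + iw) * channels + c]
                            : 0.0f;
                    }
                }
            }
        }
    }
}
```

With this arrangement, a 3x3 single-channel image and a 2x2 kernel (stride 1, no padding) yield a 4x4 `data_col` in which each row is one flattened 2x2 patch. An implementation that writes patches along the columns instead would produce the transpose of this matrix, which changes how it must be passed to GEMM.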
For the BLAS part, you will call the OpenBLAS library. The function you will have to use is `cblas_sgemm`.
Some hints:
- Pay attention when defining which matrix is A and which is B.
- The two matrices are not necessarily both `no_transpose`.
- Use `#include <cblas.h>` at the top of the source file where you call OpenBLAS.
- Check whether the im2col generates the patch matrix over the columns of the resulting matrix (`data_col`).
- T = float
- group = dilation_h = dilation_w = 1
- zero_point = 0
- Do not forget to append `-lopenblas` to the variable `LIBS` (/work/models/*/builder.mk).
- The new Dockerfile automatically installs the OpenBLAS library.
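
The hints about A, B, and transposition are easier to reason about with a reference routine that takes the same arguments as `cblas_sgemm`. The sketch below is a hypothetical stand-in named `ref_sgemm_row_major`, restricted to row-major storage and written only for illustration; it is not OpenBLAS code and makes no claim about the layouts used in this project. It computes C = alpha * op(A) * op(B) + beta * C, where op(A) is M x K, op(B) is K x N, and C is M x N.

```c
#include <assert.h>

/* Hypothetical stand-in for cblas_sgemm, row-major storage only.
 * trans_a / trans_b are nonzero when the stored matrix must be transposed
 * before the product. lda, ldb, ldc are the stored row lengths of A, B, C. */
static void ref_sgemm_row_major(int trans_a, int trans_b, int M, int N, int K,
                                float alpha, const float *A, int lda,
                                const float *B, int ldb,
                                float beta, float *C, int ldc) {
    /* A is stored M x K (or K x M if transposed); same idea for B. */
    assert(lda >= (trans_a ? M : K));
    assert(ldb >= (trans_b ? K : N));
    assert(ldc >= N);
    for (int m = 0; m < M; ++m) {
        for (int n = 0; n < N; ++n) {
            float acc = 0.0f;
            for (int k = 0; k < K; ++k) {
                const float a = trans_a ? A[k * lda + m] : A[m * lda + k];
                const float b = trans_b ? B[n * ldb + k] : B[k * ldb + n];
                acc += a * b;
            }
            /* With beta == 0 the previous contents of C are ignored. */
            C[m * ldc + n] = (beta == 0.0f)
                ? alpha * acc
                : alpha * acc + beta * C[m * ldc + n];
        }
    }
}
```

As an example under assumed layouts: if the patch matrix is stored as 2 x 3 (one patch per row) and the filters as 2 x 3 (one filter per row), the product patches * filters^T needs the second operand transposed, so the call is `ref_sgemm_row_major(0, 1, 2, 2, 3, 1.0f, P, 3, W, 3, 0.0f, out, 2)`. The equivalent OpenBLAS call would pass `CblasRowMajor, CblasNoTrans, CblasTrans` as the first three arguments of `cblas_sgemm`, followed by the same M, N, K, alpha, A, lda, B, ldb, beta, C, ldc. Whether your own convolution needs this arrangement depends on how your im2col and your weights are actually laid out.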
You will have to send us a report (PDF) with the following information:
- Explain in detail each of the arguments used to call BLAS, e.g., "I used `ldb` equal to ... because ...".
- Fill in the following tables:

| Model | Glow (us) | Naive (us) | BLAS (us) |
|---|---|---|---|
| MNIST | | | |
| MobileNet | | | |
| ResNet18 | | | |
| SqueezeNet | | | |

| Model | Glow (top1) | Naive (top1) | BLAS (top1) |
|---|---|---|---|
| MNIST | | | |
| MobileNet | | | |
| ResNet18 | | | |
| SqueezeNet | | | |

| Model | Glow (top5) | Naive (top5) | BLAS (top5) |
|---|---|---|---|
| MNIST | | | |
| MobileNet | | | |
| ResNet18 | | | |
| SqueezeNet | | | |

- We also expect a detailed conclusion about the work as a whole.