3. GEMM Optimization

Rafael CF Sousa edited this page May 25, 2022 · 4 revisions

For this task, you will implement the convolutions as an im2col step followed by a GEMM (im2col+GEMM).

For im2col, you can use any implementation available on the internet. One suggested implementation is at the following link: https://github.com/pytorch/pytorch/blob/7e4730d017e3da94fe3b7b5f504df0563516367f/caffe2/quantization/server/im2col_dnnlowp.h#L186.
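
To make the layout question concrete, here is a minimal im2col sketch. It is not the linked Caffe2 routine: the name `im2col_nhwc` and its parameter list are hypothetical, and it assumes a single image in NHWC layout, the same padding and stride in both directions, group = dilation = 1, and padding filled with 0 (zero_point = 0). It writes one patch per row of the result; the implementation you pick may lay the patches out differently, which changes how the GEMM has to be called.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical helper (not the linked Caffe2 code): single image, NHWC layout,
// float data, group = dilation = 1, padded positions filled with 0.
// Result: row-major matrix with out_h*out_w rows (one per output position)
// and kernel_h*kernel_w*channels columns (one flattened patch per row).
std::vector<float> im2col_nhwc(const std::vector<float>& img,
                               int height, int width, int channels,
                               int kernel_h, int kernel_w,
                               int pad, int stride) {
  assert(img.size() == static_cast<std::size_t>(height) * width * channels);
  const int out_h = (height + 2 * pad - kernel_h) / stride + 1;
  const int out_w = (width + 2 * pad - kernel_w) / stride + 1;
  const int patch = kernel_h * kernel_w * channels;
  std::vector<float> col(static_cast<std::size_t>(out_h) * out_w * patch, 0.0f);
  for (int oh = 0; oh < out_h; ++oh) {
    for (int ow = 0; ow < out_w; ++ow) {
      float* row = &col[(static_cast<std::size_t>(oh) * out_w + ow) * patch];
      for (int kh = 0; kh < kernel_h; ++kh) {
        for (int kw = 0; kw < kernel_w; ++kw) {
          const int ih = oh * stride - pad + kh;
          const int iw = ow * stride - pad + kw;
          // Positions that fall in the padding keep their initial value 0.
          if (ih < 0 || ih >= height || iw < 0 || iw >= width) continue;
          for (int c = 0; c < channels; ++c) {
            row[(kh * kernel_w + kw) * channels + c] =
                img[(static_cast<std::size_t>(ih) * width + iw) * channels + c];
          }
        }
      }
    }
  }
  return col;
}
```

With this layout, a 3x3 single-channel image and a 2x2 kernel (no padding, stride 1) produce a 4x4 matrix whose first row holds the top-left 2x2 patch.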

For the BLAS part, you must call the OpenBLAS library. The function to use is cblas_sgemm.
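
As a reading aid for the cblas_sgemm arguments, the sketch below is a plain-loop reference with the same argument meaning for row-major storage: it computes C = alpha * op(A) * op(B) + beta * C, where op(A) is M x K and op(B) is K x N. The routine `sgemm_ref` is hypothetical and is not part of OpenBLAS; it is only meant to show what M, N, K, the transpose flags, and the leading dimensions lda, ldb, ldc refer to.

```cpp
#include <cassert>

// Hypothetical reference routine, NOT an OpenBLAS function. For row-major
// data it is intended to match:
//   cblas_sgemm(CblasRowMajor,
//               trans_a ? CblasTrans : CblasNoTrans,
//               trans_b ? CblasTrans : CblasNoTrans,
//               M, N, K, alpha, A, lda, B, ldb, beta, C, ldc);
void sgemm_ref(bool trans_a, bool trans_b, int M, int N, int K,
               float alpha, const float* A, int lda,
               const float* B, int ldb,
               float beta, float* C, int ldc) {
  // A leading dimension is the number of elements between the starts of two
  // consecutive rows of the matrix as it is stored, before any transposition.
  assert(lda >= (trans_a ? M : K));  // A is stored K x M if transposed, else M x K
  assert(ldb >= (trans_b ? K : N));  // B is stored N x K if transposed, else K x N
  assert(ldc >= N);                  // C is always stored M x N
  for (int i = 0; i < M; ++i) {
    for (int j = 0; j < N; ++j) {
      float acc = 0.0f;
      for (int k = 0; k < K; ++k) {
        const float a = trans_a ? A[k * lda + i] : A[i * lda + k];
        const float b = trans_b ? B[j * ldb + k] : B[k * ldb + j];
        acc += a * b;
      }
      // With beta == 0 the previous contents of C are ignored.
      const float old = (beta == 0.0f) ? 0.0f : beta * C[i * ldc + j];
      C[i * ldc + j] = alpha * acc + old;
    }
  }
}
```

Comparing the output of such a reference against the BLAS result on a small matrix is one way to check that the transpose flags and leading dimensions were chosen correctly.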

Some hints:

  • pay attention to which matrix you define as A and which as B
  • the two matrices are not necessarily both non-transposed (CblasNoTrans)
  • add #include <cblas.h> at the top of the source file in which you call OpenBLAS
  • check whether im2col lays out the patches along the columns of the resulting matrix (data_col)
  • T = float
  • group = dilation_h = dilation_w = 1
  • zero_point = 0
  • do not forget to append -lopenblas to the LIBS variable (/work/models/*/builder.mk)
  • the new Dockerfile installs the OpenBLAS library automatically.
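
The first hints can be illustrated with a small sketch that derives the GEMM sizes from the convolution sizes. Everything here is an assumption to check against your own buffers: the struct `GemmArgs` and the function `conv_gemm_args` are hypothetical, storage is row-major, data_col holds one patch per row, the filters are stored as one flattened filter per row, and the output is stored as one output position per row. Under those assumptions the filter matrix has to be passed as transposed; if your im2col writes patches along the columns instead, the roles and flags change.

```cpp
#include <cassert>

// Hypothetical container for the size-related cblas_sgemm arguments.
struct GemmArgs {
  bool trans_a;
  bool trans_b;
  int M, N, K;
  int lda, ldb, ldc;
};

// Assumed layouts (row-major):
//   A = data_col : [out_h*out_w][kernel_h*kernel_w*in_c]  (one patch per row)
//   B = filters  : [out_c][kernel_h*kernel_w*in_c]        (one filter per row)
//   C = output   : [out_h*out_w][out_c]
GemmArgs conv_gemm_args(int out_h, int out_w, int out_c,
                        int kernel_h, int kernel_w, int in_c) {
  assert(out_h > 0 && out_w > 0 && out_c > 0);
  assert(kernel_h > 0 && kernel_w > 0 && in_c > 0);
  GemmArgs g;
  g.M = out_h * out_w;              // rows of C: output positions
  g.N = out_c;                      // columns of C: output channels
  g.K = kernel_h * kernel_w * in_c; // shared dimension: patch length
  g.trans_a = false;                // data_col is already M x K
  g.trans_b = true;                 // filters are stored N x K, GEMM needs K x N
  g.lda = g.K;                      // row length of data_col as stored
  g.ldb = g.K;                      // row length of the filter matrix as stored
  g.ldc = g.N;                      // row length of the output as stored
  return g;
}
```

The same reasoning, written out for the layout you actually use, is the kind of justification the report asks for.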

You will have to send us a report (PDF) with the following information:

  • explain in detail each of the arguments used in the BLAS call, e.g., "I used ldb equal to ... because ...".

  • fill in the following tables:

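| Model | Glow (us) | Naive (us) | BLAS (us) |
|---|---|---|---|
| MNIST | | | |
| MobileNet | | | |
| ResNet18 | | | |
| SqueezeNet | | | |

| Model | Glow (top1) | Naive (top1) | BLAS (top1) |
|---|---|---|---|
| MNIST | | | |
| MobileNet | | | |
| ResNet18 | | | |
| SqueezeNet | | | |

| Model | Glow (top5) | Naive (top5) | BLAS (top5) |
|---|---|---|---|
| MNIST | | | |
| MobileNet | | | |
| ResNet18 | | | |
| SqueezeNet | | | |
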
  • We also expect a detailed conclusion about the work as a whole.
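
For the table entries, the sketch below shows one possible way to obtain a time in microseconds and to decide whether a prediction counts toward top-1 or top-5. Both helpers, `time_us` and `in_top_k`, are hypothetical; the project may already provide its own measurement and evaluation tooling, in which case that tooling should be preferred.

```cpp
#include <chrono>
#include <cstddef>
#include <vector>

// Hypothetical helper: wall-clock duration of one call to run(), in microseconds.
template <typename F>
long long time_us(F&& run) {
  const auto start = std::chrono::steady_clock::now();
  run();
  const auto stop = std::chrono::steady_clock::now();
  return std::chrono::duration_cast<std::chrono::microseconds>(stop - start).count();
}

// Hypothetical helper: true if the score of the correct class `label` is among
// the k highest scores, i.e. fewer than k classes score strictly higher.
bool in_top_k(const std::vector<float>& scores, std::size_t label, std::size_t k) {
  std::size_t higher = 0;
  for (float s : scores) {
    if (s > scores[label]) ++higher;
  }
  return higher < k;
}
```

Top-1 (or top-5) accuracy is then the fraction of test images for which `in_top_k` returns true with k = 1 (or k = 5). Averaging the time over several runs usually gives a more stable number than a single measurement.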