3. GEMM Optimization

For this task, you will implement the convolutions as im2col followed by a GEMM (general matrix multiplication).

For im2col, you can use any implementation available on the internet. A suggested implementation can be found at the following link: https://github.com/pytorch/pytorch/blob/7e4730d017e3da94fe3b7b5f504df0563516367f/caffe2/quantization/server/im2col_dnnlowp.h#L186.
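
To make the data layout concrete, here is a minimal im2col sketch in C. It is not the linked implementation: the function name im2col_nchw, the NCHW input layout, and the single pad and stride values are assumptions made for illustration. It writes data_col as a row-major matrix with channels * kernel_h * kernel_w rows and out_h * out_w columns, so each patch ends up in one column. The implementation you pick may arrange data_col differently, which changes how it must be passed to the GEMM.

```c
#include <stddef.h>

/* Output size along one dimension (dilation fixed to 1). */
static int conv_out_size(int in, int kernel, int pad, int stride) {
    return (in + 2 * pad - kernel) / stride + 1;
}

/*
 * Hypothetical im2col for one NCHW image, group = dilation = 1.
 * data_im : channels x height x width
 * data_col: (channels * kernel_h * kernel_w) rows x (out_h * out_w) columns,
 *           stored row-major, so each patch occupies one COLUMN of data_col.
 */
static void im2col_nchw(const float *data_im, int channels, int height, int width,
                        int kernel_h, int kernel_w, int pad, int stride,
                        float *data_col) {
    const int out_h = conv_out_size(height, kernel_h, pad, stride);
    const int out_w = conv_out_size(width, kernel_w, pad, stride);
    const int rows = channels * kernel_h * kernel_w;
    for (int r = 0; r < rows; ++r) {
        /* Decompose the row index into (channel, kernel row, kernel column). */
        const int kw = r % kernel_w;
        const int kh = (r / kernel_w) % kernel_h;
        const int c = r / (kernel_w * kernel_h);
        for (int oh = 0; oh < out_h; ++oh) {
            for (int ow = 0; ow < out_w; ++ow) {
                const int ih = oh * stride - pad + kh;
                const int iw = ow * stride - pad + kw;
                const int inside = ih >= 0 && ih < height && iw >= 0 && iw < width;
                /* Padding positions are filled with 0 (zero_point = 0). */
                data_col[((size_t)r * out_h + oh) * out_w + ow] =
                    inside ? data_im[((size_t)c * height + ih) * width + iw] : 0.0f;
            }
        }
    }
}
```

For a 1-channel 3x3 image holding 1..9 and a 2x2 kernel with pad 0 and stride 1, the first column of data_col is the top-left patch (1, 2, 4, 5).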

For the BLAS part, you will call the OpenBLAS library. The function to use is cblas_sgemm (single-precision general matrix multiplication).
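
cblas_sgemm takes its arguments in the order (layout, TransA, TransB, M, N, K, alpha, A, lda, B, ldb, beta, C, ldc) and computes C = alpha * op(A) * op(B) + beta * C, where op(A) is M x K and op(B) is K x N. The sketch below is a plain-C stand-in for the row-major case, written only to show what the transpose flags and leading dimensions mean. The names ref_sgemm_row_major, REF_NO_TRANS and REF_TRANS are hypothetical and not part of OpenBLAS, and the loop is not a replacement for the library call.

```c
#include <stddef.h>

/* Local stand-ins for the CBLAS transpose flags (hypothetical names). */
enum ref_transpose { REF_NO_TRANS, REF_TRANS };

/*
 * Reference for what a row-major SGEMM computes:
 *   C = alpha * op(A) * op(B) + beta * C
 * op(A) is m x k, op(B) is k x n, C is m x n.
 * lda, ldb and ldc are the row strides of the arrays AS STORED in memory,
 * i.e. before any transpose is applied.
 */
static void ref_sgemm_row_major(enum ref_transpose trans_a, enum ref_transpose trans_b,
                                int m, int n, int k, float alpha,
                                const float *a, int lda, const float *b, int ldb,
                                float beta, float *c, int ldc) {
    for (int i = 0; i < m; ++i) {
        for (int j = 0; j < n; ++j) {
            float acc = 0.0f;
            for (int p = 0; p < k; ++p) {
                /* With a transpose flag, the stored array is indexed the other way round. */
                const float a_ip = (trans_a == REF_NO_TRANS) ? a[(size_t)i * lda + p]
                                                             : a[(size_t)p * lda + i];
                const float b_pj = (trans_b == REF_NO_TRANS) ? b[(size_t)p * ldb + j]
                                                             : b[(size_t)j * ldb + p];
                acc += a_ip * b_pj;
            }
            float *out = &c[(size_t)i * ldc + j];
            /* With beta == 0 the old contents of C are ignored, not multiplied. */
            *out = alpha * acc + (beta == 0.0f ? 0.0f : beta * *out);
        }
    }
}
```

Storing B as its transpose (n x k in memory) and passing REF_TRANS with ldb = k gives the same product as passing the k x n array with REF_NO_TRANS and ldb = n. This is the kind of choice the im2col output layout forces on you.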

Some hints:

pay attention to which matrix you define as A and which as B
the two matrices are not necessarily both non-transposed (CblasNoTrans)
add #include <cblas.h> at the top of the source file where you call OpenBLAS
check whether im2col writes each patch into a column of the resulting matrix (data_col)
T = float
group = dilation_h = dilation_w = 1
zero_point = 0
do not forget to append -lopenblas to the LIBS variable (/work/models/*/builder.mk)
the new Dockerfile installs the OpenBLAS library automatically
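
One way to check that the choices of A, B, the transpose flags, and the leading dimensions are right is to compare the im2col+GEMM output against a naive direct convolution on a small input. The sketch below is one such reference, under assumptions that are not stated in the task: NCHW input, weights stored as out_channels x channels x kernel_h x kernel_w, no bias, and a single pad and stride value. The names conv2d_direct and max_abs_diff are hypothetical.

```c
#include <stddef.h>

/*
 * Naive direct convolution for one NCHW image, group = dilation = 1, no bias.
 * weights: out_channels x channels x kernel_h x kernel_w (assumed layout)
 * output : out_channels x out_h x out_w
 */
static void conv2d_direct(const float *input, int channels, int height, int width,
                          const float *weights, int out_channels,
                          int kernel_h, int kernel_w, int pad, int stride,
                          float *output) {
    const int out_h = (height + 2 * pad - kernel_h) / stride + 1;
    const int out_w = (width + 2 * pad - kernel_w) / stride + 1;
    for (int oc = 0; oc < out_channels; ++oc) {
        for (int oh = 0; oh < out_h; ++oh) {
            for (int ow = 0; ow < out_w; ++ow) {
                float acc = 0.0f;
                for (int c = 0; c < channels; ++c) {
                    for (int kh = 0; kh < kernel_h; ++kh) {
                        for (int kw = 0; kw < kernel_w; ++kw) {
                            const int ih = oh * stride - pad + kh;
                            const int iw = ow * stride - pad + kw;
                            /* Padding contributes 0, so skip positions outside the image. */
                            if (ih < 0 || ih >= height || iw < 0 || iw >= width) continue;
                            acc += input[((size_t)c * height + ih) * width + iw] *
                                   weights[(((size_t)oc * channels + c) * kernel_h + kh) * kernel_w + kw];
                        }
                    }
                }
                output[((size_t)oc * out_h + oh) * out_w + ow] = acc;
            }
        }
    }
}

/* Largest absolute element-wise difference between two buffers of length n. */
static float max_abs_diff(const float *x, const float *y, size_t n) {
    float worst = 0.0f;
    for (size_t i = 0; i < n; ++i) {
        float d = x[i] - y[i];
        if (d < 0.0f) d = -d;
        if (d > worst) worst = d;
    }
    return worst;
}
```

Because the GEMM may accumulate the products in a different order, compare the two outputs with max_abs_diff against a small tolerance rather than testing for exact equality.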

You will have to send us a report (PDF) with the following information:

explain in detail each of the arguments used in the BLAS call, e.g., "I used ldb equal to ... because ...".
fill the following tables: