PhilipFackler/simple-gemm

 
 

simple-gemm

Collection of simple General Matrix Multiplication - GEMM implementations

C = a . (A x B) + C, where a is a scalar
If a = 1 and C = zeros, this reduces to:
C = A x B

A and B are initialized with random numbers; C is initialized with zeros.
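
As a minimal sketch of the operation described above, the five-argument `mul!` from Julia's LinearAlgebra standard library computes the scaled product plus the existing contents of C. The matrix sizes and the Float64 element type are assumptions chosen for illustration; they are not taken from the repository's scripts.

```julia
using LinearAlgebra

# Illustrative sizes (assumed): A is 5x4, B is 4x3, so C is 5x3.
A = rand(Float64, 5, 4)      # random input
B = rand(Float64, 4, 3)      # random input
C = zeros(Float64, 5, 3)     # output starts as zeros

a = 1.0                      # the scalar "a" in C = a . (A x B) + C

# mul!(C, A, B, alpha, beta) overwrites C with alpha*A*B + beta*C.
# beta = 1 keeps the "+ C" term of the GEMM definition.
mul!(C, A, B, a, 1.0)

# With a = 1 and C initially zero, the result is the plain product A x B.
println(C ≈ A * B)
```

Calling `mul!` a second time on the same C would accumulate a second copy of the product, which is why C has to start from zeros to obtain C = A x B.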

Arguments are always 3 matrix dimensions: args = [A_rows, A_cols (= B_rows), B_cols]

e.g. 5 5 5 or 10 10 10
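
One way a script could turn these three arguments into matrices is sketched below. The helper name `parse_dims` and the Float64 element type are hypothetical; the repository's scripts may handle their arguments differently.

```julia
# Hypothetical helper (not from the repository): converts the three
# command-line strings into integer dimensions.
function parse_dims(args::Vector{String})
    length(args) == 3 || error("expected 3 arguments: A_rows A_cols B_cols")
    A_rows, A_cols, B_cols = parse.(Int, args)
    return A_rows, A_cols, B_cols
end

# In a script this would be parse_dims(ARGS); a literal is used here so the
# sketch runs on its own.
A_rows, A_cols, B_cols = parse_dims(["5", "5", "5"])

A = rand(Float64, A_rows, A_cols)
B = rand(Float64, A_cols, B_cols)   # B_rows must equal A_cols
C = zeros(Float64, A_rows, B_cols)

println(size(A), " ", size(B), " ", size(C))
```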

CPU multithreading:

  • GemmDenseThreads: native Julia Threads implementation

    $ cd GemmDenseThreads
    $ julia -t 4 gemm-dense-threads.jl 5 5 5    
    
  • GemmDenseBlas: uses LinearAlgebra.jl (very fast). If Julia is compiled with OpenBLAS, set OPENBLAS_NUM_THREADS to control the number of BLAS threads

    $ cd GemmDenseBlas
    $ OPENBLAS_NUM_THREADS=4 julia gemm-dense-blas.jl 5 5 5    
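
The two CPU variants can be contrasted with a short sketch. The kernel `gemm_threads!` below is a hypothetical illustration of a hand-written multithreaded loop, not the code in gemm-dense-threads.jl; the BLAS path uses `mul!` from LinearAlgebra. Each thread updates its own columns of C, so no two threads write to the same element.

```julia
using LinearAlgebra

# Hypothetical kernel: the columns of C are distributed across Julia threads.
function gemm_threads!(C, A, B)
    A_rows, A_cols = size(A)
    B_cols = size(B, 2)
    Threads.@threads for j in 1:B_cols
        for l in 1:A_cols
            b = B[l, j]
            # Innermost loop walks down a column, matching Julia's
            # column-major memory layout.
            @inbounds for i in 1:A_rows
                C[i, j] += A[i, l] * b
            end
        end
    end
    return C
end

A = rand(5, 5)
B = rand(5, 5)

C_threads = gemm_threads!(zeros(5, 5), A, B)

C_blas = zeros(5, 5)
mul!(C_blas, A, B)            # dispatches to the BLAS library behind LinearAlgebra

println("Julia threads: ", Threads.nthreads())        # set with `julia -t N`
println("BLAS threads:  ", BLAS.get_num_threads())    # set with OPENBLAS_NUM_THREADS
println(C_threads ≈ C_blas)
```

The two thread counts are independent: `-t` controls the threads available to `Threads.@threads`, while OPENBLAS_NUM_THREADS controls the threads used inside the BLAS call.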
    

GPU:

  • GemmDenseCUDA: uses CUDA.jl, which calls the optimized cuBLAS library (very fast) on NVIDIA GPUs

    $ cd GemmDenseCUDA
    $ julia gemm-dense-cuda.jl 5 5 5
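
A minimal sketch of the GPU path follows, assuming CUDA.jl is installed and an NVIDIA GPU is available. The helper name `gemm_cuda` and the Float32 element type are hypothetical and not taken from gemm-dense-cuda.jl. For `CuArray` matrices of this type, `mul!` is forwarded to cuBLAS.

```julia
using CUDA
using LinearAlgebra

# Hypothetical helper: allocates A, B, C on the GPU, multiplies, and copies
# the results back to the host for checking.
function gemm_cuda(A_rows, A_cols, B_cols)
    A = CUDA.rand(Float32, A_rows, A_cols)    # random input on the device
    B = CUDA.rand(Float32, A_cols, B_cols)    # random input on the device
    C = CUDA.zeros(Float32, A_rows, B_cols)   # output starts as zeros
    CUDA.@sync mul!(C, A, B)                  # wait for the GPU to finish
    return Array(A), Array(B), Array(C)
end

if CUDA.functional()
    A, B, C = gemm_cuda(5, 5, 5)
    println(C ≈ A * B)                        # compare against the CPU product
else
    println("No usable CUDA device found")
end
```

GPU operations are launched asynchronously, so `CUDA.@sync` matters mainly when timing the multiplication.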
    

Languages

  • Julia 27.0%
  • C++ 21.5%
  • Fortran 12.1%
  • Shell 11.4%
  • Python 11.1%
  • C 9.8%
  • Makefile 7.1%