lilh9598/x86_sgemm

x86_sgemm

Run: ./build.sh step7

CPU: Intel 8260
FMA peak performance: 81 GFLOP/s
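
The throughput figures in this README can be related to the peak with a little arithmetic. The following is a minimal sketch, not code from this repository: the helper names `sgemm_gflops` and `percent_of_peak` are hypothetical, and it assumes the usual convention of counting 2*M*N*K floating-point operations for an SGEMM of size M x N x K.

```c
#include <assert.h>

/* An M x K by K x N SGEMM performs one multiply and one add per
   (m, n, k) triple, i.e. 2*M*N*K floating-point operations.
   (Hypothetical helper, not part of the repository.) */
static double sgemm_gflops(long m, long n, long k, double seconds) {
    assert(seconds > 0.0);
    return 2.0 * (double)m * (double)n * (double)k / seconds / 1e9;
}

/* Achieved throughput as a percentage of the machine's peak. */
static double percent_of_peak(double gflops, double peak_gflops) {
    assert(peak_gflops > 0.0);
    return 100.0 * gflops / peak_gflops;
}
```

For example, `percent_of_peak(67.108, 81.0)` gives roughly 82.8, which matches the percentage reported for step7.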

step0 naive: 0.607 GFLOP/s
step1 C code optimization: 0.663 GFLOP/s
step2 8x8 kernel: 20.829 GFLOP/s
step3 Kc/Mc tiling: 21.718 GFLOP/s
step4 pack B: 21.569 GFLOP/s
step5 pack A: 48.245 GFLOP/s
step6 16x6 kernel: 53.913 GFLOP/s
step7 asm 16x6 kernel, aligned memory: 67.108 GFLOP/s (82.8% of peak)
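
To illustrate the ideas behind the steps listed above, here is a hedged sketch in portable C of a naive triple loop next to a version that tiles the K dimension and packs a panel of B into contiguous memory. It is not the repository's code: the tile sizes `KC` and `NR`, the row-major layout, and the function names are all assumptions chosen for illustration, and it uses no SIMD intrinsics or assembly.

```c
#include <assert.h>
#include <stdlib.h>

/* Reference triple loop: C += A * B, all matrices row-major. */
static void sgemm_naive(int m, int n, int k,
                        const float *a, const float *b, float *c) {
    for (int i = 0; i < m; ++i)
        for (int j = 0; j < n; ++j) {
            float acc = c[i * n + j];
            for (int p = 0; p < k; ++p)
                acc += a[i * k + p] * b[p * n + j];
            c[i * n + j] = acc;
        }
}

enum { KC = 64, NR = 8 }; /* hypothetical tile sizes, not the repo's */

/* Copy a kc x nr panel of B (row stride n) into a contiguous kc x NR
   buffer, zero-padding columns beyond nr. */
static void pack_b_panel(int kc, int nr, int n,
                         const float *b, float *packed) {
    for (int p = 0; p < kc; ++p)
        for (int j = 0; j < NR; ++j)
            packed[p * NR + j] = (j < nr) ? b[p * n + j] : 0.0f;
}

/* C += A * B with K tiled by KC and B packed in panels of NR columns. */
static void sgemm_packed(int m, int n, int k,
                         const float *a, const float *b, float *c) {
    float *packed = malloc(sizeof(float) * KC * NR);
    assert(packed != NULL);
    for (int p0 = 0; p0 < k; p0 += KC) {
        int kc = (k - p0 < KC) ? k - p0 : KC;
        for (int j0 = 0; j0 < n; j0 += NR) {
            int nr = (n - j0 < NR) ? n - j0 : NR;
            pack_b_panel(kc, nr, n, b + p0 * n + j0, packed);
            for (int i = 0; i < m; ++i) {
                float acc[NR] = {0}; /* accumulators a compiler can keep in registers */
                for (int p = 0; p < kc; ++p) {
                    float av = a[i * k + p0 + p];
                    for (int j = 0; j < NR; ++j)
                        acc[j] += av * packed[p * NR + j];
                }
                for (int j = 0; j < nr; ++j)
                    c[i * n + j0 + j] += acc[j];
            }
        }
    }
    free(packed);
}
```

Both functions accumulate into C, so the output buffer should be zeroed first when a plain product is wanted. The packed version sums in a different order than the naive loop, so for general floating-point inputs the two results may differ by rounding.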
