Strassen's matrix multiplication algorithm. An implementation of Strassen's matrix multiplication algorithm using SIMD (AVX). Also contains MPI and CUDA implementations of naive matrix multiplication.
The SIMD implementation takes about 2.6x as long as Intel's MKL library (that is, about 1.6x more time) on a 4096 x 4096 matrix.
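
For readers unfamiliar with the algorithm, the following is a minimal scalar sketch of the Strassen recursion in C++. It is not the code from this repository: it uses no AVX intrinsics, and the `Matrix` type, the helper functions, and the `cutoff` parameter are hypothetical names chosen for illustration. It assumes square matrices whose size is a power of two, and it falls back to the classical triple loop for small blocks.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Square matrix stored row-major in a flat vector (hypothetical helper type).
struct Matrix {
    std::size_t n;
    std::vector<double> a;
    explicit Matrix(std::size_t size) : n(size), a(size * size, 0.0) {}
    double& at(std::size_t i, std::size_t j) { return a[i * n + j]; }
    double at(std::size_t i, std::size_t j) const { return a[i * n + j]; }
};

// Element-wise x + sign * y (sign = +1.0 adds, sign = -1.0 subtracts).
static Matrix combine(const Matrix& x, const Matrix& y, double sign) {
    Matrix r(x.n);
    for (std::size_t i = 0; i < x.a.size(); ++i) r.a[i] = x.a[i] + sign * y.a[i];
    return r;
}

// Copy the h x h block whose top-left corner is (r0, c0).
static Matrix block(const Matrix& m, std::size_t r0, std::size_t c0, std::size_t h) {
    Matrix r(h);
    for (std::size_t i = 0; i < h; ++i)
        for (std::size_t j = 0; j < h; ++j) r.at(i, j) = m.at(r0 + i, c0 + j);
    return r;
}

// Classical O(n^3) product, used once blocks are small enough.
static Matrix multiply_classical(const Matrix& x, const Matrix& y) {
    Matrix r(x.n);
    for (std::size_t i = 0; i < x.n; ++i)
        for (std::size_t k = 0; k < x.n; ++k)
            for (std::size_t j = 0; j < x.n; ++j) r.at(i, j) += x.at(i, k) * y.at(k, j);
    return r;
}

// Strassen recursion: 7 half-size products instead of 8.
// Assumes both inputs are n x n with n a power of two.
Matrix strassen(const Matrix& x, const Matrix& y, std::size_t cutoff = 64) {
    assert(x.n == y.n && x.n > 0 && (x.n & (x.n - 1)) == 0);
    const std::size_t n = x.n;
    if (n <= cutoff || n == 1) return multiply_classical(x, y);
    const std::size_t h = n / 2;

    const Matrix a11 = block(x, 0, 0, h), a12 = block(x, 0, h, h);
    const Matrix a21 = block(x, h, 0, h), a22 = block(x, h, h, h);
    const Matrix b11 = block(y, 0, 0, h), b12 = block(y, 0, h, h);
    const Matrix b21 = block(y, h, 0, h), b22 = block(y, h, h, h);

    // The seven Strassen products.
    const Matrix m1 = strassen(combine(a11, a22, 1.0), combine(b11, b22, 1.0), cutoff);
    const Matrix m2 = strassen(combine(a21, a22, 1.0), b11, cutoff);
    const Matrix m3 = strassen(a11, combine(b12, b22, -1.0), cutoff);
    const Matrix m4 = strassen(a22, combine(b21, b11, -1.0), cutoff);
    const Matrix m5 = strassen(combine(a11, a12, 1.0), b22, cutoff);
    const Matrix m6 = strassen(combine(a21, a11, -1.0), combine(b11, b12, 1.0), cutoff);
    const Matrix m7 = strassen(combine(a12, a22, -1.0), combine(b21, b22, 1.0), cutoff);

    // Reassemble the four quadrants of the result.
    Matrix c(n);
    for (std::size_t i = 0; i < h; ++i) {
        for (std::size_t j = 0; j < h; ++j) {
            c.at(i, j) = m1.at(i, j) + m4.at(i, j) - m5.at(i, j) + m7.at(i, j);
            c.at(i, j + h) = m3.at(i, j) + m5.at(i, j);
            c.at(i + h, j) = m2.at(i, j) + m4.at(i, j);
            c.at(i + h, j + h) = m1.at(i, j) - m2.at(i, j) + m3.at(i, j) + m6.at(i, j);
        }
    }
    return c;
}
```

In a sketch like this, the classical base case and the element-wise additions are the natural places to apply AVX, since they are simple loops over contiguous data. The cutoff value of 64 is an arbitrary placeholder and would need tuning on real hardware.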