
Is It Possible To Use BLAS To Speed Up Sparse Matrix Multiplication?

I am currently trying to speed up my large sparse (scipy) matrix multiplications. I have successfully linked my numpy installation with OpenBLAS and, as a result, scipy as well. I have

Solution 1:

BLAS is only used for dense floating-point matrices. Multiplication of a scipy.sparse.csr_matrix is done by pure C++ functions that make no calls to external BLAS libraries.
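As a small illustration of the two code paths, the sketch below multiplies the same pair of matrices once as sparse CSR matrices and once as dense arrays. The matrix sizes, density and seeds are arbitrary choices for the example. Only the dense product can benefit from an optimised BLAS such as OpenBLAS, assuming numpy is linked against one.

```python
import numpy as np
from scipy import sparse

# Arbitrary example matrices (sizes, density and seeds are assumptions).
A = sparse.random(200, 300, density=0.01, format="csr", random_state=0)
B = sparse.random(300, 100, density=0.01, format="csr", random_state=1)

# Sparse path: handled by scipy.sparse's own compiled routines.
C_sparse = A @ B

# Dense path: handled by numpy, which can dispatch to the linked BLAS.
C_dense = A.toarray() @ B.toarray()

print(type(C_sparse).__name__, C_sparse.nnz)
```

Both products agree numerically. The difference lies in which library does the work and in the storage format of the result.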

For example, matrix-matrix multiplication is implemented in the C++ sources of scipy's sparsetools module, in csr_matmat_pass_1 and csr_matmat_pass_2.
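To give a rough idea of what a two-pass CSR product looks like, here is a pure-Python sketch. It is not scipy's C++ code, and the function name csr_matmat_two_pass is made up for this example. The first pass only counts how many non-zero entries each output row will have, so that the output arrays can be allocated once. The second pass computes the column indices and values.

```python
import numpy as np
from scipy import sparse


def csr_matmat_two_pass(A, B):
    """Illustrative two-pass product of two CSR matrices (hypothetical helper)."""
    n_row, n_col = A.shape[0], B.shape[1]

    # Pass 1: count the distinct output columns touched in each row of C.
    Cp = np.zeros(n_row + 1, dtype=np.int64)
    for i in range(n_row):
        cols = set()
        for jj in range(A.indptr[i], A.indptr[i + 1]):
            k = A.indices[jj]
            cols.update(B.indices[B.indptr[k]:B.indptr[k + 1]].tolist())
        Cp[i + 1] = Cp[i] + len(cols)

    # Pass 2: accumulate the values and write them into preallocated arrays.
    Cj = np.empty(Cp[-1], dtype=np.int64)
    Cx = np.empty(Cp[-1], dtype=np.result_type(A.dtype, B.dtype))
    for i in range(n_row):
        acc = {}
        for jj in range(A.indptr[i], A.indptr[i + 1]):
            k, a = int(A.indices[jj]), A.data[jj]
            for kk in range(B.indptr[k], B.indptr[k + 1]):
                j = int(B.indices[kk])
                acc[j] = acc.get(j, 0) + a * B.data[kk]
        start = Cp[i]
        for offset, j in enumerate(sorted(acc)):
            Cj[start + offset] = j
            Cx[start + offset] = acc[j]

    return sparse.csr_matrix((Cx, Cj, Cp), shape=(n_row, n_col))
```

Note that the inner loops follow the index arrays of A and B, so the memory accesses depend entirely on where the non-zero entries happen to be. There is no dense inner kernel that a BLAS routine could take over.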

Optimised BLAS libraries are highly tuned to make efficient use of CPU caches by decomposing the dense input matrices into smaller block matrices in order to achieve better locality-of-reference. My understanding is that this strategy can't be easily applied to sparse matrices, where the non-zero elements may be arbitrarily distributed within the matrix.
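The blocking idea can be sketched for dense matrices as follows. This is only an illustration of the decomposition, not how OpenBLAS is actually written (real libraries use hand-tuned kernels and choose block sizes to match the cache hierarchy). The function name blocked_matmul and the default block size of 64 are assumptions made for the example.

```python
import numpy as np


def blocked_matmul(A, B, block=64):
    """Dense matrix product computed tile by tile (illustrative only)."""
    n, k = A.shape
    k2, m = B.shape
    if k != k2:
        raise ValueError("inner dimensions must match")

    C = np.zeros((n, m), dtype=np.result_type(A.dtype, B.dtype))
    for i0 in range(0, n, block):          # rows of the output tile
        for j0 in range(0, m, block):      # columns of the output tile
            for p0 in range(0, k, block):  # shared (inner) dimension
                # Each update only touches three small contiguous tiles,
                # which is what lets them stay in cache while being reused.
                C[i0:i0 + block, j0:j0 + block] += (
                    A[i0:i0 + block, p0:p0 + block]
                    @ B[p0:p0 + block, j0:j0 + block]
                )
    return C
```

This works because every tile of a dense matrix is itself a full, regularly laid-out dense matrix. A tile of a sparse matrix may hold any number of non-zero entries in any positions, so the same regular reuse pattern is not available in general.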

