SIMD Matrix Transposition in Go: Gonum and BLAS Register Remapping in AVX-512

Matrix transposition looks simple, yet in large-scale data processing it harbors performance pitfalls. Recently, while optimizing a project that processes tens of GB of scientific data, we discovered that our SIMD acceleration scheme performed poorly …