Magma-2.5.0-rc1 released
This also works the same way as 2.4.0: you just need to apply the same two patches,
cmakelists.patch and thread_queue.patch, and ignore the others.
MAGMA 2.5.0 RC1 | 2018-11-16 |
---|---|
MAGMA 2.5.0 RC1 is now released. Updates include: |
- New routine: magmablas_Xgemm_batched_strided (X = {s, d, c, z}) is the stride-based variant of magmablas_Xgemm_batched;
- New routine: magma_Xgetrf_native (X = {s, d, c, z}) performs the LU factorization with partial pivoting using the GPU only. It has the same interface as the hybrid (CPU+GPU) implementation provided by magma_Xgetrf_gpu. Its performance can be tested by running testing_Xgetrf_gpu with the option --version 3;
- New routine: magma_Xpotrf_native (X = {s, d, c, z}) performs the Cholesky factorization using the GPU only. It has the same interface as the hybrid (CPU+GPU) implementation provided by magma_Xpotrf_gpu. Its performance can be tested by running testing_Xpotrf_gpu with the option --version 2;
- Added benchmark for GEMM in FP16 arithmetic (HGEMM) as well as auxiliary functions to cast matrices from FP32 to FP16 storage (magmablas_slag2h) and from FP16 to FP32 (magmablas_hlag2s).|
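To make the "stride-based variant" concrete, here is a minimal CPU reference sketch of a strided batched GEMM in plain C. It is not MAGMA code: the function name `ref_sgemm_batched_strided` and its argument list are assumptions made for illustration, transposes are omitted, and the real GPU routine should be looked up in the MAGMA headers. The point it shows is only the addressing scheme: instead of an array of pointers (one per matrix), each operand is a single buffer and matrix `i` starts at `base + i * stride`.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical CPU reference, not the MAGMA API.
 * Computes C_i = alpha * A_i * B_i + beta * C_i for i = 0 .. batch_count-1.
 * All matrices are column-major; A_i is m x k, B_i is k x n, C_i is m x n.
 * Matrix i of each operand begins at (base pointer + i * stride). */
void ref_sgemm_batched_strided(int m, int n, int k, float alpha,
                               const float *A, int lda, size_t strideA,
                               const float *B, int ldb, size_t strideB,
                               float beta,
                               float *C, int ldc, size_t strideC,
                               int batch_count)
{
    /* Leading dimensions must cover the rows of each matrix. */
    assert(lda >= m && ldb >= k && ldc >= m && batch_count >= 0);

    for (int b = 0; b < batch_count; ++b) {
        /* Stride-based addressing: no per-matrix pointer array is needed. */
        const float *Ab = A + (size_t)b * strideA;
        const float *Bb = B + (size_t)b * strideB;
        float       *Cb = C + (size_t)b * strideC;

        for (int j = 0; j < n; ++j) {
            for (int i = 0; i < m; ++i) {
                float sum = 0.0f;
                for (int p = 0; p < k; ++p) {
                    sum += Ab[i + (size_t)p * lda] * Bb[p + (size_t)j * ldb];
                }
                Cb[i + (size_t)j * ldc] =
                    alpha * sum + beta * Cb[i + (size_t)j * ldc];
            }
        }
    }
}
```

With tightly packed 2x2 matrices, for example, every stride is 4 and every leading dimension is 2. A stride larger than the matrix size leaves padding between consecutive matrices in the batch.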
MAGMA 2.4.0 | 2018-06-25 |
---|---|
MAGMA 2.4.0 is now released. Updates include: |
- Added constrained least squares routines (magma_[sdcz]gglse) and dependencies:
  - magma_zggrqf - generalized RQ factorization
  - magma_zunmrq - multiply by orthogonal Q as returned by zgerqf
- Performance improvements across many batch routines, including batched TRSM, batched LU, batched LU-nopiv, and batched Cholesky
- Fixed some compilation issues with inf, nan, and nullptr.
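
For readers unfamiliar with the gglse routines listed above: in LAPACK, the routines of that name solve the linear equality-constrained least squares problem. The statement below uses LAPACK's notation ($A$, $B$, $c$, $d$, $x$), which does not appear in the release notes, and assumes the MAGMA routines follow the same definition.

$$
\min_{x} \; \lVert c - A x \rVert_2 \quad \text{subject to} \quad B x = d,
$$

where $A$ is $m \times n$, $B$ is $p \times n$, $c$ has length $m$, $d$ has length $p$, and $p \le n \le m + p$. LAPACK additionally assumes that $B$ has full row rank $p$ and that the stacked matrix $\begin{pmatrix} A \\ B \end{pmatrix}$ has full column rank $n$, which makes the solution unique.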
MAGMA-sparse
- Changed how data from an external application is handled: there is now a clear distinction between memory allocated, used, and freed by MAGMA and by the user application. We added the functions magma_zvcopy and magma_zvpass, which do not allocate memory; instead they copy values from/to application-allocated memory.
- The examples (in example/example_sparse.c) demonstrate how these routines should be used.|
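
The copy-without-allocation pattern described above can be sketched in plain C. Everything in this block is hypothetical: the type `lib_vector` and the functions `lib_copy_out` and `lib_copy_in` are stand-ins invented for illustration, not MAGMA types or signatures. For the real interfaces of magma_zvcopy and magma_zvpass, see the MAGMA-sparse headers and example/example_sparse.c.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical library-side vector (not a MAGMA type).
 * The library owns `val`; the application never frees it. */
typedef struct {
    size_t  n;
    double *val;
} lib_vector;

/* Copy library data into a buffer that the application already allocated.
 * No allocation happens here. Returns 0 on success, -1 if the
 * application buffer is too small. */
int lib_copy_out(const lib_vector *v, double *app_buf, size_t app_len)
{
    assert(v != NULL && app_buf != NULL);
    if (app_len < v->n) {
        return -1;
    }
    memcpy(app_buf, v->val, v->n * sizeof(double));
    return 0;
}

/* Copy application data into existing library storage.
 * No allocation happens here. Returns 0 on success, -1 on a
 * length mismatch. */
int lib_copy_in(const double *app_buf, size_t app_len, lib_vector *v)
{
    assert(v != NULL && app_buf != NULL);
    if (app_len != v->n) {
        return -1;
    }
    memcpy(v->val, app_buf, app_len * sizeof(double));
    return 0;
}
```

Because neither function allocates, each side frees exactly what it allocated: the application releases its own buffers, and the library releases its own storage.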