Jakub Kurzak
SMTS Software Design Engineer, AMD
Verified email at amd.com
Title
Cited by
Year
A class of parallel tiled linear algebra algorithms for multicore architectures
A Buttari, J Langou, J Kurzak, J Dongarra
Parallel computing 35 (1), 38-53, 2009
Cited by: 719, Year: 2009
Numerical linear algebra on emerging architectures: The PLASMA and MAGMA projects
E Agullo, J Demmel, J Dongarra, B Hadri, J Kurzak, J Langou, H Ltaief, ...
Journal of Physics: Conference Series 180 (1), 012037, 2009
Cited by: 586, Year: 2009
Accelerating scientific computations with mixed precision algorithms
M Baboulin, A Buttari, J Dongarra, J Kurzak, J Langou, J Langou, ...
Computer Physics Communications 180 (12), 2526-2533, 2009
Cited by: 289, Year: 2009
Parallel tiled QR factorization for multicore architectures
A Buttari, J Langou, J Kurzak, J Dongarra
Concurrency and Computation: Practice and Experience 20 (13), 1573-1590, 2008
Cited by: 280, Year: 2008
Flexible development of dense linear algebra algorithms on massively parallel architectures with DPLASMA
G Bosilca, A Bouteiller, A Danalis, M Faverge, A Haidar, T Herault, ...
2011 IEEE International Symposium on Parallel and Distributed Processing …, 2011
Cited by: 254*, Year: 2011
Exploiting the performance of 32 bit floating point arithmetic in obtaining 64 bit accuracy (revisiting iterative refinement for linear systems)
J Langou, J Langou, P Luszczek, J Kurzak, A Buttari, J Dongarra
Proceedings of the 2006 ACM/IEEE Conference on Supercomputing, 113-es, 2006
Cited by: 209, Year: 2006
Mixed precision iterative refinement techniques for the solution of dense linear systems
A Buttari, J Dongarra, J Langou, J Langou, P Luszczek, J Kurzak
The International Journal of High Performance Computing Applications 21 (4 …, 2007
Cited by: 189, Year: 2007
The impact of multicore on math software
A Buttari, J Dongarra, J Kurzak, J Langou, P Luszczek, S Tomov
International Workshop on Applied Parallel Computing, 1-10, 2006
Cited by: 177, Year: 2006
QUARK users' guide: Queueing and runtime for kernels
A YarKhan, J Kurzak, J Dongarra
University of Tennessee Innovative Computing Laboratory Technical Report ICL …, 2011
Cited by: 167, Year: 2011
Using mixed precision for sparse matrix computations to enhance the performance while achieving 64-bit accuracy
A Buttari, J Dongarra, J Kurzak, P Luszczek, S Tomov
ACM Transactions on Mathematical Software (TOMS) 34 (4), 1-22, 2008
Cited by: 163, Year: 2008
Autotuning GEMM kernels for the Fermi GPU
J Kurzak, S Tomov, J Dongarra
IEEE Transactions on Parallel and Distributed Systems 23 (11), 2045-2057, 2012
Cited by: 160, Year: 2012
Scheduling dense linear algebra operations on multicore processors
J Kurzak, H Ltaief, J Dongarra, RM Badia
Concurrency and Computation: Practice and Experience 22 (1), 15-44, 2010
Cited by: 148, Year: 2010
Solving systems of linear equations on the CELL processor using Cholesky factorization
J Kurzak, A Buttari, J Dongarra
IEEE Transactions on Parallel and Distributed Systems 19 (9), 1175-1186, 2008
Cited by: 145, Year: 2008
Accelerating numerical dense linear algebra calculations with GPUs
J Dongarra, M Gates, A Haidar, J Kurzak, P Luszczek, S Tomov, ...
Numerical computations with GPUs, 3-28, 2014
Cited by: 142, Year: 2014
Implementation of mixed precision in solving systems of linear equations on the CELL processor
J Kurzak, J Dongarra
Concurrency and Computation: Practice and Experience 19 (10), 1371-1385, 2007
Cited by: 112, Year: 2007
The singular value decomposition: Anatomy of optimizing an algorithm for extreme scale
J Dongarra, M Gates, A Haidar, J Kurzak, P Luszczek, S Tomov, ...
SIAM Review 60 (4), 808-865, 2018
Cited by: 110, Year: 2018
SLATE: Design of a modern distributed and accelerated linear algebra library
M Gates, J Kurzak, A Charara, A YarKhan, J Dongarra
Proceedings of the International Conference for High Performance Computing …, 2019
Cited by: 104, Year: 2019
Scientific computing with multicore and accelerators
J Kurzak, DA Bader, J Dongarra
CRC Press, 2010
Cited by: 94, Year: 2010
A rough guide to scientific computing on the PlayStation 3
A Buttari, P Luszczek, J Kurzak, J Dongarra, G Bosilca
version 1.0. Technical Report UT-CS-07-595, Computer Science Department …, 2007
Cited by: 94, Year: 2007
Implementing linear algebra routines on multi-core processors with pipelining and a look ahead
J Kurzak, J Dongarra
International Workshop on Applied Parallel Computing, 147-156, 2006
Cited by: 94, Year: 2006