Peta-scale phase-field simulation for dendritic solidification on the TSUBAME 2.0 supercomputer T Shimokawabe, T Aoki, T Takaki, T Endo, A Yamanaka, N Maruyama, ... Proceedings of 2011 International Conference for High Performance Computing …, 2011 | 265 | 2011 |
Statistical power modeling of GPU kernels using performance counters H Nagasaka, N Maruyama, A Nukada, T Endo, S Matsuoka International conference on green computing, 115-122, 2010 | 265 | 2010 |
Auto-tuning 3-D FFT library for CUDA GPUs A Nukada, S Matsuoka Proceedings of the Conference on High Performance Computing Networking …, 2009 | 187 | 2009 |
An 80-fold speedup, 15.0 TFlops full GPU acceleration of non-hydrostatic weather model ASUCA production code T Shimokawabe, T Aoki, C Muroi, J Ishida, K Kawano, T Endo, A Nukada, ... SC'10: Proceedings of the 2010 ACM/IEEE International Conference for High …, 2010 | 178 | 2010 |
Bandwidth intensive 3-D FFT kernel for GPUs using CUDA A Nukada, Y Ogata, T Endo, S Matsuoka SC'08: Proceedings of the 2008 ACM/IEEE Conference on Supercomputing, 1-11, 2008 | 173 | 2008 |
Fast conjugate gradients with multiple GPUs A Cevahir, A Nukada, S Matsuoka Computational Science–ICCS 2009: 9th International Conference Baton Rouge …, 2009 | 125 | 2009 |
First Steps in CUDA Programming (はじめての CUDA プログラミング) Aoki, 2009 | 93 | 2009 |
High-performance and memory-saving sparse general matrix-matrix multiplication for NVIDIA Pascal GPU Y Nagasaka, A Nukada, S Matsuoka 2017 46th International Conference on Parallel Processing (ICPP), 101-110, 2017 | 84 | 2017 |
NVCR: A transparent checkpoint-restart library for NVIDIA CUDA A Nukada, H Takizawa, S Matsuoka 2011 IEEE International Symposium on Parallel and Distributed Processing …, 2011 | 81 | 2011 |
High performance conjugate gradient solver on multi-GPU clusters using hypergraph partitioning A Cevahir, A Nukada, S Matsuoka Computer Science-Research and Development 25, 83-91, 2010 | 80 | 2010 |
Linpack evaluation on a supercomputer with heterogeneous accelerators T Endo, S Matsuoka, A Nukada, N Maruyama 2010 IEEE International Symposium on Parallel & Distributed Processing …, 2010 | 78 | 2010 |
A high-performance fault-tolerant software framework for memory on commodity GPUs N Maruyama, A Nukada, S Matsuoka 2010 IEEE International Symposium on Parallel & Distributed Processing …, 2010 | 61 | 2010 |
Scalable multi-GPU 3-D FFT for TSUBAME 2.0 supercomputer A Nukada, K Sato, S Matsuoka SC'12: Proceedings of the International Conference on High Performance …, 2012 | 54 | 2012 |
GPU accelerated computing–from hype to mainstream, the rebirth of vector computing S Matsuoka, T Aoki, T Endo, A Nukada, T Kato, A Hasegawa Journal of Physics: Conference Series 180 (1), 012043, 2009 | 48 | 2009 |
Performance evaluation of parallel sparse matrix–vector products on SGI Altix3700 H Kotakemori, H Hasegawa, T Kajiyama, A Nukada, R Suda, A Nishida OpenMP Shared Memory Parallel Programming: International Workshops, IWOMP …, 2008 | 41 | 2008 |
Aspects of GPU for general purpose high performance computing R Suda, T Aoki, S Hirasawa, A Nukada, H Honda, S Matsuoka 2009 Asia and South Pacific Design Automation Conference, 216-223, 2009 | 40 | 2009 |
Software-based ECC for GPUs N Maruyama, A Nukada, S Matsuoka 2009 Symposium on Application Accelerators in High Performance Computing …, 2009 | 36 | 2009 |
Low-overhead diskless checkpoint for hybrid computing systems LB Gomez, A Nukada, N Maruyama, F Cappello, S Matsuoka 2010 International Conference on High Performance Computing, 1-10, 2010 | 34 | 2010 |
High performance 3-D FFT using multiple CUDA GPUs A Nukada, Y Maruyama, S Matsuoka Proceedings of the 5th Annual Workshop on General Purpose Processing with …, 2012 | 32 | 2012 |
TSUBAME-KFC: A modern liquid submersion cooling prototype towards exascale becoming the greenest supercomputer in the world T Endo, A Nukada, S Matsuoka 2014 20th IEEE International Conference on Parallel and Distributed Systems …, 2014 | 31 | 2014 |