A reconfigurable fabric for accelerating large-scale datacenter services A Putnam, AM Caulfield, ES Chung, D Chiou, K Constantinides, J Demme, ... ACM SIGARCH Computer Architecture News 42 (3), 13-24, 2014 | 1574 | 2014 |
A cloud-scale acceleration architecture AM Caulfield, ES Chung, A Putnam, H Angepat, J Fowers, M Haselman, ... 2016 49th Annual IEEE/ACM International Symposium on Microarchitecture …, 2016 | 895 | 2016 |
A configurable cloud-scale DNN processor for real-time AI J Fowers, K Ovtcharov, M Papamichael, T Massengill, M Liu, D Lo, ... 2018 ACM/IEEE 45th Annual International Symposium on Computer Architecture …, 2018 | 696 | 2018 |
Accelerating deep convolutional neural networks using specialized hardware K Ovtcharov, O Ruwase, JY Kim, J Fowers, K Strauss, ES Chung Microsoft Research Whitepaper 2 (11), 1-4, 2015 | 531 | 2015 |
Serving DNNs in real time at datacenter scale with Project Brainwave E Chung, J Fowers, K Ovtcharov, M Papamichael, A Caulfield, ... IEEE Micro 38 (2), 8-20, 2018 | 399 | 2018 |
A performance and energy comparison of FPGAs, GPUs, and multicores for sliding-window applications J Fowers, G Brown, P Cooke, G Stitt Proceedings of the ACM/SIGDA International Symposium on Field Programmable …, 2012 | 351 | 2012 |
A high memory bandwidth FPGA accelerator for sparse matrix-vector multiplication J Fowers, K Ovtcharov, K Strauss, ES Chung, G Stitt 2014 IEEE 22nd Annual International Symposium on Field-Programmable Custom …, 2014 | 192 | 2014 |
A reconfigurable fabric for accelerating large-scale datacenter services A Putnam, AM Caulfield, ES Chung, D Chiou, K Constantinides, J Demme, ... IEEE Micro 35 (3), 10-22, 2015 | 178 | 2015 |
A scalable high-bandwidth architecture for lossless compression on FPGAs J Fowers, JY Kim, D Burger, S Hauck 2015 IEEE 23rd Annual International Symposium on Field-Programmable Custom …, 2015 | 130 | 2015 |
Pushing the limits of narrow precision inferencing at cloud scale with Microsoft Floating Point B Darvish Rouhani, D Lo, R Zhao, M Liu, J Fowers, K Ovtcharov, ... Advances in Neural Information Processing Systems 33, 10271-10281, 2020 | 123 | 2020 |
Toward accelerating deep learning at scale using specialized hardware in the datacenter K Ovtcharov, O Ruwase, JY Kim, J Fowers, K Strauss, ES Chung 2015 IEEE Hot Chips 27 Symposium (HCS), 1-38, 2015 | 91 | 2015 |
Accelerating persistent neural networks at datacenter scale E Chung, J Fowers, K Ovtcharov, M Papamichael, A Caulfield, ... Hot Chips 29, 2017 | 86 | 2017 |
A configurable cloud-scale DNN processor for real-time AI. In 2018 ACM/IEEE 45th Annual International Symposium on Computer Architecture (ISCA) J Fowers, K Ovtcharov, M Papamichael, T Massengill, M Liu, D Lo, ... IEEE. https://doi.org/10.1109/isca, 2018 | 78 | 2018 |
Hardware node with matrix-vector multiply tiles for neural network processing J Fowers, ES Chung US Patent 10,140,252, 2018 | 57 | 2018 |
A reconfigurable fabric for accelerating large-scale datacenter services A Putnam, AM Caulfield, ES Chung, D Chiou, K Constantinides, J Demme, ... Communications of the ACM 59 (11), 114-122, 2016 | 56 | 2016 |
Neural network processor based on application specific synthesis specialization parameters J Fowers, K Ovtcharov, ES Chung, TM Massengill, MG Liu, GL Weisz US Patent 11,556,762, 2023 | 53 | 2023 |
Configurable clouds AM Caulfield, ES Chung, A Putnam, H Angepat, D Firestone, J Fowers, ... IEEE Micro 37 (3), 52-61, 2017 | 53 | 2017 |
A performance and energy comparison of convolution on GPUs, FPGAs, and multicore processors J Fowers, G Brown, J Wernsing, G Stitt ACM Transactions on Architecture and Code Optimization (TACO) 9 (4), 1-21, 2013 | 48 | 2013 |
Sparse matrix data structure K Strauss, J Fowers, K Ovtcharov US Patent 9,367,519, 2016 | 39 | 2016 |
A software-defined tensor streaming multiprocessor for large-scale machine learning D Abts, G Kimmell, A Ling, J Kim, M Boyd, A Bitar, S Parmar, I Ahmed, ... Proceedings of the 49th Annual International Symposium on Computer …, 2022 | 35 | 2022 |