Fairness via source throttling: a configurable and high-performance fairness substrate for multi-core memory systems E Ebrahimi, CJ Lee, O Mutlu, YN Patt ACM SIGPLAN Notices 45 (3), 335-346, 2010 | 481 | 2010 |
Transparent offloading and mapping (TOM) enabling programmer-transparent near-data processing in GPU systems K Hsieh, E Ebrahimi, G Kim, N Chatterjee, M O'Connor, N Vijaykumar, ... ACM SIGARCH Computer Architecture News 44 (3), 204-216, 2016 | 325 | 2016 |
MCM-GPU: Multi-chip-module GPUs for continued performance scalability A Arunkumar, E Bolotin, B Cho, U Milic, E Ebrahimi, O Villa, A Jaleel, ... ACM SIGARCH Computer Architecture News 45 (2), 320-332, 2017 | 253 | 2017 |
Coordinated control of multiple prefetchers in multi-core systems E Ebrahimi, O Mutlu, CJ Lee, YN Patt Proceedings of the 42nd Annual IEEE/ACM International Symposium on …, 2009 | 250 | 2009 |
Techniques for bandwidth-efficient prefetching of linked data structures in hybrid prefetching systems E Ebrahimi, O Mutlu, YN Patt 2009 IEEE 15th International Symposium on High Performance Computer …, 2009 | 194 | 2009 |
DRAM-aware last-level cache writeback: Reducing write-caused interference in memory systems CJ Lee, V Narasiman, E Ebrahimi, O Mutlu, YN Patt Carnegie Mellon University, 2010 | 192 | 2010 |
Prefetch-aware shared resource management for multi-core systems E Ebrahimi, CJ Lee, O Mutlu, YN Patt ACM SIGARCH Computer Architecture News 39 (3), 141-152, 2011 | 190 | 2011 |
Parallel application memory scheduling E Ebrahimi, R Miftakhutdinov, C Fallin, CJ Lee, JA Joao, O Mutlu, YN Patt Proceedings of the 44th Annual IEEE/ACM International Symposium on …, 2011 | 174 | 2011 |
Predicting performance impact of DVFS for realistic memory systems R Miftakhutdinov, E Ebrahimi, YN Patt 2012 45th Annual IEEE/ACM International Symposium on Microarchitecture, 155-165, 2012 | 137 | 2012 |
Accelerating dependent cache misses with an enhanced memory controller M Hashemi, Khubaib, E Ebrahimi, O Mutlu, YN Patt ACM SIGARCH Computer Architecture News 44 (3), 444-455, 2016 | 134 | 2016 |
Flexible software profiling of GPU architectures M Stephenson, SK Sastry Hari, Y Lee, E Ebrahimi, DR Johnson, ... Proceedings of the 42nd Annual International Symposium on Computer …, 2015 | 133 | 2015 |
Planaria: Dynamic architecture fission for spatial multi-tenant acceleration of deep neural networks S Ghodrati, BH Ahn, JK Kim, S Kinzer, BR Yatham, N Alla, H Sharma, ... 2020 53rd Annual IEEE/ACM International Symposium on Microarchitecture …, 2020 | 124 | 2020 |
Beyond the socket: NUMA-aware GPUs U Milic, O Villa, E Bolotin, A Arunkumar, E Ebrahimi, A Jaleel, A Ramirez, ... Proceedings of the 50th Annual IEEE/ACM International Symposium on …, 2017 | 93 | 2017 |
SiP-ML: high-bandwidth optical network interconnects for machine learning training M Khani, M Ghobadi, M Alizadeh, Z Zhu, M Glick, K Bergman, A Vahdat, ... Proceedings of the 2021 ACM SIGCOMM 2021 Conference, 657-675, 2021 | 85 | 2021 |
Combining HW/SW mechanisms to improve NUMA performance of multi-GPU systems V Young, A Jaleel, E Bolotin, E Ebrahimi, D Nellans, O Villa 2018 51st Annual IEEE/ACM International Symposium on Microarchitecture …, 2018 | 81 | 2018 |
The locality descriptor: A holistic cross-layer abstraction to express data locality in GPUs N Vijaykumar, E Ebrahimi, K Hsieh, PB Gibbons, O Mutlu 2018 ACM/IEEE 45th Annual International Symposium on Computer Architecture …, 2018 | 79 | 2018 |
Selective GPU caches to eliminate CPU-GPU HW cache coherence N Agarwal, D Nellans, E Ebrahimi, TF Wenisch, J Danskin, SW Keckler 2016 IEEE International Symposium on High Performance Computer Architecture …, 2016 | 73 | 2016 |
Optimizing multi-GPU parallelization strategies for deep learning training S Pal, E Ebrahimi, A Zulfiqar, Y Fu, V Zhang, S Migacz, D Nellans, ... IEEE Micro 39 (5), 91-101, 2019 | 68 | 2019 |
Large graph convolutional network training with GPU-oriented data communication architecture SW Min, K Wu, S Huang, M Hidayetoğlu, J Xiong, E Ebrahimi, D Chen, ... arXiv preprint arXiv:2103.03330, 2021 | 67 | 2021 |
EMOGI: Efficient memory-access for out-of-memory graph-traversal in GPUs SW Min, VS Mailthody, Z Qureshi, J Xiong, E Ebrahimi, W Hwu arXiv preprint arXiv:2006.06890, 2020 | 56 | 2020 |