Rodinia: A benchmark suite for heterogeneous computing S Che, M Boyer, J Meng, D Tarjan, JW Sheaffer, SH Lee, K Skadron 2009 IEEE international symposium on workload characterization (IISWC), 44-54, 2009 | 3873 | 2009 |
A performance study of general-purpose applications on graphics processors using CUDA S Che, M Boyer, J Meng, D Tarjan, JW Sheaffer, K Skadron Journal of parallel and distributed computing 68 (10), 1370-1380, 2008 | 937 | 2008 |
Dynamic warp subdivision for integrated branch and memory divergence tolerance J Meng, D Tarjan, K Skadron Proceedings of the 37th annual international symposium on Computer …, 2010 | 378 | 2010 |
Performance modeling and automatic ghost zone optimization for iterative stencil loops on GPUs J Meng, K Skadron Proceedings of the 23rd international conference on Supercomputing, 256-265, 2009 | 201 | 2009 |
GROPHECY: GPU performance projection from CPU code skeletons J Meng, VA Morozov, K Kumaran, V Vishwanath, TD Uram Proceedings of 2011 International Conference for High Performance Computing …, 2011 | 135 | 2011 |
Best-effort parallel execution framework for recognition and mining applications J Meng, S Chakradhar, A Raghunathan 2009 IEEE International Symposium on Parallel & Distributed Processing, 1-12, 2009 | 134 | 2009 |
Improving GPU performance prediction with data transfer modeling M Boyer, J Meng, K Kumaran 2013 IEEE International Symposium on Parallel & Distributed Processing …, 2013 | 79 | 2013 |
Increasing memory miss tolerance for SIMD cores D Tarjan, J Meng, K Skadron Proceedings of the Conference on High Performance Computing Networking …, 2009 | 77 | 2009 |
Avoiding cache thrashing due to private data placement in last-level cache for manycore scaling J Meng, K Skadron 2009 IEEE international conference on computer design, 282-288, 2009 | 72 | 2009 |
A performance study for iterative stencil loops on GPUs with ghost zone optimizations J Meng, K Skadron International Journal of Parallel Programming 39, 115-142, 2011 | 67 | 2011 |
Exploiting the forgiving nature of applications for scalable parallel execution J Meng, A Raghunathan, S Chakradhar, S Byna 2010 IEEE International Symposium on Parallel & Distributed Processing …, 2010 | 57 | 2010 |
Workflow performance improvement using model-based scheduling over multiple clusters and clouds K Maheshwari, ES Jung, J Meng, V Morozov, V Vishwanath, R Kettimuthu Future Generation Computer Systems 54, 206-218, 2016 | 41 | 2016 |
Systems and methods for implementing best-effort parallel computing frameworks S Chakradhar, A Raghunathan, J Meng US Patent 8,286,172, 2012 | 38 | 2012 |
Exploiting inter-thread temporal locality for chip multithreading J Meng, JW Sheaffer, K Skadron 2010 IEEE International Symposium on Parallel & Distributed Processing …, 2010 | 38 | 2010 |
Best-effort semantic document search on GPUs S Byna, J Meng, A Raghunathan, S Chakradhar, S Cadambi Proceedings of the 3rd Workshop on General-Purpose Computation on Graphics …, 2010 | 37 | 2010 |
Dataflow-driven GPU performance projection for multi-kernel transformations J Meng, VA Morozov, V Vishwanath, K Kumaran SC'12: Proceedings of the International Conference on High Performance …, 2012 | 31 | 2012 |
Skope: A framework for modeling and exploring workload behavior J Meng, X Wu, V Morozov, V Vishwanath, K Kumaran, V Taylor Proceedings of the 11th ACM Conference on Computing Frontiers, 1-10, 2014 | 30 | 2014 |
Dynamic warp subdivision for integrated branch and memory latency divergence tolerance K Skadron, J Meng, D Tarjan US Patent App. 13/040,045, 2011 | 29 | 2011 |
Robust SIMD: Dynamically adapted SIMD width and multi-threading depth J Meng, JW Sheaffer, K Skadron 2012 IEEE 26th international parallel and distributed processing symposium …, 2012 | 24 | 2012 |
A multiple SIMD, multiple data (MSMD) architecture: Parallel execution of dynamic and static SIMD fragments Y Wang, S Chen, J Wan, J Meng, K Zhang, W Liu, X Ning 2013 IEEE 19th International Symposium on High Performance Computer …, 2013 | 21 | 2013 |