Gaining insights into multicore cache partitioning: Bridging the gap between simulation and real systems J Lin, Q Lu, X Ding, Z Zhang, X Zhang, P Sadayappan 2008 IEEE 14th International Symposium on High Performance Computer …, 2008 | 510 | 2008 |
Synthesis of high-performance parallel programs for a class of ab initio quantum chemistry models G Baumgartner, A Auer, DE Bernholdt, A Bibireata, V Choppella, ... Proceedings of the IEEE 93 (2), 276-292, 2005 | 263 | 2005 |
Automatic code generation for many-body electronic structure methods: the tensor contraction engine AA Auer, G Baumgartner, DE Bernholdt, A Bibireata, V Choppella, ... Molecular Physics 104 (2), 211-228, 2006 | 179 | 2006 |
PARDA: A fast parallel reuse distance analysis algorithm Q Niu, J Dinan, Q Lu, P Sadayappan 2012 IEEE 26th International Parallel and Distributed Processing Symposium …, 2012 | 117 | 2012 |
Data layout transformation for enhancing data locality on NUCA chip multiprocessors Q Lu, C Alias, U Bondhugula, T Henretty, S Krishnamoorthy, ... 2009 18th International Conference on Parallel Architectures and Compilation …, 2009 | 108 | 2009 |
MCC-DB: Minimizing cache conflicts in multi-core processors for databases R Lee, X Ding, F Chen, Q Lu, X Zhang Proceedings of the VLDB Endowment 2 (1), 373-384, 2009 | 85 | 2009 |
Soft-OLP: Improving hardware cache performance through software-controlled object-level partitioning Q Lu, J Lin, X Ding, Z Zhang, X Zhang, P Sadayappan 2009 18th International Conference on Parallel Architectures and Compilation …, 2009 | 76 | 2009 |
Enabling software management for multicore caches with a lightweight hardware support J Lin, Q Lu, X Ding, Z Zhang, X Zhang, P Sadayappan Proceedings of the Conference on High Performance Computing Networking …, 2009 | 65 | 2009 |
Performance optimization of tensor contraction expressions for many-body methods in quantum chemistry A Hartono, Q Lu, T Henretty, S Krishnamoorthy, H Zhang, G Baumgartner, ... The Journal of Physical Chemistry A 113 (45), 12715-12723, 2009 | 39 | 2009 |
Combining analytical and empirical approaches in tuning matrix transposition Q Lu, S Krishnamoorthy, P Sadayappan Proceedings of the 15th international conference on Parallel architectures …, 2006 | 30 | 2006 |
Applying MPI derived datatypes to the NAS benchmarks: A case study Q Lu, J Wu, D Panda, P Sadayappan Workshops on Mobile and Wireless Networking/High Performance Scientific …, 2004 | 28 | 2004 |
Empirical performance model-driven data layout optimization and library call selection for tensor contraction expressions Q Lu, X Gao, S Krishnamoorthy, G Baumgartner, J Ramanujam, ... Journal of Parallel and Distributed Computing 72 (3), 338-352, 2012 | 25 | 2012 |
Hermit: Low-Latency, High-Throughput, and Transparent Remote Memory via Feedback-Directed Asynchrony Y Qiao, C Wang, Z Ruan, A Belay, Q Lu, Y Zhang, M Kim, GH Xu 20th USENIX Symposium on Networked Systems Design and Implementation (NSDI …, 2023 | 24 | 2023 |
Identifying cost-effective common subexpressions to reduce operation count in tensor contraction evaluations A Hartono, Q Lu, X Gao, S Krishnamoorthy, M Nooijen, G Baumgartner, ... Computational Science–ICCS 2006: 6th International Conference, Reading, UK …, 2006 | 24 | 2006 |
Performance modeling and optimization of parallel out-of-core tensor contractions X Gao, SK Sahoo, CC Lam, J Ramanujam, Q Lu, G Baumgartner, ... Proceedings of the tenth ACM SIGPLAN symposium on Principles and practice of …, 2005 | 16 | 2005 |
ArkDB: a key-value engine for scalable cloud storage services Z Pang, Q Lu, S Chen, R Wang, Y Xu, J Wu Proceedings of the 2021 International Conference on Management of Data, 2570 …, 2021 | 12 | 2021 |
High-performance key-value store Z Pang, Q Lu, S Chen, Y Xu, J Wu, R Wang US Patent App. 17/336,047, 2022 | 8 | 2022 |
Modelling and Simulation in the Natural Sciences-Identifying Cost-Effective Common Subexpressions to Reduce Operation Count in Tensor Contraction Evaluations A Hartono, Q Lu, X Gao, S Krishnamoorthy, M Nooijen, G Baumgartner, ... Lecture Notes in Computer Science 3991, 267-275, 2006 | 6 | 2006 |
LogParser-LLM: Advancing efficient log parsing with large language models A Zhong, D Mo, G Liu, J Liu, Q Lu, Q Zhou, J Wu, Q Li, Q Wen Proceedings of the 30th ACM SIGKDD Conference on Knowledge Discovery and …, 2024 | 5 | 2024 |
Granularly timestamped concurrency control for key-value store R Wang, Z Pang, Q Lu, S Chen, J Wu US Patent 11,741,073, 2023 | 4 | 2023 |