FleetRec: Large-scale recommendation inference on hybrid GPU-FPGA clusters W Jiang, Z He, S Zhang, K Zeng, L Feng, J Zhang, T Liu, Y Li, J Zhou, ... Proceedings of the 27th ACM SIGKDD Conference on Knowledge Discovery & Data …, 2021 | 49 | 2021 |
MicroRec: Efficient recommendation inference by hardware and data structure solutions W Jiang, Z He, S Zhang, TB Preußer, K Zeng, L Feng, J Zhang, T Liu, Y Li, ... Proceedings of Machine Learning and Systems 3, 845-859, 2021 | 42 | 2021 |
Minions: Accelerating large language model inference with adaptive and collective speculative decoding S Wang, H Yang, X Wang, T Liu, P Wang, X Liang, K Ma, T Feng, X You, ... arXiv preprint arXiv:2402.15678, 2024 | 8 | 2024 |
Logic-of-Thought: Injecting logic into contexts for full reasoning in large language models T Liu, W Xu, W Huang, Y Zeng, J Wang, X Wang, H Yang, J Li arXiv preprint arXiv:2409.17539, 2024 | 6 | 2024 |
MicroRec: Accelerating deep recommendation systems to microseconds by hardware and data structure solutions W Jiang, Z He, S Zhang, TB Preußer, K Zeng, L Feng, J Zhang, T Liu, Y Li, ... arXiv preprint arXiv:2010.05894, 2020 | 6 | 2020 |
GroupDebate: Enhancing the efficiency of multi-agent debate using group discussion T Liu, X Wang, W Huang, W Xu, Y Zeng, L Jiang, H Yang, J Li arXiv preprint arXiv:2409.14051, 2024 | 5 | 2024 |
Leveraging LLMs for Hypothetical Deduction in Logical Inference: A Neuro-Symbolic Approach Q Li, J Li, T Liu, Y Zeng, M Cheng, W Huang, Q Liu arXiv preprint arXiv:2410.21779, 2024 | 1 | 2024 |
AtRec: Accelerating recommendation model training on CPUs S Wang, T Feng, H Yang, X You, B Chen, T Liu, Z Luan, D Qian IEEE Transactions on Parallel and Distributed Systems, 2024 | 1 | 2024 |
S-MAD: Breaking the Token Barrier to Enhance Multi-Agent Debate Efficiency Y Zeng, W Huang, L Jiang, T Liu, X Jin, CT Tiana, J Li, X Xu arXiv preprint arXiv:2502.04790, 2025 | | 2025 |
FoPru: Focal Pruning for Efficient Large Vision-Language Models L Jiang, W Huang, T Liu, Y Zeng, J Li, L Cheng, X Xu arXiv preprint arXiv:2411.14164, 2024 | | 2024 |
Exploiting Structured Feature and Runtime Isolation for High-Performant Recommendation Serving X You, H Yang, S Wang, T Peng, C Ding, X Li, B Chen, Z Luan, T Liu, Y Li, ... IEEE Transactions on Computers, 2024 | | 2024 |
PRmalloc: Leveraging Predictability for Deep Learning Memory Allocation W Xiao, S Ren, T Liu, Y Li | | 2019 |