Lianmin Zheng
xAI
Verified email at x.ai
Title / Cited by / Year
Judging LLM-as-a-Judge with MT-Bench and Chatbot Arena
L Zheng, WL Chiang, Y Sheng, S Zhuang, Z Wu, Y Zhuang, Z Lin, Z Li, ...
Advances in Neural Information Processing Systems 36, 46595-46623, 2023
Cited by: 2441*; Year: 2023
TVM: An automated end-to-end optimizing compiler for deep learning
T Chen, T Moreau, Z Jiang, L Zheng, E Yan, H Shen, M Cowan, L Wang, ...
13th USENIX Symposium on Operating Systems Design and Implementation (OSDI …, 2018
Cited by: 2221*; Year: 2018
Vicuna: An open-source chatbot impressing GPT-4 with 90%* ChatGPT quality
WL Chiang, Z Li, Z Lin, Y Sheng, Z Wu, H Zhang, L Zheng, S Zhuang, ...
https://lmsys.org/blog/2023-03-30-vicuna/, 2023
Cited by: 2165*; Year: 2023
Efficient memory management for large language model serving with PagedAttention
W Kwon, Z Li, S Zhuang, Y Sheng, L Zheng, CH Yu, J Gonzalez, H Zhang, ...
Proceedings of the 29th Symposium on Operating Systems Principles, 611-626, 2023
Cited by: 948; Year: 2023
Learning to optimize tensor programs
T Chen, L Zheng, E Yan, Z Jiang, T Moreau, L Ceze, C Guestrin, ...
Advances in Neural Information Processing Systems 31, 2018
Cited by: 473; Year: 2018
Ansor: Generating High-Performance Tensor Programs for Deep Learning
L Zheng, C Jia, M Sun, Z Wu, CH Yu, A Haj-Ali, Y Wang, J Yang, D Zhuo, ...
14th USENIX symposium on operating systems design and implementation (OSDI …, 2020
Cited by: 401; Year: 2020
Alpa: Automating Inter-and Intra-Operator Parallelism for Distributed Deep Learning
L Zheng, Z Li, H Zhang, Y Zhuang, Z Chen, Y Huang, Y Wang, Y Xu, ...
16th USENIX symposium on operating systems design and implementation (OSDI 22), 2022
Cited by: 312; Year: 2022
FlexGen: High-Throughput Generative Inference of Large Language Models with a Single GPU
Y Sheng, L Zheng, B Yuan, Z Li, M Ryabinin, B Chen, P Liang, C Re, ...
International Conference on Machine Learning, 2023
Cited by: 291; Year: 2023
A hardware–software blueprint for flexible deep learning specialization
T Moreau, T Chen, L Vega, J Roesch, E Yan, L Zheng, J Fromm, Z Jiang, ...
IEEE Micro 39 (5), 8-16, 2019
Cited by: 272*; Year: 2019
MAgent: A many-agent reinforcement learning platform for artificial collective intelligence
L Zheng, J Yang, H Cai, M Zhou, W Zhang, J Wang, Y Yu
Proceedings of the AAAI conference on artificial intelligence 32 (1), 2018
Cited by: 245; Year: 2018
Chatbot Arena: An open platform for evaluating LLMs by human preference
WL Chiang, L Zheng, Y Sheng, AN Angelopoulos, T Li, D Li, H Zhang, ...
arXiv preprint arXiv:2403.04132, 2024
Cited by: 244; Year: 2024
H2O: Heavy-hitter oracle for efficient generative inference of large language models
Z Zhang, Y Sheng, T Zhou, T Chen, L Zheng, R Cai, Z Song, Y Tian, C Ré, ...
Advances in Neural Information Processing Systems 36, 34661-34710, 2023
Cited by: 202; Year: 2023
How Long Can Context Length of Open-Source LLMs truly Promise?
D Li, R Shao, A Xie, Y Sheng, L Zheng, J Gonzalez, I Stoica, X Ma, ...
NeurIPS 2023 Workshop on Instruction Tuning and Instruction Following, 2023
Cited by: 119*; Year: 2023
AlpaServe: Statistical multiplexing with model parallelism for deep learning serving
Z Li, L Zheng, Y Zhong, V Liu, Y Sheng, X Jin, Y Huang, Z Chen, H Zhang, ...
17th USENIX Symposium on Operating Systems Design and Implementation (OSDI …, 2023
Cited by: 115; Year: 2023
LMSYS-Chat-1M: A Large-Scale Real-World LLM Conversation Dataset
L Zheng, WL Chiang, Y Sheng, T Li, S Zhuang, Z Wu, Y Zhuang, Z Li, ...
The Twelfth International Conference on Learning Representations, 2023
Cited by: 76; Year: 2023
ActNN: Reducing training memory footprint via 2-bit activation compressed training
J Chen, L Zheng, Z Yao, D Wang, I Stoica, M Mahoney, J Gonzalez
International Conference on Machine Learning, 1803-1813, 2021
Cited by: 72; Year: 2021
SGLang: Efficient Execution of Structured Language Model Programs
L Zheng, L Yin, Z Xie, J Huang, C Sun, CH Yu, S Cao, C Kozyrakis, ...
arXiv preprint arXiv:2312.07104, 2023
Cited by: 67*; Year: 2023
S-LoRA: Scalable serving of thousands of LoRA adapters
Y Sheng, S Cao, D Li, C Hooper, N Lee, S Yang, C Chou, B Zhu, L Zheng, ...
Proceedings of Machine Learning and Systems 6, 296-311, 2024
Cited by: 66*; Year: 2024
TensorIR: An abstraction for automatic tensorized program optimization
S Feng, B Hou, H Jin, W Lin, J Shao, R Lai, Z Ye, L Zheng, CH Yu, Y Yu, ...
Proceedings of the 28th ACM International Conference on Architectural …, 2023
Cited by: 63; Year: 2023
Rethinking benchmark and contamination for language models with rephrased samples
S Yang, WL Chiang, L Zheng, JE Gonzalez, I Stoica
arXiv preprint arXiv:2311.04850, 2023
Cited by: 48; Year: 2023
Articles 1–20