Lingxiao Ma
Senior Researcher, Microsoft Research
Verified email at pku.edu.cn
Title
Cited by
Year
NeuGraph: Parallel Deep Neural Network Computation on Large Graphs
L Ma, Z Yang, Y Miao, J Xue, M Wu, L Zhou, Y Dai
2019 USENIX Annual Technical Conference (USENIX ATC 19), 443-458, 2019
Cited by 282, 2019
Rammer: Enabling Holistic Deep Learning Compiler Optimizations with rTasks
L Ma, Z Xie, Z Yang, J Xue, Y Miao, W Cui, W Hu, F Yang, L Zhang, ...
14th USENIX Symposium on Operating Systems Design and Implementation …, 2020
Cited by 147, 2020
The Era of 1-bit LLMs: All Large Language Models are in 1.58 Bits
S Ma, H Wang, L Ma, L Wang, W Wang, S Huang, L Dong, R Wang, J Xue, ...
arXiv preprint arXiv:2402.17764, 2024
Cited by 128, 2024
SeerNet: Predicting Convolutional Neural Network Feature-Map Sparsity Through Low-Bit Quantization
S Cao, L Ma, W Xiao, C Zhang, Y Liu, L Zhang, L Nie, Z Yang
Proceedings of the IEEE Conference on Computer Vision and Pattern …, 2019
Cited by 93, 2019
Garaph: efficient GPU-accelerated graph processing on a single machine with balanced replication
L Ma, Z Yang, H Chen, J Xue, Y Dai
2017 USENIX Annual Technical Conference (USENIX ATC 17), 195-207, 2017
Cited by 83, 2017
BitNet: Scaling 1-bit transformers for large language models
H Wang, S Ma, L Dong, S Huang, H Wang, L Ma, F Yang, R Wang, Y Wu, ...
arXiv preprint arXiv:2310.11453, 2023
Cited by 75, 2023
ROLLER: Fast and Efficient Tensor Compilation for Deep Learning
H Zhu, R Wu, Y Diao, S Ke, H Li, C Zhang, J Xue, L Ma, Y Xia, W Cui, ...
16th USENIX Symposium on Operating Systems Design and Implementation (OSDI …, 2022
Cited by 71, 2022
Architectural Implications of Graph Neural Networks
Z Zhang, J Leng, L Ma, Y Miao, C Li, M Guo
IEEE Computer Architecture Letters 19 (1), 59-62, 2020
Cited by 58, 2020
Heterogeneity-Aware Distributed Machine Learning Training via Partial Reduce
X Miao, X Nie, Y Shao, Z Yang, J Jiang, L Ma, B Cui
Proceedings of the 2021 International Conference on Management of Data, 2262 …, 2021
Cited by 54, 2021
PCGCN: Partition-Centric Processing for Accelerating Graph Convolutional Network
C Tian, L Ma, Z Yang, Y Dai
2020 IEEE International Parallel and Distributed Processing Symposium (IPDPS …, 2020
Cited by 47, 2020
SparTA: Deep-Learning Model Sparsity via Tensor-with-Sparsity-Attribute
N Zheng, B Lin, Q Zhang, L Ma, Y Yang, F Yang, Y Wang, M Yang, L Zhou
16th USENIX Symposium on Operating Systems Design and Implementation (OSDI …, 2022
Cited by 42, 2022
Towards Efficient Large-Scale Graph Neural Network Computing
L Ma, Z Yang, Y Miao, J Xue, M Wu, L Zhou, Y Dai
arXiv preprint arXiv:1810.08403, 2018
Cited by 36, 2018
FlexMoE: Scaling Large-scale Sparse Pre-trained Model Training via Dynamic Device Placement
X Nie, X Miao, Z Wang, Z Yang, J Xue, L Ma, G Cao, B Cui
Proceedings of the ACM on Management of Data 1 (1), 1-19, 2023
Cited by 31, 2023
EvoMoE: An evolutional mixture-of-experts training framework via dense-to-sparse gate
X Nie, X Miao, S Cao, L Ma, Q Liu, J Xue, Y Miao, Y Liu, Z Yang, B Cui
arXiv preprint arXiv:2112.14397, 2021
Cited by 29, 2021
Dense-to-Sparse Gate for Mixture-of-Experts
X Nie, S Cao, X Miao, L Ma, J Xue, Y Miao, Z Yang, Z Yang, B Cui
arXiv preprint arXiv:2112.14397, 2021
Cited by 27, 2021
Welder: Scheduling Deep Learning Memory Access via Tile-graph
Y Shi, Z Yang, J Xue, L Ma, Y Xia, Z Miao, Y Guo, F Yang, L Zhou
17th USENIX Symposium on Operating Systems Design and Implementation (OSDI …, 2023
Cited by 20, 2023
PIT: Optimization of Dynamic Sparse Deep Learning Models via Permutation Invariant Transformation
N Zheng, H Jiang, Q Zhang, Z Han, L Ma, Y Yang, F Yang, C Zhang, L Qiu, ...
Proceedings of the 29th Symposium on Operating Systems Principles, 331-347, 2023
Cited by 19, 2023
Optimizing Dynamic Neural Networks with Brainstorm
W Cui, Z Han, L Ouyang, Y Wang, N Zheng, L Ma, Y Yang, F Yang, J Xue, ...
17th USENIX Symposium on Operating Systems Design and Implementation (OSDI …, 2023
Cited by 13, 2023
Cocktailer: Analyzing and Optimizing Dynamic Control Flow in Deep Learning
C Zhang, L Ma, J Xue, Y Shi, Z Miao, F Yang, J Zhai, Z Yang, M Yang
17th USENIX Symposium on Operating Systems Design and Implementation (OSDI …, 2023
Cited by 10, 2023
Accelerating GNN training with locality-aware partial execution
T Kim, C Hwang, KS Park, Z Lin, P Cheng, Y Miao, L Ma, Y Xiong
Proceedings of the 12th ACM SIGOPS Asia-Pacific Workshop on Systems, 34-41, 2021
Cited by 10, 2021