Shuaiwen Leon Song
VP of Research, Together.ai; Ex-Microsoft; Tenured Professor
Verified email at together.ai
Title
Cited by
Year
PowerPack: Energy profiling and analysis of high-performance systems and applications
R Ge, X Feng, S Song, HC Chang, D Li, KW Cameron
IEEE Transactions on Parallel and Distributed Systems 21 (5), 658-671, 2009
Cited by 545, Year 2009
SuperNeurons: Dynamic GPU memory management for training deep neural networks
L Wang, J Ye, Y Zhao, W Wu, A Li, SL Song, Z Xu, T Kraska
Proceedings of the 23rd ACM SIGPLAN symposium on principles and practice of …, 2018
Cited by 322, Year 2018
Evaluating modern GPU interconnect: PCIe, NVLink, NV-SLI, NVSwitch and GPUDirect
A Li, SL Song, J Chen, J Li, X Liu, NR Tallent, KJ Barker
IEEE Transactions on Parallel and Distributed Systems 31 (1), 94-110, 2019
Cited by 302, Year 2019
A Simplified and Accurate Model of Power-Performance Efficiency on Emergent GPU Architectures
S Song, C Su, B Rountree, KW Cameron
27th IEEE International Parallel & Distributed Processing Symposium (IPDPS), 2013
Cited by 217, Year 2013
Locality-driven dynamic GPU cache bypassing
C Li, SL Song, H Dai, A Sidelnik, SKS Hari, H Zhou
Proceedings of the 29th ACM on International Conference on Supercomputing, 67-77, 2015
Cited by 143, Year 2015
GraphReduce: processing large-scale graphs on accelerator-based systems
D Sengupta, SL Song, K Agarwal, K Schwan
Proceedings of the International Conference for High Performance Computing …, 2015
Cited by 112, Year 2015
Locality-aware CTA clustering for modern GPUs
A Li, SL Song, W Liu, X Liu, A Kumar, H Corporaal
ACM SIGARCH Computer Architecture News 45 (1), 297-311, 2017
Cited by 97, Year 2017
Randomness in neural network training: Characterizing the impact of tooling
D Zhuang, X Zhang, S Song, S Hooker
Proceedings of Machine Learning and Systems 4, 316-336, 2022
Cited by 96, Year 2022
DeepSpeed Ulysses: System optimizations for enabling training of extreme long sequence transformer models
SA Jacobs, M Tanaka, C Zhang, M Zhang, SL Song, S Rajbhandari, Y He
arXiv preprint arXiv:2309.14509, 2023
Cited by 73, Year 2023
Iso-energy-efficiency: An approach to power-constrained parallel computation
S Song, CY Su, R Ge, A Vishnu, KW Cameron
2011 IEEE International Parallel & Distributed Processing Symposium, 128-139, 2011
Cited by 71, Year 2011
AStitch: enabling a new multi-dimensional optimization space for memory-intensive ML training and inference on modern SIMT architectures
Z Zheng, X Yang, P Zhao, G Long, K Zhu, F Zhu, W Zhao, X Liu, J Yang, ...
Proceedings of the 27th ACM International Conference on Architectural …, 2022
Cited by 69, Year 2022
Tartan: evaluating modern GPU interconnect via a multi-GPU benchmark suite
A Li, SL Song, J Chen, X Liu, N Tallent, K Barker
2018 IEEE International Symposium on Workload Characterization (IISWC), 191-202, 2018
Cited by 69, Year 2018
Energy profiling and analysis of the HPC Challenge benchmarks
S Song, R Ge, X Feng, KW Cameron
The International Journal of High Performance Computing Applications 23 (3 …, 2009
Cited by 68, Year 2009
Unified performance and power modeling of scientific workloads
SL Song, K Barker, D Kerbyson
Proceedings of the 1st International Workshop on Energy Efficient …, 2013
Cited by 62, Year 2013
Processing-in-memory enabled graphics processors for 3D rendering
C Xie, SL Song, J Wang, W Zhang, X Fu
2017 IEEE International Symposium on High Performance Computer Architecture …, 2017
Cited by 61, Year 2017
Flash-LLM: Enabling cost-effective and highly-efficient large generative model inference with unstructured sparsity
H Xia, Z Zheng, Y Li, D Zhuang, Z Zhou, X Qiu, Y Li, W Lin, SL Song
arXiv preprint arXiv:2309.10285, 2023
Cited by 57, Year 2023
Exploring and analyzing the real impact of modern on-package memory on HPC scientific kernels
A Li, W Liu, MRB Kristensen, B Vinter, H Wang, K Hou, A Marquez, ...
Proceedings of the International Conference for High Performance Computing …, 2017
Cited by 57, Year 2017
New-Sum: A novel online ABFT scheme for general iterative methods
D Tao, SL Song, S Krishnamoorthy, P Wu, X Liang, EZ Zhang, ...
Proceedings of the 25th ACM International Symposium on High-Performance …, 2016
Cited by 57, Year 2016
DeepSpeed-Chat: Easy, fast and affordable RLHF training of ChatGPT-like models at all scales
Z Yao, RY Aminabadi, O Ruwase, S Rajbhandari, X Wu, AA Awan, ...
arXiv preprint arXiv:2308.01320, 2023
Cited by 56, Year 2023
Investigating the interplay between energy efficiency and resilience in high performance computing
L Tan, SL Song, P Wu, Z Chen, R Ge, DJ Kerbyson
2015 IEEE International Parallel and Distributed Processing Symposium, 786-796, 2015
Cited by 56, Year 2015
Articles 1–20