Siddharth Singh
PhD Student, Computer Science, University of Maryland
Verified email at umd.edu
Title
Cited by
Year
Stance detection in web and social media: a comparative study
S Ghosh, P Singhania, S Singh, K Rudra, S Ghosh
Experimental IR Meets Multilinguality, Multimodality, and Interaction: 10th …, 2019
Cited by: 110; Year: 2019
A hybrid tensor-expert-data parallelism approach to optimize mixture-of-experts training
S Singh, O Ruwase, AA Awan, S Rajbhandari, Y He, A Bhatele
Proceedings of the 37th International Conference on Supercomputing, 203-214, 2023
Cited by: 22; Year: 2023
Be like a goldfish, don't memorize! mitigating memorization in generative llms
A Hans, J Kirchenbauer, Y Wen, N Jain, H Kazemi, P Singhania, S Singh, ...
Advances in Neural Information Processing Systems 37, 24022-24045, 2024
Cited by: 16; Year: 2024
AxoNN: An asynchronous, message-driven parallel framework for extreme-scale deep learning
S Singh, A Bhatele
2022 IEEE International Parallel and Distributed Processing Symposium (IPDPS …, 2022
Cited by: 16; Year: 2022
Loki: Low-rank keys for efficient sparse attention
P Singhania, S Singh, S He, S Feizi, A Bhatele
arXiv preprint arXiv:2406.02542, 2024
Cited by: 15; Year: 2024
A survey and empirical evaluation of parallel deep learning frameworks
D Nichols, S Singh, SH Lin, A Bhatele
arXiv preprint arXiv:2111.04949, 2021
Cited by: 10*; Year: 2021
Inducing Cooperation in Multi-Agent Games Through Status-Quo Loss
P Badjatiya, M Sarkar, A Sinha, S Singh, N Puri, B Krishnamurthy
arXiv preprint, 2020
Cited by: 9*; Year: 2020
Exploiting sparsity in pruned neural networks to optimize large model training
S Singh, A Bhatele
2023 IEEE International Parallel and Distributed Processing Symposium (IPDPS …, 2023
Cited by: 8; Year: 2023
Scaling up Test-Time Compute with Latent Reasoning: A Recurrent Depth Approach
J Geiping, S McLeish, N Jain, J Kirchenbauer, S Singh, BR Bartoldson, ...
arXiv preprint arXiv:2502.05171, 2025
Cited by: 4; Year: 2025
PySchedCL: Leveraging Concurrency in Heterogeneous Data-Parallel Systems
A Ghose, S Singh, V Kulaharia, L Dokara, S Maity, S Dey
IEEE Transactions on Computers 71 (9), 2234-2247, 2021
Cited by: 4; Year: 2021
A 4D Hybrid Algorithm to Scale Parallel Training to Thousands of GPUs
arXiv preprint arXiv:2305.13525, 2023
Cited by: 3*; Year: 2023
HPC-Coder-V2: Studying Code LLMs Across Low-Resource Parallel Languages
A Chaturvedi, D Nichols, S Singh, A Bhatele
arXiv preprint arXiv:2412.15178, 2024
Cited by: 1; Year: 2024
Democratizing AI: Open-source Scalable LLM Training on GPU-based Supercomputers
S Singh, P Singhania, A Ranjan, J Kirchenbauer, J Geiping, Y Wen, ...
SC24: International Conference for High Performance Computing, Networking …, 2024
Cited by: 1; Year: 2024
Jorge: Approximate Preconditioning for GPU-efficient Second-order Optimization
S Singh, Z Sating, A Bhatele
arXiv preprint arXiv:2310.12298, 2023
Cited by: 1; Year: 2023
Gemstones: A Model Suite for Multi-Faceted Scaling Laws
S McLeish, J Kirchenbauer, DY Miller, S Singh, A Bhatele, M Goldblum, ...
arXiv preprint arXiv:2502.06857, 2025
Year: 2025
Eve: Less Memory, Same Might
A Tomar, S Singh, T Goldstein, A Bhatele
Year: 2024
Creating Code LLMs for HPC: It’s LLMs All the Way Down
A Chaturvedi, D Nichols, S Singh, A Bhatele
Memory (MB) 4011 (7228), 14927, 0
Articles 1–17