Learning to (Learn at Test Time): RNNs with Expressive Hidden States Y Sun, X Li, K Dalal, J Xu, A Vikram, G Zhang, Y Dubois, X Chen, X Wang, ... arXiv preprint arXiv:2407.04620, 2024 | 68* | 2024 |
MoA: Mixture of Sparse Attention for Automatic Large Language Model Compression T Fu, H Huang, X Ning, G Zhang, B Chen, T Wu, H Wang, Z Huang, S Li, ... arXiv preprint arXiv:2406.14909, 2024 | 14 | 2024 |
CATS: Context-Aware Thresholding for Sparsity in Large Language Models D Lee, J Lee, G Zhang, M Tiwari, A Mirhoseini First Conference on Language Modeling, 2024 | 10* | 2024 |
Sgap: Towards Efficient Sparse Tensor Algebra Compilation for GPU G Zhang, Y Zhao, Y Tao, Z Yu, G Dai, S Huang, Y Wen, P Petoumenos, ... CCF Transactions on High Performance Computing, 1-18, 2023 | 5 | 2023 |
HyperGef: A Framework Enabling Efficient Fusion for Hypergraph Neural Network on GPUs Z Yu, G Dai, S Yang, G Zhang, H Zhang, F Zhu, J Yang, J Zhao, Y Wang Proceedings of Machine Learning and Systems 5, 387-399, 2023 | 5 | 2023 |
FEASTA: A Flexible and Efficient Accelerator for Sparse Tensor Algebra in Machine Learning K Zhong, Z Zhu, G Dai, H Wang, X Yang, H Zhang, J Si, Q Mao, S Zeng, ... Proceedings of the 29th ACM International Conference on Architectural …, 2024 | 4 | 2024 |
Compilation of Modular and General Sparse Workspaces G Zhang, O Hsu, F Kjolstad Proceedings of the ACM on Programming Languages 8 (PLDI), 1213-1238, 2024 | 1 | 2024 |
Streaming Tensor Programs: A Programming Abstraction for Streaming Dataflow Accelerators G Sohn, C Gyurgyik, G Zhang, S Velury, P Mure, N Zhang, K Olukotun ASPLOS Young Architect Workshop (YArch), 2024 | 1 | 2024 |
Canvas: End-to-End Kernel Architecture Search in Neural Networks C Zhao, G Zhang, M Gao arXiv preprint arXiv:2304.07741, 2023 | 1 | 2023 |
Adaptive Self-improvement LLM Agentic System for ML Library Development G Zhang, W Liang, O Hsu, K Olukotun arXiv preprint arXiv:2502.02534, 2025 | | 2025 |
Mixture-of-Mamba: Enhancing Multi-Modal State-Space Models with Modality-Aware Sparsity W Liang, J Shen, G Zhang, N Dong, L Zettlemoyer, L Yu arXiv preprint arXiv:2501.16295, 2025 | | 2025 |
GeoT: Tensor Centric Library for Graph Neural Network via Efficient Segment Reduction on GPU Z Yu, G Zhang, H Huang, X Chen, J Zhao arXiv preprint arXiv:2404.03019, 2024 | | 2024 |