A survey of quantization methods for efficient neural network inference A Gholami, S Kim, Z Dong, Z Yao, MW Mahoney, K Keutzer Low-Power Computer Vision, 291-326, 2022 | 1235 | 2022 |
Q-BERT: Hessian based ultra low precision quantization of BERT S Shen, Z Dong, J Ye, L Ma, Z Yao, A Gholami, MW Mahoney, K Keutzer Proceedings of the AAAI Conference on Artificial Intelligence 34 (05), 8815-8821, 2020 | 590 | 2020 |
HAWQ: Hessian aware quantization of neural networks with mixed-precision Z Dong, Z Yao, A Gholami, MW Mahoney, K Keutzer Proceedings of the IEEE/CVF International Conference on Computer Vision, 293-302, 2019 | 556 | 2019 |
ZeroQ: A novel zero shot quantization framework Y Cai, Z Yao, Z Dong, A Gholami, MW Mahoney, K Keutzer Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern …, 2020 | 446 | 2020 |
How much can CLIP benefit vision-and-language tasks? S Shen, LH Li, H Tan, M Bansal, A Rohrbach, KW Chang, Z Yao, ... arXiv preprint arXiv:2107.06383, 2021 | 423 | 2021 |
I-BERT: Integer-only BERT quantization S Kim, A Gholami, Z Yao, MW Mahoney, K Keutzer International Conference on Machine Learning, 5506-5518, 2021 | 349 | 2021 |
ZeroQuant: Efficient and affordable post-training quantization for large-scale transformers Z Yao, R Yazdani Aminabadi, M Zhang, X Wu, C Li, Y He Advances in Neural Information Processing Systems 35, 27168-27183, 2022 | 326 | 2022 |
PyHessian: Neural networks through the lens of the Hessian Z Yao, A Gholami, K Keutzer, MW Mahoney 2020 IEEE International Conference on Big Data (Big Data), 581-590, 2020 | 309 | 2020 |
HAWQ-V2: Hessian aware trace-weighted quantization of neural networks Z Dong, Z Yao, D Arfeen, A Gholami, MW Mahoney, K Keutzer Advances in Neural Information Processing Systems 33, 18518-18529, 2020 | 290 | 2020 |
AdaHessian: An adaptive second order optimizer for machine learning Z Yao, A Gholami, S Shen, M Mustafa, K Keutzer, M Mahoney Proceedings of the AAAI Conference on Artificial Intelligence 35 (12), 10665 …, 2021 | 273 | 2021 |
HAWQ-V3: Dyadic neural network quantization Z Yao, Z Dong, Z Zheng, A Gholami, J Yu, E Tan, L Wang, Q Huang, ... International Conference on Machine Learning, 11875-11886, 2021 | 267 | 2021 |
Shallow neural networks for fluid flow reconstruction with limited sensors NB Erichson, L Mathelin, Z Yao, SL Brunton, MW Mahoney, JN Kutz Proceedings of the Royal Society A 476 (2238), 20200097, 2020 | 232 | 2020 |
DeepSpeed-MoE: Advancing mixture-of-experts inference and training to power next-generation AI scale S Rajbhandari, C Li, Z Yao, M Zhang, RY Aminabadi, AA Awan, J Rasley, ... International Conference on Machine Learning, 18332-18346, 2022 | 223 | 2022 |
Hessian-based analysis of large batch training and robustness to adversaries Z Yao, A Gholami, Q Lei, K Keutzer, MW Mahoney Advances in Neural Information Processing Systems 31, 2018 | 179 | 2018 |
ANODEV2: A coupled neural ODE framework T Zhang, Z Yao, A Gholami, JE Gonzalez, K Keutzer, MW Mahoney, ... Advances in Neural Information Processing Systems 32, 2019 | 104 | 2019 |
Improving semi-supervised federated learning by reducing the gradient diversity of models Z Zhang, Y Yang, Z Yao, Y Yan, JE Gonzalez, K Ramchandran, ... 2021 IEEE International Conference on Big Data (Big Data), 1214-1225, 2021 | 94 | 2021 |
PowerNorm: Rethinking batch normalization in transformers S Shen, Z Yao, A Gholami, M Mahoney, K Keutzer International Conference on Machine Learning, 8741-8751, 2020 | 89 | 2020 |
On the computational inefficiency of large batch sizes for stochastic gradient descent N Golmant, N Vemuri, Z Yao, V Feinberg, A Gholami, K Rothauge, ... arXiv preprint arXiv:1811.12941, 2018 | 87 | 2018 |
ZeroQuant-V2: Exploring Post-training Quantization in LLMs from Comprehensive Study to Low Rank Compensation Z Yao, C Li, X Wu, S Youn, Y He arXiv preprint arXiv:2303.08302, 2023 | 84* | 2023 |
Trust region based adversarial attack on neural networks Z Yao, A Gholami, P Xu, K Keutzer, MW Mahoney Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern …, 2019 | 74 | 2019 |