Title | Authors | Venue | Cited by | Year |
Sharpness-aware minimization revisited: Weighted sharpness as a regularization term | Y Yue, J Jiang, Z Ye, N Gao, Y Liu, K Zhang | Proceedings of the 29th ACM SIGKDD Conference on Knowledge Discovery and …, 2023 | 9 | 2023 |
Multi-aspect heterogeneous graph augmentation | Y Zhou, Y Cao, Y Liu, Y Shang, P Zhang, Z Lin, Y Yue, B Wang, X Fu, ... | Proceedings of the ACM Web Conference 2023, 39-48, 2023 | 4 | 2023 |
AGD: an auto-switchable optimizer using stepwise gradient difference for preconditioning matrix | Y Yue, Z Ye, J Jiang, Y Liu, K Zhang | Advances in Neural Information Processing Systems 36, 45812-45832, 2023 | 2 | 2023 |
Adaptive Optimizers with Sparse Group Lasso for Neural Networks in CTR Prediction | Y Yue, Y Liu, S Tong, M Li, Z Zhang, C Wen, H Bao, L Gu, J Gu, Y Mu | Joint European Conference on Machine Learning and Knowledge Discovery in …, 2021 | 2 | 2021 |
Integer is enough: when vertical federated learning meets rounding | P Qiu, Y Pu, Y Liu, W Liu, Y Yue, X Zhu, L Li, J Li, S Ji | Proceedings of the AAAI Conference on Artificial Intelligence 38 (13), 14704 …, 2024 | 1 | 2024 |
EDiT: A Local-SGD-Based Efficient Distributed Training Method for Large Language Models | J Cheng, N Gao, Y Yue, Z Ye, J Jiang, J Sha | arXiv preprint arXiv:2412.07210, 2024 | | 2024 |