| Title | Authors | Venue | Cited by | Year |
|---|---|---|---|---|
| A hypergradient approach to robust regression without correspondence | Y Xie, Y Mao, S Zuo, H Xu, X Ye, T Zhao, H Zha | The Ninth International Conference on Learning Representations | 18 | 2020 |
| In-sample actor critic for offline reinforcement learning | H Zhang, Y Mao, B Wang, S He, Y Xu, X Ji | The Eleventh International Conference on Learning Representations | 11 | 2023 |
| Supported trust region optimization for offline reinforcement learning | Y Mao, H Zhang, C Chen, Y Xu, X Ji | International Conference on Machine Learning | 10 | 2023 |
| Supported value regularization for offline reinforcement learning | Y Mao, H Zhang, C Chen, Y Xu, X Ji | Advances in Neural Information Processing Systems 36 | 8 | 2024 |
| Robust fast adaptation from adversarially explicit task distribution generation | C Wang, Y Lv, Y Mao, Y Qu, Y Xu, X Ji | arXiv preprint arXiv:2407.19523 | 4 | 2024 |
| Choices are more important than efforts: LLM enables efficient multi-agent exploration | Y Qu, B Wang, Y Jiang, J Shao, Y Mao, C Wang, C Liu, X Ji | arXiv preprint arXiv:2410.02511 | 2 | 2024 |
| Offline reinforcement learning with OOD state correction and OOD action suppression | Y Mao, C Wang, C Chen, Y Qu, X Ji | arXiv preprint arXiv:2410.19400 | 1 | 2024 |
| Doubly mild generalization for offline reinforcement learning | Y Mao, Q Wang, Y Qu, Y Jiang, X Ji | arXiv preprint arXiv:2411.07934 | | 2024 |
| Enhancing offline reinforcement learning with an optimal supported dataset | C Chen, Z Xu, Y Mao, H Zhang, X Ji | | | |
| Pessimistic policy iteration for offline reinforcement learning | H Zhang, B Wang, Y Mao, J Shao, Y Jiang, Y Xu, X Ji | | | |