Knowledge consistency between neural networks and beyond | R Liang, T Li, LF Li, J Wang, Q Zhang | The Eighth International Conference on Learning Representations (ICLR), 2020 | 41 | 2020 |
Dynamic regret of online Markov decision processes | P Zhao, LF Li, ZH Zhou | International Conference on Machine Learning, 26865-26894, 2022 | 21 | 2022 |
Dynamic regret of adversarial linear mixture MDPs | LF Li, P Zhao, ZH Zhou | Advances in Neural Information Processing Systems 36, 2024 | 6 | 2024 |
Improved algorithm for adversarial linear mixture MDPs with bandit feedback and unknown transition | LF Li, P Zhao, ZH Zhou | International Conference on Artificial Intelligence and Statistics, 3061-3069, 2024 | 4 | 2024 |
Tracking treatment effect heterogeneity in evolving environments | T Qin, LF Li, TZ Wang, ZH Zhou | Machine Learning 113 (6), 3653-3673, 2024 | 3 | 2024 |
Provably efficient reinforcement learning with multinomial logit function approximation | LF Li, YJ Zhang, P Zhao, ZH Zhou | arXiv preprint arXiv:2405.17061, 2024 | 2 | 2024 |
Dynamic regret of adversarial MDPs with unknown transition and linear function approximation | LF Li, P Zhao, ZH Zhou | Proceedings of the AAAI Conference on Artificial Intelligence 38 (12), 13572 …, 2024 | 2 | 2024 |
Provably Efficient RLHF Pipeline: A Unified View from Contextual Bandits | LF Li, YY Qian, P Zhao, ZH Zhou | arXiv preprint arXiv:2502.07193, 2025 | | 2025 |
Near-Optimal Dynamic Regret for Adversarial Linear Mixture MDPs | LF Li, P Zhao, ZH Zhou | arXiv preprint arXiv:2411.03107, 2024 | | 2024 |