Two-timescale networks for nonlinear value function approximation. W Chung | 52 | 2019 |
Importance resampling for off-policy prediction. M Schlegel, W Chung, D Graves, J Qian, M White. Advances in Neural Information Processing Systems 32, 2019 | 43 | 2019 |
Beyond variance reduction: Understanding the true impact of baselines on policy optimization. W Chung, V Thomas, MC Machado, N Le Roux. International Conference on Machine Learning, 1999-2009, 2021 | 30 | 2021 |
The role of baselines in policy gradient optimization. J Mei, W Chung, V Thomas, B Dai, C Szepesvari, D Schuurmans. Advances in Neural Information Processing Systems 35, 17818-17830, 2022 | 18 | 2022 |
High-confidence error estimates for learned value functions. T Sajed, W Chung, M White. arXiv preprint arXiv:1808.09127, 2018 | 8 | 2018 |
Incrementally Learning Functions of the Return. B Bennett, W Chung, M Zaheer, V Liu. arXiv preprint arXiv:1907.04651, 2019 | 1 | 2019 |
Parseval Regularization for Continual Reinforcement Learning. W Chung, L Cherif, D Precup, D Meger. The Thirty-eighth Annual Conference on Neural Information Processing Systems, 0 | | |
Offline-Online Reinforcement Learning: Extending Batch and Online RL. M Hashemzadeh, W Chung, M White | | |
Importance Resampling for Off-policy Policy Evaluation. M Schlegel, W Chung, D Graves, M White | | |