Publicly accessible articles - Pierre Ménard
Available somewhere: 24
Explore first, exploit next: The true shape of regret in bandit problems
A Garivier, P Ménard, G Stoltz
Mathematics of Operations Research 44 (2), 377-399, 2019
Mandates: Agence Nationale de la Recherche
Episodic Reinforcement Learning in Finite MDPs: Minimax Lower Bounds Revisited
O Darwiche Domingues, P Ménard, E Kaufmann, M Valko
arXiv e-prints, arXiv: 2010.03531, 2020
Mandates: Science Foundation Ireland, Agence Nationale de la Recherche
Fast active learning for pure exploration in reinforcement learning
P Ménard, OD Domingues, A Jonsson, E Kaufmann, E Leurent, M Valko
International Conference on Machine Learning, 7599-7608, 2021
Mandates: Science Foundation Ireland, Agence Nationale de la Recherche, Government of …
Gamification of pure exploration for linear bandits
R Degenne, P Ménard, X Shang, M Valko
International Conference on Machine Learning, 2432-2442, 2020
Mandates: Agence Nationale de la Recherche
Adaptive reward-free exploration
E Kaufmann, P Ménard, OD Domingues, A Jonsson, E Leurent, M Valko
Algorithmic Learning Theory, 865-891, 2021
Mandates: Science Foundation Ireland, Government of Spain
Fixed-confidence guarantees for Bayesian best-arm identification
X Shang, R Heide, P Ménard, E Kaufmann, M Valko
International Conference on Artificial Intelligence and Statistics, 1823-1832, 2020
Mandates: Agence Nationale de la Recherche
Kernel-based reinforcement learning: A finite-time analysis
OD Domingues, P Ménard, M Pirotta, E Kaufmann, M Valko
International Conference on Machine Learning, 2783-2792, 2021
Mandates: Science Foundation Ireland, Agence Nationale de la Recherche
UCB momentum Q-learning: Correcting the bias without forgetting
P Ménard, OD Domingues, X Shang, M Valko
International Conference on Machine Learning, 7609-7618, 2021
Mandates: Science Foundation Ireland
Learning in two-player zero-sum partially observable Markov games with perfect recall
T Kozuno, P Ménard, R Munos, M Valko
Advances in Neural Information Processing Systems 34, 11987-11998, 2021
Mandates: Natural Sciences and Engineering Research Council of Canada, Science …
KL-UCB-switch: optimal regret bounds for stochastic bandits from both a distribution-dependent and a distribution-free viewpoints
A Garivier, H Hadiji, P Ménard, G Stoltz
Journal of Machine Learning Research 23 (179), 1-66, 2022
Mandates: Agence Nationale de la Recherche
A Kernel-Based Approach to Non-Stationary Reinforcement Learning in Metric Spaces
O Darwiche Domingues, P Ménard, M Pirotta, E Kaufmann, M Valko
arXiv e-prints, arXiv: 2007.05078, 2020
Mandates: Science Foundation Ireland, Agence Nationale de la Recherche
Planning in Markov decision processes with gap-dependent sample complexity
A Jonsson, E Kaufmann, P Ménard, O Darwiche Domingues, E Leurent, ...
Advances in Neural Information Processing Systems 33, 1253-1263, 2020
Mandates: Agence Nationale de la Recherche, Government of Spain
Fano’s inequality for random variables
S Gerchinovitz, P Ménard, G Stoltz
Mandates: Agence Nationale de la Recherche
Bandits with many optimal arms
R De Heide, J Cheshire, P Ménard, A Carpentier
Advances in Neural Information Processing Systems 34, 22457-22469, 2021
Mandates: German Research Foundation, Science Foundation Ireland, Agence Nationale de …
Fast rates for maximum entropy exploration
D Tiapkin, D Belomestny, D Calandriello, E Moulines, R Munos, ...
International Conference on Machine Learning, 34161-34221, 2023
Mandates: German Research Foundation, Agence Nationale de la Recherche
From Dirichlet to Rubin: Optimistic exploration in RL without bonuses
D Tiapkin, D Belomestny, E Moulines, A Naumov, S Samsonov, Y Tang, ...
International Conference on Machine Learning, 21380-21431, 2022
Mandates: German Research Foundation, Science Foundation Ireland
Planning in entropy-regularized Markov decision processes and games
JB Grill, O Darwiche Domingues, P Ménard, R Munos, M Valko
Advances in Neural Information Processing Systems 32, 2019
Mandates: European Commission, Agence Nationale de la Recherche
The influence of shape constraints on the thresholding bandit problem
J Cheshire, P Ménard, A Carpentier
Conference on Learning Theory, 1228-1275, 2020
Mandates: German Research Foundation, Science Foundation Ireland
Adapting to game trees in zero-sum imperfect information games
C Fiegel, P Ménard, T Kozuno, R Munos, V Perchet, M Valko
International Conference on Machine Learning, 10093-10135, 2023
Mandates: Agence Nationale de la Recherche
Optimistic posterior sampling for reinforcement learning with few samples and tight guarantees
D Tiapkin, D Belomestny, D Calandriello, E Moulines, R Munos, ...
Advances in Neural Information Processing Systems 35, 10737-10751, 2022
Mandates: German Research Foundation, Agence Nationale de la Recherche
Publication and funding information is determined automatically by a computer program