| Title | Authors | Venue | Cited by | Year |
|---|---|---|---|---|
| Deep exploration via bootstrapped DQN | I Osband, C Blundell, A Pritzel, B Van Roy | Advances in Neural Information Processing Systems 29 | 1628 | 2016 |
| Deep Q-learning from demonstrations | T Hester, M Vecerik, O Pietquin, M Lanctot, T Schaul, B Piot, D Horgan, ... | Proceedings of the AAAI Conference on Artificial Intelligence 32 (1) | 1386 | 2018 |
| A tutorial on Thompson sampling | DJ Russo, B Van Roy, A Kazerouni, I Osband, Z Wen | Foundations and Trends® in Machine Learning 11 (1), 1-96 | 1276 | 2018 |
| Noisy networks for exploration | M Fortunato, MG Azar, B Piot, J Menick, I Osband, A Graves, V Mnih, ... | arXiv preprint arXiv:1706.10295 | 1218 | 2017 |
| Minimax regret bounds for reinforcement learning | MG Azar, I Osband, R Munos | International Conference on Machine Learning, 263-272 | 907 | 2017 |
| Randomized prior functions for deep reinforcement learning | I Osband, J Aslanides, A Cassirer | Advances in Neural Information Processing Systems 31 | 477 | 2018 |
| Deep exploration via randomized value functions | I Osband | https://searchworks.stanford.edu/view/11891201 | 364 | 2016 |
| Generalization and exploration via randomized value functions | I Osband, B Van Roy, Z Wen | International Conference on Machine Learning, 2377-2386 | 356 | 2016 |
| GPT-4o system card | A Hurst, A Lerer, AP Goucher, A Perelman, A Ramesh, A Clark, AJ Ostrow, ... | arXiv preprint arXiv:2410.21276 | 288 | 2024 |
| Why is posterior sampling better than optimism for reinforcement learning? | I Osband, B Van Roy | International Conference on Machine Learning, 2701-2710 | 287 | 2017 |
| The uncertainty Bellman equation and exploration | B O’Donoghue, I Osband, R Munos, V Mnih | International Conference on Machine Learning, 3836-3845 | 259 | 2018 |
| Model-based reinforcement learning and the eluder dimension | I Osband, B Van Roy | Advances in Neural Information Processing Systems 27 | 213 | 2014 |
| Behaviour suite for reinforcement learning | I Osband, Y Doron, M Hessel, J Aslanides, E Sezener, A Saraiva, ... | arXiv preprint arXiv:1908.03568 | 207 | 2019 |
| Learning from demonstrations for real world reinforcement learning | T Hester, M Vecerik, O Pietquin, M Lanctot, T Schaul, B Piot, A Sendonaris, ... | arXiv preprint arXiv:1704.03732 | 185 | 2017 |
| Risk versus uncertainty in deep learning: Bayes, bootstrap and the dangers of dropout | I Osband | http://bayesiandeeplearning.org/papers/BDL_4.pdf | 176* | |
| Deep learning for time series modeling | E Busseti, I Osband, S Wong | Technical report, Stanford University, 1-5 | 146 | 2012 |
| Epistemic neural networks | I Osband, Z Wen, SM Asghari, V Dwaracherla, M Ibrahimi, X Lu, ... | Advances in Neural Information Processing Systems 36, 2795-2823 | 136 | 2023 |
| Near-optimal reinforcement learning in factored MDPs | I Osband, B Van Roy | Advances in Neural Information Processing Systems 27 | 134 | 2014 |
| On lower bounds for regret in reinforcement learning | I Osband, B Van Roy | arXiv preprint arXiv:1608.02732 | 124 | 2016 |
| (More) efficient reinforcement learning via posterior sampling | I Osband, D Russo, B Van Roy | Advances in Neural Information Processing Systems 26 | 121 | 2013 |