Philip Thomas

Cited by

	All	Since 2019
Citations	4869	3847
h-index	33	30
i10-index	60	55

Citations per year

Year	Citations
2011	16
2012	27
2013	28
2014	41
2015	68
2016	137
2017	183
2018	254
2019	411
2020	585
2021	680
2022	722
2023	766
2024	680

Public access

28 articles available
0 articles not available

Based on funding mandates

Co-authors

Georgios Theocharous, Adobe Research (verified email at adobe.com)
Emma Brunskill, Associate Professor of Computer Science, Stanford University (verified email at cs.stanford.edu)
Bruno Castro da Silva, University of Massachusetts (verified email at cs.umass.edu)
Scott M. Jordan, Postdoctoral Fellow, University of Alberta (verified email at ualberta.ca)
Scott Niekum, Associate Professor, University of Massachusetts Amherst (verified email at cs.umass.edu)
George Konidaris, Brown (verified email at cs.brown.edu)
Stephen Giguere, University of Massachusetts (verified email at cs.umass.edu)
Yuriy Brun, Manning College of Information and Computer Sciences, University of Massachusetts Amherst (verified email at cs.umass.edu)
Chris Nota, University of Massachusetts, Amherst (verified email at cs.umass.edu)
Antonie J. (Ton) van den Bogert, Professor of Mechanical Engineering, Cleveland State University (verified email at csuohio.edu)
Michael Branicky, Professor of Electrical Engineering & Computer Science, University of Kansas (verified email at ku.edu)
Erik Learned-Miller, Professor of Computer Science, University of Massachusetts Amherst (verified email at cs.umass.edu)
Sarah Osentoski, Vinci4d (verified email at vinci4d.ai)
Blossom Metevier, University of Massachusetts Amherst (verified email at umass.edu)
Sridhar Mahadevan, Director, Data Science Lab, Adobe Research & Professor, University of Massachusetts, Amherst (verified email at cs.umass.edu)
Will Dabney, DeepMind (verified email at google.com)
Robert Kirsch, Professor and Chair of Biomedical Engineering, Case Western Reserve University (verified email at case.edu)
Francisco M. Garcia, University of Massachusetts - Amherst (verified email at cs.umass.edu)
Peter Stone, Professor of Computer Science, The University of Texas at Austin (verified email at cs.utexas.edu)
Arthur Guez, Google DeepMind (verified email at google.com)


Philip Thomas

University of Massachusetts Amherst

Verified email at cs.umass.edu - Homepage

Artificial Intelligence, Reinforcement Learning, AI Safety


Title	Cited by	Year
Data-efficient off-policy policy evaluation for reinforcement learning. P Thomas, E Brunskill. International Conference on Machine Learning, 2139-2148, 2016	758	2016
Value function approximation in reinforcement learning using the Fourier basis. G Konidaris, S Osentoski, P Thomas. Proceedings of the AAAI conference on artificial intelligence 25 (1), 380-385, 2011	556	2011
High-confidence off-policy evaluation. P Thomas, G Theocharous, M Ghavamzadeh. Proceedings of the AAAI Conference on Artificial Intelligence 29 (1), 2015	334	2015
High confidence policy improvement. P Thomas, G Theocharous, M Ghavamzadeh. International Conference on Machine Learning, 2380-2388, 2015	224	2015
Preventing undesirable behavior of intelligent machines. P Thomas, B Castro da Silva, A Barto, S Giguere, Y Brun, E Brunskill. Science 366 (6468), 999-1004, 2019	208	2019
Ad recommendation systems for life-time value optimization. G Theocharous, PS Thomas, M Ghavamzadeh. Proceedings of the 24th international conference on world wide web, 1305-1310, 2015	207	2015
Learning action representations for reinforcement learning. Y Chandak, G Theocharous, J Kostas, S Jordan, P Thomas. International conference on machine learning, 941-950, 2019	205	2019
Increasing the action gap: New operators for reinforcement learning. MG Bellemare, G Ostrovski, A Guez, P Thomas, R Munos. Proceedings of the AAAI Conference on Artificial Intelligence 30 (1), 2016	181	2016
Bias in natural actor-critic algorithms. P Thomas. International conference on machine learning, 441-448, 2014	164	2014
Safe reinforcement learning. PS Thomas	126	2015
Is the policy gradient a gradient? C Nota, PS Thomas. arXiv preprint arXiv:1906.07073, 2019	80	2019
Evaluating the performance of reinforcement learning algorithms. S Jordan, Y Chandak, D Cohen, M Zhang, P Thomas. International Conference on Machine Learning, 4962-4973, 2020	76	2020
Optimizing for the future in non-stationary mdps. Y Chandak, G Theocharous, S Shankar, M White, S Mahadevan, ... International Conference on Machine Learning, 1414-1425, 2020	74	2020
Training an actor-critic reinforcement learning controller for arm movement using human-generated rewards. KM Jagodnik, PS Thomas, AJ van den Bogert, MS Branicky, RF Kirsch. IEEE Transactions on Neural Systems and Rehabilitation Engineering 25 (10 …, 2017	71	2017
Proximal reinforcement learning: A new theory of sequential decision making in primal-dual spaces. S Mahadevan, B Liu, P Thomas, W Dabney, S Giguere, N Jacek, I Gemp, ... arXiv preprint arXiv:1405.6757, 2014	71	2014
Predictive off-policy policy evaluation for nonstationary decision problems, with applications to digital marketing. P Thomas, G Theocharous, M Ghavamzadeh, I Durugkar, E Brunskill. Proceedings of the AAAI Conference on Artificial Intelligence 31 (2), 4740-4745, 2017	68	2017
Policy gradient methods for reinforcement learning with function approximation and action-dependent baselines. PS Thomas, E Brunskill. arXiv preprint arXiv:1706.06643, 2017	66	2017
Importance Sampling for Fair Policy Selection. S Doroudi, PS Thomas, E Brunskill. Grantee Submission, 2017	62	2017
Offline contextual bandits with high probability fairness guarantees. B Metevier, S Giguere, S Brockman, A Kobren, Y Brun, E Brunskill, ... Advances in neural information processing systems 32, 2019	60	2019
Risk Quantification for Policy Deployment. PS Thomas, G Theocharous, M Ghavamzadeh. US Patent App. 14/552,047, 2016	59	2016
