Adam Gleave
CEO at FAR AI
Verified email at far.ai · Homepage
Title · Cited by · Year
Stable-baselines3: Reliable reinforcement learning implementations
A Raffin, A Hill, A Gleave, A Kanervisto, M Ernestus, N Dormann
Journal of Machine Learning Research 22 (268), 1-8, 2021
Cited by 2899 · 2021
Stable baselines
A Hill, A Raffin, M Ernestus, A Gleave, A Kanervisto, R Traore, P Dhariwal, ...
Cited by 980 · 2018
Adversarial policies: Attacking deep reinforcement learning
A Gleave, M Dennis, C Wild, N Kant, S Levine, S Russell
International Conference on Learning Representations, 2020
Cited by 458 · 2020
Firmament: Fast, centralized cluster scheduling at scale
I Gog, M Schwarzkopf, A Gleave, RNM Watson, S Hand
12th USENIX Symposium on Operating Systems Design and Implementation (OSDI …, 2016
Cited by 300 · 2016
imitation: Clean imitation learning implementations
A Gleave, M Taufeeque, J Rocamonde, E Jenner, SH Wang, S Toyer, ...
arXiv preprint arXiv:2211.11972, 2022
Cited by 92* · 2022
Quantifying differences in reward functions
A Gleave, M Dennis, S Legg, S Russell, J Leike
International Conference on Learning Representations, 2021
Cited by 75 · 2021
Inverse reinforcement learning for video games
A Tucker, A Gleave, S Russell
Deep Reinforcement Learning Workshop at NeurIPS, 2018
Cited by 66 · 2018
Adversarial Policies Beat Superhuman Go AIs
TT Wang, A Gleave, T Tseng, N Belrose, J Miller, MD Dennis, Y Duan, ...
arXiv preprint arXiv:2211.00241, 2022
Cited by 63* · 2022
Invariance in policy optimisation and partial identifiability in reward learning
JMV Skalse, M Farrugia-Roberts, S Russell, A Abate, A Gleave
International Conference on Machine Learning, 32033-32058, 2023
Cited by 52 · 2023
Multi-task maximum entropy inverse reinforcement learning
A Gleave, O Habryka
GoalsRL Workshop at ICML, 2018
Cited by 48 · 2018
Understanding learned reward functions
EJ Michaud, A Gleave, S Russell
Deep Reinforcement Learning Workshop at NeurIPS, 2020
Cited by 40 · 2020
Active inverse reward design
S Mindermann, R Shah, A Gleave, D Hadfield-Menell
GoalsRL Workshop at ICML, 2018
Cited by 35 · 2018
Uncertainty estimation for language reward models
A Gleave, G Irving
arXiv preprint arXiv:2203.07472, 2022
Cited by 28 · 2022
A primer on maximum causal entropy inverse reinforcement learning
A Gleave, S Toyer
arXiv preprint arXiv:2203.11409, 2022
Cited by 24 · 2022
Exploiting novel GPT-4 APIs
K Pelrine, M Taufeeque, M Zając, E McLean, A Gleave
arXiv preprint arXiv:2312.14302, 2023
Cited by 21 · 2023
On the fragility of learned reward functions
L McKinney, Y Duan, D Krueger, A Gleave
arXiv preprint arXiv:2301.03652, 2023
Cited by 21 · 2023
Making compression algorithms for Unicode text
A Gleave, C Steinruecken
Data Compression Conference, 2017
Cited by 15 · 2017
Preprocessing reward functions for interpretability
E Jenner, A Gleave
arXiv preprint arXiv:2203.13553, 2022
Cited by 13 · 2022
Scaling laws for data poisoning in LLMs
D Bowen, B Murphy, W Cai, D Khachaturov, A Gleave, K Pelrine
arXiv preprint arXiv:2408.02946, 2024
Cited by 11 · 2024
DERAIL: Diagnostic Environments for Reward And Imitation Learning
P Freire, A Gleave, S Toyer, S Russell
Deep Reinforcement Learning Workshop at NeurIPS, 2020
Cited by 10 · 2020
Articles 1–20