Adam Gleave
CEO at FAR AI
Verified email at far.ai · Homepage
Title · Cited by · Year
Stable-baselines3: Reliable reinforcement learning implementations
A Raffin, A Hill, A Gleave, A Kanervisto, M Ernestus, N Dormann
Journal of Machine Learning Research 22 (268), 1-8, 2021
Cited by 2899 · 2021
Stable baselines
A Hill, A Raffin, M Ernestus, A Gleave, A Kanervisto, R Traore, P Dhariwal, ...
Cited by 980 · 2018
Adversarial policies: Attacking deep reinforcement learning
A Gleave, M Dennis, C Wild, N Kant, S Levine, S Russell
International Conference on Learning Representations, 2020
Cited by 458 · 2020
Firmament: Fast, centralized cluster scheduling at scale
I Gog, M Schwarzkopf, A Gleave, RNM Watson, S Hand
12th USENIX Symposium on Operating Systems Design and Implementation (OSDI …, 2016
Cited by 300 · 2016
imitation: Clean imitation learning implementations
A Gleave, M Taufeeque, J Rocamonde, E Jenner, SH Wang, S Toyer, ...
arXiv preprint arXiv:2211.11972, 2022
Cited by 92* · 2022
Quantifying differences in reward functions
A Gleave, M Dennis, S Legg, S Russell, J Leike
International Conference on Learning Representations, 2021
Cited by 75 · 2021
Inverse reinforcement learning for video games
A Tucker, A Gleave, S Russell
Deep Reinforcement Learning Workshop at NeurIPS, 2018
Cited by 66 · 2018
Adversarial Policies Beat Superhuman Go AIs
TT Wang, A Gleave, T Tseng, N Belrose, J Miller, MD Dennis, Y Duan, ...
arXiv preprint arXiv:2211.00241, 2022
Cited by 63* · 2022
Invariance in policy optimisation and partial identifiability in reward learning
JMV Skalse, M Farrugia-Roberts, S Russell, A Abate, A Gleave
International Conference on Machine Learning, 32033-32058, 2023
Cited by 52 · 2023
Multi-task maximum entropy inverse reinforcement learning
A Gleave, O Habryka
GoalsRL Workshop at ICML, 2018
Cited by 48 · 2018
Understanding learned reward functions
EJ Michaud, A Gleave, S Russell
Deep Reinforcement Learning Workshop at NeurIPS, 2020
Cited by 40 · 2020
Active inverse reward design
S Mindermann, R Shah, A Gleave, D Hadfield-Menell
GoalsRL Workshop at ICML, 2018
Cited by 35 · 2018
Uncertainty estimation for language reward models
A Gleave, G Irving
arXiv preprint arXiv:2203.07472, 2022
Cited by 28 · 2022
A primer on maximum causal entropy inverse reinforcement learning
A Gleave, S Toyer
arXiv preprint arXiv:2203.11409, 2022
Cited by 24 · 2022
Exploiting novel GPT-4 APIs
K Pelrine, M Taufeeque, M Zając, E McLean, A Gleave
arXiv preprint arXiv:2312.14302, 2023
Cited by 21 · 2023
On the fragility of learned reward functions
L McKinney, Y Duan, D Krueger, A Gleave
arXiv preprint arXiv:2301.03652, 2023
Cited by 21 · 2023
Making compression algorithms for Unicode text
A Gleave, C Steinruecken
Data Compression Conference, 2017
Cited by 15 · 2017
Preprocessing reward functions for interpretability
E Jenner, A Gleave
arXiv preprint arXiv:2203.13553, 2022
Cited by 13 · 2022
Scaling laws for data poisoning in LLMs
D Bowen, B Murphy, W Cai, D Khachaturov, A Gleave, K Pelrine
arXiv preprint arXiv:2408.02946, 2024
Cited by 11 · 2024
DERAIL: Diagnostic Environments for Reward And Imitation Learning
P Freire, A Gleave, S Toyer, S Russell
Deep Reinforcement Learning Workshop at NeurIPS, 2020
Cited by 10 · 2020
Articles 1–20