Jan Leike

Cited by

	All	Since 2019
Citations	27309	26836
h-index	27	25
i10-index	36	31

Citations per year

Year	Citations
2017	84
2018	189
2019	289
2020	375
2021	514
2022	1142
2023	6857
2024	17392

Public access

10 articles available, 0 articles not available (based on funding mandates)

Co-authors

Jeffrey Wu	OpenAI	Verified email at openai.com
Paul Christiano	National Institute of Standards and Technology	Verified email at nist.gov
John Schulman	Anthropic	Verified email at anthropic.com
Ryan Lowe	OpenAI	Verified email at openai.com
Marcus Hutter	Researcher@DeepMind & Professor at ANU	Verified email at anu.edu.au
Dario Amodei	CEO and Co-Founder at Anthropic	Verified email at anthropic.com
Matthias Heizmann	University of Stuttgart, Germany	Verified email at heizmann.name
David Scott Krueger	University Assistant Professor, University of Cambridge	Verified email at cam.ac.uk
Ilya Sutskever	Co-Founder and Chief Scientist of OpenAI	Verified email at openai.com
Tom Everitt	Staff Research Scientist at Google DeepMind	Verified email at google.com
Pushmeet Kohli	DeepMind	Verified email at google.com
Yuri Burda	OpenAI	Verified email at openai.com
Andreas Podelski	Professor of Computer Science, Freiburg University	Verified email at informatik.uni-freiburg.de
Geoffrey Irving	UK AI Safety Institute (AISI)	Verified email at naml.us
Tegan Maharaj	Assistant Professor at Mila	Verified email at polymtl.ca
William Saunders	OpenAI	Verified email at cs.toronto.edu
Collin Burns	Researcher, OpenAI	Verified email at openai.com
Pavel Izmailov	Anthropic; NYU	Verified email at anthropic.com
Adam Gleave	CEO at FAR AI	Verified email at far.ai
Andrew Trask	University of Oxford and OpenMined	Verified email at openmined.org

Jan Leike

OpenAI

Verified email at openai.com

Interests: reinforcement learning, deep learning, agent alignment


Title	Cited by	Year
Training language models to follow instructions with human feedback. L Ouyang, J Wu, X Jiang, D Almeida, C Wainwright, P Mishkin, C Zhang, ... Advances in Neural Information Processing Systems 35, 27730-27744, 2022	10317	2022
GPT-4 technical report. OpenAI. arXiv, 2023	7014*	2023
Evaluating large language models trained on code. M Chen, J Tworek, H Jun, Q Yuan, HPO Pinto, J Kaplan, H Edwards, ... arXiv preprint arXiv:2107.03374, 2021	3318	2021
Deep reinforcement learning from human preferences. PF Christiano, J Leike, T Brown, M Martic, S Legg, D Amodei. Advances in Neural Information Processing Systems 30, 4299-4307, 2017	3113	2017
Reward learning from human preferences and demonstrations in Atari. B Ibarz, J Leike, T Pohlen, G Irving, S Legg, D Amodei. Advances in Neural Information Processing Systems, 8011-8023, 2018	424	2018
Let's Verify Step by Step. H Lightman, V Kosaraju, Y Burda, H Edwards, B Baker, T Lee, J Leike, ... arXiv preprint arXiv:2305.20050, 2023	418	2023
AI Safety Gridworlds. J Leike, M Martic, V Krakovna, PA Ortega, T Everitt, A Lefrancq, L Orseau, ... arXiv preprint arXiv:1711.09883, 2017	348	2017
Scalable agent alignment via reward modeling: a research direction. J Leike, D Krueger, T Everitt, M Martic, V Maini, S Legg. arXiv preprint arXiv:1811.07871, 2018	338	2018
Recursively summarizing books with human feedback. J Wu, L Ouyang, DM Ziegler, N Stiennon, R Lowe, J Leike, P Christiano. arXiv preprint arXiv:2109.10862, 2021	244	2021
Learning to Understand Goal Specifications by Modelling Reward. D Bahdanau, F Hill, J Leike, E Hughes, P Kohli, E Grefenstette. arXiv preprint arXiv:1806.01946, 2018	204*	2018
Language models can explain neurons in language models. S Bills, N Cammarata, D Mossing, H Tillman, L Gao, G Goh, I Sutskever, ... URL https://openaipublic.blob.core.windows.net/neuron-explainer/paper …, 2023	203	2023
Self-critiquing models for assisting human evaluators. W Saunders, C Yeh, J Wu, S Bills, L Ouyang, J Ward, J Leike. arXiv preprint arXiv:2206.05802, 2022	203	2022
Weak-to-Strong Generalization: Eliciting Strong Capabilities With Weak Supervision. C Burns, P Izmailov, JH Kirchner, B Baker, L Gao, L Aschenbrenner, ... arXiv preprint arXiv:2312.09390, 2023	164	2023
Ranking Templates for Linear Loops. J Leike, M Heizmann. Logical Methods in Computer Science, 2015	102	2015
Learning human objectives by evaluating hypothetical behavior. S Reddy, A Dragan, S Levine, S Legg, J Leike. International Conference on Machine Learning, 8020-8029, 2020	89	2020
Institutionalizing ethics in AI through broader impact requirements. CEA Prunkl, C Ashurst, M Anderljung, H Webb, J Leike, A Dafoe. Nature Machine Intelligence 3 (2), 104-110, 2021	74	2021
Quantifying Differences in Reward Functions. A Gleave, M Dennis, S Legg, S Russell, J Leike. arXiv preprint arXiv:2006.13900, 2020	69	2020
Linear ranking for linear lasso programs. M Heizmann, J Hoenicke, J Leike, A Podelski. Automated Technology for Verification and Analysis, 365-380, 2013	68	2013
Hidden Incentives for Auto-Induced Distributional Shift. D Krueger, T Maharaj, J Leike. arXiv preprint arXiv:2009.09153, 2020	61*	2020
Geometric nontermination arguments. J Leike, M Heizmann. International Conference on Tools and Algorithms for the Construction and …, 2018	55*	2018
