Micah Carroll

Cited by

	All	Since 2020
Citations	1444	1440
h-index	11	11
i10-index	12	12

Citations per year: 2020: 21, 2021: 45, 2022: 100, 2023: 321, 2024: 780, 2025: 163

Public access

6 articles available
0 articles not available

Based on funding mandates

Co-authors

Anca D Dragan - Assistant Professor at UC Berkeley // Director, AI Safety and Alignment, Google DeepMind - Verified email at berkeley.edu
Rohin Shah - Research Scientist, Google DeepMind - Verified email at deepmind.com
Stuart Russell - Professor of Computer Science, University of California, Berkeley - Verified email at cs.berkeley.edu
David Scott Krueger - University Assistant Professor, University of Cambridge - Verified email at cam.ac.uk
Alan Chan - Centre for the Governance of AI - Verified email at governance.ai
Sam Devlin - Microsoft Research Cambridge - Verified email at microsoft.com
Katja Hofmann - Microsoft Research - Verified email at microsoft.com
Smitha Milli - Meta FAIR - Verified email at meta.com
Dylan Hadfield-Menell - Massachusetts Institute of Technology - Verified email at csail.mit.edu

Micah Carroll

PhD student, UC Berkeley

Verified email at berkeley.edu - Homepage

AI Alignment, AI Influence, Recommender systems, Human-AI Collaboration


Title	Authors	Venue	Cited by	Year
Open problems and fundamental limitations of reinforcement learning from human feedback	S Casper, X Davies, C Shi, TK Gilbert, J Scheurer, J Rando, R Freedman, ...	arXiv preprint arXiv:2307.15217, 2023	495	2023
On the Utility of Learning About Humans for Human-AI Coordination	M Carroll, R Shah, MK Ho, T Griffiths, S Seshia, P Abbeel, A Dragan	Advances in Neural Information Processing Systems, 2019, 5174-5185, 2019	476	2019
Harms from Increasingly Agentic Algorithmic Systems	A Chan, R Salganik, A Markelius, C Pang, N Rajkumar, D Krasheninnikov, ...	Proceedings of the 2023 ACM Conference on Fairness, Accountability, and …, 2023	126*	2023
Estimating and Penalizing Induced Preference Shifts in Recommender Systems	M Carroll, A Dragan, S Russell, D Hadfield-Menell	International Conference on Machine Learning, 2022 (Spotlight), 2686-2708, 2022	79*	2022
Characterizing Manipulation from AI Systems	M Carroll, A Chan, H Ashton, D Krueger	EAAMO 2023, 2023	65	2023
Engagement, user satisfaction, and the amplification of divisive content on social media	S Milli, M Carroll, Y Wang, S Pandey, S Zhao, AD Dragan	arXiv preprint arXiv:2305.16941, 2023	52*	2023
Uni[MASK]: Unified inference in sequential decision problems	M Carroll, O Paradise, J Lin, R Georgescu, M Sun, D Bignell, S Milani, ...	NeurIPS 2022 (Oral), 2022	42*	2022
Evaluating the Robustness of Collaborative Agents	P Knott, M Carroll, S Devlin, K Ciosek, K Hofmann, AD Dragan, R Shah	AAMAS 2021 (Extended Abstract), 2021	33	2021
Beyond preferences in AI alignment	T Zhi-Xuan, M Carroll, M Franklin, H Ashton	Philosophical Studies, 1-51, 2024	17	2024
AI alignment with changing and influenceable reward functions	M Carroll, D Foote, A Siththaranjan, S Russell, A Dragan	arXiv preprint arXiv:2405.17713, 2024	17	2024
Humanity's Last Exam	L Phan, A Gatti, Z Han, N Li, J Hu, H Zhang, S Shi, M Choi, A Agrawal, ...	arXiv preprint arXiv:2501.14249, 2025	12	2025
Optimal Behavior Prior: Data-Efficient Human Models for Improved Human-AI Collaboration	M Yang, M Carroll, A Dragan	NeurIPS 2022 Human in the Loop Learning (HiLL) Workshop, 2022	10	2022
On Targeted Manipulation and Deception when Optimizing LLMs for User Feedback	M Williams, M Carroll, A Narang, C Weisser, B Murphy, A Dragan	arXiv preprint arXiv:2411.02306, 2024	7*	2024
Who Needs to Know? Minimal Knowledge for Optimal Coordination	N Lauffer, A Shah, M Carroll, MD Dennis, S Russell	International Conference on Machine Learning 2023, 18599-18613, 2023	5	2023
Time-Efficient Reward Learning via Visually Assisted Cluster Ranking	D Zhang, M Carroll, A Bobu, A Dragan	NeurIPS 2022 Human in the Loop Learning (HiLL) Workshop, 2022	5	2022
Overview of current AI alignment approaches	M Carroll		3	2018
Truthfulness Without Supervision: Model Evaluation Using Peer Prediction	T Qiu, M Carroll, C Allen
