Hannah Rose Kirk
Verified email at oii.ox.ac.uk - Homepage
Title | Cited by | Year
Bias out-of-the-box: An empirical analysis of intersectional occupational biases in popular generative language models
HR Kirk, Y Jun, F Volpin, H Iqbal, E Benussi, F Dreyer, A Shtedritski, ...
Advances in neural information processing systems 34, 2611-2624, 2021
227 | 2021
Auditing large language models: a three-layered approach
J Mökander, J Schuett, HR Kirk, L Floridi
AI and Ethics 4 (4), 1085-1115, 2024
214 | 2024
The benefits, risks and bounds of personalizing the alignment of large language models to individuals
HR Kirk, B Vidgen, P Röttger, SA Hale
Nature Machine Intelligence 6 (4), 383-392, 2024
192* | 2024
XSTest: A test suite for identifying exaggerated safety behaviours in large language models
P Röttger, HR Kirk, B Vidgen, G Attanasio, F Bianchi, D Hovy
Proceedings of the 2024 Conference of the North American Chapter of the …, 2023
153 | 2023
DataPerf: Benchmarks for data-centric AI development
M Mazumder, C Banbury, X Yao, B Karlaš, W Gaviria Rojas, S Diamos, ...
Advances in Neural Information Processing Systems 36, 5320-5347, 2023
139 | 2023
SemEval-2023 Task 10: Explainable detection of online sexism
HR Kirk, W Yin, B Vidgen, P Röttger
Best Paper Award, Proceedings of the 17th International Workshop on Semantic …, 2023
139 | 2023
A prompt array keeps the bias away: Debiasing vision-language models with adversarial learning
H Berg, SM Hall, Y Bhalgat, W Yang, HR Kirk, A Shtedritski, M Bain
Proceedings of the 2nd Conference of the Asia-Pacific Chapter of the …, 2022
108 | 2022
The PRISM alignment dataset: What participatory, representative and individualised human feedback reveals about the subjective and multicultural alignment of large language models
HR Kirk, A Whitefield, P Röttger, AM Bean, K Margatina, ...
Best Paper Award, Advances in Neural Information Processing Systems 37 …, 2025
88* | 2025
Political compass or spinning arrow? Towards more meaningful evaluations for values and opinions in large language models
P Röttger, V Hofmann, V Pyatkin, M Hinck, HR Kirk, H Schütze, D Hovy
Outstanding Paper Award, Proceedings of the 62nd Annual Meeting of the …, 2024
66 | 2024
Hatemoji: A test suite and adversarially-generated dataset for benchmarking and detecting emoji-based hate
HR Kirk, B Vidgen, P Röttger, T Thrush, SA Hale
Proceedings of the 2022 Conference of the North American Chapter of the …, 2021
64 | 2021
Handling and Presenting Harmful Text in NLP
HR Kirk, A Birhane, B Vidgen, L Derczynski
EMNLP Findings, 2022
51* | 2022
Looking for a Handsome Carpenter! Debiasing GPT-3 Job Advertisements
C Borchers, DS Gala, B Gilburt, E Oravkin, W Bounsi, YM Asano, HR Kirk
Proceedings of the 4th workshop on gender bias in natural language …, 2022
50 | 2022
Introducing v0.5 of the AI Safety Benchmark from MLCommons
B Vidgen, A Agrawal, AM Ahmed, V Akinwande, N Al-Nuaimi, N Alfaraj, ...
arXiv preprint arXiv:2404.12241, 2024
40 | 2024
The past, present and better future of feedback learning in large language models for subjective human preferences and values
HR Kirk, AM Bean, B Vidgen, P Röttger, SA Hale
Proceedings of the 2023 Conference on Empirical Methods in Natural Language …, 2023
38 | 2023
Assessing language model deployment with risk cards
L Derczynski, HR Kirk, V Balachandran, S Kumar, Y Tsvetkov, MR Leiser, ...
arXiv preprint arXiv:2303.18190, 2023
36 | 2023
Memes in the Wild: Assessing the Generalizability of the Hateful Memes Challenge Dataset
HR Kirk, Y Jun, P Rauba, G Wachtel, R Li, X Bai, N Broestl, M Doff-Sotta, ...
Proceedings of the 5th Workshop on Online Abuse and Harms (WOAH 2021), 2021
35 | 2021
SimpleSafetyTests: a test suite for identifying critical safety risks in large language models
B Vidgen, N Scherrer, HR Kirk, R Qian, A Kannappan, SA Hale, P Röttger
arXiv preprint arXiv:2311.08370, 2023
32 | 2023
VisoGender: A dataset for benchmarking gender bias in image-text pronoun resolution
SM Hall, F Gonçalves Abrantes, H Zhu, G Sodunke, A Shtedritski, HR Kirk
Advances in Neural Information Processing Systems 36, 63687-63723, 2023
31 | 2023
Adversarial nibbler: An open red-teaming method for identifying diverse harms in text-to-image generation
J Quaye, A Parrish, O Inel, C Rastogi, HR Kirk, M Kahng, E Van Liemt, ...
Proceedings of the 2024 ACM Conference on Fairness, Accountability, and …, 2024
24* | 2024
Balancing the picture: Debiasing vision-language datasets with synthetic contrast sets
B Smith, M Farinha, SM Hall, HR Kirk, A Shtedritski, M Bain
arXiv preprint arXiv:2305.15407, 2023
22 | 2023
Articles 1–20