Towards auditing large language models: Improving text-based stereotype detection W Zekun, S Bulathwela, AS Koshiyama NeurIPS 2023 SoLaR Workshop, 2023 | 7 | 2023 |
Jobfair: A framework for benchmarking gender hiring bias in large language models Z Wang, Z Wu, X Guan, M Thaler, A Koshiyama, S Lu, S Beepath, ... Findings of EMNLP 2025, 2024 | 5 | 2024 |
Eliciting personality traits in large language models A Hilliard, C Munoz, Z Wu, AS Koshiyama arXiv preprint arXiv:2402.08341, 2024 | 5 | 2024 |
Bias Amplification: Language Models as Increasingly Biased Media Z Wang, Z Wu, J Zhang, N Jain, X Guan, A Koshiyama arXiv preprint arXiv:2410.15234, 2024 | 3 | 2024 |
Advancing Multimodal Data Fusion in Pain Recognition: A Strategy Leveraging Statistical Correlation and Human-Centered Perspectives X Gu, Z Wang, I Jin, Z Wu ACII 2024 AHRI Workshop, 2024 | 3 | 2024 |
HEARTS: A Holistic Framework for Explainable, Sustainable and Robust Text Stereotype Detection T King, Z Wu, A Koshiyama, E Kazim, P Treleaven NeurIPS 2024 SoLaR and Safe Generative AI Workshops, 2024 | 2 | 2024 |
HyPA-RAG: A Hybrid Parameter Adaptive Retrieval-Augmented Generation System for AI Legal and Policy Applications R Kalra, Z Wu, A Gulley, A Hilliard, X Guan, A Koshiyama, P Treleaven NAACL 2025 Industry Track & EMNLP 2024 Workshop CustomNLP4U, 2024 | 2 | 2024 |
Auditing large language models for enhanced text-based stereotype detection and probing-based bias evaluation Z Wu, S Bulathwela, M Perez-Ortiz, A Soares Koshiyama arXiv e-prints, arXiv:2404.01768, 2024 | 2 | 2024 |
Assessing Bias in Metric Models for LLM Open-Ended Generation Bias Benchmarks N Demchak, X Guan, Z Wu, Z Xu, A Koshiyama, E Kazim NeurIPS 2024 EvalEval Workshop, 2024 | 1 | 2024 |
CauSkelNet: Causal Representation Learning for Human Behaviour Analysis X Gu, C Jiang, E Wang, Z Wu, Q Cui, L Tian, L Wu, S Song, C Yu arXiv preprint arXiv:2409.15564, 2024 | | 2024 |
SAGED: A Holistic Bias-Benchmarking Pipeline for Language Models with Customisable Fairness Calibration X Guan, N Demchak, S Gupta, Z Wang, E Ertekin Jr, A Koshiyama, ... Oral Presentation of COLING 2024, 2024 | | 2024 |
THaMES: An End-to-End Tool for Hallucination Mitigation and Evaluation in Large Language Models M Liang, A Arun, Z Wu, C Munoz, J Lutch, E Kazim, A Koshiyama, ... NeurIPS 2024 SoLaR Workshop, 2024 | | 2024 |
From Text to Emoji: How PEFT-Driven Personality Manipulation Unleashes the Emoji Potential in LLMs N Jain, Z Wu, C Munoz, A Hilliard, A Koshiyama, E Kazim, P Treleaven Findings of NAACL 2025, 2024 | | 2024 |
Stereotype Detection in LLMs: A Multiclass, Explainable, and Benchmark-Driven Approach Z Wu, S Bulathwela, M Perez-Ortiz, AS Koshiyama arXiv preprint arXiv:2404.01768, 2024 | | 2024 |
Towards Auditing Large Language Models: Improving Text-based Stereotype Detection Z Wu, S Bulathwela, A Koshiyama Socially Responsible Language Modelling Research, 2023 | | 2023 |