| Title | Authors | Venue | Cited by | Year |
|---|---|---|---|---|
| BLOOM: A 176B-Parameter Open-Access Multilingual Language Model | BigScience Workshop, TL Scao, A Fan, C Akiki, E Pavlick, S Ilić, D Hesslow, ... | JMLR 2023 | 1872* | 2022 |
| Beyond the Imitation Game: Quantifying and Extrapolating the Capabilities of Language Models | A Srivastava, A Rastogi, A Rao, AAM Shoeb, A Abid, A Fisch, AR Brown, ... | TMLR 2023 | 1445 | 2022 |
| StarCoder: May the Source Be with You! | R Li, LB Allal, Y Zi, N Muennighoff, D Kocetkov, C Mou, M Marone, C Akiki, ... | TMLR 2023 | 1068* | 2023 |
| A Framework for Few-Shot Language Model Evaluation | L Gao, J Tow, S Biderman, S Black, A DiPofi, C Foster, L Golding, J Hsu, ... | GitHub | 848* | 2021 |
| Crosslingual Generalization through Multitask Finetuning | N Muennighoff, T Wang, L Sutawika, A Roberts, S Biderman, TL Scao, ... | ACL 2023 | 753 | 2022 |
| MTEB: Massive Text Embedding Benchmark | N Muennighoff, N Tazi, L Magne, N Reimers | EACL 2023 | 712 | 2022 |
| C-Pack: Packed Resources for General Chinese Embeddings | S Xiao, Z Liu, P Zhang, N Muennighoff, D Lian, JY Nie | SIGIR 2024 | 488 | 2024 |
| KTO: Model Alignment as Prospect Theoretic Optimization | K Ethayarajh, W Xu, N Muennighoff, D Jurafsky, D Kiela | ICML 2024 (Spotlight) | 381 | 2024 |
| OLMo: Accelerating the Science of Language Models | D Groeneveld, I Beltagy, P Walsh, A Bhagia, R Kinney, O Tafjord, AH Jha, ... | ACL 2024 (Best Theme Paper Award) | 278* | 2024 |
| SantaCoder: Don't Reach for the Stars! | LB Allal, R Li, D Kocetkov, C Mou, C Akiki, CM Ferrandis, N Muennighoff, ... | ICLR 2023 DL4C Workshop (Best Paper Award) | 255* | 2023 |
| Scaling Data-Constrained Language Models | N Muennighoff, AM Rush, B Barak, TL Scao, A Piktus, N Tazi, S Pyysalo, ... | NeurIPS 2023 (Oral, Outstanding Paper Runner-Up Award) | 251 | 2023 |
| StarCoder 2 and The Stack v2: The Next Generation | A Lozhkov, R Li, LB Allal, F Cassano, J Lamy-Poirier, N Tazi, A Tang, ... | arXiv | 228 | 2024 |
| SGPT: GPT Sentence Embeddings for Semantic Search | N Muennighoff | arXiv | 216 | 2022 |
| OctoPack: Instruction Tuning Code Large Language Models | N Muennighoff, Q Liu, A Zebaze, Q Zheng, B Hui, TY Zhuo, S Singh, ... | ICLR 2024 (Spotlight); NeurIPS 2023 Instruction Workshop | 209 | 2023 |
| Dolma: An Open Corpus of Three Trillion Tokens for Language Model Pretraining Research | L Soldaini, R Kinney, A Bhagia, D Schwenk, D Atkinson, R Authur, ... | ACL 2024 (Best Resource Paper Award) | 207* | 2024 |
| Aya Model: An Instruction Finetuned Open-Access Multilingual Language Model | A Üstün, V Aryabumi, ZX Yong, WY Ko, D D'souza, G Onilude, N Bhandari, ... | ACL 2024 (Best Paper Award) | 156 | 2024 |
| What Language Model to Train if You Have One Million GPU Hours? | TL Scao, T Wang, D Hesslow, L Saulnier, S Bekman, MS Bari, S Biderman, ... | EMNLP 2022 Findings | 122 | 2022 |
| Generative Representational Instruction Tuning | N Muennighoff, H Su, L Wang, N Yang, F Wei, T Yu, A Singh, D Kiela | ICLR 2025; ICLR 2024 AGI Workshop (Oral, Best Paper Award) | 97 | 2024 |
| OpenHands: An Open Platform for AI Software Developers as Generalist Agents | X Wang, B Li, Y Song, FF Xu, X Tang, M Zhuge, J Pan, Y Song, B Li, ... | ICLR 2025 | 93* | 2024 |
| A Survey on Data Selection for Language Models | A Albalak, Y Elazar, SM Xie, S Longpre, N Lambert, X Wang, ... | TMLR 2024 | 90 | 2024 |