Bloom: A 176b-parameter open-access multilingual language model T Le Scao, A Fan, C Akiki, E Pavlick, S Ilić, D Hesslow, R Castagné, ... | 1604 | 2023 |
The BigScience ROOTS corpus: A 1.6TB composite multilingual dataset H Laurençon, L Saulnier, T Wang, C Akiki, A Villanova del Moral, ... Advances in Neural Information Processing Systems 35, 31809-31826, 2022 | 166 | 2022 |
Assessing the Impact of OCR Quality on Downstream NLP Tasks D van Strien, K Beelen, MC Ardanuy, K Hosseini, B McGillivray, ... | 137 | 2020 |
Library Carpentry: software skills training for library professionals J Baker, C Moore, E Priego, R Alegre, J Cope, L Price, O Stephens, ... Liber Quarterly: The Journal of European Research Libraries 26 (3), 141-162, 2016 | 25 | 2016 |
BLOOM: A 176b-parameter open-access multilingual language model T Le Scao, A Fan, C Akiki, E Pavlick, S Ilić, D Hesslow, R Castagné, ... arXiv preprint arXiv:2211.05100, 2022 | 22 | 2022 |
Maps of a nation? The digitized Ordnance Survey for new historical research K Hosseini, K McDonough, D van Strien, O Vane, DCS Wilson Journal of Victorian Culture 26 (2), 284-299, 2021 | 17 | 2021 |
Documenting geographically and contextually diverse data sources: The BigScience catalogue of language data and resources A McMillan-Major, Z Alyafeai, S Biderman, K Chen, F De Toni, G Dupont, ... arXiv preprint arXiv:2201.10066, 2022 | 15 | 2022 |
A deep learning approach to geographical candidate selection through toponym matching MC Ardanuy, K Hosseini, K McDonough, A Krause, D van Strien, F Nanni Proceedings of the 28th International Conference on Advances in Geographic …, 2020 | 15 | 2020 |
Datasheets for digital cultural heritage datasets H Alkemade, S Claeyssens, G Colavizza, N Freire, J Lehmann, ... Journal of Open Humanities Data 9 (17), 1-11, 2023 | 12 | 2023 |
Resolving places, past and present: toponym resolution in historical British newspapers using multiple resources MC Ardanuy, K McDonough, A Krause, DCS Wilson, K Hosseini, ... Proceedings of the 13th Workshop on Geographic Information Retrieval, 1-6, 2019 | 12 | 2019 |
Entities, dates, and languages: Zero-shot on historical texts with T0 F De Toni, C Akiki, J De La Rosa, C Fourrier, E Manjavacas, S Schweter, ... arXiv preprint arXiv:2204.05211, 2022 | 11 | 2022 |
An Introduction to AI for GLAM D van Strien, M Bell, NR McGregor, M Trizna Proceedings of the second teaching machine learning and artificial …, 2022 | 11 | 2022 |
A dataset for toponym resolution in nineteenth-century English newspapers MC Ardanuy, D Beavan, K Beelen, K Hosseini, J Lawrence, ... Journal of Open Humanities Data 8, 2022 | 11 | 2022 |
AI training resources for GLAM: a snapshot A Darby, CN Coleman, C Engel, D van Strien, M Trizna, ZW Painter arXiv preprint arXiv:2205.04738, 2022 | 5 | 2022 |
An Introduction to Version Control Using GitHub Desktop D van Strien The Programming Historian, 2016 | 5 | 2016 |
Building networks to strengthen research data management advocacy and training D van Strien, M Fellous-Sigrist SCONUL Focus 69, 27-29, 2017 | 2 | 2017 |
Metadata Might Make Language Models Better K Beelen, D van Strien arXiv preprint arXiv:2211.10086, 2022 | 1 | 2022 |
Computer Vision for the Humanities: An Introduction to Deep Learning for Image Classification (Part 1) D van Strien, K Beelen, M Wevers, T Smits, K McDonough The Programming Historian, 2022 | 1 | 2022 |
Using smart annotations to map the geography of newspapers Y Ryan, M Coll Ardanuy, D van Strien, K Hosseini, K Beelen, ... 2020 | 1 | 2020 |
Contextualizing Victorian Newspapers K Beelen, R Ahnert, D Beavan, M Coll Ardanuy, K Hosseini, ... 2020 | 1 | 2020 |