BLOOM: A 176B-parameter open-access multilingual language model T Le Scao, A Fan, C Akiki, E Pavlick, S Ilić, D Hesslow, R Castagné, ... arXiv preprint arXiv:2211.05100, 2022 | 1184 | 2023 |
Mistral 7B AQ Jiang, A Sablayrolles, A Mensch, C Bamford, DS Chaplot, D Casas, ... arXiv preprint arXiv:2310.06825, 2023 | 177 | 2023 |
The BigScience ROOTS corpus: A 1.6TB composite multilingual dataset H Laurençon, L Saulnier, T Wang, C Akiki, A Villanova del Moral, ... Advances in Neural Information Processing Systems 35, 31809-31826, 2022 | 107 | 2022 |
What language model to train if you have one million GPU hours? T Le Scao, T Wang, D Hesslow, L Saulnier, S Bekman, MS Bari, ... arXiv preprint arXiv:2210.15424, 2022 | 80 | 2022 |
Mixtral of experts AQ Jiang, A Sablayrolles, A Roux, A Mensch, B Savary, C Bamford, ... arXiv preprint arXiv:2401.04088, 2024 | 72 | 2024 |
OBELICS: An open web-scale filtered dataset of interleaved image-text documents H Laurençon, L Saulnier, L Tronchon, S Bekman, A Singh, A Lozhkov, ... Advances in Neural Information Processing Systems 36, 2024 | 71 | 2024 |
Distributed deep learning in open collaborations M Diskin, A Bukhtiyarov, M Ryabinin, L Saulnier, A Sinitsin, D Popov, ... Advances in Neural Information Processing Systems 34, 7879-7897, 2021 | 39 | 2021 |
BLOOM: A 176B-parameter open-access multilingual language model T Le Scao, A Fan, C Akiki, E Pavlick, S Ilić, D Hesslow, R Castagné, ... CoRR, abs/2211.05100, 2022. doi: 10.48550/arXiv.2211.05100 | 20 | 2022 |
Training transformers together A Borzunov, M Ryabinin, T Dettmers, Q Lhoest, L Saulnier, M Diskin, ... NeurIPS 2021 Competitions and Demonstrations Track, 335-342, 2022 | 9 | 2022 |