DiLoCo: Distributed Low-Communication Training of Language Models | A Douillard, Q Feng, AA Rusu, R Chhaparia, Y Donchev, A Kuncoro, ... | arXiv preprint arXiv:2311.08105, 2023 | 17 | 2023 |
Scaling instructable agents across many simulated worlds | MA Raad, A Ahuja, C Barros, F Besse, A Bolt, A Bolton, B Brownfield, ... | arXiv preprint arXiv:2404.10179, 2024 | 8 | 2024 |
Scaling instructable agents across many simulated worlds | M Abi Raad, A Ahuja, C Barros, F Besse, A Bolt, A Bolton, B Brownfield, ... | arXiv e-prints, arXiv:2404.10179, 2024 | 5 | 2024 |
DiPaCo: Distributed Path Composition | A Douillard, Q Feng, AA Rusu, A Kuncoro, Y Donchev, R Chhaparia, ... | arXiv preprint arXiv:2403.10616, 2024 | 3 | 2024 |