Gemini: a family of highly capable multimodal models Gemini Team, R Anil, S Borgeaud, JB Alayrac, J Yu, R Soricut, J Schalkwyk, ... arXiv preprint arXiv:2312.11805, 2023 | 2084 | 2023 |
Distilling knowledge from ensembles of neural networks for speech recognition Y Chebotar, A Waters Interspeech, 3439-3443, 2016 | 186 | 2016 |
From audio to semantics: Approaches to end-to-end spoken language understanding P Haghani, A Narayanan, M Bacchiani, G Chuang, N Gaur, P Moreno, ... 2018 IEEE Spoken Language Technology Workshop (SLT), 720-726, 2018 | 172 | 2018 |
PaLI-X: On scaling up a multilingual vision and language model X Chen, J Djolonga, P Padlewski, B Mustafa, S Changpinyo, J Wu, ... arXiv preprint arXiv:2305.18565, 2023 | 141 | 2023 |
Spherical topic models J Reisinger, A Waters, B Silverthorn, R Mooney International Conference on Machine Learning, 2010 | 132 | 2010 |
Crisscrossed captions: Extended intramodal and intermodal semantic similarity judgments for MS-COCO Z Parekh, J Baldridge, D Cer, A Waters, Y Yang arXiv preprint arXiv:2004.15020, 2020 | 63 | 2020 |
Less is more: Generating grounded navigation instructions from landmarks S Wang, C Montgomery, J Orbay, V Birodkar, A Faust, I Gur, N Jaques, ... Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern …, 2022 | 50 | 2022 |
Towards acoustic model unification across dialects M Elfeky, M Bastani, X Velez, P Moreno, A Waters 2016 IEEE Spoken Language Technology Workshop (SLT), 624-628, 2016 | 38* | 2016 |
Leveraging language ID in multilingual end-to-end speech recognition A Waters, N Gaur, P Haghani, P Moreno, Z Qu 2019 IEEE Automatic Speech Recognition and Understanding Workshop (ASRU …, 2019 | 32 | 2019 |
Talk, don't write: A study of direct speech-based image retrieval R Sanabria, A Waters, J Baldridge arXiv preprint arXiv:2104.01894, 2021 | 26 | 2021 |
Simple and effective synthesis of indoor 3D scenes JY Koh, H Agrawal, D Batra, R Tucker, A Waters, H Lee, Y Yang, ... Proceedings of the AAAI Conference on Artificial Intelligence 37 (1), 1169-1178, 2023 | 24 | 2023 |
A new path: Scaling vision-and-language navigation with synthetic instructions and imitation learning A Kamath, P Anderson, S Wang, JY Koh, A Ku, A Waters, Y Yang, ... Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern …, 2023 | 19 | 2023 |
Automated calling system A Aharoni, A Narayanan, N Shabat, P Haghani, GT Chuang, Y Leviathan, ... US Patent 11,158,321, 2021 | 17 | 2021 |
Imagen 3 J Baldridge, J Bauer, M Bhutani, N Brichtova, A Bunner, K Chan, Y Chen, ... arXiv preprint arXiv:2408.07009, 2024 | 5 | 2024 |
On Scaling Up a Multilingual Vision and Language Model X Chen, J Djolonga, P Padlewski, B Mustafa, S Changpinyo, J Wu, ... Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern …, 2024 | 4 | 2024 |
Greedy growing enables high-resolution pixel-based diffusion models CN Vasconcelos, A Rashwan, A Waters, T Walker, K Xu, J Yan, R Qian, ... Transactions on Machine Learning Research, 2024 | 2 | 2024 |
Automated calling system A Aharoni, A Narayanan, N Shabat, P Haghani, GT Chuang, Y Leviathan, ... US Patent 11,495,233, 2022 | 2 | 2022 |
Automated calling system A Aharoni, A Narayanan, N Shabat, P Haghani, GT Chuang, Y Leviathan, ... US Patent 11,741,966, 2023 | 1 | 2023 |
Automated calling system A Aharoni, A Narayanan, N Shabat, P Haghani, GT Chuang, Y Leviathan, ... US Patent App. 18/635,974, 2024 | | 2024 |
Automated calling system A Aharoni, A Narayanan, N Shabat, P Haghani, GT Chuang, Y Leviathan, ... US Patent 11,990,133, 2024 | | 2024 |