GPT-4 technical report J Achiam, S Adler, S Agarwal, L Ahmad, I Akkaya, FL Aleman, D Almeida, ... arXiv preprint arXiv:2303.08774, 2023 | 5727 | 2023 |
Evaluating large language models trained on code M Chen, J Tworek, H Jun, Q Yuan, HPDO Pinto, J Kaplan, H Edwards, ... arXiv preprint arXiv:2107.03374, 2021 | 3310 | 2021 |
Megatron-LM: Training multi-billion parameter language models using model parallelism M Shoeybi, M Patwary, R Puri, P LeGresley, J Casper, B Catanzaro arXiv preprint arXiv:1909.08053, 2019 | 1775 | 2019 |
Text and code embeddings by contrastive pre-training A Neelakantan, T Xu, R Puri, A Radford, JM Han, J Tworek, Q Yuan, ... arXiv preprint arXiv:2201.10005, 2022 | 394 | 2022 |
Training question answering models from synthetic data R Puri, R Spring, M Patwary, M Shoeybi, B Catanzaro arXiv preprint arXiv:2002.09599, 2020 | 167 | 2020 |
MEGATRON-CNTRL: Controllable story generation with external knowledge using large-scale language models P Xu, M Patwary, M Shoeybi, R Puri, P Fung, A Anandkumar, B Catanzaro arXiv preprint arXiv:2010.00840, 2020 | 154 | 2020 |
BioMegatron: larger biomedical domain language model HC Shin, Y Zhang, E Bakhturina, R Puri, M Patwary, M Shoeybi, R Mani Proceedings of the 2020 Conference on Empirical Methods in Natural Language …, 2020 | 143 | 2020 |
Zero-shot text classification with generative language models R Puri, B Catanzaro arXiv preprint arXiv:1912.10165, 2019 | 107 | 2019 |
Practical text classification with large pre-trained language models N Kant, R Puri, N Yakovenko, B Catanzaro arXiv preprint arXiv:1812.01207, 2018 | 83 | 2018 |
Evaluating large language models trained on code. arXiv 2021 M Chen, J Tworek, H Jun, Q Yuan, HPO Pinto, J Kaplan, H Edwards, ... arXiv preprint arXiv:2107.03374 10, 2021 | 55 | 2021 |
Megatron-LM: Training multi-billion parameter language models using model parallelism. arXiv 2019 M Shoeybi, M Patwary, R Puri, P LeGresley, J Casper, B Catanzaro arXiv preprint arXiv:1909.08053, 2019 | 28 | 2019 |
Large scale multi-actor generative dialog modeling A Boyd, R Puri, M Shoeybi, M Patwary, B Catanzaro arXiv preprint arXiv:2005.06114, 2020 | 26 | 2020 |
Large scale language modeling: Converging on 40GB of text in four hours R Puri, R Kirby, N Yakovenko, B Catanzaro 2018 30th International Symposium on Computer Architecture and High …, 2018 | 26 | 2018 |
Transferability of adversarial attacks in model-agnostic meta-learning R Edmunds, N Golmant, V Ramasesh, P Kuznetsov, P Patil, R Puri Deep Learning and Security Workshop (DLSW) in Singapore, 2017 | 15 | 2017 |
Few shot learning for point cloud data using model agnostic meta learning R Puri, A Zakhor, R Puri 2020 IEEE International Conference on Image Processing (ICIP), 1906-1910, 2020 | 13 | 2020 |
GPT-4o system card A Hurst, A Lerer, AP Goucher, A Perelman, A Ramesh, A Clark, AJ Ostrow, ... arXiv preprint arXiv:2410.21276, 2024 | 10 | 2024 |
Local knowledge powered conversational agents S Santhanam, W Ping, R Puri, M Shoeybi, M Patwary, B Catanzaro arXiv preprint arXiv:2010.10150, 2020 | 4 | 2020 |
Model agnostic contrastive explanations for structured data. CoRR abs/1906.00117 (2019) A Dhurandhar, T Pedapati, A Balakrishnan, P Chen, K Shanmugam, ... | 4 | 2019 |
Frame rate upscaling with deep neural networks T Xiao, R Puri, G Kesineni Term Paper for CS294-129 Deep Neural Networks, Fall, 2016 | 3 | 2016 |
Adversarial machine learning P Kuznetsov, R Edmunds, T Xiao, H Iqbal, R Puri, N Golmant, S Shih Artificial Intelligence Safety and Security, 235-248, 2018 | 2 | 2018 |