Phi-3 technical report: A highly capable language model locally on your phone M Abdin, J Aneja, H Awadalla, A Awadallah, AA Awan, N Bach, A Bahree, ... arXiv preprint arXiv:2404.14219, 2024 | 474 | 2024 |
Textbooks are all you need S Gunasekar, Y Zhang, J Aneja, CCT Mendes, A Del Giorno, S Gopi, ... arXiv preprint arXiv:2306.11644, 2023 | 468 | 2023 |
Sampling is as easy as learning the score: theory for diffusion models with minimal data assumptions S Chen, S Chewi, J Li, Y Li, A Salim, AR Zhang The Eleventh International Conference on Learning Representations, 2022 | 256 | 2022 |
Maximum mean discrepancy gradient flow M Arbel, A Korba, A Salim, A Gretton Advances in Neural Information Processing Systems 32, 2019 | 163 | 2019 |
Phi-2: The surprising power of small language models M Javaheripi, S Bubeck, M Abdin, J Aneja, CCT Mendes, ... Microsoft Research Blog 1, 3, 2023 | 160 | 2023 |
A non-asymptotic analysis for Stein variational gradient descent A Korba, A Salim, M Arbel, G Luise, A Gretton Advances in Neural Information Processing Systems 33, 4672-4682, 2020 | 95 | 2020 |
Optimal and practical algorithms for smooth and strongly convex decentralized optimization D Kovalev, A Salim, P Richtárik Advances in Neural Information Processing Systems 33, 18342-18352, 2020 | 85 | 2020 |
The probability flow ODE is provably fast S Chen, S Chewi, H Lee, Y Li, J Lu, A Salim Advances in Neural Information Processing Systems 36, 2023 | 81 | 2023 |
Towards a theory of non-log-concave sampling: first-order stationarity guarantees for Langevin Monte Carlo K Balasubramanian, S Chewi, MA Erdogdu, A Salim, S Zhang Conference on Learning Theory, 2896-2923, 2022 | 72 | 2022 |
Improved analysis for a proximal algorithm for sampling Y Chen, S Chewi, A Salim, A Wibisono Conference on Learning Theory, 2984-3014, 2022 | 57 | 2022 |
The Wasserstein proximal gradient algorithm A Salim, A Korba, G Luise Advances in Neural Information Processing Systems 33, 12356-12366, 2020 | 56 | 2020 |
Primal dual interpretation of the proximal stochastic gradient Langevin algorithm A Salim, P Richtárik Advances in Neural Information Processing Systems 33, 3786-3796, 2020 | 44 | 2020 |
Dualize, split, randomize: Toward fast nonsmooth optimization algorithms A Salim, L Condat, K Mishchenko, P Richtárik Journal of Optimization Theory and Applications 195 (1), 102-130, 2022 | 41 | 2022 |
A convergence theory for SVGD in the population limit under Talagrand’s inequality T1 A Salim, L Sun, P Richtárik International Conference on Machine Learning, 19139-19152, 2022 | 31* | 2022 |
Stochastic proximal Langevin algorithm: Potential splitting and nonasymptotic rates A Salim, D Kovalev, P Richtárik Advances in Neural Information Processing Systems 32, 2019 | 30 | 2019 |
Forward-backward Gaussian variational inference via JKO in the Bures-Wasserstein space MZ Diao, K Balasubramanian, S Chewi, A Salim International Conference on Machine Learning, 7960-7991, 2023 | 29 | 2023 |
An optimal algorithm for strongly convex minimization under affine constraints A Salim, L Condat, D Kovalev, P Richtárik International Conference on Artificial Intelligence and Statistics, 4482-4498, 2022 | 29 | 2022 |
A constant step Forward-Backward algorithm involving random maximal monotone operators P Bianchi, W Hachem, A Salim Journal of Convex Analysis 26 (2), 387-436, 2019 | 29 | 2019 |
Distributed fixed point methods with compressed iterates S Chraibi, A Khaled, D Kovalev, P Richtárik, A Salim, M Takáč arXiv preprint arXiv:1912.09925, 2019 | 25 | 2019 |
Snake: a stochastic proximal gradient algorithm for regularized problems over large graphs A Salim, P Bianchi, W Hachem IEEE Transactions on Automatic Control 64 (5), 1832-1847, 2019 | 24 | 2019 |