Rethinking attention with Performers
K Choromanski, V Likhosherstov, D Dohan, X Song, A Gane, T Sarlos, ...
arXiv preprint arXiv:2009.14794, 2020
1873 2020

Masked language modeling for proteins via linearly scalable long-context transformers
K Choromanski, V Likhosherstov, D Dohan, X Song, A Gane, T Sarlos, ...
arXiv preprint arXiv:2006.03555, 2020
112 2020

Rethinking attention with Performers. arXiv 2020
K Choromanski, V Likhosherstov, D Dohan, X Song, A Gane, T Sarlos, ...
arXiv preprint arXiv:2009.14794
83

PolyViT: Co-training vision transformers on images, videos and audio
V Likhosherstov, A Arnab, K Choromanski, M Lucic, Y Tay, A Weller, ...
arXiv preprint arXiv:2111.12993, 2021
78 2021

From block-Toeplitz matrices to differential equations on graphs: towards a general theory for scalable masked Transformers
K Choromanski, H Lin, H Chen, T Zhang, A Sehanobish, V Likhosherstov, ...
International Conference on Machine Learning, 3962-3983, 2022
35 2022

On the expressive power of self-attention matrices
V Likhosherstov, K Choromanski, A Weller
arXiv preprint arXiv:2106.03764, 2021
34 2021

Ode to an ODE
KM Choromanski, JQ Davis, V Likhosherstov, X Song, JJ Slotine, J Varley, ...
Advances in neural information processing systems 33, 3338-3350, 2020
30 2020

Hybrid random features
K Choromanski, H Chen, H Lin, Y Ma, A Sehanobish, D Jain, MS Ryoo, ...
arXiv preprint arXiv:2110.04367, 2021
24 2021

Sub-linear memory: How to make Performers slim
V Likhosherstov, KM Choromanski, JQ Davis, X Song, A Weller
Advances in Neural Information Processing Systems 34, 6707-6719, 2021
22 2021

Chefs' random tables: Non-trigonometric random features
V Likhosherstov, KM Choromanski, KA Dubey, F Liu, T Sarlos, A Weller
Advances in Neural Information Processing Systems 35, 34559-34573, 2022
14 2022

Adaptive computation with elastic input sequence
F Xue, V Likhosherstov, A Arnab, N Houlsby, M Dehghani, Y You
International Conference on Machine Learning, 38971-38988, 2023
13 2023

Large‐scale log analysis of digital reading
P Braslavski, V Likhosherstov, V Petras, M Gäde
Proceedings of the Association for Information Science and Technology 53 (1 …, 2016
13 2016

Learning a Fourier transform for linear relative positional encodings in transformers
K Choromanski, S Li, V Likhosherstov, KA Dubey, S Luo, D He, Y Yang, ...
International Conference on Artificial Intelligence and Statistics, 2278-2286, 2024
9 2024

Stochastic flows and geometric optimization on the orthogonal group
K Choromanski, D Cheikhi, J Davis, V Likhosherstov, A Nazaret, ...
International Conference on Machine Learning, 1918-1928, 2020
9 2020

Simplex random features
I Reid, KM Choromanski, V Likhosherstov, A Weller
International Conference on Machine Learning, 28864-28888, 2023
8 2023

UFO-BLO: Unbiased first-order bilevel optimization
V Likhosherstov, X Song, K Choromanski, J Davis, A Weller
arXiv preprint arXiv:2006.03631, 2020
6 2020

Scalable neural network kernels
A Sehanobish, K Choromanski, Y Zhao, A Dubey, V Likhosherstov
arXiv preprint arXiv:2310.13225, 2023
5 2023

Efficient graph field integrators meet point clouds
KM Choromanski, A Sehanobish, H Lin, Y Zhao, E Berger, T Parshakova, ...
International Conference on Machine Learning, 5978-6004, 2023
5 2023

Inference and Sampling of K33-free Ising Models
V Likhosherstov, Y Maximov, M Chertkov
International Conference on Machine Learning, 3963-3972, 2019
5 2019

Ten months of digital reading: An exploratory log study
P Braslavski, V Petras, V Likhosherstov, M Gäde
Research and Advanced Technology for Digital Libraries: 20th International …, 2016
5 2016