Tri Dao
Verified email at princeton.edu - Homepage
Title / Cited by / Year
Mamba: Linear-time sequence modeling with selective state spaces
A Gu, T Dao
Conference on Language Modeling (COLM), 2023
2549 · 2023
FlashAttention: Fast and memory-efficient exact attention with IO-awareness
T Dao, D Fu, S Ermon, A Rudra, C Ré
Advances in Neural Information Processing Systems 35, 16344-16359, 2022
1906 · 2022
StarCoder: May the source be with you!
R Li, LB Allal, Y Zi, N Muennighoff, D Kocetkov, C Mou, M Marone, C Akiki, ...
Transactions on Machine Learning Research (TMLR), 2023
1034* · 2023
FlashAttention-2: Faster attention with better parallelism and work partitioning
T Dao
International Conference on Learning Representations, 2023
873 · 2023
Combining recurrent, convolutional, and continuous-time models with linear state space layers
A Gu, I Johnson, K Goel, K Saab, T Dao, A Rudra, C Ré
Advances in Neural Information Processing Systems 34, 572-585, 2021
579 · 2021
HiPPO: Recurrent memory with optimal polynomial projections
A Gu, T Dao, S Ermon, A Rudra, C Ré
Advances in Neural Information Processing Systems 33, 1474-1487, 2020
545 · 2020
Hungry Hungry Hippos: Towards Language Modeling with State Space Models
DY Fu, T Dao, KK Saab, AW Thomas, A Rudra, C Ré
The Eleventh International Conference on Learning Representations, 2023
481 · 2023
Transformers are SSMs: Generalized models and efficient algorithms through structured state space duality
T Dao, A Gu
International Conference on Machine Learning (ICML), 2024
375 · 2024
Hyena Hierarchy: Towards Larger Convolutional Language Models
M Poli, S Massaroli, E Nguyen, DY Fu, T Dao, S Baccus, Y Bengio, ...
International Conference on Machine Learning, 2023
320 · 2023
Deja Vu: Contextual sparsity for efficient LLMs at inference time
Z Liu, J Wang, T Dao, T Zhou, B Yuan, Z Song, A Shrivastava, C Zhang, ...
International Conference on Machine Learning, 22137-22176, 2023
278 · 2023
A kernel theory of modern data augmentation
T Dao, A Gu, A Ratner, V Smith, CD Sa, C Ré
Proceedings of the 36th International Conference on Machine Learning, ICML, 9-15, 2019
239 · 2019
StarCoder 2 and The Stack v2: The next generation
A Lozhkov, R Li, LB Allal, F Cassano, J Lamy-Poirier, N Tazi, A Tang, ...
arXiv preprint arXiv:2402.19173, 2024
227 · 2024
S4ND: Modeling images and videos as multidimensional signals with state spaces
E Nguyen, K Goel, A Gu, G Downs, P Shah, T Dao, S Baccus, C Ré
Advances in Neural Information Processing Systems 35, 2846-2861, 2022
206 · 2022
Medusa: Simple LLM inference acceleration framework with multiple decoding heads
T Cai, Y Li, Z Geng, H Peng, JD Lee, D Chen, T Dao
International Conference on Machine Learning (ICML), 2024
191 · 2024
Scatterbrain: Unifying sparse and low-rank attention
B Chen, T Dao, E Winsor, Z Song, A Rudra, C Ré
Advances in Neural Information Processing Systems 34, 17413-17426, 2021
144 · 2021
Learning fast algorithms for linear transforms using butterfly factorizations
T Dao, A Gu, M Eichhorn, A Rudra, C Ré
International Conference on Machine Learning, 1517-1527, 2019
128 · 2019
Monarch: Expressive structured matrices for efficient and accurate training
T Dao, B Chen, NS Sohoni, A Desai, M Poli, J Grogan, A Liu, A Rao, ...
International Conference on Machine Learning, 4690-4721, 2022
105 · 2022
Decentralized training of foundation models in heterogeneous environments
B Yuan, Y He, J Davis, T Zhang, T Dao, B Chen, PS Liang, C Ré, C Zhang
Advances in Neural Information Processing Systems 35, 25464-25477, 2022
96 · 2022
Caduceus: Bi-directional equivariant long-range dna sequence modeling
Y Schiff, CH Kao, A Gokaslan, T Dao, A Gu, V Kuleshov
International Conference on Machine Learning (ICML), 2024
85 · 2024
Pixelated butterfly: Simple and efficient sparse training for neural network models
T Dao, B Chen, K Liang, J Yang, Z Song, A Rudra, C Ré
International Conference on Learning Representations, 2021
84 · 2021
Articles 1–20