KV cache compression, but what must we give in return? A comprehensive benchmark of long context capable approaches J Yuan, H Liu, S Zhong, YN Chuang, S Li, G Wang, D Le, H Jin, ... arXiv preprint arXiv:2407.01527, 2024 | 15 | 2024 |
Understanding different design choices in training large time series models YN Chuang, S Li, J Yuan, G Wang, KH Lai, L Yu, S Ding, CY Chang, ... arXiv preprint arXiv:2406.14045, 2024 | 4 | 2024 |