Xin Wang
Title · Cited by · Year
Large language models with controllable working memory
D Li, AS Rawat, M Zaheer, X Wang, M Lukasik, A Veit, F Yu, S Kumar
arXiv preprint arXiv:2211.05110, 2022
Cited by 138 · 2022
On the benefits of learning to route in mixture-of-experts models
N Dikkala, N Ghosh, R Meka, R Panigrahy, N Vyas, X Wang
Proceedings of the 2023 Conference on Empirical Methods in Natural Language …, 2023
Cited by 18 · 2023
A unified cascaded encoder ASR model for dynamic model sizes
S Ding, W Wang, D Zhao, TN Sainath, Y He, R David, R Botros, X Wang, ...
arXiv preprint arXiv:2204.06164, 2022
Cited by 18 · 2022
One Network Fits All? Modular versus Monolithic Task Formulations in Neural Networks
Atish Agarwala, Abhimanyu Das, Brendan Juba, Rina Panigrahy, Vatsal Sharan ...
International Conference on Learning Representations, 2021
Cited by 18* · 2021
A theoretical view on sparsely activated networks
C Baykal, N Dikkala, R Panigrahy, C Rashtchian, X Wang
Advances in Neural Information Processing Systems 35, 30071-30084, 2022
Cited by 9 · 2022
Sketch based memory for neural networks
R Panigrahy, X Wang, M Zaheer
International Conference on Artificial Intelligence and Statistics, 3169-3177, 2021
Cited by 9 · 2021
JaxPruner: A concise library for sparsity research
JH Lee, W Park, NE Mitchell, J Pilault, JSO Ceron, HB Kim, N Lee, ...
Conference on Parsimony and Learning, 515-528, 2024
Cited by 7 · 2024
Back and forth error compensation and correction method for linear hyperbolic systems with application to the Maxwell's equations
X Wang, Y Liu
Journal of Computational Physics: X 1, 100014, 2019
Cited by 7 · 2019
Alternating updates for efficient transformers
C Baykal, D Cutler, N Dikkala, N Ghosh, R Panigrahy, X Wang
Advances in Neural Information Processing Systems 36, 76718-76736, 2023
Cited by 6 · 2023
LayerNAS: Neural architecture search in polynomial complexity
Y Fan, D Alon, J Shen, D Peng, K Kumar, Y Long, X Wang, F Iliopoulos, ...
arXiv preprint arXiv:2304.11517, 2023
Cited by 6 · 2023
Improving sampling accuracy of stochastic gradient MCMC methods via non-uniform subsampling of gradients
R Li, X Wang, H Zha, M Tao
arXiv preprint arXiv:2002.08949, 2020
Cited by 6 · 2020
Causal language modeling can elicit search and reasoning capabilities on logic puzzles
K Shah, N Dikkala, X Wang, R Panigrahy
arXiv preprint arXiv:2409.10502, 2024
Cited by 2 · 2024
Provable hierarchical lifelong learning with a sketch-based modular architecture
Z Deng, Z Fryer, B Juba, R Panigrahy, X Wang
arXiv preprint arXiv:2112.10919, 2021
Cited by 2 · 2021
How transformers solve propositional logic problems: A mechanistic analysis
GZ Hong, N Dikkala, E Luo, C Rashtchian, X Wang, R Panigrahy
arXiv preprint arXiv:2411.04105, 2024
Cited by 1 · 2024
Sketching based representations for robust image classification with provable guarantees
N Dikkala, SR Karingula, R Meka, J Nelson, R Panigrahy, X Wang
Advances in Neural Information Processing Systems 35, 5459-5470, 2022
Cited by 1 · 2022
StagFormer: Time Staggering Transformer Decoding for Running Layers in Parallel
D Cutler, A Kandoor, N Dikkala, N Saunshi, X Wang, R Panigrahy
arXiv preprint arXiv:2501.15665, 2025
2025
Unified Cascaded Encoder ASR model for Dynamic Model Sizes
S Ding, Y He, X Wang, W Wang, T Strohman, TN Sainath, ...
US Patent US20230326461A1, 2023
2023
The Power of External Memory in Increasing Predictive Model Capacity
C Baykal, DJ Cutler, N Dikkala, N Ghosh, R Panigrahy, X Wang
arXiv preprint arXiv:2302.00003, 2023
2023
JAXPruner: A Modular Library for Sparsity Research
ICLR 2023 Workshop on Sparsity in Neural Networks, 2023
2023
One network fits all? Modular versus monolithic task formulations in neural networks
A Das, A Agarwala, B Juba, R Zhang, R Panigrahy, V Sharan, X Wang
2021