DARTS: Differentiable architecture search H Liu, K Simonyan, Y Yang International Conference on Learning Representations, ICLR 2019, 2018 | 5322 | 2018 |
Modeling long- and short-term temporal patterns with deep neural networks G Lai, WC Chang, Y Yang, H Liu International ACM SIGIR Conference on Research and Development in …, 2017 | 2140 | 2017 |
RACE: Large-scale reading comprehension dataset from examinations G Lai, Q Xie, H Liu, Y Yang, E Hovy Empirical Methods in Natural Language Processing (EMNLP), 2017 | 1394 | 2017 |
CoAtNet: Marrying convolution and attention for all data sizes Z Dai, H Liu, QV Le, M Tan Advances in Neural Information Processing Systems (NeurIPS), 2021 | 1331 | 2021 |
Hierarchical representations for efficient architecture search H Liu, K Simonyan, O Vinyals, C Fernando, K Kavukcuoglu International Conference on Learning Representations, ICLR 2018, 2017 | 1146 | 2017 |
Rethinking pre-training and self-training B Zoph, G Ghiasi, TY Lin, Y Cui, H Liu, ED Cubuk, Q Le Advances in Neural Information Processing Systems (NeurIPS), 2020 | 777 | 2020 |
Pay attention to MLPs H Liu, Z Dai, D So, Q Le Advances in Neural Information Processing Systems (NeurIPS), 2021 | 708* | 2021 |
Analogical inference for multi-relational embeddings H Liu, Y Wu, Y Yang International Conference on Machine Learning (ICML), 2017 | 486 | 2017 |
Gated-attention readers for text comprehension B Dhingra*, H Liu*, Z Yang, WW Cohen, R Salakhutdinov Annual Meeting of the Association for Computational Linguistics, ACL 2017, 2016 | 469 | 2016 |
BigNAS: Scaling up neural architecture search with big single-stage models J Yu, P Jin, H Liu, G Bender, PJ Kindermans, M Tan, T Huang, X Song, ... European Conference on Computer Vision (ECCV), 2020 | 337 | 2020 |
Larger language models do in-context learning differently J Wei, J Wei, Y Tay, D Tran, A Webson, Y Lu, X Chen, H Liu, D Huang, ... arXiv preprint arXiv:2303.03846, 2023 | 274 | 2023 |
Transformer quality in linear time W Hua, Z Dai, H Liu, Q Le International Conference on Machine Learning (ICML), 2022 | 237 | 2022 |
Mixture-of-Experts with Expert Choice Routing Y Zhou, T Lei, H Liu, N Du, Y Huang, V Zhao, A Dai, Z Chen, Q Le, ... Advances in Neural Information Processing Systems (NeurIPS), 2022 | 234 | 2022 |
Combined scaling for open-vocabulary image classification H Pham, Z Dai, G Ghiasi, K Kawaguchi, H Liu, AW Yu, J Yu, YT Chen, ... arXiv preprint arXiv:2111.10050, 2021 | 229* | 2021 |
Neural predictor for neural architecture search W Wen, H Liu, H Li, Y Chen, G Bender, PJ Kindermans European Conference on Computer Vision (ECCV), 2020 | 226 | 2020 |
MobileDets: Searching for object detection architectures for mobile accelerators Y Xiong, H Liu, S Gupta, B Akin, G Bender, PJ Kindermans, M Tan, ... Computer Vision and Pattern Recognition (CVPR), 2021 | 168 | 2021 |
Searching for efficient transformers for language modeling D So, W Mańke, H Liu, Z Dai, N Shazeer, QV Le Advances in Neural Information Processing Systems (NeurIPS), 2021 | 161 | 2021 |
Can weight sharing outperform random architecture search? An investigation with TuNAS G Bender*, H Liu*, B Chen, G Chu, S Cheng, PJ Kindermans, QV Le Computer Vision and Pattern Recognition (CVPR), 2020 | 159 | 2020 |
Concept graph learning from educational data Y Yang, H Liu, J Carbonell, W Ma International Conference on Web Search and Data Mining (WSDM), 2015 | 110 | 2015 |
DoReMi: Optimizing data mixtures speeds up language model pretraining SM Xie, H Pham, X Dong, N Du, H Liu, Y Lu, PS Liang, QV Le, T Ma, ... Advances in Neural Information Processing Systems 36, 2024 | 105 | 2024 |