A survey on in-context learning Q Dong, L Li, D Dai, C Zheng, J Ma, R Li, H Xia, J Xu, Z Wu, B Chang, ... Proceedings of the 2024 Conference on Empirical Methods in Natural Language …, 2024 | 1219 | 2024 |
Knowledge neurons in pretrained transformers D Dai, L Dong, Y Hao, Z Sui, B Chang, F Wei Proceedings of the 60th Annual Meeting of the Association for Computational …, 2022 | 473 | 2022 |
Why can GPT learn in-context? Language models implicitly perform gradient descent as meta-optimizers D Dai, Y Sun, L Dong, Y Hao, S Ma, Z Sui, F Wei Findings of the Association for Computational Linguistics: ACL 2023, 4005-4019, 2023 | 341 | 2023 |
DeepSeek LLM: Scaling open-source language models with longtermism X Bi, D Chen, G Chen, S Chen, D Dai, C Deng, H Ding, K Dong, Q Du, ... arXiv preprint arXiv:2401.02954, 2024 | 154 | 2024 |
Calibrating Factual Knowledge in Pretrained Language Models Q Dong*, D Dai*, Y Song, J Xu, Z Sui, L Li Findings of the Association for Computational Linguistics: EMNLP 2022, 2022 | 104 | 2022 |
Label Words are Anchors: An Information Flow Perspective for Understanding In-Context Learning L Wang, L Li, D Dai, D Chen, H Zhou, F Meng, J Zhou, X Sun (EMNLP 2023 Best Long Paper) Proceedings of the 2023 Conference on Empirical …, 2023 | 102 | 2023 |
DeepSeekMoE: Towards ultimate expert specialization in mixture-of-experts language models D Dai, C Deng, C Zhao, RX Xu, H Gao, D Chen, J Li, W Zeng, X Yu, Y Wu, ... Proceedings of the 62nd Annual Meeting of the Association for Computational …, 2024 | 101 | 2024 |
Math-Shepherd: Verify and reinforce LLMs step-by-step without human annotations P Wang, L Li, Z Shao, RX Xu, D Dai, Y Li, D Chen, Y Wu, Z Sui Proceedings of the 62nd Annual Meeting of the Association for Computational …, 2024 | 85* | 2024 |
Preliminary study on the construction of Chinese medical knowledge graph O Byambasuren, Y Yang, Z Sui, D Dai, B Chang, S Li, H Zan Journal of Chinese Information Processing 33 (10), 1-9, 2019 | 79* | 2019 |
On the representation collapse of sparse mixture of experts Z Chi, L Dong, S Huang, D Dai, S Ma, B Patra, S Singhal, P Bajaj, X Song, ... Advances in Neural Information Processing Systems 35, 34600-34613, 2022 | 72 | 2022 |
LiveBot: Generating live video comments based on visual and textual contexts S Ma, L Cui, D Dai, F Wei, X Sun Proceedings of the AAAI Conference on Artificial Intelligence 33 (01), 6810-6817, 2019 | 65 | 2019 |
Learning to control the fine-grained sentiment for story ending generation F Luo*, D Dai*, P Yang, T Liu, B Chang, Z Sui, X Sun Proceedings of the 57th Annual Meeting of the Association for Computational …, 2019 | 64 | 2019 |
StableMoE: Stable Routing Strategy for Mixture of Experts D Dai, L Dong, S Ma, B Zheng, Z Sui, B Chang, F Wei Proceedings of the 60th Annual Meeting of the Association for Computational …, 2022 | 56 | 2022 |
DeepSeek-Coder-V2: Breaking the Barrier of Closed-Source Models in Code Intelligence Q Zhu, D Guo, Z Shao, D Yang, P Wang, R Xu, Y Wu, Y Li, H Gao, S Ma, ... arXiv preprint arXiv:2406.11931, 2024 | 52 | 2024 |
DeepSeek-V2: A Strong, Economical, and Efficient Mixture-of-Experts Language Model DeepSeek-AI, A Liu, B Feng, B Wang, B Wang, B Liu, C Zhao, C Deng, ... arXiv preprint arXiv:2405.04434, 2024 | 50 | 2024 |
Sememe prediction: Learning semantic knowledge from unstructured textual wiki descriptions W Li, X Ren, D Dai, Y Wu, H Wang, X Sun arXiv preprint arXiv:1808.05437, 2018 | 20 | 2018 |
Inductively Representing Out-of-Knowledge-Graph Entities by Optimal Estimation Under Translational Assumptions D Dai*, H Zheng*, F Luo, P Yang, T Liu, Z Sui, B Chang Proceedings of the 6th ACL Workshop on Representation Learning for NLP …, 2021 | 19 | 2021 |
Neural knowledge bank for pretrained transformers D Dai, W Jiang, Q Dong, Y Lyu, Z Sui CCF International Conference on Natural Language Processing and Chinese …, 2023 | 17 | 2023 |
Hierarchical Curriculum Learning for AMR Parsing P Wang, L Chen, T Liu, D Dai, Y Cao, B Chang, Z Sui Proceedings of the 60th Annual Meeting of the Association for Computational …, 2022 | 16 | 2022 |
Behind the scenes: An exploration of trigger biases problem in few-shot event classification P Wang, R Xu, T Liu, D Dai, B Chang, Z Sui Proceedings of the 30th ACM International Conference on Information …, 2021 | 16 | 2021 |