Double graph based reasoning for document-level relation extraction. S Zeng, R Xu, B Chang, L Li. arXiv preprint arXiv:2009.13752, 2020. Cited by 243.
Raise a child in large language model: Towards effective and generalizable fine-tuning. R Xu, F Luo, Z Zhang, C Tan, B Chang, S Huang, F Huang. arXiv preprint arXiv:2109.05687, 2021. Cited by 166.
DeepSeekMath: Pushing the limits of mathematical reasoning in open language models. Z Shao, P Wang, Q Zhu, R Xu, J Song, M Zhang, YK Li, Y Wu, D Guo. arXiv preprint arXiv:2402.03300, 2024. Cited by 117.
Document-level event extraction via heterogeneous graph-based interaction model with a tracker. R Xu, T Liu, L Li, B Chang. arXiv preprint arXiv:2105.14924, 2021. Cited by 105.
DeepSeekMoE: Towards ultimate expert specialization in mixture-of-experts language models. D Dai, C Deng, C Zhao, RX Xu, H Gao, D Chen, J Li, W Zeng, X Yu, Y Wu, ... arXiv preprint arXiv:2401.06066, 2024. Cited by 97.
Math-Shepherd: Verify and reinforce LLMs step-by-step without human annotations. P Wang, L Li, Z Shao, R Xu, D Dai, Y Li, D Chen, Y Wu, Z Sui. Proceedings of the 62nd Annual Meeting of the Association for Computational …, 2024. Cited by 57.
DeepSeek LLM: Scaling open-source language models with longtermism. X Bi, D Chen, G Chen, S Chen, D Dai, C Deng, H Ding, K Dong, Q Du, ... arXiv preprint arXiv:2401.02954, 2024. Cited by 56.
DeepSeek-Coder-V2: Breaking the barrier of closed-source models in code intelligence. Q Zhu, D Guo, Z Shao, D Yang, P Wang, R Xu, Y Wu, Y Li, H Gao, S Ma, ... arXiv preprint arXiv:2406.11931, 2024. Cited by 52.
A two-stream AMR-enhanced model for document-level event argument extraction. R Xu, P Wang, T Liu, S Zeng, B Chang, Z Sui. arXiv preprint arXiv:2205.00241, 2022. Cited by 50.
An enhanced span-based decomposition method for few-shot sequence labeling. P Wang, R Xu, T Liu, Q Zhou, Y Cao, B Chang, Z Sui. arXiv preprint arXiv:2109.13023, 2021. Cited by 49.
DeepSeek-V2: A Strong, Economical, and Efficient Mixture-of-Experts Language Model. DeepSeek-AI, 2024. Cited by 40.
Multimodal ArXiv: A dataset for improving scientific comprehension of large vision-language models. L Li, Y Wang, R Xu, P Wang, X Feng, L Kong, Q Liu. arXiv preprint arXiv:2403.00231, 2024. Cited by 33.
Math-Shepherd: A label-free step-by-step verifier for LLMs in mathematical reasoning. P Wang, L Li, Z Shao, RX Xu, D Dai, Y Li, D Chen, Y Wu, Z Sui. arXiv preprint arXiv:2312.08935, 2023. Cited by 27.
Making pre-trained language models end-to-end few-shot learners with contrastive prompt tuning. Z Xu, C Wang, M Qiu, F Luo, R Xu, S Huang, J Huang. Proceedings of the Sixteenth ACM International Conference on Web Search and …, 2023. Cited by 26.
From dense to sparse: Contrastive pruning for better pre-trained language model compression. R Xu, F Luo, C Wang, B Chang, J Huang, S Huang, F Huang. Proceedings of the AAAI Conference on Artificial Intelligence 36 (10), 11547 …, 2022. Cited by 25.
Behind the scenes: An exploration of trigger biases problem in few-shot event classification. P Wang, R Xu, T Liu, D Dai, B Chang, Z Sui. Proceedings of the 30th ACM International Conference on Information …, 2021. Cited by 16.
Xiaomingbot: A Multilingual Robot News Reporter. R Xu, J Cao, M Wang, J Chen, H Zhou, Y Zeng, Y Wang, L Chen, X Yin, ... The 58th Annual Meeting of the Association for Computational Linguistics, 2020. Cited by 14.
ATP: AMRize then parse! Enhancing AMR parsing with PseudoAMRs. L Chen, P Wang, R Xu, T Liu, Z Sui, B Chang. arXiv preprint arXiv:2204.08875, 2022. Cited by 12.
S4-Tuning: A simple cross-lingual sub-network tuning method. R Xu, F Luo, B Chang, S Huang, F Huang. Proceedings of the 60th Annual Meeting of the Association for Computational …, 2022. Cited by 10.