MHPP: Exploring the capabilities and limitations of language models beyond basic code generation J Dai, J Lu, Y Feng, D Huang, G Zeng, R Ruan, M Cheng, H Tan, Z Guo arXiv preprint arXiv:2405.11430, 2024 | 7 | 2024 |
MR-BEN: A Comprehensive Meta-Reasoning Benchmark for Large Language Models Z Zeng, Y Liu, Y Wan, J Li, P Chen, J Dai, Y Yao, R Xu, Z Qi, W Zhao, ... NeurIPS 2024, 2024 | 5 | 2024 |
EffiLearner: Enhancing Efficiency of Generated Code via Self-Optimization D Huang, J Dai, H Weng, P Wu, Y Qing, JM Zhang, H Cui, Z Guo NeurIPS 2024, 2024 | 3* | 2024 |
AutoPSV: Automated Process-Supervised Verifier J Lu, Z Dou, H Wang, Z Cao, J Dai, Y Wan, Y Huang, Z Guo NeurIPS 2024, 2024 | 2* | 2024 |
Effi-Code: Unleashing Code Efficiency in Language Models D Huang, G Zeng, J Dai, M Luo, H Weng, Y Qing, H Cui, Z Guo, JM Zhang arXiv preprint arXiv:2410.10209, 2024 | 1 | 2024 |
MR-Ben: A Meta-Reasoning Benchmark for Evaluating System-2 Thinking in LLMs Z Zeng, Y Liu, Y Wan, J Li, P Chen, J Dai, Y Yao, R Xu, Z Qi, W Zhao, ... The Thirty-eighth Annual Conference on Neural Information Processing Systems, 2024 | | 2024 |