LLM-based NLG evaluation: Current status and challenges M Gao, X Hu, J Ruan, X Pu, X Wan arXiv preprint arXiv:2402.01383, 2024 | 84 | 2024 |
Solving inverse problems with latent diffusion models via hard data consistency B Song, SM Kwon, Z Zhang, X Hu, Q Qu, L Shen arXiv preprint arXiv:2307.08123, 2023 | 78 | 2023 |
Are LLM-based Evaluators Confusing NLG Quality Criteria? X Hu, M Gao, S Hu, Y Zhang, Y Chen, T Xu, X Wan arXiv preprint arXiv:2402.12055, 2024 | 18 | 2024 |
Evoke: Evoking critical thinking abilities in LLMs via reviewer-author prompt editing X Hu, P Tang, S Zuo, Z Wang, B Song, Q Lou, J Jiao, D Charles arXiv preprint arXiv:2310.13855, 2023 | 7 | 2023 |
Themis: A reference-free NLG evaluation language model with flexibility and interpretability X Hu, L Lin, M Gao, X Yin, X Wan arXiv preprint arXiv:2406.18365, 2024 | 6 | 2024 |
MC-MKE: A fine-grained multimodal knowledge editing benchmark emphasizing modality consistency J Zhang, H Zhang, X Yin, B Huang, X Zhang, X Hu, X Wan arXiv preprint arXiv:2406.13219, 2024 | 5 | 2024 |
RST discourse parsing as text-to-text generation X Hu, X Wan IEEE/ACM Transactions on Audio, Speech, and Language Processing 31, 3278-3289, 2023 | 4 | 2023 |
DeepTagger: Knowledge enhanced named entity recognition for web-based ads queries S Zuo, P Tang, X Hu, Q Lou, J Jiao, D Charles Proceedings of the 32nd ACM International Conference on Information and …, 2023 | 3 | 2023 |
Analyzing and Evaluating Correlation Measures in NLG Meta-Evaluation M Gao, X Hu, L Lin, X Wan arXiv preprint arXiv:2410.16834, 2024 | 2 | 2024 |
Task Oriented In-Domain Data Augmentation X Liang, X Hu, S Zuo, Y Gong, Q Lou, Y Liu, SL Huang, J Jiao Proceedings of the 2024 Conference on Empirical Methods in Natural Language …, 2024 | 2 | 2024 |
Themis: Towards flexible and interpretable NLG evaluation X Hu, L Lin, M Gao, X Yin, X Wan arXiv e-prints, arXiv:2406.18365, 2024 | 2 | 2024 |
Exploring context-aware evaluation metrics for machine translation X Hu, X Yin, X Wan Findings of the Association for Computational Linguistics: EMNLP 2023, 15291 …, 2023 | 2 | 2023 |
Exploring discourse structure in document-level machine translation X Hu, X Wan Proceedings of the 2023 Conference on Empirical Methods in Natural Language …, 2023 | 2 | 2023 |
Error-Robust Retrieval for Chinese Spelling Check X Yin, X Hu, J Jiang, X Wan arXiv preprint arXiv:2211.07843, 2022 | 2 | 2022 |
Chinese spelling check with nearest neighbors X Yin, X Hu, X Wan arXiv preprint arXiv:2211.07843, 2022 | 2 | 2022 |
What You See Is What Matters: A Novel Visual and Physics-Based Metric for Evaluating Video Generation Quality Z Wang, S Li, L Hao, X Hu, B Song arXiv preprint arXiv:2411.13609, 2024 | 1 | 2024 |
Exploring the Multilingual NLG Evaluation Abilities of LLM-Based Evaluators J Chang, M Gao, X Hu, X Wan arXiv preprint arXiv:2503.04360, 2025 | | 2025 |
Aspect-Guided Multi-Level Perturbation Analysis of Large Language Models in Automated Peer Review J Li, Y Li, X Hu, M Gao, X Wan arXiv preprint arXiv:2502.12510, 2025 | | 2025 |
A Dual-Perspective NLG Meta-Evaluation Framework with Automatic Benchmark and Better Interpretability X Hu, M Gao, L Lin, Z Yu, X Wan arXiv preprint arXiv:2502.12052, 2025 | | 2025 |
Re-evaluating Automatic LLM System Ranking for Alignment with Human Preference M Gao, Y Liu, X Hu, X Wan, J Bragg, A Cohan arXiv preprint arXiv:2501.00560, 2024 | | 2024 |