FacTool: Factuality Detection in Generative AI--A Tool Augmented Framework for Multi-Task and Multi-Domain Scenarios IC Chern, S Chern, S Chen, W Yuan, K Feng, C Zhou, J He, G Neubig, ... arXiv preprint arXiv:2307.13528, 2023 | 138 | 2023 |
FELM: Benchmarking Factuality Evaluation of Large Language Models S Chen, Y Zhao, J Zhang, IC Chern, S Gao, P Liu, J He NeurIPS 2023, 2023 | 57 | 2023 |
Alignment for Honesty Y Yang, E Chern, X Qiu, G Neubig, P Liu NeurIPS 2024, 2023 | 40 | 2023 |
Generative AI for Math: Abel E Chern, H Zou, X Li, J Hu, K Feng, J Li, P Liu https://github.com/GAIR-NLP/abel, 2023 | 21 | 2023 |
Decoding of quantum data-syndrome codes via belief propagation KY Kuo, IC Chern, CY Lai ISIT 2021, 2021 | 17 | 2021 |
Audio-visual speech enhancement and separation by utilizing multi-modal self-supervised embeddings IC Chern, KH Hung, YT Chen, T Hussain, M Gogate, A Hussain, Y Tsao, ... ICASSPW 2023, 2023 | 15* | 2023 |
Reformatted Alignment RZ Fan, X Li, H Zou, J Li, S He, E Chern, J Hu, P Liu EMNLP 2024, 2024 | 13 | 2024 |
Can Large Language Models be Trusted for Evaluation? Scalable Meta-Evaluation of LLMs as Evaluators via Agent Debate S Chern, E Chern, G Neubig, P Liu arXiv preprint arXiv:2401.16788, 2024 | 12 | 2024 |
Align on the Fly: Adapting Chatbot Behavior to Established Norms C Xu, S Chern, E Chern, G Zhang, Z Wang, R Liu, J Li, J Fu, P Liu arXiv preprint arXiv:2312.15907, 2023 | 12 | 2023 |
OlympicArena: Benchmarking Multi-discipline Cognitive Reasoning for Superintelligent AI Z Huang, Z Wang, S Xia, X Li, H Zou, R Xu, RZ Fan, L Ye, E Chern, Y Ye, ... NeurIPS 2024, 2024 | 10* | 2024 |
ANOLE: An Open, Autoregressive, Native Large Multimodal Models for Interleaved Image-Text Generation E Chern*, J Su*, Y Ma*, P Liu arXiv preprint arXiv:2407.06135, 2024 | 7 | 2024 |
Improving Factuality of Abstractive Summarization via Contrastive Reward Learning IC Chern, Z Wang, S Das, B Sharma, P Liu, G Neubig The Third Workshop on Trustworthy Natural Language Processing @ ACL 2023, 2023 | 7* | 2023 |
BeHonest: Benchmarking Honesty of Large Language Models S Chern, Z Hu, Y Yang, E Chern, Y Guo, J Jin, B Wang, P Liu arXiv preprint arXiv:2406.13261, 2024 | 2 | 2024 |
ChineseFactEval: A Factuality Benchmark for Chinese LLMs B Wang, E Chern, P Liu https://gair-nlp.github.io/ChineseFactEval/, 2023 | 2 | 2023 |
Halu-J: Critique-Based Hallucination Judge B Wang, S Chern, E Chern, P Liu arXiv preprint arXiv:2407.12943, 2024 | | 2024 |
Voice Direction-Of-Arrival Conversion IC Chern, S Chern, HC Kuo, HH Tseng, KH Hung, Y Tsao MLSP 2023, 2023 | | 2023 |