PandaLM: An automatic evaluation benchmark for LLM instruction tuning optimization Y Wang, Z Yu, Z Zeng, L Yang, C Wang, H Chen, C Jiang, R Xie, J Wang, ... arXiv preprint arXiv:2306.05087, 2023 | 211 | 2023 |
Hallucination Augmented Contrastive Learning for Multimodal Large Language Model C Jiang, H Xu, M Dong, J Chen, W Ye, M Yan, Q Ye, J Zhang, F Huang, ... IEEE/CVF Conference on Computer Vision and Pattern Recognition 2024, 2023 | 83 | 2023 |
Hal-eval: A universal and fine-grained hallucination evaluation framework for large vision language models C Jiang, W Ye, M Dong, H Jia, H Xu, M Yan, J Zhang, S Zhang Proceedings of the 32nd ACM International Conference on Multimedia (MM '24), 2024 | 16 | 2024 |
Similarity learning for cover song identification using cross-similarity matrices of multi-level deep sequences C Jiang, D Yang, X Chen ICASSP 2020-2020 IEEE International Conference on Acoustics, Speech and …, 2020 | 15 | 2020 |
TRIPS: Efficient vision-and-language pre-training with text-relevant image patch selection C Jiang, H Xu, C Li, M Yan, W Ye, S Zhang, B Bi, S Huang Proceedings of the 2022 Conference on Empirical Methods in Natural Language …, 2022 | 14 | 2022 |
COPA: Efficient vision-language pre-training through collaborative object- and patch-text alignment C Jiang, H Xu, W Ye, Q Ye, C Li, M Yan, B Bi, S Zhang, F Huang, J Zhang Proceedings of the 31st ACM International Conference on Multimedia, 4480-4491, 2023 | 13 | 2023 |
Exploiting Pseudo Image Captions for Multimodal Summarization C Jiang, R Xie, W Ye, J Sun, S Zhang Findings of the Association for Computational Linguistics: ACL 2023, 161–175, 2023 | 13 | 2023 |
Vision Language Pre-training by Contrastive Learning with Cross-Modal Similarity Regulation C Jiang, W Ye, H Xu, S Zhang, J Zhang, F Huang Proceedings of the 61st Annual Meeting of the Association for Computational …, 2023 | 10 | 2023 |
MIBench: Evaluating multimodal large language models over multiple images H Liu, X Zhang, H Xu, Y Shi, C Jiang, M Yan, J Zhang, F Huang, C Yuan, ... arXiv preprint arXiv:2407.15272, 2024 | 9 | 2024 |
BUS: Efficient and Effective Vision-language Pre-training with Bottom-Up Patch Summarization. C Jiang, H Xu, W Ye, Q Ye, C Li, M Yan, B Bi, S Zhang, F Huang, S Huang Proceedings of the IEEE/CVF International Conference on Computer Vision …, 2023 | 6 | 2023 |
PandaLM: Reproducible and automated language model assessment Y Wang, Z Yu, Z Zeng, L Yang, Q Heng, C Wang, H Chen, C Jiang, R Xie, ... | 5 | 2023 |
Learn a robust representation for cover song identification via aggregating local and global music temporal context C Jiang, D Yang, X Chen 2020 IEEE International Conference on Multimedia and Expo (ICME), 1-6, 2020 | 5 | 2020 |
PandaLM: Reproducible and automated language model assessment W Yidong, Y Zhuohao, Z Zhengran, Y Linyi, H Qiang, W Cunxiang, C Hao, ... | 4 | 2023 |
TiMix: Text-aware Image Mixing for Effective Vision-Language Pre-training C Jiang, W Ye, H Xu, Q Ye, M Yan, J Zhang, S Zhang Proceedings of the AAAI Conference on Artificial Intelligence 2024, 2023 | 3 | 2023 |
Enhancing In-Context Learning via Implicit Demonstration Augmentation X Zhou, W Ye, Y Wang, C Jiang, Z Lee, R Xie, S Zhang arXiv preprint arXiv:2407.00100, 2024 | 2 | 2024 |
SymDPO: Boosting In-Context Learning of Large Multimodal Models with Symbol Demonstration Direct Preference Optimization H Jia^, C Jiang^(Equal Contribution), H Xu, W Ye, M Dong, M Yan, ... IEEE/CVF Conference on Computer Vision and Pattern Recognition 2025, 2024 | 1 | 2024 |
MaVEn: An Effective Multi-granularity Hybrid Visual Encoding Framework for Multimodal Large Language Model C Jiang, J Hongrui, H Xu, W Ye, M Dong, M Yan, J Zhang, F Huang, ... The Thirty-Eighth Annual Conference on Neural Information Processing Systems, 2024 | 1 | 2024 |
Efficient Vision-and-Language Pre-training with Text-Relevant Image Patch Selection W Ye, C Jiang, H Xu, C Ye, C Li, M Yan, S Zhang, S Huang, F Huang arXiv preprint arXiv:2403.07883, 2024 | | 2024 |