Peng Jin

Cited by

	All	Since 2019
Citations	931	931
h-index	13	13
i10-index	14	14

Citations per year: 5 (2022), 106 (2023), 782 (2024)

Public access

7 articles available, 0 articles not available (based on funding mandates)

Co-authors

Li Yuan, 袁粒 - Peking University, School of ECE, Shenzhen Graduate School - Verified email at pku.edu.cn
Jie Chen - Peking University, Peng Cheng Laboratory - Verified email at pku.edu.cn
Li Hao - PhD candidate of computer science, Peking University - Verified email at pku.edu.cn
Jinfa Huang - University of Rochester, Peking University - Verified email at ur.rochester.edu
Zesen Cheng - Peking University - Verified email at stu.pku.edu.cn
Kehan Li - Peking University Shenzhen Graduate School - Verified email at stu.pku.edu.cn
Bin Zhu - Peking University - Verified email at stu.pku.edu.cn
Bin Lin, 林彬 - Master student, Peking University - Verified email at stu.pku.edu.cn
Guoli Song - Peng Cheng Laboratory - Verified email at pcl.ac.cn
Yatian Pang - National University of Singapore - Verified email at u.nus.edu
Fenglin Liu - University of Oxford - Verified email at eng.ox.ac.uk
Shuicheng Yan, Fellow of AAAI, ACM,... - Kunlun 2050 Research & Skywork AI, previously Sea AI Lab, Singapore - Verified email at kunlun-inc.com
Runyi Yu - PhD student at HKUST - Verified email at connect.ust.hk

Peng Jin

PhD student, Peking University

Verified email at stu.pku.edu.cn

Vision and Language, Multimodal LLM, Cross-modal Retrieval


Title	Authors	Venue	Cited by	Year
Video-LLaVA: Learning United Visual Representation by Alignment Before Projection	B Lin, B Zhu, Y Ye, M Ning, P Jin, L Yuan	EMNLP 2024, 2024	292	2024
MoE-LLaVA: Mixture of Experts for Large Vision-Language Models	B Lin, Z Tang, Y Ye, J Cui, B Zhu, P Jin, J Zhang, M Ning, L Yuan	arXiv preprint arXiv:2401.15947, 2024	124	2024
Chat-UniVi: Unified Visual Representation Empowers Large Language Models with Image and Video Understanding	P Jin, R Takanobu, W Zhang, X Cao, L Yuan	CVPR 2024 Highlight, 13700-13710, 2024	99	2024
SPHINX-X: Scaling Data and Parameters for a Family of Multi-modal Large Language Models	D Liu, R Zhang, L Qiu, S Huang, W Lin, S Zhao, S Geng, Z Lin, P Jin, ...	ICML 2024, 2024	80	2024
Expectation-Maximization Contrastive Learning for Compact Video-and-Language Representations	P Jin, J Huang, F Liu, X Wu, S Ge, G Song, D Clifton, J Chen	NeurIPS 2022 Spotlight 35, 30291-30306, 2022	61	2022
Video-Text as Game Players: Hierarchical Banzhaf Interaction for Cross-Modal Representation Learning	P Jin, J Huang, P Xiong, S Tian, C Liu, X Ji, L Yuan, J Chen	CVPR 2023 Highlight, 2472-2482, 2023	60	2023
DiffusionRet: Generative Text-Video Retrieval with Diffusion Model	P Jin, H Li, Z Cheng, K Li, X Ji, C Liu, L Yuan, J Chen	ICCV 2023, 2470-2481, 2023	47	2023
Weakly-Supervised 3D Spatial Reasoning for Text-based Visual Question Answering	H Li, J Huang, P Jin, G Song, Q Wu, J Chen	IEEE Transactions on Image Processing, 2023	34*	2023
Text-Video Retrieval with Disentangled Conceptualization and Set-to-Set Alignment	P Jin, H Li, Z Cheng, J Huang, Z Wang, L Yuan, C Liu, J Chen	IJCAI 2023, 938-946, 2023	30	2023
Act As You Wish: Fine-Grained Control of Motion Diffusion Model with Hierarchical Semantic Graphs	P Jin, Y Wu, Y Fan, Z Sun, W Yang, L Yuan	NeurIPS 2023, 2023	20	2023
Parallel Vertex Diffusion for Unified Visual Grounding	Z Cheng, K Li, P Jin, X Ji, L Yuan, C Liu, J Chen	AAAI 2024, 1326-1334, 2024	18	2024
Repaint123: Fast and High-quality One Image to 3D Generation with Progressive Controllable 2D Repainting	J Zhang, Z Tang, Y Pang, X Cheng, P Jin, Y Wei, W Yu, M Ning, L Yuan	ECCV 2024, 2024	14	2024
TG-VQA: Ternary Game of Video Question Answering	H Li, P Jin, Z Cheng, S Zhang, K Chen, Z Wang, C Liu, J Chen	IJCAI 2023, 1044-1052, 2023	13	2023
LOOK-M: Look-Once Optimization in KV Cache for Efficient Multimodal Long-Context Inference	Z Wan, Z Wu, C Liu, J Huang, Z Zhu, P Jin, L Wang, L Yuan	EMNLP 2024 Findings, 2024	10	2024
Multi-granularity Interaction Simulation for Unsupervised Interactive Segmentation	K Li, Y Zhao, Z Wang, Z Cheng, P Jin, X Ji, L Yuan, C Liu, J Chen	ICCV 2023, 666-676, 2023	8	2023
LLMBind: A Unified Modality-Task Integration Framework	B Zhu, P Jin, M Ning, B Lin, J Huang, Q Song, M Pan, L Yuan	arXiv preprint arXiv:2402.14891, 2024	6	2024
RAP: Efficient Text-Video Retrieval with Sparse-and-Correlated Adapter	M Cao, H Tang, J Huang, P Jin, C Zhang, R Liu, L Chen, X Liang, L Yuan, ...	ACL 2024 Findings, 2024	4	2024
FreestyleRet: Retrieving Images from Style-Diversified Queries	H Li, C Jia, P Jin, Z Cheng, K Li, J Sui, C Liu, L Yuan	ECCV 2024, 2024	4	2024
WiCo: Win-win Cooperation of Bottom-up and Top-down Referring Image Segmentation	Z Cheng, P Jin, H Li, K Li, S Li, X Ji, C Liu, J Chen	IJCAI 2023, 636-644, 2023	4	2023
Instance Brownian Bridge as Texts for Open-vocabulary Video Instance Segmentation	Z Cheng, K Li, H Li, P Jin, C Liu, X Zheng, R Ji, J Chen	arXiv preprint arXiv:2401.09732, 2024	2	2024
