Peng Jin
PhD student, Peking University
Verified email at stu.pku.edu.cn
Title
Cited by
Year
Video-LLaVA: Learning United Visual Representation by Alignment Before Projection
B Lin, B Zhu, Y Ye, M Ning, P Jin, L Yuan
EMNLP 2024, 2024
Cited by: 292; Year: 2024
MoE-LLaVA: Mixture of Experts for Large Vision-Language Models
B Lin, Z Tang, Y Ye, J Cui, B Zhu, P Jin, J Zhang, M Ning, L Yuan
arXiv preprint arXiv:2401.15947, 2024
Cited by: 124; Year: 2024
Chat-UniVi: Unified Visual Representation Empowers Large Language Models with Image and Video Understanding
P Jin, R Takanobu, W Zhang, X Cao, L Yuan
CVPR 2024 Highlight, 13700-13710, 2024
Cited by: 99; Year: 2024
SPHINX-X: Scaling Data and Parameters for a Family of Multi-modal Large Language Models
D Liu, R Zhang, L Qiu, S Huang, W Lin, S Zhao, S Geng, Z Lin, P Jin, ...
ICML 2024, 2024
Cited by: 80; Year: 2024
Expectation-Maximization Contrastive Learning for Compact Video-and-Language Representations
P Jin, J Huang, F Liu, X Wu, S Ge, G Song, D Clifton, J Chen
NeurIPS 2022 Spotlight, vol. 35, 30291-30306, 2022
Cited by: 61; Year: 2022
Video-Text as Game Players: Hierarchical Banzhaf Interaction for Cross-Modal Representation Learning
P Jin, J Huang, P Xiong, S Tian, C Liu, X Ji, L Yuan, J Chen
CVPR 2023 Highlight, 2472-2482, 2023
Cited by: 60; Year: 2023
DiffusionRet: Generative Text-Video Retrieval with Diffusion Model
P Jin, H Li, Z Cheng, K Li, X Ji, C Liu, L Yuan, J Chen
ICCV 2023, 2470-2481, 2023
Cited by: 47; Year: 2023
Weakly-Supervised 3D Spatial Reasoning for Text-based Visual Question Answering
H Li, J Huang, P Jin, G Song, Q Wu, J Chen
IEEE Transactions on Image Processing, 2023
Cited by: 34*; Year: 2023
Text-Video Retrieval with Disentangled Conceptualization and Set-to-Set Alignment
P Jin, H Li, Z Cheng, J Huang, Z Wang, L Yuan, C Liu, J Chen
IJCAI 2023, 938-946, 2023
Cited by: 30; Year: 2023
Act As You Wish: Fine-Grained Control of Motion Diffusion Model with Hierarchical Semantic Graphs
P Jin, Y Wu, Y Fan, Z Sun, W Yang, L Yuan
NeurIPS 2023, 2023
Cited by: 20; Year: 2023
Parallel Vertex Diffusion for Unified Visual Grounding
Z Cheng, K Li, P Jin, X Ji, L Yuan, C Liu, J Chen
AAAI 2024, 1326-1334, 2024
Cited by: 18; Year: 2024
Repaint123: Fast and High-quality One Image to 3D Generation with Progressive Controllable 2D Repainting
J Zhang, Z Tang, Y Pang, X Cheng, P Jin, Y Wei, W Yu, M Ning, L Yuan
ECCV 2024, 2024
Cited by: 14; Year: 2024
TG-VQA: Ternary Game of Video Question Answering
H Li, P Jin, Z Cheng, S Zhang, K Chen, Z Wang, C Liu, J Chen
IJCAI 2023, 1044-1052, 2023
Cited by: 13; Year: 2023
LOOK-M: Look-Once Optimization in KV Cache for Efficient Multimodal Long-Context Inference
Z Wan, Z Wu, C Liu, J Huang, Z Zhu, P Jin, L Wang, L Yuan
EMNLP 2024 Findings, 2024
Cited by: 10; Year: 2024
Multi-granularity Interaction Simulation for Unsupervised Interactive Segmentation
K Li, Y Zhao, Z Wang, Z Cheng, P Jin, X Ji, L Yuan, C Liu, J Chen
ICCV 2023, 666-676, 2023
Cited by: 8; Year: 2023
LLMBind: A Unified Modality-Task Integration Framework
B Zhu, P Jin, M Ning, B Lin, J Huang, Q Song, M Pan, L Yuan
arXiv preprint arXiv:2402.14891, 2024
Cited by: 6; Year: 2024
RAP: Efficient Text-Video Retrieval with Sparse-and-Correlated Adapter
M Cao, H Tang, J Huang, P Jin, C Zhang, R Liu, L Chen, X Liang, L Yuan, ...
ACL 2024 Findings, 2024
Cited by: 4; Year: 2024
FreestyleRet: Retrieving Images from Style-Diversified Queries
H Li, C Jia, P Jin, Z Cheng, K Li, J Sui, C Liu, L Yuan
ECCV 2024, 2024
Cited by: 4; Year: 2024
WiCo: Win-win Cooperation of Bottom-up and Top-down Referring Image Segmentation
Z Cheng, P Jin, H Li, K Li, S Li, X Ji, C Liu, J Chen
IJCAI 2023, 636-644, 2023
Cited by: 4; Year: 2023
Instance Brownian Bridge as Texts for Open-vocabulary Video Instance Segmentation
Z Cheng, K Li, H Li, P Jin, C Liu, X Zheng, R Ji, J Chen
arXiv preprint arXiv:2401.09732, 2024
Cited by: 2; Year: 2024