دنبال کردن
Yinan He
Yinan He
Shanghai Al Laboratory
ایمیل تأیید شده در pjlab.org.cn
عنوان
نقل شده توسط
نقل شده توسط
سال
Videochat: Chat-centric video understanding
KC Li, Y He, Y Wang, Y Li, W Wang, P Luo, Y Wang, L Wang, Y Qiao
arXiv preprint arXiv:2305.06355, 2023
4632023
Videomae v2: Scaling video masked autoencoders with dual masking
L Wang, B Huang, Z Zhao, Z Tong, Y He, Y Wang, Y Wang, Y Qiao
Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern …, 2023
3142023
Internvideo: General video foundation models via generative and discriminative learning
Y Wang, K Li, Y Li, Y He, B Huang, Z Zhao, H Zhang, J Xu, Y Liu, Z Wang, ...
arXiv preprint arXiv:2212.03191, 2022
2812022
Lavie: High-quality video generation with cascaded latent diffusion models
Y Wang, X Chen, X Ma, S Zhou, Z Huang, Y Wang, C Yang, Y He, J Yu, ...
arXiv preprint arXiv:2309.15103, 2023
1582023
Internvid: A large-scale video-text dataset for multimodal understanding and generation
Y Wang, Y He, Y Li, K Li, J Yu, X Ma, X Li, G Chen, X Chen, Y Wang, C He, ...
arXiv preprint arXiv:2307.06942, 2023
1562023
Mvbench: A comprehensive multi-modal video understanding benchmark
K Li, Y Wang, Y He, Y Li, Y Wang, Y Liu, Z Wang, J Xu, G Chen, P Luo, ...
Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern …, 2024
1522024
Forgerynet: A versatile benchmark for comprehensive forgery analysis
Y He, B Gan, S Chen, Y Zhou, G Yin, L Song, L Sheng, J Shao, Z Liu
Proceedings of the IEEE/CVF conference on computer vision and pattern …, 2021
1462021
Unmasked teacher: Towards training-efficient video foundation models
K Li, Y Wang, Y Li, Y Wang, Y He, L Wang, Y Qiao
Proceedings of the IEEE/CVF International Conference on Computer Vision …, 2023
1242023
Vbench: Comprehensive benchmark suite for video generative models
Z Huang, Y He, J Yu, F Zhang, C Si, Y Jiang, Y Zhang, T Wu, Q Jin, ...
Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern …, 2024
1162024
Uniformerv2: Spatiotemporal learning by arming image vits with video uniformer
K Li, Y Wang, Y He, Y Li, Y Wang, L Wang, Y Qiao
arXiv preprint arXiv:2211.09552, 2022
1142022
Videomamba: State space model for efficient video understanding
K Li, X Li, Y Wang, Y He, Y Wang, L Wang, Y Qiao
European Conference on Computer Vision, 237-255, 2025
1052025
Interngpt: Solving vision-centric tasks by interacting with chatgpt beyond language
Z Liu, Y He, W Wang, W Wang, Y Wang, S Chen, Q Zhang, Z Lai, Y Yang, ...
arXiv preprint arXiv:2305.05662, 2023
772023
Internvideo2: Scaling video foundation models for multimodal video understanding
Y Wang, K Li, X Li, J Yu, Y He, G Chen, B Pei, R Zheng, J Xu, Z Wang, ...
arXiv preprint arXiv:2403.15377, 2024
602024
Internvideo-ego4d: A pack of champion solutions to ego4d challenges
G Chen, S Xing, Z Chen, Y Wang, K Li, Y Li, Y Liu, J Wang, YD Zheng, ...
arXiv preprint arXiv:2211.09529, 2022
402022
Uniformerv2: Unlocking the potential of image vits for video understanding
K Li, Y Wang, Y He, Y Li, Y Wang, L Wang, Y Qiao
Proceedings of the IEEE/CVF International Conference on Computer Vision …, 2023
382023
Intern: A new learning paradigm towards general vision
J Shao, S Chen, Y Li, K Wang, Z Yin, Y He, J Teng, Q Sun, M Gao, J Liu, ...
arXiv preprint arXiv:2111.08687, 2021
342021
From gpt-4 to gemini and beyond: Assessing the landscape of mllms on generalizability, trustworthiness and causality through four modalities
C Lu, C Qian, G Zheng, H Fan, H Gao, J Zhang, J Shao, J Deng, J Fu, ...
arXiv preprint arXiv:2401.15071, 2024
122024
X-learner: Learning cross sources and tasks for universal visual representation
Y He, G Huang, S Chen, J Teng, K Wang, Z Yin, L Sheng, Z Liu, Y Qiao, ...
European Conference on Computer Vision, 509-528, 2022
62022
OmniCorpus: An Unified Multimodal Corpus of 10 Billion-Level Images Interleaved with Text
Q Li, Z Chen, W Wang, W Wang, S Ye, Z Jin, G Chen, Y He, Z Gao, E Cui, ...
arXiv preprint arXiv:2406.08418, 2024
52024
Harvest video foundation models via efficient post-pretraining
Y Li, K Li, Y He, Y Wang, Y Wang, L Wang, Y Qiao, P Luo
arXiv preprint arXiv:2310.19554, 2023
22023
سیستم در حال حاضر قادر به انجام عملکرد نیست. بعداً دوباره امتحان کنید.
مقاله‌ها 1–20