Kunchang Li
Title
Cited by
Year
Tip-adapter: Training-free clip-adapter for better vision-language modeling
R Zhang, R Fang, P Gao, W Zhang, K Li, J Dai, Y Qiao, H Li
ECCV2022, 2021
639* · 2021
VideoChat: Chat-Centric Video Understanding
K Li, Y He, Y Wang, Y Li, W Wang, P Luo, Y Wang, L Wang, Y Qiao
arXiv preprint arXiv:2305.06355, 2023
462 · 2023
PointCLIP: Point Cloud Understanding by CLIP
R Zhang, Z Guo, W Zhang, K Li, X Miao, B Cui, Y Qiao, P Gao, H Li
CVPR2022, 2021
409 · 2021
UniFormer: Unifying convolution and self-attention for visual recognition
K Li, Y Wang, J Zhang, P Gao, G Song, Y Liu, H Li, Y Qiao
TPAMI, 2022
356 · 2022
InternVideo: General Video Foundation Models via Generative and Discriminative Learning
Y Wang*, K Li*, Y Li*, Y He*, B Huang*, Z Zhao*, H Zhang, J Xu, Y Liu, ...
arXiv preprint arXiv:2212.03191, 2022
281 · 2022
UniFormer: Unified Transformer for Efficient Spatiotemporal Representation Learning
K Li, Y Wang, P Gao, G Song, Y Liu, H Li, Y Qiao
ICLR2022, 2022
279 · 2022
Illumination Adaptive Transformer
Z Cui, K Li, L Gu, S Su, P Gao, Z Jiang, Y Qiao, T Harada
BMVC2022, 2022
169* · 2022
Grounded sam: Assembling open-world models for diverse visual tasks
T Ren, S Liu, A Zeng, J Lin, K Li, H Cao, J Chen, X Huang, Y Chen, F Yan, ...
ICCV2023 Demo, 2024
168 · 2024
Internvid: A large-scale video-text dataset for multimodal understanding and generation
Y Wang, Y He, Y Li, K Li, J Yu, X Ma, X Li, G Chen, X Chen, Y Wang, C He, ...
ICLR2024, 2023
155 · 2023
Uniformerv2: Spatiotemporal learning by arming image vits with video uniformer
K Li, Y Wang, Y He, Y Li, Y Wang, L Wang, Y Qiao
ICCV2023, 2022
152* · 2022
Mvbench: A comprehensive multi-modal video understanding benchmark
K Li, Y Wang, Y He, Y Li, Y Wang, Y Liu, Z Wang, J Xu, G Chen, P Luo, ...
CVPR2024, 2023
150 · 2023
Unmasked teacher: Towards training-efficient video foundation models
K Li, Y Wang, Y Li, Y Wang, Y He, L Wang, Y Qiao
ICCV2023 Oral, 2023
124 · 2023
Videomamba: State space model for efficient video understanding
K Li, X Li, Y Wang, Y He, Y Wang, L Wang, Y Qiao
ECCV2024, 2024
105 · 2024
Interngpt: Solving vision-centric tasks by interacting with chatgpt beyond language
Z Liu, Y He, W Wang, W Wang, Y Wang, S Chen, Q Zhang, Z Lai, Y Yang, ...
arXiv preprint arXiv:2305.05662, 2023
77 · 2023
CT-Net: Channel tensorization network for video classification
K Li, X Li, Y Wang, J Wang, Y Qiao
ICLR2021, 2021
71 · 2021
InternVideo2: Scaling Video Foundation Models for Multimodal Video Understanding
Y Wang*, K Li*, X Li*, J Yu*, Y He*, G Chen, B Pei, R Zheng, J Xu, Z Wang, ...
ECCV2024, 2024
59 · 2024
MorphMLP: A Self-Attention Free, MLP-Like Backbone for Image and Video
DJ Zhang*, K Li*, Y Chen, Y Wang, S Chandra, Y Qiao, L Liu, MZ Shou
ECCV2022, 2021
56* · 2021
Video mamba suite: State space model as a versatile alternative for video understanding
G Chen, Y Huang, J Xu, B Pei, Z Chen, Z Li, J Wang, K Li, T Lu, L Wang
arXiv preprint arXiv:2403.09626, 2024
47 · 2024
InternVideo-Ego4D: A Pack of Champion Solutions to Ego4D Challenges
G Chen, S Xing, Z Chen, Y Wang, K Li, Y Li, Y Liu, J Wang, YD Zheng, ...
ECCVW2022, 2022
40 · 2022
Self-slimmed vision transformer
Z Zong*, K Li*, G Song, Y Wang, Y Qiao, B Leng, Y Liu
ECCV2022, 2022
22 · 2022
Articles 1–20