Yuhang Cao

Cited by

	All	Since 2020
Citations	5439	5382
h-index	13	13
i10-index	17	17

Year	Citations
2019	47
2020	280
2021	630
2022	1031
2023	1213
2024	1879
2025	347

Public access

1 article available
0 articles not available

Based on funding mandates

Yuhang Cao

MMLab, The Chinese University of Hong Kong

Verified email at ie.cuhk.edu.hk

Multi-Modal Large Language Model, Object Detection, Few Shot Object Detection


Title	Cited by	Year
MMDetection: Open mmlab detection toolbox and benchmark K Chen, J Wang, J Pang, Y Cao, Y Xiong, X Li, S Sun, W Feng, Z Liu, J Xu, ... arXiv preprint arXiv:1906.07155, 2019	3568	2019
Seesaw loss for long-tailed instance segmentation J Wang, W Zhang, Y Zang, Y Cao, J Pang, T Gong, K Chen, Z Liu, CC Loy, ... Proceedings of the IEEE/CVF conference on computer vision and pattern …, 2021	311	2021
Prime sample attention in object detection Y Cao, K Chen, CC Loy, D Lin Proceedings of the IEEE/CVF conference on computer vision and pattern …, 2020	274	2020
Internlm-xcomposer2: Mastering free-form text-image composition and comprehension in vision-language large model X Dong, P Zhang, Y Zang, Y Cao, B Wang, L Ouyang, X Wei, S Zhang, ... arXiv preprint arXiv:2401.16420, 2024	240	2024
Internlm-xcomposer: A vision-language large model for advanced text-image comprehension and composition P Zhang, X Dong, B Wang, Y Cao, C Xu, L Ouyang, Z Zhao, H Duan, ... arXiv preprint arXiv:2309.15112, 2023	201	2023
Side-aware boundary localization for more precise object detection J Wang, W Zhang, Y Cao, K Chen, J Pang, T Gong, J Shi, CC Loy, D Lin Computer Vision–ECCV 2020: 16th European Conference, Glasgow, UK, August 23 …, 2020	184	2020
MMDetection: Open mmlab detection toolbox and benchmark K Chen, J Wang, J Pang, Y Cao, Y Xiong, X Li, S Sun, W Feng, Z Liu, J Xu, ... arXiv preprint arXiv:1906.07155, 2019	134	2019
Internlm-xcomposer2-4khd: A pioneering large vision-language model handling resolutions from 336 pixels to 4k hd X Dong, P Zhang, Y Zang, Y Cao, B Wang, L Ouyang, S Zhang, H Duan, ... Advances in Neural Information Processing Systems 37, 42566-42592, 2024	118	2024
Few-shot object detection via association and discrimination Y Cao, J Wang, Y Jin, T Wu, K Chen, Z Liu, D Lin Advances in neural information processing systems 34, 16570-16581, 2021	116	2021
Internlm-xcomposer-2.5: A versatile large vision language model supporting long-contextual input and output P Zhang, X Dong, Y Zang, Y Cao, R Qian, L Chen, Q Guo, H Duan, ... arXiv preprint arXiv:2407.03320, 2024	86	2024
Feature pyramid grids K Chen, Y Cao, CC Loy, D Lin, C Feichtenhofer arXiv preprint arXiv:2004.03580, 2020	67	2020
V3det: Vast vocabulary visual detection dataset J Wang, P Zhang, T Chu, Y Cao, Y Zhou, T Wu, B Wang, C He, D Lin Proceedings of the IEEE/CVF International Conference on Computer Vision …, 2023	57	2023
Pyramiddrop: Accelerating your large vision-language models via pyramid visual redundancy reduction L Xing, Q Huang, X Dong, J Lu, P Zhang, Y Zang, Y Cao, C He, J Wang, ... arXiv preprint arXiv:2410.17247, 2024	13	2024
Wssod: A new pipeline for weakly- and semi-supervised object detection S Fang, Y Cao, X Wang, K Chen, D Lin, W Zhang arXiv preprint arXiv:2105.11293, 2021	13	2021
Mini: Mining implicit novel instances for few-shot object detection Y Cao, J Wang, Y Lin, D Lin arXiv preprint arXiv:2205.03381, 2022	12	2022
MMDetection: Open mmlab detection toolbox and benchmark K Chen, J Wang, J Pang, Y Cao, Yu Xiong, Xiaoxiao Li, Shuyang Sun, Wansen Feng, Ziwei Liu, Jiarui Xu, et al. arXiv preprint arXiv:1906.07155, 2019	12	2019
Dualfocus: Integrating macro and micro perspectives in multi-modal large language models Y Cao, P Zhang, X Dong, D Lin, J Wang arXiv preprint arXiv:2402.14767, 2024	11	2024
Mia-dpo: Multi-image augmented direct preference optimization for large vision-language models Z Liu, Y Zang, X Dong, P Zhang, Y Cao, H Duan, C He, Y Xiong, D Lin, ... arXiv preprint arXiv:2410.17637, 2024	7	2024
Internlm-xcomposer2.5-omnilive: A comprehensive multimodal system for long-term streaming video and audio interactions P Zhang, X Dong, Y Cao, Y Zang, R Qian, X Wei, L Chen, Y Li, J Niu, ... arXiv preprint arXiv:2412.09596, 2024	5	2024
Sam2long: Enhancing sam 2 for long video segmentation with a training-free memory tree S Ding, R Qian, X Dong, P Zhang, Y Zang, Y Cao, Y Guo, D Lin, J Wang arXiv preprint arXiv:2410.16268, 2024	4	2024

Articles 1–20
