Shizhe Chen

Cited by

	All	Since 2019
Citations	3260	3039
h-index	26	26
i10-index	48	48

Citations per year:
Year	2016	2017	2018	2019	2020	2021	2022	2023	2024
Citations	22	59	129	190	221	345	550	815	866

Public access

40 articles

10 articles

Available

Not available

Based on funding mandates

Co-authors

Qin Jin: School of Information, Renmin University of China (verified email at ruc.edu.cn)
Cordelia Schmid: Research director, INRIA (verified email at inria.fr)
Ivan Laptev: Professor at MBZUAI, on leave from INRIA (verified email at inria.fr)
Alex Hauptmann: Carnegie Mellon University (verified email at cs.cmu.edu)
Ruihua Song: Renmin University of China (verified email at ruc.edu.cn)


Shizhe Chen

INRIA Paris

Verified email at inria.fr

Computer Vision, Vision-and-Language


Title	Cited by	Year
Fine-grained video-text retrieval with hierarchical graph reasoning. S Chen, Y Zhao, Q Jin, Q Wu. Proceedings of the IEEE/CVF conference on computer vision and pattern …, 2020	359	2020
Say as you wish: Fine-grained control of image caption generation with abstract scene graphs. S Chen, Q Jin, P Wang, Q Wu. Proceedings of the IEEE/CVF conference on computer vision and pattern …, 2020	260	2020
Speech emotion recognition with acoustic and lexical features. Q Jin, C Li, S Chen, H Wu. 2015 IEEE international conference on acoustics, speech and signal …, 2015	218	2015
History aware multimodal transformer for vision-and-language navigation. S Chen, PL Guhur, C Schmid, I Laptev. Advances in neural information processing systems 34, 5834-5847, 2021	213	2021
Multimodal multi-task learning for dimensional and continuous emotion recognition. S Chen, Q Jin, J Zhao, S Wang. Proceedings of the 7th Annual Workshop on Audio/Visual Emotion Challenge, 19-26, 2017	169	2017
Multi-modal dimensional emotion recognition using recurrent neural networks. S Chen, Q Jin. Proceedings of the 5th International Workshop on Audio/Visual Emotion …, 2015	146	2015
Airbert: In-domain pretraining for vision-and-language navigation. PL Guhur, M Tapaswi, S Chen, I Laptev, C Schmid. Proceedings of the IEEE/CVF International Conference on Computer Vision …, 2021	145	2021
Think global, act local: Dual-scale graph transformer for vision-and-language navigation. S Chen, PL Guhur, M Tapaswi, C Schmid, I Laptev. Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern …, 2022	141	2022
WenLan: Bridging vision and language by large-scale multi-modal pre-training. Y Huo, M Zhang, G Liu, H Lu, Y Gao, G Yang, J Wen, H Zhang, B Xu, ... arXiv preprint arXiv:2103.06561, 2021	139	2021
Describing videos using multi-modal fusion. Q Jin, J Chen, S Chen, Y Xiong, A Hauptmann. Proceedings of the 24th ACM international conference on Multimedia, 1087-1091, 2016	119	2016
Elaborative rehearsal for zero-shot action recognition. S Chen, D Huang. Proceedings of the IEEE/CVF International Conference on Computer Vision …, 2021	114	2021
Instruction-driven history-aware policies for robotic manipulations. PL Guhur, S Chen, RG Pinel, M Tapaswi, I Laptev, C Schmid. Conference on Robot Learning, 175-187, 2023	96	2023
Multi-modal conditional attention fusion for dimensional emotion prediction. S Chen, Q Jin. Proceedings of the 24th ACM international conference on Multimedia, 571-575, 2016	82	2016
Sketch, ground, and refine: Top-down dense video captioning. C Deng, S Chen, D Chen, Y He, Q Wu. Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern …, 2021	75	2021
Video captioning with guidance of multimodal latent topics. S Chen, J Chen, Q Jin, A Hauptmann. Proceedings of the 25th ACM international conference on Multimedia, 1838-1846, 2017	74	2017
Multi-modal multi-cultural dimensional continues emotion recognition in dyadic interactions. J Zhao, R Li, S Chen, Q Jin. Proceedings of the 2018 on audio/visual emotion challenge and workshop, 65-72, 2018	57	2018
Few-shot action recognition with hierarchical matching and contrastive learning. S Zheng, S Chen, Q Jin. European Conference on Computer Vision, 297-313, 2022	53	2022
Language conditioned spatial relation reasoning for 3d object grounding. S Chen, PL Guhur, M Tapaswi, C Schmid, I Laptev. Advances in neural information processing systems 35, 20522-20535, 2022	50	2022
Unpaired cross-lingual image caption generation with self-supervised rewards. Y Song, S Chen, Y Zhao, Q Jin. Proceedings of the 27th ACM international conference on multimedia, 784-792, 2019	45	2019
Towards diverse paragraph captioning for untrimmed videos. Y Song, S Chen, Q Jin. Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern …, 2021	43	2021
