Fengyun RAO
Tencent
Verified email at tencent.com
Title
Cited by
Year
Clip4caption: Clip for video caption
M Tang, Z Wang, Z Liu, F Rao, D Li, X Li
Proceedings of the 29th ACM International Conference on Multimedia, 4858-4862, 2021
Cited by: 139 | Year: 2021
Discovery of millihertz X-ray oscillations in a transient ultraluminous X-ray source in M82
H Feng, F Rao, P Kaaret
The Astrophysical Journal Letters 710 (2), L137, 2010
Cited by: 58 | Year: 2010
Detection of strong short-term variability in NGC 6946 X-1
F Rao, H Feng, P Kaaret
The Astrophysical Journal 722 (1), 620, 2010
Cited by: 40 | Year: 2010
Low-frequency oscillations in XTE J1550−564
F Rao, T Belloni, L Stella, SN Zhang, T Li
The Astrophysical Journal 714 (2), 1065, 2010
Cited by: 36 | Year: 2010
Ca-ssl: Class-agnostic semi-supervised learning for detection and segmentation
L Qi, J Kuen, Z Lin, J Gu, F Rao, D Li, W Guo, Z Wen, MH Yang, J Jia
European Conference on Computer Vision, 59-77, 2022
Cited by: 15 | Year: 2022
Inter-x: Towards versatile human-human interaction analysis
L Xu, X Lv, Y Yan, X Jin, S Wu, C Xu, Y Liu, Y Zhou, F Rao, X Sheng, Y Liu, ...
Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern …, 2024
Cited by: 11 | Year: 2024
Multi-task multi-head attention memory network for fine-grained sentiment analysis
Z Dai, W Dai, Z Liu, F Rao, H Chen, G Zhang, Y Ding, J Liu
CCF International Conference on Natural Language Processing and Chinese …, 2019
Cited by: 11 | Year: 2019
Tencent-mvse: A large-scale benchmark dataset for multi-modal video similarity evaluation
Z Zeng, Y Luo, Z Liu, F Rao, D Li, W Guo, Z Wen
Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern …, 2022
Cited by: 10 | Year: 2022
Clip4caption++: Multi-clip for video caption
M Tang, Z Wang, Z Zeng, F Rao, D Li
arXiv preprint arXiv:2110.05204, 2021
Cited by: 9 | Year: 2021
Image captioning with multi-context synthetic data
F Ma, Y Zhou, F Rao, Y Zhang, X Sun
Proceedings of the AAAI Conference on Artificial Intelligence 38 (5), 4089-4097, 2024
Cited by: 7 | Year: 2024
ReGenNet: Towards Human Action-Reaction Synthesis
L Xu, Y Zhou, Y Yan, X Jin, W Zhu, F Rao, X Yang, W Zeng
Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern …, 2024
Cited by: 5 | Year: 2024
A similarity alignment model for video copy segment matching
Z Liu, F Ma, T Wang, F Rao
arXiv preprint arXiv:2305.15679, 2023
Cited by: 4 | Year: 2023
Visual Perception by Large Language Model's Weights
F Ma, H Xue, G Wang, Y Zhou, F Rao, S Yan, Y Zhang, S Wu, MZ Shou, ...
arXiv preprint arXiv:2405.20339, 2024
Cited by: 3 | Year: 2024
A dual-level detection method for video copy detection
T Wang, F Ma, Z Liu, F Rao
arXiv preprint arXiv:2305.12361, 2023
Cited by: 3 | Year: 2023
MMAR: Towards Lossless Multi-Modal Auto-Regressive Probabilistic Modeling
J Yang, D Yin, Y Zhou, F Rao, W Zhai, Y Cao, ZJ Zha
arXiv preprint arXiv:2410.10798, 2024
Cited by: 2 | Year: 2024
Spatial-Semantic Collaborative Cropping for User Generated Content
Y Su, Y Cao, J Deng, F Rao, Q Wu
Proceedings of the AAAI Conference on Artificial Intelligence 38 (5), 4988-4997, 2024
Cited by: 1 | Year: 2024
Number it: Temporal Grounding Videos like Flipping Manga
Y Wu, X Hu, Y Sun, Y Zhou, W Zhu, F Rao, B Schiele, X Yang
arXiv preprint arXiv:2411.10332, 2024
Year: 2024
EE-MLLM: A Data-Efficient and Compute-Efficient Multimodal Large Language Model
F Ma, Y Zhou, H Li, Z He, S Wu, F Rao, Y Zhang, X Sun
arXiv preprint arXiv:2408.11795, 2024
Year: 2024
Multi-Modal Generative Embedding Model
F Ma, H Xue, G Wang, Y Zhou, F Rao, S Yan, Y Zhang, S Wu, MZ Shou, ...
arXiv preprint arXiv:2405.19333, 2024
Year: 2024
Task Navigator: Decomposing Complex Tasks for Multimodal Large Language Models
F Ma, Y Zhou, Y Zhang, S Wu, Z Zhang, Z He, F Rao, X Sun
Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern …, 2024
Year: 2024
Articles 1–20