Yifei Xin

Cited by

	All	Since 2020
Citations	308	308
h-index	7	7
i10-index	3	3

Citations per year

Year	Citations
2022	4
2023	28
2024	199
2025	76

Public access

4 articles available
0 articles not available

Based on funding mandates

Co-authors

Yuexian Zou	Peking University Shenzhen Graduate School	Verified email at pku.edu.cn
Lidong Bing	Shanda Group, Alibaba DAMO, Tencent, CMU, CUHK	Verified email at alibaba-inc.com
Xin Li	Alibaba Group	Verified email at se.cuhk.edu.hk
Dongchao Yang	Chinese University of Hong Kong	Verified email at se.cuhk.edu.hk
Yongxin Zhu	University of Science and Technology of China	Verified email at mail.ustc.edu.cn
Zesen Cheng	Peking University	Verified email at stu.pku.edu.cn
Sicong Leng	Nanyang Technological University & Alibaba DAMO Academy	Verified email at e.ntu.edu.sg
Wenqi Zhang	Zhejiang University	Verified email at zju.edu.cn
Hang Zhang	Qwen Team; Zhejiang University; Sichuan University	Verified email at stu.scu.edu.cn
Ziyang Luo	Salesforce AI Research, Hong Kong Baptist University	Verified email at comp.hkbu.edu.hk
Xiulian Peng	Researcher at Microsoft Research Asia	Verified email at microsoft.com
Fan Cui	Northwestern Polytechnical University	Verified email at mail.nwpu.edu.cn
Zejun Ma	Bytedance	Verified email at bytedance.com
Zhesong Yu (于哲松)	Bytedance AILab	Verified email at pku.edu.cn
Lifeng Shang	Huawei Noah's Ark Lab	Verified email at huawei.com
Baojun Wang	Huawei Noah's Ark Lab	Verified email at huawei.com
Ziyu Yao	Peking University	Verified email at stu.pku.edu.cn

Yifei Xin

Peking University

Verified email at stu.pku.edu.cn


Title	Authors	Venue	Cited by	Year
VideoLLaMA 2: Advancing spatial-temporal modeling and audio understanding in video-LLMs	Z Cheng, S Leng, H Zhang, Y Xin, X Li, G Chen, Y Zhu, W Zhang, Z Luo, ...	arXiv preprint arXiv:2406.07476, 2024	167	2024
Improving text-audio retrieval by text-aware attention pooling and prior matrix revised loss	Y Xin, D Yang, Y Zou	ICASSP 2023-2023 IEEE International Conference on Acoustics, Speech and …, 2023	36	2023
Audio Pyramid Transformer with Domain Adaption for Weakly Supervised Sound Event Detection and Audio Classification	Y Xin, D Yang, Y Zou	INTERSPEECH, 1546-1550, 2022	17	2022
Chain of Ideas: Revolutionizing Research in Novel Idea Development with LLM Agents	L Li, W Xu, J Guo, R Zhao, X Li, Y Yuan, B Zhang, Y Jiang, Y Xin, R Dang, ...	arXiv preprint arXiv:2410.13185, 2024	9	2024
Masked Audio Modeling with CLAP and Multi-Objective Learning	Y Xin, X Peng, Y Lu	Proc. INTERSPEECH 2023, 2763-2767, 2024	9	2024
Improving audio-text retrieval via hierarchical cross-modal interaction and auxiliary captions	Y Xin, Y Zou	Proc. INTERSPEECH 2023, 341-345, 2023	9	2023
Addressing Representation Collapse in Vector Quantized Models with One Linear Layer	Y Zhu, B Li, Y Xin, L Xu	arXiv preprint arXiv:2411.02038, 2024	8	2024
Cooperative game modeling with weighted token-level alignment for audio-text retrieval	Y Xin, B Wang, L Shang	IEEE Signal Processing Letters 30, 1317-1321, 2023	7	2023
Improving weakly supervised sound event detection with causal intervention	Y Xin, D Yang, F Cui, Y Wang, Y Zou	ICASSP 2023-2023 IEEE International Conference on Acoustics, Speech and …, 2023	7	2023
VideoLLaMA 2: Advancing spatial-temporal modeling and audio understanding in video-LLMs, 2024	Z Cheng, S Leng, H Zhang, Y Xin, X Li, G Chen, Y Zhu, W Zhang, Z Luo, ...	URL https://arxiv.org/abs/2406.07476 9, 0	7	
Low-complexity acoustic scene classification with mismatch-devices using separable convolutions and coordinate attention	Y Xin, Y Zou, F Cui, Y Wang	DCASE2022 Challenge, Tech. Rep, 2022	6	2022
DiffATR: Diffusion-based Generative Modeling for Audio-Text Retrieval	Y Xin, X Cheng, Z Zhu, X Yang, Y Zou	Proc. Interspeech 2024, 1670-1674, 2024	5	2024
MINT: Boosting Audio-Language Model via Multi-Target Pre-Training and Instruction Tuning	H Zhao, Y Xin, Z Yu, B Zhu, L Lu, Z Ma	Proc. Interspeech 2024, 52-56, 2024	5*	2024
Soul-mix: Enhancing multimodal machine translation with manifold mixup	X Cheng, Z Yao, Y Xin, H An, H Li, Y Li, Y Zou	Proceedings of the 62nd Annual Meeting of the Association for Computational …, 2024	5	2024
Improving speech enhancement via event-based query	Y Xin, X Peng, Y Lu	ICASSP 2023-2023 IEEE International Conference on Acoustics, Speech and …, 2023	5	2023
Background-aware modeling for weakly supervised sound event detection	Y Xin, D Yang, Y Zou	Proc. ISCA Annu. Conf. Int. Speech Commun. Assoc, 1199-1203, 2023	5	2023
Audio-text Retrieval with Transformer-based Hierarchical Alignment and Disentangled Cross-modal Representation	Y Xin, Z Zhu, X Cheng, X Yang, Y Zou	Proc. Interspeech 2024, 1140-1144, 2024	1	2024
ECBench: Can Multi-modal Foundation Models Understand the Egocentric World? A Holistic Embodied Cognition Benchmark	R Dang, Y Yuan, W Zhang, Y Xin, B Zhang, L Li, L Wang, Q Zeng, X Li, ...	arXiv preprint arXiv:2501.05031, 2025		2025
Chain of Ideas: Revolutionizing Research in Idea Development with LLM Agents	L Li, W Xu, J Guo, R Zhao, X Li, Y Yuan, B Zhang, Y Jiang, Y Xin, R Dang, ...			
