Zhen Ye
Title
Cited by
Year
BLVD: Building a large-scale 5D semantics benchmark for autonomous driving
J Xue, J Fang, T Li, B Zhang, P Zhang, Z Ye, J Dou
2019 International Conference on Robotics and Automation (ICRA), 6685-6691, 2019
Cited by: 66; Year: 2019
CoMoSpeech: One-step speech and singing voice synthesis via consistency model
Z Ye, W Xue, X Tan, J Chen, Q Liu, Y Guo
Proceedings of the 31st ACM International Conference on Multimedia, 1831-1839, 2023
Cited by: 28; Year: 2023
FlashSpeech: Efficient Zero-Shot Speech Synthesis
Z Ye, Z Ju, H Liu, X Tan, J Chen, Y Lu, P Sun, J Pan, W Bian, S He, W Xue, ...
arXiv preprint arXiv:2404.14700, 2024
Cited by: 7; Year: 2024
CoMoSVC: Consistency Model-based Singing Voice Conversion
Y Lu, Z Ye, W Xue, X Tan, Q Liu, Y Guo
arXiv preprint arXiv:2401.01792, 2024
Cited by: 6; Year: 2024
MFC-Bench: Benchmarking Multimodal Fact-Checking with Large Vision-Language Models
S Wang, H Lin, Z Luo, Z Ye, G Chen, J Ma
arXiv preprint arXiv:2406.11288, 2024
Cited by: 5; Year: 2024
NAS-FM: neural architecture search for tunable and interpretable sound synthesis based on frequency modulation
Z Ye, W Xue, X Tan, Q Liu, Y Guo
arXiv preprint arXiv:2305.12868, 2023
Cited by: 4; Year: 2023
Codec Does Matter: Exploring the Semantic Shortcoming of Codec for Audio Language Model
Z Ye, P Sun, J Lei, H Lin, X Tan, Z Dai, Q Kong, J Chen, J Pan, Q Liu, ...
arXiv preprint arXiv:2408.17175, 2024
Cited by: 2; Year: 2024
FastSAG: Towards Fast Non-Autoregressive Singing Accompaniment Generation
J Chen, W Xue, X Tan, Z Ye, Q Liu, Y Guo
arXiv preprint arXiv:2405.07682, 2024
Cited by: 2; Year: 2024
Both Ears Wide Open: Towards Language-Driven Spatial Audio Generation
P Sun, S Cheng, X Li, Z Ye, H Liu, H Zhang, W Xue, Y Guo
arXiv preprint arXiv:2410.10676, 2024
Cited by: 1; Year: 2024
PyramidCodec: Hierarchical Codec for Long-form Music Generation in Audio Domain
J Chen, Z Dai, Z Ye, X Tan, Q Liu, Y Guo, W Xue
Findings of the Association for Computational Linguistics: EMNLP 2024, 4253-4263, 2024
Year: 2024