Dongchao Yang

Citations

	All	Since 2019
Citations	1430	1429
h-index	17	17
i10-index	18	18

Citations per year
Year	2021	2022	2023	2024
Citations	16	51	324	1017

Public access (based on funding mandates)

8 articles available
1 article not available

Co-authors

Yuexian Zou, Peking University Shenzhen Graduate School (verified email at pku.edu.cn)
Rongjie Huang, FAIR, Zhejiang University (verified email at zju.edu.cn)
Helin Wang, Johns Hopkins University (verified email at jh.edu)
Xu Tan, Principal Researcher and Research Manager, Microsoft (verified email at microsoft.com)
Yi Ren (任意), Research Scientist, Tiktok (verified email at bytedance.com)
Jinchuan Tian, Language Technologies Institute, Carnegie Mellon University (verified email at andrew.cmu.edu)
Jiatong Shi (史嘉彤), Carnegie Mellon University (verified email at andrew.cmu.edu)
Dong Yu (俞栋), Distinguished Scientist @ Tencent AI Lab, ACM/IEEE/ISCA Fellow (verified email at global.tencent.com)
Haohan Guo, Chinese University of Hong Kong (verified email at se.cuhk.edu.hk)
Yifei Xin, Peking University (verified email at stu.pku.edu.cn)
Wenwu Wang, Professor, University of Surrey, UK (verified email at surrey.ac.uk)
Haibin Wu, Microsoft (verified email at microsoft.com)
Songxiang Liu

Dongchao Yang

The Chinese University of Hong Kong

Verified email at se.cuhk.edu.hk

TTS, TTA, Audio Codec, Multi-modal, Audio Foundation Models


Title	Authors	Venue	Cited by	Year
Diffsound: Discrete diffusion model for text-to-sound generation	D Yang, J Yu, H Wang, W Wang, C Weng, Y Zou, D Yu	IEEE Transactions on Audio, Speech and Language Processing (TASLP)	276	2023
Make-an-audio: Text-to-audio generation with prompt-enhanced diffusion models	R Huang, J Huang, D Yang*, Y Ren, L Liu, M Li, Z Ye, J Liu, X Yin, ...	ICML 2023	266	2023
AudioGPT: Understanding and generating speech, music, sound, and talking head	R Huang, M Li, D Yang, J Shi, X Chang, Z Ye, Y Wu, Z Hong, J Huang, ...	AAAI, demo 2024	157	2023
Hifi-codec: Group-residual vector quantization for high fidelity audio codec	D Yang, S Liu, R Huang, J Tian, C Weng, Y Zou	arXiv preprint arXiv:2305.02765	92	2023
NaturalSpeech 3: Zero-shot speech synthesis with factorized codec and diffusion models	Z Ju, Y Wang, K Shen, X Tan, D Xin, D Yang, Y Liu, Y Leng, K Song, ...	ICML 2024	84	2024
UniAudio: An Audio Foundation Model Toward Universal Audio Generation	D Yang, J Tian, X Tan, R Huang, S Liu, X Chang, J Shi, S Zhao, J Bian, ...	ICML 2024	83	2023
InstructTTS: Modelling expressive TTS in discrete latent space with natural language style prompt	D Yang, S Liu, R Huang, C Weng, H Meng	IEEE Transactions on Audio, Speech and Language Processing (TASLP)	72	2024
Make-an-audio 2: Temporal-enhanced text-to-audio generation	J Huang, Y Ren, R Huang, D Yang, Z Ye, C Zhang, J Liu, X Yin, Z Ma, ...	arXiv preprint arXiv:2305.18474	44	2023
A Mutual learning framework for Few-shot Sound Event Detection	D Yang, H Wang, Y Zou, Z Ye, W Wang	ICASSP 2022-2022 IEEE International Conference on Acoustics, Speech and …	38*	2022
Towards data distillation for end-to-end spoken conversational question answering	C You, N Chen, F Liu, D Yang, Y Zou	arXiv preprint arXiv:2010.08923	37	2021
Improving the Performance of Automated Audio Captioning via Integrating the Acoustic and Semantic Information	Z Ye, H Wang, D Yang, Y Zou	Workshop on Detection and Classification of Acoustic Scenes and Events (DCASE)	35	2021
Improving Text-Audio Retrieval by Text-aware Attention Pooling and Prior Matrix Revised Loss	Y Xin, D Yang, Y Zou	ICASSP 2023	32	2023
Prompttts 2: Describing and generating voices with text prompt	Y Leng, Z Guo, K Shen, X Tan, Z Ju, Y Liu, Y Liu, D Yang, L Zhang, ...	ICLR 2024	31	2023
Make-a-voice: Unified voice synthesis with discrete representation	R Huang, C Zhang, Y Wang, D Yang, L Liu, Z Ye, Z Jiang, C Weng, ...	ACL 2024	26	2023
Audio Pyramid Transformer with Domain Adaption for Weakly Supervised Sound Event Detection and Audio Classification	Y Xin, D Yang, Y Zou	Proc. Interspeech 2022, 1546-1550	18	2022
Norespeech: Knowledge distillation based conditional diffusion model for noise-robust expressive tts	D Yang, S Liu, J Yu, H Wang, C Weng, Y Zou	Interspeech 2023	17	2022
Target Confusion in End-to-end Speaker Extraction: Analysis and Approaches	Z Zhao, D Yang, R Gu, H Zhang, Y Zou	Interspeech 2022	17	2022
Rall-e: Robust codec language modeling with chain-of-thought prompting for text-to-speech synthesis	D Xin, X Tan, K Shen, Z Ju, D Yang, Y Wang, S Takamichi, H Saruwatari, ...	arXiv preprint arXiv:2404.03204	15	2024
SimpleSpeech: Towards Simple and Efficient Text-to-Speech with Scalar Latent Transformer Diffusion Models	D Yang, D Wang, H Guo, X Chen, X Wu, H Meng	Interspeech 2024	8	2024
Improving Weakly Supervised Sound Event Detection with Causal Intervention	Y Xin, D Yang, F Cui, Y Wang, Y Zou	ICASSP 2023	8	2023
