Seed-TTS: A family of high-quality versatile speech generation models P Anastassiou, J Chen, J Chen, Y Chen, Z Chen, Z Chen, J Cong, L Deng, ... arXiv preprint arXiv:2406.02430, 2024 | 61 | 2024 |
A unified sequence-to-sequence front-end model for Mandarin text-to-speech synthesis J Pan, X Yin, Z Zhang, S Liu, Y Zhang, Z Ma, Y Wang ICASSP 2020-2020 IEEE International Conference on Acoustics, Speech and …, 2020 | 37 | 2020 |
A hybrid text normalization system using multi-head self-attention for Mandarin J Zhang, J Pan, X Yin, C Li, S Liu, Y Zhang, Y Wang, Z Ma ICASSP 2020-2020 IEEE International Conference on Acoustics, Speech and …, 2020 | 29 | 2020 |
Cross-speaker emotion transfer based on speaker condition layer normalization and semi-supervised training in text-to-speech P Wu, J Pan, C Xu, J Zhang, L Wu, X Yin, Z Ma arXiv preprint arXiv:2110.04153, 2021 | 21 | 2021 |
A chapter-wise understanding system for text-to-speech in Chinese novels J Pan, L Wu, X Yin, P Wu, C Xu, Z Ma ICASSP 2021-2021 IEEE International Conference on Acoustics, Speech and …, 2021 | 13 | 2021 |
A novel Chinese dialect TTS frontend with non-autoregressive neural machine translation J Zhang, W Bao, J Pan, X Yin, Z Ma arXiv preprint arXiv:2206.04922, 2022 | 5 | 2022 |
Direct Speech-to-speech Translation without Textual Annotation using Bottleneck Features J Zhang, J Pan, X Yin, Z Ma arXiv preprint arXiv:2212.05805, 2022 | 1 | 2022 |
An Automatic Soundtracking System for Text-to-Speech Audiobooks Z Chen, L Wu, J Pan, X Yin, AI Bytedance INTERSPEECH, 476-480, 2022 | | 2022 |
An End-to-End Speaker Determination Model with Joint Learning for Text-to-Speech Audiobooks L Wu, J Pan, X Yin, Z Ma, AI Bytedance | | |