Seed-TTS: A family of high-quality versatile speech generation models P Anastassiou, J Chen, J Chen, Y Chen, Z Chen, Z Chen, J Cong, L Deng, ... arXiv preprint arXiv:2406.02430, 2024 | 60 | 2024 |
Controllable and lossless non-autoregressive end-to-end text-to-speech Z Liu, Q Tian, C Hu, X Liu, M Wu, Y Wang, H Zhao, Y Wang arXiv preprint arXiv:2207.06088, 2022 | 14 | 2022 |
Improving audio generation with visual enhanced caption Y Yuan, D Jia, X Zhuang, Y Chen, Z Liu, Z Chen, Y Wang, Y Wang, X Liu, ... arXiv preprint arXiv:2407.04416, 2024 | 7 | 2024 |