ILVR: Conditioning Method for Denoising Diffusion Probabilistic Models. J Choi, S Kim, Y Jeong, Y Gwon, S Yoon. ICCV 2021 (arXiv preprint arXiv:2108.02938), 2021 | 762 | 2021 |
Glow-TTS: A Generative Flow for Text-to-Speech via Monotonic Alignment Search. J Kim, S Kim, J Kong, S Yoon. Advances in Neural Information Processing Systems 33 (NeurIPS 2020), 2020 | 587 | 2020 |
Perception Prioritized Training of Diffusion Models. J Choi, J Lee, C Shin, S Kim, H Kim, S Yoon. CVPR 2022 (arXiv preprint arXiv:2204.00227), 2022 | 237 | 2022 |
FloWaveNet: A generative flow for raw audio. S Kim, S Lee, J Song, J Kim, S Yoon. Proceedings of the International Conference on Machine Learning (ICML), 2018 | 216 | 2018 |
Guided-TTS: A Diffusion Model for Text-to-Speech via Classifier Guidance. H Kim, S Kim, S Yoon. Proceedings of the International Conference on Machine Learning (ICML), 2021 | 110 | 2021 |
Guided-TTS 2: A Diffusion Model for High-quality Adaptive Text-to-Speech with Untranscribed Data. S Kim, H Kim, S Yoon. arXiv preprint arXiv:2205.15370, 2022 | 48 | 2022 |
AligNART: Non-autoregressive Neural Machine Translation by Jointly Learning to Estimate Alignment and Translate. J Song, S Kim, S Yoon. EMNLP 2021 (arXiv preprint arXiv:2109.06481), 2021 | 39 | 2021 |
P-Flow: A Fast and Data-Efficient Zero-Shot TTS through Speech Prompting. S Kim, K Shih, JF Santos, E Bakhturina, M Desta, R Valle, S Yoon, ... Advances in Neural Information Processing Systems 36, 2024 | 36 | 2024 |
UnitSpeech: Speaker-adaptive Speech Synthesis with Untranscribed Data. H Kim, S Kim, J Yeom, S Yoon. InterSpeech 2023, 2023 | 28 | 2023 |
FICGAN: Facial Identity Controllable GAN for De-identification. Y Jeong, J Choi, S Kim, Y Ro, TH Oh, D Kim, H Ha, S Yoon. arXiv preprint arXiv:2110.00740, 2021 | 19 | 2021 |
NanoFlow: Scalable Normalizing Flows with Sublinear Parameter Complexity. S Lee, S Kim, S Yoon. Advances in Neural Information Processing Systems 33 (NeurIPS 2020), 2020 | 17 | 2020 |
VoiceTailor: Lightweight Plug-In Adapter for Diffusion-Based Personalized Text-to-Speech. H Kim, S Lee, J Yeom, CH Lee, S Kim, S Yoon. arXiv preprint arXiv:2408.14739, 2024 | 2 | 2024 |
Scaling NVIDIA's multi-speaker multi-lingual TTS systems with voice cloning to Indic Languages. A Arora, R Badlani, S Kim, R Valle, B Catanzaro. arXiv preprint arXiv:2401.13851, 2024 | 1 | 2024 |
ETTA: Elucidating the Design Space of Text-to-Audio Models. S Lee, Z Kong, A Goel, S Kim, R Valle, B Catanzaro. arXiv preprint arXiv:2412.19351, 2024 | | 2024 |
Fugatto 1: Foundational Generative Audio Transformer Opus 1. R Valle, R Badlani, Z Kong, S Lee, A Goel, JF Santos, A Aljafari, S Kim, ... The Thirteenth International Conference on Learning Representations (ICLR), 2025 | | 2025 |