Sungwon Kim
Verified email at nvidia.com
Title
Cited by
Year
ILVR: Conditioning Method for Denoising Diffusion Probabilistic Models
J Choi, S Kim, Y Jeong, Y Gwon, S Yoon
ICCV 2021 (arXiv preprint arXiv:2108.02938), 2021
Cited by: 762; Year: 2021
Glow-TTS: A Generative Flow for Text-to-Speech via Monotonic Alignment Search
J Kim, S Kim, J Kong, S Yoon
Advances in Neural Information Processing Systems 33 (NeurIPS 2020), 2020
Cited by: 587; Year: 2020
Perception Prioritized Training of Diffusion Models
J Choi, J Lee, C Shin, S Kim, H Kim, S Yoon
CVPR 2022 (arXiv preprint arXiv:2204.00227), 2022
Cited by: 237; Year: 2022
FloWaveNet: A generative flow for raw audio
S Kim, S Lee, J Song, J Kim, S Yoon
Proceedings of the International Conference on Machine Learning (ICML), 2018
Cited by: 216; Year: 2018
Guided-TTS: A Diffusion Model for Text-to-Speech via Classifier Guidance
H Kim, S Kim, S Yoon
Proceedings of the International Conference on Machine Learning (ICML), 2021
Cited by: 110; Year: 2021
Guided-TTS 2: A Diffusion Model for High-quality Adaptive Text-to-Speech with Untranscribed Data
S Kim, H Kim, S Yoon
arXiv preprint arXiv:2205.15370, 2022
Cited by: 48; Year: 2022
AligNART: Non-autoregressive Neural Machine Translation by Jointly Learning to Estimate Alignment and Translate
J Song, S Kim, S Yoon
EMNLP 2021 (arXiv preprint arXiv:2109.06481), 2021
Cited by: 39; Year: 2021
P-Flow: A Fast and Data-Efficient Zero-Shot TTS through Speech Prompting
S Kim, K Shih, JF Santos, E Bakhturina, M Desta, R Valle, S Yoon, ...
Advances in Neural Information Processing Systems 36, 2024
Cited by: 36; Year: 2024
UnitSpeech: Speaker-adaptive Speech Synthesis with Untranscribed Data
H Kim, S Kim, J Yeom, S Yoon
InterSpeech 2023, 2023
Cited by: 28; Year: 2023
FICGAN: Facial Identity Controllable GAN for De-identification
Y Jeong, J Choi, S Kim, Y Ro, TH Oh, D Kim, H Ha, S Yoon
arXiv preprint arXiv:2110.00740, 2021
Cited by: 19; Year: 2021
NanoFlow: Scalable Normalizing Flows with Sublinear Parameter Complexity
S Lee, S Kim, S Yoon
Advances in Neural Information Processing Systems 33 (NeurIPS 2020), 2020
Cited by: 17; Year: 2020
VoiceTailor: Lightweight Plug-In Adapter for Diffusion-Based Personalized Text-to-Speech
H Kim, S Lee, J Yeom, CH Lee, S Kim, S Yoon
arXiv preprint arXiv:2408.14739, 2024
Cited by: 2; Year: 2024
Scaling NVIDIA's multi-speaker multi-lingual TTS systems with voice cloning to Indic Languages
A Arora, R Badlani, S Kim, R Valle, B Catanzaro
arXiv preprint arXiv:2401.13851, 2024
Cited by: 1; Year: 2024
ETTA: Elucidating the Design Space of Text-to-Audio Models
S Lee, Z Kong, A Goel, S Kim, R Valle, B Catanzaro
arXiv preprint arXiv:2412.19351, 2024
Year: 2024
Fugatto 1: Foundational Generative Audio Transformer Opus 1
R Valle, R Badlani, Z Kong, S Lee, A Goel, JF Santos, A Aljafari, S Kim, ...
The Thirteenth International Conference on Learning Representations (ICLR), 2025