Photorealistic Text-to-Image Diffusion Models with Deep Language Understanding C Saharia, W Chan, S Saxena, L Li, J Whang, E Denton, ... NeurIPS, 2022 | 5023 | 2022 |
SpecAugment: A Simple Data Augmentation Method for Automatic Speech Recognition DS Park, W Chan, Y Zhang, CC Chiu, B Zoph, ED Cubuk, QV Le INTERSPEECH, 2019 | 4239 | 2019 |
Listen, Attend and Spell: A Neural Network for Large Vocabulary Conversational Speech Recognition W Chan, N Jaitly, QV Le, O Vinyals ICASSP, 2016 | 3377* | 2016 |
Image Super-Resolution via Iterative Refinement C Saharia, J Ho, W Chan, T Salimans, D Fleet, M Norouzi IEEE Transactions on Pattern Analysis and Machine Intelligence, 2022 | 1663 | 2022 |
Palette: Image-to-Image Diffusion Models C Saharia, W Chan, H Chang, C A. Lee, J Ho, T Salimans, D J. Fleet, ... SIGGRAPH, 2022 | 1328 | 2022 |
Video Diffusion Models J Ho, T Salimans, A Gritsenko, W Chan, M Norouzi, D Fleet arXiv:2204.03458, 2022 | 1220 | 2022 |
Imagen Video: High Definition Video Generation with Diffusion Models J Ho, W Chan, C Saharia, J Whang, R Gao, A Gritsenko, D P. Kingma, ... arXiv:2210.02303, 2022 | 1209 | 2022 |
Cascaded Diffusion Models for High Fidelity Image Generation J Ho, C Saharia, W Chan, D Fleet, M Norouzi, T Salimans Journal of Machine Learning Research 23 (47), 1-33, 2022 | 1053 | 2022 |
WaveGrad: Estimating Gradients for Waveform Generation N Chen, Y Zhang, H Zen, R Weiss, M Norouzi, W Chan ICLR, 2021 | 791 | 2021 |
Very Deep Convolutional Networks for End-to-End Speech Recognition Y Zhang, W Chan, N Jaitly ICASSP, 2017 | 567 | 2017 |
Advances in Joint CTC-Attention based End-to-End Speech Recognition with a Deep CNN Encoder and RNN-LM T Hori, S Watanabe, Y Zhang, W Chan INTERSPEECH, 2017 | 363 | 2017 |
Insertion Transformer: Flexible Sequence Generation via Insertion Operations M Stern, W Chan, J Kiros, J Uszkoreit ICML, 2019 | 260 | 2019 |
Novel View Synthesis with Diffusion Models D Watson, W Chan, R Martin-Brualla, J Ho, A Tagliasacchi, M Norouzi ICLR, 2023 | 231 | 2023 |
Lingvo: a Modular and Scalable Framework for Sequence-to-Sequence Modeling J Shen, P Nguyen, Y Wu, Z Chen, MX Chen, Y Jia, A Kannan, T Sainath, ... arXiv preprint arXiv:1902.08295, 2019 | 214 | 2019 |
BigSSL: Exploring the Frontier of Large-Scale Semi-Supervised Learning for Automatic Speech Recognition Y Zhang, DS Park, W Han, J Qin, A Gulati, J Shor, A Jansen, Y Xu, ... IEEE Journal of Selected Topics in Signal Processing, 2021 | 188 | 2021 |
Learning Fast Samplers for Diffusion Models by Differentiating Through Sample Quality D Watson, W Chan, J Ho, M Norouzi ICLR, 2022 | 170 | 2022 |
SpecAugment on Large Scale Datasets D Park, Y Zhang, CC Chiu, Y Chen, B Li, W Chan, Q Le, Y Wu ICASSP, 2020 | 164 | 2020 |
Noise2Music: Text-conditioned Music Generation with Diffusion Models Q Huang, D S. Park, T Wang, T I. Denk, A Ly, N Chen, Z Zhang, Z Zhang, ... arXiv:2302.03917, 2023 | 163 | 2023 |
Bytes are All You Need: End-to-End Multilingual Speech Recognition and Synthesis with Bytes B Li, Y Zhang, T Sainath, Y Wu, W Chan ICASSP, 2019 | 157 | 2019 |
SpeechStew: Simply Mix All Available Speech Recognition Data to Train One Large Neural Network W Chan, D Park, C Lee, Y Zhang, Q Le, M Norouzi INTERSPEECH: Workshop on Machine Learning in Speech and Language Processing, 2021 | 154 | 2021 |