Title | Authors | Venue | Cited by | Year |
Explore wav2vec 2.0 for Mispronunciation Detection | X Xu, Y Kang, S Cao, B Lin, L Ma | Proc. Interspeech 2021, 4428-4432, 2021 | 82 | 2021 |
Improving Accent Identification and Accented Speech Recognition Under a Framework of Self-supervised Learning | K Deng, S Cao, L Ma | Proc. Interspeech 2021, 1504-1508, 2021 | 39 | 2021 |
Improving Hybrid CTC/Attention End-to-end Speech Recognition with Pretrained Acoustic and Language Model | K Deng, S Cao, Y Zhang, L Ma | 2021 IEEE Automatic Speech Recognition and Understanding Workshop (ASRU), 76-82, 2021 | 37 | 2021 |
Improving CTC-based speech recognition via knowledge transferring from pre-trained language models | K Deng, S Cao, Y Zhang, L Ma, G Cheng, J Xu, P Zhang | ICASSP 2022, 2022 | 32 | 2022 |
Improving Streaming Transformer Based ASR Under a Framework of Self-supervised Learning | S Cao, Y Kang, Y Fu, X Xu, S Sun, Y Zhang, L Ma | Proc. Interspeech 2021, 706-710, 2021 | 18 | 2021 |
Multi-head monotonic chunkwise attention for online speech recognition | B Liu, S Cao, S Sun, W Zhang, L Ma | arXiv preprint arXiv:2005.00205, 2020 | 9 | 2020 |
Distillw2v2: A small and streaming wav2vec 2.0 based asr model | Y Fu, Y Kang, S Cao, L Ma | arXiv preprint arXiv:2303.09278, 2023 | 7 | 2023 |
Censer: Curriculum Semi-supervised Learning for Speech Recognition Based on Self-supervised Pre-training | B Zhang, S Cao, X Zhang, Y Zhang, L Ma, T Shinozaki | Interspeech 2022, 2022 | 5 | 2022 |
Improving speech recognition accuracy of local poi using geographical models | S Cao, Y Zhang, X Feng, L Ma | 2021 IEEE Spoken Language Technology Workshop (SLT), 180-185, 2021 | 5 | 2021 |
Neural Codec Source Tracing: Toward Comprehensive Attribution in Open-Set Condition | Y Xie, X Wang, Z Wang, R Fu, Z Wen, S Cao, L Ma, C Li, H Cheng, L Ye | arXiv preprint arXiv:2501.06514, 2025 | 1 | 2025 |
M-MoE: Mixture of Mixture-of-Expert Model for CTC-based Streaming Multilingual ASR | S Cao, X Wang, Y Zhang, X Zhang, L Ma | ICASSP 2025-2025 IEEE International Conference on Acoustics, Speech and …, 2025 | | 2025 |
DiffCSS: Diverse and Expressive Conversational Speech Synthesis with Diffusion Models | W Wu, Z Lin, Y Zhou, J Li, R Niu, Q Wu, S Cao, L Ma, Z Wu | ICASSP 2025-2025 IEEE International Conference on Acoustics, Speech and …, 2025 | | 2025 |
A Transcription Prompt-based Efficient Audio Large Language Model for Robust Speech Recognition | Y Li, X Wang, S Cao, Y Zhang, L Ma, L Xie | Interspeech 2024, 2024 | | 2024 |
A practical framework for multi-domain speech recognition and an instance sampling method to neural language modeling | Y Zhang, X Feng, Y Liu, S Cao, L Ma | arXiv preprint arXiv:2203.04767, 2022 | | 2022 |