Takaaki Hori
Verified email at apple.com
Title
Cited by
Year
ESPnet: End-to-end speech processing toolkit
S Watanabe, T Hori, S Karita, T Hayashi, J Nishitoba, Y Unno, NEY Soplin, ...
arXiv preprint arXiv:1804.00015, 2018
Cited by: 1676 · Year: 2018
Joint CTC-attention based end-to-end speech recognition using multi-task learning
S Kim, T Hori, S Watanabe
2017 IEEE international conference on acoustics, speech and signal …, 2017
Cited by: 1093 · Year: 2017
Hybrid CTC/attention architecture for end-to-end speech recognition
S Watanabe, T Hori, S Kim, JR Hershey, T Hayashi
IEEE Journal of Selected Topics in Signal Processing 11 (8), 1240-1253, 2017
Cited by: 935 · Year: 2017
A comparative study on Transformer vs RNN in speech applications
S Karita, N Chen, T Hayashi, T Hori, H Inaguma, Z Jiang, M Someki, ...
2019 IEEE automatic speech recognition and understanding workshop (ASRU …, 2019
Cited by: 860 · Year: 2019
Attention-based multimodal fusion for video description
C Hori, T Hori, TY Lee, Z Zhang, B Harsham, JR Hershey, TK Marks, ...
Proceedings of the IEEE international conference on computer vision, 4193-4202, 2017
Cited by: 435 · Year: 2017
Advances in joint CTC-attention based end-to-end speech recognition with a deep CNN encoder and RNN-LM
T Hori, S Watanabe, Y Zhang, W Chan
arXiv preprint arXiv:1706.02737, 2017
Cited by: 363 · Year: 2017
Streaming automatic speech recognition with the transformer model
N Moritz, T Hori, J Le Roux
ICASSP 2020-2020 IEEE International Conference on Acoustics, Speech and …, 2020
Cited by: 226 · Year: 2020
Efficient WFST-based one-pass decoding with on-the-fly hypothesis rescoring in extremely large vocabulary continuous speech recognition
T Hori, C Hori, Y Minami, A Nakamura
IEEE Transactions on audio, speech, and language processing 15 (4), 1352-1365, 2007
Cited by: 189 · Year: 2007
Language independent end-to-end architecture for joint language identification and speech recognition
S Watanabe, T Hori, JR Hershey
2017 IEEE Automatic Speech Recognition and Understanding Workshop (ASRU …, 2017
Cited by: 180 · Year: 2017
Multilingual sequence-to-sequence speech recognition: architecture, transfer learning, and language modeling
J Cho, MK Baskar, R Li, M Wiesner, SH Mallidi, N Yalta, M Karafiat, ...
2018 IEEE Spoken Language Technology Workshop (SLT), 521-527, 2018
Cited by: 149 · Year: 2018
Joint CTC/attention decoding for end-to-end speech recognition
T Hori, S Watanabe, JR Hershey
Proceedings of the 55th Annual Meeting of the Association for Computational …, 2017
Cited by: 149 · Year: 2017
Triggered attention for end-to-end speech recognition
N Moritz, T Hori, J Le Roux
ICASSP 2019-2019 IEEE International Conference on Acoustics, Speech and …, 2019
Cited by: 146 · Year: 2019
End-to-end speech recognition with word-based RNN language models
T Hori, J Cho, S Watanabe
2018 IEEE spoken language technology workshop (SLT), 389-396, 2018
Cited by: 146 · Year: 2018
End-to-end audio visual scene-aware dialog using multimodal attention-based video features
C Hori, H Alamri, J Wang, G Wichern, T Hori, A Cherian, TK Marks, ...
ICASSP 2019-2019 IEEE International Conference on Acoustics, Speech and …, 2019
Cited by: 144 · Year: 2019
End-to-end speech recognition: A survey
R Prabhavalkar, T Hori, TN Sainath, R Schlüter, S Watanabe
IEEE/ACM Transactions on Audio, Speech, and Language Processing, 2023
Cited by: 132 · Year: 2023
Linear prediction-based dereverberation with advanced speech enhancement and recognition technologies for the REVERB challenge
M Delcroix, T Yoshioka, A Ogawa, Y Kubo, M Fujimoto, N Ito, K Kinoshita, ...
Reverb workshop, 2014
Cited by: 128 · Year: 2014
Back-translation-style data augmentation for end-to-end ASR
T Hayashi, S Watanabe, Y Zhang, T Toda, T Hori, R Astudillo, K Takeda
2018 IEEE Spoken Language Technology Workshop (SLT), 426-433, 2018
Cited by: 125 · Year: 2018
Multichannel end-to-end speech recognition
T Ochiai, S Watanabe, T Hori, JR Hershey
International conference on machine learning, 2632-2641, 2017
Cited by: 125 · Year: 2017
Open-vocabulary spoken utterance retrieval using confusion networks
T Hori, IL Hetherington, TJ Hazen, JR Glass
2007 IEEE International Conference on Acoustics, Speech and Signal …, 2007
Cited by: 120 · Year: 2007
Duration-controlled LSTM for polyphonic sound event detection
T Hayashi, S Watanabe, T Toda, T Hori, J Le Roux, K Takeda
IEEE/ACM Transactions on Audio, Speech, and Language Processing 25 (11 …, 2017
Cited by: 117 · Year: 2017
Articles 1–20