Public access

Learning joint embedding with multimodal cues for cross-modal video-text retrieval

NC Mithun, J Li, F Metze, AK Roy-Chowdhury

Proceedings of the 2018 ACM on international conference on multimedia …, 2018

Mandate: US National Science Foundation

Keeping your eye on the ball: Trajectory attention in video transformers

M Patrick, D Campbell, Y Asano, I Misra, F Metze, C Feichtenhofer, ...

Advances in neural information processing systems 34, 12493-12506, 2021

Mandate: UK Engineering and Physical Sciences Research Council, European Commission

How2Sign: a large-scale multimodal dataset for continuous American Sign Language

A Duarte, S Palaskar, L Ventura, D Ghadiyaram, K DeHaan, F Metze, ...

Proceedings of the IEEE/CVF conference on computer vision and pattern …, 2021

Mandate: US National Science Foundation, Government of Spain

A comparison of five multiple instance learning pooling functions for sound event detection with weak labeling

Y Wang, J Li, F Metze

ICASSP 2019-2019 IEEE International Conference on Acoustics, Speech and …, 2019

Mandate: US National Science Foundation

Universal phone recognition with a multilingual allophone system

X Li, S Dalmia, J Li, M Lee, P Littell, J Yao, A Anastasopoulos, ...

ICASSP 2020-2020 IEEE International Conference on Acoustics, Speech and …, 2020

Mandate: US National Science Foundation

Sequence-based multi-lingual low resource speech recognition

S Dalmia, R Sanabria, F Metze, AW Black

2018 IEEE International Conference on Acoustics, Speech and Signal …, 2018

Mandate: US National Science Foundation, US Department of Defense

An empirical exploration of CTC acoustic models

Y Miao, M Gowayyed, X Na, T Ko, F Metze, A Waibel

2016 IEEE International Conference on Acoustics, Speech and Signal …, 2016

Mandate: US National Science Foundation

Audio-based multimedia event detection using deep recurrent neural networks

Y Wang, L Neves, F Metze

2016 IEEE International Conference on Acoustics, Speech and Signal …, 2016

Mandate: US National Science Foundation

ASR error correction and domain adaptation using machine translation

A Mani, S Palaskar, NV Meripo, S Konam, F Metze

ICASSP 2020-2020 IEEE International Conference on Acoustics, Speech and …, 2020

Mandate: US National Science Foundation

Visual features for context-aware speech recognition

A Gupta, Y Miao, L Neves, F Metze

2017 IEEE International Conference on Acoustics, Speech and Signal …, 2017

Mandate: US National Science Foundation

Machine listening for heart status monitoring: Introducing and benchmarking HSS—the Heart Sounds Shenzhen corpus

F Dong, K Qian, Z Ren, A Baird, X Li, Z Dai, B Dong, F Metze, ...

IEEE journal of biomedical and health informatics 24 (7), 2082-2092, 2019

Mandate: National Natural Science Foundation of China, European Commission

End-to-end multimodal speech recognition

S Palaskar, R Sanabria, F Metze

2018 IEEE International Conference on Acoustics, Speech and Signal …, 2018

Mandate: US National Science Foundation

Linguistic unit discovery from multi-modal inputs in unwritten languages: Summary of the “Speaking Rosetta” JSALT 2017 workshop

O Scharenborg, L Besacier, A Black, M Hasegawa-Johnson, F Metze, ...

2018 IEEE International Conference on Acoustics, Speech and Signal …, 2018

Mandate: US National Science Foundation, Deutsche Forschungsgemeinschaft, Netherlands …

A first attempt at polyphonic sound event detection using connectionist temporal classification

Y Wang, F Metze

2017 IEEE International Conference on Acoustics, Speech and Signal …, 2017

Mandate: US National Science Foundation

Multimodal grounding for sequence-to-sequence speech recognition

O Caglayan, R Sanabria, S Palaskar, L Barrault, F Metze

ICASSP 2019-2019 IEEE International Conference on Acoustics, Speech and …, 2019

Mandate: US National Science Foundation, US Department of Defense, Agence Nationale …

Towards zero-shot learning for automatic phonemic transcription

X Li, S Dalmia, D Mortensen, J Li, A Black, F Metze

Proceedings of the AAAI Conference on Artificial Intelligence 34 (05), 8261-8268, 2020

Mandate: US Department of Defense

Joint embeddings with multimodal cues for video-text retrieval

NC Mithun, J Li, F Metze, AK Roy-Chowdhury

International Journal of Multimedia Information Retrieval 8, 3-18, 2019

Mandate: US National Science Foundation

Open-Domain Audio-Visual Speech Recognition: A Deep Learning Approach.

Y Miao, F Metze

Interspeech 1 (2), 3, 2016

Mandate: US National Science Foundation

The ACLEW DiViMe: An easy-to-use diarization tool.

A Le Franc, E Riebling, J Karadayi, Y Wang, C Scaff, F Metze, A Cristia

Interspeech, 1383-1387, 2018

Mandate: US National Science Foundation
