Marco Tagliasacchi
Research Scientist, Google
No verified email address
Gemini 1.5: Unlocking multimodal understanding across millions of tokens of context
G Team, P Georgiev, VI Lei, R Burnell, L Bai, A Gulati, G Tanzer, ...
arXiv preprint arXiv:2403.05530, 2024
SoundStream: An end-to-end neural audio codec
N Zeghidour, A Luebs, A Omran, J Skoglund, M Tagliasacchi
IEEE/ACM Transactions on Audio, Speech, and Language Processing 30, 495-507, 2021
MusicLM: Generating music from text
A Agostinelli, TI Denk, Z Borsos, J Engel, M Verzetti, A Caillon, Q Huang, ...
arXiv preprint arXiv:2301.11325, 2023
AudioLM: A language modeling approach to audio generation
Z Borsos, R Marinier, D Vincent, E Kharitonov, O Pietquin, M Sharifi, ...
IEEE/ACM Transactions on Audio, Speech, and Language Processing 31, 2523-2533, 2023
Scream and gunshot detection and localization for audio-surveillance systems
G Valenzise, L Gerosa, M Tagliasacchi, F Antonacci, A Sarti
2007 IEEE Conference on Advanced Video and Signal Based Surveillance, 21-26, 2007
An overview on video forensics
S Milani, M Fontani, P Bestagini, M Barni, A Piva, M Tagliasacchi, ...
APSIPA Transactions on Signal and Information Processing 1, e2, 2012
Deep convolutional neural networks for pedestrian detection
D Tomè, F Monti, L Baroffio, L Bondi, M Tagliasacchi, S Tubaro
Signal Processing: Image Communication 47, 482-489, 2016
AudioPaLM: A large language model that can speak and listen
PK Rubenstein, C Asawaroengchai, DD Nguyen, A Bapna, Z Borsos, ...
arXiv preprint arXiv:2306.12925, 2023
An integrated system based on wireless sensor networks for patient monitoring, localization and tracking
A Redondi, M Chirico, L Borsani, M Cesana, M Tagliasacchi
Ad Hoc Networks 11 (1), 39-53, 2013
Speak, read and prompt: High-fidelity text-to-speech with minimal supervision
E Kharitonov, D Vincent, Z Borsos, R Marinier, S Girgin, O Pietquin, ...
Transactions of the Association for Computational Linguistics 11, 1703-1718, 2023
Subjective assessment of H.264/AVC video sequences transmitted over a noisy channel
F De Simone, M Naccari, M Tagliasacchi, F Dufaux, S Tubaro, T Ebrahimi
2009 International Workshop on Quality of Multimedia Experience, 204-209, 2009
Local tampering detection in video sequences
P Bestagini, S Milani, M Tagliasacchi, S Tubaro
2013 IEEE 15th International Workshop on Multimedia Signal Processing (MMSP …, 2013
LEAF: A learnable frontend for audio classification
N Zeghidour, O Teboul, FDC Quitry, M Tagliasacchi
arXiv preprint arXiv:2101.08596, 2021
Towards learning a universal non-semantic representation of speech
J Shor, A Jansen, R Maor, O Lang, O Tuval, FDC Quitry, M Tagliasacchi, ...
arXiv preprint arXiv:2002.12764, 2020
A H.264/AVC video database for the evaluation of quality metrics
F De Simone, M Tagliasacchi, M Naccari, S Tubaro, T Ebrahimi
2010 IEEE International Conference on Acoustics, Speech and Signal …, 2010
Discriminating multiple JPEG compressions using first digit features
S Milani, M Tagliasacchi, S Tubaro
APSIPA Transactions on Signal and Information Processing 3, e19, 2014
Scream and gunshot detection in noisy environments
L Gerosa, G Valenzise, M Tagliasacchi, F Antonacci, A Sarti
2007 15th European Signal Processing Conference, 1216-1220, 2007
A visual sensor network for parking lot occupancy detection in smart cities
L Baroffio, L Bondi, M Cesana, AE Redondi, M Tagliasacchi
2015 IEEE 2nd World Forum on Internet of Things (WF-IoT), 745-750, 2015
Evaluation of low-complexity visual feature detectors and descriptors
A Canclini, M Cesana, A Redondi, M Tagliasacchi, J Ascenso, R Cilla
2013 18th International Conference on Digital Signal Processing (DSP), 1-7, 2013
SPICE: Self-supervised pitch estimation
B Gfeller, C Frank, D Roblek, M Sharifi, M Tagliasacchi, M Velimirović
IEEE/ACM Transactions on Audio, Speech, and Language Processing 28, 1118-1128, 2020