CV

Basics

Name	Leonardo Pepino
Label	Research Scientist
Summary	Making Gemini better at Google DeepMind.

Work

2025.10 - Present
Research Scientist

Google DeepMind
2024.07 - 2024.10
Research Intern

Google LLC

Three-month internship at the NYC offices. I experimented with novel neural audio codec architectures and their applications to speech LLMs and speech synthesis.
- Neural Audio Codecs
- Speech LLM
- Representation Learning
2024.04 - 2024.07
Research Intern

ASAPP Inc.

Three-month remote internship. I trained speech and audio LLMs from scratch and incorporated novel audio encoding approaches.
- Speech LLM
- Audio LLM
- Instruction Fine-tuning
- PEFT
- Audio encoders
2023.05 - 2023.08
Research Intern

Brno University of Technology

Three-month internship in Brno, Czech Republic. I worked on the CHiME Challenge for multi-channel, multi-speaker ASR.
- ASR
- CHiME
- RNN-T
- Transfer Learning
2022.09 - 2022.12
Student Researcher

Google LLC

Three-month internship at the Mountain View offices. I experimented with novel audio language modelling techniques for generating long, coherent speech.
- AudioLM
- Textless NLP
- Long context
2021 - 2022
Research Intern

Hipcam

Part-time internship. I contributed to the design and development of wake-word detection models using deep learning, and helped build a complete machine learning pipeline, from data preprocessing to model deployment in intelligent surveillance systems.
- Wakeword detection
- Embedded systems

Education

2020 - 2025

Buenos Aires, Argentina

PhD in Computer Science

Universidad de Buenos Aires

My project is titled “New deep learning strategies for general sound understanding” and is supervised by Dr. Luciana Ferrer and co-supervised by Dr. Pablo Riera. This research focuses on developing reusable deep learning models for general-purpose audio understanding, with an emphasis on transfer learning in low-resource scenarios. The project explores the adaptation of transformer architectures to audio signal processing and investigates various self-supervised pretraining strategies to enhance model generalization across diverse audio tasks.

2013 - 2019

Caseros, Argentina

Dipl. Sound Engineering

Universidad de Tres de Febrero

This integrated engineering degree is equivalent to a combined Bachelor's and Master's program (4 + 2 years). It includes a final project comparable to a Master's thesis. My thesis, titled “Music Source Separation Using Convolutional Neural Networks”, was supervised by Dr. Laurence Bender. GPA: 8.07/10.

Publications

2025

A Dataset for Automatic Assessment of TTS Quality in Spanish

Interspeech 2025
2025

Benchmarking Time-localized Explanations for Audio Classification Models

Interspeech 2025
2025

EncodecMAE: Leveraging neural codecs for universal audio representation learning

Interspeech 2025
2025

Better audio representations are more brain-like: linking model-brain alignment with performance in downstream auditory tasks

Preprint
2025

Análisis y desarrollo de representaciones generales de audio

Universidad de Buenos Aires
2024

Leveraging Pre-Trained Autoencoders for Interpretable Prototype Learning of Music Audio

ICASSP 2024 Workshop on Explainable AI for Speech and Audio
2023

Phone and speaker spatial organization in self-supervised speech representations

ICASSP 2023
2022

Study of positional encoding approaches for Audio Spectrogram Transformers

ICASSP 2022
2021

Alzheimer Disease Recognition Using Speech-Based Embeddings From Pre-Trained Models

Interspeech 2021
2021

Emotion Recognition from Speech Using Wav2vec 2.0 Embeddings

Interspeech 2021
2020

Fusion approaches for emotion recognition from speech using acoustic and text-based features

ICASSP 2020