Seminar by Lucas Ondel, postdoc at Université Paris-Saclay
Non-Parametric Subspace Models for Acoustic Units Discovery
This talk is about non-parametric subspace models for learning a set of acoustic units from unlabeled speech recordings. I will show how we can leverage phonetically transcribed data to learn the notion of a phone (encoded as a phonetic subspace) and then “project” data from other languages into this subspace to learn their phonetics. I will show that the Bayesian (non-parametric) framework is a natural candidate for implementing this idea and that it clearly outperforms neural network baselines such as vq-wav2vec and VQ-VAE.