Enriching acoustic representations for low-resource languages – Laboratoire d'Informatique de l'Université du Mans

Seminar by Yannick Yomie Nzeuhang, PhD student at Université Yaoundé 1 and ESPERANTO secondee at LIUM

Date: 21/06/2024
Time: 14h00
Location: IC2, Boardroom
Speaker: Yannick Yomie Nzeuhang

Enriching acoustic representations for low-resource languages (LRLs)

A popular approach in the literature for speech recognition in low-resource languages is fine-tuning. It generally relies on a multilingual feature extraction model, assumed to be general enough to be used whatever the target language. However, several studies have shown that the performance of these models depends on the linguistic distance between the languages used to pre-train the model and the target language. Furthermore, specializing these feature extraction models for a low-resource language is difficult because they are data-intensive.

For low-resource languages, we propose an approach for learning acoustic representations for speech recognition. The approach uses graph neural networks to enrich acoustic features with linguistic information. We evaluated the potential of this enrichment by comparing the quality of acoustic-only features with that of features combined with linguistic information. Preliminary results on isolated word recognition with the Google Speech Commands dataset tend to confirm that this approach improves performance.
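
As a rough illustration of what enriching acoustic features through a graph can look like, here is a minimal NumPy sketch of a single graph convolution step. It is not the speaker's model: the node set (one node per word), the adjacency matrix standing in for linguistic links, the feature dimensions, and the function names `normalize_adjacency`, `gcn_layer` and `enrich` are all assumptions made for this example, and the weights are random rather than learned.

```python
import numpy as np


def normalize_adjacency(adj):
    """Symmetric normalization D^{-1/2} (A + I) D^{-1/2}, with self-loops added."""
    a_hat = adj + np.eye(adj.shape[0])
    deg = a_hat.sum(axis=1)
    d_inv_sqrt = 1.0 / np.sqrt(deg)
    return a_hat * d_inv_sqrt[:, None] * d_inv_sqrt[None, :]


def gcn_layer(features, adj, weight):
    """One graph convolution: ReLU(A_norm @ X @ W)."""
    return np.maximum(normalize_adjacency(adj) @ features @ weight, 0.0)


def enrich(acoustic, adj, weight):
    """Concatenate the raw acoustic features with their graph-propagated version."""
    return np.concatenate([acoustic, gcn_layer(acoustic, adj, weight)], axis=1)


rng = np.random.default_rng(0)

# Hypothetical data: 4 word nodes, each with an 8-dimensional acoustic embedding.
acoustic = rng.normal(size=(4, 8))

# Hypothetical linguistic links between words (node 3 has no neighbours).
adj = np.array(
    [
        [0, 1, 0, 0],
        [1, 0, 1, 0],
        [0, 1, 0, 0],
        [0, 0, 0, 0],
    ],
    dtype=float,
)

# Untrained projection matrix; in a real system this would be learned.
weight = rng.normal(size=(8, 5))

enriched = enrich(acoustic, adj, weight)
print(enriched.shape)  # (4, 13)
```

Comparing a classifier trained on `acoustic` alone with one trained on `enriched` mirrors, in spirit, the acoustic-only versus combined comparison described in the abstract.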