Seminar by Hugo Riguidel, PhD student at LIUM


Date: 26/05/2023
Time: 11:00
Location: IC2, salle des conseils
Speaker: Hugo Riguidel

ON-TRAC consortium systems for the IWSLT 2023 dialectal and low-resource speech translation tasks


This paper describes the ON-TRAC consortium speech translation systems developed for the IWSLT 2023 evaluation campaign. We participated in three speech translation tracks from the low-resource and dialect speech translation shared tasks: i) spoken Tamasheq to written French, ii) spoken Pashto to written French, and iii) spoken Tunisian to written English. All our primary submissions are based on an end-to-end speech-to-text neural architecture that uses a pretrained SAMU-XLSR model as the speech encoder and an mBART model as the decoder.

The SAMU-XLSR model is built from the XLS-R 128 model in order to generate language-agnostic sentence-level embeddings. Its training is driven by the LaBSE model, which is trained on a multilingual text dataset. This architecture allows us to improve the input speech representations and achieve significant improvements over conventional end-to-end speech translation systems.
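
To make the idea of a text model "driving" a speech encoder more concrete, here is a minimal sketch of one way such guidance can work: frame-level speech vectors are pooled into a single sentence-level vector, which is then pulled toward the text embedding of the transcript. The mean pooling, the cosine-distance objective, the function names and the toy vectors are all assumptions made for illustration; the abstract does not specify these details.

```python
import math


def mean_pool(frames):
    """Average frame-level vectors into one sentence-level vector.

    Mean pooling is an assumed stand-in; the abstract does not say
    how frames are aggregated.
    """
    dim = len(frames[0])
    return [sum(frame[i] for frame in frames) / len(frames) for i in range(dim)]


def cosine_distance(u, v):
    """Return 1 - cos(u, v); 0.0 means the vectors point the same way."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return 1.0 - dot / (norm_u * norm_v)


# Toy stand-ins (hypothetical values): three speech frames from the
# encoder and one sentence embedding from the text model.
speech_frames = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
text_embedding = [1.0, 1.0]

speech_embedding = mean_pool(speech_frames)
# Assumed training signal: shrink this distance so that speech and
# text embeddings of the same sentence end up close together.
loss = cosine_distance(speech_embedding, text_embedding)
```

Under this assumed setup, minimizing the distance over many utterance and transcript pairs would place speech from different languages in the same sentence-embedding space as the text model.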