Seminar by Hugo Riguidel, PhD student at LIUM

 

Date: 26/05/2023
Time: 11:00
Location: IC2, Boardroom
Speaker: Hugo Riguidel
 
 

ON-TRAC consortium systems for the IWSLT 2023 dialectal and low-resource speech translation tasks

 
 

This paper describes the ON-TRAC consortium speech translation systems developed for the IWSLT 2023 evaluation campaign. Overall, we participated in three speech translation tracks featured in the low-resource and dialect speech translation shared tasks, namely: i) spoken Tamasheq to written French, ii) spoken Pashto to written French, and iii) spoken Tunisian Arabic to written English. All our primary submissions are based on an end-to-end speech-to-text neural architecture that uses a pretrained SAMU-XLSR model as the speech encoder and an mBART model as the decoder.
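As an illustration only (not the authors' actual code), the sketch below shows how such an end-to-end system can be assembled from a pretrained speech encoder and an mBART decoder with the Hugging Face Transformers library. The checkpoint names are assumptions: a plain XLS-R checkpoint stands in for SAMU-XLSR, and the target-language code "fr_XX" corresponds to the French-output tracks.

# Sketch: composing a speech encoder with an mBART decoder for speech-to-text
# translation. Checkpoint names are illustrative stand-ins, not the paper's models.
from transformers import (SpeechEncoderDecoderModel, AutoFeatureExtractor,
                          AutoTokenizer)

encoder_id = "facebook/wav2vec2-xls-r-300m"   # stand-in for the SAMU-XLSR encoder
decoder_id = "facebook/mbart-large-50"        # multilingual text decoder

model = SpeechEncoderDecoderModel.from_encoder_decoder_pretrained(
    encoder_id, decoder_id
)
feature_extractor = AutoFeatureExtractor.from_pretrained(encoder_id)
tokenizer = AutoTokenizer.from_pretrained(decoder_id)

# The composed model needs decoding defaults; start decoding in the target language.
model.config.decoder_start_token_id = tokenizer.lang_code_to_id["fr_XX"]
model.config.pad_token_id = tokenizer.pad_token_id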

The SAMU-XLSR model is built on top of XLS-R (pretrained on 128 languages) in order to generate language-agnostic sentence-level embeddings. Its training is guided by the LaBSE model, itself trained on a multilingual text dataset. This architecture improves the input speech representations and yields significant gains over conventional end-to-end speech translation systems.
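The following schematic sketch, under assumptions about dimensions and pooling (1024-dimensional XLS-R states, 768-dimensional LaBSE embeddings, simple mean pooling), illustrates the idea behind this kind of training: the pooled speech representation is projected into the LaBSE embedding space and pulled toward the frozen LaBSE embedding of the utterance's transcript with a cosine-similarity loss. It is not the authors' implementation.

# Sketch of a SAMU-XLSR-style alignment objective (illustrative, not the paper's code).
import torch
import torch.nn as nn

class SentenceEmbeddingHead(nn.Module):
    def __init__(self, xlsr_dim: int = 1024, labse_dim: int = 768):
        super().__init__()
        self.proj = nn.Linear(xlsr_dim, labse_dim)

    def forward(self, frame_states: torch.Tensor) -> torch.Tensor:
        # frame_states: (batch, frames, xlsr_dim) from the speech encoder
        pooled = frame_states.mean(dim=1)       # utterance-level pooling
        return torch.tanh(self.proj(pooled))    # sentence-level speech embedding

def cosine_alignment_loss(speech_emb: torch.Tensor,
                          labse_emb: torch.Tensor) -> torch.Tensor:
    # Pull the speech embedding toward the frozen LaBSE embedding of the transcript.
    cos = nn.functional.cosine_similarity(speech_emb, labse_emb, dim=-1)
    return (1.0 - cos).mean()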