
Transdisciplinary Analysis of French Newsreels (1945-1969) (Antract)

Date: 10/2017 – 03/2022
Funding: ANR
Call: Generic
Partners: INA (France), EURECOM (France), Voxolab (France), CHS (France)
URL:
Participant(s): Simon Petitrenaud, Antoine Laurent, Sylvain Meignier, Pierre-Alexandre Broux

The general objective of the ANTRACT project is the analysis of the images and sounds produced weekly in the framework of an independent company created in 1945, les Actualités […]

Sahar Ghannay

PhD defence, Sahar Ghannay
Title: A study of continuous word representations applied to ASR error detection
Composition of the jury:
Chair: Martine Adda-Decker
Reviewers: Sophie Rosset, Frédéric Béchet
Examiners: Benoit Favre, Benjamin Lecouteux
Supervisor: Yannick Estève
Co-supervisor: Nathalie Camelin

Abstract: This thesis concerns a study of continuous […]

Antoine Caubrière

Deep neural networks for oral and written language processing
Starting: 04/09/2017
PhD Student: Antoine Caubrière
Advisor(s): Yannick Estève (LIUM, LST)
Co-advisor(s): Antoine Laurent (LIUM, LST) & Emmanuel Morin (LS2N)
Funding: RAPACE Project

The aim of this thesis is to develop a named entity recognition system in an audio stream that will rely solely on a deep neural network. Until now, this […]

Amira Barhoumi

Towards a hybrid approach for Arabic Sentiment Analysis
Starting: 03/10/2016
PhD Student: Amira Barhoumi
Advisor(s): Yannick Estève (LIUM, LST)
Co-advisor(s): Nathalie Camelin (LIUM, LST) & Lamia Hadrich Belguith (MIRACL, Tunisia)
Funding: Agreement “Cotutelle Convention” (LIUM, LST) & (MIRACL, Tunisia)

Sentiment analysis is a growing field of research and has been the subject of numerous studies. This thesis aims at designing a hybrid […]

TED-LIUM Release 3

Corpus: TED-LIUM Release 3
Licence: Creative Commons BY-NC-ND 3.0 (attribution/non-commercial/no-derivatives)
Author(s): François Fernandez, Vincent Nguyen, Sahar Ghannay, Natalia Tomashenko, Yannick Estève

This is the TED-LIUM corpus release 3, licensed under Creative Commons BY-NC-ND 3.0. All talks and text are property of TED Conferences LLC. This new TED-LIUM release was made through a collaboration between the Ubiqus company and the […]

TED-LIUM Release 2

Corpus: TED-LIUM Release 2
Licence: Creative Commons BY-NC-ND 3.0 (attribution/non-commercial/no-derivatives)
Author(s): Anthony Rousseau, Paul Deléglise, Yannick Estève

This is the TED-LIUM corpus release 2, licensed under Creative Commons BY-NC-ND 3.0. The TED-LIUM corpus was made from audio talks and their transcriptions available on the TED website. We have prepared and filtered these data in order to train acoustic […]

TED-LIUM Release 1

Corpus: TED-LIUM Release 1
Licence: Creative Commons BY-NC-ND 3.0 (attribution/non-commercial/no-derivatives)
Author(s): Anthony Rousseau, Paul Deléglise, Yannick Estève

This is the TED-LIUM corpus release 1, licensed under Creative Commons BY-NC-ND 3.0. The TED-LIUM corpus consists of English-language TED talks, with transcriptions, sampled at 16 kHz. It contains about 118 hours of speech. More details are given in this paper: A. […]


Software: NMTPY
Licence: MIT License
GitHub:
Author(s): Ozan Caglayan, Mercedes García Martínez, Adrien Bardet, Walid Aransa, Loïc Barrault, Fethi Bougares

nmtpy is a suite of Python tools, primarily based on the starter code provided in dl4mt-tutorial for training neural machine translation networks using Theano. The basic motivation behind forking dl4mt-tutorial was to create a framework where it would be easy to implement […]


Software: NMTPYTORCH
Licence: MIT License
GitHub:
Author(s): Ozan Caglayan, Mercedes García Martínez, Adrien Bardet, Walid Aransa, Fethi Bougares, Loïc Barrault

This is the PyTorch fork of nmtpy, a sequence-to-sequence framework which was originally a fork of dl4mt-tutorial.

LIUM Speaker Diarization

Software: LIUM Speaker Diarization
Licence: GPL
URL:

Speaker segmentation and clustering (speaker diarization) in Java.