Martin Lebourdais

Extraction of end-to-end semantic information from audio signal
Starting: 01/10/2020
PhD Student: Martin Lebourdais
Advisor(s): Sylvain Meignier
Co-advisor(s): Antoine Laurent, Marie Tahon
Funding: ANR GEM
The GEM project aims to describe the differences in representation and treatment between women and men in the media, based on the automatic analysis of large volumes of French-language data contained in the INA and Deezer […]

Salima Mdhaffar

PhD defence, Salima Mdhaffar
Date: 01/07/2020
Time: 9h30
Location: Université d’Avignon, videoconference
Title: Speech Recognition in the context of lectures: Evaluation, Progress and Enrichment
Jury members:
Reviewers:
– Prof. Georges Linarès (Professor, Université d’Avignon)
– Dr. Irina Illina (Associate Professor with HDR, Université de Nancy)
Examiners:
– Prof. Sylvain Meignier (Professor, Le Mans Université) […]

ArSentimentAnalysis

Corpus: ArSentimentAnalysis (ArSentimentAnalysis)
GitHub: https://github.com/amirabaroumi/ArSentimentAnalysis
Author(s): Amira Barhoumi, Nathalie Camelin, Yannick Estève
The ArSentimentAnalysis package comprises a set of resources for designing and evaluating an opinion analysis system for Arabic. The package contains:
– Sets of pre-trained embeddings specific to Arabic
– The polarized lexicon ArSentLex
1/ Sets of embeddings specific to Arabic: Existing pre-trained embeddings represent a word […]

AlloSat

Corpus: AlloSat (AlloSat)
Licence: Creative Commons
Author(s): Manon Macary, Marie Tahon, Anthony Rousseau, Yannick Estève
The corpus, named AlloSat, is composed of real-life call center conversations in French and is continuously annotated for frustration and satisfaction. It was set up to support the development of new systems able to model the continuous aspect of semantic and paralinguistic information at the conversation level. […]

Multi30k

Corpus: Multi30k Dataset (Multi30k)
Licence: Attribution-NonCommercial-ShareAlike 4.0 International
GitHub: https://github.com/multi30k
Author(s): Loïc Barrault, Ozan Caglayan, Fethi Bougares
The Flickr30K Dataset contains 31,014 images sourced from online photo-sharing websites (Young et al., 2014). Each image is paired with five English descriptions, which were collected from Amazon Mechanical Turk. The dataset contains 145,000 training, 5,070 development, and 5,000 test descriptions. The Multi30K dataset […]

TSAC

Corpus: Tunisian Sentiment Analysis Corpus (TSAC)
Licence: GNU Lesser General Public License v3.0
GitHub: https://github.com/fbougares/TSAC
Author(s): Fethi Bougares, Salima Mdhaffar, Yannick Estève
About 17k user comments manually annotated with positive and negative polarities. The corpus was collected from Facebook users' comments written on the official pages of Tunisian radio and TV channels, namely Mosaique FM, JawhraFM, Shemes FM, HiwarElttounsi TV and Nessma […]

Pierre-Alexandre Broux

PhD defence, Pierre-Alexandre Broux
Date: 10/01/2020
Time: 14h00
Location: Room 210, IC2 building, LIUM, Le Mans Université
Title: Speaker diarization in audiovisual files in interaction with human annotators
Jury members:
Reviewers:
– Jean-François BONASTRE (LIA, Université d’Avignon)
– Nicholas EVANS (EURECOM)
Examiners:
– Régine ANDRE-OBRECHT (Université Toulouse 3)
Supervisor:
– […]

Active learning, interpretation and control for neural synthesis of expressive speech

Supervisors: Sylvain Meignier and Anthony Larcher
Co-supervisor(s): Marie Tahon
Email: firstname.lastname@univ-lemans.fr
Application deadline: 22 May 2020
Context: The thesis will take place at the Laboratoire d’Informatique de l’Université du Mans (LIUM) in the LST (Language and Speech Technology) team. […]

Extraction of end-to-end semantic information from an audio signal

Supervisor: Sylvain Meignier
Co-supervisor(s): Antoine Laurent, Nathalie Camelin, Nicolas Dugué
Keywords: Speech recognition and understanding, End2End approaches, neural networks, gender
Application deadline: 22 May 2020
Context: This thesis is part of the research topics of the Language and Speech Technologies (LST) […]

Temporal word embeddings: neologisms, gender bias, corpus of French news

Supervisor: Sylvain Meignier
Co-supervisor(s): Nicolas Dugué and Nathalie Camelin
Email: firstname.lastname@univ-lemans.fr
Application deadline: 22 May 2020
Keywords: Word embeddings, temporal corpus, gender study, neologism detection, media
Context: Television, literary production and the internet provide traces of our […]