PhD defence, Alexis Conneau
Title: Learning distributed representations of sentences using neural networks.
Jury members:
- Claire Gardent, First-Class Research Director (Directrice de Recherche Première Classe) at CNRS/LORIA, Reviewer
- François Yvon, Professor at the Université d’Orsay, Reviewer
- Yann LeCun, Professor at New York University and Facebook AI Research, Examiner
- Chris Dyer, Research Scientist at Google DeepMind, Examiner
- Paul Deléglise, Professor Emeritus at the Université du Mans, Supervisor
- Loïc Barrault, Associate Professor (Maître de conférences) at the Université du Mans, Co-supervisor
- Holger Schwenk, Professor at the Université du Maine and Facebook AI Research, Co-supervisor
Being able to learn generic representations of objects such as images, words or sentences is essential to building machines that have a broad understanding of the world. Through transfer learning, neural networks can learn representations from high-resource tasks and then leverage them to improve performance on low-resource tasks. While transfer learning has been very successful for transferring image features learned on ImageNet to low-resource visual understanding tasks, generic text representations learned with neural networks were mostly limited to word embeddings. This dissertation presents a full study of sentence embeddings, through which I discuss how I have pushed the state of the art in monolingual and cross-lingual general-purpose embeddings. The first contributions of this thesis include SentEval, a transfer learning evaluation toolkit for universal sentence embeddings; InferSent, a state-of-the-art generic sentence encoder; and probing tasks, through which sentence encoders are analyzed and probed for linguistic properties.
We show in this first part that generic representations of sentences can be built and that they provide powerful out-of-the-box features of sentences. In the second part of my PhD, my contributions have centered on aligning distributions of words and sentences in many languages. I show for the first time that it is possible to build generic cross-lingual word and sentence embedding spaces in a completely unsupervised way, without any parallel data. In particular, we show that we can perform word translation without parallel data, which was the building block for the new research field of “unsupervised machine translation”. My last contribution, on cross-lingual language modeling, shows that state-of-the-art sentence representations can be aligned in a completely unsupervised way, leading to a new state of the art on supervised and unsupervised machine translation, and on the zero-shot cross-lingual classification benchmark called “XNLI”.
Machine learning, deep neural networks, sentence embeddings