PhD defence, Amira Barhoumi

Date : 23/11/2020
Time : 10h00
Location : IC2 building, LIUM, Le Mans Université

Title : Neural approach for Arabic sentiment analysis

Jury members :
– Kamel Smaïli, Professor, Université de Lorraine, France
– Nadia Essoussi, Professor, Université de Tunis, Tunisia
– Emmanuel Morin, Professor, Université de Nantes, France
– Anthony Larcher, Professor, Le Mans Université, France

– Yannick Estève, Professor, Université d’Avignon, France
– Lamia Hadrich Belguith, Professor, FSEGS, Université de Sfax, Tunisia

Co-supervisor :
– Nathalie Camelin (LIUM, Le Mans Université)

Invited jury member :
– Chafik Aloulou, Lecturer, FSEGS, Université de Sfax, Tunisia

Abstract :

My thesis falls within the field of Arabic sentiment analysis. Its aim is to determine the overall polarity of a given text written in Modern Standard Arabic (MSA) or dialectal Arabic. Sentiment analysis has been the subject of numerous studies dealing with Indo-European languages, in particular English. One of the difficulties addressed in this thesis is the processing of Arabic. Arabic is a morphologically rich language, which implies greater data sparsity: we aim to overcome this problem by producing, in a fully automatic way, new Arabic-specific embeddings.

Our study focuses on the use of a neural approach to improve polarity detection, using embeddings. Embeddings have proved fundamental in various natural language processing (NLP) tasks.

Our contribution in this thesis spans several axes. First, we begin with a preliminary study of the various existing pre-trained word embedding resources for Arabic. These embeddings treat words as space-separated units in order to capture semantic and syntactic similarities in the embedding space.

Second, we focus on the specificity of the Arabic language. We propose Arabic-specific embeddings that take into account the agglutination and morphological richness of Arabic. These specific embeddings have been used, alone and in combination, as input to neural networks, providing an improvement in classification performance. Finally, we evaluate embeddings with intrinsic and extrinsic methods specific to the sentiment analysis task. For intrinsic embedding evaluation, we propose a new protocol introducing the notion of sentiment stability in the embedding space. We also propose a qualitative extrinsic analysis of our embeddings using visualisation methods.

Keywords :
Sentiment analysis, convolutional neural network, recurrent neural network, embeddings, Arabic language.