{"id":24479,"date":"2020-09-14T11:14:05","date_gmt":"2020-09-14T09:14:05","guid":{"rendered":"https:\/\/lium.univ-lemans.fr\/?p=24479"},"modified":"2024-02-06T13:48:32","modified_gmt":"2024-02-06T12:48:32","slug":"thibault-prouteau","status":"publish","type":"post","link":"https:\/\/lium.univ-lemans.fr\/en\/thibault-prouteau\/","title":{"rendered":"Thibault Prouteau"},"content":{"rendered":"<div class=\"panel-grid\" id=\"pg-24479-0\" ><div class=\"panel-grid-core\"><div class=\"panel-grid-cell\" id=\"pgc-24479-0-0\" ><div class=\"panel-widget-style\" ><h2 style=\"color: #e5442d;\">Temporal word embeddings: neologisms, gender bias, corpus of French news<br\/><\/h2><p><b>Starting: <\/b> 01\/10\/2020<br\/><b>PhD Student: <\/b> <a href=\"\" target=\"_blank\" >Thibault Prouteau<\/a><br\/><b>Advisor(s): <\/b> Sylvain Meignier<br\/><b>Co-advisor(s): <\/b> Nicolas Dugu\u00e9 <br\/><b>Funding: <\/b> Research grant from the French Ministry of Higher Education<br\/><\/p><\/div><\/div><\/div><\/div><div class=\"panel-grid\" id=\"pg-24479-1\" ><div class=\"panel-grid-core\"><div class=\"panel-grid-cell\" id=\"pgc-24479-1-0\" ><div class=\"panel-widget-style\" ><p><strong>Thesis context:<\/strong><\/p>\n<p align=\"justify\">Television, literary production and the internet all leave traces of how we use language [6]. Thanks to Ina, the memory of television endures, and looking back at the past shows how much the language evolves [9]. With the modern Web, users relate to the news differently: they can react to it online, very quickly and with greater linguistic creativity (hashtags, acronyms, etc.). The internet therefore fosters the creation of new words and the emergence of new meanings, reinventing our language every day [10]. 
Finally, through the digitization of printed material, literary production is accessible to everyone, tracing the evolution of the language since the nineteenth century [7].<br \/>\nThese media thus make it possible to build temporal text corpora, which are valuable resources for studying the evolution of our language and our society.\n<\/p>\n<p>&nbsp;<br \/>\n<strong>Description<\/strong><\/p>\n<p align=\"justify\"><em>Word embedding<\/em> methods offer new possibilities for studying text corpora [8], particularly regarding the semantics of the vocabulary used in these corpora. At the crossroads of artificial intelligence and digital humanities, this project aims to provide the community with robust tools for building interpretable temporal word embeddings [2] and to apply them in contexts such as detecting and characterizing neologisms in large text corpora [4], studying the evolution of television language through transcribed Ina corpora [6], or assessing how gender stereotypes evolve over time [1, 5, 3].<\/p>\n<p>&nbsp;<br \/>\n<strong>Bibliography<\/strong><\/p>\n<p align=\"justify\">[1] Tolga Bolukbasi, Kai-Wei Chang, James Y Zou, Venkatesh Saligrama, and Adam T Kalai. Man is to computer programmer as woman is to homemaker? Debiasing word embeddings. In Advances in neural information processing systems, pages 4349-4357, 2016.<br \/>\n[2] Nicolas Dugu\u00e9 and Victor Connes. Complex networks based word embeddings. arXiv preprint arXiv:1910.01489, 2019.<br \/>\n[3] Hila Gonen and Yoav Goldberg. 
Lipstick on a pig: Debiasing methods cover up systematic gender biases in word embeddings but do not remove them. arXiv preprint arXiv:1903.03862, 2019.<br \/>\n[4] William L. Hamilton, Jure Leskovec, and Dan Jurafsky. Diachronic word embeddings reveal statistical laws of semantic change. arXiv preprint arXiv:1605.09096, 2016.<br \/>\n[5] Keita Kurita, Nidhi Vyas, Ayush Pareek, Alan W Black, and Yulia Tsvetkov. Measuring bias in contextualized word representations. arXiv preprint arXiv:1906.07337, 2019.<br \/>\n[6] Jean Lagane. L&#8217;\u00e9volution du langage radiophonique. Communication &#038; Langages, 111(1):39-52, 1997.<br \/>\n[7] Zehua Liu. A diachronic study on British and Chinese cultural complexity with Google Books Ngrams. Journal of Quantitative Linguistics, 23(4):361-373, 2016.<br \/>\n[8] Jeffrey Pennington, Richard Socher, and Christopher Manning. GloVe: Global vectors for word representation. In EMNLP, pages 1532-1543, 2014.<br \/>\n[9] Jane Stuart-Smith, Gwilym Pryce, Claire Timmins, and Barrie Gunter. Television can also be a factor in language change: Evidence from an urban dialect. Language, 89(3):501-536, 2013.<br \/>\n[10] Sali A Tagliamonte et al. So sick or so cool? The language of youth on the internet. 
Language in Society, 45(1):1-32, 2016.\n<\/p><\/div><\/div><\/div><\/div><div class=\"panel-grid\" id=\"pg-24479-2\" ><div class=\"panel-grid-core\"><div class=\"panel-grid-cell\" id=\"pgc-24479-2-0\" >&nbsp;<\/div><div class=\"panel-grid-cell\" id=\"pgc-24479-2-1\" >&nbsp;<\/div><div class=\"panel-grid-cell\" id=\"pgc-24479-2-2\" >&nbsp;<\/div><\/div><\/div>","protected":false},"excerpt":{"rendered":"<p>Temporal word embeddings: neologisms, gender bias, corpus of French news. Starting: 01\/10\/2020. PhD Student: Thibault Prouteau. Advisor(s): Sylvain Meignier. Co-advisor(s): Nicolas Dugu\u00e9. Funding: Research grant from the French Ministry of Higher Education. Thesis context: Television, literary production and the internet all leave traces of how we use language [6]. Thanks to Ina, the memory of [&hellip;]<\/p>\n<p class=\"more-link style2\"><a href=\"https:\/\/lium.univ-lemans.fr\/en\/thibault-prouteau\/\"  class=\"themebutton\"><span class=\"more-text\">READ MORE<\/span><span class=\"more-icon\"><i class=\"fa fa-angle-right 
fa-lg\"><\/i><\/span><\/a><\/p>\n","protected":false},"author":14,"featured_media":13249,"comment_status":"closed","ping_status":"closed","sticky":false,"template":"","format":"standard","meta":[],"categories":[52],"tags":[49],"acf":[],"_links":{"self":[{"href":"https:\/\/lium.univ-lemans.fr\/en\/wp-json\/wp\/v2\/posts\/24479"}],"collection":[{"href":"https:\/\/lium.univ-lemans.fr\/en\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/lium.univ-lemans.fr\/en\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/lium.univ-lemans.fr\/en\/wp-json\/wp\/v2\/users\/14"}],"replies":[{"embeddable":true,"href":"https:\/\/lium.univ-lemans.fr\/en\/wp-json\/wp\/v2\/comments?post=24479"}],"version-history":[{"count":0,"href":"https:\/\/lium.univ-lemans.fr\/en\/wp-json\/wp\/v2\/posts\/24479\/revisions"}],"wp:featuredmedia":[{"embeddable":true,"href":"https:\/\/lium.univ-lemans.fr\/en\/wp-json\/wp\/v2\/media\/13249"}],"wp:attachment":[{"href":"https:\/\/lium.univ-lemans.fr\/en\/wp-json\/wp\/v2\/media?parent=24479"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/lium.univ-lemans.fr\/en\/wp-json\/wp\/v2\/categories?post=24479"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/lium.univ-lemans.fr\/en\/wp-json\/wp\/v2\/tags?post=24479"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}