{"id":25365,"date":"2021-06-10T17:06:28","date_gmt":"2021-06-10T15:06:28","guid":{"rendered":"https:\/\/lium.univ-lemans.fr\/?p=25365"},"modified":"2021-06-11T09:32:46","modified_gmt":"2021-06-11T07:32:46","slug":"seminaire-martin-lebourdais-et-theo-mariotte","status":"publish","type":"post","link":"https:\/\/lium.univ-lemans.fr\/en\/seminaire-martin-lebourdais-et-theo-mariotte\/","title":{"rendered":"Joint effort on data loading process and its application to VAD, speaker turn detection and overlap detection"},"content":{"rendered":"<div class=\"panel-grid\" id=\"pg-25365-0\" ><div class=\"panel-grid-core\"><div class=\"panel-grid-cell\" id=\"pgc-25365-0-0\" ><div class=\"panel-widget-style\" ><h2 style=\"color: #e5442d;\">Seminar from Martin Lebourdais and Th\u00e9o Mariotte, PhD students at LIUM<\/h2>\n<p>&nbsp;<\/p>\n<p><strong>Date:<\/strong> 18\/06\/2021<br \/>\n<strong>Time:<\/strong> 11h00<br \/>\n<strong>Location:<\/strong> IC2 Salle des conseils, <a href=\"https:\/\/univ-lemans-fr.zoom.us\/j\/92143709904?pwd=Qy9QTXUydExNSlFnS0pWY0ZaNXpzUT09\">online<\/a><br \/>\n<strong>Speakers:<\/strong> <a href=\"http:\/\/lium.univ-lemans.fr\/team\/martin-lebourdais\/\">Martin Lebourdais<\/a> and <a href=\"http:\/\/lium.univ-lemans.fr\/team\/theo-mariotte-2\/\">Th\u00e9o Mariotte<\/a><\/p>\n<p>&nbsp;<\/p>\n<p align=\"center\"><strong>Joint effort on data loading process and its application to VAD, speaker turn detection and overlap detection<\/strong><\/p>\n<p>&nbsp;<\/p>\n<p align=\"justify\">Diarization is the task of finding \u201cWho spoke when?\u201d in an audio stream. It relies on two subtasks, segmentation and clustering. Segmentation most often includes voice activity detection (VAD), overlapped speech detection (OSD) and speaker change detection (SCD). All three tasks take a sequence (the audio signal) as input and output a sequence. Classifying the frames of the output sequence segments the audio signal, i.e. 
finding boundaries in the speech signal between the regions of interest.<br \/>\nOne difficulty when training such a system comes from the imbalance of the classes to be detected, especially for overlap and speaker change detection.<\/p>\n<p align=\"justify\">This work introduces a data loading process (DataLoader) that formats and distributes the speech segments used to train these three tasks. In addition, to overcome the imbalance of the data, we propose a segment selection process (DataSampler) that precisely controls the proportion of examples of each class (speech, overlap, speaker change, non-speech) in each training mini-batch. The data loading process also enables the use of multi-microphone recordings, which are investigated in the context of speech segmentation. Experimental evaluation is being carried out on the multi-microphone speech corpus AMI and the corpus of the DiHard diarization challenge.<\/p><\/div><\/div><\/div><\/div><div class=\"panel-grid\" id=\"pg-25365-1\" ><div class=\"panel-grid-core\"><div class=\"panel-grid-cell\" id=\"pgc-25365-1-0\" >&nbsp;<\/div><div class=\"panel-grid-cell\" id=\"pgc-25365-1-1\" >&nbsp;<\/div><div class=\"panel-grid-cell\" id=\"pgc-25365-1-2\" >&nbsp;<\/div><\/div><\/div>","protected":false},"excerpt":{"rendered":"<p>Seminar from Martin Lebourdais and Th\u00e9o Mariotte, PhD students at LIUM &nbsp; Date: 18\/06\/2021 Time: 11h00 Location: IC2 Salle des conseils, online Speakers: Martin Lebourdais and Th\u00e9o Mariotte &nbsp; Joint effort on data loading process and its application to VAD, speaker turn detection and overlap detection &nbsp; Diarization is the task of finding \u201cWho spoke when?\u201d [&hellip;]<\/p>\n<p class=\"more-link style2\"><a href=\"https:\/\/lium.univ-lemans.fr\/en\/seminaire-martin-lebourdais-et-theo-mariotte\/\"  class=\"themebutton\"><span class=\"more-text\">READ MORE<\/span><span class=\"more-icon\"><i class=\"fa fa-angle-right 
fa-lg\"><\/i><\/span><\/a><\/p>\n","protected":false},"author":14,"featured_media":13238,"comment_status":"closed","ping_status":"closed","sticky":false,"template":"","format":"standard","meta":[],"categories":[43],"tags":[49],"acf":[],"_links":{"self":[{"href":"https:\/\/lium.univ-lemans.fr\/en\/wp-json\/wp\/v2\/posts\/25365"}],"collection":[{"href":"https:\/\/lium.univ-lemans.fr\/en\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/lium.univ-lemans.fr\/en\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/lium.univ-lemans.fr\/en\/wp-json\/wp\/v2\/users\/14"}],"replies":[{"embeddable":true,"href":"https:\/\/lium.univ-lemans.fr\/en\/wp-json\/wp\/v2\/comments?post=25365"}],"version-history":[{"count":0,"href":"https:\/\/lium.univ-lemans.fr\/en\/wp-json\/wp\/v2\/posts\/25365\/revisions"}],"wp:featuredmedia":[{"embeddable":true,"href":"https:\/\/lium.univ-lemans.fr\/en\/wp-json\/wp\/v2\/media\/13238"}],"wp:attachment":[{"href":"https:\/\/lium.univ-lemans.fr\/en\/wp-json\/wp\/v2\/media?parent=25365"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/lium.univ-lemans.fr\/en\/wp-json\/wp\/v2\/categories?post=25365"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/lium.univ-lemans.fr\/en\/wp-json\/wp\/v2\/tags?post=25365"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}