{"id":26433,"date":"2025-02-13T15:39:22","date_gmt":"2025-02-13T14:39:22","guid":{"rendered":"https:\/\/lium.univ-lemans.fr\/?p=26433"},"modified":"2025-02-13T15:43:01","modified_gmt":"2025-02-13T14:43:01","slug":"towards-interpretable-representations-for-audio-and-speech-processing","status":"publish","type":"post","link":"https:\/\/lium.univ-lemans.fr\/en\/towards-interpretable-representations-for-audio-and-speech-processing\/","title":{"rendered":"Towards interpretable representations for audio and speech processing"},"content":{"rendered":"<div class=\"panel-grid\" id=\"pg-26433-0\" ><div class=\"panel-grid-core\"><div class=\"panel-grid-cell\" id=\"pgc-26433-0-0\" ><div class=\"panel-widget-style\" ><h2 style=\"color: #e5442d;\">Seminar from Th\u00e9o Mariotte, lecturer at LIUM<\/h2>\n<p>&nbsp;<\/p>\n<p><strong>Date:<\/strong> 24\/02\/2025<br \/>\n<strong>Time:<\/strong> 10h30<br \/>\n<strong>Place:<\/strong> IC2, Boardroom<br \/>\n<strong>Speaker: <\/strong> <a href=\"http:\/\/lium.univ-lemans.fr\/en\/team\/theo-mariotte-2\/\">Th\u00e9o Mariotte<\/a><br \/>\n&nbsp;<br \/>\n&nbsp;<\/p>\n<p align=\"center\"><strong>Towards interpretable representations for audio and speech processing<\/strong><\/p>\n<p>&nbsp;<\/p>\n<p>This seminar is divided into two main parts. The first part reviews my previous research, while the second explores future research directions.<\/p>\n<p>In the first section, I will briefly introduce the methods developed during my thesis before delving deeper into my postdoctoral work. Specifically, I will present Annealed Multiple Choice Learning, a general training framework with applications to source separation. This method trains multiple hypotheses to handle ambiguous tasks effectively. 
Additionally, I will discuss the application of neural clustering for jointly performing source separation and speaker diarization in long-form meeting recordings.<\/p>\n<p>The second part of the seminar will focus on speaker segmentation in the multi-microphone scenario. The proposed method, still a work in progress, combines spatial filtering, source localization, and voice activity detection to predict speaker activity. This approach aims to be more interpretable and to require fewer trainable parameters. I will also discuss the challenges of simulating training data and share the difficulties I have encountered. Finally, I will introduce other research directions, including disentangled self-supervised representation learning and large-scale source separation.<\/p><\/div><\/div><\/div><\/div><div class=\"panel-grid\" id=\"pg-26433-1\" ><div class=\"panel-grid-core\"><div class=\"panel-grid-cell\" id=\"pgc-26433-1-0\" ><div class=\"panel-widget-style\" ><div class=\"margin10\"><\/div><\/div><\/div><\/div><\/div><div class=\"panel-grid\" id=\"pg-26433-2\" ><div class=\"panel-grid-core\"><div class=\"panel-grid-cell\" id=\"pgc-26433-2-0\" >&nbsp;<\/div><div class=\"panel-grid-cell\" id=\"pgc-26433-2-1\" >&nbsp;<\/div><div class=\"panel-grid-cell\" id=\"pgc-26433-2-2\" >&nbsp;<\/div><\/div><\/div>","protected":false},"excerpt":{"rendered":"<p>Seminar from Th\u00e9o Mariotte, lecturer at LIUM &nbsp; Date: 24\/02\/2025 Time: 10h30 Place: IC2, Boardroom Speaker: Th\u00e9o Mariotte &nbsp; &nbsp; Towards interpretable representations for audio and speech processing &nbsp; This seminar is divided into two main parts. The first part reviews my previous research, while the second explores future research directions. 
In the first section, [&hellip;]<\/p>\n<p class=\"more-link style2\"><a href=\"https:\/\/lium.univ-lemans.fr\/en\/towards-interpretable-representations-for-audio-and-speech-processing\/\"  class=\"themebutton\"><span class=\"more-text\">READ MORE<\/span><span class=\"more-icon\"><i class=\"fa fa-angle-right fa-lg\"><\/i><\/span><\/a><\/p>\n","protected":false},"author":14,"featured_media":13238,"comment_status":"closed","ping_status":"closed","sticky":false,"template":"","format":"standard","meta":[],"categories":[46,43],"tags":[49],"acf":[],"_links":{"self":[{"href":"https:\/\/lium.univ-lemans.fr\/en\/wp-json\/wp\/v2\/posts\/26433"}],"collection":[{"href":"https:\/\/lium.univ-lemans.fr\/en\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/lium.univ-lemans.fr\/en\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/lium.univ-lemans.fr\/en\/wp-json\/wp\/v2\/users\/14"}],"replies":[{"embeddable":true,"href":"https:\/\/lium.univ-lemans.fr\/en\/wp-json\/wp\/v2\/comments?post=26433"}],"version-history":[{"count":0,"href":"https:\/\/lium.univ-lemans.fr\/en\/wp-json\/wp\/v2\/posts\/26433\/revisions"}],"wp:featuredmedia":[{"embeddable":true,"href":"https:\/\/lium.univ-lemans.fr\/en\/wp-json\/wp\/v2\/media\/13238"}],"wp:attachment":[{"href":"https:\/\/lium.univ-lemans.fr\/en\/wp-json\/wp\/v2\/media?parent=26433"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/lium.univ-lemans.fr\/en\/wp-json\/wp\/v2\/categories?post=26433"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/lium.univ-lemans.fr\/en\/wp-json\/wp\/v2\/tags?post=26433"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}