{"id":26521,"date":"2025-05-22T14:01:52","date_gmt":"2025-05-22T12:01:52","guid":{"rendered":"https:\/\/lium.univ-lemans.fr\/?p=26521"},"modified":"2025-05-22T15:46:30","modified_gmt":"2025-05-22T13:46:30","slug":"ckbens2tt","status":"publish","type":"post","link":"https:\/\/lium.univ-lemans.fr\/en\/ckbens2tt\/","title":{"rendered":"Donn\u00e9es pseudo-\u00e9tiquet\u00e9es de kurde central vers l\u2019anglais pour la traduction de la parole"},"content":{"rendered":"<div class=\"panel-grid\" id=\"pg-26521-0\" ><div class=\"panel-grid-core\"><div class=\"panel-grid-cell\" id=\"pgc-26521-0-0\" ><div class=\"panel-widget-style\" ><h2 style=\"color: #e5442d;\"><b>Corpus: <\/b>Central Kurdish to English Pseudo-Labeled Data for Speech Translation (Donn\u00e9es pseudo-\u00e9tiquet\u00e9es de kurde central vers l\u2019anglais pour la traduction de la parole)<br\/><\/h2><p><b>Licence: <\/b> CC BY 4.0 license<br\/><div class=\"table-responsive\" style=\"margin-left:-20px;\" ><table class=\"table\" style=\"border: 0 solid #ffffff;\"><tr class=\"col-sm-12\" ><td style=\"border: 0;\"><br\/><b>Author(s): <\/b><\/td><td style=\"border: 0\"><center><a href=\"https:\/\/lium.univ-lemans.fr\/en\/team\/mohammad-mohammadamini\/\" target=\"_blank\" ><img alt=\"User Pic\" src=https:\/\/lium.univ-lemans.fr\/wp-content\/uploads\/2024\/04\/Aran-Mohammadamini.jpeg class=\"img-circle img-responsive\" height=\"60\" width=\"60\"><\/a><a href=\"https:\/\/lium.univ-lemans.fr\/en\/team\/mohammad-mohammadamini\/\" target=\"_blank\" ><b style=\"color:#e5442d;\"><span style=\"font-size: 8pt;\">Mohammad  Mohammadamini<\/span><\/b><\/a><\/center><\/td><\/tr><\/table><\/div><\/div><br\/><\/p><\/div><\/div><\/div><\/div><div class=\"panel-grid\" id=\"pg-26521-1\" ><div class=\"panel-grid-core\"><div class=\"panel-grid-cell\" id=\"pgc-26521-1-0\" ><div class=\"panel-widget-style\" ><h4 style=\"color: #e5442d;\">Description <\/h4>\n<p align=\"justify\">In this repository, you will find large-scale 
pseudo-labeled data, including Central Kurdish audio translated into English. This dataset contains <strong>1.7 million samples<\/strong>, equivalent to <strong>3,000 hours<\/strong> of Kurdish audio, extracted from audiobooks and translated into English using a pipeline that combines a speech recognition system with a machine translation system. The samples have passed several filters, as described in the related paper.<\/p>\n<p align=\"justify\">This dataset was developed as part of the <a href=\"http:\/\/lium.univ-lemans.fr\/en\/projet-commute\/\">COMMUTE project<\/a>.<\/p>\n<p align=\"justify\"><strong>Note<\/strong>: Due to the large size of the dataset, publication may take some time. The download link will be shared on this page once it becomes available.<\/p><\/div><\/div><\/div><\/div>","protected":false},"excerpt":{"rendered":"<p>Corpus: Central Kurdish to English Pseudo-Labeled Data for Speech Translation (Donn\u00e9es pseudo-\u00e9tiquet\u00e9es de kurde central vers l\u2019anglais pour la traduction de la parole) Licence: CC BY 4.0 license Author(s): Mohammad Mohammadamini Description In this repository, you will find large-scale pseudo-labeled data, including Central Kurdish audio translated into English. 
This dataset contains 1.7 million samples, equivalent to 3,000 hours [&hellip;]<\/p>\n<p class=\"more-link style2\"><a href=\"https:\/\/lium.univ-lemans.fr\/en\/ckbens2tt\/\"  class=\"themebutton\"><span class=\"more-text\">READ MORE<\/span><span class=\"more-icon\"><i class=\"fa fa-angle-right fa-lg\"><\/i><\/span><\/a><\/p>\n","protected":false},"author":14,"featured_media":17309,"comment_status":"closed","ping_status":"closed","sticky":false,"template":"","format":"standard","meta":[],"categories":[48,47],"tags":[49],"acf":[],"_links":{"self":[{"href":"https:\/\/lium.univ-lemans.fr\/en\/wp-json\/wp\/v2\/posts\/26521"}],"collection":[{"href":"https:\/\/lium.univ-lemans.fr\/en\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/lium.univ-lemans.fr\/en\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/lium.univ-lemans.fr\/en\/wp-json\/wp\/v2\/users\/14"}],"replies":[{"embeddable":true,"href":"https:\/\/lium.univ-lemans.fr\/en\/wp-json\/wp\/v2\/comments?post=26521"}],"version-history":[{"count":0,"href":"https:\/\/lium.univ-lemans.fr\/en\/wp-json\/wp\/v2\/posts\/26521\/revisions"}],"wp:featuredmedia":[{"embeddable":true,"href":"https:\/\/lium.univ-lemans.fr\/en\/wp-json\/wp\/v2\/media\/17309"}],"wp:attachment":[{"href":"https:\/\/lium.univ-lemans.fr\/en\/wp-json\/wp\/v2\/media?parent=26521"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/lium.univ-lemans.fr\/en\/wp-json\/wp\/v2\/categories?post=26521"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/lium.univ-lemans.fr\/en\/wp-json\/wp\/v2\/tags?post=26521"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}