Distributed software
LIUM distributes several software packages and resources such as corpora. Some of them have been deposited with the Programme Protection Agency (APP) via the Technology Transfer Acceleration Company (SATT) Ouest Valorisation. The vast majority are distributed under free licenses of varying restrictiveness (GPL, LGPL, Creative Commons v3, CeCILL).
List of software
- KUTED
- Corpus ALLIES
- TimeLine Generator
- 2048 Atomes
- Jen-Planet
- VR-PEAS
- Le Chaudron Magique
- Get Your BUT
- Aux couleurs de l’océan
- Corpus PASTEL
- TurtleTablet
- Ecris Ton Zoo
- ArSentimentAnalysis
- AlloSat
- Multi30k
- TSAC
- TGRIS-tool
- FrNewsLink
- TED-LIUM Release 3
- TED-LIUM Release 2
- TED-LIUM Release 1
- NMTPY
- NMTPYTORCH
- LIUM Speaker Diarization
- SIDEKIT
- s4d
- Hop3x
- CSLM
- MANY
List of software
KUTED
Corpus: Kurdish TED
Licence: Creative Commons Attribution-NonCommercial-ShareAlike 4.0 International License
Author(s): Mohammad Mohammadamini, Antoine Laurent
URL: https://huggingface.co/datasets/aranemini/kurdishted
Kurdish TED (KUTED) is the first Speech-to-Text-Translation (S2TT) dataset for the Central Kurdish language derived from TED Talks and TEDx. The corpus consists of 91,000 pairs, encompassing 170 hours of English audio, 1.65 million English tokens, and 1.40 million Central Kurdish tokens.
Corpus ALLIES
Corpus: ALLIES
Author(s): Anthony Larcher, Martin Lebourdais, Marie Tahon
URL: https://lium.univ-lemans.fr/en/corpus-allies/
The ALLIES corpus was produced within the European CHIST-ERA project ALLIES. The project made it possible to run an evaluation campaign for across-time diarization systems on French broadcast news data.
TimeLine Generator
Software: TimeLine Generator
Author(s): Iza Marfisi, Pierre Laforcade
URL: https://lium.univ-lemans.fr/en/timeline-editor/
2048 Atomes
Software: 2048 Atomes
Author(s): Iza Marfisi
URL: https://lium.univ-lemans.fr/en/atomes/
Jen-Planet
Software: Catalogue de Jeux Educatifs Numériques
Author(s): Iza Marfisi
URL: https://jen-planet.univ-lemans.fr/
The aim of the Planète des Jeux Educatifs Numériques (JEN) catalogue is to offer teachers digital educational games (JEN) for their class sessions.
VR-PEAS
Software: Virtual Reality PEdAgogical Scenarisation tool
Author(s): Oussema Mahdi, Lahcen Oubahssi
URL: https://lium.univ-lemans.fr/en/vr-peas/
VR-PEAS (Virtual Reality PEdAgogical Scenarisation tool) is an authoring tool with a service for the automatic operationalisation of VR-oriented pedagogical scenarios.
Le Chaudron Magique
Software: Le Chaudron Magique
Author(s): Sébastien George, Iza Marfisi, Sofiane Touel
URL: https://lium.univ-lemans.fr/en/le-chaudron-magique/
Le Chaudron Magique is a mixed-reality mobile application for teaching fractions.
Get Your BUT
Software: Get Your BUT
URL: https://lium.univ-lemans.fr/en/get-your-but/
Aux couleurs de l’océan
Software: Aux couleurs de l’océan
Author(s): Iza Marfisi
URL: https://lium.univ-lemans.fr/en/aux-couleurs-de-locean/
Corpus PASTEL
Corpus: PASTEL
Author(s): Salima Mdhaffar, Yannick Estève, Antoine Laurent, Nathalie Camelin
URL: https://lium.univ-lemans.fr/en/pastel-2/
The PASTEL corpus consists of a collection of lectures from different computer science courses (natural language processing, introduction to computer science, etc.) in the first year of the Bachelor's degree in computer science at the University of Nantes.
TurtleTablet
Software: TurtleTablet
Author(s): Iza Marfisi, Sébastien George, Marc Leconte
URL: https://turtletablet.univ-lemans.fr/
TurtleTablet is a collaborative game for learning the basics of programming. To encourage genuine collaboration between players, the game can be played with two physical objects (tangible pieces) that are recognized on the tablet screen.
Ecris Ton Zoo
Software: Ecris Ton Zoo
URL: https://lium.univ-lemans.fr/en/ecris-ton-zoo/
ArSentimentAnalysis
Corpus: ArSentimentAnalysis
GitHub: https://github.com/amirabaroumi/ArSentimentAnalysis
Author(s): Amira Barhoumi, Nathalie Camelin, Yannick Estève
URL: https://lium.univ-lemans.fr/en/arsentimentanalysis/
The ArSentimentAnalysis package includes a set of resources for designing and evaluating an Arabic sentiment analysis system. The package contains:
- Sets of pre-trained Arabic-specific embeddings
- The ArSentLex polarized lexicon
AlloSat
Corpus: AlloSat
Licence: Creative Commons
Author(s): Manon Macary, Marie Tahon, Anthony Rousseau, Yannick Estève
URL: https://lium.univ-lemans.fr/en/allosat/
The AlloSat corpus is composed of real-life call center conversations in French, annotated continuously for frustration and satisfaction. It was built to support the development of new systems able to model the continuous aspect of semantic and paralinguistic information at the conversation level.
Multi30k
Corpus: Multi30k Dataset
Licence: Creative Commons Attribution-NonCommercial-ShareAlike 4.0 International
GitHub: https://github.com/multi30k
Author(s): Loïc Barrault, Ozan Caglayan, Fethi Bougares
URL: https://lium.univ-lemans.fr/en/multi30k/
The Flickr30K Dataset contains 31,014 images sourced from online photo-sharing websites (Young et al., 2014). The Multi30K dataset extends the Flickr30K dataset with translated and independent German sentences.
TSAC
Corpus: Tunisian Sentiment Analysis Corpus
Licence: GNU Lesser General Public License v3.0
GitHub: https://github.com/fbougares/TSAC
Author(s): Fethi Bougares, Salima Mdhaffar, Yannick Estève
URL: https://lium.univ-lemans.fr/en/tsac/
About 17k user comments, manually annotated with positive or negative polarity. The corpus was collected from Facebook user comments posted on the official pages of Tunisian radio stations and TV channels.
TGRIS-tool
Software: TGRIS-tool
Author(s): Iza Marfisi
URL: https://lium.univ-lemans.fr/en/tgris-tool/
TGRIS is a Virtual Reality tool to simulate emotionally intense interviews.
FrNewsLink
Corpus: Topic Segmentation
URL: https://hal.archives-ouvertes.fr/hal-01741177
The FrNewsLink package supports several application tasks in the domain of topic segmentation and titling. It is composed of a set of resources drawn from a varied corpus of French broadcast news (BN) and press articles. Due to broadcasting rights, the package does not contain video or audio files.
TED-LIUM Release 3
Corpus: TED-LIUM Release 3
Licence: Creative Commons BY-NC-ND 3.0 (attribution/non-commercial/no-derivatives)
Author(s): François Fernandez, Vincent Nguyen, Sahar Ghannay, Natalia Tomashenko, Yannick Estève
URL: https://lium.univ-lemans.fr/en/ted-lium3/
This is the TED-LIUM corpus release 3, licensed under Creative Commons BY-NC-ND 3.0 (http://creativecommons.org/licenses/by-nc-nd/3.0/deed.en).
TED-LIUM Release 2
Corpus: TED-LIUM Release 2
Licence: Creative Commons BY-NC-ND 3.0 (attribution/non-commercial/no-derivatives)
Author(s): Anthony Rousseau, Paul Deléglise, Yannick Estève
URL: https://lium.univ-lemans.fr/en/ted-lium2/
This is the TED-LIUM corpus release 2, licensed under Creative Commons BY-NC-ND 3.0 (http://creativecommons.org/licenses/by-nc-nd/3.0/deed.en).
TED-LIUM Release 1
Corpus: TED-LIUM Release 1
Licence: Creative Commons BY-NC-ND 3.0 (attribution/non-commercial/no-derivatives)
Author(s): Anthony Rousseau, Paul Deléglise, Yannick Estève
URL: https://lium.univ-lemans.fr/en/ted-lium1/
This is the TED-LIUM corpus release 1, licensed under Creative Commons BY-NC-ND 3.0 (http://creativecommons.org/licenses/by-nc-nd/3.0/deed.en).
NMTPY
Software: NMTPY
Licence: MIT License
GitHub: https://github.com/lium-lst/nmtpy
Author(s): Ozan Caglayan, Mercedes García Martínez, Adrien Bardet, Walid Aransa, Loïc Barrault, Fethi Bougares
URL: https://arxiv.org/abs/1706.00457
nmtpy is a suite of Python tools for training mono- and multimodal neural machine translation systems using Theano.
NMTPYTORCH
Software: NMTPYTORCH
Licence: MIT License
GitHub: https://github.com/lium-lst/nmtpytorch/
Author(s): Ozan Caglayan, Mercedes García Martínez, Adrien Bardet, Walid Aransa, Fethi Bougares, Loïc Barrault
URL: https://arxiv.org/abs/1706.00457
This is the PyTorch fork of nmtpy, a sequence-to-sequence framework which was originally a fork of dl4mt-tutorial.
LIUM Speaker Diarization
Software: LIUM Speaker Diarization
Licence: GPL
URL: https://projets-lium.univ-lemans.fr/spkdiarization/
Speaker segmentation and clustering (speaker diarization) tool written in Java.
SIDEKIT
Software: SIDEKIT
Licence: LGPL
GitHub: https://git-lium.univ-lemans.fr/Larcher/sidekit
Author(s): Anthony Larcher, Kong Aik Lee, Sylvain Meignier
URL: https://projets-lium.univ-lemans.fr/sidekit/
SIDEKIT is an open source package for Speaker and Language recognition.
s4d
Software: SIDEKIT for diarization
Licence: LGPL
GitHub: https://git-lium.univ-lemans.fr/Meignier/s4d
Author(s): Pierre-Alexandre Broux, Florent Desnous, Anthony Larcher, Sylvain Meignier
URL: https://projets-lium.univ-lemans.fr/s4d/
Speaker diarization tools.
Hop3x
Software: Hop3x
URL: http://hop3x.univ-lemans.fr
Hop3x is an environment for learning programming. It lets the teacher remotely follow the learners' programming activity by providing qualitative information (indicators) on this activity and a real-time view of their work (the source code of their programs).
CSLM
Software: Continuous Space Language Model toolkit
GitHub: https://git-lium.univ-lemans.fr/barrault/cslm
Author(s): Holger Schwenk
URL: https://git-lium.univ-lemans.fr/barrault/cslm/-/archive/master/cslm-master.tar.gz
The CSLM toolkit is open-source software that implements the continuous space language model.
MANY
Software: MANY
Licence: GNU GPL v3
URL: https://code.google.com/archive/p/many/
MANY is a machine translation (MT) system combination tool whose architecture is described in …