Corpus: TED-LIUM Release 1
License: Creative Commons BY-NC-ND 3.0 (attribution/non-commercial/no-derivatives)
This is the TED-LIUM corpus release 1,
licensed under Creative Commons BY-NC-ND 3.0 (http://creativecommons.org/licenses/by-nc-nd/3.0/deed.en).
The TED-LIUM corpus consists of English-language TED talks with their transcriptions. The audio is sampled at 16 kHz, and the corpus contains about 118 hours of speech.
More details are given in this paper:
A. Rousseau, P. Deléglise, and Y. Estève, “TED-LIUM: an automatic speech recognition dedicated corpus”, in Proceedings of the Eighth International Conference on Language Resources and Evaluation (LREC’12), May 2012.
Please cite this reference if you use these data in your research work.
As of 27/08/2025, TED-LIUM1 is no longer available for download. Please contact the authors for further information.
SPH format info:
Channels: 1
Sample Rate: 16000
Precision: 16-bit
Bit Rate: 256k
Sample Encoding: 16-bit Signed Integer PCM
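
The values above are consistent with each other: 16000 samples per second at 16 bits per sample on one channel gives the listed bit rate of 256k. The sketch below checks that arithmetic and shows one way to read such fields from a file header. It assumes the files carry the usual NIST SPHERE ASCII header (a `NIST_1A` line, a header-size line, then `name -type value` lines ending with `end_head`), and the sample header in it is made up for illustration, not taken from the corpus. The function names are hypothetical.

```python
def bit_rate(sample_rate: int, bits_per_sample: int, channels: int) -> int:
    """Bit rate of uncompressed PCM audio, in bits per second."""
    return sample_rate * bits_per_sample * channels


def parse_sphere_header(data: bytes) -> dict:
    """Parse a NIST SPHERE ASCII header into a dict (assumed layout, see lead-in)."""
    first, second, _ = data.split(b"\n", 2)
    if first.strip() != b"NIST_1A":
        raise ValueError("not a NIST SPHERE header")
    header_size = int(second.strip())  # total header length in bytes
    text = data[:header_size].decode("ascii", errors="replace")

    fields = {}
    for line in text.split("\n")[2:]:
        line = line.strip()
        if line == "end_head":
            break
        parts = line.split(None, 2)
        if len(parts) != 3:
            continue  # skip blank or malformed lines
        name, kind, value = parts
        if kind == "-i":
            fields[name] = int(value)
        elif kind == "-r":
            fields[name] = float(value)
        else:  # "-sN": string of length N
            fields[name] = value
    return fields


# Synthetic header mirroring the format info above (not copied from a corpus file).
example_header = (
    b"NIST_1A\n   1024\n"
    b"channel_count -i 1\n"
    b"sample_rate -i 16000\n"
    b"sample_n_bytes -i 2\n"
    b"end_head\n"
).ljust(1024, b" ")

fields = parse_sphere_header(example_header)
rate = bit_rate(
    fields["sample_rate"],
    fields["sample_n_bytes"] * 8,
    fields["channel_count"],
)
print(rate)  # 256000
```

To inspect a real file, the first 1024 bytes read in binary mode can be passed to `parse_sphere_header`, assuming the header size stated on the second line does not exceed that.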