AdVanced ERror Analysis for speech recognition (VERA)
The VERA project aims to develop tools for diagnosing, locating, and measuring automatic transcription errors. The project is based on a consortium of first-rate academic actors in this field. The objective is to study the errors in detail (at the perceptual, acoustic-phonetic, lexical, and syntactic levels) in order to yield a precise diagnosis of the possible shortcomings of current classical models on certain classes of linguistic phenomena.
At the application level, the VERA project is motivated by an observation: many applications that give access to the contents of multimedia data are made possible by automatic speech transcription, including video subtitling, search for specific portions of audiovisual archives, automated meeting reports, and extraction and structuring of information (Speech Analytics) in multimedia content (Web, call centers, …). However, large-scale deployment is often slowed by the fact that transcripts produced by automatic speech recognition systems contain too many errors. Research and development in speech recognition has so far focused, successfully, on improving the methods and models used in the transcription process, with progress measured by the word error rate; however, past a given performance level, the cost of reducing the residual errors increases exponentially.
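For readers unfamiliar with the metric, the word error rate is conventionally computed as the word-level edit distance (substitutions, deletions, and insertions) between a reference transcript and the system output, divided by the number of reference words. The following is a minimal sketch under that standard definition; the function name `word_error_rate` and the whitespace tokenization are illustrative assumptions, not part of the VERA tooling.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Word-level Levenshtein distance divided by the reference length.

    Assumption: words are separated by whitespace and compared exactly
    (no case folding or text normalization).
    """
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    # prev[j] holds the edit distance between the reference words
    # processed so far and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i]  # i deletions turn i reference words into an empty hypothesis
        for j, hyp_word in enumerate(hyp, start=1):
            substitution_cost = 0 if ref_word == hyp_word else 1
            curr.append(min(
                prev[j] + 1,                      # deletion of a reference word
                curr[j - 1] + 1,                  # insertion of a hypothesis word
                prev[j - 1] + substitution_cost,  # match or substitution
            ))
        prev = curr
    return prev[-1] / len(ref)


if __name__ == "__main__":
    print(word_error_rate("the cat sat on the mat", "the cat sat on a mat"))
```

Because insertions are counted, the rate can exceed 1.0 when the hypothesis is much longer than the reference. A single aggregate figure of this kind says nothing about which errors occurred or how much they matter to a listener.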
Transcription errors thus persist, and how troublesome they are depends on the application. Information retrieval is tolerant of errors (up to 30%), but systematic errors on certain named entities can be prohibitive. By contrast, subtitling and meeting transcription have a very low tolerance for errors, and even word error rates that are very low by state-of-the-art standards (below 5%) are too high for end users.
Error processing is not limited to raising the acceptance level of applications based on automatic transcription. Error classification, impact measurement through perceptual tests, and error diagnosis for current state-of-the-art transcription systems constitute the first, crucial step in identifying the shortcomings of current models and preparing future generations of automatic speech recognition systems.
Through close cooperation between complementary partners who excel in their fields, the VERA project aims to set up an infrastructure for the detection, diagnosis, and qualitative measurement of errors, making it possible to create a virtuous circle of improvement for large and very large vocabulary continuous speech recognition systems.