PhD defence, Pierre Champion
Date: 20/04/2023
Time: 14h00
Location: Nancy
Title: Anonymizing Speech: Evaluating and Designing Speaker Anonymization Techniques
Jury members:
- Luciana Ferrer, Researcher, University of Buenos Aires, Reviewer
- Lukáš Burget, Associate Professor, Brno University of Technology, Reviewer
- Jean-François Bonastre, Professor, Université d’Avignon, Examiner
- Lori Lamel, Research Director, Université Paris-Saclay, Examiner
- Nicholas Evans, Professor, EURECOM, Invited
- Slim Ouni, Associate Professor, University of Lorraine, LORIA, France, Thesis director
- Denis Jouvet, Research Director, INRIA/LORIA, Nancy, Thesis director
- Anthony Larcher, Professor, Le Mans Université, LIUM, Thesis co-director
Abstract:
The growing use of voice user interfaces, from telephones to remote controls, automobiles, and digital assistants, has led to a surge in the collection and storage of speech data. While data collection enables the development of the efficient tools that power most speech services, it also poses serious privacy issues for users, as centralized storage makes private personal speech data vulnerable to cyber threats. Advanced speech technologies, such as voice cloning and personal attribute recognition, can be used to access and exploit sensitive information. Voice-cloning technology allows an attacker to take a recording of a person’s voice and use it to generate new speech that sounds like it is coming from that person. For example, an attacker could use voice cloning to impersonate a person and gain unauthorized access to their financial information over the phone. With the increasing use of voice-based digital assistants like Amazon’s Alexa, Google’s Assistant, and Apple’s Siri, and with the increasing ease with which personal speech data can be collected and stored, the risk of malicious use of voice cloning and of technologies that recognize a speaker’s identity, gender, pathological conditions, and other attributes has increased. Companies and organizations need to consider these risks and implement appropriate measures to protect user data in order to prevent misuse of speech technologies and comply with legal regulations (e.g., the General Data Protection Regulation (GDPR)).
To address these concerns, this thesis proposes solutions for anonymizing speech and for evaluating the degree of anonymization. In this work, anonymization refers to the process of making personal speech data unlinkable to an identity while maintaining the usefulness (utility) of the speech signal (e.g., access to the linguistic content). The goal is to protect the privacy of individuals by removing or obscuring any Personally Identifiable Information (PII) from the acoustics of speech. PII includes characteristics such as a person’s voice, accent, and speaking style; other personal information carried by the speech content, such as phone numbers or names, is outside the scope of this thesis. Our research builds on existing voice conversion-based anonymization methods and existing evaluation protocols.
We start by identifying and explaining several challenges that evaluation protocols need to consider in order to properly evaluate the degree of privacy protection. We clarify how anonymization systems need to be configured for evaluation purposes, and highlight the fact that many practical deployment configurations do not permit privacy evaluation. We then study the most common voice conversion-based anonymization system and identify its weak points, before suggesting new methods to overcome some of its limitations. We isolate each component of the anonymization system to evaluate the degree of speaker PII associated with it. Then, we propose several transformation methods for each component to reduce speaker PII as much as possible while maintaining utility. We promote anonymization algorithms based on quantization-based transformation as an alternative to the most widely used and well-known noise-based approach. Finally, we investigate a new kind of attack that aims to invert the anonymization, creating a new threat. Throughout this thesis, we openly share anonymization systems and evaluation protocols in order to help organizations preserve the privacy rights of individuals.
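To make the contrast between quantization-based and noise-based transformations concrete, the following is a minimal illustrative sketch and not the thesis's actual implementation. The two-dimensional feature frames, the three-entry codebook, the noise scale, and the function names `quantize` and `add_noise` are all hypothetical and chosen only for illustration. The intuition it shows is that quantization maps nearby feature frames from different speakers to the same codebook entry, whereas additive noise only perturbs each frame.

```python
import random


def quantize(frame, codebook):
    """Replace a feature frame with its nearest codebook entry
    (squared Euclidean distance). Hypothetical helper for illustration."""
    return min(
        codebook,
        key=lambda entry: sum((x - y) ** 2 for x, y in zip(frame, entry)),
    )


def add_noise(frame, scale, rng):
    """Noise-based alternative: add zero-mean Gaussian noise to each dimension.
    Hypothetical helper for illustration."""
    return tuple(x + rng.gauss(0.0, scale) for x in frame)


# Toy codebook and toy frames (assumed values, not taken from the thesis).
codebook = [(0.0, 0.0), (1.0, 1.0), (2.0, 0.0)]
speaker_a = (0.9, 1.1)
speaker_b = (1.2, 0.8)

# Both frames collapse onto the same codebook entry, so the small
# speaker-dependent differences between them are discarded.
quantized_a = quantize(speaker_a, codebook)
quantized_b = quantize(speaker_b, codebook)

# With additive noise, each frame is perturbed but stays a distinct point.
rng = random.Random(0)
noisy_a = add_noise(speaker_a, 0.1, rng)
noisy_b = add_noise(speaker_b, 0.1, rng)

print(quantized_a, quantized_b)
print(noisy_a, noisy_b)
```

In this toy setting the codebook size plays the role of the privacy/utility knob: fewer entries discard more fine-grained variation, at the cost of a coarser representation of the content.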
Keywords:
Speaker anonymization, Speech recognition, Speaker verification, Privacy.