Constructing Sound Zones using machine learning on a large dataset

Supervisors: Théo Mariotte (LIUM), Manuel Melon (LAUM), Marie Tahon (LIUM)
Host laboratories: Laboratoire d’Informatique de l’Université du Mans (LIUM) and Laboratoire d’Acoustique de l’Université du Mans (LAUM)
Location: Le Mans Université
Beginning of internship: between January and March 2025
Contact: Théo Mariotte, Manuel Melon and Marie Tahon (firstname.lastname@univ-lemans.fr)


Application: send a CV, a covering letter relevant to the proposed subject, and your grades for the last two years of study (letters of recommendation may also be attached) to Théo Mariotte before December 15, 2024.

Description: The aim of the internship is to implement machine learning systems for the construction of differentiated listening zones (sound zones).


Context: Sound zones [1] have many applications, such as delivering personalised audio content inside vehicle cabins. These methods control the acoustic level produced in defined regions of space, known as the bright and dark zones. In the bright zone, the level is raised so that the useful signal can be heard; in the dark zone, it is attenuated so that the signal remains confined to the bright zone. These zones can be created using arrays of loudspeakers and microphones.
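For reference (this metric is not defined above, but it is standard in the sound zone literature, e.g. [1]), the separation between the two zones is commonly quantified by the acoustic contrast, i.e. the ratio of the mean squared sound pressures measured in the bright and dark zones:

\[
\mathrm{AC} = 10 \log_{10} \frac{\tfrac{1}{N_B}\sum_{m=1}^{N_B} |p_B(m)|^2}{\tfrac{1}{N_D}\sum_{m=1}^{N_D} |p_D(m)|^2} \quad \text{[dB]},
\]

where p_B and p_D are the pressures at the N_B and N_D control microphones in the bright and dark zones, respectively.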

The methods used in the literature to create differentiated listening zones rely on constrained optimisation, e.g. Acoustic Contrast Control (ACC) or Pressure Matching (PM). More recently, Pepe et al. [4] have proposed an approach based on deep neural networks. Datasets have also been published for acoustic field reconstruction (ISOBEL [2]) and sound zone reproduction (Zhao et al. [3]). Together, these developments pave the way for neural methods for the construction of sound zones.
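As an illustration only (the formulations used in the cited works may differ, and all variable names below are hypothetical placeholders), a minimal numpy sketch of the pressure matching idea at a single frequency could look like this:

import numpy as np

# Minimal pressure matching (PM) sketch at a single frequency.
# G_b: transfer functions from the L loudspeakers to the M_b bright-zone microphones.
# G_d: transfer functions from the L loudspeakers to the M_d dark-zone microphones.
# (Random placeholders here; in practice these come from measured room responses.)
rng = np.random.default_rng(0)
L, M_b, M_d = 8, 16, 16
G_b = rng.standard_normal((M_b, L)) + 1j * rng.standard_normal((M_b, L))
G_d = rng.standard_normal((M_d, L)) + 1j * rng.standard_normal((M_d, L))

# Target: unit pressure in the bright zone, zero pressure in the dark zone.
G = np.vstack([G_b, G_d])
p_target = np.concatenate([np.ones(M_b), np.zeros(M_d)])

# Regularised least-squares solution for the complex loudspeaker weights.
lam = 1e-2
w = np.linalg.solve(G.conj().T @ G + lam * np.eye(L), G.conj().T @ p_target)

# Acoustic contrast between the reproduced pressures in the two zones (in dB).
p_b, p_d = G_b @ w, G_d @ w
contrast_db = 10 * np.log10(np.mean(np.abs(p_b) ** 2) / np.mean(np.abs(p_d) ** 2))
print(f"Acoustic contrast: {contrast_db:.1f} dB")

The neural approach of Pepe et al. [4] replaces this kind of closed-form filter design with a learned model; the sketch above is only meant to make the conventional baseline concrete.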


Objectives: The first aim of the proposed internship is to reproduce a method from the literature and apply it to public datasets. The second is to improve this approach and assess its robustness against various criteria (acoustic environment, listener position, etc.).

Phase 1:
• Study the literature and become familiar with conventional sound zone approaches.
• Reproduce the method described in Pepe et al. [4].
• Become familiar with the public datasets ISOBEL [2] and Zhao et al. [3].
• Evaluate the method on these datasets.
• Compare this approach with conventional sound zone construction methods.

Phase 2:
• Study the robustness of the neural method against different criteria (acoustic environment, listener position).
• Improve the robustness of the neural approach with respect to these criteria.

There are also plans to design a demonstrator enabling two users sharing the same space to listen to a text read in two different languages. This demonstrator could be presented at the next Le Mans Sonore Biennial in 2026.


Laboratories:
The Laboratoire d’Acoustique de l’Université du Mans (LAUM) has extensive expertise in acoustic field reproduction and control methods. Manuel Melon has led and supervised numerous projects on the subject of sound zones.

The Laboratoire d’Informatique de l’Université du Mans (LIUM) has historically focused on automatic speech processing, with a strong emphasis on deep learning approaches. Marie Tahon works in particular on neural methods for emotion recognition and speech synthesis, with a focus on interpretability. Théo Mariotte works on neural audio processing methods, in particular methods based on microphone arrays.

The intern will benefit from the expertise of both laboratories, in acoustics (LAUM) as well as in computer science and machine learning (LIUM).

Profile sought: candidates interested in artificial intelligence and acoustic field reproduction methods, enrolled in a Master’s degree in computer science or acoustics.


Bibliography

[1] T. Betlehem, W. Zhang, M. A. Poletti, and T. D. Abhayapala, "Personal Sound Zones: Delivering interface-free audio to multiple listeners," IEEE Signal Processing Magazine, vol. 32, no. 2, pp. 81-91, March 2015, doi: 10.1109/MSP.2014.2360707.

[2] M. S. Kristoffersen, M. B. Møller, P. Martínez-Nuevo, and J. Østergaard, "Deep Sound Field Reconstruction in Real Rooms: Introducing the ISOBEL Sound Field Dataset," arXiv preprint arXiv:2102.06455, February 2021.

[3] S. Zhao, Q. Zhu, E. Cheng, and I. S. Burnett, "A room impulse response database for multizone sound field reproduction (L)," The Journal of the Acoustical Society of America, vol. 152, no. 4, pp. 2505-2512, October 2022, doi: 10.1121/10.0014958.

[4] G. Pepe, L. Gabrielli, S. Squartini, L. Cattani, and C. Tripodi, "Deep Learning for Individual Listening Zone," in 2020 IEEE 22nd International Workshop on Multimedia Signal Processing (MMSP), Tampere, Finland: IEEE, 2020.