Context: This topic is the exemplar of my research combining knowledge of psychoacoustics and signal processing. Specifically, it addresses the psychoacoustic question "what does the spread of auditory masking produced by a signal well localized in the time-frequency plane look like?", a question that originally emerged from signal processing. Indeed, many audio applications combine time-frequency analysis methods with psychoacoustic models of auditory masking to represent and extract only the perceptually relevant information from sound signals. A typical example is perceptual audio coding: to reduce the digital size of audio files, audio codecs like mp3 decompose sounds into time-frequency segments and apply a masking model to reduce the bit rate in those segments. From a signal-processing viewpoint, accurately predicting masking interactions between time-frequency coefficients requires masking data for stimuli with good time-frequency localization. Nevertheless, little is known about the time-frequency spread of masking for well-localized signals, as illustrated in Fig.1.
The phenomenon of auditory masking has been the focus of many psychoacoustic studies over the last decades. It occurs when the presence of one sound, called the masker, impedes the detection of another sound, the target. Masking has been extensively investigated with simultaneous (spectral masking) and non-simultaneous (temporal masking) presentations of masker and target. The results were used to develop models of either spectral or temporal masking that are currently implemented in audio applications. Some attempts were made to simply combine these models to predict masking in the time-frequency plane. However, the linear additivity of temporal and spectral masking as a predictor of time-frequency masking was tested and invalidated in a previous study (Lutfi, 1988; J. Acoust. Soc. Am. vol. 83, pp. 163–177). In addition, the signals used in most masking studies are not well localized in the time-frequency plane. A main reason is that experimenters generally employ maskers and targets with different spectral and temporal properties (e.g., a broadband masker vs. a narrowband target, and/or a long masker vs. a short target) to reduce confusion effects. Consequently, most masking models used in applications were developed from masking data for long and/or broadband stimuli.
Method: To accurately predict the audibility of each coefficient in the time-frequency plane, it is important not only to know what the time-frequency masking function produced by one coefficient looks like, but also to investigate how the masking functions resulting from multiple coefficients add up. A series of experiments was designed accordingly. Specifically, experiments using a single masker and one target aim to measure the time-frequency masking function for one coefficient (see Fig.2). Experiments using up to four maskers separated in time and/or frequency and one target aim to measure the additivity of masking in the time-frequency plane. In all experiments, masker and target are brief Gaussian-shaped pulses with a bandwidth of approximately one critical band. A Gaussian shape was chosen because it reaches the lower bound of the time-frequency uncertainty principle, i.e., it is maximally compact in the time-frequency plane. Masked thresholds are measured in normal-hearing listeners using an adaptive procedure.
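Such a stimulus is essentially a Gabor atom: a sinusoidal carrier under a Gaussian envelope. As a minimal sketch of how one could synthesize it (the sampling rate, carrier frequency, and the ERB value of roughly 456 Hz for one critical band at 4 kHz are illustrative assumptions, not the exact parameters used in the experiments):

```python
import numpy as np

def gabor_pulse(f0=4000.0, erb_hz=456.0, fs=48000.0):
    """Gaussian-windowed sinusoid (Gabor atom) centred on f0.

    A Gaussian power spectrum exp(-f^2/sigma_f^2) has an equivalent
    rectangular bandwidth (ERB) of sqrt(pi)*sigma_f, and the matching
    time-domain envelope has sigma_t = 1/(2*pi*sigma_f).
    """
    sigma_f = erb_hz / np.sqrt(np.pi)          # spectral std for the target ERB
    sigma_t = 1.0 / (2.0 * np.pi * sigma_f)    # time-domain std of the envelope
    # Truncate at +/- 6 sigma, where the envelope is negligible (~1.5e-8).
    t = np.arange(-6.0 * sigma_t, 6.0 * sigma_t, 1.0 / fs)
    envelope = np.exp(-t**2 / (2.0 * sigma_t**2))
    return t, envelope * np.cos(2.0 * np.pi * f0 * t)
```

With these assumed values the pulse lasts only a few milliseconds, which is consistent with the "brief" stimuli described above.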
Results: The average time-frequency masking function measured for a Gaussian masker at 4 kHz presented at a sensation level of 60 dB is illustrated in Fig.3. Time-frequency masking functions for other masker frequencies and levels are being collected. The results of the experiments using multiple maskers separated only in time or frequency showed under which conditions nonlinear masking additivity occurs in either domain and how it can be modeled by assuming nonlinear stimulus processing in the cochlea. The results of the experiments using multiple maskers separated in both time and frequency are being analyzed.
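For context, additivity of masking is commonly formalized in this literature with a power-law rule: the masked threshold produced by each masker alone is converted to an intensity-like quantity, raised to an exponent p, and the contributions are summed. With p = 1 this reduces to linear intensity additivity, whereas a compressive exponent p < 1 (mimicking cochlear compression) predicts the "excess" masking observed with multiple maskers. A minimal sketch of this rule (the function name and the example exponent of 1/3 are illustrative, not the exact model fitted to the data):

```python
import numpy as np

def combined_masking_db(thresholds_db, p=1.0):
    """Predict the masked threshold (dB) for several combined maskers.

    thresholds_db: masked threshold produced by each masker alone, in dB
    relative to the quiet threshold.
    p = 1: linear intensity additivity; p < 1: compressive additivity,
    which predicts more masking than the linear sum ("excess" masking).
    """
    intensities = 10.0 ** (np.asarray(thresholds_db) / 10.0)
    # I_combined^p = sum_i I_i^p  =>  L_combined = (10/p) * log10(sum I_i^p)
    return (10.0 / p) * np.log10(np.sum(intensities ** p))
```

For two equal maskers each producing 40 dB of masking, p = 1 predicts about 43 dB, whereas p = 1/3 predicts about 49 dB, i.e., roughly 6 dB of excess masking.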
My contribution: Main investigator.
Potential applications: audio applications that perform time-frequency processing, perceptual audio codecs, auditory models.
Related publications:
T. Necciari, B. Laback, S. Savel, S. Ystad, P. Balazs, S. Meunier, and R. Kronland-Martinet. Auditory Time-Frequency Masking for Spectrally and Temporally Maximally-Compact Stimuli. PLOS ONE, 11:1–23, 2016.
G. Chardon, T. Necciari, and P. Balazs. Perceptual matching pursuit with Gabor dictionaries and time-frequency masking. In Proceedings of ICASSP 2014, pages 3126–3130, Florence, Italy, May 2014. IEEE.
B. Laback, T. Necciari, P. Balazs, S. Savel, and S. Ystad. Simultaneous masking additivity for short Gaussian-shaped tones: Spectral effects. The Journal of the Acoustical Society of America, 134(2):1160–1171, 2013.
B. Laback, P. Balazs, T. Necciari, S. Savel, S. Meunier, S. Ystad, and R. Kronland-Martinet. Additivity of nonsimultaneous masking for short Gaussian-shaped sinusoids. The Journal of the Acoustical Society of America, 129(2):888–897, 2011.
T. Necciari, P. Balazs, R. Kronland-Martinet, S. Ystad, B. Laback, S. Savel, and S. Meunier. Auditory time-frequency masking: Psychoacoustical data and application to audio representations. In S. Ystad et al., editor, Speech, Sound and Music Processing: Embracing Research in India, volume 7172 of Lecture Notes in Computer Science, pages 146–171. Springer, 2012.
T. Necciari, P. Balazs, R. Kronland-Martinet, S. Ystad, B. Laback, S. Savel, and S. Meunier. Perceptual optimization of audio representations based on time-frequency masking data for maximally-compact stimuli. In Proceedings of the 45th AES conference on Applications of Time-Frequency Processing in Audio, Helsinki, Finland, March 2012.
T. Necciari, S. Savel, S. Meunier, S. Ystad, and R. Kronland-Martinet. Masquage auditif temps-fréquence avec des stimuli de forme Gaussienne [Auditory time-frequency masking with Gaussian-shaped stimuli]. In Proceedings of the 10e Congrès Français d'Acoustique (CFA'10), Lyon, France, April 2010. (In French).
B. Laback, P. Balazs, G. Toupin, T. Necciari, S. Savel, S. Meunier, S. Ystad, and R. Kronland-Martinet. Additivity of auditory masking using Gaussian-shaped tones. In Proceedings of the Acoustics’08 international conference, Paris, France, July 2008.
T. Necciari, S. Savel, S. Meunier, S. Ystad, R. Kronland-Martinet, B. Laback, and P. Balazs. Auditory masking using Gaussian-windowed stimuli. Presented at the Acoustics’08 international conference, Paris, France, July 2008.