Classical timbre studies have modeled timbre as the integration of a limited number of auditory dimensions and proposed acoustic correlates of these dimensions to explain sound identification. Here, the goal was to highlight the time-frequency patterns subserving identification of musical voices and instruments, without making any assumption about these patterns. We adapted a “random search method” proposed in vision. The method consists of synthesizing sounds by randomly selecting “auditory bubbles” (small time-frequency glimpses) from the original sounds’ spectrograms, and then inverting the resulting sparsified representation. For each bubble selection, a decision procedure categorizes the resulting sound as a voice or an instrument. After hundreds of trials, the whole time-frequency space is explored, and summing the correct answers reveals the relevant time-frequency patterns for each category. We used this method with two decision procedures: human listeners and a decision algorithm using auditory distances based on spectro-temporal excitation patterns (STEPs). The patterns were strikingly similar for the two procedures: they highlighted higher frequencies (i.e., formants) for the voices, whereas instrument identification was based on lower frequencies (particularly during the onset). Altogether, these results show that timbre can be analyzed as time-frequency weighted patterns corresponding to the important cues subserving sound identification.
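The bubbles procedure described above can be sketched in a few lines. The following is a minimal, illustrative sketch only, not the authors' implementation: it assumes a magnitude spectrogram as a NumPy array and an arbitrary decision function (standing in for either a human listener's response or the STEP-distance classifier). All function names, bubble shapes, and parameter values are assumptions; in the actual method the sparsified spectrogram is inverted back to a sound before classification, a step omitted here.

```python
import numpy as np

rng = np.random.default_rng(0)

def bubble_mask(shape, n_bubbles, sigma_f, sigma_t, rng):
    """Sparse mask of Gaussian time-frequency 'bubbles' (glimpses)."""
    n_f, n_t = shape
    f = np.arange(n_f)[:, None]
    t = np.arange(n_t)[None, :]
    mask = np.zeros(shape)
    for _ in range(n_bubbles):
        cf, ct = rng.integers(n_f), rng.integers(n_t)  # random bubble center
        mask += np.exp(-((f - cf) ** 2 / (2 * sigma_f ** 2)
                         + (t - ct) ** 2 / (2 * sigma_t ** 2)))
    return np.clip(mask, 0.0, 1.0)

def reveal_patterns(spectrogram, classify, true_label,
                    n_trials=500, n_bubbles=10, sigma_f=3.0, sigma_t=5.0):
    """Accumulate the masks of trials where the sparsified sound is still
    correctly categorized; the average approximates the time-frequency
    regions that are diagnostic for the category."""
    acc = np.zeros_like(spectrogram)
    for _ in range(n_trials):
        mask = bubble_mask(spectrogram.shape, n_bubbles, sigma_f, sigma_t, rng)
        sparse = spectrogram * mask  # sparsified representation
        # In the full method, `sparse` would be inverted to a waveform here
        # and the sound (not the spectrogram) passed to the decision procedure.
        if classify(sparse) == true_label:
            acc += mask
    return acc / n_trials
```

With a classifier that always answers correctly, the accumulated map simply tends toward uniform coverage; the informative structure only emerges when the decision procedure fails on trials whose bubbles miss the diagnostic regions.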
Meeting abstract, October 2016. No PDF available.
Auditory bubbles reveal sparse time-frequency cues subserving identification of musical voices and instruments
Vincent Isnard, IRCAM, 1 Pl. Igor-Stravinsky, Paris 75004, France
Guillaume Lemaitre, IRCAM, Paris 75004, France, GuillaumeJLemaitre@gmail.com
Vincent Isnard, Clara Suied, Guillaume Lemaitre; Auditory bubbles reveal sparse time-frequency cues subserving identification of musical voices and instruments. J. Acoust. Soc. Am. 1 October 2016; 140 (4_Supplement): 3267. https://doi.org/10.1121/1.4970361