We explore a set of methods, referred to collectively as dynamic component analysis (DCA), for deriving dictionaries of dynamical patterns from speech (TIMIT database). The methods use spatio-temporal singular value decomposition (ST-SVD) and common spatio-temporal pattern (CSTP) matrix computations. Applied to the speech spectrogram, they yield a transformation to a new set of time series representing extracted features at reduced bandwidth. The methods are computationally efficient (closed-form solutions suitable for real-time use) and robust to additive noise, with diverse applications in speech processing and general multi-input/multi-output (MI/MO) modeling. When used to predict a single neural output (multi-input/single-output, MI/SO), DCA gives an efficient new method for deriving the spectro-temporal receptive field (STRF), which is shown in our human cortical data to yield improved predictions. We also use DCA to reconstruct speech from cortical activity, deriving dynamical dictionaries for electrocortical data, with application to brain-computer interfaces (BCI).
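The abstract does not spell out how ST-SVD is constructed, so the following is only a minimal sketch of one plausible reading: delay-embed the spectrogram into spectro-temporal patches, take a closed-form SVD, and project onto the leading components to obtain a reduced set of feature time series. The function names (`delay_embed`, `st_svd_features`), the lag count, the mean-centering step, and the random stand-in spectrogram are all assumptions made for illustration, not the authors' implementation. CSTP is not shown.

```python
import numpy as np

def delay_embed(spec, n_lags):
    """Stack n_lags consecutive frames of spec (n_freq x n_time) into columns.

    Hypothetical helper: each output column is one flattened
    (n_freq x n_lags) spectro-temporal patch.
    """
    n_freq, n_time = spec.shape
    n_cols = n_time - n_lags + 1
    return np.stack(
        [spec[:, t:t + n_lags].reshape(-1) for t in range(n_cols)], axis=1
    )

def st_svd_features(spec, n_lags, n_components):
    """SVD of the delay-embedded spectrogram (an assumed form of ST-SVD)."""
    X = delay_embed(spec, n_lags)
    X = X - X.mean(axis=1, keepdims=True)      # assumption: center each row
    U, s, _ = np.linalg.svd(X, full_matrices=False)
    dictionary = U[:, :n_components]           # spectro-temporal patterns
    features = dictionary.T @ X                # reduced set of time series
    return dictionary, s[:n_components], features

# Stand-in data: 16 frequency bands x 200 frames of noise, not real speech.
rng = np.random.default_rng(0)
spec = rng.standard_normal((16, 200))
dictionary, sing_vals, features = st_svd_features(spec, n_lags=5, n_components=4)
```

In this sketch each column of `dictionary` can be reshaped to `(n_freq, n_lags)` and viewed as a spectro-temporal pattern, and the rows of `features` are mutually orthogonal time series, which is one way to picture "extracted features at reduced bandwidth".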
November 2013
Meeting abstract. No PDF available.
November 01 2013
Dynamic component analysis for multi-input/multi-output problems, with application to speech and neurophysiology
Erik Edwards
Dept. of Neurological Surgery, UC San Francisco, Sandler Neurosci. Bldg., 675 Nelson Rising Ln., 535, San Francisco, CA 94143, [email protected]
Edward F. Chang
Dept. of Neurological Surgery, UC San Francisco, Sandler Neurosci. Bldg., 675 Nelson Rising Ln., 535, San Francisco, CA 94143, [email protected]
J. Acoust. Soc. Am. 134, 4230 (2013)
Citation
Erik Edwards, Edward F. Chang; Dynamic component analysis for multi-input/multi-output problems, with application to speech and neurophysiology. J. Acoust. Soc. Am. 1 November 2013; 134 (5_Supplement): 4230. https://doi.org/10.1121/1.4831544