In mobile environments, automatic speech recognition (ASR) systems are increasingly used for information retrieval, but their performance degrades in the presence of multiple speakers, noise sources, and reverberation. This work addresses improving ASR performance by separating the convolutively mixed speech signals that predominate in such environments. For the separation, an extension of the algorithm published in A. Ossadtchi and S. Kadambe [‘‘Over‐complete blind source separation by applying sparse decomposition and information theoretic based probabilistic approach,’’ ICASSP, 2001] is applied in the time–frequency domain. In the extended algorithm, a dual update procedure that simultaneously minimizes the L1 and L2 norms is applied in every frequency band. The problem of channel swapping (the permutation ambiguity across frequency bands) is also addressed. Experimental results on separating convolutively mixed signals indicate an SNR improvement of about 6 dB. The enhanced speech signals are then fed to a GMM-based continuous speech recognizer, and recognition experiments are performed on real speech data collected inside a vehicle. Complete ASR performance improvement results will be provided during the presentation.
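As a minimal sketch of the channel-swapping issue mentioned above: when sources are separated independently in each frequency band, the output ordering can differ from band to band, and one common repair is to correlate the magnitude envelopes of each band's outputs against a running reference and re-permute accordingly. The code below illustrates that envelope-correlation idea only; it is not the paper's probabilistic method, and all function and variable names are illustrative.

```python
import numpy as np
from itertools import permutations

def align_permutations(envelopes):
    """Resolve per-band channel swapping by envelope correlation.

    envelopes: array of shape (n_bands, n_sources, n_frames) holding the
    magnitude envelopes of the separated outputs in each frequency band.
    Each band's channels are re-permuted to best match a running centroid
    of the already-aligned bands. Returns an aligned copy.
    (Illustrative sketch; the paper's actual resolution method may differ.)
    """
    n_bands, n_src, _ = envelopes.shape
    aligned = envelopes.astype(float).copy()
    centroid = aligned[0].copy()  # reference built from band 0 onward
    for b in range(1, n_bands):
        band = aligned[b]
        best_perm, best_score = None, -np.inf
        for perm in permutations(range(n_src)):
            # total correlation between reference and this ordering
            score = sum(np.corrcoef(centroid[i], band[perm[i]])[0, 1]
                        for i in range(n_src))
            if score > best_score:
                best_score, best_perm = score, perm
        aligned[b] = band[list(best_perm)]  # fancy indexing copies, so safe
        centroid += aligned[b]              # refine the reference
    return aligned
```

In a full frequency-domain separation pipeline this step would run after the per-band separation and before the inverse STFT, so that each output channel carries the same source across all bands.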
November 2002
Meeting abstract. No PDF available.
October 25 2002
Convolutive mixture separation in time–frequency domain for robust automatic speech recognition
Shubha L. Kadambe
HRL Labs., LLC, 3011 Malibu Canyon Rd., Malibu, CA 90265
J. Acoust. Soc. Am. 112, 2278 (2002)
Citation
Shubha L. Kadambe; Convolutive mixture separation in time–frequency domain for robust automatic speech recognition. J. Acoust. Soc. Am. 1 November 2002; 112 (5_Supplement): 2278. https://doi.org/10.1121/1.4779133