A method is presented in which conventional speech algorithms are applied, without modification, in a way that improves their performance in extremely noisy environments. It has been demonstrated that, for eigen-channel algorithms, pre-training multiple speaker identification (SID) models at a lattice of signal-to-noise-ratio (SNR) levels and then performing SID using the appropriate SNR-dependent model was successful in mitigating noise at all SNR levels. In those tests, it was found that SID performance was optimized when the SNRs of the testing and training data were close or identical. In the current effort, multiple i-vector algorithms were used, greatly improving both processing throughput and equal-error-rate classification accuracy. Using identical approaches in the same noisy environment, the performance of SID, language identification, gender identification, and diarization was significantly improved. A critical factor in this improvement is speech activity detection (SAD) that performs reliably in extremely noisy environments, where the speech itself is barely audible. To optimize SAD operation at all SNR levels, two algorithms were employed. The first maximized detection probability at low SNR levels (−10 dB ≤ SNR < +10 dB) using just the voiced speech envelope, and the second exploited features extracted from the original speech to improve overall accuracy at higher quality levels (SNR ≥ +10 dB).
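The SNR-matched selection described above can be illustrated with a minimal sketch. It assumes an SNR estimate is already available for the recording; the lattice values, function names, and SAD labels are hypothetical and are not taken from the paper. Only the +10 dB switch point between the two SAD algorithms comes from the text.

```python
# Hypothetical lattice of SNR levels (dB) at which models were pre-trained.
# The actual lattice used by the authors is not given in the abstract.
SNR_LATTICE_DB = [-10, -5, 0, 5, 10, 15, 20]


def nearest_lattice_snr(snr_db, lattice=SNR_LATTICE_DB):
    """Return the lattice level closest to the estimated SNR.

    Ties are resolved toward the lower (noisier) level; this tie rule
    is an assumption made for the sketch.
    """
    return min(lattice, key=lambda level: (abs(level - snr_db), level))


def choose_sad(snr_db):
    """Pick a SAD variant using the +10 dB boundary quoted in the abstract.

    Below +10 dB the envelope-based detector is used. The abstract only
    covers -10 dB and above; falling back to the envelope detector for
    even lower SNRs is an assumption.
    """
    return "feature_sad" if snr_db >= 10 else "envelope_sad"


def triage_plan(snr_db):
    """Combine both choices for one recording."""
    return {
        "sad": choose_sad(snr_db),
        "model_snr_db": nearest_lattice_snr(snr_db),
    }
```

For example, `triage_plan(3.2)` would select the envelope-based SAD and the model trained at the 5 dB lattice point, reflecting the observation that performance is best when training and testing SNRs are close.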
April 2018
April 23 2018
Noise-robust speech triage
Anthony L. Bartos (1)
Tomas Cipr (2)
Douglas J. Nelson (3, a)
Petr Schwarz (2)
John Banowetz (4)
Ladislav Jerabek (1)
1 Suzanne R. Miller Associates, Marriottsville, Maryland 21104, USA
2 Phonexia Limited and Brno University of Technology, Brno, Czech Republic
3 United States Department of Defense, 9800 Savage Road, Fort Meade, Maryland 20755, USA
4 Naval Research Laboratory, Washington, DC 20375, USA
a) Electronic mail: [email protected]
J. Acoust. Soc. Am. 143, 2313–2320 (2018)
Article history
Received: September 21 2017
Accepted: March 26 2018
Citation
Anthony L. Bartos, Tomas Cipr, Douglas J. Nelson, Petr Schwarz, John Banowetz, Ladislav Jerabek; Noise-robust speech triage. J. Acoust. Soc. Am. 1 April 2018; 143 (4): 2313–2320. https://doi.org/10.1121/1.5031029