In this study we present a performance comparison of five pitch extraction algorithms: Auto Correlation (AC), Cross Correlation (CC), and Sub-Harmonic Summation (SHS), as implemented in PRAAT [Boersma and Weenink (2010)]; the Robust Algorithm for Pitch Tracking (RAPT), as implemented in ESPS [Talkin (1995)]; and SWIPE' [Camacho (2007)]. Previous research showed that SHS and SWIPE' outperformed the other algorithms on two speech databases with EGG reference values [Camacho (2007)]. That study, however, used a fixed search range of 40–800 Hz for all speakers, regardless of sex or speaker-specific pitch characteristics. In the current study, we adopt the parameter optimization strategy from De Looze and Rauzy (2009) to calculate specific pitch floor and ceiling values for each speaker. Our results show a substantial improvement in the accuracy of the AC, CC, and RAPT algorithms when the optimized parameters are used (especially for the female speakers), with the result that all five algorithms show similar performance. The gross error rate for all five algorithms ranges from 0.1% to 0.3% (N=18 098) on the FDA database [Bagshaw (1994)] and from 0.2% to 0.4% (N=11 527) on the Keele database [Plante et al. (1995)]. Our study thus highlights the importance of pre-processing the speech signal to determine optimal speaker-specific parameters for pitch extraction.
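The two quantities the abstract relies on, a speaker-specific search range and a gross error rate, can be illustrated with a minimal Python sketch. It is not the authors' implementation. The function names are hypothetical. The quantile-based constants (35th and 65th percentiles, scale factors 0.72 and 1.9, offset 10 Hz) are the form in which the De Looze and Rauzy (2009) heuristic is commonly cited and should be treated as an assumption to verify against that paper. The 20% tolerance in the error rate is likewise an assumed, commonly used threshold; the abstract does not state one.

```python
import math


def quantile(values, q):
    """Linear-interpolation quantile of a non-empty sequence, with 0 <= q <= 1."""
    xs = sorted(values)
    pos = q * (len(xs) - 1)
    lo, hi = math.floor(pos), math.ceil(pos)
    frac = pos - lo
    return xs[lo] * (1 - frac) + xs[hi] * frac


def speaker_pitch_range(first_pass_f0, floor_q=0.35, ceil_q=0.65,
                        floor_scale=0.72, ceil_scale=1.9, offset_hz=10.0):
    """Derive a per-speaker (floor, ceiling) in Hz from a first-pass F0 track.

    The default constants are ASSUMED (commonly cited form of the
    De Looze and Rauzy heuristic); they are not given in the abstract.
    Frames with F0 <= 0 are treated as unvoiced and ignored.
    """
    voiced = [f for f in first_pass_f0 if f > 0]
    if not voiced:
        raise ValueError("no voiced frames in first-pass F0 track")
    floor = quantile(voiced, floor_q) * floor_scale - offset_hz
    ceiling = quantile(voiced, ceil_q) * ceil_scale + offset_hz
    return floor, ceiling


def gross_error_rate(estimated, reference, tolerance=0.2):
    """Fraction of frames whose estimate deviates from the reference by more
    than `tolerance` (relative), counted over frames voiced in both tracks.

    The 20% tolerance is an ASSUMED, commonly used threshold.
    """
    pairs = [(e, r) for e, r in zip(estimated, reference) if e > 0 and r > 0]
    if not pairs:
        raise ValueError("no frames voiced in both tracks")
    errors = sum(1 for e, r in pairs if abs(e - r) > tolerance * r)
    return errors / len(pairs)


if __name__ == "__main__":
    # First pass with a wide fixed range, then a second pass would use
    # the speaker-specific floor and ceiling computed here.
    first_pass = [0, 0, 190, 200, 205, 210, 220, 0, 400]  # 400 Hz: octave error
    print(speaker_pitch_range(first_pass))
    print(gross_error_rate([100, 200, 0, 150], [100, 100, 120, 0]))
```

In a two-pass workflow, the first pass would run a pitch tracker with a wide fixed range (such as the 40–800 Hz range mentioned above), and the second pass would rerun it with the floor and ceiling returned by `speaker_pitch_range`. Using quantiles rather than the minimum and maximum keeps the range robust to occasional octave errors in the first pass.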
15 November 2010
160th Meeting Acoustical Society of America
15–19 November 2010
Cancun, Mexico
Session 1pSC: Speech Communication
June 22 2011
The importance of optimal parameter setting for pitch extraction
Keelan Evanini
Research and Development, Educational Testing Service, Rosedale Road MS R-11, Princeton, NJ 08541
Catherine Lai
Linguistics, University of Pennsylvania, 619 Williams Hall, Philadelphia, PA 19104
Klaus Zechner
Research and Development, Educational Testing Service, Rosedale Road MS R-11, Princeton, NJ 08541
Proc. Mtgs. Acoust. 11, 060004 (2010)
Article history
Received: May 13 2011
Accepted: June 20 2011
Citation
Keelan Evanini, Catherine Lai, Klaus Zechner; The importance of optimal parameter setting for pitch extraction. Proc. Mtgs. Acoust. 15 November 2010; 11 (1): 060004. https://doi.org/10.1121/1.3609833