Pattern recognition technology that has been developed for recognizing units of human speech can often be adapted for both recognition and analysis of animal vocalizations. This paper discusses two types of speech recognition algorithms, template based and statistics based, with respect to their ease of deployment and potential application to the objective, quantitative analysis of animal vocalizations. Implementations of the two types of algorithms have been compared using a large database of song units recorded from two song bird species. The algorithms exhibit different strengths and weaknesses. The template‐based dynamic time‐warping algorithm provides quantitative sound comparisons that are directly useful to a researcher, but selection of training materials depends on expert knowledge. The statistics‐based hidden Markov model algorithm requires more training data, but usually performs better in noisy environments and with more variable vocalizations. While both algorithms are accurate in restricted domains, recognition performance could be improved if it were based on species‐specific features extracted from the acoustic input. [Work supported by NIH 1‐F32‐MH10525 and ARO DACA88‐95‐C‐0016.]
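To make the template-based approach concrete, the following is a minimal sketch of dynamic time warping used as a nearest-template classifier. It is not the implementation evaluated in this work: the function names `dtw_distance` and `classify`, the Euclidean frame distance, the unconstrained three-way step pattern, and the toy one-dimensional "feature" frames are all assumptions chosen for illustration.

```python
import math


def dtw_distance(a, b):
    """Cumulative DTW cost between two sequences of equal-dimension feature frames.

    Assumptions (illustrative only): Euclidean distance between frames and
    the basic step pattern (insertion, deletion, match) with no slope
    constraint or path-length normalization.
    """
    n, m = len(a), len(b)
    # cost[i][j] holds the minimal cumulative cost of aligning a[:i] with b[:j].
    cost = [[math.inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            d = math.dist(a[i - 1], b[j - 1])
            cost[i][j] = d + min(
                cost[i - 1][j],      # frame of a repeated against b[j-1]
                cost[i][j - 1],      # frame of b repeated against a[i-1]
                cost[i - 1][j - 1],  # both sequences advance
            )
    return cost[n][m]


def classify(unit, templates):
    """Return (label, distance) of the template closest to `unit` under DTW.

    `templates` is a list of (label, sequence) pairs; in practice these would
    be exemplars chosen by someone who knows the species' repertoire.
    """
    distance, label = min(
        ((dtw_distance(unit, seq), lab) for lab, seq in templates),
        key=lambda pair: pair[0],
    )
    return label, distance


# Hypothetical toy data: one feature per frame (for example, a pitch value).
templates = [
    ("rising", [(1.0,), (2.0,), (3.0,), (4.0,)]),
    ("falling", [(4.0,), (3.0,), (2.0,), (1.0,)]),
]
# A time-stretched version of the rising contour.
unit = [(1.0,), (1.0,), (2.0,), (3.0,), (3.0,), (4.0,)]
label, distance = classify(unit, templates)
```

The returned distance is the kind of quantitative sound comparison a template-based method yields directly, and the hand-built `templates` list reflects why the choice of training exemplars depends on expert knowledge. A statistics-based recognizer would instead estimate model parameters from many labeled examples of each song unit.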
October 1999
Meeting abstract. No PDF available.
October 01 1999
Speech recognition meets bird song: A comparison of statistics‐based and template‐based techniques
Sven E. Anderson
Dept. of Computer Sci., Univ. of North Dakota, Grand Forks, ND 58202‐9015
J. Acoust. Soc. Am. 106, 2130 (1999)
Citation
Sven E. Anderson; Speech recognition meets bird song: A comparison of statistics‐based and template‐based techniques. J. Acoust. Soc. Am. 1 October 1999; 106 (4_Supplement): 2130. https://doi.org/10.1121/1.428011