Automatic off-line classification and recognition of bird vocalizations have been of interest to ornithologists and pattern detection researchers for many years. Several new applications, including bird vocalization classification for aircraft bird strike avoidance, will require real-time classification in the presence of noise and other disturbances. The vocalizations of many common bird species can be represented using a sum-of-sinusoids model. An experiment using computer software to perform peak tracking of spectral analysis data demonstrates the usefulness of the sum-of-sinusoids model for rapid automatic recognition of isolated bird syllables. The technique derives a set of spectral features by time-variant analysis of the recorded bird vocalizations, then calculates the degree to which the derived parameters match a set of stored templates determined from reference bird vocalizations. The results of this relatively simple technique are favorable for both clean and noisy recordings.
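
The pipeline described above (time-variant spectral analysis, peak tracking, and comparison against stored templates) can be illustrated with a minimal sketch. It is not the software used in the experiment. Several choices here are assumptions made only for illustration: a single strongest peak is tracked per frame rather than a full set of sinusoidal partials, the match score is a root-mean-square distance between length-normalized frequency tracks, and the names `peak_track`, `resample_track`, `classify`, and `chirp`, along with the frame sizes and synthetic test signals, are hypothetical.

```python
import numpy as np

def peak_track(signal, sample_rate, frame_len=512, hop=256):
    """Frequency (Hz) of the strongest spectral peak in each analysis frame."""
    window = np.hanning(frame_len)
    freqs = np.fft.rfftfreq(frame_len, d=1.0 / sample_rate)
    n_frames = 1 + (len(signal) - frame_len) // hop
    track = np.empty(n_frames)
    for i in range(n_frames):
        frame = signal[i * hop : i * hop + frame_len] * window
        spectrum = np.abs(np.fft.rfft(frame))
        track[i] = freqs[np.argmax(spectrum)]
    return track

def resample_track(track, n_points=32):
    """Linearly resample a track to a fixed length so tracks are comparable."""
    old = np.linspace(0.0, 1.0, len(track))
    new = np.linspace(0.0, 1.0, n_points)
    return np.interp(new, old, track)

def classify(track, templates):
    """Return the best-matching template name and all RMS distances (Hz).

    The RMS distance is an assumed match measure, chosen for simplicity.
    """
    query = resample_track(track)
    scores = {
        name: float(np.sqrt(np.mean((query - resample_track(ref)) ** 2)))
        for name, ref in templates.items()
    }
    return min(scores, key=scores.get), scores

def chirp(f_start, f_end, duration, sample_rate):
    """Synthetic linear frequency sweep standing in for a bird syllable."""
    t = np.arange(int(duration * sample_rate)) / sample_rate
    phase = 2 * np.pi * (f_start * t + 0.5 * (f_end - f_start) / duration * t**2)
    return np.sin(phase)

# Hypothetical reference syllables: one rising and one falling sweep.
SAMPLE_RATE = 16000
templates = {
    "rising": peak_track(chirp(2000, 4000, 0.5, SAMPLE_RATE), SAMPLE_RATE),
    "falling": peak_track(chirp(4000, 2000, 0.5, SAMPLE_RATE), SAMPLE_RATE),
}

# A noisy rising sweep plays the role of an unknown recording.
rng = np.random.default_rng(0)
noisy = chirp(2000, 4000, 0.5, SAMPLE_RATE) + 0.3 * rng.standard_normal(8000)
label, scores = classify(peak_track(noisy, SAMPLE_RATE), templates)
print(label, scores)
```

Resampling every track to a fixed number of points is one simple way to compare syllables of different durations; a time-alignment method such as dynamic time warping could be substituted where tempo varies within a syllable.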
