Quantitative measures of acoustic similarity can reveal patterns of shared vocal behavior in social species. Many methods for computing similarity have been developed, but their performance has not been extensively characterized in noisy environments or for vocalizations with complex frequency modulations. This paper describes methods of bioacoustic comparison based on dynamic time warping (DTW) of either the fundamental frequency contour or the spectrogram. Fundamental frequency is estimated using a Bayesian particle filter adaptation of harmonic template matching. The methods were tested on field recordings of flight calls from superb starlings, Lamprotornis superbus, and evaluated by how well they separated distinct categories of call elements (motifs). The fundamental-frequency-based method performed best, but the spectrogram-based method was less sensitive to noise. Both DTW methods separated categories better than spectrographic cross-correlation, likely because the duration of superb starling flight call motifs varies substantially.
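
To illustrate why time warping helps when motif durations vary, the following is a minimal sketch of a textbook DTW distance applied to fundamental-frequency contours. It is not the paper's implementation: the absolute-difference local cost, the length normalization, and the synthetic contours produced by the hypothetical helper `contour` are all assumptions chosen for illustration.

```python
import numpy as np


def dtw_distance(a, b):
    """Textbook dynamic time warping distance between two 1-D contours.

    Assumptions (not taken from the paper): absolute difference as the
    local cost, unconstrained warping path, normalization by n + m.
    """
    a = np.asarray(a, dtype=float)
    b = np.asarray(b, dtype=float)
    n, m = len(a), len(b)
    # cost[i, j] holds the minimal cumulative cost of aligning a[:i] with b[:j]
    cost = np.full((n + 1, m + 1), np.inf)
    cost[0, 0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            d = abs(a[i - 1] - b[j - 1])
            # extend the cheapest of the three allowed predecessor alignments
            cost[i, j] = d + min(cost[i - 1, j], cost[i, j - 1], cost[i - 1, j - 1])
    # divide by combined length so short and long contours are comparable
    return cost[n, m] / (n + m)


def contour(n_frames):
    """Hypothetical f0 contour in kHz: an upsweep followed by a downsweep."""
    t = np.linspace(0.0, 1.0, n_frames)
    return 3.0 + 2.0 * np.sin(np.pi * t)


# Two renditions of the same synthetic motif with different durations,
# and one contour of a different shape (a plain upsweep).
short = contour(40)
long_ = contour(70)
other = 3.0 + 2.0 * np.linspace(0.0, 1.0, 40)

d_same = dtw_distance(short, long_)  # same shape, different duration
d_diff = dtw_distance(short, other)  # different shape
print(f"same motif: {d_same:.4f}, different motif: {d_diff:.4f}")
```

In this toy case the two renditions of the same shape receive a smaller distance than the pair with different shapes, even though their durations differ. A fixed-lag comparison such as cross-correlation cannot stretch one signal against the other in this way.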
