Over time, the acoustic and morphological features of a bird population can diverge from those of the parent species. A quantitative measure of the difference between two populations of species or subspecies is extremely useful to zoologists. This paper takes a dialect difference system first developed for speech and refines it to measure vocalisation difference between bird populations automatically by extracting pitch contours. The pitch contours are transposed into pitch codes. A variety of codebook schemes are proposed to represent the contour structure, including a vector quantization approach. The measure, called Bird Vocalisation Difference, is applied to bird populations whose calls are considered very similar, very different, and between these two extremes. Initial results are very promising: the behaviour of the metric is consistent with accepted levels of similarity for the populations tested to date. The influence of data size on the measure is investigated using reduced datasets. Results of species pair classification using Gaussian mixture models with Mel-frequency cepstral coefficients are also given as a baseline indicator of class confusability.
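The abstract does not specify how the pitch codes or codebooks are built, so the following is only an illustrative sketch of one way a vector quantization codebook could be trained on pitch contours. Every detail is an assumption made for illustration, not the paper's method: the function names, the use of frame-to-frame log-pitch differences, the window length `seg_len`, the evenly spaced codebook initialisation, and the plain k-means update are all hypothetical.

```python
import numpy as np

def contour_to_segments(pitch_hz, seg_len=3):
    """Turn a pitch contour (Hz per frame) into overlapping vectors of
    frame-to-frame log-pitch change (assumed representation, in octaves)."""
    log_pitch = np.log2(np.asarray(pitch_hz, dtype=float))
    deltas = np.diff(log_pitch)
    n = len(deltas) - seg_len + 1
    if n <= 0:
        return np.empty((0, seg_len))
    return np.stack([deltas[i:i + seg_len] for i in range(n)])

def quantize(vectors, codebook):
    """Assign each vector the index of its nearest codebook entry."""
    dists = np.linalg.norm(vectors[:, None, :] - codebook[None, :, :], axis=2)
    return dists.argmin(axis=1)

def train_codebook(vectors, n_codes=2, n_iter=20):
    """Plain k-means (Lloyd's algorithm), initialised deterministically
    from evenly spaced training vectors (a simplification)."""
    start = np.linspace(0, len(vectors) - 1, n_codes).astype(int)
    codebook = vectors[start].copy()
    for _ in range(n_iter):
        codes = quantize(vectors, codebook)
        for k in range(n_codes):
            members = vectors[codes == k]
            if len(members) > 0:  # keep the old entry if a cell is empty
                codebook[k] = members.mean(axis=0)
    return codebook

# Synthetic contours standing in for real pitch tracks (hypothetical data).
rising = np.linspace(2000.0, 4000.0, 20)
falling = np.linspace(4000.0, 2000.0, 20)
segments = np.vstack([contour_to_segments(rising),
                      contour_to_segments(falling)])

codebook = train_codebook(segments, n_codes=2)
codes = quantize(segments, codebook)  # one pitch code per contour segment
```

On this toy input the two codebook entries separate rising segments from falling ones, so each contour becomes a sequence of discrete codes that a statistical model could then compare across populations. Real recordings would need a pitch tracker and a larger codebook, neither of which is shown here.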
