Bird localisation using passive acoustic methods is a non-intrusive way of taking a census of bird species. The generalised cross-correlation with phase transform (GCC-PHAT), applied to recordings from a microphone array, has been widely adopted for estimating the direction-of-arrival (DOA) of audio signals, especially speech in indoor environments, because it performs very well under reverberation. However, its performance degrades when the signal-to-noise ratio is low. This study investigates the performance of DOA estimation when GCC-PHAT is applied in the wavelet domain. Three configurations are considered, and their performance is assessed by numerical simulation under an ideal setup and by practical experiments using audio signals recorded in a typical native forest in New Zealand. The results suggest that the configuration which applies GCC-PHAT and denoising in the wavelet domain, and estimates the DOA from the reconstructed cross-correlation, can give more accurate estimates than the conventional GCC-PHAT.
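For orientation, the conventional GCC-PHAT baseline mentioned above can be sketched for a single microphone pair as follows. This is a minimal illustration only, not the wavelet-domain configurations studied here, whose details are not given in this abstract. The function names (`gcc_phat`, `doa_from_delay`), the far-field model, the speed of sound of 343 m/s, the 0.2 m microphone spacing, the 48 kHz sampling rate and the synthetic noise signal are all assumptions chosen for the example.

```python
import numpy as np

def gcc_phat(x1, x2, fs, max_tau=None):
    """Estimate the delay (in seconds) of x2 relative to x1 using GCC-PHAT."""
    n = len(x1) + len(x2)                 # zero-pad to avoid circular wrap-around
    X1 = np.fft.rfft(x1, n=n)
    X2 = np.fft.rfft(x2, n=n)
    cross = X2 * np.conj(X1)              # cross-power spectrum
    cross /= np.abs(cross) + 1e-12        # PHAT weighting: keep phase, drop magnitude
    cc = np.fft.irfft(cross, n=n)         # generalised cross-correlation
    max_shift = n // 2
    if max_tau is not None:
        max_shift = max(1, min(int(fs * max_tau), max_shift))
    # Reorder so that index max_shift corresponds to zero lag.
    cc = np.concatenate((cc[-max_shift:], cc[:max_shift + 1]))
    shift = int(np.argmax(np.abs(cc))) - max_shift
    return shift / fs

def doa_from_delay(tau, mic_distance, c=343.0):
    """Far-field DOA in degrees from broadside for a two-microphone pair (assumed model)."""
    s = np.clip(c * tau / mic_distance, -1.0, 1.0)
    return float(np.degrees(np.arcsin(s)))

# Synthetic test signal (assumption): white noise delayed by 10 samples at the second microphone.
fs = 48000
mic_distance = 0.2                        # metres, hypothetical spacing
rng = np.random.default_rng(0)
x1 = rng.standard_normal(4096)
x2 = np.concatenate((np.zeros(10), x1[:-10])) + 0.1 * rng.standard_normal(4096)

tau = gcc_phat(x1, x2, fs, max_tau=mic_distance / 343.0)
angle = doa_from_delay(tau, mic_distance)
print(f"estimated delay: {tau * fs:.0f} samples, DOA: {angle:.1f} degrees")
```

Restricting the peak search to lags within `max_tau` (the largest delay physically possible for the given spacing) is a common design choice that suppresses spurious peaks. Lowering the signal-to-noise ratio of the synthetic signals is one way to observe the degradation that motivates working in the wavelet domain.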
