Passive acoustic monitoring (PAM) is a useful technique for monitoring marine mammals. However, the volume of data that PAM systems collect makes automated algorithms for detecting and classifying sounds essential. Deep learning algorithms have shown great promise in recent years, but their performance is limited by a shortage of annotated training data. This work investigates the benefit of augmenting training datasets with synthetically generated samples when training a deep neural network to classify North Atlantic right whale (Eubalaena glacialis) upcalls. We apply two recently proposed augmentation techniques, SpecAugment and Mixup, and show that they improve the performance of our model considerably: precision increases from 86% to 90%, and recall increases from 88% to 93%. Finally, we demonstrate that these two methods yield a significant improvement in performance under data scarcity, where few training samples are available. This shows that data augmentation can reduce the annotation effort required to reach a desired performance threshold.
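
As a rough illustration of the two augmentation techniques named above, the following Python sketch applies Mixup to a batch of spectrograms and SpecAugment-style masking to a single spectrogram. It is not the authors' implementation. The function names, array shapes, default parameter values, and the choice to fill masked regions with the spectrogram mean are all assumptions made for illustration, and the time-warping step of the original SpecAugment is omitted.

```python
import numpy as np


def mixup(x, y, alpha=0.2, rng=None):
    """Blend each sample in a batch with a randomly chosen partner.

    x: array of shape (batch, n_freq, n_time), a batch of spectrograms.
    y: array of shape (batch, n_classes), one-hot labels.
    alpha: Beta distribution parameter (illustrative default).
    """
    rng = np.random.default_rng() if rng is None else rng
    lam = rng.beta(alpha, alpha)           # mixing weight in (0, 1)
    perm = rng.permutation(len(x))         # partner index for each sample
    x_mix = lam * x + (1.0 - lam) * x[perm]
    y_mix = lam * y + (1.0 - lam) * y[perm]
    return x_mix, y_mix


def spec_augment(spec, max_freq_width=8, max_time_width=16, rng=None):
    """Mask one random frequency band and one random time band.

    spec: array of shape (n_freq, n_time). The input is not modified.
    Masked cells are set to the spectrogram mean (an assumption).
    """
    rng = np.random.default_rng() if rng is None else rng
    out = spec.copy()
    n_freq, n_time = out.shape
    fill = spec.mean()

    # Frequency mask: width f drawn from [0, max_freq_width].
    f = rng.integers(0, min(max_freq_width, n_freq) + 1)
    f0 = rng.integers(0, n_freq - f + 1)
    out[f0:f0 + f, :] = fill

    # Time mask: width t drawn from [0, max_time_width].
    t = rng.integers(0, min(max_time_width, n_time) + 1)
    t0 = rng.integers(0, n_time - t + 1)
    out[:, t0:t0 + t] = fill
    return out


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    batch = rng.normal(size=(4, 32, 64))   # stand-in spectrograms
    labels = np.eye(2)[[0, 1, 1, 0]]       # upcall / not-upcall, one-hot
    mixed_x, mixed_y = mixup(batch, labels, rng=rng)
    masked = spec_augment(batch[0], rng=rng)
    print(mixed_x.shape, mixed_y.shape, masked.shape)
```

In this sketch, Mixup produces soft labels, so it would be paired with a loss that accepts class probabilities rather than integer class indices.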
