Passive acoustic monitoring (PAM) is a useful technique for monitoring marine mammals. However, the quantity of data collected by PAM systems makes automated algorithms for detecting and classifying sounds essential. Deep learning algorithms have shown great promise in recent years, but their performance is limited by the scarcity of annotated training data. This work investigates the benefit of augmenting training datasets with synthetically generated samples when training a deep neural network to classify North Atlantic right whale (Eubalaena glacialis) upcalls. We apply two recently proposed augmentation techniques, SpecAugment and Mixup, and show that they improve the performance of our model considerably: precision increases from 86% to 90%, and recall increases from 88% to 93%. Finally, we demonstrate that these two methods yield a significant improvement in performance in a scenario of data scarcity, where few training samples are available. This demonstrates that data augmentation can reduce the annotation effort required to reach a desired performance threshold.
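As a rough illustration of the two augmentation techniques named in the abstract, the following is a minimal sketch and not the authors' implementation. It assumes a spectrogram stored as a plain list of rows (frequency bins by time frames) and one-hot labels. The function names `spec_augment` and `mixup`, the mask widths, the use of zero as the mask value, and the `alpha` value are all hypothetical choices made for this example. Only the masking steps of SpecAugment are shown; the time-warping step of the original method is omitted.

```python
import random


def spec_augment(spec, max_freq_width, max_time_width, rng):
    """Return a copy of `spec` with one frequency band and one time band zeroed.

    `spec` is a list of rows, one per frequency bin, each holding one value
    per time frame. Mask widths are drawn uniformly from 0..max width.
    """
    n_freq, n_time = len(spec), len(spec[0])
    out = [row[:] for row in spec]  # copy so the input is left untouched

    # Frequency mask: zero out `f` consecutive frequency bins.
    f = rng.randint(0, max_freq_width)
    f0 = rng.randint(0, n_freq - f)
    for i in range(f0, f0 + f):
        out[i] = [0.0] * n_time

    # Time mask: zero out `t` consecutive time frames in every bin.
    t = rng.randint(0, max_time_width)
    t0 = rng.randint(0, n_time - t)
    for row in out:
        for j in range(t0, t0 + t):
            row[j] = 0.0
    return out


def mixup(spec_a, label_a, spec_b, label_b, alpha, rng):
    """Blend two samples and their labels with a weight drawn from Beta(alpha, alpha)."""
    lam = rng.betavariate(alpha, alpha)
    mixed = [
        [lam * a + (1.0 - lam) * b for a, b in zip(row_a, row_b)]
        for row_a, row_b in zip(spec_a, spec_b)
    ]
    label = [lam * a + (1.0 - lam) * b for a, b in zip(label_a, label_b)]
    return mixed, label, lam


if __name__ == "__main__":
    rng = random.Random(0)
    upcall = [[1.0] * 10 for _ in range(8)]      # toy "upcall" spectrogram
    background = [[0.0] * 10 for _ in range(8)]  # toy "background" spectrogram
    masked = spec_augment(upcall, max_freq_width=3, max_time_width=4, rng=rng)
    mixed, label, lam = mixup(upcall, [1.0, 0.0], background, [0.0, 1.0], 0.4, rng)
    print(len(masked), len(masked[0]), label)
```

In practice these operations would normally be applied to array or tensor batches during training; plain lists are used here only to keep the sketch free of dependencies.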
Data augmentation for the classification of North Atlantic right whales upcalls
Bruno Padovese, Fabio Frazao, Oliver S. Kirsebom, Stan Matwin; Data augmentation for the classification of North Atlantic right whales upcalls. J. Acoust. Soc. Am. 1 April 2021; 149 (4): 2520–2530. https://doi.org/10.1121/10.0004258