Convolutional neural networks (CNNs) have proven highly effective at automatically identifying and classifying underwater sound sources, enabling efficient analysis of marine environments. This work examines two key design choices for a CNN classifier, input representation and network architecture, analyzing their importance as training data size varies and their effectiveness in generalizing between sites. Passive acoustic data from three offshore sites in Western Scotland were used for hierarchical classification, categorizing sounds into one of four classes: delphinid tonal, delphinid clicks, vessels, and ambient noise. Three different input representations of the acoustic signals were investigated along with four CNN architectures, including three pre-trained for image classification tasks. Experiments show that a custom-built shallow CNN can outperform more complex architectures if the input representation is chosen appropriately. For example, a shallow CNN using a Mel-spectrogram normalized with per-channel energy normalization (MS-PCEN) achieved a 12.5% accuracy improvement over a ResNet model when only small amounts of training data were available. Studying model performance across the three sites demonstrates that input representation is an important factor for achieving robust results between sites, with MS-PCEN achieving the best performance. However, the importance of the choice of input representation decreases as the training dataset size increases.
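To make the MS-PCEN representation concrete, the following is a minimal sketch of per-channel energy normalization applied to an already computed Mel-band energy matrix. It uses the standard PCEN formulation: a first-order smoother M(t, f) = (1 - s) M(t-1, f) + s E(t, f), followed by (E / (eps + M)^alpha + delta)^r - delta^r. The function name `pcen`, the time-major list layout, and the parameter values are illustrative assumptions, not the settings or code used in this study.

```python
from typing import List


def pcen(
    energy: List[List[float]],
    s: float = 0.025,      # smoothing coefficient of the first-order IIR filter (assumed value)
    alpha: float = 0.98,   # strength of the automatic gain control (assumed value)
    delta: float = 2.0,    # bias added before root compression (assumed value)
    r: float = 0.5,        # root-compression exponent (assumed value)
    eps: float = 1e-6,     # small constant that avoids division by zero
) -> List[List[float]]:
    """Apply PCEN to non-negative Mel-band energies laid out as energy[t][f]."""
    if not energy:
        return []
    # Initialise the smoother with the first frame so the first output is well defined.
    smoothed = list(energy[0])
    out = []
    for frame in energy:
        # Per-band temporal smoothing: M(t, f) = (1 - s) * M(t-1, f) + s * E(t, f)
        smoothed = [(1.0 - s) * m + s * e for m, e in zip(smoothed, frame)]
        # Gain control by the smoothed energy, then root compression.
        out.append(
            [(e / (eps + m) ** alpha + delta) ** r - delta ** r
             for e, m in zip(frame, smoothed)]
        )
    return out


if __name__ == "__main__":
    mel = [[1.0, 2.0], [3.0, 4.0], [2.0, 1.0]]  # toy 3-frame, 2-band example
    for row in pcen(mel):
        print(row)
```

Because each band is divided by its own smoothed energy, a slowly varying gain or a stationary background level is largely cancelled. That is one plausible reason a representation of this kind could transfer better between recording sites, although the sketch itself does not demonstrate it.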

