Computer performance has improved tremendously over the last 60 years, but those gains have slowed markedly in the past decade owing to fundamental limits of the underlying computing primitives. Meanwhile, data generation and the demand for computing continue to grow exponentially. There is therefore a critical need for new computing primitives, both hardware and algorithms, to keep pace with this demand. The brain is a natural computer that outperforms our best machines at certain tasks, such as instantly recognizing faces or understanding natural language. This realization has spurred a flurry of research into neuromorphic, or brain-inspired, computing that shows promise for enhanced computing capabilities. This review highlights the key primitives of a brain-inspired computer that could drive another decade-long wave of computer engineering.
