Rich dynamics in a living neuronal system can be considered as a computational resource for physical reservoir computing (PRC). However, PRC that generates a coherent signal output from a spontaneously active neuronal system is still challenging. To overcome this difficulty, we here constructed a closed-loop experimental setup for PRC of a living neuronal culture, where neural activities were recorded with a microelectrode array and stimulated optically using caged compounds. The system was equipped with first-order reduced and controlled error learning to generate a coherent signal output from a living neuronal culture. Our embodiment experiments with a vehicle robot demonstrated that the coherent output served as a homeostasis-like property of the embodied system from which a maze-solving ability could be generated. Such a homeostatic property generated from the internal feedback loop in a system can play an important role in task solving in biological systems and enable the use of computational resources without any additional learning.

Physical reservoir computing (PRC) is an emerging concept in which intrinsic nonlinear dynamics in a given physical system (e.g., a photonic system, magnetic material, mechanical robot, or neural system) are exploited as a computational resource, or a reservoir.1–5 Recent studies have characterized the rich dynamics of spatiotemporal neural activities as an origin of neuronal computation, sometimes as a reservoir,6–14 and demonstrated PRC in living neuronal cultures.15–19 However, generating a coherent signal output from a spontaneously active neural system, typically with chaotic dynamics, remains challenging for PRC. To overcome this difficulty, first-order reduced and controlled error (FORCE) learning has been proposed for artificial neural networks.20,21 In this study, we attempted to implement FORCE learning in PRC using a living neuronal culture. We conducted embodiment experiments with a vehicle robot to demonstrate that the coherent output could serve as a homeostasis-like property of the embodied system, which could result in the development of problem-solving abilities. Our PRC embodiment was characterized by a linear readout from neural activities [Fig. 1(a)] and differed substantially from the conventional “Braitenberg vehicle-type” embodiment of a living neuronal culture, in which sensory-motor coupling was optimized through Hebbian learning.22–28 Hebbian learning is a neural mechanism that produces associative memory and directly modifies input–output relationships in embodiment experiments, whereas homeostasis is a mechanism that maintains the internal state of a living system. These two mechanisms might play complementary roles in task solving in neural systems.29,30

We developed a closed-loop system to generate a coherent signal from a spontaneously active living neuronal culture and embodied the culture with a mobile vehicle robot [Fig. 1(b)]. To measure extracellular signals, the neuronal culture was grown on a microelectrode array (MEA). Spike events were convolved with a half-Gaussian kernel to smooth the signals, and the weighted sum of the smoothed signals was used as the output of FORCE learning. This linear readout from a reservoir (i.e., a living neuronal culture) is a fundamental feature of PRC. The feedback signals were delivered through a photo-active caged glutamate, Rubi-glutamate, which excited neuronal cells when uncaged. A 473 nm blue laser was used for uncaging, and the duration of illumination was set according to the output of FORCE learning. In FORCE learning, the weights were adjusted by the recursive least squares (RLS) algorithm so that the output signal matched the constant target signal. The error, i.e., the deviation of the output from the target, was used to control the robot. If the error was zero, the robot moved forward; otherwise, the robot turned either right or left. When the robot collided with an obstacle or when its goal was not within 90° in front of it, electrical stimulation was applied to the culture from an electrode. A custom-made C++ program processed the spike data, performed the FORCE learning, communicated bi-directionally with the robot, and modulated the duration of illumination according to the output of FORCE learning.

The experimental protocol was approved by the Committee on the Ethics of Animal Experiments of the Research Center for Advanced Science and Technology, the University of Tokyo (Permit No.: RAC130106). The protocol for the dissociated culture has been described in detail previously.6,7,31 Cortical cells were obtained from E18 rat embryos. The substrate around the electrodes of the MEA (60MEA200/30iR-Ti-gr, Multichannel Systems, Germany) was coated with 0.05% polyethyleneimine (Sigma-Aldrich, USA) and subsequently with 0.02 mg/ml laminin (Sigma-Aldrich). The dissociated cortical cells were placed in the area around the MEA electrodes. The cells were fed Dulbecco's modified Eagle medium (DMEM; Life Technologies, USA) supplemented with 10% horse serum (Hyclone, USA), 0.25% GlutaMax (Life Technologies), and 1% sodium pyruvate (Life Technologies). A culture grown for more than 25 days in vitro was considered mature and ready for recording. Neurons in the culture emitted electrical signals, which were captured by the extracellular electrodes of the MEA. The MEA included 59 TiN recording electrodes and one reference electrode, with an inter-electrode spacing of 200 μm. The electrodes were circular with a diameter of 30 μm and were laid out in an 8 × 8 grid without the four corners. The sampling frequency of the recording was 25 kHz. The captured signal was bandpass filtered at 1–3000 Hz with an analog filter, amplified, and converted to a digital signal using a 16-bit A/D converter. The signal was then digitally bandpass filtered at 100–3000 Hz. Artifacts were suppressed using the SALPA algorithm. Spike events were detected using the LimAda algorithm (threshold of 5.5).32 The atmosphere was heated and maintained at 30–36 °C during the recording.

We implemented a closed-loop experiment based on FORCE learning using the cortical culture. FORCE learning has been used in reservoir computing to generate coherent signals from the spontaneous chaotic activity of a neural network. Spikes detected at each electrode were convolved with 2.5-s-wide half-Gaussian functions. The convolved signals were defined as the activity of the neurons, x_t. The activity was linearly combined with the weight w_t, and the value y_t = w_t x_t was used as the output of the neural network. The weight w_t was initialized in every trial as w_0 = 0. It was then updated recursively at every time step of FORCE learning by the RLS algorithm so that the error against the target signal s_t was minimized.20 The interval between learning steps was determined on a best-effort basis; the average interval was 14 ms.
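The readout described above can be sketched in a few lines. The following is a minimal illustration, not the authors' C++ program: it smooths binned spike counts with a causal half-Gaussian kernel and trains a linear readout with the standard RLS update of FORCE learning. The names `half_gaussian_kernel`, `smooth`, and `RLSReadout` are hypothetical, and the choice of sigma as one third of the 2.5 s width, the bin size, and the regularization `alpha` are assumptions that the text does not specify.

```python
import math


def half_gaussian_kernel(width_s, dt_s):
    """Causal half-Gaussian kernel sampled every dt_s seconds, normalized to sum 1.

    Assumption: width_s is the kernel support and sigma = width_s / 3.
    """
    sigma = width_s / 3.0
    n = int(round(width_s / dt_s)) + 1
    k = [math.exp(-0.5 * (i * dt_s / sigma) ** 2) for i in range(n)]
    total = sum(k)
    return [v / total for v in k]


def smooth(spike_counts, kernel):
    """Causal convolution: each sample uses only the current and past bins."""
    out = []
    for t in range(len(spike_counts)):
        acc = 0.0
        for i, kv in enumerate(kernel):
            if t - i < 0:
                break
            acc += kv * spike_counts[t - i]
        out.append(acc)
    return out


class RLSReadout:
    """Linear readout y_t = w_t . x_t trained by recursive least squares."""

    def __init__(self, n_channels, alpha=1.0):
        n = n_channels
        self.w = [0.0] * n  # w_0 = 0, as in the text
        # Running estimate of the inverse correlation matrix, P_0 = I / alpha.
        self.P = [[(1.0 / alpha) if i == j else 0.0 for j in range(n)]
                  for i in range(n)]

    def output(self, x):
        return sum(wi * xi for wi, xi in zip(self.w, x))

    def update(self, x, target):
        """One learning step; returns (output, error) computed before the update."""
        y = self.output(x)
        err = y - target
        Px = [sum(p * xj for p, xj in zip(row, x)) for row in self.P]
        denom = 1.0 + sum(xi * pxi for xi, pxi in zip(x, Px))
        n = len(x)
        # P <- P - (P x)(P x)^T / (1 + x^T P x); P stays symmetric.
        for i in range(n):
            for j in range(n):
                self.P[i][j] -= Px[i] * Px[j] / denom
        # w <- w - err * P_new x, where P_new x equals P x / denom.
        for i in range(n):
            self.w[i] -= err * Px[i] / denom
        return y, err


# Usage: drive the readout toward a constant target with a fixed activity vector.
readout = RLSReadout(n_channels=3)
for _ in range(200):
    readout.update([1.0, 0.5, 0.2], target=1.0)
```

With a constant input, the readout converges toward the target at a rate set by `alpha`; in the experiment the activity vector changes at every step, so the update runs on live data instead.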

To feed back the output of FORCE learning to the cortical culture, we adopted optical stimulation with caged glutamate. Rubi-glutamate (Abcam, UK) was diluted to 100 μM in the culture medium. The compound was temporarily inactivated by a bound chemical group and was uncaged by illumination with blue light. When uncaged, glutamate activated neurons via glutamate receptors in the cells. The blue light was emitted from a 473 nm diode-pumped solid-state laser (Ciel, Laser Quantum, UK). The laser beam was collimated by a collimator (CFC-11X-A, THORLABS, USA), and the beam width was expanded by a factor of 3 using a beam expander (BE03M-A, THORLABS). The adjusted laser beam was incident onto a digital micro-mirror device (DMD; Discovery 1100, Texas Instruments, USA). When the DMD was “ON,” the beam was reflected toward two convex lenses (F2: focal length f_2 = 40 mm, F3: focal length f_3 = 60 mm, THORLABS) and finally into the optical system of an upright microscope (Eclipse FN1, Nikon, Japan). When the DMD was “OFF,” the beam was directed outside the optical system. The two lenses adjusted the point where the image formed. The convex lens inside the microscope was termed lens F1, and its estimated focal length was f_1 = 170 mm. The position of lens F3 was adjusted so that its focal plane coincided with the surface of the DMD. The position of lens F2 was adjusted according to the following constraint equation:

$$\frac{-d^{2} + f_1 d + f_1 f_2}{f_1 + f_2 - d} + 2 f_3 = L,$$

where L (= 226.3 mm) is the distance between lens F1 and the surface of the DMD and d is the distance between lens F1 and lens F2 [Fig. 1(c)]. Under these conditions, the image displayed by the DMD was projected onto the focal plane of the objective lens of the microscope, which was adjusted to the surface of the MEA. The DMD was used to restrict the illumination to the electrode area and to set the illumination duration. Because the plating of neurons was confined to the electrode area, almost all neurons were illuminated at once. The total number of spikes in the culture increased with the ON duration of the DMD.
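As a worked check of the lens constraint, the sketch below solves it for d. It assumes the thin-lens form (−d² + f_1 d + f_1 f_2)/(f_1 + f_2 − d) + 2 f_3 = L, in which the first term is the distance from lens F1 to the focal point of the F1–F2 pair; the function names are hypothetical. Multiplying through by (f_1 + f_2 − d) gives a quadratic in d, so two candidate positions exist (roughly 78.5 mm and 197.8 mm for the stated values), and the text does not say which one was used.

```python
import math


def lens_f2_positions(f1, f2, f3, L):
    """Solve (-d^2 + f1*d + f1*f2) / (f1 + f2 - d) + 2*f3 = L for d (all in mm).

    With B = L - 2*f3 the constraint becomes
    d^2 - (f1 + B)*d + (B*(f1 + f2) - f1*f2) = 0.
    """
    B = L - 2.0 * f3
    b = -(f1 + B)
    c = B * (f1 + f2) - f1 * f2
    disc = b * b - 4.0 * c
    if disc < 0:
        return []  # no real lens position satisfies the constraint
    r = math.sqrt(disc)
    return sorted([(-b - r) / 2.0, (-b + r) / 2.0])


def constraint_lhs(d, f1, f2, f3):
    """Left-hand side of the constraint, for checking a candidate d."""
    return (-d * d + f1 * d + f1 * f2) / (f1 + f2 - d) + 2.0 * f3


# Values quoted in the text: f1 = 170 mm, f2 = 40 mm, f3 = 60 mm, L = 226.3 mm.
roots = lens_f2_positions(170.0, 40.0, 60.0, 226.3)
for d in roots:
    print(round(d, 2), round(constraint_lhs(d, 170.0, 40.0, 60.0), 2))
```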

The computed FORCE output y_t modulated the illumination power. The output y_t was quantized to ten levels, and the time for which the DMD was ON was set according to the level. This adjustment of the feedback intensity was performed every 100 ms. In this experiment, the target signal s_t was a constant signal whose value was set arbitrarily.
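One way to picture the quantization step is the sketch below, which maps an output value to one of ten levels and then to a DMD ON time within a 100 ms feedback window. Only the ten levels and the 100 ms period come from the text; the output range (`y_min`, `y_max`), the linear mapping, and the convention that level 0 means no illumination and level 9 means the full window are assumptions made for illustration, and `dmd_on_time_ms` is a hypothetical name.

```python
def dmd_on_time_ms(y, y_min, y_max, n_levels=10, window_ms=100.0):
    """Quantize the FORCE output y and return (level, DMD ON time in ms).

    Assumptions: y is clipped to [y_min, y_max], levels are evenly spaced,
    and the ON time grows linearly from 0 ms (level 0) to window_ms (top level).
    """
    if y_max <= y_min:
        raise ValueError("y_max must be greater than y_min")
    frac = (y - y_min) / (y_max - y_min)
    frac = min(max(frac, 0.0), 1.0)                   # clip to the assumed range
    level = min(int(frac * n_levels), n_levels - 1)   # integer level 0 .. n_levels-1
    return level, window_ms * level / (n_levels - 1)


# Usage: one feedback update for an output in the middle of the assumed range.
level, on_ms = dmd_on_time_ms(0.55, y_min=0.0, y_max=1.0)
```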

We embodied the cortical culture with the proposed system and tested whether the robot could perform a goal-directed task in a maze. A mobile vehicle robot (E-puck, EPFL, Switzerland) was controlled according to the error between the output and the target signal. If the error was zero, i.e., the output signal matched the constant target signal, the left and right wheels of the robot rotated at the same speed, and the robot moved straight forward. If the error was positive, the right wheel rotated faster than the left wheel, turning the robot toward the left, and vice versa. Additionally, a feedback signal was sent from the robot to the cortical culture. If the robot collided with a wall or an obstacle, or if the goal was not within 90° in front of the robot along its moving direction, the culture was stimulated electrically using the MEA. The electrical stimulus was a biphasic voltage pulse consisting of a 200-mV, 500-μs pulse followed by a −500-mV, 200-μs pulse. Collision and direction deviation were checked in every loop of the FORCE learning program, and either condition triggered the electrical stimulus. Collisions were detected using eight infrared sensors attached to the robot. If at least one sensor detected an object within 2 cm, the robot was judged to have collided with the object. To monitor the location of the goal and the moving direction of the robot, a standard web camera placed above the experimental setup captured the entire field. The moving direction of the robot was tracked using the Kalman filter algorithm, and any deviation was recorded.
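The control rule above can be summarized in a short sketch. It is an illustration under assumptions, not the authors' controller: the proportional `gain`, the `base_speed`, and all function names are hypothetical, the infrared readings are assumed to be already converted to distances in centimeters, and the 90° criterion is exposed as a parameter because the wording admits more than one reading.

```python
def wheel_speeds(error, base_speed=1.0, gain=0.5):
    """Return (left, right) wheel speeds for a given FORCE output error.

    Zero error gives equal speeds (straight motion); a positive error makes
    the right wheel faster than the left, which turns the robot to the left.
    """
    delta = gain * error
    return base_speed - delta, base_speed + delta


def collided(ir_distances_cm, threshold_cm=2.0):
    """True if at least one infrared sensor reports an object within the threshold."""
    return any(d <= threshold_cm for d in ir_distances_cm)


def needs_electrical_stimulus(ir_distances_cm, goal_bearing_deg, max_bearing_deg=90.0):
    """True on a collision or when the goal lies outside the allowed bearing.

    goal_bearing_deg is the signed angle between the moving direction and the
    goal; max_bearing_deg is an assumed interpretation of the 90-degree criterion.
    """
    return collided(ir_distances_cm) or abs(goal_bearing_deg) > max_bearing_deg


# Usage: one control step with eight sensor readings and a goal slightly to the left.
left, right = wheel_speeds(error=0.2)
stimulate = needs_electrical_stimulus([10.0] * 8, goal_bearing_deg=30.0)
```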

The robot was placed on a square field, where obstacles hid the goal from the starting points. We tested whether the robot could reach this goal. Cortical cultures showed spiking activity under light-stimulated conditions [Fig. 2(a)]. We trained the FORCE learning system for more than 180 s so that the output followed the constant target signal and then fixed the weights. While the robot moved around the field, optical and electrical stimulations were seamlessly applied to the culture, and the FORCE output roughly followed the constant target signal with a slight deviation [Figs. 2(b) and 2(c)]. The robot successfully reached its goal in four different fields (Fig. 3 and supplementary material, Video S1). Because organized information about the environment was not provided through the sensory inputs, the cortical culture was not able to obtain any useful information from previous trials. Our experiments, therefore, demonstrated the potential to exhibit goal-directed behavior without any additional learning by the cortical culture.

In this study, we demonstrated PRC with FORCE learning in a living neuronal culture. Expanding beyond the general idea that rich dynamics in a neuronal culture could be considered as a computational resource for PRC,6–19 our experiments suggest that the homeostasis-like property of the embodied system brought out a latent task-solving ability. The internal feedback loop that produced a coherent output maintained the internal state of the living neuronal culture, and this homeostasis-like property was used to head for the goal in the maze. When the robot encountered obstacles and/or lost the direction to the goal in the maze, disturbance stimuli broke the homeostatic balance and triggered exploratory behaviors. Thus, the maze-solving ability emerged from both generating the homeostasis-like property and breaking the homeostatic balance.

Because w_t was initialized in every trial as w_0 = 0 in our experiments, the robot remained still without FORCE learning. The FORCE learning algorithm found weights that canceled the temporal periodic fluctuation of each neuron. Therefore, when a random weight vector was used instead, the robot showed cyclic behaviors due to the periodic fluctuation, which did not lead to maze-solving ability.

Previous studies have confirmed that a living neuronal culture can be equipped with separation18 and fading-memory16,17 properties, both of which are key requirements for reservoir computing.4,5,33,34 These properties enable PRC by a neuronal culture, which offers the ability to classify spatiotemporal input patterns only through a linear readout of the neural activities. Given that these properties are subject to change, possibly through the homeostatic property and Hebbian learning, the present experimental setup of a living neuronal system for PRC can offer insights into how to exploit computational resources in the brain in a given task.

Previous theoretical studies demonstrated that FORCE learning can change chaotic spontaneous activity into a wide variety of desired patterns.20,21 FORCE learning in the present study was relatively successful when the target was a constant signal but was not very successful when the target contained various frequencies. FORCE learning in a spiking network is still challenging because the rich dynamics of the subthreshold activities of neurons remain inaccessible as computational resources.35,36 Furthermore, population bursts typically observed in a neuronal culture extinguish past memory in the network activity and, thus, substantially deteriorate the performance of PRC.16,17 PRC with a living neuronal system may benefit from burst suppression with spatio-temporally randomized stimulations.37

In contrast to the present study, which assumed no learning (i.e., no plasticity) in the network during a task, previous pioneering experiments on robotic embodiments of a living neuronal network exploited Hebbian plasticity within networks to optimize sensory-motor coupling for a given task.22–28 These contrasting strategies in embodiment experiments have confirmed that both the homeostatic property and Hebbian learning play substantial roles in task solving in the brain.29,30 Their synergistic effect should be considered in future embodiment experiments and in the theory of brain-inspired PRC.

In summary, we used a living neuronal culture as PRC and implemented FORCE learning to produce a coherent signal output from a spontaneously active reservoir. The output signal served as a homeostasis-like property, enabling the embodied robot to solve a maze. Our results suggest that the homeostatic property generated from the internal feedback loop in the system plays an important role in employing computational resources for task solving in biological systems.

See the supplementary material for embodiment experiments of the living cortical culture in a maze-solving task (Video S1).

This work was partly supported by JSPS KAKENHI (No. 20H04252), AMED (No. JP21dm0307009), the Naito Foundation, and the Asahi Glass Foundation. The results of the experiments were partly obtained from projects commissioned by NEDO (No. 18101806-0).

The experimental protocol was approved by the Committee on the Ethics of Animal Experiments of the Research Center for Advanced Science and Technology, University of Tokyo (Permit No.: RAC130106).

Y.Y. and S.Y. contributed equally to this work.

The data that support the findings of this study are available from the corresponding author upon reasonable request.

1. K. Caluwaerts, J. Despraz, A. Iscen et al., J. R. Soc. Interface 11, 20140520 (2014).
2. L. F. Seoane, Philos. Trans. R. Soc. B 374, 20180377 (2019).
3. K. Nakajima, T. Li, H. Hauser et al., J. R. Soc. Interface 11, 20140437 (2014).
4. G. Tanaka, T. Yamane, J. B. Héroux et al., Neural Networks 115, 100–123 (2019).
5. K. Nakajima, Jpn. J. Appl. Phys., Part 1 59, 060501 (2020).
6. Y. Yada, T. Mita, A. Sanada et al., Neuroscience 343, 55–65 (2017).
7. Y. Yada, R. Kanzaki, and H. Takahashi, Front. Syst. Neurosci. 10, 28 (2016).
8. D. V. Buonomano and W. Maass, Nat. Rev. Neurosci. 10, 113–125 (2009).
9. D. Nikolic, S. Hausler, W. Singer et al., PLoS Biol. 7, e1000260 (2009).
10. V. Mante, D. Sussillo, K. V. Shenoy et al., Nature 503, 78–84 (2013).
11. A. Goel and D. V. Buonomano, Neuron 91, 320–327 (2016).
12. P. Enel, E. Procyk, R. Quilodran et al., PLoS Comput. Biol. 12, e1004967 (2016).
13. E. D. Remington, D. Narain, E. A. Hosseini et al., Neuron 98, 1005–1019.e5 (2018).
14. S. Tajima, T. Mita, D. J. Bakkum et al., Proc. Natl. Acad. Sci. 114, 9517 (2017).
15. S. Hafizovic, F. Heer, T. Ugniwenko et al., J. Neurosci. Methods 164, 93–106 (2007).
16. M. R. Dranias, H. Ju, E. Rajaram et al., J. Neurosci. 33, 1940–1953 (2013).
17. H. Ju, M. R. Dranias, G. Banumurthy et al., J. Neurosci. 35, 4040–4051 (2015).
18. K. P. Dockendorf, I. Park, P. He et al., Biosystems 95, 90–97 (2009).
19. T. Gürel, S. Rotter, and U. Egert, J. Comput. Neurosci. 29, 279–299 (2010).
20. D. Sussillo and L. F. Abbott, Neuron 63, 544–557 (2009).
21. R. Laje and D. V. Buonomano, Nat. Neurosci. 16, 925–933 (2013).
22. T. B. DeMarse, D. A. Wagenaar, A. W. Blau et al., Auton. Robots 11, 305–310 (2001).
23. A. Novellino, P. D'Angelo, L. Cozzi et al., Comput. Intell. Neurosci. 2007, 012725.
24. Z. C. Chao, D. J. Bakkum, and S. M. Potter, J. Neural Eng. 4, 294–308 (2007).
25. D. J. Bakkum, Z. C. Chao, and S. M. Potter, J. Neural Eng. 5, 310–323 (2008).
26. K. Warwick, D. Xydas, S. J. Nasuto et al., Def. Sci. J. 60, 5–14 (2010).
27. J. Tessadori, M. Bisio, S. Martinoia et al., Front. Neural Circuits 6, 99 (2012).
28. A. Masumori, L. Sinapayen, N. Maruyama et al., Artif. Life 26, 130–151 (2020).
29. E. Marder and J.-M. Goaillard, Nat. Rev. Neurosci. 7, 563–574 (2006).
30. A. X. Yee, Y.-T. Hsu, and L. Chen, Philos. Trans. R. Soc. B 372, 20160155 (2017).
31. H. Takahashi, T. Sakurai, H. Sakai et al., Biosystems 107, 106–112 (2012).
32. D. Wagenaar, T. B. DeMarse, and S. M. Potter, in Conference Proceedings 2nd International IEEE EMBS Conference on Neural Engineering (IEEE, 2005), pp. 518–521.
33. W. Maass, T. Natschlager, and H. Markram, Neural Comput. 14, 2531–2560 (2002).
34. H. Jaeger and H. Haas, Science 304, 78–80 (2004).
35. L. F. Abbott, B. DePasquale, and R. M. Memmesheimer, Nat. Neurosci. 19, 350–355 (2016).
36. W. Nicola and C. Clopath, Nat. Commun. 8, 2208 (2017).
37. J. F. Ferron, D. Kroeger, O. Chever et al., J. Neurosci. 29, 9850–9860 (2009).
