While it is challenging for a traditional propulsor to achieve both a wide range of force-profile manipulation and high propulsion efficiency, nature provides a solution in the flapping foil, such as those found in birds and turtles. In this paper, we introduce a deep reinforcement learning (DRL) algorithm, well suited to nonlinear systems, that achieves self-learning posture adjustment for a flapping foil to effectively improve its thrust performance. A brute-force search is first carried out to provide intuition about the optimal trajectories of the foil and to build a database for the subsequent case studies. We then implement an episodic training strategy for the intelligent agent using the DRL algorithm. To address the slow data generation of computational fluid dynamics simulations, we introduce a multi-environment technique that accelerates data exchange between the environments and the agent. The method adaptively and automatically performs optimal foil path planning to generate maximum thrust under various scenarios and can even outperform the optimal cases designed by users. Numerical results demonstrate that the proposed DRL framework is a powerful optimization tool and has great potential to solve more complex problems in fluid mechanics beyond human predictability.
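As a concrete illustration of the multi-environment technique mentioned above, the following Python sketch runs several simulation environments in parallel worker processes so that the agent collects transitions from all of them at each step. This is a minimal sketch under stated assumptions, not the authors' implementation: the FoilEnv class and all names here are hypothetical stand-ins for the CFD flapping-foil solver and the trained policy, and the random actions merely keep the example self-contained.

# Minimal sketch of the multi-environment idea: several (stub) flow
# environments step in parallel worker processes and stream transitions
# back to a single learner. FoilEnv and all names below are hypothetical
# placeholders, not the authors' code.
import multiprocessing as mp
import numpy as np

OBS_DIM, ACTION_DIM, STEPS_PER_EPISODE = 8, 1, 50

class FoilEnv:
    """Stand-in for a CFD flapping-foil simulation (hypothetical)."""
    def reset(self):
        return np.zeros(OBS_DIM)

    def step(self, action):
        obs = np.random.randn(OBS_DIM)         # surrogate flow state
        reward = float(-np.abs(action).sum())  # surrogate thrust reward
        return obs, reward

def worker(conn):
    # Each worker owns one environment and exchanges (action, transition)
    # messages with the learner over a pipe.
    env = FoilEnv()
    obs = env.reset()
    while True:
        action = conn.recv()
        if action is None:  # shutdown signal
            break
        next_obs, reward = env.step(action)
        conn.send((obs, action, reward, next_obs))
        obs = next_obs

if __name__ == "__main__":
    n_envs = 4
    pipes, procs = [], []
    for _ in range(n_envs):
        parent, child = mp.Pipe()
        p = mp.Process(target=worker, args=(child,))
        p.start()
        pipes.append(parent)
        procs.append(p)

    replay_buffer = []
    for _ in range(STEPS_PER_EPISODE):
        # A trained policy network would map observations to actions here;
        # random actions keep the sketch self-contained.
        for pipe in pipes:
            pipe.send(np.random.uniform(-1.0, 1.0, ACTION_DIM))
        for pipe in pipes:
            replay_buffer.append(pipe.recv())  # n_envs transitions per step

    for pipe in pipes:
        pipe.send(None)
    for p in procs:
        p.join()
    print(f"collected {len(replay_buffer)} transitions from {n_envs} envs")

In a real setup, each worker would wrap a full CFD simulation and the collected transitions would feed an off-policy learner such as an actor-critic method, so data generation scales roughly with the number of parallel environments.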
