We propose an open-source Python platform for applications of deep reinforcement learning (DRL) in fluid mechanics. DRL has been widely used in optimizing decision-making in nonlinear and high-dimensional problems. Here, an agent maximizes a cumulative reward by learning a feedback policy through interactions with an environment. In control theory terms, the cumulative reward would correspond to the cost function, the agent to the actuator, the environment to the measured signals, and the learned policy to the feedback law. Thus, DRL assumes an interactive environment or, equivalently, a control plant. The setup of a numerical simulation plant with DRL is challenging and time-consuming. In this work, a novel Python platform, DRLinFluids, is developed for this purpose, applying DRL to flow control and optimization problems in fluid mechanics. The simulations employ OpenFOAM, a popular and flexible Navier–Stokes solver in industry and academia, together with Tensorforce or Tianshou as widely used, versatile DRL packages. The reliability and efficiency of DRLinFluids are demonstrated for two wake stabilization benchmark problems. DRLinFluids significantly reduces the effort of applying DRL in fluid mechanics, and it is expected to greatly accelerate academic and industrial applications.
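To make the agent/environment correspondence above concrete, the sketch below shows how a Gym-style Python environment could wrap a transient OpenFOAM run so that a DRL library can interact with it. This is an illustrative assumption, not the DRLinFluids API: the `cylinder2D` case directory, the `reset_case.sh` script, the `pimpleFoam` solver call, the probe parsing, and the drag-based reward are all hypothetical placeholders.

```python
# Minimal sketch of a Gym-style environment wrapping an OpenFOAM case.
# Hypothetical example: the case layout, reset script, solver call, probe
# parsing, and reward are assumptions for illustration, not the DRLinFluids API.
import subprocess

import gym
import numpy as np
from gym import spaces


class OpenFOAMJetEnv(gym.Env):
    """One episode = one transient OpenFOAM run advanced in control intervals.
    The action sets a synthetic-jet intensity, the observation is a vector of
    pressure probes, and the reward penalizes drag (wake stabilization)."""

    def __init__(self, case_dir="cylinder2D", n_probes=64, dt_action=0.025,
                 episode_time=2.0):
        super().__init__()
        self.case_dir = case_dir
        self.dt_action = dt_action          # time between two control updates
        self.episode_time = episode_time    # assumed fixed episode length
        self.time = 0.0
        self.action_space = spaces.Box(-1.0, 1.0, shape=(1,), dtype=np.float32)
        self.observation_space = spaces.Box(-np.inf, np.inf,
                                            shape=(n_probes,), dtype=np.float32)

    def reset(self):
        # Restore the baseline (uncontrolled) flow field before each episode.
        self.time = 0.0
        subprocess.run(["./reset_case.sh"], cwd=self.case_dir, check=True)
        return self._read_probes()

    def step(self, action):
        # 1) Translate the agent's action into a jet boundary condition.
        self._write_jet_bc(float(action[0]))
        # 2) Advance the Navier-Stokes solver over one control interval.
        subprocess.run(["pimpleFoam"], cwd=self.case_dir, check=True)
        self.time += self.dt_action
        # 3) Observe the probes and turn force coefficients into a reward.
        obs = self._read_probes()
        cd, cl = self._read_force_coeffs()
        reward = -cd - 0.2 * abs(cl)        # drag reduction with a lift penalty
        done = self.time >= self.episode_time
        return obs, reward, done, {}

    # --- case-specific I/O helpers (placeholders in this sketch) ------------
    def _write_jet_bc(self, jet_value):
        # In practice: rewrite the jet patch entry in 0/U (or a controlDict
        # parameter) so the next solver interval uses the new actuation.
        pass

    def _read_probes(self):
        # In practice: parse postProcessing/probes/<time>/p into an array.
        return np.zeros(self.observation_space.shape, dtype=np.float32)

    def _read_force_coeffs(self):
        # In practice: parse the forceCoeffs function-object output.
        return 1.0, 0.0
```

A wrapper of this kind can be handed to any Gym-compatible DRL library; in the platform described above, the training side is provided by Tensorforce or Tianshou.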
