Computational resources have grown exponentially in the past few decades. These machines make possible research and design in fields as diverse as medicine, astronomy, and engineering. Despite ever-increasing computational capabilities, direct simulation of complex systems has remained challenging owing to the large number of degrees of freedom involved. At the cusp of exascale computing, high-resolution simulation of practical problems with minimal model assumptions may soon experience a renaissance. However, growing reliance on modern computers comes at the cost of a growing carbon footprint. To illustrate this, we examine historic computations in fluid dynamics where larger computers have afforded the opportunity to simulate flows at increasingly relevant Reynolds numbers. Under a variety of flow configurations, the carbon footprint of such simulations is found to scale roughly with the fourth power of Reynolds number. This is primarily explained by the computational cost in core-hours, which follows a similar scaling, though regional differences in renewable energy use also play a role. Using the established correlation, we examine a large database of simulations to develop estimates for the carbon footprint of computational fluid dynamics in a given year. Collectively, the analysis provides an additional benchmark for new computations where, in addition to balancing considerations of model fidelity, carbon footprint should also be considered.

## I. INTRODUCTION

It is remarkable that a wallet-sized “rectangle”^{31} ensconced in a pocket has more computing power than the 70 lb machine that aided orbital calculations for the Apollo missions^{33} more than 50 years prior.^{32} Indeed, computing power has grown enormously over the preceding decades. In 1989, the fastest calculation was performed on a Cray 416 at 400 megaflops.^{19} Now, only three decades later, the current fastest computer, Frontier, boasts a theoretical peak performance of 1 exaflops,^{20} a billion-fold improvement over the Cray. These advances in computer hardware have greatly benefited the scientific community, which relies on these machines to perform large numerical experiments, including simulations of the heart,^{21} wind turbines,^{22} black holes,^{23} and worldwide climate.^{24} The increase in computational power has meant not only that more complicated problems can be simulated (for example, a full aircraft^{17} rather than a section of a wing^{18}) but also that the resolution, and hence the accuracy with which these models predict relevant physical quantities, has grown. Large-scale computation has now become the standard for basic science research. A particular modality, computational fluid dynamics (CFD), has been at the forefront of large-scale computation for as long as computers have existed, dating back to the Manhattan Project. In addition to academic research, CFD is regularly used in the design of car bodies to reduce drag,^{25} aircraft engine combustors to improve efficiency,^{26} and fluidized bed reactors used in chemical processing facilities.^{27} Although this wealth of computational resources enables contributions to both cutting-edge science and societal economic needs, there is a carbon footprint cost resulting from the electricity required to power these machines.
Though agreements such as the Paris climate accord have renewed the world's aspiration to decarbonize its energy sources, the majority of the world's energy use (from all sources) and 75% of electricity still come from nonrenewable sources.^{28} Some estimate of the carbon footprint associated with the increasing use of large-scale machines to conduct science therefore appears warranted. As a case study, we focus on large-scale computational fluid dynamics (CFD) simulations spanning nearly the past four decades. We pursue two analyses in estimating CFD's carbon footprint. First, we consider single-phase turbulent flows in canonical settings, including isotropic, channel, Couette, boundary layer, duct, and pipe flow, and develop a correlation for simulation carbon footprint as a function of the simulation Reynolds number. In the second part of this work, we consider a larger database of turbulent flow simulations conducted in 2022 to establish confidence intervals on the Reynolds numbers modelers are most likely to study, as well as the expected carbon footprint of a direct simulation of turbulence in 2022. Other statistics of this database are presented, showing that while the individual carbon footprint of many CFD simulations can be relatively small, their sum is significant. The goal of this work is to cast light on simulation carbon footprints, which we feel should be considered when designing new CFD studies.

## II. ESTABLISHMENT OF A CARBON FOOTPRINT CORRELATION FOR CFD

### A. Database and methods

We consider direct numerical simulations (DNSs) of turbulent flows where all length and timescales of the problem are resolved. A summary of these simulations is presented in Table I. We focus initially on single-phase incompressible turbulent flows in canonical settings, which have well-characterized resolution requirements.^{34} More complicated problems including flows with heat transfer, particles, and rough walls are considered in Sec. II B. In procuring this database, we only consider simulations where the number of cores and hours and/or core-hours were reported, along with information on the region (nation/state) where the simulations were conducted. Although this strict criterion greatly limits the number of historical simulations analyzed, this eliminates the use of theoretical estimates of carbon scaling. Reliance on well-characterized simulation data is advantageous over theoretical scaling since the present predictions are based on the actual number of processors and total simulation time, which, in practice, balance a variety of factors including the available resources and differences in averaging windows for computing flow statistics.

**TABLE I.** Summary of the direct numerical simulation database used to develop the carbon footprint correlation.

| Reference | Flow | Re | Hours | Core type | Cores | Model | Memory (GB) | Region | kWh | Mass (kg) |
|---|---|---|---|---|---|---|---|---|---|---|
| 1. Kerr^{1} | Isotropic | 48.2 | 6 | CPU | 1 | Cray 1-S | 0.008 39 | California | 0.120 27 | 0.026 03 |
| 2. Gotoh^{2} | Isotropic | 381 | 500 | CPU | 32 | Fujitsu 5500/56 | 1* | Japan | 320.95 | 149.5 |
| 3. Bech^{3} | Couette | 82.2 | 154.07 | CPU | 1 | Cray 1-YP | 0.344 | Norway | 3.12 | 0.023 78 |
| 4. Unger^{4,5} | Pipe | 180 | 38 | CPU | 4 | Cray Y-MP4/464 | 0.0594 | Germany | 3.05 | 1.03 |
| 5. Eggels^{4} | Pipe | 180 | 40 | CPU | 4 | Cray Y-MP4/464 | 0.0792 | Netherlands | 3.21 | 1.2 |
| 6. Mansour^{34} | Channel | 595 | 185 | CPU | 64 | IBM SP2 | 4.096 | California | 237.74 | 51.46 |
| 7. Kim^{6} | Channel | 180 | 62.5 | CPU | 4 | Cray-XMP | 0.056 | California | 5.01 | 1.08 |
| 8. Jiménez^{7} | Channel | 2003 | 2929.69 | CPU | 2048 | PowerPC 970FX | 4096 | Spain | $1.28 \times 10^5$ | $2.18 \times 10^4$ |
| 9. Silero^{8} | Boundary layer | 2000 | 1373.29 | CPU | 32 768 | PowerPC 450 | 32 768 | Illinois | $9.3 \times 10^5$ | $2.47 \times 10^5$ |
| 10. Lee^{9,10} | Channel | 5186 | 381.47 | CPU | 524 288 | PowerPC A2 | 524 288 | Illinois | $4.13 \times 10^6$ | $1.1 \times 10^6$ |
| 11. Alfonsi^{11} | Channel | 200 | 51.39 | CPU/GPU | 6 CPU/1 GPU | Xeon X5660, Nvidia C-1060 | 28 | Italy | 25.17 | 8.15 |
| 12. Alfonsi^{11} | Channel | 400 | 237.5 | CPU/GPU | 6 CPU/1 GPU | Xeon X5660, Nvidia C-1060 | 28 | Italy | 116.3 | 37.66 |
| 13. Alfonsi^{11} | Channel | 600 | 461.11 | CPU/GPU | 18 CPU/3 GPU | Xeon X5660, Nvidia C-1060 | 84 | Italy | 677.41 | 219.37 |
| 14. Zhang^{12} | Duct | 1200 | 2040.82 | CPU | 392 | Xeon E5-2670 | 784 | Spain | $2.02 \times 10^4$ | 3460.579 |
| 15. Gavrilakis^{13} | Duct | 150 | 180.36 | CPU | 4 | Cray 2 | 0.262 | France | 14.49 | 0.7429 |
| 16. Yeung^{14,15} | Isotropic | 1300 | 2989 | CPU/GPU | 135 168 CPU/18 432 GPU | IBM Power 9, Nvidia V-100 | 1 867 776 | Tennessee | $3.92 \times 10^7$ | $1.07 \times 10^7$ |
| 17. Vela-Martín^{16} | Channel | 2000 | 507.81 | GPU | 128 | Tesla P-100 | 2048 | Switzerland | $2.78 \times 10^4$ | 320.08 |
| 18. Vela-Martín^{16} | Channel | 5303 | 2734.38 | GPU | 512 | Tesla P-100 | 8192 | Switzerland | $5.98 \times 10^5$ | 6894 |


The carbon footprints in Table I were estimated with the Green Algorithms calculator^{29} based on the work of Lannelongue *et al.*^{30} The calculator estimates the carbon footprint from the following formula:

$$\mathrm{CF} = t \times \left( n_c P_c u_c + n_m P_m \right) \times \mathrm{PUE} \times \mathrm{CI},$$

where $t$ is the runtime in hours, $n_c$ is the number of cores and $P_c$ the power draw per core, $u_c$ is the usage factor, $n_m$ is the memory in GB and $P_m$ the power draw per GB of memory, PUE is the power usage effectiveness of the computing facility, and CI is the carbon intensity of the regional electricity supply.

The carbon footprint of a calculation depends primarily on how many computational hours were needed, the power requirement of the processors, and the carbon intensity of the power source. The last quantity depends critically on the fuel source for the electricity: Renewable sources have low carbon intensity, and fossil fuels have high carbon intensity. The calculator estimates the carbon intensity of a calculation by accounting for the region in which the calculation is done. For example, a hypothetical simulation run in California on the same computing resources as in France has different estimated carbon footprints because the energy budget of renewable vs nonrenewable fuels is different in these regions, and, therefore, the carbon intensity associated with electricity production needed for the simulations is also different.
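As an illustration, this style of energy and carbon accounting can be sketched in a few lines of Python. The default values below (PUE, per-GB memory power, carbon intensity) are illustrative assumptions for the sketch, not the calculator's exact database entries.

```python
def carbon_footprint_kg(runtime_h, n_cores, core_power_w, mem_gb,
                        usage=1.0, pue=1.67, mem_w_per_gb=0.3725,
                        ci_g_per_kwh=300.0):
    """Sketch of a Green Algorithms-style estimate (illustrative defaults).

    Energy = time x (core draw + memory draw) x facility overhead (PUE);
    carbon = energy x regional carbon intensity of electricity.
    """
    draw_w = n_cores * core_power_w * usage + mem_gb * mem_w_per_gb
    energy_kwh = runtime_h * draw_w * pue / 1000.0
    return energy_kwh * ci_g_per_kwh / 1000.0

# Hypothetical run: 100 h on 32 cores at 12 W/core with 64 GB of memory.
print(round(carbon_footprint_kg(100, 32, 12, 64), 1))  # -> 20.4 kg CO2e
```

The linear dependence on carbon intensity in the last line is what makes the regional comparison (e.g., California vs France) in the text a straightforward rescaling.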

We consider the flow Reynolds number $Re = UL/\nu$ of the various simulations as the principal variable for our study. Here, *U* and *L* are, respectively, reference flow velocity and length scales, and *ν* is the kinematic viscosity. The Reynolds number characterizes the relative importance of inertial and viscous forces and is related to the range of flow scales in the problem. Higher Reynolds number flows contain more features to resolve; hence, the larger the Reynolds number, the larger the simulation (mesh points/time steps) and the more machine-hours required to simulate that flow. Consequently, a carbon footprint correlation based on *Re* abstracts away many factors used in simulation design, providing a simple, readily available metric for estimating a simulation's carbon footprint. In this study, we use two definitions of the Reynolds number: for wall-bounded flows, the Reynolds number is based on the friction velocity, while for isotropic turbulence, it is based on the Taylor scale.^{34} These separate definitions facilitate comparison among disparate flow types where a single common definition of the Reynolds number is not available. The relevant Reynolds numbers for the respective flow types were reported in each study and are reproduced in Table I. The database spans two decades in Reynolds number across six canonical turbulent flows. The computations were conducted on CPUs, GPUs, and hybrid architectures in 11 regions across the U.S. and Europe. Figure 1 shows the growth in Reynolds number over the years for the selected simulations. The exponential change in Reynolds number over this time underscores the growing power of modern computing and the necessity of developing carbon footprint estimates for these simulations.

### B. Limitations of methodology

There are limitations in the methodology used to estimate the carbon footprints we report. The Green Algorithms GUI does not include all of the processor types listed in Table I, so most cases use a typical power per core via the “any”/“other” processor type. This assumption is more accurate for recently conducted simulations (say, those conducted from 2010 onward), whose processor type and power use align most closely with the typical modern processor provided in the GUI. Conditioning our analysis on only these simulations still shows a strong scaling with Reynolds number, though the correlation coefficient is reduced; therefore, we conduct our analysis on all of the simulations in Table I with this acknowledged limitation. The power estimates in this work are based on core thermal design power (TDP) specifications,^{30} which may not reflect the actual power consumed during a calculation.^{47} Code optimizations can be employed to improve energy use on a machine.^{46} Careful power measurements of different codes run on different machines confirm that TDP values are not, in general, equal to the measured power when running an application, though the difference was usually less than a factor of two across the cases considered.^{47} We also do not consider the additional power required to house and cool the computing facilities, and we assume that the processors are running for the total compute time (there is no idle time). Usage factors have been considered in other works,^{30} and accounting for them typically does not change the estimated power use by an order of magnitude. Most cases in Table I reported either the number of cores or core-hours, but not both. The simulation time was then inferred from the reported core count and core-hour values and/or by multiplying the number of time steps by the time per time step.
Memory use was not typically reported but usually could be inferred reasonably from specification sheets of the individual processors or from cluster information pages. Typically, memory accounted for only a few percent of total energy use. Finally, we are not able to consider local differences in renewables' contribution to electricity use, for example, if a computing cluster draws its power directly from onsite photovoltaic cells. All of these assumptions affect the precise magnitude of energy consumption without affecting the order of magnitude, and they still allow use of the Green Algorithms calculator to predict a scaling behavior for energy usage/carbon production vs flow Reynolds number. It should be noted that the carbon intensities of electricity used in this work are contemporary regional estimates. In other words, the reported carbon footprints are our estimates of the cost of running these simulations today, *not* estimates of how much carbon these simulations produced at the time they were run.

### C. The correlation

Here, the Reynolds number is defined based on the Taylor scale in isotropic flows and on the friction velocity in wall-bounded settings. The near-fourth-power scaling of carbon footprint with Reynolds number [Eq. (2)] is consistent with the scaling of core-hours, which grows with nearly the same exponent [Fig. 3(c)]. Note also the wide distribution in wall-clock times, ranging from hours to months depending on the Reynolds number [Fig. 3(b)].
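The form of such a correlation can be recovered by a least-squares power-law fit in log-log space. The sketch below uses a handful of (Re, kg CO₂e) pairs taken from Table I purely for illustration; the exponent it recovers is close to, but not identical to, the full-database value, since only a subset of points is used.

```python
import math

# Illustrative (Reynolds number, carbon footprint in kg) pairs from Table I.
data = [(48.2, 0.02603), (180, 1.08), (595, 51.46),
        (2003, 2.18e4), (5186, 1.1e6)]

# Fit footprint ~ a * Re**b, i.e., ln(footprint) = ln(a) + b*ln(Re),
# via closed-form simple linear regression on the log-transformed data.
xs = [math.log(re) for re, _ in data]
ys = [math.log(kg) for _, kg in data]
xbar, ybar = sum(xs) / len(xs), sum(ys) / len(ys)
b = (sum((x - xbar) * (y - ybar) for x, y in zip(xs, ys))
     / sum((x - xbar) ** 2 for x in xs))
print(round(b, 2))  # prints an exponent near 3.8, close to the reported ~3.73
```

The regional mix of these points (California, Spain, Illinois) adds scatter in the intercept, which is why the text treats carbon intensity separately from the Reynolds number scaling.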

Interestingly, the core-hour scaling, $Re_{\tau,\lambda}^{3.5}$ (with subscripts included to emphasize the consistent Reynolds numbers, $Re_\lambda$ or $Re_\tau$), is less than the theoretical prediction of $Re_\lambda^{6}$.^{34,37} Pope [Ref. 34, p. 350] has shown that this theoretical scaling, valid for high Reynolds number isotropic turbulence, can be relaxed to $Re_\lambda^{5.5}$ using a time step that scales with the Kolmogorov time. The discrepancy between the observed correlation and these theoretical scalings may arise from modeler freedom in balancing resolution of large and small scales in turbulent flows, differences in choices of CFL number, the length of simulation used for time averaging, finite *Re* not achieving the asymptotic turbulence scalings, and machine architecture and clock speeds. Numerical modeling choices such as large-scale forcing,^{38} which can increase the Reynolds number for a given domain size in isotropic turbulence, and non-uniform grids in wall-bounded flows^{6} also affect scaling. The computational requirements of wall-bounded flows, particularly turbulent boundary layers, have also been considered by Choi and Moin^{48} and Yang and Griffin,^{49} hereafter CM and YG. For direct simulation of a turbulent boundary layer, these works, respectively, estimated the computational resources (number of grid points times the number of time steps) as $NN_t \sim Re_L^{3.52}$ (CM) and $NN_t \sim Re_L^{2.91}$ (YG), where $Re_L$ is the Reynolds number based on the streamwise distance. These expressions can be cast in terms of the friction Reynolds number [the consistent Reynolds number for Eq. (2)] using $\delta/L \sim Re_L^{-1/7}$,^{48,49} yielding $Re_\delta = Re_L^{6/7}$, which along with $Re_\tau \sim Re_\delta^{0.88}$ (Pope,^{34} pp. 279 and 312) gives the respective estimates $NN_t \sim Re_\tau^{4.67}$ (CM) and $NN_t \sim Re_\tau^{3.86}$ (YG). These theoretical estimates are in better agreement with the scalings observed for carbon footprint, $\sim Re_{\tau,\lambda}^{3.73}$ [Eq. (2)], and core-hours, $\sim Re_{\tau,\lambda}^{3.5}$.
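The change of variables from $Re_L$ to $Re_\tau$ above is a composition of power laws and can be checked numerically. The exponents are taken directly from the text; the helper function name is illustrative.

```python
# Convert a scaling exponent in Re_L to one in Re_tau using
# Re_delta = Re_L**(6/7) and Re_tau ~ Re_delta**0.88, so that
# Re_tau ~ Re_L**((6/7)*0.88) and exponents divide by that product.
def to_friction_exponent(re_l_exponent):
    return re_l_exponent / ((6.0 / 7.0) * 0.88)

print(round(to_friction_exponent(3.52), 2))  # Choi & Moin:    4.67
print(round(to_friction_exponent(2.91), 2))  # Yang & Griffin: 3.86
```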

Although the observed core-hour and carbon footprint scalings still grow rapidly with Reynolds number, this analysis suggests that DNS may be more affordable than theoretical scalings would indicate. These considerations underscore the importance of using fielded simulations, rather than theoretical scalings, to estimate the carbon footprint of this modality. Using the estimate provided by Eq. (2) as a starting point, increased documentation of the computational resources used in fluid dynamics simulations can better establish the true fielded cost of DNS and corroborate theoretical scalings.

The carbon footprint of a simulation depends greatly on where it is run, since the percentage of renewable energy that goes toward electricity generation varies greatly by region. For most of the cases considered, on a core-hour basis and at the same Reynolds number, we see around a factor of 5 difference in carbon footprint between locales. For example, the simulations of Hoyas and Jiménez,^{7} Silero *et al.*,^{8} and Vela-Martín^{16} all considered wall-bounded turbulent flow at $Re \approx 2000$, yet the carbon footprints of these respective simulations differed by three orders of magnitude. For these three simulations, the carbon footprint per core-hour differed by less than a factor of 2. Therefore, the difference in the renewable energy percentage that goes toward electricity generation in the respective regions (Spain, Illinois, and Switzerland) plays a significant role in controlling the carbon footprint of “nominally equivalent” simulations. This suggests that, all else equal, modelers should submit proposals to and target machines whose electricity relies on greater fractions of renewable energy. Though there is considerable scatter, much of the data are consistent with a cost of $\sim 4.2$ g CO$_2$e per core-hour [dashed line in Fig. 3(d)], which is about an American teaspoon (tsp) of sugar by mass. Another observation is that a factor of two increase in Reynolds number corresponds to roughly a factor of 13 increase in carbon ($2^{3.73} \approx 13$).
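Both effects discussed here, regional carbon intensity and Reynolds number scaling, are simple to quantify. The carbon intensity figures below are rough illustrative values for hypothetical grids, not the exact regional values used in the calculator.

```python
# A ~Re^3.73 scaling means doubling the Reynolds number multiplies the
# footprint by about 2**3.73, i.e., roughly 13.
doubling_factor = 2 ** 3.73

# For a fixed energy use, footprint scales linearly with regional carbon
# intensity (illustrative g CO2e/kWh values for hypothetical grids).
energy_kwh = 1.0e5
ci = {"hydro-heavy grid": 30.0, "mixed grid": 300.0, "coal-heavy grid": 700.0}
footprints_kg = {name: energy_kwh * g / 1000.0 for name, g in ci.items()}

print(round(doubling_factor, 1))  # about 13
print(footprints_kg["coal-heavy grid"] / footprints_kg["hydro-heavy grid"])
```

The second ratio shows why "nominally equivalent" simulations can differ so strongly: grid choice alone can swing the footprint by more than an order of magnitude.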

## III. STATISTICS ON CARBON FOOTPRINT IN 2022

Having established a Reynolds number dependent carbon footprint correlation for direct simulations of turbulence, we turn our attention toward the questions: (1) What is a typical Reynolds number simulated? (2) What is the typical carbon footprint of a direct simulation? and (3) For a large database of simulations, what is the total footprint?

### A. Procurement of the database

Google Scholar searches were conducted on July 6, 2023, and September 15, 2023, to develop a large database of turbulent flow direct numerical simulations (see database II in the supplementary material). We considered articles exclusively from 2022. In an effort to incorporate simulations whose run criteria best matched the assumptions used to develop the carbon footprint correlation, we searched exclusively for direct numerical simulations with the search criteria “intitle: direct AND simulation AND turbulence” as well as “intext: direct AND simulation AND turbulence”. These searches yielded tens of thousands of hits. The task was then to determine the Reynolds number(s) of the simulations contained within the papers selected from this list. Since the correlation of Sec. II was developed for isotropic and wall-bounded flows whose characteristic Reynolds numbers are based on the Taylor microscale and friction velocity, respectively, it was important to ensure the reported Reynolds numbers in these works were consistent with these definitions. A number of other constraints were imposed, including exclusion of papers that mentioned direct simulation but were not in fact on that topic, studies using large eddy simulation (LES) or Reynolds-averaged Navier–Stokes (RANS), experimental works on turbulence, and otherwise relevant papers where the appropriate Reynolds number was not reported. A few papers were not incorporated into the final database because, while they were reported as direct numerical simulations, the reported spatial resolution was too coarse in this author's judgment to warrant the DNS label. Most importantly, care was taken to ensure that no duplicate simulations were used in the database, either as duplicate Scholar entries or as papers presenting analysis of previously published turbulence simulations.
For all of these reasons, the author determined no automated data scraper was up to the task of developing such a database, and, consequently, this task was conducted manually. This last point is mentioned to raise awareness of how databases are catalogued with the hope that uniform standards may ease processing of archival data in the future.

In all, 645 Google Scholar entries were examined, and 148 papers were ultimately selected that met the above criteria. This yielded a total of 902 direct simulations of turbulence, sufficient for establishing confidence intervals. In summary, this database includes isotropic, channel, Couette, boundary layer, pipe, and duct flows, since these flows were amenable to definitions of the Reynolds number consistent with the developed correlation. Many direct simulations of other turbulent phenomena were not considered, including combustion, free shear flows, flow over bluff bodies, and Rayleigh–Bénard convection, since, in general, it was not straightforward to translate the relevant parameters of these flows into the required Reynolds numbers consistent with the carbon footprint correlation. License was given to additional physics not considered in the establishment of the carbon footprint correlation: we included compressibility, particle-laden and multiphase flows, heat transfer, and flows with rough walls and riblets. These additional physics are uniformly biased toward increasing the resolution requirements for a given Reynolds number while still likely following the same carbon footprint scaling with Reynolds number, $\sim Re^{3.73}$. The results presented in Sec. III B also do not account for “trial runs” (scaling, debugging, and scoping studies) performed prior to the production calculation. These preliminary calculations are frequently left out of the final published results yet still contribute to the carbon footprint of the final simulation product.

### B. Statistics of the database

Figure 4 shows the probability density function (pdf) of simulated Reynolds numbers along with the corresponding pdf of carbon footprint for database II. Additionally, Table II presents a summary of Reynolds number and carbon footprint statistics. Low Reynolds number simulations are most prevalent, corresponding to lower carbon footprints on a per-simulation basis. The mode simulation Reynolds number is 180, with 34% of the database falling within the Reynolds numbers 180 ± 30. The prevalence of this particular Reynolds number may stem from the seminal turbulent channel flow calculations at *Re* = 180 of Kim *et al.*^{6} What was then a pioneering calculation owing to computational expense is now, nearly four decades later, a routine Reynolds number for computational fluid dynamics simulations. The mean Reynolds number of this sample is 315, demonstrating the skewness toward less frequently simulated, larger Reynolds number calculations. The carbon footprints corresponding to the sample mean and mode Reynolds numbers are 18.4 and 2.3 kg, respectively. Another way to interpret the expected carbon footprint of a direct simulation is to calculate the confidence interval of the mean directly from the carbon footprint distribution. The corresponding 95% confidence interval for the mean carbon footprint of a DNS is $8.5 \times 10^3 \pm 1.6 \times 10^4$ kg, considerably larger than the carbon footprint associated with the expected Reynolds number. This discrepancy underscores the disproportionate contribution of lower-frequency, large Reynolds number calculations to the total footprint and is consistent with a rapid scaling of carbon footprint with Reynolds number, $\sim Re^{3.73}$. Indeed, of the 902 simulations considered here, 53 (6% of the database) were conducted at $Re \geq 1000$.
Remarkably, the largest simulation in this database, conducted at $Re = 10^4$, by itself constitutes 94% of the carbon footprint of the entire database: $7.2 \times 10^3$ tons out of $7.7 \times 10^3$ total tons of carbon for the 902 simulations from 2022. This observation underscores the large contribution that even a single “hero” calculation—one that uses all or a large fraction of a machine for a significant time—can make toward CFD's carbon production in a given year. It also suggests that the title “DNS” is not descriptive enough to assess whether a simulation will be carbon intensive. Two DNSs at different Reynolds numbers may differ in their carbon footprint by many orders of magnitude.
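The gap between the mode- and mean-based footprints is consistent with the $Re^{3.73}$ scaling, and the dominance of the single largest run follows directly from the totals quoted above. A quick consistency check, using only numbers already reported in the text, is:

```python
# Footprint ratio implied by the Re^3.73 scaling between the mean (Re = 315)
# and mode (Re = 180) Reynolds numbers of the database.
scale = (315.0 / 180.0) ** 3.73
mode_footprint_kg = 2.3
print(round(mode_footprint_kg * scale, 1))  # close to the reported 18.4 kg

# Share of the database total attributable to the single largest simulation.
share = 7.2e3 / 7.7e3
print(round(share, 2))  # about 0.94
```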

**TABLE II.** Reynolds number and carbon footprint statistics for database II. Two confidence intervals are reported for the carbon footprint: (1) the footprint evaluated at the confidence interval of the mean Reynolds number and (2) the confidence interval computed directly from the carbon footprint distribution.

| Statistic | Mean (confidence interval) | Median | Mode | Min | Max |
|---|---|---|---|---|---|
| 1. Reynolds number | $\overline{Re} = 315 \pm 30$ | 188 | 180 | 9 | $10^4$ |
| 2. Carbon footprint (kg) | (1) $CI(\overline{Re}) = (12.6, 18.4, 25.9)$; (2) $CI(Re) = 8.5 \times 10^3 \pm 1.6 \times 10^4$ | 2.7 | 2.3 | $3 \times 10^{-5}$ | $7.2 \times 10^6$ |


Placed in the context of annual human activities, the carbon footprint of this database corresponds to the annual carbon footprint of about 2741 people, assuming 2.8 tons per capita in the U.S.^{36} While the largest simulation constitutes the majority of this footprint, the remaining simulations still amount to a non-negligible carbon footprint, equivalent to that of about 160 people. In assessing the carbon impact of a simulation, it is worth considering not only the carbon footprint at a given Reynolds number but also the cumulative contribution across all Reynolds numbers, as shown in Fig. 5. Indeed, while the majority of simulations are conducted at low Reynolds number, their cumulative carbon footprint is significant. Noticeable upticks in the cumulative distribution are observed around Reynolds numbers of 180 and 395, demonstrating that the carbon impact of simulations at a given Reynolds number depends both on the footprint associated with that Reynolds number and on how frequently it is simulated. The latter points to behavioral patterns of computational fluid dynamicists, who may be biased toward particular Reynolds numbers owing to historical precedence for that simulation type in the literature,^{6,40} as well as to the tradeoffs scientists must weigh in conducting higher Reynolds number simulations to elucidate more fundamental features of turbulent flows while balancing machine-time allocations at various computing centers.

## IV. OUTLOOK

This review has focused primarily on fluids applications, and it is expected that similar takeaways would apply to other fields making heavy use of computational resources, including computational biology, medicine, machine learning, atmospheric science, and finance. As large-scale scientific computing becomes routine in both academia and industry, greater responsibility is warranted. We are now stewards of these powerful machines, and it is incumbent upon us to use their might judiciously. This work focused on direct numerical simulations owing to (1) their computational expense and (2) the agreement in their definition, requiring sufficient resolution of the largest and smallest spatial and temporal scales.^{6,34,45} LES and RANS methodologies are likely employed more frequently in academia and industry owing to their low cost relative to direct simulation. However, the lack of an agreed-upon definition of the resolution requirements for these modalities makes it difficult to assess their respective carbon footprint scalings for a given Reynolds number. Analysis of database II shows that a large number of low Reynolds number (cheap) simulations may still sum to an appreciable carbon footprint. In the development of database II, many candidate simulations were discarded because they constituted simulations re-analyzed from prior years. This is a welcome behavior that scientists should accelerate in their practice. Unfortunately, simulation databases are too often used only once for a particular research article, with the data becoming inaccessible after a student graduates. Public sharing of databases, particularly high quality direct simulation databases,^{40–42} which may be mined for decades beyond their inception, should be championed by the fluid dynamics community. In developing some initial ideas toward estimating the carbon footprint of computational fluid dynamics, we are still left wondering what the *total* footprint is.
To estimate this number for a given year, we would need a database of every simulation performed across all modalities (DNS, LES, RANS, and others). There seems to be no straightforward way to construct such a database. A workaround, however, is to consider the fraction of CFD applications running on the world's largest computers. Top500^{39} maintains an active list of the world's fastest computers. The total annual energy use of the 188 computers on the November 2023 list that provided power use information amounts to $3.41 \times 10^9$ kWh. These computers are housed in many countries with different carbon intensities for electricity use. As a very rough estimate of the carbon footprint of this fraction of the Top500, we can use the carbon intensities of a few countries to get a sense of the order of magnitude of the footprint. Six countries are chosen, the United States, Japan, China, France, the Netherlands, and Russia, since they have diverse budgets of energy resources and make up about two-thirds of the Top500 computers. Assuming all Top500 computers were powered at the average carbon intensity of one of these countries yields respective carbon footprint estimates of $1.44 \times 10^6$ tons (U.S.), $1.59 \times 10^6$ tons (Japan), $1.83 \times 10^6$ tons (China), $1.75 \times 10^5$ tons (France), $1.28 \times 10^6$ tons (Netherlands), and $1.06 \times 10^6$ tons (Russia). Cast in terms of the annual Indianapolis carbon footprint (site of the 2022 APS DFD meeting) of $19.4 \times 10^6$ tons,^{44} this range corresponds to about 1%–10% of the annual Indianapolis footprint. The percentage of CFD applications on these computers is not available, but even if these use cases constitute a small percentage of the total, the order of magnitude of these carbon footprint estimates is concerning.
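The arithmetic behind these estimates can be sketched as follows. This is a minimal illustration, not the author's analysis script; the per-country carbon intensities (kg CO2-equivalent per kWh) are assumptions back-computed from the footprints quoted above and should be replaced with authoritative grid data for serious use.

```python
# Order-of-magnitude Top500 carbon footprint: annual energy use times an
# assumed national grid carbon intensity, converted to metric tons.

TOP500_ENERGY_KWH = 3.41e9  # annual energy of the 188 reporting machines

# Assumed average grid carbon intensity, kg CO2-equivalent per kWh
# (back-computed from the quoted footprints; illustrative only)
carbon_intensity = {
    "United States": 0.422,
    "Japan": 0.466,
    "China": 0.537,
    "France": 0.0513,
    "Netherlands": 0.375,
    "Russia": 0.311,
}

INDIANAPOLIS_TONS = 19.4e6  # annual Indianapolis carbon footprint, tons

for country, ci in carbon_intensity.items():
    tons = TOP500_ENERGY_KWH * ci / 1000.0  # kg -> metric tons
    pct = 100.0 * tons / INDIANAPOLIS_TONS
    print(f"{country}: {tons:.2e} tons ({pct:.1f}% of Indianapolis)")
```

Running this reproduces the range above, from roughly $1.75 \times 10^5$ tons (France) to $1.83 \times 10^6$ tons (China), i.e., about 1%–10% of the Indianapolis footprint.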

## V. CONCLUSION

The main finding of this work is Eq. (2), which describes the carbon footprint of direct numerical simulation as a function of the simulated Reynolds number. This simple correlation can be used to estimate the carbon footprint during the planning stages of a new simulation without resorting to scaling studies. The key implication of the correlation is that doubling the Reynolds number of a turbulent flow simulation produces about 13 times more carbon. While the main factors contributing to this increase are higher computational core and simulation time requirements, regional differences in carbon intensity are also significant. Running the same simulation on machines whose power is derived from a high fraction of renewable energy will have a lower footprint than on machines powered primarily by fossil fuels.
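The factor of 13 per doubling implies a power-law exponent of $\log_2 13 \approx 3.7$, consistent with the roughly fourth-power Reynolds number scaling discussed earlier. A short sketch of the implied scaling (assuming a pure power law, with the exponent inferred from the quoted factor rather than taken from Eq. (2) directly):

```python
import math

# A 13x increase in carbon per doubling of Re implies an exponent n with
# 2**n = 13, i.e., n = log2(13) ~ 3.7, near the fourth-power scaling.
n = math.log2(13)
print(f"implied exponent: {n:.2f}")

def footprint_ratio(re_ratio, exponent=n):
    """Relative carbon footprint of two DNS whose Reynolds numbers differ
    by a factor re_ratio, assuming footprint ~ Re**exponent."""
    return re_ratio ** exponent

print(footprint_ratio(2.0))   # doubling Re: ~13x the carbon
print(footprint_ratio(10.0))  # tenfold Re: roughly 5000x the carbon
```

The last line makes the planning implication concrete: a tenfold jump in Reynolds number multiplies the footprint by over three orders of magnitude.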

The second main finding concerns the makeup of a typical direct numerical simulation, which, to the best of the author's knowledge, has never been presented. A large sample database of direct simulations of turbulence from 2022 shows an average Reynolds number of 315, whose corresponding footprint is 18.4 kg. The carbon footprint distribution of the database sample is wide, and while there are many relatively low Reynolds number simulations, their net carbon footprint is significant. Most strikingly, the highest Reynolds number simulation accounts for 94% of the carbon footprint of the database, meaning that a single hero calculation in a year can dramatically increase CFD's carbon footprint. As scientists decide what Reynolds number to simulate for their application, they may ask whether a lower Reynolds number simulation would yield the essential physics at a fraction of the carbon production. Another factor to consider is whether direct simulation is required at all, or whether a reduced-order approach such as LES or RANS would be sufficient. Even a large ensemble of these lower fidelity simulations may produce significantly less carbon than a single DNS at the same Reynolds number. While this work has focused on quantifying the carbon footprint of DNS, a modality performed mainly in the academic community, industry also engages in significant computing. Efforts to estimate the industrial carbon footprint owing to computing are needed.
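The dominance of a single hero run follows directly from the steep scaling. The following is a purely hypothetical illustration: it calibrates a power law footprint $\propto Re^{3.7}$ to the database-average point (Re = 315, 18.4 kg), and the run counts and the tenfold Reynolds number of the "hero" run are invented for the example, not taken from the database.

```python
# Hypothetical illustration: one high-Re "hero" run vs many average runs,
# using footprint(Re) = C * Re**3.7 calibrated so that Re = 315 -> 18.4 kg.
# The run count (1000) and hero Re (10x average) are illustrative choices.

EXPONENT = 3.7
C = 18.4 / 315 ** EXPONENT  # kg per Re**EXPONENT, from the average point

def footprint_kg(re):
    """Carbon footprint in kg of a single DNS at Reynolds number re."""
    return C * re ** EXPONENT

typical = 1000 * footprint_kg(315)  # a thousand average-sized simulations
hero = footprint_kg(315 * 10)       # one run at ten times the average Re

print(f"1000 typical runs: {typical / 1000:.1f} tons")
print(f"one hero run:      {hero / 1000:.1f} tons")
```

Under these assumptions the single hero run emits several times more carbon than a thousand average-sized simulations combined, echoing the 94% concentration observed in the database.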

This work relied on estimates of simulation power use through TDP, which, in general, does not equal the power consumed *in situ*. More work to measure power use during runtime for various flow types and Reynolds numbers will provide more accurate correlations building on these initial estimates.

Future work should also focus on understanding the net science generated from computation compared with experiments. Both modalities require initial capital and carbon investment to construct the respective facilities. Data collection, whether running a simulation or a wind tunnel, will have different carbon costs *in situ*. The author's impression is that a high Reynolds number turbulent flow DNS is likely more energy (and carbon) intensive than a wind tunnel run at an equivalent Reynolds number, since the former must typically run for long wall-clock times, whereas measurement of, for example, a turbulent boundary layer permits ample time averaging within minutes. The precise comparison is sensitive to the respective details of the experiment and simulation. There is also a trade-off in the type of data accessible to simulations compared with experiments, one example being the resolution of physics close to boundaries: collecting data near a wall is easy in a simulation and hard in an experiment. Design optimization is another area where simulations may be advantageous compared with experiments. As scientific representatives, we must optimize the amount of science produced per dollar of investment while simultaneously seeking to maximize the amount of science produced per mass of carbon emitted. These goals may be in conflict.

If the exponential growth in Reynolds number accessible to simulation is to continue in concert with growing computational infrastructure, then, absent a change in scientific community behavior, the carbon footprint of scientific simulation will increase in unison. By analogy to Richardson's famous quote on turbulent flows, “Big whirls have little whirls that feed on their velocity, and little whirls have lesser whirls and so on to viscosity,”^{35} the apt comparison here may be: “big computers have big simulations that feed on their electricity, bigger computers have bigger simulations, and so on as $Re \to \infty$.” As 21st-century scientists who have inherited the privilege and opportunity to conduct the most detailed numerical studies ever conceived, it is the author's belief that such inquiry should now consider the carbon cost of our work, in addition to all the other careful choices we deliberate on in the search for knowledge.

## CARBON STATEMENT

The principal factors contributing to the carbon footprint of this paper were my travel to present preliminary versions of this work at the 2022 and 2023 APS Division of Fluid Dynamics annual meetings. Roundtrip flights from San Francisco to Indianapolis and from San Francisco to Washington, D.C. generated around two metric tons of carbon equivalent for a single economy ticket.^{30}

## SUPPLEMENTARY MATERIAL

See the supplementary material for a spreadsheet containing the reference and Reynolds number information for database II.

## ACKNOWLEDGMENTS

I am grateful for the detailed feedback of M. Lee and K. Ravikumar regarding their simulations. I am also thankful to J. Capecelatro for suggesting the Top500 database as an avenue for exploring CFD's carbon footprint, and to P. Johnson for advocating public access databases. I appreciate valuable conversations with K. Ferguson, who provided helpful feedback and thought-provoking questions, including on the cost associated with preparatory simulations conducted prior to production runs. I am also grateful for his comments on this paper. I am thankful to A. Bertsch, B. Rountree, B. R. de Supinski, B. Springmeyer, T. Patki, and B. Ryujin for insightful conversations, particularly on the subjects of HPC power management, limitations of TDP, the cost of simulations vs experiments, and quantification of net “science” production. In addition, I received valuable feedback from M. Howland, E. Loth, P. Miller, and T. Bailey. I thank the anonymous reviewers for their constructive comments, particularly for pointing me to the computational resource requirements for wall-bounded flows. This work was performed under the auspices of the U.S. Department of Energy by Lawrence Livermore National Laboratory under contract DE-AC52-07NA27344. Lawrence Livermore National Security, LLC, LLNL-JRNL-858734.

## AUTHOR DECLARATIONS

### Conflict of Interest

The author has no conflicts to disclose.

### Author Contributions

**J. A. K. Horwitz:** Conceptualization (equal); Data curation (equal); Formal analysis (equal); Investigation (equal); Methodology (equal); Writing – original draft (equal); Writing – review & editing (equal).

## DATA AVAILABILITY

The data that support the findings of this study are available within the article and its supplementary material.

## REFERENCES

*Flow Simulation of High Performance Computers I,*

*Computers in Spaceflight: The NASA Experience*. https://history.nasa.gov/computers/Ch2-5.html/

*Weather Prediction by Numerical Process*