A regularized version of the lattice Boltzmann method for efficient simulation of soft materials is introduced. Unlike standard approaches, this method reconstructs the distribution functions from available hydrodynamic variables (density, momentum, and pressure tensor) without storing the full set of discrete populations. This scheme shows significantly lower memory requirements and data access costs. A series of benchmark tests of relevance to soft matter, such as collisions of fluid droplets, is discussed to validate the method. The results can be of particular interest for high-performance simulations of soft matter systems on future exascale computers.
I. INTRODUCTION
Minimizing the cost of accessing data is one of the most compelling challenges of present-day high-performance computing (HPC).1 This is particularly true for memory-bound computational schemes used in computational fluid dynamics. Among them, the lattice Boltzmann (LB) method2 is characterized by a comparatively low arithmetic intensity spanning in a range between 1 and 5 Flops/Byte ratio depending on the implementation, hence below the roofline threshold beyond which computing resources become the limiting factor.3,4
Given the widespread use of LB across a broad spectrum of scales and regimes, it is of prime interest to develop techniques to mitigate the impact of data access on future (exascale) LB simulations. Several such techniques have been developed in the past, based on hierarchical memory access,5 data organization in terms of array of structures and structure of arrays,6 as well as special crystallographic lattices that double the resolution at a given level of memory occupation.7 Furthermore, a number of approaches have been proposed to reduce memory occupancy. These can be summarized in two main strategies: the first one focuses on algorithmic implementation, while the second one exploits different numerical representations. As a remarkable example of the algorithmic approach, we recall the so-called “in-place” streaming, like the esoteric pull,8,9 while in the latest one, low precision and shifted equilibrium distributions are used either with single precision (instead of double) or with a mixed single precision/half-precision representation, with a factor two of memory saving and no loss of accuracy.10,11
In this work, we retrieve the pioneering considerations given by Ladd and Verberg12 on the possibility of reducing memory consumption by only storing hydrodynamic fields and their gradients. Thus, we present a new strategy to implement the LB approach to model a bi-component system based on the idea of refraining from storing the populations and reconstructing them on the fly based on the available hydrodynamic information. This approach grants a significant reduction in memory usage since it avoids population storage (in the following simply called “without populations”). Furthermore, this is particularly useful for multi-component and multiple species applications in which each species is carried by a common flow field, thereby requiring a single hydrodynamic field (density) as opposed to the full kinetic representation with O(30) populations. A similar argument applies to flows far from equilibrium or relativistic hydrodynamics,13,14 which demand higher order lattices, sometimes entailing hundreds of discrete populations per species.
The present approach is tested on a selection of physically relevant situations in soft matter. More specifically, we simulate a 3D off-axis and a 3D head-on collision between two equally sized fluid droplets in which the merging is suppressed. These simulations are compared with those performed using a regularized LB version with populations, and the accuracy of the results is quantitatively assessed by computing the time evolution of the droplet densities and the relative error. Even though the reconstruction procedure suffers from an inherent approximation concerning the standard implementation of the LB method (see Sec. II), our results show that the error remains negligible, the numerical accuracy is generally preserved, and the physics is correctly reproduced. Furthermore, the reconstruction procedure benefits from the regularization techniques15–17 that grants stability properties filtering out the non-hydrodynamic content in off-equilibrium distribution functions. The distinctive feature of the reconstruction procedure discussed here is that, unlike a standard LB approach, it considerably enhances the computational performance of lattice Boltzmann codes, entailing about 40 percent of memory savings on large-scale simulations. This represents a significant step toward the study of the multiscale physics of soft materials, particularly on modern graphics processing unit (GPU)-accelerated machines where the minimization of the burden imposed by memory requirements and data access are often mandatory.
II. LATTICE BOLTZMANN WITHOUT POPULATIONS: THE BASIC IDEA
The LB scheme is based on a set of discrete populations describing the probability of finding a representative fluid particle at position and time t with a discrete molecular velocity . The discrete velocities are chosen according to suitable crystallographic symmetries ensuring compliance with the basic conservation laws of macroscopic fluids. This implies a set of order b ∼ 20 − 30 discrete populations fi, i = 0, b, which exceed the number of associated hydrodynamic fields—scalar density ρ, (1), flow velocity ua (a = 1, 3), and momentum–flux tensor Πab (six terms for each lattice point)—totaling ten independent fields. Hence, LB requires about twice as much memory as compared to a corresponding Navier–Stokes-based computational fluid dynamics method. This redundancy buys several major computational advantages, primarily the fact that streaming is exact and diffusion is emergent (no need for a second order spatial derivatives), which proved invaluable assets in achieving outstanding parallel efficiency across virtually any HPC platform, also in the presence of real-world complex geometries.
These extra-memory requirements put a premium on strategies aimed at minimizing the cost of data access in massively parallel LB codes. Among others, recognized techniques are the hierarchical access and the use of special crystallographic lattices.7
The reconstructed populations hi are then relaxed and streamed without storing them (see Sec. III). Note that streaming revives the ghost components gi, which affect only the non-equilibrium part of the set of populations. As a result, the operational sequence at each time step is Reconstruct–Collide–Stream (RCS). It should be emphasized that the RCS sequence does not match exactly the result of the original, populations preserving, LB scheme. The underlying assumption is that, whenever the present approach is compared to standard Bathnagar–Gross–Krook (BGK) collision,27 the discrepancy introduced by the R-step naturally heals itself via the ghost revival described earlier. Since the ghosts contain high-order, non-hydrodynamic non-equilibrium information, this assumption appears well justified, except perhaps in the case of very low viscous flows (turbulence), which lie outside the scope of this paper. However, discrepancies are lowered whenever the comparison is performed with a regularized LB version with populations, as in the present case. To be noted that a further source of discrepancy relates to the truncation error due to the floating-point representation, as well as to the different sequences of operations in the algorithmic implementation. However, none of these are expected to play a significant role in the physical phenomena under study in this work.
III. LATTICE BOLTZMANN WITHOUT POPULATIONS: IMPLEMENTATION DETAILS
A. Implementation procedure
The basic idea is to construct the populations on the fly, based on Eq. (3) without storing them. In order to caution against overwriting the values at time t and t + 1, two copies of the hydrodynamic arrays must be stored, doubling the memory request. For a single component fluid in the paradigm of flip–flop memory access pattern,28 the memory saving is significant, 20 hydro-arrays (1 for density, 3 for velocity vector, 6 for the pressure tensor, two times in the flip–flop paradigm) vs 38 populations for the D3Q19 lattice scheme, adopted in the present investigation. For two component fluids, the advantage increases. For example, using the color-gradient lattice Boltzmann (LBCG) model,29–31 we need to add two extra arrays for the second component, thus obtaining 22 hydro-arrays vs 76 populations (i.e., 38 populations per each fluid component).
About the implementation of the boundary conditions (BCs), since the populations are reconstructed on the fly, BCs can be easily implemented by conditional statements for the applications of ad-hoc rules on the post-collision populations in the fused approach. For instance, halfway bounce-back can be easily implemented, including a minor correction to endure a Dirichlet boundary condition.12,32 On the other hand, the performance could notably decrease due to the extensive use of conditional statements in the fused implementations. The problem can be avoided by adopting the non-local regularized BCs,21,26 where the BCs are applied by modifying only the hydrodynamics fields used to reconstruct the populations. In particular, in the non-local regularized BCs, the target values of ρBC and uBC at the wall can be assigned or extrapolated by finite differences. At the same time, the non-equilibrium tensor can be estimated18,26 as . Both the proposed BC strategies preserve second-order accuracy in space and time.12,21
B. Near-contact interactions
Recently, an extension of the lattice Boltzmann method to model near contact interactions among interfaces (in the following denoted LBNCI)33,34 has granted the possibility to simulate a broad variety of complex flow systems, such as foams, emulsions, flowing collections of droplets, bubbles, and high internal phase emulsions, i.e., foamy-like media displaying ordered patterns of fluid drops when under confinement.35–37 In those systems, the relative motion of two fluid interfaces in close contact represents a complex hydrodynamic process due to the concurrent action of several repulsive and/or attractive forces, such as electrostatic and van der Waals forces, hydration repulsion, steric interactions, and depletion attraction.38,39 Importantly, these forces govern the physics of phenomena, such as coalescence and/or repulsion between neighboring droplets that can considerably alter the mechanics of the material, occurring at lengthscales much larger than the near-contact ones.40–44 Their nature has been initially elucidated by the pioneering works of Gibbs and Marangoni on the thermodynamics of liquid thin films39 and by the seminal contributions of Derjaguin and Overbeek45,46 concurring in the development of the DLVO theory.
The LBNCI method is a variant of an LBM approach for multicomponent flows, based on the color-gradient model,30 augmented with an extra forcing term capturing the effects of the near-contact forces acting at the fluid interface level. This method still holds to a continuum description of the interface dynamics, even though near interaction forces arise at a molecular level down to nanometers (and below), namely, the relevant spatial scale of contact forces. In other words, the mesoscale representation of near-contact forces provides a coarse-grained description able to retain computational viability without compromising the underlying physics.
In the above equation, is the parameter controlling the strength (force per unit volume) of the near contact interactions, is the distance between two close interfaces, is a unit vector normal to the interface, and δI ∝ ∇Θ is a function, proportional to the phase field Θ, which localizes the force at the interface.
IV. VALIDATION
The numerical experiment is as follows: Two drops, having diameter D = 30 lattice sites, of a fluid (red component) are dispersed in a second fluid (blue component), with the viscosity ratio reported above and densities equal to one for both the components. The system is placed within a cubic box of linear size L = 128 and the drops are located at a distance such that reciprocal interactions are negligible. The impact parameter (i.e., the distance between the collision trajectories perpendicular to their relative direction of motion) is b = 0.85, the strength of near-contact forces preventing coalescence is Ah = 0.1, while other values can be found in Table I of Ref. 33. In Fig. 1, we show the time sequence of such a collision simulated using an LB approach without populations [(a)–(e)] and with populations [(f)–(j)]. In both cases, the relative impact velocity Urel is set to 0.5, while Reynolds and Capillary numbers are ReR = 450 and CaR ≃ 1.6 for the droplet (red fluid), and ReB = 50 for the surrounding medium (blue fluid). The two drops initially approach and then impact without merging [Figs. 1(b), 1(c), 1(g), and 1(h)], since the repulsive near-contact interactions prevent the coalescence and enable the formation of a thin film that separates the drops.33 During the bouncing, the shape of the drops considerably departs from the initial spherical geometry, temporarily flattening where opposite interfaces come into close contact. Afterward, they separate and gradually reacquire a spherical shape when their reciprocal distance is sufficiently high.
The time sequence of Fig. 1 shows that the two numerical approaches with and without populations essentially lead to equivalent dynamic behaviors. On a quantitative basis, this result is corroborated by evaluating, for example, the time evolution of the density field shown in Fig. 2, where ⟨⋯⟩ represents a spatial average computed over lattice points for which ρ > 0.5 (i.e., within both drops, the red regions in Fig. 1). Larger fluctuations occur only at early times, i.e., when the droplets come into contact, while they progressively shrink as the droplets move away. Note that the differences between the two plots remain marginal over the full process. In Fig. 3, we compute, at two different lattice sites, the time evolution of the relative error , where ρN and ρP are the densities of the LB without and with populations, while is the maximum value of ρ calculated using LB without populations. Unlike the previous one, this quantity provides a local measure of potential discrepancies between the two approaches. The plot shows that the relative error remains rather small over time, a further indication that the procedure of reconstructing the populations proposed in this work preserves, at a high level of accuracy, the physics of the systems.
The approach without populations has been also tested in simulations of a head-on collision between two fluid droplets (see Fig. 4). Here, b = 0, Urel = 0.25, and D = 30. Once again, the physics is correctly reproduced and no appreciable discrepancies emerge, as also shown in Fig. 5 where the time evolution of ⟨ρ⟩ρ>0.5 is plotted. Note incidentally that, in this case, the interaction time is considerably longer than before since, after the bouncing, the drops remain in close contact for a longer period of time.
The numerical experiments discussed so far present results in which near-contact interactions function as an effective force field preventing droplet coalescence. But what if these forces are turned off? In Fig. 6, we simulate precisely this case by setting Ah equal to zero (other parameters are the same as those of Fig. 1). The drops, initially static and perfectly spherical, approach and connect through a tiny neck of red fluid, a region in which capillary forces are strong enough to prevent its breakup. Afterward, the two droplets merge into a single one that slowly attains an equilibrated spherical shape. Once again, both methods with and without populations describe the physics of the process with negligible differences, as also demonstrated in Fig. 7, where early times density fluctuations capture the dynamics of droplet coalescence.
A. Performance data
V. SUMMARY AND OUTLOOK
In summary, we have introduced a regularized lattice Boltzmann method for the efficient simulation of soft materials that reconstructs the distribution functions from hydrodynamics quantities (density, momentum, and pressure tensor) with no need to store the full set of populations. This entails a significant decrease in memory requirements and a substantial enhancement of the computational performance, features of particular relevance for large-scale simulations. The accuracy of the method has been investigated in numerical experiments of relevance to soft matter, such as the collision of fluid droplets. For such cases, the formulation without populations shows satisfactory numerical stability and negligible differences with respect to a standard reconstruction-free LB approach. In practical terms, 40% memory reduction essentially means doubling the volume of the simulation, which would save roughly 4 TB memory, for the case of a simulation demanding about 10 TB of data.50–52 It would be of interest to further test the model on more complex states of matter, such as colloidal fluids and fluid droplets flowing within microfluidic channels, where the implementation of solid boundaries is often necessary.53 On a more general basis, our results could be helpful for the design of novel high-performance computing methods in soft matter, where customized numerical schemes are fundamental to tackle phenomena occurring over a broad spectrum of scales in space and time.
ACKNOWLEDGMENTS
The research leading to these results has received funding from MIUR under the project “3D-Phys” (Grant No. PRIN 2017PHRM8X) and from the European Research Council under the European Union’s Horizon 2020 Framework Program (Grant No. FP/2014–2020)/ERC Grant Agreement No. 739964 (“COPMAT”). We acknowledge CINECA Project No. IsB25_DIW3F under the ISCRA initiative, for the availability of high-performance computing resources and support. F.B. acknowledges funding by the European Research Council (ERC) under the European Union’s Horizon 2020 Research and Innovation Program (Grant Agreement No. 882340). M.L. acknowledges the support of the Italian National Group for Mathematical Physics (GNFM-INdAM).
AUTHOR DECLARATIONS
Conflict of Interest
The authors have no conflicts to disclose.
Author Contributions
Adriano Tiribocchi: Software (equal); Validation (equal); Visualization (lead); Writing – original draft (lead). Andrea Montessori: Formal analysis (equal); Software (equal); Validation (equal); Writing – review & editing (equal). Giorgio Amati: Software (equal); Writing – review & editing (equal). Massimo Bernaschi: Software (equal); Writing – review & editing (equal). Fabio Bonaccorso: Software (equal); Writing – review & editing (equal). Sergio Orlandini: Software (equal); Writing – review & editing (equal). Sauro Succi: Formal analysis (equal); Supervision (equal); Writing – review & editing (equal). Marco Lauricella: Conceptualization (lead); Formal analysis (equal); Methodology (lead); Software (equal); Supervision (equal); Validation (equal); Writing – review & editing (equal).
DATA AVAILABILITY
The data supporting this study’s findings are available from the corresponding author upon reasonable request.