The swift progression and expansion of machine learning (ML) have not gone unnoticed within the realm of statistical mechanics. In particular, ML techniques have attracted attention by the classical density-functional theory (DFT) community, as they enable automatic discovery of free-energy functionals to determine the equilibrium-density profile of a many-particle system. Within classical DFT, the external potential accounts for the interaction of the many-particle system with an external field, thus, affecting the density distribution. In this context, we introduce a statistical-learning framework to infer the external potential exerted on a classical many-particle system. We combine a Bayesian inference approach with the classical DFT apparatus to reconstruct the external potential, yielding a probabilistic description of the external-potential functional form with inherent uncertainty quantification. Our framework is exemplified with a grand-canonical one-dimensional classical particle ensemble with excluded volume interactions in a confined geometry. The required training dataset is generated using a Monte Carlo (MC) simulation where the external potential is applied to the grand-canonical ensemble. The resulting particle coordinates from the MC simulation are fed into the learning framework to uncover the external potential. This eventually allows us to characterize the equilibrium density profile of the system by using the tools of DFT. Our approach benchmarks the inferred density against the exact one calculated through the DFT formulation with the true external potential. The proposed Bayesian procedure accurately infers the external potential and the density profile. We also highlight the external-potential uncertainty quantification conditioned on the amount of available simulated data. The seemingly simple case study introduced in this work might serve as a prototype for studying a wide variety of applications, including adsorption, wetting, and capillarity, to name a few.
The advancement of statistical learning techniques, coupled with more affordable computing power and access to large datasets, has fueled the recent rapid growth of machine learning (ML).1 ML has achieved unprecedented progress in a large spectrum of fields including image recognition, natural language processing, and biological-structure prediction. At present, incorporating ML into science and engineering is seen as essential for advancing and expediting research.
In the field of materials science, in particular, an interdisciplinary subject spanning the physics, chemistry and engineering of matter, and industrial manufacturing processes, driven by a fast and rational approach to design new materials, ML has become an essential toolkit in computational materials modeling and molecular simulations.2–4 In this field, ML facilitates the discovery of novel molecules, materials, or systems exhibiting desired characteristics. In particular, a very important contribution to the field is the accurate representation of potential-energy surfaces for atomistic computer simulations. This problem has attracted considerable attention with recent developments of the so-called ML potentials.5–7 These potentials provide a direct functional relation between the atomic positions and the associated energy configuration by fitting a training set of electronic structure calculations. However, ML potentials utilize elements from quantum density-functional theory (DFT), and therefore their applicability is necessarily restricted to electron-size scale.
Quantum DFT is built upon a theorem formulated by Mermin.8 It establishes that the grand potential of an electron gas subjected to a one-body potential can be expressed as a distinctive functional of the local density. This theorem is a finite-temperature extension of the Hohenberg–Kohn theorem9 that operates at zero-temperature conditions. These two theorems provide the foundations for the DFT in ab initio quantum mechanical calculations.
Classical DFT, which deals with molecular-scale interactions in many-particle systems, such as liquids and colloids, has its roots on the same fundamental theorems.10 It is a statistical-mechanical framework in which the free energy of a many-particle system is written as a functional of the particle density, ρ(r). The system’s equilibrium is then determined by the extrema of the free-energy functional, which represent the density profiles at equilibrium.11 The density ρ(r) itself can be understood as the probability density function describing the likelihood of a particle being located in the proximity of the position-vector r. By means of this one-particle representation of the density, properties of matter, such as electric charge, magnetization, or pressure, are derived by weighing the microscopic interactions between the constituent particles.
While classical DFT is a popular statistical-mechanical framework, it hinges on the exact formulation of the Helmholtz free energy, which remains undetermined for certain system configurations. This barrier can be overcome by simplifying the intermolecular interactions through coarse-graining techniques yielding an approximate free-energy functional. Such approximations have been widely explored in several realistic many-particle systems, accounting for effects, such as geometry, phase transitions, nucleation, and multiple components.12–14
Previously, we provided an example of an ML application to quantum DFT. The applicability of ML to classical DFT has primarily focused on the discovery of true free-energy functionals as an alternative to the human-designed functionals.15 Ref. 16 combines the classical DFT formalism with ML, specifically kernel-density regression, to approximate free-energy functionals. In that work, a collection of grand-canonical Monte Carlo (MC) simulations at different chemical and external potentials in a planar geometry is used as a training set to obtain free-energy functionals for a Lennard-Jones fluid at supercritical temperature. The learning process involves measurement of the error between the predicted density profiles coming from the DFT apparatus in which the ML free-energy functional is embedded and the empirical results of the MC simulations, which are taken as the ground truth. Despite the accuracy obtained in the free-energy functional reconstruction and the ability to extract additional physics-related parameters such as thermodynamic bulk quantities and two-body correlations, the authors of Ref. 16 highlight some spurious results attributed to overfitting in the training process.
The authors of Refs. 17 and 18 were the first to employ neural networks for free-energy functional reconstruction; particularly the free-energy functional for hard rods (HR) and a Lennard-Jones fluid in one dimension (while a comprehensive perspective of ML methods in DFT, including neural networks, is given in Ref. 19). Neural networks are very good at handling complex inputs, such as images,20 and perform as universal approximators21 yielding high accuracy in function reconstruction. However, they effectively operate as a black-box model being difficult to interpret and requiring large datasets to approximate high-dimensional functions. In addition, neural networks do not generate uncertainty quantification by default. They produce a single length-fixed array instead of a collection of possible outputs weighted by probabilities as in the case of Bayesian inference. There exist techniques to incorporate uncertainty quantification into neural networks, such as MC dropout22 and the ensemble method.23 However, both methods pose some challenges. MC dropout relies on the empirical definition of the neural network architecture, making the uncertainty estimation potentially sensitive to different network configurations. On the other hand, the ensemble method might be computationally demanding as it requires training the same neural network model a predefined number of times. In addition, the choice of this predefined number introduces another hyperparameter that affects the uncertainty quantification process.
In contrast to neural networks and traditional ML techniques, non-parametric Bayesian inference from canonical and grand-canonical simulations provides uncertainty quantification of the predictions for free-energy functionals, as demonstrated in Ref. 24. (The same study also implemented adjoint techniques, allowing the use of gradient-based samplers, such as Hamiltonian MC). Following the Bayesian approach also, Ref. 25 utilized simulated data from molecular dynamics to approximate the effective potential governing the random walk of an atom in a lattice.
The employment of classical DFT to obtain ρ(r) might be a challenging task if system particles interact through fine-detailed potentials or if there exists an external potential acting on the many-particle system. External fields might modify the mechanical, electrical, or optical properties of systems in equilibrium with a bulk fluid having profound impacts on fields such as separation processes, crystallization, or nanofluidics, to name a few. Therefore, the external potential stands as one of the central variables to accommodate classical DFT,26 given that it might alter the equilibrium density ρ(r). It is noteworthy that the external-potential effects on the adsorption and phase transitions occurring at the substrate–fluid interface have been researched by ab initio analytical calculations (including DFT),13,27 molecular simulations,28–30 or both.31,32
Rather than assessing the external-potential effects on properties of matter, our aim is to uncover the external potential acting on a many-particle system by means of simulated data. The data corresponds to system-configurations observations that on average resemble the equilibrium-density profile of the many-particle system under the influence of the external potential. As a prototypical example, we consider a many-particle system with excluded volume interactions, which are ubiquitous in many applications, where a predefined external potential is exerted. The advantage of our prototype is that the associated free-energy functional is known analytically. Thus, we can leverage the classical DFT apparatus to statistically learn the external potential in a physics-informed ML formulation. Regarding the ML technique, we adopt a Bayesian inference framework, precisely because of its ability to provide full uncertainty quantification, as already alluded to. As we will see, uncertainty quantification depends solely on the amount of available data. It is based on seamless uncertainty propagation through all modeling hierarchy levels, thereby facilitating a judicious interpretation of the results.
II. EXTERNAL POTENTIAL ON THE GRAND-CANONICAL ENSEMBLE
In statistical mechanics, the one-dimensional (1D) functional of HR has a tractable analytical expression.33 The same expression can be obtained using particle simulations over a computational ensemble configured with identical parameters to the theoretical set-up. This equivalence is also valid when an external field or potential is exerted over the many-particles system. The objective of this work is to learn the external potential applied over a 1D purely repulsive HR system by using as empirical data the particles’ coordinates obtained from molecular simulations. For the sake of simplicity, we restrict our attention to a 1D HR system confined between two walls. The configuration resembles a fluid inside a pore. In this context, we calculate the exact equilibrium density resulting from DFT. The obtained density is used as the ground truth in the external-potential learning process. We then present the MC simulation that generates the particle configurations that are used as the learning dataset.
A. Exact density profile from DFT
The value μ = −2 defined in the absence of external potential makes the system very diffuse, as the average number of interacting particles is ⟨N⟩ = 2.05. This small number of particles, taking also into account the pore dimension, yields a low density profile. In fact, the layering effect close to the side walls is insignificant and the system behaves like bulk fluid along the whole pore. In contrast, the external potential increases the average number of particles up to ⟨N⟩ = 6.32, strengthening the effects of the side walls, especially for the right wall at x = 10, where the potential intensity is larger compared to the left wall at x = −10.
B. MC simulation
The particle coordinates used to learn the external potential are simulated using the grand-canonical MC algorithm.35 This method allows us to simulate particle coordinates x of HR of radius R along the line L considering an external potential V(x) at a given μ. This technique is widely used in adsorption studies, so-called the (μ, υ, T) ensemble in the literature, given that we are imposing μ over a predefined geometry with system volume υ (for the 1D system in our case, the characteristic scale is L) at a given temperature T. In the simulations, we use reduced units for T, obtaining T = 1 and β = 1/T = 1. The grand-canonical MC algorithm is a computationally efficient and accurate method for our purposes, given the straightforward nature of our prototypical system. The simulation steps are as follows:
Draw a valid random configuration of particle coordinates for 1 ≤ N1 ≤ L/2R. Run step 2 until a desired number of configurations Nconf is obtained.
Attempt to displace a particle (step 3) or exchange a particle (step 4) with the reservoir maintaining a ratio of npav/nexc, where npav and nexc are the average numbers of attempts to displace particles and exchange particles, respectively, per cycle.
Displace a particle from :
Choose a particle randomly xi, 1 ≤ i ≤ Nj.
Compute the potential of current configuration V(xi).
Give the particle a random displacement, , where rand is a sample drawn from a uniform distribution U(0, 1).
Compute the potential of new configuration .
- Accept the new configuration with probabilityIf is rejected, is retained.(4)
Return to step 2.
Exchange a particle with the reservoir: Choose whether inserting or removing a particle with equal probability. If the outcome is to insert a particle, go to step 4.1., and to remove a particle, go to step 4.2.
Insert a particle into :
Add a new particle at random position, .
Compute the potential of the new particle .
- Accept the new configuration with probability(5)
If is rejected, is retained.
Return to step 2.
Remove a particle from :
Choose a particle randomly xi, 1 ≤ i ≤ Nj.
Compute the potential of the selected particle V(xi).
- Accept the new configuration with probability(6)
If is rejected, is retained.
Return to step 2.
In the above steps a number of constraints are satisfied. Because the HR behave as purely repulsive particles, if during a particle displacement (step 3) or particle insertion (step 4.1.), there is a volume overlap between the HR, the new configuration is automatically rejected going back to step 2. In addition, when there are no particles during the simulation steps, i.e., Nj = 0, the particle displacement (step 3) or the particle deletion (step 4.2.) is automatically skipped.
The algorithm runs an undefined number of iterations until obtaining the desired dataset , which gives us the particle coordinates . When M is large enough, the normalized histogram of and the exact solution ρ(x) from Eq. (2) converge as shown in Fig. 2.
III. BAYESIAN INFERENCE OF THE EXTERNAL POTENTIAL
Up to this point, we have computed the density profile ρ(x) of our grand-canonical 1D HR ensemble with external potential using the DFT formalism and the MC simulation. With both techniques, ρ(x) is obtained having a perfect knowledge of the external potential applied, V(x). In what follows, we aim at treating the external potential as an unknown function to be inferred. The exact ρ(x) obtained from Eq. (2), with the external potential V(x) applied, represents the target density profile for our external-potential learning problem. At this stage, we deal with the inverse problem of statistical mechanics.36 In this inverse problem, we use a set of particle coordinate observations to approximate the unidentified V(x). The approximated V(x) is then inserted in the DFT formulation to recover ρ(x). The greater the precision in determining V(x), the more reliable representation of ρ(x) will be.
The external-potential learning process involves using partial information coming from the MC simulated particle configurations. Our statistical learning framework is based on Bayesian inference. It allows us to approximate a probability distribution over possible external-potential functions conditioned by the supplied simulated data. We first outline the Bayesian inference methodology to obtain a probabilistic representation of the unknown external potential. We then comment on the algorithm used to construct the probabilistic formulation and discuss the results obtained.
A. External potential learning procedure
B. Posterior sampling
Obtaining samples from the posterior distribution might seem a challenging task when only information about the prior and the likelihood distributions is available. Indeed, the product yields a non-normalized form of the posterior distribution as seen in Eq. (7). However, the posterior distribution can be approximated by considering samples from a Markov chain whose stationary distribution converges to the desired posterior representation. Here, we apply the Metropolis–Hastings algorithm40 as a Markov chain MC method to obtain a large number of samples whose distribution resembles the posterior distribution.
As a proposal distribution P(Q*|Q), we consider a spherical-Gaussian distribution with zero mean and diagonal covariance matrix, . We assume the same step-transition level for each parameter of Q leading to , where I is an identity matrix. Note that the spherical-Gaussian distribution yields a symmetric transition probability P(Q*|Q) = P(Q|Q*).
Given that we have defined as in Eq. (9), ρ(x|Q) must be computed for each sample Q obtained from the Markov chain. Therefore, after sampling from the proposal distribution P(Q*|Q) to obtain a new Q, the external potential is computed following Eq. (8). Then, is inserted in the minimization problem stated in Eq. (2) to recover ρ(x|Q) in order to calculate the likelihood function, i.e., Eq. (9).
To assess the external-potential inference capabilities of our model we generate a dataset with Nconf = 3 × 105 and collect 800 independent configurations to avoid particle configuration correlation, obtaining a training dataset with M ≈ 5 × 103. We then run the Metropolis–Hastings algorithm for S = 105 steps with δ = 0.01 in the transition distribution to validate the inference procedure. The chain is then thinned every 5 steps to de-correlate it, obtaining 2 × 104 final samples. The resulting Markov chain of the parameter for is represented in Fig. 3(a). It is evident that the resulting chain yields a stationary distribution and a fast mixing.
The predictive distribution can be obtained by sampling for each Q. The predictive distribution and the exact external potential are superimposed in Fig. 4(a), showing a good approximation of the external potential through the proposed Bayesian scheme. It is worth noting that while the predicted external potential [shown as a black line in Fig. 4(a)] for the mean parameters in does not fully match the exact potential [shown as a red line (online) in Fig. 4(a)], it yields a density profile for almost identical to the exact , as demonstrated in Fig. 4(b). Recall that is computed solving the minimization problem in Eq. (2) inserting as the external potential.
From Fig. 4(a), the exact external potential [red line (online)] lies inside the 99% probability region (light gray area) of the stationary predictive distribution for the majority of the x-coordinates except for the small range x ∈ [−2.5, 2]. In addition, the highest discrepancy between the exact potential [red line (online)] and the predicted external potential for the mean parameters (black solid line) occurs in this range. Not surprisingly, this small range [−2.5, 2] where fails to represent the external potential corresponds to the region with the lowest density, as shown in detail in Fig. 4(b). Nevertheless, this misrepresented region has a negligible impact on the reconstruction of ρ(x) from the predictive distribution ρ(x|Q), since ρ(x) near the walls is inferred with high accuracy, capturing even the oscillatory behavior in x ∈ (5.5, 9.5).
The high accuracy in the ρ(x) approximation is achieved even with a training dataset whose histogram [see Fig. 4(b)] does not resemble the exact density profile as shown in Fig. 2. This precision highlights the efficiency and robustness of our Bayesian inference procedure. For the sake of evaluating the accuracy of our posterior distribution, we rerun the Metropolis–Hastings algorithm for two more training datasets, namely, and . While provides M ≈ 1.5 × 104 particle coordinates, generates M ≈ 3.7 × 104 particle coordinates. Evidently, the two datasets yield one order of magnitude above the baseline scenario, i.e., . The resulting chains and histograms for each dataset are represented in Fig. 3. The histograms in Fig. 3(b) attest that increasing the number of particle coordinates reduces the variance of the estimated parameter obtaining contracted posterior distributions. The same behavior is observed for the remaining Q parameters. This variance reduction propagates through the predictive distributions and ρ(x|Q), diminishing the inference uncertainty. Such uncertainty reduction can be observed in Fig. 4(a), where the 99% probability region of for (dark gray area) is narrower compared to the 99% probability region of for (light gray area).
Furthermore, constitutes our most faithful scenario to infer the external potential because we are considering more datapoints. Indeed, the higher performance of the Bayesian inference procedure for compared to can be seen in the density comparison for x ∈ [−7.5, 2.5] in Fig. 4(b), where (black dashed line) approximates better the HR DFT density [red line (online)] compared to (black solid line). On the other hand, the histogram obtained for in Fig. 3(b) is within the estimated range of the histogram generated by . This suggests that, with fewer datapoints, we achieve a rough potential distribution with higher standard deviation (uncertainty), although the obtained distribution still represents a credible estimation of the external potential.
We eventually determine a lower bound of particle coordinates at which our Bayesian procedure becomes limited in its ability to accurately infer the external potential. Considering a training dataset with M ≈ 2.3 × 103 particle coordinates, we found that the resulting Markov chains of the Q parameters quickly oscillate in the steady-state around value levels far from the ones obtained for . This means that the Q samples obtained from the stationary distribution for are unable to efficiently recover the external potential.
The description of the microscopic interactions of a many-particle system is a long-standing issue across several scientific and engineering disciplines. The presence of an external potential in a many-particle system dramatically alters these interactions. For example, the governing external field plays a crucial role in fluid-substrate interactions like adsorption or wetting transitions. Therefore, knowing the external potential exerted over a many-particle system is key to making advances in the theoretical-computational exploration of novel complex materials.
Our overarching objective here is to push forward existing techniques to uncover the potential function between the external field and the many-particle ensemble. For this purpose we adopted a Bayesian framework, exemplified with a grand-canonical 1D HR system in a confined geometry under the influence of an external potential. The training dataset to infer the external potential is generated using a grand-canonical MC simulation. From the simulated data, the Bayesian framework reconstructs the external potential, which is embedded into the classical DFT formulation to generate the equilibrium-density function. The resulting density is then compared with the exact density output produced by the DFT apparatus using the true external potential term. Hence, the validity of our framework is contingent upon the degree to which the inferred potential accurately leads to the same density function. Despite the simplicity of our prototype, its core functionality resembles many real-world applications involving adsorption or capillarity features, and our statistical learning framework can be employed for empirical-microscopic observations of systems exhibiting a similar configuration to our computational set-up, thus, driving a more efficient, rational, and systematic study of fluid-substrate interactions.
The primary benefit of our Bayesian methodology lies in its ability to quantitatively assess uncertainty throughout all modeling hierarchy levels. This uncertainty quantification is rooted in the potential-parameter estimation as well as in the density-profile distribution obtained after inserting the inferred potential in the classical DFT formulation. In fact, we have demonstrated the existing trade-off between the degree of uncertainty in the modeling and the amount of available data. This uncertainty-data-availability relationship and the probabilistic description of the results motivate a much improved interpretation of the modeling output over classical ML techniques such as neural networks.
At the heart of our framework is the functional estimation of the external potential acting on the system at hand for which a basic Gaussian RBF has been proposed. Here, only three terms in the RBF, six parameters in total, are implemented to approximate the external potential. This small number of parameters is a direct consequence of the simple external-potential functional form we have imposed. Further improvements of the external potential evaluation can be carried out considering more terms in the RBF expansions or a polynomial expansion providing capabilities to represent more complex potential functions.
While our learning framework has shown a very good performance in reconstructing the external potential, there is scope for improvement and further refinement. First, of particular interest would be extension to higher-dimensional many-particle systems, which, at least theoretically, seems straightforward to implement if the one-body density is defined appropriately. However, a higher dimensional configuration would incur greater computational costs, thus hindering the sampling procedure. This computational constraint should be painstakingly addressed. Another interesting extension would be to utilize appropriate coarse-graining techniques to generalize the interparticle-interaction potential to account for subtle details of complex microscopic interactions. Such techniques typically decompose the external potential function into different terms serving specific purposes. Future research could be devoted to inferring each component of the external potential decomposition using the Bayesian inference framework developed here.
A.M. was supported by Imperial College London President’s Ph.D. Scholarship scheme. P.Y. was supported by Wave 1 of The UKRI Strategic Priorities Fund under EPSRC Grant No. EP/T001569/1, particularly the “Digital Twins for Complex Engineering Systems” theme within that grant, and The Alan Turing Institute. S.K. was supported by ERC through Advanced Grant No. 247031 and EPSRC through Grant Nos. EP/L025159 and EP/L020564.
Conflict of Interest
The authors have no conflicts to disclose.
Antonio Malpica-Morales: Conceptualization (equal); Data curation (equal); Formal analysis (equal); Funding acquisition (equal); Investigation (equal); Methodology (equal); Project administration (equal); Resources (equal); Software (equal); Supervision (equal); Validation (equal); Visualization (equal); Writing – original draft (equal); Writing – review & editing (equal). Peter Yatsyshin: Conceptualization (equal); Data curation (equal); Formal analysis (equal); Funding acquisition (equal); Investigation (equal); Methodology (equal); Project administration (equal); Resources (equal); Software (equal); Supervision (equal); Validation (equal); Visualization (equal); Writing – original draft (equal); Writing – review & editing (equal). Miguel A. Durán-Olivencia: Conceptualization (equal); Data curation (equal); Formal analysis (equal); Funding acquisition (equal); Investigation (equal); Methodology (equal); Project administration (equal); Resources (equal); Software (equal); Supervision (equal); Validation (equal); Visualization (equal); Writing – original draft (equal); Writing – review & editing (equal). Serafim Kalliadasis: Conceptualization (equal); Data curation (equal); Formal analysis (equal); Funding acquisition (equal); Investigation (equal); Methodology (equal); Project administration (equal); Resources (equal); Software (equal); Supervision (equal); Validation (equal); Visualization (equal); Writing – original draft (equal); Writing – review & editing (equal).
The data that support the findings of this study are available from the corresponding author upon reasonable request.