Optimization of materials’ performance for specific applications often requires balancing multiple aspects of materials’ functionality. Even for the cases where a generative physical model of material behavior is known and reliable, this often requires search over multidimensional function space to identify low-dimensional manifold corresponding to the required Pareto front. Here, we introduce the multi-objective Bayesian optimization (MOBO) workflow for the ferroelectric/antiferroelectric performance optimization for memory and energy storage applications based on the numerical solution of the Ginzburg–Landau equation with electrochemical or semiconducting boundary conditions. MOBO is a low computational cost optimization tool for expensive multi-objective functions, where we update posterior surrogate Gaussian process models from prior evaluations and then select future evaluations from maximizing an acquisition function. Using the parameters for a prototype bulk antiferroelectric (PbZrO3), we first develop a physics-driven decision tree of target functions from the loop structures. We further develop a physics-driven MOBO architecture to explore multidimensional parameter space and build Pareto-frontiers by maximizing two target functions jointly—energy storage and loss. This approach allows for rapid initial materials and device parameter selection for a given application and can be further expanded toward the active experiment setting. The associated notebooks provide both the tutorial on MOBO and allow us to reproduce the reported analyses and apply them to other systems (https://github.com/arpanbiswas52/MOBO_AFI_Supplements).

Ferroelectric (FE) and antiferroelectric (AFE) materials are one of the key material classes for memory and energy storage applications.1 For ferroelectrics, the classical applications include ferroelectric capacitors2,3 as well as emergent applications such as ferroelectric tunneling devices4–6 and long-sought ferroelectric field effect transistors.7,8 The complex ferroelectric structures are of broad interest in the context of ferroelectric device integration as well as enabling new functionalities such as composite magnetoelectric devices.9,10 Similarly, antiferroelectric materials are broadly considered for potential energy storage, either in the form of on-chip components or in the capacitive or electrolytically gated configurations. Common for both ferroelectric and antiferroelectric materials is the presence of polar instability, which is a ground state for ferroelectrics and is suppressed by the structural instability at zero fields for antiferroelectrics.11,12 Correspondingly, FE and AFE materials both allow for explanation in the form of Ginzburg–Landau-Devonshire (GLD)-type theories1,13–15 with several competing order parameters, which include a polarization for multiaxial FE and a structural order parameter for AFE systems. This formalism allows for the direct prediction of corresponding ground states, concentration, field and strain responses, and serves as the basis for the phase-field models.16–18 

The potential applications of ferroelectrics in information technology have stimulated over two decades of effort in integrating the classical perovskite-based systems with semiconductors, stymied by the intrinsic instability of the oxide–semiconductor interfaces. The development of HfO2 and other binary ferroelectrics by Mikolajick et al. in early 2011 have stimulated new and rapidly growing research effort in this area,19–23 also stimulating the search for and discovery of novel binary systems such as (AlxSc1−xN)24,25 and (Mg1−xZnxO).26 Notably, despite significant controversies on the origin of ferroelectric-like responses in these systems,27–31 these material systems also allow effective representation via GLD formalism, predicting dopant-dependent phase transitions and temperature-polarization behavior.32,33

Both for conventional and binary ferroelectrics, of fundamental interest is their behavior and performance in the realistic devices and structures, i.e., in the form of films, multilayers, or 3D integrated objects.34–41 The intrinsic complexity of this problem stems from the non-local nature of the depolarization phenomena in ferroelectric systems, where the distributions of the polarization and screening charges on the bounding interfaces result in the dominant contribution to the thermodynamics of the system. Traditionally, these phenomena are analyzed using effective dead layer models, where the complex electrostatic structure of the interface is approximated via the introduction of an effective dielectric layer.14,42 However, this approximation is limited to more realistic scenarios including an open ferroelectric surface with ionic screening, interfaces between ferroelectric and ionically conductive and electrochemically active materials, and interfaces between combining ferroelectric and semiconductor functionalities, i.e., ferroelectric–semiconductor interfaces,43,44 interfaces between ferroelectric–semiconductor and other materials, and interfaces with intrinsic finite density of electronic (or ionic) states.

Despite the apparent variability of these systems, this broad range of phenomena allows ready representation in terms of formalism combining the Landau–Ginzburg–Devonshire (LGD) description of the ferroelectric system with the corresponding surface and bulk semiconductor subsystem or surface/interface electrochemical models. For many ferroelectric materials, corresponding free energy coefficients are well known, as are their temperature, concentration, and strain dependencies, whereas others can be readily measured via classical semiconductor characterization techniques. As such, they are amenable for numerical assessment via approximate analytical, 1D numerical, and 3D numerical models. However, many of these models are time consuming and bring forth the challenge of effective navigation of possible parameter space toward target functionalities or their combination.

Here, we introduce the combination of multi-objective Bayesian optimization (MOBO) and ferroelectric-electrochemical model for figure of merit optimization for memory and energy storage devices. The analysis here is developed for a 1D model that does not allow for the formation of domain structures but can be trivially extended to the full 3D finite element models. We develop the general formalism for the (anti)ferroelectric with the interface control in Sec. I, introduce the Bayesian optimization and multi-objective Bayesian optimization in Sec. II, and demonstrate the exploration of optimal functionalities in the interface-controlled ferroelectrics in Sec. III. The analyses reported here are summarized in the associated Colab notebooks that can both serve as tutorials and allow us to reproduce the results in this manuscript and apply the same analytics for other models (https://github.com/arpanbiswas52/MOBO_AFI_Supplements).112 

Traditionally, interfacial phenomena in ferroelectrics are studied in the dielectric dead layer approximation, effectively allowing for separation between polarization bound charges and screening charges. While capturing some aspects of polarization behavior in ferroelectrics, this model is insufficient to describe the surface and interface phenomena in ferroelectrics in the presence of finite electronic or ionic density of states. For ferroelectric–semiconductor interfaces, nonlinear screening phenomena and their influence on the contact barrier can be important, while the phenomena are usually either neglected (model of constant screening charge density) or linear approximation is used (the Bardeen states model).45–47 Similarly, a number of authors have explored ferroelectric semiconductors, where both ferroelectric and semi-conductive subsystems coexist in a single material.48 Finally, for surfaces, a complementary thermodynamic approach was developed by Stephenson and Highland (SH),49,50 who consider an ultrathin film in equilibrium with a chemical environment that supplies (positive and negative) ionic species to compensate its polarization bound charge at the surface.

Recently, we modified the SH approach allowing for the presence of a gap between the ferroelectric surface covered by ions and an SPM (scanning probe microscopy) tip39,51–54 and developed the analytical description for thermodynamics and kinetics of these systems. The analysis leads to the elucidation of ferroionic states, which are the result of the nonlinear electrostatic interaction between the ions with the surface charge density described via Langmuir adsorption isotherm and ferroelectric dipoles. Recently, the emergent behaviors in uniaxial antiferroelectric (AFE) thin films covered with surface ions have been explored within the framework of the Kittel–Landau–Ginzburg–Devonshire–Stephenson–Highland (KLGD-SH) approach.55 The approach allows the characterization of the polar and anti-polar ordering’s dependence on the voltage applied to the antiferroelectric film, and the analysis of their static and dynamic hysteresis loops.

Below, we introduce the general formalism for polarization rotation in AFE films covered by surface ions with a charge density proportional to the relative partial oxygen pressure. We calculate the phase diagrams and analyze the dependence of polarization and anti-polarization on applied voltage and discuss the peculiarities of static and dynamic hysteresis loops. It is important to note that, while here we consider a special case of adsorption isotherm, this analysis can be readily extended toward other functional forms and mechanisms that give rise to finite electron or ion density of states at the interface. While the exact nature of the carriers will affect the transport and reaction mechanisms, the ferroelectric responses are expected to be universal.

Here, we consider the case of ultrathin dielectric gaps between the top electrode and the surface of AFE film with the controlled electrochemical potential of mobile component, and the control is performed by the oxygen pressure [see Fig. 1(a)]. The linear equation of state D=ε0εdE relates an electric displacement D and field E in the gap. Here, ε0 is a universal dielectric constant and εd(110) is a relative permittivity in the gap filled by air with controllable oxygen pressure. A wide bandgap AFE film can be considered insulating, and hence within the material D=ε0E+P. A potential ϕ of a quasi-static electric field inside the film satisfies a Laplace equation in the gap and a Poisson equation in the film, with the bound charge density determined by the polarization gradient. The boundary conditions for the system are the equivalence of the electric potential to the applied voltage U at the top electrode (modeled by a flattened region z=λ); and the equivalence of the difference D3(gap)D3(film) to the ionic surface charge density σ[ϕ(r)] at z=0; the continuity of the ϕ at gap–film interface z=0; and zero potential at the conducting bottom electrode z=h.

The polarization components of the AFE film depend on the electric field Ei as Pi=Pif+ε0(εijb1)Ej, where i=1,2,3, and ε33b is a relative background permittivity of antiferroelectric, εijb10].56 The polarization component Pif is further abbreviated as Pi. To determine the spatial–temporal evolution of Pi, we use either the simplest two-sublattice Kittel model,57 or more complicated models incorporating anti-polar mode related oxygen octahedral rotations58–60 combined with the LGD approach. Corresponding LGD free energy functional F additively includes a bulk part—an expansion on powers of 2–4 of the polarization (Pi) and anti-polarization (Ai), Fbulk; a polarization gradient energy contribution, Fgrad; an electrostatic contribution, Fel; and a surface energy, FS. It has the form

(1)

where the constituent parts are

(2a)
(2b)
(2c)
(2d)

Here, Vf is the film volume. The coefficients ai and bi linearly depend on the temperature T; in particular, ai=αT(TPT) and bi=αT(TAT), so they change their sign at Curie temperature TP and AFE temperature TA, respectively; αT>0. All other tensors in Eq. (2) are regarded as temperature-independent. The tensors aijk and bijk and the gradient coefficients tensor gijkl are positively defined for the functional stability. The elastic and electrostriction energies are negligibly small for a single-domain film on a matched substrate. Otherwise, the influence of elastic strain can be approximately taken into account by the renormalization of coefficients in Eq. (2a) (see, e.g., Ref. 1).

The energy Fel is dependent on the surface charge density σ(ϕ), since the electric field is ultimately determined by the charge distribution in the system. Several models have been introduced for the charge variation.39,47 In principle, these models can incorporate specific DOS and electric potential at the interface, but only a few are sensitive to the changes of interfacial potential in a self-consistent manner.

Below, we use an equation for the surface ionic charge density σ(ϕ) analogous to the Langmuir adsorption isotherm61 and a linear relaxation model to describe the temporal dynamics of the surface charge density,

(3)

Here, the dependence of an equilibrium charge density σ0[ϕ] on the electric potential ϕ is controlled by the concentration of surface ions, θi(ϕ), at the interface z=0 in a self-consistent manner, as proposed by Stephenson and Highland,49,50

(4)

where e is an elementary charge, Zi is the ionization degree of the surface ions/electrons, and 1/Ni are saturation densities of the surface ions. A subscript i designates the summation on positive (i=1) and negative (i=2) charges, respectively; ρ=pO200pO2 is the relative partial pressure of oxygen (or other ambient gas);14,49ni is the number of surface ions created per gas molecule. Two surface charge species exist, since the gas molecule had been electroneutral before its electrochemical decomposition started. The dimensionless ratio ρ is chosen to vary in a range from 106 to 106.

Positive parameters ΔG100 and ΔG200 are the free energies of the surface defects formation at normal conditions, pO200=1 bar, and zero applied voltage U=0. The energies ΔGi00 are responsible for the formation of different surface charge states (ions, vacancies, or their complexes). Specifically, exact values of ΔGi00 are poorly known even for many practically important cases, and so hereinafter they are regarded varying in the range ∼(0–1) eV.49 Note that exact values of the parameters can be derived either via variable pressure experiments or by exploring polarization dynamics under electrochemical control.62 However, we refer the theory–experiment matching to further studies. Schematic dependence of the normalized charge densities σ0 vs dimensionless electric potential ψ calculated for the SH model in comparison with other models is shown in Fig. 1(b).

Since the stabilization of a single-domain polarization in ultrathin perovskite films can take place due to the chemical switching (see, e.g., Refs. 49, 50, and 63), we can assume the same for an AFE film. Also, we will assume that the distributions of Pi(x,y,z) and Ai(x,y,z) do not deviate significantly from their values averaged over the film thickness, which are further abbreviated as “polarization” PiPi and “anti-polarization” AiAi. Note that this mean-field approximation reduces the problem to zero dimensional and allows for analytical treatment as performed here.

In this case, the behavior of the polarization Pi and anti-polarization Ai can be described via relaxation-type nonlinear differential equations,

(5)

Hereinafter, we assume that the in-plane electric field components are absent, E1=E2=0, and consider only the component E3 in Eq. (5). The expression for electric field E3 follows from the minimization of electrostatic energy,55 

(6)

Equation (6) corresponds either to the stationary case or to the adiabatic conditions, when σ=σ0[Ψ] in Eq. (3). The overpotential Ψ contains the contribution from surface charges proportional to σ0, the depolarization field contribution proportional to P, and the external potential drop proportional to U. Hence, the component E3 consists of the external field, depolarization field, and the field created by the surface ion layer.

The constituent parts of the thermodynamic potential (1) can be further expanded in Ai, Pi, and Ψ powers, assuming that |eZiΨ/kBT|1. As a result, we obtain that the surface ions and depolarization field change the coefficients a3, ai3, and χi3; induce the built-in field ESI; and decrease the external field Ea in the following way:

(7a)
(7b)
(7c)
(7d)

The last term in Eq. (7a), λ2ε0(εdh+λε33b), originated from the depolarization field. Since, as a rule, εdhλε33b, the acting field is close to an external field, EactUh. Also, we introduce positive functions in Eq. (7),

(7e)
(7f)

It is seen from the expressions (6) and (7) that possible solutions of relaxation Eq. (5) and, in particular, the features of static and dynamic polarization (P–E) hysteresis loops depend on a great number of parameters, some of which can vary in a relatively wide range. Specifically, besides the dynamic variables, such as applied pressure U, temperature T, and partial oxygen pressure ρ, one can vary the film thickness h and the gap thickness λ, as well as the energies ΔGi00 of the surface defects formation and their saturation densities 1/Ni. Not the least, the coupling constants χij are poorly known, as are the fitting parameters.

FIG. 1.

(a) Layout of the considered system, consisting of electron conducting bottom electrode, ferroelectric (AFE) film, screening layer with charge density σ(ϕ), and ultrathin gap separating the film surface and the top electrode. (b) Schematic dependence of the normalized charge densities σ0 vs dimensionless electric potential ψ calculated for the models of Bardeen-type surface states (black line “BS”) and Fermi–Dirac density of states (red curve “FD”) describing 2D electron gas, in comparison with the Stephenson–Highland model describing the surface charge density of absorbed ions (blue step-like curve “SH”). Adapted from Morozovska et al., Acta Mater. 160, 57–71 (2018).47 Copyright 2018 Acta Materialia Inc. Published by Elsevier Ltd. (c) A typical phase diagram of an AFE film dependent on the temperature T and relative partial oxygen pressure ρ. There are an antiferroelectric (AFE) phase coexisting with a weak ferroelectric (FE) phase, a ferroelectric-like antiferroionic (AFI) phase, and an electret-like paraelectric (PE) phase. Free energy maps at zero electric field E=0 (upper insets) and the polar (P) and anti-polar (A) order parameters dependence on the static E-field (bottom insets) are calculated in each phase. Adapted from Morozovska et al., Phys. Rev. Appl. 16, 044053 (2021);55 Copyright 2021 APS.

FIG. 1.

(a) Layout of the considered system, consisting of electron conducting bottom electrode, ferroelectric (AFE) film, screening layer with charge density σ(ϕ), and ultrathin gap separating the film surface and the top electrode. (b) Schematic dependence of the normalized charge densities σ0 vs dimensionless electric potential ψ calculated for the models of Bardeen-type surface states (black line “BS”) and Fermi–Dirac density of states (red curve “FD”) describing 2D electron gas, in comparison with the Stephenson–Highland model describing the surface charge density of absorbed ions (blue step-like curve “SH”). Adapted from Morozovska et al., Acta Mater. 160, 57–71 (2018).47 Copyright 2018 Acta Materialia Inc. Published by Elsevier Ltd. (c) A typical phase diagram of an AFE film dependent on the temperature T and relative partial oxygen pressure ρ. There are an antiferroelectric (AFE) phase coexisting with a weak ferroelectric (FE) phase, a ferroelectric-like antiferroionic (AFI) phase, and an electret-like paraelectric (PE) phase. Free energy maps at zero electric field E=0 (upper insets) and the polar (P) and anti-polar (A) order parameters dependence on the static E-field (bottom insets) are calculated in each phase. Adapted from Morozovska et al., Phys. Rev. Appl. 16, 044053 (2021);55 Copyright 2021 APS.

Close modal

Our preliminary analysis of the phase diagrams calculated for a simplified 2–4-LGD expansion [without 6th powers in Eq. (2a)] show that there are multiple possible phases in the system, such as paraelectric (PE) and multiple ferroelectric-like (FE), antiferroelectric, and ferroionic phases nontrivially located in the space of variables {T,ρ,U} ruled by the other parameters, namely, external ones (h and λ), interfacial, and material constants (ΔGi00, 1/Ni, and χij), see Fig. 1(c) for illustration of the influence of the film thickness h on the phase diagram. We expect multiple solutions routes, especially in a dynamic case, and the specific route critically depends on the point in the above multi-parameter space, via, e.g., interfacial conditions (controlled by λ), finite size effects (controlled by h), and maybe the most intriguing via the surface ions coupling to the soft modes in the AFE film (controlled by parameters ΔGi00, 1/Ni, and χij). These control parameters define the multidimensional parameter space of the system.

Practically of interest for applications are the macroscopic responses of the system, including energy storage and losses for a given film thickness and bias window. With the formalism above, the relevant P–E loops can be readily evaluated, and parameters can be extracted. However, the relationships between these can be highly non-trivial, with a change in a single parameter affecting multiple aspects of materials functionality. Combined with the high-dimensional nature of the parameter space, this precludes the simple gradient descent or grid search methods. This necessitates the introduction of Bayesian optimization methods capable to combine exploration and exploitation of the relevant parameter spaces. Previously, we adopted these approaches for single-parameter optimization of ferroelectric materials48 and lattice models1 and theory experiment matching.64 Below, we discuss the basic principles of BO and particularly the multi-objective BO, necessary for discovery balancing multiple functionalities.

Bayesian optimization65 (BO) is an emerging field of study in sequential design methods. It has been considered a low cost computationally global optimization tool for design problems having expensive black-box objective functions. The general idea of BO is to emulate unknown functional behavior in the given parameter space and find the local and global optimal locations, while reducing the cost of function evaluations from expensive high-fidelity models. The reason it is called Bayesian is that it follows the ideology of the Bayes theorem, which states that “posterior probability of a model (or parameters) M given evidence (or data, or observations) E is proportional to the likelihood of E given M, multiplied by the prior probability of E.” Mathematically,

(8)

In the BO setting, the prior represents the belief of the unknown function f, assuming there exists prior knowledge about the function (e.g., smoothness, noisy or noise-free, etc.). Given the prior belief, the likelihood represents how likely is the observed data, D. These data are the sampled data as the true function value from expensive evaluations and can be viewed as the realizations from the unknown function. Finally, given the data, the posterior distribution is computed in the BO as a posterior surrogate model [e.g., Gaussian process model (GPM)]: Δ=p(f|(D)) is developed from these sampled data. Thus, Eq. (9) in the Bayesian optimization setting can be modified as below,

(9)

where D1:k=[x1:k,f(x1:k)] is the augmentation of the observation or sampled data till kth iteration of BO. Unlike the general Bayesian modeling, here prior or likelihood function is not written separately, but the augmentation of data at each iteration combines the prior distribution with the likelihood function. Generally, the GPM posterior model is considered based on conjugate normal prior assuming the data are realized from the Gaussian (normal) function.

This approach has been widely used in many machine learning problems.66–70 However, attempts have been made when the response is discrete such as in consumer modeling problems where the responses are in terms of user preference.65,71 The idea is to approximate the user preference discrete response function into continuous latent functions using the binomial-probit model for binary choices72,73 and polychotomous regression model for more than two choices where the user can state no preference.74 In the domain of materials science, though the optimization problem in material characterization and syntheses have expensive (computational) or unknown (physical experiments) functions, the application of BO has not progressed much yet.75 In some notable collaboration between BO and material science, researchers found a similar performance with lot lesser BO evaluations than exhaustive evaluations in optimizing different properties of materials like grain boundary structure, grain boundary energy, elastic properties, etc., thus reducing cost and increasing efficiency.76–79 Even if the functions are computationally inexpensive, BO allows for quick convergence for very large parameter space than doing an exhaustive search.80 Furthermore, BO is also applied in optimizing the hysteresis loop shape in ferroelectric materials using a continuous lattice model.81 Just like in expensive computational simulations, BO also helped in reducing the effort and resource cost in experimental materials science problems.82–84 However, the vast majority of these applications address only single-parameter optimization, whereas in many cases, we aim to discover the optimal balance between two or more functionalities, as implemented in multi-objective BO. To date, there are only a few applications of MOBO for material discovery in optimizing multiple target properties, considering both experimental and computer simulated data.85 MOBO has also been applied for efficient discovery of the targeted materials, performed by a thermodynamically consistent micromechanical model that predicts the material’s response based on its composition and microstructure.86 

BO adopts a Bayesian perspective and assumes that there is a prior on the function; typically, a Gaussian process. The prior is represented from the true model or experiments, which are assumed as the realizations of the true function, are treated as training data. The overall Bayesian optimization approach has two major components: a predictor or Gaussian process model (GPM) and an acquisition function (AF). As shown in Fig. 2, in this approach, a posterior GPM is first built or iteratively updated, given the data from the current evaluated locations (red dots). The surrogate GPM then predicts the realizations of the function at the unexplored locations over the control parameter space. The best locations are then strategically selected in the space (green dots) for future expensive evaluations by maximizing the acquisition functions, defined from the posterior GPM simulations. Finally, after the evaluations are done, they are augmented with the existing evaluations and the cycle is repeated till the model is converged.

FIG. 2.

Bayesian optimization framework.

FIG. 2.

Bayesian optimization framework.

Close modal

It is important to mention that as an alternative to a GPM, random forest regression has been proposed as an expressive and flexible surrogate model in the context of sequential model-based algorithm configuration.87 Although random forests are good interpolators in the sense that they output good predictions in the neighborhood of training data, they are very poor extrapolators where the training data are far away.88 This can lead to selecting redundant exploration (more experiments) in the non-interesting region as suggested by the acquisition function in the early iterations of the optimization, due to having additional prediction errors of the region far away from the training data. This motivates us to consider the GPM in a Bayesian framework while extending the application to complex physics-driven problems.

The numerical optimization problems can be classified on the basis of a number of objective functions as single (SOO) and multi-objective optimization problems (MOOs). MOO is the extension of SOO having more than one objective as in Eq. (10),

(10)

It is obvious that SOO is relatively simpler with lower computational cost; however, in practical problems, it may be a challenge to formulate a single objective problem and, therefore, much focus has been given on multi-objective optimization (MOO) methods. The question generally attempted to solve in any MOO is the optimal decisions under user defined preferences of objectives, which are referred to as Pareto-optimal solutions. Considering a minimization (maximization) problem, a candidate point is a Pareto-optimal solution if there is no other feasible point, which gives a lower (higher) minimum (maximum) objective function for at least one of the objectives without the sacrifice from others. Such Pareto-optimal solutions at different trade-offs of objectives are represented by a Pareto-frontier. Pareto-optimal solutions are also referred to as non-dominated solutions in the design or objective criterion space. The methods to solve MOO problems can be classified into a priori and a posteriori methods based on whether we predefine the weightage associated to each objective or not respectively. Similarly, in the Bayesian framework, MOBO is the extension of BO, where we focus on optimizing multiple expensive responses or objective functions and build a Pareto-frontier. We next describe the GPM and AF.

As a simple example, Figs. 3(a)3(f) show a simple 1D Gaussian process model with one control parameter x and one objective function variable z=f(x). The green dots are the evaluated locations, and the gray dotted and blue solid lines are the true and the predictor mean functions in the parameter space, given the prior evaluated data. The area enclosed by the black curves shows the measure of uncertainty over the surrogate GPM prediction. It is clearly seen that the variance near the observations is small and increases as the design samples are farther away from the evaluated data, thereby related to kriging models where the errors are not independent. Much research has been ongoing regarding incorporating and quantifying uncertainty of the experimental or training data by using a nugget term in the predictor GPM. It has been found that the nugget provides a better solution and a computational stability framework.89,90 Furthermore, GPM has also been implemented in high-dimensional design space exploration91 and BIG DATA problems,92 as an attempt to increase computational efficiency. A survey of implementation of different GP packages has been provided in different coding languages such as MATLAB, R, and Python.93 Extending BO to MOBO, we have multiple Gaussian priors, one for each expensive functions, and therefore we develop respective multiple posterior GPMs. Since, in classical MOO problems, each function is treated independently during optimization, here also in MOBO, we consider multiple independently fitted GPMs.

FIG. 3.

1D Gaussian process and search space exploration by maximizing acquisition function. The top images of (a)–(f) are the Gaussian process illustration. Gray dotted lines are the true function value, blue solid lines are the GP predicted mean, black solid lines are the GP mean ± 2× standard deviation. Green dots are the current evaluated locations. Red dots are the new locations for next evaluations. The bottom images of (a)–(f) are the acquisition function illustration. Blue solid lines are acquisition function. Red dots are the new locations (at maximum acquisition function value) for next evaluations.

FIG. 3.

1D Gaussian process and search space exploration by maximizing acquisition function. The top images of (a)–(f) are the Gaussian process illustration. Gray dotted lines are the true function value, blue solid lines are the GP predicted mean, black solid lines are the GP mean ± 2× standard deviation. Green dots are the current evaluated locations. Red dots are the new locations for next evaluations. The bottom images of (a)–(f) are the acquisition function illustration. Blue solid lines are acquisition function. Red dots are the new locations (at maximum acquisition function value) for next evaluations.

Close modal

The general form of the GPM is as follows:

(11)

where xTβ is the polynomial regression model. The polynomial regression model captures the global trend of the data. In general, the first order polynomial regression is used, which is also known as universal kriging;94 however, it has also been claimed that it is fine to use a constant mean model.95 z(x) is a realization of a correlated Gaussian process with mean E[z(x)] and covariance cov(xi,xj) functions defined as follows:

(12a)
(12b)
(12c)

where σ2 is the overall variance parameter and θm is the correlation length scale parameter in dimension m of d dimension of x. These are termed as the hyper-parameters of GP model. R(xi,xj) is the spatial correlation function. In this paper, we have considered the constant mean model and a radial basis function, which is given by Eq. (12c). The objective is to estimate (by MLE) the hyper-parameters σ,θm that create the surrogate model that best explains the training data Dk at iteration k.

After the GP model is fitted, the next task of the GP model is to predict at an arbitrary (unexplored) location drawn from the parameter space. Assume Dk={Xk,Y(Xk)} is the prior information from previous evaluations or experiments from high-fidelity models, and x¯¯k+1X¯¯ is a new design within the unexplored locations in the parameter space, X¯¯. The predictive output distribution of xk+1, given the posterior GP model, is given by Eq. (13a),

(13a)

where

(13b)
(13c)

COVk is the kernel matrix of already sampled designs Xk, and covk+1 is the covariance function of new design x¯¯k+1, which is defined as follows:

The second major component in Bayesian optimization is the acquisition function (AF). AF guides the search for future evaluations or experiments toward the desired goal and thereby brings the sequential design into the BO. The AF predicts an improvement metric for each sample. The improvement metric depends on exploration (probability to discover useful behaviors in unexplored locations) and exploitation (known region with high responses). Thus, the acquisition function gives high value of improvement to the samples whose mean prediction is high, variance is high, or a combination of both. By maximizing the acquisition function, we select the best samples to find the optimum solution and reduce the uncertainty of the unknown expensive design space. Figures 3(a)3(f) show a simple example of BO exploration with one control parameter x and one objective function variable z=f(x) of the sequential selection of samples by maximizing the acquisition function, given the posterior GP model in iterations 1–3, 7, 15, and 25. The solid blue line is the GP predicted function and the dotted gray line is the original function. The green dots in iteration 1 [Fig. 3(a), top panel] are the evaluated locations and the red dot is the new evaluated location, as per maximizing the acquisition function [Fig. 3(a), bottom panel]. The new data are augmented with current data and the GPM is updated in the next iteration, and then the acquisition function is maximized to get the next evaluated locations [see Fig. (3b)]. This process is repeated iteratively, and in this manner, the search space is explored. We can see from the figure that the acquisition function value is the highest where the samples have high prediction mean and/or high variance and the lowest where the samples have low prediction, low variance, or both. Thus, the acquisition function strategically selects points that have the potential to have the optimal (e.g., maximum value of the unknown function) and gradually reduces the error and aligns with the true function, at those potential region, with the iterations.

Throughout the years, various formulations have been applied to define the acquisition functions. One such method is the probability of improvement (PI),96 which is an improvement-based acquisition function. Jones97 notes that the performance of PI “is truly impressive; however, the difficulty is that the PI method is extremely sensitive to the choice of the target. If the desired improvement is too small, the search will be highly local and will only move on to search globally after searching nearly exhaustively around the current best point. On the other hand, if the small-valued tolerance parameter ξ in PI equation is set too high, the search will be excessively global, and the algorithm will be slow to fine-tune any promising solutions.” Thus, the expected improvement acquisition function (EI)65 is widely used over PI, which is a trade-off between exploration and exploitation. Another acquisition function is the confidence bound criteria (CB), introduced by Cox and John,98 where the selection of points is based on the upper or lower confidence bound of the predicted design surface for maximization or minimization problem respectively. However, these acquisition functions are limited for the case when there is a single optimization functionality. Hence, we further discuss the acquisition functions for MOBO, where the target is to find the manifold in the parameter space of the system containing the points on the Pareto-frontier for the system.

In MOBO, the computation of the acquisition function is more complicated and can be computationally costly. Currently, there are two approaches to configure the acquisition functions in MOBO, namely, a priori method and a posteriori method. The two differ in whether the weighting factors for each objective are predefined or not. This is a similar technique like classical MOO problems. However, unlike MOO where the weighting factors are associated with objective functions, in MOBO, the same are associated with acquisition functions. In the a posteriori approach, the goal is to maximize the hyper-volume indicator function to maximize the likelihood of iteratively selecting non-dominated solutions (green dots) as shown in Fig. 4. The shaded area is the hyper-volume at iteration 1. As we proceed from iteration 1 to 2, we found three additional non-dominated solutions, where two of the earlier non-dominated solutions are now dominated (blue dots), which guaranteed that those solutions are not Pareto-optimal. Eventually, we see an increase of hyper-volume in iteration 2 (red colored area). This reselection of dominated solutions goes on iteratively until the convergence criteria are met, which finally configures the Pareto-frontier by the set of current dominated solutions. Different methods in this approach are expected improvement hyper-volume (EIHV), expected hyper-volume gradient-based (EIHVG), max-value entropy search (MESMO), and predictive entropy search (PESMO) acquisition functions. EIHV has been modeled to provide better performance.99,100 However, to increase the computational efficiency, Yang et al.101 modified the EIHV acquisition function into the EIHVG acquisition function and proposed an efficient algorithm to calculate it. To reduce the computational cost, MESMO102 and PESMO103 have been formulated. Abdolshah et al.104 proposed a multi-objective BO framework with preference over objectives on the basis of stability and attempting to focus on the Pareto front where preferred objectives are more stable. However, due to the computational complexity of all these MOBO approaches with increasing number of objectives, the weighted Tchebycheff method has been augmented in MOBO with a ridge regularization term to smoothen the converted single multi-objective function.105 Likewise, a weighted Tchebycheff method has been implemented in MOBO and introduced a regression-based model calibration to estimate the parameter (utopia) of the multi-objective function.106 Here, the weights are predefined by the user. The architecture is further modified with optimizing the regression-based model for calibration from the ensemble of models in order to enhance the model performance.107 Another a priori method is ParEGO,108 where the weights are randomly assigned with a uniform distribution. In these a priori methods,106–108 the multiple objectives are first transformed into a single weighted multi-objective function and then fit into a single-function based acquisition function (for example, EI).

FIG. 4.

Hyper-volume computation in MOBO.

FIG. 4.

Hyper-volume computation in MOBO.

Close modal

Table I presents the high level description of the MOBO algorithm. For the sake of simplicity, we considered two control parameters (2D) and two objective functions. However, the model can be easily extended to “n” parameters and “j” objective functions.

TABLE I.

Algorithm: multi-objective Bayesian optimization (MOBO)—2D and two objectives.

1. Initialization: Define parameter space. State maximum iteration, N. Randomly select m samples, X = {X1, X2}. Assuming f1, f2 are the two expensive objective functions. Evaluate m samples for both objectives as, Y1(Xk) and Y2(Xk) respectively. Build training data matrices, D1,k = {Xk, Y1,k} and D2,k = {Xk, Y2,k}. Set k = 1. 
For k ≤ N 
2. Surrogate Modeling: Develop or update GPM models for both objectives as Δ1(D1,k) and Δ2(D2,k). a. Optimize the hyper-parameters of GPM by minimizing the loss (negative marginal log-likelihood) function using Adam optimizer algorithm. Here, we consider learning rate 0.2. 
3. Posterior Predictions: Given the surrogate model, compute posterior means and variances for the unexplored locations, Xk¯¯, over the parameter space as{π1(Y(Xk¯¯)|Δ1,π2(Y(Xk¯¯)|Δ2} and {σ12(Y(Xk¯¯)|Δ1,σ22(Y(Xk¯¯)|Δ2} respectively. 
4. Acquisition function: Compute and maximize acquisition function, maxxU(f1,f2|Δ1,Δ2) to select next best location, xbest,k for evaluations. 
5. Augmentation: Evaluate y1(xbest,k) and y2(xbest,k) and augment data, D1,k+1 = [D1,k; {xbest,k, y1} and D2,k+1 = [D2,k; {xbest,k, y2}. 
1. Initialization: Define parameter space. State maximum iteration, N. Randomly select m samples, X = {X1, X2}. Assuming f1, f2 are the two expensive objective functions. Evaluate m samples for both objectives as, Y1(Xk) and Y2(Xk) respectively. Build training data matrices, D1,k = {Xk, Y1,k} and D2,k = {Xk, Y2,k}. Set k = 1. 
For k ≤ N 
2. Surrogate Modeling: Develop or update GPM models for both objectives as Δ1(D1,k) and Δ2(D2,k). a. Optimize the hyper-parameters of GPM by minimizing the loss (negative marginal log-likelihood) function using Adam optimizer algorithm. Here, we consider learning rate 0.2. 
3. Posterior Predictions: Given the surrogate model, compute posterior means and variances for the unexplored locations, Xk¯¯, over the parameter space as{π1(Y(Xk¯¯)|Δ1,π2(Y(Xk¯¯)|Δ2} and {σ12(Y(Xk¯¯)|Δ1,σ22(Y(Xk¯¯)|Δ2} respectively. 
4. Acquisition function: Compute and maximize acquisition function, maxxU(f1,f2|Δ1,Δ2) to select next best location, xbest,k for evaluations. 
5. Augmentation: Evaluate y1(xbest,k) and y2(xbest,k) and augment data, D1,k+1 = [D1,k; {xbest,k, y1} and D2,k+1 = [D2,k; {xbest,k, y2}. 

Here, we considered two acquisition functions, namely, (1) qEIHV109 and (2) qParEGO. These are the extensions of EIHV and ParEGO, respectively, where parallelism is introduced, and we can jointly optimize to provide the next best location, xbest,k in batch such that Xbest,k={xbest,k,1,,xbest,k,q}. This reduces the computational effort and can be extremely helpful when it is desirable (lesser time) for expensive evaluations in batch. For this study, we choose batch size, q=4. Obviously, the choice of acquisition function can change the performance of MOBO depending on the problem; however, choosing the best acquisition function is a hard problem and is not the scope of this paper and can be addressed in the future. It is up to the knowledge of the user to select or build an appropriate acquisition function for a given problem.

To initialize the computation of the acquisition function, we select 20 restart initial locations for a set of 1000 random points and consider maximum iterations of 200. The acquisition function is approximated using 500 Monte Carlo (MC) based samples. To compute the hyper-volume indicator function in qEIHV, a reference point (red star in Fig. 4), which is the lower bound of the objectives (worst solution since MOBO is set up as maximization problem), is required. We set this value (after normalizing the objectives) as 0δ. δ=0.001 is a small positive value to make the reference point slightly worse than true lower bound of the objectives. Different strategies can be applied to select reference point based on the problem complexity, especially when we do not have much knowledge on the lower bound of the objective functions (for constrained problems), to increase the computational efficiency of the acquisition function. However, this is beyond the scope of this paper and we have considered the stated reference point throughout our analysis since we have good knowledge on the lower bound of the objectives (normalized).

To illustrate the MOBO framework in Table I, we implemented it on two numerical test problems. To see the general application of the proposed design architecture, we first implement using two numerical multi-objective examples. As the scope of the proposed Bayesian framework in this paper is for solving expensive multi-objective problems, we assume that the numerical functions are expensive in nature. We have considered here two numerical test problems (maximization): Test Problem (1): ZDT1 and Test Problem (2): 2-D Six-hump camel back—inversed-Ackley's path (6HC-IAP)110 multi-objective functions. The full mathematical equations of the optimization problems are provided in Appendix A in the supplementary material. We start with ten randomly selected samples and set the max iteration of MOBO as 50. Thus, the total evaluation is 10 + 50 × 4 = 210. Though, we solved the problems, each with two acquisition functions, we present here that the results (Figs. 5 and 6) of MOBO for the acquisition function performed better.

FIG. 5.

Test Problem: ZDT1. (a) Ground truth for function 1 [Eq. (A1a) in the Appendix in the supplementary material]. (b) Ground truth for function 2 [Eq. (A1b) in the Appendix in the supplementary material]. (c) All samples selected after 50 MOBO iterations in real valued parameter space. (d) Pareto-optimal solutions over real valued parameter space. (e) All samples selected after 50 MOBO iterations in real valued objective space. (f) Pareto-frontiers drawn over real valued objective space. Here, the acquisition function chosen was qEIHV.

FIG. 5.

Test Problem: ZDT1. (a) Ground truth for function 1 [Eq. (A1a) in the Appendix in the supplementary material]. (b) Ground truth for function 2 [Eq. (A1b) in the Appendix in the supplementary material]. (c) All samples selected after 50 MOBO iterations in real valued parameter space. (d) Pareto-optimal solutions over real valued parameter space. (e) All samples selected after 50 MOBO iterations in real valued objective space. (f) Pareto-frontiers drawn over real valued objective space. Here, the acquisition function chosen was qEIHV.

Close modal
FIG. 6.

Test Problem: 6HC-IAP. (a) Ground truth for function 1 [Eq. (A2a) in the Appendix in the supplementary material]. (b) Ground truth for function 2 [Eq. (A2b) in the Appendix in the supplementary material]. (c) All samples selected after 50 MOBO iterations in real valued parameter space. (d) Pareto-optimal solutions over real valued parameter space. (e) All samples selected after 50 MOBO iterations in real valued objective space. (f) Pareto-frontiers drawn over real valued objective space. Here, the acquisition function chosen was qParEGO.

FIG. 6.

Test Problem: 6HC-IAP. (a) Ground truth for function 1 [Eq. (A2a) in the Appendix in the supplementary material]. (b) Ground truth for function 2 [Eq. (A2b) in the Appendix in the supplementary material]. (c) All samples selected after 50 MOBO iterations in real valued parameter space. (d) Pareto-optimal solutions over real valued parameter space. (e) All samples selected after 50 MOBO iterations in real valued objective space. (f) Pareto-frontiers drawn over real valued objective space. Here, the acquisition function chosen was qParEGO.

Close modal

Figures 5(a), 5(b), 6(a), and 6(b) show the ground truth of the individual functions (f1, f2) of test problems (1) and (2), respectively. The ground truth is configured with exhaustive grid-based sampling in the parameter space where true evaluations are done at all the grid points on a 100 × 100 grid. Comparing individual functions [panels (a) and (b)] of each of the test problems in Figs. 5 and 6, we can see clear trade-offs between them (see the color representations—yellow region being the maximum value region of individual functions), which gives us the Pareto-frontier (optimal balance) as we attempt to optimize (e.g., maximize) both functions jointly (optimizing trade-offs) by implementing MOBO.

Comparing the ground truth of the functions of both test problems, it can be easily interpreted that the test problem (2) is highly complex than test problem (1). Figures 5(c), 5(e), 6(c), and 6(e) show all the samples selected through MOBO iterations in the parameter and objective space for each of the test problem (1) and (2), using qEIHV and qParEGO acquisition functions. The color coding is defined as the darker color resembles early iterations and the lighter color resembles later stage of iterations. Here, looking at Figs. 5(e) and 6(e), we see at the early stage of the evaluation, the MOBO model explores by sampling over the entire region, as denoted by the scattered samples in dark colors. Then, in the later stages, once it quickly identifies the potential region of interest (Pareto-frontier), the model exploits by sampling more often in that region, as denoted by the cluster of samples with light colors. This is power of adaptive sampling in MOBO, which maximizes the learning focused on the region of interest through exploration–exploitation. Readers can look into Figs. 5(c) and 6(c) to get an understanding of where most of the sampling has been done in the parameter space of ground truth. Also, we have seen that using the stated acquisition functions for the test problems, qParEGO does more exploration than qEIHV and, therefore, has few more selection of samples far away from the Pareto-frontier. The Pareto-optimal solutions, extracted from respective Figs. 5(c) and 6(c) in the parameter space and build the Pareto-frontier in the objective space for both test problems, are drawn over the parameter space as shown in Figs. 5(d) and 6(d), respectively. Figures 5(f) and 6(f) show the Pareto-frontiers in the objective space, extracted from the respective Fig. 5(e) and 6(e). From the Pareto-frontier, one can visualize the trade-off between the objectives and can choose one or multiple solutions based on the desired trade-off.

To measure the performance of Pareto-frontier of the test problems with MOBO, we compare with the respective Pareto-frontier by solving the non-dominated sorting genetic algorithm II (NSGA-II) in the “pymoo” package111 in Python. We select the population size = 100 and generation = 100, a total of 10 000 function evaluations to ensure the configuration of the Pareto-frontier. We consider this as the ground truth of the Pareto-frontier. Using MOBO, our goal is to cover this Pareto-frontier with maximizing accuracy and minimizing cost (lesser evaluations). Also, we run NSGA-II with only population size = 15 and generation = 15, total of 225 function evaluations, which is close to the number of function evaluations we did in MOBO. Figure 7 shows the comparison. Given that we did only 2% evaluation in MOBO that what we did in full convergence of NSGA-II (210 function evaluations in MOBO, compared to 10 000 evaluations in NSGA-II), MOBO offered a superior performance in identifying a large portion of Pareto-optimal solutions, which overlaps with the ground truth in the test problem ZDT1. Similarly, MOBO, with both acquisition functions, clearly outperforms by a long margin when the Pareto-frontier is built with similar function evaluations in NSGA-II [Figs. 7(a) and 7(b)]. Obviously, the 6HC-IAP problem is much more complex than ZDT1 as we can see from the ground truth of individual functions [Figs. 6(a) and 6(b)] and highly disjoint Pareto-frontier (10 000 NSGA-II evaluations), MOBO needs more evaluations to cover more Pareto-frontier regions. However, even with only 210 evaluations for a complex problem with highly dis-jointed Pareto-frontier, MOBO could identify Pareto-solutions in three out of four Pareto-sub-frontiers (for qParEGO) and four out of four Pareto-sub-frontiers (for qEIHV) and did not get stuck into one region. However, the Pareto-optimal solutions are more accurate for qParEGO [Figs. 7(c) and 7(d)]. Thus, the user though would not have full knowledge of the Pareto-frontier will have a good approximated idea over the whole objective or parameter space. Here also, MOBO particularly with the qParEGO acquisition function found more Pareto-optimal solutions than running similar evaluations with NSGA-II. Obviously, in case of expensive functions, it would not be possible for 10 000 expensive evaluations; thus, we are confident enough with the performance of MOBO in terms of the trade-off between accuracy and cost. More detailed figures on the test problems with the iteration (steps-wise) of MOBO is provided in Colab notebook. Next, we proceed to solve a complex physics-driven model using MOBO for performance optimization for memory and energy storage applications.

FIG. 7.

Comparison of Pareto-frontier between ground truth (using NSGA-II), similar No. of function evaluation in NSGA-II and MOBO for (a) ZDT1, acquisition function—qParEGO; (b) ZDT1, acquisition function—qEIHV; (c) 6HC-IAP, acquisition function—qParEGO; and (d) 6HC-IAP, acquisition function—qEIHV.

FIG. 7.

Comparison of Pareto-frontier between ground truth (using NSGA-II), similar No. of function evaluation in NSGA-II and MOBO for (a) ZDT1, acquisition function—qParEGO; (b) ZDT1, acquisition function—qEIHV; (c) 6HC-IAP, acquisition function—qParEGO; and (d) 6HC-IAP, acquisition function—qEIHV.

Close modal

With the general principles of MOBO discussed in Sec. II and interface-controlled ferroelectric model introduced in Sec. I, here we present a MOBO optimization of the energy storage and losses in this system. For a given set of model parameters, we can generate the hysteresis loops for the field-dependent polarization response. The per se are too complex (high dimensionality) in the MOBO framework to define objective functions and train the MOBO model. Thus, we build a decision tree to first extract different low-dimensional target functions from different loop structures, given the numerical physics-driven model parameters through an automated task.

Figure 8 shows the full decision tree. For chosen parameters of a material [e.g., 2–4–6 KLGD for bulk PbZrO3 (PZO)] we can get different loop structures such as zero, one, and two loops, which defines different phases like paraelectric (PE), ferroelectric (FE) and antiferroelectric (AFE), respectively. For each of these phases, we have different classifications. For example, if the phase is PE, we then classify the hysteresis curve as linear, sigmoid. and two waves. If the phase is AFE, we further classify as the two loops are attached or not, with quantifying the distance between separation between the loops. If the phase is FE, we have two different classifications. The first one is the measure of the shift of the loops and in the positive or negative direction. The second one is the measure of the width of the loop. Finally, as shown in Fig. 8, irrespective of the loop structures, we calculate the energy storage and loss.

FIG. 8.

Physics-driven decision tree from complex loop structures.

FIG. 8.

Physics-driven decision tree from complex loop structures.

Close modal

As the focus of this paper is the performance optimization for memory and energy storage applications, we selected the energy storage and loss as the objectives of the MOBO. It is to be noted that the stated automated task of developing a similar decision tree from the hysteresis loop can be applicable to any theoretical model or experimental settings of any materials, and the user can choose any target function (branch) in the decision tree as the objectives as appropriate for the optimization problem. Finally, integrating the decision tree in Fig. 8 with the general MOBO framework (Table I), we develop a physics-driven MOBO architecture to optimize various target material properties jointly and thus attempt to rapidly explore the multidimensional parameter space for initial material and device parameter selection. The full architecture is summarized in Fig. 9. The MOBO framework is coded in the Python BoTorch package. GP models are configured using GPyTorch package and to handle multiple independent GP models, we utilize the multi-output GP function, ModelListGP. The full code is available in Colab notebook.

FIG. 9.

Physics-driven MOBO architecture (flow chart): Application to physics-driven numerical models to optimize target material properties.

FIG. 9.

Physics-driven MOBO architecture (flow chart): Application to physics-driven numerical models to optimize target material properties.

Close modal

Now, we illustrate the application of the proposed physics-driven MOBO architecture in the 2–4–6 KLGD for bulk PZO, which contains different phases such as PE, FE, and AFE for the chosen parameter space. Here, we aim to explore the balance between energy storage and loss, and build the Pareto front where we have various flavors of solutions—high storage-low loss, medium storage-medium loss, and low storage-high loss. It is up to the decision maker to choose the appropriate Pareto-optimal solution(s), which gives the desired balance in device parameter selection for given application. We started with 20 randomly selected initial samples and then stopped MOBO after 30 iterations, with four samples (batch_size = 4) selected at each iteration (30 × 4 = 120 sampling). Figures 10–12 show the Pareto-frontiers through rapid exploration–exploitation using MOBO with a priori and a posteriori acquisition functions over different parameter spaces such as such as temperature, partial O2 pressure, film thickness, and surface ion energy.

FIG. 10.

Parameter space T=[250,410]K,andnρ=[3,3],whereρ=10nρ,h=5nm,andΔG=0.2eV. External field E=3sin(2ωt). This parameter space falls into AFE. Exhaustive low-sampling (7 × 7) grid exploration under the same parameter space: (a) Ground truth for function 1 (storage) and (b) ground truth for function 2 (loss). (c) and (d) Pareto-optimals, explored with qParEGO and qEIHV acquisition functions, respectively, in the parameter space. (e) and (f) Pareto-frontiers drawn, after exploring with qParEGO and qEIHV acquisition functions, respectively, in the objective space. The colors map the projection of the Pareto-solutions between the objective [(e) and (f)] and parameter space [(c) and (d)], respectively. The y-labels of the parameter space of (a)–(d) are the same; for better visualization to compare with ground truth figures, we have drawn the Pareto plots over the power of 10 of the values of partial O2 pressure, ρ.

FIG. 10.

Parameter space T=[250,410]K,andnρ=[3,3],whereρ=10nρ,h=5nm,andΔG=0.2eV. External field E=3sin(2ωt). This parameter space falls into AFE. Exhaustive low-sampling (7 × 7) grid exploration under the same parameter space: (a) Ground truth for function 1 (storage) and (b) ground truth for function 2 (loss). (c) and (d) Pareto-optimals, explored with qParEGO and qEIHV acquisition functions, respectively, in the parameter space. (e) and (f) Pareto-frontiers drawn, after exploring with qParEGO and qEIHV acquisition functions, respectively, in the objective space. The colors map the projection of the Pareto-solutions between the objective [(e) and (f)] and parameter space [(c) and (d)], respectively. The y-labels of the parameter space of (a)–(d) are the same; for better visualization to compare with ground truth figures, we have drawn the Pareto plots over the power of 10 of the values of partial O2 pressure, ρ.

Close modal
FIG. 11.

Parameter space T=[260,350]Kandnρ=[6,8],whereρ=10nρ,h=50nm,andΔG=0.2eV. External fieldE=3sin(2ωt). This parameter space falls into FE. Exhaustive low-sampling (7 × 7) grid exploration under the same parameter space: (a) Ground truth for function 1 (Storage) and (b) ground truth for function 2 (loss). (c) and (d) Pareto-optimals, explored with qParEGO and qEIHV acquisition functions, respectively, in the parameter space. (e) and (f) Pareto-frontiers drawn, after exploring with qParEGO and qEIHV acquisition functions, respectively, in the objective space. The colors map the projection of the Pareto-solutions between the objective [(e) and (f]) and parameter space [(c) and (d)], respectively. The y-labels of the parameter space of (a)–(d) are the same; for better visualization to compare with ground truth figures, we have drawn the Pareto plots over the power of 10 of the values of partial O2 pressure, ρ.

FIG. 11.

Parameter space T=[260,350]Kandnρ=[6,8],whereρ=10nρ,h=50nm,andΔG=0.2eV. External fieldE=3sin(2ωt). This parameter space falls into FE. Exhaustive low-sampling (7 × 7) grid exploration under the same parameter space: (a) Ground truth for function 1 (Storage) and (b) ground truth for function 2 (loss). (c) and (d) Pareto-optimals, explored with qParEGO and qEIHV acquisition functions, respectively, in the parameter space. (e) and (f) Pareto-frontiers drawn, after exploring with qParEGO and qEIHV acquisition functions, respectively, in the objective space. The colors map the projection of the Pareto-solutions between the objective [(e) and (f]) and parameter space [(c) and (d)], respectively. The y-labels of the parameter space of (a)–(d) are the same; for better visualization to compare with ground truth figures, we have drawn the Pareto plots over the power of 10 of the values of partial O2 pressure, ρ.

Close modal
FIG. 12.

(a)–(d) Parameter space T=[255,410]K,ρ=[102,108],h=50nm,andΔG=0.2eV. External fieldE=3sin(2ωt). Exhaustive low-sampling (7 × 7) grid exploration under the same parameter space: (a) Ground truth for function 1 (storage) and (b) ground truth for function 2 (loss). Same Pareto-frontier, explored with qParEGO acquisition function in (c) objective space and in (d) parameter space. The y-labels of the parameter space of (a)–(d) are same; for better visualization to compare with ground truth figures, we have drawn the Pareto plots over the power of 10 of the values of partial O2 pressure, ρ. (e)–(h) Parameter space T=[255,410]K,ρ=102,h=5nm,andΔG=[0.01,0.3]eV. External fieldE=3sin(2ωt). Exhaustive low-sampling (7 × 7) grid exploration under the same parameter space: (e) Ground truth for function 1 (storage) and (f) ground truth for function 2 (loss). Same Pareto-frontier, explored with qEIHV acquisition function in (g) objective space and in (h) parameter space. Both parameter spaces fall into both AFE and FE.

FIG. 12.

(a)–(d) Parameter space T=[255,410]K,ρ=[102,108],h=50nm,andΔG=0.2eV. External fieldE=3sin(2ωt). Exhaustive low-sampling (7 × 7) grid exploration under the same parameter space: (a) Ground truth for function 1 (storage) and (b) ground truth for function 2 (loss). Same Pareto-frontier, explored with qParEGO acquisition function in (c) objective space and in (d) parameter space. The y-labels of the parameter space of (a)–(d) are same; for better visualization to compare with ground truth figures, we have drawn the Pareto plots over the power of 10 of the values of partial O2 pressure, ρ. (e)–(h) Parameter space T=[255,410]K,ρ=102,h=5nm,andΔG=[0.01,0.3]eV. External fieldE=3sin(2ωt). Exhaustive low-sampling (7 × 7) grid exploration under the same parameter space: (e) Ground truth for function 1 (storage) and (f) ground truth for function 2 (loss). Same Pareto-frontier, explored with qEIHV acquisition function in (g) objective space and in (h) parameter space. Both parameter spaces fall into both AFE and FE.

Close modal

To check the performance of MOBO, we draw the ground truth of storage and loss [Figs. 10(a) and 10(b)] using low-sampling (7 × 7), expensive exhaustive (grid-based) exploration. Here, we configure the Pareto-frontier within the AFE phase only. Interpreting the trade-off between storage and loss in the parameter and objective space [Figs. 10(c)10(f)], we see that higher storage maps to higher temperature and lower order (in the absolute power of 10) partial O2 pressure, whereas lower storage maps generally to lower temperature and higher order partial O2 pressure at the chosen value of film thickness and surface ion energy. The balanced trade-off between storage and loss populates the intermediate values in the parameter space. The ground truth shows the similar maps of high and low values of storage and loss in the parameter space. As we compare the Pareto-frontier and the ground truth in the parameter space [Figs. 10(a)10(d)], the Pareto-frontier actually connects the maximum value regions of both storage and loss, giving a trade-off between them. Also, we present the loop diagrams in Fig. 13 in the Appendix in the supplementary material, from where the ground truth of storage/loss is drawn following the decision tree. It is clearly seen that the loop area is decreasing as the temperature and partial O2 pressure are higher and lower (in the absolute power of 10), respectively, thus maximizing the storage or minimizing the loss. This agrees to the results in Fig. 10. Both the acquisition functions performed quite similar.

Also in Fig. 11, we first draw the ground truth of storage and loss [Figs. 11(a) and 11(b)]. Here, we configure the Pareto-frontier within the FE phase only. Interpreting the trade-off between storage and loss in the parameter and objective space [Figs. 11(c)11(f)], we see that higher loss maps to higher partial O2 pressure, whereas lower loss maps to lower partial O2 pressure at the chosen value of film thickness and surface ion energy. The balanced trade-off between storage and loss populates the intermediate values of partial O2 pressure. Though the dependence of temperature is much lesser than partial O2 pressure in the effect of storage or loss in the FE phase at the given parameters, however, most of the Pareto-solutions in the intermediate region are at low temperature. Validating the interpretation with the ground truth using low-sampling (7 × 7), expensive exhaustive (grid-based) exploration, it shows the similar maps of high and low values of storage and loss in the parameter space. As we compare the Pareto-frontier and the ground truth in the parameter space [Figs. 11(a)11(d)], the Pareto-frontier actually connects the maximum value regions of both storage and loss, giving a trade-off between them. Here, we also see that the dependence of partial O2 pressure is much greater than temperature, which goes by the results from MOBO. From the loop diagrams in Fig. 14 in the Appendix in the supplementary material, it is clearly seen that the loop area is increasing as the partial O2 pressure is higher, thus minimizing the storage or maximizing the loss. This agrees to the results in Fig. 11. Both the acquisition functions performed quite similar.

Both the parameter spaces in Fig. 12 configure the Pareto-frontier within both AFE and FE phases. Focusing on the parameter space as in Figs. 12(a)12(d), we first draw the ground truth of storage and loss [Figs. 12(a) and 12(b)]. Interpreting the trade-off between storage and loss in the objective and parameter space [Figs. 12(c) and 12(d)], we see that higher loss maps to higher partial O2 pressure and lower temperature, whereas higher storage maps to lower partial O2 pressure at the chosen value of film thickness and surface ion energy. Validating the interpretation with the ground truth using low-sampling (7 × 7), expensive exhaustive (grid-based) exploration, it shows the similar maps of high and low values of storage and loss in the parameter space. As we compare the Pareto-frontier and the ground truth in the parameter space [Figs. 12(a), 12(b), and 12(d)], the Pareto-frontier actually connects the maximum value regions of both storage and loss, giving a trade-off between them. From the loops diagrams in Fig. 15 in the Appendix in the supplementary material, it is clearly seen that the loop area (considering both single loop FE region and double loop AFE region) is increasing as the partial O2 pressure is higher and temperature is mostly lower (as we move from AFE to FE, and deeper into FE domain) and is decreasing as the partial O2 pressure is lower (as we move from FE to AFE, and deeper into AFE domain), thus minimizing (maximizing) the storage (loss) and maximizing (minimizing) the storage (loss), respectively. This agrees to the results in Figs. 12(a)12(d). Interestingly, we see a sudden jump in storage and loss in the Pareto after very closely space Pareto-solutions, which likely happened in this case, due to phase transition between FE and AFE (see Fig. 15in the Appendix in the supplementary material).

Focusing on the parameter space (choosing temperature and surface ion energy) as in Figs. 12(e)12(h), we first draw the ground truth of storage and loss [Figs. 12(e) and 12(f)]. Interpreting the trade-off between storage and loss in the objective and parameter space [Figs. 12(g) and 12(h)], we see that higher loss maps to lower surface ion energy and temperature, whereas higher storage maps to higher surface ion energy at the chosen value of film thickness and partial O2 pressure. Validating the interpretation with the ground truth using low-sampling (7 × 7), expensive exhaustive (grid-based) exploration, we find similar maps of high and low values of storage and loss in the parameter space. As we compare the Pareto-frontier and the ground truth in the parameter space (Figs. 12(e), 12(f), and 12(h)], the Pareto-frontier actually connects the maximum value regions of both storage and loss, giving a trade-off between them. From the loops diagrams in Fig. 16 in the Appendix in the supplementary material, it is clearly seen that the loop area (considering both single loop FE region and double loop AFE region) is increasing as the surface ion energy and temperature is lower (as we move from AFE to FE, and deeper into FE domain) and is decreasing as the surface ion energy is higher (as we move from FE to AFE, and deeper into AFE domain), thus minimizing (maximizing) the storage (loss) and maximizing (minimizing) the storage (loss), respectively. This agrees to the results in Figs. 12(e)12(h). We see a similar jump of storage and loss, which again likely to be due to phase transition between FE and AFE (see Fig. 16 in the Appendix in the supplementary material). In practical, we do not know this transition region between the phases, and this type of sharp jumps can be problematic for the surrogate GP model in MOBO. Thus, to avoid this type of sudden jump at the unknown transition boundary, we can build Pareto-frontier (trade-off between the target functions) over any parameter space for a pre-selected phase (AFE or FE) using more complex architecture of constrained multi-objective Bayesian optimization (MOBO), which will be addressed in the future. More detailed figures for the analysis of 2–4–6 KLGD for bulk PZO, with the iteration (steps-wise) of MOBO is provided in Colab notebook.

To summarize, here, we develop a workflow for the joint optimization of multiple materials functionalities from expensive computations using the multi-objective Bayesian optimization. This approach is applied for the discovery of the optimal Pareto-frontier between the energy storage and loss in interfacially controlled ferroelectric materials but can be equally applied to other functionalities that can be derived from the P–E hysteresis loops.

In general, MOBO offers a powerful tool for the optimization of material functionalities and can be further extended toward theory–experiment matching. As in practical problems, there is a general need to focus of different functionalities from expensive experiments, which is more complex in the field of optimization. Here, the proposed MOBO architecture offers a great choice where it first replaces with inexpensive surrogate theoretical modeling with physics-driven prior knowledge and then optimizes any chosen multiple functionalities jointly in an attempt to rapidly discover the optimal systems with desired balance between the functionalities. Going forward, this approach can help in guiding the experiments by exploring toward the region of interest in high-dimensional multi-functionality space, and thereby reducing time and effort.

See the supplementary material for the mathematical equations of the test problems in Sec. II, hysteresis plots for the low-sampling grid (ground truth) as part of the analysis in Sec. III. From these hysteresis plots, we calculate the respective ground truth of storage and loss to validate the MOBO results as presented in Sec. III.

This work was supported by the US Department of Energy, Office of Science, Office of Basic Energy Sciences, as part of the Energy Frontier Research Centers program: CSSAS—The Center for the Science of Synthesis Across Scales—under Award No. DE-SC0019288 (A.B.), located at University of Washington, DC; by the Center for 3D Ferroelectric Microelectronics (3DFeM), an Energy Frontier Research Center funded by the U.S. Department of Energy (DOE), Office of Science, Basic Energy Sciences under Award No. DE-SC0021118 (S.V.K.); and performed at Oak Ridge National Laboratory's Center for Nanophase Materials Sciences (CNMS), a U.S. Department of Energy, Office of Science User Facility. A.N.M.’s work was supported by the National Research Foundation of Ukraine (Grant Application No. Ф81/41481).

The data that support the findings of this study are openly available in Github at https://github.com/arpanbiswas52/MOBO_AFI_Supplements, Ref. 112.

The authors have no conflicts to disclose.

1.
A.
Tagantsev
,
L. E.
Cross
, and
J.
Fousek
,
Domains in Ferroic Crystals and Thin Films
(
Springer
,
New York
,
2010
).
2.
O.
Auciello
,
J. F.
Scott
, and
R.
Ramesh
, “
The physics of ferroelectric memories
,”
Phys. Today
51
(
7
),
22
27
(
1998
).
3.
J. F.
Scott
and
C. A.
Paz de Araujo
, “
Ferroelectric memories
,”
Science
246
(
4936
),
1400
1405
(
1989
).
4.
E. Y.
Tsymbal
and
H.
Kohlstedt
, “
Tunneling across a ferroelectric
,”
Science
313
(
5784
),
181
183
(
2006
).
5.
A.
Gruverman
,
D.
Wu
,
H.
Lu
,
Y.
Wang
,
H. W.
Jang
,
C. M.
Folkman
,
M. Y.
Zhuravlev
,
D.
Felker
,
M.
Rzchowski
,
C.-B.
Eom
, and
E. Y.
Tsymbal
, “
Tunneling electroresistance effect in ferroelectric tunnel junctions at the nanoscale
,”
Nano Lett.
9
(
10
),
3539
3543
(
2009
).
6.
P.
Maksymovych
,
S.
Jesse
,
P.
Yu
,
R.
Ramesh
,
A. P.
Baddorf
, and
S. V.
Kalinin
, “
Polarization control of electron tunneling into ferroelectric surfaces
,”
Science
324
(
5933
),
1421
1425
(
2009
).
7.
S. L.
Miller
and
P. J.
McWhorter
, “
Physics of the ferroelectric nonvolatile memory field effect transistor
,”
J. Appl. Phys.
72
(
12
),
5999
6010
(
1992
).
8.
S.
Mathews
,
R.
Ramesh
,
T.
Venkatesan
, and
J.
Benedetto
, “
Ferroelectric field effect transistor based on epitaxial perovskite heterostructures
,”
Science
276
(
5310
),
238
240
(
1997
).
9.
V.
Garcia
,
M.
Bibes
,
L.
Bocher
,
S.
Valencia
,
F.
Kronast
,
A.
Crassous
,
X.
Moya
,
S.
Enouz-Vedrenne
,
A.
Gloter
,
D.
Imhoff
,
C.
Deranlot
,
N. D.
Mathur
,
S.
Fusil
,
K.
Bouzehouane
, and
A.
Barthélémy
, “
Ferroelectric control of spin polarization
,”
Science
327
(
5969
),
1106
1110
(
2010
).
10.
G.
Catalan
and
J. F.
Scott
, “
Physics and applications of bismuth ferrite
,”
Adv. Mater.
21
(
24
),
2463
2485
(
2009
).
11.
E. V.
Balashova
and
A. K.
Tagantsev
, “
Polarization response of crystals with structural and ferroelectric instabilities
,”
Phys. Rev. B
48
(
14
),
9979
9986
(
1993
).
12.
E. V.
Balashova
,
V. V.
Lemanov
,
A. K.
Tagantsev
,
A. B.
Sherman
, and
S. H.
Shomuradov
, “
Betaine arsenate as a system with two instabilities
,”
Phys. Rev. B
51
(
14
),
8747
8752
(
1995
).
13.
E. V.
Chensky
and
V. V.
Tarasenko
, “
Theory of phase-transitions to inhomogeneous states in finite ferroelectrics in an external electric-field
,”
Sov. Phys. JETP
56
,
618
[Zh. Eksp. Teor. Fiz. 83, 1089 (1982)].
14.
A. M.
Bratkovsky
and
A. P.
Levanyuk
, “
Continuous theory of ferroelectric states in ultrathin films with real electrodes
,”
J. Comput. Theor. Nanosci.
6
(
3
),
465
489
(
2009
).
15.
A. M.
Bratkovsky
and
A. P.
Levanyuk
, “
Effects of anisotropic elasticity in the problem of domain formation and stability of monodomain state in ferroelectric films
,”
Phys. Rev. B
84
(
4
),
045401
(
2011
).
16.
L. Q.
Chen
and
A. G.
Khachaturyan
, “
Dynamics of simultaneous ordering and phase separation and effect of long-range Coulomb interactions
,”
Phys. Rev. Lett.
70
(
10
),
1477
1480
(
1993
).
17.
D.
Fan
and
L.-Q.
Chen
, “
Possibility of spinodal decomposition in ZrO2-Y2O3 alloys: A theoretical investigation
,”
J. Am. Ceram. Soc.
78
(
6
),
1680
1686
(
1995
).
18.
N.
Balke
,
B.
Winchester
,
W.
Ren
,
Y. H.
Chu
,
A. N.
Morozovska
,
E. A.
Eliseev
,
M.
Huijben
,
R. K.
Vasudevan
,
P.
Maksymovych
,
J.
Britson
,
S.
Jesse
,
I.
Kornev
,
R.
Ramesh
,
L.
Bellaiche
,
L. Q.
Chen
, and
S. V.
Kalinin
, “
Enhanced electric conductivity at ferroelectric vortex cores in BiFeO3
,”
Nat. Phys.
8
(
1
),
81
88
(
2012
).
19.
T. S.
Böscke
,
S.
Teichert
,
D.
Bräuhaus
,
J.
Müller
,
U.
Schröder
,
U.
Böttger
, and
T.
Mikolajick
, “
Phase transitions in ferroelectric silicon doped hafnium oxide
,”
Appl. Phys. Lett.
99
(
11
),
112904
(
2011
).
20.
J.
Müller
,
T. S.
Böscke
,
D.
Bräuhaus
,
U.
Schröder
,
U.
Böttger
,
J.
Sundqvist
,
P.
Kücher
,
T.
Mikolajick
, and
L.
Frey
, “
Ferroelectric Zr0.5Hf0.5O2 thin films for nonvolatile memory applications
,”
Appl. Phys. Lett.
99
(
11
),
112901
(
2011
.
21.
M. H.
Park
,
C.-C.
Chung
,
T.
Schenk
,
C.
Richter
,
M.
Hoffmann
,
S.
Wirth
,
J. L.
Jones
,
T.
Mikolajick
, and
U.
Schroeder
, “
Origin of temperature-dependent ferroelectricity in Si-doped HfO2
,”
Adv. Electron. Mater.
4
(
4
),
1700489
(
2018
).
22.
J.
Cao
,
S.
Shi
,
Y.
Zhu
, and
J.
Chen
, “
An overview of ferroelectric hafnia and epitaxial growth
,”
Phys. Status Solidi RRL
15
(
5
),
2100025
(
2021
).
23.
P. D.
Lomenzo
,
U.
Schroeder
, and
T.
Mikolajick
, “
Electronic contributions to ferroelectricity and field-induced phase transitions in doped-HfO2
,” in
2021 IEEE International Symposium on Applications of Ferroelectrics (ISAF)
(IEEE,
2021
), pp.
1
4
.
24.
W.
Zhu
,
J.
Hayden
,
F.
He
,
J.-I.
Yang
,
P.
Tipsawat
,
M. D.
Hossain
,
J.-P.
Maria
, and
S.
Trolier-McKinstry
, “
Strongly temperature dependent ferroelectric switching in AlN, Al1-xScxN, and Al1-xBxN thin films
,”
Appl. Phys. Lett.
119
(
6
),
062901
(
2021
).
25.
Y.
Song
,
C.
Perez
,
G.
Esteves
,
J. S.
Lundh
,
C. B.
Saltonstall
,
T. E.
Beechem
,
J. I.
Yang
,
K.
Ferri
,
J. E.
Brown
,
Z.
Tang
,
J.-P.
Maria
,
D. W.
Snyder
,
R. H.
Olsson
,
B. A.
Griffin
,
S. E.
Trolier-McKinstry
,
B. M.
Foley
, and
S.
Choi
, “
Thermal conductivity of aluminum scandium nitride for 5G mobile applications and beyond
,”
ACS Appl. Mater. Interfaces
13
(
16
),
19031
19041
(
2021
).
26.
K.
Ferri
,
S.
Bachu
,
W.
Zhu
,
M.
Imperatore
,
J.
Hayden
,
N.
Alem
,
N.
Giebink
,
S.
Trolier-McKinstry
, and
J.-P.
Maria
, “
Ferroelectrics everywhere: Ferroelectricity in magnesium substituted zinc oxide thin films
,”
J. Appl. Phys.
130
(
4
),
044101
(
2021
).
27.
X.
Sang
,
E. D.
Grimley
,
T.
Schenk
,
U.
Schroeder
, and
J. M.
LeBeau
, “
On the structural origins of ferroelectricity in HfO2 thin films
,”
Appl. Phys. Lett.
106
(
16
),
162905
(
2015
).
28.
F.
Delodovici
,
P.
Barone
, and
S.
Picozzi
,
Phys. Rev. Mater.
5
(
6
), 064405 (
2021
).
29.
P.
Nukala
,
M.
Ahmadi
,
Y.
Wei
,
S.
Graaf
,
E.
Stylianidis
,
T.
Chakrabortty
,
S.
Matzen
,
H. W.
Zandbergen
,
A.
Björling
,
D.
Mannix
,
D.
Carbone
,
B.
Kooi
, and
B.
Noheda
, “
Reversible oxygen migration and phase transitions in hafnia-based ferroelectric devices
,”
Science
372
(
6542
),
630
635
(
2021
).
30.
M.
Dogan
,
N.
Gong
,
T.-P.
Ma
, and
S.
Ismail-Beigi
, “
Causes of ferroelectricity in HfO2-based thin films: An ab initio perspective
,”
Phys. Chem. Chem. Phys.
21
(
23
),
12150
12162
(
2019
).
31.
T. D.
Huan
,
V.
Sharma
,
G. A.
Rossetti
, and
R.
Ramprasad
, “
Pathways towards ferroelectricity in hafnia
,”
Phys. Rev. B
90
(
6
),
064111
(
2014
).
32.
P.
Wu
,
X.
Ma
,
Y.
Li
,
V.
Gopalan
, and
L.-Q.
Chen
, “
Dipole spring ferroelectrics in superlattice SrTiO3/BaTiO3 thin films exhibiting constricted hysteresis loops
,”
Appl. Phys. Lett.
100
(
9
),
092905
(
2012
).
33.
J. F.
Scott
,
E. K. H.
Salje
, and
M. A.
Carpenter
, “
Domain wall damping and elastic softening in SrTiO3 evidence for polar twin walls
,”
Phys. Rev. Lett.
109
(
18
),
187601
(
2012
).
34.
Y.
Lee
,
Y.
Goh
,
J.
Hwang
,
D.
Das
, and
S.
Jeon
, “
The influence of top and bottom metal electrodes on ferroelectricity of hafnia
,”
IEEE Trans. Electron Devices
68
(
2
),
523
528
(
2021
).
35.
A. N.
Morozovska
,
E. A.
Eliseev
,
S. V.
Svechnikov
,
A. D.
Krutov
,
V. Y.
Shur
,
A. Y.
Borisevich
,
P.
Maksymovych
, and
S. V.
Kalinin
, “
Finite size and intrinsic field effect on the polar-active properties of ferroelectric-semiconductor heterostructures
,”
Phys. Rev. B
81
(
20
),
205308
(
2010
).
36.
J. E.
Spanier
,
A. M.
Kolpak
,
J. J.
Urban
,
I.
Grinberg
,
L.
Ouyang
,
W. S.
Yun
,
A. M.
Rappe
, and
H.
Park
, “
Ferroelectric phase transition in individual single-crystalline BaTiO3 nanowires
,”
Nano Lett.
6
(
4
),
735
739
(
2006
).
37.
R. V.
Wang
,
D. D.
Fong
,
F.
Jiang
,
M. J.
Highland
,
P. H.
Fuoss
,
C.
Thompson
,
A. M.
Kolpak
,
J. A.
Eastman
,
S. K.
Streiffer
,
A. M.
Rappe
, and
G. B.
Stephenson
, “
Reversible chemical switching of a ferroelectric film
,”
Phys. Rev. Lett.
102
(
4
),
047601
(
2009
).
38.
D. D.
Fong
,
A. M.
Kolpak
,
J. A.
Eastman
,
S. K.
Streiffer
,
P. H.
Fuoss
,
G. B.
Stephenson
,
C.
Thompson
,
D. M.
Kim
,
K. J.
Choi
,
C. B.
Eom
,
I.
Grinberg
, and
A. M.
Rappe
, “
Stabilization of monodomain polarization in ultrathin PbTiO3 films
,”
Phys. Rev. Lett.
96
(
12
),
127601
(
2006
).
39.
S. V.
Kalinin
,
Y.
Kim
,
D. D.
Fong
, and
A. N.
Morozovska
, “
Surface-Screening mechanisms in ferroelectric thin films and their effect on polarization dynamics and domain structures
,”
Rep. Prog. Phys.
81
(
3
),
036502
(
2018
).
40.
N.
Domingo
,
I.
Gaponenko
,
K.
Cordero-Edwards
,
N.
Stucki
,
V.
Pérez-Dieste
,
C.
Escudero
,
E.
Pach
,
A.
Verdaguer
, and
P.
Paruch
, “
Surface charged species and electrochemistry of ferroelectric thin films
,”
Nanoscale
11
(
38
),
17920
17930
(
2019
).
41.
M. D.
Glinchuk
,
A. N.
Morozovska
,
A.
Lukowiak
,
W.
Stręk
,
M. V.
Silibin
,
D. V.
Karpinsky
,
Y.
Kim
, and
S. V.
Kalinin
, “
Possible electrochemical origin of ferroelectricity in HfO2 thin films
,”
J. Alloys Compd.
830
,
153628
(
2020
).
42.
C.
Zhou
and
D. M.
Newns
, “
Intrinsic dead layer effect and the performance of ferroelectric thin film capacitors
,”
J. Appl. Phys.
82
(
6
),
3081
3088
(
1997
).
43.
G.
Gerra
,
A. K.
Tagantsev
, and
N.
Setter
, “
Ferroelectricity in asymmetric metal-ferroelectric-metal heterostructures: A combined first-principles–phenomenological approach
,”
Phys. Rev. Lett.
98
(
20
),
207601
(
2007
).
44.
P.
Zubko
,
S.
Gariglio
,
M.
Gabay
,
P.
Ghosez
, and
J.-M.
Triscone
, “
Interface physics in complex oxide heterostructures
,”
Annu. Rev. Condens. Matter Phys.
2
(
1
),
141
165
(
2011
).
45.
J.
Bardeen
, “
Surface states and rectification at a metal semi-conductor contact
,”
Phys. Rev.
71
(
10
),
717
727
(
1947
).
46.
I. S.
Vorotiahin
,
E. A.
Eliseev
,
Q.
Li
,
S. V.
Kalinin
,
Y. A.
Genenko
, and
A. N.
Morozovska
, “
Tuning the polar states of ferroelectric films via surface charges and flexoelectricity
,”
Acta Mater.
137
,
85
92
(
2017
).
47.
A. N.
Morozovska
,
E. A.
Eliseev
,
I. S.
Vorotiahin
,
M. V.
Silibin
,
S. V.
Kalinin
, and
N. V.
Morozovsky
, “
Control of polarization reversal temperature behavior by surface screening in thin ferroelectric films
,”
Acta Mater.
160
,
57
71
(
2018
).
48.
V. M.
Fridkin
,
Ferroelectric Semiconductors
(
Springer
,
New York
,
1980
).
49.
G. B.
Stephenson
and
M. J.
Highland
, “
Equilibrium and stability of polarization in ultrathin ferroelectric films with ionic surface compensation
,”
Phys. Rev. B
84
(
6
),
064107
(
2011
).
50.
M. J.
Highland
,
T. T.
Fister
,
D. D.
Fong
,
P. H.
Fuoss
,
C.
Thompson
,
J. A.
Eastman
,
S. K.
Streiffer
, and
G. B.
Stephenson
, “
Equilibrium polarization of ultrathin PbTiO3 with surface compensation controlled by oxygen partial pressure
,”
Phys. Rev. Lett.
107
(
18
),
187602
(
2011
).
51.
S. M.
Yang
,
A. N.
Morozovska
,
R.
Kumar
,
E. A.
Eliseev
,
Y.
Cao
,
L.
Mazet
,
N.
Balke
,
S.
Jesse
,
R. K.
Vasudevan
,
C.
Dubourdieu
, and
S. V.
Kalinin
, “
Mixed electrochemical–ferroelectric states in nanoscale ferroelectrics
,”
Nat. Phys.
13
(
8
),
812
818
(
2017
).
52.
A. N.
Morozovska
,
E. A.
Eliseev
,
N. V.
Morozovsky
, and
S. V.
Kalinin
, “
Ferroionic states in ferroelectric thin films
,”
Phys. Rev. B
95
(
19
),
195413
(
2017
).
53.
A. N.
Morozovska
,
E. A.
Eliseev
,
N. V.
Morozovsky
, and
S. V.
Kalinin
, “
Piezoresponse of ferroelectric films in ferroionic states: Time and voltage dynamics
,”
Appl. Phys. Lett.
110
(
18
),
182907
(
2017
).
54.
A. N.
Morozovska
,
E. A.
Eliseev
,
A. I.
Kurchak
,
N. V.
Morozovsky
,
R. K.
Vasudevan
,
M. V.
Strikha
, and
S. V.
Kalinin
, “
Effect of surface ionic screening on the polarization reversal scenario in ferroelectric thin films: Crossover from ferroionic to antiferroionic states
,”
Phys. Rev. B
96
(
24
),
245405
(
2017
).
55.
A. N.
Morozovska
,
E. A.
Eliseev
,
A.
Biswas
,
N. V.
Morozovsky
, and
S. V.
Kalinin
,
Phys. Rev. Appl.
16, 044053 (2021).
56.
A. K.
Tagantsev
and
G.
Gerra
, “
Interface-induced phenomena in polarization response of ferroelectric thin films
,”
J. Appl. Phys.
100
(
5
),
051607
(
2006
).
57.
R.
Blinc
and
B.
Zeks
,
Soft Mode in Ferroelectrics and Antiferroelectrics
(
North-Holland Publishing Company
,
Amsterdam
,
1974
).
58.
A. K.
Tagantsev
,
K.
Vaideeswaran
,
S. B.
Vakhrushev
,
A. V.
Filimonov
,
R. G.
Burkovsky
,
A.
Shaganov
,
D.
Andronikova
,
A. I.
Rudskoy
,
A. Q. R.
Baron
,
H.
Uchiyama
,
D.
Chernyshov
,
A.
Bosak
,
Z.
Ujma
,
K.
Roleder
,
A.
Majchrowski
,
J.-H.
Ko
, and
N.
Setter
, “
The origin of antiferroelectricity in PbZrO3
,”
Nat. Commun.
4
(
1
),
2229
(
2013
).
59.
K. M.
Rabe
, “
Antiferroelectricity in oxides: A reexamination
,” in
Functional Metal Oxides
(
John Wiley & Sons, Ltd
), pp.
221
244
.
60.
J.
Hlinka
,
T.
Ostapchuk
,
E.
Buixaderas
,
C.
Kadlec
,
P.
Kuzel
,
I.
Gregora
,
J.
Kroupa
,
M.
Savinov
,
A.
Klic
,
J.
Drahokoupil
,
I.
Etxebarria
, and
J.
Dec
, “
Multiple soft-mode vibrations of lead zirconate
,”
Phys. Rev. Lett.
112
(
19
),
197601
(
2014
).
61.
K. Y.
Foo
and
B. H.
Hameed
, “
Insights into the modeling of adsorption isotherm systems
,”
Chem. Eng. J.
156
(
1
),
2
10
(
2010
).
62.
Y.
Tian
,
L.
Wei
,
Q.
Zhang
,
H.
Huang
,
Y.
Zhang
,
H.
Zhou
,
F.
Ma
,
L.
Gu
,
S.
Meng
,
L.-Q.
Chen
,
C.-W.
Nan
, and
J.
Zhang
, “
Water printing of ferroelectric polarization
,”
Nat. Commun.
9
(
1
),
3809
(
2018
).
63.
Y.
Cao
and
S. V.
Kalinin
, “
Phase-field modeling of chemical control of polarization stability and switching dynamics in ferroelectric thin films
,”
Phys. Rev. B
94
(
23
),
235444
(
2016
).
64.
S. M.
Sze
,
Physics of Semiconductor Devices
, 2nd ed. (
John Wiley and Sons Ltd
.,
New York
,
1981
).
65.
E.
Brochu
,
V. M.
Cora
, and
N.
de Freitas
, arXiv:1012.2599 [cs] (2010).
66.
D.
Lizotte
,
T.
Wang
,
M.
Bowling
, and
D.
Schuurmans
,
“Automatic gait optimization with Gaussian process regression,” in IJCAI’07: Proceedings of the 20th International Joint Conference on Artificial Intelligence, Hyderabad, India, 6–12 January
(Morgan Kaufmann Publishers Inc.,
2007
), pp.
944
949
.
67.
D.
Lizotte
,
Practical Bayesian regression,
Ph.D. thesis (University of Alberta, Edmonton,
2008
).
68.
V. M.
Cora
,
Model-Based Active Learning in Hierarchical Policies
(
University of British Columbia
,
2008
).
69.
M.
Frean
and
P.
Boyle
, “
Using Gaussian processes to optimize expensive functions
,” in
AI 2008 Advances in Artificial Intelligence
, Lecture Notes in Computer Science, edited by
W.
Wobcke
and
M.
Zhang
(
Springer
,
2008
), pp.
258
267
.
70.
R.
Martinez-Cantin
,
N.
de Freitas
,
E.
Brochu
,
J.
Castellanos
, and
A.
Doucet
, “
A Bayesian exploration-exploitation approach for optimal online sensing and planning with a visually guided mobile robot
,”
Auton. Robots
27
(
2
),
93
103
(
2009
).
71.
W.
Chu
and
Z.
Ghahramani
, “
Extensions of Gaussian processes for ranking: Semisupervised and active learning
,”
in NIPS Workshop on Learning to Rank, Whistler, Canada, 9 December
(
2005
).
72.
L. L.
Thurstone
, “
A law of comparative judgment
,”
Psychol. Rev.
34
(
4
),
273
286
(
1927
).
73.
F.
Mosteller
, “
Remarks on the method of paired comparisons I. The least squares solution assuming equal standard deviations and equal correlations
,” in
Selected Papers of Frederick Mosteller
, Springer Series in Statistics, edited by
S. E.
Fienberg
and
D. C.
Hoaglin
(
Springer
,
New York
,
2006
), pp.
157
162
.
74.
C. C.
Holmes
and
L.
Held
, “
Bayesian auxiliary variable models for binary and multinomial regression
,”
Bayesian Anal.
1
(
1
),
145
168
(
2006
).
75.
L.
Kotthoff
,
H.
Wahab
, and
P.
Johnson
, arXiv:2108.00002 [cond-mat, physics:physics] (2021).
76.
S.
Kiyohara
,
H.
Oda
,
K.
Tsuda
, and
T.
Mizoguchi
, “
Acceleration of stable interface structure searching using a kriging approach
,”
Jpn. J. Appl. Phys.
55
(
4
),
045502
(
2016
).
77.
S.
Kikuchi
,
H.
Oda
,
S.
Kiyohara
, and
T.
Mizoguchi
, “
Bayesian optimization for efficient determination of metal oxide grain boundary structures
,”
Physica B
532
,
24
28
(
2018
).
78.
T.
Ueno
,
T. D.
Rhone
,
Z.
Hou
,
T.
Mizoguchi
, and
K.
Tsuda
, “
COMBO: An efficient Bayesian optimization library for materials science
,”
Mater. Discovery
4
,
18
21
(
2016
).
79.
A.
Talapatra
,
S.
Boluki
,
T.
Duong
,
X.
Qian
,
E.
Dougherty
, and
R.
Arróyave
, “
Autonomous efficient experiment design for materials discovery with Bayesian model averaging
,”
Phys. Rev. Mater.
2
(
11
),
113803
(
2018
).
80.
S.
Hankins
,
L.
Kotthoff
,
S.
Ray
, and
I.
Fertig
, “
Bio-like composite microstructure designs for enhanced damage tolerance via machine learning
,” in
Proceedings of the American Society for Composites, Thirty-Fourth Technical Conference
(Destech Publications,
2019
).
81.
S. V.
Kalinin
,
M.
Ziatdinov
, and
R. K.
Vasudevan
, “
Guided search for desired functional responses via Bayesian optimization of generative model: Hysteresis loop shape engineering in ferroelectrics
,”
J. Appl. Phys.
128
(
2
),
024102
(
2020
).
82.
F.
Ren
,
L.
Ward
,
T.
Williams
,
K. J.
Laws
,
C.
Wolverton
,
J.
Hattrick-Simpers
, and
A.
Mehta
, “
Accelerated discovery of metallic glasses through iteration of machine learning and high-throughput experiments
,”
Sci. Adv.
4
(
4
),
eaaq1566
(
2018
).
83.
F.
Häse
,
L. M.
Roch
,
C.
Kreisbeck
, and
A.
Aspuru-Guzik
, “
Phoenics: A Bayesian optimizer for chemistry
,”
ACS Cent. Sci.
4
(
9
),
1134
1145
(
2018
).
84.
L.
Kotthoff
,
V.
Jain
,
A.
Tyrrell
,
H.
Wahab
, and
P.
Johnson
, “
AI for materials science: Tuning laser-induced graphene production
,” in
Data Science Meets Optimisation Workshop
(IJCAI 2019 Workshop,
2019
).
85.
A. M.
Gopakumar
,
P. V.
Balachandran
,
D.
Xue
,
J. E.
Gubernatis
, and
T.
Lookman
, “
Multi-objective optimization for materials discovery via adaptive design
,”
Sci. Rep.
8
(
1
),
3738
(
2018
).
86.
A.
Solomou
,
G.
Zhao
,
S.
Boluki
,
J. K.
Joy
,
X.
Qian
,
I.
Karaman
,
R.
Arróyave
, and
D. C.
Lagoudas
, “
Multi-objective Bayesian materials discovery: Application on the discovery of precipitation strengthened NiTi shape memory alloys through micromechanical modeling
,”
Mater. Des.
160
,
810
827
(
2018
).
87.
F.
Hutter
,
H. H.
Hoos
, and
K.
Leyton-Brown
, “
Sequential model-based optimization for general algorithm configuration
,” in
Learning and Intelligent Optimization
, Lecture Notes in Computer Science, edited by
C. A. C.
Coello
(
Springer
,
2011
), pp.
507
523
.
88.
B.
Shahriari
,
K.
Swersky
,
Z.
Wang
,
R. P.
Adams
, and
N.
de Freitas
, “
Taking the human out of the loop: A review of Bayesian optimization
,”
Proc. IEEE
104
(
1
),
148
175
(
2016
).
89.
I.
Andrianakis
and
P. G.
Challenor
, “
The effect of the nugget on Gaussian process emulators of computer models
,”
Comput. Statistics Data Anal.
56
,
4215
4228
(
2012
).
90.
A.
Pepelyshev
, “
The role of the nugget term in the Gaussian process method
,” in
mODa 9—Advances in Model-Oriented Design and Analysis
, Contributions to Statistics, edited by
A.
Giovagnoli
,
A. C.
Atkinson
,
B.
Torsney
, and
C.
May
(
Physica-Verlag HD
,
Heidelberg
,
2010
), pp.
149
156
.
91.
W.
Xing
,
S. Y.
Elhabian
,
V.
Keshavarzzadeh
, and
R. M.
Kirby
, “
Shared-Gaussian process: Learning interpretable shared hidden structure across data spaces for design space analysis and exploration
,”
J. Mech. Des.
142
(
8
),
081707
(
2020
).
92.
R.
Bostanabad
,
Y.-C.
Chan
,
L.
Wang
,
P.
Zhu
, and
W.
Chen
, “
Globally approximate Gaussian processes for big data with application to data-driven metamaterials design
,”
J. Mech. Des.
141
(
11
),
111402
(
2019
).
93.
C. B.
Erickson
,
B. E.
Ankenman
, and
S. M.
Sanchez
, “
Comparison of Gaussian process modeling software
,”
Eur. J. Oper. Res.
266
(
1
),
179
192
(
2018
).
94.
L. S.
Bastos
and
A.
O’Hagan
, “
Diagnostics for Gaussian process emulators
,”
Technometrics
51
(
4
),
425
438
(
2009
).
95.
H.
Chen
,
J. L.
Loeppky
,
J.
Sacks
, and
W. J.
Welch
, “
Analysis methods for computer experiments: How to assess and what counts?
,”
Stat. Sci.
31
(
1
),
40
60
(
2016
).
96.
H. J.
Kushner
, “
A new method of locating the maximum point of an arbitrary multipeak curve in the presence of noise
,”
J. Basic Eng.
86
(
1
),
97
106
(
1964
).
97.
D. R.
Jones
, “
A taxonomy of global optimization methods based on response surfaces
,”
J. Global Optim.
21
(
4
),
345
383
(
2001
).
98.
D. D.
Cox
and
S.
John
, “
A statistical method for global optimization
,” in
Proceedings of the 1992 IEEE International Conference on Systems, Man, and Cybernetics
(IEEE,
1992
), Vol. 2, pp.
1241
1246
.
99.
M. T. M.
Emmerich
,
K. C.
Giannakoglou
, and
B.
Naujoks
, “
Single- and multiobjective evolutionary optimization assisted by Gaussian random field metamodels
,”
IEEE Trans. Evol. Comput.
10
(
4
),
421
439
(
2006
).
100.
M.
Abdolshah
,
A.
Shilton
,
S.
Rana
,
S.
Gupta
, and
S.
Venkatesh
, “
Expected hypervolume improvement with constraints
,” in
2018 24th International Conference on Pattern Recognition (ICPR)
(IEEE,
2018
), pp.
3238
3243
.
101.
K.
Yang
,
M.
Emmerich
,
A.
Deutz
, and
T.
Bäck
, “
Multi-objective Bayesian global optimization using expected hypervolume improvement gradient
,”
Swarm Evol. Comput.
44
,
945
956
(
2019
).
102.
Z.
Wang
and
S.
Jegelka
, arXiv:1703.01968 [cs, math, stat] (2018).
103.
D.
Hernández-Lobato
,
J. M.
Hernández-Lobato
,
A.
Shah
, and
R. P.
Adams
, arXiv:1511.05467 [stat] (2016).
104.
M.
Abdolshah
,
A.
Shilton
,
S.
Rana
,
S.
Gupta
, and
S.
Venkatesh
, arXiv:1902.04228 [cs, stat] (2019).
105.
A.
Tran
,
M.
Eldred
,
S.
McCann
, and
Y.
Wang
,
SrMO-BO-3GP: A Sequential Regularized Multi-Objective Constrained Bayesian Optimization for Design Applications
(
American Society of Mechanical Engineers Digital Collection
,
2020
).
106.
A.
Biswas
,
C.
Fuentes
, and
C.
Hoyle
, “
A Mo-Bayesian optimization approach using the weighted Tchebycheff method
,”
J. Mech. Des.
144
,
011703
(
2021
).
107.
A.
Biswas
, “Hybrid statistical and engineering optimization architectures in early multidisciplinary designs of resilience and expensive black-box complex systems,” Ph.D. dissertation (Oregon State University, 2021).
108.
J.
Knowles
, “
ParEGO: A hybrid algorithm with on-line landscape approximation for expensive multiobjective optimization problems
,”
IEEE Trans. Evol. Comput.
10
(
1
),
50
66
(
2006
).
109.
S.
Daulton
,
M.
Balandat
, and
E.
Bakshy
, arXiv:2006.05078 [cs, math, stat] (2020).
110.
H.
Pohlheim
, Examples of objective functions (GEATbx.Com) (2006); available at http://www.geatbx.com/ver_3_3/fcnindex.html.
111.
J.
Blank
and
K.
Deb
, “
Pymoo: Multi-objective optimization in python
,”
IEEE Access
8
,
89497
89509
(
2020
).

Supplementary Material