This tutorial systematically introduces the foundational concepts undergirding the recently formulated AI (artificial intelligence)-based materials knowledge system (AI-MKS) framework. More specifically, these concepts deal with the feature engineering of the heterogeneous material internal structure to obtain low-dimensional representations that can then be combined with machine learning models to establish low-computational-cost surrogate models capturing the process–structure–property linkages over a hierarchy of material structure/length scales. Generally referred to as materials knowledge systems (MKS), this framework synergistically leverages the emergent AI/ML (machine learning) toolsets in conjunction with the modern experimental and physics-based simulation toolsets currently employed by domain experts in the materials field. The primary goal of this tutorial is to present to the domain expert the foundations needed to understand and take advantage of the impending opportunities arising from a synergistic integration of AI/ML tools into current materials innovation efforts, while identifying a specific path forward for accomplishing this goal.

Data analytics and machine learning tools have been shown to be invaluable in establishing low-computational cost surrogate models for a broad variety of applications ranging from recommendation systems (e.g., Ref. 1) to cancer detection (e.g., Ref. 2) to self-driving cars (e.g., Ref. 3). They are being increasingly explored for addressing challenges related to accelerated materials discovery and development—the central focus of the Materials Genome Initiative (MGI).4–13 Some of the notable successes in this direction have included the automated extraction of materials data from published reports (e.g., Refs. 14 and 15), automated identification of salient features in images (e.g., Refs. 16 and 17), and the identification/prediction of material chemistries and internal structures with unusual/superior combinations of properties (e.g., Refs. 18–29). Although these successes have identified many exciting new research avenues, they have also identified a central gap in the field today. It is generally observed that a simple brute-force application of the established machine learning tools to problems and challenges encountered in advanced materials development and deployment often fails to deliver the expected benefits. This is mainly because such explorations fail to leverage the vast amount of previously accumulated domain knowledge in the materials field. Therefore, in materials innovation efforts, there is a critical need and a tremendous opportunity for the development and deployment of novel frameworks12,30 that facilitate the synergistic use of the emergent data analytics tools in conjunction with the established toolsets of the materials science and engineering domain. The latter include a broad suite of sophisticated physics-based multiscale materials modeling tools31 and multiresolution materials structure and response characterization protocols (e.g., Refs. 32–36).

It is highly desirable to develop integrated mathematical frameworks that take advantage of the relative strengths of the different approaches mentioned above (i.e., data analytics, physics-based computations, multiresolution experiments). For example, most machine learning tools are inherently aimed at the interpolation of available high-dimensional data (usually comprising a small number of data points), with some of them paying special attention to uncertainty quantification (e.g., Bayesian approaches37–41). On the other hand, physics-based models and simulations provide the only avenues for building models with the potential for high-fidelity extrapolation. It is also important to recognize that experiments provide the only avenue for collecting ground truth data. Therefore, it is clear that the best strategy for accelerated materials innovation lies in our ability to develop and deploy novel frameworks capable of exploiting the relative strengths of all the different classes of toolsets mentioned above. It is also important to recognize that vast differences exist in the cost and fidelity of the different classes of tools mentioned above. For example, multiresolution experiments spanning multiple material length scales often require significant investments of time and money, while physics-based simulation tools are generally less expensive (on a relative basis). However, there is currently no framework for providing objective guidance to the researcher on where to invest their time and effort (e.g., Should one do more experiments or physics-based simulations? Which ones?) in order to optimally reach their targets in materials innovation (e.g., attain a specified combination of material properties or performance metrics). The above discussion points to the critical need for a foundational framework that can systematically and comprehensively extract the knowledge embedded in physics-based simulations and multimodal multiresolution experimental datasets, and express this core knowledge in forms that can objectively support and guide the accelerated materials innovation envisioned by MGI.

A central tenet in the field of materials science and engineering is that the processing history controls the material's internal structure over a hierarchy of length scales, which, in turn, controls the effective (macroscale) properties or performance characteristics exhibited by the material. The core materials knowledge needed to drive materials innovation is therefore most conveniently captured in the form of process–structure–property (PSP) linkages.13,42–47 Such PSP linkages can be formulated at salient material length/structure scales13,48 to facilitate computationally efficient scale-bridging in both directions (i.e., homogenization and localization13,49). Briefly, homogenization focuses on aggregating information from the lower material structure/length scale to the next higher scale, while localization addresses the spatial distribution (i.e., partitioning) of imposed quantities at the higher material structure/length scale to the next lower scale. The envisioned network of linkages is presented schematically in Fig. 1, where it is implied that a large library of low-computational cost, reduced-order (surrogate) PSP linkages could optimally drive the materials innovation effort. One of the salient aspects of the aggregated and curated PSP linkages shown in this figure is that their uncertainty will be rigorously modeled in a suitable Bayesian framework. Consequently, at any given time, one will be able to instantly answer any materials-related queries arising from design/manufacturing experts, while also quantifying the confidence levels in the provided answers. Simultaneously, the queries themselves could be used to prioritize and streamline future efforts aimed at refinement and/or expansion of the PSP linkages in order to provide a better answer at a future time.

FIG. 1.

Schematic depiction of the formulation of the core knowledge needed to support accelerated materials innovation, expressed in the form of homogenization and localization PSP linkages formulated at a hierarchy of materials structure/length scales.


The vision and paradigm presented in Fig. 1 are distinctly different from current practices in multiscale materials innovation efforts, which largely design and launch experiments/simulations in response to the specific needs articulated by the design/manufacturing end-user. Since most multiscale materials experiments and simulations demand significant time and effort, the data collection efforts are invariably slow (often requiring several months or even years). Most importantly, the decisions made by the domain specialists regarding which specific experiments or simulations are to be performed to obtain the required insights are often made in an ad hoc manner, relying largely on their own individual analysis of the data/information accessible to them. Consequently, the decisions made in the current workflows do not usually lead to optimal learning of the critical knowledge needed to drive the targeted materials innovation.

The novel concept presented in Fig. 1 fundamentally argues that it would actually be much more beneficial to pursue the aggregation and curation of the PSP linkages in a variety of materials classes and structure/length scales in a fully de-coupled manner. The overall scheme presented in this figure allows the different experts engaged in the many different aspects of materials science and engineering to generate and contribute their datasets for community-level curation of the underlying materials knowledge (e.g., see the data repositories aggregated by Materials Project,10 AFLOW,50 PRISMS,51 MDF,52,53 MDMC,54 and MEAD55). In this context, it is important to recognize that any single dataset (from either experiments or simulations) is unlikely to produce the PSP linkages depicted in Fig. 1. This is because any dataset produced from a single source (i.e., either an experimental setup or a software framework) is likely to provide only a partial clue to the overall puzzle. It should be further recognized that even this partial clue comes with inherent uncertainty that can be attributed to many factors, including (i) insufficient knowledge of the governing physics and (ii) the limitations of the tools and machines used to generate the data.

The discussion above emphasizes the critical need for a rigorous framework for the objective (i.e., data-driven) extraction of knowledge from disparate, incomplete, and uncertain datasets produced by the different experts in the materials science and engineering field. The strategy outlined in Fig. 1 effectively de-couples the data generation and aggregation tasks (these can be broadly referred to as materials data management tasks) from the knowledge extraction tasks (these can be broadly referred to as materials data analytics tasks).12,13,56,57 This fundamental separation of the tasks should allow the materials community to pursue the envisioned materials knowledge in a highly systematic and organized manner. Potentially, a community-level organization of the overall effort involved can lead to a highly optimized exploration of the unimaginably large materials space spanning a hierarchy of material length/structure scales. Indeed, the proposed transformation of the current practices in materials innovation efforts could streamline the efforts of the broader materials community into systematically and optimally building the core materials knowledge of high value sought by the design/manufacturing stakeholders. The novel mathematical framework and the associated toolsets needed to address the grand challenge depicted in Fig. 1 are referred to as the AI-based Materials Knowledge Systems (AI-MKS) in this paper. This tutorial expounds the foundational concepts and frameworks needed to pursue these knowledge systems.

As already expounded above, the core materials knowledge needed to objectively (and optimally) drive materials innovation efforts is best expressed as a very large and highly organized collection (i.e., a library) of PSP linkages whose uncertainties have been quantified. Figure 2 depicts examples of PSP linkages that could be targeted for the envisioned materials knowledge systems. It is emphasized that neither the lists of variables nor the set of PSP linkages depicted in this figure are intended to be comprehensive. These are intended to serve merely as examples. It is further noted that comprehensive and standardized lists of variables and the corresponding PSP linkages that hold high value for materials innovation efforts have not yet been identified or compiled systematically by the materials experts. Indeed, the lack of a standardized and broadly adopted taxonomy for the PSP linkages constitutes an important current gap and a significant hurdle for the realization of the vision depicted in Fig. 1. Addressing this critical need will positively impact the labeling and structuring of all materials data through the development and adoption of standardized schemas [e.g., the Materials Data Curation System (MDCS)58 developed at the National Institute of Standards and Technology (NIST)].

FIG. 2.

Examples of the different variables involved in the formulation of the hierarchical PSP linkages.


Before undertaking the immense task of creating a suitable taxonomy (or, equivalently, an ontology) for the aggregation and curation of the desired library of PSP linkages, it is important to establish suitable criteria/guidelines for this purpose. The following offers an initial list (to be further tweaked and refined by the materials research community): (i) Comprehensive—the list of variables identified for PSP linkages needs to cover all materials classes and the entire hierarchy of material structure/length scales. (ii) Versatile—the selected variables should be able to represent the broad diversity of the features involved in the PSP linkages with the most economical representations (i.e., allow the use of the smallest number of dominant features). This is particularly important for the variables selected to represent the details of the material internal structure (see the middle column of Fig. 2). (iii) Interoperable—the selected variables should maximize the interoperability of the formulated PSP linkages (in both the homogenization and localization directions; see Fig. 1). This criterion is the central key to ensuring that all of the relevant PSP linkages can be utilized in providing the most valuable responses to the queries from the design/manufacturing end-user (see Fig. 1). Note that several prior reports in the literature31,59,60 have emphasized the need for interoperability of the different software codes used by materials experts in simulating or predicting the macroscale response of a material or the performance of a device used in advanced technology. The concept articulated here is distinctly different in that it recognizes that we only need interoperability of the knowledge expressed at the different material length/structure scales. In other words, even if the software codes used to generate the data are not interoperable, the concept presented in Fig. 1 suggests that we can still extract high-value knowledge in the form of interoperable PSP linkages. This task is indeed substantially easier than requiring the interoperability of the diverse software components employed by the materials experts. In fact, interoperable PSP linkages are much more practical and can lead to a rigorous assessment of the uncertainty associated with the overall predictions made using the aggregated knowledge systems.

Next, it is important to recognize and clearly understand the expected functionality of the AI-MKS depicted schematically in Fig. 1. Broadly, most of the functionality of the AI-MKS can be envisioned to be delivered through three main components: (i) Diagnostics Engine—this set of computations is aimed at identifying similarities or differences between datasets. It can be used to perform tasks such as outlier analysis (i.e., how similar or dissimilar is a given set of data points to another set), machine health prognosis (e.g., is a given machine producing consistent material), and quality control tests (e.g., is a specified protocol being performed consistently). (ii) Prediction Engine—this set of computations is aimed at answering the “what if …” questions posed by the design/manufacturing end-user. As examples, one might be interested in assessing the impact of specific changes in material chemistry and/or process history on a desired set of macroscale material properties. (iii) Recommendation Engine—this set of computations is aimed at providing objective decision support in materials innovation. More specifically, it is aimed at identifying the specific next steps (from a list of potential options) that exhibit the highest potential for information gain toward a specified target. In the context of materials innovation efforts, the potential next steps typically constitute multiple options in either multiresolution experiments or physics-based simulations, while the target is usually specified in the form of a desired combination of macroscale material properties and performance criteria.

It should be noted that the desired functionality outlined above is highly ambitious, especially given the rich diversity of multiscale materials phenomena occurring in the broad variety of material systems of interest in advanced technologies. Given the challenges of disparate, incomplete, and uncertain materials data discussed earlier, the only practical path forward is to design an agile software platform that allows continuous, computationally efficient updates of the aggregated knowledge as new data becomes available. This requires the formation of synergistic partnerships between the materials science and computer science communities. Such partnerships, above all, hold the central key to the successful realization of the envisioned AI-MKS.

The overall effort involved in realizing the vision depicted in Fig. 1 faces two main technical hurdles: (i) a mathematical framework for the rigorous quantification of the material structure over a hierarchy of structure/length scales spanning from the subatomic to the macroscale (hereafter referred to as the feature engineering of the hierarchical material structure) and (ii) a learning framework that can objectively drive the curation of the desired PSP linkages from available disparate, uncertain, and incomplete materials data (hereafter referred to as the machine learning framework for PSP linkages). The other challenges anticipated in the implementation of the envisioned AI-MKS are largely non-technical (e.g., establishing productive collaborations between computer scientists, data scientists, and material scientists; community-level sharing of data and codes; training of materials specialists in the correct use of the emerging machine learning tools), and have been addressed elsewhere.5,57 In the rest of this tutorial, we will establish the necessary mathematical frameworks for addressing the two main technical hurdles identified above. As such, these two components constitute the foundational technical elements of the envisioned AI-MKS.

In order to appreciate the fundamental challenge of this task, it is instructive to imagine what details of the material structure need to be captured to fully describe the material state of a given sample. It should be noted that the current widely used material naming conventions are based largely on the overall chemical composition and some details of the final processing steps employed in the manufacture of the material. As an example, 7075-T6 Al61 generally implies that this Al metal alloy has 5.6%–6.1% Zn, 2.1%–2.5% Mg, and 1.2%–1.6% Cu, while the label “T6” implies the use of an aging treatment that results in an ultimate tensile strength of 510–572 MPa, a yield strength of 434–503 MPa, and a failure elongation of 5%–11%. Although the T6 temper is usually achieved by homogenizing the cast 7075 Al alloy at 450 °C for several hours, and then aging at 120 °C for 24 h, it does not automatically imply that this was the exact temper treatment employed. This is because it is entirely possible to obtain the set of properties specified above using other thermo-mechanical processing histories, even within the specified composition window. The main limitation of the current approach in naming the materials is that it does not capture or reflect the salient features of the hierarchical material structure that would uniquely identify the material and control its many properties or performance characteristics.

As already noted, the description of the material structure is heavily complicated by the fact that the material internal structure spans several length scales (from the subatomic to the macroscale). Over this large span of length scales covering about seven orders of magnitude, the material internal structure exhibits a very large set of salient features that can potentially influence the macroscale properties of interest. For example, if one looks into the mesoscale structures of typical structural alloys (e.g., Al alloys, Ti alloys, steels) using optical and/or scanning electron microscopes, one finds diverse multiphase polycrystalline microstructures exhibiting rich variations in grain orientation, grain/phase size, and morphology distributions. At the lower length scales, the crystalline arrangements exhibit an equally rich variety of defects (e.g., dislocation structures, solute segregation). Given this basic introduction to the material structure, it is clear that the number of features needed to define the hierarchical material structure in any given physical sample of a material is going to be extremely large (i.e., almost an infinite number of features would be required to uniquely identify each material). Since each distinct feature of the material structure can be treated as a dimension in its description, one can conclude that the rigorous quantification of the hierarchical material structure inherently demands an extremely high dimensional representation.

Most commonly employed approaches for the quantification of the material structure are based on highly simplified statistical measures. For example, there have been several attempts to correlate the yield strength and failure strength of metal alloys to the overall alloy composition, phase volume fractions, and the average grain sizes in the sample.62–67 Although these highly simplified statistical measures of the microstructure show tremendous potential, it should also be clear that they are woefully inadequate for establishing high-fidelity PSP linkages of the kind depicted in Fig. 1. In fact, established composite theories actually point out that one can only obtain elementary bounds49 on properties based on the simplified measures. Often, these elementary bounds are too widely separated to be of practical value in the rational design of materials meeting designer-specified property combinations. Indeed, more sophisticated (i.e., higher-order) statistical measures of the material structure are essential for establishing the desired higher fidelity models.47,68–71

Before we delve into the details of an advanced framework for efficient feature engineering of the material structure, we should clearly lay out our expectations for such a framework. In the context of the AI-MKS described in this tutorial (see Fig. 1), the necessary attributes of the framework include versatility (allowing generalization to all relevant structure/length scales in all material systems) and extensibility (allowing systematic incorporation of more detailed information as needed), while facilitating low-dimensional representations (i.e., needing only a small number of parameters) with a high ability to recover most of the original information. Needless to say, any framework adopted for material structure quantification needs to serve the primary mission of formulating the low-computational-cost, high-fidelity PSP linkages needed for AI-MKS.

At the outset, it should be recognized that most data about the material internal structure is derived from a variety of images. For example, one obtains data about the material internal structure through images (i.e., maps) obtained using a variety of microscopy techniques (e.g., optical, scanning electron, transmission electron), sometimes aided by tomography techniques to obtain three-dimensional information.72–75 Any single two-dimensional or three-dimensional map obtained in this process simply represents a single instantiation of the material internal structure (at a spatial resolution and an accuracy dictated by the characterization machine and protocols employed). In other words, any such map should be treated as a material structure and not the material structure.76–78 This is because it is possible to obtain a large number of material structure instantiations from a single physical sample, where the different instantiations exhibit unavoidable variance in the values of the extracted statistics. Indeed, it should be recognized that the material structure refers to a fictitious realization that is representative of a large ensemble of material structures obtained from a single sample. Such a representative material structure is generally referred to as the representative volume element (RVE), and can be established at the different salient material structure/length scales using suitably established criteria.79

The discussion above makes clear that the central notion behind the description of the material internal structure is essentially a spatial mapping of the local state of the material within the internal structure. Let h denote the local state of the material found at the spatial location x in a suitably defined RVE of the material structure. Both of these variables need additional clarification and discussion. The spatial location x has to have a certain material volume associated with it. This is because if one thinks of the spatial location as a point in space without any associated volume, then it is impossible to associate any local state descriptors with that spatial point. Note that all materials characterization techniques (e.g., diffraction, spectroscopy) need to probe a finite material volume in order to reliably quantify the local material state. Moreover, these measurements are often conducted on a fixed spatial grid to produce the desired material structure maps. Consequently, all experimentally measured material structure maps are implicitly associated with a spatial resolution dictated by the limitations of the characterization equipment and/or the protocols employed. In other words, the information captured in the structure maps reflects an averaged measure of the material structure over a very small volume that is implicitly associated with each spatial point in the structure map. Given these considerations, a practical path forward for capturing the details of the material structure at any selected length scale is to define the material local state on a uniformly discretized (2D or 3D) spatial grid, in much the same way that digital images are stored on a computer. Each grid point is then associated with a material volume or a voxel (equivalently, for a 2D description each grid point would be associated with a pixel). It should be noted that the discretized description of the material structure described here is equivalent to uniform sampling, and lends itself to computationally efficient transformations based on discrete Fourier transforms (DFTs).80,81

Next, let us formalize what material local state information should be associated with each voxel. Formally, the material local state h can include any and all attributes needed to fully describe the local material response at the scale of the voxel. Implicitly, the variables included in h would depend on the specifics of the material system and the material structure length scale under consideration. For example, at the mesoscale (voxel sizes in the range of ∼1 μm to ∼100 μm), most metals exhibit multiphase polycrystalline microstructures. The appropriate material structure scale for the description of these material structures is the grain scale, i.e., one would employ a multitude of voxels in each grain. For these material structures, one could include the thermodynamic phase identifier, α, the chemical composition of the phase, c, the crystal lattice orientation, g, and suitably defined defect densities such as the dislocation density, ρ. As already discussed, these would have to be defined as averaged quantities over the voxel volume. Based on this example, it can be seen that the complexity of the local material state in most advanced materials demands a multivariate description. For the example described above, h = (α, c, g, ρ). Note also that some of the variables used in the description of h themselves demand multivariate descriptions. In the example above, the phase composition may need to be represented by a set of chemical compositions of the individual chemical elements, the lattice orientation is usually defined by an ordered set of three rotation angles called Euler angles,82 and one might be interested in including additional defect and/or damage measures (e.g., microporosity, density of small cracks). At a lower material structure length scale, say the atomic structure, one needs to adopt a different definition of the local material state. If one were to voxelize the atomic structure of a material where the atoms are represented as hard spheres,48,83 the local material state could be described using an identifier for the chemical species.

It should be clear that the definition of the material local state h is quite flexible and can be used to incorporate any attributes of interest. In the examples described above, we mainly restricted our attention to attributes that describe the material structure at the lower length scale. For example, the definition h = (α, c, g, ρ) is exclusively based on averaged statistical measures of the material structure at the lower length scale. As an alternative, one can also define the local material state directly by the local properties exhibited at the voxel scale. As an example, one can choose to represent the local material state at the mesoscale as h = (C, Y), where C denotes the fourth-rank elastic stiffness tensor and Y denotes the anisotropic plastic yield surface for the material in the voxel. Similarly, at the atomic scale, one can use a combination of physical parameters instead of the identifier for the chemical species. For example, one can employ a combination of parameters such as atomic mass, atomic radius, and electronegativity to describe the material local state at the atomic scale. This alternate expression of the material local state directly in terms of physical properties of interest allows for a better interpolation of the material structure, facilitating the formulation of high-fidelity PSP linkages across different chemical compositions and/or thermodynamic phases.

The voxelized representation of the material volume at a hierarchy of material structure/length scales and the assignment of suitably defined material local states to each voxel can provide a sufficiently accurate and versatile description of the highly complex material structure. However, this would result in an extremely high-dimensional representation that will be unwieldy for formulating the PSP linkages of the kind depicted in Fig. 1. Most importantly, as already described earlier, we need a stochastic formulation that allows a quantification of the uncertainty in the formulated PSP linkages. In the context of material structure quantification, uncertainty arises from the limitations in the capabilities of the materials characterization equipment, unintended changes introduced into the material structure in the preparation of samples for the characterization protocols, and the availability of limited or incomplete data (e.g., availability of only two-dimensional maps of material structure, inadequate number and size of microstructure maps or scans).

A stochastic material structure can be defined by invoking a material structure function,84 m(h, x), which reflects the probability density for finding local state h at the spatial location x. For reasons already discussed above, it is much more practical to employ a discretized version of the microstructure function, defined as the array m[h, s], where s ∈ S indexes the voxels in the discretized material volume. In this notation, s denotes a multi-dimensional integer index. For example, in describing 3D microstructures, it is convenient to represent the index as s = (s_1, s_2, s_3). Furthermore, we will also restrict our attention in this tutorial to situations where the local state h is limited to a finite number of discrete choices. For this special case, m[h, s] simply reflects the volume fraction of local state h in the voxel indexed by s. Extensions of the concepts described in this tutorial to more complex choices of the local state descriptors can be found in recent publications.13,27,85,86

For the special case of discretized material structures (i.e., where both the material volume as well as the local state can be indexed suitably), the material structure array m[h, s] exhibits the following properties:
$$0 \le m[h,s] \le 1, \qquad \sum_{h=1}^{H} m[h,s] = 1, \qquad \sum_{s \in S} m[h,s] = V[h]\,|S|,$$
(1)
where V[h] denotes the volume fraction of local state h in the entire structure and |S| denotes the total number of voxels in the microstructure. Note that m[h, s] admits fractional values (between zero and one). The deficiency in utilizing m[h, s] directly for the quantification of the material structure lies in the lack of a natural origin for indexing the spatial cell variable, s. In other words, m[h, s] lacks translational invariance, because material structure images taken from different locations (using different origins for indexing s) on the same sample can exhibit very different values of m[h, s], even though their underlying material structure statistics are expected to be quite similar to each other. A rigorous statistical treatment of the material structure as a random process suggests the use of the framework of n-point spatial correlations (also called n-point statistics).47,68,69,76,77,87 This formalism provides a natural approach for quantifying the material structure in ways that allow the systematic addition of higher-order information (as the value of n is increased). For example, 1-point statistics are the simplest form of n-point statistics and capture information on the volume fractions of the different local states present in the material structure.
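To make these definitions concrete, the following minimal Python/NumPy sketch builds the microstructure array m[h, s] for a hypothetical two-phase microstructure on a 64 × 64 pixel grid and verifies the properties in Eq. (1); the grid size and target volume fraction are illustrative assumptions, not values taken from the studies cited here.

```python
import numpy as np

# Hypothetical two-phase (H = 2) microstructure segmented on a 64 x 64 grid,
# encoded as 0/1 phase labels; the ~30% phase fraction is illustrative.
rng = np.random.default_rng(0)
labels = (rng.random((64, 64)) < 0.3).astype(int)

# One-hot encode into the microstructure array m[h, s] of Eq. (1). For fully
# segmented data each entry is 0 or 1; sub-voxel phase mixtures would give
# fractional values between 0 and 1.
H = 2
m = np.stack([(labels == h).astype(float) for h in range(H)])  # shape (H, 64, 64)

# Verify the properties listed in Eq. (1).
assert np.all((m >= 0.0) & (m <= 1.0))              # bounded in [0, 1]
assert np.allclose(m.sum(axis=0), 1.0)              # local states sum to 1 per voxel
volume_fractions = m.sum(axis=(1, 2)) / m[0].size   # 1-point statistics V[h]
print(volume_fractions)                             # approximately [0.7, 0.3]
```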

The next level of n-point statistics is the 2-point statistics, which denote the probability of finding local states h and h′ separated by a vector indexed by r. Since the vectors that can be placed in a voxelized volume are also naturally discretized (see Fig. 3), r can also be treated as a vector integer index (allowed to take only integer values for its vectorial components). Indeed, the indices s and r share many common features. The main difference is that while the index s enumerates each of the voxels, the index r enumerates the vectors that can be placed into the discretized material structure (essentially as a difference between any two values of s).

FIG. 3.

Illustration of the s and r vector integer indices used to label the discretized spatial bins (i.e., pixels or voxels) and the corresponding discretized vector space.

Mathematically, a discretized description of the 2-point statistics can be expressed by the array69,77
$$f[h, h', r] = \frac{1}{|S_r|} \sum_{s \in S_r} m[h, s]\, m[h', s + r],$$
(2)
where |S_r| represents the normalization factor denoting the available number of trials for each unique discrete vector indexed by r. It should be noted that the discretization of the vector space is accomplished in a way that is completely consistent with the discretization used for the material structure image using the integer index array s (cf. Ref. 13). The 2-point statistics array defined in Eq. (2) can be computed efficiently using discrete Fourier transforms (DFTs).76,77,88 The formalism presented in Eqs. (1) and (2) represents the most comprehensive and systematic digital representation of the material structure available today (see Ref. 47 for a discussion of how it relates to other traditionally used measures of the microstructure and Ref. 89 for reconstructions of the original microstructure from the 2-point statistics). It has also been pointed out that this formalism (i) provides a comprehensive treatment of the neighborhood of a selected voxel as a stochastic variable,76,77 and (ii) connects directly with the most sophisticated physics-derived composite theories available in the published literature.13,47,68,90
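The FFT-based evaluation of Eq. (2) can be sketched as follows, under the common simplifying assumption of periodic boundary conditions (so that |S_r| = |S| for every vector r); non-periodic structures would require appropriately padded variants of this computation. The function reuses the array m from the earlier sketch.

```python
import numpy as np

def two_point_stats(m_h, m_hp):
    """Periodic 2-point statistics f[h, h', r] of Eq. (2), computed via FFTs.

    Assumes periodic boundaries, so the number of trials |S_r| equals the
    total number of voxels |S| for every vector r.
    """
    corr = np.fft.ifftn(np.conj(np.fft.fftn(m_h)) * np.fft.fftn(m_hp)).real
    return np.fft.fftshift(corr / m_h.size)  # place the zero vector at the center

# Autocorrelation of phase 1 from the earlier sketch; as discussed for Fig. 4,
# the value at the center (zero vector) equals the phase volume fraction.
f_11 = two_point_stats(m[1], m[1])
center = tuple(n // 2 for n in f_11.shape)
print(f_11[center])  # approximately 0.3
```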

In order to illustrate how the 2-point statistics capture the important attributes of microstructure morphology, it is instructive to look at a few idealized microstructures and their corresponding 2-point statistics computed using Eq. (2). Figure 4 shows a few digitally created (i.e., synthetic) microstructures from Ref. 88 in the top row and their corresponding 2-point statistics in the bottom row. The first four digital microstructures were produced by randomly placing ellipses of a selected size, while the fifth microstructure was produced by randomly placing spheres of a selected size. Furthermore, in the first two micrographs, the ellipses were oriented in a single direction (horizontal in the first one and vertical in the second one), while the third microstructure was generated by placing ellipses in two selected orientations. The fourth microstructure was generated by placing ellipses in arbitrary orientations. One of the most important statistics in an autocorrelation map is the one corresponding to the zero vector, which appears at the center of the map. The value at the center of the autocorrelation maps in Fig. 4 reflects the volume fraction of the corresponding local states. The central pattern in the middle of the autocorrelation captures information on the phase morphology. Note that the central pattern in the autocorrelation clearly captures the elliptical shape and its orientation quite well in the first two microstructures. For the third microstructure, the pattern in the middle of the autocorrelation reflects a combined morphology resulting from both orientations of the ellipses in the microstructure. Since the fourth microstructure has ellipses oriented in all directions, the average morphology is reflected in the roughly equiaxed pattern in the middle of the autocorrelation (the slight elliptical shape in this plot is a consequence of the fact that it is nearly impossible to obtain a perfectly random distribution of oriented ellipses in any finite domain; larger microstructural domains will make the central pattern in this plot approach a circle). In addition to the central feature, there are many local peaks and patterns visible in the autocorrelation maps in the bottom row of Fig. 4. These additional local peaks and patterns carry information about the spacing of the ellipses (or circles) in the corresponding microstructure. For example, the horizontal bands just outside the central ellipse in the autocorrelation of the first microstructure indicate that the ellipses are more aligned with each other in the horizontal direction than in the vertical direction. This can indeed be confirmed in the corresponding microstructure. It is also noted that the autocorrelation for the fourth microstructure depicts fewer discernible additional patterns or peaks outside the central pattern. This indicates that this particular microstructure exhibits a higher level of randomness (i.e., disorder) compared to the other microstructures in this example.

FIG. 4.

Example digital microstructures (top row) and their corresponding 2-point statistics (bottom row).

The last step in feature engineering is establishing a high-value low-dimensional representation of the material internal structure. The n-point statistics defined above are indeed very high-dimensional, where the statistic for each spatial configuration (i.e., each spatial arrangement of specified local states) constitutes one dimension. The term “high-value” in the context of this discussion refers to the efficacy of the low-dimensional microstructure representation in arriving at reliable and robust PSP linkages. This is exactly where some of the data science toolsets become very valuable. Prior work13 has demonstrated the remarkable efficacy of principal component analysis (PCA) in obtaining low-dimensional representations of 2-point statistics and establishing high-fidelity PSP linkages. PCA essentially transforms the 2-point statistics into a new reference frame that allows the ensemble of microstructure statistics to be represented most economically in terms of capturing the variance between the elements of the ensemble. In other words, at any selected truncation level, this representation guarantees the capture of the highest amount of variance between the data points, within the constraints of a linear, distance-preserving transformation. For performing PCA, it is convenient to aggregate the n-point statistics deemed important for a selected application into an array denoted by f[k, r], where k = 1, 2, …, K indexes all microstructures in the ensemble. In the principal component space, the desired set of spatial statistics of the kth microstructure can be expressed as
$$f[k, r] \approx \sum_{i=1}^{R} \alpha[k, i]\, \psi[i, r] + \bar{f}[r],$$
(3)
where R is the number of retained components (typically very small) and f̄[r] is the ensemble average. The ψ[i, r] are called the basis vectors; they systematically identify the spatial patterns that explain the variance between the microstructures in a given ensemble. The α[k, i], on the other hand, are called the PC scores; they represent the kth microstructure in the PC space. In other words, each row of this array provides a low-dimensional representation of one microstructure in the ensemble, in the form of PC_i with i = 1, 2, …, R. One of the main benefits of Eq. (3) is that it provides the highest capability for the reconstruction of the original spatial statistics through the use of the stored values of the basis vectors, ψ[i, r], and the ensemble average, f̄[r].
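As an illustration of Eq. (3), the following sketch applies scikit-learn's PCA to an ensemble of flattened 2-point statistics. The synthetic ensemble generated here is purely hypothetical (it reuses two_point_stats from the earlier sketch) and stands in for statistics computed from real micrographs.

```python
import numpy as np
from sklearn.decomposition import PCA

# Hypothetical ensemble: K synthetic microstructures, each summarized by its
# flattened phase-1 autocorrelation. In practice the rows f[k, r] would come
# from real micrographs or simulated RVEs.
K = 50
rng = np.random.default_rng(1)
f_flat = []
for _ in range(K):
    s_k = (rng.random((64, 64)) < rng.uniform(0.2, 0.4)).astype(float)
    f_flat.append(two_point_stats(s_k, s_k).ravel())
f_flat = np.array(f_flat)  # shape (K, 4096): rows are f[k, r]

# PCA subtracts the ensemble average f_bar[r] and finds the orthonormal basis
# psi[i, r] of Eq. (3); the scores alpha[k, i] are the low-dimensional
# representation of each microstructure.
pca = PCA(n_components=5)
pc_scores = pca.fit_transform(f_flat)           # alpha[k, i], shape (K, 5)
psi = pca.components_                           # psi[i, r], shape (5, 4096)
print(pca.explained_variance_ratio_.cumsum())   # variance captured vs. R

# Reconstruction of the spatial statistics per Eq. (3).
f_recon = pc_scores @ psi + pca.mean_
```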

As a simple illustration, Fig. 5 shows the PC representation in the first two dimensions for the spatial statistics aggregated from an ensemble of 287 experimentally obtained segmented micrographs of superalloy samples exhibiting two-phase microstructures.91 In this plot, each data point represents the 36 360 spatial statistics extracted from one micrograph (i.e., microstructure). It was also reported that the first two PC scores captured 93.3% of the variance among the different microstructures in this ensemble. This example shows the power of PCA in attaining a dimensionality reduction (i.e., from 36 360 to 2) with minimal loss of information. Specific microstructures corresponding to eight selected points (shown in different colors) in this plot are identified at the top and bottom of the PC plot. In this study, the microstructures were obtained by aging the superalloy samples at different temperatures for different time periods. The aging temperature, the aging time, and the area fraction of γ precipitates for each microstructure are given at the top of each micrograph. These eight points were specifically selected to illustrate what features are captured in PC1 and PC2. The points selected in the top row have very different PC2 values compared to the points selected in the bottom row, while the PC1 value within each row increases systematically from left to right. Each pair of points with the same color has similar PC1 scores but significantly different PC2 scores. A careful study of the colored points and their microstructures in Fig. 5 indicates that PC1 strongly correlates with the precipitate area fraction, while PC2 appears to capture the coarsening and coalescence of precipitates. A full interpretation of the PC scores is complicated by the large dimensionality of the PC basis, which is equal to the number of collected spatial statistics (36 360 in this example). Tools aimed at the improved interpretation of the PC basis are a topic of active research in many fields, even outside materials research, where similar tools have been deployed extensively.

FIG. 5.

PC representation of a set of 287 superalloy microstructure images in the first two dimensions. Microstructures corresponding to the selected rows of colored points are shown above and below the PC plot. Each pair of points with the same color is selected such that their PC1 scores are close and the main difference is in their PC2 scores. The microstructures in each row (above and below the plot) are selected such that they are primarily different from each other in their PC1 scores. Reproduced with permission from Acta Mater. 178, 45 (2019). Copyright 2019 Elsevier.


Before closing this section, a few additional remarks are warranted. In the formalism presented here, the information on grain (or phase) boundary character is not explicitly incorporated; it is captured only indirectly, in that changes in the phase or grain orientation from one voxel to the next imply the presence of a phase or grain boundary. Indeed, while the lattice misorientation across the boundary is captured to a higher fidelity in this formalism, the boundary plane itself is captured only to the level of accuracy allowed by the discretized representation of the material volume. If such material local states are deemed important for the problem, they need to be explicitly included in the descriptors of the material local state. Another salient aspect of the presented framework is that it produces an objective (unsupervised) low-dimensional representation of the material structure statistics that automatically maximizes our ability to reconstruct the statistics of the original material structure. This, of course, is a property of PCA. The reconstruction of the original material structure from the statistics, however, requires other sophisticated computational strategies.89,92–94 Although a number of other dimensionality reduction strategies exist (e.g., kernel PCA,95 local linear embedding,96 local tangent space alignment97), they have thus far not been found to produce an improved unsupervised classification of the material structure statistics in the many case studies explored by our research group. Further investigations are clearly needed to critically explore the utility and efficacy of these methods for AI-MKS applications.

Once the features have been identified for the inputs and outputs of interest, the next task in creating the foundational elements of the AI-MKS is the formulation of reduced-order, low-computational-cost PSP linkages. In this task, one usually starts with a suitable dataset (which could come from physics-based simulation tools or from experimental protocols). As an example, one might have digitally generated a large ensemble of 3D RVEs and evaluated their effective mechanical properties using micro-mechanical finite element (FE) models that incorporate sophisticated physics about the constitutive responses of the microscale constituents and their interactions. In this case, the PC scores of the microstructure (treated as inputs) and its FE-predicted effective mechanical property (treated as the output) would constitute a single data point, and a collection of such data points would constitute a dataset. As another example, in studies of microstructure evolution using phase-field models, the averaged chemical compositions and the process parameters driving the microstructure evolution would be treated as inputs and the time-evolving PC scores of the microstructure statistics would be treated as outputs. Note that one can establish suitably modified definitions for experimentally acquired datasets.

An important component of all reduced-order model building efforts is the validation of the model. Since modern machine learning tools employ highly sophisticated algorithms for training the desired models, they often employ a very large number of implicit model-fit parameters. Since a larger number of learnt model-fit parameters is very likely to improve the model predictions, there is always an incentive to keep increasing the complexity of the model. However, it is important to understand that the most likely outcome of unnecessarily increasing the model complexity is an over-fit, characterized by a dramatic loss in the predictive accuracy of the model for new inputs. Therefore, it is very important to design and adopt a rigorous validation strategy. In our work,98–100 we have generally favored a hybrid approach that utilizes (i) a hard train-test split of the dataset, where a certain fraction of the available data points are randomly selected and set aside for the critical validation of the trained model (these are referred to as test data points and are not exposed to the model training effort in any manner), and (ii) leave-one-out cross-validation (LOOCV) during the training phase of the model. In LOOCV, one data point is excluded from the training set while building the model and is then used to quantify the prediction error. The process is repeated by excluding each of the training data points, one at a time. Consequently, one collects as many evaluations of the model prediction error as there are training data points. This set of errors is referred to as the LOOCV errors. When the LOOCV errors are higher than the errors obtained using all of the training points, an over-fit of the model is indicated. Therefore, LOOCV is valuable in guiding the model training process while avoiding an over-fit. Separately, the prediction errors from the test data points that have not been exposed to any aspect of the model training are used to critically assess the accuracy of the model. A number of different variants of the protocols described above are indeed possible. In fact, one of the most commonly employed approaches is k-fold cross-validation, where the data are partitioned randomly into k equal folds, and cross-validation is performed by excluding one fold at a time. In our prior work, we found the hybrid approach described above to be much more systematic, as it formally separates the model training phase from the model testing phase. The use of LOOCV in the model training phase allows the optimal development of the surrogate model with the small training datasets that are typically available in the materials innovation problems discussed in this tutorial.
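A minimal sketch of this hybrid validation protocol using scikit-learn is given below, continuing the hypothetical running example (the PC scores from the earlier PCA sketch serve as inputs, and the outputs here are random placeholders, not real property data; ridge regression is used only as a simple stand-in model).

```python
import numpy as np
from sklearn.linear_model import Ridge
from sklearn.model_selection import LeaveOneOut, cross_val_score, train_test_split

# Hypothetical dataset: two PC scores as inputs, a placeholder output.
rng = np.random.default_rng(2)
X = pc_scores[:, :2]
y = rng.normal(size=K)  # stand-in for an effective property; not real data

# (i) Hard train-test split: the test points never touch model training.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=0)

# (ii) LOOCV on the training set only, used to guide model selection
# (e.g., choosing the regularization strength) while avoiding over-fit.
model = Ridge(alpha=1.0)
loocv_errors = -cross_val_score(model, X_train, y_train, cv=LeaveOneOut(),
                                scoring="neg_mean_absolute_error")
print("mean LOOCV error:", loocv_errors.mean())

# Final, critical assessment on the untouched test points.
model.fit(X_train, y_train)
print("test error:", np.abs(model.predict(X_test) - y_test).mean())
```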

Reduced-order models of interest (i.e., PSP linkages) can be established using a variety of strategies. Broadly, these fall into two main categories: (i) regression approaches and (ii) Bayesian-inference based approaches. In the regression approaches, one defines an error measure and identifies the values of the adjustable model-fit parameters that minimize the average error over the training data points. The simplest of the regression approaches is least-squares regression,81 which can be augmented using a variety of regularization techniques (e.g., ridge regression,101 the least absolute shrinkage and selection operator (LASSO),102 and the elastic net103) to mitigate the propensity for over-fit. Modern sophisticated implementations of the regression approaches can be found in Neural Networks (NNs)104–106 and Convolutional Neural Networks (CNNs).107–109 The approaches based on Bayesian inference offer a powerful alternative to the regression techniques, especially for problems with relatively small datasets (i.e., a small number of training data points available for model building). In these approaches, regularization is accomplished by prescribing a suitable prior distribution on the unknowns (these could be parameters of an assumed model form or directly the unknown function itself). In general, one might argue that the efficacy of Bayesian inference is likely to be controlled significantly by the specific details of the assumed prior. Fortunately, in the field of materials science and engineering, there exists significant prior knowledge established by the domain experts in the form of known materials physics. Priors informed by established but uncertain physics in the materials field offer a powerful approach to building the desired surrogate PSP linkages. Specific examples of such model building approaches include Bayesian Linear Regression (BLR)26,110,111 and Gaussian Process regression (GPR).112–115 One of the main benefits of these statistical approaches is that they provide a natural framework for the rigorous treatment of the uncertainty in the model predictions for new inputs. This, in turn, can provide objective guidance on where new training points should be generated in order to optimize the potential gain in the fidelity of the model being built. Bayesian inference approaches are therefore ideally suited for most multiscale materials design problems, where the cost of generating data points (either experimental or simulated) is very high. There exist a number of adaptive sampling strategies for addressing the optimal selection of inputs for data generation; these are generally referred to as sequential design strategies and utilize criteria such as maximum surrogate uncertainty,116 maximum difference from current estimates made by the surrogate model,117,118 or maximum expected improvement in the fit of the Gaussian process model to the noisy training data.119
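As a brief illustration of the Bayesian-inference route, scikit-learn's BayesianRidge implements a simple Bayesian linear regression whose predictions carry uncertainty estimates. This sketch continues the hypothetical example above and is not drawn from the cited studies.

```python
from sklearn.linear_model import BayesianRidge

# Bayesian linear regression: a Gaussian prior on the model coefficients
# provides the regularization, and predictions come with uncertainty.
blr = BayesianRidge()
blr.fit(X_train, y_train)
y_mean, y_std = blr.predict(X_test, return_std=True)  # predictive mean and std
```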

Among the different model building approaches described above, GPR offers a particularly powerful toolset for the envisioned AI-MKS. In addition to allowing a formal treatment of the uncertainty in the model predictions (a feature central to building the diagnostics, prediction, and recommendation engines described earlier), it employs a non-parametric approach (i.e., it does not invoke a specific model form). This is of tremendous value for establishing reduced-order PSP linkages of the kind depicted in Figs. 1 and 2, for which generalized model forms exhibiting high fidelity have not yet been established in the prior literature. Indeed, GPR is being increasingly utilized in the current literature for addressing a broad range of materials problems.91,110,113,116,117,120–126

GPR127,128 starts with a prior distribution over the desired function, defined as a joint Gaussian distribution over a set of input (i.e., feature) values denoted by the vector x (assumed to be of dimension D) and the corresponding output (i.e., target) values denoted by y. Mathematically, the function of interest is assumed to be represented by a Gaussian process (GP) as
$$y(\mathbf{x}) \sim \mathcal{N}\big(\mu(\mathbf{x}),\, K(\mathbf{x}, \mathbf{x}')\big),$$
(4)
where μ(x) and K(x, x′) represent the mean and the covariance of the GP, respectively, and N(·) denotes a multivariate normal distribution. In GPR, the covariance is generally defined using a kernel function k(x, x′), whose selection plays a critical role in the accuracy of the model predictions for test inputs. An automatic relevance determination squared exponential (ARDSE) kernel is a good choice for our needs in formulating PSP linkages, since it allows the use of different interpolation hyperparameters for the different input variables. The ARDSE kernel is mathematically expressed as
$$k(\mathbf{x}, \mathbf{x}') = \sigma_f^2 \exp\!\left(-\frac{1}{2} \sum_{d=1}^{D} \frac{(x_d - x_d')^2}{l_d^2}\right) + \sigma_n^2\, \delta_{\mathbf{x}\mathbf{x}'},$$
(5)
where σ_f is a hyperparameter controlling the scaling of the variance in the output values, l_d is the interpolation hyperparameter associated with the input variable x_d, and σ_n is the hyperparameter capturing the output noise. The hyperparameters listed above serve as the fit parameters for the GPR and are estimated from the training data by maximizing the marginal (log) likelihood with respect to the hyperparameters129 using gradient-ascent optimization algorithms.130 While GPR is accessible through many software packages (e.g., MATLAB,131 R132), it is instructive to understand the roles of the different hyperparameters introduced in Eq. (5). Smaller values of the interpolation parameters l_d indicate a higher sensitivity of the output y to the specific input variable x_d. In other words, as the value of l_d increases, the specific input variable x_d exhibits less influence on the predicted value of the output variable. However, very small values of l_d would result in noisier predictions. The output noise σ_n in Eq. (5) is assumed to be independent of the input (referred to as homoscedasticity). For high fidelity data (such as those obtained from established simulations or highly validated experimental protocols), the value of σ_n should be very low (it can even be taken as zero if justified).
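As one more package option alongside those listed above, Python's scikit-learn can assemble an ARD-type kernel of the form in Eq. (5) from standard components: a ConstantKernel supplies the σ_f² scaling, an anisotropic RBF supplies one length scale l_d per input dimension, and a WhiteKernel supplies the σ_n² noise term. The sketch below continues the hypothetical running example and is not the implementation used in the cited studies.

```python
import numpy as np
from sklearn.gaussian_process import GaussianProcessRegressor
from sklearn.gaussian_process.kernels import RBF, ConstantKernel, WhiteKernel

# ARD-type squared exponential kernel, cf. Eq. (5).
D = X_train.shape[1]
kernel = (ConstantKernel(1.0) * RBF(length_scale=np.ones(D))
          + WhiteKernel(noise_level=1e-3))

# Hyperparameters are estimated by maximizing the log marginal likelihood
# (gradient-based optimization, restarted to reduce the risk of local optima).
gpr = GaussianProcessRegressor(kernel=kernel, n_restarts_optimizer=5)
gpr.fit(X_train, y_train)
print(gpr.kernel_)  # fitted sigma_f, l_d, and sigma_n values

y_mean, y_std = gpr.predict(X_test, return_std=True)
```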
Let $X$ and $X_*$ denote the $N \times D$ and $N_* \times D$ input matrices for the training and test points, respectively, where $N$ and $N_*$ denote the numbers of training and test points. Let $\mathbf{y}$ and $\mathbf{y}_*$ denote the output vectors for the training and test points, respectively. Let $K = K(X,X)$, $k_* = k(X,X_*)$, and $K_{**} = K(X_*,X_*)$ denote the covariance matrices computed using the kernel function [see Eq. (5)] on the respective combinations of training and test inputs. In GPR, the predictive distribution for the test points is expressed as
$$\boldsymbol{\mu}_* = k_*^{\top} K^{-1} \mathbf{y}, \qquad \Sigma_* = K_{**} - k_*^{\top} K^{-1} k_*, \tag{6}$$
where $\boldsymbol{\mu}_*$ and $\Sigma_*$ denote the mean and the variance (i.e., uncertainty) in the predictions for the test points, respectively, and $(\cdot)^{\top}$ represents the transpose operation. The main challenge in calculating the mean and variance comes from the inverse operation on the kernel matrix of the training points, which is of size $N \times N$. The matrix inversion therefore represents a significant computational cost for large training datasets and has led to the development of local GPR strategies113,133 that utilize only the training points in the neighborhood of the test point in making the predictions. An additional complication can arise if the $K$ matrix exhibits a large condition number. It should be noted that the $\sigma_n$ term in Eq. (5) essentially regularizes the $K$ matrix. Once $K^{-1}$ is obtained, predictions for test points can be realized through much cheaper matrix operations.127,134
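The following self-contained sketch evaluates Eq. (6) directly for a toy problem (all arrays are hypothetical stand-ins, not code from the cited works), using a Cholesky factorization in place of an explicit inverse and showing how the $\sigma_n^2$ diagonal term regularizes $K$.

```python
# Direct evaluation of Eq. (6), assuming the ARDSE kernel of Eq. (5).
import numpy as np
from scipy.linalg import cho_factor, cho_solve

def ardse(A, B, sig_f, ell):
    # k(x, x') without the noise term; A is (N, D), B is (M, D).
    d2 = ((A[:, None, :] - B[None, :, :]) ** 2 / ell**2).sum(-1)
    return sig_f**2 * np.exp(-0.5 * d2)

rng = np.random.default_rng(2)
X  = rng.uniform(size=(30, 2));  y = np.cos(4 * X[:, 0])   # training data
Xs = rng.uniform(size=(5, 2))                              # test inputs
sig_f, ell, sig_n = 1.0, np.array([0.3, 0.3]), 1e-3        # fixed hyperparameters

K   = ardse(X, X, sig_f, ell) + sig_n**2 * np.eye(len(X))  # sigma_n regularizes K
k_s = ardse(X, Xs, sig_f, ell)                             # k(X, X*)
c   = cho_factor(K)                                        # the O(N^3) cost lives here
mu    = k_s.T @ cho_solve(c, y)                            # predictive mean
Sigma = ardse(Xs, Xs, sig_f, ell) - k_s.T @ cho_solve(c, k_s)  # predictive variance
print(mu, np.diag(Sigma))
```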

One of the clearest demonstrations of the power of GPR in formulating PSP linkages can be found in the recent work of Yabansu et al.,135 where it was used to formulate reduced-order models relating the macroporous structure of a membrane to its effective permeability. The training data for this model were generated using physics-based simulations that explicitly solved the governing transport equations on digitally created 3D RVEs of porous microstructures. PC representations of suitably defined spatial correlations were used as low-dimensional features (i.e., inputs), while the effective permeability of the 3D RVE predicted by the simulation tool was used as the output. Figure 6 depicts the predictive capability of the GPR model developed in this study, using only two PC scores. This example demonstrates the remarkable efficiency of the low-dimensional representations obtained from the feature engineering approaches described in this tutorial and the high fidelity of the GPR-based structure-property models produced using these features.

FIG. 6.

Demonstration of the use of GPR for the extraction of a reduced-order model connecting the 3D microstructure of a porous membrane with its permeability predicted by a physics-based simulation tool. A typical porous structure used in the study is shown in the top left, while the predictive accuracy of the GPR for an ensemble of 1238 microstructures is depicted in a parity plot in the bottom left. The top right and bottom right plots show the predicted means and the predicted variances over a selected region of the input domain (defined by two PC scores). Each white dot in these plots represents a microstructure used in the study.


As stipulated at the start of this tutorial, one of the major impediments to accelerating the current pace of materials innovation comes from the lack of a rigorous mathematical framework for the objective fusion of incomplete and uncertain data from disparate sources (e.g., physical experiments, physics-based simulations). In recent work,110 I have proposed a new Bayesian framework for addressing this critical gap. This framework builds on the foundational elements presented above and is briefly summarized next.

Physical experiments and physics-based simulations conducted by materials experts provide distinctly different insights into the hierarchical (i.e., multiscale) PSP linkages needed to drive materials innovation (see Figs. 1 and 2). Most importantly, they offer completely different avenues for studying the governing physics mediating the PSP linkages of interest. In the simulations, one usually prescribes the governing physics (expressed as a suitable combination of thermodynamic and/or mechanical laws). Simulations generally allow one to explore the overall material response to any prescribed physics, even when it might be inconsistent with the governing physics realized in the actual material samples of interest. In fact, herein lies the main challenge. The governing physics in a given material sample of interest is often only known to a limited extent. Although experiments are aimed mainly at uncovering the governing physics, they cannot directly reveal the desired insights. A physics-based model is always needed to map the quantities measured in the experiment to the physics governing the material response. Sometimes, this mapping is accomplished with very simple models. However, the mapping of experimental results to the physics governing the material response gets extremely challenging as one extends the considerations to lower material length/structure scales. At the lower material length scales, experiments rarely provide direct evidence about the physics governing the material response. Additionally, they exhibit high levels of uncertainty in the acquired data and incur high cost.

The gap described above between the physical experiments and the physics-based simulations can be bridged if Bayesian inference is targeted at uncovering the physics governing the response of a selected material system. Indeed, a Bayesian update of the governing physics can be expressed as
$$p(\varphi \,|\, E) \propto p(E \,|\, \varphi)\, p(\varphi), \tag{7}$$
where $p(\varphi)$ denotes a prior distribution on the governing physics (i.e., our initial guess based on previously accumulated knowledge in the materials science and engineering domain), $p(E\,|\,\varphi)$ denotes the likelihood function reflecting the probability of realizing the available experimental observations $E$ for the specified governing physics $\varphi$, and $p(\varphi\,|\,E)$ denotes the (updated) posterior distribution on the governing physics. It is important to note that the representation of the governing physics can pose significant challenges, especially as one explores material behavior/response at the lower length scales. In its simplest form, this representation is accomplished through the specification of material-specific parameters in established physical laws (e.g., field equations expressed in differential or integral forms, material constitutive laws). In more complex cases, the governing physics may be specified by introducing functions using a suitable basis (e.g., a Fourier basis), which reduces the specification of the governing physics to the specification of the weights (i.e., coefficients) of the selected basis. In a completely different but equivalent approach, the governing physics may be specified through Green's function-based influence kernels operating on material structure functions.13,47,90,136 These kernels can also be expressed in digital forms (either through sampling or through the use of a suitable Fourier basis).137 Therefore, it should be recognized that the specification of $\varphi$ in high-value, low-dimensional forms is an important open area of active research.
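A minimal sketch of the update in Eq. (7) is given below, assuming the simplest representation of $\varphi$ (a single scalar parameter in an assumed physical law) and a Gaussian measurement-noise model; the forward map, prior, data, and noise level are all hypothetical stand-ins chosen only for illustration.

```python
# Minimal grid-based sketch of the Bayesian update in Eq. (7) for a single
# scalar physics parameter phi (e.g., a constitutive-law coefficient).
# The forward model, prior, data, and noise level are hypothetical stand-ins.
import numpy as np
from scipy.stats import norm

phi = np.linspace(0.5, 2.0, 301)           # candidate values of the physics
prior = norm.pdf(phi, loc=1.2, scale=0.3)  # domain-knowledge-informed prior p(phi)

def forward(phi):
    # Stand-in for a physics-based simulation of the measured quantity.
    return phi**2

E = np.array([1.55, 1.62, 1.48])           # hypothetical experimental observations
sigma_E = 0.1                              # assumed measurement noise
# Likelihood p(E | phi): product of Gaussians around the simulated response.
like = norm.pdf(E[None, :], loc=forward(phi)[:, None], scale=sigma_E).prod(1)

post = like * prior
post /= np.trapz(post, phi)                # normalized posterior p(phi | E)
print(phi[np.argmax(post)])                # maximum a posteriori estimate
```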

The formulation proposed in Eq. (7) offers many advantages. Most importantly, it offers the best opportunity for utilizing physics-based simulations in accelerating materials innovation. This is because the likelihood function $p(E\,|\,\varphi)$ evaluates probabilities that are conditional on a specified governing physics; such an evaluation can only be conducted with physics-based simulation tools (it would be impossible to evaluate the likelihood function through experiments). Equation (7) also offers new avenues for the sequential design of experiments,39,41,138–140 where one maximizes the potential information gain with each new experiment conducted. Therefore, a rigorous framework centered around Eq. (7) offers a systematic approach for optimally fusing the knowledge extracted from physical experiments and physics-based simulations.
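Such sequential design can be sketched with the maximum-surrogate-uncertainty criterion mentioned earlier. In this hypothetical sketch (assuming scikit-learn, with a stand-in for the expensive experiment or simulation), each new evaluation is placed where the current GPR is least certain; other acquisition criteria would simply replace the argmax rule.

```python
# Hedged sketch of a sequential-design loop using the maximum-surrogate-
# uncertainty criterion: each new (expensive) evaluation is placed where the
# current GPR is least certain. The target function is a hypothetical stand-in.
import numpy as np
from sklearn.gaussian_process import GaussianProcessRegressor
from sklearn.gaussian_process.kernels import RBF, WhiteKernel

def expensive_eval(x):
    return np.sin(5 * x) * np.exp(-x)          # stand-in experiment/simulation

pool = np.linspace(0.0, 2.0, 400)[:, None]     # candidate input designs
X = np.array([[0.2], [1.8]]); y = expensive_eval(X.ravel())

for _ in range(6):                             # budget of 6 added points
    gpr = GaussianProcessRegressor(RBF(0.5) + WhiteKernel(1e-6)).fit(X, y)
    _, std = gpr.predict(pool, return_std=True)
    x_next = pool[[np.argmax(std)]]            # most uncertain candidate
    X = np.vstack([X, x_next])
    y = np.append(y, expensive_eval(x_next.ravel()))
print(X.ravel())                               # chosen sequence of experiments
```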

The practical implementation of the fusion framework outlined above is largely hindered by the high computational cost of the physics-based simulations. A statistically meaningful evaluation of the likelihood function $p(E\,|\,\varphi)$ for most multiscale materials problems requires the execution of an extremely large number of physics-based simulations spanning a sufficiently large space in the domain of $\varphi$; the corresponding computational cost would be prohibitive for most materials innovation efforts. The foundational elements described earlier in this tutorial offer the only practical way of addressing this immense challenge. It is suggested that we first establish highly reliable and robust surrogate (i.e., reduced-order) models for the physics-based multiscale simulation tools (such as those demonstrated in Fig. 6), and subsequently employ the surrogate models to evaluate the likelihood function $p(E\,|\,\varphi)$. This approach has recently been demonstrated successfully in the estimation of single crystal elastic stiffness parameters from spherical indentations performed on individual grains in a polycrystalline sample116 and in the estimation of single ply elastic stiffness parameters from indentation measurements on a multilaminate polymer matrix composite sample.141
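The surrogate-in-the-loop strategy suggested above can be sketched as follows (again with hypothetical stand-ins throughout): a GPR surrogate is trained on a small number of affordable simulator runs and then swept densely over the domain of $\varphi$ to evaluate the likelihood, with the surrogate's own predictive uncertainty folded into the likelihood width.

```python
# Sketch of using a cheap GPR surrogate in place of the physics-based
# simulator when evaluating the likelihood p(E | phi) over many candidate
# physics. All data and models are hypothetical stand-ins.
import numpy as np
from scipy.stats import norm
from sklearn.gaussian_process import GaussianProcessRegressor
from sklearn.gaussian_process.kernels import RBF, WhiteKernel

def simulate(phi):
    # Expensive physics-based simulation (stand-in); run it only a few times.
    return np.sin(phi) + 0.5 * phi

phi_train = np.linspace(0.0, 3.0, 12)[:, None]       # few affordable runs
gpr = GaussianProcessRegressor(RBF(1.0) + WhiteKernel(1e-4))
gpr.fit(phi_train, simulate(phi_train.ravel()))

phi_grid = np.linspace(0.0, 3.0, 2001)[:, None]      # dense sweep is now cheap
pred, std = gpr.predict(phi_grid, return_std=True)

E, sigma_E = 1.9, 0.05                               # hypothetical observation
# Surrogate uncertainty (std) is folded into the likelihood width.
like = norm.pdf(E, loc=pred, scale=np.sqrt(sigma_E**2 + std**2))
print(phi_grid[np.argmax(like)])                     # most likely physics
```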

A mathematically rigorous framework has been presented for pursuing AI-based materials knowledge systems that addresses the objective fusion of disparate data gathered from multiscale physics-based simulations and multiresolution experiments conducted by materials specialists. The presented framework is applicable to virtually all materials classes and systems. It is also applicable to virtually all steps in the materials innovation workflow that require objective decision support. Finally, it offers new avenues for optimizing the cost and effort incurred in materials innovation, by directing the effort through the objective selection of new experiments (physical or numerical) that exhibit the highest potential for information gain based on rigorous statistical analyses. Because of the features described above, the proposed AI-MKS framework offers tremendous potential for accelerating the pace of materials innovation.

The author acknowledges support from ONR Award No. N00014-18-1-2879. The author is grateful to Dr. Yuksel Yabansu for providing some of the figures used in this paper.

Data sharing is not applicable to this article as no new data were created or analyzed in this study.

1. G. Linden, B. Smith, and J. York, "Amazon.com recommendations: Item-to-item collaborative filtering," IEEE Internet Comput. 7(1), 76–80 (2003).
2. J. A. Cruz and D. S. Wishart, "Applications of machine learning in cancer prediction and prognosis," Cancer Inform. 2, 59–77 (2006).
3. M. Bojarski et al., "End to end learning for self-driving cars," arXiv:1604.07316 (2016).
4. "Materials Genome Initiative for Global Competitiveness" (National Science and Technology Council, Office of the President of the United States, 2011).
5. D. L. McDowell and S. R. Kalidindi, "The materials innovation ecosystem: A key enabler for the materials genome initiative," MRS Bull. 41(4), 326–337 (2016).
6. M. Drosback, "Materials genome initiative: Advances and initiatives," JOM 66(3), 334 (2014).
7. G. B. Olson and C. J. Kuehmann, "Materials genomics: From CALPHAD to flight," Scr. Mater. 70, 25–30 (2014).
8. J.-C. Zhao, "High-throughput experimental tools for the materials genome initiative," Chin. Sci. Bull. 59(15), 1652–1661 (2014).
9. C. M. Breneman et al., "Stalking the materials genome: A data-driven approach to the virtual design of nanostructured polymers," Adv. Funct. Mater. 23(46), 5746–5752 (2013).
10. A. Jain et al., "Commentary: The materials project: A materials genome approach to accelerating materials innovation," APL Mater. 1(1), 011002 (2013).
11. S. Ramakrishna et al., "Materials informatics," J. Intell. Manuf. 30(5) (2018).
12. S. R. Kalidindi, A. J. Medford, and D. L. McDowell, "Vision for data and informatics in the future materials innovation ecosystem," JOM 68(8), 2126–2137 (2016).
13. S. R. Kalidindi, Hierarchical Materials Informatics (Butterworth Heinemann, 2015).
14. Y. Kajikawa et al., "Causal knowledge extraction by natural language processing in material science: A case study in chemical vapor deposition," Data Sci. J. 5, 108–118 (2006).
15. E. Kim et al., "Machine-learned and codified synthesis parameters of oxide materials," Sci. Data 4, 170127 (2017).
16. J. Nunez-Iglesias et al., "Machine learning of hierarchical clustering to segment 2D and 3D images," PLoS One 8(8), e71715 (2013).
17. A. Chowdhury et al., "Image driven machine learning methods for microstructure recognition," Comput. Mater. Sci. 123, 176–187 (2016).
18. P. Raccuglia et al., "Machine-learning-assisted materials discovery using failed experiments," Nature 533(7601), 73 (2016).
19. B. Meredig et al., "Combinatorial screening for new materials in unconstrained composition space with machine learning," Phys. Rev. B 89(9), 094104 (2014).
20. Y. Liu et al., "Materials discovery and design using machine learning," J. Materiomics 3(3), 159–177 (2017).
21. R. Ramprasad et al., "Machine learning in materials informatics: Recent applications and prospects," npj Comput. Mater. 3(1), 54 (2017).
22. G. Pilania et al., "Machine learning bandgaps of double perovskites," Sci. Rep. 6, 19375 (2016).
23. G. Pilania et al., "Accelerating materials property predictions using machine learning," Sci. Rep. 3, 2810 (2013).
24. Y. C. Yabansu et al., "Extraction of reduced-order process-structure linkages from phase-field simulations," Acta Mater. 124, 182–194 (2017).
25. E. Popova et al., "Process-structure linkages using a data science approach: Application to simulated additive manufacturing data," Integr. Mater. Manuf. Innov. 1–15 (2017).
26. A. Iskakov et al., "Application of spherical indentation and the materials knowledge system framework to establishing microstructure-yield strength linkages from carbon steel scoops excised from high-temperature exposed components," Acta Mater. 144, 758–767 (2018).
27. N. H. Paulson et al., "Reduced-order structure-property linkages for polycrystalline microstructures based on 2-point statistics," Acta Mater. 129, 428–438 (2017).
28. M. W. Priddy et al., "Strategies for rapid parametric assessment of microstructure-sensitive fatigue for HCP polycrystals," Int. J. Fatigue 104, 231 (2017).
29. H. K. D. H. Bhadeshia, "Neural networks and information in materials science," Stat. Anal. Data Min. 1(5), 296–305 (2009).
30. S. R. Kalidindi, "Data science and cyber infrastructure: Critical enablers for accelerated development of hierarchical materials," Int. Mater. Rev. 60(3), 150–168 (2015).
31. P. Voorhees and G. Spanos, "Modeling across scales: A roadmapping study for connecting materials models and simulations across length and time scales," Technical Report (The Minerals, Metals & Materials Society, TMS, 2015).
32. E. B. Gulsoy et al., "Four-dimensional morphological evolution of an aluminum silicon alloy using propagation-based phase contrast x-ray tomographic microscopy," Mater. Trans. 55(1), 161–164 (2014).
33. M. D. Uchic, M. A. Groeber, and A. D. Rollett, "Automated serial sectioning methods for rapid collection of 3-D microstructure data," JOM 63(3), 25–29 (2011).
34. J. F. Bingert et al., "High-energy diffraction microscopy characterization of spall damage," in Dynamic Behavior of Materials (Springer, 2014), Vol. 1, pp. 397–403.
35. U. Lienert et al., "High-energy diffraction microscopy at the advanced photon source," JOM 63(7), 70–77 (2011).
36. T. J. Nizolek et al., "Strain fields induced by kink band propagation in Cu-Nb nanolaminate composites," Acta Mater. 133(Suppl. C), 303–315 (2017).
37. P. I. Frazier and J. Wang, "Bayesian optimization for materials design," in Information Science for Materials Discovery and Design (Springer, 2016), pp. 45–75.
38. P. Angelikopoulos, C. Papadimitriou, and P. Koumoutsakos, "X-TMCMC: Adaptive kriging for Bayesian inverse modeling," Comput. Methods Appl. Mech. Eng. 289, 409–428 (2015).
39. A. Gelman et al., Bayesian Data Analysis, 3rd ed., Chapman & Hall/CRC Texts in Statistical Science (Chapman and Hall/CRC, 2014).
40. D. Gamerman and H. F. Lopes, Markov Chain Monte Carlo: Stochastic Simulation for Bayesian Inference (CRC Press, 2006).
41. G. E. P. Box, Bayesian Inference in Statistical Analysis, edited by G. C. Tiao (Addison-Wesley, Reading, MA, 1973).
42. D. L. McDowell and G. B. Olson, "Concurrent design of hierarchical materials and structures," Sci. Model. Simul. 15, 207–240 (2008).
43. G. B. Olson, "Designing a new material world," Science 288(5468), 993–998 (2000).
44. G. B. Olson, "Computational design of hierarchically structured materials," Science 277(29), 1237–1242 (1997).
45. G. B. Olson, "Systems design of hierarchically structured materials: Advanced steels," J. Comput. Aided Mater. Des. 4, 143–156 (1998).
46. D. L. McDowell et al., Integrated Design of Multiscale, Multifunctional Materials and Products (Elsevier, 2009).
47. B. L. Adams, S. R. Kalidindi, and D. Fullwood, Microstructure Sensitive Design for Performance Optimization (Butterworth-Heinemann, 2012).
48. J. A. Gomberg, A. J. Medford, and S. R. Kalidindi, "Extracting knowledge from molecular mechanics simulations of grain boundaries using machine learning," Acta Mater. 133(Suppl. C), 100–108 (2017).
49. G. W. Milton, The Theory of Composites, Cambridge Monographs on Applied and Computational Mathematics (Cambridge University Press, Cambridge, 2002).
50. See http://www.aflowlib.org for "Automatic-FLOW for Materials Discovery" (2014).
51. V. Marx et al., "Simulation of the texture evolution during annealing of cold rolled BCC and FCC metals using a cellular automaton approach," Textures Microstruct. 28(3–4), 211 (1997).
52. B. Blaiszik et al., "The materials data facility: Data services to advance materials science research," JOM 68(8), 2045–2052 (2016).
53. See https://materialsdatafacility.org/ for "The Materials Data Facility (MDF)" (2019).
54. D. Raabe, "Numerical three-dimensional simulations of the stress fields of dislocations in face-centred cubic crystals," Modell. Simul. Mater. Sci. Eng. 3(5), 655–664 (1995).
55. D. Raabe, "Modelling of active slip systems, Taylor factors and grain rotations during rolling and compression deformation of polycrystalline intermetallic Li2 compounds," Acta Metall. Mater. 43(4), 1531 (1995).
56. S. R. Kalidindi and M. De Graef, "Materials data science: Current status and future outlook," Annu. Rev. Mater. Res. 45, 171 (2015).
57. S. R. Kalidindi et al., "Data infrastructure elements in support of accelerated materials innovation: ELA, PyMKS, and MATIN," Integr. Mater. Manuf. Innov. 8(4), 441–454 (2019).
58. Y. B. Park, D. Raabe, and T. H. Yim, "Cold rolling textures of Fe-Ni soft magnetic alloys," Scr. Mater. 35(11), 1277–1283 (1996).
59. "Implementing ICME in the Aerospace, Automotive, and Maritime Industries" (The Minerals, Metals, and Materials Society, 2013).
60. G. J. Schmitz and U. Prahl, "ICMEg—The integrated computational materials engineering expert group—A new European coordination action," Integr. Mater. Manuf. Innov. 3(24), 2 (2014).
61. See http://www.matweb.com/ for "MatWeb" (2014).
62. N. J. Petch, "The cleavage strength of polycrystals," J. Iron Steel Inst. 174, 25–28 (1953).
63. E. O. Hall, "The deformation and ageing of mild steel III. Discussion of results," Proc. Phys. Soc. London Sect. B 64, 747–753 (1951).
64. L. J. Huang et al., "Effects of volume fraction on the microstructure and tensile properties of in situ TiBw/Ti6Al4V composites with novel network microstructure," Mater. Des. 45, 532–538 (2013).
65. S. K. Paul, N. Stanford, and T. Hilditch, "Effect of martensite volume fraction on low cycle fatigue behaviour of dual phase steels: Experimental and microstructural investigation," Mater. Sci. Eng. A 638, 296–304 (2015).
66. M. Asadi, B. C. De Cooman, and H. Palkowski, "Influence of martensite volume fraction and cooling rate on the properties of thermomechanically processed dual phase steel," Mater. Sci. Eng. A 538, 42–52 (2012).
67. S. R. Kalidindi and A. Abusafieh, "Longitudinal and transverse moduli and strengths of low angle 3-D braided composites," J. Compos. Mater. 30(8), 885–905 (1996).
68. S. Torquato, Random Heterogeneous Materials (Springer-Verlag, New York, 2002).
69. D. T. Fullwood et al., "Microstructure sensitive design for performance optimization," Prog. Mater. Sci. 55(6), 477–562 (2010).
70. M. Binci, D. Fullwood, and S. R. Kalidindi, "A new spectral framework for establishing localization relationships for elastic behavior of composites and their calibration to finite-element models," Acta Mater. 56(10), 2272–2282 (2008).
71. S. R. Kalidindi et al., "Elastic properties closures using second-order homogenization theories: Case studies in composites of two isotropic constituents," Acta Mater. 54(11), 3117–3126 (2006).
72. M. Ebner et al., "X-ray tomography of porous, transition metal oxide based lithium ion battery electrodes," Adv. Energy Mater. 3(7), 845–850 (2013).
73. M. K. Miller and R. G. Forbes, "Atom probe tomography," Mater. Charact. 60(6), 461–469 (2009).
74. M. D. Uchic et al., "Three-dimensional microstructural characterization using focused ion beam tomography," MRS Bull. 32(5), 408–416 (2007).
75. H. Proudhon, J.-Y. Buffière, and S. Fouvry, "Three-dimensional study of a fretting crack using synchrotron X-ray micro-tomography," Eng. Fract. Mech. 74(5), 782–793 (2007).
76. S. R. Niezgoda, A. K. Kanjarla, and S. R. Kalidindi, "Novel microstructure quantification framework for databasing, visualization, and analysis of microstructure data," Integr. Mater. Manuf. Innov. 2, 3 (2013).
77. S. R. Niezgoda, Y. C. Yabansu, and S. R. Kalidindi, "Understanding and visualizing microstructure and microstructure variance as a stochastic process," Acta Mater. 59, 6387–6400 (2011).
78. S. R. Niezgoda, D. T. Fullwood, and S. R. Kalidindi, "Delineation of the space of 2-point correlations in a composite material system," Acta Mater. 56(18), 5285–5292 (2008).
79. S. R. Niezgoda et al., "Optimized structure based representative volume element sets reflecting the ensemble-averaged 2-point statistics," Acta Mater. 58(13), 4432–4445 (2010).
80. W. L. Briggs and V. E. Henson, The DFT: An Owner's Manual for the Discrete Fourier Transform (Society for Industrial and Applied Mathematics, PA, 1995).
81. W. H. Press et al., Numerical Recipes in C++: The Art of Scientific Computing (Cambridge University Press, 2002).
82. H.-J. Bunge, Texture Analysis in Materials Science: Mathematical Methods (Cuvillier Verlag, Göttingen, 1993).
83. S. R. Kalidindi et al., "Application of data science tools to quantify and distinguish between structures and models in molecular dynamics datasets," Nanotechnology 26(34), 344006 (2015).
84. B. L. Adams, G. Xiang, and S. R. Kalidindi, "Finite approximations to the second-order properties closure in single phase polycrystals," Acta Mater. 53(13), 3563–3577 (2005).
85. N. H. Paulson et al., "Reduced-order microstructure-sensitive protocols to rank-order the transition fatigue resistance of polycrystalline microstructures," Int. J. Fatigue 119, 1–10 (2019).
86. N. H. Paulson et al., "Data-driven reduced-order models for rank-ordering the high cycle fatigue performance of polycrystalline microstructures," Mater. Des. 154, 170–183 (2018).
87. S. Torquato and G. Stell, "Microstructure of two-phase random media. I. The n-point probability functions," J. Chem. Phys. 77(4), 2071–2077 (1982).
88. A. Cecen, T. Fast, and S. R. Kalidindi, "Versatile algorithms for the computation of 2-point spatial correlations in quantifying material structure," Integr. Mater. Manuf. Innov. 5(1), 1 (2016).
89. D. T. Fullwood, S. R. Niezgoda, and S. R. Kalidindi, "Microstructure reconstructions from 2-point statistics using phase-recovery algorithms," Acta Mater. 56(5), 942–948 (2008).
90. E. Kroner, "Statistical modelling," in Modelling Small Deformations of Polycrystals, edited by J. Gittus and J. Zarka (Elsevier Science Publishers, London, 1986), pp. 229–291.
91. Y. C. Yabansu et al., "Application of Gaussian process regression models for capturing the evolution of microstructure statistics in aging of nickel-based superalloys," Acta Mater. 178, 45–58 (2019).
92. D. M. Turner and S. R. Kalidindi, "Statistical construction of 3-D microstructures from 2-D exemplars collected on oblique sections," Acta Mater. 102, 136–148 (2016).
93. V. Sundararaghavan, "Reconstruction of three-dimensional anisotropic microstructures from two-dimensional micrographs imaged on orthogonal planes," Integr. Mater. Manuf. Innov. 3(1), 19 (2014).
94. R. Bostanabad et al., "Computational microstructure characterization and reconstruction: Review of the state-of-the-art techniques," Prog. Mater. Sci. 95, 1–41 (2018).
95. S. Mika, B. Schölkopf, A. J. Smola, K. R. Müller, M. Scholz, and G. Rätsch, "Kernel PCA and de-noising in feature spaces," in Advances in Neural Information Processing Systems (MIT Press, 1999), pp. 536–542.
96. S. T. Roweis and L. K. Saul, "Nonlinear dimensionality reduction by locally linear embedding," Science 290(5500), 2323–2326 (2000).
97. Z. Zhang and H. Zha, "Principal manifolds and nonlinear dimensionality reduction via tangent space alignment," SIAM J. Sci. Comput. 26(1), 313–338 (2004).
98. M. I. Latypov, L. S. Toth, and S. R. Kalidindi, "Materials knowledge system for nonlinear composites," Comput. Methods Appl. Mech. Eng. 346, 180 (2018).
99. M. I. Latypov and S. R. Kalidindi, "Data-driven reduced order models for effective yield strength and partitioning of strain in multiphase materials," J. Comput. Phys. 346, 242 (2017).
100. A. Cecen et al., "A data-driven approach to establishing microstructure-property relationships in porous transport layers of polymer electrolyte fuel cells," J. Power Sources 245, 144–153 (2014).
101. A. E. Hoerl and R. W. Kennard, "Ridge regression: Biased estimation for nonorthogonal problems," Technometrics 12(1), 55–67 (1970).
102. R. Tibshirani, "Regression shrinkage and selection via the Lasso," J. R. Stat. Soc. Ser. B Methodol. 58(1), 267–288 (1996).
103. H. Zou and T. Hastie, "Regularization and variable selection via the elastic net," J. R. Stat. Soc. Ser. B Methodol. 67(2), 301–320 (2005).
104. Z. Liu, C. T. Wu, and M. Koishi, "Transfer learning of deep material network for seamless structure–property predictions," Comput. Mech. 64 (2019).
105. Z. Yang et al., "Deep learning approaches for mining structure-property linkages in high contrast composites from simulation datasets," Comput. Mater. Sci. 151, 278–287 (2018).
106. S. S. Haykin et al., Neural Networks and Learning Machines (Pearson Education, Upper Saddle River, 2009), Vol. 3.
107. Z. Cao et al., "Convolutional neural networks for crystal material property prediction using hybrid orbital-field matrix and magpie descriptors," Crystals 9(4), 191 (2019).
108. A. Cecen et al., "Material structure-property linkages using three-dimensional convolutional neural networks," Acta Mater. 146, 76–84 (2018).
109. A. Krizhevsky, I. Sutskever, and G. E. Hinton, "Imagenet classification with deep convolutional neural networks," in Advances in Neural Information Processing Systems, edited by F. Pereira, C. J. C. Burges, L. Bottou, and K. Q. Weinberger (MIT Press, 2012), pp. 1097–1105.
110. S. R. Kalidindi, "A Bayesian framework for materials knowledge systems," MRS Commun. 9(2), 518–531 (2019).
111. A. E. Raftery, D. Madigan, and J. A. Hoeting, "Bayesian model averaging for linear regression models," J. Am. Stat. Assoc. 92(437), 179–191 (1997).
112. P. Fernandez-Zelaia et al., "Estimating mechanical properties from spherical indentation using Bayesian approaches," Mater. Des. 147, 92–105 (2018).
113. P. Fernandez-Zelaia, Y. C. Yabansu, and S. R. Kalidindi, "A comparative study of the efficacy of local/global and parametric/nonparametric machine learning methods for establishing structure–property linkages in high-contrast 3D elastic composites," Integr. Mater. Manuf. Innov. 55 (2019).
114. C. J. Paciorek and M. J. Schervish, "Nonstationary covariance functions for Gaussian process regression," in Advances in Neural Information Processing Systems (MIT Press, 2004), pp. 273–280.
115. C. E. Rasmussen, "Gaussian processes in machine learning," in Summer School on Machine Learning (Springer, 2003).
116. A. Castillo and S. R. Kalidindi, "A Bayesian framework for the estimation of the single crystal elastic parameters from spherical indentation stress-strain measurements," Front. Mater. 6, 136 (2019).
117. A. R. Castillo, V. R. Joseph, and S. R. Kalidindi, "Bayesian sequential design of experiments for extraction of single-crystal material properties from spherical indentation measurements on polycrystalline samples," JOM 71, 2671 (2019).
118. H. Wang and J. Li, "Adaptive Gaussian process approximation for Bayesian inference with expensive likelihood functions," Neural Comput. 30(11), 3072–3094 (2018).
119. T. Takhtaganov and J. Müller, "Adaptive Gaussian process surrogates for Bayesian inference," arXiv:1809.10784 (2018).
120. P. Fernandez-Zelaia and S. N. Melkote, "Process-structure-property modeling for severe plastic deformation processes using orientation imaging microscopy and data-driven techniques," Integr. Mater. Manuf. Innov. 8, 1–20 (2019).
121. J. Jung et al., "Bayesian approach in predicting mechanical properties of materials: Application to dual phase steels," Mater. Sci. Eng. A 743, 382 (2018).
122. J. Jung et al., "An efficient machine learning approach to establish structure-property linkages," Comput. Mater. Sci. 156, 17–25 (2019).
123. X. Zhang and C. Oskay, "Polycrystal plasticity modeling of nickel-based superalloy IN 617 subjected to cyclic loading at high temperature," Modell. Simul. Mater. Sci. Eng. 24(5), 055009 (2016).
124. G. Stevens et al., "Experiment-based validation and uncertainty quantification of coupled multi-scale plasticity models," Multidiscip. Model. Mater. Struct. 12(1), 151–176 (2016).
125. P. Acar, "Crystal plasticity model calibration for Ti-7Al alloy with a multi-fidelity computational scheme," Integr. Mater. Manuf. Innov. 7(4), 186–194 (2018).
126. Y. Yabansu et al., "Application of Gaussian process autoregressive models for capturing the time evolution of microstructure statistics from phase-field simulations for sintering of polycrystalline ceramics," Modell. Simul. Mater. Sci. Eng. 27, 084006 (2019).
127. C. M. Bishop, Pattern Recognition and Machine Learning (Springer, 2006).
128. C. K. Williams and C. E. Rasmussen, Gaussian Processes for Machine Learning (The MIT Press, 2006).
129. E. Schulz, M. Speekenbrink, and A. Krause, "A tutorial on Gaussian process regression: Modelling, exploring, and exploiting functions," J. Math. Psychol. 85, 1–16 (2018).
130. S. Ruder, "An overview of gradient descent optimization algorithms," arXiv:1609.04747 (2016).
131. See https://www.mathworks.com/ for "MATLAB."
132. See https://www.r-project.org/ for "The R Project for Statistical Computing" (2016).
133. R. B. Gramacy and D. W. Apley, "Local Gaussian process approximation for large computer experiments," J. Comput. Graph. Stat. 24(2), 561–578 (2015).
134. T. Chen, J. Morris, and E. Martin, "Gaussian process regression for multivariate spectroscopic calibration," Chemom. Intell. Lab. Syst. 87(1), 59–71 (2007).
135. Y. C. Yabansu et al., "A digital workflow for learning the reduced-order structure-property linkages for permeability of macroporous membranes," Acta Mater. 195, 668–680 (2020).
136. H. Garmestani et al., "Statistical continuum theory for large plastic deformation of polycrystalline materials," J. Mech. Phys. Solids 49(3), 589–607 (2001).
137. Y. C. Yabansu and S. R. Kalidindi, "Representation and calibration of elastic localization kernels for a broad class of cubic polycrystals," Acta Mater. 94, 26–35 (2015).
138. X. Huan and Y. M. Marzouk, "Simulation-based optimal Bayesian experimental design for nonlinear systems," J. Comput. Phys. 232(1), 288–317 (2013).
139. D. J. C. MacKay, "Introduction to Gaussian processes," in NATO ASI Series F Computer and Systems Sciences (Springer, 1998), Vol. 168, pp. 133–166.
140. C. E. Rasmussen, Evaluation of Gaussian Processes and Other Methods for Non-Linear Regression (University of Toronto, 1996).
141. A. R. Castillo and S. R. Kalidindi, "Bayesian estimation of single ply anisotropic elastic constants from spherical indentations on multi-laminate polymer-matrix fiber-reinforced composite samples," Meccanica (2020).