This paper examines the transformative role of Machine Learning (ML) and Artificial Intelligence (AI) in materials science, spotlighting their capability to expedite the discovery and development of new, more efficient, and stronger compounds. It underscores the shift from traditional, resource-intensive approaches toward data-driven methodologies that leverage large datasets to predict properties, identify new materials, and optimize synthesis conditions with a satisfactory level of accuracy. Highlighting various techniques, including supervised, unsupervised, and reinforcement learning, alongside the potential of deep learning, the paper presents case studies and applications ranging from predicting stress points in stochastic fields to optimizing thermal protection systems for spacecraft re-entry. It also explores the challenges and future directions, emphasizing the need to integrate experimental validation and develop tailored algorithms to overcome data and computational constraints. The narrative showcases the promise of ML and AI in revolutionizing material discovery, paving the way for innovative solutions in science and engineering.
I. INTRODUCTION AND THE FOURTH PARADIGM OF SCIENCE
In the field of materials science, researchers typically rely on experiments and simulations to develop and produce new materials. This approach requires significant resources and often involves a certain degree of luck and skill. Recent developments in data-driven knowledge fields have given researchers the opportunity to use what was defined in Refs. 1 and 2 as the fourth paradigm of science, based upon data-driven methodologies. The purpose of using big data is to improve the efficiency and accuracy of their work, as well as to produce more sustainable and durable results with less computational power and fewer testing facilities.
In May 2016, Nature published3 a paper that explained how failed experiments could help the discovery of new materials. The accuracy of these predictions outperformed that of expert chemists, leading researchers to believe that Machine Learning (ML) could significantly alter the traditional approach to material discovery and open new possibilities for inventing materials. Since then, the use of ML and Artificial Intelligence (AI) in materials science has continued to gain momentum. Researchers are using these technologies to quickly predict candidate materials based on desired performance requirements, which can then be developed and tested using computer simulations. This approach can significantly speed up the development process and improve efficiency (Figs. 1-4).
In recent years, machine learning and artificial intelligence have become increasingly popular in materials science. One reason for this is the significant potential that these technologies hold for improving the efficiency and accuracy of material discovery and development. Traditionally, materials scientists have relied on experiments and simulations to identify and produce new materials. However, this approach can be time-consuming and resource-intensive, requiring significant amounts of trial and error to achieve the desired results. Machine learning and AI offer an alternative approach that can speed up the discovery and development process.10
For example, researchers can use machine learning algorithms to analyze large amounts of experimental data, identify patterns and correlations, and predict new results based on this information. These predictions can then be tested using computer simulations, which can save significant time and resources compared to traditional experimental methods. Another advantage of machine learning and AI is that they can help overcome some of the limitations of traditional material discovery methods. For example, machine learning algorithms can quickly analyze large datasets and identify new materials with desired properties that may have been overlooked using traditional approaches. This can help open up new avenues for material discovery and development. Overall, the use of machine learning and AI in materials science has the potential to revolutionize the field, leading to breakthroughs in many different areas. As researchers continue to explore the possibilities of these technologies, it is likely that we will see new and exciting advances in material discovery and development in the years to come.11
II. MATERIAL DISCOVERY ACCELERATION WITH MACHINE LEARNING
The discovery of new materials with desirable properties is a critical step in advancing various fields of science and engineering, such as electronics, energy, and medicine. Traditional methods of material discovery are time-consuming and expensive, as they involve synthesizing and characterizing numerous samples. Machine learning has emerged as a powerful tool in material discovery, as it can accelerate the process by predicting material properties, identifying new materials, and optimizing synthesis conditions. This section provides an overview of the different techniques that can be used in material discovery, together with their advantages, limitations, and applications in predicting material properties, identifying new materials, and optimizing synthesis conditions.
A. Machine learning techniques for material discovery
The three main types of machine learning techniques are as follows: supervised learning, unsupervised learning, and reinforcement learning.
Supervised learning is a technique in which a machine learning model is trained on a labeled dataset to predict the output of new, unseen data. In material discovery, supervised learning can be used to predict material properties based on the chemical composition, crystal structure, and processing conditions. The labeled dataset may include data from experiments, simulations, or a combination of both. Examples of supervised learning algorithms include regression, support vector machines, decision trees, and neural networks.4
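As a minimal sketch of the supervised setting, the snippet below fits a ridge regression in closed form to a small, entirely invented table of composition-derived features and property values; the features, numbers, and the predicted quantity are illustrative assumptions, not data from the literature.

```python
import numpy as np

# Hypothetical training set: each row holds composition-derived features
# (e.g. mean electronegativity, mean atomic radius) for a known material,
# and y holds a measured property such as a bandgap in eV (made-up values).
X = np.array([[2.55, 0.77],
              [1.90, 1.11],
              [2.19, 1.06],
              [1.61, 1.18]])
y = np.array([5.5, 1.1, 2.3, 0.7])

# Ridge regression fitted in closed form: w = (X^T X + a I)^-1 X^T y.
Xb = np.hstack([X, np.ones((len(X), 1))])      # append a bias column
alpha = 1e-3
w = np.linalg.solve(Xb.T @ Xb + alpha * np.eye(Xb.shape[1]), Xb.T @ y)

# Predict the property of an unseen candidate from its features.
candidate = np.array([2.10, 1.00, 1.0])
prediction = float(candidate @ w)
```

The same train-then-predict pattern carries over to the more expressive models named above (support vector machines, decision trees, neural networks); only the fitting step changes.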
Unsupervised learning is a technique in which a machine learning model is trained on an unlabeled dataset to identify patterns and relationships in the data. In material discovery, unsupervised learning can be used to cluster similar materials based on their properties or to identify outliers that exhibit unique properties. Examples of unsupervised learning algorithms include clustering, principal component analysis (PCA), and autoencoders.1
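A toy illustration of the clustering idea, with a hand-rolled k-means on invented (density, hardness) pairs; the two groups are deliberately built into the data so the assignment is easy to inspect.

```python
import numpy as np

# Hypothetical property vectors (density, hardness) for six materials;
# two loose groups are built in so the clustering is easy to see.
props = np.array([[2.3, 9.0], [2.5, 9.5], [2.4, 9.2],
                  [7.8, 4.0], [7.9, 4.5], [8.1, 4.2]])

def kmeans(data, k, iters=20, seed=0):
    rng = np.random.default_rng(seed)
    centers = data[rng.choice(len(data), k, replace=False)]
    for _ in range(iters):
        # Assign each material to its nearest cluster center.
        labels = np.argmin(((data[:, None] - centers) ** 2).sum(-1), axis=1)
        # Move each center to the mean of its assigned points.
        centers = np.array([data[labels == c].mean(axis=0) for c in range(k)])
    return labels, centers

labels, centers = kmeans(props, k=2)
```

In practice, PCA is often applied first to reduce high-dimensional property vectors before clustering, and outliers show up as points far from every center.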
Reinforcement learning is a technique in which a machine learning model learns to make decisions based on feedback from the environment. In material discovery, reinforcement learning can be used to optimize synthesis conditions by selecting the best set of parameters to maximize a desired property. Examples of reinforcement learning algorithms include Q-learning and policy gradients.10
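The sketch below shows tabular Q-learning on a deliberately tiny, invented synthesis problem: the state is a discrete furnace-temperature level, the actions raise or lower it, and the reward is a hypothetical product yield that peaks at one level. None of the numbers come from a real process.

```python
import random

# Hypothetical yield for each of five temperature levels; peaks at level 3.
YIELD = [0.1, 0.3, 0.6, 0.9, 0.5]
ACTIONS = [-1, +1]          # lower or raise the temperature level

random.seed(0)
Q = [[0.0, 0.0] for _ in range(5)]
alpha, gamma, eps = 0.5, 0.9, 0.2

state = 0
for step in range(2000):
    # Epsilon-greedy action selection.
    a = random.randrange(2) if random.random() < eps else Q[state].index(max(Q[state]))
    nxt = min(4, max(0, state + ACTIONS[a]))
    reward = YIELD[nxt]
    # Standard Q-learning update toward reward plus discounted future value.
    Q[state][a] += alpha * (reward + gamma * max(Q[nxt]) - Q[state][a])
    state = nxt

best_action_at_2 = ACTIONS[Q[2].index(max(Q[2]))]   # greedy action learned at level 2
```

After training, the greedy policy at level 2 should be to heat toward the peak, mirroring how a reinforcement-learning agent would steer synthesis parameters toward higher-yield regions.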
B. Applications of machine learning in material discovery
1. Predicting material properties
Machine learning can be used to predict material properties such as electronic, magnetic, and mechanical properties based on the chemical composition, crystal structure, and processing conditions. This can be achieved by training a model on a labeled dataset that includes experimental or simulated data. For example, machine learning models have been used to predict the bandgap of materials, which is a critical property for applications in electronics and optoelectronics. Models based upon these methodologies have also been used to predict the mechanical properties of materials, such as the Young’s modulus and Poisson’s ratio, which are important for designing materials with specific mechanical properties.5
2. Identifying new materials
Machine learning can be used to identify new materials with desirable properties by screening a large database of materials. This can be achieved by training a machine learning model on a labeled dataset that includes data on the properties of known materials. The ML model can then be used to predict the properties of new materials based on their chemical composition and crystal structure. For example, ML models have been used to predict the properties of new battery materials, which can potentially improve the energy density and safety of batteries.2
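A minimal sketch of the screening step: a linear model is fitted on made-up descriptors of "known" materials and then used to rank hypothetical, unsynthesized candidates. The descriptor values, candidate names, and the energy-density target are all illustrative assumptions.

```python
import numpy as np

# Hypothetical screening step: fit a linear model on known cathode
# descriptors (two made-up features per material) and rank candidates
# by predicted energy density (Wh/kg, invented values).
known_X = np.array([[0.2, 1.1], [0.5, 0.9], [0.8, 0.7], [1.0, 0.4]])
known_y = np.array([150.0, 210.0, 260.0, 300.0])

Xb = np.hstack([known_X, np.ones((4, 1))])       # append a bias column
w, *_ = np.linalg.lstsq(Xb, known_y, rcond=None)

candidates = {"cand-A": [0.3, 1.0], "cand-B": [0.9, 0.5], "cand-C": [0.6, 0.8]}
scores = {name: float(np.dot(f + [1.0], w)) for name, f in candidates.items()}
ranked = sorted(scores, key=scores.get, reverse=True)   # best candidate first
```

Real screening campaigns apply the same predict-and-rank loop to databases of thousands of compositions, sending only the top-ranked candidates on to simulation or synthesis.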
3. Optimizing synthesis conditions
Machine learning can be used to optimize synthesis conditions by selecting the best set of parameters to maximize a desired property. This can be achieved by training an ML model on a labeled dataset that includes data on the properties of materials synthesized under different conditions. The ML model can then be used to predict the properties of materials synthesized under new conditions. For example, ML models have been used to optimize the synthesis of catalysts by selecting the best set of parameters to maximize their activity and selectivity.12
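As a small sketch of condition optimization, the snippet fits a quadratic model to a few invented (temperature, activity) measurements and proposes the temperature that maximizes the fitted curve; the data and the quadratic form are illustrative assumptions, not a real catalyst study.

```python
import numpy as np

# Hypothetical optimization step: a few measured (temperature, activity)
# pairs for a catalyst are fitted with a quadratic model, and the
# temperature maximizing the fit is proposed for the next run.
temps = np.array([500.0, 600.0, 700.0, 800.0, 900.0])
activity = np.array([0.20, 0.55, 0.80, 0.70, 0.35])

a, b, c = np.polyfit(temps, activity, deg=2)     # activity ≈ a*T^2 + b*T + c
t_best = -b / (2 * a)                            # vertex of the fitted parabola
```

Iterating this fit-then-propose loop as new measurements arrive is the essence of model-guided synthesis optimization.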
C. Deep learning for material discovery
Deep learning (DL) is a subset of machine learning that has the potential to revolutionize the field of material discovery by accelerating the process of identifying new materials with desirable properties. In this section, we will explore the potential of these methodologies in material discovery, including an overview of the different techniques, their advantages and limitations, and their applications in predicting material properties, identifying new materials, and optimizing synthesis conditions. Deep learning techniques are neural network architectures that consist of multiple layers of interconnected nodes that can learn and make predictions based on input data. The most commonly used DL architectures in material discovery are convolutional neural networks (CNNs) and recurrent neural networks (RNNs).6,13
CNNs are used for image-based data and are designed to automatically learn and extract features from image data. In material discovery, CNNs can be used to predict material properties based on images of the material’s microstructure, such as the crystal structure and morphology. RNNs are used for sequential data and are designed to process data with temporal dependencies, such as time-series data. In material discovery, RNNs can be used to predict the properties of materials based on the synthesis conditions, which are often represented as time-series data.
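To make the convolution operation at the heart of a CNN concrete, the snippet below applies a fixed vertical-edge filter to a tiny synthetic "micrograph"; a trained CNN learns many such filters from data rather than using hand-chosen ones, and the image here is invented for illustration.

```python
import numpy as np

# Synthetic 6x6 "micrograph": dark grain on the left, bright grain on
# the right, with a sharp boundary between columns 2 and 3.
image = np.zeros((6, 6))
image[:, 3:] = 1.0

# A fixed vertical-edge filter (a CNN would learn filters like this).
kernel = np.array([[-1.0, 0.0, 1.0],
                   [-1.0, 0.0, 1.0],
                   [-1.0, 0.0, 1.0]])

h, w = image.shape
kh, kw = kernel.shape
feature_map = np.zeros((h - kh + 1, w - kw + 1))
for i in range(feature_map.shape[0]):
    for j in range(feature_map.shape[1]):
        # Valid convolution: elementwise product of patch and kernel.
        feature_map[i, j] = np.sum(image[i:i+kh, j:j+kw] * kernel)

# The response is strongest along the grain boundary and zero in the
# uniform interior of each grain.
```

Stacking many learned filters, nonlinearities, and pooling layers turns this single operation into the microstructure-to-property models described above.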
1. Advantages and limitations of deep learning
Deep learning has several advantages over traditional machine learning techniques, such as the ability to automatically learn features from raw data without the need for feature engineering. DL also has the potential to achieve higher accuracy and efficiency in predicting material properties compared to traditional machine learning techniques. However, DL also has several limitations, such as the requirement for large amounts of training data and the need for significant computational resources. In addition, DL models can be difficult to interpret, which can limit their usefulness in providing insights into the underlying relationships between material properties and their predictors.7,13
D. Applications of deep learning in material discovery
Deep learning has several applications in material discovery, including predicting material properties, identifying new materials, and optimizing synthesis conditions. For example, deep learning can be used to predict the mechanical properties of materials based on images of their microstructure, which can potentially accelerate the design of new materials with specific mechanical properties. Deep learning can also be used to identify new materials with desirable properties by predicting the properties of hypothetical materials that have not yet been synthesized. In addition, deep learning can be used to optimize synthesis conditions by predicting the properties of materials synthesized under different conditions and identifying the optimal conditions for achieving specific material properties.6,7 Despite its potential, deep learning also faces several challenges in material discovery. One challenge is the interpretability of DL models, which can limit their usefulness in providing insights into the underlying relationships between material properties and their predictors. Another challenge is the requirement for large amounts of training data, which may not be available for certain materials or properties. In addition, DL models require significant computational resources, which can limit their scalability and accessibility. Using this new methodology has the potential to revolutionize the field of material discovery by accelerating the process of identifying new materials with desirable properties. The use of deep learning in predicting material properties, identifying new materials, and optimizing synthesis conditions has several advantages and limitations. However, challenges such as model interpretability and the requirement for large amounts of training data must be addressed to fully realize the potential of deep learning in material discovery.
III. AI-DRIVEN ALGORITHM TO PREDICT STRESS POINTS IN STOCHASTIC FIELDS
A. Simplified neural network using Gaussian process distribution regression and principal component analysis
A Gaussian process models the stress field as

f(x) ~ GP(m(x), k(x, x′)),

and, given noisy observations y at the points X, the posterior prediction at a new point x* has mean and variance

μ* = m(x*) + k*ᵀ(K + σₙ²I)⁻¹(y − m(X)),
σ*² = k(x*, x*) − k*ᵀ(K + σₙ²I)⁻¹k*,

where
m(x) is the mean function, often assumed to be zero or another known function based upon prior knowledge;
k(x, x′) is the covariance function (kernel), which defines the covariance between any two points in the field, reflecting how stress values correlate with each other across the field;
k* is the vector of covariances between the new point x* and all points in X;
K is the covariance matrix for points in X;
I is the identity matrix;
σₙ² is the variance of the observation noise.
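Using the quantities defined above (the kernel, the covariance matrix K, the identity I, and the observation-noise variance), Gaussian process regression can be sketched in a few lines; the observation locations, stress values, and the squared-exponential kernel with unit length scale are invented for illustration.

```python
import numpy as np

# Squared-exponential kernel k(x, x') with a fixed length scale.
def kernel(a, b, length=1.0):
    return np.exp(-0.5 * ((a[:, None] - b[None, :]) / length) ** 2)

X = np.array([0.0, 1.0, 2.0, 3.0])       # locations of observed stress values
y = np.array([0.0, 0.8, 0.9, 0.1])       # hypothetical stress observations
noise_var = 1e-4                          # variance of the observation noise

K = kernel(X, X) + noise_var * np.eye(len(X))   # K + (noise variance) * I
x_star = np.array([1.5])
k_star = kernel(X, x_star)                      # covariances with the new point

# Posterior mean and variance at x*, with a zero mean function.
alpha = np.linalg.solve(K, y)
mean_star = float(k_star.T @ alpha)
var_star = float(kernel(x_star, x_star) - k_star.T @ np.linalg.solve(K, k_star))
```

The posterior variance shrinks near observed points, which is what makes the GP useful for deciding where additional stress samples are most informative.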
The resulting datasets or stress distribution plots can be further reduced by training a simplified neural network that can increase the accuracy of stress prediction levels and reduce computational costs.4 The unsupervised technique of clustering can further enhance the results by linking together the results of multiple probabilistic simulations. The figure below shows how the application of k-means clustering led to the identification of key stress points in an I-beam that underwent cyclical loading.15
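The dimensionality-reduction step can be sketched with PCA via the SVD: each row below is one invented stress "snapshot" over 50 nodes, generated so that a single dominant mode carries almost all the variance; the data and the one-component truncation are illustrative assumptions.

```python
import numpy as np

# Hypothetical stress snapshots: each row is one probabilistic
# simulation, each column a stress value at a node. All rows are
# scaled copies of one shape plus small noise.
rng = np.random.default_rng(1)
base = np.linspace(0.0, 1.0, 50)
snapshots = np.array([a * base + rng.normal(0.0, 0.01, 50)
                      for a in rng.uniform(0.5, 2.0, 20)])

# PCA by SVD of the mean-centered snapshot matrix.
centered = snapshots - snapshots.mean(axis=0)
U, S, Vt = np.linalg.svd(centered, full_matrices=False)
explained = (S ** 2) / np.sum(S ** 2)    # variance fraction per component

# One dominant mode captures almost all the variance, so each 50-node
# field can be summarized by a single coefficient before clustering.
coords = centered @ Vt[0]
```

Clustering these low-dimensional coordinates instead of the raw fields is what links the results of many probabilistic simulations at a fraction of the cost.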
Such neural networks, combined with clustering, can be used to predict high-stress points in stochastic fields. Although there are not yet enough results to fully verify the accuracy of the models, using ML has certainly reduced computational power usage and made results more accessible.5 If such algorithms are to be implemented within a data or ML operations pipeline, they need to be evaluated using synthetic or dummy data as well as real-world data. Such AI-driven methodologies have potential applications in engineering and mechanics, where predicting high-stress points is crucial for maintaining the structural integrity and safety of various systems. For example, in a case study where there was a need to estimate high-stress points within a wind turbine blade, this Gaussian-based stochastic algorithm was trained using stress data from a historical series of wind turbine failure data and, once applied within the FEA-based model, gave a fairly accurate prediction of the locations of the high-stress points using less computational power. Results were validated using experimental data, and the algorithm was shown to be highly accurate in predicting high-stress points in wind turbine blades.16 In another case study, the need was to estimate high-stress points in an underground pipeline. Data from previous pipeline inspections were used to train a convolutional neural network pipeline similar to the one described above, and the predicted high-stress points were compared with real data acquired in subsequent inspections, showing a good degree of accuracy.17
B. Surrogate learning
Machine learning can be applied in parallel and in isolation, but also in sequential logic, meaning that the output or dataset generated by one algorithm can be fed into a different pipeline or even used in conjunction with traditional computational methods, generating a new way of computing simulations called surrogate learning or surrogate modeling.18 In the evolving landscape of materials science, the integration of ML algorithms has marked a significant shift toward more innovative and efficient approaches to studying and optimizing materials for extreme conditions. One key application of the surrogate model was carried out by Zuccarini et al. and involved the analysis of stress distribution in Ultra-High Temperature Ceramics (UHTCs), critical for aerospace applications. The model was built upon a combination, also known as surrogacy, of traditional computational fluid dynamics and deep learning to estimate the stress distribution in a Zirconium Diboride–Silicon Carbide (ZrB2–SiC) based Thermal Protection System (TPS) under re-entry conditions.9
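The surrogate idea itself can be sketched independently of the UHTC study: run an expensive model at a handful of design points, fit a cheap stand-in, then query the stand-in densely. The "expensive simulation" below is just an analytic stress response, and the polynomial surrogate is an assumption made so the sketch stays self-contained; it is not the CFD-plus-Keras model used by the authors.

```python
import math
import numpy as np

def expensive_simulation(t):
    # Stand-in for a costly physics solve (e.g. coupled CFD/FEA): here an
    # invented analytic stress response peaking at t = 0.6.
    return 100.0 * math.exp(-((t - 0.6) ** 2) / 0.05)

# Run the expensive model at a handful of design points only.
samples = np.linspace(0.0, 1.0, 9)
responses = np.array([expensive_simulation(t) for t in samples])

# Fit a cheap polynomial surrogate, then query it densely at no cost.
coeffs = np.polyfit(samples, responses, deg=6)
surrogate = np.poly1d(coeffs)

dense = np.linspace(0.0, 1.0, 201)
t_peak = dense[np.argmax(surrogate(dense))]      # surrogate's predicted peak
```

Replacing the polynomial with a neural network trained on simulation outputs gives the deep-learning surrogates discussed next, with the same economics: a few expensive solves buy unlimited cheap evaluations.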
UHTCs such as ZrB2/SiC are indispensable for safeguarding spacecraft against the intense thermal and mechanical stresses encountered during hypersonic re-entry into Earth's atmosphere, where temperatures can soar to 5000 K and pressures can reach up to 205 MPa. The study embarked on a comprehensive methodology that combined finite element modeling and computational fluid dynamics with deep learning implemented through the Keras library, underscoring both the feasibility of the approach and a significant reduction of computational time and resource demands, thereby presenting a scalable and efficient tool for predicting complex material behaviors under extreme conditions.8 Findings from the study revealed a maximum stress threshold indicative of the ablative behavior of the TPS material, with specific attention drawn to the thermal stress concentration at boundary layer separation points. These insights are vital for the design and optimization of TPS materials, ensuring spacecraft can withstand the rigors of re-entry. This example underscores the transformative potential of surrogate modeling in materials science, particularly in the study of UHTCs for aerospace applications. By enabling more accurate and efficient predictions of material behavior under extreme conditions, such algorithms are not only optimizing the design and testing of thermal protection systems but also paving the way for advancements in the broader field of materials science. Applications in such contexts illustrate a pivotal step toward leveraging computational power to enhance our understanding and innovation in material technologies, marking a significant milestone in the ongoing synergy between materials science and artificial intelligence.7,19
IV. CONCLUSIONS AND FURTHER DIRECTIONS
Machine learning and deep learning have the potential to revolutionize the field of material discovery by accelerating the process of identifying new materials with desirable properties. In this section, we will discuss the outlook of machine learning and DL in material discovery, including potential avenues for future research, such as the integration of experimental and computational approaches and the development of algorithms specifically tailored for material discovery.1 While machine learning and deep learning can accelerate the screening of materials, experimental validation is still necessary to confirm the predicted properties of new materials. Therefore, an integrated approach that combines these methodologies with experimental methods, integrated within a stable data-driven pipeline, could potentially speed up the process of material discovery while also providing experimental validation of the predicted properties.
For example, machine learning and deep learning could be used to guide the synthesis of new materials, and the properties of the synthesized materials can be characterized using experimental techniques such as x-ray diffraction, scanning electron microscopy, and spectroscopy. Another potential avenue for future research is the development of new machine learning and DL algorithms specifically tailored for material discovery. Traditional machine learning algorithms may not be optimized for the unique challenges associated with material discovery, such as the high dimensionality of the data and the complex relationships between the chemical composition, crystal structure, and properties of materials.20–26
Therefore, tailored algorithms that are specifically designed for material discovery can potentially improve the accuracy and efficiency of the predictions. For example, CNNs and RNNs can be adapted for material discovery by incorporating domain-specific knowledge into the architecture. In conclusion, the integration of experimental and computational approaches and the development of new ML and DL algorithms specifically tailored for material discovery are potential avenues for future research. However, challenges such as data availability and quality must also be addressed to fully realize the potential of these methodologies in the future.4
AUTHOR DECLARATIONS
Conflict of Interest
The authors have no conflicts to disclose.
Author Contributions
Carmine Zuccarini: Writing – original draft (equal). Karthikeyan Ramachandran: Writing – original draft (equal). Doni Daniel Jayaseelan: Writing – review & editing (lead).
DATA AVAILABILITY
The data that support the findings of this study are available from the corresponding author upon reasonable request.