Neutron and x-ray scattering represent two classes of state-of-the-art materials characterization techniques that measure the structural and dynamical properties of materials with high precision. These techniques play critical roles in understanding a wide variety of materials systems, from catalysts to polymers, nanomaterials to macromolecules, and energy materials to quantum materials. In recent years, neutron and x-ray scattering have received a significant boost from the development and increased application of machine learning to materials problems. This article reviews the recent progress in applying machine learning techniques to augment various neutron and x-ray techniques, including neutron scattering, x-ray absorption, x-ray scattering, and photoemission. We highlight the integration of machine learning methods into the typical workflow of scattering experiments, focusing on problems that challenge traditional analysis approaches but are addressable through machine learning, including leveraging the knowledge of simple materials to model more complicated systems, learning with limited data or incomplete labels, identifying meaningful spectra and materials representations, and mitigating spectral noise, among others. We present an outlook on a few emerging roles machine learning may play in broad types of scattering and spectroscopic problems in the foreseeable future.
I. INTRODUCTION
A. Neutron and x-ray scattering in the data era
Neutron and x-ray scattering are two closely related and complementary techniques that can be used to measure a wide variety of structural and dynamical materials properties from atomic to mesoscopic scales.1,2 Representing two state-of-the-art materials characterization techniques, neutron and x-ray scattering have undergone significant advancement in the past several decades. While the average neutron flux of reactor-based sources has reached a plateau, accelerator-based neutron generation has improved steadily over the past 50 years3 [Fig. 1(a)]. The planned Second Target Station (STS) at Oak Ridge National Laboratory (ORNL) is expected to deliver a 25-fold enhancement in brightness and a factor of 10–1000 capability enhancement in instruments compared to other neutron sources in the United States. For x-ray scattering, the peak brightness of synchrotron sources has increased drastically across a broad range of x-ray photon energies4 [Fig. 1(b)]. In fact, the improvement in peak brightness of synchrotron x-ray sources even exceeds the rate of Moore's law5,6 [Fig. 1(c)], with a few major facility upgrades, such as the Advanced Photon Source Upgrade (APS-U), the European Synchrotron Radiation Facility Extremely Brilliant Source (ESRF-EBS), and the Positron-Electron Tandem Ring Accelerator (PETRA-IV), bringing significant capability boosts. A direct consequence of the enhanced capability is more efficient data collection, enabling the measurement of more diverse types of materials.
In addition to increased data availability for a broader materials composition space, the higher brightness further opens up the possibility of higher-dimensional data collection for a single material type or within one scattering experiment. Spectroscopies, like time-of-flight inelastic neutron scattering, measure the dynamical structure factor $S(\mathbf{Q}, E)$ in four-dimensional (4D) momentum–energy space, while x-ray photon correlation spectroscopy measures the intensity auto-correlation in 4D momentum–time space.7 The emerging frontier of multimodal scattering, which simultaneously measures samples with multiple probes, or in in situ environments such as extreme temperatures or pressures, elastic strain, or applied electrical and magnetic fields, introduces additional dimensions to the measured parameter space. Alongside the high intrinsic momentum ($\mathbf{Q}$), energy ($E$), and time ($t$) dimensions, multimodality leads to an even higher overall data dimension and adds inevitable complexities to data analysis.
Finally, the discovery of new functional and quantum materials—often accompanied by novel or unexpected emergent properties—poses a significant challenge to materials analysis. In many scattering experiments with a given measurable signal $S_{\mathrm{exp}}$, an associated theoretical model $S_{\mathrm{mod}}(\mathbf{p})$ exists, parameterized by a set of fitting parameters $\mathbf{p}$ that represent the materials properties to extract. For the optimal fitting parameters $\mathbf{p}^*$, the difference between the experiment and model reaches a minimum, i.e.,

$$\mathbf{p}^* = \operatorname*{arg\,min}_{\mathbf{p}} \left\lVert S_{\mathrm{exp}} - S_{\mathrm{mod}}(\mathbf{p}) \right\rVert.$$

However, even for a perfect fit $S_{\mathrm{exp}} = S_{\mathrm{mod}}(\mathbf{p}^*)$, the information that can be extracted is still ultimately limited by the theoretical model itself. Until recently, avenues that can access materials properties outside the parameter set $\mathbf{p}$ have been lacking.
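To make this fitting paradigm concrete, the following minimal Python sketch fits a hypothetical Lorentzian peak model to synthetic noisy data with SciPy; the model form, parameter values, and noise are all invented for illustration and are not tied to any specific experiment:

```python
import numpy as np
from scipy.optimize import least_squares

# Hypothetical forward model: a single Lorentzian peak on a flat background,
# standing in for S_mod(p) with p = (amplitude, center, width, background).
def s_mod(q, p):
    amp, q0, gamma, bg = p
    return amp * gamma**2 / ((q - q0)**2 + gamma**2) + bg

# Synthetic "experimental" signal with Poisson-like counting noise.
rng = np.random.default_rng(0)
q = np.linspace(0, 4, 200)
p_true = (100.0, 2.0, 0.15, 5.0)
s_exp = rng.poisson(s_mod(q, p_true)).astype(float)

# p* = argmin_p || S_exp - S_mod(p) ||, solved by nonlinear least squares.
result = least_squares(lambda p: s_exp - s_mod(q, p), x0=(50.0, 1.5, 0.5, 1.0))
print("fitted parameters p*:", result.x)
```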
In short, large data volume, high data dimension combined with multimodality, and new classes of quantum and functional materials with emergent properties that go beyond approximate models all call for a revolutionary approach to learn materials properties from neutron and x-ray scattering data. Machine learning,8,9 especially emerging techniques that incorporate physical insight10–15 or respect symmetries and physical constraints of atomic, crystalline, and molecular structures,16–26 appears to be a promising and powerful tool to extract useful information from large, high-dimensional experimental datasets,27 going far beyond approximate analytical fitting models. The past few years have witnessed a surge in machine learning research with scattering and spectroscopic applications. Even so, we foresee that machine learning, if properly implemented, has the potential not only to serve as a powerful tool to conduct data analysis but also to gain new knowledge and physical insight from materials, which can assist experimental design and accelerate materials discovery.
B. Integrating machine learning into the scattering setup
Machine learning has already been widely applied to materials science in many areas, especially in directly predicting or facilitating predictions of various materials properties from structural information, including but not limited to mechanical properties,24,26,28,29 thermodynamic properties,28,30–32 and electronic properties.24,33–39 Their strong predictive power and capacity for representation learning enable machine learning models to perform comparably to more expensive numerical models, like first-principles calculations, at much lower computational cost. This asset greatly accelerates materials discovery and design.40–45 Machine learning models can also distill complex structural information46–48 and be trained to acquire interatomic force fields and potential energy surfaces,49–54 where the accurate, yet computationally cheap access to atomic potentials has proven successful in simulating the transitions in a disordered silicon system with 100 000 atoms.55 Evidently, machine learning models have already initiated a paradigm shift in the way people study materials science and physics.56–62
To see how machine learning can be applied to neutron and x-ray scattering, we show a simple scattering setup in Fig. 2(a). A beam of neutrons or x-ray photons is generated at the source. After passing through the beam optics that prepare the incident beam state, the beam impinges on the sample with a set of incident parameters $(I_i, \mathbf{k}_i, E_i, \boldsymbol{\varepsilon}_i)$, which denote the incident beam intensity, momentum, energy, and polarization, respectively. After interacting with the sample, the scattered beam can be described by another set of parameters $(I_f, \mathbf{k}_f, E_f, \boldsymbol{\varepsilon}_f)$, which are partially or fully recorded by the detector. In this source–sample–detector tripartite scheme, the possible application scope of machine learning can be seen clearly: At the “source” stage, machine learning can be used to optimize beam optics; at the “sample” stage, to better learn materials properties; and at the “detector” stage, to improve data quality, such as realizing super-resolution. Setting aside the source and detector stages, which will be introduced in Sec. IV, we focus first on the sample stage, particularly the application of machine learning to relate materials spectra and their properties.
To further illustrate the general relationship between machine learning and scattering experiments, we consider the scattering data as one component in a typical machine learning architecture. In the case of supervised machine learning, the scattering or spectral data can serve either as input to predict other materials properties [Fig. 2(b)] or as output generated from known or accessible materials parameters, such as atomic structures and other materials representations [Fig. 2(c)]. Alternatively, unsupervised machine learning can be used to identify underlying patterns in spectral data through dimensionality reduction and clustering, which can be useful for data exploration or identification of key descriptors in the data [Fig. 2(d)].
C. Machine learning architectures for scattering data
With the various roles machine learning may play in a scattering experiment pipeline, one may ask what particular machine learning architecture should be used for a certain task. Given the no free lunch theorem for optimization,63 many algorithms are interchangeable. Even so, a number of machine learning models are naturally suited to scattering experiments. Here, we introduce a few categories of useful architectures, many of which are implemented in the examples that will be discussed in Secs. II–IV.
1. Representation of materials
For materials studies, the representation of materials, particularly atomistic structures, is crucial. Various representational approaches have been developed to describe molecules and solids. These methods include the Coulomb matrix representation,64 which translates molecules into matrices; the Ewald sum matrix representation, which generalizes the Coulomb matrix to periodic structures;65 the partial radial distribution function (PRDF), which describes the radial density of species surrounding an atom;16 atom-centered symmetry functions, which capture both radial and angular information;66 and the power spectrum defined by the Smooth Overlap of Atomic Positions (SOAP) descriptor.67 An excellent review of different materials representations can be found in Schmidt et al.58
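As a concrete illustration of one such descriptor, below is a minimal sketch of the Coulomb matrix,64 evaluated for a water molecule; the geometry is approximate and purely illustrative:

```python
import numpy as np

def coulomb_matrix(Z, R):
    """Coulomb matrix: M_ii = 0.5 * Z_i**2.4 (diagonal),
    M_ij = Z_i * Z_j / |R_i - R_j| (off-diagonal)."""
    n = len(Z)
    M = np.zeros((n, n))
    for i in range(n):
        for j in range(n):
            if i == j:
                M[i, j] = 0.5 * Z[i] ** 2.4
            else:
                M[i, j] = Z[i] * Z[j] / np.linalg.norm(R[i] - R[j])
    return M

# Water molecule (O, H, H); coordinates in angstroms (approximate geometry).
Z = np.array([8, 1, 1])
R = np.array([[0.000, 0.000, 0.117],
              [0.000, 0.757, -0.469],
              [0.000, -0.757, -0.469]])
print(coulomb_matrix(Z, R))
```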
2. Representation of scattering data
Complementary to materials representation is the representation of scattering data. The scattering intensity can be represented as a high-dimensional array, $I(\mathbf{k}, E, \boldsymbol{\varepsilon})$, indexed by momentum $\mathbf{k}$, energy $E$, and polarization $\boldsymbol{\varepsilon}$. Such data structures are naturally compatible with convolutional neural networks (CNNs), which are widely applied in image processing. Atomic structures can also be interpreted as images by regarding them as density fields based on atomic species and positions on 3D real-space grids, which enables them to work with convolutional filters.44,68 Architectures beyond CNNs, such as deep U-Nets, are also widely used to compress the feature size while increasing the number of features, with skip connections to corresponding layers69 [Fig. 3(a)].
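The following short sketch illustrates the density-field idea: atoms are rendered as Gaussian blobs on a 3D grid, weighted here by atomic number (one possible choice; separate channels per species are another common variant), producing an array a 3D CNN could consume:

```python
import numpy as np

def voxelize(positions, species_z, box=10.0, grid=32, sigma=0.5):
    """Render atoms as Gaussian densities on a 3D real-space grid."""
    axis = np.linspace(0.0, box, grid)
    X, Y, Z = np.meshgrid(axis, axis, axis, indexing="ij")
    density = np.zeros((grid, grid, grid))
    for (x, y, z), zq in zip(positions, species_z):
        density += zq * np.exp(-((X - x)**2 + (Y - y)**2 + (Z - z)**2)
                               / (2 * sigma**2))
    return density  # shape (grid, grid, grid), one input channel for a 3D CNN

# Two illustrative atoms inside a 10 A box.
rho = voxelize(positions=[(3.0, 5.0, 5.0), (7.0, 5.0, 5.0)], species_z=[26, 8])
print(rho.shape, rho.max())
```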
3. Autoencoder and generator
Another powerful architecture is the variational autoencoder (VAE),70 which compresses the input into a distributed region of a lower-dimensional latent space (encoding) and then reconstructs the input from this low-dimensional representation (decoding). The latent space is thus a “compressed,” continuous representation of the training samples, which can be an effective strategy for learning meaningful, continuous representations of materials properties [Fig. 3(b)]. For example, VAEs can be combined with CNNs to learn latent representations of atomic structures.68 Moreover, the stability of crystal structures can be easily inferred from latent space clustering,44 and similar methods can also be applied to analyze scattering data, such as x-ray absorption spectroscopy (XAS)39 or neutron diffuse scattering.71 Another use of the VAE is as a generative model to facilitate materials design, such as generating new structures through sampling and exploring the latent space.44 The generative adversarial network (GAN) is another popular generative framework composed of a generator and a discriminator72 [Fig. 3(c)]. The generator is a neural network that converts latent space representations to desired objects, such as crystal structures,45 while the discriminator is a second network that aims to discern “fake” (generated) from “realistic” (training) samples. The main goal of the generator is to create high-fidelity objects that can effectively fool the discriminator, thereby generating outputs that closely resemble real data.
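A minimal PyTorch sketch of a VAE for 1D spectra follows; the layer sizes, latent dimension, and random input batch are arbitrary choices for illustration:

```python
import torch
from torch import nn

class SpectraVAE(nn.Module):
    """Minimal VAE compressing a 1D spectrum (length 128) to 2 latent variables."""
    def __init__(self, n_in=128, n_latent=2):
        super().__init__()
        self.encoder = nn.Sequential(nn.Linear(n_in, 64), nn.ReLU())
        self.to_mu = nn.Linear(64, n_latent)
        self.to_logvar = nn.Linear(64, n_latent)
        self.decoder = nn.Sequential(nn.Linear(n_latent, 64), nn.ReLU(),
                                     nn.Linear(64, n_in))

    def forward(self, x):
        h = self.encoder(x)
        mu, logvar = self.to_mu(h), self.to_logvar(h)
        z = mu + torch.exp(0.5 * logvar) * torch.randn_like(mu)  # reparameterization
        return self.decoder(z), mu, logvar

def vae_loss(x, x_hat, mu, logvar):
    recon = ((x - x_hat) ** 2).sum()                       # reconstruction term
    kl = -0.5 * (1 + logvar - mu**2 - logvar.exp()).sum()  # KL to unit Gaussian
    return recon + kl

x = torch.rand(16, 128)                # a batch of hypothetical spectra
model = SpectraVAE()
x_hat, mu, logvar = model(x)
print(vae_loss(x, x_hat, mu, logvar).item())
```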
4. Graph neural networks
Graph neural networks with nodes and edges are naturally suited to represent atomic structures, where atoms can be represented as nodes and bonds as edges in a graph. In graphs, information at each node is updated using information from its neighboring nodes, mimicking a local chemical environment in which an atom is most influenced by its neighboring atoms. Crystal graph CNNs24 [Fig. 3(d)] and Euclidean neural networks (E3NN)19,73,74 are two such examples. E3NNs are equipped with sophisticated filters that incorporate radial functions and spherical harmonics, which render these networks equivariant to 3D Euclidean transformations; thus, all crystallographic symmetries of input structures are preserved. Because symmetry is built into these models, E3NNs do not require data augmentation and can achieve accurate results without significant data volume.
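The sketch below shows the core neighbor-aggregation step of a generic message-passing graph network in plain PyTorch; note that, unlike an E3NN, this toy layer encodes no Euclidean symmetry, and the feature sizes and four-atom graph are invented:

```python
import torch
from torch import nn

class MessagePassingLayer(nn.Module):
    """One round of message passing: each node aggregates messages from its
    neighbors, mimicking an atom influenced by its local chemical environment."""
    def __init__(self, n_feat):
        super().__init__()
        self.message = nn.Linear(2 * n_feat, n_feat)
        self.update = nn.GRUCell(n_feat, n_feat)

    def forward(self, h, edges):
        # h: (n_nodes, n_feat) node features; edges: (n_edges, 2) index pairs
        src, dst = edges[:, 0], edges[:, 1]
        msg = torch.relu(self.message(torch.cat([h[src], h[dst]], dim=-1)))
        agg = torch.zeros_like(h).index_add_(0, dst, msg)  # sum over neighbors
        return self.update(agg, h)

# A toy 4-atom "molecule": random features; bonded pairs listed in both directions.
h = torch.randn(4, 8)
edges = torch.tensor([[0, 1], [1, 0], [1, 2], [2, 1], [2, 3], [3, 2]])
print(MessagePassingLayer(8)(h, edges).shape)  # -> torch.Size([4, 8])
```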
5. Nonparametric learning algorithms
The aforementioned machine learning architectures contain parameters that need to be learned during the training process, yet plenty of unsupervised learning or nonparametric algorithms exist that do not contain learnable parameters but are more procedural. For instance, k-means clustering and Gaussian mixture models (GMMs) can be applied to data clustering; decision trees, such as gradient boosted trees (GB Trees), can be used for classification and regression [Fig. 3(f)]; and principal component analysis (PCA) can be used for dimensionality reduction. One particularly interesting method is non-negative matrix factorization (NMF), which decomposes a matrix into lower dimensions but maintains an intuitive representation75,76 [Fig. 3(e)]. Conceptually, NMF resembles the widely used dimension reduction algorithm of PCA but with additional non-negativity constraints. Such non-negative matrix descriptions are extremely powerful when interpreting certain physical signals, such as music spectrograms.77 Likewise, scattering data collected from detectors have non-negative counts in the array and can, in principle, be decomposed with NMF.
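As a minimal illustration of the non-negative factorization, the following sketch recovers two synthetic Gaussian “basis patterns” from random mixtures using scikit-learn; all data are invented:

```python
import numpy as np
from sklearn.decomposition import NMF

rng = np.random.default_rng(1)
n_angles = 300

# Two synthetic non-negative "basis patterns" (Gaussian diffraction-like peaks).
x = np.arange(n_angles)
basis = np.vstack([np.exp(-(x - c)**2 / 50.0) for c in (80, 200)])

# 40 synthetic "measurements": random non-negative mixtures plus noise.
weights = rng.uniform(0, 1, size=(40, 2))
V = weights @ basis + 0.01 * rng.random((40, n_angles))

# V ~= W @ H with W, H >= 0; K=2 components chosen to match the ground truth.
model = NMF(n_components=2, init="nndsvda", max_iter=500)
W = model.fit_transform(V)   # coefficients: contribution of each basis pattern
H = model.components_        # recovered basis patterns
print(W.shape, H.shape)      # (40, 2) (2, 300)
```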
II. STATIC PROPERTIES IN RECIPROCAL OR REAL SPACE
A. Diffraction with machine learning
To see how machine learning can benefit neutron and x-ray diffraction (XRD), we follow the taxonomy illustrated in Fig. 2. In a supervised learning problem, diffraction signatures can serve as an input to predict either materials structure or properties, while in an unsupervised learning approach, clustering of diffraction data can be used to infer patterns or relationships between materials in the absence of known data labels. Problems in the inverse direction, such as using structures or other physical properties to predict diffraction patterns, can typically be posed as physics-based forward problems; they are less relevant to machine learning studies and will be left out of this discussion.
1. Diffraction or structure as input, property as output
Since the most straightforward information one can extract from diffraction data is the atomic structure, whose variation is directly associated with mechanical properties, we start by discussing high-entropy alloys as an example of how diffraction can be used to predict elastic constants in complicated materials. High-entropy alloys have received tremendous attention in the past decade due to their extraordinary strength-to-weight ratios and chemical stability. However, given their complex atomic configurations, direct property calculation has been challenging. To enable efficient prediction of elastic constants in high-entropy alloys, Kim et al. conducted a combined neutron diffraction, ab initio calculation, and machine learning study.78 In this study, an in situ diffraction experiment and high-quality ab initio density functional theory (DFT) calculations with special quasi-random structures (SQS) were carried out on the high-entropy alloy Al0.3CoCrFeNi. The experimental results and ab initio calculations of the elastic constants showed good agreement and can thus serve as ground truth values [Fig. 4(a), ground truth block]. Due to the limited neutron beamtime and high computational cost of DFT + SQS, it would be unrealistic to either measure or compute the elastic constants of a large number of high-entropy alloys. To bridge this gap, the authors built a GB Tree–based predictive model using a separate set of nearly 7000 ordered, crystalline solids from the Materials Project, in which the elastic constants have already been properly labeled [Fig. 4(a), machine learning block]. It is worth mentioning that the training and validation sets do not contain any high-entropy alloys. Even so, a few indicators demonstrate the model's transferability and generalizability. On the one hand, the elastic constants of Al0.3CoCrFeNi predicted by the machine learning model show good agreement with the ground truth values established by experiments and DFT + SQS calculations. On the other hand, the lower training error compared to a benchmark model, and the reasonable dependence of the error on training data volume, give a level of confidence in the model's generalizability.
The high-entropy alloy example demonstrates a general pathway for efficient property predictions in complex materials, where data scarcity is a common challenge. With only a small labeled “hard” dataset $D_{\mathrm{hard}} = \{X_{\mathrm{hard}}, Y_{\mathrm{hard}}\}$ of size $n$ that is difficult to acquire, training a machine learning model directly may not be feasible due to the low data volume $n$. To make machine learning possible, a different, large set $D_{\mathrm{easy}} = \{X_{\mathrm{easy}}, Y_{\mathrm{easy}}\}$ of size $N$, with $N \gg n$, can be used, where the features and labels are easier to obtain, such as properties of simple crystalline solids or outputs of efficient forward computations. The key step is to build a predictive model using the large, “easy” set $D_{\mathrm{easy}}$ that also minimizes the test error on $D_{\mathrm{hard}}$ [Fig. 4(b)]. A few approaches can help to achieve this step, such as transfer learning, where the learning task in one setting (e.g., the easy set) is generalized and transferred to another setting (e.g., the hard set); similar examples will be discussed frequently in Secs. II–IV. The outcome thus has a level of generalizability. In the previous example, $D_{\mathrm{easy}}$ comprises the labeled crystalline solids from the Materials Project, while $D_{\mathrm{hard}}$, the high-entropy alloy with its measured and computed elastic constants, is simply used to test the machine learning model built upon $D_{\mathrm{easy}}$ [Fig. 4(b), right-facing arrow].
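In code, the easy-set/hard-set paradigm reduces to training on one distribution and evaluating on another; the sketch below uses gradient boosted trees and a synthetic relation shared by both sets, purely to illustrate the workflow:

```python
import numpy as np
from sklearn.ensemble import GradientBoostingRegressor

rng = np.random.default_rng(2)

# "Easy" set: many cheap, labeled examples (e.g., featurized crystalline solids).
X_easy = rng.normal(size=(5000, 10))
y_easy = X_easy[:, 0] ** 2 + 0.5 * X_easy[:, 1] + 0.1 * rng.normal(size=5000)

# "Hard" set: a handful of expensive ground-truth examples, drawn here from the
# same underlying relation to mimic a transferable structure-property mapping.
X_hard = rng.normal(size=(5, 10))
y_hard = X_hard[:, 0] ** 2 + 0.5 * X_hard[:, 1]

model = GradientBoostingRegressor().fit(X_easy, y_easy)  # train on D_easy only
print("test error on D_hard:", np.abs(model.predict(X_hard) - y_hard).mean())
```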
2. Diffraction as input, structure as output
In addition to the direct structure-to-property prediction, given the close relationship between diffraction and structure, another central machine learning application is diffraction-to-structure prediction, as done by Garcia-Cardona et al. for neutron diffraction.79 The conventional solution to this problem is iterative optimization of computed scattering patterns from physics-based forward models. A key challenge, however, lies in the scarcity of labeled neutron diffraction data, i.e., the small size of $D_{\mathrm{hard}} = \{$experimental diffraction patterns, structural labels$\}$.
To facilitate the training of machine learning models, an easy set $D_{\mathrm{easy}}$ of simulated neutron diffraction patterns may be generated by sweeping the structure parameter space (comprising lattice parameters, unit cell angles, etc.), which amounts to data augmentation [Fig. 4(b), left-facing arrow]. Similarly, augmented XRD data can be used to obtain crystal dimensionality and space group information.80 Another approach is to use the atomic pair distribution function (PDF) to predict space group information.81 Conventionally, the PDF is a powerful tool for determining local order and disorder.82 By setting $D_{\mathrm{easy}} = \{$computed PDFs, space group labels$\}$, the PDF may further be exploited to allow determination of global space group information. A separate set $D_{\mathrm{hard}}$ containing 15 experimental examples is used as a test dataset [Fig. 4(b), right-facing arrow], where the space groups of 12 examples are among the top six predicted labels.
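A sketch of the kind of forward simulation that can populate $D_{\mathrm{easy}}$ is shown below: toy powder patterns for a cubic lattice are generated from Bragg's law while sweeping the lattice parameter; equal peak intensities and Gaussian profiles are deliberate simplifications:

```python
import numpy as np

def cubic_powder_pattern(a, wavelength=1.54, hkl_max=3, width=0.1,
                         two_theta=np.linspace(10, 90, 2000)):
    """Toy powder pattern for a cubic lattice: Bragg peak positions from
    lambda = 2 d sin(theta), with d = a / sqrt(h^2 + k^2 + l^2), rendered as
    Gaussian peaks of equal (unphysical) intensity, enough for augmentation."""
    pattern = np.zeros_like(two_theta)
    for h in range(hkl_max + 1):
        for k in range(hkl_max + 1):
            for l in range(hkl_max + 1):
                if h == k == l == 0:
                    continue
                d = a / np.sqrt(h**2 + k**2 + l**2)
                s = wavelength / (2 * d)
                if s < 1:  # reflection is geometrically accessible
                    peak = np.degrees(2 * np.arcsin(s))
                    pattern += np.exp(-(two_theta - peak)**2 / (2 * width**2))
    return pattern

# Sweep the lattice parameter to build a labeled synthetic training set.
training_set = [(a, cubic_powder_pattern(a)) for a in np.linspace(3.5, 4.5, 100)]
print(len(training_set), training_set[0][1].shape)
```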
3. Unsupervised learning for diffraction data
In addition to supervised learning on XRD and PDF spectra, unsupervised learning, which aims to elicit the internal categorical structure of data, can also assist the analysis and interpretation of scattering measurements. Since mature fitting and refinement methods exist to identify phases among different crystallographic structures, one key application for unsupervised learning is phase identification in a complex compositional phase space where multiple phases coexist. One major success of unsupervised learning in the context of XRD analysis has been the use of NMF and its variants,83 which can decompose diffraction spectra into simpler basis patterns. Recalling the introduction in Sec. I C, NMF decomposes a non-negative matrix $V$ into two smaller non-negative matrices, $V \approx WH$, namely, the basis matrix $H$, representing basis patterns, and the coefficient matrix $W$, indicating the contributions of those patterns. In XRD, the element $V_{mn}$ denotes the diffraction intensity of the $m$th composition at the $n$th diffraction angle (or, equivalently, momentum transfer $Q$). Long et al. applied NMF to identify phases within the metallic Fe–Ga–Pd ternary compositional phase diagram.84 For instance, given a nominal composition Fe46Pd26Ga28, NMF decomposes its XRD pattern [Fig. 5(a), middle] into a weighted sum of $K = 5$ basis patterns [Fig. 5(a), bottom and top]. The entire structural phase diagram in the compositional phase space can then be constructed [Fig. 5(b)] and contains the quantitative weight information of each pure constituent phase. Limitations exist, however, when the same nominal composition corresponds to different structural combinations with slightly varied diffraction peaks. To overcome this limitation, Stanev et al. extended NMF with custom clustering (NMFk) algorithms to capture the nuanced peak shifts arising from changes in the lattice constant, which can further resolve the constituent phases, even within the same nominal composition.85 Compared to NMF, where the inner matrix dimension $K$ is chosen manually through trial and error, NMFk automatically searches for and optimizes $K$. The same Fe–Ga–Pd dataset first analyzed in Long et al.84 by NMF is re-analyzed using NMFk.85 The optimized set of basis patterns found by NMFk contains four basis patterns representing BCC Fe structures that differ by a slight peak shift [Fig. 5(d)]. Although the BCC Fe structure corresponds to almost identical regions in the structural phase diagrams produced by the NMFk and NMF methods [comparing Fig. 5(c) to the blue points in Fig. 5(b)], the weight of each NMFk basis pattern for the BCC Fe structures can be seen clearly [Fig. 5(e)], tracing the nuanced lattice parameter change across the phase diagram. In a more recent example, XRD measurements were also analyzed to obtain quantum phase diagrams with charge ordering and structural phase transitions using a novel approach called XRD temperature clustering (X-TEC), which builds upon the GMM.86
Beyond learning materials properties and identifying structure–property relationships, machine learning has also been applied to empower the analysis of diffraction patterns themselves87–91 or to automate the structure refinement process.92 Since the focus here is to explore materials properties, we leave those examples to Sec. IV B as part of the section on the data analysis process.
B. Small-angle neutron and x-ray scattering
Small-angle scattering (SAS), which includes small-angle neutron scattering (SANS) and small-angle x-ray scattering (SAXS), is a powerful technique used to probe structures and their evolution on length scales of 0.5–100 nm.93,94 It has been widely applied to study soft matter systems, such as rough surfaces,95 colloids, and polymers,96–98 and biological macromolecules,98–101 as well as mesoscopic magnetic structures, namely, magnetic vortex lattices in superconductors (SANS only).102–105 In the past few years, a surge of machine learning–augmented SAS works has been reported.106–118 At least two reasons make SAS an ideal technique to benefit from machine learning. On the one hand, SAS represents one of the rare techniques for which experimental data can be directly and quantitatively compared to theoretical models with minimal postexperimental data processing. This direct data-to-data comparison increases transferability, allowing computationally easy data to be used for training with high fidelity. On the other hand, SAS allows for highly efficient synthetic data generation, since in many cases, only effective geometrical models at intermediate scales are needed to compute the 1D SAS spectra $I(Q)$. Even in the case of atomistic-scale data generation, methods with low computational cost, such as molecular dynamics, Monte Carlo simulations, or micromagnetic simulations, are generally sufficient.
1. Spectra as input, structure as output
Since the original goal of SAS is to learn structural information, we start by introducing one example that predicts structural properties. Franke et al. provided such a machine learning–based structure predictor for biomacromolecular solutions.108 For a given geometrical object, although the form factor and corresponding SAS spectra are directly computable [Fig. 6(a)], the effect of disorder must be considered to generate realistic data. To capture this disorder, an ensemble optimization method is implemented to generate SAS patterns of random chains, which are averaged to simulate mixtures [Fig. 6(b)] that then augment the original geometrical data. The first task is to classify the shape of macromolecules from SAS. By defining a structural parameter, the radius of gyration $R_g$, the original SAS curve can be compressed onto a 3D parameter space, with the three coordinates representing normalized apparent $R_g$ values computed with three different integral upper bounds. It can be seen directly that different basic shapes separate well in this 3D parameter space [Fig. 6(b)]. By performing k-nearest neighbor classification, these shapes can also be classified from the SAS curves. To perform structural parameter prediction, a separate set of atomistic structure data from the Protein Data Bank (PDB) is used to compute both SAS patterns and structural parameters, from which a predictive machine learning model is built, showing good transferability when applied to experimental databases [Fig. 6(c)]. A summarized workflow is shown in Fig. 6(d). In this example, we can still state that $D_{\mathrm{easy}} = \{$simulated SAS spectra, structural labels$\}$, but with two different sources of $X_{\mathrm{easy}}$: geometrical models with ensemble-averaged disorder, or atomistic PDB structures,
depending on whether the target is obtaining shape features or obtaining structural parameters, respectively. Shape classification and structure parameter prediction, with targeted synthetic data augmentation for the respective tasks, represent two key applications of machine learning to SAS,112,117 and have been employed in studying systems like RNA114 and 3D protein structures.116 Machine learning also enables direct analysis of 2D SAS data,111,118 where traditional analysis methods frequently require reduction to 1D.
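As an example of the kind of structural parameter involved, the sketch below estimates the radius of gyration from the Guinier approximation $I(Q) \approx I(0)\exp(-Q^2 R_g^2/3)$ using synthetic data; Franke et al. use more elaborate normalized apparent $R_g$ features, so this is only a simplified analog:

```python
import numpy as np

def guinier_rg(q, intensity, q_rg_max=1.3):
    """Estimate the radius of gyration from the Guinier law
    I(Q) ~ I(0) * exp(-Q^2 Rg^2 / 3), valid for Q*Rg below ~1.3.
    A linear fit of ln I vs Q^2 gives slope = -Rg^2 / 3."""
    rg = 0.0
    for _ in range(5):  # iterate: the valid Q-range itself depends on Rg
        mask = (q * rg <= q_rg_max) if rg > 0 else np.ones_like(q, dtype=bool)
        slope, intercept = np.polyfit(q[mask] ** 2, np.log(intensity[mask]), 1)
        rg = np.sqrt(-3 * slope)
    return rg

# Synthetic scatterer with Rg = 2 nm (Guinier regime only, noise-free).
q = np.linspace(0.01, 1.0, 200)   # 1/nm
i_q = 1e3 * np.exp(-(q * 2.0) ** 2 / 3)
print(f"recovered Rg = {guinier_rg(q, i_q):.2f} nm")
```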
Another example of machine learning in the context of SAS is micromagnetic structure determination from SANS. As in studies of soft matter, real space structural information is encoded in 2D maps of the neutron scattering cross section. As noted previously, a strong benefit of magnetic SANS is that the structure factor and cross section are relatively straightforward to calculate from a theoretical model—often a micromagnetic continuum model—of the real space magnetization. With sufficient labeled experimental data and micromagnetic simulations, supervised learning with neural networks or unsupervised clustering methods can be used to solve the inverse scattering problem of determining real space magnetic structures from SANS spectra at the mesoscale.
2. Spectra as input, other property as output
Since the structures of macromolecules are directly linked to their microscopic interactions, one further use of machine learning is to augment SAS to learn interatomic interaction properties. Demerdash et al. directly extracted force field parameters from SAS109 using an iterative algorithm as depicted in Fig. 6(e). First, a molecular dynamics (MD) simulation is performed from an initial set of force field parameters, and corresponding SAS intensities are then calculated. If the simulated and experimental scattering intensities are in good agreement according to a specified convergence criterion, the current force field parameters are output; otherwise, the parameters are updated and the process is repeated until optimal force field parameters are obtained. Such refined force field parameters improve the agreement between simulated SAS and experimental data.
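Schematically, the iterative refinement can be expressed as the loop below; `run_md` and `compute_saxs` are placeholders for a real MD engine and SAS forward calculator, and the random parameter perturbation merely stands in for whatever optimizer the actual workflow uses:

```python
import numpy as np

def refine_force_field(params, run_md, compute_saxs, i_exp, sigma,
                       max_iter=50, chi2_tol=1.0, step=0.1):
    """Schematic refinement loop: simulate, compare to experiment via chi^2,
    and perturb the parameters until convergence (toy update rule)."""
    rng = np.random.default_rng(0)
    best_chi2 = np.inf
    for _ in range(max_iter):
        trajectory = run_md(params)                    # run MD with current params
        i_sim = compute_saxs(trajectory)               # forward-compute SAS
        chi2 = np.mean(((i_sim - i_exp) / sigma) ** 2)
        if chi2 < chi2_tol:                            # convergence criterion met
            return params, chi2
        if chi2 < best_chi2:                           # track the best so far
            best_chi2, best_params = chi2, params
        params = best_params + step * rng.normal(size=len(params))
    return best_params, best_chi2

# Toy usage with stand-in functions (a real workflow would call an MD engine).
i_exp = np.ones(50)
pars, chi2 = refine_force_field(
    params=np.array([1.0, 0.5]),
    run_md=lambda p: p,                            # pretend "trajectory"
    compute_saxs=lambda t: np.full(50, t.sum()),   # pretend SAS intensity
    i_exp=i_exp, sigma=0.1 * np.ones(50))
print(pars, chi2)
```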
C. Imaging and tomography
Neutron119,120 and x-ray121 imaging encompass a variety of modalities and have become essential techniques to unravel multidimensional and multiscale information in materials systems. As the complexity and size of imaging data grow, machine learning has also been applied to solve a variety of imaging-related computational tasks, including tomography and phase-contrast imaging. These are two types of high-dimensional imaging techniques, sensitive to the beam absorption or to the phase shift, respectively, with volumetric information obtained by rotating the sample. We restrict further discussion to materials science and refer the readers to other reviews for applications in biomedical imaging.122,123 Despite the wide variety of imaging modalities used today, the major data processing steps generally include image reconstruction and image segmentation.
In image reconstruction, one recovers the real space information (usually the amplitude and phase of the imaged object) from data obtained at different sample positions. Neural network–based reconstruction algorithms have been shown to improve reconstruction speed and quality,124 as demonstrated in a neutron tomography experiment.125 Yang et al. demonstrated the use of a GAN for reconstructing limited-angle tomography data by considering reconstruction as an image translation problem between the sinogram domain and real space.126 Their method, called GANrec, has been shown to tolerate large missing wedges without obvious degradation in reconstruction quality. GANrec has been successfully applied to tomographic imaging of zeolite particles deposited on a microelectromechanical systems (MEMS) chip, which, due to limited rotation capability, has a missing wedge of 110°. The reconstruction from GANrec shows significant improvement over the outcomes of conventional reconstruction algorithms, which are corrupted by artifacts due to the missing data.
With regard to image segmentation, in which pixels representing the desired structures are separated from the background, typical approaches include variants of the CNN and U-Net architectures.69 A number of related studies have been conducted, such as materials defect recognition,127–129 mineral phase segmentation,130 automated feature extraction for microtomography,131 and nondestructive, in vivo estimation of plant starch.132 Deep transfer learning, which has demonstrated great power in image processing, can also be applied to feature extraction in x-ray tomography133 using a network pretrained on a large image database. Deep U-Nets, on the other hand, are shown to be highly successful on image segmentation tasks.134
A particularly powerful technique called coherent diffraction imaging (CDI) has attracted significant research attention since its first demonstration in 1999.135 Contrary to conventional imaging, the resolution in CDI is not limited by the imaging optics. This allows for 3D structure determination in nanoscale materials through computational phase retrieval.136,137 Given the data complexity, lack of phase information, and high data volume inherent to this technique, machine learning is becoming a promising tool for CDI analysis. As an example, the shapes of helium nano-droplets have been measured by single-shot CDI with a free-electron laser,138 where shape classification from the diffraction images could be obtained using a CNN.139 More recently, Cherukara et al. applied the CNN depicted in Fig. 7(a) to directly address the phase retrieval problem in a particular type of CDI called ptychography.140 By inputting the diffraction patterns at different spatial points of the scan [row A of Fig. 7(b)], the amplitude and phase images of the 2D tungsten calibration chart obtained by the machine learning model [rows C and E of Fig. 7(b)] show good agreement with those retrieved by a conventional iterative phase retrieval algorithm [rows B and D of Fig. 7(b)]. The CNN-assisted approach can effectively speed up scanning by a factor of five, thus greatly reducing the imaging time and dose. Furthermore, Scheinker and Pokharel built an additional model-independent adaptive feedback loop on top of the CNN output,141 which allows for more accurate recovery of the 3D shape [Fig. 7(c)]. Iterative projection approaches still offer great flexibility in tomographic reconstruction because constraints such as multiple scattering effects can be captured well by physical models,142 whereas machine learning–based approaches incorporating such constraints have so far been implemented in only a few example cases in optical imaging.143
III. SPECTROSCOPIES AND DYNAMICAL PROPERTIES
A. X-ray absorption spectroscopy
XAS is another characterization technique widely used in materials science, chemistry, biology, and physics. The possibility of reaching excellent agreement between experimental and computational data makes XAS suitable for training machine learning models on bulk computational data that translate well to experimental examples. The absorption of x rays reflects electronic transitions from an atomic core orbital to either unoccupied bound levels or the free continuum, producing sharp jumps in the absorption spectrum at specific energies called absorption edges.144 Such a measurement is therefore sensitive to the species of the absorbing atom as well as to its valence state and local chemical environment, including the local symmetry, coordination number, and bond length.145,146 As a result, XAS is routinely used in the characterization of materials structural and electronic properties. However, the interpretation of XAS spectra ranges from qualitative comparisons with known model complexes to more quantitative comparisons with theoretical models147,148 or band structure calculations, making the process difficult to standardize and automate across different materials and applications. Machine learning methods are therefore sought to better extract and decipher the rich electronic and structural information encoded in XAS signatures.149 The availability of large XAS databases, such as the XASdb,150 further facilitates this objective.
To this end, Carbone et al. developed a neural network classifier to identify the local coordination environments of absorbing atoms in >18 000 transition metal oxides using simulated K-edge x-ray absorption near-edge structure (XANES) spectra.151 The input of their neural network model is the discretized XANES spectrum, while the output is a predicted class label corresponding to one of three coordination geometries: tetrahedral, square pyramidal, and octahedral. The authors achieved an average 86% classification accuracy when using the full (pre-, main-, and post-edge) feature space of the discretized spectra; however, by also training their model using only the pre-edge region, they further revealed the significance of features beyond the pre-edge for accurate classification of the coordination environments [Fig. 8(a)]. The work of Torrisi et al. expanded on this approach by subdividing the discretized XANES spectra into smaller domains ranging from 2.5 to 12.5 eV, thereby capturing spectral features on both coarse and fine scales.152 The spectrum within each domain is then fit by a cubic polynomial whose coefficients serve as inputs to random forest models for predicting the properties of interest, including coordination number, mean nearest-neighbor distance, and Bader charge. Through this multiscale featurization, the authors highlighted the importance of developing effective data representations to improve model interpretability and accuracy.
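A toy version of such a spectrum-to-coordination classifier is sketched below; the synthetic “spectra” simply shift an absorption edge per class, and a random forest (as used by Torrisi et al.) stands in for the neural network classifier of Ref. 151:

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(3)
energies = np.arange(100)  # discretized energy grid (arbitrary units)

def fake_spectrum(cls):
    """Synthetic stand-in XANES: each class gets a shifted absorption edge."""
    edge = 30 + 10 * cls
    return 1 / (1 + np.exp(-(energies - edge) / 3)) + 0.05 * rng.normal(size=100)

# 0: tetrahedral, 1: square pyramidal, 2: octahedral (labels as in Ref. 151)
labels = rng.integers(0, 3, size=2000)
X = np.array([fake_spectrum(c) for c in labels])
X_tr, X_te, y_tr, y_te = train_test_split(X, labels, test_size=0.2, random_state=0)
clf = RandomForestClassifier(n_estimators=200).fit(X_tr, y_tr)
print("held-out accuracy:", clf.score(X_te, y_te))
```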
Another focus of machine learning efforts in this context includes accelerating high-throughput modeling of XAS spectra. As a proof of concept, Carbone et al. showed that a message-passing neural network (MPNN) is capable of predicting the discretized XANES spectra of molecules to quantitative accuracy by using a graph representation of molecular geometries and their chemical properties.153 An MPNN, shown in Fig. 8(b), refers to a neural network framework that operates on graph-structured data: Hidden state vectors at each node in the graph are updated according to a function of their neighbors' state vectors for a specified number of time steps, and the results are ultimately aggregated over the entire graph to produce the final output.154 The structural similarities between MPNNs and molecular systems suggest that these networks may better predict molecular properties by remaining invariant to the molecular symmetries that help to determine these properties. In their work, Carbone et al. constructed each molecular graph by associating with each graph node a list of atom features (absorber, atom type, donor or acceptor states, and hybridization) and with each graph edge a list of bond features (bond type and length). The MPNN then passes the encoded feature information between adjoining nodes to learn effective atomic properties before computing a discretized output XANES spectrum from the final hidden state vectors. The network is optimized by minimizing the mean absolute error between this predicted spectrum and a ground truth XANES spectrum obtained from simulation.
Furthermore, Madkhali et al. investigated how the choice of representation for the local environment of an absorbing atom affects the performance of a neural network in predicting the corresponding K-edge XANES spectrum.155 In particular, the authors examined two different representations of chemical space—the Coulomb matrix and the radial distribution curve (RDC) shown in Fig. 8(c)—to represent the local environment around an Fe absorption site and evaluated them based on their ability to recover the Fe K-edge XANES spectra of 9040 unique Fe-containing compounds. They concluded that RDC featurization can achieve smaller mean squared error (MSE) between the predicted and target XANES spectra more quickly and with fewer data samples, reinforcing the need for effective data representations of materials-specific descriptors. Rankine et al. built upon this work by implementing a deep neural network to estimate Fe K-edge XANES spectra, relying only on geometric information about the Fe local environment as input.156 Specifically, the authors represented the local environment around the Fe absorption site by computing a discrete RDC comprising all two-body pairs within a fixed cutoff radius. Despite the limited input information, they demonstrated that a properly trained network can be used to make rapid, quantitatively accurate predictions of XANES spectra while circumventing the time and resource demands of advanced theoretical calculations.
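One common form of such an RDC descriptor is a charge-weighted, Gaussian-smoothed histogram of pairwise distances, sketched below; the exact weighting and broadening used in Refs. 155 and 156 may differ:

```python
import numpy as np

def radial_distribution_curve(Z, R, r_grid, gamma=20.0):
    """Smoothed, charge-weighted histogram of pair distances:
    RDC(r) = sum_{i<j} Z_i * Z_j * exp(-gamma * (r - r_ij)^2)."""
    rdc = np.zeros_like(r_grid)
    n = len(Z)
    for i in range(n):
        for j in range(i + 1, n):
            r_ij = np.linalg.norm(R[i] - R[j])
            rdc += Z[i] * Z[j] * np.exp(-gamma * (r_grid - r_ij) ** 2)
    return rdc

# Illustrative Fe site with two O neighbors at roughly 2 A.
Z = np.array([26, 8, 8])
R = np.array([[0.0, 0.0, 0.0], [2.0, 0.0, 0.0], [0.0, 2.1, 0.0]])
print(radial_distribution_curve(Z, R, np.linspace(0.5, 6.0, 256)).shape)
```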
Finally, one major advantage of XAS is its compatibility with diverse samples, both crystalline and amorphous, and sample environments, as in the case of in situ or operando measurements under extreme temperatures or externally applied fields, leading to diverse applications and opportunities for machine learning–assisted analysis. In particular, XAS is a prominent method used to correlate the structure of nanoparticle catalysts to properties such as catalytic activity, which is often characterized under the operando conditions of a harsh reaction environment, as shown in Fig. 8(d). Thus, the predictive ability of machine learning methods is attractive for directly recognizing encoded structural descriptors, such as coordination number, from evolving XAS spectral features. For example, Timoshenko et al. demonstrated that neural networks can be used to predict the average coordination numbers of Pt nanoparticles directly from their XANES spectra, which can then be used to determine particle sizes, shapes, and other structural motifs needed to inform catalyst design.157 Several successful examples of machine learning–aided analysis for operando XAS spectra of catalyst structures have been reported in recent years.158–161 Machine learning has also been applied to conduct high-throughput screening and obtain additional chemical insight into the atomic configurations of thin films monitored by in situ XAS during synthesis.162 Overall, machine learning methods have shown incredible potential for improving and accelerating the analysis of this versatile characterization tool, and more widespread integration of machine learning solutions within routine XAS analysis workflows may be on the horizon.
B. Photoemission spectroscopies
Contrary to XAS, which is generally bulk sensitive, photoelectron or photoemission spectroscopy (PES) is a surface-sensitive, photon-in, electron-out technique performed with light sources ranging from hard x rays to the extreme ultraviolet (UV) energy regime, which provides direct access to a material's electronic structure.163,164 The high sensitivity of x-ray photoelectron spectroscopy (XPS) to the chemical environment makes it an essential tool for quantifying a material's composition. In this regard, machine learning–based fitting may be used to disentangle complex overlapping spectra. Aarva et al. used fingerprint spectra, calculated for bonding motifs obtained from an unsupervised clustering algorithm, to fit x-ray photoelectron spectra.165 In another work, Drera et al. trained a CNN using simulated spectra to predict chemical composition directly from multicomponent x-ray photoelectron spectra from a survey spectral library.166 To achieve high-quality training, an easy set $D_{\mathrm{easy}}$ containing ∼100 000 computed XPS examples is generated using electron scattering theory in the transport approximation, which is shown to generalize well to a hard set $D_{\mathrm{hard}}$ of ∼500 well-characterized experimental examples. Their approach obviates the need to fit these complex spectra directly while showing robustness against the contaminant signal within the survey spectra.
Apart from chemical quantification, modern PES with momentum-resolved detectors is capable of mapping the entire electronic structure of materials through multidimensional detection of photoelectron energy and momentum distributions.163,167 The resulting 4D intensity data in energy–momentum space exhibit the same data structure as vibrational spectra obtained through inelastic scattering measurements. While this analogy implies the transferability of machine learning approaches developed for inelastic scattering, to be discussed in Sec. III C, the relationship between PES observables and microscopic quantities is significantly more complex due to the quantum nature of the electronic states and the multiple prefactors that effectively modulate the intensity values in a momentum-dependent manner.168 The current understanding of the complex photoemission spectra is limited by the available computational tools. Therefore, machine learning is a potential avenue to understanding such data. Highlighting the dispersive features is of primary importance for comparison between experiments and theories. For this task, robust methods are needed to tolerate the noise level and intensity modulations in the data.169 Peng et al. trained a super-resolution neural network on simulated angle-resolved PES (ARPES) data (the easy set $D_{\mathrm{easy}}$) to enhance the dispersive features in experimental data (the hard set $D_{\mathrm{hard}}$), without explicit models of the band dispersion.170 By contrast, Xian et al. cast band fitting as an inference problem and used a probabilistic graphical model to recover the underlying dispersion.171 Remarkably, this approach requires no training, only a reasonably good prior guess as a starting point. Its reasonable computational scaling allows the reconstruction of multiband dispersions within the entire Brillouin zone, as demonstrated for the 2D material tungsten diselenide (WSe2).
C. Inelastic scattering
One of the major triumphs of neutron and x-ray characterization techniques is inelastic scattering, which measures the elementary excitations of materials.172–175 There are generally two types of elementary excitations in the meV energy range: (a) collective atomic vibrations, such as phonons in crystalline solids176–180 and boson peaks in amorphous materials,181–185 and (b) magnetic excitations, which are essential to understanding the nature of strongly correlated materials,186 such as frustrated magnetic systems187–189 and unconventional superconductors.190–192 However, unlike elastic scattering, where massive synthetic data can be generated from forward models to build the easy set $D_{\mathrm{easy}}$, inelastic scattering is challenging for machine learning due to the atomistic origin and quantum nature of the excitations, for which forward models have high computational cost. Therefore, one major hurdle for machine learning in the context of inelastic scattering is data scarcity. Here, we introduce two examples of using machine learning to overcome this hurdle to study the elementary excitations of phonons and magnetic excitations, respectively.
For machine learning–assisted phonon studies, Chen et al. built a neural network model that directly predicts a material's phonon density of states (DOS) using only the atomic coordinates and masses of its constituent atoms as input.193 Two key challenges were addressed in this work. First, there was a lack of a large training set; a reliable density-functional perturbation theory (DFPT) database contains only a small set of ∼1500 examples.194 Second, the predicted outcome, the phonon DOS, is a continuous function instead of a single scalar quantity. To tackle these challenges, a graph-based neural network termed the Euclidean neural network195 was implemented. Euclidean neural networks are by construction equivariant to permutation, 3D rotations, translations, and inversion and, thus, fully respect crystallographic symmetry [Fig. 9(a)]. This inherent symmetry eliminates the need for data augmentation and enables the networks to generalize well without significant data volume. Intuitively, the symmetry constraint imposed on the operations of the neural network restricts the search space of functions to those that are physically meaningful; therefore, data become more powerful, and fewer data are needed to achieve accurate results that generalize well. The predicted phonon DOS is shown in Fig. 9(b), with each of the four rows representing an error quartile. For lower-error predictions [first three rows in Fig. 9(b)], the shape of the DOS can be finely resolved; for high-error predictions [fourth row in Fig. 9(b)], coarse features such as the bandwidth and DOS gaps can still largely be predicted. With such a predictive model available, the computational cost for phonon DOS is significantly reduced, and prediction in alloy systems becomes feasible.
As for magnetic systems, Samarakoon et al. implemented an autoencoder to assist the estimation of magnetic Hamiltonian parameters in the spin ice Dy2Ti2O7, including the magnetic exchange couplings between neighboring spins and magnetic dipolar interactions.71 Although the work considers diffuse scattering, which measures the static magnetic structure factor $S(\mathbf{Q})$, the architecture is also well suited for inelastic scattering with the dynamical structure factor $S(\mathbf{Q}, E)$, since the forward model that obtains $S(\mathbf{Q})$ from a parameterized Hamiltonian can also be used to calculate $S(\mathbf{Q}, E)$. The workflow is shown in Fig. 10. A Monte Carlo–based forward model is used to compute the simulated structure factor $S_{\mathrm{sim}}(\mathbf{Q})$. Instead of directly comparing $S_{\mathrm{sim}}(\mathbf{Q})$ to the measured $S_{\mathrm{exp}}(\mathbf{Q})$, which could suffer from experimental artifacts, an autoencoder is applied to compress the structure factor into a lower-dimensional latent representation $L$, with $\dim(L) \ll \dim(S(\mathbf{Q}))$. The optimization of the Hamiltonian parameters is then performed in the latent space of the autoencoder by comparing $L_{\mathrm{sim}}$ and $L_{\mathrm{exp}}$. This example demonstrates a generic principle of how machine learning can aid inelastic scattering to probe magnetic orderings and excitations. In particular, if the forward problem of calculating the dynamical structure factor from some parameterized Hamiltonian becomes feasible, for example using linear spin-wave theory,196 we expect that similar machine learning models will have huge potential to study magnetic excitations with experimental data in strongly correlated systems.
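The latent-space fitting loop can be reduced to a few lines; in the sketch below, the encoder, forward model, and “measured” structure factor are all toy stand-ins (the actual workflow trains the autoencoder on simulated $S(\mathbf{Q})$ maps and uses a Monte Carlo forward model):

```python
import numpy as np
from scipy.optimize import minimize

def fit_hamiltonian(p0, forward_model, encoder, s_exp):
    """Schematic latent-space fit: find Hamiltonian parameters p whose simulated
    structure factor is closest to the measurement *after* both are compressed
    by a trained encoder, sidestepping pixel-level experimental artifacts."""
    l_exp = encoder(s_exp)
    cost = lambda p: np.sum((encoder(forward_model(p)) - l_exp) ** 2)
    return minimize(cost, p0, method="Nelder-Mead").x

# Toy usage: the "structure factor" is a 2D map; the "encoder" is a fixed
# random projection standing in for a trained autoencoder's encoder half.
rng = np.random.default_rng(4)
proj = rng.normal(size=(8, 64 * 64))
encoder = lambda s: proj @ s.ravel()
forward_model = lambda p: np.outer(np.sin(p[0] * np.arange(64)),
                                   np.cos(p[1] * np.arange(64)))
s_exp = forward_model(np.array([0.11, 0.23]))   # pretend measurement
print(fit_hamiltonian(np.array([0.1, 0.2]), forward_model, encoder, s_exp))
```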
IV. EXPERIMENTAL INFRASTRUCTURE AND DATA
A. Instrument and beam
Thus far, the discussion has focused on using machine learning–augmented elastic and inelastic scattering and spectroscopies to better elucidate materials properties. Given the central role of beamline infrastructure in a successful scattering experiment, machine learning has also been applied to optimize instrument operation,197–200 such as accurately characterizing x-ray pulse properties from a free-electron laser.201 Li et al. achieved dynamic aperture optimization using machine learning for the storage ring at the National Synchrotron Light Source II (NSLS-II) at Brookhaven National Laboratory (BNL).197 Dynamic aperture optimization aims to tune the configuration of the sextupole magnets to increase the ultra-relativistic electron lifetime in the storage ring. It is a multi-objective optimization problem with multiple objective functions $f_i(\mathbf{x})$, $i = 1, \dots, m$, to minimize within the parameter space $\mathcal{X}$, which can be solved by a conventional multi-objective genetic algorithm (MOGA) with further augmentation by machine learning. The direct tracking of a large number of particles forms a so-called “population” in the parameter space $\mathcal{X}$; Fig. 11(a) depicts example populations in a generic 2D parameter space. Using k-means clustering, the populations are classified into different clusters, as shown in step 1 of Fig. 11(a). These clusters are further assigned a quality label [Fig. 11(a), step 2] by evaluating a fitness function, defined as a weighted average of the objective functions $f_i$, where the best or “elite” label corresponds to those populations that optimize the largest number of objective functions (and have the longest electron lifetime in the storage ring). Finally, some proportion of candidates among the entire generation are replaced with potentially more competitive candidates repopulated within the range of the elite cluster [Fig. 11(a), step 3]. The replacement proportion in each intervention can further be dynamically adjusted, or the intervention skipped, based on a discrepancy score that compares the actual fitness value to that predicted by a k-nearest neighbor regression model. Here, the use of machine learning accelerates the convergence toward optimized parameters [Fig. 11(b)] and increases the number of high-quality elite candidates, reaching longer-term electron beam stability in the storage ring.
In a different example, Leemann et al. applied machine learning to stabilize the synchrotron source size based on prior instrumental conditions at the Advanced Light Source (ALS) at Lawrence Berkeley National Laboratory (LBL).198 The electron beam size can vary [Fig. 11(c), top] due to changing insertion device gaps [Fig. 11(c), bottom]. By constructing a neural network–based supervised learning model using the insertion device gap or phase configurations as input and the beam size as output, the authors achieved improved performance over simple regression models in accurately predicting the resulting beam sizes [Fig. 11(d)]. It is worth mentioning that the chosen fully connected artificial neural network contains three hidden layers and many more parameters than the polynomial baselines, which allows it to approximate more complex functions and thus contributes to its superior performance over polynomial regression models.
B. Data collection and processing
Machine learning can also facilitate the collection and processing of scattering data. Here, by “processing” we mean procedures like data refinement, denoising, automatic information–background segmentation, etc., but not the extraction of further materials information. Given how precious beamline resources are, the goal of machine learning in this context is to extract the essential information with reduced beamtime. For diffractometry, one typical problem is diffraction peak–background segmentation, which usually requires that diffraction spots be collected with fine resolution. Sullivan et al. applied a deep U-Net to extract the shape of the Bragg peaks from time-of-flight neutron diffraction87 and x-ray diffraction data, which enables more reliable peak area integration.89 Training data are augmented by applying additional operations to the measured peaks [Fig. 12(a)].
In another example, Ke et al. applied a CNN to identify diffraction spots from noisy data taken with an x-ray free-electron laser.88
For small-angle scattering, given the rapid drop of intensity at high $Q$ [for a 3D object, $I(Q) \sim Q^{-4}$] and limited beamtime resources, a typical problem is the optimization of the data collection strategy in the high-$Q$ regime. Asahara et al. applied Gaussian mixture modeling to predict longer-time SANS spectra by employing a prior from B-spline regression. The proposed B-spline Gaussian mixture model (BSGMM) outperforms conventional kernel density estimation (KDE) algorithms202 [Fig. 12(b)] and shortens the SANS experiment by a factor of five.
Measurements can also be accelerated by reducing the number of necessary sampling points in a given parameter space with guidance from machine learning. Kanazawa et al. proposed a workflow that optimizes automatic sequential Q-sampling, which suggests the next Q-point based on uncertainties estimated from previously measured data203 [Fig. 12(c)]. Noack and colleagues204,205 have used kriging,206,207 a Gaussian process regression method, to design an experimental sampling strategy in a spatially resolved SAXS measurement of block copolymer thin films. While a complete set of SAXS measurements is traditionally sampled using a regular grid, the authors showed that the use of kriging and its variants requires only a fraction of the sampled spatial coordinates to arrive at a reconstruction with comparable detail to that produced by the grid scan. Their closed-loop approach highlights the potential for experimental automation to improve the efficiency in data acquisition and to maximize the information gathered from fragile samples.
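The uncertainty-guided sampling loop at the heart of such autonomous experiments can be sketched in a few lines with a Gaussian process; here a one-dimensional toy “measurement” replaces the beamline, and maximum predictive standard deviation serves as the simplest possible acquisition rule (real implementations such as those of Noack et al. are considerably more sophisticated):

```python
import numpy as np
from sklearn.gaussian_process import GaussianProcessRegressor
from sklearn.gaussian_process.kernels import RBF

def hidden_signal(x):                 # stand-in for a spatially varying measurement
    return np.sin(3 * x) + 0.5 * np.sin(7 * x)

candidates = np.linspace(0, 3, 300).reshape(-1, 1)
X, y = [[0.0], [3.0]], [hidden_signal(0.0), hidden_signal(3.0)]  # two seed points

gp = GaussianProcessRegressor(kernel=RBF(length_scale=0.5))
for _ in range(15):                   # autonomous-loop iterations
    gp.fit(np.array(X), np.array(y))
    _, std = gp.predict(candidates, return_std=True)
    x_next = candidates[np.argmax(std)]    # measure where the GP is least certain
    X.append([x_next[0]])
    y.append(hidden_signal(x_next[0]))

print(f"sampled {len(X)} points; max predictive std = {std.max():.3f}")
```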
Chang et al. addressed a similar challenge by applying a CNN to SANS spectral data to reach super-resolution.113 Even for anisotropic scattering, the CNN-based super-resolution reconstruction has better agreement with the ground truth than the conventional bicubic algorithm [Fig. 12(d)].
Finally, machine learning can also be applied in problems that improve other data collection processes, such as calibrating the rotation axis for x-ray tomography,208 improving the phase-contrast–spatial resolution contradiction in phase-contrast imaging,209 optimizing data segmentation in transmission x-ray microscopy (TXM),210 enhancing visualization of neutron scattering data,211 and achieving super-resolution in x-ray tomography.212
V. OUTLOOK
A. Machine learning on time-resolved spectroscopies
A wide variety of machine learning models are available to study the dynamics of physical systems, for example, recurrent neural networks (RNNs) and RNN-based architectures. These architectures can be used for metamodeling of structural dynamics,213 inferring the quantum evolution of superconducting qubits,214 and modeling quantum many-body systems on large lattices.215 RNN-based models have also been applied to study nonlinear tomographic absorption spectra216 and optical spectra of optoelectronic polymers.217 However, applications of RNNs to time-resolved neutron or x-ray scattering experiments are still scarce. In the context of scattering measurements, additional challenges exist given that physical processes are represented by neutron or photon counts on detector arrays and experience noise and loss of phase information. Fortunately, neural networks are effective at denoising,218 solving phase retrieval problems,219,220 and handling missing information in time series data.221 Thus, RNN-based models can serve as promising techniques to extract deeper insight from time-resolved neutron and x-ray spectra.
Neural ordinary differential equations (ODEs) are an alternative framework that can be used to learn from time series data.222 As this framework can be intimately related to physical models, it is able to extrapolate well even with limited training data and has already found applications in modeling quantum phenomena.223,224 As a result, it is interesting to consider how such physics-informed neural networks can perform in the context of neutron and x-ray scattering problems. Another approach for learning complex nonlinear dynamics is through deep Koopman operators.225,226 In this technique, an autoencoder-like structure is developed to connect observed states with intrinsic states, represented by learned Koopman coordinates, which evolve within the latent space according to the learned system dynamics. Such an architecture can be analogously mapped to physical observables, such as scattering data, and the intrinsic quantum states of the measured samples and can thus serve as another promising approach to interpret time-resolved scattering data.
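A minimal deep Koopman sketch, with invented dimensions and no claim to match any published architecture, could pair an autoencoder with a single learned linear operator acting in the latent space:

```python
# Minimal sketch of a deep Koopman autoencoder: the encoder maps observed
# states to latent Koopman coordinates, one learned linear operator
# advances them a time step, and the decoder maps back. All sizes are
# illustrative assumptions.
import torch
import torch.nn as nn

class DeepKoopman(nn.Module):
    def __init__(self, n_obs: int = 64, n_latent: int = 8):
        super().__init__()
        self.encoder = nn.Sequential(nn.Linear(n_obs, 32), nn.Tanh(),
                                     nn.Linear(32, n_latent))
        self.K = nn.Linear(n_latent, n_latent, bias=False)  # Koopman operator
        self.decoder = nn.Sequential(nn.Linear(n_latent, 32), nn.Tanh(),
                                     nn.Linear(32, n_obs))

    def forward(self, x_t):
        z_t = self.encoder(x_t)
        z_next = self.K(z_t)         # linear evolution in the latent space
        return self.decoder(z_next), z_t, z_next

model = DeepKoopman()
x_t, x_next = torch.randn(16, 64), torch.randn(16, 64)  # consecutive states
x_pred, z_t, z_next = model(x_t)
# Training typically combines a prediction loss with a latent-consistency
# loss so that the linear operator truly governs the latent dynamics.
loss = nn.functional.mse_loss(x_pred, x_next) \
     + nn.functional.mse_loss(z_next, model.encoder(x_next))
loss.backward()
```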
B. Leveraging information in real and reciprocal spaces
Frameworks that employ the principles of symmetry and Fourier transforms could efficiently learn models of complex physical systems and effectively represent scattering data in either real space, reciprocal space, or both. Symmetry and Fourier transforms are two of the most valuable and commonly used computational tools for tackling complex physics problems. These tools encode much of the domain knowledge we have about arbitrary scientific data in 3D space. First, the properties of physical systems (geometry and geometric tensor fields) transform predictably under rotations, translations, and inversions (3D Euclidean symmetry). Second, while physical systems can be described with equal accuracy in both real (position) space and reciprocal (momentum) space, some patterns and operations (e.g., convolutions, derivatives) are much simpler to identify or evaluate in one space than another. The beauty of symmetry and Fourier transforms is that they make no assumptions about the incoming data (only that they exist in 3D Euclidean space); this generality is also an opportunity for improvement. The strength of machine learning is the ability to build efficient algorithms by leveraging the context contained in a given dataset to forgo expensive computation.
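The computational payoff of moving between spaces is captured by the convolution theorem: a convolution in real space becomes a pointwise product in reciprocal space. The short numerical check below verifies this for a circular convolution; the arrays are arbitrary stand-ins for, say, a density profile and an instrumental broadening kernel.

```python
# Numerical check of the convolution theorem, the reason some operations
# are far cheaper in reciprocal space than in real space.
import numpy as np

rng = np.random.default_rng(1)
f = rng.random(256)   # e.g., a real-space density profile
g = rng.random(256)   # e.g., an instrumental broadening kernel

# Circular convolution evaluated directly in real space...
direct = np.array([np.sum(f * np.roll(g[::-1], k + 1)) for k in range(256)])

# ...and as a pointwise product in reciprocal space.
via_fft = np.fft.ifft(np.fft.fft(f) * np.fft.fft(g)).real

print(np.allclose(direct, via_fft))   # True
```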
A constant theme in scattering experiments is the acquisition of data in reciprocal space to inform something traditionally represented in real space. While there are models that can operate on these domains separately, it would be a valuable and natural direction to extend these methods to simultaneously operate and exchange information in both spaces. This exchange would also allow the user to input and output data in whichever space is more convenient and intuitive, and can directly support methods like diffraction imaging, which contain information in both spaces.
Using learnable context in combination with the fundamental principles of symmetry and Fourier transforms could help to alleviate some of the primary challenges associated with scattering experiments: missing phase information and sampling. Additionally, frameworks that can simultaneously compute in and exchange information between real and reciprocal space could naturally predict quasiparticle band structures from real space coordinates and express charge densities in terms of commonly used plane wave basis sets.
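For context, the classical baseline that learned approaches aim to improve upon is an alternating-projection loop such as error reduction. The sketch below recovers a synthetic, non-negative object from its Fourier magnitudes and a known support; everything here is simulated, and real reconstructions use more robust variants (e.g., hybrid input–output) or learned priors.

```python
# Minimal sketch of classical error-reduction (Gerchberg-Saxton-type)
# phase retrieval from Fourier magnitudes plus a support constraint.
import numpy as np

rng = np.random.default_rng(2)
obj = np.zeros((64, 64))
obj[24:40, 24:40] = rng.random((16, 16))   # object inside a known support
support = obj > 0
magnitude = np.abs(np.fft.fft2(obj))       # "measured" diffraction amplitudes

x = rng.random((64, 64))                   # random initial guess
for _ in range(500):
    F = np.fft.fft2(x)
    F = magnitude * np.exp(1j * np.angle(F))  # impose measured magnitudes
    x = np.fft.ifft2(F).real
    x[~support] = 0.0                          # impose the support constraint
    x[x < 0] = 0.0                             # impose non-negativity

err = np.linalg.norm(np.abs(np.fft.fft2(x)) - magnitude) / np.linalg.norm(magnitude)
print(f"relative Fourier-magnitude error: {err:.3e}")
```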
C. Multimodal machine learning
Materials characterization often requires insight from multiple experimental techniques with sensitivity to different types of excitations in order to gain a complete understanding of a material's properties and behaviors. Data acquired using different neutron and x-ray scattering techniques are often complementary but are typically synthesized manually by researchers. In this regard, machine learning may provide an important avenue toward intelligent analysis across multiple modalities. Multimodal machine learning227–230 has already been explored for a range of versatile applications, including activity and context detection;231,232 recognition of objects,233 images,234 and emotions;235 and improvement of certain medical diagnostics.236,237 By consolidating information from multiple, complementary sources, multimodal machine learning models have the potential to make more robust predictions and discover more sophisticated relationships among data. At the same time, this approach introduces new prerequisites compared to learning from single modalities. The taxonomy by Baltrušaitis et al. considers five principal challenges of multimodal machine learning:229 (1) representation of heterogeneous data; (2) translation, or mapping, of data from one modality to another; (3) alignment between elements of two or more different modalities; (4) fusion of information to perform a prediction; and (5) co-learning, which considers how knowledge gained by learning from one modality can assist a model trained on a different modality whose resources may be more limited. These are likewise important considerations for the application of multimodal machine learning in the context of neutron and x-ray data analysis: Different experimental techniques access widely different energy, time, length, and momentum scales; produce diverse data structures; and carry varying levels of uncertainty. Additionally, developing the data infrastructure to aggregate measurements from multiple instruments would be an important undertaking for neutron and x-ray facilities as a whole. Nonetheless, intelligent synthesis of multiple experimental signatures appears to be a promising direction to better extract insights from data and possibly accelerate materials design and discovery.
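As one deliberately simplistic example of challenge (4), fusion, the sketch below concatenates features from two modality-specific encoders, say a 2D diffraction pattern and a 1D absorption spectrum, ahead of a shared prediction head. Every dimension and the regression target are hypothetical.

```python
# Minimal sketch of feature-level fusion of two measurement modalities
# into a single materials-property prediction. All sizes are invented.
import torch
import torch.nn as nn

class FusionModel(nn.Module):
    def __init__(self):
        super().__init__()
        self.enc_image = nn.Sequential(      # 2D pattern -> feature vector
            nn.Conv2d(1, 8, 3, padding=1), nn.ReLU(),
            nn.AdaptiveAvgPool2d(4), nn.Flatten(), nn.Linear(8 * 16, 32))
        self.enc_spectrum = nn.Sequential(   # 1D spectrum -> feature vector
            nn.Linear(100, 64), nn.ReLU(), nn.Linear(64, 32))
        self.head = nn.Linear(64, 1)         # fused features -> property

    def forward(self, image, spectrum):
        feats = torch.cat([self.enc_image(image),
                           self.enc_spectrum(spectrum)], dim=-1)
        return self.head(feats)

model = FusionModel()
pred = model(torch.rand(4, 1, 32, 32), torch.rand(4, 100))
print(pred.shape)   # torch.Size([4, 1])
```

Real multimodal pipelines must additionally handle the representation, alignment, and uncertainty issues listed above; simple concatenation is only the starting point.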
D. High-performance computing for quantum materials
Increasingly, studies of functional materials underscore emergent quantum phenomena that arise from entanglement. These quantum phenomena, such as quantum spin liquids, unconventional superconductivity, and many-body localization, are beyond the scope of an exclusively structural description and thus pose a challenge for the reliable acquisition of high-quality training data for machine learning. Even so, the associated correlations in these materials are encoded in their inelastic scattering signatures, which motivates corresponding theoretical descriptions. Due to quantum entanglement, semiclassical theories such as mean-field theory, linear spin-wave theory, and even DFT become insufficient because they lack static or dynamic electron correlations. Thus, machine learning and cross-validation of spectroscopies associated with these materials require sophisticated computational methods. Beyond supplying high-throughput datasets for machine learning, these numerical calculations can themselves benefit from machine learning, which recent studies have demonstrated improves their efficiency and accuracy.238,239
To sufficiently include quantum entanglement in spectral calculations, two promising routes have been attempted. The first is the correction of DFT by embedding other methods. Beyond the elementary DFT + U correction to the total energy, the GW method allows a self-consistent correction of the Green's function using a screened Coulomb interaction in the random-phase approximation (RPA).240 A more sophisticated correction for strong correlation effects is the DFT + DMFT (dynamical mean-field theory) method.241 By mapping the self-energy onto a single-site impurity problem, DMFT incorporates local high-order correlations into spectral calculations.242 These corrections to DFT enable spectral calculations for materials with substantial quantum entanglement; however, because the corrections are usually biased, the accuracy of the results is not always well controlled. The DFT + DMFT method has been widely used to simulate the single-particle Green's function relevant to photoemission experiments.243 Its numerical complexity increases dramatically when extended to the two-particle or four-particle correlation functions required to evaluate inelastic scattering cross sections. Implemented with the Bethe–Salpeter equation and the Lanczos method, DFT + DMFT has recently been applied to the simulation of neutron scattering and resonant inelastic x-ray scattering (RIXS) spectra,244,245 correctly capturing the multiplet effects and Mott transition in transition-metal materials. In light of these developments, it is, in principle, possible for hidden correlation information to be revealed from spectra with proper training and selection of machine learning architectures.
The second route to include quantum entanglement is the construction of effective low-energy models based on ab initio Wannier orbitals and the evaluation of spectral properties from these highly entangled effective models. Along this route, wavefunction-based methods, including exact diagonalization,246 coupled cluster,247 and the density-matrix renormalization group (DMRG),248 provide exact or asymptotically exact excited-state spectra for arbitrarily strong correlations. Their disadvantage is that the rapid scaling of computational complexity restricts calculations to small systems or low dimensions with a limited number of bands. Another class of model-based unbiased methods is quantum Monte Carlo,249 which is less sensitive to system size but is typically restricted to relatively high temperatures. These methods have been widely used in scattering spectrum calculations for spin liquids250 and unconventional superconductors,251 where spin correlations in a few bands are dominant. With constantly increasing computational power, we expect these techniques to play a more prominent role in future applications of machine learning to quantum materials.
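To make the exact-diagonalization route concrete at toy scale, the sketch below computes a Lorentzian-broadened dynamical spin structure factor S^zz(q, ω) for an eight-site spin-1/2 Heisenberg chain. Realistic materials calculations require vastly larger Hilbert spaces and the specialized solvers cited above; the chain length, coupling, and broadening here are arbitrary.

```python
# Minimal exact-diagonalization sketch: S^zz(q, w) of a small spin-1/2
# Heisenberg chain via the spectral (Lehmann) representation.
import numpy as np

N = 8                                    # chain length (2^8 = 256 states)
sz = np.diag([0.5, -0.5])
sp = np.array([[0.0, 1.0], [0.0, 0.0]])  # S^+
sm = sp.T                                # S^-

def site_op(op, j):
    """Embed a single-site operator at site j into the full Hilbert space."""
    mats = [op if k == j else np.eye(2) for k in range(N)]
    full = mats[0]
    for m in mats[1:]:
        full = np.kron(full, m)
    return full

# Heisenberg Hamiltonian with periodic boundary conditions, J = 1.
H = np.zeros((2**N, 2**N))
for j in range(N):
    k = (j + 1) % N
    H += site_op(sz, j) @ site_op(sz, k)
    H += 0.5 * (site_op(sp, j) @ site_op(sm, k) + site_op(sm, j) @ site_op(sp, k))

E, V = np.linalg.eigh(H)
gs = V[:, 0]                             # ground state

qs = 2 * np.pi * np.arange(N) / N
ws = np.linspace(0.0, 4.0, 200)
eta = 0.05                               # Lorentzian broadening
S = np.zeros((N, ws.size))
for iq, q in enumerate(qs):
    szq = sum(np.exp(1j * q * j) * site_op(sz, j) for j in range(N)) / np.sqrt(N)
    amps = V.conj().T @ (szq @ gs)       # <n| S^z_q |0> for all n
    for w_n, a in zip(E - E[0], amps):
        S[iq] += (np.abs(a) ** 2) * eta / np.pi / ((ws - w_n) ** 2 + eta**2)

print(S.shape)                           # (8, 200) map of S^zz(q, w)
```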
Spectral calculations along either route are computationally expensive and require massively parallel computing techniques. Most methods exhibit good scaling performance in distributed computing, and by restructuring the underlying linear algebra operations, these approaches have been further accelerated on general-purpose graphics processing units (GPGPUs).
E. Conclusion: Fundamental impact on experimental facilities
Beamtime at neutron and x-ray facilities is a limited and expensive scientific resource. This review outlines the remarkable impact that machine learning has had on scattering science in a very short span of time. The work so far offers a tantalizing glimpse of how these new approaches can revolutionize the collection, analysis, and interpretation of data and, in turn, the use of and access to beamline facilities. Perhaps the most fundamental step underway is the use of machine learning to solve the inverse scattering problem [Fig. 2(a)]. The inversion of data to a model or representation is the goal of most experiments, whether to determine a structural solution or to understand the couplings and dynamics in materials. It is a highly time-consuming process that currently demands a great deal of expertise and, as such, is a major bottleneck in extracting meaningful scientific information. Training machine learning models on large-scale simulations provides an approach that finally addresses this bottleneck in a realistic way. A closely related problem is the treatment of experimental backgrounds and artifacts, which can also be addressed by artificial intelligence. Bringing these developments from machine learning into the realm of experiments through workflows and supporting computation is sure to have a fundamental impact throughout the neutron and x-ray scattering community.
There are a series of scattering and spectroscopy problems where artificial intelligence can be expected to make a significant impact in the foreseeable future. Machine learning approaches to SAS (Sec. II B) and XAS (Sec. III A) are well underway, and it can be expected that they will be implemented on beamlines in the near term to aid further automation of experiments and analysis. The problems of diffraction (Sec. II A), on both single-crystal and powder samples, are at an earlier stage. Here, the crucial problem is to improve the automation of finding structure solutions. Hybridizing these methods with more powerful optimization routines that can search the very rugged landscape of the parameter space is likely to produce automated approaches that would be transformative in the delivery of rapid materials understanding. Another aspect of diffraction experiments where the prospects of machine learning are especially promising is diffuse scattering. A great deal of critical information about disorder and defects in materials is contained in the scattering between Bragg peaks. It has long been recognized that accessing this information would be very important, and machine learning is proving to be well suited to this problem, especially in the case of magnetic materials. Large-scale simulations will open up this area for other applications, too. Inelastic scattering data, meanwhile, are particularly hard to visualize due to their four-dimensional nature; experimentalists have trouble identifying not only underlying models but also which real features are present in the data (Sec. III C). Computationally efficient theoretical methods and simulations of instrumental effects can address this challenge and would change both the impact and the time frame of analyses. Finally, many experiments currently depend on high-purity single crystals, which are difficult and laborious to grow. Machine learning is showing that it is feasible to extract models from experimental data on powders instead of single crystals, promising much faster experimental throughput and turnaround in the understanding of materials.
The widespread deployment of machine learning can be expected to have a major impact on redefining the relationship between experimentation and modeling. Large-scale simulations used for training can also be used to explore phase diagrams of materials and identify underlying physical mechanisms. They also provide a powerful basis on which to steer experiments. Mining simulations presents an opportunity to transform data interpretation, potentially giving enhanced significance to the results. Furthermore, experiments are conducted in high-dimensional parameter spaces of sample conditions, orientations, instrumental configurations, and counting times. Autonomous steering of experiments thus promises significant enhancement to experimental practice. Closing the loop between modeling and experimental control not only enables experiments that are otherwise too fast for humans to steer effectively but also allows the collection of information in scenarios that are too complex for conventional decision making.
Artificial intelligence is driving a change in the scale and speed of modeling as machine learning removes computational bottlenecks and provides the means to synthesize and compress simulations and their physical information content. This revolution will allow training and simulation over much wider classes of problems, potentially extending the scope of experiments. Integrating these advances in computational methods together with the needs of experiments is a key step that requires the fields of theory, applied mathematics, and computer science to work more closely with experimental science than ever before. To achieve these promising advances, federated data, curation, autonomous steering, and validated codes as well as new types of experimental groups, remote access, and team formation will need to be part of the future research landscape. The early indications of this change are very promising, and now is the time for researchers to explore machine learning as a powerful new capability that can transform neutron and x-ray scattering, one of the most demanding and data-intensive branches of experimental science.
ACKNOWLEDGMENTS
Z.C., N.A., and M.L. acknowledge support from the United States Department of Energy, Basic Energy Sciences, Award No. DE-SC0020148. N.A. acknowledges support from the National Science Foundation (NSF) Graduate Research Fellowship Program under Grant No. 1122374. Y.W. acknowledges support from NSF Award No. DMR-2038011. R.P.X. and R.E. acknowledge support from BiGmax, the Max Planck Society's Research Network on Big-Data-Driven Materials Science, and the European Research Council (ERC) under the European Union's Horizon 2020 Research and Innovation Program, Grant No. ERC-2015-CoG-982843. D.A.T. was supported by DOE BES User Facilities.
DATA AVAILABILITY
Data sharing is not applicable to this article as no new data were created or analyzed in this study.