Polarized neutron reflectometry is a powerful technique to interrogate the structures of multilayered magnetic materials with depth sensitivity and nanometer resolution. However, fitting reflectometry profiles with traditional methods often involves a complicated objective function landscape, posing a significant challenge for parameter retrieval. In this work, we develop a data-driven framework to recover the sample parameters from polarized neutron reflectometry data with minimal user intervention. We train a variational autoencoder to map reflectometry profiles with moderate experimental noise to an interpretable, low-dimensional space from which sample parameters can be extracted with high resolution. We apply our method to recover the scattering length density profiles of the topological insulator–ferromagnetic insulator heterostructure Bi2Se3/EuS, which exhibits proximity magnetism, and find good agreement with the results of conventional fitting. We further analyze a more challenging reflectometry profile of the topological insulator–antiferromagnet heterostructure (Bi,Sb)2Te3/Cr2O3 and identify possible interfacial proximity magnetism in this material. We anticipate that the framework developed here can be applied to resolve hidden interfacial phenomena in a broad range of layered systems.
Neutron reflectometry facilitates structural characterization of multilayered materials by probing their nuclear and magnetic depth profiles at device-relevant spatial scales, enabling the study of hidden interfaces in a broad range of nanostructured and thin film systems.1–11 Leveraging the interaction of spin-polarized neutrons with magnetic moments, polarized neutron reflectometry (PNR) is particularly well-suited to detecting magnetic interfacial phenomena,12–16 such as the magnetic proximity effect. Proximity coupling to a magnetic material induces magnetic order near the interface of an otherwise non-magnetic system, making it a promising pathway for magnetizing topological insulators (TIs) without introducing magnetic dopants.17–34 This opens up the possibility of realizing emergent phenomena such as the quantum anomalous Hall effect35–39 or axion insulator state40–42 at room temperature and advancing TI-based device applications. More recently, the realization of proximity magnetism in van der Waals heterostructures43–54 presents new opportunities to engineer atomically thin devices with novel functionalities, and at the same time, highlights an increasing need for precise characterization of interfacial effects at subnanometer length scales. Thus, accurate quantitative analysis of magnetic structural information obtained by PNR is critical to resolving important interfacial effects within a broad range of materials systems.
However, subtle interfacial phenomena such as proximity magnetism can be difficult to single out from bulk contributions to the PNR signature. The composition and magnetic profiles of a sample are typically expressed in terms of the nuclear and magnetic scattering length densities (SLD), which can be recovered from reflectometry measurements by fitting the data to a candidate model. This traditionally involves building a theoretical model of the experimental system in terms of structural parameters—density, thickness, interface roughness, and magnetization—of the constituent layers and simulating the associated reflectometry profile using the methods of Parratt recursion55 or the Abelès matrix formalism.56 However, due to information loss about the phase of the reflected neutrons, different SLD profiles can generate highly similar reflectivities, leading to a complicated cost landscape between the theoretical and experimental profiles with potentially many local minima. Thus, parameter refinement often demands expert insight to identify a suitable starting point and adequately constrain the parameter space of the model. Methods to resolve the phase ambiguity through carefully designed experiments57–61 have also been proposed. More generally, additional insights from x-ray diffraction, transmission electron microscopy, and/or bulk magnetometry are required as well as selection of appropriate models62,63 and optimization methods, which can play a critical role in steering the refinement process. For example, many existing neutron and x-ray reflectivity refinement programs (e.g., GenX,64 Refl1D,65 StochFit66) implement stochastic optimization methods such as differential evolution, simulated annealing, or stochastic tunneling to better manage multiple local minima. 
More recently, machine learning-guided fitting approaches have been introduced to both improve and automate parameter retrieval from neutron reflectometry data with promising results.67–71 However, elucidating subtle interfacial phenomena such as magnetic proximity from PNR signatures remains a significant challenge.
In this work, we develop an alternate, data-driven framework to retrieve the sample parameters of candidate proximity-coupled systems from their PNR profiles with minimal user intervention. Using a variational autoencoder (VAE), we map reflectometry profiles simulated from a broad range of candidate physical parameters to a low-dimensional latent space from which the true sample parameters can be readily obtained. The decoded profiles directly inform the suitability of the parameter search space through the reconstruction quality and are robust to moderate perturbation of the input reflectivities emulating experimental noise. Importantly, we find that the latent mapping naturally bypasses the issue of multiple local minima and is both well-organized and visually interpretable in terms of the physical parameters. Thus, the latent representation can further be used to automatically refine the parameter search space for poorly reconstructed profiles. We evaluate our model on its ability to recover the sample parameters of a well-studied TI–ferromagnetic (FM) insulator heterostructure, Bi2Se3/EuS, exhibiting proximity magnetism. Our model predictions are found to be consistent with the results of traditional fitting methods and, at the same time, require no expert insight for parameter initialization or refinement. We conclude by applying our model to analyze a more challenging PNR profile of the TI–antiferromagnet (AFM) heterostructure, (Bi,Sb)2Te3/Cr2O3, which points to possible proximity magnetism at the resolution limit.
BACKGROUND
Polarized neutron reflectometry measures the spin-dependent specular reflection of an incident neutron beam from the surface of a magnetic thin film. The reflectometry profile, R(Q), is a function of the wave vector transfer $Q = 4\pi \sin\theta / \lambda$, where θ is the angle of reflection and λ is the wavelength of the neutron. In the first Born approximation, reflectivities of the neutron spin non-flip channels, $R^{++}$ and $R^{--}$, are related to the nuclear and magnetic SLDs according to13

$$R^{\pm\pm}(Q) \propto \frac{1}{Q^{2}} \left| \int \left[ \rho_N(z) \pm \rho_M(z) \cos\gamma \right] e^{iQz} \, dz \right|^{2},$$

where ρN and ρM are the nuclear and magnetic scattering length densities, respectively; γ is the angle between the magnetization and the neutron polarization; and the superscripts + and − denote the neutron spin up and down states, respectively. The coordinate z measures the depth perpendicular to the sample surface. To develop the VAE-based approach to recover the SLD profiles of candidate proximity-coupled systems, we consider the specific thin film system consisting of a Bi2Se3/EuS heterostructure atop a sapphire (Al2O3) substrate with an amorphous a-Al2O3 capping layer, as shown in Fig. 1(a). Proximity-induced magnetism has been reported at the interface between TI Bi2Se3 and EuS, a ferromagnet with a Curie temperature of approximately 16.6 K.24,72 The reflectometry profile shown in Fig. 1(b) consists of two curves, $R^{++}$ and $R^{--}$, corresponding to the two neutron spin non-flip channels aligned parallel and antiparallel, respectively, to an in-plane external magnetic field of 1 T. Note that $R^{++}$ and $R^{--}$ have been normalized to a maximum value of 1. In a typical parameter refinement program, the theoretical model is fit simultaneously to both spin channels to obtain the SLD profile of the sample [Fig. 1(c)]. However, due to the phase ambiguity and large number of fitting parameters, different SLD profiles can produce excellent fits to the measured data. For instance, the fit obtained in Fig. 1(b) corresponds to the SLD profile shown in Fig. 1(c), which proposes a high interface roughness between the Bi2Se3 and EuS films but no discernible proximity effect. Figure S1 shows three additional fits to the data obtained using the GenX parameter refinement program with different initial populations, which produce mixed results for the relevant parameters.
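As a rough illustration of how a spin-dependent reflectivity curve follows from a layered SLD model, the sketch below implements a bare-bones Parratt recursion in pure Python. It is not the GenX implementation: interface roughness, absorption, instrument resolution, and background are all omitted, and the magnetization is assumed to be collinear with the neutron polarization so that the two non-spin-flip channels see the SLDs ρN + ρM and ρN − ρM. The function names and the layer values in the usage note are hypothetical.

```python
import cmath
import math


def parratt_reflectivity(q, layers, substrate_sld):
    """Specular reflectivity at wave vector transfer q (1/Angstrom).

    layers: list of (thickness in Angstrom, SLD in 1/Angstrom^2), ordered
    from the surface down to the substrate. The ambient medium is vacuum.
    Roughness and absorption are ignored in this sketch.
    """
    k0 = q / 2.0  # perpendicular wave vector in vacuum
    slds = [0.0] + [sld for _, sld in layers] + [substrate_sld]
    thicknesses = [0.0] + [t for t, _ in layers] + [0.0]
    # Perpendicular wave vector in each medium (imaginary below the critical edge).
    k = [cmath.sqrt(k0 ** 2 - 4.0 * math.pi * sld) for sld in slds]

    amplitude = 0j  # nothing returns from inside the semi-infinite substrate
    for j in range(len(slds) - 2, -1, -1):
        fresnel = (k[j] - k[j + 1]) / (k[j] + k[j + 1])
        phase = cmath.exp(2j * k[j + 1] * thicknesses[j + 1])
        amplitude = (fresnel + amplitude * phase) / (1.0 + fresnel * amplitude * phase)
    return abs(amplitude) ** 2


def non_spin_flip_reflectivities(q, layers, substrate_sld):
    """Return (R++, R--) for layers given as (thickness, nuclear SLD, magnetic SLD)."""
    up = [(t, rho_n + rho_m) for t, rho_n, rho_m in layers]
    down = [(t, rho_n - rho_m) for t, rho_n, rho_m in layers]
    return (parratt_reflectivity(q, up, substrate_sld),
            parratt_reflectivity(q, down, substrate_sld))
```

For a single hypothetical magnetic layer, `non_spin_flip_reflectivities(0.05, [(50.0, 2.0e-6, 1.0e-6)], 5.7e-6)` returns two different reflectivities, while setting the magnetic SLD to zero makes the two channels coincide.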
The objective of the data-driven approach is to retrieve the optimal physical parameters of a target sample from its PNR profile under moderate experimental noise and to compute a reliable SLD profile from learned sample parameters with minimal influence from common issues in iterative optimization algorithms, such as sensitivity to parameter initialization and stagnation. We find this approach can further inform the suitability of the entire parameter search space, not just the predicted parameters.
VAE with regression for PNR data analysis. (a) Schematic illustration of the proximity-coupled Bi2Se3/EuS system. A depiction of the spin-polarized neutron reflectometry experiment under an externally applied magnetic field is shown at right. (b) Reflectometry profile of the heterostructure in (a) measured at T = 5 K. Solid lines correspond to a representative fit obtained using the GenX parameter refinement program. Error-bars represent 1 standard deviation. (c) Nuclear, magnetic, and absorption SLD profiles corresponding to the fit in (b). (d) Schematic illustration of the VAE architecture used for PNR parameter retrieval.
FRAMEWORK
Our approach is based on the VAE,73 an unsupervised deep generative model which is trained to recover an input from a low-dimensional encoding by minimizing the associated reconstruction error. The VAE comprises an encoder and decoder network, as shown in Fig. 1(d). The encoder network outputs parameters to a probability density, $q_\theta(z|x)$, parameterized by a set of trainable weights θ, from which the latent features z are sampled given the input x. The decoder network outputs the parameters to the probability distribution of the data, $p_\phi(x|z)$, parameterized by a set of weights ϕ, using the sampled latent features z. In contrast to a simple autoencoder, the VAE assumes that the encoded, or latent, vector elements are drawn from a prior distribution $p(z)$, which is enforced by an additional regularization term in the loss function,

$$\mathcal{L}_{\mathrm{VAE}} = -\mathbb{E}_{q_\theta(z|x)}\left[ \log p_\phi(x|z) \right] + \beta \, D_{KL}\left( q_\theta(z|x) \,\|\, p(z) \right),$$

where DKL denotes the Kullback–Leibler (KL) divergence computed between the returned distribution of the latent vector z and the prior distribution and β is a hyperparameter regulating the degree of entanglement between the learned latent channels.74 The prior distributions are typically modeled as independent unit Gaussians, i.e., $p(z) = \mathcal{N}(0, I)$, and the approximate posterior as a Gaussian with mean and variance estimated by the encoder. Additional details of our specific implementation are provided in the supplementary material. By encoding the input as a distribution rather than as a single point, the VAE compels the latent space to be smooth and continuous with nearby points corresponding to similar reconstructions of the input. Thus, in the context of PNR, the VAE can be considered as a way to map PNR profiles into a well-organized and informative low-dimensional space, as they naturally evolve as a function of a few well-defined structural parameters (Fig. S3). The potential advantages of a VAE-based approach for parameter retrieval are further exemplified through a toy example described in the supplementary material.
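To make the two terms of the VAE loss concrete, the following pure-Python sketch evaluates a β-weighted objective for a single example. It assumes a diagonal Gaussian posterior and a unit Gaussian prior, for which the KL divergence has a closed form, and it uses a mean squared error as a stand-in for the reconstruction term (a Gaussian likelihood with fixed variance, up to constants). The function names are hypothetical; the authors' own implementation is described in their supplementary material.

```python
import math
import random


def kl_to_unit_gaussian(mu, log_var):
    """Closed-form KL( N(mu, diag(exp(log_var))) || N(0, I) )."""
    return 0.5 * sum(m * m + math.exp(lv) - lv - 1.0 for m, lv in zip(mu, log_var))


def sample_latent(mu, log_var, rng):
    """Reparameterization trick: z = mu + sigma * eps with eps ~ N(0, 1)."""
    return [m + math.exp(0.5 * lv) * rng.gauss(0.0, 1.0) for m, lv in zip(mu, log_var)]


def beta_vae_loss(x, x_reconstructed, mu, log_var, beta):
    """Reconstruction error (MSE stand-in) plus beta-weighted KL regularization."""
    mse = sum((a - b) ** 2 for a, b in zip(x, x_reconstructed)) / len(x)
    return mse + beta * kl_to_unit_gaussian(mu, log_var)
```

When the posterior already matches the prior (zero mean, unit variance) the KL term vanishes, so the loss reduces to the reconstruction error alone; increasing `beta` penalizes departures from the prior more strongly.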
Network architecture
Like the conventional fitting programs, the VAE treats the $R^{++}$ and $R^{--}$ channels jointly using a convolutional neural network (CNN) encoder with a combination of one- and two-dimensional kernels [Fig. 1(d)]. The convolutional and pooling layers are followed by a set of fully connected layers operating on the flattened CNN output, returning the predicted means and standard deviations of the normal distributions from which the latent vector z is sampled. When the latent representation is conditioned on the sample parameters, we can interpret each latent channel as effectively returning a distribution over one parameter's value, illustrated schematically in Fig. 1(d). The mean values of the latent distributions are fed to a simple regressor consisting of a single hidden layer and an activation layer that predicts the physical parameter values. A ReLU activation is used to restrict the predicted values to be non-negative in accordance with the physical parameters. At the same time, the sampled vector z is passed to the decoder, which returns the reconstructed $R^{++}$ and $R^{--}$ profiles. All three networks (encoder, decoder, and regressor) are trained end-to-end by minimizing the total loss function,

$$\mathcal{L} = \mathcal{L}_{\mathrm{VAE}} + \lambda \, \mathcal{L}_{\mathrm{reg}}(v, \hat{v}),$$

where $\mathcal{L}_{\mathrm{VAE}}$ is the VAE loss introduced above, $\mathcal{L}_{\mathrm{reg}}$ is the regression loss, v and $\hat{v}$ denote the true and predicted parameter values, respectively, and λ is a hyperparameter weighting the contribution of parameter regression to the total loss. Details regarding the implementation of the loss function and the selected hyperparameters are provided in the supplementary material.
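The regressor and the combined objective can be sketched in a few lines. This is an illustrative reading rather than the authors' code: the regressor is taken to be one linear layer followed by a ReLU, the regression loss is assumed to be a mean squared error (the text does not state its form), and all names and numbers are hypothetical.

```python
def linear(inputs, weights, biases):
    """Dense layer: one output per row of `weights`."""
    return [sum(w * x for w, x in zip(row, inputs)) + b
            for row, b in zip(weights, biases)]


def relu(values):
    """Clamp negative values to zero."""
    return [max(0.0, v) for v in values]


def predict_parameters(latent_means, weights, biases):
    """Map latent means to parameter estimates; the ReLU keeps them non-negative."""
    return relu(linear(latent_means, weights, biases))


def total_loss(vae_loss, v_true, v_pred, lam):
    """VAE loss plus lambda-weighted regression error (MSE assumed here)."""
    regression = sum((t - p) ** 2 for t, p in zip(v_true, v_pred)) / len(v_true)
    return vae_loss + lam * regression
```

Because the regressor reads only the latent means, the regression term shapes the encoder directly, while the decoder is trained through the sampled latent vector.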
Data preparation
To generate the training and development datasets for the neural network model, we used the GenX neutron reflectivity modeling code64 to simulate the PNR profiles of 200 000 candidate systems of the Bi2Se3/EuS heterostructure. For each example, the constituent layers are parameterized by their density, thickness, roughness, and magnetization, which are sampled uniformly at random over a range of experimentally feasible values (Fig. S4). Importantly, these parameter ranges can be quite broad around the set of nominal parameter values and can differ in size for different quantities depending on their level of uncertainty. For example, the parameter ranges for the amorphous capping layer are intentionally broader compared to the TI and FM layer thicknesses that are carefully controlled during growth. Density and magnetization are expressed in terms of formula units, which are compatible with the GenX simulation software; however, the final results are converted to conventional units before plotting. The proximity effect is modeled as a thin interfacial layer between the Bi2Se3 and EuS films with a sampled thickness, roughness, and magnetization, and sharing the density value of the neighboring TI film. The proximity layer magnetization is constrained not to exceed the sampled value of the EuS magnetization for any given example. Additionally, the minimum possible thickness of the proximity layer is set to 2 Å representing a target spatial resolution threshold. Note that the neutron wavelengths for the PNR measurements conducted in this work are on the order of 5 Å. Examples for which the sampled proximity layer thickness falls below the threshold are simulated without an interfacial layer and are designated as non-proximity-coupled. The PNR profiles are simulated over the experimentally accessible Q-range from 0.1 to 1.3 nm−1 and the intensities normalized to a maximum value of 1. 
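The sampling rules described above can be sketched as follows. The parameter names and numerical ranges are placeholders chosen only for illustration (the ranges actually used are given in Fig. S4), and drawing the proximity magnetization uniformly between zero and the sampled EuS magnetization is just one possible way to enforce the stated constraint.

```python
import random

MIN_PROXIMITY_THICKNESS = 2.0  # Angstrom; resolution threshold quoted in the text

# Hypothetical ranges for illustration only; they are NOT the values in Fig. S4.
RANGES = {
    "t_TI": (150.0, 250.0),   # TI thickness (Angstrom)
    "t_FM": (30.0, 70.0),     # FM thickness (Angstrom)
    "m_FM": (0.0, 7.0),       # FM magnetization (arbitrary units)
    "t_prox": (0.0, 20.0),    # proximity layer thickness (Angstrom)
}


def sample_candidate(rng):
    """Draw one candidate system uniformly at random over the ranges above."""
    params = {name: rng.uniform(low, high) for name, (low, high) in RANGES.items()}
    if params["t_prox"] < MIN_PROXIMITY_THICKNESS:
        # Below the threshold: simulate without an interfacial layer.
        params["t_prox"] = 0.0
        params["m_prox"] = 0.0
        params["has_proximity"] = False
    else:
        # Proximity magnetization may not exceed the sampled FM magnetization.
        params["m_prox"] = rng.uniform(0.0, params["m_FM"])
        params["has_proximity"] = True
    return params
```

Calling `sample_candidate(random.Random(0))` repeatedly yields a mixture of proximity-coupled and non-proximity-coupled examples, each of which would then be passed to the reflectivity simulator.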
To simulate experimental noise, the generated PNR profiles are randomly perturbed at each Q point by sampling a Gaussian distribution with standard deviation estimated using the error bars of the corresponding experimental reflectometry profile. Specific details regarding noise estimation, including the selection of the standard deviation and dependence of the results on different noise levels, are provided in the supplementary material. Additionally, the instrument resolution is sampled uniformly at random between 0.001 and 0.01 nm−1, and the background is sampled uniformly at random on a logarithmic scale between $10^{-8}$ and $10^{-4}$. Since the reflectivity spans nearly eight decades, the base-10 logarithm of the profiles is used as an input (output) of the encoder (decoder) to more equitably treat the intensity values. To similarly place the sample parameters on equal footing, the output of the regressor is taken to be the standardized values of the physical parameters. In particular, the regressor is trained to predict the density, thickness, and roughness of each thin film layer as well as the magnetization of the ferromagnetic and proximity layers. The substrate density is also predicted by the regressor, but substrate thickness and roughness are excluded from fitting as the substrate is considered macroscopically thick with a relatively uniform surface roughness (approximately 3 Å for the sapphire substrate used in this system). While the instrument resolution and background are not predicted explicitly by the regressor, variations in the data as a function of these parameters can still be captured by the unregularized latent dimensions. They can, thus, be regarded as underlying degrees of freedom, which may be more complex functions of the latent space.
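The preprocessing steps described above might look like the sketch below. The helper names are hypothetical, and the small positive floor applied before taking the logarithm is an assumption added here to guard against noisy values that become non-positive; the source does not say how such values are handled.

```python
import math
import random


def sample_log_uniform(low, high, rng):
    """Draw a value uniformly on a logarithmic scale between low and high."""
    return 10.0 ** rng.uniform(math.log10(low), math.log10(high))


def add_noise_and_log(reflectivity, sigma, rng, floor=1e-10):
    """Perturb each Q point with Gaussian noise, then take the base-10 logarithm.

    `floor` is an assumed safeguard so the logarithm is always defined.
    """
    noisy = [r + rng.gauss(0.0, s) for r, s in zip(reflectivity, sigma)]
    return [math.log10(max(value, floor)) for value in noisy]


def standardize(column):
    """Shift and scale one parameter column to zero mean and unit variance."""
    mean = sum(column) / len(column)
    std = math.sqrt(sum((v - mean) ** 2 for v in column) / len(column))
    return [(v - mean) / std for v in column]
```

For instance, `sample_log_uniform(1e-8, 1e-4, rng)` mimics the background sampling, and `standardize` would be applied to each physical parameter across the training set before it is used as a regression target.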
The freedom to choose the number of output quantities, even as the training data reflect variations in the full set of sample and instrument parameters, is one advantage of the machine learning-based approach: It allows one to output only the most relevant quantities, reducing the needed training data volume and neural network size. Additionally, machine learning makes it possible to seek hidden relationships between the data and parameters that may not be captured in an approximate theoretical model. Finally, the generated data are subdivided into training, validation, and test sets according to a 70%/10%/20% split. The data generation, model implementation, and analysis codes are provided in the linked GitHub repository.75
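A 70%/10%/20% partition of the simulated examples can be produced with a shuffled index split such as the one below. This is a generic sketch, not the authors' code; the function name and the fixed seed are assumptions.

```python
import random


def split_indices(n_examples, fractions=(0.7, 0.1, 0.2), seed=0):
    """Shuffle example indices and split them into train/validation/test lists."""
    indices = list(range(n_examples))
    random.Random(seed).shuffle(indices)
    n_train = int(round(fractions[0] * n_examples))
    n_val = int(round(fractions[1] * n_examples))
    train = indices[:n_train]
    validation = indices[n_train:n_train + n_val]
    test = indices[n_train + n_val:]  # the remainder goes to the test set
    return train, validation, test
```

With 200 000 simulated profiles this gives 140 000 training, 20 000 validation, and 40 000 test examples.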
RESULTS
We evaluate the trained VAE on its ability to recover the sample parameters from the experimental PNR profiles of the Bi2Se3/EuS system. The loss trajectories of the training and validation sets are shown in Fig. S8. We first compare the measured and decoded reflectometry profiles corresponding to four PNR experiments taken at different temperatures between 5 and 300 K in Fig. 2(a). These experimental PNR data are reproduced from Ref. 24. The left panel of Fig. 2(a) shows the reconstructed reflectometry profiles for both spin channels for the measurements at 5 K. The right panel shows the spin asymmetry, $(R^{++} - R^{--})/(R^{++} + R^{--})$, calculated for both the measured and decoded profiles of the four experimental reflectivities at 5, 50, 75, and 300 K. Representative reconstructions of the test dataset in each error quartile are also shown in Fig. S10(a). The four decoded experimental profiles are all found to be inliers of the distribution of reconstruction errors [Fig. S10(b)]. This suggests that the chosen parameter ranges are likely suitable for the data under consideration. Next, using the parameter values predicted by the regressor, we calculate the SLD profiles for the measurements at each temperature [Fig. 2(b)]. Note that the SLD profiles are computed directly using predicted parameter values and are not derived from the reconstructed PNR profiles. The nuclear (NSLD) and absorption (ASLD) scattering length density profiles appear largely consistent for the measurements at different temperatures, suggesting that the predicted values of the temperature-independent parameters, such as the thickness and density of each layer, are physically plausible. However, we do observe a change in the NSLD of the bulk FM layer at 300 K that is worth mentioning.
By examining the underlying parameters generating each SLD profile, we identify that the bulk FM thickness increases slightly with temperature, which appears to coincide with slight reductions in the thickness of the TI and proximity layers and a more significant reduction in the interface roughness. At this stage, the exact origin of this temperature dependence is not well understood. The roughness values predicted for the FM and capping layers do not appear to follow a clear temperature-dependent trend and are likely prone to larger uncertainties than the bulk parameters. One possibility is that higher temperatures contribute to smoothing the buried interfaces, such as those between the TI and FM layers and between the FM and capping layers, which could partially explain the NSLD fluctuation. The magnetic scattering length density (MSLD) profile is maximal at the EuS layer and exhibits a slight shoulder near the TI interface at 5 K, corresponding to the proximity layer. The MSLD magnitude drops progressively as the temperature is increased and disappears at 300 K. These observations can be further traced back to the latent representations of the four experimental examples. In Fig. 2(c), we visualize the latent space by projecting the encoded test dataset along the two dimensions with the largest local gradients for a given parameter value, e.g., the substrate density. Specifically, the horizontal and vertical axes of each subplot correspond to the latent dimensions with the largest and second largest gradient of the target parameter, respectively. The local gradients $\partial v_i / \partial z_j$, where vi denotes the ith parameter and zj denotes the jth latent channel, are estimated using the 32 nearest neighbors of each scattered point. Visualizations of all predicted parameters are given in Fig. S12. The scattered points, each corresponding to one profile of the test dataset, are colored according to the true value of the parameter viewed in each subplot.
We find that the latent space is well-organized according to these parameter values, including the thickness tprox and magnetization mprox of the proximity layer. The latent representations of the four experimental PNR profiles are indicated by the outlined circles in the projection plots and are colored by the corresponding temperature. Notably, for temperature-independent quantities such as the TI density ρTI, the experimental points at different temperatures are generally insensitive to the gradient direction of the underlying parameter value, while for those like the EuS magnetization mFM and mprox, the points at different temperatures follow the gradient direction of the parameter values closely. This corroborates our observations that the trained VAE learns a sensible and interpretable latent representation of PNR profiles from which the physical parameter values may be estimated. The degree of parameter entanglement can be inferred from Fig. 2(d), which shows the magnitudes of the average gradients of each sample parameter with respect to each latent channel. The first 13 latent dimensions are conditioned to vary with the values of the corresponding parameter, while the remaining channels are not explicitly linked to one specific physical parameter, i.e., they are regularized by the conventional standard normal prior distribution. However, we observe that these “free” channels sometimes participate in relating two or more parameters to one another, as observed for tprox, tTI, and tFM in Fig. 2(d).
VAE performance on experimental data. (a) At left, the decoded PNR profile of the Bi2Se3/EuS heterostructure measured at 5 K. For each channel, points represent experimental data and solid lines are the corresponding reconstruction. At right, the spin asymmetry, $(R^{++} - R^{--})/(R^{++} + R^{--})$, calculated from the data (points) and decoded profiles (solid lines) of four PNR measurements taken at different temperatures. Error bars represent 1 standard deviation. (b) Nuclear (NSLD), magnetic (MSLD), and absorption (ASLD) scattering length density profiles obtained from the regressor predictions for the measurements at four temperatures in (a). The MSLD contribution from the proximity layer is shown in dark green. (c) Projections of the latent encoding of the test dataset along the latent dimensions with the largest gradients for different sample parameters (density ρ, thickness t, magnetization m), colored by their true values. Outlined points show the latent encoding of the four experimental measurements from (a). Specifically, the horizontal and vertical axes correspond to the latent dimensions with the largest and second largest gradient of the target parameter, respectively. (d) Parameter entanglement inferred from gradients along each latent channel. Heatmap indicates the relative magnitudes of the gradients of a given parameter with respect to each latent channel. Namely, each cell is colored by the normalized value of $|\partial v_i / \partial z_j|$ for the ith parameter and jth latent channel. Normalization maps gradients in each row to lie between 0 and 1.
Next, we assess the overall regression accuracy of our trained model on the test dataset for each of the sample parameters visualized in Fig. 2(c). In each subplot of Fig. 3(a), the test data points are histogrammed according to the true and predicted values of a given sample parameter. Corresponding plots for the complete set of predicted parameters are shown in Fig. S14(a). We also include the histogram for the parameter mtprox, defined as the product of proximity layer thickness and magnetization, in the last panel of Fig. 3(a). The values of the bulk layer properties appear very well reproduced by the regressor, while tprox and mprox, which exhibit much weaker signatures and tend to be expressed most in the noisier, high-Q region of the PNR profiles, are somewhat underestimated at large values and overestimated for non-proximity-coupled samples. Note that the sharp discontinuity in the tprox histogram corresponds to the resolution threshold of 2 Å imposed on the generated data. To assess the reproducibility of the regression results, we trained ten identical models with different initial weights and collected statistics of the resulting predictions. Figure 3(b) shows the predictions of these models for the values of tprox, mprox, and mFM. We can optimize the trade-off between the true (tpr) and false (fpr) positive rates of correctly classifying proximity magnetism to obtain classification thresholds of thickness and magnetization that best separate the data points between the two classes. These allow us to estimate the resolution threshold of the trained model to correctly distinguish samples with and without proximity magnetism within a certain confidence interval. The method of threshold determination is described in detail in the supplementary material and yields the average classification thresholds of thickness and magnetization across the ten models indicated by the dashed gray lines in Fig. 3(b). 
The thresholds are found at 5.1 Å and 16 emu cm−3 (1 emu = $10^{-3}$ A·m2), corresponding to recalls of 85% for tprox and 80% for mprox for both the positive and negative classes [Figs. S14(b) and S14(c)]. The optimal thickness threshold is slightly higher than the resolution threshold of the generated data but corresponds well to the neutron wavelength of the reflectivity simulation, 4.75 Å. While a small spread in the predicted values across the ten models is observed, the overall trend in the predictions is as expected. Notably, all ten models predict tprox and mprox values above their respective thresholds at 5 K. The values of mprox decay with increasing temperature while tprox remains relatively constant until a slight drop at 300 K. Similarly, mFM drops rapidly above the EuS Curie temperature. We note that the predicted values of mprox at intermediate temperatures are still often non-zero. This may be attributable to strong magnetic fluctuations above the EuS Curie temperature stabilizing a weak proximity effect;47,76 if such weak proximity magnetism persists at high temperatures, it must lie below the resolution threshold of our current model. A tailored network trained on a narrow range of parameters can potentially be devised to clarify even weaker signatures of proximity magnetism that may be expected at higher temperatures; however, the current model is highly suitable for surveying the evolution of proximity magnetism over a broad experimental parameter space, such as a wide temperature range. The predicted values obtained across the ten models for the remaining sample parameters, which are expected to be temperature-independent, are plotted in Fig. S16. Here, we observe similar consistency among most parameters with the exception of fluctuations in the roughnesses and FM layer thickness noted in the previous discussion.
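The idea of choosing a classification threshold by trading off true and false positive rates can be illustrated with the sketch below, which scans candidate thresholds and keeps the one maximizing Youden's J statistic (tpr − fpr). This criterion is a common choice assumed here for illustration; the authors' actual procedure is described in their supplementary material and may differ. The function name and the toy values in the usage note are hypothetical.

```python
def best_threshold(predicted, is_positive):
    """Return (threshold, tpr, fpr) maximizing Youden's J = tpr - fpr.

    A sample is classified as proximity-coupled when its predicted value
    is greater than or equal to the threshold. Both classes must be present.
    """
    n_pos = sum(1 for y in is_positive if y)
    n_neg = len(is_positive) - n_pos
    best = None
    for threshold in sorted(set(predicted)):
        tp = sum(1 for p, y in zip(predicted, is_positive) if y and p >= threshold)
        fp = sum(1 for p, y in zip(predicted, is_positive) if not y and p >= threshold)
        tpr, fpr = tp / n_pos, fp / n_neg
        # Keep the first threshold that strictly improves tpr - fpr.
        if best is None or tpr - fpr > best[1] - best[2]:
            best = (threshold, tpr, fpr)
    return best
```

For a toy set of predicted thicknesses `[0.5, 1.0, 1.5, 4.0, 5.0, 6.0]` in which only the last three examples are proximity-coupled, the function returns a threshold of 4.0 with a true positive rate of 1 and a false positive rate of 0.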
Regressor performance and predictions on experimental data. (a) Histograms of predicted vs true values for different sample parameters of the test dataset. (b) The predictions of proximity layer thickness tprox and magnetization mprox, and FM layer magnetization mFM obtained from ten instances of the VAE trained with different initial weights, shown as a function of the measurement temperature of the corresponding experiment. Gray dashed lines indicate the optimal thresholds obtained for proximity classification. Scattered points above (below) the determined threshold are colored yellow (blue). Violin plots of the predicted values for experiments with majority predictions above (below) the threshold are shaded yellow (blue).
Resolving interfacial antiferromagnetic coupling
Finally, we apply our approach to elucidate proximity magnetism from a more challenging PNR profile of intrinsic TI (Bi,Sb)2Te3 interfaced with AFM Cr2O3, shown schematically in Fig. 4(a). Bulk Cr2O3 is a well-known antiferromagnetic insulator with a Néel temperature of 307 K. At the interface between a TI and AFM, magnetic atoms on the AFM surface have been shown to induce interfacial ferromagnetic order in the TI, which can survive at much higher temperatures than that produced by doping or interfacing with a FM film, owing to the typically higher Néel temperatures.25,28,29,33,77,78 However, magnetic proximity coupling between an AFM and TI is comparatively weaker and thereby more challenging to isolate experimentally. Figure 4(b) shows the experimental PNR profile of the intrinsic (Bi,Sb)2Te3/Cr2O3 system measured at the magnetism reflectometer at the Spallation Neutron Source (SNS) at Oak Ridge National Laboratory.79 Data were collected at a temperature of 5 K with an in-plane external magnetic field of 1 T. The TI has a nominal composition of (Bi0.2Sb0.8)2Te3. Additional experimental details are described in the Methods section. Subtle evidence of spin splitting is observed in the associated plot of the spin asymmetry, $(R^{++} - R^{--})/(R^{++} + R^{--})$, shown in Fig. 4(c). The experimental data in Figs. 4(b) and 4(c) are superimposed with the corresponding best fit obtained using the GenX parameter refinement program.64 However, a major challenge encountered during conventional fitting is that repeated refinement with different initial populations fails to reproducibly predict the proximity effect, as the weak spin splitting observed in the PNR profiles could be attributed to either a small net magnetization at the AFM surface or to proximity magnetism in the interfacial TI layer. To address this challenge, we train our VAE on a set of synthetic PNR profiles of the heterostructure shown in Fig. 4(a).
Similar to the case of Bi2Se3/EuS, we simulate the PNR profiles of 200 000 candidate systems parameterized by the density, thickness, and roughness of each layer, which include the TI, AFM, sapphire substrate, Te capping layer, and a possible TeO2 surface film. In a subset of these examples, we model the presence of an interfacial FM layer on either the AFM or TI surface, or both, parameterized by thickness, roughness, and magnetization, and sharing the density value of the corresponding bulk layer. Note that the predefined range for the magnetization of the interfacial layers is equivalent in terms of formula units but inequivalent in units of emu cm−3 due to the very different densities of the TI and AFM materials (Fig. S5). The theoretical thickness resolution is likewise set to 2 Å for both possible interfacial layers. PNR profiles are simulated over the experimentally measured Q-range from 0.1 to 1.72 nm−1 and normalized to a maximum value of 1. The instrument background is sampled uniformly at random between 10−8 and 10−4 on a logarithmic scale. The remaining data preparation steps are conducted as for the Bi2Se3/EuS example. The reconstructed PNR profile, shown in Fig. 4(e), matches the experimental data closely and is in the top reconstruction error quartile (Fig. S11). The corresponding SLD profile obtained using the predicted sample parameters is shown in Fig. 4(d), where an interfacial FM layer is evidenced by the peak in the MSLD profile at the TI surface. We also plot the spin asymmetry in Fig. 4(f), which displays a weak non-zero signature near 0.5 nm−1. Most significantly, the trained VAE yields a latent representation for the test dataset and experimental example as shown in Fig. 4(g). The points in each subplot are colored according to the true values of the thickness or magnetization of the interfacial FM layer on either the AFM surface, denoted tiAFM and miAFM, or the TI surface, denoted tprox and mprox, respectively.
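The normalization and background sampling described above can be sketched as follows. The Q-range and the background bounds are those quoted in the text. Everything else is an assumption made for illustration: the background is treated as an additive constant applied after normalization, the helper names are hypothetical, and the Q^-4 curve is a placeholder rather than a reflectivity simulation.

```python
import math
import random

Q_MIN, Q_MAX = 0.1, 1.72        # nm^-1, range quoted in the text
BKG_MIN, BKG_MAX = 1e-8, 1e-4   # background bounds quoted in the text


def q_grid(n_points):
    """Evenly spaced Q values covering the measured range (spacing is assumed)."""
    step = (Q_MAX - Q_MIN) / (n_points - 1)
    return [Q_MIN + i * step for i in range(n_points)]


def sample_background(rng):
    """Draw a background level uniformly in log10 space between the bounds."""
    exponent = rng.uniform(math.log10(BKG_MIN), math.log10(BKG_MAX))
    return 10.0 ** exponent


def prepare_profile(reflectivity, background):
    """Normalize to a maximum of 1, then add a constant background (assumed order)."""
    peak = max(reflectivity)
    return [r / peak + background for r in reflectivity]


rng = random.Random(0)
q = q_grid(5)
toy = [q[0] ** 4 / qi ** 4 for qi in q]   # placeholder decay, not a PNR simulation
bkg = sample_background(rng)
profile = prepare_profile(toy, bkg)
```

Sampling the exponent rather than the value itself gives each decade between 10−8 and 10−4 equal weight, so low-background examples are not crowded out by high-background ones.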
The remaining sample parameters predicted by the model are plotted in Fig. S13(a). We see that the experimental PNR profile is mapped unambiguously to a region with tiAFM ≈ 0 and miAFM ≈ 0, while tprox and mprox are predicted to be approximately 8.4 Å and 12 emu cm−3, respectively. The regression accuracy for each predicted parameter is plotted in Fig. S15(a). To test the robustness of these predictions, we train another ten identical models with different initial weights. Figures 4(h) and 4(i) show the predictions of the ten models for the values of tiAFM and miAFM and tprox and mprox, respectively. The gray dashed lines delimiting proximity-coupled examples are similarly obtained by optimizing the trade-off between the true and false positive rates of correctly classifying proximity magnetism in the validation set for each model. The average threshold values for the proximity layer are found at 4.4 Å and 8.2 emu cm−3, corresponding to recalls of 76% and 72% in both classes, respectively [Figs. S15(b) and S15(c)]. Since the values of miAFM are roughly three times as large as those of mprox in conventional units, the resolution threshold for miAFM is computed separately following the same procedure and found to be 20.6 emu cm−3. Using these thresholds, we find that only half of the models predict that proximity magnetism is present when considering the predicted mprox values, though tprox is well-resolved by most models; however, almost all models unambiguously predict tiAFM and miAFM well below their respective thresholds. Thus, although the VAE approach pushes the boundary for resolving subtle magnetic signatures, it is possible that a very weak proximity effect is present at or slightly below the resolution threshold achievable with the current model.
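The thresholding and ensemble comparison described above can be illustrated with a short sketch. The exact criterion used for the trade-off is given in the supplementary material; here it is assumed, for illustration only, to be the maximum of the difference between the true and false positive rates (Youden's J statistic). The function names and all numerical values below are hypothetical.

```python
def best_threshold(predicted, is_positive):
    """Return (threshold, J) maximizing J = TPR - FPR over candidate cuts.

    An example is classified positive when its predicted value is at or
    above the threshold. Ties are resolved in favor of the lowest threshold.
    """
    n_pos = sum(1 for y in is_positive if y)
    n_neg = len(is_positive) - n_pos
    best_t, best_j = None, float("-inf")
    for t in sorted(set(predicted)):
        tp = sum(1 for p, y in zip(predicted, is_positive) if y and p >= t)
        fp = sum(1 for p, y in zip(predicted, is_positive) if not y and p >= t)
        j = tp / n_pos - fp / n_neg
        if j > best_j:
            best_t, best_j = t, j
    return best_t, best_j


def majority_above(predictions, threshold):
    """True if strictly more than half of the ensemble predictions exceed the threshold."""
    n_above = sum(1 for p in predictions if p > threshold)
    return 2 * n_above > len(predictions)


# Hypothetical validation predictions and labels (not data from this work).
validation_pred = [0.0, 1.0, 2.0, 6.0, 7.0, 9.0]
validation_label = [False, False, False, True, True, True]
threshold, j_stat = best_threshold(validation_pred, validation_label)
```

Under this convention, an ensemble in which exactly half of the models fall above the threshold does not count as a majority, which mirrors the ambiguous outcome reported for mprox.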
Nonetheless, the predictions could be used as a valuable screening tool before conducting finer measurements with either longer acquisition times or at higher Q to more clearly resolve the spin splitting, which could potentially benefit experimental planning and optimize the use of scientific user facilities.
Proximity magnetism in the TI-AFM system. (a) Schematic illustration of the (Bi,Sb)2Te3/Cr2O3 system. (b) Experimental PNR profile at 5 K and best fit obtained with the GenX parameter refinement program. An expanded view of splitting between the two channels at high Q is shown in the inset. Error bars represent 1 standard deviation. (c) The spin asymmetry, (R+ − R−)/(R+ + R−), calculated from the measured (points) and best-fit (solid line) R+ and R− profiles shown in (b). (d) Nuclear (NSLD), magnetic (MSLD), and absorption (ASLD) scattering length density profiles obtained from the regressor predictions for the experimental measurement shown in (b). The MSLD contribution from the proximity layer is shown in dark green. (e) Decoded profiles of the PNR measurement in (b). An expanded view of the profiles at high Q is shown in the inset. (f) Spin asymmetry calculated from the measured (points) and predicted (solid line) R+ and R− profiles shown in (b). (g) Projections of the latent encoding of the test dataset along the latent dimensions with the largest gradients for the thickness and magnetization of the interfacial AFM and TI layers. Red points show the latent encoding of the PNR profile in (b). (h) and (i) The predictions of (h) interfacial AFM layer thickness and magnetization, and (i) proximity layer thickness and magnetization obtained from ten instances of the VAE trained with different initial weights. Gray dashed lines indicate the optimal thresholds for proximity classification. Scattered points above (below) the threshold are colored yellow (blue).
DISCUSSION
Machine learning methods are valuable means of uncovering hidden patterns in materials data and elucidating the relationships between structural descriptors and measured quantities. However, it is often desirable to balance the flexibility of “black box” neural networks with a degree of interpretability in terms of physical parameters. We accomplish this in our VAE-based framework by conditioning the latent channels to emulate the behavior of the original sample parameters, which enables direct, visual inspection of encoded profiles in terms of meaningful physical quantities. However, our assessment of parameter entanglement in Fig. 2(d) reveals underlying correlations between sample parameters; for example, proximity layer thickness and magnetization are deeply entwined, since finite proximity magnetization implies finite thickness of the interfacial layer, and vice versa. This suggests that the prior assumption that all latent dimensions are sampled from independent normal distributions does not perfectly describe a latent space that is conditioned to vary directly with certain correlated physical parameters. A possible improvement to the existing approach would be to describe the latent space in terms of several joint distributions of a few strongly correlated parameters, which can be tuned to balance the number of necessary network parameters. We note a few additional considerations for future work. In particular, density fluctuations may be present in certain samples and can require fitting a number of distinct sub-layers of each material. Additionally, a more comprehensive study of the effects of noise on the VAE outcomes would be relevant to determine the effectiveness of such a model to screen candidate systems. These and other specialized features can be readily integrated into the framework presented in this work.
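One way to picture the suggested extension is to replace the independent standard normal prior on a pair of correlated latent dimensions (for example, those conditioned on proximity layer thickness and magnetization) with a bivariate normal prior of correlation ρ. The sketch below is not part of the present framework; it only evaluates the closed-form Kullback-Leibler divergence between a diagonal Gaussian posterior and such a prior, under the assumptions of zero prior mean and unit prior variances. The function name is hypothetical.

```python
import math


def kl_diag_vs_correlated(mu1, mu2, var1, var2, rho):
    """KL( N(mu, diag(var1, var2)) || N(0, [[1, rho], [rho, 1]]) ) in two dimensions.

    Uses the standard closed form for multivariate normals:
    0.5 * [tr(Sp^-1 Sq) + mu^T Sp^-1 mu - k + ln(det Sp / det Sq)], with k = 2.
    """
    if not -1.0 < rho < 1.0:
        raise ValueError("rho must lie strictly between -1 and 1")
    one_minus_rho2 = 1.0 - rho * rho
    trace_term = (var1 + var2) / one_minus_rho2
    maha_term = (mu1 * mu1 - 2.0 * rho * mu1 * mu2 + mu2 * mu2) / one_minus_rho2
    logdet_term = math.log(one_minus_rho2) - math.log(var1 * var2)
    return 0.5 * (trace_term + maha_term - 2.0 + logdet_term)


# With a positively correlated prior, posterior means that increase together
# are penalized less than means that move in opposite directions.
kl_aligned = kl_diag_vs_correlated(1.0, 1.0, 1.0, 1.0, 0.9)
kl_opposed = kl_diag_vs_correlated(1.0, -1.0, 1.0, 1.0, 0.9)
```

Setting ρ = 0 recovers the usual sum of independent per-dimension terms, so a regularizer of this kind would reduce to the standard VAE penalty for uncorrelated parameters.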
Finally, it is important to acknowledge that while the present work aims to reduce reliance on expert insight for PNR parameter retrieval, domain knowledge is still needed to construct a layered representation of the system that considers all relevant features for generating the training data, as well as to interpret the final results. Machine learning-driven discovery in this domain might entail more sophisticated models that not only determine the suitability of a particular hypothesis, such as the presence or absence of proximity magnetism, but also discover the plausible mechanisms underlying a given observation.
CONCLUSION
A quantitative understanding of structural and magnetic information encoded in PNR measurements is often critical to resolving important interfacial phenomena, but experimental factors and lack of adequate fitting constraints can impede parameter retrieval without expert insight. In this work, we construct a data-driven framework for PNR parameter retrieval by training a conditioned VAE to map reflectometry profiles with moderate experimental noise to a well-organized, low-dimensional space from which sample parameters can be readily obtained. We balance the flexibility and interpretability of our model through latent space engineering, enabling in-depth analysis of the resulting predictions. Compared to traditional fitting methods, our framework involves minimal user intervention overall, requiring no expert insight for parameter initialization or refinement, yet is capable of resolving parameter values near the experimental resolution limit. It further enables evaluation of the entire parameter search space by readily identifying outliers of the chosen domain. A possible extension of the framework is suggested to account for intrinsic correlations between conditioning variables. We apply our method to recover the SLD profiles of two proximity-coupled systems at subnanometer resolution, and we envision its potential application to a broader context of elusive phases expressed through weak experimental signatures, such as the axion insulator and topological superconducting phases. We anticipate that the methodology developed in this work can facilitate the development of comprehensive and fully automated analysis routines for PNR parameter retrieval of a broad range of materials systems as well as inform the wide spectrum of spectroscopic analysis workflows requiring parameter refinement.
METHODS
Training details
Training data were generated using a Python implementation of the GenX neutron reflectivity modeling code.64 Simulation of the 200 000 PNR profiles took approximately 48 s using 25 parallel processes on Intel(R) Xeon(R) Gold 5218 processors. Neural network models were implemented in Python using the PyTorch80 libraries and trained on a Quadro RTX 6000 graphics processing unit (GPU) with 24 GB of random access memory (RAM). For the architecture used in this work, training a single epoch took ∼45 s, and each model was trained over 100 epochs. Additional details of the network architecture and final hyperparameters are provided in the supplementary material.
Experimental details
An intrinsic (Bi,Sb)2Te3 film of ∼15 quintuple layers (QL) with nominal composition (Bi0.2Sb0.8)2Te3 was grown by molecular beam epitaxy (MBE) on a thin film of the AFM insulator Cr2O3 with a nominal thickness of 20 nm. The Bi:Sb ratio was optimized to locate the Fermi level near the Dirac node of the electronic surface states. The AFM Cr2O3 film was grown on a sapphire substrate in a pulsed laser deposition chamber with a base pressure of 2 × 10−8 mbar. A ∼10 nm amorphous Te capping layer was deposited on top of the (Bi,Sb)2Te3 film to protect it from degradation. PNR experiments were carried out at the magnetism reflectometer at the Spallation Neutron Source (SNS) at Oak Ridge National Laboratory at a temperature of 5 K with a 1 T in-plane magnetic field.
SUPPLEMENTARY MATERIAL
See the supplementary material for additional details regarding data preparation and noise estimation, the VAE architecture and training history, the complete set of latent space visualizations, and the full set of regressor performance results, including the method of threshold determination.
ACKNOWLEDGMENTS
N.A., Z.C., and M.L. thank C. H. Rycroft for helpful discussions. N.A., Z.C., and M.L. acknowledge the support from the U.S. DOE, Office of Science (SC), Basic Energy Sciences (BES), Award Nos. DE-SC0020148 and DE-SC0021940. N.A. acknowledges the support of the National Science Foundation Graduate Research Fellowship Program under Grant No. 1122374. M.L. is partially supported under DOE Award No. DE-AR0001298 and NSF Award No. DMR-2118448 and by the Norman C. Rasmussen Career Development Chair. This research used resources at the Spallation Neutron Source, a DOE Office of Science User Facility operated by the Oak Ridge National Laboratory. L.-J.Z., Y.-F.Z., and C.-Z.C. acknowledge support from the ARO Young Investigator Program Award (No. W911NF1810198) and the Gordon and Betty Moore Foundation's EPiQS Initiative (GBMF9063 to C.-Z.C.).
AUTHOR DECLARATIONS
Conflict of Interest
The authors declare no conflict of interest.
Author Contributions
N.A. and Z.C. contributed equally to this work.
DATA AVAILABILITY
The data that support the findings of this study are available from the corresponding authors upon reasonable request.