The correlated semiconductor vanadium dioxide (VO2) exhibits an insulator–metal transition (IMT) near room temperature, which is of interest in various device applications. Precise IMT temperature control is crucial to determine the use cases across technologies such as thermochromic windows, actuators for robots or neuronal oscillators. Doping the cation or anion sites can modulate the IMT by several tens of degrees and control hysteresis. However, modeling the effects of control parameters (e.g., doping concentration, type of dopants) is challenging due to complex experimental procedures and limited data, hindering the use of traditional data-driven machine learning approaches. Symbolic regression (SR) can bridge this gap by identifying nonlinear expressions connecting key input parameters to target properties, even with small data sets. In this work, we develop SR models to capture the IMT trends in VO2 influenced by different dopant parameters. Using experimental data from the literature, our study reveals a dual nature of the IMT temperature with varying tungsten (W) doping concentrations. The symbolic model captures data trends and accounts for experimental variability, providing a complementary approach to first-principles calculations. Our feature-driven analysis across a broader class of dopants informs selectivity and provides qualitative insights into tuning phase transition properties valuable for neuromorphic computing and thermochromic windows.

Vanadium dioxide (VO2), well-known for its insulator–metal transition (IMT),1–4 experiences a sharp alteration in both electrical conductivity and optical transmittance near room temperature.3,4 This transition involves shifting from a high-temperature metallic state with a rutile structure to a low-temperature insulating state exhibiting a monoclinic structure.3,4 Among materials considered to exhibit Mott physics as the basis for electronic transitions, VO2 stands out due to its ultrafast transformation dynamics, making it a model candidate for low-power electronics applications and has been explored in various modern electronic devices5,6 and optical switches.7 Understanding the mechanistic origin of the metal-to-insulator transition is crucial for precise control over the transition. Aliovalent element doping is an effective approach to regulate carrier concentration. Alteration of the charge state and local environment, provides a means to engineer and manipulate electronic and structural properties with continuous transition points.8,9 Among various strategies, aliovalent doping with tungsten (W)1,10,11 has been widely studied in the context of suppressing the IMT temperature.

To study the dynamic responses of such systems, one must gain insights into the origins of the intertwined relationship between structural deformation and complex electronic changes associated with the introduction of impurities into the lattice. Over the years, there have been many incumbent theories emphasizing mechanistic insights into the origin of the IMT in VO2 from theoretical and experimental endeavors. Two major dimensions of theory exist in understanding the underlying mechanism: the Peierls transition, with emphasis on structural change associated with the dimerization of the V atoms along the rutile crystal axis and consequently opening of the gap, or the Mott transition4 to an insulating state, where the gap opens due to the strong Coulomb repulsion between the localized V atom 3d orbital and related dynamical effects. However, despite existing contrasting theories,4,10 one primary commonality is the local structural changes due to the insertion of dopants in the lattice and relevant electronic changes associated with the changes in the occupancy of the d orbitals, allowing control over IMT. The incorporation of locally inserted W ions causes pileup of itinerant electrons on the neighboring V-sites, which works as a destabilizing agent.11 

While the complexity associated with experimental procedures limits the scope in these systems, quantum calculation methods, such as density functional theory (DFT), have been an extensively used tool for modeling the behavior theoretically. However, when confronted with strongly correlated systems, such as VO2, the efficacy of existing DFT techniques in accurately predicting electronic structures encounters challenges.12 The crux of the issue lies in the inadequacy of DFT’s exchange-correlation energy functional, which is often approximated as semi-local and static. This sometimes leads to results having large departures from experimental insights, making these models difficult to implement for gaining insight on the general trends, particularly when dealing with impure crystals.13 

Decades of experimental effort have yielded data encoding crucial insights into the thermal control of transitions in these systems. These data are a potential seed for modern artificial intelligence (AI)/machine learning (ML)14,15 algorithms to uncover complex relationships. However, most widely used AI/ML approaches demand much larger data sets and often produce black-box models, complicating the understanding of the independent contributions from different input parameters and, thus, posing interpretability challenges. Additionally, the limited quantity of data achievable from experiments is associated with dopant-driven IMT in VO2 is primarily due to procedural complexities, impeding the development of traditional AI/ML models. Consequently, only a few empirical linear models are used.1 A promising approach for bridging the gap between physical reasoning and data-driven methodologies is symbolic regression16 (SR).17 SR can identify nonlinear analytical expressions linking a target property to key input parameters, even with relatively small data sets. The absence of a predefined functional form in symbolic regression necessitates the use of optimization algorithms differing from conventional analytic or numerical methods. However, these expressions may provide valuable insights into the underlying processes governing the property.

In this work, we develop analytical models that capture the trends in the IMT in VO2 influenced by different dopant control parameters through the exploration of the symbolic search space. Our data set, consisting of experimental data compiled from an extensive literature survey over the past two decades, focuses on W dopant-driven IMT of VO2 at various concentrations. We focused on W as a dopant since it is perhaps the most widely studied impurity in VO2 and has broad relevance to development of thermochromic materials for energy-efficient windows.18 In recent years, similar compounds have gained importance in the design of low-voltage switchable artificial neurons and oscillators for neuromorphic computing.19 We also contrast these studies against other dopants wherever possible from literature data. We observe a dual trend in the data set: one trend aligns with the prediction by the free energy-based analytical model developed by Adler et al.,20,21 while a second trend has not been noted earlier. Drawing inspiration from the Adler model, we develop a generalized analytical expression capturing the trend in IMT against different dopant concentrations in the symbolic space. Our model captures both the trend and variability in the experimental observations and ties them to the fundamentals of semiconductor physics. Additionally, we analyze other dopant candidates and their influence toward IMT in VO2. Our analysis reveals qualitative understanding of the preference of dopants for tuning the properties of these systems.

The symbolic regression workflow utilized in this study is illustrated in Fig. 1(a). It involves four main steps for developing a complete symbolic regression model. Initially, the data set is analyzed to identify existing trends. Data that is extremely noisy or clean based on presumed correlations can adversely affect both the accuracy of the model and the insights gained from the data. Therefore, trends with prominent segregation in features are separated, and a symbolic model is developed for each trend. The symbolic regression section of the workflow employs the PySR package,24 which implements an evolutionary algorithm-based multi-population approach. The main loop of PySR operates on each population independently and utilizes a classic evolutionary algorithm based on tournament selection for individual. The goal is to identify simple analytic expressions for accurate and interpretable models. Additionally, the PySR regression package implements a simulated annealing-based approach to accept or reject mutations, which helps retain diversity and aids in convergence for a given population. During training, multiple symbolic searches are conducted with random initialization. After training, several symbolic models capturing trends in the data with various degrees of complexity are obtained. Appropriate selection of the simplest model (least complexity) based on the physics behind the captured data is crucial (see the model selection section in Methods). If a satisfactory model is obtained, we stop the symbolic search.

FIG. 1.

Symbolic regression for modeling dopant-mediated metal-to-insulator transition in VO2. (a) The workflow for developing a symbolic regression model. (b) Metal-to-insulator transition incorporating the transition from the insulating monoclinic state (M) to the metallic rutile state (R). (c) Experimental data for the IMT temperature for different concentrations of tungsten (W) doping.

FIG. 1.

Symbolic regression for modeling dopant-mediated metal-to-insulator transition in VO2. (a) The workflow for developing a symbolic regression model. (b) Metal-to-insulator transition incorporating the transition from the insulating monoclinic state (M) to the metallic rutile state (R). (c) Experimental data for the IMT temperature for different concentrations of tungsten (W) doping.

Close modal
The model developed by Adler et al.20,21 represents one of the earliest analytical frameworks used to describe semiconductor-to-metal transitions observed in correlated complex oxides such as VO2. Their model hinges on the understanding that the transition stems from distortions within the crystalline structure, resulting in diminished symmetry and the emergence of ion pairings within a one-dimensional crystal lattice. By assuming narrow 3d bands within transition metal oxides, they were able to establish the correlation between the energy gap and the concentration of carriers. The assumption is that the energy gap introduced is directly linked to the degree of structural distortion, even as it becomes substantial. As temperature rises, electrons are stimulated to cross this gap. Consequently, the system’s free energy can be derived to determine the necessary degree of distortion for energy minimization. The free energy of the system as a function of the concentration of free carriers and temperature can be given as20,
(1)
where x is the intrinsic carrier concentration, δ ¯ is associated with the distortion in the electronic band structure, and F ( x , y ) represents the free energy of the semiconducting state, where y is inversely proportional to the temperature (T). This equation can be further extended by considering the additional extrinsic carriers due to donor dopants and the introduction of associated strain energy, leading to a modified equation analogous to Ref. 21 (equation 4.11). We assume δ ¯ = 0.438, suitable for VO2, as within the narrow band limit, it provides a close estimation of bandgap to transition temperature ratio.20 Subsequently, the free energy of the metallic state can be determined as20,21
(2)

Here, x d represents the dopant concentration ( x d 1 ). There exists a point at which the free energy of the metallic phase becomes lower than that of the semiconducting phase, leading to a first-order transition.

A challenge associated with the development of any symbolic model is the presence of redundant expressions. These expressions may capture the data within the limited window to which they are exposed, but they often tend to be overly complex and may miss the underlying physics. To ensure clarity, the first step is to choose the optimal basis set that captures the physics essential for the model. Our basis set selection is, thus, inspired by the Adler model. One challenge with the Adler model is that it implicitly assumes a structural phase transition (SPT) occurs alongside the insulator-to-metal transition (IMT), which may not always be the case. For example, a Mott insulator can undergo a first-order metal–insulator transition (MIT) due to the breakdown of strong electron–electron Coulomb interactions, without an accompanying SPT. In contrast, the Peierls MIT is typically accompanied by an SPT.22 It remains a matter of debate which mechanism dominates in practical scenarios and could be material specific. Additionally, an IMT can occur without structural changes, such as under pressure-induced electronic shifts, particularly due to extrinsic carriers introduced by external dopants.23 Nonetheless, the foundation of the Adler model primarily relies on the relationship between free energy and electrical conductivity, which should hold under varying conditions, as described by the equation for electronic free energy,20 
(3)
The first two terms on the right represent the system’s energy at zero temperature (Ev and Ec, which are the energies of the top of the valence band and the bottom of the conduction band, respectively, n is the concentration of carriers in the conduction band, and N is the total number of carriers), while the remaining two terms account for the entropy contribution to the free energy. Whether the SPT follows or occurs alongside the IMT, the corresponding free energy change and the entropic contribution from extrinsic carrier concentrations significantly impact the total free energy. The entropic contribution follows a logarithmic form, and the free energy change is linked to the IMT. The first two energy terms depend on the mechanism involved, which should, in some way, correlate with the extrinsic carriers. For our application, inspired by this, we limit our basis set to logarithmic and polynomial functions, allowing for insightful interpretation of the symbolic expressions while adhering to Adler’s model. The second part of the challenge involves eliminating redundant models. We employ a two-step process to achieve this. First, the model with the least complexity and lower mean absolute error (MAE) is selected. In PySR, complexity is defined as the number of nodes in an expression tree, regardless of each node's content. During the fitting process, an additional penalty to complexity is added by introducing a “parsimony” (refer to note 1 in the supplementary material) term,24 
(4)

For an expression E, the total loss term is l ( E ), lpred (E) is the prediction loss, and the complexity is C ( E ). With a suitable choice of “parsimony” (see note 1 in the supplementary material), extremely complex equations are avoided during the fitting process itself. After removing equations based on their complexity and MAE error, we select the final candidate based on physical criteria. The model should be monotonously decreasing and exhibit no discontinuities or sharp changes in gradients. Additionally, the model should not have any singular points at lower dopant concentrations.

The data set used for the development of the symbolic model, as depicted in Fig. 1(c), comprises different experimental data points showcasing the change in IMT of VO2 for tungsten (W) doping up to a doping concentration of 5%. A total of 56 points are in the data set collected across a wide range of literature.11,25–35 As can be seen in the data set, there is some variability that we have characterized. However, we carefully chose data sets from thin films of W-doped VO2 in almost all cases to minimize microstructural effects due to bulk or Nano confined crystals. While the thin film deposition methods did vary between references such as sputtering, pulsed laser deposition, etc., we have limited to data from such films. We hope this is a more representative sub-group of experimental data despite differences in the deposition method. A closer look at the data set reveals a dual nature of transition temperature change [Fig. 2(a)]. We additionally validate the trends mathematically (refer to note 2 in the supplementary material). One trend (“trend 1”) depicts relatively low depression in IMT temperature with increasing W concentration [red points in Fig. 2(a)], while the rest of the points indicate a second trend (“trend 2”) with relatively high depression in temperature but noisy in nature. Additional data36–39 include the rate of change (RC) (°C/at. %) and IMT temperature for various dopant species [Figs. S1(a) and S1(b) in the supplementary material].

FIG. 2.

Modeling trends in the IMT with symbolic regression. (a) Two different trends in the transition temperature with doping. (b) Modeling “trend 1” from (a) with symbolic regression and the corresponding Adler model predicting similar values in temperature. (c) Parity plot displaying the predicted temperature against experimental temperature for the symbolic model along with the distribution of points along the x–y axis.

FIG. 2.

Modeling trends in the IMT with symbolic regression. (a) Two different trends in the transition temperature with doping. (b) Modeling “trend 1” from (a) with symbolic regression and the corresponding Adler model predicting similar values in temperature. (c) Parity plot displaying the predicted temperature against experimental temperature for the symbolic model along with the distribution of points along the x–y axis.

Close modal

With the analyzed experimental data, we further assess the predictability of Adler models. As seen in Fig. 2(a), the Adler model agrees relatively well with “trend 1,” indicating a relatively low depression in IMT with increased W doping concentration. However, the model disregards the rest of the points predominantly observed in experiments. The occurrence of predominant trends can be attributed to various factors. For instance, the extent of crystallinity in VO2 differs between monocrystal samples and polycrystalline films and powders,4 leading to different rates of dispersion in IMT. Additionally, the emergence of intermediate phases,40 such as the M2 phase identified owing to doping, can also significantly contribute to the variability in experimental observations. The preparation of the sample, strain in the lattice, and the mechanism for dopant-induced phase change profoundly affect IMT, as previously observed in the VO2 phase diagram.41 The effect of W doping involves more than just the injection of electrons; it also introduces lattice strain, making IMT a complex phenomenon resulting from competitive interactions among electrons, orbitals, and lattices. Although the Adler model initially captures specific and more idealistic trends in the data (similar to single crystal), it exhibits shortcomings in its idealistic assumptions that prevent it from representing more realistic scenarios in IMT trends. The additional complexity associated with the experiments surpasses the scope of the Adler model, which is based on ideal theoretical assumptions. While the Adler model does not account for these additional factors, it may serve as a valuable baseline model, providing a direct link between IMT and semiconductor physics.

This foundational link is utilized by symbolic regression model 1 (SR1) as depicted in Fig. 2(b). The symbolic regression model is based on the chosen basis set inspired by the Adler model,
(5)
where x d represents the dopant concentration and T represents the IMT temperature. The equation is selected from a pool of approximately 500 candidates using the filtering criteria described in the Method section. The cost complexity analysis (see Fig. S2 in the supplementary material), involving the removal of unphysical equations, leads to the selection of this model. With less weight on the logarithmic part compared to the linear and constant parts involving the dopant, the electronic changes associated with IMT depression is less pronounced. The logarithmic factor generally originates from electronic entropic contribution free energy in the Adler model. The model also captures the trends with a mean absolute error (MAE) of 1.9 K [see Fig. 2(c)].
We next developed symbolic models based on “trend 2” of the data set, as depicted in Fig. 2(a). The data set yields inherent variability associated with the experiments. Despite this variability, there is a general trend showcasing depression in IMT temperature with increasing W doping concentration. Regardless of the variability, for any physics-driven process, there are threshold limitations that inherently possess upper and lower bounds, irrespective of extrinsic factors. In this case, we also observe both lower and upper limits of IMT at any dopant concentration. Therefore, to develop an unbiased model capturing the overall trend while considering variability as parameters, we fit both the upper and lower limits of the data points [see Fig. S3(a) in the supplementary material] and subsequently combine both models. Consequently, two models are fitted as depicted in Fig. 3(a). The SR3 model has a MAE error of 3.9 K [Fig. 3(b)] with the upper limit of the data set points, while the SR2 model on the lower limit of the data set has an error of 2.2 K [Fig. 3(c)], showcasing the ability of symbolic models to capture the trend in the data. The SR2 and SR3 models are given by
(6)
(7)
observing equation SR2, at the upper threshold, the electronic changes are coupled with structural changes (usually attributed to the polynomial terms), causing the depression in IMT. However, at the lower threshold, both electronic and structural changes contribute individually toward a more pronounced depression in IMT. This indicates the variability in the mechanism of transition, as discussed earlier. To develop a general equation of IMT with W doping, we integrate SR2 and SR3 with an interpolation parameter a,
(8)
a accounts for variability in experimental and associated manifestations of mechanistic differences during transition. Based on different values of a, we may obtain different equations in the region of trend 2. For a given dopant concentration, reducing a will cause a drop in IMT [Fig. 3(d)]. As the dopant concentration increases, the rate of reduction in IMT also decreases. For a dopant concentration <1 at. %, the mean rate of change (RC) (reduction), in IMT observed for different levels of DFT theory43 and empirical model44 [Fig. 3(e)], displays large variability in the prediction primarily due to the fixed functional form,44 which inhibits accurate mapping of the phase space. Additionally, for empirical models, there is the limiting nature of learning from the DFT data upon which the model is built. These factors further deteriorate the performance. The mean prediction of the symbolic models, however, to some extent agrees with the DFT prediction at a very low concentration <1 at. %. In conclusion, the SR1 model demonstrates the utility of symbolic regression in capturing trends in experimental data, drawing on physics-based insights. Meanwhile, SR2 and SR3 improve upon the Adler model by incorporating additional mechanistic complexity from experimental data, leading to expressions better suited for predictions.
FIG. 3.

Modeling the region under “trend 2” with symbolic regression. (a) Two different symbolic regression models, SR2 and SR3, fitted to the upper and lower bound points (color-coded similarly). (b) Parity plot displaying the predicted temperature against experimental temperature for the SR3 model. (c) Parity plot displaying the predicted temperature against experimental temperature for the SR2 model. (d) Variation in IMT with “a” parameter attributed to structural change for different dopant concentrations. (e) Comparison of the average rate of change (reduction) (RC) (°C/at. %) for the symbolic models, DFT, and empirical42 models.

FIG. 3.

Modeling the region under “trend 2” with symbolic regression. (a) Two different symbolic regression models, SR2 and SR3, fitted to the upper and lower bound points (color-coded similarly). (b) Parity plot displaying the predicted temperature against experimental temperature for the SR3 model. (c) Parity plot displaying the predicted temperature against experimental temperature for the SR2 model. (d) Variation in IMT with “a” parameter attributed to structural change for different dopant concentrations. (e) Comparison of the average rate of change (reduction) (RC) (°C/at. %) for the symbolic models, DFT, and empirical42 models.

Close modal

The nature of the dopant has a profound effect on dictating the IMT in VO2. While tungsten is the most widely used dopant to modulate the IMT in VO2, other candidates have also been explored. Here, we discuss 29 other dopants and their corresponding effects on IMT and the rate of change (RC) (reduction) (°C/at. %) utilizing an experimental data set, as depicted in Fig. 4(a), collected from different pieces of the literature.36–39 The bubble sizes [Fig. 4(a)] indicate the dopant concentration in at. %. It is intuitive that upon doping, the dopant modifies the local VO2 structure either electronically, structurally, or in combination. In substitutional doping, replacement of the transition metal cation V4+ often occurs in a different formal oxidation state. Based on the valency of the dopant, we can categorize the doping in three different ways: i.e., hole doping (valency of dopant less than +4), neutral doping (valency of dopant equal to +4), and electron doping (valency of dopant greater than +4). For example, W dopant with a +6 formal oxidation state provides additional electrons per dopant atom to VO2 due to a charge transfer between W6+ and V4+ ions, causing the reduction of V4+ to V3+ ions.45 In contrast, a dopant like Ti with a +4-oxidation state is unlikely to alter the electronic structure significantly due to the lack of charge transfer between Ti4+ and V4+ ions. The overall effect of the oxidation state of the dopant on IMT can be seen in Fig. 4(b). For hole doping (e.g., Cu3+, Fe3+, and F), the IMT is altered by either increment or reduction, but the degree of modulation is comparatively low. For neutral doping, the IMT modulation is very small, and the mean IMT remains very close to that of the undoped system. However, electron doping significantly alters the IMT, primarily reducing it. A similar trend can be observed for the rate of change (RC) (reduction) in Fig. 4(c). The predominant RC can be observed for electron doping. To understand the qualitative effect of the electronic changes and structural changes, we compared the SR2 model for W doping in VO2 with neutral doping of VO2 with Ti doping data [Fig. 4(d)] collected from different literature studies.46,47

FIG. 4.

Different dopant species affecting the IMT in VO2. (a) Experimental observations for IMT and respective rate of change (reduction) (RC) (°C/at. %) for different dopant species, with the size of the bubbles representing respective doping concentrations (at. %). The IMT for pure VO2 is shown as the black dotted line. (b) Dopant nature (hole, neutral, and electronic) and the IMT values for the data in (a). (c) Dopant nature and corresponding RC for the data represented in (a). (d) Comparison of dopant concentration-driven change in IMT for nominal isovalent doping (Ti) and electronic doping (W—SR2 equation). Black dotted lines represent the IMT of undoped VO2.

FIG. 4.

Different dopant species affecting the IMT in VO2. (a) Experimental observations for IMT and respective rate of change (reduction) (RC) (°C/at. %) for different dopant species, with the size of the bubbles representing respective doping concentrations (at. %). The IMT for pure VO2 is shown as the black dotted line. (b) Dopant nature (hole, neutral, and electronic) and the IMT values for the data in (a). (c) Dopant nature and corresponding RC for the data represented in (a). (d) Comparison of dopant concentration-driven change in IMT for nominal isovalent doping (Ti) and electronic doping (W—SR2 equation). Black dotted lines represent the IMT of undoped VO2.

Close modal

The introduction of Ti4+ induces negligible changes in the valence band of VO2. Without a donor or acceptor nature of doping, the mechanism of the regulation of Tc by Ti4+ doping is predominantly associated with the local structure perturbations induced.45 At the same time, W replacement destabilizes the monoclinic phase through the disruption of the dimeric V4+–V4+ bond to form W6+–V3+ and V4+–V3+ pairs, thus lowering the energy barrier for the transition to the rutile structure. While both Ti4+ and W6+ have similar ionic radii48 [Fig. 4(d)], it is difficult to discretize the exact electronic and structural contribution during the transition and compare. For dopants such as W, it is evident that imminent changes in the electronic structure lead to structural destabilization, altering the phase stability of the phases. This further corroborates the development of a symbolic model with the Adler model as a basis, which stems from electronic theory.

To understand the correlation between dopant characteristics and the relative effectiveness of tunability in IMT, we use a set of relevant periodic table and electronic structure-based features for dopant elements as depicted in Fig. 5. The features are based on specific periodic table-based attributes such as group, column number (atomic number), Mendeleev number, atomic weight, row, valency, ionic radius, and element block (s, p, d, and f), as well as electronic properties. We select the last two orbital attributes of the electronic configuration of the element, such as principal quantum number (n1 and n2), types of orbital (o1_s, o2_s, etc., for s and p orbitals), and number of electrons (e1 and e2) in them. Through feature-driven correlation analysis, we shed light on the attributes crucial for the choice of dopants. The feature importance is computed using SHAP (SHapley Additive exPlanations)49 [Figs. 5(a) and 5(b)] using random forest (RF) models that are fitted to predict IMT and RC on the constructed feature space. For fitting the random forest models, a fivefold cross-validation strategy was used, and SHAP was applied to each fold separately and the results were combined afterward. As depicted in Fig. 5(a), the most important feature for predicting IMT is e1, indicating that elements with partially filled last s orbitals (low feature values) (e.g., Nb and Mo) are likely to impact IMT negatively (reduce it) compared to dopants having filled s orbitals. Apart from dopant concentrations (at. %), we see that valency can impact IMT, particularly for the elements with high valency [Fig. 5(a)] are likely to reduce it. For RC, we see a different trend, as valency has the most importance, with high-valent dopants likely to have a positive RC (reduce IMT). Other important features include atomic weight and atomic number. Additionally, we see that ionic radius impacts IMT, with elements having a low ionic radius likely to have a positive RC contribution (reduce IMT) (e.g., Boron) compared to elements with a high ionic radius (e.g., Tb). Overall normalized feature importance is given in Fig. 5(c). It is also imperative to remember that some of these features are correlated among themselves. Hence, to present the degree of correlation between features, we also showcase a dendrogram in Fig. 5(c). The height of the dendrogram represents the similarity among the features. For example, one can start from the extreme bottom left side of the dendrogram with two connected features (block_d and O2_d), which are practically the same. As one moves upward, it gets connected with o1_s. The d block elements share some degree of similarity in the electronic structure of the last s orbital. Overall we can see that valency, ionic radius, atomic number, and dopant concentration are among the crucial factors that dictate IMT in VO2.

FIG. 5.

Feature-driven analysis of the influence of dopants on the IMT and RC [rate of change (reduction)] (°C/at. %). (a) SHAP importance of different features computed from a fitted random forest model for predicting IMT. (b) SHAP importance of different features computed from a fitted random forest model for predicting RC. (c) Overall importance of the features of the dopant species in predicting IMT and RC, along with the connectivity among the features shown as a dendrogram.

FIG. 5.

Feature-driven analysis of the influence of dopants on the IMT and RC [rate of change (reduction)] (°C/at. %). (a) SHAP importance of different features computed from a fitted random forest model for predicting IMT. (b) SHAP importance of different features computed from a fitted random forest model for predicting RC. (c) Overall importance of the features of the dopant species in predicting IMT and RC, along with the connectivity among the features shown as a dendrogram.

Close modal

Modulating the insulator–metal transition (IMT) temperature through aliovalent doping is an effective method for regulating carrier concentration, offering a robust means to adjust device performance. While tungsten (W) is widely explored for this purpose, other dopants also influence the IMT in VO2. Given the complexity of experiments and limited experimental data, we have developed symbolic models to capture IMT trends in VO2 under various dopant control parameters. The symbolic models, namely, SR1, SR2, and SR3, encompass the dual trend observed in the data set. Specifically, SR2 and SR3, along with a correlation parameter a, effectively capture the variability in experimental conditions and transition mechanisms, aligning with the principles of semiconductor physics. These models provide valuable alternatives to experiments, with the a parameter requiring significantly fewer trials for determination, facilitating extrapolation to other dopant concentrations. Furthermore, our analysis of different dopant candidates sheds light on specific dopant features critical for tuning the properties of VO2 systems, aiding in informed dopant selection for desired device characteristics. Our approach may be effective for exploring other phase change materials by modeling the control parameters and their effect on the phase space.

The supplementary material consists of data sets used to develop the SR model and related analysis.

S.R. acknowledges AFOSR (No. FA9550-23-1-0215) for support. Work performed at the Center for Nanoscale Materials, a U.S. Department of Energy Office of Science User Facility, was supported by the U.S. DOE, Office of Basic Energy Sciences, under Contract No. DE-AC02-06CH11357. This material is based on work supported by the DOE, Office of Science, BES Data, Artificial Intelligence, and Machine Learning at DOE Scientific User Facilities program (Digital Twin). This research used resources of the National Energy Research Scientific Computing Center (NERSC), a U.S. Department of Energy Office of Science User Facility located at Lawrence Berkeley National Laboratory, operated under Contract No. DE-AC02-05CH11231.

The authors have no conflicts to disclose.

S. Banik: Formal analysis (lead); Writing – original draft (lead). S. V. Shriram: Data curation (equal); Writing – review & editing (supporting). S. Ramanathan: Conceptualization (supporting); Writing – review & editing (supporting). S. K. R. S. Sankaranarayanan: Supervision (lead); Writing – review & editing (equal).

The data that support the findings of this study are available within the article and its supplementary material.

1.
B. G.
Chae
and
H. T.
Kim
,
Physica B
405
,
663
(
2010
).
2.
C.
Zhu
,
M.
Kaur
,
F.
Tang
,
X.
Liu
,
D. J.
Smith
, and
R. J.
Nemanich
, “
Band alignment of vanadium oxide as an interlayer in a hafnium oxide-silicon gate stack structure
,”
J. Appl. Phys.
112
(
8
), 084105 (
2012
).
3.
R.
Zhang
,
Q. S.
Fu
,
C. Y.
Yin
,
C. L.
Li
,
X. H.
Chen
,
G. Y.
Qian
,
C. L.
Lu
,
S. L.
Yuan
,
X. J.
Zhao
, and
H. Z.
Tao
,
Sci. Rep.
8
,
17093
(
2018
).
4.
Z.
Shao
,
X.
Cao
,
H.
Luo
, and
P.
Jin
,
NPG Asia Mater.
10
,
581
(
2018
).
5.
M. J.
Lee
,
Y.
Park
,
D. S.
Suh
,
E. H.
Lee
,
S.
Seo
,
D. C.
Kim
,
R.
Jung
,
B. S.
Kang
,
S. E.
Ahn
,
C. B.
Lee
,
D. H.
Seo
,
Y.-K.
Cha
,
I.-K.
Yoo
,
J.-S.
Kim
, and
B. H.
Park
,
Adv. Mater.
19
,
3919
(
2007
).
6.
J.
del Valle
,
Y.
Kalcheim
,
J.
Trastoy
,
A.
Charnukha
,
D. N.
Basov
, and
I. K.
Schuller
,
Phys. Rev. Appl.
8
,
054041
(
2017
).
7.
T.
Driscoll
,
H.-T.
Kim
,
B.-G.
Chae
,
B.-J.
Kim
,
Y.-W.
Lee
,
N. M.
Jokerst
,
S.
Palit
,
D. R.
Smith
,
M.
Di Ventra
, and
D. N.
Basov
,
Science
325
,
1518
(
2009
).
8.
H.-T.
Zhang
,
T. J.
Park
,
A. N. M. N.
Islam
,
D. S. J.
Tran
,
S.
Manna
,
Q.
Wang
,
S.
Mondal
,
H.
Yu
,
S.
Banik
,
S.
Cheng
,
H.
Zhou
,
S.
Gamage
,
S.
Mahapatra
,
Y.
Zhu
,
Y.
Abate
,
N.
Jiang
,
S. K.R. S.
Sankaranarayanan
,
A.
Sengupta
,
C.
Teuscher
, and
S.
Ramanathan
,
Science
375
,
533
(
2022
).
9.
S.
Banik
,
T.
Loefller
,
S.
Manna
,
H.
Chan
,
S.
Srinivasan
,
P.
Darancet
,
A.
Hexemer
, and
S. K. R. S.
Sankaranarayanan
,
npj Comput. Mater.
9
,
177
(
2023
).
10.
X.
He
,
Y.
Zeng
,
X.
Xu
,
C.
Gu
,
F.
Chen
,
B.
Wu
,
C.
Wang
,
H.
Xing
,
X.
Chen
, and
J.
Chu
,
Phys. Chem. Chem. Phys.
17
,
11638
(
2015
).
11.
X.
Tan
,
T.
Yao
,
R.
Long
,
Z.
Sun
,
Y.
Feng
,
H.
Cheng
,
X.
Yuan
,
W.
Zhang
,
Q.
Liu
,
C.
Wu
,
Y.
Xie
, and
S.
Wei
,
Sci. Rep.
2
,
466
(
2012
).
12.
Y.
Zhang
,
D.
Ke
,
J.
Wu
,
C.
Zhang
,
L.
Hou
,
B.
Lin
,
Z.
Chen
,
J. P.
Perdew
, and
J.
Sun
, “
Challenges for density functional theory in simulating metal–metal singlet bonding: A case study of dimerized VO2
,”
J. Chem. Phys.
160
(
13
), 134101 (
2024
).
13.
A.
Iwaszuk
and
M.
Nolan
,
J. Phys. Chem. C
115
,
12995
(
2011
).
14.
S.
Banik
,
K.
Balasubramanian
,
S.
Manna
,
S.
Derrible
, and
S. K. R. S.
Sankaranarayananan
,
Comput. Mater. Sci.
236
,
112847
(
2024
).
15.
S.
Banik
,
D.
Dhabal
,
H.
Chan
,
S.
Manna
,
M.
Cherukara
,
V.
Molinero
, and
S. K. R. S.
Sankaranarayanan
,
npj Comput. Mater.
9
,
23
(
2023
).
16.
Y.
Wang
,
N.
Wagner
, and
J. M.
Rondinelli
,
MRS Commun.
9
,
793
(
2019
).
17.
L.
Foppa
,
T. A. R.
Purcell
,
S. V.
Levchenko
,
M.
Scheffler
, and
L. M.
Ghiringhelli
,
Phys. Rev. Lett.
129
,
055301
(
2022
).
18.
Y.
Cui
,
Y.
Ke
,
C.
Liu
,
Z.
Chen
,
N.
Wang
,
L.
Zhang
,
Y.
Zhou
,
S.
Wang
,
Y.
Gao
, and
Y.
Long
,
Joule
2
,
1707
(
2018
).
19.
G.
Li
,
D.
Xie
,
H.
Zhong
,
Z.
Zhang
,
X.
Fu
,
Q.
Zhou
,
Q.
Li
,
H.
Ni
,
J.
Wang
,
E.-J.
Guo
,
M.
He
,
C.
Wang
,
G.
Yang
,
K.
Jin
, and
C.
Ge
,
Nat. Commun.
13
,
1729
(
2022
).
20.
D.
Adler
and
H.
Brooks
,
Phys. Rev.
155
,
826
(
1967
).
21.
D.
Adler
,
J.
Feinleib
,
H.
Brooks
, and
W.
Paul
,
Phys. Rev.
155
,
851
(
1967
).
22.
B.-J.
Kim
,
Y. W.
Lee
,
S.
Choi
,
J.-W.
Lim
,
S. J.
Yun
,
H.-T.
Kim
,
T.-J.
Shin
, and
H.-S.
Yun
,
Phys. Rev. B
77
,
235401
(
2008
).
23.
H.
Zhang
,
Z.
Guan
,
B.
Cheng
,
Q.
Li
,
R.
Liu
,
J.
Zhang
,
Z.
Liu
,
K.
Yang
,
T.
Cui
, and
B.
Liu
,
RSC Adv.
7
,
31597
(
2017
).
24.
M.
Cranmer
, “
Interpretable machine learning for science with PySR and SymbolicRegression. jl
,” arXiv:2305.01582 (
2023
).
25.
K.
Sun
,
C.
Wheeler
,
J. A.
Hillier
,
S.
Ye
,
I.
Zeimpekis
,
A.
Urbani
,
N.
Kalfagiannis
,
O. L.
Muskens
, and
C. H.
de Groot
,
Adv. Opt. Mater.
10
,
2201326
(
2022
).
26.
Y.
Bleu
,
F.
Bourquard
,
V.
Barnier
,
A.-S.
Loir
,
F.
Garrelie
, and
C.
Donnet
,
Materials
16
,
461
(
2023
).
27.
C.
Piccirillo
,
R.
Binions
, and
I. P.
Parkin
,
Thin Solid Films
516
,
1992
(
2008
).
28.
J. M.
Reyes
,
M.
Sayer
, and
R.
Chen
,
Can. J. Phys.
54
,
408
(
1976
).
29.
L.
Whittaker
,
T.-L.
Wu
,
C. J.
Patridge
,
G.
Sambandamurthy
, and
S.
Banerjee
,
J. Mater. Chem.
21
,
5580
(
2011
).
30.
C.
Ling
,
Z.
Zhao
,
X.
Hu
,
J.
Li
,
X.
Zhao
,
Z.
Wang
,
Y.
Zhao
, and
H.
Jin
,
ACS Appl. Nano Mater.
2
,
6738
(
2019
).
31.
N.
Shen
,
S.
Chen
,
R.
Shi
,
S.
Niu
,
A.
Amini
, and
C.
Cheng
,
ACS Appl. Electron. Mater.
3
,
3648
(
2021
).
32.
K.
Mulchandani
,
A.
Soni
,
K.
Pathy
, and
K. R.
Mavani
,
Superlattices Microstruct.
154
,
106883
(
2021
).
33.
J.
Chaillou
,
Y.-F.
Chen
,
N.
Émond
,
T.
Hajlaoui
,
B.
Torriss
,
K. D.
Malviya
,
E.
Orgiu
, and
M.
Chaker
,
ACS Appl. Electron. Mater.
4
,
1841
(
2022
).
34.
M. R.
Yamagata
,
Y.
Wakita
,
Y.
Tsuruda
, and
K.
Miyata
,
Therm. Sci. Eng. Prog.
37
,
101601
(
2023
).
35.
M.
Pattanayak
,
M. N. F.
Hoque
,
Y.-C.
Ho
,
W.
Li
,
Z.
Fan
, and
A. A.
Bernussi
,
Appl. Mater. Today
30
,
101642
(
2023
).
36.
X.
Wang
,
L.
Chen
,
H.
Lu
,
W.
Fang
,
H.
Li
,
W.
Yin
,
M.
Li
,
Y.
Lu
,
P.
Li
, and
Y.
He
, “
Enhancing visible-light transmittance while reducing phase transition temperature of VO2 by Hf–W co-doping
,”
Appl. Phys. Lett.
118
(
19
), 192102 (
2021
).
37.
N.
Wang
,
M.
Duchamp
,
R. E.
Dunin-Borkowski
,
S.
Liu
,
X.
Zeng
,
X.
Cao
, and
Y.
Long
,
Langmuir
32
,
759
(
2016
).
38.
Y.
Xue
and
S.
Yin
,
Nanoscale
14
,
11054
(
2022
).
39.
A.
Krammer
,
A.
Magrez
,
W. A.
Vitale
,
P.
Mocny
,
P.
Jeanneret
,
E.
Guibert
,
H. J.
Whitlow
,
A. M.
Ionescu
, and
A.
Schüler
,
J. Appl. Phys.
122
,
045304
(
2017
).
40.
M.
Israelsson
and
L.
Kihlborg
,
Mater. Res. Bull.
5
,
19
(
1970
).
41.
K.
Shibuya
,
M.
Kawasaki
, and
Y.
Tokura
, “
Metal–insulator transition in epitaxial V1–xWxO2 (≤ x≤ 0.33) thin films
,”
Appl. Phys. Lett.
96
(
2
), 022102 (
2010
).
42.
M.
Netsianda
,
P. E.
Ngoepe
,
C. R. A.
Catlow
, and
S. M.
Woodley
,
Chem. Mater.
20
,
1764
(
2008
).
43.
J. Zhang, H. He, Y. Xie, and B. Pan
, “
Theoretical study on the tungsten-induced reduction of transition temperature and the degradation of optical properties for VO2
,”
J. Chem. Phys.
138
(
11
), 114706 (
2013
).
44.
S.
Banik
,
P. S.
Dutta
,
S.
Manna
, and
S. K. R. S.
Sankaranarayanan
, arXiv:2405.14683 (
2024
).
45.
Y.
Wu
,
L.
Fan
,
Q.
Liu
,
S.
Chen
,
W.
Huang
,
F.
Chen
,
G.
Liao
,
C.
Zou
, and
Z.
Wu
,
Sci. Rep.
5
,
9328
(
2015
).
46.
Y.
Hu
,
Q.
Shi
,
W.
Huang
,
H.
Zhu
,
F.
Yue
,
Y.
Xiao
,
S.
Liang
, and
T.
Lu
,
J. Sol-Gel Sci. Technol.
78
,
19
(
2016
).
47.
Z.
Hiroi
,
H.
Hayamizu
,
T.
Yoshida
,
Y.
Muraoka
,
Y.
Okamoto
,
J.-I.
Yamaura
, and
Y.
Ueda
,
Chem. Mater.
25
,
2202
(
2013
).
48.
R. D.
Shannon
,
Found. Crystallogr.
32
,
751
(
1976
).
49.
S. M.
Lundberg
,
G.
Erion
,
H.
Chen
,
A.
DeGrave
,
J. M.
Prutkin
,
B.
Nair
,
R.
Katz
,
J.
Himmelfarb
,
N.
Bansal
, and
S.-I.
Lee
,
Nat. Mach. Intell.
2
,
56
(
2020
).