Scanning tunneling microscopy (STM) is a widely used tool for atomic imaging of novel materials and their surface energetics. However, the optimization of the imaging conditions is a tedious process due to the extremely sensitive tip–surface interaction, thus limiting the throughput efficiency. In this paper, we deploy a machine learning (ML)-based framework to achieve optimal atomically resolved imaging conditions in real time. The experimental workflow leverages the Bayesian optimization (BO) method to rapidly improve the image quality, defined by the peak intensity in the Fourier space. The outcome of the BO prediction is incorporated into the microscope controls, i.e., the current setpoint and the tip bias, to dynamically improve the STM scan conditions. We present strategies to either selectively explore or exploit across the parameter space. As a result, suitable policies are developed for autonomous convergence of the control parameters. The ML-based framework serves as a general workflow methodology across a wide range of materials.

## I. INTRODUCTION

Scanning tunneling microscopy (STM) is an invaluable tool for imaging material surfaces at atomic resolution and obtaining spectroscopic information related to their local electronic structure.^{1,2} STM utilizes an atomically sharp tip, which, when biased, measures currents from a conductive surface based on the quantum mechanical tunneling effect.^{3} This technique has been used to explore various physical processes such as band-gap energetics,^{4} superconductivity,^{5} and other quantum phenomena.^{6} However, the optimization of STM for high-quality imaging and spectroscopy measurements is time-consuming due to extremely sensitive tip–surface interactions, thus limiting the efficiency. This difficulty is compounded, especially in long-term measurements by physical factors such as the tip drift, piezo creep, noise, and surface states.

In the last several years, advances in machine learning (ML) have brought significant developments to multiple areas of imaging sciences.^{7–11} Recent developments in the application of ML to scanning probe and electron microscopy methods have been realized to achieve autonomous experimental workflows to assist in materials discovery.^{12,13} The use of neural networks and reinforcement learning-based frameworks on STM systems have been demonstrated for tip functionalization and atom manipulation.^{14–17}

In STM measurements, a high-quality image is necessary for an accurate determination of atomic positions, defect locations, and symmetries. An example of this is shown in Fig. 1, which compares STM images of graphene for different sets of control parameters. The image on the top panel [Fig. 1(a)] is a higher quality image, as evidenced by the sharp peaks of the hexagonal structure in the Fourier space [Fig. 1(b)]. To determine the atom position from the scan data, we used a blob detection based on the Laplacian of Gaussians (LOG) process on the noise-filtered FFT data.^{9,18,19} The determined atom position (blob), indicated by the red circles in Fig. 1(c), shows accurate atom detection and alignment along the crystallographic axis. The lower quality image [shown in Fig. 1(d)], on the other hand, introduces an error in the position detection [shown in Fig. 1(f)]. Although suitable techniques such as noise filtering and priors are used to improve image quality, a bad choice of control parameters results in noisy images with indistinguishable features. Therefore, optimization of the instrument controls for better imaging is desirable. In practice, this optimization is implemented by initially deciphering the range of values for various controls based on prior knowledge, followed by incremental optimization, usually by trial and error.

In this work, we demonstrate an ML-based technique to realize autonomous convergence of the STM controls for imaging graphene films deposited on copper substrate. The process is implemented via an active-learning-based Bayesian optimization (BO) method, which is designed to provide a derivative-free optimization task of an expensive (time/resources) black-box problem, where the functional representation between the input (control parameters) and the output (experimental results) is unknown and the optimal solution can be found with minimal experiments. These techniques have gained traction in recent years to realize autonomous workflows.^{20–26}

^{27}Given an input set of data points

*X*≡

*x*

_{1:k}=

*x*

_{1},

*x*

_{2}, … ,

*x*

_{k}, the GP surrogate function

*f*(

*x*

_{1:k}) = [

*f*(

*x*

_{1}),…,

*f*(

*x*

_{k})] is a multivariate normal (

*MVNormal*) distribution across mean functions (

*m*(

*x*

_{1:k})) and covariance functions

*K*(

*x*

_{1:k}) and is expressed as

*ε*is the normally distributed observational noise. In our implementation, we treat the noise term as a hyperparameter that is constrained with a minimum bound of ∼0.1. The mean function is usually considered to be zero, and the matrix

*K*=

*k*(

*x*

_{i},

*x*

_{j}) depends on the covariance kernel. The estimation of the observational noise is absorbed into the kernel computation. In this study, we employed the radial basis function (RBF) kernel due to the expected smoothness of the objective function (i.e., the image quality represented by the FFT peak) across the parameter space.

^{28}The form of RBF kernel is given as

*d*is the Euclidean distance,

*η*is the output scale, and

*l*is the kernel length scale. Given a new set of input parameters

*X*

_{*}≡

*x*

_{1:n}=

*x*

_{1},

*x*

_{2}, … ,

*x*

_{n}, the posterior distribution function (

*f*

_{*}) depends on the posterior mean (

*μ*

^{post}) and variance (

*σ*

^{post}) functions, and is given by

*θ*= (

*η*,

*l*) are the inferred kernel parameters.

The output of the GP regression provides the predictive objective mean and variance that is mapped across the parameter space, following which a suitable acquisition function is employed to sample the optimal set of parameters. The suggested parameters are then directly incorporated into the STM controls, in real time, to rapidly traverse across the parameter space. The automated experiment is implemented by integrating the instrumentation feedback, on-the-fly data analysis, and BO-based prediction of the control parameters.

## II. RESULTS AND DISCUSSION

### A. Experimental workflow

We design an experimental workflow to incorporate the BO prediction of the control parameters to the microscope controls in real time. A schematic of the workflow is illustrated in Fig. 2. The experiment begins with an initial set of controls input by the user. Once the scan is performed, the FFT data are analyzed, and the average of the six hexagonal peak intensities [shown in Fig. 2(c)] is used as the target to be optimized.

At every iteration, the preceding set of control features (i.e., current setpoint and bias) and the targets (average FFT peak intensity) serve as the training set to construct a surrogate function across the parameter space, as depicted in Fig. 2(d). Here, a Gaussian process (GP) is utilized as the surrogate function and the output of the regression yields the predictive mean and variance (“uncertainty”) values for the function of interest across the parameter space.^{28} This process further employs an acquisition function to suggest the optimal set of STM control parameters, i.e., the tip bias and the current setpoint. This point of acquisition function maxima is considered for subsequent experimental data sampling.

### B. Role of acquisition functions

The acquisition function affects the exploration trajectory of the sampling across the parameter space. The acquisition function is expected to balance the process of exploration (probing where the predictive uncertainty is high) and exploitation (probing where the predicted function value is “good”). For some acquisition functions like the upper confidence bound (UCB), the extent of exploration/exploitation strongly depends on the hand-selected hyperparameter value. We carried out experiments across a range of hyperparameter values for the different acquisition functions.

In the experiments that were carried out, the overall ranges of tip bias and setpoint current values were set based on prior domain knowledge about the system under study, and the BO was applied to find their optimal combination within those fixed ranges. Specifically, the tip-bias parameter was maintained in the range of 10–110 mV and the setpoint was in the range of 400–2000 pA. The experimental workflow starts at the initial condition set at 110 mV and 400 pA for the tip bias and the setpoint current, respectively. Once the initial scan data are obtained, the average FFT intensity across the six peaks is determined. The training set is then processed by the BO algorithm for parameter acquisition, which is then incorporated into the subsequent iteration. The experiment is automated until a set number of iterations are complete.

The experimental results indicate distinct trends associated with each of the acquisition functions. We have used the three acquisition functions in the automated experimental workflow, and their estimation is given by the following expressions:^{27,29,30}

- Expected Improvement (EI)where$EI=\mu \u2212ybest\u2212\xi \varphi \mu \u2212ybest\u2212\xi \sigma +\sigma \psi \mu \u2212ybest\u2212\xi \sigma ,$
*μ*is the GP predictive mean,*σ*is the GP predictive variance,*y*_{best}is the maximum value of the target in the training set,**ϕ**is the cumulative distribution function,**ψ**is the probability distribution function, and*ξ*is a slack parameter that determines the extent of improvement at every iteration. - Upper Confidence Bound (UCB)where$UCB=\mu +\beta \sigma ,$
*β*is a factor that controls the extent of exploration/exploitation of the acquisition. - Probability of Improvement (PI)$PI=\varphi \mu \u2212ybest\u2212\xi \sigma .$
The summary of the experimental trends for the acquisition functions is illustrated in Fig. 3. The different rows in Fig. 3 show the results pertaining to the acquisition functions. The first column in Fig. 3 shows the map of the GP mean value constructed at the end of the experiment (corresponding GP variance maps are provided in Fig. S3). The scatter points relate to the experimental data points. For each of the acquisition functions, the GP map indicates a higher value of the target parameter at the top left region of the parameter space, which corresponds to low bias and high setpoint. This observation is in close agreement with the ground truth (Fig. S1) trends for the graphene sample. Essentially, a lower bias and higher setpoint indicate a closer approach of the STM tip to the sample surface and result in a lowering of the tunneling resistance across the tip–sample junction.

The second column in Fig. 3 corresponds to the variation of the target FFT peak across the different iterations. The orange solid line in all the plots corresponds to the moving five-point average, which reflects the general trend of data-sampling progression during the experiment. The iteration indicated by the dashed red line corresponds to a “fairly optimized” image and is denoted by a considerably higher value of the FFT parameter. This optimal condition is also correlated in the third column of figures, which indicates the variation of the bias and setpoint controls across the iterations. It is observed that the optimized FFT corresponds to low bias and high setpoint value. For each of the experiments, the histogram plots (shown in Fig. S4) were monitored, and it shows asymmetric distribution skewed toward higher setpoint and lower bias values and correlates with the GP-predicted mean for the objective function.

Interestingly, we note that the acquisition traverses differently across the parameter space depending on the acquisition functions and the associated hyperparameter value. In the case of the EI function, optimization of the parameters is efficient, and this is evident in the histogram plots [Fig. S4(d)]. However, the acquisition continues to explore points away from the optimal region.

In contrast, the UCB function is dominantly exploitative for low and intermediate values of *β*. The acquisitions are in close vicinity of the optimal value. We observed that low values of *β* = 1–5 lead to the sampling around a local (or even false) maxima and, in some cases, do not converge even for iterations ∼100 (Fig. S5). However, for higher values of *β* (in the range of 500–1000), the acquisition is explorative, with no sign of convergence (Fig. S6).

On the other hand, the PI balances exploration and exploitation. It is also seen that the exploration is in the neighborhood of the optimal region. For the EI and the PI acquisition functions, the trends do not differ dramatically for the different *ξ* values (Figs. S7 and S8). The results summarizing the trends for the acquisition functions are tabulated in Table I.

Acquisition function . | Hyperparameter . | Optimized at iteration . | Inference . |
---|---|---|---|

Expected improvement (EI) | ξ = 0 | 3, 8 | Explorative |

ξ = 0.001 | 4, 20 | ||

ξ = 0.1 | 4, 8 | ||

Upper confidence bound (UCB) | β = 1–5 | Exploited at false maxima | Dominantly exploitative |

β = 10–100 | <10 iterations, exploits around (false) maxima | Explorative at high β | |

β = 500–100 | <10 iterations, continues exploration | ||

Probability of improvement (PI) | ξ = 0, 0.001, 0.1 | <10 iterations | Explores the neighborhood of the maxima |

Acquisition function . | Hyperparameter . | Optimized at iteration . | Inference . |
---|---|---|---|

Expected improvement (EI) | ξ = 0 | 3, 8 | Explorative |

ξ = 0.001 | 4, 20 | ||

ξ = 0.1 | 4, 8 | ||

Upper confidence bound (UCB) | β = 1–5 | Exploited at false maxima | Dominantly exploitative |

β = 10–100 | <10 iterations, exploits around (false) maxima | Explorative at high β | |

β = 500–100 | <10 iterations, continues exploration | ||

Probability of improvement (PI) | ξ = 0, 0.001, 0.1 | <10 iterations | Explores the neighborhood of the maxima |

In general, the results for convergence depend on the initial choice of the control parameters. However, in our study, to retain consistency across the experiments, we maintained the initial set of control parameters (i.e., 100 mV and 400 pA). Experimentally, this corresponds to the least favorable condition of high bias and low setpoint within the given parameter space. Therefore, the results in Table I are indicative of an upper estimate for the number of iterations required for optimization. We further explore strategies for parameter convergence by including an initial exploration policy.

### C. Convergence policies

The results from Sec. II B show different trends associated with each of the acquisition functions. Of particular interest is the UCB function that shows dramatic variations in the exploration/exploitation trends for higher/lower values of the *β*. We have utilized this feature (and in conjunction with other acquisition functions) to develop two policies for autonomous convergence of STM scan controls.

#### 1. Combination of acquisition functions

Here, we have used the EI and UCB functions as the two agents for exploration and exploitation, respectively. Given that EI (*ξ* = 0.1) predicts optimized control parameters within the first few iterations (indicated in Table I), the policy here uses the initial ten iterations for exploration with the EI followed by the next ten iterations of UCB (*β* = 1) for convergence.

The results of the experiments are summarized in Fig. 4. Figure 4(a) illustrates the GP-predictive mean [variance shown in Fig. S9(b)], which shows a higher density of acquisition in the optimal region of the parameter space. In Figs. 4(b) and 4(c), the blue shaded region indicates the part of the experiment with the EI acquisition function, followed by the white background denoting the utilization of the UCB function. The EI is explorative with higher variations in the FFT values [Fig. 4(b)] and the predicted parameters [Fig. 4(c)]. However, once the UCB agent is deployed, the FFT values steadily converge to the optimal value. This is correlated with the convergence of the control parameters to high setpoint and low bias values, as is evident in Fig. 4(c) and the histogram plots in Fig. 4(d).

A similar experiment was carried out with the combination of the PI and UCB acquisition functions, and the convergence shows a similar trend (Fig. S10). In this policy, the choice of the switching iteration is chosen based on the performance of the different acquisition functions (indicated in Table I). However, this varies depending on the dynamic nature of experiments, the size/dimension of the parameter space, and the sample under consideration. In such a scenario, the switching iteration could either be indicated as a hyperparameter or be conditioned based on performance factors such as improvement in the image quality, scan stability, and the sparsity of exploration.

#### 2. *β* annealing

In the second policy, we employ the UCB acquisition function while gradually tuning the *β* parameter. The UCB function has been previously utilized in statistical problems and reinforcement learning that involve optimizing through regret minimization and maximizing information gain.^{31–33} These procedures involved balancing the twin processes of exploration and exploitation across the iterations. We use an alternate method to initially explore across the parameter space followed by convergence of the optimal parameters by gradually diminishing the *β* parameter.^{34,35} In this experiment, we extend the control parameter space across the working region of the STM, i.e., bias in the range of 10–510 mV and setpoint in the range of 50–2000 pA. The experiment starts with an initial *β* = 1000 for dominant exploration. As the experiment proceeds, the *β* value is “annealed” by iteratively reducing the value by 10%, in comparison to the previous iteration. Therefore, we expect the acquisition to be “greedy” in the later part of the experiment. The results of the experiments are illustrated in Fig. 5.

Figure 5(a) shows the map of the GP mean and the scatter points show that experimental sampling across the parameter space [variance map shown in Fig. S9(d)]. The top panel in Fig. 5(b) shows the variation of the *β* parameter across the iterations starting from *β* = 1000 in the initial iteration to *β* ∼5 in the final iteration. The variation of the target FFT parameter (orange solid line) shows a gradual improvement in the initial part of the experiment, followed by saturation in the later part of the iteration. This is correlated with the control parameter values [in Fig. 5(c)], which show high variability followed by convergence to higher setpoint and lower bias values. These trends are correlated with the asymmetry in the histogram plots shown in Fig. 5(d).

The above-described policies are demonstrations of autonomous convergence of scan parameters. In this experiment, we maintained a relatively slow *β* annealing to allow for exploration across the larger parameter space. However, this rate of exploration/exploitation can be tuned by suitably modifying the range and the annealing rate of the *β* value.

### D. Scan stability

One of the crucial concerns related to the STM measurements and optimization is the stability of the scanning conditions. The STM measurements are influenced by the extremely sensitive tip–surface interactions. The dynamic nature of tip–surface interactions due to processes such as temperature fluctuations, piezo creep, and the presence of atomic impurities can influence the image quality, despite fixing the control parameters.

To account for the stability of the tip, we estimate an “FFT-variance parameter” that indicates the extent of the tip stability. The method of estimating this parameter is described in Fig. 6(a). Here, a scan image is fragmented into multiple sections and the prominent FFT peak is determined for each of the sub-images. The variance of these peaks is indicative of the scan stability. The lower the variance of the FFT-peak values, the better the scan stability. Figures 6(b) and 6(c) show the FFT peak and the peak-variance parameter across the iterations for the beta-annealing optimization experiment (described in Fig. 5). On average, the trends for the FFT peak and the FFT-peak variance show an inverse correlation. This is observed in Fig. 6(d), which shows the plot of the FFT peak and the peak variance. The data shown in Fig. 6(b) are clustered using a K-Means method with the number of classes, *N* = 2. It is seen that the better quality images (indicated by red) have relatively lower peak variance. The scan images of low- and high-quality images are provided in the inset for comparison.

Given the sensitive nature of the tip, the correlation between the image stability and quality would not hold under conditions that result in modification of the tip. This could include unintentional functionalization or the presence of impurities at the tip. Under such conditions, additional tip-conditioning procedures improve the scan quality and stability. Figure S11 describes the improvement of scan stability and scan quality after a “tip conditioning” procedure. Here, we performed a bias-pulsing procedure followed by a larger area scan. We believe this procedure assisted in the removal of possible impurities accumulated at the tip. Furthermore, the correlation of the scan stability and quality is seen in Fig. S11(e), which shows a better imaging quality upon tip conditioning.

## III. CONCLUSION

In conclusion, we demonstrated the use of the ML framework given by the Bayesian optimization for the autonomous convergence of the STM scan controls. This work demonstrates the proof-of-concept on the graphene sample, which has a relatively smoother dependence across the energy landscape. The BO procedure can be extended to complex systems that show deviation from the standard convex trends, where regions of high image quality and stability can exist as isolated regions within the parameter space. In addition, the automated routine demonstrated in our approach can be implemented on dynamical systems to converge on the optimal parameters in real time.

The workflow methodology presented in this work can be modified and applicable to systems such as semiconductors, 2D layers, topological materials, and quantum structures, as well as other imaging techniques. Depending on the system under investigation, suitable physical models can be incorporated into the GP regression. The convergence policies, when further combined with spectroscopic techniques, allow for autonomous exploitation/exploration of target properties across the sample space. This includes probing features such as superconducting phases, topological defects, impurities, and interfacial regions.

## IV. METHODS

### A. STM imaging

STM experiments were conducted on CVD-grown graphene monolayers on copper foil. Samples were cleaned by sonication in an acetone bath for 20 minutes and annealed in an ultra-high vacuum (<1 × 10^{–10} Torr) at 500 °C for 4 h to remove the contaminants on the graphene surface. After annealing, samples were transferred to an *in situ* STM system (Omicron VT-STM). All measurements were performed in constant current mode using chemically etched tungsten tips. The quality and metallicity of tips were confirmed on the Cu(100) surface before the measurements. The sample temperature was maintained at 302.5 K during the experiment to reduce thermal fluctuations. In our experiments, the tip is grounded, and a negative of the indicated bias was applied to the sample. This preserves the convention of the tip bias indicated in this work. For each STM image, a 3 × 3 nm^{2} area is scanned with a scan speed of 20 nm/s.

### B. Data analysis

#### 1. Bayesian optimization

The autonomous workflow of the STM experiments to find the optimal imaging condition was demonstrated using Bayesian optimization (BO). The initial experiment is carried out by a set of parameters input by the user. Once the scan data are obtained, the average FFT intensity across the six peaks is determined. The training set is then processed by the BO algorithm for parameter prediction, which is then incorporated into the subsequent iteration. The experiment is automated until a set number of iterations are complete. In the BO methodology, a Gaussian process (GP) model (replacing the unknown function) is fit to predict the outcome of the non-explored locations in the parameter space. This is used to suggest future experimental parameters using the acquisition function maxima. The optimization process is driven by the objective to maximize the STM image quality, which is indicated by the average FFT peak intensity.

The Bayesian optimization framework has been developed in the “Pytorch” Python environment. The Gaussian process regression is modeled with the “Gpytorch” library package. The hyperparameter optimization for GP training has been conducted with Adam optimizer with a learning rate of 0.1, and computational epochs = 150 at every iteration.

#### 2. Noise filtering

Prior to the blob detection, the FFT images were filtered for noise using a low-pass and a high-pass filter. A threshold of 0.6 for the top panel (and 0.3 for the bottom panel) of Fig. 1 was implemented on the normalized FFT to pick the dominant peaks in the FFT data. An inverse FFT operation was performed to reconstruct the atomic structure.

#### 3. Blob detection

The blob detection was carried out based on the Laplacian of Gaussian (LOG) function, which is available on the scikit-image library.

#### 4. K-Means for data clustering

The scatter data were clustered using the *K-Means* function imported from the scikit library in Python. The data were classified into two clusters, *N* = 2.

#### 5. Peak identification

Peaks in the normalized FFT data were identified using the find_peaks function available in the scipy.signal library. The following function descriptions were used to identify the maximum number of peaks: *height* = 0.05, *width* = 1, *threshold* = 0.02. The six peaks were identified by carrying on rotation operations on the images and discerning the quadrant and the lattice parameter associated with the peak positions.

## SUPPLEMENTARY MATERIAL

The supplementary material contains additional experimental data, supporting GP variance maps and analysis related to tip conditioning.

## ACKNOWLEDGMENTS

The work was supported by the Center for Nanophase Materials Sciences (CNMS), which is a U.S. Department of Energy, Office of Science User Facility at Oak Ridge National Laboratory. We thank Professor Sergei V. Kalinin (University of Tennessee) for helpful discussions about optimizing imaging parameters in scanning probes and scanning tunneling microscopy experiments. We also acknowledge Mr. Sai Mani Prudhvi Valleti for helping with the blob-detection method.

## AUTHOR DECLARATIONS

### Conflict of Interest

The authors have no conflicts to disclose.

### Author Contributions

G.N. designed the LabVIEW controls, performed the experiments, and compiled the data. S.H. arranged for the sample and set up the instrumentation for the STM experiments. A.B. worked on the Bayesian Optimization code used in the experiments. R.V. and M.Z. conceptualized, coordinated, and supervised the implementation of the autonomous STM. All authors were involved in the data analysis and contributed to manuscript preparation.

**Ganesh Narasimha**: Data curation (equal); Investigation (equal); Software (equal); Writing – original draft (equal). **Saban Hus**: Formal analysis (equal); Investigation (equal); Methodology (equal); Resources (equal); Software (equal); Validation (equal); Writing – review & editing (equal). **Arpan Biswas**: Formal analysis (equal); Methodology (equal); Software (equal); Validation (equal); Writing – review & editing (equal). **Rama Vasudevan**: Conceptualization (equal); Formal analysis (equal); Investigation (equal); Methodology (equal); Project administration (equal); Resources (equal); Supervision (equal); Validation (equal); Writing – review & editing (equal). **Maxim Ziatdinov**: Conceptualization (equal); Formal analysis (equal); Methodology (equal); Project administration (equal); Supervision (equal); Validation (equal); Writing – review & editing (equal).

## DATA AVAILABILITY

The data and the code for the Bayesian optimization program are openly available at: github.com/gnganesh99/BO-for-AutoSTM.

## REFERENCES

*Introduction to Scanning Tunneling Microscopy*

*Catalogue of Artificial Intelligence Tools*

*(*