The development of high-throughput reaction discovery methods such as the *ab initio* nanoreactor demands massive numbers of reaction rate calculations, each requiring the optimization of a minimum energy reaction path. These paths are often generated from interpolations between the reactant and product endpoint geometries. Unfortunately, straightforward interpolation in Cartesian coordinates often yields poor initial paths that converge slowly. In this work, we reformulate the problem of interpolation between endpoint geometries as a search for the geodesic curve on a Riemannian manifold. We show that the perceived performance difference of interpolation methods in different coordinates is the result of an implicit metric change. Accounting for the metric explicitly allows us to obtain good results in Cartesian coordinates, bypassing the difficulties caused by redundant coordinates. Using only geometric information, we are able to generate paths from reactants to products which are remarkably close to the true minimum energy path. We show that these geodesic paths are excellent starting guesses for minimum energy path algorithms.

## I. INTRODUCTION

The search for minimum energy paths (MEPs) is a ubiquitous task in the study of chemical reactions. The MEP,^{1,2} often referred to as the “reaction path,” provides a compact description of the rearrangement of atoms from one molecular structure to another, forming the basis of our intuitive understanding of chemical reaction mechanisms. The highest energy point along the MEP is a transition state, which allows an estimate of reaction rates through transition state theory (TST).^{3–5} Recently, the *ab initio* nanoreactor^{6} and similarly motivated methods^{7–11} for automatic reaction discovery have been introduced which require massive numbers of reaction rate calculations. This has intensified the demand for highly efficient and robust methods for MEP optimization. Construction of an MEP often requires a reasonable path from reactants to products as an initial guess. These initial guess paths are usually generated by interpolation, such as Linear Synchronous Transit (LST),^{12} but can also be generated from molecular dynamics simulations. Nudged Elastic Band (NEB)^{13} and the closely related “string” methods^{14} are often used to refine the initial pathway. A practical difficulty is that the rate of convergence as well as the quality of the final converged MEP depends strongly on the choice of the initial path.

Interpolation in Cartesian coordinates can result in improbable high-energy structures (for example, when atoms come into close contact) and is one of the primary sources of instability in MEP optimization.^{15} In fact, the topological structure of molecular potential energy surfaces (PESs) is better captured by internal coordinates.^{16–18} Throughout this paper, we use internal coordinates to denote a set of (possibly redundant) functions of the Cartesian coordinates that are invariant to overall rotations and translations of the molecule.
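This close-contact failure is easy to demonstrate numerically. A minimal sketch (the two-atom geometry below is an invented toy example, not a system from this work): when two atoms exchange positions between reactant and product, straight-line Cartesian interpolation drives them through a collision at the path midpoint.

```python
import numpy as np

# Toy example: two atoms swap places between reactant and product.
reactant = np.array([[0.0, 0.0, 0.0],   # atom A
                     [2.0, 0.0, 0.0]])  # atom B
product = reactant[::-1].copy()         # A and B exchanged

def cartesian_interpolate(x0, x1, t):
    """Straight-line interpolation of Cartesian coordinates."""
    return (1.0 - t) * x0 + t * x1

midpoint = cartesian_interpolate(reactant, product, 0.5)
d_mid = np.linalg.norm(midpoint[0] - midpoint[1])
# The two atoms collide at the midpoint even though they are
# separated by 2.0 at both endpoints.
```

Such collision structures have enormous energies and are exactly the improbable geometries that destabilize subsequent MEP optimization.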

In general, it is impossible to assign a single global set of linearly independent internal coordinates that accurately describes the structure of a molecule. However, in the neighborhood of any given configuration (such as the vicinity of an equilibrium structure), it is always possible to construct a well-defined set of coordinates. In the language of differential geometry, these are local “coordinate charts,” and the internal coordinate manifold is locally isomorphic to $\mathbb{R}^{3N-6}$, where N is the number of atoms. Globally, however, the topology of the internal coordinate manifold is more complex and does not resemble a Euclidean space.

Ignoring the non-linear dependencies between internal coordinates^{19} has led to some efficient algorithms^{20} for the optimization of equilibrium geometries and transition states.^{21–23} These methods form the generalized inverse of the Wilson G matrix^{24,25} to transform the gradients and Hessian matrices. The G matrix transformation, however, is geometry dependent. Generalization of this local analysis method to problems that involve the global structure of the internal coordinate space, such as interpolation or reaction path optimization, requires the tools of differential geometry. One example of such analysis can be found in previous work on potential energy surface interpolation,^{26} where Cartesian and internal coordinate spaces are treated as Riemannian manifolds. In that work, singular value decomposition (SVD) of the Wilson B matrix is used to construct local coordinate charts in which local Taylor expansions of the potential energy are expressed. These Taylor expansions are then smoothly patched together to obtain an interpolated global potential energy surface.

The space of configurations of a molecule is also known as the *reduced configuration space*^{27} and is constructed as the quotient space $\mathbb{R}^{3N}/SE(3)$, where SE(3) is the Special Euclidean group of translations and rotations acting on $\mathbb{R}^{3}$. In other words, a point in the reduced configuration space corresponds to all the possible Cartesian geometries of a molecule that differ only by an overall translation or rotation. In cases where chirality of the molecule is not of concern, the reduced configuration space is sometimes instead defined as $\mathbb{R}^{3N}/E(3)$, where geometries that are mirror images are also collapsed to the same point in the reduced configuration space. The group of rotations, SO(3), has a non-trivial topology, so the topology of the quotient space is expected to also be rather complicated. A concrete implication is the algebraic “syzygy” relationships between bond lengths that arise for molecules with more than four atoms.^{28,29}

Current path interpolation methods such as LST,^{12} Image Dependent Pair Potential (IDPP),^{30} and the Nebterpolator^{31} employ a two-step process to perform interpolation in a space of redundant coordinates. A direct interpolation is first performed to yield a straight line in the redundant internal coordinate space. Because this does not account for nonlinear constraint relationships between the redundant internal coordinates, mathematically inconsistent sets of internal coordinate values can be generated. For example, the triangle inequality amongst bond lengths can be violated.
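This infeasibility can be checked numerically. For five points in three dimensions, the Cayley–Menger determinant built from the pairwise distances must vanish; evaluating it on linearly interpolated distances therefore gives a quick feasibility test. A sketch (the two five-atom point sets are arbitrary illustrative choices, and `cayley_menger` is our helper, not code from this work):

```python
import numpy as np

def distance_matrix(x):
    """All pairwise Euclidean distances of a point set."""
    return np.linalg.norm(x[:, None, :] - x[None, :, :], axis=-1)

def cayley_menger(D):
    """Cayley-Menger determinant of an n x n distance matrix.
    For five points realizable in 3-D space, it must vanish."""
    n = D.shape[0]
    M = np.ones((n + 1, n + 1))
    M[0, 0] = 0.0
    M[1:, 1:] = D**2
    return np.linalg.det(M)

# Two arbitrary, perfectly feasible 5-atom geometries.
xa = np.array([[0, 0, 0], [1, 0, 0], [0, 1, 0], [0, 0, 1], [1, 1, 1]], float)
xb = np.array([[0, 0, 0], [0, 0, 2], [0, 2, 0], [2, 0, 0], [1, 2, 2]], float)

# Interpolate the distances directly, as the schemes above do.
D_mid = 0.5 * (distance_matrix(xa) + distance_matrix(xb))

cm_a = cayley_menger(distance_matrix(xa))   # ~0: xa is realizable in 3-D
cm_b = cayley_menger(distance_matrix(xb))   # ~0: xb is realizable in 3-D
cm_mid = cayley_menger(D_mid)               # generically nonzero: infeasible
```

The nonvanishing midpoint determinant signals that no 3-D geometry reproduces the interpolated distance set exactly.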

Here, we refer to these inconsistent combinations of internal coordinate values as *infeasible*. The second step of the above algorithms must therefore find a path that resides in the *feasible* subspace, ideally lying close to the original infeasible path. Often, a minimization process is used to find valid geometries with internal coordinates closest to these interpolated values. However, as we will show in detail in Secs. II and IV, various problems can arise in these methods.

In this work, we propose a different interpolation approach, which performs displacements only in Cartesian coordinates. The method, which is closely related to Intrinsic Reaction Coordinate (IRC) methods, is detailed in Sec. III, where we define the interpolation path between two geometries as the geodesic curve in a Riemannian manifold. An efficient algorithm is provided to determine approximate geodesics by minimizing the arc length exclusively within Cartesian coordinates. We then show how our approach is connected to the local G matrix projection^{20} and B matrix SVD methods.^{26} In fact, we show that the G matrix projection method is equivalent to choosing a specific local chart on the manifold in this approach and explain why such projection schemes are inappropriate for problems involving global geometry such as interpolation or reaction path optimization.

The basic connection between geodesic curves and interpolation paths that we exploit in this work can be understood without the cumbersome mathematics of differential geometry. When the initial and final structures are in close proximity, the difficulty of infeasible coordinate values disappears. In such cases, all existing interpolation methods work well. Taking advantage of this, interpolation paths for reactions involving large amplitude motions can be obtained by requiring the internal coordinates of any small segments of the path to be close to the linear interpolation of the coordinate values at the end points of each segment. This is in fact equivalent to the definition of a geodesic curve and is the main insight exploited in our method.

Geodesic interpolation is formulated as an approximation to the geodesic formalism of IRC^{32} and bears some similarity to the variational reaction energy (VRE)^{33–35} and variational reaction coordinate (VRC) methods^{36,37} for MEP optimization. The key idea is to find a metric for the reduced configuration space manifold that approximates the interactions which define the potential energy. This explains the surprising ability of geodesic interpolation to generate paths that are very similar to MEPs, using only geometric information about the molecule.

In Sec. IV, numerical examples are provided to illustrate the interpolation process in detail. Examples are provided to show how and why a number of previous interpolation methods fail. We then show that the new method leads to more meaningful paths with lower energy, serving as better starting guesses for NEB calculations that converge quickly and reliably and also tending to lead to lower energy MEPs.

## II. REVIEW OF INTERPOLATION METHODS IN REDUNDANT INTERNAL COORDINATES

Interpolation between geometries is conventionally performed by individually interpolating each coordinate, treating all coordinates as independently varying. To properly characterize large amplitude motions such as chemical reactions, it is often necessary to use more coordinates than the total number of degrees of freedom in the system. In such cases, there are necessarily functional relationships between internal coordinates such that they are not linearly independent. Such sets of internal coordinates are referred to as redundant internal coordinates.

Treated as if they were independent, redundant internal coordinates span a space isomorphic to $\mathbb{R}^{M}$ for some M > 3N − 6 (the dimensionality of the reduced configuration space). The image of the 3N-dimensional Cartesian coordinate space embedded in this redundant internal coordinate space, i.e., the subset of redundant internal coordinate space that corresponds to a realizable Cartesian geometry, is hereafter referred to as the feasible subspace. In the redundant internal space, the feasible subspace is curved. Therefore, direct interpolation paths constituting a straight-line segment in $\mathbb{R}^{M}$ can lie outside of the feasible space and may not be physically meaningful.

This problem has long been known, and a number of approaches have been proposed to overcome it. In the widely used Linear Synchronous Transit (LST) interpolation scheme,^{12} the interpolation path is obtained by minimizing the difference between the (feasible) internal coordinates of the images and the targeted (but potentially infeasible) direct interpolation values through a weighted least-squares procedure. As reported previously,^{30} this highly nonlinear minimization tends to produce discontinuous paths. The problem is exacerbated in larger molecules where the interpolated bond lengths can be far from any feasible geometry and in molecules with large amplitude angular and/or torsional motions. The Image Dependent Pair Potential (IDPP) method^{30} was introduced in order to alleviate these difficulties. IDPP uses the squared norm of the difference between the internal coordinates of an image and their target interpolated values as the energy of that image in a NEB-like optimization process. In other words, for each image, the weighted distance to the infeasible interpolation is minimized like in LST, but motion along the path is determined by a spring force which penalizes large geometric changes along the path. The energy function to be minimized takes the form
$$E_k\!\left(\vec{R}_k\right) = \sum_{i<j} w\!\left(d_{ij}^{k}\right)\left(d_{ij}^{k} - \left|\vec{r}_i\!\left(\vec{R}_k\right) - \vec{r}_j\!\left(\vec{R}_k\right)\right|\right)^{2}$$

where $\vec{R}_k$ are the Cartesian coordinates of the *k*th image (i.e., bead), $\vec{r}_i(\vec{R}_k)$ are the Cartesian coordinates of the *i*th atom in the *k*th image, and $d_{ij}^{k}$ is the target bond length between atoms *i* and *j* for image *k*, obtained by linearly interpolating the corresponding bond length between the initial and final points. The weighting function *w*(*d*) = 1/*d*^{4} is used to ensure that distances between atoms that are close to each other and strongly interacting are more accurately reproduced than those for pairs of atoms that are far apart and weakly interacting. This weight factor is crucial for the quality of the resulting interpolation. In fact, the chemical intuition carried in such weight factors is the most important reason why IDPP and LST work better than Cartesian interpolation.
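The per-image objective described above can be condensed into a short sketch (a schematic reimplementation of the stated functional form, not the reference IDPP code; evaluating the weight at the target distance is our assumption):

```python
import numpy as np

def idpp_image_energy(x, d_target, eps=1e-12):
    """IDPP-style objective for a single image.

    x        : (n_atoms, 3) Cartesian coordinates of the image
    d_target : (n_atoms, n_atoms) linearly interpolated target distances
    Uses the weight w(d) = 1/d^4 so that short, strongly interacting
    contacts are reproduced more accurately than distant pairs.
    """
    n = len(x)
    e = 0.0
    for i in range(n):
        for j in range(i + 1, n):
            d = np.linalg.norm(x[i] - x[j])
            w = 1.0 / (d_target[i, j]**4 + eps)
            e += w * (d_target[i, j] - d)**2
    return e

# An image that already realizes its target distances has zero energy.
x = np.array([[0.0, 0.0, 0.0], [1.5, 0.0, 0.0], [1.5, 1.2, 0.0]])
targets = np.linalg.norm(x[:, None, :] - x[None, :, :], axis=-1)
```

In the actual IDPP procedure, each image carries its own `d_target` and is moved under this potential plus inter-image spring forces.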

The same functional is used as the target function in the LST least-squares minimization step. However, in IDPP, each image has a different energy expression (hence the name of the method—Image Dependent Pair Potential). In the subsequent NEB-like process, a spring force governs the movement of images along the path, forcing the path to be smooth and preventing discontinuities. However, it is important to note that once spring forces between adjacent images are introduced, it is not obvious how to formulate the IDPP process as the minimization of an objective function (similar issues arise in NEB). As we will show in Sec. IV, the IDPP process is prone to oscillation and can fail to converge.

A more robust method to remove discontinuities in interpolated paths is provided by the Nebterpolator,^{31} which performs direct interpolation in a set of redundant internal coordinates that also includes bond angles and torsions. The Nebterpolator method often uses trajectories from molecular dynamics simulations as starting guesses and is shown in Sec. IV to be significantly more robust than the Cartesian linear interpolation guesses generated by LST and IDPP. Similar to LST, the initial guess path in the redundant coordinates is refined through least-squares minimization to obtain geometries that reproduce the interpolated values of internal coordinates as closely as possible. In the Nebterpolator, discontinuities are actively detected by monitoring the Cartesian distance between each pair of neighboring images. When the Cartesian distance between two neighboring images is greater than the length of an adjacent segment by a factor of two or more, the path is considered discontinuous. In such cases, the square of the Cartesian distance between these two images is added to the target function as a spring term, and the weight of this term is automatically increased until discontinuities disappear. This pulls neighboring images closer in Cartesian space and ensures a smooth interpolated path.
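The discontinuity check described here amounts to a few lines of code. A sketch (the helper name, array layout, and default threshold are ours, following the factor-of-two criterion above):

```python
import numpy as np

def find_discontinuities(path, factor=2.0):
    """Flag segments whose Cartesian length exceeds an adjacent
    segment's length by `factor` or more, following the Nebterpolator
    criterion described in the text.

    path : (n_images, n_atoms, 3) array of Cartesian geometries
    Returns indices I of suspect segments (between images I and I+1).
    """
    seg = np.array([np.linalg.norm(path[i + 1] - path[i])
                    for i in range(len(path) - 1)])
    bad = []
    for i, s in enumerate(seg):
        neighbors = [seg[j] for j in (i - 1, i + 1) if 0 <= j < len(seg)]
        if neighbors and s >= factor * min(neighbors):
            bad.append(i)
    return bad

# A single atom moving along x in even steps, with one large jump.
path = np.zeros((5, 1, 3))
path[:, 0, 0] = [0.0, 1.0, 2.0, 7.0, 8.0]  # segment 2->3 is 5x too long
```

Segments flagged this way would then receive the Cartesian spring term with an automatically increasing weight.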

Although both IDPP and Nebterpolation can remove discontinuities in the interpolated path, this comes at the price of significant degradation in the quality of the final interpolated paths (see Sec. IV). The NEB-like process in IDPP sometimes creates extremely distorted intermediate structures, leading to instability and convergence failures. Although Nebterpolation can consistently generate continuous pathways, the resulting pathways can have very large barriers because the large spring term causes portions of the path to effectively revert to Cartesian interpolation.

To summarize, all three methods described above start with direct interpolation paths in redundant internal coordinates, which are often infeasible. This is followed by an optimization process, which identifies a feasible path close to the direct interpolation path. However, when the end points are very different geometries, the direct interpolation is often very distant from the feasible set, and the highly nonlinear second step is complicated by the existence of multiple local minima.

## III. THEORY

From the discussion above, it is clear that an appropriate choice of coordinates can result in interpolation paths that resemble MEPs. At the same time, the need to operate in redundant internal coordinates introduces various difficulties. In this section, we use the tools of differential geometry to investigate the influence of coordinate choice on interpolation paths and MEPs. We show that they are both coordinate independent and the perceived difference in interpolations is caused by implicit changes of the metric. In fact, the appropriate metric for interpolation paths is already known from previous results on IRCs. This realization leads us to develop a new interpolation scheme that provides better approximations to MEPs, while avoiding the need to operate in redundant coordinates.

### A. The influence of metric and coordinates on minimum energy paths and interpolations

There has been significant confusion in the literature concerning the dependence of the MEP on the choice of coordinates, which has been detailed and explained previously.^{38–40} Here, we carefully distinguish between the metric and the choice of coordinates, which is key to avoiding previous misunderstandings. In the framework of modern coordinate-free differential geometry, a Riemannian manifold is defined as a smooth manifold equipped with a metric. If the metric is not correctly accounted for during coordinate transformations, the MEP appears to depend on the chosen coordinate system. However, it has been established^{38–40} that a correct formulation in terms of covariant derivatives^{41} leads to an MEP that is independent of the coordinate system. On the other hand, given a definition of the MEP, the corresponding metric is uniquely determined. The most widely used definition of the MEP is the Intrinsic Reaction Coordinate (IRC)^{10,42} proposed by Fukui, for which the kinematic metric,^{40} i.e., the mass-weighted Euclidean metric, should be used.

Given that MEPs are coordinate independent, the reader might find it odd that interpolations in internal coordinates would provide a better estimation for chemical reactions compared to Cartesian coordinates. Indeed, a proper definition of interpolation should depend only on the metric and not on the coordinates used. However, metric choices in interpolation methods are often implicit and empirical. Inconsistencies within the same procedure, as discussed in Sec. II, frequently cause divergence and discontinuities. To design better performing interpolation schemes, it is crucial to first establish a metric that yields paths close to true MEPs.

An important relation between geometry and energetics of molecules was provided by Tachibana and Fukui in the geodesic formulation of IRC^{32} and in their analysis of the differential geometry of chemical reactions.^{40} Fukui showed the existence of one particular metric under which geodesic curves correspond to the IRCs. Note that this does not imply metrics can be changed at will. As representations of IRCs, gradient descent paths are associated with the mass-weighted Cartesian metric, and the geodesic curves are associated with Fukui’s metric. The expression for reaction paths needs to be properly adapted whenever metrics are changed to retain the correct behavior.

Key concepts of differential geometry used in this work are briefly recapitulated in Appendix A. For simplicity, from this point on, we adopt the Einstein summation convention. Fukui *et al.* showed that the IRC is a geodesic curve on a Riemannian manifold, where the metric, in the form of the following length element, is derived from the potential energy of the molecule

where $g_{ij}$ is the mass-weighted Cartesian Euclidean metric expressed in the current internal coordinate space, while $g^{kl}$ is its inverse. This length element is invariant under different choices of coordinates **q**, as it should be. A geodesic curve is a generalization of a straight line to a curved space and is therefore directly related to interpolation. When the feasibility of redundant coordinates is ignored, direct interpolations are just straight lines in the internal coordinate space. In fact, by the Nash embedding theorem,^{43,44} it is possible to embed Fukui’s Riemannian manifold with the metric given by Eq. (2) into a Euclidean space of high dimensionality, and if the basis of this space is used as “internal coordinates,” then finding a geodesic curve on the hypersurface of feasible geometries will yield the exact IRC.

Unfortunately, the Nash embedding theorem only states the existence of such an embedding with an upper bound for the required dimensionality of the embedding space. Generating such an embedding exactly would require full knowledge of the potential energy surface (including its first and second derivatives with respect to molecular displacements), which is generally not available.

Fortunately, with the tools of Riemannian geometry, geodesics can be calculated with any coordinates or basis. Indeed, Fukui’s seminal contribution made clear the distinction between the metric tensor used to define a notion of distance on the PES manifold and the different possible coordinate systems used to represent the PES. Importantly, the metric tensor is an entity that is independent of the choice of coordinates. Using different coordinate systems will result in different matrix representations of the metric tensor, but the geodesic curves calculated using the same metric are always the same.

This suggests that the reason some interpolation schemes work better than others is not because of the choice of coordinates, but rather the (possibly implicit) choice of the metric that is used. With the correct metric, geodesics can be directly located using Cartesian coordinates, avoiding the need to explicitly solve the embedding problem.

When mass-weighted Cartesians are used as the basis, Eq. (2) simplifies to

and therefore the matrix of the metric tensor in mass-weighted Cartesians is

Supposing that internal coordinates ** q** are used, the Euclidean metric gives the metric tensor
$$\sum_{k} \frac{\partial q^{k}}{\partial x^{i}}\,\frac{\partial q^{k}}{\partial x^{j}} \qquad (5)$$

and one can see that a straight line in {*q*_{i}} will be a good approximation to the true IRC if and only if

that is, if the coordinates **q** capture the shape of the potential or, equivalently, if the potential energy surface is almost flat in **q**. The success of interpolating global PESs in coordinates constructed from inverse bondlengths^{26,45} implies that the PES appears nearly flat when expressed in inverse bondlengths.

The idea in this work is to keep the induced metric structure of Eq. (5) so that the manifold is a local isometric immersion in the Euclidean space of redundant internal coordinates; in other words, we choose a metric that is Euclidean when using a specific set of redundant internal coordinates (see Appendix C for details). This property allows for a straightforward interpretation of the internal coordinate space consistent with the conventional interpolation picture. This also greatly simplifies the process of geodesic optimization (see Appendix B). The approximation in Eq. (6) is easily achieved by using one of the traditional diatomic potential functions, such as Morse or Lennard-Jones, for the internal coordinates **q**. Chemically, it is generally true that any two atoms interact more strongly at smaller distances, where the interactions often follow exponential or inverse power scaling with the distance. On the other hand, it is very difficult to predict if the interaction between two atoms is attractive or repulsive. The form of Eq. (6) allows us to properly account for the general distance scaling of interatomic interactions, without the need to determine their signs, which greatly simplifies the problem.

In this work, the functions q_{i} are defined as the following scaled inter-atomic distance coordinates (denoted as *R̃*_{kl}):
$$q_i = \tilde{R}_{kl} = \exp\!\left(-\alpha\,\frac{r_{kl} - r^{e}_{kl}}{r^{e}_{kl}}\right) + \beta\,\frac{r^{e}_{kl}}{r_{kl}} \qquad (7)$$

where *i* labels the coordinates, each of which depends on a pair of atoms. The Euclidean distance between the *k*th and *l*th atoms is denoted as *r*_{kl}, and $r^{e}_{kl}$ denotes an approximate equilibrium bond distance given as the sum of the covalent radii of the two atoms. The parameters α and β are included to allow some tuning of the relative importance of short- and long-range interactions. In this work, we chose α = 1.7 and β = 0.01, a choice which reflects the fact that short-range repulsion dominates inter-atomic interactions. We find that the quality of the interpolation (see below) is quite insensitive to small changes in the values of these parameters.
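A sketch of these coordinates in code (the covalent radii values are illustrative, and the exact functional form, an exponential in the relative stretch plus a small 1/r tail, is our reading of the scaled-distance definition above):

```python
import numpy as np

# Approximate covalent radii (Angstrom); values are illustrative only.
COVALENT_RADIUS = {"H": 0.31, "C": 0.76, "N": 0.71, "O": 0.66}

def scaled_distance_coords(x, elements, alpha=1.7, beta=0.01):
    """Scaled inter-atomic distance coordinates described in the text:
    a decaying exponential in the stretch (r - r_e)/r_e plus a small
    long-range 1/r term, with r_e the sum of covalent radii."""
    q = []
    n = len(x)
    for k in range(n):
        for l in range(k + 1, n):
            r = np.linalg.norm(x[k] - x[l])
            r_e = COVALENT_RADIUS[elements[k]] + COVALENT_RADIUS[elements[l]]
            q.append(np.exp(-alpha * (r - r_e) / r_e) + beta * r_e / r)
    return np.array(q)

# Each coordinate decreases monotonically with separation, so close,
# strongly interacting pairs dominate the induced metric.
x_close = np.array([[0.0, 0.0, 0.0], [1.0, 0.0, 0.0]])
x_far = np.array([[0.0, 0.0, 0.0], [3.0, 0.0, 0.0]])
q_close = scaled_distance_coords(x_close, ["C", "H"])
q_far = scaled_distance_coords(x_far, ["C", "H"])
```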

The procedure is illustrated for a two-dimensional model problem in Fig. 1. Here, the potential is computed from a model system with three Morse potential terms from the three distances labeled in the middle panel of Fig. 1. Given three reference points *x*_{1}=(2, −1), *x*_{2}=(−2, −1), and *x*_{3}=(0, 0), the model potential is defined as

Three distance coordinates are chosen as the redundant coordinate system. These coordinates are defined according to Eq. (7) using distances to the three reference points. Note that we choose different exponential factors in the potential and the coordinate definition to ensure generality of the example. Using a set of internal coordinates whose Euclidean metric loosely approximates Fukui’s metric, the MEP becomes nearly straight, as shown in the right panel of Fig. 1. This is the case despite the fact that a general exponential factor is used instead of the exact one in the Morse potentials and that the repulsive higher order term in the Morse potential was ignored in the coordinate scaling. Changing from the Cartesian Euclidean metric to the induced Euclidean metric of internal coordinates causes the region where energy changes rapidly to be stretched while regions where energies are nearly constant are compressed. As shown in the right panel of Fig. 1, the asymptotic region, which is generally flat, is compressed near a point while the regions on the PES where energies are extremely high are stretched into long legs on the embedded reduced configuration space hypersurface.

It is of course possible to incorporate more elaborate choices for the coordinates, for example, by taking into account atom types and their typical radii. Using Eq. (7), coordinates may also be constructed from force fields or *ab initio* data. Such scaling methods will likely further improve the quality of the interpolation. However, we leave this to future work as our main aim here is to show how to effectively find the interpolation path given some arbitrary coordinate system. Of course, there will be diminishing returns as the coordinates become more complex and more molecule specific since the coordinates will be both harder to determine and less transferable.

### B. Interpolation with geodesics

Let us denote the reduced configuration space of the molecule as *M*. Recall that $M = \mathbb{R}^{3N}/SE(3)$ when chirality needs to be resolved and $M = \mathbb{R}^{3N}/E(3)$ otherwise. We start with a set of *K* redundant internal coordinate functions $q: M \to \mathbb{R}^{K}$

From this set of functions, a Riemannian manifold (*M*, *g*) can be defined, where the metric *g* is defined at each *p* ∈ *M* by
$$g_{ij}\big|_p = \sum_{k=1}^{K} \frac{\partial q^{k}}{\partial x^{i}}\bigg|_p\,\frac{\partial q^{k}}{\partial x^{j}}\bigg|_p$$

Here, we have used the Cartesian basis for ease of exposition. An approximation to the MEP from point A to B is obtained by constructing the geodesic curve that connects them in the manifold (*M*, *g*). Detailed analysis is given in Appendix A, and we just summarize here.

Let us denote a curve in Cartesian coordinates between A and B as *γ*(*t*) = *x*^{i}(*t*)**x**_{i}, where {**x**_{i}} is the Cartesian basis. The length of the path from A to B on *M* is given as [this equation is derived in Appendix A, where it can be found as Eq. (A10)]
$$L[\gamma] = \int_0^1 \sqrt{g_{ij}\big(\gamma(t)\big)\,\dot{x}^{i}(t)\,\dot{x}^{j}(t)}\;dt \qquad (12)$$

Here and in the following, we use single overdots to denote first derivatives with respect to the parameter of the curve, and double overdots to denote second derivatives
$$\ddot{x}^{k} + \Gamma^{k}_{ij}\,\dot{x}^{i}\,\dot{x}^{j} = 0$$

Henceforth, we omit the explicit dependence of each of the quantities on *γ*(*t*) for simplicity.

### C. Numerical method to find approximate geodesics between a pair of points

Efficient methods are available for finding geodesic paths on manifolds using the fast marching method.^{46,47} Widely used in computer vision^{48} and robotics,^{49} these methods accurately find the shortest geodesic between two points. However, they require propagation of the distance function on a grid that spans the entire manifold. This is unrealistic for molecular systems because of the high dimensionality of the configuration space.

One method related to the current work has seen much success in the field of computer-aided design and 3D modeling.^{50} Similar to the treatment here, the shapes of real-world objects are treated as points on a Riemannian manifold, and methods have been developed to perform geodesic interpolation between shapes to estimate how objects deform from one shape to another. Shapes are represented by a triangular surface mesh, which is required to remain the same along the entire path. This is usually not appropriate for describing the shape of molecules. Perhaps more importantly, the method is designed for deformations that do not change the topology of the object and suffers from collision problems when large amplitude motions are involved. In chemistry, however, the change in shape is often drastic, so a more general approach is necessary.

To find the geodesic curve between points A and B, we first need to find an efficient and accurate approach to numerically evaluate the integral of Eq. (12). We begin by representing the path by a discrete series of images **x**^{(n)} = *γ*(*t*^{(n)}), where **x**^{(1)} = *A* and **x**^{(N)} = *B*, so that Eq. (12) can be written as the sum of the length of each segment between images
$$L = \sum_{n=1}^{N-1} \int_{t^{(n)}}^{t^{(n+1)}} \sqrt{g_{ij}\,\dot{x}^{i}\,\dot{x}^{j}}\;dt \qquad (14)$$

We require that the segment between each pair of neighboring images is a geodesic curve. Since any segment of a geodesic curve must also be a geodesic curve, the desired path minimizes the length defined in Eq. (14). Although there is still no formula to calculate the exact length of these shorter geodesic segments, upper and lower bounds can be derived by slicing the path γ into a sufficiently large number of pieces
$$\left\|\mathbf{q}\big(\mathbf{x}^{(n+1)}\big) - \mathbf{q}\big(\mathbf{x}^{(n)}\big)\right\| \;\le\; L^{(n)} \;\le\; \int_{0}^{1} \left\|\frac{d}{dt}\,\mathbf{q}\big((1-t)\,\mathbf{x}^{(n)} + t\,\mathbf{x}^{(n+1)}\big)\right\| dt$$

The meaning of these bounds can be easily understood when the reduced configuration space *M* is viewed as an immersion in the redundant internal coordinate space. As shown in Fig. 2, the geodesic is the shortest feasible path between the two end points, where “feasible” means that the path lies entirely inside the reduced configuration space. The lower bound corresponds to a straight line in the redundant internal space, which is in general infeasible. Being a straight line in the internal space, it is no longer than any path linking the two end points, including the geodesic. The fact that it is shorter than the geodesic attests to its infeasibility. The upper bound, on the other hand, is the reduced configuration space path that is straight in Cartesian space with the Euclidean metric. This path is curved under the interpolation metric and, being a feasible path, is no shorter than the geodesic.

An approximate formula that strictly lies between the upper and lower bounds is also available
$$L \approx \sum_{n=1}^{N-1} \left(\left\|\tilde{\mathbf{q}}^{(n)} - \mathbf{q}^{(n)}\right\| + \left\|\tilde{\mathbf{q}}^{(n)} - \mathbf{q}^{(n+1)}\right\|\right), \qquad \tilde{\mathbf{q}}^{(n)} = \mathbf{q}\!\left(\tfrac{1}{2}\big[\mathbf{x}^{(n)} + \mathbf{x}^{(n+1)}\big]\right)$$

The derivations of these expressions are found in Appendix B. The numerical procedure to compute path length and its derivatives with respect to individual images is shown in Algorithm 1.

function Length(x^{(I)}, f^{i}, ∂_{j}f^{i}) | |

input parameters | |

x^{(I)} | Initial geometries |

f^{i} | Coordinate functions used to generate the metric |

∂_{j}f^{i} | Derivatives of the coordinate functions |

returns | |

L | Piecewise geodesic length |

∇L ≡ {∂_{j}^{(I)}L} | Derivatives of the path length with respect to each image |

L := 0 | Approximate geodesic length |

for all image indices I do | |

$q_i^{(I)} := f^i\big(x^{(I)}\big)$ | Compute internal coordinate values |

$B_{ij}^{(I)} := \partial_j f^i\big(x^{(I)}\big)$ | Compute Wilson’s B matrices |

$\partial_j^{(I)} L := 0$ | Initialize derivatives of the length with respect to each image |

end for | |

for I := 1 to N^{image} − 1 do | |

$\tilde{x}^{(I)} := \big[x^{(I)} + x^{(I+1)}\big]/2$ | Cartesian geometry of the midpoint of images I and I + 1 |

$\tilde{q}_i^{(I)} := f^i\big(\tilde{x}^{(I)}\big)$ | Internal coordinates of the midpoint |

$\tilde{B}_{ij}^{(I,I)} := \partial\tilde{q}_i^{(I)}/\partial x_j^{(I)} = \tfrac{1}{2}\,\partial_j f^i\big(\tilde{x}^{(I)}\big)$ | B matrix at the midpoint, with respect to each side |

$\tilde{B}_{ij}^{(I,I+1)} := \partial\tilde{q}_i^{(I)}/\partial x_j^{(I+1)} = \tfrac{1}{2}\,\partial_j f^i\big(\tilde{x}^{(I)}\big)$ | |

$L \mathrel{+}= \big\|\tilde{q}^{(I)} - q^{(I)}\big\| + \big\|\tilde{q}^{(I)} - q^{(I+1)}\big\|$ | |

$\partial_j^{(I)} L \mathrel{+}= \dfrac{\tilde{q}_i^{(I)} - q_i^{(I)}}{\big\|\tilde{q}^{(I)} - q^{(I)}\big\|}\Big[\tilde{B}_{ij}^{(I,I)} - B_{ij}^{(I)}\Big] + \dfrac{\tilde{q}_i^{(I)} - q_i^{(I+1)}}{\big\|\tilde{q}^{(I)} - q^{(I+1)}\big\|}\,\tilde{B}_{ij}^{(I,I)}$ | |

$\partial_j^{(I+1)} L \mathrel{+}= \dfrac{\tilde{q}_i^{(I)} - q_i^{(I)}}{\big\|\tilde{q}^{(I)} - q^{(I)}\big\|}\,\tilde{B}_{ij}^{(I,I+1)} + \dfrac{\tilde{q}_i^{(I)} - q_i^{(I+1)}}{\big\|\tilde{q}^{(I)} - q^{(I+1)}\big\|}\Big[\tilde{B}_{ij}^{(I,I+1)} - B_{ij}^{(I+1)}\Big]$ | |

end for | |

return L, ∇L | |

end function |

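A minimal sketch of the midpoint-based length formula in Algorithm 1, without the derivative accumulation, might look as follows; the name `coord_fn` stands in for the internal coordinate functions f^{i} and is an assumption of this sketch rather than the interface of the reference code.

```python
import numpy as np

def approximate_length(images, coord_fn):
    """Approximate piecewise geodesic length in the spirit of Algorithm 1:
    for each pair of neighboring images, take the Cartesian midpoint and
    add the internal-coordinate distances from both images to it.
    Derivative accumulation is omitted; `coord_fn` (Cartesian geometry ->
    internal coordinates) is a placeholder for the functions f^i.
    """
    total = 0.0
    for xa, xb in zip(images[:-1], images[1:]):
        x_mid = 0.5 * (xa + xb)          # Cartesian midpoint of the segment
        q_mid = coord_fn(x_mid)          # internal coordinates of the midpoint
        total += (np.linalg.norm(q_mid - coord_fn(xa))
                  + np.linalg.norm(q_mid - coord_fn(xb)))
    return total
```

Because each segment is measured via its Cartesian midpoint, the sum tends to overestimate the true geodesic length for widely spaced images, which is exactly the property exploited for implicit regularization below.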

As shown in Appendix B, the error of the lower bound formula [Eq. (15)] relative to the true length scales as *O*(Δ*l*^{3}), while the errors of both the upper bound [Eq. (16)] and the approximation of Eq. (17) scale as *O*(Δ*l*^{2}). The fact that Eq. (17) tends to overestimate the true length when images are far apart significantly improves the stability of the numerical process: the positive error, which scales quadratically with individual segment lengths, acts as an implicit regularization term. Path length minimization using Eqs. (14) and (17) therefore automatically suppresses large distances between images, driving them toward an even distribution along the path without the need for an additional redistribution step. We can thus start from an unevenly distributed set of images and let path length minimization redistribute them.

We find that the most robust strategy to minimize the path length is to adjust one image at a time. The index of the image being adjusted sweeps back and forth along the path until all images stop moving. To ensure that the number of images is sufficiently large to describe the geodesic curve, the upper bound of Eq. (16) and the lower bound of Eq. (15) are computed after determining the minimum path length with Eq. (18). These upper and lower bounds are compared with the approximate length of the final path, as given in Eq. (17). Additional images are added if the difference between the two bounds is large, and Eq. (18) is used again to determine the optimal location for the images. This procedure is described in Algorithm 2.

**Algorithm 2.** Optimization of the approximate geodesic path.

procedure Geodesic(x^{(I)}, f^{i}, ∂_{j}f^{i}, ϵ)
&nbsp;&nbsp;&nbsp;&nbsp;repeat
&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;Δx := 0
&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;for I := 2 to N^{image} − 1 do
&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;minimize Length(x^{(I)}, f^{i}, ∂_{j}f^{i}) by adjusting image I only
&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;let the minimum length be L and the resulting coordinates of image I be y^{(I)}
&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;Δx := ‖x^{(I)} − y^{(I)}‖
&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;x^{(I)} := y^{(I)}
&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;end for
&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;repeat the previous loop with the image index I sweeping in reverse order
&nbsp;&nbsp;&nbsp;&nbsp;until Δx < ϵ
&nbsp;&nbsp;&nbsp;&nbsp;calculate the lower bound L^{lower} using Eq. (15)
&nbsp;&nbsp;&nbsp;&nbsp;calculate the upper bound L^{upper} using Eq. (16) (m = 10 in this work)
&nbsp;&nbsp;&nbsp;&nbsp;if L^{lower} < 0.95L or L^{upper} > 1.1L then
&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;calculate the bounds for each segment between neighboring images
&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;add midpoints for all pairs where the difference between the bounds is too large
&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;restart the procedure with the larger set of images
&nbsp;&nbsp;&nbsp;&nbsp;else
&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;the procedure is finished; the current x^{(I)} gives the approximate geodesic
&nbsp;&nbsp;&nbsp;&nbsp;end if
end procedure

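The one-image-at-a-time sweep strategy of Algorithm 2 can be sketched as below, using a generic black-box `path_length` functional in place of Algorithm 1 and `scipy.optimize.minimize` in place of the paper's optimizer; both choices, and the convergence handling, are illustrative assumptions of this sketch.

```python
import numpy as np
from scipy.optimize import minimize

def sweep_minimize(images, path_length, tol=1e-3, max_sweeps=50):
    """One-image-at-a-time sweep minimization of a path-length functional
    (sketch of the strategy in Algorithm 2). `images` is a list of flat
    numpy arrays; `path_length` maps such a list to a scalar length and
    stands in for Algorithm 1. End points are held fixed."""
    n = len(images)
    for _ in range(max_sweeps):
        max_move = 0.0
        # Forward sweep followed by a reverse sweep over the interior images
        for i in list(range(1, n - 1)) + list(range(n - 2, 0, -1)):
            def seg_len(x):
                trial = images[:i] + [x] + images[i + 1:]
                return path_length(trial)
            res = minimize(lambda v: seg_len(v.reshape(images[i].shape)),
                           images[i].ravel())
            new = res.x.reshape(images[i].shape)
            max_move = max(max_move, np.linalg.norm(new - images[i]))
            images[i] = new
        if max_move < tol:   # all interior images have stopped moving
            break
    return images
```

Re-optimizing the full path through every interior image is wasteful here; only the two segments adjacent to image I actually change, which a production implementation would exploit.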

This procedure assumes that a series of images forming a continuous path between the end points is already known and can be used as a starting guess. In LST and IDPP, such an initial path is generated through Cartesian linear interpolation. This, however, often causes problems: Cartesian interpolations can introduce collisions between atoms and are generally qualitatively different from the MEP. Further complicating the situation, in most reactions many different local MEPs may exist between the same end points. These paths often have drastically different barrier heights, and a poor starting guess can lead to interpolated paths with very high barriers. We address this by sampling many initial paths and choosing the one with the shortest estimated geodesic length. More specifically, we first find the Cartesian geometry that is closest to the average of the internal coordinates of the end points. This involves minimizing the difference between the internal coordinates of the new geometry and that average, which is equivalent to LST with only one image. Both end points are used as starting points for this minimization over a number of iterations, and moderate random noise is added to each starting guess, allowing the process to converge to a number of different local minima. This creates a set of three-point paths. The path with the shortest length, calculated using Algorithm 1, is chosen and used to generate an initial path for Algorithm 2. This procedure is detailed in Algorithm 3.

**Algorithm 3.** Generation of a geodesic interpolation from the two end points.

procedure Interpolate(A, B, f^{i}, ∂_{j}f^{i}, N^{image}, ϵ)
&nbsp;&nbsp;&nbsp;&nbsp;q^{(mid)} := [q(x^{(A)}) + q(x^{(B)})]/2
&nbsp;&nbsp;&nbsp;&nbsp;L^{min} := ∞
&nbsp;&nbsp;&nbsp;&nbsp;for i := 1 to 10 do
&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;if i is odd then
&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;x^{0} := A + 0.1 × random noise
&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;else
&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;x^{0} := B + 0.1 × random noise
&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;end if
&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;find the x which minimizes ‖q(x) − q^{(mid)}‖² with x^{0} as the starting point
&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;L := Length([A, x, B], f^{i}, ∂_{j}f^{i})
&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;if L < L^{min} then
&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;L^{min} := L
&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;x^{(mid)} := x
&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;end if
&nbsp;&nbsp;&nbsp;&nbsp;end for
&nbsp;&nbsp;&nbsp;&nbsp;use the Geodesic procedure to adjust the position x^{(mid)}, optimizing the path [A, x^{(mid)}, B]
&nbsp;&nbsp;&nbsp;&nbsp;create a path of N^{image} images with
&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;x^{(I)} = A for 1 ≤ I ≤ ⅓N^{image}; x^{(I)} = x^{(mid)} for ⅓N^{image} < I ≤ ⅔N^{image}; x^{(I)} = B for ⅔N^{image} < I ≤ N^{image}
&nbsp;&nbsp;&nbsp;&nbsp;use the Geodesic procedure to adjust the path [x^{(I)}]
end procedure

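The midpoint search at the heart of Algorithm 3 can be sketched as follows. For brevity, this sketch keeps the candidate with the smallest coordinate residual instead of ranking the candidates by their Algorithm 1 path length, and `coord_fn` is a placeholder for the internal coordinate functions; neither choice is taken from the reference code.

```python
import numpy as np
from scipy.optimize import minimize

def find_midpoint(xa, xb, coord_fn, n_trials=10, noise=0.1, seed=0):
    """Sketch of the midpoint search in Algorithm 3: locate a Cartesian
    geometry whose internal coordinates best match the average internal
    coordinates of the two end points, restarting alternately from both
    ends with moderate random noise so that different local minima can
    be reached. `coord_fn` maps Cartesian geometry -> internal coordinates.
    """
    rng = np.random.default_rng(seed)
    q_mid = 0.5 * (coord_fn(xa) + coord_fn(xb))   # target internal coordinates
    best_x, best_err = None, np.inf
    for i in range(n_trials):
        # Alternate the starting end point and perturb it with random noise
        x0 = (xa if i % 2 == 0 else xb) + noise * rng.standard_normal(xa.shape)
        res = minimize(lambda x: np.sum((coord_fn(x) - q_mid) ** 2), x0)
        if res.fun < best_err:
            best_err, best_x = res.fun, res.x
    return best_x
```

In the full procedure, each converged candidate would instead be embedded in a three-point path [A, x, B] and scored by its Algorithm 1 length before the best one is kept.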

### D. Geodesic interpolation and variational reaction coordinate (VRC) method

The geodesic interpolation process is closely related to the Variational Reaction Coordinate (VRC) method for the optimization of reaction paths recently developed by Birkholz and Schlegel.^{36,37} In VRC, the steepest descent reaction path is obtained by minimizing the Variational Reaction Energy (VRE),^{33–35} defined as

where U is the potential energy and x(t) is the reaction path. In the VRC method, reaction paths are expanded as linear combinations of a set of quartic B-spline functions. The authors recognized that without the gradient term, the VRE becomes the path length in internal coordinates. Indeed, in their later work,^{37} the following arc length in redundant internal coordinates is minimized as the initial step of the VRC process

which is strongly reminiscent of Eq. (12). However, as the authors have discussed, the VRC method as currently formulated has some problems and does not take full advantage of the redundant internal coordinates. Similar to the work presented in this paper, the VRC paths are constructed in Cartesian coordinates to avoid infeasibility problems. However, because the paths are expanded with spline functions in Cartesian coordinates, the resulting equations are not invariant to overall translations and rotations. Because the metric tensor is degenerate in the translational and rotational degrees of freedom, the VRC equations are singular and require explicit projection of the rotational and translational degrees of freedom to remain stable.

A projection scheme is used in VRC to address the effect of coordinate redundancy. As shown in Appendix D, this projection scheme is non-holonomic. Because the derivatives of the projection matrix are omitted, integration in their projected basis breaks coordinate invariance. In order to obtain an exact coordinate invariant formulation of integrals using redundant coordinates, contributions from the commutators between basis components must be explicitly included, as shown in Appendix A.

Perhaps more importantly, in VRC, no scaling is performed for internal coordinates. The arc length optimization step effectively drops the first factor in Eq. (19). As we show in this paper, a much better approximation to Eq. (19) can be achieved by scaling internal coordinates to make the first factor nearly constant. Note that Eq. (19) simplifies directly into Eq. (20) when the coordinates satisfy *∂U*/*∂x*^{i} ≈ 1.

On the other hand, the method derived in this work is invariant with respect to overall translations and rotations. More precisely, path lengths are properly defined on the reduced configuration space. Singularities due to translations and rotations are therefore avoided by construction.

## IV. NUMERICAL EXAMPLES

In this section, a number of numerical examples of our geodesic interpolation are provided, along with comparisons to other interpolation methods for the same systems.

### A. Finding geodesics from a series of images: Imidazole hydrogen migration

We first compare the performance of different interpolation methods when starting from a given initial guess path. To compare the performance and stability of the geodesic approach and other interpolation schemes, the reaction path for intramolecular hydrogen migration in imidazole is generated using our new geodesic (Algorithm 2), LST, IDPP, and Nebterpolator interpolations. The LST interpolation is performed in internal coordinates. To make it easier to visualize the interpolated paths, the molecule is constrained at the geometry of the deprotonated structure, and only the migrating hydrogen is allowed to move. This allows us to plot the path of the hydrogen in a two-dimensional plane. The metric can also be visualized with this simplification. Since only one atom is moving and the movement is constrained to the plane, each point on the plane corresponds to a unique molecular geometry.

We first generated a linear interpolation in Cartesian coordinates and shifted it by 2 Å so that no atomic collisions occur (gray dotted line labeled “Initial Path” in Fig. 3). This new set of initial images is then used for the initial guess geometries for each of the interpolation methods. The initial path and the interpolations generated from each method are shown in Fig. 3. The true MEP calculated with NEB at the UB3LYP/6-31g level of theory is depicted as a solid black line for comparison.

As shown in Fig. 3, all interpolation methods except for LST generated continuous paths. In LST, the separate solution branches of its target function are disjoint. As a result, it can experience discontinuities even when seeded with a closely spaced, “continuous,” set of images. Nebterpolation produces a smooth path, but this path is relatively far away from the MEP. This is because no scaling is done for internal coordinates; hence, its metric is a poor approximation to the Fukui metric. IDPP, on the other hand, generates a continuous path that reproduces the MEP reasonably well. It does, however, contain a kink. Surprisingly, this kink in the curve does not go away even when the convergence threshold of IDPP is significantly tightened. By shifting the initial guesses by a different amount, it is discovered that at some shift values, this kink does not appear, while with some other shift values, the IDPP process diverges and hydrogen atoms are ejected from the vicinity of the molecule. This shows that, as discussed above, IDPP convergence behavior is not always robust. Geodesic interpolation produces a continuous curve close to the MEP, and the path is stable as long as the initial guess does not pass through the C-H bond. The eigenvectors of the metric tensor are shown in the left panel of Fig. 4 to demonstrate the curved nature of the manifold. The eigenvector with larger eigenvalue is plotted with black grid lines, while the eigenvector that corresponds to the smaller eigenvalue is plotted with blue grid lines. At most geometries, the two eigenvalues of the metric tensor, expanded in the Cartesian basis, are well-separated, often by many orders of magnitude. In the right panel of Fig. 4, we compare this to Fukui’s IRC metric given by Eq. (5), where the leading eigenvector corresponds to the energy gradient. The second eigenvector always has a zero eigenvalue and points along the energy isosurface. 
Although not identical, the resemblance between the IRC metric and the metric generated by the simple forms of Eq. (7) is quite evident. This results in a geodesic interpolation path that is quite close to the MEP. Here, we once again observe the crucial importance of the choice of metric. While the geodesic interpolation process serves to guarantee smoothness, the ability of geodesic interpolations to produce chemically sensible paths depends on the metric. In fact, we can conjecture that the relatively good performance of IDPP (disregarding problematic convergence for many initial guess paths) is due to its scaling method.
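For reference, the interpolation metric whose eigenvectors are visualized here is simply g = BᵀB in the Cartesian basis, so its spectrum can be obtained directly from Wilson's B matrix. This is a generic sketch of that relation, not the plotting code behind Fig. 4.

```python
import numpy as np

def metric_spectrum(B):
    """Eigen-decomposition of the interpolation metric in the Cartesian
    basis, g = B^T B, where B_ij = d f^i / d x^j is Wilson's B matrix.
    Returns eigenvalues in ascending order and the corresponding
    orthonormal eigenvectors (columns of the second return value).
    """
    g = B.T @ B          # pull the internal-coordinate metric back to Cartesians
    return np.linalg.eigh(g)
```

Large gaps between the returned eigenvalues correspond to the well-separated black and blue grid directions in the left panel of Fig. 4.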

### B. Geodesic interpolation from end points: Ring formation in dehydro Diels-Alder (DDA) reaction

Here, we take a closer look at our new geodesic interpolation (Algorithms 2 and 3) using a challenging numerical example: ring formation in the Dehydro Diels-Alder (DDA) reaction.^{51} Discovered in 1895,^{52} the high yield dimerization of phenylpropiolic acid **1** into 1-phenylnaphthalene-2,3-dicarboxylic acid anhydride **5** (see Scheme 1) is a well studied example of an intramolecular Diels-Alder reaction. It is known that the reaction mechanism involves formation of anhydride (**1** → **2**), followed by a Diels-Alder step where the ring is formed (**2** → **3** → **4**), and finally proton exchange with the solvent to yield the stable aromatic product (**4** → **5**).^{53} Here, the geodesic interpolation procedure described in Sec. III is applied to construct the reaction path for the intramolecular Diels-Alder cycloaddition process **2** → **3** → **4**. This reaction is chosen because of its large size and concerted nature that make it quite challenging. The reaction path involves simultaneous large amplitude motion of many atoms, including ring closure and large torsional displacements.

The initial and final geometries for interpolation, shown in Fig. 5, are optimized with UB3LYP/6-31g. Starting from the same end points, IDPP interpolation diverges, and we were unable to correct this. Nebterpolation requires a reasonable initial guess path (usually the reaction trajectory obtained from *ab initio* nanoreactor^{6} dynamics), which was not available in this case. Finally, LST was not able to generate a continuous path. We are therefore unable to provide comparisons with existing methods for this system, because none of them can generate a path. The geodesic interpolation path is constructed from these two end points using Algorithm 3.

Twenty images are used to describe the path, including the end points. The converged geodesic path has an arc length of 2.206 by Eq. (17). The lower bound for the length of the final path is 2.184, while the upper bound is 2.213, which is deemed sufficiently tight. The bounds are calculated with Eqs. (15) and (16), respectively. The distances here are unitless because of the Morse-like scaling used for the interatomic distance coordinates, and the length values correspond roughly to the total number of bonds formed and/or broken along the path. The scaled distance coordinates take values of ∼1 near the sum of covalent radii and ∼0 at large distances. Here, one bond breaking causes a scaled coordinate to change from near 1 to near 0, while one bond forming causes another scaled coordinate to change from near 0 to near 1, with all other interatomic distances kept roughly constant along the whole path, resulting in a length slightly larger than 2.
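A Morse-like scaled distance with the stated limiting behavior (∼1 near the sum of covalent radii, →0 at large separation) could, for instance, take an exponential form. The specific functional form and the decay constant below are illustrative assumptions of this sketch, not the exact parameterization used in this work.

```python
import numpy as np

def scaled_distance(r, r_cov, alpha=1.7):
    """Illustrative Morse-like scaling of an interatomic distance r:
    the value is exactly 1 when r equals the sum of covalent radii r_cov
    and decays monotonically toward 0 at large separation. The exponential
    form and the decay constant alpha are assumptions of this sketch.
    """
    return np.exp(-alpha * (r - r_cov) / r_cov)
```

With such a coordinate, breaking one bond moves one scaled distance from near 1 to near 0, so the contribution of each bond rearrangement to the path length is of order unity, consistent with the total length slightly above 2 reported for this reaction.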

To examine the quality of the interpolation, NEB optimization was performed using the interpolated path as starting guess. The energy profile of the geodesic and the optimized MEP are shown in Fig. 6. Although the barrier estimated through the geodesic is much higher than the transition state on the true MEP, the interpolation can still be considered quite successful given the complexity of the reaction and the fact that the interpolation process uses solely geometric information. When used as an initial guess, the interpolation significantly accelerates the subsequent NEB processes. Note that the energy profile from the geodesic path also qualitatively reproduces the shape of the true MEP, which indicates that Fukui’s metric is qualitatively well approximated.

The transition state on the optimized MEP and the corresponding point on the geodesic curve are compared in Fig. 7. Because the integrated lengths in Cartesian coordinates are almost the same for the MEP and the geodesic interpolation path, we chose the point on the geodesic curve that lies at the same integrated path length as the transition state on the MEP. This point is also near the top of the barrier. The similarity between the two geometries is evident.

### C. Using geodesics as initial path for NEB calculations

One important application of reaction path smoothing and interpolation methods is to create a reasonable starting guess for MEP optimization algorithms such as NEB.^{13} Here, we compare the performance of a number of path smoothing methods when used as the initial guess for an NEB calculation. Reaction paths extracted from *ab initio* MD simulations were used, all of which are generated by nanoreactor-driven reaction discovery^{6} for the Urey-Miller experiment.^{54} To ensure that the sample contains a large variety of reactions, each path corresponds to a distinct reaction, and a total of 403 paths are used. The test set includes reactions with and without barriers, as discussed in more detail below. The NEB calculations are performed with DL-FIND^{55} using UB3LYP/6-31G^{*}. All electronic structure calculations are performed using the TeraChem package.^{56,57} Solvation effects are modeled with C-PCM (a conductor-like polarizable continuum model), using a dielectric constant appropriate for water (ϵ = 80).^{58} The initial path interpolation methods tested here include LST and IDPP starting from a Cartesian linear interpolation path, the extended MD trajectory that contains the reaction generated with the same process as the Nebterpolator method,^{31} as well as geodesic, Nebterpolator, and IDPP paths starting from the MD paths for the respective optimization processes. IDPP interpolations are performed with the ASE package.^{59} The results are shown in Table I. For each method of generating initial reaction paths, we report the number of failed NEB runs, the average number of single point energy and gradient evaluations needed to converge the NEB calculation, the average error by which each method overestimates the barrier, and the percentage of cases for each method where the converged NEB path has the lowest barrier.

| Initial path | Failed runs | Average number of evaluations | RMS overestimation of barrier (kcal/mol) | Lowest barrier success rate (%) |
|---|---|---|---|---|
| MD path | 4 | 121.9 | >200 | 34 |
| Geodesic | 0 | 101.2 | 47.7 | 65 |
| Nebterpolator | 11 | 248.4 | >200 | 51 |
| LST | 9 | 303.4 | 77.2 | 18 |
| IDPP | 66 | 165.2 | 166.4 | 12 |
| IDPP(MD) | 24 | 168.5 | 147.8 | 8 |


Some clarification of the numbers in Table I is in order. The number of failed paths includes cases where the NEB procedure failed to converge after 1000 iterations or where the paths are so distorted that the self-consistent field (SCF) step in the electronic structure calculation fails to converge. These paths also exhibit images with relative energies larger than 1000 kcal/mol that are clearly far from the true MEP. The NEB-like process in IDPP is iterative and requires a starting guess path, for which either a linear interpolation in Cartesian coordinates or a trajectory obtained from dynamics can be supplied. For IDPP starting from Cartesian interpolations, the failure count also includes 48 cases where the NEB-like process of IDPP itself did not converge. IDPP starting from MD trajectories did not encounter the same failure. All failed runs were excluded when calculating the average number of iterations needed for NEB convergence. For barrier height comparisons, we start from the initial paths generated by each method and then compare the final converged paths after NEB optimization to identify which method leads to MEPs with lower barriers. Because of the intrinsic accuracy of the electronic structure method used, any method whose barrier lies less than 20 kJ/mol (4.8 kcal/mol) above the lowest barrier height is also considered to have generated a lowest barrier MEP.

We were surprised to find that most path smoothing methods did not perform as well as simply taking MD trajectories as the starting path for the MEP search. IDPP produced disappointing results. While on average IDPP paths have lower barriers than the MD trajectories, and they are continuous unlike LST paths, the complex nonlinear optimization process often introduces highly strained paths that do not correctly reflect the MEP and complicate the NEB calculation that follows. As mentioned before, in 48 cases, IDPP failed even to generate a path when starting from Cartesian interpolations, because the Cartesian interpolations often pass through highly energetically disfavored regions, causing the IDPP optimization process to diverge. Other methods did not encounter difficulty creating interpolation paths, and there were no problems creating initial paths when using MD trajectories as starting guesses. The NEB optimization to obtain MEPs also tends to fail when starting from IDPP-generated paths, because the IDPP process sometimes generates fragmented intermediate structures with multiple breaking bonds. The total number of failed paths is 66 when starting from Cartesian paths, significantly more than for any other method. Using MD trajectories as starting guesses in the IDPP process reduces this number to 24, which is still quite high. This instability of IDPP results from the complex and highly nonlinear NEB-like optimization process involved. We believe this is a disadvantage of the method, especially for treating reactions with large amplitude motions. IDPP also exhibits a strong tendency to lead to MEPs with higher reaction barriers after NEB optimization for both types of starting guesses, and using IDPP paths as the initial guess for NEB-based MEP optimization results in more *ab initio* evaluations compared to MD trajectories.

Surprisingly, LST paths using the same set of internal coordinates as IDPP are more stable and outperform IDPP on all measures, even though most of the interpolated paths are discontinuous. The average barrier is significantly lower than that of IDPP, which reflects the compromises that had to be made in IDPP to ensure continuity. Despite the ubiquitous occurrence of discontinuities, the NEB optimization itself is sometimes capable of fixing them when they are not too severe. Note that while superior to IDPP paths, LST paths require many more *ab initio* evaluations in NEB calculations than unsmoothed MD paths.

Nebterpolation yields significantly lower failure rates than IDPP on this test set. Compared to LST and IDPP, it also yields MEPs with lower energy barriers. However, it still fails more often than simply using the raw MD path, and on average it takes more *ab initio* evaluations to converge because the discontinuity removal mechanism deteriorates the path. More specifically, the Cartesian displacement term that draws both ends of a discontinuity together causes that segment of the path to resemble a Cartesian interpolation, which is well known to create highly problematic paths. As a result, for a small number of paths, the Nebterpolator yields paths with barriers higher than 1 hartree. This is responsible for the significantly higher barrier numbers in Table I, the large number of evaluations needed, and the failed NEB runs. Nebterpolation performs very well when such problems do not occur and, on average, is likely to produce a low barrier MEP.

In general, for all three of the preexisting path finding methods surveyed here (LST, IDPP, and Nebterpolator), it is crucial that the resulting smoothed paths be inspected to detect pathological behavior. If these problematic paths are not manually removed, all three methods for smoothing MD trajectories result in worse convergence behavior when the paths are used as starting guesses for MEP optimization. This is a difficulty for automated reaction discovery and refinement procedures,^{6–11} where it is critical that the MEP optimization can proceed unattended as much as possible.

Geodesic interpolation clearly outperforms all other considered interpolation methods on all measures for this test set. The formulation itself prevents discontinuities from occurring, obviating the need for *ad hoc* procedures to remove discontinuities. This preserves the proper structure of the internal coordinates and, as long as a sensible set of coordinates is used, a good approximation to the MEP can be obtained. It takes fewer evaluations to converge NEB calculations starting from the geodesic-generated path, and these are most likely to produce a low barrier MEP. Thus, the new geodesic method presented here can serve as a good method for generating initial MEPs that will be further optimized by NEB.

As mentioned above, our test set includes reactions with and without barriers. The test set contains 124 barrierless reactions (out of 403), defined as having a reaction barrier lower than 0.063 kcal/mol. For these reactions, geodesic interpolation produces 39 barrierless paths before optimization, allowing one to skip the subsequent NEB calculations. Other methods are less successful at identifying barrierless reactions before optimization: LST produces 35 barrierless paths, Nebterpolator 24, direct IDPP 11, and IDPP from MD trajectories 5.

## V. CONCLUSIONS

Interpolations performed in redundant internal coordinates are known to perform significantly better than those in Cartesian coordinates. Previous methods for creating interpolations in redundant internal coordinates were investigated and found to experience difficulties that can be traced back to their treatment of redundant coordinates. Careful analysis shows that implicit metric changes are responsible for the performance improvement. By choosing the proper metric, interpolation paths similar to MEPs can be located directly in Cartesian coordinates. This allows interpolation to be reformulated as a geodesic search that approximates Fukui's Intrinsic Reaction Coordinate (IRC). The resulting method is equivalent to finding the geodesic curve between the two end points when the Euclidean metric of the internal coordinates roughly approximates Fukui's IRC metric given in Eq. (4). The condition for the metric to generate well-behaved approximations to an IRC is derived, and a simple coordinate choice that provides satisfactory performance is presented. We show that geodesic interpolations can be conveniently computed in Cartesian coordinates by minimizing the path length through a numerical procedure that also provides bounds for the lengths of exact geodesics. Using a number of examples, the geodesic approach is shown to be robust and stable in situations where previous methods fail. Geodesic interpolation also generates superior starting guesses for NEB calculations compared to existing methods.

## SUPPLEMENTARY MATERIAL

See supplementary material for geometries of molecules along reaction paths described in the main text and a Python reference code for performing the geodesic interpolations according to the described algorithms.

## ACKNOWLEDGMENTS

This work was supported by the U.S. Office of Naval Research, Multidisciplinary University Research Initiatives (MURI) Program, Contract No. N00014-16-1-2557, the U.S. Office of Naval Research Contract No. N00014-17-1-2875, a generous gift from Procter and Gamble Company, and by Toyota Motor Engineering and Manufacturing North America, Inc.

### APPENDIX A: BRIEF REVIEW OF DIFFERENTIAL GEOMETRY USED IN THIS WORK

We briefly review basic concepts of Riemannian geometry, using Einstein summation notation throughout. A manifold is a topological space which locally resembles Euclidean space but whose global structure can be more complicated. Consider the surface of the earth: locally it is homeomorphic to $R2$, and we experience the world around us as flat. However, the shortest airline route between two cities on different continents is a *great circle*, which can be very different from the straight line on a two-dimensional map. The crucial innovation of Riemannian geometry is to separate the geometry from the coordinate representation. We can then discuss the great circle route without reference to the conventions of latitude and longitude.

More formally, *M* is a manifold if at every point *p* ∈ *M* there exists a neighborhood *U* ⊂ *M* and a mapping $\varphi_U : U \to \mathbb{R}^n$ which is bijective, continuous, and has a continuous inverse. Each such mapping $\varphi_U$ is called a *coordinate map* or *chart*, and the collection of charts {$\varphi_U$} that covers *M* is termed an *atlas*. In general, unless explicitly needed, the subscript *U* will be dropped.

Using local charts, one can construct tangent vectors and tangent planes. These are the natural generalizations of the tangent to a curved surface in $\mathbb{R}^3$. A tangent vector is defined as an equivalence class of curves through point *p* which share the same derivatives when mapped into $\mathbb{R}^n$ by $\varphi$. The *tangent plane* of a manifold *M* at point *p*, denoted as T_{p}*M*, is the vector space of all tangent vectors at point *p*. The *tangent bundle*, *TM*, is the set of all tangent planes of *M* and plays a significant role in the geometry of classical mechanics.

Given a basis {**x**_{i}} of $\varphi(U) \subset \mathbb{R}^n$, the image of a one-dimensional curve *γ*: $\mathbb{R}$ → *M* can be expressed as

$$(\varphi \circ \gamma)(t) = x^i(t)\,\mathbf{x}_i, \tag{A1}$$

where the symbol ◦ denotes function composition (*f* ◦ *g*)(*x*) ≡ *f*(*g*(*x*)).

With this, a basis **e**_{i} of T_{p}*M* is naturally defined, and a tangent vector *X* = *u*^{i}**e**_{i} is the class of all curves *γ*(*t*) which satisfy *γ*(0) = *p* and

$$\frac{d}{dt}\,(\varphi \circ \gamma)(t)\,\bigg|_{t=0} = u^i\,\mathbf{x}_i.$$

The tangent vector is therefore the derivative of the curve on the manifold, defined using the local chart to map the derivative from Euclidean space, and we write *γ*′(*t*), just like its Euclidean counterpart.

Recalling that coordinate charts are bijective and thus have well-defined inverses, tangent vectors act on functions $f: M \to \mathbb{R}$ via the chain rule

$$X(f) = u^i\,\frac{\partial\,(f \circ \varphi^{-1})}{\partial x^i}.$$

Note that $f \circ \varphi^{-1}: \mathbb{R}^n \to \mathbb{R}$. We have “pulled” the derivative back to $\mathbb{R}^n$ where it is well defined. By this construction, tangent vectors correspond to partial derivatives, and it is conventional to write the basis for T_{p}*M* as

$$\mathbf{e}_i = \frac{\partial}{\partial x^i}. \tag{A4}$$

A vector field *X* is then a mapping that associates to each point *p* ∈ *M* a tangent vector *X*_{p} ∈ *T*_{p}*M*. Here, we note the crucial difference between a *basis* and a set of *coordinates* for the tangent space. The basis definition in Eq. (A4) relies on a local chart, and every point *p* ∈ *M* can potentially have a different chart, imbuing each tangent plane with a different basis. A choice of basis is termed a *coordinate* basis only if it is consistent amongst neighboring points, in the sense that for each *p* ∈ *M*, there exists a neighborhood *U* containing *p* with functions $x^i: U \to \mathbb{R}$ which satisfy Eq. (A4). This may at first seem trivial since we are only extending Eq. (A4) from a single point to a small neighborhood. However, even the ability to extend to infinitesimally small neighborhoods poses a strong constraint that the Lie bracket between basis vectors vanishes, i.e.,

$$0 = \left[\frac{\partial}{\partial x^i},\, \frac{\partial}{\partial x^j}\right] = \frac{\partial}{\partial x^i}\frac{\partial}{\partial x^j} - \frac{\partial}{\partial x^j}\frac{\partial}{\partial x^i}. \tag{A5}$$

Such a basis is called *holonomic*, and only a holonomic basis affords a set of coordinates. Condition (A5) is easy to understand by recognizing that partial derivatives must not depend on the order of differentiation. Applying the right-hand side of Eq. (A5) to an arbitrary function *f* yields

$$\left(\frac{\partial}{\partial x^i}\frac{\partial}{\partial x^j} - \frac{\partial}{\partial x^j}\frac{\partial}{\partial x^i}\right) f = \frac{\partial^2 f}{\partial x^i\,\partial x^j} - \frac{\partial^2 f}{\partial x^j\,\partial x^i}.$$

Equation (A5) therefore requires that the Hessian of *f* be symmetric. A non-holonomic basis is inconsistent in the sense that Eq. (A5) cannot hold true for all finite neighborhoods. When one tries to estimate the coordinate values using such a basis, the result is path dependent and basis vectors do not correspond to partial differentiation. Although calculus in such a basis is still possible, it requires explicit computation of the commutator, Eq. (A5), at each point *p* ∈ *M* which is significantly more difficult. In this work, all calculations are performed in Cartesian coordinates, which are holonomic.
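To make the role of the commutator concrete, consider the frame *X* = ∂/∂*x*, *Y* = *x* ∂/∂*y* (a standard textbook example, not taken from this work). A quick finite-difference check shows [*X*, *Y*] = ∂/∂*y* ≠ 0, so this frame is non-holonomic and cannot arise from any coordinate system:

```python
import math

def d(f, var, p, h=1e-5):
    """Central finite difference of f along coordinate index var at point p."""
    lo, hi = list(p), list(p)
    lo[var] -= h
    hi[var] += h
    return (f(hi) - f(lo)) / (2 * h)

# Frame fields X = d/dx and Y = x * d/dy acting on scalar functions
def X(f):
    return lambda p: d(f, 0, p)

def Y(f):
    return lambda p: p[0] * d(f, 1, p)

f = lambda p: math.sin(p[0]) * p[1] ** 2   # arbitrary test function
p = [0.5, 0.3]
# [X, Y]f = X(Yf) - Y(Xf); analytically this equals (d/dy)f for this frame
bracket = X(Y(f))(p) - Y(X(f))(p)
```

For a coordinate basis the same computation would return zero (symmetry of the Hessian); here `bracket` reproduces ∂*f*/∂*y* = 2*y* sin *x* at *p*, confirming the frame fails condition (A5).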

The definition of tangent vectors allows one to specify a notion of distance on the manifold. A metric tensor at a point *p* ∈ *M* is a functional *g*_{p}(*X*_{p}, *Y*_{p}) which maps two tangent vectors *X*_{p},*Y*_{p} ∈ *T*_{p}*M* to a number. In general, *g*_{p} is required to be symmetric, bilinear, and non-degenerate.

Further, a metric is *Riemannian* if it is smooth on *M* and positive definite, i.e.,

$$g_p(X_p, X_p) > 0 \quad \text{for all } X_p \neq 0.$$

When a basis $\frac{\partial}{\partial x^i}$ for T_{p}*M* is given, the metric is often expressed by its matrix elements in this basis as

$$g_{ij} = g\!\left(\frac{\partial}{\partial x^i},\, \frac{\partial}{\partial x^j}\right).$$

A manifold equipped with a Riemannian metric is called a Riemannian manifold. The Riemannian metric generalizes the notion of an inner product from Euclidean spaces and allows one to measure distances on the manifold with the following length element:

$$ds^2 = g_{ij}\; dx^i\, dx^j.$$

The length of a curve *γ*(*t*) = *x*^{i}(*t*)**x**_{i} from A to B on *M* is

$$L[\gamma] = \int_A^B \sqrt{g_{ij}\,\dot{x}^i\,\dot{x}^j}\; dt.$$

Here, the coordinate functions *x*^{i}(*t*) are determined from *γ* through Eq. (A1).

The definition of path length in turn allows the definition of a *geodesic* as a curve whose arc length *L* is extremal, i.e., locally we have

$$\delta L[\gamma] = 0.$$

A geodesic is the generalization of a straight line in Euclidean space. When the manifold is flat, its geodesics are straight lines, so this concept is especially helpful for our understanding of interpolation methods.

In a holonomic basis, the condition for a curve *γ*(*t*) = *x*^{i}(*t*)**x**_{i} to be a geodesic is

$$\ddot{x}^i + \Gamma^i_{jk}\,\dot{x}^j\,\dot{x}^k = 0,$$

where $\Gamma^i_{jk}$ are the *Christoffel symbols*

$$\Gamma^i_{jk} = \frac{1}{2}\, g^{il}\left(\frac{\partial g_{lj}}{\partial x^k} + \frac{\partial g_{lk}}{\partial x^j} - \frac{\partial g_{jk}}{\partial x^l}\right). \tag{A12}$$
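As a sanity check of the Christoffel symbol construction, one can evaluate the standard formula by finite differences for a known case. The sketch below (illustrative only; the flat plane in polar coordinates is a textbook example, not part of this work) recovers Γ^{r}_{θθ} = −*r* and Γ^{θ}_{rθ} = 1/*r*:

```python
def metric(x):
    """Euclidean metric of the plane in polar coordinates (r, theta)."""
    r, _ = x
    return [[1.0, 0.0], [0.0, r * r]]

def christoffel(x, h=1e-6):
    """Gamma^i_jk = (1/2) g^il (d_j g_lk + d_k g_lj - d_l g_jk) by finite differences."""
    n = 2
    g = metric(x)
    ginv = [[1.0 / g[0][0], 0.0], [0.0, 1.0 / g[1][1]]]  # diagonal metric
    # dg[k][i][j] = d g_ij / d x^k via central differences
    dg = []
    for k in range(n):
        hi, lo = list(x), list(x)
        hi[k] += h
        lo[k] -= h
        gh, gl = metric(hi), metric(lo)
        dg.append([[(gh[i][j] - gl[i][j]) / (2 * h) for j in range(n)]
                   for i in range(n)])
    gamma = [[[0.0] * n for _ in range(n)] for _ in range(n)]
    for i in range(n):
        for j in range(n):
            for k in range(n):
                s = 0.0
                for l in range(n):
                    s += ginv[i][l] * (dg[j][l][k] + dg[k][l][j] - dg[l][j][k])
                gamma[i][j][k] = 0.5 * s
    return gamma

gamma_p = christoffel([2.0, 0.3])
# Known results: Gamma^r_{theta,theta} = -r = -2, Gamma^theta_{r,theta} = 1/r = 0.5
```

Straight lines through the origin region are geodesics of this metric, even though their polar-coordinate representation *x*^{i}(*t*) is curved, which is precisely the coordinate/geometry distinction exploited in the main text.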

An important property of a Riemannian manifold is that it can always be embedded in a Euclidean manifold with sufficiently high dimensionality, a result known as the *Nash embedding theorem*.^{34,35} This means that any Riemannian manifold can be mapped into a curved hypersurface in a flat space, where the length of curves on the surface measured using the simple Euclidean metric of the higher dimensional space matches the length of the same curve measured in the Riemannian manifold with a possibly non-Euclidean metric. As discussed in the main text, this has significant implications for reaction paths and interpolations.

### APPENDIX B: BOUNDS AND APPROXIMATE VALUES OF SEGMENT LENGTH

We denote **q**^{(n)} = **q**(**x**^{(n)}) and Δ**q**^{(n)} = **q**^{(n+1)} − **q**^{(n)}. Using Cauchy’s inequality, we have

$$\Delta\mathbf{q}^{(n)} \cdot \dot{\mathbf{q}} \;\leq\; \left|\Delta\mathbf{q}^{(n)}\right| \left|\dot{\mathbf{q}}\right|.$$

Integrating from **x**^{(n)} to **x**^{(n+1)} on both sides yields

$$\Delta\mathbf{q}^{(n)} \cdot \int_{\mathbf{x}^{(n)}}^{\mathbf{x}^{(n+1)}} \dot{\mathbf{q}}\; dt \;\leq\; \left|\Delta\mathbf{q}^{(n)}\right| \int_{\mathbf{x}^{(n)}}^{\mathbf{x}^{(n+1)}} \left|\dot{\mathbf{q}}\right| dt.$$

Since the left-hand side is simply $\left|\Delta\mathbf{q}^{(n)}\right|^2$, we have

$$L^{(n)} \equiv \int_{\mathbf{x}^{(n)}}^{\mathbf{x}^{(n+1)}} \left|\dot{\mathbf{q}}\right| dt \;\geq\; \left|\Delta\mathbf{q}^{(n)}\right|. \tag{B5}$$

Equation (B5) provides a lower bound for the true length. In fact, from the second order Taylor expansion of the curve, using arc length parameterization, we have

Since we parameterized by arc length,

and

Multiplying Eq. (B7) by $\Delta q_k + \dot{q}_k\,\Delta s$ and summing over indices gives

Inserting Eq. (B7), we get

Therefore, Eq. (B5) also gives an approximation for the path length through second order.

On the other hand, since the geodesic curve minimizes the path length, the length of any given path on the manifold serves as an upper bound for the length of the geodesic. Here, we take a straight line in Cartesian coordinates to obtain the upper bound. The straight line between **x**^{(n)} and **x**^{(n+1)} is

$$\mathbf{x}(t) = \mathbf{x}^{(n)} + t\left(\mathbf{x}^{(n+1)} - \mathbf{x}^{(n)}\right),$$

where *t* ∈ [0,1]. With the internal coordinate metric, the length of such a Cartesian straight line provides the following upper bound

To evaluate the value and derivatives of this upper bound, the line segment is further subdivided into *m* subsegments. When the subsegments are sufficiently small, the length of each subsegment can be accurately estimated with Eq. (B5). Ten to twenty subsegments usually produce an accurate upper bound. In the case of division into *m* subsegments, the bound is

$$L^{(n)} \approx \sum_{k=0}^{m-1} \left|\mathbf{q}\!\left(\mathbf{x}^{(n)} + \tfrac{k+1}{m}\,\Delta\mathbf{x}^{(n)}\right) - \mathbf{q}\!\left(\mathbf{x}^{(n)} + \tfrac{k}{m}\,\Delta\mathbf{x}^{(n)}\right)\right|, \tag{B18}$$

where $\Delta\mathbf{x}^{(n)} = \mathbf{x}^{(n+1)} - \mathbf{x}^{(n)}$.

It is easy to see that when *m* = 1, Eq. (B18) simplifies to Eq. (B5) and yields the lower bound, whereas when *m* → ∞, Eq. (B18) yields the upper bound. An efficient and robust approximation can be obtained through a hybrid of the upper and lower bound approximations by setting *m* = 2,

$$L^{(n)} \approx \left|\mathbf{q}\!\left(\mathbf{x}^{(n,n+1)}\right) - \mathbf{q}^{(n)}\right| + \left|\mathbf{q}^{(n+1)} - \mathbf{q}\!\left(\mathbf{x}^{(n,n+1)}\right)\right|, \qquad \mathbf{x}^{(n,n+1)} \equiv \tfrac{1}{2}\left(\mathbf{x}^{(n)} + \mathbf{x}^{(n+1)}\right). \tag{B19}$$

Equation (B19) can be differentiated analytically, and the total path length can be minimized with any standard numerical minimization method, such as L-BFGS, to yield an approximate geodesic curve.
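The overall procedure can be illustrated with a toy sketch that minimizes a total path length assembled from Eq. (B19)-style midpoint segment estimates. The map `q`, the endpoints, and the use of plain finite-difference gradient descent (instead of L-BFGS with analytic gradients) are all illustrative simplifications, not the reference implementation:

```python
import math

def q(x):
    """Hypothetical 'internal coordinate' map standing in for interatomic distances."""
    return [x[0], math.exp(x[1])]

def seglen(a, b):
    """Midpoint (m = 2) approximation of the segment length measured in q-space."""
    mid = [(a[0] + b[0]) / 2, (a[1] + b[1]) / 2]
    qa, qm, qb = q(a), q(mid), q(b)
    dist = lambda u, v: math.hypot(u[0] - v[0], u[1] - v[1])
    return dist(qm, qa) + dist(qb, qm)

def path_length(pivot, A, B):
    """Total length of the two-segment path A -> pivot -> B."""
    return seglen(A, pivot) + seglen(pivot, B)

A, B = [0.0, 0.0], [1.0, 1.0]
x = [0.5, 0.5]                 # movable pivot, initialized on the straight line
L0 = path_length(x, A, B)
h, lr = 1e-5, 0.05
for _ in range(500):           # crude finite-difference gradient descent
    grad = []
    for i in range(2):
        hi, lo = list(x), list(x)
        hi[i] += h
        lo[i] -= h
        grad.append((path_length(hi, A, B) - path_length(lo, A, B)) / (2 * h))
    x = [x[i] - lr * grad[i] for i in range(2)]
L1 = path_length(x, A, B)      # shorter than the straight-line initialization
```

The optimized length is shorter than that of the Cartesian straight line, and by the triangle inequality it can never fall below the chord length |**q**(B) − **q**(A)|, consistent with the lower bound of Eq. (B5).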

It is straightforward to verify that the length given by Eq. (B19) lies between the upper and lower bounds. The fact that it is larger than the lower bound prescribed by Eq. (B5) follows simply from the triangle inequality.

On the other hand, if we perform the same analysis on the segment of the straight line between $\mathbf{x}^{(n)}$ and $\tfrac{1}{2}\left(\mathbf{x}^{(n)} + \mathbf{x}^{(n+1)}\right)$, we have

The same can be done for the segment between $\tfrac{1}{2}\left(\mathbf{x}^{(n)} + \mathbf{x}^{(n+1)}\right)$ and **x**^{(n+1)}. Summing these two terms together yields

### APPENDIX C: PROOF THAT THE MAPPING *q* IS AN ISOMETRIC IMMERSION OF THE REDUCED CONFIGURATION SPACE (*M*, *g*) IN THE REDUNDANT INTERNAL COORDINATE SPACE (*N*, *h*)

Conventional geometry optimization or interpolation methods operate in the redundant internal coordinate space. More specifically, previous methods concern a feasible subspace of the manifold (*N*, *h*), where $N = \mathbb{R}^K$ is the space of the values of the *K* internal coordinates, with the Euclidean metric

$$h = \delta_{ab}\; dq^a\, dq^b. \tag{C1}$$

The manifold (*N*, *h*) has a simple structure and contains all numeric values of each internal coordinate, including values that have no counterpart in a realizable molecular geometry (i.e., where the values of the internal coordinates are mutually inconsistent). We further use the manifold (*M*, *g*) to denote the reduced configuration space of the molecule. (*M*, *g*) spans only realizable geometric configurations but has a more complex structure which results from the feasibility constraints. Here, we show that the internal coordinate functions *q* facilitate an isometric immersion of (*M*, *g*) into (*N*, *h*) if certain conditions on *q* are fulfilled. It is readily seen that the *K* coordinate functions *q* are a mapping from *M* to *N*. For *q*: *M* → *N* to be an immersion, it has to be differentiable and its differential D_{p}*q*: T_{p}*M* → T_{q(p)}*N* must be injective, i.e., one-to-one. Given a tangent vector $X_p = X^i \frac{\partial}{\partial x^i} \in T_pM$, the differential of the mapping can simply be obtained through the chain rule

$$D_p q\,(X_p) = X^i\, \frac{\partial q^a}{\partial x^i}\, \frac{\partial}{\partial q^a}. \tag{C2}$$

This linear mapping is injective if and only if

$$\operatorname{rank}\!\left(\frac{\partial q^a}{\partial x^i}\right) = \dim M. \tag{C3}$$

Therefore, the internal coordinate functions *q* are an immersion of (*M*, *g*) into (*N*, *h*) if *q* is everywhere differentiable and nowhere underdetermined in *M*. In the case where *q* itself is also injective (for example, when there are no periodic functions such as angles or torsions), *q* is an embedding. We note that although problems of periodicity and non-uniqueness arise in the conventional treatments in *N*, the reduced configuration space *M* does not have any such redundancy. Periodicity and global multi-valuedness of the coordinate functions do not affect the analysis in (*M*, *g*) either, since the metric only requires local derivative information.

The differential of *q* given in Eq. (C2) maps a vector in T_{p}*M* into a vector in T_{q(p)}*N*, which is termed a *pushforward* in differential geometry. Correspondingly, its dual mapping provides a *pullback* from the cotangent space $T^*_{q(p)}N$ to $T^*_pM$. The cotangent space is spanned by differentials, and given a cotangent vector $df_p = f_a\, dq^a \in T^*_{q(p)}N$, the mapping is

$$q^*(df_p) = f_a\, \frac{\partial q^a}{\partial x^i}\; dx^i. \tag{C4}$$

For any two tangent vectors $X_p = X^i \frac{\partial}{\partial x^i}$ and $Y_p = Y^i \frac{\partial}{\partial x^i}$ in T_{p}*M*, from the metric we have

$$h\!\left(D_p q\, X_p,\; D_p q\, Y_p\right) = X^i\, Y^j\, \frac{\partial q^a}{\partial x^i}\, \frac{\partial q^a}{\partial x^j} = g_{ij}\, X^i Y^j = g(X_p, Y_p). \tag{C5}$$

Therefore, the mapping *q* preserves the metric. With this particular choice of metric, *q* is an isometric immersion of (*M*, *g*) into (*N*, *h*). Therefore, when *N* is not affected by multi-valuedness of internal coordinates, *M* simply describes the intrinsic structure of the feasible submanifold of *N*. When internal coordinates are multivalued, *M* is still smooth and single valued, and any unique subset of the feasible submanifold of *N* can be mapped onto *M*.
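This isometry property can be checked numerically. The sketch below uses a hypothetical two-dimensional configuration space and made-up coordinate functions (purely for illustration): the Euclidean length of a small displacement measured in *q*-space matches the same displacement measured with the pulled-back metric **g** = **B**^{T}**B**.

```python
import math

def q(x):
    """Toy redundant 'internal coordinates': three functions of two Cartesian DOF."""
    return [x[0], x[1], x[0] * x[1]]

def B(x):
    """Jacobian B_ai = dq^a / dx^i (the Wilson B matrix of this toy map)."""
    return [[1.0, 0.0], [0.0, 1.0], [x[1], x[0]]]

x = [0.7, -0.4]
dx = [1e-4, 2e-4]
# Length of the step measured in the ambient Euclidean q-space ...
x2 = [x[0] + dx[0], x[1] + dx[1]]
dq = [qa - qb for qa, qb in zip(q(x2), q(x))]
len_ambient = math.sqrt(sum(c * c for c in dq))
# ... and the same step measured with the pulled-back metric g = B^T B
Bm = B(x)
g = [[sum(Bm[a][i] * Bm[a][j] for a in range(3)) for j in range(2)] for i in range(2)]
len_metric = math.sqrt(sum(g[i][j] * dx[i] * dx[j]
                           for i in range(2) for j in range(2)))
```

The two lengths agree to second order in |d**x**|, which is exactly the statement that *q* preserves the metric.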

### APPENDIX D: RELATIONSHIP WITH G MATRIX PROJECTION PROCEDURE

In the context of geometry optimization, a successful method for obtaining linearly independent subspaces from the redundant internal coordinates is the G matrix projection procedure. So far, it has not been applied to non-local problems such as interpolation because of the local nature of the method. We prove that this process is equivalent to the choice of a specific basis set for T_{p}*M* and that the interpolation process described above is in fact equivalent to linear interpolation on the G matrix projected basis.

We first briefly review the **G** matrix projection procedure. At a particular point, the **G** matrix is constructed from Wilson’s **B** matrix^{18,19}

$$\mathbf{G} = \mathbf{B}\mathbf{B}^{T}, \qquad B_{ai} = \frac{\partial q^a}{\partial x^i}. \tag{D1}$$

The similarity with the matrix representation of the metric tensor is evident. In fact, the metric of the internal coordinate manifold is given by **g** = **B**^{T}**B**. An eigenvalue decomposition of **G** is then performed,

$$\mathbf{G} = \begin{pmatrix}\mathbf{K} & \mathbf{L}\end{pmatrix} \begin{pmatrix}\boldsymbol{\Lambda} & \mathbf{0}\\ \mathbf{0} & \mathbf{0}\end{pmatrix} \begin{pmatrix}\mathbf{K}^T\\ \mathbf{L}^T\end{pmatrix}, \tag{D2}$$

where the diagonal matrix **Λ** contains all nonzero eigenvalues, **K** contains the corresponding eigenvectors, and **L** spans the null space. Then, a non-redundant set of internal coordinates is constructed from the redundant coordinates **q** using **K**,

$$\mathbf{Q} = \mathbf{K}^T\,\mathbf{q}. \tag{D3}$$

Vectors ** K** can be further used to construct transformations between this non-redundant internal coordinate system and Cartesian coordinates.
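The relationship between **G** = **BB**^{T} and **g** = **B**^{T}**B** can be checked numerically: the two matrices share their nonzero eigenvalues, so the traces of their powers coincide. Below is a minimal sketch with a made-up **B** matrix (not an actual Wilson **B** matrix of any molecule):

```python
def matmul(A, B):
    return [[sum(A[i][k] * B[k][j] for k in range(len(B))) for j in range(len(B[0]))]
            for i in range(len(A))]

def transpose(A):
    return [list(r) for r in zip(*A)]

def trace(A):
    return sum(A[i][i] for i in range(len(A)))

# Hypothetical 4x3 "B matrix": 4 redundant internals, 3 Cartesian DOF
B = [[1.0, 0.0, 0.5],
     [0.0, 2.0, 1.0],
     [1.0, 1.0, 0.0],
     [0.5, 0.0, 2.0]]
G = matmul(B, transpose(B))   # 4x4 projection matrix, rank <= 3
g = matmul(transpose(B), B)   # 3x3 metric matrix

# G and g share their nonzero spectrum, so all power traces agree
t1 = (trace(G), trace(g))
t2 = (trace(matmul(G, G)), trace(matmul(g, g)))
```

Equality of the power traces for every power implies that the nonzero eigenvalues (the squared singular values of **B**) are identical, which is the algebraic core of the equivalence discussed below.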

The similarity between the matrix representation of the metric tensor and the **G** matrix is not superficial.^{22} With any basis **x**_{i}, one can also perform an eigenvalue decomposition of the matrix representation of the metric tensor *g*,

$$\mathbf{g}\,\boldsymbol{\lambda}_i = \sigma_i^2\,\boldsymbol{\lambda}_i. \tag{D4}$$

Note that index *i* is not being summed over in Eq. (D4). Because of the requirement in Eq. (C3), all dim(*M*) eigenvalues are positive. Because the Cartesian basis is used, six of the eigenvalues are 0, which correspond to translational and rotational motions that are not in *M* and are therefore not included in the metric. The eigenvectors **λ**_{i} in Eq. (D4) are deeply connected with the matrix **K** in the **G** matrix projection method. In fact, if we perform a singular value decomposition (SVD) of **B**,^{20,26}

$$\mathbf{B} = \mathbf{U}\,\mathbf{S}\,\mathbf{V}^T,$$

where **U** and **V** are orthogonal and **S** is the diagonal matrix containing all singular values of **B**, then we have

$$\mathbf{G} = \mathbf{B}\mathbf{B}^T = \mathbf{U}\,\mathbf{S}\mathbf{S}^T\,\mathbf{U}^T, \qquad \mathbf{g} = \mathbf{B}^T\mathbf{B} = \mathbf{V}\,\mathbf{S}^T\mathbf{S}\,\mathbf{V}^T.$$

Note that although **SS**^{T} and **S**^{T}**S** are not of the same dimensionality, they are both diagonal and contain the same nonzero diagonal elements, which equal the squared singular values of **B**; they differ only in the number of zero diagonal elements. That is, **U** contains all eigenvectors of **G**, while **V** contains all eigenvectors of **g**. Collecting the eigenvectors **λ**_{i} into a matrix **J**, one can therefore choose the sign and ordering conventions in **K** and **J** so that **K** = **U** and **J** = **V**. For the nonzero block $\tilde{\mathbf{S}}$ of **S**, we have

$$\mathbf{B}\,\mathbf{J} = \mathbf{U}\,\tilde{\mathbf{S}}$$

and

$$\mathbf{B}^T\,\mathbf{U} = \mathbf{J}\,\tilde{\mathbf{S}},$$

or

$$\mathbf{U} = \mathbf{B}\,\mathbf{J}\,\tilde{\mathbf{S}}^{-1}. \tag{D10}$$

and

or

We define a basis of the cotangent space $T^*_pM$ as

$$dX^i = \sigma_i\, \lambda_i^{\,j}\; dx^j. \tag{D11}$$

Again, in Eq. (D11), the index *i* is not summed over. Since *g* is bilinear, it is readily seen that the basis obtained here, up to sign and ordering, is independent of the original choice of basis. This basis is, in fact, the nonredundant basis constructed in the **G** matrix projection procedure, expressed intrinsically in the manifold *M* instead of as an embedded structure in $N = \mathbb{R}^K$. To see this, we first write Eq. (D3) in a more precise manner in the cotangent space at point *p*,

$$dQ^i = K^i_{\,a}\; dq^a. \tag{D12}$$

Then apply the pullback Eq. (C4) to this equation and obtain

$$q^*(dQ^i) = K^i_{\,a}\, \frac{\partial q^a}{\partial x^j}\; dx^j = \left(\mathbf{K}^T\mathbf{B}\right)_{ij} dx^j.$$

Inserting Eq. (D10), we obtain

$$q^*(dQ^i) = \left(\tilde{\mathbf{S}}\,\mathbf{J}^T\right)_{ij} dx^j = \sigma_i\, \lambda_i^{\,j}\; dx^j.$$

Comparing this with Eq. (D11), we have

$$q^*(dQ^i) = dX^i.$$

In other words, the **G** matrix projected non-redundant basis d*Q*^{i} is the basis of Eq. (D11) on (*M*, *g*) embedded in (*N*, *h*). However, while d*Q*^{i} forms a basis for the cotangent space and thereby naturally induces a dual basis *Q*_{i} in the tangent plane, it is not holonomic and cannot form a coordinate system. As noted earlier, a set of basis vectors *X*_{i} of the tangent space is holonomic when the Lie bracket vanishes, i.e., [*X*_{i}, *X*_{j}] = 0.

However,

$$[Q_i, Q_j] \neq 0 \quad \text{in general}.$$

For an arbitrary set of redundant internal coordinates *q*, this means that there exists no coordinate system that yields differentials satisfying Eq. (D12). Such a conclusion is to be expected since this basis is orthonormal (because the columns of **K** are orthonormal). Only a flat (Euclidean) manifold possesses basis vector fields which are both orthonormal and holonomic. In fact, a Euclidean space is defined as a space with a holonomic basis that has the Euclidean metric *g*_{ij} = *δ*_{ij}, i.e., is orthonormal. More specifically, such a manifold is flat because in a holonomic basis the Christoffel symbols can be written in the form of Eq. (A12), which vanish if the basis stays orthonormal in a finite neighborhood,

$$\Gamma^i_{jk} = 0.$$

As a result, the scalar curvature also vanishes,

$$R = 0.$$

This is only possible when the coordinate system is non-redundant. In the redundant case, the feasible submanifold is, in general, not flat, and thus *Q*_{i} is generally not holonomic. As a result, should one try to obtain the values of *Q*_{i} by integrating Eq. (D12), one would find that the integral is path dependent, and it is impossible to assign unique coordinate values to geometric configurations in a consistent way.
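This path dependence can be demonstrated directly. The 1-form (−*y* d*x* + *x* d*y*)/(*x*² + *y*²) is closed but not exact (a standard example, not taken from this work); integrating it between the same two endpoints along two different paths yields different "coordinate values":

```python
import math

def omega(p, dp):
    """The 1-form (-y dx + x dy)/(x^2 + y^2) evaluated on a small step dp at p."""
    x, y = p
    return (-y * dp[0] + x * dp[1]) / (x * x + y * y)

def line_integral(path, n=2000):
    """Integrate omega along a parameterized path t in [0, 1] by the midpoint rule."""
    total = 0.0
    for k in range(n):
        p0, p1 = path(k / n), path((k + 1) / n)
        mid = [(p0[0] + p1[0]) / 2, (p0[1] + p1[1]) / 2]
        total += omega(mid, [p1[0] - p0[0], p1[1] - p0[1]])
    return total

# Two paths from (1, 0) to (-1, 0): upper and lower unit semicircles
upper = line_integral(lambda t: (math.cos(math.pi * t), math.sin(math.pi * t)))
lower = line_integral(lambda t: (math.cos(-math.pi * t), math.sin(-math.pi * t)))
# Same endpoints, but the integrals differ by 2*pi
```

The two integrals are π and −π: no single-valued "coordinate" can have this 1-form as its differential, just as no coordinate system can realize the non-holonomic basis *Q*_{i}.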