The effective description of molecular geometry is important for theoretical studies of intermolecular interactions. Here we introduce a new translation-rotation-internal coordinate (TRIC) system which explicitly includes the collective translations and rotations of molecules, or parts of molecules such as monomers or ligands, as degrees of freedom. The translations are described as the centroid position and the orientations are represented with the exponential map parameterization of quaternions. When TRIC is incorporated into geometry optimization calculations, the performance is consistently superior to existing coordinate systems for a diverse set of systems including water clusters, organic semiconductor donor-acceptor complexes, and small proteins, all of which are characterized by nontrivial intermolecular interactions. The method also introduces a new way to scan the molecular orientations while allowing orthogonal degrees of freedom to relax. Our findings indicate that an explicit description of molecular translation and rotation is a natural way to traverse the many-dimensional potential energy surface.
I. INTRODUCTION
Intermolecular interactions are widely recognized for their central role in determining the structure, function, and properties of macromolecules, molecular complexes, and molecular materials.1–4 In theoretical chemistry, the fundamental tools for studying intermolecular interactions include optimizing the molecular geometry to locate the critical points on the many-dimensional potential energy surface, then calculating and characterizing the interactions at a given geometry. In recent years, significant advances have been made toward the second goal, such as accurate approximations to high-level electronic structure theories5–8 and energy decomposition analyses.9–15 However, achieving the first goal of efficiently optimizing these molecular geometries is still challenging; this is because the potential energy surface is relatively flat along the intermolecular directions compared to intramolecular ones, and long sequences of energy and gradient evaluations are often required to reach a local minimum. Geometry optimization methods that efficiently describe intermolecular degrees of freedom may greatly accelerate these nontrivial calculations.
The Cartesian coordinate system is the simplest way to describe the molecular geometry and is needed to carry out the energy and gradient evaluations. Because the potential energy surface is highly nonlinear and coupled in the Cartesian coordinates (denoted as x), only small steps are possible in the downhill direction. Internal coordinates (ICs, denoted as q) are functions of the Cartesian coordinates that better reflect the collective motions of the atoms. ICs can describe displacements along curved pathways and decouple different kinds of molecular displacements, allowing the optimization algorithm to take more efficient steps. Given a displacement in the internal coordinates Δq, the corresponding displacement in the Cartesian coordinates Δx is computed as,
where B is the Wilson B-matrix and G = BBT.16 When q is a nonlinear function, Equation (1) is applied iteratively until a desired IC displacement is achieved.
An early IC system is the Z-matrix coordinates17,18 where the position of an atom is described using a distance, angle, and/or dihedral angle with respect to other atoms; we will refer to these individual functions as primitive ICs. The Z-matrix coordinates encounter difficulties for cyclic systems where there are more ICs than Cartesian coordinates, which makes G ill-behaved; Pulay and Fogarasi showed that this can be overcome by using a generalized inverse of G.19 Schlegel extended this method by automatically constructing a full set of primitive ICs from the connectivity graph of the molecule, which we refer to as the redundant IC (RIC) system in this paper.20 The generalized inverse of GRIC is expensive due to the numerous primitive ICs. Baker proposed the delocalized internal coordinate system (DLC) to remove the redundancies, where each delocalized coordinate is a linear combination of primitive ICs corresponding to an eigenvector of GRIC with a nonzero eigenvalue.21–23 As a result, GDLC has a smaller dimension and is easier to invert.
When the above coordinate systems are applied to multiple molecules, the atoms between molecules are connected using a minimum spanning tree to describe the system as a single “super-molecule.”20 In this procedure, a minimal number of fictitious intermolecular “bonds” are added to connect the molecules at the points of closest contact, which increases the number of distances by the total number of molecules minus one; intermolecular angles and dihedrals are then added in an analogous fashion to the single-molecule case. To avoid introducing potentially ill-behaved primitive ICs between molecules, Baker proposed replacing them with inverse interatomic distances, but an artificial cutoff is necessary to avoid the number of interatomic distances increasing quadratically.24,25 Alternatively, Billeter and Thiel introduced the hybrid delocalized internal coordinate (HDLC) method where the intermolecular displacements are described by adding all of the Cartesian coordinates into the primitive IC set for the construction of delocalized ICs.26 In the above methods, displacements in the “intermolecular” primitive ICs will affect the inter- and intra-molecular structures simultaneously. We hypothesize that a coordinate system that separately describes the inter- and intra-molecular degrees of freedom could improve the performance of geometry optimizations.
The position of a molecule is easily described using its average Cartesian coordinates (i.e., centroid), but describing the orientation is more challenging. First of all, a suitable definition of rotation for non-rigid molecules is required. Second, an ideal representation of rotation should be differentiable, such that the Wilson-B matrix can be evaluated analytically; non-redundant, such that no additional constraints are needed in the parameter space; and free of singularities. As the rotation group SO(3) is three-dimensional, a non-redundant rotation IC should have three independent variables. These criteria are not satisfied by most existing rotation parameterizations, as will be discussed later.
In this paper, we introduce a well-behaved internal coordinate for molecular orientation, which accounts for non-rigid molecules with the least-squares superposition method27–29 and uses the exponential map parameterization of rotations.30 By including translation and rotation into the primitive IC set and applying the existing delocalization methods, it forms a new coordinate system that we refer to as translation-rotation-internal coordinates (TRIC).
The rest of the paper is organized as follows: We first provide mathematical details of the new translation and rotation coordinates, together with their derivative formulas. We demonstrate the improved efficiency of TRIC in geometry optimization applied to prototypical systems compared to other coordinate systems. Finally, we show how TRIC enables constrained optimization in the molecular orientation, providing a new way to scan the potential energy surface along intermolecular degrees of freedom.
II. THEORY
To build the coordinate system, the atoms are first partitioned into subunits; this could be done automatically by connecting pairs of atoms that are closer than the sum of their covalent radii,31,32 although other divisions are possible.33
For each subunit, we define three translation and three rotation internal coordinates. The translation internal coordinate is defined simply as the arithmetic average of the Cartesian coordinates,
where the notation xin denotes the ith Cartesian coordinate of the nth atom. The derivatives are trivial to compute as .
Along similar lines, we seek a differentiable function v(x, y) describing a rotation which brings an ordered list of Cartesian coordinates x into maximum overlap with a reference y, which is set to the starting geometry. We shall assume without loss of generality that x and y have been shifted such that their centroids are located at the origin. The optimal rotation v(x, y) minimizes the residual squared displacement D(x, y; v′) as
where U(v′) is a 3 × 3 rotation matrix parameterized by v′ and | ⋅ | is the Euclidean norm.
The trivial representation is to use the elements of U themselves, but there are nine elements which makes it redundant. The Euler angle representation has the correct number of parameters, but suffers from singularities when two of the rotation axes are parallel. A third representation is the rotation quaternion: for a rotation through angle θ around the axis defined by a unit vector (ux,uy,uz), the corresponding quaternion is defined as
and U(q) is given in Eq. (33) of Ref. 28. In this paper, all vectors (such as u) are defined in the global frame, as we do not use molecule-specific local frames.
Following the derivation of Coutsias and co-workers,28 Equation (3) may be written in terms of quaternions using ordinary linear algebra as
where F is the symmetric matrix given by
and R is calculated from the Cartesian coordinates as
In Equation (5), D is minimized when the quadratic form q′TFq′ is maximized. Thus the optimal rotation quaternion q is the eigenvector corresponding to the largest eigenvalue of F.
The quaternion representation is free of singularities, but it contains four variables which makes it redundant. Another closely related parameterization that overcomes this problem is the exponential map, which is defined as
where v describes a rotation of |v| radians about the axis ; thus the magnitude and axis of rotation are encoded into a single three-dimensional vector. We note in passing that the exponential map is very closely related to the axis-angle representation of rotations.
A well-known fact about the rotation group SO(3) is that it cannot be mapped into ℝ3 without singularities.30 In the case of the exponential map, the singularities in ℝ3 are spherical surfaces with radius 2nπ where n is a positive integer; this is because these vectors correspond to n complete revolutions which leave the coordinates unchanged. Moreover, any rotation with a larger norm than π is equivalent to a smaller rotation in the opposite direction. In practice, we detect when |v| exceeds 0.9π and reset the reference positions y, and this effectively avoids all such singularities. Therefore, the exponential map is suitable as the rotational internal coordinate.
The partial derivatives are needed to compute the Wilson B-matrix. First note that the derivative of the exponential map parameters in terms of the quaternion elements are given by
In the neighborhood of q0 → 1 corresponding to vanishingly small rotations, Equations (8) and (9) cannot be used because the numerator and denominator both vanish. This can be addressed by taking the Taylor expansion of Equation (8) around q0 = 1, and the expressions simplify to
thus allowing us to compute the exponential map and its derivatives in the entire region of interest including the origin. By the chain rule, the desired derivative formula is given by with the first term given by Equations (9) and (11) above.
The quaternion derivative is given by the derivative formula for the eigenvector of a matrix,34
where (⋅)+ is the generalized matrix inverse, λ is the largest eigenvalue of F given by Equation (6) above, and q is the corresponding eigenvector. The derivatives are linear combinations of derivatives of R, for example,
where
III. COMPUTATIONAL METHODS
We developed GeomeTRIC, an open-source optimization program that interfaces to quantum chemistry software by passing Cartesian coordinates as input and retrieving the output energies and Cartesian gradients (see supplementary material68). GeomeTRIC implements Cartesian coordinates, redundant IC (RIC), delocalized IC (DLC), hybrid DLC (HDLC), and translation-rotation-internal coordinates (TRIC). The Q-Chem 4.2 software package35 was used for the water clusters, and the TeraChem software package36–38 was used for the others. The TRIC coordinates and associated optimization algorithms have also been implemented internally into TeraChem to improve performance. The details of the optimization algorithm are given in the supplementary material;68 we chose a force constant of 0.05 a.u. for the translation and rotation coordinates when constructing the initial guess Hessian matrix.
IV. RESULTS AND DISCUSSION
Copper phthalocyanine (CuPc) and C60 are a donor/acceptor pair of materials that are extensively studied as a model system for organic photovoltaics. The molecular packing at the interface in thin films and bulk heterojunctions depends heavily on the intermolecular interactions and in turn influence the optical and electronic properties.43 In this example calculation, 23 starting structures were randomly selected from a molecular dynamics simulation of a CuPc/C60 mixture which uses the UFF force field.44
Figure 1 shows the impact of the coordinate system on the optimization results in a representative example. The optimization using TRIC converges in 248 cycles, followed by HDLC (322), DLC (413), RIC (420), and Cartesians (771). Moreover, the primitive coordinates describing intermolecular motion affects the path taken by the optimization and the final geometry. TRIC is the only coordinate system to contain the translation and rotation coordinates, and converges to a stacked CuPc dimer with C60 molecules in close contact. The HDLC/Cartesian coordinates both use the atomic Cartesian coordinates, and they converge to two similar structures (root mean squared deviation (RMSD) = 0.40 Å) where the CuPc molecules are far apart. The RIC/DLC both use interatomic distances, angles, and dihedral angles, and they also converge to two similar structures (RMSD = 1.62 Å) where the two CuPc molecules are perpendicular.
Optimization results for 2CuPc–2C60 (neutral, triplet) using different choices of the coordinate system. Calculations used the PBE0 approximation39 with the DFT-D3 empirical dispersion correction,40 the LANL2DZ basis set and effective core potential41,42 for copper, and the 3-21G basis for all other atoms. Energies are referenced to the TRIC-optimized structure which has the lowest energy. The lower panels show initial and optimized structures under different coordinate systems. The HDLC/Cartesian coordinates and RIC/DLC converged to highly similar structures (RMSD < 1.5 Å) so only one structure is shown for each pair. Atoms are colored as C, gray; N, blue; Cu, orange.
Optimization results for 2CuPc–2C60 (neutral, triplet) using different choices of the coordinate system. Calculations used the PBE0 approximation39 with the DFT-D3 empirical dispersion correction,40 the LANL2DZ basis set and effective core potential41,42 for copper, and the 3-21G basis for all other atoms. Energies are referenced to the TRIC-optimized structure which has the lowest energy. The lower panels show initial and optimized structures under different coordinate systems. The HDLC/Cartesian coordinates and RIC/DLC converged to highly similar structures (RMSD < 1.5 Å) so only one structure is shown for each pair. Atoms are colored as C, gray; N, blue; Cu, orange.
It is well-known that the result of a local optimization largely depends on the choice of initial conditions, and even a small perturbation of the starting structure could cause convergence to a different minimum. Thus, we tested TRIC, HDLC, and DLC on the full set of 23 starting structures to assess the statistical significance of the results generated using these coordinate systems. Figure S168 shows that the HDLC calculations have the largest number of cycles (428 ± 231), DLC converges in fewer iterations (250 ± 104), and TRIC requires the fewest (216 ± 64). All of the TRIC calculations converged in fewer than 400 cycles, and 15 out of 23 calculations reached a lower final energy than either HDLC or DLC. Student’s t-test indicates that TRIC converges to the lowest energy structure more often than HDLC or DLC with a 95% confidence level.
Trp-cage miniprotein. Proteins are diverse polymers with complex nonbonded interactions; the monomers are covalently bonded through the protein backbone and possess flexible side chains, making this an interesting case study for internal coordinate systems. Here we minimized the energy of the Trp-cage miniprotein, starting from the solution NMR structure, PDB ID 1L2Y.45 The protein was divided into subunits according to the amino acid residue number.
Figure 2 shows the optimization results; TRIC converges in the fewest iterations (244), followed by DLC (248), RIC (264), HDLC (634), and Cartesian (1146). Here, both TRIC and HDLC include the bonds, angles, and dihedrals between residues. The TRIC optimized structure has a heavy-atom RMSD of 1.94 Å from the starting structure and contains more hydrogen bonds, in line with our expectation that the energy minimum should contain more hydrogen bonds than a structure obtained at room temperature. The DLC and redundant IC behave almost identically and converge to a pair of highly similar structures (RMSD < 0.01 Å). Finally, the HDLC and Cartesian coordinates converge in a much greater number of iterations (>600) and the two final structures are rather similar (RMSD = 0.60 Å).
Optimization results for Trp-cage, using the same electronic structure method as CuPc–C60 but with the COSMO continuum solvent model added.46,47 The protein backbone is shown as ribbons and atoms are shown as lines; hydrogen bonds are highlighted in blue with a 3.0 Å distance cutoff between heavy atoms and a 30° cutoff for the angle between hydrogen, donor atom, and acceptor atom.
Optimization results for Trp-cage, using the same electronic structure method as CuPc–C60 but with the COSMO continuum solvent model added.46,47 The protein backbone is shown as ribbons and atoms are shown as lines; hydrogen bonds are highlighted in blue with a 3.0 Å distance cutoff between heavy atoms and a 30° cutoff for the angle between hydrogen, donor atom, and acceptor atom.
The amino acid residues in a protein are connected by the backbone amide bonds, thus removing these bonds from the connectivity graph will disconnect the protein into separate fragments. This in turn prevents the primitive ICs involving atoms on different residues from being generated and leads to a block-diagonal G matrix, which is suggested in Ref. 26 to speed up the inverse iterations for Cartesian steps. Figure S268 shows that excluding these bonded ICs increases the number of optimization cycles for TRIC by a greater amount than for HDLC, although TRIC still converges in fewer cycles in both cases. This is also observed statistically for all 38 NMR structures using the AMBER99-SB force field48 in Figure S3.68 Thus, although TRIC is designed to reduce the number of optimization cycles, decoupling residues from each other is one way to accelerate the inverse iterations when they are a computational bottleneck.
Water clusters contain strong but flexible hydrogen bonds, which leads to a rich diversity of local minimum structures for water clusters as small as the hexamer.49 We thus examined the influence of the coordinate system on the distribution of optimization results for a large number of water cluster structures drawn from liquid molecular dynamics simulations using the iAMOEBA model.50 Five cluster sizes were considered (6, 8, 12, 16, and 20 molecules) with 100 structures each. We also present comparisons of our results with other software packages, including DL-FIND51 (using HDLC) and Q-Chem 4.235 (using DLC) in Figure S4.68
Figure 3 compares the performance of different coordinate systems for optimizing 12 water molecules. The number of optimization cycles is significantly different between coordinate systems, while the distribution of final energies is quite similar. The calculations with TRIC have the smallest mean and variance in the number of cycles compared to the other coordinate systems. Although the Cartesian coordinates are peaked at the largest number of iterations, DLC has the largest number of outliers requiring more than 500 cycles. These outliers could indicate the risks of using interatomic distances, angles, and torsions to describe intermolecular degrees of freedom, as they undergo very different motions from intramolecular ones.
Optimization results for 100 water clusters (12 molecules) with five different coordinate systems. The PBE0-D3 approximation and 6-31G* basis were used. The scatter plot shows the number of optimization cycles vs. the final minimized energies relative to the average. The lower panel is a kernel density estimate with kernel width 0.2 showing the distribution of optimization cycles for each coordinate system.
Optimization results for 100 water clusters (12 molecules) with five different coordinate systems. The PBE0-D3 approximation and 6-31G* basis were used. The scatter plot shows the number of optimization cycles vs. the final minimized energies relative to the average. The lower panel is a kernel density estimate with kernel width 0.2 showing the distribution of optimization cycles for each coordinate system.
Figure 4 shows how the number of optimization cycles depends on the system size. TRIC is the most efficient for all of the system sizes tested and also has the lowest slope of 4.1, indicating the increase in the number of cycles per molecule added. At the largest cluster sizes (20 and 24), the slope decreases to 2.3 cycles per molecule added; the apparent sub-linear scaling of the cycle number is encouraging considering the scaling of the electronic structure method is often quadratic or greater. Taken together with the CuPc–C60 and Trp-cage results, the data indicate that TRIC significantly reduces the cost of intermolecular geometry optimization.
Scaling behavior of the number of optimization cycles with respect to the cluster size. The error bars are one standard error. The dotted lines are linear fits.
Scaling behavior of the number of optimization cycles with respect to the cluster size. The error bars are one standard error. The dotted lines are linear fits.
A. Orientation constrained optimization
The TRIC coordinates enable a new kind of constrained optimization where the relative orientations of molecules are constrained while allowing the orthogonal degrees of freedom to relax, which is useful for characterizing the intermolecular potential energy surface. To demonstrate this capability we carried out orientation-constrained optimizations on a pentacene dimer, where one molecule is held to its original orientation while rotating the other around the chosen axis. The constraints are implemented using Lagrange multipliers as described in Ref. 52.
Although the idea of constraining the molecular orientation is intuitive, care must be taken to select the correct internal coordinates to constrain. Because the energy of the overall system is invariant to overall rotations, we fix the orientation of one pentacene molecule relative to its original structure and scan the orientation of the other molecule. The rotation is specified using a rotation axis and a rotation angle; the optimization then searches for a geometry where the molecule is rotated by the specified angle with respect to the starting structure. By scaling the unit vector of the specified rotation axis by the specified angle, one obtains the constraint values for the exponential map parameters corresponding to vi in Equation (8).
Figure 5 shows a sequence of optimized structures of the pentacene dimer as the second molecule is rotated through 180° in 5° increments. We scanned the orientation first with dispersion-corrected PBE039,40 and then with the GAFF force field,53 and the results are significantly different. The final structure in density functional theory (DFT) has a much smaller horizontal offset compared to the force field, which indicates stronger intermolecular interactions in DFT. DFT also predicts much greater distortions in the monomers, which could arise from the relatively strong intermolecular interactions or could indicate that molecules are less rigid. The stronger interactions are also evident in Figure S5,68 which shows that DFT predicts much higher energy barriers to rotation.
Optimized structures of pentacene dimer using PBE0-D3/6-311++G(d,p) (left) and the GAFF force field (right). The rotation axis is perpendicular to the plane of the image. The structures are aligned using the monomer in gray/white as a reference, and the other monomer is colored according to sequence number (blue, initial; red, final). The details of this calculation are provided in the supplementary material.68
Optimized structures of pentacene dimer using PBE0-D3/6-311++G(d,p) (left) and the GAFF force field (right). The rotation axis is perpendicular to the plane of the image. The structures are aligned using the monomer in gray/white as a reference, and the other monomer is colored according to sequence number (blue, initial; red, final). The details of this calculation are provided in the supplementary material.68
Relaxed potential energy scans are a commonly used approach in theoretical chemistry, principally as a means for testing hypothetical reaction coordinates.54–56 With the introduction of the rotational coordinate in TRIC, it is now possible to perform a relaxed potential energy scan over the molecular orientation. We conjecture that this could open up new directions in the study of intermolecular forces, particularly concerning the interplay between molecular distortions and intermolecular forces that have been proposed for inclusion into polarizable force fields.57
V. CONCLUSION
As electronic structure theory is increasingly applied to study intermolecular interactions, corresponding advances in geometry optimization methods are essential. The TRIC coordinate system significantly reduces the number of optimization cycles needed for these complicated systems and also enables exploring the potential energy surface while constraining the intermolecular orientations. Such calculations can provide valuable data for parameterizing empirical potentials which sample the intermolecular degrees of freedom in finite-temperature simulations.58 We are optimistic that further applications of TRIC will include basin-hopping33,59,60 and reaction path-finding methods61–67 as they all involve geometry optimizations on molecular potential energy surfaces.
Acknowledgments
L.-P.W. acknowledges support from the UC Davis Department of Chemistry and a gift from Walt Disney Imagineering. C.S. is grateful for an E. K. Potter Stanford Graduate Fellowship and support through Grant No. NSF ACI-1450179. We would also like to acknowledge Chi-Yuen Wang and Yudong Qiu for helpful discussions.