Computing vibrational spectra using a new collocation method with a pruned basis and more points than basis functions: avoiding quadrature

We present a new collocation method for computing the vibrational spectrum of a polyatomic molecule. Some form of quadrature or collocation is necessary when the potential energy surface does not have a simple form that simplifies the calculation of the potential matrix elements required to do a variational calculation. With quadrature, better accuracy is obtained by using more points than basis functions. To achieve the same advantage with collocation, we introduce a collocation method with more points than basis functions. Critically important, the method can be used with a large basis because it is incorporated into an iterative eigensolver. Previous collocation methods with more points than functions were incompatible with iterative eigensolvers. We test the new ideas by computing energy levels of molecules with as many as 6 atoms. We use pruned bases, but expect the new method to be advantageous whenever one uses a basis for which it is not possible to find an accurate quadrature with about as many points as there are basis functions. For our test molecules, accurate energy levels are obtained even using non-optimal, simple, equally spaced points.

The development of new methods for computing vibrational spectra of polyatomic molecules is an important sub-field of chemical physics.Having a potential energy surface (PES) with a special form that simplifies the calculation of matrix elements of its basis representation makes the calculation of vibrational spectra significantly less costly.Popular special forms include the sum-ofproducts form, 1 the N-mode representation form, 2 the high-dimensional model representation form, 3 and the many-body expansion form 4 .For some molecules, the most accurate available PES does have a special form.For example, for larger molecules it is still common to use ab initio programs to make a "force field". 5However, for many molecules, the most accurate PES does not have a special form, or does not have a special form in the coordinates in which one wishes to compute the vibrational spectrum.In this case, when the potential is a general function, one can fit a special form to the PES.To avoid re-fitting the PES, it is necessary to use a method that requires only values of the PES at points, i.e., use quadrature or collocation.In this paper, we present a new collocation method for computing vibrational spectra.
When the PES does not have a special form, but is some general function, it is common to use quadrature to compute matrix elements of the potential matrix required in a variational approach.One then solves the matrix eigenvalue problem a) Electronic mail: 11js110@queensu.cab) Electronic mail: tucker.carrington@queensu.ca where S is an overlap matrix and H = K b + V b , with K b and V b being kinetic and potential matrices in the chosen basis.There are many matrix eigenvalue equations in this paper.They all have the form of Eq. ( 1).In Eq. ( 1), X is a matrix whose columns are eigenvectors and E is a diagonal matrix whose diagonal elements are eigenvalues.Frequently, one chooses coordinates and basis functions to make it possible to calculate K b and S exactly.This, however, restricts the choices.][8][9][10] To obtain Eq. ( 1), one substitutes a basis expansion of wavefunctions into the Schrödinger equation, left multiplies by a basis function and integrates.
When using collocation, one instead solves the matrix eigenvalue problem It is obtained by substituting the same basis expansion into the Schrödinger equation and then demanding that the Schrödinger equation be satisfied at a set of points called collocation points.There are as many collocation points as basis functions.In Eq. ( 2), B ai = ϕ i (x a ), B ′′ ai = Kϕ i (x a ), where K is the kinetic energy operator (KEO), and V ab = V (x a )δ ab , where V (x a ) is the potential evaluated at a point.ϕ i is a basis function and x a , is a collocation point.In this paper, a labels points and i labels basis functions.In the remainder of the paper, the meaning of B is determined by the points and functions used to build it.In some equations B is rectangular and in others it is square.
Collocation has advantages and disadvantages.The most obvious advantage is that no integrals are required.It is possible to obtain accurate solutions to the Schrödinger equation when it is not known how to find a good quadrature for computing elements of V b , (and possibly K b and S) in Eq. (1).By good quadrature we mean one which is accurate and has about as many points as there are basis functions.When the basis is a direct product of univariate bases, it is always easy to find a quadrature that is good in this sense.When basis functions are products of multivariate functions, 11,12 or otherwise optimized, 13,14 it is not easy to find a good quadrature.Collocation has no weights, and when the basis expansion error is small, almost any collocation points can be used.Owing to the fact that integrals are not necessary, there is no need to choose basis functions and coordinates to facilitate the calculation of matrix elements (e.g.kinetic and overlap matrix elements).This opens the door to choosing the best possible basis functions and coordinates.To date, this advantage of collocation has been underexploited.
The most important disadvantage of collocation is the fact that Eq. ( 2) is a generalized eigenvalue problem, even when the basis functions are orthonormal.When the basis size is small and it is possible to store B ′′ and B and use methods of direct linear algebra, solving the generalized eigenvalue problem is straightforward. 7Direct product bases are simple and general and therefore frequently used.Unfortunately, for molecules with more than three atoms, a direct product basis is often large enough that it is not possible to store B ′′ and B. In that case, one can attempt to find a smaller basis with which accurate energy levels can also be computed.][17][18][19][20] Another way to make a small basis is to start with a direct product basis and prune it by retaining only basis functions deemed to be important.7][28] In both cases, when using a standard collocation method, it is necessary to find a way to choose a collocation point set with the same number of points as the number of retained basis functions.
Rather than reducing the basis size, one can use a direct product basis with an iterative Krylov-type eigensolver to obviate the need to compute and store large matrices.Iterative eigensolvers make it possible to do variational calculations (S in Eq. ( 1) is an identity matirx) with large basis sets. 29,30However, when using an iterative eigensovler it is much harder to deal with a generalized eigenvalue problem.It is common to convert Eq. ( 2) into a regular eigenvalue problem by left multiplying with B −1 to obtain To use collocation with an iterative method, one must therefore be able to efficiently compute matrix-vector products (MVPs) with B −1 .For a multi-dimensional problem, the basis size is large and therefore B is large.It is not possible to compute and store B and then evaluate MVPs by multiplying rows of B −1 by the vector.There is one case in which it is obviously easy to evaluate MVPs with B −1 : when both the basis set and the point set are direct products.If both are direct products then B −1 is a Kronecker product and it is simple to do the sums required to evaluate the MVPs sequentially.Unfortunately, for molecules with more than about five atoms, a direct product basis is so large that a huge amount of memory is required to store even vectors (no matrices need to be computed or stored).Each vector has about 10 D elements, where D is the number of dimensions.About 8000 GB is required for a molecule with 6 atoms.
Basis-size reduction and iterative eigensolvers are two tactics for dealing with large bases.There is a third tactic that is also useful.It is possible to improve the accuracy of energy levels computed by solving Eq. ( 2) by left multiplying with B T to obtain, and using more points than basis functions.In this case B and B " are matrices with fewer columns than rows.
2][33][34][35] It is also possible to left multiply by B T W, where W is a diagonal matrix containing weights.A collocation method using more points than basis functions is sometimes called rectangular collocation (RC).It is common to use more points than basis functions when using a variational method and solving where K ex and S ex exact kinetic and overlap matrices.Eq. ( 5) and Eq. ( 4) are not equivalent.When it is possible to represent wavefunctions as linear combinations of basis functions, the accuracy of energies obtained from Eq. ( 5) depends more sensitively on the choice of the points than does the accuracy of energies obtained from Eq. ( 4).
It would be ideal to use together all three tactics for reducing the cost of a collocation calculation: (i) basissize reduction, (ii) an iterative eigensolver, and (iii) more points than basis functions (RC).See Figure 1.Heretofore, only pairs of these tactics have been combined.Iterative eigensolvers have been used in conjunction with pruned basis sets in many papers (pale green intersection in Figure 1), see for example Refs.8, 36-43.This is a combination of (i) and (ii).To realize this combination it is necessary to find a way to compute MVPs with B −1 , despite the fact that B −1 is not a Kronecker product.RC has been used (Eq.( 4) has been solved) with small optimised basis sets by using a direct eigensolver and building matrices explicitly (pale blue intersection in Figure 1).This works when the basis size is sufficiently small.5][46] An iterative Multiconfiguration Time-Dependent Hartree approach has been used with RC (pale red intersection in Figure 1). 35This is a combination of (ii) and (iii).In this case, although there are more points than basis functions, both the basis set and the point set are direct products.
In this paper, we propose a method that combines all three tactics.At first glance, one might think this is easy.
One merely needs to use an iterative eigensolver to solve Eq. ( 4), where B and B ′′ are made from an efficient (e.g.pruned) basis and some larger point set.However, solving Eq. ( 4) with an iterative eigensolver requires doing MVPs with (B T B) −1 , which is difficult.The combination of the three tactics proposed in this paper yields a method with which it is possible to solve the Schrödinger equation with a general potential that has the following advantages: 1) the basis size is reduced by pruning; 2) an efficient iterative eigensolver is used and there is no need to compute and store large matrices; 3) energies are relatively insensitive to the type of points.3) is important.There is no need for Gauss quadrature points 47 or potential optimized discrete variable representation points 48,49 or simultaneous diagonalization discrete variable representation points 50,51 .Previous calculations have shown that, using RC and a direct product basis and grid, accurate results are obtained with sub-optimal points. 35Insensitivity to the choice of the points makes the method robust.Collocation has advantages when it is hard to find quadrature points (and weights) appropriate for the basis functions one wishes to use.Smolyak quadrature methods work best when used with harmonic basis functions for which good (nested) points and weights are known. 26,52,53One would much rather not be constrained to using harmonic basis functions.Previous RC methods were either incompatible with iterative eigensolvers 9,46 or used large direct product bases and grids 35 .

II. BASIS SET AND POINT SET PRUNING
In this paper we use pruned bases.A pruned basis is obtained from a direct product basis by retaining functions whose indices satisfy a pruning condition, for example, Each function in the basis is a product There are N b functions in the basis.i c = 0, 1, . . ., n c .n c = B ∀ c.For a single coordinate, the functions ϕ c ic (q c ) i c = 0, 1, . . ., n c are importance ordered, i.e. if the value of i c is smaller the function is more important.It is also possible to implement the ideas of this paper with the more general pruning condition, where G c (i c ) is a monotonically increasing function of i c , and the functions G c (i c ) are optimised to reduce the size of the basis. 27For the sake of simplicity, we present and test everything with the simple pruning condition of Eq. ( 6).The basis obtained from the pruning condition of Eq. ( 6) is good for many molecules. 54,55onsider first building a collocation point set with the same number of points as the basis defined by the pruning condition Eq. ( 6).In order to efficiently evaluate MVPs, we associate each function in the basis with a single point in D-dimensional space.To make the collocation point set, we start by choosing a set of points There is a 1:1 correspondence between points and basis functions, i.e., point a c = m is associated with function ϕ c ic=m (q c ) in the 1-D basis {ϕ c ic=0 (q c ), ϕ c ic=1 (q c ), . . ., ϕ c ic=nc (q c )}.This naturally produces a nested set of points.The point associated with the multi-D basis function ϕ 1 i1 (q 1 )ϕ 2 i2 (q 2 ) . . .ϕ D i D (q D ) is the point in D-dimensional space labelled by {a 1 = i 1 , a 2 = i 2 , . . ., a D = i D }.The labels of the collocation points satisfy There are as many points as basis functions.
In our calculations (see Section IV), we use a set of collocation points larger than (or equal to) the basis defined by Eq. ( 6).Its labels satisfy the condition where B ≥ b.This larger set of points is linked to a larger basis, whose functions satisfy the pruning condition, The basis (point) set determined by Eq. (10) (Eq.( 9)) has N B functions (points).The functions in the basis defined by Eq. ( 6) are those that we use in our calculations; the set of those functions is a subset of the basis defined by Eq. (10).The points in the set defined by Eq. ( 9) are those that we use in our calculations; the point set is larger than the basis set.

III. HAMILTONIAN
In this paper, we use a simple approximate KEO in dimensionless normal coordinates, where ω c is the harmonic wavenumber for coordinate q c .It is certainly also possible to use curvilinear internal coordinates.In curvilinear internal coordinates the KEO can be written so that it is a sum of terms each of which contains two, one, or zero derivatives.In each term the derivatives are multiplied on the left by a coefficient which in general is coordinate dependent.To use collocation with a KEO in curvilinear internal coordinates, it is necessary to evaluate the coefficients at points, but not to do integrals (quadratures) with integrands that involve the coefficients (a good pruned basis is necessarily not a discrete variable representation (DVR) basis).This idea was used in recent collocation calculations of energy levels of methane. 35,39e do calculations for four molecules: P  O potential, we transform from normal coordinates, q, to the internal coordinates of Ref. 57, R. The CH 2 O potential therefore has the form V ( R(q)).The transformation is done point by point.Using eigenvectors of the Cartesian force constant matrix (obtained from eigenvectors of GF 61 ), we obtain for each q a set of Cartesian coordinates.From the Cartesian coordinates we compute the internal coordinates in terms of which the PES of 57 is defined.Products of 1-D harmonic oscillator functions that are functions of the dimensionless normal coordinates are used to construct the basis.

IV. RECTANGULAR COLLOCATION WITH A CHOPPED B −1
Our goal is to use the basis set defined by Eq. ( 6) and the point set defined by Eq. ( 9) together with Eq. ( 2).There are more points than basis functions.B ′′ and B are rectangular N B × N b matrices and Eq. ( 2) is a rectangular eigenvalue problem.The most natural way to solve it is to proceed least-squares-like and to use Eq.(4).To solve Eq. ( 4) with an iterative eigensolver, it is rewritten called the pseudoinverse of B and written in terms of B + , Eq. ( 12) is an N b × N b square eigenvalue problem, In this paper, we begin with a rectangular version of Eq. ( 2), but instead of left multiplying by B T and then by (B T B) −1 , we left multiply by a matrix M. One problem with Eq. ( 13) is that MVPs with B + are costly.
To find M, we first use Eq. ( 2) and set up a square calculation with the basis defined by Eq. ( 10) and the grid defined by Eq. ( 9).We must build square N B × N B matrices B, B ′′ , and V. and B KP is the Kronecker product matrix obtained when the basis is the direct product formed from the 1-D bases {ϕ c ic=0 (q c ), ϕ c ic=1 (q c ), . . ., ϕ c ic=nc (q c )} and the point set is the direct product grid formed from the points B KP is a Kronecker product of matrices, denoted (c) B, for each coordinate, C B is a chopping matrix.When the functions in the direct product basis are ordered so that the first functions are those whose indices satisfy Eq. ( 10), then C B is an identity matrix with as many rows as there are functions in the direct product basis and as many columns as there are functions in the basis defined by Eq. (10).
The bottom rows of the matrix have elements that are all zero.In this paper, a C B to the right of B KP or B ′′ KP removes columns, i.e. basis functions; a C T B to the left of B KP or B ′′ KP removes rows, i.e. points.When using the KEO of Eq. ( 11), B ′′ KP is a sum of Kronecker products.Each term in the sum has one factor, due to the second derivative in Eq. (11), that is not a B (c) matrix (see for example Ref. 42).V is the square diagonal matrix whose diagonal elements are values of the PES on the pruned grid defined by Eq. ( 9).Using the Eq. ( 9) grid and the Eq. ( 10) basis, Eq. ( 2) is a square ) Second, start with Eq. ( 16) and retain only basis functions that satisfy Eq. ( 6).This achieves our goal of using the basis defined by Eq. ( 6) and the point set defined by Eq. ( 9).The chopping matrix that selects only the functions that satisfy Eq. ( 6) is C B C ∆ .Like C B , C ∆ is a rectangular identity matrix and removes (additional) columns or functions when it is placed to the right of C B .If in Eq. ( 16), we replace, If elements of columns of Z (in Eq. ( 16)) labelled by indices in the set defined by Eq. ( 10) and not in the set defined by Eq. ( 6) are zero then: (i) Eq. ( 17) has exact solutions; (ii) some E k in Eq. ( 17) are exactly equal to diagonal elements of E in Eq. ( 16); (iii) the elements of columns of Z in Eq. ( 16) labelled by indices in the set defined by Eq. ( 6) and not in the set defined by Eq. ( 10) compose the columns of Z in Eq. (17).In practice, we use Eq. ( 17) when contributions from the excluded basis functions to states of interest are small, but not exactly zero, and some E k we compute from Eq. ( 17) are accurate even when they are not exactly equal to diagonal elements of E. To solve this generalized eigenvalue problem, one might attempt to left multiply by the inverse of C T B B KP C B C ∆ which is on the right side; this is impossible because it is rectangular.We solve Eq. ( 17) by left multiplying by and exploiting C T ∆ C ∆ = I.We find the N b × N b eigenvalue problem, M plays the role of a pseudoinverse.
To obtain Eq. ( 18) from Eq. ( 16), one simply left multiplies by (C T B B KP C B ) −1 and then chops the product of matrices that is now on the left side.Important is that the product is chopped and not the factors multiplied to make the product; see for example Section IIC of Ref. 62.
Eigenvalues computed from Eq. ( 18) will be equal to eigenvalues of Eq. ( 1) (with exact matrix elements and the basis whose indices satisfy Eq. ( 6)) when an interpolant is exact.It must be exact for the functions obtained by applying the Hamiltonian operator to all functions in the basis defined by Eq. ( 6) and the interpolant is made from the basis whose indices satisfy Eq. ( 10) and the point set whose indices satisfy Eq. ( 9).
Eq. ( 18) can only be used if it is somehow possible to invert the N B × N B matrix C T B B KP C B and do MVPs efficiently with M. To use collocation with a large point set and a small basis, we must evaluate MVPs with the inverse of a matrix as large as the point set.To solve Eq. ( 18) with an iterative eigensolver, one must evaluate MVPs with ( It might appear that MVPs with (C T B B KP C B ) −1 would be costly.There are two obvious problems: 1) we require the inverse of a large matrix; 2) we must evaluate MVPs with a large matrix.Both of these problems can be resolved by using an LU decompostion: B = LU.L is lower triangular and U is upper triangular.Let c be a column of coefficients representing a wavefunction in a basis.Uc is a column of coefficients representing the wavefunction in a ZAPPL (zero at points in previous levels) basis. 64When ZAPPL functions were introduced, 8 they were called "hierarchical functions" and what is here U is equal to what was Ã−T in Ref. 8. LUc is a column of coefficients representing the wavefunction on the grid.In previous papers, we worked directly in the ZAPPL basis and in this paper the ZAPPL basis is used only as tool.Holzmüller and Pflüger also implicitly use ZAPPL functions by introducing an LU decomposition and show that efficient MVPs are possible. 63Using an LU decomposition, Here we use: (i) the L and U factors of a Kronecker product matrix are themselves Kronecker products; (ii) owing to either the lower triangularity of L KP , or the upper triangularity of U KP , we can insert C B C T B between L KP and U KP ; (iii) the inverse of a chopped triangular matrix is the chopped inverse. 41,42,64Evaluating MVPs with Eq. ( 22) is cheap because all of the factors are either chopping matrices or Kronecker products. 38MVPs with a Kronecker product matrix can be done by doing sums sequentially.In previous papers, 8,41,42 we solved the Schrödinger equation in the ZAPPL basis and did not use it as an intermediate basis as in Eq. (20).In this paper, we use the original (here harmonic) basis because C T ∆ (C T B L KP C B ) −1 is a matrix for which the columns that correspond to the points in Eq. ( 9) and not in Eq. ( 8) are zero, which negates the desired advantage of using more points than basis functions.
In each term, we replace (c) B ′′ with (c) L ′′ (c) U ′′ and B (c) with L (c) U (c) so that c B ′′ is a product of a Kro-necker product of L factors and a Kronecker product of U factors. 63We write, Between the two Kronecker products we insert The MVPs in Subsections IV A, IV B, and IV C can all be evaluated without storing vectors with as many elements as the direct product basis.It is possible because in all cases a Kronecker product matrix is decomposed with an LU factorization and is inserted between the L and U factors.The insertion is possible because the L factor is lower triangular and/or the U factor is upper triangular.Finally, although chopping matrices occur in the equations they are not constructed and explicitly applied to vectors.Instead, the chopping matrices are applied by appropriately limiting the 1-D sequential summation indices for each MVP.

V. THE MATRIX-VECTOR PRODUCT WITH A CHOPPED KRONECKER PRODUCT OF TRIANGULAR MATRICES
To evaluate MVPs with 18), we use Subsections IV A, IV B, and IV C and apply matrices that are chopped Kronecker products of lower or upper triangular matrices to a vector.The MVPs can all be evaluated without storing vectors that have more elements than functions in the basis pruned with Eq. ( 10).After doing LU decompositions and inserting as explained in Section IV, one must evaluate these MVPs: i) How does the cost of the MVPs we must compute in order to use more points than basis functions compare with the cost of the MVPs in a square ZAPPL calculation? 38,39,41,42,63In a square ZAPPL calculation one must evaluate MVPs with The cost of all three are identical and is denoted C. When using the KEO of Eq. ( 11), the cost of the calculation is M Z (D + 2)C; M Z is the number of required Arnoldi iterations (see Section VII).In a ZAPPL calculation there are no U matrices.When using more points than basis functions, the cost of MVPs with iii) and v) is significantly less than C, at least when B > b. 64 The cost of MVPs with iv) and vi) is also smaller. 64The cost of MVPs with i) is C. The cost of MVPs with ii) is smaller than but close to C. Neglecting the cost of the MVPs with iii) and v), denoting the cost of the iv) and vi) MVPs by C < , the cost of a rectangular collocation calculation is where M RC is the number of required Arnoldi iterations.When B ⪆ b, C < is less but not significantly less than C.For the simple KEO of Eq. ( 11), it may be cheaper to replace the product of i) and iv) with ) is a Kronecker product of factors many of which are identity matrices.In any case, the cost of a rectangular calculation relative to a square calculation decreases as D increases.Moreover, M RC < M Z because the N b × N b matrix for which we compute eigenvalues has a smaller spectral range and a more favourable gap structure than the N B ×N B ZAPPL matrix.
As examples, we explain how to compute MVPs in two cases.All other MVPs are evaluated using similar ideas.Consider first the MVP with the rectangular matrix C T B L KP C B C ∆ .L KP is a Kronecker product of matrices for each of the coordinates, The MVP is done by doing sums sequentially, a2,t2 . . .
The input vector z in Eq. ( 27) has N b elements, as many as there are indices in the set defined by Eq. ( 6).The rightmost sum is done first and the leftmost sum is done last.The upper limits on the sums are determined from the pruning conditions Eq. ( 6).t c labels a ZAPPL function for coordinate c.After each sum an intermediate vector is generated.The indices of the output (z ′ ) vector are those in the retained set whose elements satisfy Eq. ( 9).z ′ has length N B .Intermediate vectors may have more elements than the input or output vectors.For example, when computing where z has as only many elements are there are indices in the set defined by Eq. ( 6).The rightmost intermediate vector is For some sets of values i 1 , . . ., i D − 1, z D has more elements than z, but the elements that do not satisfy are all zero owing to the upper triangularity of U (D) .The same thing is true for all the intermediate vectors.
Finally, we briefly explain how we implement Eq. ( 26) and Eq. ( 29) on the computer.The sums are done sequentially and as explained in Ref. 65, using the three loop structure suggested by Larsson et al. 66 .For each sum, the input and output vectors share D − 1 indices which together are treated as a composite index.Each sum is done with three loops, one over the composite index, one over values of the index being summed, and one over values of the index that replaces the summed index.For example, in Eq. ( 28) the composite index includes t 1 , . . ., t D−1 .The remaining two loops are over t D and a D , which in general have different ranges.The process is repeated for each of the D sums required to evaluate a single MVP.To do the sums over t k in Eq. ( 26), mapping arrays are used because the common indices are different for each k.

VI. BASIS FUNCTIONS AND POINTS
For every molecule and for each coordinate, we choose basis functions and points.The 1-D basis functions are importance ordered and each basis function is associated with a point.For coordinate c there are n c +1 basis functions and n c + 1 points.The final multidimensional basis and the final point set are obtained from the pruning condition.
All of the molecules for which we compute energy levels are rather rigid and harmonic basis functions are therefore an obvious choice.ϕ c ic (q c ) for which i c is smaller are more important.We use the same basis functions for all molecules and all coordinates.It is not difficult to incorporate anharmoncity into the 1-D basis functions, but strong coupling will make calculations more difficult because it will require using a larger b in Eq. ( 6).
The choice of points is less obvious.We give results for two types of points.The first is Leja points. 67Leja points are nested interpolation points for a nested sequence of univariate basis functions.They are most often used with polynomial bases.We use the Leja points that correspond to harmonic basis functions. 68Leja points are known to be good (nested) interpolation points 67,[69][70][71] and it has been shown that they are good collocation points 38,39,41,43,68 .There is a natural one to one correspondence between Leja points and basis functions.The second type of points we use are equally spaced points.We use equally spaced points to demonstrate that when using rectangular collocation it is possible to obtain accurate solutions to the Schrödinger equation without using optimal points.There is no need to use points that would be good quadrature points.This is important because there are many situations in which it is hard to find good quadrature points.It is hard to imagine simpler or more generic points than equally spaced points.We must establish a one to one correspondence between points in an equally spaced set with n c + 1 points and the basis functions, ϕ c ic (q c ), i c = 0, 1, . . ., n c .The equally spaced points are chosen to be between limits +ℓ c and −ℓ c .There is a point at both +ℓ c and −ℓ c .ℓ c is determined so that the range includes the region in which the wavefunctions is large.When B is odd, a c = 0 is at the potential minimum (q c = 0).For any value of B there are pairs of points that are equally far from the potential minimum.We arbitrarily assign a smaller value of a c to the negative point of each pair.a c = 1 and a c = 2 label the points of the pair closest to the minimum; a c = 3 and a c = 4 label the points of the pair that is second closest to the minimum; etc.An example of a large and a small point set for a 2-D problem is shown in Figure 2. In the figure, both point sets are built from Leja for each of the coordinates.Because the points for a single coordinate are importance ordered, there are no points in the corners.Most of the points in the larger set that are not in the smaller set are around the periphery.

VII. RESULTS
We have done test calculations on molecules with three, four, five, and six atoms.In all cases the eigenvalue problem is solved with ARPACK. 72For P 2 O, Fig. 3 shows a comparison of the maximum absolute error (MAE) of the lowest 50 levels computed with b = 23 and (b+D)!b!D! = 2600 basis functions using Eq. ( 18) (blue) and Eq. ( 13) (red) with Leja points.It is interesting to compare the accuracy of equations Eq. ( 13) and Eq. ( 18) when the basis is small enough that the calculation with Eq. ( 13) (with a direct eigensolver and B = C T B B KP C B ) is possible, which is the case for P 2 O.At the left of the Figure the basis size and point set size are the same (B = b).When using Eq. ( 18), as the number of points is increased by increasing B in Eq. ( 9) from 23 to 27, the MAE decreases substantially.This demonstrates the advantage of rectangular collocation: without increasing the basis size it is possible to compute more accurate energies by using more points.Energy errors computed with Eq. ( 18) decrease more quickly than those computed with Eq. ( 13) as the size of the point set is increased.To compute errors plotted in Fig. 3, we use eigenvalues of a Hamiltonian matrix whose elements are exact.
For CH 2 O, Figure 4 shows absolute errors for the lowest 50 energy levels.The basis size is 18,564, corresponding to a pruning parameter of b = 12.The number of points is increased by increasing the point pruning parameter in steps: B = 12, 14, 16, 18. Errors are calculated from reference levels obtained with a huge direct product discrete variable representation (DVR) basis with 14 6 = 7, 529, 536 functions and an iterative eigensolver.Errors in the reference levels are less than 0.001 cm −1 .Energy level errors decrease by approximately FIG. 3. Maximum absolute errors of the lowest 50 levels of P2O.The basis size is 2,600 and the number of points is increasing along the x axis.The basis pruning parameter in Eq. ( 6) is b = 23 and the point pruning parameter in Eq. ( 9) is B = 23, 24, 25, 26, 27.A comparison is shown between errors with Eq. ( 18)) (blue) and Eq. ( 13) (red).Leja points are used.
three orders of magnitude as the size of the point set is increased, without increasing the basis size.In Figure 5, we compare CH 2 O MAEs obtained with Leja and equally spaced points.The MAEs are similar, indicating that simple equally spaced points work well.A major advantage of collocation is that it is not necessary to have good quadrature points.Being able to use more points than functions means that even super simple points can be used to obtain accurate energies.
For CH 2 NH, Figure 6 shows that also for a molecule with 5 atoms, energy level errors are reduced by increasing the size of the point set.The absolute errors are with respect to reference levels computed with the same pruning condition, exact matrix elements, and the method of Ref. 55, which is only applicable for a SOP PES.Errors in the reference levels are less than 0.001 cm −1 .Errors are unacceptably large if the point set size equals the basis set size.If one were not able to use more points than basis functions, one would deem the basis with 48,620 functions too small.The advantage of using more points than functions is also clearly demonstrated in Figure 7.All the calculations for Figure 7 were done with the same point set using B = 12.The Figure makes it clear that as long as the number of points is large enough, energy errors remain acceptably small even when the basis size is reduced.If one were forced to use as many points as basis functions, it would be necessary to use many more basis functions and therefore a much larger matrix.
Results for CH 3 CN are shown in Figure 8. Reference levels are calculating using the O expansion method of Ref. 55 with exact matrix elements.Errors in the reference levels are less than 0.001 cm −1 .The basis  9) that are B = 12, 14, 16, 18.The basis pruning parameter in Eq. ( 6) is b = 12.For many eigenvalues the bars for 134,596 points are so small they are not visible.Leja points are used.

VIII. CONCLUSION
With a direct product basis made from products of univariate functions it is straightforward to solve the Schrödinger equation for the vibration of the nuclei of a molecule with five or fewer atoms.This is possible even for a general PES (i.e. one that is not a SOP and not an N-mode representation). 30,73However, for larger systems the runtime and memory cost of a direct product calculation become prohibitive.To do calculations for systems with 6 or more atoms one must use a better basis.It is common to make a better basis, either by contracting or by pruning, and this always means using basis functions that are not product DVR functions.Using better basis functions is not enough, because to solve the Schrödinger equation, one must also compute matrix elements of the Hamiltonian operator in the chosen basis.For a general PES, this usually means evaluating integrals by quadrature and using more points than basis functions.This paper is about a collocation method with more points than basis functions that obviates all integrals.
Quadrature is a problem because for good basis sets it is not known how to find an accurate quadrature with about as many points as there are functions in the basis.For example, in Ref. 74, to compute vibrational levels of methane, 280 contracted bend functions were used, but 0.34 × 10 6 bend quadrature points (after compaction) were required.Pruning is another technique for creating a basis smaller than a direct product basis.For a 15-D problem with 15 functions for each coordinate the basis obtained from the pruning condition Eq. ( 6) is about 10 orders of magnitude smaller than the direct product basis from which it is obtained.It is possible to find good quadrature points and weights for pruned basis sets, but it is only simple for polynomial-type bases. 37n any quadrature method, the accuracy of energies depends sensitively on the points.Quadrature methods do have the advantage, compared to (direct product) DVR calculations that one is able to use more points than basis functions.In a sense using more points enables one to fully exploit the space spanned by the basis.
In this paper we prune to reduce the basis size and use a rectangular collocation (RC) method with more points than basis functions.We need at most an order of magnitude more points than basis functions and demonstrate that the method is relatively insensitive to the choice of the points.By using more points than basis functions, we take full advantage of the space spanned by the basis.In standard RC calculations, one converts a rectangular eigenvalue problem into a square eigenvalue problem by left multiplying with a pseudoinverse which is (B T B) −1 B T .In this paper we use a different pseudoinverse, called M. To make M, we use a big basis (Eq.( 10)) and its corresponding grid and also a small basis (Eq.( 6)) and its corresponding grid.M can only be used if it is possible to invert and evaluate MVPs with the big B which in this paper is C T B B KP C B .For a pruned basis this is possible, as explained in Section IV.It is much easier to invert C T B B KP C B than to solve the Schrödinger equation in the big basis.We are using the big point set (Eq. ( 9)) and the small basis (Eq.( 6)).Of course one could use the big point set and the big basis.Using the small basis is less costly because MVPs are less expensive (for D = 9 and 12, if B is large enough) and fewer MVPs are required to obtain converged eigensolutions (with a small basis the spectral range the density of states are both smaller).For similar reasons, it is common to use more points than basis functions with quadrature.We have used the simplest possible pruning condition (Eq.( 6); it might be possible to reduce the point set and basis set sizes by using Eq.(7).
Energies computed from Eq. ( 18) will be exactly equal to those obtained from a variational calculation with exact matrix elements when it is possible to interpolate the functions obtained by operating with the Hamiltonian operator on all functions from the basis used to represent wavefunctions (here Eq. ( 6).In general, b should be large enough that wavefunctions are correctly represented but no larger.Using a value of b larger than required, unnecessarily increases interpolation error.
It is clear from Figure 3 that energies obtained from the M pseudoinverse are better than those obtained from the standard (B T B) −1 B T pseudoinverse and that the M pseudoinverse becomes more advantageous as B becomes larger than b.How is this possible?If we knew only the matrix (C T B B KP C B C ∆ ), then it would be best to use the standard pseudoinverse.To use the M pseudoinverse, we evaluate MVPs with (C B T B KP C B ) −1 .Hence we are using elements of B KP that are not in (C T B B KP C B C ∆ ).Energy errors computed with Eq. ( 18) are smaller than those computed with Eq. (4) because we are using more information.
When using a variational method with a general PES, the advantage of using more quadrature points than basis functions is clear.By using iterative methods and doing sums sequentially more points can be used without significantly increasing the cost. 30The standard way to use collocation and also take more points than basis functions (see for example, Ref. 9) requires solving a generalized eigenvalue problem which, when using an iterative eigensolver, is significantly more costly.The rectangular collocation method we present should be useful for a wide range of molecules with general PESs for which it is possible to devise a compact basis but for which an accurate quadrature of similar size is not known.

FIG. 1 .
FIG. 1. Diagram showing three tactics for reducing the cost of a collocation calculation.Pairs of tactics have been combined previously.In this paper we use all three tactics.
To use Eq.(18) it is also necessary to evaluate MVPs with C T B B ′′ KP C B .B ′′ KP = c c B ′′ is a sum of Kronecker product matrices and c B ′′ is the matrix c are done with N b × N b square ma-trices; MVPs with ii), iv), & vi) are done with N b × N B or N B × N b rectangular matrices; MVPs with i) are done with a N B × N B matrix.

6 FIG. 2 .
FIG. 2. Example point distribution in 2-D using Leja points.The red points are those in the small point set defined by b = 11.The blue points are those that are added to the b = 11 point set to obtain the B = 15 point set.

FIG. 4 .
FIG.4.Absolute errors of the lowest 50 levels of CH2O.The basis size is 18,564 and the different colour bars correspond to point pruning parameter values in Eq. (9) that are B = 12, 14, 16, 18.The basis pruning parameter in Eq. (6) is b = 12.For many eigenvalues the bars for 134,596 points are so small they are not visible.Leja points are used.
2 O using the quartic Taylor series potential of Ref. 56, CH 2 O using the potential of Ref. 57, CH 2 NH using the potential of Ref. 58 with the interpretation of Ref. 59, and CH 3 CN using the potential of Ref. 60 with the interpretation of Ref. 36.The P 2 O, CH 2 NH, and CH 3 CN potentials are functions of normal coordinates in SOP form.The CH 2 O potential is not a SOP and is a function of internal coordinates.Since we use dimensionless normal coordinates, to evaluate the CH 2 reduce the cost of MVPs and the length of intermediate vectors.KP C B .This is straightforward because we can replace B KP with L KP U KP and insert C B C ∆ C T ∆ C T Bbetween L KP and U KP .
t D z t1,...,t D(28)for a fixed set of t 1 , t 2 , ..., t D−1 values, z D may have more elements than z because the number of possible values of a D is larger than the number of possible values of t D .This is a consequence of the fact that the number of possible values of t D is determined byt 1 + t 2 + • • • + t D ≤ band therefore depends on t 1 , t 2 , . . ., t D−1 whereas a D = 0, 1, . . ., n D = B, independent of t 1 , t 2 , . . ., t D−1 .In Ref. 38 it was shown that elements of all intermediate vectors labelled by multi-indices not in the retained basis can be discarded.For example, in Eq. (28) all elements of z D that do not satisfy t 1 + t 2 + • • • + a D ≤ B can be dropped.Consider now the MVP with the N b ×N b square matrix