“Δ-machine learning” refers to a machine learning approach to bring a property such as a potential energy surface (PES) based on low-level (LL) density functional theory (DFT) energies and gradients close to a coupled cluster (CC) level of accuracy. Here, we present such an approach that uses the permutationally invariant polynomial (PIP) method to fit high-dimensional PESs. The approach is represented by a simple equation, in obvious notation *V*_{LL→CC} = *V*_{LL} + Δ*V*_{CC–LL}, and demonstrated for CH_{4}, H_{3}O^{+}, and *trans* and *cis*-*N*-methyl acetamide (NMA), CH_{3}CONHCH_{3}. For these molecules, the LL PES, *V*_{LL}, is a PIP fit to DFT/B3LYP/6-31+G(d) energies and gradients and Δ*V*_{CC–LL} is a precise PIP fit obtained using a low-order PIP basis set and based on a relatively small number of CCSD(T) energies. For CH_{4}, these are new calculations adopting an aug-cc-pVDZ basis, for H_{3}O^{+}, previous CCSD(T)-F12/aug-cc-pVQZ energies are used, while for NMA, new CCSD(T)-F12/aug-cc-pVDZ calculations are performed. With as few as 200 CCSD(T) energies, the new PESs are in excellent agreement with benchmark CCSD(T) results for the small molecules, and for 12-atom NMA, training is done with 4696 CCSD(T) energies.

## I. INTRODUCTION

Correcting *ab initio*-based potential energy surfaces (PESs) has been a long-standing goal of computational chemistry. Several approaches dating from 30 years ago have been suggested. In one, a correction potential is added to an existing PES, and parameters of the correction potential are optimized by matching ro-vibrational energies to the experiment.^{1–3} This approach relies on being able to calculate exact rovibrational energies to make the comparison with the experiment robust. Thus, it has only been applied to triatomic molecules, and it is limited to these and possibly tetratomics. Another approach is to modify an existing potential using scaling methods that go under the heading of “morphing.”^{4–6} An impressive example is a PES for HCN/HNC reported by Tennyson and co-workers^{7} who morphed a CCSD(T)-based PES.^{8}

More recent approaches using machine learning (ML) aim to bring a PES based on a low level of electronic theory to a higher level. As the field moves to consideration of larger molecules and clusters where high-level methods are prohibitively expensive, the motivation for doing this is obvious. There are two classes of such approaches, one is “Δ-machine learning” (Δ-ML) and the other is “transfer learning.”^{9} Δ-ML, which is of direct relevance to the present paper, seeks to add a correction to a property obtained using an efficient and thus perforce low-level *ab initio* theory.^{10–15} This approach includes an interesting, recent variant based on a “Pople” style composite approach.^{11} In this sense, the approach is related, in spirit at least, to the correction potential approach mentioned above, when the property is the PES. However, it is applicable to much larger molecules.

The transfer learning approach has been developed extensively in the context of neural networks,^{9} and so much of the work in that field has been brought into chemistry.^{12–16} The idea of transfer learning comes from the fact that knowledge gained from solving one problem can often be used to solve another related problem. Therefore, a model learned for one task, e.g., a ML-PES fit to low-level electronic energies/gradients, can be reused as the starting point of the model for a different task, e.g., an ML-PES with the accuracy of a high-level electronic structure theory.

Most work using transfer learning or Δ-ML has been on developing general transferable force fields with application mainly in the area of thermochemistry and molecular dynamics simulations at room temperature and somewhat higher. Käser *et al.* used transfer learning to improve neural network PESs for malonaldehyde, acetoacetaldehyde, and acetylacetone.^{15}

Here, we report a Δ-ML approach for PESs using the permutationally invariant polynomial (PIP) approach. The PIP approach has been applied to many PESs for molecules, including chemical reactions, dating back roughly 15 years. For reviews, see Refs. 17–19. Recent extensions of the PIP software to incorporate electronic gradients^{20,21} have extended the PIP approach to amino acids (glycine)^{22} and molecules with 12 atoms and 15 atoms, e.g., *N*-methyl acetamide,^{21,23,24} tropolone,^{25} and acetylacetone,^{26} respectively. As is widely appreciated in the field, incorporating gradients into fitting requires efficient, low-level electronic structure methods, such as density functional theory or MP2, as these provide analytical gradients.^{27} These levels of theory were used for the PES fits of the three molecules mentioned above.

Our approach is given by the simple equation

where *V*_{LL→CC} is the corrected PES, *V*_{LL} is a PES fit to low-level density functional theory (DFT) electronic data, and Δ*V*_{CC–LL} is the correction PES based on high-level coupled cluster energies. The assumption underlying the hoped-for small number of high-level energies is that the difference Δ*V*_{CC–LL} is not as strongly varying as *V*_{LL} with respect to the nuclear configuration.

We demonstrate the efficacy and high-fidelity of this approach for two small molecules, H_{3}O^{+} and CH_{4}, and for 12-atom *N*-methyl acetamide (NMA). In all cases, *V*_{LL} is a PIP fit to DFT energies and gradients, and Δ*V*_{CC–LL} is a PIP fit to a much smaller database of differences between CCSD(T) and DFT energies.

Unlike H_{3}O^{+} and CH_{4}, for NMA, there is no previous CCSD(T)-based PES, and so the present CCSD(T)-corrected one is, we believe, the most accurate one available.

## II. COMPUTATIONAL DETAILS

In order to develop a corrected PES, we need to generate a dataset of high and low-level energies for training and testing. In this study, we need both DFT and CCSD(T) datasets. Training is done for the correction PES Δ*V*_{CC–LL}, and testing is done for the corrected *V*_{LL→CC}. Do note that this two-step “training and testing” is on different datasets. Our objective is to see the impact of the training dataset size on the fidelity of the corrected PES *V*_{LL→CC} for CH_{4} and H_{3}O^{+}.

For H_{3}O^{+}, CCSD(T) energies are available from our previously reported PES, which is a fit to 32 142 CCSD(T)/aug-cc-pVQZ energies.^{28} From this large dataset, we select 1000 configurations with energies in the range 0–24 000 cm^{−1} for new DFT calculations of energies and gradients. These are done at the efficient B3LYP/6-311+G(d,p) level of theory, using the Molpro quantum chemistry package.^{29} Histograms of the distributions of DFT energies are given in supplementary material. Note that these DFT configurations span the same large range of configurations as the much larger CCSD(T) ones but have less dense sampling.

For CH_{4}, we take the DFT datasets from our recently reported work where the total of 9000 energies and their corresponding gradients were generated from *ab initio* molecular dynamics (AIMD) simulations using the B3LYP/6-31+G(d) level of theory.^{20} In that work, we reported PES fits using a number of subsets of the DFT data, which span the energy 0–15 000 cm^{−1}. Here, we generate a dataset that contains CCSD(T)/aug-cc-pVDZ energies at 3000 configurations, taken from the previous DFT data. A number of training datasets and one test dataset, which are subsets of this 3000 data, are employed to examine the Δ-ML procedure. Histogram plots of the distribution of DFT and new CCSD(T)/aVDZ electronic energies are given in supplementary material.

For NMA, we make use of previous DFT/B3LYP/cc-pVDZ energies and the corresponding PES that spans both the *trans* and *cis* isomers and barriers separating them.^{24} New CCSD(T)-F12/aug-cc-pVDZ calculations are done at a sparse set (5430) of configurations that span the full range of configurations used in the previous work. These are used to obtain the Δ*V*_{CC–LL} PES.

The PIP fits of Δ*V*_{CC–LL} are done using our recent monomial symmeterization software.^{20,30} Some details of the PIP bases are given in Sec. III. We note that they are all small relative to typical PIP bases needed for precise fitting of the full PES for these molecules.

For all molecules, the datasets are partitioned into several training and testing subsets to examine how few data are needed for training to get satisfactory results.

## III. RESULTS

We present root mean square (rms) errors for *V*_{LL→CC} relative to direct CCSD(T) energies for a variety of Δ*V*_{CC–LL} fits. In addition, comparisons are made with direct CCSD(T) results for the geometry and harmonic frequencies of relevant stationary points. To assess the performance of the present approach, these results are placed alongside the corresponding DFT ones.

We begin with the results for H_{3}O^{+}, which offers a test of the current Δ-ML approach to improve the properties of the minimum and saddle-point barrier separating the two minima.

### A. H_{3}O^{+} and CH_{4}

For H_{3}O^{+}, we trained Δ*V*_{CC–LL} on several sets of the difference of CCSD(T) and DFT absolute energies and then tested on the remaining data from the total of 32 142 configurations. In Fig. 1, we plot Δ*V*_{CC–LL} vs the DFT energies, relative to the DFT minimum for two training sets. We reference Δ*V*_{CC–LL} to the minimum of the difference between the CCSD(T) and DFT energies (which is roughly −12 110 cm^{−1}). As seen, the energy range of Δ*V*_{CC–LL} is about 3000 cm^{−1}, which is much smaller than the DFT energy range relative to the minimum value (which is roughly 23 000 cm^{−1}).

The performance of the Δ*V*_{CC–LL} fits is evaluated using the training datasets of 1000, 500, 250, and 125 configurations, and the corresponding test datasets consist of the remaining data from the total of 32 142 configurations. The corresponding rms differences between the *V*_{LL→CC} and CCSD(T) energies are given in Table III in the supplementary material. As seen, the rms errors are similar for all the training datasets. The results for the training set of only 125 energy differences are particularly encouraging, where the rms error is just 32 cm^{−1} for test energies up to 23 000 cm^{−1}. In this case, the PIP basis for Δ*V*_{CC–LL} contains only 51 terms.

A plot of *V*_{LL→CC} vs direct CCSD(T) energies for the training set of 500 points and its corresponding test data are shown in Fig. 2. As seen, there is excellent precision; however, we see some large errors for the test dataset. These come from high energy configurations that are irrelevant in this study. If needed, one can always improve these errors by adding the high energy data points into the training dataset.

An examination of the fidelity of *V*_{LL→CC} for various properties is given in Tables I and II, for the indicated training sets for Δ*V*_{CC–LL}. As seen, *V*_{LL→CC} produces results in excellent agreement with direct CCSD(T) ones and also a large improvement compared to the DFT PES. Most impressive is the high accuracy achieved even with the smallest training dataset of 125 energies.

. | Geometry parameter . | Harmonic frequency . | ||||
---|---|---|---|---|---|---|

N_{Train}
. | δ (O–H)
. | δ (H–H)
. | δ$v$_{1}
. | δ$v$_{2}
. | δ$v$_{3}
. | δ$v$_{4}
. |

1000^{a} | −3.0(−5) | −1.8(−4) | 4.8 | 1.8 | −4.4 | 3.3 |

500^{b} | −5.0(−5) | −4.4(−4) | 6.2 | 4.7 | 0.02 | 3.5 |

250^{c} | −3.0(−5) | −7.8(−4) | 2.6 | 4.8 | 6.3 | 0.02 |

125^{d} | 1.0(−5) | 13.3(−4) | −9.1 | −12.1 | −8.6 | 3.02 |

DFT | −47.8(−4) | −24.1(−3) | 125.9 | 26.5 | 26.5 | 33.7 |

. | Geometry parameter . | Harmonic frequency . | ||||
---|---|---|---|---|---|---|

N_{Train}
. | δ (O–H)
. | δ (H–H)
. | δ$v$_{1}
. | δ$v$_{2}
. | δ$v$_{3}
. | δ$v$_{4}
. |

1000^{a} | −3.0(−5) | −1.8(−4) | 4.8 | 1.8 | −4.4 | 3.3 |

500^{b} | −5.0(−5) | −4.4(−4) | 6.2 | 4.7 | 0.02 | 3.5 |

250^{c} | −3.0(−5) | −7.8(−4) | 2.6 | 4.8 | 6.3 | 0.02 |

125^{d} | 1.0(−5) | 13.3(−4) | −9.1 | −12.1 | −8.6 | 3.02 |

DFT | −47.8(−4) | −24.1(−3) | 125.9 | 26.5 | 26.5 | 33.7 |

^{a}

Maximum polynomial order of 7 and basis size of 348.

^{b}

Maximum polynomial order of 6 and basis size of 196.

^{c}

Maximum polynomial order of 5 and basis size of 103.

^{d}

Maximum polynomial order of 4 and basis size of 51.

. | Geometry parameter . | Harmonic frequency . | |||||
---|---|---|---|---|---|---|---|

N_{Train}
. | δ (O–H)
. | δ (H–H)
. | δ$v$_{1}
. | δ$v$_{2}
. | δ$v$_{3}
. | δ$v$_{4}
. | δ (barrier)
. |

1000^{a} | −5.0(−5) | −9.0(−5) | −3.1i | 3.3 | −6.1 | 1.3 | 2 |

500^{b} | −1.0(−5) | −2.0(−5) | −2.6i | 2.0 | −2.2 | −0.7 | 10 |

250^{c} | −2.2(−4) | −3.8(−4) | −1.2i | 1.2 | 7.7 | −4.3 | 7 |

125^{d} | −1.0(−5) | −1.0(−5) | −0.7i | −3.7 | −3.0 | −4.8 | −9 |

DFT | −70.6(−4) | −12.2(−3) | 111.3i | 17.6 | 45.5 | 58.7 | 297 |

. | Geometry parameter . | Harmonic frequency . | |||||
---|---|---|---|---|---|---|---|

N_{Train}
. | δ (O–H)
. | δ (H–H)
. | δ$v$_{1}
. | δ$v$_{2}
. | δ$v$_{3}
. | δ$v$_{4}
. | δ (barrier)
. |

1000^{a} | −5.0(−5) | −9.0(−5) | −3.1i | 3.3 | −6.1 | 1.3 | 2 |

500^{b} | −1.0(−5) | −2.0(−5) | −2.6i | 2.0 | −2.2 | −0.7 | 10 |

250^{c} | −2.2(−4) | −3.8(−4) | −1.2i | 1.2 | 7.7 | −4.3 | 7 |

125^{d} | −1.0(−5) | −1.0(−5) | −0.7i | −3.7 | −3.0 | −4.8 | −9 |

DFT | −70.6(−4) | −12.2(−3) | 111.3i | 17.6 | 45.5 | 58.7 | 297 |

^{a}

Maximum polynomial order of 7 and basis size of 348.

^{b}

Maximum polynomial order of 6 and basis size of 196.

^{c}

Maximum polynomial order of 5 and basis size of 103.

^{d}

Maximum polynomial order of 4 and basis size of 51.

Detailed results analogous to those shown for H_{3}O^{+} above are given for CH_{4} in the supplementary material. We note here simply that using just 100 CCSD(T)/aVDZ energies for the corrected CH_{4} PES closes the difference between the DFT PES and direct CCSD(T) results dramatically for both the geometry of the minimum and the harmonic frequencies. For example, the rms deviation for the harmonic frequencies with respect to the CCSD(T) values is reduced from 31 cm^{−1} in the DFT PES to about 1 cm^{−1} for the corrected PES.

Next, we present results for the more challenging 12-atom *N*-methyl acetamide PES.

### B. *N*-methyl acetamide, CH_{3}CONHCH_{3}

We recently reported DFT-based PESs for 12-atom *N*-methyl acetamide (NMA) using full and fragmented PIP basis sets.^{23,24} The idea of using a fragmented basis to extend the PIP approach to molecules with more than 10 atoms was illustrated for NMA. The dataset for the more recent PES, which describes the *cis* and *trans* minima as well as saddle points separating them, consisted of energies and gradients. The full basis of maximum polynomial order of 3 has 8040 linear coefficients. The fragmented PIP basis, also with a maximum polynomial order of 3, contains 6121 coefficients.

The fits were done using 6607 energies and corresponding 237 852 gradient components for a total data size of 244 459. These data were obtained from direct dynamics using the B3LYP/cc-pVDZ level. Clearly, a data set of this size from CCSD(T) calculations is not feasible, and so, the present approach is needed in order to bring this DFT-based PES close to CCSD(T) quality.

For the training and testing, we calculated a total of 5430 CCSD(T)-F12/aug-cc-pVDZ energies. Training of Δ*V*_{CC–LL} was done on 4696 data points of the difference of direct CCSD(T) and DFT-PES absolute energies. Testing of *V*_{LL→CC} was done on 734 energies. The distribution of the electronic energies (shown in supplementary material) for both the training and test data sets spans the large range of configurations used for the DFT-based PES, i.e., *trans* and *cis* isomers and their isomerization TSs.

In Fig. 3, we show the range of Δ*V*_{CC–LL} vs the DFT energies, relative to the DFT minimum for the training and test datasets. We reference Δ*V*_{CC–LL} to the minimum of the difference of the CCSD(T) and DFT energies (which is roughly −50 580 cm^{−1}). As seen, the energy range of Δ*V*_{CC–LL} is about 4500 cm^{−1}, which is much smaller than the DFT energy range relative to the minimum value (which is roughly 50 000 cm^{−1}). The PIP basis to fit the Δ*V*_{CC–LL} is generated using MSA software with the same reduced permutational symmetry of 31 111 113 (this describes the identity of the hydrogen atoms within a methyl group, which is essential to get the three fold torsional barrier) used previously and a maximum polynomial order of 2. This leads to 569 linear coefficients (PIP basis). The fitting rms error of this Δ*V*_{CC–LL} is 57 cm^{−1}. A plot of *V*_{LL→CC} vs direct CCSD(T) energies for the training and test data is shown in Fig. 4. The rms differences between the *V*_{LL→CC} and direct CCSD(T) energies for the training and test datasets are 57 cm^{−1} and 147 cm^{−1}, respectively. A slight increment in the test rms error is comparable with the DFT PES rms error of 126 cm^{−1}.

We perform geometry optimization and normal mode analyses for both *trans* and *cis* isomers using this Δ-ML PES, and we get significant improvement from the DFT PES, which predicts an incorrect minimum for the *trans*-isomer. Specifically, the torsion angle of one methyl rotor is shifted by 60° relative to the CCSD(T) structure. These differences in the structure are shown in the supplementary material, while more discussion of the torsional barriers is given below.

The *cis*–*trans* energy difference on the corrected PES is 782 cm^{−1}, which is 41 cm^{−1} below the direct CCSD(T) one. The rms errors of harmonic frequencies between the direct CCSD(T) one and the Δ-ML one are 15 cm^{−1} and 13 cm^{−1}, respectively, for *trans* and *cis* isomers, whereas these are 26 cm^{−1} and 17 cm^{−1} for the DFT PES (the complete list of harmonic frequencies for *trans*- and *cis*-NMA with the corresponding *ab initio* ones are given in Tables IV and V of the supplementary material). The geometry differences are comparably small for the *cis*-isomer but large for the DFT PES for the *trans*-isomer, owing mainly to the error in the methyl rotor minimum on the DFT PES, noted already.

Detailed comparisons of the partially relaxed torsional barriers are given in Table III. As seen, there are large differences between the DFT PES and CCSD(T) results for the CH_{3}–NH rotors for both *cis* and *trans* isomers. Overall, the Δ-ML PES barriers are significantly closer to the CCSD(T) ones than the DFT-PES ones.

trans-NMA
. | CH_{3}–NH
. | CH_{3}–CO
. |
---|---|---|

cis-NMA
. | CH_{3}–NH
. | CH_{3}–CO
. |

DFT PES | 256 | 37 |

Δ-ML PES | 34 | 74 |

CCSD(T) | 42 | 103 |

DFT PES | 61 | 361 |

Δ-ML PES | 153 | 366 |

CCSD(T) | 148 | 303 |

trans-NMA
. | CH_{3}–NH
. | CH_{3}–CO
. |
---|---|---|

cis-NMA
. | CH_{3}–NH
. | CH_{3}–CO
. |

DFT PES | 256 | 37 |

Δ-ML PES | 34 | 74 |

CCSD(T) | 42 | 103 |

DFT PES | 61 | 361 |

Δ-ML PES | 153 | 366 |

CCSD(T) | 148 | 303 |

Given the error in these DFT PES barriers, a detailed examination of the torsional potentials is warranted. These are shown in Fig. 5. These appear as would be expected, with the exception of panel (a), where the Δ-ML potential has a small dip at 60°, instead of a barrier there. The barrier of 34 cm^{−1} given in Table III is thus at slightly the wrong location. The source of this offset is the large error in the DFT PES, which has a minimum 60° in error compared to the benchmark CCSD(T) result. The small artifact in the Δ-ML torsional potential is of minor consequence, given that the CCSD(T) barrier is only 42 cm^{−1}.

To the best of our knowledge, there is no experimental determination of these torsional barriers for either isomer of NMA. However, there is a report of the torsional barrier for acetamide of 24 cm^{−1}.^{31} This barrier is consistent with the small barriers of 34 cm^{−1} (Δ-ML PES) and 74 cm^{−1} (Δ-ML PES) for *trans*-NMA. It also appears that the larger barriers for *cis*-NMA may be due to the closer proximity of these methyl rotors.

Next, we make some comments about computation times on our cluster with Intel Xeon 2.40 GHz processors. First, to calculate the 5430 CCSD(T) energies required about 900 central processing unit (CPU)-hours. (This was done using multiple nodes.) The time for 100 000 calculations of the corrected PES, *V*_{LL→CC}, is the sum of 2.056 s for the DFT PES, *V*_{LL}, plus 0.126 s for the Δ*V*_{CC–LL} PES. Thus, the Δ*V*_{CC–LL} PES takes only 6% of the total CPU time.

To conclude this section, we note that preliminary work indicates that using about half the number of CCSD(T) energies, i.e., 2200 energies, produces a Δ*V*_{CC–LL} PES that is close to the quality of the one reported here. We plan to report the details of this along with even smaller datasets later.

## IV. SUMMARY AND CONCLUSIONS

We reported an efficient and easy-to-implement correction to a low-level DFT PES based on a low-order PIP fit to the difference in a sparse set of high-level CCSD(T) and DFT energies. The correction was shown to produce a final PES with properties that are close to the corresponding CCSD(T) benchmark values for CH_{4} and H_{3}O^{+}. Similar results were shown for *N*-methyl acetamide, and this demonstrates that the approach should be widely applicable to large molecules. We plan to do this in the future for acetylacetone and tropolone for which low-level PESs have recently been reported.^{21,25,26} However, it would be difficult to present the rigorous tests against high-level coupled cluster results for, say, harmonic frequencies as these require a very large computational effort.

Finally, we note that the low-level PES can be based on any fitting method as can the correction PES. However, both should be consistent with respect to the same level of permutational invariance. We believe that the PIP approach has advantages for the correction PES. One is that the fit is permutationally invariant, and another, perhaps more significant one, is that a low-order PIP fit can be both precise and efficient to evaluate.

## SUPPLEMENTARY MATERIAL

The supplementary material contains details of training and testing for CH_{4}, H_{3}O^{+}, and *N*-methyl acetamide as well as harmonic frequencies.

## ACKNOWLEDGMENTS

J.M.B. thanks NASA (Grant No. 80NSSC20K0360) for financial support.

## DATA AVAILABILITY

The data that support the findings of this study are available from the corresponding author upon reasonable request. The new Δ-ML PES for NMA is provided in the supplementary material.

## REFERENCES

*ab initio*programs, 2015; see http://www.molpro.net.