A challenge of atomistic machine-learning (ML) methods is ensuring that the training data are suitable for the system being simulated, which is particularly difficult for systems with large numbers of atoms. Most atomistic ML approaches rely on the nearsightedness principle (“all chemistry is local”), using information about the positions of an atom’s neighbors to predict a per-atom energy. In this work, we develop a framework that exploits the nearsighted nature of ML models to systematically produce an appropriate training set for large structures. We use a per-atom uncertainty estimate to identify the most uncertain atoms and extract chunks centered around these atoms. It is crucial that these small chunks are large enough both to satisfy the ML model’s nearsightedness principle (that is, to fill the cutoff radius) and to be converged with respect to the electronic structure calculation. We present data indicating when the electronic structure calculations are converged with respect to the structure size, which fundamentally limits the accuracy of any nearsighted ML calculator. These new atomic chunks are calculated with electronic structure methods, and crucially, only a single force, that of the central atom, is added to the growing training set, preventing the noisy and irrelevant information from the chunk’s boundary from interfering with ML training. The resulting ML potentials are robust, despite requiring single-point calculations on only small reference structures and never seeing large training structures. We demonstrate our approach via structure optimization of a 260-atom structure and extend the approach to clusters with up to 1415 atoms.
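The chunk-extraction step described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the function names, the top-n selection rule, the spherical cut with a single `radius` parameter, and the `reference_forces` callable (a stand-in for an electronic structure calculation) are all assumptions made for illustration, and periodic boundaries are ignored.

```python
import math


def most_uncertain(uncertainties, n_chunks):
    """Indices of the n_chunks atoms with the largest per-atom uncertainty."""
    order = sorted(range(len(uncertainties)),
                   key=lambda i: uncertainties[i], reverse=True)
    return order[:n_chunks]


def extract_chunk(positions, center_index, radius):
    """Indices of atoms within `radius` of the central atom.

    The central atom is listed first so that its force is easy to pick out.
    Assumption: a plain spherical cut with no periodic images.
    """
    center = positions[center_index]
    neighbors = [i for i, p in enumerate(positions)
                 if i != center_index and math.dist(p, center) <= radius]
    return [center_index] + neighbors


def build_training_entries(positions, uncertainties, radius, n_chunks,
                           reference_forces):
    """One training entry per selected atom, keeping only the central force.

    `reference_forces` is a hypothetical callable: it takes the chunk
    positions and returns one force vector per atom in the chunk.
    """
    entries = []
    for center_index in most_uncertain(uncertainties, n_chunks):
        chunk = extract_chunk(positions, center_index, radius)
        chunk_positions = [positions[i] for i in chunk]
        forces = reference_forces(chunk_positions)
        entries.append({
            "center": center_index,
            "positions": chunk_positions,
            "central_force": forces[0],  # forces on boundary atoms are discarded
        })
    return entries


if __name__ == "__main__":
    # Toy example: five atoms on a line, 1 unit apart.
    positions = [(float(x), 0.0, 0.0) for x in range(5)]
    uncertainties = [0.1, 0.5, 0.2, 0.9, 0.3]
    # Dummy reference calculation returning zero forces.
    zero_forces = lambda pos: [(0.0, 0.0, 0.0) for _ in pos]
    for entry in build_training_entries(positions, uncertainties, 1.5, 2, zero_forces):
        print(entry["center"], len(entry["positions"]), entry["central_force"])
```

In practice `radius` would need to be at least the ML model's cutoff radius and large enough for the electronic structure calculation to be converged with respect to chunk size, as the abstract stresses; the value used in the toy example is arbitrary.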
14 February 2022
Research Article | February 09 2022
A nearsighted force-training approach to systematically generate training data for the machine learning of large atomic structures
Special Collection: Chemical Design by Artificial Intelligence
Cheng Zeng
School of Engineering, Brown University, Providence, Rhode Island 02912, USA
Xi Chen
School of Engineering, Brown University, Providence, Rhode Island 02912, USA
Andrew A. Peterson a)
School of Engineering, Brown University, Providence, Rhode Island 02912, USA
a)Author to whom correspondence should be addressed: [email protected]
Note: This paper is part of the JCP Special Topic on Chemical Design by Artificial Intelligence.
J. Chem. Phys. 156, 064104 (2022)
Article history
Received: November 19 2021
Accepted: January 26 2022
Citation
Cheng Zeng, Xi Chen, Andrew A. Peterson; A nearsighted force-training approach to systematically generate training data for the machine learning of large atomic structures. J. Chem. Phys. 14 February 2022; 156 (6): 064104. https://doi.org/10.1063/5.0079314