We describe a method for the post-hoc interpretation of a neural network (NN) trained on the global and local minima of neutral water clusters. We use the structures reported in a recently published database containing over 5 × 10^6 unique water cluster networks (H2O)N of size N = 3–30. The structural properties were first characterized using chemical descriptors derived from graph theory, identifying important trends in topology, connectivity, and polygon structure of the networks associated with the various minima. The code to generate the molecular graphs and compute the descriptors is available at https://github.com/exalearn/molecular-graph-descriptors, and the graphs are available alongside the original database at https://sites.uw.edu/wdbase/. A Continuous-Filter Convolutional Neural Network (CF-CNN) was trained on a subset of 500 000 networks to predict the potential energy, yielding a mean absolute error of 0.002 ± 0.002 kcal/mol per water molecule. Clusters of sizes not included in the training set exhibited errors of the same magnitude, indicating that the CF-CNN protocol accurately predicts energies of networks both smaller and larger than those used during training. The graph-theoretical descriptors were further employed to interpret the predictive power of the CF-CNN. Topological measures, such as the Wiener index, the average shortest path length, and the similarity index, suggested that all networks from the test set were within the same range of values as those from the training set. The graph analysis suggests that larger errors appear when the mean degree and the number of polygons in the cluster lie farther from the mean of the training set. This indicates that the structural space, and not just the chemical space, is an important factor to consider when designing training sets, as predictive errors can result when the structural composition is sufficiently different from the bulk of those in the training set. The developed descriptors are therefore quite effective in explaining the results of the CF-CNN (a.k.a. the “black box”) model.
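As a rough illustration of the descriptors named in the abstract, the following minimal sketch computes the Wiener index, the average shortest path length, and the mean degree of a small graph using only the Python standard library. It assumes each water molecule is a node and each hydrogen bond an undirected edge, which is a simplification made here for illustration; the function names and the two example graphs are hypothetical, and the repository linked above remains the authoritative implementation. The cyclomatic number (the number of independent cycles) is used only as a crude stand-in for the polygon counts discussed in the abstract and is not claimed to be the measure the authors use.

```python
from collections import deque


def bfs_distances(adj, source):
    """Return hop distances from `source` to every reachable node."""
    dist = {source: 0}
    queue = deque([source])
    while queue:
        u = queue.popleft()
        for v in adj[u]:
            if v not in dist:
                dist[v] = dist[u] + 1
                queue.append(v)
    return dist


def graph_descriptors(edges):
    """Compute simple topological descriptors of a connected undirected graph.

    `edges` is an iterable of (u, v) pairs; here each node is assumed to be
    one water molecule and each edge one hydrogen bond (an assumption).
    """
    adj = {}
    for u, v in edges:
        adj.setdefault(u, set()).add(v)
        adj.setdefault(v, set()).add(u)

    n_nodes = len(adj)
    if n_nodes < 2:
        raise ValueError("need at least two nodes")
    n_edges = sum(len(nbrs) for nbrs in adj.values()) // 2

    # Wiener index: sum of shortest-path lengths over all unordered node pairs.
    total = 0
    for u in adj:
        dist = bfs_distances(adj, u)
        if len(dist) != n_nodes:
            raise ValueError("graph must be connected")
        total += sum(dist.values())
    wiener = total // 2  # every pair was counted twice

    n_pairs = n_nodes * (n_nodes - 1) // 2
    return {
        "wiener_index": wiener,
        "avg_shortest_path": wiener / n_pairs,
        "mean_degree": 2 * n_edges / n_nodes,
        # Independent cycles of a connected graph: E - V + 1.
        "cyclomatic_number": n_edges - n_nodes + 1,
    }


# Two hypothetical six-node topologies: a single ring and a prism
# (two triangles joined by three edges).
ring = [(0, 1), (1, 2), (2, 3), (3, 4), (4, 5), (5, 0)]
prism = [(0, 1), (1, 2), (2, 0), (3, 4), (4, 5), (5, 3),
         (0, 3), (1, 4), (2, 5)]

ring_desc = graph_descriptors(ring)
prism_desc = graph_descriptors(prism)
print(ring_desc)
print(prism_desc)
```

The two graphs have the same number of nodes but different mean degree and cycle counts, which is the kind of structural difference the abstract associates with larger prediction errors when a cluster lies far from the bulk of the training set.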
14 July 2020
Research Article
July 13 2020
A look inside the black box: Using graph-theoretical descriptors to interpret a Continuous-Filter Convolutional Neural Network (CF-CNN) trained on the global and local minimum energy structures of neutral water clusters
Special Collection:
Machine Learning Meets Chemical Physics
Jenna A. Bilbrey (1); Joseph P. Heindel (2); Malachi Schram (3); Pradipta Bandyopadhyay (4); Sotiris S. Xantheas (2, 3, a); Sutanay Choudhury (3, b)
1 Computing and Analytics Division, Pacific Northwest National Laboratory, 902 Battelle Boulevard, P.O. Box 999, Richland, Washington 99352, USA
2 Department of Chemistry, University of Washington, Seattle, Washington 98195, USA
3 Advanced Computing, Mathematics and Data Division, Pacific Northwest National Laboratory, 902 Battelle Boulevard, P.O. Box 999, Richland, Washington 99352, USA
4 School of Computational and Integrative Sciences, Jawaharlal Nehru University, New Delhi 110067, India
a) Author to whom correspondence should be addressed: sotiris.xantheas@pnnl.gov. Tel.: +1-509-375-3684
b) Electronic mail: sutanay.choudhury@pnnl.gov
Note: This paper is part of the JCP Special Topic on Machine Learning Meets Chemical Physics.
J. Chem. Phys. 153, 024302 (2020)
Article history
Received: April 07 2020
Accepted: June 24 2020
Citation
Jenna A. Bilbrey, Joseph P. Heindel, Malachi Schram, Pradipta Bandyopadhyay, Sotiris S. Xantheas, Sutanay Choudhury; A look inside the black box: Using graph-theoretical descriptors to interpret a Continuous-Filter Convolutional Neural Network (CF-CNN) trained on the global and local minimum energy structures of neutral water clusters. J. Chem. Phys. 14 July 2020; 153 (2): 024302. https://doi.org/10.1063/5.0009933