The reaction coordinate (RC) is the principal collective variable or feature that determines the progress along an activated or reactive process. In a molecular simulation using enhanced sampling, a good description of the RC is crucial for generating sufficient statistics. Moreover, the RC provides invaluable atomistic insight into the process under study. The optimal RC is the committor, which represents the probability that a system evolves toward a given state, based on the coordinates of all its particles. As the interpretability of such a high-dimensional function is low, a more practical approach is to describe the RC by some low-dimensional molecular collective variables or order parameters. While several methods can perform this dimensionality reduction, they usually require a preselection of these low-dimensional collective variables (CVs). Here, we propose to automate this dimensionality reduction using an extended autoencoder, which maps the input (many CVs) onto a lower-dimensional latent space. This latent space is subsequently used for the reconstruction of the input as well as the prediction of the committor function. As a consequence, the latent space is optimized for both reconstruction and committor prediction and is likely to yield the best non-linear low-dimensional representation of the committor. We test our extended autoencoder model on simple but nontrivial toy systems, as well as on extensive molecular simulation data of methane hydrate nucleation. The extended autoencoder model can effectively extract the underlying mechanism of a reaction, make reliable predictions about the committor of a given configuration, and potentially even generate new paths representative of a reaction.
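
The architecture described above, one encoder feeding both a reconstruction head and a committor head, can be illustrated with a minimal sketch. Everything specific in it is an assumption made for illustration and is not taken from the text: the single-layer networks, the tanh and sigmoid activations, the mean-squared-error form of both loss terms, the `committor_weight` factor, and all function and variable names. It only shows how one latent vector feeds two outputs and how the two errors could be combined into a single training objective.

```python
import math
import random


def dense(x, weights, biases, activation=None):
    """One fully connected layer: activation(W x + b)."""
    out = [sum(w * xi for w, xi in zip(row, x)) + b
           for row, b in zip(weights, biases)]
    return [activation(v) for v in out] if activation else out


def sigmoid(v):
    return 1.0 / (1.0 + math.exp(-v))


def init_layer(n_in, n_out, rng):
    """Small random weights and zero biases (hypothetical initialisation)."""
    weights = [[rng.uniform(-0.5, 0.5) for _ in range(n_in)]
               for _ in range(n_out)]
    return weights, [0.0] * n_out


def forward(cvs, params):
    """Encode the CVs, then decode and predict the committor from the same latent vector."""
    latent = dense(cvs, *params["encoder"], activation=math.tanh)
    reconstruction = dense(latent, *params["decoder"])
    committor = dense(latent, *params["committor"], activation=sigmoid)[0]
    return latent, reconstruction, committor


def combined_loss(cvs, p_b, params, committor_weight=1.0):
    """Reconstruction error plus weighted committor error (assumed MSE form)."""
    _, reconstruction, predicted = forward(cvs, params)
    recon_loss = sum((r - c) ** 2 for r, c in zip(reconstruction, cvs)) / len(cvs)
    committor_loss = (predicted - p_b) ** 2
    return recon_loss + committor_weight * committor_loss


rng = random.Random(0)
n_cvs, n_latent = 6, 2  # many input CVs, low-dimensional latent space
params = {
    "encoder": init_layer(n_cvs, n_latent, rng),
    "decoder": init_layer(n_latent, n_cvs, rng),
    "committor": init_layer(n_latent, 1, rng),
}

cvs = [rng.uniform(-1.0, 1.0) for _ in range(n_cvs)]  # one made-up configuration
latent, reconstruction, p_pred = forward(cvs, params)
loss = combined_loss(cvs, p_b=0.5, params=params)
```

Because both terms depend on the same latent vector, minimising their sum with any gradient-based optimiser pushes the latent space to serve reconstruction and committor prediction at once. In this sketch, setting `committor_weight` to zero recovers a plain autoencoder objective.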

