Understanding dynamics in complex systems is challenging because there are many degrees of freedom, and those that are most important for describing events of interest are often not obvious. The leading eigenfunctions of the transition operator are useful for visualization, and they can provide an efficient basis for computing statistics, such as the likelihood and average time of events (predictions). Here, we develop inexact iterative linear algebra methods for computing these eigenfunctions (spectral estimation) and making predictions from a dataset of short trajectories sampled at finite intervals. We demonstrate the methods on a low-dimensional model that facilitates visualization and a high-dimensional model of a biomolecular system. Implications for the prediction problem in reinforcement learning are discussed.
