Identifying a reduced set of collective variables is critical for understanding atomistic simulations and accelerating them through enhanced sampling techniques. Recently, several methods have been proposed to learn these variables directly from atomistic data. Depending on the type of data available, the learning process can be framed as dimensionality reduction, classification of metastable states, or identification of slow modes. Here, we present mlcolvar, a Python library that simplifies the construction of these variables and their use in the context of enhanced sampling through a contributed interface to the PLUMED software. The library is organized modularly to facilitate the extension and cross-fertilization of these methodologies. In this spirit, we developed a general multi-task learning framework in which multiple objective functions and data from different simulations can be combined to improve the collective variables. The library’s versatility is demonstrated through simple examples that are prototypical of realistic scenarios.
