Molecular dynamics simulations are an invaluable tool to characterize the dynamic motions of proteins in atomistic detail. However, the accuracy of models derived from simulations inevitably relies on the quality of the underlying force field. Here, we present an evaluation of current non-polarizable and polarizable force fields (AMBER ff14SB, CHARMM 36m, GROMOS 54A7, and Drude 2013) based on the long-standing biophysical challenge of protein folding. We quantify the thermodynamics and kinetics of the β-hairpin formation using Markov state models of the fast-folding mini-protein CLN025. Furthermore, we study the (partial) folding dynamics of two more complex systems, a villin headpiece variant and a WW domain. Surprisingly, the polarizable force field in our set, Drude 2013, consistently leads to destabilization of the native state, regardless of the secondary structure element present. All non-polarizable force fields, on the other hand, stably characterize the native state ensembles in most cases even when starting from a partially unfolded conformation. Focusing on CLN025, we find that the conformational space captured with AMBER ff14SB and CHARMM 36m is comparable, but the ensembles from CHARMM 36m simulations are clearly shifted toward disordered conformations. While the AMBER ff14SB ensemble overstabilizes the native fold, CHARMM 36m and GROMOS 54A7 ensembles both agree remarkably well with experimental state populations. In addition, GROMOS 54A7 also reproduces experimental folding times most accurately. Our results further indicate an over-stabilization of helical structures with AMBER ff14SB. Nevertheless, the presented investigations strongly imply that reliable (un)folding dynamics of small proteins can be captured in feasible computational time with current additive force fields.

The physiological function and physico-chemical properties of biomolecules are inherently linked to their conformational ensembles.1–3 Computational techniques, ranging from structure refinement tools4–6 to simulations of protein motions,7–9 are indispensable for the study of thermodynamics, kinetics, and mechanisms encoded within conformational ensembles.10 In particular, in the field of biomolecular simulations, they have repeatedly proven to be of great value for understanding and predicting macromolecular quantities based on fundamental interaction potentials.11,12

The study of how proteins fold into their functional three-dimensional structure, i.e., their native state, is a prominent example in this respect. Based on the idea that the native state is thermodynamically stable at physiological conditions13 and on the fact that a random search of the conformational space is incompatible with the experimentally observed folding times,14 the concept of folding funnels emerged.15,16 Following this concept, the native state represents the most favorable minimum in a protein’s free energy landscape (FEL). Unfolded conformational states are characterized by higher relative free energies and conformational entropies compared to the native state minimum. Hence, the FEL is said to have the shape of a funnel. It has to be borne in mind that the folding funnel theory represents the system in a simplified way.17,18 The native state, for example, is not one static structure but characterized by constant conformational rearrangements.19 However, these motions are small and fast compared to transitions to the unfolded conformational ensemble. Hence, when we refer to the native state, we really discuss the native state ensemble.

Numerical studies are long established in the exploration of protein folding and unfolding processes,20–24 but the quality of the results hinges on sufficiently accurate descriptions of both the native and unfolded states. This is tied to another central problem in structural biology, i.e., the prediction of a protein’s native state from its sequence.25 The ungraspable complexity of a protein’s FEL renders de novo structure prediction a complex task, but continuous efforts have led to significant methodological advancements.26,27

While predictions based on sequence homology and rotameric libraries are still the go-to approach, folding predictions based on molecular simulations become ever more relevant. Two important reasons for this are the significant progress in the theoretical (force field based) description of large biomolecular systems and the steady increase in computational power. The majority of current force fields used in biomolecular simulations approximate internal interactions as harmonic or periodic functions, while non-bonded interactions are based on pair-wise additive potentials.28,29 Since the electrostatic interactions are typically described by point charges located at the atom centers, polarization effects are inherently neglected.

In an effort to improve the description of non-bonded interactions, in particular electrostatics, polarizable force fields have been developed.30–33 It was shown that the introduction of polarizabilities can alter the relative stability of secondary structure elements34,35 and their folding and unfolding behavior.36,37

In this study, we provide a survey of protein folding and unfolding dynamics in three non-polarizable force fields, i.e., AMBER ff14SB,38 CHARMM 36m,39 and GROMOS 54A7,40,41 and the polarizable Drude 2013 force field.42 All four force fields were developed for MD simulations of proteins but differ in their parameterization strategies.43–45 Most notably, Drude 2013 accounts for the induced electronic polarization inherent to conformational rearrangements through classical Drude oscillators.42 While the implementation of Drude 2013 is among the most efficient polarizable force fields in terms of calculation time, it is still approximately four times slower than non-polarizable additive force fields.46 We note that very recently, an update in the form of the Drude 2019 force field was presented,47 which is not considered in the present study.

AMBER ff14SB has been reported to surpass the preceding AMBER force fields as it improves helical stability and generally provides highly reliable backbone dynamics of folded proteins.38 Similarly, one of the main objectives in the development of GROMOS 54A7 was the correction of a previously observed slight destabilization of α-helical structures.41,48 CHARMM 36m, on the other hand, was specifically tuned to describe both the folded and disordered conformational ensembles, with a particular focus on intrinsically disordered proteins (IDPs).39 As it became clear that IDPs play a central role in allosteric cellular signaling associated with tumor growth, inflammation, and cardiovascular disease, the need to explore their characteristically shallow free energy landscapes increased.49–51 

An accurate description of disordered conformational states is also pivotal when studying partial unfolding in well-structured globular proteins, which is often central for their function and biophysical properties.52–54 However, the detailed structural characterization of disordered conformational ensembles still remains a major challenge50 despite substantial experimental and computational efforts in this direction.55–60 

Another aspect that has to be considered is the role of the water model used as several recent studies report a significant influence of the water model on the behavior of biomolecules.61–69 Within this study, the aim is to compare different force fields in combination with their respective standard water models, i.e., TIP3P70 for AMBER ff14SB, CHARMM-modified TIP3P (TIPS3P)71 for CHARMM 36m, SWM4-NDP72 for Drude 2013, and SPC73 for GROMOS 54A7. Consequently, we can neither compare effects stemming exclusively from the different force fields nor compare effects stemming exclusively from the different water models. For brevity, we only name the different force fields in the following, but we stress that with this, we actually refer to the respective force field–water combination.

To understand the fundamental molecular driving forces governing folding dynamics, typically, fast-folding proteins have been studied as model systems.79,80 For this survey, we selected three such systems to evaluate differences in the ensembles captured with the four force fields described above. The proteins in question are variants of chignolin (CLN025),75 villin,77 and a WW domain.78 More detailed information on these systems is provided in Table I.

TABLE I.

Collection of data of the three systems studied. The information regarding the details of the simulation setup includes the number of residues, the number of ions, and the protonation state of histidines. Ions are used to ensure a neutral total charge of the system in the CHARMM 36m, Drude 2013, and GROMOS 54A7 simulations. For the AMBER ff14SB simulations, we used a uniform background charge. The protonation states of each system were chosen to match the DESRES reference.74 That is, Asp and Glu residues were always deprotonated, and Lys and Arg residues were always protonated. For villin, His contains protons on both Nδ1 and Nϵ1 (HSP), and for the WW domain, His contains one proton on Nδ1 (HSD). For each system, the same protonation was used for the native and (partially) unfolded structures. In addition, we provide information on the used PDB structure, including the experimental conditions used, and the experimentally determined melting temperatures.

Simulation setupPDB detailsExperimental melting
SystemNo. of residuesHis-typeIonsIDT(K)pHReferencesTm(K)References
CLN025 10 … 2 Na+ 5AWL 283 5.0 75  344 76  
Villin 35 HSP 1 Cl 2F4K 274 9.0 77  360 77  
WW domain 38 HSD 1 Na+ 2F21 278 7.5 78  350 78  
Simulation setupPDB detailsExperimental melting
SystemNo. of residuesHis-typeIonsIDT(K)pHReferencesTm(K)References
CLN025 10 … 2 Na+ 5AWL 283 5.0 75  344 76  
Villin 35 HSP 1 Cl 2F4K 274 9.0 77  360 77  
WW domain 38 HSD 1 Na+ 2F21 278 7.5 78  350 78  

Our main focus lies on the de novo mini-protein CLN025.75 Its structure and dynamics have been thoroughly characterized with experiment and long-timescale simulations.81–83 With only ten residues, CLN025 is the smallest model system in our set, which also shows the fastest folding time (<1 µs).74 With this study, we focus mainly on force field based differences in the thermodynamics and kinetics of the CLN025 ensemble. For an overview on the associated folding mechanisms, we refer the reader to a recent work from McKiernan.83 

In addition, we consider a villin headpiece variant and a WW domain, which represent standard systems for fast-folding α-helical and β-sheet proteins, respectively. The villin headpiece subdomain consists of three α-helices surrounding a compact hydrophobic core (see Fig. 1). As a centerpiece of protein folding studies, villin headpiece variants were studied extensively in experiments77,84–91 and numerical studies.22,74,87,92–96 Folding times obtained in experiments range from 0.7 µs to 4 µs77,84,85 for villin. The variant considered in the present work is a norleucine/norleucine double mutant containing 35 residues (see Table I),77 for which folding times of 0.7 µs (experiment)77 and 2.8 µs (simulation)74 were reported.

FIG. 1.

Starting structures for the three systems studied, i.e., CLN025, villin (VIL), and the WW domain (WW). For the folded structure (F), the PDB ID is given, and for the unfolded structures (U1 and U2), the Cα-RMSD with respect to the F structure is given. For villin and the WW domain, we furthermore indicate the numbering of relevant structural features.

FIG. 1.

Starting structures for the three systems studied, i.e., CLN025, villin (VIL), and the WW domain (WW). For the folded structure (F), the PDB ID is given, and for the unfolded structures (U1 and U2), the Cα-RMSD with respect to the F structure is given. For villin and the WW domain, we furthermore indicate the numbering of relevant structural features.

Close modal

WW domains are relatively small proteins forming anti-parallel β-sheets and contain two eponymous tryptophan residues (see Fig. 1). Multiple variants were studied in experiments exhibiting folding times between 4 µs and 800 µs.78,97–102 This range of folding times spanning several orders of magnitude was achieved through mutations in the two loops and the replacement of the second tryptophan residue (at the end of the third β-strand) with phenylalanine.78,98,99,101 Early attempts to study WW domain folding in computer simulations failed,103 which is likely related to the used CHARMM22 force field.104,105 However, later works consistently succeeded in elucidating the folding mechanism.74,93,101,106–110 The WW domain variant considered in the present work was only modified in loop L1 and contains 38 residues (see Table I).78 In experiments, the folding time was determined to be 12 µs.99 

For all protein-force field combinations, we performed three independent NPT simulations at 337 K and 1 bar, one simulation starting from the folded PDB (protein databank) structure (F) and two simulations starting from (partially) unfolded states. This choice of temperature is motivated by the early study of Freddolino et al.103 To generate the unfolded states, we simulated the solvated systems at 500 K using the CHARMM 36m force field. The thus obtained configurations were subsequently clustered based on Cα-RMSD. For each system, two cluster representatives were picked, which served as unfolded starting structures. These two structures will be named U1 and U2 in the following. U1 has always a lower Cα-RMSD with respect to the crystal structure than U2, i.e., is less unfolded (see Fig. 1). The protonation states of each system are the same for all force fields and different starting structures (see Table I for details).

The parameters for norleucine (two present in the studied villin variant) are only incorporated into CHARMM 36m. For the other three force fields considered here, it had to be parameterized. The AMBER ff14SB parameters were derived using the antechamber module from AmberTools 19111 using a capped norleucine in order to avoid charge artifacts. The AM1-BCC charge method was used to derive partial charges in combination with AMBER atom types. Main chain atoms were defined during parameterization in order to avoid sequence dependency artifacts. For Drude 2013 and GROMOS 54A7, the backbone and the side chain parameters up to and including the CδH2 group were described using the corresponding parameters from lysine. The CϵH3 group was described using the CH3 group parameters from isoleucine.

For all simulations with the AMBER package,111 topology files and starting coordinates were generated with the program tLEaP of AmberTools 19.111 The AMBER ff14SB force field38 was used to describe the proteins, along with the TIP3P water model.70 All systems were placed in a cuboidal box with a 10 Å padding distance in each direction. Before production simulations, all systems were subjected to an elaborate equilibration protocol.112 All AMBER simulations were performed with the graphical processing unit (GPU) implementation of the pmemd module of AMBER 18.111 A constant temperature of 337 K was maintained with the Langevin thermostat113 at a collision frequency of 2 ps−1. Constant atmospheric pressure was maintained with the Berendsen barostat114 with a pressure relaxation time of 2 ps. Beyond a non-bonded cutoff of 8 Å, long range electrostatics were treated with a particle mesh Ewald method.115 The charges of the proteins were neutralized by a uniform background charge. All bonds involving hydrogens were restrained using the SHAKE algorithm,116 to allow for a time step of 2 fs. Each simulation was run for a total of 10 µs, collecting frames every 10 ps.

The simulations with the CHARMM 36m force field were conducted with OpenMM 7.3.1.117 The starting structures were placed in a cubic simulation box and solvated in CHARMM-modified TIP3P (TIPS3P) water71 using a minimal padding distance of 10 Å. The charges of the proteins were neutralized by the addition of ions (see Table I). After energy minimization and a short equilibration (40 ps), the production runs were started (the short simulations were only preformed for the unfolded structures). The equations of motion were integrated using Langevin dynamics with a friction constant of 0.1 ps−1 and a time step of 2 fs. The Coulombic interactions were calculated using the particle mesh Ewald method,118 and a real space cutoff of 12 Å was used. The Lennard-Jones interactions are switched to zero between 10 Å and 12 Å. Pressure coupling was performed with a Monte Carlo barostat.119,120

The simulations with the Drude 2013 force field were conducted with OpenMM 7.3.1.117 The starting structures were generated from the CHARMM starting structures (following minimization and short equilibration), and TIPS3P water was replaced with polarizable SWM4-NDP water.72 The equations of motion were integrated using a dual Langevin integrator based on the extended Lagrangian method121,122 using a time step of 1 fs. The regular atoms are coupled to a thermostat at 337 K with a friction constant of 0.1 ps−1, while the Drude oscillators are coupled to a thermostat at 1 K with a friction constant of 20 ps−1. A hard wall constraint at 0.2 Å was used. The Lennard-Jones and Coulomb interactions as well as the pressure coupling were handled as in the CHARMM case.

The simulations with the GROMOS 54A7 force field were conducted with GROMACS 2019.2.123 The starting structures were placed in a cubic simulation box and solvated in SPC water73 using a minimal padding distance of 10 Å. The charges of the proteins were neutralized by the addition of ions (see Table I). After energy minimization and short NVT and NPT equilibrations (1 ps each), the production runs were started. The equations of motion were integrated using Langevin dynamics with a friction constant of 0.1 ps−1 and a time step of 2 fs. The Coulombic interactions were calculated using the particle mesh Ewald method.118 An identical cutoff of 12 Å was used for both the Lennard-Jones and real space Coulomb interactions. Pressure coupling was performed with the Parrinello–Rahman barostat,124 and bond constraints were maintained using the LINCS algorithm.125 

To analyze the temporal evolution of the systems, we evaluate three quantities for each frame. We assign a secondary structure element to every residue making use of the DSSP algorithm,126 determine the fraction of native contacts Q as introduced by Best, Hummer, and Eaton,127 and calculate the root-mean-square deviation of all Cα-atoms with respect to the native structures (Cα-RMSD).

We also perform a principal component analysis (PCA) and time-lagged independent component analysis (TICA). For both the analyses, we use inverse distances between the Cα-atoms as input features. For CLN025, we consider all residues in these analyses; for villin and WW domain, however, we only include residues of the structured core, i.e., residues 2–33 for villin and residues 10–34 for the WW domain. We note that we refer to residue 1 as the first residue in the studied crystal structures and refer to the following residues using sequential numbering. For the TICA,128,129 a lag time of 25 ns was chosen.

For CLN025, we built Markov state models (MSMs) to extract thermodynamic and kinetic information from the trajectories. MSMs are statistical models that describe the dynamical processes in simulations through the transition probabilities between conformational states. As the first step, for the construction of MSMs, the trajectories need to be converted into a succession of discrete states. The transitions between these states are counted with consideration of the lag time, and the transition probabilities are calculated. Here, we clustered the CLN025 trajectories of each force field into 200 clusters using k-means clustering based on the three TICs with the largest eigenvalues. Bayesian MSMs130,131 were constructed on these discretized trajectories using a lag time of 25 ns. For the Drude 2013 force field, only 63% of the frames are connected and therefore included in the MSM. For each force field, we selected the appropriate number of PCCA+ (Perron-cluster cluster analysis+) clusters132 based on the gap between successive eigenvalues to coarse-grain the MSMs. The MSMs of all force fields have been validated with the Chapman–Kolmogorov test.130 Stationary probabilities (i.e., equilibrium probabilities) and mean first passage times (i.e., average time until a transition takes place; inverse of the rate) have been calculated from the MSMs. The distributions of native contacts Q and Cα-RMSD values have been weighted with the calculated probabilities from the MSMs and separately for the resulting metastable states according to the PCCA+. To retrieve representative structures for each metastable state, we identify all frames that show a likelihood above 90% to be a member of that state. We then perform a hierarchical clustering on the Cα-positions with a distance cutoff of 0.25 Å. The highest populated structure is then represented as the most prominent in the MSM illustrations.

For villin and the WW domain, we retrieve representative structures from each trajectory based on a hierarchical agglomerative clustering on the Cα-positions applying a distance cutoff of 8 Å for villin and 6 Å for the WW domain. Subsequently, we identify the cluster representative closest to the native structure in terms of Cα-RMSD and depict it as an overlay with the crystallized native conformation.

All described analyses were performed, in part, by using cpptraj and pytraj from AmberTools 19111 and the PyEMMA package.133 

The smallest system considered in this study is the ten-residue β-hairpin CLN025. As described in Sec. III, we study the folding dynamics of CLN025 based on simulations started from the crystallized native state (CLN025-F) and two unfolded conformations (CLN025-U1 and CLN025-U2). Figure 2 shows the time evolution of the secondary structure elements, native contacts Q, and the similarity to the native conformation as Cα-RMSD for the simulations of CLN025-F. These analyses already indicate several folding and unfolding events when using either AMBER ff14SB or CHARMM 36m. While in AMBER ff14SB, the folded state is clearly more prevalent, folded states appear for roughly half the time in the CHARMM 36m trajectory.

FIG. 2.

Simulations of CLN025-F using different force fields: (A) AMBER ff14SB, (C) CHARMM 36m, (D) Drude 2013, and (G) GROMOS 54A7. In (A), (C), (D), and (G), the three panels from top to bottom show the secondary structure, the number of native contacts Q, and the Cα-RMSD with respect to the crystal structure (5AWL). The secondary structure of the crystal is shown to the left and right of the data. Q is calculated for the whole protein (blue) and for the backbone N and O of all residues (red).

FIG. 2.

Simulations of CLN025-F using different force fields: (A) AMBER ff14SB, (C) CHARMM 36m, (D) Drude 2013, and (G) GROMOS 54A7. In (A), (C), (D), and (G), the three panels from top to bottom show the secondary structure, the number of native contacts Q, and the Cα-RMSD with respect to the crystal structure (5AWL). The secondary structure of the crystal is shown to the left and right of the data. Q is calculated for the whole protein (blue) and for the backbone N and O of all residues (red).

Close modal

Simulations performed with the Drude 2013 force field only appear to explore disordered and misfolded conformational states as even the simulation started from the native structure unfolds within the first few nanoseconds. For t > 9 µs, a rather stable bend segment appears, which is joined by a second bend segment at about 14 µs [see Fig. 2(D)]. Simulations with the GROMOS 54A7 force field fluctuate between the folded and unfolded states in rapid succession. The data obtained from the unfolded starting structures CLN025-U1 and CLN025-U2 yield qualitatively comparable results (see Figs. S1 and S2 of the supplementary material).

We further perform an in-depth analysis of the associated kinetics and thermodynamics using MSMs. Building MSMs for the different force fields allows us to identify differences in state probabilities and transition rates. To this end, we combine the CLN025 simulations that originate from the different starting structures. Additionally, we apply the same workflow a 106 µs trajectory of CLN025,74 kindly provided by D.E. Shaw Research (DESRES), as a benchmark for our results. We note that the CLN025 trajectories of the present study were simulated at a slightly lower temperature (337 K) than the DESRES trajectories (340 K) and that the DESRES trajectory was obtained using CHARMM22*. In Fig. 3, we depict the captured conformational ensemble for each force field as projection onto the DESRES TICA space. Thereby, we can directly compare the resulting folding free energy landscapes. Additionally, we show representative structures of the conformational states identified in the MSM workflow including their state population and mean first passage times (see also Tables S-I and S-II of the supplementary material).

FIG. 3.

Free energy landscapes, state probabilities, and mean first passage times of CLN025 from different force fields: (R) reference DESRES trajectory, (A) AMBER ff14SB, (C) CHARMM 36m, (D) Drude 2013, and (G) GROMOS 54A7. Left: The conformational ensembles are projected onto the first two TICA eigenvectors and color coded according to their MSM-reweighted free energy. The native conformation is projected onto the same space and represented as a white circle, and projections of the MSM representative structures are depicted in the respective color. Right: Representative MSM structures derived from fuzzy clustering are depicted with their respective probability and transition time scales.

FIG. 3.

Free energy landscapes, state probabilities, and mean first passage times of CLN025 from different force fields: (R) reference DESRES trajectory, (A) AMBER ff14SB, (C) CHARMM 36m, (D) Drude 2013, and (G) GROMOS 54A7. Left: The conformational ensembles are projected onto the first two TICA eigenvectors and color coded according to their MSM-reweighted free energy. The native conformation is projected onto the same space and represented as a white circle, and projections of the MSM representative structures are depicted in the respective color. Right: Representative MSM structures derived from fuzzy clustering are depicted with their respective probability and transition time scales.

Close modal

Already from the FEL, we can identify clear distinction between the force fields in our set. AMBER ff14SB and CHARMM 36m show a highly similar pattern to the DESRES reference trajectory, with two distinctly separated minima, which represent the native (folded) and unfolded conformational states. However, the depth of the minima differs significantly between the sampled ensembles. The minima observed in the DESRES reference trajectory translate into a probability of 78% for the folded state. AMBER ff14SB represents the native conformations with a probability of 92%. The CHARMM 36m ensemble, on the other hand, is characterized by a probability of 47% for the folded state and two unfolded states with probabilities of 50% and 3%, respectively. This observed shift of probabilities away from the native state is well in line with CHARMM 36m being specifically parameterized to sample disordered conformational ensembles. We also note that the probabilities of the folded states are consistent with the results shown in Figs. 2(A) and 2(C).

The conformational ensembles and corresponding FELs captured with Drude 2013 and GROMOS 54A7 show substantially different characteristics than the reference DESRES trajectory. However, as described above and highlighted in Fig. 2(D), even when starting from the native conformation, Drude 2013 unfolds the β-hairpin within a few nanoseconds. As indicated above, also the FEL of CLN025 sampled with GROMOS 54A7 implies significant differences to the conformational ensemble of the DESRES reference trajectory. It appears shallower overall, with lower barriers between the most favorable minima, which translates into the corresponding faster transition time scales. With mean first passage times of 100 ns between the folded and highest populated unfolded states, GROMOS 54A7 clearly exhibits the fastest (un)folding dynamics. This trend can already be seen in Fig. 2 as the native contacts form and break rapidly. In terms of state populations, GROMOS 54A7 shows result as CHARMM 36m. The conformational state that resembles the native state most closely is represented with a probability of 45%. The remaining three conformational states identified with the MSM workflow exhibit probabilities of 44%, 9%, and 2%, respectively. We note that the highest populated unfolded state in the GROMOS 54A7 force field is structurally significantly more similar to the native state than for CHARMM 36m (see Figs. S22 and S24 of the supplementary material).

Villin is a widely used model protein to study the dynamics of α-helix formation. As described in Sec. III, we again perform simulations starting from the native state (VIL-F) and two unfolded conformations (VIL-U1 and VIL-U2) for each force field. The simulations starting from VIL-F are stable in the AMBER ff14SB force field over the studied 10 µs, while in the CHARMM 36m trajectory, which is more than twice as long, the folded structure partially unfolds at t > 12 µs. When villin is studied in the Drude 2013 force field, the folded state is partially retained for the 11 µs studied. In the GROMOS 54A7 force field, villin transitions to a misfolded state.

The unfolding events observed in CHARMM 36m and GROMOS 54A7 indicate a destabilization of the native state since the experimental melting temperature of the WW domain is 23 K higher than the temperature studied here (see Table I). Nevertheless, it appears that in both force fields, helix α3 (see Fig. 1) is the most stable helix, a finding which is consistent with isotope-edited IR spectroscopy data91 (a more detailed discussion of the VIL-F simulations can be found in the supplementary material).

The results for the simulations started from the unfolded villin conformation VIL-U1 are shown in Fig. 4. The simulation performed using AMBER ff14SB refolds within 2 µs and then remains close to the native state, with the exception of a partial unfolding event after around 9 µs. The conformational ensemble resulting from simulations starting from VIL-U1 using CHARMM 36m does not include the native state. After numerous conformational rearrangements in the first 15 µs, the system appears to become trapped in a misfolded state for the following 8 µs. In this misfolded state, helices α2 and α3 are correctly formed, but instead of helix α1, a β-sheet is formed. Starting a simulation from the same structure with the Drude 2013 force field shows partial refolding within the first few nanoseconds. However, the simulation does not sample the native state. Neither do the simulations with the GROMOS 54A7 force field. Nevertheless, the analyses shown in Fig. 4(G) indicate numerous partial folding and unfolding events of all three helices. It appears that in simulations using the GROMOS 54A7 force field, helix α3 is the most likely helix to be folded.

FIG. 4.

Simulations of VIL-U1 using different force fields: (A) AMBER ff14SB, (C) CHARMM 36m, (D) Drude 2013, and (G) GROMOS 54A7. In (A), (C), (D),and (G), the three panels from top to bottom show the secondary structure, the number of native contacts Q, and the Cα-RMSD with respect to the crystal structure (2F4K). The secondary structure of the crystal is shown to the left and right of the data. Q is calculated for the whole protein (blue) and for the backbone N and O of residues 2–33 (red).

FIG. 4.

Simulations of VIL-U1 using different force fields: (A) AMBER ff14SB, (C) CHARMM 36m, (D) Drude 2013, and (G) GROMOS 54A7. In (A), (C), (D),and (G), the three panels from top to bottom show the secondary structure, the number of native contacts Q, and the Cα-RMSD with respect to the crystal structure (2F4K). The secondary structure of the crystal is shown to the left and right of the data. Q is calculated for the whole protein (blue) and for the backbone N and O of residues 2–33 (red).

Close modal

The unfolded villin structure VIL-U2 has a higher Cα-RMSD to the crystal structure than VIL-U1. In Fig. 5, we profile the degree of folding captured with each force field on starting from VIL-U2. Again, the simulation performed using the AMBER ff14SB force field resembles the characteristics of the native state closely in the captured 10 µs. Only slight deviations in helix α3 remain [see Fig. 5(A)]. For CHARMM 36m, the results are quite different as in this case, the simulation explores helical and β-sheet conformations, but it does not reach the native state. The simulations started from VIL-U2 using Drude 2013 misfolds toward one large α-helix within the initial 2 µs, which spans the residues of helices α1 and α2 in the native structure. As the simulation progresses, this helix splits into two, and also helix α3 is partially formed. Despite not reaching the native state, the system reaches a conformation that is comparable to structures captured with Drude 2013 simulations started from the VIL-F structure (see Fig. S26 of the supplementary material). As shown in Fig. 5(G), simulations with GROMOS 54A7 started from VIL-U2 approach the characteristics of the native conformations. While helices α1 and α3 are stably folded throughout most of the 10 µs trajectory, the second helix is only formed in the last microsecond of the trajectory. These last conformations then exhibit characteristics resembling the native state.

FIG. 5.

Simulations of VIL-U2 using different force fields: (A) AMBER ff14SB, (C) CHARMM 36m, (D) Drude 2013, and (G) GROMOS 54A7. In (A), (C), (D), and (G), the three panels from top to bottom show the secondary structure, the number of native contacts Q, and the Cα-RMSD with respect to the crystal structure (2F4K). The secondary structure of the crystal is shown to the left and right of the data. Q is calculated for the whole protein (blue) and for the backbone N and O of residues 2–33 (red).

FIG. 5.

Simulations of VIL-U2 using different force fields: (A) AMBER ff14SB, (C) CHARMM 36m, (D) Drude 2013, and (G) GROMOS 54A7. In (A), (C), (D), and (G), the three panels from top to bottom show the secondary structure, the number of native contacts Q, and the Cα-RMSD with respect to the crystal structure (2F4K). The secondary structure of the crystal is shown to the left and right of the data. Q is calculated for the whole protein (blue) and for the backbone N and O of residues 2–33 (red).

Close modal

Apart from the simulations with CHARMM 36m, all simulations of VIL-U2 at least approach the native state. However, the order in which the helices are formed is different. In AMBER ff14SB, we observe α1α2α3; in Drude 2013, we observe α3α1α2 (note that α1 is not folded correctly); in GROMOS 54A7, we observe α1α3α2 (note that α1 unfolds and refolds several times). Experimental results suggest that predominant folding pathways start with the formation of helix α3,91 which is observed here only for Drude 2013, but Drude 2013 does not reach the fully folded state within the ≈5 µs trajectory studied. In both AMBER ff14SB and GROMOS 54A7, helix α1 is formed fist. This discrepancy and the agreement of Drude 2013 with the experimental findings should, however, not be overrated since we here study folding of a single unfolded conformation and not of an ensemble of unfolded conformations as done in experiments.

The statistical significance of the achieved simulation data unfortunately does not allow a reliable analysis of the captured ensembles with MSMs. However, in Fig. 6, we depict the captured conformational ensembles as projections onto the PCA space of a 125 µs DESRES reference trajectory (PC1 explaining 18% of the variance, PC2 11%). This representation allows a direct comparison of the FEL resulting from each starting structure and force field.

FIG. 6.

Free energy landscapes of villin from different force fields as a projection onto the PCA space of the DESRES reference trajectory: (A) AMBER ff14SB, (C) CHARMM 36m, (D) Drude 2013, and (G) GROMOS 54A7. The structural information captured in each simulation is projected onto the two first eigenvectors of a PCA performed on a reference DESRES trajectory and color coded according to free energy. For a direct comparison, the conformational space of the reference trajectory is illustrated in the background in gray. Additionally, we depict a projection of the starting structures onto the same space as circles: native/crystal conformation (white), U1 (teal), and U2 (violet). The diamonds represent the structures shown in Fig. 7, folding closest to the native state.

FIG. 6.

Free energy landscapes of villin from different force fields as a projection onto the PCA space of the DESRES reference trajectory: (A) AMBER ff14SB, (C) CHARMM 36m, (D) Drude 2013, and (G) GROMOS 54A7. The structural information captured in each simulation is projected onto the two first eigenvectors of a PCA performed on a reference DESRES trajectory and color coded according to free energy. For a direct comparison, the conformational space of the reference trajectory is illustrated in the background in gray. Additionally, we depict a projection of the starting structures onto the same space as circles: native/crystal conformation (white), U1 (teal), and U2 (violet). The diamonds represent the structures shown in Fig. 7, folding closest to the native state.

Close modal

All AMBER ff14SB simulations represent the native state as the most favorable minimum, regardless of the starting structure. The most extended starting structure (VIL-U2) also captures a significant region of the unfolded portion of villin’s conformational ensemble. Interestingly, the simulation starting from the partially unfolded conformation (VIL-U1) hardly samples disordered areas of the FEL.

With CHARMM 36m, the unfolded conformational space is represented more prominently than with AMBER ff14SB. Even on starting from the native state, CHARMM 36m captures an extensive portion of the unfolded region. While the simulation starting from the unfolded conformation VIL-U2 explores extensive regions of villin’s FEL, we find no probability density in the proximity of the minimum representing the native state. Starting from the unfolded conformation VIL-U1, on the other hand, results in a FEL that closely resembles the reference DESRES trajectory and also includes the native conformational state.

The FELs of villin as described with Drude 2013 are highly consistent and appear to be rather independent from the starting conformation. However, the minimum sampled in all Drude 2013 simulations does not correspond to the native state. Simulations performed using GROMOS 54A7 lead to very similar observations. The minimum representing the native state is hardly sampled.

To visualize the structural characteristics of each simulation, we clustered each trajectory based on the Cα positions (see Sec. IV). In Fig. 7, we depict the representative structure with the smallest Cα-RMSD to the native conformation for each villin simulation. These representative conformations were also further projected onto the PCA space in Fig. 6.

FIG. 7.

Cluster representative structures with the smallest Cα-RMSD with respect to the F structure for all simulations of villin. The columns represent the different starting structures, i.e., VIL-F, VIL-U1, and VIL-U2, while the rows represent the different force fields, i.e., AMBER ff14SB (A), CHARMM 36m (C), Drude 2013 (D), and GROMOS 54A7 (G). ttraj is the length of the respective trajectories. For comparison, the native structure is also shown (translucent white).

FIG. 7.

Cluster representative structures with the smallest Cα-RMSD with respect to the F structure for all simulations of villin. The columns represent the different starting structures, i.e., VIL-F, VIL-U1, and VIL-U2, while the rows represent the different force fields, i.e., AMBER ff14SB (A), CHARMM 36m (C), Drude 2013 (D), and GROMOS 54A7 (G). ttraj is the length of the respective trajectories. For comparison, the native structure is also shown (translucent white).

Close modal

The studied WW domain consists of 38 residues arranged in three β-strands. While similar in length to villin, we consider its computational study more challenging as the formation of β-sheets is significantly slower than for α-helices.134 We characterize the folding dynamics of the WW domain with each force field based on simulations starting from the native state (WW-F), a partially unfolded conformation (WW-U1), and a fully unfolded conformation (WW-U2).

The native WW-F structure is stable in both the AMBER ff14SB and CHARMM 36m simulations over the entire trajectory (10 µs and 20 µs, respectively). In contrast, the simulation performed with Drude 2013 leads to partial unfolding of the protein. Only the central region of the WW domain stays intact. The WW-F structure also appears to be unstable when simulated using the GROMOS 54A7 force field. As in the case of villin, unfolding is not expected as the melting temperature of the WW domain is 13 K higher than the temperature studied here (see Table I). However, in contrast to villin, the WW domain is stable for 20 µs in CHARMM 36m (a more detailed discussion of the WW-F simulations can be found in the supplementary material).

The results of the simulations starting from the partially unfolded WW-U1 structure are depicted in Fig. 8. Both the AMBER ff14SB and CHARMM 36m force fields direct the system back to the native state within the first 2 µs, which is stable for the remaining trajectory. When performing simulations starting from WW-U1 using Drude 2013, we observe further unfolding. Already in the first stages of the simulation, the two main β-strands transition toward disordered conformational states. At ≈3 µs, parts of this region refold as indicated by all three quantities shown in Fig. 8(D). However, at the same time, the residues of the short β-strand (β3) form an α-helix. At ≈9 µs, this α-helical structure replaces the two main β-strands, resulting in a misfolded state. The dynamics of the GROMOS 54A7 simulations are less severe. While the partially unfolded WW-U1 conformation does not fold to the native state within the ≈10 µs studied, it also does not completely unfold. That is, two of the three β-strands of the WW domain stay partially folded over the whole trajectory. We also observe that both the native contacts and the Cα-RMSD show little fluctuation for t ≥ 5 µs. This indicates that this trajectory is exploring one minimum in the free energy landscape from which it does not escape within the captured timescale.

FIG. 8.

Simulations of WW-U1 using different force fields: (A) AMBER ff14SB, (C) CHARMM 36m, (D) Drude 2013, and (G) GROMOS 54A7. In (A), (C), (D), and (G), the three panels from top to bottom show the secondary structure, the number of native contacts Q, and the Cα-RMSD with respect to the crystal structure (2F21). The secondary structure of the crystal is shown to the left and right of the data. Q is calculated for the whole protein (blue) and for the backbone N and O of residues 10–34 (red).

FIG. 8.

Simulations of WW-U1 using different force fields: (A) AMBER ff14SB, (C) CHARMM 36m, (D) Drude 2013, and (G) GROMOS 54A7. In (A), (C), (D), and (G), the three panels from top to bottom show the secondary structure, the number of native contacts Q, and the Cα-RMSD with respect to the crystal structure (2F21). The secondary structure of the crystal is shown to the left and right of the data. Q is calculated for the whole protein (blue) and for the backbone N and O of residues 10–34 (red).

Close modal

None of the simulations started from the fully unfolded WW-U2 structure capture the native state of the system within 6 µs–20 µs of simulation time. In the simulation with AMBER ff14SB, the protein misfolds into a stable conformation consisting of short 310-helices and turns within the first microsecond. For CHARMM 36m, the analyzed quantities indicate that the system does not exhibit a stable fold within the simulated 20 µs. In the Drude 2013 simulation, we observe a partial formation of the two main β-strands. However, this onset of folding does not lead to sampling of the native state conformation. In addition, GROMOS 54A7 does not lead to the native fold but, instead, captures a stable misfolded state including an α-helix (a more detailed discussion of the WW-U2 simulations can be found in the supplementary material).

As these results already imply, we were not able to construct statistically stable MSMs for the WW domain. Hence, in Fig. 9, we represent the FEL explored with each starting conformation and force field individually as a projection onto a combined PCA space, where PC1 comprises 45% of the variance and PC2 13%. (We do not project onto the PCA space given by the DESRES trajectory recorded for the WW domain because a significantly different variant, i.e., GTT, was studied in Ref. 74.) This analysis emphasizes the substantial impact of the WW domain starting structure on the resulting ensemble. As already suggested in Fig. 8, the FEL described with AMBER ff14SB shows that simulations are strongly stabilized once they are close to native state. However, the simulation starting from the WW-U2 conformation remains in a narrow distinct minimum far away from the folded conformational state. CHARMM 36m also consistently samples the narrow region around the minimum representing the native state not only for the simulation starting there (WW-F) but also for the simulation starting from the partially unfolded conformation (WW-U1). Most intriguingly, CHARMM 36m explores remarkably extensive portions of the FEL when started from the fully unfolded state (WW-U2) including a minimum in close vicinity to the native state.

FIG. 9.

Free energy landscapes of the WW domain from different force fields as a projection onto a combined PCA space: (A) AMBER ff14SB, (C) CHARMM 36m, (D) Drude 2013, and (G) GROMOS 54A7. Additionally, we depict a projection of the starting structures onto the same space as circles: native/crystal conformation (white), U1 (teal), and U2 (violet). The diamonds represent the structures shown in Fig. 10, folding closest to the native state.

FIG. 9.

Free energy landscapes of the WW domain from different force fields as a projection onto a combined PCA space: (A) AMBER ff14SB, (C) CHARMM 36m, (D) Drude 2013, and (G) GROMOS 54A7. Additionally, we depict a projection of the starting structures onto the same space as circles: native/crystal conformation (white), U1 (teal), and U2 (violet). The diamonds represent the structures shown in Fig. 10, folding closest to the native state.

Close modal

The FEL resulting from Drude 2013 simulations of the native state identifies a minimum in the vicinity of the native state as the most favorable, while the native state itself does not correspond to a minimum. When starting from the partially unfolded conformation, Drude 2013 does not visit the minimum close to the native state but rather explores unfolded conformations. Similar FEL regions are explored in the WW-U2 simulations. GROMOS 54A7 based simulations display a very different behavior based on the constructed FEL in PCA space. The simulation starting from the native state correctly represents the native minimum but also leads to considerable exploration of unfolded conformations. The native minimum is also sampled if the simulations are started from the partially unfolded conformation. However, the simulation starting from the unfolded conformation (WW-U2) does not visit the vicinity of the native state minimum.

The structural characteristics discussed for each simulation are further illustrated in Fig. 10. We again clustered each simulation and depict the representative conformation, which resembles the native conformation most closely (see Sec. IV). The respective conformations are also depicted as a projection onto the PCA space shown in Fig. 9.

FIG. 10.

Cluster representative structures with the smallest Cα-RMSD with respect to the F structure for all simulations of the WW domain. The columns represent the different starting structures, i.e., WW-F, WW-U1, and WW-U2, while the rows represent the different force fields, i.e., AMBER ff14SB (A), CHARMM 36m (C), Drude 2013 (D), and GROMOS 54A7 (G). ttraj is the length of the respective trajectories. For comparison, the native structure is also shown (translucent white).

FIG. 10.

Cluster representative structures with the smallest Cα-RMSD with respect to the F structure for all simulations of the WW domain. The columns represent the different starting structures, i.e., WW-F, WW-U1, and WW-U2, while the rows represent the different force fields, i.e., AMBER ff14SB (A), CHARMM 36m (C), Drude 2013 (D), and GROMOS 54A7 (G). ttraj is the length of the respective trajectories. For comparison, the native structure is also shown (translucent white).

Close modal

In this study, we characterize the folding dynamics of the three fast-folding proteins using four different force fields, i.e., AMBER ff14SB, CHARMM 36m, GROMOS 54A7, and the polarizable force field Drude 2013 in conjunction with their standard water models (TIP3P, TIPS3P, SPC, and SWM4-NDP, respectively). Focusing on the smallest system in our test set, CLN025, we observe several clear differences in the conformational ensembles captured by the different force fields. Due to its comparably small size and fast transitions between the native and disordered conformational states, CLN025 is a prime model system for quantitative studies of β-hairpin folding dynamics. Here, we evaluate the performance of each force field based on simulations starting from the native and two unfolded conformations. Analyses based on the secondary structure, native contacts, and Cα-RMSD of each simulation already suggest that the captured dynamics are independent of the starting structure. While the ensembles do not appear to be biased by the initial conformation, we find significant differences between the considered force fields. To ensure statistically and physically meaningful analyses, we characterize the ensembles resulting from each force field using MSMs. We apply the same MSM workflow to an already published 106 µs DESRES trajectory of CLN025 (simulated with CHARMM22*) to provide a reference for our evaluations. The MSM built from this ultralong DESRES trajectory assigns a probability of 78% to the folded state at 340 K, and the corresponding work of Lindorff-Larsen et al.74 suggests a melting temperature of 381 K for CLN025. Experimental studies, however, estimate a population of ∼55% at 337 K for the native state75 and determine a melting temperature of ∼344 K76 (see Table I). An overestimation of the folded fraction of CLN025 has also been observed for other force fields.120,135

Similarly, also the CLN025 ensemble captured with AMBER ff14SB represents the folded state as clearly prevalent. Furthermore, the observed kinetics agree remarkably well with the reference DESRES MSM. A vague experimental reference for the mean first passage times can be derived from T-jump experiments, which report relaxation times of 100 ns76 at 328 K. Although the experimental relaxation times were determined 10 K lower than the present simulations, they still indicate that the folding mean first passage times suggested by AMBER ff14SB are slightly more realistic than the ones suggested by the DESRES reference.

CHARMM 36m explores a similar FEL as the AMBER ff14SB simulation, but it estimates a significantly lower population for the native state (47%). Furthermore, the transition rate toward the native state is estimated about an order of magnitude slower in this force field. Hence, the CHARMM 36m MSM retrieves state populations closer to the experimental estimates, while the kinetics are further off than for AMBER ff14SB and DESRES simulations. Furthermore, we find that the CHARMM 36m ensemble does not agree with the DESRES reference as AMBER ff14SB as well. This might appear counterintuitive as the DESRES simulations are based on a force field from the CHARMM family. However, the aim in the development of the studied CHARMM 36m was to provide a force field that would be applicable for both structured and disordered proteins. It is thus not surprising but rather encouraging that this force field is biased toward the exploration of unfolded conformations. Similar to the CHARMM 36m simulations, also the GROMOS 54A7 ensembles agree well with experimental thermodynamics, assigning a 45% probability to the native state. The CLN025 folding dynamics captured with GROMOS 54A7, however, also excel in terms of the estimated kinetics. Here, the transition rates between the native and unfolded conformational states are substantially faster. While DESRES, AMBER ff14SB, and CHARMM 36m estimate mean first passage times between 1 µs and 2 µs, GROMOS 54A7 fluctuates between the native and unfolded conformations within 100 ns. Drude 2013 can also be considered as a member of the CHARMM family and stands out as the only polarizable force field in our test set. Nevertheless, Drude 2013 is also the only force field that does not capture conformational rearrangements toward the native state. Although one of the performed Drude 2013 simulations is started from the native conformation, the structure unfolds rapidly. As no refolding events are observed, the initial frames depicting the native state are not considered in the MSM.

In addition, we study helix folding dynamics based on simulations of villin using polarizable and non-polarizable force fields. Besides the difference in the secondary structure, villin is more than 20 residues larger than CLN025 and thus poses a more complex sampling challenge to the selected force fields. Again, we perform simulations starting from the native state and two unfolded conformations to evaluate simulation stability and folding dynamics. The individual analysis of each simulation highlights AMBER ff14SB as a performance champion in terms of folding toward the native conformation and stable sampling thereof. While all force fields sample the minimum around the native structure when started from there, AMBER ff14SB is the only one that does not exhibit notable indications of unfolding. We, nevertheless, emphasize that the CHARMM 36m simulations are more than twice as long (>20 µs) as the AMBER ff14SB simulations (10 µs), and also in CHARMM 36m, villin is entirely stable in its native state up to t ≈ 12 µs. When we evaluate the simulations started from the unfolded conformations, it is again AMBER ff14SB that refolds the structure closest toward the native state minimum. This result is somewhat surprising, as also GROMOS 54A7 was specifically optimized to stabilize helical structures. Additionally, we had expected Drude 2013 to be particularly well suited for folding villin as polarization plays a key role in helix formation.136 

The most demanding system in our set is a WW domain, which we selected to study the formation of β-sheets. Although this WW domain is comparable to villin in size, it is well established that folding of β-sheets is significantly slower than helix formation.134 We analyze simulations starting from the native, a partially and a fully unfolded structure. AMBER ff14SB and CHARMM 36m show highly stable simulations of the native state. Furthermore, both force fields also rearrange the partially unfolded starting structure toward the native conformation within the first 2 µs. Both GROMOS 54A7 and Drude 2013, on the other hand, display lower stability, i.e., partial to full unfolding, in the simulation starting from the native state. For the simulations started from the partially unfolded conformation, the system appears to be trapped in a misfolded conformational state in both force fields. For Drude 2013, this misfolded state even includes an α-helix. When starting from the fully unfolded conformation, none of the considered force fields is able to refold the WW domain to its native structure. In all runs except Drude 2013, α-helical structures appear. While it has been reported before that β-sheet formation can proceed via α-helical intermediates,137,138 only the simulation with CHARMM 36m shows a direct transition from an α-helical arrangement to an almost folded WW domain, which, however, is short lived (see Fig. S32 of the supplementary material). The longer lived α-helical structures found in AMBER ff14SB and GROMOS 54A7 point to an over-stabilization of this structural feature.

In summary, based on the simulation setup, model systems, and analyses presented in this study, we report the following key observations. CHARMM 36m performs slightly better than AMBER ff14SB in sampling representative state probabilities, while AMBER ff14SB performs slightly better in α-helix formation. However, we note that AMBER ff14SB also quickly misfolds the fully unstructured conformation of the studied WW domain into a highly stable helical conformation. This could be an indication that the helical structure elements are overstabilized within this force field. It has been described for preceding force fields in the AMBER family that the helical propensities of amino acids are generally overestimated.139,140

The preference for α-helical structures in AMBER ff14SB observed here is likely a feature related to the standard water model of AMBER ff14SB TIP3P. Past studies found that TIP3P induces overly compact protein structures65,67,141 and that AMBER ff14SB per se underestimates helical propensities.69 Generally, it has been shown that a change in the water model can have a significant impact on folding equilibria and kinetics.61,66,67,69 For instance, it was recently suggested that proteins can refold at very low temperatures,142 a phenomenon that is likely related to the complex behavior of supercooled water143,144 and hence strongly depends on the quality of the employed water model. In this survey, however, we always used the respective standard water models of the four force fields considered. Hence, we are only able to compare these four different force field–water combinations and cannot draw conclusions regarding the influence of water models on the protein folding dynamics observed.

While we expected the most accurate results for Drude 2013 simulations, we consistently find it to be outperformed by the other three force fields. Considering its sophisticated treatment of electrostatics and the resulting significant computational overhead, we present simulations of notable length. Nevertheless, we observe surprisingly unstable sampling for all the systems starting from the native conformation. Indications that Drude 2013 destabilizes β-hairpins were already noted in the past,35 and during the preparation of this article, the Drude developers published a work with similar findings.47 In this work, Lin et al.47 also present an optimized version of the Drude force field (Drude 2019), which is expected to resolve the observed issues. In fact, it was shown that the folded state of CLN025 is stable in Drude 2019 throughout a 1 µs trajectory simulated at 300 K.47 

In the presented work, we profile folding dynamics captured with the polarizable Drude 2013 force field and three non-polarizable force fields, i.e., AMBER ff14SB, CHARMM 36m, and GROMOS 54A7 in conjunction with their respective standard water models. In performing long-timescale simulations of the three model systems, we evaluate the performance of each force field in sampling β-hairpin, α-helical, and β-sheet structures. In particular, for Drude 2013, protein dynamics on such extended timescales have not been studied before. Despite the extensive sampling and sophisticated description of electrostatics, we consistently observe destabilization of the native state ensemble in all Drude 2013 simulations, independently of its respective secondary structure elements. However, the recently published Drude 2019 force field was explicitly developed to address and resolve this issue.47 All non-polarizable force fields in our test set perform well in sampling the dynamics within the folded state ensemble. For the ultra-fast-folding mini-protein CLN025, ensembles from both the DESRES and the AMBER ff14SB simulations overestimate the population of the native state. This common limitation of protein force fields is overcome by GROMOS 54A7 and CHARMM 36m, which was particularly developed to sample structured and disordered proteins. Our results further indicate that GROMOS 54A7 even outperforms CHARMM 36m in terms of accurate folding kinetics of CLN025.

In the description of helix formation, AMBER ff14SB achieves the most promising results as even the most extended starting conformation of villin closely folds toward the native state within 10 µs. However, considering our findings for the WW domain, it appears that helices are generally overstabilized in our AMBER ff14SB simulations.

Overall, our findings suggest that current additive force fields are well suited to study not only ultra-fast-folding dynamics but also partial unfolding and refolding processes in feasible computational time.

See the supplementary material for additional figures, tables, and a corresponding discussion.

A.S.K. and P.H.H. contributed equally to this work.

All authors thank the Austrian Science Fund (FWF) for financial support (Project Nos. P30565 and P30737). P.H.H. also acknowledges funding by the University of Innsbruck (NWF-Project No. 282396). The computational results presented were achieved, in part, using the HPC infrastructure LEO of the University of Innsbruck and the Vienna Scientific Cluster (VSC). They are grateful to D. E. Shaw Research for supplying their trajectories of the systems considered in this study.

The data that support the findings of this study are available from the corresponding author upon reasonable request.

1.
D. D.
Boehr
,
R.
Nussinov
, and
P. E.
Wright
, “
The role of dynamic conformational ensembles in biomolecular recognition
,”
Nat. Chem. Biol.
5
,
789
796
(
2009
).
2.
R. G.
Smock
and
L. M.
Gierasch
, “
Sending signals dynamically
,”
Science
324
,
198
203
(
2009
).
3.
H.
van den Bedem
and
J. S.
Fraser
, “
Integrative, dynamic structural biology at atomic resolution—It’s about time
,”
Nat. Methods
12
,
307
318
(
2015
).
4.
J.
Köfinger
,
L. S.
Stelzl
,
K.
Reuter
,
C.
Allande
,
K.
Reichel
, and
G.
Hummer
, “
Efficient ensemble refinement by reweighting
,”
J. Chem. Theory Comput.
15
,
3390
3401
(
2019
).
5.
M.
Igaev
,
C.
Kutzner
,
L. V.
Bock
,
A. C.
Vaiana
, and
H.
Grubmüller
, “
Automated cryo-EM structure refinement using correlation-driven molecular dynamics
,”
eLife
8
,
e43542
(
2019
).
6.
M.
Bonomi
and
M.
Vendruscolo
, “
Determination of protein structural ensembles using cryo-electron microscopy
,”
Curr. Opin. Struct. Biol.
56
,
37
45
(
2019
).
7.
M.
Karplus
and
J. A.
McCammon
, “
Molecular dynamics simulations of biomolecules
,”
Nat. Struct. Biol.
9
,
646
652
(
2002
).
8.
T.
Hansson
,
C.
Oostenbrink
, and
W.
van Gunsteren
, “
Molecular dynamics simulations
,”
Curr. Opin. Struct. Biol.
12
,
190
196
(
2002
).
9.
M.
Karplus
and
J.
Kuriyan
, “
Molecular dynamics and protein function
,”
Proc. Natl. Acad. Sci. U. S. A.
102
,
6679
6685
(
2005
).
10.
S. A.
Hollingsworth
and
R. O.
Dror
, “
Molecular dynamics simulation for all
,”
Neuron
99
,
1129
1143
(
2018
).
11.
R. O.
Dror
,
R. M.
Dirks
,
J. P.
Grossman
,
H.
Xu
, and
D. E.
Shaw
, “
Biomolecular simulation: A computational microscope for molecular biology
,”
Annu. Rev. Biophys.
41
,
429
452
(
2012
).
12.
H.
Zhao
and
A.
Caflisch
, “
Molecular dynamics in drug design
,”
Eur. J. Med. Chem.
91
,
4
14
(
2015
).
13.
C. B.
Anfinsen
, “
Principles that govern the folding of protein chains
,”
Science
181
,
223
230
(
1973
).
14.
C.
Levinthal
, “
How to fold graciously
,” in
Mössbauer Spectroscopy in Biological Systems Proceedings
(
University of Illinois Press
,
1969
), Vol. 67, pp.
22
24
.
15.
J. D.
Bryngelson
,
J. N.
Onuchic
,
N. D.
Socci
, and
P. G.
Wolynes
, “
Funnels, pathways, and the energy landscape of protein folding: A synthesis
,”
Proteins: Struct., Funct., Bioinf.
21
,
167
195
(
1995
).
16.
K. A.
Dill
and
H. S.
Chan
, “
From Levinthal to pathways to funnels
,”
Nat. Struct. Biol.
4
,
10
19
(
1997
).
17.
D.
Frenkel
, “
Simulations: The dark side
,”
Eur. Phys. J. Plus
128
,
10
(
2013
).
18.
F. H.
Stillinger
,
Energy Landscapes, Inherent Structures, and Condensed-Matter Phenomena
(
Princeton University Press
,
2015
).
19.
F.
Rao
and
M.
Karplus
, “
Protein dynamics investigated by inherent structure analysis
,”
Proc. Natl. Acad. Sci. U. S. A.
107
,
9152
9157
(
2010
).
20.
M.
Karplus
and
D. L.
Weaver
, “
Protein-folding dynamics
,”
Nature
260
,
404
406
(
1976
).
21.
B.
Ma
,
S.
Kumar
,
C.-J.
Tsai
, and
R.
Nussinov
, “
Folding funnels and binding mechanisms
,”
Protein Eng.
12
,
713
720
(
1999
).
22.
T.
Mori
and
S.
Saito
, “
Molecular mechanism behind the fast folding/unfolding transitions of Villin headpiece subdomain: Hierarchy and heterogeneity
,”
J. Phys. Chem. B
120
,
11683
11691
(
2016
).
23.
R.
Best
and
G.
Hummer
, “
Interpreting phi-values using protein folding transition paths
,”
Biophys. J.
112
,
60a
(
2017
).
24.
F.
Noé
,
G.
De Fabritiis
, and
C.
Clementi
, “
Machine learning for protein folding and dynamics
,”
Curr. Opin. Struct. Biol.
60
,
77
84
(
2020
).
25.
K. A.
Dill
,
S. B.
Ozkan
,
M. S.
Shell
, and
T. R.
Weikl
, “
The protein folding problem
,”
Annu. Rev. Biophys.
37
,
289
316
(
2008
).
26.
A.
Perez
,
J. A.
Morrone
,
E.
Brini
,
J. L.
MacCallum
, and
K. A.
Dill
, “
Blind protein structure prediction using accelerated free-energy simulations
,”
Sci. Adv.
2
,
e1601274
(
2016
).
27.
S.
Jin
,
M.
Chen
,
X.
Chen
,
C.
Bueno
,
W.
Lu
,
N. P.
Schafer
,
X.
Lin
,
J. N.
Onuchic
, and
P. G.
Wolynes
, “
Protein structure prediction in CASP13 using AWSEM-suite
,”
J. Chem. Theory Comput.
16
,
3977
(
2020
).
28.
A. D.
MacKerell
, Jr.
, “
Empirical force fields for biological macromolecules: Overview and issues
,”
J. Comput. Chem.
25
,
1584
1604
(
2004
).
29.
R. B.
Best
, “
Atomistic force fields for proteins
,” in
Biomolecular Simulations
(
Springer
,
2019
), pp.
3
19
.
30.
J. W.
Ponder
,
C.
Wu
,
P.
Ren
,
V. S.
Pande
,
J. D.
Chodera
,
M. J.
Schnieders
,
I.
Haque
,
D. L.
Mobley
,
D. S.
Lambrecht
,
R. A.
DiStasio
, Jr.
et al., “
Current status of the AMOEBA polarizable force field
,”
J. Phys. Chem. B
114
,
2549
2564
(
2010
).
31.
C. M.
Baker
, “
Polarizable force fields for molecular dynamics simulations of biomolecules
,”
Wiley Interdiscip. Rev.: Comput. Mol. Sci.
5
,
241
254
(
2015
).
32.
P. E.
Lopes
,
O.
Guvench
, and
A. D.
MacKerell
, “
Current status of protein force fields for molecular dynamics simulations
,” in
Molecular Modeling of Proteins
(
Springer
,
2015
), pp.
47
71
.
33.
J.
Melcr
and
J.-P.
Piquemal
, “
Accurate biomolecular simulations account for electronic polarization
,”
Front. Mol. Biosci.
6
,
143
(
2019
).
34.
Z.
Lin
and
W. F.
van Gunsteren
, “
Effects of polarizable solvent models upon the relative stability of an α-helical and a β-hairpin structure of an alanine decapeptide
,”
J. Chem. Theory Comput.
11
,
1983
1986
(
2015
).
35.
A. J.
Hazel
,
E. T.
Walters
,
C. N.
Rowley
, and
J. C.
Gumbart
, “
Folding free energy landscapes of β-sheets with non-polarizable and polarizable CHARMM force fields
,”
J. Chem. Phys.
149
,
072317
(
2018
).
36.
J.
Huang
and
A. D.
MacKerell
, Jr.
, “
Induction of peptide bond dipoles drives cooperative helix formation in the (AAQAA)3 peptide
,”
Biophys. J.
107
,
991
997
(
2014
).
37.
J. A.
Lemkul
,
J.
Huang
, and
A. D.
MacKerell
, Jr.
, “
Induced dipole–dipole interactions influence the unfolding pathways of wild-type and mutant amyloid β-peptides
,”
J. Phys. Chem. B
119
,
15574
15582
(
2015
).
38.
J. A.
Maier
,
C.
Martinez
,
K.
Kasavajhala
,
L.
Wickstrom
,
K. E.
Hauser
, and
C.
Simmerling
, “
ff14SB: Improving the accuracy of protein side chain and backbone parameters from ff99SB
,”
J. Chem. Theory Comput.
11
,
3696
3713
(
2015
).
39.
J.
Huang
,
S.
Rauscher
,
G.
Nawrocki
,
T.
Ran
,
M.
Feig
,
B. L.
de Groot
,
H.
Grubmüller
, and
A. D.
MacKerell
, “
CHARMM36m: An improved force field for folded and intrinsically disordered proteins
,”
Nat. Methods
14
,
71
73
(
2017
).
40.
D.
Poger
,
W. F.
Van Gunsteren
, and
A. E.
Mark
, “
A new force field for simulating phosphatidylcholine bilayers
,”
J. Comput. Chem.
31
,
1117
1125
(
2010
).
41.
N.
Schmid
,
A. P.
Eichenberger
,
A.
Choutko
,
S.
Riniker
,
M.
Winger
,
A. E.
Mark
, and
W. F.
van Gunsteren
, “
Definition and testing of the GROMOS force-field versions 54A7 and 54B7
,”
Eur. Biophys. J.
40
,
843
(
2011
).
42.
P. E. M.
Lopes
,
J.
Huang
,
J.
Shim
,
Y.
Luo
,
H.
Li
,
B.
Roux
, and
A. D.
MacKerell
, Jr.
, “
Polarizable force field for peptides and proteins based on the classical Drude oscillator
,”
J. Chem. Theory Comput.
9
,
5430
5449
(
2013
).
43.
M. D.
Smith
,
J. S.
Rao
,
E.
Segelken
, and
L.
Cruz
, “
Force-field induced Bias in the structure of aβ21–30: A comparison of OPLS, AMBER, CHARMM, and GROMOS force fields
,”
J. Chem. Inf. Model.
55
,
2587
2595
(
2015
).
44.
P. E. M.
Lopes
,
O.
Guvench
, and
A. D.
MacKerell
, , Methods in Molecular Biology Vol. 1215 (
2015
).
45.
M. R.
Shirts
,
C.
Klein
,
J. M.
Swails
,
J.
Yin
,
M. K.
Gilson
,
D. L.
Mobley
,
D. A.
Case
, and
E. D.
Zhong
, “
Lessons learned from comparing molecular dynamics engines on the SAMPL5 dataset
,”
J. Comput.-Aided Mol. Des.
31
,
147
161
(
2017
).
46.
J. A.
Lemkul
,
J.
Huang
,
B.
Roux
, and
A. D.
MacKerell
, Jr.
, “
An empirical polarizable force field based on the classical Drude oscillator model: Development history and recent applications
,”
Chem. Rev.
116
,
4983
5013
(
2016
).
47.
F.-Y.
Lin
,
J.
Huang
,
P.
Pandey
,
C.
Rupakheti
,
J.
Li
,
B.
Roux
, and
A. D.
MacKerell
, Jr.
, “
Further optimization and validation of the classical Drude polarizable protein force field
,”
J. Chem. Theory Comput.
16
,
3221
3239
(
2020
).
48.
W.
Huang
,
Z.
Lin
, and
W. F.
van Gunsteren
, “
Validation of the GROMOS 54A7 force field with respect to beta-peptide folding
,”
J. Chem. Theory Comput.
7
,
1237
1243
(
2011
).
49.
P. E.
Wright
and
H. J.
Dyson
, “
Intrinsically disordered proteins in cellular signalling and regulation
,”
Nat. Rev. Mol. Cell Biol.
16
,
18
29
(
2015
).
50.
M.
Brucale
,
B.
Schuler
, and
B.
Samorì
, “
Single-molecule studies of intrinsically disordered proteins
,”
Chem. Rev.
114
,
3281
3317
(
2014
).
51.
R. B.
Berlow
,
H. J.
Dyson
, and
P. E.
Wright
, “
Expanding the paradigm: Intrinsically disordered proteins and allosteric regulation
,”
J. Mol. Biol.
430
,
2309
2320
(
2018
).
52.
A. S.
Kamenik
,
F.
Hofer
,
P. H.
Handle
, and
K. R.
Liedl
, “
Dynamics rationalize proteolytic susceptibility of the major birch pollen allergen Bet v 1
,”
Front. Mol. Biosci.
7
,
18
(
2020
).
53.
B.
Groitl
,
S.
Horowitz
,
K. A. T.
Makepeace
,
E. V.
Petrotchenko
,
C. H.
Borchers
,
D.
Reichmann
,
J. C. A.
Bardwell
, and
U.
Jakob
, “
Protein unfolding as a switch from self-recognition to high-affinity client binding
,”
Nat. Commun.
7
,
10357
(
2016
).
54.
T. R.
Alderson
,
J.
Ying
,
A.
Bax
,
J. L. P.
Benesch
, and
A. J.
Baldwin
, “
Conditional disorder in small heat-shock proteins
,”
J. Mol. Biol.
432
,
3033
3049
(
2020
).
55.
S.
Grutsch
,
J. E.
Fuchs
,
L.
Ahammer
,
A. S.
Kamenik
,
K. R.
Liedl
, and
M.
Tollinger
, “
Conformational flexibility differentiates naturally occurring Bet v 1 isoforms
,”
Int. J. Mol. Sci.
18
,
1192
(
2017
).
56.
R.
Konrat
, “
NMR contributions to structural dynamics studies of intrinsically disordered proteins
,”
J. Magn. Reson.
241
,
74
85
(
2014
).
57.
F.
Zosel
,
D.
Mercadante
,
D.
Nettels
, and
B.
Schuler
, “
A proline switch explains kinetic heterogeneity in a coupled folding and binding reaction
,”
Nat. Commun.
9
,
3332
(
2018
).
58.
F.
Zosel
,
A.
Soranno
,
K. J.
Buholzer
,
D.
Nettels
, and
B.
Schuler
, “
Depletion interactions modulate the binding between disordered proteins in crowded environments
,”
Proc. Natl. Acad. Sci. U. S. A.
117
,
13480
(
2020
).
59.
L. M.
Pietrek
,
L. S.
Stelzl
, and
G.
Hummer
, “
Hierarchical ensembles of intrinsically disordered proteins at atomic resolution in molecular dynamics simulations
,”
J. Chem. Theory Comput.
16
,
725
737
(
2019
).
60.
S.
Rauscher
,
V.
Gapsys
,
M. J.
Gajda
,
M.
Zweckstetter
,
B. L.
de Groot
, and
H.
Grubmüller
, “
Structural ensembles of intrinsically disordered proteins depend strongly on force field: A comparison to experiment
,”
J. Chem. Theory Comput.
11
,
5513
5524
(
2015
).
61.
R. B.
Best
and
J.
Mittal
, “
Protein simulations with an optimized water model: Cooperative helix formation and temperature-induced unfolded state collapse
,”
J. Phys. Chem. B
114
,
14916
14923
(
2010
).
62.
V.
Cruz
,
J.
Ramos
, and
J.
Martínez-Salazar
, “
Water-mediated conformations of the alanine dipeptide as revealed by distributed umbrella sampling simulations, quantum mechanics based calculations, and experimental data
,”
J. Phys. Chem. B
115
,
4880
4886
(
2011
).
63.
P. S.
Nerenberg
,
B.
Jo
,
C.
So
,
A.
Tripathy
, and
T.
Head-Gordon
, “
Optimizing solute–water van der waals interactions to reproduce solvation free energies
,”
J. Phys. Chem. B
116
,
4524
4534
(
2012
).
64.
R. B.
Best
,
W.
Zheng
, and
J.
Mittal
, “
Balanced protein–water interactions improve properties of disordered proteins and non-specific protein association
,”
J. Chem. Theory Comput.
10
,
5113
5124
(
2014
).
65.
S.
Piana
,
A. G.
Donchev
,
P.
Robustelli
, and
D. E.
Shaw
, “
Water dispersion interactions strongly influence simulated structural properties of disordered protein states
,”
J. Phys. Chem. B
119
,
5113
5123
(
2015
).
66.
S.
Boonstra
,
P. R.
Onck
, and
E.
van der Giessen
, “
CHARMM TIP3P water model suppresses peptide folding by solvating the unfolded state
,”
J. Phys. Chem. B
120
,
3692
3698
(
2016
).
67.
P.
Robustelli
,
S.
Piana
, and
D. E.
Shaw
, “
Developing a molecular dynamics force field for both folded and disordered protein states
,”
Proc. Natl. Acad. Sci. U. S. A.
115
,
E4758
E4766
(
2018
).
68.
R.
Anandakrishnan
,
S.
Izadi
, and
A. V.
Onufriev
, “
Why computed protein folding landscapes are sensitive to the water model
,”
J. Chem. Theory Comput.
15
,
625
636
(
2018
).
69.
C.
Tian
,
K.
Kasavajhala
,
K. A. A.
Belfon
,
L.
Raguette
,
H.
Huang
,
A. N.
Migues
,
J.
Bickel
,
Y.
Wang
,
J.
Pincay
,
Q.
Wu
et al., “
ff19SB: Amino-acid-specific protein backbone parameters trained against quantum mechanics energy surfaces in solution
,”
J. Chem. Theory Comput.
16
,
528
552
(
2019
).
70.
W. L.
Jorgensen
,
J.
Chandrasekhar
,
J. D.
Madura
,
R. W.
Impey
, and
M. L.
Klein
, “
Comparison of simple potential functions for simulating liquid water
,”
J. Chem. Phys.
79
,
926
935
(
1983
).
71.
A. D.
MacKerell
, Jr.
,
D.
Bashford
,
M.
Bellott
,
R. L.
Dunbrack
, Jr.
,
J. D.
Evanseck
,
M. J.
Field
,
S.
Fischer
,
J.
Gao
,
H.
Guo
,
S.
Ha
et al., “
All-atom empirical potential for molecular modeling and dynamics studies of proteins
,”
J. Phys. Chem. B
102
,
3586
3616
(
1998
).
72.
G.
Lamoureux
,
E.
Harder
,
I. V.
Vorobyov
,
B.
Roux
, and
A. D.
MacKerell
, Jr.
, “
A polarizable model of water for molecular dynamics simulations of biomolecules
,”
Chem. Phys. Lett.
418
,
245
249
(
2006
).
73.
H. J. C.
Berendsen
,
J. P. M.
Postma
,
W. F.
van Gunsteren
, and
J.
Hermans
, in
Intermolecular Forces
, edited by
B.
Pullman
(
Reidel
,
Dordrecht
,
1982
), p.
331
.
74.
K.
Lindorff-Larsen
,
S.
Piana
,
R. O.
Dror
, and
D. E.
Shaw
, “
How fast-folding proteins fold
,”
Science
334
,
517
520
(
2011
).
75.
S.
Honda
,
T.
Akiba
,
Y. S.
Kato
,
Y.
Sawada
,
M.
Sekijima
,
M.
Ishimura
,
A.
Ooishi
,
H.
Watanabe
,
T.
Odahara
, and
K.
Harata
, “
Crystal structure of a ten-amino acid protein
,”
J. Am. Chem. Soc.
130
,
15327
15331
(
2008
).
76.
C. M.
Davis
,
S.
Xiao
,
D. P.
Raleigh
, and
R. B.
Dyer
, “
Raising the speed limit for β-hairpin formation
,”
J. Am. Chem. Soc.
134
,
14476
14482
(
2012
).
77.
J.
Kubelka
,
T. K.
Chiu
,
D. R.
Davies
,
W. A.
Eaton
, and
J.
Hofrichter
, “
Sub-microsecond protein folding
,”
J. Mol. Biol.
359
,
546
553
(
2006
).
78.
M.
Jäger
,
Y.
Zhang
,
J.
Bieschke
,
H.
Nguyen
,
M.
Dendle
,
M. E.
Bowman
,
J. P.
Noel
,
M.
Gruebele
, and
J. W.
Kelly
, “
Structure–function–folding relationship in a WW domain
,”
Proc. Natl. Acad. Sci. U. S. A.
103
,
10648
10653
(
2006
).
79.
H.
Gelman
and
M.
Gruebele
, “
Fast protein folding kinetics
,”
Q. Rev. Biophys.
47
,
95
142
(
2014
).
80.
Y.-P.
Pang
, “
How fast fast-folding proteins fold in silico
,”
Biochem. Biophys. Res. Commun.
492
,
135
139
(
2017
).
81.
P.
Kührová
,
A.
De Simone
,
M.
Otyepka
, and
R. B.
Best
, “
Force-field dependence of chignolin folding and misfolding: Comparison with experiment and redesign
,”
Biophys. J.
102
,
1897
1906
(
2012
).
82.
M. P. D.
Hatfield
,
R. F.
Murphy
, and
S.
Lovas
, “
The CLN025 decapeptide retains a β-hairpin conformation in urea and guanidinium chloride
,”
J. Phys. Chem. B
115
,
4971
4981
(
2011
).
83.
K. A.
McKiernan
,
B. E.
Husic
, and
V. S.
Pande
, “
Modeling the mechanism of CLN025 β-hairpin formation
,”
J. Chem. Phys.
147
,
104107
(
2017
).
84.
J.
Kubelka
,
W. A.
Eaton
, and
J.
Hofrichter
, “
Experimental tests of Villin subdomain folding simulations
,”
J. Mol. Biol.
329
,
625
630
(
2003
).
85.
T. K.
Chiu
,
J.
Kubelka
,
R.
Herbst-Irmer
,
W. A.
Eaton
,
J.
Hofrichter
, and
D. R.
Davies
, “
High-resolution x-ray crystal structures of the Villin headpiece subdomain, an ultrafast folding protein
,”
Proc. Natl. Acad. Sci. U. S. A.
102
,
7517
7522
(
2005
).
86.
S.
Xiao
,
Y.
Bi
,
B.
Shan
, and
D. P.
Raleigh
, “
Analysis of core packing in a cooperatively folded miniature protein: The ultrafast folding Villin headpiece helical subdomain
,”
Biochemistry
48
,
4607
4616
(
2009
).
87.
K. A.
Beauchamp
,
D. L.
Ensign
,
R.
Das
, and
V. S.
Pande
, “
Quantitative comparison of Villin headpiece subdomain simulations and triplet–triplet energy transfer experiments
,”
Proc. Natl. Acad. Sci. U. S. A.
108
,
12734
12739
(
2011
).
88.
T.
Cellmer
,
M.
Buscaglia
,
E. R.
Henry
,
J.
Hofrichter
, and
W. A.
Eaton
, “
Making connections between ultrafast protein folding kinetics and molecular dynamics simulations
,”
Proc. Natl. Acad. Sci. U. S. A.
108
,
6103
6108
(
2011
).
89.
H. S.
Chung
,
T.
Cellmer
,
J. M.
Louis
, and
W. A.
Eaton
, “
Measuring ultrafast protein folding rates from photon-by-photon analysis of single molecule fluorescence trajectories
,”
Chem. Phys.
422
,
229
237
(
2013
).
90.
G.
Žoldák
,
J.
Stigler
,
B.
Pelz
,
H.
Li
, and
M.
Rief
, “
Ultrafast folding kinetics and cooperativity of Villin headpiece in single-molecule force spectroscopy
,”
Proc. Natl. Acad. Sci. U. S. A.
110
,
18156
18161
(
2013
).
91.
S.
Nagarajan
,
S.
Xiao
,
D. P.
Raleigh
, and
R. B.
Dyer
, “
Heterogeneity in the folding of Villin headpiece subdomain HP36
,”
J. Phys. Chem. B
122
,
11640
11648
(
2018
).
92.
P. L.
Freddolino
and
K.
Schulten
, “
Common structural transitions in explicit-solvent simulations of Villin headpiece folding
,”
Biophys. J.
97
,
2338
2347
(
2009
).
93.
J.
Mittal
and
R. B.
Best
, “
Tackling force-field bias in protein folding simulations: Folding of Villin HP35 and Pin WW domains in explicit water
,”
Biophys. J.
99
,
L26
L28
(
2010
).
94.
S.
Piana
,
K.
Lindorff-Larsen
, and
D. E.
Shaw
, “
How robust are protein folding simulations with respect to force field parameterization?
,”
Biophys. J.
100
,
L47
L49
(
2011
).
95.
S.
Piana
,
K.
Lindorff-Larsen
, and
D. E.
Shaw
, “
Protein folding kinetics and thermodynamics from atomistic simulation
,”
Proc. Natl. Acad. Sci. U. S. A.
109
,
17845
17850
(
2012
).
96.
E.
Wang
,
P.
Tao
,
J.
Wang
, and
Y.
Xiao
, “
A novel folding pathway of the Villin headpiece subdomain HP35
,”
Phys. Chem. Chem. Phys.
21
,
18219
18226
(
2019
).
97.
H.
Nguyen
,
M.
Jäger
,
A.
Moretto
,
M.
Gruebele
, and
J. W.
Kelly
, “
Tuning the free-energy landscape of a WW domain by temperature, mutation, and truncation
,”
Proc. Natl. Acad. Sci. U. S. A.
100
,
3948
3953
(
2003
).
98.
H.
Nguyen
,
M.
Jäger
,
J. W.
Kelly
, and
M.
Gruebele
, “
Engineering a β-sheet protein toward the folding speed limit
,”
J. Phys. Chem. B
109
,
15182
15186
(
2005
).
99.
F.
Liu
,
D.
Du
,
A. A.
Fuller
,
J. E.
Davoren
,
P.
Wipf
,
J. W.
Kelly
, and
M.
Gruebele
, “
An experimental survey of the transition between two-state and downhill protein folding scenarios
,”
Proc. Natl. Acad. Sci. U. S. A.
105
,
2369
2374
(
2008
).
100.
E. S.
Cobos
,
M.
Iglesias-Bexiga
,
J.
Ruiz-Sanz
,
P. L.
Mateo
,
I.
Luque
, and
J. C.
Martinez
, “
Thermodynamic characterization of the folding equilibrium of the human Nedd4-WW4 domain: At the frontiers of cooperative folding
,”
Biochemistry
48
,
8712
8720
(
2009
).
101.
S.
Piana
,
K.
Sarkar
,
K.
Lindorff-Larsen
,
M.
Guo
,
M.
Gruebele
, and
D. E.
Shaw
, “
Computational design and experimental testing of the fastest-folding β-sheet protein
,”
J. Mol. Biol.
405
,
43
48
(
2011
).
102.
M.
Iglesias-Bexiga
,
M.
Szczepaniak
,
C.
Sánchez de Medina
,
E. S.
Cobos
,
R.
Godoy-Ruiz
,
J. C.
Martinez
,
V.
Muñoz
, and
I.
Luque
, “
Protein folding cooperativity and thermodynamic barriers of the simplest β-sheet fold: A survey of WW domains
,”
J. Phys. Chem. B
122
,
11058
11071
(
2018
).
103.
P. L.
Freddolino
,
F.
Liu
,
M.
Gruebele
, and
K.
Schulten
, “
Ten-microsecond molecular dynamics simulation of a fast-folding WW domain
,”
Biophys. J.
94
,
L75
L77
(
2008
).
104.
K.
Vandivort
,
J. C.
Phillips
,
E.
Villa
,
P. L.
Freddolino
,
J.
Gumbart
,
L. G.
Trabuco
,
D. E.
Chandler
,
J.
Hsin
,
C. B.
Harrison
,
L.
Kalé
 et al., “
Long time and large size molecular dynamics simulations made feasible through new TeraGrid hardware and software
,” in
Proceedings of the 2008 TeraGrid Conference
,
2008
.
105.
P. L.
Freddolino
,
S.
Park
,
B.
Roux
, and
K.
Schulten
, “
Force field bias in protein folding simulations
,”
Biophys. J.
96
,
3772
3780
(
2009
).
106.
D. L.
Ensign
and
V. S.
Pande
, “
The Fip35 WW domain folds with structural and mechanistic heterogeneity in molecular dynamics simulations
,”
Biophys. J.
96
,
L53
L55
(
2009
).
107.
D. E.
Shaw
,
P.
Maragakis
,
K.
Lindorff-Larsen
,
S.
Piana
,
R. O.
Dror
,
M. P.
Eastwood
,
J. A.
Bank
,
J. M.
Jumper
,
J. K.
Salmon
,
Y.
Shan
et al., “
Atomic-level characterization of the structural dynamics of proteins
,”
Science
330
,
341
346
(
2010
).
108.
G.
Berezovska
,
D.
Prada-Gracia
, and
F.
Rao
, “
Consensus for the Fip35 folding mechanism?
,”
J. Chem. Phys.
139
,
035102
(
2013
).
109.
F.
Jiang
and
Y.-D.
Wu
, “
Folding of fourteen small proteins with a residue-specific force field and replica-exchange molecular dynamics
,”
J. Am. Chem. Soc.
136
,
9536
9539
(
2014
).
110.
L.
Zanetti-Polzi
,
C. M.
Davis
,
M.
Gruebele
,
R. B.
Dyer
,
A.
Amadei
, and
I.
Daidone
, “
Parallel folding pathways of Fip35 WW domain explained by infrared spectra and their computer simulation
,”
FEBS Lett.
591
,
3265
3275
(
2017
).
111.
D. A.
Case
,
I. Y.
Ben-Shalom
,
S. R.
Brozell
,
D. S.
Cerutti
,
T. E.
Cheatham
 III
,
V. W. D.
Cruzeiro
,
T. A.
Darden
,
R. E.
Duke
,
D.
Ghoreishi
,
G.
Giambasu
,
T.
Giese
,
M. K.
Gilson
,
H.
Gohlke
,
A. W.
Goetz
,
D.
Greene
,
R.
Harris
,
N.
Homeyer
,
Y.
Huang
,
S.
Izadi
,
A.
Kovalenko
,
R.
Krasny
,
T.
Kurtzman
,
T. S.
Lee
,
S.
LeGrand
,
P.
Li
,
C.
Lin
,
J.
Liu
,
T.
Luchko
,
R.
Luo
,
V.
Man
,
D. J.
Mermelstein
,
K. M.
Merz
,
Y.
Miao
,
G.
Monard
,
C.
Nguyen
,
H.
Nguyen
,
A.
Onufriev
,
F.
Pan
,
R.
Qi
,
D. R.
Roe
,
A.
Roitberg
,
C.
Sagui
,
S.
Schott-Verdugo
,
J.
Shen
,
C. L.
Simmerling
,
J.
Smith
,
J.
Swails
,
R. C.
Walker
,
J.
Wang
,
H.
Wei
,
L.
Wilson
,
R. M.
Wolf
,
X.
Wu
,
L.
Xiao
,
Y.
Xiong
,
D. M.
York
, and
P. A.
Kollman
, AMBER 2019,
University of California
,
San Francisco
,
2019
.
112.
H. G.
Wallnoefer
,
S.
Handschuh
,
K. R.
Liedl
, and
T.
Fox
, “
Stabilizing of a globular protein by a highly complex water network: A molecular dynamics simulation study on factor Xa
,”
J. Phys. Chem. B
114
,
7405
7412
(
2010
).
113.
S. A.
Adelman
and
J. D.
Doll
, “
Generalized Langevin equation approach for atom/solid-surface scattering: General formulation for classical scattering off harmonic solids
,”
J. Chem. Phys.
64
,
2375
2388
(
1976
).
114.
H. J. C.
Berendsen
,
J. P. M.
Postma
,
W. F.
van Gunsteren
,
A.
DiNola
, and
J. R.
Haak
, “
Molecular dynamics with coupling to an external bath
,”
J. Chem. Phys.
81
,
3684
3690
(
1984
).
115.
T.
Darden
,
D.
York
, and
L.
Pedersen
, “
Particle mesh Ewald: An N log (N) method for Ewald sums in large systems
,”
J. Chem. Phys.
98
,
10089
10092
(
1993
).
116.
J.-P.
Ryckaert
,
G.
Ciccotti
, and
H. J. C.
Berendsen
, “
Numerical integration of the cartesian equations of motion of a system with constraints: Molecular dynamics of n-alkanes
,”
J. Comput. Phys.
23
,
327
341
(
1977
).
117.
P.
Eastman
,
J.
Swails
,
J. D.
Chodera
,
R. T.
McGibbon
,
Y.
Zhao
,
K. A.
Beauchamp
,
L.-P.
Wang
,
A. C.
Simmonett
,
M. P.
Harrigan
,
C. D.
Stern
,
R. P.
Wiewiora
,
B. R.
Brooks
, and
V. S.
Pande
, “
OpenMM 7: Rapid development of high performance algorithms for molecular dynamics
,”
PLoS Comput. Biol.
13
,
e1005659
(
2017
).
118.
U.
Essmann
,
L.
Perera
,
M. L.
Berkowitz
,
T.
Darden
,
H.
Lee
, and
L. G.
Pedersen
, “
A smooth particle mesh Ewald method
,”
J. Chem. Phys.
103
,
8577
8593
(
1995
).
119.
K.-H.
Chow
and
D. M.
Ferguson
, “
Isothermal–isobaric molecular dynamics simulations with Monte Carlo volume sampling
,”
Comput. Phys. Commun.
91
,
283
289
(
1995
).
120.
J.
Åqvist
,
P.
Wennerström
,
M.
Nervall
,
S.
Bjelic
, and
B. O.
Brandsdal
, “
Molecular dynamics simulations of water and biomolecules with a Monte Carlo constant pressure algorithm
,”
Chem. Phys. Lett.
384
,
288
294
(
2004
).
121.
J.
Huang
,
J. A.
Lemkul
,
P. K.
Eastman
, and
A. D.
MacKerell
, Jr.
, “
Molecular dynamics simulations using the Drude polarizable force field on GPUs with OpenMM: Implementation, validation, and benchmarks
,”
J. Comput. Chem.
39
,
1682
1689
(
2018
).
122.
G.
Lamoureux
and
B.
Roux
, “
Modeling induced polarization with classical Drude oscillators: Theory and molecular dynamics simulation algorithm
,”
J. Chem. Phys.
119
,
3025
3039
(
2003
).
123.
M. J.
Abraham
,
T.
Murtola
,
R.
Schulz
,
S.
Páll
,
J. C.
Smith
,
B.
Hess
, and
E.
Lindahl
, “
Gromacs: High performance molecular simulations through multi-level parallelism from laptops to supercomputers
,”
SoftwareX
1-2
,
19
25
(
2015
).
124.
M.
Parrinello
and
A.
Rahman
, “
Polymorphic transitions in single crystals: A new molecular dynamics method
,”
J. Appl. Phys.
52
,
7182
7190
(
1981
).
125.
B.
Hess
, “
P-LINCS: A parallel linear constraint solver for molecular simulation
,”
J. Chem. Theory Comput.
4
,
116
122
(
2008
).
126.
W.
Kabsch
and
C.
Sander
, “
Dictionary of protein secondary structure: Pattern recognition of hydrogen-bonded and geometrical features
,”
Biopolymers
22
,
2577
2637
(
1983
).
127.
R. B.
Best
,
G.
Hummer
, and
W. A.
Eaton
, “
Native contacts determine protein folding mechanisms in atomistic simulations
,”
Proc. Natl. Acad. Sci. U. S. A.
110
,
17874
17879
(
2013
).
128.
G.
Pérez-Hernández
,
F.
Paul
,
T.
Giorgino
,
G.
De Fabritiis
, and
F.
Noé
, “
Identification of slow molecular order parameters for Markov model construction
,”
J. Chem. Phys.
139
,
015102
(
2013
).
129.
C. R.
Schwantes
and
V. S.
Pande
, “
Improvements in Markov state model construction reveal many non-native interactions in the folding of NTL9
,”
J. Chem. Theory Comput.
9
,
2000
2009
(
2013
).
130.
J.-H.
Prinz
,
H.
Wu
,
M.
Sarich
,
B.
Keller
,
M.
Senne
,
M.
Held
,
J. D.
Chodera
,
C.
Schütte
, and
F.
Noé
, “
Markov models of molecular kinetics: Generation and validation
,”
J. Chem. Phys.
134
,
174105
(
2011
).
131.
B. E.
Husic
and
V. S.
Pande
, “
Markov state models: From an art to a science
,”
J. Am. Chem. Soc.
140
,
2386
2396
(
2018
).
132.
S.
Röblitz
and
M.
Weber
, “
Fuzzy spectral clustering by PCCA+: Application to Markov state models and data classification
,”
Adv. Data Anal. Classif.
7
,
147
179
(
2013
).
133.
M. K.
Scherer
,
B.
Trendelkamp-Schroer
,
F.
Paul
,
G.
Pérez-Hernández
,
M.
Hoffmann
,
N.
Plattner
,
C.
Wehmeyer
,
J.-H.
Prinz
, and
F.
Noé
, “
PyEMMA 2: A software package for estimation, validation, and analysis of Markov models
,”
J. Chem. Theory Comput.
11
,
5525
5542
(
2015
).
134.
J.
Kubelka
,
J.
Hofrichter
, and
W. A.
Eaton
, “
The protein folding ‘speed limit
,’”
Curr. Opin. Struct. Biol.
14
,
76
88
(
2004
).
135.
K.
Lindorff-Larsen
,
P.
Maragakis
,
S.
Piana
,
M. P.
Eastwood
,
R. O.
Dror
, and
D. E.
Shaw
, “
Systematic validation of protein force fields against experimental data
,”
PLoS One
7
,
e32131
(
2012
).
136.
T.
Sun
,
C.
Wei
,
N. W. C.
Neo
, and
D.
Zhang
, “
Misfolding of a polyalanine variant due to lack of electrostatic polarization effects
,”
Theor. Chem. Acc.
132
,
1354
(
2013
).
137.
Y.
Kuroda
,
D.
Hamada
,
T.
Tanaka
, and
Y.
Goto
, “
High helicity of peptide fragments corresponding to β-strand regions of β-lactoglobulin observed by 2d-NMR spectroscopy
,”
Fold Des.
1
,
255
263
(
1996
).
138.
D.
Hamada
,
S.-i.
Segawa
, and
Y.
Goto
, “
Non-native α-helical intermediate in the refolding of β-lactoglobulin, a predominantly β-sheet protein
,”
Nat. Struct. Biol.
3
,
868
873
(
1996
).
139.
R. B.
Best
,
D.
de Sancho
, and
J.
Mittal
, “
Residue-specific α-helix propensities from molecular simulation
,”
Biophys. J.
102
,
1462
1467
(
2012
).
140.
A.
Perez
,
J. L.
MacCallum
,
E.
Brini
,
C.
Simmerling
, and
K. A.
Dill
, “
Grid-based backbone correction to the ff12SB protein force field for implicit-solvent simulations
,”
J. Chem. Theory Comput.
11
,
4770
4779
(
2015
).
141.
J.
Henriques
,
C.
Cragnell
, and
M.
Skepö
, “
Molecular dynamics simulations of intrinsically disordered proteins: Force field evaluation and comparison with experiment
,”
J. Chem. Theory Comput.
11
,
3420
3431
(
2015
).
142.
D. J.
Kozuch
,
F. H.
Stillinger
, and
P. G.
Debenedetti
, “
Low temperature protein refolding suggested by molecular simulation
,”
J. Chem. Phys.
151
,
185101
(
2019
).
143.
P. H.
Handle
,
T.
Loerting
, and
F.
Sciortino
, “
Supercooled and glassy water: Metastable liquid(s), amorphous solid(s), and a no-man’s land
,”
Proc. Natl. Acad. Sci. U. S. A.
114
,
13336
13344
(
2017
).
144.
J.
Bachler
,
P. H.
Handle
,
N.
Giovambattista
, and
T.
Loerting
, “
Glass polymorphism and liquid–liquid phase transition in aqueous solutions: Experiments and computer simulations
,”
Phys. Chem. Chem. Phys.
21
,
23238
23268
(
2019
).

Supplementary Material