Understanding the permeation of biomolecules through cellular membranes is critical for many biotechnological applications, including targeted drug delivery, pathogen detection, and the development of new antibiotics. To this end, computer simulations are routinely used to probe the underlying mechanisms of membrane permeation. Despite great progress and continued development, permeation simulations of realistic systems (e.g., more complex drug molecules or biologics through heterogeneous membranes) remain extremely challenging if not intractable. In this work, we combine molecular dynamics simulations with transition-tempered metadynamics and techniques from the variational approach to conformational dynamics to study the permeation mechanism of a drug molecule, trimethoprim, through a multicomponent membrane. We show that collective variables (CVs) obtained from an unsupervised machine learning algorithm called time-structure based Independent Component Analysis (tICA) improve performance and substantially accelerate convergence of permeation potential of mean force (PMF) calculations. The addition of cholesterol to the lipid bilayer is shown to increase both the width and height of the free energy barrier due to a condensing effect (lower area per lipid) and increase bilayer thickness. Additionally, the tICA CVs reveal a subtle effect of cholesterol increasing the resistance to permeation in the lipid head group region, which is not observed when canonical CVs are used. We conclude that the use of tICA CVs can enable more efficient PMF calculations with additional insight into the permeation mechanism.

Fundamental insight into how molecules passively permeate through biological membranes remains important in areas such as targeted drug delivery, pathogen detection, and cellular communication. The complexity of these processes is partially due to the impressive variety of amphipathic molecules with distinct chemical structures (e.g., phospholipids, ceramide-based sphingolipids, and cholesterol) that make up biomembranes. The composition varies not only across cell types and organelles but also between bilayer leaflets and sometimes membrane domains, such as the membrane above a budding HIV virion.1 In addition to composition, the packing and fluidity of a bilayer, as determined by the organization and interactions between membrane components,2 further influence membrane permeability. These trends were captured in early studies showing that the composition of cell membranes affects the transport of drug molecules.3–5 Although computer simulations have provided invaluable insight into membrane permeation processes to date, they have often been limited to simple model membranes, such as a single-component bilayer, due to computational and algorithmic limitations. Moving forward, it is essential to extend existing computational modeling methods to the transport of drug molecules through increasingly complex and realistic membrane models.

Two common challenges for molecular dynamics (MD) simulations, including those of membrane permeation, are (1) limited attainable simulation time juxtaposed to rugged free energy surfaces and (2) loss of information when many dimensional free energy surfaces are projected onto just a few degrees of freedom that we can interpret. Focusing on the first, any minimum on the free energy surface deeper than a few kcal/mol effectively traps the system, e.g., a permeating molecule, for long simulation times. For membranes, the slow mixing of lipids within a bilayer further complicates the free energy surface such that converging free energy profiles of membrane permeation can be prohibitively expensive (years of computer time for a typical drug molecule). In the absence of this limitation, such as for very small molecules, permeation can be well characterized with conventional MD. For example, Ghysels et al. investigated permeation of water and oxygen through a lipid bilayer composed of dipalmitoylphosphatidylcholine (DPPC), 1,2-dioleoyl-sn-glycero-3-phosphocholin (DOPC), and cholesterol and showed that these small molecules are able to cross a bilayer leaflet through the boundary regions enriched by DOPC and cholesterol molecules.6 

For larger molecules or more rugged free energy profiles, however, the standard approach to reducing long computation times has been to combine MD with some form of enhanced free energy sampling.7 For example, Lee et al.8 computed the permeation free energy profile (or potential of mean force; PMF) of urea, benzoic acid, and codeine through a pure dimyristoylphosphatidylcholine (DMPC) bilayer using several different enhanced sampling methods (umbrella sampling, replica exchange umbrella sampling, adaptive biasing force, and multiple-walker adaptive biasing force) and concluded each provided similar accuracy. Similarly, the MARTINI coarse-grained model was used to test various free energy sampling methods (umbrella sampling, replica exchange umbrella, and metadynamics) by using different sets of collective variables and found each method provided similar results.9 Bennion et al.10 employed umbrella sampling to calculate permeation PMFs for a range of compounds through a DOPC bilayer and qualitatively predicted their relative permeability coefficients. Bochicchio et al.11 compared the performance of umbrella sampling, non-tempered metadynamics (MetaD), and well-tempered MetaD (WTMetaD) for calculating the permeation PMF of polyethylene and polypropylene oligomers through a 1-palmitoyl-2-oleoyl-phosphatidylcholine (POPC) bilayer. Even though MetaD was reported to be more efficient and has lower statistical uncertainties, both MetaD and umbrella sampling yielded the same PMF estimates for the association of these polymeric molecules with the lipid bilayer. Neale et al.12 employed a modified umbrella sampling with Hamiltonian exchange to enable random walks along the bilayer normal and demonstrated a faster convergence in the estimation of PMF for the insertion of the ionic side chain analog of arginine into a POPC bilayer. Ghaemi et al.13 studied the permeation of ethanol through a POPC membrane by using bias-exchange MetaD with multiple collective variables (CVs) such as the z-component distance between the permeating molecule and the bilayer, the torsional angle of the molecule, and the degree of solvation/desolvation. They found that the obtained permeability coefficients agreed well with those obtained from long unbiased MD simulations but also that the use of a single CV results in significant memory effects in the force correlation function. Sun et al. used transition-tempered metadynamics (TTMetaD) to understand the permeation mechanism14 and membrane permeability15 of a large drug molecule called trimethoprim through a POPC bilayer. They demonstrated that faster convergence was achieved by using multiple CVs describing the distance normal to the bilayer as well as the orientation of the drug molecule. Recently, the effect of the charge of the permeant on the permeability was investigated by using a combination of constant-pH MD simulations and free energy sampling methods.16 This study revealed how permeation of charged species through a POPC bilayer can be significantly altered by dynamic neutralization at the membrane surface, which explains why conventional models often overestimate the permeability for these molecules.

Computational investigations have also started to focus on the effect of membrane composition and physical properties on the free energy landscape of permeation. For example, cholesterol was found to influence permeation for certain types of drug-like molecules in a recent study employing adaptive biasing force (ABF) calculations.17 The permeation of a cancer drug, cisplatin, through a heterogeneous membrane that mimics a cancer cell [with increased fractions of 1,2-dioleoyl-sn-glycero-3-phosphoethanolamine (DOPE) and 1,2-dioleoyl-sn-glycero-3-phospho-L-serine (DOPS) in the outer monolayer] was shown to be significantly decreased compared to that through a normal membrane [asymmetric bilayer composed of sphingomyelin (SM), DOPC, DOPS, and DOPE with larger fractions of SM and DOPC in the outer monolayer].18 Yesylevskyy et al. used umbrella sampling to investigate the effect of membrane curvature on the permeability of anti-cancer drugs cisplatin and gemcitabine, and found that highly curved membranes increase their permeability.19 Cao et al. used bias-exchange MetaD simulations to query the effect of cholesterol on the permeation of amino acids through a membrane composed of cholesterol and DPPC and found that it also depended on the type of amino acid; the energy barrier of protonated arginine permeation linearly increases with higher cholesterol concentration, while the barrier of tryptophan first increases and then remains unchanged with a further increase in cholesterol concentration.20 The authors suggested that the origin of this difference could be associated with different hydration defects during permeation. A review by Neale and Pomès connected the thickness and bending modulus of a membrane with the magnitude of the central free energy barrier, suggesting that the barrier decreases with increasing flexibility.21 Following this logic, cholesterol should increase the barrier while increasing bilayer rigidity. Bennett et al. also showed that the rigidity, thickness, and order of the bilayers increase with a higher concentration of cholesterol.22 They found that the free energy barrier for transferring the headgroup of DPPC to the bilayer center increases with increasing cholesterol concentration. Finally, the influence of an ordered vs disordered lipid phase was shown to influence the permeation free energy profiles of oxygen and water in the aforementioned study by Ghysels et al.6 

In contrast to the previous findings are the results from a recent study by Tse et al.,23 in which adaptive biasing force simulations were used to investigate the effect of membrane composition on the permeation of ethanol through six single-component membranes (DLPC, DLPE, DMPC, DOPC, DOPS, and POPE) as well as one asymmetric membrane composed of 25 lipid types and cholesterol that mimics a mammalian cell membrane. The authors concluded that the permeability profiles from single-component membranes (except for those through lipids with very short tail groups and bulky headgroups) agreed well with those through the multicomponent mammalian cell membrane and therefore, that, single-component membranes can be considered as good approximate models for more realistic membranes. They also demonstrated that membrane permeability can be calculated without estimating the complete PMF and diffusivity profiles as the solubility-diffusion integral is dominated by only small regions of the PMF. Using free energy perturbation, they were able to calculate the free energy at particular points on the free energy landscape and obtain the complete PMF by interpolating through the free energy at the selected points. While insightful and potentially useful, these findings should be cautiously extended to other molecules. Focusing instead on the influence of small molecules (constructed by two neutral MARTINI beads) on phase separation, ternary mixed membranes (DPPC/DLiPC/CHOL) were investigated with MARTINI coarse-grained (CG) simulations and umbrella sampling.24 The authors demonstrated a linear correlation between the strength of phase separation in the mixed membrane and preferential partitioning of the small molecules. They also measured transfer free energy to understand small molecule-induced mixing and demixing effects and found that the demixing effects of small molecules increase as the molecule becomes less hydrophobic. Various other applications of enhanced sampling methods to the computation of the permeation PMFs are comprehensively discussed in a recent review by Venable, Krämer, and Pastor.25 

While the above examples demonstrate how a range of enhanced sampling methods are enabling permeation calculations through mixed lipid compositions, these model membranes are still far more simple than real biomembranes. Moreover, computational expense for complex molecules is often still prohibitive. Thus, an ongoing methodological challenge in the field is to increase the efficiency, and arguably accuracy, of permeation simulations. A key factor in this, beyond the particular algorithm chosen for the enhanced free energy sampling, is the choice of CVs used to describe the permeation process (i.e., finding representative reaction coordinates). Ultimately, convergence and accuracy both rely on finding the degrees of freedom that describe the reaction of interest. Selecting the best CVs, however, is far from easy. Current approaches typically require a time intensive, manual, and often intuition-based process that leads to CVs that seem to work, but the degree to which they work is often unknown. Invariably, other CVs exist that are key to both efficiency and accuracy.

Increasingly, machine learning approaches are suggesting new routes to identify optimal CVs.26 One strategy is using time-structure based Independent Component Analysis (tICA),27–29 an unsupervised machine learning technique that finds degrees of freedom that decorrelate slowly. It is the slow dynamics that limits the exploration of a free energy surface and, thus, the slow degrees of freedom aligned with the process of interest that are the optimal CVs in enhanced free energy sampling simulations.30 The tICA method was used in combination with regular MetaD31 and WTMetaD32 to study the behaviors of small systems, such as conformational transitions in alanine dipeptide. Both studies showed that tICA MetaD is an effective method for obtaining highly diffusive behavior in CV space and faster convergence. More recently, tICA MetaD was employed on a more complex problem to understand protein conformational transitions of L99A T4 lysozyme,33 which is a moderate size protein and, thus, involves many slow degrees of freedom. This work demonstrated that tICA CVs accelerate the convergence of the PMF calculations and allowed the exploration of new intermediate states as well as the estimation of the free energy difference between the states in excellent agreement with experiments. tICA MetaD was also employed on non-biological systems to investigate crystallization of Na and Al metals.34 Many other developments and applications of machine learning in identifying collective variables in enhanced sampling in biomolecular simulations are discussed in a recent review by Sidky, Chen, and Ferguson.35 

In this work, we simulate a POPC bilayer and a POPC:CHOL mixture to understand the effect of cholesterol on the transport characteristics of a drug molecule called trimethoprim. As the cholesterol level in the membrane varies between different human cell types,36 effects of cholesterol on the transport properties obtained from simulations can be useful to improve the design of drug molecules targeting particular cells and tissues. Variation in the cholesterol level is even more pronounced between the plasma membrane and organelle membranes. Since the subcellular localization of drug molecules after uptake has critical effects on the therapeutic efficiency and side effects,37 it is also important to understand the role of cholesterol on localization for optimal drug design. We extend the previously proposed TTMetaD permeation method to study permeation of trimethoprim through a heterogeneous membrane model, specifically the POPC:CHOL (9:1) mixture. In addition, we apply tICA to determine the slow dynamics controlling the orientation of the drug molecule that limits the exploration of the free energy landscape in initial TTMetaD runs of our membrane–drug systems and then use the resulting coordinates as CVs in subsequent TTMetaD simulations. These CVs demonstrate not only faster convergence but also reveal additional physical aspects of the system and a more accurate permeation free energy profile.

All-atom molecular dynamics simulations were carried out using GROMACS 2016.438 patched with PLUMED 2.4.139 to perform TTMetaD (publicly available in PLUMED version 2.4 and beyond). The pure lipid bilayer was made of 40 POPC molecules, while the heterogeneous lipid bilayer was made of 36 POPC and 4 cholesterol molecules [POPC:CHOL (0.9:0.1)]. The bilayer was placed in the x–y plane and solvated by 2200 water molecules. The simulated system had dimensions of ∼3.6 × 3.6 × 9.5 nm3, with periodic boundary conditions enforced along all directions of the Cartesian space. Finite size effects were tested with larger bilayer systems as described in the supplementary material. For both systems, CHARMM3640 was used for the lipid and CHARMM general force field (CGenFF)41 was used for the drug molecule (trimethoprim). The water molecules were modeled using the TIP3P force field.42 The initial structure of the heterogeneous lipid bilayer was prepared using a CHARMM-GUI membrane builder43 and equilibrated for 150 ns in water. After the equilibration, the drug molecule was randomly placed into the water using PACKMOL.44 The lipid and the water plus drug molecule were individually coupled to two 323 K heat baths using velocity rescaling with a stochastic term.45 Pressure control was also applied by using a Berendsen barostat,46 in which the box was rescaled every 5 ps. The cutoff distance for the short-range neighbor list was 12 Å, and the neighbor list was updated every 40 steps. Fast smooth Particle-Mesh Ewald (SPME)47 was used to compute the long-range electrostatic interaction. All hydrogen bonds were constrained by a linear constraint solver (LINCS),48 and the MD integration time step was 2 fs. The initial velocities were randomly sampled from a Maxwell–Boltzmann distribution with a temperature of 323 K. The images from the simulations were generated with Visual Molecular Dynamics (VMD).49 

Metadynamics (MetaD)50,51 is an enhanced sampling technique that can be used to study physical phenomena occurring at time and length scales beyond the capability of standard MD. In MetaD, a history-dependent bias energy in the form of Gaussian potential [Eq. (1)] is added to the Hamiltonian of the system using predetermined collective variables (CVs) to bias the normal evolution of the system,

VG(s,t)=0tdtwexpi=1dSi(R)Si(R(t))22σi2,
(1)

where σ is the width of the Gaussian for the ith CV, w is the height of the Gaussian, wo, divided by the deposition stride, τ, and SiRt corresponds to the value of the ith CV at time t′. The sampling of rare events is accelerated by discouraging the system from revisiting the previously visited regions in the CV phase space, ideally resulting in ergodic and diffusive motion along the chosen CVs. Summing the negative of the MetaD bias energies provides an underlying free energy profile along the selected CVs.

In the non-tempered MetaD, the height of the Gaussian remains constant during the simulation, which has been shown to prevent convergence and/or destabilize the system due to the addition of too much energy. WTMetaD resolves this issue by introducing a time-dependent bias potential in which the height of the bias decreases exponentially based on the local bias energy [Eq. (2)]. Dama et al. have showed that WTMetaD results in asymptotic convergence of the bias energy to a linearly scaled inverse of the underlying free energy,

w(t)=w0expVG(s,t)kBΔT.
(2)

The parameter ΔT in Eq. (2) tunes the rate of reducing the height of the Gaussian bias and must be chosen prior to the simulation. The ideal choice of ΔT is proportional to the free energy barrier to be crossed,52 which is generally not known in advance. Choosing too small of a value results in failure to escape local minima, while choosing too large of a value can introduce unphysical instability and errors in the PMF. Transition-tempered metadynamics (TTMetaD)52 was developed to converge like well-behaved WTMetaD while overcoming the previously mentioned efficiency issues dependent on choosing the right tempering rate. TTMetaD only tempers once the transition state (TS) region is sampled and then tempers aggressively. This is achieved by replacing the local bias energy VG [Eq. (3)] with a global property V*, which is the minimal bias on the maximally biased path among all the continuous paths s(λ) connecting all of the preselected points in the CV space,

w(t)=w0expV*(s(λ),t)kBΔT.
(3)

The basins surrounding a significant barrier in the free energy landscape are defined by the preselected points in TTMetaD. The Gaussian height does not temper before the basins are connected through the TS as V* remains zero due to the minimal bias on the paths connecting the basins, as discussed above. The underlying free energy landscape is, thus, almost filled when the TS region is first sampled; therefore, the convergence of TTMetaD simulations can be achieved by performing a more aggressive tempering (small ΔT) of the Gaussian height.

A bottleneck in the application of MetaD is the selection of the collective variables (CVs). This step involves assessment of the system to identify the critical “slow” degrees of freedom that mediate membrane permeation. The manual approach for selecting CVs requires, for each new molecule, a time intensive and subjective process and, in general, will not necessarily lead to the best CVs. In addition, it is not possible to select a large number of CVs to describe all the degrees of freedom of the system as the cost of MetaD simulations increases exponentially with the number of CVs. In this paper, we aim at overcoming this problem by using time-structure based Independent Component Analysis (tICA). tICA is an unsupervised machine learning technique designed to find the degrees of freedom where correlation decays slowly31,32 and can be viewed as a linear solution to the Variational Approach to Conformational Dynamics (VAC), a systematic approach for isolating the slow processes central to molecular dynamics simulations. The theoretical background of the tICA approach is described in detail by Sultan and Pande31 and McCarty and Parrinello32 in the recent studies. Herein, we test the efficacy of applying tICA to an initially biased trajectory, which is essential to capture membrane permeation events, in order to identify efficient CVs for further TTMetaD simulations. When using tICA, the state of a molecular system is characterized using various degrees of freedom, or features, which are selected prior to analysis. These features are then linearly combined to produce a new set of coordinates that optimally describe the slow behavior of the system by solving a generalized eigenvalue problem. While tICA provides a new set of coordinates as the output, we only utilize the slowest of this set in this work; the use of multiple tICA coordinates is reserved for future work. In contrast to the work of Sultan and Pande31 and Brotzakis and Parrinello,33 we here apply tICA to the results of an initial metaD simulation using the time scale native to the biased simulation (using the MSMBuilder package53). Rescaling the biased features and lag time in a manner that is appropriate for TTMetaD is non-trivial and will be investigated in future work. After obtaining the trajectory, a set of features were selected to represent the system and a tICA lag time was picked, as explained in detail in the Selection of Features. Then, the covariance matrices central to tICA were computed, and the generalized eigenvalue problem is solved to find a linear approximation to the slowest observed collective modes. Finally, this tICA coordinate is used as a CV in our TTMetaD simulations. While the resulting slow processes observed in the biased simulation do not rigorously relate to those observed in the unbiased system, the resulting procedure is here shown to provide effective CVs while using off-the-shelf analysis methods and minimal simulation data. Additionally, in contrast to previous work, the tICA-derived CV (which helps characterize the orientation of the drug molecule) is combined with a CV describing the translational movement of the drug molecule, which is very similar to the original bias. This is further tested with a naïve center of mass (COM) CV in the initial simulation, which produces consistent efficiency and free energy profiles when combined with corresponding tICA-derived CVs (full details in the supplementary material).

Our selection of features used as the input to tICA was motivated by previous work in which the COM distances between each ring of a drug molecule and the lipid bilayer (referred to as z1, z2 CVs below) were shown to decrease the total simulation time required to converge the permeation PMF.14,15 Two CVs used in that work correspond to the z-component distances between the COMs of the trimethoxybenzyl (z1) and pyrimidine (z2) groups to the lipid bilayer. One benefit of the z1, z2 CVs was their ability to capture orientational changes in the drug molecule relative to the membrane. Despite the improved convergence, it is unlikely that these two degrees of freedom (z1, z2) were optimal relative to the many degrees of freedom that mediate membrane permeation. For the tICA CV described below, five distances between the drug molecule and the center of the bilayer were selected to see if this relatively simple expansion in dimensions better captures the permeation-limiting dynamics.

First, we performed a 70 ns TTMetaD simulation using z1, z2 CVs to obtain an initial trajectory that involves at least one transition of interest (crossing of the drug across the lipid bilayer). Then, we selected the aforementioned set of features that are important for the permeation process and analyzed the initial trajectory using these selected features [dj(R)] to identify the most dominant generalized eigenvector corresponding to the slowest decaying mode via tICA. The corresponding eigenvector was then used as a CV [denoted si(R)] by using it to specify a linear combination of dj(R). In this case, the features are multiple distances between trimethoprim and the lipid bilayer in the z-axis direction (Fig. 1, upper left panel), and the resulting tICA CV is a linear combination of those distances,

si(R)=jNcijdj(R),
(4)
tICACV=c1*Z1+c2*Z2+c3*Z3+c4*Z4+c5*Z5.
(5)

The corresponding expansion coefficients (cj) were obtained at different lag times by solving the eigenvalue equation (Fig. 1, upper right and lower panels). The lag time was chosen to be 4 ns. The coefficients were found to have similar trends for lag times in the range of 2–8 ns. We also extracted the coefficients from an unbiased simulation in which the drug molecule does not cross the membrane. As expected, the resulting coefficients were found to be completely different than those obtained from biased simulations. They were also unstable over different lag times (Fig. S1). The drug molecule remains in the water throughout the unbiased simulation, whereas it permeates through the membrane in the biased simulation, leading to different expansion coefficients obtained from unbiased and biased simulations. The tICA CV corresponding to a lag time of 4 ns is given as

tICACV=0.106*Z1+0.142*Z2+0.107*Z30.082*Z4+0.498*Z5.
(6)

This tICA CV was combined with the z-distance between the center of mass of the drug molecule and the center of the bilayer in the resulting TTMetaD simulations. The center of the bilayer is defined by calculating the COM of the bilayer in the z-direction (normal to the bilayer).

FIG. 1.

Selection of features for tICA analysis. (Upper left panel) Distances in z coordinates between different parts of trimethoprim and the center of the bilayer were tested. (Other panels) tIC loadings for the five selected features at different lag times. The position of the highest weight is shown on the molecule in the lowest panel.

FIG. 1.

Selection of features for tICA analysis. (Upper left panel) Distances in z coordinates between different parts of trimethoprim and the center of the bilayer were tested. (Other panels) tIC loadings for the five selected features at different lag times. The position of the highest weight is shown on the molecule in the lowest panel.

Close modal

The coordination numbers between trimethoprim–POPC and trimethoprim–CHOL (only carbon atoms) was calculated using PLUMED39 and is described as follows:

Coordinationnumber=iεAiεBsij,
(7)
sij=1rijd0r0n1rijd0r0m,
(8)

where sij is 1 or 0 depending on the existence of a contact between atoms i and j, rij is the distance between atoms, and r0 is the cutoff parameter that is set to be 0.35 nm. Other parameters were set to the following values (d0 = 0, n = 12, and m = 24). The coordination numbers between trimethoprim–POPC and trimethoprim–CHOL were normalized based on the number of carbon atoms in POPC and CHOL.

We first applied TTMetaD to obtain the PMF for the permeation of trimethoprim through the POPC/CHOL bilayer using the previous CVs (z1 and z2 distances) designed by Sun et al.14,15 The Gaussian bias energy deposition rate was every 500 steps with a height of 0.05 kJ/mol and a width of 0.24 nm. The height of the Gaussian was tempered with a bias factor of 5. The positions of the TTMetaD basins were selected to be (1.7; 1.7 nm) and (0; 0 nm) for z1 and z2 distances. Four randomly initialized TTMetaD replicas were run for around 2.5 µs, and the resulting PMF [Fig. 2(a)] was obtained by averaging over replicas. One clear difference between the 2D PMFs reported herein and those in the previous study14,15 for the POPC bilayer is the narrower area sampled in the free energy profile. This difference originates from the smaller widths of Gaussian hills used in the current study. The minimum free energy path (MFEP) on the 2D PMF [black curve in Fig. 2(a)] was calculated with a zero temperature string method.54 The MFEP represents the most probable transition path in the ensemble of the permeation processes. The 1D PMF was directly obtained by using the average MFEP from four independent replicas. A comparison between 1D PMFs for POPC15 and POPC/CHOL shows that both the width and height of the free energy barrier increase with the addition of cholesterol (Fig. 3). These changes are consistent with characteristic modifications in the membrane properties, including lower area per lipid and higher bilayer thickness, due to the presence of cholesterol.

FIG. 2.

(a) Permeation potential of mean force for trimethoprim through the POPC/CHOL bilayer using z1, z2 CVs (a) as well as through the POPC bilayer (b) and 9:1 POPC:CHOL bilayer (c) using tICA CVs. The free energy color bar is in kcal/mol. The black curve shows the MFEP found from the string method.

FIG. 2.

(a) Permeation potential of mean force for trimethoprim through the POPC/CHOL bilayer using z1, z2 CVs (a) as well as through the POPC bilayer (b) and 9:1 POPC:CHOL bilayer (c) using tICA CVs. The free energy color bar is in kcal/mol. The black curve shows the MFEP found from the string method.

Close modal
FIG. 3.

1D PMFs from MFEPs in Fig. 2. The free energy in bulk water has been set to zero. Error bars [0.01–0.33 kcal/mol for POPC/CHOL (tICA), 0.03–0.61 kcal/mol for POPC (tICA), and 0.06–0.97 kcal/mol for POPC/CHOL (z1, z2)] are not shown for clarity.

FIG. 3.

1D PMFs from MFEPs in Fig. 2. The free energy in bulk water has been set to zero. Error bars [0.01–0.33 kcal/mol for POPC/CHOL (tICA), 0.03–0.61 kcal/mol for POPC (tICA), and 0.06–0.97 kcal/mol for POPC/CHOL (z1, z2)] are not shown for clarity.

Close modal

We analyzed the bilayer thickness and tail order parameters of both POPC only and POPC/CHOL systems to verify the effect of cholesterol on the membrane properties. The POPC/CHOL bilayer is, indeed, thicker (40.8 ± 0.8 Å vs 38.6 ± 0.7 Å) and more ordered than pure POPC (Fig. 4). The measurement of area per lipid is not straightforward in the case of heterogeneous membranes but the area per lipid is expected to reduce with increased bilayer thickness and tail order parameters.55 Consistent with this, the area per lipid decreases from 66.2 ± 1.4 Å2 in the pure POPC membrane to 59.3 ± 1.3 Å2 in the POPC/CHOL membrane. Similar changes were recently reported by Tse et al. who obtained PMFs for ethanol permeation through different types of lipid bilayers and found that decreased area per lipid and increased thickness produced wider and taller free energy barriers.23 Consistent with our results, they suggested that these trends in heterogeneous membranes were due to the condensing effect of cholesterol molecules.

FIG. 4.

Tail order parameters of POPC lipids in the presence and absence of trimethoprim from unbiased (POPC and POPC/CHOL) and biased simulations (POPC–TRIM and POPC/CHOL–TRIM).

FIG. 4.

Tail order parameters of POPC lipids in the presence and absence of trimethoprim from unbiased (POPC and POPC/CHOL) and biased simulations (POPC–TRIM and POPC/CHOL–TRIM).

Close modal

The tail order parameters were also measured in the presence and absence of trimethoprim for the POPC bilayer and the POPC:CHOL mixture (Fig. 4). The effect of trimethoprim on the order parameter was investigated by averaging over all the lipids that are within 5 Å of trimethoprim. There is an apparent reduction in the lipid tail order parameters for the lipid molecules in the neighbor of trimethoprim both in POPC and POPC:CHOL mixture. As expected, cholesterol molecules increase the tail order parameters of the lipids such that the POPC:CHOL mixture in the presence of trimethoprim is still slightly more ordered than the pure POPC bilayer in the absence of the drug.

Next, we analyzed the coordination between drug (trimethoprim) and membrane components (POPC and CHOL) using only carbon atoms to determine if the drug molecule has a preference to interact with either of the two membrane components. For this analysis, we setup a new bilayer composed of 50% of cholesterol to have an equal number of molecules for each component and ran a simulation with the same settings for 1.25 µs. The drug molecule has no preference toward a specific membrane component although the coordination number of cholesterol molecules is usually slightly higher than that of POPC lipids, especially when trimethoprim is inside the membrane (Fig. 5). These results support the fact that the energy barrier becomes wider and taller with the addition of cholesterol due to changes in the membrane properties, including the bilayer thickness and tail order parameters, instead of increased interactions between trimethoprim and cholesterol molecules.

FIG. 5.

(a) The z-component distance between the centers of masses of trimethoprim and the lipid bilayer as a function of time for the POPC:CHOL (1:1) mixture. (b) A comparison of coordination numbers (only carbon atoms) between trimethoprim–POPC and trimethoprim–CHOL in the POPC:CHOL (1:1) mixture as a function of time. The Z-distance is normalized and, therefore, unitless.

FIG. 5.

(a) The z-component distance between the centers of masses of trimethoprim and the lipid bilayer as a function of time for the POPC:CHOL (1:1) mixture. (b) A comparison of coordination numbers (only carbon atoms) between trimethoprim–POPC and trimethoprim–CHOL in the POPC:CHOL (1:1) mixture as a function of time. The Z-distance is normalized and, therefore, unitless.

Close modal

The time-structure based independent component analysis (tICA) method was used to determine the slow dynamics that limit exploration of the free energy surface of membrane permeation. The characteristics of the system showing slow dynamics determined by tICA were used as CVs in the TTMetaD simulations. We used this dimensionality reduction technique to find the degrees of freedom that do not quickly de-correlate during the permeation by analyzing a short, biased trajectory in which at least one crossing of drug through the membrane was observed. The resulting tICA based CV is a linear combination of the relevant slow structural features with appropriate weights determined from tICA. A more detailed protocol for the selection of features and the construction of the tICA CV can be found in the section titled “Methods.” The second CV was selected to be the z component of the center of mass distance between the membrane and the drug molecule. For tICA TTMetaD simulations, the Gaussian bias energy deposition rate was every 500 steps with a height of 0.03 kJ/mol. The Gaussian widths were 0.768 and 1.536 nm for the tICA CV and the COM CV, respectively. The height of the Gaussian was tempered with a bias factor of 5. The Gaussian widths were increased for the tICA simulations to make the simulations more stable. Three randomly initialized TTMetaD replicas were run for around 1 µs for both POPC only and POPC/CHOL systems. The 2D PMFs [Figs. 2(b) and 2(c)] were obtained by averaging the PMFs from each replica. The MFEP, as before, was obtained by using the average MFEP from the averaged PMF from independent replicas.

A comparison between the 1D PMFs of POPC only and POPC/CHOL systems (Fig. 3) shows similarities to those obtained from z1, z2 CVs: both the width and height of the energy barrier increase with the addition of cholesterol. These results support the notion that cholesterol molecules modify the membrane properties such as bilayer thickness, area per lipid, and tail order parameters, which, in turn, affect the shape of the energy barrier for the permeation of the drug molecule through the lipid bilayer. One of the interesting differences between the PMFs obtained by using two sets of CVs (z1, z2 CVs and tICA CVs) is the difference in the work required for the permeation of the drug through the region below the lipid head group water interface (∼1.4–1.8 nm from the center of the membrane) in the POPC/CHOL system. The tICA CVs demonstrate a slight increase in the PMF with the addition of cholesterol around that region, which is not observed in the z1, z2 CVs. This could result from the condensing effect of cholesterol, as the cholesterol molecules will result in a smaller area in the region below the lipid–water interface, making the permeation of the drug molecule harder. A similar effect of cholesterol on the PMF for the permeation of drug or drug-like molecules through POPC and POPC:CHOL (2:1) bilayers was also reported by Tse et al.17 They observed an overall increase in the height and width of the PMF as well as a larger PMF below the lipid head group–water interface, consistent with our findings with tICA CVs. The z1, z2 CVs were not able to capture this effect of cholesterol on the PMF in our simulations.

To test the efficiency of tICA CVs, a comparison on the number of membrane crossings was made between two sets of CVs (z1, z2 CVs vs tICA CVs) for the simulation of trimethoprim through the POPC:CHOL (9:1) mixture. The number of membrane crossings is an important metric to determine the computational speedup in the permeation simulations as a larger number of crossings will result in a more efficient sampling. This is especially important for simulating the permeation of drug molecules across realistic multicomponent lipid membranes, where each component of the membrane needs to be sampled sufficiently well in order to capture the effect of membrane heterogeneity on permeation. Based on the comparison, the same number of membrane crossings can be obtained in a shorter simulation time (about three times less) by using tICA CVs (Fig. 6). These results indicate that tICA CVs led to more efficient sampling and, therefore, require less simulation time in total. In addition, the results of tICA CVs provide a more accurate physical intuition for the permeation of drug molecule and understanding the effect of cholesterol or heterogeneity on the energetics of the association between the membrane components and the drug molecule.

FIG. 6.

The z-component distance between the centers of masses of trimethoprim and the lipid bilayer as a function of time from TTMetaD simulations with z1, z2 CVs (upper panel) and tICA CVs (lower panel). The z-distance is a normalized distance and, therefore, unitless.

FIG. 6.

The z-component distance between the centers of masses of trimethoprim and the lipid bilayer as a function of time from TTMetaD simulations with z1, z2 CVs (upper panel) and tICA CVs (lower panel). The z-distance is a normalized distance and, therefore, unitless.

Close modal

To understand if the heterogeneity of lipid bilayers has any effect on the organization of drug molecules in the bilayer during permeation, the orientation of trimethoprim was investigated as a function of its distance from the center of the membrane and the types of neighboring molecules. The orientation was described by the angle made by the vector passing through the COM of the trimethoxybenzyl and pyrimidine groups in trimethoprim, and the vector normal to the lipid bilayer. The angle distributions of the drug for two cases (when it is either nearby cholesterol or POPC) indicate that cholesterol disturbs the orientation of trimethoprim [Fig. 7(b)]. Trimethoprim favors more specific configurations [formation of narrower or bimodal angle distribution; see Fig. 7(b), lower panels] when there is a cholesterol molecule nearby, especially around the lipid head group region.

FIG. 7.

Orientation of trimethoprim with respect to the lipid bilayer. (a) Atomistic representation highlighting angle θ (red arrows) with cholesterol and trimethoprim in black and green, respectively, as well as POPC lipids (right). (b) Probability distributions of θ as a function of the z-component distance between the COMs of trimethoprim and the lipid bilayer in the presence (black line) and absence (red line) of nearby cholesterol molecules.

FIG. 7.

Orientation of trimethoprim with respect to the lipid bilayer. (a) Atomistic representation highlighting angle θ (red arrows) with cholesterol and trimethoprim in black and green, respectively, as well as POPC lipids (right). (b) Probability distributions of θ as a function of the z-component distance between the COMs of trimethoprim and the lipid bilayer in the presence (black line) and absence (red line) of nearby cholesterol molecules.

Close modal

To investigate the origin of differences between the PMFs obtained from the simulations using different sets of CVs, we first obtained trimethoprim orientation plots for the z1, z2 CVs (Fig. S7) and compared them to those for the tICA CVs (Fig. 7). There are clear differences between the angle distributions for the z1, z2 and tICA CVs. Specifically, z1, z2 CVs favor angles closer to 0° and 90°, whereas tICA CVs favor angles in the midrange [corresponding representative images are given in Fig. S8(A) for z1, z2 CVs and Fig. S8(B) for tICA CVs]. In addition, the effect of cholesterol molecules on the trimethoprim orientation is larger for the tICA CVs in the region of 1.5–2 nm, showing that both the angle and cholesterol interactions contributed to the difference in the 1D PMFs for the mixed membrane near the head group region. We additionally obtained the number of contacts between trimethoprim and cholesterol molecules as a function of time for the z1, z2 CV and tICA CV simulations (Fig. S9). Both sets of CVs show similar varying numbers of contacts between trimethoprim and cholesterol, suggesting a similar sampling of cholesterol–trimethoprim interactions. Interestingly, the highest number of contacts (e.g., number of contacts above 25) is always observed for the tICA CV, consistent with the tICA CV capturing angles that enabled interactions with cholesterol.

We have combined MD simulations with TTMetaD and machine learning methods (tICA) to characterize the permeation mechanism of trimethoprim through POPC only and POPC/cholesterol membranes. Our simulations demonstrate that the free energy barrier for the drug permeation through a lipid bilayer increases with the addition of cholesterol due to altered membrane properties, including decreased area per lipid, increased order, and increased bilayer thickness. The use of machine-learned tICA CVs was shown to accelerate the convergence of TTMetadD PMF calculations while also revealing additional features in the permeation of trimethoprim through the membrane. Specifically, the tICA CVs reveal an increase in the PMF (decreased stability) with the addition of cholesterol just below the lipid head group region, showing that the condensing effect of cholesterol contributes to permeation resistance near the lipid–water interface. Non-tICA CVs, in contrast, suggest that this region is equally stable to bulk water. The orientation of trimethoprim is additionally influenced by interactions with cholesterol, altering the dominant configurations of trimethoprim in the cholesterol-rich regions in the membrane. Most importantly, this work constitutes a step forward in our effort to effectively reduce many dimensional rare-event processes onto a limited number of reaction coordinates that both enable sufficient computational sampling and capture the relevant free energy profile. Future work will address the convergence and efficacy of tICA-derived CVs with biased simulation time, approach (well-tempered vs transition-tempered), parameters, and rescaling.

See the supplementary material for additional figures showing the tICA weightings from unbiased simulations and analyses of the origin of differences in the PMFs, as well as a description of testing for finite size effects and the efficacy of a naïve initial CV.

The authors thank Dr. Harshwardhan H. Katkar and Professor Greg Voth for helpful discussion. The MD simulations were performed at various supercomputers through Allocation No. MCB200018 with resources provided by the Extreme Science and Engineering Discovery Environment supported by the National Science Foundation (Grant No. ACI-1548562). The support and resources of the Center for High Performance Computing at the University of Utah are also gratefully acknowledged.

F.A. and J.M.J.S. designed the research. F.A., G.C.A.d.H., J.D.M.N., and M.I.O. carried out the simulations and analysis. All authors interpreted the results and wrote the manuscript.

The data that support the findings of this study are available from the corresponding author upon reasonable request.

1.
M.
Guyader
 et al,
J. Virol.
76
,
10356
(
2002
).
2.
G.
Vereb
 et al,
Proc. Natl. Acad. Sci. U. S. A.
100
,
8053
(
2003
).
3.
A.
Preetha
,
N.
Huilgol
, and
R.
Banerjee
,
Colloids Surf., B
53
,
179
(
2006
).
4.
J.
Zhang
 et al,
Biophys. J.
92
,
3988
(
2007
).
5.
C.
Peetla
,
A.
Stine
, and
V.
Labhasetwar
,
Mol. Pharm.
6
,
1264
(
2009
).
6.
A.
Ghysels
 et al,
Nat. Commun.
10
,
5616
(
2019
).
7.
D.
Lopes
 et al,
Prog. Lipid Res.
65
,
24
(
2017
).
8.
C. T.
Lee
 et al,
J. Chem. Inf. Model.
56
,
721
(
2016
).
9.
I.
Kabelka
,
R.
Brožek
, and
R.
Vácha
,
J. Chem. Inf. Model.
61
,
819
(
2021
).
10.
B. J.
Bennion
 et al,
J. Phys. Chem. B
121
,
5228
(
2017
).
11.
D.
Bochicchio
 et al,
J. Chem. Phys.
143
,
144108
(
2015
).
12.
C.
Neale
 et al,
J. Chem. Theory Comput.
9
,
3686
(
2013
).
13.
Z.
Ghaemi
 et al,
J. Phys. Chem. B
116
,
8714
(
2012
).
14.
R.
Sun
 et al,
J. Chem. Theory Comput.
12
,
5157
(
2016
).
15.
R.
Sun
 et al,
J. Chem. Phys.
149
,
072310
(
2018
).
16.
Z.
Yue
 et al,
J. Am. Chem. Soc.
141
,
13421
(
2019
).
17.
C. H.
Tse
 et al,
J. Chem. Theory Comput.
14
,
2895
(
2018
).
18.
T.
Rivel
,
C.
Ramseyer
, and
S.
Yesylevskyy
,
Sci. Rep.
9
,
5627
(
2019
).
19.
S.
Yesylevskyy
,
T.
Rivel
, and
C.
Ramseyer
,
Sci. Rep.
9
,
17214
(
2019
).
20.
Z.
Cao
 et al,
J. Chem. Phys.
150
,
084106
(
2019
).
21.
C.
Neale
and
R.
Pomès
,
Biochim. Biophys. Acta
1858
,
2539
(
2016
).
22.
W. F. D.
Bennett
,
J. L.
MacCallum
, and
D. P.
Tieleman
,
J. Am. Chem. Soc.
131
,
1972
(
2009
).
23.
C. H.
Tse
 et al,
J. Chem. Theory Comput.
15
,
2913
(
2019
).
24.
A.
Centi
 et al,
Biophys. J.
118
,
1321
(
2020
).
25.
R. M.
Venable
,
A.
Krämer
, and
R. W.
Pastor
,
Chem. Rev.
119
,
5954
(
2019
).
26.
P.
Gkeka
 et al,
J. Chem. Theory Comput.
16
,
4757
(
2020
).
27.
L.
Molgedey
and
H. G.
Schuster
,
Phys. Rev. Lett.
72
,
3634
(
1994
).
28.
C. R.
Schwantes
and
V. S.
Pande
,
J. Chem. Theory Comput.
9
,
2000
(
2013
).
29.
G.
Pérez-Hernández
 et al,
J. Chem. Phys.
139
,
015102
(
2013
).
30.
R. T.
McGibbon
,
B. E.
Husic
, and
V. S.
Pande
,
J. Chem. Phys.
146
,
044109
(
2017
).
31.
M. M.
Sultan
and
V. S.
Pande
,
J. Chem. Theory Comput.
13
,
2440
(
2017
).
32.
J.
McCarty
and
M.
Parrinello
,
J. Chem. Phys.
147
,
204109
(
2017
).
33.
Z. F.
Brotzakis
and
M.
Parrinello
,
J. Chem. Theory Comput.
15
,
1393
(
2019
).
34.
Y.-Y.
Zhang
 et al,
J. Chem. Phys.
150
,
094509
(
2019
).
35.
H.
Sidky
,
W.
Chen
, and
A. L.
Ferguson
,
Mol. Phys.
118
,
e1737742
(
2020
).
36.
A. J.
Bruce Alberts
,
J.
Lewis
,
M.
Raff
,
K.
Roberts
, and
P.
Walter
,
Molecular Biology of the Cell
(
Garland Science
,
New York
,
2002
).
37.
N. M.
Sakhrani
and
H.
Padh
,
Drug Des., Dev. Ther.
7
,
585
(
2013
).
38.
M. J.
Abraham
 et al,
SoftwareX
1-2
,
19
(
2015
).
39.
G. A.
Tribello
 et al,
Comput. Phys. Commun.
185
,
604
(
2014
).
40.
J. B.
Klauda
 et al,
J. Phys. Chem. B
114
,
7830
(
2010
).
41.
K.
Vanommeslaeghe
 et al,
J. Comput. Chem.
31
,
671
(
2010
).
42.
W. L.
Jorgensen
 et al,
J. Chem. Phys.
79
,
926
(
1983
).
43.
S.
Jo
 et al,
Biophys. J.
97
,
50
(
2009
).
44.
L.
Martinez
 et al,
J. Comput. Chem.
30
,
2157
(
2009
).
45.
G.
Bussi
,
D.
Donadio
, and
M.
Parrinello
,
J. Chem. Phys.
126
,
014101
(
2007
).
46.
H. J. C.
Berendsen
 et al,
J. Chem. Phys.
81
,
3684
(
1984
).
47.
U.
Essmann
 et al,
J. Chem. Phys.
103
,
8577
(
1995
).
48.
B.
Hess
 et al,
J. Comput. Chem.
18
,
1463
(
1997
).
49.
W.
Humphrey
,
A.
Dalke
, and
K.
Schulten
,
J. Mol. Graphics
14
,
33
(
1996
).
50.
A.
Laio
and
M.
Parrinello
,
Proc. Natl. Acad. Sci. U. S. A.
99
,
12562
(
2002
).
51.
A.
Laio
and
F. L.
Gervasio
,
Rep. Prog. Phys.
71
,
126601
(
2008
).
52.
J. F.
Dama
 et al,
J. Chem. Theory Comput.
10
,
3626
(
2014
).
53.
K. A.
Beauchamp
 et al,
J. Chem. Theory Comput.
7
,
3412
(
2011
).
54.
W.
E
,
W.
Ren
, and
E.
Vanden-Eijnden
,
J. Phys. Chem. B
109
,
6688
(
2005
).
55.
F.
de Meyer
and
B.
Smit
,
Proc. Natl. Acad. Sci. U. S. A.
106
,
3654
(
2009
).

Supplementary Material