Broad-spectrum antiviral drugs are urgently needed to stop the Coronavirus Disease 2019 pandemic and prevent future ones. The novel severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2) is related to the SARS-CoV and Middle East respiratory syndrome coronavirus (MERS-CoV), which have caused the previous outbreaks. The papain-like protease (PLpro) is an attractive drug target due to its essential roles in the viral life cycle. As a cysteine protease, PLpro is rich in cysteines and histidines, and their protonation/deprotonation modulates catalysis and conformational plasticity. Here, we report the pKa calculations and assessment of the proton-coupled conformational dynamics of SARS-CoV-2 in comparison to SARS-CoV and MERS-CoV PLpros using the recently developed graphical processing unit (GPU)-accelerated implicit-solvent continuous constant pH molecular dynamics method with a new asynchronous replica-exchange scheme, which allows computation on a single GPU card. The calculated pKa’s support the catalytic roles of the Cys–His–Asp triad. We also found that several residues can switch protonation states at physiological pH among which is C270/271 located on the flexible blocking loop 2 (BL2) of SARS-CoV-2/CoV PLpro. Simulations revealed that the BL2 can open and close depending on the protonation state of C271/270, consistent with the most recent crystal structure evidence. Interestingly, despite the lack of an analogous cysteine, BL2 in MERS-CoV PLpro is also very flexible, challenging a current hypothesis. These findings are supported by the all-atom fixed-charge simulations and provide a starting point for more detailed studies to assist the structure-based design of broad-spectrum inhibitors against CoV PLpros.
Over the last two decades, three coronaviruses have caused deadly epidemics, threatening the global human population. The severe acute respiratory syndrome coronavirus (SARS-CoV) caused an outbreak in 2003, and a related Middle East respiratory syndrome coronavirus (MERS-CoV) caused an outbreak in 2012. Today, the world is facing the pandemic of the Coronavirus Disease 2019 (COVID-19) caused by a novel coronavirus SARS-CoV-2, which shares about 82% genome sequence identity with the original SARS-CoV.1 All three viruses are thought to have originated from animal reservoirs, and zoonotic transmission into the human population has led to the outbreaks.2 Currently, no effective treatment exists for any of the three coronavirus diseases; thus, there is an urgent need to understand the potential therapeutic targets and develop inhibition strategies.
Following the release of the coronavirus genome from the acidic endosome, the replicase polyproteins are translated and subsequently self-cleaved by two cysteine proteases to produce the functional non-structural proteins that are required for viral replication. The papain-like protease (PLpro) located in Nsp3 produces Nsp1, Nsp2, and Nsp3, while the 3C-like or main protease located in Nsp5 cleaves 11 sites downstream of Nsp4.2–6 In addition to the proteolytic function, CoV PLpro counteracts the host cell innate immune response by deactivating signaling cascades that lead to the impairment of production of pro-inflammatory cytokines and interferons.7,8 The former is accomplished through a deubiquitinating activity, which leads to the removal of ubiquitin from signaling proteins,9 and latter through the deISGylating activity, which leads to the removal of ISG15 from IRF3.10 Thus, PLpro is a critical player in the viral life cycle and as such an attractive drug target for stopping COVID-19 and other coronavirus outbreaks.
Most recently, the first (and only) two x-ray structures of SARS-CoV-2 PLpro were determined (PDB 6W9C and 6WRH). The PLpro monomer (about 300 residues), which is the predominant form in solution,11 is comprised of an independent N-terminal ubiquitin-like domain (first 62 residues) and a C-terminal catalytic domain [Figs. 1(a) and 1(b)]. The latter folds in a canonical thumb–palm–fingers-like structure, with the Ubl domain anchored to the thumb. The interface between the thumb (residues 107–113, 162–168) and palm (residues 269–279) forms the substrate binding site leading to the catalytic triad of the active site comprising Cys111, His272, and Asp286 [Fig. 1(b)]. The substrate binding site is solvent exposed and flanked by a flexible β-hairpin loop called the blocking loop 2 or blocking loop 2 (BL2) (G266–G271). The fingers’ subdomain contains a zinc finger coordinated by four cysteines, which upholds the structural integrity and is essential for the PLpro activity.12 The structure of SARS-CoV-2 PLpro is nearly identical to that of SARS-CoV PLpro,13 as expected from the highly similar sequences [96% similarity and 83% identity, Fig. 1(c)]. In contrast, although the structure of MERS-CoV PLpro overlays well with the SARS-CoV PLpro structures, small differences are visible [Fig. 1(a)], as expected from the larger sequence differences [66% similarity and 30% identity with SARS-CoV PLpro, Fig. 1(c)].
All three CoV PLpros are rich in Cys and His residues. SARS-CoV/CoV-2 PLpro contains 8/11 Cys and 11/9 His, while MERS-CoV PLpro contains 14 Cys and 10 His residues [Fig. 1(c)]. Among them, five Cys and two His residues are conserved in all three PLpros, including the catalytic Cys and His. Cys and His have model pKa’s of 8.6 and 6.5, respectively; thus, in model compounds or peptides at physiological pH 7.4, they are both predominantly neutral, i.e., Cys is protonated and His is singly protonated. However, in the protein environment, a small pKa downshift for Cys or upshift for His may occur, leading to a significant population of or a complete switch to the alternative protonation state, i.e., negatively charged, deprotonated Cys and positively charged, doubly protonated His.
A protonation state switch is an important energy transduction mechanism to enable functionally required conformational changes in biology. For example, the coronavirus spike protein makes use of protonation state switches to induce large conformational changes required for membrane fusion.14,15 Our previous work employing the hybrid-solvent based continuous constant pH molecular dynamics (CpHMD) simulations16 demonstrated that the elucidation of proton-coupled conformational dynamics offers a deeper understanding of the structure–dynamics–function relationships17 and inhibition mechanisms18–20 of aspartyl proteases.
Toward understanding the (possibly) proton-coupled structure–function relationship and assisting broad-spectrum inhibitor design, here, we report the pKa calculations and preliminary assessment of the proton-coupled conformational dynamics of SARS-CoV-2 PLpro in comparison to SARS-CoV and MERS-CoV PLpros. This work employed the recently developed graphical processing unit (GPU)-accelerated GB-Neck2 implicit-solvent based CpHMD method21 with a new asynchronous implementation of the pH replica exchange sampling protocol. To confirm our findings, the conventional all-atom fixed-charged MD in Amber1822 was also applied. The simulations allowed us to determine the protonation states of all titratable sites, including the catalytic Cys–His–Asp triad, offering a timely knowledge to facilitate the MD studies of PLpros in the community. Importantly, we tested a hypothesis regarding the proton-coupled conformational plasticity of the BL2 loop, which modulates substrate and inhibitor binding. The contrasting features among the three CoV PLpros have implications for designing broad-spectrum antiviral inhibitors. For ease of discussion, the acronyms used in this work are listed in Table I.
|Acronym .||Full name .|
|SARS||Severe acute respiratory syndrome|
|MERS||Middle East respiratory syndrome|
|BL2||Flexible blocking loop 2|
|GPU||Graphical processing unit|
|CpHMD||Continuous constant pH molecular dynamics|
|Acronym .||Full name .|
|SARS||Severe acute respiratory syndrome|
|MERS||Middle East respiratory syndrome|
|BL2||Flexible blocking loop 2|
|GPU||Graphical processing unit|
|CpHMD||Continuous constant pH molecular dynamics|
II. METHODS AND PROTOCOLS
A. System preparation
The coordinates were retrieved from the protein data bank (PDB): SARS-CoV PLpro (PDB 2FE813), SARS-CoV-2 PLpro (PDB 6W9C), and MERS-CoV PLpro (PDB 4RNA23). If multiple chains were available in the x-ray crystal structure, only the first chain was used. Any small molecules or solvent were removed. For each structure, the acetylated N-terminus and amidated C-terminus along with all missing hydrogens were added using the CHARMM program (C36b2).24 In the crystal structure of SARS-CoV-2 PLpro (PDB 6W9C), a disulfide bond is present in the fingers’ subdomain in place of a zinc ion. Considering that the zinc ion cannot be represented in implicit-solvent simulations, we removed the zinc ion and added an analogous disulfide linkage between the closest non-adjacent cysteine pairs in all other structures to maintain the integrity of the fingers’ subdomain. The disulfide linkage should not affect the pKa’s of the discussed residues, as they are located far away in other subdomains [Fig. 1(a)]. Following the addition of hydrogens and disulfide bridge, the structure was subject to a 20-step energy minimization with the heavy atoms restrained and a 20-step energy minimization with all atoms restrained but the disulfide bonded cysteine pairs. The minimization used ten steps of steepest decent and ten steps Newton–Raphson methods. From there, the force field parameters and coordinate files were constructed from the CHARMM output with the LEAP utility in Amber.22 The ff14sb force field25 was used to represent the protein, and the GB-Neck2 (igb = 8) implicit-solvent model26 was used to represent solvent. The mbondi3 intrinsic Born radii were modified for improving the titration simulations of His21 and Cys27 side chains. The structure was then energy minimized and equilibrated in GB-Neck2 implicit solvent,26 following our previous protocol.28 The energy minimization was performed using the steepest decent algorithm for 5000 steps and the conjugate-gradient algorithm for 1000 steps. The equilibration was performed at pH 7 in four stages, each having 2000 MD steps with gradually decreased restraining force constants of 5 kcal/mol/Å2, 2 kcal/mol/Å2, 1 kcal/mol/Å2, and 0 kcal/mol/Å2. The final structure was used for CpHMD titration simulations.
B. CpHMD simulations with an asynchronous replica exchange scheme
The titration simulations were performed using the recently implemented GPU-accelerated GBNeck2-CpHMD method28 in the pmemd engine of Amber18.22 The implementation is built upon the CPU version of the GBNeck2-CpHMD module21 and the GPU version of the GBNeck2 module26,29–31 in Amber18.22 The GBNeck2-CpHMD method has its origin in the GBSW-CpHMD method implemented in CHARMM.32–34 Here, we implemented an asynchronous version of the pH replica-exchange protocol16 to allow Amber replica-exchange simulations to be performed on a single GPU or several GPUs with a total number smaller than the number of replicas. A similar implementation that allows replica-exchange molecular dynamics on CPUs to progress without centralized synchronization steps and the need for direct communication between processors was developed by Gallicchio, Levy et al. in the past.35 The Python script of our asynchronous pH replica-exchange algorithm is freely available at https://gitlab.com/shenlab-amber-cphmd/async_ph_replica_exchange.
The conventional way of running replica exchange is to use one GPU (or one CPU core) per replica. Under this scheme, all replicas are running at the same time, and periodically, an attempt is made to exchange pH values (or configurations) between replicas according to the Metropolis criterion. This method is not feasible if the number of replicas is larger than the number of available GPUs. Instead, in the asynchronous method, the replicas are consecutively run on each available GPU, starting from the lowest pH condition. As soon as two replicas that are supposed to exchange at that exchange step are completed, the exchange is attempted, not waiting for other replicas to finish. As soon as a GPU finishes a replica, that GPU is assigned the next available pH value and begins a new single-pH simulation. If all replicas at a single exchange step are being run, the GPU will be assigned the first replica from the next exchange step to avoid idling GPUs. In the current Amber implementation of pH replica exchange,22 the pH conditions are swapped, but to simplify the arrangement of replicas, the asynchronous method instead swaps configurations and keeps a constant arrangement in pH space. This also eliminates a post-processing step in which the replica trajectories are sorted and stitched together according to their pH conditions.
Our previous work showed that pH replica-exchange enhances both protonation and conformational state sampling, allowing pKa’s to rapidly converge.16,28,36 In the protocol, nine pH replicas were placed at pH values ranging from pH 4.5 to 8.5 at an interval of 0.5 pH units. An exchange of two adjacent pH conditions was attempted every 1000 MD steps (or 2 ps). Each replica was run for 55 ns, resulting in an aggregate time of 495 ns for each. The λ values were recorded after each exchange attempt. All side chains of Asp, Glu, His, Cys, and Lys were allowed to titrate, with their titration model parameters taken from our previous work.21,27,28 An ionic strength of 0.15M was used to represent the physiological salt condition. Simulations were run at a temperature of 300 K and an effectively infinite cutoff (999 Å) for nonbonded interactions. SHAKE was used to constrain bonds involving hydrogens to allow for a 2-fs time step.
C. Conventional all-atom fixed-charge MD simulations
To support the findings from the GB-CpHMD simulations, two all-atom MD simulations were carried out for SARS-CoV-2 PLpro (PDB 6W9C) using the predicted protonation states for Asp, Glu, His, Cys, and Lys at pH 8.5. Note, since several residues may switch protonation states at physiological pH according to the CpHMD predictions, pH 8.5 was used to avoid ambiguity in choosing protonation states. Two additional runs were carried out using the protonated form of Cys270. All simulations were performed with Amber18.22 The protein and water were represented by the ff14SB25 and TIP3P37 force fields. The initial structure was placed in a truncated octahedron box of water molecules. Long-range electrostatic interactions were handled by using the Particle Mesh Ewald method.38 A non-bonded cutoff of 8 Å was used with a time step of 2 fs. The starting structure underwent energy minimization by applying 5000 steps of steepest descent, followed by 5000 steps of conjugate gradient minimization with a force constant of 25 kcal/mol/Å2 applied to the solute heavy atoms. The force constant was reduced to 5 kcal/mol/Å2, and the system was heated from 100 K to 300 K for 50 ps. Following heating, solvent was equilibrated in the NPT ensemble for 250 ps using the isotropic Berendsen barostat39 and with the same force constant. Subsequently, the restraints were removed, and the system was further relaxed for 100 ps in the NPT ensemble. Finally, two production runs of 1 µs each were performed for each system starting from a different random initial velocity seed. All analysis was performed with the Amber module CPPTRAJ.40 The first 300 ns from each trajectory was discarded.
III. RESULTS AND DISCUSSION
We performed pH replica-exchange CpHMD simulations to estimate the pKa values of Asp/Glu/His/Cys/Lys side chains and assess possible proton-coupled dynamics in SARS-CoV, SARS-CoV-2, and MERS-CoV PLpros. The titration simulations were conducted in the pH range of 4.5–8.5 and lasted 55 ns per replica (aggregate simulation time of 495 ns for each protein). The protonation states were well converged. Consistent with our previous work,36 we found that the protonation states of His residues converge rapidly within 10 ns per replica, and those of Cys converge more slowly due to the formation of hydrogen bonds that are not present in the crystal structure (see later discussion). The convergence analysis of protonation state sampling and replica walks along the pH ladder are given in Figs. S1–S6 of the supplementary material.
For the ease of discussion, we refer to the “standard” protonation states as the default settings in the MD programs, i.e., deprotonated Asp/Glu (negatively charged), deprotonated His (neutral), with one proton on either δ or ε nitrogen, protonated Cys (neutral), and protonated Lys (positively charged). Our simulations showed that several Cys, His residues and one Asp are in the “non-standard” protonation state or can switch to this state at physiological pH in all three PLpros (Table II). A complete list of the calculated pKa’s is given in Table 1 of the supplementary material.
|Residue .||SARS .||SARS-2 .||MERS .|
|Residue .||SARS .||SARS-2 .||MERS .|
Catalytic triad residues. A complete list of the calculated pKa’s is given in Table 1 of the supplementary material.
A. Protonation states and hydrogen bond network of the catalytic triad
We first consider the catalytic triad in SARS-CoV-2 PLpro. The catalytic Cys111 is located in the thumb, His272 is located in the foothill of the palm adjacent to the flexible loop BL2, and Asp286 is located at the end of β18 [Fig. 1(b)]. Currently, no measured pKa data are available. Biochemical experiments of SARS-CoV PLpro suggested that the Cys serves as a nucleophile, while the His functions as a general acid with the assistance of a negatively charged Asp;12 however, it is unclear whether the reactive nucleophile is the thiolate of the Cys⋯His ion pair or the neutral thiol, which becomes deprotonated upon binding of the substrate.4 The calculated pKa’s of Cys111 and Asp286 are <4.5, whereas the pKa of His272 is >8.5, indicating that the catalytic triad residues are all in the charged state at physiological pH. Thus, our data support the mechanism in which the reactive nucleophile is the thiolate ion, rather than the neutral thiol that needs to be first activated by the substrate.4
The CpHMD simulations showed that the Cys–His–Asp triad maintains a catalytic geometry through several hydrogen bonds, which supports their protonation states. The catalytic His272 forms a hydrogen bond simultaneously with Cys111 and Asp286 in the entire pH range of 4.5–8.5, stabilizing His272 in the doubly protonated state and Asp286 and Cys111 in the deprotonated states [Figs. 2(a) and 2(d)]. The doubly protonated form allows His272 to perform its role as a general acid.4 The catalytic Cys111 forms a hydrogen bond not only with His272 but also with the indole nitrogen of Trp106, which provides further stabilization for the thiolate form and explains the significant downshifted pKa relative to the solution value of 8.5 (Table II). The Cys111⋯Trp106 hydrogen bond is important, as it maintains the position of Trp106; the analogous residue in SARS-CoV PLpro has been hypothesized as the oxyanion hole residue that donates a hydrogen bond to stabilize the negatively charged tetrahedral intermediate developed in the course of peptide hydrolysis.13 The three hydrogen bond interactions in SARS-CoV-2 PLpro are consistent with those in SARS-CoV PLpro [Fig. 2(b)].
In MERS-CoV PLpro, Trp106 is replaced with a Leu, which is incapable of forming a hydrogen bond with the catalytic Cys or with the negatively charged intermediate. This has been hypothesized as a cause for the significantly lower catalytic activity of MERS-CoV as compared to SARS-CoV PLpro.41 In addition to the missing Cys⋯Trp hydrogen bond, CpHMD simulations of MERS-CoV PLpro showed that the hydrogen bond between the catalytic His and Asp is nearly abolished [Fig. 2(c)]. The loss of hydrogen bond interactions involving the catalytic His and Cys appears to provide less stabilization to the respective charged states in MERS-CoV PLpro, as suggested by the partial titration under the highest and lowest pH conditions, respectively (see Fig. S3). To test whether the loss of hydrogen bond network affects the flexibility of the regions near the catalytic triad, we calculated the root-mean-square fluctuations (RMSFs) of the Cα atoms of the catalytic triad and nearby five residues (Fig. 3). Interestingly, the RMSFs of the loop residues that are sequence neighbors of the catalytic triad in MERS-CoV PLpro are increased as compared to those in SARS-CoV-2 and SARS-CoV PLpros, which are similar except for the flexible BL2 loop (267–271) region. The loop (106–116) adjacent to α4, which harbors the catalytic Cys, the BL2 loop next to the catalytic His, and the β-hairpin loop next to the catalytic Asp all display enhanced mobility [Figs. 3(a) and 3(b)]. The largest increase in RMSF is seen for the BL2 loop, whereby the RMSF in MERS-CoV PLpro is nearly doubled relative to SARS-CoV-2 PLpro, which shows a somewhat higher mobility than SARS-CoV PLpro. The extremely high flexibility of the BL2 loop in MERS—as compared to SARS-CoV PLpro is consistent with the lack of electron density for the region in the first x-ray structure of the apo MERS-CoV PLpro.41
B. Proton-coupled conformational dynamics of the BL2 loop in SARS-CoV/CoV-2 PLpro
The BL2 loop is perhaps the most prominent feature of the substrate binding site in SARS-CoV PLpro, as its movement modulates the substrate and inhibitor binding.4 Crystal structures show that BL2 is open in the unbound SARS-CoV PLpro and it closes by about 1.5 Å–2 Å in the bound form, which allows hydrogen bonds to form between Tyr269/Gln270 and the inhibitor.4 Upon inspection of the x-ray structures of SARS-CoV-2 PLpro, We noticed that the BL2 loop is open in the pH 7.5 structure (PDB 6W9C) and a zinc ion is found within the binding distance of Cys270; however, in the structure determined at pH 4.5 (PDB 6WRH), the BL2 loop closes in by 1.9 Å (Cα distance between Tyr268 and Asp164) and the zinc ion is absent. Thus, we hypothesized that the BL2 loop dynamics is coupled to the titration of C270.
CpHMD titrations gave the pKa’s of 6.7 and 6.9 for Cys270 in SARS-CoV-2 PLpro and the equivalent Cys271 in SARS-CoV PLpro, respectively (Table II). Thus, Cys270/C271 samples both protonated and deprotonated states at physiological pH. The pKa that downshifts relative to the model Cys pKa of 8.5 is due to the formation of local hydrogen bonds, which favors the thiolate state. In the crystal structure of SARS-CoV-2 PLpro, Cys270 does not interact with Thr265. The titration simulations showed that the distance between Cys270 and Thr265 varies widely between 5 and 15 Å, when Cys270 is protonated (λ value close to 0); however, when Cys270 is deprotonated (λ value close to 1), the distance is locked to 1.5 Å–2.5 Å, indicating the formation of a hydrogen bond [Fig. 4(a)]. In addition to the hydroxyl group of Thr265, the deprotonated Cys270 can also accept a hydrogen bond from the backbone amide groups of His272 and Gly271, stabilizing the charged state [Fig. 4(b)]. The same hydrogen bonds were also formed in the simulations of SARS-CoV PLpro, which explains the similarly downshifted pKa of the equivalent Cys271.
To test the hypothesis that protonation/deprotonation of Cys270 modulates the BL2 dynamics, we examined the Cα distance between Tyr268 on the BL2 loop and the conserved Asp164 next to α7, which represents the width of the S3 subpocket [Figs. 4(c) and 4(d)]. The equivalent Tyr269 in SARS-CoV PLpro is an important residue, as it forms a hydrogen bond with the inhibitors that bind to the S3 pocket.4 When Cys270 is protonated, the probability distribution of the Tyr268–Asp164 distance covers a broad range of 8 Å–20 Å with a peak at around 12 Å; however, when Cys270 is deprotonated, the distribution samples a narrower range of 14 Å–20 Å with a peak at around 16 Å [Fig. 4(c)]. Thus, the CpHMD data suggest that the deprotonated Cys270 is correlated with the BL2 conformations that are more open and rigid, which might be attributed to the aforementioned hydrogen bond formation between the deprotonated Cys270 and the surrounding residues. Turning to SARS-CoV PLpro, Fig. 4(c) shows that the BL2 movement is coupled to the protonation/deprotonation of the analogous Cys271. However, in SARS-CoV PLpro, it appears that the BL2 with a deprotonated Cys270 can sample a wider range of 10 Å–20 Å as compared to SARS-CoV-2 PLpro, although the peak remains around 16 Å. The wider range of BL2 movement may be attributed to the somewhat weaker hydrogen bonds involving the deprotonated Cys271. The movement of the BL2 loop is very similar between SARS-CoV and SARS-CoV-2 PLpros when Cys270 is protonated.
C. Comparison of the BL2 conformation across the three PLpros
For broad-spectrum inhibitor design, it is important to understand the difference in the BL2 conformation across the three PLpros. The distributions of the Tyr269/268–Asp165/164 distance for SARS-CoV/CoV-2 and the equivalent Thr274–Asp164 distance for MERS-CoV PLpro at physiological pH [Figs. 5(a) and 4(d)] show that the widest position of the BL2 loop is about the same across the three PLpros; however, the BL2 in SARS-CoV-2 PLpro samples the narrowest range, followed by SARS-CoV PLpro and MERS-CoV PLpro, which samples the widest range between 5 Å and 22 Å. The enhanced flexibility of the BL2 in MERS-CoV PLpro may be attributed to the lack of a Cys equivalent to Cys271/270 in SARS-CoV/CoV-2 PLpro, which can form hydrogen bonds with neighboring residues to restrict the loop motion, and perhaps also the loosening of the nearby catalytic His, which no longer forms double hydrogen bonds as in SARS-CoV/CoV-2 PLpro.
In agreement with the BL2 dynamics described with the protonated and deprotonated states of Cys270 [Fig. 4(c)] and the x-ray structures determined at pH 7.5 (PDB 6W9C) and pH 4.5 (PDB 6WRH), a significant pH dependence is observed with the BL2 dynamics in the simulations of SARS-CoV-2 PLpro (Fig. 5). In the simulation at pH 7.5, the BL2 loop samples a state that leaves the S3/S4 subpocket more open, similar to the neutral pH crystal structure [Fig. 5(c)]. In the simulation at pH 4.5, the BL2 loop assumes a wide range of dynamics, allowing it to sample a state that leaves the S3/4 subpocket more closed, similar to the low pH crystal structure [Fig. 5(c)].
D. The Ubl domain contains an Asp with a highly upshifted pKa
As expected, nearly all Asp/Glu residues adopt standard protonation states (i.e., charged) at physiological pH; however, Asp12 in SARS-CoV-2 PLpro has a pKa abnormally upshifted from its model pKa of 4.0 to 6.7 (Table II), making it possible to occasionally sample the protonated state at pH 7.4. Trajectory analysis suggested that this upshift is, in part, due to the protonated Asp12 acting as a hydrogen bond donor to either (deprotonated) Glu67 or Asn15 [Figs. 6(a) and 6(c)], which stabilizes the protonated state. The two hydrogen bonds are mutually exclusive such that Asp12 is a hydrogen bond donor 82%–96% of the time when it is in the protonated state [Fig. 6(a)]. In addition to hydrogen bonding, Asp12 is buried in a hydrophobic pocket with a very low solvent accessible surface area, which increases as Asp12 becomes deprotonated at higher pH [Fig. 6(b)]. In SARS-CoV PLpro, the analogous Asp13 experiences a similar degree of hydrogen bonding and solvent sequestration, resulting in a upshifted pKa of 5.9. In MERS-CoV PLpro, the analogous Asp11 primarily donates a hydrogen bond to Asn15, as the analogous residue to Glu67 is missing (Fig. S7). Compared to Asp12/13 in SARS-CoV-2/CoV PLpro, Asp11 is more solvent exposed in the lower pH range (Fig. S7), which may contribute to a smaller degree of pKa upshift of Asp11 in MERS-CoV PLpro as compared to SARS-CoV/CoV-2 PLpro.
E. Histidines that can switch protonation states at physiological pH
CpHMD titrations revealed that three histidines unique to SARS-CoV/CoV-2, H74/73, H90/89, and H176/175 (Fig. 7), have pKa’s around 7 (Table II) and can sample both protonated and deprotonated states at physiological pH. His74/73 located on the C-terminal end of α2 in SARS-CoV/CoV-2 has a pKa of 7.3/7.3. Analysis suggested that the pKa upshift relative to the model value of 6.5 is due to the formation of hydrogen bonds with Phe70/69, Asn129/128, or a salt bridge with Glu71/70 (see Fig. S8), which stabilizes the charged state. His90/89 located on the C-terminal end of α3 in SARS-CoV/CoV-2 has a pKa of 7.0/6.9. Analysis showed that a small pKa upshift relative to the model pKa is due to the stabilization of the charged state by the occasional hydrogen bonding with the backbone carbonyl of Ser86/85 or transient salt-bridge interaction with Asp107/108 located near the oxyanion hole of the CoV PLpro (Fig. S9). His176/175 is located on the C-terminal end of α7 and opposite to His74/73. The increased pKa of 7.4/7.3 can also be attributed to local hydrogen bonding either with His172 in SARS-CoV-1 or Tyr171 in SARS-CoV-2 PLpro (Fig. S10).
F. All-atom fixed-charge MD of SARS-CoV-2 PLpro
To provide support for the protonation states determined by GB-CpHMD titrations and test the proton-coupled dynamics of the BL2 loop, we performed conventional all-atom fixed-charge MD simulations of SARS-CoV-2 PLpro with the catalytic side chains fixed in the charged states and Cys270 fixed in the protonated or deprotonated state. All other residues were fixed in the standard protonation states. Two 1-µs trajectories were obtained with each Cys270 protonation state. Consistent with the GB-CpHMD simulations at physiological pH, the catalytic triad remained very stable, with the hydrogen bond between His272 and Cys111 being the strongest, followed by the His272⋯Asp286 and Cys111⋯Trp106 hydrogen bonds, as shown in the hydrogen bond occupancy plots [Fig. 8(a)]. Interestingly, while the effect of Cys270 protonation/deprotonation appears negligible for the latter two hydrogen bonds (occupancy change is below 5%), protonation of Cys270 weakens the His272⋯Cys111 hydrogen bond (occupancy decreases by over 20%). This decrease is consistent with the GB-CpHMD data, which shows that the catalytic His⋯Cys hydrogen bond is significantly weakened below pH 6 as Cys270 becomes fully protonated in SARS-CoV-2 PLpro [Fig. 2(a)].
To test the effect of Cys270 titration on the BL2 conformation, we calculated the probability distribution of the Cα distance between Tyr268 and Asp164. For the trajectories with protonated Cys270, the distribution displays a single peak at about 11 Å; however, for the trajectories with deprotonated Cys270, a second peak appears at about 17 Å [Fig. 8(b)]. These data indicate that deprotonated Cys270 is correlated with the more open BL2 loop conformations, consistent with the findings from the GB-CpHMD simulations [Fig. 4(c)]. However, due to the slow transition between the open and closed BL2 loop conformations in explicit solvent, more trajectories or longer simulations are needed to solidify the conclusion.
G. Comparison to structure-based Poisson–Boltzmann pKa calculations
We compare the CpHMD-predicted pKa’s of SARS-CoV-2/CoV and MERS-CoV PLpros with those from the DelPhiPKa server,42 which performs continuum Poisson–Boltzmann (PB) calculations with a smooth Gaussian dielectric function.43 Note, Cys pKa calculation is a newly added functionality in DelPhipKa.44 The default setting with a protein internal dielectric constant of 8 was used. For the purpose of this work, we focus on the pKa’s of His, Cys, and the abnormal Asp13/12/11 (Fig. S11). There appears to be a good correlation for His and Cys pKa’s in the pKa range of 5–7.5 between the two methods; however, there is a large disagreement for the pKa’s that are predicted to be highly down- or upshifted relative to the model values by the CpHMD method. Close inspection suggests that the disagreement is related to the small pKa ranges from the DelPhipKa calculations: 6–8 for His and 5.5–7 for Cys (i.e., all cysteines are thiolates). Specifically, cysteines that have CpHMD predicted pKa’s above 8.5 have pKa’s below 7 according to DelPhipKa calculations. Similarly, histidines that have CpHMD predicted pKa’s below 4.5 have pKa’s of about 6 according to DelPhipKa calculations. Another significant disagreement is for Asp13/12/11, which have significantly upshifted pKa’s according to CpHMD (5.2–6.7, see Table II) but have downshifted pKa’s based on DelPhipKa (pKa 2–3, Fig. S11). A possible explanation is that in the crystal structure, one of the carboxylate oxygens of Asp13/12/11 accepts a hydrogen bond from the side chain amino group of Asn15/14/13, thereby stabilizing the deprotonated form. By contrast, in the CpHMD simulations, Asn15/14/13 rotated such that its carbonyl group accepts a hydrogen bond from Asp13/12/11, which additionally donates a hydrogen bond to Glu67 in SARS-CoV-2/CoV-2 PLpro (Fig. 6). Experimental measurements and future community efforts such as the 2009 blind pKa prediction exercise45 are needed to assess and promote the further development of various pKa calculation approaches.
IV. CONCLUDING DISCUSSION
The protonation states and possible proton-coupled conformational dynamics of SARS-CoV-2 PLpro were investigated in comparison to SARS-CoV and MERS-CoV PLpros using the GPU-accelerated GBNeck2-CpHMD titration simulations with a new asynchronous pH replica-exchange scheme as well as conventional all-atom MD. The simulations showed that the catalytic Cys, His, and Asp are charged in the entire simulation pH range of 4.5–8.5 for all three PLpros, which supports the mechanism in which the reactive nucleophile is the thiolate ion and the catalytic His serves as a general acid stabilized by the catalytic aspartate. The catalytic triad in SARS-CoV-2/CoV PLpro forms a hydrogen bond network among themselves and with a nearby Trp, which serves as an oxyanion hole residue to stabilize the tetrahedral intermediate developed in the peptide hydrolysis. In contrast, the hydrogen bond with Trp is missing and the hydrogen bond between the catalytic His and Asp is nearly abolished in MERS-CoV PLpro, consistent with the significantly lower catalytic activity compared to SARS-CoV PLpro.41 Interestingly, the lack of a hydrogen bond network for the catalytic triad in MERS-CoV PLpro is correlated with the increased mobility of nearby loop residues, in particular the BL2 loop.
The simulations revealed that several titratable residues have shifted pKa values such that they switch between two protonation states at physiological pH. These include three His and one Cys residues unique to SARS-CoV-2/CoV and one Asp residue common to all three PLpros (Table II). Of particular interest is Cys270/271 on the flexible BL2 loop of SARS-CoV-2/CoV, which has a pKa of 6.7/6.9 and samples both the standard thiol and charged thiolate forms at neutral pH. CpHMD simulations showed that the BL2 loop samples an open or a closed conformational ensemble with deprotonated or protonated Cys270/271, respectively, consistent with two crystal structures of SARS-CoV-2 PLpro determined at neutral and low pH conditions and the conventional all-atom MD trajectories of SARS-CoV-2 PLpro with either deprotonated or protonated Cys270. Thus, the simulation data and experiment together support our hypothesis that the BL2 loop conformation is coupled to the titration of C270/271 in SARS-CoV-2/CoV PLpro.
An induced fit mechanism, by which BL2 closes in to form hydrogen bonds with the inhibitor, has been proposed in designing potent inhibitors targeting the S3/S4 pocket of SARS-CoV PLpro.4 Our finding suggests that in the absence of a ligand, protonation of C270/271 induces the closure of BL2 in SARS-CoV-2/CoV PLpro, which raises the possibility that a conformational selection mechanism may be operative, in which inhibitor binding shifts the BL2 conformational population to the closed form, perhaps by favoring the protonation of C270/271. While the unliganded crystal structures show that BL2 in MERS-CoV PLpro is more open than in SARS-CoV-2/CoV PLpro, CpHMD simulations suggest that BL2 has an increased flexibility in MERS-CoV PLpro and it can sample closed conformations. This finding challenges a current hypothesis, according to which SARS-CoV PLpro inhibitors do not bind to MERS-CoV PLpro due to the more open BL2 loop as a result of the sequence difference and one extra residue.23,46 A main caveat of our work is the use of the GB-Neck2 implicit-solvent model.26 Although it has been demonstrated in the accurate de novo folding simulations of nearly two dozen small proteins with α and β topologies,31 inherent issues such as the lack of solvent granularity may limit the accuracy of detailed conformational representation. Nonetheless, our work provides a starting point for further mechanistic investigations using higher-level approaches such as the all-atom CpHMD47 and more extensive conformational sampling to assist the structure-based drug design targeting the coronavirus PLpros.
See the supplementary material for convergence analysis, a list of pKa estimates for all titratable residues, and additional hydrogen bond analysis.
The Python script to enable the asynchronous replica-exchange protocol for Amber simulations is freely available at https://gitlab.com/shenlab-amber-cphmd/async_ph_replica_exchange. The data that support the findings of this study and raw trajectory files are available from the corresponding author upon reasonable request.
We acknowledge financial support from the National Institutes of Health (Grant No. GM098818).