Proteolysis is essential for the control of metabolic pathways and the cell cycle. Bacterial caseinolytic proteases (Clp) use peptidase components, such as ClpP, to degrade defective substrate proteins and to regulate cellular levels of stress-response proteins. To ensure selective degradation, access to the proteolytic chamber of the double–ring ClpP tetradecamer is controlled by a critical gating mechanism of the two axial pores. The binding of conserved loops of the Clp ATPase component of the protease or small molecules, such as acyldepsipeptide (ADEP), at peripheral ClpP ring sites, triggers axial pore opening through dramatic conformational transitions of flexible N-terminal loops between disordered conformations in the “closed” pore state and ordered hairpins in the “open” pore state. In this study, we probe the allosteric communication underlying these conformational changes by comparing residue–residue couplings in molecular dynamics simulations of each configuration. Both principal component and normal mode analyses highlight large-scale conformational changes in the N-terminal loop regions and smaller amplitude motions of the peptidase core. Community network analysis reveals a switch between intra- and inter-protomer coupling in the open–closed pore transition. Allosteric pathways that connect the ADEP binding sites to N-terminal loops are rewired in this transition, with shorter network paths in the open pore configuration supporting stronger intra- and inter-ring coupling. Structural perturbations, either through the removal of ADEP molecules or point mutations, alter the allosteric network to weaken the coupling.
I. INTRODUCTION
Maintaining protein homeostasis at the cellular level is essential in all kingdoms of life.1,2 Bacterial Caseinolytic proteases (Clp) assist these mechanisms by performing intracellular protein quality control through regulatory protein degradation.3 Self-compartmentalized Clp nanomachines comprise a central barrel-like peptidase, such as ClpP, and one or two ring–shaped ATPase components, such as ClpA or ClpX, which are axially stacked at the two opposite ends of the peptidase.4,5 Complex formation, which is dependent on ATP binding, tightly regulates the degradation process to prevent uncontrolled protein destruction.6,7 The isolated ClpP is catalytically inactive as access to its proteolytic chamber is precluded by axial gates locked in a closed configuration, which allow diffusion of short peptides8–10 but hinder the entry of longer unfolded polypeptide chains and block internalization of folded proteins.11–13 Docking of one or two ATPase partners to ClpP triggers gate opening to unleash the powerful degradation mechanism.14–17 Upon recognizing SPs targeted for degradation through short peptide tags attached covalently at one of the polypeptide terminals, the ATPase applies repetitive mechanical forces to effect SP unfolding and translocation through its narrow central channel and to propagate the unfolded polypeptide toward the peptidase core.18–25 SP degradation takes place processively and yields small peptide fragments of 7–8 amino acids.26
Highly conserved N–terminal loops (amino acids 1–19) control access to the degradation chamber through the axial pores of the double-ring tetradecameric ClpP.27,28 Intermediate “head” regions (residues 20–122 and 149–193) connect N-terminal loops to the inter–ring equatorial interface formed by interlocked “handle” regions (residues 123–148) of each protomer (Fig. 1). In the “closed” pore configuration, N-terminal loops assemble into a mesh that involves strong inter–loop contacts and that occludes passage to the proteolytic chamber.29 Removal of the N-terminal loops in ClpP variants abolishes the gating mechanism and enables even the isolated peptidase to indiscriminately destroy unfolded proteins. Functional control of degradation through complex formation with the ATPase is mediated by contacts formed with a ClpP ring at hydrophobic grooves located at peripheral sites from the axial pore. Although docking of the hexameric ClpA or ClpX ATPases to the heptameric ClpP rings involves a symmetry mismatch,1,27 binding of six conserved ATPase loops, which contain the IGL or IGF sequence motif in ClpA or ClpX, respectively, ensures robust degradation activity.4,29–31 Weaker complexes with only five IGL/F loops are functional, albeit degradation proceeds at a reduced rate.15 The N-terminal loops are also involved in the complex formation through interaction with the ATPase pore–2 loops, however, they form weaker contacts due to their greater conformational flexibility.30,32,33 Remarkably, pore opening may be affected without assistance from the ATPase through strongly cooperative binding of seven acyldepsipeptide 1 (ADEP1) molecules to the peripheral ClpP sites (Fig. 1).2,34 Kinetic studies indicate that ADEP1 establishes favorable interactions with the ClpP hydrophobic groove through its phenylalanine, β-methylproline and alanine moieties, and the aliphatic tail.35 This triggers a dramatic conformational transition to a quasi–symmetric configuration of N-terminal β-hairpins pointing outward from the proteolytic chamber that results in nearly doubling the pore diameter.26,34,36,37 Notably, the ClpP structure is very similar in the ADEP-bound and ClpX-bound configurations, with the global Cα-based root-mean-square deviation of 0.8 Å in one ClpP ring and 0.6 Å in the pore region (defined by the seven N-terminal loops and adjacent helices).5 ClpP pore opening induced by ADEP binding yields powerful antibacterial action through uncontrolled destruction of bacterial proteins that is pursued for therapeutic applications against pathogenic Staphylococcus aureus and Mycobacterium tuberculosis.35,38–41 Structural plasticity of ClpP probed in crystallographic and computational studies led to the hypothesis that transient exit channels form in the equatorial region to facilitate the release of peptide fragments resulting from the degradation process.2,42
The details of allosteric communications that underlie the critical gate opening mechanism remain elusive even as multiple ClpP conformations have been resolved. Such challenges are commonly noted for ring-shaped molecular machines, such as GroEL, thermosome and CCT chaperonins, and are attributable to their large conformational plasticity, dynamic rewiring of allosteric networks, and correlated intra- and inter-ring motions.43–47 To uncover the allosteric mechanisms in ClpP, we use computational approaches that are able to address such questions in diverse proteins by probing the complex networks of residue–residue interactions underlying the long-distance communication between the effector site and the functional site.48–51 These approaches use concepts in graph theory combined with residue–residue coupling derived from structural data or conformational dynamics to identify “communities” of strongly coupled amino acids and to map the allosteric pathways connecting them.52–62
In this paper, we describe comparative studies of ClpP conformational dynamics in its open- and closed-pore configurations. To this end, we perform solvated, atomistic, molecular dynamics simulations of these systems and we identify the collective motions using principal component and normal mode analyses. Coupling between regions of the tetradecameric structure revealed by community network analysis indicates a switch between inter- and intra-protomer coupling as a result of the transition from the closed to the open pore configuration. Allosteric paths identified between the ADEP binding sites and the ClpP N-terminal loops highlight stronger intra- and inter-ring coupling between binding sites and N-terminal loops in the open configuration.
II. METHODS
A. Molecular dynamics simulations
The closed and open pore configurations of Escherichia coli ClpP were described using the x-ray structures with Protein Data Bank IDs 1YG629 and 3MT6,34 respectively. Unresolved regions of the x-ray structures were modeled using the ModLoop64 web server. To study the effect of perturbations on each structure, we considered several point mutations, which were implemented using PyMOL65 (Table S1). For each configuration and sequence molecular dynamics (MD) simulations were performed using the Gromacs66 2022 package and the CHARMM36 all-atom force field.67 The CGenFF server was used to generate force field parameters for the ADEP molecule compatible with the CHARMM36 force field.68 Each protein structure was solvated in a dodecahedral box with dimensions Å3 with water molecules represented using the CHARMM-modified TIP3P69 model. The system was neutralized by adding an appropriate number of Na ions for each setup (Table S1). The solvated system was energy minimized using the steepest descent algorithm for 50 000 steps with the convergence criterion of the maximum force value smaller than 1000 kJ/(mol nm). Next, the systems were equilibrated by performing NVT and NPT dynamics. First, simulations were performed for 500 ps in the NVT ensemble using the V-rescale70 algorithm, with T = 300 K, and by restraining the heavy atoms of the protein with a spring constant of 1000 kJ/(mol nm2). In the second equilibration step, the restrained system was simulated for 500 ps in the NPT ensemble, with the constant pressure of 1 atm, using the Parrinello–Rahman71 algorithm. The time step in all MD simulations was 2 fs. Finally, restraints were removed and, for each setup, four unbiased MD simulation trajectories (50 ns each) were performed in the NPT ensemble. For analysis purposes, data frames were saved every 100 ps were saved after excluding the first 10 ns of each trajectory. Root-mean-square deviations of simulations in each setup are shown in Fig. S1.
B. Dynamic cross correlation matrix (DCCM)
To quantify the time-dependent directional correlations between residue pairs of the protein, we computed the Dynamic Cross Correlation Matrix (DCCM) of position fluctuations of Cα atoms of protein residues using the Bio3D package.72 DCCM is an N × N matrix, with N equal to the number of residues, where each element Cij corresponds to the dynamic cross-correlation between residues i and j:
Here, ri and rj are the spatial positions of the Cα atoms of residues i and j as a function of time, ⟨·⟩ denotes the ensemble average over all trajectories and all time frames up to time t, and △ri(t′) = ri(t′) − ⟨ri⟩ denotes the instantaneous position fluctuation of residue i from its mean over the simulation time. Correlation values range from −1 to 1, with positive Cij values corresponding to motions of i and j atoms that are moving in the same direction, whereas the negative values indicate motions in opposite directions. Convergence of the DCCM matrix, shown in Fig. S2, was assessed using , where Np is the number of residue pairs and the time interval τ = 5 ns.73 We note that evaluation of R(t) using several τ values between 2.5 and 10 ns consistently yield R(t) ≤ 0.001. Here, Cij is evaluated using data frames up to the total simulation time per trajectory t ⩽ 50 ns.
DCCM maps yield couplings between highly interconnected residue pairs that make it challenging to decipher correlated motions of larger protein regions. To address this limitation, we performed community detection using the strongly coupled residue pairs in DCCM (|Cij| ⩾ 0.6) and the Girvan-Newman algorithm74 implemented in the cna() function in the Bio3D software package.72 In this approach, the full residue network is split into highly intra-connected communities but weakly inter-connected between communities. In the Girvan-Newman algorithm, the number of communities is selected using the maximum modularity approach. Modularity quantifies how well a network is partitioned into communities. Since this is a probability measure, the values are between 0 and 1. In general, for an ensemble, MD derived correlation network modularity above 0.7 indicates reasonable partitioning in a network.72,75
C. Principal component analysis
Molecular Dynamics of complex biomolecular systems produce high-dimensional datasets comprising atomic positions saved in each time step. To glean information about the most significant conformational dynamics at a coarse-grained level one can employ Principal Component Analysis.76–78 Here, we probe the functional dynamics between the open and closed configurations of ClpP by performing Principal Component Analysis (PCA). In PCA, the covariance matrix (⟨△ri⋅△rj⟩) comprising positional fluctuations is diagonalized to determine the set of eigenvalues and eigenvectors. PCA calculations are performed using the GROMACS analysis tools g_covar and g_anaeig,66 where g_covar generates both eigenvalues and eigenvectors by diagonalizing the covariance matrix and g_anaeig filters the original trajectory and projects it along a given eigenvector. Prior to PCA calculation, we remove translation and rotation degrees of freedom of the entire molecule by superimposing each frame of the MD simulation onto the crystal structure. The comparison between the essential subspaces corresponding to open and closed configurations is performed by calculating the Root Mean Square Inner Product (RMSIP), which computes the overlap between two subsets of eigenvectors {(}, {)} by using79,80
Here, the overlap is computed such that the top M eigenvalues in each configuration account for over 80% of the total variance.
D. Optimal and suboptimal path analysis
To understand the dynamic coupling between the ADEP binding site and the N-terminal loop regions of the ClpP ring, we calculate the optimal and suboptimal paths traversing from each ADEP binding site to all the N-terminal loops in one heptameric ring. We use the cnapath() function implemented in the Bio3D package.58,72,75 In the path analysis, each Cα atom is considered a node, and the connection between nodes is weighted by wij = −log(|Cij|). To remove weakly correlated regions and interactions between residues that are not in contact, the DCCM is filtered using the cutoff value |Cij| ⩾ 0.3 and the dynamic contact map obtained in the MD trajectories. The contact map is generated in two steps. First, we identify persistent contacts in each trajectory, i.e., residue pairs with the Cα-Cα distance dij ⩽ 10 Å present in at least 75% of the simulation time frames.59,81 Next, the dynamic contact map is generated as the consensus map of contacts identified in at least three of the four trajectories. We find that the dynamic contact maps include ≃84% of contacts present in the native structure and ≃2% of new contacts. Paths are determined by using an efficient bidirectional approach that simultaneously initiates the search from one “source” (ADEP binding site) residue and one “sink” (N-terminal loop) residue and identifies closed paths upon locating common nodes. The path with the shortest length is identified as optimal, whereas slightly longer paths that are closed are identified as suboptimal. Accordingly, analysis of paths traversing between the “source” and the “sink” is useful to glean information about the allosteric regulation of regions that do not show large conformational changes. In our analysis, 300 paths are calculated for each “source” and “sink” residue pair and path length distributions are analyzed to assess the strength of the correlated motions. The extent of the overlap between two path length distributions, pi(x), i = 1, 2, is quantified by using the overlapping coefficient, OC = ∫min{p1(x), p2(x)}dx.82,83 Furthermore, the normalized node degeneracy, evaluated as the fraction of paths traversing through a given node, is used to identify the important residues that contribute to the allosteric network. Analysis of node degeneracy indicates that ∼350 paths between the source and sink are sufficient for convergence.59
E. Normal mode analysis and structural perturbation method
In order to calculate the normal modes of the proteins and analyze the response of the modes to perturbations, we modeled the proteins as elastic networks composed of N nodes where N is the number of amino acid residues in the structure.44,84,85 The nodes are placed at the locations of the Cα atoms of the amino acid residues in the corresponding PDB structures. All nodes that are within a cutoff distance Rc = 9 Å are connected via harmonic potential with the energy function:
where γ is the spring constant that defines the energy scale, dij is the dynamic distance between residues i and j, and is the distance between the residues in the PDB structure.
The normal mode calculation yields a set of 3N-dimensional eigenvectors, and corresponding eigenvalues ωM for each mode M. We also analyze the responses of these modes to perturbations that mimic to point mutations of specific amino acids. This approach has been termed the Structural Perturbation Method (SPM)86,87 and, in practice, we calculate the response to a mutation at the site i:
where δγ is the perturbed spring constant. The sum is over all other nodes in the network. The greater the response δωiM, the more dynamically significant a specific residue is to a given mode. We highlight the nodes that have δωiM values that are three-fold above the average value for a mode as significant.
The overlap function quantifies how a given normal mode compares with the conformational change along a transition pathway. We compute the overlap function by projecting the eigenvector qM of a given mode M onto the displacement vector between the open and closed configurations according to the formula,88
where the sum is over the Np nodes, , and are position vectors of the ith node in the open (closed) structures, respectively. A value of one for the overlap corresponds to the direction given by the eigenvector aM being identical with that of Δr. The relative amplitude of node i in mode M is obtained using where , α = x, y, z, are the components of the normalized eigenvector aM.
III. RESULTS
A. Collective motions underlying the gate-opening conformational transition involve primarily N-terminal loop regions of ClpP
To glean the collective motions of ClpP protomers underlying the gate-opening transition, we use Normal Mode Analysis (NMA) and Principal Component Analysis (PCA) (see Sec. II). These approaches have proved highly effective in studies of conformational changes and dynamics of multisubunit biomolecules, such as the ring-shaped chaperonins, ClpXP, or immunoglobulin.45,89–92 We use these analysis methods in conjunction to obtain detailed information about the collective motions, as inter-residue couplings in NMA are restricted to an elastic network model of the native protein structure, whereas in PCA they reflect dynamic, i.e., native as well as non-native, contacts. Nonlinear contributions to these couplings, quantified by considering the mutual information between residue coordinates,93 have also been used to probe allosteric networks.58,60,61,94–97 We note that a recent study of the tetracycline repressor dimer identified similar trends in the inter-residue coupling when comparing linear and nonlinear contributions.97 A distinct advantage of network-based approaches is their ability to probe the propagation of allosteric signal on timescales accessible to MD simulations, yet yielding results consistent with the much longer biological timescales.90,98
In PCA, diagonalization of the covariance matrix of atomic fluctuations yields the set of independent modes of motions of the protein that characterizes its essential subspace.77 Eigenvectors of the matrix characterize the orthogonal directions of maximal variance, whereas eigenvalues determine the amplitude of positional deviations. We focus on the PC modes that correspond to the largest eigenvalues (Fig. S3), which provide the major contribution to the variance of fluctuations. In the open state of ClpP, we find that eigenvalues corresponding to the top 3 PC modes contribute ≃40% of the cumulative variance of the fluctuations, and eigenvalues of the top 20 modes are required to explain ≃67% of the total variance. In the closed state, the top 3 eigenvalues contribute ≃52% of the total variance, whereas the top 20 eigenvalues contribute ≃74% of the total variance. In both cases, examination of the motions corresponding to the top 2 PC modes indicates a combination of large amplitude swing-type and torsional motions of the N-terminal loops that enable pore opening and closing [Figs. 2(a) and 2(b) and Movies SM1-4 in the supplementary material]. Handle domains in the inter-ring interface undergo slight contracting twisting motions. More specifically, in the open state, PC1 corresponds to swing-like motions that underlie pore opening and closing and PC2 corresponds to torsional motions of N-terminal loops. In the closed configuration, PC1 corresponds to torsional motions and PC2 corresponds to a combination of swing-like and torsional motions.
To discern the contribution of harmonic vibrations to these motions we perform normal mode analysis of the open state configuration (see Sec. II). We focus on the subset of normal modes whose eigenvectors have the largest overlap (see Sec. II) with the conformational changes corresponding to the transition between the open and closed conformations. As shown in Fig. 2(c), five modes have overlap 0.15 ≲ IM ≲ 0.2, whereas all other modes have smaller contributions (IM < 0.1). The absence of a single, dominant, mode indicates the lack of strong coordination of motions of the seven subunits of each ring to enable the transition. We also note that the lowest frequency modes, which describe global motions of the ClpP tetradecamer, have small contributions to the open → closed pore transition, whereas the five higher frequency modes, which describe more local motions, are more suitable to describe the transition. Consistently, the amplitudes of residue motions in the top five modes indicate large values only in the loop regions and negligible amplitudes outside of these regions (Fig. S4). These results reveal that dominant conformational changes in this transition are primarily associated with motions of the N-terminal loops. On the basis of the comparison between PCA and NMA results, we surmise that dynamic fluctuations provide a stronger contribution in the handle and head regions than in the N-terminal loop regions.
Hot-spot residues that are critical for allosteric communication are identified by employing the Structural Perturbation Method (see Sec. II). Here, the perturbations imposed by point mutations capture the change in the local network energy and hot-spot residues correspond to nodes whose response (δωiM) exceeds by three-fold the average value. Table S2 shows the hot-spot residues derived from the two modes with the largest overlap, 38 and 28 (Fig. S5 and Movies SM5-6), and Fig. 2(d) illustrates the location of these hot-spots residues projected onto a single ClpP monomer. We note that NMA and SPM analysis identify hot-spot residues clustered primarily in the N-terminal loop region, in accord with the dominant structural flexibility of this region in the harmonic approximation. In addition to the hot-spot residue cluster within and near the N-terminal loops, SPM analysis highlights two residues (His191, Arg192) in the C-terminal region.
Both PCA and NMA results are consistent with observations of structural studies, which highlighted that the peptidase core, ClpP(20–193), has virtually identical conformations in the open and closed configurations. On this basis, it was proposed that ADEP1 binding causes a significant conformational change in the N-terminal loops and only small-amplitude motions of the equatorial belt formed between two rings.34 To quantify these contributions to the motions and to pinpoint the regions with the largest contribution to the open → closed transition, we further probe the PCs associated with the motions of the N-terminal loops and of the peptidase core, ClpP(20–193), respectively, in closed and open configurations. To this end, we perform separate PCA of each of these two ClpP regions in each of the two configurations and we determine the Root Mean Square Inner Product (RMSIP) of the two subsets of eigenvectors, which provides a measure of the similarity of motions described by the PC modes (see Sec. II).
Comparison of collective motions of N-terminal loops was computed by considering the eigenvectors corresponding to the largest 11 eigenvalues, which collectively account for 80% of the variance, thus representing the essential subspace. Quantitatively, the overlap between the essential subspaces corresponding to N-terminal regions indicates weak similarity, with RMSIP ≈0.37. By contrast, for the peptidase core, ClpP(20–193), where the top 110 eigenvectors must be included to describe the essential subspaces, we find a strong overlap between the PC modes, with RMSIP ≈0.73. This indicates that ADEP binding has only a weak effect on the dynamics of the peptidase core.
Overall, NMA and PCA data suggest that the conformational transition of ClpP protomers during the gate opening and closing are best described by an ensemble of modes. Taken together, both PC and normal mode data reveal that large conformation changes only exert at the N-terminal loop regions while the core remains mainly intact.
B. Distinct coupling between ClpP regions in closed and open pore configurations
To investigate the correlated motions of regions in the ClpP tetradecamer, we employ the community network analysis, which uses the residue–residue coupling quantified by the directional cross-correlation map (DCCM, see Sec. II). DCCM maps are highly interconnected at the residue level, which makes it complicated to extract information for large systems. Therefore, we probe community network clusters by converting the atomic cross-correlations to a coarse-grained type community network using the Girvan-Newman clustering method.74 To investigate both intra- and inter-protomer coupling, we select the cutoff of the cross-correlation between residues i and j, |Cij|, such that the maximum modularity (see Sec. II and Fig. S6) corresponds to a larger number of community clusters than the number of ClpP protomers, 14, in each of the three ClpP configurations. We find that this requirement is satisfied by |Cij| ⩾ 0.6, which yields community clusters in ClpP configurations (Tables S3–S5). As shown in Fig. 3, the community network analysis reveals distinct patterns of domain coupling in ADEP-bound and unbound conformations of ClpP. Strikingly, the ADEP-bound open conformation is characterized by extensive intra-subunit coupling within six subunits that involves strong coordination between the N-terminal loop, handle, and head regions [Fig. 3(a)]. By contrast, the limited coupling is observed across the equatorial interface between protomers of the cis (A–G) and trans (H–N) rings (according to the ClpP protomer organization shown in Fig. 1), involving the handle regions of two protomer pairs (C-J and D-K). Removal of ADEP from the open pore conformation results in dramatic changes in intra-protomer coupling, which nearly abolish the correlation between the handle region and the loop and head regions [Fig. 3(b)]. Furthermore, the inter-ring coupling is enhanced through coordinated motions of handle regions of five protomer pairs. These changes yield coupling patterns similar to the closed conformation, in which intra-protomer coupling is weaker and inter-ring coupling between handle regions of protomer pairs is prevalent [Fig. 3(c)]. We note that these results are consistent with the important role of the handle domain in tetradecamer formation and stacking of the two rings (cis and trans).99
C. Optimal and suboptimal path analysis reveal stronger coupling between allosteric and active sites of ADEP bound ClpP
The absence of large-scale rigid-body domain motions of ClpP protomers and the proximity of the ADEP binding sites to the N-terminal loops suggest that allosteric signals can be transmitted through relatively short, intra-protomer, paths. The ring structure of ClpP, however, also allows effective allosteric communication to take place between ADEP binding site of one protomer and N-terminal loops of the other intra-ring protomers. To probe these allosteric mechanisms in a quantitative manner, we use approaches that combine residue–residue positional correlations and concepts in graph theory (see Sec. II). To this end, we map the allosteric paths that connect one ADEP binding site, termed “source,” and the seven N-terminal loops within one ring, or “sink.” In the correlation network, each node represents one protein residue and the connecting edges have associated lengths that reflect the cross-correlation values (see Sec. II).54 The path length between nodes located in the source and sink is then identified as the sum of the lengths of all individual edges that connect these nodes. We emphasize that, given the construction of an allosteric map in the correlation space, the relative importance of the allosteric paths depends on the strength of the coupling between constituent residue pairs rather than their proximity in the physical space. In this framework, the shortest paths between residues of the source and the sink reveal the strongest allosteric couplings within a protein.51 Interestingly, the minimal length, or “optimal,” path was not found to be very sensitive to changes in the protein configuration, therefore, analysis of allosteric communication limited to this path may yield misleading conclusions about the signaling pathways.59 Instead, a comprehensive analysis requires the additional evaluation of “suboptimal” paths, which have slightly longer lengths than the optimal path.59
Using this framework, we computed 300 paths for each residue pair formed by one source and one sink amino acid. To this end, we select as source one ADEP binding site [Fig. 4(a)], comprising residues Val44, Leu48, Phe49, Glu51, Ala52, Phe82 from protomer i − 1, and Arg22, Leu23, Val28, Phe30, Tyr60, Tyr62, Ile90, Met92, Leu114, Leu189 from the adjacent protomer i, and as sink two representative residues, Thr10 and Gly13, located near the turn of each N-terminal loop in the same (cis) ring. The selection of representative loop residues suffices for the purposes of mapping paths, as the allosteric signal propagates within the loop exclusively through intra-loop residues. By considering paths connecting each ADEP binding site to each loop, we examined a total of 470 400 pathways traversing one ClpP ring (see Sec. II). In all three ClpP configurations, the optimal pathways are intra-protomeric and connect Arg22 to loop residues 13–17 (Table S6). Thus, optimal pathways are largely stable among the three ClpP configuration, which is in accord with observations made on small proteins noted above.
In order to characterize the strength of allosteric communication from a broad perspective, we examine the set of suboptimal paths in each pore configuration as well as the subsets of paths connecting each ADEP binding site to a specific loop [Fig. 4(a)]. As shown in Fig. 4(b), the histograms of suboptimal path lengths in the three ClpP configurations highlight the stronger coupling between binding sites and loops affected by ADEP binding, with the shortest paths corresponding to the open state. By contrast, in the closed state, the path length distribution is shifted toward longer paths, with lengths up to ≃8, which indicates a weaker coupling between the ADEP binding site and the N-terminal loops. Perturbation of the open configuration through the removal of the ADEP molecules results in a slight shift of the suboptimal path length distribution toward paths of intermediate lengths between those found in the open and closed configurations. Next, we probed the intra-ring propagation of the allosteric signal by examining the path length distributions corresponding to paths connecting each binding site to the N-terminal loop of each protomer [Figs. 4(c)–4(f)]. In accord with the above observations, path lengths corresponding to the open state are consistently shorter than those for the closed state. In both open and closed pore configurations, the shortest paths originating from the ADEP binding site of protomer i correspond to paths to the N-terminal loop of the same protomer [Figs. 4(c)–4(f)]. In the open state, we find relatively short paths, of length ≃2, to the nearest neighbor loops in both clockwise (CW) and counterclockwise (CCW) directions. This strong intra-ring coupling supports the ability of the hexameric ATPase to trigger ClpP gate-opening even with the substoichiometric occupation of distal binding sites. Decreasing coupling strength is found in the N-terminal loops of the second and third nearest neighbor protomers, however, with slightly shorter paths in the CW direction [Figs. S7(A)–S7(C)]. In the closed state, we note the larger overlap than in the open state between path length distributions corresponding to neighboring loops and that of the same protomer loop, which indicates a slower intra-ring decay of the weaker allosteric signal in this state. In addition, the CW and CCW distributions corresponding to N-terminal loops of equidistant nearest neighbors overlap nearly completely [Figs. S7(D)–S7(F)].
The distinct pattern of path lengths in the open and closed configurations of ClpP, as well as the marked effect of perturbations on the path lengths indicate a significant dynamic rewiring of allosteric communication. To obtain the microscopic understanding of the changes induced by perturbations, we probe the detailed paths in each configuration. To this end, we examine the three-dimensional maps associated with suboptimal paths (Fig. 5). The structural maps of allosteric pathways in the open configuration indicate strong signaling propagated from the ADEP binding site to the nearest three N-terminal loops in the cis ring [Fig. 5(a)]. Removal of the ADEP molecules yields weaker coupling (indicated by thinner lines connecting the nodes) and increasing number of paths connecting the ADEP binding site to the more distant loops [Fig. 5(b)]. In the closed configuration, paths that connect multiple loops are increasingly found [Fig. 5(c)].
Interestingly, as shown in Fig. 5(d), the analysis of inter-ring pathways connecting the binding site of one protomer (B) in the cis ring and the N-terminal loops of its partner (I) in the trans ring, using |Cij| ⩾ 0.6, reveals shorter paths, and therefore, tighter inter-ring coupling, in the open configuration compared with the closed configuration. In the open configuration, allosteric communication between rings is primarily mediated by paths with lengths , which are not available in the closed configuration, and the overlap between the two distributions is small, with the overlapping coefficient OC ≃ 0.29 [Fig. 5(d)]. Although, as noted in Sec. III B, the inter-ring handle interface is abolished in the open configuration, shorter paths become available through alternate routes, as indicated in Figs. 5(e) and 5(f). The shorter path lengths corresponding to the open configuration can be rationalized in terms of the connectivity between community networks of the cis and trans protomers that mediate the inter-ring allosteric communication, indicated in Fig. 3. Whereas, in the closed configuration, paths must cross two gaps between intra-protomer communities to connect with the inter-protomer handle community, in the open configuration, a single, inter-protomer, gap must be crossed to connect the cis and trans protomer communities [Figs. 3, 5(e), and 5(f)]. The differential gap penalty results in a length of ≃5.9 for the optimal path in the open configuration, and ≃6.9 for the closed pore configuration, even as the structural details of the paths are similar (Table S7). We surmise that the handle interface, rather than facilitating inter-ring allosteric communication, acts as a constraint in the closed configuration, resulting in two intra-protomer path length penalties. By contrast, in the open configuration, the inter-protomer constraint is removed and a single path length penalty is applied.
Next, to identify residues that are critical for allosteric signaling, we compute the normalized intra-ring node degeneracy, i.e., the fraction of paths that include each node (see Sec. II). Table I summarizes the residues in the cis ring that have the node degeneracy in at least one configuration, as revealed by path analysis. The structural location of residues that act as critical nodes is highlighted in (Fig. 6). Notably, residues Ile19i, Tyr20i, Ser21i, Ser21i−1, and Leu24i−1, at the base of N-terminal loops proximal to the ADEP binding site and Gln46i−1 and Leu50i−1, which are in the proximity of ADEP binding site, have node degeneracy in all three setups and therefore are likely to have a critical contribution to allosteric communication. (In our notation, residue labeling includes as subscript the protomer location in the cis ring, with the “source” binding site occupying protomers i − 1 and i. Protomers are numbered in the counterclockwise direction in the top view of the ring.) One set of residues, Ile19i, Ser21i−1, Leu24i−1, and Leu50i−1, has a slightly reduced degeneracy in the closed pore configuration and upon removal of ADEP molecules in the open configuration, whereas the other residues have increased degeneracies. Interestingly, we also note that residue Gln46i−1 shows no change in node degeneracy values in the three setups, indicating a weak sensitivity to structural perturbations. Structural studies revealed that the presence of Ile19 is crucial for the stability of the N-terminal loops and substrate translocation in the ADEP bound open state.100 We note that Ile19 is present in all three setups with a slightly higher value in the open setup.
Residue . | Ser21i−1 . | Leu24i−1 . | Ala45i−1 . | Gln46i−1 . | Leu50i−1 . |
---|---|---|---|---|---|
Open | 0.17 | 0.26 | 0.07 | 0.14 | 0.19 |
Open, no ADEP | 0.17 | 0.26 | 0.08 | 0.14 | 0.15 |
Closed | 0.14 | 0.17 | 0.11 | 0.14 | 0.17 |
Residue . | Ser21i−1 . | Leu24i−1 . | Ala45i−1 . | Gln46i−1 . | Leu50i−1 . |
---|---|---|---|---|---|
Open | 0.17 | 0.26 | 0.07 | 0.14 | 0.19 |
Open, no ADEP | 0.17 | 0.26 | 0.08 | 0.14 | 0.15 |
Closed | 0.14 | 0.17 | 0.11 | 0.14 | 0.17 |
Residue . | Ile19i . | Tyr20i . | Ser21i . | Leu24i . | Ile29i . | Ser21i+1 . |
---|---|---|---|---|---|---|
Open | 0.22 | 0.11 | 0.19 | 0.11 | 0.10 | 0.06 |
Open, no ADEP | 0.11 | 0.19 | 0.26 | 0.12 | 0.08 | 0.13 |
Closed | 0.16 | 0.23 | 0.22 | 0.05 | 0.09 | 0.06 |
Residue . | Ile19i . | Tyr20i . | Ser21i . | Leu24i . | Ile29i . | Ser21i+1 . |
---|---|---|---|---|---|---|
Open | 0.22 | 0.11 | 0.19 | 0.11 | 0.10 | 0.06 |
Open, no ADEP | 0.11 | 0.19 | 0.26 | 0.12 | 0.08 | 0.13 |
Closed | 0.16 | 0.23 | 0.22 | 0.05 | 0.09 | 0.06 |
D. N-terminal mutations differentially alter allosteric communication in open and closed pore configurations
We further explore how perturbations alter the coupling between the allosteric and active sites of ClpP by engineering point mutations at N-terminal sites indicated to be functionally important (see Sec. II and Table S1).100 We focus, on the one hand, on single point-mutations, such as I7P, E8K, and K25E, that alter the stability of individual N-terminal loops, and, on the other hand, on double, E14A-R15A, and triple, E14A-R15A-E8A and E14A-R15A-K25A, mutations that affect intra-as well as inter-loop stabilization (Fig. 7). To this end, we compare the path length distributions corresponding to shortest 5000 suboptimal paths among the allosteric pathways connecting one binding site to all the ClpP N-terminal loops in the cis ring (see Sec. II). As shown in Fig. 7, we find that mutations have distinct effects on allosteric coupling, which can manifest differently in the open and closed pore configurations. Single mutations have generally large effects in both open and closed configurations. As shown in Fig. 7(b), the I7P mutation, which makes the coil conformation more favorable, has a drastic effect on the open configuration, with a low OC ≃ 0.22, in accord with the deleterious effect of this mutation on polypeptide degradation.100 E8K and K25E mutations, which remove the stabilizing salt bridge at the base of the N-terminal loop, affect both the open and closed configurations [Figs. 7(b) and 7(c)], which is consistent with their diminished degradation rates of both polypeptides, which stringently require an open gate, and peptides, which may be internalized through the closed gate.100 Double and triple mutations, including the E14A-R15A mutations that provide stabilizing inter-loop salt bridges, have a lesser effect on the closed than on the open pore configuration, as the salt bridges formed by Glu14 and Arg15 residues in neighboring protomers are not present in the closed state even in the wild-type ClpP [Figs. 7(d) and 7(e)]. We find that the triple mutation E14A-R15A-K25A has the strongest effect on allosteric paths, with OC ≃ 0.14 in the open configuration, which is in accord with the largest reduction in the degradation rate of polypeptides compared with the single and double mutations.100
IV. DISCUSSION
Our computational studies probe the allosteric mechanisms of the ClpP peptidase in response to effectors that activate its gate-opening conformational transition. Intriguingly, this transition involves limited structural rearrangement outside of the gate-controlling N-terminal loop region. How is the allosteric signal propagated in the absence of large-scale rigid-body motions of ClpP subunits? To address this question, we undertook comparative studies of the open and closed pore configurations of ClpP by performing equilibrium MD simulations of each of these states. Our results, quantified through principal component and normal mode analysis, highlight the similarity of motions of the peptidase core in the two states even as the loop motions are significantly different.
In our analysis, both structural perturbation, derived from Normal Mode calculations, and node degeneracies, computed using the positional cross-correlations, highlighted a set of hot-spot residues that are critical for the gate opening transition of ClpP. We find that the hot-spots derived from the harmonic approximation used in NMA reveal regions that are highly flexible and dynamic through their proximity to the N-terminal loop regions or the C-terminal regions. Node degeneracy values derived from dynamic cross-correlations account for hot-spot residues that are distributed in both N-terminal loops and protease core. Here we note that residues Ile19, Tyr20, and Leu24 were common to both SPM and node degeneracy calculations indicating these residues as critical during the allosteric regulation. Our results are in agreement with structural studies,100 which have shown that large non-polar side chains of Ile are critical for the integrity of the N-terminal loops.
Our detailed analysis of intra- and inter-ring allosteric pathways reveals stronger communication in the open configuration between each ADEP binding site and N-terminal loops of distant protomers. According to these results, in this configuration, neighboring intra-ring protomers are strongly coupled, consistent with the observed ability of the ATPase to trigger gate opening even as it activates only six binding sites. Interestingly, inter-ring coupling is also strengthened in the open configuration, even as the handle interface present in the closed configuration between protomer partners is removed. Stronger coupling in the open configuration is affected through efficient crossing of a single gap between residue communities that reduces the penalties of crossing gaps between multiple communities in the closed configuration.
Allosteric communication in the ClpP peptidase is likely to be further modulated by two external factors, namely its interactions with the ATPase partner, such as the single-ring ClpX or the double-ring ClpA, and with the substrate protein being degraded. The effect of the ATPase interaction reflects the variability of IGL/IGF loop binding to the seven ClpP binding sites during the catalytic cycle. Given the asymmetric binding of the six ATPase loops to the seven binding sites of ClpP, it is plausible that the signaling induced by ADEP binding represents the upper bound to the coupling strength between binding sites and the ClpP N-terminal loops. Additionally, the non-concerted conformational transitions of the ATPase hexamer further weakens the allosteric coupling and breaks the ring symmetry. Nevertheless, asymmetric intra-ring allostery may support ClpP’s active internalization of the polypeptide chain in the degradation process through non-concerted conformational changes of the pore loops. In support of the active action, studies using a ClpAP complex with one or more IGL loops of ClpA covalently crosslinked to the ClpP binding sites allow degradation to proceed at a slightly reduced rate compared with the noncovalent ClpAP complex.101
Allosteric communication within the ATPase itself has a high complexity and can, therefore, give rise to multiple responses in ClpP. An illustration of the complex ATPase allostery is provided by the double-ring ClpB disaggregase, which is not a cellular partner of ClpP, but can be engineered to form a complex with it.102 Our community network analysis of apo, nucleotide and/or substrate-bound configurations of ClpB revealed distinct intra-ring communication within the nucleotide-binding domain (NBD) 1 and 2 rings.90 Whereas, in the NBD1 ring, strong coupling is found between Large and Small subdomains of neighboring subunits, in the NBD2 ring, intra-protomer coupling is dominant.
The interaction between the ClpP peptidase and its substrate protein partner also has the potential to effect changes in the allosteric signaling. As noted in a recent study, ClpP forcefully grips the titin substrates with forces that exceed those of the partner ATPase.103 Such strong signaling may dramatically alter the allosteric paths and break the symmetry of communications between the ADEP binding site and N-terminal loops.
SUPPLEMENTARY MATERIAL
See the supplementary material for Table S1 showing the summary of the MD setups for ClpP wild-type and mutant setups, Table S2 for hot-spot residues derived from the structural perturbation method, Tables S3–S5 for highly correlated residue communities and Tables S6 and S7 for the list of optimal paths in each ClpP configuration, Fig. S1 for RMSD time series in each ClpP configuration, Fig. S2 for DCCM convergence computed over multiple trajectories, Fig. S3 for the largest 20 eigenvalues of the PC modes for each ClpP configuration, Fig. S4 for normalized amplitudes of amino-acid motions associated with the top five normal modes, Fig. S5 for motions associated with top normal modes of the ClpP tetradecamer, Fig. S6 for network modularity, Fig. S7 for path length distributions of equidistant protomer loops, Movies SM1-4 for motions corresponding to principal components in the open (SM1–2) and closed pore configurations (SM3–4) and Movies SM5-6 for motions associated with the top two normal modes.
ACKNOWLEDGMENTS
The authors gratefully acknowledge stimulating discussions with Sue Wickner and Mike Maurizi. This work has been supported by the National Science Foundation Grant Nos. MCB-1516918 and MCB-2136816 to G.S. This work used the Extreme Science and Engineering Discovery Environment (XSEDE), which is supported by NSF Grant No. ACI-1548562. XSEDE Bridges resources at the Pittsburgh Supercomputer Center were used through allocation TG-MCB170020 to G.S. Partial support was received through research cyberinfrastructure resources and services provided by the Advanced Research Computing center at the University of Cincinnati.
AUTHOR DECLARATIONS
Conflict of Interest
The authors have no conflicts to disclose.
Author Contributions
Ashan Dayananda: Conceptualization (equal); Data curation (lead); Formal analysis (lead); Investigation (equal); Methodology (equal); Validation (equal); Visualization (equal); Writing – original draft (lead); Writing – review & editing (lead). T. S. Hayden Dennison: Conceptualization (equal); Data curation (equal); Formal analysis (equal); Investigation (equal); Methodology (equal); Validation (equal); Visualization (equal). Hewafonsekage Yasan Y. Fonseka: Conceptualization (equal); Data curation (equal); Formal analysis (equal); Investigation (equal); Methodology (equal); Validation (equal); Visualization (equal). Mohammad S. Avestan: Conceptualization (equal); Data curation (equal); Formal analysis (equal); Investigation (equal); Methodology (equal); Validation (equal); Visualization (equal). Qi Wang: Conceptualization (equal); Investigation (supporting); Methodology (equal); Validation (supporting); Visualization (supporting). Riina Tehver: Conceptualization (equal); Data curation (equal); Formal analysis (equal); Investigation (equal); Methodology (equal); Project administration (equal); Resources (equal); Software (equal); Validation (equal); Visualization (equal); Writing – original draft (equal); Writing – review & editing (equal). George Stan: Conceptualization (equal); Data curation (equal); Formal analysis (equal); Funding acquisition (equal); Investigation (equal); Methodology (equal); Project administration (equal); Resources (equal); Supervision (equal); Validation (equal); Visualization (equal); Writing – original draft (equal); Writing – review & editing (equal).
DATA AVAILABILITY
The data that support the findings of this study are available from the corresponding author upon reasonable request.