Redox processes are important in chemistry, with applications in biomedicine, chemical analysis, among others. As many redox experiments are also performed at a fixed value of pH, having an efficient computational method to support experimental measures at both constant redox potential and pH is very important. Such computational techniques have the potential to validate experimental observations performed under these conditions and to provide additional information unachievable experimentally such as an atomic level description of macroscopic measures. We present the implementation of discrete redox and protonation states methods for constant redox potential Molecular Dynamics (CEMD), for coupled constant pH and constant redox potential MD (C(pH,E)MD), and for Replica Exchange MD along the redox potential dimension (E-REMD) on the AMBER software package. Validation results are presented for a small system that contains a single heme group: *N*-acetylmicroperoxidase-8 (NAcMP8) axially connected to a histidine peptide. The methods implemented allow one to make standard redox potential (E^{o}) predictions with the same easiness and accuracy as p*K*_{a} predictions using the constant pH molecular dynamics and pH-REMD methods currently available on AMBER. In our simulations, we can correctly describe, in agreement also with theoretical predictions, the following behaviors: when a redox-active group is reduced, the p*K*_{a} of a near pH-active group increases because it becomes easier for a proton to be attached; equivalently, when a pH-active group is protonated, the standard redox potential (E^{o}) of an adjacent redox-active group rises. Furthermore, our results also show that E-REMD is able to achieve faster statistical convergence than CEMD or C(pH,E)MD. Moreover, computational benchmarks using our methodologies show high-performance of GPU (Graphics Processing Unit) accelerated calculations in comparison to conventional CPU (Central Processing Unit) calculations.

## I. INTRODUCTION

The coupling between electron and proton transfer is essential to describe many important biological processes. The protonation/redox state of proteins and other biomolecules is related to their structure and function, and it can affect properties like stability, ligand binding, catalysis, absorption spectrum, among others.^{1,2} This happens because the solution’s pH and redox potential (reduction potential or electrode potential) affect the charge distribution on the biomolecules due to changes in the predominant protonation/redox state of the titratable groups. For many systems, the standard redox potential (also called the midpoint reduction potential) turns out to be pH-dependent.^{3,4} Hence, theoretical methods that can correctly describe systems at constant pH and constant redox potential are very important. Previous studies in this area have been published.^{5–12}

Due to the mathematical similarity between the Henderson-Hasselbalch equation (applied to acid-base reactions) and the Nernst equation (applied to electrochemistry, redox reactions), the theoretical derivations used on constant pH methods^{13–16} can be extended to constant redox potential methods.^{17} There are two types of Constant pH Molecular Dynamics (CpHMD) approaches:^{18} the ones that consider continuous protonation states^{19–22} and the ones that consider discrete protonation states.^{13,14} The continuous protonation states approach is based on the introduction of a fictitious particle at each titratable site that is propagated through conventional Molecular Dynamics (MD) according to a pH-dependent force. The discrete protonation states approach makes use of Metropolis Monte Carlo exchange attempts between different protonation states over the course of the MD. Regardless of the approach type, the accuracy of predicted p*K*_{a} values relies not only on the accuracy of the force field parameters but also on the extent of the conformational sampling.

Replica Exchange Molecular Dynamics (REMD) is a state-of-the-art method that is able to significantly improve conformational sampling convergence^{23–28} while taking advantage of computational parallelization. This approach consists of individual simulations (which are considered as independent replicas) that periodically attempt to exchange information between them through a Metropolis Monte Carlo scheme. Studies regarding REMD along the pH dimension (pH-REMD) have been reported in the literature.^{18,21,24,25,29,30} In pH-REMD, each replica explores the conformational space at a different pH value.

In this work, we present the implementation of constant redox potential MD (CEMD), of coupled constant pH and constant redox potential MD (C(pH,E)MD), and of REMD along the redox potential dimension (E-REMD) in the AMBER software package.^{31} To the best of our knowledge, this is the first implementation of E-REMD. Discrete redox and protonation states are considered in our methodologies, and calculations can be done using both implicit [i.e., Generalized Born (GB)] or explicit solvent models. These new implementations are based on existing CpHMD and pH-REMD methods implemented on AMBER by Mongan *et al.*^{13,18,25} We present results for *N*-acetylmicroperoxidase-8 (NAcMP8)^{32} with a histidine peptide as the axial ligand. NAcMP8 is a small peptide derived from cytochrome *c* that has a single ferric heme group and is a good reference compound candidate for the simulation of larger proteins containing one or more equivalent heme groups.

Previous simulations for large systems using AMBER’s high-performance GPU (Graphics Processing Unit) code have shown to have great speedups over calculations using conventional CPUs (Central Processing Units).^{33,34} Our CEMD, C(pH,E)MD and E-REMD implementations are also available using AMBER’s GPU code. To our knowledge, these are the first implementations of constant redox potential methods using the GPU accelerated code.

## II. THEORY AND METHODS

### A. Constant redox potential molecular dynamics (CEMD)

In CEMD, we make use of Monte Carlo transitions between discrete redox states, represented by different atomic charge distributions for reduced and oxidized states of a given redox-active residue. Furthermore, in CEMD, a predetermined number of MD steps are performed, the simulation is then halted, and a redox state change is attempted. In order to understand how the energetic cost of this attempt is computed, we start from the reduction reaction,

The equilibrium constant is given by $Ke=Aredn\u2212vAoxin+e\u2212v$ and is unitless. It is possible to devise expressions equivalent to $pH=\u2212logH+$ and *pK*_{a} = −log *K*_{a} for the redox potential $E=\u2212kbTFlne\u2212$ and the standard redox potential $Eo=kbTvFlnKe$, where *F* is the Faraday constant. Then, from the equilibrium constant expression, we get to the Nernst equation^{35}

From this equation, an expression for the fraction of reduced species is devised,

In this equation, *n* is the Hill coefficient and is a post hoc correction added to account for the fact that the redox-active group might be affected by other pH- or redox-active groups nearby. We can also devise an expression for *f*_{red} based on a statistical mechanics point of view,

Any theoretical modeling based solely on this equation requires prior knowledge of the *E*^{o} value; however, this is one of the properties one wants to be able to predict theoretically. To overcome this issue, we also write Eq. (5) for a reference compound where the value of *E*^{o} is known. We also split the Gibbs free energy of reduction into electrostatic (Coulombic) and non-electrostatic contributions

The approximation made on AMBER is that $\Delta Gnon\u2212elec=\Delta Gnon\u2212elec,ref$, which leads to

This is the equation used on CEMD to perform Monte Carlo redox state change attempts. $\Delta Gelec,ref$ is a precomputed quantity adjusted to reproduce the $Erefo$ value for the reference compound and $\Delta Gelec$ is computed on every redox state change attempt. $\Delta Gelec$ is obtained by taking the difference between the electrostatic energy associated with the proposed and current redox states.

### B. Constant pH and redox potential molecular dynamics (C(pH,E)MD)

In CpHMD, the core equation for protonation state change attempts,^{13,25,30} equivalent to Eq. (8), is

In our C(pH,E)MD implementation, protonation and redox state change attempts are performed separately, even if the attempts happen at the same MD step. The relation between protonation and redox processes comes naturally from the fact that a successful redox state change attempt will alter the charges of a redox-active group and this change will thus affect the next protonation state change attempts of neighboring pH-active groups. Similarly, a successful protonation state change attempt will affect the next redox state change attempts of near redox-active groups.

### C. Replica exchange molecular dynamics along the redox potential dimension (E-REMD)

The chemical potential *μ*_{A} associated with a substance *A* is defined as $\mu A=\mu Ao+kbT\u2061lnaA$, where $\mu Ao$ is the standard chemical potential and *a*_{A} is the activity of *A*. The standard chemical potential for electrons is chosen as $\mu e\u2212o=0$; thus, by making use of $E=\u2212kbTFlne\u2212$, we obtain

This shows that performing simulations at constant redox potential is the same as performing simulations at constant chemical potential for electrons $\mu e\u2212$.

Hereafter, we show how the Monte Carlo exchange criterion derived by Itoh *et al.*^{30} for pH-REMD can be extended to E-REMD. First, we consider a system of *M* non-interacting replicas. Replica *i* (*i* = 1, …, *M*) has coordinate and momentum vectors $q\u2192i$ and $p\u2192i$, temperature T, a redox potential value of *E*_{l} (*l* = 1, …, *M*), and $Nie\u2212$ electrons. If we exchange two replicas, the following detailed balance condition must be satisfied:

where $Xil\u2261q\u2192i,p\u2192i,Nie\u2212,El$, *P* is the equilibrium probability, and $w$ is the transition probability. In the semi-grand canonical ensemble of electrons, where the volume (*V*), temperature (*T*), and $\mu e\u2212$ are fixed, the equilibrium probability is given by

where $\epsilon i$ is the energy of the system and $\Xi $ is the grand canonical partition function. We can write the ratio of the transition probabilities as

The Metropolis Monte Carlo solution for the exchange probability can be then written as

It is important to notice that the acceptance ratio decreases exponentially with the difference in redox potential values. Therefore, it is important to choose redox potential values for the replicas such that the acceptance ratio is not too low. For example, for *T* = 300 K, $Nie\u2212\u2212Nje\u2212=1$, and *E*_{n} − *E*_{l} = 30 mV, we have $e\u2212\Delta \u224531%$. Roughly the same probability is obtained on pH-REMD by using pH intervals of 0.5. This is already expected because it can be shown that 1.0 pH unit is equivalent to 59.5 mV in redox potential units at *T* = 300 K.

### D. Theoretical description of the pH dependence of *E*^{o} and the redox potential dependence of *pK*_{a} values

By making use of a thermodynamic cycle, it is possible to devise equations to describe the pH dependence of standard redox potential *E*^{o} and the redox potential dependence of *pK*_{a} values. This thermodynamic cycle and the derivations for all the equations shown in this subsection are available in the supplementary material. By making thermodynamic considerations and assuming the system to contain only a single redox-active residue and one or more pH-active residues, we obtain

where $Eproto$ and $Edeproto$ are the standard redox potential of the redox-active residue when all the pH-active residues are, respectively, fully protonated or fully deprotonated. $pKa,redi$ and $pKa,oxii$ are the *pK*_{a} values of the *i*th pH-active residue when the redox-active residue is, respectively, fully reduced or fully oxidized.

Equation (16) shows that the difference between $Eproto$ and $Edeproto$ is directly related to differences between $pKa,redi$ and $pKa,oxii$. As an example of that, if the pH-active groups do not suffer from the interaction with the redox-active group such that $pKa,redi=pKa,oxii$, then according to Eq. (16), we must necessarily have $Eproto=Edeproto$.

We can also, in the following equation, describe the pH-dependent *E*^{o} value for the case where the species of all pH-active residues are not all fully protonated nor all fully deprotonated:

We observe that at the low pH limit (i.e., large [*H*^{+}] values), *E*^{o} becomes $Eproto$, and at the high pH limit (i.e., small [*H*^{+}] values), *E*^{o} becomes exactly the expression for $Edeproto$ that we obtain from Eq. (16).

The redox potential dependence of the *pK*_{a} values of the pH-active residues can be described by the following equation:

## III. CALCULATION DETAILS

In this work, all the simulations were performed using an in-house modified version of AMBER 16.^{31} Our implementations are part of the new AMBER 18 release.^{55} In this section, we are going to discuss the C(pH,E)MD and E-REMD implementations, followed by details about parametrization of our test system, and the implicit and explicit solvent calculations performed.

### A. C(pH,E)MD implementation

Previous discrete protonation states CpHMD publications have analyzed, among other things, the behavior of *pK*_{a} predictions in long Generalized Born (GB) implicit solvent (IS) simulations,^{18} how a higher frequency of exchange attempts affects convergence on pH-REMD,^{25} and how variables like the solvent relaxation time affects explicit solvent simulations.^{18} These discussions will be revisited here in the context of constant redox potential, even though we expect to see these same behaviors due to the natural likeness between CpHMD and CEMD. This likeness can be seen by comparing Eqs. (8) and (9) and arises from the similarities between the Henderson-Hasselbalch and the Nernst equations.

Figure 1 shows the workflow of C(pH,E)MD as implemented in AMBER. The number of standard MD steps between state change attempts is tunable and may be different for protonation and redox state change attempts. Thus, protonation and redox state change attempts may happen at the same MD step. In this case, the protonation state change attempts are performed first.

In both the implicit (IS) and explicit solvent (ES) implementations, the proposed state of a residue in a protonation or redox state change attempt is chosen randomly from the available states of the residue, excluding the current state the residue is in.

#### 1. Implicit solvent (IS) implementation

When the standard MD is halted for a protonation and/or redox state change attempt (see Fig. 1), a single residue is picked randomly and is the only residue considered for a protonation or redox state change attempt.

#### 2. Explicit solvent (ES) implementation

In the ES implementation, when the MD is halted, one state change is attempted for each pH- and/or redox-active residue. The residues are chosen in random order until all pH- and/or redox-active residues are visited once. Likewise in the ES CpHMD AMBER implementation,^{18} during redox state change attempts, the water molecules and any eventual ions are stripped and the $\Delta Gelec$ term for a redox-active residue [see Eq. (8)] is computed using a GB model. After the attempts for each residue are performed, the solvent molecules are restored. If any state change was accepted, then solvent relaxation is performed. Solvent relaxation consists in performing a predetermined number of MD steps only on the degrees of freedom associated with the solvent, including any possible ions, while the solute degrees of freedom are kept fixed. Solvent relaxation is done to adapt the solvent conformation to the new protonation and/or redox state. The number of solvent relaxation steps may be different for protonation and redox state change attempts. In order to make C(pH,E)MD more efficient, if a protonation and a redox state change attempt happen at the same MD step when the standard MD is halted (see Fig. 1), only a single solvent relaxation is performed using either the number of relaxation steps for protonation or redox state change attempts, whichever is larger. This actually makes the computational cost of C(pH,E)MD close to that of CpHMD or CEMD in ES simulations, as shown in Sec. IV D.

Doing one attempt for each titratable residue when the simulation is halted allows us to perform protonation and redox state change attempts less frequently than in the IS implementation, thus lowering the total number of solvent relaxation steps over the course of the simulation. In ES constant pH and/or constant redox potential MD simulations, the solvent relaxation contributes the most to the computational cost of the methodology in comparison to regular MD. Therefore, lowering the total number of relaxation steps makes the simulation computationally more efficient.

During protonation or redox state change attempts, the sudden change in charge, even if the pH- or redox-active residue is not solvent exposed, destabilizes the solvent molecules around the solute. This makes the use of ES calculations for state change attempts unfeasible as an unfavorable energy change would lead to a low state change probability. For this reason, our state change attempts require the use of implicit solvent calculations where the solvent instantaneously adapts to the new charge set. Stern^{15} has proposed an ES CpHMD method that uses an interpolation between the current and the proposed protonation states and does not require the use of implicit solvent calculations, but its implementation has not been attempted in the present publication.

Another point worth to discuss is the fact that the net charge of the system changes when protonation or redox state change attempts are accepted. It has been shown that when periodic boundary conditions are in place and the electrostatic interactions are computed using a lattice sum, finite-size effects arise on simulations where the volume or the net charge is changing throughout the simulation.^{36,37} These effects have been shown to be higher for small unit cells and could lead to unrealistic behaviors.^{36} By using GB calculations for state change attempts in our methodology, we avoid these finite-size effects. In our implementation, the standard MD steps between state change attempts are done at constant charge.

### B. E-REMD implementation

In replica exchange simulations, an important variable is the exchange attempt frequency (EAF). If the EAF equals zero, this means that no exchange attempts between replicas are performed and E-REMD becomes equivalent to C(pH,E)MD, or to CEMD if the constant pH option is not considered. Previous AMBER replica exchange publications have shown that by increasing the EAF the sampling convergence could be improved as the simulated system may overcome barriers more easily.^{25,38,39} In this work, we show that a better convergence is also obtained for E-REMD on properties that depend on the redox potential in comparison to C(pH,E)MD or CEMD.

As can be seen in Eqs. (14) and (15), the E-REMD exchange probability depends on the difference between redox potential values of the replicas we are trying to exchange. Therefore, in the same way that is done in AMBER for pH-REMD, in our E-REMD implementation we only attempt exchanges between replica neighbors to increase the exchange probability. To exemplify, consider 4 replicas ordered by increasing values of redox potential. At the first exchange attempt, we attempt to exchange replicas 1 with 2 and 3 with 4. On the next exchange attempt, the attempts are made between replicas 2 with 3 and 1 with 4. This cycle is repeated until the end of the simulation is reached.

### C. NAcMP8 parametrization

In many proteins containing heme groups, each iron atom at the center of a porphyrin ring is axially connected to two histidines.^{11,40,41} For the theoretical modeling of such proteins, it is ideal to have a reference compound in agreement with this conformation. The standard redox potential of *N*-acetylmicroperoxidase-8 (NAcMP8) axially connected to an imidazole molecule has been experimentally measured as −203 mV *vs.* NHE at pH 7.0.^{32} NAcMP8 has a histidine residue from its peptide chain that is axially connected to the heme group. On the other side of the porphyrin plane, we axially connected a histidine peptide as shown in Fig. 2. This closely represents the experimental situation.

The structure of NAcMP8 is sketched in Fig. 1 of Ref. 32. From this figure and by making modifications to the structure of the horse heart cytochrome *c*^{42} available at the PDB code 1HRC, we obtained the structure shown in Fig. 2. The iron atom, the porphyrin ring, together with the side chains of two histidines and two cysteines were considered as a single residue we called HEH, which is colored in green in Fig. 2. HEH is the redox-active residue that changes its atomic charge distribution when a redox state change attempt is successful. Therefore, a redox state change affects the charge distribution on the histidines and cysteines. The charge distributions of the reduced and oxidized states of HEH can be found at Table S1 of the supplementary material. NAcMP8 contains three pH-active residues: two propionates (PRN) colored in red in Fig. 2 and one glutamate (GL4) colored in blue. The propionates are separate residues from HEH. The charge distributions of the deprotonated and protonated states of PRN can be found in Table S2 of the supplementary material.

Residues other than the heme group were parameterized using the AMBER FF99SB force field.^{43} The heme force field parameters were taken from Crespo *et al.*,^{44} with the atomic charges for the reduced and oxidized states adapted from Henriques *et al.*^{40} The experimental p*K*_{a} of propionic acid is 4.85.^{45} The $\Delta Gelec,ref$ term from Eq. (9) was fitted in implicit and explicit solvent constant pH simulations of propionic acid in water using *pK*_{a,ref} = 4.85. The fitted value was then used in the simulations for the NAcMP8 propionates. $\Delta Gelec,ref$ from Eq. (8) for the HEH residue was also fitted in IS and ES using constant pH and redox potential simulations of NAcMP8 with a histidine peptide as the axial ligand with pH = 7.0 and $Erefo=\u2212203$ mV. The glutamate residue has *pK*_{a,ref} = 4.4 and we used values already available in AMBER’s *cpinutil*.*py* tool for its $\Delta Gelec,ref$ in IS and ES.^{25}

In the explicit solvent calculations, the intrinsic solvent radius of the carboxylate oxygens on the propionates and the glutamate was changed from 1.5 to 1.3 Å in order to compensate for having two dummy hydrogens present on each oxygen.^{18,46}

### D. Implicit solvent calculations

We begin with the initial structure having the heme group in the oxidized state and both propionates and the glutamate in the deprotonated state. This structure is then minimized for 100 steps using the steepest descent algorithm and then for 3900 steps using the conjugate gradient algorithm constraining the backbone atoms with a 10 kcal/mol Å^{2} constant. The minimized structure is then heated during 3 ns by varying linearly the target temperature from 10 to 300 K over the initial 0.6 ns. During heating, the backbone atoms were constrained using a constant of 1 kcal/mol Å^{2}, and the temperature was controlled using Langevin dynamics with a friction frequency of 5 ps^{−1}.

An equilibration at 300 K is then performed on the heated structure for 10 ns using Langevin dynamics with a 10 ps^{−1} friction frequency and a 0.1 kcal/mol Å^{2} constrain on the backbone atoms. This equilibrated structure was used as the initial structure for the production simulations from where our results were extracted. For the production simulations, no positional restraints are applied and redox and/or protonation state change attempts are performed every 10 fs. Calculations with E-REMD were done using an exchange attempt frequency (EAF) of either 0.5 or 50 ps^{−1}, meaning one exchange attempt every 2000 or 20 fs, respectively. All simulations were done with a time step of 2 fs, and all bond lengths of bonds containing hydrogen atoms were constrained using the SHAKE algorithm.^{47,48}

Previous AMBER implicit solvent CpHMD publications^{13,18,25} have used the generalized Born model proposed by Onufriev *et al.*^{49} (represented by the input flag *igb* = 2 on AMBER) to account for solvent effects. For consistency, the same model was used in our implicit solvent simulations, both during MD and during protonation and redox state change attempts.

### E. Explicit solvent calculations

In the initial structure, the heme group is in the oxidized state and both propionates and the glutamate are in the deprotonated state. The system was solvated using waters from the TIP3P model in a truncated octahedron box with a buffer of 10 Å (distance between the wall and the closest atom in the solute). In order to neutralize the total charge at this initial state, two sodium ions were randomly added to the solution.

The initial structure was minimized, constraining the backbone atoms with a 10 kcal/mol Å^{2} constant, for 1000 steps using the steepest descent algorithm followed by 4000 steps of conjugate gradient. The minimized structure was then heated at constant volume during 3 ns using a backbone constrain of 1 kcal/mol Å^{2} and Langevin dynamics with a friction frequency of 5 ps^{−1} to control the temperature. During heating, the target temperature was linearly varied from 10 to 300 K over the initial 0.67 ns.

The heated structure was then equilibrated at constant pressure (1 bar) and temperature (300 K) for 8 ns with no backbone restraints using a friction coefficient of 2 ps^{−1} for the Langevin dynamics to maintain the temperature and a relaxation time of 3 ps for the Berendsen barostat to control the pressure. The system was then submitted to a new equilibration at constant volume and temperature (300 K) for 20 ns using Langevin dynamics with a 2 ps^{−1} friction frequency. The structure from this equilibration was used as the input for the production simulations from where our results were extracted.

It has been shown for ES CpHMD that when a protonation state change is accepted, relaxation times up to 4 ps could be required to completely stabilize the solvent; however, the most significant part of the solvent stabilization is done during the first 200 fs.^{18} As relaxation times on the order of 4 ps would have very significant effects on the computational cost of the methodology, lower computational relaxation times of around 200 fs were attempted. No significant differences were observed in the p*K*_{a} predictions in comparison to higher relaxation times of up to 2 ps. We verified this same behavior in ES CEMD or C(pH,E)MD on E^{o} predictions.

In our production simulations, redox and/or protonation state change attempts are performed every 200 fs and the solvent relaxation is performed for 200 fs. The E-REMD calculations were performed using an Exchange Attempt Frequency (EAF) of either 0.5 or 5 ps^{−1}; this means one exchange attempt every 2000 or 200 fs, respectively. The time step used on all MD simulations, including during solvent relaxation, is 2 fs and all bonds containing hydrogen atoms were constrained using the SHAKE algorithm.^{47,48} The particle-mesh Ewald method^{50,51} using a van der Waals cut-off and a direct space of 8 Å was considered for the long-range electrostatic interactions.

## IV. RESULTS AND DISCUSSIONS

As discussed before, replica exchange simulations are meant to accelerate convergence. This means a REMD simulation should converge faster than a simulation without REMD for the same number of steps. We will now test this concept for E-REMD by presenting results for short simulations in which neither the CEMD nor C(pH,E)MD results are converged and also evaluate what happens as a function of simulation time on long simulations. We then analyze how the use of E-REMD affects the results.

### A. CEMD vs E-REMD

#### 1. Short simulations

CEMD results are presented for simulations in which the propionates and the glutamate are both either protonated or deprotonated throughout the whole simulation, including during minimization, heating, and equilibration. Production simulations were executed for only 50 ps (25 000 MD steps) in IS and 1000 ps (500 000 MD steps) in ES. As discussed previously, we are making protonation and redox state change attempts every 10 fs in IS and every 200 fs in ES. For this reason, significantly more state change attempts are performed in IS than in ES during the same simulated time interval. In order to account for that, we are performing more steps for ES than for IS in this analysis.

Production simulations were performed for 12 values of redox potential using CEMD and also E-REMD with two different EAFs. Figure 3 shows the fraction of reduced species of HEH as a function of E and the fittings of E^{o} and the Hill coefficient using Eq. (3) to the data obtained from the simulations. The errors reported for E^{o} and the Hill coefficient are the errors in the fitting to Eq. (3). Results are shown for both implicit and explicit solvent models.

The fractions of reduced species obtained from the simulations are in agreement with the expected behavior predicted by Eq. (3). Lower fitting errors are obtained when replica exchange is used.

Protonation of a pH-active group leads to a positive increase on its charge. Hence, as electrons can neutralize the positive charge of protons, the standard redox potential of an adjacent redox-active group increases as it becomes easier for an electron to be attached there. The standard redox potentials shown in Fig. 3 have more positive E^{o} values when all pH-active residues are protonated in comparison to when they are all deprotonated, therefore in agreement with the expectation.

### B. C(pH,E)MD vs E-REMD

#### 1. Short simulations

We now extend the same test from the previous section (Sec. IV A) to C(pH,E)MD, and starting from the equilibrated structure where the propionates and the glutamate are deprotonated, we let their protonation states change during the production simulations while still allowing the redox states to change as before. Here, production simulations were executed for only 50 ps in IS and 1000 ps in ES and for 18 different pH values ranging from 2.0 to 10.5 in intervals of 0.5. Furthermore, for each pH value, 12 values of redox potential from −353 to −23 mV with an interval of 30 mV were considered. Results are shown for C(pH,E)MD and E-REMD with two different EAFs, where in E-REMD only replicas with the same pH are allowed to have their redox potential values exchanged. For each pH, we plotted the fraction of reduced species versus redox potential (as in Fig. 3) to extract the value of E^{o}. Then, the E^{o} of HEH as a function of pH is shown in Fig. 4. Results are shown for both implicit and explicit solvent models.

As the pH of the solution increases, the concentration of deprotonated species increases, leading to lower values of E^{o} as it becomes more energetically unfavorable for an electron to reduce the heme group. This behavior has been reported experimentally.^{3,4} Furthermore, it can be seen that the most significant changes in E^{o} with respect to pH happen for pH values around the p*K*_{a}s of the pH-active residues that more closely interact with the redox-active residue. The results shown in Fig. 4 are in agreement with these trends. The solid curves are the fittings of the data from the simulations to Eq. (17). As can be seen in Fig. 4, this equation clearly describes the pH-dependence of the E^{o} values observed in our simulations. We also observe that the E^{o} values for the low and high pH limits match with the E^{o} values predicted in Sec. IV A using CEMD for all protonated and all deprotonated pH-active residues, respectively.

#### 2. Long simulations

Here, we analyze how our standard redox potential predictions behave as a function of simulation time for C(pH,E)MD and E-REMD. In some situations, like for the study of processes that happen at long time scales, long simulations must be performed and, due to computational efficiency limitations, GPU-accelerated CUDA calculations must be used. Therefore, the analysis to be shown here demonstrates if our methodologies are stable and can be used in these situations.

For this analysis, production simulations were executed for 300 ns in both IS and ES. Both the Single Precision Fixed Point (SPFP) and the Double Precision Fixed Point (DPFP) AMBER 16 CUDA precision models^{52} were used for the IS calculations, and only SPFP was used for the ES calculations. By construction, the CUDA DPFP implementation contains no further approximations in comparison to the CPU code and is the precision model used in AMBER to directly compare CPU and GPU results. Simulations were performed for pH = 7.0 and 12 redox potential values from −353 to −23 mV. The cumulative fraction of reduced species for each redox potential value was obtained as a function of time, and for each time we gathered all the fractions to fit E^{o} using Eq. (3). The cumulative prediction of the E^{o} of HEH as a function of simulation time can then be obtained. This procedure was independently repeated 5 times. Figure 5 contains the average for the 5 independent simulations of the cumulative E^{o} as a function of simulation time. We also report the standard deviation of the E^{o} predictions in the 5 independent simulations as a function of time. Results are presented for C(pH,E)MD and E-REMD with three different EAFs, for both IS and ES calculations.

The standard deviations in the E^{o} predictions can be interpreted as an error measure of our methodologies. For both implicit and explicit solvent simulations, we observe that the C(pH,E)MD and E-REMD predictions agree with each other within the errors reported. The significantly lower errors for E-REMD in comparison to C(pH,E)MD are a very compelling evidence of the better convergence efficiency of E-REMD. It is also important to emphasize that the error bars obtained are reasonably small. We see error bars of up to 2.7 mV for C(pH,E)MD and of up to 1.2 mV for E-REMD. At 300 K, these values correspond to 0.045 and 0.020 in pH units, respectively.

To complement the analyses done from Fig. 5, in Fig. 6 we split the simulations into 5 ns chunks and compute the predicted E^{o} of each window using the fractions of reduced species for all redox potential values. Figure 6 was also generated using data from the 5 independent runs.

Figure 6 also confirms smaller standard deviations for E-REMD simulations in comparison to C(pH,E)MD. In some cases, we also observe that the chunk E^{o} predictions are clearly different at the beginning and at the end of the simulation, which means the chunk E^{o} values do not fluctuate anymore around the target value of −203 mV. This explains the E^{o} drops observed in Fig. 5 generally after 200 ns of simulation. In order to understand this behavior, we looked at one of the C(pH,E)MD IS SPFP independent runs. For this simulation, we concluded the E^{o} drop is caused by a conformational change (a flip) in the histidine that belongs to the NAcMP8 peptide chain. The side chain of this histidine is part of the HEH residue (see Fig. 2). More details and discussions about this are provided in Sec. IV of the supplementary material. As we show in the supplementary material, even if this conformational change happens for only one of the redox potential values close to E^{o}, this is already enough to affect the E^{o} fitting. In the simulation analyzed, this conformational change happened only once throughout the 300 ns simulated. Therefore, the addition of redox potential replica exchange helps to mix different conformations across different target redox potential values, which possibly explains the fact that the E^{o} drops in Figs. 5 and 6 are more subtle for E-REMD than for C(pH,E)MD.

Previous AMBER replica exchange publications have shown that increasing EAF in T-REMD simulations improves the convergence in structural analyses due to the better sampling of the conformational space.^{39} For pH-REMD, where a better sampling of the pH-space is done, better convergence in the p*K*_{a} prediction for some residues are obtained when EAF is increased.^{25} In this pH-REMD paper, this conclusion about increasing EAF is reached mostly based on the results for a single residue, Asp 66. The authors, Swails *et al.*, attribute this behavior for Asp 66 to a better mobility of flexible regions of the protein they studied (HEWL). As can be seen in their paper, excluding Asp 48, all the other pH-active residues are effectively insensitive to increasing the EAF.

As can be seen in both Figs. 5 and 6, the standard deviations in the E-REMD simulations do not change significantly when we increase EAF from 0.005 ps^{−1} to higher values. C(pH,E)MD corresponds to the limit of EAF equal to zero; therefore, if we decrease EAF to values below 0.005 ps^{−1}, we must at some point start seeing the same behavior observed for C(pH,E)MD. However, as can be seen, even the small EAF of 0.005 ps^{−1} (where 1500 exchange attempts are performed during 300 ns of simulation) was enough to see effectively the same behavior observed for the higher EAF used.

The original GB model proposed by Onufriev *et al.*^{49} (*igb* = 2 on AMBER) used in this paper did not include iron in its parameter set. Thus, the iron ion in our system is the only heme atom without specific GB parameters. In our simulations, we used the values 1.5 and 0.8 Å for the iron GB radius and screen, respectively. In the explicit solvent simulations, we see that the water molecules are always more than 4 Å apart from the iron atom, as can be seen in Fig. S2 of the supplementary material. As the iron ion is then nearly buried, one would expect that the exact values for these GB parameters will not influence the simulation results significantly. Also, it is important to check whether the porphyrin ring and the iron ion remain planar over the course of the implicit simulations. As the data in Sec. II of the supplementary material show, we observe that both the porphyrin ring and the iron ion remain planar in both the implicit and the explicit solvent simulations. In this analysis, the dihedral angles obtained in implicit solvent are close to the values obtained in explicit solvent.

### C. Predicting E^{o} vs pH and p*K*_{a} vs E

As discussed in Sec. IV B 1, E^{o} should decrease with the increase of the solution pH. Following the same reasoning, equivalently the p*K*_{a} of a pH-active residue should also decrease with the increase of the solution redox potential. Based on the discussions from Figs. 5 and 6, we present 140 ns production simulation results in both IS with CUDA DPFP and in ES with CUDA SPFP. We have used E-REMD with EAF = 5 ps^{−1} in ES and EAF = 50 ps^{−1} in IS and 18 different pH values from 2.0 to 10.5, and for each pH value, 12 values of redox potential from −353 to −23 mV were considered. Figure 7 shows the standard redox potential of HEH as a function of pH and the p*K*_{a} of all pH-active residues and their sum as a function of the redox potential for both implicit and explicit solvent simulations. The solid lines in the figure are fittings using Eqs. (17) and (18).

As Fig. 7 shows, we obtain very good agreement between the theoretical predictions from Eqs. (17) and (18) and the data from the simulations. It is important to mention that if we simply get the parameters in the equations ($Eproto$, $Edeproto$, $pKa,redi$, and $pKa,oxii$) from the simulations, thus without fitting, we already obtain a good description of the simulation data shown in Fig. 7. We also observe, as one would expect from the equations, that the pH values for which the E^{o} of HEH changes the most are around the p*K*_{a} values of the pH-active groups that more closely interact with the heme group. Equivalently, the p*K*_{a} of each pH-active residue changes the most for redox potential values around the E^{o} of HEH. As the figure shows, the p*K*_{a} values of the two propionates are not exactly the same. They actually differ by ∼0.15 pH units. The p*K*_{a} values for the limits of low and high redox potential values differ by 0.2 pH unit for each propionate and by 0.15 pH unit for the glutamic acid. Therefore, even though the glutamic acid is farther apart from the heme than the propionates, it still significantly interacts with the heme group. Thus, for NAcMP8, the pH dependence of the standard redox potential of the heme group cannot be completely explained by considering just a single pH-active site or just the heme propionates. As an example, for the implicit solvent simulation, using Eq. (16), we see that the GL4 residue contributes with 8.8 mV to the difference of 28.6 mV between the E^{o} values for the limits when the pH-active residues are fully protonated and fully deprotonated.

As both in the implicit and explicit solvent calculations the protonation and redox state change attempts were performed using the same GB model, the differences between our IS and ES simulations on the relative E^{o} and p*K*_{a} predictions in comparison to the reference compound can only come from the different ensemble of configurations generated in both cases. Figure 7 shows that these conformational differences are significant enough to produce different E^{o} and p*K*_{a} values in IS and ES, even though the overall trends are the same. The p*K*_{a} predictions are ∼0.5 pH units higher in ES than in IS. Also, even though the E^{o} values at low pH limits are basically the same, at the high pH limit, E^{o} is ∼12 mV lower in ES than in IS. In addition, as one would expect from Fig. 5, we can also observe the influence of the total number of MD steps by comparing Fig. 7 with the results for short simulations shown in Fig. 4.

For consistency, it is important to compare the low and high pH limiting E^{o} values with the E^{o} values predicted using E-REMD for the same EAF and same total number of steps but without the constant pH option. Without using constant pH, the E^{o} values obtained when all pH-active residues are protonated and deprotonated are, respectively, −178.55 and −207.95 mV in IS and −176.98 and −214.86 mV in ES. When constant pH is used, the E^{o} values for the low and high pH limits are, respectively, −179.18 and −207.93 mV in IS and −177.69 and −214.58 mV in ES. The agreement between the E^{o} predictions is good, as the predictions agree within less than 1 mV.

By considering a model where the pH-dependence of the heme group is associated with a single pH-active group, Das and Medhi^{53} were able to experimentally infer for microperoxidase 11 the p*K*_{a} values that correspond to the heme being fully reduced and fully oxidized. Microperoxidase 11 differs from NAcMP8 for having an extended peptide chain which contains a pH-active lysine reside. Also, the axial ligand in their experiments is a water molecule, instead of a histidine as is our case. Using their single pH-residue model for the range of pH values between 5.0 and 7.5, they obtained p*K*_{a} values of 7.1 and 6.3 when the heme group is reduced and oxidized, respectively. According to Eq. (16), the difference between these p*K*_{a} values should be directly related to the difference of E^{o} values for when the pH-active group is fully protonated and fully deprotonated. From our ES simulations, the p*K*_{a} values for both propionates at the low and high redox potential limits corresponding to the heme being fully reduced and fully oxidized are, respectively, 6.65 and 6.45 for PNR 15 and 6.80 and 6.60 for PNR 16. Assuming the experimental p*K*_{a} values of 7.1 and 6.3 to correspond to the heme propionates, we see that the propionate p*K*_{a}s values obtained in our simulations are in the same range as the experimental ones.

In another publication,^{54} Das and Medhi were also able to experimentally infer for several different proteins the p*K*_{a} values that would correspond to the heme propionates. For cytochrome *c*_{2}, they obtained p*K*_{a} values of 7.4 and 6.3 when the heme group is reduced and oxidized, respectively. For cytochrome *b*_{5}, where the propionates are more solvent exposed, they obtained p*K*_{a} values are 5.9 and 5.7. We see the propionate p*K*_{a} values obtained in our simulations are in the same range as the experimental ones for cytochrome *c*_{2}. As can be seen in Fig. 2, the propionates on NAcMP8 are solvent exposed. As Fig. 7 shows, the Δp*K*_{a} for each propionate when the heme group is fully reduced and fully oxidized is 0.2. This difference matches with the one obtained experimentally for cytochrome *b*_{5} where the propionates are mostly solvent exposed. However, this comparison should not be overinterpreted as in the experimental paper only a single pH-active residue was considered in their model, where in our simulations 3 pH-active residues are considered.

### D. Computational benchmarks

In Table I, the computational efficiencies of the *pmemd* and *sander* AMBER modules are compared for C(pH,E)MD. Calculations were done using Cray XK7 nodes (that have Tesla K20X GPUs) at the Blue Waters supercomputer. In implicit solvent, there are 237 atoms (corresponding to the NAcMP8 and the histidine peptide) and in explicit solvent this number is 7403 (there are 2388 water molecules and 2 Na^{+} ions). The calculations were performed for pH = 7.0 and E = −203 mV.

. | Implicit solvent . | Explicit solvent . |
---|---|---|

Computation . | Computational performance (ns/day) . | |

sander Serial (1 CPU) | 6.38 | 0.27 |

sander MPI (2 CPUs) | 13.18 | 0.55 |

sander MPI (4 CPUs) | 25.11 | 1.04 |

sander MPI (8 CPUs) | 42.98 | 1.71 |

sander MPI (16 CPUs) | 75.44 | 2.78 |

pmemd Serial (1 CPU) | 4.48 | 0.52 |

pmemd MPI (2 CPUs) | 9.40 | 1.02 |

pmemd MPI (4 CPUs) | 18.11 | 1.99 |

pmemd MPI (8 CPUs) | 31.42 | 3.30 |

pmemd MPI (16 CPUs) | 56.95 | 5.24 |

pmemd CUDA SPFP (1 GPU) | 336.58 | 126.56 |

pmemd CUDA DPFP (1 GPU) | 135.23 | 59.34 |

. | Implicit solvent . | Explicit solvent . |
---|---|---|

Computation . | Computational performance (ns/day) . | |

sander Serial (1 CPU) | 6.38 | 0.27 |

sander MPI (2 CPUs) | 13.18 | 0.55 |

sander MPI (4 CPUs) | 25.11 | 1.04 |

sander MPI (8 CPUs) | 42.98 | 1.71 |

sander MPI (16 CPUs) | 75.44 | 2.78 |

pmemd Serial (1 CPU) | 4.48 | 0.52 |

pmemd MPI (2 CPUs) | 9.40 | 1.02 |

pmemd MPI (4 CPUs) | 18.11 | 1.99 |

pmemd MPI (8 CPUs) | 31.42 | 3.30 |

pmemd MPI (16 CPUs) | 56.95 | 5.24 |

pmemd CUDA SPFP (1 GPU) | 336.58 | 126.56 |

pmemd CUDA DPFP (1 GPU) | 135.23 | 59.34 |

*sander* was the first module implemented on AMBER capable of performing molecular dynamics. *pmemd* started as a reimplementation of some *sander* functionalities, intended to improve the computational performance of the simulations. In Table I, we see that for serial and MPI, the IS C(pH,E)MD calculations with *sander* are faster than with *pmemd*; however, for ES, where the total number of atoms increases 31 times in comparison to the IS calculations, the computational performance is better for *pmemd* than for *sander*. This shows the scalability of *pmemd* is better, which means it would perform even better for larger systems in comparison to *sander*. Even though the calculations scale well with the number of CPUs, Table I also emphasizes the high-performance aspect of the GPU enabled implementation. Great speedups are observed in both IS and ES calculations. In ES, we see that the SPFP calculation using GPU is 243 times faster than the serial calculation and 24 times faster than the fastest MPI calculation. The SPFP precision model is around 2 times faster than the DPFP precision model in the CUDA calculations.

It is important to mention that the computational cost of our methodology in ES depends on the values of E and pH. If the redox potential value is close to the E^{o} of a redox-active residue or if the pH value is close to the p*K*_{a} of a pH-active residue, the probability of accepting a state change attempt increases, and in explicit solvent this means that more relaxation steps will be performed.

In Table II, we compare the computational cost of E-REMD, C(pH,E)MD, CpHMD, and CEMD in comparison to regular MD. Here we have only used the *pmemd.cuda_SPFP.MPI* module. For C(pH,E)MD, we performed calculations for 36 (pH,E) values, combining *pHs* = 3.5, 4.0, 4.5, 5.0, 5.5, and 6.0 with *Es* = −263, −233, −203, −173, −143, and −113 mV. For CpHMD, CEMD, and regular MD, we have executed the same simulations as in C(pH,E)MD but turning off, respectively, the constant redox potential, the constant pH, and both the constant redox potential and pH. For E-REMD, we have performed the same simulations as in C(pH,E)MD but allowing the redox potential values between replicas of same pH to be exchanged. The computational performances shown in Table II are the averages for the 36 (pH,E) simulations.

. | Implicit solvent . | Explicit solvent . |
---|---|---|

Calculation . | Computational performance (ns/day) . | |

Regular MD | 491.86 | 225.32 |

CpHMD | 391.55 | 154.74 |

CEMD | 389.62 | 163.83 |

C(pH,E)MD | 289.13 | 145.08 |

E-REMD at CpH | 275.68 | 103.54 |

(EAF = 0.5 ps^{−1}) | ||

E-REMD at CpH | 198.10 | 96.81 |

(EAF = 50 ps^{−1} IS and 5 ps^{−1} ES) |

. | Implicit solvent . | Explicit solvent . |
---|---|---|

Calculation . | Computational performance (ns/day) . | |

Regular MD | 491.86 | 225.32 |

CpHMD | 391.55 | 154.74 |

CEMD | 389.62 | 163.83 |

C(pH,E)MD | 289.13 | 145.08 |

E-REMD at CpH | 275.68 | 103.54 |

(EAF = 0.5 ps^{−1}) | ||

E-REMD at CpH | 198.10 | 96.81 |

(EAF = 50 ps^{−1} IS and 5 ps^{−1} ES) |

As Table II shows, the addition of constant pH and/or constant redox potential has a bearable computational cost in comparison to regular MD. CpHMD and CEMD are 21% slower in IS and around 30% slower in ES than regular MD. In IS, the computational performances of CpHMD and CEMD are essentially the same. In ES, because each residue is visited once during protonation or redox state change attempts, the small difference between CpHMD and CEMD performances can be explained by the fact that we have three pH-active residues and only one redox-active residue. C(pH,E)MD is 41% slower in IS and 36% slower in ES than regular MD. We observe that the C(pH,E)MD performance is close to the CpHMD and CEMD performances for ES.

In Sec. IV B, we discussed the effect of increasing the EAF in our E-REMD simulations. As can be seen in Table II, the computational cost of the simulation rises when EAF is increased, mainly in implicit solvent. The addition of replica exchange has a bearable computational cost in comparison to C(pH,E)MD. For the largest EAF we used, E-REMD is 31% slower in IS and 33% slower in ES than C(pH,E)MD.

## V. CONCLUSIONS

Based on the existing CpHMD and pH-REMD methods^{13,18,25} implemented by Mongan *et al.*, we have successfully implemented the CEMD, C(pH,E)MD, and E-REMD methods in AMBER. These methods allow the theoretical study of systems at a constant value of redox potential using both implicit and explicit solvent models.

Validation results and tests were presented for NAcMP8 axially connected to a histidine peptide. This system contains a single heme and three pH-active residues. Our simulations correctly describe the behavior of E^{o} vs pH and of p*K*_{a} vs redox potential, in good agreement with theoretical predictions. Regarding the NAcMP8’s heme group, for the complete description of the pH dependence of its E^{o}, we conclude it is necessary to consider more than one pH-active group, as both propionates and the glutamic acid interact significantly with the heme. We observed that the propionates have slightly different p*K*_{a} values (a difference of ∼0.15 pH units) and that these p*K*_{a} values are in agreement with values obtained experimentally for other proteins that contain a single heme.

The addition of replica exchange significantly improves the statistical convergence of our E^{o} predictions. The increase in the exchange attempt frequency in our replica exchange simulations did not show a significant change in the convergence efficiency of our calculations. The smallest EAF used (0.005 ps^{−1}) was already satisfactory to reproduce effectively the same convergence as the larger EAF used.

Based on our computational benchmarks, we see that our methodologies have an acceptable computational cost in comparison to regular MD and that the GPU-accelerated code is able to provide high-performance in comparison to calculations using CPUs.

## SUPPLEMENTARY MATERIAL

See supplementary material for the derivation of the critical equations, additional results and discussions, and charge distributions of the HEH and PRN residues.

## ACKNOWLEDGMENTS

The authors gratefully acknowledge financial support from CAPES (Brazil) and CNPq (Brazil; PDE fellowship for M.S.A., Process No. 234282/2014-2). This research is part of the Blue Waters sustained-petascale computing project, which is supported by the National Science Foundation (Award Nos. OCI-0725070 and ACI-1238993) and the state of Illinois. Blue Waters is a joint effort of the University of Illinois at Urbana-Champaign and its National Center for Supercomputing Applications. The authors acknowledge J. Swails, A. M. Baptista, and M. R. Gunner for interesting discussions and S. Lenka for a careful reading of the manuscript.

The authors declare no competing financial interest.