In the past decade, there was increased research interest in studying internal motions of flexible proteins in solution using Neutron Spin Echo (NSE) as NSE can simultaneously probe the dynamics at the length and time scales comparable to protein domain motions. However, the collective intermediate scattering function (ISF) measured by NSE has the contributions from translational, rotational, and internal motions, which are rather complicated to be separated. Widely used NSE theories to interpret experimental data usually assume that the translational and rotational motions of a rigid particle are decoupled and independent to each other. To evaluate the accuracy of this approximation for monoclonal antibody (mAb) proteins in solution, dissipative particle dynamic computer simulation is used here to simulate a rigid-body mAb for up to about 200 ns. The total ISF together with the ISFs due to only the translational and rotational motions as well as their corresponding effective diffusion coefficients is calculated. The aforementioned approximation introduces appreciable errors to the calculated effective diffusion coefficients and the ISFs. For the effective diffusion coefficient, the error introduced by this approximation can be as large as about 10% even though the overall agreement is considered reasonable. Thus, we need to be cautious when interpreting the data with a small signal change. In addition, the accuracy of the calculated ISFs due to the finite computer simulation time is also discussed.
I. INTRODUCTION
The Neutron Spin Echo (NSE) Spectroscopy was first invented by Ferenc Mezei in the 1970s.1 Different from other inelastic neutron scattering techniques, NSE directly measures the intermediate scattering function (ISF) as a function of momentum transfer and correlation time. By taking advantage of the spin procession of a neutron in a magnet field, it can probe the time scale from several picoseconds to hundreds of nanoseconds with the length scale from a few angstroms to tens of nanometers. Owing to the large time and length scales it can probe, it has been used in many scientific areas including material sciences,2,3 liquid physics,4–6 glass transitions,7,8 polymer dynamics,9–11 and biological systems,12 especially protein clusters13–16 and protein internal motions.17–22 Due to the interest in understanding the relationship between protein motions and its functionalities, protein internal motions are intensively studied by NSE in the past decade.23,24
For macromolecules in solution, the coherent intermediate scattering function, I(Q, t), measured by NSE contains the information of translational, rotational, and internal domain motions.17,25,26 Here, Q is the magnitude of the wavevector transfer and t is the correlation time.25 It is highly non-trivial to separate different types of motions from I(Q, t). Different models have been proposed.17,18,22,25,26 One commonly used approach is to assume that the translation, rotational, and internal domain motions are independent of each other so that I(Q, t) is simply the multiplication of the intermediate scattering functions from each different type of motions.22,24,26 However, the accuracy of this approximation has not been rigorously evaluated for anisotropic particles.22
One consequence of this assumption is that the total diffusion coefficient, , obtained from I(Q, t) is simply the summed values of two contributions: one is the diffusion due to a rigid body model of the protein due to the translational and rotational motions and another one is due to the internal domain motion.17,18,22,25 The accuracy of the obtained internal dynamics thus depends on the accuracy of the estimation of based on a rigid protein model.22 However, the contribution of internal protein motions to the total effective diffusion coefficient is usually very small (about 10% of ).17,18 This makes it very important to investigate the accuracy of the estimated of a rigid body protein model based on the aforementioned approximation.
While there are different ways to separate the translational and rotational motions for rigid proteins, one widely used approach is to assume that the center-of-mass and rotational motions are independent from each other.17,18,24–26 Here, we evaluate the accuracy of this approximation by studying a model monoclonal antibody (mAb) provided by the National Institute of Standards and Technology (NISTmAb), which is a reference material introduced to support the pharmaceutical industry.27 The mAb based therapeutic drugs have been the fastest growing sector of the pharmaceutical industry in the past decade. Also, their global sales reached 105 billion dollars in 2016.28,29 NSE has been used to study the cluster formation and internal motions of different mAbs.14,16,20 Due to the similarity of the overall shape of mAbs, the understanding of NISTmAb can also be useful to study many other therapeutic mAbs.
In this work, we treat the NISTmAb as a rigid protein by constructing a coarse grained rigid model protein from its PDB structure. We performed Brownian motion simulations on this rigid-body protein model using the dissipative particle dynamics (DPD) simulation to evaluate the accuracy of the aforementioned approximation to model I(Q, t). The simulation was done on a single protein so that there is no need to worry about inter-protein motions for concentrated protein solutions.25 The contributions to I(Q, t) by the center-of-mass and rotational motions are evaluated. The effective diffusion coefficients as a function of Q are obtained by fitting I(Q, t). Our results indicate that at the short time limit, the rotational and center-of-mass motions are still coupled. Treating them as independent motions can result in up to about 10% deviation of the obtained . The comparison between the total collective ISF and the product of the center-of-mass and rotational ISFs shows quantitative difference. However, given the error bars of typical NSE experiments, the agreement seems to be acceptable even though one has to be careful when interpreting the small difference from the data.
II. THEORIES
NSE typically measures the coherent ISF expressed as
where is the wave vector transfer, V is the volume of the system exposed to the neutron beam, bm and bn are the scattering lengths of the mth and the nth atoms, respectively, and and are the position of the mth atom at time t = 0 and the position of the nth atom at time t.
For macromolecules, such as proteins in solution, we can rewrite the term in Eq. (1) as , where is the center of mass coordinates of the jth protein at time t, and is the Fourier transformation of the density function of the jth protein. Therefore, Eq. (1) can be expressed as
If there is only one protein, we can simplify Eq. (2) as
where is the center of mass coordinates of the protein at time t.
For a real system at dilute concentrations, the interaction between proteins is negligible and we are able to focus on the ISF of individual proteins.25 Since proteins are typically randomly orientated in solution, we can take the angular average of Eq. (3) as25
where represents the angular average and describes the center-of-mass motion of the protein.
In NSE experiments, the measured results are usually normalized to be25
where is the form factor, P(Q).
As a general form, Eq. (5) contains all the protein motions: translational, rotational, and internal motions. Also, we can define
which contains the information of both the rotational and internal motions, and
which contains only the information of the center-of-mass motion. In this way, Eq. (5) can be represented as
Notice that the center-of-mass motions are usually coupled with the rotational and internal motions in Eq. (8); it is complicated to separate the different contributions.
While there are different ways to simplify Eq. (8), one widely used approach is18–20,23,26,30
where
and
where is for the center-of-mass motion, D0 is the free diffusion coefficient, and is for both rotational and internal motions, respectively. Since the studied protein here is rigid, is purely due to the rotational motion. Hence, in this paper, = .
III. DISSIPATIVE PARTICLE DYNAMICS SIMULATION
Typically, mAb proteins are considered to be asymmetric Y-shaped particles. Our all-atom model of the NISTmAb was generated from coordinates derived from x-ray crystallography of the Fc and Fab domains with interdomain orientation from a representative single structure consistent with small-angle scattering data29 as shown in Fig. 1(a). A rigid-body model was generated using residue-based coarse graining using the Martini force field31 as shown in Fig. 1(b).
We use a dissipative particle dynamics (DPD)32 based approach for modeling the fluid phase. The simulation box size is approximately 6 times of the protein diameter. The simulation has been run long enough, which is estimated to correspond to about 200 ns for a mAb in water for each independent run. Compared with the molecular dynamics (MD) simulations, the DPD method introduces the random force and the hydrodynamic effect due to the motions of solvent molecules that affect the rotational and translational motions of a macromolecule in solution. Even though all atomic simulation including explicit water molecules can also include the hydrodynamic effect, the DPD method is simpler so that it can potentially simulate longer simulation time if needed.
DPD is a mesoscale model of fluids that can loosely be thought of as a Lagrangian formulation of Navier–Stokes equations with the inclusion of thermal fluctuations for modeling Brownian motions. The details of this computational model are beyond the scope of the paper.33,34 Here, we highlight its main features and focus on the relevant aspects of the models to this work. The DPD simulation is similar in structure to molecular dynamics simulations. However, instead of modeling all the molecular properties of the system, the motions of mesoscopic DPD particles that represent a coarse grained fluid are considered. The DPD particles are subject to three forces: conservative, dissipative, and random, with the total force, , on a particle to be expressed as
The conservative force, , is a soft repulsive radial force, which decreases linearly with the center-to-center distance, , between two DPD particles i and j whose amplitude is chosen so that the compressibility of the DPD fluid is close to that of water.35 The dissipative force, , is proportional to the difference of velocities between DPD particles i and j, , and acts to slow down their relative motion and produce a viscous effect. Here, v is the velocity of a particle. Finally, a random force, is added to control the temperature of the system and is the mechanism for Brownian motions via the thermal fluctuations. The dissipative and random forces control the viscosity of the fluid and maintain a well-defined temperature, and they are related by the fluctuation-dissipation theorem. For convenience, the energy scale is chosen such that kBT = 1. In addition parameters are chosen such that the fluid system possesses a Gibbs-Boltzmann equilibrium state in order that detailed balance is respected.36 This approach has been show to recover the Einstein intrinsic viscosity for hard spheres and has been validated for a variety of flow scenarios, including Pouiselle flow and Jeffery's orbits for sheared ellipsoids.33
IV. RESULTS AND DISCUSSION
To understand the individual contributions of the center-of-mass and rotational motions to the total collective ISF, we separate the trajectory into the center-of-mass and the rotational part as
where , and describe the trajectories of overall motion, center-of-mass motion, and rotational motion, respectively. is the center-of-mass of the protein at time t. Here, the mass is assumed to be evenly distributed among amino acid residues.
Considering a large number of proteins existing in the real solution, we take an angular average on Eq. (1) as
where . The normalized total, center-of-mass, and rotational collective ISFs following Eq. (14) are computed with for the computational convenience.
The results are shown in Fig. 2. , and are the normalized ISFs for the total, center-of-mass, and rotational motions based on Eqs. (8), (10), and (11), respectively. The x-axis is plotted as a dimensionless value, , so that the results can be compared with those of other systems with different solvent viscosities. Results of four different Q values are shown in Fig. 2 with the value of up to about 1.5. Here, is the free diffusion coefficient. The free diffusion coefficients of a mAb protein depend on the solvent viscosity with the value of about 3.7 Å2 ns for a D2O based buffer. During NSE experiments for mAb samples, the interested Q range is from about 0.05 Å−1 to about 0.2 Å−1. The measured correlation time by a NSE for mAb systems ranges from about 1 ns to about 200 ns.16 The value of for a NSE experiment may range from about 0.01 to about 2 at Q = 0.05 Å−1 while the value may range from about 0.2 to about 30 at Q = 0.2 Å−1.
, and are expected to be different at different Q values. For example, for a rigid spherical particle, for dilute protein solutions. Also, as NSE measures the coherent ISF. Note that this is different from the neutron scattering experiments focusing on incoherent ISF, for which decays and can be expressed as the summation of a series of exponential functions.37 Hence, if the coherent ISFs are plotted as a function of , all ISFs should be collapsed into a master curve for a rigid spherical particle. The difference between ISFs as a function of is thus a direct reflection of the effect due to the anisotropic shape of a mAb protein.
However, the ISFs in Fig. 2 do not show a consistent trend at relative large values of . This is most likely due to the fact that the simulation time is not long enough to sample sufficient Brownian motions. To verify this, three independent simulations with different initial values were conducted. Also, the ISFs from the three different sets of trajectories are calculated and shown in Fig. 3 as a function of . Here, is the characteristic diffusion time and σ is the effective diameter of a particle. Thus, τc is the time that needs a particle to diffuse a distance of its own diameter. Assuming 10 nm and Å2/ns, τc is about 400 ns for a mAb protein.
Theoretically speaking, as the three simulations are performed for the same system, the results should be similar to each other if each simulation time is long enough with sufficient sampling of the dynamics of the protein. However, for larger than about 0.003, the results in Fig. 3 start deviating from each other at the same Q value. As the Brownian motion is a stochastic process, it is expected that different simulations have different sampling of the particle motions. If the simulation time is long enough, the sampled motions resemble the true distribution function, and the results should be similar for the trajectories from different simulations. However, if the simulation time is too short, the sampled motions are biased by the particular simulation and may have different results as demonstrated here. Our simulation time for the three different trajectories is about 200 ns that corresponds to . Our results indicate that for less than about 0.003, the simulation results are similar to each other. The results obtained within this time region are reproducible with small uncertainty. For the calculated ISFs with , the uncertainty due to the insufficient sampling makes it hard to compare the results from different trajectories. Note that for all atomic simulation with a protein, the typical simulation time is up to a few hundreds of nanoseconds,38 which is comparable to the simulation time used here. However, for this range of simulation time, it is not sufficient enough to extract I(Q, t) reliably for larger than about 0.003. Because of this, the difference of I(Q, t) at larger values of shown in Fig. 2 is also partly due to the insufficient sampling of the simulation.
The effective diffusion coefficient, , has been widely used to understand the internal motions of proteins. can be extracted from the collective ISF at the short-time limit through15,17,25,26
Note that the correlation time for a short-time limit for a colloidal system should be larger than the momentum relaxation time so that the diffusion of a particle is independent of the mass and is only determined by its shape. To estimate the momentum relaxation time, τB, the center-of-mass mean square displacement of the protein is calculated and shown as the blue line in Fig. 4. By fitting the curve to the analytical form of mean square displacement39
where , kB is the Boltzmann constant, T is the temperature, γ is the friction coefficient, and ν0 is the initial velocity, we can extract in the unit of computer simulation time while τc = 1880 in the unit of the computer simulation time. Hence, the relative time ratio for is about 0.0003. As a verification, we perform the same simulation with the mass of the protein changed by a factor of 0.01 and plot the result as the red line in Fig. 4. As we can observe, two lines superimpose at indicating that the diffusion of the protein is independent of the mass for t larger than the momentum relaxation time. Therefore, the center-of-mass motions of the protein have already entered the diffusive motion region after τB. Later in the paper, the effective diffusion coefficients are obtained by fitting the ISF using Eq. (15) within the region where , respectively. (Note that the fitting of the corresponding ISF is performed within the time region of which is also well within the time region that the calculated ISFs are similar to each other based on the results from different simulation trajectories.) It is also noted that the decay of the intermediate scattering functions within such a short time is very small. However, with the accurate trajectories generated by the computer simulations, reliable results can be obtained. In a real experiment, the fitting time region for actual intermediate scattering functions has no such constraint as the experimental results are the ensemble of many proteins in solution.
The effective total (), center-of-mass (), and rotational () diffusion coefficients as a function of Q are shown in Fig. 5(a) from different trajectories. is almost identical for all three trajectories. However, even though and from different trajectories are similar to each other, the difference from different trajectories is about 10%. Note that unit used in Fig. 5(a) is not converted to the diffusion of a mAb in a real system. To compare the results with the experimental data, the results are normalized by the isotropic free diffusion coefficient at the infinite dilute condition by dividing at Q = 0.01 Å−1 that should be the same as D0. The results are shown in Fig. 5(b). The error bars are estimated based on the variation of the results from different trajectories of three independent simulations.
The approximation by Eq. (9) leads to the approximation of the short-time diffusion coefficients so that
with
for a rigid particle at the dilute condition. However, it is useful to point out that the exact result for should be17,25
where
if the rotational and center-of-mass motions are independent of each other. Both (Q), and (Q) are related to the translational motion only. However, different from (Q), (Q) couples the motions with the shape of the particle through Eq. (20).
The accuracy of Eq. (17) is first evaluated by comparing with as shown in Fig. 6(a). Even though the difference is overall small, the results from all three trajectories show a consistent discrepancy. It is found that the simple addition of and overestimates the value of with the largest difference up to about 10%.
To further understand what causes the difference, one trajectory of our simulations is further investigated. The calculated and from this particular trajectory are shown in Fig. 6(b). (The results calculated from other trajectories are similar.) It is not surprising that is constant while changes as a function of Q. Overall the difference between these two coefficients are very small. The largest difference is about 10% at around 0.05 Å−1, which approximately correspond to the length scale of the diameter of the protein.
Figure 6(c) shows the results of so that we can evaluate the accuracy of Eq. (19). For Q < 0.07 Å−1, is almost identical with . However, it becomes larger than at larger Q values. The largest difference is about 4%. Therefore, Eq. (19) is more accurate than Eq. (17) at the small and intermediate Q values. However, both overestimate the total diffusion coefficient at larger Q values (Q > 0.07 Å−1). This indicates that for larger Q values, the translational and rotational motions are not completely decoupled at the short time limit as described by either Eq. (17) or Eq. (19).
Note that Eqs. (19) and (20) are equivalent to that proposed in Ref. 17 if the rotational and translational motions are independent of each other. Since Eq. (19) has been widely used,17,18 we therefore need to be prudent when studying some small differences of mAb proteins using NSE with this approximation. For many flexible proteins, the deviation of measured diffusion coefficient from that of the rigid protein model has been attributed to the internal domain motions. Our results show that using Eq. (19) may potentially affect the accuracy of estimated internal domain motions. Equation (17) is shown to be less accurate than Eq. (19). Even though Eq. (17) is less frequently used to model the diffusion coefficient of flexible proteins, it is implicitly used in Eq. (9) that has been widely used for many studies.20,23,26 Because the better agreement between and , it is reasonable to propose that Eq. (9) could be replaced with
where .
We further evaluate the accuracy of the intermediate scattering function using Eq. (9) at a finite correlation time as this approximation has been widely used.20,26 Figure 7 shows the results of , and together with the multiplication of and calculated from one trajectory. Even though the uncertainty may be large for the intermediate scattering functions at large correlation time as demonstrated in Fig. 3, all trajectories are still subject to the same physics. It is still possible to evaluate the coupling of the rotational and translational motions even with one trajectory. Overall, simply decoupling between the center-of-mass and rotational motions by multiplying the intermediate scattering functions together shows a reasonable agreement with at all four Q values. However, there is still small difference between them. We have calculated the results from other trajectories (not shown here). The results all show similar results as shown in Fig. 7. Hence, similar to the comparison of the short-time diffusion coefficients, we need to be prudent when interpreting a subtle difference by comparing the results from the decoupling approximation using Eq. (9) with that from an experiment. Overall, not surprisingly, the center-of-mass motions contribute more to the relaxation of I(Q, t) as a function of time compared to that of the rotational motions. As the experimental error bars of a NSE experiment on a mAb system are usually larger than the difference demonstrated here in Fig. 7, the approximation by Eq. (9) seems to be reasonable.
V. CONCLUSIONS
The dissipative particle dynamic simulation is conducted for a rigid-body model of the NISTmAb. The intermediate scattering functions and the effective short-time diffusion coefficients are calculated based on the trajectories of the simulations. It is found that in order to obtain reliable values of the intermediate scattering functions for our systems, the simulation time has to be about two orders of magnitude longer than the interested correlation time.
A commonly used approximation by decoupling the center-of-mass and rotational motions of a protein is evaluated for this model mAb protein. This decoupling approximation is observed to overestimate the total effective diffusion coefficient. Approximating the translation motion using instead of using shows much better agreement with the total diffusion coefficient using the Eq. (19). However, at relatively high Q values, both Eqs. (17) and (19) overestimate the total diffusion coefficient indicating that the translational and rotational motion are not completely decoupled. However, overall, Eq. (19) has a better agreement than that calculated using Eq. (17). Also, the difference due to the two different approximations is relatively small over the studied Q range with the largest difference to be about 10% at about the Q value corresponding to the diameter of the protein.
The estimated total intermediate scattering function is different from the intermediate scattering function calculated based on the decoupling approximation [Eq. (9)]. However, the overall agreement is still not so bad. As the error bars of many NSE experimental results are relatively large, the difference introduced by the decoupling approximation could be still considered acceptable.
Many studies of the internal motions of proteins using NSE need an accurate estimation of the NSE signals of a rigid body protein model. Our results demonstrate that the assumption of the decoupling between the center-of-mass and rotational motions could introduce appreciable errors to the data interpretation. It is also important to point out that the coupling between the center-of-mass and rotational motion is expected to depend on the overall shape of a protein. Thus, the quantitative difference due to the coupling approximation could be different for other proteins with different shapes. In addition, when the protein is very flexible, its internal motions could further potentially affect the coupling between the rotational and translational motions that may need future quantitative studies.
ACKNOWLEDGMENTS
Y.L. acknowledges the support by the Center for High Resolution Neutron Scattering (CHRNS), a partnership between the National Institute of Standards and Technology and National Science Foundation under Agreement No. DMR-1508249. This material is based upon work supported by the U.S. Department of Energy, Office of Science, Office of Basic Energy Sciences, Materials Sciences and Engineering Division, under Award No. DE-SC0014084 (YZ). N.S.M. thanks Joseph Curtis for useful conversations and providing the Martini model mAb structure utilized in our simulations.
DATA AVAILABILITY
The data that support the findings of this study are available from the corresponding author upon reasonable request.