Absorption of radiation by solution is described by its frequency-dependent dielectric function and can be viewed as a specific application of the dielectric theory of solutions. For ideal solutions, the dielectric boundary-value problem separates the polar response into the polarization of the void in the liquid, created by the solute, and the response of the solute dipole. In the case of a protein as a solute, protein nuclear dynamics do not project on significant fluctuations of the dipole moment in the terahertz domain of frequencies and the protein dipole can be viewed as dynamically frozen. Absorption of radiation then reflects the interfacial polarization. Here we apply an analytical theory and computer simulations to absorption of radiation by an ideal solution of lysozyme. Comparison with the experiment shows that Maxwell electrostatics fails to describe the polarization of the protein-water interface and the “Lorentz void,” which does not anticipate polarization of the interface by the external field (no surface charges), better represents the data. An analytical theory for the slope of the solution absorption against the volume fraction of the solute is formulated in terms of the cavity field response function. It is calculated from molecular dynamics simulations in good agreement with the experiment. The protein hydration shell emerges as a separate sub-ensemble, which, collectively, is not described by the standard electrostatics of dielectrics.

## I. INTRODUCTION

Inserting a solute into a molecular liquid leads to a nonlinear perturbation of its density.^{1} The density profile formed around a solute is a sensitive function of the strength of the intermolecular potential. Weak dewetting^{2,3} or, alternatively, collapse of hydration shells^{4–6} occurs for hydration when the solute-water interaction is either weaker (dewetting) or stronger (collapse) than the water-water interaction.

In contrast to the diversity of scenarios for the density profile of hydration layers, the orientational structure of the solute-solvent interface is typically viewed as a specific case of the standard dielectric boundary-value problem. In this picture, the dipoles of water orient along the lines of the solute’s electric field according to the rules of electrostatics.^{7} There are a number of fundamental reasons why this simplistic picture can break down,^{8} particularly in the case of patterned interfaces of nano-scale dimension.^{9} By the patterned interface, we mean a substrate with its polarity or charge changing on the length-scale comparable to the correlation length between the molecular dipoles in the surrounding water. The most illuminating example is the patchwork of positive and negative charges, each orienting the nearest water molecules along the field of the surface charge. If the distance between different charges is comparable with the size of the surface polarized domain, this spatial arrangement will lead to the orientational frustration of the water molecules at the domain boundaries. One can anticipate that the response of this patchwork of frustrated domains can potentially be different from the response of a homogeneous dielectric.

The picture of a patchwork of positive and negative charges distributed over a nanometer-scale surface is a good match of the protein-water interface. Proteins possess a large density of surface ionized residues, which provides the free energy required to stabilize a protein in solution. The distribution of positive and negative charges at the surface of a globular protein is nearly uniform, with the average distance between them $\u223c10$ Å.^{10,11} This distance is comparable to the length of a chain of 3–4 water molecules. When two waters at the ends of the chain are pinned by the protein charges, the one-two molecules in the middle must be strongly frustrated orientationally. They resolve frustration by joining one of the domains with the corresponding orientation, thus enhancing the overall orientational order. The hydration shell becomes a patchwork of polarized domains. Their low-temperature statistics is reminiscent of that of another material showing domain formations, the relaxor ferroelectric close to the glass transition.^{12}

The key question relevant to this phenomenology is that of the depth of domains’ extension into the bulk, or, in other words, of the number of water molecules involved in the hydration layer with properties distinct from the bulk. Given that $\u223c ( 300 \u2013 500 ) $ molecules of water can be counted only in the first hydration layer of a typical globular protein, changes in the orientational structure of even the second hydration layer will produce an ensemble of water molecules sufficiently large to be characterized as a separate mesophase ($\u223c1063$ water molecules within a 6 Å shell around lysozyme^{12}). Such an ensemble will carry new properties requiring characterization.

Absorption of radiation in the THz domain of frequencies has recently appeared as a novel technique to study the hydration shells of proteins.^{13–22} The absorption of light as a means of probing the solvation structure has a significant advantage compared to broad-band dielectric spectroscopy since the sensitivity of the response is improved by a factor roughly equal to the dielectric constant for the same magnitude of the liquid polarization or by a factor of dielectric constant squared for the same magnitude of the external electric field.^{23} The difference is due to the transverse geometry of light absorption compared to the longitudinal geometry of the dielectric experiment.

Further advantage of the THz domain spectroscopy over absorption and dielectric spectroscopy at lower frequencies is that the response of the protein dipole (relaxation time $\u223c9$ ns for lysozyme^{24}) is dynamically frozen at the THz frequency. While there are a number of vibrational modes in the THz region belonging to the vibrational density of states of a protein,^{25,26} they do not project on the protein dipole to produce sufficient THz absorption.^{27} The protein is therefore mostly transparent in the THz domain and main absorption comes from water in solution. The challenge is how to separate the hydration shell signal from the dominant bulk absorption.

The analysis of the experimental data in terms of the simplistic mixing model,^{13,15,16,28,29} disregarding non-additivity of the interfacial polarization [see discussion after Eq. (5)], suggested the possibility of an extended hydration shell around a protein, with absorption distinct from the bulk.^{16,28} It was later shown that the change in the absorption of solutions relative to bulk water can be related to the cross correlation between the protein dipole and the dipoles of water molecules in the hydration shell.^{27} Specifically, the alteration of the dipolar susceptibility of the solution relative to the bulk is directly proportional to the solute-solvent dipolar correlation function $ \chi 0 s ( \omega ) $,

[see Eq. (11) for a complete form]. When the cross correlation $ \chi 0 s ( \omega ) $ is calculated for hydration shells of varying thickness, its length of saturation to the bulk value turns out to be in the range of 20–40 Å^{27,30} from the protein surface. Therefore, deviations from the ideal behavior with increasing solute concentration^{16} are likely related to changes of $ \chi 0 s $ from its infinite-dilution value.

As expected from the general arguments presented above, simulations have shown that orientational correlations of water dipoles in the hydration layer are significantly different from those in a homogeneous dielectric. The major question, still posed to the field, is whether these differences can be reliably extracted from absorption of THz radiation with its wavelength far exceeding the length-scale of any structural correlations in the solution. Since the wavelength of radiation does not provide spatial resolution, theory is required to interpret the data.

The first question to ask is what the standard electrostatics would predict for the absorption coefficient of solutions. This problem can in fact be addressed by Maxwell electrostatics of dielectrics, by which we mean the application of the Maxwell boundary conditions to the solution of the Poisson equation.^{7} The problem at hand is illustrated in Fig. 1. The polarization of the interface of a spherical void approximating the hydrated protein integrates to an effective dipole assigned to the void, which is anti-parallel to the external field,

Here, *P* is the polarization of the bulk caused by the uniform field of the electromagnetic wave, $ \Omega 0 $ is the volume of the spherical solute, and $ \mathit{\epsilon} s $ is the dielectric constant of water; the superscript “M” specifies the solution of the Maxwell boundary-value problem. The Maxwell electrostatics thus predicts a decrease in the effective polarity of the solution compared to that of a homogeneous liquid due to the combination of the removal of a polar solvent from the solute volume $ \Omega 0 $ and an additional polarization of the interfacial dipoles as dictated by Maxwell’s boundary conditions (Fig. 1).

Simulations^{27,31} and absorption measurements for hydrated lysozyme^{32} and for some amino acids^{18} have suggested that the polarization of the interface can significantly deviate from the Maxwell scenario. We indeed show below that the so-called Lorentz void,^{33} instead of the Maxwell void [Eq. (2)], provides a better representation of the collective interfacial polarization. However, this conclusion cannot be reached without taking into account the induced dipole of the protein itself.

The negative projection of the interface Maxwell dipole is counterbalanced by a positive induced solute dipole $ M 0 i n d $. The latter is composed of the instantaneous (on the THz time scale) electronic induced dipole $ M 0 e $ and the average permanent dipole $ \u27e8 M 0 \u27e9 E $ established along the field for an ensemble of proteins in solution,

The permanent dipole of course depends on the frequency of the signal and is expected to be dynamically frozen in the THz domain. The average over an ensemble of randomly oriented permanent dipoles of the proteins in solution then yields a negligible net value of $ \u27e8 M 0 \u27e9 E $. Nevertheless, dynamical freezing depends on the radiation frequency and should not be blindly assumed. The induced dipole cannot be generally neglected, as we indeed show below. Given the mutual compensation between the anticipated negative interface dipole and the solute induced dipole, a quantitative theory is required to incorporate both components. Here, we construct such a theoretical description based on previous theoretical advances.^{23,27} Our main focus here is on bringing the theory to the level permitting direct applications to the experiment.

The permanent protein dipole induced by the external field, $ \u27e8 M 0 \u27e9 E $ in Eq. (3), is supplied by molecular dynamics (MD) simulations. These results confirm that it is dynamically frozen in the THz frequency window and can be neglected. The main compensation is therefore between the positive induced electronic dipole and the negative interface dipole. We compare the results of our analysis to recent THz absorption data reported for lysozyme solutions.^{32}

## II. FORMALISM

The absorption coefficient for electromagnetic radiation at frequency $\omega $ is fully defined in terms of the real and imaginary parts of the dielectric function of solution $\mathit{\epsilon} ( \omega ) = \mathit{\epsilon} \u2032 ( \omega ) +\u2009i \mathit{\epsilon} \u2033 ( \omega ) $,

The problem is therefore reduced to calculating $\mathit{\epsilon} ( \omega ) $ in terms of the complex-valued dielectric function of the solvent $ \mathit{\epsilon} s ( \omega ) $ (water in our case) and the properties of the solute (protein). A general formalism for $\mathit{\epsilon} ( \omega ) $ for an ideal solution was developed in Refs. 23 and 27. Here we summarize the main steps of this theory relevant to our simulations and comparison to the experiment.

We first consider a fictitious solute that does not produce any dipolar response and thus can be approximated as a void in a polar liquid. This approximation is relevant to many THz experiments when the solute dipole is too slow and is dynamically frozen. Computer simulations presented below specifically address the question of whether this approximation is applicable in the case of hydrated lysozyme. Independently of the outcome, this limit is the simplest conceptual point of departure for our discussion.

### A. Voids in water

A void removes polar material from its volume thus reducing the overall dipole moment of the liquid from its homogeneous magnitude $ M x l i q =PV$ to the value $ M x l i q \u2212 N 0 \Omega 0 P$. Here, we have adopted the geometry of an absorption experiment sketched in Fig. 1, in which light is propagated with the wave-vector along the *z*-axis of the laboratory frame and its electric field is polarized along the *x*-axis. Further, the solution contains *N*_{0} solutes, each having the volume $ \Omega 0 $ and $ \eta 0 = ( N 0 \Omega 0 ) /V$ is the volume fraction.

Removing the polarized medium from the void’s volume is not sufficient to calculate the polarization of the solution. Any discontinuous interface inside a dielectric is polarized by the field, and there will be an additional interfacial polarization, which requires solving the dielectric boundary-value problem.^{7} In the case of a spherical void, the transverse geometry of the absorption experiment leads to the following relation:^{23}

Here, the last summand accounts for an additional polarization of the interface not accounted for by the removal of the polarized liquid.

The first term in Eq. (5) describes the reduction of absorption produced by removing water from the volume of the protein. This is the water term in the two-component analysis often used in the literature.^{13,15,16,28,29} In this approach, the absorption coefficient of the solution is represented as the volume-fraction weighted sum of the protein (p) and water (w) components: $ \alpha a b s = ( 1 \u2009 \u2212 \u2009 \eta 0 ) \alpha w \u2009+\u2009 \eta 0 \alpha p $. This approximation can produce negative protein absorption when applied to analyzing experimental data.^{19} This seemingly astonishing result is in fact trivial when compared to Eq. (5): an additional negative contribution, which is assigned to absorption by the protein in the two-component analysis, is caused by polarization of the void in addition to the removal of water [the second summand in Eq. (5)].

The two-component analysis often leads to seemingly reasonable results, but this comes not from the soundness of this approximation, but instead from the combination of the breakdown of the Maxwell dielectric picture for the heterogeneous protein-water interface (see below) and a potential compensation between the interface dipole and the dipole induced at the solute [$ M 0 i n d $ in Eq. (3) and in Fig. 1].^{27} Deviations between the two-component model and observations have been often assigned to the third component, the hydration water. Given that such deviations, according to Eq. (5), are predicted even within the standard dielectric picture, any conclusions based on such an analysis are unfounded. Numerically, the additional polarization of the interface [the second summand in Eq. (5)] is a significant effect, amounting to about a half, at $ \mathit{\epsilon} s \u226b1$, of the polarization loss from the direct expulsion of water from the solute volume.

The appearance of the dielectric constant of bulk water $ \mathit{\epsilon} s $ in Eq. (5) is a consequence of using the standard dielectric boundary conditions at the discontinuous surface separating the void from the polar liquid. Hydration water surrounding proteins is strongly altered compared to the bulk and it is *a priori* not clear if the standard dielectric boundary conditions apply. In fact, we have shown before^{27} and demonstrate more conclusively here that polar response of the protein-water interface is qualitatively different from that of the standard dielectric interface. In order to account for these differences, we introduce the parameter $\alpha ( \omega ) $, which allows us to re-write Eq. (5) in the following form:

Here, we have indicated the dependence of all components on the radiation frequency $\omega $. It has to be stressed that similar to the dielectric function $ \mathit{\epsilon} s ( \omega ) $, the parameter $\alpha ( \omega ) = \alpha \u2032 ( \omega ) +\u2009i \alpha \u2033 ( \omega ) $ is a complex-valued function, as will be clear from the following discussion. At $\alpha ( \omega ) =1$, one restores the standard Maxwell solution in Eq. (5), and at $\alpha ( \omega ) =0$ the interface is not polarized, which we dub here as the Lorentz scenario. Any value of $\alpha ( \omega ) $ distinct from these two limits is obviously allowed.

From Eq. (6), one can directly calculate the change in the susceptibility of the solution $4\pi \chi ( \omega ) =\mathit{\epsilon} ( \omega ) \u22121$ relative to the susceptibility of the bulk liquid $4\pi \chi s ( \omega ) = \mathit{\epsilon} s ( \omega ) \u22121$,

where we used the subscript “v” to stress that this susceptibility change is due to the ideal solution of voids. In the case of the Maxwell dielectric, $\alpha ( \omega ) =1$, the solution susceptibility decreases linearly with the volume fraction $ \eta 0 $,

The slope of the linear scaling with $ \eta 0 $ is equal to the product of $ \mathit{\epsilon} s ( \omega ) $ with the cavity field susceptibility of dielectric theories,^{34,35}

The susceptibility $ \chi c = E c / E 0 $ is the ratio of the field inside the void (cavity) *E*_{c} to the uniform external field *E*_{0} (see below), and the superscript “M” indicates the Maxwell solution for this property.^{7,33}

Equation (7) does not accomplish much unless a connection between $\alpha ( \omega ) $ and parameters accessible by either simulations and/or experiment can be established. It was previously suggested^{27} that the effective dipole at the surface of a spherical void polarized by the uniform external field can be alternatively calculated from the response to a dipole placed inside the void. The connection is provided by the relation

Equation (7) is therefore re-written in the following form:

Equation (11) provides a significant advantage for the theory-experiment connection since it directly yields the THz slope ($ \alpha a b s $ vs $ \eta 0 $), instead of absorption at a given concentration as calculated from simulations of the total dipole of the mixture.^{22,36} The required input is two susceptibility functions: $ \chi 00 ( \omega ) $ and $ \chi 0 s ( \omega ) $. They are accessible by computer simulations in the limit of a single solute in the simulation cell, thus bypassing the difficulties of simulating mixtures. The susceptibilities $ \chi 00 ( \omega ) $ and $ \chi 0 s ( \omega ) $ arise from, correspondingly, the “self,” $\u221d \u27e8 \delta M 0 ( t ) \u22c5 \delta M 0 ( 0 ) \u27e9 $, and “cross,” $\u221d \u27e8 \delta M s ( t ) \u22c5 \delta M 0 ( 0 ) \u27e9 $, time correlation functions of the fluctuating dipole moments of the solute, **M**_{0}(*t*), and solvent, **M**_{s}(*t*). Being linear response functions, $ \chi 00 ( \omega ) $ and $ \chi 0 s ( \omega ) $ represent the permanent dipole moment induced by the external electric field at the solute (subscript “00”) and the dipole moment caused in the hydration layer by the solute dipole (subscript “0s”).

The frequency-dependent susceptibilities are Laplace-Fourier $\omega $-transforms of the corresponding time correlation functions appearing in the dynamic linear response theory.^{37} One can define the normalized time correlation functions as

where $\delta M a ( t ) = M a ( t ) \u2212 \u27e8 M a \u27e9 $ and *a* = 0, *s*. These functions are Laplace-Fourier transformed to yield the functions $ S \u0303 0 a ( \omega ) $, which then enter the corresponding linear response functions

The fitting procedure using a multi-exponential decay with amplitudes *A*_{i} ($ \u2211 i A i =1$) and relaxation times $ \tau i $ leads to

and

The cross correlation $\u221d \u27e8 \delta M s ( t ) \u22c5 \delta M 0 ( 0 ) \u27e9 $ is typically poorly converged and is a challenge to calculate from simulations. In anticipation of the final results used to calculate the absorption coefficients, we note that the response function

enters the final expressions. This function is based on the time correlation function $\u221d \u27e8 \delta M 0 ( t ) \u22c5 \delta M ( 0 ) \u27e9 $ between the solute dipole **M**_{0} and the entire dipole moment of the sample **M**. This correlation function is typically a faster converging property.

Since $ \chi 00 ( \omega ) $ is always positive, Eq. (11) clearly shows that the sign of the slope of $\Delta \chi v $ vs $ \eta 0 $ is determined by the sign of the cross, solute-solvent dipolar correlations. The solute dipole and the dipoles of the solvent molecules are expected to be anti-correlated, thus resulting in a usually observed^{13,14,32} negative THz slope. The reason for such negative cross correlations is illustrated in Fig. 2, which shows the dipoles of the solvent arranged around a solute dipole. While axial solvent dipoles tend to orient in parallel, the equatorial dipoles tend to orient anti-parallel. Since there are always more equatorial dipoles than axial dipoles, the overall cross correlation is negative. This picture is, however, based on the assumption that interfacial dipoles can freely change their orientations along the field lines of the solute dipole. Hydration shells of proteins are densely packed^{38,39} and orientationally frustrated,^{6,12} thus potentially affecting the ability of hydration waters to polarize along the field of the solute. While most solutes still show a negative slope of $\Delta \chi v $ vs $ \eta 0 $, solutions of some amino acids (glycine and serine^{18}) produce a positive THz slope. The latter result might be, at least partially, related to the compensating positive dipole moment induced at the solute [Eq. (3)], which we consider below.

In the analysis below, we use the input for the frequency-dependent dielectric function of water from the experiment^{40} and model the electronic induced dipole of the protein by its refractive index. The force fields typically used in MD simulations miss the polarizability of at least one of these two components. It is not *a priori* clear if the force fields typically used in condensed phase simulations are suitable for modeling absorption. To illustrate this point, Fig. 3 compares the real and imaginary parts of $ \mathit{\epsilon} s ( \nu ) $ ($\omega =2\pi \nu $) for TIP3P water (dashed lines) with the experiment^{40} (solid lines). The differences between experiment and TIP3P water are significant. In order to establish whether these differences affect absorption of radiation, one needs a formalism operating in terms of solvent parameters accessible by both experiment and simulations. Equation (11) presents this opportunity, and we discuss this analysis below. It turns out that the use of the frequency-dependent dielectric function of TIP3P water significantly deteriorates the agreement between calculations and experiment within the analytical (Lorentz void) framework but does not strongly affect the results directly obtained from MD simulations.

### B. Solute dipole

The solute dipole can include both the permanent and electronic components. As is typically done in mean-field theories of polarizable dielectrics,^{41,42} it is convenient to combine them in one polar density parameter,^{27}

where $ y e = ( 4 \pi / 3 ) ( \alpha 0 / \Omega 0 ) $ and $ \alpha 0 $ is the electronic polarizability of the solute. The Clausius-Mossotti equation can be used to estimate *y*_{e},

Here, *n*_{p} is the refractive index of the protein and $ \eta 0 p $ is the packing fraction of proteins in the powder used to measure *n*_{p}. We will assume $ \eta 0 p \u22431$ in our estimates below, which is also the assumption made by Onsager for molecular polarizability.^{43} Corrections can be introduced if this parameter is known independently.

The frequency-dependent dipolar response of the solute adds to the solution susceptibility, which is now defined as

Here, the first summand, due to the void, is given by Eq. (11). The second term, the change in the susceptibility due to dipolar response of the solutes, is given by the following equation:^{27}

where here and in Eq. (19) the dependence on frequency was dropped for brevity. The frequency-dependent response function $ \chi c ( \omega ) $ is

where $ \chi 0 d ( \omega ) $ given by Eq. (16) is the total susceptibility of the solute dipole in response to the uniform polarizing external field.^{44,45} The physical meaning of $ \chi 0 d / \chi 00 $ is that it is equal^{27} to the ratio, *E*_{c}/*E*_{0}, of the cavity field *E*_{c} experienced by the solute dipole from the polarized solvent to the field of the external charges *E*_{0} [the result of Maxwell electrostatics is given by Eq. (9)]. Although a direct measurement of the cavity field is hardly possible, it enters a number of spectroscopic observables.^{46} Correspondingly, the response function $ \chi c ( \omega ) $ represents the ratio of the cavity field to the external field at a given frequency.

The first summand in Eq. (20) is the infinite-dilution contribution of the solute dipole to the solution susceptibility. The second summand, which includes the function $ J 0 ( \eta 0 ) $, represents the non-ideal effect of polarizing the void by the permanent dipoles of other solutes in the solution. Therefore, *J*_{0}(0) = 0, and it is given for an arbitrary $ \eta 0 $ by an equation involving the density structure factor of the solutes, $ S 0 ( k ) = N 0 \u2212 1 \u2211 i , j N 0 exp [ i k \u22c5 r i j ] $, where **r**_{ij} is the vector connecting two solutes in the solution and **k** is the wavevector. The density structure factor determines mutual correlations of the positions of the solutes, and $ J 0 ( \eta 0 ) $ is given as^{23}

Here, *j*_{1}(*x*) is the spherical Bessel function of the first order^{47} and integration is performed over the dimensionless variable $x=k \sigma 0 $, where $ \sigma 0 $ is the solute diameter. While *S*_{0}(*k*) is experimentally available from small-angle scattering, we estimated it here based on the Percus-Yevick solution for a fluid of hard spheres.^{48} These estimates [see $ J 0 ( \eta 0 ) $ tabulated in Ref. 23] show that $ J 0 ( \eta 0 ) $ can be dropped from the final equations at the low volume fraction $ \eta 0 <0.1$. When *J*_{0} = 0, one can apply Maxwell electrostatics ($\alpha =1$) as a consistency check of Eq. (20). One then gets Eq. (9) for $ \chi 0 d / \chi 00 $, and for the static ($\omega =0$) response,

## III. SIMULATION RESULTS

All-atom MD simulations were performed using the initial crystallographic structure of lysozyme (PDB entry 1AIK) solvated with TIP3P water. The simulation cell consisted of a total of 87 050 atoms with 28 361 water molecules. The total charge of the lysozyme protein was −7 *e*. The constant pressure temperature equilibration simulations were done using the Langevin temperature-pressure control with a damping coefficient of 5 ps^{−1}, a piston pressure of 1 atm, a piston decay time of 50 fs, a piston oscillation period of 200 fs, and at a temperature of 300 K. A cutoff radius of 12 Å and full electrostatics using the particle mesh Ewald technique at every simulations step were employed. NAMD 2 × 10^{49} with the CHARMM27 force field was used to produce the MD trajectories. An initial optimization of the simulation cell was performed by conjugate gradient minimization for 2000 steps, followed by a 5 ns NPT equilibration simulation. Proceeding from equilibration run, production NVT simulations were carried out for 10 ns.

No counterions were used in the simulation setup. The simulation cell therefore carries a total non-zero charge equal to the charge of the protein. According to the standard practice of employing Ewald sums for the long-range electrostatics, this setup is equivalent to placing a uniform compensating background charge in the periodic Wigner lattice.^{50} This background charge will affect the free energy of a charge placed within the simulation cell,^{50,51} but will cancel out from the deviation of the dipole moment $\delta M a $ from its average value. Only $\delta M a $ enters the calculation of susceptibilities in Eq. (13) and the background charge does not affect our results. In terms of comparison of our results to the experiment, the relaxation time of the electrolyte buffer used in the typical experimental setup exceeds the picosecond time scale relevant to the THz experiment.^{48} One can therefore assume that the electrolyte is dynamically frozen in the THz experiment.

The main reason for a relatively short length of the simulation trajectory used here is that the simulations (both the equilibration NPT and the production NVT) were performed with a relatively short time step of 0.5 fs and flexible hydrogen bonds of the protein. The trajectory was saved every 10 fs. This focus on the short-time dynamics has allowed us to calculate the time correlation functions with a 10 fs time resolution required for the modeling of the THz response. Exponential fits of the correlation function were done on the time window of 1 ns. Those were Laplace-Fourier transformed to obtain Eq. (14) for the correlation functions and Eq. (15) for the response functions.

## IV. COMPARISON TO EXPERIMENT

A significant result of our present and previous^{27} computer simulations is that $ ( 4 \pi / 3 ) \chi 00 $ can be neglected in Eq. (17) relative to *y*_{e} in the THz domain. The nuclear motions of the protein do not produce sufficient fluctuations of its dipole, and absorption due to the protein’ permanent dipole is negligible. Only the electronic dipole induced at the protein makes a non-negligible contribution. Note that, in contrast with the previous simulations,^{27} the current setup allowed vibrations of protons in the protein, along with a significantly smaller integration step and more frequent savings. Nevertheless, *S*_{00}(*t*) is nearly flat at $t\u22430.1 \u2013 1$ ps (Fig. 4) and, correspondingly, $ \chi 00 ( \omega ) $ is very small for $\nu =\omega / ( 2 \pi ) $ in the THz domain of frequencies.

With this simplification and after dropping the term proportional to *J*_{0} from Eq. (20), one arrives at a simple relation

where the Clausius-Mossotti relation between the protein refractive index *n*_{p} and *y*_{e} was applied [$ \eta 0 p =1$ in Eq. (18)]. In the limit of the Maxwell electrostatics (superscript “M”), one obtains

At *n*_{p} = 1, this relation is converted into Eq. (8) for the susceptibility of a Maxwell void.

One can additionally consider the limit in which no surface charges, originating from introducing a dividing surface into the dielectric,^{7} are created at the interface of a void (no “+” and “−” at the void’s surface in Fig. 1). This limit corresponds to $\alpha ( \omega ) =0$ in Eq. (7) and the assumption that the only result of creating the void is the expulsion of the polarized water from the void’s volume. Such a void is known in the theory of dielectrics as the virtual, or Lorentz, cavity.^{34} We have dubbed this limit as the “Lorentz scenario.”^{33} The ratio $ \chi 0 d / \chi 00 $ becomes equal to the Lorentz cavity field,^{27} $ E c / E 0 = ( \mathit{\epsilon} s ( \omega ) + 2 ) / ( 3 \mathit{\epsilon} s ( \omega ) ) $, and one obtains from Eq. (24) (superscript “L” is for the Lorentz scenario)

Not surprisingly, there is no change to the solvent response when the virtual void is filled with the same material as the solvent, $\Delta \chi L =0$ at $ \mathit{\epsilon} s = n p 2 $. Note that the Lorentz cavity field is used in deriving the Clausius-Mossotti equation, and that fact leads to the complete cancellation of the interface and electronically induced solute dipoles at $ \mathit{\epsilon} s = n p 2 $. This does not happen in the Maxwell case in Eq. (25) since the cavity field of dielectric theories,^{34} instead of the Lorentz field, is used in the Maxwell electrostatics.

In application to analyzing experimental results, the complex-valued dielectric function of bulk water was taken from recent measurements^{40} covering the range of frequencies 5.9–1120 GHz. The experimental data^{32} for the absorption coefficient of lysozyme were obtained by averaging absorption over the interval of frequencies 0.38–0.92 THz. There is therefore a sufficient frequency overlap between the data for bulk water^{40} and solutions.^{32} The refractive index of the protein, $ n p \u22431.55$, is from Ref. 52.

Figure 5 presents the calculations of the change in the absorption coefficient of the solution with increasing the volume fraction of the solute according to Eq. (4). Specifically, we normalize the change in absorption by the absorption of bulk water,

where the absorption of water $ \alpha a b s s ( \omega ) $ is calculated from Eq. (4) with the water dielectric function $ \mathit{\epsilon} s ( \omega ) $ used in place of the solution dielectric function $\mathit{\epsilon} ( \omega ) $. The solution dielectric function is calculated as $\mathit{\epsilon} ( \omega ) = \mathit{\epsilon} s ( \omega ) +4\pi \Delta \chi ( \omega ) $, with three scenarios presented by Eqs. (24)–(26) for $4\pi \Delta \chi ( \omega ) $.

Both the Maxwell and Lorentz voids are sensitive to the choice of the protein refractive index. The dashed lines in Fig 5 refer to the calculations neglecting the induced dipole of the protein [$ M 0 i n d =0$ in Eq. (3)], which is achieved by putting *n*_{p} = 1. There is, however, no physical reason to adopt this assumption and this limit is shown here as a mere warning that a “good agreement” with the experiment can follow from unphysical approximations (the Lorentz scenario is in perfect agreement with experimental data in this case, blue dashed line). It is also clear that MD simulations of $ \chi c ( \omega ) $ provide a good account of the cavity field dynamics and are consistent with the experiment. We additionally show in Fig. 6 the comparison of the calculations employing the experimental^{40} dielectric function $ \mathit{\epsilon} s ( \omega ) $ (solid lines) and the same function obtained from MD simulations of TIP3P water (dashed lines). There is little sensitivity to the choice of the water model when the cavity response function $ \chi c ( \omega ) $ is calculated from MD simulations. In contrast, the Maxwell and Lorentz equations for the cavity field are sensitive to the choice of the frequency-dependent dielectric function. The agreement with the experiment is much worse when $ \mathit{\epsilon} s ( \omega ) $ for TIP3P water is used in Eqs. (25) and (26). This is not surprising given large discrepancies in the dielectric properties between the experimental and TIP3P water shown in Fig. 3.

## V. CONCLUSIONS

Absorption of radiation provides access to the orientational structure of the hydration shell in terms of the dipolar solute-solvent (cross) correlation function [Eqs. (1) and (11)]. The alternative representation is in terms of the cavity field susceptibility [Eq. (24)]. Either of these functions is the only information about the solution accessible from the absorption experiment. It does not allow any spatial resolution of the interface and requires theoretical interpretation to draw the connection between these susceptibilities and the interfacial structure. The analytical model^{23,27} presented here calculates the slope of the absorption coefficient vs the solute volume fraction in terms of the cavity susceptibility at the radiation frequency. Simulations of a single lysozyme in solution performed here are capable of producing this function and lead to a good agreement with the experiment^{32} (Fig. 5).

Nuclear motions of the protein do not contribute to the signal, and protein itself is effectively transparent in the THz domain. The only contribution to protein’s absorption comes from the electronic induced dipole and the corresponding refractive index. Therefore, it is predominantly the protein-water interface that determines the change in the THz absorption by solutions relative to the bulk.

The structure of water can be disrupted to a different extent depending on the solute. The extent of disruption and the resulting structure of the hydration shell affect the decay of the solute-solvent correlations into the bulk.^{27} The decay length can be in the range of 20–40 Å.^{27,30} The extended hydration shell anticipated in the past^{16} in fact implies extended dipolar cross correlations. Those are electrostatic correlations, for which a long range of decay is anticipated. Absorption by ideal solutions provides no direct input into either the structure of the hydration shell (spatial arrangement of the nuclei) or its extent into the bulk.

In the absence of direct simulations, two analytical limits, corresponding to the Maxwell and Lorentz voids, can be derived [Eqs. (25) and (26)]. We find, in agreement with previous calculations,^{30} that the Maxwell void gives a poor representation of the data, while the Lorentz void gives a better account of the experiment. One still has to realize that the appearance of the Lorentz limit for the void polarization is not consistent with the standard electrostatics but is an emergent consequence of the orientational and density restructuring of the protein hydration layer compared to bulk water. It is therefore the deviation from the Maxwell limit that should be considered as the specific effect of the protein hydration shell. The Lorentz scenario is an effective-medium representation of the complex structure of the protein-water interface, which requires modification of the standard dielectric boundary-value problem.

## ACKNOWLEDGMENTS

This research was supported by the National Science Foundation (No. CHE-1464810) and through XSEDE resources (No. TG-MCB080116N).