A discrepancy has emerged in recent years between single-molecule Förster resonance energy transfer (smFRET) measurements and small angle X-ray scattering (SAXS) or small angle neutron scattering experiments in the study of unfolded or intrinsically disordered proteins in denaturing solutions. Despite significant advances that have been made in identifying various factors which may have contributed to the manifestation of the so-called smFRET-SAXS discrepancy, no consensus has been reached so far on its original source or eventual resolution. In this study, we investigate this problem from the perspective of the solvent effect on FRET spectroscopic ruler (SEFSR), a generic term we use to describe various solvent-dependent factors affecting the accuracy of the FRET experimental method that is known as a “spectroscopic ruler.” Some factors belonging to SEFSR, such as direct dye-solvent interaction and labeling configuration, seem to have not received due attention regarding their significance in contributing to the discrepancy. We identify SEFSR by measuring a rigid segment of a double-stranded DNA in various solutions using the smFRET method and evaluate its relative importance in smFRET experiments by measuring segments of a single-stranded DNA and polyethylene glycol (PEG) in solutions. We find that SEFSR can produce non-negligible FRET-inferred interdye distance changes in various solutions, with an intensity following the Hofmeister series in ionic solutions and dependent on labeling configurations. SEFSR is found to be significant in GuHCl and urea solutions, which can fully cover the apparent expansion signal of dye-labeled PEG. Our findings suggest that SEFSR may have played an important role in contributing to the smFRET-SAXS discrepancy.
Single-molecule Förster resonance energy transfer (smFRET) and small angle X-ray or neutron scattering (SAXS or SANS) have become two powerful sets of experimental tools to probe into a variety of biophysical processes such as protein folding processes in denaturing solutions.1–6 However, the inferences drawn from these two approaches do not always agree with each other. For small proteins, as the denaturant concentration is increased, smFRET measurements indicated a gradual expansion of the dye-labeled proteins, which is regarded as evidence for the coil-globule transition preceding the folding transition.1,2 By contrast, SAXS tests did not show a statistically significant expansion of the proteins under similar conditions, suggesting that the coil-globule transition occurs simultaneously with the folding transition rather than at the initial stage of folding.1 Notably, this disagreement, referred to as the smFRET-SAXS discrepancy, is experimentally persistent and appears to be common for single-domain proteins,3 challenging the precise prediction of protein folding and the mechanism of chemical denaturation as well as the validity and accuracy of these two experimental methods.
A number of studies have been devoted to the investigation of this discrepancy from different angles using different approaches.3–12 However, no consensus has been reached so far regarding the origin of the discrepancy. Some studies investigated possible systematic errors in smFRET measurements. In one study, changes in the quantum yield of the dyes in labeled target molecules were considered to be a contributing factor to systematic errors in smFRET experiments.10 The effect of the dyes on the polymers was also investigated.11,12 Attention was also given to the influence of using generic polymer models to extract FRET-inferred interdye distances from measured FRET efficiencies.5,13 A question was raised on the interpretation of FRET efficiency changes in terms of hydrophobic or hydrogen bond-driven protein collapse.3
On the other hand, some studies identified certain factors that could result in systematic errors in SAXS experiments. For instance, the inherent difficulty of the SAXS experiments in determining the radii of gyration at very low denaturant concentrations where the protein dimensions vary most significantly may lead to an underestimation of the SAXS results.10 The sensitive dependence of the radii of gyration on the range of the Guinier approximation in analyzing the SAXS data may also have contributed to the smFRET-SAXS discrepancy.10
Furthermore, several studies conducted very recently with multiple experimental methods (including smFRET and SAXS) and/or simulations have produced consistent results that reconcile (partially) the smFRET-SAXS discrepancy for the proteins and experimental conditions considered, suggesting that there is no fundamental contradiction between smFRET and SAXS experiments.5–7,9 However, even these consistent studies do not agree on the source of the discrepancy in earlier research studies. Some attributed the discrepancy mostly to the inherent characteristics of the experimental approaches5 or the handling and interpretation of the experimental data.6 Others ascribed the discrepancy to the physical decoupling of different quantities measured in smFRET and SAXS experiments that arise from chemical and conformational heterogeneity rather than the inherent shortcomings of the experimental methods.7,9 Taken together, it appears that multiple factors are at play in manifesting the smFRET-SAXS discrepancy and further investigation is still needed before different perspectives can be pieced together to fully resolve the discrepancy.
In addition, even though the interaction of the polymer with the solvent,6 the effect of the dyes on the polymer,11,12 and the importance of the dyes to structure determination14,15 have been investigated systematically, the solvent effect on the dyes and their role in smFRET measurements seem to be a relatively underexplored subject, to our knowledge. (Note that “dye” includes both fluorophores and linkers in this work.) Some previous studies held the view that the effects of the dyes in smFRET were not significant enough to play an important role.5–8 However, it is common in these studies that the effects of the dyes were assessed by invoking a generalized polymer scaling law to describe the contribution of the dye pair to the overall dimension of the dye-labeled polypeptide in denaturing solutions. The dye pair was regarded as an equivalent number of residues in the polypeptide chain, and the whole dye-labeled polypeptide was assumed to share a single scaling exponent. Implied in the form of the scaling law seems to be an assumption that the dyes would respond to the solvent environments the same way as the polypeptide chain does, that is, a form of homogeneity between the dyes and the polypeptide chain. Yet it has been proposed recently that the chemical and conformational heterogeneity of the polypeptide chain may be a major factor leading to the smFRET-SAXS discrepancy.7–9 Given that the chemical and physical properties of the dyes16 are significantly different from the single-chain polymers, the dyes may provide the strongest source of heterogeneity in response to solvents. Therefore, the dye-induced heterogeneity should not be ignored a priori, and it is important to evaluate the effects of different solvents on the dyes in smFRET measurements.
In this work, we investigate the smFRET-SAXS discrepancy problem from the perspective of the “solvent effect on FRET spectroscopic ruler” (SEFSR) using a generic calibration approach. SEFSR is a generic term we use to describe various solvent-dependent factors that affect the accuracy of the FRET experimental method. In the following, we explain this concept more specifically from the aspects of its intuitive understanding, content, merits, identification, and significance in smFRET-SAXS discrepancy.
The FRET experimental method is known as a “spectroscopic ruler” that measures distances on the nanoscale. In certain aspects, the FRET spectroscopic ruler is similar to a vernier caliper, but much smaller and more delicate. Both have a ruler body and two measuring jaws. The “ruler body” of the FRET spectroscopic ruler is fluorescence spectroscopy, and the “measuring jaws” are the two dyes (including both fluorophores and linkers). The accuracy of measurements with a vernier caliper may be affected by the deformation of the ruler body and the measuring jaws in different environments (e.g., at different temperatures). Similarly, the FRET spectroscopic ruler may become “warped” in different solvent environments due to the “deformation” in the ruler body (fluorescence spectroscopy) and the measuring jaws (dye pair). Considering that the FRET spectroscopic ruler is a delicate nanoscale ruler (e.g., the “measuring jaws” as the dyes are flexible rather than rigid) which depends on a variety of factors, it is likely that it may also be affected significantly by different solvent environments. An intuitive understanding of the “solvent effect on FRET spectroscopic ruler” (i.e., SEFSR) would be the warping effect of the FRET spectroscopic ruler due to the solvent environments.
SEFSR understood in this way is clearly a generic effect that comprises multiple specific factors or mechanisms. We can categorize these factors or mechanisms into two major classes, namely, the solvent effect on the “ruler body” (fluorescence spectroscopy) and the solvent effect on the “measuring jaws” (dye pair). More specifically, the solvent effect on the ruler body contains solvent-dependent factors such as quantum yield, refractive index, dye spectra, and their overlap integral. The solvent effect on the measuring jaws contains mechanisms such as direct interaction between dye linkers and aqueous solutions (i.e., polymer physics of dye linkers), direct interaction between fluorophores and aqueous solutions (non-photophysical effect), solvent-dependent dye-dye and dye-substrate interactions (e.g., hydrophobic interactions), the effect of solvent viscosity, etc. Note that errors from data analysis methods are not included as a component of SEFSR, even though they may also contribute to the FRET-inferred interdye distance.5,13,18
Knowing that SEFSR contains so many individual factors, one will probably question whether such a concept is meaningful or useful at all. We would like to first stress that we do think it is very important to identify and study each individual factor contributing to SEFSR, as that will deepen our understanding of the mechanism of how SEFSR emerges. In fact, in this study, we will identify and study two factors that possibly have made significant contributions to SEFSR, namely, direct dye-solvent interaction and dye flexibility in different labeling configurations. There are also other potentially important factors contributing to SEFSR. The precise assessment of the relative weights of each individual factor in SEFSR is an important but very tricky task. In the present paper, we only focus on the two individual factors mentioned above. The much more ambitious task of complete identification and study of each individual factor and its relative weight in SEFSR is beyond our current capability and resources available to us.
On the other hand, we also believe that SEFSR as a generic effect has merits, especially from a pragmatic point of view. We notice that practical calibration of a ruler (including the FRET spectroscopic ruler) does not necessarily require detailed identification of each individual factor contributing to the warping of the ruler under different environmental conditions. Undoubtedly, having such knowledge will deepen our understanding of why the ruler becomes warped. But as far as how to practically calibrate the ruler is concerned, it is sufficient to identify the overall warping effect (e.g., SEFSR) without identifying each individual contributing factor. This is essentially the generic calibration approach. The contrary is the specific calibration approach, with each individual factor identified, its contribution assessed, and then combined together for calibration.
In the specific calibration approach, it is likely that some unknown factors not yet identified may be missed, or the errors in each factor when combined together may accumulate to become too large, resulting in inaccurate calibration. Besides, it is simply too time-consuming to do it that way, as far as practical calibration is concerned. By contrast, in the generic calibration approach, identifying the overall warping effect of the FRET spectroscopic ruler, SEFSR, has the following advantages. First, no individual factors yet unidentified are missed as all individual factors (known or unknown) are contained in SEFSR. Second, real distance changes arising from conformational changes of the substrate molecule will emerge precisely when SEFSR is properly offset. Third, SEFSR is more time-efficient to identify compared to each individual factor. (The disadvantage of identifying only SEFSR without individual factors is obviously that the mechanisms manifesting SEFSR are hidden, which is why the identification of the generic effect, SEFSR, needs to be complemented by the identification of individual contributing factors.)
Then naturally there is the question of how SEFSR can be identified. A typical method to check whether a regular ruler is warped in different environments (not necessarily obvious to the eye) is to measure a rigid distance that does not vary under the environmental conditions considered and check whether the measurement results change (beyond error bounds) with the environmental conditions. If the measurement results are read off from the ruler and processed correctly, the apparent distance changes of the measurement results would be an indication that the ruler is warped under these environmental conditions. In this paper, we use essentially the same strategy to identify SEFSR. More specifically, we identify SEFSR by attaching a dye pair to a segment of a double-stranded DNA (dsDNA) that can be considered as rigid. We find that many additive-enriched aqueous solutions, such as guanidine hydrochloride (GuHCl) and urea solutions, can affect the FRET-inferred interdye distances to various degrees, which is a signature of SEFSR. The apparent FRET-inferred interdye distance changes (in contrast to real conformational changes of the substrate molecule) under different solvent conditions can thus be used as a quantitative measure of how strong SEFSR is. In ionic solutions, SEFSR is found to follow the Hofmeister series, known as the ion-specific effect, suggesting that direct dye-solvent interaction probably plays an important role in SEFSR. We also find that labeling configurations associated with different flexibilities of the dyes on the dsDNA can result in different responses of the FRET-inferred interdye distance to the solvent.
SEFSR identified and quantified in the above way can then be used as an estimate to assess to what extent these artificial distance changes from SEFSR contribute to the FRET-inferred interdye distance in other smFRET experiments, in which the substrate molecules could go through real conformational changes that contribute to the FRET-inferred interdye distance too. In this work, this is done by performing smFRET experiments on a dsDNA with an adjustable length of the overhang [protruding single-stranded DNA (ssDNA)] and polyethylene glycol (PEG) 3500 in solutions. We find that in GuHCl and urea solutions SEFSR makes a considerable contribution to the expansion of the mean FRET-inferred interdye distance as the denaturant concentration is increased, similar to that observed in unfolded or denatured proteins. An ssDNA overhang with as many as 20 bases could barely produce an expansion comparable to that produced by SEFSR. Moreover, the expansion arising from SEFSR is capable of covering the observed apparent expansion of dye-labeled PEG 3500, which is believed to be unable to expand further by other means. Our findings suggest that SEFSR with the associated dye-induced heterogeneity may possibly account for the smFRET-SAXS discrepancy under some circumstances.
A. Preparation of labeled PEG and DNA
HS-PEG3500-NH3 was purchased from Jenkem Technology (Beijing). The dye-labeled PEG 3500 was prepared by coupling succinimide-modified AF488 (Invitrogen) and maleimide-modified AF647 (Invitrogen) with HS-PEG3500-NH3. More specifically, HS-PEG3500-NH3 was dissolved in 100 mM NaHCO3 solution (pH 8.5) to a final concentration of 200 μM. AF488 was then added to this solution to a final concentration of 800 μM. After being kept at 4 °C for 24 h, the mixture was dialyzed against PBS buffer (pH 7.4) for 72 h using a dialysis bag with a 2000 Da cutoff. The product was mixed with excess AF647 and incubated at 4 °C for 24 h, and then dialyzed again against PBS buffer for 72 h using a dialysis bag with a 2000 Da cutoff. The final product was dye-labeled PEG 3500. The labeled DNAs were purchased from Thermo Fisher. The complementary DNA strands were then annealed in our lab in PBS buffer at a concentration of 20 μM. The sequences are shown in Table S1 in the supplementary material. Note that all operations were performed in the dark.
B. SmFRET experiments
All single-molecule measurements were performed with a home-built confocal microscope equipped with a 488 nm laser (Melles Griot) and a Zeiss 63×/1.4NA oil objective. Although all results shown in the manuscript were obtained with the oil objective, a water objective (63×W/1.2NA, Zeiss) was also used on a few occasions in GuHCl solutions, and no significant difference in the FRET efficiency distribution was observed. Photons emitted from the sample were collected using the same objective. Remaining excitation light was eliminated using a filter (488 nm, Semrock) before the emitted photons passed the confocal unit with a 50 μm pinhole. The emitted photons were separated into two channels with a dichroic mirror (594 nm, Semrock). Donor photons were filtered using a bandpass filter (525/50 nm, Semrock) and then focused on an avalanche photodiode (PicoQuant). Acceptor photons were filtered using a bandpass filter (676/37 nm, Semrock) and detected using another avalanche photodiode (PicoQuant). The arrival time of every detected photon was recorded with a PicoHarp 300 counting module (PicoQuant). All measurements were performed by exciting the donor dye with a laser power of 100 μW at 293 K. Our laser focus is an ellipsoid (0.24 × 0.24 × 1.20 μm), as determined by fluorescence correlation spectroscopy (FCS) of freely diffusing rhodamine 6G in our basic solution. Changes in the focal volume do not affect the FRET efficiency because the intrachain dynamics of our samples are much faster than the diffusion time through the laser focus. The laser focus of the oil objective was positioned 10-15 μm deep into the aqueous solution (the depth was gauged against the thickness of the coverslip). All samples contained 20 mM Tris buffer (pH 7.4) with varying concentrations of solutes. In addition, 0.5M GuHCl was added to each urea solution and 0.3M NaCl was added to all other nonionic solutions in order to minimize the effects of electrostatic interactions.
C. FRET efficiency and mean FRET-inferred interdye distance calculation
Single-molecule FRET efficiency distributions were acquired in samples with PEG or DNA concentrations of about 30-50 pM. The photon detection times were stored with 4 ps resolution. Each sample was measured for 30–120 min. In all aqueous solutions, 0.001% Tween 20 (Sigma-Aldrich) was used to prevent surface adhesion, and 2 mM methyl viologen and 1 mM L-ascorbic acid (Sigma-Aldrich) were used to prevent photobleaching.19 A photon burst was retained as a significant event if the total number of counts exceeded 40 and the photon flux density exceeded 25 photons/ms. The flux density criterion was applied by thresholding the arrival-time interval between adjacent photons: when the interval is larger than 40 μs (corresponding to 25 photons/ms), the photon burst is regarded as finished. Our photon burst signal is ∼104 photons/ms (total red) and ∼158 photons/ms (total green), whereas the total background is 1.9 photons/ms, so the background is neglected.
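The burst selection rule described above can be illustrated with a minimal sketch. It assumes photon arrival times are available as a sorted list in microseconds; the function name `find_bursts`, its parameters, and the synthetic photon stream are hypothetical and are not taken from the authors' analysis software.

```python
from typing import List, Tuple


def find_bursts(arrival_us: List[float],
                max_gap_us: float = 40.0,
                min_counts: int = 40) -> List[Tuple[int, int]]:
    """Return (start, end) index pairs (end exclusive) of retained bursts.

    A burst ends when the interval to the next photon exceeds max_gap_us
    (40 us corresponds to 25 photons/ms); it is retained only if it
    contains more than min_counts photons.
    """
    bursts: List[Tuple[int, int]] = []
    if not arrival_us:
        return bursts
    start = 0
    for i in range(1, len(arrival_us) + 1):
        at_end = i == len(arrival_us)
        if at_end or arrival_us[i] - arrival_us[i - 1] > max_gap_us:
            if i - start > min_counts:
                bursts.append((start, i))
            start = i
    return bursts


# Hypothetical photon stream (illustrative only, not measured data):
# 50 photons spaced by 10 us, a short group of 10 photons, then 45 photons
# spaced by 20 us, with long dark gaps in between.
times = [10.0 * k for k in range(50)]
times += [990.0 + 10.0 * k for k in range(10)]
times += [2080.0 + 20.0 * k for k in range(45)]

bursts = find_bursts(times)
print(bursts)
```

In this toy stream the short middle group is discarded because it does not exceed 40 counts, while the two longer groups are kept.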
The FRET efficiency was calculated with the formula

\[ E_F = \frac{n_A}{n_A + \gamma n_D}, \tag{1} \]
where nA is the number of photons collected in the acceptor channel within one burst and nD is the number of photons collected in the donor channel. γ is the correction factor accounting for the differences in quantum yield and detection efficiency between the two dyes, determined from the photon flux density difference between two different photon bursts, ΔIA/ΔID, in 20 mM Tris buffer (pH 7.4). Here γ is equal to 0.66, the average value over 200 photon bursts. A beta distribution was used to fit each FRET efficiency distribution,20 and the mean FRET efficiency ⟨EF⟩ was extracted to calculate the mean FRET-inferred interdye distance rDA.
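As an illustration of this step, the sketch below computes per-burst efficiencies and a beta-distribution mean. It assumes the commonly used γ-corrected ratio nA/(nA + γ nD) with γ = 0.66 as quoted in the text. The beta parameters are estimated here by the method of moments, which is an assumption made for simplicity and not necessarily the fitting procedure of Ref. 20. The photon counts are hypothetical.

```python
from statistics import mean, pvariance

GAMMA = 0.66  # correction factor quoted in the text


def fret_efficiency(n_acceptor: float, n_donor: float,
                    gamma: float = GAMMA) -> float:
    """Gamma-corrected FRET efficiency of a single burst (assumed form)."""
    return n_acceptor / (n_acceptor + gamma * n_donor)


def beta_moment_fit(efficiencies):
    """Method-of-moments estimate of beta parameters (alpha, beta).

    This is a simple stand-in for a full fit; it requires
    0 < variance < mean * (1 - mean).
    """
    m = mean(efficiencies)
    v = pvariance(efficiencies)
    scale = m * (1.0 - m) / v - 1.0
    return m * scale, (1.0 - m) * scale


# Hypothetical (acceptor, donor) photon counts per burst, not measured data.
burst_counts = [(60, 40), (55, 45), (70, 30), (65, 35), (58, 42)]
efficiencies = [fret_efficiency(a, d) for a, d in burst_counts]

alpha, beta = beta_moment_fit(efficiencies)
mean_e = alpha / (alpha + beta)  # mean of the beta distribution
print(round(mean_e, 4))
```

The mean of a beta distribution is α/(α + β), which is the quantity that would play the role of ⟨EF⟩ in the subsequent distance calculation.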
The mean FRET efficiency for the labeled polymer and the FRET-inferred interdye distance are related by the formula4,13

\[ \langle E_p \rangle = \int_0^L \frac{p(r)}{1 + (r/R_0)^6}\, dr, \tag{2} \]
where ⟨Ep⟩ is the mean FRET efficiency for the labeled polymer, r is the real-time FRET-inferred interdye distance of the dye-labeled polymer, p(r) is the probability distribution of r due to configuration variations, R0 is the Förster radius, and L is the contour length of the dye-labeled polymer.
Some cautionary remarks are in order regarding the formula in Eq. (2). First, the real-time FRET efficiency distribution cannot yet be measured;21 only ⟨Ep⟩ can be determined from the experimental FRET efficiency distribution.22 Here ⟨Ep⟩ is approximated by ⟨EF⟩, and the possible error of this approximation (mainly photophysical error) is incorporated into SEFSR.
Second, in theory, p(r) can be observed using different dye pairs with different R0.23 However, to our knowledge, p(r) has not been detected experimentally. In practice, an a priori polymer model with one adjustable parameter, such as the Gaussian chain, the worm-like chain, or the self-avoiding walk model, is used to give the explicit form of p(r). Presumably, a polymer model that better describes the actually measured polymer (with the dye pair) will produce more accurate results. The frequently used Gaussian chain model has the probability distribution4,7

\[ p(r) = 4\pi r^2 \left( \frac{3}{2\pi \langle r^2 \rangle} \right)^{3/2} \exp\left( -\frac{3 r^2}{2 \langle r^2 \rangle} \right), \tag{3} \]
where ⟨r²⟩ is the adjustable parameter. Using Eqs. (2) and (3), ⟨r²⟩ can be fixed for each experimentally determined ⟨Ep⟩. Then the mean FRET-inferred interdye distance rDA can be calculated as follows:
Third, the value of the Förster radius R0 for the dye pair AF488 and AF647 used in our experiments was taken to be 4.89 nm.21 Finally, the contour length L of the dye-labeled polymer is given by the contour length of the dye pair (4.4 nm)16 plus that of the polymer part separating the dye pair. If an a priori probability distribution p(r) associated with a polymer model is used, as is the case in our study, the specific value of L would not matter as long as it is large enough to include any significant contribution of p(r) to the integral.
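The inversion from a measured mean efficiency to ⟨r²⟩ can be sketched numerically. The example below assumes the standard Förster relation E(r) = 1/[1 + (r/R0)^6] averaged over the standard Gaussian chain distribution, with R0 = 4.89 nm from the text. Taking rDA as the root-mean-square distance, i.e., the square root of ⟨r²⟩, is an assumption of this sketch (one common convention). The integration limit `upper_nm`, the bisection bracket, and the target efficiency are hypothetical choices.

```python
import math

R0_NM = 4.89  # Foerster radius quoted in the text (nm)


def transfer_efficiency(r: float, r0: float = R0_NM) -> float:
    """Foerster transfer efficiency at interdye distance r (nm)."""
    return 1.0 / (1.0 + (r / r0) ** 6)


def gaussian_chain_pdf(r: float, mean_sq: float) -> float:
    """Gaussian chain end-to-end distance distribution p(r)."""
    pref = 4.0 * math.pi * r * r * (3.0 / (2.0 * math.pi * mean_sq)) ** 1.5
    return pref * math.exp(-3.0 * r * r / (2.0 * mean_sq))


def mean_efficiency(mean_sq: float, upper_nm: float = 60.0,
                    steps: int = 6000) -> float:
    """Trapezoid-rule estimate of the integral of E(r) p(r) on [0, upper_nm]."""
    h = upper_nm / steps
    total = 0.0
    for k in range(steps + 1):
        r = k * h
        weight = 0.5 if k in (0, steps) else 1.0
        total += weight * transfer_efficiency(r) * gaussian_chain_pdf(r, mean_sq)
    return total * h


def solve_mean_sq(target_e: float, lo: float = 0.01, hi: float = 2500.0,
                  tol: float = 1e-9) -> float:
    """Bisection for <r^2> (nm^2); the mean efficiency falls as <r^2> grows."""
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if mean_efficiency(mid) > target_e:
            lo = mid  # chain too compact, efficiency too high
        else:
            hi = mid
    return 0.5 * (lo + hi)


# Hypothetical mean efficiency (illustrative only, not a measured value).
target = 0.5
mean_sq = solve_mean_sq(target)
r_da = math.sqrt(mean_sq)  # assumed convention: r_DA as the RMS distance
print(r_da)
```

Because the efficiency is negligible at distances far beyond R0, the result is insensitive to the upper integration limit once it is large enough, in line with the remark on the contour length L.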
III. RESULTS AND DISCUSSION
A. Solvent effect on FRET spectroscopic ruler
To identify and isolate SEFSR, we attached the AF488 and AF647 dye pairs16 (structural formulas shown in Fig. S1 of the supplementary material) to the blunt end of a dsDNA, separated by a rigid distance of the breadth of the double strand [Fig. 1(a)]. This was achieved by annealing two complementary ssDNAs dye-labeled at the 5′ and 3′ end of each strand, respectively. The resulting dye-labeled dsDNA is named DNA(0), where zero indicates that the dye pair is not separated by any nucleobase [Fig. 1(a)]. The sequence of the dsDNA is given in Table S1 in the supplementary material.
Although far from perfect, DNA(0) is the best control molecule we have found so far to identify and extract SEFSR. The validity of using DNA(0) as a control molecule is supported by the following considerations. First, only bound DNA(0) can produce significant meaningful signals. DNA dissociation occurs in an all-or-none manner for DNA in a certain length range (e.g., between 4 and 30 base pairs),21,24 which applies to DNA(0) with its 25 base pairs. Dissociation signals of the DNA helix are rare (the residence times of bound DNA and free DNA are very long compared to the transition time) and transient [0.3 μs per base, 7.5 μs for DNA(0)].21,24 Second, DNA stability is decoupled from SEFSR. We observed an expansion signal of DNA(0) in aqueous solutions of PEG 6000, a DNA stabilizer25 [Fig. S2(b) in the supplementary material], indicating that the signal is an artifact of SEFSR rather than a real conformational change of the DNA helix. Third, considering the compactness of the DNA helix under native conditions, the solutions could not compact the helix further. Therefore, it is reasonable to trust that DNA(0) provides a rigid segment in various solvent environments to serve as a control for SEFSR. In addition to DNA(0), we also considered other molecules as possible controls, including end-sealed DNA, polyproline, DL-dithiothreitol (DTT), and a direct link of two dyes, but they are all more problematic than DNA(0) [see Fig. S2(c) in the supplementary material].
We performed smFRET experiments with DNA(0) in GuHCl and urea solutions. The experimental smFRET efficiency distributions of DNA(0) at increasing concentrations of GuHCl and urea are shown in Figs. 1(b) and 1(c), respectively. As one can see, there is a gradual shift in the smFRET efficiency distribution with increasing GuHCl and urea concentrations. This indicates that the mean FRET-inferred interdye distance also changes with the GuHCl and urea concentrations. Given the previous discussions on the validity of DNA(0) as a control molecule, such apparent changes in the FRET-inferred interdye distance are most probably artifacts arising from SEFSR rather than real conformational changes of the DNA helix.
SEFSR can also be quantified by an expansion rate. To see how this is done, we first calculated the FRET-inferred interdye distance rDA at a given GuHCl or urea concentration from the corresponding smFRET efficiency using Eqs. (2)–(4) in Sec. II. The FRET-inferred interdye distances at different GuHCl and urea concentrations are then plotted in Figs. 1(d) and 1(e), respectively. It is interesting to note that a straight line fits the data well, which means that the expansion of the FRET-inferred interdye distance with increasing solute concentration is almost linear in the regime probed by the current experimental conditions. Therefore, we can quantify SEFSR by an expansion rate, defined as the change of the mean FRET-inferred interdye distance per unit change of the solute concentration, given by the slope of the fitting line (in units of nm/M). For the GuHCl solution [Fig. 1(d)], the total change of the FRET-inferred interdye distance is 1.33 ± 0.05 nm as the GuHCl concentration is varied from 0.6M to 6M, resulting in an expansion rate of 0.25 ± 0.01 nm/M. Similarly, for the urea solution [Fig. 1(e)], the total expansion of the FRET-inferred interdye distance is 1.19 ± 0.04 nm as the urea concentration is increased from 0 M to 7 M, giving an expansion rate of 0.17 ± 0.01 nm/M. To date, several individual factors in SEFSR have been evaluated.3,10,26 However, compared to the generic SEFSR found here (0.99 ± 0.04 nm in GuHCl solution in a 4M range), the effects of those factors are relatively small and are not even an expansion (the largest factor, quantum yield, produced an error of −0.4 nm in GuHCl solution from 0M to 4M).10 This indicates that SEFSR contains other contributions that are not accounted for by these individual factors.
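The expansion rate defined above is simply the slope of a straight-line fit of rDA against solute concentration. The sketch below shows an ordinary least-squares fit in plain Python and a consistency check against the totals quoted in the text. The helper `linear_fit` and the synthetic data points (including the 5.0 nm intercept) are hypothetical and serve only to illustrate the procedure; they are not the measured values of Fig. 1.

```python
def linear_fit(x, y):
    """Ordinary least-squares line y = slope * x + intercept."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


# Synthetic, noise-free data (hypothetical): distances in nm generated from
# an assumed intercept of 5.0 nm and a rate of 0.25 nm/M.
conc_molar = [0.6, 1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
r_da_nm = [5.0 + 0.25 * c for c in conc_molar]

rate, intercept = linear_fit(conc_molar, r_da_nm)
print(rate, intercept)

# Consistency check with the totals quoted in the text:
# total change divided by the concentration range.
guhcl_rate = 1.33 / (6.0 - 0.6)  # about 0.246 nm/M
urea_rate = 1.19 / (7.0 - 0.0)   # 0.17 nm/M
print(guhcl_rate, urea_rate)
```

Dividing the quoted total changes by their concentration ranges reproduces the reported rates within the stated uncertainties, which is what one expects when the dependence is close to linear.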
We notice that the expansion rate in GuHCl solution is different from that in urea solution. A possible explanation is that the chemical properties of the solutes play an essential role in manifesting SEFSR. To test this hypothesis, we used DNA(0) as a probe to measure SEFSR in other solutions. All measurements except those for denaturant solutions were performed at solution concentrations higher than 2M in order to minimize the nonlinear effects of dye-dye electrostatic interactions and dye-DNA electrostatic interactions.10 Solution concentrations lower than 2M are considered for denaturant solutions [e.g., in Figs. 1(d) and 1(e)]. We observed changes in the mean FRET-inferred interdye distance to varying degrees in all the solutions tested (Figs. S3 and S4 in the supplementary material), indicating that SEFSR is ubiquitous. Most interestingly, we observed a regular pattern emerging when the expansion rate was analyzed with respect to different ionic solutions (Fig. 2). More specifically, for solutions containing different cations, the expansion rate was found to increase gradually as the solutions varied from KCl, NH4Cl, NaCl, LiCl, MgCl2 to GuHCl, which matches well with the cation Hofmeister series.27 The Hofmeister series is found ubiquitously in aqueous solutions27 and is one of the oldest problems in physical chemistry,28 the mechanism of which has long remained unclear. A consensus has been reached that the Hofmeister series is caused by direct interactions between ions and specific molecules.29 In this study, the specific molecules are the two dyes (including linkers). It is thus a well-founded hypothesis that a significant portion of SEFSR quantified by the solute-dependent expansion rate arises from the direct interaction between the solute and the dyes.
B. Labeling configuration can affect SEFSR
In a previous study, the dye-labeled dsDNA did not show a significant expansion in GuHCl and glycerol solutions.10 However, the dyes in that experiment were attached to the dsDNA in a way different from DNA(0), resulting in different labeling configurations of the dyes. This suggests that the labeling configuration may affect SEFSR. To investigate this problem, we prepared three types of dsDNAs labeled by the dye pair at different sites in the dsDNA [Fig. 3(a)]. Each dye in the dye pair was attached either to the end of the dsDNA or to a nucleobase within the dsDNA (the labeling sites are shown in Table S1 in the supplementary material). Dyes of the former type have more flexibility than the latter considering the sterically accessible volume30 and will be called flexible dyes. Thus different labeling configurations are associated with different flexibilities of the dyes. The three types of dsDNAs are referred to as type (I), (II), and (III), with the number of flexible dyes being zero, one, and two, respectively, corresponding to an increasing flexibility of the dyes [Fig. 3(a)]. [Note that type (III) dsDNA here is simply DNA(0) defined previously.] We remark that the part of the dsDNA separating the dye pair (less than 17 base pairs) can still be considered as rigid under the current experimental conditions. Therefore, the difference in the change of FRET-inferred interdye distance for these three different types of dye-labeled dsDNAs will be a reflection of SEFSR with different dye flexibilities.
We indeed observed a difference in the FRET-inferred interdye distance change when the three types of dsDNAs containing dye pairs with different flexibilities were measured in GuHCl, urea, ethylene glycol (EG), and sodium chloride (NaCl) solutions (Figs. S5 and S6 in the supplementary material). The FRET-inferred interdye distance rDA was again calculated with Eqs. (2)–(4). We also used the expansion rate of rDA to assess SEFSR with different flexibilities. For GuHCl and urea solutions, the expansion rate was found to increase significantly with increasing dye flexibility [Figs. 3(b) and 3(c)]. More specifically, in GuHCl solution, the expansion rates for type (I), (II), and (III) dye-labeled dsDNAs were 0.13 ± 0.03 nm/M, 0.22 ± 0.02 nm/M, and 0.25 ± 0.01 nm/M, respectively [Fig. 3(b)], indicating that SEFSR grows with increasing dye pair flexibility (number of flexible dyes) in this case. The expansion rates in urea followed a similar trend to those in GuHCl solution [Fig. 3(c)]. In EG solution, the expansion rate first increased significantly and then dropped slightly [Fig. 3(d)]. As for NaCl solution, the expansion rate mostly remained small, even though it also increased slightly at first and then dropped slightly [Fig. 3(e)]. Taken together, it seems that the flexibility of the dyes, i.e., the labeling configuration, may indeed affect SEFSR significantly under some conditions, but other factors, probably more subtle ones, are clearly also at play. We note that the expansion rate of type (I) dsDNA in solutions obtained here is similar to that reported in Ref. 9, in which the expansion in type (I) dsDNA provided the largest potential error resulting from the photophysical effect and instrument factors.
Given the spatial homogeneity of the solvent concentration (well-mixed solutions) and the long chemical linker used to avoid photophysical contact between the substrate and fluorophore, the photophysical properties of the dyes should not change much with different labeling configurations. The above results therefore indicate that the flexibility of the dyes labeled on target molecules is an important factor affecting the level of SEFSR. This, however, implies that the approach of attaching the dyes to a rigid structure might be inaccurate in determining the level of SEFSR when the dyes are attached to actual target molecules, because the dyes may have different flexibilities under these two conditions (rigid structure vs. target molecule). Rigid structures with different levels of rigidity (e.g., polyproline18 and dsDNA30) may also affect the level of SEFSR. Nevertheless, with its limitation kept in mind, this approach may still prove to be a valuable way to make an approximate estimate of SEFSR in smFRET experiments.
C. SEFSR in smFRET measurements
SEFSR can produce artificial changes in the FRET-inferred interdye distance as solute concentration is varied, which may overlay the signal of real polymer conformational change and lead to inaccuracy in smFRET measurements.3 To evaluate the relative importance of the artificial distance change from SEFSR in relation to the signal of real polymer conformational change in smFRET measurements, we prepared a type of dye-labeled dsDNA with an overhang (a stretch of ssDNA at the end of the dsDNA) in such a way that the dye pair was separated by the overhang with an adjustable length [Fig. 4(a)]. This was achieved by adding thymine bases to the strand of DNA(0) labeled by AF488, creating a 3′ T-overhang. The length of the overhang that separates the dye pair can be adjusted by inserting a different number of thymine bases. We refer to these dye-labeled dsDNAs with an overhang as DNA(n) (n = 0, 2, 3, 4, 5, 6, 8, 12, 16, 20), where n is the number of thymine bases in the overhang.
We investigated the conformational changes of the overhang in DNA(n) to evaluate the relative contribution of SEFSR in smFRET measurements.31 [The calculated FRET-inferred interdye distances for DNA(n) in different solutions are shown in Fig. S7 in the supplementary material.] We first studied DNA(n) in ionic solutions. In GuHCl solution, the expansion rate was found to increase gradually with increasing base number [Fig. 4(b)], reading 0.25 ± 0.01 nm/M for DNA(0) and 0.48 ± 0.03 nm/M for DNA(20). This means that real conformational changes of an ssDNA (overhang) containing as many as 20 bases (estimated as 0.48 nm/M − 0.25 nm/M = 0.23 nm/M) could barely provide an expansion rate comparable to that contributed by SEFSR [estimated by the expansion rate for DNA(0), 0.25 nm/M], indicating that SEFSR is significant in GuHCl solution. However, in NaCl solution, the trend of the expansion rate with respect to the base number was opposite to that in GuHCl solution. The expansion rates for DNA(0) and DNA(20) were −0.04 ± 0.02 nm/M and −0.42 ± 0.05 nm/M, respectively [Fig. 4(c)]. Real conformational changes of an ssDNA containing only 3 bases [estimated by the apparent expansion rate for DNA(3) offset by SEFSR, itself estimated by the apparent expansion rate for DNA(0)] could already generate an expansion rate comparable to that contributed by SEFSR in NaCl solution [Fig. 4(c)]. Therefore, the relative weights of SEFSR and real conformational change of ssDNA are rather different in different solutions: SEFSR is comparable to the real conformational change of an ssDNA with 20 bases in GuHCl solution, but to that of an ssDNA with only 3 bases in NaCl solution. This phenomenon is a manifestation of the heterogeneity between the dyes and the ssDNA in their response to different solutions.
Then we investigated DNA(n) in non-ionic solutions. We also observed the heterogeneity between SEFSR and ssDNA real conformational changes in different solutions. In urea solution, the expansion rate showed a similar trend to that in GuHCl solution [Fig. 4(d)] and real conformational changes of ssDNA with 16 bases could contribute the same expansion rate as that from SEFSR. However, in EG solution (another non-ionic solution), the expansion rate first increased and then decreased with respect to increasing base number, which might arise from the complicated conformational changes of the ssDNA with different lengths.
Our investigations above suggest that erroneous inferences regarding the conformational changes of polymers could result if SEFSR is not taken into account properly. For instance, if only DNA(12) were measured with smFRET, a naive analysis of the experimental data without considering SEFSR would lead to the conclusion that the ssDNA (with 12 bases) expands significantly with increasing GuHCl and urea concentrations [Figs. 4(b) and 4(d)], with an apparent expansion rate of 0.44 ± 0.03 nm/M in GuHCl and 0.34 ± 0.03 nm/M in urea. However, the expansion rate contributed by SEFSR, estimated by that of DNA(0) (0.25 ± 0.01 nm/M in GuHCl and 0.17 ± 0.01 nm/M in urea), needs to be subtracted. As a result, the real expansion rate of DNA(12) (estimated to be 0.19 nm/M in GuHCl and 0.17 nm/M in urea solutions) may actually be only about half of the apparent expansion rate or less. As for the case in EG solution [Fig. 4(e)], DNA(12) apparently expanded if the expansion rate is taken at face value. However, taking into account SEFSR (0.12 ± 0.01 nm/M), which is similar to the apparent rate of DNA(12) (0.12 ± 0.03 nm/M), DNA(12) itself may actually have gone through no real conformational change. Moreover, considering the entire range of expansion rates in EG solution from DNA(0) to DNA(12) and compensating for SEFSR, we see that the ssDNA may have expanded first and then contracted as it was gradually lengthened with more bases [Fig. 4(e)]. Nevertheless, this complex behavior would not be detected in normal smFRET experiments, in which the dyes are attached to only one pair of fixed sites in the polymer.32 Therefore, SEFSR can play an important role in smFRET measurements, with the potential of affecting the observed and inferred distances both quantitatively and qualitatively.
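The subtraction described above can be written out as a short calculation. The following is a minimal sketch using only the DNA(12) and DNA(0) expansion rates quoted in the text; the helper name `corrected_rate` is hypothetical, and combining the two uncertainties in quadrature assumes they are independent, which the text does not state.

```python
import math

def corrected_rate(apparent, apparent_err, sefsr, sefsr_err):
    """Subtract the SEFSR estimate from an apparent expansion rate (nm/M).

    Assumption (not stated in the text): the two uncertainties are
    independent, so they are combined in quadrature.
    """
    value = apparent - sefsr
    err = math.hypot(apparent_err, sefsr_err)
    return value, err

# Expansion rates quoted in the text, in nm/M: (value, uncertainty).
dna12 = {"GuHCl": (0.44, 0.03), "urea": (0.34, 0.03), "EG": (0.12, 0.03)}
dna0 = {"GuHCl": (0.25, 0.01), "urea": (0.17, 0.01), "EG": (0.12, 0.01)}

results = {}
for solvent in dna12:
    results[solvent] = corrected_rate(*dna12[solvent], *dna0[solvent])
    value, err = results[solvent]
    fraction = value / dna12[solvent][0]  # share of the apparent rate that remains
    print(f"{solvent}: corrected rate = {value:.2f} +/- {err:.2f} nm/M "
          f"({fraction:.0%} of apparent)")
```

Under this assumption the corrected rates carry an uncertainty of roughly 0.03 nm/M, so the EG result is consistent with no real conformational change, whereas the GuHCl and urea results remain clearly positive.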
D. SEFSR can account for the apparent expansion of PEG 3500
Since SEFSR could potentially influence the accuracy of smFRET measurements significantly, it may be a contributing factor to the smFRET-SAXS discrepancy. Here we investigate this problem further using PEG 3500, a random coil chain that, according to many other methods, cannot be expanded any further and that has been used as a negative control to investigate the smFRET-SAXS discrepancy.3 We performed smFRET measurements on dye-labeled PEG 3500 in GuHCl and urea solutions, the only two solutions in which the smFRET-SAXS discrepancy has been identified so far.1,3 As shown in Figs. 5(a) and 5(b), the mean FRET-inferred interdye distance increased gradually as the GuHCl or urea concentration was increased. The expansion rates of dye-labeled PEG 3500 in GuHCl and urea solutions were found to be 0.21 ± 0.03 nm/M and 0.17 ± 0.04 nm/M, respectively. The corresponding values for DNA(0), an estimate of SEFSR, were 0.25 ± 0.01 nm/M and 0.17 ± 0.01 nm/M in GuHCl and urea solutions, respectively. These expansion rates are shown in Figs. 5(c) and 5(d). As can be seen, the apparent expansion rate of DNA(0) is close to or larger than that of dye-labeled PEG 3500. This means that SEFSR can account for the apparent expansion of dye-labeled PEG 3500. In turn, this implies that the apparent expansion of dye-labeled PEG 3500 may merely be an artifact arising from SEFSR rather than a real expansion of the PEG chain itself, in agreement with the fact that PEG has not been found to expand further by other means.
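One way to make "close to or larger than" concrete is to express the difference between the PEG 3500 and DNA(0) expansion rates in units of their combined uncertainty. The sketch below uses only the values quoted above; the function name `difference_in_sigma` is hypothetical, and treating the quoted uncertainties as independent one-sigma errors is an assumption not confirmed by the text.

```python
import math

def difference_in_sigma(rate_a, err_a, rate_b, err_b):
    """Return (difference, combined uncertainty, difference / combined).

    Assumption: the quoted uncertainties are independent one-sigma errors,
    so they add in quadrature.
    """
    diff = rate_a - rate_b
    combined = math.hypot(err_a, err_b)
    return diff, combined, diff / combined

# Expansion rates quoted in the text, in nm/M: (value, uncertainty).
peg3500 = {"GuHCl": (0.21, 0.03), "urea": (0.17, 0.04)}
dna0 = {"GuHCl": (0.25, 0.01), "urea": (0.17, 0.01)}

comparison = {}
for solvent in peg3500:
    comparison[solvent] = difference_in_sigma(*peg3500[solvent], *dna0[solvent])
    diff, combined, z = comparison[solvent]
    print(f"{solvent}: PEG 3500 - DNA(0) = {diff:+.2f} +/- {combined:.2f} nm/M "
          f"({z:+.1f} sigma)")
```

In both solutions the PEG 3500 rate does not exceed the DNA(0) rate, and the difference stays within about 1.3 combined uncertainties, which is in line with the statement that SEFSR alone can account for the apparent expansion.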
The smFRET method is known as a “spectroscopic ruler” detecting distances in the range 1–10 nm.17 At this scale, the SEFSR may become considerable (1.33 ± 0.04 nm in GuHCl solution from 0.6 M to 6 M, for example) and thus cannot be neglected a priori. More specifically, the changes of the FRET-inferred interdye distance induced by SEFSR in smFRET experiments may become significant enough to overlay the potential signals of real conformational changes in the coil-globule transition of unfolded or denatured proteins.2,3 The implications are at least twofold. On the one hand, signals associated with real conformational changes of the proteins may be distorted by the noise arising from the SEFSR [the scenario for DNA(12) in GuHCl and urea solutions]. On the other hand, noise generated by the SEFSR may be misinterpreted as signals from real conformational changes of the proteins (the possible scenario for PEG 3500 in GuHCl and urea solutions). In both scenarios, SEFSR may have made a significant contribution to the manifestation of the smFRET-SAXS discrepancy observed in some previous studies.1,3
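The magnitude quoted above can be cross-checked against the DNA(0) expansion rate. The sketch below assumes that the FRET-inferred distance varies linearly with GuHCl concentration over the whole 0.6 M to 6 M range, so that the cumulative shift is simply the rate times the concentration span; the function name `cumulative_shift` is hypothetical.

```python
def cumulative_shift(rate_nm_per_M, c_low_M, c_high_M):
    """Distance change (nm) accumulated between two concentrations.

    Assumption: the FRET-inferred distance is linear in concentration,
    so a single expansion rate applies over the whole range.
    """
    return rate_nm_per_M * (c_high_M - c_low_M)

# DNA(0) expansion rate in GuHCl quoted in the text: 0.25 nm/M.
shift = cumulative_shift(0.25, 0.6, 6.0)
reported, reported_err = 1.33, 0.04  # value quoted in the text, in nm

print(f"linear estimate: {shift:.2f} nm")
print(f"within reported uncertainty: {abs(shift - reported) <= reported_err}")
```

The linear estimate of about 1.35 nm agrees with the reported 1.33 ± 0.04 nm, a shift that is a sizeable fraction of the 1–10 nm working range of the ruler.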
In this work, we investigated the smFRET-SAXS discrepancy problem from the angle of the solvent effect on FRET spectroscopic ruler (i.e., SEFSR) in smFRET measurements. We identified and extracted SEFSR by attaching the dye pair to a rigid segment of a dsDNA and quantified the effect by an expansion rate of the FRET-inferred interdye distance. We found that SEFSR is ubiquitous in a variety of additive-enriched aqueous solutions, such as GuHCl, urea, NaCl, and EG. Solvent species is a significant factor in SEFSR. The intensity of SEFSR was found to follow the Hofmeister series in ionic solutions, suggesting that it arises mostly from the direct interaction between the solute and the dyes. In addition to solvent species associated with direct dye-solute interaction, labeling configuration associated with dye flexibility was identified as another major factor affecting the level of SEFSR. Moreover, we evaluated the strength of SEFSR in relation to the signals of real polymer conformational changes in smFRET experiments. We found that the conformational change of an ssDNA with 20 bases could barely produce an expansion effect comparable to the strength of SEFSR in GuHCl solution, while that of an ssDNA with only 3 bases would be sufficient in NaCl solution, reflecting a form of heterogeneity between the dye pair and the polymer (ssDNA). In GuHCl and urea solutions, SEFSR was found to be strong enough to account for the apparent expansion signal of dye-labeled PEG 3500, suggesting that the signal may be an artifact arising from SEFSR. Our findings highlight the role of SEFSR in smFRET experiments. Together with heterogeneity within polymers, SEFSR may bring us closer to the eventual reconciliation of the smFRET-SAXS discrepancy. It is noteworthy that although our results were obtained from smFRET experiments, our conclusion is also relevant to FRET measurements in general.
In the future, we plan to study more specifically various individual factors contributing to the manifestation of SEFSR as well as their relative weights. We would also like to address, to a higher level of satisfaction, the issue that SEFSR extracted from the control molecule [e.g., DNA(0) in our study] for now can only be used as an approximate estimate of the level of SEFSR in smFRET experiments with labeled target molecules. In addition, the influence of data analysis methods in relation to SEFSR will be further investigated.
See supplementary material for DNA sequences, dye structure, and more mean FRET-inferred interdye distances of dye-labeled probes in various solutions.
We acknowledge the support of the National Natural Science Foundation of China (Grant Nos. 91430217 and 21603217) and the Ministry of Science and Technology of China (Grant Nos. 2016YFA0203200 and 2013YQ170585). J.W. acknowledges support from Grant No. NSF-PHYS-76066.