There is a growing understanding of the structural dynamics of biological molecules, fueled by x-ray crystallography experiments. Time-resolved serial femtosecond crystallography (TR-SFX) with x-ray free-electron lasers (XFELs) allows the measurement of ultrafast structural changes in proteins. Nevertheless, this technique comes with some limitations. One major challenge is the quality of data from TR-SFX measurements, which often suffer from data sparsity, partial recording of Bragg reflections, timing errors, and pixel noise. To overcome these difficulties, large volumes of data are conventionally collected and grouped into a few temporal bins. The data in each bin are then averaged and paired with the mean of their corresponding jittered timestamps. This procedure provides one structure per bin, resulting in a limited number of averaged structures for the entire time interval spanned by the experiment. As a result, information on ultrafast structural dynamics at high temporal resolution is lost. This has motivated research into advanced methods of analyzing experimental TR-SFX data beyond the standard binning and averaging method. To address this problem, we use a machine learning algorithm called Nonlinear Laplacian Spectral Analysis (NLSA), which has emerged as a promising technique for studying the dynamics of complex systems. In this work, we demonstrate the power of this algorithm using synthetic x-ray diffraction snapshots from a protein with significant data incompleteness, timing uncertainties, and noise. Our study confirms that NLSA is a suitable approach that effectively mitigates the effects of these artifacts in TR-SFX data and recovers accurate structural dynamics information hidden in such data.

Determination of three-dimensional (3D) structures and dynamics of biological molecules at high spatial and temporal resolutions is one of the main goals of structural biology and biophysics. The pump–probe technique is one of the methods in time-resolved crystallography carried out using synchrotron radiation1–3 and XFELs.4–7 Despite the differences between these x-ray sources,3,8 both provide time series of x-ray diffraction patterns that need to be analyzed to extract the 3D structural dynamics of a given molecule. In this work, we will primarily focus on the analysis of data from TR-SFX experiments by XFELs.

In conventional TR-SFX experiments, as illustrated in Fig. 1, microcrystals are embedded in a viscous liquid medium. This liquid containing the crystals is streamed through the x-ray exposure path. To study the dynamics of a protein in an optically excited state, pulses from an optical pump beam also periodically hit the stream of crystals, and the delay time between the optical excitation and the x-ray diffraction is recorded, while the x-ray diffraction patterns are collected on a detector.9 In a typical experiment, the collected diffraction patterns can easily number in the hundreds of thousands.10 This method of data collection differs from traditional x-ray crystallography experiments at synchrotron light sources, which are performed on larger crystals that are rotated through a wide range of angles to ensure that a complete diffraction pattern is collected.5 By contrast, the microcrystals in serial femtosecond crystallography (SFX and TR-SFX) experiments undergo negligible rotation since they are exposed to the x-ray beam for a very short amount of time.11 Moreover, the beam pulses are very short and so intense that the sample is destroyed almost immediately after diffraction has occurred.9,10,12 Due to the negligible rotation and short lifetime of each measured microcrystal, each of the collected diffraction “stills” or “snapshots” contains an incompletely recorded set of Bragg reflections for the crystal.13,14 Furthermore, there are effects from the energy bandwidth of the x-ray beam as well as minor structural irregularities of the microcrystals. This results in an effect known as partiality, which means that each Bragg spot in the snapshot is not fully measured.10,11,15,16 In addition to this data incompleteness factor, the experimental data also suffer from systematic and random noise,12,13 beam intensity variations,10,15,17 temporal instabilities (in the case of time-resolved experiments),6,15,18 and variability in other key parameters, such as size, orientation, and quality of crystals, detector response, sample injection, and so forth. In addition, the data are said to be very sparse, meaning the measured Bragg spots in a snapshot are too few to sufficiently describe the structure and dynamics of the system on their own. As a result, many data processing algorithms may break down.19,20 Therefore, to obtain more accurate structure and dynamics, it is crucial to identify and minimize these artifacts, which will lead to the efficient use of samples and decrease the experimental workload to obtain high-quality data.

FIG. 1.

Schematic of conventional TR-SFX pump–probe experiments. Femtosecond x-ray pulses are delivered to a stream of microcrystals embedded in a liquid stream, which have been optically excited by a pump beam pulse. The resulting diffraction snapshots, which are essentially random cuts from the Ewald sphere, are recorded on a detector along with their corresponding delay times ($t_1, t_2, \ldots$) and comprise a time series of diffraction data.

Monte Carlo integration13 and data binning4,6,21,22 are established methods used by the crystallography community to mitigate data incompleteness. For a sufficiently high number of diffraction snapshots of randomly oriented microcrystals, Monte Carlo integration constructs a complete diffraction volume by averaging all snapshots together on a common intensity scale. In the absence of time-dependent structures, the spatial resolution improves as more snapshots are included in the analysis, and so a sufficiently large amount of data is needed to overcome data incompleteness.13,16 When searching for a time-resolved structure, as in TR-SFX, snapshots that fall within a set of time ranges are binned, and each bin is treated with the Monte Carlo averaging method individually. Consequently, a unique structure is produced from each bin, which allows structural changes to be identified in time.4,6,21,22 However, this technique is limited to determining, at best, a few discrete pictures of molecules far apart in time (on the order of 100 fs, to the authors' knowledge).4,6,21–24 Therefore, interesting and important dynamical phenomena occurring on femtosecond timescales and below are not accessible via this approach.
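
The binning and averaging procedure described above can be illustrated with a short sketch. This is a minimal NumPy illustration under stated assumptions, not the CrystFEL implementation: the function name `bin_and_average`, the equal-width bins, and the convention of marking unrecorded reflections with NaN are all hypothetical choices made for the example.

```python
import numpy as np

def bin_and_average(timestamps, intensities, n_bins):
    """Group snapshots into equal-width time bins and average each bin.

    timestamps  : (N,) pump-probe delays (possibly jittered)
    intensities : (N, D) snapshot intensities; NaN marks a reflection
                  that was not recorded (an assumed convention)
    Returns the mean timestamp and mean intensity vector of each bin.
    """
    timestamps = np.asarray(timestamps, dtype=float)
    intensities = np.asarray(intensities, dtype=float)
    edges = np.linspace(timestamps.min(), timestamps.max(), n_bins + 1)
    # digitize gives 1..n_bins for interior points; clip moves the
    # largest timestamp into the last bin
    idx = np.clip(np.digitize(timestamps, edges) - 1, 0, n_bins - 1)

    bin_times = np.full(n_bins, np.nan)
    bin_means = np.full((n_bins, intensities.shape[1]), np.nan)
    for b in range(n_bins):
        members = idx == b
        if not members.any():
            continue  # empty bin: leave NaN
        bin_times[b] = timestamps[members].mean()
        block = intensities[members]
        measured = ~np.isnan(block)
        counts = measured.sum(axis=0)
        sums = np.where(measured, block, 0.0).sum(axis=0)
        # average only over snapshots that recorded each reflection
        bin_means[b] = np.where(counts > 0, sums / np.maximum(counts, 1), np.nan)
    return bin_times, bin_means
```

Each bin yields a single averaged intensity vector, so the number of time points in the output equals `n_bins` regardless of how many snapshots were collected. This is the loss of temporal resolution discussed in the text.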

An alternative way to deal with data incompleteness and timing errors is to use manifold-based methods.18,25 Interesting and important features of TR-SFX datasets can be uncovered by these machine learning approaches while simultaneously allowing for stochastic features to be filtered out.25 Commonly, methods such as singular value decomposition (SVD) and singular spectrum analysis (SSA)26 are deployed to analyze time series datasets. However, these linear methods are not well suited to the nonlinear features of complex molecular systems. To address this problem, we use the Nonlinear Laplacian Spectral Analysis (NLSA)27 algorithm, which has successfully revealed the ultrafast structural dynamics of molecular systems that were previously challenging to quantify.18,25 As a complementary study to our previous work,18 here we further demonstrate and validate how NLSA handles the experimental challenges involved with serial femtosecond crystallography snapshots, such as data incompleteness (partiality and sparsity), as well as timing jitter in the case of time-resolved experiments. In the following sections, we first review the algorithm of NLSA. We then apply it to a set of synthetic x-ray crystallography diffraction patterns from a protein in its ground (dark) state, where the primary limitation is data incompleteness. We will compare the results with those obtained through Monte Carlo integration by CrystFEL,28–30 which is a conventional data analysis software in the field of x-ray crystallography. This is followed by the application of NLSA on time-resolved simulated data from the same protein, involving timing errors, data incompleteness, and noise. We then compare the reconstructed diffraction volumes at some selected time points with the original volumes. A similar comparison will be done by calculating difference electron density maps at selected time points.

The approach we employ is called Nonlinear Laplacian Spectral Analysis (NLSA). Here, we give a brief overview of this method. A schematic of the algorithm is shown in Fig. 2, and more details and applications can be found elsewhere.18,25,27,31–34 The NLSA process starts with a delay-coordinate embedding, which is a well-known method for the analysis of time-series data.26,27,35,36 The foundation of this technique is commonly known as Takens's theorem.35 This theorem states that a manifold containing the dynamical information of a system can be constructed from a series of time-lagged observations of the system. Consider a series of $N$ measurements $x_1, x_2, \ldots, x_N$ from a dynamical system, such as diffraction pattern snapshots with $D$ pixels. A delay (time-lagged) embedded data matrix $X$ is constructed such that each column (or supervector) corresponds to a concatenation of the original data vectors: $X_t = (x_t, x_{t-\delta t}, \ldots, x_{t-(c-1)\delta t})^T$, where the embedding window (or concatenation order) $c$ is the number of lagged copies of the original time series. Each $X_t$ is constructed by concatenating together original snapshot vectors $x_{t-i\delta t}$, where $0 \le i \le c-1$, in time sequential order so that each supervector contains $c$ snapshots.
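
The delay-coordinate embedding described above can be sketched in a few lines. This is an illustrative example only: the function name `delay_embed` is hypothetical, the lag is assumed to be one sample, and the snapshots are assumed to be dense vectors already sorted by timestamp.

```python
import numpy as np

def delay_embed(x, c):
    """Build the delay-embedded matrix X from time-ordered snapshots.

    x : (N, D) array, one snapshot of D pixels per row, sorted in time
    c : embedding window (number of lagged copies)
    Returns X of shape (c * D, N - c + 1). Column j corresponds to
    t = j + c - 1 and stacks x_t, x_{t-1}, ..., x_{t-(c-1)}.
    """
    x = np.asarray(x, dtype=float)
    n, _ = x.shape
    if not 1 <= c <= n:
        raise ValueError("embedding window c must satisfy 1 <= c <= N")
    columns = []
    for t in range(c - 1, n):
        # take x[t-c+1 .. t], reverse so the most recent snapshot is first
        columns.append(x[t - c + 1:t + 1][::-1].ravel())
    return np.stack(columns, axis=1)
```

For example, five snapshots of two pixels each with `c = 3` give a matrix with six rows and three columns. Each supervector is `c` times longer than a single snapshot, and `c - 1` time points are lost at the start of the series.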

FIG. 2.

Schematic view of the NLSA algorithm. The original data $x$ is time-lagged-embedded into a higher-dimensional space to give $X$. The curved manifold ($\mu\Phi$), which contains $X$, is then found with the diffusion map algorithm. The data matrix $X$ is projected onto the manifold to obtain $A$. Singular value decomposition of $A$ and back projection of the leading modes to the embedded data space results in the matrix $\tilde{X}$. Diagonal averaging of $\tilde{X}$ gives the final reconstructed data $\tilde{x}$, which is a time series of diffraction volumes that are now full (i.e., no data incompleteness), as well as noise- and jitter-reduced.

To extract the dynamical features of a system, one possible way to proceed is to apply singular value decomposition (SVD).26 However, in general, the output of standard SVD may not be relevant for systems with complicated nonlinear dynamics.27 To overcome this limitation, the nonlinear (“curved”) space manifold that the data matrix X occupies is identified through the nonlinear dimensionality reduction method known as diffusion map embedding.19 The X matrix is then projected onto this manifold and a matrix A is obtained as
$$A = X \mu \Phi, \tag{1}$$
where Φ and μ denote the leading dimensions and the Riemannian curvature of the curved space, respectively. Standard SVD is now applied to the matrix A so that
$$A = U S V^{T}, \tag{2}$$
where U and V are, respectively, the left and right singular vectors, and S are the singular values. Like the standard SVD, the first few modes with the largest singular values describe the most significant degrees of freedom in the data, and the remaining modes usually correspond to noise or other data artifacts. Therefore, a new matrix X ̃ can be reconstructed by taking only the leading SVD modes and then projecting them back from the manifold onto the data space,
$$\tilde{X} = \sum_{i=1}^{M} U_i S_{ii} V_i^{T} \Phi^{T}, \tag{3}$$
where $M < K$ is the number of selected modes, and $U_i$ and $V_i$ denote the $i$th columns of $U$ and $V$. This is followed by averaging of the diagonal elements of $\tilde{X}$, which results in a reconstructed series $\tilde{x}_i$ of recovered information for each originally incomplete measurement. Note that in TR-SFX experiments, $x_i$ is a two-dimensional diffraction pattern with a pump–probe delay timestamp, and $\tilde{x}_i$ is a three-dimensional reconstructed diffraction volume at single time points. This means that the output of the NLSA approach can be a time series of 3D full diffraction volumes at high temporal resolutions.18,25
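
The projection, mode truncation, back-projection, and diagonal averaging steps can be sketched as follows. This is a simplified illustration rather than the authors' code. It assumes the projection has the form $A = X\mu\Phi$ with back-projection through $\Phi^{T}$, as is usual in the NLSA literature, and it takes $\mu$ and $\Phi$ as inputs: the diffusion map step that produces them is not shown. The function names and the one-sample lag are hypothetical.

```python
import numpy as np

def truncate_and_backproject(X, mu, phi, m):
    """Project X onto the manifold basis, keep m SVD modes, project back.

    X   : (c*D, T) delay-embedded data matrix
    mu  : (T,) weights on the manifold (assumed given)
    phi : (T, K) leading diffusion-map eigenfunctions (assumed given)
    m   : number of leading SVD modes to keep (m <= K)
    """
    A = (X * mu) @ phi                       # A = X mu Phi
    U, s, Vt = np.linalg.svd(A, full_matrices=False)
    A_m = (U[:, :m] * s[:m]) @ Vt[:m]        # rank-m approximation of A
    return A_m @ phi.T                       # back to embedded data space

def diagonal_average(X_tilde, c):
    """Undo the delay embedding by averaging all copies of each snapshot.

    Assumes block i of column j holds snapshot j + c - 1 - i, i.e. each
    column stacks x_t, x_{t-1}, ..., x_{t-(c-1)} with a lag of one sample.
    """
    rows, n_cols = X_tilde.shape
    d = rows // c
    n = n_cols + c - 1
    total = np.zeros((n, d))
    count = np.zeros(n)
    for i in range(c):
        block = X_tilde[i * d:(i + 1) * d, :]   # all lag-i copies
        s = np.arange(n_cols) + c - 1 - i       # snapshot index per column
        total[s] += block.T
        count[s] += 1
    return total / count[:, None]
```

If `phi` is a complete orthonormal basis, `mu` is all ones, and every mode is kept, the output reproduces the input exactly. Noise reduction comes from choosing `m` smaller than the number of available modes.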

It is important to note that in standard data processing methods, the data complexities mentioned earlier in this paper are either ignored or handled through the binning and averaging approach. We make similar assumptions in the application of our NLSA algorithm. The main difference between NLSA and conventional analysis is that averaging is performed on a nonlinear manifold. NLSA does not require explicit binning and therefore does not suffer from loss of temporal resolution in time-resolved studies due to averaging binned data. This enables the algorithm to capture coherent structural changes that may occur in ultrafast timescales.18 

To compare NLSA to basic Monte Carlo averaging, we simulated dark state diffraction snapshots of photoactive yellow protein (PYP; PDB ID: 5DH3).4 Here, a brief description of this procedure is given. More details of the simulation, as well as preprocessing and analysis of dark state data with NLSA, are covered in the supplementary material.48 We used the pattern_sim function of the crystallographic data processing software CrystFEL28,29 (version 0.10.2). This program simulates diffraction snapshots when a user provides a PDB file, a full list of Bragg reflections, x-ray beam energy and size, microcrystal size, and desired noise levels. More information can be found in the CrystFEL literature and the user manual for pattern_sim.28–30 For this work, the simulated x-ray beam was chosen to have an energy of 9.5 keV and a radius of 1 μm. Each microcrystal was chosen to have dimensions in the range of 900–1100 nm along each side. The full reflection list for PYP4 used in the simulations was generated using CCP437–40 (version 8.0). A total of 16,000 diffraction snapshots were simulated, and the noise was added to them as implemented by default in CrystFEL (version 0.11.0). This set of diffraction patterns was then indexed with the indexamajig function of CrystFEL to pair the simulated Bragg intensities with the appropriate Miller indices. To apply Monte Carlo merging to these data, the partialator function of CrystFEL (version 0.11.0) was used with the partiality model set to unity since we only want to perform basic Monte Carlo merging. The partialator function does compute an overall scaling factor and the Debye–Waller temperature factor for each crystal during the merging process. These scaling parameters were extracted from CrystFEL and applied to the unmerged data separately for use in NLSA. This way, the comparison between NLSA and Monte Carlo merging would be on data that have been suitably corrected with standard crystallographic considerations. A total of 15,865 diffraction snapshots were successfully indexed and scaled by CrystFEL. The resulting set of Monte Carlo merged reflections from CrystFEL's partialator function was used in the subsequent comparative analysis. The data input to the NLSA algorithm were the scaled but otherwise unmerged reflections.

To evaluate the performance of NLSA on a time series of a biological molecule undergoing a dynamical process, we simulated a set of diffraction snapshots of optically excited PYP using the information from an experimental time series dataset obtained at the Linac Coherent Light Source (LCLS).4,18 In these simulations, randomly oriented Ewald cuts were taken from the full diffraction volumes (equally spaced about 70 attoseconds apart18) to incorporate the effect of the random orientation of the microcrystals as well as data incompleteness, as illustrated in Fig. 3. Partial reflections were calculated by modeling Bragg spots as spheres of radius $R = 1/(a/25)$ and using an x-ray beam energy of 9.5 keV. The factor of $a/25$ was chosen to result in the typical percentage of Bragg reflections observed in an actual experiment. The intersection of the Ewald sphere with each Bragg spot was approximated to be a circle of radius $S \le R$. This circular cross section was used to scale the Bragg spots down from their full intensities, thereby incorporating partiality. Finally, noise was added to each Bragg intensity according to a Gaussian distribution with a mean of zero and a standard deviation equal to the standard deviation of each set of Bragg intensities. This dataset, consisting of 87 141 snapshots, was then analyzed with NLSA as described in the Results section.
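
A toy version of the partiality and noise model may help make the geometry concrete. The text does not state how the circular cross section scales the intensity, so this sketch assumes the scale factor is the area ratio $(S/R)^2$ with $S^2 = R^2 - d^2$, where $d$ is the distance from the spot center to the Ewald sphere surface, treated as locally flat. It also assumes the noise standard deviation is taken over the recorded partial intensities of one snapshot. The function names and the NaN convention for unrecorded spots are hypothetical.

```python
import numpy as np

def partiality(distance, R):
    """Assumed partiality of a spherical Bragg spot of radius R.

    distance : distance from the spot center to the Ewald sphere surface
    Returns (S / R)**2, where S is the radius of the circular
    intersection, and 0 when the sphere misses the spot.
    """
    d = np.abs(np.asarray(distance, dtype=float))
    s_squared = np.clip(R**2 - d**2, 0.0, None)
    return s_squared / R**2

def simulate_snapshot(full_intensities, distance, R, rng):
    """Scale full intensities by partiality and add zero-mean Gaussian noise."""
    full_intensities = np.asarray(full_intensities, dtype=float)
    p = partiality(distance, R)
    observed = p > 0
    partial = full_intensities * p
    # noise width: std of the recorded partial intensities (an assumption)
    sigma = partial[observed].std() if observed.any() else 0.0
    noisy = partial + rng.normal(0.0, sigma, size=partial.shape)
    # spots the Ewald sphere misses are not recorded at all
    return np.where(observed, noisy, np.nan)
```

With this model a spot cut through its center is fully recorded, a spot cut at `d = 0.6 R` keeps 64% of its intensity, and a spot farther than `R` from the sphere is absent from the snapshot.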

FIG. 3.

Schematic view of simulated data incompleteness and partiality. The enlarged area shows the intersection of the Ewald sphere in reciprocal space with the reciprocal lattice point. In this work, the reciprocal lattice point is approximated as a sphere of radius R, and the intersection of the Ewald sphere and the reciprocal lattice point is treated as a circle of radius S.

To evaluate the performance of NLSA on a dataset affected by timing uncertainty, which is one of the main complications of data analysis in TR-SFX experiments, we introduced timing jitter into the simulated diffraction snapshots of excited-state PYP. This is illustrated in Fig. 4. We began with the synthetic dataset which did not include timing uncertainty. For each of these snapshots, a jitter offset time was randomly chosen from a Gaussian distribution with a standard deviation of 100 fs, which was chosen to be very similar to the reported uncertainty for the laser pulse timing apparatus.18,24 The original timestamps of the snapshots were then changed by the jitter offset amounts and the samples were re-sorted according to their new timestamps. The snapshots whose adjusted time stamps fell outside the original range of the un-jittered data were eliminated from consideration. The resulting dataset, consisting of 75 646 snapshots, was then analyzed with the NLSA algorithm.
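
The jitter procedure described above (offset, re-sort, discard out-of-range snapshots) can be sketched as follows. The function name `apply_timing_jitter` and its return convention are hypothetical. The only quantity taken from the text is the Gaussian model for the offsets, with its standard deviation passed in as `sigma`.

```python
import numpy as np

def apply_timing_jitter(timestamps, sigma, rng):
    """Jitter timestamps, drop out-of-range snapshots, and re-sort.

    timestamps : (N,) true pump-probe delays
    sigma      : standard deviation of the Gaussian jitter (same units)
    Returns (indices, jittered): indices into the original snapshot
    array in their new temporal order, and the matching timestamps.
    """
    t = np.asarray(timestamps, dtype=float)
    jittered = t + rng.normal(0.0, sigma, size=t.shape)
    # keep only snapshots that still fall inside the original time range
    keep = (jittered >= t.min()) & (jittered <= t.max())
    kept_idx = np.flatnonzero(keep)
    order = np.argsort(jittered[kept_idx], kind="stable")
    return kept_idx[order], jittered[kept_idx][order]
```

Applying the returned indices to the snapshot array yields a dataset that is ordered by the jittered timestamps and is somewhat smaller than the original, since snapshots near the ends of the time range are the most likely to be discarded.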

FIG. 4.

Schematic of simulating timing jitter. The simulated diffraction snapshots had an initial temporal ordering. The temporal ordering of the snapshots was then reshuffled to a degree based on the timing uncertainty observed under experimental conditions.18,24

The results of the NLSA algorithm on dark state data were compared to the results of Monte Carlo merging. In doing this, a baseline result about the capabilities of NLSA on SFX data compared to the well-established Monte Carlo method is obtained. To perform NLSA, a delay embedding parameter corresponding to the number of lagged copies of the original series, as well as the number of modes to use for final data reconstruction must be chosen, as introduced in the NLSA Algorithm section of this work. For the simulated dark state snapshots, an embedding parameter of c = 2048 was chosen, and five modes were used in the reconstruction process. Based on our experience, this number of dimensions (modes) can sufficiently recover the diffraction volumes of dark data. Despite having only 15 865 snapshots in the dataset, this parameter combination successfully constructed a final series of full diffraction volumes. The series of reconstructed diffraction volumes output by NLSA was averaged together to produce a final diffraction volume to compare with the output of simple Monte Carlo merging. To quantify the comparison, both datasets were placed on the same numerical scale with zero mean and unit variance, and R-factors were calculated between the Monte Carlo merged results and the NLSA results. These R-factors were plotted for a range of spatial resolutions (q-shells). The results are shown in Fig. 5. The results of NLSA and Monte Carlo merging are very close, as shown by the fact that the R-factor is well below 0.1 for most spatial resolutions. This affirms that NLSA performs comparably to the Monte Carlo merging method when analyzing a system not undergoing complex dynamical processes. The success of this basic yet important comparison provides the clearance to proceed with testing NLSA on time-resolved data, which has the added difficulty of timing uncertainty.
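
A shell-wise comparison of this kind can be sketched as below. The text does not give the R-factor formula, so the form used here, $R = \sum |a - b| / \sum |a|$ within each shell, is one common convention and is an assumption, as are the function names and the equal-width shells in $q$. The `standardize` helper mirrors the zero-mean, unit-variance scaling mentioned in the text.

```python
import numpy as np

def standardize(values):
    """Rescale to zero mean and unit variance."""
    values = np.asarray(values, dtype=float)
    return (values - values.mean()) / values.std()

def r_factor_by_shell(q, a, b, n_shells):
    """R-factor between two intensity sets in equal-width q-shells.

    q    : (M,) resolution coordinate of each reflection
    a, b : (M,) intensities from the two methods, already on one scale
    Returns shell centers and R = sum|a - b| / sum|a| per shell
    (NaN for shells with no reflections or zero denominator).
    """
    q, a, b = (np.asarray(v, dtype=float) for v in (q, a, b))
    edges = np.linspace(q.min(), q.max(), n_shells + 1)
    shell = np.clip(np.digitize(q, edges) - 1, 0, n_shells - 1)
    r = np.full(n_shells, np.nan)
    for s in range(n_shells):
        members = shell == s
        denom = np.abs(a[members]).sum()
        if denom > 0:
            r[s] = np.abs(a[members] - b[members]).sum() / denom
    centers = 0.5 * (edges[:-1] + edges[1:])
    return centers, r
```

In use, both merged datasets would first be passed through `standardize` and then compared shell by shell, so that a low value in every shell indicates agreement across the full resolution range rather than only on average.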

FIG. 5.

R-factor as a function of spatial resolution between NLSA-reconstructed data and Monte Carlo merged data for dark state PYP. The plot shows the R-factor remains well below 0.1 for most spatial resolutions. This confirms that NLSA performs the same as the well-established Monte Carlo merging method for resolving structural information from a system not undergoing dynamical processes.

To apply NLSA on the time series of 87 141 simulated diffraction snapshots of optically excited PYP, a delay embedding parameter of c = 8192 was chosen, and the number of modes used in reconstruction was 10, which were the modes with the strongest singular values of NLSA. This combination was found to sufficiently handle the uncertainty in this time-resolved dataset. To quantify the results of the reconstruction, the Pearson correlation coefficient was calculated for each corresponding pair of reconstructed and original diffraction volumes. The Pearson correlation coefficient is a measure of linear correlation between the two datasets and is given by
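$$r = \frac{\sum_i \left(I_i^{\mathrm{orig}} - \overline{I^{\mathrm{orig}}}\right)\left(I_i^{\mathrm{NLSA}} - \overline{I^{\mathrm{NLSA}}}\right)}{\sqrt{\sum_i \left(I_i^{\mathrm{orig}} - \overline{I^{\mathrm{orig}}}\right)^2 \sum_i \left(I_i^{\mathrm{NLSA}} - \overline{I^{\mathrm{NLSA}}}\right)^2}}, \tag{4}$$
where $I_i^{\mathrm{NLSA}}$ is the reconstructed intensity value from a certain pixel, $I_i^{\mathrm{orig}}$ is the corresponding original intensity value, and $\overline{I^{\mathrm{orig}}}$ and $\overline{I^{\mathrm{NLSA}}}$ are the average values of the original and NLSA pixel-level intensities, respectively.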
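
The Pearson correlation coefficient between an original and a reconstructed diffraction volume can be computed directly from its definition. The sketch below is illustrative: the function name `pearson_cc` is hypothetical, and both volumes are assumed to be dense arrays of the same shape.

```python
import numpy as np

def pearson_cc(original, reconstructed):
    """Pearson correlation between two intensity arrays of equal shape."""
    a = np.asarray(original, dtype=float).ravel()
    b = np.asarray(reconstructed, dtype=float).ravel()
    da = a - a.mean()   # deviations from the mean original intensity
    db = b - b.mean()   # deviations from the mean reconstructed intensity
    return (da * db).sum() / np.sqrt((da**2).sum() * (db**2).sum())
```

Because the coefficient is invariant to a positive linear rescaling of either input, it measures agreement in the pattern of intensities rather than in their absolute scale.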

For the simulated dataset that did not include timing jitter, the NLSA reconstruction was very successful. As shown in Fig. 6(a), the Pearson correlation coefficient for this case remains around a value of 0.995 for all volumes in the comparison and does not drop below 0.98. Therefore, despite the significant level of data incompleteness simulated in this dataset, the NLSA method produces a highly accurate reconstruction of the original data. This demonstrates the strength of NLSA when applied to very sparse and imperfect data: the snapshots were about 98% incomplete compared to the full diffraction volumes from which they were simulated.

FIG. 6.

Plots of the Pearson correlation coefficients between each original and reconstructed diffraction volume. (a) Results for the case of no timing jitter. (b) Results for the case where timing jitter was included. In both cases, the correlation coefficient hovers just below 0.99, which indicates that NLSA is successful in structure determination even with significant data incompleteness and timing uncertainty.

The same NLSA parameters and performance metric were used to evaluate the case of the 75 646 snapshots where timing jitter was applied. The Pearson correlation coefficients between the original and reconstructed intensities were calculated, and the results are shown in Fig. 6(b). As seen, the correlation coefficient is steady around 0.985 and does not drop below a value of 0.950. Despite the added timing uncertainty, NLSA was able to provide a faithful reconstruction of the original data.

To further quantify the results of NLSA, difference electron density (DED) maps were calculated at several time points for both the non-jittered and jittered simulated datasets. These were compared with DED maps calculated from the original data at the same time points. The Pearson correlation coefficient between DED maps was used to quantify the results. To calculate a DED map, the structure factor amplitudes of a reference dark (i.e., not optically excited) dataset are subtracted from those of the excited dataset.6 The reference dark dataset used here was measured at the LCLS.4 The DED calculations were performed using various functions of CCP4,37–46 and the results were visualized using Coot47 (version 0.9.8.7). Figure 7 shows the results of the DED analysis for the case without timing jitter. As we are mainly interested in the chromophore of PYP, a mask with a 3 Å border around the chromophore region was applied to each DED map. For the delay times shown in Fig. 7, the average correlation coefficient was about 0.90. Figure 8 shows the results where timing jitter was introduced. The average correlation coefficient for the delay times shown in Fig. 8 was about 0.77. This drop in correlation coefficients is understandable given the introduction of significant timing uncertainty. Even so, NLSA reconstruction proved reliable in recovering useful structural information from a dataset with high levels of uncertainty.48 

FIG. 7.

Comparing difference electron density (DED) maps calculated from (a) the original data (top row) to (b) those calculated from the NLSA-processed data without timing jitter (bottom row). The contour level of these maps is 3.0 RMSD. The average of the correlation coefficients between the DED maps for the delay times displayed here was 0.90.

FIG. 8.

Comparing difference electron density (DED) maps calculated from the (a) original data (top row) to (b) those calculated from the NLSA-processed data with simulated timing jitter included (bottom row). The contour level of these maps is 3.0 RMSD. For these DED maps, the average of the correlation coefficients was 0.77, which is understandably lower than the case without significant timing uncertainty.

In this work, we have demonstrated the application of a machine learning algorithm known as NLSA on simulated x-ray crystallography diffraction snapshots. NLSA was applied to dark state PYP snapshots simulated by CrystFEL, and to excited-state PYP snapshots simulated from experimental full diffraction volumes, with and without timing uncertainty. A baseline comparison with the long-standing Monte Carlo averaging method shows that NLSA successfully produces the same results for a non-excited system. Further, NLSA excels when analyzing time-resolved data. As the results show, NLSA is very effective at reconstructing time series of full diffraction volumes from simulated data, which suffers from incompleteness and timing uncertainty as well as noise. When no timing jitter was applied to the data, the Pearson correlation coefficients between the NLSA-reconstructed data and the original data remained above 0.98, confirming that NLSA is indeed a useful method for analyzing incomplete data with high sparsity. When timing jitter was applied to the simulated data, NLSA gave similarly satisfactory results, with the Pearson correlation coefficient remaining above 0.950 between the NLSA-reconstructed data and the original 3D volumes. The results in this paper have therefore demonstrated that NLSA can be confidently applied to TR-SFX experimental data, despite significant timing uncertainty and data sparsity, to determine crystal structures of proteins that are very close to the true structure. The DED map analysis clearly visualized and quantified the comparative features of original and simulated datasets. In the case without simulated timing jitter, the average correlation coefficient for the chromophore region of the DED maps between the original and NLSA-reconstructed volumes was 0.90. This average dipped to 0.77 when timing jitter was introduced but still shows that NLSA can recover meaningful information from a time series of data where the timing uncertainty is high. 
We conclude that this algorithm is a viable and valuable addition to the analysis routine of TR-SFX and other complex datasets. The results demonstrated in this paper support the more general idea that NLSA can be applied to other types of experiments where high-resolution dynamical information is obscured by high degrees of timing uncertainty and data incompleteness, as well as noise. Of course, more investigation is required to examine the impact of the other artifacts mentioned earlier on the functionality of NLSA, which we intend to pursue in the future using time-resolved crystallography data.

This work was supported in part by the U.S. Department of Energy, Office of Science, Basic Energy Sciences under Award No. DE-SC0002164, the U.S. National Science Foundation under Award No. STC 1231306, and the UWM College of Letters and Science Faculty Start-up Grant (No. AAK9353).

The authors have no conflicts to disclose.

Justin Trujillo: Data curation (equal); Formal analysis (lead); Investigation (equal); Software (equal); Validation (equal); Visualization (equal); Writing – original draft (equal); Writing – review & editing (equal). Russell Fung: Investigation (supporting); Methodology (equal); Software (supporting); Validation (supporting); Visualization (supporting); Writing – review & editing (supporting). Madan Kumar Shankar: Software (supporting); Validation (supporting); Visualization (supporting); Writing – review & editing (supporting). Peter Schwander: Conceptualization (equal); Data curation (equal); Formal analysis (supporting); Investigation (supporting); Methodology (supporting); Resources (supporting); Software (supporting); Supervision (supporting); Validation (supporting); Visualization (supporting); Writing – review & editing (supporting). Ahmad Hosseinizadeh: Conceptualization (equal); Data curation (supporting); Formal analysis (supporting); Funding acquisition (lead); Investigation (supporting); Methodology (equal); Project administration (lead); Resources (lead); Software (supporting); Supervision (lead); Validation (supporting); Visualization (equal); Writing – original draft (equal); Writing – review & editing (equal).

The data that support the findings of this study are available from the corresponding author upon reasonable request.
