The field of time-resolved macromolecular crystallography has expanded rapidly since free electron lasers for hard x rays (XFELs) became available. Techniques to collect and process data from XFELs have spread to synchrotron light sources. Although time scales and data collection modalities can differ substantially between these types of light sources, the analysis of the resulting x-ray data proceeds essentially along the same pathway. At the base of a successful time-resolved experiment is a difference electron density (DED) map that contains chemically meaningful signal. If such a difference map cannot be obtained, the experiment has failed. Here, a practical approach is presented to calculate DED maps and use them to determine structural models.

## THE PHYSICAL BASIS OF DED MAPS, REAL SPACE REPRESENTATION

The need for a practical tutorial on how to analyze time-resolved crystallographic (TRX) data led to a workshop at the 10th Annual International BioXFEL Conference in San Juan, Puerto Rico, in May 2023. The majority of the material used at this workshop is referenced here. This manuscript is intended as a tutorial for a practical approach to producing high-quality difference electron density maps from crystallographic data. In addition, it outlines how to use a difference electron density map to determine a molecular structure, which could be the structure of an intermediate or any other structure of interest. The tutorial is driven by the need to analyze the small signal in difference maps that results from the weak extent of reaction initiation common to TRX experiments. It also serves as a refresher and as an entry point to this fascinating field.

Related information is also found in the literature cited below. However, this manuscript explains neither how to process Laue data (Moffat, 1989; Ren and Moffat, 1995) nor how to process data collected by time-resolved serial crystallography (Aquila *et al.*, 2012; Tenboer *et al.*, 2014; and White, 2019). It also does not explain how to globally analyze the TRX data and extract chemical, kinetic mechanisms and pure species structures from them. For this, the reader is referred to advanced literature (Schmidt *et al.*, 2003; Rajagopal *et al.*, 2004; Ihee *et al.*, 2005; Schmidt *et al.*, 2005; Schmidt, 2008; Jung *et al.*, 2013; Schmidt *et al.*, 2013; and Schmidt, 2019).

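Consider a chemical, kinetic mechanism with three intermediates I_{1}, …, I_{3}. At any time-point t during the reaction, the electron density ρ_{t} is a sum of the electron densities ρ_{I1}, …, ρ_{I3} of the pure intermediate species weighted by their respective fractional concentrations c_{I1}, …, c_{I3}, plus the electron density ρ_{ref} of the reference state weighted by its fractional concentration c_{ref}. The reference state denotes the one obtained without initiating the reaction, e.g., crystals were measured in the dark, or no substrate or ligand is added,

$$\rho_t = c_{ref}\,\rho_{ref} + c_{I1}\,\rho_{I1} + c_{I2}\,\rho_{I2} + c_{I3}\,\rho_{I3}. \tag{1}$$

The unknowns are the ρ_{I1} … ρ_{I3} and the c_{I1} … c_{I3}. ρ_{ref} can be measured by a separate experiment without reaction initiation. c_{ref} follows from mass conservation as

$$c_{ref} = 1 - c_{I1} - c_{I2} - c_{I3}, \tag{2}$$

so that

$$\rho_t = (1 - c_{I1} - c_{I2} - c_{I3})\,\rho_{ref} + c_{I1}\,\rho_{I1} + c_{I2}\,\rho_{I2} + c_{I3}\,\rho_{I3}, \tag{3}$$

$$\rho_t = \rho_{ref} + c_{I1}\,(\rho_{I1} - \rho_{ref}) + c_{I2}\,(\rho_{I2} - \rho_{ref}) + c_{I3}\,(\rho_{I3} - \rho_{ref}). \tag{4}$$

Note that ρ_{ref} is not multiplied with its respective fractional concentration because this has been eliminated by the law of mass conservation [Eqs. (2) and (3)].

By subtracting ρ_{ref}, one obtains

$$\Delta\rho_t = \rho_t - \rho_{ref} = c_{I1}\,\Delta\rho_{I1} + c_{I2}\,\Delta\rho_{I2} + c_{I3}\,\Delta\rho_{I3}. \tag{5}$$

The difference electron density Δρ_{t} is measurable (see next paragraph). It is a linear combination of time-independent difference maps of the intermediates Δρ_{I1} … Δρ_{I3} weighted by their respective fractional concentrations. There is no contribution of the reference state to the difference map. As in any time-resolved experiment, the time information is exploited to (i) gain information about the time-dependent concentrations (the concentration profiles) of the intermediates and (ii) extract the electron densities of the time-independent species structures.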
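The cancellation of the reference state can be checked numerically. The following is a minimal sketch in plain Python; the one-dimensional "maps", the concentrations, and the function names are hypothetical and serve only to illustrate that the difference density depends on the intermediates alone once mass conservation is applied.

```python
def density_at_time(rho_ref, rho_intermediates, concentrations):
    """Concentration-weighted sum of reference and intermediate densities.

    c_ref is not passed in; it follows from mass conservation.
    Densities are plain lists of values on a common grid.
    """
    c_ref = 1.0 - sum(concentrations)
    rho_t = [c_ref * value for value in rho_ref]
    for c_i, rho_i in zip(concentrations, rho_intermediates):
        rho_t = [a + c_i * b for a, b in zip(rho_t, rho_i)]
    return rho_t


def difference_density(rho_t, rho_ref):
    """Delta rho_t = rho_t - rho_ref on every grid point."""
    return [a - b for a, b in zip(rho_t, rho_ref)]


# Hypothetical toy densities on four grid points.
rho_ref = [1.0, 2.0, 0.5, 0.0]
rho_I = [
    [1.0, 1.0, 1.5, 0.0],  # intermediate I1
    [0.0, 2.0, 0.5, 1.0],  # intermediate I2
    [1.0, 2.0, 0.5, 2.0],  # intermediate I3
]
c_I = [0.10, 0.05, 0.02]   # fractional concentrations at time t

rho_t = density_at_time(rho_ref, rho_I, c_I)
ded = difference_density(rho_t, rho_ref)

# The same map assembled only from the intermediates' difference densities.
ded_from_intermediates = [
    sum(c * (rho[k] - rho_ref[k]) for c, rho in zip(c_I, rho_I))
    for k in range(len(rho_ref))
]
```

Both routes give the same difference density, which is why no term with c_{ref} survives in the difference map.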

## THE PHYSICAL BASIS OF DED MAPS, RECIPROCAL SPACE REPRESENTATION

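In reciprocal space, DED maps are calculated from difference structure factors (**DF**) that consist of difference structure factor amplitudes $\Delta|F|_t$ that are obtained by subtracting reference structure factor amplitudes $|F|_{ref}$ from $|F|_t$ measured at time t after reaction initiation. The $\Delta|F|_t$ are combined with the phase φ_{ref} derived from a good (well refined) model of the reference state,

$$\mathbf{DF} = \Delta|F|_t\,e^{i\varphi_{ref}}, \qquad \Delta|F|_t = |F|_t - |F|_{ref}. \tag{6}$$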

In general, a time-resolved crystallographic experiment consists of two steps. (i) Reference structure factor amplitudes $|F|_{ref}$ are collected from protein crystals where a reaction has not been started. This can be done, for example, by exposing the crystals in the dark ahead of a pump-probe experiment. In step (ii), time-dependent structure factor amplitudes $|F_t|$ are collected at a time delay t after a reaction is initiated, for example, by an intense laser light pulse or by mixing with the substrate. $|F_t|$ is the amplitude of the time-dependent structure factor $\mathbf{F}_t$. $\mathbf{F}_t$ is the reciprocal space equivalent of Eq. (1). Figure 1 shows how $\mathbf{F}_t$ is constructed from the structure factors of the intermediates and the reference state. After processing, two crystallographic datasets are obtained, one for the reference state, and another that probes the progress of the reaction at time-point t. (More datasets can be obtained at any other time-point.) Both datasets consist of a long list of Miller indices, structure factor amplitudes, and their measurement errors. Data are typically stored in a binary format called the mtz format (after the progenitors McLaughlin, Terry, and Zelinka, see also the IUCr Commission on Crystallographic Computing) that is standard to the Collaborative Computational Project Number 4 (CCP4) suite of programs (Winn *et al.*, 2011).

Figure 2 shows a flow chart of how to calculate a difference map from measured data. The mtz file that contains the reference structure factor amplitudes $|F_{ref}^{obs}|$ is called dark.mtz; the one with the time-resolved data $|F_t^{obs}|$ is called light.mtz. This mimics the result of a pump-probe TRX experiment where the reaction is started in the crystals with laser light pulses. The mtz files could just as well be called water.mtz and ligand.mtz, where both files might have originated from a substrate diffusion experiment (Schmidt, 2013; Olmos *et al.*, 2018), or be given any other name. A third mtz file is required that contains the phases of the reference state, called here φ_{ref}. This mtz file is obtained by refining a reference state model against the structure factor amplitudes contained in the dark.mtz file using standard refinement programs, such as refmac (Murshudov *et al.*, 2011) or phenix (Adams *et al.*, 2010; Liebschner *et al.*, 2019). The reference model should be as complete as possible, including all water molecules and other ions or ligands. The refinement should provide a set of the best reference phases possible. It also provides a set of calculated structure factor amplitudes $|F_{ref}^{calc}|$ that fit the observed structure factor amplitudes of the reference state $|F_{ref}^{obs}|$ as accurately as possible. The $|F_{ref}^{calc}|$ are by definition on the absolute scale.

Difference map calculations rely on proper scaling of the $|F_{ref}^{obs}|$ to the $|F_t^{obs}|$. In general, this is done by applying a resolution-dependent scaling model that can consist of scale factors and isotropic or anisotropic B-factors (Evans, 2006; Dalton *et al.*, 2022). Typically, the quality of the $|F_{ref}^{obs}|$ dataset is better than that of the $|F_t^{obs}|$ data. This is because the reaction in the crystals produces disorder that is larger than that in the crystals at rest. The time-dependent intensities (or amplitudes) are corrected during scaling. Wilson plots with the $|F_t^{obs}|^2$ and the $|F_{ref}^{obs}|^2$ have approximately the same slope after scaling. Disorder of individual atoms engaged in the reaction often causes the magnitudes of the positive difference features to be smaller than those of the negative ones.

The $|F_{ref}^{obs}|$ are brought to the absolute scale by scaling them to the $|F_{ref}^{calc}|$ [monitored by R_{cryst}, Eq. (7)] during refinement. Once the $|F_{ref}^{obs}|$ are on the absolute scale, the $|F_t^{obs}|$ are scaled to them. The progress of the scaling can be monitored by an R_{scale} [Eq. (7)] that is calculated in a similar way as the familiar R_{cryst},

$$R_{cryst} = \frac{\sum_{hkl}\left|\,|F_{ref}^{obs}| - |F_{ref}^{calc}|\,\right|}{\sum_{hkl}|F_{ref}^{obs}|}, \qquad R_{scale} = \frac{\sum_{hkl}\left|\,|F_{ref}^{obs}| - |F_{t}^{obs}|\,\right|}{\sum_{hkl}|F_{ref}^{obs}|}. \tag{7}$$

The R-factor of a refined model (R_{cryst}) is typically on the order of 15%–20% because of inaccuracies in the structural model. Therefore, the scaling of the observed to the calculated data results in a similarly unfavorable R_{scale}-factor. It is shown further down that the quality of the difference maps decisively depends on whether the sign (positive or negative) of the differences can be accurately determined. When the R_{scale} is large, this sign cannot be accurately determined. That makes the discovery of small signal difficult even when the so-called mF^{obs}–DF^{calc} difference maps are used (m is the figure of merit and D is a weighting factor that accounts for model errors) (Read, 1986). In contrast, scaling two observed datasets results in much lower R_{scale}-factors on the order of 5%–10%. This leads to much more accurate $\Delta|F|$ (and their signs) that are meaningful even when the extent of the reaction initiation is low (e.g., smaller than 10%).

Monitoring the R_{scale} is important to predict the success of a time-resolved experiment. When the data quality is poor because (i) not enough diffraction patterns are collected, (ii) unsuitable data processing parameters are employed, (iii) the detector geometry has not been determined correctly, or (iv) the detector itself is not in good working condition, the R_{scale} will be elevated by both systematic and experimental error in the collected data. Then, a meaningful difference electron density map cannot be obtained. Improvements in data processing or collecting more diffraction patterns (whatever is relevant for the particular experiment) may be necessary. From experience, R_{scale} factors near or smaller than 10% are found to be necessary to result in good difference maps with meaningful signal.
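As an illustration of how such a check might look, here is a minimal sketch in plain Python. It assumes that R_{scale} is the sum of absolute amplitude differences divided by the sum of reference amplitudes, in analogy to R_{cryst}, and it replaces the resolution-dependent scaling model by a single least-squares scale factor. Both choices, the function names, and the four toy amplitudes are assumptions made for illustration only.

```python
def linear_scale_factor(f_ref, f_t):
    """Single scale factor k minimising sum (f_ref - k * f_t)^2.

    Real scaling programs use resolution-dependent models with B-factors;
    one overall factor is a deliberate simplification.
    """
    return sum(a * b for a, b in zip(f_ref, f_t)) / sum(b * b for b in f_t)


def r_scale(f_ref, f_t_scaled):
    """Sum of absolute amplitude differences over the sum of reference amplitudes."""
    numerator = sum(abs(a - b) for a, b in zip(f_ref, f_t_scaled))
    return numerator / sum(f_ref)


# Hypothetical amplitudes: the reference is on the absolute scale, the
# time-resolved data are on an arbitrary scale.
f_ref_obs = [100.0, 50.0, 80.0, 20.0]
f_t_raw = [52.0, 24.0, 41.0, 10.5]

r_before = r_scale(f_ref_obs, f_t_raw)
k = linear_scale_factor(f_ref_obs, f_t_raw)
f_t_obs = [k * value for value in f_t_raw]
r_after = r_scale(f_ref_obs, f_t_obs)

print(f"k = {k:.3f}, R_scale before = {r_before:.3f}, after = {r_after:.3f}")
```

In this toy case the value after scaling falls well below the 10% threshold quoted above; with real data the same number would be inspected before any difference map is calculated.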

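The $\Delta|F|_t$ are combined with the phases φ_{ref} to difference structure factors [**DF**, Eq. (6)], from which a difference map can be calculated by Fourier summation [Eq. (8)]. The difference electron density is represented on only half the absolute scale,

$$\Delta\rho_t(x,y,z) = \frac{1}{V}\sum_{hkl}\Delta|F|_t\,e^{i\varphi_{ref}}\,e^{-2\pi i(hx+ky+lz)}, \tag{8}$$

where V is the volume of the unit cell. Because it carries the phase of the reference state rather than that of the difference, **DF** appears to be inapt for the calculation of a difference map.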

## THE DIFFERENCE FOURIER APPROXIMATION

The phase of the time-dependent structure factor **F**_{t} is very difficult if not impossible to measure during the time the reaction proceeds in the protein crystal. For the calculation of a true DED map, true difference structure factors **ΔF**_{true} = **F**_{t} – **F**_{ref} are required [Fig. 3(a)]. **ΔF**_{true} is obtained by subtracting the structure factor of the reference state from the time-dependent structure factor as a vector in the complex plane [Fig. 3(a)]. Then, the true DED map is calculated as

$$\Delta\rho_{true}(x,y,z) = \frac{1}{V}\sum_{hkl}|\Delta F_{true}|\,e^{i\varphi_{\Delta F,true}}\,e^{-2\pi i(hx+ky+lz)}. \tag{9}$$

Note that, this time, $\Delta\rho_{true}$ is on the absolute scale when true difference structure factors **ΔF**_{true} are used with amplitude |ΔF_{true}| and phase φ_{ΔF,true} [Fig. 3(a)]. Here, attention to detail is required: |ΔF_{true}| (the vertical bars enclose the entire ΔF_{true}) is the amplitude of the true difference structure factor generated by subtraction of two vectors in the complex plane [Fig. 3(a)]. However, as mentioned (Fig. 1), the phase φ_{ΔF,true} cannot be determined, and the difference structure factor **ΔF**_{true} cannot be calculated. This would be the end of any attempt to determine a difference map, were there not an approximation that allows an observed difference structure factor amplitude to be used, together with the reference (dark) state phase, instead of the true difference structure factor.

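Since $\Delta|F|_t$ is approximately the projection of **ΔF**_{true} onto **F**_{t} [Fig. 3(b), red and orange bars], the projection can be evaluated by using the cosine of the angle between the two vectors,

$$\Delta|F|_t \approx |\Delta F_{true}|\cos(\varphi_{\Delta F,true} - \varphi_t). \tag{10}$$

With the assumption that φ_{t} and φ_{ref} are essentially equal [hence the phase difference is small, Fig. 3(b)], an equation is derived that relates the measured (observed) difference structure factor **DF** (with amplitudes $\Delta|F|_t$ and phases φ_{ref}) to the true difference structure factor,

$$\mathbf{DF} = \Delta|F|_t\,e^{i\varphi_{ref}} \approx |\Delta F_{true}|\cos(\varphi_{\Delta F,true} - \varphi_{ref})\,e^{i\varphi_{ref}} \tag{11}$$

$$= \tfrac{1}{2}\,|\Delta F_{true}|\left[e^{i(\varphi_{\Delta F,true} - \varphi_{ref})} + e^{-i(\varphi_{\Delta F,true} - \varphi_{ref})}\right]e^{i\varphi_{ref}} \tag{12}$$

$$= \tfrac{1}{2}\,\Delta\mathbf{F}_{true} + \tfrac{1}{2}\,\Delta\mathbf{F}_{true}^{*}\,e^{2i\varphi_{ref}}. \tag{13}$$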

With the notion that the phase of the difference structure factor **ΔF**_{true} is not correlated with the reference phase, the second term on the right-hand side of Eq. (13) averages out in a Fourier summation. Equation (13) is the mathematical counterpart of the difference Fourier approximation. Equation (8) can be derived this way from first principles. The reason why $\Delta\rho$ is only represented on half the absolute scale and the justification that the reference phase can be used are now understood, because the difference structure factor **DF** with measured amplitudes Δ|F| and model phases φ_{ref} is approximately $\tfrac{1}{2}\Delta\mathbf{F}_{true}$. Accordingly, a difference map calculated with the observed differences and the phases of the reference state is a true difference map with DED features on ½ the absolute scale and some additional noise caused by the second term on the right-hand side of Eq. (13). Since $\Delta|F|_t$ can be accurately measured [Fig. 3(c)], the experimental DED map is very sensitive to structural and occupancy changes (Henderson and Moffat, 1971).
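The half-scale property can be made plausible with a small simulation. The sketch below, in plain Python, draws random reference structure factors and small, uncorrelated true difference structure factors, builds **DF** from the measurable amplitude difference and the reference phase, and compares it with the two-term form (half the true difference plus half its complex conjugate rotated by twice the reference phase). All amplitude ranges, the random seed, and the function names are arbitrary assumptions; the aim is only to show the trend, not to reproduce any dataset.

```python
import cmath
import math
import random

random.seed(1)


def observed_df(f_ref, df_true):
    """DF built from measurable amplitudes and the reference phase."""
    delta_amp = abs(f_ref + df_true) - abs(f_ref)  # Delta|F|_t, may be negative
    return delta_amp * cmath.exp(1j * cmath.phase(f_ref))


def two_term_form(f_ref, df_true):
    """0.5 * dF_true + 0.5 * conj(dF_true) * exp(2 i phi_ref)."""
    phi_ref = cmath.phase(f_ref)
    return 0.5 * df_true + 0.5 * df_true.conjugate() * cmath.exp(2j * phi_ref)


numerator = 0.0
denominator = 0.0
worst_deviation = 0.0
for _ in range(2000):
    # Large reference structure factor, small difference with unrelated phase.
    f_ref = cmath.rect(random.uniform(100.0, 500.0), random.uniform(-math.pi, math.pi))
    df_true = cmath.rect(random.uniform(0.0, 10.0), random.uniform(-math.pi, math.pi))
    df = observed_df(f_ref, df_true)
    worst_deviation = max(worst_deviation, abs(df - two_term_form(f_ref, df_true)))
    # Component of DF along the true difference structure factor.
    numerator += (df * df_true.conjugate()).real
    denominator += abs(df_true) ** 2

scale = numerator / denominator
print(f"DF carries about {scale:.2f} of the true difference signal")
print(f"largest deviation from the two-term form: {worst_deviation:.3f}")
```

The fraction comes out close to one half because the second term averages out over many reflections, while the per-reflection deviation stays small as long as the difference is small compared with the reference amplitude.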

## NECESSITY TO WEIGHT DIFFERENCE STRUCTURE FACTOR AMPLITUDES

DED maps calculated from measured Δ|F| and the reference phases φ_{ref} are affected by errors in the difference amplitudes. The two main errors identified were difference structure factor amplitudes determined (i) from large amplitudes as well as (ii) from poorly measured amplitudes with large experimental error (sigma) values. Condition (i) arises since large amplitudes also carry, on an absolute scale, large error values. If two large numbers are subtracted, large false positive and negative differences can likely arise. Large intensities (and amplitudes) are usually measured at low resolution. Low-resolution difference structure factor amplitudes generate a rolling DED landscape with high crests and deep valleys in which the high-resolution difference features are located. The contour level is determined by fluctuations in the unit cell that determine the sigma value of the DED map. True DED features tend to be obscured by the valleys or crests of the low-resolution DED landscape. Poorly measured amplitudes (ii) can also result in false positive or negative differences but affect mostly high-resolution differences. In both cases, false positive or negative differences may occur, which degrade features in the DED map. It is, therefore, desirable to down-weight either large differences or those with large experimental errors. Based on the statistical considerations by Ursby and Bourgeois (Ursby and Bourgeois, 1997), Zhong Ren and colleagues developed a largely simplified, practical weighting factor that can be used to correct for both conditions (i) and (ii) (Ren *et al.*, 2001). The weighting factor is calculated for each difference structure factor amplitude with index hkl,

$$w_{hkl} = \frac{1}{1 + \dfrac{\Delta|F|_{hkl}^{2}}{\langle\Delta|F|^{2}\rangle} + \dfrac{\sigma_{\Delta|F|,hkl}^{2}}{\langle\sigma_{\Delta|F|}^{2}\rangle}}. \tag{14}$$

The weight becomes small when Δ|F|² is much larger than the average $\langle\Delta|F|^{2}\rangle$ or when the squared σ_{Δ|F|} is much larger than the average $\langle\sigma_{\Delta|F|}^{2}\rangle$ found in the dataset of difference structure factor amplitudes. (Note that the σ_{Δ|F|} is determined by error propagation from the individual measurement errors of the amplitudes used to calculate the Δ|F|.) This weighting scheme can be easily implemented in a computer program and does not need additional information, such as coordinate errors. In the original article by Ren *et al.*, the values in terms 2 and 3 were not squared. The author of this article uses the squared values in publications he authored and coauthored and obtained good results. Other weighting schemes are discussed by De Zitter *et al.* (De Zitter *et al.*, 2022).
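A minimal sketch of such a weighting function in plain Python is given below. It assumes the squared form described above, with a weight of 1/(1 + Δ|F|²/⟨Δ|F|²⟩ + σ²/⟨σ²⟩), and simple Gaussian error propagation for σ_{Δ|F|}; the function names and the toy numbers are hypothetical.

```python
import math


def sigma_of_difference(sigma_t, sigma_ref):
    """Error of Delta|F| by propagation from the two amplitude errors."""
    return math.sqrt(sigma_t ** 2 + sigma_ref ** 2)


def difference_weights(delta_f, sigma_delta_f):
    """One weight per reflection; large or poorly measured differences get small weights."""
    n = len(delta_f)
    mean_d2 = sum(d * d for d in delta_f) / n
    mean_s2 = sum(s * s for s in sigma_delta_f) / n
    return [
        1.0 / (1.0 + d * d / mean_d2 + s * s / mean_s2)
        for d, s in zip(delta_f, sigma_delta_f)
    ]


# Hypothetical differences: the last one is an outlier, the third is poorly measured.
delta_f = [1.0, -2.0, 1.5, 30.0]
sigmas = [
    sigma_of_difference(0.7, 0.7),
    sigma_of_difference(0.7, 0.7),
    sigma_of_difference(3.0, 3.0),
    sigma_of_difference(0.7, 0.7),
]
weights = difference_weights(delta_f, sigmas)
weighted_delta_f = [w * d for w, d in zip(weights, delta_f)]
```

The weighted differences, not the raw ones, would then be combined with the reference phases for the Fourier summation. Whether the weights are additionally normalised is a design choice that this sketch leaves out.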

## DED MAP FUN

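The Δ|F| can be positive or negative, and they are combined with the phase φ_{ref} of the dark state [see Eq. (13)]. However, as mentioned, a convenient feature in crystallography is that negative amplitudes can always be converted to positive ones by flipping the phase by 180°,

$$-\big|\Delta|F|\big|\,e^{i\varphi_{ref}} = \big|\Delta|F|\big|\,e^{i(\varphi_{ref} + 180^{\circ})}. \tag{17}$$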

From the new file differenceF10.phs, a DED map (F10) can be calculated and compared to the observed difference map determined from the appropriate amplitudes and weights. Surprisingly, the two difference maps are almost identical [compare Figs. 4(a) and 4(b)]. If one observed the F10 difference map during an LCLS beamtime, one would already be satisfied, since the experiment worked. This clearly shows that the determination of a correct phase-flip pattern (either dark phase or the phase flipped by 180°) is sufficient to produce a good difference map. The art is to determine the phase-flip pattern correctly in the presence of experimental noise. If this pattern deteriorates because of poorly measured structure factor amplitudes, the signal in the DED map deteriorates accordingly (Schmidt *et al.*, 2003). This is the reason that weighting is important (see above). The goal is to down-weight the contribution by potentially false phase flips to keep the difference signal as strong as possible.
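The phase-flip bookkeeping can be sketched in a few lines of plain Python. The second function assumes, as the name differenceF10.phs suggests, that every amplitude is replaced by the constant 10 while the phase-flip pattern is retained; that reading is an assumption, and the function names and values are hypothetical.

```python
import cmath
import math


def to_positive_amplitude(delta_f, phi_ref_deg):
    """Return (amplitude >= 0, phase in degrees) equivalent to delta_f * exp(i * phi_ref)."""
    if delta_f < 0:
        return -delta_f, (phi_ref_deg + 180.0) % 360.0
    return delta_f, phi_ref_deg % 360.0


def sign_only(delta_f, phi_ref_deg, constant=10.0):
    """Keep only the phase-flip pattern; the amplitude becomes a constant (assumed 10)."""
    _, phase = to_positive_amplitude(delta_f, phi_ref_deg)
    return constant, phase


def as_complex(amplitude, phase_deg):
    """Structure factor as a complex number."""
    return amplitude * cmath.exp(1j * math.radians(phase_deg))


# A negative difference with reference phase 30 degrees ...
amplitude, phase = to_positive_amplitude(-3.0, 30.0)
# ... is the same complex number as a positive amplitude with phase 210 degrees.
same = abs(as_complex(-3.0, 30.0) - as_complex(amplitude, phase)) < 1e-12
print(amplitude, phase, same)
```

With `sign_only`, a small positive and a large positive difference become indistinguishable; only the decision between φ_{ref} and φ_{ref} + 180° survives, which is the information the text identifies as sufficient for a recognisable difference map.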

## STRUCTURES FROM DED MAPS

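If only one intermediate I_{1} is present, the electron density at time t is determined by the reference state and the intermediate weighted by its fractional concentration c_{I1} [analogous to Eq. (4)],

$$\rho_t = \rho_{ref} + c_{I1}\,(\rho_{I1} - \rho_{ref}).$$

Solving for ρ_{I1}, one obtains

$$\rho_{I1} = \rho_{ref} + \frac{1}{c_{I1}}\,\Delta\rho_t.$$

It should be mentioned at this point that Δρ is only determined on half the absolute scale using the measured amplitudes and the reference (dark) phases [see above, Eqs. (8) and (13)]. Therefore, for this equation to work with measured DED maps, twice the measured Δρ must be added for the extrapolation to remain related to 1/c_{I1}. This needs to be kept in mind. With an extrapolation factor N, the extrapolated map and the corresponding extrapolated structure factor amplitudes are

$$\rho_{ext} = \rho_{ref} + \frac{2}{c_{I1}}\,\Delta\rho_t^{obs} = \rho_{ref} + N\,\Delta\rho_t^{obs}, \tag{21}$$

$$|F^{ext}| = |F_{ref}^{obs}| + N\,\Delta|F|_t, \qquad N = \frac{2}{c_{I1}}, \tag{22}$$

$$\rho_{ext}(x,y,z) = \frac{1}{V}\sum_{hkl}|F^{ext}|\,e^{i\varphi_{ref}}\,e^{-2\pi i(hx+ky+lz)}. \tag{23}$$

For an occupancy of 5%, for example, the factor N_{C} to determine extrapolated structure factor amplitudes would be 40. The hesitation to accept large N factors because only small Ns and correspondingly large occupancies result in an acceptable extrapolated map may lead to gross errors in the interpretation of structural changes. A suggestion of how to obtain a better extrapolated map is outlined further down.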

This equation is unusual unless one accepts that the Fourier synthesis [Eq. (23)] can be executed with negative amplitudes. Again, from crystallographic first principles (complex number algebra), any negative amplitude can be replaced with a positive amplitude when the associated phase is flipped by 180°. Figure 6 explains the reason using a representation in the complex plane. The extrapolated structure factor amplitudes are calculated by aligning the difference structure factor amplitudes with the dark state structure factor. If the $\Delta|F|_t$ are positive, an extrapolated structure factor with a magnitude larger than the reference amplitude and with the reference phase emerges. The $\Delta|F|_t$ can also be negative. Then, an extrapolated structure factor with a smaller magnitude [Fig. 6(b)] can be obtained. However, and this is inevitable, some of the extrapolated structure factor amplitudes calculated from Eq. (22) will become negative. This situation is depicted in Fig. 6(c). The resulting extrapolated structure factor points in the opposite direction compared to the reference structure factor. This means that this extrapolated structure factor is an ordinary structure factor (with a positive amplitude), but its phase is φ_{ref} + 180°. Of course, it is this positive amplitude that must be submitted to a reciprocal space refinement program, and all extrapolated structure factors (with either the reference phase or the reference phase flipped by 180°) must be used to calculate an extrapolated map.
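The three situations described for Fig. 6 can be reproduced with a short sketch in plain Python. It assumes that the extrapolated amplitude has the form |F_{ref}| + N·Δ|F|, with the result re-expressed as a non-negative amplitude and a possibly flipped phase; the function name and the numbers are hypothetical.

```python
def extrapolate(f_ref, delta_f, phi_ref_deg, n):
    """Extrapolated structure factor as (positive amplitude, phase in degrees).

    Assumed form: F_ext = |F_ref| + N * Delta|F|. A negative result is kept,
    not discarded: it becomes a positive amplitude with phase phi_ref + 180.
    """
    f_ext = f_ref + n * delta_f
    if f_ext < 0:
        return -f_ext, (phi_ref_deg + 180.0) % 360.0
    return f_ext, phi_ref_deg % 360.0


N = 16             # extrapolation factor, chosen for illustration
F_REF = 100.0      # reference amplitude
PHI_REF = 30.0     # reference phase in degrees

larger = extrapolate(F_REF, 2.0, PHI_REF, N)     # positive difference
smaller = extrapolate(F_REF, -3.0, PHI_REF, N)   # negative difference, same phase
flipped = extrapolate(F_REF, -10.0, PHI_REF, N)  # overshoots zero, phase flips

for label, (amplitude, phase) in [("larger", larger), ("smaller", smaller), ("flipped", flipped)]:
    print(f"{label}: amplitude {amplitude:.1f}, phase {phase:.1f} deg")
```

The third case is the one that a reciprocal space refinement program receives as an ordinary positive amplitude, with the flipped phase used for the map calculation.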

By omitting the “negative” structure factors from the Fourier summation [Eq. (23)] as suggested recently (De Zitter *et al.*, 2022), an extrapolated map (ρ_{nn}, nn for *n*o-*n*egatives) is obtained that at first glance appears quite similar to the correct DED map ρ_{t}. Structural refinement against ρ_{nn}, though, becomes more difficult. As an example, the ρ_{nn} map has been used to real-space refine the structure of the pCA chromophore [Fig. 5(b)] in photoactive yellow protein (PYP) with the goal to reproduce the torsional angle Φ_{T} determined previously (Pande *et al.*, 2016) [Fig. 5(b), red line]. Determining a structure that follows the electron density is more difficult, and the result was quite different (Φ_{T} ∼140°) from that where all structure factor amplitudes are maintained (Φ_{T} ∼35°) (Pande *et al.*, 2016). For the calculation of ρ_{ext} in this example, the average extrapolated structure factor amplitude pointing in the direction of **F**_{ref} was 255 × f (f is the Thomson scattering length of an electron), whereas the average amplitude pointing in the direction opposite to **F**_{ref} was 67 × f, a magnitude that cannot be neglected. It appears that when the “negative” amplitudes are dismissed, it is not clear (a) whether an accurate characteristic N (N_{C}) can be determined (see next paragraph for methods to determine N_{C}), (b) whether the obtained extrapolated map is correct, and (c) whether a reciprocal space refinement against the incomplete |F^{ext}| data will provide accurate structural displacements or structural relaxations. In any case, there is no physical reason to dismiss any extrapolated amplitudes. When an N_{C} is determined for the PYP data (Pande *et al.*, 2016; Pandey *et al.*, 2020), the fraction of structure factors pointing in the opposite direction to those of the reference is about 25% of all structure factors in the dataset whether N_{C} = 16 [Fig. 7(a)] or N_{C} = 29 [Fig. 7(b)] is employed. This large fraction might also have a deeper meaning that merits investigation.

## SEMI-AUTOMATIC DETERMINATION OF A CHARACTERISTIC FACTOR N_{C} TO CALCULATE AN EXTRAPOLATED MAP

A characteristic factor N_{C} must be determined to calculate an extrapolated map unless a faithful estimate of the occupancy is available from other sources. As shown by Terwilliger and Berendzen (Terwilliger and Berendzen, 1996), a well determined calculated electron density ρ_{calc} can replace the observed reference electron density in Eq. (21). Then, the reciprocal space equivalent would look like

$$|F^{ext}| = |F_{ref}^{calc}| + N\,\Delta|F|^{obs}. \tag{24}$$

Once N_{C} is determined, the occupancy of the intermediate can be calculated (see above) as

$$\text{occupancy} = \frac{2}{N_C} \times 100\%. \tag{25}$$

To determine N_{C}, several methods can be applied. Here, three are listed: method 1 tries to isolate a well separable volume with positive or negative difference electron density and integrates the difference electron density in this volume (Šrajer *et al.*, 2001). This integration provides an electron count, e.g., 1.5 e^{−}. Due to the difference Fourier approximation, this value is on ½ the absolute scale. The true electron count is 3 e^{−}. If a ligand has 14 electrons (e.g., carbon monoxide), this corresponds to about 21% occupancy. The factor N_{C} in this case would be about 10 [Eq. (25)].
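The arithmetic of method 1 can be written out explicitly. The sketch below, in plain Python, assumes the relation N = 2/occupancy, which is consistent with the numbers quoted in the text (about 21% occupancy and a factor of about 10); the function names are hypothetical.

```python
def occupancy_from_electron_count(integrated_electrons, electrons_in_group):
    """Occupancy from an integrated DED feature.

    The integrated value is on half the absolute scale (difference Fourier
    approximation), so it is doubled before dividing by the number of
    electrons of the displaced group.
    """
    true_electrons = 2.0 * integrated_electrons
    return true_electrons / electrons_in_group


def n_from_occupancy(occupancy):
    """Extrapolation factor under the assumed relation N = 2 / occupancy."""
    return 2.0 / occupancy


occupancy = occupancy_from_electron_count(1.5, 14)  # carbon monoxide: 14 electrons
n_c = n_from_occupancy(occupancy)
print(f"occupancy = {100 * occupancy:.0f}%, N_C = {n_c:.1f}")
```

The unrounded result is a little above 9, which the text rounds to about 10. The same relation gives N = 40 for 5% occupancy, which shows how quickly the factor grows when reaction initiation is weak.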

Method 2 uses the extrapolated maps themselves (Tripathi *et al.*, 2012; Schmidt, 2019; and Pandey *et al.*, 2020). A set of extrapolated structure factor amplitudes can be calculated with increasing factor N. Then, regions with strong negative density in the DED map become negative also in the extrapolated map [Fig. 8(b)]. These regions of interest (ROI) can be used to determine an accurate N_{C}. Figure 7 shows real world examples from TR-SFX experiments at the LCLS and the European XFEL (EXFEL). With increasing N, the negative densities found in the ROIs of the extrapolated maps increase. The results are plotted as a function of N, and N_{C} is determined at the intersection of the two red lines in Fig. 7(a) or Fig. 7(b).
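How the two lines are fitted is not described here. One possible way, sketched below in plain Python, is to split the plot of integrated negative density versus N into two segments, fit a straight line to each by least squares, keep the split with the smallest total residual, and take the intersection as N_{C}. The procedure, the function names, and the synthetic data are assumptions for illustration and not necessarily the method used for Fig. 7.

```python
def fit_line(xs, ys):
    """Least-squares line through the points; returns slope, intercept, squared residual."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    residual = sum((y - (slope * x + intercept)) ** 2 for x, y in zip(xs, ys))
    return slope, intercept, residual


def characteristic_n(ns, integrated_density):
    """Intersection of the best two-segment fit (at least two points per segment)."""
    best = None
    for k in range(2, len(ns) - 1):
        s1, i1, r1 = fit_line(ns[:k], integrated_density[:k])
        s2, i2, r2 = fit_line(ns[k:], integrated_density[k:])
        if s1 == s2:
            continue  # parallel lines do not intersect
        total = r1 + r2
        if best is None or total < best[0]:
            best = (total, (i2 - i1) / (s1 - s2))
    return best[1]


# Synthetic example: no negative density in the ROI up to N = 16,
# then a linear increase in magnitude.
n_values = list(range(2, 42, 2))
roi_density = [0.0 if n <= 16 else -2.0 * (n - 16) for n in n_values]

n_c = characteristic_n(n_values, roi_density)
print(f"N_C = {n_c:.1f}")
```

With noisy data the kink is less sharp, and visual inspection of the plot remains advisable before a value is accepted.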

Method 3 correlates calculated difference electron density features to the observed difference density features. With increasing N, structural models M_{N} can be refined against the resulting extrapolated structure factor amplitudes. From the resulting model, structure factors can be determined that can be used with the structure factors from the reference model to calculate an M_{N}–M_{ref} difference map. Difference features are compared and correlated with the observed difference features (Claesson *et al.*, 2020). The correct N is found when the correlation between the two sets of difference features (observed and calculated) is optimum. Such an analysis can be performed in a user-friendly way using the program Xtrapol8 (De Zitter *et al.*, 2022). The author acknowledges a poster presentation by De Zitter *et al.* at the 2023 PSB symposium in Grenoble, France.

Once a characteristic N_{C} is determined, an extrapolated map can be calculated that allows for the determination and a real space refinement of a structural model, e.g., in Coot (Emsley *et al.*, 2010). An example of an extrapolated map from a recent experiment on photoactive yellow protein (PYP) is shown in Fig. 5(b) together with the corresponding difference map [Fig. 5(a)]. The ROI where the negative electron densities in the extrapolated maps are integrated is denoted by the dashed circle in Fig. 5(b). N_{C} = 16 has been determined from Fig. 7(a). If N_{C} had been determined to be very different from 16 (for example, N_{C} = 8), the torsional angle Φ_{T} would likely be substantially different. Since the torsional angle is a functional reaction coordinate for the PYP chromophore *trans* to *cis* isomerization, it is of paramount importance to determine this angle correctly. It has been suggested to improve the extrapolated map by density modification, such as solvent flattening and histogram matching, for example, using the program “dm” (Cowtan, 1994). An improved structural model can then be determined from such a map and refined against the |F^{ext}| (see also below).

## LOVE-HATE RELATIONSHIP WITH LARGE N_{C}

Large extrapolation factors are a nuisance. Reciprocal space refinement against the extrapolated structure factor amplitudes (all of them, see discussion above) usually results in unacceptable R_{cryst} values of > 40%. Here, a way to remedy this is described.

The phase of the true difference structure factor φ_{|ΔF|,true} is simply not known (see also Figs. 2 and 3). Regardless of whether or not density modification is applied, a structural model becomes available from the real-space fit to the (admittedly) noisy extrapolated map. This model and the model of the reference state can be used to calculate difference structure factors with amplitude |ΔF|^{calc} and estimated true phases φ_{|ΔF|,#}. When the calculated phases φ_{|ΔF|,#} are combined with the Δ|F|^{obs}, phased extrapolated structure factors are determined,

$$\mathbf{F}^{ext,\#} = \mathbf{F}_{ref}^{calc} + N\,|\Delta F|^{obs}\,e^{i\varphi_{|\Delta F|,\#}}. \tag{26}$$

As a detail: by the application of the estimated phase of the difference structure factor φ_{|ΔF|,#}, Δ|F|^{obs} becomes an ordinary amplitude |ΔF|^{obs} (note the position of the vertical bars, which now enclose the entire difference structure factor [compare Eqs. (24) and (26)]). The calculation is particularly easy if the Δ|F|^{obs} are stored as positive values for book-keeping [Eq. (17)]. From these, a phased extrapolated electron density map is calculated. Refinement against the phased extrapolated structure factor amplitudes immediately results in acceptable R_{cryst} values.

A more sophisticated method to recover the magnitude of the true difference structure factor amplitude that was previously only estimated by projection [Eq. (10)] is shown in Fig. 9. Here, the situation is depicted by an Argand diagram that makes use of the estimated true phase φ_{|ΔF|,#}. The orange difference is the weighted difference structure factor, and the red difference is a corrected difference structure factor $|\Delta F_t|^{\#}$ that closes the triangle between the measured time-dependent amplitude and the sum of the dark structure factor and the difference structure factor. Extrapolation [Eq. (26)] can then be pursued with the corrected $|\Delta F_t|^{\#}$ and the phase φ_{|ΔF|,#}. Since the difference Fourier approximation is not applied, the reason for the factor 2 in Eqs. (21) and (22) vanishes. N becomes much smaller, and the extrapolated map appears much improved.

The method shown in Fig. 9 needs closer examination. In particular, the probability of the true difference structure factor given the noise and other systematic errors in the data and the structural models must be evaluated, perhaps in a similar way as has been done previously for partial structural models with errors (Read, 1986, 1997). In addition, the phase bias introduced by a model refined against the extrapolated map needs to be estimated. The best way to do this is by engaging an appropriate simulation using realistic structure factors with noise followed by a statistical analysis as performed previously (Read, 1986; Schmidt *et al.*, 2003).

It can only be hoped that this tutorial will help to promote the widespread usage of TRX methods. In the future, appropriate software solutions will be more functional and user-friendly, with a push-button interface, so that everyone with general crystallography knowledge can learn and practice structure determination from time-resolved DED maps.

## ACKNOWLEDGMENTS

This work was enabled by NSF Science and Technology Center Biology with XFELs (BioXFEL), NSF-STC 1231306. The author thanks P. Schwander and E. Stojkovic for commenting on an earlier version of this manuscript.

## AUTHOR DECLARATIONS

### Conflict of Interest

The authors have no conflicts to disclose.

### Author Contributions

**Marius Schmidt:** Conceptualization (equal); Formal analysis (equal); Funding acquisition (equal); Methodology (equal); Resources (equal); Software (equal); Visualization (equal); Writing – original draft (equal).

## DATA AVAILABILITY

No data were generated for this manuscript.

## REFERENCES

**31**, 34–38

*Principles of Protein X-Ray Crystallography*

*trans-cis*isomerization pathways in photoactive yellow protein visualized by picosecond X-ray crystallography

*Phenix*

*Structure Based Enzyme Kinetics by Time-Resolved X-Ray Crystallography, in: Ultrashort Laser Pulses in Medicine and Biology*

*Chemical Kinetics and Dynamics*

*CrystFEL*: A step-by-step guide

*CCP4*suite and current developments