One way to achieve spatial resolution using fluorescence imaging—and track single molecules—is to use wide-field illumination and collect measurements over multiple sensors (camera pixels). Here we propose another way that uses confocal measurements and a single sensor. Traditionally, confocal microscopy has been used to achieve high temporal resolution at the expense of spatial resolution. This is because it utilizes very few, and commonly just one, sensors to collect data. Yet confocal data encode spatial information. Here we show that non-uniformities in the shape of the confocal excitation volume can be exploited to achieve spatial resolution. To achieve this, we formulate a specialized hidden Markov model and adapt a forward filtering-backward sampling Markov chain Monte Carlo scheme to efficiently handle molecular motion within a symmetric confocal volume characteristically used in fluorescence correlation spectroscopy. Our method can be used for single confocal volume applications or incorporated into larger computational schemes for specialized, multi-confocal volume, optical setups.
Fluorescent microscopy1,2 (FM) creates contrast between fluorescent molecules and their background. Unlike other microscopy techniques, such as electron microscopy3 or scanning probe microscopy,4 FM is less invasive and the laser intensity used is often limited by the sensitivity of the analysis tools used downstream to extract information from fluorescence data.1,5,6
In particular, wide-field optical setups, commonly encountered in FM, employ an array of regularly spaced light sensors (pixels). On account of these multiple sensors, the location of fluorescing molecules can be pinpointed with high spatial accuracy, and when used in conjunction with super-resolution approaches such as stochastic optical reconstruction microscopy (STORM)7 or photoactivated localization microscopy (PALM),8,9 the labeled molecules can be localized with accuracy approaching ≈20-30 nm, well below the conventional diffraction limit.5 In this context, linking of localized molecules across a sequence of successive frames enables single particle tracking (SPT).10–13 Nevertheless, although wide-field setups are capable of reconstructing dynamics with high spatial accuracy, they typically suffer from poor temporal accuracy.5 This is due to both hardware and software-related reasons. On the hardware side, the acquisition of an entire frame over a large number of sensors needs to be read by the same camera electronics.14,15 On the software side, a sufficient number of photons needs to be collected within individual frame acquisitions for algorithms to successfully localize molecules. As such, frame rates used in SPT are typically low, ≈10 Hz.15–19 Such frame rates are appropriate when imaging stationary or otherwise slowly diffusing molecules in vivo. Additionally, since large portions of the specimen are continuously illuminated, wide-field FM may also cause significant photo-damage to fluorescing molecules or be photo-toxic to the specimen under observation.20,21
Confocal optical setups used in FM provide an alternative. By using a small number of sensors, such as avalanche photodiodes (APDs), or most often just one sensor,2 confocal microscopy reaches data acquisition rates ≈10 GHz. At such high rates, fast biochemical processes are readily monitored with temporal accuracy superior to wide-field FM, however, with compromised spatial resolution. Additionally, because long exposures are unnecessary, since molecules are not spatially localized and the illumination is targeted to only a small portion of the specimen, photo-damage caused by confocal illumination is minimal.
Existing SPT analysis,10–12 however, cannot be used to extract spatial information from confocal data. Fundamentally, this is because of the reduced spatial information captured by only one or very few sensors. Yet, a glimpse at confocal data reveals that spatial information is in fact encoded even within the signal provided by just one pixel or avalanche photodiode (APD) (see Fig. 1).
In particular, due to the non-uniform illumination used in confocal setups, the location of a molecule is encrypted in the instantaneous photon emission rates and ultimately reflected in the acquired traces. Nevertheless, to extract spatial information from such traces, on account of the symmetry inherent to typical confocal volumes, we must account for the fact that multiple locations of a molecule within a confocal volume (for example, those at similar distances from the focal point for the confocal volume shown) may give rise to the same signal (see Fig. 1). In addition, we must account for molecular motion across the illuminated volume and noise.
Here, we specifically tackle these challenges and develop the necessary computational framework to obtain spatial resolution from confocal measurements. Briefly, we develop and validate a Bayesian method that can be used on raw measurements obtained from just one sensor with the characteristics of an APD, as found in most common confocal setups, and provide diffusion coefficient estimates, similar to bulk fluorescence correlation spectroscopy (FCS) or fluorescence recovery after photobleaching (FRAP). Complementary to these methods that extract only dynamical parameters, such as diffusion coefficients, we also extract trajectories of the molecules. Additionally, our method reconstructs not only the most probable trajectory similar to wide-field SPT,5,13,22 but full distributions over molecular trajectories that are consistent with the raw measurements. To keep the computational cost low, we develop a formulation based on hidden Markov models and adapt tools from Computational Statistics such as non-linear filtering and Markov chain Monte Carlo (MCMC) sampling that we describe in Sec. II.
In this section, we describe the theoretical foundation of our method. That is, we first introduce the necessary formalism and subsequently present the computational tools for its evaluation. Throughout this study, we assume that successive measurements, in a confocal experiment, are reported at times tk, for k = 1, …, K, where t1 marks the time of the first measurement and tK marks the time of the last one. Although typically measurements are reported at equal time intervals tk − tk−1, the formalism we describe below can handle non-equal time intervals too with no further modifications.
Assuming that the time levels tk are known, we start from individual photon detection counts at those times, . These counts are taken to be the raw experimental output. Here, is the number of photons detected between tk−1 and tk. The source of these photons is either a single molecule in the vicinity of the illuminated volume or some unspecified fluorescent background. Fluctuations in arise from photon shot noise (stochastic photon emission) and the motion of the fluorescent molecule within the inhomogeneously illuminated confocal volume. We denote the location of the single molecule by , where rk is the location of the molecule relative to the center of the confocal volume at time tk.
For clarity, in the following presentation, we first consider a simpler 1D confocal model, where rk = xk. We generalize to 3D, where rk = (xk, yk, zk), later.
A. Model description in 1D
A simple graphical summary of our 1D model is provided in Fig. 2. Below we explain the photon emission and molecular motion separately.
1. Photon emission
To analyze photon count time series, , we first model the photon emission probability. This probability captures observation noise and is necessary in linking photon count measurements to molecular locations. Here, we assume that photon counts between tk−1 and tk follow Poisson (shot noise) statistics5,23
where μk is the instantaneous photon emission rate. Technically, this is the emission rate for the detected photons (i.e., μk excludes photons that are lost in the optical path from the molecule to the detector such as those straying away from the objective or blocked by dichroics). In turn, the instantaneous photon emission rate
combines a time-independent background rate μback and a time-dependent molecule photon emission rate μmolPSF(xk). Here, the rate μmol represents the maximum rate of photon emissions of a single molecule which is achieved when the molecule is at the center of the confocal volume, i.e., xk = 0. It is worth mentioning that the parameter μmol in this study, known as the brightness at the center of the confocal volume, is different from the molecular photon emission rate commonly described in the literature. The latter quantity is the average photon count rate, i.e., μmolPSF(x) averaged over x in the effective volume.24,25 Besides μmol, our molecule’s photon emission also depends upon the location of the molecule within the confocal volume through PSF(xk) which combines the effects of excitation and emission point spread functions (PSFs) into a single term.2,26 Namely, similar to existing FCS approaches, we use a single Gaussian27 as an approximation to the product of the two separate point spread functions.28–30 In 1D, our approximation is
where ωx is the distance (waist) from the focal point24,31,32 when the PSF(x) drops to e−2.
2. Molecular motion
Since the molecule moves stochastically inside the specimen under observation, to reconstruct the dynamics captured in , we consider a probabilistic motion model. We assume that the molecule follows isotropic diffusion with its position’s probability density, p(x, t), at time t satisfying Fick’s second law33–35
where D is the molecule’s diffusion coefficient. To solve this equation, we assume that at time tk−1, the molecule’s location is xk−1, which provides the initial condition, p(x, tk−1) = δ(x − xk−1). The solution at any later time t then takes the form
which equals to the probability density of a normal random variable with mean xk−1 and variance 2(t − tk−1)D. Thus, our motion model describing the position of the molecule is
B. Model inference in 1D
The quantities which we are interested in, for example, the diffusion coefficient D and the molecule’s trajectory are introduced as model variables in the preceding formulation. To estimate values for D and reconstruct trajectories for , we follow the Bayesian paradigm.5,36,37 Within the Bayesian paradigm, the variables D and x1, in particular, are parameters of the model and, as such, require prior probability distributions, which we describe next.
To ensure positive values for the diffusion coefficient and also to achieve conjugacy with the motion model, which facilitates the computations below, we use an inverse-Gamma prior probability distribution on D
with hyper-parameters αD and βD.
Due to the symmetries in our confocal point spread function [Eq. (3)], i.e., a molecule at a location +x emits on average the same number of photons as a molecule at location −x, the intensities in cannot distinguish these two locations. Hence, we place a prior probability distribution on the initial location x1 that respects this symmetry. Accordingly, in our framework, at the onset of the measuring period, the molecule is equally likely to be located at any of the positions ±x1. To facilitate the computations (see below), we use a symmetric normal distribution
Specifically, the probability density of a distribution is
In summary, our entire Bayesian model in 1D consists of the following relations:
which have the structure of a Bayesian hidden Markov model38–41 and are illustrated graphically in Fig. 2. This model leads to a posterior probability distribution that captures all variables of interest. The nonlinearity in the point spread function [Eq. (3)] with respect to the variables in , however, exclude analytic forms of . For this reason, below we develop a computational scheme exploiting Markov chainMonte Carlo36,42 (MCMC) that can be used to generate pseudo-random samples from , which can be used to obtain any statistic of interest such as mean values and error bars.
Our MCMC algorithm exploits a blocked Gibbs sampling scheme.23,36,42 Accordingly, posterior samples are generated by updating each one of the variables involved sequentially by sampling conditioned on the other variable and the measurements . Algorithmically, the steps involved in the generation of are as follows:
Generate the diffusion coefficient D(i) by sampling from the distribution .
Generate the trajectory by sampling from the distribution .
We present the fine details of these steps below.
1. Sampling the diffusion coefficient
As mentioned above, we update the diffusion coefficient D by sampling from , which, due to the specific dependencies of the variables in our formulation, simplifies to . Using Bayes’ rule and because of conjugacy of our prior on Eq. (7), the latter reduces to
where the derivation and explicit formulas for and are given in the supplementary material.
2. Sampling the trajectory
As we have already mentioned, to update the trajectory , we need to sample from , which we achieve through backward sampling.39,43,44 Namely, since the density factorizes as
we sample sequentially starting from xK and marching backward towards x1. However, before we begin backward sampling, we need to compute the individual densities and . We compute these recursively using forward filtering.38–40,43,44 Namely, we start our density computations at x1 and march forward towards xK. Below we describe in detail each one of the involved steps.
a. Forward filtering stage.
By Bayes’ rule, the individual densities in Eq. (15) factorize as
where we use as a shorthand for , …, and Ck is a factor that does not depend on xk. Therefore, for the computation of the densities required to sample xk in backward sampling, it is sufficient to compute only filter densities p(xk|D, w1:k) since we already know p(xk+1|xk, D) from the motion model and can recover Ck by normalization, i.e., , although in practice we need not compute Ck at all (see the sampling stage below).
where and denotes the Anscombe transform,47 which in our context is given by
with . This transform approximates Poisson random variables by univariate normal random variables47 which are necessary for the computations described next. Our approximation is highly accurate for large fk(x), while acceptable accuracy is maintained so long as fk(x) > 4.
Now, as both our kinetics and (transformed) observation probabilities are normally distributed, i.e., Eqs. (6) and (17), respectively, we can use the existing theory of Kalman filters.48–50 Nevertheless, because the mean, i.e., , of the observation distribution depends nonlinearly on the location xk, to apply Kalman filters we need to linearize the observation probability density.51–54 To do so, in Sec. III, we try two common approaches: (i) extended Kalman filters (EKFs),51–55 which apply a local approximation to linearize [Eq. (17)] around selected values; (ii) unscented Kalman filters (UKFs),56–58 which apply a global approximation to linearize in a statistical manner the entire density of in Eq. (17).
Despite the linearization involved, neither naive EKFs nor naive UKFs serve as appropriate filter approximations to p(xk|D, ) in our framework. This is because both assume that the filters themselves are normal densities. This assumption is particularly problematic for confocal observations with ±x symmetry which implies equal probabilities for the molecule to exist in either the negative or positive side of the illumination profile. In 1D, under our symmetric prior in Eq. (8) and isotropic dynamics in Eq. (6), this implies that our filter p(xk|D, ) has two, equally weighted and symmetrically placed across the origin, modes. We therefore modify EKFs and UKFs to account for bimodal densities appropriate for symmetric confocal setups.
More specifically, unlike the conventional filters51–58 that consider a uni-modal filter using , for properly chosen mk and ck, we compute filter densities of the form
where denotes the symmetric normal distribution with the density given in Eq. (9) (see also Fig. 3). Our approximation, as we have explained earlier, arises due to the linearization in the observation distribution. As EKF and UKF compute mk and ck differently, we describe the procedures, adapted to Eq. (19), separately in the supplementary material.
b. Backward sampling stage.
After computing the filter densities p(xk|D, ) in the forward filtering stage, either through EKF or UKF, we are able to sample the locations of the molecule using backward sampling as in Eq. (15). Specifically, according to Eq. (16), the sampling densities on the individual locations become for the last time level and for the preceding ones,38 where mk and ck are the filter parameters computed in the forward filtering stage. The resulting distributions are
which are directly sampled in our Gibbs sampling scheme.
C. Extension to 3D
The full case of a 3D confocal volume, illustrated graphically in Fig. 4, is a straightforward generalization of the 1D case we described earlier. More precisely, the full Bayesian model, in the form of a hidden Markov model, is
with an instantaneous rate μk = μback + μmolPSF(xk, yk, zk). In principle, our treatment is general enough to treat any PSF; though, for simplicity, we restrict our discussion here to a 3D Gaussian of the form
where ωxy and ωz are the minor and major semi-axis of the 3D Gaussian.24,31,32
For the 3D model above, the corresponding posterior is . Due to isotropy of the kinetics, we are able to sample each of , , and separately using forward filters and backward sampling similar to the 1D case. Accordingly, we generate posterior samples with the following Gibbs steps:
Generate the diffusion coefficient D(i) by sampling from the distribution .
Generate the trajectory by sampling from the distribution .
Generate the trajectory by sampling from the distribution .
Generate the trajectory by sampling from the distribution .
To benchmark our model, we apply it to synthetic datasets generated through common pseudorandom simulations33,59–62 that mimic data generated in confocal microscopy. Then, by applying our sampling scheme, we estimate the diffusion coefficient D and reconstruct the trajectory using the model posterior . We sample and compare the posterior in three cases: (i) exact, where we do not use any of the approximations in the filter computation, (ii) EKF, and (iii) UKF. Exact sampling is described in thesupplementary material. It is worth noting that exact sampling is inefficient for real applications due to its exceedingly long running time which is the reason we only used it as a benchmark. For all cases, we compare the results with the ground truth diffusion coefficient D and trajectory used to generate the synthetic data.
1. 1D model
From Figs. 5 and 6, we immediately note that the posteriors over positions are sharper, when the molecule is closer to the center of the illuminated region in all cases, i.e., whether exact, EKF, and UKF. This occurs because the molecules emit more photons near the center of the confocal volume which, in turn, leads to cleaner data, thereby sharpening our posterior estimates.
As shown in Figs. 5 and 6, the posteriors of the diffusion coefficient learned by EKF and UKF are close to the exact posterior learned by the exact random walk sampler. Also, the estimation of the molecule’s trajectory when the molecule is close to the center of the confocal volume matches, almost exactly, the ground truth.
We note that, in Fig. 6, while the molecule finds itself either on the positive or negative side of the 1D illumination profile, the posterior has a symmetric probability density around the origin. Thus, our framework resolves where the molecule is with respect to the confocal center but cannot resolve whether it is on the left- or right-hand side of the origin.
In Fig. 7, we compare the results obtained using our symmetric (bi-modal) EKFs and UKFs described in Sec. II with those obtained by conventional (uni-modal) EKF and UKF filters without any further modifications. A glance at the trajectory estimates suggest that, indeed, uni-modal filters as approximations to the true bi-modal filters are inadequate for confocal applications with symmetric confocal volumes around the origin. The inadequacy is particularly pronounced in the case of UKF.
2. 3D model
As the confocal volume exhibits symmetry in all 3 dimensions, for a 3D Gaussian confocal volume (centered by convention at the origin), posteriors are now invariant under a transformation that leaves the normalized distance
unchanged. In particular, this distance is symmetric under xk ↦ −xk, yk ↦ −yk, and zk ↦ −zk.
Next, we benchmark our method in 3D. Again, we compare the results of EKF and UKF in estimating the trajectory and diffusion coefficient of the molecule. Figure 8 demonstrates a sharper posterior as molecules move toward the center of the confocal volume consistent with the higher photon emission rates observed in this region.
B. Quantification of the diffusion coefficient
To benchmark our model across different time scales, we apply it on molecules diffusing with a range of diffusion coefficients (from slow to fast). Here, an intermediate diffusion coefficient is 1 μm2/s, where the molecule emits appreciably (say, an average of at least 4 photons per time point) for an average of 1000 time points. From Fig. 9, we see that when molecules diffuse rapidly, that is, when D is on the order of 10 μm2/s, the (symmetrized) EKF provides a poor estimate of the diffusion coefficient due to the local linearization employed, while the UKF provides a reasonable result due to its global approximation.
C. Application on cylindrical PSFs
We now extend our 3D model to allow for different illumination profiles.2 In particular, we consider a cylindrical PSF shape (i.e., a major semi-axis far exceeding in size the minor semi-axis for the 3D Gaussian) in addition to the ellipsoidal one (a 3D Gaussian with major and minor semi-axes of the same order) previously considered. That is, for a cylindrical volume, we consider the case z/ωz ≪ 1, and thus, the PSF reduces to
Since, for a cylindrical shaped volume, the observation probability is the same for any location of the molecule along the z-axis, now it is only possible to measure the radial distance of the molecule with respect to the z-axis. As such, for a molecule at (xk, yk, zk), we can estimate trajectories in terms of the following distance:
where, Rk is the radial distance of the molecule respect to the z-axis.
Figure 10 shows that, even in this new geometry, we can capture the normalized distance of the molecule with respect to the z-axis (Rk) as well as its diffusion coefficient.
Although both filters provide reasonably good estimates of the trajectory, the UKF again provides better estimates of the diffusion coefficient than the EKF.
In this study, we introduce a new framework to decode spatial information from the most common illumination profiles used across confocal microscopy.2 As the confocal volumes exhibit symmetry, i.e., multiple molecular positions generate identical signals, this creates modeling and computational challenges. To address these, as well as to address the inherent non-linearities in the observation process introduced by the PSFs dependence on the molecular location, we develop new approaches for non-linear filtering. In particular, we developed multi-modal generalizations of both extended and unscented Kalman filters.51–58 With these filters, we estimate not only molecular trajectories and diffusion coefficients but also errors around all of our estimates as, within the Bayesian paradigm, we obtain full posteriors over all quantities of typical interest in experiments employing confocal microscopy.
Our novel framework enables us, for the first time, to achieve spatial resolution from a conventional single-focus confocal setup. Although several experimental methods derive spatial information from confocal microscopy,18,63,64 all such methods rely on complex optical setups with multiple confocal volumes. It is now possible to generalize the approach presented here to treat multi-confocal volumes and even generalize it to the case of complex photophysics65 or multiple molecules within the illuminated region. The latter is a critical generalization as, if more than one molecule is present in the confocal volume, using our method unmodified would result in an estimate of only an average trajectory (as opposed to separate trajectories for each molecule). In a companion study,66 we show that the treatment presented herein, under the appropriate extensions, indeed carries over many molecules diffusing simultaneously through the confocal volume.
Our current formulation can also be adapted for different physical scenarios, for different experimental setups, or for different biophysical/biochemical phenomena. As an example, in cases that the diffusive molecule is also influenced by drift, our motion model can not only be modified to incorporate drift but also to learn the drift velocity. Drift velocity, in turn, could resolve the symmetry problem (i.e., our inability to distinguish in which portion of the shell of a Gaussian PSF a molecule is located). Furthermore, we can now envision treating more complex PSFs32 or more complex motion models as well, in particular, cases where the diffusion coefficient itself is changing over time.67 Also, using the Stokes-Einstein equation,33 we could extract the temperatures through the diffusion coefficients.
As a comment on possible broader implications of this work, we note that, traditionally, in fluorescence microscopy, signals are produced using multi-pixel cameras that receive photons at different rates. Accordingly, images are only subsequently formed by interpreting differences among the signals recorded in each pixel. Therefore, naive interpretation of the acquired data suggest that just one pixel in fluorescence microscopy should not produce images or should not provide any spatial resolution at all since it lacks nearby pixels to compare against. Nevertheless, this is true only under the assumption of uniform sample illumination.
Yet confocal data, often acquired to investigate processes with higher temporal resolution, encode spatial information even if just one pixel, for example, a sensor-like APD, is used. This is fundamental because the sample is under non-uniform illumination and so spatial information is already encoded on the photon emission rate itself.
Put differently, signals in fluorescence microscopy may be generated by signal differences either present at the observation level (i.e., using multi-pixel cameras with different signals arriving in each pixel) or at the data-generation level (i.e., using non-uniform illumination). Here, we take a first step in the latter direction suggesting a strategy for achieving some degree of spatial resolution—while fully retaining the temporal resolution—of confocal methods from just one pixel.
See supplementary material for a detailed derivation of EKF and UKF formulas, a description of the exact sampler, and a summary of our notational convention. The source code of the model is also available.
S.P. acknowledges support from NSF CAREER Grant No. MCB-1719537.