Fluorescence microscopy is of vital importance for understanding biological function. However, most fluorescence experiments are only qualitative inasmuch as the absolute number of fluorescent particles can often not be determined. Additionally, conventional approaches to measuring fluorescence intensity cannot distinguish between two or more fluorophores that are excited and emit in the same spectral window, as only the total intensity in a spectral window can be obtained. Here we show that, by using photon number resolving experiments, we are able to determine the number of emitters and their probability of emission for a number of different species, all with the same measured spectral signature. We illustrate our ideas by showing the determination of the number of emitters per species and the probability of photon collection from that species, for one, two and three otherwise unresolvable fluorophores. The convolution binomial model is presented to represent the counted photons emitted by multiple species. Then, the expectation-maximization (EM) algorithm is used to match the measured photon counts to the expected convolution binomial distribution function. In applying the EM algorithm, to leverage the problem of being trapped in a sub-optimal solution, the moment method is introduced to yield an initial guess for the EM algorithm. Additionally, the associated Cramér–Rao lower bound is derived and compared with the simulation results.

Understanding the complex biological function often requires precise localization of biological molecules in space and time to better understand their interactions and relevant chemical reactions and transformations. Cryogenic electron microscopy is capable of defining positions of atoms in complex molecules with angstrom accuracy1 but is often limited in capturing complex interactions of molecules in dynamic biological environment. Fluorescence and Raman microspectroscopies can provide structural and functional information about biological molecules, though special measures are needed to extend the spatial resolution of optical imaging beyond the traditional diffraction limit defined by the excitation wavelength of light. The new generation of optical imaging methods based on super-resolution optical imaging, for which the Nobel Prize in Chemistry was awarded in 2014, is gradually being adopted by the research community, and these techniques are now indispensable for fundamental understanding of biological function at the molecular level. However, many biological molecules are performing their function not along but in a coherent ensemble with other molecules. A typical example of such synergistic interaction is the electron-transfer complex in a mitochondrial membrane, where several cytochromes are involved in the ultimate production of ATP molecules that serve as a major energy fuel for a cell. To understand the function of a mitochondrion and to assess its ability to efficiently produce ATP molecules, one needs to quantify the number of such cytochromes in a membrane, which is close to impossible using conventional methods. There are several challenges which are related to the size of the focal spot and internal dynamics of mitochondria, and these make it difficult to localize those different cytochromes within the membrane. Traditional approaches based on classical photon statistics have significant limitations, because of either their invasive nature, such as induced photobleaching, which affects the electronic structure of biological fluorophores and amends its function, or complexity of signal collection and analysis.2 On the other hand, quantum spectroscopy based on the quantum statistics of detecting photons using photon resolved detectors and cameras3,4 (see also qCMOS by Hamamatsu5) as we have shown recently6 makes it feasible to count individual emitters in a focal volume.

Techniques of fluorescence microscopy have become among the most used techniques for understanding biological function.7–12 These typically involve measurement of the uptake of functionalized fluorophores, or observation of the expression of fluorescent proteins in response to some stimulus.8–12 Fluorescence experiments are usually qualitative or, at most, relative, as the total number of fluorophores is often not observable because a conventional intensity measurement is unable to distinguish a few bright emitters from many dim emitters.8 Moreover, if the fluorophores are excited and emit in the same spectral windows, then they may be impossible to distinguish with intensity only measurements. In Ref. 13, the average number of emitters in each species and the brightness ratio between multiple species are investigated and evaluated using high-order image correlation spectroscopy.

Quantum correlation techniques use two-photon and higher coincidences to gain information about an incoming light field. As such, they have found application in imaging and localization, e.g.,14–19 characterization6,20 as well as in the identification of non-classical states of light.21,22 The traditional method of measuring coincidences is through the Hanbury Brown and Twiss experiment,23 where single photon avalanche detectors (SPADs) monitor the light received via a beamsplitter, or sequence of beamsplitters in the case of higher order coincidences. Sometimes, systems that monitor higher-order coincidences in this fashion are referred to as photon number resolving detectors (PNRD) (see, for example, Ref. 22) although here we reserve this terminology for devices that are inherently photon resolving such as those we mention above. A comparison of high-order Hanbury Brown and Twiss with photon number resolving detectors can be found in Ref. 6.

Earlier, we showed that the problem of quantitative determination of the number of emitters and the probability of photon detection could be solved for a single species of emitters with assumed identical properties.6 Our approach there used the binomial distribution of the number of photons emitted in a pulsed fluorescence experiment. By considering photon number resolving detectors (PNRD),4,24–31 we showed that the distribution of photons arriving in each detected pulse is uniquely specified by the number of emitters and the photon detection probability, so that these parameters can be determined with some confidence given a particular measurement record.

Here, we show that PNRD detection techniques can, in principle, be used to discriminate between an arbitrary number of fluorescent species. We assume that each species is defined by two parameters, the number of emitters, M, and the probability of detection p, which is assumed to be constant for each member of the species, for instance, by each species having a different transition dipole moment. We assume no further ability to distinguish the species. The determination of multiple species is a highly non-trivial extension of our earlier work.6 The additional complexity arises due to the multi-dimensional state space which leads to multiple local minima, confounding convergence to the ground truth. To mitigate these issues, we introduce a two-stage estimator, which we find improves convergence significantly. Our treatment assumes that an upper bound on number of species is known a priori (some species could have either p = 0 or M = 0), but for computational simplicity here we will assume that the number of distinct species is known.

This paper is organized as follows. We first introduce the measurement model and maximum likelihood estimation (MLE) approach. We then discuss how the expectation-maximization (EM) technique is applied. Finally, we present simulations and Cramér–Rao lower bounds (CRLB) for the cases of one, two, and three species.

We consider a fluorescence experiment where there are m distinct fluorophores (species); species j has Mj emitters with probability of photon detection from each member of that species pj. A single excitation source (e.g., laser through microscope objective) excites the sample, and the fluorescence photons are collected by a PNRD. Schematic with three species emitters is shown in Fig. 1.

Fig. 1.

(a) Schematic of using PNRD to discriminate three species of emitters. The excitation laser (blue beam) passes through the dichroic mirror and focuses on three species (M1, p1), (M2, p2), and (M3, p3) under the diffraction limit locating within one Gaussian focal spot in (c). The emitting light (red beam) is detected by PNRD, and the synthetic photon number resolving signal is shown in (b) as bar charts in three colors. With the measurement time increases, the probability of detecting the photon numbers would follow Poisson distribution, shown as the black smooth curves. The area underneath each Poisson curve or bar chart is always one, indicating that in a practical measurement the photon counts or the intensity from each species is the same; therefore, they cannot be identified by conventional intensity-only based microscopy. However, they generate distinguishing PNRD signals, building on which we introduce the MLE approach to tell them apart.

Fig. 1.

(a) Schematic of using PNRD to discriminate three species of emitters. The excitation laser (blue beam) passes through the dichroic mirror and focuses on three species (M1, p1), (M2, p2), and (M3, p3) under the diffraction limit locating within one Gaussian focal spot in (c). The emitting light (red beam) is detected by PNRD, and the synthetic photon number resolving signal is shown in (b) as bar charts in three colors. With the measurement time increases, the probability of detecting the photon numbers would follow Poisson distribution, shown as the black smooth curves. The area underneath each Poisson curve or bar chart is always one, indicating that in a practical measurement the photon counts or the intensity from each species is the same; therefore, they cannot be identified by conventional intensity-only based microscopy. However, they generate distinguishing PNRD signals, building on which we introduce the MLE approach to tell them apart.

Close modal

Due to the numerical complexity of the current scheme, we make a simplifying assumption that each emitter species has the same detection probability per emitter. The point spread function of a microscope system means that as the emitters' positions vary, the collection efficiency would also vary. For some spatial distribution of the emitters, then the i th emitter from group j will have collection probability p i , j = p i , j ( r i , j ), where r i , j is the distance of the emitter from the focal point of the objective. If these detection probabilities overlap, then without additional information (for example an ansatz describing the spatial distribution of the emitters) it will not be possible to distinguish bright emitters with large r from less bright emitters with small r, which in turn implies that estimates of the Mj will be problematic. These restrictions mean that we require the emitters to be confined to a region much smaller than the point spread function, or to use uniform illumination and an integrating sphere to collect the emission.

This experiment is repeated a large number of times to build up statistics about the system. The number of fluorescence photons detected from each pulse, y, is counted at the outputs of the emitters; this is modeled as the sum of m random variables sampled from binomial distributions with parameters [ M j , p j ] , j = 1 , , m, i.e.,
Y = j = 1 m Y j , where Y j B ( M j , p j ) .
(1)
The probability mass function (PMF) of Y = y, y = 0 , , M and M = i = 1 m M i, given parameters θ = [ M 1 , p 1 , M 2 , p 2 , , M m , p m ] can be seen, by the law of total probability, to be a convolution of m binomial distributions, p r j ( y j | M j , p j ) for j = 1 , , m and y j = 0 , 1 , , M j, and so can be derived as
p r ( Y = y | θ ) = y Y y ( j = 1 m M j ( M j y j ) ! y j ! p j M j ( 1 p j ) M j y j ) = y Y y ( j = 1 m p r j ( y j | M j , p j ) ) ,
(2)
where y = [ y 1 , , y m ] and
Y y = { [ y 1 , , y m ] | j = 1 m y j = y , y j [ 0 , 1 , 2 , , M j ] } .
Y y is actually the collection of partition of the integer y into exactly m parts. It should be noted that background noise, such as stray light, objective autofluorescence, and camera noise are not considered in our model. These noise sources may be modeled by Poisson or mixed Poisson Gaussian model,32 and researchers have proposed some methods to eliminate or compress these noise.33–35 Although treating this noise is practically important, it represents a significant complication to our model, greatly increasing the necessary space to explore and thereby distracting from the main goals of this paper. We, therefore, omit background noise in the model (2), leaving this as a future problem.

The fundamental problem of interest is to estimate θ based on the measurements. We show that this problem can be formulated as an MLE. In Ref. 36, it is shown that the probability of the sum of m independent integer-valued random variables (not necessarily identically distributed), i.e., p r ( Y = y | θ ), may be calculated using a recurrence relation. Furthermore, the PMF of (1) can be approximated by a Gaussian distribution, N [ j = 1 m M p j , j = 1 m M j p j ( 1 p j ) ]. In Refs. 37–39, other more accurate approximation methods, such as saddlepoint approximation, Kolmogorov approximation, or Krawtchouk polynomial approximation, are provided. Since all the mentioned methods are either in the form of a recurrence formula or otherwise have no closed form, they cannot be directly used in deriving the MLE for θ. Accordingly, we use Eq. (2) for investigating the MLE.

In each experiment, we record the peak corresponding photon number from the PNRD signal as i, i = 0 , , N and N < = M. We count the occurrence of i in a series of experiments as Ci; then, the data from a series of experiments can be given by the frequency distribution [ C 0 , C 1 , , C N ], and ν = i = 0 N C i is the total number of experiments. The log-likelihood function can then be expressed as
( C 0 , , C N | θ ) = i = 0 N C i log y Y i ( j = 1 m p r j ( y j | M j , p j ) ) = i = 0 N C i log L ( y | θ ) ,
(3)
where L ( y | θ ) = y Y i [ j = 1 m p r j ( y j | M j , p j ) ]; similar to Y y , Y i is the collection of partition of the integer i into m parts. Furthermore, we assume that f ( y | θ ) = j = 1 m p r j ( y j | M j , p j ) and therefore L ( y | θ ) = y Y i f ( y | θ ).
The MLE of θ , θ ̂ = [ M ̂ 1 , p ̂ 1 , , M ̂ m , p ̂ m ], is
θ ̂ = arg max θ × j = 1 m ( + × [ 0 , 1 ] ) ( C 0 , , C N | θ ) ,
(4)
where × is the Cartesian product. From (2) to (4), the underlying estimation problem is formulated as a parameterized MLE problem; that is, seeking the set of parameters θ in the parameter space which yield maximum likelihood ( C 0 , , C N | θ ) based on the observations.

The solution to (4) when m = 1 was provided in Ref. 6, where the MLE is proved to be an effective estimator and the associated CRLB is derived. However, when m > 1, solving (4) directly is very inefficient and even computationally impossible since the dimension of the problem, i.e., the number of the parameters to be estimated, is 2m. This rules out grid-based methods to find the MLE. With the increasing number of parameters, the existence of multiple local extrema confounds most optimization methods for finding the global extremum.

To investigate the determination of θ, we generated synthetic data using the PMF, Eq. (2), with total number of experiments ν. These synthetic data yield a histogram of events, as illustrated in Fig. 2. The synthetic data were generated on the basis of ν = 100 experiments, with parameter θ = [ 8 , 0.1 , 10 , 0.2 , 12 , 0.3 ]. Also shown in Fig. 2(a), the expected PMF given by Eq. (2), the histogram of synthetic data when ν = 100. As the number of experiments increases, the synthetic data should converge to the expected PMF as shown in Fig. 2(b).

Fig. 2.

The comparison between the relative occurrence obtained from synthetic measurement from photons with different ν, (a) ν = 100 and (b) ν = 1 e 7, where θ = [ 8 , 0.1 , 10 , 0.2 , 12 , 0.3 ]. (c) shows the binomial PMF for each species. It can be seen that the histogram obtained from the synthetic data converges to the expected distribution of photons with the increasing of the ν.

Fig. 2.

The comparison between the relative occurrence obtained from synthetic measurement from photons with different ν, (a) ν = 100 and (b) ν = 1 e 7, where θ = [ 8 , 0.1 , 10 , 0.2 , 12 , 0.3 ]. (c) shows the binomial PMF for each species. It can be seen that the histogram obtained from the synthetic data converges to the expected distribution of photons with the increasing of the ν.

Close modal

Remark 1. In the estimation of the parameters of fluorescent species, one may be more interested in the number of emitters of each species rather than the probabilities, so that, ideally, more consideration should be given to that problem. However, in the estimation of [ M j , p j ], their accuracies are correlated; that is, reduced accuracy of estimation of pj will lead to a worse estimation performance for Mj. Additionally, because of the integral nature of Mj, the estimator M ̂ j may be well away from ground truth, corresponding to a small change of p ̂ j. Accordingly, we will not focus on a particular parameter in what follows.

The EM algorithm is an effective method for finding the MLE (or local extremum of the likelihood) iteratively by a simplification of a complicated likelihood function. By carefully choosing the initial guess of the EM, the MLE is approached with high probability.

In this section, the likelihood function ( C 0 , , C N | θ ), i.e., (3), is firstly reformulated and simplified to an equivalent problem under the EM framework using the sum-of-log-of-sums method.40 As a result, the 2m dimension MLE problem is converted into m independent two-dimension optimization problems for which local extrema can be found iteratively, see (12).

Since the likelihood has many local minima, the initial guess has a significant impact on the final estimate of the EM algorithm, i.e., the EM algorithm may converge to a local critical point with an “incorrect” initial guess. It should be noted that EM becomes more sensitive to the initial guess with increasing m, as the number of local extrema increases for larger m. Seeding the EM algorithm with a limited number of random initial guesses typically provides convergence to the MLE when m = 2, but the number of initial seeds becomes unacceptably large when m > 2. To overcome this problem, we combined the moment estimator with EM algorithm; that is, the results of moment estimator are used as the initial guesses.

Remark 2. We mention that it is non-trivial to find the global optimum, i.e., exact MLE, for the likelihood function (3) since the likelihood function is highly complex and exhibits multiple local minima. Some existing optimization methods, such as Annealing,41, particle swarm optimization42 and stochastic gradient descent,43,44 are available to find the global optimum for simple cost function, but there is no guarantee that these method can find the global optimum for highly nonlinear functions having multiple local optimum such as found here in the case of multiple species.

We introduce an intermediate variable θ ̂ s that is the estimate of θ at the sth iteration of the EM algorithm. Given θ ̂ s , from (3), we have
log L ( y | θ ) log L ( y | θ ̂ s ) = log [ y Y i f ( y | θ ) f ( y | θ ̂ s ) f ( y | θ ̂ s ) L ( y | θ ̂ s ) ]
(5)
y Y i w y s log [ f ( y | θ ) f ( y | θ ̂ s ) ]
(6)
= y Y i w y s log f ( y | θ ) y Y i w y s log f ( y | θ ̂ s )
(7)
= Q ̃ ( θ | θ ̂ s ) Q ̃ ( θ ̂ s | θ ̂ s ) ,
(8)
where (6) follows by Jensen's inequality,40 with w y s = f ( y | θ ̂ s ) / L ( y | θ ̂ s ) and y Y i w y s = 1.
In a similar way, we consider the joint log-likelihood function ( C 0 , , C N | θ ). In this case, the auxiliary function is
Q ( θ | θ ̂ s ) = i = 0 N C i y Y i w y s log f ( y | θ ) = i = 0 N C i y Y i w y s j = 1 m log g ( y j | θ j ) ,
(9)
where f ( y | θ ) is defined in (3) and
g ( y j | θ j ) = M j ( M j y j ) ! y j ! p j M j y j ( 1 p j ) y j .
(10)
One can find a local extremum of log L ( y | θ ) by maximizing Q ̃ ( θ | θ ̂ s ) iteratively.45 The problem then becomes to maximize Q ( θ | θ ̂ s ) over θ, i.e.,
θ ̂ s + 1 = { θ ̂ 1 s + 1 , , θ ̂ m s + 1 } = arg max y × j = 1 m ( [ 0 , 1 ] × + ) Q ( θ | θ ̂ s ) .
(11)
Since, log g ( y j | θ j ) , j = 1 , 2 , , m are independent with respect to θ j , where j = { 1 , , m } { j }, (11) can be rewritten as
θ ̂ s + 1 = { θ ̂ 1 s + 1 = arg max θ 1 [ 0 , 1 ] × + i = 0 N C i y Y i w y s log g ( y 1 | θ 1 ) , , θ ̂ m s + 1 = arg max θ m [ 0 , 1 ] × + i = 0 N C i y Y i w y s log g ( y m | θ m ) } = { θ ̂ 1 s + 1 = arg max θ 1 [ 0 , 1 ] × + Q 1 ( θ 1 | θ ̂ s ) , , θ ̂ m s + 1 = arg max θ m [ 0 , 1 ] × + Q m ( θ m | θ ̂ s ) } .
(12)
For j { 1 , , m }, we have
θ ̂ j s + 1 = arg max θ j [ 0 , 1 ] × + Q j ( θ j | θ ̂ s ) .
Then, p ̂ j s + 1 can be found via
p j Q j ( θ j | θ ̂ s ) = 0 p ̂ j s + 1 = i = 0 N C i y Y i w y s y j M ̂ j s ν .
(13)
Substituting p ̂ j s + 1 into jth term of (12), we obtain
θ ̂ j s + 1 = { M ̂ j s + 1 , p ̂ j s + 1 } ,
(14)
where
M ̂ j s + 1 = arg max M j + Q j ( M j | M ̂ j s , p ̂ j s + 1 ) .
(15)

Now, the EM algorithm can be implemented using an initial guess θ ̂ j 0 for j = 1 , , m and the (local) estimate can be obtained until θ ̂ j s converges. The structure of the EM algorithm is listed in Algorithm 1.

Algorithm 1.

The EM algorithm to estimate θ.

Data: C 0 , , C N 
Result: θ ̂ [ M ̂ 1 , p ̂ 1 , , M ̂ m , , p ̂ m ] s 0
Choose initial guesses [ M ̂ 1 s , , M ̂ m s ] and [ p ̂ 1 s , , p ̂ m s ] 
repeat 
   s s + 1
  Calculate p ̂ j s using (13) and (14) for j = 1 , , m
until converge
θ ̂ [ M ̂ 1 s , p 1 s , , M ̂ m s , p m s ]
Data: C 0 , , C N 
Result: θ ̂ [ M ̂ 1 , p ̂ 1 , , M ̂ m , , p ̂ m ] s 0
Choose initial guesses [ M ̂ 1 s , , M ̂ m s ] and [ p ̂ 1 s , , p ̂ m s ] 
repeat 
   s s + 1
  Calculate p ̂ j s using (13) and (14) for j = 1 , , m
until converge
θ ̂ [ M ̂ 1 s , p 1 s , , M ̂ m s , p m s ]

In the EM algorithm, the log likelihood is guaranteed to increase at each EM iteration, and it converges to a maximum of the likelihood under mild conditions.45 However, it is not guaranteed to be the global optimum.

As an “always improving” algorithm,46 EM is, of course, sensitive to the initial guess of θ, i.e., θ ̂ 0 when the likelihood function contains multiple critical points. We observe that the number of local critical points increases dramatically with the increasing number of species.

1. Choosing initial guesses for [ M 1 , , M m ]

When m > 1, (14) may converge to a local minimum that is not the MLE. To relieve this problem, a search procedure can be adopted into the EM algorithm. In (14), it can be observed that p j s + 1 is updated at each iteration step using M ̂ j s . By taking into account that p j s + 1 has a better chance to converge to pj when M ̂ j s converges to Mj, one can fix M ̂ j s = M j , s and j = 1 , , m, in (14) and then find the optimized p ̂ j s given [ M 1 , , M m ] iteratively.

In practice, [ M 1 , , M m ] is unknown and to be estimated. However, since M 1 , , M m are (bounded) integers so that their possible values are finite and listable by enumeration. Suppose that M j M , j and M +, then a set, M = { M 1 , , M L }, containing all possible combinations for [ M 1 , , M m ] can be constructed by m-combinations from the integer set { 1 , 2 , , M } without repetition, order does not matter and it can be verified that L = ( M n ). In other words, M l , l = 1 , , L, corresponds to a possible solution (combination) to [ M 1 , , M m ]. The EM algorithm is, therefore, implemented L times. At the lth implementation, the guess [ M ̂ 1 0 , , M ̂ m 0 ] is chosen to be M l and fixed for all s. For simplicity, we denote [ M ̂ m , l , , M ̂ m , l ] as the guess of [ M 1 , , M m ] at lth implementation of EM algorithm.

2. Choice of initial guesses for [ p 1 , , p m ]

To obtain initial guesses for the estimator of [ p 1 , , p m ], we use the estimates from the moment estimator. By using the data, it is straightforward to calculate the sample mean, μ ̂ 1 = i = 0 N ( C i / ν ) i, and kth sample central moments, μ ̂ k = i = 0 N ( C i / ν ) ( i μ ̂ 1 ) k for k > 1. The associated population mean and central moments are μ 1 = E [ x ] and μ k = E [ ( x μ 1 ) k ] for k > 1. Then, the moment estimator, θ ̂ mom, can be obtained by solving μ ̂ i = μ i for i = 1 , , m, and this can be used to seed the EM algorithm: θ ̂ 0 = θ ̂ mom. Since we may find multiple θ ̂ mom from the moment estimator, the EM can be run in parallel with different initial guesses.

As an example, the moments μi of the sum of m = 4 binomial distributed random variables are38 
{ μ 1 = j = 1 m M j p j , μ 2 = j = 1 m ( 1 p j ) M j p j , μ 3 = j = 1 m ( 1 p j ) ( 1 2 p j ) M j p j , μ 4 = j = 1 m M j p j ( 1 p j ) ( 1 + ( 3 M j 6 ) ( 1 p j ) p j ) ,
which can be simplified to
{ j = 1 m M j p j = μ 1 , j = 1 m M j p j 2 = μ 1 μ 2 , j = 1 m M j p j 3 = 1 2 ( 2 μ 1 3 μ 2 + μ 3 ) , j = 1 m M j p j 4 = 1 6 ( 6 μ 1 11 μ 2 + 3 μ 2 2 + 6 μ 3 μ 4 ) .
(16)

Applying the method of moments, we replace [ M 1 , , M m ] by [ M ̂ 1 , l , , M ̂ m , l ] and [ μ 1 , , μ m ] by [ μ ̂ 1 , l , , μ ̂ m , l ] in (16) and find the real solutions [ p ̂ 1 , l , , p ̂ m , l ] to provide the initial guess of [ p 1 , , p m ] at the lth implementation of EM algorithm. After obtaining multiple candidate estimates with different initial guesses, the estimated MLE among these candidates is the one with the largest value of the likelihood function (3). The algorithm is summarized in Algorithm 2.

Algorithm 2.

The algorithm to estimate θ combining searching strategy.

Data: C 0 , , C N 
Result: θ ̂ [ M ̂ 1 , p ̂ 1 , , M ̂ m , , p ̂ m ] 
{ M 1 , , M L } nchoosek ( N , m )
k 1
for l = 1 : L do 
   sol = solve ( j = 1 m M ̂ j , l p j i = μ ̂ i , i = 1 , , m )
  if sol is real then 
    s 0
    p ̂ 1 , l s , , p ̂ m , l s sol
   repeat 
     s s + 1
    Find p ̂ j , l s using (14) by replacing M ̂ j s with M ̂ j , l
   until converge
    θ ̂ k = [ M ̂ 1 , l , p ̂ 1 , l s , , M ̂ m , l , p ̂ m , l s ]
    e k = ( θ ̂ l | C 0 , , C N )
    k k + 1
   I arg max i { e i }
   θ ̂ θ ̂ I
Data: C 0 , , C N 
Result: θ ̂ [ M ̂ 1 , p ̂ 1 , , M ̂ m , , p ̂ m ] 
{ M 1 , , M L } nchoosek ( N , m )
k 1
for l = 1 : L do 
   sol = solve ( j = 1 m M ̂ j , l p j i = μ ̂ i , i = 1 , , m )
  if sol is real then 
    s 0
    p ̂ 1 , l s , , p ̂ m , l s sol
   repeat 
     s s + 1
    Find p ̂ j , l s using (14) by replacing M ̂ j s with M ̂ j , l
   until converge
    θ ̂ k = [ M ̂ 1 , l , p ̂ 1 , l s , , M ̂ m , l , p ̂ m , l s ]
    e k = ( θ ̂ l | C 0 , , C N )
    k k + 1
   I arg max i { e i }
   θ ̂ θ ̂ I

In this section, we calculate the Fisher Information Matrix (FIM) and then the Cramér–Rao lower bound (CRLB).47 The CRLB provides a lower bound for the variance of an unbiased estimator. The underlying likelihood function (3) contains continuous as well as discrete components so that the conventional method to derive CRLB may not be applicable. A Cramér–Rao type bound for discrete likelihood function is proposed in Ref. 48. However, it cannot handle the distribution containing discrete and continuous components. In this paper, following the calculation in Ref. 6, an approximated CRLB is derived by approximating the discrete component x ! by a continuous function x Γ ( x ). On the other hand, the MLE is typically asymptotically unbiased under mild conditions.49 From the simulation given in Sec. V, the proposed MLE asymptotically approaches the derived approximated CRLB.

For the parameter θ = [ θ 1 , θ 2 , , θ 2 m 1 , θ 2 m ] = [ M 1 , p 1 , , M m , p m ], we see that the (k, l)th element of the FIM for (2), I ( θ ) k , l is
I ( θ ) k , l = i = 0 M y Y i { ( f ( y | θ ) θ k f ( y | θ ) θ l ) 1 f ( y | θ ) } ,
(17)
where f ( y | θ ) = j = 1 m p r j ( y j | M j , p j ).
By using x ! = x Γ ( x ) and [ x Γ ( x ) ] = Γ ( x ) + x Γ ( x ) ψ ( x ), where Γ ( · ) is the Gamma function and ψ ( · ) is the digamma function, we have
p r j ( y j | M j , p j ) M j = Γ ( M j ) p y j ( 1 p j ) M j y j y j ! ( M j y j ) 2 Γ ( M j y j ) × { M j ( y j M j ) [ ψ ( M j y j ) ψ ( M j ) log ( 1 p j ) ] y j }
(18)
and
f ( y j | M j , p j ) p j = M j ! p j y j 1 ( 1 p j ) M j y j 1 ( M j p j y j ) y j ! ( M j y j ) ! .
(19)

As a result, by applying the chain rule, we are able to calculate ( f ( y | θ ) / θ k ) using (18) and (19). Then, the FIM, I ( θ ), can now be calculated by inserting (18) and (19) into (17). The CRLB is just C ( θ ) = I ( θ ) 1. For ν experiments, the CRLB at ground truth θ 0 is C ν ( θ 0 ) = ( 1 / ν ) C ( θ ) | θ = θ 0.

Here, the algorithm is evaluated via Monte Carlo (MC) simulations. The metrics used to evaluate performance are
RMSE ( X ̂ 1 : nMC , j , X j ) = i = 1 nMC ( X ̂ i , j X j ) 2 nMC ,
(20)
aMAPE ( X ̂ 1 : nMC , 1 : 4 , X 1 : 4 ) = 1 m j = 1 m ( i = 1 nMC | X ̂ i , j X j X j | ) ,
(21)
where X { M , P } , X ̂ i , j is the estimate of Xj at the ith MC simulation, nMC is the number of MC simulations, and RMSE ( · ) is the root mean square error and aMAPE ( · ) is the averaged mean absolute percentage error.

To explore the convergence of our algorithm, in Secs. IV A, IV B, and IV C, we illustrate the performance for the one, two, and three distinct species, respectively.

The case of determining the number and detection probability for one species is highly analogous to the case that we presented in Ref. 6. We concentrate on the small number regime as this is more pertinent for the case that we are concentrating on with few emitters.

Here, we assume that the true parameters are [ M 1 , p 1 ] = [ 8 , 0.1 ]. Figure 3 shows the convergence of the determination of the number of emitters and detection probability.

Fig. 3.

(a) RMSE of the estimate of p1 as a function of the number of experiments ν for a single species. The blue line is the Monte Carlo simulation, and the red is the square root of the CRLB. Good agreement is achieved at around 106 experiments. (b) RMSE of the estimate of the number of emitters as a function of the number of experiments. The ground truth for this case was [ M , p ] = [ 8 , 0.1 ].

Fig. 3.

(a) RMSE of the estimate of p1 as a function of the number of experiments ν for a single species. The blue line is the Monte Carlo simulation, and the red is the square root of the CRLB. Good agreement is achieved at around 106 experiments. (b) RMSE of the estimate of the number of emitters as a function of the number of experiments. The ground truth for this case was [ M , p ] = [ 8 , 0.1 ].

Close modal

Additionally, in Fig. 4, the (expected) required number of experiments, νexp, to attain a given estimation performance, CRLB ( M 1 ) / M 1 = 1 %, with varied M 1 = 1 , , 20 and p 1 [ 0.05 , 0.95 ] is plotted. It can be seen that, to achieve the fixed performance, the small value of M1 or large value of p1 requires less number of experiments, which gives some insight in how the values of parameters relate to the estimation performance.

Fig. 4.

The required number of experiments, νexp, to attain CRLB ( M 1 ) / M 1 = 1 % for one species of emitter, as a function of M and p, while M1 varies from 1 to 20 and p1 from 0.05 to 0.95. Note that ν is plotted in log 10 scale. From the figure, it can be seen that p1 has greater impact on the required number of experiments, ν, i.e., the larger the p1 is, the less ν is. This is because smaller ν is required for constructing an empirical distribution [ C 1 , , C N ] with certain accuracy when p1 is large. This can be seen from the definition of p1 and M1 in (1), where they are parameters of a binomial distribution.

Fig. 4.

The required number of experiments, νexp, to attain CRLB ( M 1 ) / M 1 = 1 % for one species of emitter, as a function of M and p, while M1 varies from 1 to 20 and p1 from 0.05 to 0.95. Note that ν is plotted in log 10 scale. From the figure, it can be seen that p1 has greater impact on the required number of experiments, ν, i.e., the larger the p1 is, the less ν is. This is because smaller ν is required for constructing an empirical distribution [ C 1 , , C N ] with certain accuracy when p1 is large. This can be seen from the definition of p1 and M1 in (1), where they are parameters of a binomial distribution.

Close modal

We now consider the case of two species. As before, we set [ M 1 , p 1 ] = [ 8 , 0.1 ] and introduce [ M 2 , p 2 ] = [ 10 , 0.2 ] as the ground truth condition for the second species. We vary the number of experiments ν and examine the reduction in the RMSE of these parameters as ν increases, as shown in Fig. 5. Note that our simulations estimate both species simultaneously. For each ν, the Monte Carlo simulation is implemented 100 times. The comparison of the simulated RMSE and the computed squared root of the CRLB of p1 for all ν is shown in Fig. 5(a) (the trends for p2 are not shown since they are similar to p1), while the comparison of the estimates and the RMSE of the estimated [ M 1 , M 2 ] is shown in Fig. 5(b) for both species.

Fig. 5.

Two species estimation: (a) Reduction in the RMSE of p1 as a function of ν for the simultaneous estimation of the parameters of both species. We omit the results for p2 as they are very similar to those of p1. The blue line shows the results of our Monte Carlo simulations, and the red shows the square root of the CRLB. Convergence between the two is achieved at ν 10 10 experiments. (b) RMSE of the estimates of M1 and M2 as a function of ν. Both estimates follow similar trends, with M2 with the RMSE of M2 around 1.8 times larger than M1, due to the fact that it is generally harder to discriminate between states of more emitters.

Fig. 5.

Two species estimation: (a) Reduction in the RMSE of p1 as a function of ν for the simultaneous estimation of the parameters of both species. We omit the results for p2 as they are very similar to those of p1. The blue line shows the results of our Monte Carlo simulations, and the red shows the square root of the CRLB. Convergence between the two is achieved at ν 10 10 experiments. (b) RMSE of the estimates of M1 and M2 as a function of ν. Both estimates follow similar trends, with M2 with the RMSE of M2 around 1.8 times larger than M1, due to the fact that it is generally harder to discriminate between states of more emitters.

Close modal

The number of experiments required to obtain CRLB ( M 1 ) / M 1 = 1 % and CRLB ( M 2 ) / M 2 = 1 % as a function of M2 and p2 is shown in Fig. 6, for fixed [ M 1 , p 1 ] = [ 8 , 0.1 ]. Though [ M 1 , p 1 ] are fixed, they are treated as unknown parameter in the simulation and estimated simultaneously with [ M 2 , p 2 ]. The trends here are highly analogous to those from Fig. 4, albeit at a larger number of experiments for the specified precision. These results show that typically fewer experiments are required for lower M2, except where p2 approaches 1. Although the high p2 limit is not expected to be achievable in standard microscopy type setups.

Fig. 6.

The required number of experiments, νexp, to attain CRLB ( M 2 ) / M 2 = 1 % with (unknown) [ M 1 , p 1 ] = [ 8 , 0.1 ] as M2 varies from 1 to 20 and p2 from 0.05 to 0.95. Note that ν is plotted in log 10 scale and the white pixel corresponds to the point that M 1 = M 2 = 8 and p 1 = p 2 = 0.1 so that the FIM is singular (the CRLB does not exist). A similar trend is observed to that shown in Fig. 4, albeit with more experiments required to reach the specified precision for all conditions.

Fig. 6.

The required number of experiments, νexp, to attain CRLB ( M 2 ) / M 2 = 1 % with (unknown) [ M 1 , p 1 ] = [ 8 , 0.1 ] as M2 varies from 1 to 20 and p2 from 0.05 to 0.95. Note that ν is plotted in log 10 scale and the white pixel corresponds to the point that M 1 = M 2 = 8 and p 1 = p 2 = 0.1 so that the FIM is singular (the CRLB does not exist). A similar trend is observed to that shown in Fig. 4, albeit with more experiments required to reach the specified precision for all conditions.

Close modal

We now illustrate our general approach by considering the simultaneous determination of the properties of three species. As before, we maintain the same ground truth solutions for species 1 and 2, such that [ M 1 , p 1 ] = [ 8 , 0.1 ] , [ M 2 , p 2 ] = [ 10 , 0.2 ] and introduce [ M 3 , p 3 ] = [ 12 , 0.3 ]. As before we explore the estimation as a function of the number of experiments, with 100 Monte Carlo simulations per experiment (i.e., for each value of ν). Convergence of the p is similar for all species, so we only show the RMSE of p1 as a function of ν, which is plotted in Fig. 7(a). The overall trends here are similar to those observed with single species and two species identification, although the three species estimation requires around 1000 times more experiments to achieve comparable RMSE in the p than the two-species estimation, with convergence to the square root of the CRLB occurring after around 1014 experiments, four orders of magnitude more than the two-species convergence.

Fig. 7.

Simultaneous estimation of three species with ground truth [ M 1 , p 1 ] = [ 8 , 0.1 ] , [ M 2 , p 2 ] = [ 10 , 0.2 ], and [ M 3 , p 3 ] = [ 12 , 0.3 ] (a) decrease in the RMSE of p1 as a function of ν. We omit p2 and p3 as they are very similar. Convergence with the CRLB is achieved around ν = 10 14. RMSE for the estimate of the pi require around three orders of magnitude more experiments before they are similar to those obtained for the two-species estimation due to the complexity of the problem and difficulty in distinguishing the distributions. (b) RMSE of the estimates of M1, M2, and M3 as a function of the number of experiments. Estimates of M2 are slightly larger than for M1 and M3.

Fig. 7.

Simultaneous estimation of three species with ground truth [ M 1 , p 1 ] = [ 8 , 0.1 ] , [ M 2 , p 2 ] = [ 10 , 0.2 ], and [ M 3 , p 3 ] = [ 12 , 0.3 ] (a) decrease in the RMSE of p1 as a function of ν. We omit p2 and p3 as they are very similar. Convergence with the CRLB is achieved around ν = 10 14. RMSE for the estimate of the pi require around three orders of magnitude more experiments before they are similar to those obtained for the two-species estimation due to the complexity of the problem and difficulty in distinguishing the distributions. (b) RMSE of the estimates of M1, M2, and M3 as a function of the number of experiments. Estimates of M2 are slightly larger than for M1 and M3.

Close modal

We show the estimation of M1, M2, and M3 for the same ground truth in Fig. 7(b). These follow similar trends, with the estimation of M2 being slightly worse than M1 or M3 for the same number of experiments. The results show that the estimation requires approximately four orders of magnitude more experiments to achieve similar precision as two-species estimation.

Due to the increasing complexity with increasing number of species, we were unable to simulate the case for more than three species, nor to explore the state space for three-species in the form of Figs. 4 and 6 

Comparing with Fig. 3, Figs. 5 and 7, it is noticeable that the required ν for p1 to attain CRLB increases rapidly with the increasing of m, i.e., ν 10 6 for m = 1, ν 10 10 for m = 2 and ν 10 14 for m = 3. Furthermore, the aMAPEs of [ M ̂ 1 , , M ̂ m ] [see Eq. (21)] are plotted in Fig. 8, where [ M 1 , p 1 ] = [ 8 , 0.1 ] , [ M 2 , p 2 ] = [ 10 , 0.2 ], and [ M 3 , p 3 ] = [ 12 , 0.3 ]. One can see, from Figs. 3, 5, 7, and 8, that the performance of estimating pj and Mj is correlated, with roughly four orders of magnitude more experiments needed as the number of simultaneous species are estimated.

Fig. 8.

Comparison of the averaged mean absolute percentage errors as a function of ν for simultaneous estimation of one species (M1 only) (blue), two species (M1 and M2) (red) and all three species (black). Ground truths were [ M 1 , p 1 ] = [ 8 , 0.1 ] , [ M 2 , p 2 ] = [ 10 , 0.2 ], and [ M 3 , p 3 ] = [ 12 , 0.3 ]. Increasing the number of species by one leads to approximately four orders of magnitude in the number of experiments required to obtain similar aMAPE.

Fig. 8.

Comparison of the averaged mean absolute percentage errors as a function of ν for simultaneous estimation of one species (M1 only) (blue), two species (M1 and M2) (red) and all three species (black). Ground truths were [ M 1 , p 1 ] = [ 8 , 0.1 ] , [ M 2 , p 2 ] = [ 10 , 0.2 ], and [ M 3 , p 3 ] = [ 12 , 0.3 ]. Increasing the number of species by one leads to approximately four orders of magnitude in the number of experiments required to obtain similar aMAPE.

Close modal

It can be seen from the simulation result, to achieve a certain value of [ CRLB ( M 1 ) / M 1 ], the required number of experiments is significantly large, roughly 1012, when the probabilities to be estimated are small. This is largely because of that when the probabilities, pi, are small, the empirical distribution [ C 0 , , C N ] will be hard to obtain with certain accuracy via limited experiments. As a result, achieving level 1% for [ CRLB ( M 1 ) / M 1 ] requires a large number of experiments. It is possible that these results will be improved if a priori information is accessible, and so these results should be considered worst case scenarios.

In this paper, we have formulated, mathematically, the estimation of the parameters of an arbitrary number of fluorescent species. Specifically, the convolution binomial model is presented for the underlying problem and then the exact MLE is derived. In order to resolve the intractability of the MLE manifest in the convolutional property and the high-dimensionality of the parameters, a version of the EM algorithm incorporating the method of moment for choosing the initial guess is proposed. The simulation results with different number of species have demonstrated the efficiency of the algorithm by comparing with the derived CRLB.

We also found that the values of the parameters, the number of emitters and their probabilities, have impact on the performance of the estimation: the closer the probabilities are, the worse the performance is, and a larger summed numbers of emitters degrades the performance. Although at present we have assumed that the number of species is known a priori, future work could compare models, for example, via the Akaike Information Criterion,50 to ease this restriction.

Our work provides a preliminary study and demonstrates the possibility for the estimation of an arbitrary number of fluorescent species and gives insights into the relationship between performance of the estimator and the values of the parameters, which improves understanding of the problem. In future work, we will seek (1) to reduce the required number of experiments while maintaining estimator performance, (2) to improve computational efficiency, (3) to introduce experimental noise signatures, and (4) to identify the role which prior information may help to constrain the estimation problem in particular situations of interest.

The authors would like to thank Brett C. Johnson for his great help in the visualization of the study. This work was funded by the Air Force Office of Scientific Research (No. FA9550-20-1-0276). A.D.G. also acknowledges funding from the Australian Research Council (Nos. CE140100003 and FT160100357). V.V.Y. acknowledges partial support from the Air Force Office of Scientific Research (AFOSR) (Nos. FA9550-20-1-0366 and FA9550-23-1-0599), the National Institutes of Health (NIH) (Nos. 1R01GM127696, 1R21GM142107, and 1R21CA269099). This material is also based upon work supported by the NASA, BARDA, NIH, and USFDA, under Contract/Agreement No. 80 ARC023CA002E.

The authors have no conflicts to disclose.

Wenchao Li: Data curation (equal); Formal analysis (equal); Methodology (equal); Visualization (equal); Writing – original draft (equal); Writing – review & editing (equal). Shuo Li: Formal analysis (equal); Investigation (equal); Visualization (equal); Writing – review & editing (equal). Timothy C. Brown: Formal analysis (equal); Methodology (equal); Writing – review & editing (equal). Qiang Sun: Formal analysis (equal); Investigation (equal); Writing – review & editing (equal). Xuezhi Wang: Formal analysis (equal); Methodology (equal); Writing – review & editing (equal). Vladislav V. Yakovlev: Conceptualization (equal); Funding acquisition (equal); Visualization (equal); Writing – original draft (equal); Writing – review & editing (equal). Allison Kealy: Funding acquisition (equal); Methodology (equal); Writing – review & editing (equal). Bill Moran: Formal analysis (equal); Methodology (equal); Supervision (equal); Writing – review & editing (equal). Andrew Greentree: Conceptualization (equal); Funding acquisition (equal); Project administration (equal); Resources (equal); Supervision (equal); Writing – original draft (equal); Writing – review & editing (equal).

The data that support the findings of this study are available from the corresponding author upon reasonable request.

1.
K. M.
Yip
,
N.
Fischer
,
E.
Paknia
,
A.
Chari
, and
H.
Stark
,
Nature
587
,
157
(
2020
).
2.
K. S.
Grußmayer
,
K.
Yserentant
, and
D. P.
Herten
,
Methods Appl. Fluoresc.
7
,
012003
(
2019
).
3.
F. E.
Becerra
,
J.
Fan
, and
A.
Migdall
,
Nat. Photonics
9
,
48
53
(
2015
).
4.
J.
Provazník
,
L.
Lachman
,
R.
Filip
, and
P.
Marek
,
Opt. Express
28
,
14839
14849
(
2020
).
5.
Photonics Hamamatsu
, see https://www.hamamatsu.com/content/dam/hamamatsu-photonics/sites/documents/99_SALES_LIBRARY/sys/SCAS0149E_qCMOS_whitepaper.pdf for “
qCMOS: Quantitative CMOS technology enabled by photon number resolving
” (
2021
).
6.
S.
Li
,
W.
Li
,
V. V.
Yakovlev
,
A.
Kealy
, and
A. D.
Greentree
,
Opt. Express
30
,
12495
(
2022
).
7.
P.
Ellinger
and
A.
Hirt
,
Naunyn-Schmiedebergs Arch. Exp. Pathol. Pharmakol.
145
,
193
(
1929
).
8.
J. W.
Lichtman
and
J.-A.
Conchello
,
Nat. Methods
2
,
910
(
2005
).
9.
M.
Renz
,
Cytometry, Part A
83
,
767
(
2013
).
10.
M. J.
Sanderson
,
I.
Smith
,
I.
Parker
, and
M. D.
Bootman
,
Cold Spring Harbor Protoc.
2014
,
1042
1065
.
11.
U.
Kubitscheck
,
Fluorescence Microscopy
(
Garland Science
,
2017
).
12.
J. C.
Stockert
and
A.
Blazquez-Castro
,
Fluorescence Microscopy in Life Sciences
(
Bentham Science Publishers
,
2017
).
13.
D.
Katoozi
,
A. H. A.
Clayton
,
D. J.
Moss
, and
J. W. M.
Chon
,
Biomed. Opt. Express
12
,
539
(
2021
).
14.
S. W.
Hell
,
J.
Soukka
, and
P. E.
Hänninen
,
Bioimaging
3
,
64
(
1995
).
15.
O.
Schwartz
and
D.
Oron
,
Phys. Rev. A
85
,
033812
(
2012
).
16.
D.
Gatto Monticone
,
K.
Katamadze
,
P.
Traina
,
E.
Moreva
,
J.
Forneris
,
I.
Ruo-Berchera
,
P.
Olivero
,
I. P.
Degiovanni
,
G.
Brida
et al,
Phys. Rev. Lett.
113
,
143602
(
2014
).
17.
H.
Ta
,
J.
Keller
,
M.
Haltmeier
,
S. K.
Saka
,
J.
Schmied
,
F.
Opazo
,
P.
Tinnefeld
,
A.
Munk
, and
S. W.
Hell
,
Nat. Commun.
6
,
7977
(
2015
).
18.
Y.
Israel
,
R.
Tenne
,
D.
Oron
, and
Y.
Silberberg
,
Nat. Commun.
8
,
14786
(
2017
).
19.
J. G.
Worboys
,
D. W.
Drumm
, and
A. D.
Greentree
,
Phys. Rev. A
101
,
013810
(
2020
).
20.
A.
Kurz
,
J. J.
Schmied
,
K. S.
Grußmayer
,
P.
Holzmeister
,
P.
Tinnefeld
, and
D.-P.
Herten
,
Small
9
,
4061
(
2013
).
21.
I.
Straka
,
L.
Lachman
,
J.
Hloušek
,
M.
Miková
,
M.
Mičuda
,
M.
Ježek
, and
R.
Filip
,
npj Quantum Inf.
4
,
4
(
2018
).
22.
J.
Hloušek
,
M.
Dudka
,
I.
Straka
, and
M.
Ježek
,
Phys. Rev. Lett.
123
,
153604
(
2019
).
23.
R.
Hanbury Brown
and
R. Q.
Twiss
,
Nature
177
,
27
(
1956
).
24.
G.
Thekkadath
, “
Preparing and characterizing quantum states of light using photon-number-resolving detectors
,” Ph.D.
thesis
(
University of Oxford
,
2020
).
25.
B. E.
Kardynal
,
Z. L.
Yuan
, and
A. J.
Shields
,
Nat. Photonics
2
,
425
(
2008
).
26.
J.
Ma
,
S.
Masoodian
,
D. A.
Starkey
, and
E. R.
Fossum
,
Optica
4
,
1474
(
2017
).
27.
C.
Cahall
,
K. L.
Nicolich
,
N. T.
Islam
,
G. P.
Lafyatis
,
A. J.
Miller
,
D. J.
Gauthier
, and
J.
Kim
,
Optica
4
,
1534
(
2017
).
28.
M.
Schmidt
,
M.
von Helversen
,
M.
López
,
F.
Gericke
,
E.
Schlottmann
,
T.
Heindel
,
S.
Kück
,
S.
Reitzenstein
, and
J.
Beyer
,
J. Low Temp. Phys.
193
,
1243
(
2018
).
29.
D. A.
Kalashnikov
,
S. H.
Tan
,
M. V.
Chekhova
, and
L. A.
Krivitsky
,
Opt. Express
19
,
9352
(
2011
).
30.
L. A.
Morais
,
T.
Weinhold
,
M. P.
de Almeida
,
A.
Lita
,
T.
Gerrits
,
S. W.
Nam
,
A. G.
White
, and
G.
Gillett
, “
Precisely determining photon-number in real-time
,” arXiv:2012.10158 (
2020
).
31.
F.
Mattioli
,
Z.
Zhou
,
A.
Gaggero
,
R.
Gaudio
,
R.
Leoni
, and
A.
Fiore
,
Opt. Express
24
,
9067
(
2016
).
32.
T.
Rasal
,
T.
Veerakumar
,
B. N.
Subudhi
, and
S.
Esakkirajan
,
IET Image Process.
15
,
1383
(
2021
).
33.
S.
Haider
,
A.
Cameron
,
P.
Siva
,
D.
Lui
,
M.
Shafiee
,
A.
Boroomand
,
N.
Haider
, and
A.
Wong
,
Sci. Rep.
6
,
20640
(
2016
).
34.
R. F.
Laine
,
G.
Jacquemet
, and
A.
Krull
,
Int. J. Biochem. Cell Biol.
140
,
106077
(
2021
).
35.
V.
Mannam
,
Y.
Zhang
,
Y.
Zhu
,
E.
Nichols
,
Q.
Wang
,
V.
Sundaresan
,
S.
Zhang
,
C.
Smith
,
P. W.
Bohn
et al,
Optica
9
,
335
(
2022
).
37.
R.
Eisinga
,
M.
Te Grotenhuis
, and
B.
Pelzer
,
Stat. Neerl.
67
,
190
(
2013
).
38.
K.
Butler
and
M. A.
Stephens
,
Methodol. Comput. Appl. Probab.
19
,
557
(
2017
).
39.
H.-T.
Ha
,
Kyungpook Math. J.
57
,
493
502
(
2017
).
40.
D. R.
Hunter
,
P. K.
Don
, and
B. G.
Lindsay
, “
An expansive view of EM algorithms
,” in
Handbook of Mixture Analysis
(Chapman and Hall/CRC,
2019
),
41
52
.
41.
N.
Ueda
and
R.
Nakano
,
Neural Networks
11
,
271
(
1998
).
42.
Y.
Del Valle
,
G. K.
Venayagamoorthy
,
S.
Mohagheghi
,
J.-C.
Hernandez
, and
R. G.
Harley
,
IEEE Trans. Evol. Comput.
12
,
171
(
2008
).
43.
S.
Ruder
, “
An overview of gradient descent optimization algorithms
,” arXiv:1609.04747 (
2016
).
44.
X.
Su
and
M.
Bai
,
PLoS One
15
,
e0238000
(
2020
).
45.
C. J.
Wu
,
Ann. Stat.
11
,
95
(
1983
).
46.
A. P.
Dempster
,
N. M.
Laird
, and
D. B.
Rubin
,
J. R. Stat. Soc.: Ser. B
39
,
1
(
1977
).
47.
A.
Ly
,
M.
Marsman
,
J.
Verhagen
,
R. P.
Grasman
, and
E.-J.
Wagenmakers
,
J. Math. Psychol.
80
,
40
(
2017
).
48.
T.
Nishiyama
, “
Cramér-Rao-type bound and Stam's inequality for discrete random variables
,” arXiv:1904.12704 (
2019
).
49.
E. L.
Lehmann
and
G.
Casella
,
Theory of Point Estimation
(
Springer Science & Business Media
,
2006
).
50.
H.
Akaike
, “
Information theory and an extension of the maximum likelihood principle
,” in
Breakthroughs in Statistics: Foundations and Basic Theory
, edited by
S.
Kotz
and
N. L.
Johnson
(
Springer New York
,
New York
,
1992
), pp.
610
624
.