Search processes are ubiquitous in physical and biological phenomena, often involving the random motion of molecules. In particular, transcription factors (TFs) are proteins that regulate gene expression and need to find their DNA targets quickly—which is difficult to achieve with random motion alone. Nature came up with a remarkable solution known as facilitated diffusion, combining 1D diffusion along the DNA and “excursions” of diffusion in 3D that help the TF to quickly arrive at distant parts of the DNA. In this paper, we show that this process can be analyzed naturally using the concept of conditional probability, providing an alternative intuition to the effectiveness of this mechanism.

Elchanan Mossel's dice paradox (Ref. 1) poses a simple probability question that gives rise to neat and non-intuitive results. The paradox is phrased as follows (Ref. 2):

You roll a fair six-sided die until you get 6. What is the expected number of rolls, if we consider only those series of rolls in which each roll in the series is an even number?

It is a common mistake to think that the answer is 3 based on the following logic: if the possible outcomes are either 2, 4 or 6 and we wish to get a 6 then the number of rolls follows a geometric distribution with a parameter of 1/3.

Alas, this “solution” overlooks the conditioning present in the formulation of the paradox that should lead us to make use of a fundamental concept in probability theory, conditional probability. All rolls are still possible, but the conditioning on rolling only even numbers makes us eliminate a series from our statistics as soon as an odd number is rolled. The basic concept of conditional probability is summarized by the following equation:
$P(A|B) = \frac{P(A \cap B)}{P(B)}.$
(1)
In Eq. (1), $P(A|B)$ is the probability that A occurs given that B has occurred, and $P(A \cap B)$ is the probability that both A and B occur. As mentioned, the answer of 3 is wrong because it does not properly take the conditioning into account. We can perform the correct calculation using Eq. (1). Instead of A we'll have $X_i$, the event that the first 6 appears on the ith roll (each individual roll lands on 6 with probability 1/6). The event B occurs whenever a series has only even numbers (2s and 4s) before the first 6. P(B) is the fraction of series that qualify for consideration because they meet the condition. First, we'll calculate $P(X_i \cap B)$:
$P(X_i \cap B) = \left(\frac{1}{3}\right)^{i-1} \cdot \frac{1}{6} = \frac{1}{2} \cdot \left(\frac{1}{3}\right)^{i}.$
(2)
P(B), the fraction of all possible series that qualify for consideration, is simply the sum of $P(X_i \cap B)$ over all possible values of i,
$P(B) = \sum_{i=1}^{\infty} P(X_i \cap B) = \frac{1}{2} \sum_{i=1}^{\infty} \left(\frac{1}{3}\right)^{i} = \frac{1}{4}.$
(3)
We can now use Eqs. (1)–(3) to calculate the expected number of rolls
$E[N] = \sum_{i=1}^{\infty} i\, \frac{P(X_i \cap B)}{P(B)} = 2 \sum_{i=1}^{\infty} i \left(\frac{1}{3}\right)^{i} = \frac{3}{2},$
(4)
which is quite different from what was naively expected. The origin of this difference is that once we properly take the conditioning into account, we essentially eliminate all long sequences, since a long sequence has a vanishingly small probability of containing no odd numbers. The difference becomes even more striking for, say, a 1,000-sided die: the naive calculation produces $E[N] = 500$, while the correct calculation gives $E[N] = 1000/501 \approx 2$. In Appendix C, we provide some code to help the reader build intuition for this result.
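The numbers quoted above can be reproduced exactly with a few lines of Python. The following is a minimal sketch, assuming an even-sided die whose target is its highest face; the function names `conditioned_expected_rolls` and `truncated_series` are illustrative and not part of the original text. The first function uses the fact that, conditioned on B, the number of rolls is geometric; the second evaluates the sums of Eqs. (3) and (4) directly, truncated at a finite number of terms.

```python
from fractions import Fraction


def conditioned_expected_rolls(sides: int) -> Fraction:
    """Exact E[N] for an even-sided die, conditioned on all rolls being even.

    Uses P(X_i and B) = q**(i-1) / sides, where q is the probability of
    rolling an even number other than the target face `sides`.
    """
    if sides < 2 or sides % 2 != 0:
        raise ValueError("this sketch assumes an even number of sides")
    q = Fraction(sides // 2 - 1, sides)
    # Conditioned on B, N is geometric with success probability 1 - q.
    return 1 / (1 - q)


def truncated_series(sides: int, terms: int = 200) -> float:
    """Direct evaluation of Eq. (4) with the infinite sums cut at `terms`."""
    q = (sides // 2 - 1) / sides
    joint = [q ** (i - 1) / sides for i in range(1, terms + 1)]  # P(X_i and B)
    p_b = sum(joint)                                             # Eq. (3)
    return sum(i * p for i, p in enumerate(joint, start=1)) / p_b


print(conditioned_expected_rolls(6))     # 3/2
print(conditioned_expected_rolls(1000))  # 1000/501
print(truncated_series(6))
```

For a six-sided die the two approaches agree with Eq. (4), and for any even number of sides n the exact result is 2n/(n+2), which stays below 2.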

Next, we will describe an important biological process known as “facilitated diffusion,” which, intriguingly, is related to the concept of conditional probability. In fact, we will see that this mechanism shares several key properties with the “paradox” discussed above.

Transcription factors (TFs) are proteins designated to regulate gene expression in cells. In order for the regulation to be effective, the TF has to find its target, a segment located somewhere along the DNA that is extremely short compared with the entire length of the DNA, in tens of seconds or less. The root of this requirement is related to the typical time scales of other processes taking place within the cell, notably the cell's doubling time, which is often on the order of tens of minutes. TFs move by diffusion. An order-of-magnitude estimate for the search time via either 1D or 3D diffusion is given, for example, in Ref. 3. The estimate for 1D diffusion is obtained using the 1D diffusion constant and the total length of the DNA:
$t_{\text{search}} \sim \frac{L^2}{D_{1D}},$
(5)
with L being the total DNA length and $D_{1D}$ the 1D diffusion coefficient. For typical values relevant for bacteria, $L = 10^6\,\text{nm}$ and $D_{1D} = 10^4$–$10^5\,\text{nm}^2/\text{s}$ (taken from Refs. 3 and 4), this results in a search time of thousands of hours. The estimate for 3D diffusion is as follows:
$t_{\text{search}} \sim \frac{V}{D_{3D}\, r},$
(6)
with V being the volume restricting the TF and its target, $D_{3D}$ the 3D diffusion coefficient, and r the typical spatial size of the target. It can be obtained by dimensional analysis, assuming that the mean search time has to be inversely proportional to the concentration of searchers (forcing the mean search time to be proportional to the search volume) or, more rigorously, by solving a first-passage-time problem of a random walker hitting a target in a restricted volume, as found in Refs. 5 and 6. For typical values relevant for bacteria, $V = 10^9\,\text{nm}^3$, $D_{3D} = 10^7\,\text{nm}^2/\text{s}$, and $r = 0.34\,\text{nm}$ (taken from Refs. 3 and 4), this results in a search time of hundreds of seconds. This result is much better than the 1D case, but in reality it is likely that the protein would spend a finite length of time bound to non-target sites, which would further increase the value of $t_{\text{search}}$, making this estimate optimistic (Ref. 7).
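As a quick arithmetic check of the two order-of-magnitude estimates, Eqs. (5) and (6) can be evaluated with the typical bacterial values quoted in the text. This is only a sketch; the function names are illustrative, and the numbers are scaling estimates rather than precise predictions.

```python
def search_time_1d(length_nm: float, d1_nm2_per_s: float) -> float:
    """Eq. (5): t_search ~ L^2 / D_1D, in seconds."""
    return length_nm ** 2 / d1_nm2_per_s


def search_time_3d(volume_nm3: float, d3_nm2_per_s: float, target_nm: float) -> float:
    """Eq. (6): t_search ~ V / (D_3D r), in seconds."""
    return volume_nm3 / (d3_nm2_per_s * target_nm)


# Typical values quoted in the text (Refs. 3 and 4).
for d1 in (1e4, 1e5):
    hours = search_time_1d(1e6, d1) / 3600.0
    print(f"1D search, D_1D = {d1:.0e} nm^2/s: about {hours:.0f} hours")

seconds = search_time_3d(1e9, 1e7, 0.34)
print(f"3D search: about {seconds:.0f} seconds")
```

The 1D estimate comes out at thousands to tens of thousands of hours, while the 3D estimate is roughly 300 seconds, in line with the statements above.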

In principle, the TF search could have been executed by first attaching to the DNA at some point, followed by processive motion until the target is found. For velocities large enough, this could have led to a superior search mechanism. However, such motion inevitably consumes energy (since it breaks detailed balance). For this or other reasons, this solution was not chosen by Nature. Note that other biological scenarios do involve the processive motion of proteins in 1D, see, e.g., Ref. 8.

Facilitated diffusion is a mechanism in which the TF performs 1D diffusion along the DNA and at any given moment can fall off and perform 3D diffusion until it reattaches to the DNA at some random point along it (a 3D excursion). The randomness of the reattachment point is a key assumption that allows significant simplifications of the mathematical description of the mechanism. The 1D diffusion along the DNA and the 3D excursions alternate until the TF hits its target. This model is well studied (see Refs. 3, 4, and 9 for reviews), and empirical evidence supporting it was found in bacteria (Ref. 10). Figure 1 shows a cartoon illustrating this mechanism. As we shall see, the dependence of the search time on the underlying microscopic parameters (diffusion constants and dimensions) is fundamentally different from that of the 1D and 3D models discussed above. For a broad parameter regime, it can lead to a significant speed-up of the search.

Fig. 1.

Cartoon illustrating the facilitated diffusion search mechanism. The TF (circle) performs 1D diffusion along the DNA (solid line). It then falls off, performs 3D diffusion, and lands somewhere else along the DNA (a 3D diffusion excursion), which could get it closer to its target (square) or further away from it. This process repeats itself until the TF finds the target.

We will now show a mathematical description of this model that allows a significant speed-up compared with the results of Eqs. (5) and (6). Namely, we'll arrive at a search time that scales like $L$ instead of $L^2$. The key principle behind the success of this mechanism is analogous to what we described earlier when discussing the dice “paradox”: falling off during a 1D search attempt and retrying at a random point along the DNA resembles the “throwing away” of long die-rolling sequences. Namely, the probability of falling off is analogous to the probability of getting an odd number. Note that in the dice problem, once we get an odd number we “reset” the counter, while in the case of facilitated diffusion failed attempts do contribute to the total time. However, in both cases, the “resetting” allows us to avoid lengthy runs and, thus, shortens the mean first passage time (MFPT). Furthermore, in both cases, we see the dramatic effects of the a priori benign conditioning process: in the dice problem, it leads to an MFPT always smaller than 2 (for any number of faces of the die). In the facilitated diffusion problem, conditioning on a successful search leads to an MFPT that is linear in the distance from the target, rather than quadratic as we might expect intuitively.

We now present the mathematical description of the facilitated diffusion mechanism. We start from a microscopic point of view, then move on to a continuum description and by properly taking the conditional probability into account we arrive at a description of the full search process. Our results are closely related to the extensive analytical results obtained in Ref. 11, even though our derivation is more elementary and focuses on different aspects of the mathematical description of the mechanism.

We start by solving the one-dimensional problem of a TF hitting the target while being bound to the DNA. Afterwards, we'll take the 3D excursions into account.

As a “warm-up,” we'll start with a time-independent problem: the probability $\tilde{p}(x)$ that a TF starting at distance x from the target hits it before falling off. Hereinafter, we mark a quantity with a tilde when it describes a discrete function; we use the same notation for its continuous counterpart, albeit with the tilde omitted. We can write the recursion relation for $\tilde{p}(x)$ as follows:
$\tilde{p}(x) = (1-\gamma)\, \frac{\tilde{p}(x+\delta x) + \tilde{p}(x-\delta x)}{2},$
(7)
where γ is the probability of falling off at any step and δx is the step size. This recursion relation holds since at each step the TF falls off with probability γ, and otherwise has a probability of 1/2 of moving to either of the neighboring sites. In Eq. (7), we neglect the effect of having finite boundaries: given that the DNA is long compared with the size of the TF, we may, without significantly affecting the results, solve the problem on an infinite domain, $x \in (-\infty, \infty)$, while assuming the target is at x = 0.
Subtracting $\tilde{p}(x)$ from both sides of Eq. (7) and multiplying by $2/(\delta x)^2$, we arrive at the following:
$0 = \frac{\tilde{p}(x+\delta x) - 2\tilde{p}(x) + \tilde{p}(x-\delta x)}{(\delta x)^2} - \frac{\gamma}{\delta t}\, \frac{2\delta t}{(\delta x)^2}\, \frac{\tilde{p}(x+\delta x) + \tilde{p}(x-\delta x)}{2},$
(8)
where we also multiplied the second term by $1 = 2\delta t / 2\delta t$, δt being the time step. We now take the continuum limit, $\delta x \to 0$, $\delta t \to 0$, and $\gamma \to 0$, while defining $D \equiv (\delta x)^2 / 2\delta t$ and $\Gamma \equiv \gamma / \delta t$, both of which are held fixed. This brings us to the following ordinary differential equation:
$\frac{d^2 p(x)}{dx^2} = \frac{\Gamma}{D}\, p(x),$
(9)
whose solution is
$p(x) = \exp\left(-\sqrt{\frac{\Gamma}{D}}\, |x|\right).$
(10)
Equation (9) has another solution, $\exp\left(\sqrt{\Gamma/D}\, |x|\right)$. However, this solution is unphysical since it predicts that the probability of hitting the target prior to falling off is greater for more distant starting points.
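The continuum limit can be checked against the lattice problem itself. Substituting the trial form $\tilde{p}(n\,\delta x) = \lambda^{|n|}$ into Eq. (7) gives a quadratic equation for λ, and in lattice units the exponent of Eq. (10) becomes $\sqrt{\Gamma/D}\,|x| = \sqrt{2\gamma}\,|n|$. The sketch below compares the two; the trial form and the function names are assumptions made for this illustration.

```python
import math


def discrete_hit_probability(n_sites: int, gamma: float) -> float:
    """Lattice solution of Eq. (7) with p(0) = 1, of the form lam**|n|.

    Substituting p(n) = lam**n into Eq. (7) gives
    (1 - gamma) * lam**2 - 2 * lam + (1 - gamma) = 0;
    the root with lam < 1 is the decaying (physical) one.
    """
    a = 1.0 - gamma
    lam = (1.0 - math.sqrt(1.0 - a * a)) / a
    return lam ** abs(n_sites)


def continuum_hit_probability(n_sites: int, gamma: float) -> float:
    """Eq. (10) in lattice units: sqrt(Gamma/D) |x| = sqrt(2 gamma) |n|."""
    return math.exp(-math.sqrt(2.0 * gamma) * abs(n_sites))


for gamma in (1e-1, 1e-2, 1e-4):
    n = round(1.0 / math.sqrt(2.0 * gamma))  # start about one decay length away
    print(gamma, discrete_hit_probability(n, gamma), continuum_hit_probability(n, gamma))
```

As γ decreases, the lattice result approaches the exponential of Eq. (10), as expected from the continuum limit.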
This result will be useful later on. For now, we wish to solve a time-dependent problem that will allow us to calculate the time “spent” on 1D search attempts: the first passage time (FPT) problem. Analogously to what we did for $\tilde{p}(x)$, we can write a recursion relation for $\tilde{g}(x,t)$, which can stand for either of the following two: the probability distribution for a TF to be at x at time t, or the probability that the TF would hit the target by time t given that it started at position x (see Ref. 12 for additional explanation):
$\tilde{g}(x,t) = (1-\gamma)\, \frac{\tilde{g}(x+\delta x, t-\delta t) + \tilde{g}(x-\delta x, t-\delta t)}{2}.$
(11)
Similar to what was done in Eqs. (8) and (9), we can take the continuum limit and arrive at the following PDE:
$\frac{\partial g(x,t)}{\partial t} = D\, \frac{\partial^2 g(x,t)}{\partial x^2} - \Gamma\, g(x,t).$
(12)
Depending on which of the two interpretations listed above is used, Eq. (12) is either a Fokker-Planck equation (also known as the Kolmogorov Forward equation) or the Kolmogorov Backward equation; see Refs. 13–16. This is a special case where both the forward problem and the backward problem are described by the same equation, which is not generally the case.

Equation (12) is analytically solvable. Even so, this solution is of little significance given that the search mechanism is not “measured” at the level of a single search attempt but, as we shall see later, at the level of numerous search attempts. In other words, what we will be interested in is the MFPT associated with Eq. (12) (not to be confused with the MFPT of the entire facilitated diffusion process, which we calculate in Sec. III B). For the sake of completeness, we present the calculation of the FPT distribution in Appendix A.

The MFPT of Eq. (12), which we shall denote as $T(x)$, can be derived directly from Eq. (12), as shown in Appendix B. Here, we shall present a different derivation, which is very similar to how Eqs. (9) and (12) were derived. We write the following recursion relation for the discrete counterpart $\tilde{T}(x)$:
$\tilde{T}(x) = \delta t\, \tilde{p}(x) + (1-\gamma)\, \frac{\tilde{T}(x+\delta x) + \tilde{T}(x-\delta x)}{2},$
(13)
where the $\delta t\, \tilde{p}(x)$ term is a bit subtle: since $T(x)$ includes the average time of hitting the target over all the trajectories that eventually get there, when advancing a time step δt, the time contributed to $T(x)$ is the product of δt and the probability of actually hitting the target. Taking the continuum limit of Eq. (13) in a manner similar to what was done in Eqs. (8) and (9) gives
$D\, \frac{d^2 T(x)}{dx^2} - \Gamma\, T(x) + p(x) = 0.$
(14)
We arrived at a linear ordinary differential equation. The boundary conditions for Eq. (14) are $T(0) = T(x \to \infty) = 0$. The former is a direct result of the definition of $T(x)$ as a mean first passage time. For the latter, we reiterate that there is a non-zero probability that the TF will fall off before reaching the target, and that $T(x)$ receives contributions only from successful trajectories, which become exceedingly rare as $x \to \infty$ since, as shown in Eq. (10), the probability of a successful trajectory initiated at x decays exponentially with $|x|$.
Solving Eq. (14) is possible using a well-known method called variation of parameters (or variation of constants) leading to
$T(x) = \frac{|x|}{2\sqrt{\Gamma D}} \exp\left(-\sqrt{\frac{\Gamma}{D}}\, |x|\right).$
(15)
For further reading on the variation of parameters, we recommend consulting Ref. 17. Finally, we introduce the conditioning whereby we disregard any runs which resulted in the TF falling off the DNA before it found the target. Referring to Eq. (1), we understand that in order to properly include the conditioning we should divide the result of Eq. (15) by the probability of success itself, $p(x)$. We arrive at
$\hat{T}(x) = \frac{|x|}{2\sqrt{\Gamma D}}.$
(16)
This result provides another example for the “power” of the conditioning; while time is usually related to distance squared in diffusion processes, we get a linear relation. This is a key property of this mechanism that ultimately allows the linear relation in the final result.
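The linear scaling of Eq. (16) can be illustrated by simulating the lattice walk of Eq. (7) directly. The following Monte Carlo sketch is an illustration under stated assumptions rather than part of the original derivation: it works in lattice units ($\delta x = \delta t = 1$, so that $D = 1/2$ and $\Gamma = \gamma$), starts every walker at a nonzero site, and uses the illustrative function name `simulate_attempts`. In these units Eq. (10) reads $\exp(-\sqrt{2\gamma}\, n)$ and Eq. (16) reads $n/\sqrt{2\gamma}$.

```python
import numpy as np


def simulate_attempts(start_site, gamma, walkers, seed=0):
    """Simulate 1D search attempts on a lattice with the target at site 0.

    At each step a walker falls off with probability gamma; otherwise it
    hops to one of its two neighbours with probability 1/2, as in Eq. (7).
    Returns the first-passage times (in steps) of the successful walkers
    and the fraction of walkers that reached the target.
    Assumes start_site != 0.
    """
    rng = np.random.default_rng(seed)
    pos = np.full(walkers, start_site, dtype=np.int64)
    steps = np.zeros(walkers, dtype=np.int64)
    active = np.ones(walkers, dtype=bool)
    hit = np.zeros(walkers, dtype=bool)
    while active.any():
        idx = np.flatnonzero(active)
        fell = rng.random(idx.size) < gamma               # fall-off events
        hops = 2 * rng.integers(0, 2, size=idx.size) - 1  # -1 or +1
        movers = idx[~fell]
        pos[movers] += hops[~fell]
        steps[movers] += 1
        active[idx[fell]] = False                         # failed attempts stop
        arrived = movers[pos[movers] == 0]
        hit[arrived] = True
        active[arrived] = False                           # successful attempts stop
    return steps[hit], hit.mean()


gamma, n = 0.02, 10
times, p_hit = simulate_attempts(n, gamma, walkers=20000)
print("hit probability:", p_hit, "Eq. (10):", np.exp(-np.sqrt(2 * gamma) * n))
print("conditional MFPT:", times.mean(), "Eq. (16):", n / np.sqrt(2 * gamma))
```

Averaging only over the walkers that reached the target is exactly the conditioning discussed above; repeating the run for several starting sites should show the mean time growing roughly linearly with n, up to statistical noise and small lattice corrections.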

Up to this point we've only considered a single 1D search attempt. Taking the 3D excursions into account, we are able to model the whole process. We will do so by assuming a finite DNA of length 2L with the TF target at its center and that the reattachment point after a 3D excursion is distributed uniformly on the DNA. Moreover, we assume that the search starts with the TF bound to the DNA at some random point along it; relaxing the assumption is inconsequential.

In order to write down the expression for the overall search time, we use the following definitions (some were mentioned earlier): $p(x_i)$ is the probability to hit the target in the ith 1D search attempt, $t_{1D}(x_i)$ is the time spent on the ith attempt assuming it was successful while $t_{1D}^{f}(x_i)$ denotes the time assuming that it wasn't, all conditioned on starting from position $x_i$, and $t_{3D}^{i}$ is the time spent on the 3D excursion that follows the ith 1D search attempt, given that this attempt has failed. Attempts are counted starting from i = 0. The search time then follows:
$T = p(x_0)\, t_{1D}(x_0) + (1 - p(x_0))\, p(x_1) \left( t_{1D}(x_1) + t_{1D}^{f}(x_0) + t_{3D}^{0} \right) + (1 - p(x_0))(1 - p(x_1))\, p(x_2) \left( t_{1D}(x_2) + t_{1D}^{f}(x_0) + t_{1D}^{f}(x_1) + t_{3D}^{0} + t_{3D}^{1} \right) + \cdots = \sum_{i=0}^{\infty} p(x_i) \left( t_{1D}(x_i) + \sum_{j=0}^{i-1} \left( t_{1D}^{f}(x_j) + t_{3D}^{j} \right) \right) \prod_{j=0}^{i-1} (1 - p(x_j)).$
(17)
Taking the mean of Eq. (17) over the binding positions $x_i$ (assumed to be uniformly distributed), we arrive at the following:
$\langle T \rangle = \sum_{i=0}^{\infty} \Big( \langle p(x)\, t_{1D}(x) \rangle (1 - \langle p(x) \rangle)^{i} + i\, \langle p(x) \rangle \langle t_{1D}^{f}(x) (1 - p(x)) \rangle (1 - \langle p(x) \rangle)^{i-1} + i\, \langle p(x) \rangle \langle t_{3D} \rangle (1 - \langle p(x) \rangle)^{i} \Big) = \frac{\langle p(x)\, t_{1D}(x) \rangle}{\langle p(x) \rangle} + \frac{\langle (1 - p(x))\, t_{1D}^{f}(x) \rangle}{\langle p(x) \rangle} + \langle t_{3D} \rangle \frac{1 - \langle p(x) \rangle}{\langle p(x) \rangle},$
(18)
and calculating all the means will bring us to
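$\langle T \rangle = \frac{1 + \Gamma \langle t_{3D} \rangle}{1 - e^{-\sqrt{\Gamma/D}\, L}}\, \frac{L}{\sqrt{D \Gamma}} - \left( \frac{1}{\Gamma} + \langle t_{3D} \rangle \right) \approx \frac{L}{\sqrt{D \Gamma}} + L \sqrt{\frac{\Gamma}{D}}\, \langle t_{3D} \rangle.$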
(19)

We see that the search time now scales linearly with L instead of quadratically. As we mentioned before, this is reminiscent of the dice “paradox”: we do not “keep” long and unsuccessful sequences. Moreover, if we optimize the search time with respect to Γ, we arrive at the neat conclusion that the optimal search time is obtained for $\Gamma = 1/\langle t_{3D} \rangle$; the TF then spends half of its time in 1D and half in 3D, with an overall search time that scales as $L \sqrt{\langle t_{3D} \rangle / D}$. Using typical values for bacteria, $2L = 10^6\,\text{nm}$, $D = 10^4$–$10^5\,\text{nm}^2/\text{s}$, and $\langle t_{3D} \rangle = 10^{-4}\,\text{s}$, as can be found, for example, in Refs. 3 and 4, we indeed arrive at an overall search time of the order of tens of seconds.
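To make the optimization concrete, Eq. (19) can be evaluated numerically and minimized over the fall-off rate with a simple grid scan. This is a sketch: the function names and the grid are choices made for this illustration, and the parameter values are the typical bacterial values quoted above with $D = 10^5\,\text{nm}^2/\text{s}$. Setting $\Gamma = 1/\langle t_{3D} \rangle$ in the approximate form of Eq. (19) makes its two terms equal, giving $2L\sqrt{\langle t_{3D} \rangle / D}$.

```python
import math


def mean_search_time(half_length, d1, fall_rate, t3d):
    """Exact form of Eq. (19); the DNA has length 2L and `half_length` is L."""
    k = math.sqrt(fall_rate / d1)
    prefactor = (1.0 + fall_rate * t3d) / (1.0 - math.exp(-k * half_length))
    return (prefactor * half_length / math.sqrt(d1 * fall_rate)
            - (1.0 / fall_rate + t3d))


def approx_optimal_time(half_length, d1, t3d):
    """Approximate form of Eq. (19) evaluated at Gamma = 1 / <t_3D>."""
    return 2.0 * half_length * math.sqrt(t3d / d1)


half_length = 5e5  # nm, so that 2L = 1e6 nm
d1 = 1e5           # nm^2/s
t3d = 1e-4         # s

# Scan the fall-off rate on a logarithmic grid from 1e2 to 1e6 per second.
rates = [10 ** (2 + 0.01 * i) for i in range(401)]
best_rate = min(rates, key=lambda g: mean_search_time(half_length, d1, g, t3d))

print("optimal fall-off rate (1/s):", best_rate)
print("search time at the optimum (s):", mean_search_time(half_length, d1, best_rate, t3d))
print("2 L sqrt(<t_3D> / D) (s):", approx_optimal_time(half_length, d1, t3d))
```

With these values the minimum sits at a fall-off rate close to $1/\langle t_{3D} \rangle = 10^4\,\text{s}^{-1}$ and the search time is a few tens of seconds.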

In this paper, we revisited a well-known and well-studied mechanism for how TFs search for their target genes. We showed how the mathematical description of the mechanism naturally utilizes the basic concept of conditional probability.

The facilitated diffusion problem we discussed here is mathematically related to the class of problems of first passage time under restart. For these problems, one is interested in the first passage time of a random walker, with a rate to “reset” the particle, typically to a particular site (in contrast to the random resetting encountered in the facilitated diffusion problem). Intriguingly, for these problems, there is an optimal restart rate that can speed up the search dramatically. Reference 18 studies the statistical properties of a searcher absorption by a static target under constant rate resetting of the searcher position. A generic treatment of first passage under resetting is given in Ref. 19, and a review thoroughly studying different cases and generalizations of the resetting time is found in Ref. 20. Such processes are deeply related to the inspection paradox of probability theory, where a sampling bias may distort the statistics in counter-intuitive ways. For instance, in a famous example of this paradox, the average waiting time for a bus a person measures when they arrive at the bus station at some random, uniformly distributed, time is greater than the average time between consecutive buses. In the case of heavy-tailed distributions, in fact, the former can be infinite even when the latter is finite! Resetting overcomes this sampling bias and in some cases may even shorten the waiting time compared with the distribution's mean. A review studying the relations between stochastic processes under resetting and the inspection paradox is found in Ref. 21. This study also characterizes the processes where resetting will enable a speed-up compared with a simple mean of the distribution.

While the facilitated diffusion mechanism is a powerful mechanism for shortening the search time, there are both extensions to this mechanism and other, completely different, mechanisms worth mentioning. Still within the framework of facilitated diffusion, one may take into account the dynamics of the DNA molecule itself, as discussed in Ref. 22, or the energy landscape the TF experiences while moving along the DNA, as discussed in the reviews of Refs. 3, 4, and 23. Recently, Ref. 24 related diffusion on such a disordered landscape to the phenomenon of Anderson localization and discussed its implications for facilitated diffusion. Another intriguing work directly related to facilitated diffusion is given in Ref. 25, where the authors use a facilitated diffusion-based model to study the architecture of bacterial genomes.

Although facilitated diffusion is fit to describe the search mechanism of some TFs in bacteria, other types of TFs (and proteins in general) could have a significantly different structure leading to inherently different dynamics. Reference 26 discusses a protein extended in one dimension in a manner that enables it to interact with many sequences along the DNA in parallel, which effectively reduces the dimensionality of the search (from three to two dimensions) and causes a remarkable speed-up of the search process, distinct from the mechanism we explored here. The concept of dimensionality reduction in biological search processes dates to the seminal work of Delbrück and Adam in 1968, Ref. 27. This work was recently revisited and extended (Ref. 28). Another distinct example is given in Refs. 29–31, which discuss the search mechanism TFs use in eukaryotic cells. In this case, the TFs often have a long polymeric tail called the Intrinsic Disordered Region that plays a major role in the search, though the theoretical framework for this scenario has yet to be developed.

The authors thank Wencheng Ji, Naama Barkai, Yariv Kafri, Ulrich Gerland, Shlomi Reuveni, Sarah Kostinski, and Raphael Voituriez for helpful discussions and comments.

The authors have no conflicts to disclose.

In the following, we calculate the FPT distribution, as opposed to the MFPT calculated in the main text.

For convenience, we rewrite Eq. (12) governing the dynamics of $g(x,t)$, the probability that the TF would hit the target by time t given that it started at position x,
$\frac{\partial g(x,t)}{\partial t} = D\, \frac{\partial^2 g(x,t)}{\partial x^2} - \Gamma\, g(x,t).$
(A1)
This equation is supplemented by the initial condition $g(x,0) = 0$ and the boundary condition $g(0,t) = 1$. To proceed, as in many FPT problems (Ref. 6), we Laplace transform the equation. Denoting $\mathcal{L}[g(x,t)] \equiv G(x,s)$, we obtain
$s\, G(x,s) = D\, \frac{\partial^2}{\partial x^2} G(x,s) - \Gamma\, G(x,s).$
(A2)
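Relying on the Laplace transform of the boundary condition, $G(0,s) = \mathcal{L}[1] = 1/s$, and requiring that the solution decay as $|x| \to \infty$, we find that the solution of Eq. (A2) is
$G(x,s) = \frac{1}{s} \exp\left(-\sqrt{\frac{s+\Gamma}{D}}\, |x|\right).$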
(A3)
Performing the inverse Laplace transform produces the solution for $g ( x , t )$,
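$g(x,t) = \mathcal{L}^{-1}\left[ \frac{1}{s} \exp\left(-\sqrt{\frac{s+\Gamma}{D}}\, |x|\right) \right] = \frac{1}{2} \exp\left(-\sqrt{\frac{\Gamma}{D}}\, |x|\right) \left[ 1 + \operatorname{erf}\left( \frac{2\sqrt{\Gamma D}\, t - |x|}{2\sqrt{D t}} \right) + \exp\left( 2\sqrt{\frac{\Gamma}{D}}\, |x| \right) \operatorname{erfc}\left( \frac{2\sqrt{\Gamma D}\, t + |x|}{2\sqrt{D t}} \right) \right].$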
(A4)
Note that this is not technically a cumulative distribution function (CDF) since $g(x, t \to \infty) = \exp\left(-\sqrt{\Gamma/D}\, |x|\right)$, consistent with the result of Eq. (10): there is, of course, a non-vanishing probability of never hitting the target. If, on the other hand, we look at the probability to hit the target by time t conditioned on eventually hitting it, the corresponding CDF is
$\hat{g}(x,t) = \frac{1}{2} \left[ 1 + \operatorname{erf}\left( \frac{2\sqrt{\Gamma D}\, t - |x|}{2\sqrt{D t}} \right) + \exp\left( 2\sqrt{\frac{\Gamma}{D}}\, |x| \right) \operatorname{erfc}\left( \frac{2\sqrt{\Gamma D}\, t + |x|}{2\sqrt{D t}} \right) \right].$
(A5)
From this result, we can obtain the MFPT
$T(x) = \int_0^{\infty} t\, \frac{\partial}{\partial t} \hat{g}(x,t)\, dt = \frac{x}{2\sqrt{\Gamma D}},$
(A6)
which reproduces the result of Eq. (16).
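The integral in Eq. (A6) can also be checked numerically. Integrating by parts, and using $\hat{g}(x, t \to \infty) = 1$, the MFPT equals $\int_0^{\infty} [1 - \hat{g}(x,t)]\, dt$, which is convenient for a simple trapezoidal rule. The sketch below uses only the Python standard library; the function names, the cutoff `t_max`, and the step `dt` are assumptions made for this illustration.

```python
import math


def conditioned_fpt_cdf(x, t, d, fall_rate):
    """Conditioned FPT distribution of Eq. (A5)."""
    if t <= 0.0:
        return 0.0
    ax = abs(x)
    denom = 2.0 * math.sqrt(d * t)
    drift = 2.0 * math.sqrt(fall_rate * d) * t
    k = math.sqrt(fall_rate / d)
    return 0.5 * (1.0
                  + math.erf((drift - ax) / denom)
                  + math.exp(2.0 * k * ax) * math.erfc((drift + ax) / denom))


def conditioned_mfpt(x, d, fall_rate, t_max=60.0, dt=1e-3):
    """Trapezoidal estimate of the integral of 1 - g_hat over [0, t_max]."""
    n = int(round(t_max / dt))
    vals = [1.0 - conditioned_fpt_cdf(x, i * dt, d, fall_rate) for i in range(n + 1)]
    return dt * (sum(vals) - 0.5 * (vals[0] + vals[-1]))


x, d, fall_rate = 1.0, 1.0, 1.0
print("numerical MFPT:", conditioned_mfpt(x, d, fall_rate))
print("Eq. (A6):      ", x / (2.0 * math.sqrt(fall_rate * d)))
```

The cutoff should be many multiples of $1/\Gamma$ so that the neglected tail of the integrand is negligible; for the values used here the two printed numbers should agree closely.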

In the main text, we derived the equation for the MFPT using the appropriate recursion relation following fundamental principles. In the following, we shall present an alternative derivation starting from the equation for the FPT, namely, Eq. (12) (also presented in the previous appendix as Eq. (A1)).

Since $g(x,t)$ corresponds to the probability to hit the target by time t given that we start at x, the MFPT is obtainable from $g(x,t)$ as
$T(x) = \int_0^{\infty} t\, \frac{\partial g(x,t)}{\partial t}\, dt.$
(B1)

Note that we expect this time to be finite even on an infinite domain, since the finite fall-off rate would prevent the mean time from diverging—in contrast, for example, to the diverging MFPT associated with normal random walks in 1D (we have also shown this directly from the FPT distribution in the previous appendix).

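If we act on Eq. (A1) with $\int_0^{\infty} dt\, t\, \partial/\partial t$, we arrive at
$\int_0^{\infty} t\, \frac{\partial^2 g(x,t)}{\partial t^2}\, dt = D \int_0^{\infty} t\, \frac{\partial}{\partial t} \frac{\partial^2 g(x,t)}{\partial x^2}\, dt - \Gamma \int_0^{\infty} t\, \frac{\partial g(x,t)}{\partial t}\, dt.$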
(B2)
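The RHS is simply $D\, \partial^2 T(x)/\partial x^2 - \Gamma\, T(x)$, whereas, using integration by parts, the LHS reads
$\int_0^{\infty} t\, \frac{\partial^2 g(x,t)}{\partial t^2}\, dt = \left. t\, \frac{\partial g(x,t)}{\partial t} \right|_0^{\infty} - \int_0^{\infty} \frac{\partial g(x,t)}{\partial t}\, dt = -\left. g(x,t) \right|_0^{\infty} = -\exp\left(-\sqrt{\frac{\Gamma}{D}}\, |x|\right).$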
(B3)
Together, we obtain the following equation for the MFPT:
$D\, \frac{\partial^2 T(x)}{\partial x^2} - \Gamma\, T(x) + \exp\left(-\sqrt{\frac{\Gamma}{D}}\, |x|\right) = 0,$
(B4)
reproducing Eq. (14) obtained directly using the recursion relation.
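As a sanity check, one can verify numerically that the solution quoted in Eq. (15) satisfies Eq. (B4), or equivalently Eq. (14), away from the target. The sketch below approximates the second derivative with a central finite difference; the function names, the step `h`, and the parameter values are arbitrary choices for this illustration. The check applies only for $|x| > h$, since $T(x)$ has a kink at the target.

```python
import math


def unconditioned_mfpt(x, d, fall_rate):
    """Eq. (15): T(x) = |x| / (2 sqrt(Gamma D)) * exp(-sqrt(Gamma / D) |x|)."""
    k = math.sqrt(fall_rate / d)
    return abs(x) / (2.0 * math.sqrt(fall_rate * d)) * math.exp(-k * abs(x))


def ode_residual(x, d, fall_rate, h=1e-3):
    """Left-hand side of Eq. (B4), with T'' from a central finite difference."""
    t_xx = (unconditioned_mfpt(x + h, d, fall_rate)
            - 2.0 * unconditioned_mfpt(x, d, fall_rate)
            + unconditioned_mfpt(x - h, d, fall_rate)) / h ** 2
    source = math.exp(-math.sqrt(fall_rate / d) * abs(x))
    return d * t_xx - fall_rate * unconditioned_mfpt(x, d, fall_rate) + source


for x in (0.5, 1.0, 2.0, -1.5):
    print(x, ode_residual(x, d=2.0, fall_rate=3.0))
```

The printed residuals should be close to zero, limited only by the finite-difference error.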

The following Python script demonstrates the result of the dice paradox and can help the reader develop intuition for it. There are two input parameters: “sides” is the number of sides of the die (assumed to be even, since the target is the face “sides”) and “runs” is the number of runs included in the estimation of the expected value.

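    import numpy as np

    np.random.seed(42)
    sides = 1000  # number of sides of the die (even); the target face is "sides"
    runs = 1000   # number of runs used to estimate the expected values
    T_w_cond = 0   # mean number of rolls when accounting for the conditioning
    T_wo_cond = 0  # mean number of rolls without the conditioning
    for i in range(runs):
        hit_tgt = False
        t_w_cond = 0
        t_wo_cond = 0
        while not hit_tgt:
            t_w_cond = t_w_cond + 1
            t_wo_cond = t_wo_cond + 1
            roll = np.random.randint(1, sides + 1)  # uniform on 1..sides
            if roll % 2 != 0:
                # An odd roll disqualifies the series: restart the count.
                t_w_cond = 0
            if roll == sides:
                hit_tgt = True
        # Accumulate the running means once the target has been rolled.
        T_w_cond = T_w_cond + t_w_cond / runs
        T_wo_cond = T_wo_cond + t_wo_cond / runs
    print('Number of rolls w/o accounting for conditioning:', str(T_wo_cond))
    print('Number of rolls when accounting for conditioning:', str(T_w_cond))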

For each run, the script “rolls” the die until it lands on “sides” while keeping track of the total number of rolls and of the number of rolls since the last odd value; the latter is reset whenever an odd value is rolled. The script ends by printing the number of rolls without taking the conditioning into account (which we expect to be about the same as the input “sides”) and the number of rolls when taking the conditioning into account.

1. G. Kalai, Combinatorics and more - Gil Kalai's blog (2017), <https://gilkalai.wordpress.com/2017/09/08/>.
2. J. Jin, “Elchanan Mossel's dice problem” (2018), <http://www.yichijin.com/files/elchanan.pdf>.
3. M. Sheinman, O. Bénichou, Y. Kafri, and R. Voituriez, “Classes of fast and specific search mechanisms for proteins on DNA,” Rep. Prog. Phys. 75, 026601 (2012).
4. L. Mirny, M. Slutsky, Z. Wunderlich, A. Tafvizi, J. Leith, and A. Kosmrlj, “How a protein searches for its site on DNA: The mechanism of facilitated diffusion,” J. Phys. A: Math. Theor. 42, 434013 (2009).
5. H. C. Berg, Random Walks in Biology (Princeton U. P., Princeton, 1993).
6. S. Redner, A Guide to First-Passage Processes (Cambridge U. P., Cambridge, 2001).
7. M. Sheinman and Y. Kafri, “The effects of intersegmental transfers on target location by proteins,” Phys. Biol. 6, 016003 (2009).
8. S. M. Block, “Kinesin motor mechanics: Binding, stepping, tracking, gating, and limping,” Biophysical J. 92, 2986–2995 (2007).
9. O. Bénichou, C. Loverdo, M. Moreau, and R. Voituriez, “Intermittent search strategies,” Rev. Mod. Phys. 83, 81–129 (2011).
10. P. Hammar, P. Leroy, A. Mahmutovic, E. G. Marklund, O. G. Berg, and J. Elf, “The lac repressor displays facilitated diffusion in living cells,” Science 336, 1595–1598 (2012).
11. M. Coppey, O. Bénichou, R. Voituriez, and M. Moreau, “Kinetics of target site localization of a protein on DNA: A stochastic approach,” Biophysical J. 87, 1640–1649 (2004).
12. The interpretation as an equation for the probability distribution to be at x at time t is straightforward by looking at Eq. (11): the probability to be at x at time t is the probability that at $t - \delta t$ we were at either $x - \delta x$ or $x + \delta x$, times $1/2$ and times the probability that the TF did not fall off. The interpretations as an equation for the probability that the TF would either hit the target by time t or at time t, given that it started at position x, follow the same reasoning: the probability that the TF started at x and took $t/\delta t$ steps to hit the target should be the same as starting at $x \pm \delta x$ and arriving after $(t/\delta t) - 1$ steps, times the probability of taking one more step (and taking into account the chance of falling off).
13. S. Karlin and H. M. Taylor, A First Course in Stochastic Processes (Cambridge, 1975).
14. C. W. Gardiner, Handbook of Stochastic Methods for Physics, Chemistry and the Natural Sciences (Springer, Berlin, 2003).
15. N. G. van Kampen, Stochastic Processes in Physics and Chemistry (Elsevier Science & Technology Books, Amsterdam, 2007).
16. A. Amir, Thinking Probabilistically: Stochastic Processes, Disordered Systems, and Their Applications (Cambridge U. P., Cambridge, 2021).
17. G. Teschl, Ordinary Differential Equations and Dynamical Systems (American Mathematical Society, Rhode Island, 2012).
18. M. R. Evans and S. N. Majumdar, “Diffusion with stochastic resetting,” Phys. Rev. Lett. 106, 160601 (2011).
19. A. Pal and S. Reuveni, “First passage under restart,” Phys. Rev. Lett. 118, 030603 (2017).
20. M. R. Evans, S. N. Majumdar, and G. Schehr, “Stochastic resetting and applications,” J. Phys. A: Math. Theor. 53, 193001 (2020).
21. A. Pal, S. Kostinski, and S. Reuveni, “The inspection paradox in stochastic resetting,” J. Phys. A: Math. Theor. 55, 021001 (2022).
22. T. Schötz, R. A. Neher, and U. Gerland, “Target search on a dynamic DNA molecule,” Phys. Rev. E 84, 051911 (2011).
23. O. Bénichou, Y. Kafri, M. Sheinman, and R. Voituriez, “Searching fast for a target on DNA without falling to traps,” Phys. Rev. Lett. 103, 138102 (2009).
24. Q. Lu, D. Bhat, D. Stepanenko, and S. Pigolotti, “Search and localization dynamics of the CRISPR-Cas9 system,” Phys. Rev. Lett. 127, 208102 (2021).
25. D. Ezer, N. R. Zabet, and B., “Physical constraints determine the logic of bacterial promoter architectures,” Nucl. Acids Res. 42, 4196–4207 (2014).
26. J. Wiktor, A. H. Gynnå, P. Leroy, J., G. Coceano, I. Testa, and J. Elf, “RecA finds homologous DNA by reduced dimensionality search,” Nature 597, 426–429 (2021).
27. G. Adam and M. Delbrück, “Reduction of dimensionality in biological diffusion processes,” Structural Chem. Mol. Biol. 198, 198–215 (1968).
28. D. S. Grebenkov, R. Metzler, and G. Oshanin, “Search efficiency in the Adam–Delbrück reduction-of-dimensionality scenario versus direct diffusive search,” New J. Phys. 24, 083035 (2022).
29. S. Brodsky, T. Jana, K. Mittelman, M. Chapal, D. K. Kumar, M. Carmi, and N. Barkai, “Intrinsically disordered regions direct transcription factor in vivo binding specificity,” Mol. Cell 79, 459–471 (2020).
30. T. Jana, S. Brodsky, and N. Barkai, “Speed-specificity trade-offs in the transcription factors search for their genomic binding sites,” Trends Genet. 37, 421–432 (2021).
31. S. Brodsky, T. Jana, and N. Barkai, “Order through disorder: The role of intrinsically disordered regions in transcription factor binding specificity,” Curr. Opinion Structural Biol. 71, 110–115 (2021).
Published open access through an agreement with Inter University Center for Digital Information Services