Finite Markov chains, memoryless random walks on complex networks, appear commonly as models for stochastic dynamics in condensed matter physics, biophysics, ecology, epidemiology, economics, and elsewhere. Here, we review exact numerical methods for the analysis of arbitrary discrete- and continuous-time Markovian networks. We focus on numerically stable methods that are required to treat nearly reducible Markov chains, which exhibit a separation of characteristic timescales and are therefore ill-conditioned. In this metastable regime, dense linear algebra methods are afflicted by propagation of error in the finite precision arithmetic, and the kinetic Monte Carlo algorithm to simulate paths is unfeasibly inefficient. Furthermore, iterative eigendecomposition methods fail to converge without the use of nontrivial and system-specific preconditioning techniques. An alternative approach is provided by state reduction procedures, which do not require additional a priori knowledge of the Markov chain. Macroscopic dynamical quantities, such as moments of the first passage time distribution for a transition to an absorbing state, and microscopic properties, such as the stationary, committor, and visitation probabilities for nodes, can be computed robustly using state reduction algorithms. The related kinetic path sampling algorithm allows for efficient sampling of trajectories on a nearly reducible Markov chain. Thus, all of the information required to determine the kinetically relevant transition mechanisms, and to identify the states that have a dominant effect on the global dynamics, can be computed reliably even for computationally challenging models. Rare events are a ubiquitous feature of realistic dynamical systems, and so the methods described herein are valuable in many practical applications.

Finite Markov chains1–4 are commonly used to represent a variety of stochastic processes. They provide attractive coarse-grained representations of continuous-state models, such as the dynamics of many-particle systems,5 since the high dimensionality of the coordinate space can be preserved.6 Markov chains corresponding to a continuous-state system can be constructed using explicit simulation data to estimate a network model by maximum-likelihood7–11 or Gibbs sampling12–15 approaches.16–20 Alternatively, the energy landscape of a physical system can be mapped to a Markovian network using geometry optimization methods21 to locate the stationary points.22–29 The dynamics of the resulting Markov chain are described by a linear master equation,30–32 a system of coupled first-order ordinary differential equations (ODEs). Here, the Markov chain corresponds to a network for which the edges are parameterized by rates Kij for the i ← j transitions between nodes (in the continuous-time case) or transition probabilities Tij(τ) at a lag time τ (in the discrete-time case).33 Continuous-time Markov chains (CTMCs) are also frequently used to represent population dynamics processes, where the transitions correspond to discrete changes in the numbers of species. Each node of the network is then associated with a vector specifying the population distribution. Since the populations of species are unbounded, the state space of the network is countably infinite34,35 but can be truncated to yield a finite Markov chain with negligible error.36,37 Simple examples of such models include birth–death processes38,39 and queuing networks.40 More complex examples arise as representations of chemical41,42 and biochemical43–45 reaction cycles, gene regulatory networks,46–49 epidemic spread,50 and ecosystems.51

In realistic models for complex dynamical processes, there often exists a separation of characteristic timescales.52–66 For instance, the extinction of a species in an ecosystem takes place over a long period of time, compared to short-timescale fluctuations in the size of the population arising from individual births and deaths.51 In molecular and condensed matter systems, the underlying energy landscape typically features a disparity in the heights of energy barriers separating regions of the state space.67–71 For example, the folding of a protein to its functional structure is a relatively slow process of cooperative movement that typically requires surmounting one or more large free energy barriers, whereas unproductive local fluctuations are facile.28,29 In each of these applications, it is the rare event that is the transition of interest.

In the present work, we review exact computational methods to analyze the dynamics of arbitrary discrete- and continuous-time finite Markov chains. In particular, we are concerned with nearly reducible Markov chains,72,73 which exhibit rare event dynamics and are consequently ill-conditioned.74–79 The application of conventional dense linear algebra methods to nearly reducible Markovian networks is therefore usually prohibited by the severe propagation of error arising from the limits of numerical precision.80 Similarly, sparse iterative methods are subject to convergence issues in this rare event regime, necessitating the use of preconditioning techniques.76 The kinetic Monte Carlo81,82 (kMC) algorithm for simulation of trajectories also becomes unfeasibly inefficient, since the paths have a tendency to “flicker” within the metastable sets of nodes.83–87 We therefore focus on specialized methods that are capable of treating Markov chains featuring metastability. We consider a general Markov chain comprising the set of nodes S. To formulate the problem of analyzing a particular transition, we consider initial and absorbing (target) sets of nodes, denoted B and A, respectively. The nodes in B are a subset of the set of transient (nonabsorbing) nodes, denoted Q ≡ Ac.

Following an overview of the relevant theory of Markov chains and standard linear algebra methods for their exact analysis (Sec. II), we provide a detailed review of procedures that have superior numerical stability and are therefore recommended for application to networks exhibiting metastability. Specifically, we discuss state reduction methods that are inherently robust and do not involve preconditioning techniques. The mean first passage time (MFPT) for the A ← B transition can be computed using the graph transformation (GT) algorithm (Sec. III).88–93 The GT approach is closely related to uncoupling–coupling methods (Secs. IV A and IV B) and the Grassmann–Taksar–Heyman (GTH) algorithm (Sec. IV C) for computation of the stationary distribution. Further state reduction procedures exist to compute other macroscopic dynamical properties, such as the expected time to reach the equilibrium distribution, and higher moments of the first passage time distribution (Sec. IV D). State reduction algorithms are also available to compute microscopic dynamical properties, including the committor probabilities for nodes (Sec. V A 4), defined as the probability that a trajectory initialized at that node hits state A before state B, and other quantities that characterize the A ← B transition path ensemble (TPE),94–100 such as state visitation probabilities (Sec. IV E). Finally, we discuss efficient methods to simulate paths on a nearly reducible Markov chain. The kinetic path sampling83,84 (kPS) and Monte Carlo with absorbing Markov chain101–104 (MCAMC) algorithms provide an efficient alternative to standard kMC simulations81,82 for this purpose (Sec. V). The former method extends the GT algorithm with an iterative reverse randomization procedure to sample the time associated with a path escaping from a metastable region of the network, while the latter approach uses local eigendecompositions. By employing the specialized methods described herein, it is, in principle, possible to extract any desired dynamical information from a nearly reducible Markov chain. An overview of pertinent dynamical properties, the various methods to calculate them, and specific considerations that arise in their application, are summarized in Table I.

TABLE I.

Overview of methods to compute dynamical properties of Markov chains. Eigendecomposition, linear solution, matrix inversion, and state reduction operations scale cubically with the number of nodes, |S|. State reduction algorithms are stable since they address the propagation of numerical error that affects dense linear algebra methods and can therefore be applied to nearly reducible models. Iterative Krylov subspace methods to perform eigendecomposition (e.g., Arnoldi) or to solve linear systems [e.g., generalized minimal residual (GMRES)] can be preconditioned to improve convergence in this metastable regime, but this strategy is system-specific and nontrivial (Sec. II D). Methods that exploit sparsity (e.g., Krylov subspace and some state reduction algorithms) are necessary for analyzing large models. The efficiency of state reduction algorithms can be improved by eliminating nodes in blocks (Sec. IV A).

| Property | Methods for calculation | Comments | References |
| --- | --- | --- | --- |
| Mean first passage time for transition from node j, T_Aj | Matrix inversion (via A#) | Irreducible chains only. Stable if using REFUND. Yields T_ij ∀ i, j ∈ S. | Equation (12) |
| | Eigendecomposition | Irreducible and reversible chains only. Yields T_ij ∀ i, j ∈ S. | Equation (19) |
| | Linear solve | Yields MFPTs for transitions from all nodes to the absorbing state. | Equation (39) |
| | State reduction (GT) | Stable. Use recursive formulation to compute MFPTs for transitions from all nodes to the absorbing state, T_Aj ∀ j ∈ S. | Section III A, Ref. 112 |
| Variance of the first passage time distribution for transition from node j, V_Aj | Matrix inversion (via A#) | Irreducible chains only. Stable if using REFUND. Yields V_ij ∀ i, j ∈ S. | Equations (13) and (14) |
| | Eigendecomposition | Yields V_Aj ∀ j ∈ S. Also yields the full FPT distribution [Eq. (22)]. | Equation (24) |
| | State reduction (Dayar–Akar) | Stable. Yields variances of the FPT for transitions from all nodes to the absorbing state, V_Aj ∀ j ∈ S. | Reference 113 |
| Stationary probability distribution, π | Eigendecomposition | Stationary distribution is the dominant right eigenvector. | Section II A |
| | State reduction (GTH) | Stable. | Section IV C |
| | Exact uncoupling–coupling | Stable if metastable states chosen appropriately. Parallelizable. | Section IV A |
| | IAD | Efficient if metastable states chosen appropriately. Parallelizable. | Section IV B |
| Committor probability for node j, q_j⁺ (first hitting probability of state A) | Eigendecomposition | Can specify any number of attractors. | Equation (30) |
| | Linear solve | Two attractors only (i.e., A ← B committor probability). | Equation (29) |
| | State reduction | Stable; can specify any number of attractors. | Section V A 4 |
| Expected number of visits to node j, θ_j | Matrix inversion | Stable if using the state reduction method with the augmented matrix. | Reference 112 |
| | Linear solve | N.B. the result for reactive paths, θ̃_j, follows from θ_j and q_j⁺ ∀ j ∈ S. | Reference 99 |
| Reactive visitation probability for node j, r_j⁺ | From θ̃_j and q_j⁺ ∀ j ∈ S | Stable when this information is computed via state reduction. Yields r_j⁺ ∀ j ∈ S. | Reference 112 |
| | State reduction | Stable. Computes the visitation probability for a single node or set thereof. | Reference 114 |
| Net reactive i ← j internode flux, f_ij⁺ (eq.) or J̃_ij (noneq.) | f_ij⁺ from q_j⁺ and π_j ∀ j ∈ S; J̃_ij from q_j⁺ and θ̃_j ∀ j ∈ S | Can be used in shortest paths analysis to decompose the total reactive A ← B flux [Eq. (33)] into contributions from a finite set of flux-paths (Ref. 115). | Equation (32), Ref. 99 |
| Average mixing time, ζ_K | Matrix inversion (via A#) | Stable if using REFUND (Sec. IV D). | Equation (15) |
| | Eigendecomposition | Reversible chains only. | Equation (18) |
| Time-dependent occupation probability distribution, p(t) | Eigendecomposition | Diagonalizable chains only. | Equation (21) |
| | Solve ODEs | Use a stiff ODE integrator for systems with metastability. | Equation (1) |
| | Matrix exponentiation | Can use sparse linear algebra methods to calculate p(t). | Equation (3), Ref. 116 |
| | Simulate trajectories | (See below.) | Section V |
| Simulate trajectories [sample p(t)/sample FPT distribution] | Kinetic Monte Carlo | Unfeasibly inefficient when trajectories are long. | Section V |
| | Kinetic path sampling | Stable; requires a priori knowledge of metastable states. | Section V A |
| | MCAMC | Requires a priori knowledge of metastable states. | Section V B |

The dynamics of a continuous-time Markov chain (CTMC), parameterized by i ← j internode transition rates Kij and with state space S, is governed by the linear master equation30–32,105

$$\frac{\mathrm{d}p_j(t)}{\mathrm{d}t} = \sum_{i\neq j}\left[K_{ji}\,p_i(t) - K_{ij}\,p_j(t)\right] \quad \forall\, j\in S,$$
(1)

where pj(t) is the time-dependent occupation probability of the jth node. Equation (1) can be written in matrix form as

$$\frac{\mathrm{d}\mathbf{p}(t)}{\mathrm{d}t} = \mathbf{K}\,\mathbf{p}(t),$$
(2)

where p(t) = (p1(t), p2(t), …, p|S|(t))⊤ is the time-dependent occupation probability vector for the nodes of the network and K is the transition rate matrix. K has off-diagonal elements equal to the transition rates and diagonal elements such that the columns of the matrix sum to zero, i.e., Kjj = −∑γ≠j Kγj. The right eigenvectors of K represent dynamical eigenmodes, where the magnitudes and signs of the elements reflect the extent and direction of probability flow, respectively, for the corresponding nodes.106 The solution of Eq. (2) is a sum of contributions that decay exponentially with rates equal to the negatives of the eigenvalues {γk} of K.8 This solution leads to a formal definition for metastability, namely, that the Markov chain exhibits a spectral gap in its set of eigenvalues {γk}, and hence, there exists a set of dynamical eigenmodes that represent comparatively slow relaxation processes.64 By the Perron–Frobenius theorem,107 K has a unique dominant zero eigenvalue γ1 = 0 if the Markov chain is irreducible,73 i.e., if every node of the network is reachable from all other nodes. The zero eigenvalue is associated with the stationary probability vector π, and all other eigenvalues have a negative real component. The stationary distribution satisfies the global balance equation Kπ = 0, where 0 is the column vector with all elements equal to zero.73 If the stationary distribution also satisfies the detailed balance condition, Kijπj = Kjiπi ∀ i, j, then the Markov chain is said to be reversible, and the eigenvalues of K are real.2
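A minimal Python sketch of these conventions may be helpful; the three-node rate matrix below is a hypothetical example (not taken from any reference), constructed so that its columns sum to zero, with the stationary distribution extracted from the null space of K and detailed balance tested numerically.

```python
import numpy as np
import scipy.linalg

# Hypothetical 3-node CTMC: off-diagonal K[i, j] is the i <- j transition rate.
K = np.array([[0.0, 2.0, 1.0],
              [3.0, 0.0, 4.0],
              [1.0, 1.0, 0.0]])
K -= np.diag(K.sum(axis=0))              # K_jj = -sum_{gamma != j} K_{gamma j}
assert np.allclose(K.sum(axis=0), 0.0)   # columns sum to zero (global balance)

# Stationary distribution: right eigenvector of the zero eigenvalue (K pi = 0).
evals, evecs = scipy.linalg.eig(K)
pi = np.real(evecs[:, np.argmin(np.abs(evals))])
pi /= pi.sum()
assert np.allclose(K @ pi, 0.0)

# Reversibility test: detailed balance requires K_ij pi_j = K_ji pi_i.
F = K * pi                               # F[i, j] = K_ij pi_j (column broadcast)
print("stationary:", pi, "reversible:", np.allclose(F, F.T))
```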

The transition rate (or infinitesimal generator) matrix K is related to a transition probability matrix T(τ), which propagates the probability distribution vector p(t) at discrete-time intervals τ according to the Chapman–Kolmogorov equation,2 p(t + τ) = T(τ)p(t), via50

$$\mathbf{T}(\tau) = \exp(\mathbf{K}\tau)$$
(3)

and

$$\mathbf{K} = \lim_{\tau\to 0}\frac{\mathbf{T}(\tau) - \mathbf{I}}{\tau},$$
(4)

where I is the identity matrix. For an irreducible and aperiodic Markov chain, T has a single dominant eigenvalue λ1 = 1 associated with the stationary distribution π, which is the occupation probability distribution in the limit of infinite time. The absolute value of all other eigenvalues, λk for k = 2, …, |S|, is less than unity.1 The transition rate and probability matrices share the same set of right and left eigenvectors, {ψ(k)} and {ϕ(k)}, respectively. Their eigenvalues are related via $\mathrm{e}^{\gamma_k\tau} = \lambda_k(\tau)$ [cf. Eq. (3)].50 When the dynamics are reversible, the Markov chain has a complete set of orthonormal eigenvectors.108 Specifically, the left and right eigenvectors satisfy the orthonormality conditions8

$$\sum_{j\in S}\phi_j^{(k)}\phi_j^{(l)}\pi_j = \sum_{j\in S}\psi_j^{(k)}\psi_j^{(l)}/\pi_j = \sum_{j\in S}\psi_j^{(k)}\phi_j^{(l)} = \delta_{kl},$$
(5)

where ψj(k) is the jth element of the kth right eigenvector and δkl is the Kronecker delta. In fact, reversibility is not a requirement for the final relation of Eq. (5) to hold.

There are two possible choices of stochastic matrix for a CTMC.109 The first is the branching probability matrix P, with elements Pij = Kij/∑γ≠j Kγj.91 P contains no self-loop transitions, and the time associated with a transition from the jth node is exponentially distributed with mean τj = 1/∑γ≠j Kγj,110 referred to as the mean waiting time for the jth node. The second valid stochastic matrix for a CTMC is the continuous-time linearized transition probability matrix83

$$\mathbf{T}_{\mathrm{lin}}(\tau) = \mathbf{I} + \tau\mathbf{K},$$
(6)

for which the mean waiting times are uniform for all nodes, equal to τ. Tlin has the same sparsity pattern as P, except that the linearized matrix contains self-loop transitions, whereas Pjj = 0 ∀ j. Provided that τ ≤ min{|Kjj|⁻¹ : j ∈ S}, the linearized transition matrix is column-stochastic and shares the same set of eigenvectors as K, but the eigenvalues are shifted.111 In the following exposition, we will use the notation T to denote any general stochastic matrix for a discrete- or continuous-time Markov chain and draw attention to separate considerations for the different formulations where necessary.
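Both stochastic matrices follow directly from K; a brief sketch under the same assumed toy rate matrix, including the stochasticity bound on τ quoted above:

```python
import numpy as np

K = np.array([[-4.0,  2.0,  1.0],    # hypothetical CTMC; columns sum to zero
              [ 3.0, -3.0,  4.0],
              [ 1.0,  1.0, -5.0]])

escape = -np.diag(K)                 # total escape rates sum_{gamma != j} K_{gamma j}
tau_wait = 1.0 / escape              # mean waiting time tau_j for each node

P = K / escape                       # branching matrix: P_ij = K_ij / escape_j
np.fill_diagonal(P, 0.0)             # no self-loops: P_jj = 0 for all j
assert np.allclose(P.sum(axis=0), 1.0)

tau = 0.9 * tau_wait.min()           # tau <= min_j 1/|K_jj| keeps T_lin stochastic
T_lin = np.eye(3) + tau * K          # uniform waiting time tau for every node
assert np.allclose(T_lin.sum(axis=0), 1.0) and (T_lin >= 0.0).all()
```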

Irreducible Markov chains have a stationary distribution π that satisfies the global balance equations Tπ = π and Kπ = 0. The Markovian kernel117 [I − T(τ)] and the transition rate matrix K, which we will collectively denote by A, are singular, but there exists a class of generalized inverses,118–122 G, that satisfy AGA = A. Such matrices are fundamental in the sense that key global dynamical properties can be expressed straightforwardly in terms of G and the stationary distribution π.1,123 In the following discussion, we use T(τ) to refer to a discrete-time stochastic matrix parameterized at lag time τ or the linearized transition matrix of a CTMC with uniform mean waiting times τ.

Important examples of generalized inverses are Kemeny and Snell’s fundamental matrix,1,109

$$\mathbf{Z} = \left[\mathbf{I} - \mathbf{T}(\tau) + \boldsymbol{\pi}\mathbf{1}_S^\top\right]^{-1},$$
(7)

where 1S is the |S|-dimensional column vector with elements equal to unity, and Meyer’s group inverse,124 

$$\mathbf{A}^{\#} = \mathbf{Z} - \boldsymbol{\pi}\mathbf{1}_S^\top,$$
(8)

with elements125 

$$A_{ij}^{\#} = \sum_{n=0}^{\infty}\left[(\mathbf{T}^n)_{ij} - \pi_i\right].$$
(9)

The fundamental matrix is a generalized inverse118 that also satisfies AZ = ZA,117,121,126,127 and the group inverse is the unique generalized inverse that additionally satisfies A#AA# = A#.117,120 The group inverse is also the unique solution to the Bellman-type equations128 

$$\mathbf{A}^{\#} = \left(\mathbf{I} - \boldsymbol{\pi}\mathbf{1}_S^\top\right) + \mathbf{T}(\tau)\mathbf{A}^{\#} = \left(\mathbf{I} - \boldsymbol{\pi}\mathbf{1}_S^\top\right) + \mathbf{A}^{\#}\mathbf{T}(\tau),$$
(10)

with constraints A#π = 0 and ∑γ A#γj = 0 ∀ j ∈ S. The diagonal elements of A# are strictly positive.

In practice, it is sometimes more convenient to compute and work with the group inverse rather than the fundamental matrix, since A# can be obtained without knowledge of the stationary distribution.124 Furthermore, the elements of the group inverse have a probabilistic interpretation. Specifically, Aij# represents the expected deviation in the number of visits to the ith node for the relaxation process to the stationary distribution initialized from the jth node, compared to the average number of visits when starting at a node chosen randomly in proportion to the stationary distribution.129 Formally, if Nij(n) denotes the expected number of times that the ith node is visited on a trajectory of n steps initialized from the jth node, then124 

$$\lim_{n\to\infty}\left[N_{ij}(n) - N_{ik}(n)\right] = A_{ij}^{\#} - A_{ik}^{\#}.$$
(11)

A key macroscopic quantity characterizing a particular transition is the mean first passage time113,124,130–140 (MFPT), defined as the expected first hitting time for trajectories to reach an absorbing (target) state given an initial condition.2 The matrix of MFPTs for all pairwise internode transitions can be expressed directly in terms of the fundamental matrix or the group inverse,124,134

$$\mathcal{T} = \left(\mathbf{I} - \mathbf{G} + \mathbf{E}\mathbf{G}_{\mathrm{d}}\right)\mathbf{D}^{-1}\tau,$$
(12)

where E is the |S| × |S|-dimensional matrix with all elements equal to unity, Gd is the matrix with diagonal elements of a generalized inverse G and off-diagonal elements equal to zero, and D = diag(π). The matrix whose elements Vij are the variances of the FPT distribution for all pairwise internode transitions is given by V = T⁽²⁾ − T ∘ T, where ∘ denotes the element-wise product and120,122

$$\mathcal{T}^{(2)} = 2\left[\mathcal{T}\mathbf{G} - (\mathcal{T}\mathbf{G})_{\mathrm{d}}\mathbf{E}\right] + \mathcal{T}^{(2)}_{\mathrm{d}}\mathbf{D}\mathcal{T}\,\tau^2,$$
(13)

with

$$\mathcal{T}^{(2)}_{\mathrm{d}} = \mathbf{D}^{-1} + 2\mathbf{D}^{-1}\left[\mathbf{G}\left(\mathbf{I} - \boldsymbol{\pi}\mathbf{1}_S^\top\right)\right]_{\mathrm{d}}\mathbf{D}^{-1}.$$
(14)

More complicated expressions exist for matrices with elements corresponding to the higher moments of the FPT distributions for internode transitions.121 
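The following is a minimal sketch of this route, assuming the column-vector convention used throughout this work (the toy DTMC is hypothetical). Rather than the matrix form of Eq. (12), whose ordering is convention-sensitive, the code uses the equivalent element-wise Meyer-type identity, T to i from j = (A#_ii − A#_ij)τ/π_i, and cross-checks it against a direct first-step solve.

```python
import numpy as np

T = np.array([[0.5, 0.2, 0.1],      # hypothetical column-stochastic DTMC,
              [0.3, 0.7, 0.2],      # T[i, j] = i <- j transition probability
              [0.2, 0.1, 0.7]])
tau, n = 1.0, 3

evals, evecs = np.linalg.eig(T)
pi = np.real(evecs[:, np.argmax(np.real(evals))])
pi /= pi.sum()                      # stationary distribution

ones = np.ones(n)
Z = np.linalg.inv(np.eye(n) - T + np.outer(pi, ones))   # Eq. (7)
Ag = Z - np.outer(pi, ones)                             # Eq. (8), group inverse

# MFPT to node i from node j: (A#_ii - A#_ij) / pi_i, times the lag time.
MFPT = (np.diag(Ag)[:, None] - Ag) / pi[:, None] * tau

# Cross-check against the first-step relation [Eq. (39)] with node 0 absorbing.
Q = [1, 2]
t_ref = np.linalg.solve(np.eye(2) - T[np.ix_(Q, Q)].T, tau * np.ones(2))
assert np.allclose(MFPT[0, Q], t_ref)
```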

Another important property that characterizes the global dynamics of an irreducible Markov chain is the average mixing time,1 which can be thought of as the expected time for an initial occupation probability distribution to decay to the stationary distribution. More formally, the average mixing time is the expected time for trajectories to first hit a target node that is sampled in proportion to the stationary distribution.141 This quantity is independent of the initial condition142–144 and is known as the Kemeny constant,1,123 given by117 

$$\zeta_K = \sum_{\gamma\in S}\mathcal{T}_{\gamma j}\,\pi_\gamma\ \ \forall\, j\in S = \mathrm{Tr}(\mathbf{Z})\,\tau = \left[\mathrm{Tr}(\mathbf{A}^{\#}) + 1\right]\tau.$$
(15)

Higher moments of the mixing time distribution depend on the initial state but can nonetheless be derived from a generalized inverse. Let the variance of the mixing time distribution when the relaxation to equilibrium is initialized from the jth node be denoted by νj. These variances are given by

$$\nu_j = \sum_{\gamma\in S}\mathcal{T}^{(2)}_{\gamma j}\,\pi_\gamma - \zeta_K^2$$
(16)

and are the elements of the vector122 [using Eqs. (13) and (14)]

$$\boldsymbol{\nu}^\top = 2\left[\zeta_K\left(\mathbf{1}_S^\top\mathbf{G}\right) - \mathrm{Tr}(\mathbf{L}\mathbf{G})\,\mathbf{1}_S^\top\right] + 2\left(\boldsymbol{\alpha}^\top\left[\mathbf{I} - \mathbf{G} + \mathbf{E}\mathbf{G}_{\mathrm{d}}\right]\right) - \zeta_K\mathbf{1}_S^\top\tau - \zeta_K^2\,\mathbf{1}_S^\top,$$
(17)

where L = DT is the mixing matrix142 and α is the vector with elements αi = ∑γ Tiγπγ. The Kemeny constant effectively quantifies the extent to which all of the nodes in the state space of a Markov chain are mutually reachable.141 When the dynamics are diffusive (i.e., the elements of T are comparable), the average mixing time is relatively small, and the variances of the mixing time distributions are likewise small and fairly uniform. In contrast, when there are metastable states, relaxation to the stationary distribution is a slow process, and the mixing time distributions are fat-tailed, with large means and variances, owing to the existence of rare event transitions.

It is sometimes more convenient, for reasons of numerical stability or efficiency, to compute properties of a Markov chain via an eigendecomposition operation instead of performing a matrix inversion operation required to compute the group inverse (Sec. II B). Similar to the utility of generalized inverses, many dynamical quantities can be expressed in terms of the eigenspectrum of a Markovian network. For instance, the average mixing time [Eq. (15)] is given directly by the eigenvalues of an irreducible and reversible Markov chain. In the discrete-time case,108 

$$\zeta_K = \left[1 + \sum_{k\neq 1}\frac{1}{1-\lambda_k}\right]\tau.$$
(18)
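A short sketch comparing three routes to ζ_K for the toy DTMC used above (the chain is not reversible, so the agreement of the eigenvalue sum here reflects the weaker condition of diagonalizability with real eigenvalues rather than the reversible case quoted in the text); the +τ term in the weighted MFPT sum restores the diagonal recurrence contribution π_j(τ/π_j) that the zero-diagonal MFPT matrix omits.

```python
import numpy as np

T = np.array([[0.5, 0.2, 0.1],
              [0.3, 0.7, 0.2],
              [0.2, 0.1, 0.7]])
tau, n = 1.0, 3

w, v = np.linalg.eig(T)
order = np.argsort(-np.real(w))               # w[order[0]] = 1 (Perron root)
pi = np.real(v[:, order[0]]); pi /= pi.sum()

ones = np.ones(n)
Ag = np.linalg.inv(np.eye(n) - T + np.outer(pi, ones)) - np.outer(pi, ones)
zeta_trace = (np.trace(Ag) + 1.0) * tau                        # Eq. (15)

MFPT = (np.diag(Ag)[:, None] - Ag) / pi[:, None] * tau         # zero diagonal
zeta_start = pi @ MFPT + tau                  # same value for every start node
assert np.allclose(zeta_start, zeta_trace)

zeta_eig = (1.0 + np.sum(1.0 / (1.0 - np.real(w[order[1:]])))) * tau  # Eq. (18)
assert np.isclose(zeta_eig, zeta_trace)
```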

The MFPTs for internode transitions can also be written in terms of the eigenspectrum of a reversible Markov chain.137 For a CTMC, the i ← j MFPT is given by108,145

$$\mathcal{T}_{ij} = \frac{1}{\pi_j}\sum_{k>1}^{|S|}\frac{1}{|\gamma_k|}\,\psi_j^{(k)}\left[\phi_j^{(k)} - \phi_i^{(k)}\right].$$
(19)

The overall A ← B MFPT, for a transition to a specified absorbing macrostate A from a set of initial nodes B, is a sum of MFPTs for transitions from each of the nodes comprising B, where each term is weighted by the initial occupation probability of the starting node,

$$\mathcal{T}_{AB} = \sum_{b\in B}p_b(0)\,\mathcal{T}_{Ab}.$$
(20)

Although the MFPT is the usual dynamical observable,113,124,130–140 the complete FPT distribution contains valuable dynamical information that is not captured in the average. For instance, the width of the FPT distribution characterizes the heterogeneity of the transition path ensemble. When the transition probabilities or rates of the Markov chain depend on an external parameter, such as the temperature in physical systems,87 the form of the FPT distribution may change dramatically and with a relatively well-defined threshold.146 The existence of multiple peaks in the FPT distribution suggests the presence of competing transition mechanisms that each make a non-negligible contribution to the MFPT.100 To justify “lumping” a subset of nodes of a Markov chain and thus obtain a reduced Markovian representation of the dynamics,1,147–157 the FPT distribution for escape from the community of nodes ought to be an exponential distribution, which has the memoryless property.51,110

In discrete-time, the FPT distribution can be computed from eigendecomposition of the substochastic transition probability matrix TQQ, comprising only transient nodes of the set Q ≡ Ac. The equivalent formulation in continuous-time uses the corresponding rate matrix KQQ, for which the diagonal elements include contributions from transitions to absorbing nodes of the set A so that the columns of KQQ do not necessarily sum to zero. KQQ does not have a zero eigenvalue and associated stationary distribution, and instead, all eigenvalues have a negative real component. Since the dynamics within the transient set of nodes Q are unchanged prior to absorption, we can still write a linear master equation [cf. Eq. (2)] for the dynamics within this subnetwork. There is a probability flux out of the macrostate Q, which defines the probability distribution p(tFPT) for the A ← Q first passage times tFPT. Noting that the propagation of an initial probability distribution p(0) according to the linear master equation can be written in terms of the eigenspectrum of the rate matrix [cf. Eq. (3)],8

$$\mathbf{p}(t) = \sum_{k=1}^{|S|}\left[\boldsymbol{\psi}^{(k)}\otimes\boldsymbol{\phi}^{(k)}\right]\mathbf{p}(0)\,\mathrm{e}^{\gamma_k t},$$
(21)

where ⊗ denotes the outer product, we obtain the properly normalized FPT distribution52 

$$p(t_{\mathrm{FPT}}) = -\mathbf{1}_Q^\top\left[\sum_k\gamma_k^Q\,\boldsymbol{\psi}_Q^{(k)}\otimes\boldsymbol{\phi}_Q^{(k)}\right]\left[\sum_l\mathrm{e}^{\gamma_l^Q t}\,\boldsymbol{\psi}_Q^{(l)}\otimes\boldsymbol{\phi}_Q^{(l)}\right]\mathbf{p}_Q(0)$$
(22)
$$= -\sum_k\gamma_k^Q\,\mathrm{e}^{\gamma_k^Q t}\,\mathbf{1}_Q^\top\left[\boldsymbol{\psi}_Q^{(k)}\otimes\boldsymbol{\phi}_Q^{(k)}\right]\mathbf{p}_Q(0).$$
(23)

Here, γkQ is the kth eigenvalue of KQQ, with ϕQ(k) and ψQ(k) being the corresponding left and right eigenvectors, respectively, and pQ(0) is the vector containing the initial occupation probabilities for the transient nodes. In writing Eq. (23) from Eq. (22), we have assumed orthonormality of the eigenvectors [Eq. (5)]. The eigendecomposition of the FPT distribution is particularly insightful, since it separates the distribution into contributions from individual dynamical eigenmodes. The kth orthonormal eigenmode makes a dominant contribution to the FPT distribution when the product $\mathbf{1}_Q^\top\left[\boldsymbol{\psi}_Q^{(k)}\otimes\boldsymbol{\phi}_Q^{(k)}\right]\mathbf{p}_Q(0)$ is close to unity. The nth moment of the FPT distribution [Eq. (22)] is given by52,158

$$\langle t_{\mathrm{FPT}}^n\rangle = \int_0^\infty t_{\mathrm{FPT}}^n\,p(t_{\mathrm{FPT}})\,\mathrm{d}t_{\mathrm{FPT}} = n!\sum_k\frac{1}{|\gamma_k^Q|^n}\,\mathbf{1}_Q^\top\left[\boldsymbol{\psi}_Q^{(k)}\otimes\boldsymbol{\phi}_Q^{(k)}\right]\mathbf{p}_Q(0).$$
(24)
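A minimal sketch of Eqs. (23) and (24) for a hypothetical CTMC with a single absorbing node; the left eigenvectors are obtained from the inverse of the right-eigenvector matrix, which enforces the biorthonormality assumed in Eq. (23).

```python
import numpy as np

# Hypothetical CTMC; node 0 is absorbing (A = {0}), Q = {1, 2} is transient.
K = np.array([[-4.0,  2.0,  1.0],
              [ 3.0, -3.0,  4.0],
              [ 1.0,  1.0, -5.0]])
Q = [1, 2]
KQQ = K[np.ix_(Q, Q)]             # diagonals retain the escape rates into A
pQ0 = np.array([0.5, 0.5])        # initial occupation probabilities on Q

gam, Psi = np.linalg.eig(KQQ)     # columns of Psi: right eigenvectors
Phi = np.linalg.inv(Psi).T        # columns of Phi: biorthonormal left eigenvectors

onesQ = np.ones(len(Q))
amp = np.array([onesQ @ np.outer(Psi[:, k], Phi[:, k]) @ pQ0
                for k in range(len(Q))])     # 1_Q^T [psi (x) phi] p_Q(0)

# Eq. (23): p(t_FPT) = -sum_k gamma_k exp(gamma_k t) amp_k (all gamma_k < 0).
p_fpt = lambda t: -np.sum(gam * np.exp(gam * t) * amp).real

# Eq. (24): <t^n> = n! sum_k amp_k / |gamma_k|^n.
mean_fpt = np.sum(amp / np.abs(gam)).real
second = 2.0 * np.sum(amp / np.abs(gam) ** 2).real
print(p_fpt(1.0), mean_fpt, second - mean_fpt ** 2)   # density, MFPT, variance
```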

Many properties of an irreducible Markov chain can be directly expressed in terms of a fundamental matrix, obtained via a matrix inversion operation (Sec. II B). In Sec. II C, we noted that several key dynamical quantities characterizing the dynamics of a Markov chain can also be computed straightforwardly from its eigenspectrum, including MFPTs [Eq. (19)], the FPT distribution [Eq. (22)], the time-dependent occupation probability distribution [Eq. (21)], and the average mixing time [Eq. (18)]. Typical algorithms for eigendecomposition, and for matrix inversion or diagonalization, have time complexity O(|S|³), which is comparable to the time complexity of the state reduction algorithms described in Secs. III and IV. However, dense linear algebra methods are afflicted by the propagation of roundoff error in the finite precision arithmetic when applied to nearly reducible Markov chains.89,91,159 The extent to which a metastable Markov chain is ill-conditioned is essentially independent of its dimensionality.144 Hence, for Markov chains featuring a rare event, numerical error can prohibit the use of conventional dense methods, such as lower–upper (LU) decomposition to solve linear systems of equations35 or the QR algorithm to perform an eigendecomposition,160 even when the network comprises just a few nodes.52,92

For reversible Markov chains,2 marginal improvements in the numerical stability of linear algebra methods can be gained by employing the symmetrized transition rate matrix (in the continuous-time case), with elements K̃ij = (KijKji)¹ᐟ², or the symmetrized transition probability matrix (in the discrete-time case), with elements T̃ij(τ) = (Tij(τ)Tji(τ))¹ᐟ².52 These symmetrized matrices have the same eigenvalues as their unsymmetrized counterparts, and the elements of the right and left eigenvectors for the two formulations are related via ψj(k) = πj¹ᐟ²ψ̃j(k) and ϕj(k) = πj⁻¹ᐟ²ϕ̃j(k), respectively.8

An alternative framework to solve linear systems of equations or to determine the eigenspectrum of a Markov chain is provided by iterative methods,161–164 which can be optimized to treat Markov chains featuring a spectral gap via preconditioning.165 Unlike direct solution methods, iterative methods preserve the sparsity of a stochastic matrix and are therefore well-suited to treating high-dimensional structured systems,166 such as those that arise in population dynamics models.37 Below, we briefly outline the basis of these iterative methods and preconditioning strategies to aid their convergence. A detailed review of both direct and iterative solution methods in the context of Markov chains can be found in Ref. 76.

As an illustrative example, we consider application of the simplest iterative solution method, namely, the power method, to compute the stationary distribution of a Markov chain, i.e., to solve Kπ = 0 (in continuous-time) or T(τ)π = π (in discrete-time). In the power method, the solution vector x is repeatedly updated according to

$$\mathbf{x}' = \mathbf{T}(\tau)\,\mathbf{x} = (\mathbf{I} + \tau\mathbf{K})\,\mathbf{x}.$$
(25)

The convergence rate for Eq. (25) is directly related to λ2.76 Thus, when the Markov chain is nearly reducible, the convergence x → π is slow. To remedy this problem, we may introduce a preconditioning matrix M that is readily invertible, ideally such that M ≈ K and the inverse M⁻¹ yields a matrix (I − M⁻¹K) that has no subdominant eigenvalues close to the unique unit eigenvalue. The rate of convergence of the preconditioned version of Eq. (25) is consequently fast. In general, preconditioning refers to any method to modify a system of linear equations by premultiplication. That is, Ax = b → M⁻¹Ax = M⁻¹b, where Mc = y can be solved efficiently for any y, and this transformation favorably alters the distribution of eigenvalues, thereby aiding the convergence of iterative methods.
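A minimal sketch of the unpreconditioned iteration of Eq. (25), using the linearized stochastic matrix for an assumed toy CTMC; the iteration count grows without bound as λ2 → 1, which is the failure mode that preconditioning addresses.

```python
import numpy as np

K = np.array([[-4.0,  2.0,  1.0],    # hypothetical CTMC rate matrix
              [ 3.0, -3.0,  4.0],
              [ 1.0,  1.0, -5.0]])
tau = 0.9 / np.abs(np.diag(K)).max()
T_lin = np.eye(3) + tau * K          # stochastic iteration matrix, Eq. (25)

x = np.full(3, 1.0 / 3.0)            # initial guess
for it in range(100_000):
    x_new = T_lin @ x                # x <- T(tau) x = (I + tau K) x
    if np.linalg.norm(x_new - x, 1) < 1e-14:
        break
    x = x_new
print(it, x)                         # x converges to pi
```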

The success of iterative solution methods applied to ill-conditioned Markov chains exhibiting rare event dynamics is clearly strongly dependent on the choice of M. Some iterative methods simplify the additional input required from the user by implicitly incorporating a preconditioning matrix. Successive over-relaxation167 (SOR) to solve the linear problem Ax = b, which can be thought of as a generalization of Gauss–Seidel iteration, splits the relevant matrix A as

$$\omega\mathbf{A} = (\mathbf{D} - \omega\mathbf{L}) - \left[\omega\mathbf{U} + (1-\omega)\mathbf{D}\right],$$
(26)

where D is a diagonal matrix, L and U are strictly lower- and upper-triangular matrices, respectively, and ω > 0 is the relaxation factor. A solution vector x is then iterated according to

$$\mathbf{x}' = (\mathbf{D} - \omega\mathbf{L})^{-1}\left[\omega\mathbf{U} + (1-\omega)\mathbf{D}\right]\mathbf{x} + \omega(\mathbf{D} - \omega\mathbf{L})^{-1}\mathbf{b}.$$
(27)

The matrix ω⁻¹(D − ωL) effectively acts as a preconditioner in SOR.167 Hence, the problem of selecting an appropriate preconditioning matrix is simplified to the problem of determining a suitable value for the scalar parameter ω. In practice, there is very little guidance on an appropriate choice of relaxation factor for solving a given system of linear equations if the matrix A is unstructured, and the possible gains in efficiency may be quite limited.92
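A minimal sketch of the SOR iteration of Eq. (27) for a generic well-conditioned system; the triangular solve plays the role of the implicit preconditioner ω⁻¹(D − ωL), and the value ω = 1.2 is an arbitrary, manually chosen parameter, illustrating the trial-and-error tuning noted above.

```python
import numpy as np
from scipy.linalg import solve_triangular

def sor(A, b, omega=1.2, tol=1e-12, maxiter=10_000):
    """Successive over-relaxation for Ax = b via the splitting of Eq. (26)."""
    D = np.diag(np.diag(A))
    L = -np.tril(A, -1)                 # A = D - L - U
    U = -np.triu(A, 1)
    M = D - omega * L                   # lower-triangular "preconditioner"
    N = omega * U + (1.0 - omega) * D
    x = np.zeros_like(b)
    for _ in range(maxiter):
        x_new = solve_triangular(M, N @ x + omega * b, lower=True)  # Eq. (27)
        if np.linalg.norm(x_new - x) < tol:
            return x_new
        x = x_new
    return x

A = np.array([[4.0, -1.0, 0.0], [-1.0, 4.0, -1.0], [0.0, -1.0, 4.0]])
b = np.array([1.0, 2.0, 3.0])
print(sor(A, b), np.linalg.solve(A, b))   # the two solutions agree
```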

Krylov subspace methods are a class of sparse iterative procedures that can be used to perform an eigendecomposition of a matrix A.159,160,168,169 The idea is to generate a sequence of m vectors, {vk} ∀ k = 1, …, m, which form an orthonormal basis of the mth order Krylov subspace spanned by the set of vectors (v1, Av1, …, Aᵐ⁻¹v1), where v1 is an arbitrary normalized vector.170 The Arnoldi algorithm171 provides an iterative method to produce a sequence of orthonormal vectors Vm = (v1, …, vm) spanning the Krylov subspace.172,173 The procedure yields an m × m-dimensional upper Hessenberg matrix Hm that satisfies Hm = Vm⊤AVm. For a sufficiently large number of iterations m, the dominant eigenvalues of Hm converge to those of A. Additionally, if φk(m) denotes the kth dominant eigenvector of Hm, then the so-called Ritz vector Vmφk(m) approximates the kth eigenvector of A.174 The generalized minimal residual (GMRES) algorithm175 adapts this concept to solve the linear problem Ax = b.

For well-conditioned matrices A, the eigenvalues of Hm converge rapidly to those of A so that the former matrix is comparatively low-dimensional and its eigenpairs can be found efficiently by direct solution methods. For more computationally challenging problems, the required number of iterations m is large so that the storage requirements and operation counts of the procedure become prohibitively large. Typical implementations of the Arnoldi and related algorithms incorporate implicit restarting to improve efficiency and avoid “wasteful” memory usage.176,177 However, numerical problems remain pervasive for pathological systems, such as nearly reducible Markov chains.178 In this ill-conditioned regime, the Arnoldi and GMRES methods can be made more effective by preconditioning in the usual way,165 i.e., replacing the matrix A with M−1A, where M is similar to A and M−1y can be evaluated efficiently and reliably for any vector y.
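The sparse Krylov machinery is available off the shelf; a sketch using SciPy's implicitly restarted Arnoldi driver (eigs) and ILU-preconditioned GMRES on a randomly generated, well-conditioned stand-in matrix (not a nearly reducible chain; the rtol keyword requires SciPy ≥ 1.12, with tol playing the same role in older versions):

```python
import numpy as np
import scipy.sparse as sp
import scipy.sparse.linalg as spla

rng = np.random.default_rng(0)
n = 200
W = sp.random(n, n, density=0.05, random_state=rng, format="csc") \
    + sp.identity(n, format="csc")
T = W.multiply(1.0 / W.sum(axis=0))          # column-stochastic test matrix

# Implicitly restarted Arnoldi: dominant eigenpair (eigenvalue ~ 1, vector ~ pi).
vals, vecs = spla.eigs(T, k=1, which="LM")
pi = np.real(vecs[:, 0]); pi /= pi.sum()

# GMRES with an incomplete-LU preconditioner, M^{-1} ~ A^{-1}.
A = sp.identity(n, format="csc") - 0.999 * T  # shifted so that A is nonsingular
b = np.ones(n)
ilu = spla.spilu(A.tocsc(), drop_tol=1e-5)
M = spla.LinearOperator((n, n), ilu.solve)
x, info = spla.gmres(A, b, M=M, rtol=1e-10)
assert info == 0 and np.allclose(A @ x, b)
```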

Similar strategies can also be used to solve systems of ODEs. A numerically stable approach to solving the master equation for the time-dependent occupation probability distribution vector p(t) [Eq. (2)] uses a stiff ODE179,180 integrator,181–185 employing GMRES iterations175 preconditioned with an appropriate matrix to solve the relevant system of linear equations.186 Alternatively, the product of a matrix exponential and a vector can be computed using Krylov subspace methods,116,187–189 and this approach is amenable to preconditioning.190 Hence, it is possible to compute the vector p(t) at a given time t [cf. Eq. (3)] even when the transition rate matrix is high-dimensional by exploiting sparsity.
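A sketch of both routes for the small assumed rate matrix used previously; for a large sparse K, the same two calls apply unchanged, which is the point of these methods.

```python
import numpy as np
import scipy.sparse as sp
from scipy.integrate import solve_ivp
from scipy.sparse.linalg import expm_multiply

K = sp.csc_matrix([[-4.0,  2.0,  1.0],
                   [ 3.0, -3.0,  4.0],
                   [ 1.0,  1.0, -5.0]])
p0 = np.array([1.0, 0.0, 0.0])
t_final = 2.0

# (i) Stiff (BDF) integration of the master equation, Eq. (2).
sol = solve_ivp(lambda t, p: K @ p, (0.0, t_final), p0, method="BDF",
                rtol=1e-10, atol=1e-12)
p_bdf = sol.y[:, -1]

# (ii) Krylov-type evaluation of exp(K t) p(0) without forming exp(K t).
p_expm = expm_multiply(K * t_final, p0)
assert np.allclose(p_bdf, p_expm, atol=1e-7)
print(p_expm, p_expm.sum())              # occupation probability is conserved
```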

In principle, preconditioned iterative methods provide stable algorithms to analyze Markov chains that are scalable in terms of both operation counts and memory usage. However, this approach is of limited use in practice, owing to nontrivial and system-specific considerations that arise. In particular, the rate of convergence of iterative procedures is strongly dependent on the choice of preconditioning matrix and on the initial guess for the solution vector. Without appropriate preconditioning, iterative solvers may converge unfeasibly slowly for linear problems involving nearly reducible Markov chains.92 For arbitrary Markov chains, there is limited a priori information on appropriate input to sparse procedures, and strategies are not generalizable. Hence, the additional parameters must usually be explored empirically in a trial-and-error fashion.

The theory presented in Secs. II B and II C was concerned with global dynamical properties of Markov chains. The role of individual nodes in determining macroscopic behavior is elucidated by transition path theory (TPT).94–98 TPT provides a theoretical framework to analyze the transition path ensemble46,47,100 (TPE) of reactive paths,99,191 which proceed directly from an initial macrostate B to an absorbing macrostate A, without revisiting B. The central object of TPT is the vector of committor probabilities192–195 for nodes. We denote by qj+ the forward A ← B committor probability for the jth node. qj+ is the probability that a trajectory occupying node j will visit the absorbing macrostate A before the initial macrostate B. It is in this sense that the committor probability represents an idealized reaction coordinate quantifying the progress of an A ← B transition, onto which state variables can be projected.111

Formally, the forward committor probability for the jth node is defined as97 

$$q_j^+ \equiv \mathbb{P}_j\!\left(h_A < h_B\right),$$
(28)

where ℙj denotes the probability considering all trajectories ξ(t) initialized from node j and the random variable hX = inf{t > 0 : ξ(t) ∈ X} is the first hitting time for the macrostate X.2 In the context of Markovian networks, a trajectory ξ(t) is a sequence of visited nodes and associated times. For reversible Markov chains, the backward committor probabilities, for the B ← A direction, which are defined analogously to Eq. (28) but for the time-reversed dynamics,97 are related to the forward committor probabilities simply by qj− = 1 − qj+.111 The committor probabilities can be written as the solution of a series of linear equations obtained by a first-step analysis,4

$$q_j^+ = \sum_{i\in S}T_{ij}\,q_i^+ \quad \forall\, j\in(A\cup B)^{\mathrm{c}},$$
(29)

where we have noted that, in this definition, qa∈A+ = 1 and qb∈B+ = 0. See Ref. 97 for a detailed proof. Equation (29) can be solved using a variety of linear solution methods (Sec. II D), such as Gauss–Seidel iteration,196,197 successive over-relaxation,91,169 and GMRES,175 or robustly by state reduction (Sec. V A 4).112
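A minimal sketch of the direct linear solve of Eq. (29) with these boundary conditions, for a hypothetical five-node chain with arbitrarily chosen endpoint sets:

```python
import numpy as np

n = 5
rng = np.random.default_rng(1)
T = rng.random((n, n))
T /= T.sum(axis=0)                     # hypothetical column-stochastic DTMC

A_set, B_set = [0], [4]
inter = [j for j in range(n) if j not in A_set + B_set]   # intervening nodes

# For intervening j: q_j = sum_i T_{ij} q_i with q = 1 on A and q = 0 on B,
# so (I - T_II^T) q_I = (sum_{a in A} T_{aj})_{j in I}.
lhs = np.eye(len(inter)) - T[np.ix_(inter, inter)].T
rhs = T[np.ix_(A_set, inter)].sum(axis=0)   # one-step absorption into A
q = np.zeros(n)
q[A_set] = 1.0
q[inter] = np.linalg.solve(lhs, rhs)
print(q)                               # forward committor: 1 on A, 0 on B
```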

The committor probabilities can also be computed from the second dominant (i.e., first nontrivial) right eigenvector of the modified stochastic matrix T̄ for which nodes of the sets B and A are subsumed into single nodes b and a, respectively, and where both of these supernodes are considered to be absorbing.111 That is, T̄ja = 0 ∀ j ≠ a and T̄jb = 0 ∀ j ≠ b. This transition matrix has two unit eigenvalues, one of which is associated with the unit vector and the second is associated with an eigenvector that we shall denote by ψ̄(2). The committor probability for the jth node is then111

$$q_j^+ = \frac{\bar{\psi}_j^{(2)} - \bar{\psi}_b^{(2)}}{\bar{\psi}_a^{(2)} - \bar{\psi}_b^{(2)}}.$$
(30)

Obtaining the committor probabilities using Eq. (30) provides a scalable approach, since the Lanczos algorithm173 can be used to compute ψ̄(2) efficiently for sparse systems.166 Moreover, this formulation is readily extended to the first hitting problem [cf. Eq. (28)] for an arbitrary number of target states.111 

Several quantities characterizing the AB TPE at a nodewise level of detail can be obtained from the committor probabilities.94–98 One such dynamical property of interest is the probability distribution of reactive trajectories, mjR, defined as the probability that the jth node is occupied at equilibrium by a trajectory that is reactive.97 Intuitively, this quantity is a product of the probabilities that the trajectory last visited B before A and will next visit A before B and the stationary probability of the jth node. Therefore, from the definition of the committor probabilities [Eq. (28)], for reversible Markov chains, this probability is given by

$$m_j^{\mathrm{R}} = \pi_j\,q_j^+\left(1 - q_j^+\right).$$
(31)

The probability that any given trajectory at equilibrium is reactive is equal to the normalization factor ZAB = ∑j∈S mjR. The normalized distribution of reactive trajectories, mj = mjR/ZAB, is the probability to observe a trajectory at node j, conditional on the trajectory being reactive.

In the continuous-time case,97 the net probability flux of reactive trajectories along the i ← j edge of the network at equilibrium is

$$f_{ij}^+ = \begin{cases}\pi_j K_{ij}\left(q_i^+ - q_j^+\right) & \text{if } q_i^+ > q_j^+,\\ 0 & \text{otherwise.}\end{cases}$$
(32)

In discrete-time, Tij replaces Kij in Eq. (32).111 The A ← B steady-state rate constant,198 kABSS, can be calculated by defining an isocommittor cut191 Σα, which partitions the network into two sets A* ⊇ A and B* ⊇ B such that qa∈A*+ > α > qb∈B*+, and summing over the net probability fluxes associated with the edges of the cut,199,200

$$k_{AB}^{\mathrm{SS}} = \frac{1}{\pi_B}\sum_{i\in A^*,\,j\in B^*}f_{ij}^+ = \frac{J_{AB}}{\pi_B}.$$
(33)

Here, we have denoted the steady-state reactive A ← B flux43 as JAB and πB = ∑b∈B πb. Of particular interest is the isocommittor cut Σα=0.5, which defines the edges that constitute the transition state ensemble (TSE). The TSE essentially characterizes the boundary between the effective basins of attraction associated with the A and B states.201–205 Alternative approaches to characterize the TSE are based on Bayesian path statistics206 and the emission–absorption cut defined in terms of the eigenvectors.64
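A minimal sketch of Eqs. (31)–(33) for a hypothetical three-node birth–death CTMC, which is reversible by construction; the single intervening node makes the committor available in closed form, and the assertion verifies that two different cuts carry the same total reactive flux.

```python
import numpy as np
import scipy.linalg

# Hypothetical reversible (birth-death) CTMC with A = {0} and B = {2}.
K = np.array([[-2.0,  1.0,  0.0],
              [ 2.0, -3.0,  3.0],
              [ 0.0,  2.0, -3.0]])
A_set, B_set = [0], [2]

w, v = scipy.linalg.eig(K)
pi = np.real(v[:, np.argmin(np.abs(w))]); pi /= pi.sum()

# First-step analysis for the single intervening node: q_1 = K_01/(K_01 + K_21).
q = np.array([1.0, K[0, 1] / (K[0, 1] + K[2, 1]), 0.0])

mR = pi * q * (1.0 - q)                     # Eq. (31), reversible chain

f_plus = np.zeros_like(K)                   # Eq. (32): net reactive i <- j flux
for i in range(3):
    for j in range(3):
        if i != j and q[i] > q[j]:
            f_plus[i, j] = pi[j] * K[i, j] * (q[i] - q[j])

# Eq. (33): every isocommittor cut carries the same total reactive flux J_AB.
J_cut_B = f_plus[:, B_set].sum()            # cut just outside B (B* = B)
J_cut_A = f_plus[A_set, :].sum()            # cut just inside A (A* = A)
assert np.isclose(J_cut_B, J_cut_A)
print("k_SS =", J_cut_B / pi[B_set].sum())
```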

Equation (33) shows that the steady-state reactive flux, JAB, can be exactly decomposed into additive contributions from individual edges that together comprise a cut separating the A and B states. Hence, the local states that constitute the dominant channels for the productive pathways can be readily identified, and representative paths associated with each of the bottleneck edges can be determined by an augmenting paths algorithm.97 Alternatively, JAB can be exactly decomposed into a finite set of contributions from transition flux-paths, which proceed productively through sequential isocommittor cuts in the network and are therefore loopless.199 The dominant flux-paths can be determined efficiently by a k shortest paths algorithm, and this procedure allows for a pathwise analysis to quantify the relative importance of competing mechanisms for the transition process.115

The expressions for the probability distribution of reactive trajectories for nodes [Eq. (31)], and the net reactive fluxes for edges [Eq. (32)], correspond to the equilibrium TPE, where the system has relaxed to a steady state. Formally, this analysis involves considering a trajectory of infinite length in time, which continually transitions between the A and B states.97 The dynamical observable for the equilibrium path ensemble is kABSS [Eq. (33)]. In Ref. 99, a theory of transition paths was developed for the nonequilibrium path ensemble, i.e., considering the first hitting problem where trajectories are absorbed at A, for which the dynamical observable is the MFPT, TAB. In particular, expressions were derived for the expected numbers of times that nodes j are visited along reactive paths, denoted θ̃j, and for a nonequilibrium analog of the net reactive fluxes along i ← j edges [cf. Eq. (32)], denoted J̃ij, which depend on θ̃j and qj+ ∀ j ∈ S. In Ref. 112, expressions for the probabilities rj+ that nodes j are visited along reactive paths were derived for both the nonequilibrium and equilibrium cases. Together with the committor probabilities, the reactive visitation probabilities rj+ provide a rigorous and readily interpretable metric to identify the local states that have a dominant influence on the productive A ← B dynamics.100

We now proceed to describe numerically stable state reduction methods to compute the properties of a Markov chain, including moments of the first passage time distribution [Eq. (24)], the average mixing time [Eq. (15)], node properties characterizing transition paths (Sec. II E), and other quantities introduced in Sec. II. The state reduction approach is exact and requires no additional knowledge of the Markov chain besides the transition probability or rate matrix. Hence, state reduction procedures are of greater general utility than sparse linear algebra algorithms in application to nearly reducible Markov chains, since convergence of the latter methods in the metastable regime is strongly dependent on the careful choice, and moreover existence, of a suitable auxiliary matrix in the preconditioning scheme (Sec. II D). In this section, we introduce the concept of a renormalized Markov chain, which is central to the state reduction methodology, and prove that renormalization can be used to robustly compute the MFPT for a transition. In Sec. IV, we describe further closely related state reduction methods for the exact analysis of Markovian network dynamics.

The graph transformation88–92 (GT) algorithm is a procedure for the iterative elimination of nodes in an arbitrary Markov chain while preserving the A ← B mean first passage time (MFPT) for the transition from an initial node b ∈ B to an absorbing macrostate A.93 We denote the set of intervening nodes by I ≡ (A ∪ B)c. The GT method uses renormalization of transition probabilities and of mean waiting times (in the continuous-time case) or lag times (in discrete-time) for transitions from nodes to preserve individual path probabilities and the average time associated with the ensemble of paths to the absorbing state.

At each step of the nodewise iterative GT algorithm, the nth node, n ∈ I, is eliminated from the network, and the probabilities for internode i ← j transitions on the remaining network are renormalized according to

$$T'_{ij} = T_{ij} + T_{in}T_{nj}\sum_{m=0}^{\infty}T_{nn}^m = T_{ij} + \frac{T_{in}T_{nj}}{1 - T_{nn}}.$$
(34)

The waiting or lag times of nodes j, which we collectively denote by τj, are likewise updated according to

$$\tau_j' = \sum_{\gamma\neq n}\left[T_{\gamma j}\tau_j + T_{\gamma n}T_{nj}\sum_{m=0}^{\infty}\left(\tau_j + (m+1)\tau_n\right)T_{nn}^m\right] = \tau_j + \frac{T_{nj}\tau_n}{1 - T_{nn}}.$$
(35)

The transition probabilities for pairs of nodes i and j, where one or both of the nodes are not directly connected to n, are unaffected by the renormalization [Eq. (34)]. If i and j were not directly connected in the untransformed network, but both nodes were directly connected to n, i.e., the sequence of direct transitions i ← n ← j existed prior to renormalization, then a new i ← j transition connects the pair of nodes in the renormalized network. Hence, the renormalized network at the nth iteration of the GT algorithm is less sparse than the network at the (n − 1)th iteration, and self-loop (j ← j) transitions, if not initially present, are introduced into the successive Markov chains in the course of the algorithm. The mean waiting or lag time for a transition from the jth node, τj, increases upon elimination of the nth node if the n ← j transition exists and remains unchanged by the renormalization otherwise [Eq. (35)]. Hence, if the τj are initially uniform, as is the case for the lag times of a discrete-time Markov chain (DTMC), then they become nonuniform in the renormalized network. The effect of the renormalization procedure to eliminate a single node [Eqs. (34) and (35)] is illustrated in Fig. 1.
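A minimal sketch of Eqs. (34) and (35) as a single elimination step on an assumed toy DTMC; note the use of the summed escape probability in place of 1 − Tnn, the numerically stable evaluation discussed later in this section.

```python
import numpy as np

def gt_eliminate(T, tau, n):
    """Remove node n from a column-stochastic chain (T, tau); return the
    renormalized (T', tau') on the remaining nodes. Assumes node n is not
    absorbing, so that its escape probability is nonzero."""
    keep = [k for k in range(T.shape[0]) if k != n]
    # Stable evaluation of 1 - T_nn as the summed escape probability.
    escape = T[keep, n].sum()
    Tn = T[np.ix_(keep, keep)] + np.outer(T[keep, n], T[n, keep]) / escape  # Eq. (34)
    taun = tau[keep] + T[n, keep] * tau[n] / escape                         # Eq. (35)
    return Tn, taun

# Hypothetical dense 4-node DTMC with uniform lag times.
rng = np.random.default_rng(2)
T = rng.random((4, 4)); T /= T.sum(axis=0)
tau = np.ones(4)

T1, tau1 = gt_eliminate(T, tau, 3)
assert np.allclose(T1.sum(axis=0), 1.0)   # probability flow is conserved
```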

FIG. 1.

Schematic illustration of renormalization [Eq. (34)] to eliminate a single node n from a Markov chain parameterized by the transition probability matrix T. The transition probabilities T′ of the renormalized Markov chain account for transitions that occur indirectly, via the “censored” state (here, the nth node). Thus, the reduced model features a γ ← β transition that is not present in the original network, which corresponds to the family of transitions γ ← n ← ⋯ ← n ← β, where an arbitrary number of n ← n self-loop transitions occur. Similarly, the reduced Markov chain contains a β ← β self-loop transition, and the probabilities of the β ← δ and γ ← δ transitions have increased (indicated by +) to account for paths that proceed via the eliminated node n.


Equation (34) conserves the probability flow out of all noneliminated nodes, i.e., ∑γ T′γj = 1 ∀ j ≠ n.91 The renormalized transition probabilities T′ij subsume all i ← j transitions that occurred indirectly, i.e., via the intervening node n, with an arbitrary number of self-loop transitions of n on the original network.83 That is, the transition probabilities for the renormalized network not only account for direct i ← j transitions but also “round-trip” transitions i ← n ← j, i ← n ← n ← j, etc. The renormalized mean waiting or lag times τ′j [Eq. (35)] have a similar probabilistic interpretation. In particular, τ′j represents the expected time for a transition from the jth node, accounting for the average contribution from deviations via the eliminated node n.144 That is, τ′j includes the average time attributable to n ← ⋯ ← n ← j transitions before proceeding to escape from {j} ∪ {n}.

Equations (34) and (35) exactly preserve the MFPTs from any given transient node in the network to the set of absorbing nodes A. The renormalization of transition probabilities [Eq. (34)] also preserves the probabilities associated with individual paths (in a renormalized representation) from transient to absorbing nodes. Hence, the elements Bab of the absorption probability matrix,1 B, where a ∈ A and b ∈ B, are given by the renormalized transition probabilities T′ab of the network where all nodes of the set (A ∪ {b})c have been eliminated using Eq. (34). A formal proof of these statements is the subject of Sec. III B. If the original Markov chain is irreducible, then the renormalized Markov chain is also irreducible, and an expression can be derived for the stationary distribution π of the transformed network.73 This theorem forms the basis of state reduction methods to compute π (Secs. IV A–IV C).

The GT method for the computation of A ← B MFPTs is significantly more numerically stable than conventional dense linear algebra methods.91,92 GT can retain numerical precision even when a node n to be eliminated is associated with a dominant probability for the self-loop transition, Tnn → 1, because in this case, the equivalence 1 − Tnn ≡ ∑γ≠n Tγn can be exploited to avoid performing the subtraction operation.207,208 Using this numerical trick minimizes error in the finite precision arithmetic that would otherwise be propagated74–79 and is liable to become prohibitively severe for nearly reducible Markov chains.72,80

The GT algorithm has a time complexity that is dependent on the average degree of nodes and on the heterogeneity of the node degree distribution.89 Empirically, the nodewise iterative renormalization procedure has been observed to scale approximately as O(|S|³) for sparse random networks and for some other classes of structured network.83 Proofs for the O(|S|³) asymptotic time complexity of some state reduction algorithms have been formally derived.128,208 Since a DTMC is less sparse than the corresponding CTMC, there is no advantage to converting from a continuous- to a discrete-time formulation [Eq. (3)], as the state reduction computation will then be less efficient. The reverse operation [formally given by Eq. (4)] is highly nontrivial to perform in practice,144 and in any case, an equivalent continuous-time representation of a DTMC does not necessarily exist.10,209 The performance considerations for state reduction methods are complementary to those for matrix inversion and diagonalization methods (Sec. II D), which have time complexity O(|S|³)210 but frequently fail when the Markov chain features a spectral gap.80 GT is therefore the method of choice to compute MFPTs between two subsets of nodes in a Markov chain featuring a rare event, and likewise, the state reduction procedures outlined in Sec. IV allow for robust computation of further dynamical quantities that are otherwise challenging to obtain for nearly reducible Markov chains. There are numerous tricks for performance optimization of state reduction methods, for example, prioritizing the elimination of nodes with a low degree,91 switching from sparse to dense storage when the number of remaining nodes in the network falls below a threshold,89 and eliminating multiple nodes at once via a matrix inversion operation (Sec. IV A).93,198

We wish to prove that the GT algorithm correctly preserves the MFPT from any given node in the network to a set of absorbing nodes A. Recall that the overall A ← B MFPT, TAB, is an average of MFPTs TAb for transitions from a node b ∈ B of the initial set to any node of the absorbing set A for a specified initial occupation probability distribution [Eq. (20)]. The following argument demonstrates that the individual MFPTs TAb, from which TAB is derived, can be computed exactly using the renormalized transition probabilities [Eq. (34)] and waiting times [Eq. (35)] that are obtained after eliminating all nodes of the set (A ∪ {b})c, with A and b chosen arbitrarily.

First, we derive the renormalized mean waiting or lag times [Eq. (35)] more explicitly by introducing the reweighted transition probabilities $\tilde{T}_{ij} = T_{ij}\,\mathrm{e}^{\zeta\tau_j}$. Consider the n-step discrete path ξ specified as a sequence of visited nodes, ξ = {in ← in−1 ← ⋯ ← i1}. The probability of this path from i1 is $W_\xi = \prod_{(i\leftarrow j)\in\xi}T_{ij}$, where the product includes all i ← j transitions in the path ξ, with the correct multiplicities. The product of reweighted transition probabilities along the path, W̃ξ, is defined similarly. The reweighted transition probabilities satisfy

$$\left.\frac{\mathrm{d}\tilde{W}_\xi}{\mathrm{d}\zeta}\right|_{\zeta=0} = W_\xi\sum_{k=1}^{n-1}\tau_{i_k}.$$
(36)

Hence, this derivative yields the product of the path probability and the average waiting time associated with the path. For an a ∈ A ← b ∈ B path ξ(a,b), this quantity is the contribution of the path to the overall A ← b MFPT. Therefore, to preserve the A ← b MFPT by renormalization of the waiting times for nodes j that are directly connected to the nth (eliminated) node, each renormalized waiting time τ′j must be the sum of derivatives [Eq. (36)] of each of the reweighted transition probabilities for transitions from j to the set of nodes directly connected to j or n, excluding n. We will denote this set of adjacent nodes by Γ. If we use the renormalization relation for the reweighted transition probabilities [Eq. (34)] in this expression, then we recover Eq. (35),91

$$\tau_j' = \sum_{\gamma\in\Gamma}\left.\frac{\mathrm{d}\tilde{T}'_{\gamma j}}{\mathrm{d}\zeta}\right|_{\zeta=0} = \sum_{\gamma\in\Gamma}\left.\frac{\mathrm{d}}{\mathrm{d}\zeta}\left[\tilde{T}_{\gamma j} + \frac{\tilde{T}_{\gamma n}\tilde{T}_{nj}}{1 - \tilde{T}_{nn}}\right]\right|_{\zeta=0} = \tau_j + \frac{T_{nj}\tau_n}{1 - T_{nn}}.$$
(37)

While Eq. (34) preserves the probabilities associated with individual ξ(a,q) first passage paths (in their resulting reduced representation) from any transient node q ∈ Q to any absorbing node a ∈ A, Eq. (35) does not preserve the expected first passage times for individual paths to the absorbing state but instead preserves the A ← q MFPTs for all transient nodes q ∈ Q. That is, Eq. (35) preserves the path ensemble average time for the transition from a transient node to the set of absorbing nodes. To understand this result, note that the formula for the renormalized waiting time τ′j [Eq. (35)] is an average over all γ ∈ Γ ← j transitions [cf. Eq. (37)] and is associated with each of the renormalized probabilities for these transitions. We now proceed to provide a formal proof that Eq. (35) not only preserves the probability and mean time for the local γ ∈ Γ ← j transitions but also preserves the A ← q MFPTs for all transient nodes q ∈ Q that remain noneliminated.

The overall probability associated with an a ← q first passage path that proceeds via at least one transition between nodes of the set Γ, on a network where the nth node has been eliminated, can be factorized into probabilities for segments of the path divided as follows: the portion of the path from the starting transient node q ∈ Q to a node j of the set Γ [denoted ξ(j,q)], the portion of the path from a node γ of the set Γ to the absorbing node a ∈ A [denoted ξ(a,γ)], and transitions within nodes of the set Γ. The probability for any γ ∈ Γ ← j transition on the transformed network is simply T′γj, and the path segments ξ(j,q) and ξ(a,γ) are associated with products of transition probabilities Wξ(j,q) and Wξ(a,γ), respectively, which do not involve renormalized transition probabilities. Using the reweighted transition probabilities and their derivatives [Eq. (36)], we can hence write the contribution from the family of paths starting from node q and ending in node a, with a single transition γ ← j between nodes of the set Γ, to the total A ← B MFPT as follows:91

\frac{d}{d\zeta}\left[\sum_{\xi(j,q)} \tilde{W}_{\xi(j,q)} \sum_{\gamma\in\Gamma} \tilde{T}_{\gamma j}' \sum_{\xi(a,\gamma)} \tilde{W}_{\xi(a,\gamma)}\right]_{\zeta=0} = \sum_{\xi(j,q)} \frac{d\tilde{W}_{\xi(j,q)}}{d\zeta}\bigg|_{\zeta=0} \sum_{\gamma\in\Gamma} T_{\gamma j}' \sum_{\xi(a,\gamma)} W_{\xi(a,\gamma)} + \sum_{\xi(j,q)} W_{\xi(j,q)} \sum_{\gamma\in\Gamma} \frac{d\tilde{T}_{\gamma j}'}{d\zeta}\bigg|_{\zeta=0} \sum_{\xi(a,\gamma)} W_{\xi(a,\gamma)} + \sum_{\xi(j,q)} W_{\xi(j,q)} \sum_{\gamma\in\Gamma} T_{\gamma j}' \sum_{\xi(a,\gamma)} \frac{d\tilde{W}_{\xi(a,\gamma)}}{d\zeta}\bigg|_{\zeta=0}.
(38)

In this expression, there are sums over all path segments ξ(j,q) initialized from a particular transient node q ∈ Q and ending at a node j ∈ Γ and over all path segments ξ(a,γ) beginning at a node γ ∈ Γ and terminating at a particular absorbing node a ∈ A. Only the second term in Eq. (38) is affected by the renormalization, and the derivative in this term is the same as the local term in Eq. (37). The contribution of this family of paths (for a particular absorbing node a) to the overall A ← q MFPT is not preserved by the graph transformation, because each γ ∈ Γ ← j step is associated with the same averaged τ′_j. To obtain the result for all A ← q first passage paths, we sum over absorbing nodes a ∈ A.91,144 The contribution of this set of paths to the A ← q MFPT is then conserved, since Σ_{a∈A} Σ_{ξ(a,γ)} W_{ξ(a,γ)} = 1 ∀ γ ∈ Γ. Hence, the renormalization equations [Eqs. (34) and (35)], applied any number of times to the Markov chain, preserve the MFPTs T_Aq for transitions from all transient nodes q ∈ Q that remain noneliminated.

The vector of MFPTs for transitions from all transient nodes, of the set Q ≡ A^c, can be obtained by solving the following system of linear equations obtained from a first-step analysis:4

T_{Aj} = \tau_j + \sum_{\gamma \notin A} T_{\gamma j}\, T_{A\gamma}.
(39)

We wish to find the MFPT for the transition from a particular initial node b. When all nodes of the set (A ∪ {b})^c have been eliminated from the Markov chain according to Eqs. (34) and (35), the only remaining edges in the network represent transitions from the bth node to nodes of the absorbing macrostate A, and the b ← b self-loop. The first-step relation for the MFPTs [Eq. (39)] therefore reduces to a direct solution for the A ← b MFPT,

T_{Ab} = \frac{\tau_b'}{1 - T_{bb}'}.
(40)

To compute the overall A ← B MFPT [Eq. (20)], for an initial macrostate B ⊆ Q, the MFPTs for transitions from each node of the set B to the absorbing state A are required. When the initial set of nodes B is small, this calculation can be achieved easily in practice as follows. Each individual MFPT, T_Ab, for nodes b ∈ B, is determined via Eq. (40) by eliminating all nodes of the set B\{b}, after first eliminating all nodes of the set I ≡ (A ∪ B)^c and storing the resulting renormalized network. Then, only the former computation needs to be repeated for each initial node b ∈ B to obtain T_AB. When the absorbing state A is small, the MFPT for the reverse (B ← A) direction can be computed similarly with comparatively little additional computational effort by eliminating all nodes of the set A\{a} from the renormalized network resulting from elimination of all nodes of the set I, for each node a ∈ A in turn.
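As an illustration of this procedure, the following minimal dense NumPy sketch eliminates nodes one at a time via Eqs. (34) and (35) and then applies Eq. (40). It assumes the column convention in which T[i, j] is the i ← j transition probability, and the function names are our own illustrative choices rather than those of a published code.

import numpy as np

def gt_eliminate(T, tau, n):
    # Renormalize upon removing node n [Eqs. (34) and (35)]:
    #   T'_ij = T_ij + T_in T_nj / (1 - T_nn),
    #   tau'_j = tau_j + tau_n T_nj / (1 - T_nn),
    # then delete row and column n.
    keep = [i for i in range(T.shape[0]) if i != n]
    denom = 1.0 - T[n, n]
    T_new = T[np.ix_(keep, keep)] + np.outer(T[keep, n], T[n, keep]) / denom
    tau_new = tau[keep] + tau[n] * T[n, keep] / denom
    return T_new, tau_new

def mfpt_gt(T, tau, b, A):
    # MFPT T_Ab [Eq. (40)]: eliminate every node outside A U {b},
    # then read off T_Ab = tau'_b / (1 - T'_bb).
    labels = list(range(T.shape[0]))  # track original node labels
    T, tau = T.copy(), tau.astype(float)
    for node in [x for x in labels if x not in A and x != b]:
        k = labels.index(node)
        T, tau = gt_eliminate(T, tau, k)
        labels.remove(node)
    kb = labels.index(b)
    return tau[kb] / (1.0 - T[kb, kb])

Storing the renormalized network obtained after eliminating the set I ≡ (A ∪ B)^c, as described above, amortizes most of this work across the different initial nodes b ∈ B.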

The renormalization of transition probabilities [Eq. (34)] forms the basis for the family of state reduction methods to robustly compute the properties of a Markov chain. In this section, we extend the theory of renormalization to eliminate blocks of nodes simultaneously and describe numerically stable state reduction procedures to compute many of the dynamical quantities introduced in Sec. II, including uncoupling–coupling methods to determine the stationary distribution (Secs. IV A and IV B) and the REFUND algorithm to compute the group inverse (Sec. IV D).

In Sec. III, we argued that renormalization of the transition probabilities upon eliminating a single node n in the network [Eq. (34)] preserves the probabilities of individual paths, in their renormalized representation, on the resulting reduced Markov chain. This theory can be extended to allow for the elimination of a block of nodes, N, in a single step using a matrix inversion operation. The state space of the Markov chain is partitioned as S ≡ N ∪ Z, so that we write the transition probability matrix in block form as

T = \begin{pmatrix} T_{NN} & T_{NZ} \\ T_{ZN} & T_{ZZ} \end{pmatrix},
(41)

where T_NZ contains the n ∈ N ← z ∈ Z transition probabilities and the other blocks are defined similarly. Hence, T_NN is the substochastic matrix for transitions within the subset of nodes to be eliminated, N, and T_ZZ likewise corresponds to the subnetwork comprising the nodes of the macrostate to be retained, Z ≡ N^c. The renormalized Markov chain consisting only of the nodes within the set Z is associated with the transition probability matrix208

T_{ZZ}' \leftarrow T_{ZZ} + T_{ZN}(I - T_{NN})^{-1} T_{NZ}.
(42)

Equation (42) is referred to as a stochastic complement by Meyer.73 The renormalized transition probabilities account for the average behavior of paths that visit N, including transitions within N, analogous to the case of eliminating a single node [Eq. (34)].211 Therefore, if there was no direct i ← j transition in the original network, but there were i ← N ← j paths, then a direct i ← j transition is present in the renormalized network for i, j ∈ Z. Similarly, it can be shown that the |Z|-dimensional vector of renormalized mean waiting or lag times for the remaining network is given by93

\boldsymbol{\tau}_Z'^\top \leftarrow \boldsymbol{\tau}_Z^\top + \boldsymbol{\tau}_N^\top (I - T_{NN})^{-1} T_{NZ}.
(43)

Equations (42) and (43) constitute a block formulation of the GT algorithm (Sec. III) that remains numerically stable when each of the blocks of nodes to be eliminated, N, constitutes a metastable state, so that the T_NN matrices do not feature a spectral gap and, hence, the Markovian kernel (I − T_NN) can be safely inverted by dense linear algebra methods. Simultaneous elimination of blocks of nodes in a Markov chain can also lead to improved time complexity of the GT algorithm.198
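A block elimination step requires only a few lines of dense linear algebra. The sketch below (our own illustration, using the same column convention as above) forms the stochastic complement and renormalized waiting times of Eqs. (42) and (43), with a linear solve in place of an explicit inverse.

import numpy as np

def block_gt(T, tau, N):
    # Eliminate the node set N in a single step [Eqs. (42) and (43)].
    Z = [i for i in range(T.shape[0]) if i not in set(N)]
    # W = (I - T_NN)^{-1} T_NZ, obtained by a linear solve
    W = np.linalg.solve(np.eye(len(N)) - T[np.ix_(N, N)], T[np.ix_(N, Z)])
    T_zz = T[np.ix_(Z, Z)] + T[np.ix_(Z, N)] @ W   # Eq. (42)
    tau_z = tau[Z] + tau[N] @ W                    # Eq. (43)
    return T_zz, tau_z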

It can be shown that if a Markov chain is irreducible, then any stochastic complement [Eq. (42)] also has a well-defined stationary distribution,1 and hence, so do any stochastic complements of that stochastic complement, and so on. Moreover, if a transition probability matrix is partitioned into N communities, C = {1, 2, …, N}, then the stochastic complements (i.e., renormalized Markov chains) corresponding to each of the communities have independent stationary distributions, from which the stationary distribution of the original Markov chain can be inferred.212 Specifically, the stationary distribution of the original Markov chain partitioned according to

T = \begin{pmatrix} T_{11} & T_{12} & \cdots & T_{1N} \\ T_{21} & T_{22} & \cdots & T_{2N} \\ \vdots & \vdots & \ddots & \vdots \\ T_{N1} & T_{N2} & \cdots & T_{NN} \end{pmatrix}
(44)

is given by

\boldsymbol{\pi} = (\zeta_1 \boldsymbol{\pi}_1^\top, \ldots, \zeta_{N-1} \boldsymbol{\pi}_{N-1}^\top, \zeta_N \boldsymbol{\pi}_N^\top)^\top,
(45)

where ζ_Y = Σ_{y∈Y} π_y is the coupling factor corresponding to the stochastic complement [Eq. (42)] comprising the nodes of the set Y, which has a stationary distribution vector π_Y, and Σ_Y ζ_Y = 1. The N-dimensional vector of coupling factors,

\boldsymbol{\zeta} = (\zeta_1, \ldots, \zeta_{N-1}, \zeta_N)^\top,
(46)

is the stationary distribution for the stochastic matrix C with elements C_XY = 1_X^⊤ T_XY π_Y for all communities X, Y ∈ C.212 C is referred to as the aggregation matrix, since its elements are simply the intercommunity transition probabilities when a local equilibrium6 is established within each of the separate macrostates. If the community structure C appropriately characterizes the metastable communities of nodes, then C is well-conditioned, and hence, ζ can be determined accurately with relatively little computational effort, since the number of communities N is typically small. The coupling factors take a particularly simple form in the case of a two-level partition into sets N and Z ≡ N^c [Eq. (41)], namely,73

\zeta_Z = \frac{1 - \mathbf{1}_N^\top T_{NN} \boldsymbol{\pi}_N}{2 - \mathbf{1}_Z^\top T_{ZZ} \boldsymbol{\pi}_Z - \mathbf{1}_N^\top T_{NN} \boldsymbol{\pi}_N}
(47)

and ζ_N = 1 − ζ_Z.

These observations suggest the following uncoupling–coupling algorithm to compute the stationary distribution vector. In the uncoupling phase, an irreducible Markov chain and its derived stochastic complements are repeatedly decomposed into two or more renormalized Markov chains of reduced dimensionality, based on a partitioning C determined at each iteration. For the set of stochastic complements resulting from the final iteration of the uncoupling phase, linear algebra methods (Sec. II) or the GTH algorithm130,213 (Sec. IV C) can be used to compute the independent stationary distributions π_Y for each of the renormalized Markov chains with state space Y ∈ C. Then, the aggregation matrix C is constructed from this information, and the coupling factors ζ_Y associated with each of the stochastic complements are obtained as the stationary distribution of C. The vectors {π_Y} and corresponding coupling factors {ζ_Y} yield the stationary distribution of the parent Markov chain, comprising the nodes in all communities Y ∈ C [Eq. (45)]. That is, the final iteration of the uncoupling stage has been undone. Repeated coupling of the stationary distributions for the stochastic complements at each level of the hierarchical partitioning that was performed in the uncoupling phase eventually recovers the stationary distribution of the original Markov chain. This uncoupling–coupling procedure based on stochastic complementation [Eq. (42)] is illustrated in Fig. 2.
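A single-level version of this uncoupling–coupling scheme can be written compactly. The sketch below is a schematic NumPy illustration under our own conventions (the dense eigenvector helper stands in for the GTH solves that a robust implementation would use), with comms a list of index lists partitioning the state space.

import numpy as np

def complement(T, Y):
    # Stochastic complement T'_YY [Eq. (42)]: eliminate all nodes not in Y.
    N = [i for i in range(T.shape[0]) if i not in set(Y)]
    W = np.linalg.solve(np.eye(len(N)) - T[np.ix_(N, N)], T[np.ix_(N, Y)])
    return T[np.ix_(Y, Y)] + T[np.ix_(Y, N)] @ W

def dominant(M):
    # Normalized dominant eigenvector of a small column-stochastic matrix;
    # a production code would call the GTH routine of Sec. IV C instead.
    w, V = np.linalg.eig(M)
    v = np.abs(np.real(V[:, np.argmax(np.real(w))]))
    return v / v.sum()

def uncouple_couple(T, comms):
    # One exact uncoupling-coupling pass [Eqs. (42), (45), and (46)].
    piY = [dominant(complement(T, Y)) for Y in comms]   # uncoupling
    C = np.array([[np.sum(T[np.ix_(X, Y)] @ piY[y])     # aggregation matrix
                   for y, Y in enumerate(comms)] for X in comms])
    zeta = dominant(C)                                  # coupling factors
    pi = np.zeros(T.shape[0])
    for k, Y in enumerate(comms):                       # coupling, Eq. (45)
        pi[Y] = zeta[k] * piY[k]
    return pi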

FIG. 2.

Schematic illustration of the exact uncoupling–coupling procedure (Sec. IV A) to compute the stationary distribution π of an irreducible Markov chain parameterized by the transition probability matrix T. The state space of the model network shown is partitioned into three communities, S ≡ X ∪ Y ∪ Z. The edges of the network are bidirectional. In the uncoupling step, the stochastic complement T′_YY [Eq. (42)] is computed for each community Y of the set C. Observe that the stochastic complements T′_YY contain additional edges compared to the corresponding substochastic blocks T_YY of the original Markov chain. For instance, the stochastic complement for the community Y has an additional edge between two nodes for which an indirect transition via the set X ∪ Z exists. After uncoupling, the stationary distribution is computed for each of the independent reduced Markov chains. In this figure, the notation T ↦ π indicates that the stochastic matrix T has stationary distribution π. The independent stationary distributions are used to construct a stochastic aggregation (coupling) matrix C, in which each community is represented by a single node. The stationary distribution π of the original Markov chain is constructed from the stationary distribution ζ associated with C and from the independent stationary distributions {π_Y ∀ Y ∈ C} of the separate stochastic complements (coupling step) [Eq. (45)]. The algorithm illustrated above has a single uncoupling stage, but the independence of the stochastic complements can be exploited to recursively reduce the derived Markov chains in multiple uncoupling steps, and this algorithm is readily parallelizable. In iterative aggregation–disaggregation (IAD) procedures (Sec. IV B), the substochastic blocks T_YY are used in place of the stochastic complements T′_YY. Hence, the initial stationary distribution determined by the coupling step is inexact. New coupling factors ζ*_Y and corresponding approximate local stationary distributions π*_Y are iteratively updated by repeatedly forming a stochastic aggregation matrix and then solving a system of linear equations [Eq. (48)].


In practice, this exact uncoupling–coupling algorithm has been found to be highly effective for determining the stationary distribution of a nearly reducible Markov chain.214 To understand this observation, consider an iteration of the uncoupling phase, where the Markov chain T has N metastable macrostates and is partitioned into the set of N communities C, which accurately characterizes the metastable sets of nodes. For each community Z ∈ C, the renormalized transition matrix T′_ZZ [Eq. (42)] is then close to the (nearly stochastic) block T_ZZ [Eq. (41)] of the parent transition matrix.73 Since the dynamics within the state space Z are fast compared to escape from this subnetwork, neither the absorbing Markov chain T_ZZ nor its derived stochastic complement T′_ZZ has any subdominant eigenvalues close to unity, and the renormalized Markov chain T′_ZZ is therefore well-conditioned. Hence, inversion of the Markovian kernel to determine the corresponding stochastic complement [Eq. (42)], and computation of the stationary distribution for this stochastic complement, is numerically stable.215

More formally, since each of the metastable macrostates gives rise to a slowly decaying dynamical eigenmode, the nearly reducible Markov chain T necessarily has at least N − 1 eigenvalues λ_k that are close to unity, in addition to the eigenvalue λ_1 = 1 that is associated with the stationary distribution. Furthermore, each of the stochastic complements T′_ZZ, corresponding to communities Z ∈ C, necessarily has a unit eigenvalue.212 Each of the unique unit eigenvalues for the N stochastic complements is associated with one of the N dominant eigenvalues for the original Markov chain, and this mapping becomes exact in the limit where the original Markov chain is completely reducible, i.e., where the separate blocks Z ∈ C are themselves irreducible.72 Hence, by the continuity of the eigenspectrum, if there is a spectral gap after the N dominant eigenvalues of the original Markov chain, then each of the N stochastic complements must have a spectral gap after the unit eigenvalue.73 In fact, it can be shown that there exists an upper bound on the second dominant eigenvalue for a stochastic complement of a reversible Markov chain.216 Thus, if the community structures used to partition the stochastic complements during the uncoupling phase appropriately characterize all metastable sets of nodes in the relevant Markov chains, so that the aggregation matrices are also well-conditioned, then the entire uncoupling–coupling procedure is numerically stable. Moreover, the uncoupling–coupling procedure is readily parallelizable, owing to the independence of the stationary distributions for the stochastic complements derived from a parent Markov chain.212

Stochastic complementation is close in spirit to iterative aggregation–disaggregation (IAD) methods217–220 to compute the stationary distribution π. Both methods use an N-way partitioning of a Markov chain based on the community structure C [Eq. (44)]. Whereas the uncoupling–coupling in stochastic complementation is exact, IAD uses the substochastic blocks of a partitioned Markov chain directly, thereby avoiding the matrix inversion operations required to compute the stochastic complements [Eq. (42)].

To infer an approximation to the stationary distribution of the parent Markov chain in IAD, the normalized right eigenvector ψ_Y^(1) associated with the dominant eigenvalue (which is less than unity) is computed for each of the substochastic matrices T_YY, corresponding to communities Y ∈ C. The vector of coupling factors ζ* [cf. Eq. (46)] that are associated with these eigenvectors is determined as the stationary distribution for an N × N-dimensional stochastic coupling matrix C* with elements C*_XY = 1_X^⊤ T_XY ψ_Y^(1) ∀ X, Y ∈ C. Note that thus far, the IAD procedure is analogous to the exact uncoupling–coupling method described in Sec. IV A, except that the quantities are inexact, since the substochastic blocks are used in place of the stochastic complements. The eigenvectors and associated coupling factors together yield an initial approximation to the stationary distribution of the original Markov chain [cf. Eq. (45)], π ≈ (ζ*_1 ψ_1^(1), …, ζ*_N ψ_N^(1)), which is iteratively refined as follows:

Let π* = (π*_1, …, π*_N) denote the current estimate for the stationary distribution. First, the vectors {π*_Y ∀ Y ∈ C} are normalized to yield the vectors {π̄*_Y}, and updated coupling factors {ζ*_Y} are computed as the stationary distribution of a new stochastic aggregation matrix C* with elements C*_XY = 1_X^⊤ T_XY π̄*_Y ∀ X, Y ∈ C (aggregation step). The new coupling factors yield the vector z = (ζ*_1 π̄*_1, …, ζ*_N π̄*_N). Second, an updated estimate for π is obtained by solving the following N systems of linear equations (disaggregation step):221

\boldsymbol{\pi}_X^* = T_{XX}\boldsymbol{\pi}_X^* + \sum_{Y<X} T_{XY}\boldsymbol{\pi}_Y^* + \sum_{Y>X} T_{XY}\mathbf{z}_Y, \quad \forall X \in C.
(48)

These two steps are repeated until the estimate for the stationary distribution of the parent Markov chain converges, π* → π. Further refinements to this procedure that improve numerical stability and convergence were reported in Ref. 222.

In practice, this procedure typically converges rapidly to the true stationary distribution, provided that the community structure is such that the blocks of the transition matrix are close to being stochastic.215,223 IAD shares many of the advantages of exact uncoupling–coupling, most notably numerical stability when the community structure reflects the nearly reducible structure of a metastable Markov chain, and ready parallelizability. The GTH algorithm (Sec. IV C) can be used to solve the linear systems of equations that arise in both the aggregation and disaggregation steps of IAD, leading to improved numerical stability.224
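The aggregation and disaggregation steps can be prototyped in a few lines. The following sketch is our own illustration with dense solves throughout; it initializes from the dominant eigenvectors of the substochastic blocks and then iterates Eq. (48) until convergence.

import numpy as np

def dominant(M):
    # Normalized dominant eigenvector of a small matrix (dense helper).
    w, V = np.linalg.eig(M)
    v = np.abs(np.real(V[:, np.argmax(np.real(w))]))
    return v / v.sum()

def iad(T, comms, tol=1e-12, maxit=500):
    # IAD sketch for a column-stochastic T (T_ij = i <- j); comms is a list
    # of index lists partitioning the state space. The dense solves below
    # could be replaced by the GTH algorithm (Sec. IV C) for stability.
    pi = np.zeros(T.shape[0])
    for Y in comms:                        # initial local approximations
        pi[Y] = dominant(T[np.ix_(Y, Y)])
    for _ in range(maxit):
        pibar = [pi[Y] / pi[Y].sum() for Y in comms]
        # aggregation step: C*_XY = 1_X^T T_XY pibar_Y
        C = np.array([[np.sum(T[np.ix_(X, Y)] @ pibar[y])
                       for y, Y in enumerate(comms)] for X in comms])
        zeta = dominant(C)                 # updated coupling factors
        z = [zeta[y] * pibar[y] for y in range(len(comms))]
        # disaggregation step: solve Eq. (48) for each community in turn
        new = np.zeros_like(pi)
        for x, X in enumerate(comms):
            rhs = np.zeros(len(X))
            for y, Y in enumerate(comms):
                if y < x:
                    rhs += T[np.ix_(X, Y)] @ new[Y]
                elif y > x:
                    rhs += T[np.ix_(X, Y)] @ z[y]
            new[X] = np.linalg.solve(np.eye(len(X)) - T[np.ix_(X, X)], rhs)
        if np.abs(new - pi).max() < tol:
            return new / new.sum()
        pi = new
    return pi / pi.sum()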

The Grassmann–Taksar–Heyman (GTH) algorithm130,213 is essentially a nodewise iterative formulation of exact uncoupling–coupling via stochastic complementation (Sec. IV A). Consider the elimination of the nth node in a Markov chain by renormalization [Eq. (34)], as in the nodewise iterative formulation of the GT algorithm (Sec. III). Since the equilibrium distribution is equal to the proportion of time spent at a node in the infinite time limit, the stationary distribution of the parent Markov chain, π, must be proportional to that of the reduced model, π′,

\pi_j = \alpha \pi_j' \quad \forall j \in S \setminus \{n\},
(49a)
\pi_n = 1 - \alpha.
(49b)

From Eq. (49) and the global balance equation for the stationary distribution of the parent Markov chain at the nth node, π_n = Σ_γ π_γ T_nγ, we have128

\alpha = \left[1 + \sum_{\gamma \neq n} \frac{\pi_\gamma' T_{n\gamma}}{1 - T_{nn}}\right]^{-1},
(50)

where T_ij is the i ← j transition probability for the original Markov chain.

Equations (49) and (50) suggest the following nodewise iterative procedure to compute the stationary distribution of an irreducible Markov chain. The GTH algorithm uses renormalization of the transition probabilities [cf. Eq. (34)] to eliminate, in turn, all nodes n of the Markov chain with |S| ≥ n > 1. The stationary probability for the reduced Markov chain comprising the single remaining node (n = 1) is, of course, known to be π′_1 = 1. The algorithm then employs a back substitution phase to “undo” the state reduction stage, thereby recovering the original Markov chain and associated stationary distribution.40 The GTH algorithm, which is essentially equivalent to a Gaussian elimination,40 is given as pseudocode in Algorithm 1. Like the GT algorithm, the GTH algorithm is numerically stable,225 since the relation 1 − T_nn ≡ Σ_{γ≠n} T_γn can be exploited to avoid problematic subtraction operations when T_nn → 1.207

ALGORITHM 1.

Grassmann–Taksar–Heyman (GTH) algorithm130,213 to compute the stationary distribution π of an irreducible Markov chain.

input: transition probability matrix T
       set of nodes n ∈ S\{1} to be eliminated from the state space S
output: stationary probability distribution vector π
/* elimination phase */
for n = |S|, |S|−1, …, 2 do
    S_n ← Σ_{γ<n} T_γn (≡ 1 − T_nn); // confers numerical stability
    for j < n do
        T_nj ← T_nj/S_n;
        for i < n do
            T_ij ← T_ij + T_in T_nj;
/* back substitution phase */
π_1 ← 1;
μ ← 1;
for n = 2, …, |S|−1, |S| do
    π_n ← T_n1 + Σ_{γ=2}^{n−1} π_γ T_nγ;
    μ ← μ + π_n;
/* normalization */
for n = 1, …, |S|−1, |S| do
    π_n ← π_n/μ;
return π
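A direct NumPy transcription of Algorithm 1 is short; the version below is our own, uses 0-based indexing, retains node 0, and, as in the pseudocode, never forms 1 − T_nn by subtraction.

import numpy as np

def gth(T):
    # GTH algorithm (Algorithm 1) for the stationary distribution of an
    # irreducible column-stochastic matrix T (T_ij = i <- j).
    T = T.astype(float).copy()
    S = T.shape[0]
    # elimination phase: remove nodes S-1, ..., 1 in turn
    for n in range(S - 1, 0, -1):
        Sn = T[:n, n].sum()      # = 1 - T_nn, computed by summation
        T[n, :n] /= Sn
        T[:n, :n] += np.outer(T[:n, n], T[n, :n])
    # back substitution phase
    pi = np.zeros(S)
    pi[0] = 1.0
    for n in range(1, S):
        pi[n] = pi[:n] @ T[n, :n]
    return pi / pi.sum()

For a two-state chain with column-stochastic T = [[1 − a, b], [a, 1 − b]], this routine returns (b, a)/(a + b), as required.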

Recall that the key macroscopic dynamical properties of an irreducible Markov chain, including moments of the first passage time distributions for all pairwise transitions between nodes and moments of the mixing time distributions for relaxation processes from alternative starting nodes, can be computed from any generalized matrix inverse (Sec. II B). The FUND78,226,227 and REFUND128 algorithms provide numerically stable procedures to compute the group inverse A^# [Eqs. (8)–(10)] associated with an irreducible Markov chain and can be readily adapted to obtain related generalized inverses, such as the fundamental matrix Z.

The FUND algorithm is based on the fact that a generalized inverse X satisfying134 

X(I - T) = I - \boldsymbol{\pi}\mathbf{1}_S^\top
(51)

can be obtained from the result of the first phase of the GTH algorithm (Algorithm 1) and that the group inverse can be written directly in terms of any generalized inverse X satisfying Eq. (51),128 

A^{\#} = X(I - \boldsymbol{\pi}\mathbf{1}_S^\top).
(52)

The first step of the FUND algorithm is to obtain an LU decomposition of the Markovian kernel, I − T = UL, where U and L are upper- and lower-triangular matrices, respectively. To see that this factorization is achieved naturally and robustly by the elimination phase of the GTH algorithm (Sec. IV C), suppose that the elements of the stochastic matrix T are overwritten during the procedure, yielding the matrix T*. Let F denote the strictly lower-triangular matrix and G the upper-triangular matrix containing the corresponding elements of T*, so that T* = F + G + (I − S), where S is the diagonal matrix with nonzero elements S_nn = S_n (cf. Algorithm 1). It can be shown that40,226

I - T = (G - S)(F - I) \equiv UL.
(53)

The U matrix has all column sums equal to zero. The diagonal elements of U can be computed by enforcing this constraint, which thereby provides an additional opportunity to enforce numerical stability.78 The diagonal elements of L are all equal to −1, and hence, L is non-singular.

The LU decomposition of Eq. (53) can be used to solve Eq. (51) in two stages. First, the (unique) solution Y to the problem YL = I − π1_S^⊤ is determined by backward substitution. The second step is to solve XU = Y.134 Since the first column of U is identically zero, the first column of Y is also necessarily the null vector, and therefore, the generalized inverse X is not a unique solution to this equation.226 The simplest choice is to set the first column of X to be the null vector, and the remaining component vectors can then be obtained by forward substitution.227 The group inverse A^# then follows straightforwardly from X [Eq. (52)], where π is computed by following through the later stages of the GTH algorithm.

The FUND algorithm outlined above is usually stable, since there is no significant numerical error associated with the forward and backward substitution procedures to determine Y and X. However, a difficulty may sometimes arise where L is ill-conditioned even though the Markovian kernel I − T is not. A small refinement of the FUND algorithm proposed in Ref. 78 addresses this issue. The time complexity of either formulation of the FUND algorithm is O(|S|^3).
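For validating a FUND or REFUND implementation on small, well-conditioned test chains, the group inverse can be cross-checked against the dense identity A^# = (I − T + π1^⊤)^{−1} − π1^⊤, a standard relation between the group inverse and the fundamental matrix. The sketch below is such a reference utility (our own, and emphatically not the numerically stable FUND route).

import numpy as np

def group_inverse_dense(T, pi):
    # Dense reference for A# with A = I - T (column convention):
    # A# = (A + pi 1^T)^{-1} - pi 1^T, where pi 1^T has every column pi.
    S = T.shape[0]
    W = np.outer(pi, np.ones(S))
    return np.linalg.inv(np.eye(S) - T + W) - W

The result can be checked against the defining properties of the group inverse, A A^# A = A, A^# A A^# = A^#, and A A^# = A^# A.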

An explicit formula relating the group inverses associated with a renormalized Markov chain where a single node has been eliminated and the corresponding parent Markov chain was derived in Ref. 128. This relation allows for a state reduction procedure to compute the group inverse that has a similar structure to the GTH algorithm, namely, the REFUND algorithm. The first stage of this algorithm is an elimination phase to iteratively reduce the Markov chain by renormalization, yielding the group inverse for the Markov chain with only a single node remaining as a trivial solution. A backward pass phase then computes the group inverses for the sequence of Markov chains resulting from the restoration of nodes in turn, given the group inverse for the reduced Markov chain where the node of the current iteration is eliminated. The recursive phase of the REFUND procedure also computes the stationary distribution of the reduced Markov chain at each iteration, which is required to compute the group inverse of the parent Markov chain. The n-dimensional stationary distribution vector for the reduced Markov chain with the nth node restored, π^(n), is normalized at each iteration of the backward pass phase. This approach differs from the GTH algorithm (Algorithm 1), where the stationary distribution is only normalized at the final step. REFUND has comparable stability to the refined FUND algorithm and likewise has asymptotic time complexity O(|S|^3). The REFUND algorithm is given as pseudocode in Algorithm 2.

ALGORITHM 2.

REFUND algorithm128 to compute the group inverse A^# of an irreducible Markov chain [Eqs. (8)–(10)]. The stationary distribution π is computed concomitantly. In this scheme, T_:i,:j denotes the (i − 1) × (j − 1)-dimensional block of the stochastic matrix T comprising elements with row index less than i and column index less than j. Similarly, p_:n is the (n − 1)-dimensional column vector containing the elements of p with indices 1 ≤ γ < n.

input: transition probability matrix T
       set of nodes n ∈ S\{1} to be eliminated from the state space S
output: group inverse A^#
        stationary distribution π
/* elimination phase. Note that nodes are eliminated in reverse order, thereby effectively
   performing the triangular decomposition of the Markovian kernel: I − T = UL */
for n = |S|, |S|−1, …, 2 do
    /* p_:n is the column and q_:n^⊤ the row vector, respectively, corresponding to the node to be
       eliminated (n) in the stochastic matrix for the reduced Markov chain at the current iteration,
       not including the diagonal element. Both vectors have dimension (n − 1) */
    p_:n, q_:n ← GetBlock(T, n);
    S_n ← Σ_{γ<n} T_γn (≡ 1 − T_nn); // the factors S_n, 1 < n ≤ |S|, are stored in a vector
    T_:n,:n ← T_:n,:n + S_n^{−1} p_:n q_:n^⊤; // GTH elimination of the nth node
    T_:n+1,:n+1 ← [T_:n,:n, p_:n; S_n^{−1} q_:n^⊤, T_nn]; // overwrite the elements of the transition matrix
/* the stationary distribution and group inverse for the reduced Markov chain with only one
   node remaining are trivial */
π^(1) ← (1);
A^#_(1) ← (0);
/* backward pass phase. At this stage, the first diagonal element of T, T_11 = 1, corresponds to the
   stochastic matrix for a reduced Markov chain comprising only a single node. The other
   off-diagonal elements of T are the vectors p_:n and q_:n for 1 < n ≤ |S| */
for n = 2, …, |S|−1, |S| do
    /* p_:n and q_:n are defined above, but q_:n is now scaled by S_n^{−1} compared to the q_:n vector
       returned by GetBlock during the elimination phase */
    p_:n, q_:n ← GetBlock(T, n);
    α ← (1 + q_:n^⊤ π^(n−1))^{−1};
    β ← α (q_:n^⊤ π^(n−1))/S_n (≡ (1 − α)/S_n); // confers numerical stability
    r ← α A^#_(n−1) q_:n;
    t ← β (p_:n^⊤ A^#_(n−1))^⊤;
    c ← β(α + p_:n^⊤ r);
    c ← t − c π^(n−1); // the scalar c is overwritten by an (n − 1)-dimensional vector
    /* stationary distribution for the reduced Markov chain with the nth node restored */
    π^(n) ← α (π^(n−1), q_:n^⊤ π^(n−1))^⊤; // this stationary vector is normalized
    δ ← (q_:n^⊤ π^(n−1))^{−1} (≡ α/(1 − α)); // confers numerical stability
    /* the group inverse for the parent Markov chain, for which the nth node is restored, is
       recovered by an explicit formula */
    A^#_(n) ← [A^#_(n−1) − (r π^(n−1)⊤ + 1_{n−1} c^⊤), δc − r; c^⊤, −δ 1_{n−1}^⊤ c]; // 1_{n−1} is the (n − 1)-dimensional unit vector
A^# ← A^#_(|S|); π ← π^(|S|);
return A^#, π
/* function to partition the relevant block of the matrix T. During the elimination phase, this block
   corresponds to the stochastic matrix for the reduced Markov chain at the current iteration
   (the other elements are overwritten during this phase) */
function GetBlock(T, n)
    let T_:n+1,:n+1 = [T_:n,:n, p_:n; q_:n^⊤, T_nn]; // reduced Markov chain partitioned into blocks
    return p_:n, q_:n

Many procedures based on the concept of renormalization have been proposed as numerically stable approaches to solve various other linear algebra problems for Markov chains and related systems.131 In Ref. 113, an extension of the GTH algorithm was derived to compute the variance and higher moments of the FPT distributions for the transitions from each of the transient nodes of a reducible Markov chain to the absorbing state. This procedure provides a stable method to compute the variance of the FPT distribution, which can also exploit sparsity of a Markov chain and thereby avoid excessive memory usage. The vector of MFPTs for transitions from all transient nodes to the absorbing state can likewise be calculated in a single computation by extending the GT algorithm (Sec. III A) with a backward pass phase.112 Similar to the GTH (Sec. IV C) and REFUND (Sec. IV D) algorithms, this procedure is based on a recursive formula to compute the MFPTs for iteratively restored nodes, after storing relevant matrix elements during the forward pass phase.132,134

In Ref. 112, a state reduction procedure was proposed to determine the expected numbers of times, θ_j, that transient nodes j are visited on first passage paths prior to absorption. This algorithm effectively performs a matrix inversion of the Markovian kernel corresponding to the transient nodes of a reducible Markov chain. This operation is achieved while retaining numerical precision by defining an augmented matrix where each node of the network is connected to a “dummy” partner, and state reduction is used to eliminate the non-dummy transient nodes. The corresponding quantity for reactive paths, θ̃_j, can be obtained by performing this procedure on a reweighted Markov chain that represents the reactive process,191 which is derived from the original transition probabilities and the committor probabilities. As shown in Ref. 112, it is then possible to robustly obtain the reactive visitation probabilities r_j^+ for all transient nodes j ∈ Q for both nonequilibrium and equilibrium transition path ensembles. Alternatively, the probability that a particular node, or set thereof, is visited along a reactive path can be computed robustly and with minimal memory usage by employing a state reduction procedure that follows naturally from the concept of renormalization.114 This approach is valuable in many applications, not only for nearly reducible Markov chains, since the usual method to compute visitation probabilities for nodes on first passage1 or reactive112 paths uses matrix inversion and is therefore not scalable. Furthermore, the conventional method cannot be used to compute the visitation probabilities for macrostates. The reactive visitation probability is an important dynamical property that intuitively quantifies the kinetic relevance of states with respect to the productive transition. Thus, since there are also state reduction methods to compute the stationary distribution (Secs. IV A–IV C) and committor probabilities (Sec. V A 4), renormalization can be used for robust computation of all microscopic properties characterizing the transition path ensemble (Sec. II E).

In Ref. 228, a state reduction framework was formulated for application to Markov decision processes (MDPs). An MDP augments the state space of a Markov chain with alternative sets of probabilities for transitions from the nodes, each associated with a different action that is available to the decision-making agent. A set of actions corresponding to each of the nodes of the MDP is a policy, which defines how the agent operates. The i ← j transition under action a_j is associated with a reward R_ij(a_j). The reward at the tth step is usually weighted by γ^t, where γ is a discount factor to ensure that the reward for a trajectory converges, 0 < γ < 1. The usual optimization problem is to determine the optimal policy, namely, that which maximizes the expected reward when starting from a specified initial node. The algorithm of Ref. 228 is a robust method to find the optimal policy by policy iteration, using repeated application of the following two-stage procedure. First, a state reduction algorithm is used to determine the vector of expected rewards starting from each node when the agent obeys the current policy. Second, these expected rewards are used in an improvement step to obtain an update to a suboptimal policy. This numerically stable method is useful in many practical applications, where nodes associated with large rewards may be rarely visited under an initial policy.
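To make the two-stage structure concrete, the sketch below implements plain policy iteration with a dense policy-evaluation solve. It is a textbook illustration under our own conventions (P[a][i, j] and R[a][i, j] hold the i ← j transition probability and reward under action a), and it is precisely this evaluation solve that the state reduction algorithm of Ref. 228 replaces with a numerically stable elimination.

import numpy as np

def policy_iteration(P, R, gamma=0.95):
    # P: list of column-stochastic matrices, one per action; R: same shape.
    n_actions, n = len(P), P[0].shape[0]
    policy = np.zeros(n, dtype=int)
    while True:
        # policy evaluation: v_j = sum_i P_ij (R_ij + gamma v_i) under policy
        Ppi = np.column_stack([P[policy[j]][:, j] for j in range(n)])
        rpi = np.array([P[policy[j]][:, j] @ R[policy[j]][:, j]
                        for j in range(n)])
        v = np.linalg.solve(np.eye(n) - gamma * Ppi.T, rpi)
        # policy improvement: greedy one-step lookahead at every node
        q = np.array([[P[a][:, j] @ (R[a][:, j] + gamma * v)
                       for a in range(n_actions)] for j in range(n)])
        new = q.argmax(axis=1)
        if np.array_equal(new, policy):
            return policy, v
        policy = new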

The state reduction framework presented in Secs. III and IV provides numerically stable procedures to compute almost all dynamical properties of a Markov chain. However, there are two notable quantities introduced in Sec. II that have not yet been obtained by state reduction methods. First, there is no state reduction algorithm to exactly compute the time-dependent occupation probability distribution vector p(t), given a transition rate matrix K. Instead, p(t) must be computed by matrix exponentiation [Eq. (3)] or eigendecomposition [Eq. (21)]. Second, we desire a state reduction method to compute the committor probabilities for nodes [Eq. (29)], which constitute the central object in transition path theory (Sec. II E).

In the present section, we describe the kinetic path sampling (kPS) method to efficiently simulate trajectories, and hence p(t), for a nearly reducible Markov chain. The kPS algorithm applies a state reduction procedure to the currently occupied metastable macrostate of the Markovian network, followed by a backward pass phase to sample the numbers of internode transitions, and therefore the time, associated with a trajectory escaping from the subnetwork. The formulation of the graph transformation algorithm (Sec. III) employed in kPS leads to an exact state reduction procedure to compute the committor probabilities for all nodes of a Markov chain (Sec. V A 4).

In principle, numerical estimates for the A ← B FPT distribution [Eq. (22)], and for the occupation probability distribution p(t), can be obtained by explicit simulation of A ← B first passage paths using any algorithm that samples the solution to the linear master equation [Eq. (1)] exactly. The standard procedure to simulate trajectories on a Markovian network in continuous time is rejection-free kinetic Monte Carlo (kMC).81,82 In the kMC algorithm, a trajectory that currently occupies the jth node is advanced to the next node using two random numbers r_1, r_2 ∈ (0, 1] drawn from a uniform distribution.229 The first random number is used to sample an i ← j transition in proportion to the branching probabilities P_ij, and the second is used to increment the simulation clock by Δt = −τ_j ln r_2.110 The probabilities of the sampled paths then agree exactly with the linear master equation [Eq. (1)]. By using the branching probability matrix, this procedure avoids self-loop transitions and so prevents the system from becoming trapped in any one node associated with small outgoing transition rates. However, for nearly reducible Markov chains, the trajectories exhibit a strong tendency to “flicker” within metastable communities of nodes,84–87 which can cause kMC simulation to be unfeasibly inefficient.83,100
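A single rejection-free kMC move is a two-line recipe. The sketch below is our own illustration using NumPy's default generator; it assumes P[:, j] holds the branching probabilities out of node j and tau[j] the corresponding mean waiting time.

import numpy as np

rng = np.random.default_rng()

def kmc_step(P, tau, j):
    # Draw the target node i <- j by inverting the cumulative branching
    # probabilities, then advance the clock by an exponential waiting time.
    i = int(np.searchsorted(np.cumsum(P[:, j]), rng.random()))
    dt = -tau[j] * np.log(1.0 - rng.random())  # 1 - r lies in (0, 1], cf. dt = -tau_j ln r_2
    return i, dt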

For Markov chains featuring a rare event, large gains in efficiency can be achieved by defining the currently occupied metastable community of nodes (or basin) B, sampling a node at the boundary ∂A ⊆ A of the absorbing macrostate A ≡ B^c, and sampling an escape time for the trajectory segment escaping to the absorbing boundary. The kinetic path sampling83,84 (kPS) (Sec. V A) and Monte Carlo with absorbing Markov chain101–104 (MCAMC) (Sec. V B) algorithms provide exact methods to do this. The kPS and MCAMC algorithms forfeit resolution of the trajectory within the metastable macrostates, which in any case ought to be unproductive and hence of little interest, and gain the desirable property that their efficiency is essentially independent of the metastability of the Markov chain.83

The kPS and MCAMC methods require a partitioning of the network into metastable communities. This clustering can be specified a priori, for example, with a suitable community detection algorithm,230 or constructed on-the-fly, for instance, using a breadth-first search procedure with criteria for including nodes in the basin.231 The algorithms can be used in conjunction with standard rejection-free kMC, automatically switching to the advanced method when flickering of a trajectory within a subset of nodes is detected.83 

1. Renormalization phase

To simulate a trajectory segment in the kinetic path sampling83,84 (kPS) algorithm, the state space S ≡ B ∪ A is divided into the currently occupied community B and an absorbing macrostate A. The nodes of the basin B ≡ E ∪ T are further divided into the set of eliminated nodes E ⊆ B and the set of retained nodes T ⊆ B. The set T may be empty, T = ∅. We also define the absorbing boundary ∂A ⊆ A as the subset of nodes of the absorbing macrostate that are directly connected to one or more nodes of the basin B. The total number of nodes in the set E ∪ T ∪ ∂A is denoted N_c. The definitions of these states are illustrated in Fig. 3.

FIG. 3.

Formulation of the escape from a community B to the absorbing boundary ∂A ⊆ A in the kPS algorithm. Given an initially occupied node ϵ, which in the above illustration belongs to the subset E ⊆ B of nodes of the basin B that are to be eliminated by renormalization, the output of Algorithm 3 is a stochastically drawn escape time t_A for a sampled trajectory from ϵ to an absorbing node α ∈ ∂A. A subset T ⊆ B of nodes of the basin may be retained in the renormalization stage of the kPS algorithm.


To simulate a trajectory escaping from the active community B to the absorbing boundary ∂A, kPS uses a stochastic matrix corresponding to the relevant subnetwork E ∪ T ∪ ∂A. In the first phase of the kPS algorithm, the nodes 1, …, |E| ∈ E are eliminated, in turn, by renormalization (Sec. III), while the retained nodes |E|+1, …, |B| ∈ T and absorbing boundary nodes |B|+1, …, N_c ∈ ∂A remain noneliminated. The input to the sampling stage of the kPS algorithm is the set of |E|+1 transition probability matrices {T^(n)}, 0 ≤ n ≤ |E|, with T^(0) corresponding to the transition probability matrix for the subnetwork B ∪ ∂A of the original (untransformed) network. T^(0) has dimensions N_c × |B|, since the absorbing boundary nodes have no outgoing transitions. Successive stochastic matrices (with n > 0) are computed by the iterative elimination of the |E| nodes n ∈ E from this subnetwork using the GT algorithm (Sec. III). In the renormalization, the transition probabilities for all pairs of nodes are updated according to

T_{ij}^{(n)} = T_{ij}^{(n-1)} + \frac{T_{nj}^{(n-1)}\left(T_{in}^{(n-1)} - \delta_{in}\right)}{1 - T_{nn}^{(n-1)}}
(54)

[cf. Eq. (34)]. Transitions from eliminated to noneliminated nodes are preserved by Eq. (54), but transitions from noneliminated to eliminated nodes have zero probability. As in Eq. (34), only the probabilities for transitions between pairs of nodes that are both directly connected to the nth node are affected by the renormalization. However, unlike renormalization using Eq. (34), connections involving eliminated nodes must also be considered. The mean waiting or lag times for transitions from nodes are not renormalized [cf. Eq. (35)] in the kPS algorithm, since the state reduction procedure is reversed before sampling the time associated with a trajectory escaping from the basin to the absorbing boundary.

The formulation of GT expressed in Eq. (54) is analogous to an LU decomposition215,232 of a stochastic matrix.83 Exploiting this analogy reduces the memory requirements of the kPS algorithm, since the intermediate transition matrices arising from the iterative elimination of nodes need not be stored. Instead, it is only necessary to store the original and final stochastic matrices, T^(0) and T^(|E|), respectively, and the matrices L and U that contain the elements required to construct T^(n−1) from T^(n). Specifically, the L and U matrices have elements [cf. Eq. (54)]

L_{nj} = \frac{T_{nj}^{(n-1)}}{1 - T_{nn}^{(n-1)}} \quad \text{and} \quad U_{in} = T_{in}^{(n-1)} - \delta_{in}.
(55)
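The renormalization phase, together with the L and U bookkeeping of Eq. (55), can be sketched as follows. This is our own illustration; the layout in which the eliminated nodes E occupy the leading indices of the subnetwork matrix is an assumption made for brevity.

import numpy as np

def kps_renormalize(T0, n_elim):
    # Renormalization phase of kPS [Eqs. (54) and (55)]. T0 is the
    # column-stochastic matrix for the subnetwork B U dA, with eliminated
    # nodes indexed 0, ..., n_elim-1. Returns T^(|E|) plus the L and U
    # factors from which each intermediate T^(n-1) can be rebuilt.
    T = T0.astype(float).copy()
    n_rows, n_cols = T.shape
    L = np.zeros((n_elim, n_cols))   # L_nj = T^(n-1)_nj / (1 - T^(n-1)_nn)
    U = np.zeros((n_rows, n_elim))   # U_in = T^(n-1)_in - delta_in
    for n in range(n_elim):
        L[n, :] = T[n, :] / (1.0 - T[n, n])
        U[:, n] = T[:, n]
        U[n, n] -= 1.0
        T += np.outer(U[:, n], L[n, :])  # Eq. (54) applied to every pair (i, j)
        # row n of T is now zero: no transitions into the eliminated node
    return T, L, U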

2. Sampling dynamics on the renormalized network and iterative reverse randomization

Recall that renormalization preserves the path probabilities to individual absorbing nodes (Sec. III B).91 This property of the state reduction framework is the basis for the kPS algorithm. To sample the absorbing boundary node α ∈ ∂A at which the trajectory segment terminates, the following probability vectors are used to simulate successive i ← j transitions on the renormalized subnetwork comprising the nodes of the set B ∪ ∂A,

c_j = \begin{cases} t_{*,j}^{(|E|)} & \text{if } j \leq |E|, \\ t_{*,j}^{(j)} & \text{if } j > |E|, \end{cases}
(56)

starting from the currently occupied node ϵ ∈ B. Here, t_{*,j}^(|E|) denotes the vector containing the elements of the jth column of T^(|E|), and t_{*,j}^(j) (for j > |E|) denotes the jth column of the transition matrix given by the elimination [Eq. (54)] of node j from T^(0). In this sampling procedure, transitions between eliminated nodes are not permitted [cf. Eq. (54)], but transitions from noneliminated transient nodes j ∈ T to eliminated nodes i ∈ E are possible. This setup greatly reduces the number of steps required to reach a node at the absorbing boundary. If T = ∅, then the trajectory necessarily reaches the absorbing boundary in a single transition. If E = ∅, then Eq. (56) reduces to the standard rejection-free kMC algorithm.233

Let h denote the |E|-dimensional vector with elements h_j equal to the number of transitions from node j ∈ E to nodes i > j, i.e., transitions to nodes that are noneliminated in the stochastic matrix at the jth iteration of the renormalization phase. Let η denote the |B|-dimensional vector with elements η_j equal to the total number of transitions from the jth node. We also define the set of |E|+1 hopping matrices {H^(n)}, 0 ≤ n ≤ |E|, each of dimension (N_c − n) × |E|. The element H_{ij}^(n) corresponds to the number of i ← j ∈ E transitions on the renormalized stochastic matrix resulting from the elimination of n nodes [Eq. (54)]. Hence, η_j = Σ_{γ∈B∪∂A} H_{γj}^(0). Note that there are no transitions to eliminated nodes (with indices j ≤ n) in the hopping matrix of the nth iteration, hence the stated dimensionality.

In the next stage of the kPS algorithm, an iterative reverse randomization procedure is used to sample the matrix H^(0), with elements H_{ij}^(0) corresponding to the numbers of i ← j transitions on the original (untransformed) subnetwork, with the associated stochastic matrix T^(0). This procedure exploits the fact that H^(n−1) can be sampled directly from H^(n), T^(n), and T^(n−1) without requiring explicit simulation of the dynamics on the network corresponding to T^(n−1), in which the nth node is not eliminated. In this sense, a single iteration of the reverse randomization procedure “undoes” a single iteration of the renormalization [Eq. (54)]. The relationship between renormalization and iterative reverse randomization is illustrated in Fig. 4. The sampling of successive hopping matrices is based on the set of N_c × n-dimensional matrices {G^(n)}, for 0 < n ≤ |E|, with elements G_{ij}^(n) given by the ratio of i ← j transition probabilities in the Markov chains where the nth node is the next to be eliminated in the renormalization phase and where the nth node has been eliminated,

G_{ij}^{(n)} = \frac{T_{ij}^{(n-1)}}{T_{ij}^{(n)}} \quad \forall i > n.
(57)

Hence, G_{ij}^(n) is the fraction of i ← j transitions in T^(n−1) that are direct, i.e., that do not proceed via the nth node. If either i or j is not directly connected to n in T^(n−1), then G_{ij}^(n) = 1.

FIG. 4.

Illustration of the main idea of the kPS algorithm. The model network shown consists of four nodes, and the state space is divided as follows: E = {β, n, γ}, ∂A = {α}, and T = ∅. The transition probability matrix T^(β) is given by the elimination of node β from the initial transition matrix T^(0). (a) Elimination of node n from T^(β) by renormalization gives the matrix T^(n). (b) The hopping matrix H^(β), containing the numbers of internode transitions on T^(β), can be generated randomly from the hopping matrix H^(n), where node n is eliminated, using T^(β) and T^(n). + and − indicate that the matrix element corresponding to the relevant edge must have increased or decreased, respectively. Note that the values of the elements of the hopping matrix may instead stay the same in the reverse randomization procedure. Broken arrows indicate that the edge does not exist or that the corresponding matrix element is zero. Eliminated nodes are shown as transparent. ∗ indicates a new edge resulting from renormalization. The illustration assumes that the stochastic matrix is the linearized or discrete-time transition matrix so that both the transition and hopping matrices depend on the lag time τ and each node has a self-loop transition.


Since the total number of i ← j kMC transitions along the α ← ϵ trajectory on the original Markov chain is known from η, which is derived from H^(0) and h, the time t_A elapsed along the path for escape from the community B can be sampled. The pseudocode for the categorical sampling, iterative reverse randomization, and transition time sampling procedures detailed in this section is given in Algorithm 3. Here, ∼ denotes a random number drawn from a distribution, Γ(α, β) is the gamma distribution with shape parameter α and rate parameter β^−1, and B(h, p) and NB(r, p) are the binomial and negative binomial distributions, respectively, with trial number h, success number r, and success probability p.

ALGORITHM 3.

The categorical sampling, iterative reverse randomization, and transition time sampling procedures that constitute the remainder of the main loop of the kinetic path sampling (kPS) algorithm after determination of the set of transition probability matrices {T^(n)}, 0 ≤ n ≤ |E|, by renormalization. Note that the final for loop, in which a transition time is sampled, corresponds to the continuous-time formulation.

input: sets of nodes E, T, B ≡ E ∪ T, A ≡ B^c, and ∂A ⊆ A (Fig. 3)
       N_c = |E ∪ T ∪ ∂A|
       initially occupied node ϵ ∈ B
       set of |E|+1 stochastic matrices {T^(n)}, 0 ≤ n ≤ |E|, from renormalization [Eq. (54)] of nodes in E
       set of |E| matrices {G^(n)}, for 0 < n ≤ |E|, derived from the {T^(n)} matrices via Eq. (57)
output: absorbing node α ∈ ∂A
        hopping matrix H^(0) with elements H_ij^(0) equal to the numbers of i ← j ∈ E internode transitions
        along the sampled α ← ϵ path on T^(0)
        vector η with elements η_j equal to the total number of transitions from node j ∈ B
        time elapsed t_A for the α ← ϵ trajectory for escape from B
initialize H^(|E|) (dimension (N_c − |E|) × |E|);
initialize h (dimension |E|);
initialize η (dimension |B|);
β ← ϵ;
/* categorical sampling procedure to sample an absorbing boundary node α ∈ ∂A and numbers of
   internode transitions on the renormalized subnetwork B ∪ ∂A */
while β ∉ ∂A do
    γ ∼ c_β [Eq. (56)];
    if β ∈ E then
        H_γβ^(|E|) ← H_γβ^(|E|) + 1;
    else
        η_β ← η_β + 1;
    β ← γ;
α ← β;
/* sample numbers of transitions from noneliminated nodes */
for δ ∈ T do
    f_δ ∼ NB(η_δ, 1 − T_δδ^(0));
    η_δ ← η_δ + f_δ;
/* iterative reverse randomization to generate the numbers of internode transitions on the
   original subnetwork B ∪ ∂A */
for n = |E| to 1 do
    initialize H^(n−1) (dimension (N_c − (n − 1)) × |E|);
    for β ∈ E\{n} and γ ∈ {n + 1, …, N_c} do
        H_γβ^(n−1) ∼ B(H_γβ^(n), G_γβ^(n));
    for β ∈ E\{n} do
        H_nβ^(n−1) ← Σ_{γ∈{n+1,…,N_c}} [H_γβ^(n) − H_γβ^(n−1)];
    for γ ∈ {n + 1, …, N_c} do
        H_γn^(n−1) ← H_γn^(n) + Σ_{β∈E\{n}} [H_γβ^(n) − H_γβ^(n−1)];
    h_n ← Σ_{γ∈{n+1,…,N_c}} H_γn^(n−1);
    H_nn^(n−1) ∼ NB(h_n, 1 − T_nn^(n−1));
    η_n ← h_n + H_nn^(n−1);
    deallocate H^(n), T^(n);
/* sample the transition time for the α ∈ ∂A ← ϵ ∈ B trajectory */
t_A ← 0;
for j ∈ B ≡ E ∪ T do
    Δ_j ∼ Γ(η_j, τ_j);
    t_A ← t_A + Δ_j;
deallocate H^(0), T^(0), h, η;
return t_A, α

The final section of Algorithm 3, in which t_A is sampled, assumes that the input transition probability matrices correspond to a CTMC. Recall that in the continuous-time case, the time for an i ← j transition is sampled from an exponential distribution with rate parameter τ_j^−1.110 If the linearized transition probability matrix is used instead of the branching probability matrix, then the mean waiting times for nodes on the untransformed subnetwork T^(0) are uniform, τ_j = τ ∀ j. Hence, the time t_A elapsed along the escape trajectory on the untransformed network can be drawn from a single gamma distribution, with the shape parameter equal to the total number of internode transitions on T^(0) and rate parameter τ^−1. That is, t_A ∼ Γ(Σ_{j∈E∪T} η_j, τ). In the discrete-time case, the time step is fixed and uniform for all transitions, equal to the lag time τ, and so t_A = τ Σ_{j∈E∪T} η_j.
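The time-sampling step alone might look as follows; this is our own sketch of the final loop of Algorithm 3, covering the branching-probability (CTMC) case and the single-gamma shortcut for the linearized matrix.

import numpy as np

rng = np.random.default_rng()

def sample_escape_time(eta, tau):
    # CTMC with branching probabilities: t_A is a sum of independent gamma
    # variates, Delta_j ~ Gamma(eta_j, tau_j), over the basin nodes j.
    return sum(rng.gamma(e, t) for e, t in zip(eta, tau) if e > 0)

def sample_escape_time_linearized(eta, lag):
    # Linearized transition matrix: uniform waiting times, so a single
    # gamma variate with shape sum(eta) and scale equal to the lag time.
    return rng.gamma(np.sum(eta), lag)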

Algorithm 3 describes the generation of a single escape trajectory from the currently occupied community B, given the set of transition matrices {T^(n)} computed by renormalization (Sec. V A 1). In a kPS simulation to sample complete A ← B first passage paths, the main loop of the kPS algorithm (comprising renormalization, categorical sampling, iterative reverse randomization, and sampling of a transition time) is repeated many times to yield a stochastic trajectory that hits the target state A.

3. Model example

For clarity, we illustrate the kPS algorithm for a model example. The network shown in Fig. 4 consists of four nodes. The set E comprises three nodes β < n < γ that are to be eliminated by graph transformation, in order of increasing indices. The absorbing boundary comprises a single node, ∂A ≡ {α}, and there are no retained nodes, so that E ≡ B. Therefore, when all three nodes of the set E have been eliminated, the only available moves are to the absorbing node α. Hence, in the categorical sampling procedure based on Eq. (56), the escape trajectory reaches the absorbing boundary from the initial node ϵ in a single transition. The corresponding element H_αϵ^(γ) of the hopping matrix H^(γ), which contains the numbers of transitions on the network T^(γ) where all nodes of the set E are eliminated, is then incremented by one.

To understand the effect of renormalization, in the form employed within kPS [Eq. (54)], consider the elimination of node n from the stochastic matrix T^(β) to give T^(n), as shown in Fig. 4. Prior to the elimination of node n, node β is already eliminated, and hence, no edges to node β are included in the network T^(β). Upon the elimination of node n, edges to node n, including the n ← n self-loop, are likewise removed from the network. To compensate for the removal of these transitions, the transition probabilities of edges from node n to neighboring noneliminated nodes (i.e., γ and α) increase. The transition probabilities for edges between pairs of nodes that are both directly connected to node n, and for which the terminal node is noneliminated, similarly increase. This operation involves updating the γ ← γ and α ← α self-loops, as well as the α ← γ, γ ← α, and α ← β edges, which already exist in the network, and the addition of a γ ← β edge (indicated by ∗ in Fig. 4) in the renormalized network, since nodes β and γ were not directly connected in T^(β). Figure 4 also shows the reverse randomization procedure to sample H^(β) from H^(n), the hopping matrices for which node n is noneliminated and eliminated (i.e., corresponding to the networks T^(β) and T^(n)), respectively. The numbers of transitions along the edges to be updated are generated randomly in a way that compensates for the changes in the transition probabilities resulting from the corresponding single iteration of the renormalization phase, as outlined in Algorithm 3.

A single iteration of the reverse randomization procedure, given as pseudocode in the third for loop of Algorithm 3, comprises four individual steps. Figure 5 shows the effect of each of these individual steps [(i)–(iv)] on the hopping matrix H(n) to yield the matrix H(β) (cf. Fig. 4). In step (i), the numbers of transitions along each edge from a node of the set E to a noneliminated node, excluding edges associated with node n, are updated by drawing new values from a binomial distribution. For each transition along an edge of the renormalized network T(n), a Bernoulli trial is conducted with success probability equal to the ratio of transition probabilities for the edge in the networks T(β) and T(n), where node n is noneliminated and eliminated, respectively. It is not necessary to consider edges associated with a node that is not directly connected to node n in T(β), because renormalization does not affect these edges, and hence the success probability in the Bernoulli trials is unity. Similarly, if two nodes are not directly connected to one another but are both connected to node n in T(β), then the ratio of transition probabilities associated with this edge in T(β) and T(n) is necessarily zero, and hence there are no transitions along this (nonexistent) edge in H(β). Thus, the γ ← β edge in Fig. 5 is removed after step (i).
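Step (i) thus amounts to binomial resampling of the edge counts; a minimal sketch (names are illustrative):

```python
import numpy as np

rng = np.random.default_rng()

def resample_edge_count(h, t_before, t_after):
    # h: transitions recorded for the edge in H^(n); t_before and t_after
    # are the edge's probabilities in T^(beta) (node n noneliminated) and
    # T^(n) (node n eliminated). Renormalization can only increase an
    # edge probability, so t_before / t_after <= 1; an edge absent from
    # T^(beta) (t_before = 0), like gamma <- beta in Fig. 5, is removed.
    return rng.binomial(h, t_before / t_after)
```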

FIG. 5.

Illustration of a single iteration of the reverse randomization procedure (the third for loop of Algorithm 3) for the network depicted in Fig. 4. The hopping matrix H(β), which contains the numbers of kMC moves on the network T(β), where node n is noneliminated, is generated randomly from the hopping matrix H(n), which contains the numbers of internode transitions on the network T(n), where node n is eliminated. This procedure uses the transition probabilities that are the elements of the stochastic matrices T(β) and T(n) and is completed in four stages [(i)–(iv)] as described in Sec. V A 3. + and − indicate that the hopping matrix element corresponding to the relevant edge must have increased or decreased, respectively, or else have remained the same. Broken arrows indicate that the edge does not exist or that the corresponding matrix element is zero. Eliminated nodes are shown as transparent. The highlighted edges are those that are updated in the indicated stage of the reverse randomization procedure. Edges used in this calculation are weakly transparent, and irrelevant edges are strongly transparent. The illustration assumes that the stochastic matrix is the linearized or discrete-time probability matrix so that the elements of the hopping matrices depend on the lag time τ, and each node has a self-loop transition.


In step (ii), the numbers of transitions along edges from nodes of the set E to node n, excluding the n → n self-loop, increase to account for any decreases in the numbers of transitions from nodes of the set E in step (i). Similarly, in step (iii), the numbers of transitions from node n to noneliminated nodes increase to account for any decreases in the numbers of transitions to noneliminated nodes in step (i). Finally, in step (iv), the number of n → n self-loop transitions is drawn from a negative binomial distribution, where the number of trials is equal to the number of transitions from node n to noneliminated nodes. By repeated application of this reverse randomization procedure, the numbers of transitions from all eliminated (and any transient) nodes on the original subnetwork T(0) are obtained, and hence a time for the trajectory can be sampled.
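For instance, step (iv) reduces to a single negative-binomial draw. In the sketch below (the names are illustrative), n_exits is the number of transitions from node n to noneliminated nodes after step (iii), and t_nn is the n → n self-loop probability on the network where node n is noneliminated:

```python
import numpy as np

rng = np.random.default_rng()

def sample_self_loops(n_exits, t_nn):
    # step (iv): each kMC move from node n either repeats the self-loop
    # (probability t_nn) or departs (probability 1 - t_nn), so the number
    # of n -> n moves preceding n_exits departures is negative binomial
    if n_exits == 0:
        return 0
    return rng.negative_binomial(n_exits, 1.0 - t_nn)
```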

4. Committor and absorption probabilities

The formulation of the state reduction methodology employed in kPS [Eq. (54)] also provides a numerically stable procedure to compute committor probabilities (Sec. II E). Recall that the forward A ← B committor probability for the jth node, $q_j^+$ [Eqs. (28) and (29)], is defined as the probability that a trajectory at node j hits the absorbing macrostate A before hitting the initial macrostate B.97 By definition, $q_b^+ = 0$ for b ∈ B and $q_a^+ = 1$ for a ∈ A.111 The committor probabilities for all nodes are obtained following the renormalization phase of the kPS computation if the state space is divided so that the absorbing set comprises both endpoint states, A ∪ B, and the eliminated set is the complement, $E \equiv (A \cup B)^{\mathrm{c}}$. Then, after repeated application of Eq. (54) to the nodes in the set E, the only transitions from eliminated nodes in the transformed network T(|E|) are to nodes in either of the endpoint states A or B. The forward committor probability for the jth node, j ∈ E, is therefore simply

$$q_j^+ = \frac{\sum_{a \in A} T_{aj}^{(|E|)}}{\sum_{\gamma \in A \cup B} T_{\gamma j}^{(|E|)}}.$$
(58)

This procedure is readily generalizable to computing the set of committor probabilities to any number of alternative attractor states in a single computation.112 
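As a concrete sketch of Eq. (58) for a dense renormalized matrix (Python; the function name is ours), assuming all nodes outside A ∪ B have been eliminated:

```python
import numpy as np

def forward_committors(T_ren, A, B):
    # T_ren: renormalized matrix T^(|E|); T_ren[i, j] is the probability
    # of the i <- j move. A, B: sets of node indices for the endpoint
    # states. Returns q^+ for every node [cf. Eq. (58)].
    q = np.empty(T_ren.shape[0])
    targets = sorted(A | B)       # only endpoint nodes remain as targets
    for j in range(T_ren.shape[0]):
        if j in A:
            q[j] = 1.0
        elif j in B:
            q[j] = 0.0
        else:
            flux_to_A = sum(T_ren[a, j] for a in A)
            flux_total = sum(T_ren[g, j] for g in targets)
            q[j] = flux_to_A / flux_total
    return q
```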

This formulation of GT also allows for the robust computation of absorption probabilities.1 The probability $B_{aj}$ that a trajectory initialized at the jth node is absorbed at node a ∈ A is given straightforwardly by the transition probabilities of the renormalized network in which only the nodes of the set A remain noneliminated, i.e., using the absorbing set A and $E \equiv A^{\mathrm{c}}$, via $B_{aj} = T_{aj}^{(|E|)}$ for a ∈ A and j ∉ A.

The Monte Carlo with absorbing Markov chain101–104 (MCAMC) algorithm provides an alternative approach to simulate pathways for a nearly reducible Markov chain and is similar in spirit to kPS. In the first passage time analysis234 (FPTA) variant of the MCAMC method, the problem of sampling a transition time $t_A$ and an exit node at the absorbing boundary, α ∈ A, for a trajectory escaping from an initial node of the currently occupied community, ϵ ∈ B, is solved exactly using eigendecomposition (Sec. II).

The probability that the trajectory has exited the community B by time t is equal to the sum of the occupation probabilities for the absorbing boundary nodes at that time, which in the continuous-time case is given by [cf. Eq. (21)]

$$p_A(t) = \sum_{\alpha \in A} p_\alpha(t) = \sum_{\alpha \in A} \sum_k \psi_\alpha^{(k)} \phi_\epsilon^{(k)} \mathrm{e}^{-\gamma_k t},$$
(59)

where we have used the fact that the initial probability distribution is localized at the ϵth node. The Markov chain for the subnetwork B ∪ A is reducible, because the nodes of the state A are treated as absorbing, so that $p_A(t \to \infty) = 1$. An exit time $t_A$ can therefore be sampled by drawing a random number r1 ∈ (0, 1] and solving $p_A(t_A) = r_1$ numerically using a bracketing and bisection method.234 The iterative calculation to determine $t_A$ in the FPTA method is initialized from the mean exit time, which is given by

$$\langle t_A \rangle = \frac{\sum_{k=2}^{|S|} \sum_{\alpha \in A} \psi_\alpha^{(k)} \phi_\epsilon^{(k)} / \gamma_k}{\sum_{k=2}^{|S|} \sum_{\alpha \in A} \psi_\alpha^{(k)} \phi_\epsilon^{(k)}},$$
(60)

where we have used the Perron–Frobenius theorem, γ1 = 0.107 The approximate mean rate method avoids the iterative calculation to determine $t_A$ by simply using $\langle t_A \rangle$ to advance the simulation clock.235 The relative occupation probability distribution of the absorbing nodes at any given time is also known from the eigendecomposition [Eq. (59)], and therefore an exit node α can be sampled by drawing a second random number r2 ∈ (0, 1] and comparing with the probability distribution $p_\alpha(t_A)/p_A(t_A)$ ∀ α ∈ A, analogous to the procedure for selecting a move in the standard kMC algorithm.81,82
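A schematic dense implementation of the FPTA sampling step might look as follows. This is a sketch under stated assumptions: a diagonalizable rate matrix K in the column convention (K[i, j] is the rate of the i ← j transition) with the columns of the absorbing nodes zeroed, and a doubling bracket with Brent's method substituted for the initialization from Eq. (60):

```python
import numpy as np
from scipy.optimize import brentq

def fpta_exit_time(K, absorbing, eps, r):
    # K: rate matrix of the B u A subnetwork with absorbing columns
    # zeroed; absorbing: list of absorbing node indices; eps: initial
    # node index; r in (0, 1). Solves p_A(t_A) = r [cf. Eq. (59)].
    lam, V = np.linalg.eig(K)              # eigenvalues are -gamma_k
    w = np.linalg.inv(V)[:, eps]           # spectral weights of delta_eps
    c = V[absorbing, :].sum(axis=0) * w    # per-mode contribution to p_A
    def p_A(t):
        return float(np.real(np.sum(c * np.exp(lam * t))))
    t_hi = 1.0                             # p_A increases monotonically
    while p_A(t_hi) < r:                   # from 0 toward 1: bracket root
        t_hi *= 2.0
    return brentq(lambda t: p_A(t) - r, 0.0, t_hi)
```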

The time complexity to simulate a trajectory segment escaping from the currently occupied basin B to the absorbing boundary A is $\mathcal{O}(|B \cup A|^3)$ for both the kPS83 and MCAMC92 algorithms. Importantly, the central processing unit (CPU) time to execute a single iteration of the main loop in the kPS and MCAMC algorithms does not depend on the actual number of transitions along the path. This feature makes the methods extremely powerful when the basins are chosen to accurately characterize the metastable sets of nodes, where the trajectory segments are sufficiently long that simulation of the path by kMC is unfeasible. Thus, the computational overhead associated with the more advanced algorithms is offset.100 The choice of community structure that is leveraged in these advanced simulation algorithms is therefore crucial to their success in simulating trajectories on nearly reducible Markov chains. The partitioning is also a critical consideration when eliminating blocks of nodes simultaneously in state reduction procedures, since inversion of the Markovian kernel for a community of nodes is numerically unstable if the subnetwork encompasses a separation of characteristic timescales. This block operation can be used in the GT algorithm to compute the A ← B MFPT (Sec. III A) and in algorithms to compute the expected numbers of node visits112 and state visitation probabilities114 (Sec. IV E), and it also forms the basis of exact uncoupling–coupling to compute the stationary distribution (Sec. IV A). Likewise, a condition for efficient convergence of iterative aggregation–disaggregation is that the diagonal blocks of the partitioned Markov chain are nearly stochastic (Sec. IV B).

A discussion of appropriate community detection procedures to identify metastable macrostates is beyond the scope of the present review. However, we briefly mention a recent advance that provides a framework to refine an initial partitioning, which is useful in practical applications where the efficiency and/or numerical stability of computations is sensitive to perturbations of the community structure. The central theorem of this approach is that, for a given partitioning of a CTMC into a set of N communities, there exists an upper bound on the second dominant eigenvalue of the N × N-dimensional lumped rate matrix given by the local equilibrium approximation (LEA).145,157,236 An analogous theorem applies in discrete time. This result suggests a variational optimization procedure to refine an initial clustering, where the assigned macrostates for one or more nodes at the intercommunity boundaries are randomly switched to those of a neighboring community.237 The second dominant eigenvalue $\gamma_2^C$ of the updated lumped Markov chain given by the LEA6,144 is then computed. The latter operation is efficient if N is not large, especially since the full eigenspectrum is not required, and therefore Krylov subspace methods (Sec. II D) can be employed. Moreover, the eigendecomposition is numerically stable if the lumped Markov chain does not encompass a separation of characteristic timescales, which in any case is required for the Markovian approximation to the coarse-grained dynamics to be valid.

An increase in $\gamma_2^C$ essentially guarantees that the reduced Markovian network better approximates the slowest relaxation process of the original Markov chain.238,239 This eigenvalue therefore provides a rigorous metric to improve an initial clustering in an interpretable way. Recently, it was shown that there also exists an upper bound for the Kemeny constant $\zeta_{\mathrm{K}}^C$ [Eq. (15)] of a reduced Markov chain lumped according to the community structure C, with coarse-grained intercommunity transition probabilities or rates given by the LEA.108 $\zeta_{\mathrm{K}}^C$ therefore provides an alternative objective function for the variational optimization procedure, which may be preferable to using the second dominant eigenvalue $\gamma_2^C$. The Kemeny constant is a sum over eigenvalues [Eq. (18)] and therefore quantifies the extent to which the coarse-grained Markov chain reproduces all relaxation processes of the original Markov chain, with slower dynamical eigenmodes receiving a larger weighting. In the simplest possible implementation, the variational optimization is performed using a greedy approach, where node-switching moves that increase $\gamma_2^C$ (or $\zeta_{\mathrm{K}}^C$) are always accepted. More sophisticated stochastic optimization approaches, such as simulated annealing, calculate an acceptance probability for a proposed move.237
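In Python, a greedy variant of this optimization could be sketched as follows; we assume a column-convention rate matrix K, a known stationary distribution pi, and an illustrative list of proposed boundary-node switches, and we further assume that no proposal empties a community:

```python
import numpy as np

def lea_lumped(K, pi, comms):
    # coarse-grained rate matrix under the local equilibrium
    # approximation; comms is a list of index arrays (column convention)
    N = len(comms)
    Kc = np.zeros((N, N))
    for I, ci in enumerate(comms):
        w = pi[ci] / pi[ci].sum()          # local equilibrium weights
        for J, cj in enumerate(comms):
            if J != I:
                Kc[J, I] = (K[np.ix_(cj, ci)] @ w).sum()
    Kc -= np.diag(Kc.sum(axis=0))          # columns of a rate matrix sum to 0
    return Kc

def gamma2(Kc):
    # second dominant decay rate: eigenvalues are 0 = -gamma_1 > -gamma_2 >= ...
    ev = np.sort(np.linalg.eigvals(Kc).real)[::-1]
    return -ev[1]

def greedy_refine(K, pi, comms, proposals):
    # proposals: candidate (node, source community, destination community)
    # switches at the intercommunity boundaries (illustrative input)
    best = gamma2(lea_lumped(K, pi, comms))
    for v, src, dst in proposals:
        trial = [c.copy() for c in comms]
        trial[src] = trial[src][trial[src] != v]
        trial[dst] = np.append(trial[dst], v)
        g2 = gamma2(lea_lumped(K, pi, trial))
        if g2 > best:                      # greedy: keep improving moves
            comms, best = trial, g2
    return comms, best
```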

The computational overhead associated with the kPS and MCAMC algorithms may become significant if a metastable community comprises a large number of nodes. It is not an option to split a large basin into separate macrostates because then escape of a trajectory to an absorbing boundary does not constitute a long-timescale process. The only way to reduce the computational expense for a given iteration of either simulation algorithm is to reduce the dimensionality of the relevant subnetwork. A simple way to achieve this goal is recursive regrouping6 to subsume sets of nodes that are interconnected by transition rates that exceed a specified threshold,52 with the new intergroup transition rates given by the LEA.157 If the basin is metastable, then the error resulting from this procedure ought to be small, since the regrouping of nodes that are interconnected by fast transition rates should not have a significant effect on the slow dynamics for escape from the community. A more advanced pre-processing strategy is partial graph transformation,240 where renormalization [Eqs. (34) and (35)] is used to eliminate a subset of chosen nodes within the predefined communities.86 This framework exploits the fact that, for a typical Markov chain, it is likely that the path probability distribution for the ensemble of escape trajectories from a given basin is localized.100 For instance, the probability distribution of reactive trajectories [Eq. (31)] may be concentrated in a small fraction of nodes, or escape to a particular node at the absorbing boundary may be strongly favored. Therefore, in principle, it should be possible to identify the subset of basin nodes to be eliminated that minimizes the information loss on the slow dynamics for escape from the community. Empirically, we find that it is favorable to retain nodes at the intercommunity boundary, as well as nodes with large (quasi-) stationary probabilities.240 
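A minimal sketch of the regrouping step mentioned above (Python; union-find over superthreshold edges, with illustrative names), after which the intergroup rates would be assigned according to the LEA as in the lumping sketch above:

```python
def regroup(K, threshold):
    # subsume nodes interconnected by transition rates exceeding the
    # threshold into single groups, via union-find on the edges
    n = K.shape[0]
    parent = list(range(n))

    def find(x):
        while parent[x] != x:
            parent[x] = parent[parent[x]]  # path compression
            x = parent[x]
        return x

    for i in range(n):
        for j in range(n):
            if i != j and K[i, j] > threshold:
                parent[find(i)] = find(j)  # merge the two groups
    groups = {}
    for v in range(n):
        groups.setdefault(find(v), []).append(v)
    return list(groups.values())
```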

We have provided an overview of linear algebra (Sec. II) and state reduction methods (Secs. III–V A) for the exact numerical analysis of Markovian network dynamics. In Sec. II, we surveyed expressions for properties characterizing both the global and local dynamical behavior of finite Markov chains. We began by noting that macroscopic quantities, such as moments of the first passage and mixing time distributions, can be computed from a fundamental matrix associated with an irreducible Markov chain (Sec. II B) or using eigendecomposition (Sec. II C). We also defined quantities that characterize an A ← B transition from an initial (B) to an absorbing (A) state at a microscopic level of detail, including the committor probabilities for nodes, which are the central objects in analyzing the features of the productive transition (Sec. II E). For nearly reducible Markov chains, which feature a separation of characteristic timescales, ill-conditioning typically prohibits the solution of a given system of linear equations by conventional algorithms. While preconditioning schemes can be employed to aid the convergence of sparse linear algebra methods applied to Markov chains exhibiting metastability, this framework is not readily generalizable, since nontrivial and system-specific considerations arise (Sec. II D).

We therefore focused on state reduction algorithms, which have inherent numerical stability, to analyze arbitrary discrete- and continuous-time Markov chains. State reduction methods are based on renormalization [Eq. (34)] to sequentially eliminate nodes or blocks thereof. Some state reduction algorithms (such as the GTH130,213 and REFUND128 algorithms) also incorporate a backward-pass phase to restore nodes in turn, recursively computing the properties of the successive Markov chains after solving the trivial one-node reduced system. Section III detailed the graph transformation algorithm to robustly compute the A ← B MFPT. In Sec. IV, state reduction procedures were described that enable computation of all dynamical quantities outlined in Sec. II for nearly reducible Markov chains. The stationary distribution can be computed by exact uncoupling–coupling (Sec. IV A), iterative aggregation–disaggregation (Sec. IV B), or the Grassmann–Taksar–Heyman algorithm (Sec. IV C). The fundamental matrix of an irreducible Markov chain, and hence moments of the FPT distribution for all pairwise internode transitions, can be determined by the REFUND algorithm (Sec. IV D). Other state reduction methods, including procedures to compute node properties that characterize the ensemble of A ← B transition paths,97,99 such as the committor112 and visitation114 probabilities for states, were summarized in Secs. IV E and V A 4. Finally, we noted that sampling trajectories using kinetic Monte Carlo is unfeasibly inefficient for systems exhibiting rare event dynamics and presented an account of kinetic path sampling, which extends the state reduction methodology to sample the numbers of internode transitions along pathways (Sec. V).

The metastable regime is crucially important, since realistic dynamical models typically exhibit a particular rare event that is the process of interest.52–66 The state reduction methodology reviewed herein allows for extensive numerical analysis of nearly reducible Markov chains, which would otherwise be intractable, and hence is valuable in many practical applications. The ability to extract observable quantities and perform detailed analyses of computationally challenging models will lead to new insights into the dynamical behavior of complex systems. In particular, we can probe the relationship between the local features of a Markovian network and the slow global dynamics, which is typically influenced strongly by a small number of states that facilitate the dominant transition mechanisms.87,100,112,115

Many of the methods described in this work, including the LU decomposition formulation of the graph transformation (GT) algorithm to compute MFPTs, committor probabilities, and absorption probabilities; the Grassmann–Taksar–Heyman (GTH) algorithm; a state reduction procedure to compute visitation probabilities; and kinetic path sampling (kPS), are implemented in DISCOTRESS, the DIscrete State COntinuous Time Rare Event Simulation Suite. DISCOTRESS is a flexible C++ program for efficient simulation and numerically stable analysis of the stochastic dynamics on arbitrary discrete- and continuous-time finite Markov chains, including nearly reducible Markov chains that are computationally challenging. DISCOTRESS is freely available software under the GNU General Public License and can be found online at github.com/danieljsharpe/DISCOTRESS. An implementation of the GT algorithm to compute the A ← B MFPT for the transition between two endpoint states A and B is included in the PATHSAMPLE program, which is available at www-wales.ch.cam.ac.uk/PATHSAMPLE/ under the GNU General Public License.

D.J.S. gratefully acknowledges the Cambridge Commonwealth, European and International Trust for a Ph.D. scholarship and additional financial support from the EPSRC. D.J.W. gratefully acknowledges support from the EPSRC.

The authors have no conflicts to disclose.

Data sharing is not applicable to this article as no new data were created or analyzed in this study.

1. J. G. Kemeny and J. L. Snell, Finite Markov Chains (Springer, New York, 1960).
2. J. R. Norris, Markov Chains (Cambridge University Press, New York, 1997).
3. C. M. Grinstead and J. L. Snell, Introduction to Probability (American Mathematical Society, Providence, RI, 1997).
4. H. M. Taylor and S. Karlin, An Introduction to Stochastic Modeling, 3rd ed. (Academic Press, London, UK, 1998).
5. D. J. Wales, Energy Landscapes (Cambridge University Press, Cambridge, UK, 2003).
6. D. J. Wales and P. Salamon, Proc. Natl. Acad. Sci. U. S. A. 111, 617 (2014).
7. S. Sriraman, I. G. Kevrekidis, and G. Hummer, J. Phys. Chem. B 109, 6479 (2005).
8. N.-V. Buchete and G. Hummer, J. Phys. Chem. B 112, 6057 (2008).
9. N.-V. Buchete and G. Hummer, Phys. Rev. E 77, 030902 (2008).
10. P. Metzner, E. Dittmer, T. Jahnke, and C. Schütte, J. Comput. Phys. 227, 353 (2007).
11. R. T. McGibbon and V. S. Pande, J. Chem. Phys. 143, 034109 (2015).
12. P. Metzner, F. Noé, and C. Schütte, Phys. Rev. E 80, 021106 (2009).
13. B. Trendelkamp-Schroer and F. Noé, J. Chem. Phys. 138, 164113 (2013).
14. B. Trendelkamp-Schroer, H. Wu, F. Paul, and F. Noé, J. Chem. Phys. 143, 174101 (2015).
15. B. Trendelkamp-Schroer and F. Noé, Phys. Rev. X 6, 011009 (2016).
16. G. R. Bowman, V. S. Pande, and F. Noé, An Introduction to Markov State Models and Their Application to Long Timescale Molecular Simulation, 1st ed. (Springer, The Netherlands, 2014).
17. J.-H. Prinz, H. Wu, M. Sarich, B. Keller, M. Senne, M. Held, J. D. Chodera, C. Schütte, and F. Noé, J. Chem. Phys. 134, 174105 (2011).
18. V. S. Pande, K. Beauchamp, and G. R. Bowman, Methods 52, 99 (2010).
19. B. E. Husic and V. S. Pande, J. Am. Chem. Soc. 140, 2386 (2018).
20. A. Mardt, L. Pasquali, H. Wu, and F. Noé, Nat. Commun. 9, 5 (2018).
21. R. G. Mantell, C. E. Pitt, and D. J. Wales, J. Chem. Theory Comput. 12, 6182 (2016).
22. D. J. Wales, Mol. Phys. 100, 3285 (2002).
23. D. J. Wales, Mol. Phys. 102, 891 (2004).
24. F. Noé, D. Krachtus, J. C. Smith, and S. Fischer, J. Chem. Theory Comput. 2, 840 (2006).
25. F. Noé and J. C. Smith, "Transition networks: A unifying theme for molecular simulation and computer science," in Mathematical Modeling of Biological Systems, edited by A. Deutsch, L. Brusch, J. Byrne, G. de Vries, and H.-P. Herzel (Birkhäuser, Boston, 2007), Vol. I, pp. 125–144.
26. F. Noé and S. Fischer, Curr. Opin. Struct. Biol. 18, 154 (2008).
27. D. J. Wales, Annu. Rev. Phys. Chem. 69, 401 (2018).
28. J. A. Joseph, K. Röder, D. Chakraborty, R. G. Mantell, and D. J. Wales, Chem. Commun. 53, 6974 (2017).
29. K. Röder, J. A. Joseph, B. E. Husic, and D. J. Wales, Adv. Theory Simul. 2, 1800175 (2019).
30. N. G. van Kampen, Stochastic Processes in Physics and Chemistry (Elsevier, Amsterdam, The Netherlands, 1992).
31. D. T. Gillespie, Markov Processes: An Introduction for Physical Scientists (Academic Press, New York, 1992).
32. R. Zwanzig, J. Stat. Phys. 30, 255 (1983).
33. N. Masuda, M. A. Porter, and R. Lambiotte, Phys. Rep. 716–717, 1 (2017).
34. E. Seneta, Linear Algebra Appl. 34, 259 (1980).
35. D. P. Heyman, J. Appl. Probab. 32, 893 (1995).
36. B. Munsky and M. Khammash, J. Chem. Phys. 124, 044104 (2006).
37. K. N. Dinh and R. B. Sidje, Phys. Biol. 13, 035003 (2016).
38. A. S. Novozhilov, G. P. Karev, and E. V. Koonin, Briefings Bioinf. 7, 70 (2006).
39. L. M. Ricciardi, "Stochastic population theory: Birth and death processes," in Mathematical Ecology, edited by T. G. Hallam and S. A. Levin (Springer, Berlin, Heidelberg, 1986), pp. 155–190.
40. C. D. Meyer, Jr. and R. J. Plemmons, Linear Algebra, Markov Chains, and Queuing Models (Springer-Verlag, New York, 1993).
41. T. Schmiedl and U. Seifert, J. Chem. Phys. 126, 044101 (2007).
42. L. B. Newcomb, M. Alaghemandi, and J. R. Green, J. Chem. Phys. 147, 034108 (2017).
43. T. L. Hill, Free Energy Transduction and Biochemical Cycle Kinetics (Springer-Verlag, New York, 1989).
44. H. Ge, J. Phys. A: Math. Theor. 45, 215002 (2012).
45. H. Ge, M. Qian, and H. Qian, Phys. Rep. 510, 87 (2012).
46. R. J. Allen, P. B. Warren, and P. R. ten Wolde, Phys. Rev. Lett. 94, 018104 (2005).
47. R. J. Allen, D. Frenkel, and P. R. ten Wolde, J. Chem. Phys. 124, 024102 (2006).
48. B. K. Chu, M. J. Tse, R. R. Sato, and E. L. Read, BMC Syst. Biol. 11, 14 (2017).
49. M. J. Tse, B. K. Chu, C. P. Gallivan, and E. L. Read, PLoS Comput. Biol. 14, e1006336 (2018).
50. L. J. S. Allen, "An introduction to stochastic epidemic models," in Mathematical Epidemiology, edited by F. Brauer, P. van den Driessche, and J. Wu (Springer-Verlag, Berlin, 2008), pp. 81–130.
51. L. J. S. Allen, An Introduction to Stochastic Processes with Applications to Biology (Prentice-Hall, Upper Saddle River, NJ, 2003).
52. T. D. Swinburne, D. Kannan, D. J. Sharpe, and D. J. Wales, J. Chem. Phys. 153, 134115 (2020).
53. D. J. Aldous and M. Brown, "Inequalities for rare events in time-reversible Markov chains I," in IMS Lecture Notes in Statistics, Stochastic Inequalities Vol. 22, edited by M. Shaked and Y. L. Tong (Institute of Mathematical Statistics, Waite Hill, Ohio, 1992), pp. 1–16.
54. P. Heidelberger, ACM Trans. Model. Comput. Simul. 5, 43 (1995).
55. P. Glasserman, P. Heidelberger, P. Shahabuddin, and T. Zajic, Oper. Res. 47, 495 (1999).
56. A. Bovier, M. Eckhoff, V. Gayrard, and M. Klein, J. Phys. A: Math. Gen. 33, L447 (2000).
57. A. Bovier, M. Eckhoff, V. Gayrard, and M. Klein, Commun. Math. Phys. 228, 219 (2002).
58. S. Juneja and P. Shahabuddin, Manage. Sci. 47, 547 (2001).
59. J. Beltrán and C. Landim, J. Stat. Phys. 140, 1065 (2010).
60. E. Vanden-Eijnden and J. Weare, Commun. Pure Appl. Math. 65, 1770 (2012).
61. O. Benois, C. Landim, and M. Mourragui, J. Stat. Phys. 153, 967 (2013).
62. C. Hartmann, R. Banisch, M. Sarich, T. Badowski, and C. Schütte, Entropy 16, 350 (2014).
63. M. Sarich, R. Banisch, C. Hartmann, and C. Schütte, Entropy 16, 258 (2014).
64. M. K. Cameron, J. Chem. Phys. 141, 184113 (2014).
65. T. Gan and M. Cameron, J. Nonlinear Sci. 27, 927 (2017).
66. C. Pérez-Espigares and P. I. Hurtado, Chaos 29, 083106 (2019).
67. F. Noé, I. Horenko, C. Schütte, and J. C. Smith, J. Chem. Phys. 126, 155102 (2007).
68. F. Noé, J. Chem. Phys. 128, 244103 (2008).
69. F. Noé, C. Schütte, E. Vanden-Eijnden, L. Reich, and T. R. Weikl, Proc. Natl. Acad. Sci. U. S. A. 106, 19011 (2009).
70. J.-H. Prinz, B. Keller, and F. Noé, Phys. Chem. Chem. Phys. 13, 16912 (2011).
71. J. D. Chodera and F. Noé, Curr. Opin. Struct. Biol. 25, 135 (2014).
72. D. J. Hartfiel and C. D. Meyer, Jr., Linear Algebra Appl. 272, 193 (1998).
73. C. D. Meyer, Jr., SIAM Rev. 31, 240 (1989).
74. D. P. Heyman and A. Reeves, ORSA J. Comput. 1, 52 (1989).
75. G. W. Stewart and G. Zhang, Numer. Math. 59, 1 (1991).
76. B. Philippe, Y. Saad, and W. J. Stewart, Oper. Res. 40, 1156 (1992).
77. C. D. Meyer, Jr., SIAM J. Matrix Anal. Appl. 15, 715 (1994).
78. D. P. Heyman and D. P. O'Leary, SIAM J. Matrix Anal. Appl. 19, 534 (1998).
79. J. L. Barlow, SIAM J. Matrix Anal. Appl. 22, 230 (2000).
80. T. J. Frankcombe and S. C. Smith, Theor. Chem. Acc. 124, 303 (2009).
81. A. B. Bortz, M. H. Kalos, and J. L. Lebowitz, J. Comput. Phys. 17, 10 (1975).
82. D. T. Gillespie, J. Comput. Phys. 28, 395 (1978).
83. M. Athènes and V. V. Bulatov, Phys. Rev. Lett. 113, 230601 (2014).
84. M. Athènes, S. Kaur, G. Adjanor, T. Vanacker, and T. Jourdan, Phys. Rev. Mater. 3, 103802 (2019).
85. D. R. Mason, R. E. Rudd, and A. P. Sutton, Comput. Phys. Commun. 160, 140 (2004).
86. V. V. Bulatov, T. Oppelstrup, and M. Athènes, "A new class of accelerated kinetic Monte Carlo algorithms," Technical Report No. LLNL-TR-517795, Lawrence Livermore National Laboratory, 2011.
87. D. J. Sharpe and D. J. Wales, J. Chem. Phys. 151, 124101 (2019).
88. S. A. Trygubenko and D. J. Wales, Mol. Phys. 104, 1497 (2006).
89. S. A. Trygubenko and D. J. Wales, J. Chem. Phys. 124, 234110 (2006).
90. D. J. Wales, Int. Rev. Phys. Chem. 25, 237 (2006).
91. D. J. Wales, J. Chem. Phys. 130, 204111 (2009).
92. J. D. Stevenson and D. J. Wales, J. Chem. Phys. 141, 041104 (2014).
93. R. S. MacKay and J. D. Robinson, Philos. Trans. R. Soc., A 376, 20170232 (2018).
94. E. Vanden-Eijnden, "Transition path theory," in Computer Simulations in Condensed Matter Systems: From Materials to Chemical Biology, edited by M. Ferrario, G. Ciccotti, and K. Binder (Springer, Berlin, Heidelberg, 2006), Vol. 1, pp. 453–493.
95. P. Metzner, C. Schütte, and E. Vanden-Eijnden, J. Chem. Phys. 125, 084110 (2006).
96. W. E and E. Vanden-Eijnden, J. Stat. Phys. 123, 503 (2006).
97. P. Metzner, C. Schütte, and E. Vanden-Eijnden, Multiscale Model. Simul. 7, 1192 (2009).
98. W. E and E. Vanden-Eijnden, Annu. Rev. Phys. Chem. 61, 391 (2010).
99. M. von Kleist, C. Schütte, and W. Zhang, J. Stat. Phys. 170, 809 (2018).
100. D. J. Sharpe and D. J. Wales, J. Chem. Phys. 153, 024121 (2020).
101. M. A. Novotny, Phys. Rev. Lett. 74, 1 (1995).
102. M. A. Novotny, "Monte Carlo with absorbing Markov chains: A faster Monte-Carlo algorithm for dynamical studies," in Computer Simulation Studies in Condensed-Matter Physics VII, edited by D. P. Landau, K. K. Mon, and H.-B. Schüttler (Springer-Verlag, Berlin, 1994), pp. 161–165.
103. M. A. Novotny, "A tutorial on advanced dynamic Monte Carlo methods for systems with discrete state spaces," in Annual Reviews of Computational Physics, edited by D. Stauffer (World Scientific, Singapore, 2001), Vol. 9, pp. 153–210.
104. M. A. Novotny, Comput. Phys. Commun. 147, 659 (2002).
105. J. Goutsias and G. Jenkinson, Phys. Rep. 529, 199 (2013).
106. J. K. Weber and V. S. Pande, Biophys. J. 102, 859 (2012).
107. C. R. MacCluer, SIAM Rev. 42, 487 (2000).
108. A. Kells, V. Koskin, E. Rosta, and A. Annibale, J. Chem. Phys. 152, 104108 (2020).
109. J. G. Kemeny and J. L. Snell, Theory Probab. Appl. 6, 101 (1961).
110. S. A. Serebrinsky, Phys. Rev. E 83, 037701 (2011).
111. J.-H. Prinz, M. Held, J. C. Smith, and F. Noé, Multiscale Model. Simul. 9, 545 (2011).
112. D. J. Sharpe and D. J. Wales, Phys. Rev. E 104, 015301 (2021).
113. T. Dayar and N. Akar, SIAM J. Matrix Anal. Appl. 27, 396 (2005).
114. D. J. Sharpe and D. J. Wales, "Stable computation of state visitation probabilities in finite Markov chains" (unpublished, 2021).
115. D. J. Sharpe and D. J. Wales, Phys. Rev. E 103, 063306 (2021).
116. Y. Saad, SIAM J. Numer. Anal. 29, 209 (1992).
117. J. J. Hunter, Linear Algebra Appl. 447, 38 (2014).
118. J. J. Hunter, Adv. Appl. Probab. 1, 188 (1969).
119. C. D. Meyer, Linear Algebra Appl. 22, 41 (1978).
120. J. J. Hunter, Linear Algebra Appl. 45, 157 (1982).
121. J. J. Hunter, Linear Algebra Appl. 127, 71 (1990).
122. J. J. Hunter, Linear Algebra Appl. 429, 1135 (2008).
123. J. G. Kemeny, Linear Algebra Appl. 38, 193 (1981).
124. C. D. Meyer, Jr., SIAM Rev. 17, 443 (1975).
125. P. Coolen-Schrijner and E. A. van Doorn, Probab. Eng. Inf. Sci. 16, 351 (2002).
126. B. F. Lamond and M. L. Puterman, SIAM J. Matrix Anal. Appl. 10, 118 (1989).
127. J. J. Hunter, Linear Algebra Appl. 102, 121 (1988).
128. I. Sonin and J. Thornton, SIAM J. Matrix Anal. Appl. 23, 209 (2001).
129. I. Sonin, Adv. Math. 145, 159 (1999).
130. W. K. Grassmann, M. I. Taksar, and D. P. Heyman, Oper. Res. 33, 1107 (1985).
131. J. Kohlas, Zeit. Oper. Res. 30, A197 (1986).
132. J. J. Hunter, Spec. Matrices 4, 151 (2016).
133. J. J. Hunter, Linear Algebra Appl. 511, 176 (2016).
134. J. J. Hunter, Linear Algebra Appl. 549, 100 (2018).
135. Z. Zhang, A. Julaiti, B. Hou, H. Zhang, and G. Chen, Eur. Phys. J. B 84, 691 (2011).
136. Z. Zhang, Y. Sheng, Z. Hu, and G. Chen, Chaos 22, 043129 (2012).
137. Z. Zhang, T. Shan, and G. Chen, Phys. Rev. E 87, 012112 (2013).
138. N. F. Polizzi, M. J. Therien, and D. N. Beratan, Isr. J. Chem. 56, 816 (2016).
139. M. Torchala, P. Chelminiak, M. Kurzynski, and P. A. Bates, BMC Syst. Biol. 7, 130 (2013).
140. D. J. Bicout and A. Szabo, J. Chem. Phys. 106, 10292 (1997).
141. J. J. Hunter, Linear Algebra Appl. 430, 2607 (2009).
142. J. J. Hunter, Linear Algebra Appl. 417, 108 (2006).
143. J. J. Hunter, Commun. Stat. - Theory Methods 43, 1309 (2014).
144. D. Kannan, D. J. Sharpe, T. D. Swinburne, and D. J. Wales, J. Chem. Phys. 153, 244108 (2020).
145. A. Kells, Z. É. Mihálka, A. Annibale, and E. Rosta, J. Chem. Phys. 150, 134107 (2019).
146. W. Zheng, E. Gallicchio, N. Deng, M. Andrec, and R. M. Levy, J. Phys. Chem. B 115, 1512 (2011).
147. P. Deuflhard, W. Huisinga, A. Fischer, and C. Schütte, Linear Algebra Appl. 315, 39 (2000).
148. P. Deuflhard and M. Weber, Linear Algebra Appl. 398, 161 (2005).
149. S. Kube and M. Weber, J. Chem. Phys. 126, 024103 (2007).
150. B. Reuter, K. Fackeldey, and M. Weber, J. Chem. Phys. 150, 174103 (2019).
151. G. R. Bowman, J. Chem. Phys. 137, 134111 (2012).
152. A. Jain and G. Stock, J. Chem. Theory Comput. 8, 3810 (2012).
153. F. Noé, H. Wu, J.-H. Prinz, and N. Plattner, J. Chem. Phys. 139, 184114 (2013).
154. G. R. Bowman, L. Meng, and X. Huang, J. Chem. Phys. 139, 121905 (2013).
155. B. E. Husic, K. A. McKiernan, H. K. Wayment-Steele, M. M. Sultan, and V. S. Pande, J. Chem. Theory Comput. 14, 1071 (2018).
156. J. A. Ward and M. López-García, Appl. Network Sci. 4, 108 (2019).
157. G. Hummer and A. Szabo, J. Phys. Chem. B 119, 9029 (2015).
158. G. H. Weiss, Adv. Chem. Phys. 13, 1 (1967).
159. Y. Saad, Numerical Methods for Large Eigenvalue Problems (SIAM, Philadelphia, PA, 2011).
160. D. S. Watkins, The Matrix Eigenvalue Problem: GR and Krylov Subspace Methods, 2nd ed. (SIAM, Philadelphia, PA, 2007).
161.
162. A. L. Hageman and D. M. Young, Applied Iterative Methods (Academic Press, New York, 1981).
163. C. C. Paige and M. A. Saunders, SIAM J. Numer. Anal. 12, 617 (1975).
164. S. C. Eisenstat, H. C. Elman, and M. H. Schultz, SIAM J. Numer. Anal. 20, 345 (1983).
165. C. Vuik, ESAIM Proc. Surv. 63, 1 (2018).
166. R. G. Grimes, J. G. Lewis, and H. D. Simon, SIAM J. Matrix Anal. Appl. 15, 228 (1994).
167. A. Hadjidimos, J. Comput. Appl. Math. 123, 177 (2000).
168.
169. Y. Saad, Iterative Methods for Sparse Linear Systems, 2nd ed. (SIAM, Philadelphia, PA, 2003).
170. G. W. Stewart, Numer. Math. 25, 123 (1976).
171. W. E. Arnoldi, Q. Appl. Math. 9, 17 (1951).
172. Y. Saad, Linear Algebra Appl. 34, 269 (1980).
173. C. C. Paige, Linear Algebra Appl. 34, 235 (1980).
174. Z.-X. Jia and L. Elsner, J. Comput. Math. 18, 265 (2000).
175. Y. Saad and M. H. Schultz, SIAM J. Sci. Stat. Comput. 7, 856 (1986).
176. R. B. Lehoucq and D. C. Sorensen, SIAM J. Matrix Anal. Appl. 17, 789 (1996).
177. R. B. Morgan, SIAM J. Sci. Comput. 24, 20 (2002).
178. H. D. Simon, Linear Algebra Appl. 61, 101 (1984).
179. L. F. Shampine and C. W. Gear, SIAM Rev. 21, 1 (1979).
180. C. W. Gear and I. G. Kevrekidis, SIAM J. Sci. Comput. 24, 1091 (2003).
181. T. J. Frankcombe and S. C. Smith, J. Theor. Comput. Chem. 2, 179 (2003).
182. P. N. Brown, G. D. Byrne, and A. C. Hindmarsh, SIAM J. Sci. Stat. Comput. 10, 1038 (1989).
183. P. N. Brown, A. C. Hindmarsh, and L. R. Petzold, SIAM J. Sci. Stat. Comput. 15, 1467 (1994).
184. C. W. Gear and Y. Saad, SIAM J. Sci. Stat. Comput. 4, 583 (1983).
185. P. N. Brown and A. C. Hindmarsh, Appl. Math. Comput. 31, 40 (1989).
186. T. J. Frankcombe and S. C. Smith, J. Chem. Phys. 119, 12741 (2003).
187. Y. Saad, "Preconditioned Krylov subspace methods for the numerical solution of Markov chains," in Computations with Markov Chains, edited by W. J. Stewart (Springer, New York, 1995), pp. 49–64.
188. B. Philippe and R. B. Sidje, "Transient solutions of Markov processes by Krylov subspaces," in Computations with Markov Chains, edited by W. J. Stewart (Springer, New York, 1995), pp. 95–119.
189. M. Hochbruck and C. Lubich, SIAM J. Numer. Anal. 34, 1911 (1997).
190. J. van den Eshof and M. Hochbruck, SIAM J. Sci. Comput. 27, 1438 (2006).
191. M. Cameron and E. Vanden-Eijnden, J. Stat. Phys. 156, 427 (2014).
192. C. Dellago, P. G. Bolhuis, and P. L. Geissler, Adv. Chem. Phys. 123, 1 (2002).
193. W. E, W. Ren, and E. Vanden-Eijnden, Chem. Phys. Lett. 413, 242 (2005).
194. A. M. Berezhkovskii and A. Szabo, J. Chem. Phys. 150, 054106 (2019).
195. Q. Li, B. Lin, and W. Ren, J. Chem. Phys. 151, 054112 (2019).
196. M. S. Apaydin, D. L. Brutlag, C. Guestrin, D. Hsu, J.-C. Latombe, and C. Varma, J. Comput. Biol. 10, 257 (2003).
197. N. Singhal, C. D. Snow, and V. S. Pande, J. Chem. Phys. 121, 415 (2004).
198. T. D. Swinburne and D. J. Wales, J. Chem. Theory Comput. 16, 2661 (2020).
199. A. Berezhkovskii, G. Hummer, and A. Szabo, J. Chem. Phys. 130, 205102 (2009).
200. A. M. Berezhkovskii, R. D. Murphy, and N.-V. Buchete, J. Chem. Phys. 138, 036101 (2013).
201. R. Du, V. S. Pande, A. Y. Grosberg, T. Tanaka, and E. S. Shakhnovich, J. Chem. Phys. 108, 334 (1998).
202. G. Hummer, J. Chem. Phys. 120, 516 (2004).
203. A. Berezhkovskii and A. Szabo, J. Chem. Phys. 125, 104902 (2006).
204. C. D. Snow, Y. M. Rhee, and V. S. Pande, Biophys. J. 91, 14 (2006).
205. D. Antoniou and S. D. Schwartz, J. Chem. Phys. 130, 151103 (2009).
206. R. B. Best and G. Hummer, Proc. Natl. Acad. Sci. U. S. A. 102, 6732 (2005).
207. W. K. Grassmann and D. A. Stanford, "Matrix analytic methods," in Computational Probability, edited by W. Grassmann (Springer, New York, 2000), pp. 153–203.
208. E. Seneta, SIAM J. Matrix Anal. Appl. 19, 556 (1998).
209. D. T. Crommelin and E. Vanden-Eijnden, J. Comput. Phys. 217, 782 (2006).
210. T. J. Frankcombe and S. C. Smith, J. Chem. Phys. 119, 12729 (2003).
211. A. Milias-Argeitis and J. Lygeros, J. Chem. Phys. 138, 184109 (2013).
212. C. D. Meyer, Linear Algebra Appl. 114–115, 69 (1989).
213. T. J. Sheskin, Oper. Res. 33, 228 (1985).
214. R. B. Mattingly, ORSA J. Comput. 7, 117 (1995).
215. P. J. Courtois and P. Semal, Linear Algebra Appl. 76, 59 (1986).