Interdependent networks are susceptible to catastrophic consequences due to the interdependence between the interacting subnetworks, making an effective recovery measure particularly crucial. Empirical evidence indicates that repairing the failed network component requires resources typically supplied by all subnetworks, which imposes the multivariate dependence on the recovery measures. In this paper, we develop a multivariate recovery coupling model for interdependent networks based on percolation theory. Considering the coupling structure and the failure–recovery relationship, we propose three recovery strategies for different scenarios based on the local stability of nodes. We find that the supporting network plays a more important role in improving network resilience than the network where the repaired component is located. This is because the recovery strategy based on the local stability of the supporting nodes is more likely to obtain direct benefits. In addition, the results show that the average degree and the degree exponent of the networks have little effect on the superior performance of the proposed recovery strategies. We also find a percolation phase transition from first to second order, which is strongly related to the dependence coefficient. This indicates that the more the recovery capacity of a system depends on the system itself, the more likely it is to undergo an abrupt transition under the multivariate recovery coupling. This paper provides a general theoretical frame to address the multivariate recovery coupling, which will enable us to design more resilient networks against cascading failures.

Interdependencies among networks can greatly expand the function and application areas of the network system, but also highly increase system vulnerability, which typically presents with a cascading failure. To avoid catastrophic events or to heal cascading failures as they occur, researchers have focused on designing recovery strategies. Recent studies have shown that the recovery of one network depends on the functional state of the support network (which they refer to as “recovery coupling”). If the support networks are not functional, recovery will be partly confined. However, this study overlooks the functional status of the repaired components themselves and only takes into account the limitations on the recovery measures imposed by the functional state of the supporting network, and it is specific to post-disaster recovery and does not hold for dynamic recovery. In this paper, we develop a multivariate recovery coupling model for interdependent networks based on percolation theory. Considering the coupling structure and the failure–recovery relationship, we propose three recovery strategies for different scenarios based on the local stability of nodes. We find that the supporting network plays a more important role in improving network resilience than the network where the repaired component is located. In addition, the results show that the average degree and the degree exponent of the networks have little effect on the superior performance of the proposed recovery strategies. We also find a percolation phase transition from first to second order, which is strongly related to the dependence coefficient. This indicates that the more the recovery capacity of a system depends on the system itself, the more likely it is to undergo an abrupt transition under the multivariate recovery coupling. Our studies may provide a general theoretical frame to address the multivariate recovery coupling, which will enable us to design more resilient networks against cascading failures.

Interdependence is an inherent property of complex systems in the real world.1,2 A typical scenario comprises critical infrastructures in which different functional networks, such as the Internet, electrical power grids, transportation systems, and communication networks, interact with each other.3–5 The interdependencies among networks can greatly expand the function and application areas of the system, but also highly increase system vulnerability, which typically presents with a cascading failure.6–8 For example, Venezuela's power grid suffered two consecutive large-scale blackouts from March 7 to 27, 2019, which were caused by a local man-made attack.9 Consider also the 2008 blackout in North America, for which the combination of power component failures and computer networks contributed to cascading failures that eventually led to a large-scale blackout.10 Increasing catastrophic events suggest that a small localized disturbance could cause catastrophic damage in interdependence systems.11,12 To represent these interdependent systems, Buldyrev et al. initially proposed an interdependent network model based on the percolation theory.1 For interdependent networks, the dependent link can further enrich network features, and the study of these coupling networks has expanded rapidly.

To avoid catastrophic events or to heal failures as they occur, many researchers have focused on designing recovery strategies, which are typically divided into two categories, e.g., dynamic recovery13–16 and post-disaster recovery.17,18 Due to its timeliness and rapid response to risk, the dynamic recovery method, in which system component failure and recovery typically occur concurrently and form a competitive relationship, has received a substantial study. Majdandzic et al.14 and Böttcher et al.19 proposed a recovery model for interdependent networks where failure and spontaneous recovery perform concomitantly. Di Muro et al.13 developed a theoretical framework in which failed nodes in the boundary of the functional network are repaired before the network completely collapses. La Rocca et al.16 developed an emergency recovery strategy in which isolated finite clusters are reconnected to the functional giant component with a probability γ at each time step of the cascade of failures. Virtually, the existing recovery methods tend to repair the failed network components with a fixed ability, ignoring the fact that recovery measures are often inhibited by many factors such as resources, cost, and time. Recently, Danziger and Barabási20 developed a theoretical framework for interdependent networks to address the issue of dependence during recovery. They captured the recovery of one network depending on the functional state of the support network (which they refer to as “recovery coupling”). If the support networks are not functional, recovery will be partly confined. This analysis framework offers a theoretical basis for how recovery coupling affects a system's functionality. Virtually, this study overlooks the functional status of the repaired components themselves and only takes into account the limitations on the recovery measures imposed by the functional state of the supporting network, and it is specific to post-disaster recovery and does not hold for dynamic recovery.

However, in many real-world interdependent network systems, the spontaneous recovery or deliberate repair of the network components requires resources typically supplied by both the networks in which the repaired components are located and their supporting network.20–22 For example, in an interdependent system composed of the power grid (PG) and the communication network (CN),23 restoring failed components of CN requires that the repair crews have an electricity supply (which is supported by PG) and coordination through communications (which is supported by CN). Therefore, any absence in either or all of these supporting conditions will consequently cause recovery stunting and sometimes even recovery discontinuation. Consider also the power–transportation network,24,25 a power outage and a blocked road may delay the repair of the failed power components in a given location. Similar cases are abundant in infrastructures. In this paper, we refer to the recovery processes mentioned above as the “multivariate recovery coupling.” In addition, the dynamic recovery serves as a more practical measure to deal with cascading failures as they occur, rather than a reactive post-disaster response.13,16,26 This quick reaction helps prevent disasters and hence helps minimize losses for systems in the real world, especially for interdependent infrastructures. However, the current research on multivariate recovery coupling under dynamic recovery is limited for interdependent networks.

In this paper, we develop a percolation-based theoretical model to capture multivariate recovery coupling under dynamic recovery of interdependent networks. In our model, failed finite clusters are repaired with a time-dependent probability that is related to the instantaneous functional state of each node in the finite cluster and all supporting nodes of the finite cluster. Considering the coupling structure and the failure–recovery relationship, we develop three recovery strategies for different scenarios based on the local stability of nodes. In addition, a network resilience indicator that reflects both the level and the effectiveness of recovery is proposed. We perform both theoretical and simulation results to exemplify the phase transition properties for the random and scale-free networks. We find that the phase transition of the network changes from the first to the second order with an increasing dependence coefficient, which indicates that the more the recovery capacity of a system depends on the system itself, the more likely it is to undergo an abrupt transition under the multivariate recovery coupling. In addition, we find that the supporting network plays a more important role in improving network resilience than the network where the repaired component is located. This performs as the node with a more stable supporting node is chosen for reconnection, the better recovery efficiency. In addition, the results show that the feedback strategy (FS) performs better than the other strategies, and similar results are observed in a real-world interdependent system composed of the Western U.S. power grid and the communication network.

In this paper, we define these recovery processes where repairing the failed network component requires resources supplied by all subnetworks as the “multivariate recovery coupling.” Here, a percolation-based model is developed to capture the multivariate recovery coupling under the dynamic recovery of the interdependent networks.

1. Failure–recovery propagation

The failure mechanism for interdependent networks has been well-studied.1,27,28 Based on the study of Buldyrev et al.,1 we detail the failure–recovery propagation in our model below. For simplicity and without loss of generality, we consider an interdependent network consisting of two networks, A and B, with degree distributions P A ( k ) and P B ( k ), respectively. Both networks have N nodes, and each node in network A/B is randomly linked with one node from network B/A via a bidirectional interdependent link. We denote by t = 0 , 1 , the time steps of the failure–recovery propagation. The initial failure is added on a fraction 1 p of nodes in network A randomly at step t = 0, and the cascading failure is triggered. Following the assumptions in Buldyrev et al.,1 these finite clusters in network A that are not linked to the giant component (GC) will be removed, and then their interdependent nodes in network B also fail. As failure goes from network A to network B, network B is broken into a giant component and several finite clusters. Before these finite clusters fail, La Rocca et al.16 reconnected these isolated finite clusters to the functional giant component with a fixation probability to delay the network crash. Above this, we performed a new emergency recovery process in which isolated finite clusters are repaired following the multivariate recovery coupling as follows:

Step 1: Each isolated finite cluster in network B is repaired with a time-dependent probability γ ψ i ( t ) that depends strongly on multivariate recovery coupling. To better utilize recovery resources, we assume that only the finite cluster with more than two nodes is repaired.

Step 2: For each repaired finite cluster, we select a node and reconnect it to one node belonging to GC in network B. To maintain the initial network topology as much as possible, the nodes we choose for reconnection must have free links, i.e., links that already exist at the beginning but are removed later due to cascading failures.

After the above recovery process, these unrepaired finite clusters in network B will be removed, which, in turn, causes these supporting nodes in network A to fail. Then, the initial step t = 0 is terminated and the next step t = 1 begins. The above procedure is repeated until a steady state is reached, where no more new failure nodes occur. For simplicity, only the multivariate recovery coupling for network B is presented here. Our model can be generalized to more scenarios, such as the initial failure and multivariate recovery coupling occurring concurrently in networks A and B.

2. Multivariate recovery coupling

We now formalize the description of the multivariate recovery coupling. Given that the repair process requires resources from all subnetworks, we hypothesize that the recovery capacity of the finite cluster depends on the transient functional state of the finite cluster, which is related to both networks A and B. According to Danziger and Barabási, the repair rate of the primary network at node i depends on the state of the support network at the same location. Here, we define a time-dependent recovery probability γ ψ i ( t ) associated with the transient functional state S ψ i B of the finite cluster ψ i as
(1)
where f ( x ) is a recovery probability function capturing the association of γ ψ i ( t ) with S ψ i B . α [ 0 , 1 ] is a dependence coefficient that reflects the dependence of the recovery measures on the own internal state of the network itself and the external environment. When α = 1, the network can be thought of as a self-sustaining system, which means that the recovery of network components depends only on the resources provided by the network and is not affected by the external environment. When α = 0, the recovery coupling will degenerate into the conventional recovery model with a constant recovery capacity, which means that the recovery measures are entirely dependent on the inherent external environment and independent of the own transient state of the network itself. We define γ ψ i ( 0 ) as the initial recovery probability and set γ ψ i ( 0 ) = 1. Considering that the transient functional state of a finite cluster is usually related to the state of all nodes in the finite cluster, we assume that S ψ i B can be evaluated in terms of node average as
(2)
where node j belongs to ψ i. S j B ( t ) represents the transient functional state of node j, and | ψ i ( t ) | denotes the number of nodes in ψ i. Given that repairing the network component requires resources from all subnetworks, and then S j B ( t ) is defined as
(3)
where k j B ( t ) represents the real node degree (which is the number of functional neighboring nodes at time t) of node j in network B and k j A ( t ) of the supporting node in network A. k j B ( 0 ) and k j A ( 0 ) represent the initial node degree. k j B ( 0 ) + 1 ( k j A ( 0 ) + 1) represents the value of the degree of node j (the supporting node) plus the number of the interdependent link connected to node j (the supporting node). This is done to measure the comprehensive transient functional state of a node considering both intra-layer and inter-layer connections. S j B ( t ) captures the fact that the resources for recovery come from both the neighbors and the supporting node. Figure 1 shows the schematic of the multivariate recovery coupling.
FIG. 1.

Schematic diagrams of the multivariate recovery coupling. (a) Failure state of finite clusters. The light yellow and blue regions represent GC and finite cluster, respectively. The black dashed lines represent the free links. The gray lines denote the interdependent links, and the red line is the set of the interdependent links for the nodes in GC. The arrows indicate the flow direction of the recovery resources. (b) Post-recovery states of finite clusters. The red arrow points to the node for reconnection, and the green line is the edge for reconnection. For the leftmost finite cluster in network B, it is reconnected to GC due to its high recovery probability. The failed finite cluster and its supporting nodes indicated in gray are removed.

FIG. 1.

Schematic diagrams of the multivariate recovery coupling. (a) Failure state of finite clusters. The light yellow and blue regions represent GC and finite cluster, respectively. The black dashed lines represent the free links. The gray lines denote the interdependent links, and the red line is the set of the interdependent links for the nodes in GC. The arrows indicate the flow direction of the recovery resources. (b) Post-recovery states of finite clusters. The red arrow points to the node for reconnection, and the green line is the edge for reconnection. For the leftmost finite cluster in network B, it is reconnected to GC due to its high recovery probability. The failed finite cluster and its supporting nodes indicated in gray are removed.

Close modal

The central question of the recovery strategy is to find an optimal set of nodes for reconnection (which is defined as R-node) in both repaired finite clusters and GC. Considering the coupling structure and the failure–recovery relationship, we develop three recovery strategies for different scenarios based on the local stability of nodes, and nodes with high indicator values are preferred for reconnection.

As described in Sec. II A, after the recovery process in network B, the cascading failure causes the node failure in network A, and then the failure goes from A to B again. We define the above process as a recovery cycle and divide it into three cases, corresponding to case 1, case 2, and case 3 as shown in Fig. 2(a). Case 1 [see Fig. 2(b)] describes a simple scenario in which we consider only a single network B. We choose R-node based on the recovery indicator as
(4)
where r j B ( t ) represents the local stability of node j in network B, which reflects the ability of a node to maintain connected to GC under the random failure of the network. Researchers have found that high-degree nodes may be more reliable to random failures because they have more edges connected to the network.29–31 Here, we adopt the real node degree as the measure of the local stability of nodes [i.e., r j B ( t ) = k j B ( t )]. Virtually, case 1 statically evaluates the priority of R-node only from the perspective of a single network but does not consider the coupling structure between networks A and B. For case 1, let us assume that node 2 in network B is chosen for reconnection due to its high local stability. However, the repaired cluster ψ 2 may lose connection to the GC once more if the supporting node of node 2 fails when the cascade failure moves from B to A. Therefore, to prevent secondary failure as much as possible, it is necessary to take into account both the local stability of nodes in network B and the supporting nodes in network A. Inspired by this, to evaluate the coupling scenario as shown in case 2 [Fig. 2(c)], we present a recovery indicator based on the product of the local stability of the node and its supporting node as
(5)
where r j B ( t ) and r j A ( t ) represent the local stability of node j and its supporting node, respectively. In Fig. 2(c), represents the remaining network after removing nodes 1 , 2 , 3 , and 4 from GC. By further analysis, we note that the indicator given in Eq. (5) does not fit case 3 [Fig. 2(d)], where there is a failed cluster ψ 1 (consisting of nodes 1 and 2) and a repaired cluster ψ 2 (consisting of nodes 3, 4, 5, and 6). In case 3, node 4 will be chosen as R-node according to Eq. (5). However, the failure of ψ 1 will lead to a cascading failure of nodes 1 , 2 , 4 , and 5 , and then ψ 2 will lose connection to the GC once more when the cascade failure moves from A to B. Indeed, the recovery indicator given in Eq. (5) ignores the failure–recovery relationship. When failure goes from A to B, whether or not ψ 2 will lose connection to the GC is dependent only on the local stability of the supporting nodes in network A but not on nodes in network B. Therefore, the node with a more reliable supporting node should be selected as R-node preferentially. Based on this, a new recovery indicator based on the local stability of the supporting node is proposed as
(6)
where r j A ( t ) represents the local stability of the supporting node of node j. Actually, from the perspective of benefits, if the choice of R-node is based on the local stability of the supporting nodes in network A, then we are more likely to obtain direct benefits from the recovery measures, and if the choice is based on the local stability of the nodes in network B, then we obtain indirect benefits. However, the indirect benefits are usually not favorable for delaying a network crash due to its significant lags. Here, we define the recovery strategies based on Eqs. (4)–(6) as the simple strategy (SS), coupling strategy (CS), and FS, respectively.
FIG. 2.

Schematic diagrams of the recovery strategies. (a) Three different cases in one recovery cycle. (b) The SS recovery strategy. (c) The CS recovery strategy. (d) The FS recovery strategy. Here, the red region ( ) represents the remaining network after removing the visualization nodes (which are nodes 1 , 2 , 4 , 5 , and 6 in network A) from GC. ψ 1 and ψ 2 represent the failed finite cluster and the repaired finite cluster, respectively. The red arrow points to the node for reconnection, and the green lines are the edges for reconnection.

FIG. 2.

Schematic diagrams of the recovery strategies. (a) Three different cases in one recovery cycle. (b) The SS recovery strategy. (c) The CS recovery strategy. (d) The FS recovery strategy. Here, the red region ( ) represents the remaining network after removing the visualization nodes (which are nodes 1 , 2 , 4 , 5 , and 6 in network A) from GC. ψ 1 and ψ 2 represent the failed finite cluster and the repaired finite cluster, respectively. The red arrow points to the node for reconnection, and the green lines are the edges for reconnection.

Close modal
Network resilience reflects the recovery ability of the network, which is usually quantified based on the performance response process after a perturbation.32,33 Reed et al. developed a time-dependent resilience indicator that quantifies the network resilience based on the real performance and recovery time of the network.34 This indicator captures not only the level of recovery but also the recovery efficiency. On this basis, we propose a network resilience metric as
(7)
where Q ( t ) represents the real performance function of the network. T and t 0denote the start and end time of the failure–recovery propagation, respectively. Here, we define t 0 = 0. From Sec. II B, we know that the transient functional state of a finite cluster depends on all nodes in the finite cluster. Similarly, we assume that the real performance function of the network is associated with all functional nodes in GC as
(8)
where G C ( t ) represents the giant component of network B at time t. Node j is the functional node and belongs to G C ( t ). S j B ( t ) denotes the transient functional state of node j as given in Eq. (3). S j B ( 0 ) is the value of S j B ( t ) at the initial time t = 0. Normalization of data is done by S j B ( t ) / S j B ( 0 ). | G C ( t ) | denotes the number of nodes in G C ( t ). Note that due to the symmetry of Eq. (3) and the same GC size of networks A and B in steady state, the value of Q ( t ) for networks A and B is equal at the end time T.

Numerical simulations and theoretical calculations are conducted on scale-free (SF), random (ER), and regular random (RR) networks, which have been commonly used to depict real-world networks. For numerical simulations, the ER and RR networks are generated using NetworkX (https://networkx.org/), which is a Python package for the construction, manipulation, and analysis of complex networks. The SF network is generated by the configuration model35 that can build a network with a pre-defined degree sequence. For theoretical derivation, the degree distribution of ER networks is P ( k ) = e k k k / k !, where k represents the average degree. For SF networks, the degree distribution is P ( k ) = c k λ, and it is typically approximated as P ( k ) = ( k min / k ) λ 1 ( k min / k + 1 ) λ 1, where λ denotes the degree exponent and k min is the minimum degree. For RR networks, the degree distribution is P ( k ) = δ ( k k 0 ), where k 0 stands for the number of connections of a single node. All networks used for numerical simulations contain 10,000 nodes, and simulation results are averaged over 300 realizations.

The phase transition properties analysis is performed under the random recovery strategy (RS), in which these nodes for reconnection are chosen randomly. In addition, to verify the theory we present, the recovery constraint is relaxed to the extent that finite clusters with arbitrary sizes are potentially recoverable. All theoretical results are derived based on Eqs. (9)–(25) given in the supplementary material.

We first calculate the size of the giant components P at the steady state for the ER–ER, SF–SF, and RR–RR interdependent networks under the different dependence coefficient α [Figs. 3(a)3(c)]. We find that the network undergoes a percolation phase transition at the critical value of p c, in which the value of P jumps from a finite value to zero. More importantly, the phase transition changes from the first to the second order with increasing α. It indicates that the more the recovery capacity of a system depends on the system itself, the more likely it is to undergo an abrupt transition under the multivariate recovery coupling. In Figs. 3(a)3(c), the dot lines and symbols represent theoretical and numerical simulation results, respectively. The results show that there is an excellent agreement between theoretical and simulation results, except for a minor difference around p c.

FIG. 3.

Percolation phase transition of ER–ER, SF–SF, and RR–RR networks under the different α. (a) Size of giant components P for ER–ER networks with k = 8. (b) P for SF–SF networks with λ = 3 and k min = 2. (c) P for RR–RR networks with k 0 = 3. Iterations steps (NOI) required for the cascading process for (d) ER–ER, (e) SF–SF, and (f) RR–RR networks.

FIG. 3.

Percolation phase transition of ER–ER, SF–SF, and RR–RR networks under the different α. (a) Size of giant components P for ER–ER networks with k = 8. (b) P for SF–SF networks with λ = 3 and k min = 2. (c) P for RR–RR networks with k 0 = 3. Iterations steps (NOI) required for the cascading process for (d) ER–ER, (e) SF–SF, and (f) RR–RR networks.

Close modal

Next, we show the number of iteration steps (NOI) required for the cascading process to reach the steady state [Figs. 3(d)3(f)]. The results show NOI peaks at p c as p increases, which means that the network requires many iterations to reach the steady state when p is close to p c. This exactly explains the minor difference between the theoretical and simulation results around p c, since the recovery strategies from theory and simulation differ in their practical implementation. Specifically, after each recovery measure, we theoretically obtain a new GC following the initial topology, but in the simulation, we only reconnect the repaired cluster and GC by adding a new edge.

To further validate the exact effects of the dependence coefficient on the phase transition, the α p phase diagrams for ER–ER, SF–SF, and RR–RR networks are shown in Figs. 4(a)4(c). The phase diagram is divided by the critical curves into three regions, i.e., Collapse, Recovery, and Function. These regions are defined based on the network state, which is related to the relative size of GC at the steady state. Here, the orange curve represents p c as a function of α. The region to the left of the orange curve is labeled “Collapse” because the network belonging to this area will be completely destroyed, i.e., P is close to 0. The black dashed line represents the minimum p for α = 1, and the region to the right of it is labeled “Function.” Networks located in this area do not crash for any value of p, i.e., P > 0. When the network is in the “Function” state, the network remains the essential function, even when the recovery strategy is not applied. The middle region is marked as “Recovery,” in which the maximum α is allowed to prevent the network from falling out completely.

FIG. 4.

Phase diagrams in the α p plane for (a) the ER–ER network with k = 8, (b) SF-SF networks with λ = 3 and k min = 2. (c) RR–RR networks with k 0 = 3. The orange curve represents the critical curve p c as a function of α. The critical curve for (d) the ER–ER network at the different k , (e) the SF–SF network at the different λ and k min, and (f) the RR–RR network at the different k 0.

FIG. 4.

Phase diagrams in the α p plane for (a) the ER–ER network with k = 8, (b) SF-SF networks with λ = 3 and k min = 2. (c) RR–RR networks with k 0 = 3. The orange curve represents the critical curve p c as a function of α. The critical curve for (d) the ER–ER network at the different k , (e) the SF–SF network at the different λ and k min, and (f) the RR–RR network at the different k 0.

Close modal

The critical curve ( p c as a function of α) of the ER–ER network under the different average degree k is shown in Fig. 4(d). We find that the curves move toward the left as k increases, which corresponds to a smaller “Collapse” region in Fig. 4(a). This indicates that the bigger the value of k is, the more robust it is to cascading failures for the ER–ER network. Similar experiments are performed with the SF–SF network at the different λ and k min. As shown in Fig. 4(e), for the SF–SF network, the larger the value of k min, the smaller the area of the “Collapse” region and the stronger the ability against cascading failures. For λ, the result is the opposite. For the RR–RR network [Fig. 4(f)], the larger the value of k 0, the smaller the area of the “Collapse” region. These findings enable us to design more resilient networks against cascading failures.

In this section, we make a comparative analysis of the SS, CS, FS, and RS recovery strategies under the random failure mode and the targeted failure mode. In the random failure mode, we assume that the initial failure is added on a fraction 1 p of nodes in network A randomly at step t = 0. In the targeted failure mode, the targeted attacks are ordered by the node degree. Note that all of the following results are obtained based on the simulations, and all recovery strategies meet the recovery constraint proposed in Sec. II.

We first test the network resilience under the random initial failure. A performance comparison in the ER–ER network is presented between the SS, CS, FS, and RS strategies, shown in Fig. 5(a). We find that resilience performance using FS is superior compared with the other three types of recovery strategies. This validates that the supporting network plays a more important role in improving network resilience than the network where the repaired component is located. This is because the FS recovery strategy is more likely to obtain direct benefits. In addition, as shown in Fig. 5(a), when p is either very large or small, the value of R under the different recovery strategies is almost identical. This is because the network will reach the steady state within only a few iteration steps when p moves away from the phase transition point; thus, the recovery effect of the different recovery strategies is not significant.

FIG. 5.

Comparison of recovery strategies on the ER–ER network. (a) Network resilience R under the different recovery strategies for the ER–ER network with k = 8 and α = 0.6. The performance of the recovery strategies on the ER–ER network with the different (b) αand (c) k . Rs is defined as an estimation of the area under the R curve. The size of the circle represents the value of Rs. (d)–(f) Targeted failure mode.

FIG. 5.

Comparison of recovery strategies on the ER–ER network. (a) Network resilience R under the different recovery strategies for the ER–ER network with k = 8 and α = 0.6. The performance of the recovery strategies on the ER–ER network with the different (b) αand (c) k . Rs is defined as an estimation of the area under the R curve. The size of the circle represents the value of Rs. (d)–(f) Targeted failure mode.

Close modal

To verify the effects of α on the performance of the recovery strategies, the network resilience under different α is calculated as shown in Fig. 5(b). Here, Rs is defined as an estimation of the area under the R curve in Fig. 5(a), which is used to estimate the overall resilience for the entire test set. We find that the FS recovery strategy presents superior performance compared with other strategies under the different α. In addition, Rs decreases gradually with increasing α, which indicates the more the recovery capacity of a network depends on the state of the network itself, the more resilient the network is to cascading failures. Going a step further, we investigate the effect of changes in k on the performance of the proposed recovery strategies, shown in Fig. 5(c). Here, the size of the circles represents the value of Rs. To facilitate comparisons, all results are normalized by their maximum values. We find that the network parameters k have little effect on the size of the different circles, which indicates that the superior performance of the FS is not influenced by the structural properties of networks. Similar results are observed in the ER–ER network under the targeted failure mode [Figs. 5(d)5(f)].

Next, a comparison of the network resilience between the FS strategy and some other strategies on the SF–SF network is shown in Fig. 6. The results are similar to those of the ER–ER network; that is, the resilience performance using FS is superior compared with the other three types of recovery strategies. In addition, we find that compared to the ER–ER network, the SF–SF network exhibits different behavior under the targeted failure mode than under the random failure mode. As shown in Figs. 6(b) and 6(e), in the case of the targeted failure mode, the value of Rs is much lower than that of the random failure mode at the different α. This indicates that, compared to the ER–ER network, the SF–SF network is more vulnerable under targeted attack modes. However, under targeted attacks, the FS strategy still possesses the strongest recovery capability for the network.

FIG. 6.

Comparison of recovery strategies on the SF–SF network. (a) SF–SF networks with λ = 3, k min = 2, and α = 0.6. (b) λ = 3, k min = 2. (c) The performance of the recovery strategies on the SF–SF network with the different αand λ. (d)–(f) Targeted failure mode.

FIG. 6.

Comparison of recovery strategies on the SF–SF network. (a) SF–SF networks with λ = 3, k min = 2, and α = 0.6. (b) λ = 3, k min = 2. (c) The performance of the recovery strategies on the SF–SF network with the different αand λ. (d)–(f) Targeted failure mode.

Close modal

In addition, we compare the performance of different recovery strategies on the RR–RR network. Because the degree value of each node in the RR network is the same, the effects of the random failure mode and the targeted failure mode are identical. Here, we only provide the simulation results of performance comparison for recovery strategies under random failure mode. As shown in Fig. 7, the same results are observed for RR–RR networks, indicating that the FS strategy still has better performance.

FIG. 7.

Comparison of recovery strategies on the RR–RR network under the random failure mode. (a) RR–RR networks with k 0 = 2 and α = 0.6. (b) k 0 = 2. (c) The performance of the recovery strategies on the RR–RR network with the different α and k 0.

FIG. 7.

Comparison of recovery strategies on the RR–RR network under the random failure mode. (a) RR–RR networks with k 0 = 2 and α = 0.6. (b) k 0 = 2. (c) The performance of the recovery strategies on the RR–RR network with the different α and k 0.

Close modal

In particular, we test the validity of our proposed methods in a real-world interdependent system composed of the Western U.S. PG and CN and the communication network (CN) used for supervisory control and data acquisition of PG.28,36 Due to a lack of data, it is difficult to establish the structure of CN and its interdependencies with PG. Considering that most real-world network systems could be described as ER and SF networks, we approximately capture the structure of PG–CN by coupling PG to ER or SF networks. Here, we assume that the ER and SF networks have the same number of nodes and the same average degree as PG, and each node in PG is randomly linked with exactly one node from the ER/SF network. We perform a comparative analysis of the proposed recovery strategies in both the PG–ER and PG–SF networks (see Fig. 8). The results are similar to those of ER–ER and SF–SF networks. When p is close to p c, the network resilience under the FS recovery strategy is superior compared with the other strategies. Again, we investigate the effect of changes in α on the performance of the proposed recovery strategies as shown in Figs. 8(c) and 8(d). We also find that FS performs well under the different α and the superiority becomes more obvious as α decreases, suggesting that the more the recovery capacity of a system depends on its own internal state, the more vulnerable the system is to cascading failures.

FIG. 8.

Comparison of recovery strategies in real-world systems. Network resilience R under the different recovery strategies for (a) PG–ER networks and (b) PG–SF networks. Here, N = 1723, k = 1.389, and α = 0.2. Rs under the different recovery strategies for the (c) PG–ER and (d) PG–ER networks.

FIG. 8.

Comparison of recovery strategies in real-world systems. Network resilience R under the different recovery strategies for (a) PG–ER networks and (b) PG–SF networks. Here, N = 1723, k = 1.389, and α = 0.2. Rs under the different recovery strategies for the (c) PG–ER and (d) PG–ER networks.

Close modal

In this paper, we develop a multivariate recovery coupling model for interdependent networks based on percolation theory. Considering the coupling structure and the failure–recovery relationship, we develop three recovery strategies for different scenarios based on the local stability of nodes to improve the network resilience against cascading failures. In addition, a network resilience indicator that reflects both the level and the effectiveness of recovery is proposed. We find that the network undergoes a percolation phase transition at the critical value of p c, and the phase transition changes from the first to the second order with increasing α. It indicates that the more the recovery capacity of a system depends on the system itself, the more likely it is to undergo an abrupt transition under the multivariate recovery coupling. In addition, we find that the supporting network plays a more important role in improving network resilience than the network where the repaired component is located. This is because the recovery strategy based on the local stability of the supporting nodes is more likely to obtain direct benefits. We also find that the average degree and the degree exponent of the networks have little effect on the superior performance of the proposed recovery strategies. In addition, the results show that the FS recovery strategy performs better than the other strategies, and similar results are observed in a real-world interdependent system composed of the Western U.S. power grid and the communication network. This paper provides a general theoretical frame to address the multivariate recovery coupling, which will enable us to design more resilient networks against cascading failures. The proposed method is employed to study the simple case that the multivariate recovery coupling is only for network B. A more general scenario that the multivariate recovery coupling occurs in both the networks A and B will be studied in the future.

See the supplementary material for the theoretical derivation of the multivariate recovery coupling in interdependent networks and then self-consistent equations based on generating functions to capture and solve the failure–recovery process.

We acknowledge the support from the National Natural Science Foundation of China (NNSFC) (Grant No. 72001213), the National Social Science Foundation of China (Grant No. 19BGL297), and the Basic Research Program of Natural Science in Shaanxi Province (Grant No. 2021JQ-369).

The authors have no conflicts to disclose.

Jie Li: Conceptualization (equal); Data curation (equal); Formal analysis (equal); Methodology (equal); Software (equal); Visualization (equal); Writing – original draft (equal). Ying Wang: Funding acquisition (equal); Project administration (equal); Supervision (equal). Jilong Zhong: Formal analysis (equal); Funding acquisition (equal); Supervision (equal); Writing – review & editing (equal). Yun Sun: Supervision (equal); Writing – review & editing (equal). Zhijun Guo: Conceptualization (equal); Software (equal); Supervision (equal); Writing – review & editing (equal). Chaoqi Fu: Funding acquisition (equal); Supervision (equal); Writing – review & editing (equal).

The data that support the findings of this study are available from the corresponding author upon reasonable request.

1.
S. V.
Buldyrev
,
R.
Parshani
,
G.
Paul
,
H. E.
Stanley
, and
S.
Havlin
, “
Catastrophic cascade of failures in interdependent networks
,”
Nature
464
(
7291
),
1025
1028
(
2010
).
2.
A.
Bellè
,
Z.
Zeng
,
C.
Duval
,
M.
Sango
, and
A.
Barros
, “
Modeling and vulnerability analysis of interdependent railway and power networks: Application to British test systems
,”
Reliab. Eng. Syst. Saf.
217
,
108091
(
2022
).
3.
W.
Liu
and
Z.
Song
, “
Review of studies on the resilience of urban critical infrastructure networks
,”
Reliab. Eng. Syst. Saf.
193
,
106617
(
2020
).
4.
M.
Rahnamay-Naeini
and
M. M.
Hayat
, “
Cascading failures in interdependent infrastructures: An interdependent Markov-Chain approach
,”
IEEE Trans. Smart Grid
7
(
4
),
1997
2006
(
2016
).
5.
C.-W.
Ten
,
M.
Govindarasu
, and
C.-C.
Liu
, “
Cybersecurity for critical infrastructures: Attack and defense modeling
,”
IEEE Trans. Syst., Man, Cybern., Part A
40
,
853
865
(
2010
).
6.
G. J.
Baxter
,
S. N.
Dorogovtsev
,
A. V.
Goltsev
, and
J. F. F.
Mendes
, “
Avalanche collapse of interdependent networks
,”
Phys. Rev. Lett.
109
(
24
),
248701
(
2012
).
7.
X.
Liu
,
H. E.
Stanley
, and
J.
Gao
, “
Breakdown of interdependent directed networks
,”
Proc. Natl. Acad. Sci. U. S. A.
113
(
5
),
1138
1143
(
2016
).
8.
C. D.
Brummitt
,
R. M.
D'Souza
, and
E. A.
Leicht
, “
Suppressing cascades of load in interdependent networks
,”
Proc. Natl. Acad. Sci. U. S. A.
109
(
12
),
E680
E689
(
2012
).
9.
P.
Yuan
et al, “
Analysis and enlightenment of the blackouts in Argentina and New York
,” in
2019 Chinese Automation Congress (CAC)
,
Hangzhou, China
(
IEEE
,
2019
), pp.
5879
5884
.
10.
S.
Sridhar
,
A.
Hahn
, and
M.
Govindarasu
, “
Cyber-physical system security for the electric power grid
,”
Proc. IEEE
100
,
210
224
(
2012
).
11.
S. D. S.
Reis
et al, “
Avoiding catastrophic failure in correlated networks of networks
,”
Nat. Phys.
10
(
10
),
762
767
(
2014
).
12.
J.
Wu
,
Z.
Chen
,
Y.
Zhang
,
Y.
Xia
, and
X.
Chen
, “
Sequential recovery of complex networks suffering from cascading failure blackouts
,”
IEEE Trans. Network Sci. Eng.
7
(
4
),
2997
3007
(
2020
).
13.
M. A.
Di Muro
,
C. E.
La Rocca
,
H. E.
Stanley
,
S.
Havlin
, and
L. A.
Braunstein
, “
Recovery of interdependent networks
,”
Sci. Rep.
6
(
1
),
22834
(
2016
).
14.
A.
Majdandzic
,
B.
Podobnik
,
S. V.
Buldyrev
,
D. Y.
Kenett
,
S.
Havlin
, and
H.
Eugene Stanley
, “
Spontaneous recovery in dynamical networks
,”
Nat. Phys.
10
(
1
),
34
38
(
2014
).
15.
C.
Fu
,
Y.
Wang
,
Y.
Gao
, and
X.
Wang
, “
Complex networks repair strategies: Dynamic models
,”
Phys. A
482
,
401
406
(
2017
).
16.
C. E.
La Rocca
,
H. E.
Stanley
, and
L. A.
Braunstein
, “
Strategy for stopping failure cascades in interdependent networks
,”
Phys. A
508
,
577
583
(
2018
).
17.
L.
Buzna
,
K.
Peters
,
H.
Ammoser
,
C.
Kühnert
, and
D.
Helbing
, “
Efficient response to cascading disaster spreading
,”
Phys. Rev. E
75
(
5
),
056107
(
2007
).
18.
Y.
Shang
, “
Localized recovery of complex networks against failure
,”
Sci. Rep.
6
(
1
),
30521
(
2016
).
19.
L.
B
ö
ttcher
,
M.
Lukovic
,
J.
Nagler
,
S.
Havlin
, and
H.
Herrmann
, “
Failure and recovery in dynamical networks
,”
Sci. Rep.
7
,
41729
(
2017
).
20.
M. M.
Danziger
and
A.-L.
Barabási
, “
Recovery coupling in multilayer networks
,”
Nat. Commun.
13
(
1
),
955
(
2022
).
21.
M.
Monsalve
and
J. C.
de la Llera
, “
Data-driven estimation of interdependencies and restoration of infrastructure systems
,”
Reliab. Eng. Syst. Saf.
181
,
167
180
(
2019
).
22.
M.
Ouyang
, “
Review on modeling and simulation of interdependent critical infrastructure systems
,”
Reliab. Eng. Syst. Saf.
121
,
43
60
(
2014
).
23.
P.-Y.
Kong
, “
Optimal backup power deployment for communication network with interdependent power network
,”
IEEE Access
10
,
17287
17299
(
2022
).
24.
Y.
Lin
,
K.
Zhang
,
Z.-J. M.
Shen
,
B.
Ye
, and
L.
Miao
, “
Multistage large-scale charging station planning for electric buses considering transportation network and power grid
,”
Transp. Res. Part C: Emerg. Technol.
107
,
423
443
(
2019
).
25.
Z.
Zhou
,
X.
Zhang
,
Q.
Guo
, and
H.
Sun
, “
Analyzing power and dynamic traffic flows in coupled power and transportation networks
,”
Renewable Sustainable Energy Rev.
135
,
110083
(
2021
).
26.
S.
Hong
,
C.
Lv
,
T.
Zhao
,
B.
Wang
,
J.
Wang
, and
J.
Zhu
, “
Cascading failure analysis and restoration strategy in an interdependent network
,”
J. Phys. A: Math. Theor.
49
(
19
),
195101
(
2016
).
27.
Z.
Chen
,
W.-B.
Du
,
X.-B.
Cao
, and
X.-L.
Zhou
, “
Cascading failure of interdependent networks with different coupling preference under targeted attack
,”
Chaos, Solitons Fractals
80
,
7
12
(
2015
).
28.
X.
Yuan
,
Y.
Hu
,
H. E.
Stanley
, and
S.
Havlin
, “
Eradicating catastrophic collapse in interdependent networks via reinforced nodes
,”
Proc. Natl. Acad. Sci. U. S. A.
114
(
13
),
3311
3315
(
2017
).
29.
R.
Albert
,
H.
Jeong
, and
A.-L.
Barabási
, “
Error and attack tolerance of complex networks
,”
Nature
406
(
6794
),
378
382
(
2000
).
30.
J.
Wang
,
L.
Rong
,
L.
Zhang
, and
Z.
Zhang
, “
Attack vulnerability of scale-free networks due to cascading failures
,”
Phys. A
387
(
26
),
6671
6678
(
2008
).
31.
X.
Huang
,
J.
Gao
,
S. V.
Buldyrev
,
S.
Havlin
, and
H. E.
Stanley
, “
Robustness of interdependent networks under targeted attack
,”
Phys. Rev. E
83
(
6
),
065101
(
2011
).
32.
L.
Zhang
,
G.
Zeng
,
D.
Li
,
H.-J.
Huang
,
H. E.
Stanley
, and
S.
Havlin
, “
Scale-free resilience of real traffic jams
,”
Proc. Natl. Acad. Sci. U. S. A.
116
(
18
),
8673
8678
(
2019
).
33.
J.
Li
et al, “
Network resilience assessment and reinforcement strategy against cascading failure
,”
Chaos, Solitons Fractals
160
,
112271
(
2022
).
34.
D. A.
Reed
,
K. C.
Kapur
, and
R. D.
Christie
, “
Methodology for assessing the resilience of networked infrastructure
,”
IEEE Syst. J.
3
(
2
),
174
180
(
2009
).
35.
J.
Loscalzo
and
A. L.
Barabási
,
Network Science
(
Cambridge University Press
,
Cambridge
,
2016
).
36.
B.
Turley
,
A.
Cantor
,
K.
Berry
,
S.
Knuth
,
D.
Mulvaney
, and
N.
Vineyard
, “
Emergent landscapes of renewable energy storage: Considering just transitions in the Western United States
,”
Energy Res. Soc. Sci.
90
,
102583
(
2022
).

Supplementary Material