The identification of regions of similar climatological behavior can be utilized for the discovery of spatial relationships over long-range scales, including teleconnections. Additionally, it provides insights for the improvement of corresponding interaction processes in general circulation models. In this regard, the global picture of the interdependence patterns of extreme-rainfall events (EREs) still needs to be further explored. To this end, we propose a top-down complex-network-based clustering workflow, with the combination of consensus clustering and mutual correspondences. Consensus clustering provides a reliable community structure under each dataset, while mutual correspondences build a matching relationship between different community structures obtained from different datasets. This approach ensures the robustness of the identified structures when multiple datasets are available. By applying it simultaneously to two satellite-derived precipitation datasets, we identify consistent synchronized structures of EREs around the globe, during boreal summer. Two of them show independent spatiotemporal characteristics, uncovering the primary compositions of different monsoon systems. They explicitly manifest the primary intraseasonal variability in the context of the global monsoon, in particular, the “monsoon jump” over both East Asia and West Africa and the mid-summer drought over Central America and southern Mexico. Through a case study related to the Asian summer monsoon, we verify that the intraseasonal changes of upper-level atmospheric conditions are preserved by significant connections within the global synchronization structure. Our work advances network-based clustering methodology for (i) decoding the spatiotemporal configuration of interdependence patterns of natural variability and for (ii) the intercomparison of these patterns, especially regarding their spatial distributions over different datasets.
Precipitation variability of monsoons affects over two-thirds of the world’s population, and regional monsoons have their own characteristics due to specific land–ocean and topographic conditions.1 In spite of being distributed in different continental regions, they are essentially driven and synchronized by the annual cycle of solar radiation. The connections between them are via the global divergent circulation characterized by global-scale persistent overturning of the atmosphere varying with time.2,3 An integration of these monsoons forms the concept of the global monsoon in terms of similar dynamics and behaviors. The synchronization in the context of the global monsoon is not limited to the tropical regions, but also extends to the subtropics, with the East Asian monsoon being a typical example. Therefore, identifying the synchronization structure on a global scale helps to understand the interaction with mid-latitude regions. A clustering workflow with higher robustness, by combining consensus clustering and mutual correspondences, is proposed for this purpose.
Regions of similar climatological properties are somewhat coherent in time and space. Therefore, regional weather systems, formed within these areas, can exhibit synchronization behavior. Synchronization is also observed in the climate system over distances of thousands of kilometers, which may be an indication of a teleconnection.4,5 Advances in the identification of these synchronized regions help one to better understand the self-organized structure of the climate system and to further improve prediction skills.6,7
A complex-network-based clustering approach has proved to be useful in this respect due to the climatological interpretation of the identified communities.8 It takes as input complex networks reconstructed from climate data. Generally speaking, the reconstruction process treats grid points as nodes and establishes connections between nodes with a high correlation according to the corresponding timeseries. Clustering such networks has been used for the discovery of spatial relationships in climate variables,8–10 for the extraction of climate indices in a data-driven fashion,6,7,11 and for the intercomparison of performance of general circulation models (GCMs).12 However, cluster analysis remains a challenging problem in this domain because of the availability of multiple datasets on each single climate variable. For example, each dataset of precipitation produces a network, which further yields a community structure. How to extract representative and consistent communities when multiple community structures are obtained from different datasets on the same variable is a key question.
Regarding global-scale analysis,8–10,13 previous studies based on network clustering put little focus on precipitation data, let alone EREs. This is essentially due to the bias of precipitation estimation in reanalysis products.8,14 However, it has recently been pointed out in the Sixth Assessment Report of the Intergovernmental Panel on Climate Change (IPCC) that precipitation extremes will very likely become more frequent in most locations.15 Indeed, the investigation of synchronization behavior of extreme rainfall is of societal relevance due to the occurrences of natural hazards, such as floods or landslides.14,16 A particular complex-network-based study on EREs17 starts from the utilization of event synchronization [ES, a method originally introduced to study electroencephalogram (EEG) signals]18 as a nonlinear similarity measure for the reconstruction of the functional climate networks. Using observational data, some studies emphasize how the network-based clustering decodes the spatial relationship on a regional scale.16,19–21 A recent investigation reveals the global structure of synchronization of EREs based on high-resolution satellite data.22 However, it relies on a given region of interest and, therefore, provides only a partial view. The focus of this study is, therefore, to reveal a comprehensive view of this global structure.
We address the above-mentioned issues through a systematic network-based clustering workflow. Our aim is to unravel the synchronization of EREs around the globe, especially in terms of intraseasonal variability over different monsoon systems. Specifically, the representation of climate data in the form of networks is implemented based on ES.14,18,22 The reconstructed functional climate networks are then viewed as input for a downstream cluster analysis. Through this proposed clustering workflow, we identify two primary monsoon-related structures of distinct spatiotemporal characteristics during boreal summer. Our work is the first combined application of consensus clustering23 and mutual correspondences24 for climate extreme analysis. This combination provides the identification of reliable interdependence patterns not only within each dataset, but also over different ones.
We develop here a generic workflow for the identification of the global synchronization structure of EREs, as presented in Fig. 1. The network reconstruction of functional climate networks, i.e., Steps (1) and (2), is based on Ref. 22. For the downstream analysis, our emphasis is on the proposed combined clustering workflow, in particular, Steps (3.1) and (3.2). At the same time, we also include the correction for the multiple-comparison bias in Step (3.3) by using the technique introduced in Ref. 22.
A. The reconstruction of functional climate networks
1. Definition of EREs based on thresholding
For rainfall data, we adopt the th percentile of wet days (daily sum rainfall above 1 mm), by the definition of extreme weather events from IPCC,25 as the threshold to indicate EREs. Suppose that a pair of grid points and are randomly chosen from the entire set of grid cells. For the two timeseries at and , days with rainfall values higher than the given threshold are kept to indicate the occurrences of EREs. For consecutive days with EREs, only the first day is preserved to represent an independent occurrence of events.22 We denote the set of grid points with at least three events as . The sets that incorporate the final event series, after data pre-processing at and , are then defined as and with , respectively. For , denotes the time when the th event occurs and is the total number of events. Similarly, and are the corresponding variables for .
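As an illustration of this pre-processing, the following minimal Python sketch extracts independent event days from a single daily rainfall series. The function names, the linear-interpolation percentile, and the default value `pct=95.0` are assumptions made for illustration only; they are not taken from the analysis itself.

```python
import math

def wet_day_threshold(rain, pct=95.0, wet_min=1.0):
    """Percentile of wet-day rainfall (days with rainfall above wet_min).

    pct=95.0 is a placeholder value, not the value used in the study.
    """
    wet = sorted(r for r in rain if r > wet_min)
    if not wet:
        return math.inf  # no wet days: nothing can exceed the threshold
    pos = (len(wet) - 1) * pct / 100.0
    lo, hi = math.floor(pos), math.ceil(pos)
    return wet[lo] + (wet[hi] - wet[lo]) * (pos - lo)

def extract_events(rain, pct=95.0, wet_min=1.0):
    """Return the day indices of independent events.

    A day is an event day if its rainfall exceeds the threshold; for a run
    of consecutive event days, only the first day is kept.
    """
    thr = wet_day_threshold(rain, pct, wet_min)
    events, prev_extreme = [], False
    for day, r in enumerate(rain):
        extreme = r > thr
        if extreme and not prev_extreme:
            events.append(day)
        prev_extreme = extreme
    return events

# Toy series; pct=50 is used only to obtain several events from ten days.
rain = [0, 2, 3, 50, 60, 2, 0, 70, 1.5, 2]
print(extract_events(rain, pct=50.0))
```

A caller would then discard grid points for which fewer than three events remain, in line with the minimum event number stated above.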
2. ES and the corresponding significance test
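To quantify and determine the interactions between two grid points $i$ and $j$, ES considers all possible pairs of events, occurring at times $t_i^{\alpha}$ and $t_j^{\beta}$, and measures the closeness of each pair by imposing the condition that the absolute value of the temporal delay, $|t_i^{\alpha} - t_j^{\beta}|$, must be smaller than a dynamical delay,
$$\tau_{ij}^{\alpha\beta} = \frac{1}{2}\min\left\{ t_i^{\alpha} - t_i^{\alpha-1},\; t_i^{\alpha+1} - t_i^{\alpha},\; t_j^{\beta} - t_j^{\beta-1},\; t_j^{\beta+1} - t_j^{\beta} \right\}.$$
Together with a maximum delay τmax, measured in days, which is used to exclude the occurrences of unreasonably long delays, the ES between $i$ and $j$ is, therefore, defined as22
$$\mathrm{ES}_{ij} = \left| \left\{ (\alpha, \beta) : |t_i^{\alpha} - t_j^{\beta}| < \tau_{ij}^{\alpha\beta},\; |t_i^{\alpha} - t_j^{\beta}| \le \tau_{\max} \right\} \right|,$$
where $|\cdot|$ denotes the cardinality of a given set, i.e., the number of synchronized event pairs.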
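The counting of synchronized event pairs can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions: the dynamical delay is taken as half of the smallest gap to the neighboring events at both grid points, only interior events (those with a predecessor and a successor) are compared, and the default `tau_max=10` is a placeholder rather than the value used in the study.

```python
def event_sync(ti, tj, tau_max=10):
    """Count synchronized event pairs between two sorted event-time lists.

    A pair counts if its absolute delay is below the dynamical delay and
    does not exceed tau_max. Interior events only (an assumption), which
    is why each series needs at least three events.
    """
    count = 0
    for a in range(1, len(ti) - 1):
        for b in range(1, len(tj) - 1):
            lag = abs(ti[a] - tj[b])
            # Half of the smallest inter-event gap around both events.
            tau = min(ti[a] - ti[a - 1], ti[a + 1] - ti[a],
                      tj[b] - tj[b - 1], tj[b + 1] - tj[b]) / 2.0
            if lag < tau and lag <= tau_max:
                count += 1
    return count

# Hypothetical event days at two grid points.
ti = [0, 10, 20, 30]
tj = [1, 11, 22, 40]
print(event_sync(ti, tj))
```

The measure is symmetric in its two arguments, which corresponds to the undirected case used as the default setting.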
The placement of network links between each pair of grid points is determined statistically. Taking and as an example, by randomly distributing the same number of events 2000 times within the June–July–August (JJA) season and computing the corresponding ES values for the 2000 pairs of surrogate event series, a null-model distribution is obtained.22 We take the th percentile of the distribution as the threshold. A significant connection with is determined if is above the threshold. Since the significance thresholds are only associated with the number of events, we manage this process independently for all possible pairs of event numbers.
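The surrogate test can be illustrated with the sketch below. It assumes that surrogate events are drawn without replacement from the available season days, and it repeats a compact ES counter so that the block is self-contained. The names, the rank-based percentile, and the defaults `tau_max=10` and `pct=99.5` are hypothetical placeholders, not the settings of the study.

```python
import random

def event_sync(ti, tj, tau_max):
    """Number of synchronized pairs among interior events (see text)."""
    count = 0
    for a in range(1, len(ti) - 1):
        for b in range(1, len(tj) - 1):
            lag = abs(ti[a] - tj[b])
            tau = min(ti[a] - ti[a - 1], ti[a + 1] - ti[a],
                      tj[b] - tj[b - 1], tj[b + 1] - tj[b]) / 2.0
            if lag < tau and lag <= tau_max:
                count += 1
    return count

def es_threshold(n_i, n_j, n_days, tau_max=10, pct=99.5,
                 n_surr=2000, seed=0):
    """Significance threshold of ES for a pair of event numbers.

    Events are placed uniformly at random on n_days season days, n_surr
    times, and a rank-based percentile of the null ES values is returned.
    """
    rng = random.Random(seed)
    null = []
    for _ in range(n_surr):
        si = sorted(rng.sample(range(n_days), n_i))
        sj = sorted(rng.sample(range(n_days), n_j))
        null.append(event_sync(si, sj, tau_max))
    null.sort()
    return null[min(len(null) - 1, int(len(null) * pct / 100.0))]

# A link would be kept if the observed ES exceeds this threshold.
print(es_threshold(5, 5, 200, n_surr=200, seed=3))
```

Because the threshold depends only on the two event numbers, the results can be cached in a table indexed by these numbers and reused for all pairs of grid points.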
In many scenarios, links of a network can be attached with features, such as weight and direction. For the reconstructed functional climate networks, each significant link has an ES value as a weight to indicate the link strength. This is an important factor in Step (3.1) (see Sec. II B) to determine the affinities of different nodes. Regarding the link direction, it can also be incorporated into the calculation of ES and the corresponding test of significance. For example, the synchronization from $i$ to $j$ is defined as
$$\mathrm{ES}_{i \to j} = \left| \left\{ (\alpha, \beta) : 0 < d_{ij}^{\alpha\beta} < \tau_{ij}^{\alpha\beta},\; d_{ij}^{\alpha\beta} \le \tau_{\max} \right\} \right|,$$
where $d_{ij}^{\alpha\beta} = t_j^{\beta} - t_i^{\alpha}$ and $d_{ij}^{\alpha\beta} > 0$ indicates the occurrence of an event at $i$ with a subsequent one at $j$; the dynamical delay $\tau_{ij}^{\alpha\beta}$ and the maximum delay $\tau_{\max}$ are the same as in the undirected case. The significance test for the directed case follows the same procedure as the one for the undirected one. We take the undirected weighted network as the default setting over the following main text. The directed approach is only used when identifying days of high synchronization for atmospheric condition analysis in Sec. II C.
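A directed variant can be sketched by counting only those pairs in which the event at the source grid point strictly precedes the event at the target grid point. As before, this is a minimal illustration: the interior-event convention, the function name, and the default `tau_max=10` are assumptions.

```python
def event_sync_directed(t_src, t_dst, tau_max=10):
    """Count event pairs where a source event is followed by a target event.

    A pair counts if the signed delay (target minus source) is positive,
    below the dynamical delay, and not larger than tau_max.
    """
    count = 0
    for a in range(1, len(t_src) - 1):
        for b in range(1, len(t_dst) - 1):
            delay = t_dst[b] - t_src[a]
            tau = min(t_src[a] - t_src[a - 1], t_src[a + 1] - t_src[a],
                      t_dst[b] - t_dst[b - 1], t_dst[b + 1] - t_dst[b]) / 2.0
            if 0 < delay < tau and delay <= tau_max:
                count += 1
    return count

# Hypothetical event days: events at the second point lag the first point.
ti = [0, 10, 20, 30]
tj = [1, 11, 22, 40]
print(event_sync_directed(ti, tj), event_sync_directed(tj, ti))
```

In this toy example, all synchronized pairs point from the first to the second series, so the reverse direction yields no pairs.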
B. The network-based clustering workflow
1. Consensus clustering
The significant link bundles given by Step (3.3)22 have limited capability to capture regions following similar climatological behavior around the globe. This is essentially due to the limited link distribution based on a selected region (see Fig. 6 in Appendix A as an example). Therefore, we adopt here complex-network-based clustering, namely, community detection, to fill this gap, by taking into account the link distribution of a whole reconstructed functional network. However, most community detection methods are not deterministic, indicating that partitions delivered by them have certain fluctuations, e.g., due to randomness. One way to manage this problem is to use consensus clustering.23 Using the consensus of a combination of partitions provides a stable result out of a set of candidates. In the context of climate science, to the best of our knowledge, the use of consensus clustering represents a novelty; using a stable partition derived this way yields an algorithmically reliable and climatologically interpretable community structure for a given dataset.
Given the reconstructed functional climate network with nodes and a community detection method , below is the detailed procedure of consensus clustering:
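(A1) Apply the community detection method to the network 1000 times, yielding 1000 partitions.
(A2) Rank the 1000 partitions by modularity26 and select a chosen number of top-ranked partitions, i.e., those of the highest modularity.
(A3) Compute the consensus matrix: each entry is the number of partitions in which the two corresponding nodes belong to the same community, divided by the number of partitions considered.
(A4) Set all entries of the consensus matrix below a chosen threshold to zero.
(A5) Apply the community detection method to the thresholded consensus matrix the same number of times, yielding that many partitions.
(A6) If the partitions in (A5) are all equal, that is, the consensus matrix is block-diagonal, then stop; otherwise, go back to (A3).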
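The iterative part of this procedure, (A3) to (A6), can be sketched as follows. The sketch assumes that the candidate partitions from (A1) and (A2) are already available as label lists, and it uses a deterministic connected-components routine as a toy stand-in for the community detection method; in practice a modularity-based method would be supplied as `detect`. All names are hypothetical.

```python
def consensus_matrix(partitions):
    """(A3) Fraction of partitions in which two nodes share a community."""
    n, k = len(partitions[0]), len(partitions)
    d = [[0.0] * n for _ in range(n)]
    for labels in partitions:
        for u in range(n):
            for v in range(n):
                if labels[u] == labels[v]:
                    d[u][v] += 1.0 / k
    return d

def threshold_matrix(d, tau):
    """(A4) Set entries below the threshold tau to zero."""
    return [[x if x >= tau else 0.0 for x in row] for row in d]

def same_partition(p, q):
    """True if two label lists describe the same grouping of nodes."""
    return len(set(zip(p, q))) == len(set(p)) == len(set(q))

def consensus(detect, partitions, tau, max_iter=50):
    """Iterate (A3)-(A6) until all partitions agree."""
    n_p = len(partitions)
    for _ in range(max_iter):
        d = threshold_matrix(consensus_matrix(partitions), tau)
        partitions = [detect(d) for _ in range(n_p)]            # (A5)
        if all(same_partition(partitions[0], p) for p in partitions[1:]):
            return partitions[0]                                # (A6)
    raise RuntimeError("no consensus reached within max_iter")

def components(d):
    """Toy stand-in for community detection: connected components."""
    n = len(d)
    labels = [-1] * n
    for s in range(n):
        if labels[s] != -1:
            continue
        labels[s], stack = s, [s]
        while stack:
            u = stack.pop()
            for v in range(n):
                if d[u][v] > 0 and labels[v] == -1:
                    labels[v] = s
                    stack.append(v)
    return labels

candidates = [[0, 0, 1, 1], [0, 0, 0, 1], [0, 0, 1, 1]]
print(consensus(components, candidates, tau=0.5))
```

With a stochastic method as `detect`, several iterations are usually needed before all partitions coincide; the deterministic stand-in converges after one.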
Note that serves as a basic method for the entire procedure, and we use modularity to estimate the quality/strength of a partition due to the lack of climate-related benchmarks. Modularity compares the fraction of intra-community links with its expected number in a null model, which is a network with the same degree sequence but links placed at random.26 Intuitively speaking, networks with high modularity have dense intra-community connections but are sparsely connected between different communities.
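To make the modularity score concrete, the sketch below evaluates it for a weighted undirected network given as an edge list. It uses the standard form in which, for each community, the fraction of intra-community link weight is compared with the squared fraction of the total degree; the function name and data layout are illustrative assumptions.

```python
def modularity(edges, labels):
    """Modularity of a partition of a weighted undirected network.

    edges:  list of (u, v, w) tuples, each undirected link listed once
    labels: dict mapping node -> community label
    """
    m = sum(w for _, _, w in edges)          # total link weight
    intra, degree = {}, {}
    for u, v, w in edges:
        cu, cv = labels[u], labels[v]
        degree[cu] = degree.get(cu, 0.0) + w  # each endpoint adds w
        degree[cv] = degree.get(cv, 0.0) + w
        if cu == cv:
            intra[cu] = intra.get(cu, 0.0) + w
    return sum(intra.get(c, 0.0) / m - (degree[c] / (2.0 * m)) ** 2
               for c in degree)

# Two triangles joined by a single link.
edges = [(0, 1, 1), (1, 2, 1), (0, 2, 1),
         (3, 4, 1), (4, 5, 1), (3, 5, 1), (2, 3, 1)]
two_groups = {0: "a", 1: "a", 2: "a", 3: "b", 4: "b", 5: "b"}
one_group = {n: "a" for n in range(6)}
print(modularity(edges, two_groups), modularity(edges, one_group))
```

Splitting the two triangles gives a modularity of 5/14, whereas placing all nodes in one community gives zero, which reflects the intuition of dense intra-community and sparse inter-community connections.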
In (A1), we use the parallel Louvain method (PLM)27 as the basic method as algorithm . We choose PLM because (a) it is flexible enough to control the granularity/size of communities with only one parameter , (b) its speed allows one to handle large-scale networks, and (c) it is based on the well-known Louvain method.28 The Louvain method is a greedy local search algorithm with a bottom-up multilevel approach. Specifically, the local search phase moves each node to the neighbor community where this move yields the largest increase in modularity (if any). After this process has converged, each community is contracted to a super-node. The super-nodes are linked together by weighted edges (and self-loops) whose weights are set according to the inter-community (intra-community) edge weights in the larger graph. The smaller network derived this way is treated as new input for the next iteration of local search and coarsening (until no further community changes are observed). Among others, the main change of PLM compared to plain Louvain is to use parallelism. PLM is part of the network analysis toolkit NetworKit,29 which allows the integration of PLM into more complex workflows. We verify in our work (not shown) that the default setting yields an appropriate community resolution with the highest modularity.27,30 The 1000 partitions are particularly generated in search of a broad range of possible solutions since the inherent non-determinism due to thread parallelism can easily change solutions, especially for those algorithms relying on stochastic search strategies (see Fig. 7 in Appendix A as an example).23 In (A2), we choose candidates of the highest modularity as input for reaching a consensus. In (A4), the optimal is situation-dependent.31 Because this threshold parameter also implies the probability of two nodes being settled into the same community, we vary it in our work and let . 
In (A6), considering multiple partitions to be obtained under different values of and , the final decision is made based on the modularity score again.
2. Mutual correspondences
Comparing different communities based on single similarity values provides neither information on how they differ nor any spatial information on the robustness of the identified interdependence patterns of EREs. That is why we employ mutual correspondences,24 which provide insights on how different partitions agree with each other on a meta-community level (i.e., a group of communities). Suppose that there are two stable partitions and obtained, for example, from two datasets. Given and , a correspondence is mutual when the following conditions are met: (a) for any , (b) for any , (c) for any , and (d) for any . The four conditions above guarantee that any community from preserves more than 50% similarity to the counterpart in and vice versa. We explain below the detailed procedure in finding such a mutual correspondence:
(B1) Initialize with and . Only for the first iteration, is chosen based on domain knowledge, such as well-known teleconnection patterns and is preserved by for the final check in (B5); otherwise, equals , which is found in (B3).
(B2) Each element satisfying , where as in (B1), is placed into , with the correspondence direction of .
(B3) Each element satisfying , where equals found in (B2), is placed into , with the reversed correspondence direction of .
(B4) If given in (B2) equals given in (B3), then stop; otherwise, go back to (B1) for the next iteration.
(B5) If the updated and still satisfy , and when the iteration process stops in (B4), then a mutual correspondence is established.
It is also possible to start by initializing in (B1). The resulting correspondence relationship is , which is equal to obtained by initializing in (B1).
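The alternating search in (B1) to (B5) can be sketched as below. The sketch rests on one reading of the conditions above, namely a majority-overlap criterion: a community is assigned to a meta-community if more than half of its nodes lie inside the region covered by the counterpart. This criterion, the data layout (communities as sets of grid-point indices on a common grid), and all names are assumptions for illustration.

```python
def majority_inside(community, region):
    """True if more than 50% of the community's nodes lie in the region."""
    return 2 * len(community & region) > len(community)

def covered(partition, indices):
    """Union of the communities selected by their indices."""
    region = set()
    for k in indices:
        region |= partition[k]
    return region

def mutual_correspondence(p, q, seed, max_iter=100):
    """Return (S, S') as index sets into p and q, or None if none is found.

    p, q: partitions given as lists of sets over the same nodes
    seed: initial indices into p, e.g., chosen from domain knowledge (B1)
    """
    s = set(seed)
    for _ in range(max_iter):
        region_p = covered(p, s)
        s_prime = {k for k, c in enumerate(q)
                   if majority_inside(c, region_p)}          # (B2)
        region_q = covered(q, s_prime)
        s_next = {k for k, c in enumerate(p)
                  if majority_inside(c, region_q)}           # (B3)
        if s_next == s:                                      # (B4)
            break
        s = s_next
    else:
        return None
    # (B5) The seed must survive and both sides must be non-empty.
    if not s or not s_prime or not set(seed) <= s:
        return None
    return s, s_prime

# One community in p splits into two communities in q.
p = [{0, 1, 2, 3, 4, 5}, {6, 7, 8, 9}]
q = [{0, 1, 2}, {3, 4, 5}, {6, 7, 8, 9}]
print(mutual_correspondence(p, q, seed={0}))
```

In the toy example, the first community of `p` corresponds to the first two communities of `q`, i.e., a one-to-many correspondence.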
Together with consensus clustering, mutual correspondences lead to a robust top-down clustering workflow. Both combined allow a reliable identification of a synchronization structure in the context of global monsoon. A mutual correspondence can be one-to-many or many-to-one to indicate the split or combination of rainfall patterns over different datasets. For a many-to-many correspondence, it is reasonable to only consider neighborhood communities due to spatial coherence. Finally, we estimate the meta-community level similarity based on the relative correspondence rate, defined as .
1. Identification of synchronized days for meta-communities
After identifying meta-communities of good mutual correspondence, we further determine synchronized days for them. Assume that is one pair of synchronized events when is significant. It is, therefore, plausible to take as the day to indicate the occurrence of synchronization. By repeating this process for all synchronized pairs between and and for all other significant links within a meta-community, we obtain the distribution of these days over the JJA season. Such a distribution indicates the temporal characteristic of a given meta-community.
3. Significance test for link bundles
During the (re)construction process of functional climate networks, each link is determined statistically by multiple comparisons in Step (2).22 A large number of comparisons can lead to some links being preserved by chance. Therefore, we need to correct this problem afterward and to complement the proposed clustering workflow. The basic idea is to identify some regions of higher link density formed by physical mechanisms. Specifically, given a region and all links coming from this region, a spherical Gaussian kernel density estimate (KDE) provides the link density estimation for each grid point regarding the distribution of the original regional link configuration. By randomly distributing those links 1000 times to grid points with at least three events, a corresponding null-model distribution based on the same KDE is obtained for each grid point. The thresholds are then taken in the same way as in Step (2). We also use the th percentile to determine statistically significant link bundles ().
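The link-bundle test can be illustrated with the following sketch, which estimates the link density at each grid point with a Gaussian kernel of the great-circle distance and compares it with a null model in which link endpoints are redistributed at random over the eligible grid points. The bandwidth `bw` (in radians), the percentile `pct`, the unnormalized kernel, and all names are assumptions for illustration, not the settings of the study.

```python
import math
import random

def great_circle(p, q):
    """Central angle in radians between two (lat, lon) points in degrees."""
    la1, lo1, la2, lo2 = map(math.radians, (*p, *q))
    h = (math.sin((la2 - la1) / 2) ** 2
         + math.cos(la1) * math.cos(la2) * math.sin((lo2 - lo1) / 2) ** 2)
    return 2 * math.asin(min(1.0, math.sqrt(h)))

def link_density(grid, endpoints, bw):
    """Unnormalized Gaussian kernel density of link endpoints on the sphere."""
    return [sum(math.exp(-0.5 * (great_circle(g, e) / bw) ** 2)
                for e in endpoints) for g in grid]

def significant_bundles(grid, endpoints, bw=0.1, pct=99.9,
                        n_surr=1000, seed=0):
    """Flag grid points whose link density exceeds the null percentile.

    grid:      eligible grid points, e.g., those with enough events
    endpoints: remote endpoints of all links leaving the chosen region
    """
    rng = random.Random(seed)
    observed = link_density(grid, endpoints, bw)
    null = [[] for _ in grid]
    for _ in range(n_surr):
        fake = [rng.choice(grid) for _ in endpoints]
        for k, dens in enumerate(link_density(grid, fake, bw)):
            null[k].append(dens)
    flags = []
    for obs, sample in zip(observed, null):
        sample.sort()
        thr = sample[min(len(sample) - 1, int(len(sample) * pct / 100.0))]
        flags.append(obs > thr)
    return flags

# Toy grid; ten links all end at the first grid point.
grid = [(lat, lon) for lat in (-20, -10, 0, 10, 20)
        for lon in (0, 10, 20, 30, 40)]
flags = significant_bundles(grid, [grid[0]] * 10, pct=95.0,
                            n_surr=200, seed=1)
print(flags[0], flags[-1])
```

In this toy configuration, the grid point that receives all links is flagged, while the most distant grid point is not.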
The combination of Steps (3.1), (3.2), and (3.3) gives the regional significant synchronization structure of EREs. Specifically, the regional interdependence by Steps (3.1) and (3.2) takes the form of meta-communities. For a specified region “A” within a meta-community (), we determine all other regions that are significantly synchronized with “A” based on Step (3.3). Some of these regions fall into (), forming the final significantly synchronized regional structure associated with “A.” Such a structure could indicate rainfall teleconnections, which can further be verified by atmospheric conditions.
C. Atmospheric condition analysis
This part verifies the identified synchronization structure and is, therefore, not indicated in the workflow (see Fig. 1). Specifically, to investigate composite anomalies, we first identify days of high synchronization. Suppose that we need to find out the specific days when two regions “A” and “B” are highly synchronized. We first determine significant links between them, with direction from “A” (“B”) to “B” (“A”), under a significance level of . Then, for each time step, we identify the number of associated EREs occurring in “B” (“A”) within days, when there are events in “A” (“B”), based on these links. All these event numbers form a timeseries from “A” (“B”) to “B” (“A”), with the same time steps as the datasets. Both timeseries are then filtered by a Butterworth low-pass filter with a cutoff period of 10 days. We take as days of high synchronization those time steps at which the filtered timeseries has local maxima above the th percentile of the JJA season. With these days, we calculate the composite anomalies relative to the JJA climatology.
For the reconstruction of functional climate networks regarding EREs, we choose observational data due to the bias of precipitation estimation in reanalysis products.8 Two satellite-derived datasets from NASA, i.e., TRMM 3B42 V7 (Ref. 32) and GPM IMERG (Ref. 33), are analyzed here. We utilize the daily rainfall sums within the JJA season from 2000 to 2019 with the spatial resolution of . For a comparative analysis with TRMM, the GPM data are cropped with a boundary from S to N; the cropped dataset is referred to as B-GPM.
For the understanding of climatological behavior behind the obtained rainfall patterns, we choose horizontal and vertical wind fields from ECMWF (ERA5)34 to obtain the corresponding atmospheric conditions. The spatial (i.e., ) and temporal (i.e., daily estimations ranging from 2000 to 2019) resolutions are consistent with TRMM.
B. Regional interdependence of EREs
The global monsoon represents the integration of all regional monsoons. Different regional monsoons are synchronized since they are fundamentally driven by the annual variation in solar radiation.3
In the context of the global monsoon, the following two points need to be further explored: (i) The synchronization structure with higher latitudes and (ii) the spatiotemporal characteristics of intraseasonal variability. We apply the proposed clustering workflow (see Fig. 1) to the global precipitation datasets and give appropriate interpretations of the results.
Two main monsoon-related meta-communities are identified during the JJA season, as shown in Figs. 2 and 3. An overview of the entire community structure is presented in Fig. 8 in Appendix B. The meta-communities describe a global view of synchronized EREs, with a synchronization structure extending to mid-latitude regions [to point (i)] and distinct intraseasonal changes in the frequency of synchronization [to point (ii)]. Temporal distributions indicate two recognizable periods of synchronization: from early June to mid-July [see Figs. 2(b) and 2(d)] and from mid-July to late August [see Figs. 3(b) and 3(d)]. The former peaks in early June with a sharply decreasing trend thereafter, while the latter shows a gradual increase with a peak in mid-August. Since the temporal distributions are obtained based on the timing of all synchronous EREs within the identified meta-communities, it can be concluded that the teleconnection patterns carried by significant connections within meta-communities also experience a similar temporal variability. We verify this with the Asian summer monsoon (ASM)-related case study in Sec. III C. The corresponding spatial distributions decode the compositions of regional EREs, including different monsoon systems. They are robust under different choices of maximum delays (see Figs. 15–18 in Appendix E) and significance levels (see Figs. 21–24 in Appendix F). Based on regional monsoons over Asia, West Africa, and Southwest United States in boreal summer, we now provide detailed evidence corroborating the spatiotemporal characteristics of intraseasonal variability found in Figs. 2 and 3.
Our results are in line with the known teleconnection pattern within the ASM. The Meiyu/Baiu rainband,35 as in Figs. 2(a) and 2(c), is automatically identified by using the proposed clustering workflow. The coexistence of this noticeable region, the Bay of Bengal, and the western coast of India in the same meta-community supports the synchronized initiation (i.e., the “south” teleconnection) of the Indian rainy season and Meiyu/Baiu.35,36 Figures 3(a) and 3(c) preserve the “north” teleconnection between India and North China, which is built up after a rapid northward jump of the rain belt.37 For adjacent oceanic areas, the connection of Indonesian rainfall with the Indian and Pacific Oceans38 is partly revealed by the coexistence of these regions in Fig. 2. In Fig. 3, the northwestward propagation of cyclones in a high phase of the Pacific-Japan pattern establishes the connection of EREs between the western North Pacific tropics and the northern South China Sea; during a low phase, cyclones follow a recurved propagation route to a higher latitude, close to the east of Japan.39 These cyclones form the synchronization structure relating to the global monsoon, especially over oceanic areas.
Figures 2 and 3 capture the abrupt intraseasonal change in West Africa, i.e., from the oceanic regime [ North of equator, see Figs. 2(a) and 2(c)] to the continental regime [ North of equator, see Figs. 3(a) and 3(c)].40 Between Indian and African monsoons, there exists a connection provided by westward-propagating Rossby waves from the Pacific warm pool in boreal summer.41 They help modulate the easterly wave activity and moisture transport in West Africa.42 Meta-communities in Figs. 2(a) and 2(c), covering the Gulf of Guinea with extension to tropical South America, Caribbean Sea, and the eastern tropical Pacific, manifest the underlying connection via easterly wave disturbances over the equatorial Atlantic basin.43,44 These disturbances contribute to a large proportion of rainfall in these tropical regions. In Figs. 3(a) and 3(c), meta-communities over West Africa experience a northward shift, which is also known as “monsoon jump.”45,46 Along with this, easterly waves emanating from West Africa are enhanced in August over the tropical North Atlantic.47 They are the primary precursors of tropical cyclones, which propagate toward the Caribbean Sea and North America.22,48–50 It has also been proposed that the easterly waves from West Africa have a weak correlation with tropical cyclogenesis in the eastern North Pacific.51 Such a correlation provides a possible explanation for the synchronized EREs between them [see Figs. 3(a) and 3(c)].
For surrounding regions related to the Southwest United States monsoon, our results capture the mid-summer drought from July to August, especially over Central America, southern Mexico, and part of the Caribbean Sea [see Figs. 3(a) and 3(c)].52,53 In June, the intensified Caribbean and Great Plains low-level jets together contribute to the moisture supply for rainfall in parts of North America.54 This leads to the synchronized EREs along the way from the Caribbean Sea to the central United States [see Figs. 2(a) and 2(c)].
Apart from the above-mentioned synchronization related to monsoon, it is also observed that the meta-communities in Figs. 2 and 3 span the mid-latitude belt. The connections between them are mainly via upper-level teleconnection patterns, such as the “Silk Road teleconnection”55 and the “circumglobal teleconnection”56,57 in the Northern Hemisphere. For example, the summer rainfall variation of North America stems from East Asian subtropical monsoon heating by the Asia–North America teleconnection,58 as meta-communities covering both Meiyu/Baiu region and parts of North America in Fig. 2; the positive phase of circumglobal wave train is associated with significantly enhanced precipitation over western Europe and northwestern India,56,59 as in Fig. 3. A detailed case study on this is provided in Sec. III C.
C. Regional significant interdependence of EREs: A case study related to ASM
To further verify the intraseasonal change given by the identified meta-communities, as in Figs. 2 and 3, we here present an ASM-related case study on the upper-level atmospheric circulation. Two pairs of relationships are chosen based on significant connections within meta-communities (see Figs. 9 and 10 in Appendix C). A particular reason for the first pair in Figs. 9(e) and 9(f) is that, in the early summer of 2018, rainfall extremes occurred in South-East Europe and South-West Japan nearly simultaneously.57 Instead of southwestern Japan, we select a large area covering parts of southern China and the Yangtze River Valley since they are all within the Meiyu/Baiu rainband. Meanwhile, it has also been suggested that a strong Indian summer monsoon is often accompanied by significant above normal rainfall over part of western and central Europe.22,60 We select the second pair based on this as indicated in Figs. 10(e) and 10(f). The composite anomalies, obtained according to the two pairs of relationships, are given in Figs. 4 and 5.
In Figs. 4(b) and 4(d), more synchronizations appear in June followed by a decreasing trend, while Figs. 5(b) and 5(d) present a reversed tendency. There is a recognizable replacement of an eastward upper-level zonal wind movement, as in Figs. 4(a) and 4(c), with a westward direction, as in Figs. 5(a) and 5(c) over Japan, for example. This is consistent with the end of the Meiyu, indicating the disappearance of the high-altitude westerly jet streams and the appearance of the easterlies over the same region.35 Additionally, the upper-level meridional wind movement, as shown in Figs. 11 and 12 in Appendix D, confirms the function of Rossby waves in creating synchronized EREs for remote mid-latitude regions.22,55,56 The identified memberships of two pairs of significant connections to different meta-communities are in agreement with the temporal order and upper-level atmospheric conditions between them. This case study also suggests that the global synchronization structures in Figs. 2 and 3 resemble the distribution of rainfall anomalies (above normal) since only the occurrences of EREs are considered.
IV. DISCUSSION OF ROBUSTNESS
The clustering workflow proposed here enables us to identify regions following similar dynamical properties in an automatic and reliable way. In Sec. III, we obtain two primary global synchronization structures characterizing the intraseasonal changes by using both TRMM and B-GPM data. This section examines whether these two structures are robust with respect to the choice of parameters. The parameters are divided into two groups based on their usage: (i) the network reconstruction step needs an event threshold (th by IPCC), the significance level, and the maximum delay (τmax) and (ii) the clustering workflow needs , , and .
The parameters in group (i) determine the structure of the constructed networks, which serve as input for the clustering analysis. We have no prior knowledge of the underlying network structure. Therefore, different choices for the maximum delay (τmax) and the significance level should be considered first. For the parameters in group (ii), although the underlying community structure is also unknown, we can still use modularity to quantify the quality of the community structure. That is, one chooses the community structure with the highest modularity.26 Therefore, all parameters in group (ii) are chosen according to the highest-modularity solution; thus, we only discuss τmax and the significance level in the remainder of this section. It should also be noted that the robustness analysis in this section is different from the significance test, which is already considered in the reconstruction step of the functional climate networks in Sec. II A.
A. Robustness over τmax
Different choices of the maximum delay τmax (in days) are considered, as in Figs. 13 and 14 in Appendix E. When τmax decreases (increases) to 3 (30) days, the obtained monsoon-related spatial distribution shrinks (extends) to relatively local (wide) areas (see Figs. 15–18 in Appendix E). Indeed, a shorter delay implies that only event pairs that are closely synchronized in time are counted. These pairs are more likely to be found in neighboring regions. The assumption of a short delay might be too strong to capture a long-distance relationship established by a physical atmospheric process. However, it is also probable that such a relationship appears as the result of a chain of mechanisms with indirect transitivity. This can be seen from the comparisons between Figs. 15 and 17, or Figs. 16 and 18, since the spatial pattern under the delay of 3 days is preserved under that of 30 days. That is, the identified intraseasonal synchronization structures are stable for τmax between 3 and 30 days.
B. Robustness over a significance level
The significance level determines the topological structure of the reconstructed functional climate network. A particularly low threshold may eliminate some important synchronization relationships,22 while a higher value amplifies the multiple-comparison bias. We therefore consider two conservative choices of the significance level (see Figs. 19 and 20 in Appendix F). The monsoon-related spatial distributions, shown in Figs. 21–24 in Appendix F, remain consistent with Figs. 2 and 3. Comparing Figs. 8 and 19, or Figs. 20 and 19, shows that lowering the significance threshold yields a slight increase in modularity. This suggests that the community structure is an intrinsic attribute of the underlying self-organized climate system.
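How the significance level controls the number of retained links can be sketched as follows. A candidate link is kept when its observed synchronization count exceeds the (1 − α) quantile of a null distribution. The Poisson null samples, the observed counts, and the α values below are placeholders chosen purely for illustration; they are not the null model or the significance levels used in this work.

```python
import numpy as np

def significant_links(observed, null_samples, alpha):
    """Boolean mask of links whose count exceeds the (1 - alpha) null quantile."""
    threshold = np.quantile(null_samples, 1.0 - alpha)
    return np.asarray(observed) > threshold

rng = np.random.default_rng(0)
# Placeholder null distribution of synchronization counts (assumption).
null_samples = rng.poisson(lam=4.0, size=10_000)
# Hypothetical observed counts for five candidate links.
observed = np.array([3, 8, 10, 12, 15])

# Number of retained links for three illustrative significance levels.
n_links = {alpha: int(significant_links(observed, null_samples, alpha).sum())
           for alpha in (0.05, 0.01, 0.001)}
print(n_links)
```

Because the quantile threshold rises as α decreases, a stricter significance level can only remove links, never add them.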
Despite the robustness of the two identified synchronization structures, it is still worth noting that several regions remain dissimilar to a certain degree. For example, the meta-community in Fig. 2(c) covers a larger area of the North Atlantic than that in Fig. 2(a). We attribute this to the uncertainties in the rainfall data, since fluctuations caused by the network-based clustering approach are corrected in our proposed workflow. Ideally, when two precipitation datasets are exactly the same, the reconstructed network structures are identical and the resulting spatial distributions can be perfectly matched. However, satellite observations contain non-negligible random errors and biases introduced during the generation of datasets by factors such as inadequate sampling and deficiencies in algorithms.61 These uncertainties are carried through the analysis, leading to discrepancies in the identified synchronization structures. Nevertheless, the discrepancy information can also be useful for measuring the extent to which different datasets reproduce the same climate phenomenon.
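One simple way to quantify such a discrepancy between two spatial distributions is the Jaccard index of the corresponding grid-cell masks. This is offered only as a possible measure under the assumption that each meta-community is available as a Boolean mask on a common grid; it is not the matching criterion of the proposed workflow, and the masks below are hypothetical.

```python
import numpy as np

def jaccard_overlap(mask_a, mask_b):
    """Jaccard index |A and B| / |A or B| of two Boolean grid masks."""
    a = np.asarray(mask_a, dtype=bool)
    b = np.asarray(mask_b, dtype=bool)
    union = np.logical_or(a, b).sum()
    if union == 0:
        return 1.0  # two empty masks are treated as identical
    return float(np.logical_and(a, b).sum() / union)

# Hypothetical meta-community masks on a common 4 x 6 grid.
mask_trmm = np.zeros((4, 6), dtype=bool)
mask_trmm[1:3, 1:4] = True   # 6 grid cells
mask_gpm = np.zeros((4, 6), dtype=bool)
mask_gpm[1:3, 2:6] = True    # 8 grid cells, 4 of them shared

overlap = jaccard_overlap(mask_trmm, mask_gpm)
print(overlap)  # 4 shared cells / 10 cells in the union = 0.4
```

A value of 1 would indicate perfectly matched spatial distributions, as expected for two identical datasets, while lower values indicate larger discrepancies.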
In summary, our work starts from the identification of regions of similar climatological behavior and focuses on revealing the global view of the interdependence patterns of EREs in the context of the global monsoon. For this purpose, we provide a network-based clustering workflow that combines consensus clustering and mutual correspondences. By means of this clustering workflow, we identify two main monsoon-related synchronization structures with independent temporal and spatial characteristics. The first structure, primarily from early June to mid-July, shows early-summer features over different monsoon systems, including the synchronized initiation of the Indian rainy season and Meiyu/Baiu over the ASM, the oceanic regime over West Africa, and the early rainy season over the Caribbean Sea toward the central United States. The second structure, from mid-July to late August, manifests the states after the “monsoon jump,” including the establishment of the “north” teleconnection inside the ASM between the Indian summer monsoon and the rainfall in North China, and the continental regime over West Africa. Meanwhile, the mid-summer drought dominates Central America and southern Mexico. We also show that the significant connections inside the identified meta-communities capture the synchronized behavior via the “Silk Road teleconnection”55 and the “circumglobal teleconnection”56,57 for the mid-latitude belt.
The successful identification of intraseasonal spatiotemporal variability essentially stems from the combination of ES and complex-network theory to represent the underlying climate system as networks. This research can be extended in several potentially relevant directions. First, how interaction processes between different monsoons form at different levels is crucial for the physical understanding of the identified global monsoon structure. Second, this clustering workflow can be applied to other seasonal analyses, such as the boreal winter, which brings marked monsoonal characteristics to the Southern Hemisphere. Third, the application to scenarios with higher spatial resolution is a computational problem worthy of further consideration. Fourth, the combined application of event synchronization and community detection to other types of data, in particular in neuroscience, such as EEG signals and magnetic resonance imaging (MRI), is another promising direction.
We would like to thank Shraddha Gupta, Niklas Boers, and Frederik Wolf for valuable discussions as well as Yang Liu and Fabian Brandt-Tumescheit for technical support. Z.S. was funded by a China Scholarship Council (CSC) scholarship. J.K. was supported by the Russian Ministry of Science and Education (Agreement No. 075-15-2020-808). H.M. was supported by the German Research Foundation (DFG) grant ME-3619/4-1 (ALMACOM).
Conflict of Interest
The authors have no conflicts to disclose.
The data that support the findings of this study are available within the article.
APPENDIX A: THE MOTIVATION FOR THE NETWORK-BASED WORKFLOW
Illustration of the limitation of the significance test for link bundles under TRMM (Fig. 6). Illustration of fluctuations of clustering approaches, such as PLM, under (a) TRMM and (b) B-GPM, respectively (Fig. 7).
APPENDIX B: COMMUNITY STRUCTURE UNDER THE DELAY OF 10 DAYS
Community structures for TRMM and B-GPM, respectively (Fig. 8).
APPENDIX C: THE PROCESS TO DETERMINE REGIONAL SIGNIFICANT INTERDEPENDENCE FOR THE ASM-RELATED CASE STUDY
Teleconnection pattern given for South-East Europe for TRMM [(a), (c), and (e)] and B-GPM [(b), (d), and (f)] (Fig. 9). Teleconnection pattern given for northern India for TRMM [(a), (c), and (e)] and B-GPM [(b), (d), and (f)] (Fig. 10).
APPENDIX D: THE MERIDIONAL COMPONENT FOR THE ASM-RELATED CASE STUDY
Atmospheric conditions for the teleconnection pattern between South-East Europe and South China, marked with yellow boxes (Fig. 11). Same as Fig. 11, but between western and central Europe and northern India (Fig. 12).
APPENDIX E: THE ROBUSTNESS TEST ON DIFFERENT DELAYS
Same as Fig. 8, but for a different maximum delay τmax (Fig. 13). Same as Fig. 8, but for a different maximum delay τmax (Fig. 14). Same as Fig. 2, but for a different maximum delay τmax (Fig. 15). Same as Fig. 3, but for a different maximum delay τmax (Fig. 16). Same as Fig. 2, but for a different maximum delay τmax (Fig. 17). Same as Fig. 3, but for a different maximum delay τmax (Fig. 18).
APPENDIX F: THE ROBUSTNESS TEST ON DIFFERENT SIGNIFICANCE LEVELS
Same as Fig. 8, but for a different significance level (Fig. 19). Same as Fig. 8, but for a different significance level (Fig. 20). Same as Fig. 2, but for a different significance level (Fig. 21). Same as Fig. 3, but for a different significance level (Fig. 22). Same as Fig. 2, but for a different significance level (Fig. 23). Same as Fig. 3, but for a different significance level (Fig. 24).