Much effort has been devoted to assessing the importance of nodes in complex biological networks (such as gene transcriptional regulatory networks, protein interaction networks, and neural networks). Commonly used measures of node importance include node degree, node centrality, and node vulnerability score (the effect of deleting the node on the network efficiency). Here, we present a new approach to compute and investigate the mutual dependencies between network nodes from the matrices of node–node correlations. To this end, we first define the dependency of node i on node j (or the influence of node j on node i), D(i, j), as the average over all nodes k of the difference between the i–k correlation and the partial correlation between these nodes with respect to node j. Note that the dependencies D(i, j) define a directed weighted matrix, since, in general, D(i, j) differs from D(j, i). For this reason, many of the commonly used measures of node importance, such as node centrality, cannot be used. Hence, to assess node importance in the dependency networks, we define the system level influence (SLI), or influence score, of antigen j, SLI(j), as the sum of D(i, j) over all nodes i, that is, the sum of the influence of j on all other antigens. We introduce the new approach and demonstrate that it can unveil important biological information in the context of the immune system. More specifically, we investigated antigen dependency networks computed from antigen microarray data of autoantibody reactivity of IgM and IgG isotypes present in the sera of ten mothers and their newborns. The analysis revealed that only a subset of the antigens with high influence scores (SLI) is common to both the mothers and the newborns.
Network comparison in terms of modularity (using Newman’s algorithm) and of topology (measured by the divergence rate) revealed that, at birth, the IgG networks exhibit a more profound global reorganization while the IgM networks exhibit a more profound local reorganization. During immune system development, the modularity of the IgG network increases and becomes comparable to that of the IgM networks at adulthood. We also found several conserved IgG and IgM network motifs between the maternal and newborn networks, which might retain network information as the immune system develops. If correct, these findings demonstrate that the new approach can unveil significant biological information. Whereas we have introduced the new approach within the context of the immune system, it is expected to be effective in studies of other complex biological, social, financial, and manmade networks.
The immune system is a dynamic network whose complexity is comparable to that of the central nervous system. Being an adaptive–responsive complex system, it stores latent information about body conditions. This information can in principle be deciphered, provided proper analyses of the immune network state are used. Here, we introduce a new approach to investigate the immune state based on the construction of a network of mutual dependencies between antigens. These antigen dependency networks are computed from antigen microarray data of the reactivity of hundreds of autoantibodies. We used this approach to investigate the sera of ten mother–newborn pairs. Inspection of the local topological organization revealed a higher topological similarity between the IgG networks at birth (newborns) and adulthood (mothers). Analyses of the global network organization revealed that: 1. At birth, the modularity of the newborns’ IgM network is higher than that of their IgG network. 2. During the development of the immune system, the modularity of the IgG network increases and becomes comparable to that of the IgM networks at adulthood. Defining a new measure, which we termed the antigen influence score, we found that the lists of the 20 most influential antigens include three antigens common to the newborns and mothers for both the IgG and the IgM networks. The analysis further unveils the existence of several conserved network motifs (between the newborn and maternal networks), which might retain important immune information as the immune system develops. If correct, these results demonstrate that the new approach can unveil important biological information within the context of the immune system. We expect that it will be as effective in exposing hidden information in a wide range of other complex biological networks (e.g., gene networks, protein interaction networks, and neural networks), as well as social, financial, and manmade networks.
I. INTRODUCTION
Antibody networks have been studied in the past based on the connectivity between idiotypes and anti-idiotypes—antibodies that bind one another. Here, we call attention to a different network of antibodies, antibodies connected by their reactivities to sets of antigens—the antigen–reactivity network. The recent development of antigen microarray chip technology for detecting global patterns of antibody reactivities makes it possible to study the immune system quantitatively using network theory methods. In this immune network, the nodes represent the antigens spotted on the chip, and the links between the nodes represent the relationships between the antibody reactivities calculated for each group of subjects. In other words, an immune network for a given group of subjects corresponds to the network of similarities between antigen reactivities within that group of subjects.
In an earlier paper,1 we demonstrated the use of the widespread Pearson correlation coefficient as the similarity measure to differentiate between maternal and newborns datasets. More specifically, we evaluated the antigen–antigen correlations by analysing antigen microarray data of the reactivity of IgM and IgG autoantibodies present in the sera of pairs of mothers and their newborns. We reported that the IgG repertoires of each mother and her newborn were very closely related and distinct for each mother–newborn pair.1 The IgM repertoires, in contrast, differed markedly between mothers and newborns; each mother manifested a different pattern of IgM reactivities that was distinct from her newborn’s cord IgM repertoire. However, the IgM reactivities of each of the newborns manifested very similar antigen-binding profiles indicating that in utero each developing fetus produced autoantibodies to a similar set of self-molecules. A subsequent analysis of the data revealed that the reactivity profiles to certain self-molecules were highly correlated as sets of functional antigen-reactivity cliques.1
More recently, we have extended the aforementioned study by applying graph and network theory analysis methods2–4; specifically, we evaluated the minimum spanning tree (MST) for the networks of antigen correlations computed from the correlation matrices.1–6 We thus made it possible to identify communities and their relations according to the topology of the network. While investigations of the antigen–antigen correlations from microarray data have proven to be an efficient bioinformatics approach to decipher important features related to functional relations between antigens, they do not provide information about the causal relations between antigens. The properties that can be inferred from studying correlations have been assessed in detail in the past.7–9
Recently, Kenett et al.10 introduced a new method to study relationships of influence, or dependency, by using partial correlations to construct a new type of network. In their study, they applied this approach to the analysis of stock relationships and were able to uncover important information regarding the underlying dependency relationships between stocks traded in the New York Market.
Here, we adopted and generalized the aforementioned method to develop a system level analysis of antigen dependency networks as a step toward inference of causal relations between antigens. The analysis is based on partial correlations, which are becoming ever more widely used to investigate complex systems. Examples range from studies of biological systems, such as gene networks,11,12 to financial systems, in inspections of the market index effect on stock correlations.13 In simple words, the partial (or residual) correlation is a measure of the effect (or contribution) of a given antigen, say j, on the correlation between another pair of antigens, say i and k. To be more specific, the partial correlation of the (i, k) pair given j is the correlation between them after proper subtraction of the correlations between i and j and between k and j. Defined this way, the difference between the correlation and the partial correlation provides a measure of the influence of antigen j on the (i, k) correlation. Therefore, we define the influence of antigen j on antigen i, or the dependency of antigen i on antigen j, D(i, j), to be the average influence of antigen j on the correlations of antigen i with all other antigens. Next, we define the system level influence (SLI) of antigen j, SLI(j), as the sum of the influence D(i, j) of j on all other antigens i.
To demonstrate that the antigen dependency network analysis can unveil important biological information, we used the new approach to reanalyze autoantibodies of the IgM and IgG isotypes present in the sera of mothers and their newborns.
To this end, we constructed the antigen dependency networks for the groups of mothers and newborns. We first constructed the IgG, IgM, and combined IgG and IgM correlation networks and the networks of antigen dependencies for the two groups. Next, the networks of the two subject groups were compared by employing two measures that were developed in the context of network theory and are widely used: the divergence rate14 and the modularity score.15 The first method was used to test for significant differences between the topological organizations of the two networks by quantifying their differences. The second method was used to assess the differences in the modular organization between the two networks, that is, the ability to partition the network into modules such that the number of edges between modules is significantly less than expected by chance. We found a higher modularity for the maternal networks, specifically for the IgG network. In terms of topological similarities, we detected that the IgG networks of the two groups are more similar to each other than the IgM networks are.
Our results might indicate that the development of the immune network from birth to adulthood is accompanied by an increase in network modularity and by a slight increase in the similarities between the isotype networks. Furthermore, we proceeded to identify, analyze, and compare the networks of the two subject groups in terms of the most influential antigens. We found that the most influential antigens in the mothers and newborns are composed of both isotypes, IgG and IgM, which points to the additional role of the transferred IgG in influence over the network. We also found a few subnetworks of highly connected antigens that are conserved between the two groups in both the IgG and IgM isotypes. These findings illustrate that the analysis of antigen dependency networks makes it possible to unveil antibodies that may be influential in the maturation of the immune system.
II. METHODS
a. The antigen–antigen correlations
First, we computed the antigen–antigen correlations from the antibody reactivity data obtained by antigen microarray technology, as was done in the past.1 The correlations between the antigen-reactivity profiles (the reactivities of an antigen in all subjects) were calculated with the Pearson formula:16

$$C(i,j) = \frac{\langle [X_i(n) - \langle X_i \rangle]\,[X_j(n) - \langle X_j \rangle] \rangle}{\sigma_i \sigma_j},$$

where Xi(n) and Xj(n) are the reactivities of antigens i and j in subject n, ⟨·⟩ denotes the average over all subjects, and σi and σj are the standard deviations of the reactivity profiles of antigens i and j. Note that the antigen–antigen correlations (or, for simplicity, the antigen correlations) for all pairs of antigens define a symmetric correlation matrix whose (i, j) element is the correlation between antigens i and j.
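The correlation matrix described above can be illustrated with a minimal sketch in plain Python. The function name `pearson_matrix`, the use of the population standard deviation, and the toy reactivity values are assumptions made for illustration only; they are not taken from the original analysis.

```python
import math

def pearson_matrix(reactivity):
    """reactivity[i][n] is the reactivity of antigen i in subject n.

    Returns the symmetric correlation matrix C as nested lists.
    """
    n_antigens = len(reactivity)
    n_subjects = len(reactivity[0])
    means = [sum(row) / n_subjects for row in reactivity]
    # Population standard deviation (assumption: divide by the number of subjects).
    stds = [math.sqrt(sum((x - m) ** 2 for x in row) / n_subjects)
            for row, m in zip(reactivity, means)]
    corr = [[0.0] * n_antigens for _ in range(n_antigens)]
    for i in range(n_antigens):
        for j in range(n_antigens):
            cov = sum((reactivity[i][n] - means[i]) * (reactivity[j][n] - means[j])
                      for n in range(n_subjects)) / n_subjects
            corr[i][j] = cov / (stds[i] * stds[j])
    return corr

# Hypothetical toy data: 3 antigens measured in 4 subjects.
toy = [
    [1.0, 2.0, 3.0, 4.0],
    [2.0, 4.0, 6.0, 8.0],  # rises together with antigen 0
    [4.0, 3.0, 2.0, 1.0],  # falls as antigen 0 rises
]
C = pearson_matrix(toy)
```

An antigen with a constant reactivity profile has zero standard deviation and would need separate handling; the sketch does not cover that case.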
b. Partial correlations
Next, we use the resulting antigen correlations to compute the partial correlations. The first order partial correlation coefficient is a statistical measure indicating how a third variable affects the correlation between two other variables.13 Since here the variables are the antigen reactivities, we will proceed with the presentation for this case. The partial correlation between antigens i and k with respect to a third antigen j, PC(i, k∣j),17 is defined as:

$$PC(i,k \mid j) = \frac{C(i,k) - C(i,j)\,C(j,k)}{\sqrt{[1 - C^2(i,j)]\,[1 - C^2(j,k)]}},$$
where C(i, j), C(i, k), and C( j, k) are the antigen correlations defined above.
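As an illustration, the standard first-order partial correlation can be written as a small function of the three pairwise correlations. The function name and the numerical values below are hypothetical and serve only to show the calculation.

```python
import math

def partial_correlation(c_ik, c_ij, c_jk):
    """First-order partial correlation PC(i, k | j) from three pairwise correlations."""
    denom = math.sqrt((1.0 - c_ij ** 2) * (1.0 - c_jk ** 2))
    if denom == 0.0:
        # The coefficient is undefined when i or k is perfectly correlated with j.
        raise ValueError("undefined when |C(i, j)| or |C(j, k)| equals 1")
    return (c_ik - c_ij * c_jk) / denom

# Hypothetical values: C(i, k) = 0.6, C(i, j) = 0.7, C(j, k) = 0.8.
pc = partial_correlation(0.6, 0.7, 0.8)

# If j is uncorrelated with both i and k, the partial correlation equals C(i, k).
pc_unrelated = partial_correlation(0.5, 0.0, 0.0)
```

In the first case most of the i–k correlation is accounted for by antigen j, so the partial correlation is much smaller than C(i, k).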
c. The correlation influence and correlation dependency
The relative effect of the correlations C(i, j) and C(j, k) of antigen j on the correlation C(i, k)10 is given by:

$$d(i,k \mid j) \equiv C(i,k) - PC(i,k \mid j).$$

This avoids the trivial case where antigen j appears to strongly affect the correlation C(i, k) mainly because C(i, j), C(i, k), and C(j, k) have small values. We note that this quantity can be viewed either as the correlation dependency of C(i, k) on antigen j (the term used here) or as the correlation influence of antigen j on the correlation C(i, k).10
d. Antigen dependencies
Next, we define the total influence of antigen j on antigen i, or the dependency D(i, j) of antigen i on antigen j, to be:

$$D(i,j) = \frac{1}{N-1} \sum_{k \neq j} \left[ C(i,k) - PC(i,k \mid j) \right],$$

where N is the number of antigens. As defined, D(i, j) is a measure of the average influence of antigen j on the correlations C(i, k), over all antigens k not equal to j. The antigen dependencies define a dependency matrix D whose (i, j) element is the dependency of antigen i on antigen j. It is important to note that while the correlation matrix C is symmetric, the dependency matrix D is nonsymmetric, D(i, j) ≠ D(j, i), since the influence of antigen j on antigen i is not equal to the influence of antigen i on antigen j. For this reason, some of the methods used in the analyses of the correlation matrix [e.g., principal component analysis (PCA)] have to be replaced or are less efficient. However, other methods, similar to the ones presented here, can also account for the nonsymmetric nature of the dependency matrix.
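The following sketch computes a dependency matrix from a small correlation matrix, following the definition above: the difference between the correlation and the partial correlation, averaged over k ≠ j with the factor 1/(N − 1). The function name, the convention of setting the diagonal to zero, and the toy matrix are assumptions for illustration.

```python
import math

def dependency_matrix(corr):
    """D[i][j] = average over k != j of C(i, k) - PC(i, k | j)."""
    n = len(corr)
    dep = [[0.0] * n for _ in range(n)]  # diagonal left at 0 (assumed convention)
    for i in range(n):
        for j in range(n):
            if i == j:
                continue
            total = 0.0
            for k in range(n):
                # The k == i term is identically zero, because
                # C(i, i) = PC(i, i | j) = 1, so it is skipped.
                if k == j or k == i:
                    continue
                denom = math.sqrt((1 - corr[i][j] ** 2) * (1 - corr[k][j] ** 2))
                pc = (corr[i][k] - corr[i][j] * corr[k][j]) / denom
                total += corr[i][k] - pc
            dep[i][j] = total / (n - 1)
    return dep

# Hypothetical 3-antigen correlation matrix.
C = [
    [1.0, 0.8, 0.5],
    [0.8, 1.0, 0.6],
    [0.5, 0.6, 1.0],
]
D = dependency_matrix(C)
```

Even for this symmetric input, D[0][1] and D[1][0] differ, which illustrates the directed nature of the dependencies.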
e. The antigen SLI
Next, we sorted the antigens according to the system level influence of each antigen on the correlations between all other antigen pairs. The system level influence of antigen j, SLI(j), is simply defined as the sum of the influence of j on all other antigens i not equal to j, that is:

$$SLI(j) = \sum_{i \neq j} D(i,j).$$
For completeness of the presentation we note that in a recent publication in which we studied the symmetric antigen–antigen correlations, we employed the widely used eigenvalue centrality6,18 as a measure of the antigen importance. Here, we employed the system level influence, which can better account for the nonsymmetric nature of the antigen dependencies, as the measure of the antigen importance.
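Because SLI(j) is the sum of D(i, j) over all i ≠ j, it is simply the sum of column j of the dependency matrix. A minimal sketch is given below; the dependency values and antigen names are hypothetical.

```python
def system_level_influence(dep):
    """SLI(j): sum of column j of the dependency matrix, excluding the diagonal."""
    n = len(dep)
    return [sum(dep[i][j] for i in range(n) if i != j) for j in range(n)]

# Hypothetical dependency matrix; D[i][j] is the dependency of antigen i on antigen j.
D = [
    [0.00, 0.30, 0.05],
    [0.10, 0.00, 0.02],
    [0.20, 0.25, 0.00],
]
names = ["antigen A", "antigen B", "antigen C"]

sli = system_level_influence(D)
# Sort antigens from most to least influential.
ranking = sorted(zip(names, sli), key=lambda pair: pair[1], reverse=True)
```

Sorting by SLI in decreasing order yields the kind of ranked list from which the most influential antigens are read off.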
f. Network representation of the antigen dependencies
Similar to the correlation matrix, the dependency matrix can also be presented as a weighted network whose nodes are the antigens and whose edges are the antigen dependencies. However, the nonsymmetric nature of the dependency matrix is reflected by the fact that the antigen dependency network is a directed graph. This means that the direction of an edge determines its value, i.e., the value of the edge directed from node i to node j can differ from the value of the edge directed from node j to node i.
g. Informative subgraphs of the dependency network
The complete dependency network for N antigens contains N(N − 1) edges. Since most of the edges have small values (weak dependencies), the relevant information about the network (e.g., topology, organization, and the most influential antigens) can be obscured. Several methods have been developed to overcome this obstacle by constructing from the complete network a subgraph that captures the most relevant information embedded in the original network. A widely used method to construct an informative subgraph of a complete network is the MST.19–25 Another informative subgraph that retains more information (in comparison to the MST) is the planar maximally filtered graph (PMFG),26 which is used here. Both methods are based on hierarchical clustering, and the resulting subgraphs include all the N nodes in the network, with edges that represent the most relevant antigen dependencies. The MST subgraph contains (N − 1) edges with no loops, while the PMFG subgraph contains 3(N − 2) edges. In a recent publication,5 we constructed the MST subgraph to study the symmetric antigen–antigen correlations. Here, to better account for the nonsymmetric nature of the antigen dependencies, we employed the PMFG subgraph.
h. Construction of the PMFG informative subgraph
To construct the PMFG, we first order the N(N − 1) values of the dependency matrix D in decreasing rank. We then start from the pair of nodes, say i and j, with the highest dependency and draw a directed link j → i between them. The process continues according to the rank order, and at each iteration a directed link is added if and only if the resulting graph (network) is still planar, i.e., it can be drawn on the surface of a sphere without link crossing.26 In the resulting directed subgraph, referred to as {G}, the original values of the dependencies are not retained (i.e., all the directed links have weight 1). We also note that, for N ≫ 1, the subgraph {G} contains 3(N − 2) edges, the maximum number of directed edges for a planar graph.26
i. Hybrid presentation of the informative subgraphs
We developed a hybrid presentation in which the information about the system level influence of the antigens (nodes), namely the influence score, is superimposed on the informative subgraphs by coloring each node j according to its SLI(j) value. It is important to emphasize that since the SLIs were calculated on the original dependency network, they contain additional information that might have been lost in the reduction of the complete network to a subgraph. Hence, the hybrid representation can unveil features beyond those that can be obtained by each analysis on its own. In Fig. 1, we show examples of the informative subgraphs for the maternal and newborns datasets and a comparison to correlation based PMFG networks.
In Appendix A, we show a comparison between the informative subgraphs in which the antigen influence score (SLI) was calculated for the entire network and the case in which they were calculated within the informative subgraphs. Note that in the latter case, since the directed links are not weighted, the measure of the system level influence of each antigen j is simply the total number of directed edges from j to other antigens.
We emphasize that all the analyses described below were performed on the informative subgraphs {G}.
j. Network comparison based on divergence rate
We have used two measures, the divergence rate presented here and the network modularity presented next, for quantitative comparison between the dependency networks of the mothers and newborns. We emphasize that both the divergence rate and network modularity comparisons were performed on the corresponding informative subgraphs and not on the complete networks. The comparisons were performed on the informative subgraphs corresponding to the dependency networks of both the IgG and the IgM isotypes of the maternal and newborns datasets.
The divergence rate measure developed by Lee and Kim27 is based on quantifying the information difference between two processes (variables) using the notion of conditional entropy. In information theory, the specific conditional entropy h(X∣Y = y) is the entropy of a process (variable) X under the condition that another process Y is assigned the value y. The conditional entropy H(X∣Y) is then the average of h(X∣Y = y) over all possible values y that Y can take. It can be shown that H(X∣Y) = H(X, Y) − H(Y), where H(X, Y) is the joint entropy of processes X and Y and H(Y) is the entropy of process Y. The conditional entropy28 has been used to define the metric distance or information distance ID(X, Y) between two processes X and Y as ID(X, Y) ≡ H(X∣Y) + H(Y∣X).
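To make the information distance concrete, the sketch below evaluates ID(X, Y) = H(X∣Y) + H(Y∣X) for a discrete joint distribution, using the standard identity H(X∣Y) = H(X, Y) − H(Y). The function names and the two toy distributions are illustrative assumptions.

```python
import math

def entropy(probs):
    """Shannon entropy in bits; zero-probability outcomes contribute nothing."""
    return -sum(p * math.log2(p) for p in probs if p > 0)

def information_distance(joint):
    """joint[x][y] = P(X = x, Y = y). Returns H(X|Y) + H(Y|X) in bits."""
    p_x = [sum(row) for row in joint]
    p_y = [sum(col) for col in zip(*joint)]
    h_xy = entropy([p for row in joint for p in row])
    h_x_given_y = h_xy - entropy(p_y)
    h_y_given_x = h_xy - entropy(p_x)
    return h_x_given_y + h_y_given_x

identical = [[0.5, 0.0], [0.0, 0.5]]        # Y always equals X
independent = [[0.25, 0.25], [0.25, 0.25]]  # X and Y are independent fair bits

id_identical = information_distance(identical)
id_independent = information_distance(independent)
```

The distance vanishes when each process fully determines the other and is largest when the processes share no information.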
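Motivated by this notion, Lee and Kim27 define the metric distance (MD) MD(GX, GY) between two graphs {GX} and {GY}, by analogy with the information distance above, to be:

$$MD(G_X, G_Y) \equiv CDiv(G_X \mid G_Y) + CDiv(G_Y \mid G_X),$$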
where CDiv(GX∣GY) and CDiv(GY∣GX) can be viewed as conditional divergences and are calculated as follows. First, we define DGX(i) to be the sum of the topological distances from a node i to all its neighboring nodes {i}NN. Then we define the conditional distance CDiv(GX∣GY)(i) to be the sum of the topological distances in graph {GY} from node i to the group of nodes {i}NN defined in graph {GX}. Note that these nodes, which are in the neighborhood of i in graph {GX}, need not be in the neighborhood of i in graph {GY}. We also note that DGY(i) and CDiv(GY∣GX)(i) are defined in a similar way. With these definitions at hand, CDiv(GX∣GY) is defined to be:
Note that in the informative subgraphs studied here, each directed link from node i to node j corresponds to a topological distance 1 from i to j, and we take the neighboring nodes {i}NN to be the nodes that have a topological distance 1 from node i. The topological distance between two nodes that are not directly connected by an edge is the number of directed edges of the shortest path connecting the two nodes.
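The per-node quantities described above can be sketched with a breadth-first search over directed edges. The sketch computes only DGX(i) and the conditional distance of a single node; it does not reproduce the final aggregation into CDiv or MD. The function names, the choice of out-neighbors as {i}NN, and the `unreachable` penalty for nodes with no connecting path are all assumptions for illustration.

```python
from collections import deque

def shortest_path_lengths(adj, source):
    """Breadth-first search; adj maps each node to the set of its successors."""
    dist = {source: 0}
    queue = deque([source])
    while queue:
        node = queue.popleft()
        for nxt in adj.get(node, ()):
            if nxt not in dist:
                dist[nxt] = dist[node] + 1
                queue.append(nxt)
    return dist

def node_distances(graph_x, graph_y, node, unreachable):
    """Return (D_GX(node), conditional distance of node measured in graph_y)."""
    neighbours = graph_x.get(node, set())  # {i}_NN taken as out-neighbours in G_X
    d_x = len(neighbours)                  # each neighbour is at distance 1 in G_X
    dist_y = shortest_path_lengths(graph_y, node)
    # Neighbours that cannot be reached in G_Y get the assumed penalty value.
    cond = sum(dist_y.get(nb, unreachable) for nb in neighbours)
    return d_x, cond

# Hypothetical directed graphs on three nodes.
graph_x = {0: {1, 2}, 1: {2}, 2: set()}
graph_y = {0: {1}, 1: {2}, 2: set()}

result_node0 = node_distances(graph_x, graph_y, 0, unreachable=3)
```

For node 0, both neighbors in {GX} are at distance 1, whereas in {GY} node 2 is reached only through node 1, so the conditional distance is larger than DGX(0).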
k. Network comparison based on network modularity
Modular organization is characteristic of many biological and social networks alike and it is a hallmark of systems that perform multiple-parallel tasks.18,29–32 Modular organization means that the network is composed of subgroups of nodes that are strongly connected (e.g., strong correlations in the case of correlation networks or strong dependencies in the case studied here), with sparser or weaker connections between the modules. The ability to detect such groups could be of significant practical importance. For instance, groups within the correlation of gene expression network might correspond to sets of genes with related functions.31
Several methods have been developed to identify and quantify network modularity. Here we employ Newman’s partitioning algorithm15 as a quantitative measure of the network modularity score. Then we compare the dependency networks of the mothers and newborns based on their modularity score. The partitioning algorithm is based on computing a modularity matrix [M] from the adjacency matrix [A] associated with a graph {G}. The idea is to subtract from the adjacency matrix a “shuffled” adjacency matrix. The latter corresponds to a shuffled graph (network) in which the edges of the original network are distributed randomly between the nodes. More specifically, since the element A(i, j) of the adjacency matrix is the number of edges connecting nodes i and j (note that it can be 0, 1, or 2 in the case studied here), the element M(i, j) of the modularity matrix is defined to be:

$$M(i,j) \equiv A(i,j) - \frac{k(i)\,k(j)}{2 N_{\mathrm{Links}}},$$
where k(i) and k(j) are the degrees of nodes i and j, and NLinks is the total number of edges in the original graph. Once the modularity matrix is computed, the next step is to find its leading (most positive) eigenvalue and the corresponding eigenvector; the graph is then partitioned into two groups, one containing the nodes with positive eigenvector elements and the other containing those with negative elements. Then, generally speaking, the process continues, and at each iteration a generalized modularity matrix is computed (after proper subtraction of the separated groups from the previous modularity matrix). The process halts when the generalized modularity matrix has no positive eigenvalues. To calculate the modularity score, we used the generalized modularity matrix and averaged the sum of all its elements.
We note that the use of Newman’s algorithm provides a size invariant modularity measure and thus enables us to study the role of network size on modularity as an independent, interesting organization variable.
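As a minimal illustration, the sketch below builds the standard Newman modularity matrix, A(i, j) − k(i)k(j)/(2 NLinks), and evaluates the usual modularity score Q for a partition that is supplied by hand. It does not implement the iterative eigenvector bisection, and the standard Q used here may differ in normalization from the score described above. The function names and the toy graph (two triangles joined by one edge) are assumptions.

```python
def modularity_matrix(adj):
    """adj: symmetric adjacency matrix as nested lists. Returns (M, number of edges)."""
    n = len(adj)
    degree = [sum(row) for row in adj]
    n_links = sum(degree) / 2.0
    mod = [[adj[i][j] - degree[i] * degree[j] / (2.0 * n_links) for j in range(n)]
           for i in range(n)]
    return mod, n_links

def modularity_score(adj, labels):
    """Standard Newman Q: within-module sum of M(i, j) divided by 2 * NLinks."""
    mod, n_links = modularity_matrix(adj)
    n = len(adj)
    within = sum(mod[i][j] for i in range(n) for j in range(n)
                 if labels[i] == labels[j])
    return within / (2.0 * n_links)

# Hypothetical graph: two triangles (0-1-2 and 3-4-5) joined by the edge 2-3.
edges = [(0, 1), (0, 2), (1, 2), (3, 4), (3, 5), (4, 5), (2, 3)]
A = [[0] * 6 for _ in range(6)]
for a, b in edges:
    A[a][b] += 1
    A[b][a] += 1

q_two_modules = modularity_score(A, [0, 0, 0, 1, 1, 1])
q_one_module = modularity_score(A, [0, 0, 0, 0, 0, 0])
```

Splitting the graph into its two triangles gives a positive score, while placing all nodes in a single module gives zero, since the elements of the modularity matrix sum to zero.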
III. RESULTS
We studied the antigen correlation matrices for the antigen-reactivity data of mothers and their newborns. We start with the combined correlation matrices of both IgG and IgM isotypes. The dependency networks for the mothers and newborns were calculated separately and then compared.
Several standard algorithms for automatic graph drawing were implemented. Here we used two main network layouts: spring embedders based on minimization of the total energy of the system (Kamada–Kawai33 and Fruchterman–Reingold34). In Fig. 2, we show results of the hybrid presentation of the informative subgraphs of the dependency networks. We start by showing the differences between the isotypic organization (IgG/IgM) [Figs. 2(a) and 2(b)]. Comparing Figs. 2(a) and 2(b) indicates a slightly higher isotype integration in the maternal combined IgG and IgM network, as reflected by the more homogeneous distribution of the IgG and IgM isotypes in this network. Quantitative comparison shows 684 edges between nodes from different isotypes in the mothers versus only 511 such edges in the newborns. Another interesting point comes from looking at the general direction of the influence of these specific connections. In the maternal dataset, there is more influence of IgG over IgM (385 vs 299); in contrast, in the newborns dataset, there is more influence of IgM over IgG (318 vs 193). Next, we recolored the networks according to the strength of the SLI of each antigen on the correlations between all other antigen pairs, from the most influential antigens (dark red) to the least influential ones (dark blue). We note the wide dispersal of antigens with high influence scores in the networks. Finally, a quantitative comparison between the networks (see Sec. II) revealed low similarities (MD = 1.39) between the overall topology of the networks. Thus, the informative subgraphs of the dependency networks for mothers and newborns yield a relatively distinct topological organization of their reactivities to the different antigens.
We continue to investigate the relationships between the analyzed antigens, and more specifically the most influential antigens. As a proxy of antigen influence, we use the SLI of each antigen on the correlations between all other antigen pairs (see Sec. II). The results for the top 20 most influential antigens in the networks of mothers and newborns are summarized in Fig. 3. As we can see, both isotypes are present among the top influential antigens in the mothers and also in the newborns. In addition, these two lists of the 20 most influential antigens have only one antigen in common, GroEL-13. We also note the existence of many peptides of HSP60, GroEL, and HSP70 in both lists.
a. Separation to isotypes
We continue to test our network analysis approach on the separated isotypes. We note that the produced networks (Fig. 4) are highly significant in comparison to randomly generated networks with equal distribution.
b. Modularity of the antigen network
Modularity is considered to be one of the main organizing principles of biological networks.35,36 A biological network module consists of a set of elements (e.g., proteins/reactions) that form a coherent structural subsystem and have a distinct function. Several studies have explored the role of modularity and network organization in various protein interaction and regulatory cellular networks.37–39 It was suggested that there is a positive selection favoring modularity because it enhances development by enabling evolutionary changes to take place in confined modules while preserving global functions.40 Focusing specifically on modularity in immune networks, the immune antigen dependency networks of mothers and newborns were reconstructed. We then used Newman’s algorithm (Sec. II) to partition the networks and quantified the modularity of each network (Fig. 5).
Observing the network modularity Q for the different isotypes reveals a noteworthy lead for the maternal IgM (Q = 0.752) and newborns IgM (Q = 0.751) networks, followed by the rest: maternal IgG (Q = 0.737) and newborns IgG (Q = 0.666). It seems that the maturation of our immune system involves a higher segregation of the network into modules.
A quantitative MD comparison of the different networks, namely the divergence rate measure (see Sec. II), reveals that the IgG networks are more similar than the IgM networks (MD_IgG = 1.14 vs MD_IgM = 1.37). Furthermore, when observing similarities between the isotypes within each group, the newborns networks are slightly more similar (MD_Newborns = 1.26) than those of the mothers (MD_Mothers = 1.29). As expected, the IgG networks were the most similar, as IgG crosses the placenta during pregnancy and is transferred from the mother to her fetus. However, the marked differences between the IgM networks suggest different network organization, as was shown previously in terms of antibody profiles.1
We continue to investigate the relationships between the analyzed antigens, and more specifically the most influential antigens. The results for the top 20 most influential antigens in the maternal and newborns isotype networks are summarized in Table I. As we can see from these lists, there are only a few common antigens: between the IgG networks, HSP70-6, oligo C, and beta 2 macroglobulin (marked in yellow); between the IgM networks, HSP60-10, HSP60-13, and GroEL-21 (marked in green). A comparison between the isotypes within the groups of mothers and newborns shows common antigens only in the maternal lists (marked in bold): HSP60-22 and beta melanocyte-stimulating hormone. This reemphasizes the larger similarities between the organizations of the maternal isotype networks.
Mothers IgG | Mothers IgM | Newborns IgG | Newborns IgM |
---|---|---|---|
HSP60-22 | HSP60-26 | MT 3 | SOD |
HSP60-25 | HSP70-37 | HSP70-12 | HSP70-3 |
Myeloperoxide | PBS | Oligo C | pG LPD |
Actin | HSP70-8 | HSP70-36 | HSP60-13 |
Factor II | GroEL-23 | GroEL-25 | HSP70-39 |
GroEL-31 | GroEL-13 | HSP70-6 | HSP60-10 |
HSP70-20 | Poly Asp | GroEL-22 | HSP70-43 |
GroEL-18 | IL 4 | HSP60-19 | GroEL-21 |
HSP70-4 | HSP60-35 | E. coli 27 | Peroxidase |
GroEL-3 | C9 nonST | HSP60-277 | Spectrin |
HSP70-28 | HSP60-10 | Lactoferin | HSP60-27 |
HSP70-23 | GroEL-21 | Beta 2 macroglobulin | HSP70-42 |
HSP70-6 | HSP70-36 | MMP3 | GroEL-14 |
Beta MSH | HSP60-28 | PTH | HSP60-23 |
Beta 2 microglobulin | HSP60-22 | HSP70-11 | HSP70-41 |
HSP70-43 | HSP60-13 | MT 180 | HSP70-18 |
HSP70-26 | GroEL-19 | Pepstatin A | HSP60-18 |
HSP60-32 | Ins A | HSP60-10 | GroEL-16 |
Oligo C | Beta MSH | GroEL-10 | Pepstatin |
BETA 2 macroglobulin | MT 256 | GroEL-2 | HSP60-9 |
To quantify the differences in the ordered lists of influential antigens for each of the networks, we used a heuristic method that measures the Euclidean distance between the indexing of the sorted lists of antigens. These results support our previous findings, as they show a higher conservation between the IgG lists (Euclidean distance_IgG = 1936) than between the IgM lists (Euclidean distance_IgM = 1995) and a slightly higher conservation within the maternal networks (Euclidean distance_mothers = 2014) than within the newborns networks (Euclidean distance_newborns = 2018).
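One plausible reading of this heuristic is sketched below: each antigen is assigned its position in each of the two sorted lists, and the Euclidean distance is taken between the two position vectors. This interpretation, the function name, and the toy lists are assumptions; the sketch also assumes that both lists contain exactly the same antigens.

```python
import math

def rank_distance(order_a, order_b):
    """Euclidean distance between the positions of each antigen in two orderings."""
    pos_b = {name: idx for idx, name in enumerate(order_b)}
    return math.sqrt(sum((idx - pos_b[name]) ** 2
                         for idx, name in enumerate(order_a)))

# Hypothetical orderings of four antigens, from most to least influential.
list_a = ["A", "B", "C", "D"]
list_b = ["B", "A", "C", "D"]

distance_same = rank_distance(list_a, list_a)
distance_swapped = rank_distance(list_a, list_b)
```

Identical orderings give a distance of zero, and the distance grows as antigens move further from their positions in the other list, so smaller values indicate higher conservation.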
c. Conserved elements subnetworks
To find conserved network elements between the datasets of the mothers and newborns, we applied a heuristic method that finds such subnetworks between the informative subgraphs of the dependency network for the separated isotypes (see Appendix C). A comparison between the maternal and newborns networks showed a few such conserved elements in the IgG (56 antigens) and IgM (50 antigens) isotypes. The largest conserved subnetwork between the IgG networks consists of 6 antigens, and the largest between the IgM networks consists of 7 antigens (Fig. 6).
IV. DISCUSSION
We present here a new approach to investigate antigen dependency networks computed from matrices of antigen–antigen correlations. The latter are calculated from antigen microarray data of autoantibody reactivity of IgM and IgG isotypes present in the sera of ten mothers and their newborns. We used the antigen dependencies to construct a new quantitative measure, the system level influence of antigen j (or influence score of j), SLI( j), defined as the sum of the influence of j on all other antigens i. While we have introduced the new approach and its ability to unveil important biological information within the context of the immune system, it is expected to be applicable to a wide range of other complex biological networks (e.g., gene networks, protein interaction networks, and neural networks), as well as social, financial, and manmade networks. We expect that, in all of these examples, our method will be able to unveil important hidden information (see also Ref. 41).
Partial correlations were employed to construct the antigen dependencies from the antigen–antigen correlations. More specifically, we first define the dependency of node i on node j (or the influence of node j on node i), D(i, j), as the average over all nodes k of the difference between the correlation C(i, k) and the partial correlation PC(i, k∣j). Using this definition, the influence score SLI( j) is the sum of D(i, j) over all nodes i.
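The definitions of D(i, j) and SLI( j) can be sketched in a few lines. The sketch assumes the standard first-order partial correlation formula and assumes that the average over k excludes k = i and k = j; neither detail is spelled out in this section. The function names and the 3 × 3 toy matrix are illustrative only. It also requires at least three nodes and off-diagonal correlations strictly between −1 and 1.

```python
import math

def partial_corr(C, i, k, j):
    """First-order partial correlation PC(i, k | j) from a correlation matrix C."""
    num = C[i][k] - C[i][j] * C[k][j]
    den = math.sqrt((1.0 - C[i][j] ** 2) * (1.0 - C[k][j] ** 2))
    return num / den

def dependency_matrix(C):
    """D[i][j]: average over k of C(i, k) - PC(i, k | j)."""
    n = len(C)
    D = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(n):
            if i == j:
                continue
            # Assumption: the average runs over k different from both i and j.
            ks = [k for k in range(n) if k != i and k != j]
            D[i][j] = sum(C[i][k] - partial_corr(C, i, k, j) for k in ks) / len(ks)
    return D

def system_level_influence(D):
    """SLI[j]: sum of D[i][j] over all other nodes i."""
    n = len(D)
    return [sum(D[i][j] for i in range(n) if i != j) for j in range(n)]

# Toy correlation matrix (made-up values, not study data).
C_demo = [[1.0, 0.5, 0.5],
          [0.5, 1.0, 0.5],
          [0.5, 0.5, 1.0]]
D_demo = dependency_matrix(C_demo)
sli_demo = system_level_influence(D_demo)
```

In general D[i][j] differs from D[j][i], which is why the resulting dependency network is directed; the symmetric toy matrix here is a special case in which all entries coincide.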
Investigating the IgG and IgM combined networks of the mothers and newborns, we found that, in the two networks, the top ranked antigens (antigens with high influence score) include both isotypes. We also found that the most influential antigens vary from birth to adulthood, as reflected by the fact that only a few antigens are included both among the top influential antigens of the newborns networks and among the top ones in the maternal networks. This holds both for the combined networks (only one common antigen) and for the separated IgG and IgM networks (only three common antigens in each).
Furthermore, the combined networks of the mothers and newborns were found to have a higher isotype integration in the maternal network. This is in agreement with previous findings in the context of correlation networks.5 Since the antigen dependency networks are directed, we could also inspect the directionality of the isotype influence. Doing so, we found a higher IgG → IgM influence in the maternal networks, while the newborns networks exhibit a higher IgM → IgG influence. This result is somewhat unexpected considering the fact that IgG antibodies are transferred from the mother to her fetus via the placenta during pregnancy.
We also found, in both the IgG and the IgM networks, subnetworks that exist at birth (in the newborns networks) and persist in the maternal networks. The conserved subnetworks have a total of 56 antigens in the IgG networks and 50 antigens in the IgM networks. These conserved subnetwork motifs might serve to maintain information about the healthy network organization during the immune network reorganization from the childhood to the adult state, described next.
We investigated the topological similarity of the networks using the divergence rate measure. These investigations revealed a higher similarity between the IgG network of the mothers and the IgG network of the newborns (in comparison to the similarity between the two IgM networks). These results are consistent with the fact that considerable amounts of IgG cross the placenta during pregnancy and are transferred from the mother to her fetus.
Investigating the modularity of the networks, we found that the IgM networks exhibit a higher modularity than the IgG networks and that these differences are more profound in the newborns. This may indicate that a healthy immune state has a modular organization, which affords an efficient performance of many tasks as part of its normal physiological role. We note that the modular organization of the healthy antigen dependency network is consistent with earlier findings of Madi et al.1 about the existence of antigen cliques in the correlation network of healthy subjects.
Put together, we found that the immune system at birth is associated with a higher modular reorganization of the IgG network and a more pronounced topological reorganization of the IgM network. We note that the network topology measured by the divergence rate is associated with local organization (the nearest neighbors of each antigen), while the modularity is associated with the global network organization. These findings may indicate a profound and intricate response associated with local and global reorganizations of the immune network system between the adult and infant immune states: the IgG networks exhibit a more profound global reorganization, while the IgM networks exhibit a more profound local reorganization. If correct, these findings provide a dramatic demonstration of the effectiveness of the new approach to unveil the most significant biological information.
ACKNOWLEDGMENTS
We are thankful to Alexandra Sirota-Madi for her technical help in crucial times. This research has been supported in part by the Maugy-Glass Chair in Physics of Complex Systems and by the National Science Foundation-sponsored Center for Theoretical Biological Physics (CTBP) Grant Nos. PHY-0216576 and 0225630, and by the University of California at San Diego.
APPENDIX A: SLI
As a proxy of antigen influence, we use the SLI of each antigen on the correlations between all other antigen pairs (see Sec. II). We start by calculating the SLI scores of each antigen in the maternal and newborns networks and in the separate isotype networks (Fig. 7). For the purpose of comparison, we divided the SLI scores by the largest SLI value. We plotted the first 100 sorted relative antigen SLI values and looked for a notable change in the steadily decreasing plots. The strongest change, seen in the integrated IgG and IgM maternal and newborns networks, occurs at around 20 antigens; the same change is seen, to a lesser extent, in the newborns IgG network.
In addition, we compared the antigen system level influence calculated for the entire network with the one calculated within the informative subgraphs. Note that in the latter case, since the directed links are not weighted, the measure of the system level influence of each antigen j is simply the total number of directed edges from j to other antigens. For this comparison, we simply measured the correlation between the two score vectors. The results show a marked difference between the two calculations (correlations: maternal IgG = 0.25, maternal IgM = 0.33, newborns IgG = 0.19, and newborns IgM = 0.33), pointing to the importance of adding this information from the complete dependency network (D).
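The comparison between the two ways of scoring can be sketched as follows. The sketch assumes that an edge is stored as a (source, target) pair pointing from the influencing antigen j to the influenced antigen, and that "correlation" means the Pearson correlation; the function names, the labels, and all numbers are made up for illustration.

```python
import math

def out_degree_scores(edges, nodes):
    """Unweighted influence score: number of directed edges leaving each node."""
    score = {v: 0 for v in nodes}
    for source, _target in set(edges):  # set() drops repeated edges
        score[source] += 1
    return [score[v] for v in nodes]

def pearson(x, y):
    """Pearson correlation between two equally long score vectors."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)

# Toy informative subgraph with hypothetical antigen labels.
nodes = ["Ag1", "Ag2", "Ag3", "Ag4"]
edges = [("Ag1", "Ag2"), ("Ag1", "Ag3"), ("Ag1", "Ag4"), ("Ag2", "Ag3")]
sub_scores = out_degree_scores(edges, nodes)

# Made-up SLI values standing in for scores from the complete matrix D.
full_scores = [0.9, 0.4, 0.3, 0.1]
r = pearson(sub_scores, full_scores)
```

A value of r well below 1 would indicate, as reported above, that the subgraph-based counts and the scores from the complete dependency matrix rank the antigens differently.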
APPENDIX B: COMPARISON TO CORRELATION BASED PMFG
To further demonstrate the differences between a correlation based network and a partial correlation based dependency network, we constructed the informative subgraph, a correlation PMFG based network, for 290 antigens of the mothers [Fig. 8(a)]. The selected layout (Kamada–Kawai) was then partitioned by Newman’s algorithm,32 and each cluster was assigned a different color. We then superimposed this cluster coloring on the informative subgraph of the dependency network of the same dataset [Fig. 8(b)]. As can be seen, the two networks differ not only in their overall topological construction but also, greatly, in their internal relationships.
APPENDIX C: CONSERVED SUBNETWORKS
To find conserved network elements between the datasets of the mothers and newborns, we applied a heuristic method that finds such subnetworks between the two networks.
The method finds these conserved subnetworks as follows. First, we define D_GX(i) to be the topological distances from a node i to all its neighboring nodes {i}_NN. Then, we define the conditional distances CDiv(GX∣GY)(i) to be the topological distances in graph {GY} from node i to the group of nodes {i}_NN defined in graph {GX}. Finally, we select those subsets of nodes that have the same distance in both graphs {GX} and {GY}. Note that these nodes, which are in the neighborhood of i in graph {GX}, need not be in the neighborhood of i in graph {GY}.
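The three steps can be sketched under explicit assumptions, since the appendix does not fix every detail. Here the topological distance is taken to be the number of hops along directed edges, the neighborhood {i}_NN is taken to be all nodes within a hypothetical `radius` of i in GX, and graphs are stored as adjacency dictionaries. The function names and the toy graphs are illustrative, not the authors' implementation.

```python
from collections import deque

def bfs_distances(adj, start):
    """Hop distances from start to every node reachable in adjacency dict adj."""
    dist = {start: 0}
    queue = deque([start])
    while queue:
        u = queue.popleft()
        for v in adj.get(u, ()):
            if v not in dist:
                dist[v] = dist[u] + 1
                queue.append(v)
    return dist

def conserved_neighbors(adj_x, adj_y, node, radius=2):
    """Neighbors of node in GX whose distance from node is the same in GY."""
    dist_x = bfs_distances(adj_x, node)  # distances in GX
    dist_y = bfs_distances(adj_y, node)  # conditional distances, measured in GY
    # Assumption: the neighborhood is every node within `radius` hops in GX.
    neighborhood = [v for v, d in dist_x.items() if 0 < d <= radius]
    # Nodes unreachable in GY have no distance there and are never selected.
    return {v for v in neighborhood if dist_y.get(v) == dist_x[v]}

# Toy graphs with made-up node labels.
g_x = {"a": ["b"], "b": ["c"], "c": []}
g_y = {"a": ["b", "c"], "b": [], "c": []}
kept = conserved_neighbors(g_x, g_y, "a")
```

In this toy case node b is one hop from a in both graphs and is kept, while node c is two hops away in g_x but one hop away in g_y and is dropped. Repeating the selection for every node i and collecting the retained nodes would give candidate conserved subnetworks.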