Link prediction is the problem of inferring the location of unknown or fake links from uncertain structural information about a network. Link prediction algorithms are useful for gaining insight into different network structures from partial observations of exemplars. However, existing link prediction algorithms focus only on regular complex networks and depend heavily on either the closed triangular structure of networks or the so-called preferential attachment phenomenon. Their performance on highly sparse or treelike networks is poor. In this letter, we propose a method based on network heterogeneity. We test our algorithm on three large, sparse real-world networks: a metropolitan water distribution network, a Twitter network, and a sexual contact network. We find that our method is effective and outperforms traditional algorithms, especially on the Twitter network. We further argue that heterogeneity is the most obvious defining pattern for complex networks, while other statistical properties fail to be predicted. Moreover, preferential-attachment-based link prediction performs poorly, and hence we infer that preferential attachment is not a plausible model for the genesis of many networks. We also suggest that heterogeneity is an important mechanism for online information propagation.
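To make the dependence on closed triangles concrete, the following minimal Python sketch evaluates two traditional similarity scores, the common-neighbour count and the preferential attachment index, on a small toy tree. The function names and the toy graph are illustrative assumptions and are not taken from this letter; the heterogeneity-based method proposed here is not shown.

```python
from itertools import combinations


def build_adjacency(edges):
    """Build an undirected adjacency map {node: set of neighbours}."""
    adj = {}
    for u, v in edges:
        adj.setdefault(u, set()).add(v)
        adj.setdefault(v, set()).add(u)
    return adj


def common_neighbours(adj, u, v):
    """Triangle-based score: number of neighbours shared by u and v."""
    return len(adj[u] & adj[v])


def preferential_attachment(adj, u, v):
    """Preferential attachment score: product of the two degrees."""
    return len(adj[u]) * len(adj[v])


# Toy tree (an assumption for illustration): a small binary tree rooted at 0.
tree_edges = [(0, 1), (0, 2), (1, 3), (1, 4), (2, 5), (2, 6)]
adj = build_adjacency(tree_edges)

# Scores of the true links under the common-neighbour index.
cn_on_edges = [common_neighbours(adj, u, v) for u, v in tree_edges]

# Scores of all unlinked pairs under the same index.
non_edges = [(u, v) for u, v in combinations(sorted(adj), 2) if v not in adj[u]]
cn_on_non_edges = {pair: common_neighbours(adj, *pair) for pair in non_edges}

print("common-neighbour scores of true links:", cn_on_edges)
print("highest common-neighbour score of an unlinked pair:",
      max(cn_on_non_edges.values()))
print("preferential attachment score of link (0, 1):",
      preferential_attachment(adj, 0, 1))
```

Because a tree contains no triangles, every true link receives a common-neighbour score of zero, and hiding the link for testing does not change that score. Some unlinked pairs, such as two children of the same parent, score higher. This is one way to see why triangle-based indices struggle on treelike networks.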

1. For example, many users who follow each other in online social networks neither know each other in the offline world nor communicate with each other online. Such observed links can be regarded as fake links.
4. We disclaim at the outset that, despite adopting the language of Bayesians, the method itself does not at any point require a Bayesian framework. The discussion here, however, does suggest an alternative, purely Bayesian, solution to the same problem.
30. Plotting the true positive rate (ordinate) against the false positive rate (abscissa) gives the receiver operating characteristic (ROC) curve. Statistically, the area under the ROC curve should lie between 0.5 and 1. An area greater than 0.5 suggests that our method is effective. An area equal to 0.5 means that the method performs no better than chance. An area less than 0.5 is not expected in practice.
46. And this is all that we consider here.
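The area under the ROC curve described in note 30 can be computed without drawing the curve: it equals the probability that a randomly chosen true link receives a higher score than a randomly chosen non-link, with ties counted as one half. The sketch below illustrates this under stated assumptions; the function name `auc_from_scores` and the example scores are hypothetical and do not come from this letter.

```python
def auc_from_scores(positive_scores, negative_scores):
    """Area under the ROC curve via pairwise comparison.

    positive_scores: scores assigned to hidden true links.
    negative_scores: scores assigned to pairs that are not linked.
    Returns the fraction of (positive, negative) pairs in which the
    positive scores higher, counting ties as one half.
    """
    if not positive_scores or not negative_scores:
        raise ValueError("both score lists must be non-empty")
    total = 0.0
    for p in positive_scores:
        for n in negative_scores:
            if p > n:
                total += 1.0
            elif p == n:
                total += 0.5
    return total / (len(positive_scores) * len(negative_scores))


# Hypothetical scores, chosen only to illustrate the calculation.
hidden_link_scores = [0.9, 0.7, 0.4]
non_link_scores = [0.8, 0.3, 0.2, 0.1]
auc = auc_from_scores(hidden_link_scores, non_link_scores)
print(f"AUC = {auc:.3f}")
```

A scoring rule that assigns the same value to every pair yields exactly 0.5 under this definition, which matches the chance-level case in note 30.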