Coevolution of residues in contact imposes strong statistical constraints on the sequence variability between homologous proteins. Direct-Coupling Analysis (DCA), a global statistical inference method, successfully models this variability across homologous protein families to infer structural information about proteins. For each residue pair, DCA infers a 21 × 21 matrix describing the coevolutionary coupling for each pair of amino acids (or gaps). For residue-residue contact prediction, these matrices are mapped onto simple scalar parameters, and the full information they contain is lost. Here, we perform a detailed spectral analysis of the coupling matrices inferred from 70 protein families and show that they contain quantitative information about the physico-chemical properties of amino-acid interactions. Results for protein families are corroborated by the analysis of synthetic data from lattice-protein models, which emphasizes the critical effect of sampling quality and regularization on the biochemical features of the statistical coupling matrices.
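
The contrast between a scalar score and a spectral view of one 21 × 21 coupling matrix can be illustrated with a minimal sketch. Everything here is an assumption made for illustration: the abstract does not say which scalar mapping or which decomposition is used, so the Frobenius norm (a common choice in the DCA literature), the zero-sum gauge, and the singular value decomposition are stand-ins, and the matrix is random placeholder data rather than an inferred coupling. The helper names are hypothetical.

```python
import numpy as np

Q = 21  # 20 amino acids plus the gap symbol


def zero_sum_gauge(J):
    """Shift a coupling matrix so that every row and every column sums to zero.

    Assumption: this gauge is one common convention; the abstract does not name one.
    """
    row_mean = J.mean(axis=1, keepdims=True)
    col_mean = J.mean(axis=0, keepdims=True)
    return J - row_mean - col_mean + J.mean()


def frobenius_score(J):
    """Collapse a Q x Q coupling matrix into a single scalar (assumed mapping)."""
    return float(np.sqrt(np.sum(J ** 2)))


def spectral_modes(J):
    """Singular value decomposition, used here as a generic spectral analysis."""
    U, s, Vt = np.linalg.svd(J)
    return U, s, Vt


# Placeholder data: a random matrix standing in for one residue pair's couplings.
rng = np.random.default_rng(0)
J_raw = rng.normal(size=(Q, Q))

J = zero_sum_gauge(J_raw)
score = frobenius_score(J)
U, s, Vt = spectral_modes(J)

# The scalar score only retains the total weight of the spectrum:
# score**2 equals the sum of squared singular values.
print(score ** 2, np.sum(s ** 2))
```

The scalar keeps only the summed squared singular values, while the singular vectors in `U` and `Vt` record which amino-acid combinations carry each mode. That is the kind of information a single number per residue pair cannot hold.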

