We introduce an approach to inferring the causal architecture of stochastic dynamical systems that extends rate-distortion theory to use causal shielding—a natural principle of learning. We study two distinct cases of causal inference: optimal causal filtering and optimal causal estimation. Filtering corresponds to the ideal case in which the probability distribution of measurement sequences is known, giving a principled method to approximate a system’s causal structure at a desired level of representation. We show that in the limit in which a model-complexity constraint is relaxed, filtering finds the exact causal architecture of a stochastic dynamical system, known as the causal-state partition. From this, one can estimate the amount of historical information the process stores. More generally, causal filtering finds a graded model-complexity hierarchy of approximations to the causal architecture. Abrupt changes in the hierarchy, as a function of approximation, capture distinct scales of structural organization. For nonideal cases with finite data, we show how the correct number of the underlying causal states can be found by optimal causal estimation. A previously derived model-complexity control term allows us to correct for the effect of statistical fluctuations in probability estimates and thereby avoid overfitting.
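As a rough illustration of the filtering idea, the sketch below runs information-bottleneck-style self-consistent updates on a toy joint distribution over pasts and futures. It is a sketch under stated assumptions, not the authors' algorithm: the function name `ib_filter`, the trade-off parameter `beta`, the number of candidate states, and the four-past toy distribution are all hypothetical choices made for illustration.

```python
import numpy as np

def ib_filter(p_xy, n_states, beta, n_iter=200, seed=0, eps=1e-12):
    """Hypothetical helper: soft-assign pasts x to states r with
    information-bottleneck-style updates on a joint P(past, future)."""
    rng = np.random.default_rng(seed)
    p_x = p_xy.sum(axis=1)                    # P(past); assumed strictly positive
    p_y_x = p_xy / p_x[:, None]               # P(future | past)
    q = rng.random((p_xy.shape[0], n_states)) # random initial P(state | past)
    q /= q.sum(axis=1, keepdims=True)
    for _ in range(n_iter):
        p_r = np.maximum(p_x @ q, eps)        # P(state)
        # P(future | state), clipped away from zero for numerical safety
        p_y_r = (q * p_x[:, None]).T @ p_y_x / p_r[:, None]
        p_y_r = np.maximum(p_y_r, eps)
        # KL divergence D[P(future|past) || P(future|state)], with 0 log 0 = 0
        ratio = np.where(p_y_x[:, None, :] > 0,
                         p_y_x[:, None, :] / p_y_r[None, :, :], 1.0)
        kl = (p_y_x[:, None, :] * np.log(ratio)).sum(axis=2)
        # Boltzmann-like reassignment, normalized per past
        logits = np.log(p_r)[None, :] - beta * kl
        logits -= logits.max(axis=1, keepdims=True)
        q = np.exp(logits)
        q /= q.sum(axis=1, keepdims=True)
    return q

# Toy joint distribution (an assumption): pasts 0 and 1 predict the same
# future distribution, as do pasts 2 and 3.
p_xy = np.array([[0.125, 0.125],
                 [0.125, 0.125],
                 [0.250, 0.000],
                 [0.250, 0.000]])
q = ib_filter(p_xy, n_states=2, beta=50.0)
print(np.round(q, 3))
```

In this sketch, pasts with identical conditional future distributions receive identical assignment rows, which is the sense in which a causal-state partition groups pasts by their predictions. Lowering `beta` tightens the complexity constraint and yields coarser, softer assignments.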
