As recent studies indicate, the structure imposed on written texts by punctuation develops patterns that reveal certain characteristics of universality. In particular, based on a large collection of classic literary works, it has been shown that the distances between consecutive punctuation marks, measured in terms of the number of words, obey the discrete Weibull distribution, a discrete variant of a distribution often used in survival analysis. The present work extends the analysis of punctuation usage patterns to more experimental pieces of world literature. It turns out that the distances between punctuation marks typically comply with the discrete Weibull distribution here as well. However, some of the works by James Joyce are distinct in this regard: the tails of the relevant distributions are significantly thicker and, consequently, the corresponding hazard functions are decreasing, a behavior not observed in typical literary texts in prose. Finnegans Wake, the same work to which science owes the word quarks for the most fundamental constituents of matter, is particularly striking in this context. At the same time, in all the studied texts, the sentence lengths, which represent the distances between sentence-ending punctuation marks, reveal more freedom and are not constrained by the discrete Weibull distribution. This freedom in some cases translates into long-range nonlinear correlations, which manifest themselves in multifractality. Again, a text particularly spectacular in terms of multifractality is Finnegans Wake.
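The two quantities the abstract relies on, the word-count distances between punctuation marks and the hazard function of the discrete Weibull distribution, can be illustrated with a minimal Python sketch. It is not the authors' code. It assumes the common Nakagawa-Osaki parameterization, in which the survival function is P(X > k) = (1 - p)^(k^beta), and it assumes a small, hypothetical set of punctuation marks (`MARKS`); the function names are likewise chosen here for illustration only.

```python
import re

# Assumption: the set of characters treated as punctuation marks.
# The original study may use a different set.
MARKS = ".,;:!?()"

# A token is either a word (group 1) or one of the marks (group 2).
TOKEN = re.compile(r"(\w+(?:['’-]\w+)*)|([" + re.escape(MARKS) + r"])")


def punctuation_distances(text):
    """Return the number of words between consecutive punctuation marks."""
    distances = []
    words_since_mark = 0
    for match in TOKEN.finditer(text):
        if match.group(1):            # a word
            words_since_mark += 1
        elif words_since_mark > 0:    # a mark that closes a non-empty run
            distances.append(words_since_mark)
            words_since_mark = 0
    return distances


def dw_pmf(k, p, beta):
    """Discrete Weibull probability P(X = k) for k = 1, 2, ..."""
    return (1 - p) ** ((k - 1) ** beta) - (1 - p) ** (k ** beta)


def dw_hazard(k, p, beta):
    """Hazard h(k) = P(X = k) / P(X >= k): the probability that a
    punctuation mark follows word k, given none occurred earlier."""
    return 1 - (1 - p) ** (k ** beta - (k - 1) ** beta)
```

Under this parameterization, `beta = 1` reduces to the geometric distribution with a constant hazard equal to `p`, `beta > 1` gives a hazard that increases with `k`, and `beta < 1` gives a decreasing hazard together with a heavier tail, which is the regime the abstract associates with some of Joyce's works. Calling `punctuation_distances` on a text yields the sample of distances to which such a distribution could then be fitted.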
