Shannon entropy is used to measure information uncertainty, while the information dimension is used to measure information complexity. Given two probability distributions, their difference can be measured by relative entropy. However, the existing relative entropy does not consider the effect of the information dimension. To improve on the existing measure, a new relative entropy that incorporates the information fractal dimension is presented in this paper. The new relative entropy is more general than the initial relative entropy: when the dimension is not considered, it degenerates to the initial relative entropy. Another interesting point is that the new relative entropy may take negative values, whose physical meaning is still under exploration. Finally, some application examples are provided to exemplify the utilization of the proposed relative entropy.

Relative entropy can be used to measure the difference between different probability distributions. Information dimension can measure the roughness level. However, the existing relative entropy does not take into account the role of the information dimension. Therefore, a new relative entropy that incorporates the information dimension is proposed in this paper. Research shows that the new relative entropy performs better in applications involving changes in information dimension.

Information uncertainty can be measured by entropy. One of the most essential entropies is Shannon entropy.1 There are various entropy-related theories, such as information measures of picture fuzzy sets,2–4 belief entropy-of-entropy,5 the detection of neurological disorders,6 the asymptotic distribution of the permutation entropy,7 and so on.8–10 In addition, Deng entropy11 and the entropy of random permutation set (RPS)12–14 are derived from Shannon entropy. RPS has generated much discussion, such as the distance of RPS,15 marginalization,16 and so on.17,18

In addition, information has many other features.19–21 One of them is the information dimension,22,23 which is a kind of fractal dimension24–26 and can describe the complexity of fractals.27,28 Further theoretical research on the information dimension involves the Rényi information dimension,29 the optimal information dimension,30 the information dimension of the mass function31 and of the random permutation set,32 and so on.33–35 Simultaneously, the information dimension is applied in many fields, such as social science,36 stream geomorphology and hydraulics,37 statistical mechanics,38 and so on.39,40

Measuring the difference between different sources of information has attracted much attention,41 and many theories have been proposed.42–44 An effective approach is relative entropy,45,46 which is widely used in different scenarios, such as matrix trace inequalities,47 node similarity measuring,48 the relative entropy of Z-numbers,49 and so on.50–52

However, relative entropy does not consider the effect of the information dimension. To improve it, a new relative entropy is proposed, in which the information dimension and relative entropy are combined to measure the difference between two random variables. In addition, the initial relative entropy is asymmetric, while the new relative entropy is symmetric, which allows for a reduction in time complexity during computation. Moreover, the new relative entropy can take negative values, whose physical meaning is still under exploration. Finally, the new relative entropy can be used to measure the additional number of bits required when different encodings are used, which enables its application as a loss function in machine learning and deep learning.

This paper is organized as follows. Section II introduces the relevant knowledge of Shannon entropy, Rényi entropy, information dimension, and relative entropy. Section III presents and explains the proposed relative entropy. Section IV gives application examples and illustrates them. Section V summarizes this paper.

Shannon entropy,1 Rényi entropy,53 information dimension,22 and relative entropy45,46 will be briefly introduced in this section.

Given a random variable, suppose the probability distribution is P = (p1, p2, …, pn). The corresponding Shannon entropy is as follows:1 
$$H_s = -\sum_{i=1}^{n} p_i \log p_i. \tag{1}$$
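
As an illustration (not part of the original text), Eq. (1) can be evaluated directly; the short Python sketch below assumes base-2 logarithms, which the examples in Sec. IV also use.

import math

def shannon_entropy(p, base=2.0):
    # H_s = -sum_i p_i log p_i, Eq. (1); terms with p_i = 0 contribute nothing.
    return -sum(pi * math.log(pi, base) for pi in p if pi > 0)

print(shannon_entropy([0.5, 0.5]))  # 1.0 bit for a fair coin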

Rényi entropy is a generalization of Shannon entropy.53,54

Given a random variable, suppose the probability distribution is P = (p1, p2, …, pn). The corresponding Rényi entropy is as follows:53,54
$$H_{\alpha}(X) = \frac{1}{1-\alpha}\log\sum_{i=1}^{n} p_i^{\alpha}, \tag{2}$$

where α ≥ 0 and α ≠ 1.
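
A minimal sketch of Eq. (2), again as an illustration rather than code from the original paper; as α approaches 1, the value approaches the Shannon entropy of the same distribution.

import math

def renyi_entropy(p, alpha, base=2.0):
    # H_alpha = (1 / (1 - alpha)) * log(sum_i p_i^alpha), Eq. (2), for alpha >= 0 and alpha != 1.
    return math.log(sum(pi ** alpha for pi in p), base) / (1.0 - alpha)

p = [0.2, 0.8]
print(renyi_entropy(p, 0.999))                 # ~0.722
print(-sum(x * math.log(x, 2.0) for x in p))   # Shannon entropy of the same p, also ~0.722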

Fractals were proposed to model nature in a way that captures roughness.27,28 The fractal dimension is employed to describe this roughness. One of the most classic fractal dimensions is the Hausdorff dimension.55 The information dimension is proposed based on the fractal dimension.

Given a random variable, suppose the probability distribution is P = (p1, p2, …, pn). The information dimension is defined as follows:22,56
$$D_r = \frac{1}{1-k}\lim_{m \to 0}\frac{\log\sum_{i=1}^{n} p_i^{k}}{\log m}. \tag{3}$$
Rényi entropy is contained in Eq. (3). When k → 1, Rényi entropy degenerates to Shannon entropy, and the information dimension is obtained as follows:22,56
$$D_r = \lim_{n \to \infty}\frac{-\sum_{i=1}^{n} p_i \log p_i}{\log n}, \tag{4}$$
where the numerator is Shannon entropy.
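
For a finite distribution over n outcomes, the finite-n form of Eq. (4) (which reappears as Eq. (7) below) is simply the ratio of Shannon entropy to log n. A small illustrative sketch, assuming base-2 logarithms:

import math

def information_dimension(p, base=2.0):
    # Finite-n estimate of Eq. (4): Shannon entropy divided by log n, cf. Eq. (7).
    n = len(p)
    h = -sum(pi * math.log(pi, base) for pi in p if pi > 0)
    return h / math.log(n, base)

print(information_dimension([0.5, 0.5]))  # 1.0 for the uniform distribution
print(information_dimension([0.9, 0.1]))  # ~0.469; a less uniform distribution has a smaller dimension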

Relative entropy, also named information divergence, is used to measure the difference between two probability distributions.45,46 Relative entropy is frequently used in optimization algorithms.57,58

Let P = (p1, p2, …, pn) and Q = (q1, q2, …, qn) be two probability distributions of the random variable X. The relative entropy (Er) of P concerning Q is given as follows:45,46
$$E_r(P \| Q) = \sum_{i=1}^{n} p_i \log\frac{p_i}{q_i}. \tag{5}$$
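
A direct sketch of Eq. (5), provided here only as an illustration; it assumes q_i > 0 wherever p_i > 0, as the definition requires.

import math

def relative_entropy(p, q, base=2.0):
    # E_r(P || Q) = sum_i p_i log(p_i / q_i), Eq. (5).
    return sum(pi * math.log(pi / qi, base) for pi, qi in zip(p, q) if pi > 0)

print(relative_entropy([0.5, 0.5], [0.1, 0.9]))  # ~0.737, cf. the m = 0.1 row of Table I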

In this section, to consider the influence of the information dimension of probability distributions, the proposed relative entropy is presented.

Let P = (p1, p2, …, pn) and Q = (q1, q2, …, qn) be two probability distributions. The proposed relative entropy (Edr) of P concerning Q is given as follows:
$$E_{dr}(P \| Q) = \sum_{i=1}^{n} p_i^{D_P}\log\frac{p_i^{D_P}}{q_i^{D_Q}}, \tag{6}$$
where DP represents the information dimension of P and DQ represents the information dimension of Q. Here, DP and DQ are given as follows:22,56
$$D_P = \frac{-\sum_{i=1}^{n} p_i\log p_i}{\log n}, \qquad D_Q = \frac{-\sum_{i=1}^{n} q_i\log q_i}{\log n}. \tag{7}$$

However, in practical simulations, the time complexity of computing Edr(P‖Q) is high, which is inconvenient. Therefore, Edr(P‖Q) is optimized as in Eq. (8).

Let P = (p1, p2, …, pn) and Q = (q1, q2, …, qn) be two probability distributions. The proposed optimized relative entropy (Eodr) of P concerning Q is as follows:
$$E_{odr}(P \| Q) = \frac{E_{dr}(P \| Q) + E_{dr}(Q \| P)}{2} = \frac{1}{2}\left[\sum_{i=1}^{n} p_i^{D_P}\log\frac{p_i^{D_P}}{q_i^{D_Q}} + \sum_{i=1}^{n} q_i^{D_Q}\log\frac{q_i^{D_Q}}{p_i^{D_P}}\right], \tag{8}$$
where DP and DQ are still given by Eq. (7).22,56
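
To make Eqs. (6)-(8) concrete, here is a hedged Python sketch (my own illustration, not code from the paper): info_dim implements Eq. (7), e_dr Eq. (6), and e_odr Eq. (8), all assuming base-2 logarithms and strictly positive probabilities.

import math

def info_dim(p):
    # D_P = (-sum_i p_i log p_i) / log n, Eq. (7).
    n = len(p)
    return -sum(x * math.log2(x) for x in p if x > 0) / math.log2(n)

def e_dr(p, q):
    # E_dr(P || Q) = sum_i p_i^{D_P} log(p_i^{D_P} / q_i^{D_Q}), Eq. (6).
    dp, dq = info_dim(p), info_dim(q)
    return sum((x ** dp) * math.log2((x ** dp) / (y ** dq)) for x, y in zip(p, q) if x > 0)

def e_odr(p, q):
    # E_odr(P || Q) = (E_dr(P || Q) + E_dr(Q || P)) / 2, Eq. (8); symmetric by construction.
    return 0.5 * (e_dr(p, q) + e_dr(q, p))

P, Q = [0.5, 0.5], [0.3, 0.7]
print(e_dr(P, Q))                      # slightly negative here (see the discussion of Fig. 2)
print(e_odr(P, Q), e_odr(Q, P))        # equal values, illustrating Eq. (10)
print(e_dr([0.5, 0.5], [0.5, 0.5]))    # 0.0 when P = Q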

The properties of Eodr are as follows.

When the information dimension is not considered, DP and DQ degenerate to 1, and Edr(P‖Q) degenerates to the initial relative entropy Er(P‖Q),
$$E_{dr}(P \| Q) = \sum_{i=1}^{n} p_i^{D_P}\log\frac{p_i^{D_P}}{q_i^{D_Q}} = \sum_{i=1}^{n} p_i^{1}\log\frac{p_i^{1}}{q_i^{1}} = \sum_{i=1}^{n} p_i\log\frac{p_i}{q_i} = E_r(P \| Q). \tag{9}$$
Unlike Er, the proposed optimized relative entropy Eodr is symmetric: Eodr of P concerning Q and Eodr of Q concerning P are the same,
$$E_{odr}(P \| Q) = \frac{E_{dr}(P \| Q) + E_{dr}(Q \| P)}{2} = \frac{E_{dr}(Q \| P) + E_{dr}(P \| Q)}{2} = E_{odr}(Q \| P). \tag{10}$$

To better introduce the proposed optimized relative entropy $E_{odr} = \frac{1}{2}\left[E_{dr}(P\|Q) + E_{dr}(Q\|P)\right]$, Eodr and Edr are analyzed. Suppose there are two probability distributions, P = (p, 1 − p) and Q = (q, 1 − q). The associated Eodr and Edr values for various values of p and q, ranging from 0.01 to 0.99 in increments of 0.01, are depicted in Figs. 1 and 2.

FIG. 1. Eodr values analysis for different p and q.

FIG. 2. Edr values analysis for different p and q.

In Fig. 1, the graph is symmetric about the planes p = 0.5 and q = 0.5. It can be observed that Eodr remains non-negative throughout, suggesting that Eodr possesses the property of non-negativity; however, this has not yet been rigorously proven.

In Fig. 2, the graph is also symmetric. In addition, it is interesting that Edr sometimes takes negative values. The mathematical reason is that, when computing Eq. (5) for a probability distribution with only two outcomes, it is inevitable that one part is positive and the other negative. By adding dimensions, Edr amplifies some of the negative components of the initial entropy, thus producing negative values. However, the physical meaning is still under discussion.

The role of the information dimension can be explained better by Eodr than by Er. To compare Eodr with Er and show the application of Eodr, examples IV A, IV B, and IV C are illustrated. All examples below use base-2 logarithms.

Given two random variables A and B, their respective probability distributions are p(A0) = p(A1) = 0.5 and p(B0) = m, p(B1) = 1 − m. The corresponding Er(A‖B) and Eodr(A‖B) can be obtained.

The information dimensions of the two probability distributions are as follows:
$$D_A = \frac{-0.5\log 0.5 - 0.5\log 0.5}{\log 2} = 1, \qquad D_B = \frac{-m\log m - (1-m)\log(1-m)}{\log 2}. \tag{11}$$
Er and Eodr can be calculated as follows, respectively:
$$E_r(A \| B) = 0.5\log\frac{0.5}{m} + 0.5\log\frac{0.5}{1-m}, \tag{12}$$
$$E_{odr}(A \| B) = \frac{1}{2}\left[0.5^{D_A}\log\frac{0.5^{D_A}}{m^{D_B}} + 0.5^{D_A}\log\frac{0.5^{D_A}}{(1-m)^{D_B}} + m^{D_B}\log\frac{m^{D_B}}{0.5^{D_A}} + (1-m)^{D_B}\log\frac{(1-m)^{D_B}}{0.5^{D_A}}\right]. \tag{13}$$
When m changes from 0.001 to 0.999 (step size is 0.001), some results are given in Table I. Er and Eodr are shown in Fig. 3.
TABLE I. Some results of example IV A.

m      Er        Eodr
0.1    0.7370    0.2545
0.2    0.3219    0.1981
0.3    0.1258    0.1038
0.4    0.0294    0.0281
0.5    0.0000    0.0000
0.6    0.0294    0.0281
0.7    0.1258    0.1038
0.8    0.3219    0.1981
0.9    0.7370    0.2545
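
The entries of Table I can be reproduced with a script along the following lines. This is a sketch under the same assumptions as the code in Sec. III (base-2 logarithms, strictly positive probabilities), not the authors' own program.

import math

def info_dim(p):
    n = len(p)
    return -sum(x * math.log2(x) for x in p if x > 0) / math.log2(n)

def e_r(p, q):
    return sum(x * math.log2(x / y) for x, y in zip(p, q) if x > 0)

def e_dr(p, q):
    dp, dq = info_dim(p), info_dim(q)
    return sum((x ** dp) * math.log2((x ** dp) / (y ** dq)) for x, y in zip(p, q) if x > 0)

def e_odr(p, q):
    return 0.5 * (e_dr(p, q) + e_dr(q, p))

A = [0.5, 0.5]
for m in (0.1, 0.2, 0.3, 0.4, 0.5):
    B = [m, 1.0 - m]
    print(f"m={m:.1f}  Er={e_r(A, B):.4f}  Eodr={e_odr(A, B):.4f}")
# Expected output matches Table I, e.g. m=0.1 -> Er=0.7370, Eodr=0.2545.
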
FIG. 3. The result of example IV A.

In Fig. 3, when m = 0.5, the two random variables A and B are the same, and Er = Eodr = 0. When m changes from 0.001 to 0.999, Er and Eodr have the same trend of change. Therefore, using Eodr to measure divergence is justified.

In addition, because of the influence of the information dimension, Eodr changes faster at both ends and slower in the middle. Specifically, in Fig. 4, near the boundaries the rate of change of Edr(B‖A) is slower at first and then faster than that of Edr(A‖B), and this variation causes the changing trend of Eodr at the two ends.

FIG. 4. A specific analysis of Eodr.

In addition, in Fig. 4, when m ≠ 0.5, the values of Edr(A‖B) are less than 0. The reason is as follows:
$$E_{dr}(A \| B) = 0.5^{D_A}\log\frac{0.5^{D_A}}{m^{D_B}} + 0.5^{D_A}\log\frac{0.5^{D_A}}{(1-m)^{D_B}} = -1 - 0.5\,D_B\log\left[m(1-m)\right] \le 0, \tag{14}$$

where DB ≥ 0.
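
A quick numerical check of Eq. (14) can be made as follows (an illustrative sketch, not part of the original derivation): the direct evaluation of Edr(A‖B) and the closed form −1 − 0.5 DB log2[m(1 − m)] agree and are negative for the sampled m ≠ 0.5.

import math

def info_dim(p):
    n = len(p)
    return -sum(x * math.log2(x) for x in p if x > 0) / math.log2(n)

for m in (0.1, 0.2, 0.3):
    B = [m, 1.0 - m]
    d_b = info_dim(B)
    direct = sum(0.5 * math.log2(0.5 / (b ** d_b)) for b in B)     # Eq. (6) with D_A = 1
    closed = -1.0 - 0.5 * d_b * math.log2(m * (1.0 - m))           # right-hand side of Eq. (14)
    print(f"m={m:.1f}  direct={direct:.6f}  closed={closed:.6f}")  # both equal and negative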

Given two random variables A and B, their respective probability distributions are p(A0) = 0.3, p(A1) = 0.7 and p(B0) = m, p(B1) = 1 − m. When m changes from 0.001 to 0.999 (step size is 0.001), the corresponding Er(A‖B) and Eodr(A‖B) can be obtained, as shown in Fig. 5.

FIG. 5. The result of example IV B.

The overall trends of Er and Eodr remain fairly similar; however, due to the influence of dimensionality, the changes in Eodr are less dramatic than those in Er. Both take the value 0 when m = 0.3, but interestingly, as m approaches 1, Eodr first rises, then falls, and then rises again.

Given two probability distributions P = (p1, p2, …, pn) and Q = (q1, q2, …, qn), where $p_i = \frac{2i}{n(n+1)}, i \in \mathbb{Z}^{+}$, and $q_i = \frac{1}{n}, i \in \mathbb{Z}^{+}$, the following equations hold:
$$\sum_{i=1}^{n} p_i = \sum_{i=1}^{n}\frac{2i}{n(n+1)} = 1, \tag{15}$$
$$\sum_{i=1}^{n} q_i = \sum_{i=1}^{n}\frac{1}{n} = 1. \tag{16}$$

The corresponding Er(P‖Q) and Eodr(P‖Q) can be obtained.

When n changes from 2 to 600 (step size is 1), some results are given in Table II. Er and Eodr are shown in Fig. 6.

TABLE II. Some results of example IV C.

n      Eodr      Er
2      0.0746    0.0817
3      0.1228    0.1258
4      0.1569    0.1536
5      0.1825    0.1727
6      0.2026    0.1867
20     0.3061    0.2460
100    0.3712    0.2716
200    0.3840    0.2751
500    0.3939    0.2772
600    0.3953    0.2775
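
Example IV C can likewise be reproduced; the sketch below (an illustration under the same assumptions as above, not the authors' code) builds P and Q for a given n and prints the corresponding entries.

import math

def info_dim(p):
    n = len(p)
    return -sum(x * math.log2(x) for x in p if x > 0) / math.log2(n)

def e_r(p, q):
    return sum(x * math.log2(x / y) for x, y in zip(p, q) if x > 0)

def e_dr(p, q):
    dp, dq = info_dim(p), info_dim(q)
    return sum((x ** dp) * math.log2((x ** dp) / (y ** dq)) for x, y in zip(p, q) if x > 0)

def e_odr(p, q):
    return 0.5 * (e_dr(p, q) + e_dr(q, p))

for n in (2, 3, 4, 20, 100, 600):
    P = [2 * i / (n * (n + 1)) for i in range(1, n + 1)]   # p_i = 2i / (n(n+1))
    Q = [1.0 / n] * n                                      # q_i = 1/n
    print(f"n={n:<4d}  Eodr={e_odr(P, Q):.4f}  Er={e_r(P, Q):.4f}")
# Expected output matches Table II, e.g. n=2 -> Eodr=0.0746, Er=0.0817.
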
FIG. 6. The result of example IV C.

In example IV C, the variable under consideration is the cardinality n of the sample space. As n increases, the information dimension of Q remains at 1, while that of P gradually stabilizes around 0.97. During this process, Eodr is affected by the change in the information dimension, and Eodr(P‖Q) exhibits a more pronounced variation than Er(P‖Q) in Fig. 6. Therefore, when the sample space is altered, Eodr(P‖Q) performs better.

Based on the above examples, Eodr and Er can be used as relative entropy to measure the difference between probability distributions. However, in scenarios involving changes related to information dimension, such as alterations in the sample space, Eodr demonstrates superior performance.

The divergence between different probability distributions is measured by relative entropy. However, the initial relative entropy does not consider the effect of the information dimension. In this paper, a new relative entropy is proposed to address this issue by adding the information dimension to the initial relative entropy. As a result, the trend of the new relative entropy differs from that of the initial relative entropy.

We also prove that when the dimension is not considered, the new relative entropy will degenerate to the initial relative entropy. Several application examples are provided to exemplify the validity of the proposed relative entropy.

This work was partially supported by the National Natural Science Foundation of China (Grant No. 62373078).

The authors have no conflicts to disclose.

Jingyou Wu: Conceptualization (equal); Data curation (equal); Formal analysis (equal); Funding acquisition (equal); Investigation (equal); Methodology (equal); Project administration (equal); Resources (equal); Software (equal); Supervision (equal); Validation (equal); Visualization (equal); Writing – original draft (equal); Writing – review & editing (equal).

Data sharing is not applicable to this article as no new data were created or analyzed in this study.

1. C. E. Shannon, "A mathematical theory of communication," ACM SIGMOBILE Mob. Comput. Commun. Rev. 5(1), 3–55 (2001).
2. Y.-F. Li, J. Mi, Y. Liu, Y.-J. Yang, and H.-Z. Huang, "Dynamic fault tree analysis based on continuous-time Bayesian networks under fuzzy numbers," Proc. Inst. Mech. Eng., Part O 229(6), 530–541 (2015).
3. V. Arya and S. Kumar, "A new picture fuzzy information measure based on Shannon entropy with applications in opinion polls using extended Vikor–Todim approach," Comput. Appl. Math. 39, 197–224 (2020).
4. Y.-F. Li, H.-Z. Huang, J. Mi, W. Peng, and X. Han, "Reliability analysis of multi-state systems with common cause failures based on Bayesian network and fuzzy probability," Ann. Oper. Res. 311(1), 195–209 (2022).
5. H. Cui, L. Zhou, Y. Li, and B. Kang, "Belief entropy-of-entropy and its application in the cardiac interbeat interval time series analysis," Chaos, Solitons Fractals 155, 111736 (2022).
6. S. J. J. Jui, R. C. Deo, P. D. Barua, A. Devi, J. Soar, and U. R. Acharya, "Application of entropy for automated detection of neurological disorders with electroencephalogram signals: A review of the last decade (2012–2022)," IEEE Access 11, 71905 (2023).
7. A. Rey, A. Frery, J. Gambini, and M. Lucini, "The asymptotic distribution of the permutation entropy," Chaos 33(11), 113108 (2023).
8. K. Pilkiewicz, B. Lemasson, M. Rowland, A. Hein, J. Sun, A. Berdahl, M. Mayo, J. Moehlis, M. Porfiri, E. Fernández-Juricic et al., "Decoding collective communications using information theory tools," J. R. Soc., Interface 17(164), 20190563 (2020).
9. J. Sun, A. A. R. AlMomani, and E. Bollt, "Data-driven learning of Boolean networks and functions by optimal causation entropy principle," Patterns 3(11), 100631 (2022).
10. Y. Tang, Y. Zhou, X. Ren, Y. Sun, Y. Huang, and D. Zhou, "A new basic probability assignment generation and combination method for conflict data fusion in the evidence theory," Sci. Rep. 13(1), 8443 (2023).
11. Y. Deng, "Deng entropy," Chaos, Solitons Fractals 91, 549–553 (2016).
12. Y. Deng, "Random permutation set," Int. J. Comput. Commun. Control 17(1), 4542 (2022).
13. J. Deng and Y. Deng, "Maximum entropy of random permutation set," Soft Comput. 26(21), 11265–11275 (2022).
14. L. Chen and Y. Deng, "Entropy of random permutation set," Commun. Stat. Theory Methods (published online 2023).
15. L. Chen, Y. Deng, and K. H. Cheong, "The distance of random permutation set," Inf. Sci. 628, 226–239 (2023).
16. Q. Zhou, Y. Cui, Z. Li, and Y. Deng, "Marginalization in random permutation set theory: From the cooperative game perspective," Nonlinear Dyn. 111, 13125 (2023).
17. G. Yuan, Y. Zhai, J. Tang, and X. Zhou, "CSCIM_FS: Cosine similarity coefficient and information measurement criterion-based feature selection method for high-dimensional data," Neurocomputing 552, 126564 (2023).
18. W. Yang and Y. Deng, "Matrix operations in random permutation set," Inf. Sci. 647, 119419 (2023).
19. Z. Liu, Y. Wang, Q. Cheng, and H. Yang, "Analysis of the information entropy on traffic flows," IEEE Trans. Intell. Transp. Syst. 23(10), 18012–18023 (2022).
20. L. Chen, Y. Deng, and K. H. Cheong, "Permutation Jensen–Shannon divergence for random permutation set," Eng. Appl. Artif. Intell. 119, 105701 (2023).
21. T. Huang, Z. Shao, T. Xiahou, and Y. Liu, "An evidential network approach to reliability assessment by aggregating system-level imprecise knowledge," Qual. Reliab. Eng. Int. 39(5), 1863–1877 (2023).
22. A. Rényi, "On the dimension and entropy of probability distributions," Acta Math. Acad. Sci. Hung. 10(1–2), 193–215 (1959).
23. G. Datseris, I. Kottlarz, A. P. Braun, and U. Parlitz, "Estimating fractal dimensions: A comparative review and open source implementations," Chaos 33(10), 102101 (2023).
24. J. Theiler, "Estimating fractal dimension," J. Opt. Soc. Am. A 7(6), 1055–1073 (1990).
25. R. Lopes and N. Betrouni, "Fractal and multifractal analysis: A review," Med. Image Anal. 13(4), 634–649 (2009).
26. S. Kumari, R. Chugh, J. Cao, and C. Huang, "On the construction, properties and Hausdorff dimension of random Cantor one pth set," AIMS Math. 5(4), 3138–3155 (2020).
27. J. Feder, Fractals (Springer Science & Business Media, 2013).
28. M. F. Barnsley, Fractals Everywhere (Academic Press, 2014).
29. Y. Wu and S. Verdú, "Rényi information dimension: Fundamental limits of almost lossless analog compression," IEEE Trans. Inf. Theory 56(8), 3721–3748 (2010).
30. S. Kak, "Fractals with optimal information dimension," Circuits, Syst., Signal Process. 40(11), 5733–5743 (2021).
31. C. Qiang, Y. Deng, and K. H. Cheong, "Information fractal dimension of mass function," Fractals 30(06), 2250110 (2022).
32. T. Zhao, Z. Li, and Y. Deng, "Information fractal dimension of random permutation set," Chaos, Solitons Fractals 174, 113883 (2023).
33. M. I. Hwang and J. W. Lin, "Information dimension, information overload and decision quality," J. Inf. Sci. 25(3), 213–218 (1999).
34. V. Županović and D. Žubrinić, "Fractal dimensions in dynamics," in Encyclopedia of Mathematical Physics, edited by J.-P. Françoise, G. L. Naber, and T. S. Tsun (Academic Press, Oxford, 2006), pp. 394–402.
35. M. Hirano and H. Nagahama, "Informative fractal dimension associated with nonmetricity in information geometry," Physica A 625, 129017 (2023).
36. H. A. Voorveld, G. Van Noort, D. G. Muntinga, and F. Bronner, "Engagement with social media and social media advertising: The differentiating role of platform type," J. Advertising 47(1), 38–54 (2018).
37. G. Dwyer, C. Cummings, S. Rice, J. Lancaster, B. J. Downes, L. Slater, and R. E. Lester, "Using fractals to describe ecologically relevant patterns in distributions of large rocks in streams," Water Resour. Res. 57(7), e2021WR029796 (2021).
38. C. Wu, D. Duan, and R. Xiao, "A novel dimension reduction method with information entropy to evaluate network resilience," Physica A 620, 128727 (2023).
39. A. Ramirez-Arellano, L. M. Hernández-Simón, and J. Bory-Reyes, "Two-parameter fractional Tsallis information dimensions of complex networks," Chaos, Solitons Fractals 150, 111113 (2021).
40. M. Lei, "Information dimension based on Deng entropy," Physica A 600, 127584 (2022).
41. D. Lin et al., "An information-theoretic definition of similarity," in Proceedings of the Fifteenth International Conference on Machine Learning (Morgan Kaufmann Publishers, Inc., 1998), Vol. 98, pp. 296–304.
42. A.-X. Zhu, "A similarity model for representing soil spatial information," Geoderma 77(2–4), 217–242 (1997).
43. J. Wang and Y. Dong, "Measurement of text similarity: A survey," Information 11(9), 421 (2020).
44. Y. Tang, S. Tan, and D. Zhou, "An improved failure mode and effects analysis method using belief Jensen–Shannon divergence and entropy measure in the evidence theory," Arabian J. Sci. Eng. 48(5), 7163–7176 (2023).
45. M. J. Donald, "On the relative entropy," Commun. Math. Phys. 105, 13–34 (1986).
46. V. Vedral, "The role of relative entropy in quantum information theory," Rev. Mod. Phys. 74(1), 197 (2002).
47. Y. Seo, "Matrix trace inequalities on Tsallis relative entropy of negative order," J. Math. Anal. Appl. 472(2), 1499–1508 (2019).
48. T. Wen, S. Duan, and W. Jiang, "Node similarity measuring in complex networks with relative entropy," Commun. Nonlinear Sci. Numer. Simul. 78, 104867 (2019).
49. Y. Li, D. Pelusi, Y. Deng, and K. H. Cheong, "Relative entropy of Z-numbers," Inf. Sci. 581, 1–17 (2021).
50. C.-I. Chang, K. Chen, J. Wang, and M. L. Althouse, "A relative entropy-based approach to image thresholding," Pattern Recognit. 27(9), 1275–1289 (1994).
51. V. Chandrasekaran and P. Shah, "Relative entropy optimization and its applications," Math. Program. 161, 1–32 (2017).
52. Y. Tang, G. Dai, Y. Zhou, Y. Huang, and D. Zhou, "Conflicting evidence fusion using a correlation coefficient-based approach in complex network," Chaos, Solitons Fractals 176, 114087 (2023).
53. A. Rényi, "On measures of entropy and information," in Proceedings of the Fourth Berkeley Symposium on Mathematical Statistics and Probability, Volume 1: Contributions to the Theory of Statistics (University of California Press, 1961), Vol. 4, pp. 547–562.
54. R. Renner and S. Wolf, "Smooth Rényi entropy and applications," in International Symposium on Information Theory, 2004. ISIT 2004. Proceedings (IEEE, 2004), p. 233.
55. M. Nechba and M. Ouyaaz, "Understanding the Hausdorff measure and dimension: Fundamentals and examples," arXiv:2304.11500.
56. E. Guariglia, "Harmonic Sierpinski gasket and applications," Entropy 20(9), 714 (2018).
57. Y. Chen, Q. Guo, H. Sun, Z. Li, W. Wu, and Z. Li, "A distributionally robust optimization model for unit commitment based on Kullback–Leibler divergence," IEEE Trans. Power Syst. 33(5), 5147–5160 (2018).
58. W. Ha, E. Y. Sidky, R. F. Barber, T. G. Schmidt, and X. Pan, "Estimating the spectrum in computed tomography via Kullback–Leibler divergence constrained optimization," Med. Phys. 46(1), 81–92 (2019).