Benford's first digit law states that the first digit of any set of random numbers is most likely to be 1. Specifically, the probability of the first digit being d is $log ( 1 + ( 1 / d ) )$ with $d = 1 , 2 , … , 9$. In a recent paper, Don Lemons introduced an ingenious way to relate Benford's law to thermodynamics.1 He likened a large number N to a thermal system that can be partitioned into I smaller subsystems, each having a number $i n i$, such that $N = ∑ i = 1 I i n i$. If $ε$ is a quantum of energy, $N ε$ represents the total energy of the system. For a fixed N, the system is closed, and the partition of N corresponds to the distribution of energy among the subsystems. Different partitions, therefore, amount to fluctuating energies for each subsystem. This can be viewed as the consequence of the free exchange of energy among subsystems through thermal contact. The number of all possible partitions of N corresponds to the number of all accessible microstates, $Ω I , N$, which defines the entropy $S = k B ln Ω I , N$ of this closed system. Lemons then derived the entropy S using the so-called “high-dimensionality argument,”2 namely, for $I ≫ 1$: $S = k B ln ( ( e 2 N ) / I 2 ) I$. The “Benford distribution” was finally obtained as
(1)
as a direct consequence of equilibrium among subsystems. Here, $⟨ n i ⟩$ represents the equilibrium number of subsystem i. It should be pointed out that the average bracket is needed here because, in the thermodynamic limit, the subsystems will most likely take their average values, by virtue of the central limit theorem.

When I use this example in my thermodynamics class, the students find it both inspiring and confusing. The introduction of the analogy with a thermal system deepens their understanding of what a closed system is and helps me convey two fundamental thermodynamic concepts: temperature and entropy. However, the derivation of the entropy S proves to be a big challenge for undergraduates. Therefore, following the ideas of Lemons, I will show that the Benford distribution could be derived using a simple physical model without invoking the high-dimensionality argument.

We may regard each subsystem i as a single radiation mode. $i ε$ then corresponds to the unit energy or photon energy of radiation mode i, and $⟨ n i ⟩$ is the average photon number at some temperature T,3,
(2)
As $T → ∞$ or $i ε ≪ k B T$, the discreteness of energy is not relevant, and we have the classical limit or the Rayleigh–Jeans approximation,
(3)
which recovers the Benford distribution. This also expresses the equipartition principle of classical statistical mechanics, where each radiation mode has equal thermal energy $k B T$ or $⟨ n i ⟩ i = N / I$. Since $k B T ≫ ε$, we notice that $N ≫ I$. This is reasonable because the key property of the Benford distribution is scale invariance.4 Since $ε$ is our arbitrary unit of energy, it could be taken as small as we want, so that N would then go to infinity. Meanwhile, in the thermodynamic limit, the central limit theorem states that the distribution of photon number in each mode becomes peaked around $⟨ n i ⟩$. Since the average photon number of each mode is determined by the Benford distribution, the uncertainty in the system or the system's entropy can be calculated by summing contributions from each mode. Mode i has a photon distribution,3
(4)
where $x = i ε / k B T$. Its Gibbs entropy is
(5)
Taking the classical limit, $T → ∞$, we have $x = i ε / k B T ≅ ⟨ n i ⟩ − 1 ≪ 1$ and
(6)
The total entropy can be recovered as follows:
(7)
where Stirling's approximation $I ! ≈ ( I / e ) I$ is used in the last step. In Ref. 1, this entropy is obtained by counting all the possible microstates, which correspond to the microcanonical ensemble method for a closed system. Here, by introducing the concept of temperature, we use the grand canonical ensemble for an open system, which is much simpler. In the thermodynamic limit, these two ensembles are equivalent.
Finally, we would like to comment on our choice of the radiation mode, or Bose–Einstein statistics (with chemical potential $μ = 0$), to derive the Benford law. This may seem like a natural but arbitrary choice, but it is actually the only valid one. Indeed, the Bose–Einstein and Fermi–Dirac population of an energy level $ε$ are given by
(8)
with “+” for the Fermi–Dirac and “–” for the Bose–Einstein distributions. If $e ( ε − μ ) / k B T ≫ 1$, the $± 1$ term can be neglected, and $f ± ( ε )$ reduces to the Boltzmann distribution, $f ( ε ) = e − ( ε − μ ) / k B T ≪ 1$, which means negligible level occupation. As long as the occupation with two or more particles is highly improbable, a harmonic oscillator reduces to a two level system (occupied or empty), which can, in turn, be described by the Fermi–Dirac distribution. However, in our case, each harmonic oscillator mode has high occupation number, $⟨ n i ⟩ ≫ 1$. Therefore, we cannot use the Fermi–Dirac or the Boltzmann statistics to get the Benford law.
1.
Don S.
Lemons
, “
Thermodynamics of Benford's first digit law
,”
Am. J. Phys.
87
(
10
),
787
790
(
2019
).
2.
Herbert B.
Callen
,
Thermodynamics and an Introduction to Statistical Mechanics
, 2nd ed. (
Wiley
,
New York
,
1985
), pp.
343
348
.
3.
C.
Kittel
and
H.
Kroemer
,
Thermal Physics
, 2nd ed. (
W.H.Freeman and Company
,
New York
,
1980
), pp.
89
91
.
4.
A.
Burgos
and
A.
Santos
, “
The Newcomb-Benford law: Scale invariance and a simple Markov process based on it
,”
Am. J. Phys.
89
(
9
),
851
861
(
2021
).