The Golden–Thompson trace inequality, which states that $\mathrm{Tr}\, e^{H+K} \le \mathrm{Tr}\, e^{H}e^{K}$, has proved to be very useful in quantum statistical mechanics. Golden used it to show that the classical free energy is less than the quantum one. Here, we make this G–T inequality more explicit by proving that for some operators, notably the operators of interest in quantum mechanics, $H = \Delta$ or $H = -\sqrt{-\Delta + m}$ and K = potential, $\mathrm{Tr}\, e^{H+(1-u)K}e^{uK}$ is a monotone increasing function of the parameter u for 0 ≤ u ≤ 1. Our proof utilizes an inequality of Ando, Hiai, and Okubo (AHO): $\mathrm{Tr}\, X^{s}Y^{t}X^{1-s}Y^{1-t} \le \mathrm{Tr}\, XY$ for positive operators X, Y and for 1/2 ≤ s, t ≤ 1 and s + t ≤ 3/2. The obvious conjecture that this inequality should hold up to s + t ≤ 2 was proved false by Plevnik [Indian J. Pure Appl. Math. 47, 491–500 (2016)]. We give a different proof of AHO and also give more counterexamples in the range 3/2 < s + t ≤ 2. More importantly, we show that the inequality conjectured in AHO does indeed hold in the full range if X, Y have a certain positivity property, one that does hold for quantum mechanical operators, thus enabling us to prove our G–T monotonicity theorem.

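In 2000, Ando, Hiai, and Okubo (AHO; Ref. 2) considered several inequalities for traces of products of two positive semidefinite matrices X and Y, of which the two simplest were

$$\left|\mathrm{Tr}[X^{s}Y^{t}X^{1-s}Y^{1-t}]\right| \le \mathrm{Tr}[XY] \tag{1.1}$$

and

$$\left|\mathrm{Tr}[X^{s}Y^{t}X^{1-s}Y^{1-t}]\right| \ge \mathrm{Tr}[X^{1/2}Y^{1/2}X^{1/2}Y^{1/2}] \tag{1.2}$$

with 1/2 ≤ s ≤ 1 and 1/2 ≤ t ≤ 1.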

Note that the absolute value, or at least a real part, is necessary for either (1.1) or (1.2) to make sense; $\mathrm{Tr}[X^{s}Y^{t}X^{1-s}Y^{1-t}]$ may be a complex number.

Ando, Hiai, and Okubo succeeded in proving both inequalities when X and Y were 2 × 2 matrices or, more generally, when both X and Y have at most two distinct eigenvalues (Ref. 2, Corollary 4.3). They also proved (1.1) when s + t ≤ 3/2 but could only prove (1.2) when either s = 1/2 or t = 1/2. They raised the question as to whether the inequalities (1.1) and (1.2) hold over the entire range 1 ≤ s + t ≤ 2. In addition to proving the positive results mentioned above (and some generalizations discussed below), they remarked that the behavior of the function $(s, t) \mapsto |\mathrm{Tr}[X^{s}Y^{t}X^{1-s}Y^{1-t}]|$ on the whole square [1/2, 1] × [1/2, 1] “is rather complicated for general n × n positive semidefinite matrices.”
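The quantity under discussion is easy to evaluate numerically. The following is a minimal sketch, assuming NumPy; the helper names `psd_power`, `aho_trace`, and `random_psd`, the matrix size, and the random test data are illustrative choices, not part of the original work. It computes Tr[X^s Y^t X^(1-s) Y^(1-t)] for random complex positive semidefinite matrices, where the value is generally complex, and compares its modulus with Tr[XY] at a point with s + t ≤ 3/2, where the AHO bound is proved.

```python
import numpy as np

def psd_power(A, p):
    """A**p for a Hermitian positive semidefinite matrix, via its eigendecomposition."""
    w, V = np.linalg.eigh(A)
    w = np.clip(w, 0.0, None)            # clip tiny negative round-off
    return (V * w**p) @ V.conj().T

def aho_trace(X, Y, s, t):
    """Tr[X^s Y^t X^(1-s) Y^(1-t)]; a complex number in general."""
    return np.trace(psd_power(X, s) @ psd_power(Y, t)
                    @ psd_power(X, 1 - s) @ psd_power(Y, 1 - t))

def random_psd(n, rng):
    """Random complex positive semidefinite matrix G G^* (illustrative test data)."""
    G = rng.standard_normal((n, n)) + 1j * rng.standard_normal((n, n))
    return G @ G.conj().T

rng = np.random.default_rng(0)
X, Y = random_psd(4, rng), random_psd(4, rng)
bound = np.trace(X @ Y).real             # Tr[XY] is real and non-negative
value = aho_trace(X, Y, 0.7, 0.6)        # s + t = 1.3 <= 3/2
print(value, abs(value) <= bound)
```

A random search of this kind is only a sanity check: it says nothing about the range 3/2 < s + t ≤ 2, where violations exist but are rare for generic matrices.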

The question they raised attracted the attention of other researchers. In particular, Bottazzi et al. (Ref. 5) gave another proof, for the case s = t, that (1.1) is valid for s + t ≤ 3/2. Instead of the majorization techniques used in Ref. 2, they used the Lieb–Thirring inequality and the Hölder inequality for matrix trace norms. Using these tools, they showed that for z = 1/4 + iy or z = 3/4 + iy, y ∈ ℝ,

$$\left|\mathrm{Tr}[X^{z}Y^{z}X^{1-z}Y^{1-z}]\right| \le \mathrm{Tr}[XY],$$

and then used the maximum modulus principle to conclude that (1.1) is valid for s = t, 1/4 ≤ t ≤ 3/4. Moreover, they proved that unless X and Y commute, this inequality is strict, and thus, for any given X and Y, the inequality extends to a wider interval depending on X and Y. However, 16 years after the original work of Ando, Hiai, and Okubo, Plevnik (Ref. 9) finally found a counterexample to the conjectured inequality (1.1) in the missing range 3/2 ≤ s + t ≤ 2, as well as a counterexample to (1.2).

We, unaware of these developments, attempted to show a monotonicity property for the Golden–Thompson inequality (Refs. 7 and 11) and were led to exactly the same inequality that Ando et al. (Ref. 2) had discussed 22 years earlier. Our proof for the 1 ≤ s + t ≤ 3/2 range is a little different, and we shall give that proof here. We also identify interesting conditions on X and Y under which (1.1) and (1.2) do hold for all 0 ≤ s, t ≤ 1 and apply this to prove our conjecture on the Golden–Thompson inequality in these cases. We shall also give a systematic construction of counterexamples for the 3/2 ≤ s + t ≤ 2 range that complement the example in Ref. 9 and show that not only is (1.2) false, but it is even possible for $\mathrm{Tr}[X^{s}Y^{t}X^{1-s}Y^{1-t}]$ to be negative when X and Y are real positive semidefinite matrices.

We recall the Lieb–Thirring inequality (Ref. 8), which says that for all r ≥ 1 and any positive semidefinite n × n matrices A and B,

$$\mathrm{Tr}[(B^{1/2}AB^{1/2})^{r}] \le \mathrm{Tr}[B^{r/2}A^{r}B^{r/2}]. \tag{2.1}$$

Later, Araki (Ref. 3) proved that the inequality reverses for 0 < r < 1. It was shown by Friedland and So (Ref. 6) that for r > 1, the inequality is strict unless A and B commute.
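As an illustration, the Lieb–Thirring inequality in the form Tr[(B^{1/2} A B^{1/2})^r] ≤ Tr[B^{r/2} A^r B^{r/2}] for r ≥ 1, together with Araki's reversal for 0 < r < 1, can be checked numerically. This is a sketch assuming NumPy; the helper names and the random matrices are hypothetical test data.

```python
import numpy as np

def psd_power(A, p):
    """A**p for a symmetric positive semidefinite matrix, via its eigendecomposition."""
    w, V = np.linalg.eigh(A)
    w = np.clip(w, 0.0, None)                 # clip tiny negative round-off
    return (V * w**p) @ V.conj().T

def lieb_thirring_sides(A, B, r):
    """Return (Tr[(B^1/2 A B^1/2)^r], Tr[B^(r/2) A^r B^(r/2)])."""
    Bh = psd_power(B, 0.5)
    inner = Bh @ A @ Bh
    inner = (inner + inner.conj().T) / 2      # enforce exact symmetry before eigh
    lhs = np.trace(psd_power(inner, r)).real
    Br = psd_power(B, r / 2)
    rhs = np.trace(Br @ psd_power(A, r) @ Br).real
    return lhs, rhs

rng = np.random.default_rng(1)
G1, G2 = rng.standard_normal((2, 4, 4))
A, B = G1 @ G1.T, G2 @ G2.T                   # random real PSD matrices
for r in (0.5, 1.0, 2.0, 3.0):
    lhs, rhs = lieb_thirring_sides(A, B, r)
    print(r, lhs, rhs)                        # lhs >= rhs for r < 1, lhs <= rhs for r >= 1
```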

In the following and in the whole of this paper, X and Y are positive semidefinite matrices. We will use (2.1) to estimate $\|X^{1/p}Y^{1/p}\|_p$ for various values of p ≥ 1. Since

$$\|X^{1/p}Y^{1/p}\|_p^p = \mathrm{Tr}[(Y^{1/p}X^{2/p}Y^{1/p})^{p/2}],$$

we may apply (2.1), with $A = X^{2/p}$ and $B = Y^{2/p}$, to get an upper bound on $\|X^{1/p}Y^{1/p}\|_p$ taking r = p/2, provided p/2 ≥ 1 or, equivalently, 1/p ≤ 1/2. [Otherwise, by Araki’s complement to (2.1), we would get a lower bound.] In summary,

$$\|X^{1/p}Y^{1/p}\|_p^p \le \mathrm{Tr}[XY] \quad \text{for all } 0 < 1/p \le 1/2. \tag{2.2}$$

As in Ref. 5, we shall use the generalized Hölder inequality for trace norms (see, e.g., Simon’s book, Ref. 10). For any three n × n matrices A, B, and C and any p, q, r ≥ 1 with 1/p + 1/q + 1/r = 1,

$$|\mathrm{Tr}[ABC]| \le \|A\|_p\,\|B\|_q\,\|C\|_r.$$

(This generalizes in the obvious way to products of arbitrarily many matrices.)
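The three-factor Hölder bound |Tr[ABC]| ≤ ‖A‖_p ‖B‖_q ‖C‖_r can be illustrated with Schatten norms computed from singular values. The sketch below assumes NumPy; `schatten_norm`, the exponents (2, 3, 6), and the random matrices are illustrative choices.

```python
import numpy as np

def schatten_norm(A, p):
    """Schatten p-norm: the l^p norm of the singular values of A."""
    sv = np.linalg.svd(A, compute_uv=False)
    return float(np.sum(sv**p) ** (1.0 / p))

rng = np.random.default_rng(2)
A, B, C = rng.standard_normal((3, 4, 4))      # three arbitrary real 4 x 4 matrices
p, q, r = 2.0, 3.0, 6.0                       # 1/2 + 1/3 + 1/6 = 1
lhs = abs(np.trace(A @ B @ C))
rhs = schatten_norm(A, p) * schatten_norm(B, q) * schatten_norm(C, r)
print(lhs, rhs, lhs <= rhs)
```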

The next theorem is a small generalization of the result in Ref. 2 in that we consider four positive semidefinite matrices instead of only two.

Theorem 2.1.
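Let X, Y, Z, and W be positive semidefinite, and let 1/2 ≤ s, t ≤ 1 with t + s ≤ 3/2. Then,

$$\left|\mathrm{Tr}[X^{s}Y^{t}Z^{1-s}W^{1-t}]\right| \le (\mathrm{Tr}[XY])^{s+t-1}\,(\mathrm{Tr}[YZ])^{1-s}\,(\mathrm{Tr}[WX])^{1-t}. \tag{2.4}$$

In particular, taking Z = X and W = Y, we obtain (1.1) under these conditions on s and t.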

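Since s, t ≥ 1/2, t ≥ 1 − s. Write t = (1 − s) + (t + s − 1), where both summands are non-negative. By cyclicity of the trace,

$$\mathrm{Tr}[X^{s}Y^{t}Z^{1-s}W^{1-t}] = \mathrm{Tr}[(X^{t+s-1}Y^{t+s-1})(Y^{1-s}Z^{1-s})(W^{1-t}X^{1-t})].$$

Define r1 ≔ t + s − 1, r2 ≔ 1 − t, and r3 ≔ 1 − s. Then, we have

$$\mathrm{Tr}[X^{s}Y^{t}Z^{1-s}W^{1-t}] = \mathrm{Tr}[(X^{r_1}Y^{r_1})(Y^{r_3}Z^{r_3})(W^{r_2}X^{r_2})].$$

By what was noted above, r1, r2, r3 ≥ 0, and of course, r1 + r2 + r3 = 1. Thus, by Hölder’s inequality,

$$\left|\mathrm{Tr}[X^{s}Y^{t}Z^{1-s}W^{1-t}]\right| \le \|X^{r_1}Y^{r_1}\|_{1/r_1}\,\|Y^{r_3}Z^{r_3}\|_{1/r_3}\,\|W^{r_2}X^{r_2}\|_{1/r_2}. \tag{2.5}$$

We may now apply (2.2), provided r1, r2, and r3 are all no greater than 1/2. Since s, t ≥ 1/2, it is always the case that r2, r3 ≤ 1/2, while r1 ≤ 1/2 if and only if t + s ≤ 3/2. Hence, under this condition, (2.4) is proved. □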
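Combining the Hölder step with (2.2) as in the proof gives the bound |Tr[X^s Y^t Z^(1-s) W^(1-t)]| ≤ (Tr[XY])^(s+t-1) (Tr[YZ])^(1-s) (Tr[WX])^(1-t) whenever 1/2 ≤ s, t and s + t ≤ 3/2. The following sketch, assuming NumPy, checks this form on random real positive semidefinite matrices; the function names and the test data are hypothetical.

```python
import numpy as np

def psd_power(A, p):
    """A**p for a symmetric positive semidefinite matrix, via its eigendecomposition."""
    w, V = np.linalg.eigh(A)
    w = np.clip(w, 0.0, None)                 # clip tiny negative round-off
    return (V * w**p) @ V.T

def random_psd(n, rng):
    G = rng.standard_normal((n, n))
    return G @ G.T

def four_matrix_sides(X, Y, Z, W, s, t):
    """Left and right sides of the four-matrix trace bound."""
    lhs = abs(np.trace(psd_power(X, s) @ psd_power(Y, t)
                       @ psd_power(Z, 1 - s) @ psd_power(W, 1 - t)))
    rhs = (np.trace(X @ Y) ** (s + t - 1)
           * np.trace(Y @ Z) ** (1 - s)
           * np.trace(W @ X) ** (1 - t))
    return lhs, rhs

rng = np.random.default_rng(5)
X, Y, Z, W = (random_psd(4, rng) for _ in range(4))
for s, t in [(0.5, 0.5), (0.7, 0.75), (0.6, 0.9), (1.0, 0.5)]:
    print(s, t, four_matrix_sides(X, Y, Z, W, s, t))
```

Setting Z = X and W = Y collapses the right side to Tr[XY], which is the two-matrix inequality in the range s + t ≤ 3/2.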

Remark 2.2.

The assumption that the two powers of X sum to 1 is not a real restriction. Given two arbitrary positive powers a, b, we may rename $X^{a+b}$ to be X and define s ≔ max{a, b}/(a + b) and similarly for Y.

Remark 2.3.

In Ref. 2, Theorem 2.1 was generalized to n X’s and n Y’s, and our method of proof of Theorem 2.1 using the Lieb–Thirring inequality likewise generalizes. This generalization will not be needed in the rest of this paper, and we do not discuss it here.

Remark 2.4.
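The fact that this method of proof cannot yield the inequality for all s, t, even in cases such as those described below for which the inequality is true for all s, t, has nothing to do with what is given up in the application of the Lieb–Thirring inequality: Consider the case s = t, Z = X and W = Y. Then, (2.5) becomes

$$\left|\mathrm{Tr}[X^{s}Y^{s}X^{1-s}Y^{1-s}]\right| \le \|X^{2s-1}Y^{2s-1}\|_{1/(2s-1)}\,\|X^{1-s}Y^{1-s}\|_{1/(1-s)}^{2}.$$

Hence, for X, Y > 0,

$$\lim_{s\uparrow 1}\|X^{2s-1}Y^{2s-1}\|_{1/(2s-1)}\,\|X^{1-s}Y^{1-s}\|_{1/(1-s)}^{2} = \|XY\|_1,$$

and in general, $\|XY\|_1 > \mathrm{Tr}[XY]$.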

We now present several results that provide conditions on X and Y under which (1.1) and (1.2) are valid for all (s, t) ∈ [1/2, 1] × [1/2, 1]. We will use the following lemma:

Lemma 2.5.
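Suppose that X and s are such that in a basis in which Y is diagonal,

$$(X^{s})_{i,j}(X^{1-s})_{j,i} \ge 0 \quad \text{for all } i, j. \tag{2.7}$$

Then, for all t ∈ [1/2, 1],

$$\mathrm{Tr}[X^{1/2}Y^{1/2}X^{1/2}Y^{1/2}] \le \mathrm{Tr}[X^{s}Y^{t}X^{1-s}Y^{1-t}] \le \mathrm{Tr}[XY].$$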

Remark 2.6.

The matrix $M_{i,j} := (X^{s})_{i,j}(X^{1-s})_{j,i}$ is the Hadamard product of two positive matrices, namely, $X^{s}$ and the transpose of $X^{1-s}$, and as such, it is positive semidefinite. However, the off-diagonal entries need not be positive or even real.

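Assume first that Y > 0. Computing in any basis that diagonalizes Y, with the jth diagonal entry of Y denoted by $y_j$,

$$f(t) := \mathrm{Tr}[X^{s}Y^{t}X^{1-s}Y^{1-t}] = \sum_{i,j}(X^{s})_{i,j}(X^{1-s})_{j,i}\,y_j^{t}\,y_i^{1-t},$$

where now it is convenient to let t range over [0, 1]. Under hypothesis (2.7), f(t) is convex in t and symmetric about t = 1/2. Hence, its maximum occurs at t = 0 and t = 1, and its minimum occurs at t = 1/2. Since Y > 0, $\lim_{t\to 1}\mathrm{Tr}[X^{s}Y^{t}X^{1-s}Y^{1-t}] = \mathrm{Tr}[X^{s}YX^{1-s}] = \mathrm{Tr}[XY]$. This proves that $\mathrm{Tr}[X^{s}Y^{t}X^{1-s}Y^{1-t}]$ is real and satisfies

$$\mathrm{Tr}[X^{s}Y^{1/2}X^{1-s}Y^{1/2}] \le \mathrm{Tr}[X^{s}Y^{t}X^{1-s}Y^{1-t}] \le \mathrm{Tr}[XY].$$

Since $(Y^{1/2})_{i,j}(Y^{1/2})_{j,i} = |(Y^{1/2})_{i,j}|^{2} \ge 0$, we may now apply what was proved above with the roles of X and Y interchanged to conclude that

$$\mathrm{Tr}[X^{1/2}Y^{1/2}X^{1/2}Y^{1/2}] \le \mathrm{Tr}[X^{s}Y^{1/2}X^{1-s}Y^{1/2}].$$

Finally, we obtain the same result assuming only Y ≥ 0 using the obvious limiting argument. □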
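The mechanism of the proof can be illustrated numerically: when the entries (X^s)_{i,j}(X^{1-s})_{j,i} are non-negative in a basis diagonalizing Y, the trace is a non-negative combination of the convex functions t ↦ y_j^t y_i^{1-t}. The sketch below assumes NumPy and takes, purely as an example, X = exp(−H) with H the Laplacian of a three-vertex path graph and a hypothetical positive diagonal Y.

```python
import numpy as np

H = np.array([[1.0, -1.0, 0.0],
              [-1.0, 2.0, -1.0],
              [0.0, -1.0, 1.0]])       # path-graph Laplacian, off-diagonals <= 0
y = np.array([0.3, 1.0, 2.5])          # diagonal entries of Y > 0 (hypothetical)
w, V = np.linalg.eigh(H)

def X_pow(s):
    """X^s = exp(-s H), computed from the eigendecomposition of H."""
    return (V * np.exp(-s * w)) @ V.T

def M(s):
    """M_ij = (X^s)_ij (X^(1-s))_ji."""
    return X_pow(s) * X_pow(1 - s).T

def f(s, t):
    """Tr[X^s Y^t X^(1-s) Y^(1-t)] as the double sum over i, j."""
    return np.sum(M(s) * np.outer(y ** (1 - t), y ** t))

upper = np.sum(np.diag(X_pow(1.0)) * y)   # Tr[XY]
lower = f(0.5, 0.5)                       # Tr[X^1/2 Y^1/2 X^1/2 Y^1/2]
for s in (0.5, 0.7, 0.9):
    print(s, M(s).min(), [round(f(s, t), 6) for t in (0.5, 0.75, 1.0)])
print(lower, upper)
```

For each fixed s the printed values increase from t = 1/2 to t = 1 and stay between the two reference traces, as the convexity argument predicts.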

Our first application of Lemma 2.5 is to pairs of operators of a sort that arise frequently in mathematical physics. For X > 0, define H = −log(X) so that $X = e^{-H}$. Suppose that in a basis in which Y is diagonal, all off-diagonal entries of H are non-positive, i.e.,

$$H_{i,j} \le 0 \quad \text{for all } i \ne j. \tag{2.9}$$

For example, this is the case if H is the graph Laplacian on an unoriented graph (with the graph theorist’s sign convention that the graph Laplacian is non-negative); see Example 3.3.

It is well known that under these conditions, as a consequence of the Beurling–Deny Theorem (Ref. 4, Theorem 5), the semigroup $e^{-sH}$ is positivity preserving, and so, in particular, $(e^{-sH})_{i,j} \ge 0$ for all s ≥ 0 and all i, j. For the reader’s convenience, we recall the relevant part of their proof adapted to our setting: Take λ > 0 sufficiently small that I + λH is invertible. Then, for any vector f, $(1+\lambda H)^{-1}f$ is the unique minimizer of

$$F(u) := \lambda\langle u, Hu\rangle + \|u - f\|^{2}.$$

The uniqueness follows from the strict convexity of F for sufficiently small λ > 0. Under condition (2.9), when f = |f|, F(|u|) ≤ F(u). Hence, $(1+\lambda H)^{-1}$ maps the positive cone into itself, and all entries of this matrix are non-negative. The same is evidently true of $(1+\lambda H)^{-n}$ for all n. Taking λ = s/n and n → ∞, the same is true of $e^{-sH}$ for all s ≥ 0.
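The resolvent argument can be followed numerically: for a matrix H with non-positive off-diagonal entries, the powers (1 + (s/n)H)^(-n) have non-negative entries and approach exp(−sH) as n grows. This is a sketch assuming NumPy; the example matrix (a path-graph Laplacian) and the values of s and n are illustrative.

```python
import numpy as np

H = np.array([[1.0, -1.0, 0.0],
              [-1.0, 2.0, -1.0],
              [0.0, -1.0, 1.0]])                    # off-diagonal entries <= 0
s, n = 1.0, 4096
resolvent = np.linalg.inv(np.eye(3) + (s / n) * H)  # (1 + (s/n) H)^(-1)
approx = np.linalg.matrix_power(resolvent, n)       # (1 + (s/n) H)^(-n)

w, V = np.linalg.eigh(H)
exact = (V * np.exp(-s * w)) @ V.T                  # exp(-s H) via eigendecomposition

print(resolvent.min(), approx.min(), np.abs(approx - exact).max())
```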

Theorem 2.7.

Suppose that H = −log X satisfies (2.9) in a basis in which Y is diagonal. Then, (1.1) and (1.2) are valid for all (s, t) ∈ [1/2, 1] × [1/2, 1].

By the Beurling–Deny theorem as explained above, for all s > 0,

$$(X^{s})_{i,j} = (e^{-sH})_{i,j} \ge 0 \quad \text{for all } i, j.$$

It follows that (2.7) is satisfied for all s, and now the conclusion follows from Lemma 2.5. □

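One may also use Lemma 2.5 to show that both (1.1) and (1.2) are valid for 2 × 2 matrices, as was already shown in Ref. 2: Write $X = \begin{pmatrix} a & z \\ \bar{z} & b \end{pmatrix}$. Then, by the usual integral representation formula for $X^{s}$, 0 < s < 1,

$$X^{s} = \frac{\sin(\pi s)}{\pi}\int_0^\infty \lambda^{s}\left(\frac{1}{\lambda} - \frac{1}{\lambda + X}\right)d\lambda, \qquad (\lambda + X)^{-1} = \frac{1}{\det(\lambda + X)}\begin{pmatrix} \lambda + b & -z \\ -\bar{z} & \lambda + a \end{pmatrix},$$

showing that for all 0 < α < 1, $(X^{\alpha})_{1,2}$ is a positive multiple of z, and hence, (2.7) is always true.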
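For a concrete 2 × 2 check, one can compute fractional powers of a positive definite matrix with off-diagonal entry z and confirm that the (1,2) entry of each power is a positive real multiple of z, so that the product (X^s)_{1,2}(X^{1-s})_{2,1} is non-negative. The sketch assumes NumPy; the numerical values of a, b, and z are arbitrary illustrative choices.

```python
import numpy as np

def psd_power(A, p):
    """A**p for a Hermitian positive semidefinite matrix, via its eigendecomposition."""
    w, V = np.linalg.eigh(A)
    w = np.clip(w, 0.0, None)
    return (V * w**p) @ V.conj().T

a, b, z = 2.0, 1.0, 0.6 - 0.8j            # |z|^2 = 1 < ab = 2, so X > 0
X = np.array([[a, z], [np.conj(z), b]])

powers = (0.1, 0.3, 0.5, 0.7, 0.9)
ratios = [psd_power(X, s)[0, 1] / z for s in powers]                     # (X^s)_12 / z
products = [psd_power(X, s)[0, 1] * psd_power(X, 1 - s)[1, 0] for s in powers]
print(ratios)
print(products)
```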

Our next theorem provides another class of examples of positive matrices X and Y for which (1.1) is true for all 1/2 ≤ s, t ≤ 1. A related theorem, for a version of (1.1) with the operator norm in place of the trace, has recently been proved in Ref. 1 by quite different means.

Theorem 2.8.

Let H and K be arbitrary self-adjoint n × n matrices. Then, there exists an α0 > 0 depending on H and K so that for all α < α0, with $X := e^{\alpha H}$ and $Y := e^{\alpha K}$, (1.1) is valid for all 1/2 ≤ s, t ≤ 1.

If H and K commute, then it is obvious that (1.1) is valid for all 1/2 ≤ s, t ≤ 1, no matter what α > 0 may be. Hence, we may assume without loss of generality that [H, K] ≠ 0. In addition, without loss of generality, we may suppose that H and K are both contractions and 0 ≤ α ≤ 1. Then, by the spectral theorem,
and likewise for K. Thus,
Note that
where $\|R\| \le 3\alpha^{3}$.
Now, writing X = eαH and Y = eαK,
Using (2.10) and (2.11), we obtain
for some constant C that can be easily estimated. Note that for all s, t, Z(0, t) = Z(s, 0) = I. For this reason, there cannot be any terms proportional to $s^{2}$ or $t^{2}$ in the second order expansion.
Altogether we have
where $\|R_2\|, \|R_3\| \le C\alpha^{3}$ for some constant C. Evidently, $\mathrm{Tr}[[H,K]] = \mathrm{Tr}[H[H,K]] = \mathrm{Tr}[K[H,K]] = \mathrm{Tr}[H^{2}[H,K]] = \mathrm{Tr}[K^{2}[H,K]] = 0$. A simple computation shows that
where $\|R_4\| \le C\alpha^{3}$, and hence, $\mathrm{Tr}[R_4] \le nC\alpha^{3}$. Evidently, since by hypothesis [H, K] ≠ 0, $\mathrm{Tr}[[H,K]^{2}] < 0$. Thus, for all α sufficiently small, $\mathrm{Tr}[X^{1-t}Y^{1-s}X^{t}Y^{s}] - \mathrm{Tr}[XY] < 0$ for all (s, t) ∈ [1/2, 1] × [1/2, 1]. □

Of course, replacing t by 1 − t and s by 1 − s, the same proof shows, with the same α0, that when α < α0,


Replacing s by is and t by it yields


and hence, for [X, Y] ≠ 0 and α sufficiently small,


Thus, the three lines argument in Ref. 5 cannot hold for s, t sufficiently close to 1 or 0.

Let H and K be self-adjoint n × n matrices. For 0 ≤ u ≤ 1, define

$$f_{H,K}(u) := \mathrm{Tr}[e^{H+(1-u)K}e^{uK}]. \tag{3.1}$$

Then, $f_{H,K}(0) = \mathrm{Tr}[e^{H+K}]$ and $f_{H,K}(1) = \mathrm{Tr}[e^{H}e^{K}]$, and by the Golden–Thompson inequality,

$$\mathrm{Tr}[e^{H+K}] \le \mathrm{Tr}[e^{H}e^{K}],$$

$f_{H,K}(0) \le f_{H,K}(1)$. In this section, we ask the following: When is $f_{H,K}(u)$ monotone increasing in u? We shall prove that this is the case for an interesting class of pairs (H, K) of self-adjoint matrices, and we shall show that it is not true in general.

Remark 3.1.
Observe that if one replaces H by H + aI and K by K + bI,

$$f_{H+aI,K+bI}(u) = e^{a+b}f_{H,K}(u),$$

and hence, whether or not $f_{H+aI,K+bI}(u)$ is monotone increasing is independent of a and b.

Theorem 3.2.

Suppose that K is diagonal and that all off-diagonal entries of H are non-negative. Then, $f_{H,K}(u)$ is monotone increasing.

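By Remark 3.1, we may assume that K ≥ 0. It will be convenient to define $H_u := H + (1-u)K$. Then, differentiating and expanding $e^{uK}$ in its power series,

$$f'(u) = \sum_{m=0}^{\infty}\frac{u^{m}}{m!}\left(\mathrm{Tr}[e^{H_u}K^{m+1}] - \int_0^1 \mathrm{Tr}[e^{(1-t)H_u}Ke^{tH_u}K^{m}]\,dt\right). \tag{3.4}$$

Now define $X = e^{H_u}$, and for each m, $Y = K^{m+1}$ and $s = (m+1)^{-1}$. With these definitions,

$$\mathrm{Tr}[e^{H_u}K^{m+1}] = \mathrm{Tr}[XY] \quad \text{and} \quad \mathrm{Tr}[e^{(1-t)H_u}Ke^{tH_u}K^{m}] = \mathrm{Tr}[X^{1-t}Y^{s}X^{t}Y^{1-s}].$$

Since Y is diagonal and, for each u, $-\log X = -H_u$ has non-positive off-diagonal entries, Theorem 2.7 gives

$$\mathrm{Tr}[X^{1-t}Y^{s}X^{t}Y^{1-s}] \le \mathrm{Tr}[XY].$$

Then, by (3.4), f′(u) ≥ 0. □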
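The monotonicity statement can be explored numerically by evaluating u ↦ Tr[exp(H + (1−u)K) exp(uK)] on a grid. The sketch below assumes NumPy and real symmetric matrices; the function names, the matrix size, and the random choice of H (non-negative off-diagonal entries) and diagonal K are illustrative.

```python
import numpy as np

def sym_expm(A):
    """Matrix exponential of a real symmetric matrix via its eigendecomposition."""
    w, V = np.linalg.eigh(A)
    return (V * np.exp(w)) @ V.T

def gt_interpolation(H, K, u):
    """f_{H,K}(u) = Tr[exp(H + (1-u)K) exp(uK)]."""
    return np.trace(sym_expm(H + (1 - u) * K) @ sym_expm(u * K))

rng = np.random.default_rng(3)
n = 4
A = rng.random((n, n))
H = (A + A.T) / 2                                  # all entries in [0, 1): off-diagonals >= 0
H[np.diag_indices(n)] = rng.standard_normal(n)     # the diagonal of H may have any sign
K = np.diag(rng.standard_normal(n))                # K is diagonal

us = np.linspace(0.0, 1.0, 21)
vals = np.array([gt_interpolation(H, K, u) for u in us])
print(vals[0], vals[-1])                           # Tr[e^{H+K}] and Tr[e^H e^K]
print(np.all(np.diff(vals) >= -1e-9 * vals.max())) # non-decreasing along the grid
```

K is not assumed positive here; by the shift identity f_{H+aI,K+bI}(u) = e^{a+b} f_{H,K}(u), shifting K by a multiple of the identity does not affect monotonicity.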

Example 3.3.
Let G be a graph with a finite set of vertices $\mathcal{V}$. Let the edge set be $\mathcal{E}$; this is a subset of $\mathcal{V}\times\mathcal{V}$. Suppose that G is a simple graph, meaning that $(x,x)\notin\mathcal{E}$ for all $x\in\mathcal{V}$ and that $(x,y)\in\mathcal{E}$ if and only if $(y,x)\in\mathcal{E}$. Then, the graph Laplacian, $\Delta_G$, is defined by

$$\Delta_G f(x) = \sum_{y\,:\,(x,y)\in\mathcal{E}}\bigl(f(x) - f(y)\bigr).$$

In the natural basis, all off-diagonal elements of the matrix representing $\Delta_G$ are non-positive. Define $H_0 = \Delta_G$ to obtain a non-negative “free Hamiltonian” as in the usual mathematical physics convention. Let V be a self-adjoint multiplication operator on $L^{2}(\mathcal{V},\mu)$, where μ is the uniform probability measure on $\mathcal{V}$. In the natural basis, V is diagonal.
Then, by Theorem 3.2 applied with $H = -H_0$ and $K = -V$,

$$\mathrm{Tr}[e^{-(H_0+(1-u)V)}e^{-uV}]$$

is monotone increasing in u.

Example 3.4.
Although we have given proofs in the context of matrices, it is easy to see that the proofs extend to cover interesting infinite dimensional cases. Let $X = e^{\beta\Delta}$, where Δ is the Laplacian on $\mathbb{R}^{d}$ and β > 0. Let V be a real valued function on $\mathbb{R}^{d}$, and let V also denote multiplication by V acting on $L^{2}(\mathbb{R}^{d})$, which is, in general, unbounded. Let $Y = e^{-\beta V}$. Then, since $X^{t}$ has a positive kernel and Y acts by multiplication on $L^{2}(\mathbb{R}^{d})$, the proof of Theorem 3.2 is easily adapted to show that

$$\mathrm{Tr}[e^{\beta(\Delta-(1-u)V)}e^{-\beta uV}]$$

is monotone increasing in u. The same applies with −Δ replaced by $(-\Delta)^{1/2}$, another case that arises in physical applications.

This section presents the construction of counterexamples, showing that inequalities (1.1) and (1.2) cannot hold in general, even in the 3 × 3 case, and showing that the monotonicity property established in Theorem 3.2 under specified conditions cannot hold in general. While counterexamples for (1.1) and (1.2) were found by Plevnik (Ref. 9), our goal is to provide a systematic approach to their construction. Plevnik provided two completely separate and purely numerical counterexamples to (1.1) and (1.2). We provide a method for constructing a family of counterexamples that goes further in significant ways. For example, while Plevnik showed in Ref. 9 (Example 2.5) that (1.2) can be violated, his example does not show that it is possible for $\mathrm{Tr}[X^{s}Y^{t}X^{1-s}Y^{1-t}]$ to be negative. We show that this is the case. Moreover, our construction shows that the failure of inequalities (1.1) and (1.2) as well as the failure, in general, of the monotonicity of the Golden–Thompson inequality described in Theorem 3.2 are all closely connected. Essentially, one example undoes all three would-be conjectures.

We have seen in Lemma 2.5 that if all of the entries of $M_{i,j} := (X^{s})_{i,j}(X^{1-s})_{j,i}$ are non-negative, then (1.1) and (1.2) both hold. In constructing our counterexamples, we shall take X to be real, and hence, the entries of M will be real for each s.

Lemma 4.1.
Let $y := (y_1, \ldots, y_n)$ be any vector in $\mathbb{R}^{n}$, and let X be any positive semidefinite n × n matrix. Let M(s) denote the matrix $M_{i,j}(s) := (X^{s})_{i,j}(X^{1-s})_{j,i}$. Then, for all 0 < s < 1,

$$\sum_{i,j=1}^{n}M_{i,j}(s)\,(y_i - y_j)^{2} \ge 0. \tag{4.1}$$


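We may assume that the entries of y are positive since the left-hand side of (4.1) does not change when we add to y any multiple of the vector each of whose entries is 1.

By Lemma 2.5, applied with the roles of X and Y interchanged (so that the hypothesis holds automatically at exponent 1/2), we know that for X and any matrix Y ≥ 0 (we replace Y by $Y^{2}$ in Lemma 2.5 for convenience),

$$\mathrm{Tr}[X^{s}YX^{1-s}Y] \le \mathrm{Tr}[XY^{2}].$$

Letting Y be the diagonal matrix whose jth diagonal entry is $y_j$, this becomes

$$\sum_{i,j}M_{i,j}(s)\,y_iy_j \le \sum_{i}X_{i,i}\,y_i^{2} = \sum_{i,j}M_{i,j}(s)\,y_i^{2},$$

where the last equality uses $\sum_j M_{i,j}(s) = X_{i,i}$. Symmetrizing in i and j gives (4.1). □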

We now claim that if X ≥ 0 is a real 3 × 3 matrix, for any 0 < s < 1, M(s) has at most one entry above the diagonal that is negative. [By Remark 2.6, all diagonal entries are non-negative, and M(s) is symmetric, so the same is true below the diagonal.] To see this, take the vector y to be of the form (0, 1, 1), (1, 0, 1), or (1, 1, 0). Then, for these choices, (4.1) becomes

$$M_{1,2}(s) + M_{1,3}(s) \ge 0, \qquad M_{1,2}(s) + M_{2,3}(s) \ge 0, \qquad M_{1,3}(s) + M_{2,3}(s) \ge 0.$$

Thus, each pair of entries above the diagonal must have a non-negative sum, and hence, no two can be negative.
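The constraint on the entries of M(s) can be tested numerically in the form Σ_{i,j} M_{i,j}(s)(y_i − y_j)² ≥ 0, which for the three 0/1 test vectors reduces to the pairwise sums of the above-diagonal entries. The sketch below assumes NumPy; the random real positive semidefinite 3 × 3 matrix and the helper names are illustrative.

```python
import numpy as np

def psd_power(A, p):
    """A**p for a symmetric positive semidefinite matrix, via its eigendecomposition."""
    w, V = np.linalg.eigh(A)
    w = np.clip(w, 0.0, None)
    return (V * w**p) @ V.T

rng = np.random.default_rng(4)
G = rng.standard_normal((3, 3))
X = G @ G.T                                   # random real PSD 3 x 3 matrix

def M(s):
    """M_ij(s) = (X^s)_ij (X^(1-s))_ji."""
    return psd_power(X, s) * psd_power(X, 1 - s).T

def quad_form(s, y):
    """sum_ij M_ij(s) (y_i - y_j)^2."""
    d = y[:, None] - y[None, :]
    return np.sum(M(s) * d**2)

def pair_sums(s):
    """The three pairwise sums of above-diagonal entries of M(s)."""
    m = M(s)
    return m[0, 1] + m[0, 2], m[0, 1] + m[1, 2], m[0, 2] + m[1, 2]

for s in (0.1, 0.3, 0.5, 0.7, 0.9):
    print(s, pair_sums(s), quad_form(s, rng.standard_normal(3)))
```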

One might hope that one could construct counterexamples to (1.1) and (1.2) by constructing matrices X > 0 for which some off-diagonal entry satisfies $M_{i,j}(t) < 0$ for all t ∈ (0, 1/2) ∪ (1/2, 1). This is easy to do, but this alone does not yield counterexamples.

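For example, define

$$X^{1/2} = \begin{pmatrix} 2 & \sqrt{2} & 0 \\ \sqrt{2} & 2 & \sqrt{2} \\ 0 & \sqrt{2} & 2 \end{pmatrix}.$$

This matrix is easily diagonalized; the eigenvalues are 4, 2, and 0. Since $(X^{1/2})_{1,3} = 0$, one might expect that $(X^{s})_{1,3}$ changes sign at s = 1/2, and only there, so that $M_{1,3}(s) \le 0$ for all 0 < s < 1. Indeed, doing the computations, one finds

$$M_{1,3}(s) = 2 - 4^{s-1/2} - 4^{1/2-s} \le 0, \qquad M_{1,1}(s) = M_{3,3}(s) = 2 + 4^{s-1/2} + 4^{1/2-s}.$$

Now take $Y := \mathrm{diag}(a, 0, b)$ with a, b > 0 and distinct. Then,

$$\mathrm{Tr}[X^{1-s}Y^{1-t}X^{s}Y^{t}] = M_{1,1}(s)\,a + M_{3,3}(s)\,b + M_{1,3}(s)\,(a^{t}b^{1-t} + a^{1-t}b^{t}). \tag{4.3}$$

For fixed s ∉ {0, 1/2, 1}, this is strictly concave in t and symmetric about t = 1/2, so the maximum occurs only at t = 1/2 and the minimum only at t ∈ {0, 1}. However, since $\lim_{t\to 0}Y^{t} = P := \mathrm{diag}(1, 0, 1) \ne I$, we do not have $\lim_{t\to 0}\mathrm{Tr}[X^{1-s}Y^{1-t}X^{s}Y^{t}] = \mathrm{Tr}[XY]$, which would provide a counterexample to (1.1), but instead $\lim_{t\to 0}\mathrm{Tr}[X^{1-s}Y^{1-t}X^{s}Y^{t}] = \mathrm{Tr}[X^{1-s}YX^{s}P]$. As we have just seen, this is less than $\mathrm{Tr}[X^{1-s}Y^{1/2}X^{s}Y^{1/2}]$, and by Lemma 2.5, this, in turn, is less than Tr[XY]. In fact, defining $h(t) := 4^{t-1/2} + 4^{1/2-t}$, we can rewrite (4.3) as

$$\mathrm{Tr}[X^{1-s}Y^{1-t}X^{s}Y^{t}] = 2\,(a + b + a^{t}b^{1-t} + a^{1-t}b^{t}) + h(s)\,(a + b - a^{t}b^{1-t} - a^{1-t}b^{t}). \tag{4.4}$$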
Then, from (4.4),


By the arithmetic-geometric mean inequality, $a + b - a^{t}b^{1-t} - a^{1-t}b^{t} \ge 0$ for all 0 ≤ t ≤ 1. Since h(s) is evidently convex and symmetric about s = 1/2, for each fixed t ∈ (0, 1), $\mathrm{Tr}[X^{1-s}Y^{1-t}X^{s}Y^{t}]$ is a strictly convex function of s, symmetric about s = 1/2. Therefore, this function is minimized only for s = 1/2 and maximized only for s ∈ {0, 1}, and hence, for any t,

$$\mathrm{Tr}[X^{1-s}Y^{1-t}X^{s}Y^{t}] \ge \mathrm{Tr}[X^{1/2}Y^{1-t}X^{1/2}Y^{t}],$$

and the right side is independent of t since $(X^{1/2})_{1,3} = 0$. Hence, (1.2) is satisfied for all choices of a, b > 0. Likewise, by what was proved above, for all s, t, with $Q := \lim_{s\to 0}X^{s}$, which is an orthogonal projection,


and hence, (1.1) is satisfied for all choices of a, b > 0.
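The explicit 3 × 3 example can be reproduced numerically. The sketch below assumes NumPy and reads the entries of X^{1/2} as 2 on the diagonal and √2 on the first off-diagonals, which matches the stated eigenvalues 4, 2, and 0. The closed forms in `closed_form` and in the comments were derived here from the spectral decomposition rather than quoted, and the values a = 1, b = 3 are arbitrary. Eigenvalues below a relative tolerance are treated as exact zeros, since a round-off eigenvalue of order 1e-15 raised to a small power is not negligible.

```python
import numpy as np

def psd_power(A, p, tol=1e-12):
    """A**p for symmetric PSD A, p > 0; eigenvalues below tol * max are treated as zero."""
    w, V = np.linalg.eigh(A)
    w = np.where(w > tol * w.max(), w, 0.0)
    return (V * w**p) @ V.T

r2 = np.sqrt(2.0)
X_half = np.array([[2.0, r2, 0.0],
                   [r2, 2.0, r2],
                   [0.0, r2, 2.0]])
X = X_half @ X_half
a, b = 1.0, 3.0                               # arbitrary distinct positive numbers
Y = np.diag([a, 0.0, b])

def h(s):
    return 4.0 ** (s - 0.5) + 4.0 ** (0.5 - s)

def M13(s):
    """M_13(s) = (X^s)_13 (X^(1-s))_31, expected to equal 2 - h(s)."""
    return psd_power(X, s)[0, 2] * psd_power(X, 1 - s)[2, 0]

def trace_value(s, t):
    """Tr[X^(1-s) Y^(1-t) X^s Y^t] for 0 < s, t < 1."""
    return np.trace(psd_power(X, 1 - s) @ psd_power(Y, 1 - t)
                    @ psd_power(X, s) @ psd_power(Y, t))

def closed_form(s, t):
    g = a**t * b**(1 - t) + a**(1 - t) * b**t
    return 2 * (a + b + g) + h(s) * (a + b - g)

print(np.linalg.eigvalsh(X_half))             # approximately 0, 2, 4
for s, t in [(0.25, 0.5), (0.5, 0.3), (0.8, 0.9)]:
    print(s, t, M13(s), trace_value(s, t), closed_form(s, t))
print(np.trace(X @ Y), trace_value(0.5, 0.5)) # upper and lower reference values
```

With these values every sampled trace stays between Tr[X^{1/2}Y^{1/2}X^{1/2}Y^{1/2}] = 4(a + b) and Tr[XY] = 6(a + b), even though M_{1,3}(s) is negative away from s = 1/2.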

This shows that the construction of counterexamples is more subtle than simply producing negative entries in M(s). It appears that the key to the construction of counterexamples for 3 × 3 matrices is to choose X so that one of the inequalities in (4.1) is nearly saturated, with one of the summands negative for most values of s. Furthermore, it is natural to choose X and Y to be perturbations of positive semidefinite matrices $X_0$ and $Y_0$ such that $\mathrm{Tr}[X_0^{1-s}Y_0^{1-t}X_0^{s}Y_0^{t}] = 0$ for all 0 ≤ s, t ≤ 1. Of course, this is satisfied if $X_0$ and $Y_0$ are orthogonal projections with mutually orthogonal ranges.

Our construction relies on the Householder reflections determined by two distinct unit vectors $u, v \in \mathbb{R}^{n}$. This is given by $H_{u,v} := I - 2\|u - v\|^{-2}\,|u - v\rangle\langle u - v|$. Evidently, $H_{u,v}$ is self-adjoint and orthogonal, and $H_{u,v}u = v$ and $H_{u,v}v = u$. For simplicity, choose

$$u := (0, 0, 1) \quad \text{and} \quad v := 2^{-1/2}(1, 1, 0).$$
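The stated properties of the Householder reflection are quick to confirm for the chosen vectors. This is a minimal sketch assuming NumPy, with the reflection normalized by the squared length of u − v; the function name `householder` is illustrative.

```python
import numpy as np

def householder(u, v):
    """Reflection I - 2 |w><w| / <w, w> with w = u - v; it exchanges unit vectors u and v."""
    w = u - v
    return np.eye(len(u)) - 2.0 * np.outer(w, w) / np.dot(w, w)

u = np.array([0.0, 0.0, 1.0])
v = np.array([1.0, 1.0, 0.0]) / np.sqrt(2.0)
U = householder(u, v)

print(U)
print(U @ u, U @ v)          # expected: v and u, respectively
```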



Now choose


Then, X0 and Y0 are orthogonal projections such that X0Y0 = 0.

Now, we make a simple perturbation. For a, b > 0 small, to be chosen later, define


and also for 0 < t < 1, define




The off-diagonal entries of UYU will not change sign as t varies, but we can make this happen by applying a further orthogonal transformation; define $R := \begin{pmatrix} \cos x & 0 & \sin x \\ 0 & 1 & 0 \\ -\sin x & 0 & \cos x \end{pmatrix}$, and finally put


where RT is the transpose of R, with x, a, and b to be chosen later. We compute




We seek a small perturbation of $X_0$, and hence, we will take a, b, and |x| all to be small positive numbers. It is easy to see that the sign change we seek occurs in $(X^{t})_{1,3}$ if we take a/b ≪ 1 and 0 < x ≪ 1, and occurs in $(X^{t})_{2,3}$ if we take b/a ≪ 1 and 0 < x ≪ 1.

Example 4.2.
To get a counterexample to (1.1), take $a = 10^{-10}$, $b = 10^{-19}$, $x = 10^{-5}$, $c = 10^{-10}$, and d = 0. Then, one finds

Example 4.3.
To get a counterexample to (1.2), take $a = 10^{-19}$, $b = 10^{-10}$, $x = 10^{-5}$, $c = 10^{-10}$, and d = 0. Then, one finds that
which, being negative, is certainly less than $\mathrm{Tr}[X^{1/2}Y^{1/2}X^{1/2}Y^{1/2}] > 0$, and by continuity, somewhere the trace must be zero.

Note that the only difference between the two examples is that we have swapped the values assigned to a and b; all other parameters are left the same. Numerical plots show that in both cases, the maximum value of $|M_{1,3}(t) + M_{2,3}(t)|$ is less than $10^{-3}$ times the maximum of $|M_{1,3}(t)| + |M_{2,3}(t)|$ so that the last inequality in (4.1) is nearly saturated; there is near cancellation in the sum $(X^{t})_{1,3} + (X^{t})_{2,3}$. Note that in our counterexample to (1.1), the sum of the exponents s + t is 1.58, not so much larger than the minimum value, 3/2, at which such a counterexample cannot exist. It would be of interest to see if one can build on this construction, possibly extending it into higher dimensions, to show that the condition s + t ≤ 3/2 in Theorem 2.1 is sharp.

We close by showing that the monotonicity property for the Golden–Thompson inequality described in Theorem 3.2 does not hold for arbitrary self-adjoint matrices H and K.

Recall that $f_{H,K}(u)$ has been defined by (3.1),

$$f_{H,K}(u) = \mathrm{Tr}[e^{H+(1-u)K}e^{uK}].$$

With X and Y as above, we define K = log(X) and H = log Y. Since H is diagonal, the integral $\int_0^1 e^{tH}Ke^{(1-t)H}\,dt$ can be explicitly evaluated as a Hadamard product. One finds


This shows that the monotonicity proved in Theorem 3.2 is not true for general self-adjoint H and K.

We thank Victoria Chayes and Rupert Frank for useful conversations. E.A.C. gratefully acknowledges partial support from the U.S. National Science Foundation (Grant No. DMS 2055282).

The authors have no conflicts to disclose.

Data sharing is not applicable to this article as no new data were created or analyzed in this study.

1. “On matrix rearrangement inequalities,” Proc. Am. Math. Soc.
2. Ando, Hiai, and Okubo, “Trace inequalities for multiple products of two matrices,” Math. Inequalities Appl.
3. Araki, “On an inequality of Lieb and Thirring,” Lett. Math. Phys.
4. Beurling and Deny, “Espaces de Dirichlet: I. Le cas élémentaire,” Acta Math.
5. Bottazzi et al., “Inequalities related to Bourin and Heinz means with a complex parameter,” J. Math. Anal. Appl.
6. Friedland and So, “On the product of matrix exponentials,” Linear Algebra Appl.
7. Golden, “Lower bounds for the Helmholtz function,” Phys. Rev.
8. E. H. Lieb and W. E. Thirring, “Inequalities for the moments of the eigenvalues of the Schrödinger Hamiltonian and their relation to Sobolev inequalities,” in Studies in Mathematical Physics (Princeton University Press).
9. Plevnik, “On a matrix trace inequality due to Ando, Hiai and Okubo,” Indian J. Pure Appl. Math. 47, 491–500 (2016).
10. Simon, Trace Ideals and Their Applications, 2nd ed., Mathematical Surveys and Monographs Vol. 120 (Providence, RI).
11. C. J. Thompson, “Inequality with applications in statistical mechanics,” J. Math. Phys.