The Golden–Thompson trace inequality, which states that Tr e^{H+K} ≤ Tr e^H e^K, has proved to be very useful in quantum statistical mechanics. Golden used it to show that the classical free energy is less than the quantum one. Here, we make this G–T inequality more explicit by proving that for some operators, notably the operators of interest in quantum mechanics, H = Δ or and K = potential, Tr e^{H+(1−u)K} e^{uK} is a monotone increasing function of the parameter u for 0 ≤ u ≤ 1. Our proof utilizes an inequality of Ando, Hiai, and Okubo (AHO): Tr X^s Y^t X^{1−s} Y^{1−t} ≤ Tr XY for positive operators X, Y and for 1/2 ≤ s, t ≤ 1 and s + t ≤ 3/2. The obvious conjecture that this inequality should hold up to s + t ≤ 2 was proved false by Plevnik [Indian J. Pure Appl. Math. 47, 491–500 (2016)]. We give a different proof of AHO and also give more counterexamples in the range 3/2 ≤ s + t ≤ 2. More importantly, we show that the inequality conjectured in AHO does indeed hold in the full range if X, Y have a certain positivity property, one that does hold for quantum mechanical operators, thus enabling us to prove our G–T monotonicity theorem.
I. INTRODUCTION
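In 2000, Ando, Hiai, and Okubo (AHO, Ref. 2) considered several inequalities for traces of products of two positive semidefinite matrices X and Y, of which the two simplest were
|Tr[X^s Y^t X^{1−s} Y^{1−t}]| ≤ Tr[XY] (1.1)
and
Tr[X^{1/2} Y^{1/2} X^{1/2} Y^{1/2}] ≤ |Tr[X^s Y^t X^{1−s} Y^{1−t}]| (1.2)
with 1/2 ≤ s ≤ 1 and 1/2 ≤ t ≤ 1.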
Note that the absolute value, or at least a real part, is necessary for either (1.1) or (1.2) to make sense; Tr[X^s Y^t X^{1−s} Y^{1−t}] may be a complex number.
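The remark above can be explored numerically. The following is a minimal sketch, assuming NumPy is available; the helper names `psd_power`, `aho_trace`, and `random_psd` and the random test matrices are illustrative choices, not part of the original text. It evaluates the trace for complex Hermitian positive semidefinite matrices and compares the values at t and 1 − t, which are complex conjugates of each other by cyclicity of the trace.

```python
import numpy as np

def psd_power(a, p):
    """a**p for a Hermitian positive semidefinite matrix, via its eigendecomposition."""
    w, v = np.linalg.eigh(a)
    w = np.clip(w, 0.0, None)            # remove tiny negative rounding errors
    return (v * w**p) @ v.conj().T

def aho_trace(x, y, s, t):
    """Tr[X^s Y^t X^(1-s) Y^(1-t)]; a complex number in general."""
    return np.trace(psd_power(x, s) @ psd_power(y, t)
                    @ psd_power(x, 1 - s) @ psd_power(y, 1 - t))

def random_psd(n, rng):
    """Hypothetical test data: G G* for a complex Gaussian matrix G."""
    g = rng.standard_normal((n, n)) + 1j * rng.standard_normal((n, n))
    return g @ g.conj().T

rng = np.random.default_rng(0)
x, y = random_psd(3, rng), random_psd(3, rng)
value = aho_trace(x, y, 0.8, 0.7)
mirror = aho_trace(x, y, 0.8, 0.3)       # t replaced by 1 - t
print(value, mirror)                     # the two values are complex conjugates
print(aho_trace(x, y, 0.5, 0.5))         # real and non-negative up to rounding
```

For real symmetric X and Y the trace is always real, since it is invariant under transposition; this is why complex entries are used in the sketch.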
Ando, Hiai, and Okubo succeeded in proving both inequalities when X and Y were 2 × 2 matrices or, more generally, when both X and Y have at most two distinct eigenvalues (Ref. 2, Corollary 4.3). They also proved (1.1) when s + t ≤ 3/2 but could only prove (1.2) when either s = 1/2 or t = 1/2. They raised the question as to whether the inequalities (1.1) and (1.2) hold over the entire range 1 ≤ s + t ≤ 2. In addition to proving the positive results mentioned above (and some generalizations discussed below), they remarked that the behavior of the function (s, t) ↦ |Tr[X^s Y^t X^{1−s} Y^{1−t}]| on the whole square [1/2, 1] × [1/2, 1] “is rather complicated for general n × n positive semidefinite matrices.”
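As a sanity check of the proven range, the sketch below samples random real positive semidefinite matrices and exponents with 1/2 ≤ s, t ≤ 1 and s + t ≤ 3/2, and records the largest relative excess of |Tr[X^s Y^t X^{1−s} Y^{1−t}]| over Tr[XY]. The function name `worst_relative_excess`, the matrix size, and the sampling scheme are assumptions made for illustration; a non-positive result is what the inequality predicts, and random sampling of course proves nothing.

```python
import numpy as np

def psd_power(a, p):
    """a**p for a Hermitian positive semidefinite matrix, via its eigendecomposition."""
    w, v = np.linalg.eigh(a)
    w = np.clip(w, 0.0, None)
    return (v * w**p) @ v.conj().T

def worst_relative_excess(n=4, trials=300, seed=1):
    """Largest sampled value of (|Tr[X^s Y^t X^(1-s) Y^(1-t)]| - Tr[XY]) / Tr[XY]."""
    rng = np.random.default_rng(seed)
    worst = -np.inf
    for _ in range(trials):
        gx, gy = rng.standard_normal((2, n, n))
        x, y = gx @ gx.T, gy @ gy.T           # random real PSD matrices
        s = rng.uniform(0.5, 1.0)
        t = rng.uniform(0.5, 1.5 - s)         # enforces t <= 1 and s + t <= 3/2
        lhs = abs(np.trace(psd_power(x, s) @ psd_power(y, t)
                           @ psd_power(x, 1 - s) @ psd_power(y, 1 - t)))
        rhs = np.trace(x @ y)
        worst = max(worst, (lhs - rhs) / rhs)
    return worst

print(worst_relative_excess())   # expected to be <= 0 up to rounding
```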
The question they raised attracted the attention of other researchers. In particular, Bottazzi et al. (Ref. 5) gave another proof, for the case s = t, that (1.1) is valid for s + t ≤ 3/2. Instead of the majorization techniques used in Ref. 2, they used the Lieb–Thirring inequality and the Hölder inequality for matrix trace norms. Using these tools, they showed that for z = 1/4 + iy or z = 3/4 + iy, with y real,
|Tr[X^z Y^z X^{1−z} Y^{1−z}]| ≤ Tr[XY],
and then used the maximum modulus principle to conclude that (1.1) is valid for s = t, 1/4 ≤ t ≤ 3/4. Moreover, they proved that unless X and Y commute, this inequality is strict, and thus, for any given X and Y, the inequality extends to a wider interval depending on X and Y. However, 16 years after the original work of Ando, Hiai, and Okubo, Plevnik (Ref. 9) finally found a counterexample to the conjectured inequality (1.1) in the missing range 3/2 ≤ s + t ≤ 2, as well as a counterexample to (1.2).
We, unaware of these developments, attempted to show a monotonicity property for the Golden–Thompson inequality (Refs. 7 and 11) and were led to exactly the same inequality that Ando et al. (Ref. 2) had discussed 22 years earlier. Our proof for the 1 ≤ s + t ≤ 3/2 range is a little different, and we shall give that proof here. We also identify interesting conditions on X and Y under which (1.1) and (1.2) do hold for all 0 ≤ s, t ≤ 1 and apply this to prove our conjecture on the Golden–Thompson inequality in these cases. We shall also give a systematic construction of counterexamples for the 3/2 ≤ s + t ≤ 2 range that complement the example in Ref. 9 and show that not only is (1.2) false, but also it is even possible for Tr[X^s Y^t X^{1−s} Y^{1−t}] to be negative when X and Y are real positive semidefinite matrices.
II. CONDITIONS FOR VALIDITY OF THE AHO INEQUALITIES
We recall the Lieb–Thirring inequality (Ref. 8), which says that for all r ≥ 1 and any positive semidefinite n × n matrices A and B,
Tr[(B^{1/2} A B^{1/2})^r] ≤ Tr[B^{r/2} A^r B^{r/2}]. (2.1)
Later, Araki (Ref. 3) proved that the inequality reverses for 0 < r < 1. It was shown by Friedland and So (Ref. 6) that for r > 1, the inequality is strict unless A and B commute.
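The Lieb–Thirring inequality and Araki's reversal can be illustrated numerically. The sketch below assumes the standard form Tr[(B^{1/2} A B^{1/2})^r] ≤ Tr[B^{r/2} A^r B^{r/2}] for r ≥ 1; the helper names and the random test matrices are hypothetical, chosen only for this illustration.

```python
import numpy as np

def psd_power(a, p):
    """a**p for a symmetric positive semidefinite matrix, via its eigendecomposition."""
    w, v = np.linalg.eigh(a)
    w = np.clip(w, 0.0, None)
    return (v * w**p) @ v.T

def lieb_thirring_sides(a, b, r):
    """Return (Tr[(B^(1/2) A B^(1/2))^r], Tr[B^(r/2) A^r B^(r/2)])."""
    root_b = psd_power(b, 0.5)
    inner = root_b @ a @ root_b
    inner = (inner + inner.T) / 2            # symmetrize against rounding
    lhs = np.trace(psd_power(inner, r))
    rhs = np.trace(psd_power(b, r / 2) @ psd_power(a, r) @ psd_power(b, r / 2))
    return lhs, rhs

rng = np.random.default_rng(5)
ga, gb = rng.standard_normal((2, 4, 4))
a, b = ga @ ga.T, gb @ gb.T                  # random real PSD matrices
for r in (0.5, 1.0, 3.0):
    lhs, rhs = lieb_thirring_sides(a, b, r)
    print(r, lhs, rhs)   # lhs >= rhs for r < 1, equal at r = 1, lhs <= rhs for r > 1
```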
Here and throughout this paper, X and Y are positive semidefinite matrices. We will use (2.1) to estimate ‖X^{1/p} Y^{1/p}‖_p for various values of p ≥ 1. Since
‖X^{1/p} Y^{1/p}‖_p^p = Tr[(Y^{1/p} X^{2/p} Y^{1/p})^{p/2}],
we may apply (2.1) to get an upper bound on ‖X^{1/p} Y^{1/p}‖_p taking r = p/2, provided p/2 ≥ 1 or, equivalently, 1/p ≤ 1/2. [Otherwise, by Araki’s complement to (2.1), we would get a lower bound.] In summary,
‖X^{1/p} Y^{1/p}‖_p^p ≤ Tr[XY] for p ≥ 2, and ‖X^{1/p} Y^{1/p}‖_p^p ≥ Tr[XY] for 1 ≤ p ≤ 2.
As in Ref. 5, we shall use the generalized Hölder inequality for trace norms (see, e.g., Simon’s book, Ref. 10). For any three n × n matrices A, B, and C and any p, q, r ≥ 1 with 1/p + 1/q + 1/r = 1,
‖ABC‖_1 ≤ ‖A‖_p ‖B‖_q ‖C‖_r.
(This generalizes in the obvious way to products of arbitrarily many matrices.)
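A quick numerical illustration of the three-factor Hölder inequality, in the form ‖ABC‖_1 ≤ ‖A‖_p ‖B‖_q ‖C‖_r, is sketched below. The Schatten norms are computed from singular values; the function name `schatten_norm`, the exponents (2, 3, 6), and the random matrices are assumptions made for the example.

```python
import numpy as np

def schatten_norm(a, p):
    """Schatten p-norm: the l^p norm of the singular values of a."""
    sv = np.linalg.svd(a, compute_uv=False)
    return float((sv**p).sum() ** (1.0 / p))

rng = np.random.default_rng(2)
a, b, c = rng.standard_normal((3, 4, 4))     # three arbitrary real 4 x 4 matrices
p, q, r = 2.0, 3.0, 6.0                      # 1/2 + 1/3 + 1/6 = 1
lhs = schatten_norm(a @ b @ c, 1.0)          # trace norm of the product
rhs = schatten_norm(a, p) * schatten_norm(b, q) * schatten_norm(c, r)
print(abs(np.trace(a @ b @ c)), lhs, rhs)    # |Tr[ABC]| <= ||ABC||_1 <= product of norms
```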
The next theorem is a small generalization of the result in Ref. 2 in that we consider four positive semidefinite matrices instead of only two.
The assumption that the two powers of X sum to 1 is not a real restriction. Given two arbitrary positive powers a, b, we may rename X^{a+b} to be X and define s ≔ max{a, b}/(a + b) and similarly for Y.
In Ref. 2, Theorem 2.1 was generalized to n X’s and n Y’s, and our method of proof of Theorem 2.1 using the Lieb–Thirring inequality likewise generalizes. This generalization will not be needed in the rest of this paper, and we do not discuss it here.
We now present several results that provide conditions on X and Y under which (1.1) and (1.2) are valid for all (s, t) ∈ [1/2, 1] × [1/2, 1]. We will use the following lemma:
The matrix M(s) is the Hadamard product of two positive semidefinite matrices, namely, X^s and the transpose of X^{1−s}, and as such, it is positive semidefinite. However, the off-diagonal entries need not be positive or even real.
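The Hadamard-product structure can be checked directly. The sketch below assumes that M(s) denotes the matrix with entries (X^s)_{i,j} (X^{1−s})_{j,i}, as the description above suggests, and that Y is diagonal with eigenvalues y_1, …, y_n, in which case Tr[X^s Y^t X^{1−s} Y^{1−t}] = Σ_{i,j} M_{i,j}(s) y_i^{1−t} y_j^t. The names `hadamard_m` and `psd_power` and the random data are illustrative only.

```python
import numpy as np

def psd_power(a, p):
    """a**p for a Hermitian positive semidefinite matrix, via its eigendecomposition."""
    w, v = np.linalg.eigh(a)
    w = np.clip(w, 0.0, None)
    return (v * w**p) @ v.conj().T

def hadamard_m(x, s):
    """Entrywise product of X^s with the transpose of X^(1-s)."""
    return psd_power(x, s) * psd_power(x, 1 - s).T

rng = np.random.default_rng(4)
g = rng.standard_normal((4, 4)) + 1j * rng.standard_normal((4, 4))
x = g @ g.conj().T                       # complex Hermitian PSD matrix
y = rng.uniform(0.1, 2.0, size=4)        # eigenvalues of a diagonal Y
s, t = 0.7, 0.9
m = hadamard_m(x, s)
direct = np.trace(psd_power(x, s) @ np.diag(y**t)
                  @ psd_power(x, 1 - s) @ np.diag(y**(1 - t)))
via_m = (y**(1 - t)) @ m @ (y**t)        # sum_{i,j} y_i^(1-t) M_ij y_j^t
print(np.linalg.eigvalsh(m).min())       # non-negative up to rounding (Schur product theorem)
print(abs(direct - via_m))               # the two expressions agree up to rounding
```

Written this way, it is plain that non-negative entries of M(s) make every term in the double sum non-negative, which is the situation the subsequent results exploit.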
Our first application of Lemma 2.5 is to pairs of operators of a sort that arise frequently in mathematical physics. For X > 0, define H = −log(X) so that X = e^{−H}. Suppose that in a basis in which Y is diagonal, all off-diagonal entries of H are non-positive, i.e.,
H_{i,j} ≤ 0 for all i ≠ j. (2.9)
For example, this is the case if H is the graph Laplacian on an unoriented graph (with the graph theorist’s sign convention that the graph Laplacian is non-negative); see Example 3.3.
It is well-known that under these conditions, as a consequence of the Beurling–Deny Theorem (Ref. 4, Theorem 5), the semigroup e^{−sH} is positivity preserving, and so, in particular, (e^{−sH})_{i,j} ≥ 0 for all s ≥ 0 and all i, j. For the reader’s convenience, we recall the relevant part of their proof adapted to our setting: Take λ > 0 sufficiently small that I + λH is invertible. Then, for any vector f, u = (I + λH)^{−1} f is the unique minimizer of
F(u) ≔ ‖u − f‖² + λ⟨u, Hu⟩.
The uniqueness follows from the strict convexity of F for sufficiently small λ > 0. Under condition (2.9), when f = |f|, F(|u|) ≤ F(u). Hence, (I + λH)^{−1} maps the positive cone into itself, and all entries of this matrix are non-negative. The same is evidently true of (I + λH)^{−n} for all n. Taking λ = s/n and n → ∞, the same is true of e^{−sH} for all s ≥ 0.
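The argument can be followed numerically on a small example. The sketch below uses the Laplacian of a path graph on five vertices as a hypothetical H with non-positive off-diagonal entries, checks that the resolvent (I + λH)^{−1} and the semigroup e^{−sH} have non-negative entries, and compares (I + (s/n)H)^{−n} with e^{−sH}. The graph, the values of s and n, and the helper names are assumptions made for illustration.

```python
import numpy as np

def path_graph_laplacian(n):
    """Graph Laplacian D - A of the path graph on n vertices (non-negative convention)."""
    adj = np.zeros((n, n))
    idx = np.arange(n - 1)
    adj[idx, idx + 1] = adj[idx + 1, idx] = 1.0
    return np.diag(adj.sum(axis=1)) - adj

def semigroup(h, s):
    """exp(-s H) for a real symmetric matrix H, via its eigendecomposition."""
    w, v = np.linalg.eigh(h)
    return (v * np.exp(-s * w)) @ v.T

h = path_graph_laplacian(5)
s, n = 1.0, 200
resolvent = np.linalg.inv(np.eye(5) + (s / n) * h)   # (I + lambda H)^(-1), lambda = s/n
approx = np.linalg.matrix_power(resolvent, n)        # (I + (s/n) H)^(-n)
exact = semigroup(h, s)
print(resolvent.min() >= -1e-12, exact.min() >= -1e-12)
print(np.abs(approx - exact).max())                  # small, and shrinks as n grows
```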
One may also use Lemma 2.5 to show that both (1.1) and (1.2) are valid for 2 × 2 matrices, as was already shown in Ref. 2: Write . Then, by the usual integral representation formula for X^s, 0 < s < 1,
showing that for all 0 < α < 1, is a positive multiple of −z, and hence, (2.7) is always true.
Our next theorem provides another class of examples of positive matrices X and Y for which (1.1) is true for all 1/2 ≤ s, t ≤ 1. A related theorem, for a version of (1.1) with the operator norm in place of the trace, has recently been proved in Ref. 1 by quite different means.
Let H and K be arbitrary self-adjoint n × n matrices. Then, there exists an α0 > 0 depending on H and K so that for all α < α0, with X ≔ e^{αH} and Y ≔ e^{αK}, (1.1) is valid for all 1/2 ≤ s, t ≤ 1.
Of course, replacing t by 1 − t and s by 1 − s, the same proof shows, with the same α0, that when α ≤ α0,
Replacing s by is and t by it (with i the imaginary unit) yields
and hence, for [X, Y] ≠ 0 and α sufficiently small,
Thus, the three-lines argument in Ref. 5 cannot hold for s, t sufficiently close to 1 or 0.
III. THE MONOTONICITY OF THE GOLDEN–THOMPSON INEQUALITY
Let H and K be self-adjoint n × n matrices. For 0 ≤ u ≤ 1, define
f_{H,K}(u) ≔ Tr[e^{H+(1−u)K} e^{uK}]. (3.1)
Then, f_{H,K}(0) = Tr[e^{H+K}] and f_{H,K}(1) = Tr[e^H e^K], and by the Golden–Thompson inequality, f_{H,K}(0) ≤ f_{H,K}(1). In this section, we ask the following: When is f_{H,K}(u) monotone increasing in u? We shall prove that this is the case for an interesting class of pairs (H, K) of self-adjoint matrices, and we shall show that it is not true in general.
Suppose that K is diagonal and that all off-diagonal entries of H are non-negative. Then, f_{H,K}(u) is monotone increasing.
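The statement can be probed numerically. The following sketch takes f_{H,K}(u) = Tr[e^{H+(1−u)K} e^{uK}], builds a random real symmetric H with non-negative off-diagonal entries and a random diagonal K, and checks that the sampled values of f_{H,K} are non-decreasing on a grid. The matrix size, the random seed, and the helper names `sym_expm` and `gt_interpolation` are hypothetical choices for this illustration, and a grid check is of course not a proof.

```python
import numpy as np

def sym_expm(a):
    """exp(a) for a real symmetric matrix, via its eigendecomposition."""
    w, v = np.linalg.eigh(a)
    return (v * np.exp(w)) @ v.T

def gt_interpolation(h, k, u):
    """f_{H,K}(u) = Tr[exp(H + (1-u)K) exp(uK)]."""
    return np.trace(sym_expm(h + (1 - u) * k) @ sym_expm(u * k))

rng = np.random.default_rng(3)
n = 4
g = np.abs(rng.standard_normal((n, n)))
h = (g + g.T) / 2                              # symmetric, off-diagonal entries >= 0
np.fill_diagonal(h, rng.standard_normal(n))    # the diagonal of H is unrestricted
k = np.diag(rng.standard_normal(n))            # K is diagonal
us = np.linspace(0.0, 1.0, 21)
values = np.array([gt_interpolation(h, k, u) for u in us])
print(values[0], values[-1])                   # Tr[e^(H+K)] and Tr[e^H e^K]
print(np.all(np.diff(values) >= -1e-10 * abs(values[-1])))
```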
IV. COUNTEREXAMPLES
This section presents the construction of counterexamples, showing that inequalities (1.1) and (1.2) cannot hold in general, even in the 3 × 3 case, and showing that the monotonicity property established in Theorem 3.2 under specified conditions cannot hold in general. While counterexamples for (1.1) and (1.2) were found by Plevnik (Ref. 9), our goal is to provide a systematic approach to their construction. Plevnik provided two completely separate and purely numerical counterexamples to (1.1) and (1.2). We provide a method for constructing a family of counterexamples that goes further in significant ways. For example, while Plevnik showed in Ref. 9 (Example 2.5) that (1.2) can be violated, his example does not show that it is possible for Tr[X^s Y^t X^{1−s} Y^{1−t}] to be negative. We show that this is the case. Moreover, our construction shows that the failure of inequalities (1.1) and (1.2) as well as the failure, in general, of the monotonicity of the Golden–Thompson inequality described in Theorem 3.2 are all closely connected. Essentially, one example undoes all three would-be conjectures.
We have seen in Lemma 2.5 that if all of the entries of M(s) are non-negative, then (1.1) and (1.2) both hold. In constructing our counterexamples, we shall take X to be real, and hence, the entries of M(s) will be real for each s.
We may assume that the entries of y are positive since the left-hand side of (4.1) does not change when we add to y any multiple of the vector each of whose entries is 1.
We now claim that if X ≥ 0 is a real 3 × 3 matrix, for any 0 < s < 1, M(s) has at most one entry above the diagonal that is negative. [By Remark 2.6, all diagonal entries are non-negative, and M(s) is symmetric, so the same is true below the diagonal.] To see this, take the vector y to be of the form (0, 1, 1), (1, 0, 1), or (1, 1, 0). Then, for these choices, (4.1) becomes
M_{1,2}(s) + M_{1,3}(s) ≥ 0, M_{1,2}(s) + M_{2,3}(s) ≥ 0, M_{1,3}(s) + M_{2,3}(s) ≥ 0.
Thus, each pair of entries above the diagonal must have a non-negative sum, and hence, no two can be negative.
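This sign constraint is easy to test on random data. The sketch below assumes that M(s) is the entrywise product of X^s and X^{1−s} for a real symmetric positive semidefinite X, and checks on random 3 × 3 examples that every pair of entries above the diagonal has a non-negative sum and that at most one such entry is negative. The function names, the sampling scheme, and the tolerances are illustrative assumptions.

```python
import numpy as np

def psd_power(a, p):
    """a**p for a real symmetric positive semidefinite matrix."""
    w, v = np.linalg.eigh(a)
    w = np.clip(w, 0.0, None)
    return (v * w**p) @ v.T

def m_matrix(x, s):
    """Entrywise product of X^s and X^(1-s); symmetric when X is real."""
    return psd_power(x, s) * psd_power(x, 1 - s).T

def scan(trials=500, seed=6):
    """Return (smallest pairwise sum / Tr X, largest count of negative upper entries)."""
    rng = np.random.default_rng(seed)
    min_sum, max_negatives = np.inf, 0
    for _ in range(trials):
        g = rng.standard_normal((3, 3))
        x = g @ g.T
        m = m_matrix(x, rng.uniform(0.01, 0.99)) / np.trace(x)
        upper = np.array([m[0, 1], m[0, 2], m[1, 2]])
        sums = [upper[0] + upper[1], upper[0] + upper[2], upper[1] + upper[2]]
        min_sum = min(min_sum, min(sums))
        max_negatives = max(max_negatives, int((upper < -1e-12).sum()))
    return min_sum, max_negatives

print(scan())   # first component non-negative up to rounding, second at most 1
```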
One might hope that one could construct counterexamples to (1.1) and (1.2) by constructing matrices X > 0 for which M_{i,j}(t) < 0 for all t ∈ (0, 1/2) ∪ (1/2, 1). This is easy to do, but this alone does not yield counterexamples.
For example, define . This matrix is easily diagonalized; the eigenvalues are 4, 2, and 0. Since , one might expect that changes sign at s = 1/2, and only there so that M1,3(s) ≤ 0 for all 0 < s < 1. Indeed, doing the computations, one finds
Now take with a, b > 0 and distinct. Then,
For fixed s ∉ {0, 1/2, 1}, this is strictly concave in t and symmetric about t = 1/2, so the maximum occurs only at t = 1/2 and the minimum only at t ∈ {0, 1}. However, since , we do not have , which would provide a counterexample to (1.1) but instead . As we have just seen, this is less than Tr[X^{1−s} Y^{1/2} X^s Y^{1/2}], and by Lemma 2.5, this, in turn, is less than Tr[XY]. In fact, defining h(t) ≔ 4^{t−1/2} + 4^{1/2−t}, we can rewrite (4.3) as
Then, from (4.4),
By the arithmetic-geometric mean inequality, a + b − a^t b^{1−t} − a^{1−t} b^t ≥ 0 for all 0 ≤ t ≤ 1. Since h(s) is evidently convex and symmetric about s = 1/2, for each fixed t ∈ (0, 1), Tr[X^{1−s} Y^{1−t} X^s Y^t] is a strictly convex function of s, symmetric about s = 1/2. Therefore, this function is minimized only for s = 1/2 and maximized only for s ∈ {0, 1}, and hence, for any t,
and the right side is independent of t since . Hence, (1.2) is satisfied for all choices of a, b > 0. Likewise, by what was proved above, for all s, t, with , which is an orthogonal projection,
and hence, (1.1) is satisfied for all choices of a, b > 0.
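Two elementary facts used in the argument above, the convexity and symmetry of h(t) = 4^{t−1/2} + 4^{1/2−t} and the non-negativity of a + b − a^t b^{1−t} − a^{1−t} b^t, can be spot-checked on a grid. This is only a sketch: the grid and the sample values of a and b are arbitrary choices, and the function name `amgm_gap` is hypothetical.

```python
def h(t):
    """h(t) = 4^(t - 1/2) + 4^(1/2 - t); convex and symmetric about t = 1/2."""
    return 4.0 ** (t - 0.5) + 4.0 ** (0.5 - t)

def amgm_gap(a, b, t):
    """a + b - a^t b^(1-t) - a^(1-t) b^t, non-negative by the weighted AM-GM inequality."""
    return a + b - a**t * b**(1 - t) - a**(1 - t) * b**t

grid = [i / 100 for i in range(101)]
symmetric = all(abs(h(t) - h(1 - t)) < 1e-12 for t in grid)
# Midpoint convexity on the grid; the small slack absorbs rounding errors.
convex = all(h((s + t) / 2) <= (h(s) + h(t)) / 2 + 1e-12 for s in grid for t in grid)
nonnegative = all(amgm_gap(a, b, t) >= -1e-12
                  for a in (0.1, 1.0, 7.0) for b in (0.2, 1.0, 3.0) for t in grid)
print(symmetric, convex, nonnegative)
```

The non-negativity follows from a^t b^{1−t} ≤ ta + (1 − t)b and a^{1−t} b^t ≤ (1 − t)a + tb, which sum to a + b.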
This shows that the construction of counterexamples is more subtle than simply producing negative entries in M(s). It appears that the key to the construction of counterexamples for 3 × 3 matrices is to choose X so that one of the inequalities in (4.1) to is nearly saturated, with one of the summands negative for most values of s. Furthermore, it is natural to choose X and Y to be perturbations of positive semidefinite matrices X0 and Y0 such that for all 0 ≤ s, t ≤ 1. Of course, this is satisfied if X0 and Y0 are orthogonal projections with mutually orthogonal ranges.
Our construction relies on the Householder reflection determined by two distinct unit vectors u and v. This is given by H_{u,v} ≔ I − 2‖u − v‖^{−2}|u − v⟩⟨u − v|. Evidently, H_{u,v} is self-adjoint and orthogonal, with H_{u,v}u = v and H_{u,v}v = u. For simplicity, choose
Then,
Now choose
Then, X0 and Y0 are orthogonal projections such that X0Y0 = 0.
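The two ingredients of this construction can be illustrated in a few lines. The sketch below builds the reflection I − 2|w⟩⟨w|/‖w‖² with w = u − v for real unit vectors and verifies that it swaps u and v; it also checks that two orthogonal projections with mutually orthogonal ranges have vanishing product. The particular vectors and projections here are hypothetical and are not the ones chosen in the text.

```python
import numpy as np

def householder_swap(u, v):
    """Reflection I - 2 |w><w| / ||w||^2 with w = u - v; swaps real unit vectors u and v."""
    w = u - v
    return np.eye(len(u)) - 2.0 * np.outer(w, w) / (w @ w)

u = np.array([1.0, 0.0, 0.0])                  # hypothetical unit vectors
v = np.array([1.0, 1.0, 1.0]) / np.sqrt(3.0)
refl = householder_swap(u, v)
print(np.allclose(refl @ u, v), np.allclose(refl @ v, u))
print(np.allclose(refl, refl.T), np.allclose(refl @ refl, np.eye(3)))

# Orthogonal projections onto mutually orthogonal lines (again hypothetical choices).
e2 = np.array([0.0, 1.0, 0.0])
x0, y0 = np.outer(u, u), np.outer(e2, e2)
print(np.allclose(x0 @ y0, 0.0))               # X0 Y0 = 0
```

Note the normalization by ‖u − v‖², which is what makes the map an involution.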
Now, we make a simple perturbation. For a, b > 0 small, to be chosen later, define
and also for 0 < t < 1, define
Then,
The off-diagonal entries of UYU will not change sign as t varies, but we can make this happen by applying a further orthogonal transformation; define , and finally put
where R^T is the transpose of R, with x, a, and b to be chosen later. We compute
and
We seek a small perturbation of X0, and hence, we will take a, b, and |x| all to be small positive numbers. It is easy to see that the sign change we seek occurs in if we take a ≪ b ≪ 1 and 0 < x ≪ 1, and occurs in if we take b ≪ a ≪ 1 and 0 < x ≪ 1.
Note that the only difference between the two examples is that we have swapped the values assigned to a and b; all other parameters are left the same. Numerical plots show that in both cases, the maximum value of |M_{1,3}(t) + M_{2,3}(t)| is less than 10^{−3} times the maximum of |M_{1,3}(t)| + |M_{2,3}(t)| so that the last inequality in (4.1) is nearly saturated; there is near cancellation in the sum M_{1,3}(t) + M_{2,3}(t). Note that in our counterexample to (1.1), the sum of the exponents s + t is 1.58, not much larger than 3/2, the largest value of s + t for which Theorem 2.1 rules out such a counterexample. It would be of interest to see if one can build on this construction, possibly extending it into higher dimensions, to show that the condition s + t ≤ 3/2 in Theorem 2.1 is sharp.
We close by showing that the monotonicity property for the Golden–Thompson inequality described in Theorem 3.2 does not hold for arbitrary self-adjoint matrices H and K.
Recall that f_{H,K}(u) has been defined by (3.1),
f_{H,K}(u) = Tr[e^{H+(1−u)K} e^{uK}].
With X and Y as above, we define K = log(X) and H = log(Y). Since H is diagonal, the integral can be explicitly evaluated as a Hadamard product. One finds
This shows that the monotonicity proved in Theorem 3.2 is not true for general self-adjoint H and K.
ACKNOWLEDGMENTS
We thank Victoria Chayes and Rupert Frank for useful conversations. E.A.C. gratefully acknowledges partial support from the U.S. National Science Foundation (Grant No. DMS 2055282).
AUTHOR DECLARATIONS
Conflict of Interest
The authors have no conflicts to disclose.
DATA AVAILABILITY
Data sharing is not applicable to this article as no new data were created or analyzed in this study.