In this paper, we consider the defocusing Hartree nonlinear Schrödinger equations on $\mathbb{T}^3$ with real-valued and even potential V whose Fourier multiplier decays like $|k|^{-\beta}$. By relying on the method of random averaging operators [Deng et al., arXiv:1910.08492 (2019)], we show that there exists $\beta_0$, which is less than but close to 1, such that for $\beta > \beta_0$, we have invariance of the associated Gibbs measure and global existence of strong solutions in its statistical ensemble. In this way, we extend Bourgain’s seminal result [J. Bourgain, J. Math. Pures Appl. 76, 649–702 (1997)], which requires β > 2 in this case.
I. INTRODUCTION
In this paper, we study the invariant Gibbs measure problem for the nonlinear Schrödinger (NLS) equation on $\mathbb{T}^3$ with Hartree nonlinearity. Such an equation takes the form
$$(i\partial_t+\Delta)u=(|u|^2*V)\cdot u, \tag{1.1}$$
where V is a convolution potential. We will assume that it satisfies the following properties:
(1) V is real-valued and even, and so is $\widehat{V}$.
(2) (1.1) is defocusing, i.e., V ≥ 0.
(3) V acts like β antiderivatives, i.e., $\widehat{V}(0)=1$ and $|\widehat{V}(k)|\lesssim\langle k\rangle^{-\beta}$ for some β ≥ 0.
A typical example of such V is the Bessel potential $\langle\nabla\rangle^{-\beta}$ of order β on $\mathbb{T}^3$, which can be written in the form (for some c > 0) $V(x) = c|x|^{-(3-\beta)} + K(x)$ for 0 < β < 3 and $x\in\mathbb{T}^3\setminus\{0\}$, where K is a real-valued smooth function on $\mathbb{T}^3$. Note that when V is the δ function (and β = 0), we recover the usual cubic NLS equation. Our main result (see Theorem 1.3) establishes invariance of the Gibbs measure for (1.1) when V is the Bessel potential of order β with β < 1 sufficiently close to 1, greatly improving the previous result of Bourgain,7 which assumes that β > 2 (see also Remark 1.4).
A. Background
Equation (1.1) can be viewed as a regularized or tempered version of the cubic NLS equation, and both naturally arise in the limit of quantum many-body problems for interacting bosons (see, e.g., Refs. 21 and 33 and references therein). An important question, both physically and mathematically, is to study the construction and dynamics of the Gibbs measure for (1.1), which is a Hamiltonian system.
1. Gibbs measure construction
The Gibbs measure, which we henceforth denote by dν, is formally expressed as
where H[u] is the renormalization of the Hamiltonian,
Rigorously making sense of (1.2) is closely linked to the construction of the measure in quantum field theory, which has attracted much interest since the 1970s and 1980s (Refs. 1, 20, 22, 26, 27, and 31) and in recent years (Refs. 3, 4, 21, and 32). In the case of (1.1), the answer actually depends on the value of β. When β > 1/2 (see Ref. 36), the measure dν can be defined as a weighted version of the Gaussian measure dρ, namely,
where $:\!|u|^2(V*|u|^2)\!:$ is a suitable renormalization of the nonlinearity [see (1.12) for a precise definition] and the Gaussian free field dρ is defined as the law of distribution for the random variable (see Ref. 37),
with {gk(ω)} being i.i.d. normalized centered complex Gaussians. On the other hand, if 0 < β ≤ 1/2, then dν is a weighted version of a shifted Gaussian measure, which is singular with respect to dρ1. These results were proved recently by Bringmann11 and Oh, Okamoto, and Tolomeo28 by adapting the variational methods of Barashkov and Gubinelli.3
We remark that, in either case mentioned above, it can be shown that the Gibbs measure dν is supported in $H^{-1/2-}(\mathbb{T}^3)$, the same space as dρ1. In particular, the typical element in the support of dν has infinite mass, which naturally leads to the renormalizations in the construction of dν alluded to above; see Sec. I B. From the physical point of view, it is also worth mentioning that, in the same way (1.1) is derived from quantum many-body systems, the Gibbs measure dν, with the correct renormalizations, can also be obtained by taking the limit of thermal states of such systems, at least when V is sufficiently regular (see Refs. 21 and 32).
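The support statement can be checked by a short second-moment computation. The sketch below assumes that the Gaussian free field is realized as the random Fourier series $f(\omega)=\sum_{k\in\mathbb{Z}^3}\langle k\rangle^{-1}g_k(\omega)e^{ik\cdot x}$ with $\mathbb{E}|g_k|^2=1$ and $\langle k\rangle=\sqrt{1+|k|^2}$; this explicit form and the normalization (constants such as powers of $2\pi$ are ignored) are assumptions made for illustration.

```latex
\begin{align*}
\mathbb{E}\,\|f\|_{H^s(\mathbb{T}^3)}^2
  &= \sum_{k\in\mathbb{Z}^3}\langle k\rangle^{2s}\,
     \frac{\mathbb{E}|g_k|^2}{\langle k\rangle^{2}}
   = \sum_{k\in\mathbb{Z}^3}\langle k\rangle^{2s-2},\\
\sum_{k\in\mathbb{Z}^3}\langle k\rangle^{2s-2}<\infty
  &\iff 2s-2<-3 \iff s<-\tfrac12 .
\end{align*}
```

In particular, taking s = 0 gives $\sum_k\langle k\rangle^{-2}=\infty$, so the expected mass is infinite, which is consistent with the need for renormalization.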
2. Gibbs measure dynamics and invariance
Of the same importance as the construction of the Gibbs measure is the study of its dynamics and rigorous justification of its invariance under the flow of (1.1). The question of proving invariance of Gibbs measures for infinite dimensional Hamiltonian systems, with interest from both mathematical and physical aspects, has been extensively studied over the last few decades. In fact, it is the works of Refs. 5, 6, and 26—which attempted to answer this question in some special cases—that mark the very beginning of the subject of random data partial differential equations (PDEs).
The literature is now extensive, so we will only review those related to NLS equations. After the construction of Gibbs measures in Ref. 26, the first invariance result was due to Bourgain,5 which applies in one dimension for focusing sub-quintic equations and for defocusing equations with any power nonlinearity. Bourgain6 then extended the defocusing result to two dimensions, but only for the cubic equation; the two-dimensional case with arbitrary (odd) power nonlinearity was recently solved by the authors.17 For the case of Hartree nonlinearity (1.1) in three dimensions, Bourgain7 obtained invariance for β > 2. We also mention the works of Tzvetkov34,35 and of Bourgain and Bulut,8,9 which concern the NLS equation inside a disk or ball, the construction of non-unique weak solutions by Oh and Thomann30 following the scheme in Refs. 2, 13, and 15, and the relevant works on wave equations.11,12,14,28,29 In particular, the recent work of Bringmann12 established Gibbs measure invariance for the wave equation with the Hartree nonlinearity (1.1) for arbitrary β > 0.
The main mathematical challenge in proving invariance of Gibbs measure is the low regularity of the support of the measure, especially in two or more dimensions. For example, for the two-dimensional NLS equation with power nonlinearity, the support of the Gibbs measure dν lies in the space of distributions $H^{0-}$, while the scaling critical space is $H^{1/2}$ for the quintic equation and approaches $H^{1}$ for equations with high power nonlinearities. This gap is a major reason why the two-dimensional quintic and higher cases have remained open for so many years. In the case of (1.1), a similar gap is present, namely, between the support of dν at $H^{-1/2-}$ and the scaling critical space $H^{(1-\beta)/2}$, which is higher than $H^{0}$ with β < 1.
On the other hand, it is known since the pioneering work of Bourgain6 that with the random initial data, one can go below the classical scaling critical threshold and obtain almost-sure well-posedness results. In the recent works of the authors (Refs. 17 and 18), an intuitive probabilistic scaling argument was performed. This leads to the notion of the probabilistic scaling critical index $s_{\mathrm{pr}} := -1/(p-1)$, which is much lower than the classical scaling critical index $s_{\mathrm{cr}} := (d/2) - 2/(p-1)$ in the case of pth power nonlinearity in d dimensions. In Ref. 18, we proved that almost-sure local well-posedness indeed holds in $H^s$ in the full probabilistic subcritical range $s > s_{\mathrm{pr}}$, in any dimension, and for any (odd) power nonlinearity.
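For orientation, the two indices can be evaluated in a few cases using only the formulas quoted above; the choice of the sample pairs (d, p) is ours and is meant purely as a worked illustration.

```latex
\begin{align*}
s_{\mathrm{cr}} &= \frac d2-\frac{2}{p-1}, &
s_{\mathrm{pr}} &= -\frac{1}{p-1}, &
s_{\mathrm{cr}}-s_{\mathrm{pr}} &= \frac d2-\frac{1}{p-1},\\
(d,p)=(2,3):\quad s_{\mathrm{cr}} &= 0, & s_{\mathrm{pr}} &= -\tfrac12, & &\\
(d,p)=(2,5):\quad s_{\mathrm{cr}} &= \tfrac12, & s_{\mathrm{pr}} &= -\tfrac14, & &\\
(d,p)=(3,3):\quad s_{\mathrm{cr}} &= \tfrac12, & s_{\mathrm{pr}} &= -\tfrac12. & &
\end{align*}
```

As p grows with d = 2 fixed, $s_{\mathrm{cr}}$ increases to 1 while $s_{\mathrm{pr}}$ increases to 0, so data of regularity just below 0 remain probabilistically subcritical for every odd p.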
For the case of (1.1), a similar argument as in Refs. 17 and 18 yields that the probabilistic scaling critical index for (1.1) is spr = (−1 − β)/2, which is lower than −1/2, so it is reasonable to think that almost-sure well-posedness would be true. However, the situation here is somewhat different from that in Refs. 17 and 18 due to the asymmetry of the nonlinearity (1.1) compared to the power one, which leads to interesting modifications of the methods in these previous works, as we will discuss in Sec. I C.
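As a quick consistency check, the Hartree indices can be compared with the regularity $-1/2$ of the Gibbs measure support. The values $s_{\mathrm{pr}}=(-1-\beta)/2$ and $s_{\mathrm{cr}}=(1-\beta)/2$ are the ones quoted in this paper; only the elementary comparison below is added.

```latex
\begin{align*}
s_{\mathrm{pr}}=\frac{-1-\beta}{2} < -\frac12 &\iff \beta>0,\\
s_{\mathrm{cr}}-s_{\mathrm{pr}}
  =\frac{1-\beta}{2}-\frac{-1-\beta}{2}&=1,\\
\beta=0:\quad (s_{\mathrm{cr}},\,s_{\mathrm{pr}})&=\left(\tfrac12,\,-\tfrac12\right).
\end{align*}
```

The last line agrees with the cubic NLS values in three dimensions obtained from the power-nonlinearity formulas with (d, p) = (3, 3), as expected when V is the δ function.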
3. Probabilistic methods
The first idea in proving almost-sure well-posedness was due to Bourgain6 and Da Prato and Debussche,15 the latter in the setting of parabolic stochastic partial differential equations (SPDEs), and can be described as a linear–nonlinear decomposition. Namely, the solution is decomposed into a linear random evolution (or noise) term and a nonlinear term that has strictly higher regularity, thanks to probabilistic cancellations in the nonlinearity. If the linear term has regularity close to scaling criticality, then the nonlinear term can usually be bounded sub-critically; hence, a fixed point argument applies. However, this idea has its limitations in that the nonlinear term may not be smooth enough, and in practice, it is usually limited to slightly supercritical cases (relative to deterministic scaling) and does not give optimal results.
In Ref. 17, inspired partly by the regularity structure theory of Hairer and the para-controlled calculus by Gubinelli, Imkeller, and Perkowski in the parabolic SPDE setting, we developed the theory of random averaging operators. The main idea is to take the high–low interaction, which is usually the worst contribution in the nonlinear term described above, and express it as a para-product-type linear operator, called the random averaging operator, applied to the random initial data. Moreover, this linear operator is independent of the initial data it applies to and has a randomness structure, which includes the information of the solution at lower scales; see Sec. I C. This structure is then shown to be preserved from low to high frequencies by an induction on scales argument and eventually leads to improved almost-sure well-posedness results. We refer the reader to Ref. 33 for an example of a recent application of the method of random averaging operators of Ref. 17 to weakly dispersive NLS.
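In symbols, the linear–nonlinear decomposition can be sketched as follows. The sketch assumes the equation is written as $(i\partial_t+\Delta)u=\mathcal{N}(u)$ with random data $u(0)=f(\omega)$; this sign convention and the symbol $\mathcal{N}$ for the nonlinearity are assumptions made for illustration.

```latex
\begin{align*}
u(t) &= e^{it\Delta}f(\omega) + v(t),\\
v(t) &= -i\int_0^t e^{i(t-s)\Delta}\,
        \mathcal{N}\big(e^{is\Delta}f(\omega)+v(s)\big)\,ds .
\end{align*}
```

The fixed point argument is then run for v alone, in a space of higher regularity than that of the linear evolution $e^{it\Delta}f(\omega)$.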
In Ref. 18, the random averaging operators are extended to the more general theory of random tensors. In this theory, the linear operators are extended to multilinear operators, which are represented by tensors, and whole algebraic and analytic theories are then developed for these random tensors. For NLS equations with odd power nonlinearity, this theory leads to the proof of optimal almost-sure well-posedness results; see Ref. 18. We remark that, while the theory of random tensors is more powerful than random averaging operators, the latter has a simpler structure, is less notation-heavy, and is already sufficient in many situations (especially if one is not very close to probabilistic criticality).
Finally, we would like to mention other probabilistic methods, developed in the recent works of Gubinelli, Koch, and Oh,25 Bringmann,10,12 and Oh, Okamoto, and Tolomeo.28 These methods also go beyond the linear–nonlinear decomposition and are partly inspired by the parabolic theories. They have important similarities and differences compared to our methods in Refs. 17 and 18, but they mostly apply to wave equations instead of Schrödinger equations, so we will not elaborate further here and instead refer the reader to the above papers.
B. Setup and the main result
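We start by fixing the i.i.d. normalized (complex) Gaussian random variables $\{g_k(\omega)\}_{k\in\mathbb{Z}^3}$ so that $\mathbb{E}g_k=0$ and $\mathbb{E}|g_k|^2=1$. Let
and it is easy to see that $f(\omega)\in H^{-1/2-}(\mathbb{T}^3)$ almost surely. Let $V:\mathbb{T}^3\to\mathbb{R}$ be a potential such that V is even, non-negative, and $V_0 = 1$, $|V_k|\lesssim\langle k\rangle^{-\beta}$ as described above. Here and below, we will use $u_k$ to denote the Fourier coefficients of u and use $\widehat{u}$ to represent the time Fourier transform only. In this paper, we fix β < 1 sufficiently close to 1 (the threshold is a specific value, but we do not track it below). Let $N\in 2^{\mathbb{Z}_{\ge 0}}$ be a dyadic scale, define projections $\Pi_N$ such that $(\Pi_N u)_k=\mathbf{1}_{\langle k\rangle\le N}\,u_k$ and $\Delta_N=\Pi_N-\Pi_{N/2}$, and define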
We introduce the following truncated and renormalized versions of (1.1), with the truncated random initial data:
Here, in (1.7), we fix

and is a Fourier multiplier,
Note that uN is supported in ⟨k⟩ ≤ N for all time. The first counterterm in (1.7), namely, −σNuN, corresponds to the standard Wick ordering, where one fixes k1 = k2 in the expression,
plugs in $u = f_N(\omega)$, and takes expectations. The second counterterm corresponds to fixing $k_2 = k_3$ and is present due to the asymmetry of the nonlinearity $(|u|^2 * V)\cdot u$. Note that the corresponding multiplier is uniformly bounded and, thus, unnecessary if β > 1 (in particular, this is the case in Bourgain7); if β < 1, it becomes a divergent term that needs to be subtracted.
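To see where the two counterterms come from, one can expand the nonlinearity in Fourier coefficients. The sketch below assumes that Fourier series are normalized so that convolution with V multiplies the kth coefficient by $V_k$ (with $V_0=1$), and that $\mathbb{E}|(f_N)_k|^2=\langle k\rangle^{-2}\mathbf{1}_{\langle k\rangle\le N}$; these normalizations and the resulting explicit constants are assumptions for illustration and are not taken verbatim from (1.8) and (1.9).

```latex
\begin{align*}
\big[(|u|^2*V)\,u\big]_k
  &= \sum_{k_1-k_2+k_3=k} V_{k_1-k_2}\,u_{k_1}\overline{u_{k_2}}\,u_{k_3},\\
k_1=k_2:\quad
  &V_0\Big(\sum_{k_1}|u_{k_1}|^2\Big)u_k
   \;\xrightarrow{\ \mathbb{E},\ u=f_N\ }\;
   \Big(\sum_{\langle \ell\rangle\le N}\frac{1}{\langle \ell\rangle^{2}}\Big)u_k ,\\
k_2=k_3:\quad
  &\Big(\sum_{k_2}V_{k-k_2}|u_{k_2}|^2\Big)u_k
   \;\xrightarrow{\ \mathbb{E},\ u=f_N\ }\;
   \Big(\sum_{\langle \ell\rangle\le N}\frac{V_{k-\ell}}{\langle \ell\rangle^{2}}\Big)u_k ,\\
\sum_{\ell\in\mathbb{Z}^3}\frac{\langle k-\ell\rangle^{-\beta}}{\langle \ell\rangle^{2}}<\infty
  &\iff \beta>1 .
\end{align*}
```

Here the arrows mean that the quadratic factor is replaced by its expectation. The first sum grows linearly in N, so it always diverges, whereas the majorant in the last line shows that the second multiplier stays bounded uniformly in N and k exactly when β > 1.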
Equation (1.7) is a finite dimensional Hamiltonian equation with the Hamiltonian
where .
Proposition 1.2. Suppose that V is the Bessel potential of order β and β > 1/2; then, $G_N(u)$ converges to a limit G(u) in $L^q(d\rho)$ for all 1 ≤ q < ∞, and the sequence of measures $d\nu_N$ converges to a probability measure dν in total variation. The measure dν is called the Gibbs measure associated with system (1.1).
Proof. This is proved in the recent works of Bringmann11 and Oh, Okamoto, and Tolomeo.28 Strictly speaking, they deal with the case of real-valued u (as they are concerned with the wave equation), but the proof can be readily adapted to the complex-valued case here.□
Now, we can state our main theorem.45
The condition that V is the Bessel potential of order β in Theorem 1.3 is required only because of the assumption of Proposition 1.2, which is proved in Refs. 11 and 28. In fact, all the proofs in this paper are mainly about the almost-sure local-in-time well-posedness and only require that V satisfies the following: (1) V is real-valued and even, and so is $\widehat{V}$; (2) V ≥ 0; and (3) $\widehat{V}(0)=1$ and $|\widehat{V}(k)|\lesssim\langle k\rangle^{-\beta}$.
As in Refs. 17 and 18, the sequence {uN} can be replaced by other canonical approximation sequences, for example, with the sharp truncations ΠN on the initial data replaced by smooth truncations or with the projection ΠN onto the nonlinearity in (1.7) omitted. The limit obtained does not depend on the choice of such sequences, and the proof will essentially be the same.
1. Regarding the range of β
The range of β obtained in Theorem 1.3 is clearly not optimal. In fact, Eq. (1.1) with Gibbs measure data is probabilistically subcritical as long as β > 0, and one should expect the same result at least when β > 1/2 (so that the Gibbs measure is absolutely continuous with respect to the Gaussian free field).
The purpose of this paper, however, is to provide an example where the method of random averaging operators17 significantly improves the existing probabilistic results (β close to but smaller than 1 vs β > 2 in Ref. 7) while keeping the presentation relatively short. In order to treat all β > 1/2, one would need to adapt the sophisticated theory of random tensors,18 which would considerably increase the length of this work, so we leave this part to a future paper.
As for the case 0 < β ≤ 1/2, one would need to deal with the mutual singularity between the Gibbs measure and the Gaussian free field [if one instead studies the local well-posedness problem with Gaussian initial data as in (1.5), which is different from Gibbs data, then a modification of the random tensor theory18 would likely work for all β > 0]. The recent work of Bringmann12 provides a nice example where this issue is solved in the context of wave equations, and it would be interesting to see whether this can be extended to Schrödinger equations. Finally, the case β = 0, which is the famous Gibbs measure invariance problem for the three-dimensional cubic NLS equation, remains an outstanding open problem. It is probabilistically critical and would presumably require completely new techniques to solve.
C. Main ideas
Due to the absolute continuity of the Gibbs measure in Proposition 1.2, in order to prove Theorem 1.3, we only need to consider the initial data distributed according to dρ for (the renormalized version of) (1.1) and the initial data distributed according to dρN for (1.7). In other words, we may assume that u(0) = f(ω) for (1.1) and uN(0) = fN(ω) for (1.7).
1. Random averaging operators
Let us focus on (1.7); for simplicity, we will ignore the renormalization terms. The approach of Bourgain and of Da Prato and Debussche corresponds to decomposing
where fN is as in (1.6) and v(t) is the nonlinear evolution. In particular, this v(t) contains a trilinear Gaussian term,
This term turns out to have only $H^{0-}$ regularity, which is not regular enough for a fixed point argument (note that the classical scaling critical threshold is $H^{(1-\beta)/2}$). Therefore, this approach does not work.
Nevertheless, one may observe that the only contribution to $v^*$ that has the worst ($H^{0-}$) regularity is when the first two input factors are at low frequency and the third factor is at high frequency, such as
for N′ ≪ N and FN as in (1.6). Moreover, this low frequency component fN′ may also be replaced by the corresponding nonlinear term at frequency N′, so it makes sense to separate the low–low–high interaction term ψN defined by
as the singular part of yN ≔ uN − uN/2 so that yN − ψN has higher regularity.
The idea of considering high–low interactions is consistent with the para-controlled calculus in Refs. 23–25. However, in those works, the singular term $\psi_N$ and the regular term $y_N-\psi_N$ are characterized only by their regularity (for example, one is constructed via fixed point argument in $H^{0-}$ and the other in $H^{1/2-}$), which, as pointed out in Ref. 17, is not enough in the context of Schrödinger equations. Instead, it is crucial that one studies the operator, referred to as the random averaging operator in Ref. 17, which maps z to the solution to the following equation:
Note that the kernel of this operator, which we denote by $H^N$, is a Borel function of $(g_k(\omega))_{\langle k\rangle\le N/2}$ and is independent of $F_N(\omega)$. Moreover, this $H^N$ encodes the whole randomness structure of $u_{N/2}$, which is captured in two particular matrix norm bounds for $H^N$. Essentially, they involve the operator norm and the Hilbert–Schmidt norm for fixed time t (or fixed Fourier variable λ); see Sec. II B 2 for details.
This is the main idea of the random averaging operators in Ref. 17. Basically, it allows one to fully exploit the randomness structure of the solution at all scales, which is necessary for the proof in the setting of Schrödinger equations in the absence of any smoothing effect.
2. The special term ρN: A “critical” component
In addition to the ansatz introduced in Sec. I C 1, it turns out that an extra term is necessary due to the structure (especially, the asymmetry) of the nonlinearity (1.1). Recall that $(|u|^2 * V)u$ can be expressed as in (1.10); for simplicity, we will ignore any resonances (which are canceled by the renormalizations), i.e., assume that $k_2\notin\{k_1,k_3\}$ in (1.10). Here, if $|k_1-k_2|\gtrsim N^{\varepsilon}$ for some small constant ε, then the potential factor $V_{k_1-k_2}$, which is bounded by $N^{-\varepsilon\beta}$, will transform into a derivative gain, which allows one to close easily using the random averaging operator ansatz in Sec. I C 1.
However, suppose that |k1 − k2| is very small, say, |k1 − k2| ∼ 1 in (1.10); then, the potential does not lead to any gain of derivatives, and we will see that this particular term, in fact, exhibits some (probabilistically) “critical” feature. To see this, let us define to be this portion of nonlinearity (and the corresponding multilinear expression),
and note the Π1 projector restricting to |k1 − k2| ∼ 1. Then, if we define the iteration terms
it follows from simple calculations that $u^{(0)}$ has regularity $H^{-1/2-}$, while each $u^{(m)}$, where m ≥ 1, has exactly regularity $H^{1/2-}$. Therefore, although $u^{(1)}$ is indeed more regular than $u^{(0)}$, the higher order iterations are not getting smoother despite all input functions [which are $F_N(\omega)$] having the same (and high) frequency. This is in contrast with the “genuinely (probabilistically) subcritical” situations (for the standard NLS equation) in Ref. 18, where for fixed positive constants ε and c, the mth iteration $u^{(m)}$, assuming that all input frequencies are the same, will have increasing and positive regularity in $H^{\varepsilon m-c}$ as m grows and becomes large. Similarly, one may consider the linear operator,
with the nonlinearity as in (1.18), and in typical subcritical cases, the norm of this operator from a suitable $X^{s,b}$ space to itself would be $N^{-\alpha}$ for some α > 0; see Refs. 17 and 18. However, here (for Hartree), one can check that the corresponding norm is, in fact, ∼1 and may even exhibit a logarithmic divergence if one adds up different scales.
Therefore, it is clear that the contribution as in (1.18) needs a special treatment in addition to the ansatz in Sec. I C 1. Fortunately, this term does not depend on the value of β and was already treated in the work of Bourgain.7 In this work, we introduce an extra term ρN, which corresponds to the term treated in the work of Bourgain,7 by defining ξN such that
and defining $\rho_N=\xi_N-\psi_N$, where the cutoff in the equation for $\xi_N$ is a smooth truncation at frequency $N^{\varepsilon}$ for some small ε. This term is then measured at regularity $H^{s}$ for some s < 1/2, while the remainder term $z_N := y_N-\xi_N$, where $y_N=u_N-u_{N/2}$, is measured at regularity $H^{s'}$ for some s < s′ < 1/2. See Sec. III A 3 for the solution ansatz and Proposition 3.1 for the precise formulations.
3. Additional remark
Note that the precise definitions of the equations satisfied by $\psi_N$ and $\xi_N$ [see (3.2) and (3.8)] involve the projection $\Delta_N$ on the right-hand sides; this is to make sure that $\psi_N$ and $\xi_N$ are exactly supported in N/2 < ⟨k⟩ ≤ N so that one can exploit the cancellation due to the unitarity of the matrices $H^N$ (corresponding to $\psi_N$), as well as the matrices $M^N$ that correspond to the term $\xi_N$. This unitarity comes from the mass conservation property of the linear equations defining these matrices and already plays a key role in the work of Bourgain.7 See Sec. III B for details.
II. PREPARATIONS
A. Reduction of the equation
We start with system (1.7) with initial data $u_N(0)=f_N(\omega)$. Clearly, $u_N$ is supported in ⟨k⟩ ≤ N. Writing the right-hand side of (1.7) in Fourier space, we have
We will extend this right-hand side, which is a cubic polynomial of u, to an $\mathbb{R}$-trilinear operator in the standard way. Note that

is conserved under the flow (1.7), and we may get rid of the second term on the right-hand side of (2.1) by a gauge transform,
If we further define the profile vN by
then $v_N$ will satisfy the integral equation,
where
B. Notations and norms
We set up some basic notations and norms needed later in the proof.
1. Notations
As above, we use $v_k$ to denote Fourier coefficients, and $\widehat{v}$ denotes the Fourier transform in time. For a finite index set A, we will write $k_A=(k_j: j\in A)$, where each $k_j\in\mathbb{Z}^3$, and denote by $h=h_{k_A}$ a tensor $h:(\mathbb{Z}^3)^A\to\mathbb{C}$. We may also define tensors involving λ variables, where each $\lambda\in\mathbb{R}$.
We fix the parameters to be used in the proof as follows. Let ε > 0 be a sufficiently small absolute constant. Let $\varepsilon_1$ and $\varepsilon_2$ be fixed such that $\varepsilon_2\ll\varepsilon_1\ll\varepsilon$. Let β < 1 be such that $1-\beta\ll\varepsilon_2$, and choose δ such that δ ≪ 1 − β and κ such that $\kappa\gg\delta^{-1}$. We use θ to denote any generic small positive constant such that θ ≪ δ (which may be different at different places). Let $b=1/2+\kappa^{-1}$, so $1-b=1/2-\kappa^{-1}$. Finally, let τ be sufficiently small compared to all the above-mentioned parameters, and denote J = [−τ, τ]. Fix a smooth cutoff function χ(t), which equals 1 for |t| ≤ 1 and equals 0 for |t| ≥ 2, and define $\chi_\tau(t) := \chi(\tau^{-1}t)$. We use C to denote any large absolute constant and $C_\theta$ for any large constant depending on θ. If some event happens with probability $\ge 1-C_\theta e^{-A^{\theta}}$, where A is a large parameter, we say that this event happens A-certainly.
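For quick reference, the parameter choices above can be collected in a single display. Nothing new is assumed here beyond the relations stated in the text; the bound on $\kappa^{-1}$ is simply the condition $\kappa\gg\delta^{-1}$ rewritten.

```latex
\begin{gather*}
\theta \ll \delta \ll 1-\beta \ll \varepsilon_2 \ll \varepsilon_1 \ll \varepsilon \ll 1,
\qquad \kappa^{-1} \ll \delta,\\
b=\tfrac12+\kappa^{-1},\qquad 1-b=\tfrac12-\kappa^{-1},\qquad
\chi_\tau(t)=\chi(\tau^{-1}t),\qquad J=[-\tau,\tau].
\end{gather*}
```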
2. Norms
If (B, C) is a partition of A, namely, B ∩ C = ∅ and B ∪ C = A, we define the norm such that
The same notation also applies for tensors involving the λ variables. For functions u = uk(t) and h = hkk′(t) and 0 < c < 1, we also define the norms
For any interval I, define the corresponding localized norms
and similarly define Yc(I) and Zc(I). By abusing notations, we will call the above v an extension of u, though it is actually an extension of the restriction of u to I.
C. Preliminary estimates
Here, we record some basic estimates. Most of them are standard or are in our previous works.17,18
1. Linear estimates
Define the original and truncated Duhamel operators,
See Ref. 18, Lemma 4.2.□
We only need to bound locally in-time the function f*(t), which equals f(0) for t ≥ 0 and f(t) for t < 0; in fact, g is obtained by performing twice the transformation from f to f*, first at center τ and then at center −τ.
We can decompose f into two parts, f1, which is smooth and equals f(0) near 0, and f2 such that f2(0) = 0. Clearly, we only need to consider f2 so that f* equals f2 multiplied by a smooth truncation of 1[0,+∞), with f2(0) = 0.
2. Counting estimates
Here, we list some counting estimates and the resulting tensor norm bounds.
Let or . Then, given and , the number of choices for that satisfies
For dyadic numbers N1, N2, N3, R > 0 and some fixed number Ω0,
(1) It is the same as part (1) of Lemma 4.3 in Ref. 17. (2) We consider |SR|. First, the number of choices of k1 and k3 is . After fixing the choice of k1 and k3 to count (k, k2), it is equivalent to count k2 satisfying the restriction |k2|2 + |k2 + c1|2 = c2 or to count k satisfying the restriction |k|2 + |k + c3|2 = c4 for some fixed numbers c1, … , c4, and hence, we have . Similarly, if we first fix k and k2, we have . In addition, if we fix k2 first, then to count (k, k1, k3) is equivalent to count (k1, k3) with the restriction (k2 − k1) · (k2 − k3) = c for some fixed number c. By fixing the first two components of (k1, k3) and using part (1), we have . Similarly, we also have ). The proofs of (2.18)–(2.24) are similar.□
3. Probabilistic and tensor estimates
For the proof of Proposition 2.8, see Ref. 18, Propositions 4.14 and 4.15.
III. THE ANSATZ
A. The structure of yN
Start with systems (2.3) and (2.4). Let yN = vN − vN/2, and then, yN satisfies the integral equation,
1. The term ψN,L
For any L ≤ N/2, consider the linear equation for Ψ = Ψk(t),
where we define, with δ ≪ 1,
and define also . If (3.2) has initial data Ψk(0) = ΔNϕk, then the solution may be expressed as
where is the kernel of a linear operator (or a matrix). Define also
and similarly,
Note that when L = 1, we will replace L/2 by 0, so, for example, . For simplicity, denote
Note that each $h^{N,L}$ and $H^{N,L}$ is a Borel function of $(g_k(\omega))_{\langle k\rangle\le N/2}$ and is thus independent of the Gaussians in $F_N$.
2. The terms ξN and ρN
Next, similar to (3.2), we consider the linear equation,
where is defined by
If the initial data are Ξk(0) = ΔNϕk, then we may write the solution as
which defines the matrix . We then define ξN and ρN by
3. The ansatz
Now, we introduce the ansatz
where zN is a remainder term. We can calculate that zN solves the following equation (recall yN = vN − vN/2):
B. Unitarity of matrices HN,L and MN
The following properties of H and M will play a fundamental role. This idea goes back to that of Bourgain.7 Recall that for L ≤ N/2, the matrix HN,L is defined by (3.2) and (3.4). Note that if Ψ solves (3.2), then Ψk(t) is supported in N/2 < ⟨k⟩ ≤ N and recall that , and then, we have
The sum on the right-hand side may be replaced by two terms, namely, $S_1$, where we only require $k_1\neq k_2$ in the summation, and $S_2$, where we require $k_1\neq k_2$ and $k_2=k_3$ in the summation. For $S_1$, by swapping $(k,k_1,k_2,k_3)\mapsto(k_3,k_2,k_1,k)$, we see that $\overline{S_1}=S_1$ and, hence, $\mathrm{Im}(S_1)=0$; moreover,
which is also real-valued by swapping $(k,k_2)\mapsto(k_2,k)$. This means that $\sum_k|\Psi_k(t)|^2$ is conserved in time. Therefore, for each fixed t, the matrix $H^{N,L}(t)$ is unitary; hence, we get the identity
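The mechanism behind this conservation law can be isolated in a generic finite-dimensional model. The sketch below assumes that the linear equation is written abstractly as $i\partial_t\Psi=A(t)\Psi$ with $A(t)$ a Hermitian matrix acting on the coefficients $(\Psi_k)$, and that $U(t)$ denotes the solution map; the symbols $A(t)$ and $U(t)$ are hypothetical names and do not reproduce the precise form of (3.2).

```latex
\begin{align*}
\frac{d}{dt}\sum_k|\Psi_k(t)|^2
  &= 2\,\mathrm{Re}\,\langle \partial_t\Psi,\Psi\rangle
   = 2\,\mathrm{Re}\big(-i\,\langle A(t)\Psi,\Psi\rangle\big)
   = 2\,\mathrm{Im}\,\langle A(t)\Psi,\Psi\rangle = 0,\\
\Psi(t)=U(t)\Psi(0)\ \text{for all }\Psi(0)
  &\implies U(t)^*U(t)=\mathrm{Id}=U(t)\,U(t)^* .
\end{align*}
```

The first line uses that $\langle A\Psi,\Psi\rangle$ is real when A is Hermitian, which plays the role of the symmetry under swapping indices used above; the second uses that an isometry of a finite-dimensional space is unitary.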
C. The a priori estimates
We now state the main a priori estimate and prove that this implies Theorem 1.3.
Given 0 < τ ≪ 1, let J = [−τ, τ]. Recall the parameters defined in Sec. II B. For any M, consider the following statements, which we call Local(M):
For the operators hN,L, where L < M and N > L is arbitrary, we have
For the terms ρN and zN, where N ≤ M, we have
For any L1, L2 < M, the operator defined by
Therefore, the solution uN to (1.7) converges to a unique limit as N → ∞ up to an exceptional set with probability . This proves the almost-sure local well-posedness of (1.1) with Gibbs measure initial data. Since the truncated Gibbs measure dηN defined by (1.13) is invariant under (1.7) and the truncated Gibbs measures converge strongly to the Gibbs measure dν as in Proposition 1.2, we can apply the standard local-to-global argument of Bourgain, where the a priori estimates in Proposition 3.1 allow us to prove the suitable stability bounds needed in the process, in exactly the same way as in Ref. 17. The almost-sure global existence and invariance of Gibbs measure then follow.□
D. A few remarks and simplifications
From now on, we will focus on the proof of Proposition 3.1 and assume that the bounds involved in Local(M/2) are already true. The goal is to recover (3.16)–(3.18) and (3.20)–(3.21) for M. Before proceeding, we remark on a few simplifications that we will make in the proof below. These are either standard or the same as in Refs. 17 and 18, and we will not spell out these arguments in detail.
In proving these bounds, we will use the standard continuity argument, which involves a smallness factor (the localized $X^c$ norm, $X^c([0,T])$, is continuous in T if the function is smooth, which enables the continuity argument). Here, this factor is provided by the short time τ ≪ 1. In particular, we can gain a positive power $\tau^{\theta}$ by using Proposition 2.3 (see Ref. 38) at the price of changing the c exponent in the $X^c$ (or $Y^c$ or $Z^c$) norm by a little (e.g., from 1 − b to b). It can be checked in the proof below that all the estimates allow for some room in c, so this is always possible.
In each proof below, we can actually gain an extra power Mδ/10 compared to the desired estimate, so any loss that is will be acceptable. In fact, in the proof below, we will frequently encounter losses of at most due to manipulations of the c exponent in various norms as in (1) and due to the application of probabilistic bounds such as Proposition 2.8 where we lose a small θ power.
In the course of the proof, we will occasionally need to bound quantities of the form $\sup_\lambda|G(\lambda)|$, where λ ranges in an interval I, and for each fixed λ, the quantity |G(λ)| can be bounded apart from a small exceptional set; moreover, G will be differentiable and G′(λ) will satisfy a weaker but unconditional bound. Then, we can apply the meshing argument in Refs. 17 and 18, where we divide the interval into a large number of subintervals, approximate G on each small interval by a sample (or an average), control the error term using G′, and add up the exceptional sets corresponding to the sample in each interval. In typical cases, where M-certainly $|G(\lambda)|\le M^{\theta}$ for each fixed λ, $|I|\le M^{C}$, and $|G'(\lambda)|\le M^{C}$ unconditionally, we can deduce that M-certainly $\sup_\lambda|G(\lambda)|\le M^{\theta}$ because the number of subintervals is $O(M^{C})$, so the total probability for the union of exceptional sets is still sufficiently small.
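The meshing argument can be written out in a few lines. The sketch below assumes mesh points $\lambda_1,\dots,\lambda_n\in I$ with spacing $h=M^{-C}$ and writes $p(M)$ for the probability of the exceptional set at a single fixed λ; the names h, n, and p(M) are hypothetical and are introduced only for this illustration.

```latex
\begin{align*}
\sup_{\lambda\in I}|G(\lambda)|
  &\le \max_{1\le j\le n}|G(\lambda_j)| + h\,\sup_{\lambda\in I}|G'(\lambda)|
   \le M^{\theta} + M^{-C}\cdot M^{C} \le 2M^{\theta},\\
n &\le |I|/h + 1 \le M^{2C}+1,\\
\mathbb{P}\big(\exists\, j:\ |G(\lambda_j)|>M^{\theta}\big)
  &\le n\,p(M) \le (M^{2C}+1)\,p(M).
\end{align*}
```

The factor 2 is absorbed by slightly enlarging θ, and the exponent 2C is still a generic large constant. If p(M) decays like $C_\theta e^{-M^{\theta}}$, the polynomial factor in the last line does not affect the M-certain conclusion.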
IV. THE RANDOM AVERAGING OPERATOR
A. The operator
We start by proving (3.20) and (3.21) for L = M/2. We need to construct an extension of defined in (3.19). This is done first using Lemma 2.4 to find extensions of each component of and [note that max(L1, L2) = M/2] such that these extension terms satisfy (3.16)–(3.18) with the localized Xb(J), …, norms replaced by the global Xb, …, norms, at the expense of some slightly worse exponents. The change in the value in exponents will play no role in the proof below, so we will omit it. Then, by attaching to a factor χ(τ−1t) and using Lemma 2.3 (see Sec. III D), we can gain a smallness factor τθ at the price of further worsening the exponents. These operations are standard, so we will not repeat them below.
Note that the extension defined in Lemma 2.4 preserves the independence between the matrices and for Rj ≤ Lj/2.
Recall that is the Fourier transform of the kernel of , and we have
Now, we consider the different cases.
where $\Omega=|k|^2-|k_1|^2+|k_2|^2-|k'|^2$ and I = I(λ, μ) is as in (2.10); we will omit the factor $\eta((k_1-k_2)/N^{1-\delta})$ appearing in the definition in (3.3) as it does not play a role. We may also assume that $|k_1-k_2|\sim R\lesssim L$. In the above expression, let $\mu := \lambda-(\Omega+\lambda_1-\lambda_2+\lambda')$; in particular, we have $|I|\lesssim\langle\lambda\rangle^{-1}\langle\mu\rangle^{-1}$ by (2.10). By a routine argument, in proving (3.20), we may assume that $|\lambda_j|\le L^{100}$ and $|\mu|\lesssim L^{100}$; in fact, if, say, $|\lambda_1|$ is the maximum of these values and $|\lambda_1|\ge L^{100}$ (the other cases being similar), then we may fix the values of $k_j$ and, hence, $k-k'$, at a loss of at most $L^{12}$, and reduce to estimating
with |λ1| ∼ K ≥ L100 and for each j. By estimating w1 in the unweighted L2 norm, we can gain a power K−1/2, and using the integrability of , which follows from the weighted L2 norm, we can fix the value of λ2. In the end, this leads to
and hence,
which is more than enough because if is supported, where k − k′ is constant.
Now, we may assume |λj| ≤ L100 for j ∈ {1, 2} and |μ| ≤ L100; we may also assume as, otherwise, we gain from the weights ⟨λ⟩2(1−b) and ⟨λ′⟩−2b in (3.20). Similarly, in proving (3.21), we may assume |λj| ≤ N100 for j ∈ {1, 2}, |μ| ≤ N100, and |λ| + |λ′| ≤ N100 (otherwise, we may also fix (k, k′) and argue as above). Therefore, in proving (3.21), we may replace the unfavorable exponents ⟨λ⟩2b⟨λ′⟩−2(1−b) by the favorable ones ⟨λ⟩2(1−b)⟨λ′⟩−2b at a price of ; this will be acceptable since in the proof, we will be able to gain a power N−δ/2. We remark that in the proof below (though not here), we may use the Y1−b norm as in (3.16) for the matrices in the decomposition of ; using the bounds of λj as above, we may replace the exponent 1 − b by b (which then implies integrability) again at a loss of either or depending on whether we are proving (3.20) or (3.21), which is acceptable. See also Sec. III D.
This then allows us to fix the values of λj in (4.2) using the integrability coming from the weighted norms; moreover, by using the bound |I| ≲ ⟨λ⟩−1⟨μ⟩−1, upper bounds for λ and μ as mentioned above, and the weights in (3.20) and (3.21), we may also fix the values of λ, λ′, and ⌊μ⌋ and reduce to estimating the following quantity:
where the tensor (which we call the base tensor)
with some value Ω0 determined by λj, λ, λ′, and ⌊μ⌋. Here, we also assume |kj| ≲ Lj and |k1 − k2| ∼ R ≲ L and .
Now, (4.3) is easily estimated using Proposition 2.7, which gives
which is enough for (3.20) [namely, we multiply this by the factor ⟨λ⟩−1 coming from I and the weight ⟨λ⟩1−b⟨λ′⟩−b in (3.20) and then take the L2 norm in λ and λ′ to get (3.20); the same happens below]. In particular, the norm is bounded as
by Schur’s bound, where and are defined similarly as in Lemma 2.5.
For the norm, we have
which is enough for (3.21). Note that all the bounds for hb we use here follow from Schur’s bound and Lemma 2.5.
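Schur's bound, used repeatedly above, controls the operator norm of a matrix by the geometric mean of its largest absolute row sum and largest absolute column sum. The sketch below checks this numerically on a toy 0/1 matrix; the matrix `h` here is a hypothetical stand-in chosen only for illustration and is not the base tensor hb of the paper, and the helper names are assumptions.

```python
import math
import random

def schur_bound(h):
    """Schur's test: ||h||_op <= sqrt(max abs row sum * max abs column sum)."""
    rows, cols = len(h), len(h[0])
    row = max(sum(abs(x) for x in r) for r in h)
    col = max(sum(abs(h[i][j]) for i in range(rows)) for j in range(cols))
    return math.sqrt(row * col)

def matvec(h, v):
    """Matrix-vector product for a list-of-lists matrix."""
    return [sum(hij * vj for hij, vj in zip(r, v)) for r in h]

def l2(v):
    """Euclidean norm of a vector."""
    return math.sqrt(sum(x * x for x in v))

random.seed(0)
n = 30
# Toy indicator matrix of a sparse relation between the two indices.
h = [[1.0 if (i * i - j * j) % 7 == 0 else 0.0 for j in range(n)]
     for i in range(n)]
B = schur_bound(h)

# ||h v|| / ||v|| never exceeds the Schur bound, whatever v is.
ratios = []
for _ in range(200):
    v = [random.gauss(0, 1) for _ in range(n)]
    ratios.append(l2(matvec(h, v)) / l2(v))
```

For indicator matrices such as this one, the row and column sums are exactly the counting quantities that lattice-point estimates control, which is why the two tools are used together.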
Suppose that is replaced by and is replaced by . We may further decompose into for R2 ≤ L2/2 (including the case R2 = 0 by which we mean ) and perform the same arguments as mentioned above fixing the λ variables and reduce (this reduction step actually involves a meshing argument as the estimate for is probabilistic; see Sec. III D) to estimating the following quantity:
where and h(2) is independent of and is either the identity matrix or satisfies and . We then estimate (4.4) by
using Propositions 2.6 and 2.8, which is enough for (3.20). Note that here hb depends on k and k′ only via k − k′ and |k|2 − |k′|2 and that , given the assumptions, so Proposition 2.8 is applicable. Similarly, for the norm, we have
which is enough for (3.21).
Suppose that is replaced by for j ∈ {1, 2}. In this case, we will start from (3.19) and expand
for j ∈ {1, 2}. There are then two cases, namely, when = or otherwise.
If ≠ , then we can repeat the above argument [including further decomposing into using (3.6) and (3.7)] and fix the time Fourier variables and reduce to estimating a quantity
where h(j) is independent of and is either the identity matrix or satisfies and . Since ≠ , we can apply Proposition 2.8 either in (, ) jointly (if L1 = L2) or first in and then in (if, say, L1 ≥ 2L2) and get that
which is enough for (3.20). As for the norm, we have
which is enough for (3.21).
Finally, assume that = , and then, L1 = L2 = L. In (3.19), the summation in = gives
Using the cancellation (3.15) since k1 ≠ k2, we can replace the factor in the above expression by ; then, by further decomposing into by (3.7) and repeating the above arguments, we can reduce to estimating the following quantity:
where h(j) is either the identity matrix or satisfies and . Note that we may assume |kj − | ≲ RjLδ using the bound (3.17), so, in particular, we have
up to a loss of LCδ (which is acceptable as in this case, we can gain at least ). Using these, we estimate, assuming without loss of generality that R1 ≥ R2,
B. The matrices HN,L and hN,L
and we also extend its kernel in the same way as we do for in Sec. IV A. Let , and then, by induction hypothesis and the proof in Sec. IV A, we know that also satisfies estimates (3.20) and (3.21). Clearly, (3.20) implies that ; moreover, it is easy to see that
and hence, , and the same holds for . By interpolation, we obtain that for α ∈ {b, 1 −b} (note that we can always gain a positive power of τ using Lemma 2.3; see Sec. III D). Moreover, consider the kernel , and then, we also have the following bound:
which follows from (3.20). If we replace the factor ⟨λ′⟩−b by 1, then a simple argument shows that
(and the same for ) by using that
and then fixing the Fourier modes of vL. Interpolating again, we get that
A similar interpolation gives
Now, let
and then, it is easy to see that satisfies the same bounds (4.12) and (4.13) with right-hand sides replaced by 1; for example, (4.12) follows from iterating the following bound:
provided that
and (4.13) is proved similarly. Define further
By iterating the Xα → Xα bounds and using also (3.21) for , we can show that
The weighted bound
is shown in the same way but using Proposition 2.9.
In addition, we can also show that
and similarly,
assuming (4.14).
Now, we can finally prove (3.16) and (3.17). In fact, by definition of and , there exists an extension of hN,L such that
V. ESTIMATES FOR ρN
In this section, we prove the first bound in (3.18) regarding ρN, assuming N = M. Recall that from (3.2), (3.4), and (3.8), we deduce that ρN satisfies the following equation:
with initial data . Let be defined as in (4.11), and denote ; from Sec. IV B, we know that is well-defined and has kernel in physical space and in Fourier space. Then, (5.1) can be reduced to
where
Here, in (5.3), we assume for j ∈ {1, 2} that , where max(N1, N2) = N, and that w3 ∈ {ψN, ρN}.
In order to prove the bound for ρN in (3.18), we will apply a continuity argument, namely, assuming (3.18) and then improving it with a smallness factor. This can be done as long as we bound
since from Sec. IV B, we know that is bounded from Xb(J) to Xb(J). In fact, we will prove (5.4) with an extra gain , which will allow us to ignore any possible NCδ loss in the process. The smallness factor τθ will be provided by Lemma 2.3 as in Sec. III D, so we will not worry about it below. We divide the right-hand side of (5.3) into three terms:
Term I: when w3 = ρN.
Term II: when w3 = ψN and zN′ ∈ {w1, w2} for some N′ ≥ N/2.
Term III: when w3 = ψN and w1, w2 ∈ {ψN, ρN, ψN/2, ρN/2}.
Note that these are the only possibilities since if (say) N1 = N, w1 ∈ {ψN, ρN}, and N2 ≤ N/2, then we must have N2 = N/2 due to the support condition for ψN and ρN, as well as the restriction |k1 − k2| ≲ Nε in . Moreover, the estimate of term I follows from the operator norm bound,
A. Term II
For simplicity, we assume that w1 = zN′ (the proof of the case w2 = zN′ is similar). There are then two cases to consider, when or when .
1. The case
If , then we, in particular, have . By Lemma 2.4, we may fix an extension of w1 and w2 that satisfy the same bounds as they do but with Xb(J) replaced by Xb; moreover, they satisfy the same measurability conditions as w1 and w2. For simplicity, we will still denote them by w1 and w2. The same thing is done for w3 = ψN, as well as the corresponding matrices.
Now, by (5.3) and Lemma 2.2, we can find an extension of II, which we still denote by II for simplicity, such that
where Ω = |k|2 − |k1|2 + |k2|2 − |k3|2 and I = I(λ, μ) is as in (2.10). In the above expression, let μ ≔ λ − (Ω + λ1 − λ2 + λ3). In particular, we have |I| ≲ ⟨λ⟩−1⟨μ⟩−1 by (2.10). By a routine argument, we may assume that |λ| ≤ N100 and similarly for μ and each λj; in fact, if, say, |λ1| is the maximum of these values and |λ1| ≥ N100, then we may fix the values of k and all kj at a loss of at most N12 and reduce to estimating (with the value of Ω fixed)
with |λ1| ∼ K ≥ N100 and for each j. By estimating w1 in the unweighted L2 norm, we can gain a power K−1/2, and using the L1 integrability of that follows from the weighted L2 norms, we can fix the values of λj for j ∈ {2, 3}. In the end, this leads to
and hence, , which is more than enough for (3.18).
Now, with |λ| ≤ N100, …, we may apply the bounds (3.16)–(3.18), but for the extensions and global norms, and replace the Y1−b norm (if any) by the Yb norm at a loss of , which will be neglected as stated above. Similarly, as |λ| ≤ N100, we also only need to estimate II in the X1−b instead of Xb norm again at a loss of . Then, using L1 integrability in λj (together with a meshing argument; see Sec. III D), provided by weighted bounds (3.16)–(3.18) and the (almost) summability in μ due to the ⟨μ⟩−1 factor in (2.10), we may fix the values of λ, λj (1 ≤ j ≤ 3), and ⌊μ⌋(and hence, the value of ) and reduce to estimating the norm of the following quantity:
Here, in (5.7), we assume that |k1| ≤ N, |k2| ≤ N2, |k1 − k2| ≲ Nε, and N/2 < ⟨k3⟩, ⟨⟩ ≤ N and is fixed, and the inputs satisfy that
To estimate , we may assume |k1 − k2| ∼ R ≲ Nε and define the base tensor
with also the restrictions on kj as mentioned above. Then, we have
and hence,
By Proposition 2.8 and the independence between and , we get that
N-certainly. By the definition of hb and using Schur’s bound and counting estimates in Lemma 2.5 and noting that |k2| ≤ N2 and |k1 − k2| ≲ R, we can bound
Since also , we conclude that
which is enough for (3.18). This concludes the proof for term II when . Note that the above argument also works for the case when w1 = ρN and because here, we must have N2 ≥ N/2 due to the support condition of ρN and the assumption |k1 − k2| ≲ Nε, and the above arguments give the same (in fact, better) estimates.
2. The case
In this case, by repeating the first part of the arguments in Sec. V A 1, we can reduce to estimating the norm of the following quantity:
Here, in (5.9), we assume that |k1| ≤ N, |k2| ≤ N2, |k1 − k2| ∼ R ≲ Nε, N2/2 < ⟨k2⟩, ⟨⟩ ≤ N2, and N/2 < ⟨k3⟩, ⟨⟩ ≤ N, and is fixed, and the inputs satisfy that
Moreover, this H(j) is such that either H(j) = Id or with N3 = N. The sum in (5.9) can be decomposed into a term where ≠ and a term where = .
Case 1: ≠ . Let be defined as mentioned above, and it suffices to estimate the norm of the tensor
by using the ℓ2 norm of w1. If N3 = N, then the tensors hb, H(2), and H(3) are independent of and and ≠ , so we can apply Proposition 2.8; if N2 ≤ N/2, then hb, H(2), and H(3) and are all independent of , and moreover, hb and H(2) are independent of , so we can apply Proposition 2.8 iteratively, first for the sum in (k3, ) and then for the sum in (k2, ). In either case, by applying Proposition 2.8 and combining it with Proposition 2.7 and estimating H(j) in the kj → norm, we obtain N-certainly that the desired norm of the tensor is bounded by
Using the fact that |k2| ≤ N2 and |k3| ≤ N in the support of hb and Lemma 2.5 as above, we can show that
and hence,
which is enough for (3.18).
Case 2: = . In this case, we must have N2 = N, and we can reduce (5.9) to the following expression:39
where
As k2 ≠ k3 in (5.9) due to the definition of , we know that either H(2) or H(3) must not be identity, and hence, we have . By (5.11), we then simply estimate
using Lemma 2.5, noting that |k1 − k2| ≲ R and |k3| ≤ N. This completes the proof for term II.
B. Term III
Here, we assume w3 = ψN and w1, w2 ∈ {ψN, ρN, ψN/2, ρN/2}. We consider two possibilities, when w1, w2 ∈ {ψN, ψN/2}, which we call term IV, and when wj ∈ {ρN, ρN/2} for some j ∈ {1, 2}, which we call term V.
1. Term IV
Suppose w1, w2 ∈ {ψN, ψN/2}. We may also decompose them into for Lj ≤ Nj/2 and reduce to
where N1, N2 ∈ {N, N/2} and N3 = N. In (5.13), we consider two cases, depending on whether there is a pairing = or = or not.
Case 1: no-pairing. Assume that ∉ {, }, and then, we take the Fourier transform in the time variable t and repeat the first part of the arguments in Sec. V A 1 to reduce to estimating the norm of the following quantity:
In (5.14), we assume that |kj| ∼ N and |k1 − k2| ∼ R ≲ Nε and that the matrices h(j) are either identity or satisfy that
and moreover, we may assume that h(j) is supported in |kj − | ≲ LjNδ by inserting a cutoff exploiting (3.17). The norm for can then be estimated using Proposition 2.8 in the same way as in Sec. V A 2, either jointly in (, , ) if each Nj = N or first in those kj with Nj = N and then in those kj with Nj = N/2, so that N-certainly we have (with the base tensor hb defined as in Secs. V A 1 and V A 2)
using Lemma 2.5, which is enough for (3.18).
Case 2: pairing. We now consider the cases when = or = . First, if = , then we can apply the reduction arguments as mentioned above and reduce to
where
Note that both h(2) and h(3) cannot be identity as k2 ≠ k3. Now, if max(L2, L3) ≤ N/2, then due to independence, applying similar arguments as mentioned before, we can estimate N-certainly that
using the constraint |k1 − k2| ≲ Nε, which is enough for (3.18); if max(L2, L3) = N, then we can gain a negative power of this value and view simply as an H−1/2− function (without considering randomness) and bound
which is also enough for (3.18).
Finally, consider the case = , so, in particular, N1 = N2. We will sum over L1 and L2 in order to exploit the cancellation (3.15) (as k1 ≠ k2); this leads to the following expression:
where again we have replaced by 1 as mentioned before. Since k1 ≠ k2, by (3.15), we may replace in the above expression by . Then, decomposing in L1 and L2 again and taking Fourier transform in t and repeating the reduction steps as mentioned before, we arrive at the following quantity:
where
with h(j) as mentioned above. Note that we may assume |k1 − | ≲ Nδ min(L1, Nε + L2) ≲ Nε+δ min(L1, L2) in view of |k1 − k2| ≲ Nε, and it is easy to show, assuming min(L1, L2) = L, that
Since max(L1, L2) ≤ N/2, using independence and arguing as mentioned before, we can estimate that N-certainly,
which is enough for (3.18). This completes the estimate for term IV.
2. Properties of the matrix MN − HN
Before studying term V, we first establish some properties of the matrix such that
Lemma 5.1 plays an important role in Sec. V B 3 when estimating term V. In particular, we will exploit the one-dimensional extra independence of with since depends only on m*⋅k instead of on k.
3. Term V
Now, let us consider term V as defined in the introduction of Sec. V B. In the following proof of estimating term V, we will fully use the cancellation in (3.15) together with Lemma 5.1. We may assume that N1 = N2 = N because if N1 ≠ N2, then in later expansions, we must have ≠ [so the cancellation in (3.15) is not needed], and the proof will go in the same way; if N1 = N2 = N/2, then the same cancellation holds, and again, we have the same proof. Now, recall that ρN = ξN − ψN and that
as in (3.5) and (3.11), and that both MN and HN satisfy equality (3.15). Using this cancellation (when = in the expansion) in the same way as in Sec. V B 1 and by repeating the reduction steps mentioned before, we can reduce to estimating the quantity that is either
or
where
Here, in (5.27) and (5.28), the matrix Q is coming from QN, where for some fixed λ; similarly, P is coming from either QN or , and h(3) is coming from in the same way.
First, we consider (5.28). By losing a power NCε, we may fix the values of k1 − k2 and k − k3, and then, we will estimate using , and we have the following bounds:
(with L2 = N if P is coming from QN), where the first bound mentioned above follows from Proposition 2.8 for each fixed k3 and the second bound follows from estimating . This leads to
which is enough.
Now, we consider (5.27). If P is coming from QN, then in (5.27), we may remove the condition ≠ , reducing essentially to the expression in (5.3) with both w1 and w2 replaced by ρN, which is estimated in the same way as in Sec. V A 1. On the other hand, the term when = can be estimated in the same way as in (5.28). The same argument applies if P is coming from and max(L2, L3) ≥ Nε′, where we can gain a power N−ε′/4 from either L2 or L3, or if Q is coming from QN,rem, where we can gain extra powers N−ε′/4 using Lemma 5.1.
Finally, consider (5.27), assuming that max(L2, L3) ≤ Nε′ and that Q comes from QN,≪ in Lemma 5.1. By losing at most NCε′, we may fix the values of k1 − k2, k − k3, k2 − , and k3 − and consider one single component of QN,≪ described as in Lemma 5.1. Then, there are only two independent variables—namely, k and k1—and we essentially reduce (5.27) to
Here, is a non-probabilistic factor, |ℓ|, |ℓ*| ≲ Nε′ are fixed vectors, and are as in Lemma 5.1, and is defined as above. Moreover, we know that and P are independent of and , that depends only on m*⋅k1 for some fixed vector |m*| ≲ Nε′, and that |P| ≲ NO(ε), , and (after fixing λ as before). Finally, in (5.29) is bounded by .
Since only depends on m*⋅k1, if we fix the value of m*⋅k1 in the above-mentioned summation, then can be extracted as a common factor, and for the rest of the sum, we can apply independence (using Proposition 2.8) and get
where for any k1 · m* = a and . Note that in the above estimate, we are dividing the set of possible k1’s into subsets Sa,k, where ℓ · k1 equals some constant and m* · k1 equals another constant, and that Sa,k is either empty or has cardinality ≥N1−Cε′. When Sa,k = ∅, . When Sa,k ≠ ∅, we have |Sa,k| ≥ N1−Cε′, and hence,
Then, using Schur’s bound, we get that
which is enough for (3.18). This completes the proof for ρN.
C. An extra improvement
For the purpose of Sec. VI, we need an improvement for the ρN bound in (3.18), namely, the following.
We only need to examine terms I–V in the above proof. For terms I, IV, and V (hence also III), the above proof already gives bounds better than , so these terms are acceptable, and we only need to study term II. Note that the definition of ρ* restricts k to a set E of cardinality ≤N1+Cε′ by the standard divisor counting bound.
The case in Sec. V A 1: Here, bound (5.8) suffices unless max(N2, R) ≤ NCε′; if this happens, note that in the above proof, (5.8) follows from the estimate
Case 1 in Sec. V A 2: Here, the bound (5.10) suffices unless R ≤ NCε′; if this happens, note that (5.10) follows from the estimate
□
VI. THE REMAINDER TERMS
Now, we will prove the zN part of bound (3.18), assuming that N = M. We will prove it by a continuity argument [see part (1) of Sec. III D for more details], so we may assume (3.18) and only need to improve it using Eq. (3.13); note that the smallness factor is automatic as long as we use (3.13), as explained before. As such, we can assume that each input factor wj on the right-hand side of (3.13) has one of the following four types, where in all cases, we have Nj ≤ N:
(i) Type (G), where we define Lj = 1 and
(ii) Type (C), where
with supported in the set and measurable for some Lj ≤ Nj/2 and satisfying the following bounds (where in the first bound, we first fix λj, take the operator norm, and then take the L2 norm in λj):
Moreover, using (3.17), we may assume that h(j) is supported in |kj − | ≲ NδLj. Note that if wj is of type (G), can also be expressed in the same form as (6.2) but with , except that the second equation in (6.3) is not true in this case.
(iii) Type (L), where is supported in {|kj| ∼ Nj} and satisfies
Moreover, such a wj is a solution to Eq. (5.1).
(iv) Type (D), where is supported in {|kj| ≲ Nj} and satisfies
Moreover, such a wj is a solution to Eq. (3.13).
Now, let the multilinear forms , , , and be as in (2.4), (3.3), and (3.9). The terms on the right-hand side of (3.13), apart from the first term in the second line of (3.13), which is trivially bounded, are as follows:
Term I
where wj can be of any type and max(N1, N2, N3) = N.
Term II
where wj can be of any type and max(N1, N2) = N.
Term III
where wj can be of any type and max(N1, N2, N3) ≤ N/2.
Term IV
where wj can be of any type and max(N1, N2) ≤ N/2 and N3 = N.
Term V
where wj can be of any type and max(N1, N2) = N3 = N.
Term VI
where w1 and w2 can be of any type, w3 has type (D), and max(N1, N2) ≤ N/2 and N3 = N.
Term VII
where w1 and w2 can be of any type, w3 has type (D), and max(N1, N2) = N3 = N.
Term VIII represents the last two lines of the right-hand side of (3.13).
Our goal is to recover the bound for zN in (3.18) for each of the terms I–VIII mentioned above. In doing so, we will consider two cases. First is the no-pairing case, where if w1 and w2 are of type (C) or (G) and, hence, expanded as in (6.2), then we assume that ≠ ; similarly, if w2 and w3 are of type (C) or (G), then we assume that ≠ . The second case is the pairing case which is when = or = (the over-pairing case where = = is easy, and we shall omit it). We will deal with the no-pairing case for terms I–VII in Secs. VI A–VI C, the pairing case for these terms in Sec. VI D, and term VIII in Sec. VI E.
A. No-pairing case
We start with the no-pairing case.
1. Preparation of the proof
We start with some general reductions in the no-pairing case. Recall as in Sec. III D that we can always gain a smallness factor from the short time τ ≪ 1 and can always ignore losses of , provided that we can gain a power N−ε/10 (which will be clear in the proof). We will consider , where can be one of , , , , and , with Π being a general notation for projections for ΠN, ΠN/2, and ΔN,
where Ω = |k|2 − |k1|2 + |k2|2 − |k3|2 and ∑(⋆) is directly defined based on the definitions of , , , and and the selection of Π. For example, if is , then there will be two more restrictions |k| ≤ N and ⟨k1 − k2⟩ > N1−δ in the sum ∑(⋆). The other sums ∑(⋆) are defined in similar ways.
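When the frequencies satisfy the constraint k = k1 − k2 + k3 used in the counting arguments, the resonance factor factorizes as Ω = 2(k − k1)·(k − k3), which follows by substituting k2 = k1 + k3 − k and expanding. The sketch below checks this identity on random lattice points; the function names are hypothetical, and the choice of dimension 3 is an assumption of the illustration (the identity itself holds in any dimension).

```python
import random

def dot(u, v):
    """Euclidean inner product of two integer vectors."""
    return sum(a * b for a, b in zip(u, v))

def omega(k, k1, k2, k3):
    """Resonance factor |k|^2 - |k1|^2 + |k2|^2 - |k3|^2."""
    return dot(k, k) - dot(k1, k1) + dot(k2, k2) - dot(k3, k3)

def omega_factored(k, k1, k3):
    """Factored form 2 (k - k1).(k - k3), valid when k = k1 - k2 + k3."""
    d1 = tuple(a - b for a, b in zip(k, k1))
    d3 = tuple(a - b for a, b in zip(k, k3))
    return 2 * dot(d1, d3)

random.seed(1)
checks = []
for _ in range(1000):
    k1 = tuple(random.randint(-20, 20) for _ in range(3))
    k2 = tuple(random.randint(-20, 20) for _ in range(3))
    k3 = tuple(random.randint(-20, 20) for _ in range(3))
    # Impose the constraint k = k1 - k2 + k3.
    k = tuple(a - b + c for a, b, c in zip(k1, k2, k3))
    checks.append(omega(k, k1, k2, k3) == omega_factored(k, k1, k3))
```

One consequence visible from the factored form is that Ω is always an even integer under this constraint.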
Before going into the different estimates for I–VII, we first make a few remarks.
If a position wj has type (L) or (D), then in most cases, we only need to consider type (L) terms since (6.5) is stronger than (6.4); there are exceptions that will be treated separately later.
wj of type (G) can be considered as a special case of type (C) when ; if we avoid using the norm in (6.3), then we only need to consider type (C) terms.
Term I can be estimated in the same way as term II. In fact, the definition of implies max(N1, N2) ≥ N1−δ, so we are essentially in (special case of) term II up to a possible loss NCδ, which will be negligible compared to the gain. Moreover, term V can be estimated similarly as term IV; see Sec. VI C.
Terms VI and VII are readily estimated using the Xα → Xα bounds for the linear operator (3.19) proved in Secs. IV A and IV B.
Based on these remarks, from now on, we will consider terms II–IV (and VIII at the end), where the possible cases for the types of (w1, w2, w3) are (a) (C, C, C), (b) (C, C, L), (c) (C, L, C), (d) (L, C, C), (e) (L, L, C), (f) (C, L, L), (g) (L, C, L), and (h) (L, L, L).
In Sec. VI B, we will estimate term II, which can be understood as high–high interactions in view of max(N1, N2) = N, and noting that assuming k is the high frequency, then either k3 is also high frequency or |k1 − k2| must be large. In Sec. VI C, we will estimate terms III and IV by using a counting technique in a special situation called Γ-condition [see (6.18)]. In Sec. VI D, we consider the pairing case.
B. High–high interactions
We will estimate term II in this subsection. First, we can repeat the arguments for λ, λj, and the Duhamel operator I in (6.6) as in Secs. IV and V. Namely, we first restrict to |λj| ≤ N100 and |λ|, |μ| ≤ N100, where μ = λ − (Ω + λ1 − λ2 + λ3), and replace the unfavorable exponents (1 − b or b depending on the context) by the favorable ones (b or 1 − b) and then exploit the resulting integrability in λj to fix the values of λ, λj, and ⌊μ⌋. Then, we reduce to the following expression where Ω0 is a fixed integer:
where hb is the base tensor that contains the factors
We assume that hb is supported in the set where |kj| ≤ Nj and ⟨k1 − k2⟩ ∼ R, where R is a dyadic number. Moreover, we assume that R and the support of hb satisfy the conditions associated with the definition of some . In view of the factor in hb, we also define hR,(⋆)≔ Rβ · hb, which is essentially the characteristic function of the set
possibly with extra conditions determined by the definition of . We also define to be the set of (k, k1, k2, k3) ∈ SR with fixed k and, similarly, define . Note that when wj has type (G), (C), or (L), we can further assume that |kj| > Nj/2 in the definition of SR.
1. Case (a): (C, C, C)
In this case, we have
where satisfies (6.3) with some Nj and Lj ≤ Nj/2 for 1 ≤ j ≤ 3 and is defined as mentioned above.
To estimate , we would like to apply Proposition 2.8 and then Proposition 2.7. Like in Secs. IV A and V, the way we apply Proposition 2.8 depends on the relative sizes of Nj (1 ≤ j ≤ 3). For example, if N1 = N2 = N3, we shall apply Proposition 2.8 jointly in the (, , ) summation in (6.9); if N1 = N3 > N2, we will first apply Proposition 2.8 jointly in the (, ) summation and then apply it in the summation, and if N3 > N1 > N2, we will apply first in the summation, then in the summation, and then in the summation. The results in the end will be the same in all cases, so, for example, we will consider the case N3 > N1 > N2. Now, we have
where
By the independence between and since N3 > N1 > N2, we apply Proposition 2.8 and Proposition 2.7 and get τ−1N*-certainly that
Similarly, by the independence between and since N1 > N2 and also by the independence between and , once again we can apply Proposition 2.8 and Proposition 2.7 to and then to . As a consequence, we have τ−1N*-certainly that
In the other cases, we get the same bound. Without loss of generality, we may assume that N1 = N, and then, using Lemma 2.5, we can estimate that
which implies that , which is enough for (3.18).
2. Case (b): (C, C, L)
In this case, we have
where satisfies (6.3) with some Nj and Lj ≤ Nj/2 for 1 ≤ j ≤ 2 and the base tensor is defined as before. Clearly, can be bounded by times the norm,
By applying Propositions 2.8 and 2.7 again, in the same manner as in Sec. VI B 1, we get that the above norm is bounded by
By Lemma 2.5, we can conclude that
and hence, we easily get , which is enough for (3.18).
3. Case (c): (C, L, C) and case (d): (L, C, C)
The estimates of case (c) and case (d) are similar to case (b), so we will state the estimates in case (c) and case (d) without proofs. In case (c), we get
and in case (d), we get a similar bound, but with subindices 1 and 2 switched.
Now, by Lemma 2.5, we can obtain that
In the first case, we directly get that
which is enough for (3.18) as max(N1, N2) = N and N1−δ ≥ R ≥ Nε (then, we have N1 ∼ N2 ∼ N) in view of the definition of . In the second case, we get
which is also enough for (3.18) as max(N1, N2) = N and R ≥ Nε. In the third case, we get
which is also enough for (3.18). By switching indices 1 and 2, we also get the same estimates in case (d).
4. Case (e): (L, L, C)
In this case, we have
where satisfies (6.3) with some N3 and L3 ≤ N3/2 and the base tensor is defined as before. By symmetry, we may assume N1 ≤ N2, and then, by the same argument as above, using Propositions 2.7 and 2.8, we can bound
By Lemma 2.5, both tensor norms are bounded by min(N1, R)N3; as N1 ≤ N2 (and hence, N2 = N) and R ≥ Nε, it is easy to check that this bound is enough for (3.18).
5. Case (f): (C, L, L) and case (g): (L, C, L)
The estimates of cases (f) and (g) are similar to case (e), so we will state the estimates of case (f) and (g) directly. Again the two cases here only differ by switching indices 1 and 2, so we only consider case (f). Like in case (e), we get two bounds,
and
Now, if , we will apply the first bound and use that
so the factor , together with , where , provides the bound that is enough for (3.18). Moreover, the same bound also works if (since in this case, N1 = N).
If and , we will apply the second bound and use that
assuming that . This is also enough for (3.18) assuming that and R ≥ Nε.
6. Case (h): (L, L, L)
In this case, we have
where the base tensor is defined as before. Then, simply using Proposition 2.7, we get
By Lemma 2.5, we have , which implies that
which is enough for (3.18) because max(N1, N2) = N and R ≥ Nε.
C. The Γ condition terms
In this section, we estimate terms III and IV. These two terms are actually similar, and the key property that they satisfy is the so-called Γ condition. Namely, due to the projections and assumptions on the inputs in terms III and IV, we have that
for some real number Γ, where S is the support of the base tensor hb [note that in term IV, we may assume that w3 is not of type (D) as, otherwise, the bound follows from what we have already done, so here, we may choose Γ = (N/2)2 − 1].
To proceed, we return to in (6.6), where Ω = |k|2 − |k1|2 + |k2|2 − |k3|2, and suppose that μ = λ − (Ω + λ1 − λ2 + λ3), and then, we have |I| ≲ ⟨λ⟩−1⟨μ⟩−1 by (2.10). Following the same reduction steps as before, we can assume that |λ|, |λj|(j = 1, 2, 3), |μ| ≤ N100 and may replace the unfavorable exponents by the favorable ones. Now, instead of fixing each λj and λ and ⌊μ⌋, we do the following.
Without loss of generality, we may assume that |λ3| is the maximum of all the parameters |λj| and |λ| and |μ|; the other cases are treated similarly. We may fix a dyadic number K and assume that |λ3| ∼ K. Then, we may fix λj (j ≠ 3) and λ and ⌊μ⌋, again using integrability in these variables, and exploit the weight in the weighted norms in which w3 is bounded and reduce to an expression
where and is essentially the characteristic function of the set (with possibly more restrictions according to the definition of )
where Ω0 is a fixed number such that |Ω0| ≲ K. We also define the sets to be the set of (k1, k2, k3, λ3) such that (k, k1, k2, k3, λ3) ∈ SR,M for fixed k. Note that when wj is of type (C), (G) or (L), we can further assume that .
The idea in estimating (6.19) is to view (k3, λ3) as a whole (say, denote it by ), which will allow us to gain using the Γ condition in estimating the norms of the base tensor hR,K,(⋆). Although our tensors here involve the variable , it is clear that Propositions 2.6 and 2.7 still hold for such tensors, and Proposition 2.8 can also be proved by using a meshing argument (see Sec. III D, where the derivative bounds in λ3 are easily proved as all the relevant functions are compactly supported in physical space). Moreover, by the induction hypothesis and the manipulation mentioned above (for example, with the Y1−b norm replaced by the Yb norm), we can also deduce corresponding bounds for and the corresponding matrices such that , for example, . Because of this, in the proof below, we will simply write , while we actually mean , so the proof has the same format as the previous ones.
We now consider the input functions. In term III, clearly, max(N1, N2, N3) ≳ N; if N3 ≪ N, then we must have max(N1, N2) ≳ N and |k1 − k2| ≳ N, and hence, this term can be treated in the same way as term II. Therefore, we may assume that N3 ∼ N, and clearly, the same happens for term IV. If max(N1, N2) ≳ N, then again using the term II estimate, we only need to consider the case where |k1 − k2| ≲ Nε. This term can be treated using similar arguments as below and is much easier due to the smallness of |k1 − k2|, so we will only consider the case max(N1, N2) ≪ N. In the same way, we will not consider term V here. Finally, if , with N3 ∼ N, then (3.18) directly follows from the linear estimate proved in Sec. IV A and the Γ condition is not needed.
There are two cases: either w3 has type (L) or w3 has type (C) [or (G)]. In the latter case, there are four further cases for the types of w1 and w2, which we will discuss below.
1. The type (L) case
Suppose that w3 has type (L). Clearly, if , then (3.18) also follows from the linear estimates in Sec. IV A [because the difference between the ρN bound and the zN bound in (3.18) is at most ], so we may assume that . Then, in (6.19), we may further fix the values of (k1, k2) at the price of , and hence, we may write
and by definition, it is easy to see that . Then, (3.18) follows, using the bound for w3, if . Finally, if , then we have , where Ω = |k|2 − |k1|2 + |k2|2 − |k3|2. Using the Γ condition (6.18), we conclude that |k3|2 belongs to an interval of length , so we can apply Proposition 5.3 to gain a power , which covers the loss and is enough for (3.18).
2. The type (C, C, C) case
Now, suppose that w1, w2, and w3 have type (C, C, C). By symmetry, we may assume that N1 ≤ N2. Then, by the same argument as in Sec. VI B 1, we obtain that
The last three factors are easily bounded by 1, so it suffices to bound the tensor hR,K,(⋆).
By definition, this is equivalent to counting the number of lattice points (k, k1, k2, k3) such that k1 − k2 + k3 = k (and also satisfying the inequalities listed above) and |Ω| ≲ K. Note that
so when K ≤ K1, by the Γ condition, |k|^2 has at most K1 choices, and hence, k has at most K1N choices. Once k is fixed, the number of choices for (k1, k2, k3) is at most , which leads to the bound
If, instead, K ≥ K1, then k has at most KN choices, and once k is fixed, the number of choices for (k1, k2, k3) is at most , so we get
In either way, we get
which is enough for (3.18) as max(R, N1) ≲ N2.
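The gain from the Γ condition in this counting can be illustrated numerically. The following is a minimal sketch, not part of the proof: it counts lattice points k in Z^3 whose squared length lies in a prescribed window, which is the mechanism behind the statement that k has at most about K1N choices once |k|^2 is confined to K1 values. The function name `count_shell`, the half-open window, and the box cutoff |k_i| ≤ N (in place of a ball) are assumptions made for the illustration.

```python
from itertools import product


def count_shell(N, lo, length):
    """Count k in Z^3 with every |k_i| <= N and lo <= |k|^2 < lo + length.

    Hypothetical helper: the half-open window [lo, lo + length) stands in
    for an interval of `length` admissible integer values of |k|^2.
    """
    return sum(
        1
        for k in product(range(-N, N + 1), repeat=3)
        if lo <= k[0] ** 2 + k[1] ** 2 + k[2] ** 2 < lo + length
    )
```

For instance, `count_shell(2, 1, 3)` counts the 26 lattice points with |k|^2 in {1, 2, 3}, against 125 points in the full box. Restricting |k|^2 to a short window removes most of the box, which is the source of the improvement from roughly N^3 to roughly K1N choices for k.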
3. The type (L, L, C) case
Now, suppose that w1, w2, and w3 have type (L, L, C). First, assume that N1 ≤ N2. The same arguments as in Sec. VI B 4 yield
The second norm mentioned above is easily bounded by K^{1/2}RN1 using Lemma 2.5, which is clearly enough for (3.18); for the first norm, there are two ways to estimate it.
The first way is to use Lemma 2.5 directly, without using the Γ condition, to get
The second way is to use the Γ condition: first fix the value of |k|^2, then fix k, and then count . This yields
assuming K ≤ K1 and a better bound assuming K ≥ K1. Now, plugging in the second bound yields
which can be shown to be ≲ N^{−1/2} using the fact that max(R, N1) ≤ N2 and by considering separately the cases R ≥ N1 and R ≤ N1. Moreover, the same estimate can be checked to work if . If , we can switch subscripts 1 and 2, in which case we have the weaker bound
without the 1/2 power in the last factor; however, this is still ≲ N^{−1/2}, provided that .
4. The type (L, C, C) and (C, L, C) cases
Now, suppose that w1, w2, and w3 have type (L, C, C); the case (C, L, C) is treated similarly. Here, the same arguments as in Sec. VI B 3 imply
The two norms and can be estimated by K^{1/2}R min(N1, N2), using Lemma 2.5 only and without the Γ condition, which is clearly enough for (3.18). For the norm, we can use the estimates in Sec. VI C 3 and get