In this paper, we consider the defocusing Hartree nonlinear Schrödinger equation on $\mathbb{T}^3$ with a real-valued and even potential *V* whose Fourier multiplier decays like $|k|^{-\beta}$. Relying on the method of random averaging operators [Deng *et al.*, arXiv:1910.08492 (2019)], we show that there exists *β*_{0}, less than but close to 1, such that for *β* > *β*_{0}, we have invariance of the associated Gibbs measure and global existence of strong solutions in its statistical ensemble. In this way, we extend Bourgain's seminal result [J. Bourgain, J. Math. Pures Appl. **76**, 649–702 (1997)], which requires *β* > 2 in this case.

## I. INTRODUCTION

In this paper, we study the invariant Gibbs measure problem for the nonlinear Schrödinger (NLS) equation on $\mathbb{T}^3$ with Hartree nonlinearity. Such an equation takes the form

where *V* is a convolution potential. We will assume that it satisfies the following properties:

- *V* is real-valued and even, and so is $\hat{V}$.
- (1.1) is *defocusing*, i.e., *V* ≥ 0.
- *V* acts like *β* antiderivatives, i.e., $\hat{V}(0) = 1$ and $|\hat{V}(k)| \lesssim \langle k \rangle^{-\beta}$ for some *β* ≥ 0.

A typical example of such a *V* is the Bessel potential ⟨∇⟩^{−β} of order *β* on $\mathbb{T}^3$, which can be written in the form (for some *c* > 0) *V*(*x*) = *c*|*x*|^{−(3−β)} + *K*(*x*) for 0 < *β* < 3 and $x \in \mathbb{T}^3 \setminus \{0\}$, where *K* is a real-valued smooth function on $\mathbb{T}^3$. Note that when *V* is the *δ* function (and *β* = 0), we recover the usual cubic NLS equation. Our main result (see Theorem 1.3) establishes invariance of the Gibbs measure for (1.1) when *V* is the Bessel potential of order *β*, with *β* < 1 close enough to 1, greatly improving the previous result of Bourgain,^{7} which assumes that *β* > 2 (see also Remark 1.4).

### A. Background

Equation (1.1) can be viewed as a regularized or tempered version of the cubic NLS equation, and both naturally arise in the limit of quantum many-body problems for interacting bosons (see, e.g., Refs. 21 and 33 and references therein). An important question, both physically and mathematically, is to study the construction and dynamics of the *Gibbs measure* for (1.1), which is a Hamiltonian system.

#### 1. Gibbs measure construction

The Gibbs measure, which we henceforth denote by d*ν*, is formally expressed as

where *H*[*u*] is the renormalization of the Hamiltonian,

Rigorously making sense of (1.2) is closely linked to the construction of the $\Phi^4_3$ measure in quantum field theory, which has attracted much interest since the 1970s and 1980s^{1,20,22,26,27,31} and in recent years.^{3,4,21,32} In the case of (1.1), the answer actually depends on the value of *β*. When *β* > 1/2,^{36} the measure d*ν* can be defined as a weighted version of the Gaussian measure d*ρ*, namely,

where :|*u*|^{2}(*V* * |*u*|^{2}): is a suitable renormalization of the nonlinearity [see (1.12) for a precise definition] and the Gaussian free field d*ρ* is defined as the law of distribution for the random variable^{37}

$$f(\omega) = \sum_{k \in \mathbb{Z}^3} \frac{g_k(\omega)}{\langle k \rangle} e^{ik\cdot x},$$

with {*g*_{k}(*ω*)} being i.i.d. normalized centered complex Gaussians. On the other hand, if 0 < *β* ≤ 1/2, then d*ν* is a weighted version of a *shifted* Gaussian measure, which is singular with respect to d*ρ*. These results were proved recently by Bringmann^{11} and Oh, Okamoto, and Tolomeo^{28} by adapting the variational methods of Barashkov and Gubinelli.^{3}

We remark that, in either case mentioned above, it can be shown that the Gibbs measure d*ν* is supported in $H^{-1/2-}(\mathbb{T}^3)$, the same space as d*ρ*. In particular, the typical element in the support of d*ν* has infinite mass, which naturally leads to the *renormalizations* in the construction of d*ν* alluded to above; see Sec. I B. From the physical point of view, it is also worth mentioning that, in the same way (1.1) is derived from quantum many-body systems, the Gibbs measure d*ν*, with the correct renormalizations, can also be obtained by taking the limit of thermal states of such systems, at least when *V* is sufficiently regular (see Refs. 21 and 32).
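The support statement can be checked at the level of Fourier coefficients: since the coefficients of $f(\omega)$ have variance $\langle k\rangle^{-2}$, one has $\mathbb{E}\|f\|_{H^s}^2 = \sum_{k\in\mathbb{Z}^3} \langle k\rangle^{2s-2}$, which converges exactly when $s < -1/2$. A minimal numerical sketch, not from the paper (the dyadic-shell decomposition below is an illustrative device):

```python
# Sketch: E||f||_{H^s}^2 = sum over k in Z^3 of <k>^(2s-2), with <k>^2 = 1 + |k|^2.
# Summing over dyadic shells 2^(j-1) < |k| <= 2^j, each shell contributes
# roughly 2^(j(2s+1)): decaying when s < -1/2, growing when s > -1/2.

def shell_sum(s, j):
    """Sum of <k>^(2s-2) over lattice points k in Z^3 with 2^(j-1) < |k| <= 2^j."""
    lo, hi = 2 ** (j - 1), 2 ** j
    total = 0.0
    for kx in range(-hi, hi + 1):
        for ky in range(-hi, hi + 1):
            for kz in range(-hi, hi + 1):
                n2 = kx * kx + ky * ky + kz * kz
                if lo * lo < n2 <= hi * hi:
                    total += (1.0 + n2) ** (s - 1.0)
    return total

# s = -0.6 < -1/2: shell contributions decay, so the full series converges;
# s = -0.4 > -1/2: shell contributions grow, so the series diverges.
```

This is the quantitative reason the typical sample lies in $H^{-1/2-}$ but not in $H^{-1/2}$.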

#### 2. Gibbs measure dynamics and invariance

Of the same importance as the construction of the Gibbs measure is the study of its dynamics and rigorous justification of its invariance under the flow of (1.1). The question of proving invariance of Gibbs measures for infinite dimensional Hamiltonian systems, with interest from both mathematical and physical aspects, has been extensively studied over the last few decades. In fact, it is the works of Refs. 5, 6, and 26—which attempted to answer this question in some special cases—that mark the very beginning of the subject of random data partial differential equations (PDEs).

The literature is now extensive, so we will only review those related to NLS equations. After the construction of Gibbs measures in Ref. 26, the first invariance result was due to Bourgain,^{5} which applies in one dimension for focusing sub-quintic equations and for defocusing equations with any power nonlinearity. Bourgain^{6} then extended the defocusing result to two dimensions, but only for the cubic equation; the two-dimensional case with arbitrary (odd) power nonlinearity was recently solved by the authors.^{17} For the case of Hartree nonlinearity (1.1) in three dimensions, Bourgain^{7} obtained invariance for *β* > 2. We also mention the works of Tzvetkov^{34,35} and of Bourgain and Bulut,^{8,9} which concern the NLS equation inside a disk or ball, the construction of non-unique weak solutions by Oh and Thomann^{30} following the scheme in Refs. 2, 13, and 15, and the relevant works on wave equations.^{11,12,14,28,29} In particular, the recent work of Bringmann^{12} established Gibbs measure invariance for the *wave* equation with the Hartree nonlinearity (1.1) for *arbitrary β* > 0.

The main mathematical challenge in proving invariance of the Gibbs measure is the low regularity of the support of the measure, especially in two or more dimensions. For example, for the two-dimensional NLS equation with power nonlinearity, the support of the Gibbs measure d*ν* lies in the space of distributions $H^{0-}(\mathbb{T}^2)$, while the scaling critical space is $H^{1/2}(\mathbb{T}^2)$ for the quintic equation and approaches $H^{1}(\mathbb{T}^2)$ for equations with high power nonlinearities. This gap is a major reason why the two-dimensional quintic and higher cases remained open for so many years. In the case of (1.1), a similar gap is present, namely, between the support of d*ν* at $H^{-1/2-}(\mathbb{T}^3)$ and the scaling critical space $H^{(1-\beta)/2}(\mathbb{T}^3)$, which is higher than $H^{0}(\mathbb{T}^3)$ when *β* < 1.

On the other hand, it has been known since the pioneering work of Bourgain^{6} that with random initial data, one can go below the classical scaling critical threshold and obtain almost-sure well-posedness results. In the recent works^{17,18} of the authors, an intuitive probabilistic scaling argument was performed. This leads to the notion of the *probabilistic scaling critical index* *s*_{pr} ≔ −1/(*p* − 1), which is much lower than the classical scaling critical index *s*_{cr} ≔ (*d*/2) − 2/(*p* − 1) in the case of *p*th power nonlinearity in *d* dimensions. In Ref. 18, we proved that almost-sure local well-posedness indeed holds in *H*^{s} in the full probabilistically subcritical range *s* > *s*_{pr}, in any dimension and for any (odd) power nonlinearity.

For the case of (1.1), a similar argument as in Refs. 17 and 18 yields that the probabilistic scaling critical index for (1.1) is *s*_{pr} = (−1 − *β*)/2, which is lower than −1/2, so it is reasonable to think that almost-sure well-posedness would be true. However, the situation here is somewhat different from that in Refs. 17 and 18 due to the asymmetry of the nonlinearity (1.1) compared to the power one, which leads to interesting modifications of the methods in these previous works, as we will discuss in Sec. I C.

#### 3. Probabilistic methods

The first idea in proving almost-sure well-posedness was due to Bourgain^{6} and Da Prato and Debussche,^{15} the latter in the setting of parabolic stochastic partial differential equations (SPDEs), which can be described as a *linear–nonlinear decomposition*. Namely, the solution is decomposed into a linear term, random evolution (or noise) term, and a nonlinear term that has strictly higher regularity, thanks to probabilistic cancellations in the nonlinearity. If the linear term has regularity close to scaling criticality, then the nonlinear term can usually be bounded sub-critically; hence, a fixed point argument applies. However, this idea has its limitations in that the nonlinear term may not be smooth *enough*, and in practice, it is usually limited to slightly supercritical cases (relative to deterministic scaling) and does not give optimal results.
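Schematically, the linear–nonlinear decomposition for (1.1) can be written as follows (a standard formulation sketched here for orientation, not a display quoted from this paper):

```latex
% Bourgain / Da Prato--Debussche decomposition (schematic)
u(t) = \underbrace{e^{it\Delta} f(\omega)}_{\text{rough linear (random) evolution}} + v(t),
\qquad
v(t) = -i \int_0^t e^{i(t-s)\Delta}\, \big( (V * |u|^2)\, u \big)(s)\, \mathrm{d}s ,
```

where the fixed point argument is run for $v$ alone, in a space of strictly higher regularity than that of the linear term.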

In Ref. 17, inspired partly by the *regularity structure theory* of Hairer and the *para-controlled calculus* of Gubinelli, Imkeller, and Perkowski in the parabolic SPDE setting, we developed the theory of *random averaging operators*. The main idea is to take the high–low interaction, which is usually the worst contribution in the nonlinear term described above, and express it as a para-product-type linear operator—called the random averaging operator—applied to the random initial data. Moreover, this linear operator is *independent* of the initial data it applies to and has a *randomness structure*, which includes the information of the solution at lower scales; see Sec. I C. This structure is then shown to be preserved from low to high frequencies by an *induction on scales* argument and eventually leads to improved almost-sure well-posedness results. We refer the reader to Ref. 33 for a recent application of the method of random averaging operators of Ref. 17 to weakly dispersive NLS equations.

In Ref. 18, the random averaging operators are extended to the more general theory of *random tensors*. In this theory, the linear operators are extended to multilinear operators, which are represented by tensors, and whole algebraic and analytic theories are then developed for these random tensors. For NLS equations with odd power nonlinearity, this theory leads to the proof of optimal almost-sure well-posedness results; see Ref. 18. We remark that, while the theory of random tensors is more powerful than random averaging operators, the latter has a simpler structure, is less notation-heavy, and is already sufficient in many situations (especially if one is not very close to probabilistic criticality).

Finally, we would like to mention other probabilistic methods, developed in the recent works of Gubinelli, Koch, and Oh,^{25} Bringmann,^{10,12} and Oh, Okamoto, and Tolomeo.^{28} These methods also go beyond the linear–nonlinear decomposition and are partly inspired by the parabolic theories. They have important similarities and differences compared to our methods in Refs. 17 and 18, but they mostly apply for wave equations instead of Schrödinger equations, so we will not further elaborate here but refer the reader to the above papers for further explanation.

### B. Setup and the main result

We start by fixing the i.i.d. normalized (complex) Gaussian random variables $\{g_k(\omega)\}_{k \in \mathbb{Z}^3}$ so that $\mathbb{E}g_k = 0$ and $\mathbb{E}|g_k|^2 = 1$. Let

$$f(\omega) \coloneqq \sum_{k \in \mathbb{Z}^3} \frac{g_k(\omega)}{\langle k \rangle} e^{ik\cdot x},$$

and it is easy to see that $f(\omega) \in H^{-1/2-}(\mathbb{T}^3)$ almost surely. Let $V: \mathbb{T}^3 \to \mathbb{R}$ be a potential such that *V* is even, non-negative, and *V*_{0} = 1, |*V*_{k}| ≲ ⟨*k*⟩^{−β}, as described above. Here and below, we will use *u*_{k} to denote the Fourier coefficients of *u* and use $\hat{u}$ to represent the time Fourier transform only. In this paper, we fix *β* < 1 sufficiently close to 1 (this is a specific value, but we do not track it below). Let $N \in 2^{\mathbb{Z}_{\ge 0}} \cup \{0\}$ be a dyadic scale, and define the projections $\Pi_N$ such that $(\Pi_N u)_k = 1_{\langle k\rangle \le N} \cdot u_k$ and $\Delta_N = \Pi_N - \Pi_{N/2}$, and define
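The sharp truncation $\Pi_N$ and the Littlewood–Paley piece $\Delta_N$ can be sketched directly on Fourier coefficients (a toy illustration; the dictionary representation and concrete values are assumptions of this sketch, not the paper's code):

```python
import math

def weight(k):
    # Japanese bracket <k> = sqrt(1 + |k|^2) for k in Z^3
    return math.sqrt(1.0 + sum(x * x for x in k))

def Pi(N, coeffs):
    # Sharp truncation onto the Fourier modes with <k> <= N
    return {k: c for k, c in coeffs.items() if weight(k) <= N}

def Delta(N, coeffs):
    # Dyadic piece Delta_N = Pi_N - Pi_{N/2}: modes with N/2 < <k> <= N
    return {k: c for k, c in coeffs.items() if N / 2 < weight(k) <= N}
```

By construction the dyadic pieces tile frequency space: $\Pi_1$ together with $\Delta_2, \Delta_4, \ldots, \Delta_N$ reassembles $\Pi_N$ exactly.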

We introduce the following truncated and renormalized versions of (1.1), with the truncated random initial data:

Here, in (1.7), we fix

and $C_N$ is a Fourier multiplier,

Note that *u*_{N} is supported in ⟨*k*⟩ ≤ *N* for all time. The first counterterm in (1.7), namely, −*σ*_{N}*u*_{N}, corresponds to the standard Wick ordering, where one fixes *k*_{1} = *k*_{2} in the expression,

plugs in *u* = *f*_{N}(*ω*), and takes expectations. The second term $-C_N u_N$ corresponds to fixing *k*_{2} = *k*_{3}, which is present due to the asymmetry of the nonlinearity (|*u*|^{2} * *V*) · *u*. Note that $(C_N)_k$ is uniformly bounded when *β* > 1, so this counterterm is unnecessary in that case (in particular, in the case of Bourgain^{7}); when *β* < 1, it becomes a divergent term that needs to be subtracted.
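The divergence of the second counterterm can be seen at $k = 0$: under the model multiplier $\hat{V}(m) = \langle m \rangle^{-\beta}$ (an assumption made for illustration), $(C_N)_0$ behaves like $\sum_{\langle \ell\rangle \le N} \langle \ell\rangle^{-(2+\beta)}$, which stays bounded as $N \to \infty$ for $\beta > 1$ but grows like $N^{1-\beta}$ for $\beta < 1$. A hedged numerical sketch:

```python
def CN_at_zero(N, beta):
    """Partial sum over <l> <= N of <l>^(-(2+beta)), a model for (C_N)_0
    under the illustrative assumption V_hat(m) = <m>^(-beta)."""
    total = 0.0
    for lx in range(-N, N + 1):
        for ly in range(-N, N + 1):
            for lz in range(-N, N + 1):
                w2 = 1 + lx * lx + ly * ly + lz * lz  # <l>^2
                if w2 <= N * N:
                    total += w2 ** (-(2 + beta) / 2)
    return total

# For beta = 0.5 (< 1) the increment from N = 8 to N = 16 stays a sizable
# fraction of the sum (divergence ~ N^(1-beta)); for beta = 2 (> 1) the
# increments become negligible (the sum saturates).
```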

Equation (1.7) is a finite dimensional Hamiltonian equation with the Hamiltonian

where $\gamma_N = \sum_{\langle k\rangle, \langle \ell\rangle \le N} \frac{V_{k-\ell}}{\langle k\rangle^2 \langle \ell\rangle^2}$.

*Remark 1.1.*

$H_N[u]$ can also be expressed as $\int_{\mathbb{T}^3}|\nabla u|^2 + :|u|^2(V*|u|^2):$, where the suitably renormalized nonlinearity $:|u|^2(V*|u|^2):$ is defined as

$Z_N > 0$ is a normalization constant, making $d\nu_N$ a probability measure. Clearly, $d\nu_N$ is invariant under the finite dimensional flow (1.7). Note that we can also write

where $d\rho_N$ is the law of distribution for the linear Gaussian random variable $f_N(\omega) \coloneqq \Pi_N f(\omega)$, and $H_N^{\mathrm{pot}}[u]$ represents the potential energy, given by

Consider the projections $\Pi_N$ and $\Pi_N^{\perp} \coloneqq \mathrm{Id} - \Pi_N$, and let $d\rho$ and $d\rho_N^{\perp}$ be the laws of distribution for $f(\omega)$ and $\Pi_N^{\perp} f(\omega)$, respectively. Then, we have $d\rho = d\rho_N \times d\rho_N^{\perp}$; moreover, we define

Here, recall that *β* < 1 is close enough to 1; in particular, *β* > 1/2.

*Proposition 1.2.*

*Suppose that V is the Bessel potential of order β and β > 1/2; then, $G_N(u)$ converges to a limit $G(u)$ in $L^q(d\rho)$ for all $1 \le q < \infty$, and the sequence of measures $d\nu_N$ converges to a probability measure $d\nu$ in the sense of total variation. The measure $d\nu$ is called the Gibbs measure associated with system (1.1).*

*Proof.*

This is proved in the recent works of Bringmann^{11} and Oh, Okamoto, and Tolomeo.^{28} Strictly speaking, they are dealing with the case of real-valued *u* (as they are concerned about the wave equation), but the proof can be readily adapted to the complex-valued case here.□

Now, we can state our main theorem.^{45}

*Theorem 1.3.*

*Let V be the Bessel potential of order β, with β < 1 close enough to 1. There exists a Borel set $\Sigma \subset H^{-1/2-}(\mathbb{T}^3)$ such that $\nu(\Sigma) = 1$, and the following holds. For any $u_{\mathrm{in}} \in \Sigma$, let $u_N(t)$ be defined by (1.7), and then, the limit*

*exists in $C_t^0 H_x^{-1/2-}(\mathbb{R} \times \mathbb{T}^3)$, and $u(t) \in \Sigma$ for all $t \in \mathbb{R}$. This $u(t)$ solves (1.1) with a suitably renormalized nonlinearity and defines a mapping $\Phi_t: \Sigma \to \Sigma$ for each $t \in \mathbb{R}$. These mappings satisfy the group property $\Phi_{t+s} = \Phi_t \Phi_s$ and keep the Gibbs measure $d\nu$ invariant, namely, $\nu(E) = \nu(\Phi_t(E))$ for any $t \in \mathbb{R}$ and Borel set $E \subset \Sigma$.*

*Remark 1.4.*

The condition that *V* is the Bessel potential of order *β* in Theorem 1.3 is required only because of the assumption of Proposition 1.2, which is proved in Refs. 11 and 28. In fact, all the proofs in this paper mainly concern the almost-sure local-in-time well-posedness and only require that *V* satisfies the following: (1) *V* is real-valued and even, and so is $\hat{V}$; (2) *V* ≥ 0; and (3) $\hat{V}(0) = 1$ and $|\hat{V}(k)| \lesssim \langle k \rangle^{-\beta}$.

*Remark 1.5.*

As in Refs. 17 and 18, the sequence {*u*_{N}} can be replaced by other canonical approximation sequences, for example, with the sharp truncations Π_{N} on the initial data replaced by smooth truncations or with the projection Π_{N} onto the nonlinearity in (1.7) omitted. The limit obtained does not depend on the choice of such sequences, and the proof will essentially be the same.

#### 1. Regarding the range of *β*

The range of *β* obtained in Theorem 1.3 is clearly not optimal. In fact, Eq. (1.1) with Gibbs measure data is probabilistically subcritical as long as *β* > 0, and one should expect the same result at least when *β* > 1/2 (so that the Gibbs measure is absolutely continuous with respect to the Gaussian free field).

The purpose of this paper, however, is to provide an example where the method of random averaging operators^{17} is applied so that one can significantly improve the existing probabilistic results (*β* close to but smaller than 1 vs *β* > 2 in Ref. 7) while keeping the presentation relatively short. In order to treat *β* > 1/2, one would need to adapt the sophisticated theory of random tensors,^{18} which would considerably increase the length of this work, so we leave this part to a future paper.

As for the case 0 < *β* < 1/2, one would need to deal with the mutual singularity between the Gibbs measure and the Gaussian free field [if, instead, one studies the local well-posedness problem with Gaussian initial data, as in (1.5), which is different from the Gibbs data, then a modification of the random tensor theory^{18} would also likely work for all *β* > 0]. The recent work of Bringmann^{12} provides a nice example where this issue is solved in the context of wave equations, and it would be interesting to see whether this can be extended to Schrödinger equations. Finally, the case *β* = 0, which is the famous Gibbs measure invariance problem for the three-dimensional cubic NLS equation, remains an outstanding open problem as of now. It is probabilistically *critical*, which presumably requires completely new techniques to solve.

### C. Main ideas

Due to the absolute continuity of the Gibbs measure in Proposition 1.2, in order to prove Theorem 1.3, we only need to consider the initial data distributed according to d*ρ* for (the renormalized version of) (1.1) and the initial data distributed according to d*ρ*_{N} for (1.7). In other words, we may assume that *u*(0) = *f*(*ω*) for (1.1) and *u*_{N}(0) = *f*_{N}(*ω*) for (1.7).

#### 1. Random averaging operators

Let us focus on (1.7); for simplicity, we will ignore the renormalization terms. The approach of Bourgain and of Da Prato and Debussche corresponds to decomposing

where *f*_{N} is as in (1.6) and *v*(*t*) is the nonlinear evolution. In particular, this *v*(*t*) contains a trilinear Gaussian term,

This term turns out to only have *H*^{0−} regularity, which is not regular enough for a fixed point argument (note that the classical scaling critical threshold is *H*^{(1−β)/2}). Therefore, this approach does not work.

Nevertheless, one may observe that the only contribution to *v*^{*} that has the worst (*H*^{0−}) regularity is when the first two input factors are at *low* frequency and the third factor is at *high* frequency, such as

for *N*′ ≪ *N* and *F*_{N} as in (1.6). Moreover, this low frequency component *f*_{N′} may also be replaced by the corresponding nonlinear term at frequency *N*′, so it makes sense to separate the low–low–high interaction term *ψ*^{N} defined by

as the singular part of *y*_{N} ≔ *u*_{N} − *u*_{N/2} so that *y*_{N} − *ψ*^{N} has higher regularity.

The idea of considering high–low interactions is consistent with the para-controlled calculus in Refs. 23–25. However, in those works, the singular term *ψ*^{N} and the regular term *y*_{N} − *ψ*^{N} are characterized only by their regularity (for example, one is constructed via a fixed point argument in *H*^{0−} and the other in *H*^{1/2−}), which, as pointed out in Ref. 17, is not enough in the context of Schrödinger equations. Instead, it is crucial to study the operator, referred to as the *random averaging operator* in Ref. 17, which maps *z* to the solution to the following equation:

Note that the kernel of this operator, which we denote by $H^N = (H^N)_{kk'}(t)$, is a Borel function of $\{g_k(\omega)\}_{\langle k\rangle \le N/2}$ and is *independent* of *F*_{N}(*ω*). Moreover, this *H*^{N} encodes the whole *randomness structure* of *u*_{N/2}, which is captured in two particular matrix norm bounds for *H*^{N}. Essentially, they involve the $\ell^2_k \to \ell^2_{k'}$ operator norm and the $\ell^2_{kk'}$ Hilbert–Schmidt norm for fixed time *t* (or fixed Fourier variable *λ*); see Sec. II B 2 for details.
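The two matrix norms in play are the $\ell^2 \to \ell^2$ operator norm and the Hilbert–Schmidt norm, and the latter always dominates the former. A small self-contained sketch (illustrative only; finite real matrices stand in for the kernels $H^N$, and the power-iteration routine is our own device, not the paper's):

```python
import math

def hs_norm(A):
    """Hilbert-Schmidt (Frobenius) norm: sqrt of the sum of squared entries."""
    return math.sqrt(sum(x * x for row in A for x in row))

def op_norm(A, iters=300):
    """l^2 -> l^2 operator norm via power iteration on A^T A.
    Starts from the all-ones vector; assumes it overlaps the top
    singular vector (true for the generic matrices used here)."""
    m, n = len(A), len(A[0])
    v = [1.0] * n
    for _ in range(iters):
        w = [sum(A[i][j] * v[j] for j in range(n)) for i in range(m)]  # w = A v
        u = [sum(A[i][j] * w[i] for i in range(m)) for j in range(n)]  # u = A^T w
        nrm = math.sqrt(sum(x * x for x in u)) or 1.0
        v = [x / nrm for x in u]
    w = [sum(A[i][j] * v[j] for j in range(n)) for i in range(m)]
    return math.sqrt(sum(x * x for x in w))
```

For the diagonal matrix diag(3, 4), the operator norm is 4 while the Hilbert–Schmidt norm is 5; for random kernels with square-root cancellations, the operator norm is typically far smaller than the Hilbert–Schmidt norm, which is what the randomness structure exploits.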

This is the main idea of the random averaging operators in Ref. 17. Basically, it allows one to fully exploit the randomness structure of the solution at all scales, which is necessary for the proof in the setting of Schrödinger equations, given the absence of any smoothing effect.

#### 2. The special term *ρ*^{N}: A “critical” component

In addition to the ansatz introduced in Sec. I C 1, it turns out that an extra term is necessary due to the structure (especially, the asymmetry) of the nonlinearity (1.1). Recall that (|*u*|^{2} * *V*)*u* can be expressed as in (1.10); for simplicity, we will ignore any resonances (which are canceled by the renormalizations), i.e., assume that *k*_{2} ∉ {*k*_{1}, *k*_{3}} in (1.10). Here, if |*k*_{1} − *k*_{2}| ≳ *N*^{ε} for some small constant *ε*, then the potential $V_{k_1-k_2}$, which is bounded by $\langle k_1 - k_2 \rangle^{-\beta}$, will transform into a derivative gain, which allows one to close easily using the random averaging operator ansatz in Sec. I C 1.

However, suppose that |*k*_{1} − *k*_{2}| is very small, say, |*k*_{1} − *k*_{2}| ∼ 1 in (1.10); then, the potential does not lead to any gain of derivatives, and we will see that this particular term, in fact, exhibits a (probabilistically) “critical” feature. To see this, let us define $\mathcal{N}$ to be this portion of the nonlinearity (and the corresponding multilinear expression),

and note the Π_{1} projector restricting to |*k*_{1} − *k*_{2}| ∼ 1. Then, if we define the iteration terms

it follows from simple calculations that *u*^{(0)} has regularity *H*^{−1/2−}, while each *u*^{(m)}, where *m* ≥ 1, has *exactly* regularity *H*^{1/2−}. Therefore, although *u*^{(1)} is indeed more regular than *u*^{(0)}, the higher order iterations are *not* getting smoother despite all input functions [which are *F*_{N}(*ω*)] having the *same* (and high) frequency. This is in contrast with the “genuinely (probabilistically) subcritical” situations (for the standard NLS equation) in Ref. 18, where for fixed positive constants *ε* and *c*, the *m*th iteration *u*^{(m)}, assuming that all input frequencies are the same, will have increasing and positive regularity in *H*^{εm−c} as *m* grows and becomes large. Similarly, one may consider the linear operator,

with $\mathcal{N}$ as in (1.18), and in typical subcritical cases, the norm of this operator from a suitable *X*^{s,b} space to itself would be *N*^{−α} for some *α* > 0; see Refs. 17 and 18. However, here (for Hartree), one can check that the corresponding norm is, in fact, ∼1 and may even exhibit a logarithmic divergence if one adds up different scales.

Therefore, it is clear that the contribution $\mathcal{N}$ as in (1.18) needs a special treatment in addition to the ansatz in Sec. I C 1. Fortunately, this term does not depend on the value of *β* and was already treated in the work of Bourgain.^{7} In this work, we introduce an extra term *ρ*^{N}, which corresponds to the term treated in the work of Bourgain,^{7} by defining *ξ*^{N} such that

and defining *ρ*^{N} = *ξ*^{N} − *ψ*^{N}, where $\widetilde{\Pi}_{N^{\epsilon}}$ is a smooth truncation at frequency *N*^{ε} for some small *ε*. This term is then measured at regularity *H*^{s} for some *s* < 1/2, while the remainder term *z*_{N} ≔ *y*_{N} − *ξ*^{N}, where *y*_{N} = *u*_{N} − *u*_{N/2}, is measured at regularity *H*^{s′} for some *s* < *s*′ < 1/2. See Sec. III A 3 for the solution ansatz and Proposition 3.1 for the precise formulations.

#### 3. Additional remark

Note that the precise definitions of the equations satisfied by *ψ*^{N} and *ξ*^{N} [see (3.2) and (3.8)] involve the projection Δ_{N} applied to the right-hand sides; this is to make sure that $(\psi^N)_k$ and $(\xi^N)_k$ are exactly supported in *N*/2 < ⟨*k*⟩ ≤ *N* so that one can exploit the cancellation due to the unitarity of the matrices *H*^{N} (corresponding to *ψ*^{N}), as well as the matrices *M*^{N} that correspond to the term *ξ*^{N}. This unitarity comes from the mass conservation property of the linear equations defining these matrices and already plays a key role in the work of Bourgain.^{7} See Sec. III B for details.

## II. PREPARATIONS

### A. Reduction of the equation

We start with system (1.7) with initial data $u_N(0) = f_N(\omega)$. Clearly, $(u_N)_k$ is supported in ⟨*k*⟩ ≤ *N*. If we denote the right-hand side of (1.7) by $\Pi_N \mathcal{N}(u_N)$, then in Fourier space, we have

We will extend $\mathcal{N}^{\circ}(u)$, which is a cubic polynomial of *u*, to an $\mathbb{R}$-trilinear operator $\mathcal{N}^{\circ}(u, v, w)$ in the standard way. Note that

is conserved under the flow (1.7), and we may get rid of the second term on the right-hand side of (2.1) by a gauge transform,

If we further define the profile *v*_{N} by

then *v*_{N} will satisfy the integral equation,

where

### B. Notations and norms

We set up some basic notations and norms needed later in the proof.

#### 1. Notations

As denoted above, we will use *v*_{k} to denote Fourier coefficients, and $\mathcal{F}v_k = \hat{v}_k = \hat{v}_k(\lambda)$ denotes the Fourier transform in time. For a finite index set *A*, we will write *k*_{A} = (*k*_{j}: *j* ∈ *A*), where each $k_j \in \mathbb{Z}^3$, and denote by $h_{k_A}$ a tensor $h: (\mathbb{Z}^3)^A \to \mathbb{C}$. We may also define tensors involving *λ* variables, where $\lambda \in \mathbb{R}$.

We fix the parameters, to be used in the proof, as follows. Let *ε* > 0 be a sufficiently small absolute constant. Let *ε*_{1} and *ε*_{2} be fixed such that *ε*_{2} ≪ *ε*_{1} ≪ *ε*. Let *β* < 1 be such that 1 − *β* ≪ *ε*_{2}, and choose *δ* such that *δ* ≪ 1 − *β* and *κ* such that *κ* ≫ *δ*^{−1}. We use *θ* to denote a generic small positive constant such that *θ* ≪ *δ* (which may be different at different places). Let *b* = 1/2 + *κ*^{−1}, so 1 − *b* = 1/2 − *κ*^{−1}. Finally, let *τ* be sufficiently small compared to all the above-mentioned parameters, and denote *J* = [−*τ*, *τ*]. Fix a smooth cutoff function *χ*(*t*), which equals 1 for |*t*| ≤ 1 and equals 0 for |*t*| ≥ 2, and define *χ*_{τ}(*t*) ≔ *χ*(*τ*^{−1}*t*). We use *C* to denote any large absolute constant and *C*_{θ} for any large constant depending on *θ*. If some event happens with probability $\ge 1 - C_\theta e^{-A^\theta}$, where *A* is a large parameter, we say that this event happens *A-certainly*.

#### 2. Norms

If (*B*, *C*) is a partition of *A*, namely, *B* ∩ *C* = *∅* and *B* ∪ *C* = *A*, we define the norm $\|h\|_{k_B \to k_C}$ such that

The same notation also applies for tensors involving the *λ* variables. For functions *u* = *u*_{k}(*t*) and *h* = *h*_{kk′}(*t*) and 0 < *c* < 1, we also define the norms

For any interval *I*, define the corresponding localized norms

and similarly define *Y*^{c}(*I*) and *Z*^{c}(*I*). Abusing notation, we will call the above *v* an *extension* of *u*, though it is actually an extension of the restriction of *u* to *I*.

### C. Preliminary estimates

Here, we record some basic estimates. Most of them are standard or are in our previous works.^{17,18}

#### 1. Linear estimates

Define the original and truncated Duhamel operators,

*Lemma 2.2.*

*We have the formula*

*where the kernel I satisfies*

*Proposition 2.3.*

*Let φ be any Schwartz function, and recall that $\varphi_\tau(t) = \varphi(\tau^{-1}t)$ for $\tau \ll 1$. Then, for any $u = u_k(t)$, we have*

*provided that either $0 < c \le c_1 < 1/2$, or $u_k(0) = 0$ and $1/2 < c \le c_1 < 1$. The same result also holds if $u = u(t)$ is measured in norms other than $\ell^2$, so (2.11) is true with X replaced by Y or Z.*

*Proof.*

See Ref. 18, Lemma 4.2.□

*Lemma 2.4.*

*Suppose that $f(x, t)$ is a function defined for $t \in [-\tau, \tau] = J$, with $|\tau| \ll 1$. Define*

*For any Schwartz function φ, we have*

*provided that either $0 < b < b_1 < 1/2$ or $1/2 < b < b_1 < 1$. When $1/2 < b < b_1 < 1$, we have*

*Proof.*

We only need to bound locally in-time the function *f*^{*}(*t*), which equals *f*(0) for *t* ≥ 0 and *f*(*t*) for *t* < 0; in fact, *g* is obtained by performing twice the transformation from *f* to *f*^{*}, first at center *τ* and then at center −*τ*.

We can decompose *f* into two parts, *f*_{1}, which is smooth and equals *f*(0) near 0, and *f*_{2} such that *f*_{2}(0) = 0. Clearly, we only need to consider *f*_{2} so that *f*^{*} equals *f*_{2} multiplied by a smooth truncation of **1**_{[0,+∞)}, with *f*_{2}(0) = 0.

We may replace **1**_{[0,+∞)} by the sign function and then apply Proposition 2.3; note that for an even smooth cutoff function *χ*,

where Δ_{N} are the standard Littlewood–Paley projections. Moreover, Δ_{N}(*χ* · sgn)(*x*) can be viewed as a rescaled Schwartz function of the same form as in Proposition 2.3 with *τ* = *N*^{−1} (due to the expression of the Fourier transform of sgn and simple calculations), so the desired result follows from Proposition 2.3.□

#### 2. Counting estimates

Here, we list some counting estimates and the resulting tensor norm bounds.

*Lemma 2.5.*

(1) *Let $R = \mathbb{Z}$ or $\mathbb{Z}[i]$. Then, given $0 \ne m \in R$ and $a_0, b_0 \in \mathbb{C}$, the number of choices for $(a, b) \in R^2$ that satisfies*

*is $O(M^\theta N^\theta)$, with the constant depending only on θ > 0.*

(2) *For dyadic numbers $N_1, N_2, N_3, R > 0$ and some fixed number $\Omega_0$, define*

*and then, $S_k^R$ is the set of $(k, k_1, k_2, k_3) \in S^R$ when k is fixed. We have the following counting estimates:*

*Proof.*

(1) It is the same as part (1) of Lemma 4.3 in Ref. 17. (2) We consider |*S*^{R}|. First, the number of choices of *k*_{1} and *k*_{3} is $N_1^3 N_3^3$. After fixing the choice of *k*_{1} and *k*_{3}, to count (*k*, *k*_{2}) it is equivalent to count *k*_{2} satisfying the restriction |*k*_{2}|^{2} + |*k*_{2} + *c*_{1}|^{2} = *c*_{2} or to count *k* satisfying the restriction |*k*|^{2} + |*k* + *c*_{3}|^{2} = *c*_{4} for some fixed numbers *c*_{1}, … , *c*_{4}, and hence, we have $|S^R| \lesssim N_1^3 N_3^3 (N_2 \wedge N)^{1+\theta}$. Similarly, if we first fix *k* and *k*_{2}, we have $|S^R| \lesssim N^3 N_2^3 (N_1 \wedge N_3)^{1+\theta}$. In addition, if we fix *k*_{2} first, then to count (*k*, *k*_{1}, *k*_{3}) is equivalent to count (*k*_{1}, *k*_{3}) with the restriction (*k*_{2} − *k*_{1}) · (*k*_{2} − *k*_{3}) = *c* for some fixed number *c*. By fixing the first two components of (*k*_{1}, *k*_{3}) and using part (1), we have $|S^R| \lesssim N_2^3 (RN_3)^{2+\theta}$. Similarly, we also have $|S^R| \lesssim N^3 (RN_1)^{2+\theta}$. The proofs of (2.18)–(2.24) are similar.□
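Part (1) is, at heart, a divisor-counting bound: an equation of the form $a \cdot b = m$ in $\mathbb{Z}$ or $\mathbb{Z}[i]$ has at most $O(|m|^\theta)$ solutions once the other data are fixed. A brute-force illustration over $\mathbb{Z}$ (the reduction to the plain divisor bound is an assumption of this sketch; the actual lemma allows shifts $a_0, b_0$):

```python
def pair_count(m):
    """Number of pairs (a, b) in Z^2 with a * b = m, for m != 0.
    Each divisor a of |m| (taken with either sign) determines b,
    so the count equals 2 * d(|m|)."""
    m = abs(m)
    return 2 * sum(1 for a in range(1, m + 1) if m % a == 0)

# The count grows far slower than any fixed power of m: even the most
# highly composite m below 2000 satisfies pair_count(m) < 4 * m**0.5.
```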

#### 3. Probabilistic and tensor estimates

*Proposition 2.6.*

*Consider two tensors $h^{(1)}_{k_{A_1}}$ and $h^{(2)}_{k_{A_2}}$, where $A_1 \cap A_2 = C$. Let $A_1 \Delta A_2 = A$, and define the semi-product*

*Then, for any partition $(X, Y)$ of A, let $X \cap A_1 = X_1$ and $Y \cap A_1 = Y_1$, and we have*

*Proposition 2.7.*

*Let $A_j$ $(1 \le j \le m)$ be index sets such that any index appears in at most two $A_j$'s, and let $h^{(j)} = h^{(j)}_{k_{A_j}}$ be tensors. Let $A = A_1 \Delta \cdots \Delta A_m$ be the set of indices that belong to only one $A_j$ and $C = (A_1 \cup \cdots \cup A_m) \setminus A$ be the set of indices that belong to two different $A_j$'s. Define the semi-product*

*Let $(X, Y)$ be a partition of A. For $1 \le j \le m$, let $X_j = X \cap A_j$ and $Y_j = Y \cap A_j$, and define*

*and then, we have*

Here (for $q \le r$), $(k_{A'}, k_{B'})$ is a partition of the variables $(k_1', \ldots, k_q', k_{q+1}, \ldots, k_r)$ and $(k_A, k_B)$ is a partition of the variables $(k_1, \ldots, k_r)$, where each $k_j'$ $(1 \le j \le q)$ is replaced by $k_j$ in $(k_{A'}, k_{B'})$.

*Proposition 2.8.*

*Let A be a finite set and* $h_{bck_A} = h_{bck_A}(\omega)$ *be a tensor, where each* $k_j \in \mathbb{Z}^d$ *and* $(b, c) \in (\mathbb{Z}^3)^q$ *for some integer* $q \ge 2$*. Given signs* $\zeta_j \in \{\pm\}$*, we also assume that* ⟨*b*⟩, ⟨*c*⟩ ≲ *M and* ⟨*k*_{j}⟩ ≲ *M for all j* ∈ *A, where M is a dyadic number, and that in the support of* $h_{bck_A}$*, there is no pairing in* $k_A$*. Define the tensor*

*where we restrict* $k_j \in E$ *in (2.31), with E being a finite set such that* $\{h_{bck_A}\}$ *is independent of* $\{\eta_k : k \in E\}$*. Then,* $\tau^{-1}M$*-certainly, we have*

*where* $(B, C)$ *runs over all partitions of A. The same results hold if we do not assume* ⟨*b*⟩, ⟨*c*⟩ ≲ *M, but instead that (i)* $b, c \in \mathbb{Z}^3$ *and* $|b - c| \lesssim M$ *and* $||b|^2 - |c|^2| \lesssim M^{\kappa_3}$ *and (ii)* $h_{bck_A}$ *can be written as a function of* $b - c$*,* $|b|^2 - |c|^2$*, and* $k_A$*.*

For the Proof of Proposition 2.8, see Ref. 18, Propositions 4.14 and 4.15.

*Proposition 2.9.*

*Suppose that the matrices* $h = h_{kk''}$, $h^{(1)} = h^{(1)}_{kk'}$*, and* $h^{(2)} = h^{(2)}_{k'k''}$ *satisfy*

*and that* $h^{(1)}_{kk'}$ *is supported in* |*k* − *k*′| ≲ *L; then we have*

For the Proof of Proposition 2.9, see Ref. 17, Proposition 2.5, or Ref. 18, Lemma 4.3 (there are different versions of this bound, but the proofs are the same).

## III. THE ANSATZ

### A. The structure of *y*_{N}

Start with systems (2.3) and (2.4). Let *y*_{N} = *v*_{N} − *v*_{N/2}, and then, *y*_{N} satisfies the integral equation,

#### 1. The term *ψ*^{N,L}

For any *L* ≤ *N*/2, consider the linear equation for Ψ = Ψ_{k}(*t*),

where we define, with *δ* ≪ 1,

and define also $M\u2009>\u2254M\u2009\u25cb\u2212M\u2009<$. If (3.2) has initial data Ψ_{k}(0) = Δ_{N}*ϕ*_{k}, then the solution may be expressed as

where $HN,L=Hkk\u2032N,L$ is the kernel of a linear operator (or a matrix). Define also

and similarly,

Note that when *L* = 1, we will replace *L*/2 by 0, so, for example, $(\psi N,0)k(t)=(FN)k$. For simplicity, denote

Note that each *h*^{N,L} and *H*^{N,L} is a Borel function of $(gk(\omega ))\u27e8k\u27e9\u2264N/2$ and is thus independent of the Gaussians in *F*_{N}.

#### 2. The terms *ξ*^{N} and *ρ*^{N}

Next, similar to (3.2), we consider the linear equation,

where $M\u2009\u226a$ is defined by

If the initial data are Ξ_{k}(0) = Δ_{N}*ϕ*_{k}, then we may write the solution as

which defines the matrix $MN=Mkk\u2032N$. We then define *ξ*^{N} and *ρ*^{N} by

#### 3. The ansatz

Now, we introduce the ansatz

where *z*_{N} is a remainder term. We can calculate that *z*_{N} solves the following equation (recall *y*_{N} = *v*_{N} − *v*_{N/2}):

### B. Unitarity of matrices *H*^{N,L} and *M*^{N}

The following properties of *H* and *M* will play a fundamental role; the idea goes back to Bourgain.^{7} Recall that for *L* ≤ *N*/2, the matrix *H*^{N,L} is defined by (3.2) and (3.4). Note that if Ψ solves (3.2), then Ψ_{k}(*t*) is supported in *N*/2 < ⟨*k*⟩ ≤ *N*; recalling that $V_k = V_{-k} = \overline{V_k}$, we then have

The sum on the right-hand side may be replaced by two terms, namely, *S*_{1} where we only require *k*_{1} ≠ *k*_{2} in the summation and *S*_{2} where we require *k*_{1} ≠ *k*_{2} and *k*_{2} = *k*_{3} in the summation. For *S*_{1}, by swapping (*k*, *k*_{1}, *k*_{2}, *k*_{3}) ↦ (*k*_{3}, *k*_{2}, *k*_{1}, *k*), we also see that $S1\u2208R$ and, hence, Im(*S*_{1}) = 0; moreover,

which is also real-valued by swapping (*k*, *k*_{2}) ↦ (*k*_{2}, *k*). This means that ∑_{k}|Ψ_{k}(*t*)|^{2} is conserved in time. Therefore, for each fixed *t*, the matrix $HN,L=Hkk\u2032N,L$ is unitary; hence, we get the identity
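The computation above can be summarized schematically as follows (a sketch in our own notation, not the paper's display):

```latex
% Conservation: Im(S_1) = Im(S_2) = 0 gives, for every initial data \Delta_N\phi,
\frac{d}{dt}\sum_k |\Psi_k(t)|^2 \;\propto\; \mathrm{Im}(S_1 + S_2) = 0 .
% Since \Psi_k(t) = \sum_{k'} H^{N,L}_{kk'}(t)\,(\Delta_N\phi)_{k'} for arbitrary data,
% H^{N,L}(t) preserves the \ell^2 norm, i.e., it is unitary, which yields
\sum_{k'} H^{N,L}_{kk'}(t)\,\overline{H^{N,L}_{\tilde{k}k'}(t)} = \delta_{k\tilde{k}} .
```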

### C. The *a priori* estimates

We now state the main *a priori* estimate and prove that this implies Theorem 1.3.

*Proposition 3.1.*

*Given* 0 < *τ* ≪ 1*, and let J* = [−*τ*, *τ*]*. Recall the parameters defined in* Sec. II B*. For any M, consider the following statements, which we call* Local(*M*):

*For the operators h*^{N,L}*, where L*<*M and N*>*L is arbitrary, we have*

*as well as*

*For the terms ρ*^{N}*and z*_{N}*, where N*≤*M, we have*

*For any L*_{1}, *L*_{2} < *M, the operator defined by*

*has an extension, which we still denote by* $\mathcal{L}$ *for simplicity. The kernel* $\mathcal{L}_{kk'}(t, t')$ *has Fourier transform* $\hat{\mathcal{L}}_{kk'}(\lambda, \lambda')$*, which satisfies*

*and*

*where L* = max(*L*_{1}, *L*_{2}).

*Now, with the above definition, we have that*

*Proof of Theorem 1.3.*

By Proposition 3.1, *τ*^{−1}-certainly, the event Local(*M*) happens for any *M*. By (3.4), (3.11), and (3.12), we have

Using the independence between *h*^{N,L} and *F*_{N} and using Proposition 2.8 combined with (3.16), we can show that $\|\zeta^{N,L}\|_{X^b(J)} \lesssim N^{\delta}L^{-1/3}$. Summing over *L* and noting that *ζ*^{N,L} is supported in *N*/2 < ⟨*k*⟩ ≤ *N*, we see that

for some *γ* > 0. Using also (3.18), we can see that the sequence {*v*_{N} − *f*_{N}} converges in $C_t^0H_x^{0-}(J)$. Hence, {*v*_{N}} converges in $C_t^0H_x^{-1/2-}(J)$, and so does the original sequence {*u*_{N}}.

Therefore, the solution *u*_{N} to (1.7) converges to a unique limit as *N* → *∞* with probability at least $1 - C_\theta e^{-\tau^{-\theta}}$. This proves the *almost-sure local well-posedness* of (1.1) with Gibbs measure initial data. Since the truncated Gibbs measure d*η*_{N} defined by (1.13) is invariant under (1.7) and the truncated Gibbs measures converge strongly to the Gibbs measure d*ν* as in Proposition 1.2, we can apply the standard local-to-global argument of Bourgain, where the *a priori* estimates in Proposition 3.1 provide the stability bounds needed in the process, in exactly the same way as in Ref. 17. The almost-sure global existence and invariance of the Gibbs measure then follow.□
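Bourgain's local-to-global argument invoked here can be sketched as follows (schematic, under illustrative assumptions: Φ_{t} denotes the truncated flow, *B* a set of good data, and the measure bound on *B*^{c} is hypothetical):

```latex
% Bourgain's invariant-measure globalization (sketch): if local solutions exist on
% [0, \tau] for data in a set B with \eta_N(B^c) \le e^{-cR^2}, then by invariance
% of \eta_N under the truncated flow \Phi_t,
\eta_N\Big(\bigcup_{j=0}^{T/\tau} \Phi_{-j\tau}(B^c)\Big)
  \le \Big(\frac{T}{\tau} + 1\Big)\,\eta_N(B^c)
  \le \Big(\frac{T}{\tau} + 1\Big)\, e^{-cR^2}.
% Choosing R = R(T) growing slowly with T makes this small, giving solutions on
% [0, T] outside a set of small measure; letting T \to \infty yields almost-sure
% global existence, and invariance of the limiting measure follows in the limit.
```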

### D. A few remarks and simplifications

From now on, we will focus on the Proof of Proposition 3.1 and assume that the bounds involved in Local(*M*/2) are already true. The goal is to recover (3.16)–(3.18) and (3.20)–(3.21) for *M*. Before proceeding, we want to remark on a few simplifications that we will make in the proof below. These are either standard or the same as in Refs. 17 and 18, and we will not spell out the corresponding arguments in the proof below.

(1) In proving these bounds, we will use the standard continuity argument, which involves a smallness factor. The localized *X*^{c} norm, *X*^{c}([0, *T*]), is continuous in *T* if the function is smooth, which enables the continuity argument. Here, this factor is provided by the short time *τ* ≪ 1. In particular, we can gain a positive power *τ*^{θ} by using^{38} Lemma 2.3 at the price of changing the *c* exponent in the *X*^{c} (or *Y*^{c} or *Z*^{c}) norm by a little (e.g., from 1 − *b* to *b*). It can be checked in the proof below that all the estimates allow for some room in *c*, so this is always possible.

(2) In each proof below, we can actually gain an extra power *M*^{δ/10} compared to the desired estimate, so any loss that is $M^{C\kappa^{-1}}$ will be acceptable. In fact, in the proof below, we will frequently encounter losses of at most $M^{C\kappa^{-1}}$, due to manipulations of the *c* exponent in various norms as in (1) and due to the application of probabilistic bounds such as Proposition 2.8, where we lose a small *θ* power.

(3) In the course of the proof, we will occasionally need to bound quantities of the form sup_{λ} *G*(*λ*), where *λ* ranges in an interval *I*, and for each *fixed λ*, the quantity |*G*(*λ*)| can be bounded apart from a small exceptional set; moreover, here, *G* will be differentiable, and *G*′(*λ*) will satisfy a weaker but unconditional bound. Then, we can apply the *meshing argument* in Refs. 17 and 18, where we divide the interval into a large number of subintervals, approximate *G* on each small interval by a sample (or an average), control the error term using *G*′, and add up the exceptional sets corresponding to the sample in each interval. In typical cases, where *M*-certainly |*G*(*λ*)| ≤ *M*^{θ} for each fixed *λ*, |*I*| ≤ *M*^{C}, and |*G*′(*λ*)| ≤ *M*^{C} unconditionally, we can deduce that *M*-certainly sup_{λ}|*G*(*λ*)| ≤ *M*^{θ}, because the number of subintervals is *O*(*M*^{C}), so the total probability of the union of exceptional sets is still sufficiently small.
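The probability bookkeeping behind this meshing argument is a simple union bound; schematically (a sketch in our own notation, with *C*′ a large constant):

```latex
% Meshing: divide I into J = O(M^{C'}) subintervals I_i with sample points \lambda_i.
\sup_{\lambda \in I} |G(\lambda)|
  \le \max_i |G(\lambda_i)| + \max_i\, |I_i| \cdot \sup_{\lambda} |G'(\lambda)|
  \le \max_i |G(\lambda_i)| + O(M^{-C}),
% choosing J large enough that |I_i| \sup |G'| = O(M^{-C}). Each event
% |G(\lambda_i)| \le M^\theta fails with probability \le e^{-M^\theta}, so by the
% union bound all J samples are good except with probability \le J e^{-M^\theta},
% which is still exponentially small since J = O(M^{C'}).
```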

## IV. THE RANDOM AVERAGING OPERATOR

### A. The operator $L\u2009$

We start by proving (3.20) and (3.21) for *L* = *M*/2. We need to construct an extension of $L\u2009$ defined in (3.19). This is done first using Lemma 2.4 to find extensions of each component of $yL1$ and $yL2$ [note that max(*L*_{1}, *L*_{2}) = *M*/2] such that these extensions satisfy (3.16)–(3.18) with the localized *X*^{b}(*J*), …, norms replaced by the global *X*^{b}, …, norms, at the expense of slightly worse exponents. The change in the values of the exponents will play no role in the proof below, so we will omit it. Then, by attaching to $L\u2009$ a factor *χ*(*τ*^{−1}*t*) and using Lemma 2.3 (see Sec. III D), we can gain a smallness factor *τ*^{θ} at the price of further worsening the exponents. These operations are standard, so we will not repeat them below.

Note that the extension defined in Lemma 2.4 preserves the independence between the matrices $hLj,Rj$ and $FLj$ for *R*_{j} ≤ *L*_{j}/2.

Recall that $L\u2009^kk\u2032(\lambda ,\lambda \u2032)$ is the Fourier transform of the kernel $L\u2009kk\u2032(t,t\u2032)$ of $L\u2009$, and we have

Now, we consider the different cases.

where Ω = |*k*|^{2} − |*k*_{1}|^{2} + |*k*_{2}|^{2} − |*k*′|^{2} and *I* = *I*(*λ*, *μ*) is as in (2.10); we will omit the factor *η*((*k*_{1} − *k*_{2})/*N*^{1−δ}) in the definition of $M\u2009<$ in (3.3) as it does not play a role. We may also assume that |*k*_{1} − *k*_{2}| ∼ *R* ≲ *L*. In the above expression, let *μ* ≔ *λ* − (Ω + *λ*_{1} − *λ*_{2} + *λ*′); in particular, we have |*I*| ≲ ⟨*λ*⟩^{−1}⟨*μ*⟩^{−1} by (2.10). By a routine argument, in proving (3.20), we may assume that |*λ*_{j}| ≤ *L*^{100} and |*μ*| ≲ *L*^{100}; in fact, if, say, |*λ*_{1}| is the maximum of these values and |*λ*_{1}| ≥ *L*^{100} (the other cases being similar), then we may fix the values of *k*_{j} and, hence, *k* − *k*′, at a loss of at most *L*^{12}, and reduce to estimating

with |*λ*_{1}| ∼ *K* ≥ *L*^{100} and $\Vert \u27e8\lambda j\u27e9bwj^\Vert L2\u22721$ for each *j*. By estimating *w*_{1} in the unweighted *L*^{2} norm, we can gain a power *K*^{−1/2}, and using the $L\lambda 21$ integrability of $w2^$, which follows from the weighted *L*^{2} norm, we can fix the value of *λ*_{2}. In the end, this leads to

and hence,

which is more than enough because $\Vert L\u2009^\Vert k\u2192k\u2032=supk,k\u2032|L\u2009^kk\u2032|$ when $L\u2009$ is supported where *k* − *k*′ is constant.

Now, we may assume |*λ*_{j}| ≤ *L*^{100} for *j* ∈ {1, 2} and |*μ*| ≤ *L*^{100}; we may also assume $|\lambda |+|\lambda \u2032|\u2264L\kappa 3$ as, otherwise, we gain from the weights ⟨*λ*⟩^{2(1−b)} and ⟨*λ*′⟩^{−2b} in (3.20). Similarly, in proving (3.21), we may assume |*λ*_{j}| ≤ *N*^{100} for *j* ∈ {1, 2}, |*μ*| ≤ *N*^{100}, and |*λ*| + |*λ*′| ≤ *N*^{100} (otherwise, we may also fix (*k*, *k*′) and argue as above). Therefore, in proving (3.21), we may replace the unfavorable exponents ⟨*λ*⟩^{2b}⟨*λ*′⟩^{−2(1−b)} by the favorable ones ⟨*λ*⟩^{2(1−b)}⟨*λ*′⟩^{−2b} at a price of $NC\kappa \u22121$; this will be acceptable since in the proof, we will be able to gain a power *N*^{−δ/2}. We remark that in the proof below (though not here), we may use the *Y*^{1−b} norm as in (3.16) for the matrices in the decomposition of $yLj$; using the bounds of *λ*_{j} as above, we may replace the exponent 1 − *b* by *b* (which then implies $L\lambda j1$ integrability) again at a loss of either $LC\kappa \u22121$ or $NC\kappa \u22121$ depending on whether we are proving (3.20) or (3.21), which is acceptable. See also Sec. III D.

This then allows us to fix the values of *λ*_{j} in (4.2) using the $L\lambda j1$ integrability coming from the weighted norms; moreover, by using the bound |*I*| ≲ ⟨*λ*⟩^{−1}⟨*μ*⟩^{−1}, upper bounds for *λ* and *μ* as mentioned above, and the weights in (3.20)–(3.21), we may also fix the values of *λ*, *λ*′, and ⌊*μ*⌋and reduce to estimating the following quantity:

where the tensor (which we call the *base tensor*)

with some value Ω_{0} determined by *λ*_{j}, *λ*, *λ*′, and ⌊*μ*⌋. Here, we also assume |*k*_{j}| ≲ *L*_{j} and |*k*_{1} − *k*_{2}| ∼ *R* ≲ *L* and $\Vert wj\Vert \u21132\u2272Lj\u22121/2+\epsilon 1+\epsilon 2$.

Now, (4.3) is easily estimated by using Proposition 2.7 that

which is enough for (3.20) [namely, we multiply this by the factor ⟨*λ*⟩^{−1} coming from *I* and the weight ⟨*λ*⟩^{1−b}⟨*λ*′⟩^{−b} in (3.20) and then take the *L*^{2} norm in *λ* and *λ*′ to get (3.20); the same happens below]. In particular, the norm $\Vert hb\Vert kk2\u2192k1k\u2032$ is bounded as

by Schur's bound, where $Sk1k\u2032R$ and $Skk2R$ are defined similarly to the sets in Lemma 2.5.
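The Schur bound invoked here is the standard Schur test, which we record for convenience (a standard fact, in our own notation):

```latex
% Schur's test: for a matrix (h_{kk'}),
\|h\|_{\ell^2_{k'} \to \ell^2_k}
  \le \Big( \sup_k \sum_{k'} |h_{kk'}| \Big)^{1/2}
      \Big( \sup_{k'} \sum_k |h_{kk'}| \Big)^{1/2},
% so row- and column-sum bounds (here supplied by the counting estimates for the
% sets S^R in Lemma 2.5) control the k -> k' operator norm.
```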

For the $\Vert Q\Vert kk\u2032$ norm, we have

which is enough for (3.21). Note that all the bounds for *h*^{b} we use here follow from Schur’s bound and Lemma 2.5.

Suppose that $yL1$ is replaced by $\rho L1+zL1$ and $yL2$ is replaced by $\psi L2$. We may further decompose $\psi L2$ into $\zeta L2,R2$ for *R*_{2} ≤ *L*_{2}/2 (including the case *R*_{2} = 0, by which we mean $\zeta L2,0=FL2$) and perform the same arguments as mentioned above, fixing the *λ* variables, and reduce (this reduction step actually involves a meshing argument as the estimate for $Q$ is probabilistic; see Sec. III D) to estimating the following quantity:

where $\Vert w1\Vert \u21132\u2272L1\u22121/2+\epsilon 1+\epsilon 2$ and *h*^{(2)} is independent of $FL2$ and is either the identity matrix or satisfies $\Vert h(2)\Vert k2\u2192k2\u2032\u2272R2\u22121/2+3\epsilon 1$ and $\Vert h(2)\Vert k2k2\u2032\u2272L21+\delta R2\u22121/2+2\epsilon 1$. We then estimate (4.4) by

using Propositions 2.6 and 2.8, which is enough for (3.20). Note that here *h*^{b} depends on *k* and *k*′ only via *k* − *k*′ and |*k*|^{2} − |*k*′|^{2} and that $||k|^2 - |k'|^2| \le L^{\kappa_3}$, given the assumptions, so Proposition 2.8 is applicable. Similarly, for the $\u2113kk\u20322$ norm, we have

which is enough for (3.21).

Suppose that $yLj$ is replaced by $\psi Lj$ for *j* ∈ {1, 2}. In this case, we will start from (3.19) and expand

for *j* ∈ {1, 2}. There are then two cases, namely, when $k1\u2032$ = $k2\u2032$ or otherwise.

If $k1\u2032$ ≠ $k2\u2032$, then we can repeat the above argument [including further decomposing $\psi Lj$ into $\zeta Lj,Rj$ using (3.6) and (3.7)] and fix the time Fourier variables and reduce to estimating a quantity

where *h*^{(j)} is independent of $FLj$ and is either the identity matrix or satisfies $\Vert h(j)\Vert kj\u2192kj\u2032\u2272Rj\u22121/2+3\epsilon 1$ and $\Vert h(j)\Vert kjkj\u2032\u2272Lj1+\delta Rj\u22121/2+2\epsilon 1$. Since $k1\u2032$ ≠ $k2\u2032$, we can apply Proposition 2.8 either in ($k1\u2032$, $k2\u2032$) jointly (if *L*_{1} = *L*_{2}) or first in $k1\u2032$ and then in $k2\u2032$ (if, say, *L*_{1} ≥ 2*L*_{2}) and get that

which is enough for (3.20). As for the $\u2113kk\u20322$ norm, we have

which is enough for (3.21).

Finally, assume that $k1\u2032$ = $k2\u2032$, and then, *L*_{1} = *L*_{2} = *L*. In (3.19), the summation in $k1\u2032$ = $k2\u2032$ gives

Using the cancellation (3.15) since *k*_{1} ≠ *k*_{2}, we can replace the factor $1/\u27e8k1\u2032\u27e92$ in the above expression by $1/\u27e8k1\u2032\u27e92\u22121/\u27e8k1\u27e92$; then, by further decomposing $HLj$ into $hLj,Rj$ by (3.7) and repeating the above arguments, we can reduce to estimating the following quantity:

where *h*^{(j)} is either the identity matrix or satisfies $\Vert h(j)\Vert kj\u2192kj\u2032\u2272Rj\u22121/2+3\epsilon 1$ and $\Vert h(j)\Vert kjkj\u2032\u2272Lj1+\delta Rj\u22121/2+2\epsilon 1$. Note that we may assume |*k*_{j} − $kj\u2032$| ≲ *R*_{j}*L*^{δ} using the bound (3.17), so, in particular, we have

up to a loss of *L*^{Cδ} (which is acceptable as in this case, we can gain at least $L\epsilon 2$). Using these, we estimate, assuming without loss of generality that *R*_{1} ≥ *R*_{2},

### B. The matrices *H*^{N,L} and *h*^{N,L}

and we also extend its kernel in the same way as we do for $L\u2009$ in Sec. IV A. Let $L\u2009\u0303N,L=L\u2009N,L\u2212L\u2009N,L/2$, and then, by induction hypothesis and the proof in Sec. IV A, we know that $L\u2009\u0303N,L$ also satisfies estimates (3.20) and (3.21). Clearly, (3.20) implies that $\Vert L\u2009\u0303N,L\Vert Xb\u2192X1\u2212b\u2272L\u22121/2+3\epsilon 1\u2212\epsilon 2$; moreover, it is easy to see that

and hence, $\Vert L\u2009N,L\Vert X0\u2192X1\u2272L12$, and the same holds for $L\u2009\u0303N,L$. By interpolation, we obtain that $\Vert L\u2009\u0303N,L\Vert X\alpha \u2192X\alpha \u2272L\u22121/2+3\epsilon 1$ for *α* ∈ {*b*, 1 −*b*} (note that we can always gain a positive power of *τ* using Lemma 2.3; see Sec. III D). Moreover, consider the kernel $(FL\u2009\u0303N,L)kk\u2032(\lambda ,\lambda \u2032)$, and then, we also have the following bound:

which follows from (3.20). If we replace the factor ⟨*λ*′⟩^{−b} by 1, then a simple argument shows that

(and the same for $L\u2009\u0303N,L$) by using that

and then fixing the Fourier modes of *v*_{L}. Interpolating again, we get that

A similar interpolation gives

Now, let

and then, it is easy to see that $H\u2009\u2009N,L\u22121$ satisfies the same bounds (4.12) and (4.13) with right-hand sides replaced by 1; for example, (4.12) follows from iterating the following bound:

provided that

and (4.13) is proved similarly. Define further

By iterating the *X*^{α} → *X*^{α} bounds and using also (3.21) for $L\u2009\u0303N,L$, we can show that

The weighted bound

is shown in the same way but using Proposition 2.9.

In addition, we can also show that

and similarly,

assuming (4.14).

Now, we can finally prove (3.16) and (3.17). In fact, by definition of $H\u2009\u2009N,L$ and $H\u2009\u2009\u0303N,L$, there exists an extension of *h*^{N,L} such that

so the *Y*^{1−b} and *Z*^{b} bounds in (3.16), as well as (3.17), can be deduced directly from (4.15)–(4.17). The bound sup_{t}$\Vert $*h*^{N,L}(*t*)$\Vert $_{k→k′} is also easily controlled by $\Vert H\u2009\u2009\u0303N,L\Vert Xb\u2192Xb$ using the embedding $Lt\u221eL2\u21aaXb$. This completes the proof for (3.16) and (3.17).

## V. ESTIMATES FOR *ρ*^{N}

In this section, we prove the first bound in (3.18) regarding *ρ*^{N}, assuming *N* = *M*. Recall that from (3.2), (3.4), and (3.8), we deduce that *ρ*^{N} satisfies the following equation:

with initial data $(\rho N)k(0)=0$. Let $L\u2009N,L$ be defined as in (4.11), and denote $L\u2009N\u2254L\u2009N,N/2$; from Sec. IV B, we know that $(1\u2212L\u2009N)\u22121\u2254H\u2009\u2009N$ is well-defined and has kernel $(H\u2009\u2009N)kk\u2032(t,t\u2032)$ in physical space and $(FH\u2009\u2009N)kk\u2032(\lambda ,\lambda \u2032)$ in Fourier space. Then, (5.1) can be reduced to

where

Here, in (5.3), we assume for *j* ∈ {1, 2} that $wj\u2208{\psi Nj,\rho Nj,zNj}$, where max(*N*_{1}, *N*_{2}) = *N*, and that *w*_{3} ∈ {*ψ*^{N}, *ρ*^{N}}.

In order to prove the bound for *ρ*^{N} in (3.18), we will apply a continuity argument, namely, assuming (3.18) and then improving it with a smallness factor. This can be done as long as we bound

since from Sec. IV B, we know that $H\u2009\u2009N$ is bounded from *X*^{b}(*J*) to *X*^{b}(*J*). In fact, we will prove (5.4) with an extra gain $N\u2212\epsilon 2/2$, which will allow us to ignore any possible *N*^{Cδ} loss in the process. The smallness factor *τ*^{θ} will be provided by Lemma 2.3 as in Sec. III D, so we will not worry about it below. We divide the right-hand side of (5.3) into three terms:

Term I: when *w*_{3} = *ρ*^{N}.

Term II: when *w*_{3} = *ψ*^{N} and *z*_{N′} ∈ {*w*_{1}, *w*_{2}} for some *N*′ ≥ *N*/2.

Term III: when *w*_{3} = *ψ*^{N} and *w*_{1}, *w*_{2} ∈ {*ψ*^{N}, *ρ*^{N}, *ψ*^{N/2}, *ρ*^{N/2}}.

Note that these are the only possibilities since if (say) *N*_{1} = *N*, *w*_{1} ∈ {*ψ*^{N}, *ρ*^{N}}, and *N*_{2} ≤ *N*/2, then we must have *N*_{2} = *N*/2 due to the support condition for *ψ*^{N} and *ρ*^{N}, as well as the restriction |*k*_{1} − *k*_{2}| ≲ *N*^{ε} in $M\u2009\u226a$. Moreover, the estimate of term I follows from the operator norm bound,

### A. Term II

For simplicity, we assume that *w*_{1} = *z*_{N′} (the proof of the case *w*_{2} = *z*_{N′} is similar). There are then two cases to consider, when $w2\u2208{\rho N2,zN2}$ or when $w2=\psi N2$.

#### 1. The case $w2\u2208{\rho N2,zN2}$

If $w2\u2208{\rho N2,zN2}$, then we, in particular, have $\Vert w2\Vert Xb(J)\u2272N2\u22121/2+\epsilon 1+\epsilon 2$. By Lemma 2.4, we may fix extensions of *w*_{1} and *w*_{2} that satisfy the same bounds but with *X*^{b}(*J*) replaced by *X*^{b}; moreover, they satisfy the same measurability conditions as *w*_{1} and *w*_{2}. For simplicity, we will still denote them by *w*_{1} and *w*_{2}. The same is done for *w*_{3} = *ψ*^{N}, as well as the corresponding matrices.

Now, by (5.3) and Lemma 2.2, we can find an extension of II, which we still denote by II for simplicity such that

where Ω = |*k*|^{2} − |*k*_{1}|^{2} + |*k*_{2}|^{2} − |*k*_{3}|^{2} and *I* = *I*(*λ*, *μ*) is as in (2.10). In the above expression, let *μ* ≔ *λ* − (Ω + *λ*_{1} − *λ*_{2} + *λ*_{3}). In particular, we have |*I*| ≲ ⟨*λ*⟩^{−1}⟨*μ*⟩^{−1} by (2.10). By a routine argument, we may assume that |*λ*| ≤ *N*^{100} and similarly for *μ* and each *λ*_{j}; in fact, if, say, |*λ*_{1}| is the maximum of these values and |*λ*_{1}| ≥ *N*^{100}, then we may fix the values of *k* and all *k*_{j} at a loss of at most *N*^{12} and reduce to estimating (with the value of Ω fixed)

with |*λ*_{1}| ∼ *K* ≥ *N*^{100} and $\Vert \u27e8\lambda j\u27e9bwj^\Vert L2\u22721$ for each *j*. By estimating *w*_{1} in the unweighted *L*^{2} norm, we can gain a power *K*^{−1/2}, and using the *L*^{1} integrability of $wj^$ that follows from the weighted *L*^{2} norms, we can fix the values of *λ*_{j} for *j* ∈ {2, 3}. In the end, this leads to

and hence, $\Vert \u27e8\lambda \u27e9bII^\Vert L2\u2272K\u22121/3\u2272N\u221230$, which is more than enough for (3.18).

Now, with |*λ*| ≤ *N*^{100}, …, we may apply the bounds (3.16)–(3.18), but for the extensions and global norms, and replace the *Y*^{1−b} norm (if any) by the *Y*^{b} norm at a loss of $NC\kappa \u22121$, which will be neglected as stated above. Similarly, as |*λ*| ≤ *N*^{100}, we also only need to estimate II in the *X*^{1−b} instead of *X*^{b} norm again at a loss of $NC\kappa \u22121$. Then, using *L*^{1} integrability in *λ*_{j} (together with a meshing argument; see Sec. III D), provided by weighted bounds (3.16)–(3.18) and the (almost) summability in *μ* due to the ⟨*μ*⟩^{−1} factor in (2.10), we may fix the values of *λ*, *λ*_{j} (1 ≤ *j* ≤ 3), and ⌊*μ*⌋(and hence, the value of $\Omega \u2208Z$) and reduce to estimating the $\u2113k2$ norm of the following quantity:

Here, in (5.7), we assume that |*k*_{1}| ≤ *N*, |*k*_{2}| ≤ *N*_{2}, |*k*_{1} − *k*_{2}| ≲ *N*^{ε}, and *N*/2 < ⟨*k*_{3}⟩, ⟨$k3\u2032$⟩ ≤ *N* and $\Omega 0\u2208Z$ is fixed, and the inputs satisfy that

To estimate $Q$, we may assume |*k*_{1} − *k*_{2}| ∼ *R* ≲ *N*^{ε} and define the base tensor

with also the restrictions on *k*_{j} as mentioned above. Then, we have

and hence,

By Proposition 2.8 and the independence between $Hk3k3\u2032$ and $(FN)k3\u2032$, we get that

*N*-certainly. By the definition of *h*^{b} and using Schur’s bound and counting estimates in Lemma 2.5 and noting that |*k*_{2}| ≤ *N*_{2} and |*k*_{1} − *k*_{2}| ≲ *R*, we can bound

Since also $\Vert H\Vert k3\u2192k3\u2032\u22721$, we conclude that

which is enough for (3.18). This concludes the proof for term II when $w2\u2208{\rho N2,zN2}$. Note that the above argument also works for the case when *w*_{1} = *ρ*^{N} and $w2=\rho N2$ because here, we must have *N*_{2} ≥ *N*/2 due to the support condition of *ρ*^{N} and the assumption |*k*_{1} − *k*_{2}| ≲ *N*^{ε}, and the above arguments give the same (in fact, better) estimates.

#### 2. The case $w2=\psi N2$

In this case, by repeating the first part of the arguments in Sec. V A 1, we can reduce to estimating the $\u2113k2$ norm of the following quantity:

Here, in (5.9), we assume that |*k*_{1}| ≤ *N*, |*k*_{2}| ≤ *N*_{2}, |*k*_{1} − *k*_{2}| ∼ *R* ≲ *N*^{ε}, *N*_{2}/2 < ⟨*k*_{2}⟩, ⟨$k2\u2032$⟩ ≤ *N*_{2}, and *N*/2 < ⟨*k*_{3}⟩, ⟨$k3\u2032$⟩ ≤ *N*, and $\Omega 0\u2208Z$ is fixed, and the inputs satisfy that

Moreover, this *H*^{(j)} is such that either *H*^{(j)} = Id or $\Vert H(j)\Vert kjkj\u2032\u2272Nj1+\delta $ with *N*_{3} = *N*. The sum in (5.9) can be decomposed into a term where $k2\u2032$ ≠ $k3\u2032$ and a term where $k2\u2032$ = $k3\u2032$.

Case 1: $k2\u2032$ ≠ $k3\u2032$. Let $hkk1k2k3b$ be defined as mentioned above, and it suffices to estimate the $\u2113k12\u2192\u2113k2$ norm of the tensor

by using the *ℓ*^{2} norm of *w*_{1}. If *N*_{3} = *N*, then the tensors *h*^{b}, *H*^{(2)}, and *H*^{(3)} are independent of $(FN)k2\u2032$ and $(FN)k3\u2032$ and $k2\u2032$ ≠ $k3\u2032$, so we can apply Proposition 2.8; if *N*_{2} ≤ *N*/2, then *h*^{b}, *H*^{(2)}, *H*^{(3)}, and $(FN2)k2\u2032$ are all independent of $(FN)k3\u2032$, and moreover, *h*^{b} and *H*^{(2)} are independent of $(FN2)k2\u2032$, so we can apply Proposition 2.8 iteratively, first for the sum in (*k*_{3}, $k3\u2032$) and then for the sum in (*k*_{2}, $k2\u2032$). In either case, by applying Proposition 2.8, combining it with Proposition 2.7, and estimating *H*^{(j)} in the *k*_{j} → $kj\u2032$ norm, we obtain *N*-certainly that the desired $\u2113k12\u2192\u2113k2$ norm of the tensor is bounded by

Using the fact that |*k*_{2}| ≤ *N*_{2} and |*k*_{3}| ≤ *N* in the support of *h*^{b} and Lemma 2.5 as above, we can show that

and hence,

which is enough for (3.18).

Case 2: $k2\u2032$ = $k3\u2032$. In this case, we must have *N*_{2} = *N*, and we can reduce (5.9) to the following expression:^{39}

where

As *k*_{2} ≠ *k*_{3} in (5.9) due to the definition of $M\u2009\u226a$, we know that at least one of *H*^{(2)} and *H*^{(3)} is not the identity, and hence, we have $\Vert H\u0303\Vert \u2113k2k32\u2272N\u22121+\delta $. By (5.11), we then simply estimate

using Lemma 2.5 and noting that |*k*_{1} − *k*_{2}| ≲ *R* and |*k*_{3}| ≤ *N*. This completes the proof for term II.

### B. Term III

Here, we assume *w*_{3} = *ψ*^{N} and *w*_{1}, *w*_{2} ∈ {*ψ*^{N}, *ρ*^{N}, *ψ*^{N/2}, *ρ*^{N/2}}. We consider two possibilities, when *w*_{1}, *w*_{2} ∈ {*ψ*^{N}, *ψ*^{N/2}}, which we call term IV, and when *w*_{j} ∈ {*ρ*^{N}, *ρ*^{N/2}} for some *j* ∈ {1, 2}, which we call term V.

#### 1. Term IV

Suppose *w*_{1}, *w*_{2} ∈ {*ψ*^{N}, *ψ*^{N/2}}. We may also decompose them into $\psi Nj,Lj$ for *L*_{j} ≤ *N*_{j}/2 and reduce to

where *N*_{1}, *N*_{2} ∈ {*N*, *N*/2} and *N*_{3} = *N*. In (5.13), we consider two cases, depending on whether there is a pairing $k1\u2032$ = $k2\u2032$ or $k2\u2032$ = $k3\u2032$ or not.

Case 1: *no-pairing*. Assume that $k2\u2032$ ∉ {$k1\u2032$, $k3\u2032$}, and then, we take the Fourier transform in the time variable *t* and repeat the first part of the arguments in Sec. V A 1 to reduce to estimating the $\u2113k2$ norm of the following quantity:

In (5.14), we assume that |*k*_{j}| ∼ *N* and |*k*_{1} − *k*_{2}| ∼ *R* ≲ *N*^{ε} and that the matrices *h*^{(j)} are either identity or satisfy that

and moreover, we may assume that *h*^{(j)} is supported in |*k*_{j} − $kj\u2032$| ≲ *L*_{j}*N*^{δ} by inserting a cutoff exploiting (3.17). The $\u2113k2$ norm for $Qk$ can then be estimated using Proposition 2.8 in the same way as in Sec. V A 2, either jointly in ($k1\u2032$, $k2\u2032$, $k3\u2032$) if each *N*_{j} = *N* or first in those *k*_{j} with *N*_{j} = *N* and then in those *k*_{j} with *N*_{j} = *N*/2, so that *N*-certainly we have (with the base tensor *h*^{b} defined as in Secs. V A 1 and V A 2)

using Lemma 2.5, which is enough for (3.18).

Case 2: *pairing*. We now consider the cases when $k1\u2032$ = $k2\u2032$ or $k2\u2032$ = $k3\u2032$. First, if $k2\u2032$ = $k3\u2032$, then we can apply the reduction arguments as mentioned above and reduce to

where

Note that both *h*^{(2)} and *h*^{(3)} cannot be identity as *k*_{2} ≠ *k*_{3}. Now, if max(*L*_{2}, *L*_{3}) ≤ *N*/2, then due to independence, applying similar arguments as mentioned before, we can estimate *N*-certainly that

using the constraint |*k*_{1} − *k*_{2}| ≲ *N*^{ε}, which is enough for (3.18); if max(*L*_{2}, *L*_{3}) = *N*, then we can gain a negative power of this value and view $FN1$ simply as an *H*^{−1/2−} function (without considering randomness) and bound

which is also enough for (3.18).

Finally, consider the case $k1\u2032$ = $k2\u2032$, so, in particular, *N*_{1} = *N*_{2}. We will sum over *L*_{1} and *L*_{2} in order to exploit the cancellation (3.15) (as *k*_{1} ≠ *k*_{2}); this leads to the following expression:

where again we have replaced $|gk1\u2032|2$ by 1 as mentioned before. Since *k*_{1} ≠ *k*_{2}, by (3.15), we may replace $\u27e8k1\u2032\u27e9\u22122$ in the above expression by $\u27e8k1\u2032\u27e9\u22122\u2212\u27e8k1\u27e9\u22122$. Then, decomposing in *L*_{1} and *L*_{2} again and taking Fourier transform in *t* and repeating the reduction steps as mentioned before, we arrive at the following quantity:

where

with *h*^{(j)} as mentioned above. Note that we may assume |*k*_{1} − $k1\u2032$| ≲ *N*^{δ} min(*L*_{1}, *N*^{ε} + *L*_{2}) ≲ *N*^{ε+δ} min(*L*_{1}, *L*_{2}) in view of |*k*_{1} − *k*_{2}| ≲ *N*^{ε}, and it is easy to show, assuming min(*L*_{1}, *L*_{2}) = *L*, that

Since max(*L*_{1}, *L*_{2}) ≤ *N*/2, using independence and arguing as mentioned before, we can estimate that *N*-certainly,

which is enough for (3.18). This completes the estimate for term IV.

#### 2. Properties of the matrix *M*^{N} − *H*^{N}

Before studying term V, we first establish some properties of the matrix $QN\u2254MN\u2212HN=(QN)kk\u2032(t)$ such that

*Lemma 5.1.*

*Let ε*′ *be such that* (*ε*_{1} ≪) *ε* ≪ *ε*′ ≪ 1*. Then, we have*

*Moreover, we can decompose* $Q^N = Q^{N,\ll} + Q^{N,\mathrm{rem}}$ *such that* $\|Q^{N,\mathrm{rem}}\|_{Z^b(J)} \lesssim N^{1/2+2\epsilon_1-\epsilon'/4}$ *and that*

*Moreover,* $Q^{N,\ll}$ *can be decomposed into at most* $N^{C\epsilon'}$ *terms. For each term* $Q$*, there exist vectors* $\ell^*$, $m^*$ *such that* $|\ell^*|, |m^*| \lesssim N^{\epsilon'}$ *and that* $(\hat{Q})_{kk'}(\lambda)$ *is a linear combination (in the form of some integral*^{40}*), with summable coefficients, of expressions of the form*

*where* $\mathcal{Y}$ *is independent of* $(F_N)_k$ *and* $|\mathcal{Y}| \lesssim 1$*, and* $\mathcal{R}(k)$ *depends only on* $m^* \cdot k$*; moreover, we have*

*Remark 5.2.*

Lemma 5.1 plays an important role in Sec. V B 3 when estimating term V. In particular, we will exploit the extra one-dimensional independence of $R(k)$ from $(FN)k$, since $R(k)$ depends only on *m*^{*} ⋅ *k* instead of on *k*.

*Proof.*

Recalling the definitions of *ξ*^{N} and *ψ*^{N} in (3.11) and (5.1)–(5.3), as well as the associated matrices, we have the following identity:

These terms are controlled using bounds (4.12)–(4.17) (together with the *X*^{α} → *X*^{α} bounds) for the operators $\mathcal{H}^N$ and $\mathcal{M}$, where the bounds for $\mathcal{M}$ are proved in the same way as in Secs. IV A and IV B. Moreover, in (5.22), if we assume *n* ≥ 2 or replace $\mathcal{H}^N$ by $\mathcal{H}^N - \mathcal{H}^{N,N^{\epsilon'}}$ (or *H*^{N} by $H^N - H^{N,N^{\epsilon'}}$), then the corresponding bounds can be improved by *N*^{−ε′/4}, and the resulting terms can be put into *Q*^{rem}.^{41} As for the remaining contribution, we can write

Note that |*k* − *k*_{1}| ≲ *N*^{ε′}, and the same holds for *k*_{1} − *k*_{2} [using the definition of $\mathcal{M}$] and *k*_{2} − *k*′, so at a loss of *N*^{Cε′}, we may fix the values of *k* − *k*_{1}, *k*_{1} − *k*_{2}, and *k*_{2} − *k*′. Note that the matrices $\mathcal{F}\mathcal{H}^{N,N^{\epsilon'}}$ and $\mathcal{F}\mathcal{M}$ satisfy bounds (4.15)–(4.17); moreover, in (4.17), we may replace the unfavorable exponents ⟨*λ*⟩^{2(1−b)}⟨*λ*′⟩^{−2b} by the favorable ones ⟨*λ*⟩^{2b}⟨*λ*′⟩^{−2(1−b)}, at the price of multiplying the right-hand side by a small positive power $N^{C\kappa^{-1}}$, by repeating the interpolation argument in Sec. IV B. Using these bounds, we then see that the integral (5.24) provides the required linear combination. Here, summability of the coefficients follows from the estimate

once *k* − *k*_{1}, *k*_{1} − *k*_{2}, and *k*_{2} − *k*′ are all fixed. We set *ℓ*^{*} ≔ (*k*_{1} − *k*) + (*k*_{2} − *k*_{1}) + (*k*′ − *k*_{2}) = *k*′ − *k* and *m*^{*} ≔ *k*_{1} − *k*_{2}. Finally, for fixed (*λ*_{1}, *λ*_{2}),^{42} we set $\mathcal{Y}_{\ell^*,m^*}(k,\lambda) \coloneqq (\mathcal{F}\mathcal{H}^{N,N^{\epsilon'}})_{kk_1}(\lambda,\lambda_1)(\mathcal{F}H^{N,N^{\epsilon'}})_{k_2k'}(\lambda_2)$ and $\mathcal{R}_{\ell^*,m^*}(k) \coloneqq (\mathcal{F}\mathcal{M})_{k_1k_2}(\lambda_1,\lambda_2)$. The factors coming from $H^{N,N^{\epsilon'}}$ and $\mathcal{H}^{N,N^{\epsilon'}}$ are independent of $(F_N)_k$, while the factor coming from $\mathcal{M}$ *depends on k*_{1} *only via the quantity* |*k*_{1}|^{2} − |*k*_{2}|^{2}, in view of the definition (5.23); hence, the desired decomposition is valid because |*k*_{1}|^{2} − |*k*_{2}|^{2} equals *m*^{*} ⋅ *k* plus a constant once the above-mentioned difference vectors are all fixed. In addition, the bounds on $\mathcal{R}$ and $\mathcal{Y}$ in (5.21) follow from the above definitions of $\mathcal{R}$ and $\mathcal{Y}$ together with bounds (4.12)–(4.17) (together with the *X*^{α} → *X*^{α} bounds) for the operators $\mathcal{H}^N$ and $\mathcal{M}$.□

#### 3. Term V

Now, let us consider term V as defined in the introduction of Sec. V B. In the proof below, we will make full use of the cancellation in (3.15), together with Lemma 5.1. We may assume that $N_1 = N_2 = N$: if $N_1 \ne N_2$, then in the later expansions, we must have $k_1' \ne k_2'$ [so the cancellation in (3.15) is not needed], and the proof goes through in the same way; if $N_1 = N_2 = N/2$, then the same cancellation holds, and again, the proof is the same. Now, recall that $\rho^N = \xi^N - \psi^N$ and that

as in (3.5) and (3.11), and that both $M^N$ and $H^N$ satisfy equality (3.15). Using this cancellation (when $k_1' = k_2'$ in the expansion) in the same way as in Sec. V B 1 and repeating the reduction steps mentioned before, we can reduce to estimating a quantity that is either

or

where

Here, in (5.27) and (5.28), the matrix $Q$ comes from $Q^N$, where $Q_{k_1k_1'} = (\widehat{Q^N})_{k_1k_1'}(\lambda)$ for some fixed $\lambda$; similarly, $P$ comes from either $Q^N$ or $h^{N,L_2}$, and $h^{(3)}$ comes from $h^{N,L_3}$ in the same way.

First, we consider (5.28). By losing a power $N^{C\varepsilon}$, we may fix the values of $k_1 - k_2$ and $k - k_3$; then, using $\|h^b\|_{k_1k_2\to kk_3} \lesssim N^{2+C\varepsilon}$, we have the following bounds:

(with $L_2 = N$ if $P$ comes from $Q^N$), where the first bound follows from Proposition 2.8 for each fixed $k_3$ and the second bound follows from estimating $\|\tilde{h}\|_{k_1k_2} \lesssim L_2N^{-3}\|Q\|_{k_1k_1'}\|P\|_{k_2\to k_2'}$. This leads to

which is enough.

Now, we consider (5.27). If $P$ comes from $Q^N$, then in (5.27), we may remove the condition $k_1' \ne k_2'$, reducing essentially to the expression in (5.3) with both $w_1$ and $w_2$ replaced by $\rho^N$, which is estimated in the same way as in Sec. V A 1; the term with $k_1' = k_2'$ can then be estimated in the same way as (5.28). The same argument applies if $P$ comes from $h^{N,L_2}$ and $\max(L_2, L_3) \ge N^{\varepsilon'}$, where we can gain a power $N^{-\varepsilon'/4}$ from either $L_2$ or $L_3$, or if $Q$ comes from $Q^{N,\mathrm{rem}}$, where we can gain extra powers $N^{-\varepsilon'/4}$ using Lemma 5.1.

Finally, consider (5.27), assuming that $\max(L_2, L_3) \le N^{\varepsilon'}$ and that $Q$ comes from $Q^{N,\ll}$ in Lemma 5.1. By losing at most $N^{C\varepsilon'}$, we may fix the values of $k_1 - k_2$, $k - k_3$, $k_2 - k_2'$, and $k_3 - k_3'$ and consider a single component of $Q^{N,\ll}$ as described in Lemma 5.1. Then, there are only two independent variables, namely, $k$ and $k_1$, and we essentially reduce (5.27) to

Here, $|A| \lesssim 1$ is a non-probabilistic factor; $|\ell|, |\ell^*| \lesssim N^{\varepsilon'}$ are fixed vectors; $Y = Y(k_1)$ and $R = R(k_1)$ are as in Lemma 5.1; and $P = P_{k_2k_2'}$ is defined as above. Moreover, we know that $Y$ and $P$ are independent of $g_{k_1'}$ and $g_{k_2'}$, that $R(k_1)$ depends only on $m^*\cdot k_1$ for some fixed vector $|m^*| \lesssim N^{\varepsilon'}$, and that $|P| \lesssim N^{O(\varepsilon)}$, $|Y| \lesssim N^{O(\varepsilon)}$, and $\|R\|_{\ell^2} \lesssim N^{1/2+O(\varepsilon)}$ (after fixing $\lambda$ as before). Finally, $\widetilde{g_k}$ in (5.29) is $\sum_{k_3'} h^{(3)}_{k_3k_3'}(F^{N_3})_{k_3'}$, which is bounded by $|\widetilde{g_k}| \lesssim N^{-1}$.

Since $R(k_1)$ depends only on $m^*\cdot k_1$, if we fix the value of $m^*\cdot k_1$ in the above summation, then $R(k_1)$ can be extracted as a common factor, and for the rest of the sum, we can apply independence (using Proposition 2.8) and get

where $R(a) = R(k_1)$ for any $k_1\cdot m^* = a$ and $S_{a,k} := \{k_1 \in \mathbb{Z}^3 : \ell\cdot(k_1 + k) = \Omega_0,\ k_1\cdot m^* = a\}$. Note that in the above estimate, we are dividing the set of possible $k_1$'s into subsets $S_{a,k}$, where $\ell\cdot k_1$ equals one constant and $m^*\cdot k_1$ equals another, and that $S_{a,k}$ is either empty or has cardinality $\ge N^{1-C\varepsilon'}$. When $S_{a,k} = \emptyset$, we have $|Q_k| = 0$; when $S_{a,k} \ne \emptyset$, we have $|S_{a,k}| \ge N^{1-C\varepsilon'}$, and hence,

Then, using Schur’s bound, we get that

which is enough for (3.18). This completes the proof for *ρ*^{N}.
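For reference, the two elementary ingredients behind this last step can be written out as follows (a schematic sketch, not quoted from the text). First, since $R$ is constant on each fiber $S_{a,k}$, its $\ell^2$ norm over the fiber index $a$ is controlled by its $\ell^2_{k_1}$ norm at a gain given by the minimal fiber size:

$$\sum_a |R(a)|^2 \;\le\; \Big(\min_{a:\,S_{a,k}\neq\emptyset}|S_{a,k}|\Big)^{-1}\sum_a |S_{a,k}|\,|R(a)|^2 \;=\; \Big(\min_{a:\,S_{a,k}\neq\emptyset}|S_{a,k}|\Big)^{-1}\|R\|_{\ell^2_{k_1}}^2,$$

which, combined with $|S_{a,k}| \ge N^{1-C\varepsilon'}$ and $\|R\|_{\ell^2} \lesssim N^{1/2+O(\varepsilon)}$, is the source of the gain over the trivial bound. Second, Schur's bound refers to the classical Schur test: for any kernel $T = (T_{kk'})$,

$$\|T\|_{\ell^2\to\ell^2} \;\le\; \Big(\sup_{k}\sum_{k'}|T_{kk'}|\Big)^{1/2}\Big(\sup_{k'}\sum_{k}|T_{kk'}|\Big)^{1/2},$$

so a kernel with at most $A$ nonzero entries in each row and each column, each of size at most $B$, has operator norm at most $AB$.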

### C. An extra improvement

For the purpose of Sec. VI, we need an improvement for the *ρ*^{N} bound in (3.18), namely, the following.

*Proposition 5.3.*

*Let $N = M$, let $Y \in \mathbb{R}$ be any constant, and consider $\rho^*$ defined by*

*and then, N-certainly, we can improve (3.18) to $\|\rho^*\|_{X^b(J)} \le N^{-1/2+\varepsilon_1/2}$. Note that this bound is better than the bound for $z_N$ in (3.18) [which in turn is better than the bound for $\rho^N$ in (3.18)].*

*Proof.*

We only need to examine terms I–V in the above proof. For terms I, IV, and V (hence also III), the bounds already obtained in the above proof are better than $N^{-1/2+\varepsilon_1/2}$, so these terms are acceptable, and it remains to study term II. Note that the definition of $\rho^*$ restricts $k$ to a set $E$ of cardinality $\le N^{1+C\varepsilon'}$ by the standard divisor counting bound.
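The standard divisor counting bound referred to here can be stated as follows (a reminder of a standard fact, not a quotation): for every $\theta > 0$, the number of divisors of a nonzero integer $n$ satisfies $d(n) \lesssim_\theta |n|^\theta$. One consequence, in the form relevant for counting frequencies on a fixed sphere, is

$$\#\{k \in \mathbb{Z}^3 : |k| \le N,\ |k|^2 = A\} \;\lesssim_\theta\; N^{1+\theta},$$

since for each of the $O(N)$ choices of the third coordinate $k^{(3)}$, the number of representations of $A - (k^{(3)})^2$ as a sum of two squares is $\lesssim_\theta N^\theta$ by the divisor bound in $\mathbb{Z}[i]$. Counting bounds of this type yield cardinality estimates such as $|E| \le N^{1+C\varepsilon'}$.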

As before, we may fix dyadic numbers $N_j$ and $R$ with $|k_j| \lesssim N_j \lesssim N$ and $|k_1 - k_2| \sim R$, such that in the support of $h^b$, we have $k - k_1 + k_2 - k_3 = 0$ and $|k|^2 - |k_1|^2 + |k_2|^2 - |k_3|^2 = \Omega_0$. There are three cases in term II that need consideration:

- The case in Sec. V A 1: Here, bound (5.8) suffices unless $\max(N_2, R) \le N^{C\varepsilon'}$; if this happens, note that in the above proof, (5.8) follows from the estimate valid when $\max(N_2, R) \le N^{C\varepsilon'}$. However, if we further require $k \in E$, then the right-hand side of that bound can be improved to $|E|^{1/2} = N^{1/2+C\varepsilon'}$, which leads to the desired improvement of (3.18).
- Case 1 in Sec. V A 2: Here, bound (5.10) suffices unless $R \le N^{C\varepsilon'}$; if this happens, note that (5.10) follows from the estimate valid when $R \le N^{C\varepsilon'}$. However, if we further require $k \in E$, then the right-hand side can be improved to $|E|^{1/2}N_2 = N^{1/2+C\varepsilon'}N_2$, which allows for the improvement.

□

## VI. THE REMAINDER TERMS

Now, we will prove the *z*_{N} part of bound (3.18), assuming that *N* = *M*. We will prove it by a continuity argument [see part (1) of Sec. III D for more details], so we may assume (3.18) and only need to improve it using Eq. (3.13); note that the smallness factor is automatic as long as we use (3.13), as explained before. As such, we can assume that each input factor $w_j$ on the right-hand side of (3.13) has one of the following four types, where in all cases, we have *N*_{j} ≤ *N*:

- (i)
Type (G), where we define *L*_{j} = 1 and

- (ii)
Type (C), where

with $h^{(j)}_{k_jk_j'}(\lambda_j,\omega)$ supported in the set $\{N_j/2 < \langle k_j\rangle \le N_j,\ N_j/2 < \langle k_j'\rangle \le N_j\}$ and $\mathcal{B}_{\le L_j}$-measurable for some $L_j \le N_j/2$, and satisfying the following bounds (where in the first bound, we first fix $\lambda_j$, take the operator norm, and then take the $L^2$ norm in $\lambda_j$):

Moreover, using (3.17), we may assume that $h^{(j)}$ is supported in $|k_j - k_j'| \lesssim N^{\delta}L_j$. Note that if $w_j$ is of type (G), then $(\widehat{w_j})_{k_j}(\lambda_j)$ can also be expressed in the same form as (6.2), with $h^{(j)}_{k_jk_j'} = \mathbb{1}_{k_j=k_j'}\cdot\widehat{\chi}(\lambda_j)$, except that the second equation in (6.3) does not hold in this case.

- (iii)
Type (L), where $(\widehat{w_j})_{k_j}(\lambda_j)$ is supported in $\{|k_j| \sim N_j\}$ and satisfies

Also, such a $w_j$ is a solution to Eq. (5.1).

- (iv)
Type (D), where $(\widehat{w_j})_{k_j}(\lambda_j)$ is supported in $\{|k_j| \lesssim N_j\}$ and satisfies

Also, such a $w_j$ is a solution to Eq. (3.13).

Now, let the multilinear forms $\mathcal{M}_\circ$, $\mathcal{M}_<$, $\mathcal{M}_>$, and $\mathcal{M}_\ll$ be as in (2.4), (3.3), and (3.9). The terms on the right-hand side of (3.13), apart from the first term in the second line of (3.13), which is trivially bounded, are as follows:

Term I,

where *w*_{j} can be of any type and max(*N*_{1}, *N*_{2}, *N*_{3}) = *N*.

Term II,

where *w*_{j} can be of any type and max(*N*_{1}, *N*_{2}) = *N*.

Term III,

where *w*_{j} can be of any type and max(*N*_{1}, *N*_{2}, *N*_{3}) ≤ *N*/2.

Term IV,

where *w*_{j} can be of any type and max(*N*_{1}, *N*_{2}) ≤ *N*/2 and *N*_{3} = *N*.

Term V,

where *w*_{j} can be of any type and max(*N*_{1}, *N*_{2}) = *N*_{3} = *N*.

Term VI,

where *w*_{1} and *w*_{2} can be of any type, *w*_{3} has type (D), and max(*N*_{1}, *N*_{2}) ≤ *N*/2 and *N*_{3} = *N*.

Term VII,

where *w*_{1} and *w*_{2} can be of any type, *w*_{3} has type (D), and max(*N*_{1}, *N*_{2}) = *N*_{3} = *N*.

Term VIII represents the last two lines of the right-hand side of (3.13).

Our goal is to recover the bound for $z_N$ in (3.18) for each of the terms I–VIII mentioned above. In doing so, we will consider two cases. First is the *no-pairing* case: if $w_1$ and $w_2$ are of type (C) or (G) and, hence, expanded as in (6.2), then we assume that $k_1' \ne k_2'$; similarly, if $w_2$ and $w_3$ are of type (C) or (G), then we assume that $k_2' \ne k_3'$. The second case is the *pairing* case, which is when $k_1' = k_2'$ or $k_2' = k_3'$ (the *over-pairing* case, where $k_1' = k_2' = k_3'$, is easy, and we shall omit it). We will deal with the no-pairing case for terms I–VII in Secs. VI A–VI C, the pairing case for these terms in Sec. VI D, and term VIII in Sec. VI E.

### A. No-pairing case

We start with the no-pairing case.

#### 1. Preparation of the proof

We start with some general reductions in the no-pairing case. Recall, as in Sec. III D, that we can always gain a smallness factor from the short time $\tau \ll 1$ and can always ignore losses of $(N_*)^{C\kappa^{-1}}$, provided that we can gain a power $N^{-\varepsilon/10}$ (which will be clear in the proof). We will consider $\mathcal{I}\chi\widehat{\mathcal{M}^{(\star)}}(w_1,w_2,w_3)_k(\lambda)$, where $\mathcal{M}^{(\star)}$ can be one of $\Pi\mathcal{M}_\circ$, $\Pi\mathcal{M}_<$, $\Pi\mathcal{M}_>$, $\Pi\mathcal{M}_\ll$, and $\Pi(\mathcal{M}_< - \mathcal{M}_\ll)$, with $\Pi$ being a general notation for the projections $\Pi_N$, $\Pi_{N/2}$, and $\Delta_N$,

where $\Omega = |k|^2 - |k_1|^2 + |k_2|^2 - |k_3|^2$ and $\sum^{(\star)}$ is defined directly based on the definitions of $\mathcal{M}_\circ$, $\mathcal{M}_<$, $\mathcal{M}_>$, and $\mathcal{M}_\ll$ and the choice of $\Pi$. For example, if $\mathcal{M}^{(\star)}$ is $\Pi_N\mathcal{M}_>$, then there will be two more restrictions, $|k| \le N$ and $\langle k_1 - k_2\rangle > N^{1-\delta}$, in the sum $\sum^{(\star)}$. The other sums $\sum^{(\star)}$ are defined in similar ways.

Before going into the different estimates for I–VII, we first make a few remarks.

- If a factor $w_j$ has type (L) or (D), then in most cases, we only need to consider type (L) terms, since (6.5) is stronger than (6.4); the exceptions will be treated separately later. A $w_j$ of type (G) can be considered as a special case of type (C) with $h^{(j)}_{k_jk_j'}(\lambda_j) = \mathbb{1}_{N_j/2 < \langle k_j\rangle \le N_j}\cdot\mathbb{1}_{k_j=k_j'}\cdot\widehat{\chi}(\lambda_j)$; if we avoid using the $\ell^2_{k_jk_j'}$ norm in (6.3), then we only need to consider type (C) terms.
- Term I can be estimated in the same way as term II. In fact, the definition of $\mathcal{M}_>$ implies $\max(N_1, N_2) \ge N^{1-\delta}$, so we are essentially in a (special case of) term II, up to a possible loss $N^{C\delta}$, which will be negligible compared to the gain. Moreover, term V can be estimated similarly to term IV; see Sec. VI C.
- Terms VI and VII are readily estimated using the $X^\alpha \to X^\alpha$ bounds for the linear operator (3.19) proved in Secs. IV A and IV B.

Based on these remarks, from now on, we will consider terms II–IV (and VIII at the end), where the possible cases for the types of (*w*_{1}, *w*_{2}, *w*_{3}) are (a) (C, C, C), (b) (C, C, L), (c) (C, L, C), (d) (L, C, C), (e) (L, L, C), (f) (C, L, L), (g) (L, C, L), and (h) (L, L, L).

In Sec. VI B, we will estimate term II, which can be understood as a high–high interaction in view of max(*N*_{1}, *N*_{2}) = *N*; note that if *k* is the high frequency, then either *k*_{3} is also a high frequency or |*k*_{1} − *k*_{2}| must be large. In Sec. VI C, we will estimate terms III and IV by using a counting technique in a special situation called the Γ *condition* [see (6.18)]. In Sec. VI D, we consider the pairing case.

### B. High–high interactions

We will estimate term II in this subsection. First, we can repeat the arguments for *λ*, *λ*_{j}, and the Duhamel operator *I* in (6.6) as in Secs. IV and V. Namely, we first restrict to |*λ*_{j}| ≤ *N*^{100} and |*λ*|, |*μ*| ≤ *N*^{100}, where *μ* = *λ* − (Ω + *λ*_{1} − *λ*_{2} + *λ*_{3}), and replace the unfavorable exponents (1 − *b* or *b* depending on the context) by the favorable ones (*b* or 1 − *b*) and then exploit the resulting integrability in *λ*_{j} to fix the values of *λ*, *λ*_{j}, and ⌊*μ*⌋. Then, we reduce to the following expression where Ω_{0} is a fixed integer:

where *h*^{b} is the base tensor that contains the factors

We assume that $h^b$ is supported in the set where $|k_j| \le N_j$ and $\langle k_1 - k_2\rangle \sim R$, where $R$ is a dyadic number. Moreover, we assume that $R$ and the support of $h^b$ satisfy the conditions associated with the definition of some $\mathcal{M}^{(\star)}$. In view of the factor $|V_{k_1-k_2}| \sim R^{-\beta}$ in $h^b$, we also define $h^{R,(\star)} := R^\beta\cdot h^b$, which is essentially the characteristic function of the set

possibly with extra conditions determined by the definition of $\mathcal{M}^{(\star)}$. We also define $S^R_k$ to be the set of $(k, k_1, k_2, k_3) \in S^R$ with fixed $k$ and, similarly, define $S^R_{k_1k_2}$. Note that when $w_j$ has type (G), (C), or (L), we can further assume that $|k_j| > N_j/2$ in the definition of $S^R$.
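For concreteness, combining the constraints just listed (the convolution relation, the support conditions on $h^b$, and the fixed value $\Omega_0$ of the resonance factor), the set in question should take the form (this is a reconstruction from the surrounding definitions, up to the extra conditions determined by $\mathcal{M}^{(\star)}$):

$$S^R = \big\{(k,k_1,k_2,k_3)\in(\mathbb{Z}^3)^4 : k - k_1 + k_2 - k_3 = 0,\ |k_j|\le N_j,\ \langle k_1-k_2\rangle\sim R,\ |k|^2-|k_1|^2+|k_2|^2-|k_3|^2=\Omega_0\big\}.$$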

#### 1. Case (a): (C, C, C)

In this case, we have

where $h^{(j)}_{k_jk_j'} = h^{(j)}_{k_jk_j'}(\omega)$ satisfies (6.3) with some $N_j$ and $L_j \le N_j/2$ for $1 \le j \le 3$, and $h^{R,(\star)}_{kk_1k_2k_3}$ is defined as mentioned above.

To estimate $\|X_k\|_k$, we would like to apply Proposition 2.8 and then Proposition 2.7. As in Secs. IV A and V, the way we apply Proposition 2.8 depends on the relative sizes of the $N_j$ ($1 \le j \le 3$). For example, if $N_1 = N_2 = N_3$, we shall apply Proposition 2.8 jointly in the $(k_1', k_2', k_3')$ summation in (6.9); if $N_1 = N_3 > N_2$, we will first apply Proposition 2.8 jointly in the $(k_1', k_3')$ summation and then apply it in the $k_2'$ summation; and if $N_3 > N_1 > N_2$, we will apply it first in the $k_3'$ summation, then in the $k_1'$ summation, and then in the $k_2'$ summation. The result in the end will be the same in all cases, so, for example, we will consider the case $N_3 > N_1 > N_2$. Now, we have

where

By the independence between $g_{k_3'}$ and $\widetilde{H}_{kk_3}h^{(3)}_{k_3k_3'}$ (since $N_3 > N_1 > N_2$), we apply Propositions 2.8 and 2.7 and get $\tau^{-1}N_*$-certainly that

Similarly, by the independence between $g_{k_1'}$ and $\ddot{H}_{kk_1k_3}h^{(1)}_{k_1k_1'}$ (since $N_1 > N_2$), and also by the independence between $g_{k_2'}$ and $h^{R,(\star)}_{kk_1k_2k_3}\overline{h^{(2)}_{k_2k_2'}}$, we can once again apply Propositions 2.8 and 2.7 to $\|\widetilde{H}_{kk_3}\|_{kk_3}$ and then to $\|\ddot{H}\|_{kk_1k_3}$. As a consequence, we have $\tau^{-1}N_*$-certainly that

In the other cases, we get the same bound. Without loss of generality, we may assume that *N*_{1} = *N*, and then, using Lemma 2.5, we can estimate that

which implies that $\|X_k\|_k \lesssim N^{-1+C\delta}N_3^{1/2} \lesssim N^{-1/2+C\delta}$; this is enough for (3.18).

#### 2. Case (b): (C, C, L)

In this case, we have

where $h^{(j)}_{k_jk_j'} = h^{(j)}_{k_jk_j'}(\omega)$ satisfies (6.3) with some $N_j$ and $L_j \le N_j/2$ for $1 \le j \le 2$, and the base tensor $h^{R,(\star)}_{kk_1k_2k_3}$ is defined as before. Clearly, $\|X_k\|_k$ can be bounded by $N_3^{-1/2+\varepsilon_1+\varepsilon_2}$ times the norm,

By applying Propositions 2.8 and 2.7 again, in the same manner as in Sec. VI B 1, we get that the above norm is bounded by

By Lemma 2.5, we can conclude that

and hence, we easily get $\|X_k\|_k \lesssim N^{-1+C\varepsilon_1}$, which is enough for (3.18).

#### 3. Case (c): (C, L, C) and case (d): (L, C, C)

The estimates in cases (c) and (d) are similar to case (b), so we state them without proof. In case (c), we get

and in case (d), we get a similar bound, but with subindices 1 and 2 switched.

Now, by Lemma 2.5, we can obtain that

In the first case, we directly get that

which is enough for (3.18) as max(*N*_{1}, *N*_{2}) = *N* and *N*^{1−δ} ≥ *R* ≥ *N*^{ε} (then, we have *N*_{1} ∼ *N*_{2} ∼ *N*) in view of the definition of $\mathcal{M}_< - \mathcal{M}_\ll$. In the second case, we get

which is also enough for (3.18) as max(*N*_{1}, *N*_{2}) = *N* and *R* ≥ *N*^{ε}. In the third case, we get

which is also enough for (3.18). By switching indices 1 and 2, we also get the same estimates in case (d).

#### 4. Case (e): (L, L, C)

In this case, we have

where $h^{(3)}_{k_3k_3'} = h^{(3)}_{k_3k_3'}(\omega)$ satisfies (6.3) with some $N_3$ and $L_3 \le N_3/2$, and the base tensor $h^{R,(\star)}_{kk_1k_2k_3}$ is defined as before. By symmetry, we may assume $N_1 \le N_2$; then, by the same argument as above, using Propositions 2.7 and 2.8, we can bound

By Lemma 2.5, both tensor norms are bounded by min(*N*_{1}, *R*)*N*_{3}; as *N*_{1} ≤ *N*_{2} (and hence, *N*_{2} = *N*) and *R* ≥ *N*^{ε}, it is easy to check that this bound is enough for (3.18).

#### 5. Case (f): (C, L, L) and case (g): (L, C, L)

The estimates in cases (f) and (g) are similar to case (e), so we state them directly. Again, the two cases differ only by switching indices 1 and 2, so we only consider case (f). As in case (e), we get two bounds,

and

Now, if $N_3 \ge N^{\varepsilon_2}$, we will apply the first bound and use that

so the factor $N_1^{-1}N_2^{-1/2+\varepsilon_1+\varepsilon_2}\min(N_1,N_2)$, together with $N_3^{-1/2+\varepsilon_1+\varepsilon_2}$, where $N_3 \ge N^{\varepsilon_2}$, provides a bound that is enough for (3.18). Moreover, the same bound also works if $N_2 \le N^{1-\varepsilon_2}$ (since in this case, $N_1 = N$).

If $N_3 \le N^{\varepsilon_2}$ and $N_2 \ge N^{1-\varepsilon_2}$, we will apply the second bound and use that

assuming that $N_3 \le N^{\varepsilon_2}$. This is also enough for (3.18), assuming that $N_2 \ge N^{1-\varepsilon_2}$ and $R \ge N^{\varepsilon}$.

#### 6. Case (h): (L, L, L)

In this case, we have

where the base tensor $h^{R,(\star)}_{kk_1k_2k_3}$ is defined as before. Then, simply using Proposition 2.7, we get

By Lemma 2.5, we have $\|h^{R,(\star)}_{kk_1k_2k_3}\|_{kk_2\to k_1k_3} \lesssim (R\min(N_1,N_2))^{1/2}$, which implies that

which is enough for (3.18) because max(*N*_{1}, *N*_{2}) = *N* and *R* ≥ *N*^{ε}.

### C. The Γ condition terms

In this section, we estimate terms III and IV. These two terms are actually similar, and the key property that they satisfy is the so-called Γ *condition*. Namely, due to the projections and assumptions on the inputs in terms III and IV, we have that

for some real number Γ, where *S* is the support of the base tensor *h*^{b} [note that in term IV, we may assume that *w*_{3} is not of type (D) as, otherwise, the bound follows from what we have already done, so here, we may choose Γ = (*N*/2)^{2} − 1].

To proceed, we return to $\mathcal{I}\chi\widehat{\mathcal{M}^{(\star)}}(w_1,w_2,w_3)_k(\lambda)$ in (6.6), where $\Omega = |k|^2 - |k_1|^2 + |k_2|^2 - |k_3|^2$, and set $\mu = \lambda - (\Omega + \lambda_1 - \lambda_2 + \lambda_3)$; then, we have $|\mathcal{I}| \lesssim \langle\lambda\rangle^{-1}\langle\mu\rangle^{-1}$ by (2.10). Following the same reduction steps as before, we can assume that $|\lambda|, |\lambda_j|\ (j = 1, 2, 3), |\mu| \le N^{100}$ and may replace the unfavorable exponents by the favorable ones. Now, instead of fixing each $\lambda_j$, $\lambda$, and $\lfloor\mu\rfloor$, we do the following.

Without loss of generality, we may assume that $|\lambda_3|$ is the maximum of all the parameters $|\lambda_j|$, $|\lambda|$, and $|\mu|$; the other cases are treated similarly. We may fix a dyadic number $K$ and assume that $|\lambda_3| \sim K$. Then, we may fix $\lambda_j$ ($j \ne 3$), $\lambda$, and $\lfloor\mu\rfloor$, again using integrability in these variables, exploit the weight $\langle\lambda_3\rangle^b$ in the weighted norms in which $w_3$ is bounded, and reduce to an expression

where $(\widetilde{w_3})_{k_3}(\lambda_3) = K^b(\widehat{w_3})_{k_3}(\lambda_3)$ and $h^{R,K,(\star)}_{kk_1k_2k_3}(\lambda_3)$ is essentially the characteristic function of the set (with possibly more restrictions according to the definition of $\mathcal{M}^{(\star)}$)

where $\Omega_0$ is a fixed number such that $|\Omega_0| \lesssim K$. We also define $S^{R,K}_k$ to be the set of $(k_1, k_2, k_3, \lambda_3)$ such that $(k, k_1, k_2, k_3, \lambda_3) \in S^{R,K}$ for fixed $k$. Note that when $w_j$ is of type (C), (G), or (L), we can further assume that $N_j/2 < |k_j| \le N_j$.

The idea in estimating (6.19) is to view $(k_3, \lambda_3)$ as a single variable (denoted by $\widetilde{k_3}$), which will allow us to gain, using the Γ condition, in estimating the norms of the base tensor $h^{R,K,(\star)}$. Although our tensors here involve the variable $\lambda_3 \in \mathbb{R}$, it is clear that Propositions 2.6 and 2.7 still hold for such tensors, and Proposition 2.8 can also be proved by using a meshing argument (see Sec. III D, where the derivative bounds in $\lambda_3$ are easily proved, as all the relevant functions are compactly supported in physical space). Moreover, by the induction hypothesis and the manipulation mentioned above (for example, with the $Y^{1-b}$ norm replaced by the $Y^b$ norm), we can also deduce corresponding bounds for $w_3 = (w_3)_{\widetilde{k_3}}$ and the corresponding matrices such as $\tilde{h}^{(3)}_{\widetilde{k_3}k_3'}$; for example, $\|\tilde{h}^{(3)}_{\widetilde{k_3}k_3'}\|_{k_3'\to\widetilde{k_3}} \lesssim L_3^{-1/2+3\varepsilon_1}$. Because of this, in the proof below, we will simply write $\sum_{\widetilde{k_3}}$, while we actually mean $\sum_{k_3}\int d\lambda_3$, so the proof has the same format as the previous ones.

We now consider the input functions. In term III, clearly, $\max(N_1, N_2, N_3) \gtrsim N$; if $N_3 \ll N$, then we must have $\max(N_1, N_2) \gtrsim N$ and $|k_1 - k_2| \gtrsim N$, and hence, this term can be treated in the same way as term II. Therefore, we may assume that $N_3 \sim N$, and clearly, the same holds for term IV. If $\max(N_1, N_2) \gtrsim N$, then again using the term II estimate, we only need to consider the case where $|k_1 - k_2| \lesssim N^{\varepsilon}$; this case can be treated using similar arguments as below and is much easier due to the smallness of $|k_1 - k_2|$, so we will only consider the case $\max(N_1, N_2) \ll N$. In the same way, we will not consider term V here. Finally, if $w_3 = z_{N_3}$ with $N_3 \sim N$, then (3.18) directly follows from the linear estimate proved in Sec. IV A, and the Γ condition is not needed.

There are two cases: when $w_3$ has type (L) and when $w_3$ has type (C) [or (G)]. In the latter case, there are four further cases for the types of $w_1$ and $w_2$, which we will discuss below.

#### 1. The type (L) case

Suppose that $w_3$ has type (L). Clearly, if $\max(N_1, N_2) \ge N^{100\varepsilon_2}$, then (3.18) also follows from the linear estimates in Sec. IV A [because the difference between the $\rho^N$ bound and the $z_N$ bound in (3.18) is at most $N^{\varepsilon_2}$], so we may assume that $\max(N_1, N_2) \le N^{100\varepsilon_2}$. Then, in (6.19), we may further fix the values of $(k_1, k_2)$ at the price of $N^{C\varepsilon_2}$, and hence, we may write

and by definition, it is easy to see that $\|h\|_{\widetilde{k_3}\to k} \lesssim 1$. Then, (3.18) follows, using the bound for $w_3$, if $K \ge N^{\varepsilon_1^2}$. Finally, if $K \le N^{\varepsilon_1^2}$, then we have $|\Omega| \lesssim N^{\varepsilon_1^2}$, where $\Omega = |k|^2 - |k_1|^2 + |k_2|^2 - |k_3|^2$. Using the Γ condition (6.18), we conclude that $|k_3|^2$ belongs to an interval of length $N^{O(\varepsilon_1^2)}$, so we can apply Proposition 5.3 to gain a power $N^{-\varepsilon_1/2}$, which covers the loss $N^{O(\varepsilon_2+\varepsilon_1^2)}$ and is enough for (3.18).

#### 2. The type (C, C, C) case

Now, suppose that *w*_{1}, *w*_{2}, and *w*_{3} have type (C, C, C). By symmetry, we may assume that *N*_{1} ≤ *N*_{2}. Then, by the same argument as in Sec. VI B 1, we obtain that

The last three factors are easily bounded by 1, so it suffices to bound the tensor *h*^{R,K,(⋆)}.

By definition, this is equivalent to counting the number of lattice points (*k*, *k*_{1}, *k*_{2}, *k*_{3}) such that *k*_{1} − *k*_{2} + *k*_{3} = *k* (and also satisfying the inequalities listed above) and |Ω| ≲ *K*. Note that

so when $K \le K_1$, by the Γ condition, $|k|^2$ has at most $K_1$ choices, and hence, $k$ has at most $K_1N$ choices. Once $k$ is fixed, the number of choices for $(k_1, k_2, k_3)$ is at most $KN_1^2R^2$, which leads to the bound

If, instead, $K \ge K_1$, then $k$ has at most $KN$ choices, and once $k$ is fixed, the number of choices for $(k_1, k_2, k_3)$ is at most $N_1^3R^3$, so we get

Either way, we get

which is enough for (3.18) as max(*R*, *N*_{1}) ≲ *N*_{2}.
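To summarize the dichotomy above in one display (a schematic restatement of the counts given in the argument, not a quotation):

$$\#\big\{(k,k_1,k_2,k_3)\big\} \;\lesssim\; \begin{cases} (K_1N)\cdot(KN_1^2R^2), & K \le K_1,\\[2pt] (KN)\cdot(N_1^3R^3), & K \ge K_1, \end{cases}$$

where in each regime, the first factor counts the choices of $k$ and the second counts the choices of $(k_1,k_2,k_3)$ once $k$ is fixed.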

#### 3. The type (L, L, C) case

Now, suppose that $w_1$, $w_2$, and $w_3$ have types (L, L, C). First, assume that $N_1 \le N_2$. The same arguments as in Sec. VI B 4 yield

The second norm mentioned above is easily bounded by $K^{1/2}RN_1$ using Lemma 2.5, which is clearly enough for (3.18); for the first norm, there are two ways to estimate it.

The first way is to use Lemma 2.5 directly, without using the Γ condition, to get

The second way is to use the Γ condition: first fix the value of $|k|^2$, then fix $k$, and then count $(k_1, \widetilde{k_3})$. This yields

assuming *K* ≤ *K*_{1} and a better bound assuming *K* ≥ *K*_{1}. Now, plugging in the second bound yields

which can be shown to be $\lesssim N^{-1/2}$ using the fact that $\max(R, N_1) \le N_2$ and by considering whether $R \ge N_1$ or $R \le N_1$. Moreover, the same estimate can be checked to work if $N_1 \le N_2^{1.1}$. If $N_1 \ge N_2^{1.1}$, we can switch subscripts 1 and 2, in which case, we have the weaker bound,

without the 1/2 power in the last factor; however, this is still $\lesssim N^{-1/2}$, provided that $N_1 \ge N_2^{1.1}$.

#### 4. The type (L, C, C) and (C, L, C) cases

Now, suppose that $w_1$, $w_2$, and $w_3$ have types (L, C, C); the case (C, L, C) is treated similarly. Here, the same arguments as in Sec. VI B 3 imply

The two norms $k \to k_1k_2\widetilde{k_3}$ and $kk_1 \to k_2\widetilde{k_3}$ can be estimated by $K^{1/2}R\min(N_1, N_2)$, using Lemma 2.5 only and without the Γ condition, which is clearly enough for (3.18). For the $kk_1\widetilde{k_3} \to k_2$ norm, we can use the estimates in Sec. VI C 3 and get