In this paper, we consider the defocusing Hartree nonlinear Schrödinger equation on $\mathbb{T}^3$ with a real-valued and even potential V whose Fourier multiplier decays like $|k|^{-\beta}$. By relying on the method of random averaging operators [Deng et al., arXiv:1910.08492 (2019)], we show that there exists $\beta_0$, which is less than but close to 1, such that for $\beta > \beta_0$, we have invariance of the associated Gibbs measure and global existence of strong solutions in its statistical ensemble. In this way, we extend Bourgain's seminal result [J. Bourgain, J. Math. Pures Appl. 76, 649–702 (1997)], which requires β > 2 in this case.

In this paper, we study the invariant Gibbs measure problem for the nonlinear Schrödinger (NLS) equation on $\mathbb{T}^3$ with Hartree nonlinearity. Such an equation takes the form

$(i\partial_t+\Delta)u=(|u|^2*V)u,\qquad u(0)=u_{\mathrm{in}},$
(1.1)

where V is a convolution potential. We will assume that it satisfies the following properties:

1. V is real-valued and even, and so is $\hat{V}$.

2. (1.1) is defocusing, i.e., V ≥ 0.

3. V acts like β antiderivatives, i.e., $\hat{V}(0)=1$ and $|\hat{V}(k)|\lesssim\langle k\rangle^{-\beta}$ for some β ≥ 0.

A typical example of such V is the Bessel potential $\langle\nabla\rangle^{-\beta}$ of order β on $\mathbb{T}^3$, which can be written in the form (for some c > 0) $V(x)=c|x|^{-(3-\beta)}+K(x)$ for 0 < β < 3 and $x\in\mathbb{T}^3\setminus\{0\}$, where K is a real-valued smooth function on $\mathbb{T}^3$. Note that when V is the δ function (and β = 0), we recover the usual cubic NLS equation. Our main result (see Theorem 1.3) establishes invariance of the Gibbs measure for (1.1) when V is the Bessel potential of order β, where β < 1 is close enough to 1, greatly improving the previous result of Bourgain,7 which assumes that β > 2 (see also Remark 1.4).

Equation (1.1) can be viewed as a regularized or tempered version of the cubic NLS equation, and both naturally arise in the limit of quantum many-body problems for interacting bosons (see, e.g., Refs. 21 and 33 and references therein). An important question, both physically and mathematically, is to study the construction and dynamics of the Gibbs measure for (1.1), which is a Hamiltonian system.

#### 1. Gibbs measure construction

The Gibbs measure, which we henceforth denote by dν, is formally expressed as

$d\nu=e^{-H[u]}\prod_{x\in\mathbb{T}^3}du(x),$
(1.2)

where H[u] is the renormalization of the Hamiltonian,

$\int_{\mathbb{T}^3}|\nabla u|^2+\frac{1}{2}|u|^2(V*|u|^2)\,dx.$

Rigorously making sense of (1.2) is closely linked to the construction of the $\Phi^4_3$ measure in quantum field theory, which has attracted much interest since the 1970s and 1980s1,20,22,26,27,31 and in recent years.3,4,21,32 In the case of (1.1), the answer actually depends on the value of β. When β > 1/2 (Ref. 36), the measure dν can be defined as a weighted version of the Gaussian measure dρ, namely,

$d\nu = e^{-\int_{\mathbb{T}^3}\frac{1}{2}:|u|^2(V*|u|^2):\,dx}\cdot d\rho,\qquad d\rho\sim e^{-\frac{1}{2}\int_{\mathbb{T}^3}|\nabla u|^2}\prod_{x\in\mathbb{T}^3}du(x),$
(1.3)

where $:|u|^2(V*|u|^2):$ is a suitable renormalization of the nonlinearity [see (1.12) for a precise definition] and the Gaussian free field dρ is defined as the law of the random variable37

$f(\omega)=\sum_{k\in\mathbb{Z}^3}\frac{g_k(\omega)}{\langle k\rangle}e^{ik\cdot x},$
(1.4)

with $\{g_k(\omega)\}$ being i.i.d. normalized centered complex Gaussians. On the other hand, if 0 < β ≤ 1/2, then dν is a weighted version of a shifted Gaussian measure, which is singular with respect to dρ. These results were proved recently by Bringmann11 and Oh, Okamoto, and Tolomeo28 by adapting the variational method of Barashkov and Gubinelli.3

We remark that, in either case mentioned above, it can be shown that the Gibbs measure dν is supported in $H^{-1/2-}(\mathbb{T}^3)$, the same space as dρ. In particular, the typical element in the support of dν has infinite mass, which naturally leads to the renormalizations in the construction of dν alluded to above; see Sec. I B. From the physical point of view, it is also worth mentioning that, in the same way (1.1) is derived from quantum many-body systems, the Gibbs measure dν, with the correct renormalizations, can also be obtained by taking the limit of thermal states of such systems, at least when V is sufficiently regular (see Refs. 21 and 32).
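As a sanity check on this support claim (ours, not from the paper), note that by independence of the $g_k$, the expected squared $H^s$ norm of a frequency truncation of the random series (1.4) is $\sum_{\langle k\rangle\le N}\langle k\rangle^{2s-2}$, which diverges as N → ∞ precisely when s ≥ −1/2. A short numerical sketch:

```python
import numpy as np

# Illustration (ours): E||Pi_N f||_{H^s}^2 = sum_{<k> <= N} <k>^{2s-2} for the
# random series (1.4); it grows like log N at s = -1/2 and stays bounded for
# s < -1/2 (here s = -1 as a clearly convergent comparison).

def expected_Hs_squared(N, s):
    r = np.arange(-N, N + 1)
    kx, ky, kz = np.meshgrid(r, r, r, indexing="ij")
    jap = np.sqrt(1.0 + kx**2 + ky**2 + kz**2)   # Japanese bracket <k>
    return float(np.sum(np.where(jap <= N, jap ** (2 * s - 2), 0.0)))

Ns = [4, 8, 16, 32]
crit = [expected_Hs_squared(N, -0.5) for N in Ns]   # s = -1/2: divergent
conv = [expected_Hs_squared(N, -1.0) for N in Ns]   # s = -1: bounded tail
```

The divergent sequence keeps gaining a fixed proportion per doubling of N, while the convergent one flattens out, consistent with the support being $H^{-1/2-}$ and no better.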

#### 2. Gibbs measure dynamics and invariance

Of the same importance as the construction of the Gibbs measure is the study of its dynamics and rigorous justification of its invariance under the flow of (1.1). The question of proving invariance of Gibbs measures for infinite dimensional Hamiltonian systems, with interest from both mathematical and physical aspects, has been extensively studied over the last few decades. In fact, it is the works of Refs. 5, 6, and 26—which attempted to answer this question in some special cases—that mark the very beginning of the subject of random data partial differential equations (PDEs).

The literature is now extensive, so we will only review those related to NLS equations. After the construction of Gibbs measures in Ref. 26, the first invariance result was due to Bourgain,5 which applies in one dimension for focusing sub-quintic equations and for defocusing equations with any power nonlinearity. Bourgain6 then extended the defocusing result to two dimensions, but only for the cubic equation; the two-dimensional case with arbitrary (odd) power nonlinearity was recently solved by the authors.17 For the case of Hartree nonlinearity (1.1) in three dimensions, Bourgain7 obtained invariance for β > 2. We also mention the works of Tzvetkov34,35 and of Bourgain and Bulut,8,9 which concern the NLS equation inside a disk or ball, the construction of non-unique weak solutions by Oh and Thomann30 following the scheme in Refs. 2, 13, and 15, and the relevant works on wave equations.11,12,14,28,29 In particular, the recent work of Bringmann12 established Gibbs measure invariance for the wave equation with the Hartree nonlinearity (1.1) for arbitrary β > 0.

The main mathematical challenge in proving invariance of the Gibbs measure is the low regularity of the support of the measure, especially in two or more dimensions. For example, for the two-dimensional NLS equation with power nonlinearity, the support of the Gibbs measure dν lies in the space of distributions $H^{0-}(\mathbb{T}^2)$, while the scaling critical space is $H^{1/2}(\mathbb{T}^2)$ for the quintic equation and approaches $H^1(\mathbb{T}^2)$ for equations with high power nonlinearities. This gap is a major reason why the two-dimensional quintic and higher cases have remained open for so many years. In the case of (1.1), a similar gap is present, namely, between the support of dν at $H^{-1/2-}(\mathbb{T}^3)$ and the scaling critical space $H^{(1-\beta)/2}(\mathbb{T}^3)$, which is higher than $H^0(\mathbb{T}^3)$ when β < 1.

On the other hand, it has been known since the pioneering work of Bourgain6 that with random initial data, one can go below the classical scaling critical threshold and obtain almost-sure well-posedness results. In the recent works17,18 of the authors, an intuitive probabilistic scaling argument was performed. This leads to the notion of the probabilistic scaling critical index $s_{pr} := -\frac{1}{p-1}$, which is much lower than the classical scaling critical index $s_{cr} := \frac{d}{2}-\frac{2}{p-1}$ in the case of $p$th power nonlinearity in d dimensions. In Ref. 18, we proved that almost-sure local well-posedness indeed holds in $H^s$ in the full probabilistically subcritical range $s > s_{pr}$, in any dimension, and for any (odd) power nonlinearity.
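The arithmetic behind these indices is elementary; the helper functions below (our illustration, not code from Refs. 17 and 18) tabulate both indices and reproduce the values quoted in this section:

```python
# Illustrative arithmetic (ours) for the scaling indices discussed above.

def s_cr(d, p):
    """Classical scaling critical index for p-th power NLS in d dimensions."""
    return d / 2 - 2 / (p - 1)

def s_pr(p):
    """Probabilistic scaling critical index for p-th power NLS."""
    return -1 / (p - 1)

def s_pr_hartree(beta):
    """Probabilistic index for the Hartree nonlinearity, as quoted below."""
    return -(1 + beta) / 2

# 2D quintic: s_cr = 1/2, matching the critical space H^{1/2}(T^2) above;
# 3D cubic: s_cr = 1/2 while s_pr = -1/2, just below the Gibbs support.
```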

For the case of (1.1), a similar argument as in Refs. 17 and 18 yields that the probabilistic scaling critical index for (1.1) is $s_{pr} = -\frac{1+\beta}{2}$, which is lower than −1/2, so it is reasonable to expect that almost-sure well-posedness holds. However, the situation here is somewhat different from that in Refs. 17 and 18 due to the asymmetry of the nonlinearity in (1.1) compared to the power case, which leads to interesting modifications of the methods in these previous works, as we will discuss in Sec. I C.

#### 3. Probabilistic methods

The first idea in proving almost-sure well-posedness was due to Bourgain6 and Da Prato and Debussche,15 the latter in the setting of parabolic stochastic partial differential equations (SPDEs); it can be described as a linear–nonlinear decomposition. Namely, the solution is decomposed into a linear random evolution (or noise) term and a nonlinear term that has strictly higher regularity, thanks to probabilistic cancellations in the nonlinearity. If the linear term has regularity close to scaling criticality, then the nonlinear term can usually be bounded sub-critically; hence, a fixed point argument applies. However, this idea has its limitations in that the nonlinear term may not be smooth enough; in practice, it is usually limited to slightly supercritical cases (relative to deterministic scaling) and does not give optimal results.

In Ref. 17, inspired partly by the regularity structure theory of Hairer and the para-controlled calculus by Gubinelli, Imkeller, and Perkowski in the parabolic SPDE setting, we developed the theory of random averaging operators. The main idea is to take the high–low interaction, which is usually the worst contribution to the nonlinear term described above, and express it as a para-product-type linear operator—called the random averaging operator—applied to the random initial data. Moreover, this linear operator is independent of the initial data it applies to and has a randomness structure, which includes the information of the solution at lower scales; see Sec. I C. This structure is then shown to be preserved from low to high frequencies by an induction on scales argument and eventually leads to improved almost-sure well-posedness results. We refer the reader to Ref. 33 for an example of a recent application of the method of random averaging operators of Ref. 17 to weakly dispersive NLS.

In Ref. 18, the random averaging operators are extended to the more general theory of random tensors. In this theory, the linear operators are extended to multilinear operators, which are represented by tensors, and whole algebraic and analytic theories are then developed for these random tensors. For NLS equations with odd power nonlinearity, this theory leads to the proof of optimal almost-sure well-posedness results; see Ref. 18. We remark that, while the theory of random tensors is more powerful than random averaging operators, the latter has a simpler structure, is less notation-heavy, and is already sufficient in many situations (especially if one is not very close to probabilistic criticality).

Finally, we would like to mention other probabilistic methods, developed in the recent works of Gubinelli, Koch, and Oh,25 Bringmann,10,12 and Oh, Okamoto, and Tolomeo.28 These methods also go beyond the linear–nonlinear decomposition and are partly inspired by the parabolic theories. They have important similarities and differences compared to our methods in Refs. 17 and 18, but they mostly apply for wave equations instead of Schrödinger equations, so we will not further elaborate here but refer the reader to the above papers for further explanation.

We start by fixing the i.i.d. normalized (complex) Gaussian random variables $\{g_k(\omega)\}_{k\in\mathbb{Z}^3}$ so that $\mathbb{E}g_k=0$ and $\mathbb{E}|g_k|^2=1$. Let

$f(\omega)=\sum_{k\in\mathbb{Z}^3}\frac{g_k(\omega)}{\langle k\rangle}e^{ik\cdot x},$
(1.5)

and it is easy to see that $f(\omega)\in H^{-1/2-}(\mathbb{T}^3)$ almost surely. Let $V:\mathbb{T}^3\to\mathbb{R}$ be a potential such that V is even and non-negative, with $V_0=1$ and $|V_k|\lesssim\langle k\rangle^{-\beta}$ as described above. Here and below, we will use $u_k$ to denote the Fourier coefficients of u and use $\hat{u}$ to represent the time Fourier transform only. In this paper, we fix β < 1 sufficiently close to 1 (this is a specific value, but we do not track it below). Let $N\in 2^{\mathbb{Z}_{\ge 0}}\cup\{0\}$ be a dyadic scale, define projections $\Pi_N$ such that $(\Pi_Nu)_k=1_{\langle k\rangle\le N}\cdot u_k$ and $\Delta_N=\Pi_N-\Pi_{N/2}$, and define

$f_N(\omega)=\Pi_Nf(\omega),\qquad F_N(\omega)=\Delta_Nf(\omega)=f_N(\omega)-f_{N/2}(\omega).$
(1.6)

We introduce the following truncated and renormalized versions of (1.1), with the truncated random initial data:

$i\partial_tu_N+\Delta u_N=\Pi_N[(|u_N|^2*V)\cdot u_N]-\sigma_Nu_N-\mathcal{C}_Nu_N,\qquad u_N(0)=\Pi_Nu_{\mathrm{in}}.$
(1.7)

Here, in (1.7), we fix

$\sigma_N:=\sum_{\langle k\rangle\le N}\frac{1}{\langle k\rangle^2},$
(1.8)

and $\mathcal{C}_N$ is the Fourier multiplier

$(\mathcal{C}_Nu)_k=(\mathcal{C}_N)_k\cdot u_k,\qquad (\mathcal{C}_N)_k:=\sum_{\langle\ell\rangle\le N}\frac{V_{k-\ell}}{\langle\ell\rangle^2}.$
(1.9)

Note that uN is supported in ⟨k⟩ ≤ N for all time. The first counterterm in (1.7), namely, −σNuN, corresponds to the standard Wick ordering, where one fixes k1 = k2 in the expression,

$[(|u|^2*V)\cdot u]_k=\sum_{k_1-k_2+k_3=k}V_{k_1-k_2}\cdot u_{k_1}\overline{u_{k_2}}u_{k_3},$
(1.10)

plugs in u = f_N(ω), and takes expectations. The second counterterm $-\mathcal{C}_Nu_N$ corresponds to fixing k₂ = k₃, and it is present due to the asymmetry of the nonlinearity (|u|² * V) · u. Note that $(\mathcal{C}_N)_k$ is uniformly bounded, and thus unnecessary, if β > 1 (in particular, this is the case in the work of Bourgain7); if β < 1, this becomes a divergent term that needs to be subtracted.
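The dichotomy at β = 1 can be made concrete numerically. Taking the model multiplier $V_k=\langle k\rangle^{-\beta}$ (an assumption for illustration), the Wick constant arising from the expectation computation above is $\sigma_N=\sum_{\langle\ell\rangle\le N}\langle\ell\rangle^{-2}$, and the worst value of the second counterterm is $(\mathcal{C}_N)_0=\sum_{\langle\ell\rangle\le N}\langle\ell\rangle^{-2-\beta}$; the sketch below shows that the latter saturates for β > 1 but keeps growing for β < 1:

```python
import numpy as np

# Numerical sketch (ours): growth of the renormalization constants with the
# model multiplier V_k = <k>^{-beta}. sigma_N grows linearly in N for any beta,
# while (C_N)_0 = sum_{<l> <= N} <l>^{-2-beta} stays bounded iff beta > 1.

def lattice_sum(N, exponent):
    r = np.arange(-N, N + 1)
    kx, ky, kz = np.meshgrid(r, r, r, indexing="ij")
    jap = np.sqrt(1.0 + kx**2 + ky**2 + kz**2)   # Japanese bracket <k>
    return float(np.sum(np.where(jap <= N, jap ** exponent, 0.0)))

sigma = {N: lattice_sum(N, -2) for N in (16, 32)}            # ~ 4*pi*N
C0_rough = {N: lattice_sum(N, -2 - 0.5) for N in (16, 32)}   # beta = 0.5 < 1
C0_smooth = {N: lattice_sum(N, -2 - 2.0) for N in (16, 32)}  # beta = 2 > 1
```

Doubling N roughly doubles σ_N, visibly increases the β < 1 sum, and leaves the β > 1 sum essentially unchanged, matching the statement that the subtraction of $\mathcal{C}_N$ is needed precisely for β < 1.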

Equation (1.7) is a finite dimensional Hamiltonian equation with the Hamiltonian

$H_N[u]:=\int_{\mathbb{T}^3}|\nabla u|^2+\frac{1}{2}|u|^2(V*|u|^2)-\sigma_N|u|^2-\frac{1}{2}\mathcal{C}_Nu\cdot\bar{u}+\frac{1}{2}\sigma_N^2-\frac{1}{2}\gamma_N,$
(1.11)

where $\gamma_N=\sum_{\langle k\rangle,\langle\ell\rangle\le N}\frac{V_{k-\ell}}{\langle k\rangle^2\langle\ell\rangle^2}$.

Remark 1.1.
In fact, the Hamiltonian $H_N[u]$ can also be expressed as $\int_{\mathbb{T}^3}|\nabla u|^2+\frac{1}{2}:|u|^2(V*|u|^2):$, where the suitably renormalized nonlinearity $:|u|^2(V*|u|^2):$ is defined as
$:|u|^2(V*|u|^2):\,=\,|u|^2(V*|u|^2)-\sigma_N(V*|u|^2)-\sigma_N|u|^2-\mathcal{C}_Nu\cdot\bar{u}+\sigma_N^2-\gamma_N.$
(1.12)
Note that $\int_{\mathbb{T}^3}\sigma_N(V*|u|^2)=\int_{\mathbb{T}^3}\sigma_N|u|^2$ since $\hat{V}(0)=1$.
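The identity at the end of the remark is simply the statement that convolution with V preserves the mean when $\hat{V}(0)=1$; a quick FFT check on a discretized torus (our sketch; the grid size and β value are arbitrary choices):

```python
import numpy as np

# FFT check (ours) of int (V * w) dx = int w dx when V-hat(0) = 1, on a
# discrete 3D torus, with the model multiplier V_k = <k>^{-beta}.

n, beta = 16, 0.9
rng = np.random.default_rng(0)
u = rng.standard_normal((n, n, n)) + 1j * rng.standard_normal((n, n, n))
w = np.abs(u) ** 2                        # the density |u|^2

k = np.fft.fftfreq(n, d=1.0 / n)          # integer frequencies
kx, ky, kz = np.meshgrid(k, k, k, indexing="ij")
Vhat = (1.0 + kx**2 + ky**2 + kz**2) ** (-beta / 2)   # so V_0 = 1 exactly

Vw = np.fft.ifftn(Vhat * np.fft.fftn(w)).real         # V * w on the grid
```

Since the zeroth Fourier mode of V * w is $\hat V(0)\hat w(0) = \hat w(0)$, the two means agree up to rounding error.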
We can define the corresponding truncated and renormalized Gibbs measures, namely,
$d\eta_N(u)=\frac{1}{Z_N}e^{-H_N[u]-\|u\|_{L^2}^2}\prod_{\langle k\rangle\le N}du_k\,d\bar{u}_k,$
(1.13)
where $Z_N > 0$ is a normalization constant making $d\eta_N$ a probability measure. Clearly, $d\eta_N$ is invariant under the finite dimensional flow (1.7). Note that we can also write
$d\eta_N(u)=\frac{1}{Z_N^*}e^{-H_N^{\mathrm{pot}}[u]}\,d\rho_N(u),$
(1.14)
where $Z_N^*$ is another positive constant, $d\rho_N$ is the law of the linear Gaussian random variable $f_N(\omega):=\Pi_Nf(\omega)$, and $H_N^{\mathrm{pot}}[u]$ represents the potential energy, given by
$H_N^{\mathrm{pot}}[u]=\int_{\mathbb{T}^3}\frac{1}{2}|u|^2(V*|u|^2)-\sigma_N|u|^2-\frac{1}{2}\mathcal{C}_Nu\cdot\bar{u}+\frac{1}{2}\sigma_N^2-\frac{1}{2}\gamma_N.$
(1.15)
Now, define $\Pi_N^\perp=1-\Pi_N$. Let $V_N$ and $V_N^\perp$ be the ranges of the projections $\Pi_N$ and $\Pi_N^\perp$, and let $d\rho$ and $d\rho_N^\perp$ be the laws of $f(\omega)$ and $\Pi_N^\perp f(\omega)$, respectively. Then, we have $d\rho=d\rho_N\times d\rho_N^\perp$; moreover, we define
$d\nu_N=d\eta_N\times d\rho_N^\perp=G_N(u)\cdot d\rho,\qquad G_N(u):=\frac{1}{Z_N^*}e^{-H_N^{\mathrm{pot}}[\Pi_Nu]}.$
We have the following result. Recall that in this paper, we are fixing β < 1 close enough to 1; in particular, β > 1/2.

Proposition 1.2.

Suppose that V is the Bessel potential of order β with β > 1/2; then, $G_N(u)$ converges to a limit G(u) in $L^q(d\rho)$ for all 1 ≤ q < ∞, and the sequence of measures $d\nu_N$ converges to a probability measure dν in total variation. The measure dν is called the Gibbs measure associated with system (1.1).

Proof.

This is proved in the recent works of Bringmann11 and Oh, Okamoto, and Tolomeo.28 Strictly speaking, those works deal with the case of real-valued u (as they are concerned with the wave equation), but the proof can be readily adapted to the complex-valued case here.□

Now, we can state our main theorem.

Theorem 1.3.
Let V be the Bessel potential of order β, where β < 1 is close enough to 1. There exists a Borel set $\Sigma\subset H^{-1/2-}(\mathbb{T}^3)$ such that ν(Σ) = 1, and the following holds. For any $u_{\mathrm{in}}\in\Sigma$, let $u_N(t)$ be defined by (1.7); then, the limit
$\lim_{N\to\infty}u_N(t)=u(t)$
exists in $C_t^0H_x^{-1/2-}(\mathbb{R}\times\mathbb{T}^3)$, and u(t) ∈ Σ for all $t\in\mathbb{R}$. This u(t) solves (1.1) with a suitably renormalized nonlinearity and defines a mapping $\Phi_t:\Sigma\to\Sigma$ for each $t\in\mathbb{R}$. These mappings satisfy the group property $\Phi_{t+s}=\Phi_t\Phi_s$ and keep the Gibbs measure dν invariant, namely, $\nu(E)=\nu(\Phi_t(E))$ for any $t\in\mathbb{R}$ and Borel set E ⊂ Σ.

Remark 1.4.

The condition that V is the Bessel potential of order β in Theorem 1.3 is required only because of the assumption of Proposition 1.2, which is proved in Refs. 11 and 28. In fact, all the proofs in this paper are mainly about the almost-sure local-in-time well-posedness and only require that V satisfies the following: (1) V is real-valued and even, and so is $\hat{V}$; (2) V ≥ 0; and (3) $\hat{V}(0)=1$ and $|\hat{V}(k)|\lesssim\langle k\rangle^{-\beta}$.

Remark 1.5.

As in Refs. 17 and 18, the sequence {uN} can be replaced by other canonical approximation sequences, for example, with the sharp truncations ΠN on the initial data replaced by smooth truncations or with the projection ΠN onto the nonlinearity in (1.7) omitted. The limit obtained does not depend on the choice of such sequences, and the proof will essentially be the same.

#### 1. Regarding the range of β

The range of β obtained in Theorem 1.3 is clearly not optimal. In fact, Eq. (1.1) with Gibbs measure data is probabilistically subcritical as long as β > 0, and one should expect the same result at least when β > 1/2 (so that the Gibbs measure is absolutely continuous with respect to the Gaussian free field).

The purpose of this paper, however, is to provide an example where the method of random averaging operators17 is applied so that one can significantly improve the existing probabilistic results (β close to but smaller than 1 vs β > 2 in Ref. 7) while keeping the presentation relatively short. In order to treat β > 1/2, one would need to adapt the sophisticated theory of random tensors,18 which would considerably increase the length of this work, so we decide to leave this part to a future paper.

As for the case 0 < β < 1/2, one would need to deal with the mutual singularity between the Gibbs measure and the Gaussian free field (of course, if one instead studies the local well-posedness problem with Gaussian initial data as in (1.5), which is different from the Gibbs data, then a modification of the random tensor theory18 would also likely work for all β > 0). The recent work of Bringmann12 provides a nice example where this issue is solved in the context of wave equations, and it would be interesting to see whether this can be extended to Schrödinger equations. Finally, the case β = 0, which is the famous Gibbs measure invariance problem for the three-dimensional cubic NLS equation, remains an outstanding open problem. It is probabilistically critical, which presumably would require completely new techniques to solve.

Due to the absolute continuity of the Gibbs measure in Proposition 1.2, in order to prove Theorem 1.3, we only need to consider the initial data distributed according to dρ for (the renormalized version of) (1.1) and the initial data distributed according to dρN for (1.7). In other words, we may assume that u(0) = f(ω) for (1.1) and uN(0) = fN(ω) for (1.7).

#### 1. Random averaging operators

Let us focus on (1.7); for simplicity, we will ignore the renormalization terms. The approach of Bourgain and of Da Prato and Debussche corresponds to decomposing

$u_N(t)=e^{it\Delta}f_N(\omega)+v(t),$

where fN is as in (1.6) and v(t) is the nonlinear evolution. In particular, this v(t) contains a trilinear Gaussian term,

$v^*(t)=\int_0^te^{i(t-t')\Delta}\Pi_N[(|e^{it'\Delta}f_N(\omega)|^2*V)e^{it'\Delta}f_N(\omega)]\,dt'.$

This term turns out to have only $H^{0-}$ regularity, which is not regular enough for a fixed point argument (note that the classical scaling critical threshold is $H^{(1-\beta)/2}$). Therefore, this approach does not work.

Nevertheless, one may observe that the only contribution to $v^*$ with the worst ($H^{0-}$) regularity is the one where the first two input factors are at low frequency and the third factor is at high frequency, such as

$\int_0^te^{i(t-t')\Delta}\Pi_N[(|e^{it'\Delta}f_{N'}(\omega)|^2*V)e^{it'\Delta}F_N(\omega)]\,dt'$

for N′ ≪ N and $F_N$ as in (1.6). Moreover, this low frequency component $f_{N'}$ may also be replaced by the corresponding nonlinear term at frequency N′, so it makes sense to separate out the low–low–high interaction term $\psi_N$ defined by

$(i\partial_t+\Delta)\psi_N=\Pi_N[(|u_{N/2}|^2*V)\psi_N],\qquad\psi_N(0)=F_N(\omega)$
(1.16)

as the singular part of $y_N:=u_N-u_{N/2}$, so that $y_N-\psi_N$ has higher regularity.

The idea of considering high–low interactions is consistent with the para-controlled calculus in Refs. 23–25. However, in those works, the singular term $\psi_N$ and the regular term $y_N-\psi_N$ are characterized only by their regularity (for example, one is constructed via a fixed point argument in $H^{0-}$ and the other in $H^{1/2-}$), which, as pointed out in Ref. 17, is not enough in the context of Schrödinger equations. Instead, it is crucial to study the operator, referred to as the random averaging operator in Ref. 17, which maps z to the solution of the following equation:

$(i\partial_t+\Delta)\psi=\Pi_N[(|u_{N/2}|^2*V)\psi],\qquad\psi(0)=z.$
(1.17)

Note that the kernel of this operator, which we denote by $H^N=(H^N)_{kk'}(t)$, is a Borel function of $\{g_k(\omega)\}_{\langle k\rangle\le N/2}$ and is independent of $F_N(\omega)$. Moreover, this $H^N$ encodes the whole randomness structure of $u_{N/2}$, which is captured in two particular matrix norm bounds for $H^N$. Essentially, they involve the $\ell^2_k\to\ell^2_{k'}$ operator norm and the $\ell^2_{kk'}$ Hilbert–Schmidt norm for fixed time t (or fixed Fourier variable λ); see Sec. II B 2 for details.

This is the main idea of the random averaging operators in Ref. 17. Basically, it allows one to fully exploit the randomness structure of the solution at all scales, which is necessary for the proof in the setting of Schrödinger equations, given the absence of any smoothing effect.

#### 2. The special term ρN: A “critical” component

In addition to the ansatz introduced in Sec. I C 1, it turns out that an extra term is necessary due to the structure (especially, the asymmetry) of the nonlinearity (1.1). Recall that (|u|² * V)u can be expressed as in (1.10); for simplicity, we will ignore any resonances (which are canceled by the renormalizations), i.e., assume that $k_2\notin\{k_1,k_3\}$ in (1.10). Here, if $|k_1-k_2|\gtrsim N^\varepsilon$ for some small constant ε, then the potential $V_{k_1-k_2}$, which is bounded by $\langle k_1-k_2\rangle^{-\beta}$, will transform into a derivative gain, which allows one to close easily using the random averaging operator ansatz in Sec. I C 1.

However, suppose that $|k_1-k_2|$ is very small, say, $|k_1-k_2|\sim 1$ in (1.10); then, the potential does not lead to any gain of derivatives, and we will see that this particular term, in fact, exhibits some (probabilistically) “critical” feature. To see this, let us define $\mathcal{N}$ to be this portion of the nonlinearity (and the corresponding multilinear expression),

$\mathcal{N}(u,v,w)=\Pi_N[(\Pi_1(u\bar{v})*V)\cdot w],$
(1.18)

where the $\Pi_1$ projector restricts to $|k_1-k_2|\sim 1$. Then, if we define the iteration terms

$u^{(0)}(t)=e^{it\Delta}F_N(\omega),\qquad u^{(m)}(t)=\sum_{m_1+m_2+m_3=m-1}\int_0^te^{i(t-t')\Delta}\mathcal{N}(u^{(m_1)},u^{(m_2)},u^{(m_3)})(t')\,dt',$

it follows from simple calculations that $u^{(0)}$ has regularity $H^{-1/2-}$, while each $u^{(m)}$, where m ≥ 1, has exactly regularity $H^{1/2-}$. Therefore, although $u^{(1)}$ is indeed more regular than $u^{(0)}$, the higher order iterations are not getting smoother, despite all input functions [which are $F_N(\omega)$] having the same (and high) frequency. This is in contrast with the “genuinely (probabilistically) subcritical” situations (for the standard NLS equation) in Ref. 18, where for fixed positive constants ε and c, the mth iteration $u^{(m)}$, assuming that all input frequencies are the same, will have increasing and positive regularity $H^{\varepsilon m-c}$ as m grows and becomes large. Similarly, one may consider the linear operator,

$z\mapsto\int_0^te^{i(t-t')\Delta}\mathcal{N}(z(t'),e^{it'\Delta}F_N(\omega),e^{it'\Delta}F_N(\omega))\,dt',$

with $\mathcal{N}$ as in (1.18); in typical subcritical cases, the norm of this operator from a suitable $X^{s,b}$ space to itself would be $N^{-\alpha}$ for some α > 0; see Refs. 17 and 18. However, here (for Hartree), one can check that the corresponding norm is, in fact, ∼1 and may even exhibit a logarithmic divergence if one adds up different scales.

Therefore, it is clear that the contribution of $\mathcal{N}$ as in (1.18) needs a special treatment in addition to the ansatz in Sec. I C 1. Fortunately, this term does not depend on the value of β and was already treated in the work of Bourgain.7 In this work, we introduce an extra term $\rho_N$, which corresponds to the term treated by Bourgain,7 by defining $\xi_N$ such that

$(i\partial_t+\Delta)\xi_N=\Pi_N\big[(|u_{N/2}|^2*V)\xi_N+\tilde{\Pi}_{N^\varepsilon}\big((|u_N|^2-|u_{N/2}|^2)*V\big)\xi_N\big],\qquad\xi_N(0)=F_N(\omega)$
(1.19)

and defining $\rho_N=\xi_N-\psi_N$, where $\tilde{\Pi}_{N^\varepsilon}$ is a smooth truncation at frequency $N^\varepsilon$ for some small ε. This term is then measured at regularity $H^s$ for some s < 1/2, while the remainder term $z_N:=y_N-\xi_N$, where $y_N=u_N-u_{N/2}$, is measured at regularity $H^{s'}$ for some s < s′ < 1/2. See Sec. III A 3 for the solution ansatz and Proposition 3.1 for the precise formulations.

Note that the precise definitions of the equations satisfied by $\psi_N$ and $\xi_N$ [see (3.2) and (3.8)] involve the projection $\Delta_N$ on the right-hand sides; this is to make sure that $(\psi_N)_k$ and $(\xi_N)_k$ are exactly supported in $N/2<\langle k\rangle\le N$ so that one can exploit the cancellation due to the unitarity of the matrices $H^N$ (corresponding to $\psi_N$), as well as the matrices $M^N$ that correspond to the term $\xi_N$. This unitarity comes from the mass conservation property of the linear equations defining these matrices and already plays a key role in the work of Bourgain.7 See Sec. III B for details.

We start with system (1.7) with initial data $u_N(0)=f_N(\omega)$. Clearly, $(u_N)_k$ is supported in $\langle k\rangle\le N$. If we denote the right-hand side of (1.7) by $\Pi_N\mathcal{N}(u_N)$, then in Fourier space, we have

$\mathcal{N}(u)_k=\mathcal{N}^\circ(u)_k+u_k\cdot\Big(\sum_\ell|u_\ell|^2-\sigma_N\Big)+u_k\cdot\sum_{\ell\neq k,\,\langle\ell\rangle\le N}V_{k-\ell}\Big(|u_\ell|^2-\frac{1}{\langle\ell\rangle^2}\Big)-\frac{u_k}{\langle k\rangle^2},$
(2.1)
$\mathcal{N}^\circ(u)_k=\sum_{\substack{k_1-k_2+k_3=k\\ k_2\notin\{k_1,k_3\}}}V_{k_1-k_2}\cdot u_{k_1}\bar{u}_{k_2}u_{k_3}.$
(2.2)

We will extend $\mathcal{N}^\circ(u)$, which is a cubic polynomial of u, to an $\mathbb{R}$-trilinear operator $\mathcal{N}^\circ(u,v,w)$ in the standard way. Note that the mass $\sum_k|u_k|^2$ is conserved under the flow (1.7), and we may get rid of the second term on the right-hand side of (2.1) by the gauge transform

$u_N\to e^{iB_Nt}u_N,\qquad B_N:=\sum_{\langle\ell\rangle\le N}\frac{|g_\ell|^2-1}{\langle\ell\rangle^2}.$

If we further define the profile vN by

$(v_N)_k(t)=e^{it|k|^2}e^{iB_Nt}(u_N)_k(t),$

then $v_N$ will satisfy the integral equation,

$(v_N)_k(t)=(f_N)_k-i\int_0^t\Pi_N\mathcal{M}^\circ(v_N,v_N,v_N)_k(s)\,ds-i\int_0^t(v_N)_k(s)\sum_{\ell\neq k,\,\langle\ell\rangle\le N}V_{k-\ell}\Big(|(v_N)_\ell(s)|^2-\frac{1}{\langle\ell\rangle^2}\Big)\,ds+i\int_0^t\frac{(v_N)_k(s)}{\langle k\rangle^2}\,ds,$
(2.3)

where

$\mathcal{M}^\circ(u,v,w)_k(s)=\sum_{\substack{k_1-k_2+k_3=k\\ k_2\notin\{k_1,k_3\}}}e^{-is\Omega}\cdot V_{k_1-k_2}\cdot u_{k_1}(s)\bar{v}_{k_2}(s)w_{k_3}(s),\qquad\Omega:=|k_1|^2-|k_2|^2+|k_3|^2-|k|^2.$
(2.4)

Below, we will focus on systems (2.3) and (2.4).

We set up some basic notations and norms needed later in the proof.

#### 1. Notations

As noted above, we will use $v_k$ to denote Fourier coefficients, and $\mathcal{F}v_k=\hat{v}_k=\hat{v}_k(\lambda)$ denotes the Fourier transform in time. For a finite index set A, we will write $k_A=(k_j:j\in A)$, where each $k_j\in\mathbb{Z}^3$, and denote by $h_{k_A}$ a tensor $h:(\mathbb{Z}^3)^A\to\mathbb{C}$. We may also define tensors involving λ variables, where $\lambda\in\mathbb{R}$.

We fix the parameters, to be used in the proof, as follows. Let ε > 0 be a sufficiently small absolute constant. Let $\varepsilon_1$ and $\varepsilon_2$ be fixed such that $\varepsilon_2\ll\varepsilon_1\ll\varepsilon$. Let β < 1 be such that $1-\beta\ll\varepsilon_2$, and choose δ such that $\delta\ll 1-\beta$ and κ such that $\kappa\gg\delta^{-1}$. We use θ to denote a generic small positive constant such that $\theta\ll\delta$ (which may be different at different places). Let $b=1/2+\kappa^{-1}$, so that $1-b=1/2-\kappa^{-1}$. Finally, let τ be sufficiently small compared to all the above-mentioned parameters, and denote J = [−τ, τ]. Fix a smooth cutoff function χ(t), which equals 1 for |t| ≤ 1 and equals 0 for |t| ≥ 2, and define $\chi_\tau(t):=\chi(\tau^{-1}t)$. We use C to denote any large absolute constant and $C_\theta$ for any large constant depending on θ. If some event happens with probability $\ge 1-C_\theta e^{-A^\theta}$, where A is a large parameter, we say that this event happens A-certainly.

#### 2. Norms

If (B, C) is a partition of A, namely, $B\cap C=\varnothing$ and $B\cup C=A$, we define the norm $\|h\|_{k_B\to k_C}$ by

$\|h\|_{k_B\to k_C}^2=\sup\bigg\{\sum_{k_C}\Big|\sum_{k_B}h_{k_A}z_{k_B}\Big|^2:\sum_{k_B}|z_{k_B}|^2=1\bigg\}.$
(2.5)

The same notation also applies for tensors involving the λ variables. For functions $u=u_k(t)$ and $h=h_{kk'}(t)$ and 0 < c < 1, we also define the norms

$\|u\|_{X^c}^2:=\int_{\mathbb{R}}\langle\lambda\rangle^{2c}\|\hat{u}_k(\lambda)\|_{\ell^2_k}^2\,d\lambda,\qquad\|h\|_{Y^c}^2:=\int_{\mathbb{R}}\langle\lambda\rangle^{2c}\|\hat{h}_{kk'}(\lambda)\|_{k\to k'}^2\,d\lambda,\qquad\|h\|_{Z^c}^2:=\int_{\mathbb{R}}\langle\lambda\rangle^{2c}\|\hat{h}_{kk'}(\lambda)\|_{\ell^2_{kk'}}^2\,d\lambda.$
(2.6)
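Concretely, the norm (2.5) is an ordinary matrix operator norm once the tensor is flattened, merging the $k_B$ variables into the input index and the $k_C$ variables into the output index. A small numerical illustration (ours, with an arbitrary random tensor):

```python
import numpy as np

rng = np.random.default_rng(0)
n = 6
h = rng.standard_normal((n, n, n))          # tensor h_{k1 k2 k3}, A = {1,2,3}

# Norm (2.5) with B = {1}, C = {2,3}: flatten (k2,k3) into a single output
# index and take the matrix operator norm of the resulting n^2-by-n matrix.
mat = h.transpose(1, 2, 0).reshape(n * n, n)
op_norm = np.linalg.norm(mat, 2)            # largest singular value

# Sanity check against the definition: sup over (sampled) unit vectors z_{k_B}.
best = 0.0
for _ in range(200):
    z = rng.standard_normal(n)
    z /= np.linalg.norm(z)
    out = np.tensordot(h, z, axes=([0], [0]))     # sum over k1
    best = max(best, np.linalg.norm(out))         # l^2 norm over (k2,k3)
```

By definition, every sampled value is below the operator norm, and the sampled supremum comes close to it.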

For any interval I, define the corresponding localized norms

$\|u\|_{X^c(I)}:=\inf\{\|v\|_{X^c}:v=u\text{ on }I\}$
(2.7)

and similarly define $Y^c(I)$ and $Z^c(I)$. Abusing notation, we will call the above v an extension of u, though it is actually an extension of the restriction of u to I.

Remark 2.1.

Note that the Yc norm defined above follows the Yc norm defined in our recent work,18 which is more convenient for the purpose of this paper than the Yc norm defined in our earlier work.17

Here, we record some basic estimates. Most of them are standard or are in our previous works.17,18

#### 1. Linear estimates

Define the original and truncated Duhamel operators,

$\mathcal{I}v(t)=\int_0^tv(t')\,dt',\qquad\mathcal{I}_\chi v(t)=\chi(t)\int_0^t\chi(t')v(t')\,dt'.$
(2.8)

Lemma 2.2.
We have the formula
$\widehat{\mathcal{I}_\chi v}(\lambda)=\int_{\mathbb{R}}\mathcal{I}(\lambda,\lambda')\hat{v}(\lambda')\,d\lambda',$
(2.9)
where the kernel $\mathcal{I}$ satisfies
$|\mathcal{I}|+|\partial_{\lambda,\lambda'}\mathcal{I}|\lesssim\frac{1}{\langle\lambda\rangle^3}+\frac{1}{\langle\lambda-\lambda'\rangle^3}\cdot\frac{1}{\langle\lambda'\rangle}\lesssim\frac{1}{\langle\lambda\rangle\langle\lambda-\lambda'\rangle}.$
(2.10)

Proof.

See Ref. 16, Lemma 3.1; by a similar proof, one can also prove (2.10) for $|\partial_{\lambda,\lambda'}\mathcal{I}|$.□

Proposition 2.3
(Short time bounds). Let φ be any Schwartz function, and recall that $\varphi_\tau(t)=\varphi(\tau^{-1}t)$ for τ ≪ 1. Then, for any $u=u_k(t)$, we have
$\|\varphi_\tau\cdot u\|_{X^c}\lesssim\tau^{c_1-c}\|u\|_{X^{c_1}},$
(2.11)
provided that either $0<c<c_1<1/2$, or $u_k(0)=0$ and $1/2<c<c_1<1$. The same result also holds if u = u(t) is measured in norms other than $\ell^2$, so (2.11) is true with X replaced by Y or Z.

Proof.

See Ref. 18, Lemma 4.2.□

Lemma 2.4
(Suitable extensions). Suppose that f(x, t) is a function defined for $t\in[-\tau,\tau]=J$, with τ ≪ 1. Define
$g(t)=\begin{cases}f(t),&|t|\le\tau,\\ f(\tau),&t>\tau,\\ f(-\tau),&t<-\tau.\end{cases}$
(2.12)
For any Schwartz function φ, we have
$\|\varphi(t)\cdot g(t)\|_{X^b}\lesssim\|f\|_{X^{b_1}(J)}+\|f\|_{L_t^\infty L_x^2(J)},$
(2.13)
provided that either 0 < b < b1 < 1/2 or 1/2 < b < b1 < 1. When 1/2 < b < b1 < 1, we have
$\|\varphi(t)\cdot g(t)\|_{X^b}\lesssim\|f\|_{X^{b_1}(J)}.$
(2.14)

Proof.

We only need to bound, locally in time, the function $f^*(t)$, which equals f(0) for t ≥ 0 and f(t) for t < 0; in fact, g is obtained by performing the transformation from f to $f^*$ twice, first at center τ and then at center −τ.

We can decompose f into two parts: f₁, which is smooth and equals f(0) near 0, and f₂ such that f₂(0) = 0. Clearly, we only need to consider f₂, so that $f^*$ equals f₂ multiplied by a smooth truncation of $1_{[0,+\infty)}$, with f₂(0) = 0.

We may replace $1_{[0,+\infty)}$ by the sign function and then apply Proposition 2.3; note that for an even smooth cutoff function χ,
$\chi(x)\cdot\mathrm{sgn}(x)=\sum_{N\ge 1}\Delta_N(\chi\cdot\mathrm{sgn})(x),$
where ΔN are the standard Littlewood–Paley projections. Moreover, ΔN(χ · sgn) (x) can be viewed as a rescaled Schwartz function of the same form as in Proposition 2.3 with τ = N−1 (due to the expression of the Fourier transform of sgn and simple calculations), so the desired result follows from Proposition 2.3.□

#### 2. Counting estimates

Here, we list some counting estimates and the resulting tensor norm bounds.

Lemma 2.5.
1. Let $\mathcal{R}=\mathbb{Z}$ or $\mathbb{Z}[i]$. Then, given $0\neq m\in\mathcal{R}$ and $a_0,b_0\in\mathbb{C}$, the number of choices for $(a,b)\in\mathcal{R}^2$ that satisfy

$m=ab,\qquad|a-a_0|\le M,\qquad|b-b_0|\le N$
(2.15)
is $O(M^\theta N^\theta)$ with constant depending only on θ > 0.
2. For dyadic numbers $N_1,N_2,N_3,R>0$ and some fixed number $\Omega_0$, define

$S_R=\big\{(k,k_1,k_2,k_3)\in(\mathbb{Z}^3)^4:k_2\notin\{k_1,k_3\},\ k=k_1-k_2+k_3,\ |k|\le N,\ |k|^2-|k_1|^2+|k_2|^2-|k_3|^2=\Omega_0,\ N_j/2<|k_j|\le N_j\ (j\in\{1,2,3\}),\ R/2<\langle k_1-k_2\rangle\le R\big\},$
(2.16)
and let $S_R^k$ be the set of $(k,k_1,k_2,k_3)\in S_R$ when k is fixed (and similarly for the other superscripts below). We have the following counting estimates:
$|S_R|\lesssim\min\big(N_1^3N_3^3(N_2\wedge N)^{1+\theta},\ N^3N_2^3(N_1\wedge N_3)^{1+\theta},\ N_2^3(RN_3)^{2+\theta},\ N^3(RN_1)^{2+\theta}\big),$
(2.17)
$|S_R^k|\lesssim\min\big(N_2^3(N_1\wedge N_3)^{1+\theta},\ (N_1N_3)^{2+\theta},\ (RN_1)^{2+\theta}\big),$
(2.18)
$|S_R^{k_1}|\lesssim\min\big(N_3^3(N_2\wedge N)^{1+\theta},\ (N_2N)^{2+\theta},\ (RN)^{2+\theta}\big),$
(2.19)
$|S_R^{k_2}|\lesssim\min\big(N^3(N_1\wedge N_3)^{1+\theta},\ (N_1N_3)^{2+\theta},\ (RN_3)^{2+\theta}\big),$
(2.20)
$|S_R^{k_3}|\lesssim\min\big(N_1^3(N_2\wedge N)^{1+\theta},\ (N_2N)^{2+\theta},\ (RN_2)^{2+\theta}\big),$
(2.21)
$|S_R^{kk_1}|\lesssim\min(N_2,N_3,R)^{2+\theta},\qquad|S_R^{k_2k_3}|\lesssim\min(N,N_1,R)^{2+\theta},$
(2.22)
$|S_R^{kk_2}|\lesssim\min(N_1,N_3,R)^{1+\theta},\qquad|S_R^{k_1k_3}|\lesssim\min(N_2,N,R)^{1+\theta},$
(2.23)
$|S_R^{kk_3}|\lesssim\min(N_1,N_2,R)^{2+\theta},\qquad|S_R^{k_1k_2}|\lesssim\min(N,N_3,R)^{2+\theta}.$
(2.24)

Proof.

(1) This is the same as part (1) of Lemma 4.3 in Ref. 17. (2) We consider $|S_R|$. First, the number of choices of $k_1$ and $k_3$ is $N_1^3N_3^3$. After fixing the choice of $k_1$ and $k_3$, to count (k, k₂), it is equivalent to count k₂ satisfying the restriction $|k_2|^2+|k_2+c_1|^2=c_2$ or to count k satisfying the restriction $|k|^2+|k+c_3|^2=c_4$ for some fixed numbers c₁, …, c₄; hence, we have $|S_R|\lesssim N_1^3N_3^3(N_2\wedge N)^{1+\theta}$. Similarly, if we first fix k and k₂, we have $|S_R|\lesssim N^3N_2^3(N_1\wedge N_3)^{1+\theta}$. In addition, if we fix k₂ first, then counting $(k,k_1,k_3)$ is equivalent to counting $(k_1,k_3)$ with the restriction $(k_2-k_1)\cdot(k_2-k_3)=c$ for some fixed number c. By fixing the first two components of $(k_1,k_3)$ and using part (1), we have $|S_R|\lesssim N_2^3(RN_3)^{2+\theta}$. Similarly, we also have $|S_R|\lesssim N^3(RN_1)^{2+\theta}$. The proofs of (2.18)–(2.24) are similar.□
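For $\mathcal{R}=\mathbb{Z}$, part (1) is an instance of the divisor bound: the count is at most the number of factorizations of m, which grows slower than any fixed power of m. A brute-force illustration (ours; the specific m is an arbitrary highly composite choice):

```python
# Divisor-counting illustration (ours) for Lemma 2.5(1) with R = Z: count the
# pairs (a, b) with a*b = m inside the windows |a - a0| <= M, |b - b0| <= N.

def count_factorizations(m, a0, M, b0, N):
    cnt = 0
    for a in range(a0 - M, a0 + M + 1):
        if a != 0 and m % a == 0:
            b = m // a
            if abs(b - b0) <= N:
                cnt += 1
    return cnt

m = 720720                 # 2^4 * 3^2 * 5 * 7 * 11 * 13, so d(m) = 240
total = count_factorizations(m, 0, m, 0, m)   # all signed divisor pairs
```

With the windows covering everything, the count is 2·d(m) = 480 (each positive divisor pair and its negation), far below $\sqrt{m}\approx 849$, illustrating the $O(M^\theta N^\theta)$ bound.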

#### 3. Probabilistic and tensor estimates

Proposition 2.6
(Proposition 4.11 in Ref. 18). Consider two tensors $h^{(1)}_{k_{A_1}}$ and $h^{(2)}_{k_{A_2}}$, where $A_1\cap A_2=C$. Let $A_1\Delta A_2=A$, and define the semi-product
$H_{k_A}=\sum_{k_C}h^{(1)}_{k_{A_1}}h^{(2)}_{k_{A_2}}.$
(2.25)
Then, for any partition (X, Y) of A, let $X\cap A_1=X_1$ and $Y\cap A_1=Y_1$ (and similarly for $X_2$, $Y_2$), and we have
$\|H\|_{k_X\to k_Y}\le\|h^{(1)}\|_{k_{X_1\cup C}\to k_{Y_1}}\cdot\|h^{(2)}\|_{k_{X_2}\to k_{C\cup Y_2}}.$
(2.26)

Proposition 2.7
(Proposition 4.12 in Ref. 18). Let Aj (1 ≤ jm) be index sets such that any index appears in at most two Ajs, and let$h(j)=hkAj(j)$be tensors. Let A = A1Δ⋯ΔAmbe the set of indices that belong to only one Ajand C = (A1 ∪⋯∪ Am) A be the set of indices that belong to two different Ajs. Define the semi-product
$H_{k_A}=\sum_{k_C}\prod_{j=1}^{m}h^{(j)}_{k_{A_j}}.$
(2.27)
Let $(X,Y)$ be a partition of $A$. For $1\le j\le m$, let $X_j=X\cap A_j$ and $Y_j=Y\cap A_j$, and define
$B_j\coloneqq\bigcup_{\ell>j}(A_j\cap A_\ell),\quad C_j\coloneqq\bigcup_{\ell<j}(A_j\cap A_\ell),$
(2.28)
and then, we have
$\|H\|_{k_X\to k_Y}\le\prod_{j=1}^{m}\|h^{(j)}\|_{k_{X_j\cup B_j}\to k_{Y_j\cup C_j}}.$
(2.29)
For the proofs of Propositions 2.6 and 2.7, see Ref. 18. In that work, the full power of (2.26) and (2.29) is needed, but here, we only need some specific cases, mainly those of the following form (where $q\le r$):
$\Big\|\sum_{k_1,\dots,k_q}H_{k_1\cdots k_r}h^{(1)}_{k_1k_1'}\cdots h^{(q)}_{k_qk_q'}\Big\|_{k_{A'}\to k_{B'}}\le\|H\|_{k_A\to k_B}\prod_{j=1}^{q}\|h^{(j)}\|_{k_j\to k_j'},$
(2.30)
where $(k_{A'},k_{B'})$ is a partition of the variables $(k_1',\dots,k_q',k_{q+1},\dots,k_r)$ and $(k_A,k_B)$ is the corresponding partition of the variables $(k_1,\dots,k_r)$ obtained by replacing each $k_j'$ ($1\le j\le q$) with $k_j$.

Proposition 2.8
(Proposition 4.14 in Ref. 18). Let $A$ be a finite set and $h_{bck_A}=h_{bck_A}(\omega)$ be a tensor, where each $k_j\in\mathbb Z^d$ and $(b,c)\in(\mathbb Z^3)^q$ for some integer $q\ge2$. Given signs $\zeta_j\in\{\pm\}$, we also assume that $\langle b\rangle,\langle c\rangle\lesssim M$ and $\langle k_j\rangle\lesssim M$ for all $j\in A$, where $M$ is a dyadic number, and that in the support of $h_{bck_A}$, there is no pairing in $k_A$. Define the tensor
$H_{bc}=\sum_{k_A}h_{bck_A}\prod_{j\in A}\eta_{k_j}^{\zeta_j},$
(2.31)
where we restrict $k_j\in E$ in (2.31), with $E$ being a finite set such that $\{h_{bck_A}\}$ is independent of $\{\eta_k: k\in E\}$. Then, $\tau^{-1}M$-certainly, we have
$\|H_{bc}\|_{b\to c}\lesssim\tau^{-\theta}M^{\theta}\cdot\max_{(B,C)}\|h\|_{bk_B\to ck_C},$
(2.32)
where $(B,C)$ runs over all partitions of $A$. The same results hold if we do not assume $\langle b\rangle,\langle c\rangle\lesssim M$ but instead that (i) $b,c\in\mathbb Z^3$, $|b-c|\lesssim M$, and $||b|^2-|c|^2|\lesssim M^{\kappa^3}$ and (ii) $h_{bck_A}$ can be written as a function of $b-c$, $|b|^2-|c|^2$, and $k_A$.

For the Proof of Proposition 2.8, see Ref. 18, Propositions 4.14 and 4.15.
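To orient the reader, we record the simplest instance of (2.32) (our illustration): if $A=\{1\}$ is a singleton with $\zeta_1=+$, the no-pairing condition is vacuous, and (2.32) states that, $\tau^{-1}M$-certainly,

$\Big\|\sum_{k_1}h_{bck_1}\eta_{k_1}\Big\|_{b\to c}\lesssim\tau^{-\theta}M^{\theta}\max\big(\|h\|_{bk_1\to c},\|h\|_{b\to ck_1}\big),$

reflecting the square root cancellation in the sum of the independent Gaussians $\eta_{k_1}$, as opposed to the trivial bound obtained by summing absolute values.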

Proposition 2.9
(Weighted bounds). Suppose that the matrices $h=h_{kk''}$, $h^{(1)}=h^{(1)}_{kk'}$, and $h^{(2)}=h^{(2)}_{k'k''}$ satisfy
$h_{kk''}=\sum_{k'}h^{(1)}_{kk'}h^{(2)}_{k'k''}$
and that $h^{(1)}_{kk'}$ is supported in $|k-k'|\lesssim L$; then, we have
$\Big\|\Big(1+\frac{|k-k''|}{L}\Big)^{\kappa}h_{kk''}\Big\|_{\ell^2_{kk''}}\lesssim\|h^{(1)}\|_{k\to k'}\cdot\Big\|\Big(1+\frac{|k'-k''|}{L}\Big)^{\kappa}h^{(2)}_{k'k''}\Big\|_{\ell^2_{k'k''}}.$

For the Proof of Proposition 2.9, see Ref. 17, Proposition 2.5 or Ref. 18, Lemma 4.3 (there are different versions of this bound, but the proofs are the same).

Start with systems (2.3) and (2.4). Let yN = vNvN/2, and then, yN satisfies the integral equation,

$(y_N)_k(t)=(F_N)_k-i\sum_{\max(N_1,N_2,N_3)=N}\int_0^t\Pi_N\mathcal M^{\circ}(y_{N_1},y_{N_2},y_{N_3})_k(s)\,ds+i\int_0^t\frac{(y_N)_k(s)}{\langle k\rangle^2}\,ds-i\sum_{\max(N_1,N_2,N_3)\le N/2}\int_0^t\Delta_N\mathcal M^{\circ}(y_{N_1},y_{N_2},y_{N_3})_k(s)\,ds-i\int_0^t\bigg[(v_N)_k(s)\sum_{\ell\neq k,\,\langle\ell\rangle\le N}V_{k-\ell}\Big(|(v_N)_\ell(s)|^2-\frac{1}{\langle\ell\rangle^2}\Big)-(v_{N/2})_k(s)\sum_{\ell\neq k,\,\langle\ell\rangle\le N/2}V_{k-\ell}\Big(|(v_{N/2})_\ell(s)|^2-\frac{1}{\langle\ell\rangle^2}\Big)\bigg]\,ds.$
(3.1)

#### 1. The term ψN,L

For any LN/2, consider the linear equation for Ψ = Ψk(t),

$\partial_t\Psi_k(t)=-i\Delta_N\mathcal M^{<}(v_L,v_L,\Psi)_k(t),$
(3.2)

where we define, with δ ≪ 1,

$\mathcal M^{<}(u,v,w)_k(t)\coloneqq\sum_{\substack{k_1-k_2+k_3=k\\k_2\notin\{k_1,k_3\}}}e^{it\Omega}\cdot\eta\Big(\frac{k_1-k_2}{N^{1-\delta}}\Big)V_{k_1-k_2}\cdot u_{k_1}(t)\overline{v_{k_2}(t)}\,w_{k_3}(t),$
(3.3)

and define also $\mathcal M^{>}\coloneqq\mathcal M^{\circ}-\mathcal M^{<}$. If (3.2) has initial data $\Psi_k(0)=\Delta_N\phi_k$, then the solution may be expressed as

$\Psi_k(t)=\sum_{k'}H^{N,L}_{kk'}(t)\phi_{k'},$
(3.4)

where $H^{N,L}=H^{N,L}_{kk'}$ is the kernel of a linear operator (or a matrix). Define also

$(\psi_{N,L})_k(t)=\sum_{k'}H^{N,L}_{kk'}(t)(F_N)_{k'},$
(3.5)

and similarly,

$h^{N,L}\coloneqq H^{N,L}-H^{N,L/2},\quad \zeta_{N,L}\coloneqq\psi_{N,L}-\psi_{N,L/2}.$
(3.6)

Note that when $L=1$, we will replace $L/2$ by $0$, so, for example, $(\psi_{N,0})_k(t)=(F_N)_k$. For simplicity, denote

$H^{N}\coloneqq H^{N,N/2}\quad\text{and}\quad\psi_N\coloneqq\psi_{N,N/2}.$
(3.7)

Note that each $h^{N,L}$ and $H^{N,L}$ is a Borel function of $(g_k(\omega))_{\langle k\rangle\le N/2}$ and is thus independent of the Gaussians in $F_N$.
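Unwinding (3.5)–(3.7) also gives a telescoping identity that we record here for later use (a direct consequence of the definitions): since $\zeta_{N,L}=\psi_{N,L}-\psi_{N,L/2}$ and $(\psi_{N,0})_k=(F_N)_k$, summing over dyadic scales yields

$\psi_{N,L}=F_N+\sum_{1\le L'\le L}\zeta_{N,L'},\quad\text{in particular,}\quad \psi_N=F_N+\sum_{L\le N/2}\zeta_{N,L},$

where the sums run over dyadic $L'$ and $L$, respectively.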

#### 2. The terms ξN and ρN

Next, similar to (3.2), we consider the linear equation,

$\partial_t\Xi_k(t)=-i\Delta_N\big[\mathcal M^{<}(v_{N/2},v_{N/2},\Xi)+\mathcal M^{\ll}(v_N,v_N,\Xi)-\mathcal M^{\ll}(v_{N/2},v_{N/2},\Xi)\big]_k(t),$
(3.8)

where $\mathcal M^{\ll}$ is defined by

$\mathcal M^{\ll}(u,v,w)_k(s)\coloneqq\sum_{\substack{k_1-k_2+k_3=k\\k_2\notin\{k_1,k_3\}}}e^{is\Omega}\cdot\eta\Big(\frac{k_1-k_2}{N^{\varepsilon}}\Big)V_{k_1-k_2}\cdot u_{k_1}(s)\overline{v_{k_2}(s)}\,w_{k_3}(s).$
(3.9)

If the initial data are $\Xi_k(0)=\Delta_N\phi_k$, then we may write the solution as

$\Xi_k(t)=\sum_{k'}M^{N}_{kk'}(t)\phi_{k'},$
(3.10)

which defines the matrix $M^{N}=M^{N}_{kk'}$. We then define $\xi_N$ and $\rho_N$ by

$(\xi_N)_k(t)\coloneqq\sum_{k'}M^{N}_{kk'}(t)(F_N)_{k'},\quad \rho_N\coloneqq\xi_N-\psi_N.$
(3.11)

#### 3. The ansatz

Now, we introduce the ansatz

$(yN)k(t) = (ξN)k(t) + (zN)k(t),$
(3.12)

where zN is a remainder term. We can calculate that zN solves the following equation (recall yN = vNvN/2):

$(z_N)_k(t)=-i\sum_{\max(N_1,N_2,N_3)=N}\int_0^t\Pi_N\mathcal M^{>}(y_{N_1},y_{N_2},y_{N_3})_k(s)\,ds+i\int_0^t\frac{(y_N)_k(s)}{\langle k\rangle^2}\,ds-i\sum_{\max(N_1,N_2,N_3)\le N/2}\int_0^t\Delta_N\mathcal M^{\circ}(y_{N_1},y_{N_2},y_{N_3})_k(s)\,ds-i\sum_{\max(N_1,N_2)=N;\,N_3\le N}\int_0^t\Pi_N(\mathcal M^{<}-\mathcal M^{\ll})(y_{N_1},y_{N_2},y_{N_3})_k(s)\,ds-i\int_0^t\Pi_{N/2}\mathcal M^{<}(v_{N/2},v_{N/2},y_N)_k(s)\,ds-i\int_0^t\Delta_N\mathcal M^{<}(v_{N/2},v_{N/2},z_N)_k(s)\,ds-i\sum_{\max(N_1,N_2)=N}\int_0^t\Pi_{N/2}\mathcal M^{\ll}(y_{N_1},y_{N_2},y_N)_k(s)\,ds-i\sum_{\max(N_1,N_2)=N}\int_0^t\Delta_N\mathcal M^{\ll}(y_{N_1},y_{N_2},z_N)_k(s)\,ds-i\int_0^t\bigg[(v_N)_k(s)\sum_{\ell\neq k,\,\langle\ell\rangle\le N}V_{k-\ell}\Big(|(v_N)_\ell(s)|^2-\frac{1}{\langle\ell\rangle^2}\Big)-(v_{N/2})_k(s)\sum_{\ell\neq k,\,\langle\ell\rangle\le N/2}V_{k-\ell}\Big(|(v_{N/2})_\ell(s)|^2-\frac{1}{\langle\ell\rangle^2}\Big)\bigg]\,ds.$
(3.13)

The following properties of $H$ and $M$ will play a fundamental role; this idea goes back to Bourgain.7 Recall that for $L\le N/2$, the matrix $H^{N,L}$ is defined by (3.2) and (3.4). Note that if $\Psi$ solves (3.2), then $\Psi_k(t)$ is supported in $N/2<\langle k\rangle\le N$; recalling that $V_k=V_{-k}=\overline{V_k}$, we have

$\partial_t\sum_k|\Psi_k(t)|^2=2\,\mathrm{Im}\sum_k\overline{\Psi_k(t)}\cdot\sum_{\substack{k_1-k_2+k_3=k\\k_2\notin\{k_1,k_3\}}}e^{it(|k_1|^2-|k_2|^2+|k_3|^2-|k|^2)}\times\eta\Big(\frac{k_1-k_2}{N^{1-\delta}}\Big)V_{k_1-k_2}\cdot(v_L)_{k_1}(t)\overline{(v_L)_{k_2}(t)}\,\Psi_{k_3}(t).$
(3.14)

The sum on the right-hand side may be replaced by two terms, namely, $S_1$, where we only require $k_1\neq k_2$ in the summation, and $S_2$, where we require $k_1\neq k_2$ and $k_2=k_3$ in the summation. For $S_1$, by swapping $(k,k_1,k_2,k_3)\mapsto(k_3,k_2,k_1,k)$, we also see that $S_1\in\mathbb R$ and, hence, $\mathrm{Im}(S_1)=0$; moreover,

$S_2=\sum_{k\neq k_2}\eta\Big(\frac{k-k_2}{N^{1-\delta}}\Big)V_{k-k_2}\,\overline{\Psi_k(t)}(v_L)_k(t)\cdot\Psi_{k_2}(t)\overline{(v_L)_{k_2}(t)},$

which is also real-valued, as seen by swapping $(k,k_2)\mapsto(k_2,k)$. This means that $\sum_k|\Psi_k(t)|^2$ is conserved in time. Therefore, for each fixed $t$, the matrix $H^{N,L}=H^{N,L}_{kk'}$ is unitary; hence, we get the identity

$\sum_{k'}H^{N,L}_{k_1k'}\cdot\overline{H^{N,L}_{k_2k'}}=\delta_{k_1k_2},$
(3.15)

with $\delta_{k_1k_2}$ being the Kronecker delta. This, in particular, holds for $L=N/2$. In the same way, the matrix $M^{N}$ defined by (3.8) and (3.10) also satisfies (3.15).
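For completeness, let us spell out the symmetry behind $\mathrm{Im}(S_1)=0$ (a one-line computation using that $V$ is real and even and that $\eta$ is even): conjugating the summand of $S_1$ and applying the relabeling $(k,k_1,k_2,k_3)\mapsto(k_3,k_2,k_1,k)$, which preserves the constraint $k_1-k_2+k_3=k$ and the condition $k_1\neq k_2$, we find

$\overline{S_1}=\sum_{k}\Psi_k(t)\sum_{\substack{k_1-k_2+k_3=k\\k_1\neq k_2}}e^{-it\Omega}\,\eta\Big(\frac{k_1-k_2}{N^{1-\delta}}\Big)V_{k_1-k_2}\,\overline{(v_L)_{k_1}(t)}(v_L)_{k_2}(t)\overline{\Psi_{k_3}(t)}=S_1,$

because the relabeling sends $\Omega\mapsto-\Omega$ (canceling the conjugated phase) and $V_{k_1-k_2}\mapsto V_{k_2-k_1}=V_{k_1-k_2}$.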

We now state the main a priori estimate and prove that this implies Theorem 1.3.

Proposition 3.1.

Let $0<\tau\ll1$ and $J=[-\tau,\tau]$. Recall the parameters defined in Sec. II B. For any $M$, consider the following statements, which we call $\mathrm{Local}(M)$:

1. For the operators $h^{N,L}$, where $L<M$ and $N>L$ is arbitrary, we have

$\|h^{N,L}\|_{Y^{1-b}(J)}+\sup_{t\in J}\|h^{N,L}(t)\|_{\ell^2\to\ell^2}\le L^{-1/2+3\varepsilon_1},\quad \|h^{N,L}\|_{Z^{b}(J)}\le N^{1+\delta}L^{-1/2+2\varepsilon_1},$
(3.16)
as well as
$\Big\|\Big(1+\frac{|k-k'|}{\min(L,N^{1-\delta})}\Big)^{\kappa}h^{N,L}_{kk'}\Big\|_{Z^{b}(J)}\le N^{3/2}.$
(3.17)
2. For the terms $\rho_N$ and $z_N$, where $N\le M$, we have

$\|\rho_N\|_{X^{b}(J)}\le N^{-1/2+\varepsilon_1+\varepsilon_2},\quad \|z_N\|_{X^{b}(J)}\le N^{-1/2+\varepsilon_1}.$
(3.18)
3. For any $L_1,L_2<M$, the operator defined by

$(\mathcal Lz)_k(t)=-i\int_0^t\Delta_N\mathcal M^{<}(y_{L_1},y_{L_2},z)_k(t')\,dt'$
(3.19)
has an extension, which we still denote by $\mathcal L$ for simplicity. The kernel $\mathcal L_{kk'}(t,t')$ has Fourier transform $\widehat{\mathcal L}_{kk'}(\lambda,\lambda')$, which satisfies
$\int_{\mathbb R^2}\langle\lambda\rangle^{2(1-b)}\langle\lambda'\rangle^{-2b}\|\widehat{\mathcal L}\|_{k\to k'}^{2}\,d\lambda\,d\lambda'\le L^{-1+6\varepsilon_1-2\varepsilon_2}$
(3.20)
and
$\int_{\mathbb R^2}\langle\lambda\rangle^{2b}\langle\lambda'\rangle^{-2(1-b)}\|\widehat{\mathcal L}\|_{kk'}^{2}\,d\lambda\,d\lambda'\le N^{2+2\delta}L^{-1+4\varepsilon_1-2\varepsilon_2},$
(3.21)
where L = max(L1, L2).
Now, with the above definition, we have that
$\mathbb P\big(\mathrm{Local}(M/2)\wedge\neg\mathrm{Local}(M)\big)\le C_{\theta}e^{-(\tau^{-1}M)^{\theta}}.$

Proof of Theorem 1.3.
By Proposition 3.1, in particular, we know that $\tau^{-1}$-certainly, the event $\mathrm{Local}(M)$ happens for every $M$. By (3.4), (3.11), and (3.12), we have
$y_N=F_N+\sum_{L\le N/2}\zeta_{N,L}+\rho_N+z_N.$
Exploiting the independence between $h^{N,L}$ and $F_N$ and using Proposition 2.8 combined with (3.16), we can show that $\|\zeta_{N,L}\|_{X^{b}(J)}\lesssim N^{\delta}L^{-1/3}$. Summing over $L$ and noting that $\zeta_{N,L}$ is supported in $N/2<\langle k\rangle\le N$, we see that
$\Big\|\sum_{L\le N/2}\zeta_{N,L}\Big\|_{C^{0}_{t}H^{-\gamma}_{x}(J)}\lesssim N^{-\gamma/2}$
for any $\gamma>0$. Using also (3.18), we see that the sequence $\{v_N-f_N\}$ converges in $C^{0}_{t}H^{0-}_{x}(J)$. Hence, $\{v_N\}$ converges in $C^{0}_{t}H^{-1/2-}_{x}(J)$, and so does the original sequence $\{u_N\}$.

Therefore, the solution $u_N$ to (1.7) converges to a unique limit as $N\to\infty$ outside an exceptional set, with probability $\ge1-C_{\theta}e^{-\tau^{-\theta}}$. This proves the almost-sure local well-posedness of (1.1) with Gibbs measure initial data. Since the truncated Gibbs measure $d\eta_N$ defined by (1.13) is invariant under (1.7) and the truncated Gibbs measures converge strongly to the Gibbs measure $d\nu$ as in Proposition 1.2, we can apply the standard local-to-global argument of Bourgain, where the a priori estimates in Proposition 3.1 allow us to prove the suitable stability bounds needed in the process, in exactly the same way as in Ref. 17. The almost-sure global existence and invariance of the Gibbs measure then follow.□

From now on, we will focus on the Proof of Proposition 3.1 and assume that the bounds involved in $\mathrm{Local}(M/2)$ already hold. The goal is to recover (3.16)–(3.18) and (3.20)–(3.21) for $M$. Before proceeding, we remark on a few simplifications that we will make in the proof below. These are either standard or the same as in Refs. 17 and 18, and we will not detail these arguments in the proof below.

1. In proving these bounds, we will use the standard continuity argument, which requires a smallness factor; note that the localized norm $X^{c}([0,T])$ is continuous in $T$ if the function is smooth, which enables the continuity argument. Here, this factor is provided by the short time $\tau\ll1$. In particular, we can gain a positive power $\tau^{\theta}$ by using Proposition 2.3 (see Ref. 38) at the price of slightly changing the $c$ exponent in the $X^{c}$ (or $Y^{c}$ or $Z^{c}$) norm (e.g., from $1-b$ to $b$). It can be checked in the proof below that all the estimates allow for some room in $c$, so this is always possible.

2. In each proof below, we can actually gain an extra power $M^{\delta/10}$ compared to the desired estimate, so any loss that is $M^{C\kappa^{-1}}$ will be acceptable. In fact, in the proof below, we will frequently encounter losses of at most $M^{C\kappa^{-1}}$ due to manipulations of the $c$ exponent in various norms as in (1) and due to the application of probabilistic bounds such as Proposition 2.8, where we lose a small $\theta$ power.

3. In the course of the proof, we will occasionally need to bound quantities of the form $\sup_{\lambda}G(\lambda)$, where $\lambda$ ranges over an interval $I$, and for each fixed $\lambda$, the quantity $|G(\lambda)|$ can be bounded apart from a small exceptional set; moreover, here, $G$ will be differentiable, and $G'(\lambda)$ will satisfy a weaker but unconditional bound. Then, we can apply the meshing argument of Refs. 17 and 18: we divide the interval into a large number of subintervals, approximate $G$ on each small interval by a sample (or an average), control the error term using $G'$, and add up the exceptional sets corresponding to the samples. In typical cases, where $M$-certainly $|G(\lambda)|\le M^{\theta}$ for each fixed $\lambda$, $|I|\le M^{C}$, and $|G'(\lambda)|\le M^{C}$ unconditionally, we deduce that $M$-certainly, $\sup_{\lambda}|G(\lambda)|\le M^{\theta}$, because the number of subintervals is $O(M^{C})$, so the total probability of the union of the exceptional sets is still sufficiently small.
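Schematically (with our illustrative choice of constants), the meshing argument in (3) amounts to a union bound: splitting $I$ into $O(M^{2C})$ subintervals of length $M^{-C}$, the unconditional bound $|G'|\le M^{C}$ gives $|G(\lambda)-G(\lambda_0)|\le1$ on each subinterval with sample point $\lambda_0$, so

$\mathbb P\Big(\sup_{\lambda\in I}|G(\lambda)|>2M^{\theta}\Big)\le\sum_{\lambda_0}\mathbb P\big(|G(\lambda_0)|>M^{\theta}\big)\lesssim M^{2C}\cdot e^{-M^{\theta'}},$

where $\theta'$ denotes the exponent implicit in the definition of $M$-certainly, and the right-hand side remains sufficiently small.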

We start by proving (3.20) and (3.21) for $L=M/2$. We need to construct an extension of the operator $\mathcal L$ defined in (3.19). This is done by first using Lemma 2.4 to find extensions of each component of $y_{L_1}$ and $y_{L_2}$ [note that $\max(L_1,L_2)=M/2$] such that these extensions satisfy (3.16)–(3.18) with the localized norms $X^{b}(J)$, etc., replaced by the global norms $X^{b}$, etc., at the expense of slightly worse exponents. The change in the exponents will play no role in the proof below, so we will omit it. Then, by attaching to $\mathcal L$ a factor $\chi(\tau^{-1}t)$ and using Lemma 2.3 (see Sec. III D), we can gain a smallness factor $\tau^{\theta}$ at the price of further worsening the exponents. These operations are standard, so we will not repeat them below.

Note that the extension defined in Lemma 2.4 preserves the independence between the matrices $h^{L_j,R_j}$ and $F_{L_j}$ for $R_j\le L_j/2$.

Recall that $\widehat{\mathcal L}_{kk'}(\lambda,\lambda')$ is the Fourier transform of the kernel $\mathcal L_{kk'}(t,t')$ of $\mathcal L$, and we have

$\widehat{(\mathcal Lz)}_k(\lambda)=\sum_{k'}\int_{\mathbb R}\widehat{\mathcal L}_{kk'}(\lambda,\lambda')\,\hat z_{k'}(\lambda')\,d\lambda'.$
(4.1)

Now, we consider the different cases.

1. Suppose in (3.19) we replace $y_{L_j}$ by $\rho_{L_j}+z_{L_j}$ for $j\in\{1,2\}$; then, in particular, we may assume that $\|y_{L_j}\|_{X^{b}}\lesssim L_j^{-1/2+\varepsilon_1+\varepsilon_2}$ due to (3.18). By (3.19) and (4.1), we have

$\widehat{\mathcal L}_{kk'}(\lambda,\lambda')=\sum_{k_1-k_2=k-k'}\int_{\mathbb R^2}I(\lambda,\Omega+\lambda_1-\lambda_2+\lambda')\cdot V_{k_1-k_2}\,\widehat{(y_{L_1})}_{k_1}(\lambda_1)\cdot\overline{\widehat{(y_{L_2})}_{k_2}(\lambda_2)}\,d\lambda_1\,d\lambda_2,$
(4.2)

where $\Omega=|k|^2-|k_1|^2+|k_2|^2-|k'|^2$ and $I=I(\lambda,\mu)$ is as in (2.10); we will omit the factor $\eta((k_1-k_2)/N^{1-\delta})$ in the definition of $\mathcal M^{<}$ in (3.3), as it does not play a role. We may also assume that $|k_1-k_2|\sim R\lesssim L$. In the above expression, let $\mu\coloneqq\lambda-(\Omega+\lambda_1-\lambda_2+\lambda')$; in particular, we have $|I|\lesssim\langle\lambda\rangle^{-1}\langle\mu\rangle^{-1}$ by (2.10). By a routine argument, in proving (3.20), we may assume that $|\lambda_j|\le L^{100}$ and $|\mu|\lesssim L^{100}$; in fact, if, say, $|\lambda_1|$ is the maximum of these values and $|\lambda_1|\ge L^{100}$ (the other cases being similar), then we may fix the values of $k_j$ and, hence, $k-k'$, at a loss of at most $L^{12}$, and reduce to estimating

$|\widehat{\mathcal L}_{kk'}(\lambda,\lambda')|\lesssim\int_{\mathbb R^2}\frac{1}{\langle\lambda\rangle\langle\lambda-\lambda_1+\lambda_2-\lambda'-\Omega\rangle}\,|\hat w_1(\lambda_1)|\,|\hat w_2(\lambda_2)|\,d\lambda_1\,d\lambda_2,$

with $|\lambda_1|\sim K\ge L^{100}$ and $\|\langle\lambda_j\rangle^{b}\hat w_j\|_{L^2}\lesssim1$ for each $j$. By estimating $w_1$ in the unweighted $L^2$ norm, we can gain a power $K^{-1/2}$, and using the $L^{1}_{\lambda_2}$ integrability of $\hat w_2$, which follows from the weighted $L^2$ norm, we can fix the value of $\lambda_2$. In the end, this leads to

$\sup_{k,k'}|\widehat{\mathcal L}_{kk'}(\lambda,\lambda')|\lesssim\langle\lambda\rangle^{-1}K^{-1/2},$

and hence,

$\Big\|\langle\lambda\rangle^{1-b}\langle\lambda'\rangle^{-b}\sup_{k,k'}|\widehat{\mathcal L}_{kk'}(\lambda,\lambda')|\Big\|_{L^{2}_{\lambda,\lambda'}}\lesssim K^{-1/3}\lesssim L^{-30},$

which is more than enough, because $\|\widehat{\mathcal L}\|_{k\to k'}=\sup_{k,k'}|\widehat{\mathcal L}_{kk'}|$ when $\mathcal L$ is supported on the set where $k-k'$ is constant.

Now, we may assume $|\lambda_j|\le L^{100}$ for $j\in\{1,2\}$ and $|\mu|\le L^{100}$; we may also assume $|\lambda|+|\lambda'|\le L^{\kappa^3}$ as, otherwise, we gain from the weights $\langle\lambda\rangle^{2(1-b)}$ and $\langle\lambda'\rangle^{-2b}$ in (3.20). Similarly, in proving (3.21), we may assume $|\lambda_j|\le N^{100}$ for $j\in\{1,2\}$, $|\mu|\le N^{100}$, and $|\lambda|+|\lambda'|\le N^{100}$ (otherwise, we may also fix $(k,k')$ and argue as above). Therefore, in proving (3.21), we may replace the unfavorable exponents $\langle\lambda\rangle^{2b}\langle\lambda'\rangle^{-2(1-b)}$ by the favorable ones $\langle\lambda\rangle^{2(1-b)}\langle\lambda'\rangle^{-2b}$ at a price of $N^{C\kappa^{-1}}$; this is acceptable, since in the proof, we will be able to gain a power $N^{\delta/2}$. We remark that in the proof below (though not here), we may use the $Y^{1-b}$ norm as in (3.16) for the matrices in the decomposition of $y_{L_j}$; using the bounds on $\lambda_j$ as above, we may replace the exponent $1-b$ by $b$ (which then implies $L^{1}_{\lambda_j}$ integrability), again at a loss of either $L^{C\kappa^{-1}}$ or $N^{C\kappa^{-1}}$, depending on whether we are proving (3.20) or (3.21), which is acceptable. See also Sec. III D.

This then allows us to fix the values of $\lambda_j$ in (4.2) using the $L^{1}_{\lambda_j}$ integrability coming from the weighted norms; moreover, by using the bound $|I|\lesssim\langle\lambda\rangle^{-1}\langle\mu\rangle^{-1}$, the upper bounds for $\lambda$ and $\mu$ mentioned above, and the weights in (3.20)–(3.21), we may also fix the values of $\lambda$, $\lambda'$, and $\lfloor\mu\rfloor$ and reduce to estimating the following quantity:

$Q_{kk'}=\sum_{k_1-k_2=k-k'}h^{b}_{kk_1k_2k'}\,(w_1)_{k_1}\overline{(w_2)_{k_2}},$
(4.3)

where the tensor (which we call the base tensor)

$h^{b}=h^{b}_{kk_1k_2k'}=V_{k_1-k_2}\cdot\mathbf 1_{k_1-k_2+k'=k}\cdot\mathbf 1_{|k|^2-|k_1|^2+|k_2|^2-|k'|^2=\Omega_0}\cdot\mathbf 1_{k_1,k'\neq k_2},$

with some value $\Omega_0$ determined by $\lambda_j$, $\lambda$, $\lambda'$, and $\lfloor\mu\rfloor$. Here, we also assume $|k_j|\lesssim L_j$ and $|k_1-k_2|\sim R\lesssim L$ and $\|w_j\|_{\ell^2}\lesssim L_j^{-1/2+\varepsilon_1+\varepsilon_2}$.

Now, (4.3) is easily estimated using Proposition 2.7:

$\|Q\|_{k\to k'}\lesssim\|h^{b}\|_{kk_2\to k_1k'}\cdot\|w_1\|_{\ell^2_{k_1}}\cdot\|w_2\|_{\ell^2_{k_2}}\lesssim R\cdot R^{-\beta}\cdot L_1^{-1/2+\varepsilon_1+\varepsilon_2}L_2^{-1/2+\varepsilon_1+\varepsilon_2}\lesssim L^{-1/2+2\varepsilon_1-\varepsilon_2},$

which is enough for (3.20) [namely, we multiply this by the factor $\langle\lambda\rangle^{-1}$ coming from $I$ and the weight $\langle\lambda\rangle^{1-b}\langle\lambda'\rangle^{-b}$ in (3.20) and then take the $L^2$ norm in $\lambda$ and $\lambda'$ to get (3.20); the same happens below]. In particular, the norm $\|h^{b}\|_{kk_2\to k_1k'}$ is bounded as

$\|h^{b}\|_{kk_2\to k_1k'}\lesssim R^{-\beta}\cdot\max_{k_1,k'}|S^{R}_{k_1k'}|^{1/2}\cdot\max_{k,k_2}|S^{R}_{kk_2}|^{1/2}\lesssim R^{-\beta}\cdot R^{1+\theta}$

by Schur’s bound, where $S^{R}_{k_1k'}$ and $S^{R}_{kk_2}$ are defined similarly to those in Lemma 2.5.
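Here and below, the version of Schur's bound we invoke is the elementary estimate (our phrasing): if a tensor $h$ is supported on a set $S$ and $|h|\le D$ pointwise, then

$\|h\|_{k_X\to k_Y}\le D\cdot\Big(\max_{k_Y}\#\{k_X:(k_X,k_Y)\in S\}\Big)^{1/2}\Big(\max_{k_X}\#\{k_Y:(k_X,k_Y)\in S\}\Big)^{1/2}.$

In the display above, it is applied with $D\lesssim R^{-\beta}$, coming from $|V_{k_1-k_2}|\lesssim\langle k_1-k_2\rangle^{-\beta}$ on the support $|k_1-k_2|\sim R$, together with the fiber counts of Lemma 2.5.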

For the $\|Q\|_{kk'}$ norm, we have

$\|Q\|_{kk'}\lesssim\|h^{b}\|_{k_1\to kk_2k'}\cdot\|w_1\|_{\ell^2_{k_1}}\cdot\|w_2\|_{\ell^2_{k_2}}\lesssim R^{-\beta}\cdot NR\cdot L_1^{-1/2+\varepsilon_1+\varepsilon_2}L_2^{-1/2+\varepsilon_1+\varepsilon_2}\lesssim NL^{-1/2+2\varepsilon_1-\varepsilon_2},$

which is enough for (3.21). Note that all the bounds for $h^{b}$ we use here follow from Schur's bound and Lemma 2.5.

2. Suppose that $y_{L_1}$ is replaced by $\rho_{L_1}+z_{L_1}$ and $y_{L_2}$ is replaced by $\psi_{L_2}$. We may further decompose $\psi_{L_2}$ into $\zeta_{L_2,R_2}$ for $R_2\le L_2/2$ (including the case $R_2=0$, by which we mean $\zeta_{L_2,0}=F_{L_2}$) and perform the same arguments as above, fixing the $\lambda$ variables, to reduce (this reduction step actually involves a meshing argument, as the estimate for $Q$ is probabilistic; see Sec. III D) to estimating the following quantity:

$Q_{kk'}=\sum_{k_1-k_2=k-k'}h^{b}_{kk_1k_2k'}\,(w_1)_{k_1}\sum_{k_2'}\overline{h^{(2)}_{k_2k_2'}}\cdot\overline{(F_{L_2})_{k_2'}},$
(4.4)

where $\|w_1\|_{\ell^2}\lesssim L_1^{-1/2+\varepsilon_1+\varepsilon_2}$, and $h^{(2)}$ is independent of $F_{L_2}$ and is either the identity matrix or satisfies $\|h^{(2)}\|_{k_2\to k_2'}\lesssim R_2^{-1/2+3\varepsilon_1}$ and $\|h^{(2)}\|_{k_2k_2'}\lesssim L_2^{1+\delta}R_2^{-1/2+2\varepsilon_1}$. We then estimate (4.4) by

$\|Q\|_{k\to k'}\lesssim L_2^{-1}\big(\|h^{b}\|_{kk_1k_2\to k'}+\|h^{b}\|_{kk_1\to k_2k'}\big)\|w_1\|_{\ell^2_{k_1}}\|h^{(2)}\|_{k_2\to k_2'}\lesssim R^{-\beta}\cdot R\min(L_1,L_2)\cdot L_2^{-1}L_1^{-1/2+\varepsilon_1+\varepsilon_2}\lesssim L^{-1/2+2\varepsilon_1-\varepsilon_2},$
(4.5)

using Propositions 2.6 and 2.8, which is enough for (3.20). Note that here, $h^{b}$ depends on $k$ and $k'$ only via $k-k'$ and $|k|^2-|k'|^2$ and that $||k|^2-|k'|^2|\le L^{\kappa^3}$ given the assumptions, so Proposition 2.8 is applicable. Similarly, for the $\ell^2_{kk'}$ norm, we have

$\|Q\|_{kk'}\lesssim L_2^{-1}\big(\|h^{b}\|_{kk'\to k_1k_2}+\|h^{b}\|_{kk_1k'\to k_2}\big)\|w_1\|_{\ell^2_{k_1}}\|h^{(2)}\|_{k_2\to k_2'}\lesssim R^{-\beta}\cdot N\big(\min(L_1,L_2)+\min(L_1,R)\big)\cdot L_2^{-1}L_1^{-1/2+\varepsilon_1+\varepsilon_2}\lesssim NL^{-1/2+2\varepsilon_1-\varepsilon_2},$
(4.6)

which is enough for (3.21).

3. Suppose that $y_{L_j}$ is replaced by $\psi_{L_j}$ for $j\in\{1,2\}$. In this case, we will start from (3.19) and expand

$(\psi_{L_j})_{k_j}=\sum_{k_j'}(H^{L_j})_{k_jk_j'}(F_{L_j})_{k_j'}$

for $j\in\{1,2\}$. There are then two cases, namely, $k_1'=k_2'$ and $k_1'\neq k_2'$.

If $k_1'\neq k_2'$, then we can repeat the above argument [including further decomposing $\psi_{L_j}$ into $\zeta_{L_j,R_j}$ using (3.6) and (3.7)], fix the time Fourier variables, and reduce to estimating the quantity

$Q_{kk'}=\sum_{k_1-k_2=k-k'}h^{b}_{kk_1k_2k'}\sum_{k_1',k_2'}h^{(1)}_{k_1k_1'}(F_{L_1})_{k_1'}\cdot\overline{h^{(2)}_{k_2k_2'}}\cdot\overline{(F_{L_2})_{k_2'}},$
(4.7)

where $h^{(j)}$ is independent of $F_{L_j}$ and is either the identity matrix or satisfies $\|h^{(j)}\|_{k_j\to k_j'}\lesssim R_j^{-1/2+3\varepsilon_1}$ and $\|h^{(j)}\|_{k_jk_j'}\lesssim L_j^{1+\delta}R_j^{-1/2+2\varepsilon_1}$. Since $k_1'\neq k_2'$, we can apply Proposition 2.8 either in $(k_1',k_2')$ jointly (if $L_1=L_2$) or first in $k_1'$ and then in $k_2'$ (if, say, $L_1\ge2L_2$) and get that

$\|Q\|_{k\to k'}\lesssim(L_1L_2)^{-1}\max\big(\|h^{b}\|_{k\to k_1k_2k'},\|h^{b}\|_{kk_1\to k_2k'},\|h^{b}\|_{kk_2\to k_1k'},\|h^{b}\|_{kk_1k_2\to k'}\big)\times\|h^{(1)}\|_{k_1\to k_1'}\|h^{(2)}\|_{k_2\to k_2'}\lesssim R^{-\beta}(L_1L_2)^{-1}\cdot R\min(L_1,L_2)\lesssim L^{-2/3},$
(4.8)

which is enough for (3.20). As for the $\ell^2_{kk'}$ norm, we have

$\|Q\|_{kk'}\lesssim(L_1L_2)^{-1}\|h^{b}\|_{kk_1k_2k'}\cdot\|h^{(1)}\|_{k_1\to k_1'}\|h^{(2)}\|_{k_2\to k_2'}\lesssim(L_1L_2)^{-1}R^{-\beta}\cdot\min\big(\min(L_1,L_2)^{3/2},NR\big),$

which is enough for (3.21).

Finally, assume that $k_1'=k_2'$; then, $L_1=L_2=L$. In (3.19), the summation over $k_1'=k_2'$ gives

$\sum_{k_1'}\frac{1}{\langle k_1'\rangle^2}(H^{L})_{k_1k_1'}(t')\overline{(H^{L})_{k_2k_1'}(t')}.$

Using the cancellation (3.15), since $k_1\neq k_2$, we can replace the factor $1/\langle k_1'\rangle^2$ in the above expression by $1/\langle k_1'\rangle^2-1/\langle k_1\rangle^2$; then, by further decomposing $H^{L_j}$ into $h^{L_j,R_j}$ by (3.7) and repeating the above arguments, we can reduce to estimating the following quantity:

$Q_{kk'}=\sum_{k_1-k_2=k-k'}h^{b}_{kk_1k_2k'}\cdot(\tilde h)_{k_1k_2},\quad (\tilde h)_{k_1k_2}=\sum_{k_1'}\Big(\frac{1}{\langle k_1'\rangle^2}-\frac{1}{\langle k_1\rangle^2}\Big)h^{(1)}_{k_1k_1'}\overline{h^{(2)}_{k_2k_1'}},$
(4.9)

where $h^{(j)}$ is either the identity matrix or satisfies $\|h^{(j)}\|_{k_j\to k_j'}\lesssim R_j^{-1/2+3\varepsilon_1}$ and $\|h^{(j)}\|_{k_jk_j'}\lesssim L_j^{1+\delta}R_j^{-1/2+2\varepsilon_1}$. Note that we may assume $|k_j-k_j'|\lesssim R_jL^{\delta}$ using the bound (3.17), so, in particular, we have

$\Big|\frac{1}{\langle k_1'\rangle^2}-\frac{1}{\langle k_1\rangle^2}\Big|\lesssim\frac{R+\min(R_1,R_2)}{L^3}$

up to a loss of $L$ (which is acceptable, as in this case, we can gain at least $L^{\varepsilon_2}$). Using these, we estimate, assuming without loss of generality that $R_1\ge R_2$,

$\|Q\|_{k\to k'}\lesssim\|h^{b}\|_{kk_1k_2\to k'}\|\tilde h\|_{k_1k_2}\lesssim\frac{R+R_2}{L^3}\cdot L^{1+\delta}R_1^{-1/2+2\varepsilon_1}R_2^{-1/2+3\varepsilon_1}\cdot R^{-\beta}L\min(R_1,R)\lesssim L^{-1/2+3\varepsilon_1-\varepsilon_2},$
(4.10)
$\|Q\|_{kk'}\lesssim\|h^{b}\|_{k_1k_2\to kk'}\|\tilde h\|_{k_1k_2}\lesssim\frac{R+R_2}{L^3}\cdot L^{1+\delta}R_1^{-1/2+2\varepsilon_1}R_2^{-1/2+3\varepsilon_1}\cdot R^{-\beta}NL\lesssim NL^{-2/3}.$

This completes the proof for (3.20) and (3.21).

We now prove (3.16) and (3.17). Let $\mathcal L^{N,L}$ be the linear operator defined by

$z\mapsto-i\int_0^t\Delta_N\mathcal M^{<}(v_L,v_L,z)(t')\,dt',$
(4.11)

and we also extend its kernel in the same way as we did for $\mathcal L$ in Sec. IV A. Let $\tilde{\mathcal L}^{N,L}=\mathcal L^{N,L}-\mathcal L^{N,L/2}$; then, by the induction hypothesis and the proof in Sec. IV A, we know that $\tilde{\mathcal L}^{N,L}$ also satisfies estimates (3.20) and (3.21). Clearly, (3.20) implies that $\|\tilde{\mathcal L}^{N,L}\|_{X^{b}\to X^{1-b}}\lesssim L^{-1/2+3\varepsilon_1-\varepsilon_2}$; moreover, it is easy to see that

$\|\mathcal L^{N,L}z\|_{X^{1}}\lesssim\|\mathcal M^{<}(v_L,v_L,z)\|_{L^{2}_{t,x}}\lesssim L^{12}\|z\|_{X^{0}},$

and hence, $\|\mathcal L^{N,L}\|_{X^{0}\to X^{1}}\lesssim L^{12}$, and the same holds for $\tilde{\mathcal L}^{N,L}$. By interpolation, we obtain that $\|\tilde{\mathcal L}^{N,L}\|_{X^{\alpha}\to X^{\alpha}}\lesssim L^{-1/2+3\varepsilon_1}$ for $\alpha\in\{b,1-b\}$ (note that we can always gain a positive power of $\tau$ using Lemma 2.3; see Sec. III D). Moreover, consider the kernel $(\mathcal F\tilde{\mathcal L}^{N,L})_{kk'}(\lambda,\lambda')$; then, we also have the following bound:

$\int_{\mathbb R}\langle\lambda\rangle^{2(1-b)}\big\|\langle\lambda'\rangle^{-b}(\mathcal F\tilde{\mathcal L}^{N,L})_{kk'}(\lambda,\lambda')\big\|_{k'\lambda'\to k}^{2}\,d\lambda\lesssim L^{-1+6\varepsilon_1-2\varepsilon_2},$

which follows from (3.20). If we replace the factor $\langle\lambda'\rangle^{-b}$ by $1$, then a simple argument shows that

$\|(\mathcal F\mathcal L^{N,L})_{kk'}(\lambda,\lambda')\|_{k'\lambda'\to k}\lesssim L^{12}\langle\lambda\rangle^{-1}$

(and the same for $\tilde{\mathcal L}^{N,L}$) by using that

$|(\mathcal F\mathcal L^{N,L}z)_k(\lambda)|\lesssim\langle\lambda\rangle^{-1}\int_{\mathbb R}\langle\lambda-\mu\rangle^{-1}|\mathcal F\mathcal M^{<}(v_L,v_L,z)_k(\mu)|\,d\mu\lesssim\langle\lambda\rangle^{-1}\|\mathcal M^{<}(v_L,v_L,z)\|_{L^{2}}$

and then fixing the Fourier modes of $v_L$. Interpolating again, we get that

$\int_{\mathbb R}\langle\lambda\rangle^{2(1-b)}\big\|\langle\lambda'\rangle^{-(1-b)}(\mathcal F\tilde{\mathcal L}^{N,L})_{kk'}(\lambda,\lambda')\big\|_{k'\lambda'\to k}^{2}\,d\lambda\lesssim L^{-1+6\varepsilon_1}.$
(4.12)

A similar interpolation gives

$\int_{\mathbb R}\langle\lambda'\rangle^{-2b}\big\|\langle\lambda\rangle^{b}(\mathcal F\tilde{\mathcal L}^{N,L})_{kk'}(\lambda,\lambda')\big\|_{k'\to k\lambda}^{2}\,d\lambda'\lesssim L^{-1+6\varepsilon_1}.$
(4.13)

Clearly, $\mathcal L^{N,L}$ satisfies (4.12) and (4.13) with right-hand sides replaced by $1$.

Now, let

$\mathcal H^{N,L}=(1-\mathcal L^{N,L})^{-1}=\sum_{n=0}^{\infty}(\mathcal L^{N,L})^{n},$

and then, it is easy to see that $\mathcal H^{N,L}-1$ satisfies the same bounds (4.12) and (4.13) with right-hand sides replaced by $1$; for example, (4.12) follows from iterating the following bound:

$\Big\|\langle\lambda\rangle^{1-b}\big\|\langle\lambda''\rangle^{-(1-b)}(\mathcal A\mathcal B)_{kk''}(\lambda,\lambda'')\big\|_{k''\lambda''\to k}\Big\|_{L^{2}_{\lambda}}\lesssim\Big\|\langle\lambda\rangle^{1-b}\big\|\langle\lambda'\rangle^{-(1-b)}\mathcal A_{kk'}(\lambda,\lambda')\big\|_{k'\lambda'\to k}\Big\|_{L^{2}_{\lambda}}\cdot\|\mathcal B\|_{X^{1-b}\to X^{1-b}},$

provided that

$(\mathcal A\mathcal B)_{kk''}(\lambda,\lambda'')=\sum_{k'}\int_{\mathbb R}\mathcal A_{kk'}(\lambda,\lambda')\mathcal B_{k'k''}(\lambda',\lambda'')\,d\lambda',$
(4.14)

and (4.13) is proved similarly. Define further

$\tilde{\mathcal H}^{N,L}=\mathcal H^{N,L}-\mathcal H^{N,L/2}=\sum_{n=1}^{\infty}(-1)^{n-1}\big(\mathcal H^{N,L}\tilde{\mathcal L}^{N,L}\big)^{n}\mathcal H^{N,L}.$
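This series follows from the resolvent identity (elementary algebra, recorded here for completeness): writing $\mathcal L^{N,L}=\mathcal L^{N,L/2}+\tilde{\mathcal L}^{N,L}$, we have

$1-\mathcal L^{N,L/2}=(1-\mathcal L^{N,L})+\tilde{\mathcal L}^{N,L}=(1-\mathcal L^{N,L})\big(1+\mathcal H^{N,L}\tilde{\mathcal L}^{N,L}\big),$

so, inverting and expanding the Neumann series (convergent, e.g., in $X^{b}\to X^{b}$, since $\|\tilde{\mathcal L}^{N,L}\|_{X^{b}\to X^{b}}\lesssim L^{-1/2+3\varepsilon_1}$ and $\mathcal H^{N,L}$ is bounded),

$\mathcal H^{N,L/2}=\sum_{n=0}^{\infty}(-1)^{n}\big(\mathcal H^{N,L}\tilde{\mathcal L}^{N,L}\big)^{n}\mathcal H^{N,L},$

and subtracting this from its $n=0$ term $\mathcal H^{N,L}$ gives the displayed series.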

By iterating the $X^{\alpha}\to X^{\alpha}$ bounds and using also (3.21) for $\tilde{\mathcal L}^{N,L}$, we can show that

$\int_{\mathbb R^2}\langle\lambda\rangle^{2b}\langle\lambda'\rangle^{-2(1-b)}\|(\mathcal F\tilde{\mathcal H}^{N,L})_{kk'}(\lambda,\lambda')\|_{kk'}^{2}\,d\lambda\,d\lambda'\lesssim N^{2+2\delta}L^{-1+4\varepsilon_1}.$
(4.15)

The weighted bound

$\int_{\mathbb R^2}\langle\lambda\rangle^{2b}\langle\lambda'\rangle^{-2(1-b)}\Big\|\Big(1+\frac{|k-k'|}{\min(L,N^{1-\delta})}\Big)^{\kappa}(\mathcal F\tilde{\mathcal H}^{N,L})_{kk'}(\lambda,\lambda')\Big\|_{kk'}^{2}\,d\lambda\,d\lambda'\lesssim N^{3}$
(4.16)

is shown in the same way but using Proposition 2.9.

In addition, we can also show that

$\int_{\mathbb R^2}\langle\lambda\rangle^{2(1-b)}\langle\lambda'\rangle^{-2b}\|(\mathcal F\tilde{\mathcal H}^{N,L})_{kk'}(\lambda,\lambda')\|_{k\to k'}^{2}\,d\lambda\,d\lambda'\lesssim L^{-1+6\varepsilon_1}.$
(4.17)

This can be proved using (4.12)–(4.13), by iterating the following bounds:

$\Big\|\langle\lambda\rangle^{1-b}\langle\lambda''\rangle^{-b}\|(\mathcal A\mathcal B)_{kk''}(\lambda,\lambda'')\|_{k\to k''}\Big\|_{L^{2}_{\lambda,\lambda''}}\lesssim\Big\|\langle\lambda\rangle^{1-b}\langle\lambda'\rangle^{-b}\|\mathcal A_{kk'}(\lambda,\lambda')\|_{k\to k'}\Big\|_{L^{2}_{\lambda,\lambda'}}\cdot\Big\|\big\|\langle\lambda'\rangle^{b}\langle\lambda''\rangle^{-b}\mathcal B_{k'k''}(\lambda',\lambda'')\big\|_{k''\to k'\lambda'}\Big\|_{L^{2}_{\lambda''}}$
(4.18)

and similarly,

$\Big\|\langle\lambda\rangle^{1-b}\langle\lambda''\rangle^{-b}\|(\mathcal A\mathcal B)_{kk''}(\lambda,\lambda'')\|_{k\to k''}\Big\|_{L^{2}_{\lambda,\lambda''}}\lesssim\Big\|\langle\lambda\rangle^{1-b}\big\|\langle\lambda'\rangle^{-(1-b)}\mathcal A_{kk'}(\lambda,\lambda')\big\|_{k'\lambda'\to k}\Big\|_{L^{2}_{\lambda}}\cdot\Big\|\langle\lambda'\rangle^{1-b}\langle\lambda''\rangle^{-b}\|\mathcal B_{k'k''}(\lambda',\lambda'')\|_{k'\to k''}\Big\|_{L^{2}_{\lambda',\lambda''}}$
(4.19)

assuming (4.14).

Now, we can finally prove (3.16) and (3.17). In fact, by the definitions of $\mathcal H^{N,L}$ and $\tilde{\mathcal H}^{N,L}$, there exists an extension of $h^{N,L}$ such that

$(\widehat{h^{N,L}})_{kk'}(\lambda)=\int_{\mathbb R}(\mathcal F\tilde{\mathcal H}^{N,L})_{kk'}(\lambda,\lambda')\hat\chi(\lambda')\,d\lambda',$

so the $Y^{1-b}$ and $Z^{b}$ bounds in (3.16), as well as (3.17), can be deduced directly from (4.15)–(4.17). The bound on $\sup_{t}\|h^{N,L}(t)\|_{\ell^2\to\ell^2}$ is also easily controlled by $\|\tilde{\mathcal H}^{N,L}\|_{X^{b}\to X^{b}}$ using the embedding $X^{b}\hookrightarrow L^{\infty}_{t}\ell^{2}$. This completes the proof of (3.16) and (3.17).

In this section, we prove the first bound in (3.18) regarding $\rho_N$, assuming $N=M$. Recall that from (3.2), (3.4), and (3.8), we deduce that $\rho_N$ satisfies the following equation:

$(\rho_N)_k(t)=-i\int_0^t\Delta_N\mathcal M^{<}(v_{N/2},v_{N/2},\rho_N)_k(t')\,dt'-i\int_0^t\Delta_N\big[\mathcal M^{\ll}(v_N,v_N,\psi_N+\rho_N)-\mathcal M^{\ll}(v_{N/2},v_{N/2},\psi_N+\rho_N)\big]_k(t')\,dt',$
(5.1)

with initial data $(\rho_N)_k(0)=0$. Let $\mathcal L^{N,L}$ be defined as in (4.11), and denote $\mathcal L^{N}\coloneqq\mathcal L^{N,N/2}$; from Sec. IV B, we know that $\mathcal H^{N}\coloneqq(1-\mathcal L^{N})^{-1}$ is well-defined and has kernel $(\mathcal H^{N})_{kk'}(t,t')$ in physical space and $(\mathcal F\mathcal H^{N})_{kk'}(\lambda,\lambda')$ in Fourier space. Then, (5.1) can be reduced to

$(\rho_N)_k(t)=\sum_{k'}\int_0^t(\mathcal H^{N})_{kk'}(t,t')W_{k'}(t')\,dt',$
(5.2)

where

$W_k(t)=-i\Delta_N\int_0^t\sum_{w_1,w_2,w_3}\mathcal M^{\ll}(w_1,w_2,w_3)_k(t')\,dt'.$
(5.3)

Here, in (5.3), we assume for $j\in\{1,2\}$ that $w_j\in\{\psi_{N_j},\rho_{N_j},z_{N_j}\}$, where $\max(N_1,N_2)=N$, and that $w_3\in\{\psi_N,\rho_N\}$.

In order to prove the bound for $\rho_N$ in (3.18), we will apply a continuity argument, namely, assuming (3.18) and then improving it with a smallness factor. This can be done as long as we bound

$\|W\|_{X^{b}(J)}\le\tau^{\theta}N^{-1/2+\varepsilon_1+\varepsilon_2}$
(5.4)

since from Sec. IV B, we know that $\mathcal H^{N}$ is bounded from $X^{b}(J)$ to $X^{b}(J)$. In fact, we will prove (5.4) with an extra gain $N^{-\varepsilon_2/2}$, which will allow us to ignore any possible small power-of-$N$ loss in the process. The smallness factor $\tau^{\theta}$ will be provided by Lemma 2.3 as in Sec. III D, so we will not worry about it below. We divide the right-hand side of (5.3) into three terms:

• Term I: when w3 = ρN.

• Term II: when $w_3=\psi_N$ and $z_{N'}\in\{w_1,w_2\}$ for some $N'\ge N/2$.

• Term III: when w3 = ψN and w1, w2 ∈ {ψN, ρN, ψN/2, ρN/2}.

Note that these are the only possibilities: if (say) $N_1=N$, $w_1\in\{\psi_N,\rho_N\}$, and $N_2\le N/2$, then we must have $N_2=N/2$ due to the support condition for $\psi_N$ and $\rho_N$, as well as the restriction $|k_1-k_2|\lesssim N^{\varepsilon}$ in $\mathcal M^{\ll}$. Moreover, the estimate of term I follows from the operator norm bound,

$\big\|\mathcal I\Delta_N\mathcal M^{\ll}(y_{N_1},y_{N_2},z)\big\|_{X^{b}(J)}\lesssim\tau^{\theta}\max(N_1,N_2)^{-1/3}\|z\|_{X^{b}(J)},$
(5.5)

which is proved by repeating the arguments in Sec. IV A (the proof that works for $\mathcal M^{<}$ certainly also works for $\mathcal M^{\ll}$). In Secs. V A and V B, we will deal with terms II and III, respectively.

For simplicity, we assume that $w_1=z_{N'}$ (the proof of the case $w_2=z_{N'}$ is similar). There are then two cases to consider, when $w_2\in\{\rho_{N_2},z_{N_2}\}$ and when $w_2=\psi_{N_2}$.

#### 1. The case $w_2\in\{\rho_{N_2},z_{N_2}\}$

If $w_2\in\{\rho_{N_2},z_{N_2}\}$, then we, in particular, have $\|w_2\|_{X^{b}(J)}\lesssim N_2^{-1/2+\varepsilon_1+\varepsilon_2}$. By Lemma 2.4, we may fix extensions of $w_1$ and $w_2$ that satisfy the same bounds as they do but with $X^{b}(J)$ replaced by $X^{b}$; moreover, they satisfy the same measurability conditions as $w_1$ and $w_2$. For simplicity, we will still denote them by $w_1$ and $w_2$. The same is done for $w_3=\psi_N$, as well as the corresponding matrices.

Now, by (5.3) and Lemma 2.2, we can find an extension of II, which we still denote by II for simplicity, such that

$\widehat{\mathrm{II}_k}(\lambda)=\sum_{k_1-k_2+k_3=k}\int_{\mathbb R^3}I(\lambda,\Omega+\lambda_1-\lambda_2+\lambda_3)\cdot\widehat{(w_1)}_{k_1}(\lambda_1)\cdot\overline{\widehat{(w_2)}_{k_2}(\lambda_2)}\times\eta\Big(\frac{k_1-k_2}{N^{\varepsilon}}\Big)V_{k_1-k_2}\sum_{k_3'}\widehat{(H^{N})}_{k_3k_3'}(\lambda_3)(F_N)_{k_3'}\,d\lambda_1\,d\lambda_2\,d\lambda_3,$
(5.6)

where $\Omega=|k|^2-|k_1|^2+|k_2|^2-|k_3|^2$ and $I=I(\lambda,\mu)$ is as in (2.10). In the above expression, let $\mu\coloneqq\lambda-(\Omega+\lambda_1-\lambda_2+\lambda_3)$. In particular, we have $|I|\lesssim\langle\lambda\rangle^{-1}\langle\mu\rangle^{-1}$ by (2.10). By a routine argument, we may assume that $|\lambda|\le N^{100}$ and similarly for $\mu$ and each $\lambda_j$; in fact, if, say, $|\lambda_1|$ is the maximum of these values and $|\lambda_1|\ge N^{100}$, then we may fix the values of $k$ and all $k_j$ at a loss of at most $N^{12}$ and reduce to estimating (with the value of $\Omega$ fixed)

$|\widehat{\mathrm{II}}(\lambda)|\lesssim\int_{\mathbb R^3}\frac{1}{\langle\lambda\rangle\langle\lambda-\lambda_1+\lambda_2-\lambda_3-\Omega\rangle}\,|\hat w_1(\lambda_1)|\,|\hat w_2(\lambda_2)|\,|\hat w_3(\lambda_3)|\,d\lambda_1\,d\lambda_2\,d\lambda_3,$

with $|\lambda_1|\sim K\ge N^{100}$ and $\|\langle\lambda_j\rangle^{b}\hat w_j\|_{L^2}\lesssim1$ for each $j$. By estimating $w_1$ in the unweighted $L^2$ norm, we can gain a power $K^{-1/2}$, and using the $L^1$ integrability of $\hat w_j$ that follows from the weighted $L^2$ norms, we can fix the values of $\lambda_j$ for $j\in\{2,3\}$. In the end, this leads to

$|\widehat{\mathrm{II}}(\lambda)|\lesssim\mathbf 1_{|\lambda|\lesssim K}\cdot\langle\lambda\rangle^{-1}K^{-1/2},$

and hence, $\|\langle\lambda\rangle^{b}\widehat{\mathrm{II}}\|_{L^2}\lesssim K^{-1/3}\lesssim N^{-30}$, which is more than enough for (3.18).

Now, with $|\lambda|\le N^{100}$, etc., we may apply the bounds (3.16)–(3.18), but for the extensions and global norms, and replace the $Y^{1-b}$ norm (if any) by the $Y^{b}$ norm at a loss of $N^{C\kappa^{-1}}$, which will be neglected as stated above. Similarly, as $|\lambda|\le N^{100}$, we also only need to estimate II in the $X^{1-b}$ instead of the $X^{b}$ norm, again at a loss of $N^{C\kappa^{-1}}$. Then, using $L^1$ integrability in $\lambda_j$ (together with a meshing argument; see Sec. III D), provided by the weighted bounds (3.16)–(3.18), and the (almost) summability in $\mu$ due to the $\langle\mu\rangle^{-1}$ factor in (2.10), we may fix the values of $\lambda$, $\lambda_j$ ($1\le j\le3$), and $\lfloor\mu\rfloor$ (and hence, the value of $\Omega_0\in\mathbb Z$) and reduce to estimating the $\ell^2_k$ norm of the following quantity:

$Q_k\coloneqq\sum_{\substack{k_1-k_2+k_3=k\\|k|^2-|k_1|^2+|k_2|^2-|k_3|^2=\Omega_0}}\eta\Big(\frac{k_1-k_2}{N^{\varepsilon}}\Big)V_{k_1-k_2}\cdot\widehat{(w_1)}_{k_1}\overline{\widehat{(w_2)}_{k_2}}\cdot\sum_{k_3'}H_{k_3k_3'}(F_N)_{k_3'}.$
(5.7)

Here, in (5.7), we assume that $|k_1|\le N$, $|k_2|\le N_2$, $|k_1-k_2|\lesssim N^{\varepsilon}$, and $N/2<\langle k_3\rangle,\langle k_3'\rangle\le N$, and that $\Omega_0\in\mathbb Z$ is fixed, and the inputs satisfy

$\|\hat w_1\|_{\ell^2}\lesssim N^{-1/2+\varepsilon_1},\quad\|\hat w_2\|_{\ell^2}\lesssim N_2^{-1/2+\varepsilon_1+\varepsilon_2},\quad\|H\|_{k_3\to k_3'}\lesssim1.$

To estimate $Q$, we may assume $|k_1-k_2|\sim R\lesssim N^{\varepsilon}$ and define the base tensor

$h^{b}=h^{b}_{kk_1k_2k_3}=\eta\Big(\frac{k_1-k_2}{N^{\varepsilon}}\Big)V_{k_1-k_2}\cdot\mathbf 1_{k_1-k_2+k_3=k}\cdot\mathbf 1_{|k|^2-|k_1|^2+|k_2|^2-|k_3|^2=\Omega_0},$

with also the restrictions on kj as mentioned above. Then, we have

$Q_k=\sum_{k_1,k_2,k_3,k_3'}h^{b}_{kk_1k_2k_3}\cdot\widehat{(w_1)}_{k_1}\overline{\widehat{(w_2)}_{k_2}}\cdot H_{k_3k_3'}(F_N)_{k_3'},$

and hence,

$\|Q\|_{\ell^2}\lesssim N^{-1/2+\varepsilon_1}N_2^{-1/2+\varepsilon_1+\varepsilon_2}\Big\|\sum_{k_3,k_3'}h^{b}_{kk_1k_2k_3}H_{k_3k_3'}(F_N)_{k_3'}\Big\|_{kk_2\to k_1}.$

By Proposition 2.8 and the independence between $H_{k_3k_3'}$ and $(F_N)_{k_3'}$, we get that

$\Big\|\sum_{k_3,k_3'}h^{b}_{kk_1k_2k_3}H_{k_3k_3'}(F_N)_{k_3'}\Big\|_{kk_2\to k_1}\lesssim N^{\delta}\cdot N^{-1}\cdot\max\Big(\Big\|\sum_{k_3}h^{b}_{kk_1k_2k_3}H_{k_3k_3'}\Big\|_{kk_2k_3'\to k_1},\Big\|\sum_{k_3}h^{b}_{kk_1k_2k_3}H_{k_3k_3'}\Big\|_{kk_2\to k_1k_3'}\Big)\lesssim N^{\delta-1}\|H\|_{k_3\to k_3'}\cdot\max\big(\|h^{b}\|_{kk_2k_3\to k_1},\|h^{b}\|_{kk_2\to k_1k_3}\big)$

$N$-certainly. By the definition of $h^{b}$, using Schur's bound and the counting estimates in Lemma 2.5 and noting that $|k_2|\le N_2$ and $|k_1-k_2|\lesssim R$, we can bound

$\max\big(\|h^{b}\|_{kk_2k_3\to k_1},\|h^{b}\|_{kk_2\to k_1k_3}\big)\lesssim N^{\delta}R^{-\beta}\cdot N\min(N_2,R).$

Since also $\|H\|_{k_3\to k_3'}\lesssim1$, we conclude that

$\|Q\|_{\ell^2}\lesssim N^{-1/2+\varepsilon_1}N_2^{-1/2+\varepsilon_1+\varepsilon_2}\cdot N^{2\delta}R^{-\beta}\min(N_2,R)\lesssim N^{-1/2+\varepsilon_1+\varepsilon_2/2},$
(5.8)

which is enough for (3.18). This concludes the proof for term II when $w_2\in\{\rho_{N_2},z_{N_2}\}$. Note that the above argument also works for the case when $w_1=\rho_N$ and $w_2=\rho_{N_2}$, because here, we must have $N_2\ge N/2$ due to the support condition of $\rho_N$ and the assumption $|k_1-k_2|\lesssim N^{\varepsilon}$, and the above arguments give the same (in fact, better) estimates.

#### 2. The case $w_2=\psi_{N_2}$

In this case, by repeating the first part of the arguments in Sec. V A 1, we can reduce to estimating the $\ell^2_k$ norm of the following quantity:

$Q_k\coloneqq\sum_{\substack{k_1-k_2+k_3=k\\|k|^2-|k_1|^2+|k_2|^2-|k_3|^2=\Omega_0}}\eta\Big(\frac{k_1-k_2}{N^{\varepsilon}}\Big)V_{k_1-k_2}\cdot\widehat{(w_1)}_{k_1}\cdot\overline{\sum_{k_2'}H^{(2)}_{k_2k_2'}(F_{N_2})_{k_2'}}\cdot\sum_{k_3'}H^{(3)}_{k_3k_3'}(F_N)_{k_3'}.$
(5.9)

Here, in (5.9), we assume that $|k_1|\le N$, $|k_2|\le N_2$, $|k_1-k_2|\sim R\lesssim N^{\varepsilon}$, $N_2/2<\langle k_2\rangle,\langle k_2'\rangle\le N_2$, and $N/2<\langle k_3\rangle,\langle k_3'\rangle\le N$, and that $\Omega_0\in\mathbb Z$ is fixed, and the inputs satisfy

$\|\hat w_1\|_{\ell^2}\lesssim N^{-1/2+\varepsilon_1},\quad\|H^{(j)}\|_{k_j\to k_j'}\lesssim1\ (j=2,3).$

Moreover, each $H^{(j)}$ is such that either $H^{(j)}=\mathrm{Id}$ or $\|H^{(j)}\|_{k_jk_j'}\lesssim N_j^{1+\delta}$, with $N_3=N$. The sum in (5.9) can be decomposed into a term where $k_2'\neq k_3'$ and a term where $k_2'=k_3'$.

Case 1: $k_2'\neq k_3'$. Let $h^{b}_{kk_1k_2k_3}$ be defined as above; then, it suffices to estimate the $\ell^2_{k_1}\to\ell^2_{k}$ norm of the tensor

$(k,k_1)\mapsto\sum_{k_2,k_3}h^{b}_{kk_1k_2k_3}\sum_{k_2',k_3'}\overline{H^{(2)}_{k_2k_2'}(F_{N_2})_{k_2'}}\cdot H^{(3)}_{k_3k_3'}(F_N)_{k_3'}$

by using the 2 norm of w1. If N3 = N, then the tensors hb, H(2), and H(3) are independent of $(FN)k2′$ and $(FN)k3′$ and $k2′$$k3′$, so we can apply Lemma 2.8; if N2N/2, then hb, H(2), and H(3) and $(FN2)k2′$ are all independent of $(FN)k3′$, and moreover, hb and H(2) are independent of $(FN2)k2′$, so we can apply Lemma 2.8 iteratively, first for the sum in (k3, $k3′$) and then for the sum in (k2, $k2′$). In either case, by applying Lemma 2.8 and combining it with Lemma 2.7 and estimating H(j) in the kj$kj′$ norm, we obtain N-certainly that the desired $ℓk12→ℓk2$ norm of the tensor is bounded by

$N^{\delta}N_2^{-1}N^{-1}\cdot\max\big(\|h^{b}\|_{kk_2k_3\to k_1},\|h^{b}\|_{kk_2\to k_1k_3},\|h^{b}\|_{kk_3\to k_1k_2},\|h^{b}\|_{k\to k_1k_2k_3}\big).$

Using the fact that $|k_2|\le N_2$ and $|k_3|\le N$ in the support of $h^{b}$ and Lemma 2.5 as above, we can show that

$\max\big(\|h^{b}\|_{kk_2k_3\to k_1},\|h^{b}\|_{kk_2\to k_1k_3},\|h^{b}\|_{kk_3\to k_1k_2},\|h^{b}\|_{k\to k_1k_2k_3}\big)\lesssim R^{-\beta}N^{\delta}\cdot NN_2,$

and hence,

$\|Q\|_{\ell^2}\lesssim N^{-1/2+\varepsilon_1}\cdot R^{-\beta}N^{2\delta}\lesssim N^{-1/2+\varepsilon_1+\varepsilon_2/2},$
(5.10)

which is enough for (3.18).

Case 2: $k_2'=k_3'$. In this case, we must have $N_2=N$, and we can reduce (5.9) to the following expression (see Ref. 39):

$Q_k=\sum_{\substack{k_1-k_2+k_3=k\\|k|^2-|k_1|^2+|k_2|^2-|k_3|^2=\Omega_0}}\eta\Big(\frac{k_1-k_2}{N^{\varepsilon}}\Big)V_{k_1-k_2}\cdot\widehat{(w_1)}_{k_1}\cdot(\tilde H)_{k_2k_3},$
(5.11)

where

$(\tilde H)_{k_2k_3}=\sum_{k_2'}\frac{1}{\langle k_2'\rangle^2}\overline{H^{(2)}_{k_2k_2'}}\cdot H^{(3)}_{k_3k_2'}.$

As k2k3 in (5.9) due to the definition of $M ≪$, we know that either H(2) or H(3) must not be identity, and hence, we have $‖H̃‖ℓk2k32≲N−1+δ$. By (5.11), we then simply estimate

$\|Q\|_{\ell^2_k}\lesssim\|\hat w_1\|_{\ell^2}\cdot\|\tilde H\|_{\ell^2_{k_2k_3}}\cdot\|h^{b}\|_{kk_2k_3\to k_1}\lesssim N^{-1/2+\varepsilon_1}\cdot N^{-1+\delta}\cdot R^{-\beta}\cdot NR\lesssim N^{-1/2+\varepsilon_1+\varepsilon_2/2}$
(5.12)

using Lemma 2.5 and noting that $|k_1-k_2|\lesssim R$ and $|k_3|\le N$. This completes the proof for term II.

Here, we assume $w_3=\psi_N$ and $w_1,w_2\in\{\psi_N,\rho_N,\psi_{N/2},\rho_{N/2}\}$. We consider two possibilities: when $w_1,w_2\in\{\psi_N,\psi_{N/2}\}$, which we call term IV, and when $w_j\in\{\rho_N,\rho_{N/2}\}$ for some $j\in\{1,2\}$, which we call term V.

#### 1. Term IV

Suppose $w_1,w_2\in\{\psi_N,\psi_{N/2}\}$. We may also decompose them into $\psi_{N_j,L_j}$ for $L_j\le N_j/2$ and reduce to

$\mathrm{IV}_k(t)=-i\Delta_N\int_0^t\sum_{k_1-k_2+k_3=k}e^{it'\Omega}\eta\Big(\frac{k_1-k_2}{N^{\varepsilon}}\Big)V_{k_1-k_2}\times\sum_{k_1',k_2',k_3'}(h^{N_1,L_1})_{k_1k_1'}(t')\overline{(h^{N_2,L_2})_{k_2k_2'}(t')}(h^{N_3,L_3})_{k_3k_3'}(t')(F_{N_1})_{k_1'}\overline{(F_{N_2})_{k_2'}}(F_{N_3})_{k_3'}\,dt',$
(5.13)

where $N_1,N_2\in\{N,N/2\}$ and $N_3=N$. In (5.13), we consider two cases, depending on whether there is a pairing $k_1'=k_2'$ or $k_2'=k_3'$ or not.

Case 1: no pairing. Assume that $k_2'\notin\{k_1',k_3'\}$; then, we take the Fourier transform in the time variable $t$ and repeat the first part of the arguments in Sec. V A 1 to reduce to estimating the $\ell^2_k$ norm of the following quantity:

$Q_k\coloneqq\sum_{\substack{k_1-k_2+k_3=k\\|k|^2-|k_1|^2+|k_2|^2-|k_3|^2=\Omega_0}}\eta\Big(\frac{k_1-k_2}{N^{\varepsilon}}\Big)V_{k_1-k_2}\times\sum_{k_1'}h^{(1)}_{k_1k_1'}(F_{N_1})_{k_1'}\cdot\overline{\sum_{k_2'}h^{(2)}_{k_2k_2'}(F_{N_2})_{k_2'}}\cdot\sum_{k_3'}h^{(3)}_{k_3k_3'}(F_{N_3})_{k_3'}.$
(5.14)

In (5.14), we assume that $|k_j|\sim N$ and $|k_1-k_2|\sim R\lesssim N^{\varepsilon}$ and that the matrices $h^{(j)}$ are either the identity or satisfy

$\|h^{(j)}\|_{k_j\to k_j'}\lesssim L_j^{-1/2+3\varepsilon_1},\quad\|h^{(j)}\|_{k_jk_j'}\lesssim N^{1+\delta}L_j^{-1/2+2\varepsilon_1},$

and moreover, we may assume that $h^{(j)}$ is supported in $|k_j-k_j'|\lesssim L_jN^{\delta}$ by inserting a cutoff exploiting (3.17). The $\ell^2_k$ norm of $Q_k$ can then be estimated using Proposition 2.8 in the same way as in Sec. V A 2, either jointly in $(k_1',k_2',k_3')$ if each $N_j=N$, or first in those $k_j'$ with $N_j=N$ and then in those $k_j'$ with $N_j=N/2$, so that $N$-certainly, we have (with the base tensor $h^{b}$ defined as in Secs. V A 1 and V A 2)

$\|Q\|_{\ell^2}\lesssim N^{\delta}\cdot N^{-3}\|h^{b}\|_{kk_1k_2k_3}\prod_{j=1}^{3}\|h^{(j)}\|_{k_j\to k_j'}\lesssim N^{-3+\delta}\cdot N^{3/2}RN\cdot R^{-\beta}\lesssim N^{-1/2+\varepsilon_1/2}$

using Lemma 2.5, which is enough for (3.18).

Case 2: pairing. We now consider the cases where $k_1'=k_2'$ or $k_2'=k_3'$. First, if $k_2'=k_3'$, then we can apply the reduction arguments mentioned above and reduce to

$Q_k\coloneqq\sum_{\substack{k_1-k_2+k_3=k\\|k|^2-|k_1|^2+|k_2|^2-|k_3|^2=\Omega_0}}\eta\Big(\frac{k_1-k_2}{N^{\varepsilon}}\Big)V_{k_1-k_2}\cdot\sum_{k_1'}h^{(1)}_{k_1k_1'}(F_{N_1})_{k_1'}\cdot\tilde h_{k_2k_3},$
(5.15)

where

$(h̃)k2k3=∑k2′1⟨k2′⟩2hk2k2′(2)¯⋅hk3k2′(3), ‖h̃‖k2k3≲N−2min(‖h(2)‖k2→k2′‖h(3)‖k3k3′,‖h(2)‖k2k2′‖h(3)‖k3→k3′).$

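The mixed bound on $h̃$ above is an instance of the general fact that the Hilbert–Schmidt (square-summed) norm of a matrix product is controlled by the Hilbert–Schmidt norm of either factor times the operator norm of the other. A quick numerical sanity check of this fact (an illustration only; the random matrices below are placeholders, not the actual matrices h(2) and h(3)):

```python
import numpy as np

rng = np.random.default_rng(0)

def hs(M):
    # Hilbert-Schmidt norm: square-sum over both indices
    return np.linalg.norm(M, "fro")

def op(M):
    # operator norm ell^2 -> ell^2: largest singular value
    return np.linalg.norm(M, 2)

ok = True
for _ in range(200):
    A = rng.standard_normal((8, 6))
    B = rng.standard_normal((6, 8))
    # HS norm of a product is bounded by (HS of one factor) x (op of the other)
    ok = ok and hs(A @ B) <= min(hs(A) * op(B), op(A) * hs(B)) + 1e-9
print(ok)  # prints True
```

This is exactly the flexibility used above: whichever of h(2), h(3) has the better operator norm is measured in that norm, and the other absorbs the square-summed norm.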
Note that h(2) and h(3) cannot both be the identity since k2 ≠ k3. Now, if max(L2, L3) ≤ N/2, then due to independence, applying similar arguments as before, we can estimate N-certainly that

$‖Q‖ℓ2≲NδN−1⋅‖h(1)‖k1→k1′‖hb‖kk1→k2k3‖h̃‖k2k3≲N−2+2ε+4δ,$

using the constraint |k1 − k2| ≲ Nε, which is enough for (3.18); if max(L2, L3) = N, then we can gain a negative power of this value and view $FN1$ simply as an H−1/2− function (without considering randomness) and bound

$‖Q‖ℓ2≲NδN1/2+δ⋅‖h(1)‖k1→k1′‖hb‖kk1→k2k3‖h̃‖k2k3≲N−1+2ε+4δ,$

which is also enough for (3.18).

Finally, consider the case $k1′$ = $k2′$, so, in particular, N1 = N2. We will sum over L1 and L2 in order to exploit the cancellation (3.15) (as k1 ≠ k2); this leads to the following expression:

$∑k1′1⟨k1′⟩2(HN1)k1k1′(t′)(HN1)k2k1′(t′)¯,$

where again we have replaced $|gk1′|2$ by 1 as mentioned before. Since k1 ≠ k2, by (3.15), we may replace $⟨k1′⟩−2$ in the above expression by $⟨k1′⟩−2−⟨k1⟩−2$. Then, decomposing in L1 and L2 again, taking the Fourier transform in t, and repeating the reduction steps as before, we arrive at the following quantity:

$Qk≔∑k1−k2+k3=k|k|2−|k1|2+|k2|2−|k3|2=Ω0ηk1−k2NεVk1−k2⋅h̃k1k2⋅∑k3′hk3k3′(3)(FN3)k3′,$
(5.16)

where

$h̃k1k2=∑k1′1⟨k1′⟩2−1⟨k1⟩2hk1k1′(1)hk2k1′(2)¯,$

with h(j) as mentioned above. Note that we may assume |k1 − $k1′$| ≲ Nδ min(L1, Nε + L2) ≲ Nε+δ min(L1, L2) in view of |k1 − k2| ≲ Nε, and it is easy to show, assuming min(L1, L2) = L, that

$‖h̃‖k1k2≲Nε+δ⋅LN−3‖h(1)‖k1k1′‖h(2)‖k2→k2′≲N−2Nε+δ+4ε1.$

Since max(L1, L2) ≤ N/2, using independence and arguing as mentioned before, we can estimate that N-certainly,

$‖Q‖ℓ2≲Nδ⋅N−1‖h(3)‖k3→k3′⋅‖hb‖kk3→k1k2⋅‖h̃‖k1k2≲N−1+ε+2δ+4ε1,$

which is enough for (3.18). This completes the estimate for term IV.

#### 2. Properties of the matrix MN − HN

Before studying term V, we first establish some properties of the matrix $QN≔MN−HN=(QN)kk′(t)$, which satisfies

$(ρN)k(t)=∑k′(QN)kk′(t)(FN)k′.$
(5.17)

Lemma 5.1.
Let ε′ ≔ √ε so that (ε1 ≪) ε ≪ ε′ ≪ 1. Then, we have
$‖QN‖Y1−b(J)+supt∈J‖(QN)kk′(t)‖k→k′≲N−1/2+3ε1, ‖QN‖Zb(J)≲N1/2+2ε1.$
(5.18)
Moreover, we can decompose QN = QN,≪ + QN,remsuch that$‖QN,rem‖Zb(J)≲N1/2+2ε1−ε′/4$and that
$‖ρN,rem‖Xb(J)≲N−1/2+2ε1−ε′/4, where (ρN,rem)k(t)=∑k′(QN,rem)kk′(t)(FN)k′.$
(5.19)
Moreover, QN,≪ can be decomposed into at most NCε′ terms. For each term Q, there exist vectors ℓ*, m* such that |ℓ*|, |m*| ≲ Nε and such that $(Q^)kk′(λ)$ is a linear combination (in the form of some integral; see Ref. 40), with summable coefficients, of expressions of the form
$1k′−k=ℓ*⋅Y ℓ*,m*(k,λ)⋅Rℓ*,m*(k),$
(5.20)
where $Y$ is independent of $(FN)k$, $|Y |≲1$, and $R(k)$ depends only on m* · k; moreover, we have
$‖R‖ℓk2≲N1/2+2ε1+Cε′, ‖⟨λ⟩bY ‖Lλ2ℓk∞≲NCε′.$
(5.21)

Remark 5.2.

Lemma 5.1 plays an important role in Sec. V B 3 when estimating term V. In particular, we will exploit the extra one-dimensional independence of $R(k)$ from $(FN)k$, since $R(k)$ depends only on m* · k instead of on k.

Proof.
By definition of ξN and ψN in (3.11) and (5.1)–(5.3), as well as the associated matrices, we have the following identity:
$(QN)kk′(t)=∑k1∫0t(H NM )kk1(t,t1)(HN−QN)k1k′(t1) dt1,$
and hence, we have
$(QN)kk′(t)=∑n=1∞(−1)n−1∑k1∫0t{(H NM )n}kk1(t,t1)(HN)k1k′(t1) dt1,$
(5.22)
where $H N=H N,N/2$ is defined in Sec. IV B and $M$ denotes the following operator:
$z↦∑max(N1,N2)=NΔNM ≪(yN1,yN2,z).$
(5.23)
The bounds in (5.18) then follow from iterating as in Sec. IV B using the bounds (4.12)–(4.17) (together with the XαXα bounds) for the operators $H N$ and $M$, where the bounds for $M$ are proved in the same way as in Secs. IV A and IV B. Moreover, in (5.22), if we assume n ≥ 2 or replace $H N$ by $H N−H N,Nε′$ (or HN by $HN−HN,Nε′$), then the corresponding bounds can be improved by Nε′/4, and the resulting terms can be put into QN,rem (see Ref. 41). As for the remaining contribution, we can write
$(QN,≪)kk′(t)=∑k1,k2∫H kk1N,Nε′(t,t1)M k1k2(t1,t2)(HN,Nε′)k2k′(t2) dt1dt2,$
and hence,
$(QN,≪^)kk′(λ)=∑k1,k2∫(FH N,Nε′)kk1(λ,λ1)(FM )k1k2(λ1,λ2)(FHN,Nε′)k2k′(λ2) dλ1dλ2.$
(5.24)
We may assume that |k − k1| ≲ Nε and the same for k1 − k2 (using the definition of $M$) and k2 − k′, so at a loss of NCε, we may fix the values of k − k1, k1 − k2, and k2 − k′. Note that the matrices $FH N,Nε′$ and $FM$ satisfy bounds (4.15)–(4.17); moreover, in (4.17), we may replace the unfavorable exponents ⟨λ⟩2(1−b)⟨λ′⟩−2b by the favorable ones ⟨λ⟩2b⟨λ′⟩−2(1−b), at the price of replacing the right-hand side by a small positive power $NCκ−1$, by repeating the interpolation argument in Sec. IV B. Using these bounds, we then see that the integral (5.24) provides the required linear combination. Here, summability of coefficients follows from the estimate
$∫R2A(λ1)B(λ1,λ2)C(λ2) dλ1dλ2 ≲‖⟨λ1⟩−(1−b)A(λ1)‖L2⋅‖⟨λ1⟩b⟨λ2⟩−(1−b)B(λ1,λ2)‖L2⋅‖⟨λ2⟩bC(λ2)‖L2$
(5.25)
and the improved versions of (4.15)–(4.17). Recall that k − k1, k1 − k2, and k2 − k′ are all fixed. We set ℓ* ≔ (k1 − k) + (k2 − k1) + (k′ − k2) = k′ − k and m* ≔ k1 − k2. Finally, for fixed (λ1, λ2),42 we set $Y ℓ*,m*(k,λ)≔(FH N,Nε′)kk1(λ,λ1)(FHN,Nε′)k2k′(λ2)$ and $Rℓ*,m*(k)≔(FM )k1k2(λ1,λ2)$. The factors coming from $HN,Nε′$ and $H N,Nε′$ are independent of $(FN)k$, while the factor coming from $M$ depends on k1 only via the quantity |k1|2 − |k2|2 in view of the definition (5.23); hence, the desired decomposition is valid because |k1|2 − |k2|2 equals m* · k plus a constant once the above-mentioned difference vectors are all fixed. In addition, the bounds on $R$ and $Y$ in (5.21) follow easily from the above definitions of $R$ and $Y$ together with bounds (4.12)–(4.17) (together with the XαXα bounds) for the operators $H N$ and $M$.□

#### 3. Term V

Now, let us consider term V as defined in the introduction of Sec. V B. In the following estimate of term V, we will make full use of the cancellation in (3.15) together with Lemma 5.1. We may assume that N1 = N2 = N because if N1 ≠ N2, then in later expansions, we must have $k1′$ ≠ $k2′$ [so the cancellation in (3.15) is not needed], and the proof goes in the same way; if N1 = N2 = N/2, then the same cancellation holds, and again, we have the same proof. Now, recall that ρN = ξN − ψN and that

$(ξN)k(t)=∑k′(MN)kk′(t)(FN)k′, (ψN)k(t)=∑k′(HN)kk′(t)(FN)k′,$
(5.26)

as in (3.5) and (3.11), and that both MN and HN satisfy equality (3.15). Using this cancellation (when $k1′$ = $k2′$ in the expansion) in the same way as in Sec. V B 1 and by repeating the reduction steps mentioned before, we can reduce to estimating the quantity that is either

$Qk≔∑k1−k2+k3=k|k|2−|k1|2+|k2|2−|k3|2=Ω0ηk1−k2NεVk1−k2 ×∑k1′≠k2′Qk1k1′(FN1)k1′⋅Pk2k2′(FN2)k2′¯⋅∑k3′hk3k3′(3)(FN3)k3′$
(5.27)

or

$Qk≔∑k1−k2+k3=k|k|2−|k1|2+|k2|2−|k3|2=Ω0ηk1−k2NεVk1−k2⋅h̃k1k2⋅∑k3′hk3k3′(3)(FN3)k3′,$
(5.28)

where

$h̃k1k2=∑k1′1⟨k1′⟩2−1⟨k1⟩2Qk1k1′Pk2k1′¯.$

Here, in (5.27) and (5.28), the matrix Q is coming from QN, where $Qk1k1′=(QN^)k1k1′(λ)$ for some fixed λ; similarly, P is coming from either QN or $hN,L2$, and h(3) is coming from $hN,L3$ in the same way.

First, we consider (5.28). By losing a power NCε, we may fix the values of k1 − k2 and k − k3, and then, we will estimate $Q$ using $‖hb‖k1k2→kk3≲N2+Cε$, and we have the following bounds:

$supk3∑k3′hk3k3′(3)(FN3)k3′≲NO(ε1)⋅N−1L3−1/2, ‖h̃‖k1k2≲NO(ε1)⋅L2N−3N1/2L2−1/2$

(with L2 = N if P is coming from QN), where the first bound mentioned above follows from Proposition 2.8 for each fixed k3 and the second bound follows from estimating $‖h̃‖k1k2≲L2N−3‖Q‖k1k1′‖P‖k2→k2′$. This leads to

$‖Q‖ℓ2≲NO(ε)⋅N2N−1⋅L2N−3⋅N1/2L2−1/2≲N−1+Cε,$

which is enough.

Now, we consider (5.27). If P is coming from QN, then in (5.27), we may remove the condition $k1′$ ≠ $k2′$, reducing essentially to the expression in (5.3) with both w1 and w2 replaced by ρN, which is estimated in the same way as in Sec. V A 1. On the other hand, the term when $k1′$ = $k2′$ can be estimated in the same way as in (5.28). The same argument applies if P is coming from $hN,L2$ and max(L2, L3) ≥ Nε, where we can gain a power Nε′/4 from either L2 or L3, or if Q is coming from QN,rem, where we can gain extra powers Nε′/4 using Lemma 5.1.

Finally, consider (5.27), assuming that max(L2, L3) ≤ Nε and that Q comes from QN,≪ in Lemma 5.1. By losing at most NCε, we may fix the values of k1 − k2, k − k3, k2 − $k2′$, and k3 − $k3′$ and consider one single component of QN,≪ described as in Lemma 5.1. Then, there are only two independent variables—namely, k and k1—and we essentially reduce (5.27) to

$Qk=gk̃⋅∑k1:ℓ⋅(k+k1)=Ω0A ⋅1k1′−k1=ℓ*⋅1⟨k1′⟩⟨k2′⟩Y (k1)R(k1)Pk2k2′¯⋅gk1′gk2′¯.$
(5.29)

Here, $|A |≲1$ is a non-probabilistic factor, |ℓ|, |ℓ*| ≲ Nε are fixed vectors, $Y =Y (k1)$ and $R=R(k1)$ are as in Lemma 5.1, and $P=Pk2k2′$ is defined as above. Moreover, we know that $Y$ and P are independent of $gk1′$ and $gk2′$, that $R(k1)$ depends only on m* · k1 for some fixed vector |m*| ≲ Nε, and that |P| ≲ NO(ε), $|Y |≲NO(ε)$, and $‖R‖ℓ2≲N1/2+O(ε)$ (after fixing λ as before). Finally, $gk̃$ in (5.29) is $∑k3′hk3k3′(3)(FN3)k3′$, bounded by $|gk̃|≲N−1$.

Since $R(k1)$ depends only on m* · k1, if we fix the value of m* · k1 in the above summation, then $R(k1)$ can be extracted as a common factor, and to the rest of the sum, we can apply independence (using Proposition 2.8) and get

$|Q_k|\lesssim |g_{\tilde k}|\cdot\sum_a|R(a)|\cdot\Big(\sum_{k_1\in S_{a,k}}\Big|\frac{1}{\langle k_1'\rangle\langle k_2'\rangle}Y(k_1)\overline{P_{k_2k_2'}}\Big|^2\Big)^{1/2}\lesssim N^{-3+O(\varepsilon)}\cdot\sum_a|R(a)|\cdot|S_{a,k}|^{1/2},$

where $R(a)=R(k1)$ for any k1 · m* = a and $Sa,k≔{k1∈Z3:ℓ⋅(k1+k)=Ω0,k1⋅m*=a}$. Note that in the above estimate, we are dividing the set of possible k1’s into the subsets Sa,k, where ℓ · k1 equals one constant and m* · k1 equals another, and that Sa,k is either empty or has cardinality ≥ N1−. When Sa,k = ∅, $|Qk|=0$. When Sa,k ≠ ∅, we have |Sa,k| ≥ N1−, and hence,

$|Qk|≲NCε′−7/2⋅∑a|R(a)|⋅|Sa,k|=NCε′−7/2⋅∑k1:ℓ⋅(k+k1)=Ω0|R(k1)|.$

Then, using Schur’s bound, we get that

$‖Q‖ℓ2≲NCε′−7/2N2‖R‖ℓ2≲N−1+Cε′,$

which is enough for (3.18). This completes the proof for ρN.
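In the last step, the k1 summation was fibered into the sets S_{a,k}, each cut out by the two linear conditions ℓ · (k1 + k) = Ω0 and m* · k1 = a, so each nonempty fiber is the portion of a line of lattice points inside a box. The brute-force count below illustrates this with hypothetical axis-aligned choices of ℓ, m*, k, Ω0, and a (none of which are the actual vectors from the proof):

```python
# Count the fiber S_{a,k} = {k1 in Z^3 : l.(k1 + k) = Omega0, m.k1 = a}
# inside the box [-N, N]^3, for sample (hypothetical) vectors l and m.
N = 20
l, m = (1, 0, 0), (0, 1, 0)
k, Omega0, a = (3, -2, 5), 7, 4

count = 0
for x in range(-N, N + 1):
    for y in range(-N, N + 1):
        for z in range(-N, N + 1):
            k1 = (x, y, z)
            c1 = sum(li * (p + q) for li, p, q in zip(l, k1, k)) == Omega0
            c2 = sum(mi * p for mi, p in zip(m, k1)) == a
            if c1 and c2:
                count += 1
print(count)  # the two constraints pin x and y, leaving z free: 2N + 1 = 41
```

With these choices, the two constraints pin the first two coordinates and leave the third free, giving exactly 2N + 1 points, illustrating the cardinality ∼ N of a nonempty fiber used above.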

For the purpose of Sec. VI, we need an improvement for the ρN bound in (3.18), namely, the following.

Proposition 5.3.
Let N = M,$Y∈R$be any constant, and consider ρ*defined by
$(ρ*)k(t)=(ρN)k(t)⋅1Y≤|k|2≤Y+Nε′,$
and then, N-certainly, we can improve (3.18) to$‖ρ*‖Xb(J)≤N−1/2+ε1/2$. Note that this bound is better than the bound for zNin (3.18) [which is better than the bound for ρNin (3.18)].

Proof.

We only need to examine terms I–V in the above proof. For terms I, IV, and V (and hence also III), the above proof already gives bounds better than $N−1/2+ε1/2$, so these terms are acceptable, and it remains to study term II. Note that the definition of ρ* restricts k to a set E of cardinality ≤ N1+ by the standard divisor counting bound.
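The divisor counting step can be illustrated numerically: restricting |k|2 to a short window cuts the number of admissible frequencies from ∼N3 down to roughly N per unit window length. A brute-force check on a small box (the parameters are illustrative only):

```python
from collections import Counter
from itertools import product

N, Y, W = 30, 400, 10  # box half-width, window start, window length (illustrative)

# r3[m] = #{k in Z^3, |k| <= N : |k|^2 = m}
r3 = Counter()
for k in product(range(-N, N + 1), repeat=3):
    r3[k[0] ** 2 + k[1] ** 2 + k[2] ** 2] += 1

count = sum(r3[m] for m in range(Y, Y + W + 1))  # frequencies with Y <= |k|^2 <= Y + W
total = (2 * N + 1) ** 3                         # all frequencies in the box
print(count, total)
```

The restricted count is smaller than the unrestricted one by roughly a factor N2/W, matching the heuristic that each value of |k|2 is attained by O(N1+) lattice points.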

Let $hb=hkk1k2k3b$ be the base tensor, which is supported in |kj| ≲ Nj ≤ N and |k1 − k2| ∼ R, such that in the support of hb, we have k − k1 + k2 − k3 = 0 and |k|2 − |k1|2 + |k2|2 − |k3|2 = Ω0. There are three cases in term II that need consideration:
1. The case in Sec. V A 1: Here, bound (5.8) suffices unless max(N2, R) ≤ N; if this happens, note that in the above proof, (5.8) follows from the estimate

$\max\big(\|h^b_{kk_1k_2k_3}\|_{kk_2k_3\to k_1},\ \|h^b_{kk_1k_2k_3}\|_{kk_2\to k_1k_3}\big)\lesssim N^{1+\delta}$
assuming that max(N2, R) ≤ N. However, if we further require k ∈ E, then the right-hand side of the above bound can be improved to |E|1/2 = N1/2+, which leads to the desired improvement of (3.18).
2. Case 1 in Sec. V A 2: Here, the bound (5.10) suffices unless RN; if this happens, note that (5.10) follows from the estimate

$\max\big(\|h^b\|_{kk_2k_3\to k_1},\ \|h^b\|_{kk_2\to k_1k_3},\ \|h^b\|_{kk_3\to k_1k_2},\ \|h^b\|_{k\to k_1k_2k_3}\big)\lesssim N^{1+\delta}N_2$
assuming that RN. However, if we further require k ∈ E, then the right-hand side can be improved to |E|1/2N2 = N1/2+N2, which allows for the improvement.
3. Case 2 in Sec. V A 2: Here, (5.12) follows from the estimate $‖hb‖kk2k3→k1≲R−βNR$. However, if we further require k ∈ E, then the right-hand side can be improved to R−β|E|1/2R = R−βN1/2+R, which allows for the improvement. This finishes the proof.

Now, we will prove the zN part of bound (3.18), assuming that N = M. We will prove it by a continuity argument [see part (1) of Sec. III D for more details], so we may assume (3.18) and only need to improve it using Eq. (3.13); note that the smallness factor is automatic as long as we use (3.13), as explained before. As such, we can assume that each input factor wj on the right-hand side of (3.13) has one of the following four types, where in all cases, we have Nj ≤ N:

• (i) Type (G), where we define Lj = 1 and

$(wj^)kj(λj)=1Nj/2<〈kj〉≤Njgkj(ω)〈kj〉χ^(λj).$
(6.1)
• (ii) Type (C), where

$(wj^)kj(λj)=∑Nj/2<〈kj′〉≤Njhkjkj′(j)(λj,ω)gkj′(ω)〈kj′〉,$
(6.2)

with $hkjkj′(j)(λj,ω)$ supported in the set $Nj2<⟨kj⟩≤Nj,Nj2<⟨kj′⟩≤Nj$ and $B≤Lj$ measurable for some Lj ≤ Nj/2 and satisfying the following bounds (where in the first bound, we first fix λj, take the operator norm, and then take the L2 norm in λj):

$‖⟨λj⟩1−bhkjkj′(j)(λj)‖Lλj2(ℓkj2→ℓkj′2)≲Lj−1/2+3ε1, ‖⟨λj⟩bhkjkj′(j)(λj)‖ℓkjkj′2Lλj2≲Nj1+δLj−1/2+2ε1.$
(6.3)

Moreover, using (3.17), we may assume that h(j) is supported in |kj − $kj′$| ≲ NδLj. Note that if wj is of type (G), then $(wj^)kj(λj)$ can also be expressed in the same form as (6.2) but with $hkjkj′(j)=1kj=kj′⋅χ^(λj)$, except that the second bound in (6.3) is not true in this case.

• (iii) Type (L), where $(wj^)kj(λj)$ is supported in {|kj| ∼ Nj} and satisfies

$‖⟨λj⟩b(wj^)kj(λj)‖ℓkj2Lλj2≲Nj−1/2+ε1+ε2.$
(6.4)

Moreover, such wj is a solution to Eq. (5.1).

• (iv) Type (D), where $(wj^)kj(λj)$ is supported in {|kj| ≲ Nj} and satisfies

$‖⟨λj⟩b(wj^)kj(λj)‖ℓkj2Lλj2≲Nj−1/2+ε1.$
(6.5)

Moreover, such wj is a solution to Eq. (3.13).

Now, let the multilinear forms $M ○$, $M <$, $M >$, and $M ≪$ be as in (2.4), (3.3), and (3.9). The terms on the right-hand side of (3.13), apart from the first term in the second line of (3.13), which is trivially bounded, are as follows:

1. Term

$I=IχΠNM >(w1,w2,w3),$

where wj can be of any type and max(N1, N2, N3) = N.

2. Term

$II=IχΠN(M <−M ≪)(w1,w2,w3),$

where wj can be of any type and max(N1, N2) = N.

3. Term

$III=IχΔNM ○(w1,w2,w3),$

where wj can be of any type and max(N1, N2, N3) ≤ N/2.

4. Term

$IV=IχΠN/2M <(w1,w2,w3),$

where wj can be of any type and max(N1, N2) ≤ N/2 and N3 = N.

5. Term

$V=IχΠN/2M ≪(w1,w2,w3),$

where wj can be of any type and max(N1, N2) = N3 = N.

6. Term

$VI=IχΔNM <(w1,w2,w3),$

where w1 and w2 can be of any type, w3 has type (D), and max(N1, N2) ≤ N/2 and N3 = N.

7. Term

$VII=IχΔNM ≪(w1,w2,w3),$

where w1 and w2 can be of any type, w3 has type (D), and max(N1, N2) = N3 = N.

8. Term VIII represents the last two lines of the right-hand side of (3.13).

Our goal is to recover the bound for zN in (3.18) for each of terms I–VIII mentioned above. In doing so, we will consider two cases. First is the no-pairing case, where if w1 and w2 are of type (C) or (G) and, hence, expanded as in (6.2), then we assume that $k1′$ ≠ $k2′$; similarly, if w2 and w3 are of type (C) or (G), then we assume that $k2′$ ≠ $k3′$. The second case is the pairing case, which is when $k1′$ = $k2′$ or $k2′$ = $k3′$ (the over-pairing case where $k1′$ = $k2′$ = $k3′$ is easy, and we shall omit it). We will deal with the no-pairing case for terms I–VII in Secs. VI A–VI C, the pairing case for these terms in Sec. VI D, and term VIII in Sec. VI E.

#### 1. Preparation of the proof

We start with some general reductions in the no-pairing case. Recall from Sec. III D that we can always gain a smallness factor from the short time τ ≪ 1 and can always ignore losses of $(N*)Cκ−1$, provided that we can gain a power Nε/10 (which will be clear in the proof). We will consider $IχM (⋆)^(w1,w2,w3)k(λ)$, where $M (⋆)$ can be one of $ΠM ○$, $ΠM <$, $ΠM >$, $ΠM ≪$, and $Π(M <−M ≪)$, with Π being a general notation for the projections ΠN, ΠN/2, and ΔN,

$IM (⋆)^(w1,w2,w3)k(λ)=∑(k1,k2,k3)k=k1−k2+k3,k2∉{k1,k3}(⋆)∫dλ1dλ2dλ3 I(λ,Ω+λ1−λ2+λ3) ×Vk1−k2⋅(w1^)k1(λ1) (w2^)k2(λ2)¯ (w3^)k3(λ3),$
(6.6)

where Ω = |k|2 − |k1|2 + |k2|2 − |k3|2 and ∑(⋆) is defined directly based on the definitions of $M ○$, $M <$, $M >$, and $M ≪$ and the choice of Π. For example, if $M (⋆)$ is $ΠNM >$, then there are two more restrictions, |k| ≤ N and ⟨k1 − k2⟩ > N1−δ, in the sum ∑(⋆). The other ∑(⋆) are defined in similar ways.

Before going into the different estimates for I–VII, we first make a few remarks.

• If a position wj has type (L) or (D), then in most cases, we only need to consider type (L) terms since (6.5) is stronger than (6.4); there are exceptions that will be treated separately later.

• wj of type (G) can be considered as a special case of type (C) when $hkjkj′(j)(λj)=1Nj/2<⟨kj⟩≤Nj⋅1kj=kj′⋅χ^(λj)$; if we avoid using the $ℓkjkj′2$ norm in (6.3), then we only need to consider type (C) terms.

• Term I can be estimated in the same way as term II. In fact, the definition of $M >$ implies max(N1, N2) ≥ N1−δ, so we are essentially in (a special case of) term II up to a possible loss NCδ, which will be negligible compared to the gain. Moreover, term V can be estimated similarly to term IV; see Sec. VI C.

• Terms VI and VII are readily estimated using the XαXα bounds for the linear operator (3.19) proved in Secs. IV A and IV B.

Based on these remarks, from now on, we will consider terms II–IV (and VIII at the end), where the possible cases for the types of (w1, w2, w3) are (a) (C, C, C), (b) (C, C, L), (c) (C, L, C), (d) (L, C, C), (e) (L, L, C), (f) (C, L, L), (g) (L, C, L), and (h) (L, L, L).

In Sec. VI B, we will estimate term II, which can be understood as high–high interactions in view of max(N1, N2) = N; note that if k is the high frequency, then either k3 is also a high frequency or |k1 − k2| must be large. In Sec. VI C, we will estimate terms III and IV by using a counting technique in a special situation called the Γ-condition [see (6.18)]. In Sec. VI D, we consider the pairing case.

We will estimate term II in this subsection. First, we can repeat the arguments for λ, λj, and the Duhamel operator I in (6.6) as in Secs. IV and V. Namely, we first restrict to |λj| ≤ N100 and |λ|, |μ| ≤ N100, where μ = λ − (Ω + λ1 − λ2 + λ3), and replace the unfavorable exponents (1 − b or b depending on the context) by the favorable ones (b or 1 − b) and then exploit the resulting integrability in λj to fix the values of λ, λj, and ⌊μ⌋. Then, we reduce to the following expression, where Ω0 is a fixed integer:

$Xk≔∑k1−k2+k3=khkk1k2k3b(w1̂)k1(λ1) (w2̂)k2(λ2)¯ (w3̂)k3(λ3),$
(6.7)

where hb is the base tensor that contains the factors

$Vk1−k2⋅1k1−k2+k3=k⋅1|k|2−|k1|2+|k2|2−|k3|2=Ω0.$

We assume that hb is supported in the set where |kj| ≤ Nj and ⟨k1 − k2⟩ ∼ R, where R is a dyadic number. Moreover, we assume that R and the support of hb satisfy the conditions associated with the definition of some $M (⋆)$. In view of the factor $|Vk1−k2|∼R−β$ in hb, we also define hR,(⋆) ≔ Rβ · hb, which is essentially the characteristic function of the set

$S^R=\big\{(k,k_1,k_2,k_3)\in(\mathbb{Z}^3)^4:\ k_2\notin\{k_1,k_3\},\ k=k_1-k_2+k_3,\ |k|\le N,\ |k|^2-|k_1|^2+|k_2|^2-|k_3|^2=\Omega_0,\ |k_j|\le N_j\ (j\in\{1,2,3\}),\ \langle k_1-k_2\rangle\sim R\big\},$
(6.8)

possibly with extra conditions determined by the definition of $M (⋆)$. We also define $SkR$ to be the set of (k, k1, k2, k3) ∈ SR with fixed k and, similarly, define $Sk1k2R$. Note that when wj has type (G), (C), or (L), we can further assume that |kj| > Nj/2 in the definition of SR.
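The resonance factor in hR,(⋆) is tractable because of the standard factorization of the resonance function on the convolution hyperplane: if k = k1 − k2 + k3, then |k|2 − |k1|2 + |k2|2 − |k3|2 = 2(k1 − k2) · (k3 − k2). A quick randomized check of this identity (illustration only):

```python
import random

random.seed(0)

def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

ok = True
for _ in range(1000):
    k1, k2, k3 = [tuple(random.randint(-50, 50) for _ in range(3)) for _ in range(3)]
    k = tuple(a - b + c for a, b, c in zip(k1, k2, k3))  # convolution constraint
    Omega = dot(k, k) - dot(k1, k1) + dot(k2, k2) - dot(k3, k3)
    lhs = tuple(a - b for a, b in zip(k1, k2))  # k1 - k2
    rhs = tuple(c - b for c, b in zip(k3, k2))  # k3 - k2
    ok = ok and (Omega == 2 * dot(lhs, rhs))
print(ok)  # prints True
```

This is what reduces the level sets {Ω = Ω0} to divisor-type counting problems for 2(k1 − k2) · (k3 − k2) = Ω0, as used in the estimates below.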

The goal now is to bound the norm $‖Xk‖ℓ2$, abbreviated $‖Xk‖k$, assuming that the wj satisfy bounds (6.1)–(6.5) but without the λj component; for example, (6.5) becomes $‖wj‖kj≲Nj−1/2+ε1$.

#### 1. Case (a): (C, C, C)

In this case, we have

$Xk=R−β∑(k1,k2,k3)hkk1k2k3R,(⋆) ⋅∑(k1′,k2′,k3′)Nj/2<|kj′|≤Njj∈{1,2,3}hk1k1′(1)hk2k2′(2)¯hk3k3′(3) gk1′gk2′¯gk3′⟨k1′⟩⟨k2′⟩⟨k3′⟩,$
(6.9)

where $hkjkj′(j)=hkjkj′(j)(ω)$ satisfies (6.3) with some Nj and LjNj/2 for 1 ≤ j ≤ 3 and $hkk1k2k3R,(⋆)$ is defined as mentioned above.

To estimate $‖X‖k$, we would like to apply Proposition 2.8 and then Proposition 2.7. Like in Secs. IV A and V, the way we apply Proposition 2.8 depends on the relative sizes of Nj (1 ≤ j ≤ 3). For example, if N1 = N2 = N3, we shall apply Proposition 2.8 jointly in the ($k1′$, $k2′$, $k3′$) summation in (6.9); if N1 = N3 > N2, we will first apply Proposition 2.8 jointly in the ($k1′$, $k3′$) summation and then apply it in the $k2′$ summation, and if N3 > N1 > N2, we will apply first in the $k3′$ summation, then in the $k1′$ summation, and then in the $k2′$ summation. The results in the end will be the same in all cases, so, for example, we will consider the case N3 > N1 > N2. Now, we have

$\|X_k\|_k=R^{-\beta}\Big\|\sum_{k_3'}\sum_{k_3}\tilde H_{kk_3}h^{(3)}_{k_3k_3'}\frac{g_{k_3'}}{\langle k_3'\rangle}\Big\|_k,$
(6.10)

where

$H̃kk3≔∑k1′∑k1Ḧkk1k3hk1k1′(1)gk1′⟨k1′⟩, Ḧkk1k3≔∑k2′∑k2hkk1k2k3R,(⋆)hk2k2′(2)¯gk2′¯⟨k2′⟩.$
(6.11)

By the independence between $gk3′$ and $H̃kk3hk3k3′(3)$ since N3 > N1 > N2, we apply Proposition 2.8 and Proposition 2.7 and get τ−1N*-certainly that

$\|X_k\|_k\le R^{-\beta}N_3^{-1}\cdot\Big\|\sum_{k_3}\tilde H_{kk_3}h^{(3)}_{k_3k_3'}\Big\|_{kk_3'}\lesssim R^{-\beta}N_3^{-1}\cdot\|h^{(3)}\|_{k_3'\to k_3}\|\tilde H\|_{kk_3}.$
(6.12)

Similarly, by the independence between $gk1′$ and $Ḧkk1k3hk1k1′(1)$ since N1 > N2 and also by the independence between $gk2′$ and $hkk1k2k3R,(⋆) hk2k2′(2)¯$, once again we can apply Proposition 2.8 and Proposition 2.7 to $‖H̃kk3‖kk3$ and then to $‖Ḧ‖kk1k3$. As a consequence, we have τ−1N*-certainly that

$\|X_k\|_k\lesssim R^{-\beta}(N_1N_2N_3)^{-1}\cdot\prod_{j=1}^3\|h^{(j)}\|_{k_j'\to k_j}\cdot\|h^{R,(\star)}\|_{kk_1k_2k_3}.$
(6.13)

In the other cases, we get the same bound. Without loss of generality, we may assume that N1 = N, and then, using Lemma 2.5, we can estimate that

$‖hR,(⋆)‖kk1k2k3≲Nδ⋅N33/2⋅RN2,$

which implies that $‖Xk‖k≲N−1+CδN31/2≲N−1/2+Cδ$, which is enough for (3.18).

#### 2. Case (b): (C, C, L)

In this case, we have

$Xk=R−β∑(k1,k2,k3)hkk1k2k3R,(⋆) ⋅∑(k1′,k2′)Nj/2<|kj′|≤Njhk1k1′(1)hk2k2′(2)¯ gk1′gk2′¯⟨k1′⟩⟨k2′⟩(w3)k3,$
(6.14)

where $hkjkj′(j)=hkjkj′(j)(ω)$ satisfies (6.3) with some Nj and LjNj/2 for 1 ≤ j ≤ 2 and the base tensor $hkk1k2k3R,(⋆)$ is defined as before. Clearly, $‖Xk‖k$ can be bounded by $N3−1/2+ε1+ε2$ times the norm,

$R−β∑(k1,k2,k3)hkk1k2k3R,(⋆) ⋅∑(k1′,k2′)Nj/2<|kj′|≤Njhk1k1′(1)hk2k2′(2)¯ gk1′gk2′¯⟨k1′⟩⟨k2′⟩k→k3.$

By applying Propositions 2.8 and 2.7 again, in the same manner as in Sec. VI B 1, we get that the above norm is bounded by

$R−β⋅(N1N2)−1max(‖hR,(⋆)‖k→k1k2k3,‖hR,(⋆)‖kk1→k2k3,‖hR,(⋆)‖kk2→k1k3,‖hR,(⋆)‖kk1k2→k3).$

By Lemma 2.5, we can conclude that

$max(‖hR,(⋆)‖k→k1k2k3,‖hR,(⋆)‖kk1→k2k3,‖hR,(⋆)‖kk2→k1k3,‖hR,(⋆)‖kk1k2→k3)≲R⋅min(N1,N2),$

and hence, we easily get $‖Xk‖k≲N−1+Cε1$, which is enough for (3.18).

#### 3. Case (c): (C, L, C) and case (d): (L, C, C)

The estimates in cases (c) and (d) are similar to case (b), so we will state them without proof. In case (c), we get

$‖Xk‖k≲N2−1/2+ε1+ε2R−β⋅(N1N3)−1 ×max(‖hR,(⋆)‖k→k1k2k3,‖hR,(⋆)‖kk1→k2k3,‖hR,(⋆)‖kk3→k1k2,‖hR,(⋆)‖kk1k3→k2),$

and in case (d), we get a similar bound, but with subindices 1 and 2 switched.

Now, by Lemma 2.5, we can obtain that

$‖hR,(⋆)‖k→k1k2k3≲NCδ⋅minN23/2(N1∧N3)1/2,N1N3,RN1,$
$‖hR,(⋆)‖kk3→k1k2≲NCδ⋅min(N1,N2,R)⋅min(N,N3,R),$
$‖hR,(⋆)‖kk1→k2k3≲NCδ⋅min(N2,N3,R)⋅min(N,N1,R),$
$‖hR,(⋆)‖kk1k3→k2≲min(R,N1)N3.$

In the first case, we directly get that

$‖Xk‖k≲N2−1/2+ε1+ε2R−β,$

which is enough for (3.18), as max(N1, N2) = N and N1−δ ≳ R ≳ Nε (and then, we have N1 ∼ N2 ∼ N) in view of the definition of $M <−M ≪$. In the second case, we get

$‖Xk‖k≲min(R1−βN1−1N3−1N21/2+ε1+ε2,R−βN2−1/2+ε1+ε2),$

which is also enough for (3.18), as max(N1, N2) = N and R ≳ Nε. In the third case, we get

$‖Xk‖k≲N2−1/2+ε1+ε2max(R,N1)−1,$

which is also enough for (3.18). By switching indices 1 and 2, we also get the same estimates in case (d).

#### 4. Case (e): (L, L, C)

In this case, we have

$Xk=∑k3′∑(k1,k2,k3)hkk1k2k3R,(⋆)⋅(w1)k1(w2)k2¯hk3k3′(3) gk3′⟨k3′⟩,$
(6.15)

where $hk3k3′(3)=hk3k3′(3)(ω)$ satisfies (6.3) with some N3 and L3 ≤ N3/2 and the base tensor $hkk1k2k3R,(⋆)$ is defined as before. By symmetry, we may assume N1 ≤ N2, and then, by the same argument as above, using Propositions 2.7 and 2.8, we can bound

$‖Xk‖k≲(N1N2)−1/2+ε1+ε2N3−1R−β⋅max(‖hR,(⋆)‖kk1k3→k2,‖hR,(⋆)‖kk1→k2k3).$

By Lemma 2.5, both tensor norms are bounded by min(N1, R)N3; as N1 ≤ N2 (and hence, N2 = N) and R ≳ Nε, it is easy to check that this bound is enough for (3.18).

#### 5. Case (f): (C, L, L) and case (g): (L, C, L)

The estimates in cases (f) and (g) are similar to case (e), so we will state them directly. Again, the two cases here only differ by switching indices 1 and 2, so we only consider case (f). As in case (e), we get two bounds,

$‖Xk‖k≲(N2N3)−1/2+ε1+ε2N1−1R−βmax(‖hR,(⋆)‖kk1k2→k3,‖hR,(⋆)‖kk2→k1k3)$

and

$‖Xk‖k≲(N2N3)−1/2+ε1+ε2N1−1R−βmax(‖hR,(⋆)‖kk1k3→k2,‖hR,(⋆)‖kk1→k2k3).$

Now, if $N3≥Nε2$, we will apply the first bound and use that

$max(‖hR,(⋆)‖kk1k2→k3,‖hR,(⋆)‖kk2→k1k3)≲Rmin(N1,N2),$

so the factor $N1−1N2−1/2+ε1+ε2min(N1,N2)$, together with $N3−1/2+ε1+ε2$, where $N3≥Nε2$, provides the bound that is enough for (3.18). Moreover, the same bound also works if $N2≤N1−ε2$ (since in this case, N1 = N).

If $N3≤Nε2$ and $N2≥N1−ε2$, we will apply the second bound and use that

$max(‖hR,(⋆)‖kk1k3→k2,‖hR,(⋆)‖kk1→k2k3)≲NCε2N1,$

assuming that $N3≤Nε2$. This is also enough for (3.18) assuming that $N2≥N1−ε2$ and RNε.

#### 6. Case (h): (L, L, L)

In this case, we have

$Xk≔∑(k1,k2,k3)hkk1k2k3R,(⋆)⋅(w1)k1(w2)k2¯(w3)k3,$
(6.16)

where the base tensor $hkk1k2k3R,(⋆)$ is defined as before. Then, simply using Proposition 2.7, we get

$‖Xk‖k≲R−β⋅(N1N2N3)−1/2+ε1+ε2⋅hkk1k2k3R,(⋆)kk2→k1k3.$
(6.17)

By Lemma 2.5, we have $‖hkk1k2k3R,(⋆)‖kk2→k1k3≲(Rmin(N1,N2))1/2$, which implies that

$‖Xk‖k≲R−β+1/2max(N1,N2)−1/2+Cε1,$

which is enough for (3.18) because max(N1, N2) = N and R ≳ Nε.

In this section, we estimate terms III and IV. These two terms are actually similar, and the key property that they satisfy is the so-called Γ condition. Namely, due to the projections and assumptions on the inputs in terms III and IV, we have that

$|k|^2\ge\Gamma\ge|k_3|^2\ \text{ for all }(k,k_1,k_2,k_3)\in S\qquad\text{or}\qquad|k|^2\le\Gamma\le|k_3|^2\ \text{ for all }(k,k_1,k_2,k_3)\in S$
(6.18)

for some real number Γ, where S is the support of the base tensor hb [note that in term IV, we may assume that w3 is not of type (D) as, otherwise, the bound follows from what we have already done, so here, we may choose Γ = (N/2)2 − 1].

To proceed, we return to $IχM (⋆)^(w1,w2,w3)k(λ)$ in (6.6), where Ω = |k|2 − |k1|2 + |k2|2 − |k3|2, and set μ = λ − (Ω + λ1 − λ2 + λ3); then, we have |I| ≲ ⟨λ⟩−1⟨μ⟩−1 by (2.10). Following the same reduction steps as before, we can assume that |λ|, |λj| (j = 1, 2, 3), |μ| ≤ N100 and may replace the unfavorable exponents by the favorable ones. Now, instead of fixing each λj and λ and ⌊μ⌋, we do the following.

Without loss of generality, we may assume that |λ3| is the maximum of all the parameters |λj| and |λ| and |μ|; the other cases are treated similarly. We may fix a dyadic number K and assume that |λ3| ∼ K. Then, we may fix λj (j ≠ 3) and λ and ⌊μ⌋, again using integrability in these variables, and exploit the weight $⟨λ3⟩b$ in the weighted norms in which w3 is bounded and reduce to an expression

$Xk≔R−βK−b∑k1−k2+k3=k∫ dλ3⋅hkk1k2k3R,K,(⋆)(λ3)⋅(w1)k1 (w2)k2¯⋅(w3̃)k3(λ3),$
(6.19)

where $(w3̃)k3(λ3)=Kb(w3^)k3(λ3)$ and $hkk1k2k3R,K,(⋆)(λ3)$ is essentially the characteristic function of the set (with possibly more restrictions according to the definition of $M (⋆)$)

$S^{R,K}=\big\{(k,k_1,k_2,k_3,\lambda_3)\in(\mathbb{Z}^3)^4\times\mathbb{R}:\ k_2\notin\{k_1,k_3\},\ k=k_1-k_2+k_3,\ |k|\le N,\ |k|^2-|k_1|^2+|k_2|^2-|k_3|^2=-\lambda_3+\Omega_0+O(1),\ |\lambda_3|\sim K,\ |k_j|\le N_j\ (j\in\{1,2,3\}),\ \langle k_1-k_2\rangle\sim R\big\},$
(6.20)

where Ω0 is a fixed number such that |Ω0| ≲ K. We also define $SkR,K$ to be the set of (k1, k2, k3, λ3) such that (k, k1, k2, k3, λ3) ∈ SR,K for fixed k. Note that when wj is of type (C), (G), or (L), we can further assume that $Nj2<|kj|≤Nj$.

The idea in estimating (6.19) is to view (k3, λ3) as a whole (say, denote it by $k3̃$), which will allow us to gain using the Γ condition in estimating the norms of the base tensor hR,K,(⋆). Although our tensors here involve the variable $λ3∈R$, it is clear that Propositions 2.6 and 2.7 still hold for such tensors, and Proposition 2.8 can also be proved by using a meshing argument (see Sec. III D, where the derivative bounds in λ3 are easily proved as all the relevant functions are compactly supported in physical space). Moreover, by the induction hypothesis and the manipulation mentioned above (for example, with the Y1−b norm replaced by the Yb norm), we can also deduce corresponding bounds for $w3=(w3)k3̃$ and the corresponding matrices such as $h̃k3̃k3′(3)$, for example, $‖h̃k3̃k3′(3)‖k3′→k3̃≲L3−1/2+3ε1$. Because of this, in the proof below, we will simply write $∑k3̃$, while we actually mean $∑k3∫dλ3$, so the proof has the same format as the previous ones.

We now consider the input functions. In term III, clearly, max(N1, N2, N3) ≳ N; if N3 ≪ N, then we must have max(N1, N2) ≳ N and |k1 − k2| ≳ N, and hence, this term can be treated in the same way as term II. Therefore, we may assume that N3 ∼ N, and clearly, the same happens for term IV. If max(N1, N2) ≳ N, then again using the term II estimate, we only need to consider the case where |k1 − k2| ≲ Nε. This term can be treated using similar arguments as below and is much easier due to the smallness of |k1 − k2|, so we will only consider the case max(N1, N2) ≪ N. In the same way, we will not consider term V here. Finally, if $w3=zN3$ with N3 ∼ N, then (3.18) directly follows from the linear estimate proved in Sec. IV A, and the Γ condition is not needed.

There are two cases: when w3 has type (L) or when w3 has type (C) [or (G)]. In the latter case, there are four further cases for the types of w1 and w2, which we will discuss below.

#### 1. The type (L) case

Suppose that w3 has type (L). Clearly, if $max(N1,N2)≥N100ε2$, then (3.18) also follows from the linear estimates in Sec. IV A [because the difference between the ρN bound and the zN bound in (3.18) is at most $Nε2$], so we may assume that $max(N1,N2)≤N100ε2$. Then, in (6.19), we may further fix the values of (k1, k2) at the price of $NCε2$, and hence, we may write

$Xk=R−βK−b∑k3̃h(k,k3̃)⋅(w3̃)k3̃,$

and by definition, it is easy to see that $‖h‖k3̃→k≲1$. Then, (3.18) follows, using the bound for w3, if $K≥Nε12$. Finally, if $K≤Nε12$, then we have $|Ω|≲Nε12$, where Ω = |k|2 − |k1|2 + |k2|2 − |k3|2. Using the Γ condition (6.18), we conclude that |k3|2 belongs to an interval of length $NO(ε12)$, so we can apply Proposition 5.3 to gain a power $N−ε1/2$, which covers the loss $NO(ε2+ε12)$ and is enough for (3.18).
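The interval localization of |k3|2 used here is elementary arithmetic: since |k3|2 = |k|2 − Ω − (|k1|2 − |k2|2), the Γ condition |k3|2 ≤ Γ ≤ |k|2 together with |Ω| ≤ D and ||k1|2 − |k2|2| ≤ M forces |k3|2 ∈ [Γ − D − M, Γ]. A randomized sanity check on a small box (illustration only):

```python
import random

random.seed(2)

def sq(v):
    return sum(x * x for x in v)

ok = True
for _ in range(2000):
    k1, k2, k3 = [tuple(random.randint(-10, 10) for _ in range(3)) for _ in range(3)]
    k = tuple(a - b + c for a, b, c in zip(k1, k2, k3))
    if sq(k3) > sq(k):
        continue  # the Gamma condition requires |k3|^2 <= Gamma <= |k|^2
    Omega = sq(k) - sq(k1) + sq(k2) - sq(k3)
    D, M = abs(Omega), abs(sq(k1) - sq(k2))
    Gamma = random.randint(sq(k3), sq(k))  # any admissible Gamma
    # |k3|^2 is pinned to an interval of length D + M just below Gamma
    ok = ok and (Gamma - D - M <= sq(k3) <= Gamma)
print(ok)  # prints True
```

In the proof, D and M are both NO(ε1²+ε2), which is why |k3|2 is confined to a short interval and Proposition 5.3 applies.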

#### 2. The type (C, C, C) case

Now, suppose that w1, w2, and w3 have type (C, C, C). By symmetry, we may assume that N1N2. Then, by the same argument as in Sec. VI B 1, we obtain that

$‖Xk‖k≲R−βK−b(N1N2N)−1‖hR,K,(⋆)‖kk1k2k3̃⋅‖h(1)‖k1→k1′‖h(2)‖k2→k2′‖h(3)‖k3′→k3̃.$

The last three factors are easily bounded by 1, so it suffices to bound the tensor hR,K,(⋆).

By definition, this is equivalent to counting the number of lattice points (k, k1, k2, k3) such that k1k2 + k3 = k (and also satisfying the inequalities listed above) and |Ω| ≲ K. Note that

$\big||k_1|^2-|k_2|^2\big|\lesssim R\cdot\max(N_1,R)\eqqcolon K_1,$

so when K ≤ K1, by the Γ condition, |k|2 has at most K1 choices, and hence, k has at most K1N choices. Once k is fixed, the number of choices for (k1, k2, k3) is at most $KN12R2$, which leads to the bound

$‖hR,K,(⋆)‖kk1k2k3̃2≲NCδ⋅KK1NN12R2.$

If, instead, K ≥ K1, then k has at most KN choices, and once k is fixed, the number of choices for (k1, k2, k3) is at most $N13R3$, so we get

$‖hR,K,(⋆)‖kk1k2k3̃2≲NCδ⋅KNN13R3.$

Either way, we get

$‖Xk‖k≲NCε2N−1/2⋅max(R,R1/2N11/2)N2−1,$

which is enough for (3.18) as max(R, N1) ≲ N2.
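The counting behind these bounds can be probed by brute force on a small box: for a fixed output frequency k, the number of triples (k1, k2, k3) on the hyperplane k = k1 − k2 + k3 with |Ω| ≤ K grows with the width K of the level set, which is what the tensor-norm bounds on hR,K,(⋆) encode. The box size and thresholds below are illustrative, far smaller than the dyadic scales of the proof:

```python
from itertools import product

def sq(v):
    return sum(x * x for x in v)

N1 = 4                         # half-width of the (illustrative) frequency box
k = (1, 0, 0)                  # fixed output frequency
counts = {0: 0, 5: 0, 20: 0}   # number of triples with |Omega| <= K, per K

rng = range(-N1, N1 + 1)
for k1 in product(rng, repeat=3):
    for k2 in product(rng, repeat=3):
        # k3 is determined by the convolution constraint k = k1 - k2 + k3
        k3 = tuple(a - b + c for a, b, c in zip(k, k1, k2))
        Omega = sq(k) - sq(k1) + sq(k2) - sq(k3)
        for K in counts:
            if abs(Omega) <= K:
                counts[K] += 1
print(counts[0], counts[5], counts[20])
```

The counts are monotone in K, reflecting the |Ω| ≲ K constraint in the definition of SR,K.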

#### 3. The type (L, L, C) case

Now, suppose that w1, w2, and w3 have type (L, L, C). First, assume that N1 ≤ N2. The same arguments as in Sec. VI B 4 yield

$‖Xk‖k≲(N1N2)−1/2+ε1+ε2N−1R−βK−b⋅max(‖hR,K,(⋆)‖kk1k3̃→k2,‖hR,K,(⋆)‖kk1→k2k3̃).$

The second norm mentioned above is easily bounded by K1/2RN1 using Lemma 2.5, which is clearly enough for (3.18); for the first norm, there are two ways to estimate.

The first way is to use Lemma 2.5 directly, without using the Γ condition, to get

$‖hR,K,(⋆)‖kk1k3̃→k2≲K1/2min(R,N1)N.$

The second way is to use the Γ condition: first fix the value of |k|2, then fix k, and then count $(k1,k3̃)$. This yields

$‖hR,K,(⋆)‖kk1k3̃→k2≲K1/2N1/2(R+R1/2N11/2)min(R,N1)1/2$

assuming K ≤ K1, and a better bound assuming K ≥ K1. Now, plugging in the second bound yields

$‖Xk‖k≲(N1N2)−1/2+ε1+ε2N−1R−βK−b⋅K1/2N1/2(R+R1/2N11/2)min(R,N1)1/2,$

which can be shown to be ≲N−1/2 using the fact that max(R, N1) ≤ N2 and by considering whether R ≤ N1 or R ≥ N1. Moreover, the same estimate can be checked to work if $N1≤N21.1$. If $N1≥N21.1$, we can switch subscripts 1 and 2, in which case we have the weaker bound,

$‖Xk‖k≲(N1N2)−1/2+ε1+ε2N−1R−βK−b⋅K1/2N1/2(R+R1/2N21/2)min(R,N2),$

without the 1/2 power in the last factor; however, this is still ≲N−1/2, provided that $N1≥N21.1$.

#### 4. The type (L, C, C) and (C, L, C) cases

Now, suppose that w1, w2, and w3 have type (L, C, C); the case (C, L, C) is treated similarly. Here, the same arguments as in Sec. VI B 3 imply

$‖Xk‖k≲N1−1N2−1/2+ε1+ε2N−1R−βK−b ×max(‖hR,K,(⋆)‖kk1k3̃→k2,‖hR,K,(⋆)‖k→k1k2k3̃,‖hR,K,(⋆)‖kk1→k2k3̃,‖hR,K,(⋆)‖kk3̃→k1k2).$
(6.21)

The two norms $k→k1k2k3̃$ and $kk1→k2k3̃$ can be estimated by K1/2R min(N1, N2), using Lemma 2.5 only and without the Γ condition, which is clearly enough for (3.18). For the $kk1k3̃→k2$ norm, we can use the estimates in Sec. VI C 3 and get

$‖hR,K,(⋆)‖$