The paper interprets the cubic nonlinear Schrödinger equation as a Hamiltonian system with infinite dimensional phase space. There exists a Gibbs measure which is invariant under the flow associated with the canonical equations of motion. The logarithmic Sobolev and concentration of measure inequalities hold for the Gibbs measures, and here are extended to the k-point correlation function and distributions of related empirical measures. By Hasimoto’s theorem, the nonlinear Schrödinger equation gives a Lax pair of coupled ordinary differential equations for which the solutions give a system of moving frames. The paper studies the evolution of the measure induced on the moving frames by the Gibbs measure; the results are illustrated by numerical simulations. The paper contains quantitative estimates with well-controlled constants on the rate of convergence of the empirical distribution in Wasserstein metric.

Consider the Hamiltonian
(1.1)
on $\mathbb{T} = \mathbb{R}/2\pi\mathbb{Z}$, which gives the canonical equations of motion
(1.2)
so u = P + iQ satisfies the nonlinear Schrödinger equation
(1.3)
When γ = 2, we have the cubic nonlinear Schrödinger equation. The spatial variable is $x \in \mathbb{T}$, and the functions are periodic, so that the system applies to fields parametrized by a circle. Throughout the paper, we write (M, d, μ) for a complete and separable metric space with a Radon (inner regular) probability measure μ on the σ-algebra generated by the Borel subsets. The squared $L^2$ norm $H_1 = \int_{\mathbb{T}} (P^2 + Q^2)\,dx$ is formally invariant under the canonical equations of motion, so we can consider possible invariant measures on
(1.4)
The Gibbs measure on BK for this micro-canonical ensemble is
(1.5)
where W(du) is Wiener loop measure, $\mathbb{I}_{B_K}$ is the indicator function of $B_K$ and $Z_K(\beta)$ is a normalizing constant. Lebowitz, Rose and Speer (Ref. 26) proved existence of such an invariant measure, so that for all K > 0 and $\beta \in \mathbb{R}$ there exists $Z_K(\beta) > 0$ such that $\mu_{K,\beta}$ is a Radon probability measure on $B_K \subset L^2(\mathbb{T}; \mathbb{R}^2)$. When β = 0, we refer to the measure as free Wiener loop measure, indicating that the dynamics are free of potentials. For β < 0, (1.3) is said to be focussing and the Hamiltonian is unbounded below; this is the source of the technical problem, which is addressed by restricting the measure to $B_K$.

Bourgain (Ref. 9) gave an alternative existence proof using random Fourier series, and showed that the measure is invariant under the flow in the sense that the Cauchy problem is well posed on the support. Further refinements include a result of McKean (Ref. 30) that the sample paths are Hölder continuous, and a result from Theorem 1.2(iv) in Ref. 6 that the invariant measure of the micro-canonical ensemble satisfies a logarithmic Sobolev inequality. Random Fourier series fit naturally into Sturm’s theory of metric measure spaces, which we use to reduce some of the analysis to invariant measures on finite-dimensional Hamiltonian systems.

The focusing case for spatial variable $x \in \mathbb{R}$ captures soliton solutions, and Ref. 26 discusses the possible transition of the system between an ambient bounded random field and a soliton solution. For $x \in \mathbb{T}$, the notion of a spatially localized solution is inapplicable, but some of the results are still relevant (Ref. 26).

Bourgain (Ref. 10, p. 128) comments that invariant Gibbs measures for the periodic cubic Schrödinger equation can be constructed on other phase spaces, and one can consider Gibbs measures on $L^2$ that have different normalizations than $\mu_{K,\beta}$.

In Sec. II, we consider tensor products of Hilbert space H and a k-point density matrix. For μ0 a centered Gaussian measure on H, we express a specific integral
as a series of elementary tensors. This calculation involves combinatorial results which are expressed in terms of Knuth’s odd and even decompositions of Young diagrams. In Sec. III, we use concentration of measure results to show that $|u^{\otimes k}\rangle\langle u^{\otimes k}|$ is close to its mean value $J^{(k)}$ on a set of large probability. This statement also holds when we replace $\mu_0$ by the Gibbs measure. In Sec. IV we introduce metric probability measure spaces and show how the infinite-dimensional dynamical system (1.2) can be approximated by finite-dimensional dynamical systems, particularly involving random Fourier series. In particular, we show that $x \mapsto u(x,t)$ is γ-Hölder continuous for 0 < γ < 1/16, in the sense that $\sup\{\|u(x+h,t)-u(x,t)\|_{L^4_x}/|h|^{\gamma} : h \neq 0\}$ is almost surely finite, for fixed $0 < t < t_0$.

We also obtain results on the empirical distributions that arise when we sample solutions of (1.2) with respect to Gibbs measure (1.5), which we use in the numerical experiments in Sec. VII.

Hasimoto (Ref. 20) observed that (1.2) can be expressed as a Lax pair of coupled ordinary differential equations with solutions in SO(3), one of which is the Serret–Frenet system for a moving frame on a curve in $\mathbb{R}^3$. Cruzeiro and Malliavin (Ref. 14) developed stochastic differential geometry for frames, pursuing Cartan’s precedent (Ref. 12). In Secs. V and VI we consider the evolution of the dynamical system corresponding to Hasimoto frames under the Gibbs measure. In Sec. VII we present numerical experiments regarding the solutions, which illustrate the nature of frames that arise from the solutions of (1.2) for typical elements in the support of the Gibbs measure (1.5). In the Appendix of Ref. 20, Hasimoto expressed the change of variables in a polar decomposition u=ρexp(iϕ) where ρ is a probability density and ϕ a phase, and derived Betchov’s intrinsic equation for vortex filaments from the nonlinear Schrödinger equation. We remark that Villani (Ref. 35, p. 691) carries out a similar transformation to interpret the linear Schrödinger equation as a transport problem for densities ρ for a suitable action integral. The current paper is a further step toward introducing transportation methods into PDE.

Let H be a separable complex Hilbert space, with inner product ⟨·∣·⟩ which is linear in the second argument. The algebraic tensor product $H \otimes H$ is identified with the set of finite-rank operators on H, and then we identify the injective tensor product $H \check{\otimes} H$ with the algebra $\mathcal{L}(H)$ of bounded linear operators on H and the projective tensor product $H \hat{\otimes} H$ with the ideal $\mathcal{L}^1(H)$ of trace class operators on H. By the theory of metric tensor products, the dual space of $\mathcal{L}^1(H)$ is canonically $\mathcal{L}(H)$. For $H = L^2$, the identification is
Let $A \in \mathcal{L}^1(H)$ be self-adjoint such that $0 \le A \le I$, and let $\mu_0$ be a Gaussian measure on H of mean zero and covariance A. By the spectral theorem, we can choose an orthonormal basis $(\varphi_j)_{j=1}^{\infty}$ of H such that $A\varphi_j = \alpha_j \varphi_j$, where the spectrum of A is the closure of $\{\alpha_j : j = 1, 2, \dots\}$. Then we introduce mutually independent Gaussian N(0, 1) random variables $(\gamma_j)_{j=1}^{\infty}$ and the vector
(2.1)
so that μ0 is the distribution of u on H, as one easily checks by computing the expectation
(2.2)
Hence A is the mean of rank-one tensors with respect to Gaussian measure
The k-fold tensor product $H^{\otimes k}$ can be completed to give a Hilbert space, so that the space $H^{\otimes_s k}$ of symmetric tensors gives a closed linear subspace. We consider
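An element of $H^{\otimes k} \hat{\otimes} H^{\otimes k}$ determines a linear operator in $\mathcal{L}(H^{\otimes k})$, commonly referred to as a matrix, so $J^{(k)} \in (L^2)^{\otimes_s k} \hat{\otimes} (L^2)^{\otimes_s k}$ gives a k-point density matrix, or equivalently a trace class operator $J^{(k)} \in \mathcal{L}^1(H^{\otimes_s k})$. The following computation of Gaussian moments is known in quantum field theory as Wick’s theorem.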
Lemma 3.3 of Ref. 27 contains calculations regarding $J^{(k)}$ which we have not been able to interpret, particularly line 8 of p. 79. Here we calculate $J^{(2)}$ directly, before addressing the case of general k. Evidently we have $\mathbb{E}(\gamma_j \gamma_\ell \gamma_m \gamma_n) = 0$ if one of the indices j, ℓ, m, n is distinct from all the others; otherwise, we have all the indices equal, or two distinct pairs of equal indices. Hence we have
(2.3)
and we combine the second and fourth of these to obtain
(2.4)
which exhibits the right-hand side as a symmetric tensor, in which the final term shows the integral is not diagonal with respect to the orthonormal basis
of the symmetric tensor product $H \otimes_s H$, hence $J^{(2)}$ is not a multiple of $A \otimes A$.
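The case analysis for the fourth moments can be checked mechanically. The following sketch (Python, standard library only) evaluates $\mathbb{E}(\gamma_j \gamma_\ell \gamma_m \gamma_n)$ for independent N(0, 1) variables from the one-variable moments $\mathbb{E}\gamma^{2p} = (2p-1)!!$, and compares the result with the sum over the three pairings of the indices. The function names are illustrative only and do not come from the paper.

```python
from collections import Counter
from itertools import product


def gaussian_moment(p):
    """E[g**p] for g ~ N(0, 1): zero for odd p, (p - 1)!! for even p."""
    if p % 2 == 1:
        return 0
    result = 1
    for odd in range(p - 1, 0, -2):
        result *= odd
    return result


def mixed_moment(indices):
    """E[prod of gamma_i over the indices], using independence of distinct gamma_i."""
    value = 1
    for multiplicity in Counter(indices).values():
        value *= gaussian_moment(multiplicity)
    return value


def wick_pairings(j, l, m, n):
    """Sum over the three ways of pairing four indices."""
    def delta(a, b):
        return 1 if a == b else 0
    return delta(j, l) * delta(m, n) + delta(j, m) * delta(l, n) + delta(j, n) * delta(l, m)


# Compare the two expressions on every 4-tuple of indices drawn from {1, 2, 3}.
agree = all(
    mixed_moment(idx) == wick_pairings(*idx)
    for idx in product(range(1, 4), repeat=4)
)
print(agree)
```

The moment vanishes as soon as one index occurs an odd number of times, which is the observation used above; the surviving cases are all four indices equal (value 3) and two distinct pairs (value 1).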
For $k \in \mathbb{N}$, let $\Pi_k$ be the set of all partitions of k, so that $\pi \in \Pi_k$ may be expressed as $k = k_1 + k_2 + \dots + k_n$ where the row lengths $k_j \in \mathbb{N}$ have $k_1 \ge k_2 \ge \dots \ge k_n$. Given such π and an n-element subset $\{j_1, \dots, j_n\}$ of $\mathbb{N}$, there is a symmetric tensor
where the sum is over all the permutations σ of $\{j_1, \dots, j_n\}$. The set of all such tensors gives an orthonormal basis of the k-fold symmetric tensor product $H^{\otimes_s k}$.
We express u as in (2.1) and consider the expansion
(2.5)
in terms of this orthonormal basis of $H^{\otimes_s k}$, and look for the terms that do not vanish after integration with respect to $\mu_0$.

Definition II.1
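(Even decomposition). Given $\pi \in \Pi_k$, consider a pair $(\lambda, \rho) \in \Pi_k^2$ with rows $\lambda: k = \ell_1 + \ell_2 + \dots + \ell_n$ where $\ell_j \in \mathbb{N} \cup \{0\}$ and $\rho: k = r_1 + r_2 + \dots + r_n$ where $r_j \in \mathbb{N} \cup \{0\}$ and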
so that λ and ρ have equal numbers of odd rows; here rows may have zero lengths, and the rows are not necessarily in decreasing order. We refer to (λ, ρ) as an even decomposition of π.

Remark II.2.

There are various alternative descriptions of even decompositions. We write $\lambda \sim \rho$ if λ and ρ are partitions that have equal numbers of boxes and equal numbers of odd rows; evidently ∼ is an equivalence relation on the set of partitions. By Ref. 24, Theorem 4, there is a bijection between symmetric matrices A that have entries in $\mathbb{N} \cup \{0\}$ with column sums $c_1, \dots, c_n$ and Young tableaux P that have $c_j$ occurrences of j as entries, under which the number of columns of P of odd length equals the trace of A. Given symmetric matrices A and B with entries in $\mathbb{N} \cup \{0\}$ such that A and B have equal traces and equal totals of entries, the RSK correspondence takes A to P and B to Q, where P and Q are Young tableaux with an equal number of boxes, and their transposed diagrams P′ and Q′ have an equal number of odd rows, so P′ ∼ Q′.

For notational convenience, we also regard $\varphi_j^{\otimes 0}$ as a factor which may be omitted in tensor products. Then given such a triple (π, λ, ρ) and an n-subset $\{j_1, \dots, j_n\}$ of $\mathbb{N}$,
(2.6)
where
(2.7)
gives a nonzero summand in J(k).

Conversely, let $(\lambda, \rho) \in \Pi_k^2$ and suppose that λ and ρ have equal numbers of odd rows, so that after adding zero rows and reordering the rows we have $r_j + \ell_j$ even for all j. Then we introduce $2k_j = \ell_j + r_j$ and after a further reordering write $k = k_1 + k_2 + \dots + k_n$ where $k_j \in \mathbb{N}$ have $k_1 \ge k_2 \ge \dots \ge k_n$, and we have $\pi \in \Pi_k$ as above. Given an n-subset $\{j_1, \dots, j_n\}$ of $\mathbb{N}$, we take $2k_m$ copies of $j_m$ and split them as $\ell_m$ on the bra side and $r_m$ on the ket side of the tensor for m = 1, …, n, making a contribution as in (2.6). We summarize these results as follows.

Proposition II.3.

The integral $J^{(k)}$ is the sum over the summands (2.6) that arise from a $\pi \in \Pi_k$ with n nonzero rows, an n-subset of $\mathbb{N}$, and an even decomposition of π into a pair $(\lambda, \rho) \in \Pi_k^2$ where λ and ρ have equal numbers of odd rows.
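To make the index set of this sum concrete, here is a small enumeration sketch in Python. It assumes that the condition defining an even decomposition is $\ell_j + r_j = 2k_j$ for each row, which is the relation used in the converse construction above; the function names are hypothetical and the code only lists the pairs (λ, ρ), without evaluating the tensors in (2.6).

```python
from itertools import product


def even_decompositions(pi):
    """List pairs (lam, rho) of row-length tuples with lam[j] + rho[j] == 2 * pi[j]
    and sum(lam) == sum(rho) == sum(pi). Rows of length zero are allowed.
    Assumption: this is the reading of the definition taken from 2k_j = l_j + r_j."""
    k = sum(pi)
    found = []
    for lam in product(*(range(2 * row + 1) for row in pi)):
        if sum(lam) != k:
            continue
        rho = tuple(2 * row - l for row, l in zip(pi, lam))
        found.append((lam, rho))
    return found


def odd_rows(rows):
    """Number of rows of odd length."""
    return sum(1 for r in rows if r % 2 == 1)


for pi in [(2,), (1, 1), (2, 1)]:
    pairs = even_decompositions(pi)
    # lam[j] + rho[j] is even, so lam and rho have the same number of odd rows.
    assert all(odd_rows(lam) == odd_rows(rho) for lam, rho in pairs)
    print(pi, pairs)
```

For k = 2 this returns one pair for π = (2) and three pairs for π = (1, 1), matching the two situations in the direct computation of $J^{(2)}$: all four indices equal, or two distinct pairs of equal indices.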

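Let $H = L^2(\mathbb{T}; \mathbb{R})$ and let $(\gamma_j)_{j \in \mathbb{Z}}$ be mutually independent Gaussian N(0, 1) random variables on some probability space $(\Omega, \mathbb{P})$, and let $z_j = (\gamma_j + i\gamma_{-j})/\sqrt{2}$ and $z_{-j} = (\gamma_j - i\gamma_{-j})/\sqrt{2}$ for $j \in \mathbb{N}$. Then we take Brownian loop to be the random Fourier series in the style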
(3.1)
so that Wiener loop W(du) is the distribution of $u \in H$, namely the probability measure induced by the random variable u via $(\Omega, \mathbb{P}) \to (H, W)$. By orthogonality, we have
(3.2)
so that $u \in B_{K/2\pi}$ if and only if $\sum_j \gamma_j^2/j^2 \le K$. By Chernoff’s inequality and independence, we have
(3.3)
The low Fourier modes are the predominant terms since one has the estimate
(3.4)
which also follows from Chebyshev’s inequality and independence.
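The orthogonality relation behind (3.2) can be illustrated numerically. The sketch below samples a truncated random Fourier series and compares the mean square of the sampled function with the sum of squared coefficients. The normalisation of (3.1) is not reproduced here, so the code assumes coefficients $z_j/|j|$ for $0 < |j| \le n$ with $z_{-j} = \bar{z}_j$; the truncation level n, the seed and all function names are illustrative assumptions.

```python
import cmath
import math
import random


def sample_coefficients(n, rng):
    """Coefficients c_j = z_j / |j| for 0 < |j| <= n, where
    z_j = (g_j + i g_{-j}) / sqrt(2) and z_{-j} is the conjugate of z_j
    (assumed normalisation), so that the series is real-valued."""
    coeffs = {}
    for j in range(1, n + 1):
        z = complex(rng.gauss(0.0, 1.0), rng.gauss(0.0, 1.0)) / math.sqrt(2.0)
        coeffs[j] = z / j
        coeffs[-j] = z.conjugate() / j
    return coeffs


def evaluate(coeffs, theta):
    """Value of the truncated series sum_j c_j exp(i j theta)."""
    return sum(c * cmath.exp(1j * j * theta) for j, c in coeffs.items())


def mean_square(coeffs, n):
    """(1 / 2 pi) times the integral of |u|^2 over the circle, by the rectangle rule.
    |u|^2 is a trigonometric polynomial of degree at most 2n, so 2n + 1 equally
    spaced nodes integrate it exactly up to rounding."""
    m = 2 * n + 1
    return sum(abs(evaluate(coeffs, 2.0 * math.pi * i / m)) ** 2 for i in range(m)) / m


rng = random.Random(0)
n = 8
coeffs = sample_coefficients(n, rng)
quadrature = mean_square(coeffs, n)
parseval = sum(abs(c) ** 2 for c in coeffs.values())
imag_part = max(abs(evaluate(coeffs, t).imag) for t in (0.0, 1.0, 2.5))
print(quadrature, parseval, imag_part)
```

Under the assumed normalisation, the sum of squared coefficients equals $\sum_{0<|j|\le n} \gamma_j^2/j^2$, which is the quantity constrained by membership of the ball.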
Let $\mu_\lambda(du) = \zeta(\lambda)^{-1} \exp(\lambda V(u))\, W(du)$ where
so that $\mu_\lambda$ is a probability measure; we can take $V(u) = \int_{\mathbb{T}} u(\theta)^4\, d\theta/(2\pi)$ and W to be Brownian loop measure. Here $\mu_\lambda$ is the Gibbs measure (1.5) with inverse temperature β, but we prefer to work with λ = −β > 0 so that the convexity statements are easier to interpret.

Theorem III.1.

Under the family of Gibbs measures (1.5) associated with the nonlinear Schrödinger equation (NLS) (1.3), the random variable $u \mapsto \langle u^{\otimes k} | T | u^{\otimes k} \rangle$ with $u \in (B_K, L^2, \mu_\lambda)$ and $T \in \mathcal{L}(H^{\otimes_s k})$ satisfies a Gaussian concentration of measure inequality (3.6), the mean is a Lipschitz continuous function of β, and the mean for β = 0 is a sum over partitions of 2k over even decompositions.

The statements in this theorem will be proved in this section. They involve the integral
(3.5)
where $\mu_\lambda$ is the Gibbs measure for NLS. In the defocussing case, the k-particle density matrix of an interacting quantum system with suitable initial conditions converges to its classical analogue; see Ref. 1, (2.16) for the 1D case and Ref. 28 for 2D and 3D.

We can write u = P + iQ for real variables (P, Q) and interpret $\langle u^{\otimes k} | T | u^{\otimes k} \rangle$ as a homogeneous polynomial in (P, Q) of total degree 2k. The following result gives concentration of measure for Lipschitz functions on $(B_K, L^2, \mu_\lambda)$, and shows that k-point matrices are concentrated near their mean value.

Proposition III.2.
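For $T \in \mathcal{L}(H^{\otimes_s k})$ with operator norm $\|T\|$, let $g_T: B_K \to \mathbb{C}$ be given by $g_T(u) = \langle u^{\otimes k} | T | u^{\otimes k} \rangle$. Then there exists α = α(β, K) > 0 such that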
(3.6)

Proof.
Here gT has mean value
(3.7)
Also gT is Lipschitz, with
(3.8)
By the logarithmic Sobolev inequality Theorem 1.2(iv) of Ref. 6, there exists α = α(K, β) > 0 such that
(3.9)
for all continuously differentiable $f: B_K \to \mathbb{R}$, where ∇ refers to the Fréchet derivative. In particular, we choose
and we deduce that the moment generating function
(3.10)
satisfies φ(0) = 1,
hence $r^{-1} \log \varphi(r) \to 0$ as $r \to 0+$. The differential inequality
(3.11)
follows directly from (3.9), hence
so we obtain the concentration inequality
(3.12)
One can conclude the proof by a standard application of Chebyshev’s inequality to the integral for φ in (3.10).□

To make full use of the previous result, one needs to know the mean $\mathrm{trace}(G_\lambda^{(k)} T)$ as in (3.7), which depends upon the measure in (3.5). The following shows how the mean can vary with the inverse temperature β = −λ.

Proposition III.3.
For $g: B_K \to \mathbb{R}$ an L-Lipschitz function, the mean values of g with respect to the measures $\mu_\lambda$ satisfy
(3.13)
where α is the constant in (3.9) for μa, and some λ ∈ (a, b).

Proof.
We observe that log ζ(λ) is a convex function of λ > 0 and by the mean value theorem, there exists a < λ < b such that
(3.14)
Let $W_1(\mu_a, \mu_b)$ be the Wasserstein transportation distance between $\mu_b$ and $\mu_a$ for the cost function $\|u - v\|_{L^2}$, as in p. 34 of Ref. 34. Then by duality we have
By results of Otto and Villani discussed in Ref. 34 pp. 291–292, the logarithmic Sobolev inequality of Theorem 1.2(iv) Ref. 6 implies a transportation cost inequality
(3.15)
in the style of Talagrand, where the relative entropy is
(3.16)
where the final step follows from (3.14). The stated result follows on combining these inequalities.□

Proposition III.4.

The integral $G_0^{(k)}$ from (3.5) is the sum over the terms (2.6) that arise from a $\pi \in \Pi_k$ with n nonzero rows, an n-subset of $\mathbb{N}$, and an even decomposition of π into a pair $(\lambda, \rho) \in \Pi_k^2$ where λ and ρ have equal numbers of odd rows.

Proof.
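The measure $\mu_0$ is a Wiener loop measure restricted to $B_K$. For any sequence $(\varepsilon_n)_{n \in \mathbb{Z}} \in \{\pm 1\}^{\mathbb{Z}}$, the sequence $(\gamma_n)_{n \in \mathbb{Z}}$ of mutually independent N(0, 1) Gaussian random variables has the same distribution as the sequence $(\varepsilon_n \gamma_n)_{n \in \mathbb{Z}}$. Also, the condition $\sum_{n \in \mathbb{Z} \setminus \{0\}} \gamma_n^2/n^2 \le K$ does not change under this transformation. Let $u_\varepsilon(\theta) = \sum_{j \neq 0} \varepsilon_j \zeta_j e^{ij\theta}/|j|$. We therefore have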
where the integration over ε is with respect to the Haar probability measure on the Cantor group $\{\pm 1\}^{\mathbb{Z}}$. The measure on $\{\pm 1\}$ is associated with tossing a fair coin, and Haar measure is the product of such probability measures. We can therefore compute the inner integral in this expression for $G_0^{(k)}$ by the same calculation that led to the corresponding statement for $J^{(k)}$, since we only used the even decomposition of partitions to derive (2.6).□

We have
(3.17)
where the sets are independent under the Gaussian measure dP, so we have a substitute for (2.7). Conditioning on the event BK,
(3.18)
where P(BK) satisfies (3.3) and there is an approximate formula
(3.19)
In a similar spirit, we give a concentration result for k-fold stochastic integrals. This result resembles the integrability criteria of Ref. 11, which relate to a single variable. Let $H_0^1$ be the homogeneous Sobolev space of $v \in L^2(\mathbb{T}; \mathbb{C})$ that are absolutely continuous with derivative $v' \in L^2(\mathbb{T}; \mathbb{C})$ and with $\int_{\mathbb{T}} v(x)\,dx = 0$. Let $h_j \in H_0^1$ for j = 1, …, k be such that $\sum_{j=1}^k \int_{\mathbb{T}} (h_j')^2\,dx \le 1$, and consider $\Phi: B_K^k \to \mathbb{R}^k$ given by
(4.1)
The following result describes the distribution of this $\mathbb{C}^k$-valued random variable.

Proposition IV.1.
Let $\nu_K$ be the probability measure on $\mathbb{C}^k$ that is induced from $\mu_{K,\beta}^{\otimes k}$ by Φ. Then there exists $\alpha_K > 0$ independent of k such that
(4.2)
for all $G \in C_c^1(\mathbb{C}^k; \mathbb{R})$. The distribution $\nu_K$ has mean $x_0$ and satisfies
(4.3)

Proof.
We observe that for all $u = (u_j)_{j=1}^k \in B_K^k$ and $v = (v_j)_{j=1}^k \in B_K^k$ we have
(4.4)
so $\Phi: (B_K^k, \ell^2(L^2)) \to (\mathbb{R}^k, \ell^2)$ is Lipschitz with constant one. Each metric probability space $(B_K, L^2, \mu_{K,\beta})$ satisfies a logarithmic Sobolev inequality with constant $\alpha_K > 0$ by Ref. 6, and the probability space $(B_K^k, \ell^2(L^2), \mu_{K,\beta}^{\otimes k})$ is a direct product of the metric probability spaces $(B_K, L^2, \mu_{K,\beta})$, hence also satisfies a logarithmic Sobolev inequality
(4.5)
with constant αK independent of k, by Sec. 22 of Ref. 35.
We introduce $x_0 = \int_{\mathbb{C}^k} x\, \nu_K(dx)$ and consider
(4.6)
which satisfies φ(0) = 1, φ′(0) = 0 and the differential inequality
(4.7)
follows from (4.2). This gives
(4.8)
which we integrate against $e^{-\|y\|^2/2}$, where $y \in \mathbb{C}^k = \mathbb{R}^{2k}$, to obtain
(4.9)

The probability space $(B_K, L^2, \mu_{K,\beta})$ has a tangent space associated with infinitesimal translations. Let $H^1$ be the Sobolev space of $v \in L^2(\mathbb{T}; \mathbb{C})$ that are absolutely continuous with derivative $v' \in L^2(\mathbb{T}; \mathbb{C})$; let $H^{-1} = (H^1)^*$ be the linear topological dual space for the pairing $\langle v, w \rangle = \int_{\mathbb{T}} v(x) \overline{w(x)}\,dx/(2\pi)$ as interpreted via Fourier series. Then there is a Radonifying triple of continuous linear inclusions
(4.10)
associated with the Gibbs measure $\mu_K$. The space $H^1$ has orthonormal basis $(h_n)_{n=-\infty}^{\infty} = (e^{in\theta}/\sqrt{n^2+1})_{n=-\infty}^{\infty}$ and the covariance matrix of Wiener loop is $R_0 = \mathrm{diag}[1/(1+n^2)]_{n=-\infty}^{\infty}$ with respect to this basis. By Cauchy–Schwarz, we have
(4.11)
for all ɛ > 0. By such simple estimates, one can deduce that there exists, for each β and K > 0, a self-adjoint, nonnegative and trace class operator R such that
(4.12)
which gives the covariance matrix of the Gibbs measure on H1. This is essentially Gβ(1), up to the identification of Hilbert spaces in (4.10).

Cameron and Martin computed the density with respect to the Wiener measure that results from the linear translation $u \mapsto u + v$ for $v \in H^1$; their results extend to Gibbs measure with some modifications.

We momentarily suppress the dependence of functions upon time, and consider, for $p, q \in H^1$, the linear transformation $P + iQ \mapsto P + p + i(Q + q)$. Cameron and Martin proved that free Wiener measure (β = 0) is mapped to a measure that is absolutely continuous with respect to the free Wiener measure. Likewise, Gibbs measure is mapped to a measure absolutely continuous with respect to Gibbs measure. The total space of the tangent bundle to the K sphere in $L^2$ is
which has fibres that are subspaces of H1. With this in mind, we make a polar decomposition $P + iQ = \kappa e^{i\sigma}$ with $\kappa = \sqrt{P^2 + Q^2}$ and consider $\tau = \partial\sigma/\partial x$.

Proposition IV.2.
For $p, q \in H^1$ the functional
(4.13)
is a Lipschitz functional of $P + iQ = \kappa e^{i\sigma}$ such that
(4.14)

Proof.
Note that $P + iQ \mapsto \kappa$ is 1-Lipschitz with
Also $\kappa^2 \|\mathrm{Hess}\, \sigma\|$ is bounded. We have
(4.15)
which is bounded on L2 with norm Λ where
(4.16)
By the concentration of measure theorem for νK, we deduce the stated inequality.□

Definition.
We say that (M, d, μ) satisfies $T_2(\alpha)$ if
(4.17)
for all probability measures ν that are of finite relative entropy with respect to μ. The notation credits Talagrand, who developed the theory of such transportation inequalities. Otto and Villani showed that LSI(α) implies T2(α); see Refs. 34 and 35.

Theorem IV.3.
Let (M, d, μ) be a metric probability space that satisfies $T_2(\alpha)$; let $(M^N, \ell^2(d), \mu^{\otimes N})$ be the direct product metric probability space. Let $L_N^{\xi} = N^{-1} \sum_{j=1}^N \delta_{\xi_j}$ be the empirical distribution for $\xi = (\xi_j)_{j=1}^N \in M^N$, where the $\xi_j$ are distributed as μ. Then the concentration inequality holds
(4.18)
for p = 1, 2.

Proof.
The map between metric spaces
(4.19)
associated with the empirical distribution is $1/\sqrt{N}$-Lipschitz. Let $(x_j)_{j=1}^N, (y_j)_{j=1}^N \in M^N$ and consider the probability measure on M × M given by
(4.20)
which has marginals $L_N^x = N^{-1} \sum_{j=1}^N \delta_{x_j}$ and $L_N^y = N^{-1} \sum_{j=1}^N \delta_{y_j}$, hence gives a transport plan with cost
(4.21)
Suppose that (M, d, μ) satisfies $T_2(\alpha)$. Then we take N independent samples $\xi_1, \dots, \xi_N$, each distributed as μ, so they have joint distribution $\mu^{\otimes N}$ on $(M^N, \ell^2(d))$, where by independence (Ref. 5, Theorem 1.2), $(M^N, \ell^2(d), \mu^{\otimes N})$ also satisfies $T_2(\alpha)$. By forming the empirical distribution, we obtain a map $L_N: (M^N, \ell^2(d)) \to (\mathrm{Prob}\, M, W_2)$. Then $\varphi(\xi) = \sqrt{N}\, W_p(L_N^{\xi}, \mu)$ is 1-Lipschitz $(M^N, \ell^2(d)) \to \mathbb{R}$, since by the triangle inequality and (4.21),
(4.22)
hence φ satisfies the concentration inequality
(4.23)
Then the stated concentration inequality follows from Chebyshev’s inequality.□
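The Lipschitz step in this proof rests on the fact that the diagonal plan $N^{-1}\sum_j \delta_{(x_j, y_j)}$ is one admissible coupling of the two empirical measures, so its cost bounds $W_2$ from above. The sketch below checks this for real-valued samples, where $W_2$ between two empirical measures with N atoms each is attained by the sorted matching. The sample sizes, distributions and function names are illustrative assumptions, not taken from the paper.

```python
import math
import random


def w2_empirical(xs, ys):
    """W_2 between the empirical measures of two real samples of equal size.
    On the real line the monotone (sorted) matching is optimal."""
    n = len(xs)
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(sorted(xs), sorted(ys))) / n)


def coupled_cost(xs, ys):
    """Cost of the diagonal plan that pairs x_j with y_j, which equals
    N^{-1/2} times the l^2 distance between the two sample vectors."""
    n = len(xs)
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(xs, ys)) / n)


rng = random.Random(1)
N = 200
xs = [rng.gauss(0.0, 1.0) for _ in range(N)]
ys = [rng.gauss(0.5, 2.0) for _ in range(N)]

# The optimal cost never exceeds the cost of the diagonal plan.
gap = coupled_cost(xs, ys) - w2_empirical(xs, ys)
print(w2_empirical(xs, ys), coupled_cost(xs, ys), gap)
```

The inequality `gap >= 0` is the one-dimensional instance of the bound $W_2(L_N^x, L_N^y) \le N^{-1/2} \ell^2(d)(x, y)$ used to show that the scaled map is 1-Lipschitz.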

Theorem IV.3 gives a metric version of Sanov’s theorem on the empirical distribution; see p. 70 of Ref. 15. There are related results in Bolley’s thesis (Ref. 7). By Ref. 13, Theorem IV.3 applies to Haar probability measure on SO(3) and normalized area measure on $S^2$, as is relevant in Sec. VII below. However, to ensure that $\mathbb{E} W_2(L_N, \mu) \to 0$ as $N \to \infty$, it is convenient to reduce to one-dimensional distributions, where we use the following integral formula. For distributions μ and ν on $\mathbb{R}$ with cumulative distribution functions F and G, we write $W_p(\mu, \nu) = W_p(F, G)$.

Proposition IV.4.
Let $\xi_1$ be a real random variable with finite fourth moment and cumulative distribution function F. Let $\xi_1, \dots, \xi_N$ be mutually independent copies of $\xi_1$, giving an empirical measure $L_N^{\xi} = N^{-1} \sum_{j=1}^N \delta_{\xi_j}$ with cumulative distribution function $F_N^{\xi}(t)$. Then
(4.24)

Proof.
Let H be Heaviside’s unit step function; then $F_N^{\xi}(t) = N^{-1} \sum_{j=1}^N H(t - \xi_j)$, so $N(F_N(t) - F(t))$ is a sum of mutually independent and bounded random variables with mean zero. Also, as in the weak law of large numbers, we have
where the integral is finite by Chebyshev’s inequality since $\mathbb{E}\xi^4$ is finite. Compare with the results of Ref. 8, which are effective for very large N.□
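For distributions on the line, $W_1(F, G) = \int |F(t) - G(t)|\,dt$ is a standard identity, and it makes the one-dimensional case easy to experiment with. The sketch below computes $\int_0^1 |F_N(t) - t|\,dt$ exactly for samples from the uniform distribution on [0, 1] and averages over repeated draws. The choice of the uniform distribution, the number of trials and the function names are assumptions made for illustration; the displayed bound (4.24) itself is not reproduced.

```python
import random


def w1_to_uniform(samples):
    """Integral over [0, 1] of |F_N(t) - t| dt, where F_N is the empirical CDF of
    samples in [0, 1] and F(t) = t is the uniform CDF. Exact on each interval
    between consecutive order statistics, where F_N is constant."""
    pts = sorted(samples)
    n = len(pts)
    knots = [0.0] + pts + [1.0]
    total = 0.0
    for i in range(n + 1):
        a, b, c = knots[i], knots[i + 1], i / n  # F_N equals c on [a, b)
        if c <= a:
            total += ((b - c) ** 2 - (a - c) ** 2) / 2
        elif c >= b:
            total += ((c - a) ** 2 - (c - b) ** 2) / 2
        else:
            total += ((c - a) ** 2 + (b - c) ** 2) / 2
    return total


def mean_w1(n, trials, rng):
    """Monte Carlo estimate of the expected W_1 distance for n uniform samples."""
    return sum(w1_to_uniform([rng.random() for _ in range(n)]) for _ in range(trials)) / trials


rng = random.Random(2)
estimates = {n: mean_w1(n, 200, rng) for n in (100, 400)}
print(estimates)
```

Since the variance of $F_N(t)$ is $F(t)(1 - F(t))/N$, one expects the estimate to shrink roughly like $N^{-1/2}$ as N grows; the code only reports the estimates and does not assert a rate.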

Proposition IV.5.
Suppose that $\xi_1$ has distribution μ on $S^2$ where μ is absolutely continuous with respect to the normalized area measure $\nu_1$, and $d\mu = f\, d\nu_1$ where f is bounded with $\|f\|_\infty \le M$. Let $\xi_j$ be mutually independent copies of $\xi_1$, and let $L_N^{\xi}$ be the empirical measure from N samples. Then

Proof.
Let $g: S^2 \to \mathbb{R}$ be 1-Lipschitz, and suppose without loss of generality that g has $\int_{S^2} g(x)\, \nu_1(dx) = 0$; then g is bounded with $\|g\|_\infty \le \pi$. Given δ > 0, by considering squares for coordinates in longitude and colatitude, we choose disjoint and connected subsets $E_\ell$ with diameter $\mathrm{diam}(E_\ell) \le \delta$ and $\nu_1(E_\ell) \le \delta^2$ and $\mu(E_\ell) \le M\nu_1(E_\ell)$ such that $\bigcup_\ell E_\ell = S^2$. We can arrange that there are $S_\delta$ such sets $E_\ell$, where $S_\delta \le C/\delta^2$. Let $\mathcal{F}$ be the σ-algebra that is generated by the $E_\ell$, take conditional expectations in $L^2(\nu_1)$, and observe that
(4.25)
where we have bounds
(4.26)
(4.27)
and the identity
so by Cauchy–Schwarz, we have
(4.28)
where $L_N^{\xi}(E_\ell) - \mu(E_\ell) = N^{-1} \sum_{j=1}^N (\mathbb{I}_{E_\ell}(\xi_j) - \mu(E_\ell))$ is a sum of independent random variables with mean zero and variance $N^{-1} \mu(E_\ell)(1 - \mu(E_\ell))$, so
(4.29)
Choosing $\delta = N^{-1/4}$ we make both (4.26) and (4.29) small, which gives the stated result.□

Remark.

Consider the discrete metric δ on [0, 1], and observe that $\mathbb{I}_A$ gives a 1-Lipschitz function on [0, 1] for all open A ⊆ [0, 1]. Then we have $\int \mathbb{I}_A(x)(d\mu(x) - d\nu(x)) = \mu(A) - \nu(A)$, so by maximizing over A we obtain the total variation norm $\|\mu - \nu\|_{\mathrm{var}}$. With μ a continuous measure and ν a purely discrete measure, such as an empirical measure, we have $\|\mu - \nu\|_{\mathrm{var}} = 1$. Propositions IV.4 and IV.5 depend upon the choice of cost function as well as the measures.

The Gibbs measure (1.5) was defined using random Fourier series. This construction gives us a sequence of finite-dimensional probability spaces which approximate the space $(B_K, L^2, \mu_{K,\beta})$. To make this idea precise, we recall some definitions from Ref. 33.

Definition IV.6

(Convergence of metric measure spaces).

  • For M a nonempty set, a pseudometric is a function $\delta: M \times M \to [0, \infty]$ such that
    (4.30)
    then (M, δ) is a pseudometric space.
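  • Given pseudometric spaces $(M_1, \delta_1)$ and $(M_2, \delta_2)$, a coupling is a pseudometric $\delta: M \times M \to [0, \infty]$, where $M = M_1 \sqcup M_2$, such that $\delta|_{M_1 \times M_1} = \delta_1$ and $\delta|_{M_2 \times M_2} = \delta_2$.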

  • Suppose that $\hat{M}_1 = (M_1, \delta_1, \mu_1)$ and $\hat{M}_2 = (M_2, \delta_2, \mu_2)$ are complete separable metric spaces endowed with probability measures. Consider a coupling (M, δ) and a probability measure π on $M_1 \times M_2$ with marginals $\pi_1 = \mu_1$ and $\pi_2 = \mu_2$. Then the $L^2$ distance between $\hat{M}_1$ and $\hat{M}_2$ is
    (4.31)

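Let $D_n$ be the Dirichlet projection taking $\sum_{k=-\infty}^{\infty} (a_k + ib_k) e^{ik\theta}$ to $\sum_{k=-n}^{n} (a_k + ib_k) e^{ik\theta}$. Following Ref. 10, we truncate the random Fourier series of $u = P + iQ = \sum_{k=-\infty}^{\infty} (a_k + ib_k) e^{ik\theta}$ to $u_n = P_n + iQ_n = \sum_{k=-n}^{n} (a_k + ib_k) e^{ik\theta}$ and correspondingly modify the Hamiltonian to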
(4.32)
for the real canonical variables $((a_k, b_k))_{k=-n}^{n}$. Then the canonical equations become a coupled system of ordinary differential equations in the Fourier coefficients. We introduce the polar decomposition $P_n + iQ_n = \kappa_n e^{i\sigma_n}$, and observe that in terms of these noncanonical variables, the Hamiltonians $H_1^{(n)} = \int_{\mathbb{T}} \kappa_n^2\, d\theta$ and
(4.33)
are invariants under the flow.
The corresponding Gibbs measure is
(4.34)
in which W(dun) is the finite dimensional projection of Wiener loop measure and is defined in terms of the Fourier modes as
(4.35)
Consider the map u(x, t) ↦ u(x + h, t) of translation in the space variable. This commutes with $D_n$, and the Gibbs measures $\mu_{K,\beta}^{(n)}$ are all invariant under this translation. In terms of Fourier components, we have $M = B_K$ and
(4.36)
with the canonical inclusions of metric spaces $(M_1, \ell^2) \subset (M_2, \ell^2) \subset \dots \subset (M, \ell^2)$ defined by adding zeros at the start and end of the sequences, which gives a sequence of isometric embeddings for the $\ell^2$ metric on sequences. When we identify $(a_j, b_j)_{j=-n}^{n}$ with $\sum_{j=-n}^{n} (a_j + ib_j) e^{ij\theta}$, we have a corresponding embedding for the $L^2$ metric.
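The two structural facts used here, that translation commutes with the Dirichlet projection and that the projection does not increase the $L^2$ norm, are easy to see on Fourier coefficients. The following sketch stores a trigonometric polynomial as a dictionary from the mode k to the coefficient $a_k + ib_k$; the sample coefficients and the function names are illustrative assumptions.

```python
import cmath


def dirichlet_projection(coeffs, n):
    """Keep the Fourier modes with |k| <= n; coeffs maps k to a_k + i b_k."""
    return {k: c for k, c in coeffs.items() if abs(k) <= n}


def translate(coeffs, h):
    """Fourier coefficients of x -> u(x + h): the k-th mode is multiplied by exp(i k h)."""
    return {k: c * cmath.exp(1j * k * h) for k, c in coeffs.items()}


def l2_norm_sq(coeffs):
    """Squared L^2 norm for normalised measure on the circle, by Parseval."""
    return sum(abs(c) ** 2 for c in coeffs.values())


# An arbitrary test function with modes -6, ..., 6.
u = {k: complex(1.0 / (1 + k * k), 0.5 / (1 + abs(k))) for k in range(-6, 7)}
n, h = 3, 0.7

lhs = dirichlet_projection(translate(u, h), n)   # translate, then project
rhs = translate(dirichlet_projection(u, n), h)   # project, then translate
print(lhs == rhs, l2_norm_sq(dirichlet_projection(u, n)) <= l2_norm_sq(u))
```

Translation only changes the phase of each coefficient, so it also preserves the $L^2$ norm and any quantity that depends on the moduli $|a_k + ib_k|$ alone.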

Here $(M_n, L^2, \mu_{K,\beta}^{(n)})$ is a finite-dimensional manifold and a metric probability space. We now show that these spaces converge to $(M, L^2, \mu_{K,\beta})$ as $n \to \infty$.

Lemma IV.7.

  • Suppose that $0 < -\beta K < 3/(14\pi^2)$. Then $\hat{M}_n = (M_n, L^2, \mu_{K,\beta}^{(n)})$ has
    (4.37)
  • The measures $\mu_{K,\beta}^{(n)}$ converge in total variation norm to $\mu_{K,\beta}$ as $n \to \infty$.

Proof.

  • This is proved in Theorem 3.2 of Ref. 4; see also Example 3.8 of Ref. 33. Let $W_2(\mu^{(n)}, \mu)$ be the Wasserstein transportation distance between free Brownian loop measure μ and the pushforward of μ under the Dirichlet projection, $\mu^{(n)} = D_n \sharp \mu$, for the cost function $\|u - v\|_{L^2}^2$.

    The key point is
    (4.38)
  • The measures $\mu_{K,\beta}^{(n)}$ converge in total variation norm to $\mu_{K,\beta}$, by an observation of McKean (Ref. 30) in his step 7. By Riesz’s theorem, there exists $c_4 > 0$ such that $\int_{\mathbb{T}} |D_n u|^4\, d\theta \le c_4 \int_{\mathbb{T}} |u|^4\, d\theta$, and by Ref. 26 the integral
    (4.39)
    is finite, so we can use the integrand as a dominating function to show
    (4.40)

Proposition IV.8.
Let $(M_n \sqcup M, \delta_n)$ be a coupling of $(M_n, L^2)$ and $(M, L^2)$, and let $\varphi: (M_n \sqcup M, \delta_n) \to \mathbb{R}$ be a Lipschitz function. Then
(4.41)

Proof.
We can introduce a pseudometric $\delta_n$ on $M_n \sqcup M$ that restricts to the $L^2$ metric on $M_n$ and M, and apply (4.42) to Lipschitz functions $\varphi: (M_n \sqcup M, \delta_n) \to \mathbb{R}$. We can regard $M_n \times M$ as a subset of $(M_n \sqcup M) \times (M_n \sqcup M)$. Note that for a Lipschitz function φ on a pseudometric space (M, δ) such that |φ(x) − φ(y)| ≤ δ(x, y) for all $x, y \in M$, we have
(4.42)

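For example, with $u = \sum_{k=-\infty}^{\infty} (a_k + ib_k) e^{ik\theta}$ we introduce $D_n u = \sum_{k=-n}^{n} (a_k + ib_k) e^{ik\theta}$; then $\varphi(u) = \|D_n u\|_{L^2}$ and $\psi(u) = \|u - D_n u\|_{L^2}$ give Lipschitz functions $\varphi, \psi: (B_K, L^2) \to \mathbb{R}$.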

Proposition IV.9.

For 0 < γ < 1/16 and fixed $0 < t < t_0$, the map $x \mapsto u(x, t) \in L^4$ is γ-Hölder continuous, so that $\sup\{\|u(x+h,t) - u(x,t)\|_{L^4_x}/|h|^{\gamma} : h \neq 0\}$ is almost surely finite.

Proof.
We prove that for $0 < t < t_0$, there exists $C = C(t_0)$ such that
(4.43)
so $x \mapsto u(x,t) \in L^4_x$ is γ-Hölder continuous for 0 < γ < 1/16 by the Kolmogorov–Čentsov theorem, as in Ref. 23. To obtain (4.43), let $J_{3/8}(x) = \sum_{k \neq 0} e^{ikx}/|k|^{3/8}$, so that $J_{3/8}(x)|x|^{5/8}$ is bounded on (−π, π) and $J_{3/8} \in L^{4/3}(-\pi, \pi)$. Then by Young’s inequality for convolutions, with $|D|: e^{inx} \mapsto |n| e^{inx}$, we have
(4.44)
Then by Bourgain’s estimate on the solutions of NLS from Ref. 3, there exists C(t0) such that
(4.45)
where
(4.46)
By basic results about Gaussian series, the first factor on the right-hand side is bounded by the eighth power of
(4.47)
so we obtain (4.43). Also, by rotation invariance of the Gibbs measure, we have
which is
so $x \mapsto u(x, t)$ is 1/4-Hölder continuous along solutions in the support of the Gibbs measure.□

We recall the Hasimoto (Ref. 20) transform, which associates with a solution $u \in C^2$ of (1.3) a space curve in $\mathbb{R}^3$ with moving frame {T, N, B}; Hasimoto considered the case β = −1/2. In the present context, u is associated with the space derivative of a tangent vector T to a unit speed space curve, so the curvature is $\kappa = \|\partial T/\partial x\|$. We have a polar decomposition $u = \kappa e^{i\sigma}$ where $\sigma(x,t) = \int_0^x \tau(y,t)\, dy$ and τ is the torsion. Then the Serret–Frenet formula is
(5.1)
so the frame develops along the space curve. Let X = [T N B] ∈ SO(3), and let $\Omega_1(x, t)$ be the matrix in (5.1). When $\Omega_1(\cdot, t) \in C(\mathbb{T}; so(3))$, the solution X(·, t) ∈ C([0, 2π]; SO(3)) to (5.1) is 2π periodic up to a multiplicative monodromy factor U(t) ∈ SO(3) such that X(x + 2π, t) = X(x, t)U(t).
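A frame equation of this type can be integrated so that the solution stays in SO(3) at every step. The sketch below is one possible discretisation, not the scheme used in the paper's experiments: it freezes the coefficient matrix at the midpoint of each step and multiplies by the exact exponential of the frozen skew-symmetric matrix (Rodrigues' formula). It assumes the standard Serret–Frenet matrix with rows of X equal to T, N, B; the curvature and torsion profiles, the step count and the function names are hypothetical.

```python
import math


def matmul(a, b):
    """Product of two 3x3 matrices stored as nested lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(3)) for j in range(3)] for i in range(3)]


def frenet_matrix(kappa, tau):
    """Assumed form of the skew-symmetric Serret-Frenet coefficient matrix."""
    return [[0.0, kappa, 0.0], [-kappa, 0.0, tau], [0.0, -tau, 0.0]]


def exp_skew(omega, s):
    """exp(s * omega) for a 3x3 skew-symmetric matrix, by Rodrigues' formula."""
    w = math.sqrt(omega[0][1] ** 2 + omega[0][2] ** 2 + omega[1][2] ** 2)
    identity = [[1.0 if i == j else 0.0 for j in range(3)] for i in range(3)]
    if w == 0.0:
        return identity
    sq = matmul(omega, omega)
    a, b = math.sin(s * w) / w, (1.0 - math.cos(s * w)) / (w * w)
    return [[identity[i][j] + a * omega[i][j] + b * sq[i][j] for j in range(3)] for i in range(3)]


def integrate_frame(kappa, tau, length, steps):
    """Approximate the solution of dX/dx = Omega_1(x) X with X(0) = I,
    freezing Omega_1 at the midpoint of each step."""
    h = length / steps
    frame = [[1.0 if i == j else 0.0 for j in range(3)] for i in range(3)]
    for n in range(steps):
        x = (n + 0.5) * h
        frame = matmul(exp_skew(frenet_matrix(kappa(x), tau(x)), h), frame)
    return frame


def orthogonality_defect(frame):
    """Largest entry of X X^T - I in absolute value."""
    transpose = [[frame[j][i] for j in range(3)] for i in range(3)]
    gram = matmul(frame, transpose)
    return max(abs(gram[i][j] - (1.0 if i == j else 0.0)) for i in range(3) for j in range(3))


# Smooth 2*pi-periodic curvature and torsion, chosen only for illustration.
monodromy = integrate_frame(
    lambda x: 1.0 + 0.5 * math.cos(x), lambda x: 0.3 * math.sin(2.0 * x), 2.0 * math.pi, 400
)
defect = orthogonality_defect(monodromy)
print(defect)
```

With X(0) = I, the matrix returned after one period approximates the monodromy factor U in the relation X(x + 2π) = X(x)U. Because each step multiplies by a rotation, the orthogonality defect stays at rounding level regardless of the step size, while the accuracy of the frame itself still depends on the number of steps.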
The frame also evolves with respect to time, so that with μ=σtβκ2, we have
(5.2)
Let Ω2 denote the matrix in Eq. (5.2). For a pair of coupled ordinary differential equations (ODE) dX/dx − Ω1X = 0 and dX/dt − Ω2X = 0, the corresponding Lax pair is

Lemma V.1

(Hasimoto). If u is a $C^2$ function that satisfies the nonlinear Schrödinger equation, then the coupled pair of differential equations is consistent in the sense that there exists a local solution of the pair of ODE, and there exists a local solution of the Lax pair.

Thus the frame $X \in SO(3)$ evolves along the solution $P + iQ \in B_K$ of NLS, and we can regard d/dx − Ω1 and d/dt − Ω2 as connections for this evolution. Both of the coefficient matrices are real and skew symmetric. One can check that a solution of the integral equation
(5.3)
satisfies
so smooth solutions are given in terms of an integral equation.
From the Serret–Frenet formulas the components of the acceleration along the space curve satisfy
(5.4)
The total curvature of the space curve is
(5.5)
which is an invariant under the flow associated with the NLS.

Proposition V.2.
Let
(5.6)
  • −H2 is convergent almost surely and is invariant under the flow associated with NLS,

  • H2 represents the area that is enclosed by the contour {u(x): x ∈ [0, 2π]} in the complex plane, and
    (5.7)
  • $H_2^2 \le 4^{-1} H_1 H_3$ for β > 0, where $H_1$ is given in Eq. (5.5) and $H_3$ is given in Eq. (1.1).

Proof.

  • The invariance of H2 was noted in Ref. 31 and can be proved by differentiating through the integral sign and using the canonical equations. We have a series
    which converges almost surely. This follows since
    (5.8)
    where the final integral involves the series
    (5.9)
    which is a martingale; by Fatou’s Lemma, we have
    (5.10)
    so the series in (5.9) is marginally exponentially integrable. Hence the integrals in (5.8) converge by the $L^p$ martingale maximal theorem for all $1 < p < \infty$.
  • One can write $H_2$ in terms of $P + iQ = \kappa e^{i\sigma}$, and make a change of variables to obtain
    and
    To interpret this as an area, we write θ ∈ [0, 2π] for the space variable and extend functions on [0, 2π] to harmonic functions on the unit disc via the Poisson kernel. Then by Green’s theorem, we can express this invariant in terms of the area of the image of D under the map to P + iQ, as in
    (5.11)
    This is similar to Lévy’s stochastic area, as discussed in Example 5.1 of Ref. 21.
  • We then have
    which is bounded in terms of other invariants, with $H_2^2 \le 4^{-1} H_1 H_3$.□

Remark V.3.

  • Bourgain (Ref. 10) interprets $H_2$ in terms of momentum (5.70).

  • With $M_n$ as in (4.36), the space $C^\infty(M_n; \mathbb{R})$ is a Poisson algebra for the bracket $\{f, g\} = \sum_{j=-n}^{n} \frac{\partial(f, g)}{\partial(a_j, b_j)}$, and the canonical equations arise with Hamiltonian $H_3^{(n)}$ on $M_n$. Let $\mathbb{Q}$ be the ring of quaternions, and extend the Poisson bracket to $C^\infty(M_n; \mathbb{Q})$ via {f ⊗ X, g ⊗ Y} = {f, g} ⊗ XY. Then $(\mathbb{R}^3, \times)$ may be realised as $\mathbb{Q}/\mathbb{R}I$ and $(so(3), [\cdot,\cdot]) \cong (\mathbb{R}^3, \times)$; see Example 2.3 of Ref. 37. This Lie algebra is also the Lie algebra of SU(2), and there is a 2-to-1 group homomorphism SU(2) → SO(3). Hence some of the following results may be expressed in terms of SU(2), which is the form in which a Lax pair for the nonlinear Schrödinger equation was presented; see Refs. 36 and 38 (Subsection 8.3.2).

  • Suppose that $T \in C^2([0,a] \times [0,b]; S^2)$, so that T(x, t) represents the spin of the particle at (x, t) and let
    (5.12)
    which corresponds to our (5.5). One can consider infinitesimal variations $T\mapsto T + T\times V$ and thereby compute the derivative of $E$ with respect to $T$. In the focussing case β = −1, Ding (Ref. 16) introduces a symplectic structure on the space of such maps such that the Hamiltonian flow is
    (5.13)
    which corresponds to Heisenberg’s equation for the one-dimensional ferromagnet, and gives the top entry of (5.2). There is a gauge equivalence between the focussing NLS and Heisenberg’s ferromagnet. There is also a gauge equivalence between the defocussing NLS and a hyperbolic version of the ferromagnet in which the standard cross product is modified. We have
    (5.14)
  • As in Ref. 17, the space
    with pointwise multiplication is a loop group, and its Lie algebra may be regarded as

The aim of the next section is to interpret the Lax pair suitably for solutions which are typically not differentiable and for which we have a pair of stochastic differential equations with random matrix coefficients.

The compact Lie group $SO(3)$ of real orthogonal matrices with determinant one is a subset of $M_{3\times 3}(\mathbb{R})$, which has the scalar product $\langle X, Y\rangle = \mathrm{trace}(XY^{\top})$ and associated metric $d(X, Y) = \langle X-Y, X-Y\rangle^{1/2}$ such that $\langle XU, YU\rangle = \langle X, Y\rangle$ and $d(XU, YU) = d(X, Y)$ for all $U\in SO(3)$ and $X,Y\in M_{3\times 3}(\mathbb{R})$. The Lie group $SO(3)$ has tangent space at the identity element given by the skew-symmetric matrices $so(3)$, so the tangent space $T_XSO(3)$ at $X\in SO(3)$ consists of $\{\Omega X: \Omega\in so(3)\}$, where $so(3)$ is a Lie algebra for $[x, y] = xy-yx$, $x, y\in so(3)$, and the exponential map $so(3)\to SO(3)$ is surjective.

Consider the differential equation
(6.1)
where $t\in[0,1]$ is the evolving time, and $X\in SO(3)$. We consider a column vector $x\in\mathbb{R}^3$ satisfying $\frac{dx}{dt}=\Omega x$, which gives a velocity, and $\|x\| = 1$ is preserved because $\Omega\in so(3)$. Following Otto’s interpretation (Ref. 34) of optimal transport in the setting of partial differential equations, one constructs a weakly continuous family of probability measures $\tilde\nu_t$ on $S^2$ for $t\in[0,1]$, which satisfy the weak continuity equation,
(6.2)
Likewise the differential equation (6.1) gives a weakly continuous family of probability measures, νt on SO(3). If the integral
(6.3)
and $\Omega X$ is locally bounded, then $\Omega X$ is locally Lipschitz and $\nu_t$ is the unique solution to the weak continuity equation by Theorem 5.34 of Ref. 34. Recall that for the operator norm on $M_{3\times3}(\mathbb{R})$, $\|A\|=\sup\{\|Ay\|: y\in\mathbb{R}^3, \|y\|\le 1\}$, we have $\|X\| = 1$ for all $X\in SO(3)$, so $\|\Omega X\|\le\|\Omega\|$.
The weak continuity equation is equivalent to
(6.4)
for all $f\in C(SO(3);\mathbb{R})$, where $X_0\mapsto X_t(X_0)$ gives the dependence of the solution of (6.1) on the initial condition. The velocity field $\Omega X$ is associated with a transportation plan taking $\nu_{t_1}$ to $\nu_{t_2}$ which is possibly not optimal, but does give an upper bound on the Wasserstein distance for the cost $d(X,Y)^2$ on $SO(3)$ of
(6.5)
Then by Theorem 23.9 of Ref. 35, the path $(\nu_t)$ of probability measures is absolutely continuous, so there exists $\ell\in L^1[0,1]$ such that $W_2(\nu_{t_2},\nu_{t_1})\le\int_{t_1}^{t_2}\ell(t)\,dt$, and $1/2$-Hölder continuous, so there exists $C>0$ such that $W_2(\nu_{t_2},\nu_{t_1})\le C|t_2-t_1|^{1/2}$.

Example VI.1.

  • If $\Omega_t\in M_{3\times3}(\mathbb{R})$ is skew, and $X_t$, $Y_t$ give solutions of the differential equation
    (6.6)
    then $d(X_t, Y_t) = d(X_0, Y_0)$. We deduce that if $X_0$ is distributed according to Haar measure on $SO(3)$, then $X_t$ is also distributed according to Haar measure, since the measure, the metric and solutions are all preserved via $X\mapsto XU$.
  • As an alternative, we can consider $X_0$ to have first column $[0; 0; 1]$ and observe the evolution of the first column $T$ of $X$ under (6.1), where $T$ evolves on $S^2$.

We now consider the case in which $\Omega$ as in (5.2) is an $so(3)$-valued random variable over $(M, \mu_{K,\beta}, L^2)$.

Proposition VI.2.
Suppose that Ω = Ω(u(·, t)) where u(x, t) is a solution of NLS and that
(6.7)
converges. Then for almost all $u$ with respect to $\mu_{K,\beta}$, there exists a flow $(\nu_t(dX; u))$ of probability measures on $SO(3)$.

Proof.
Each solution $u$ of NLS determines $\Omega$ so that the associated ODE (6.1) transports the initial distribution of $X_0\in SO(3)$ to a probability measure on $SO(3)$; then we average over the $u$ with respect to $\mu_K(du)$. This Gibbs measure is invariant under the NLS flow, so by Fubini’s theorem
(6.8)
converges. Hence the condition (6.3) is satisfied, for almost all u, and we can invoke Theorem 23.9 of Ref. 35.□

For the finite-dimensional $M_n$ of (4.36) and solutions $u_n=\kappa_ne^{i\sigma_n}$, the modified Hasimoto differential equations are
(6.9)
and
(6.10)
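involves $\tau_n=\frac{\partial\sigma_n}{\partial x}$ and $\bigl(\frac{\partial\kappa_n}{\partial x}\bigr)^2+\tau_n^2\kappa_n^2=\bigl(\frac{\partial P_n}{\partial x}\bigr)^2+\bigl(\frac{\partial Q_n}{\partial x}\bigr)^2$, which is continuous, so there exists a solution $X^{(n)}(x,t)\in SO(3)$. We can interpret the solutions as elements of a fibre bundle over $(M_n,\mu_K^{(n)},L^2)$ with fibres that are isomorphic to $SO(3)$.
Let $P + iQ = \kappa e^{i\sigma}$ be a solution of NLS and let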
(6.11)

Proposition VI.3.

  • Let $P + iQ = \kappa e^{i\sigma}$ be a solution of NLS with initial data $P(x,0)+iQ(x,0)\in B_K\cap H^1$. Then $\Omega_1$ in (6.11) gives an $so(3)$-valued vector field in $L^2(\kappa^2(x,t)\,dx)$.

  • Let $P + iQ = \kappa e^{i\sigma}$ be a solution of NLS with initial data $P(x,0)+iQ(x,0)\in H^1\cap B_K$, and let $P_n+iQ_n=\kappa_ne^{i\sigma_n}$ be the corresponding solution of the NLS truncated in Fourier space, giving matrix $\Omega_1^{(n)}$. Let $X_t^{(n)}(x)$ be a solution of (6.9) and suppose that $X^{(n)}$ converges weakly in $L^2$ to $X_t(x)$. Then $X_t$ gives a weak solution of (5.1).

Proof.

  • With $\omega=\sqrt{\kappa^2+\tau^2}$, we have
    where the entries of $\Omega_1^2$ are bounded by $\kappa^2+\tau^2$, hence
    (6.12)
    for $u\in H^1$; however, there is no reason to suppose that $\tau$ itself is integrable with respect to $dx$.
  • By (5.4) and (5.5), we have $\kappa\Omega_1\in L_x^2$ for all $u\in H^1$. Moreover, Bourgain (Ref. 9) has shown that for initial data $P(x,0)+iQ(x,0)=\kappa(x,0)e^{i\sigma(x,0)}$ in $H^1\cap B_K$, the map
    (6.13)
    is Lipschitz continuous for $0\le t\le t_0$ with Lipschitz constant depending upon $t_0, K>0$. We have
    (6.14)
    where the right-hand side is integrable with respect to $x$ by the Hardy–Littlewood maximal inequality and (6.12). Suppose that $X^{(n)}$ is a solution of (5.1). We take $\tau_n$ to be locally bounded. Then by applying the Cauchy–Schwarz inequality to the integral
    we deduce that
    (6.15)
    where the integral is finite by (6.12). Also
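    for $0 < x_1 < x_2 < \dots < x_N < 2\pi$. We have
    (6.16)
    so for $Z\in C([0,2\pi];M_{3\times3}(\mathbb{R}))$ and the inner product on $M_{3\times3}(\mathbb{R})$, we have
    (6.17)
    where $\kappa_n\to\kappa$ in $H^1$, so with norm convergence, we have $\frac{\partial\kappa_n}{\partial x}\to\frac{\partial\kappa}{\partial x}$ in $L^2$, and $\kappa_n\Omega^{(n)}\to\kappa\Omega_1$ as $n\to\infty$, and with weak convergence in $L^2$, we have $X^{(n)}\to X$, so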
    (6.18)

The simulation of this differential equation computes $X_x\in S^2$ starting with $X_0=[0;0;1]$ and produces a frame $\{X_x, \Omega_xX_x, X_x\times\Omega_xX_x\}$ of orthogonal vectors. Geodesics on $S^2$ are the curves such that the principal normal is parallel to the position vector, namely the great circles. For a geodesic, $X_x\times\Omega_xX_x$ is perpendicular to the plane that contains the great circle.

Let $P + iQ = \kappa e^{i\sigma}$ be a solution of NLS and let
(6.19)

Proposition VI.4.

  • Let $P+iQ=\kappa e^{i\sigma}$ be a solution of NLS with initial data $P(x,0)+iQ(x,0)\in B_K$. Then $x\mapsto\int_0^x\Omega_2(y,t)\,dy$ gives an $so(3)$-valued stochastic process of finite quadratic variation on $[0,2\pi]$ almost surely with respect to $\mu_K(dPdQ)$.

  • Let $P+iQ=\kappa e^{i\sigma}$ be a solution of NLS with initial data $P(x,0)+iQ(x,0)\in H^1\cap B_K$, and let $P_n+iQ_n=\kappa_ne^{i\sigma_n}$ be the corresponding solution of the NLS truncated in Fourier space, giving matrix $\Omega_2^{(n)}$. Let $X_t^{(n)}$ be a solution of (6.10). Then $X_t^{(n)}$ converges in $L_x^2$ norm to $X_t$ as $n\to\infty$, where $X_t$ gives a weak solution of (5.2).

Proof.

  • The essential estimate is
    (6.20)
    The function $\sigma$ is a progressively measurable stochastic process adapted with respect to a suitable filtration, and with differential satisfying an Itô integral equation (Ref. 18). Therefore, we can control the $\kappa\tau$ term via
    (6.21)
    which is a bounded martingale transform of Wiener loop. This formula is reminiscent of Lévy’s stochastic area as in Example 5.1 of Ref. 21.
  • By (5.4) and (5.5), we have $\Omega_2\in L_x^2$ for all $u\in H^1$. Bourgain (Ref. 9) has shown that for initial data $P(x,0)+iQ(x,0)=\kappa(x,0)e^{i\sigma(x,0)}$ in $H^1\cap B_K$, the map
    (6.22)
    is Lipschitz continuous for $0\le t\le t_0$ with Lipschitz constant depending upon $t_0, K>0$. We have
    where the final integral is part of the Hamiltonian. With $Z\in C(\mathbb{T};M_{3\times3}(\mathbb{R}))$, we have the integral equation for the pairing ⟨·, ·⟩ on $L^2([0,2\pi];M_{3\times3}(\mathbb{R}))$
    (6.23)
    Consider the variational differential equation in $L^2([0,2\pi];M_{3\times3}(\mathbb{R}))$
    (6.24)
    where $\Omega_2^{(n)}(x,t)$ and $\Omega_2^{(m)}(x,t)-\Omega_2^{(n)}(x,t)$ are skew.

We introduce a family of matrices $U^{(n)}(x;t,s)$ such that $U^{(n)}(x;t,r)U^{(n)}(x;r,s)=U^{(n)}(x;t,s)$ for $t>r>s$ and $U^{(n)}(x;t,t)=I$ such that
(6.25)
Then the variational equation has solution
Then
(6.26)
so from this differential inequality we have
(6.27)
Now $X^{(m)}(0)-X^{(n)}(0)\to0$ and $\Omega_2^{(m)}(s)-\Omega_2^{(n)}(s)\to0$ in $L_x^2$ norm as $n,m\to\infty$, so there exists $X(x,t)\in L_x^2$ such that $X(x,t)-X^{(n)}(x,t)\to0$ in $L_x^2$ norm as $n\to\infty$.
We deduce that
(6.28)
so we have a weak solution of the ODE.□

Let $\Omega_2^{(n,u_n)}(x,t)$ be the Fourier truncated matrix that corresponds to a solution $u_n$ of the Fourier truncated equation NLS$_n$, then let $X^{(n,u_n)}(x,t)$ be the solution of the ODE (6.10). By Proposition VI.2, the map $u_n\mapsto X^{(n,u_n)}(\cdot,t)$ pushes forward the modified Gibbs measure $\mu_K^{(n)}$ to a measure on $[C(M_n;SO(3)), L^2]$ that satisfies a Gaussian concentration of measure inequality with constant $\alpha(\beta,K)/n^2$; compare (3.9).

Corollary VI.5.
For each $Z\in L^2([0,2\pi];M_{3\times3}(\mathbb{R}))$, introduce the $\mathbb{R}$-valued random variable on $(M_n,L^2,\mu_K^{(n)})$ by
(6.29)
  • Then the distribution $\nu^{(n)}$ of $Z_n$ satisfies the Gaussian concentration inequality
    (6.30)
  • Let $\nu_N^{(n)}=N^{-1}\sum_{j=1}^N\delta_{Z_n^{(j)}}$ be the empirical distribution of $N$ independent copies of $Z_n$. Then $W_1(\nu_N^{(n)},\nu^{(n)})\to0$ almost surely as $N\to\infty$.

Proof.

  • As with $u_n$, we introduce the corresponding data for another solution $v_n$. As in (6.27), we have
    (6.31)
    For given initial condition $X^{(n,v_n)}(0)=X^{(n,u_n)}(0)$, and $T>0$, we can take the supremum over $t$, then integrate this with respect to $x$ and obtain
    (6.32)
    so $\Omega^{(u)}\mapsto X^{(u)}$ is a Lipschitz function $L^2([0,2\pi]\times[0,T];so(3))\to L^2([0,2\pi];L^\infty([0,T];\mathbb{R}^3))$. By Bourgain’s results, there exists $C>0$ such that
    (6.33)
    so $u_n\mapsto X^{(n,u_n)}$ is a Lipschitz function on $L_x^2$, albeit with a constant growing with $n$. Thus we can push forward the modified Gibbs measure $(M_n,L^2,\mu_{K,\beta}^{(n)})\to L^2([0,2\pi];M_{3\times3}(\mathbb{R}))$ so that the image measure satisfies a Gaussian concentration inequality with constant $\alpha(\beta,K)/n^2$ dependent upon $n$. For each $Z\in L^2([0,2\pi];M_{3\times3}(\mathbb{R}))$, we introduce $Z_n$ as in (6.29), so that $u_n\mapsto Z_n$ is a $Cn$-Lipschitz function from $(M_n,L^2,\mu_{K,\beta}^{(n)})$ to $\mathbb{R}$. The random variable $Z_n$ therefore satisfies the Gaussian concentration inequality (6.30).
  • By Theorem IV.3, we can use the Borel–Cantelli Lemma to show that
    where by Proposition IV.4, $\mathbb{E}W_1(\nu_N^{(n)},\nu^{(n)})\to0$ as $N\to\infty$. □

Consider a coupling of $(M_n,L^2,\mu_{K,\beta}^{(n)})$ and $(M,L^2,\mu_{K,\beta})$ involving measure $\pi_n$. For any bounded continuous $\varphi:\mathbb{C}\to\mathbb{R}$ we can consider
(6.34)
where

Proposition VI.6.
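Let $(\varphi_j)_{j=1}^\infty$ be a dense sequence in $\mathrm{Ball}(C_c(\mathbb{C};\mathbb{R}))$ and $(Y_\ell)_{\ell=1}^\infty$ a dense sequence in $\mathrm{Ball}(L^2)$. Then there exists a subsequence $(n_k)$ such that
(6.35)
converges as $n_k\to\infty$ for all $j,\ell\in\mathbb{N}$.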

Proof.
We can introduce a metric so that $\mathcal{M}=\bigsqcup_{n=1}^\infty M_n\sqcup M$ becomes a complete and separable metric space, and we can transport $\mu_n$ onto $\mathcal{M}$. Then $\omega=2^{-1}\mu+\sum_{n=1}^\infty2^{-n-1}\mu_n$ is a probability measure on $\mathcal{M}$, and $\mu_n$ is absolutely continuous with respect to $\omega$, so $d\mu_n=f_n\,d\omega$ for some probability density function $f_n\in L^1(\omega)$. By convergence in total variation from Lemma IV.7(ii), there exists $f\in L^1(\omega)$ such that $f_n\to f$ in $L^1$ as $n\to\infty$. Given a bounded sequence $(g_n)_{n=1}^\infty$ in $L^\infty(\omega)$, there exists $g\in L^\infty(\omega)$ and a subsequence $(n_k)$ such that
(6.36)

Remark VI.7.

For $u\in M$, we have $u_n=D_nu\in M_n$ so that $u_n\to u$ in $L^2$ norm as $n\to\infty$. It is plausible that (6.34) tends to 0 as $n\to\infty$, but we do not have a proof. Unfortunately, the constants are not sharp enough to allow us to use Proposition IV.8 to deduce $W_2$ convergence for the distributions on $SO(3)$.

Our objective in this section is to obtain a (random) numerical approximation to the solution of (6.9). We consider the case where the parameter β in (1.3) is equal to 0. Note that in this case, the Gibbs measure reduces to Wiener loop measure and stochastic processes with the Wiener loop measure as their law are by definition Brownian loop. Equation (6.9) is a partial differential equation (PDE) with respect to the space variable x, while the parameter of a stochastic process in a stochastic differential equation (SDE) is colloquially referred to as time. To avoid confusion, in this section we refer to x as s; whereas the time variable t is suppressed.

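Recall the polar decomposition $P+iQ=\kappa e^{i\sigma}$, where $\kappa=\sqrt{P^2+Q^2}$ and $\sigma$ is such that $\tau=\frac{\partial\sigma}{\partial s}$. Define $\sigma_\varepsilon(P,Q)=\tan^{-1}\bigl(\frac{PQ}{P^2+\varepsilon^2}\bigr)$ as the regularised Itô integral of $\tau$. The Itô differential can be written as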
where
We can write (6.9) in the form of an SDE, including a correction to convert from a Stratonovich SDE into an Itô SDE, as follows
(7.1)
where,
(7.2)
As justified above, $P$ and $Q$ are each a Brownian bridge with period $T=2\pi$; thus they can be expressed in terms of Brownian motions $W_1$ and $W_2$; that is, $P(s)=W_1(s)-sW_1(2\pi)/(2\pi)$ and likewise for $Q$. Equation (7.1) is now written as a standard Itô SDE,
(7.3)
where $A$, $B$ and $C$ are defined as in Eq. (7.2). The resulting stochastic process $X_s\in SO(3)$ is then used to rotate the unit vector $y_0=[0,0,1]$ on $S^2$ to $y_s=X_sy_0$, the third column of $X_s$. The sample paths of this process can be described by construction of a frame $\{y_s, y_s', y_s\times y_s'\}$. In order to simulate this SDE, we make use of a numerical scheme for matrix SDEs in $SO(3)$ developed by Marjanovic and Solo (Ref. 29). This involves a single-step geometric Euler–Maruyama method, called g-EM, in the associated Lie algebra. Figure 1 demonstrates a sample path of $y_s$ generated via this method, and the code used to simulate a sample path is available (Ref. 25). The sample paths start off on the great circle perpendicular to the $y$-axis, and so have constant binormal $y_s\times y_s'$. As a sample path extends past the great circle, the binormal vector at each point deviates slowly; thus a sample path can be thought of as a precessing orbit.
FIG. 1.

The figure demonstrates a sample path of the stochastic process $X_s$, which is a solution to Eq. (7.3). As $X_s\in SO(3)$, the path is visualised as the action of $X_s$ applied to a unit vector in $\mathbb{R}^3$. The numerical solution shown is for $s\in[0,10]$, and has a step size of $h=10^{-5}$.
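
The Brownian bridge representation $P(s)=W_1(s)-sW_1(2\pi)/(2\pi)$ used above can be illustrated with a short sketch. This is a minimal Python example and not the authors' MATLAB code; the function name `brownian_bridge`, the grid size `num_steps` and the seed are assumptions made here for illustration only.

```python
import math
import random

def brownian_bridge(num_steps, period=2.0 * math.pi, rng=None):
    """Sample P(s_k) = W(s_k) - s_k * W(period) / period on a uniform grid."""
    rng = rng or random.Random()
    h = period / num_steps
    # Brownian motion on the grid: cumulative sum of independent N(0, h) increments.
    w = [0.0]
    for _ in range(num_steps):
        w.append(w[-1] + rng.gauss(0.0, math.sqrt(h)))
    s = [k * h for k in range(num_steps + 1)]
    # Subtract the linear interpolation of the endpoint so the path is pinned at both ends.
    p = [w[k] - s[k] * w[-1] / period for k in range(num_steps + 1)]
    return s, p

# Hypothetical usage: one bridge for P; Q would be drawn independently in the same way.
s, p = brownian_bridge(1000, rng=random.Random(0))
print(p[0], p[-1])
```

Both printed endpoint values should vanish up to floating-point rounding, which is the pinning property that distinguishes the loop from free Brownian motion.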

The Itô process $y_s$ is derived from the solution to Eq. (7.3) and takes values in $\mathbb{R}^3$. Let $\hat y_{s,h}$ denote the numerical approximation to $y_s$ on $[0,T]$ with step size $h$, which is calculated using the g-EM method. The approximation error converges to zero in the $L^2$ space of Itô processes as the step size $h\to0$,
(7.4)
for some ɛ > 0 (see Piggott and Solo, Ref. 32). A value of ɛ = 1/4 allows us to maintain control of the implied constants on the interval [0, 1], and $h$ is taken to be $10^{-5}$. We apply g-EM to Eq. (7.3) on the interval [0, 10], upon which a smaller value of $h$ would be welcomed. However, we are attempting to calculate a distribution, so we need a large number of sample paths.
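
The idea behind a geometric Euler–Maruyama step, namely updating through the exponential map so that each iterate stays in $SO(3)$, can be sketched as follows. This is only an illustration under stated assumptions: it simulates driftless Brownian motion on $SO(3)$ with the update $X_{k+1}=X_k\exp(\widehat{\Delta W})$, and does not reproduce the coefficients $A$, $B$, $C$ or the Itô correction of Eq. (7.3). All function names are hypothetical and the code is not the authors' implementation.

```python
import math
import random

def hat(v):
    """Skew-symmetric matrix in so(3) associated with a vector v in R^3."""
    a, b, c = v
    return [[0.0, -c, b], [c, 0.0, -a], [-b, a, 0.0]]

def matmul(A, B):
    """Product of two 3x3 matrices stored as nested lists."""
    return [[sum(A[i][k] * B[k][j] for k in range(3)) for j in range(3)]
            for i in range(3)]

def exp_so3(v):
    """Rodrigues' formula for the matrix exponential of hat(v)."""
    theta = math.sqrt(sum(x * x for x in v))
    K = hat(v)
    K2 = matmul(K, K)
    if theta < 1e-4:
        # Taylor expansions avoid cancellation for very small rotation angles.
        a = 1.0 - theta ** 2 / 6.0
        b = 0.5 - theta ** 2 / 24.0
    else:
        a = math.sin(theta) / theta
        b = (1.0 - math.cos(theta)) / theta ** 2
    return [[(1.0 if i == j else 0.0) + a * K[i][j] + b * K2[i][j]
             for j in range(3)] for i in range(3)]

def gem_brownian_path(num_steps, h, rng):
    """Exponential-map Euler steps for driftless Brownian motion on SO(3)."""
    X = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]
    path = []
    for _ in range(num_steps):
        # Gaussian increment in the Lie algebra, variance h in each coordinate.
        dW = [rng.gauss(0.0, math.sqrt(h)) for _ in range(3)]
        # Multiplying by a rotation keeps X orthogonal up to rounding error.
        X = matmul(X, exp_so3(dW))
        path.append([X[0][2], X[1][2], X[2][2]])  # third column, y = X [0, 0, 1]
    return X, path

def orthogonality_defect(X):
    """Largest absolute entry of X X^T - I."""
    return max(abs(sum(X[i][k] * X[j][k] for k in range(3))
                   - (1.0 if i == j else 0.0))
               for i in range(3) for j in range(3))

X, path = gem_brownian_path(1000, 1e-3, random.Random(1))
print(orthogonality_defect(X))
```

The design point is that a plain Euler–Maruyama update $X_{k+1}=X_k+X_k\widehat{\Delta W}$ would drift off the group, whereas the exponential update stays on it by construction, so the third column remains on $S^2$.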

The computational complexity of simulating a single sample path is $O(T/h)$, where $T$ denotes the length of the interval simulated. Therefore, for a total of $N$ samples, the computational complexity of our simulation algorithm is $O(NT/h)$. We run our simulations using a machine equipped with an 8-core Intel Xeon Gold 6248R central processing unit (CPU) with a clock speed of 2993 MHz; we take advantage of integrated parallelisation in MATLAB. With $h=10^{-5}$ and $N=2\times10^6$ the algorithm takes around 1 week to run on our system.
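
As a rough worked instance of the $O(NT/h)$ count, assuming every one of the $N$ paths is simulated on the full interval $[0,10]$ with the stated step size (an assumption made here, since the text does not spell this out):

```latex
\[
\frac{T}{h} = \frac{10}{10^{-5}} = 10^{6}\ \text{steps per path},
\qquad
\frac{NT}{h} = (2\times 10^{6})\times 10^{6} = 2\times 10^{12}\ \text{steps in total}.
\]
```

Under the same assumption, a run time of about one week (roughly $6\times10^{5}$ seconds) would correspond to an aggregate throughput of the order of $3\times10^{6}$ steps per second across all cores.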

Since the sample paths are constrained to S2 the points ys can be specified in spherical coordinates of longitude θs ∈ [−π, π) and colatitude ϕs ∈ [0, π]. Figure 2 demonstrates the empirical joint distribution of θs and ϕs for two different values of s. As can be observed, the distribution of (θs, ϕs) varies with s. We hypothesise that the angles θs and ϕs evolve to become statistically independent, and that ys will eventually be uniformly distributed on the sphere. In the remainder of the section, we test this hypothesis statistically.
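
The passage from a point on $S^2$ to the pair (longitude, colatitude) can be sketched as below. The axis convention (colatitude measured from the third coordinate axis, longitude in the plane of the first two) is an assumption made for this illustration; the source does not state which convention its code uses. Note also that `math.atan2` returns values in $(-\pi,\pi]$ rather than $[-\pi,\pi)$, which differs only on a set of measure zero.

```python
import math

def to_angles(y):
    """Return (longitude, colatitude) of a nonzero vector y in R^3."""
    y1, y2, y3 = y
    r = math.sqrt(y1 * y1 + y2 * y2 + y3 * y3)
    theta = math.atan2(y2, y1)                      # longitude in (-pi, pi]
    phi = math.acos(max(-1.0, min(1.0, y3 / r)))    # colatitude in [0, pi]
    return theta, phi

def from_angles(theta, phi):
    """Inverse map: unit vector with the given longitude and colatitude."""
    return [math.sin(phi) * math.cos(theta),
            math.sin(phi) * math.sin(theta),
            math.cos(phi)]

# Hypothetical usage on the starting point y_0 = [0, 0, 1] (the north pole).
print(to_angles([0.0, 0.0, 1.0]))
```

The clamp inside `acos` guards against values such as 1.0000000000000002 that can arise from rounding when a simulated point is normalised.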

FIG. 2.

Histograms of the joint distribution of the third column of Xs at two timesteps, s = 1 (left) and s = 10 (right). The distribution lies on the sphere S2 and thus the axes are chosen as the longitude θs and colatitude ϕs. With respect to these axes the marginals of the distribution become more independent over time.

We start by calculating the Wasserstein distance W1(ν1, ν2) between probability measures ν1 and ν2 on S2, which are absolutely continuous with respect to area and have disintegrations
where $f_j$ ($j=1,2$) are probability density functions on $[-\pi,\pi]$ that give the marginal distributions of $\nu_j$ in the longitude $\theta$ variable, and $g_j$ in the colatitude variable. Let $F_j$ be the cumulative distribution function of $f_j(\theta)$ and $G_j$ be the cumulative distribution function of $g_j(\phi)\sin\phi\,d\phi$. We measure $W_1(\nu_1,\nu_2)$ in terms of one-dimensional distributions. Given distributions on $\mathbb{R}$ with cumulative distribution functions $F_1$ and $F_2$, we write $W_1(F_1,F_2)$ for the Wasserstein distance between the distributions for cost function $|x-y|$. Let $\psi:[-\pi,\pi]\to[-\pi,\pi]$ be an increasing function that induces $f_2(\theta)$ from $f_1(\theta)$; then
In particular, for $f_1(\theta)=1/(2\pi)$ and $g_1(\phi)=1/2$, we have a product measure $\nu_1(d\theta\,d\phi)=(4\pi)^{-1}\sin\phi\,d\phi\,d\theta$ giving normalized surface area on the sphere. Then $F_1(\theta)=(\theta+\pi)/(2\pi)$ and $F_2(\psi(\theta))=(\theta+\pi)/(2\pi)$, so $\tau\mapsto\psi(2\pi(\tau-1/2))$ for $\tau\in[0,1]$ gives the inverse function of $F_2$. We deduce that
(7.5)
and
(7.6)
Hence the Wasserstein distance can be bounded in terms of the cumulative distribution functions by
(7.7)
where we have used the triangle inequality to obtain a more symmetrical expression involving the Wasserstein distances for the marginal distributions and the G conditional distributions, namely the dependence of the colatitude distribution on longitude.

For each $s\in[0,10]$, let $F^{\theta_s}$ and $G^{\phi_s}$ be the marginal cumulative distribution functions (CDFs) of $\theta_s$ and $\phi_s$ respectively. For $N\in\mathbb{N}$, denote by $F_N^{\theta_s}$ and $G_N^{\phi_s}$ the empirical CDFs of $\theta_s$ and $\phi_s$. We generate empirical CDFs $F_N^{\theta_s}$ and $G_N^{\phi_s}$ with $s=0.3,0.6,0.9,\dots,6.0$, and $N=10^5$. Figure 3 demonstrates that $W_1(F_1,F_N^{\theta_s})$ and $W_1(G_1,G_N^{\phi_s})$ each decrease with increasing $s$. As a consequence of Theorem IV.3 and Proposition IV.4, for $N=10^5$ with probability at least 0.99 it holds that $W_1(F_N^{\theta_s},F^{\theta_s})\le0.025$ and $W_1(G_N^{\phi_s},G^{\phi_s})\le0.018$. Thus, we observe that $F^{\theta_s}$ converges to $F_1$ and $G^{\phi_s}$ converges to $G_1$.
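
For distributions on the line with cost $|x-y|$, the distance $W_1$ equals the $L^1$ distance between the two CDFs, so the comparison of an empirical CDF with $F_1(\theta)=(\theta+\pi)/(2\pi)$ or $G_1(\phi)=(1-\cos\phi)/2$ can be sketched numerically as below. This is a minimal illustration using a midpoint rule on a grid; the function names, grid size and synthetic uniform samples are assumptions made here, and the samples stand in for simulated $(\theta_s,\phi_s)$ rather than reproducing the paper's data.

```python
import bisect
import math
import random

def uniform_longitude_cdf(theta):
    """F_1(theta) = (theta + pi) / (2 pi) on [-pi, pi]."""
    return (theta + math.pi) / (2.0 * math.pi)

def uniform_colatitude_cdf(phi):
    """G_1(phi) = (1 - cos(phi)) / 2 on [0, pi]."""
    return (1.0 - math.cos(phi)) / 2.0

def w1_empirical_vs_cdf(samples, cdf, lo, hi, grid=20000):
    """Approximate the integral of |F_N - F| over [lo, hi] by the midpoint rule."""
    xs = sorted(samples)
    n = len(xs)
    step = (hi - lo) / grid
    total = 0.0
    for k in range(grid):
        x = lo + (k + 0.5) * step
        empirical = bisect.bisect_right(xs, x) / n  # empirical CDF at x
        total += abs(empirical - cdf(x)) * step
    return total

# Synthetic points that are uniform on the sphere, used only to exercise the code.
rng = random.Random(0)
thetas = [rng.uniform(-math.pi, math.pi) for _ in range(5000)]
phis = [math.acos(1.0 - 2.0 * rng.random()) for _ in range(5000)]
print(w1_empirical_vs_cdf(thetas, uniform_longitude_cdf, -math.pi, math.pi))
print(w1_empirical_vs_cdf(phis, uniform_colatitude_cdf, 0.0, math.pi))
```

For samples that really are uniform on the sphere both printed distances should be small and shrink as the sample size grows, which is the behaviour the empirical bounds quoted above quantify.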

FIG. 3.

The plots involve the difference between the CDFs of two marginals. For $\theta_s$, the predicted CDF $F_1(\theta)=(\theta+\pi)/(2\pi)$ is compared with the empirical CDF, $F_N^{\theta_s}$. The Wasserstein distance between $F_1(\theta)$ and $F_N^{\theta_s}$ is displayed on the left. For $\phi_s$, the predicted CDF $G_1(\phi)=(1-\cos(\phi))/2$ is compared with the empirical CDF $G_N^{\phi_s}$. The Wasserstein distance between $G_1(\phi)$ and $G_N^{\phi_s}$ is displayed on the right. The empirical measures considered are created using $N=10^5$ samples and evaluated at each of the datapoints indicated on the graphs ($s=0.3,0.6,\dots,6.0$).


We run a total of 22 hypothesis tests to examine the evolution of the joint distribution of the angles $\theta_s$ and $\phi_s$. In order to account for multiple testing, we set the significance level of each test to 0.00045, leading to an overall level of 0.01. First, we generate sample paths to obtain $N=10^5$ realisations of $(\theta_s,\phi_s)$ for each value of $s=0.3,0.6,0.9,\dots,6.0$. For each $s$, we test the null hypothesis $H_{0,s}$ that the angles $\theta_s$ and $\phi_s$ are statistically independent, against the alternative hypothesis $H_{1,s}$ that they are dependent. To this end, we rely on a widely used nonparametric independence test, which is based on the Hilbert–Schmidt Independence Criterion (HSIC) dependence measure (Refs. 2 and 19); the implementation is due to Jitkrittum et al. (Ref. 22). It is observed that while the null hypothesis is rejected for $s=0.3,\dots,2.1$, the test is unable to reject $H_{0,s}$ for $s=2.4,\dots,6.0$ at (an overall) significance level 0.01.
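
The per-test level quoted above is consistent with a Bonferroni-type correction, although the text does not name the correction, so this reading is an assumption. The arithmetic is:

```latex
\[
\alpha_{\text{test}} = \frac{0.01}{22} \approx 4.55\times 10^{-4},
\qquad
22 \times 0.00045 = 0.0099 \le 0.01 .
\]
```

The count of 22 would match the 20 values $s=0.3,0.6,\dots,6.0$ used for the independence tests together with two goodness-of-fit tests, but the source does not itemise the tests, so this breakdown is also an inference.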

We run two Kolmogorov–Smirnov goodness-of-fit tests for $s=10$ as follows. The first tests the null hypothesis $H_0^{\theta_s}$ that $\theta_s$ is distributed according to $F_1$ against the alternative that it is not; the second tests the null hypothesis $H_0^{\phi_s}$ that $\phi_s$ is distributed according to $G_1$ against the alternative that it is not. At significance level 0.01, the tests are unable to reject the null hypotheses $H_0^{\theta_s}$ and $H_0^{\phi_s}$.
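
A one-sample Kolmogorov–Smirnov test of this kind can be sketched as follows. This is a minimal pure-Python illustration, not the authors' implementation: it computes the statistic $D_N=\sup_x|F_N(x)-F(x)|$ and compares it with the standard large-sample critical value $\sqrt{-\ln(\alpha/2)/(2N)}$. The function names and the synthetic uniform sample are assumptions made for the example.

```python
import math
import random

def ks_statistic(samples, cdf):
    """Kolmogorov-Smirnov statistic sup |F_N - F| for a continuous target CDF."""
    xs = sorted(samples)
    n = len(xs)
    d = 0.0
    for i, x in enumerate(xs, start=1):
        f = cdf(x)
        # The empirical CDF jumps from (i - 1)/n to i/n at x; check both sides.
        d = max(d, i / n - f, f - (i - 1) / n)
    return d

def ks_critical_value(n, alpha):
    """Asymptotic critical value sqrt(-ln(alpha / 2) / (2 n))."""
    return math.sqrt(-math.log(alpha / 2.0) / (2.0 * n))

def uniform_longitude_cdf(theta):
    """F_1(theta) = (theta + pi) / (2 pi) on [-pi, pi]."""
    return (theta + math.pi) / (2.0 * math.pi)

# Synthetic longitudes, uniform on [-pi, pi), standing in for simulated theta_s.
rng = random.Random(0)
thetas = [rng.uniform(-math.pi, math.pi) for _ in range(20000)]
d_n = ks_statistic(thetas, uniform_longitude_cdf)
reject = d_n > ks_critical_value(len(thetas), alpha=0.01)
print(d_n, reject)
```

The null hypothesis is rejected when the statistic exceeds the critical value; for large $N$ the asymptotic threshold is usually adequate, while small samples would call for exact tables.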

The authors are grateful to the referee for insightful suggestions which improved the paper. Also, the authors thank Nadia Mazza for her helpful remarks on combinatorics. M.K.S. is funded by a Faculty of Science and Technology studentship, Lancaster University.

The authors have no conflicts to disclose.

Gordon Blower: Investigation (equal); Methodology (equal); Supervision (equal); Writing – original draft (equal); Writing – review & editing (equal). Azadeh Khaleghi: Investigation (equal); Methodology (equal); Supervision (equal); Writing – original draft (equal); Writing – review & editing (equal). Moe Kuchemann-Scales: Investigation (equal); Methodology (equal); Visualization (equal); Writing – original draft (equal); Writing – review & editing (equal).

The data that support the findings of this study are available from the corresponding author upon reasonable request.

1. Adami, R., Golse, F., and Teta, A., “Rigorous derivation of the cubic NLS in dimension one,” J. Stat. Phys. 127, 1193–1220 (2007).
2. Albert, M., Laurent, B., Marrel, A., and Meynaoui, A., “Adaptive test of independence based on HSIC measures,” Ann. Stat. 50(2), 858–879 (2022).
3. Blower, G., “A logarithmic Sobolev inequality for the invariant measure of the periodic Korteweg–de Vries equation,” Stochastics 84, 533–542 (2012).
4. Blower, G., “Concentration of the invariant measures for the periodic Zakharov, KdV, NLS and Gross–Piatevskii equations in 1D and 2D,” J. Math. Anal. Appl. 438, 240–266 (2016).
5. Blower, G. and Bolley, F., “Concentration of measure on product spaces with applications to Markov processes,” Stud. Math. 175, 47–72 (2006).
6. Blower, G., Brett, C., and Doust, I., “Logarithmic Sobolev inequalities and spectral concentration for the cubic Schrödinger equation,” Stochastics 86, 870–881 (2014).
7. Bolley, F., “Application du transport optimal à des problèmes de limites de champ moyen,” Ph.D. thesis (ENS Lyon, 2005).
8. Bolley, F., Guillin, A., and Villani, C., “Quantitative concentration inequalities for empirical measures on non-compact spaces,” Probab. Theory Relat. Fields 137, 541–593 (2007).
9. Bourgain, J., “Periodic nonlinear Schrödinger equation and invariant measures,” Commun. Math. Phys. 166, 1–26 (1994).
10. Bourgain, J., Global Solutions of Nonlinear Schrödinger Equations (American Mathematical Society, 1999).
11. Cameron, R. H. and Martin, W. T., “Transformations of Wiener integrals under translations,” Ann. Math. 45, 386–396 (1944).
12. Cartan, E., Riemannian Geometry in an Orthogonal Frame (World Scientific, 2001).
13. Cordero-Erausquin, D., McCann, R. J., and Schmuckenschläger, M., “Prékopa–Leindler type inequalities on Riemannian manifolds, Jacobi fields, and optimal transport,” Ann. Fac. Sci. Toulouse Math. 15, 613–635 (2006).
14. Cruzeiro, A.-B. and Malliavin, P., “Renormalized differential geometry on path space: Structural equation, curvature,” J. Funct. Anal. 139, 119–181 (1996).
15. Deuschel, J.-D. and Stroock, D. W., Large Deviations (Academic Press, 1989).
16. Ding, Q., “A note on the NLS and the Schrödinger flow of maps,” Phys. Lett. A 248, 49–56 (1998).
17. Driver, B. K. and Lohrenz, T., “Logarithmic Sobolev inequalities for pinned loop groups,” J. Funct. Anal. 140, 381–448 (1996).
18. Durrett, R., Brownian Motion and Martingales in Analysis (Wadsworth Advanced Books & Software, 1984).
19. Gretton, A. and Györfi, L., “Consistent nonparametric tests of independence,” J. Mach. Learn. Res. 11, 1391–1423 (2010).
20. Hasimoto, H., “A soliton on a vortex filament,” J. Fluid Mech. 51, 477–485 (1972).
21. Ikeda, N. and Watanabe, S., “An introduction to Malliavin’s calculus,” in Stochastic Analysis, edited by Itô, K. (North-Holland, 1984), pp. 1–52.
22. Jitkrittum, W., Szabó, Z., and Gretton, A., “An adaptive test of independence with analytic kernel embeddings,” in Proceedings of the 34th International Conference on Machine Learning (JMLR, 2017), Vol. 70, pp. 1742–1751.
23. Karatzas, I. and Shreve, S. E., Brownian Motion and Stochastic Calculus, 2nd ed. (Springer, 1991).
24. Knuth, D. E., “Permutations, matrices, and generalized Young tableaux,” Pacific J. Math. 34, 709–727 (1970).
25. Kuchemann-Scales, M., “NLS_stochastic numerical solver,” GitHub (2022). https://github.com/MoeK-S/NLS_stochastic
26. Lebowitz, J. L., Rose, H. A., and Speer, E. R., “Statistical mechanics of the nonlinear Schrödinger equation,” J. Stat. Phys. 50, 657–687 (1988).
27. Lewin, M., Nam, P. T., and Rougerie, N., “Derivation of nonlinear Gibbs measures from many-body quantum mechanics,” J. Éc. Polytech. Math. 2, 65–115 (2015).
28. Lewin, M., Nam, P. T., and Rougerie, N., “Classical field theory limit of many body quantum Gibbs states in 2D and 3D,” Inventiones Math. 224, 315–344 (2021).
29. Marjanovic, G. and Solo, V., “Numerical methods for stochastic differential equations in matrix Lie groups made simple,” IEEE Trans. Autom. Control 63(12), 4035–4050 (2018).
30. McKean, H. P., “Statistical mechanics of nonlinear wave equations (4): Cubic Schrödinger,” Commun. Math. Phys. 168, 479–491 (1995).
31. McKean, H. P. and Vaninsky, K. L., “Action-angle variables for the cubic Schrödinger equation,” Commun. Pure Appl. Math. 50, 489–562 (1997).
32. Piggott, M. J. and Solo, V., “Stochastic numerical analysis for Brownian motion on SO(3),” in Proceedings of the 53rd IEEE Conference on Decision and Control (IEEE, 2014), pp. 3420–3425.
33. Sturm, K.-T., “On the geometry of metric measure spaces. II,” Acta Math. 196, 133–177 (2006).
34. Villani, C., Topics in Optimal Transportation (American Mathematical Society, 2003).
35. Villani, C., Optimal Transport: Old and New (Springer, 2009).
36. Wadati, M., “Generalized matrix form of the inverse scattering method,” in Solitons, edited by Bullough, R. K. and Caudrey, P. J. (Springer, 1980), pp. 287–299.
37. Xu, P., “Noncommutative Poisson algebras,” Am. J. Math. 116, 101–125 (1994).
38. Zaharov, V. E. and Šabat, A. B., “A scheme for integrating the nonlinear equations of mathematical physics by the method of the inverse scattering problem. I,” Funct. Anal. its Appl. 8, 226–235 (1974).
Published open access through an agreement with JISC Collections