The paper interprets the cubic nonlinear Schrödinger equation as a Hamiltonian system with infinite dimensional phase space. There exists a Gibbs measure which is invariant under the flow associated with the canonical equations of motion. The logarithmic Sobolev and concentration of measure inequalities hold for the Gibbs measures, and here are extended to the k-point correlation function and distributions of related empirical measures. By Hasimoto’s theorem, the nonlinear Schrödinger equation gives a Lax pair of coupled ordinary differential equations for which the solutions give a system of moving frames. The paper studies the evolution of the measure induced on the moving frames by the Gibbs measure; the results are illustrated by numerical simulations. The paper contains quantitative estimates with well-controlled constants on the rate of convergence of the empirical distribution in Wasserstein metric.
I. INTRODUCTION
Bourgain9 gave an alternative existence proof using random Fourier series, and showed that the measure is invariant under the flow in the sense that the Cauchy problem is well posed on the support. Further refinements include a result of McKean,30 that the sample paths are Hölder continuous, and a result from Theorem 1.2(iv) in Ref. 6 that the invariant measure of the micro-canonical ensemble satisfies a logarithmic Sobolev inequality. Random Fourier series fit naturally into Sturm’s theory of metric measure spaces, which we use to reduce some of the analysis to invariant measures on finite-dimensional Hamiltonian systems.
The focusing case for spatial variable captures soliton solutions, and Ref. 26 discusses the possible transition of the system between an ambient bounded random field and a soliton solution. For , the notion of a spatially localized solution is inapplicable, but some of the results are still relevant.26
Bourgain10 p. 128 comments that invariant Gibbs measures for the periodic cubic Schrödinger equation can be constructed on other phase spaces, and one can consider Gibbs measures on L2 that have different normalizations than μK,β.
We also obtain results on the empirical distributions that arise when we sample solutions of (1.2) with respect to Gibbs measure (1.5), which we use in the numerical experiments in Sec. VII.
Hasimoto20 observed that (1.2) can be expressed as a Lax pair of coupled ordinary differential equations with solutions in SO(3), one of which is the Serret–Frenet system for a moving frame on a curve in . Cruzeiro and Malliavin14 developed stochastic differential geometry for frames, pursuing Cartan’s precedent.12 In Secs. V and VI we consider the evolution of the dynamical system corresponding to Hasimoto frames under the Gibbs measure. In Sec. VII we present numerical experiments regarding the solutions, which illustrate the nature of frames that arise from the solutions of (1.2) for typical elements in the support of the Gibbs measure (1.5). In the Appendix,20 Hasimoto expressed the change of variables in a polar decomposition where ρ is a probability density and ϕ a phase, and derived Betchov’s intrinsic equation for vortex filaments from the nonlinear Schrödinger equation. We remark that Villani35 p. 691 carries out a similar transformation to interpret the linear Schrödinger equation as a transport problem for densities ρ for a suitable action integral. The current paper is a further step toward introducing transportation methods into PDE.
II. TENSOR PRODUCTS AND k-POINT DENSITY MATRICES FOR GAUSSIAN MEASURE
There are various alternative descriptions of even decompositions. We write λ ∼ ρ if λ and ρ are partitions that have equal numbers of boxes and equal numbers of odd rows; evidently ∼ is an equivalence relation on the set of partitions. By Ref. 24 Theorem 4 there is a bijection between symmetric matrices A that have entries in with column sums c1, …, cn and Young tableaux P such that P has cj occurrences of j as entries and the number of columns of P of odd length equals the trace of A. Given symmetric matrices A and B with entries in such that A and B have equal traces and equal totals of entries, the RSK correspondence takes A to P and B to Q, where P and Q are Young tableaux with an equal number of boxes, and their transposed diagrams P′ and Q′ have an equal number of odd rows, so P′ ∼ Q′.
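The relation ∼ and the passage to transposed diagrams can be checked mechanically. The following is a minimal sketch, assuming a partition is stored as a list of row lengths; the helper names `odd_rows`, `transpose` and `equivalent` are hypothetical and chosen only for illustration.

```python
def odd_rows(partition):
    """Number of rows of odd length in a partition (list of row lengths)."""
    return sum(1 for row in partition if row % 2 == 1)


def transpose(partition):
    """Transposed diagram: entry j of the result is the length of column j."""
    rows = [r for r in partition if r > 0]
    if not rows:
        return []
    return [sum(1 for r in rows if r > j) for j in range(max(rows))]


def equivalent(lam, rho):
    """lam ~ rho: equal numbers of boxes and equal numbers of odd rows."""
    return sum(lam) == sum(rho) and odd_rows(lam) == odd_rows(rho)
```

The number of columns of odd length in a diagram is `odd_rows(transpose(partition))`, which is the quantity matched with the trace in the bijection quoted above.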
Conversely, let and suppose that λ and ρ have equal numbers of odd rows, so that after adding zero rows and reordering the rows we have rj + ℓj even for all j. Then we introduce 2kj = ℓj + rj and after a further reordering write k = k1 + k2 + ⋯ + kn where we have k1 ≥ k2 ≥ ⋯ ≥ kn, and we have π ∈ Πk as above. Given an n-subset {j1, …, jn} of , we take 2km copies of jm and split them as ℓm on the bra side and rm on the ket side of the tensor for m = 1, …, n, making a contribution as in (2.6). We summarize these results as follows.
The integral J(k) is the sum over the summands (2.6) that arise from a π ∈ Πk with n nonzero rows, an n-subset of , and an even decomposition of π into a pair where λ and ρ have equal numbers of odd rows.
III. CONCENTRATION OF k-POINT MATRICES FOR GIBBS MEASURE
Under the family of Gibbs measures (1.5) associated with the nonlinear Schrödinger equation (NLS) (1.3), the random variable u ↦ ⟨u⊗k∣T∣u⊗k⟩ with u ∈ (BK, L2, μλ) and satisfies a Gaussian concentration of measure (3.6), the mean is a Lipschitz continuous function of β, and the mean for β = 0 is a sum over partitions of 2k over even decompositions.
We can write u = P + iQ for real variables (P, Q) and interpret ⟨u⊗k∣T∣u⊗k⟩ as a homogeneous polynomial in (P, Q) of total degree 2k. The following result gives concentration of measure for Lipschitz functions on (BK, L2, μλ), and shows that k-point matrices are concentrated near their mean value.
To make full use of the previous result, one needs to know the mean as in (3.7), which depends upon the measure in (3.5). The following shows how the mean can vary with the inverse temperature β = −λ.
IV. CONCENTRATION FOR METRIC MEASURE SPACES
Cameron and Martin computed the density with respect to the Wiener measure that results from the linear translation u ↦ u + v for v ∈ H1; their result extends to Gibbs measure with some modifications.
Theorem IV.3 gives a metric version of Sanov’s theorem on the empirical distribution; see p. 70 of Ref. 15. There are related results in Bolley’s thesis.7 By Ref. 13, Theorem IV.3 applies to Haar probability measure on SO(3) and normalized area measure on , as is relevant in Sec. VII below. However, to ensure that as N → ∞, it is convenient to reduce to one-dimensional distributions, where we use the following integral formula. For distributions μ and ν on with cumulative distribution functions F and G, we write Wp(μ, ν) = Wp(F, G).
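For distributions on the line, a standard form of such an integral formula is the quantile identity $W_p(F,G)^p=\int_0^1 |F^{-1}(t)-G^{-1}(t)|^p\,dt$. Assuming that form, the distance between two empirical distributions with the same number of samples reduces to a comparison of order statistics. The sketch below illustrates this; the function name `wasserstein_1d` is hypothetical.

```python
def wasserstein_1d(xs, ys, p=1):
    """W_p between the empirical distributions of two equal-sized samples.

    Uses the quantile identity: for equal sample sizes the optimal coupling
    pairs the i-th smallest point of xs with the i-th smallest point of ys.
    """
    if len(xs) != len(ys):
        raise ValueError("this sketch assumes equal sample sizes")
    n = len(xs)
    total = sum(abs(a - b) ** p for a, b in zip(sorted(xs), sorted(ys)))
    return (total / n) ** (1.0 / p)
```

For example, `wasserstein_1d([0, 1, 2], [3, 1, 2])` pairs 0 with 1, 1 with 2 and 2 with 3, so every point moves a distance 1.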
Consider the discrete metric δ on [0, 1], and observe that gives a 1-Lipschitz function on [0, 1] for all open A ⊆ [0, 1]. Then we have , so by maximizing over A we obtain the total variation norm ‖μ − ν‖var. With μ a continuous measure and ν a purely discrete measure, such as an empirical measure, we have ‖μ − ν‖var = 1. Propositions IV.4 and IV.5 depend upon the choice of cost function as well as the measures.
The Gibbs measure (1.5) was defined using random Fourier series. This construction gives us a sequence of finite-dimensional probability spaces which approximate the space (BK, L2, μK,β). To make this idea precise, we recall some definitions from Ref. 33.
(Convergence of metric measure spaces).
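- For M a nonempty set, a pseudometric is a function δ: M × M → [0, ∞] such that
\[ \delta(x,y)=\delta(y,x),\qquad \delta(x,x)=0,\qquad \delta(x,z)\le \delta(x,y)+\delta(y,z)\qquad (x,y,z\in M); \tag{4.30} \]
then (M, δ) is a pseudometric space.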
Given pseudometric spaces (M1, δ1) and (M2, δ2), a coupling is a pseudometric δ: M × M → [0, ∞] where M = M1 ⊔ M2 such that δ∣M1 × M1 = δ1 and δ∣M2 × M2 = δ2.
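- Suppose that (M1, δ1, μ1) and (M2, δ2, μ2) are complete separable metric spaces endowed with probability measures. Consider a coupling (M, δ) and a probability measure π on M1 × M2 with marginals π1 = μ1 and π2 = μ2. Then the L2 distance between (M1, δ1, μ1) and (M2, δ2, μ2) is
\[ \mathbb{D}\bigl((M_1,\delta_1,\mu_1),(M_2,\delta_2,\mu_2)\bigr)=\inf_{\delta,\pi}\Bigl(\int_{M_1\times M_2}\delta(x,y)^2\,\pi(dx\,dy)\Bigr)^{1/2}, \tag{4.31} \]
where the infimum is taken over all such couplings δ and measures π.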
Here is a finite-dimensional manifold and a metric probability space. We now show that these spaces converge to (M∞, L2, μK,β) as n → ∞.
- Suppose that 0 < −βK < 3/(14π2). Then has(4.37)
The measures converge in total variation norm to μK,β as n → ∞.
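The finite-dimensional approximations come from truncating a random Fourier series. The sketch below samples only the Gaussian part of such a truncation, under the assumption that the coefficients are $c_k=g_k/k$ for $0<|k|\le n$ with independent standard complex Gaussians $g_k$; this normalisation, and the names `sample_coefficients`, `evaluate` and `l2_norm_squared`, are illustrative choices and are not taken from (1.5). The reweighting by the nonlinear term for β ≠ 0 is not attempted.

```python
import cmath
import math
import random


def sample_coefficients(n, rng):
    """Coefficients c_k = g_k / k for 0 < |k| <= n (assumed normalisation)."""
    coeffs = {}
    for k in range(-n, n + 1):
        if k == 0:
            continue
        # Standard complex Gaussian: real and imaginary parts have variance 1/2.
        g = complex(rng.gauss(0.0, 1.0), rng.gauss(0.0, 1.0)) / math.sqrt(2.0)
        coeffs[k] = g / k
    return coeffs


def evaluate(coeffs, x):
    """Value of the trigonometric polynomial sum_k c_k exp(i k x)."""
    return sum(c * cmath.exp(1j * k * x) for k, c in coeffs.items())


def l2_norm_squared(coeffs):
    """Parseval: (1 / 2 pi) * integral of |u|^2 over [0, 2 pi]."""
    return sum(abs(c) ** 2 for c in coeffs.values())
```

Because the truncated series is a trigonometric polynomial, `l2_norm_squared` gives the normalised L2 norm exactly, which is convenient when a constraint on the L2 norm has to be tested on each sample.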
For example, with we introduce ; then and give Lipschitz functions .
For 0 < γ < 1/16 and fixed 0 < t < t0, the map x ↦ u(x, t) ∈ L4 is γ-Hölder continuous, so that is almost surely finite.
V. HASIMOTO TRANSFORM
(Hasimoto). If u is a C2 function that satisfies the nonlinear Schrödinger equation, then the coupled pair of differential equations is consistent in the sense that there exists a local solution of the pair of ODEs, and there exists a local solution of the Lax pair.
- The invariance of H2 was noted in Ref. 31 and can be proved by differentiating through the integral sign and using the canonical equations. We have a serieswhich converges almost surely. This follows sincewhere the final integral involves the series(5.8)which is a martingale; by Fatou’s Lemma, we have(5.9)so the series in (5.9) is marginally exponentially integrable. Hence the integrals in (5.8) converge by the Lp martingale maximal theorem for all 1 < p < ∞.(5.10)
- One can write H2 in terms of P + iQ = κeiσ, and make a change of variables to obtainandTo interpret this as an area, we write θ ∈ [0, 2π] for the space variable and extend functions on [0, 2π] to harmonic functions on the unit disc via the Poisson kernel. Then by Green’s theorem, we can express this invariant in terms of the area of the image of under the map to P + iQ, as inThis is similar to Lévy’s stochastic area, as discussed in Example 5.1 of Ref. 21.(5.11)
- We then havewhich is bounded in terms of other invariants, with .□
Bourgain10 interprets H2 in terms of momentum (5.70).
With Mn as in (4.36), the space is a Poisson algebra for the bracket , and the canonical equations arise with Hamiltonian on Mn. Let Q be the ring of quaternions, and extend the Poisson bracket to C∞(Mn; Q) via {f ⊗ X, g ⊗ Y} = {f, g} ⊗ XY. Then may be realised as and ; see Example 2.3 of Ref. 37. This Lie algebra is also the Lie algebra of SU(2), and there is a two-to-one group homomorphism SU(2) → SO(3). Hence some of the following results may be expressed in terms of SU(2), which is the form in which a Lax pair for the nonlinear Schrödinger equation was presented; see Refs. 36 and 38 (Subsection 8.3.2).
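The homomorphism SU(2) → SO(3) can be made concrete by identifying SU(2) with the unit quaternions. The following sketch uses the standard quaternion-to-rotation formula; the function name `rotation_from_quaternion` and the ordering (w, x, y, z) of the components are conventions assumed here for illustration.

```python
import math


def rotation_from_quaternion(q):
    """3x3 rotation matrix of the quaternion q = (w, x, y, z), normalised first."""
    w, x, y, z = q
    norm = math.sqrt(w * w + x * x + y * y + z * z)
    w, x, y, z = w / norm, x / norm, y / norm, z / norm
    return [
        [1 - 2 * (y * y + z * z), 2 * (x * y - w * z), 2 * (x * z + w * y)],
        [2 * (x * y + w * z), 1 - 2 * (x * x + z * z), 2 * (y * z - w * x)],
        [2 * (x * z - w * y), 2 * (y * z + w * x), 1 - 2 * (x * x + y * y)],
    ]
```

Every entry is quadratic in the components of q, so q and −q give the same rotation, which is why the map is two-to-one.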
- Suppose that , so that T(x, t) represents the spin of the particle at (x, t) and let which corresponds to our (5.5). One can consider infinitesimal variations T ↦ T + T × V and thereby compute . In the focusing case β = −1, Ding16 introduces a symplectic structure on the space of such maps such that the Hamiltonian flow is (5.12) which corresponds to Heisenberg’s equation for the one-dimensional ferromagnet, and gives the top entry of (5.2). There is a gauge equivalence between the focusing NLS and Heisenberg’s ferromagnet. There is also a gauge equivalence between the defocusing NLS and a hyperbolic version of the ferromagnet in which the standard cross product is modified. We have (5.13) (5.14)
- As in Ref. 17, the space with pointwise multiplication is a loop group, and its Lie algebra may be regarded as
The aim of the next section is to interpret the Lax pair suitably for solutions which are typically not differentiable and for which we have a pair of stochastic differential equations with random matrix coefficients.
VI. GIBBS MEASURE TRANSPORTED TO THE FRAMES
The compact Lie group SO(3) of real orthogonal matrices with determinant one is a subset of , which has the scalar product ⟨X, Y⟩ = trace(XY⊤) and associated metric d(X, Y) = ⟨X − Y, X − Y⟩1/2 such that ⟨XU, YU⟩ = ⟨X, Y⟩ and d(XU, YU) = d(X, Y) for all U ∈ SO(3) and . The Lie group SO(3) has tangent space at the identity element given by the skew-symmetric matrices so(3), so the tangent space TXSO(3) at X ∈ SO(3) consists of {ΩX: Ω ∈ so(3)}, where so(3) is a Lie algebra for [x, y] = xy − yx, x, y ∈ so(3), and the exponential map is surjective so(3) → SO(3).
- If is skew, and Xt, Yt give solutions of the differential equation (6.6), then d(Xt, Yt) = d(X0, Y0). We deduce that if X0 is distributed according to Haar measure on SO(3), then Xt is also distributed according to Haar measure since the measure, the metric and solutions are all preserved via X ↦ XU.
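The distance-preserving property can be illustrated numerically. The sketch below assumes the equation has the form dX/dx = ΩX, consistent with the description of the tangent space as {ΩX}, and it approximates Ω by a piecewise constant skew matrix so that each step is an exact rotation computed with the Rodrigues formula. The helper names `hat`, `expm_so3`, `dist` and `flow` are hypothetical.

```python
import numpy as np


def hat(w):
    """Skew-symmetric matrix with hat(w) @ v equal to np.cross(w, v)."""
    return np.array([[0.0, -w[2], w[1]],
                     [w[2], 0.0, -w[0]],
                     [-w[1], w[0], 0.0]])


def expm_so3(omega):
    """Rodrigues formula for the exponential of a 3x3 skew-symmetric matrix."""
    # The Frobenius norm squared of hat(w) is 2 |w|^2, so theta = |w|.
    theta = np.sqrt(0.5 * np.sum(omega * omega))
    if theta < 1e-12:
        return np.eye(3) + omega  # first-order approximation near zero
    return (np.eye(3) + (np.sin(theta) / theta) * omega
            + ((1.0 - np.cos(theta)) / theta ** 2) * (omega @ omega))


def dist(x, y):
    """d(X, Y) = <X - Y, X - Y>^(1/2) with <X, Y> = trace(X Y^T)."""
    return np.sqrt(np.trace((x - y) @ (x - y).T))


def flow(x0, omegas, h):
    """Step dX/dx = Omega X by X <- exp(h Omega_k) X for each Omega_k in turn."""
    x = x0.copy()
    for om in omegas:
        x = expm_so3(h * om) @ x
    return x
```

Each step multiplies both X and Y on the left by the same orthogonal matrix, so `dist(flow(x0, ...), flow(y0, ...))` agrees with `dist(x0, y0)` up to rounding error, and the iterates remain in SO(3).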
As an alternative, we can consider X0 to have first column [0; 0; 1] and observe the evolution of the first column T of X under (6.1) where T evolves on .
We now consider the case in which Ω as in (5.2) is an so(3)-valued random variable over (M∞, μK,β, L2).
Let P + iQ = κeiσ be a solution of NLS with initial data P(x, 0) + iQ(x, 0) ∈ BK ∩ H1. Then Ω1 in (6.11) gives an so(3)-valued vector field in L2(κ2(x, t)dx).
Let P + iQ = κeiσ be a solution of NLS with initial data P(x, 0) + iQ(x, 0) ∈ H1 ∩ BK, and let be the corresponding solution of the NLS truncated in Fourier space, giving matrix . Let be a solution of (6.9) and suppose that X(n) converges weakly in L2 to Xt(x). Then Xt gives a weak solution of (5.1).
- With , we have where the entries of are bounded by κ2 + τ2, hence for u ∈ H1; however, there is no reason to suppose that τ itself is integrable with respect to dx. (6.12)
- By (5.4) and (5.5), we have for all u ∈ H1. Moreover, Bourgain9 has shown that for initial data P(x, 0) + iQ(x, 0) = κ(x, 0)eiσ(x,0) in H1 ∩ BK, the map is Lipschitz continuous for 0 ≤ t ≤ t0 with Lipschitz constant depending upon t0, K > 0. We have (6.13) where the right-hand side is integrable with respect to x by the Hardy–Littlewood maximal inequality and (6.12). Suppose that X(n) is a solution of (5.1). We take τn to be locally bounded. Then by applying the Cauchy–Schwarz inequality to the integral (6.14) we deduce that where the integral is finite by (6.12). Also (6.15) for 0 < x1 < x2 < … < xN < 2π. We have so for and the inner product on , we have (6.16) where κn → κ in H1, so with norm convergence, we have in L2, and κnΩ(n) → κΩ1 as n → ∞, and with weak convergence in L2, we have X(n) → X, so (6.17) □ (6.18)
The simulation of this differential equation computes starting with X0 = [0; 0; 1] and produces a frame {Xx, ΩxXx, Xx × ΩxXx} of orthogonal vectors. Geodesics on are the curves such that the principal normal is parallel to the position vector, namely the great circles. For a geodesic, Xx × ΩxXx is perpendicular to the plane that contains the great circle.
Let P + iQ = κeiσ be a solution of NLS with initial data P(x, 0) + iQ(x, 0) ∈ BK. Then gives an so(3)-valued stochastic process of finite quadratic variation on [0, 2π] almost surely with respect to μK(dPdQ).
Let P + iQ = κeiσ be a solution of NLS with initial data P(x, 0) + iQ(x, 0) ∈ H1 ∩ BK, and let be the corresponding solution of the NLS truncated in Fourier space, giving matrix . Let be a solution of (6.10). Then converges in norm to Xt as n → ∞ where Xt gives a weak solution of (5.2).
- The essential estimate is (6.20) The function σ is a progressively measurable stochastic process adapted with respect to a suitable filtration, and with differential satisfying an Itô integral equation.18 Therefore, we can control the κτ term via which is a bounded martingale transform of the Wiener loop. This formula is reminiscent of Lévy’s stochastic area as in Example 5.1 of Ref. 21. (6.21)
- By (5.4) and (5.5), we have for all u ∈ H1. Bourgain9 has shown that for initial data P(x, 0) + iQ(x, 0) = κ(x, 0)eiσ(x,0) in H1 ∩ BK, the mapis Lipschitz continuous for 0 ≤ t ≤ t0 with Lipschitz constant depending upon t0, K > 0. We have(6.22)where the final integral is part of the Hamiltonian. With , we have the integral equation for the pairing ⟨·, ·⟩ on(6.23)Consider the variational differential equation inwhere and are skew.(6.24)
Let be the Fourier truncated matrix that corresponds to a solution un of the Fourier truncated equation NLSn; then let be the solution of the ODE (6.10). By Proposition VI.2, the map pushes forward the modified Gibbs measure to a measure on [C(Mn; SO(3)), L2] that satisfies a Gaussian concentration of measure inequality with constant α(β, K)/n2; compare (3.9).
- Then the distribution ν(n) of Zn satisfies the Gaussian concentration inequality (6.30)
Let be the empirical distribution of N independent copies of Zn. Then almost surely as N → ∞.
- As with un, we introduce the corresponding data for another solution vn. As in (6.27), we have . For a given initial condition , and T > 0, we can take the supremum over t, then integrate this with respect to x and obtain (6.31) so Ω(u) ↦ Xu is a Lipschitz function . By Bourgain’s results, there exists C > 0 such that (6.32) so is a Lipschitz function on , albeit with a constant growing with n. Thus we can push forward the modified Gibbs measure so that the image measure satisfies a Gaussian concentration inequality with constant α(β, K)/n2 dependent upon n. For each , we introduce Zn, so that where un ↦ Zn is a Cn-Lipschitz function from to . The random variable Zn therefore satisfies the Gaussian concentration inequality (6.30). (6.33)
- By Theorem IV.3, we can use the Borel–Cantelli Lemma to show thatwhere by Proposition IV.4, as N → ∞.□
For u ∈ M∞, we have un = Dnu ∈ Mn so that un → u in L2 norm as n → ∞. It is plausible that (6.34) tends to 0 as n → ∞, but we do not have a proof. Unfortunately, the constants are not sharp enough to allow us to use Proposition IV.8 to deduce W2 convergence for the distributions on SO(3).
VII. EXPERIMENTAL RESULTS
Our objective in this section is to obtain a (random) numerical approximation to the solution of (6.9). We consider the case where the parameter β in (1.3) is equal to 0. Note that in this case, the Gibbs measure reduces to Wiener loop measure, and stochastic processes with the Wiener loop measure as their law are by definition Brownian loops. Equation (6.9) is a partial differential equation (PDE) with respect to the space variable x, while the parameter of a stochastic process in a stochastic differential equation (SDE) is colloquially referred to as time. To avoid confusion, in this section we refer to x as s, whereas the time variable t is suppressed.
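The driving noise for β = 0 can be sketched as follows. This is a minimal illustration, assuming that a Brownian loop on [0, 2π] may be represented by a Brownian motion pinned to return to its starting value, discretised with step size close to h; the pinning at 0 and the function name `brownian_loop` are assumptions made for the example and may differ from the normalisation of the loop measure used elsewhere in the paper.

```python
import math
import random


def brownian_loop(h, rng, length=2 * math.pi):
    """Discretised Brownian loop on [0, length], pinned to 0 at both ends."""
    n = round(length / h)   # number of steps
    step = length / n       # actual step size, close to h
    w = [0.0]
    for _ in range(n):
        # Independent Gaussian increments with variance equal to the step size.
        w.append(w[-1] + rng.gauss(0.0, math.sqrt(step)))
    # Subtract the linear interpolant of the endpoint to close the loop.
    return [w[k] - (k / n) * w[n] for k in range(n + 1)]
```

The cost of one path in this sketch is proportional to the number of steps `length / h`, so halving h doubles the work per sample.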
The computational complexity of simulating a single sample-path is where T denotes the length of the interval simulated. Therefore, for a total of N samples, the computational complexity of our simulation algorithm is . We run our simulations using a machine equipped with an 8-core Intel Xeon Gold 6248R central processing unit (CPU) with a clock speed of 2993 MHz; we take advantage of integrated parallelisation in MATLAB. With h = 10⁻⁵ and N = 2 × 10⁶ the algorithm takes around 1 week to run on our system.
Since the sample paths are constrained to the sphere, the points ys can be specified in spherical coordinates of longitude θs ∈ [−π, π) and colatitude ϕs ∈ [0, π]. Figure 2 demonstrates the empirical joint distribution of θs and ϕs for two different values of s. As can be observed, the distribution of (θs, ϕs) varies with s. We hypothesise that the angles θs and ϕs evolve to become statistically independent, and that ys will eventually be uniformly distributed on the sphere. In the remainder of the section, we test this hypothesis statistically.
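The change to spherical coordinates can be sketched as follows, assuming the colatitude is measured from the pole [0; 0; 1], so that the starting point has ϕ = 0. The function names `to_angles` and `from_angles` are hypothetical. Note that `math.atan2` returns values in [−π, π], which differs from the range [−π, π) only at a single longitude.

```python
import math


def to_angles(y):
    """Longitude theta and colatitude phi of a unit vector y = (y1, y2, y3)."""
    y1, y2, y3 = y
    theta = math.atan2(y2, y1)
    # Clamp to guard against rounding slightly outside [-1, 1].
    phi = math.acos(max(-1.0, min(1.0, y3)))
    return theta, phi


def from_angles(theta, phi):
    """Unit vector with longitude theta and colatitude phi."""
    return (math.sin(phi) * math.cos(theta),
            math.sin(phi) * math.sin(theta),
            math.cos(phi))
```

Under the uniform distribution on the sphere, θ is uniform on its range and cos ϕ is uniform on [−1, 1], and the two angles are independent, which is the limiting behaviour that the hypothesis above describes.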
A. Wasserstein distance between measures on
For each s ∈ [0, 10], let and be the marginal cumulative distribution functions (CDFs) of θs and ϕs respectively. For , denote by and the empirical CDFs of θs and ϕs. We generate empirical CDFs and with s = 0.3, 0.6, 0.9, …, 6.0, and N = 10⁵. Figure 3 demonstrates that and , each decreases with increasing s. As a consequence of Theorem IV.3 and Proposition IV.4 for N = 10⁵ with probability at least 0.99 it holds that and . Thus, we observe that converges to F1 and converges to G1.
B. Hypothesis tests for independence and goodness-of-fit
We run a total of 22 hypothesis tests to examine the evolution of the joint distribution of the angles θs and ϕs. In order to account for multiple testing, we set the significance level of each test to 0.000 45, leading to an overall level of 0.01. First, we generate sample paths to obtain N = 10⁵ realisations of (θs, ϕs) for each value of s = 0.3, 0.6, 0.9, …, 6.0. For each s, we test the null hypothesis H0,s that the angles θs and ϕs are statistically independent, against the alternative hypothesis H1,s that they are dependent. To this end, we rely on a widely used nonparametric independence test, which is based on the Hilbert–Schmidt Independence Criterion (HSIC) dependence measure;2,19 the implementation is due to Jitkrittum et al.22 It is observed that while the null hypothesis is rejected for s = 0.3, …, 2.1, the test is unable to reject H0,s for s = 2.4, …, 6.0 at (an overall) significance level 0.01.
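The per-test level is consistent with a Bonferroni correction, which is assumed in the following short check: 20 independence tests, one for each value of s, together with the two goodness-of-fit tests, give 22 tests in total. The names `bonferroni_level`, `s_values` and `per_test` are hypothetical.

```python
def bonferroni_level(overall_level, num_tests):
    """Per-test significance level under a Bonferroni correction (assumed)."""
    return overall_level / num_tests


# Values s = 0.3, 0.6, ..., 6.0 used for the independence tests.
s_values = [round(0.3 * k, 1) for k in range(1, 21)]
num_tests = len(s_values) + 2  # plus two goodness-of-fit tests
per_test = bonferroni_level(0.01, num_tests)
```

Here `per_test` is 0.01/22, slightly above the rounded value 0.000 45, so using 0.000 45 for each test keeps the overall level at or below 0.01 by the union bound.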
We run two Kolmogorov–Smirnov goodness-of-fit tests for s = 10 as follows. The first tests the null hypothesis that θs is distributed according to F1 against the alternative that it is not; the second tests the null hypothesis that ϕs is distributed according to G1 against the alternative that it is not. At significance level 0.01, the tests are unable to reject the null hypotheses and .
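A goodness-of-fit statistic of this kind can be computed directly. The sketch below assumes that F1 and G1 are the marginal CDFs of longitude and colatitude under the uniform distribution on the sphere, namely (θ + π)/(2π) and (1 − cos ϕ)/2; this identification is an assumption made for the example. The rejection threshold uses the Dvoretzky–Kiefer–Wolfowitz bound, which is a conservative stand-in for the exact Kolmogorov–Smirnov critical value, and all function names are hypothetical.

```python
import math
import random


def ks_statistic(sample, cdf):
    """Kolmogorov-Smirnov statistic sup |F_N - F| for a continuous CDF F."""
    xs = sorted(sample)
    n = len(xs)
    d = 0.0
    for i, x in enumerate(xs):
        f = cdf(x)
        # The empirical CDF jumps from i/n to (i+1)/n at x.
        d = max(d, (i + 1) / n - f, f - i / n)
    return d


def dkw_threshold(n, level):
    """Epsilon with P(sup |F_N - F| > epsilon) <= level, by the DKW bound."""
    return math.sqrt(math.log(2.0 / level) / (2.0 * n))


def uniform_longitude_cdf(theta):
    """CDF of longitude on [-pi, pi) under the uniform law on the sphere."""
    return (theta + math.pi) / (2.0 * math.pi)


def uniform_colatitude_cdf(phi):
    """CDF of colatitude on [0, pi] under the uniform law on the sphere."""
    return (1.0 - math.cos(phi)) / 2.0
```

A sample is judged compatible with the hypothesised CDF at a given level when `ks_statistic(sample, cdf)` does not exceed `dkw_threshold(len(sample), level)`.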
ACKNOWLEDGMENTS
The authors are grateful to the referee for insightful suggestions which improved the paper. Also, the authors thank Nadia Mazza for her helpful remarks on combinatorics. M.K.S. is funded by a Faculty of Science and Technology studentship, Lancaster University.
AUTHOR DECLARATIONS
Conflict of Interest
The authors have no conflicts to disclose.
Author Contributions
Gordon Blower: Investigation (equal); Methodology (equal); Supervision (equal); Writing – original draft (equal); Writing – review & editing (equal). Azadeh Khaleghi: Investigation (equal); Methodology (equal); Supervision (equal); Writing – original draft (equal); Writing – review & editing (equal). Moe Kuchemann-Scales: Investigation (equal); Methodology (equal); Visualization (equal); Writing – original draft (equal); Writing – review & editing (equal).
DATA AVAILABILITY
The data that support the findings of this study are available from the corresponding author upon reasonable request.