We review two works [Chandra *et al*., Publ. Math. l’IHÉS (published online, 2022) and Chandra *et al*., arXiv:2201.03487 (2022)] that study the stochastic quantization equations of Yang–Mills on two- and three-dimensional Euclidean space with finite volume. The main result of these works is that one can renormalize the 2D and 3D stochastic Yang–Mills heat flow so that the dynamic becomes gauge covariant in law. Furthermore, there is a state space of distributional 1-forms $S$ to which gauge equivalence approximately extends and such that the renormalized stochastic Yang–Mills heat flow projects to a Markov process on the quotient space of gauge orbits $S/{\sim}$. In this Review, we give unified statements of the main results of these works, highlight differences in the methods, and point out a number of open problems.

## I. INTRODUCTION

### A. Yang–Mills theory

Yang–Mills (YM) theory plays an important role in the description of force-carrying particles in the standard model. An important unsolved problem in mathematics is to show that YM theory on Minkowski space-time can be rigorously quantized. We refer to Ref. 42 for a description of this problem, together with the surveys^{16,48} for related literature and problems.

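To describe the problem, fix a compact Lie group *G*, called the *structure group*, with Lie algebra $\mathfrak{g}$. Passing to Euclidean space and working in an arbitrary dimension *d*, one can reformulate the problem as trying to make sense of the YM probability measure on the space of $\mathfrak{g}$-valued 1-forms $A=(A_1,\ldots,A_d)\colon\mathbf{R}^d\to\mathfrak{g}^d$,

$$\mathrm{d}\mu(A)=Z^{-1}e^{-S(A)}\,\mathrm{d}A,$$

where d*A* is a formal Lebesgue measure on the space of 1-forms, *Z* is a normalization constant, *S*(*A*) is the *YM action*,

$$S(A)=\int_{\mathbf{R}^d}|F(A)(x)|^2\,\mathrm{d}x,$$

*F*(*A*) is the curvature 2-form of *A* given by $F_{ij}(A)=\partial_iA_j-\partial_jA_i+[A_i,A_j]$, and we equip $\mathfrak{g}$ with an Ad-invariant inner product ⟨·, ·⟩ and norm |·|.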

Without loss of generality, we can take *G* ⊂ *U*(*N*) and $\mathfrak{g}\subset\mathfrak{u}(N)$ for some *N* ≥ 1. In this case, the adjoint action is Ad_{g}*A* = *gAg*^{−1}, a possible choice for the inner product is ⟨*X*, *Y*⟩ = −Tr(*XY*), and one can rewrite $|F(A)|^2=-\frac12\sum_{i,j=1}^d\mathrm{Tr}(F_{ij}(A)^2)$.
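As a quick sanity check of this choice of inner product, the following sketch verifies on one example in $\mathfrak{u}(2)$ that ⟨*X*, *Y*⟩ = −Tr(*XY*) is real, that ⟨*X*, *X*⟩ is positive, and that the pairing is unchanged under Ad_{g}. The matrices `X`, `Y`, `g` and all helper names are illustrative assumptions and do not come from the works under review.

```python
# Illustration only: 2x2 complex matrices as nested lists, no external libraries.

def matmul(a, b):
    """Product of two 2x2 matrices."""
    return [[sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

def dagger(a):
    """Conjugate transpose of a 2x2 matrix."""
    return [[a[j][i].conjugate() for j in range(2)] for i in range(2)]

def inner(x, y):
    """<X, Y> = -Tr(XY), which is real for skew-Hermitian X and Y."""
    p = matmul(x, y)
    return -(p[0][0] + p[1][1]).real

def ad(g, x):
    """Adjoint action Ad_g X = g X g^{-1}, using g^{-1} = g^* for unitary g."""
    return matmul(matmul(g, x), dagger(g))

# Two skew-Hermitian matrices (X^* = -X), i.e. elements of u(2); hypothetical examples.
X = [[1j, 2 + 0j], [-2 + 0j, -1j]]
Y = [[0j, 1j], [1j, 2j]]
# A unitary matrix, i.e. an element of U(2); hypothetical example.
g = [[0j, 1 + 0j], [-1 + 0j, 0j]]

norm_sq_X = inner(X, X)                   # squared norm |X|^2
inner_XY = inner(X, Y)                    # pairing before the adjoint action
inner_after = inner(ad(g, X), ad(g, Y))   # pairing after applying Ad_g to both
```

Here `inner_after` agrees with `inner_XY` because the trace is cyclic, which is exactly the Ad-invariance required of the inner product.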

The case *d* = 4 corresponds to physical space-time, but the task of constructing the probability measure *μ* makes sense for arbitrary dimension and even for a Riemannian manifold *M* in place of **R**^{d}. The $\mathfrak{g}$-valued 1-forms in this case become connections on a principal *G*-bundle *P* → *M*, and one aims to define the probability measure *μ* on the space of connections on *P*.

The cases *d* = 2, 3 are considered substantially simpler than *d* = 4 as they correspond to super-renormalizable theories in quantum field theory (vs renormalizable for *d* = 4 and non-renormalizable for *d* ≥ 5). In the remainder of this article, we will focus on these dimensions and further restrict to *finite volume*, replacing **R**^{d} by the torus **T**^{d} = **R**^{d}/**Z**^{d}. The underlying principal bundle *P* is always assumed trivial and we keep in mind that all geometric objects (connections, curvature forms, etc.) can be written in coordinates ($\mathfrak{g}$-valued 1-forms, 2-forms, etc.) once we fix a global section of *P* that identifies it with **T**^{d} × *G*. The space of connections is an affine space, with the difference of two connections being a 1-form.

The YM action is invariant under the action of the *gauge group*, i.e., the automorphism group of *P*. Since *P* is trivial, the gauge group consists of functions *g*: **T**^{d} → *G*. These gauge transformations change the global section we use to identify *P* ≃ **T**^{d} × *G* and act on 1-forms by

$$A\mapsto A^g\stackrel{\mathrm{def}}{=}\mathrm{Ad}_gA-(\mathrm{d}g)g^{-1}.$$

One can view *A*^{g} either as a new connection, gauge equivalent to *A*, or simply as the same connection *A* written in a new coordinate system.

We write *A* ∼ *B* if there exists *g* such that *A*^{g} = *B* and write [*A*] = {*B*: *B* ∼ *A*} for the *gauge orbit* of *A*. In light of the above, the natural space on which to define the probability measure *μ* is not the space of 1-forms, but rather the quotient space $O$ of all gauge orbits. The space $O$ is a non-linear space if *G* is non-Abelian, which makes non-trivial even the construction of the state space on which the YM measure *μ* should be defined.

A number of works have made contributions to a precise definition of this measure. The most successful case is dimension *d* = 2, which includes **R**^{2} and compact orientable surfaces. The key feature that makes 2D YM special is its exact solvability, which allows one to write down an explicit formula for the joint distribution of Wilson loop observables; this property was observed in the physics literature by Migdal^{53} and later developed in mathematics; see, e.g., Refs. 17, 26, 29, 33, 44, 45, and 60.

In the Abelian case *G* = *U*(1) on **R**^{d}, one can make sense of the measure *μ* for *d* = 3^{32} and *d* = 4.^{25} For any structure group *G*, a form of ultraviolet stability on **T**^{d} was demonstrated for *d* = 4 using a continuum regularization in Ref. 51 and for *d* = 3, 4 using a renormalization group approach on the lattice in a series of works by Balaban^{2–4} (see also Ref. 28). However, a construction of the 3D YM measure and a description of its gauge-invariant observables, even on **T**^{3}, remains open.

### B. Stochastic quantization

Another approach to the construction of the YM measure was recently initiated in Refs. 11 and 12, which is based on stochastic quantization (see also Ref. 61 that treats scalar QED on **T**^{2}). The basic idea behind this approach is to view the measure *μ* as the invariant measure of a Langevin dynamic. By studying this dynamic, one can try to determine properties and even give constructions of *μ*. The method was put forward in the context of gauge theories by Parisi–Wu^{57} and has recently been applied to the construction of *scalar* theories (see Refs. 1, 34, 40, and 54).

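The Langevin dynamic associated with the YM action is formally given by

$$\partial_tA=-\mathrm{d}_A^*F(A)+\xi, \tag{1.1}$$

where $\mathrm{d}_A^*$ is the adjoint of the covariant derivative d_{A} and *ξ* is a white noise built over the *L*^{2}-space of $\mathfrak{g}$-valued 1-forms.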

We point out right away that a difficulty in solving (1.1), even in the absence of noise, is that the equation $\partial_tA=-\mathrm{d}_A^*F(A)$ is not parabolic. This is a well-known feature of YM theory and is connected with the infinite-dimensional nature of the gauge group: the YM equations $\mathrm{d}_A^*F(A)=0$ are not elliptic and admit infinitely many solutions because if *A* is a solution, then so is every element of the gauge orbit [*A*].

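A well-known solution to this problem is to add a term to the right-hand side of (1.1) that is tangent to the gauge orbit [*A*] at *A* and which renders the equation parabolic. This “trick” was used by Zwanziger^{64} and Donaldson^{24} in YM theory (see also Refs. 5 and 59) and by DeTurck^{23} in the context of Ricci flow. The tangent space of [*A*] at *A* is precisely the set of all $\mathrm{d}_A\omega=\sum_{i=1}^d(\partial_i+[A_i,\cdot\,])\omega\,\mathrm{d}x_i$, where *ω* is a $\mathfrak{g}$-valued function. A natural choice for the additional term is −d_{A}d^{*}*A*, where $\mathrm{d}^*A=-\sum_{i=1}^d\partial_iA_i$. Therefore, making sense of (1.1) is equivalent to making sense of the parabolic equation

$$\partial_tA=-\mathrm{d}_A^*F(A)-\mathrm{d}_A\mathrm{d}^*A+\xi, \tag{1.2}$$

which reads in coordinates, for *i* = 1, …, *d*,

$$\partial_tA_i=\Delta A_i+\xi_i+[A_j,2\partial_jA_i-\partial_iA_j+[A_j,A_i]], \tag{1.3}$$

where *ξ*_{1}, …, *ξ*_{d} are independent and identically distributed (i.i.d.) $\mathfrak{g}$-valued white noises and [·, ·] is the Lie bracket of $\mathfrak{g}$. Here and below, there is an implicit summation over *j* = 1, …, *d*.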
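The passage from the coordinate-free drift $-\mathrm{d}_A^*F(A)-\mathrm{d}_A\mathrm{d}^*A$ to its coordinate form can be sketched in a few lines. The computation below is only a sketch under assumed sign conventions, namely $(\mathrm{d}_A^*F)_i=-(\partial_j+[A_j,\cdot\,])F_{ji}$ for the adjoint on 2-forms (consistent with $\mathrm{d}^*A=-\sum_i\partial_iA_i$) and implicit summation over *j*; it shows how the added term cancels the second-order cross terms and leaves the Laplacian.

```latex
% Assumed conventions: F_{ji} = \partial_j A_i - \partial_i A_j + [A_j, A_i],
% (d_A^* F)_i = -(\partial_j + [A_j,\cdot]) F_{ji},  d^* A = -\partial_j A_j.
\begin{align*}
(-\mathrm{d}_A^* F(A))_i
  &= \partial_j F_{ji} + [A_j, F_{ji}] \\
  &= \Delta A_i - \partial_i\partial_j A_j + [\partial_j A_j, A_i]
     + [A_j, \partial_j A_i] + [A_j, F_{ji}], \\
(-\mathrm{d}_A \mathrm{d}^* A)_i
  &= (\partial_i + [A_i,\cdot\,])\,\partial_j A_j
   = \partial_i\partial_j A_j + [A_i, \partial_j A_j], \\
(-\mathrm{d}_A^* F(A) - \mathrm{d}_A \mathrm{d}^* A)_i
  &= \Delta A_i + [A_j, \partial_j A_i] + [A_j, F_{ji}] \\
  &= \Delta A_i + [A_j,\, 2\partial_j A_i - \partial_i A_j + [A_j, A_i]].
\end{align*}
```

The terms $\mp\partial_i\partial_jA_j$ cancel, as do $[\partial_jA_j,A_i]$ and $[A_i,\partial_jA_j]$, so the principal part is Δ*A*_{i} and the equation is parabolic.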

#### 1. Gauge covariance

The reason why (1.3) is natural is that it is (formally) gauge covariant in law. To see this, it is convenient to work in coordinate-free notation and, for the moment, make a distinction between connections and 1-forms. Recall that the space of connections is an affine space modeled on the space Ω^{1} of $\mathfrak{g}$-valued 1-forms [we are using the global section of *P* to identify ad(*P*)-valued forms with $\mathfrak{g}$-valued forms]. For a connection *A*, recall further that d_{A}: Ω^{k} → Ω^{k+1} is a linear map from $\mathfrak{g}$-valued *k*-forms to (*k* + 1)-forms with adjoint $\mathrm{d}_A^*\colon\Omega^{k+1}\to\Omega^k$. Furthermore, gauge transformations *g* act on connections via *A* ↦ *A*^{g} and on forms via *ω* ↦ Ad_{g}*ω*.

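Consider a connection *A* and gauge transformation *g*. Then, $B\stackrel{\mathrm{def}}{=}A^g$ satisfies

$$\mathrm{d}_B=\mathrm{Ad}_g\circ\mathrm{d}_A\circ\mathrm{Ad}_g^{-1},\qquad\mathrm{d}_B^*=\mathrm{Ad}_g\circ\mathrm{d}_A^*\circ\mathrm{Ad}_g^{-1},\qquad F(B)=\mathrm{Ad}_gF(A).$$

Let *Z* denote the canonical flat connection associated with our choice of global section, i.e., the connection associated with the 1-form 0. Consider a time-dependent 1-form *ξ* and suppose that *A* solves (1.3), which we now write as

$$\partial_tA=-\mathrm{d}_A^*F(A)-\mathrm{d}_A\mathrm{d}_A^*(A-Z)+\xi.$$

Then, for a time-dependent gauge transformation *g*, $B=A^g$ solves

$$\partial_tB=-\mathrm{d}_B^*F(B)-\mathrm{d}_B\mathrm{d}_B^*(B-Z^g)+\mathrm{Ad}_g\xi-\mathrm{d}_B[(\partial_tg)g^{-1}].$$

To bring the equation for *B* into a form similar to that for *A*, it is natural to take *g* that solves

$$(\partial_tg)g^{-1}=\mathrm{d}_B^*(Z^g-Z),$$

in which case the equation for *B* becomes

$$\partial_tB=-\mathrm{d}_B^*F(B)-\mathrm{d}_B\mathrm{d}_B^*(B-Z)+\mathrm{Ad}_g\xi,$$

which is the same equation as that for *A*, except that *ξ* is replaced by Ad_{g}*ξ*.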

If we now assume *ξ* is a white noise and that the equations above make sense and have global-in-time solutions, then Ad_{g}*ξ* is equal in law to *ξ* by Itô isometry. This formal argument suggests that there is a coupling between two solutions *A*, *B* to (1.3) starting from gauge equivalent initial conditions such that *A*(*t*) ∼ *B*(*t*) for all *t* ≥ 0. In particular, the law of the projected process [*A*] on gauge orbits is equal to that of [*B*]. The projected process on gauge orbits is therefore well-defined and Markov, and its invariant probability measure is a natural candidate for the YM measure.

### C. Main results

The basic objective of Refs. 11 and 12 is to make rigorous the above formal argument in the case of **T**^{d} for *d* = 2 and *d* = 3, respectively. In particular, they aim to define a natural Markov process on gauge orbits associated with the YM Langevin dynamic (1.3). In this subsection, we give unified statements of the main results in Refs. 11 and 12. We will go into more detail and discuss the differences in proofs in Secs. II and III. We also discuss in detail these results in the simple case that *G* = *U*(1) in Sec. I D.

Unless *G* is Abelian, this equation is non-linear and *singular*. To simplify notation and highlight the nature of the non-linearities, we will henceforth write (1.2) and (1.3) as

$$\partial_tA=\Delta A+A\partial A+A^3+\xi. \tag{1.4}$$

To see why (1.4) is singular, recall that the white noise *ξ* on **R** × **T**^{d} can be realized as a random distribution in $C^{-1-\frac{d}{2}-\kappa}$, the Hölder–Besov space with parabolic scaling, for arbitrary *κ* > 0. Furthermore, this regularity is optimal at least in the scale of Besov spaces. By Schauder estimates, we therefore expect that *A* is in $C^{1-\frac{d}{2}-\kappa}$ (and no better), but this renders the non-linear terms *A∂A* and *A*^{3} analytically ill-defined once *d* ≥ 2 because $1-\frac{d}{2}-\kappa<0$. [We recall here that the bilinear map (*f*, *g*) ↦ *fg* defined for smooth functions extends to *C*^{α} × *C*^{β} if and only if *α* + *β* > 0.]

Solution theories, such as regularity structures^{36} and paracontrolled calculus^{35} (see also Refs. 27, 43, and 56), have been developed in the last decade, which allow one to make sense of such equations. The key condition that must be satisfied is *subcriticality*, which happens if and only if *d* < 4 and which parallels the notion of super-renormalizability in quantum field theory (QFT). Subcriticality implies that the solution *A* to (1.4) should be a perturbation of the stochastic heat equation (SHE),

$$\partial_t\Psi=\Delta\Psi+\xi.$$
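The regularity bookkeeping behind these statements can be mimicked with exact arithmetic. The sketch below is a heuristic power-counting exercise and not part of the solution theories cited: it drops the arbitrarily small loss *κ*, assigns *A* the exponent 1 − *d*/2, assigns to *A∂A* and *A*^{3} the scaling exponents 2(1 − *d*/2) − 1 and 3(1 − *d*/2), and calls the equation subcritical when both exceed the exponent −1 − *d*/2 of the noise. All function names are hypothetical.

```python
from fractions import Fraction

def noise_regularity(d):
    """Parabolic Hölder exponent of space-time white noise on R x T^d (kappa dropped)."""
    return -1 - Fraction(d, 2)

def solution_regularity(d):
    """Schauder heuristic: inverting the heat operator gains two derivatives."""
    return noise_regularity(d) + 2

def product_defined(alpha, beta):
    """Classical criterion: the product extends to C^alpha x C^beta iff alpha + beta > 0."""
    return alpha + beta > 0

def is_subcritical(d):
    """Power counting: both non-linear terms must scale better than the noise."""
    eta = solution_regularity(d)
    scaling_A_dA = 2 * eta - 1   # scaling exponent of A * (derivative of A)
    scaling_A3 = 3 * eta         # scaling exponent of A^3
    return min(scaling_A_dA, scaling_A3) > noise_regularity(d)

subcritical_dimensions = [d for d in range(1, 7) if is_subcritical(d)]
```

Both conditions in `is_subcritical` reduce to 1 − *d*/2 > −1, which is the condition *d* < 4 quoted above.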

Recall that *A* represents a geometric object (a principal *G*-connection). On the other hand, the solution *A* to (1.4) at positive times is expected to be a distribution of the same regularity as the Gaussian free field (GFF). Therefore, a first natural question is whether there exists a state space $S$ large enough to support the GFF while small enough so that gauge equivalence extends to $S$. One of the main results of Refs. 11 and 12 gives an answer to this question, which can be informally stated as follows.

Theorem 1.3. *There exists a metric space $(S,\Sigma)$ of $\mathfrak{g}$-valued distributional 1-forms on $\mathbf{T}^d$, $d=2,3$, which contains all smooth 1-forms and to which gauge equivalence approximately extends in a canonical way. Furthermore, $S$ contains distributions of the same regularity as the GFF on $\mathbf{T}^d$.*

The constructions of $S$ in Refs. 11 and 12 are rather different. In Ref. 11, $S$ (therein denoted by $\Omega^1_\alpha$) is a Banach space defined through line integrals, and gauge equivalence is determined by an action of a gauge group. In Ref. 12, $S$ is a non-linear metric space of distributions defined in terms of the effect of the heat flow; gauge equivalence is extended using a gauge-covariant regularizing operator (the deterministic YM flow). We describe these constructions further in Secs. II A and III A.

A naive construction of $S$ would be to take *C*^{η} for $\eta=1-\frac{d}{2}-\kappa$ and quotient by the action of the gauge group *C*^{η+1}(**T**^{d}, *G*) ∩ *C*^{−η+κ}(**T**^{d}, *G*) so that Ad_{g}*A* ∈ *C*^{η} and (d*g*)*g*^{−1} ∈ *C*^{η} are well-defined. Such a construction ends up lacking most of the nice properties we discuss in Secs. II A and III A.

While Theorem 1.3 makes it seem like the 2D and 3D cases are on an equal footing, we actually know much more about $S$ in 2D than in 3D, e.g., the space of orbits $S/{\sim}$ in 2D comes with a natural complete metric, and is thus Polish, while we only know that $S/{\sim}$ is completely Hausdorff (and separable) in 3D.

We now turn to the question of solving (1.4). A natural approach is to replace *ξ* by a smooth approximation *ξ*^{ɛ} and let the mollification parameter *ɛ* ↓ 0. The hope then is that the corresponding solutions *A* converge. Unfortunately, this is not, in general, the case and the stochastic partial differential equation (SPDE) requires *renormalization* for convergence to take place. The following result ensures that renormalized solutions to (1.4) exist, at least up to a potential finite time blow-up.

A *mollifier* is a smooth compactly supported function *χ*: **R** × **R**^{d} → **R** such that ∫*χ* = 1 and which is spatially symmetric and invariant under flipping coordinates *x*_{i} ↦ −*x*_{i} for *i* = 1, …, *d*. Denoting $\chi^\varepsilon(t,x)=\varepsilon^{-2-d}\chi(\varepsilon^{-2}t,\varepsilon^{-1}x)$ and $*$ for space-time convolution, we define $\xi^\varepsilon=\chi^\varepsilon*\xi$. Furthermore, define

$$L_G(\mathfrak{g},\mathfrak{g})=\{C\in L(\mathfrak{g},\mathfrak{g}):C\,\mathrm{Ad}_g=\mathrm{Ad}_g\,C\text{ for all }g\in G\}.$$

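Theorem 1.6. *For every mollifier $\chi$, there exists a family of operators $\{C^\varepsilon_{\mathrm{bphz}}\}_{\varepsilon\in(0,1)}\subset L_G(\mathfrak{g},\mathfrak{g})$ such that, for any $\mathring{C}\in L(\mathfrak{g},\mathfrak{g})$ and initial condition $A(0)\in S$, the solution to the partial differential equation (PDE),*

$$\partial_tA=\Delta A+A\partial A+A^3+(C^\varepsilon_{\mathrm{bphz}}+\mathring{C})A+\xi^\varepsilon, \tag{1.5}$$

*converges in probability in $C(\mathbf{R}_+,S\sqcup\{☠\})$ as $\varepsilon\downarrow0$. The limit as $\varepsilon\downarrow0$ furthermore does not depend on $\chi$.*

Here, an operator $C\in L(\mathfrak{g},\mathfrak{g})$ acts on a 1-form *A* componentwise, i.e., on each component *A*_{i}.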

Definition 1.7. We call the *ɛ* ↓ 0 limit of *A* as in Theorem 1.6 the *solution to (1.5) driven by* *ξ* *with bare mass* $\mathring{C}$.

The bare mass $\mathring{C}$ is used to parameterize the space of all “reasonable” solutions and is a free parameter at this stage. We will see below (Theorems 1.11 and 1.14) that there does exist a unique choice for $\mathring{C}$, which selects a distinguished element of this solution space.

Remark 1.8. The operators $C^\varepsilon_{\mathrm{bphz}}$ are called the *Bogoliubov–Parasiuk–Hepp–Zimmermann (BPHZ) constants* and are given by (in principle, explicit) integrals involving *χ* and an arbitrary large scale truncation *K* of the heat kernel. While the solution to (1.5) in Definition 1.7 is independent of *χ*, it does, in general, depend on the choice of *K* used to define $C^\varepsilon_{\mathrm{bphz}}$.

The point ☠ is a cemetery state and is added to $S$ to handle the possibility of finite time blow-up. Some care is needed to properly define $C(\mathbf{R}_+,S\sqcup\{☠\})$ and the metric that one equips it with. This is done in Ref. 11 (Sec. 1.5.1), where, for a general metric space *E*, a metric space *E*^{sol} of continuous paths with values in *E* ⊔ {☠} is defined in which two paths are close if they track each other until the point when they become large. Our notation $C(\mathbf{R}_+,S\sqcup\{☠\})$ here really means $S^{\mathrm{sol}}$.

Remark 1.10. It turns out that in 2D, due to a cancellation in renormalization constants, $C^\varepsilon_{\mathrm{bphz}}$ converges to a finite value as *ɛ* ↓ 0 (see Theorem 2.5). Therefore, Theorem 1.6 in 2D remains true if $C^\varepsilon_{\mathrm{bphz}}+\mathring{C}$ is replaced by any fixed $C\in L(\mathfrak{g},\mathfrak{g})$, which is the formulation of Ref. 11 (Theorem 2.4). No such cancellation occurs in 3D and $C^\varepsilon_{\mathrm{bphz}}$ diverges at order *ɛ*^{−1}.

We now discuss the way in which solutions to (1.5) are gauge covariant in the sense described in Sec. I B 1. Note that by inserting the counterterm $(C^\varepsilon_{\mathrm{bphz}}+\mathring{C})A$, we are seemingly breaking the desired gauge covariance property discussed in Sec. I B 1 [in the notation of that section, $(C^\varepsilon_{\mathrm{bphz}}+\mathring{C})A$ should be written $(C^\varepsilon_{\mathrm{bphz}}+\mathring{C})(A-Z)$]. However, the formal argument in Sec. I B 1 also breaks if we replace *ξ* by *ξ*^{ɛ} because Itô isometry is not true for the latter.

A surprising fact is that if one chooses $\mathring{C}$ carefully, then the broken gauge covariance [due to the counterterm $(C^\varepsilon_{\mathrm{bphz}}+\mathring{C})A$] compensates in the *ɛ* ↓ 0 limit for the broken Itô isometry (due to the mollified noise *ξ*^{ɛ}), and one obtains a solution to (1.5), which is gauge covariant in law.

It is not entirely trivial to make this statement precise, essentially because we do not know if (1.5) (with any bare mass) has global-in-time solutions. In particular, we do not know how to rule out that solutions to (1.5) with different gauge equivalent initial conditions *a* ∼ *b* blow up at different times, and this makes it unclear in what sense we can expect the projected process [*A*] on gauge orbits to be Markov.

To address this issue, it is natural to look for a type of process that solves (1.5) on disjoint intervals [*ς*_{j−1}, *ς*_{j}) and at time *ς*_{j} jumps to a new representative of the gauge orbit $[\lim_{t\uparrow\varsigma_j}A(t)]$. This should happen in such a way that *A* does not blow up unless the entire orbit [*A*] “blows up.” This class of processes is defined through *generative probability measures* in Refs. 11 and 12.

We say that a probability measure *μ* on the space of càdlàg functions $D(\mathbf{R}_+,S\sqcup\{☠\})$ is *generative* with bare mass $\mathring{C}$ and initial condition $a\in S$ if there exists a white noise *ξ* and a random variable *A* with law *μ* such that

- (i) *A*(0) = *a* almost surely;
- (ii) there exists a sequence of stopping times *ς*_{0} = 0 ≤ *ς*_{1} ≤ *ς*_{2} < ⋯ such that *A* solves (1.5) driven by *ξ* with bare mass $\mathring{C}$ on each interval [*ς*_{j}, *ς*_{j+1});
- (iii) for every *j* ≥ 0, $A(\varsigma_{j+1})\sim\lim_{t\uparrow\varsigma_{j+1}}A(t)$; and
- (iv) $\lim_{j\to\infty}\varsigma_j=T^*\stackrel{\mathrm{def}}{=}\inf\{t\ge0:A(t)=☠\}$ and, on the event *T*^{*} < *∞*,^{65}

$$\lim_{t\uparrow T^*}\inf_{B\sim A(t)}\Sigma(B,0)=\infty. \tag{1.6}$$

The point of this definition is to give a sufficiently general and natural way in which (1.5) can be restarted along gauge orbits. The following result from Refs. 11 and 12 ensures the existence of a canonical Markov process associated with (1.5) on the quotient space of gauge orbits $O\stackrel{\mathrm{def}}{=}S/{\sim}$, *provided* the bare mass is chosen in a precise way.

Theorem 1.11.

- (a) *For every $a\in S$ and $\mathring{C}\in L(\mathfrak{g},\mathfrak{g})$, there exists a generative probability measure $\mu$ with bare mass $\mathring{C}$ and initial condition $a$.*
- (b) *There exists $\check{C}\in L_G(\mathfrak{g},\mathfrak{g})$ with the following properties. For all $a\sim b\in S$, if $\mu$, $\nu$ are generative probability measures with initial conditions $a$, $b$, respectively, and bare mass $\check{C}$, then the pushforward measures $\pi_*\mu$ and $\pi_*\nu$ are equal. In particular, the probability measure $\mathbf{P}^x=\pi_*\mu$, where $\mu$ is generative with bare mass $\check{C}$ and initial condition $a\in x\in O$, depends only on $x$. Finally, $\{\mathbf{P}^x\}_{x\in O}$ are the transition functions of a time homogeneous, continuous Markov process on $O\sqcup\{☠\}$.*

Theorem 1.11(b) makes no claims about the *uniqueness* of $\check{C}$, but we conjecture that $\check{C}$ is indeed unique (which is not difficult to prove in the Abelian case; see Sec. I D).

Finally, we mention a result in Refs. 11 and 12 that is crucial for the proof of Theorem 1.11(b) and that makes precise the coupling argument outlined in Sec. I B 1. This result makes a stronger statement about the constant $\check{C}$ for which uniqueness does hold. It can also be seen as a version of the Slavnov–Taylor identities for renormalization schemes that preserve gauge symmetries.

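Suppose that *A* solves (1.5) and, recalling that Δ*A* + *A∂A* + *A*^{3} + *CA* is shorthand for $-\mathrm{d}_A^*F(A)-\mathrm{d}_A\mathrm{d}_A^*(A-Z)+C(A-Z)$, suppose that (*B*, *g*) solves

$$\partial_tB=-\mathrm{d}_B^*F(B)-\mathrm{d}_B\mathrm{d}_B^*(B-Z)+C(B-Z^g)+\mathrm{Ad}_g\xi^\varepsilon,\qquad(\partial_tg)g^{-1}=\mathrm{d}_B^*(Z^g-Z), \tag{1.7}$$

where $C=C^\varepsilon_{\mathrm{bphz}}+\mathring{C}$ commutes with Ad_{g} [i.e., $\mathring{C}\in L_G(\mathfrak{g},\mathfrak{g})$] and the initial condition for *B* is *B*(0) = *A*(0)^{g(0)}. Then, the same computation as in Sec. I B 1 (see also Ref. 11, Sec. 2.2) shows that *A*^{g} = *B*. In coordinates, (1.7) is written as

$$\partial_tB=\Delta B+B\partial B+B^3+CB+C(\mathrm{d}g)g^{-1}+\mathrm{Ad}_g\xi^\varepsilon,\qquad(\partial_tg)g^{-1}=\partial_j\big((\partial_jg)g^{-1}\big)+[B_j,(\partial_jg)g^{-1}]. \tag{1.8}$$

On the other hand, consider $(\bar{A},\bar{g})$ with the same initial condition as (*B*, *g*) solving

$$\partial_t\bar{A}=\Delta\bar{A}+\bar{A}\partial\bar{A}+\bar{A}^3+C\bar{A}+\chi^\varepsilon*(\mathrm{Ad}_{\bar{g}}\xi),\qquad(\partial_t\bar{g})\bar{g}^{-1}=\partial_j\big((\partial_j\bar{g})\bar{g}^{-1}\big)+[\bar{A}_j,(\partial_j\bar{g})\bar{g}^{-1}]. \tag{1.9}$$

By Itô isometry, $\mathrm{Ad}_{\bar{g}}\xi$ is equal in law to *ξ* and, hence, $\chi^\varepsilon*(\mathrm{Ad}_{\bar{g}}\xi)$ is equal in law to *ξ*^{ɛ}; thus, $\bar{A}$ is equal in law to the solution of (1.5) with initial condition *B*(0), provided we take *χ* non-anticipative in the following sense.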

A mollifier *χ* is called *non-anticipative* if it has support in (0, *∞*) × **R**^{d}.

What we would therefore like to show is that, for a special choice of $\mathring{C}$, the solutions to (1.8) and (1.9) converge as *ɛ* ↓ 0 to the same limit. The identity *A*^{g} = *B*, which survives in the limit, would provide a coupling between (1.5) with initial condition *A*(0) and initial condition *A*(0)^{g(0)} under which the two solutions are gauge equivalent, at least locally in time. It turns out that the following more general result is true.

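Theorem 1.14. *There exists a unique $\check{C}\in L_G(\mathfrak{g},\mathfrak{g})$ such that for all non-anticipative mollifiers $\chi$, all $\mathring{C}\in L(\mathfrak{g},\mathfrak{g})$, and all initial conditions $(B(0),g(0))\in S\times C^\rho(\mathbf{T}^d,G)$, $\rho\in(\frac12,1)$, the solution $(B,g)$ to (1.8) converges as $\varepsilon\downarrow0$ in probability to the same limit as the solution to*

$$\partial_t\bar{A}=\Delta\bar{A}+\bar{A}\partial\bar{A}+\bar{A}^3+(C^\varepsilon_{\mathrm{bphz}}+\mathring{C})\bar{A}+(\mathring{C}-\check{C})(\mathrm{d}\bar{g})\bar{g}^{-1}+\chi^\varepsilon*(\mathrm{Ad}_{\bar{g}}\xi),\qquad(\partial_t\bar{g})\bar{g}^{-1}=\partial_j\big((\partial_j\bar{g})\bar{g}^{-1}\big)+[\bar{A}_j,(\partial_j\bar{g})\bar{g}^{-1}]. \tag{1.10}$$

*Furthermore, the solution to (1.5) with bare mass $\check{C}$ is independent of $\chi$ and of the choice of $C^\varepsilon_{\mathrm{bphz}}$.*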

It follows from Theorem 1.14 that the solutions to (1.8) and (1.9) indeed converge to the same limit as *ɛ* ↓ 0, provided we choose $\mathring{C}=\check{C}$. The operator $\check{C}$ in Theorem 1.14 is exactly the operator appearing in Theorem 1.11(b); in 2D, we can give an explicit expression for $\check{C}$ [see (2.13)].

The value of $\check{C}$ in Theorem 1.14 is determined uniquely *after* we fix a choice for $C^\varepsilon_{\mathrm{bphz}}$. However, recall from Remark 1.8 that $C^\varepsilon_{\mathrm{bphz}}$ is not unique or canonical; it is determined by *χ* and an arbitrary truncation of the heat kernel (see, e.g., Theorem 2.5). The final part of Theorem 1.14 states that the solution of (1.5) with bare mass $\check{C}$ is independent of *χ* and this choice of truncation.

### D. Abelian case

We end this section by discussing the above results in the Abelian case, i.e., *G* = *U*(1) and $\mathfrak{g}=\mathbf{R}$. We consider here *d* ≥ 1 arbitrary. We will see in this case that

- the constant in Theorem 1.11(b) is $\check{C}=0$ and is *unique*,
- if **T**^{d} is replaced by **R**^{d}, then uniqueness of $\check{C}$ fails (Remark 1.16),
- (1.5) with bare mass $\check{C}=0$ has global-in-time solutions but no invariant probability measure (Remark 1.17).

In the Abelian case, the non-linear terms *A∂A* and *A*^{3} as well as the constants $C^\varepsilon_{\mathrm{bphz}}$ vanish. Equation (1.5) with bare mass $\mathring{C}$, therefore, becomes linear in *A* and converges as *ɛ* ↓ 0 to the solution of the SHE with a mass term

$$\partial_tA=\Delta A+\mathring{C}A+\xi. \tag{1.11}$$

Treating functions (or distributions) on **T**^{d} as periodic functions (or distributions) on **R**^{d} modulo **Z**^{d}, it is easy to see that *A* ∼ *B* if and only if *A* = *B* + d*ω* for some *ω*: **R**^{d} → **R** such that *e*^{iω} is periodic, where $i=\sqrt{-1}$. The tangent space of every gauge orbit is therefore {d*ω*: *ω*: **T**^{d} → *i***R**}.

Since Ad_{g} is now the identity, it is clear that a possible value for $\check{C}$ in Theorems 1.11 and 1.14 is $\check{C}=0$. This is because if *B*(0) = *A*(0) + d*ω*(0) and *A*, *B* solve (1.11) with $\mathring{C}=0$, then *B* = *A* + d*ω* for all times, where *ω* solves *∂*_{t}*ω* = Δ*ω*.

In fact, $\check{C}=0$ is the *only* possible value for $\check{C}$. Indeed, consider the two gauge equivalent initial conditions $A(0)\stackrel{\mathrm{def}}{=}0$ and $B(0)=(B_1(0),\ldots,B_d(0))\stackrel{\mathrm{def}}{=}(2\pi,0,\ldots,0)$, and suppose that *A*, *B* solve (1.11) with $\mathring{C}\neq0$. Then, $B(t)=e^{t\mathring{C}}B(0)+A(t)$, where we used that *B*(0) is constant on **T**^{d}. Consider now the gauge-invariant observable $I[A]\stackrel{\mathrm{def}}{=}e^{i\int_{\mathbf{T}^d}A_1}$. Then, $\int_{\mathbf{T}^d}B_1(t)=e^{t\mathring{C}}2\pi+\int_{\mathbf{T}^d}A_1(t)$, and thus,

$$\mathbf{E}\,I[B(t)]=e^{2\pi ie^{t\mathring{C}}}\,\mathbf{E}\,I[A(t)]\neq\mathbf{E}\,I[A(t)]$$

for all *t* > 0 sufficiently small since $\mathring{C}\neq0$. This shows that *A* and *B* cannot be gauge equivalent in law; thus, $\check{C}=0$ is the only possible value in Theorem 1.11(b).
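The mechanism in this argument is a single phase factor, which can be checked with a few lines of arithmetic. The sketch below assumes, as in the text, that the spatial mean of *B*_{1}(*t*) exceeds that of *A*_{1}(*t*) by exp(*t* · bare mass) · 2π, so that the two observables differ by the factor exp(2π*i* · exp(*t* · bare mass)). The function name `phase_factor` and the sample values of *t* and of the bare mass are illustrative choices only.

```python
import cmath
import math

def phase_factor(t, bare_mass):
    """Ratio I[B(t)] / I[A(t)] = exp(2*pi*i * exp(t * bare_mass)) for a scalar bare mass."""
    return cmath.exp(2j * math.pi * math.exp(t * bare_mass))

# With zero bare mass the shift stays equal to 2*pi and the observable is unchanged.
factor_zero_mass = phase_factor(0.1, 0.0)

# With a non-zero bare mass the shift leaves 2*pi*Z and the observable picks up a phase.
factor_nonzero_mass = phase_factor(0.1, 0.5)
```

Only for bare mass zero does the factor equal 1 for all small *t* > 0, which is the uniqueness statement above in numerical form.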

Remark 1.16. The above argument relies on the fact that **T**^{d} is not simply connected: we exploited that (d*g*)*g*^{−1}, which appears in (1.10), is not tangent to gauge orbits, in general (it is tangent if and only if *g* = *e*^{iω} for some *ω*: **T**^{d} → **R**). If **T**^{d} is replaced by **R**^{d}, then (d*g*)*g*^{−1} *is* tangent to gauge orbits since we can always write *g* = *e*^{iω} for some *ω*: **R**^{d} → **R**. Therefore, on **R**^{d} in the Abelian case, there is no uniqueness of $\check{C}$. Explicitly, working on **R**^{d}, suppose *A* solves (1.11) and consider *B* as in (1.8), where *B*(0) = *A*(0)^{g(0)} but now *g* satisfies *g* = *e*^{iω}, where *ω* solves $\partial_t\omega=\Delta\omega+\mathring{C}\omega$. Then, *B*(*t*) = *A*(*t*)^{g(t)} = *A*(*t*) − d*ω*(*t*) for all *t* ≥ 0 and *B* solves the same equation (1.11) as *A* for any $\mathring{C}\in\mathbf{R}$.

Remark 1.17. Clearly (1.11) with $\mathring{C}=0$ does not have an invariant *probability* measure because $\int_{\mathbf{T}^d}A_i$ evolves like a Brownian motion. In fact, any gauge equivalent generalization of (1.5) of the form *∂*_{t}*A* = Δ*A* + *ξ* + d*ω*, where *ω* is adapted, will lack an invariant probability measure because the spatial mean is unaffected by d*ω*. However, we do obtain an invariant probability measure for the projected process [*A*] because any 1-form is gauge equivalent to another 1-form *B* such that $\int_{\mathbf{T}^d}B_1,\ldots,\int_{\mathbf{T}^d}B_d\in[-\pi,\pi)$. This remark shows that the Markov process from Theorem 1.11 can have an invariant probability measure, while (1.5) with bare mass $\check{C}$ does not.
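The reduction of the spatial means to [−π, π) is plain modular arithmetic. The sketch below assumes the identification used above, in which gauge equivalent 1-forms on the unit torus differ by d*ω* with *e*^{iω} periodic, so that a gauge transformation of winding number *n* in a given direction shifts the corresponding mean by 2π*n*. The helper names `reduce_mean` and `gauge_shift` are hypothetical.

```python
import math

def reduce_mean(m):
    """Representative of the mean m modulo 2*pi in the interval [-pi, pi)."""
    return (m + math.pi) % (2 * math.pi) - math.pi

def gauge_shift(m):
    """Integer n with m - 2*pi*n equal to the reduced mean (winding of the gauge map)."""
    return round((m - reduce_mean(m)) / (2 * math.pi))

# A mean that has drifted outside [-pi, pi), as a Brownian motion eventually does.
drifted = 7.0
reduced = reduce_mean(drifted)    # equals 7 - 2*pi
winding = gauge_shift(drifted)    # one full winding removes 2*pi
```

The unreduced mean wanders over all of **R**, whereas its reduction lives on a circle, which is why only the projected process can be stationary.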

## II. TWO DIMENSIONS

### A. State space

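The definition of the state space $S$ (denoted by $\Omega^1_\alpha$ in Ref. 11) is motivated by the desire to define holonomies and, thus, Wilson loops (for every element $A\in S$). The construction is a refinement of that introduced in Ref. 17. Let $X=\mathbf{T}^2\times B_{1/4}$, where $B_{1/4}=\{v\in\mathbf{R}^2:|v|\le\frac14\}$. We think of $X$ as the collection of straight line segments *ℓ* = (*x*, *v*) in **T**^{2} of length $|\ell|\stackrel{\mathrm{def}}{=}|v|\le\frac14$. (The starting point of *ℓ* is *x*.)

For *α* ∈ [0, 1] and a smooth 1-form $A\in C^\infty(\mathbf{T}^2,\mathfrak{g}^2)$, define the norm

$$|A|_\alpha\stackrel{\mathrm{def}}{=}|A|_{\alpha\text{-gr}}+\sup_{|P|>0}\frac{|A(\partial P)|}{|P|^{\alpha/2}},\qquad|A|_{\alpha\text{-gr}}\stackrel{\mathrm{def}}{=}\sup_{|\ell|>0}\frac{|A(\ell)|}{|\ell|^\alpha},$$

where the supremum defining $|A|_{\alpha\text{-gr}}$ is over all $\ell=(x,v)\in X$ with |*ℓ*| > 0 and where we define the *line integral*

$$A(\ell)\stackrel{\mathrm{def}}{=}\int_0^1\sum_{i=1}^2A_i(x+tv)v_i\,\mathrm{d}t,$$

and the remaining supremum is over all triangles *P* = (*ℓ*_{1}, *ℓ*_{2}, *ℓ*_{3}) with $\ell_i\in X$ and area |*P*| > 0 and $A(\partial P)\stackrel{\mathrm{def}}{=}\sum_{i=1}^3A(\ell_i)$. We can now define the state space studied in Ref. 11.

The Banach space $(S,|\cdot|_\alpha)$ is defined as the completion of smooth $\mathfrak{g}$-valued 1-forms under |·|_{α} for some $\alpha\in(\frac23,1)$.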

The metric Σ in Theorem 1.3 is then the usual metric Σ(*A*, *B*) = |*A* − *B*|_{α}.

To motivate these norms, consider *A* = (*A*_{1}, *A*_{2}) a pair of i.i.d. GFFs. A simple calculation shows that, for all *α* < 1, **E**|*A*(*ℓ*)|^{2} ≲ |*ℓ*|^{2α}. Furthermore, it follows from the Stokes theorem that *A*(*∂P*) = ∫_{P}d*A* and, since d*A* is a white noise, **E**|*A*(*∂P*)|^{2} = |*P*|. A Kolmogorov argument then implies that the GFF has a modification with |*A*|_{α} < *∞* almost surely.

Let $\mathfrak{G}^{0,\alpha}$ denote the closure of smooth functions in *C*^{α}(**T**^{2}, *G*). One can show that the group action (*A*, *g*) ↦ *A*^{g}, defined for smooth 1-forms and gauge transformations, extends to a locally Lipschitz map $S\times\mathfrak{G}^{0,\alpha}\to S$ [see Ref. 11 (Theorem 3.27, Corollary 3.36)]. We then extend gauge equivalence ∼ to $S$ by

$$A\sim B\iff\text{there exists }g\in\mathfrak{G}^{0,\alpha}\text{ such that }A^g=B.$$

We list several properties of $S$ and ∼.

- For every $\ell=(x,v)\in X$ and $A\in S$, one has $|\ell_A|_{C^\alpha}\le|\ell|^\alpha|A|_{\alpha\text{-gr}}$, where $\ell_A\colon[0,1]\to\mathfrak{g}$ is the path $\ell_A(t)=\int_0^tA(x+sv)v\,\mathrm{d}s$. The holonomy hol(*A*, *ℓ*) ∈ *G*, defined by hol(*A*, *ℓ*) = *y*_{1}, where *y* solves the ordinary differential equation (ODE)

  $$\mathrm{d}y_t=y_t\,\mathrm{d}\ell_A(t),\qquad y_0=1,$$

  is, therefore, well-defined by Young integration.^{31,50,63}
- More generally, hol(*A*, *γ*) is well-defined for *any* *γ* ∈ *C*^{1,β}([0, 1], **T**^{2}), where $\beta\in(2\alpha-1,1]$, and the map (*A*, *γ*) ↦ hol(*A*, *γ*) is Hölder continuous. In particular, classical Wilson loop observables are well-defined with good stability properties. The relation ∼ can be expressed entirely in terms of holonomies.

Since hol(*A*, *γ*) is independent of the parameterization of *γ*, it is natural to also measure the regularity of *γ* in a parameterization independent way. Such a notion of regularity is introduced in Ref. 11 (Sec. 3.2), which interpolates between *C*^{1} and *C*^{2} (akin to how *p*-variation is a parameterization invariant interpolation between *C*^{0} and *C*^{1}).

- One has the embeddings

  $$C^{\alpha/2}\hookrightarrow S\hookrightarrow\Omega^1_{\alpha\text{-gr}}\hookrightarrow C^{\alpha-1},$$

  where $\Omega^1_{\alpha\text{-gr}}$ is the completion of smooth functions under $|\cdot|_{\alpha\text{-gr}}$. (Only the last of these is non-trivial; see Ref. 17, Proposition 3.21.) These embeddings are furthermore optimal in the sense that *α*/2 in *C*^{α/2} cannot be decreased and *α* − 1 in *C*^{α−1} cannot be increased. Remark also that $|A|_{1\text{-gr}}\asymp|A|_{L^\infty}$, while $S$, since *α* < 1, contains distributions that cannot be represented by functions, such as the GFF.
- There exists a complete metric *D* on the quotient space of gauge orbits $O=S/{\sim}$, which induces the quotient topology. To define *D*, we first define a new (but topologically equivalent) metric *k* on $S$ by shrinking the usual metric Σ(·, ·) = |· − ·|_{α} in such a way that the diameter of every *R*-sphere $S_R\stackrel{\mathrm{def}}{=}\{A\in S:|A|_\alpha=R\}$ goes to zero as *R* → *∞*, but the distance between *S*_{r} and *S*_{R} for large *r* ≤ *R* is of order $\frac{R}{r}-1$, so, in particular, it goes to *∞* as *R* → *∞*. Then, *D* is defined as the Hausdorff distance associated with the metric *k* on $S$.

The space $S$ strengthens the definition of a Banach space $S_{\mathrm{ax}}$ introduced in Ref. 17; $S_{\mathrm{ax}}$ is defined in a similar way but with $X$ taken as the set of *axis-parallel* line segments. The main result of Ref. 17 is that if *G* is simply connected, then there exists a (non-unique) probability measure on $S_{\mathrm{ax}}$ such that the holonomies along all axis-parallel curves agree in distribution with those of the YM measure on **T**^{2} defined in Refs. 44 and 60. The proof of this result uses a gauge-fixed lattice approximation, which explains the restriction to axis-parallel lines.

### B. Local solutions

It turns out that in 2D we can sharpen the statement of Theorem 1.6 as follows.

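Theorem 2.5. *For every $\mathring{C}\in L(\mathfrak{g},\mathfrak{g})$, mollifier $\chi$, and initial condition $A(0)\in S$, the solution to*

$$\partial_tA=\Delta A+A\partial A+A^3+(\lambda C^\varepsilon_{\mathrm{sym}}+\mathring{C})A+\xi^\varepsilon \tag{2.2}$$

*converges in probability in $C(\mathbf{R}_+,S\sqcup\{☠\})$ as $\varepsilon\downarrow0$. The constant $C^\varepsilon_{\mathrm{sym}}$ converges to a finite limit as $\varepsilon\downarrow0$ and is defined by*

$$C^\varepsilon_{\mathrm{sym}}=4\hat{C}^\varepsilon-\bar{C}^\varepsilon,\qquad\hat{C}^\varepsilon=\int\partial_jK^\varepsilon(z)\,(\partial_jK*K^\varepsilon)(z)\,\mathrm{d}z,\qquad\bar{C}^\varepsilon=\int K^\varepsilon(z)^2\,\mathrm{d}z.$$

*Here, $K\colon\mathbf{R}\times\mathbf{R}^2\setminus\{0\}\to\mathbf{R}$ is any spatially symmetric function invariant under flipping coordinates $x_i\mapsto-x_i$, $i=1,2$, which vanishes for negative times, has bounded support, and is equal to the heat kernel $(\partial_t-\Delta)^{-1}$ in a neighborhood of the origin. We write $K^\varepsilon=\chi^\varepsilon*K$, and $\partial_j$ is any spatial derivative, $j=1,2$. The operator $\lambda\in L_G(\mathfrak{g},\mathfrak{g})$ is the Casimir element of $\mathfrak{g}$ in the adjoint representation.*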

*The* *ɛ* ↓ 0 *limit of* *A**, which solves* *(1.5)* *with bare mass* $C\u030a$*, depends on* *K* *and* $C\u030a$ *but not on* *χ**.*

If $g$ is simple (which one can assume without loss of generality; see Ref. 11, Remark 2.8), then *λ* < 0 is just a scalar.

Recall from Remark 1.10 that the convergence of $C\u2009bphz\epsilon =def\lambda C\u2009sym\epsilon $ to a finite limit is special to dimension 2 and is due to a cancellation between the diverging constants $4C\u0302\epsilon $ and $C\u0304\epsilon $.

We briefly describe the ingredients in the Proof of Theorem 2.5, which is based on the theory of regularity structures. We only mention the overall strategy behind this theory and refer to Refs. 14, 31, and 37 for an introduction and more details. To solve an SPDE such as (1.4), one constructs a sufficiently large “regularity structure” and lifts the equation to a space of “modeled distributions” with values in the regularity structure. This construction, first introduced in Ref. 36, is done at a high level of generality in Ref. 8. One then constructs a finite number of stochastic objects from the noise called a “model”—these objects are essentially renormalizations of functions of forms (2.3) and (2.4)—the existence of which follows from Ref. 13. The point of construction is that the products *A∂A*, *A*^{3} and convolution with the heat kernel become stable operations on modeled distributions, and one can solve a fixed point problem for the “lifted” equation. Finally, one maps the resulting modeled distribution to a distribution on **R** × **T**^{2} via the “reconstruction operator” and identifies it with a solution to a classical renormalized PDE, at least for *ɛ* > 0. This final step is carried out systematically in Ref. 6. All these operations are done in a way that is stable as *ɛ* ↓ 0, thereby showing the desired convergence.

One of the contributions of Ref. 11 is to develop a framework in which the algebraic results from Refs. 6 and 8 can be transferred to a setting in which the noise and solution are vector-valued. References 6 and 8 provided a general method to compute the renormalized form of a system of *scalar* SPDEs, which, in principle, does apply to (1.4) by writing it as a system of $d \times \dim(g)$ scalar-valued equations using a basis. However, such a procedure is cumbersome and unnatural; it is more desirable to find a framework that preserves the vector-valued nature of the noise and solution, which is the purpose of Ref. 11 (Sec. 5).

The main idea behind the extension in Ref. 11 is to define a category of “symmetric sets” and a functor between this category and the category of vector spaces. This construction allows one to canonically associate partially symmetrized tensor products of vector spaces to combinatorial rooted trees that commonly appear in regularity structures. One of the main outcomes is a procedure to compute the renormalized form of equations like (1.4) without resorting to a basis.

It follows from the general theory of regularity structures that (2.2), for any $\mathring{C} \in L(g,g)$, converges locally in time in *C*^{α−1}. To improve this to convergence in $C(\mathbf{R}_+, S \sqcup \{\})$, one decomposes the solution *A* into *A* = Ψ + *B*, where Ψ solves the SHE *∂*_{t}Ψ = ΔΨ + *ξ* with initial condition Ψ(0) = *A*(0) and *B* is in *C*^{1−κ} for all *κ* > 0. One can then show by hand that $\Psi \in C(\mathbf{R}_+, S)$ (see Ref. 11, Sec. 4), which, together with the embeddings $C^{\alpha/2} \hookrightarrow S \hookrightarrow C^{\alpha-1}$, shows that *A* converges to a maximal solution with values in $S$.

### C. Gauge covariance

Recall that Theorems 1.11 and 1.14 imply a form of gauge covariance for (1.5). Theorem 1.11(a) is a relatively straightforward consequence of Theorem 2.5. One defines the random variable *A* by solving (1.5) until the first time that |*A*(*t*)|_{α} ≥ 2 + inf_{B∼A(t)}|*B*|_{α}, at which point one uses a measurable selection $S : O \to S$ to jump to a new small representative *B* of the gauge orbit [*A*(*t*)] for which |*B*|_{α} < 1 + inf_{a∈[A(t)]}|*a*|_{α}. These jump times define the increasing sequence of stopping times $\{\varsigma_j\}_{j \geq 0}$ in item (ii). Items (i)–(iv) all follow readily from the construction.

The proof of Theorem 1.11(b), which is the main statement of Theorem 1.11, requires more work. The idea is to use Theorem 1.14, which we admit for now, to couple the solutions to (1.5) with bare mass $\check{C}$ and initial conditions *a* ∼ *b*. Specifically, let *ν* be a generative probability measure with bare mass $\check{C}$ and initial condition $b \in S$, and consider any *a* ∼ *b*. Letting *B* and $\bar{\xi}$ denote the random variable and white noise, respectively, corresponding to *ν*, it follows from Theorem 1.14 that, on the same probability space, there exists a càdlàg process *A* defined as above with (1.5) driven by $\xi \overset{\mathrm{def}}{=} \mathrm{Ad}_{g^{-1}} \bar{\xi}$ and bare mass $\check{C}$ in such a way that *B* = *A*^{g}. Here, *g* is càdlàg with values in $G^{0,\alpha} \sqcup \{\}$ and jump times contained in those of *A* and *B* and solves (1.8) in between these jump times; see Fig. 1. This shows that the pushforward of *ν* to the orbit space $O$ is equal to the pushforward of the law of *A* from the proof of Theorem 1.11(a), which proves Theorem 1.11(b).

*g* to (1.8) does not blow up before *B* or $A \overset{\mathrm{def}}{=} B^{g^{-1}}$ does. This is due to the elementary but important property of $S$ that

*γ*_{xy} with *γ*_{xy}(0) = *x* to *γ*_{xy}(1) = *y*,

*γ*_{xy} is the shortest such curve, then the distance of the holonomy hol(*A*, *γ*_{xy}) ∈ *G* from the identity in *G* is of order |*x* − *y*|^{α}|*A*|_{α-gr} by Young ODE theory (see Ref. 11, Sec. 3.5).

*χ* and considering the two systems

*U* = Ad_{g} and *h* = (d*g*)*g*^{−1}, which assists in computing the desired renormalized equations.]

Well-posedness for systems (2.7) and (2.8) as *ɛ* ↓ 0 is generally standard. However, a subtlety arises from the multiplicative noise term Ad_{g}*ξ*, which is in *C*^{−2−κ} and leads to problems in posing a suitable fixed point problem (−2 is the threshold regularity at which one cannot extend uniquely a distribution from **R**_{+} × **R**^{d} to **R** × **R**^{d}). This issue is handled by decomposing $g = Pg(0) + \hat{g}$, where $Pg(0)$ is the harmonic extension of *g*(0) to positive times. Then, $\hat{g}$ vanishes at *t* = 0, which makes the product $\mathrm{Ad}_{\hat{g}}\xi$ better behaved, while the product $\mathrm{Ad}_{Pg(0)}\xi$ is shown to be a well-defined distribution in *C*^{−2−κ}(**R** × **R**^{d}) using stochastic estimates.
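The exponent −2 − *κ* quoted above can be recovered by standard power counting. The following is a minimal sketch, assuming the usual parabolic scaling convention (time counts as two spatial dimensions) and the usual two-derivative gain from the heat operator; neither convention is spelled out in the text above.

```latex
% Power counting for space-time white noise \xi under parabolic scaling.
% Assumption: time counts twice, so R x R^d has scaling dimension d + 2;
% \kappa > 0 is arbitrary and \Psi denotes the solution of the SHE driven by \xi.
\begin{align*}
  \xi \in C^{-\frac{d+2}{2}-\kappa}
    &= \begin{cases} C^{-2-\kappa}, & d = 2, \\[2pt] C^{-\frac52-\kappa}, & d = 3, \end{cases} \\
  \Psi \in C^{-\frac{d+2}{2}+2-\kappa}
    &= \begin{cases} C^{-\kappa}, & d = 2, \\[2pt] C^{-\frac12-\kappa}, & d = 3. \end{cases}
\end{align*}
```

In *d* = 2 the noise therefore sits just below the threshold −2, which is consistent with the need for the separate treatment of the term containing $Pg(0)$ described above.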

*same* limit as *ɛ* ↓ 0, the strategy taken in Ref. 11 is to introduce *ɛ*-dependent norms on the underlying regularity structure. The idea behind these norms is that they allow one to lift the basic estimate

*ɛ*^{θ} for some *θ* > 0 sufficiently small. By continuity of the reconstruction operator, it follows that some renormalized forms of (2.7) and (2.8) converge to the same limit as *ɛ* ↓ 0.

*χ* is non-anticipative since this step can be done without it. It follows from a direct computation using the algebraic theory developed in Ref. 6 (see also Ref. 11, Sec. 5) that the renormalized equations are

*ɛ* ↓ 0 because if *G* is the heat kernel, then (*G* \* *G*)(*t*, ·) = *tG*(*t*, ·), which is a bounded function because *d* = 2. The components *g* and $\bar{g}$ do not require renormalization and solve the same equations as in (2.7) and (2.8).

To derive and solve the equation for $\bar{A}$, one substitutes *ξ* by *ξ*^{δ} and takes the limit *δ* ↓ 0 with *ɛ* > 0 fixed; this ensures that all objects are smooth for *ɛ*, *δ* > 0. This also explains the definition of $\tilde{C}^{0,\epsilon}$ as a limit in *δ* ↓ 0.

To summarize, for any $\mathring{C}_1, \mathring{C}_2 \in L(g,g)$ and any mollifier *χ*, the solutions *B* and $\bar{A}$ to (2.9) and (2.10), respectively, converge to the same limit as *ɛ* ↓ 0 over a short random time interval [the argument in Ref. 11 that (*B*, *g*) and $(\bar{A}, \bar{g})$ converge as *maximal* solutions uses non-anticipativity of *χ*].

*ɛ* ↓ 0. Finally, we want to find *C* (playing the role of $\check{C}$) so that $\lambda \tilde{C}^{0,\epsilon} + \mathring{C}_2 = \mathring{C}_1 - C + o(1)$. This desired operator is

*B* and $\bar{A}$ as in Theorem 1.14 with $\check{C} = C$ converge to the same limit over a short time interval as *ɛ* ↓ 0 for *any* mollifier *χ*.

*χ* is *non-anticipative*, then *C* is independent of *χ*. Indeed, $\tilde{C}^{0,\epsilon} = 0$ for non-anticipative *χ*. Furthermore, it follows from the identity (*∂*_{t} − Δ)*K* = *δ* + *Q*, where *δ* is the Dirac delta and *Q* is smooth and supported away from the origin, and a computation with integration by parts (see Ref. 11, Lemma 6.9), that *C* in this case is equal to

*Q* is supported away from the origin, the final limit is independent of *χ*.

To conclude the proof of Theorem 1.14, it remains to show that (1.5) with bare mass $\check{C}$ defined by (2.13) is independent of *χ* and of *K*. Remark that *χ* can now be *any* mollifier, not necessarily non-anticipative. Independence of *χ* follows from the final part of Theorem 2.5 since $\check{C}$ is independent of *χ*. Independence of *K* follows from the fact that $\lim_{\epsilon \downarrow 0}(\lambda C^{\epsilon}_{\mathrm{sym}} + \check{C}) = \lim_{\epsilon \downarrow 0} \lambda \tilde{C}^{\epsilon}$, which does not depend on the choice of *K* since *K* is always equal to the heat kernel near the origin.

The existence of $\check{C}$ with the above properties may appear as a bit of a miracle. Indeed, the fact that $\tilde{C}^{\epsilon}$ and $\tilde{C}^{0,\epsilon}$ converge to finite limits was easy to see because (*G* \* *G*)(*t*, ·) = *tG*(*t*, ·) is a bounded function for the heat kernel *G* in 2D. On the other hand, the fact that $C^{\epsilon}_{\mathrm{sym}}$ converges to a finite limit, and, thus, that $\check{C}$ exists, is not *a priori* obvious because it relies on a cancellation between diverging BPHZ constants $4\hat{C}^{\epsilon}$ and $\bar{C}^{\epsilon}$ in Theorem 2.5. The fact that $\check{C}$ is furthermore independent of *χ* relies on a cancellation between $\tilde{C}^{\epsilon}$ and $C^{\epsilon}_{\mathrm{sym}}$ and that (1.5) with bare mass $\check{C}$ is independent of *K* relies on the expression for $\tilde{C}^{\epsilon}$.
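For completeness, the heat kernel identity used above can be checked in a few lines. The sketch below assumes that *G* is the whole-space heat kernel on **R**^{d} (the reviewed works are set on the torus, where only the behaviour near the origin is the same) and that \* denotes space-time convolution.

```latex
% Assumption: G(t,x) = (4\pi t)^{-d/2} e^{-|x|^2/(4t)} for t > 0 and G(t,x) = 0 for t <= 0.
% The second equality uses the semigroup property G(t-s,\cdot) * G(s,\cdot) = G(t,\cdot).
\begin{align*}
  (G * G)(t,x)
    &= \int_0^t \int_{\mathbf{R}^d} G(t-s, x-y)\, G(s,y)\, dy\, ds
     = \int_0^t G(t,x)\, ds
     = t\, G(t,x), \\
  t\, G(t,x)
    &= \frac{t^{\,1-d/2}}{(4\pi)^{d/2}}\, e^{-|x|^2/(4t)},
  \qquad \text{so for } d = 2: \quad
  0 \le t\, G(t,x) = \frac{1}{4\pi}\, e^{-|x|^2/(4t)} \le \frac{1}{4\pi}.
\end{align*}
```

For *d* = 3 the prefactor becomes *t*^{−1/2}, which is unbounded near the origin, so the same one-line boundedness argument is not available there.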

## III. THREE DIMENSIONS

We now discuss the main results of Ref. 12, which deals with the 3D theory.

### A. State space

*A* is too singular to be restricted to lines. To see this, recall that the correlation function of *A* in 3D behaves like $C(x,y) \asymp \frac{1}{|x-y|}$ for *x*, *y* close [vs *C*(*x*, *y*) ≍ −log|*x* − *y*| in 2D]. Therefore, for *ℓ* = (*x*, *v*),

The construction of the state space $S$ in Ref. 12 proceeds in two steps. The first step is to define a space $I$ of initial conditions for a gauge-covariant regularizing operator. Abstractly, we will find a metric space of distributional 1-forms $(I,\Theta)$ and a family of operators $\{F_t\}_{t>0}$, $F_t : I \to C^{\infty}$, such that

- for smooth *A*, *B*, $$A \sim B \;\Leftrightarrow\; F_t(A) \sim F_t(B) \ \text{ for some } t > 0, \tag{3.1}$$
- $F_t : I \to C^{\infty}$ is continuous for every *t* > 0.

If we can find such $I$ and $\{F_t\}_{t>0}$, then we can extend gauge equivalence ∼ to $I$ by using (3.1) as a *definition*. Finally, we want $I$ to be sufficiently large to contain distributions as rough as the GFF.

*A*(*t*) to the deterministic YM flow (with the DeTurck term),^{66}

The second step in the construction of $S$ is to augment $I$ with an additional norm, which ensures that a form of the bound (2.5) holds. This turns out to be critical in several places of the construction for the Markov process in Theorem 1.11(b).

We mention that the idea to use the YM flow to define a suitable space of distributional 1-forms was already suggested in Ref. 15 (see also Refs. 22, 30, 49, and 55 for related ideas in physics). We also point out that another state space that bears close similarity to $I$ was independently defined in Refs. 9 and 10 and was shown to support the GFF.

#### 1. The first half

*C*^{η} are not suitable for our purposes. The standard strategy to solve (3.2) is to rewrite the equation in mild formulation,

*a* ∈ *C*^{η}, the best we can do to handle the product $P_{t-s}[P_s a \cdot \partial P_s a]$ in $M(Pa)$ is to estimate for *s* > 0,

*C*^{η} for any $\eta < -\frac12$ (and not for $\eta = -\frac12$). Therefore, trying to estimate the term $\int_0^t P_{t-s}[P_s a \cdot \partial P_s a]\, ds$ in the second Picard iterate leads to a non-integrable singularity $\int_0^t s^{\eta - \frac12}\, ds = \infty$. The above estimates are, in general, sharp, which suggests that one cannot start the YM flow (3.2) from the initial data in *C*^{η} with $\eta < -\frac12$ (and even in *C*^{−1/2}).
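The exponent $\eta - \frac12$ in the non-integrable singularity above can be traced with a short power count. This is only a sketch: it assumes that the estimates referred to as (3.3) and (3.4) are the standard heat-flow smoothing bounds for Hölder–Besov spaces of negative regularity, written here in the supremum norm.

```latex
% Assumed standard bounds for a \in C^\eta with \eta < 0, uniformly in s \in (0,1):
\begin{align*}
  |P_s a|_{L^\infty} &\lesssim s^{\frac{\eta}{2}}\, |a|_{C^\eta},
  &
  |\partial P_s a|_{L^\infty} &\lesssim s^{\frac{\eta-1}{2}}\, |a|_{C^\eta}, \\
  |P_s a \cdot \partial P_s a|_{L^\infty}
    &\lesssim s^{\eta-\frac12}\, |a|_{C^\eta}^2, \\
  \int_0^t s^{\eta-\frac12}\, ds < \infty
    &\iff \eta - \tfrac12 > -1 \iff \eta > -\tfrac12 .
\end{align*}
% The GFF in 3D only belongs to C^\eta for \eta < -1/2, so this naive bound
% is not integrable at s = 0.
```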

*a* is a highly non-generic element of *C*^{η} and $P_s a \cdot \partial P_s a$ behaves better than the naive bound (3.4) would suggest. Indeed, a second moment estimate shows that for any *β* ∈ (−1, 0) and $\delta > 1 + \frac{\beta}{2}$, almost surely uniformly in *s* ∈ (0, 1),

*κ* > 0 and then pretend that multiplication is a bounded operator *C*^{β/2} × *C*^{β/2} → *C*^{β}; this is clearly false since *β* < 0, but ends up working for $(P_s a, \partial P_s a)$ due to probabilistic cancellations.]

Since we can take *δ* < 1 in (3.5) [vs $-\eta + \frac12 > 1$ in (3.4)], this improved regularity ends up being enough to show that every Picard iterate of $M$ is well-defined when *a* is the GFF. It is, therefore, natural to make the following definition.

*β* < 0, and $\delta \in (1 + \frac{\beta}{2}, 1)$, let $I$ be the completion of smooth 1-forms under the metric

The space $I$ can be identified with a subset of $C^{0,\eta} \overset{\mathrm{def}}{=} \overline{C^{\infty}}^{C^{\eta}}$ because the map $C^{\eta} \ni A \mapsto P_{\cdot} A \cdot \partial P_{\cdot} A \in C^{\infty}((0,1), C^{\beta})$ has a closed graph.

A standard argument with Young’s product theorem and estimates of the type (3.3) shows that the YM flow extends to $I$ in the following sense.

For every ball $B$ in $(I,\Theta)$ centered at 0, there exists *T* > 0 such that for all *t* ∈ (0, *T*), the YM flow (3.2) extends to a continuous function $F_t : B \to C^{\infty}$ (which is Lipschitz for any norm on *C*^{∞}).

We can therefore extend gauge equivalence ∼ to $I$ by using (3.1) as a definition. We emphasize that $I$ is not a vector space and that this is unavoidable: there is no Banach space that carries the GFF on **T**^{3} and to which the YM flow (3.2) extends as a continuous function (locally in time); see Ref. 18.

#### 2. The second half

We now refine the space $I$ in order to obtain control on gauge transformations of the type (2.5), which proves crucial in the construction of the Markov process on gauge orbits associated with (1.5).

*θ* > 0, define the norm |||·|||_{α,θ} on smooth 1-forms

*ℓ* defined analogously to (2.1). Define further the metric

To motivate the norm |||·|||_{α,θ}, recall that the estimate (2.5) relies on the identity (2.6), which, in turn, requires that line integrals and holonomies of *A* are well-defined. However, we saw that the GFF *A* cannot even be restricted to lines.

*A*, the heat flow regularization $\{P_t A\}_{t \in (0,1)}$. A quick computation shows that, uniformly in $0 < t < |\ell| < \frac14$,

*ℓ* as *t* ↓ 0, but rather slowly. Furthermore, restricting to short length scales, say, |*ℓ*| < *t*^{θ} for any *θ* > 0,

*t* ∈ (0, 1) and |*ℓ*| < *t*^{θ}. Combined with a Kolmogorov argument, this shows that |||*A*|||_{α,θ} < *∞* almost surely. The restriction to $\alpha < \frac12$ is natural because the GFF in 3D has $\frac12$ less regularity than in 2D (e.g., *C*^{−κ−1/2} in 3D vs *C*^{−κ} in 2D for the Hölder–Besov regularity), and we saw that |*A*|_{α-gr} < *∞* for *α* < 1 for the GFF *A* in 2D.

The metric space $S$ can be identified with a subset of $I \subset C^{0,\eta}$ and comes with the parameters (*η*, *β*, *δ*, *α*, *θ*), the possible range of which is given in Ref. 12 (Sec. 5); $(S,\Sigma)$ is the space appearing in Theorem 1.3 for *d* = 3.

*A*, *g*) ↦ *A*^{g} extends continuously to a map $S \times G^{\rho} \to S$, where

*A* ∼ *A*^{g} for all $(A,g) \in S \times G^{\rho}$.

We can now state Ref. 12 (Theorem 2.39), which is one of the main results of Ref. 12 (Sec. 2) and the motivation behind the norm |||·|||_{α,θ}.

*There exist constants* *C*, *q* > 0 *and* $\nu \in (0, \frac12)$ *such that, for all* $g \in G^{\rho}$ *and* $A \in S$*,*

The proof of Theorem 3.7 relies on two estimates: (i) the estimate (2.5) used in the 2D case (which, of course, holds in arbitrary dimension) and (ii) a “backward estimate” that controls the initial condition of a parabolic PDE in terms of its behavior for positive times [see Ref. 12, Lemma 2.46(b)]. This estimate is applied to the harmonic map flow-type PDE solved by *h* for which *h*(0) = *g* and $F_t(A)^{h(t)} = F_t(A^{g})$ for all *t* > 0. Theorem 3.7 is then obtained by suitably interpolating between estimates (i) and (ii).

One can show that mollifications of the SHE converge in probability in the space $C(\mathbf{R}_+, S)$ (see Ref. 12, Corollary 3.14). In particular, the SHE admits a modification with sample paths in $C(\mathbf{R}_+, S)$.

Unlike in 2D, the action of $G^{\rho}$ on $S$ is not transitive over the orbits, and it is unclear if ∼, or some variant of it, is determined by the action of a group. This lack of a gauge group is responsible for the gap in our understanding of the quotient space $S/{\sim}$ in 3D vs 2D; see Remark 1.5.

### B. Local solutions

We next explain how one proves Theorem 1.6 in 3D, which is done in Ref. 12 (Sec. 5). We do not restate the result here like we did in Sec. II B since we cannot make it substantially more precise.

Though primarily using the theory of regularity structures as before, there are two main additional challenges on top of the 2D case. The first is purely algebraic and concerns showing that the renormalization counterterms are of the form $C^{\epsilon}_{\mathrm{bphz}} A$. The difficulty is that there are dozens of trees that potentially contribute to renormalization (vs just nine trees in 2D; see Ref. 11, Sec. 6.2.3).

By power counting, one can deduce that the renormalization is *linear* in *A*. To argue that one sees precisely $C^{\epsilon}_{\mathrm{bphz}} \in L_G(g,g)$ requires a systematic approach to symmetry arguments, which is developed in Ref. 12 (Sec. 4) and which could be of independent interest in other contexts.

To give an example of how this works, we argue that the renormalization is “block diagonal,” i.e., if the counterterm *cA*_{j} appears in the *A*_{i} equation for *j* ≠ *i*, then *c* = 0. Indeed, if we flip the coordinate *x*_{i} ↦ −*x*_{i} and, thus, *∂*_{i} ↦ −*∂*_{i}, together with *A*_{i} ↦ −*A*_{i} and $\xi_i^{\epsilon} \mapsto -\xi_i^{\epsilon}$, while keeping all terms with indices *j* ≠ *i* the same, it is immediate that all the terms in the *A*_{i} equation (1.3) flip sign. Using the symmetry of the noise $\xi_i^{\epsilon} \overset{\mathrm{law}}{=} -\xi_i^{\epsilon}$ and invariance under the flip *x*_{i} ↦ −*x*_{i} of the kernel *K* used to define $C^{\epsilon}_{\mathrm{bphz}}$, one can show that the renormalized equation must possess the same symmetry, namely, all terms in the renormalized *A*_{i} equation must flip sign. Since we kept *A*_{j} the same, this shows that any factor *cA*_{j} in the renormalized *A*_{i} equation must have *c* = 0.
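The sign bookkeeping in this argument can be written out schematically. In the sketch below, the reflection symbol $R_i$ and the coefficients $c_{ij}$ (linear maps on $g$) are hypothetical notation introduced only for illustration; they do not appear in the reviewed works.

```latex
% R_i: x_i -> -x_i, \partial_i -> -\partial_i, A_i -> -A_i, \xi_i^\epsilon -> -\xi_i^\epsilon,
% with every quantity carrying an index j \neq i left unchanged.
\begin{align*}
  \text{renormalized } A_i \text{ equation:} \quad
    & \partial_t A_i = \big(\text{terms of (1.3)}\big) + \sum_{j} c_{ij} A_j, \\
  \text{under } R_i: \quad
    & \text{every term must change sign, while } c_{ij} A_j \mapsto c_{ij} A_j \ \text{ for } j \neq i, \\
  \text{hence} \quad
    & c_{ij} A_j = -\, c_{ij} A_j \ \text{ for all } A_j
      \;\Longrightarrow\; c_{ij} = 0 \quad (j \neq i).
\end{align*}
```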

The way one argues that the *same* $C^{\epsilon}_{\mathrm{bphz}}$ appears for all *i* ∈ {1, 2, 3} and that $C^{\epsilon}_{\mathrm{bphz}} \in L_G(g,g)$ is similar: one exploits symmetry under reflections *x*_{i} ↔ *x*_{j} and $\xi_i^{\epsilon} \overset{\mathrm{law}}{=} \xi_j^{\epsilon}$ for the former and symmetry under constant gauge transformations *A*_{i} ↦ Ad_{g}*A*_{i} and $\xi_i^{\epsilon} \overset{\mathrm{law}}{=} \mathrm{Ad}_g \xi_i^{\epsilon}$, where *g* ∈ *G*, for the latter.

The second challenge is analytic and comes from the singularity of the initial condition in $C^{-\frac12-\kappa}$ for *κ* > 0 (this was already encountered in 2D in a milder form; see Remark 2.8). This singularity means, for example, that $PA(0)\,\partial\Psi$ is ill-defined for generic distributions $\Psi \in C^{-\frac12-\kappa}(\mathbf{R} \times \mathbf{T}^3)$ and $A(0) \in C^{-\frac12-\kappa}(\mathbf{T}^3)$. Similar to the discussion in Sec. III A 1, this type of product appears in the Picard iteration used to solve (1.5). As in Remark 2.8, this issue is addressed by decomposing $A = PA(0) + \Psi + \hat{A}$, where Ψ solves the SHE *∂*_{t}Ψ = ΔΨ + *ξ*^{ɛ}, and solving for the “remainder” $\hat{A}$. One then shows with separate stochastic bounds that $PA(0)\,\partial\Psi$ and $\Psi\,\partial PA(0)$ converge in *C*^{−2−κ}(**R** × **T**^{3}) as *ɛ* ↓ 0.

A closely related issue *not present* in 2D is that of restarting the equation at some positive time *τ* > 0 to obtain maximal solutions. This is because, for *ɛ* > 0, *A*(*τ*) and *ξ*^{ɛ}↾_{[τ,∞)} see each other on a time interval of order *ɛ*^{2}. Since the regularities of *A*(*τ*) and *∂*Ψ add up to $< -2$, this breaks the argument used to show that $PA(0)\,\partial\Psi$ and $\Psi\,\partial PA(0)$ converge as *ɛ* ↓ 0 when *A*(0) is independent of *ξ*. To restart the equation, one instead leverages that *A*(*τ*) for *τ* > 0 is not a generic element of $S$ but takes the form *A*(*τ*) = Ψ(*τ*) + *R*(*τ*), where *R*(*τ*) ∈ *C*^{−κ}(**T**^{3}). Since Ψ is defined globally in time, this decomposition allows one to restart the equation using the “generalized Da Prato–Debussche trick” from Ref. 6.
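The regularity count behind this restart issue is short enough to record explicitly. The sketch below uses only the exponents stated in the text, with the assumption that one spatial derivative lowers the regularity by exactly one.

```latex
% Regularity count at a restart time \tau > 0 in d = 3 (any sufficiently small \kappa > 0).
\begin{align*}
  A(\tau) \in C^{-\frac12-\kappa}(\mathbf{T}^3), \quad
  \partial\Psi \in C^{-\frac32-\kappa}(\mathbf{R}\times\mathbf{T}^3):
  & \qquad \left(-\tfrac12-\kappa\right) + \left(-\tfrac32-\kappa\right) = -2 - 2\kappa < -2, \\
  R(\tau) \in C^{-\kappa}(\mathbf{T}^3), \quad
  \partial\Psi \in C^{-\frac32-\kappa}(\mathbf{R}\times\mathbf{T}^3):
  & \qquad (-\kappa) + \left(-\tfrac32-\kappa\right) = -\tfrac32 - 2\kappa > -2
    \quad \text{for } \kappa < \tfrac14 .
\end{align*}
```

On this count, only the part of *A*(*τ*) coming from Ψ(*τ*) sits below the threshold, which matches the role played by the global-in-time definition of Ψ in the restart argument.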

### C. Gauge covariance

Finally, we describe the proof of Theorem 1.11 in 3D. The proof of Theorem 1.11(a) is similar to its 2D counterpart. The only appreciable difference is that the measurable selection $S : O \to S$ is replaced by a Borel map $S : S \to S$, which preserves gauge orbits and such that $\bar{\Sigma}(S(X)) \leq 2 \inf_{Y \sim X} \bar{\Sigma}(Y)$ whenever the right-hand side is finite. Here, $\bar{\Sigma} \geq \Sigma$ is defined analogously to Σ but with a stronger set of parameters (*η*, *β*, *δ*, *α*, *θ*). This complication is due to a lack of any nice known properties of $O = S/{\sim}$ in 3D (e.g., Polishness) (see Remarks 1.5 and 3.9), and we instead leverage compactness of the embedding $(\bar{S}, \bar{\Sigma}) \hookrightarrow (S, \Sigma)$.

The proof of Theorem 1.11(b) is where we start to see a difference with the 2D case. First, admitting for now Theorem 1.14, we aim to prove that solutions to (1.5) with bare mass $\check{C}$ and gauge equivalent initial conditions can be suitably coupled. Specifically, one has the following result.

(Coupling). Suppose *A* solves (1.5) with bare mass $\check{C}$ and initial condition *a*. Then, for any *b* ∼ *a*, there exists on the same probability space a white noise $\bar{\xi}$ and a process (*B*, *g*) such that $g(t) \in G^{\frac32-\kappa}$ and *A*(*t*)^{g(t)} = *B*(*t*) for all *t* > 0 (before blow-up of *A*, *B*) and such that *B* solves (1.5) driven by $\bar{\xi}$ with bare mass $\check{C}$.

If *a*^{g(0)} = *b* for some $g(0) \in G^{\rho}$ for $\rho > \frac12$ as in Sec. III A 2, then this result follows almost immediately from Theorem 1.14. However, unlike the 2D case, it is now possible that *b* ∼ *a*, but no *g*(0) exists such that *a*^{g(0)} = *b*, which leads to trouble in applying Theorem 1.14: we effectively have no initial condition for *g* in the PDE (1.8).

*a* and *b* using the YM flow $F_t$ so that $F_t(a)^{g_t} = F_t(b)$ for all *t* sufficiently small and some smooth *g*_{t}. Then, $F_t(a) \to a$ and $F_t(b) \to b$ in $S$ as *t* ↓ 0, which, in particular, implies that $\limsup_{t \downarrow 0} |g_t|_{C^{\nu}} < \infty$ for some $\nu \in (0, \frac12)$ due to Theorem 3.7. Therefore, there exists $g(0) \in G^{\nu}$ such that *g*_{t} → *g*(0) in $G^{\nu/2}$ along a subsequence. We can then rewrite the equation for *g* in (1.8) in terms of $A = B^{g^{-1}}$, namely,

*A*, *g*), where *A* solves (1.5), is well-posed for any initial condition in $S \times G^{\nu}$ [vs $S \times G^{\rho}$ with $\rho > \frac12$ for (1.8) due to the multiplicative noise Ad_{g}*ξ*]. One can then use continuity of (*A*, *g*) with respect to the initial conditions, the fact that *g* takes values in $G^{\frac32-\kappa}$ for positive times and the joint continuity of the group action $G^{\frac32-\kappa} \times S \to S$ to prove Lemma 3.10. The *g* in Lemma 3.10 solves precisely (3.6) with initial condition *g*(0).
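The passage from a uniform *C*^{ν} bound to a convergent subsequence in the weaker Hölder norm is a standard compactness step. A minimal sketch follows, assuming that *G* is compact (so the $g_t$ are uniformly bounded) and using the usual Hölder interpolation inequality; the time $t_0$ and the sequence $t_n$ are notation introduced only for this illustration.

```latex
% Assumptions: T^3 is compact, G \subset U(N) is compact, and t_0 > 0 is small enough
% that the C^\nu bound holds on (0, t_0).
\begin{align*}
  \sup_{t \in (0,t_0)} |g_t|_{C^\nu} < \infty
    &\;\Longrightarrow\; \{g_t\}_{t \in (0,t_0)} \text{ is bounded and equicontinuous} \\
    &\;\Longrightarrow\; g_{t_n} \to g(0) \text{ uniformly for some } t_n \downarrow 0
       \quad (\text{Arzel\`a--Ascoli}), \\
  |g(0)|_{C^\nu} &\le \liminf_{n \to \infty} |g_{t_n}|_{C^\nu} < \infty, \\
  |g_{t_n} - g(0)|_{C^{\nu/2}}
    &\lesssim |g_{t_n} - g(0)|_{L^\infty}^{1/2}\, |g_{t_n} - g(0)|_{C^\nu}^{1/2} \to 0 .
\end{align*}
```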

With Lemma 3.10 in hand, together with the fact that *g* in its statement cannot blow up before Σ(*A*, 0) + Σ(*B*, 0) blows up (again due to Theorem 3.7), it is relatively straightforward to prove Theorem 1.11(b) like we did in the 2D case. See, in particular, the discussion around Fig. 1.

*χ*. Like in Sec. II C, it is natural to consider systems (2.7) and (2.8) and show that their renormalized versions converge to the same limit. To do this, we show that the equations for *g* and $\bar{g}$, which are now singular, do not require renormalization and that the renormalized equations for *B* and $\bar{A}$ are of exactly the same form as in Sec. II C, namely,

Both of these facts are again proven by introducing the new variables *h* = (d*g*)*g*^{−1} and *U* = Ad_{g} and writing the corresponding equations for *h*, *U*. This time, instead of a direct computation with just two trees as in (2.11), a more complicated strategy relying on power counting and symmetry arguments is necessary. The main insight, which helps with this argument, is that the trees appearing in the “lifted” *B* and $\bar{A}$ equations are obtained by attaching (or grafting) the trees appearing in the *U* equation onto the trees appearing in (1.5) (see Ref. 12, Sec. 6.2.1).

The analytic theory for the *B* and $\bar{A}$ equations is somewhat more involved than what we saw in Secs. II C and III B, but the general strategy is the same: we decompose the solution into the initial condition, globally defined singular terms, and a better behaved remainder and solve for the last of these. Restarting these equations, specifically the equation for $\bar{A}$, is also not straightforward because, unlike in Sec. III B, we are outside the scope of the generalized Da Prato–Debussche trick of Ref. 6 since the multiplicative noise means that the most singular part of the solution is not just the SHE. Therefore, an entirely separate fixed point problem needs to be written for the restarted equation (see Ref. 12, Sec. 6.6), which leverages that the new initial condition comes from a modeled distribution defined for earlier times. The proof that *B* and $\bar{A}$ converge to the same limit uses the same *ɛ*-dependent norms on regularity structures introduced in Ref. 11.

*ɛ* ↓ 0. The idea, inspired by Ref. 7, is to introduce a parameter *σ*^{ɛ} ∈ **R** in front of the noise that we take to zero as *ɛ* ↓ 0 in a precise way.

*σξ*, it is not difficult to see that these constants depend polynomially on *σ* and converge to zero as *σ* ↓ 0. Therefore, by assumption (3.11), we can send *σ*^{ɛ} ↓ 0 as *ɛ* ↓ 0 in such a way that, after passing to another subsequence, we can find $\hat{C} \neq 0$ such that

*B* with bare masses $\mathring{C}_1 = 0$ and $\mathring{C}_2 = \hat{C}^{\epsilon}$, that is,

*B* = *A*^{g} pathwise, where *A* solves (1.5) with bare mass zero, i.e.,

*σ*^{ɛ} ↓ 0, *A* converges as *ɛ* ↓ 0 to the solution of the deterministic YM flow (with the DeTurck term)

*B* converges as *ɛ* ↓ 0 to the solution of

*A*^{g} = *B* is preserved under the *ɛ* ↓ 0 limit, where *g* solves (1.8), but this implies by the calculation in Sec. I B 1 with *ξ* ≡ 0 that *B* should also solve

*B*(0), *g*(0)) for which the solutions to (3.13) and (3.14) are different.

*σ*^{ɛ} ↓ 0 such that $\tilde{C}^{0,\epsilon}_{\sigma^{\epsilon}} \to \hat{C} \neq 0$ along another subsequence. Now, we consider

*χ* non-anticipative), equal in law to the solution of

*σ*^{ɛ} ↓ 0, $\tilde{A}$ converges in law again to the solution of the deterministic YM flow (3.12), while $\bar{A}$ converges to the solution of

The bounds (3.10) are actually used in the short time analysis of (3.7) and (3.8) since they allow us to relate *B* and $\bar{A}$ to a simpler equation with *additive* noise through the above mechanism (*B* is related through pathwise gauge transformations and $\bar{A}$ is related through equality in law; see Ref. 12, Sec. 6.6).

*ɛ* ↓ 0 along which $C^{\epsilon}_{\mathrm{bphz}} - \tilde{C}^{\epsilon}$ converges to distinct limits $\hat{C}_1$ and $\hat{C}_2$. We then consider the *B* equation with bare masses $\mathring{C}_1 = 0$ and $\mathring{C}_2 = C^{\epsilon}_{\mathrm{bphz}} - \tilde{C}^{\epsilon}$. With this choice, *B* = *A*^{g} for all *ɛ* > 0, where *A* solves (1.5) with zero bare mass and *g* solves (3.6). Since the limit of (*A*, *g*) is the same along the two subsequences, it follows that the limit of *B* along the two subsequences is also the same almost surely. However, it is possible to show that the limiting bare mass constants $\hat{C}_1$ and $\hat{C}_2$ can be recovered from the *ɛ* ↓ 0 limit of (*B*, *g*) (see Ref. 12, Appendix D), which leads to a contradiction since we assumed $\hat{C}_1 \neq \hat{C}_2$. The same argument shows that the limit in (3.15) is independent of *χ* because the solution (*A*, *g*) to (1.5) and (3.6) is independent of *χ*.

Finally, to prove that $\lim_{\epsilon \downarrow 0} \tilde{C}^{0,\epsilon}$ exists and is independent of the non-anticipative mollifier *χ*, one argues in a similar way except, as earlier, we appeal to equality with (1.5) *in law* and use instead that two limiting solutions $(\bar{A}, \bar{g})$ with different bare masses cannot be equal *in law*.

These final statements mimic exactly what we saw in Sec. II C where $\tilde{C}^{0,\epsilon} = 0$ for non-anticipative *χ* and $\lim_{\epsilon \downarrow 0}(\tilde{C}^{\epsilon} - C^{\epsilon}_{\mathrm{bphz}})$ is given by (2.13). By analogy, (3.15) should hold for any mollifier, not necessarily non-anticipative, but this is not necessarily the case for $\lim_{\epsilon \downarrow 0} \tilde{C}^{0,\epsilon}$. Furthermore, as in 2D, we expect that $\lim_{\epsilon \downarrow 0} \tilde{C}^{0,\epsilon} = 0$ for non-anticipative *χ*.

## IV. OPEN PROBLEMS

We close with several open problems that we believe to be of interest.

- Does the Markov process on gauge orbits in Theorem 1.11 possess a unique invariant measure? The existence of the invariant measure should imply uniqueness due to the strong Feller property^{38} and full support theorem for SPDEs.^{39} Furthermore, the invariant measure for *d* = 2 is expected to be the YM measure associated with the trivial principal *G*-bundle on **T**^{2} constructed in Refs. 44, 45, and 60. For *d* = 3, this would provide the first construction of the YM measure in 3D, even in finite volume.
- Can the analysis in Refs. 11 and 12 be extended to infinite volume **T**^{d} ⇝ **R**^{d}? This is non-trivial even for *d* = 2, although the YM measure on **R**^{2} is arguably simpler.^{26}
- Can one extend these results beyond the case that the underlying principal bundle *P* → **T**^{d} is trivial? For non-trivial principal bundles, one can no longer write connections as globally defined 1-forms, which complicates the solution theory.
- For *d* = 3, can one modify the construction of the state space $S$ in Sec. III A so that the gauge equivalence ∼ is determined by a gauge group or a similar structure? This would yield a notion of gauge equivalence conceptually closer to the classical space of gauge orbits and would carry a number of technical advantages (see Remarks 1.5 and 3.9 and the start of Sec. III C).
- Taking *G* to be one of the classical groups, e.g., *G* = *U*(*N*), what is the behavior of the dynamic as *N* → *∞*? In 2D, the associated YM measure is known to converge to a deterministic object called the master field,^{19–21,46} which is governed by the Makeenko–Migdal equations;^{52} see Ref. 47 for a survey. No such result is rigorously known in 3D (the measure at finite *N* has not been constructed). It would be interesting if one can use stochastic quantization to recover some of the known results in 2D and obtain new results in 3D; see Ref. 62 where the Langevin dynamic is used to derive the finite *N* master loop equation on the lattice.

## ACKNOWLEDGMENTS

It is a great pleasure to thank Ajay Chandra, Martin Hairer, and Hao Shen for many interesting discussions and explorations while carrying out the reviewed works.

## AUTHOR DECLARATIONS

### Conflict of Interest

The author has no conflicts to disclose.

### Author Contributions

**Ilya Chevyrev**: Investigation (equal); Writing – original draft (equal); Writing – review & editing (equal).

## DATA AVAILABILITY

Data sharing is not applicable to this article as no new data were created or analyzed in this study.
