Quantum computing for chemistry and physics applications from a Monte Carlo perspective

This Perspective focuses on the overlaps between quantum algorithms and Monte Carlo methods in the domains of physics and chemistry. We will analyze the challenges and possibilities of integrating established quantum Monte Carlo solutions into quantum algorithms. These include refined energy estimators, parameter optimization, real- and imaginary-time dynamics, and variational circuits. Conversely, we will review new ideas for utilizing quantum hardware to accelerate the sampling of classical statistical models, with applications in physics, chemistry, optimization, and machine learning. This review aims to be accessible to both communities and intends to foster further algorithmic developments at the intersection of quantum computing and Monte Carlo methods. Most of the works discussed in this Perspective have emerged within the last two years, indicating a rapidly growing interest in this promising area of research.


I. INTRODUCTION
The solution of quantum many-body problems in chemistry and physics is one of the most anticipated applications of a quantum computer, as first proposed by Feynman. 1 Over time, it has been proposed that many other classes of problems can benefit from quantum speed-up, including cryptography, data science, machine learning, finance, linear algebra, and optimization. 2 However, physics and chemistry remain among the main candidates for demonstrating practical quantum advantage over conventional methods, because they contain classes of problems with the following characteristics: (i) they are very challenging for classical computation, and exponential quantum speed-ups are possible, and (ii) they are defined by a small number of variables, thus featuring a limited cost of data loading and reading. 3 Among all possible problems in physics, here we will focus on electronic structure and spin models (including classical spin models), as their implementation requires a relatively lower cost compared to models like high-energy physics. Excellent review articles on quantum algorithms for quantum chemistry 4,5 and materials science 6 were published a few years ago, the latest in 2020. The purpose of this manuscript is not to duplicate such presentations, but rather to concentrate on a frontier topic that is becoming relevant due to several works appearing in recent months. Indeed, about 40% of the references cited in this Perspective are recent, i.e. from 2021 onwards. This indicates how fast the whole field of quantum computing is growing.
We will analyze points of contact between quantum computing and Monte Carlo (MC) and quantum Monte Carlo (QMC) methods. 7 There are many common themes between the two worlds. Shot noise arising from the measurements of the quantum register finds a parallel in the statistical roots of Monte Carlo. Both methods require extracting and utilizing expectation values computed in the presence of statistical noise.
The existence of shot noise is one of the major issues for near-term simulations: in a variational setting, it implies a problematically large number of circuit repetitions. 8 On the other hand, such uncorrelated wave function collapses can have computational value if used as importance sampling in Monte Carlo. We will also report attempts at cross-fertilization between the two fields in designing variational ansatze and optimization methods for ground-state and dynamical problems. Additionally, we discuss the requirements that a classical-quantum hybrid QMC algorithm, relying on a quantum computing subroutine, must meet. Regarding classical applications, we review several proposals for accelerating the Metropolis algorithm using quantum hardware and examine their practicality under realistic hardware constraints.
Therefore, the purpose of this manuscript is to review various Monte Carlo techniques that can be useful for creating new quantum algorithms or designing new applications of already-known quantum primitives. Conversely, this Perspective also aims to be an accessible presentation of the potential and limitations of quantum computing for Monte Carlo experts and, more broadly, computational physicists.
[10][11][12] Moreover, experiments at the threshold of quantum advantage for quantum dynamics are now possible given the existence of ∼100-qubit hardware, albeit noisy. 13 (2) On the quantum variational algorithm side, we have increasing evidence that quantum measurement noise, the focus of this Perspective, is the major, unavoidable bottleneck of near-term quantum algorithms. 8,14 (3) In the last year, thorough resource assessment papers for quantum chemistry have appeared, [15][16][17] which clearly place the threshold for quantum advantage for ground-state problems, at least with today's algorithms, deep into the future fault-tolerant quantum computing regime, and question the previously claimed exponential advantage for ground-state electronic structure applications. 18 (4) Finally, we observe the emergence of a new class of hybrid quantum algorithms revisiting classical and quantum Monte Carlo, opening completely new possibilities for quantum advantage in these areas. In this Perspective we report about twenty recent works (i.e. which appeared in the last two years) that aim to complement quantum computing and Monte Carlo in several sub-fields: hybrid quantum computing-QMC, variational circuit development, parameter optimization, time-dependent simulations, and classical sampling.
The manuscript is organized as follows. In Sec. II we briefly mention the types of quantum hardware and their fundamental limitations, namely the hardware errors in the NISQ regime and the fairly long (compared to conventional CPUs) gate times of future, error-corrected machines. In Sec. III we introduce general quantum algorithms for physics and the requirements for quantum advantage. In Sec. IV we review the basics of the encoding of a fermionic Hamiltonian into a quantum computer. After these introductions, in Sec. V, we discuss the variational method, the simplest kind of algorithm for physics and chemistry, and its limitation due to shot noise. In Sec. VI we instead describe how the same calculation could be done in the fault-tolerant era. The more technical Sec. VII introduces the local energy estimator that is central in QMC and explains why these algorithms, while still being stochastic, do not suffer from the severe variance problem of variational quantum algorithms. In Sec. VIII we review attempts to create hybrid quantum-classical QMC algorithms, as well as other points of contact between QMC and quantum algorithms. Finally, in Sec. IX, we reverse our perspective and discuss quantum computing methods to speed up classical sampling, using digital machines, quantum simulators, and annealers.
The concept map of the Perspective is shown in Fig. 1.

II. QUANTUM HARDWARES
Unlike conventional reviews on algorithms for quantum chemistry, we find it necessary to briefly introduce the hardware on which these algorithms must be executed. Understanding the possibilities and limitations of the hardware is crucial to get an idea of the feasibility of current and future algorithms.
There are many types of quantum computers and quantum simulators. The difference between the two classes is that a quantum computer is built with the idea of being universal, and therefore able to support any type of program. A quantum simulator is designed to perform a narrower range of tasks, such as optimizing classical cost functions 19,20 or simulating particular Hamiltonians. 21 To tend towards universality, a quantum computer must support the execution of basic local operations, called quantum gates, just like a regular computer. For example, an architecture capable of synthesizing the gate set {CNOT, H, S, T} or {R_x(θ), R_y(θ), R_z(θ), S, CNOT}, acting on at most two qubits (see textbooks on quantum computing for the definition of these gates 22 ), is capable of approximating any possible unitary operation on n qubits. 22 On the other hand, a special-purpose quantum simulator can implement the global operation of interest directly, such as e^{iHt}, where H is the quantum many-body Hamiltonian operator, without the problem of having to compile it using a gate set. 23 Currently, the greatest engineering effort is focused on gate-based "digital quantum computers", although it is not excluded that algorithms of interest for chemistry and materials science can be executed on quantum simulators.
There is then a second distinction that is important to keep in mind. At present, the size of digital quantum computers is on the order of n ≈ 100 qubits. 13,24 In principle, these computers have access to a 2^{O(100)}-dimensional Hilbert space. However, practical quantum advantage has not been achieved yet. The reason is that these machines are not properly digital but are subject to hardware noise. For example, each gate has a finite failure probability. As we will see, the circuits necessary for a quantum chemistry algorithm require a considerable number of gates, and therefore even a small infidelity propagates devastatingly: the total error accumulates until it completely compromises the success of the algorithm. Current hardware is built with the idea of executing a universal gate set, but is still affected by hardware noise. Such machines generally go by the name of NISQ (Noisy Intermediate-Scale Quantum) machines. 25 The final step to achieving a true digital quantum computer is to realize hardware capable of executing gates without errors, just like its classical counterpart. Many detractors of quantum computers base their skepticism on the impossibility of maintaining a macroscopic coherent wavefunction for an arbitrary number of operations. 26 Fortunately, there is a theorem that does not exclude the possibility of a digital universal computer: below a certain noise threshold, it is possible to correct this hardware noise faster than it can accumulate during runtime (the same would not hold for analog computers). 22 Practical proposals to realize this idea include using multiple physical qubits to realize a logical qubit, and more operations to realize a single "error-corrected" logical gate. 
27,28 It is important to understand which types of algorithms have the hope of being executed on a NISQ machine and which will require a fault-tolerant machine, in order to properly contextualize the ever-growing literature on quantum computing for quantum chemistry. At present, sub-communities have formed that are dedicated to developing NISQ algorithms, and others, increasingly growing in number, that are developing algorithms for the fault-tolerant era. This situation is unprecedented. In classical computing, to attempt a comparison, it would be as if in the 1960s there were a community developing algorithms for chemistry on punched cards, and another preparing for exascale computing without a clear idea of whether and how an HPC facility would be built.
However, the technological progress toward a fault-tolerant machine is steady. 31 Clearly, we are still in the infancy of fault-tolerant hardware, and it is not yet clear when a large-scale error-corrected machine, able to accommodate electronic structure calculations, will appear. This also depends on progress on the algorithmic side in compressing memory and runtime resources.
The last concept that needs to be presented in this brief account of hardware is the clock frequency of a quantum computer, which will always necessarily be slower than that of classical gates. This is because the execution of a quantum gate requires the manipulation of a wave function by an external control: the quantum gate can never be faster than the classical electronic apparatus that controls it. At present, the execution time of a noisy CNOT or a noisy single-qubit rotation is of the order of 100 ns for superconducting qubits in the NISQ era, corresponding to about 10 MHz. 24 The expected logical clock rate in the fault-tolerant regime is much slower, of the order of 10-100 kHz, because every error-corrected gate requires a large number of elementary operations involving a large number of physical qubits. 32 Often, in the literature, one hears about T-depth as a proxy for the complexity of an algorithm. 17,33,34 The reason is that truly digital hardware can only operate using a set of discrete gates that can be error-corrected. A rotation by an arbitrary angle does not belong to this category, and therefore every continuous rotation must be compiled into a series of discrete operations, such as S, H, and T gates. The T gates are the most expensive to synthesize (i.e. 100 to 10,000 times more costly than a CNOT), 32 and therefore their number, and how many can be executed in parallel, determines the runtime. Since in chemistry rotations by an arbitrary angle are ubiquitous, being necessary for orbital basis rotations or to realize infinitesimal Trotter step operations, fault-tolerant quantum algorithms generally include a large number of T gates. It is interesting to note that in the NISQ era the opposite is true: rotations are comparatively simple gates to perform, while efforts are made to reduce the number of CNOT gates, which are currently the noisiest. Recent proposals include the possibility of retaining analog rotation gates and error-corrected Clifford gates, which are 
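To make these clock rates concrete, the following back-of-the-envelope sketch compares wall-clock times; the T-count of 10^9 and both clock rates are illustrative assumptions, not figures quoted from any specific resource estimate.

```python
# Illustrative runtime comparison (all numbers are assumptions for this sketch):
# a hypothetical algorithm whose cost is dominated by 1e9 sequential T gates.
t_count = 1e9

physical_gate_time = 100e-9  # ~100 ns per physical gate (NISQ, superconducting)
logical_clock_hz = 10e3      # ~10 kHz logical clock in the fault-tolerant regime

runtime_ft = t_count / logical_clock_hz          # fault-tolerant runtime, seconds
runtime_physical = t_count * physical_gate_time  # same count at physical gate speed

print(f"fault-tolerant: {runtime_ft / 3600:.1f} hours")     # ~27.8 hours
print(f"at physical gate speed: {runtime_physical:.0f} s")  # ~100 seconds
```

An algorithm that would finish in minutes at physical gate speed takes more than a day once every gate is error-corrected, which is why T-counts, and not only asymptotic scaling, dominate fault-tolerant runtime estimates.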
easier to synthesize. 35,36 This hybrid approach is interesting but has yet to be demonstrated in practical algorithms. Moreover, gate times depend on the specific hardware architecture. Other platforms, such as spin qubits, trapped ions, or photonic hardware, will imply different hardware constraints.

III. ALGORITHMS AND QUANTUM ADVANTAGE
After this essential overview of hardware, we are now in a position to introduce more concretely the most popular algorithms for the quantum many-body problem. The application where quantum advantage appears most clear and easy to justify is quantum dynamics. In this case, an exponential advantage can be obtained by virtue of the exponential compression of the Hilbert space into a memory that scales linearly with the number of particles. 37 If we consider, for simplicity, spin-1/2 lattice models composed of n spins, it is easy to see that an exact simulation becomes unfeasible as soon as n ∼ 50. Storing a quantum state of 50 qubits with double-precision coefficients for each of the 2^50 possible components requires 16 PB of memory. To perform arbitrary discrete-time evolution, we would need to manipulate such an array thousands of times. For instance, direct matrix exponentiation of e^{iHt} for a typical many-body quantum Hamiltonian H and 50 spin-1/2 particles would require an array of 2^50 ≈ 10^15 entries undergoing a matrix-vector multiplication with a matrix of size 2^50 × 2^50 ≈ 10^30.
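The memory estimate above can be reproduced in a few lines; `statevector_memory_bytes` is our own helper name for this sketch.

```python
def statevector_memory_bytes(n_qubits, bytes_per_amplitude=16):
    """Memory needed to store a full state vector, with one complex
    double-precision amplitude (16 bytes) per basis state."""
    return (2 ** n_qubits) * bytes_per_amplitude

# 50 spins: 2^50 amplitudes at 16 bytes each, i.e. the ~16 PB quoted above.
mem_pib = statevector_memory_bytes(50) / 2**50
print(mem_pib, "PiB")  # 16.0 PiB
```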
The memory requirement for evolving a system of N electrons in second quantization using n orbitals is the same. Second quantization offers optimal memory usage if a Fock encoding is used, where each binary string represents a possible configuration of orbital occupations (see Sec. IV). 38 As we will see, the choice of second quantization introduces non-locality in the Hamiltonian, which is translated into a sum of tensor products of Pauli matrices (qubits are not fermionic particles) and therefore requires very long circuits. From an asymptotic point of view, first quantization would be a better choice, as it preserves the locality of interactions at the cost of introducing expensive arithmetic operations to calculate the Coulomb term. In addition, an antisymmetric function in real space must be provided. 39 Simulating lattice spin models, therefore, appears to be the most obvious choice in the search for a quantum advantage in Hamiltonian simulations. Beverland et al. 17 show that a fault-tolerant simulation of a 10×10 plaquette of the two-dimensional quantum Ising model requires on the order of 10^5-10^6 physical superconducting qubits, using the digital Trotter algorithm. This example is instructive, as it clearly shows that, although the system requires a priori a memory of n = 100 logical qubits (1 logical qubit = 1 spin), error correction and the extra resources for distilling T gates (often called T-factories) push the total qubit count up to a million. This sets a lower bound on the resources needed to simulate fermionic systems, which are more complex than spin-1/2 models for the same number of particles/spin-orbitals. Simulating lattice models is also possible with quantum simulators, albeit with calibration errors, and it is therefore likely that there will be competition between the two quantum computing paradigms towards the first simulation beyond what is possible classically. 
23,40 Indeed, it is also possible that the first demonstration of practical quantum advantage in real-time simulations will be achieved before the fault-tolerant regime, for example in simulations of dynamical phase transitions. 41 At the time of writing this review, researchers at IBM showcased a record-sized real-time dynamics simulation of a 2D heavy-hexagon lattice Ising model using Trotterization, 127 qubits, and error mitigation techniques. 13 Just ten days later, approximate tensor-network simulations achieved the same result, 42,43 raising the bar once again for declaring a quantum advantage, similar to what happened with the first claim of quantum supremacy on random circuit sampling. 24,44,45 Demonstrating a definitive quantum advantage in quantum dynamics tasks is a less well-defined goal, as classical limits are still not thoroughly explored. One can expect that increasingly sophisticated classical methods will be adopted to counter new claims of quantum advantage. The case of chemistry is different, since classical methods are fairly established, and it is much clearer which electronic structure Hamiltonians are beyond the reach of classical computing.
Coming back to our chemistry problems, if the answer requires a precision that can only be achieved with a long circuit, then we must prepare for the fact that our algorithm will only be executable in the digital era, which could take a decade or more, when technology capable of controlling millions of qubits will be available. As mentioned, fault-tolerant hardware has a lower clock frequency than a conventional computer, so a non-exponential asymptotic speed-up may not be sufficient to guarantee actually shorter runtimes for reasonable system sizes. Babbush et al. 46 recently discussed how a quadratic speed-up is insufficient for practical advantage in many applications.
To conclude, quantum advantage in chemistry problems can be obtained either in several years using a fault-tolerant algorithm with a superquadratic speed-up, or in a heuristic way using NISQ hardware. In the latter case, only short-depth algorithms can be used, which often require a classical feedback loop, such as the optimization of the gate parameters that define the circuit, and repeated executions of the circuit. Variational methods fall exactly into this class of solutions. 47,48 We will see in the next section how these can be used, and their limitations, especially their relationship with a type of noise that cannot be eliminated even in the fault-tolerant regime.

IV. FERMIONIC HAMILTONIANS
Most applications in chemistry are related to solving the many-electron Schrödinger equation. To clearly understand the problem it is necessary to formalize it to some extent. The field of quantum simulations of electronic structure problems is quite well established, and we refer to the reviews [4][5][6] for a general introduction and more details. The most general fermionic Hamiltonian reads

H = Σ_{pq} h_{pq} a†_p a_q + ½ Σ_{pqrs} h_{pqrs} a†_p a†_q a_r a_s,    (1)

where a†_p and a_p are the fermionic creation and destruction operators, which create (annihilate) a particle in spin-orbital p.
To proceed, the next step is to define an encoding from the fermionic Hilbert space to qubit space, so that a basis vector of the latter, for example |1100⟩, has a unique correspondence in the fermionic space.
In second quantization, this is trivial using Fock states. A bit-string denotes the occupation numbers of spin-orbitals, in a chosen ordering. For instance, the string 1100 can represent the Hartree-Fock state of an H2 molecule described with two spatial molecular orbitals. Here, there are two electrons, of opposite spin, occupying the first two spin-orbitals, {|ψ_{g,↑}⟩, |ψ_{g,↓}⟩}, while leaving empty the two higher-energy ones, {|ψ_{u,↑}⟩, |ψ_{u,↓}⟩}.
For ease of notation, we can label this string with its binary index (read from left to right, here): |1100⟩ = |3⟩. The (doubly) excited state compatible with the symmetries of the molecule is |0011⟩ = |12⟩. In the H2 case, the exact ground state (in this small atomic basis) can be expressed as a linear combination of these two strings alone, but in the general case, the ground state of a molecule with N electrons and n spin-orbitals is written as the linear combination

|Ψ⟩ = Σ_i c_i |i⟩,    (2)

where the c_i are complex coefficients and |i⟩ is a basis vector whose binary format denotes the occupation numbers of the spin-orbitals in a chosen ordering. Although many c_i's are zero due to particle-number and spin conservation, as is known, an exponentially large number of them remains finite as N and n increase.
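The labeling convention above (leftmost bit least significant) can be sketched as follows; `fock_index` is a hypothetical helper name introduced for illustration:

```python
def fock_index(bitstring):
    """Map an occupation-number string to its integer basis label, reading
    the leftmost bit as the least significant (the convention used here)."""
    return int(bitstring[::-1], 2)

assert fock_index("1100") == 3   # Hartree-Fock state of H2: |1100> = |3>
assert fock_index("0011") == 12  # doubly excited state:     |0011> = |12>
```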
The main advantage of a quantum computer is therefore clear. With only n qubits, it is possible to store in memory a wavefunction with 2^n complex coefficients. The fundamental question now is whether it is possible to devise a quantum algorithm with polynomial complexity, capable of manipulating these 2^n coefficients to find the ground state of a fermionic problem, or at least a better approximation than the best classical method.
Before going further, it is necessary to show what kind of Hamiltonian is produced after the fermion-to-qubit mapping is applied. Since a qubit is not a fermion, the Hamiltonian in Eq. 1 needs to be translated into a qubit operator. This is usually done with the Jordan-Wigner transformation,

a_p = Z_0 ⊗ Z_1 ⊗ ... ⊗ Z_{p-1} ⊗ ½(X + iY)_p,
a†_p = Z_0 ⊗ Z_1 ⊗ ... ⊗ Z_{p-1} ⊗ ½(X - iY)_p,

where the ½(X + iY)_p and ½(X - iY)_p spin minus and plus operators change the occupation number of the target mode p, while the string of Z operators is needed to enforce antisymmetrization. Therefore, each fermionic operator in Eq. 1 translates into a combination of tensor products of Pauli operators.
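As a minimal numerical sketch of this mapping (under one common sign convention; dense matrices are used only for illustration, since they grow as 2^n), the following builds the Jordan-Wigner image of a_p and checks the canonical fermionic anticommutation relations:

```python
import numpy as np

I2 = np.eye(2, dtype=complex)
X = np.array([[0, 1], [1, 0]], dtype=complex)
Y = np.array([[0, -1j], [1j, 0]], dtype=complex)
Z = np.array([[1, 0], [0, -1]], dtype=complex)

def kron_all(ops):
    out = np.eye(1, dtype=complex)
    for op in ops:
        out = np.kron(out, op)
    return out

def jw_annihilation(p, n):
    """Jordan-Wigner image of a_p on n qubits: a string of Z on modes
    0..p-1, followed by (X + iY)/2 on mode p, identity elsewhere."""
    return kron_all([Z] * p + [(X + 1j * Y) / 2] + [I2] * (n - p - 1))

# Check {a_p, a_q^dagger} = delta_pq on a small register.
n = 3
for p in range(n):
    for q in range(n):
        a_p = jw_annihilation(p, n)
        adag_q = jw_annihilation(q, n).conj().T
        anti = a_p @ adag_q + adag_q @ a_p
        target = np.eye(2**n) if p == q else np.zeros((2**n, 2**n))
        assert np.allclose(anti, target)
print("fermionic anticommutation relations verified on", n, "modes")
```

The Z strings are exactly what makes operators on distant modes anticommute rather than commute, at the price of the O(n) non-locality discussed below.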
The full Hamiltonian of Eq. 1 then takes the general form of a linear combination of products of single-qubit Pauli operators,

H = Σ_{j=1}^{N_P} h_j P_j,    (3)

where h_j is a real scalar coefficient, and N_P is of order O(n^4), since there are n^4 terms to transform in Eq. 1.
Each term P_j in the Hamiltonian is typically referred to as a Pauli string, and is a tensor product

P_j = σ^{(j)}_1 ⊗ σ^{(j)}_2 ⊗ ... ⊗ σ^{(j)}_n,  with σ^{(j)}_k ∈ {I, X, Y, Z}.    (4)

What is important for our discussion is that (1) the coefficients h_j can take very large values (in modulus). To give an idea, Σ_j |h_j| is of the order of 40 Ha already for the moderate example of a H2O molecule described in the STO-6G basis. 8 Then (2) the operators P_j can have a number of non-identity factors that is O(n), due to the non-locality introduced by the Jordan-Wigner transformation. This implies that circuits for real-time dynamics are longer compared to local quantum spin Hamiltonians, 38 and, when measured, the expectation value ⟨P_j⟩ is exponentially susceptible to bit-flip measurement errors. 49 The form of Eq. 4 is, however, more general than the Jordan-Wigner workflow, so we assume it as the starting point of our discussion.

V. VARIATIONAL QUANTUM ALGORITHMS
In short, variational methods aim to use shallow parametrized circuits that can be optimized to minimize the calculated energy. 48,50,51 The N_par variational parameters can be the angles θ of the rotation gates defined above. This strategy takes the name of Variational Quantum Eigensolver (VQE), but it is nothing more than a common variational calculation on a quantum computing platform. First of all, it is important to notice that even shallow circuits, i.e. featuring constant depth versus n, can display quantum advantage, although on quite artificial tasks, 52 or cannot be efficiently simulated classically. 53 Therefore, variational algorithms are reasonable candidates for quantum advantage in the near term. As of today, the literature features many small-scale hardware demonstrations, still away from the quantum advantage threshold. The most notable either use heuristic circuits 54 or more structured, physically inspired circuit ansatze. 55,56 The current largest variational simulation of a chemical system reaches a system size of about 20 qubits. 57 Performing variational calculations of many-body quantum systems has advantages in principle, but also many limitations in practice.
Current technology allows the execution of circuits with more than 100 qubits and a depth of about 60 two-qubit gates. 13 While error correction is not yet available, there are error mitigation methods that enable unbiased estimation of the expectation values of operators. 58,59 It seems, therefore, that all the ingredients for enabling NISQ variational methods are present, as such a circuit can define a variational ansatz that could outperform the best classical ansatz for a given problem. A central point of this Perspective is thus what is missing to translate this potential into practical variational computation and, hopefully, achieve quantum advantage.
The first point to establish is what kind of circuit can be used to create the ground state of our target many-body quantum Hamiltonian. As we have seen, the advantage of a quantum computer is the mere possibility of storing an exponentially large wavefunction using a linear number of qubits (cf. Eq. 2). But this gives us no guarantee that (1) a quantum circuit with a finite, and possibly small, depth can give us a better approximation than the best classical method, and (2), even more importantly from a conceptual point of view, that it is possible to optimize the parameters even assuming that the ground state is contained in the variational ansatz.
An obvious drawback of variational calculations in the NISQ regime is the presence of noise. A gate error of 0.1% propagating through a circuit with a depth of 100, composed of 100 qubits, produces a state with fidelity on the order of 10^-3-10^-4 compared to the noiseless case. 13 However, there exist error mitigation methods that can be applied to obtain unbiased estimates of expectation values, at the cost of a sampling overhead that grows exponentially with circuit size. [58][59] It remains to be determined whether the exponent of this post-processing step is mild enough to guarantee reasonable runtimes for classically intractable molecules.
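The order of magnitude of this fidelity loss follows from a crude depolarizing-style estimate, F ≈ (1 - p)^G for G noisy gate applications; the gate counts below are our own illustrative assumptions, and how one counts gates shifts the answer by orders of magnitude:

```python
p = 1e-3  # 0.1% error per gate (illustrative)

# Two ways of counting gate applications in a 100-qubit, depth-100 circuit:
f_two_qubit = (1 - p) ** 5_000   # ~50 two-qubit gates per layer x 100 layers
f_per_qubit = (1 - p) ** 10_000  # one gate application per qubit per layer

print(f"{f_two_qubit:.1e}")  # ~6.7e-03
print(f"{f_per_qubit:.1e}")  # ~4.5e-05
```

Either counting convention lands near the range quoted above; the key qualitative point is the exponential decay of fidelity with gate count.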
In this Perspective, we will not focus on hardware error mitigation, which is introduced in the recent Ref. 57, but on another, complementary issue. As we will see, a major problem with quantum variational methods is even more surprising: even assuming that we have prepared exactly the target state, computing its energy is, so far, an inefficient procedure. Although this is not inefficiency in the complexity-theory sense, where an algorithm is said to be inefficient if it scales exponentially, it is from a practical point of view. This is a completely new condition that does not occur, for example, in variational calculations using QMC.

A. The variance problem
The fundamental (and obvious) concept at the root of this is that a quantum computer obeys the postulates of quantum mechanics: we cannot access the state created by the circuit directly, but only through measurements. We can estimate the expectation value of the Hamiltonian by post-processing the read-out of each measurement, then prepare exactly the same state and repeat the measurement, and so on.
Let us suppose we have prepared a quantum state |ψ⟩. Following Eq. 4, the mean value of the Hamiltonian is the linear combination of the expectation values of the N_P Pauli strings,

⟨H⟩ = Σ_j h_j ⟨ψ|P_j|ψ⟩.    (5)

For simplicity, we assume that the expectation values of the P_j's are obtained from N_P independent sets of measurements; the error on the estimate is then given by

ϵ = sqrt( Σ_j h_j^2 Var[P_j] / M_j ),    (6)

where Var[P_j] = ⟨P_j^2⟩ - ⟨P_j⟩^2 ≤ 1 is evaluated using M_j repeated measurements, or shots.
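A toy numerical experiment illustrates this error propagation; the coefficients h_j and exact expectation values below are made up for illustration, with each shot drawn as a ±1 Pauli eigenvalue:

```python
import numpy as np

rng = np.random.default_rng(0)

h = np.array([2.0, -1.5, 0.7])        # hypothetical coefficients h_j
p_exact = np.array([0.9, -0.3, 0.1])  # hypothetical exact <P_j> values
M = 10_000                            # shots per Pauli string

means, variances = [], []
for pj in p_exact:
    # Each shot of P_j yields +1 with probability (1 + <P_j>)/2, else -1.
    shots = rng.choice([1.0, -1.0], size=M, p=[(1 + pj) / 2, (1 - pj) / 2])
    means.append(shots.mean())
    variances.append(shots.var(ddof=1))

energy = float(np.dot(h, means))
# Error propagation: eps = sqrt(sum_j h_j^2 Var[P_j] / M_j)
eps = float(np.sqrt(sum(hj**2 * v / M for hj, v in zip(h, variances))))
print(f"{energy:.3f} +/- {eps:.3f}  (exact: {np.dot(h, p_exact):.3f})")
```

Note how the error bar is set by the h_j-weighted variances: a large coefficient forces many more shots on its Pauli string for the same total accuracy.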
Since we need to evaluate each ⟨P_j⟩ independently, their statistical fluctuations are uncorrelated, so to reach chemical accuracy one needs to resolve each expectation value with very high accuracy, as it gets multiplied by a possibly very large prefactor h_j. Wecker et al. 8 estimated that M = Σ_j M_j = 10^9 measurements would be needed to reach ϵ ∼ 10^-3 Ha for H2O, and up to 10^13 for Fe2S2 described with 112 spin-orbitals and the STO-3G basis.
This happens even in the case where we prepare the exact ground state, thus violating the zero-variance property of the ground state if the energy is calculated this way. This issue is often called the variance problem, and it is one of the most overlooked issues in the VQE community, which seems more active in developing new circuit ansatze. However, several works aiming to mitigate this problem have been put forward. The simplest consists in grouping all P_j's that commute qubit-wise, so that they can be measured simultaneously. 54 Other methods aim to find better grouping schemes, introducing general commutativity rules at the expense of longer measurement circuits. 49,[60][61][62][63]
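A minimal sketch of this simplest grouping strategy, as a first-fit greedy partition over Pauli strings written as letter strings (our own toy implementation, not any specific published scheme):

```python
def qubitwise_commute(p1, p2):
    """Two Pauli strings (e.g. 'XIZ') commute qubit-wise if, on every qubit,
    their letters are equal or at least one of them is the identity."""
    return all(a == b or a == "I" or b == "I" for a, b in zip(p1, p2))

def greedy_grouping(paulis):
    """First-fit greedy partition into qubit-wise commuting groups."""
    groups = []
    for p in paulis:
        for g in groups:
            if all(qubitwise_commute(p, q) for q in g):
                g.append(p)
                break
        else:
            groups.append([p])
    return groups

# Hypothetical 3-qubit Hamiltonian terms:
terms = ["ZZI", "IZZ", "ZII", "XXI", "IXX", "YIY"]
print(greedy_grouping(terms))
# -> [['ZZI', 'IZZ', 'ZII'], ['XXI', 'IXX'], ['YIY']]
```

All strings in a group can be read out from one computational-basis measurement after single-qubit rotations, so here six terms collapse into three measurement settings.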
For the evaluation of two-body reduced density matrices of fermionic systems, it is possible to devise an asymptotically optimal measurement scheme, 63,64 with a number of unique measurement circuits scaling as N_groups ∼ O(n^2). However, there remains the problem that all these correlators, estimated stochastically, need to be summed up to obtain ⟨H⟩. Therefore, N_groups is not a faithful representation of the total number of measurements needed to achieve chemical accuracy, since each partition needs to be measured enough times for the sum to achieve a total error of ϵ.
There are also methods based on a different philosophy, namely to systematically approximate the electronic Hamiltonian (Eq. 1) to reduce the number of terms in the sum, using a low-rank representation of the Hamiltonian. 65,66 This technique also finds application in real-time dynamics simulations, and may considerably reduce the runtime of error-corrected algorithms. 16 While it can certainly mitigate the variance problem in the NISQ era, it does not qualitatively solve the problem, for the very same reason outlined above. We observe that several works aim to reduce the number of basis states required to represent the Hamiltonian. However, this may not necessarily improve the number of measurements, as the fewer terms may have a larger variance. An interesting example of this is seen in variational quantum algorithms applied to classical cost functions. This scenario is important for optimization, and in this case the most popular variational method is the Quantum Approximate Optimization Algorithm (QAOA). Despite the fact that the cost function, by definition, only needs to be measured in one basis, the computational basis, the impact of shot noise is still quite detrimental to the overall performance even in this case. 67 Finally, we also mention the celebrated shadow tomography method, 68 which may be useful for estimating local qubit operators but has an exponential scaling for non-local ones, such as our P_j's.
In general, the total number of circuit repetitions during an optimization run based on energy minimization, featuring N_iter optimization steps, is

M_tot = N_iter Σ_j M_j.    (7)

Let us consider the concrete example of a H2O molecule, in a very minimal basis consisting of 12 spin-orbitals. Following a state-of-the-art variance reduction method, 49 we quote a number of 10^8 circuit repetitions to compute a single-point energy within 10^-3 Ha accuracy (notice that this is the error relative to the exact value obtained with this minimal basis set, not the exact value using a converged basis set). Assuming a circuit execution time, with measurements, of 1 µs, we are limited to about hundreds of optimization steps per day (totally neglecting classical communications and reset times).
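Plugging the quoted numbers into this budget shows where the "hundreds of steps per day" figure comes from (an arithmetic sketch only):

```python
shots_per_energy = 1e8  # circuit repetitions for one energy point at 1e-3 Ha
circuit_time_s = 1e-6   # seconds per repetition (state preparation + readout)

seconds_per_energy = shots_per_energy * circuit_time_s  # 100 s per point
steps_per_day = 86_400 / seconds_per_energy
print(f"{seconds_per_energy:.0f} s per energy, {steps_per_day:.0f} steps/day")
# -> 100 s per energy, 864 steps/day
```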
Notice that NISQ hardware means that hardware noise will always be present, and this noise usually varies with time. This is why an optimization run that lasts for more than one day will likely never converge: the optimal parameters θ which yield the lowest energy with today's hardware configuration may no longer be optimal the day after. The fact that hardware errors are device- and time-dependent realistically excludes the possibility of parallelizing the shots over several machines, as is customary in conventional QMC: doing so would require preparing exactly the same trial state on different noisy machines.

B. The noisy optimization problem
The second step of any numerical variational calculation is optimization. In this Perspective, we aim to maintain a high-level tone and will not be concerned with details such as which optimization method is better or worse depending on the type of circuit. In general, optimizing the parameters is a complicated task that is delegated to the classical part of the algorithm. Heuristic circuits are very short but have a much larger number of variational parameters than those inspired by chemistry, such as the unitary coupled cluster, or physics, such as the Hamiltonian variational ansatz. 4 The latter have longer circuits but allow for a more stable optimization.
A wealth of literature focuses on theoretical roadblocks, such as the existence of barren plateaus 69 and the fact that optimization itself is an NP-hard problem. 70 However, we observe that even in the conventional case the optimization of parameters occurs in a corrugated landscape. Nevertheless, it is almost routine to optimize thousands of variational parameters in variational Monte Carlo (VMC). 71 Moreover, barren plateaus are a concept borrowed from the quantum machine learning community and are likely not relevant, or at least not the real bottleneck, in the case of structured ansatzes 14,72,73 (i.e. non-random or heuristic), which should be the norm for studying physical systems. The reason why this concept has never arisen in conventional VMC is that no one has ever tried to optimize molecular systems starting from quasi-random trial states.
In this Perspective, we remain faithful to our practical approach and briefly analyze the problems arising from the simple existence of statistical noise. First, we observe that, since the zero variance property does not hold, we can only use the energy, and not its variance, as a cost function. Optimizing using finite differences is inefficient, since each step is affected by statistical noise. Typically, in VMC, this problem can be solved through correlated sampling 7 , which is not possible in this case. The other possibility is to calculate the expectation value of the generalized forces f_i, defined as the derivatives of the energy with respect to the variational parameters. For simple circuits, calculating the force is possible thanks to a technique called the parameter shift rule 5 , and extensions are possible for more structured circuits. 14 Due to the no-free-lunch theorem, the statistical error that we had on the energy translates into a statistical error on the forces. The optimization effectively becomes a stochastic gradient descent, which resembles a discrete Langevin equation at finite effective temperature,

θ_i^(k+1) = θ_i^(k) − δ f_i + η_i^shot,

where η_i^shot is a Gaussian distributed random variable, and δ is a finite integration step. Astrakhantsev et al. 14 have shown that the statistical error defines an effective temperature, T_shot, proportional to the variance of the random variable η_i^shot. Below a certain number of samples M*, and therefore above a certain effective noise temperature, the search is unsuccessful. Above it, the optimization becomes possible. Moreover, in the M ≫ M* regime, the infidelity of the state preparation seems to scale as 1/Δ^2, where Δ is the energy gap between the ground and the first excited state. These numerical results have been obtained on challenging j_1 − j_2 Heisenberg models (cf. Ref. 74 ), and it would be interesting to check how they generalize to chemistry problems. On the optimistic side, the critical number of samples M* seems not to scale exponentially with the system's size, thus allowing in principle efficient VQE optimization in the presence of quantum measurement noise. Moreover, in this state-of-the-art VQE study, barren plateaus are not observed.
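The effect described above can be reproduced with a deliberately minimal toy model: a single parameter, a quadratic cost, and a force corrupted by Gaussian noise whose standard deviation scales as 1/√M, mimicking M measurement shots. All numbers here are illustrative and not taken from Ref. 14.

```python
import numpy as np

rng = np.random.default_rng(0)

def noisy_descent(M, steps=2000, delta=0.05):
    """Gradient descent on E(theta) = theta^2 with shot noise on the force.

    The update resembles a discrete Langevin dynamics: the Gaussian term
    plays the role of eta^shot, with variance ~ 1/M, so a larger shot
    budget M corresponds to a lower effective temperature T_shot."""
    theta = 2.0
    for _ in range(steps):
        force = 2.0 * theta + rng.normal(scale=1.0 / np.sqrt(M))
        theta -= delta * force
    return theta

# The residual distance from the minimum theta* = 0 shrinks as M grows.
for M in (1, 100, 10_000):
    print(M, abs(noisy_descent(M)))
```

The final parameter fluctuates around the minimum with an amplitude set by the effective temperature, i.e. by the shot budget M, rather than converging deterministically.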
Another point of contact between variational quantum computing and conventional Monte Carlo is that techniques well-known in the latter for decades are slowly being adopted in this new field. Perhaps one of the most important is the use of the so-called quantum information matrix to precondition the gradient at each step with the following matrix,

S_ij = Re[ ⟨∂_i ψ|∂_j ψ⟩ − ⟨∂_i ψ|ψ⟩⟨ψ|∂_j ψ⟩ ],   (9)

where |∂_j ψ⟩ is the derivative of |ψ⟩ with respect to the j-th variational parameter. Although most of the community believes that Eq. 9 comes from machine learning, where it is used in the natural gradient, 75 it has actually been used for more than twenty years in VMC to optimize trial wave functions for chemistry and condensed matter. 7 It was introduced by Sorella in the stochastic reconfiguration method 76,77 and later given the geometric meaning of a metric of the space of variational parameters. 78 A weak regularization of the diagonal is sufficient to obtain stable and effective optimizations, as has also been shown in the quantum case. 79 However, the measurement problem heavily affects the efficiency of the algorithm in this case too. In VMC, the matrix S can be evaluated with negligible overhead using the same samples x, distributed as |ψ(x)|^2 (which can be evaluated classically there), that were already generated for the energy calculation. In the quantum case, each matrix element must be statistically evaluated using (uncorrelated) repetitions of a specific circuit for the pair i, j. Moreover, it is not trivial to obtain a circuit for each element S_ij, and a block-diagonal approximation of S, where i and j belong to the same block, is the most feasible solution. 79 At the moment, an interesting development to overcome this problem is a heuristic combination of stochastic reconfiguration with the SPSA optimizer. In this case, the matrix S, which would require N_par^2 circuits, is approximated by a Hessian calculated using only two random directions in the parameter space. 80 However, the numerical benchmarks proposed to validate the method are too small to fully understand the real possibilities of this simplified optimizer.
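As an illustration of how S is used on the classical side, the following sketch implements one stochastic reconfiguration step from samples of the logarithmic derivatives O_j(x) = ∂_j ln ψ(x) (taken real, as in standard VMC): the covariance estimator of S, the weak diagonal regularization, and the preconditioned update. The function name and numbers are our own.

```python
import numpy as np

def sr_update(theta, O, f, delta=0.05, eps=1e-3):
    """One stochastic reconfiguration (natural gradient) step.

    theta : (P,) current variational parameters
    O     : (M, P) samples of log-derivatives O_j(x_k) = d ln psi / d theta_j
    f     : (P,) estimated generalized forces (energy gradients)
    For real log-derivatives, S_ij = <O_i O_j> - <O_i><O_j> is the standard
    VMC estimator of the matrix in Eq. 9; eps adds the weak diagonal
    regularization before solving S dtheta = f."""
    mean = O.mean(axis=0)
    S = O.T @ O / O.shape[0] - np.outer(mean, mean)
    S += eps * np.eye(len(theta))
    return theta - delta * np.linalg.solve(S, f)

# Tiny deterministic example: zero-mean samples with covariance diag(0.5, 0.5).
O = np.array([[1.0, 0.0], [-1.0, 0.0], [0.0, 1.0], [0.0, -1.0]])
theta_new = sr_update(np.zeros(2), O, f=np.ones(2))
print(theta_new)
```

In VMC the same samples x serve both the energy and S; the quantum analogue must instead estimate each S_ij from dedicated circuits, which is precisely the overhead discussed above.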
Finally, another connection between VQE and QMC arises in the context of using noisy ionic forces in molecular dynamics (MD) simulations. In such cases, it becomes impractical to follow an energy-conserving trajectory. Sokolov et al. 81 utilize a technique similar to one proposed for QMC-powered MD simulations many years ago. 82 They use shot noise to define an effective Langevin MD, enabling unbiased simulations at constant temperature.

VI. FAULT-TOLERANT QUANTUM CHEMISTRY
At this point, we face the conundrum that a quantum computer can in principle store an exact wavefunction, but we cannot practically evaluate its energy, or other expectation values, using the variational methods introduced so far. However, quantum computing admits an efficient method for calculating the energy, even more efficient than Monte Carlo methods. These methods are based on the quantum phase estimation (QPE) algorithm or its successive variants.
The standard version of this algorithm, 22 which finds applications far beyond chemistry, works as follows. Suppose we have a unitary operator U and one of its eigenstates |ψ_n⟩. We can evaluate its eigenvalue λ_n = e^{i2πϕ_n} (expressible as a function of its phase, since U is unitary) using an extra register of r qubits and r controlled operations cU, where the first operation controls the application of U, the second of U^2, the third of U^4, and so on, until the final U^{2^{r−1}}. Finally, it is sufficient to apply an inverse quantum Fourier transform to read the phase value in binary form, truncated to r bits.
To arrive at an implementation that interests us, we simply identify this generic unitary operator U with e^{iHt}, i.e., a Hamiltonian evolution operator, and the starting state with an eigenstate of H, for example the ground state. In this case, the phase we read is E_n t. The operation U^{2^{r−1}} translates into an evolution for a time t 2^{r−1}.
It is possible to show that the error (due to truncation to r bits) we make on the phase E_n t scales inversely with the total number of applications of U. This is because the discretization error scales as 2^{−r}, but each additional bit doubles the length of the circuit, so the total number of controlled-U applications is 2^r − 1. In the literature, this is often quoted as a quadratic speedup compared to a Monte Carlo evaluation, whose error scales as 1/√M, where M is the number of samples, and thus the number of calls of the function to be evaluated on the generated distribution. If we interpret the number of cU operations applied to the state |ψ_n⟩ as the number of "function calls", the asymptotic comparison with Monte Carlo can be made, keeping these specifications in mind.
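The 2^{-r} scaling can be checked directly from the textbook outcome distribution of QPE, which has a closed form and needs no circuit simulation. The function below is a plain numerical evaluation of that distribution; the test phase is an arbitrary choice.

```python
import numpy as np

def qpe_distribution(phi, r):
    """Exact outcome probabilities of textbook QPE with an r-qubit register.

    For an eigenphase phi in [0, 1), the outcome k (an r-bit integer)
    estimates phi ~ k / 2^r; the amplitudes are the inverse QFT of the
    uniformly weighted, phased register."""
    N = 2 ** r
    m = np.arange(N)
    return np.array([np.abs(np.exp(2j * np.pi * m * (phi - k / N)).sum() / N) ** 2
                     for k in range(N)])

phi = 0.3017
for r in (3, 6, 9):
    k_best = int(np.argmax(qpe_distribution(phi, r)))
    # Each extra register bit halves the phase error, but doubles the
    # number of controlled-U applications in the circuit.
    print(r, abs(phi - k_best / 2 ** r))
```

The most likely readout is the best r-bit approximation of phi, so the error shrinks exponentially in r while the circuit length grows exponentially, reproducing the 1/N trade-off stated above.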
There are now two complications to consider. The controlled operation e^{iHt} cannot be implemented exactly but requires approximations. The most established is the Trotter step decomposition, which has the advantage of not requiring additional qubits. 37,83 Recently, other methods with better asymptotic scaling but requiring additional qubits, such as the linear combination of unitaries 84 and qubitization 85 , have surpassed Trotter's method in popularity. For example, the first runtime and resource estimation works for chemistry, by Troyer and coworkers 33 , assumed that QPE with Trotter had to be used, while more recent works use qubitization (plus many other tricks to shorten the circuit). 16,88 The first crucial observation is that now even a ground-state calculation requires a black box that implements real-time dynamics, or closely related objects. This brings us back to the initial discussion about the complexity of implementing Hamiltonian simulations of fermionic systems, much more complex than their spin-lattice counterparts. Given that we require quantitative precision on the time evolution, the Hamiltonian evolution algorithm requires a full fault-tolerant implementation.
The second observation is that these algorithms require the challenging assumption of having the ground state as their input. What happens if the state on which we apply QPE is not the ground state? Here there is good news and bad news. Let's start with the good news: unlike the classical case, if we input a generic state Φ (classically, the equivalent would be preparing a generic ansatz and sampling with Metropolis from it), when we read the phase register, the state must collapse onto an eigenstate |ψ_n⟩ of H, and the read-out phase is ϕ_n. Therefore, the energy readout in the auxiliary register determines the collapse of the previously initialized state Φ onto an eigenstate of H. The good news is therefore that we will not read a random number, but one of the possible eigenvalues. The bad news is that we do not know which one. In general, we will read the eigenvalue n with a probability given by the overlap |⟨Φ|ψ_n⟩|^2. It then becomes crucial that, if we are interested in the ground state, the initial state is not completely random, but has a sizable overlap with the ground state.
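A two-level toy example makes the collapse statistics concrete: whichever QPE circuit is used, the eigenvalue that is read out is drawn with probability |⟨Φ|ψ_n⟩|^2. The Hamiltonian and trial state below are arbitrary illustrative choices.

```python
import numpy as np

rng = np.random.default_rng(42)

# A 2x2 toy Hamiltonian and its eigenpairs (columns of V are the |psi_n>).
H = np.array([[0.0, 0.5], [0.5, 1.0]])
E, V = np.linalg.eigh(H)

# Trial state with a deliberately large ground-state overlap: cos^2(0.2) ~ 0.96.
phi = np.cos(0.2) * V[:, 0] + np.sin(0.2) * V[:, 1]
overlaps = np.abs(V.T @ phi) ** 2   # collapse probabilities |<Phi|psi_n>|^2

# Each QPE run reads one eigenvalue, sampled with these probabilities.
readouts = rng.choice(E, size=10_000, p=overlaps)
print("P(ground) =", overlaps[0])
print("fraction of ground-state readouts:", np.mean(readouts == E[0]))
```

If the ground-state overlap decays exponentially with system size, the number of repetitions needed to read E_0 grows exponentially, which is exactly the scaling concern discussed next.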
Generally, in chemistry and materials science, we are interested in how the runtime scales with size, i.e., the number of electrons, basis functions, etc. If the overlap vanishes exponentially with size, the entire procedure becomes exponentially long, nullifying the exponential advantage that could initially be envisioned, given the compression in memory and the possibility of efficiently reading the energy. A recent study focused on this aspect, showing that this issue, known in principle but often forgotten in practice, could seriously undermine the claim of exponential advantage for electronic structure. 89 It should be noted, however, that even a polynomial advantage could be sufficient to solve problems that are still intractable, as can be seen from how Density Functional Theory has revolutionized chemistry and materials science thanks to its improved N^3 scaling compared to the N^6, N^7 scalings of coupled cluster.
To conclude, the absence of an exponential speed-up does not rule out the existence of a practical quantum advantage, which is more difficult to identify a priori and must be assessed on a case-by-case basis. In this context, resource estimation studies focusing on particular molecular systems are of great importance. Goings et al. 16 perform resource estimates to simulate a challenging system for classical methods, the cytochrome P450 enzyme. The estimates depend on the hardware noise that needs to be corrected. To simulate the ground state using an accurate active space, ∼ 5×10^6 (5×10^5) physical qubits with error rates of 0.1% (0.001%) would be needed. Concerning materials science systems, state-of-the-art studies are represented by Rubin et al. 90 and Ivanov et al. 91 , which move away from the plane-wave basis set and combine Bloch or Wannier orbitals, respectively, with the most recent techniques such as sparse qubitization or tensor hypercontraction. Resource estimates applied to a Lithium Nickel Oxide battery cathode 90 and to transition metal oxides 91 indicate longer fault-tolerant runtimes compared to molecular systems such as P450. Clearly, such estimates are based on state-of-the-art algorithms, including the most efficient way to encode fermionic Hamiltonians for phase estimation, and on the current state of error correction algorithms. Further algorithmic developments will improve the cost of the simulations 15 . Orders of magnitude in efficiency have been gained compared to just ten years ago, 38 and therefore the threshold for quantum advantage could shift in one direction or another, approaching when new quantum strategies are invented, or perhaps moving away thanks to the constant progress of "conventional" methods such as DMRG or QMC.

VII. THE LOCAL ENERGY IN VARIATIONAL MONTE CARLO
After extensively introducing quantum computing algorithms for chemistry and specifically discussing the practical limitations of variational approaches, let's move on to the classical case. It is very instructive to understand why classical variational methods do not suffer from the same variance problem, to guide us in inventing equally efficient energy estimators. Let's start again from the formal definition of the energy's expectation value over a general (unnormalized) state |ψ⟩,

E = ⟨ψ|H|ψ⟩ / ⟨ψ|ψ⟩ = Σ_x ⟨ψ|H|x⟩⟨x|ψ⟩ / Σ_x ⟨ψ|x⟩⟨x|ψ⟩,   (10)

where we insert the identity Σ_x |x⟩⟨x| in the denominator and numerator. Notice that here we use the notation for a discrete Hilbert space, but the formula can be generalized to continuous models by replacing the sum with an integral (Σ_x → ∫dx). Some steps are necessary to transform Eq. 10 into the typical Monte Carlo format, where we integrate the product of a probability distribution from which we can sample, p(x), and an objective function. This is achieved formally by dividing and multiplying by ψ(x) = ⟨x|ψ⟩. Eq. 10 then becomes

E = Σ_x p(x) E_L(x),   (11)

where the local energy is defined as

E_L(x) = ⟨x|H|ψ⟩ / ⟨x|ψ⟩,   (12)

and is a quantity that can be evaluated locally for the configuration x (see below). The probability distribution is defined as p(x) = |ψ(x)|^2 / Σ_{x'} |ψ(x')|^2. If we assume that we can sample configurations x ∼ p(x), e.g. using a Markov-chain algorithm, the energy can be evaluated stochastically as

E ≈ (1 / M_VMC) Σ_{i=1}^{M_VMC} E_L(x_i),   (13)

using M_VMC decorrelated samples taken from a Markov-chain algorithm, such as Metropolis 92 . This technique is called Variational Monte Carlo (VMC) 7 and is efficient as long as (i) computing E_L(x) is efficient, and (ii) it is possible to run the Metropolis algorithm efficiently as well. Since classical trial functions ψ(x) can be evaluated with numerical precision for each x, so can the ratio |ψ(x′)|^2/|ψ(x)|^2 for each pair x, x′, which is needed to perform a Metropolis update. 93 One of the most important features of the local energy is that its variance is zero when |ψ⟩ is an eigenstate of H.
Indeed, if H|ψ⟩ = E|ψ⟩, then E_L(x) = ⟨x|H|ψ⟩ / ⟨x|ψ⟩ = E ⟨x|ψ⟩ / ⟨x|ψ⟩ = E for every configuration x. In practice, this means that the local energy function will be closer to a constant value as the trial state |ψ⟩ approaches the ground state. This results in reduced statistical fluctuations in Eq. 13.
At the same time, it is possible to use the variance of the local energy as a cost function for the optimization. This in principle allows one to certify the success of the minimization, as the ground state is signaled by zero statistical error.

A. The local energy in practice
The calculation of the local energy depends on the model and the wave function. In continuous space, the wave function can be given by a Slater determinant ansatz (for fermionic systems), usually complemented with an explicit correlation operator like the Jastrow factor. 7,94 In this case, evaluating the local energy reduces to applying the Laplacian operator to the function in real space and dividing by the function itself. For the sake of clarity, let's consider a toy example. The local energy for an (unnormalized) Gaussian trial ansatz ψ(x) = e^{−θx^2}, in continuous space, and a typical one-dimensional Hamiltonian H = −1/2 (∂^2/∂x^2) + V(x), reads

E_L(x) = −(1/2) ψ″(x)/ψ(x) + V(x) = θ − 2θ^2 x^2 + V(x).   (14)

The local energy depends on x and on the variational parameter θ, which can be optimized in an outer loop. In this case, it can be observed that if the external potential is a harmonic oscillator, V(x) = ω^2 x^2 / 2, the local energy becomes

E_L(x) = θ + (ω^2/2 − 2θ^2) x^2.   (15)

The local energy no longer depends on x when the variational parameter takes the value θ = ω/2, for which the variational ansatz becomes exact. Moreover, it then takes the value E_L^{h.o./opt.} = ω/2 with zero statistical fluctuations in Eq. 13, as E_L does not depend on the sampled point x_i anymore. Modern codes for solving the many-electron Schrödinger equation in chemistry or materials science feature sophisticated trial ansatze, which are in turn functions of atomic orbitals. 71 While in the past introducing a new ansatz required coding new functions for evaluating the derivatives, now the evaluation of the local energy can be delegated to algorithmic differentiation routines. This allows for the adoption of fairly sophisticated ansatze in VMC.
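The zero-variance property of this toy example can be verified with a few lines of VMC: Metropolis sampling of |ψ(x)|^2 and the local energy of the Gaussian ansatz derived above. At θ = ω/2 the local energy is a constant and the statistical fluctuation vanishes identically; the step size and sample counts below are arbitrary choices.

```python
import numpy as np

rng = np.random.default_rng(1)

def local_energy(x, theta, omega=1.0):
    # Local energy of psi = exp(-theta x^2) with V(x) = omega^2 x^2 / 2:
    # E_L = theta + (omega^2/2 - 2 theta^2) x^2
    return theta + (omega ** 2 / 2 - 2 * theta ** 2) * x ** 2

def vmc_energy(theta, n_samples=20_000, step=1.0):
    """Metropolis sampling of |psi(x)|^2 = exp(-2 theta x^2)."""
    x, e_loc = 0.0, []
    for _ in range(n_samples):
        x_try = x + step * rng.uniform(-1, 1)
        # Acceptance ratio |psi(x_try)|^2 / |psi(x)|^2.
        if rng.uniform() < np.exp(-2 * theta * (x_try ** 2 - x ** 2)):
            x = x_try
        e_loc.append(local_energy(x, theta))
    return np.mean(e_loc), np.std(e_loc)

E_opt, sigma_opt = vmc_energy(theta=0.5)  # exact ansatz: theta = omega/2
E_bad, sigma_bad = vmc_energy(theta=0.3)  # approximate ansatz
print(E_opt, sigma_opt)   # exactly 0.5 with zero fluctuations
print(E_bad, sigma_bad)   # higher energy, finite fluctuations
```

Note that at the optimal parameter the estimator is exact with any number of samples, which is precisely what the Pauli-measurement estimator of VQE lacks.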
71,95 The local energy shows up in every VMC calculation, including lattice models such as spin and Hubbard models. In this case, the spatial derivatives are replaced by non-diagonal quantum operators, such as spin-flip or hopping operators, H_{x,x′} = ⟨x′|H|x⟩, where x, x′ can represent a specific spin configuration or an occupation state of fermions or bosons on a lattice. In this discrete basis, the local energy is written as

E_L(x) = Σ_{x′} H_{x,x′} ψ(x′) / ψ(x),   (16)

and can be computed efficiently as long as the number of states x′ such that the Hamiltonian matrix elements H_{x,x′} ≠ 0, at fixed x, increases only polynomially.
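For instance, in the transverse-field Ising chain discussed below, each configuration x connects only to the L configurations obtained by a single spin flip, so Eq. 16 costs O(L) wave-function ratios per sample. The product ansatz and its parameter in this sketch are made up purely for illustration.

```python
import numpy as np

def log_psi(s, a=0.1):
    # Toy (unnormalized) product ansatz: psi(s) = exp(a * sum_j s_j).
    return a * s.sum()

def local_energy(s, J=1.0, Gamma=1.0, a=0.1):
    """Eq. 16 for H = -J sum_j s^z_j s^z_{j+1} - Gamma sum_j s^x_j (periodic)."""
    # Diagonal matrix element H_{x,x}.
    e = -J * np.sum(s * np.roll(s, -1))
    # Off-diagonal terms: only single spin flips connect to x, each with
    # H_{x,x'} = -Gamma, weighted by the wave-function ratio psi(x')/psi(x).
    for j in range(len(s)):
        s_flip = s.copy()
        s_flip[j] *= -1
        e += -Gamma * np.exp(log_psi(s_flip, a) - log_psi(s, a))
    return e

s = np.array([1, 1, -1, 1, -1, -1])
print(local_energy(s))
```

Only L off-diagonal elements are non-zero at fixed x, so the cost per configuration is linear in the chain length, which is exactly the polynomial-sparsity condition stated above.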

B. Pauli measurements versus local energy
To better illustrate these concepts, it can be instructive to perform a numerical experiment on a toy model, the one-dimensional transverse field Ising model,

H = −J Σ_j σ^z_j σ^z_{j+1} − Γ Σ_j σ^x_j,   (17)

where σ^α are Pauli matrices, and consider the critical transition point at J = Γ = 1. We also consider a short chain of L = 10. We denote a generic computational basis configuration as |x⟩ = (s_1, · · · , s_L), where s_k are the eigenvalues {1, −1} of the σ^z_k operator. In this case, the spin Hamiltonian is already expressed in Pauli terms (one just needs to re-define the eigenvalues of the spin-z operator from {1, −1} to {0, 1}). Regarding the VQE approach, the energy can be measured in only two bases: the computational basis and the "XX · · · X" basis, obtained by applying a Hadamard gate, H, on each qubit at the end of the circuit that prepares the variational state.
In this numerical experiment, we use the Hamiltonian variational ansatz, with a sufficiently deep circuit of up to 24 layers, resulting in up to 48 variational parameters (see Appendix A). By optimizing ansatze characterized by different circuit depths (without shot noise, for simplicity), it is possible to obtain trial states systematically closer to the exact ground state of the model. 74 In Fig. 2, we use depths ranging from 12 to 24, and we can reach a relative error on the energy of 10^−5 compared to the exact ground state energy, E_0.
However, the statistical error on the energy, which is evaluated with Eq. 5, does not improve.In fact, if we had tried to optimize the circuit using the noisy energy estimator, we would not have been able to obtain such accurate optimized trial states.This clearly demonstrates that the estimator does not possess the zero variance property, as opposed to the VMC calculation.
To obtain the standard deviations in Fig. 2, we repeat the estimation of the variational energy, Eq. 5 (Eq. 13 for the VMC case described below), 100 times, obtaining a population of variational energies that could be realized with the given variational setting and number of shots, M_j (M_VMC for the VMC case). We use a number of shots M_j, M_VMC which is smaller (10^2), comparable (10^3), or larger (10^5) than the Hilbert space dimension of the model, i.e. 2^10 = 1024.
For the VMC comparison, we deliberately use a fairly simple classical ansatz, a long-range Jastrow state, which features only 5 variational parameters for L = 10 (see Appendix A). Although this classical ansatz only reaches a moderate relative accuracy of 10^−3 at best, the statistical error on the energy consistently improves, outperforming the statistical error obtained with the quantum circuit. Notice that this is an easy model for VQE: the number of measurement bases is the minimum possible for a genuine quantum many-body system. Electronic structure Hamiltonians unfold into thousands of Pauli operators, which in turn require similar numbers of bases. This numerical example demonstrates the power of the local-energy-based estimator compared to the Pauli-measurement one. From this example, we can also learn the following lesson: even finding the smallest possible set of bases to measure H will not solve all our problems, as this estimator still lacks the zero-variance property.
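The lesson can be checked on an even smaller chain with exact diagonalization: on the exact ground state, the local energy of Eq. 16 has strictly zero variance, while the two-basis Pauli estimator retains a finite single-shot variance. The following sketch uses dense linear algebra for a 6-site chain; the system size and couplings are our own choice.

```python
import numpy as np
from functools import reduce

L_sites, J, Gamma = 6, 1.0, 1.0
sz = np.diag([1.0, -1.0])
sx = np.array([[0.0, 1.0], [1.0, 0.0]])
I2 = np.eye(2)

def site_op(op, j):
    # Operator acting with `op` on site j and identity elsewhere.
    return reduce(np.kron, [op if k == j else I2 for k in range(L_sites)])

H_zz = sum(-J * site_op(sz, j) @ site_op(sz, (j + 1) % L_sites) for j in range(L_sites))
H_x = sum(-Gamma * site_op(sx, j) for j in range(L_sites))
evals, evecs = np.linalg.eigh(H_zz + H_x)
E0, psi = evals[0], evecs[:, 0]            # exact ground state

# Pauli estimator: H_zz sampled in the z basis, H_x in the x basis.
Had = reduce(np.kron, [np.array([[1.0, 1.0], [1.0, -1.0]]) / np.sqrt(2)] * L_sites)
p_z, d_z = psi ** 2, np.diag(H_zz)         # z-basis outcome distribution and values
psi_x = Had @ psi
p_x, d_x = psi_x ** 2, np.diag(Had @ H_x @ Had)  # H_x is diagonal after Hadamards
var_pauli = (p_z @ d_z ** 2 - (p_z @ d_z) ** 2) + (p_x @ d_x ** 2 - (p_x @ d_x) ** 2)

# Local energy: E_L(x) = (H psi)(x) / psi(x) = E0 on every configuration.
E_L = ((H_zz + H_x) @ psi) / psi
print(var_pauli, np.var(E_L))
```

The single-shot Pauli variance stays finite even for the exact eigenstate, so the statistical error only decreases as 1/√M, whereas the local-energy estimator is exact sample by sample.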
A hybrid solution has been proposed by Torlai et al. 96 . They use quantum state tomography, based on neural networks 97 and a tomographically incomplete basis set, to obtain a classical reconstruction of the quantum state. Classical VMC can then be applied to this classical approximation to calculate, precisely, the energy. This method solves the variance problem, but it introduces a bias stemming from a possibly, and likely, imperfect reconstruction of the quantum state. Moreover, it raises the question of finding the range of applicability of the method: if the quantum states can indeed be represented by a classical ansatz, then one could directly reach the ground state by optimizing the latter, without the need for a quantum computer.

C. Beyond variational: projective methods
The local energy is a central concept in quantum Monte Carlo beyond the simplest VMC method because, in practice, every projective QMC method requires a trial wave function ψ to alleviate the sign problem in fermionic simulations or, more generally, to reduce statistical fluctuations. These projective methods, such as Diffusion Monte Carlo 7 or Auxiliary Field Monte Carlo (AFQMC) 98 , improve upon the VMC energy but still rely on a variational state for importance sampling. Thus, the local energy resurfaces in these contexts as well. An accurate QMC simulation is rarely seen without a good variational starting point. 99 One can therefore see a similarity between the importance of VMC, which is foundational for a more accurate calculation with projective QMC, and the significance of the initial state preparation for a successful execution of QPE in quantum computing.

FIG. 2. Standard deviation of the energy estimator, for quantum circuits measured in the Pauli bases (circles), and for a simple classical ansatz using the local energy, Eq. 16 and Eq. 13 (diamonds). Different colors indicate different sampling sizes, M_j, M_VMC = 10^2, 10^3, 10^5. In the quantum case, the dataset is made of M_j wavefunction collapses per basis (of which there are two for the model considered), while for the classical case it is made of M_VMC spin configurations, x_i, sampled from the trial state as |ψ(x_i)|^2. In both cases, we prepare different ansatze, within the same ansatz class, but having different accuracies. For each trial state, we plot the standard deviation of the energy vs. the relative error of its variational energy E_var (computed exactly). The zero variance property only holds in the VMC case, since the statistical error of the quantum energy estimator remains finite even when the trial state approaches the exact limit.

It is highly likely that this duality between variational and projective methods (in imaginary time in the classical case and in real-time in
the quantum case) will extend to quantum computing.In that case, algorithms like VQE or its variants, despite being considered already old-fashioned by some, will remain central even in the fault-tolerant regime as state preparation subroutines.

VIII. QUANTUM COMPUTING MEETS QUANTUM MONTE CARLO A. The local energy in quantum computing
The existence of a local energy estimator in quantum computing would eliminate any variance problem in VQE. However, it is not straightforward to apply this trick in quantum computing, simply because evaluating the ratio ⟨x′|ψ⟩ / ⟨x|ψ⟩ becomes extremely demanding in general. 100 Here, ψ(x) = ⟨x|ψ⟩ needs to be evaluated from quantum measurements, and hence it is affected by statistical noise. While evaluating ψ(x) to additive precision is possible, the local energy involves ratios of amplitudes. Maintaining a fixed precision on the ratio is costly, because quantum states generally have an exponentially large support. This translates into exponentially vanishing amplitudes in the denominator of Eq. 16.
These statistical fluctuations are different from those found in a standard Monte Carlo calculation. In VMC, the local energy can always be computed with numerical precision, and the fluctuations arise from the finite number of samples, M_VMC in Eq. 13 (in the presence of an approximate trial state). Here, uncontrolled statistical fluctuations arise solely from the estimation of the local energy at a fixed x_i.
We are witnessing an increase in works that aim to combine quantum computing and quantum Monte Carlo. Huggins and coworkers proposed an interesting combination of quantum computing and AFQMC. 101 In this work, they use a circuit to generate the trial wave function, from which samples are drawn (in this representation, the configuration x is a Slater determinant). The AFQMC algorithm then proceeds unchanged, and the supposed advantage of the method lies in using a circuit to generate an ansatz that could be inaccessible classically. Mazzola and Carleo 102 showed that the procedure, when adapted to many-body lattice models at criticality, thus using Green's function Monte Carlo instead of AFQMC, exhibits an exponentially scaling behavior with a hard exponent. This is due to Eq. 18 and to the fact that strongly correlated states have vanishing overlaps on the configuration basis, necessitating an exponentially increasing number of samples to compute the local energy. It is estimated that a reasonably accurate ground state calculation of a 40-site transverse-field Ising model (Eq. 17) requires on the order of 10^13 measurements. Assuming a gate frequency of 10 kHz (i.e. a fault-tolerant implementation, see Sec. II) and a circuit depth of O(10) layers to generate an accurate trial state, this implies a runtime of a few thousand years, for a system within reach of exact classical diagonalization.
Other works on this topic appeared almost simultaneously last year. Zhang et al. 103 introduced a quantum computing adaptation of FCIQMC 104 . In this work, a quantum circuit U is used to create a 'quantum' walker |x̄⟩ = U|x⟩, i.e. a linear combination of Slater determinants |x⟩, undergoing the FCIQMC subroutine. The idea is interesting, as it could counter the exponential explosion of the determinants/walkers during the imaginary time projection by compressing logarithmically the memory needed to store them. A possible major drawback of this method is that the Hamiltonian H_{x,x′} in this new basis is no longer sparse. Xu and Li 105 proposed to use Bayesian inference to reduce the number of shots required to compute the local energy. Kanno et al. 106 further combine the ideas of Ref. 101 with tensor networks. Yang et al. 107 propose a way to speed up real-time path-integral MC already on NISQ hardware. Tan et al. 108 devise instead the integration with the Stochastic Series Expansion, another flavour of QMC used for spin models.
Finally, two recent works propose to use quantum data in a conventional VMC framework. In this case, the local energy is calculated on conventional hardware. Montanaro and Stanisic 109 propose the usage of a VQE circuit as an importance sampler to speed up the first iteration of a VMC simulation. Moss et al. 110 use quantum data from Rydberg atom simulators to train a classical neural-network ansatz (as in Ref. 96 ) and further optimize it in a VMC fashion.
Overall, it is likely that an efficient way to estimate the local energy is possible only for sparse states, i.e. those for which the number of non-zero overlaps ⟨x|ψ⟩ ≠ 0 grows only polynomially with the system's size. However, it remains to be understood whether a quantum computer is really needed to tackle such systems in the first place. 111 Furthermore, if a suitable basis transformation U can be found to reduce the support of such states, then (1) this transformation should not spoil the sparsity of the Hamiltonian H_{x,x′}, to keep the evaluation of Eq. 16 efficient, and (2) if this transformation exists, it can be used to diagonalize the system efficiently in a reduced sub-space, without the need for QMC.
On a more positive note, it is not excluded that, despite exhibiting exponential scaling, the aforementioned approaches could yield a better exponent than the best classical method for some specific fermionic systems. To achieve this, it will be crucial to start with a classically intractable trial state to justify the subsequent imaginary-time projection. Further research and methodological advancements are required to assess the true potential of the method in the presence of shot noise.
Overall, the pursuit of an efficient method for calculating energy, inspired by the local energy in VMC, is a field of research that we hope will yield numerous fruitful results.It is necessary for the quantum computing and QMC communities to clearly understand the limitations and potentialities of their respective techniques in order to invent new hybrid algorithms at the interface of these two worlds.

B. Classical-inspired circuits for VQE, quantum-inspired ansatze for VMC
The techniques and methods that have been used for decades in QMC are so numerous that many have been (and many are waiting to be) exported to quantum computing. Trial functions play a central role in VMC. The use of explicitly correlated, non-separable ansatze has brought great success to VMC and is basically a clever solution to compress the electronic wavefunction, which, when described in the space of determinants, otherwise requires an exponential number of coefficients. The latest iteration of this concept is the introduction of neural network quantum states by Carleo and Troyer in 2017, 112 which can be seen as more general forms of Jastrow, 94,97 back-flow, 113,114 and tensor network states. 115 As mentioned earlier, compressing the Hilbert space within a polynomially scaling quantum memory enables the manipulation of arbitrarily large linear combinations of Slater determinants. However, when considering the variational approach, we are constantly seeking shorter quantum circuits that can capture as much entanglement as possible, within the coherence time limitation of NISQ systems.
Several works have already proposed ways to implement a Gutzwiller operator, which is essentially the simplest form of a Jastrow operator, as a quantum circuit. Murta and Fernandes-Rossier 116 propose a method based on post-selection. Typically, the way to create non-unitary operators in quantum computing is by embedding them in a larger system that undergoes unitary evolution, a method also known as "block encoding". This involves introducing ancillary qubits, and it can be certified that the non-unitary operator has been successfully applied to the quantum state if and only if the ancillary register is measured and read in a given state. However, the problem with this approach is that the success probability decreases with the system size, requiring many repetitions.
Seki and coworkers 117 also propose a similar approach, based on a linear combination of unitaries, and therefore also affected by a finite success probability.
Using a different approach, Mazzola and coworkers 118 implicitly defined a hybrid quantum-classical wavefunction with a Jastrow operator applied in post-processing. The scalability of the approach was then improved in Ref. 119 . There, a quantum circuit is used as an importance sampler, and the measured configurations undergo post-processing by a neural network. Benfenati and coworkers 120 instead implemented a Jastrow operator by moving it from the wavefunction to the Hamiltonian. This approach also does not require additional circuits compared to a VQE calculation. However, the redefined Hamiltonian operator features many more Pauli terms to measure. Motta and coworkers devised a quantum imaginary time evolution (QITE) operator without ancillas and post-selection. 121 The original formulation of the method formally incurs an exponential dependence on the correlation length in the general case, because it requires quantum state tomography. However, if truncated, it can generate heuristic trial states for variational calculations, and is still the subject of improvements.
Finally, it is interesting to note that the flow of information is not always from the older discipline to the newer one. Some circuit ansatze used in quantum computing can be adapted to VMC. Inspired by the Hamiltonian variational circuit ansatz 8 (see Appendix A), Sorella devised a method called Variational AFQMC, capable of obtaining state-of-the-art ground state energies of the Hubbard model for various U/t parameters and dopings in the thermodynamic limit. 122

C. Variational real-time dynamics and updates in parameters space
Variational states are not only used for ground state calculations; they can also be used to study dynamics. The price to pay is that the variational state must be flexible enough to accurately describe excited states as well, which can be a demanding constraint, while the advantage is the ability to use much shorter circuits compared to those required, for example, for Trotterization.
From a classical perspective, this area of research has been very active in recent months, as it allows for countering quantum advantage experiments in the quantum dynamics application space (cf. Sec. III). Obviously, the use of a variational state does not allow for exact evolution, but neither do the errors of a NISQ machine. The balance between classical and quantum advantage for real-time dynamics will shift in favor of the latter when fidelity enables the simulation of sufficiently large systems for a sufficiently long time, rendering them inaccessible to classical approximation methods. 23 The subfield of variational real-time dynamics also offers interesting parallels between quantum computing and (time-dependent) VMC 123 . The formalism based on the time-dependent variational principle is the same. In practice, even the fundamental ingredients that allow for the update of the variational parameters are the same: the matrix S defined in Sect. V B (cf. Ref. 123 with Ref. 124 ). As we have seen in the case of optimization, the fact that the elements of the S matrix are subject to statistical noise is a common issue in both implementations. In this case as well, it is reasonable to expect cross-fertilization between the two techniques, regarding both variational forms and efficient ways to evaluate the S matrix. The field of variational algorithms for real-time simulations is very active. Generally speaking, variational parameters can be updated using different pseudo-dynamics θ ′ = θ + δθ to achieve various objectives. While pure energy minimization is the most popular goal, and real-time evolution following the time-dependent variational principle is the second, there are other possibilities. Patti et al. 126 devised an iteration scheme to perform Markov chain Monte Carlo in the quantum circuit's parameter space, i.e., to sample from p(θ) ∼ exp [−β⟨ψ(θ)|H|ψ(θ)⟩]. The resulting equation is a generalization of stochastic gradient descent that ensures detailed balance. This approach could assist in escaping local minima during VQE optimization.
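The matrix S shared by time-dependent VMC and variational quantum dynamics can be made concrete with a small numerical sketch. The snippet below is our own illustration, not code from the cited references: it evaluates one common convention for the metric, S_ij = Re[⟨∂_i ψ|∂_j ψ⟩ − ⟨∂_i ψ|ψ⟩⟨ψ|∂_j ψ⟩], for an arbitrary state-vector parametrization, using central finite differences; the function names and the single-qubit test state are illustrative choices.

```python
import numpy as np

def s_matrix(psi_fn, theta, eps=1e-5):
    """Metric S_ij = Re[<d_i psi|d_j psi> - <d_i psi|psi><psi|d_j psi>] for a
    parameterized state psi_fn(theta) -> complex vector, via central
    finite differences (states are normalized before differentiating)."""
    norm = lambda v: v / np.linalg.norm(v)
    psi = norm(psi_fn(theta))
    grads = []
    for i in range(len(theta)):
        tp, tm = theta.copy(), theta.copy()
        tp[i] += eps
        tm[i] -= eps
        grads.append((norm(psi_fn(tp)) - norm(psi_fn(tm))) / (2 * eps))
    n = len(theta)
    S = np.empty((n, n))
    for i in range(n):
        for j in range(n):
            S[i, j] = np.real(np.vdot(grads[i], grads[j])
                              - np.vdot(grads[i], psi) * np.vdot(psi, grads[j]))
    return S

# single-qubit example: |psi> = cos(t0)|0> + sin(t0) e^{i t1}|1>
def bloch_state(t):
    return np.array([np.cos(t[0]), np.sin(t[0]) * np.exp(1j * t[1])])

S = s_matrix(bloch_state, np.array([0.7, 0.3]))
```

For this single-qubit state, S reduces to the Fubini-Study metric diag(1, sin²θ0 cos²θ0), which provides a simple correctness check for any implementation.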
Similar ideas have been proposed in the VMC context earlier. Mazzola et al. 78 showed that one can obtain an upper bound for the free energy by sampling from p(θ). In VMC, this can be achieved either using a modified Langevin equation for θ or a modified Metropolis acceptance, also known as the "penalty method". 127 In conclusion, manipulating trial states in the presence of statistical noise is a common feature of VMC in all its formulations and scopes. Many ideas have been proposed to achieve stable parameter updates. The VQE community could profit from this established knowledge but also share its own developments and ideas to advance both fields.
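The penalty method can be sketched in a few lines. In the toy example below (our own construction: the double-well energy landscape, the noise level, and the function names are all illustrative assumptions), the log acceptance probability is reduced by β²σ²_Δ/2, which for Gaussian noise of known variance σ²_Δ in the estimated energy difference restores detailed balance on average.

```python
import numpy as np

def noisy_energy(theta, sigma, rng):
    """Stand-in for a shot-noise-limited estimate of E(theta); here a toy
    double-well landscape, with sigma mimicking the measurement noise."""
    exact = (theta ** 2 - 1.0) ** 2
    return exact + sigma * rng.normal()

def penalty_step(theta, beta, sigma, rng, step=0.3):
    """One Metropolis step in parameter space targeting p(theta) ~ exp(-beta E).
    The beta^2 var/2 'penalty' corrects, on average, for Gaussian noise in
    the estimated energy difference."""
    theta_new = theta + step * rng.normal()
    # noisy estimate of dE; variance 2 sigma^2 from two independent estimates
    dE = noisy_energy(theta_new, sigma, rng) - noisy_energy(theta, sigma, rng)
    var = 2.0 * sigma ** 2
    log_a = -beta * dE - (beta ** 2) * var / 2.0
    return theta_new if np.log(rng.random()) < log_a else theta

rng = np.random.default_rng(1)
theta, samples = 0.0, []
for i in range(4000):
    theta = penalty_step(theta, beta=5.0, sigma=0.1, rng=rng)
    if i >= 1000:
        samples.append(theta)
```

With these settings the chain concentrates around the two minima at θ = ±1, despite every energy evaluation being noisy.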

IX. CLASSICAL MONTE CARLO MEETS QUANTUM COMPUTING
In this Section, we completely shift our perspective. Not all chemistry problems are quantum many-body ones; for example, understanding protein folding is a daunting task already in its classical force-field formulation. Likewise, not all problems that a quantum computer can solve are genuine quantum mechanical problems. In fact, in many cases, the opposite is true: the most famous quantum algorithms, which have made the field of Quantum Information renowned, are focused on solving "classical" problems. For instance, Shor's algorithm provides exponential speed-up for factoring integers, and Grover's algorithm enables quadratic speed-up for searching in databases. 22 Other examples include algorithms for linear algebra, optimization, and machine learning. Philosophically speaking, solving a purely classical problem with a quantum machine can be even more intellectually rewarding than simulating a quantum system, where the distinction between computation and simulation becomes less clear.
Up to this point, we have been exploring whether and how well-known techniques in quantum Monte Carlo can be adapted to quantum computing to simulate many-body quantum systems. Now we ask the opposite: can a quantum computer be useful in speeding up a classical Monte Carlo algorithm, where the Hamiltonian is defined solely using classical variables, e.g., classical spins? And more specifically, can we achieve this already on NISQ machines?

A. Autocorrelation of a Markov chain
Markov chain Monte Carlo (MC) algorithms are of fundamental importance in both science and technology to understand models that lack a simple analytical solution. 128,129 MC methods aim to generate statistically independent, representative configurations x i , belonging to the computational space, distributed as a target Boltzmann distribution, ρ(x) = exp(−βV (x)), at finite inverse temperature β = 1/T , where V (x) is a classical potential energy. A Markov chain MC algorithm sequentially generates these representative configurations through a transition probability matrix P (x, x ′ ), which defines which states x, x ′ can be connected along the chain, and the relative probability of the transition x → x ′ (each row of
the matrix P is normalized to one). 130 Among the family of Markov chain MC algorithms, the Metropolis algorithm is certainly the most popular one. 92 Here the transition process takes the form P (x, x ′ ) = T (x, x ′ )A(x, x ′ ), where T (x, x ′ ) and A(x, x ′ ) are, respectively, the proposal and the acceptance probability matrices. The algorithm works as follows: when at state x, a candidate trial configuration x ′ is generated from the distribution T (x, •). The trial configuration is accepted with probability A(x, x ′ ). If accepted, the next element of the chain becomes x ′ ; otherwise, it remains x. 132,133 The efficiency is given by the relaxation or mixing time, which quantifies the speed of convergence towards the equilibrium distribution ρ(x), and is formally given by the inverse of the gap, δ, between the largest and the second-largest (in modulus) eigenvalues of P . Two limiting cases exist. The first involves a local update scheme, based for instance on some physical intuition about the system (e.g. a single spin-flip). This usually produces a new configuration x ′ that is similar to the parent x. This choice increases the acceptance rate, because V (x) ∼ V (x ′ ), but also results in a long sequence of statistically correlated samples, such that long simulations are needed to thoroughly explore the configuration space. On the contrary, a non-local update scheme is more effective in producing uncorrelated samples, but usually at the expense of a vanishing acceptance rate.
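As a concrete reference point, the single-spin-flip Metropolis scheme just described takes only a few lines for the 1D ferromagnetic Ising chain. This is a minimal sketch; the lattice size, temperature, and function name are illustrative choices, not tied to any specific reference in the text.

```python
import numpy as np

def metropolis_ising(L=16, beta=1.0, n_steps=6000, seed=0):
    """Single-spin-flip Metropolis for the 1D ferromagnetic Ising chain,
    V(x) = -sum_i s_i s_{i+1}, with periodic boundaries."""
    rng = np.random.default_rng(seed)
    s = rng.choice([-1, 1], size=L)              # initial configuration x
    energies = np.empty(n_steps)
    for step in range(n_steps):
        i = rng.integers(L)                      # local proposal T(x, .): flip spin i
        # energy change of flipping spin i (only its two bonds are affected)
        dE = 2.0 * s[i] * (s[(i - 1) % L] + s[(i + 1) % L])
        if rng.random() < np.exp(-beta * dE):    # acceptance A(x, x')
            s[i] = -s[i]
        energies[step] = -np.sum(s * np.roll(s, 1))
    return s, energies

s, E = metropolis_ising(L=16, beta=1.0, n_steps=6000, seed=1)
```

Because consecutive configurations differ by at most one spin, the energy trace E is strongly autocorrelated, which is exactly the inefficiency that the non-local schemes discussed next try to remove.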
Interestingly, it took about 30 years after the invention of the Metropolis algorithm in 1953 before the introduction of efficient non-local update schemes for lattice models, the Swendsen-Wang 134 and the Wolff 135 algorithms. These cluster updates solved the critical slowing down of MC simulations at phase transitions in ferromagnetic Ising models, but unfortunately are not as effective for frustrated models. 136,137

B. Sampling from a classical Boltzmann distribution using wavefunction collapses

Recently, the use of a quantum computer, digital or NISQ, has been proposed to generate efficient non-local Metropolis updates T (x, x ′ ) for spin systems. The theoretical framework was first introduced by Mazzola 138 in 2021. Shortly after, Layden et al. 139 demonstrated a quantum-enhanced Markov chain algorithm on real quantum hardware.
Following Ref. 138 , the general idea is rooted in the Fokker-Planck formalism of non-equilibrium statistical mechanics in continuous systems 140 . The Fokker-Planck operator H FP is a parent quantum Hamiltonian of the physical potential V (x), whose spectrum is closely connected with the number of local minima of V (x). For instance, in a double-well model, H FP has two lowest-lying eigenvalues and eigenstates corresponding to the symmetric (antisymmetric) combination of the two Gaussian localized states in the two wells, |ψ L (x)⟩, |ψ R (x)⟩. This idea can be ported to lattice models. Let us consider for simplicity the ferromagnetic Ising model, defined by V = H 1 in Eq. 17, as our classical potential. The task is to sample from the classical Boltzmann distribution exp [−βH 1 (x)]. Here, the autocorrelation time of a local spin-flip update scheme is dominated, at low temperatures, by the rate of the rare-event processes that drive the system from the all-up to the all-down configuration. These processes are necessarily characterized by a nucleation event, exponentially suppressed with β, and the subsequent diffusion of the domain wall separating the ↑ and the ↓ regions 141 . If, however, one can construct a quantum Hamiltonian H such that its low-lying eigenstates are mostly localized on the states (↑↑ • • • ↑) and (↓↓ • • • ↓), one could then prepare and sample configurations from these states with optimal autocorrelation times, through repeated collapses of these wavefunctions.
This quantum Hamiltonian could be the quantum transverse field Hamiltonian in Eq. 17, where we add a quantum driver H 2 to the classical potential H 1 . Here, in the small Γ limit, the two (unnormalized) lowest-lying eigenstates of H are |ψ 0 ⟩ ≈ |ψ L ⟩ + |ψ R ⟩ and |ψ 1 ⟩ ≈ |ψ L ⟩ − |ψ R ⟩. 142 While the gap E 01 between these states vanishes exponentially with the system size, the gap to the rest of the spectrum remains O(1). It is clear that, by sampling configurations x, 143 from either |ψ 0 ⟩ or |ψ 1 ⟩, we can achieve optimal autocorrelation times in the large β limit, as the states |ψ L ⟩, |ψ R ⟩ are sampled with equal probability.
At intermediate temperatures, the probability distribution obtained via the eigenstate projection is generally different from the classical Boltzmann distribution ρ(x; β) = e −βH1(x) one aims to achieve. For this reason, a standard Metropolis acceptance step needs to be performed.
One can define a valid Markov chain out of this physical intuition, rooted in (1) the preparation of a localized state, (2) a quantum propagation to prepare a linear combination of the low-energy eigenstates of H, and (3) a measurement to collapse into a new bit-string state. In Ref. 138 it is proposed to use a QPE subroutine to prepare such low-energy eigenstates. Layden et al. 139 simplify the idea and use a Hamiltonian simulation subroutine, e −iHt , with randomized t and Γ values at each step. Crucially, they observe that such a quantum proposal update is symmetric, thus enabling a fast and practical evaluation of the acceptance step. The algorithm is sketched in Fig. 3.
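The three steps above can be emulated classically on a small system. The sketch below is our own illustration of the scheme just described (classical Ising chain, randomized t and Γ, measurement as wavefunction collapse, classical acceptance); the parameter ranges and helper names are illustrative assumptions, and exact diagonalization stands in for the hardware dynamics.

```python
import numpy as np

def classical_energy(x, L):
    """H1(x): 1D ferromagnetic Ising energy of the bit-string x (periodic)."""
    s = np.array([1 - 2 * ((x >> i) & 1) for i in range(L)])
    return -np.sum(s * np.roll(s, 1))

def quantum_proposal_step(x, L, beta, rng):
    """One Metropolis step with a quantum proposal: start from |x>, evolve
    under e^{-iHt} with H = H1 + Gamma * sum_i sigma^x_i and randomized
    (t, Gamma), measure (collapse) to get x', then accept/reject classically."""
    dim = 2 ** L
    gamma = rng.uniform(0.25, 0.6)       # illustrative ranges for Gamma and t
    t = rng.uniform(2.0, 20.0)
    # dense H: classical diagonal plus transverse-field off-diagonal terms
    H = np.diag([float(classical_energy(z, L)) for z in range(dim)])
    rows = np.arange(dim)
    for i in range(L):
        H[rows, rows ^ (1 << i)] += gamma    # sigma^x_i flips bit i
    evals, vecs = np.linalg.eigh(H)
    psi0 = np.zeros(dim, dtype=complex)
    psi0[x] = 1.0
    psi_t = vecs @ (np.exp(-1j * evals * t) * (vecs.conj().T @ psi0))
    p = np.abs(psi_t) ** 2
    x_new = rng.choice(dim, p=p / p.sum())   # wavefunction collapse
    dE = classical_energy(x_new, L) - classical_energy(x, L)
    return x_new if rng.random() < np.exp(-beta * dE) else x

rng = np.random.default_rng(0)
x = 0
for _ in range(5):
    x = quantum_proposal_step(x, 4, 1.0, rng)
```

Since H is real symmetric, |⟨x'|e^{-iHt}|x⟩|² = |⟨x|e^{-iHt}|x'⟩|², so the proposal is symmetric as stated in the text. The Hamiltonian is rebuilt and diagonalized at every step purely for clarity; this is exponentially expensive classically, and on quantum hardware the evolution e^{-iHt} is precisely the part delegated to the device.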
A superquadratic speed-up for spin glass instances is observed, i.e., a polynomial speed-up of order 3.6 compared with the best possible classical update strategy. Interestingly, the procedure is demonstrated on hardware, where it is found that hardware errors only impact the efficiency of the chain while the sampling remains unbiased, precisely because a classical acceptance step follows the noisy quantum proposal move.
This quantum-enhanced Markov chain MC needs a quantum dynamics subroutine. This, in turn, can be implemented in the fault-tolerant regime, in the NISQ regime, and on analog simulators, subject to their architectural constraints. Clearly, to reach a possible quantum advantage one needs to deal with the fact that, as explained in Sect. II, quantum gate times are much slower than classical CPU clock cycles, and this may cancel a scaling advantage for reasonable system sizes. In particular, lattice MC simulations can also be executed on special-purpose classical hardware, such as FPGAs, as demonstrated in Ref. 144 , which may enjoy an even faster logical clock speed. Therefore, more work will be needed to assess whether this idea can bring a real benefit in this application space.

C. Quantum walks and quantum Metropolis algorithms
For the sake of completeness and clarity, it is important to mention here another family of algorithms, known as quantum walks, which share the same objective as the quantum-enhanced Markov chain algorithms described earlier: accelerating the convergence of classical Markov chains. While the justification for the quantum-enhanced Markov chain algorithm of Sect. IX B is based on physical intuition 138 , and the potential gains are assessed heuristically, quantum walks come with a more rigorous guarantee: if they can be implemented, they provide a quadratic speed-up in autocorrelation times. This quadratic speed-up is related to concepts such as Amplitude Amplification or Grover's algorithm. 145 There are practical and conceptual difficulties that limit the design of quantized Markov chains, particularly in the acceptance step. In classical systems, we can always save the current configuration, previously denoted as x, and reuse it if the trial move leading to x ′ is not accepted. However, in quantum computing, the no-cloning theorem prohibits the direct copying of a quantum state. 146 Furthermore, the acceptance step also involves arithmetic operations that are computationally more demanding in the quantum context. It is easy to imagine that a unitary "walk" operator must include rotations by an arbitrary angle (Eq. 19), where ∆ is an energy difference. 147 Now, it is important to note that arithmetic is much more expensive in quantum computing because it must be reversible. For instance, the most efficient way to perform addition is still a subject of ongoing research. 148 The most commonly used definition of a quantum walk originates from Szegedy. 149 For the sake of brevity, we refer the reader to the comprehensive review 150 or to Ref. 147 , where Szegedy's algorithm is revisited from a more practical perspective, for details. In short, for any classical Markov chain defined by the transition matrix P (x, x ′ ) (see Sect. IX A), a quantum walk represents a quantized version of it that offers a quadratic speed-up in mixing time. Formally, it enhances the gap from δ to √ δ, such that the mixing time decreases quadratically from O(1/δ) to O(1/ √ δ). Szegedy's walk circumvents the no-cloning constraint using two copies of the graph (or lattice), and postulates the existence of a unitary walk operator W . A practical implementation of the walk operator W would require digital rotations by angles such as Eq. 19, which are in turn evaluated by a sequence of costly arithmetic operators. Lemieux et al. 147 analyze the cost of quantum walks, showing that the quadratic speed-up they can offer is overshadowed by the cost of implementing the W operator, even assuming optimistic estimates for the gate times of fault-tolerant hardware.
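The δ → √δ statement is easy to make concrete numerically. The sketch below, a lazy random walk on a ring chosen as our own illustrative example, computes the spectral gap of a stochastic matrix and compares the classical O(1/δ) mixing-time scale with the O(1/√δ) scaling of the quantized walk; it illustrates only the scaling relation, not the walk operator W itself.

```python
import numpy as np

def spectral_gap(P):
    """Gap delta = 1 - |lambda_2| of a stochastic matrix P (rows sum to one)."""
    lam = np.sort(np.abs(np.linalg.eigvals(P)))[::-1]
    return 1.0 - lam[1]

# lazy random walk on a ring of n sites: stay put with probability 1/2,
# which removes the spurious eigenvalue -1 of the bipartite ring
n = 64
P = 0.5 * np.eye(n)
for i in range(n):
    P[i, (i - 1) % n] += 0.25
    P[i, (i + 1) % n] += 0.25

delta = spectral_gap(P)
t_classical = 1.0 / delta            # classical mixing-time scale, O(1/delta)
t_szegedy = 1.0 / np.sqrt(delta)     # quantized-walk scale, O(1/sqrt(delta))
```

For slowly mixing chains (δ ≪ 1, as here, where δ ≈ 2(π/n)² for large n), the quadratic improvement is substantial; the point of Ref. 147 is that the constant-factor cost of realizing W can nonetheless dominate.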
The quantum-enhanced method of Sec.IX B completely avoids performing the acceptance step on the quantum hardware, which is the main reason for its hardware feasibility.
We note that it is increasingly easy to get confused with the names of the methods and combinations of words such as "quantum", "Monte Carlo", and "Metropolis". In the traditional literature, as well as in this Perspective article, "quantum Monte Carlo" refers to the family of Monte Carlo algorithms that are executed on conventional computers but aim to solve many-body quantum problems. However, there is a community in quantum computing for which this combination of words indicates a Monte Carlo algorithm executed on quantum hardware, including Montanaro's algorithm for computing expectation values of multidimensional integrals using quantum amplitude estimation 151 . This type of algorithm finds applications in finance 34 and, while interesting, is not discussed in this manuscript.
Finally, we also note that a quantum computing-based method to speed up a Monte Carlo simulation could, in principle, be used to accelerate a QMC algorithm as well. In this case, the leap would be twofold: using quantum hardware to accelerate a classical algorithm, such as a path-integral MC, which in turn simulates quantum Hamiltonians.
Two philosophically more interesting algorithms can be mentioned for this purpose. Temme et al. 152 proposed a "quantum Metropolis algorithm" for studying quantum many-body Hamiltonians. This method suggests performing a walk in the eigenstates of the quantum Hamiltonian, thus overcoming the sign problem. 153 The algorithm includes performing and undoing QPE, an ancilla register that stores the energy, and the ability to perform measurements on it that reveal only one bit of information, enabling the acceptance step despite the no-cloning principle.
Yung and Aspuru-Guzik 154 chose a different strategy for their "quantum-quantum Metropolis algorithm", finding a way to extend Szegedy's walk to quantum Hamiltonians. The runtime performance and scaling of such approaches are still to be assessed.

D. Sampling with quantum annealers or simulators
The algorithms presented in Sect. IX C require a fault-tolerant computer. However, it cannot be ruled out that an advantage in the sampling problem could come from hardware at the opposite end of the spectrum, namely noisy quantum simulators or quantum annealers. 19 First of all, let us observe that the quantum-enhanced Markov chain method of Sect. IX B can be implemented not only using Trotterization but also through real-time dynamics in a quantum simulator. 139 Furthermore, optimization and sampling tasks are closely connected. 155 Special-purpose quantum simulators, called quantum annealers, have been built with the aim of optimizing large-scale spin-glass problems, but they have also been reconsidered as thermal samplers 156 with some specific applications in machine learning. 157,158 The possibility of using a quantum annealer as a sampler arises from its deviation from adiabaticity. The existence of vanishing gaps during annealing implies that, at the end of the experiment, the wave function does not localize in the classical minimum of the cost function but remains delocalized, producing a distribution of read-outs. The presence of hardware noise amplifies this effect even further.
This residual distribution could resemble a thermal Boltzmann distribution of some classical Hamiltonian, close to the problem Hamiltonian originally meant to be optimized, at some effective temperature that is difficult to determine. 156,159 However, given all the possible hardware and calibration errors, it is unlikely that this approach can generate unbiased samples from a target distribution.
Recently, Ghamari et al. 160 proposed the use of this annealing process as an importance sampler. Similarly to the quantum-enhanced Markov chain method, detailed balance is restored using a classical acceptance step. In this case as well, a control parameter is the annealing runtime, which generates a more-or-less-localized final distribution.
Finally, Wild et al. 161 proposed an adiabatic state preparation of Gibbs states that can also bring quantum speed-up over classical Markov chain algorithms and could be implemented on NISQ Rydberg-atom devices.

X. CONCLUSIONS
In this Perspective, we investigate many intersections between quantum algorithms and Monte Carlo methods. We begin with a brief review of quantum computing applications for many-body quantum physics. We outline the consensus that is emerging after these years in which quantum computing has become mainstream. With the availability of quantum computers with ∼ 100 qubits and the ability to implement gate sets, albeit noisy 13 , the field is taking on a more practical connotation beyond the traditional boundaries of quantum information theory.
We observe that different hardware platforms imply different gate frequencies, which must be taken into consideration in the perspective of achieving quantum advantage. Quantum advantage for high-accuracy many-body ground state calculations is likely to be deferred to the fault-tolerant era due to the existence of hardware noise in today's machines and the existence of highly developed classical competitors. 89 Although quantum advantage through variational methods remains possible, especially in systems where classical methods struggle, 74 here we face the additional challenge posed by quantum measurement shot noise.
We then list several points of contact between quantum and classical variational methods. First, we explain the difference between the statistical noise present in conventional QMC algorithms and the noise arising from quantum measurements. Classical QMC methods feature an energy estimator - the local energy - that enjoys the zero-variance property. This, along with stable optimizers, 76,162 enables the optimization of wave functions featuring thousands of parameters. This is not currently possible in variational quantum computing. Even with access to an exact ground state preparation circuit, obtaining the energy with sufficient precision requires a costly number of circuit repetitions. 8 It is clear that this problem arises even before the circuit optimization stage. In the current literature, this aspect is often overlooked, as several new algorithms or circuits are tested without realistic shot-noise conditions. 67 We suggest that finding the quantum equivalent of the local energy should be one priority in variational algorithm development. 103-105 Other areas where we expect to see cross-fertilization between the quantum and classical worlds include the development of variational forms: classically-inspired circuits for VQE, quantum-inspired ansatze for VMC. Several essential ingredients for variational real-time evolution and parameter optimization under noisy conditions have already been put forward in the VMC community and will be instrumental for their quantum counterparts.
Finally, after discussing how knowledge of QMC methods can provide new momentum to the development of quantum algorithms, we take the opposite direction, showing that quantum hardware can bring advantages to Monte Carlo itself. In this space, quantum walks have been present in the literature for several years and achieve a quadratic speed-up in autocorrelation times through the quantization of a classical Markov chain. 149 Their scaling is discussed, as is typical in quantum information, using an oracular form, which assumes the existence of key subroutines without delving into the details of their concrete gate-level implementation. Recently, it has been shown that these oracles require fairly long circuits. In their necessarily fault-tolerant implementation, this implies absolute runtimes that are still slower than classical Monte Carlo, even for large-scale classical-spin simulations. 147 A more hardware-friendly possibility is represented by a family of methods that use a quantum computer as an importance sampler, or that perform only the proposal part of a Metropolis update on the quantum hardware. 138,139,160 Physically speaking, one simply leverages the fact that quantum measurements are uncorrelated, making them an efficient engine for sampling. In this case, shot noise is no longer a limitation but rather becomes the computational resource for quantum advantage. 138 Overall, the purpose of this Perspective is to further connect two communities: the quantum algorithms and the Monte Carlo ones. As mentioned, many methods developed in QMC can be repurposed in quantum computing. On the other hand, QMC can be a formidable competitor that can hinder or delay quantum advantage. This is true for quantum chemistry applications, but also for optimization and beyond. For instance, QMC can reproduce the scaling of quantum annealing machines, for classical optimization purposes, under certain conditions.
142,163,164 However, the two communities can be complementary, and we hope that new impactful algorithms, either quantum or classical, will emerge thanks to this interaction to solve important problems in chemistry and condensed matter.
principle. The unitary operator defining the HV ansatz is made of d blocks, and each block is a product of ℓ operators Ûj = exp(iθ k j Ĥj ), with j = 1, . . ., ℓ indexing the non-commuting terms of the Hamiltonian. For the transverse field Ising model we only need ℓ = 2, cf. Eq. 17. In this case, the full unitary operator is ÛHV (θ) := ∏ d i=1 Û2 (θ i 2 ) Û1 (θ i 1 ), (A1) which can be efficiently decomposed using one- and two-qubit quantum gates, and the final parameterized state is |ψ(θ)⟩ = ÛHV (θ) |+⟩ ⊗L , where the initial non-entangled state |+⟩ ⊗L can be obtained from |0⟩ ⊗L by placing one Hadamard gate on each qubit. The total number of parameters is ℓd. In our numerical experiment, we use a state-vector emulation of the operator, based on linear algebra operations, and we do not compile our operator into a real circuit. Parameters are optimized using the COBYLA and BFGS optimizers. These results are compatible with Ref. 74 .
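A state-vector emulation in the spirit described above can be sketched as follows; this is our own illustration, not the code used for the figures. The transverse-field coefficient is absorbed into the variational angles, and the function names are illustrative choices.

```python
import numpy as np

def hv_ansatz_state(thetas1, thetas2, L):
    """State-vector emulation of the Hamiltonian-variational ansatz for the
    transverse-field Ising chain: each block applies U1 = exp(i theta1 H1),
    diagonal in the z basis, then U2 = exp(i theta2 sum_i sigma^x_i), a
    product of identical one-qubit rotations."""
    dim = 2 ** L

    def spins(x):
        return np.array([1 - 2 * ((x >> i) & 1) for i in range(L)])

    # diagonal of H1 = -sum_i s^z_i s^z_{i+1} (periodic boundaries)
    h1 = np.array([-np.sum(spins(x) * np.roll(spins(x), 1)) for x in range(dim)])

    def u2(theta):
        g = np.array([[np.cos(theta), 1j * np.sin(theta)],
                      [1j * np.sin(theta), np.cos(theta)]])  # exp(i theta sigma^x)
        U = g
        for _ in range(L - 1):
            U = np.kron(U, g)
        return U

    psi = np.full(dim, 1.0 / np.sqrt(dim), dtype=complex)  # |+> on every qubit
    for t1, t2 in zip(thetas1, thetas2):
        psi = np.exp(1j * t1 * h1) * psi   # U1 block: pure diagonal phase
        psi = u2(t2) @ psi                 # U2 block: transverse-field rotation
    return psi

psi = hv_ansatz_state([0.1, 0.2], [0.3, 0.4], 3)   # d = 2 blocks, L = 3 qubits
```

Since Û1 is diagonal, it costs O(2^L) elementwise phases rather than a matrix product, which is the standard trick for emulating this ansatz with plain linear algebra.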
Once the optimized state is obtained, we emulate the noisy energy estimation using the Pauli measurement method. We sample M 1 spin configurations from |ψ(θ)| 2 in the computational basis, and M 2 = M 1 in the rotated basis, obtained using the H ⊗ H ⊗ • • • ⊗ H operator (cf. 50). The total cost of the energy evaluation is therefore 2M j , where the M j = M 1 = M 2 values are reported in the text.
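The two-basis estimation just described can be sketched for a transverse-field Ising chain H = −Σ σ^z_i σ^z_{i+1} − Γ Σ σ^x_i as follows. This is an illustrative reconstruction with our own helper names: sampling from the state vector replaces real measurements, ZZ terms are estimated from z-basis samples, and X terms from samples taken after a Hadamard on every qubit.

```python
import numpy as np

def estimate_tfim_energy(psi, L, gamma, M, rng):
    """Shot-noise estimate of <H> for H = -sum_i s^z_i s^z_{i+1} - gamma sum_i s^x_i,
    using M samples in the z basis (ZZ terms) and M in the x basis (X terms)."""
    def spins(x):
        return np.array([1 - 2 * ((x >> i) & 1) for i in range(L)])

    probs_z = np.abs(psi) ** 2
    probs_z = probs_z / probs_z.sum()
    # Hadamard on every qubit rotates measurements into the x basis
    Hd = np.array([[1.0, 1.0], [1.0, -1.0]]) / np.sqrt(2.0)
    U = Hd
    for _ in range(L - 1):
        U = np.kron(U, Hd)
    probs_x = np.abs(U @ psi) ** 2
    probs_x = probs_x / probs_x.sum()

    samples_z = rng.choice(2 ** L, size=M, p=probs_z)
    samples_x = rng.choice(2 ** L, size=M, p=probs_x)
    zz = np.mean([-np.sum(spins(x) * np.roll(spins(x), 1)) for x in samples_z])
    xx = np.mean([-gamma * np.sum(spins(x)) for x in samples_x])
    return zz + xx   # unbiased, but with finite statistical variance

L, gamma, M = 4, 1.0, 2000
psi_plus = np.full(2 ** L, 1 / np.sqrt(2 ** L))   # |+>^L as a simple test state
rng = np.random.default_rng(0)
e = estimate_tfim_energy(psi_plus, L, gamma, M, rng)
```

For |+⟩^⊗L the X terms are measured with zero variance (the x-basis outcome is deterministic), while the ZZ estimate fluctuates around zero; this cleanly separates the two noise sources in the estimator.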
Classical ansatz. For the case of VMC we use a long-range Jastrow trial state parametrized by coefficients λ. This trial state yields a variational energy within ∼ 0.1% of the exact ground-state energy E 0 . To generate ansatze of different quality, and thus showcase the zero-variance property of the local energy, we simply act on the first parameter λ 1 , pulling it away from its optimal value towards smaller values (this is done to create a more challenging, delocalized state), while keeping the others fixed. When λ 1 = −0.15, the variational energy is degraded up to a 10% systematic error.
We extract M VMC configurations in the computational basis, since the local energy, Eq. 16, only requires this basis. To calculate the standard deviation of the estimator, we simply repeat the numerical experiment 100 times for each ansatz and each value of M.
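For the same transverse-field Ising Hamiltonian, the local energy reduces to a diagonal term plus a sum over single-spin-flip amplitude ratios. The sketch below is our own illustration, with log_psi a user-supplied log-amplitude; it makes the zero-variance property explicit: if log_psi is the exact ground-state amplitude, E_loc(x) equals E0 for every configuration x, so the estimator has zero statistical variance.

```python
import numpy as np

def local_energy(x, L, gamma, log_psi):
    """E_loc(x) = <x|H|psi>/<x|psi> for H = -sum_i s^z_i s^z_{i+1} - gamma sum_i s^x_i.
    Only z-basis configurations are needed: sigma^x connects x to the
    single-spin-flipped strings x ^ (1 << i)."""
    s = np.array([1 - 2 * ((x >> i) & 1) for i in range(L)])
    diagonal = -np.sum(s * np.roll(s, 1))          # classical Ising part
    ratios = sum(np.exp(log_psi(x ^ (1 << i)) - log_psi(x)) for i in range(L))
    return diagonal - gamma * ratios
```

As a self-contained check, one can diagonalize the L = 2 chain exactly (periodic boundaries, so the single bond counts twice) and verify that the local energy is constant across all four bit-strings.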

FIG. 2: Standard deviation of the energy estimator using a quantum circuit and Pauli measurements, Eq. 5 (circles), and a simple classical ansatz using the local energy, Eq. 16 and Eq. 13 (diamonds). Different colors indicate different sampling sizes. In the quantum case, the dataset is made of M j wavefunction collapses per basis (of which there are two for the model considered), while in the classical case it is made of M VMC spin configurations, x i , sampled from the trial state |ψ(x i )| 2 . In both cases, we prepare different ansatze, within the same ansatz class, but having different accuracies. For each trial state, we plot the standard deviation of the energy vs. the relative error of its variational energy E var (computed exactly). The zero-variance property only holds in the VMC case, since the statistical error of the quantum energy estimator remains finite even when the trial state approaches the exact limit.

FIG. 3: Schematic depiction of one quantum-enhanced Metropolis step described in Sec. IX B. The system illustrated is a 2D lattice model with $L$ sites and a classical spin-glass energy cost function, $H_1$. In this notation $|x\rangle = |\sigma^z_1, \dots, \sigma^z_L\rangle = |\vec{\sigma}^z\rangle$ is a bit-string basis state of the $2^L$-dimensional Hilbert space. We start from an initial bit-string $|\vec{\sigma}^z_{\mathrm{init}}\rangle$, which undergoes unitary evolution (on digital quantum hardware, this can be implemented by Trotterization) under a full quantum Hamiltonian $H = H_1 + H_2$. At the end of the evolution, the measurement process collapses the time-evolved state $|\psi\rangle$ into a single bit-string $|\vec{\sigma}^z_{\mathrm{new}}\rangle$. This concludes the proposal step $T(x, x')$. For the transverse-field case, the proposal matrix is symmetric, $T(x, x') = T(x', x)$. Finally, the acceptance step is performed classically, and the new configuration may or may not be accepted.
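The proposal-plus-classical-acceptance structure of this quantum-enhanced Metropolis step can be sketched in a few lines of Python. This is only an illustrative classical mock-up: the function `quantum_proposal` below stands in for the hardware step (time evolution under $H = H_1 + H_2$ followed by measurement) and is replaced here by random spin flips, which are likewise a symmetric proposal; the names `spin_glass_energy`, `quantum_proposal`, and `metropolis_step` are our own and not from the original text.

```python
import numpy as np

rng = np.random.default_rng(0)

def spin_glass_energy(sigma, J):
    """Classical cost function H_1 = -1/2 * sum_ij J_ij s_i s_j, spins in {-1,+1}."""
    return -0.5 * sigma @ J @ sigma

def quantum_proposal(sigma):
    """Stand-in for the quantum step: evolve |sigma> under H = H_1 + H_2 and
    measure in the computational basis.  Mocked here by two random spin flips,
    which is also a symmetric proposal, T(x, x') = T(x', x)."""
    new = sigma.copy()
    new[rng.integers(0, len(sigma), size=2)] *= -1
    return new

def metropolis_step(sigma, J, beta):
    """Classical accept/reject of the measured bit-string (symmetric proposal,
    so the acceptance probability is min(1, exp(-beta * dE)))."""
    proposal = quantum_proposal(sigma)
    dE = spin_glass_energy(proposal, J) - spin_glass_energy(sigma, J)
    if rng.random() < min(1.0, np.exp(-beta * dE)):
        return proposal   # accepted: move to the new configuration
    return sigma          # rejected: keep the old configuration

# Tiny example: L = 6 spins with random symmetric couplings.
L = 6
J = rng.normal(size=(L, L))
J = (J + J.T) / 2.0
np.fill_diagonal(J, 0.0)

sigma = rng.choice([-1, 1], size=L)
for _ in range(100):
    sigma = metropolis_step(sigma, J, beta=1.0)
```

In the actual scheme only the proposal would run on quantum hardware; the energy evaluation and the accept/reject decision remain entirely classical, exactly as in standard Metropolis Monte Carlo.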
Map of the main links between quantum algorithms and Monte Carlo methods contained in this Perspective. Connected green links indicate that a fruitful information flow between the two areas has already been established. Disconnected red links indicate topics that still require more investigation, or where the proposed solutions are not completely satisfactory.