The use of many control variates is proposed as a method to accelerate the second- and third-order Monte Carlo (MC) many-body perturbation (MC-MP2 and MC-MP3) calculations. A control variate is an exactly integrable function that is strongly correlated or anti-correlated with the target function to be integrated by the MC method. Evaluating both integrals and their covariances in the same MC run, one can effect a mutual cancellation of the statistical uncertainties and biases in the MC integrations, thereby accelerating its convergence considerably. Six and thirty-six control variates, whose integrals are known *a priori*, are generated for MC-MP2 and MC-MP3, respectively, by systematically replacing one or more two-electron-integral vertices of certain configurations by zero-valued overlap-integral vertices in their Goldstone diagrams. The variances and covariances of these control variates are computed at a marginal cost, enhancing the overall efficiency of the MC-MP2 and MC-MP3 calculations by a factor of up to 14 and 20, respectively.

## I. INTRODUCTION

There has recently been much effort to develop stochastic algorithms of *ab initio* many-body methods,^{1–26} complementing quantum Monte Carlo (MC).^{27–34} This effort is, in part, driven by the trend in high-performance computing to rely on ever-increasing parallelism in the hardware. Parallelization efforts become more daunting with the rapidly changing hardware landscape shifting to heterogeneous architecture involving both graphical processing units (GPUs) and central processing units (CPUs). Conventional deterministic algorithms tend to involve massive arrays and their dense matrix multiplications, which are hard to parallelize at the scale of modern supercomputers with millions of cores or on mixed GPU–CPU systems. In contrast, stochastic algorithms such as the Monte Carlo (MC) method can be implemented in such a way that each processor performs nearly independent calculation with minimal inter-processor communications and are much easier to parallelize in a scalable manner.

In addition to parallelization, developing algorithms that scale favorably with respect to the system size is equally or more important. The most affordable many-body method to improve upon the Hartree–Fock (HF) method is the second-order many-body perturbation (MP2) method whose cost already scales as the fifth power of the system size if the most conventional deterministic algorithm is employed.^{35,36} Methods capable of attaining benchmark accuracy, such as the fourth-order many-body perturbation (MP4) method and coupled-cluster singles, doubles, and perturbative triples [CCSD(T)],^{37–41} incur costs that grow as the seventh power of the system size. Reducing this high-rank size-dependence of costs (“scaling”) has always been a major goal of *ab initio* many-body method developers. Stochastic algorithms are among few such techniques that can potentially reduce the scaling for general systems by making a trade-off between statistical errors and computational costs.

The MC many-body perturbation (MC-MP)^{2,8–10,13,14} and MC many-body Green’s function (MC-GF)^{5,9,20,22} methods are an example of stochastic algorithms that are scalable with respect to the system size and parallel-executable efficiently on a mixed GPU–CPU platform.^{13} Currently, the MC-MP and MC-GF methods are both implemented through third-order, with basis-set-incompleteness corrections (via the explicitly correlated technique^{42–47}) available to the second-order methods. These methods adopt the Laplace-transform technique of Almlöf^{48} to eliminate the resolvent denominators and then recast the resulting long sum of products of low-dimensional integrals into a short sum of high-dimensional integrals in real space and imaginary time.^{8} These integrals are evaluated by the MC method essentially independently by as many processors as available at nearly perfect parallel scalability. Their operation cost has a one- to two-rank lower polynomial dependence^{13,14,20,22} on the system size than the corresponding deterministic algorithms, and their memory cost is negligible.

Nevertheless, the favorable scalings are tempered by large prefactors on the cost functions, making the crossover point far off along the system-size axis,^{47} whereupon the costs of the MC algorithms become actually less than those of the deterministic algorithms. Reducing this prefactor is, therefore, crucial.

A wide variety of techniques, drawing from different principles, have been developed to optimize the performance of deterministic many-body methods. Examples of such techniques applicable to perturbation theory include integral screening,^{49,50} integral approximations,^{51,52} improvements in integral codes and algorithms,^{35,36,53–61} and various local approximation.^{62–65} A common thread among these approaches is that they attempt to improve performance either by more efficiently evaluating the two-electron repulsion integrals or by reorganizing or altering the high-rank sums defining these methods. The MC-MP and MC-GF methods may be viewed as an example of the latter, as they do away with the explicit evaluation of individual two-electron repulsion integrals. As a result, adapting other existing techniques to accelerate deterministic many-body methods to the MC-MP and MC-GF methods is not trivial, and novel acceleration techniques may need to be proposed.

One such acceleration technique specifically designed for the MC-MP and MC-GF methods is the redundant-walker algorithm.^{6} The integrands for these methods are products of functions of *k* electron pairs, where *k* is the perturbation order. In the redundant-walker algorithm, instead of generating the minimum number (*k*) of electron pairs required to evaluate the integrand, much more electron pairs are produced. Then, all permutations of the electron pairs are used to evaluate the integrand. By carefully choosing the number of electron pairs,^{13} it is possible to shift the hotspot of the MC-MP algorithm in a manner that increases the overall sampling rate and, thereby, accelerates the MC integration.

General acceleration techniques for MC integrations are a topic of broad mathematical interest.^{66–68} One such approach focuses on improving the sampling of the coordinates for the integration. We have developed a direct-sampling algorithm^{69} for the MC-MP methods that directly generates a stochastic distribution for electron pair coordinates as opposed to relying on the Markov sampling technique such as the Metropolis–Hastings algorithm^{70,71} adopted in previous implementations. This direct-sampling algorithm avoids autocorrelation or rejected moves and is inherently more efficient than Markov sampling. This work is reported in Ref. 69.

Here, we explore another powerful acceleration technique for MC integrations, which aims at reducing the statistical uncertainty: control variates. Control variates^{66–68} are functions with exactly known integrals, which are similar to and highly correlated or anti-correlated with the target integrand. By integrating these control variates and the target integrand simultaneously in the same MC run, the statistical uncertainty in the latter can be compressed with the aid of the accurate knowledge of the statistical errors in the former. Recently, Bayne and Chakraborty^{72} proposed several control variates to accelerate the MC integration of two-electron repulsion integrals in the context of stochastic HF calculations, where they used an overlap integral as a control variate.

In this work, we develop an acceleration technique of MC-MP2 and MC-MP3 using many control variates and evaluate their effectiveness. The control variates, whose integrals are known *a priori* to be zero, are generated by replacing one or more of Coulomb operators in the MC-integrable expressions of MP2 or MP3 energy by unity, which amounts to converting one or more two-electron-integral vertices in the corresponding Feynman–Goldstone diagram by zero-valued overlap-integral vertices. Six and thirty-six control variates are formed for MC-MP2 and MC-MP3, respectively, by following a systematic diagrammatic rule of replacements, leading to an efficiency increase of these methods by a factor of up to 14 and 20 in our numerical experiments for molecules ranging in size from water to C_{60} fullerene.

## II. METHODS

The general idea and mathematical basis of control variates are given in the Appendix.

The second- and third-order (MP2 and MP3) energies are expressed by 2 and 12 Goldstone diagrams, respectively. A full list of these diagrams can be found in many standard texts.^{41,73} The rules to translate these diagrams into algebraic formulas involving two-electron repulsion integrals, etc., are also documented in those texts.^{41,73–76} We introduced a slightly different set of rules^{8,22} to convert the same diagrams directly into MC-integrable algebraic formulas^{2,8,13} involving Green’s functions, etc.

In this work, the control variates for the MC-MP methods are directly generated from the diagrams of the parent methods. Diagrams for the control variates are created by systematic modification of each of the Goldstone diagrams of the parent method with a certain topology. The algebraic formula of the control variate is then obtained by applying nearly the same interpretation rules mentioned above. This diagrammatic method for control variates is advantageous since its systematic nature permits the computerized enumeration and implementation of many (6 for MC-MP2 and 36 for MC-MP3) control variates.

This method can be best explained by an example. The following particle–particle ladder diagram in MP3 (for a restricted HF reference of a closed-shell molecule) is algebraically interpreted as follows:

Equation (1) is the MC-integrable formula,^{8} where *G*^{+} and *G*^{−} are the traces of the retarded and advanced HF Green’s functions, respectively, in the real space (**r**) and imaginary time (*τ*) and *r*_{12} is the distance between electrons 1 and 2, and thus, 1/*r*_{12} is the Coulomb potential between them. Equation (2) is the usual algebraic interpretation involving two-electron repulsion integrals, where *i*, *j* (*a*, *b*) denote occupied (virtual) spatial orbitals whose number is *n*_{occ} (*n*_{vir}), *ϵ*_{i} is the *i*th HF orbital energy, and

A control variate whose integral is zero can be obtained by replacing the top filled-circle vertex, representing a two-electron repulsion integral or Coulomb potential, by an open-circle vertex, designating an overlap integral or a unit potential,

The last equality follows from the definition of ⟨*ij*, *ab*⟩,

where *δ*_{ia} is Kronecker’s delta. Hence, the corresponding control variate is the integrand of Eq. (4), which differs from the parent integrand [Eq. (1)] by the absence of the Coulomb potential 1/*r*_{12}. It is convenient because its integral is known *a priori* to be zero. Note that the zero valuedness of the integral originates from the vanishing overlap-integral vertex [Eq. (9)], which, in turn, relies on the fact that the vertex is connected to a hole and particle at either end or at both ends and, therefore, at least one of its half-vertices is V-shaped. Another convenient control variate whose integral is zero is obtained by replacing the bottom vertex of the particle–particle ladder diagram [Eq. (1)] by an open-circle, overlap-integral vertex.

On the other hand, replacing the middle vertex by an open-circle, overlap-integral vertex, we find

which suggests the integrand of Eq. (10) as a control variate. Its integral is also known *a priori*, but the value is nonzero and needs to be determined by evaluating Eq. (12) at a deterministic MP2 cost. Since MC-MP3 is more meaningful when it is combined (or executed simultaneously) with MC-MP2 (not deterministic MP2), we elect to not adopt this control variate.

The general topology of a Goldstone diagram for the MP*n* energy is shown in Fig. 1. The top and bottom vertices, i.e., the “external” vertices of the Goldstone diagram at any perturbation order, always have two half-vertices in a V-shaped configuration. We, therefore, propose generating control variate diagrams by systematically replacing the top or bottom vertex of each of the Goldstone diagrams of the parent MP*n* method by an open-circle, overlap-integral vertex as well as by doing the same for zero through *n* − 1 remaining vertices. The corresponding control variates are obtained simply by deleting the Coulomb potentials from the parent MC-integrable algebraic formula whose integrals are exactly zero. There are, of course, many other control variates conceivable, but this simple algorithm generates as many as 6 control variates for MC-MP2 shown in Fig. 2 and 36 pairs for MC-MP3 compiled in Figs. 3 and 4. Each pair (e.g., diagrams 1.1a and 1.1b) is treated as one control variate, and hence, in a joint MC-MP2/MC-MP3 calculation, the vector of control variates has 42 elements in total. It is advantageous to join correlated control variates together so as to reduce the size of the covariance matrix since its inversion [Eq. (A12)] is costly and numerically less stable if the matrix is too large.

## III. RESULTS

### A. Computational details

The test calculations were performed on the Blue Waters supercomputer at the National Center for Supercomputing Applications at the University of Illinois at Urbana-Champaign. Its Cray XE nodes have two 8-core AMD 6276 CPUs and 64 GB of RAM with a peak performance of 313 GFLOPS.

The use of control variates accelerates MC simulations by reducing the uncertainty in the integration. This comes at the penalty of increasing the computational cost for each MC step. In order to account for both factors, the MC efficiency defined by^{66–68}

is measured, where *T* is the time per MC step, *σ*^{2} is the variance, and “ref” and “alt” stand for the reference and alternate MC integration methods. Here, the reference and alternate methods are the MC-MP*n* (*n* = 2, 3) methods without and with control variates, respectively. Equation (13) may alternatively be written by substituting the variances, *σ*^{2}, with the uncertainties squared, *s*^{2}, provided that the uncertainties were constructed with the same number of samples. The latter form of the equation is adopted here.

Calculations on five molecules, water (*r*_{OH} = 0.9573 Å, *r*_{HH} = 1.5140 Å), methane (*r*_{CH} = 1.0848 Å), ammonia (*r*_{NH} = 1.0123 Å, *r*_{HH} = 1.6011 Å), benzene (*r*_{CC} = 1.397 Å, *r*_{CH} = 1.0869 Å), and C_{60} fullerene (*r*_{CC} = 1.4445 Å, *r*_{CC} = 1.139 45 Å), were performed. The MC-MP2 and MC-MP3 calculations were accelerated by the redundant-walker algorithm,^{6} using 64 electron pairs for the water, ammonia, methane, and benzene simulations and 512 electron pairs for the C_{60} calculation. The sequence of electron pair coordinates was produced by direct sampling,^{69} and the integration of the imaginary-time coordinates was performed by the MC method.^{22} All calculations were carried out using the cc-pVDZ basis set^{77} in the frozen-core approximation.

The factors entering the MC efficiency [Eq. (13)] were measured separately. The average time per MC step was determined by short 16 384-step-per-CPU-core calculations on a single XE node using all of its processors. All uncertainties and energies for each molecule were obtained from a single joint MC-MP2/MC-MP3 calculation averaging 3 481 600 MC steps, except for the simulation for C_{60}, which averaged 128 028 672 steps.

### B. Performance analysis

Table I examines the reduction in uncertainty by the control variates. It is immediately apparent that the uncertainties are compressed by the control variates in all cases considered often significantly. Furthermore, with the control variates, the MC-MP*n* estimates of the energies are systematically closer to the corresponding deterministic benchmarks (when available). This is due entirely to their smaller uncertainties because the MC-MP*n* methods are expected to have no bias.

Method . | Molecule . | E
. | E_{MC}
. | s_{MC}
. | E_{CV}
. | s_{CV}
. | $sMC2/sCV2$ . | T_{MC}
. | T_{CV}
. | T_{CV}/T_{MC}
. | η^{a}
. |
---|---|---|---|---|---|---|---|---|---|---|---|

MP2 | Water | −0.201 63 | −0.2015 | 0.000 32 | −0.2017 | 0.000 14 | 5.1 | 0.518 | 0.538 | 1.04 | 4.9 |

MP2 | Ammonia | −0.186 64 | −0.1865 | 0.000 40 | −0.1866 | 0.000 15 | 7.4 | 0.599 | 0.621 | 1.04 | 7.2 |

MP2 | Methane | −0.160 99 | −0.1606 | 0.000 45 | −0.1609 | 0.000 14 | 10 | 0.686 | 0.710 | 1.04 | 9.9 |

MP2 | Benzene | −0.783 94 | −0.779 | 0.007 3 | −0.782 | 0.001 9 | 14 | 2.34 | 2.39 | 1.02 | 14 |

MP2 | C_{60} | N/A | −7.73 | 0.015 | −7.716 | 0.004 64 | 11 | 433 | 433 | 1.00 | 11 |

MP3 | Water | −0.007 00 | −0.0067 | 0.000 54 | −0.0070 | 0.000 13 | 18 | 3.14 | 4.67 | 1.49 | 12 |

MP3 | Ammonia | −0.013 67 | −0.0144 | 0.000 89 | −0.0138 | 0.000 20 | 20 | 3.30 | 4.78 | 1.45 | 14 |

MP3 | Methane | −0.020 24 | −0.020 | 0.001 1 | −0.0206 | 0.000 25 | 10 | 3.44 | 4.83 | 1.41 | 14 |

MP3 | Benzene | −0.033 96 | 0.04 | 0.052 | −0.02 | 0.011 | 21 | 5.31 | 6.65 | 1.25 | 17 |

MP3 | C_{60} | N/A | 0.4 | 0.32 | 0.23 | 0.060 | 28 | 1370 | 1900 | 1.39 | 20 |

Method . | Molecule . | E
. | E_{MC}
. | s_{MC}
. | E_{CV}
. | s_{CV}
. | $sMC2/sCV2$ . | T_{MC}
. | T_{CV}
. | T_{CV}/T_{MC}
. | η^{a}
. |
---|---|---|---|---|---|---|---|---|---|---|---|

MP2 | Water | −0.201 63 | −0.2015 | 0.000 32 | −0.2017 | 0.000 14 | 5.1 | 0.518 | 0.538 | 1.04 | 4.9 |

MP2 | Ammonia | −0.186 64 | −0.1865 | 0.000 40 | −0.1866 | 0.000 15 | 7.4 | 0.599 | 0.621 | 1.04 | 7.2 |

MP2 | Methane | −0.160 99 | −0.1606 | 0.000 45 | −0.1609 | 0.000 14 | 10 | 0.686 | 0.710 | 1.04 | 9.9 |

MP2 | Benzene | −0.783 94 | −0.779 | 0.007 3 | −0.782 | 0.001 9 | 14 | 2.34 | 2.39 | 1.02 | 14 |

MP2 | C_{60} | N/A | −7.73 | 0.015 | −7.716 | 0.004 64 | 11 | 433 | 433 | 1.00 | 11 |

MP3 | Water | −0.007 00 | −0.0067 | 0.000 54 | −0.0070 | 0.000 13 | 18 | 3.14 | 4.67 | 1.49 | 12 |

MP3 | Ammonia | −0.013 67 | −0.0144 | 0.000 89 | −0.0138 | 0.000 20 | 20 | 3.30 | 4.78 | 1.45 | 14 |

MP3 | Methane | −0.020 24 | −0.020 | 0.001 1 | −0.0206 | 0.000 25 | 10 | 3.44 | 4.83 | 1.41 | 14 |

MP3 | Benzene | −0.033 96 | 0.04 | 0.052 | −0.02 | 0.011 | 21 | 5.31 | 6.65 | 1.25 | 17 |

MP3 | C_{60} | N/A | 0.4 | 0.32 | 0.23 | 0.060 | 28 | 1370 | 1900 | 1.39 | 20 |

^{a}

The efficiency boost $\eta =(sMC2/sCV2)/(TCV/TMC)$ according to Eq. (13).

The column of “$sMC2/sCV2$” in Table I gives the square of the ratio of the uncertainties. Since the uncertainty falls off as *N*^{−1/2} with the number (*N*) of MC steps, the control variate reduces the number of MC steps to reach the same uncertainty by the same squared ratio, which ranges from 5.1 to 14 for MC-MP2 and from 10 to 28 for MC-MP3.

The squared ratios and hence the efficacy of the control variates seem to correlate roughly with the molecular size. Comparing the target integrand [e.g., the integrand of Eq. (1)] and one of its control variates [the integrand of Eq. (4)], it may be inferred that they are more strongly correlated when $r12\u22121$ is a slowly varying function, namely, when *r*_{12} on average is large. Figure 5 plots the distributions of *r*_{12} for the five molecules studied, confirming that the larger the molecule, the greater the average interelectronic distance, which may be at least partly why the control-variate method works particularly well for C_{60}.

Note that on a molecule-by-molecule basis, the reduction in the uncertainty by the control variates is greater in MC-MP3 than in MC-MP2. This is likely attributed to the increased number of control variates in MC-MP3, which thus have a greater opportunity to capture information about the error in the integration.

The additional time required to integrate the control variates and compute their covariances is now examined. The third and fourth columns from the last in Table I compare the average time per MC step in MC-MP2 or MC-MP3 calculations that do not (*T*_{MC}) and do (*T*_{CV}) use the control variates. The penultimate column gives the ratio of the two preceding columns, indicating that the control-variate method incurs a negligible additional cost in MC-MP2 and only a modest increase in cost by no greater than 50% in MC-MP3.

A detailed cost analysis of the MC-MP algorithm is given elsewhere.^{13} Briefly, the algorithmic bottleneck changes depending on the system size (*n*), the number of electron pairs (*m*), and the rank of theory (*k*). It is either the atomic-orbital-to-molecular-orbital transformation, the construction of the traces of the HF Green’s functions, or the integrand evaluations. These steps scale as *O*(*n*^{2}*m*), *O*(*nm*^{2}), and *O*(*m*^{k}), respectively. The additional computation related to the control variates occurs entirely within the integrand-evaluation step and affects only the prefactor of the cost function, but not its scaling.

At the MC-MP2 level of theory, with the parameters used here, the integrand evaluation is far from the algorithmic hotspot. As a consequence, the additional work required by the control variates has nearly no impact on the overall time per MC step, with the ratio *T*_{CV}/*T*_{MC} being near unity. At the MC-MP3 level of theory, the algorithmic hotspot (with a sufficiently large number of electron pairs) partially shifts to the integrand evaluation, making the effect of using the control variates felt in the average time per MC step.

The last column of Table I combines the rate of compression of the simulation length in terms of the number of MC steps and the rate of increase in the time per MC step to compute the overall efficiency boost (*η*) brought to by the control-variate method [Eq. (13)]. The acceleration provided by this method for MC-MP2 ranges from 4.9 to 14. Similarly, the MC-MP3 calculations were sped-up by a factor ranging from 12 to 20. The use of control variates seems to afford a greater degree of acceleration to more difficult calculations (i.e., for larger molecules and at higher perturbation orders), which is desirable.

## IV. CONCLUSIONS

Convergence acceleration of the MC-MP2 and MC-MP3 methods by employing multiple control variates has been proposed and evaluated. The control variates are the functions with known integrals that are strongly correlated or anti-correlated with the integrands of the correlation energies. The integration of these control variates and the evaluation of their covariances incur little additional work yet compress the statistical uncertainty in the MC estimates of the correlation energies for small and large molecules alike often by an order of magnitude. As a result, accelerations by a factor of up to 14 and 20 have been observed for MC-MP2 and MC-MP3, respectively. The method has been found to perform better for more challenging problems (such as for larger molecules and at higher perturbation orders).

The method should be straightforwardly extensible to high-order MC-MP methods^{78} and MC-GF methods.^{5,22} In contrast, the Monte Carlo explicitly correlated MP2 and GF2 (MC-MP2-F12 and MC-GF2-F12) methods,^{10,14,20} which compute basis-set-incompleteness corrections stochastically, are not formulated diagrammatically and may need a different strategy for generating control variates. Nevertheless, the redundant-walker,^{6} direct-sampling,^{69} and control-variate algorithms together constitute a powerful and indispensable triad for the entire family of the MC-MP and MC-GF methods, complementing their unrivaled parallelism that can effectively harness thousands of CPUs and hundreds of GPUs.^{13}

## DATA AVAILABILITY

The data that support the findings of this study are available within the article.

## ACKNOWLEDGMENTS

This work was supported by the Center for Scalable, Predictive methods for Excitation and Correlated phenomena (SPEC), which is funded by the U.S. Department of Energy, Office of Science, Office of Basic Energy Sciences, Chemical Sciences, Geosciences, and Biosciences Division, as a part of the Computational Chemical Sciences Program and also by the U.S. Department of Energy, Office of Science, Office of Basic Energy Sciences under Grant No. DE-SC0006028. It is also part of the Blue Waters sustained-petascale computing project, which is supported by the National Science Foundation (Award Nos. OCI-0725070 and ACI-1238993) and the State of Illinois. Blue Waters is a joint effort of the University of Illinois at Urbana-Champaign and its National Center for Supercomputing Applications.

### APPENDIX: CONTROL VARIATES

The use of control variates^{66–68} is a well-known technique to dramatically reduce the variance in an MC integration. Suppose a problem where the integral of *f*(*x*) under the probability density function, *ρ*(*x*), is evaluated by the MC method,

where *μ*_{f} is the target integral, $\mu f[N]$ is its MC estimate after *N* MC steps. The variance of *f*(*x*) under *ρ*(*x*) is

Let there be a second related function, *g*(*x*), whose integral (*μ*_{g}) under *ρ*(*x*) is known *a priori* and exactly. Then, *g*(*x*) may be used as a control variate, giving a new MC estimate, $\mu \u0303f$, as

where *α* is an adjustable parameter. The variance of $\mu \u0303f$ is

where $\sigma g2$ is the variance of *g*(*x*) under *ρ*(*x*) and **K**_{f,g} is the covariance between *f*(*x*) and *g*(*x*) under *ρ*(*x*), namely,

The adjustable parameter *α* is determined by minimizing $\sigma \u0303f2$, which implies

and

where *ρ*_{f,g} = **K**_{f,g}/(*σ*_{f}*σ*_{g}) is identified as the Pearson correlation coefficient between *f*(*x*) and *g*(*x*). Clearly, $\sigma \u0303f2\u2264\sigma f2$ and the convergence of the MC integration of *f*(*x*) is accelerated by the control variate, *g*(*x*). The more strongly correlated or anti-correlated *f*(*x*) and *g*(*x*), the greater the degree of compression of the variance of the MC integral. In the limit where *f*(*x*) and *g*(*x*) are perfectly correlated or anti-correlated (*ρ*_{f,g} = ±1), the variance can be made to vanish.

When multiple control variates are used, the MC integral of *f*(*x*) under *ρ*(*x*) is evaluated as

where $\alpha \u2192$ is now a vector of adjustable parameters, $g\u2192(xn)$ is the vector of the control variate functions evaluated at *x*_{n}, and $\mu \u2192g$ is the vector of the exact integrals of the control variate functions. The variance of $\mu \u0303f$ is related to that of *μ*_{f} by

where $Kf,g\u2192$ is the vector of the covariances between *f*(*x*) and $g\u2192(x)$. The values of the adjustable parameters are again determined so as to minimize $\sigma \u0303f2$. They are then the root of the set of linear equations,

where $Kg\u2192,g\u2192$ is the covariance matrix of the control variates, which implies

Since the covariance matrix $Kg\u2192,g\u2192$ is positive semi-definite, we have $\sigma \u0303f\u2264\sigma f$. Generally, the greater the number of the control variates, the more strongly they are correlated and/or anti-correlated with *f*(*x*), making $\sigma \u0303f$ smaller. It is, therefore, advantageous to have many control variates insofar as the evaluation of *g*(*x*) is not overly burdensome so as to prevent the compression of the variance from being overshadowed by additional computational costs.