Published by De Gruyter February 5, 2020

Why simple quadrature is just as good as Monte Carlo

Kevin Vanslette, Abdullatif Al Alsheikh and Kamal Youcef-Toumi


We motivate and calculate Newton–Cotes quadrature integration variance and compare it directly with Monte Carlo (MC) integration variance. We find an equivalence between deterministic quadrature sampling and random MC sampling by noting that MC random sampling is statistically indistinguishable from a method that uses deterministic sampling on a randomly shuffled (permuted) function. We use this statistical equivalence to regularize the form of permissible Bayesian quadrature integration priors such that they are guaranteed to be objectively comparable with MC. This leads to the proof that simple quadrature methods have expected variances that are less than or equal to their corresponding theoretical MC integration variances. Separately, using Bayesian probability theory, we find that the theoretical standard deviations of the unbiased errors of simple Newton–Cotes composite quadrature integrations improve over their worst-case errors by an extra dimension-independent factor N^{-1/2}. This dimension-independent factor is validated in our simulations.

Funding statement: This work was supported by the Center for Complex Engineering Systems at King Abdulaziz City for Science and Technology and the Massachusetts Institute of Technology.

A Appendix: Direct comparisons to other MC-based integration methods?

We attempt to make some direct comparisons between simple quadrature and Markov chain Monte Carlo (MCMC), importance sampling, stratified sampling, and Latin hypercube sampling using this analysis. We use comparisons of these methods to MC from the literature and then compare them to simple quadrature using (3.11).

The variance of an MCMC estimate is

Var_MCMC = τ · Var_MC,

where τ is the integrated autocorrelation time. When the samples are all drawn i.i.d., one finds τ = 1, which is MC. Due to the Markov chain nature of MCMC, the samples tend to have positive autocorrelations that decay exponentially, Cov(f_i, f_j) ∝ exp(−|j − i|/τ), which means that τ tends to be greater than one, i.e., it tends that Var_MC ≤ Var_MCMC. Using (3.11), this implies that the expected variance of rectangular and midpoint quadrature is also less than the variance of MCMC.
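The effect of τ can be checked numerically. The sketch below is our own illustration (not code from the article): it uses an AR(1) chain as a stand-in for correlated MCMC output and compares the empirical variance of its sample mean against i.i.d. sampling of the same stationary distribution. The ratio of the two variances approaches τ = (1 + ρ)/(1 − ρ), the integrated autocorrelation time of an AR(1) process.

```python
import numpy as np

rng = np.random.default_rng(0)
N, reps, rho = 1000, 2000, 0.9

# i.i.d. "plain MC" chains: variance of the sample mean is sigma^2 / N
iid_means = rng.standard_normal((reps, N)).mean(axis=1)

# AR(1) "MCMC-like" chains with the same stationary variance (sigma = 1);
# their integrated autocorrelation time is tau = (1 + rho) / (1 - rho)
x = rng.standard_normal(reps)            # start in stationarity
chains = np.empty((reps, N))
innov_sd = np.sqrt(1 - rho**2)           # keeps stationary variance at 1
for t in range(N):
    x = rho * x + innov_sd * rng.standard_normal(reps)
    chains[:, t] = x
mcmc_means = chains.mean(axis=1)

tau = (1 + rho) / (1 - rho)              # = 19 for rho = 0.9
ratio = np.var(mcmc_means) / np.var(iid_means)
print(f"empirical Var_MCMC / Var_MC = {ratio:.1f}, tau = {tau:.0f}")
```

For ρ = 0.9 the empirical ratio lands near τ = 19 (slightly below, due to the finite chain length), consistent with Var_MCMC = τ · Var_MC.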

The other sampling methods are difficult to compare directly with simple quadrature for a number of reasons. The variance of importance sampling in the general case is not directly compared to MC in the literature because the “importance sampling distribution” h(x) is chosen somewhat arbitrarily in practice. It is known that the optimal choice of h(x) is h*(x) = |f(x)| / ∫|f(x)| dx; however, this is self-defeating as it requires effectively knowing the value of the integral in the first place, ∫|f(x)| dx. This causes practitioners to rely on heuristics or guess-and-check strategies for choosing h(x). Thus, by not having a rigorous general comparison to MC, it also lacks a rigorous comparison to simple quadrature. Latin hypercube sampling (LHS) and stratified sampling (SS) have been compared with MC, but because Var_LHS ≤ Var_MC (for monotonic functions at least) and Var_SS ≤ Var_MC, while E[Var_𝒬] ≤ Var_MC, one cannot determine from these inequalities alone whether or not the variance of simple quadrature is less than that of the other methods or vice versa for general f(x). Making general direct comparisons with these methods is outside the scope of the present article.
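The self-defeating nature of the optimal importance density h* can be made concrete with a toy example; the choice f(x) = x² on [0, 1] and the inverse-CDF sampler below are our own illustration, not from the article. Here h*(x) = |f(x)| / ∫|f| = 3x² drives the estimator variance to zero, but writing h* down already required knowing ∫f = 1/3.

```python
import numpy as np

rng = np.random.default_rng(1)
n = 100_000
f = lambda x: x**2                     # integral of f over [0, 1] is 1/3

# plain MC: average f over uniform samples on [0, 1]
mc_samples = f(rng.random(n))

# importance sampling with the optimal density h*(x) = 3x^2, which we can
# only write down because we already know the answer; sample from h* by
# inverse CDF: X = U^(1/3) (using 1 - U so X stays in (0, 1], avoiding x = 0)
x = (1.0 - rng.random(n)) ** (1.0 / 3.0)
is_samples = f(x) / (3 * x**2)         # constant 1/3 for every sample

print(mc_samples.mean(), is_samples.mean(), is_samples.std())
```

Every importance-weighted sample equals 1/3 exactly, so the "estimator" has zero variance; in practice one only has heuristic approximations to h*, which is why no general variance comparison to MC exists.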

B Appendix: Simulation results

For 𝒬_R and 𝒬_M, we simulate the standard deviation of the error under their respective piecewise spline interpolant assumptions. Using N = 2^16 = 65,536 samples, we simulated the approximate error of the integral using


where {f_d^(n_𝒬)(c_𝒬,i)} are randomly generated from a uniform distribution having standard deviation σ_{f_d^(n_𝒬)} (these distributions should be assigned based on prior knowledge, and the choice of their functional form does not much affect our result – here uniform was chosen a priori). The error is randomly simulated 100 times, {ε̂_𝒬^(1), …, ε̂_𝒬^(100)}, and the unbiased standard deviation of the error is estimated using

σ̂_{ε̂_𝒬} = ( (1/100) ∑_{s=1}^{100} (ε̂_𝒬^(s))² )^{1/2},

as the expected value of the error is zero. If the expected error is not zero because ∑_{i=1}^{N} ∑_{d=1}^{D} f_d^(n_𝒬)(c_𝒬,i) ≠ 0, then one can use the knowledge of this bias to correct estimate (4.2).

The exponent χ in N^{-χ} from the standard deviation of the error (4.3) is what we seek to validate in the simulation. We estimate χ from the simulation results by using


for 𝒬_R and 𝒬_M, which in theory is χ = n_𝒬/D + 1/2. The ± values on the simulated χ̂ values in Table 1 and Table 2 were generated by repeating the whole experiment 10 times and reporting the average χ̂ ± σ̂_χ̂ of the estimates. Once one realizes that the function derivatives can be treated as random numbers, the simulation effectively becomes the exercise of demonstrating the well-known result that the standard deviation of the average of a set of random numbers decreases as N^{-1/2}.
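That N^{-1/2} behavior is easy to reproduce directly. A minimal sketch (our own illustration, not the article's simulation code): estimate the standard deviation of the mean of N uniform random numbers for several N, then recover χ from the slope of log σ̂ versus log N, where the theory value is χ = 1/2.

```python
import numpy as np

rng = np.random.default_rng(2)
reps = 2000
Ns = [2**k for k in range(6, 13)]     # N = 64, 128, ..., 4096

# standard deviation of the mean of N random numbers, over many repetitions
stds = [rng.random((reps, N)).mean(axis=1).std() for N in Ns]

# the slope of log(std) vs log(N) estimates -chi; theory says chi = 1/2
chi_hat = -np.polyfit(np.log(Ns), np.log(stds), 1)[0]
print(f"chi_hat = {chi_hat:.3f}")
```

The fitted exponent comes out close to 0.5, mirroring how the tables below validate χ = n_𝒬/D + 1/2 for the quadrature error.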

Table 1

𝒬_R: Theory vs. simulation of χ = 1/D + 1/2.

Dimension D | Error exponent | Theory exponent χ | Simulated exponent χ̂
1  | 1      | 1.5    | … ± 0.007
2  | 0.5    | 1.0    | … ± 0.006
4  | 0.25   | 0.75   | 0.752 ± 0.005
8  | 0.125  | 0.625  | 0.625 ± 0.004
16 | 0.0625 | 0.5625 | 0.564 ± 0.007
Table 2

𝒬_M: Theory vs. simulation of χ = 2/D + 1/2.

Dimension D | Error exponent | Theory exponent χ | Simulated exponent χ̂
1  | 2     | 2.5   | … ± 0.006
2  | 1     | 1.5   | … ± 0.005
4  | 0.5   | 1.0   | … ± 0.006
8  | 0.25  | 0.75  | 0.748 ± 0.006
16 | 0.125 | 0.625 | 0.625 ± 0.004


We would like to acknowledge the thoughtful conversations we had with Zeyad Al Awwad, Arwa Alanqary, and Nicholas Carrara during the writing of this article.



Received: 2020-01-06
Accepted: 2020-01-14
Published Online: 2020-02-05
Published in Print: 2020-03-01

© 2020 Walter de Gruyter GmbH, Berlin/Boston
