Published by De Gruyter August 13, 2013

Forecast densities for economic aggregates from disaggregate ensembles

Francesco Ravazzolo and Shaun P. Vahey


We extend the “bottom up” approach for forecasting economic aggregates with disaggregates to probability forecasting. Our methodology utilises a linear opinion pool to combine the forecast densities from many disaggregate forecasting specifications, using weights based on the continuous ranked probability score. We also adopt a post-processing step prior to forecast combination. These methods are adapted from the meteorology literature. In our application, we use our approach to forecast US Personal Consumption Expenditure inflation from 1990q1 to 2009q4. Our ensemble combining the evidence from 16 disaggregate PCE series outperforms an integrated moving average specification for aggregate inflation in terms of density forecasting.

JEL codes: C11; C32; C53; E37; E52

Corresponding author: Shaun P. Vahey, Warwick Business School, University of Warwick, Coventry, UK, e-mail:


We benefited greatly from discussions with Todd Clark, Anthony Garratt, Kirstin Hubrich, Christian Kascha, James Mitchell, Kalvinder Shields, Tara Sinclair, Michael Smith and Simon van Norden. We thank conference and seminar participants at Melbourne University, Oslo University, the European Central Bank, the Veissmann European Research Centre, the Society for Nonlinear Dynamics and Econometrics 17th Annual Symposium, and the 6th Eurostat Colloquium. The views expressed in this paper are our own and do not necessarily reflect those of Norges Bank. We thank the ARC, Norges Bank, the Reserve Bank of Australia and the Reserve Bank of New Zealand for supporting this research (LP 0991098).


To illustrate how the ensemble system reacts to time variation in the weights, ωi,τ, and the parameters of the disaggregate forecasting equations, equation (3), we describe eight simulation exercises.[9]

We begin by describing the basic case, exercise 1. We simulate two disaggregate variables, each of which follows a first order autoregressive model, AR(1) with Gaussian error, given by equation (3). The aggregate index, yt, satisfies equation (1) for two disaggregates (i=1, 2), with index weights ωi=0.5. Each simulation has 1000 replications. Using a total sample of 120 observations (indexed by t=1, …, 120) in each simulation, we construct out of sample disaggregate forecasts for t=41, …, 120. We estimate the disaggregate models using a Bayesian AR(1) with non-informative priors, and an expanding window of observations for in-sample estimation. [The predictive densities follow the t-distribution, with mean and variance equal to OLS estimates; see, for example, Koop (2003, chapter 3) for details.] Out of sample forecast densities for t=41, …, 120 are passed through the LOP, using a 20-period training window to initialise the ensemble weights. A moving window of 20 observations is used both to bias-correct the disaggregate densities and to construct the ensemble weights for the LOP. Hence, the out of sample evaluation for the ensemble starts in t=τ=61 and ends in τ̄=120. We forecast the aggregate using an aggregate AR(1) specification as a benchmark forecasting model.
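The data generation and expanding-window estimation in the basic case can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the code used for the reported results: the AR(1) parameters (intercept 0, persistence 0.5, unit error variance) are hypothetical because the text does not report them, each predictive density is summarised only by an OLS-based mean and variance rather than the full Student-t density, and the bias correction and LOP steps are omitted. All function names are illustrative.

```python
import random

def simulate_ar1(n, c, phi, sigma, rng):
    """Simulate n observations of y_t = c + phi*y_{t-1} + sigma*e_t, e_t ~ N(0, 1)."""
    y = [c / (1.0 - phi)]  # start at the unconditional mean
    for _ in range(n - 1):
        y.append(c + phi * y[-1] + sigma * rng.gauss(0.0, 1.0))
    return y

def ols_ar1(y):
    """OLS estimates (c, phi, s2) from regressing y_t on a constant and y_{t-1}."""
    x, z = y[:-1], y[1:]
    n = len(z)
    mx, mz = sum(x) / n, sum(z) / n
    sxx = sum((a - mx) ** 2 for a in x)
    sxz = sum((a - mx) * (b - mz) for a, b in zip(x, z))
    phi = sxz / sxx
    c = mz - phi * mx
    resid = [b - c - phi * a for a, b in zip(x, z)]
    s2 = sum(r * r for r in resid) / (n - 2)  # residual variance
    return c, phi, s2

def expanding_window_forecasts(y, first_target):
    """One-step-ahead (mean, variance) pairs for targets first_target..len(y), 1-indexed.

    The variance is the residual variance only; parameter uncertainty is ignored
    in this sketch.
    """
    out = []
    for t in range(first_target, len(y) + 1):
        history = y[: t - 1]  # observations 1, ..., t-1
        c, phi, s2 = ols_ar1(history)
        out.append((c + phi * history[-1], s2))
    return out

rng = random.Random(0)  # fixed seed so the sketch is reproducible
T = 120
y1 = simulate_ar1(T, 0.0, 0.5, 1.0, rng)  # assumed parameter values
y2 = simulate_ar1(T, 0.0, 0.5, 1.0, rng)
aggregate = [0.5 * a + 0.5 * b for a, b in zip(y1, y2)]  # index weights of 0.5

forecasts_1 = expanding_window_forecasts(y1, 41)
forecasts_2 = expanding_window_forecasts(y2, 41)
benchmark = expanding_window_forecasts(aggregate, 41)  # aggregate AR(1) benchmark
print(len(forecasts_1), len(benchmark))  # 80 80
```

Each list holds 80 forecasts, one for every target date t=41, …, 120. A full replication would repeat this 1000 times with different seeds.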

In the seven subsequent simulation exercises, we explore the implications of introducing specification errors to the forecasting system. These include evolving index weights, and various forms of structural breaks in the disaggregate forecasting specifications. In each simulation, the disaggregate ensemble and the benchmark aggregate AR(1) model ignore the time variation in the “true” specification so that we can study the impacts of unknown specification errors.

  2. The index weight ω1 follows an autoregressive process, such that the weight is bounded in the interval [0.25, 0.75], and the weights sum to one.

  3. As exercise 2 except that each disaggregate has a single break in the mean at observation t=20.

  4. As exercise 3 except that each disaggregate has two breaks in the mean, the first at observation t=20, the second at t=60.

  5. As exercise 2 except that each disaggregate has a single break in the error variance at observation t=20.

  6. As exercise 5 except that each disaggregate has two breaks in the error variance, the first at observation t=20, the second at t=60.

  7. As exercise 2 except that each disaggregate has a single break in both the mean and the error variance at observation t=20.

  8. As exercise 7 except that each disaggregate has two breaks in both the mean and the error variance, the first at observation t=20, the second at t=60.
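A data generating process of the kind listed above can be sketched as follows. The sketch is illustrative only and rests on several assumptions that the text does not pin down: the weight is kept inside [0.25, 0.75] by clipping an AR(1) path centred on 0.5, a new regime is taken to apply from the break observation onwards, and the persistence, break magnitudes and standard deviations are made-up values. The example reproduces the design of the final exercise in the list, with two breaks in both the mean and the error variance.

```python
import random

def bounded_ar_weight(n, rng, phi=0.9, sd=0.05, lo=0.25, hi=0.75):
    """AR(1) path around 0.5, clipped to [lo, hi] (clipping is an assumption)."""
    w = [0.5]
    for _ in range(n - 1):
        step = 0.5 + phi * (w[-1] - 0.5) + sd * rng.gauss(0.0, 1.0)
        w.append(min(hi, max(lo, step)))
    return w

def ar1_with_breaks(n, rng, phi, means, sds, breaks):
    """AR(1) whose mean and error s.d. switch at the 1-indexed dates in `breaks`.

    `means` and `sds` hold one value per regime, so each has len(breaks) + 1
    entries. The regime in force at date t is the number of break dates <= t.
    """
    y = [means[0]]
    for t in range(2, n + 1):
        regime = sum(t >= b for b in breaks)
        mu, sd = means[regime], sds[regime]
        y.append(mu + phi * (y[-1] - mu) + sd * rng.gauss(0.0, 1.0))
    return y

rng = random.Random(1)  # fixed seed so the sketch is reproducible
T = 120
w1 = bounded_ar_weight(T, rng)

# Two breaks (t=20 and t=60) in both mean and error s.d.; magnitudes are hypothetical.
y1 = ar1_with_breaks(T, rng, 0.5, [0.0, 1.0, 2.0], [1.0, 1.5, 0.5], [20, 60])
y2 = ar1_with_breaks(T, rng, 0.5, [0.0, -1.0, 0.5], [1.0, 0.5, 1.5], [20, 60])

# The second weight is 1 - w1, so the index weights sum to one at every date.
aggregate = [w * a + (1.0 - w) * b for w, a, b in zip(w1, y1, y2)]
```

Passing an empty `breaks` list with single-element `means` and `sds` recovers a constant-parameter AR(1), and replacing `w1` with a list of 0.5 values recovers constant index weights.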

To check that our results in exercises 2 through 8 are not sensitive to the assumption that the index weights are time-varying, we repeated exercises 2–8 with constant weights. The results of these simulations are quantitatively similar to exercises 2–8 and so are not reported. That is, the time variation in the index weights has negligible impacts on the performance of the disaggregate ensemble relative to the aggregate benchmark.

To judge forecasting performance, we use the average logarithmic score over the evaluation period, τ=61 to τ̄=120. The logarithmic score of the ith density forecast, ln g(Yt|Ii,t), is the log of the probability density function g(.|Ii,t), evaluated at the outturn Yt. Mitchell and Wallis (2011) provide a recent discussion of scoring rules and the justification for testing relative density forecasting performance from the perspective of the KLIC. Gneiting and Raftery (2007) analyse the relationships between scoring rules and Bayes factors. A higher average logarithmic score denotes better density forecasting performance. Figure 4 shows histograms of the average logarithmic scores from the 1000 replications of each simulation exercise.
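The scoring calculation can be illustrated with a short sketch. It assumes Gaussian component densities for simplicity (the predictive densities described earlier are Student-t), and the function names and the numbers in the example are hypothetical. The pooled density is a weighted sum of the component densities, and its logarithmic score is the log of that sum evaluated at the outturn.

```python
import math

def normal_pdf(y, mean, var):
    """Gaussian density with the given mean and variance, evaluated at y."""
    return math.exp(-0.5 * (y - mean) ** 2 / var) / math.sqrt(2.0 * math.pi * var)

def log_score(y, mean, var):
    """Logarithmic score of a single Gaussian density forecast at outturn y."""
    return math.log(normal_pdf(y, mean, var))

def pool_log_score(y, components, weights):
    """Logarithmic score of a linear opinion pool: ln sum_i w_i * g_i(y).

    `components` is a list of (mean, variance) pairs; `weights` should be
    non-negative and sum to one.
    """
    density = sum(w * normal_pdf(y, m, v) for w, (m, v) in zip(weights, components))
    return math.log(density)

def average_log_score(scores):
    """Average logarithmic score over the evaluation period."""
    return sum(scores) / len(scores)

# Hypothetical outturns and forecasts for three periods.
outturns = [0.1, -0.3, 0.8]
single = [(0.0, 1.0), (0.0, 1.0), (0.0, 1.0)]
pooled = [[(0.2, 0.5), (-0.1, 1.5)]] * 3
weights = [0.6, 0.4]

avg_single = average_log_score(
    [log_score(y, m, v) for y, (m, v) in zip(outturns, single)]
)
avg_pool = average_log_score(
    [pool_log_score(y, comps, weights) for y, comps in zip(outturns, pooled)]
)
print(avg_single, avg_pool)  # the higher value indicates the better density forecast
```

When every component of the pool is the same density, the pooled score equals the score of that density, which is a useful sanity check on the implementation.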

Figure 4 Simulation results.

Note: The figures show histograms of the logarithmic score (LS) for the AR model and the DE, together with the mean of the LS for the AR model (red lines) and the mean of the LS for the DE (red dashed lines), for our simulation exercises.

There are two striking features of our simulations. First, regardless of which case we consider, the disaggregate ensemble (DE) is never inferior to the aggregate benchmark forecasting model (AR) in terms of the average logarithmic score across the 1000 replications. Second, the biggest differences in density forecasting performance arise in cases where the disaggregate forecasting specifications exhibit multiple structural breaks (especially in the means), in particular in exercises 4 and 8.


Atger, F. 2003. “Spatial and Interannual Variability of the Reliability of Ensemble-based Probabilistic Forecasts: Consequences for Calibration.” Monthly Weather Review 131: 1509–1523. doi:10.1175/1520-0493(2003)131<1509:SAIVOT>2.0.CO;2

Amisano, G., and R. Giacomini. 2007. “Comparing Density Forecasts via Likelihood Ratio Tests.” Journal of Business and Economic Statistics 25 (2): 177–190. doi:10.1198/073500106000000332

Arora, S. M., M. A. Little, and P. E. McSharry. 2013. “Nonlinear and Nonparametric Modelling Approaches for Forecasting the US GNP.” Studies in Nonlinear Dynamics and Control, 1–26, published online June 2013.

Bache, I. W., J. Mitchell, F. Ravazzolo, and S. P. Vahey. 2010. “Macro Modeling with Many Models.” In Twenty Years of Inflation Targeting: Lessons Learned and Future Prospects, edited by D. Cobham, Ø. Eitrheim, S. Gerlach, and J. Qvigstad, 398–418. Cambridge, UK: Cambridge University Press. doi:10.1017/CBO9780511779770.016

Bao, Y., T.-H. Lee, and B. Saltoglu. 2007. “Comparing Density Forecast Models.” Journal of Forecasting 26: 203–225. doi:10.1002/for.1023

Bao, L., T. Gneiting, E. P. Grimit, P. Guttorp, and A. E. Raftery. 2010. “Bias Correction and Bayesian Model Averaging for Ensemble Forecasts of Surface Wind Direction.” Monthly Weather Review 138: 1811–1821. doi:10.1175/2009MWR3138.1

Berkowitz, J. 2001. “Testing Density Forecasts, with Applications to Risk Management.” Journal of Business and Economic Statistics 19 (4): 465–474. doi:10.1198/07350010152596718

Clark, T. E. 2006. “Disaggregate Evidence on the Persistence of Consumer Price Inflation.” Journal of Applied Econometrics 21: 563–587. doi:10.1002/jae.859

Clark, T. E. 2011. “Real-time Density Forecasts from VARs with Stochastic Volatility.” Journal of Business and Economic Statistics 29 (3): 327–341. doi:10.1198/jbes.2010.09248

Clark, T. E., and M. W. McCracken. 2010. “Averaging Forecasts from VARs with Uncertain Instabilities.” Journal of Applied Econometrics 25: 5–29. doi:10.1002/jae.1127

Croushore, D. 2009. “Revisions to PCE Inflation Measures: Implications for Monetary Policy.” FRB Philadelphia Working Paper 08-8, revised July 2009.

Doblas-Reyes, F. J., A. Weisheimer, M. Déqué, N. Keenlyside, M. McVean, J. M. Murphy, P. Rogel, D. Smith, and T. N. Palmer. 2009. “Addressing Model Uncertainty in Seasonal and Annual Dynamical Ensemble Forecasts.” Quarterly Journal of the Royal Meteorological Society 135: 1538–1559. doi:10.1002/qj.464

Diebold, F. X., T. A. Gunther, and A. S. Tay. 1998. “Evaluating Density Forecasts; with Applications to Financial Risk Management.” International Economic Review 39: 863–883. doi:10.2307/2527342

Feinstein, M., M. A. King, and J. Yellen. 2004. “Innovations and Issues in Monetary Policy: Panel Discussion.” American Economic Review, Papers and Proceedings, May, 41–48.

Garratt, A., J. Mitchell, S. P. Vahey, and Wakerly. 2011. “Real-time Inflation Forecast Densities from Ensemble Phillips Curves.” North American Journal of Economics and Finance 22: 77–87. doi:10.1016/j.najef.2010.09.003

Geweke, J. 2009. Complete and Incomplete Econometric Models. Princeton, US: Princeton University Press. doi:10.1515/9781400835249

Gneiting, T. 2011. “Making and Evaluating Point Forecasts.” Journal of the American Statistical Association 106: 746–762. doi:10.1198/jasa.2011.r10138

Gneiting, T., and A. E. Raftery. 2007. “Strictly Proper Scoring Rules, Prediction and Estimation.” Journal of the American Statistical Association 102 (477): 359–378. doi:10.1198/016214506000001437

Gneiting, T., and T. Thorarinsdottir. 2010. “Predicting Inflation: Professional Experts versus No-change Forecasts.”

Granger, C., and M. H. Pesaran. 2000. “Economic and Statistical Measures of Forecast Accuracy.” Journal of Forecasting 19: 537–560. doi:10.1002/1099-131X(200012)19:7<537::AID-FOR769>3.0.CO;2-G

Greenspan, A. 2004. “Risk and Uncertainty in Monetary Policy.” American Economic Review, Papers and Proceedings, May, 33–40.

Groen, J. J. J., R. Paap, and F. Ravazzolo. 2009. “Real-time Inflation Forecasting in a Changing World.” Federal Reserve Bank of New York Staff Reports, 388.

Hendry, D. F., and K. Hubrich. 2011. “Combining Disaggregate Forecasts or Combining Disaggregate Information to Forecast an Aggregate.” Journal of Business and Economic Statistics 29 (2): 216–227. doi:10.1198/jbes.2009.07112

Hersbach, H. 2000. “Decomposition of the Continuous Ranked Probability Score for Ensemble Prediction Systems.” Weather and Forecasting 15: 559–570. doi:10.1175/1520-0434(2000)015<0559:DOTCRP>2.0.CO;2

Jore, A. S., J. Mitchell, and S. P. Vahey. 2010. “Combining Forecast Densities from VARs with Uncertain Instabilities.” Journal of Applied Econometrics 25: 621–634. doi:10.1002/jae.1162

Kascha, C., and F. Ravazzolo. 2010. “Combining Inflation Density Forecasts.” Journal of Forecasting 29: 231–250. doi:10.1002/for.1147

Koop, G. 2003. Bayesian Econometrics. Chichester, UK: Wiley.

Lütkepohl, H. 2009. “Forecasting Aggregated Time Series Variables: A Survey.” Economics Working Papers ECO2009/17, European University Institute.

Lütkepohl, H. 2010. “Forecasting Nonlinear Aggregates and Aggregates with Time-varying Weights.” Economics Working Papers ECO2010/11, European University Institute. doi:10.2139/ssrn.1597622

Marcellino, M., J. Stock, and M. Watson. 2003. “Macroeconomic Forecasting in the Euro area: Country Specific versus Euro Wide Information.” European Economic Review 47: 1–18. doi:10.1016/S0014-2921(02)00206-4

Mitchell, J., and S. G. Hall. 2005. “Evaluating, Comparing and Combining Density Forecasts using the KLIC with an Application to the Bank of England and NIESR Fan Charts of Inflation.” Oxford Bulletin of Economics and Statistics 67: 995–1033. doi:10.1111/j.1468-0084.2005.00149.x

Mitchell, J., and K. F. Wallis. 2011. “Evaluating Density Forecasts: Forecast Combinations, Model Mixtures, Calibration and Sharpness.” Journal of Applied Econometrics 26: 1023–1040. doi:10.1002/jae.1192

Panagiotelis, A., and M. Smith. 2008. “Bayesian Density Forecasting Intraday Electricity Prices using Multivariate Skew t Distribution.” International Journal of Forecasting 24: 710–727. doi:10.1016/j.ijforecast.2008.08.009

Raftery, A. E., T. Gneiting, F. Balabdaoui, and M. Polakowski. 2005. “Using Bayesian Model Averaging to Calibrate Forecast Ensembles.” Monthly Weather Review 133: 1155–1174. doi:10.1175/MWR2906.1

Rosenblatt, M. 1952. “Remarks on a Multivariate Transformation.” The Annals of Mathematical Statistics 23: 470–472. doi:10.1214/aoms/1177729394

Stensrud, D. J., and N. Yussouf. 2007. “Bias-corrected Short-range Ensemble Forecasts of Near Surface Variables.” Meteorological Applications 12: 217–230. doi:10.1017/S135048270500174X

Stock, J. H., and M. W. Watson. 2007. “Why has US Inflation Become Harder to Forecast?” Journal of Money, Credit and Banking 39: 3–34.

Timmermann, A. 2006. “Forecast Combinations.” In Handbook of Economic Forecasting, vol. 1, edited by G. Elliot, C. Granger, and A. Timmermann, 135–196. North-Holland. doi:10.1016/S1574-0706(05)01004-9

van Garderen, K. J., K. Lee, and M. H. Pesaran. 2000. “Cross-sectional Aggregation of Non-linear Models.” Journal of Econometrics 95: 285–331. doi:10.1016/S0304-4076(99)00040-8

Wallis, K. F. 2003. “Chi-squared Tests of Interval and Density Forecasts, and the Bank of England’s Fan Charts.” International Journal of Forecasting 19: 165–175. doi:10.1016/S0169-2070(02)00009-2

Wallis, K. F. 2005. “Combining Density and Interval Forecasts: a Modest Proposal.” Oxford Bulletin of Economics and Statistics 67: 983–994. doi:10.1111/j.1468-0084.2005.00148.x

Supplemental Material

The online version of this article (DOI:10.1515/snde-2012-0088) offers supplementary material, available to authorized users.

Published Online: 2013-8-13
Published in Print: 2014-9-1

©2014 by De Gruyter
