Working Paper

Forecasting with Large Datasets: Aggregating Before, During or After the Estimation?


  • Pirschel, I.
  • Wolters, M.
Publication Date

We study the forecasting performance of three alternative approaches to forecasting with large datasets. Each approach handles the dimensionality problem posed by a large dataset by aggregating its informational content, but at a different stage. We consider several factor models, a large Bayesian vector autoregression (VAR) and model averaging techniques, in which aggregation takes place before, during and after the estimation of the forecasting model, respectively. Using a dataset for Germany that consists of 123 variables at quarterly frequency, we find that overall the large Bayesian VAR and the Bayesian factor-augmented VAR provide the most precise forecasts for a set of 11 core macroeconomic variables. Both considerably outperform the remaining large-scale forecasting models in terms of joint forecasting accuracy as measured by the multivariate mean squared error (MSE). Further, we find that the performance of these two models is very robust to the exact specification of the forecasting model.
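The distinction between aggregating before, during and after estimation can be made concrete with a small sketch. The code below is an illustration under loud assumptions and not the authors' implementation: the data are synthetic rather than the German dataset, principal components regression stands in for the factor models, ridge regression (the posterior mean under a zero-centered Gaussian prior) stands in for the shrinkage of a large Bayesian VAR, and an equal-weight average of single-predictor regressions stands in for model averaging. All function and variable names are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(0)

# Synthetic stand-in data (NOT the paper's dataset): T quarters,
# N predictors driven by 2 latent factors, one target variable y.
T, N, split = 120, 30, 90
F = rng.standard_normal((T, 2))
X = F @ rng.standard_normal((2, N)) + 0.5 * rng.standard_normal((T, N))
y = np.zeros(T)
y[1:] = F[:-1] @ np.array([0.8, -0.5]) + 0.3 * rng.standard_normal(T - 1)

# One-step-ahead setup: predict y[t+1] from X[t].
X_lag, y_next = X[:-1], y[1:]
Xtr, ytr = X_lag[:split], y_next[:split]
Xte, yte = X_lag[split:], y_next[split:]

# Standardize predictors with training-sample moments only.
mean, std = Xtr.mean(axis=0), Xtr.std(axis=0)
Xtr, Xte = (Xtr - mean) / std, (Xte - mean) / std


def ols_fit(Z, target):
    """Least-squares coefficients with an intercept."""
    Z1 = np.column_stack([np.ones(len(Z)), Z])
    beta, *_ = np.linalg.lstsq(Z1, target, rcond=None)
    return beta


def ols_predict(beta, Z):
    return np.column_stack([np.ones(len(Z)), Z]) @ beta


def forecast_before(Xtr, ytr, Xte, k=2):
    """Aggregate BEFORE estimation: compress X into k principal
    components, then estimate one small regression on them."""
    _, _, Vt = np.linalg.svd(Xtr, full_matrices=False)
    W = Vt[:k].T  # loadings of the first k components
    return ols_predict(ols_fit(Xtr @ W, ytr), Xte @ W)


def forecast_during(Xtr, ytr, Xte, lam=10.0):
    """Aggregate DURING estimation: keep all predictors and shrink
    their coefficients toward zero (ridge as a Bayesian stand-in)."""
    mu = ytr.mean()
    A = Xtr.T @ Xtr + lam * np.eye(Xtr.shape[1])
    beta = np.linalg.solve(A, Xtr.T @ (ytr - mu))
    return mu + Xte @ beta


def forecast_after(Xtr, ytr, Xte):
    """Aggregate AFTER estimation: fit one small model per predictor,
    then average the individual forecasts with equal weights."""
    preds = [
        ols_predict(ols_fit(Xtr[:, [j]], ytr), Xte[:, [j]])
        for j in range(Xtr.shape[1])
    ]
    return np.mean(preds, axis=0)


forecasts = {
    "before (factors)": forecast_before(Xtr, ytr, Xte),
    "during (shrinkage)": forecast_during(Xtr, ytr, Xte),
    "after (averaging)": forecast_after(Xtr, ytr, Xte),
}
mse = {name: float(np.mean((yte - f) ** 2)) for name, f in forecasts.items()}

for name, value in mse.items():
    print(f"{name}: out-of-sample MSE = {value:.3f}")
```

The sketch forecasts a single variable, so it reports an ordinary MSE. The multivariate MSE used in the paper summarizes accuracy jointly across all 11 target variables and is not reproduced here.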


JEL Classification
C53, C55, E31, E32, E37, E47

Key Words

  • Factor Models
  • Great Recession
  • Large Bayesian VAR
  • Model averaging