The IWH Forecasting Dashboard: From Forecasts to Evaluation and Comparison

The paper describes the “Halle Institute for Economic Research (IWH) Forecasting Dashboard (ForDas)”. This tool aims to provide, on a non-commercial basis, historical and current macroeconomic forecast data for the German economy to researchers and interested audiences. The database makes it possible to directly compare forecast quality across selected institutions and over time. It is partly based on data collected in the DFG-funded project “Macroeconomic forecasts in great crisis”.


Introduction
Macroeconomic forecasting is one of the areas of economics that receives the most attention from the media and public discourse. For example, forecast competitions are quite popular: Several newspapers and database providers evaluate forecasters more or less regularly (see Döhrn 2015, for an overview of German rankings). Also, point forecasts by institutions gain a lot of attention from the media and politics. Data providers such as Consensus Economics or Focus Economics use the institutions' forecasts to produce a mean forecast across several forecasters. As a result, a significant part of the scientific literature is focused on evaluating these forecasts.
However, forecasts are typically evaluated against a theoretical benchmark such as a naive forecast, but rarely across forecasting institutions. One reason for the relative scarcity of such analyses is the lack of non-commercial databases that include several institutions, macroeconomic indicators, and forecast periods. The IWH Forecasting Dashboard (ForDas) by the Halle Institute for Economic Research aims to fill this data gap by providing historical and current data for this purpose. Furthermore, one can directly compare the forecasting quality of institutions. The dashboard relies partly on data collected within the DFG project "Macroeconomic forecasts in great crisis". The IWH took over the data collection process in 2020, continues it, and has implemented a tool to visualize the data. Due to data gaps and different dimensions (for example, growth vs. level predictions) provided by several institutions, the dashboard only shows a selection of variables. In the following, we describe the main contents of the database and the IWH Forecasting Dashboard.
Since the database was first initiated in the context of economic history, we aimed to collect all quantitative forecasts made for the German economy by institutions relevant to economic policy. As Antholz (2006) points out, these forecasts started in the early 1960s. While it is possible to find earlier texts describing the prospects of the German economy, they rarely included concrete numbers for GDP growth or inflation. Hence, analyses and comparisons based on ForDas data can start in 1965 at the earliest for selected forecasters.

Contents of the Database
The main objective of the DFG project was to build a macroeconomic forecast database, including a broad set of forecasts for Germany over the longest possible period, achieving consistency within the dataset as far as possible. In combination with user-friendliness and free access to the data, the database supplies a new instrument for forecast evaluation, made for the scientific community, media, or the public in general.
To this end, the forecast database covers the key macroeconomic variables at each point in time, including variables related to national accounts, financial and monetary variables, and variables related to labor and (un-)employment. We chose the institutions according to their long-term experience in forecasting and their relevance for economic policy support in Germany. The field of macroeconomic forecasting has grown markedly over time, and the quantity of forecasts and forecasters has increased substantially with it. To give an impression of the growth of the "forecasting industry" during the period covered, Figure 1 shows the number of forecasts per year included in the database for the headline measure of economic growth. For the larger part of the period, this headline measure was the rate of change of real GDP. In a smaller part of the sample, however, real GNP growth served as the most prominent figure in this context and is, thus, also taken into account. The number of forecasts is driven both by the number of forecasting institutions and by the forecast frequency per year: at the beginning of the sample, forecasts were published once per year, whereas forecasters have recently published up to four forecasts per year. The number of variables of interest for the forecasters also grew over time. Definitions and names of variables varied over time as well, which to some extent reflected changes in the system of national accounts.

Forecasters Covered
The database includes forecast data on the German economy covered by 15 national and international institutions with different institutional backgrounds:

Variables Included
The database covers key macroeconomic variables (Table 1). In addition, financial and monetary variables as well as data related to trade, labor, employment, and unemployment are included. However, note that not all forecasting institutions mentioned above provide forecasts for all variables in all periods. 8 Furthermore, institutions might provide forecasts either in levels or growth rates.

Realisations (Actual Data) and Forecast Horizons
For the comparison of the forecast data with the actual economic development, the so-called "real-time" data problem (Stark and Croushore 2002) is an important issue in economic forecasting and forecast evaluation. Since time series are prone to revisions, the database provides realisations for selected important series in two variants: first, the initial publication (first release) and, second, the revised data (current data vintage) by the German statistical office, if available. The database includes information on three possible forecast horizons: the current year, i.e. the year in which the forecast is released (labeled t0 in the database), the next year (t1), and the year after the next year (t2). Hence, following the practice of the institutions covered, the forecasts in the database are "fixed event" rather than "fixed horizon" predictions (see, e.g. Knüppel and Vladu 2016).
8 See the data availability section of the IWH Forecasting Dashboard for the time span and missing observations for selected variables and institutions.
However, most of the institutions publish multiple forecasts per year (forecast rounds). Therefore, the exact date of forecast publication is stored to distinguish different forecast rounds (e.g. quarters, months). In addition, if available, the date on which the forecast was completed is reported. Generally, the database refers to the name of a variable as it was at the date of the production of the forecast. 9 In a similar vein, all dimensions refer to the date of the forecasts. Thus, variables are expressed in Deutsche Mark up to 2000, and in Euros after. Real variables usually refer to the respective base year. Around German reunification, the switch from forecasts referring to West Germany to predictions for Germany as a whole differs by series and by institution and is, hence, noted similarly for each series. 10
Figure 2 shows the real GDP growth forecasts for Germany provided by the economic research institutes, as well as the realised values, from 2001 to 2022. The forecasts were made in autumn (months 9, 10) of the previous year. The black line represents the realised growth rate. The figure illustrates, at first glance, some key insights regarding economic forecast evaluation. First, the forecasts are relatively close to each other, and it seems that they do not differ significantly over time. Second, the prediction of economic turning points (and/or recessions) is still a big challenge in economic forecasting. Different forecasting dates and diverging information sets seem to be important in determining forecast accuracy.
9 This is important for cases in which the official name in the national accounts has changed. For example, until a certain date, the database refers to "Gross national product", followed by "Gross National Income" in later years.
10 The transition from West German to all-German forecasts around unification is not uniform across forecasters. Therefore, we excluded the year 1991 from the calculation of forecast errors in the IWH Forecasting Dashboard.
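The conventions of this section (fixed-event horizons and two realisation variants) can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the class and function names, the institution label, the sign convention (realisation minus forecast), and all numbers are hypothetical and are not taken from the database.

```python
from dataclasses import dataclass
from typing import Dict, Optional

# Horizon labels as used in the text: t0 = year of release, t1 = next year,
# t2 = the year after next. Everything else in this sketch is hypothetical.
HORIZON_OFFSET = {"t0": 0, "t1": 1, "t2": 2}


@dataclass(frozen=True)
class Forecast:
    institution: str   # hypothetical institution label
    release_year: int  # year in which the forecast was published
    horizon: str       # "t0", "t1" or "t2"
    value: float       # forecast growth rate in percent


def target_year(fc: Forecast) -> int:
    """Fixed-event target: the calendar year the forecast refers to."""
    return fc.release_year + HORIZON_OFFSET[fc.horizon]


def forecast_error(fc: Forecast, realisations: Dict[int, float]) -> Optional[float]:
    """Realisation minus forecast (an assumed sign convention).

    Returns None when no realisation is available for the target year.
    """
    actual = realisations.get(target_year(fc))
    return None if actual is None else actual - fc.value


# Invented numbers for illustration only, not ForDas data.
first_release = {2002: 0.2}     # initial publication
current_vintage = {2002: -0.2}  # revised data
fc = Forecast("Institution A", 2001, "t1", 1.5)

error_first = forecast_error(fc, first_release)
error_current = forecast_error(fc, current_vintage)
print(target_year(fc), round(error_first, 2), round(error_current, 2))
```

The same forecast yields different errors depending on the realisation variant chosen, which is why the choice between first release and current vintage matters for evaluation.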

Recent and Possible Applications
The IWH Forecasting Dashboard provides the basis for various potential research questions, most obviously the evaluation of German business cycle forecasts based on their accuracy or efficiency. Other potential uses include the investigation of the institutes themselves, their behavior, and how it changes after economically significant events or due to paradigmatic shifts. An additional research field concerns assessing the benefits of business cycle forecasts for economic agents. Several studies have already produced scientific findings based on the data available on the IWH Forecasting Dashboard.
Köhler and Döpke (2023) use the IWH Forecasting Dashboard to rank 14 institutions from 1993 to 2019 according to their forecast accuracy. They report substantial long-run differences in forecasting quality, which they mostly attribute to distinct average forecast horizons. Therefore, they cannot single out institutions as being superior at predicting the German economy. Engelke et al. (2019) examine the extent to which initial assumptions that prove incorrect ex post drive economic forecast errors. Based on an unbalanced panel of annual forecasts from different institutions forecasting German GDP and the underlying assumptions, they find that over 75% of squared errors of the GDP forecast co-move with the squared errors in their underlying assumptions. This finding implies that the accuracy of the assumptions is of great importance and that forecasters should reveal the framework of their assumptions so that useful policy recommendations can be derived from economic forecasts. Döpke et al. (2019) investigate the impact of the Great Recession on forecast accuracy for growth and inflation and on forecaster behaviour, using a data panel from 1971 to 2017. The authors report stable accuracy for growth forecasts but slightly lower precision for inflation forecasts. More significantly, they report that the loss function has changed after the Great Recession, leading to more pessimistic forecasts from German professional forecasters.
Behrens et al. (2018a) evaluate whether growth and inflation forecasts are efficient or optimal, which requires that the information available at the time of forecast creation has no explanatory power for the corresponding forecast error. The joint forecast efficiency evaluation shows heterogeneity across the institutes, with different institutes producing inefficient forecasts for different forecast horizons. A follow-up study confirms this result: it extends the previous study with various scenarios and robustness checks and rejects strong and weak forecast efficiency of growth and inflation forecasts in multiple cases. Additionally, its authors show in an out-of-sample experiment that a Bayesian additive regression trees (BART) model produces significantly more accurate forecasts. Behrens (2020) finds trade forecasts to be similarly heterogeneously inefficient. Remarkably, the forecasters include typical trade predictors more efficiently than macroeconomic variables in their export and import forecasts.
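The efficiency condition mentioned above, namely that information available when the forecast was made should have no explanatory power for the forecast error, can be illustrated with a textbook linear regression. The sketch below is not the method of the cited studies; it is a minimal pure-Python example that assumes a single predictor and classical (homoskedastic, serially uncorrelated) standard errors, with invented data and a hypothetical function name.

```python
import math


def efficiency_regression(errors, predictor):
    """OLS of forecast errors on a constant and one predictor.

    Returns (intercept, slope, t-statistic of the slope). Under efficiency,
    both coefficients should be statistically indistinguishable from zero.
    """
    n = len(errors)
    mean_x = sum(predictor) / n
    mean_e = sum(errors) / n
    sxx = sum((x - mean_x) ** 2 for x in predictor)
    sxy = sum((x - mean_x) * (e - mean_e) for x, e in zip(predictor, errors))
    slope = sxy / sxx
    intercept = mean_e - slope * mean_x
    residuals = [e - intercept - slope * x for x, e in zip(predictor, errors)]
    sigma2 = sum(r ** 2 for r in residuals) / (n - 2)  # residual variance
    t_slope = slope / math.sqrt(sigma2 / sxx)
    return intercept, slope, t_slope


# Invented data: forecast errors and one indicator known at the forecast date.
indicator = [-2.0, -1.0, 0.0, 1.0, 2.0]
errors = [-1.0, -0.5, 0.1, 0.4, 1.0]

intercept, slope, t_slope = efficiency_regression(errors, indicator)
print(round(intercept, 3), round(slope, 3), round(t_slope, 1))
```

In this made-up sample the slope is large relative to its standard error, so the indicator would have helped to predict the error. With real data, a proper test would also need to account for small samples and overlapping forecast horizons.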
Further research examines forecast efficiency while assuming a flexible instead of a symmetric (quadratic) loss function. For this purpose, the researchers test whether the set of predictors has predictive value for the sign of the forecast error using random decision forests. Re-evaluating inflation forecast optimality, Behrens et al. (2018b) suggest that short-term inflation forecasts are suboptimal for some institutes, while they fail to reject the null hypothesis for long-term forecasts. Reconsidering trade forecasts, Behrens (2019) rejects optimality in only one case, thus supporting a more favorable assessment of forecasts if flexible loss functions are assumed.
Several studies have incorporated textual business cycle reports into their forecast efficiency analyses using natural language processing (NLP). While the written reports are not part of the IWH ForDas itself, they are the source and rationale of the forecasts, which makes a combined analysis reasonable. Müller (2022) transforms the written accounts of forecasters' expectations into sentiment indices using nine different methods. The author demonstrates that several indices can improve the accuracy of German business cycle forecasts, which shows that forecasters do not fully exploit the information content of their business cycle reports for their numerical point forecasts. Foltas (2022) uses the Word2Sense-LDA topic model, developed specifically for this task, to measure the proportions of different economic topics in each business cycle report and uses their shifts to test investment forecast efficiency. In some cases, the author rejects forecast efficiency with topics as the most important predictors, supporting the thesis that institutes inefficiently incorporate qualitative information discussed in their business cycle reports into point forecasts.
With an approach using topics as the sole predictors of the forecast error, Foltas and Pierdzioch (2022a) affirm the usefulness of topic modeling for forecast efficiency analysis. The authors find several interpretable topics related to the forecast error under symmetric and flexible loss functions. Lastly, Foltas and Pierdzioch (2022b) use a mixed sample of indicators and topics to predict growth forecast errors using quantile random forests and out-of-sample density forecasts. Even though none of the topics are among the top predictors, their aggregated relative importance varies between roughly 30% and 50%.
Currently, most studies based on the IWH Forecasting Dashboard evaluate economic forecasts and analyse the performance of forecasters. Only a few papers focus on the forecasts' economic value. Döpke et al. (2018) explore this field by testing an investment portfolio approach that actively reacts to macroeconomic predictions. While their approach does not systematically outperform passively managed portfolios, they propose several ways to extend their analysis.

IWH Forecasting Dashboard: Data Access
The IWH Forecasting Dashboard (IWH ForDas) is an online tool built in R Shiny that uses the database described above to present the data publicly and interactively. So far, a preselection of data has been used to allow a suitable comparison across institutions. The dashboard provides both historical and recent forecast data as well as official data from the German statistical office (Destatis) and the Bundesbank, including both first releases (real-time data) and current values (pseudo-real-time data) for the target years. 11 The IWH ForDas includes several forecasting institutions and macroeconomic variables, allowing for a direct comparison of the figures, both graphically and numerically. Furthermore, the tool allows for forecast error calculations.
The user can choose between a simplified view, where only the latest forecasts are shown for a preselection of forecasters, and an extended view, where researchers can assemble data according to specific needs. As an example of the extended view, Figure 3a shows the autumn forecasts made by the Joint Forecast (GD) during the years 1999-2022, i.e. the target years of the next-year forecast (t + 1) run from 2000 until 2023. The last available year published by the Federal Statistical Office (Destatis) is currently 2022. The figure displays both forecast and first-release data. 12 Furthermore, the user can use the IWH ForDas to calculate the forecast errors for a selected variable (forecast target) and sample, and even compare the errors across institutions. The option "comparable errors" compares forecasts that were made in the same month (quarter) within a year. In addition, forecast figures and real-time data can be viewed, compared, and stored.
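One way to read the "comparable errors" option is sketched below: an error enters the comparison only if every institution under comparison published a forecast for the same target year in the same month of the same year. This reading, the function name, the institution labels, and all numbers are assumptions made for illustration. The sketch does not describe how the dashboard itself is implemented.

```python
import math
from collections import defaultdict

# Hypothetical rows: (institution, publication year, publication month,
# target year, forecast in percent). Invented numbers, not ForDas data.
forecasts = [
    ("Inst A", 2001, 10, 2002, 1.5),
    ("Inst B", 2001, 10, 2002, 1.0),
    ("Inst A", 2002, 10, 2003, 2.0),
    ("Inst B", 2002, 12, 2003, 0.5),  # later month: no match for Inst A
    ("Inst A", 2003, 10, 2004, 1.0),
    ("Inst B", 2003, 10, 2004, 2.0),
]
first_release = {2002: 0.5, 2003: 0.0, 2004: 1.5}


def comparable_rmse(rows, realisations):
    """RMSE per institution over rounds in which all institutions forecast."""
    institutions = {row[0] for row in rows}
    by_round = defaultdict(dict)
    for inst, pub_year, pub_month, target, value in rows:
        # A repeated forecast in the same month overwrites the earlier one.
        by_round[(pub_year, pub_month, target)][inst] = value
    squared = defaultdict(list)
    for (_, _, target), values in by_round.items():
        if set(values) != institutions or target not in realisations:
            continue  # skip rounds that are not comparable
        for inst, value in values.items():
            squared[inst].append((realisations[target] - value) ** 2)
    return {inst: math.sqrt(sum(sq) / len(sq)) for inst, sq in squared.items()}


rmse = comparable_rmse(forecasts, first_release)
for inst in sorted(rmse):
    print(inst, round(rmse[inst], 3))
```

Restricting the sample in this way keeps the information sets of the forecasters roughly aligned, at the cost of discarding forecasts that have no counterpart, such as the December forecast in the example.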
11 First release data refer to the first publication of national accounts data by the Federal Statistical Office of Germany (in January of the year following the target year), Fachserie 18 R1-1. For current account data, for example, publications by the Bundesbank in February on Balance of Payments Statistics (Statistical supplement to monthly report 3) are used as the first reference. On request, the forecaster has access to real-time data from the end of February.
12 Both forecast and first release data cover only the respective year and should not be interpreted as a full time series.

Another option is to compare the forecasts for two different variables and also across multiple institutions. Figure 3b shows the relationship between GDP growth and world trade growth assumptions (scatter plot) for a selection of two forecasters, the Joint Forecast and OECD.
In addition to allowing for the comparison of forecasts and forecast errors across institutions, the dashboard also allows for the analysis of individual forecaster performance. It provides a graphical illustration of forecasts for a selected variable over time, e.g. covering all forecast target years at a particular time (porcupine graph). Furthermore, forecasts for various indicators over time can be depicted. Figure 3c shows an example of forecasts of GDP growth and world trade growth by the Joint Forecast (GD) made in the spring (second quarter) of a particular year for the next year.
The website offers support to the user via a detailed tutorial as well as a FAQ section.

Future Plans
The IWH ForDas is usually updated by the end of January, April, July, and October to cover the most recent forecasts. Further variables will be added in the months to come. It is also planned to include quarterly forecasts. Naturally, it would be desirable to include additional forecasters, in particular those with a commercial background.