Publicly Available Published by De Gruyter Oldenbourg September 21, 2019

Industry Conversion Tables for German Firm-Level Data

  • Steffi Dierks, Alexander Schiersch and Jan Stede

1 Introduction

The relevance of microdata for economic analysis and policy evaluation has grown rapidly in the last decade. One reason for this is that ever more data are collected and existing microdata are made available to researchers. The establishment of the Research Data Centers (FDZ) was a milestone in this process in Germany. Located in almost every federal state, they grant easy access to the enormous treasure of official microdata (Zühlke et al. 2004). The creation of the AFiD-Panels, named after the German acronym for “official firm-level data in Germany,” also contributes to the accessibility of firm-level data by combining different official surveys and censuses. For more information on the AFiD-Panels, we refer to Wagner (2000, 2008, 2010), Fritsch et al. (2004), Konold (2007), Malchin and Voshage (2009), and Vogel (2009).

There are several reasons why German firm-level data are of high quality compared to other sources. First, firms are legally obliged to deliver the requested data. Second, firms have almost no option to refuse participation once they are selected into a survey. Both reasons lead to extremely low non-response rates, e. g. less than two percent in the case of the cost structure survey of manufacturing firms (Statistisches Bundesamt 2016). Third, the data are collected and processed by the staff of the Statistical Offices. This ensures very high data quality, since these institutions have more resources than private firms and their staff are well qualified. Finally, the data are usually collected such that they are representative at the industry level and, often, also at the state level.

In applied research using firm-level data, it is often imperative to have a panel with an industry classification that is consistent over time. Monetary variables, for example, need to be deflated in order to allow comparisons over time. The estimation of production functions should also be conducted on a disaggregated industry level (if sufficient observations are available), usually defined by the two-digit or three-digit industry code. This follows from the assumption typically imposed in production function approaches that all firms operate within the same production possibility set, using the same technology. This assumption is much more credible when the estimation is done at the disaggregated industry level. Finally, dummies are often used to capture structural differences between industries.

The use of such firm-level data is hampered, however, by institutional changes, including changes of the industry classification that reflect changes in economic structures. In particular, the change from ISIC Rev. 3.1 to ISIC Rev. 4, implemented in Germany as the change from WZ 2003 to WZ 2008 (hereinafter: WZ03 and WZ08, respectively), [1] led to a disruption in the time series. Unlike previous changes, it was not just a relabeling of industries: the introduction of some new sections (e. g. the newly created section information and communication) makes it difficult to compare data between WZ03 and WZ08 (Greulich 2009). [2] In addition, entire sectors were dissolved and the companies that were part of these sectors were assigned to newly established or already existing industries. [3] This is not only true for two-digit industries within one-digit industries, such as in manufacturing, but firms also moved between one-digit industries. For example, industry D22 (Publishing, printing and reproduction of recorded media) is part of manufacturing according to WZ03, but large parts of it moved to J58 (Publishing activities) in WZ08 and, therefore, are no longer classified as manufacturing in the new system. Disregarding such inconsistencies in the sectoral affiliation of firms may lead to a situation where, for example, two different sectoral deflators are used for different periods of time for a single firm that has not changed its main economic activity.

The fundamental problem of the industry reclassification is that there is no one-to-one mapping between old and new industries: firms from an old industry moved to different economic sectors after 2008 and new industries combine firms from different former industries. Conversion tables providing one-to-one mappings that can be universally applied are one solution to this issue. However, the official conversion tables published by the Federal Statistical Office of Germany cannot be used in applied research. These tables simply contain all possible combinations between old and new sectors, without specifying what fraction of a certain sector in WZ03 was assigned to which sector in WZ08.

In this paper, we provide ready-to-use conversion tables for German firm-level data. [4] We apply a strategy to rematch firms between industry classification systems based on a propensity score. Specifically, we estimate the “most likely counterpart” of each economic sector by comparing the relative share of each industry combination/transition path in the overall number of switchings per industry observed in the German business register. We then select the industry combination/transition path with the highest share as the relevant one for the one-to-one-mapping. As a robustness check, we compare the estimated one-to-one mappings with the universe of industry combinations/transition paths published by the Federal Statistical Office of Germany.

We provide conversion tables for the five-digit, four-digit and three-digit industry code level. Four-digit sectoral codes are also included in the AFiD-Panels, which means our conversion tables can be directly applied to this official data as well as to any other German firm-level data with three-digit or more detailed industry codes. All conversion tables are provided for two directions, both to convert WZ03 codes to WZ08, and vice versa.

The remainder of this paper is organized as follows: Section 2 describes the data and method. Section 3 shows the results and Section 4 summarizes.

2 Data and methodology

We estimate the industry conversion tables using the AFiD-Panel Business Register. This dataset contains all economically active units – both single-plant firms and multi-plant companies – that provide a contribution to the German gross domestic product, have their registered office in Germany, and focus on one of the sections B (Mining and quarrying) to N (Administrative and support service activities), as well as P (Education) to S (Other service activities), of the statistical classification of economic activities in the European Community NACE Rev. 2 (Statistische Ämter des Bundes und der Länder 2015). The coverage of economic units is optional for section A (Agriculture, forestry and fishing), as well as O (Public administration, defense and social insurance). Sections T (Activities of households as employers; undifferentiated goods- and service-producing activities of households for own use) and U (Activities of extraterritorial organizations and bodies) are excluded.

The Business Register is a statistical register that is created and maintained using existing data from different sources. In Germany, these are mainly the files of the tax administration, the monthly results from the employment statistics of the Federal Employment Agency on companies and employees, as well as the annual files of the craft chambers and the chambers of industry and commerce. The most relevant items in the panel are the unique company and establishment identification numbers and the sector of economic activity according to the German industry classification WZ03 or WZ08. [5]

For calculating the conversion tables, the WZ03 and the WZ08 codes for each firm are required. Unfortunately, the data do not contain both codes for most firms within the year of the introduction of WZ08. Thus, for the creation of the industry conversion tables, we assume that firms do not change their main economic activity within two subsequent years. In other words, we assume that a company that produces cars at time t, for example, continues to be primarily a car manufacturer in t+1.

For a conversion table to work, one unique WZ08 code for every single WZ03 code is needed. However, the change in the classification system was so significant that there is no one-to-one mapping between industries. Rather, firms from the same industry in 2007 moved to distinct industries after 2008 and industries in 2008 combine firms from different industries of 2007. Consequently, there are many different WZ08 codes for each WZ03 code and vice versa, even at the detailed 5-digit level.

We use the relative majority to overcome this problem, defining each pair in the conversion table by a WZ03 code and the respective WZ08 code that most of the firms were assigned to. In order to construct the relative majority, we focus on the years 2007 and 2008 and count those firms with different sectoral codes in 2008 (the first year in which WZ08 codes are used in the AFiD-Panel Business Register) that were in the same sector in 2007 (the latest year that WZ03 codes were provided). Dividing these numbers by the overall number of observations in a specific WZ03 sector provides shares. Finally, the WZ08 code with the highest share per WZ03 code is selected for the conversion table. The same approach is applied in the opposite direction when constructing the tables for the conversion from WZ08 to WZ03.
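The selection rule can also be written compactly. The notation below is introduced here purely for illustration and is not taken from the original text: n_{ij} is the number of firms observed in WZ03 sector i in 2007 and in WZ08 sector j in 2008, N_i is the number of matched firms in WZ03 sector i, s_{ij} is the resulting share, and j*(i) is the WZ08 code entered in the conversion table for sector i.

```latex
N_i = \sum_{j} n_{ij}, \qquad
s_{ij} = \frac{n_{ij}}{N_i}, \qquad
j^{*}(i) = \arg\max_{j} \, s_{ij}
```

For the reverse direction (WZ08 to WZ03), the roles of i and j are swapped, so the shares are computed relative to the number of firms in each WZ08 sector.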


Table 1 provides a stylized example of our methodology. Different companies that were all in the same sector in 2007 (sector 01132) are assigned to different sectors in 2008. Most companies – 75 % in the example – are assigned to sector 01210. Consequently, in our conversion table all companies of WZ03 sector 01132 are assumed to be assigned to the WZ08 sector 01210.

Table 1:

Stylized example of the methodology used to construct the conversion tables.

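| Company ID | WZ03 (in 2007) | WZ08 (in 2008) | NWZ03 | NWZ08 | Max(share) |
|---|---|---|---|---|---|
| 0000001 | 01132 | 01210 | 8 | 6 | 75 % |
| 1230012 | 01132 | 01210 | 8 | 6 | 75 % |
| 1345678 | 01132 | 01210 | 8 | 6 | 75 % |
| 1783456 | 01132 | 01210 | 8 | 6 | 75 % |
| 2348900 | 01132 | 01210 | 8 | 6 | 75 % |
| 2784568 | 01132 | 01210 | 8 | 6 | 75 % |
| 3456600 | 01132 | 01640 | 8 | 1 | 12.5 % |
| 3722399 | 01132 | 11020 | 8 | 1 | 12.5 % |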
  1. Notes: NWZ03 and NWZ08 refer to the (stylized) number of firms in a specific sector in 2003 and 2008, respectively. For a given sectoral combination, Max(share) is the share of the firms that moved to the converted sector (WZ08) as a fraction of the total number of firms in the previous sector (WZ03).
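The stylized example in Table 1 can be reproduced with a few lines of code. The following is a minimal sketch of the relative-majority rule, not the authors' implementation: the function name `build_conversion_table` and the tie-breaking rule are assumptions made for illustration, while the firm records are the ones shown in Table 1.

```python
from collections import Counter, defaultdict

# Stylized firm records from Table 1:
# (company ID, WZ03 code in 2007, WZ08 code in 2008).
firms = [
    ("0000001", "01132", "01210"),
    ("1230012", "01132", "01210"),
    ("1345678", "01132", "01210"),
    ("1783456", "01132", "01210"),
    ("2348900", "01132", "01210"),
    ("2784568", "01132", "01210"),
    ("3456600", "01132", "01640"),
    ("3722399", "01132", "11020"),
]


def build_conversion_table(records):
    """Map each source code to the target code chosen by relative majority.

    Returns {source_code: (target_code, share)}, where share is the fraction
    of firms in the source sector that moved to the chosen target sector.
    """
    transitions = defaultdict(Counter)
    for _firm_id, source, target in records:
        transitions[source][target] += 1

    table = {}
    for source, counts in transitions.items():
        total = sum(counts.values())
        # Highest count wins. Ties are broken by the smallest target code;
        # this tie rule is an assumption of the sketch.
        target, n = min(counts.items(), key=lambda kv: (-kv[1], kv[0]))
        table[source] = (target, n / total)
    return table


# Forward direction: WZ03 -> WZ08.
wz03_to_wz08 = build_conversion_table(firms)

# Reverse direction: swap the roles of the old and the new code.
wz08_to_wz03 = build_conversion_table(
    [(firm_id, new, old) for firm_id, old, new in firms]
)
```

With the Table 1 data, `wz03_to_wz08["01132"]` is `("01210", 0.75)`, which matches the 75 % share reported in the table. Keeping the share next to each pair makes it easy to see how strongly a given conversion is supported.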

The original dataset, before any data treatment, contains 6.51 million observations for 2007 and 6.76 million observations for 2008. After the first phase of the data preparation process, which drops missing observations in relevant variables and focuses on firms, the dataset contains 10.3 million observations. [6] About 1.8 million of these firms do not have a match in either of the two years. As either the old or the new industry code is not observed for them, these firms are also dropped from the dataset, leaving a final dataset of about 8.5 million observations, i. e. 4.25 million firms per year.
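The matching step described above can be sketched as follows. This is an illustration under stated assumptions rather than the procedure actually run on the register: the function `match_registers`, the dictionary layout, and the firm IDs 9999999 and 8888888 are hypothetical. It relies on the assumption that a firm's main activity does not change between 2007 and 2008, so that the 2007 WZ03 code and the 2008 WZ08 code describe the same activity.

```python
def match_registers(codes_2007, codes_2008):
    """Pair each firm's WZ03 code (2007) with its WZ08 code (2008).

    Both arguments are dicts {firm_id: industry_code}; a value of None
    stands for a missing code. Firms that are absent, or whose code is
    missing, in either year are dropped.
    """
    matched = {}
    for firm_id, wz03 in codes_2007.items():
        wz08 = codes_2008.get(firm_id)
        if wz03 is not None and wz08 is not None:
            matched[firm_id] = (wz03, wz08)
    return matched


# Hypothetical register excerpts; only the first two firms appear in both years.
register_2007 = {"0000001": "01132", "3456600": "01132", "9999999": "01132"}
register_2008 = {"0000001": "01210", "3456600": "01640", "8888888": "11020"}

pairs = match_registers(register_2007, register_2008)
```

Only firms observed with a valid code in both years survive, which mirrors the drop from 10.3 million to about 8.5 million observations described in the text.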

3 Conversion tables and robustness checks

We create conversion tables – using the methodology described in Section 2 – for the five-digit, four-digit, and three-digit codes. For test purposes, we also create a conversion table at the two-digit industry level, but it is not part of the published tables. The estimation process is applied for both directions, i. e. conversion from WZ03 to WZ08 and vice versa. The resulting conversion tables are provided as Excel files. [7]
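Applying a conversion table to a panel amounts to a lookup per observation. The sketch below assumes the table has already been loaded into a dictionary. That layout, the function `convert_code`, its optional `min_share` threshold, and the firm-year records are all hypothetical and do not describe the format of the published Excel files. The two industry pairs and their shares are taken from Table 5.

```python
# Hypothetical in-memory excerpt of a WZ03 -> WZ08 table: {wz03: (wz08, share)}.
# The pairs and shares are those reported in Table 5.
conversion = {
    "01413": ("96032", 0.861),
    "15920": ("11010", 0.927),
}


def convert_code(wz03, table, min_share=0.0):
    """Return the WZ08 code for a WZ03 code, or None if no usable entry exists.

    min_share is an optional, illustrative threshold: entries whose
    determining share is below it are treated as not convertible.
    """
    entry = table.get(wz03)
    if entry is None:
        return None
    wz08, share = entry
    return wz08 if share >= min_share else None


# Made-up firm-year observations that still carry WZ03 codes:
# (firm ID, year, WZ03 code). Code "99999" is not in the table excerpt.
panel = [
    ("A", 2005, "01413"),
    ("A", 2006, "01413"),
    ("B", 2006, "15920"),
    ("C", 2006, "99999"),
]

converted = [
    (firm, year, convert_code(code, conversion)) for firm, year, code in panel
]
```

Returning `None` for unknown codes, instead of raising an error, keeps unmatched observations visible so they can be inspected or dropped deliberately.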

The vast majority of industry pairs is defined by very high shares of firms assigned to the respective old (new) industries. Table 6 in the appendix displays summary statistics of the shares of firms that determine which single WZ03 sector a WZ08 code is assigned to (or vice versa). The table reveals that these shares are very high on average, i. e. roughly between 85 and 90 %. Even the 5th percentile of the shares is usually around 50 % or higher. [8]

Table 2:

Differences arising from the use of the four-digit and the three-digit industry codes.

| WZ03 (pre-2008), 4-digit | WZ03 (pre-2008), 3-digit | Name of industry (4-digit) | WZ08 (post-2008), according to 4-digit FSO | WZ08 (post-2008), according to 4-digit CT | WZ08 (post-2008), according to 3-digit CT |
|---|---|---|---|---|---|
| D24.64 | D24.6 | Manufacture of photographic chemical material | C20.59 | C20.59 | C20.5 |
| D24.65 | D24.6 | Manufacture of prepared unrecorded media | C26.80 | C26.80 | C20.5 |
  1. Source: Statistisches Bundesamt (2008b).

  2. Notes: CT is the acronym for Conversion Table; FSO is the acronym for Federal Statistical Office.

Table 3:

Share of misallocated industries.

In comparison with conversion table for:

| Base conversion table | 2-digits | 3-digits | 4-digits |
|---|---|---|---|
| 3-digits | 20.91 % | | |
| 4-digits | 18.79 % | 12.13 % | |
| 5-digits | 20.95 % | 17.08 % | 12.07 % |

Table 4:

Differences in gross investment with different conversion tables, 2003–2007, WZ08.

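| Industry WZ08 | Using 2-digit conversion table: Total investment | Using 2-digit conversion table: N | Using 4-digit conversion table: Total investment | Using 4-digit conversion table: N | Differences in investment: N | Differences in investment: Total | Differences in investment: Mean†,‡ |
|---|---|---|---|---|---|---|---|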
  1. Source: DOI:10.21242/42111.2014., own calculations.

  2. Notes: *** significant at the 1 % level, ** significant at the 5 % level, * significant at the 10 % level, in million Euros, significance based on two-sided t-test of differences in means (Welch’s t-test).

Table 5:

Deviations from the conversion table of the Federal Statistical Office of Germany.

| WZ03 | Name | NWZ03 | WZ08CT | Max(Share) | Sectors listed in WZ08FSO |
|---|---|---|---|---|---|
| 01413 | Laying out, planting and maintenance of graves | 3839 | 96032 | 86.1 % | 81309 |
| 15920 | Production of ethyl alcohol from fermented materials | 464 | 11010 | 92.7 % | 20140 |
| 22122 | Publishing of weekly and Sunday newspapers | 654 | 58130 | 79.1 % | 58140 |
| 55236 | Boarding houses | 83 | 55909 | 84.3 % | 55102 |
| 60214 | Operation of funicular railways and aerial cable-ways | 290 | 93290 | 91.0 % | 49310 |
| 64110 | National post activities | 575 | 53200 | 71.7 % | 82190 |
| 72301 | Data entry services | 1122 | 62030 | 74.7 % | 74201 |
  1. Source: DOI:10.21242/52111.2012., Statistisches Bundesamt (2008b), own calculations.

  2. Notes: NWZ03 refers to the number of firms in a given five-digit WZ03 sector in the AFiD-Panel Business Register in 2007. WZ08CT is the five-digit WZ08 sector determined in our conversion tables. Max(Share) is the share of the firms that determines this converted WZ08 sector. WZ08FSO are the sectors that are listed in the conversion tables of the Federal Statistical Office of Germany.

Table 6:

Shares of the companies determining the converted industry, by direction of conversion.

| Level of precision | Direction of conversion | Mean | 5th percentile | 95th percentile | Minimum |
|---|---|---|---|---|---|
| Five-digit | WZ03 → WZ08 | 92.2 % | 70.1 % | 100 % | 33.3 % |
| Five-digit | WZ08 → WZ03 | 82.6 % | 41.5 % | 98.8 % | 20.0 % |
| Four-digit | WZ03 → WZ08 | 86.6 % | 56.1 % | 99.6 % | 31.3 % |
| Four-digit | WZ08 → WZ03 | 86.2 % | 46.8 % | 98.9 % | 19.3 % |
| Three-digit | WZ03 → WZ08 | 82.9 % | 48.1 % | 99.2 % | 23.1 % |
| Three-digit | WZ08 → WZ03 | 85.9 % | 49.9 % | 99.0 % | 19.3 % |
  1. Source: DOI:10.21242/52111.2012., own calculations.

In rare cases, however, the share of companies determining the conversion can be as low as 20 %. This reveals the main shortcoming of our approach, namely that sometimes less than half of the firms of a WZ03 code determine the single WZ08 sector to which all firms from that WZ03 code are assigned (see the minimum values in the sixth column of Table 6). On the other hand, this is a common problem with conversion tables and only holds for a small fraction of our converted industries. [9] The conversion tables include the information on the respective share for each industry pair. Table 2 illustrates the benefits of using a more refined conversion table, as industries are matched more accurately.

Example. At the 4-digit level in Table 2, all manufacturers of photographic chemical material (WZ03 code D24.64) are assigned to industry C20.59 (WZ08 code), which is part of the industry manufacture of chemicals and chemical products (WZ08 code C20). Firms producing prepared unrecorded media, on the other hand, end up in the sector manufacture of computer, electronic and optical products (WZ08 code C26). If only the three-digit level is available, firms from both industries are seen as belonging to one single sector prior to 2008 (WZ03 code D24.6). Applying the conversion table for 3-digit industry codes, manufacturers of prepared unrecorded media (WZ03 code D24.65) would end up in the chemical industry (WZ08 code C20) instead of the correct industry (Manufacture of computer, electronic and optical products, WZ08 code C26).

In order to assess the relevance of the issue of misallocated industries, we check the differences between using the three-digit conversion table and the four-digit conversion table, the differences between the four-digit conversion table and the five-digit conversion table, etc. We count the number of sectors that end up in the wrong industry once a less detailed conversion table is applied. Table 3 reveals that a significant share of the sectors is misallocated with a decreasing level of precision. For example, when using the three-digit industry conversion table instead of the four-digit industry conversion table, 62 industries or 12.1 % of all four-digit sectors are attributed to the wrong three-digit industries. In total, more than 600,000 firms or about 13.7 % of the observations are affected. As can be seen from Table 3, in general, the less detailed the conversion table is, the more industries are misallocated.
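One way to carry out such a count is sketched below, using the two industries from Table 2. The functions `truncate` and `misallocated` are hypothetical names, and the comparison rule (convert with the detailed table, then shorten the result by one digit and compare it with the result of the coarser table) is our reading of the text rather than a documented procedure.

```python
def truncate(code):
    """Drop the last digit of a code written as in Table 2, e.g. 'D24.64' -> 'D24.6'."""
    return code[:-1]


def misallocated(detailed_table, coarse_table):
    """Return the detailed source codes that the coarse table sends elsewhere.

    A source code counts as misallocated when the target obtained from the
    detailed table, shortened by one digit, differs from the target that the
    coarse table assigns to the shortened source code.
    """
    wrong = []
    for source, target in detailed_table.items():
        via_detailed = truncate(target)
        via_coarse = coarse_table.get(truncate(source))
        if via_coarse != via_detailed:
            wrong.append(source)
    return sorted(wrong)


# Pairs from Table 2 (WZ03 -> WZ08).
four_digit = {"D24.64": "C20.59", "D24.65": "C26.80"}
three_digit = {"D24.6": "C20.5"}

wrong = misallocated(four_digit, three_digit)
share = len(wrong) / len(four_digit)
```

Here `wrong` contains only `"D24.65"`, the manufacturers of prepared unrecorded media. The resulting share of 0.5 refers to these two rows alone and is not comparable to the figures in Table 3, which are based on all sectors.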

Next, we test the effects of applying conversion tables at different levels of detail on investment statistics from the AFiD-Panel Manufacturing Firms. These data are the base for capital stock estimations and, thus, relevant for micro- and macroeconomic productivity estimations or investment studies. The original data come with four-digit industry codes. However, deflator time series are often only available at the two-digit industry level. Moreover, most firm-level studies use the two-digit level either to create industry dummies or to conduct estimations separately for every two-digit industry. Therefore, we apply the industry conversion tables for the two-digit and the four-digit industry codes to the years 2003 to 2007 and check the consequences for the investment data at the two-digit industry level.

Table 4 shows the effects of applying conversion tables for industry codes at the four-digit level versus the two-digit level on firm-level data that contain the old industry codes (WZ03). The table contains the investment sums according to the new industry classification system (WZ08), the number of observations in each industry, and the differences coming from using the two different conversion tables.

The first thing to notice is that some manufacturing industries have no observations when the two-digit industry code is used and the respective conversion table applied. This is the case for industries B5, C11, C21, and C33. [10] This highlights the fact that the industry codes cannot be meaningfully converted at this level of detail, which is why we do not provide these conversion tables. Secondly, some firms end up in industries outside manufacturing when the four-digit industry code and the respective conversion table are used. This is the result of the significant reorganization of industries in WZ08 compared to WZ03.

Example. Firms belonging to industry D22 (Publishing, printing and reproduction of recorded media) according to WZ03, are now largely part of industry J58 (Publishing activities) in WZ08. This means they are considered services in the new classification system.

The third fact to acknowledge is that the differences in investment are sometimes quite large, even when not significant according to the two-sided t-test of differences in means (last column Table 4). A good example is manufacture of machinery and equipment (C28), where the difference in nominal investment is 2.6 billion Euros.

To finally check the validity of our results, we compare our conversion tables with the conversion tables of the Federal Statistical Office of Germany (Statistisches Bundesamt 2008b). These latter tables cannot be directly used in applied research, since they simply contain all possible combinations between old and new sectors, without specifying what fraction of which sector in WZ03 was assigned to which sector in WZ08. [11] Since no further information is provided, there is no way of deciding which of the pairs should be used. However, we use these tables to test whether our conversion tables only contain combinations of old and new industry codes that are also listed in the tables of the Statistical Office.

Virtually all of the 1,037 combinations in our conversion table at the five-digit level also exist in the conversion table of the Federal Statistical Office of Germany. The only exceptions are listed in Table 5, namely the industries 01413, 15920, 22122, 55236, 60214, 64110, and 72301. This deviation means that – provided the conversion table of the Federal Statistical Office does not contain mistakes – either the majority of firms in these seven sectors had the wrong industry classification in the 2007 AFiD-Panel Business Register, or the majority of these firms switched to another industry in the administrative process of assigning the new industry codes. A detailed analysis of the data reveals that, while the majority of firms in 2008 have an industry code that is listed in the fourth column of Table 5, very few firms exist with the industry code that is listed in the Federal Statistical Office of Germany’s conversion table (WZ08FSO in Table 5). In some cases, fewer than three observations have the industry code combination that the conversion table of the Federal Statistical Office suggests. Consequently, the share of companies in these WZ08 sectors is typically very low.
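The consistency check itself reduces to a set-membership test. The sketch below is illustrative only: the function `deviations` and the data layout are assumptions. For brevity it mixes two four-digit pairs from Table 2, where both sources agree, with two five-digit pairs from Table 5, where they differ. The actual comparison is made at the five-digit level against the full official list.

```python
def deviations(conversion_table, admissible_pairs):
    """Return the (source, target) pairs of the conversion table that are
    not contained in the official list of possible combinations."""
    return sorted(
        (source, target)
        for source, target in conversion_table.items()
        if (source, target) not in admissible_pairs
    )


# Pairs selected by relative majority (Tables 2 and 5).
our_table = {
    "D24.64": "C20.59",
    "D24.65": "C26.80",
    "01413": "96032",
    "15920": "11010",
}

# Stand-in for the official list: the combinations reported for the same
# source codes in Tables 2 and 5.
official_pairs = {
    ("D24.64", "C20.59"),
    ("D24.65", "C26.80"),
    ("01413", "81309"),
    ("15920", "20140"),
}

flagged = deviations(our_table, official_pairs)
```

With these inputs, `flagged` lists the pairs for sectors 01413 and 15920, two of the seven exceptions reported in Table 5.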

As seen in Table 5, the total number of companies in industries that are missing in the Federal Statistical Office of Germany’s conversion table is quite low. A total of 7,027 firms belong to these seven sectors, which is less than 0.2 % of all companies in the 2007 AFiD-Panel Business Register. We conclude that, for the overwhelming majority of industry conversions, there is a concordance between our conversion table and the one from the Federal Statistical Office of Germany.

4 Summary

This study provides ready-to-use conversion tables for industry codes from the German industry classification system WZ 2003 to WZ 2008 (and vice versa) at the five-digit level, the four-digit level, and the three-digit level of industry codes. These conversion tables are constructed using both the 2007 and 2008 German Business Registers, which contain observations for more than four million firms in each year. Although initially developed to be applied to firm-level data from the Statistical Offices, the tables can be applied to all German firm-level data that contain industry codes at the three-digit level or at a more detailed level.

Figure 1: Kernel densities of the shares of companies determining the converted industry (four-digit).

The final conversion tables contain unique pairs of industry codes to allow for their application. Each pair is defined by the relative majority of firms that moved from an old industry (WZ03) to a new industry (WZ08) or vice versa. This approach comes at a cost because some of the old industries have been completely split up. However, the statistics we provide in this study show that the overwhelming majority of industry pairs is defined by very high shares of firms assigned to the respective old (new) industries.

This paper further shows that the level of detail of the available industry codes affects the quality of the conversion. The tests reveal that the conversion works best at the highest level of detail of the original industry code. A final robustness check confirms that, with very few exceptions, the industry pairs in the conversion tables are consistent with the list of possible industry pairs that is published by the German Statistical Office. This latter list cannot be directly used in applied research, as it contains all potential industry pairs without specifying relative majorities. In contrast, direct applicability is the major advantage of the conversion tables we propose in this paper.


We thank Caroline Stiel for helpful suggestions. This research was partly funded by the German Ministry of Economics and Energy through the project: “Wissensbasiertes Kapital in Deutschland: Analysen zu Produktivitäts- und Wachstumseffekten und Erstellung eines Indikatorsystems.”


Fritsch, M., B. Görzig, O. Hennchen, A. Stephan (2004), Cost Structure Surveys for Germany. Schmollers Jahrbuch/Journal of Applied Social Science Studies 124: 557–566. DOI: 10.3790/schm.124.4.557

Greulich, M. (2009), Revidierte Wirtschaftszweig- und Güterklassifikationen fertiggestellt. Wirtschaft und Statistik 1: 36–46.

Konold, M. (2007), New Possibilities for Economic Research through Integration of Establishment-level Panel Data of German Official Statistics. Schmollers Jahrbuch/Journal of Applied Social Science Studies 127: 321–334. DOI: 10.3790/schm.127.2.321

Malchin, A., R. Voshage (2009), Official Firm Data for Germany. Schmollers Jahrbuch/Journal of Applied Social Science Studies 129: 501–513. DOI: 10.3790/schm.129.3.501

Statistische Ämter des Bundes und der Länder (2015), Unternehmensregister – System 95. Metadaten für die On-Site-Nutzung. As of 10.02.2015, last accessed 2019-08-07.

Statistisches Bundesamt (2008a), Klassifikation der Wirtschaftszweige mit Erläuterungen – 2008. As of 12.2008, last accessed 2019-08-07.

Statistisches Bundesamt (2008b), Umsteigeschlüssel der Klassifikation der Wirtschaftszweige, Ausgabe 2003 (WZ 2003) zur Klassifikation der Wirtschaftszweige, Ausgabe 2008 (WZ 2008). As of 20.11.2008, last accessed 2019-08-07.

Statistisches Bundesamt (2016), Kostenstrukturerhebung im Verarbeitenden Gewerbe sowie des Bergbaus und der Gewinnung von Steinen und Erden. As of 26.05.2015, last accessed 2016-07-15.

Vogel, A. (2009), The German Business Services Statistics Panel 2003 to 2007. Schmollers Jahrbuch/Journal of Applied Social Science Studies 129: 515–522. DOI: 10.3790/schm.129.3.515

Wagner, J. (2000), Firm Panel Data from German Official Statistics. Schmollers Jahrbuch/Journal of Applied Social Science Studies 120: 143–150. DOI: 10.3790/schm.120.1.143

Wagner, J. (2008), Die Forschungspotentiale der Betriebspaneldaten des Monatsberichtes im Verarbeitenden Gewerbe. AStA – Wirtschafts- und sozialstatistisches Archiv 2: 209–221. DOI: 10.1007/s11943-008-0044-9

Wagner, J. (2010), The Research Potential of New Types of Enterprise Data Based on Surveys from Official Statistics in Germany. Schmollers Jahrbuch/Journal of Applied Social Science Studies 130: 133–142. DOI: 10.3790/schm.130.1.133

Zühlke, S., M. Zwick, S. Scharnhorst, T. Wende (2004), The Research Data Centres of the Federal Statistical Office and the Statistical Offices of the Länder. Schmollers Jahrbuch/Journal of Applied Social Science Studies 124: 567–578. DOI: 10.3790/schm.124.4.567


Published Online: 2019-09-21
Published in Print: 2020-10-25

© 2019 Oldenbourg Wissenschaftsverlag GmbH, Published by De Gruyter Oldenbourg, Berlin/Boston
