Several decomposition procedures have been developed in order to decompose changes in some dependent variable into parts that are attributable to changes in characteristics or in coefficients. The original approach by Oaxaca (1973) and Blinder (1973) applies to the linear regression case. However, when studying changes in collective bargaining coverage, the dependent variable is binary and thus a non-linear parametric model is required. For this case Fairlie (1999, 2005) develops a decomposition approach on which this study is based.

We want to decompose changes in collective bargaining coverage over time. Adapting Fairlie’s method to our application, the decomposition reads:
$\overline{Y}_{2006}-\overline{Y}_{2001}=\underset{\text{Residual}}{\underbrace{\left[\sum_{i=1}^{N^{06}}\frac{F\left(X_{i}^{06}\widehat{\beta}^{06}\right)}{N^{06}}-\sum_{i=1}^{N^{06}}\frac{F\left(X_{i}^{06}\widehat{\beta}^{01}\right)}{N^{06}}\right]}}+\underset{\text{Characteristics}}{\underbrace{\sum_{i=1}^{N^{06}}\frac{F\left(X_{i}^{06}\widehat{\beta}^{01}\right)}{N^{06}}-\sum_{j=1}^{N^{01}}\frac{F\left(X_{j}^{01}\widehat{\beta}^{01}\right)}{N^{01}}}}$ [1]
where $X_{i}$ denotes the row of the covariate matrix *X* for individual *i* and *β* is the coefficient vector. Here *F* is the standard normal cumulative distribution function, so the underlying model is a probit. The shorthand 01 refers to the year 2001 and 06 to the year 2006. *N*^{06} and *N*^{01} denote the sample sizes of the two years. Hats denote estimated values. The hypothetical value $F\left(X_{i}^{06}\widehat{\beta}^{01}\right)$ estimates the propensity of being covered by collective bargaining for individuals with characteristics from 2006 as if they had lived in the labour market of 2001. We estimate all decompositions separately for males and females and for East and West Germany. All covariates are expressed as differences from their 2001 means within the corresponding subsample (males/females; East/West). This allows us to interpret changes in the constant as changes over time.
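The two terms of eq. [1] can be illustrated with a minimal sketch. It assumes that the probit coefficients have already been estimated; the function names, the toy covariate rows and the coefficient values below are hypothetical and serve only to show the mechanics.

```python
from statistics import NormalDist

# Standard normal CDF, i.e. the function F of eq. [1].
PHI = NormalDist().cdf


def mean_prediction(X, beta):
    """Average predicted probability: (1/N) * sum_i F(x_i' beta)."""
    return sum(PHI(sum(x * b for x, b in zip(row, beta))) for row in X) / len(X)


def fairlie_decomposition(X01, X06, beta01, beta06):
    """Split the change in mean predicted coverage into the two terms of eq. [1]."""
    actual06 = mean_prediction(X06, beta06)
    # Individuals of 2006 evaluated with the coefficients of 2001.
    counterfactual = mean_prediction(X06, beta01)
    actual01 = mean_prediction(X01, beta01)
    return {
        "residual": actual06 - counterfactual,
        "characteristics": counterfactual - actual01,
        "total": actual06 - actual01,
    }


# Hypothetical toy data: first column is the constant, second a centred covariate.
X01 = [[1.0, -0.5], [1.0, 0.0], [1.0, 0.5]]
X06 = [[1.0, -0.2], [1.0, 0.3], [1.0, 0.6], [1.0, 0.9]]
beta01 = [0.4, 0.8]  # made-up probit coefficients for 2001
beta06 = [0.1, 0.6]  # made-up probit coefficients for 2006

result = fairlie_decomposition(X01, X06, beta01, beta06)
```

By construction the two parts sum to the total change in mean predicted coverage, whatever the sample sizes of the two years.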

The second term in eq. [1] is called the “characteristics effect” because it represents differences in the outcome variable that arise from differences in the distribution of *X* (Fairlie 2005: 307). The first term, in brackets, captures differences that arise from changes in the coefficients and in the constant. If relevant factors are unobserved by the researcher, they affect the constant; if these unobservables are also correlated with the covariates, the coefficients may be biased as well. For this reason, the first term of the decomposition is usually labelled the “residual” or “unexplained” part (Fairlie 2005: 307; Schnabel and Wagner 2007).

The coefficients $\widehat{\beta}$ are obtained from probit regressions of a collective bargaining dummy on a set of covariates. The covariates can be grouped into three subgroups of interest:

P: Personal characteristics of the employee, i. e. age, tenure and education.

F: Firm characteristics of the job match, i. e. firm size, region and share of male employees.

S: Sector of the firm, i. e. industry branch.^{8}

Next, we extend the decomposition approach in order to consider the contributions of these different sets of characteristics separately. This step requires a matching of the observations for the construction of a hypothetical counterfactual combination.^{9}

Fairlie (2005: 308) suggests matching the observations based on the ranks of the fitted values of the estimated nonlinear functions. If the two groups are not of the same size, he further suggests using a random subsample of the larger group. However, this approach does not explicitly take account of the correlations between the covariates, and we therefore develop the approach further at this point (similar to Antonczyk et al. 2010). The following approach is based on the sequential decomposition suggested by DiNardo et al. (1996) and further developed by Chernozhukov et al. (2013) and by Antonczyk et al. (2009, 2010). While all these approaches apply to the case of a continuous dependent variable, we now translate them to the case of a limited dependent variable based on Fairlie (2005).

Thus, we want to decompose:
$\overline{Y}^{06}-\overline{Y}^{01}=F\left(\beta_{P}^{06},\beta_{F}^{06},\beta_{S}^{06},\beta_{0}^{06},X_{S}^{06},X_{F}^{06},X_{P}^{06}\right)-F\left(\beta_{P}^{01},\beta_{F}^{01},\beta_{S}^{01},\beta_{0}^{01},X_{S}^{01},X_{F}^{01},X_{P}^{01}\right)$ [2]
where *X*_{P}, *X*_{F} and *X*_{S} denote the sets of personal characteristics, firm characteristics and the industry sector respectively, and $\beta_{P}$, $\beta_{F}$ and $\beta_{S}$ the corresponding coefficients. $\beta_{0}$ denotes the constants obtained from the two underlying probit regressions for 2001 and 2006. In this notation, $F(\cdot)$ is shorthand for the sample average of the predicted probabilities evaluated at the listed coefficients and characteristics.

We will analyse the contribution of each of the components separately by changing them step by step as denoted by the following sequence of effects:
$\begin{array}{l}{\Delta}^{1}=F\left({\beta}_{P}^{06},{\beta}_{F}^{06},{\beta}_{S}^{06},{\beta}_{0}^{06},{X}_{S}^{06},{X}_{F}^{06},{X}_{P}^{06}\right)-F\left({\beta}_{P}^{01},{\beta}_{F}^{06},{\beta}_{S}^{06},{\beta}_{0}^{06},{X}_{S}^{06},{X}_{F}^{06},{X}_{P}^{06}\right)\\ {\Delta}^{2}=F\left({\beta}_{P}^{01},{\beta}_{F}^{06},{\beta}_{S}^{06},{\beta}_{0}^{06},{X}_{S}^{06},{X}_{F}^{06},{X}_{P}^{06}\right)-F\left({\beta}_{P}^{01},{\beta}_{F}^{01},{\beta}_{S}^{06},{\beta}_{0}^{06},{X}_{S}^{06},{X}_{F}^{06},{X}_{P}^{06}\right)\\ {\Delta}^{3}=F\left({\beta}_{P}^{01},{\beta}_{F}^{01},{\beta}_{S}^{06},{\beta}_{0}^{06},{X}_{S}^{06},{X}_{F}^{06},{X}_{P}^{06}\right)-F\left({\beta}_{P}^{01},{\beta}_{F}^{01},{\beta}_{S}^{01},{\beta}_{0}^{06},{X}_{S}^{06},{X}_{F}^{06},{X}_{P}^{06}\right)\\ {\Delta}^{4}=F\left({\beta}_{P}^{01},{\beta}_{F}^{01},{\beta}_{S}^{01},{\beta}_{0}^{06},{X}_{S}^{06},{X}_{F}^{06},{X}_{P}^{06}\right)-F\left({\beta}_{P}^{01},{\beta}_{F}^{01},{\beta}_{S}^{01},{\beta}_{0}^{01},{X}_{S}^{06},{X}_{F}^{06},{X}_{P}^{06}\right)\\ {\Delta}^{5}=F\left({\beta}_{P}^{01},{\beta}_{F}^{01},{\beta}_{S}^{01},{\beta}_{0}^{01},{X}_{S}^{06},{X}_{F}^{06},{X}_{P}^{06}\right)-F\left({\beta}_{P}^{01},{\beta}_{F}^{01},{\beta}_{S}^{01},{\beta}_{0}^{01},{X}_{S}^{01},{X}_{F}^{06},{X}_{P}^{06}\right)\\ {\Delta}^{6}=F\left({\beta}_{P}^{01},{\beta}_{F}^{01},{\beta}_{S}^{01},{\beta}_{0}^{01},{X}_{S}^{01},{X}_{F}^{06},{X}_{P}^{06}\right)-F\left({\beta}_{\text{P}}^{01},{\beta}_{F}^{01},{\beta}_{S}^{01},{\beta}_{0}^{01},{X}_{S}^{01},{X}_{F}^{01},{X}_{P}^{06}\right)\\ {\Delta}^{7}=F\left({\beta}_{P}^{01},{\beta}_{F}^{01},{\beta}_{S}^{01},{\beta}_{0}^{01},{X}_{S}^{01},{X}_{F}^{01},{X}_{P}^{06}\right)-F\left({\beta}_{\text{P}}^{01},{\beta}_{F}^{01},{\beta}_{S}^{01},{\beta}_{0}^{01},{X}_{S}^{01},{X}_{F}^{01},{X}_{P}^{01}\right)\end{array}$[3]

The choice of sequence is not innocuous: the order matters in any sequential decomposition, i. e. such decompositions are path-dependent.^{10} We choose this specific sequence of counterfactuals because it reflects the idea of transferring the individuals from 2006 ‘back in time’ to the year 2001. We argue that this is meaningful because in this way we first assess to what extent the changing labour market remunerations (i. e. coefficients) contributed to the drop in coverage, given the individual characteristics of 2006. Only then do we change the characteristics. The complete sequential decomposition of changes in collective bargaining coverage from 2001 to 2006 can be summarised as:
$\overline{Y}^{06}-\overline{Y}^{01}=\underbrace{\underbrace{\Delta^{1}}_{\text{Personal}}+\underbrace{\Delta^{2}}_{\text{Firm}}+\underbrace{\Delta^{3}}_{\text{Sector}}}_{\text{Coefficients}}+\underbrace{\Delta^{4}}_{\text{Residual}}+\underbrace{\underbrace{\Delta^{5}}_{\text{Sector}}+\underbrace{\Delta^{6}}_{\text{Firm}}+\underbrace{\Delta^{7}}_{\text{Personal}}}_{\text{Characteristics}}$

The first term of this detailed decomposition, Δ^{1}, reflects changes in the propensity to work under collective bargaining that occur due to changes in the coefficients which correspond to personal characteristics. For example, for certain educational groups, if the probability of working under collective contracts changes over time relative to other educational groups, this is reflected in the first component.

The second term of the detailed decomposition, Δ^{2}, captures changes in the coefficients which correspond to firm characteristics. For example, for employees working in small firms, if the probability of working under collective contracts changes over time relative to large firms, this is reflected in the second component.

The third term, Δ^{3}, captures changes in the coefficients which correspond to the industry sector. For example, for certain industries, if collective bargaining coverage changes more strongly over time than for other industries, this is reflected in the third component.

The fourth term, Δ^{4}, captures changes in the constant of the regression model over time. This includes an average time shift that applies to all industries, all firms and all employees. Further, a change in the constant includes changes in all unobservable variables. Therefore, the fourth component reflects all residual factors.
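The reading of the constant as a pure time shift follows from centring all covariates on their 2001 subsample means. As a hedged illustration (the tilde notation is introduced here for exposition only), write $\tilde{X}_{i}=X_{i}-\overline{X}^{01}$ for the centred covariates, so that the probit for year $t$ reads
$P\left(Y_{i}=1\mid X_{i}\right)=F\left(\beta_{0}^{t}+\tilde{X}_{i}\beta^{t}\right),\quad t\in\{01,06\}.$
For an individual with average 2001 characteristics, $\tilde{X}_{i}=0$, so the predicted coverage probability is $F\left(\beta_{0}^{t}\right)$ and its change between the two years is $F\left(\beta_{0}^{06}\right)-F\left(\beta_{0}^{01}\right)$, which depends on the constants alone.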

The fifth component, Δ^{5}, captures changes in the industry composition of the workforce. For example, if there is a trend towards tertiarisation and collective bargaining coverage in the service sector differs from that in the manufacturing sector, this compositional effect is reflected in the fifth component.

The sixth component, Δ^{6}, captures changes in the composition of firms. For example, if there is a trend towards larger firms and if larger firms have different propensities to be covered by collective bargaining than smaller firms, then this is reflected in the sixth component.

The seventh component, Δ^{7}, captures changes in the composition of employees. For example, if there is a trend towards educational upskilling and if highly educated employees display lower probabilities of collective bargaining coverage than less educated employees, then this is reflected in the seventh component.

All seven components add up to the total change in collective bargaining coverage over time as given by the difference between the average predicted values from the conditional models (see eq. [3]).^{11}
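The seven steps of eq. [3] and their adding-up property can be sketched as follows. This is a simplified illustration rather than the estimation code: it assumes that the counterfactual 2001 characteristics have already been matched row by row to the 2006 individuals, and all names, coefficient values and data are hypothetical.

```python
from statistics import NormalDist

PHI = NormalDist().cdf  # standard normal CDF (probit link)


def mean_F(beta, X):
    """Mean of F(b0 + x_P'b_P + x_F'b_F + x_S'b_S) over all rows."""
    n = len(X["P"])
    total = 0.0
    for i in range(n):
        index = beta["0"]
        for block in ("P", "F", "S"):
            index += sum(b * x for b, x in zip(beta[block], X[block][i]))
        total += PHI(index)
    return total / n


def sequential_decomposition(beta01, beta06, X01, X06):
    """Return [Delta1, ..., Delta7] in the order of eq. [3]."""
    # Coefficients P, F, S, then the constant, then characteristics S, F, P.
    steps = [("beta", "P"), ("beta", "F"), ("beta", "S"), ("beta", "0"),
             ("X", "S"), ("X", "F"), ("X", "P")]
    beta, X = dict(beta06), dict(X06)  # start from the 2006 state
    previous = mean_F(beta, X)
    deltas = []
    for kind, block in steps:
        if kind == "beta":
            beta[block] = beta01[block]  # swap in 2001 coefficients
        else:
            X[block] = X01[block]        # swap in matched 2001 characteristics
        current = mean_F(beta, X)
        deltas.append(previous - current)
        previous = current
    return deltas


# Hypothetical toy inputs: one covariate per block, three matched rows.
beta06 = {"0": -0.2, "P": [0.3], "F": [0.5], "S": [0.4]}
beta01 = {"0": 0.1, "P": [0.5], "F": [0.7], "S": [0.9]}
X06 = {"P": [[0.2], [-0.1], [0.4]], "F": [[1.0], [0.0], [1.0]], "S": [[0.0], [0.0], [1.0]]}
X01 = {"P": [[0.0], [-0.3], [0.3]], "F": [[1.0], [1.0], [0.0]], "S": [[1.0], [0.0], [1.0]]}

deltas = sequential_decomposition(beta01, beta06, X01, X06)
total = mean_F(beta06, X06) - mean_F(beta01, X01)
```

Because each step starts from the counterfactual reached by the previous one, the intermediate terms cancel and the seven components sum to the total change.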

Up to and including step 4, implementing the procedure only requires plugging the relevant coefficients from 2001 into the individual observations from 2006. For the fifth step, it is necessary to simulate in which industry sectors the individuals from 2006, who still work in their 2006 firms, would have worked in 2001. This is implemented by matching with a Gaussian kernel. Similarly, for the sixth step it is necessary to match the individual employees from 2006 to firms and industry sectors in 2001. Again, this is implemented by Gaussian kernel matching.
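The idea of Gaussian kernel matching can be sketched as follows. The text does not specify the matching variables or the bandwidth, so the scalar matching score, the bandwidth of 0.5 and all names below are assumptions made purely for illustration. Each 2006 individual receives a kernel-weighted average of the sector dummies observed for 2001 individuals with similar scores.

```python
import math


def gaussian_kernel_weights(target, candidates, bandwidth):
    """Normalised Gaussian kernel weights of the candidates around the target score."""
    raw = [math.exp(-0.5 * ((target - c) / bandwidth) ** 2) for c in candidates]
    total = sum(raw)
    return [w / total for w in raw]


def kernel_match(scores06, scores01, values01, bandwidth=0.5):
    """For each 2006 score, return the kernel-weighted mean of the 2001 rows."""
    n_cols = len(values01[0])
    matched = []
    for score in scores06:
        weights = gaussian_kernel_weights(score, scores01, bandwidth)
        matched.append([
            sum(w * row[col] for w, row in zip(weights, values01))
            for col in range(n_cols)
        ])
    return matched


# Hypothetical toy data: a scalar matching score and two sector dummies per 2001 row.
scores01 = [0.0, 1.0, 2.0]
sectors01 = [[1, 0], [0, 1], [0, 1]]
scores06 = [0.0, 2.0]

matched_sectors = kernel_match(scores06, scores01, sectors01)
```

In this sketch the matched rows are expected sector shares rather than a single assigned sector; drawing one sector per individual from these weights would be an alternative design.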

The crucial assumption that underlies the estimation of a hypothetical counterfactual distribution is that a change in the covariates *X* does not affect the parameters of the conditional distribution model given *X* (e. g. Chernozhukov et al. 2013; Antonczyk et al. 2010). In other words, the decomposition approach ignores general equilibrium effects. This is similar to other decomposition methods in the literature (e. g. DiNardo et al. 1996).^{12} This means that if changes in the characteristics cause the coefficients to change or *vice versa*, this interrelation cannot be detected by the decomposition approach.^{13}

A further caveat of the standard Fairlie method is that the residual effect does not distinguish between the impact of the coefficients and that of the constant (Schnabel and Wagner 2007). Our approach addresses this point because changes in the coefficients are separated from changes in the constant.

Finally, as explained above, sequential decompositions are path-dependent. Therefore, the order will be reversed later in order to test for robustness (see the online appendix for details).
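Path dependence can be made concrete with a small numerical sketch. It uses only two coefficient blocks and hypothetical values, which is a deliberate simplification of the seven-step sequence: the component attributed to the personal coefficients differs depending on whether it is changed first or second, while the total is unaffected.

```python
from statistics import NormalDist

PHI = NormalDist().cdf  # standard normal CDF (probit link)


def mean_F(b_p, b_f, rows):
    """Mean predicted probability for rows of (personal, firm) covariates."""
    return sum(PHI(b_p * x_p + b_f * x_f) for x_p, x_f in rows) / len(rows)


# Hypothetical 2006 characteristics and made-up coefficients for both years.
rows06 = [(1.0, 0.5), (0.5, 1.5), (-0.5, 1.0)]
b06 = {"P": 0.2, "F": 0.3}
b01 = {"P": 0.9, "F": 1.0}

start = mean_F(b06["P"], b06["F"], rows06)
end = mean_F(b01["P"], b01["F"], rows06)

# Order A: personal coefficients first, firm coefficients second.
after_p = mean_F(b01["P"], b06["F"], rows06)
delta_p_first = start - after_p
delta_f_second = after_p - end

# Order B: firm coefficients first, personal coefficients second.
after_f = mean_F(b06["P"], b01["F"], rows06)
delta_f_first = start - after_f
delta_p_second = after_f - end
```

The difference arises because *F* is nonlinear: the effect of changing one block of coefficients depends on the level of the index at which the change is evaluated.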
