In this section we illustrate the test using the labor market data from Card (1995). These data contain, in particular, the wage, education and age of a sample of *n* = 3010 US men taken in 1976. The main variable is the logarithm of wages (`lwage76`), and the regressors are education (`ed76`) and age (`age76`). We fit bivariate and trivariate fully parametric models for the pairs (`lwage76`,`ed76`), (`lwage76`,`age76`) and the triple (`lwage76`,`ed76`,`age76`), compute the implied regressions of log wages on one or two regressors, and test them for linearity using the test developed in this paper.^{10}

Because the regressand is a continuous variable while both regressors are discrete, we construct the joint distribution using the copula machinery. The marginal density for the continuously distributed log wages is chosen to be the skew-normal distribution (Azzalini 1985):

$$u=\mu +\sigma w,$$

where *μ* is a location parameter, *σ* is a scale parameter, *w* has the density

$${f}_{w}\left(w|\gamma \right)=2\varphi \left(w\right)\mathrm{\Phi}\left(\gamma w\right),$$

and^{11} *γ* is a shape parameter that indexes the degree of skewness; the distribution reduces to the normal when *γ* = 0. In total, the skew-normal density ${f}_{u}\left(u|{\theta}_{u}\right)$ and its CDF ${F}_{u}(u|{\theta}_{u})$ are characterized by three parameters in ${\theta}_{u}={\left(\mu ,\sigma ,\gamma \right)}^{\mathrm{\prime}}$. Azzalini, Dal Cappello, and Kotz (2003) argue that this distribution (among others) approximates real log-income data well. Below are the results of fitting the marginal skew-normal density to the variable `lwage76`.
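As a minimal sketch of the density just defined, the following Python snippet evaluates $2\varphi(w)\mathrm{\Phi}(\gamma w)/\sigma$ at $w=(u-\mu)/\sigma$ and cross-checks it against SciPy's `skewnorm`. The parameter values are purely illustrative assumptions and are not the estimates reported in the paper.

```python
import numpy as np
from scipy.stats import norm, skewnorm

def skew_normal_pdf(u, mu, sigma, gamma):
    """Density of u = mu + sigma * w, where w has density 2*phi(w)*Phi(gamma*w)."""
    w = (np.asarray(u, dtype=float) - mu) / sigma
    return 2.0 * norm.pdf(w) * norm.cdf(gamma * w) / sigma

# Illustrative (hypothetical) parameter values, not the paper's estimates.
mu, sigma, gamma = 6.5, 0.5, -1.0
grid = np.linspace(4.0, 8.0, 9)

manual = skew_normal_pdf(grid, mu, sigma, gamma)
# SciPy's parameterization: a = gamma, loc = mu, scale = sigma.
library = skewnorm.pdf(grid, a=gamma, loc=mu, scale=sigma)
max_gap = float(np.max(np.abs(manual - library)))
```

On real data, `skewnorm.fit(sample)` returns maximum likelihood estimates in the order (shape, location, scale); whether the paper's estimates were obtained this way is not stated in the text.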

Estimates of the marginal log-wage distribution

The Kolmogorov–Smirnov statistic (the maximal absolute difference between the empirical distribution function and the estimated CDF) equals 0.0168; multiplied by $\sqrt{n}$, it equals 0.921, which is well below the critical value even at the 20% significance level (e.g. Massey 1951).
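The arithmetic behind this comparison can be reproduced as follows. This is a sketch that takes the rounded statistic 0.0168 from the text and, as an assumption, uses the asymptotic Kolmogorov distribution available in SciPy (`kstwobign`) in place of the tabulated critical values.

```python
import math
from scipy.stats import kstwobign

n = 3010
ks_stat = 0.0168                    # rounded value reported in the text
scaled = math.sqrt(n) * ks_stat     # statistic on the sqrt(n) scale

# Asymptotic 20% critical value of the Kolmogorov distribution (assumed reference point).
crit_20 = float(kstwobign.ppf(0.80))
below_critical = scaled < crit_20
```

Because the input is rounded, `scaled` matches the reported 0.921 only up to the third decimal.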

The marginal distributions of the variables `ed76` and `age76` are categorical, with the number of categories being *k*_{1} = 18 for the former and *k*_{2} = 11 for the latter,^{12} and with categorical probabilities ${q}_{\ell}={\left({q}_{j}\right)}_{j=1}^{{k}_{\ell}}$, $\ell =1,2$, subject to $\sum _{j=1}^{{k}_{\ell}}{q}_{j}=1$. Let us denote the CMF of this distribution by ${G}_{v}(v|q)=\sum _{j=1}^{\lfloor v\rfloor}{q}_{j}$. The estimates are shown in the following tables.
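The CMF defined above is a cumulative sum of the categorical probabilities. A minimal sketch, assuming categories are indexed 1, …, *k* and using a hypothetical three-category probability vector:

```python
import numpy as np

def categorical_cmf(v, q):
    """G(v|q) = q_1 + ... + q_floor(v); zero below category 1, one from category k on."""
    q = np.asarray(q, dtype=float)
    j = int(np.floor(v))
    j = max(0, min(j, q.size))      # keep the index inside 0..k
    return float(q[:j].sum())

q_demo = [0.2, 0.5, 0.3]            # hypothetical probabilities, k = 3
cmf_values = [categorical_cmf(v, q_demo) for v in (0, 1, 2, 3)]
```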

Estimates of the marginal education distribution

Estimates of the marginal age distribution

Because the two or three components include both discrete and continuous variables, we extend the method of Anatolyev and Gospodinov (2010) for constructing a joint distribution from mixed marginals to the case of multiple values in the discrete marginal’s support^{13} using copula machinery. We employ the Gaussian copula because it is simple, convenient and easily interpretable, and extends naturally to higher dimensions with a reasonable increase in the number of parameters. When there is only one discrete regressor, the Gaussian copula has only one correlation parameter *ϱ*. It is derived in Appendix C that the joint density is

$$f(u,v|\theta )={f}_{u}\left(u|{\theta}_{u}\right){f}^{C}(u,v|\theta ),$$

where

$${f}^{C}(u,v|\theta )=\mathrm{\Phi}\left(\frac{{\mathrm{\Phi}}^{-1}(G(v|q))-\varrho {\mathrm{\Phi}}^{-1}({F}_{u}(u|{\theta}_{u}))}{\sqrt{1-{\varrho}^{2}}}\right)-\mathrm{\Phi}\left(\frac{{\mathrm{\Phi}}^{-1}(G(v-1|q))-\varrho {\mathrm{\Phi}}^{-1}({F}_{u}(u|{\theta}_{u}))}{\sqrt{1-{\varrho}^{2}}}\right)$$

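is a ‘distorted’ categorical probability, and $\theta ={\left({\theta}_{u}^{\mathrm{\prime}},\varrho ,{q}^{\mathrm{\prime}}\right)}^{\mathrm{\prime}}$ collects all parameters: 21 for `ed76` (3 in ${\theta}_{u}$, 1 copula correlation and *k*_{1} − 1 = 17 free categorical probabilities) or 14 for `age76` (3 + 1 + 10).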
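The two formulas above translate almost line by line into code. The sketch below is an illustration under stated assumptions rather than the authors' implementation: categories are indexed 1, …, *k*, SciPy's `skewnorm` stands in for ${F}_{u}$ and ${f}_{u}$, the function names are hypothetical, and the parameter values are made up.

```python
import numpy as np
from scipy.stats import norm, skewnorm

def distorted_probs(u, theta_u, rho, q):
    """Vector of f^C(u, v | theta) for v = 1..k."""
    mu, sigma, gamma = theta_u
    # Normal score of the continuous variable, Phi^{-1}(F_u(u)); clipped to stay finite.
    F = np.clip(skewnorm.cdf(u, a=gamma, loc=mu, scale=sigma), 1e-12, 1 - 1e-12)
    z_u = norm.ppf(F)
    # Thresholds Phi^{-1}(G(v|q)) for v = 0..k; the end points are -inf and +inf.
    G = np.concatenate(([0.0], np.cumsum(q)))
    G[-1] = 1.0
    cuts = norm.ppf(G)
    s = np.sqrt(1.0 - rho**2)
    return norm.cdf((cuts[1:] - rho * z_u) / s) - norm.cdf((cuts[:-1] - rho * z_u) / s)

def joint_density(u, v, theta_u, rho, q):
    """f(u, v | theta) = f_u(u | theta_u) * f^C(u, v | theta)."""
    mu, sigma, gamma = theta_u
    f_u = skewnorm.pdf(u, a=gamma, loc=mu, scale=sigma)
    return f_u * distorted_probs(u, theta_u, rho, q)[v - 1]

# Hypothetical inputs for illustration only.
theta_demo = (6.5, 0.5, -1.0)
q_demo = np.array([0.2, 0.5, 0.3])
probs = distorted_probs(6.0, theta_demo, 0.4, q_demo)
```

Two properties are easy to check numerically: for any *u* the distorted probabilities sum to one, and with *ϱ* = 0 they collapse to the undistorted probabilities *q*.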

Maximization of the joint (log) likelihood yields estimates of the parameters of the marginals that are very close to the figures reported above but have lower standard errors, and the estimates of the copula shown in the following table:

Estimates of the bivariate copula

One can see that the estimates of bivariate degrees of dependence are highly statistically significant and moderately large in value.

Figure 2: Estimated Mean Regression with Regressor `ed76` (Top Panel) or `age76` (Bottom Panel).

Figure 2 shows the estimated mean regressions. In the case of `ed76`, it may appear that the true functional form is linear, which is what the corresponding literature tends to focus on. In the case of `age76`, linearity does not seem to hold, but a low-order polynomial like a cubic form may be appropriate. To verify whether these conjectures hold, we first perform the test for a linear mean regression:
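For intuition on how a mean regression follows from the fitted joint density: since the marginal probability of category *v* is ${q}_{v}$, the conditional mean is $E[u|v]=\frac{1}{{q}_{v}}\int u\,{f}_{u}(u|{\theta}_{u})\,{f}^{C}(u,v|\theta )\,du$. The sketch below evaluates this integral numerically. It is not the authors' code: the function name, the integration range of eight scale units around *μ*, and all parameter values are assumptions made for illustration.

```python
import numpy as np
from scipy.integrate import quad
from scipy.stats import norm, skewnorm

def implied_mean_regression(theta_u, rho, q):
    """E[u | v] for v = 1..k under the skew-normal / Gaussian-copula model."""
    mu, sigma, gamma = theta_u
    q = np.asarray(q, dtype=float)
    G = np.concatenate(([0.0], np.cumsum(q)))
    G[-1] = 1.0
    cuts = norm.ppf(G)                      # thresholds, with -inf and +inf at the ends
    s = np.sqrt(1.0 - rho**2)

    def integrand(u, v):
        F = np.clip(skewnorm.cdf(u, a=gamma, loc=mu, scale=sigma), 1e-12, 1 - 1e-12)
        z = norm.ppf(F)
        f_c = norm.cdf((cuts[v] - rho * z) / s) - norm.cdf((cuts[v - 1] - rho * z) / s)
        return u * skewnorm.pdf(u, a=gamma, loc=mu, scale=sigma) * f_c

    lo, hi = mu - 8.0 * sigma, mu + 8.0 * sigma   # assumed to cover essentially all mass
    return np.array([quad(integrand, lo, hi, args=(v,))[0] / q[v - 1]
                     for v in range(1, q.size + 1)])

# Hypothetical inputs for illustration only.
theta_demo = (6.5, 0.5, -1.0)
q_demo = np.array([0.2, 0.5, 0.3])
m = implied_mean_regression(theta_demo, 0.4, q_demo)
```

With a positive *ϱ* the implied regression is increasing in *v*, and averaging it with weights *q* recovers the unconditional mean of *u*.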

$$\psi \left(v,\beta \right)=a+bv.$$

The test results are in the following table.

Results of testing for linearity in the bivariate case

The hypothesis of a linear regression form is decidedly rejected for both regressors at any conventional significance level; in fact, the test statistics exceed the critical values by a wide margin. We conclude that the form of the actual mean regression differs from what is usually assumed in regressions of wages on their determinants.

Labor econometricians often add to their linear regressions the square of a duration-related variable (e.g. work experience^{14}); Murphy and Welch (1990) show that even fourth powers may be needed. Therefore, we have also run the test with low-order polynomial hypothesized regression forms: ${\psi}_{2}\left(v,\beta \right)=a+bv+c{v}^{2}$ and ${\psi}_{4}\left(v,\beta \right)=a+bv+c{v}^{2}+d{v}^{3}+f{v}^{4}$. These functional forms are also rejected at any conventional significance level.

When there are two discrete regressors, the Gaussian copula has a 3 × 3 correlation matrix

$$R=\left[\begin{array}{ccc}1& {\varrho}_{0}& {\varrho}_{1}\\ {\varrho}_{0}& 1& {\varrho}_{2}\\ {\varrho}_{1}& {\varrho}_{2}& 1\end{array}\right]$$

with three distinct parameters *ϱ*_{0}, *ϱ*_{1}, *ϱ*_{2}. It is derived in Appendix C that the joint density is

$$f(u,{v}_{1},{v}_{2}|\theta )={f}_{u}(u|{\theta}_{u}){f}^{C}(u,{v}_{1},{v}_{2}|\theta ),$$

where

$$\begin{array}{rl}{f}^{C}(u,{v}_{1},{v}_{2}|\theta )& ={\mathrm{\Phi}}_{2}({\phi}_{1}\left({v}_{1}\right),{\phi}_{2}\left({v}_{2}\right)|{\phi}_{u}\left(u\right))-{\mathrm{\Phi}}_{2}({\phi}_{1}\left({v}_{1}-1\right),{\phi}_{2}\left({v}_{2}\right)|{\phi}_{u}\left(u\right))\\ & -{\mathrm{\Phi}}_{2}({\phi}_{1}\left({v}_{1}\right),{\phi}_{2}\left({v}_{2}-1\right)|{\phi}_{u}\left(u\right))+{\mathrm{\Phi}}_{2}({\phi}_{1}\left({v}_{1}-1\right),{\phi}_{2}\left({v}_{2}-1\right)|{\phi}_{u}\left(u\right))\end{array}$$

for ${v}_{1},{v}_{2}\in \{0,1\}$ are ‘distorted’ bivariate categorical probabilities, where

$$\begin{array}{rll}{\phi}_{\ell}\left(v\right)& =& {\mathrm{\Phi}}^{-1}({G}_{\ell}(v)),\phantom{\rule{1em}{0ex}}\ell =1,2,\\ {\phi}_{u}\left(u\right)& =& {\mathrm{\Phi}}^{-1}({F}_{u}(u)),\end{array}$$

and $\theta ={\left({\theta}_{u}^{\mathrm{\prime}},{\varrho}_{0},{\varrho}_{1},{\varrho}_{2},{q}_{1}^{\mathrm{\prime}},{q}_{2}^{\mathrm{\prime}}\right)}^{\mathrm{\prime}}$ collects all 33 parameters.
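The inclusion-exclusion formula for the trivariate case can be sketched numerically as below. Several points are assumptions rather than statements from the paper: the variables in *R* are taken to be ordered as (${v}_{1}$, ${v}_{2}$, *u*), so that *ϱ*_{0} links the two regressors while *ϱ*_{1} and *ϱ*_{2} link each regressor to *u*; ${\mathrm{\Phi}}_{2}(\cdot ,\cdot |c)$ is read as the CDF of the two regressor scores conditional on the normal score of *u* being *c*; infinite thresholds are replaced by finite stand-ins ten conditional standard deviations from the conditional mean; and the function name and inputs are hypothetical.

```python
import numpy as np
from scipy.stats import multivariate_normal, norm

def trivariate_distorted_probs(z_u, rho0, rho1, rho2, q1, q2):
    """Matrix of f^C(u, v1, v2) over all (v1, v2), given z_u = Phi^{-1}(F_u(u))."""
    # Conditional law of the two regressor scores given the score of u.
    mean = np.array([rho1, rho2]) * z_u
    cov = np.array([[1.0 - rho1**2, rho0 - rho1 * rho2],
                    [rho0 - rho1 * rho2, 1.0 - rho2**2]])
    cond = multivariate_normal(mean=mean, cov=cov)

    def thresholds(q, center, sd, width=10.0):
        G = np.concatenate(([0.0], np.cumsum(q)))
        G[-1] = 1.0
        # Finite stand-ins for -inf / +inf so the numerical CDF stays well behaved.
        return np.clip(norm.ppf(G), center - width * sd, center + width * sd)

    c1 = thresholds(q1, mean[0], np.sqrt(cov[0, 0]))
    c2 = thresholds(q2, mean[1], np.sqrt(cov[1, 1]))
    # Conditional bivariate CDF on the grid of thresholds.
    P = np.array([[cond.cdf([a, b]) for b in c2] for a in c1])
    # Rectangle probabilities by inclusion-exclusion, as in the displayed formula.
    return P[1:, 1:] - P[:-1, 1:] - P[1:, :-1] + P[:-1, :-1]

# Hypothetical inputs for illustration only.
q1_demo = np.array([0.2, 0.5, 0.3])
q2_demo = np.array([0.6, 0.4])
probs3 = trivariate_distorted_probs(0.3, 0.2, 0.4, 0.3, q1_demo, q2_demo)
```

The resulting matrix sums to one up to the numerical tolerance of the bivariate normal CDF, and with all three correlations set to zero it reduces to the outer product of the two vectors of categorical probabilities.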

Maximization of the joint (log) likelihood yields estimates of the parameters of the marginals that are very close to the figures reported above but have lower standard errors, and the estimates of the copula shown in the following table:

Estimates of the trivariate copula

One can see that the estimates of bivariate degrees of dependence *ϱ*_{1} and *ϱ*_{2} are very close to those from bivariate models with similar standard errors. The degree of dependence between the two regressors *ϱ*_{0} is estimated to be quite modest but significantly different from zero.

Figure 3: Estimated Mean Regression with Regressors `ed76` and `age76`.

Figure 3 shows the surface of the estimated mean regression, which is arguably close to a plane. We perform the test for a linear mean regression:

$$\psi \left({v}_{1},{v}_{2},\beta \right)=a+{b}_{1}{v}_{1}+{b}_{2}{v}_{2}.$$

The test results are:

Results of testing for linearity in the trivariate case

The hypothesis of a linear regression form is decidedly rejected for both regressors at any conventional significance level. We also repeat this exercise for three further forms. The first is quadratic in both regressors, ${\psi}_{22}\left({v}_{1},{v}_{2},\beta \right)=a+{b}_{1}{v}_{1}+{b}_{2}{v}_{2}+{c}_{1}{v}_{1}^{2}+{c}_{12}{v}_{1}{v}_{2}+{c}_{2}{v}_{2}^{2}$. The second, motivated by the study of Murphy and Welch (1990), is linear in education and quartic in age, ${\psi}_{14}\left({v}_{1},{v}_{2},\beta \right)=a+{b}_{1}{v}_{1}+{b}_{2}{v}_{2}+{c}_{2}{v}_{2}^{2}+{c}_{12}{v}_{1}{v}_{2}+{d}_{2}{v}_{2}^{3}+{d}_{12}{v}_{1}{v}_{2}^{2}+{f}_{2}{v}_{2}^{4}+{f}_{12}{v}_{1}{v}_{2}^{3}$. The third is the same form with age *v*_{2} replaced by potential experience, which equals ${v}_{2}+17-{v}_{1}$.^{15}
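To make the structure of ${\psi}_{14}$ concrete, the sketch below builds its nine regressors in the order in which the coefficients appear in the formula, together with the potential-experience substitution ${v}_{2}+17-{v}_{1}$. The function names are hypothetical and the inputs are arbitrary category values.

```python
import numpy as np

def psi14_regressors(v1, v2):
    """Columns multiplying (a, b1, b2, c2, c12, d2, d12, f2, f12) in psi_14."""
    v1 = np.asarray(v1, dtype=float)
    v2 = np.asarray(v2, dtype=float)
    cols = [np.ones_like(v1), v1, v2,
            v2**2, v1 * v2,
            v2**3, v1 * v2**2,
            v2**4, v1 * v2**3]
    return np.column_stack(cols)

def potential_experience(v1, v2):
    """Potential experience as defined in the text: v2 + 17 - v1."""
    return np.asarray(v2) + 17 - np.asarray(v1)

X = psi14_regressors([1, 2], [3, 4])
# The experience-based variant replaces v2 by potential experience.
X_exp = psi14_regressors([1, 2], potential_experience([1, 2], [3, 4]))
```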

These functional forms are also decidedly rejected at any conventional significance level. Evidently, the observable “bumps” in the curves and surface in Figures 2 and 3 are not due to sampling error alone but are built-in attributes of the shapes of the regressions. The overall results imply that the true mean regressions are unlikely to reduce to low-order polynomials in the conditioning variables and instead take more complex functional forms, which is at odds with popular empirical practice.^{16}
