‘Measure what is measurable, and make measurable what is not so.’ Galileo Galilei
The empirical evidence supporting the capital asset pricing model (CAPM) in the version of Sharpe (1964) and Lintner (1965) is far from convincing. Nevertheless, the CAPM is still a centerpiece of the asset pricing theory taught in MBA investment courses and is still widely used by practitioners. The reasons for its failure are manifold and have spawned a large body of literature. Besides its theoretical simplicity, which leaves room for numerous generalizations based on more realistic settings (multifactor models, conditional CAPM, consumption CAPM, etc.), the failure to produce convincing evidence in favor of the CAPM can also be attributed to the difficulties of its empirical implementation. This branch of the literature has generated numerous studies using alternative estimation and testing procedures.1 Fama and French (2004) offer a concise summary of the struggle to find empirical support for the CAPM.
This paper addresses the implications of using the return of a market index as a proxy for the return of the true (equilibrium) market portfolio for the estimation of the CAPM. In a certain sense, the measurement problem can be regarded as a primary one, because it concerns the existence of a workable empirical framework without questioning the rigorous theoretical assumptions underlying the CAPM as such. The measurement problem is at the heart of Roll’s (1977) critique, which argues that the true market portfolio includes a large range of investment opportunities, including international securities, real estate, precious metals, etc., so that the true value-weighted market portfolio is empirically elusive. In particular, he concludes (Roll 1977: 130): ‘The theory is not testable unless the exact composition of the true market portfolio is known and used in the tests.’ Moreover, because data limitations force the use of market proxies, empirical tests of the CAPM effectively test whether the market proxies are on the minimum variance frontier.
In the following, we take a different approach to Roll’s identification problem by assuming that the CAPM holds true and investigating the properties of the CAPM in the presence of the measurement problem. Using a linear projection framework within the Sharpe-Lintner version of the CAPM, we pick up Roll’s critique to account for the fact that the market index in empirical studies is only a proxy that correlates more or less strongly with the true but unobservable market index. This leads to a regression model with a non-zero intercept, which is observationally equivalent to a general one-factor model with non-zero excess returns. Consequently, in the presence of measurement error, the results of standard tests for the existence of abnormal returns, e.g. in the tradition of Gibbons et al. (1989), are rendered spurious. The same holds true for the two-pass cross-sectional regression methodology of Fama and MacBeth (1973), further refined by Shanken (1992) and Kan et al. (2013); see Hamerle (1996) in this journal, who points out this problem. Moreover, we show that measurement errors in the market return can also lead to systematic negative correlations between the estimated alphas and betas, which may falsely be interpreted as empirical evidence for the low risk puzzle (Frazzini/Pedersen 2014).
Our projection framework can serve as an alternative approach by providing an indirect test of the viability of the conventional CAPM: it tests the CAPM against a general factor model with measurement error. The basic idea underlying our testing strategy is to check whether the parameters of the CAPM with measurement error are statistically different from those of a more general model containing non-zero intercepts generated by both sources, measurement errors and true excess returns. Consequently, if the null hypothesis cannot be rejected, care should be taken in any further analysis of “excess” returns.
The sources of misspecification of the market return can be manifold and lead to different biases in the parameter estimates. Generally, the weights in the market index used for a CAPM regression may differ from the true weights because the index includes only a subset of assets and/or because the weights are misspecified even if the asset universe is correctly defined. The choice of an index consisting of a limited number of large stocks, such as the Dow Jones Industrial Average (DJIA), is an example of the first type of misspecification, while the choice between a volume-weighted and an equally weighted index is an example of the latter type. What at first seems to be an econometric problem has, for both types of measurement error, its roots in false theoretical model assumptions, e.g. assumptions about the fixed supply side in terms of volume or quantities.
Measurement error in the market index leads to a systematic bias in the parameter estimates and therefore may lead to erroneous investment strategies. As we will show in this study, the direction and severity of the bias of the CAPM parameters strongly depend on the nature of the measurement error. The consequences for the least squares estimates in the classical errors-in-variables case, where the market index differs from the latent true market return only by an additive idiosyncratic error, are well-known: First, estimates of the beta coefficients suffer from an attenuation bias, i.e. the estimated risk premia are biased towards zero. Second, the estimated intercepts are biased upwards such that positive alphas occur even if the CAPM holds true.
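The two consequences of the classical errors-in-variables case can be illustrated with a small simulation. The following is a minimal sketch, not the paper's simulation design: all parameter values, the function name `simulate_eiv` and the normality of the shocks are assumptions chosen for illustration. The CAPM holds by construction (zero alpha), and the proxy equals the latent market return plus independent noise.

```python
import numpy as np

def simulate_eiv(beta=1.2, mu_m=0.01, sd_m=0.05, sd_eta=0.03, sd_u=0.04,
                 T=200_000, seed=0):
    """OLS of an asset's excess return on a noisy market proxy (classical EIV).

    All default values are hypothetical and only serve the illustration.
    """
    rng = np.random.default_rng(seed)
    r_m_true = mu_m + sd_m * rng.standard_normal(T)        # latent market excess return
    r_j = beta * r_m_true + sd_u * rng.standard_normal(T)  # CAPM holds: true alpha is zero
    r_m_obs = r_m_true + sd_eta * rng.standard_normal(T)   # proxy = truth + additive noise

    # Least squares of the asset return on a constant and the proxy
    X = np.column_stack([np.ones(T), r_m_obs])
    alpha_hat, beta_hat = np.linalg.lstsq(X, r_j, rcond=None)[0]

    # Reliability ratio: share of the proxy's variance due to the true market return
    reliability = sd_m**2 / (sd_m**2 + sd_eta**2)
    return alpha_hat, beta_hat, reliability

alpha_hat, beta_hat, lam = simulate_eiv()
print(f"reliability ratio : {lam:.3f}")
print(f"true beta         : 1.200, estimated beta  : {beta_hat:.3f}")
print(f"true alpha        : 0.000, estimated alpha : {alpha_hat:.4f}")
```

Under these assumptions the estimated slope settles near the reliability ratio times the true beta, and the estimated intercept is positive because the mean of the market return is positive.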
This paper takes a closer look at the measurement error bias in the CAPM beyond the assumption of an additive and independent measurement error typically assumed in the statistics literature. In particular, we derive the bias for the CAPM alphas and betas under different assumptions on the type of misspecification of the market index. We show that the typical attenuation bias occurring in linear regression models with additive measurement error generally does not hold. By means of Monte Carlo simulations, where the return process is generated from an artificial capital market, we assess the size of the bias and provide practical guidance for the choice of the market index in empirical work. By simulating the data from an artificial capital market our Monte-Carlo set-up allows generating different proxies for the market return and their corresponding measurement errors as a function of the true underlying return process similar to the market indices used in empirical studies.
Finally, this paper presents a novel approach to estimating the CAPM in the presence of measurement error. In contrast to general systems of linear regression equations with measurement error, the CAPM contains the same mismeasured explanatory variable in each equation. Using the property that the CAPM with measurement error is a system of regression equations with nonlinear cross-equation restrictions, we present a new identification strategy, which is superior to instrumental variable approaches that typically suffer from the weak instrument problem, as market returns are only weakly autocorrelated. We introduce a minimum-distance approach that is easy to implement and that allows estimating three different versions of the CAPM: (i) the conventional CAPM without measurement error, (ii) the conventional CAPM with measurement error and (iii) a factor model with measurement error and excess returns under rather general assumptions on the type of the measurement error.
While the vast majority of empirical studies simply ignore the measurement problem or implicitly assume that its impact on the parameter estimates is negligible, only a few studies consider the impact of measurement error in the market return on the outcome of efficiency tests (e.g. Stambaugh 1982; Kandel/Stambaugh 1987; Shanken 1987). They show that a rejection of market efficiency based on the market proxy implies a rejection for the true portfolio if the true market portfolio is sufficiently correlated with the proxy (ca. 0.7 or larger). The projection framework considered in these studies, however, ignores that the measurement error might be endogenous, i.e. the orthogonality between the (rational expectation) error in the CAPM and the true market index is generally violated. Prono (2015) proposes a new measure of misspecification that accounts not only for the latency of the true market index and the resulting imperfect correlation between the market proxy and the true market index, but also for the effect of endogeneity on the CAPM estimates. Our study differs from previous studies on the impact of measurement error in the market index by deriving a theory-based measurement error, i.e. a measurement error that results from the underlying economic model and its data generating process.
Jagannathan and Wang (1996) take a different perspective by trying to get closer to the theoretical concept of the market return. They use a broader market proxy, which also takes into account the returns from human capital. In their empirical study of the conditional CAPM, based on the broader concept of the market return, the additional explanatory power of size and book-to-market variables becomes negligible. Unlike previous studies analyzing the potential impact of measurement error in the market proxy on efficiency tests, the focus of this study is to assess its impact on the CAPM estimates with obvious consequences for performance measures, choices of investment strategies and outcomes of efficiency tests. Rather than defining correlation bounds, we estimate the size of the attenuation bias (possibly the size of an amplification bias in some settings) directly. This yields new insights into the quality of different market proxies and provides evidence for the presence of spurious abnormal returns.
The paper is organized as follows. Section 2 introduces the theoretical framework under which the return generating process and the unobservable true market index are defined. Various types of misspecification of the market index and their consequences for estimation are considered as deviations from the true return generating process. Based on Monte Carlo simulations, Section 3 provides a quantitative assessment of the extent of the bias caused by different types of misspecification of the market index. Section 4 provides empirical evidence for the presence of measurement error in the market index using three different data sets and different definitions of the market returns. Section 5 concludes and gives an outlook on future research.
2 CAPM and measurement error in the market index
In the following, we consider a well-defined CAPM where asset returns are equilibrium outcomes from security markets with rational investors. Therefore, the data generating process for the returns and the true, but latent, market return is such that the CAPM holds by construction and identification of the model parameters is feasible. The initial set-up is based on common assumptions underlying the CAPM (e.g. Gourieroux/Jasiak 2001; Fan/Yao 2017) and is sufficiently flexible to allow for a number of generalizations concerning the price process and assumptions on the investors’ behavior. In a second step, we deviate from the world of a perfect data generating process by replacing the true market return by different proxies and derive the conditions under which the true model parameters are identifiable. Our strategy allows us to define more realistic proxies for the market return and study their stochastic implications beyond the conventional set-up of independent, additive or multiplicative measurement errors analyzed in the statistics literature. By comparing the estimates under correct specification with the ones under measurement error, we are able to identify uniquely the consequences of the misspecification on the pattern of the estimated parameters.
2.1 The baseline model
Consider the investment decision of a single investor holding N risky assets and one risk-free asset. The investor’s portfolio is given by the vector of quantities , where denotes the quantity of the risk-free asset and q the vector of quantities held in the risky assets. The price vector for the assets at time t is given by , where the price of the risk-free asset is taken as the numeraire. Expected portfolio wealth for period , , given information up to t, is given by . The allocation problem of investor i for period is given by:
where denotes the initial endowment in t, is the risk aversion parameter of investor i and . The optimal allocation for investor i takes the well-known form:
where is the vector of excess gains. Aggregate demand for a total number of I investors is given by:
with as the absolute risk aversion parameter of the market. Since most empirical studies use excess returns instead of excess gains, we reformulate excess gains in terms of excess returns such that , where is the vector of excess returns with as the N × 1 vector of the return rates on the risky assets and an N-dimensional diagonal matrix with the elements of the price vector on its main diagonal. In terms of the excess returns, the aggregate demand takes the form: (1)
For the definition of the market index the assumption on the supply of assets is absolutely crucial. In what follows, we assume for the supply of the risky assets: (2)
where is the N-dimensional vector of constants. Assumption eq. (2) implies that in every investment period t the market capitalization for each asset supplied is fixed and time-invariant. As we will see below, this assumption is absolutely crucial for obtaining a conventional CAPM defined in excess returns and a value-weighted market index.2 Therefore, proxies for the market return can be expressed in terms of deviations from .
The market equilibrium yields a return process of the form: (3)
where the expectation error is a martingale difference sequence. The excess return of the market is defined as a value-weighted index including the excess returns of all N risky assets. Given the supply side assumption eq. (2) this yields: (4)
where the asterisk indicates that the true market return is a latent random variable depending on the unknown parameter vector . The process for the excess returns of the market takes on the form: (5)
In this general set-up, the return process eq. (3), as well as the process for the market return, is conditionally heteroskedastic with time-varying intercepts leading to a conditional CAPM with time-varying betas. In the following, we assume a homoskedastic process for the returns, , in order to obtain the Sharpe-Lintner version of the CAPM. Under homoskedasticity, the processes for the returns and the market return simplify to: (6) (7)
with and . Without loss of generality, we can define the CAPM as the set of linear projection equations of the excess returns on the excess market return : (8)
where the vector of intercepts of the linear projection equations (CAPM alphas) vanishes, since . By the properties of linear projection, the vector of CAPM betas is given by: (9)
Note that the CAPM betas are a function of the true, unobservable weighting scheme, , and the variance-covariance matrix of the vector of returns of the entire asset universe Ω.
By definition of the linear projection and are orthogonal. Therefore, β can be estimated consistently by least squares provided is observable. Replacing and in eq. (8) by their right-hand side terms from eqs. (6) and (7), respectively, and substituting eq. (9) instead of β yields the error term of the CAPM equation, , as a function of the error terms of the return process: (10)
with being homoskedastic and linearly dependent, since .
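The structure of the baseline model can be checked numerically. The sketch below assumes that the projection betas take the standard form Ωw/(w'Ωw), where `w` stands for the true value weights and `omega` for Ω; the displayed equations are not reproduced here, so this form, the variable names and the randomly generated inputs are assumptions for illustration. Under this form the value-weighted average of the betas equals one, the market return is orthogonal to the CAPM errors, and the value-weighted CAPM errors sum to zero, which is one way to read the linear dependence noted above.

```python
import numpy as np

def capm_betas(omega, w):
    """Projection betas of each asset's excess return on the market return w'y.

    Assumes the market index is the weighted sum of asset returns with weights w.
    """
    cov_with_market = omega @ w      # Cov(y_j, w'y) for every asset j
    var_market = w @ omega @ w       # Var(w'y)
    return cov_with_market / var_market

# Hypothetical inputs: an arbitrary positive definite covariance matrix and weights
rng = np.random.default_rng(1)
N = 5
A = rng.standard_normal((N, N))
omega = A @ A.T + N * np.eye(N)
w = rng.random(N)
w = w / w.sum()

beta = capm_betas(omega, w)

# CAPM error as a linear map of the return innovations: u_t = (I - beta w') eps_t
M = np.eye(N) - np.outer(beta, w)

print("weighted average beta :", w @ beta)                       # equals one
print("Cov(u, market return) :", np.round(M @ omega @ w, 12))    # orthogonality
print("w'M                   :", np.round(w @ M, 12))            # linear dependence
```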
2.2 Misspecified index weights
The true CAPM is given by the relationship between the two processes defined in eqs. (6) and (7). In the following, we consider the linear relationships between the return process and alternative specifications of the market index based on an observable weighting vector . As shown below, the specific assumptions on b and its relationship to the true weighting index lead to different identification conditions concerning alpha and beta.
2.2.1 Weights with random measurement errors: CAPM-RME
Consider first the case of a CAPM regression model with random measurement errors in the weights of the market index (CAPM-RME). The actual weights b used to construct the market index deviate randomly from the true weights : (11)
where ν is the vector of random measurement errors defining the deviations of the observable weights from the latent true ones. The measurement errors are assumed to be independent of the return process with and .3 Using the mismeasured weights, b, from eq. (11) instead of in eq. (4) we obtain the observable market index, , based on the observable but erroneous weights. Expressing in terms of the latent market index yields: (12)
Thus the true market return and the observable market index differ by the additive measurement error, . Since , the return of the market index varies randomly around the true market return. Note that , which implies that the mean of the true market return can be estimated by the mean of its observable counterpart.
Replacing in the true CAPM equation (8) by gives the CAPM based on the observable index: (13)
with . Contrary to the true market return its proxy does not satisfy the orthogonality condition with the error term, , so that least squares estimation of eq. (13) yields inconsistent parameter estimates. Equation (13) with its additive measurement error defined in eq. (12) appears to be observationally equivalent to a classical linear errors-in-variables (EIV) model (e.g. Fuller 1987). However, contrary to the classical EIV-model, lagged market proxies cannot serve as instruments. Due to the martingale properties of the market returns, the true market index is uncorrelated with previous market returns, , but the time-invariant measurement errors in the weights generate an equi-autocorrelation in the observed market returns with . However, the past market returns remain correlated with the overall error term in the regression, . Therefore, standard IV or GMM approaches using lagged market returns break down in the presence of a random measurement error as given by eqs. (11) and (12), respectively.
Although IV estimation is infeasible, the parameters of the CAPM can nevertheless be identified by exploiting the information on the first and second moments of the market proxy. In order to detect the relationship between identifiable estimable parameters and the true model parameters consider first the linear projection of on : (14)
with and . The error term is orthogonal to so that consistent estimates of and can be obtained.
Note that , which is the usual reliability ratio in EIV models. Least squares estimation of the parameters leads to the well-known attenuation bias for the slope coefficient: the estimates of the CAPM betas are driven towards zero, so that in the presence of measurement error the least squares estimates suggest too weak a dependence on market risk.
Moreover, the CAPM-RME yields positive intercepts, . Thus in the absence of abnormal returns (α = 0), the CAPM-RME mimics spurious abnormal returns, even if the CAPM holds true. Consequently, in the presence of measurement error tests on the existence of abnormal returns ignoring the measurement error are jointly testing the validity of the CAPM and the absence of measurement error.
The sparse parametrization of the standard CAPM model and the parsimonious parametrization of the measurement error reflected only by the unknown parameter yields an overidentified model with only unknown parameters compared to 2N identifiable reduced form parameters given by and . The strong degree of overidentification simply results from the fact that (i) the measurement error affects all N equations in the same way through the reliability ratio and (ii) the asset specific intercepts are nonlinear functions of the reliability ratio and the true betas. Even without exploiting the nonlinear cross-equation restrictions the true beta and the reliability ratio can be identified by a single equation estimate provided the CAPM holds true. In order to see this, consider the nonlinear restriction for the model parameters for a single equation j with and . Then the reliability ratio is identified as the solution of the two equations as (16)
A simple estimate of the reliability ratio can be obtained by replacing the unknown parameters in eq. (16) by the least squares estimates from eq. (15), while can be estimated by the mean of the excess returns. Since this simple procedure generates estimates for every equation j, it seems meaningful to take the average over the single equation estimates in order to stabilize the results. Once is determined, and are identified.
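The single-equation procedure can be sketched as follows. Since the displayed equations are not reproduced here, the sketch assumes the classical EIV restrictions, namely a reduced-form slope equal to λβ_j and a reduced-form intercept equal to (1 − λ)β_j μ_m, where λ is the reliability ratio and μ_m the mean market excess return. Solving the two restrictions gives λ = slope · μ_m / (intercept + slope · μ_m). The function name `estimate_reliability` and all simulated values are hypothetical.

```python
import numpy as np

def estimate_reliability(returns, proxy):
    """Average of single-equation estimates of the reliability ratio.

    returns : (T, N) array of asset excess returns
    proxy   : (T,) array with the excess return of the market proxy
    """
    T = len(proxy)
    X = np.column_stack([np.ones(T), proxy])
    coef, *_ = np.linalg.lstsq(X, returns, rcond=None)   # shape (2, N)
    alpha_tilde, beta_tilde = coef                        # reduced-form estimates
    mu_hat = proxy.mean()                                 # estimates the mean market return

    # Assumed restrictions: alpha = (1 - lam) * beta * mu and slope = lam * beta
    lam_j = beta_tilde * mu_hat / (alpha_tilde + beta_tilde * mu_hat)
    lam_hat = lam_j.mean()                                # average over equations
    beta_hat = beta_tilde / lam_hat                       # bias-corrected betas
    return lam_hat, beta_hat

# Hypothetical data in which the CAPM holds and the proxy has additive noise
rng = np.random.default_rng(42)
T, N = 200_000, 8
beta_true = np.linspace(0.6, 1.4, N)
r_m_true = 0.01 + 0.05 * rng.standard_normal(T)
returns = np.outer(r_m_true, beta_true) + 0.04 * rng.standard_normal((T, N))
proxy = r_m_true + 0.03 * rng.standard_normal(T)

lam_hat, beta_hat = estimate_reliability(returns, proxy)
print("estimated reliability ratio:", round(lam_hat, 3))
print("corrected betas:", np.round(beta_hat, 2))
```

The sample is deliberately very long. With a realistic number of monthly observations the estimated intercepts are noisy relative to their size, which is one reason for averaging over equations.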
Obviously, the estimation of the system of equations (15) by GMM or Minimum Distance estimation yields asymptotically more efficient estimates.4 In Section 4, we present an empirical application of this identification strategy based on the minimum distance estimation.
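A joint estimate that uses all equations at once can be sketched with a simple minimum distance criterion. This is not the estimator used in Section 4: it uses an identity weighting matrix instead of an efficient one, a grid search over the reliability ratio, and the same assumed classical EIV restrictions (slope λβ_j, intercept (1 − λ)β_j μ_m). All names and simulated values are hypothetical.

```python
import numpy as np

def reduced_form(returns, proxy):
    """Equation-by-equation OLS intercepts and slopes on the market proxy."""
    X = np.column_stack([np.ones(len(proxy)), proxy])
    coef, *_ = np.linalg.lstsq(X, returns, rcond=None)
    return coef[0], coef[1]

def minimum_distance(alpha_tilde, beta_tilde, mu_hat, grid=None):
    """Identity-weighted minimum distance estimate of (lambda, beta_1, ..., beta_N)."""
    if grid is None:
        grid = np.linspace(0.01, 1.0, 991)
    best_loss, best_lam, best_beta = np.inf, None, None
    for lam in grid:
        c = (1.0 - lam) * mu_hat
        # For a fixed lambda the criterion is quadratic in each beta_j,
        # so the minimizing beta_j is available in closed form.
        beta = (c * alpha_tilde + lam * beta_tilde) / (c**2 + lam**2)
        loss = np.sum((alpha_tilde - c * beta) ** 2 + (beta_tilde - lam * beta) ** 2)
        if loss < best_loss:
            best_loss, best_lam, best_beta = loss, lam, beta
    return best_lam, best_beta

# Hypothetical data: CAPM holds, proxy carries additive noise
rng = np.random.default_rng(3)
T, N = 200_000, 8
beta_true = np.linspace(0.6, 1.4, N)
r_m_true = 0.01 + 0.05 * rng.standard_normal(T)
returns = np.outer(r_m_true, beta_true) + 0.04 * rng.standard_normal((T, N))
proxy = r_m_true + 0.03 * rng.standard_normal(T)

alpha_tilde, beta_tilde = reduced_form(returns, proxy)
lam_md, beta_md = minimum_distance(alpha_tilde, beta_tilde, proxy.mean())
print("minimum distance reliability ratio:", round(lam_md, 3))
print("minimum distance betas:", np.round(beta_md, 2))
```

An efficient version would replace the identity weighting by the inverse of the estimated covariance matrix of the reduced-form coefficients.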
The CAPM-RME can be generalized to the case where only M < N assets define the market index and the remaining assets are ignored and do not enter the market proxy with a weighting vector given by:
where is the sub-vector of of dimension M and is the corresponding vector of measurement errors. The relationship between the true market return and the market proxy becomes with . In this case, the mean of the market proxy deviates from the mean of the true market return. A linear projection representation is feasible, but identification can only be obtained under additional assumptions. In the Appendix A.1, we show that if (i) the ratio of the mean of the market proxy and the mean of the true market is given and (ii) the correlation between the returns of the assets included and the assets excluded from the index is small and/or the true index weights are small, the bias resulting from the correlation between and is negligible.
2.2.2 Weights with fixed measurement error: CAPM-FME
Consider now the case where the differences between weights for the market index and the true market return are fixed such that (17)
where Δ denotes the vector of fixed deviations from the true weights. This case includes a number of interesting special cases. For instance, if the equally weighted index, , is used instead of , the market proxy is simply the average over all return rates in the asset universe. If the proxy is based only on a subset of M < N assets, the deviation of the weights of the proxy from the true weights takes the form , such that the first M assets receive some positive (most likely erroneous) weights, while the weights of the remaining assets are ignored. The return of the observable market index takes the form: (18)
Replacing in eq. (8) by gives a system of CAPM equations based on the fixed error market proxy: (19)
with and . Contrary to the random error case given by eq. (13), the observable system of CAPM equations contains nonzero intercepts, such that a test for the existence of abnormal returns would be misleading. Moreover, the market proxy is no longer orthogonal to the error term, . Therefore, estimation approaches based on the orthogonality assumption between the market proxy and the error term are inconsistent.
Replacing the latent market index by its linear projection on the observed index does not help, since the overall error term would also be correlated with the market proxy. Finally yet importantly, the size of the coefficient in the projection equation cannot be derived in the fixed error case, so that no ex-ante statements on the direction of the bias for the estimates of β can be derived. In this sense, the case of misspecification of the true index weights is the worst case scenario. Its implications are fully congruent with the arguments of Roll’s critique.
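The population consequences of fixed weight errors can be computed directly. The sketch below assumes that equilibrium expected excess returns are proportional to Ωw, with the risk aversion parameter γ as the factor of proportionality, and that projection coefficients on any index with weights b are Ωb/(b'Ωb). Both forms are assumptions consistent with the fixed-supply set-up, not reproductions of the displayed equations, and the covariance matrix and weights are randomly generated for illustration.

```python
import numpy as np

def population_projection(omega, w_true, b, gamma=0.4):
    """Population intercepts and slopes from projecting returns on the index b'y."""
    mu = gamma * omega @ w_true             # assumed equilibrium mean excess returns
    beta_b = omega @ b / (b @ omega @ b)    # slopes with respect to the proxy
    alpha_b = mu - beta_b * (b @ mu)        # intercepts with respect to the proxy
    return alpha_b, beta_b

# Hypothetical asset universe
rng = np.random.default_rng(7)
N, M = 10, 3
A = rng.standard_normal((N, N))
omega = A @ A.T / N + 0.5 * np.eye(N)
w_true = np.sort(rng.random(N))[::-1]       # largest assets first
w_true = w_true / w_true.sum()

b_equal = np.full(N, 1.0 / N)               # equally weighted index
b_subset = np.zeros(N)                      # index of the M largest assets,
b_subset[:M] = w_true[:M] / w_true[:M].sum()  # with renormalized true weights

alpha_true, beta_true = population_projection(omega, w_true, w_true)
alpha_eq, beta_eq = population_projection(omega, w_true, b_equal)
alpha_sub, beta_sub = population_projection(omega, w_true, b_subset)

print("alphas, true index    :", np.round(alpha_true, 10))
print("alphas, equal weights :", np.round(alpha_eq, 4))
print("alphas, subset index  :", np.round(alpha_sub, 4))
print("beta ratio, subset    :", np.round(beta_sub / beta_true, 3))
```

With the true weights all intercepts vanish. With either proxy they do not, and the ratio of proxy betas to true betas differs across assets and can lie on either side of one, so no uniform attenuation factor exists in this case.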
It is important to emphasize that the underlying idea for our model choice, on which our simulations are based, is to stay as simple as possible in all aspects of the CAPM and to investigate in depth how close our estimates can get to the true parameters in the presence of measurement errors. Thus our simulation exercise asks a very simple, admittedly purist, question: “What if we know the true capital market (preferences of investors, underlying data generating process etc.) except for the weights of the market index (or in theoretical terms the correct specification of the supply side)? How well do our empirical models perform, and by how much does the presence of measurement errors in the market index bias our findings and lead to possible misinterpretations of market behavior, including erroneous investment decisions?”
3 Monte Carlo evidence
3.1 Simulation design
By means of Monte Carlo simulations for returns generated from an artificial capital market, we illustrate below how and to what extent different proxies of the market index influence the quality of the CAPM parameter estimates. By simulating from a well-defined artificial capital market, for which the CAPM holds, we can define market proxies as the outcome of the true data generating process combined with misspecified weights rather than imposing arbitrary stochastic assumptions on the error process.
Our simulation study is based on 10,000 Monte Carlo samples of monthly excess return series for an asset universe of N = 205 assets over 10 years. The data generating process for the excess returns is given by:
The variance-covariance matrix Ω was chosen to be equal to the sample variance-covariance matrix calculated from monthly data on excess returns of 205 components of the S&P500 index from January 1, 1974 to May 1, 2015. For these 205 stocks, a sufficiently long time series was available to obtain a reliable estimate of the high-dimensional covariance matrix Ω. In order to use realistic values for the true weight vector we use their empirical counterparts. More precisely, for each of the 205 stocks of the S&P500 index we compute the mean value of the market capitalization based on monthly data from January 1, 1974 to May 1, 2015 and define the true market weights as a proportion of the total market capitalization of the 205 stocks. Finally, the coefficient of risk aversion γ was chosen to be equal to 0.4. Following our baseline model, we assume a fixed supply of assets as in eq. (2), with the true market index generated according to eq. (4). In a second step, we generate proxies of the true market index under different types of measurement error. Table 1 summarizes the five different weighting strategies for the market proxy used in the Monte Carlo study.
For the case of random measurement errors, we consider three different choices for the variance of the measurement error, . For our analysis of the market proxies based on the subsets of the asset space, we choose subsets covering 18 %, 25 % and 50 % of the total market capitalization. For these subsets, only assets with the largest weights are selected, so that the indices are more comparable to real-world market indices. We estimate the CAPM parameters for 15 randomly drawn assets by the seemingly unrelated regression (SUR) method. Table 2 summarizes our findings for market proxies based on the total asset space and random error in the weighting scheme, while Table 3 contains the results for indices based on specific subsets. Both tables summarize our findings by reporting the means of the estimates for 15 selected assets. The detailed results for each of the 15 assets are given in Table 6 in Appendix A.3.
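A scaled-down version of such a design can be sketched as follows. It is not the paper's simulation: the covariance matrix is a hypothetical one-factor structure rather than the S&P500-based estimate, the universe has 20 instead of 205 assets, the weights and the standard deviation of the weight errors are made up, expected excess returns are assumed to equal γΩw, and only 200 replications are drawn. Because every equation contains the same regressor, SUR point estimates coincide with equation-by-equation least squares, which is what the sketch uses.

```python
import numpy as np

def ols_alpha_beta(returns, index):
    """Equation-by-equation OLS of each return series on a constant and the index."""
    X = np.column_stack([np.ones(len(index)), index])
    coef, *_ = np.linalg.lstsq(X, returns, rcond=None)
    return coef[0], coef[1]

def run_monte_carlo(n_rep=200, T=120, N=20, gamma=0.4, sigma_nu=0.1, seed=0):
    rng = np.random.default_rng(seed)
    # Hypothetical one-factor covariance matrix and value weights
    load = np.linspace(0.03, 0.08, N)
    omega = np.outer(load, load) + 0.004 * np.eye(N)
    w_true = 1.0 / np.arange(1, N + 1)
    w_true /= w_true.sum()
    mu = gamma * omega @ w_true                       # assumed equilibrium means
    beta_pop = omega @ w_true / (w_true @ omega @ w_true)
    chol = np.linalg.cholesky(omega)

    stats = {"alpha_true": [], "beta_true": [], "alpha_proxy": [], "beta_proxy": []}
    for _ in range(n_rep):
        y = mu + rng.standard_normal((T, N)) @ chol.T   # T x N excess returns
        b = w_true + sigma_nu * rng.standard_normal(N)  # randomly mismeasured weights
        a0, b0 = ols_alpha_beta(y, y @ w_true)          # regression on the true index
        a1, b1 = ols_alpha_beta(y, y @ b)               # regression on the proxy
        stats["alpha_true"].append(a0.mean())
        stats["beta_true"].append(b0.mean())
        stats["alpha_proxy"].append(a1.mean())
        stats["beta_proxy"].append(b1.mean())

    out = {key: float(np.mean(val)) for key, val in stats.items()}
    out["beta_population"] = float(beta_pop.mean())
    return out

results = run_monte_carlo()
for key, val in results.items():
    print(f"{key:16s} {val: .4f}")
```

The weight errors are drawn once per replication and held fixed over time, so that each replication has a time-invariant mismeasured index.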
For comparison, the second column of Table 2 contains the results for the CAPM when the true market return is observable. Since these estimates are obtained under the true data generating process, they differ from the true model parameters only by the sampling error. Therefore, these estimates can serve as a benchmark for the estimates using market proxies. Under the true data generating process, the CAPM alphas are on average close to their theoretical value of zero. The empirical rejection rate for the null hypothesis of a zero alpha is close to the 5 % significance level, which indicates that the sample size of T = 120 chosen for the Monte Carlo simulations is sufficiently large to produce estimates that come close to the true parameters given the distributional assumptions and the true market index. We also report the results for the joint test of the absence of abnormal returns. Since the errors in the simulations are normally distributed, the F-test is the appropriate choice for a finite sample test. We also report the Wald statistics based on the true Ω to avoid the finite sample distortions that accompany Wald tests based on large-dimensional estimated covariance matrices. For the true CAPM without measurement error, both tests show an empirical rejection rate close to the nominal 5 % significance level, so that we can conclude that the two joint tests do not suffer from any finite sample deficiencies that may distort the test results for the models based on proxies of the market index.
It is important to note that in the case of the CAPM with measurement error the alternative hypothesis of non-zero intercepts is true for the regression on the proxy. Asymptotically, the rejection rate of the null of zero intercepts therefore approaches unity: the power of the test approaches one as the Type II error vanishes, although with respect to the underlying true CAPM every such rejection is a false one. Therefore, the stronger the measurement error, the more the model’s intercepts deviate from the null hypothesis, which explains why in Table 2 the empirical rejection rates increase with the variance of the measurement error.
Only for the case of a small measurement error, , the CAPM-RME shows negligible distortions of the parameter estimates. The empirical correlation between the true market returns and the proxy is 0.96 and the attenuation bias reflected by is small. For this mild case of measurement error, we find an empirical rejection rate of the null of no abnormal returns of 8 %. For the intermediate case with the correlation between true market return and the market proxy appears to be rather high with . However, for more than 47 % of our estimates we find abnormal returns mimicking the existence of potential profits from trading of the average size of 2.7 % per month. For the case of a large measurement error, the situation deteriorates even more, although the correlation between true market return and the proxy remains above 0.5.
All three scenarios demonstrate that the correlation between the true market return and the market proxy provides insufficient information to make any conclusion about the bias in the CAPM estimates. Contrary to Prono (2015), our results indicate that, after all, what matters is the coefficient on the linear projection , which consists of the product of the square root of the reliability ratio and the correlation coefficient.5 The situation does not improve when multi-factor models instead of the CAPM are considered. In multiple regression models with measurement error in one variable the distortions spill over to the estimates of the other parameters and produce size distortions (see Brunner/Austin 2009).
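The decomposition of the projection coefficient follows from the definition of a linear projection. In the sketch below the symbols are introduced for illustration and are not taken from the paper's displayed equations: $r_m^{*}$ is the true market excess return, $r_m$ the proxy, $\rho$ their correlation, $\lambda$ the reliability ratio and $\theta$ the slope of the projection of $r_m^{*}$ on $r_m$.

```latex
\theta
  = \frac{\operatorname{Cov}(r_m^{*}, r_m)}{\operatorname{Var}(r_m)}
  = \rho \,\frac{\sqrt{\operatorname{Var}(r_m^{*})}}{\sqrt{\operatorname{Var}(r_m)}}
  = \rho \sqrt{\lambda},
\qquad
\lambda = \frac{\operatorname{Var}(r_m^{*})}{\operatorname{Var}(r_m)} .
```

In the classical case with an additive error that is uncorrelated with $r_m^{*}$, $\rho^{2} = \lambda$ and hence $\theta = \lambda$. In general, two proxies with the same correlation $\rho$ imply different values of $\theta$ whenever their variance ratios differ, which is why the correlation alone is not informative about the bias.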
Moreover, the measurement error also has a considerable impact on the precision of the parameter estimates. Compared to the RMSE for the benchmark model, the RMSE for beta increases by 40 % in the case of a small measurement error and almost doubles for the medium size measurement error. The loss in estimation precision has serious implications for concrete applications and interpretations of the CAPM estimates. For instance, the increase of the RMSE due to the measurement error also increases the risk of a faulty sorting into defensive and aggressive stocks. For example, in the case of the medium size measurement error based on the empirical rejection rate of the null hypothesis , we detect that only 2 stocks have a beta coefficient significantly larger than 1, while in the case of no measurement error we detect 5 aggressive stocks. A larger variance-covariance matrix of the parameter estimates also reduces the power of detecting cumulative abnormal returns based on the CART-test (see Campbell et al. (1997), Chapter 4).
Table 3 contains the results for the case of fixed measurement errors, where the market proxies are based on a subset of the asset universe. We consider seven different scenarios with fixed measurement errors. The first three cases (columns 2–4) capture the situation in which the market proxy is based on the normalized true weights of the largest assets covering 18 %, 25 % and 50 % of total market capitalization, respectively. All indices are constructed such that they include only the largest assets, while ignoring the remaining smaller assets of the asset universe.6
In terms of the number of constituents, the indices are based on the 4, 7 and 20 largest stocks of the asset universe. This rather small number of included stocks may seem unreasonable at first glance. However, one should keep in mind that the true market universe in our simulation contains only 205 stocks. In empirical studies, the proxies most often used are the Dow-Jones Industrial Average Index, the S&P500 index and the CRSP index. These indices cover 30, 500 and nearly 4,000 stocks traded on the U.S. stock market, respectively. In terms of market capitalization, the S&P500 covers roughly 76 % of the total market capitalization of the 5,200 actively traded companies listed on the NYSE and the NASDAQ. This, however, is far from the capitalization of the entire true market portfolio.
The last four columns of Table 3 contain the estimates for models where the true weighting scheme is replaced by equal weights. The last column gives the estimation results when the index is based on all assets but equal weights of size 1/205 are used instead of . These estimates come rather close to the ones obtained when the true market index is used. Both the reliability ratio and the correlation between the true index and the proxy are close to one. This particular finding, however, should not be overemphasized, because it depends strongly on the underlying Monte-Carlo design. The true weights and the equal weights of size 1/205 do not differ very much, so that on average the fixed measurement error turns out to be rather small. The results show that the quality of the estimates improves monotonically with the number of assets used. The estimates with the index covering 50 % of total market capitalization are slightly superior to the estimates based on the other two scenarios. If the index represents only 18 % of the market capitalization, the estimates reveal the largest biases and mimic on average abnormal returns of 3.9 %.
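The two ways of constructing a proxy from a subset of the asset universe can be sketched as follows. This is an illustration under assumptions, not the Monte-Carlo design of the paper: the size distribution, the return process and the function name `subset_proxy_weights` are hypothetical; only the universe of 205 assets and the index of the 20 largest stocks are taken from the text.

```python
import numpy as np

def subset_proxy_weights(true_weights, n_included, equal=False):
    """Weights of a proxy built from the n_included largest assets.

    equal=False: true weights of the included assets, renormalized to one.
    equal=True : weight 1/n_included on each included asset.
    Excluded assets receive a zero weight in both cases.
    """
    w = np.asarray(true_weights, dtype=float)
    largest = np.argsort(w)[::-1][:n_included]   # indices of largest weights
    proxy_w = np.zeros_like(w)
    if equal:
        proxy_w[largest] = 1.0 / n_included
    else:
        proxy_w[largest] = w[largest] / w[largest].sum()
    return proxy_w

rng = np.random.default_rng(1)
N, T = 205, 60                                   # universe size from the text
raw = rng.pareto(1.5, size=N) + 0.1              # hypothetical skewed sizes
true_w = raw / raw.sum()
returns = 0.01 + 0.05 * rng.standard_normal((T, N))   # hypothetical returns

true_market = returns @ true_w
w_norm_20 = subset_proxy_weights(true_w, 20)
w_equal_all = subset_proxy_weights(true_w, N, equal=True)
proxy_norm_20 = returns @ w_norm_20
proxy_equal_all = returns @ w_equal_all

coverage_20 = np.sort(true_w)[::-1][:20].sum()   # share of total capitalization
print(f"coverage of the 20 largest assets: {coverage_20:.2%}")
print(f"corr(true, normalized 20): {np.corrcoef(true_market, proxy_norm_20)[0, 1]:.3f}")
print(f"corr(true, equal all):     {np.corrcoef(true_market, proxy_equal_all)[0, 1]:.3f}")
```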
Our simulations based on the equally weighted index generate a specific correlation pattern between the estimated alphas and betas depending on the number of stocks included in the index. This pattern is depicted in Figure 1. When more than 60 of the largest stocks out of the 205 stocks are included, we find a negative correlation between the estimated alphas and betas, with the strongest negative correlation for the case when all assets are included with equal weights. What appears to be evidence for the low risk puzzle is in our case simply the outcome of the specific measurement error of the market index. In fact, Frazzini and Pedersen (2014) use the same construction of the market index in their study on the low risk premium.
In Figure 2 the top left and top right plots give the empirical rejection rates of no abnormal returns for indices based on different subsets of the asset space. For the case based on fixed measurement error with normalized weights, the empirical rejection rates decrease the more representative the market proxy becomes for the entire market, i.e. the over-rejection rate for the CAPM alphas and the evidence in favor of (spurious) abnormal returns decrease with an increasing fraction of total market capitalization captured by the market proxy. For scenarios where the market proxy captures 10 % or less of the total market capitalization, the empirical rejection rate strongly exceeds the nominal 5 % level. Note that for the equally weighted index we do not observe such a strict monotonicity (see upper right panel). Here we find an optimal number of stocks that minimizes the difference between the empirical and the nominal rejection rate. For our design, this is obtained if around 30 % of the largest stocks are constituents of the index. Equally weighted indices including more stocks turn out to be less representative and proxy the true market index less well.
The Receiver Operating Characteristics (ROC) curves depicted in the bottom right and bottom left plots of Figure 2 provide more insight into the relation between the Type I error and the power of the tests for the absence of abnormal returns depending on the crudeness of the market proxy. For the least representative index with M = 4 the probability of falsely detecting abnormal returns due to measurement error is substantially larger than for the other two cases.7
3.2 Effects on performance measures
CAPM parameter estimates are frequently used to compute performance measures for single assets or portfolios. The bias in the parameter estimates carries over directly to these performance measures. Consider, for instance, the Sharpe ratio and the Treynor ratio for asset j = 1,…,N:
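For reference, the two measures can be written in their textbook form. The notation is an assumption made for illustration and need not coincide with the display used in the paper: $Z_j$ denotes the excess return of asset $j$, $Z_m$ the excess return of the true market portfolio, $\sigma_j$ the standard deviation of $Z_j$ and $\beta_j$ the CAPM beta.

```latex
\[
  SR_j = \frac{\mathrm{E}[Z_j]}{\sigma_j},
  \qquad
  TR_j = \frac{\mathrm{E}[Z_j]}{\beta_j},
  \qquad j = 1,\dots,N .
\]
% If the CAPM holds, E[Z_j] = beta_j E[Z_m], so that
\[
  SR_j = \frac{\beta_j \,\mathrm{E}[Z_m]}{\sigma_j},
  \qquad
  TR_j = \mathrm{E}[Z_m] \quad \text{for all } j .
\]
```

Under these definitions the Treynor ratio is the same for every asset when the CAPM holds, and any estimate of $\beta_j$ that is biased towards zero inflates the estimated Treynor ratio.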
In the last two rows of Tables 2 and 3, we report the estimates of average Sharpe ratios and Treynor ratios under the different regimes of misspecification. The attenuation bias for beta also leads to an attenuation bias for the Sharpe ratio, while for the Treynor ratio the attenuation bias leads to a strong upward bias, because the estimated beta enters the denominator of the Treynor ratio.
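The direction of the Treynor ratio bias follows from simple arithmetic, as the sketch below illustrates. All numbers are hypothetical and chosen for readability only. The Treynor ratio is assumed to take its textbook form (mean excess return divided by beta), and the attenuation is modeled as a single factor below one that scales the estimated beta.

```python
def treynor_ratio(mean_excess_return, beta):
    """Textbook Treynor ratio: mean excess return per unit of beta."""
    return mean_excess_return / beta

# Hypothetical monthly values; none of these numbers come from the paper.
mean_excess_market = 0.01      # expected excess return of the true market
beta_true = 1.2                # true CAPM beta of the asset
attenuation = 0.8              # assumed factor scaling the estimated beta

# The CAPM holds, so the asset's mean excess return is beta * market premium.
mean_excess_asset = beta_true * mean_excess_market

tr_true = treynor_ratio(mean_excess_asset, beta_true)
tr_biased = treynor_ratio(mean_excess_asset, attenuation * beta_true)

print(f"Treynor ratio with true beta:       {tr_true:.4f}")
print(f"Treynor ratio with attenuated beta: {tr_biased:.4f}")
print(f"inflation factor: {tr_biased / tr_true:.2f}")
```

Because the estimated beta enters the denominator, the ratio is inflated by the reciprocal of the attenuation factor.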
Figures 5a to 5c in Appendix A.3 depict the box plots of the Sharpe ratios for the 15 assets estimated from the CAPM model with market indices in the case of random measurement error. For and , the true Sharpe ratio lies outside the interquartile range indicating that an estimated Sharpe ratio in the presence of measurement error is very unlikely to come close to its true value. Only for the case of minor contamination, the true Sharpe ratio lies within the interquartile range.
Figure 6 in Appendix A.3 depicts the box plots for the estimated Treynor ratio for each of the 15 selected assets. As implied by the theory the mean estimates are the same for all 15 assets and are equal to the true value of the expected excess return of the market indicated by the blue horizontal line. Figure 7 in Appendix A.3 contains the box plots for the Treynor ratio based on equal weights and weights with random measurement error. For the case of fixed measurement error the sign of the bias differs from asset to asset. But the distortions are not as severe as for the case of random measurement error (). Note, for the latter case we find a large sampling variation of the Treynor ratio, which can only be estimated with low precision, even in the absence of measurement error. With measurement error in the market return we find an even stronger variation in the performance across assets although the CAPM holds true. The presence of measurement error mimics investment opportunities where none exist.
The consequences of measurement error in the market return for investment decisions based on CAPM estimates can also be seen for the security market line (SML). Figure 3 depicts the unbiased CAPM estimates based on the true market return (blue crosses) and the biased estimates based on equal weights (green dots).
If the market return is measured correctly, the estimates are scattered closely around the SML indicating no need for reshuffling the portfolio. However, with measurement error the estimates indicate investment opportunities due to the spurious abnormal returns. The slope of the true SML is given by . Since the weights of the equally weighted index are very close to the true weights in our simulation the sample mean of the excess return is very close to . Therefore the estimated SML and the true SML in Figure 3 overlap. Only in the case of random measurement errors () can the sample mean of the excess returns, , serve as an unbiased estimate of the slope of the SML. However, in all other cases considered , and the mean excess return of the market proxy does not yield a consistent estimate of the slope. For these cases the CAPM estimates deviate systematically from the true SML.
4 Empirical evidence in the presence of measurement error
In the following, we present empirical evidence on the relevance of measurement error in market returns using our cross-equation identification strategy presented in Section 2.2. We use minimum distance estimation (see Appendix A.2 for details) to estimate and test for attenuation bias and the presence of abnormal returns for various datasets and alternative measures of the market index. The effect of different market proxies on the beta estimates provides insights into the robustness of CAPM estimates and the quantitative relevance of the measurement error problem. In the first stage, we estimate a CAPM system of regression equations by the seemingly unrelated regression (SUR) approach. We call this system of regression equations the CAPM linear projection model (CAPM-LP), as it imposes no structure on the parameters implied by theory and can be taken as a purely statistical concept. Alternatively, one may regard this model as a general one-factor model with measurement error and/or abnormal returns, since non-zero intercepts due to measurement error and abnormal returns are not separately identifiable. The CAPM-LP, however, parametrically nests the standard CAPM (no abnormal returns, no measurement error, N true betas) and the CAPM with measurement error (CAPM-ME) (, N true betas and intercepts resulting solely from the presence of measurement error).
In the second estimation stage, we impose the parametric structure implied by either the CAPM-ME or the standard CAPM. For all second-stage estimates we use the inverse of the first-stage variance-covariance estimates as an optimal weighting matrix, so that the distance statistics at the minimum are asymptotically χ²-distributed (see Appendix A.2). The test statistic for the null hypothesis that the nested model holds true can then be obtained by comparing the distance statistic of the restricted model with that of the less restricted specification. Thus the difference in the distance statistics of the CAPM-ME against the CAPM-LP is χ²-distributed. It tests whether the intercepts can be explained solely by the intercepts of the CAPM-ME. Therefore, this test circumvents the problem of distinguishing true from spurious alphas.
Assuming the CAPM-ME holds and imposing, in addition, the absence of any measurement error, , leads to the minimum distance statistic for the standard CAPM. The difference between the two corresponding distance statistics is a χ²-distributed test statistic for the null hypothesis that the CAPM holds against the more general CAPM-ME.
Finally, the χ²-distributed Wald test on the zero intercepts tests, in the tradition of the CAPM, against any alternative which implies non-zero intercepts (CAPM-ME, abnormal returns, CAPM-ME including abnormal returns).
Table 4 summarizes the minimum distance estimates for two data sets consisting of different securities: (i) a set of 20 randomly selected stocks from the S&P 500 and (ii) the 30 stocks of the Dow-Jones Industrial Average Index (DJIA). Our estimates are based on three different definitions of the market return using monthly data from 2010:6 to 2015:5. In order to avoid estimation problems arising from structural breaks caused by the latest financial crisis, we decided to study the model on data that are likely to be unaffected by these distortions. The three market indices we use are (i) the value-weighted return of all CRSP firms incorporated in the US and listed on the NYSE, AMEX, or NASDAQ as provided on Kenneth French’s website,8 (ii) the S&P 500 value-weighted index and (iii) the Dow-Jones Industrial Average Index (DJIA). By construction, the CRSP index is the broadest index in terms of the asset space covered, while the DJIA is the crudest proxy of the true market index. Therefore, we would expect the lowest linear projection coefficient for the DJIA and the strongest attenuation bias for the beta estimates.
Consider first the results of the Wald test for the absence of intercepts (CAPM vs CAPM-LP). Our findings are in accordance with many previous empirical studies rejecting the null of no intercepts. Only for the case of DJIA stocks and the CRSP index as the market proxy can the conventional CAPM with no abnormal returns and no measurement error not be rejected. However, note that in the presence of measurement error this test provides no information on whether the non-zero intercepts result from true abnormal returns or are spurious due to measurement error. More interesting are the outcomes for our test of the CAPM-ME against the CAPM-LP. We find that the measurement error specification is sufficient to explain the presence of intercepts. Our test results indicate that the rejection of no abnormal returns, often found for many data sets, is likely to be the outcome of measurement error, i.e. the old dog CAPM remains alive once we account for measurement error.
Note that the linear projection coefficient can only be interpreted as a reliability ratio in the case of the EIV model. Only for this case is the projection coefficient bounded by construction between zero and one, implying an attenuation bias in the betas. Using the S&P500 and the DJIA as market proxies, we find significant evidence for an attenuation bias for all datasets. The situation for the CRSP index as the market proxy is somewhat different: the null of for the Dow-Jones and the S&P500 stocks cannot be rejected.
In this paper, we take a closer look at the consequences of a misspecified market index in the capital asset pricing model. Our focus is on two major sources of misspecification: (i) the use of inaccurate weights and (ii) the use of only a subset of the asset universe to construct the index. The consequences resulting from the use of a badly chosen market proxy range from inconsistent parameter estimates to a misinterpretation of tests on the existence of abnormal returns. High correlations between market proxies and the true market return are shown to be insufficient to conclude that the choice of a particular market proxy is inconsequential. What matters is the predictive quality of the market proxy for the true market index in terms of a linear projection.
The estimation problems arising from measurement error in the market proxy deviate substantially from the ones typically found in errors-in-variables models for linear regressions. Unlike the true market index, market proxies are no longer orthogonal to the error in the CAPM. Instrumental variable estimation generally becomes infeasible, unless additional identifying assumptions are introduced.
For the EIV-CAPM model, where the errors in the weights of the market index are assumed to be random, we present a new identification and testing strategy. This strategy accounts for the nonlinear cross-equation identifying restrictions, which exploit the property that all CAPM equations are affected by the same attenuation bias for the betas. We do not claim that our approach is able to identify the measurement error under general assumptions. In our setting, identification of the measurement error is only feasible under the maintained hypothesis that the CAPM holds true. What seems to be an important caveat, however, also applies to approaches which naively interpret any non-zero excess returns as being non-spurious. In contrast, our testing strategy allows us to test the linear projection based asset pricing model incorporating abnormal returns and measurement error against the EIV-CAPM as well as to test the CAPM with measurement error against the conventional Sharpe-Lintner CAPM without abnormal returns and measurement error. In this sense, the proposed testing strategy provides empirical evidence that the CAPM with measurement error can serve as a device to explain the cross-sectional variation of returns in cases where it is statistically indistinguishable from the more general set-up with both measurement error and true excess returns.
Our empirical findings indicate for two different datasets and different market proxies that regardless of the assets under investigation the use of a more accurate proxy is rewarding and reduces the estimation bias and the risk of detecting spurious abnormal returns. In fact, our empirical study indicates that the existence of abnormal returns in the conventional CAPM is spurious and can largely be explained by measurement error. In this sense, the empirical researcher should be cautious when interpreting non-zero estimates of alphas without checking for the possibility of spurious excess returns generated by measurement error.
Nevertheless, the claim of this study is a rather moderate one. We do not intend to give a final answer to the ongoing question of whether the CAPM is dead or alive. Our study simply points out that, if we want to give the CAPM a serious chance to survive as a workhorse in academic finance and business, we should take it more seriously and try to account for the latency of the market return as well as possible. In particular, this includes taking a more skeptical view of the evidence for the existence of abnormal returns in the presence of measurement error. As was shown in this paper and in many preceding studies, the problem of measurement error in the market returns is just the empirical result of misspecification of the market supply. More general and realistic theoretical settings concerning the supply side may lead to the use of more accurate approximations for the true market index in empirical studies.
Future research should consider other types of measurement error (e.g. multiplicative errors) as well as the consequences of measurement error in other multi-factor asset pricing models. In the light of our findings it seems also rewarding to reconsider the construction of popular market proxies which generally are characterized by a local bias contradicting the assumption that a large fraction of the investors follow a global investment strategy and ignore returns from other important tangible or non-tangible assets.
Financial support of the first author by the Graduate School of Decision Science (GSDS) and the German Academic Exchange Service (DAAD) is gratefully acknowledged. The second author would like to thank Simon Benninga posthumously for a helpful discussion and for motivating him to extend his notes on measurement errors in linear systems to an application for the CAPM. For helpful comments we would further like to thank Roxana Halbleib, Jens Jackwerth, Ingmar Nolte and two anonymous referees. All remaining errors are ours.
Brunner, J., P. Austin (2009), Inflation of Type I Error Rate in Multiple Regression When Independent Variables Are Measured with Error. Canadian Journal of Statistics 37: 33–46.
Campbell, J.Y., A.W. Lo, A.C. MacKinlay (1997), The Econometrics of Financial Markets. Princeton: Princeton University Press.
Fan, J., Q. Yao (2017), The Elements of Financial Econometrics. Cambridge: Cambridge University Press.
Fuller, W. (1987), Measurement Error Models. New York: Wiley.
Gourieroux, C., J. Jasiak (2001), Financial Econometrics. Princeton: Princeton University Press.
Hamerle, A. (1996), Empirische Performance Multivariater Tests des Capital-Asset-Pricing-Models. Jahrbücher für Nationalökonomie und Statistik – Journal of Economics and Statistics 215: 228–244.
Jagannathan, R., G. Skoulakis, Z. Wang (2010), The Analysis of the Cross-Section of Security Returns, chap. 14 in: Y. Aït-Sahalia, L.P. Hansen (Eds.), Handbook of Financial Econometrics, vol. 2 - Applications. Amsterdam: Elsevier.
Sharpe, W.F. (1964), Capital Asset Prices: A Theory of Market Equilibrium Under Conditions of Risk. Journal of Finance 19: 425–442.
A.1 CAPM-RME with Subset of Assets in the Index
In the following we derive the consequences, if the market proxy is based on a subset of the asset returns and the weights of the included assets are measured with random errors. Let the proxy for the market index be based on M < N assets, where the disregarded assets receive a zero-weight according to as defined in Section 2.2. Correspondingly the vector of the true weights consists of the true weight subvector for the included and the true weight subvector for the disregarded assets: . The return of the market proxy is given by: (20)
with , where denotes the subvector returns for assets disregarded in the index. Contrary to the case, where all assets are included measurement error has a non-zero mean, .
Assume there is a known constant mark-up factor θ of the mean market returns over the mean of the market proxy, such that , the intercept of the CAPM-RME eq. (15) becomes .
Finally, consider the correlation of the overall error term in the CAPM-RME from eq. (15) and the market proxy. (21)
where is the covariance between the returns of the included and the disregarded assets and is the variance-covariance matrix of the returns of the disregarded assets. If the empirical analysis is focused on assets which are included in the market proxy, if the CAPM is restricted to the first M assets or a subset of it, the first M elements of the vanish, if or are strictly zero and consistent estimates of the first M equations of the CAPM-RME can be obtained. Only a small bias is likely to occur, if the included and the excluded assets are only weakly correlated, and the true weight of the disregarded assets is negligible, .
A.2 Minimum Distance Estimation of the CAPM
Consider the stochastic form of the linear predictor equation for the j-th excess return on the (observable) excess return of the market including an intercept: (22)
where contains the reduced form parameters of a CAPM with measurement error and . The collection of reduced form parameters into the so-called Π-matrix of dimension N × 2 is given by: (23)
With as the vector of excess returns, the system of CAPM reduced form regressions is given by: (24)
where is the matrix of linear predictor coefficients with its sample counterpart (25)
The Π-matrix stacked into a row vector (26)
relates the 2N-vector of reduced form parameters to a vector g( · ) of nonlinear functions of the structural parameter vector of the same dimension, where π can be estimated consistently by single equation least squares without loss of efficiency: (27)
Under standard assumptions about the return process (28)
In the case of a homoskedastic error term vector with covariance matrix , the asymptotic variance-covariance matrix of takes the well-known form: (29)
assumed for seemingly unrelated regression models. Generalizations of Ω for the case of heteroskedasticity and autocorrelation can be easily derived.
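The first estimation stage can be sketched in a few lines. Because every equation shares the same regressors (a constant and the proxy's excess return), equation-by-equation least squares delivers the reduced-form coefficients. The sketch below makes several assumptions that are not taken from the text: the coefficients are stacked as all intercepts followed by all slopes, the finite-sample covariance estimate is formed as the Kronecker product of (X'X)^{-1} and the residual covariance matrix, and the function name and simulated data are hypothetical.

```python
import numpy as np

def reduced_form_ols(asset_excess, proxy_excess):
    """Equation-by-equation OLS of N excess returns on the market proxy.

    Returns pi_hat = (alpha_1..alpha_N, b_1..b_N) and an estimate of its
    covariance under homoskedastic errors, kron((X'X)^{-1}, Sigma_hat).
    The stacking order is a convention chosen for this sketch.
    """
    Z = np.asarray(asset_excess, dtype=float)           # T x N
    T, N = Z.shape
    X = np.column_stack([np.ones(T), proxy_excess])     # T x 2
    xtx_inv = np.linalg.inv(X.T @ X)
    coef = xtx_inv @ X.T @ Z                            # 2 x N
    resid = Z - X @ coef
    sigma_hat = resid.T @ resid / (T - 2)               # N x N
    pi_hat = coef.reshape(-1)                           # intercepts, then slopes
    cov_pi = np.kron(xtx_inv, sigma_hat)                # 2N x 2N
    return pi_hat, cov_pi

# Hypothetical data: three assets, 240 months, zero true intercepts.
rng = np.random.default_rng(2)
T, slopes = 240, np.array([0.6, 1.0, 1.4])
proxy = 0.01 + 0.04 * rng.standard_normal(T)
Z = np.outer(proxy, slopes) + 0.03 * rng.standard_normal((T, 3))

pi_hat, cov_pi = reduced_form_ols(Z, proxy)
print("intercepts:", np.round(pi_hat[:3], 4))
print("slopes:    ", np.round(pi_hat[3:], 3))
```

The inverse of `cov_pi` is the kind of matrix that could serve as the weighting matrix in the second stage.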
The minimum distance estimator based on the restriction π = g(θ) is defined by: (30)
where converges asymptotically to the positive definite weighting matrix . The optimal feasible weighting is given by , where is a consistent estimate of the asymptotic variance-covariance matrix of . Note, that g( · ) depends on the unknown parameter . In our empirical application, we replace with its sample mean.
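A minimal sketch of the second stage is given below. It rests on an assumed restriction of classical errors-in-variables form, which may differ from the g( · ) used in the paper: each intercept equals (1 − λ) times the mean proxy return times the true beta, and each slope equals λ times the true beta, with a common λ across equations. For a fixed λ the restriction is linear in the betas, so the betas have a closed-form solution and λ can be found by a grid search. The function name, the grid and the numerical values are hypothetical.

```python
import numpy as np

def md_fit_capm_me(pi_hat, weight, mean_proxy, lam_grid):
    """Profile minimum distance fit of pi = g(beta, lam).

    Assumed restriction (not taken from the text):
        alpha_j = (1 - lam) * mean_proxy * beta_j,   b_j = lam * beta_j.
    pi_hat stacks all intercepts first, then all slopes.
    Returns (distance at the minimum, lam_hat, beta_hat).
    """
    n = pi_hat.size // 2
    eye = np.eye(n)
    best = None
    for lam in lam_grid:
        # For fixed lam, g is linear in beta: pi = A(lam) @ beta.
        A = np.vstack([(1.0 - lam) * mean_proxy * eye, lam * eye])
        beta = np.linalg.solve(A.T @ weight @ A, A.T @ weight @ pi_hat)
        gap = pi_hat - A @ beta
        dist = float(gap @ weight @ gap)
        if best is None or dist < best[0]:
            best = (dist, lam, beta)
    return best

# Noise-free illustration with hypothetical values.
beta_true = np.array([0.6, 1.0, 1.4])
lam_true, mu = 0.8, 0.01
pi_exact = np.concatenate([(1 - lam_true) * mu * beta_true,
                           lam_true * beta_true])
W = np.eye(pi_exact.size)   # identity weighting, for illustration only
grid = np.linspace(0.05, 1.5, 146)

dist_me, lam_hat, beta_hat = md_fit_capm_me(pi_exact, W, mu, grid)
# Restricting the grid to lam = 1 imposes zero intercepts (standard CAPM).
dist_capm, _, _ = md_fit_capm_me(pi_exact, W, mu, [1.0])
print(f"lam_hat = {lam_hat:.2f}, distance difference = {dist_capm - dist_me:.2e}")
```

With an estimated weighting matrix in place of the identity, the difference between the two distances is the kind of statistic used to compare the standard CAPM with the measurement error specification.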
The model can also be derived under the general covariance specification . This generalization does not add any additional insights. However, for the simulation exercise the scalar identity expression is more convenient because the strength of the measurement error can be expressed by the single parameter .
Full information ML estimation is not recommended, because this would require additional distributional assumptions on , the error term of the linear projection eq. (14).
The simulation results based on the non-normalized weights are given in Table 5 in Appendix A.3. They do not differ substantially from those obtained for .
The plot of the empirical rejection rates and the ROC curve for the case of the CAPM with the market proxy based on the subset of the largest assets with non-normalized weights is given in Figure 4 in Appendix A.3. They do not differ substantially from those obtained for .
About the article
Published Online: 2019-06-18