The tendency of asset prices to go through locally explosive and mean reverting states is well documented and has intrigued both theoretical and empirical economists. Regime switching models, such as the ones introduced by Goldfeld and Quandt (1973), Tong (1978), and Hamilton (1989), have often been employed to empirically estimate asset prices with regime changes. These models include hidden Markov-state models as well as threshold autoregressive models.
Knight and Satchell (2011) study the steady-state properties of asset prices that are estimated using threshold auto-regressive models. Their article formalizes necessary and sufficient conditions for the existence of a stationary distribution for regime-switching threshold models. Analytical expressions for the mean, variance, covariance and the steady-state distribution are also derived. While Knight and Satchell carry out most of their analysis under the assumption of an independent and identically distributed trigger variable, they also consider the case of a threshold auto-regressive model (TAR henceforth) with a Markov trigger.
The current article builds on their results and generalizes the conditions required for a TAR model to have a stationary distribution to any finite number of regimes. We only consider the case where the TAR model is driven by an independent and identically distributed (i.i.d.) exogenous variable that triggers regime switches. We also derive analytical expressions for the mean and variance of these models, noting the conditions that need to be satisfied for their existence. The two moments are derived with a switching drift, with a constant drift and with no drift.
This enables us to treat market efficiency as a state, instead of a condition that holds universally whereby an asset market is either efficient or inefficient. Considering efficiency as one of several states, we make a number of contributions to the literature on the efficient market hypothesis. We contribute to the bubble testing literature by carrying out a simulation study which compares the power of bubble detection tests in situations where a stationary distribution exists against situations where it does not. Here, the explosive or bubble state is one state of a multi-state price process.
Evans (1991) pointed out in his seminal study that bubble detection tests are less useful when an observed series contains multiple instances of collapsing bubbles. His study showed that such tests lose power as the number of bubbles and collapses in a series increases. A number of studies have attempted to address this criticism through alternative methodologies. The most notable among them are Hall, Psaradakis, and Sola (1999) and Phillips, Shi, and Yu (2013). Hall et al. used a Markov-state regime switching model to estimate the probability of an asset being in an explosive state. Phillips et al., on the other hand, devised a recursive procedure based on the Augmented Dickey-Fuller test which allows them not only to test for explosiveness but also to date these bubbles. The GSADF test, as it is now called, has proven popular with macroeconomists and financial economists. Our simulations show that while this test is statistically powerful in empirical applications, it has its limitations.
Using both i.i.d and Markov switching-regime simulated series we show that when a time series resembling an asset price fails to satisfy the conditions for a stationary distribution, the GSADF test has high power. On the other hand, the power of the GSADF test falls considerably when the process has a stationary distribution even though locally explosive regimes continue to be present. Thus, our simulation study builds on our theoretical results and further elaborates on observations made in Knight, Satchell, and Srivastava (2014), who outline reasons for the failure of bubble detection tests when a series has a stationary distribution.
Our analysis identifies a limitation of the GSADF test, as the test is premised on a process not having a mean reverting state. Phillips et al. show that the GSADF test has higher power than alternative bubble detection tests; thus, we contend that these results should have external validity for other bubble tests. Furthermore, we note that the power of the GSADF test increases the farther the explosive regime parameter is from unity. This observation is also supported by the formulae we derive in our workings and is discussed in the relevant section.
Finally, we also contribute to the market efficiency literature by providing a methodology that may be used to estimate how often an asset market is efficient and also allows us to compare efficiency across different markets. The efficient market hypothesis is perhaps the most well-known as well as the most divisive hypothesis in economics. While economists were aware of market efficiency for a very long time before him, Fama (1965) was the first to define the market as being efficient in his seminal article on stock prices where he concluded that stock market prices followed a random walk.
Since then a large number of economists, both proponents and opponents of the hypothesis, have contributed to this literature. Seminal contributions have been made by Samuelson (1965), De Bondt and Thaler (1985), Marsh and Merton (1986), and Shiller (2000), among others. The hypothesis itself states that for a given information set systematic gains cannot be made by trading on the information set alone. In fact, it may be argued that the threshold auto-regressive model literature and the bubble testing literature are subsets of the efficient markets literature, as the techniques developed to estimate asset markets have often been discussed in the context of efficient markets.
Note that in this article when we mention market efficiency we are referring to the weak form of market efficiency, which states that returns cannot be predicted based on prior information, i.e. the impact of prior prices is already reflected in the current price. In econometric terms this suggests that asset prices follow a random walk with (or without) drift, i.e. $p_t = \alpha + p_{t-1} + \varepsilon_t$, where $p_t$ is the (log) price, $\alpha$ the drift (zero in the driftless case) and $\varepsilon_t$ an unpredictable error. In the context of this research, thus, a deviation from efficiency should be understood as a deviation from the random walk process or any deviation in slope from the specification above. Instead of arguing for or against the efficient market hypothesis we recognize that although markets may be mostly weak form efficient, they can deviate from efficiency for significant periods of time. Our estimation methodology aims at identifying how long, in terms of what proportion of time, these inefficient periods last.
We outline an estimation methodology for asset markets where market efficiency is one of several states. Our estimation methodology aims at identifying how long periods of efficiency and inefficiency last. To the best of our knowledge this research is the first to provide a metric for market efficiency using threshold auto-regressions. Historically, research has taken a binary view of market efficiency and has been based around the presence or absence of market efficiency taken on average over a sample period; our metric provides a more detailed view. We believe that markets can undergo efficient as well as inefficient states and our methodology helps us determine and estimate how long such states last.
We provide an illustrative empirical application of our methodology through estimating a TAR(1) model for the S&P500 and FTSE 100 stock market indices where the parameter switches due to an exogenous trigger variable. We estimate the TAR(1) using a constant drift, a regime-switching drift and no drift. Our results indicate that the inclusion of a drift term, particularly a regime-switching drift term, reduces the impact of the regime-switching auto-regressive parameter. A switching drift term is able to explain changes in the return process and thus, the high variance observed during explosive periods. With a switching drift term, markets appear to be more efficient than with no drift as we are unable to reject the random walk hypothesis for a number of coefficients. This observation indicates that in the context of regime-switching models, market efficiency is a model driven concept. However, our methodology does allow us to compare efficiency across different markets for the same model specification.
Other measures of efficiency based on trading volumes or number of informed traders may be used to gain estimates of market efficiency. This would require estimating a trading model for the former and a heterogeneous agent model for the latter. We argue that our estimation methodology is much simpler and the data required for estimation (i.e. prices or returns) are much more easily available than trading volumes and private information of traders. The methodology may be used for forecasting purposes; however, no data testing on forecasting is carried out as it goes beyond the scope of our research, which primarily aims at specifying and estimating a metric for market efficiency.
In summary, our main contributions to the literature are a generalization of conditions required for a threshold auto-regressive model to have a mean and variance when a steady-state distribution exists, a simulation study of how bubble tests behave when processes do or do not have steady-state distributions and an empirical methodology for analysing market efficiency through threshold autoregressive models driven by an exogenous trigger variable. Section 2 generalizes the results in Knight and Satchell (2011) to a finite number of regimes and presents results on the mean and variance of TAR(1) models. Section 3 explains the results of the simulation study using both i.i.d and Markov chain threshold exogenous triggers. Section 4 illustrates how the model in Section 2 may be estimated in practice and how it can be used to construct efficiency measures. Section 5 concludes.
2 Conditions for the Existence of a Mean and Variance
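Let $p_t$ be the price (or log price) of some asset. We assume that: $p_t = \alpha_t + \beta_t p_{t-1} + \varepsilon_t$ (1), where $\alpha_t$ is the drift, $\beta_t$ the autoregressive coefficient and $\varepsilon_t$ the error term, with the values of $\alpha_t$ and $\beta_t$ determined by the regime in force.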
For illustrative purposes we consider the three state case although the results will be applicable to a finite number of ‘k’ states. For k = 3 we specify values of the trigger or driving variable for which the parameters switch between values.
Thus, for a three-state case we have:
and are threshold levels which trigger the switch between states. The above framework can be generalized to k intervals and k+1 constants where and .
We need to be i.i.d to derive our results. The probability that will take a value between any two constants is assumed to be , i. e. and
As a result there will be ‘2k’ different parameters. We denote the regime specific parameter by
We note that conditions for the existence of a stationary distribution will be similar if the trigger variable followed a Markov process but the moments will be different (Knight and Satchell 2011, Theorem 2).
If < 0 and (Knight et al. 2011) then the TAR model given by eq. (1) has the solution (2)
and , (Quinn 1982).
When for i = 1, 2 … .k (3)
It follows that as : (4)
We note that the finiteness of the first term in eq. (3) is governed by the behaviour of but that eq. (4) being satisfied is enough to ensure the existence of eq. (3). We further note that existence of the process does not imply existence of the mean so the first term may not converge to a finite limit in expectation.
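In the following sub-sections we derive the mean and variance for the general case and discuss special cases, i.e. a constant drift and no drift.
As the regime parameters switch independently (due to the trigger variable being i.i.d.), we have: (6)(7)
Thus, the mean will exist if $|E(\beta_t)| < 1$, where $\beta_t$ denotes the switching autoregressive coefficient.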
The mean is zero if there is no drift term.
Under independence of and it follows from eq. (1) that: (10)
We evaluate each term in eq. (10) separately (11)(12)(13)
Equation (13) again makes use of the fact that is an i.i.d process which implies independence between and .
Using the definition of variance we know that: (14)
Finally, we evaluate (15)
Equation (15) relies on the independence of and from .
Rearranging eq. (16) and recognizing that if has a stationary distribution and
The condition for the existence of a finite variance with switching regimes is thus $E(\beta_t^2) < 1$, where $\beta_t$ denotes the switching autoregressive coefficient. (18)
The expression in eq. (17) suggests that the variance of the price series is increasing in the variance of the drift parameter, the variance of the error term, the variance of the coefficients, the expected value of the price (which includes the absolute value of the drift) and the covariance between the drift and coefficient parameters. The latter will be positive since we assume that the same exogenous variable causes a regime switch in both the drift and the coefficient parameter. If we have regimes with slope coefficients deviating far from unity (a case we will be interested in when considering the efficient market hypothesis), we will get a much higher variance for the price series. Similarly, a large drift term can lead to a high variance of the price. This also gives us an early indication that model specification may be an important determinant in analysing asset price series. Given the amount of variation in an observed set of series, different specifications will lead to the variation being captured by different parameters. Indeed, this is what we observe in Section 4.
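For readers who wish to retrace the algebra, the following is a sketch of how the mean and variance can be obtained under an i.i.d. trigger. The notation ($p_t$, $\alpha_t$, $\beta_t$, $\varepsilon_t$, $\mu$, $\sigma_\varepsilon^2$) and the form $p_t = \alpha_t + \beta_t p_{t-1} + \varepsilon_t$ are assumptions made for this sketch and need not match eqs. (10) to (17) line by line. It also assumes that the process is in its steady state, so that $p_t$ and $p_{t-1}$ share the same mean and variance.

```latex
% Assumptions: (\alpha_t, \beta_t) are i.i.d. and independent of p_{t-1} and \varepsilon_t;
% \mu = E(p_t), \sigma_\varepsilon^2 = Var(\varepsilon_t), E(\varepsilon_t) = 0.
\begin{align*}
\mu &= E(\alpha_t) + E(\beta_t)\,\mu
  \;\Longrightarrow\; \mu = \frac{E(\alpha_t)}{1 - E(\beta_t)}, \\
\operatorname{Var}(p_t) &= \operatorname{Var}(\alpha_t) + \operatorname{Var}(\beta_t p_{t-1})
  + 2\operatorname{Cov}(\alpha_t, \beta_t p_{t-1}) + \sigma_\varepsilon^2, \\
\operatorname{Var}(\beta_t p_{t-1}) &= E(\beta_t^2)\,\bigl[\operatorname{Var}(p_{t-1}) + \mu^2\bigr]
  - E(\beta_t)^2 \mu^2, \\
\operatorname{Cov}(\alpha_t, \beta_t p_{t-1}) &= \mu \operatorname{Cov}(\alpha_t, \beta_t), \\
\operatorname{Var}(p_t) &= \frac{\operatorname{Var}(\alpha_t) + \sigma_\varepsilon^2
  + \mu^2 \operatorname{Var}(\beta_t) + 2\mu \operatorname{Cov}(\alpha_t, \beta_t)}{1 - E(\beta_t^2)}.
\end{align*}
```

Under these assumptions the last line is finite and positive only when $E(\beta_t^2) < 1$, and it lists the same ingredients as the discussion above: the variances of the drift, the error and the coefficient, the expected price and the drift-coefficient covariance.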
Now we consider the case where the drift is constant, i. e.
A constant drift implies that the variance of the drift and the covariance between the drift and the coefficient are both zero. Therefore, the expression for the variance of the asset price with a constant drift term reduces to: (19)
With a constant drift term the variance of the process relies on the variance of the error term, the variance of the coefficient parameter and the absolute value of the expected price which itself is a function of the drift term; thus, the variance of price is dependent on the absolute value of the drift term.
Finally, we consider the case where there is no drift. With no drift term, eq. (9) implies that the mean is zero. Thus, (20)
In the vicinity of a unit root, the second moment of the autoregressive coefficient is likely to be close to 1, which will lead to a very large variance for the process. Nevertheless, the variance will be finite and will exist as long as the condition in eq. (18) is satisfied. Sections 3 and 4 analyse the implications of the formulae derived above. Section 3 considers a series of simulations to show how violations of the criterion for a stationary steady-state distribution specified above and the variance of the parameter coefficient impact the ability of statistical tests to detect explosive roots or bubbles. Section 4 uses an illustrative empirical study to show how specifications such as eq. (1) may be used to analyse the efficiency of asset markets.
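The three cases (switching drift, constant drift, no drift) can be checked numerically with a short sketch. It assumes the model $p_t = \alpha_t + \beta_t p_{t-1} + \varepsilon_t$ with an i.i.d. trigger, the existence conditions $|E(\beta_t)| < 1$ for the mean and $E(\beta_t^2) < 1$ for the variance, and the steady-state moment expressions that follow from those assumptions. The function name `tar_moments` and its arguments are hypothetical.

```python
import numpy as np

def tar_moments(alphas, betas, probs, sigma2=1.0):
    """Steady-state mean and variance of an i.i.d.-trigger TAR(1) (assumed form).

    alphas, betas: regime-specific drifts and autoregressive coefficients.
    probs: regime probabilities. sigma2: variance of the error term.
    Returns (mean, variance); an entry is None when that moment does not exist.
    """
    a, b, p = (np.asarray(x, dtype=float) for x in (alphas, betas, probs))
    e_a, e_b, e_b2 = p @ a, p @ b, p @ b**2
    if abs(e_b) >= 1:            # mean condition violated
        return None, None
    mean = e_a / (1 - e_b)
    if e_b2 >= 1:                # variance condition violated
        return mean, None
    var_a = p @ a**2 - e_a**2
    var_b = e_b2 - e_b**2
    cov_ab = p @ (a * b) - e_a * e_b
    var = (var_a + sigma2 + mean**2 * var_b + 2 * mean * cov_ab) / (1 - e_b2)
    return mean, var

# No drift: the mean is zero and the variance is sigma2 / (1 - E[beta^2]).
print(tar_moments([0, 0, 0], [0.96, 1, 1.03], [0.1, 0.8, 0.1]))
# Constant drift of 0.01 in every regime.
print(tar_moments([0.01] * 3, [0.98, 1.02, 1.05], [0.8, 0.1, 0.1]))
```

Passing the same drift in every regime reproduces the constant-drift case, and passing zeros reproduces the no-drift case, so one function covers all three specifications.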
3 Simulation Results
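In order to support our theoretical analysis we carry out a simulation study. We simulate series with a switching autoregressive parameter and a standard normal error term. We do not consider a switching drift term for our simulations as a constant drift adequately addresses the issue we wish to highlight. The simulated series takes the following form: $p_t = \alpha + \beta_t p_{t-1} + \varepsilon_t$, $\varepsilon_t \sim N(0, 1)$ (21), where $\alpha$ is the constant drift and $\beta_t$ is the autoregressive parameter selected by the trigger variable.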
In this context can be thought of as the logarithm of a price variable so that if the return . In the simulations below we consider the case when and .
Our simulation study considers the three-state case for computational ease, although the results will also hold for a finite number of k-states. The switching parameter depends on the pseudo sentiment variable, which in our simulations is either a multinomial vector or a Markov chain variable. For the multinomial vector case, we select the probability with which each state occurs. Thus, Z is [1 0 0] when in state 1, [0 1 0] when in state 2 and [0 0 1] when in state three with probabilities and respectively. The value taken by depends on the parameter vector specified.
On the other hand, if Z is a Markov chain variable, it takes on the values 1, 2 or 3 depending on which state the series is in. For the Markov chain simulations we need to specify a transition matrix instead of a probability vector. A Markov chain is more intuitive for the type of series we are concerned with, as states tend to be more persistent once the switch occurs. It is also more comparable to the kind of simulations used in the literature on tests for explosiveness or bubbles. In addition to the process followed by the switch inducing variable Z, we also need to specify values for the switching parameters. Together, the switching parameters and the probabilities of Z enable us to verify if the criterion for a steady-state distribution is satisfied.
In order to illustrate what happens when the criterion is satisfied and when it is not we carry out the GSADF or Phillips, Shi and Yu (PSY henceforth) test on each simulated series to check if the test is able to detect explosiveness in the series which in the context of our simulations may be construed as the process not having a steady-state distribution. This takes the form of a power test. Each simulated series contains periods of explosiveness which is the primary reason for taking the series away from stationarity. This statistic is being used in different areas of economics and PSY have shown it to have high power in detecting explosiveness or bubbles. The test involves estimating recursive regressions of the following form:
In the above regression the parameter of interest is which is estimated through expanding, rolling windows with a minimum window size specified by the researcher. For each regression a right-sided unit root statistic is calculated. The supremum(sup) of all right-sided unit root statistics thus calculated is the GSADF or PSY statistic. The sup value can be compared to simulated critical values, allowing the user to comment on whether a bubble may be present in the series under consideration. We refer the interested reader to Phillips, Shi, and Yu (2013) for further details on the test procedure and asymptotic properties of the statistic.
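To make the double recursion concrete, the following is a minimal sketch of a GSADF-type statistic. It is not the PSY implementation: it assumes an ADF regression with an intercept and no lagged differences, the function names `adf_stat` and `gsadf_stat` are hypothetical, and critical values would still have to be simulated separately.

```python
import numpy as np

def adf_stat(y):
    """t-statistic of delta in dy_t = mu + delta * y_{t-1} + e_t (no lagged differences)."""
    dy = np.diff(y)
    X = np.column_stack([np.ones(len(dy)), y[:-1]])
    coef, *_ = np.linalg.lstsq(X, dy, rcond=None)
    resid = dy - X @ coef
    s2 = resid @ resid / (len(dy) - 2)          # residual variance
    cov = s2 * np.linalg.inv(X.T @ X)           # OLS covariance matrix
    return coef[1] / np.sqrt(cov[1, 1])

def gsadf_stat(y, min_window):
    """Supremum of right-sided ADF statistics over all windows of at least min_window points."""
    y = np.asarray(y, dtype=float)
    n = len(y)
    best = -np.inf
    for end in range(min_window, n + 1):            # window end point
        for start in range(0, end - min_window + 1):  # window start point
            best = max(best, adf_stat(y[start:end]))
    return best

rng = np.random.default_rng(0)
random_walk = np.cumsum(rng.standard_normal(60))
print(gsadf_stat(random_walk, min_window=20))
```

The number of regressions grows with the square of the sample size, so for series of 1000 observations this plain loop would be slow and is intended only to illustrate the logic.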
Multinomial Switching variable:
For clarity, we indicate the specific form taken by our simulated series. When is a multinomial vector taking values it takes the following form:
Section 2 considers a similar exogenous trigger. To aid understanding, the first element of the vector indicates the mean reverting state, the second element indicates the random walk or efficient state and the third element indicates the explosive or bubble state. For each set of parameter and probability values we generated 500 simulated series and conducted the GSADF test on each series. Each series is 1000 observations long and the minimum window size was set to 10 % of the series, or 100 observations. Critical values were generated separately using the MATLAB code provided by PSY. The GSADF test was conducted at the 5 % level (the 5 % critical values for series of length 1000 with an initial window size of 100 are 2.16 for series without drift and 2.233 with drift). Since each simulated series is explosive 10 % of the time on average, each series exhibits the type of explosive behaviour that the GSADF test seeks to detect. The power is calculated as the number of series in which the test detects a bubble divided by 500, for each set of 500 simulations.
Table 1 shows the results of our simulations along with the parameter values and the probability vector. Column 3 shows the value of the criterion for a steady-state distribution. The criterion is said to be satisfied whenever $E(\ln|\beta_t|) < 0$, i.e. whenever the probability-weighted average of the log absolute autoregressive parameters is negative. We ensured that we chose a range of values so that for some values the criterion was satisfied and for other values it was not. For the multinomial vector case, we note that the power of the GSADF test is much higher when the criterion is not satisfied. We illustrate our results by considering some sets of parameter and probability values. For parameter values [0.96 1 1.05] and a probability vector [0.10 0.80 0.10] we obtain a criterion value of 0.00080 and as per our theoretical results the series should not have a stationary distribution. We see that when the criterion threshold is breached, we get a power of 17.6 % from the GSADF test. Note that the existence of a stationary distribution does not guarantee the existence of moments. With two exceptions (parameter vector = [0.98 1.02 1.05] and [0.96 1 1.03]), all other sets of values do not have a mean or variance even though the distribution may exist (when the criterion is satisfied).
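A single replication of the multinomial design can be sketched as follows. The sketch assumes the form $p_t = \alpha + \beta_t p_{t-1} + \varepsilon_t$ with the regime drawn independently each period; the function name `simulate_iid_tar`, the starting value of zero and the seed are hypothetical choices.

```python
import numpy as np

def simulate_iid_tar(betas, probs, n=1000, alpha=0.0, p0=0.0, seed=None):
    """Simulate a TAR(1) series whose regime is drawn i.i.d. each period.

    betas: autoregressive parameter per state (mean reverting, random walk, explosive).
    probs: probability of each state. alpha: constant drift. p0: assumed starting value.
    Returns the simulated series and the sequence of states (0-based).
    """
    rng = np.random.default_rng(seed)
    states = rng.choice(len(betas), size=n, p=probs)   # i.i.d. multinomial trigger
    eps = rng.standard_normal(n)                       # standard normal errors
    series = np.empty(n)
    prev = p0
    for t in range(n):
        prev = alpha + betas[states[t]] * prev + eps[t]
        series[t] = prev
    return series, states

series, states = simulate_iid_tar([0.95, 1.0, 1.05], [0.10, 0.80, 0.10], seed=0)
print(series[:5], np.bincount(states, minlength=3) / len(states))
```

Repeating this for 500 seeds and recording how often a bubble test exceeds its critical value would give an empirical power of the kind reported in the tables.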
Contrast this with cases when the criterion is satisfied e. g. when the parameter vector is [0.95 1 1.05] and the probability vector is [0.10 0.80 0.10]. The intensity of the bubble or explosive behaviour stays the same, i. e. the bubble increases the value of the series by 10 % each period and the explosive state occurs 10 % of the time in the long run. We note that even with no change in the bubble state parameter value the power of the GSADF test reduces markedly down to 9.0 %. This supports our theoretical results and shows that if bubbles occur in assets which may have a long run steady-state distribution, they may be harder to detect.
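The criterion values quoted above are consistent with a log-moment condition of the form $\sum_i \pi_i \ln|\beta_i| < 0$, where $\pi_i$ is the probability of state $i$ and $\beta_i$ its autoregressive parameter. Assuming that form, the values can be reproduced with a few lines; the function name `stationarity_criterion` is hypothetical.

```python
import math

def stationarity_criterion(betas, probs):
    """Probability-weighted average of ln|beta_i| (assumed form of the criterion)."""
    return sum(p * math.log(abs(b)) for b, p in zip(betas, probs))

probs = [0.10, 0.80, 0.10]
for betas in ([0.96, 1.0, 1.05], [0.95, 1.0, 1.05], [0.98, 1.0, 1.02]):
    value = stationarity_criterion(betas, probs)
    status = "satisfied" if value < 0 else "not satisfied"
    print(betas, round(value, 5), status)
```

A positive value, as for [0.96 1 1.05], indicates that the criterion is violated, while a negative value, as for [0.95 1 1.05], indicates that a stationary distribution exists under the assumed condition.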
In their article PSY do not use a mean reverting state. Their analysis is based on a random walk and a mildly explosive regime, which will not satisfy the criterion and is thus likely to result in a higher power for their test based on what we observe in our simulations. While the test has undoubtedly been useful in many applications, it is important to keep its limitations in mind, particularly when it is unable to detect bubbles in an asset which may otherwise be thought to have gone through periods of explosiveness. The types of series considered by PSY are closer to the Markov-chain simulations in the following sub-section; thus, some of the low power detected in this sub-section may be attributed to the choice of our i.i.d. exogenous trigger.
We also note that the power of the test increases the farther apart from unity the explosive state is i. e. . For instance, when we reduce and consider the parameter vector [0.98 1 1.02] with the same probability vector as before, the power reduces to 5.6 % even though the criterion is smaller than before. We also consider the case with 1 mean reverting and two explosive states with the mean reverting state occurring 80 % of the time. With the same probability vector, this set of values attained a power of only 1.2 %. This is the only set of values for which a mean and variance exists.
Tables 2 and 3 on the other hand report results for simulations which include a constant drift term. Table 2 contains results for a drift of 0.01. Table 3 contains results for a drift of 0.025. The higher drift value is chosen in order to illustrate how the power of the test depends on the drift term. Note that as per the results in Section 2, the size of the drift plays a role in determining the variance and is likely to impact the results.
When a constant drift term is included the power of the GSADF test goes up for all sets of values compared to the case with no drift and our main results continue to hold, i.e. the power is lower if the criterion for a steady-state distribution is satisfied and the variance of the parameter vector is low, despite the fact that the mean and variance for the set of values chosen do not exist. As per the expressions in Section 2 for the mean and variance of the simulated series, a higher alpha implies not just a higher mean but also a higher variance. While it is not clear from our results whether a higher drift necessarily leads to a higher power for the GSADF test, the powers attained are higher than in the no drift case. We do note, however, that for the case where both a mean and a variance exist a higher drift leads to a higher power.
Markov Chain trigger variable:
In this section we consider a Markov-chain trigger variable which influences states. Instead of a three-dimensional multinomial vector, Z is now a Markov-chain variable dependent on a transition matrix. Instead of specifying a vector of probabilities we now specify a transition matrix which determines the value of Z, the pseudo sentiment variable. Thus, the process becomes:
with the transition matrix
The probability that takes a value of 1 given that it was 1 in the previous period is ; the probability that takes a value of 2 given that it took a value of 1 in the previous period is and so on. Thus is a one period Markov-chain variable. Simulating the series in this way has the desirable property that states tend to be more persistent compared to the multinomial case. While over the long term the duration of each state is similar to the multinomial case, as evident from the steady-state probability vector, each state tends to last longer. Thus, it can be argued that series generated in this way share more properties of actual asset price series.
As in the previous section, we carry out 500 simulations for each set of values. In order to aid comparison we ensure that the transition matrix was such that the steady-state probabilities of states were similar to those used in the multinomial series. One simplification made in selecting values for the transition matrix is that there are no switches from the explosive state to the mean reverting state and vice versa. Thus, whenever there is a switch from either the explosive or the mean reverting state, it is to the random walk state in the first instance.
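The design constraints described above can be illustrated with a small sketch. The transition matrix below is hypothetical: it is chosen only so that there are no direct switches between the mean reverting and explosive states and so that the steady-state probabilities equal [0.10 0.80 0.10]; it is not the matrix used in our simulations. The function names are likewise hypothetical.

```python
import numpy as np

# Hypothetical transition matrix; rows are the current state, columns the next state.
# States: 0 = mean reverting, 1 = random walk, 2 = explosive.
P = np.array([[0.9,    0.1,   0.0],
              [0.0125, 0.975, 0.0125],
              [0.0,    0.1,   0.9]])

def stationary_distribution(P):
    """Left eigenvector of P for the eigenvalue closest to one, normalised to sum to one."""
    vals, vecs = np.linalg.eig(P.T)
    v = np.real(vecs[:, np.argmin(np.abs(vals - 1))])
    return v / v.sum()

def simulate_markov_states(P, n, start=1, seed=None):
    """Simulate n states of a first-order Markov chain with transition matrix P."""
    rng = np.random.default_rng(seed)
    states = np.empty(n, dtype=int)
    s = start
    for t in range(n):
        states[t] = s
        s = rng.choice(len(P), p=P[s])   # draw next state from the current row
    return states

print(stationary_distribution(P))
print(simulate_markov_states(P, 20, seed=1))
```

Because the diagonal entries are close to one, each state persists for many periods once entered, which is the persistence property discussed above.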
Simulating and testing Markov-chain series further strengthens the results obtained from the previous sub-section. Using Markov-chains instead of multinomial vectors the simulated series exhibit more asset price like properties and due to state persistence we obtain higher powers for each set of values compared to the multinomial vector counterpart. Table 4 reports results for Markov-chain simulations below. When the criterion is set to ‒0.00004 the multinomial vector series have a power of 5.6 % compared to 25.8 % for the corresponding Markov-chain simulations. A similar pattern is observed for the remaining values. This observation may be attributed to state persistence introduced by the Markov-chain which enables detection via the GSADF test.
As noted previously, influences the results. When the criterion is not satisfied and the non-efficient states significantly deviate from 1, we get very high power. For example, when the parameter vector is with a transition matrix so chosen to give a steady-state probability vector we find a power of 26.8 % (criterion value ‒0.00004). Keeping the explosive state at 1.02, if the mean reverting state is made more persistent (0.99 from 0.98), the criterion is violated (0.00098) and we get a higher power for the GSADF test at 38.2 %. If we increase the deviations from the random-walk state while maintaining the same steady-state probabilities, the power increases further even though the value of the criterion itself does not change significantly. For example, if the parameter vector is with the same steady-state probabilities and a criterion value of 0.0008, the power increases to 73.8 %.
We also consider the case when we have multiple explosive states and a mean reverting state (in this case the mean and variance of the process exist according to the criteria set in Section 2). Among the two explosive states one is more explosive than the other but both explosive states have the same steady-state probability. The parameter vector is [0.98 1.02 1.05] with steady-state probabilities [0.80 0.10 0.10], yielding a criterion value of ‒0.0093. Corresponding to these values we get a power of 25.8 % for the GSADF statistic. With the multinomial regime-switching variable we observed a power of only 1.2 % for the same set of values. Thus, even with two explosive states we note that if the steady-state distribution criterion is satisfied, explosive behaviour may not be discernible using conventional right-sided unit root tests such as the GSADF test.
We also report results for simulations which include drift terms. Results are reported in Tables 5 and 6. In the Markov-switching case we note that the size of the drift term matters significantly. With a small drift term we do not note a significant change in power compared to the case with no drift (note that the critical values for the two tests are different). For the Markov chain simulations we also note that the power of the test increases as the drift term is increased from 0.01 to 0.025 for all sets of values except one. PSY carried out the test for small deviations from a random walk while our simulations include much larger deviations, which explains why we observe much higher power despite including a mean reverting term. Nevertheless, the results are consistent with the previous set of simulations and the same set of factors, namely the value of the criterion, the existence of a drift term and the variance of the parameter vector, tend to determine the power of the GSADF test.
Thus, our simulations provide evidence for our theoretical results. The power of bubble detection tests will be higher if the estimated parameters and state probabilities are such that the criterion for a stationary distribution is not satisfied. We also note that the power of such tests is higher when is high.
4 A Metric for Market Efficiency
We use the results in Sections 2 and 3 to outline a methodology that may be employed as a measure for market efficiency. Estimating different states in a process will enable us to understand the stationarity properties of an asset price series and whether the process has a stationary distribution. It will also allow us to evaluate how much time the process spends in each of the states.
While we use time as a metric for market efficiency, other potential metrics may also be employed (these could include but not be limited to metrics using trading volumes or market liquidity); however, we argue that our metric is the simplest to estimate and understand. For the data that we use in our empirical application, transaction costs are limited (around 1 basis point for ETFs trading S&P500); these ETFs also tend to be highly liquid and we make the argument that transaction costs and illiquidity are not an issue for the asset markets we consider in our empirical application. Nevertheless, it is important to keep in mind that inefficiency can also result from illiquidity or due to high transaction costs. To measure inefficiency from a transaction or liquidity viewpoint, the TAR model will need to be estimated on measures of transactions and liquidity rather than price. We outline our strategy for estimating market efficiency using pricing data below.
Consider eq. (22): (22)
We define an n-point grid over the extreme values taken by the trigger variable, . Thus, our grid takes values . Our recursive procedure starts by considering an initial pair of values for and ( and respectively). Using the values of and , the sample is divided into three sub-samples, i. e. the first sub-sample contains all values of the asset price that occur when the trigger variable is less than , the second sub-sample contains all values of the asset price when the trigger variable is between and and the third sub-sample contains all values of the asset price when the trigger variable is greater than or equal to .
The number of grid points is chosen so that we always have more than two observations between two consecutive grid points (as no estimation will be possible without at least two observations). Following similar terminology to Sections 2 and 3, is restricted to a value of 0 in order to ensure that the second sub-sample is consistent with an efficient market (if we run the regression in levels instead of logs, would equal 1). Thus, the market is efficient between the thresholds and . If then the market is not efficient for any amount of time. We estimate parameters and for the three sub-samples using least squares. are estimated using all values of that correspond to and are estimated using all values of that correspond to .
We calculate the sum of squared residuals, for each sub-sample in order to obtain the parameter values and the corresponding diagnostics. In the next iteration of the algorithm we keep fixed at and change the value of to . The above process is repeated and a new set of parameter estimates for and is obtained. The procedure is repeated until the last point on the grid. Following this first recursion, the recursive procedure is re-started by altering the value taken by to and. The above double recursion is repeated until and , which gives us our final parameter estimates. The values of and and the corresponding and that minimize the sum of squared residuals are selected and parameters estimated along with their asymptotic standard errors.
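The double recursion over threshold pairs can be sketched as a grid search. The sketch rests on several assumptions: the regression is run on price changes, $\Delta p_t = \alpha_i + \beta_i p_{t-1} + e_t$, with a switching drift; the slope of the middle regime is restricted to zero; and the trigger value `z[t]` governs the change from `p[t]` to `p[t + 1]`. The function names, the grid size and the minimum number of observations per regime are hypothetical, and standard errors are omitted.

```python
import numpy as np

def _ssr(y, X):
    """Least-squares fit; returns the coefficients and the sum of squared residuals."""
    coef, *_ = np.linalg.lstsq(X, y, rcond=None)
    resid = y - X @ coef
    return coef, float(resid @ resid)

def fit_three_regime_tar(p, z, n_grid=20, min_obs=3):
    """Grid search over thresholds c1 <= c2; z must have len(p) - 1 elements."""
    p, z = np.asarray(p, float), np.asarray(z, float)
    dp, lag = np.diff(p), p[:-1]
    grid = np.linspace(z.min(), z.max(), n_grid)
    best = None
    for i, c1 in enumerate(grid):
        for c2 in grid[i:]:                          # enforce c1 <= c2
            low, mid, high = z < c1, (z >= c1) & (z < c2), z >= c2
            if low.sum() < min_obs or high.sum() < min_obs:
                continue                             # outer regimes too thin to estimate
            if 0 < mid.sum() < min_obs:
                continue                             # efficient regime too thin to estimate
            total, coefs = 0.0, []
            for mask, restricted in ((low, False), (mid, True), (high, False)):
                if mask.sum() == 0:                  # c1 == c2: no efficient regime
                    coefs.append((np.nan, 0.0))
                    continue
                const = np.ones(mask.sum())
                X = const[:, None] if restricted else np.column_stack([const, lag[mask]])
                coef, ssr = _ssr(dp[mask], X)
                coefs.append((coef[0], 0.0 if restricted else coef[1]))
                total += ssr
            if best is None or total < best["ssr"]:
                best = {"c1": c1, "c2": c2, "coefs": coefs, "ssr": total}
    return best
```

The returned dictionary holds the selected thresholds, the (drift, slope) pair for each regime and the minimised sum of squared residuals. A finer grid improves the threshold estimates at the cost of a number of regressions that grows with the square of the grid size.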
Once we have found the switching parameters and the corresponding thresholds for the trigger variable, we can measure market efficiency by analysing the time the asset price spends in each state. Using the thresholds, we can divide the data into efficient periods (corresponding to periods when ) and inefficient periods (corresponding to periods when statistically ). If we have a total sample size with efficient and inefficient periods, we can argue that the proportion of time the asset market is efficient is and the proportion of time it is inefficient is
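The time-in-state measure described above amounts to counting observations on either side of the estimated thresholds. The following sketch is illustrative only; the function name `efficiency_shares` and its arguments are hypothetical, and it assumes that an outer regime is counted as inefficient only when its slope estimate is statistically different from the efficient-market value.

```python
import numpy as np

def efficiency_shares(trigger, lo, hi, low_significant=True, high_significant=True):
    """Share of the sample spent in efficient and inefficient states.

    Observations with trigger < lo or trigger >= hi fall in the outer regimes.
    An outer regime adds to the inefficient count only if its slope was found
    to be statistically significant (passed in by the caller).
    """
    trigger = np.asarray(trigger, dtype=float)
    n_total = trigger.size
    n_low = int(np.sum(trigger < lo))
    n_high = int(np.sum(trigger >= hi))
    n_inefficient = n_low * bool(low_significant) + n_high * bool(high_significant)
    return {"efficient": (n_total - n_inefficient) / n_total,
            "inefficient": n_inefficient / n_total}
```

For example, with ten observations of which two lie below the lower threshold and two at or above the upper threshold, the inefficient share is 0.4 if both outer slopes are significant and 0.2 if only one is.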
Note, that if the and that minimize the sum of squared residuals are close, it implies that markets are rarely fully efficient (provided that the auto-regressive parameters for the sub-samples are significantly different from 0). Following convention from Section 2, indicate the first state, the second state and so on. If the series contains one mean reverting, one efficient and one explosive state we should find that, and or that and given how the recursive procedure and the thresholds operate (it also depends on the relationship between and the parameters as well as the value of the drift parameters).
We need to restrict neither the number of states nor the values that the auto-regressive parameters take (except the one corresponding to the efficient state). Thus, we may observe multiple mean reverting or explosive states if we consider more than three states. We restrict our empirical application to the three-state case for tractability. Increasing the number of states beyond three will increase computational time substantially. We provide an illustrative example in the sub-section below.
We would like to note that the empirical example is illustrative in nature and is primarily aimed at showing how the methodology should be used in empirical work. To be more rigorous any empirical application aiming to derive firm and robust conclusions on market efficiency will need to justify the use of a trigger and also be specific with model selection (i. e. whether to use a specification with or without a switching or constant drift term). Since our example is illustrative, we provide results for all model specifications discussed in Section 2 and comment on how model specification changes the efficiency metric.
While we do not touch upon forecasting and predictability of states in this article, we would like to make some observations. Firstly, if the exogenous trigger is i.i.d and is not a leading indicator of a regime change, investors or the public will not be able to predict when the switches happen, as the switching probabilities do not change from period to period. However, when the exogenous trigger is a Markov chain, stakeholders may be able to estimate the probability of a particular asset market being in a certain state, which would enable them to make more informed decisions. In either case the exogenous variable must be specified a priori. Secondly, the exogenous trigger may have predictive power if it is a leading indicator of the regime; investors will be able to anticipate a regime change with some probability if the exogenous trigger approaches regime-switching thresholds and if its probability distribution is known. As we note below, the exogenous triggers that we use for our empirical application are more likely to be Markovian. We also note that in order to forecast the asset price beyond 1 period, we would also need to forecast the exogenous trigger. Work on unobserved state-switching and corresponding efficiency metrics has also been considered and is currently in progress.
Our procedure allows us to say with a chosen probability (we chose a 95 % confidence interval) whether the asset market under question is inefficient at a particular point in time. It may be argued that the transition from efficiency to inefficiency is smoother and thus, a smooth transition model may be more viable.3 While the argument is valid, a different definition of efficiency will be required if a smooth transition model is considered and we may have to arbitrarily identify regions which we classify as efficient or inefficient, based on the steady-state or dynamic properties of the trigger variable. Here, we have identified a conventional criterion (a random walk process) to classify an asset market as efficient. Secondly, when we refer to inefficiency in the context of our article, we are aware that the asset market is inefficient in a probabilistic sense, i. e. we are 95 % confident that an asset market may be inefficient. Thus, our assertions are not absolute. The empirical application below will clarify how we intend the methodology to be used.
4.2 Empirical Application
For our empirical study we use the S&P 500 and FTSE 100 indices. Data for the two indices were obtained from Google Finance and Bloomberg. The setting in Section 2 suggests that the states are triggered by an exogenous variable. We use the University of Michigan’s Index of Consumer Sentiment as our exogenous trigger (MCSI henceforth). We also employ VIX as an additional exogenous trigger and compare the results. The lower bound on the MCSI is 51 and the upper bound is set at 112 so that the regression in eq. (22) can be estimated; for the VIX the lower bound is 10 and the upper bound is 70. We use a 200 point grid on both exogenous triggers.
The MCSI is a monthly survey collected by the University of Michigan. It asks questions on personal finance and economic trends of individuals and households through telephonic interviews. Survey respondents are representative of the American population and each month more than 500 interviews are conducted. More information on the Survey and sample design is available on the MCSI website. The VIX on the other hand is a measure of implied volatility for the S&P500 derived from a variety of different options available on the S&P500. It seeks to estimate future volatility in the index and is therefore representative of investors’ view on the future direction of the stock market index.
Data for the S&P500 index and the MCSI were obtained from January 1978 to June 2015. The underlying assumption required for the MCSI and VIX to be valid trigger variables in this setting is that contemporaneous and one-period lagged values of the two indices do not impact either the MCSI or the VIX. For the VIX, data start from 1st February 1990, but are available at a daily frequency. Thus, for VIX regressions we use daily data from the 1st February 1990 up to 30 June 2015 for both the S&P500 and the FTSE100 Indices.
While data for the MCSI are available from 1964, the survey was initially collected twice a year and the index only becomes a monthly index in 1978. The FTSE 100 index on the other hand is used from its inception in 1984. For the MCSI regressions we use monthly data. The use of MCSI and VIX as trigger variables for FTSE 100 is justified by the strong correlation between the S&P 500 and FTSE 100 indices. The MCSI and VIX do not appear to be independently distributed (first order autocorrelation > 0.9 for MCSI and 0.98 for VIX); thus, they are closer to being Markovian trigger variables.
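The persistence of a candidate trigger can be checked with a sample first-order autocorrelation. The sketch below is a generic estimator offered for illustration; the function name `lag1_autocorr` is hypothetical and no claim is made that this exact estimator produced the figures quoted above.

```python
import numpy as np

def lag1_autocorr(x):
    """Sample first-order autocorrelation of a series.

    Values near 0 are consistent with an i.i.d trigger; values near 1
    suggest a persistent (for example Markovian) trigger.
    """
    x = np.asarray(x, dtype=float)
    d = x - x.mean()                      # deviations from the sample mean
    return float((d[1:] @ d[:-1]) / (d @ d))
```

Applied to a monthly or daily trigger series, a value above 0.9 would indicate that the i.i.d assumption is unlikely to hold.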
Although we have not explicitly calculated moments for the case when the trigger variable is Markovian, we refer the interested reader to Knight and Satchell (2011), which discusses these results for the two-state case with a constant drift. We did test with a variety of other potential trigger variables but they failed basic Granger causality tests and thus, cannot be considered exogenous to the indices. Selecting a Markovian variable does not impact our estimators or our estimation strategy; however, we will not be able to use the formulae derived in Section 2 to calculate the mean and variance of our assets. Nevertheless, the formulae offer valuable insights as we will note in our discussion. We first discuss the results using MCSI as the exogenous trigger.
High consumer sentiment, i. e. a positive outlook towards personal finance and the general business environment in the country, is reflected in a high value of the index. Low consumer sentiment, on the other hand, is reflected in a lower value. We posit that high consumer sentiment that persists for long periods is indicative of explosiveness or bubbles: if consumers have a very positive outlook they are likely to invest in assets, and if a large number of consumers enter the asset markets (in this case the stock market), the increase in demand could lead to a switch from efficiency to explosiveness.
Similarly when lower values persist we posit that the market is correcting itself and we get mean reverting behaviour. Figure 1 shows the MCSI and the log of the S&P500 index indicating how the MCSI varies with the log of S&P500 index. We see a spike in consumer sentiment in the run up to the dot com bubble. A similar increase is seen near the 2008–2009 financial crisis. Mean reverting behaviour is observed after the 1979 oil crisis as well as in the aftermath of the financial crisis. While the MCSI may not always respond contemporaneously to movements in the S&P500 index, it nevertheless acts as a valuable trigger variable for our illustrative example.
We estimate the autoregressive and drift parameters in eq. (22) using 3 states for the S&P500 and FTSE100 indices respectively and use the MCSI as the trigger variable. The dependent variable in the regression is asset returns instead of log prices in order to ensure consistency of standard errors. We could have used log levels instead of returns, but returns are more intuitive; moreover, using levels we would find some non-stationary states. Using the return formulation also ensures that the criterion for the existence of a stationary distribution, specified in Section 2, is satisfied.
We estimate the model with and without the switching drift term. When we do use a drift term, we report results with both a switching drift term, i. e. the drift changes in each state and a constant drift term, i. e. the drift does not change across states. The most commonly employed specification in related literature is that with a constant drift. Our aim is to find thresholds and for the MCSI that minimize the residual sum of squares for the threshold auto-regression which in turn also yield the parameter estimates for the 2 inefficient states.
Note, that we do not impose any restrictions on the parameters of the other two states; both states may be mean reverting or explosive. The only restriction imposed is. We use the procedure outlined above to estimate the threshold, and . 51 and 112 represent the extreme values taken by the trigger variable, MCSI in this case. For our assets these values are reported in Table 7 below and include results for both stock market indices with and without a (switching/constant) drift term. The table also notes the time the stock-market index is in each of the 3-states; this will allow us to comment on the proportion of time each of the stock-market indices is efficient or inefficient.
We postulate that if MCSI is low and when MCSI is high (the postulated relationship will vary based on our choice of trigger variables). Columns (8) and (9) in Table 7 report the thresholds corresponding to the minimum sum of squared residuals. Note that for the specification with drift, the intercept is also switching and, as illustrated in Section 2, this changes the mean and variance of the process (if they exist) significantly.
Since we use different model specifications for this example, it is no surprise that the results in Table 7 present a mixed picture. When we consider the case of a shifting drift term in addition to a shifting slope coefficient, a lot of the variance in the series is captured by the shifting drift term. Inclusion of a moving drift substantially reduces the impact of the moving slope terms and we find no more than 32 observations in non-efficient regimes (20 for S&P500 and 32 for FTSE100). This amounts to 5 % inefficiency for the S&P500 and 8 % for the FTSE100. For both indices we also note that the first state is not statistically significant so we may not classify that state as inefficient. We also find little evidence of explosive behaviour due to the slopes.
Thus, explosive and mean reverting episodes under a model with a moving drift are primarily caused by the change in drift. Note that the drift terms are larger in magnitude and appear farther apart which implies that they have a higher variance. As per our formulae in Section 2, a higher variance of the drift parameter leads to a higher variance of the series. We find that the drift term in the explosive regime is statistically significant and greater than the drift term in other regimes. Thus, under this specification explosiveness in the S&P500 and FTSE100 indices is due to a temporary increase in average returns. With a constant drift term, both indices appear stationary and do not have explosive regimes.
One way to systematically beat the market in such a situation will be through predicting when the switches will occur provided that investors are aware of what state they are in as soon as the switch has occurred (and thereby becomes a part of the information set). Thus, we are referring to efficiency in a broader sense. In the conventional auto-regressive sense, a market is said to be efficient if the auto-regressive parameter is 1, i. e. the process is a random walk so that the only change in asset returns is due to unpredictable factors and no gains can be made based on the existing information set. In the threshold auto-regressive case, in addition to the restriction on the auto-regressive parameter we would also require the state switches be unpredictable; although once the switch occurs everyone becomes aware of it. Thus, the information set will also include information about what state the exogenous trigger is in. If markets are weak form efficient all rational investors will find out about the switch at the same time although they may not know when the switch may occur.
For the specification without a drift the results are closer to the behaviour observed in the simulations, i. e. we observe three states although the deviation from efficiency is very small. When , i. e. we are in the explosive or bubble regime, we observe additional annualized gains of only 1.5 % in the S&P500 index and 1.1 % in the FTSE100 index. It may be argued that the additional annualized gains being captured by the parameter are in fact accounting for the missing drift term. For models with a switching drift, the criterion for a steady-state stationary distribution is trivially satisfied as for both FTSE100 and S&P500 we do not find an explosive slope coefficient. Figure 2 shows the areas that fall under the different states under this specification for log prices based on the thresholds estimated by the procedure outlined above.
We note that the first state corresponds to periods of relative slowdown, i. e. the aftermath of the second oil price crisis, the immediate aftermath of the financial crisis and late 2011, when fears of a double dip recession abounded. The other non-efficient state occurs in the run up to the East Asian financial crisis and the dot-com bubble, when consumer sentiment was at an all-time high. Our grid-search results do not indicate a deviation from a random walk during the financial crisis. The area under the non-random walk states has been shaded (blue for mean-reverting and red for explosive). Figure 3 shows similar results for the FTSE100 index. Note that apart from the dot-com bubble period in early 2000, 1998 is identified as a period of explosiveness for both indices. Both indices attained historical highs in 1998, which is reflected in consumer sentiment.
By regressing log prices on their lags instead of returns (i. e. adding 1 to each coefficient estimated in Section 3), we can calculate the value of the criterion function specified in Section 2 which allows us to comment on whether the series has a stationary distribution. For the case without a drift, the value of the criterion for the S&P500 and FTSE100 is 0.0010 and 0.0005 respectively (the criterion in this case is calculated as +1). Neither of the two indices satisfies the criterion for a steady-state stationary distribution under specifications without a drift. This indicates that any test for explosiveness that either assumes a constant drift term without shifting slope coefficients (not reported) or that drops the drift term is more likely to find the criterion violated for the S&P500 and FTSE100 indices.
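The conversion from return-regression slopes to level coefficients, and the evaluation of a stationarity criterion, can be illustrated as follows. This sketch rests on an assumption that should be checked against Section 2: it uses the log-moment condition familiar from random-coefficient AR(1) models, under which a stationary distribution requires the probability-weighted average of the log absolute level coefficients to be negative. The function name `log_moment_criterion` and its arguments are hypothetical, and the regime probabilities are taken as given by the caller.

```python
import numpy as np

def log_moment_criterion(return_slopes, regime_probs):
    """Probability-weighted mean of log|1 + slope| across regimes.

    return_slopes: slope estimates from the return regression, one per regime
                   (adding 1 gives the coefficient of the level regression).
    regime_probs : share of time spent in each regime; must sum to 1.
    A negative value is consistent with the assumed stationarity condition;
    a value of zero or above indicates that the condition is violated.
    """
    slopes = np.asarray(return_slopes, dtype=float)
    probs = np.asarray(regime_probs, dtype=float)
    if slopes.shape != probs.shape:
        raise ValueError("one probability is required per regime")
    if not np.isclose(probs.sum(), 1.0):
        raise ValueError("regime probabilities must sum to 1")
    level_coefs = 1.0 + slopes            # return slope -> level coefficient
    return float(np.sum(probs * np.log(np.abs(level_coefs))))
```

Under this assumed form, a small positive value such as those reported above would arise when the explosive regime slightly outweighs the mean reverting one.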
As mentioned before, if MCSI were an independent and identically distributed variable we would be able to use our formulae from Section 2 to calculate the mean and variance for both the S&P500 and the FTSE100 series when they are estimated using a threshold auto-regression. This would have allowed us to compute metrics such as Sharpe ratios, enabling us to comment further on market efficiency and investor behaviour. Since MCSI is closer to a Markovian variable we are unable to use the formulae derived earlier. However, our results do allow us to compare efficiency across the two markets. In the following discussion, when we talk about inefficient states we are referring to the proportion of time spent by each index in a state that is statistically significantly different from the random walk.
When specifications with a drift are considered, the FTSE100 index appears more inefficient than the S&P500 index. The S&P500 index is inefficient for 2 % of the time with the switching drift specification and 86 % of the time with a constant drift. In contrast the FTSE 100 index is inefficient for 6.7 % of the time under the switching drift specification and 93 % of the time under the constant drift specification. On the other hand when no drift is included, the S&P500 appears mostly inefficient (95.5 %) compared to the FTSE 100 (68.7 %). If we compare similar periods, i. e. from 1981 onwards, the results remain robust. The mixed results do not offer a clear answer as to which market appears more inefficient; nevertheless the methodology is applicable to other assets. If an asset appears to spend more time in inefficient states under all specifications compared to another we will be able to conclude that the market for that particular asset is inefficient more often. Our results using VIX as the exogenous trigger shed further light on the efficiency question. Table 8 reports the results for when VIX is used as the exogenous trigger.
Before we discuss the regression results using VIX in detail, it is important to highlight the differences in the two exogenous triggers. While data for the MCSI are only available at a monthly frequency, VIX is reported daily; this allows us to estimate the regression using daily returns thereby allowing us to measure market efficiency at a higher frequency. Thus, VIX is measuring short-term market efficiency while with MCSI we were able to measure medium-term market efficiency. However, since the data for VIX are only available from 1990, we are unable to include observations from the oil price crises and large stock market downturns in the 1970s and 1980s.
Figure 4 plots the S&P500 and the VIX. The pattern suggests that periods of high volatility, marked by a high value of the VIX, tend to correspond to low values of the S&P500 index. Thus, we detect high volatility in the market whenever the S&P500 index is facing a downturn. No such pattern is immediately visible for the opposite scenario: high values of the S&P500 do not always correspond to low values of the VIX. It appears that while the VIX may be a good potential trigger for the stationary or mean reverting regime, it may not have the necessary variance for the explosive state. Also, as we pointed out initially, the VIX is a measure of investor sentiment while the MCSI is a measure of consumer sentiment. This is another reason why the two sets of results can differ; the information sets of consumers and investors may be quite different. Since efficiency is defined with respect to an information set, it is no surprise that the results vary, although we do note some common trends. The prevailing market price that we observe results from the different demand functions of consumers and professional investors. Both indices contain valuable information and it is advisable to consider both sets of regression results.
While results from the MCSI were mixed, the ones obtained when VIX is used as the exogenous trigger are largely in favour of efficient markets. As before, we restricted state 2 to be the efficient state; however, our estimates with VIX contain significant evidence that there are only 2 as opposed to 3 states. is always close to 0 (to 3 decimal places or more) or is statistically insignificant, irrespective of the model specification. It is difficult to distinguish this state from the efficient state. on the other hand does indicate the presence of inefficient periods.
Although the proportion of inefficient observations is small (1 % or less), it accounts for around 50 periods/days. Figure 5 indicates that these tend to occur in 2008, in the aftermath of the financial crisis when volatility as recorded by VIX was very high. These periods also coincide with the periods of mean reversion detected when MCSI was employed as the exogenous variable. We note, however, that the procedure does not indicate the presence of explosive states in this case. It neither detected the dotcom bubble, nor the period in the run up to the financial crisis. The MCSI on the other hand, did flag the dotcom bubble period as inefficient. As mentioned before, VIX does not appear to be a good predictor of the explosive regime although it does do much better for indicating the onset of mean reversion. The results for FTSE100 looked similar and are reported in Figure 6.
Taking both the MCSI and VIX results together, our empirical results supplement our findings in Sections 2 and 3. Explosiveness is more likely to be detected in asset price series where the criterion function is violated and the variance of the switching slope parameters is large (i. e. there are many regimes or the regimes are much farther apart). Inclusion of a switching drift term may explain most of the explosiveness and may make the price series appear efficient. Another way of analysing results could be through comparing the different series.
The set of results reported above also depend on the selection of the trigger variable as we have noted. Finding an appropriate trigger variable that may indicate switches in regimes is non-trivial in practice and will require rigorous theoretical, empirical or experimental basis so that regime identification criteria can be appropriately set. We also need to consider the time period for which we are measuring market efficiency. As we have seen above, markets may appear to be more efficient when higher frequency data are considered.
While we primarily relied on Granger causality tests to determine appropriate lags for the trigger variables and for determining exogeneity, it could be argued that further testing on exogeneity, including testing for non-linear relationships, may be warranted. This requires judgement on the part of the researcher and depends on the specific asset price being considered and the relationship between the asset price and the trigger variable. We did consider other potential triggers including the fear and greed index and the economic policy uncertainty index, but these were either not exogenous or did not include sufficient data. Furthermore, it is also possible to consider a combination of exogenous triggers which would combine elements of investor and consumer sentiment. We leave this exercise for future research as our example was intended as an illustration of the outlined methodology. Additionally, the researcher also needs to consider the number of states to be used. A price series could exhibit multiple explosive or mean reverting states.
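A basic linear Granger-type check of trigger exogeneity can be sketched as a comparison of restricted and unrestricted least-squares regressions. The code below is a simplified illustration, not the test specification used for the results above; the names `granger_f_stat`, `target` and `candidate` are hypothetical, and the statistic would still need to be compared with an F critical value for the chosen lag length and sample size.

```python
import numpy as np

def _ssr(y, X):
    """Sum of squared residuals from a least-squares fit of y on X."""
    coef, *_ = np.linalg.lstsq(X, y, rcond=None)
    resid = y - X @ coef
    return float(resid @ resid)

def granger_f_stat(target, candidate, lags=1):
    """F statistic for H0: lags of `candidate` do not help predict `target`.

    To examine whether a trigger is exogenous to an index, pass the trigger
    as `target` and the index returns as `candidate`; a large statistic
    suggests that the index helps predict the trigger.
    """
    target = np.asarray(target, dtype=float)
    candidate = np.asarray(candidate, dtype=float)
    if target.shape != candidate.shape:
        raise ValueError("series must have the same length")
    n_total = target.size
    y = target[lags:]
    # Column k holds the series lagged by k periods, aligned with y.
    own = np.column_stack([target[lags - k:n_total - k] for k in range(1, lags + 1)])
    other = np.column_stack([candidate[lags - k:n_total - k] for k in range(1, lags + 1)])
    X_restricted = np.column_stack([np.ones(y.size), own])
    X_unrestricted = np.column_stack([X_restricted, other])
    ssr_r = _ssr(y, X_restricted)
    ssr_u = _ssr(y, X_unrestricted)
    dof = y.size - X_unrestricted.shape[1]
    return ((ssr_r - ssr_u) / lags) / (ssr_u / dof)
```

This captures only linear dependence at the chosen lags, which is why additional tests for non-linear relationships may be warranted.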
Our methodology can thus work in practice for different asset markets. We have shown with our example that the methodology may be used to identify the incidence of market efficiency for different assets while also highlighting the importance of model-specification when testing for market efficiency. Model specification is not important just in terms of estimation but completely changes the theoretical meaning of the results. Once a researcher has identified an appropriate specification for an asset price or return based on either technical analysis or through solving a structural model, our methodology will allow her to comment on market efficiency for that asset. However, irrespective of the specification, the results may still be used to compare different markets and identify which markets are more efficient for the given information set.
In this article we have extended existing general conditions that need to be satisfied by threshold autoregressive models in order to have a steady-state distribution and for a mean and variance to exist and have provided formulae for them. The results have been extended to include the case where a switching drift is included in addition to a switching coefficient parameter. We have also considered the case of models with and without drift, specifically considering an i.i.d variable as an exogenous regime switching trigger. We believe that the results can be extended to other types of trigger variables, such as Markovian trigger variables, although we do not evaluate analytical expressions for such cases in this article. We have shown that when a steady-state distribution does exist for a TAR(1) model with a switching drift term, the variance depends on the variance of the error term, the variance of the drift parameters, the variance of the coefficient parameters as well as the covariance between the drift and coefficient parameters.
A simulation study is carried out to evaluate the power of the GSADF bubble detection test under conditions where a steady-state distribution may not exist. Our simulation study has shown that if a series has a steady-state distribution, bubbles may be more difficult to detect. We further note that the power of such tests increases with the variance of the regime parameters, i. e. the farther apart the parameters are from unity, the higher is the power. These results enable us to understand why bubble tests may fail to detect explosiveness even though it may be locally present in a series.
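To make the simulation setting concrete, the sketch below generates a TAR(1) series without drift whose regime is selected each period by an i.i.d trigger. It is an illustrative data-generating routine only and does not implement the GSADF test; the function name `simulate_tar`, its arguments and the default error scale are assumptions made for the example.

```python
import numpy as np

def simulate_tar(n, betas, probs, sigma=0.01, p0=0.0, seed=0):
    """Simulate p[t+1] = beta[s_t] * p[t] + eps[t] with i.i.d regime draws.

    betas: auto-regressive coefficient of each regime (for example a mean
           reverting value below 1, a random walk value of 1, and an
           explosive value above 1).
    probs: probability of each regime; must sum to 1.
    Returns the simulated path (length n + 1) and the regime index per period.
    """
    rng = np.random.default_rng(seed)
    betas = np.asarray(betas, dtype=float)
    regimes = rng.choice(len(betas), size=n, p=probs)   # i.i.d trigger outcome
    eps = rng.normal(0.0, sigma, size=n)
    path = np.empty(n + 1)
    path[0] = p0
    for t in range(n):
        path[t + 1] = betas[regimes[t]] * path[t] + eps[t]
    return path, regimes
```

Varying how far the regime coefficients lie from unity, and how often each regime occurs, changes whether a stationary distribution exists and is the kind of design over which the power of a bubble test can be compared.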
We also provide a methodology that may be used in practice to estimate TAR models with exogenous i.i.d trigger variables or Markovian triggers. Using non-linear least squares we find threshold levels that minimize the sum of squared residuals. Our results indicate that model specification is critical when analysing weak form market efficiency using price series. Series that may appear to exhibit inefficiency when a financial analyst assumes no drift will appear efficient when a regime-switching drift term is used which highlights the need for carefully considering model specification prior to estimation. The methodology allows us to state how often a market is inefficient instead of making a binary classification.
Our empirical results vindicate our theoretical findings, i. e. the variance of a price process depends not only on the regime-switching coefficients but also on the regime-switching drift term. Additionally we also extend the notion of efficiency to include the predictability of state switching, i. e. a market is more efficient if state switching is unpredictable. We believe this methodology is applicable to a variety of different markets including commodity and foreign exchange markets. The methodology also allows us to compare different asset markets and comment on their efficiency relative to one another. We found evidence that the S&P500 and FTSE100 are mostly efficient under the most generalized (switching-drift) specification although there are pockets of data where inefficiency is noted.
Multiple avenues of further research open up as a result of this article. The theoretical results may be further expanded to include a Markovian process as the regime switching variable or to consider results for TAR(p) models. In addition, our simulation results highlight one limitation of the GSADF test and also the need for bubble tests that may be applied locally or on sub-samples, as considering the full price process may make detection difficult. Although we have used only non-linear least squares for our illustration, other methodologies including Markov-switching regressions may also be applied.
Our empirical methodology opens up a wide array of possibilities for financial analysts and econometricians alike who may be interested in understanding market efficiency in different markets. In particular, trend-following commodity trading analysts could use such procedures to determine which markets are efficient most of the time and avoid trading dynamically in them unless their mean-variance properties make them intrinsically appealing. This article also raises the question of what exogenous trigger variables may be most appropriate for a particular asset market. Finally, identification of a suitable trigger may have policy implications, i. e. the government may try to influence expectations about these variables in order to move asset markets towards efficiency although we do caution against using tenuous relationships to draw policy conclusions.
References
Bloomberg, L.P. 2016. S&P500, FTSE100 and VIX daily data series, 01/02/1990 to 30/06/2015. Retrieved on 13 January 2016 from Bloomberg database.
Evans, G.W. 1991. “Pitfalls in Testing for Explosive Bubbles in Asset Prices.” American Economic Review 81: 1189–1214.
“Google Finance”. S&P 500 Index. Retrieved on 03/09/2015. https://www.google.co.uk/finance
“Google Finance”. FTSE 100 Index. Retrieved on 03/09/2015. https://www.google.co.uk/finance
Hamilton, J.D. 1989. “A New Approach to the Economic Analysis of Nonstationary Time Series and the Business Cycle.” Econometrica: Journal of the Econometric Society 57: 319–352.
Knight, J., and S.E. Satchell. 2011. “Some New Results for Threshold AR(1) Models.” Journal of Time Series Econometrics 3: 1–42.
Knight, J., S.E. Satchell, and N. Srivastava. 2014. “Steady State Distributions for Models of Locally Explosive Regimes: Existence and Econometric Implications.” Economic Modelling 41: 281–288.
Marsh, T.A., and R.C. Merton. 1986. “Dividend Variability and Variance Bounds Tests for the Rationality of Stock Market Prices.” The American Economic Review 76 (3): 483–498.
Phillips, P.C.B., S. Shi, and J. Yu. 2013. “Testing for Multiple Bubbles: Historical Episodes of Exuberance and Collapse in the S&P 500.” In Cowles Foundation Discussion Papers 1914. Cowles Foundation for Research in Economics, Yale University.
Samuelson, P.A. 1965. “Proof that Properly Anticipated Prices Fluctuate Randomly.” Industrial Management Review 6 (2): 41–49.
Sewell, M. 2011. A History of the Efficient Market Hypothesis. Research Note: UCL Department of Computer Science.
Shiller, R.J. 2000. Irrational Exuberance. Princeton, NJ: Princeton University Press.
Tong, H. 1978. “On a Threshold Model.” In Chen, C.H. (Ed.), Pattern Recognition and Signal Processing. Amsterdam: Sijthoff and Noordhoff.
Tong, H. 2011. “Threshold Models in Time Series Analysis – 30 Years on (With Discussions by P. Whittle, M. Rosenblatt, B. E. Hansen, P. Brockwell, N. I. Samia and F. Battaglia).” Statistics and Its Interface 4: 107–118.
University of Michigan. The Index of Consumer Sentiment. Retrieved on 03/09/2015. https://data.sca.isr.umich.edu/tables.php
A smooth transition autoregressive model (STAR) is written as: where is the exogenous trigger and is the cumulative distribution of the exogenous trigger. We have outlined the two-state case, but the model can potentially have multiple states. can be a smooth function such as a logistic function. To define efficiency in this context, we will need to define a region of the density function which we can classify as the efficient region (Tong 2011).