
Studies in Nonlinear Dynamics & Econometrics

Ed. by Mizrach, Bruce


Volume 19, Issue 1


Regime-switching cointegration

Markus Jochmann / Gary Koop
Published Online: 2014-02-28 | DOI: https://doi.org/10.1515/snde-2012-0064


We develop methods for Bayesian inference in vector error correction models which are subject to a variety of switches in regime (e.g., Markov switches in regime or structural breaks). An important aspect of our approach is that we allow both the cointegrating vectors and the number of cointegrating relationships to change when the regime changes. We show how Bayesian model averaging or model selection methods can be used to deal with the high-dimensional model space that results. Our methods are used in an empirical study of the Fisher effect.

This article offers supplementary material which is provided at the end of the article.

Keywords: Bayesian; cointegration; Markov switching; model averaging; structural breaks

JEL: C11; C32; C52

1 Introduction

Two of the most important challenges of modern empirical macroeconomics involve the wish to incorporate restrictions suggested by economic theory and the empirical need to allow for parameter change in multivariate time series models. With regard to the former, cointegration has played an important role as economic theory often suggests particular cointegrating relationships which the researcher may wish to impose or test for. As one example, consider the UK macroeconomic model of Garratt et al. (2003). This uses the purchasing power parity relationship, an interest rate parity condition, a neoclassical growth model, the Fisher hypothesis and a theory of portfolio balance to build a macroeconometric model involving five cointegrating relationships. With regard to the latter, papers such as Ang and Bekaert (2002) and Stock and Watson (1996) document widespread evidence of parameter change in many macroeconomic time series. In the field of cointegration, there are a large number of theoretical and empirical papers that model breaks or other forms of nonlinearity in cointegrating relationships, present empirical results relating to cointegration work using subsamples of the data or attribute failures of cointegration tests to parameter change (see, among many others, Michael, Nobay, and Peel 1997; Quintos 1997; Park and Hahn 1999; Lettau and Ludvigson 2004; Saikkonen and Choi 2004; Andrade, Bruneau, and Gregoir 2005; Bierens and Martins 2010; Beyer, Haug, and Dewald 2011).

All this work provides evidence of widespread empirical and theoretical interest in cointegration models with changing cointegrating spaces. However, with few exceptions (e.g., Martin 2000; Paap and van Dijk 2003; Sugita 2006; Koop, León-González, and Strachan 2008) this work is non-Bayesian. One purpose of the present paper is to provide a set of Bayesian tools for working with Vector Error Correction Models (VECMs) in the presence of changes in regime. The previous Bayesian work with time-varying cointegration typically assumes that the cointegrating rank is constant across regimes (e.g., Koop, León-González, and Strachan 2011) or works with much simpler model spaces than the one considered here (e.g., Martin 2000; Paap and van Dijk 2003; Sugita 2006).

A second purpose of this paper is to address the issues that arise with cointegration models because the model space can be large. The researcher will typically consider models with different cointegrating ranks, different restrictions imposed on the cointegrating relationships, different lag lengths, different treatment of deterministic terms, etc. In previous work (see Koop, Potter, and Strachan 2008; Jochmann et al. 2013), we have developed Bayesian methods to navigate through such high-dimensional model spaces in the constant coefficient VECM. In this paper, we work with models with regime change and, in these, the dimension of the model space is greatly increased. For instance, we may wish to allow the cointegrating rank to differ across regimes or a restriction implied by economic theory to hold at some time periods but not others (e.g., we might have purchasing power parity holding in the 1970s but not the 1980s). Furthermore, in practice it is typically unclear what determines changes in regime. Of models that allow for regime change, structural break models assume breaks occur at specific points of time and regimes do not recur. Markov switching models allow for regimes to recur (e.g., the model switches between expansionary and recessionary dynamics). It is empirically sensible to work with a model space that allows for a range of such possibilities. Thus, a final contribution of this paper lies in the fact that we offer a richer treatment of regime change, allowing for both structural break and Markov switching behavior. We show how, regardless of whether the researcher wishes to do Bayesian model averaging (BMA) or select a single model, the Bayesian approach is an attractive one in model spaces of this dimension.

Our methods are applied in an empirical exercise investigating the Fisher effect.

2 VECMs with regime switching

2.1 A general framework

A general unrestricted VECM for an n-dimensional vector yt can be written as:

$$\Delta y_t = \alpha' \beta' y_{t-1} + \sum_{j=1}^{p-1} \gamma_j \Delta y_{t-j} + \varepsilon_t, \tag{1}$$

where α is a full rank r×n matrix, β is a full rank n×r matrix, γj is n×n and εt~N(0, Σ). r and p are the number of cointegrating relationships and lag length, respectively. For notational simplicity, we have not included deterministic terms in (1). See, e.g., Johansen (1995, Section 5.7) or Franses (2001) for a discussion of deterministic terms in VECMs.

A wide range of regime switching VECMs can be obtained by adding st subscripts to the parameters in (1), leading to:

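$$\Delta y_t = \alpha_{s_t}' \beta_{s_t}' y_{t-1} + \sum_{j=1}^{p-1} \gamma_{j,s_t} \Delta y_{t-j} + \varepsilon_t, \tag{2}$$

with $\varepsilon_t \sim N(0, \Sigma_{s_t})$ and $s_t \in \{1, \ldots, M\}$ indicating which of $M$ regimes applies at time $t$. Importantly, we assume $\alpha_{s_t}$ is $r_{s_t} \times n$ and $\beta_{s_t}$ is $n \times r_{s_t}$ so that the cointegrating rank can change when the regime changes.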

Examples of models that can be put in this framework include Markov switching, other regime switching models such as endogenous threshold models, structural break models and time-varying parameter models. In the Bayesian multivariate time series literature, the emphasis has been on extensions to Vector Autoregressive (VAR) models. Prominent examples include the Markov switching VAR of Sims and Zha (2006) and the time varying parameter (TVP) VARs of Cogley and Sargent (2005) and Primiceri (2005). However, VARs are parameter-rich models and VARs with regime change are even more parameter rich. This has led to approaches which attempt to mitigate over-parametrization worries by using shrinkage priors and impose restrictions. Cointegration provides a good source of potential restrictions (often motivated by economic theory) which can help achieve parsimony.

2.2 Modeling the regime switching process

Many specifications for $S_T = (s_1, \ldots, s_T)'$ are possible. For instance, Koop, León-González and Strachan (2011) set $s_t = t$ and $M = T$, resulting in a TVP-VECM. However, their model assumes a common cointegrating rank at all points in time, an assumption we wish to relax in the present paper. Sugita (2006) assumes a uniform prior over break dates which involves the assumption that $s_t$ is sequentially increasing (i.e., $s_t = s_{t-1} + 1$ if a break occurs at time $t$). Such an approach can be computationally daunting in the case of multiple breaks. That is, with one break in a sample of size $T$ there are on the order of $T$ possible break dates, but with $M-1$ breaks this increases to the order of $T^M$, which can lead to a serious computational burden if additional structure is not placed on $S_T$.

Many approaches in the literature can be interpreted as placing a particular structure on ST using hierarchical priors. In this paper, we consider one class of hierarchical priors using Markov specifications for ST. These are empirically popular in many contexts and convenient and computationally efficient MCMC algorithms exist (e.g., Chib 1996). A standard Markov switching specification of the sort used, e.g., in Sims and Zha (2006) has:

$$\Pr(s_t = j \mid s_{t-1} = i) = \xi_{ij}, \quad i, j = 1, \ldots, M, \tag{3}$$

where ξij is the probability of switching from regime i to regime j. In the Markov switching model no restrictions (other than the ones implied by probabilities summing to one) are placed on the ξij.

Chib (1998) notes that a Markov switching model can be turned into a structural break model by placing restrictions on the ξij. In particular, he sets ξij=0 for all i and j except for the following:

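$$\begin{aligned}
\Pr(s_t = i \mid s_{t-1} = i) &= \xi_{ii}, & i &= 1, \ldots, M-1, \\
\Pr(s_t = i+1 \mid s_{t-1} = i) &= 1 - \xi_{ii}, & i &= 1, \ldots, M-1, \\
\Pr(s_t = M \mid s_{t-1} = M) &= 1.
\end{aligned} \tag{4}$$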

It can be seen that this leads to a model with M–1 structural breaks. That is, if regime i holds at time t–1, then at time t the process can either remain in regime i (with probability ξii) or a break occurs and the process moves to regime i+1 (with probability 1–ξii). The process moves through regimes sequentially (i.e., it cannot jump from regime i to regime i+2). Once a break occurs, the process cannot revert to an old regime (i.e., it cannot jump from regime i to regime i–1).
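As a minimal sketch of the restriction in (4), the following Python builds the structural break transition matrix from the stay probabilities and simulates a regime path. The function names, the 0-based regime labels and the numerical values are illustrative assumptions, not part of the paper.

```python
import numpy as np

def structural_break_matrix(stay_probs):
    """Transition matrix under restriction (4): stay in regime i or move to i+1.

    stay_probs holds xi_11, ..., xi_{M-1,M-1}; the last regime is absorbing.
    """
    stay = np.asarray(stay_probs, dtype=float)
    M = stay.size + 1
    P = np.zeros((M, M))
    for i, p in enumerate(stay):
        P[i, i] = p            # remain in regime i
        P[i, i + 1] = 1.0 - p  # break: move to regime i+1
    P[M - 1, M - 1] = 1.0      # no exit from the final regime
    return P

def simulate_regimes(P, T, rng, s0=0):
    """Simulate T regime labels (0-based) from transition matrix P, starting in s0."""
    s = np.empty(T, dtype=int)
    s[0] = s0
    for t in range(1, T):
        s[t] = rng.choice(P.shape[0], p=P[s[t - 1]])
    return s

rng = np.random.default_rng(0)
P_break = structural_break_matrix([0.98, 0.95])    # hypothetical values, M = 3
P_switch = np.array([[0.95, 0.05], [0.10, 0.90]])  # unrestricted case (3), M = 2
path_break = simulate_regimes(P_break, 200, rng)
path_switch = simulate_regimes(P_switch, 200, rng)
```

Under `P_break` a simulated path can only stay put or step up by one regime, whereas under `P_switch` earlier regimes can recur.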

By modelling ST in terms of a Markov process we obtain a computationally feasible model (using the algorithm of Chib 1996) and can allow for regime switching behavior of various sorts. We can have a conventional Markov switching formulation where VECM coefficients vary over the business cycle (or in some other manner) or a structural break model where coefficients change at particular points in time. These are the two specifications for the break process considered in this paper. However, any specification for the ξij can be used with the methods outlined in this paper and only trivial alterations would be required to accommodate other specifications for ST.

2.3 Model space

The previous material outlines a general modeling framework for regime-switching VECMs. The resulting model space can be large since we allow for both $\beta_{s_t}$ and $r_{s_t}$ to differ across regimes. Furthermore, we may wish to consider models which impose restrictions on $\beta_{s_t}$. For instance, in our empirical work, we consider versions of the model which impose the restriction $\beta_{s_t} = (1, -1)'$, which is the value implied by Fisher’s hypothesis. The cointegration rank $r_{s_t}$ can be either 0 or 1 and we consider lag lengths $p = 1, 2, 3$. In the case of Markov switching models we analyze models with 2 regimes which already gives us 27 models. Repeating the analysis for structural break models with 2 and 3 regimes gives us another 108 models. This does not even include modeling choices such as the treatment of deterministic terms or special cases which will increase the model space even more. With model spaces of this size, sequential hypothesis testing procedures can be risky. BMA (which averages over all models with weights proportional to posterior model probabilities) or model selection (which chooses the single model with the highest posterior model probability) are attractive alternatives.
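The counts of 27 and 108 can be reproduced by enumerating the per-regime cointegration cases and lag lengths. The sketch below assumes three cases per regime (no cointegration, unrestricted rank one, Fisher-restricted rank one, labelled here "0", "1" and "1F") and a lag length common to all regimes; the names are illustrative.

```python
from itertools import product

CASES = ("0", "1", "1F")  # per-regime cointegration cases (labels are illustrative)
LAGS = (1, 2, 3)          # lag lengths p

def enumerate_models(n_regimes):
    """All combinations of one cointegration case per regime and one lag length."""
    return [(cases, p) for cases in product(CASES, repeat=n_regimes) for p in LAGS]

ms_models = enumerate_models(2)                        # Markov switching, M = 2
sb_models = enumerate_models(2) + enumerate_models(3)  # structural breaks, M = 2 and 3
print(len(ms_models), len(sb_models))  # 27 108
```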

Another issue relating to model space size is computation. Our empirical work illustrates our methods with a bivariate example ($n = 2$). Working with $n > 2$ will lead to an increase in the model space since the researcher will typically wish to consider $r_{s_t} = 0, \ldots, n$. In a constant coefficient VECM, this will cause the number of models (and, hence, the computational burden) to increase linearly with $n$. However, in the regime switching cointegration model with $M > 1$ regimes, the number of models will increase at a rate $n^M$.

Furthermore, in our empirical illustration, there is just one cointegrating restriction of interest: $\beta_{s_t} = (1, -1)'$. In models with larger $n$, there will typically be many more. For instance, in the 9-variable VECM of Jochmann et al. (2013) up to four possible restrictions on the cointegration space are considered (individually and in combination with one another). If $n_r$ is the number of possible restrictions on the cointegration space, then with a constant coefficient VECM there are $2^{n_r}$ possible restricted VECMs. With regime switching cointegration with $M$ regimes, this becomes $2^{M n_r}$.

In short, with regime switching cointegration it is very easy for the number of models to become very large very quickly. In this paper, we estimate and calculate the marginal likelihood in every model using posterior simulation methods. We have more than a hundred models, and the resulting computational burden is not too onerous. As a rough order of magnitude, in our empirical example, estimating a single model takes around 1 minute on a good quality PC. Even thousands of models would be feasible. However, some researchers may wish to work with values, say, of n=9, M=4 and nr=4 which would lead to millions of models. With such a large model space, the methods used in this paper would not be computationally feasible. In such cases, the researcher must either restrict the model space in some way or use methods which do not explicitly calculate the marginal likelihood in each model (e.g., the stochastic search variable selection approach of Jochmann et al. 2013).
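As a rough worked illustration of these orders of magnitude, take $n = 9$, $M = 4$ and $n_r = 4$. The rank choices $r_{s_t} = 0, \ldots, n$ give $n + 1$ options per regime, and the restriction patterns give $2^{n_r}$ options per regime:

$$(n+1)^M = 10^4 = 10{,}000, \qquad 2^{M n_r} = 2^{16} = 65{,}536.$$

Either factor alone already exceeds the few thousand models described as feasible above. If the two dimensions were crossed freely (an assumption made here only for illustration, since not every restriction is compatible with every rank), the product would be about $6.6 \times 10^8$, before lag lengths or deterministic terms are varied.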

But these considerations suggest the need for efficient posterior simulation and marginal likelihood calculation methods and it is to these we now turn.

3 Bayesian inference in regime switching VECMs

The Appendix contains complete details on priors, posterior simulation and marginal likelihood calculation. Here we provide a summary of the main ideas involved in each so as to give the reader an overview of how to implement our methods in practice.

3.1 Prior distributions

We let the vector θ collect all model parameters. It contains the VECM parameters {αi}, {βi}, {Γi} and {Σi}, i=1,…,M and the switching probabilities {ξij}, i, j=1,…,M. For the latter it is common (e.g., Chib 1998) to use Beta priors and we follow this practice. Our priors for the VECM parameters are the same as those used in previous work and are in all cases proper (thus, allowing for valid calculation of marginal likelihoods). We assume the priors in different regimes are independent of one another. The reader interested in a detailed motivation is referred to the previous literature (see, e.g., Strachan 2003; Strachan and Inder 2004; Koop, Potter, and Strachan 2008; Koop, León-González, and Strachan 2010) with precise formulae being given in the Appendix.

To motivate the main ideas behind our priors, note that for VARs there is a large literature which uses so-called Minnesota priors (see, e.g., Doan, Litterman, and Sims 1984). Within each regime, the parameters {αi} and {Γi} play a similar role to VAR coefficients (conditional on the {βi}). Accordingly this and other papers in the Bayesian cointegration literature use Normal shrinkage priors with similar properties to Minnesota priors. They reduce worries associated with over-fitting. For the {Σi} inverted Wishart priors are used since they are conditionally conjugate. Typically, in a cointegration analysis it is the prior distributions for the {βi} which are most important. The basic idea of this prior is that, given the lack of identification of the VECM due to the product structure of the terms {αiβi}, it is only the space spanned by the cointegrating vectors which is identified. Hence, in the Bayesian cointegration literature (e.g., Strachan 2003; Strachan and Inder 2004) priors are placed directly over the cointegration space. It can be shown that all priors employed in this paper are proper and, thus, valid marginal likelihoods can be obtained.

3.2 Posterior simulation and marginal likelihood calculation

Efficient posterior simulation in the VECM with the aforementioned prior can be implemented using the algorithm developed in Koop, León-González, and Strachan (2010). An algorithm for efficient posterior simulation in Markov switching models is presented in Chib (1996). Chib (1998) modifies the algorithm of Chib (1996) so as to handle the structural break model. These algorithms are Markov Chain Monte Carlo (MCMC) algorithms which simulate draws of a parameter or latent state conditional on draws of the other parameters or latent states. The MCMC algorithm used for our regime switching cointegration models uses Koop, León-González, and Strachan (2010) to take draws of the VECM parameters in each regime, conditional on ST. Then the algorithm of Chib (1996) or Chib (1998) is used to produce draws of ST conditional on the VECM parameters.

To see why this is a valid algorithm, remember that ST divides the sample into regimes. Thus, conditional on ST, we can use the algorithm of Koop, León-González, and Strachan (2010) to draw the VECM coefficients in each regime. Conditional on posterior draws of the VECM coefficients, the algorithm of Chib (1996), restricted as in Chib (1998) for the structural break case, can be used to draw ST.
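The draw of ST conditional on the VECM parameters can be sketched as a forward filtering, backward sampling pass over the regime probabilities. The code below is a generic illustration in that spirit, not the authors' implementation: the function name, the 0-based labels and the inputs (a T×M array of regime-specific log densities, assumed to be computed elsewhere) are assumptions, and the end-point condition that the final regime must be reached in the structural break case is not imposed.

```python
import numpy as np

def sample_states(loglik, P, init, rng):
    """Draw (s_1, ..., s_T) given regime-specific log densities.

    loglik : (T, M) array, log density of observation t under regime m
    P      : (M, M) array, P[i, j] = Pr(s_t = j | s_{t-1} = i)
    init   : (M,) array, distribution of s_1
    Returns 0-based regime labels and the log-likelihood with the states integrated out.
    """
    T, M = loglik.shape
    filt = np.empty((T, M))
    log_marg = 0.0
    pred = np.asarray(init, dtype=float)
    for t in range(T):
        # Forward step on the log scale; regimes with zero predictive mass stay at zero.
        safe = np.where(pred > 0, pred, 1.0)
        logw = np.where(pred > 0, np.log(safe) + loglik[t], -np.inf)
        shift = logw.max()
        w = np.exp(logw - shift)
        norm = w.sum()
        filt[t] = w / norm
        log_marg += np.log(norm) + shift
        pred = filt[t] @ P
    # Backward step: sample s_T, then s_t given s_{t+1}.
    s = np.empty(T, dtype=int)
    s[T - 1] = rng.choice(M, p=filt[T - 1])
    for t in range(T - 2, -1, -1):
        w = filt[t] * P[:, s[t + 1]]
        s[t] = rng.choice(M, p=w / w.sum())
    return s, log_marg
```

Passing an unrestricted transition matrix gives the Markov switching case; passing a matrix restricted as in (4), with all initial mass on the first regime, gives paths that can only move forward through the regimes.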

Marginal likelihood calculation can be difficult in multivariate state space models such as the VECM. This has led to the use of approximations (e.g., the Laplace approximation of Strachan and Inder 2004 or the information criteria of Koop, Potter, and Strachan 2008), methods based on the Savage-Dickey density ratio (e.g., Koop, León-González, and Strachan 2008) or alternatives to the marginal likelihood such as the predictive likelihood (e.g., Geweke 1996). Given a desire to directly use marginal likelihoods and avoid approximations, in this paper we use a bridge sampler to calculate the marginal likelihood. See Gelman and Meng (1998) for a general treatment of bridge sampling and Frühwirth-Schnatter (2004) for bridge sampling in Markov switching models. Frühwirth-Schnatter (2004) compares bridge sampling with other methods and finds the former to be much more reliable and efficient.
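To give a feel for how a bridge sampler recovers a normalizing constant, here is a toy one-dimensional sketch using the standard iterative estimator. It is not the construction of Frühwirth-Schnatter (2004) and not the authors' code: the target, the proposal and all names are assumptions chosen so that the true answer, $\sqrt{2\pi}$, is known.

```python
import numpy as np

def bridge_estimate(log_q_post, log_q_prop, log_g_post, log_g_prop, n_iter=200, tol=1e-10):
    """Iterative bridge-sampling estimate of the normalizing constant of q.

    log_q_* : log unnormalized target density at target draws / proposal draws
    log_g_* : log normalized proposal density at the same two sets of draws
    """
    n1, n2 = log_q_post.size, log_q_prop.size
    s1, s2 = n1 / (n1 + n2), n2 / (n1 + n2)
    l1 = np.exp(log_q_post - log_g_post)  # ratio q/g at draws from the target
    l2 = np.exp(log_q_prop - log_g_prop)  # ratio q/g at draws from the proposal
    r = 1.0
    for _ in range(n_iter):
        num = np.mean(l2 / (s1 * l2 + s2 * r))
        den = np.mean(1.0 / (s1 * l1 + s2 * r))
        r_new = num / den
        converged = abs(r_new - r) < tol * abs(r)
        r = r_new
        if converged:
            break
    return r

def log_q(theta):
    return -0.5 * theta**2  # unnormalized N(0, 1) density; true constant is sqrt(2*pi)

def log_g(theta):
    return -0.5 * (theta / 1.5) ** 2 - np.log(1.5) - 0.5 * np.log(2 * np.pi)  # N(0, 1.5^2)

rng = np.random.default_rng(0)
post = rng.normal(size=5000)        # stands in for MCMC draws from the target
prop = 1.5 * rng.normal(size=5000)  # draws from the proposal
est = bridge_estimate(log_q(post), log_q(prop), log_g(post), log_g(prop))
```

In a real application the ratios would be formed on the log scale with a stabilizing shift, and the target would be the likelihood (with ST integrated out) times the prior.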

Complete details of prior distributions, posterior computation and bridge sampling are provided in the Appendix.

4 Application: the Fisher effect

The Fisher effect is the name given to the theory which implies that a permanent change in inflation will, in the long run, cause an equal change in the nominal interest rate. Or, equivalently, monetary shocks will have no effect on the real interest rate in the long run. This can be taken to imply a cointegrating relationship between inflation, $\pi_t$, and the interest rate, $i_t$, with cointegrating vector $(1, -1)'$. This relationship has been investigated in numerous papers for numerous countries and is often found not to hold. Beyer, Haug, and Dewald (2011) offer a discussion of this literature and investigate whether structural breaks exist in the cointegrating relationship in a cross-country study.

In our empirical work, we look at the case of France. For this country, Beyer, Haug, and Dewald (2011) analyze quarterly data from 1970:Q1 to 2004:Q3 and find evidence of unit roots in $\pi_t$ and $i_t$ using classical unit root tests. However, both the Johansen trace and eigenvalue tests for cointegration indicate cointegration is not present and, thus, the Fisher effect appears not to hold. They next do a classical test where the null hypothesis is that cointegration is present, but with a structural break at an unknown point in time. This test does not reject the null hypothesis and finds a break in 1981:Q4. However, when Johansen tests are done using sub-samples (before and after 1981:Q4), the trace test finds cointegration in the second sub-sample but not in the first, whereas the eigenvalue test finds cointegration in both sub-samples. We take this as an interesting case where the evidence of previous work suggests there is a great deal of model uncertainty, both about the presence of cointegration and about the break process.

Our data on French CPI inflation (quarterly inflation at an annualized rate) and the 3-month treasury bill rate run from 1970:Q1 to 2012:Q4 and are shown in Figure 1.

Figure 1

CPI inflation (dashed line) and 3-month interest rate (solid line).

Concerning the inclusion of deterministic trends, we only put a constant in the cointegration part of the model since neither the inflation series nor the interest rate series displays a trending pattern (see Franses 2001, for justification of that choice). We consider $p = 1, 2, 3$ for the lag length. In each regime the cointegration relationship between the two variables follows one of the following three cases. The first case (which we denote by $b = 0$) assumes that inflation and the nominal interest rate are not cointegrated and the model becomes:

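$$\begin{aligned}
\Delta \pi_t &= \sum_{j=1}^{p-1} \left( \gamma^{\pi\pi}_{j,s_t} \Delta \pi_{t-j} + \gamma^{\pi i}_{j,s_t} \Delta i_{t-j} \right) + \varepsilon^{\pi}_t, \\
\Delta i_t &= \sum_{j=1}^{p-1} \left( \gamma^{i\pi}_{j,s_t} \Delta \pi_{t-j} + \gamma^{ii}_{j,s_t} \Delta i_{t-j} \right) + \varepsilon^{i}_t.
\end{aligned} \tag{5}$$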

The second case (b=1) assumes a cointegration rank of one but does not constrain the cointegration space:

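$$\begin{aligned}
\Delta \pi_t &= \alpha^{\pi}_{s_t} \left( \beta^{\pi}_{1,s_t} + \beta^{\pi}_{2,s_t} \pi_{t-1} + \beta^{\pi}_{3,s_t} i_{t-1} \right) + \sum_{j=1}^{p-1} \left( \gamma^{\pi\pi}_{j,s_t} \Delta \pi_{t-j} + \gamma^{\pi i}_{j,s_t} \Delta i_{t-j} \right) + \varepsilon^{\pi}_t, \\
\Delta i_t &= \alpha^{i}_{s_t} \left( \beta^{i}_{1,s_t} + \beta^{i}_{2,s_t} \pi_{t-1} + \beta^{i}_{3,s_t} i_{t-1} \right) + \sum_{j=1}^{p-1} \left( \gamma^{i\pi}_{j,s_t} \Delta \pi_{t-j} + \gamma^{ii}_{j,s_t} \Delta i_{t-j} \right) + \varepsilon^{i}_t.
\end{aligned} \tag{6}$$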

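Finally, the last case ($b = 1F$) assumes a cointegration rank of one and restricts the cointegration space according to Fisher’s hypothesis. This means that the model can be expressed as follows:

$$\begin{aligned}
\Delta \pi_t &= \delta^{\pi}_{1,s_t} + \delta^{\pi}_{2,s_t} (\pi_{t-1} - i_{t-1}) + \sum_{j=1}^{p-1} \left( \gamma^{\pi\pi}_{j,s_t} \Delta \pi_{t-j} + \gamma^{\pi i}_{j,s_t} \Delta i_{t-j} \right) + \varepsilon^{\pi}_t, \\
\Delta i_t &= \delta^{i}_{1,s_t} + \delta^{i}_{2,s_t} (\pi_{t-1} - i_{t-1}) + \sum_{j=1}^{p-1} \left( \gamma^{i\pi}_{j,s_t} \Delta \pi_{t-j} + \gamma^{ii}_{j,s_t} \Delta i_{t-j} \right) + \varepsilon^{i}_t.
\end{aligned} \tag{7}$$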

We now turn to our empirical results, which strongly favor Markov switching VECMs over structural break or constant coefficient VECMs. In fact, in a BMA exercise Markov switching models would receive virtually all of the weight. For the Markov switching case, there is never any evidence for more than two regimes. Accordingly, our empirical results focus on the Markov switching models with M=2. However, to illustrate the properties of our approach, we also present results for the models with structural breaks (even though there is little support for these models). For these, we do find evidence for three regimes and, accordingly, present results for structural break models with M=2 and M=3. For brevity’s sake, we do not present any results for constant coefficient VECMs (M=1). Trying various combinations of b and p, we did not find a constant coefficient model that received considerable support.

4.1 Markov switching models

First, we look at results for the Markov switching case with two regimes (M=2). We impose an identification restriction which specifies that the variance of the interest rate equation in the first regime is bigger than the variance in the second regime. Table 1 gives logarithms of marginal likelihoods for models with different cointegration relationships in the two regimes and different lag lengths. In the last line we report results for the special case where the Fisher effect holds in both regimes but only the constant in the cointegrating vector is allowed to switch.

Table 1

Markov switching case: logarithms of marginal likelihoods.

We find that the model with the highest marginal likelihood has a lag length of two and specifies that both regimes are cointegrated but the Fisher effect restriction only holds in the first regime. For this “best model” the top panel in Figure 2 plots the posterior probability that the regime where the Fisher effect holds occurs. It can be seen that this probability is very high during the 1970s and in the beginning of the 1980s. After that the probability is very low for much of the time. If this were the full story, then we would expect a structural break model to work well, with a break occurring around 1983. The timing of the break is similar to that reported in Beyer, Haug, and Dewald (2011). However, there are three, relatively short, time periods where the Fisher effect seems to hold again (in the mid-1990s and at the end of the sample). This kind of behavior is more consistent with a Markov switching process than a structural break model and this is why Markov switching models perform so well in our analysis.

Figure 2

Markov-switching case: results for the “best model.”

(A) Posterior probability that the Fisher effect holds. (B) Posterior median and 16% and 84% posterior quantiles of $\tilde\beta_3$.

The same conclusion can be drawn from the bottom panel in Figure 2. Here, the cointegration space is normalized to be a vector $(\tilde\beta_1, 1, \tilde\beta_3)$. In this normalization, $\tilde\beta_1$ is the normalized intercept and the Fisher hypothesis tells us that $\tilde\beta_3$ should be $-1$. The posterior median of $\tilde\beta_3$ and its 16% and 84% posterior quantiles are drawn. As expected, the posterior median of $\tilde\beta_3$ is equal to $-1$ at the same times that the top panel says there is a high probability that the Fisher effect holds.

So far, we have presented results for the single model with highest marginal likelihood. However, there are many other models whose marginal likelihoods are only slightly smaller than that of the “best model.” For example, the second best model is the one where the first regime is cointegrated and the Fisher effect holds but with no cointegration in the second regime. Faced with such model uncertainty, the researcher may wish to do BMA. Figure 3 gives the results of a BMA exercise. It plots the posterior probabilities of the three cointegration cases at each point in time, averaged across all the models in Table 1. The story told by Figure 3 is similar to that in Figure 2. Up until 1983, in the mid-1990s and at the end of the sample the Fisher effect is supported. But elsewhere it is not. Furthermore, in the periods where the Fisher effect is not supported, there is great uncertainty over whether cointegration occurs or not.
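The averaging step can be sketched as follows: log marginal likelihoods are turned into posterior model probabilities, which then weight the model-specific quantities. The function names are illustrative, equal prior model probabilities are assumed unless a prior is supplied, and the numbers in the example are made up rather than taken from Table 1.

```python
import numpy as np

def posterior_model_probs(log_ml, log_prior=None):
    """Posterior model probabilities from log marginal likelihoods."""
    log_ml = np.asarray(log_ml, dtype=float)
    if log_prior is None:
        log_prior = np.zeros_like(log_ml)  # assumption: equal prior model probabilities
    log_post = log_ml + log_prior
    log_post = log_post - log_post.max()   # guard against underflow before exponentiating
    w = np.exp(log_post)
    return w / w.sum()

def bma_average(model_probs, per_model_quantity):
    """Weight an (n_models, ...) array of model-specific quantities by model_probs."""
    return model_probs @ np.asarray(per_model_quantity)

# Hypothetical log marginal likelihoods for three models.
w = posterior_model_probs([-500.0, -501.0, -510.0])
# Hypothetical per-model probabilities of two cointegration cases at one date.
avg = bma_average(w, np.array([[1.0, 0.0], [0.0, 1.0], [0.5, 0.5]]))
```

A difference of one in the log marginal likelihood corresponds to a weight ratio of $e \approx 2.7$, which is why models only slightly behind the best one still receive non-negligible weight.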

Figure 3

Markov-switching case: posterior probabilities of the three cointegration cases averaged over all models.

4.2 Structural break models

Now we discuss results for the structural break case with two and three regimes (M=2, 3). Table 2 gives the logarithms of marginal likelihoods for the different models. The special cases where the Fisher effect holds in all regimes but only the constant in the cointegrating vector is allowed to switch are marked by asterisks again.

Table 2

Structural break case: logarithms of marginal likelihoods.

As discussed previously, the logarithms of marginal likelihoods are much lower than for the Markov switching models and we include these structural break models for illustrative purposes only.

The “best model” with two regimes has a lag length of two. Both regimes are cointegrated with the Fisher effect holding in the first regime. The “best model” with three regimes also has a lag length of two. Again, all regimes are cointegrated and the Fisher effect only holds in the first regime. However, in the close second best model the first regime is cointegrated and the Fisher effect holds, the second regime is not cointegrated and the third regime is cointegrated again but the Fisher effect does not hold. Note that conventional models of cointegration with structural breaks could not handle such a case where the cointegrating space switches between cointegrating ranks as well as switching between restricted and unrestricted cointegrating spaces. This illustration shows that such cases are empirically relevant, highlighting the importance of a modelling approach which allows for such possibilities.

Figure 4 plots the posterior probabilities for each regime to occur for these two “best models.” It can be seen that the structural break models are trying (poorly) to approximate the Markov switching properties of Figure 2.

Figure 4

Structural break case: posterior probabilities of regimes.

(A) Two regimes, solid line: Pr(s=1), dashed line: Pr(s=2). (B) Three regimes, solid line: Pr(s=1), dashed line: Pr(s=2), dotted line: Pr(s=3).

5 Conclusions

This paper sets out a framework for Bayesian cointegration analysis which allows for regime-switching. We allow for both cointegrating rank and the exact cointegrating space to change when the regime changes. We consider two processes for regime change, leading to structural break and Markov switching VECMs. BMA or model selection using marginal likelihoods can be used to deal with the problems caused by the high-dimensional model space. We develop methods for Bayesian inference and bridge sampling methods are used to calculate the marginal likelihoods. An empirical application involving the Fisher effect shows the usefulness of our approach.


We acknowledge financial support from the Leverhulme Trust under Grant F/00273/J.


This Appendix describes how the model can be written in matrix notation and introduces a reparametrization that we use. Furthermore, the prior distributions and the algorithm for posterior simulation are discussed. Finally, we show how marginal likelihoods are computed with the bridge sampler. The presentation considers the case of the Fisher effect application discussed above, thus, yt=(πt, it)′, n=2, r=1 and a constant in the cointegration part is allowed for. A generalization to other cases is straightforward.

Models in matrix form

The model in a regime i with no cointegration (bi=0) can be written in the following way:

$$Y_i = W_i \Gamma_i + E_i, \tag{A.1}$$

where $Y_i$ and $W_i$ collect the observations belonging to regime $i$. $Y_i$ is $T_i \times n$ with the rows given by $\Delta y_{\tilde t_i}'$ and $W_i$ is $T_i \times [n(p-1)]$ with the rows given by $(\Delta y_{\tilde t_i - 1}', \ldots, \Delta y_{\tilde t_i - p + 1}')$, where $\tilde t_i$ denotes the $t$-th observation in regime $i$ and $T_i$ gives the number of observations in regime $i$. $\Gamma_i$ is $[n(p-1)] \times n$. $E_i$ is $T_i \times n$ with $\mathrm{vec}(E_i) \sim N(0, \Sigma_i \otimes I)$.

The model in a regime $i$ with a cointegration rank of one and the cointegration space restricted according to Fisher’s hypothesis ($b_i = 1F$) can be expressed in the same way but $W_i$ is now $T_i \times [2 + n(p-1)]$ with the rows given by $(1, \pi_{\tilde t_i - 1} - i_{\tilde t_i - 1}, \Delta y_{\tilde t_i - 1}', \ldots, \Delta y_{\tilde t_i - p + 1}')$. $\Gamma_i$ is $[2 + n(p-1)] \times n$.

The model in a regime $i$ with a cointegration rank of one and an unconstrained cointegration space ($b_i = 1$) can be written in the following way:

$$Y_i = X_i \beta_i \alpha_i + W_i \Gamma_i + E_i, \tag{A.2}$$

where $X_i$ is $T_i \times (n+1)$ with the rows given by $(1, y_{\tilde t_i - 1}')$ and $W_i$ takes the same form as in the case $b_i = 0$. $\alpha_i$ is $r \times n$ and $\beta_i$ is $(n+1) \times r$.
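The regime-specific matrices can be assembled from the data and a regime sequence as in the sketch below. The function name, the 0-based indexing and the decision to drop the first $p$ dates (so that all lags exist) are assumptions for illustration; lags are taken in calendar time, so a row may use lagged observations that belong to a different regime.

```python
import numpy as np

def regime_matrices(y, s, regime, p):
    """Build Y_i, X_i, W_i for one regime in the unrestricted case (b_i = 1).

    y : (T, n) array of levels, s : length-T regime labels, p : lag length.
    The row for date t holds dy_t in Y, (1, y_{t-1}') in X and
    (dy_{t-1}', ..., dy_{t-p+1}') in W.
    """
    y = np.asarray(y, dtype=float)
    s = np.asarray(s)
    T, n = y.shape
    dy = np.diff(y, axis=0)  # dy[k] = y[k+1] - y[k], i.e. the change dated k+1
    dates = [t for t in range(p, T) if s[t] == regime]
    m = len(dates)
    Y = np.array([dy[t - 1] for t in dates]).reshape(m, n)
    X = np.array([np.concatenate(([1.0], y[t - 1])) for t in dates]).reshape(m, n + 1)
    W = np.array([
        np.concatenate([dy[t - 1 - j] for j in range(1, p)]) if p > 1 else np.empty(0)
        for t in dates
    ]).reshape(m, n * (p - 1))
    return Y, X, W

# Toy data: T = 10, n = 2, regime 0 for the first five dates and regime 1 afterwards.
y_toy = np.arange(20, dtype=float).reshape(10, 2)
s_toy = np.array([0] * 5 + [1] * 5)
Y1, X1, W1 = regime_matrices(y_toy, s_toy, regime=1, p=2)
```

For the case $b_i = 0$ only `Y` and `W` are needed, and for $b_i = 1F$ the columns $(1, \pi_{t-1} - i_{t-1})$ would be prepended to `W` instead of using `X`.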

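Following Koop, León-González and Strachan (2010) we next introduce non-identified $r \times r$ symmetric positive definite matrices $D_i$ and define $\alpha_i^* = D_i^{-1} \alpha_i$ and $\beta_i^* = \beta_i D_i$, where $\alpha_i^*$ is $r \times n$ and $\beta_i^*$ is $(n+1) \times r$. Since $\alpha_i$ and $\beta_i$ always occur in product form and $\beta_i \alpha_i = \beta_i D_i D_i^{-1} \alpha_i = \beta_i^* \alpha_i^*$, this does not affect the model, which now can be written as:

$$Y_i = X_i \beta_i^* \alpha_i^* + W_i \Gamma_i + E_i. \tag{A.3}$$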

Prior distributions

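  1. $\{\alpha_i^*\}$: We assume the following shrinkage prior:

    $$a_i^* \equiv \mathrm{vec}(\alpha_i^*) \sim N(0, \underline{\eta}_\alpha^{-1} I). \tag{A.4}$$

    In our application we set $\underline{\eta}_\alpha = 10$. Note that $\alpha_i^*$ only appears in regimes with unrestricted cointegration ($b_i = 1$).

  2. $\{\beta_i^*\}$: Our prior is:

    $$b_i^* \equiv \mathrm{vec}(\beta_i^*) \sim N(0, \underline{P}). \tag{A.5}$$

    In our application we set $\underline{P} = 0.1 I$. Note that $\beta_i^*$ only appears in regimes with unrestricted cointegration ($b_i = 1$).

  3. $\{\Gamma_i\}$: We assume the following shrinkage prior:

    $$c_i^* \equiv \mathrm{vec}(\Gamma_i) \sim N(0, \underline{\eta}_\Gamma^{-1} I). \tag{A.6}$$

    For our application we choose $\underline{\eta}_\Gamma = 10$.

  4. $\{\Sigma_i\}$: We use an inverted Wishart prior:

    $$\Sigma_i \sim \mathrm{InvWishart}(\underline{\nu}, \underline{S}). \tag{A.7}$$

    In the application we set $\underline{\nu} = 13$ and $\underline{S} = 10 I$. It follows that $E(\Sigma_i) = I$.

  5. $\{\xi_{ij}\}$: In the case of structural breaks we use the following Beta prior distributions:

    $$\xi_{ii} \sim \mathrm{Beta}(\underline{a}, \underline{b}), \quad i = 1, \ldots, M-1. \tag{A.8}$$

    In our application we set $\underline{a} = 10$ and $\underline{b} = 0.1$. For the Markov switching models we assume

    $$\xi_{i\cdot} \sim \mathrm{Dirichlet}(\underline{c}_{i1}, \ldots, \underline{c}_{iM}), \quad i = 1, \ldots, M. \tag{A.9}$$

    In our application we choose $\underline{c}_{ij} = 10$ if $i = j$ and $\underline{c}_{ij} = 1$ otherwise.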
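The prior moments implied by these settings can be checked by hand, assuming the standard parametrizations of the inverted Wishart, Beta and Dirichlet distributions. With $n = 2$, the inverted Wishart mean is

$$E(\Sigma_i) = \frac{\underline{S}}{\underline{\nu} - n - 1} = \frac{10 I}{13 - 2 - 1} = I.$$

For the switching probabilities, the prior means of the stay probabilities are

$$E(\xi_{ii}) = \frac{\underline{a}}{\underline{a} + \underline{b}} = \frac{10}{10.1} \approx 0.990 \quad \text{(structural break)}, \qquad E(\xi_{ii}) = \frac{10}{10 + 1} \approx 0.909 \quad \text{(Markov switching, } M = 2\text{)}.$$

A stay probability of 0.990 corresponds to a regime lasting about $1/(1 - 0.990) \approx 101$ quarters, and 0.909 to about 11 quarters, so the structural break prior favors rare breaks while the Markov switching prior allows regimes to alternate over a few years. Since $\underline{\eta}_\alpha$ and $\underline{\eta}_\Gamma$ enter (A.4) and (A.6) as inverses, the value 10 corresponds to a prior variance of 0.1 for each element.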

Posterior simulation

We sample from the posterior distribution with a Gibbs sampler. Given initial conditions, the data, and in each block the other parameters, the algorithm comprises the following steps:

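  1. Structural break model:

    Draw $S_T$ using Chib's (1998) algorithm.

    Markov switching model:

    Draw $S_T$ using Chib's (1996) algorithm.

  2. Structural break model:

    Draw $\xi_{ii}$ from $\mathrm{Beta}[\underline{a} + N_{ii}(S_T), \underline{b} + 1]$ for $i = 1, \ldots, M-1$, where $N_{ii}(S_T)$ is the number of one-step transitions from state $i$ to state $i$ in the sequence $S_T$.

    Markov switching model:

    Draw $\xi_{i\cdot}$ from $\mathrm{Dirichlet}[\underline{c}_{i1} + N_{i1}(S_T), \ldots, \underline{c}_{iM} + N_{iM}(S_T)]$ for $i = 1, \ldots, M$, where $N_{ij}(S_T)$ is the number of one-step transitions from state $i$ to state $j$ in the sequence $S_T$.

  3. For $i = 1, \ldots, M$: In the case of $b_i = 1$, draw $a_i^*$ from $N(\bar{a}_i, \bar{A}_i)$ with

    $$\bar{A}_i = \left[ (\Sigma_i^{-1} \otimes \beta_i^{*\prime} X_i' X_i \beta_i^*) + \underline{\eta}_\alpha I \right]^{-1} \tag{A.10}$$

    $$\bar{a}_i = \bar{A}_i (\Sigma_i^{-1} \otimes \beta_i^{*\prime} X_i') \, \mathrm{vec}(Y_i - W_i \Gamma_i). \tag{A.11}$$

  4. For $i = 1, \ldots, M$: In the case of $b_i = 1$, draw $b_i^*$ from $N(\bar{b}_i, \bar{B}_i)$ with

    $$\bar{B}_i = \left( \left[ (\alpha_i^* \Sigma_i^{-1} \alpha_i^{*\prime}) \otimes (X_i' X_i) \right] + P^{-1} \right)^{-1} \tag{A.12}$$

    $$\bar{b}_i = \bar{B}_i (\alpha_i^* \Sigma_i^{-1} \otimes X_i') \, \mathrm{vec}(Y_i - W_i \Gamma_i). \tag{A.13}$$

  5. For $i = 1, \ldots, M$: Draw $c_i^*$ from $N(\bar{c}_i, \bar{C}_i)$ with

    $$\bar{C}_i = \left[ (\Sigma_i^{-1} \otimes W_i' W_i) + \underline{\eta}_\Gamma I \right]^{-1} \tag{A.14}$$

    $$\bar{c}_i = \begin{cases} \bar{C}_i (\Sigma_i^{-1} \otimes W_i') \, \mathrm{vec}(Y_i - X_i \beta_i^* \alpha_i^*) & \text{if } b_i = 1, \\ \bar{C}_i (\Sigma_i^{-1} \otimes W_i') \, \mathrm{vec}(Y_i) & \text{otherwise.} \end{cases} \tag{A.15}$$

  6. For $i = 1, \ldots, M$: Draw $\Sigma_i$ from $\mathrm{InvWishart}(\bar{\nu}_i, \bar{S}_i)$ with

    $$\bar{\nu}_i = \underline{\nu} + T_i \tag{A.16}$$

    $$\bar{S}_i = \begin{cases} \underline{S} + (Y_i - X_i \beta_i^* \alpha_i^* - W_i \Gamma_i)' (Y_i - X_i \beta_i^* \alpha_i^* - W_i \Gamma_i) & \text{if } b_i = 1, \\ \underline{S} + (Y_i - W_i \Gamma_i)' (Y_i - W_i \Gamma_i) & \text{otherwise.} \end{cases} \tag{A.17}$$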
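Two of the steps above can be sketched in code: the Beta update for the structural break model in step 2 and the conditional draw of the adjustment coefficients in step 3. This is a hedged illustration under assumed names (`count_transitions`, `draw_xi_break`, `draw_alpha_star`) and simulated data, not the authors' implementation. It forms the Kronecker products explicitly for readability rather than speed.

```python
import numpy as np


def count_transitions(states, i, j):
    """Number of one-step transitions from state i to state j in the state sequence."""
    s = np.asarray(states)
    return int(np.sum((s[:-1] == i) & (s[1:] == j)))


def draw_xi_break(states, i, a, b, rng):
    """Step 2 (structural break model): draw from Beta[a + N_ii, b + 1]."""
    return rng.beta(a + count_transitions(states, i, i), b + 1.0)


def draw_alpha_star(Y, X, W, Gamma, beta_star, Sigma, eta_alpha, rng):
    """Step 3: draw the r x n matrix alpha* from its Gaussian conditional posterior."""
    n = Y.shape[1]
    r = beta_star.shape[1]
    Sigma_inv = np.linalg.inv(Sigma)
    Z = X @ beta_star                                # T_i x r
    precision = np.kron(Sigma_inv, Z.T @ Z) + eta_alpha * np.eye(r * n)
    A_bar = np.linalg.inv(precision)
    A_bar = 0.5 * (A_bar + A_bar.T)                  # remove rounding asymmetry
    resid = (Y - W @ Gamma).reshape(-1, order="F")   # vec() stacks columns
    a_bar = A_bar @ np.kron(Sigma_inv, Z.T) @ resid
    draw = rng.multivariate_normal(a_bar, A_bar)
    return draw.reshape((r, n), order="F")           # undo vec()


# Simulated single-regime data (all values below are illustrative assumptions).
rng = np.random.default_rng(1)
T_i, n, r = 400, 2, 1
X = np.column_stack([np.ones(T_i), rng.normal(size=(T_i, n))])
W = rng.normal(size=(T_i, n))                        # p = 2, so n(p-1) = n columns
beta_star = np.array([[0.5], [1.0], [-1.0]])
alpha_true = np.array([[0.3, -0.2]])
Gamma = rng.normal(scale=0.1, size=(n, n))
Sigma = 0.01 * np.eye(n)
E = rng.multivariate_normal(np.zeros(n), Sigma, size=T_i)
Y = X @ beta_star @ alpha_true + W @ Gamma + E

alpha_draw = draw_alpha_star(Y, X, W, Gamma, beta_star, Sigma, 10.0, rng)
xi_draw = draw_xi_break([1, 1, 1, 2, 2, 3], 1, 10.0, 0.1, rng)
```

With this much simulated data the draw of the adjustment coefficients lands close to the values used to generate it, since the shrinkage prior is dominated by the likelihood. The draws in steps 4 and 5 have the same Gaussian structure and could be coded analogously.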

Marginal likelihood calculation using the bridge sampler

To calculate marginal likelihoods we use the bridge sampling algorithm of Frühwirth-Schnatter (2004); the reader is referred to that paper for complete details. Note that this algorithm requires evaluating the likelihood function with the latent states, $S_T$, integrated out. This likelihood is provided by the forward-backward recursions of the algorithms of Chib (1996, 1998). Section 11.4.1 of Frühwirth-Schnatter (2006) on “The Markov Mixture Likelihood Function” provides a thorough discussion of the relevant issues and various methods for evaluating this likelihood function.

Given $B$ MCMC sampler draws $\{\theta_K^{(b)}\}$, $b = 1, \ldots, B$, from the posterior distribution $p(\theta_K | y, K)$ and the ability to evaluate the likelihood function and prior, the bridge sampler consists of the following steps:

  1. Simulation. Construct the unsupervised importance density $q(\theta_K)$ discussed in Frühwirth-Schnatter (2004, Section 3.4). Draw $L$ draws $\{\tilde{\theta}_K^{(l)}\}$, $l = 1, \ldots, L$, from this importance density.

  2. Evaluation. Calculate both the non-normalized posterior $p^*(\theta_K | y, K)$ and the importance density $q(\theta_K)$ at the draws both from the posterior and the importance density.

  3. Iteration. Get a starting value for the estimate of the marginal likelihood $\hat{p}_0$ (for example the importance sampling estimator). Run the following recursion until convergence has been achieved:

    $$\hat{p}_t = \frac{\dfrac{1}{L} \displaystyle\sum_{l=1}^{L} \dfrac{p^*(\tilde{\theta}_K^{(l)} | y, K)}{L q(\tilde{\theta}_K^{(l)}) + B p^*(\tilde{\theta}_K^{(l)} | y, K) / \hat{p}_{t-1}}}{\dfrac{1}{B} \displaystyle\sum_{b=1}^{B} \dfrac{q(\theta_K^{(b)})}{L q(\theta_K^{(b)}) + B p^*(\theta_K^{(b)} | y, K) / \hat{p}_{t-1}}}. \tag{A.18}$$
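The three steps can be tried on a toy problem where the answer is known. In the sketch below the non-normalized posterior is a standard normal density multiplied by a known constant, and the importance density is a fixed normal with larger variance. Both are assumptions made purely for illustration: in the application the importance density is the unsupervised one built from the MCMC output and the posterior comes from the regime-switching model. The function names are hypothetical.

```python
import math
import random


def normal_pdf(x, sd):
    """Density of N(0, sd^2) at x."""
    return math.exp(-0.5 * (x / sd) ** 2) / (sd * math.sqrt(2.0 * math.pi))


def bridge_estimate(p_post, q_post, p_imp, q_imp, tol=1e-10, max_iter=1000):
    """Iterate the bridge sampling recursion until the estimate stabilizes.

    p_post, q_post : p* and q evaluated at the B posterior draws
    p_imp,  q_imp  : p* and q evaluated at the L importance draws
    """
    B, L = len(p_post), len(p_imp)
    # Starting value: the importance sampling estimator.
    p_hat = sum(p / q for p, q in zip(p_imp, q_imp)) / L
    for _ in range(max_iter):
        num = sum(p / (L * q + B * p / p_hat) for p, q in zip(p_imp, q_imp)) / L
        den = sum(q / (L * q + B * p / p_hat) for p, q in zip(p_post, q_post)) / B
        p_new = num / den
        if abs(p_new - p_hat) <= tol * abs(p_new):
            return p_new
        p_hat = p_new
    return p_hat


TRUE_CONSTANT = 3.7  # normalizing constant the sampler should recover (toy value)


def p_star(theta):
    """Toy non-normalized posterior: TRUE_CONSTANT times a N(0, 1) density."""
    return TRUE_CONSTANT * normal_pdf(theta, 1.0)


def q(theta):
    """Toy importance density: N(0, 2^2)."""
    return normal_pdf(theta, 2.0)


rng = random.Random(0)
B = L = 5000
# Step 1 (simulation): posterior draws and importance draws.
post_draws = [rng.gauss(0.0, 1.0) for _ in range(B)]
imp_draws = [rng.gauss(0.0, 2.0) for _ in range(L)]
# Step 2 (evaluation) and step 3 (iteration).
estimate = bridge_estimate(
    [p_star(t) for t in post_draws], [q(t) for t in post_draws],
    [p_star(t) for t in imp_draws], [q(t) for t in imp_draws],
)
print(estimate)
```

In realistic models the density values are tiny, so an implementation would normally work with log densities and rescale before exponentiating; the sketch skips this for clarity.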


References

  • Andrade, P., C. Bruneau, and S. Gregoir. 2005. “Testing for the Cointegration Rank When Some Cointegrating Directions are Changing.” Journal of Econometrics 124: 269–310.

  • Ang, A., and G. Bekaert. 2002. “Regime Switches in Interest Rates.” Journal of Business and Economic Statistics 20: 163–182.

  • Beyer, A., A. Haug, and W. Dewald. 2011. “Structural Breaks and the Fisher Effect.” The B.E. Journal of Macroeconomics 11.

  • Bierens, H. J., and L. F. Martins. 2010. “Time Varying Cointegration.” Econometric Theory 26: 1453–1490.

  • Chib, S. 1996. “Calculating Posterior Distributions and Modal Estimates in Markov Mixture Models.” Journal of Econometrics 75: 79–97.

  • Chib, S. 1998. “Estimation and Comparison of Multiple Change-point Models.” Journal of Econometrics 86: 221–241.

  • Cogley, T., and T. Sargent. 2005. “Drifts and Volatilities: Monetary Policies and Outcomes in the Post WWII U.S.” Review of Economic Dynamics 8: 262–302.

  • Doan, T., R. Litterman, and C. Sims. 1984. “Forecasting and Conditional Projection Using Realistic Prior Distributions.” Econometric Reviews 3: 1–144.

  • Franses, P. H. 2001. “How to Deal with Intercept and Trend in Practical Cointegration Analysis.” Applied Economics 33: 577–579.

  • Frühwirth-Schnatter, S. 2001. “MCMC Estimation of Classical and Dynamic Switching and Mixture Models.” Journal of the American Statistical Association 96: 194–209.

  • Frühwirth-Schnatter, S. 2004. “Estimating Marginal Likelihoods for Mixture and Markov Switching Models Using Bridge Sampling Techniques.” Econometrics Journal 7: 143–167.

  • Frühwirth-Schnatter, S. 2006. Finite Mixture and Markov Switching Models. New York: Springer.

  • Garratt, A., K. Lee, M. H. Pesaran, and Y. Shin. 2003. “A Long-Run Structural Macroeconometric Model of the UK Economy.” Economic Journal 113: 412–455.

  • Gelman, A., and X. Meng. 1998. “Simulating Normalizing Constants: From Importance Sampling to Bridge Sampling to Path Sampling.” Statistical Science 13: 163–185.

  • Geweke, J. 1996. “Bayesian Reduced Rank Regression in Econometrics.” Journal of Econometrics 75: 121–146.

  • Jochmann, M., G. Koop, R. León-González, and R. Strachan. 2013. “Stochastic Search Variable Selection in Vector Error Correction Models with an Application to a Model of the UK Macroeconomy.” Journal of Applied Econometrics 28: 62–81.

  • Johansen, S. 1995. Likelihood-Based Inference in Cointegrated Vector Autoregressive Models. Oxford: Oxford University Press.

  • Koop, G., R. León-González, and R. Strachan. 2008. “Bayesian Inference in a Cointegrating Panel Data Model.” Advances in Econometrics 23: 433–469.

  • Koop, G., R. León-González, and R. Strachan. 2010. “Efficient Posterior Simulation for Cointegrated Models with Priors on the Cointegration Space.” Econometric Reviews 29: 224–242.

  • Koop, G., R. León-González, and R. Strachan. 2011. “Bayesian Inference in a Time Varying Cointegration Model.” Journal of Econometrics 165: 210–220.

  • Koop, G., S. Potter, and R. Strachan. 2008. “Re-examining the Consumption-Wealth Relationship: The Role of Model Uncertainty.” Journal of Money, Credit and Banking 40: 341–367.

  • Lettau, M., and S. Ludvigson. 2004. “Understanding Trend and Cycle in Asset Values: Reevaluating the Wealth Effect on Consumption.” American Economic Review 94: 276–299.

  • Martin, G. 2000. “US Deficit Sustainability: A New Approach Based on Multiple Endogenous Breaks.” Journal of Applied Econometrics 15: 83–105.

  • Michael, P., A. Nobay, and D. Peel. 1997. “Transactions Costs and Nonlinear Adjustment in Real Exchange Rates: An Empirical Investigation.” Journal of Political Economy 105: 862–879.

  • Paap, R., and H. van Dijk. 2003. “Bayes Estimates of Markov Trends in Possibly Cointegrated Series: An Application to US Consumption and Income.” Journal of Business and Economic Statistics 21: 547–563.

  • Park, J., and H. Hahn. 1999. “Cointegrating Regressions with Time Varying Coefficients.” Econometric Theory 15: 664–703.

  • Primiceri, G. 2005. “Time Varying Structural Vector Autoregressions and Monetary Policy.” Review of Economic Studies 72: 821–852.

  • Quintos, C. E. 1997. “Stability Tests in Error Correction Models.” Journal of Econometrics 82: 289–315.

  • Saikkonen, P., and I. Choi. 2004. “Cointegrating Smooth Transition Regressions.” Econometric Theory 20: 301–340.

  • Sims, C., and T. Zha. 2006. “Were There Regime Switches in Macroeconomic Policy?” American Economic Review 96: 54–81.

  • Stock, J., and M. Watson. 1996. “Evidence on Structural Instability in Macroeconomic Time Series Relations.” Journal of Business and Economic Statistics 14: 11–30.

  • Strachan, R. 2003. “Valid Bayesian Estimation of the Cointegrating Error Correction Model.” Journal of Business and Economic Statistics 21: 185–195.

  • Strachan, R., and B. Inder. 2004. “Bayesian Analysis of the Error Correction Model.” Journal of Econometrics 123: 307–325.

  • Sugita, K. 2006. “Bayesian Analysis of Dynamic Multivariate Models with Multiple Structural Breaks.” Discussion Paper No 2006-14, Graduate School of Economics, Hitotsubashi University.

Supplemental Material

The online version of this article (DOI: 10.1515/snde-2012-0064) offers supplementary material, available to authorized users.


  • In the case where all models, a priori, are given equal weight, posterior model probabilities are proportional to marginal likelihoods. 

  • Note that an alternative to imposing the restriction on the cointegration space directly is to use a prior that is tightly centered over the restriction (see Jochmann et al. 2013, for details and justification of this approach). We followed this strategy in an earlier version of this paper and obtained similar results. 

  • Imposing the identification restriction means that we do not have to worry about the label-switching problem. We checked the restriction’s appropriateness by examining draws from the unconstrained posterior obtained with the permutation sampler of Frühwirth-Schnatter (2001). 

About the article

Corresponding author: Markus Jochmann, Newcastle University Business School, Newcastle upon Tyne, UK, e-mail:

Published Online: 2014-02-28

Published in Print: 2015-02-01

Citation Information: Studies in Nonlinear Dynamics & Econometrics, Volume 19, Issue 1, Pages 35–48, ISSN (Online) 1558-3708, ISSN (Print) 1081-1826, DOI: https://doi.org/10.1515/snde-2012-0064.

©2015 by De Gruyter.