
Studies in Nonlinear Dynamics & Econometrics

Ed. by Mizrach, Bruce


Online
ISSN
1558-3708
Volume 22, Issue 4

A hidden Markov regime-switching smooth transition model

Robert J. Elliott
• School of Commerce, University of South Australia, Australia; Haskayne School of Business, University of Calgary, Calgary, Alberta, Canada
• Department of Applied Finance and Actuarial Studies, Faculty of Business and Economics, Macquarie University, Sydney, Australia, Phone: (+61-2) 9850 8589; Fax: (+61-2) 9850 9481
/ Tak Kuen Siu
• Corresponding author
• Department of Applied Finance and Actuarial Studies, Faculty of Business and Economics, Macquarie University, Sydney, Australia, Phone: (+61-2) 9850 8589; Fax: (+61-2) 9850 9481
/ John W. Lau
Published Online: 2018-06-29 | DOI: https://doi.org/10.1515/snde-2016-0061

Abstract

In this paper, we develop a new class of parametric nonlinear time series models by combining two important classes of models, namely smooth transition models and hidden Markov regime-switching models. The class of models is general and flexible enough to incorporate two types of switching behavior: smooth state transitions and abrupt changes in hidden states. The estimation of the hidden states and model parameters is performed by applying filtering theory and a filter-based expectation-maximization (EM) algorithm. Applications of the model are illustrated using simulated data and real financial data. Other potential applications are mentioned.

This article offers supplementary material which is provided at the end of the article.

1 Introduction

The objective of this paper is to develop a new class of parametric nonlinear time series models which is general and flexible enough to incorporate two types of switching behavior, namely smooth state transitions and abrupt changes in hidden states. We call our class of nonlinear time series models hidden Markov regime-switching smooth transition (HMRS-STAR) models. It is developed by combining two important classes of parametric nonlinear time series models, namely smooth transition (STAR) models and hidden Markov regime-switching (HMRS) models. The reason behind combining the two models to develop the class of HMRS-STAR models is similar to that of combining first-generation models to produce second-generation models, third-generation models, and so on, as proposed in Tong (1990), (see Chapter 3, Section 13.3, therein). The class of HMRS-STAR models proposed here is general and may have potential applications in diverse fields such as economics, finance, actuarial science, biology, engineering and physical sciences, ecology and population dynamics, where nonlinear time series modelling or nonlinear dynamical systems may be highly relevant. Concrete applications of the HMRS-STAR models to financial modeling are presented below. Particularly, the class of HMRS-STAR models can describe the impacts of both abrupt regime switches and smooth regime switches on asset price dynamics or returns. Abrupt regime switches are those sudden changes in asset price dynamics or returns which may be attributed to major economic or market events such as financial crises. Some observed market news or events may be digested more slowly by market participants and their impacts may be more slowly reflected in asset prices or returns. These changes in asset prices or returns are gradual rather than abrupt and are referred to as smooth regime switches. 
Smooth state transitions in asset prices or returns may also have some implications on market efficiency and may be related to a price trend model in the financial econometrics literature.

Standard filtering theory based on the reference probability approach in the literature and its corresponding filter-based expectation-maximization (EM) algorithm, [see, for example, Elliott, Aggoun, and Moore (1995)], is applied to estimate the hidden states and unknown parameters in the proposed model. To handle the nonlinear relationships that the smoothing function introduces into the filter-based estimation, we employ a Laplace series expansion to simplify the estimation procedure and derive analytical formulas for some parameter estimates. Simulated data are used to illustrate the practical implementation of the estimation method and its accuracy. Real data examples based on two financial data sets, namely the Hang Seng Index and NASDAQ Composite index, are provided to illustrate potential applications of the proposed class of models to real financial data. Other potential applications of the proposed class of models are briefly discussed. In the course of our real data analysis, we have found that for some financial datasets, the estimation algorithm based on filtering theory does not converge. This indicates that there is still room for improving the filter-based estimation method, which may pose a challenging problem for further research. Given the important role played by hidden Markov models in signal processing, it is hoped that the proposed class of HMRS-STAR models may provide some insights into how the link between two important fields, namely nonlinear time series analysis and signal processing, may be consolidated. In what follows, some of the relevant literature and background of the proposed class of models are discussed.

Nonlinear time series analysis is an important topic in econometrics, statistics and dynamical systems theory. Some early developments of nonlinear time series models may be traced back to the late 1970s where limitations of linear time series modelling to fit real datasets in diverse fields were realised. See the monographs by Tong (1983, 1990). Some important classes of parametric nonlinear time series models proposed in the early stage of developments of the discipline are, for example, threshold autoregressive (TAR) models by Tong (1977, 1978) and Tong and Lim (1980), bilinear models by Granger and Anderson (1978) [see also Subba Rao and Gabr (1984)], autoregressive conditional heteroscedastic (ARCH) models by Engle (1982), stochastic volatility models of Taylor (1982) and state-dependent models by Priestley (1980). The monograph by Tong (1990) provides an excellent and authoritative account of parametric nonlinear time series models developed before the 1990s. Nonparametric and semi-parametric approaches to nonlinear time series analysis have also been studied in the literature. See the monographs by Fan and Yao (2003) and Gao (2007).

Regime-switching models represent an important class of parametric nonlinear time series models which have wide ranging applications in diverse fields such as engineering, economics, finance and actuarial science, amongst others. The basic principle of regime switching was conceived some time ago in economics, particularly in econometrics, and engineering, particularly in signal processing and control engineering. For discussions on applications of regime switching models and related models to engineering, one may refer to the monographs by Elliott, Aggoun, and Moore (1995) and Yin and Zhu (2010). Early works in econometrics where the idea of regime switching has been adopted are, for example, Quandt (1958) and Goldfeld and Quandt (1973), where regression models with switching parameters were introduced to discuss nonlinearity in economic data. The principle of regime switching also appeared in pioneering works on parametric nonlinear time series analysis, [see, for example, Tong and Lim (1980) and Tong (1983)]. Since the seminal article by Hamilton (1989) on Markov regime-switching autoregressive models for econometrics, regime switching models have become popular in economics, econometrics and finance. The basic idea of regime switching models is that the model parameters can change over time according to an underlying state process which could be a finite-state hidden Markov chain.

Chan and Tong (1986) and Teräsvirta (1994) introduced a class of time series models, namely, smooth transition autoregressive models. This class of time series models allows “smooth”, or gradual, transitions in regimes and may be thought of as a type of regime-switching model in a wide sense; see, for example, Tong (1990), for further discussions. The key feature of this class of models is that “smooth” regime switching is described by introducing a “smooth” transition function, which is a non-decreasing function. Some typical examples of this “smooth” transition function are the cumulative normal distribution function and the logistic function. Like regime-switching autoregressive models, smooth transition autoregressive models can capture cyclical behavior in time series. For a survey on the smooth transition autoregressive models and their applications, interested readers may refer to Granger and Teräsvirta (1993), Potter (1999), and Teräsvirta (1998).

In Elliott, Siu, and Lau (2013), a double threshold model was considered, which incorporates two types of regime switching, namely the regime switching governed by the threshold principle of Tong (1983, 1990) and the regime switching described by transitions of a hidden Markov chain. In Elliott, Liew, and Siu (2011), the filtering of a threshold stochastic volatility model was considered using the reference probability approach for hidden Markov models in, for example, Elliott, Aggoun, and Moore (1995). In a recent paper by Zhu et al. (2017), a hidden Markov model with threshold effects was considered and applied to oil price forecasting, where the threshold regime switching effect was present in transition probabilities of the hidden Markov chain. Note that smooth transition models may be more general than threshold regime-switching models and may be used to test whether regime switching is attributed to smooth transitions or threshold regime switches, [see, for example, Tong (1990), for related discussions]. In this sense, the HMRS-STAR model may be more general than the double threshold model in Elliott, Siu, and Lau (2013) and the former may also be used to generalize threshold-type models in, for example, Elliott, Liew, and Siu (2011) and Zhu et al. (2017).

This paper is organized as follows. The next section gives an introduction to the HMRS-STAR model. Filters for the hidden Markov chain and filter-based estimates of the unknown parameters based on the EM algorithm are presented in Section 3. Simulation experiments and results are discussed in Section 4. Real data examples are presented in Section 5. Section 6 provides discussions on some potential economic applications of the proposed HMRS-STAR model. The final section gives some concluding remarks.

2 The HMRS-STAR model

We consider a complete probability space $(\Omega, \mathcal{F}, P)$ on which all random variables are defined, where P is a real-world probability. Let $\{\mathbf{X}_t \mid t \in \mathcal{T}\}$ be a discrete-time, N-state, hidden Markov chain defined on $(\Omega, \mathcal{F}, P)$ with state space being the set of unit vectors $\{\mathbf{e}_1, \mathbf{e}_2, \dots, \mathbf{e}_N\}$. Here $\mathcal{T} := \{0, 1, \dots\}$ and the jth component of $\mathbf{e}_i$ is the Kronecker delta function $\delta_{ij}$ for each i, j = 1, 2, …, N. Indeed the state space we considered here is called the canonical state space of the hidden Markov chain. The state space was adopted in, for example, Elliott, Aggoun, and Moore (1995).

Suppose that the chain X is time-homogeneous so that its probability law is completely determined by its transition probability and initial distribution. For each i, j = 1, 2, …, N, let

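$\pi_{ji} := P(\mathbf{X}_{t+1} = \mathbf{e}_j \mid \mathbf{X}_t = \mathbf{e}_i) = P(\mathbf{X}_1 = \mathbf{e}_j \mid \mathbf{X}_0 = \mathbf{e}_i) ,$

so $[\pi_{ji}]_{i,j=1,2,\dots,N}$ is the transition probability matrix of the chain X under P and it is denoted by 𝚷.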

Consider the P-completion of the natural filtration ${\mathbb{F}}^{\mathbf{\text{X}}}:=\left\{{\mathcal{F}}_{t}^{\mathbf{\text{X}}}|t\in \mathcal{T}\right\}$ generated by the hidden Markov chain, where

$\mathcal{F}_t^{\mathbf{X}} := \sigma\{\mathbf{X}_0, \mathbf{X}_1, \dots, \mathbf{X}_t\} \vee \mathcal{N} ,$

the minimal σ-algebra generated by information about the values of the chain X up to and including time t and the collection 𝒩 of P-null sets of ℱ. Here for σ-algebras 𝒜 and ℬ, we denote by 𝒜 ∨ ℬ the minimal σ-algebra generated by 𝒜 and ℬ.

With the canonical state space of the chain, Elliott, Aggoun, and Moore (1995) gave the following semimartingale dynamics for the chain X under P:

$\mathbf{X}_{t+1} = \boldsymbol{\Pi}\mathbf{X}_t + \mathbf{M}_{t+1} , \quad t \in \mathcal{T} ,$(1)

where $\left\{{\mathbf{\text{M}}}_{t}|t\in \mathcal{T}\mathrm{\setminus }\left\{0\right\}\right\}$ is an ℜN-valued, (𝔽X, P)-martingale difference process. The above semimartingale dynamics will be used when deriving filters for the hidden Markov chain and filter-based estimates of the unknown parameters in the HMRS-STAR model.

We now describe the HMRS-STAR model. For illustration, we consider a simple case where there are two regimes in the smooth transition part of the model. Let

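$\mu(t) := \langle \boldsymbol{\mu}, \mathbf{X}_t \rangle , \quad \theta(t) := \langle \boldsymbol{\theta}, \mathbf{X}_t \rangle , \quad \sigma(t) := \langle \boldsymbol{\sigma}, \mathbf{X}_t \rangle .$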

Here $\mathbit{\mu }:=\left({\mu }_{1},{\mu }_{2},\dots ,{\mu }_{N}{\right)}^{\mathrm{\prime }}\in {\mathrm{\Re }}^{N}$, $\mathbit{\theta }:=\left({\theta }_{1},{\theta }_{2},\dots ,{\theta }_{N}{\right)}^{\mathrm{\prime }}\in {\mathrm{\Re }}^{N}$ and $\mathbit{\sigma }:=\left({\sigma }_{1},{\sigma }_{2},\dots ,{\sigma }_{N}{\right)}^{\mathrm{\prime }}\in {\mathrm{\Re }}^{N}$ with σi > 0 for each i = 1, 2, … , N; $⟨\cdot ,\cdot ⟩$ is the scalar product in ℜN.

Let F : (−∞, ∞) → [0, 1] be a continuous and non-decreasing function. This function plays the role of smoothing transitions in regimes. Suppose r is the threshold parameter and δ is the scale parameter, where r ∈ ℜ and δ > 0. Let d be a positive integer representing the delay parameter. Consider a time series $\left\{{Y}_{t}|t=-d+1,-d+2,\dots ,0,1,\dots \right\}$, where the initial values ${Y}_{-d+1}$, ${Y}_{-d+2}$, …, Y0 are given in advance. We suppose the remaining terms of the time series $\left\{{Y}_{t}|t\in \mathcal{T}\mathrm{\setminus }\left\{0\right\}\right\}$ follow a HMRS-STAR (2, N) model defined by:

$Y_t = \mu(t) + \theta(t) F\left(\frac{Y_{t-d}-r}{\delta}\right) + \sigma(t)\epsilon_t .$

Here we suppose, for simplicity, that the noise process $\left\{{ϵ}_{t}|t\in \mathcal{T}\mathrm{\setminus }\left\{0\right\}\right\}$ is a sequence of independent and identically distributed (i.i.d.) standard Gaussian random variables, (i.e., ${ϵ}_{t}\sim N\left(0,1\right)$).
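To make the data-generating mechanism concrete, the following is a minimal simulation sketch of the HMRS-STAR(2, N) recursion above. It is an illustration only and not the authors' code: the function names `logistic` and `simulate_hmrs_star`, the choice X_0 = e_1, the zero initial values for Y, and the numerical parameter values are all assumptions made for this example.

```python
import math
import random


def logistic(x):
    """Logistic transition function F(x) = 1 / (1 + exp(-x)), arranged to avoid overflow."""
    if x >= 0:
        return 1.0 / (1.0 + math.exp(-x))
    z = math.exp(x)
    return z / (1.0 + z)


def simulate_hmrs_star(n, Pi, mu, theta, sigma, r, delta, d=1, F=logistic, seed=0):
    """Simulate Y_1..Y_n and the hidden state indices of X_1..X_n.

    Pi[j][i] = P(X_{t+1} = e_j | X_t = e_i), so each column of Pi sums to one.
    Assumptions of this sketch: X_0 = e_1 and Y_{-d+1} = ... = Y_0 = 0.
    """
    rng = random.Random(seed)
    N = len(mu)
    state = 0                 # index of the current state (0 means e_1)
    y = [0.0] * d             # assumed initial values
    states = []
    for _ in range(n):
        # Draw X_t from the column of Pi selected by X_{t-1}.
        column = [Pi[j][state] for j in range(N)]
        state = rng.choices(range(N), weights=column)[0]
        lagged = y[-d]        # Y_{t-d}
        mean = mu[state] + theta[state] * F((lagged - r) / delta)
        y.append(mean + sigma[state] * rng.gauss(0.0, 1.0))
        states.append(state)
    return y[d:], states


# Hypothetical two-state example.
Pi = [[0.95, 0.10],
      [0.05, 0.90]]
ys, xs = simulate_hmrs_star(500, Pi, mu=[0.0, 1.0], theta=[0.5, -0.5],
                            sigma=[0.1, 0.3], r=0.0, delta=0.5)
```

Passing a different non-decreasing function as `F` (for instance the standard normal distribution function) gives the Gaussian variant of the model under the same assumptions.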

When the scale parameter δ tends to zero, the function $F\left(\frac{{Y}_{t-d}-r}{\delta }\right)$ tends to the indicator function ${I}_{\left\{{Y}_{t-d}>r\right\}}$. In this limiting case, the transition in the smooth transition component becomes abrupt and the HMRS-STAR(2, N) model becomes a (2, N)-double threshold model as in Elliott, Siu, and Lau (2013). One may ask why, given that the smooth transition component can model abrupt or smooth transitions for different values of the parameter δ, we need to incorporate abrupt changes using a hidden Markov chain. There are at least two reasons. Firstly, the abrupt changes in the smooth transition component and those incorporated by the hidden Markov chain are different in nature. The abrupt change in the smooth transition component is a self-exciting change in the sense that it depends on a past value of the time series itself, while the abrupt change incorporated by the hidden Markov chain is generated exogenously. This partly motivated the double threshold model in Elliott, Siu, and Lau (2013). Secondly, the HMRS-STAR model can incorporate both abrupt changes and smooth changes simultaneously. The HMRS-STAR model could be applied to financial modelling. For example, we can consider $\left\{{Y}_{t}|t\in \mathcal{T}\mathrm{\setminus }\left\{0\right\}\right\}$ as the return series of a financial asset. Then the HMRS-STAR model can incorporate two types of regime switches, namely, “abrupt” regime switches and “smooth” regime switches, in the appreciation rate of the asset. “Abrupt” regime switches may be attributed to sudden changes in hidden economic fundamentals which may occur during financial crises. “Smooth” regime switches may be attributed to observed news and events which are digested slowly by the market and gradually reflected in market prices or returns. In modern finance theory, the theory of market efficiency is pertinent. 
One version of this theory states that if a capital market is efficient, market prices react almost instantaneously to market news and information. However, the theory of market efficiency is questioned by some empirical studies for different markets such as equity, currency and commodity markets. Good references for these empirical studies are, for example, Taylor (1986, 2005). Indeed, in Taylor (1986), (see Chapter 7 therein), a price trend model was introduced as an alternative to the random walk model which is often assumed under the market efficiency hypothesis. The key idea of the price trend model is that a trend occurs when some market information is reflected in several consecutive asset returns. This reflects a kind of slow adjustment of market prices to market news and information.

In general, one could also consider a situation where both the threshold parameter r and the scale parameter δ are modulated by the hidden Markov chain X. However, to simplify our analysis, we assume they are constants. The following two examples give two special cases of the HMRS-STAR model.

Example 1: Suppose the smooth transition function F(x) is the standard Gaussian distribution Φ(x), (i.e., the probability distribution function of a zero-mean, unit-variance, normal distribution). Then the HMRS-STAR model becomes:

$Y_t = \mu(t) + \theta(t) \Phi\left(\frac{Y_{t-d}-r}{\delta}\right) + \sigma(t)\epsilon_t .$

Example 2: Suppose the smooth transition function F(x) is the logistic distribution given by $\left[1+\mathrm{exp}\left(-x\right){\right]}^{-1}$. Then the HMRS-STAR model becomes:

$Y_t = \mu(t) + \theta(t)\left[1+\exp\left(\frac{r-Y_{t-d}}{\delta}\right)\right]^{-1} + \sigma(t)\epsilon_t .$

The quantity 1/δ is called the speed of transition parameter. It could be challenging to estimate using MLE directly, and the EM algorithm circumvents this step.

In the next section, we shall illustrate how to estimate the parameters r and δ using the Expectation Maximization (EM) algorithm and an approximation method for the case of the Gaussian smooth transition function. Indeed, this method also works for the case of the logistic smooth transition function when the following standard linear approximation method for the logistic function F is used:

$F(x) = \frac{1}{1+e^{-x}} \approx \frac{1}{2} + F'(0)x + O(x^2) .$
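As a quick numerical check of this linear approximation, the sketch below compares the logistic function with its first-order expansion around zero. It uses the standard facts F(0) = 1/2 and F′(0) = F(0)(1 − F(0)) = 1/4; the function names are hypothetical and the evaluation points are arbitrary.

```python
import math


def logistic(x):
    """Logistic transition function F(x) = 1 / (1 + exp(-x))."""
    return 1.0 / (1.0 + math.exp(-x))


def logistic_linear(x):
    """First-order expansion around 0, using F(0) = 1/2 and F'(0) = 1/4."""
    return 0.5 + 0.25 * x


# Print the exact value, the approximation, and the absolute error.
for x in (0.1, 0.5, 1.0, 2.0):
    exact = logistic(x)
    approx = logistic_linear(x)
    print(x, exact, approx, abs(exact - approx))
```

The error grows quickly with |x|, which suggests that the approximation is mainly useful when the standardized argument (Y_{t−d} − r)/δ stays close to zero.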

3 Filtering and estimation

The basic idea of the reference probability approach is to start with a reference probability $\overline{P}$ under which the observed process Y has simpler dynamics, (i.e., it is independent of the hidden Markov chain). Then the reference probability $\overline{P}$ is related to the real-world probability P by a measure change. Under P, the observed process is governed by the original dynamics. However, filters for the hidden quantities are derived under the reference probability $\overline{P}$. The reference probability approach for filtering hidden Markov models was discussed in Elliott, Aggoun, and Moore (1995). The mathematical techniques for deriving the filters and filter-based estimates here follow those in Elliott, Aggoun, and Moore (1995). We only present results which are relevant to the simulation and real-data analyses in the following two sections. For some technical details of the results, one may refer to, for example, Elliott, Aggoun, and Moore (1995). In what follows, we start by presenting essential concepts and notations. Then the filters and filter-based estimates are given.

3.1 Filters for the hidden chain and related quantities

Let $\overline{P}$ be a reference probability under which

1. $\left\{{Y}_{t}|t\in \mathcal{T}\mathrm{\setminus }\left\{0\right\}\right\}$ is a sequence of i.i.d. standard normal random variables;

2. the chain X has a transition probability matrix 𝚷.

Define ${\mathbb{Y}}^{0}:=\left\{{\mathcal{Y}}_{t}^{0}|t\in \mathcal{T}\right\}$ and ${\mathbb{G}}^{0}:=\left\{{\mathcal{G}}_{t}^{0}|t\in \mathcal{T}\right\}$ by:

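$\mathcal{Y}_t^0 := \sigma\{Y_1, Y_2, \dots, Y_t\} \vee \mathcal{Y}_0^0 , \quad \mathcal{G}_t^0 := \sigma\{Y_1, Y_2, \dots, Y_t, \mathbf{X}_1, \mathbf{X}_2, \dots, \mathbf{X}_t\} \vee \mathcal{G}_0^0 , \quad t = 1, 2, \dots,$

and

$\mathcal{Y}_0^0 := \sigma\{Y_{-d+1}, Y_{-d+2}, \dots, Y_0\} , \quad \mathcal{G}_0^0 := \sigma\{\mathbf{X}_0\} \vee \mathcal{Y}_0^0 .$

Write $\mathbb{Y}$ and $\mathbb{G}$ for the P-completion of $\mathbb{Y}^0$ and $\mathbb{G}^0$, respectively. Given observed information $\mathcal{Y}_t$ up to time t, we wish to estimate $\mathbf{X}_t$ as:

$\hat{\mathbf{X}}_t = \text{E}[\mathbf{X}_t \mid \mathcal{Y}_t] ,$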

where E is an expectation with respect to the measure P.

By a version of the Bayes’ rule [see, for example, Elliott, Aggoun, and Moore (1995)],

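$\text{E}[\mathbf{X}_t \mid \mathcal{Y}_t] = \frac{\overline{\text{E}}[\Lambda_t \mathbf{X}_t \mid \mathcal{Y}_t]}{\overline{\text{E}}[\Lambda_t \mid \mathcal{Y}_t]} ,$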

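where $\overline{\text{E}}$ is an expectation with respect to the measure $\overline{P}$ and $\Lambda_t$ denotes the Radon-Nikodym derivative of P with respect to $\overline{P}$ restricted to $\mathcal{G}_t$. Write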

$\mathbf{q}_t := \overline{\text{E}}[\Lambda_t \mathbf{X}_t \mid \mathcal{Y}_t] .$(2)

For each i = 1, 2, … , N, let

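$\zeta_i(t, Y_t) := \frac{\phi\left(\sigma_i^{-1}\left(Y_t - \mu_i - \theta_i F\left(\frac{Y_{t-d}-r}{\delta}\right)\right)\right)}{\sigma_i \phi(Y_t)} ,$

$\boldsymbol{\zeta}(t, Y_t) := \left(\zeta_1(t, Y_t), \zeta_2(t, Y_t), \dots, \zeta_N(t, Y_t)\right)^{\prime} \in \Re^N ,$

and write $\text{diag}(\boldsymbol{\zeta}(t, Y_t))$ for the diagonal matrix with diagonal elements being the components in the vector $\boldsymbol{\zeta}(t, Y_t)$. Then the unnormalized filter $\mathbf{q}_t$ satisfies the following recursive equation:

$\mathbf{q}_t = \text{diag}(\boldsymbol{\zeta}(t, Y_t)) \boldsymbol{\Pi} \mathbf{q}_{t-1} , \quad t \in \mathcal{T} \setminus \{0\} .$

Consequently,

$\text{E}[\mathbf{X}_t \mid \mathcal{Y}_t] = \frac{\mathbf{q}_t}{\langle \mathbf{q}_t, \mathbf{1} \rangle} .$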

Here $\mathbf{\text{1}}:=\left(1,1,\dots ,1{\right)}^{\mathrm{\prime }}\in {\mathrm{\Re }}^{N}$.
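The filter recursion above can be sketched in a few lines of code. The version below is a hedged illustration rather than the authors' implementation: the function names, the argument layout, and the use of the Gaussian transition function as the default `F` are assumptions, and `q0` stands for an assumed initial distribution of X_0. The sketch rescales the filter to sum to one at every step; because the recursion is linear in the filter, this leaves the normalized filter unchanged while reducing the risk of numerical underflow.

```python
import math


def normal_pdf(x):
    """Standard normal density."""
    return math.exp(-0.5 * x * x) / math.sqrt(2.0 * math.pi)


def normal_cdf(x):
    """Standard normal distribution function."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))


def zeta(y_t, y_lag, mu, theta, sigma, r, delta, F=normal_cdf):
    """Return the list of zeta_i(t, Y_t), i = 1..N, given Y_t and Y_{t-d}."""
    f = F((y_lag - r) / delta)
    return [normal_pdf((y_t - m - th * f) / s) / (s * normal_pdf(y_t))
            for m, th, s in zip(mu, theta, sigma)]


def filter_states(ys, y_init, q0, Pi, mu, theta, sigma, r, delta, F=normal_cdf):
    """Return the normalized filters E[X_t | Y_t] for t = 1..n.

    ys     : observations Y_1..Y_n
    y_init : initial values Y_{-d+1}..Y_0 (their number defines the delay d)
    Pi     : Pi[j][i] = pi_ji
    """
    d = len(y_init)
    N = len(q0)
    history = list(y_init) + list(ys)   # history[k] holds Y_{k-d+1}
    q = list(q0)
    filters = []
    for t in range(len(ys)):
        z = zeta(history[d + t], history[t], mu, theta, sigma, r, delta, F)
        pred = [sum(Pi[j][i] * q[i] for i in range(N)) for j in range(N)]  # Pi q_{t-1}
        q = [z[j] * pred[j] for j in range(N)]                             # diag(zeta) Pi q_{t-1}
        total = sum(q)
        q = [v / total for v in q]      # rescale so the filter sums to one
        filters.append(q)
    return filters
```

A call such as `filter_states(ys, [0.0], [0.5, 0.5], Pi, mu, theta, sigma, r, delta)` returns one probability vector per observation.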

The following quantities will be used to derive the filter-based estimates of some unknown parameters in the HMRS-STAR model.

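1. The number of transitions of the chain X from state $\mathbf{e}_i$ to state $\mathbf{e}_j$ up to time t, for each i, j = 1, 2, …, N, is: $J_t^{ij} := \sum_{l=1}^{t} \langle \mathbf{X}_{l-1}, \mathbf{e}_i \rangle \langle \mathbf{X}_l, \mathbf{e}_j \rangle .$

2. The occupation time of the chain X in state $\mathbf{e}_i$ up to time t, for each i = 1, 2, … , N, is: $O_t^i := \sum_{l=1}^{t} \langle \mathbf{X}_l, \mathbf{e}_i \rangle .$

3. The “generalized” level process associated with state $\mathbf{e}_i$ up to time t, for each i = 1, 2, … , N, is: $L_{d+1,t}^i := \sum_{l=1}^{t} f_{d+1}(Y_l, Y_{l-1}, \dots, Y_{l-d}) \langle \mathbf{X}_l, \mathbf{e}_i \rangle .$ Here ${f}_{d+1}:{\mathrm{\Re }}^{d+1}\to \mathrm{\Re }$ is a Borel-measurable function.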

As in Elliott, Aggoun, and Moore (1995) we define the following ‘unnormalized’ vector quantities:

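$\sigma(J_t^{ij}\mathbf{X}_t) := \overline{\text{E}}[\Lambda_t J_t^{ij} \mathbf{X}_t \mid \mathcal{Y}_t] , \quad \sigma(O_t^i \mathbf{X}_t) := \overline{\text{E}}[\Lambda_t O_t^i \mathbf{X}_t \mid \mathcal{Y}_t] , \quad \sigma(L_{d+1,t}^i \mathbf{X}_t) := \overline{\text{E}}[\Lambda_t L_{d+1,t}^i \mathbf{X}_t \mid \mathcal{Y}_t] , \quad t \in \mathcal{T} \setminus \{0\} ,$

where

$\sigma(J_0^{ij}\mathbf{X}_0) = \sigma(O_0^i \mathbf{X}_0) = \sigma(L_{d+1,0}^i \mathbf{X}_0) = \mathbf{0} \in \Re^N .$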

Then the exact recursive equations for $\sigma \left({J}_{t}^{ij}{\mathbf{\text{X}}}_{t}\right)$, $\sigma \left({O}_{t}^{i}{\mathbf{\text{X}}}_{t}\right)$ and $\sigma \left({L}_{d+1,t}^{i}{\mathbf{\text{X}}}_{t}\right)$ are, respectively, given by:

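$\sigma(J_t^{ij}\mathbf{X}_t) = \text{diag}(\boldsymbol{\zeta}(t, Y_t)) \boldsymbol{\Pi} \sigma(J_{t-1}^{ij}\mathbf{X}_{t-1}) + \langle \boldsymbol{\zeta}(t, Y_t), \mathbf{e}_j \rangle \langle \mathbf{q}_{t-1}, \mathbf{e}_i \rangle \pi_{ji} \mathbf{e}_j ,$(3)

$\sigma(O_t^i \mathbf{X}_t) = \text{diag}(\boldsymbol{\zeta}(t, Y_t)) \boldsymbol{\Pi} \sigma(O_{t-1}^i \mathbf{X}_{t-1}) + \langle \boldsymbol{\zeta}(t, Y_t), \mathbf{e}_i \rangle \langle \boldsymbol{\Pi} \mathbf{q}_{t-1}, \mathbf{e}_i \rangle \mathbf{e}_i ,$(4)

and

$\sigma(L_{d+1,t}^i \mathbf{X}_t) = \text{diag}(\boldsymbol{\zeta}(t, Y_t)) \boldsymbol{\Pi} \sigma(L_{d+1,t-1}^i \mathbf{X}_{t-1}) + f_{d+1}(Y_t, Y_{t-1}, \dots, Y_{t-d}) \langle \boldsymbol{\zeta}(t, Y_t), \mathbf{e}_i \rangle \langle \boldsymbol{\Pi} \mathbf{q}_{t-1}, \mathbf{e}_i \rangle \mathbf{e}_i ,$(5)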

for each i = 1, 2, … , N. See, for example, Elliott, Aggoun, and Moore (1995).

Note that

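$\sigma(J_t^{ij}) := \overline{\text{E}}[\Lambda_t J_t^{ij} \mid \mathcal{Y}_t] = \langle \sigma(J_t^{ij}\mathbf{X}_t), \mathbf{1} \rangle , \quad \sigma(O_t^i) := \overline{\text{E}}[\Lambda_t O_t^i \mid \mathcal{Y}_t] = \langle \sigma(O_t^i \mathbf{X}_t), \mathbf{1} \rangle , \quad \sigma(L_{d+1,t}^i) := \overline{\text{E}}[\Lambda_t L_{d+1,t}^i \mid \mathcal{Y}_t] = \langle \sigma(L_{d+1,t}^i \mathbf{X}_t), \mathbf{1} \rangle .$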
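Recursions (3) to (5) share the same linear structure, so they can be run side by side with the filter. The sketch below is one possible arrangement, not the authors' code: the function name, the input layout (precomputed vectors of the ζ values and precomputed values of the function f), and the common rescaling at each step are assumptions of this example. Since every recursion is linear in the previous quantities, dividing all of them by the same constant leaves the ratios to ⟨q, 1⟩, that is, the conditional expectations given the observations, unchanged.

```python
def filtered_statistics(zs, fs, q0, Pi):
    """Run recursions (3)-(5) with a common rescaling at every step.

    zs[t-1]  : list of zeta_i(t, Y_t), i = 1..N
    fs[t-1]  : value of f_{d+1}(Y_t, ..., Y_{t-d})
    Pi[j][i] : pi_ji
    Returns (J_hat, O_hat, L_hat), the conditional expectations given Y_n.
    """
    N = len(q0)
    q = list(q0)
    sJ = [[[0.0] * N for _ in range(N)] for _ in range(N)]  # sJ[i][j] ~ sigma(J^{ij} X)
    sO = [[0.0] * N for _ in range(N)]                      # sO[i]    ~ sigma(O^i X)
    sL = [[0.0] * N for _ in range(N)]                      # sL[i]    ~ sigma(L^i X)

    def propagate(vec, z):
        # diag(z) * Pi * vec
        return [z[k] * sum(Pi[k][m] * vec[m] for m in range(N)) for k in range(N)]

    for z, f in zip(zs, fs):
        pq = [sum(Pi[k][m] * q[m] for m in range(N)) for k in range(N)]  # Pi q_{t-1}
        for i in range(N):
            for j in range(N):
                v = propagate(sJ[i][j], z)
                v[j] += z[j] * q[i] * Pi[j][i]      # extra term of equation (3)
                sJ[i][j] = v
            v = propagate(sO[i], z)
            v[i] += z[i] * pq[i]                    # extra term of equation (4)
            sO[i] = v
            v = propagate(sL[i], z)
            v[i] += f * z[i] * pq[i]                # extra term of equation (5)
            sL[i] = v
        q = [z[k] * pq[k] for k in range(N)]
        c = sum(q)
        # Divide everything by the same constant so that the filter sums to one.
        q = [x / c for x in q]
        sJ = [[[x / c for x in sJ[i][j]] for j in range(N)] for i in range(N)]
        sO = [[x / c for x in sO[i]] for i in range(N)]
        sL = [[x / c for x in sL[i]] for i in range(N)]

    J_hat = [[sum(sJ[i][j]) for j in range(N)] for i in range(N)]
    O_hat = [sum(sO[i]) for i in range(N)]
    L_hat = [sum(sL[i]) for i in range(N)]
    return J_hat, O_hat, L_hat
```

Two sanity checks follow directly from the definitions: the estimated occupation times add up to the number of observations, and choosing f identically equal to one makes the level estimates coincide with the occupation-time estimates.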

3.2 Filter-based estimates and the EM algorithm

The filter-based estimates for the unknown parameters in the HMRS-STAR model are derived using the Expectation Maximization, or EM, algorithm. Here we briefly present the idea of the EM algorithm. For details, one may refer to, for example, Elliott, Aggoun, and Moore (1995). Again the mathematical techniques used here follow from those in Elliott, Aggoun, and Moore (1995).

Let $\left\{{P}_{\theta }|\theta \in \mathrm{\Theta }\right\}$ be a family of probability measures on a measurable space $\left(\mathrm{\Omega },\mathcal{F}\right)$ all absolutely continuous with respect to a fixed probability measure $P_0$, where θ is an unknown model parameter and Θ is the parameter space. Write 𝒴 for a sub-σ-field of ℱ. Then the likelihood function for computing an estimate of the unknown parameter θ given the available information in 𝒴 is:

$L(\theta) := \text{E}_0\left[\frac{dP_\theta}{dP_0} \,\Big|\, \mathcal{Y}\right] .$

Here $\text{E}_0$ is the expectation under $P_0$. Then the maximum likelihood estimate (MLE) $\hat{\theta}$ of θ is given by:

$\hat{\theta} := \arg\max_{\theta \in \Theta} L(\theta) .$

It may be challenging to compute the MLE $\stackrel{^}{\theta }$ directly. The EM algorithm provides an iterative approximation method to compute the MLE $\stackrel{^}{\theta }$. It consists of the following four steps, [see, for example, Elliott, Aggoun, and Moore (1995)]:

Step I: Set the counter p = 0 and choose $\hat{\theta}_0$.

Step II: (E-step) Set ${\theta }^{\ast }={\hat{\theta }}_{p}$ and compute $Q\left(\cdot ,{\theta }^{\ast }\right)$, where

$Q(\theta, \theta^{\ast}) := \text{E}_{\theta^{\ast}}\left[\ln\left(\frac{dP_\theta}{dP_{\theta^{\ast}}}\right) \Big|\, \mathcal{Y}\right] ,$

where ${\text{E}}_{{\theta }^{\ast }}$ is the expectation under ${P}_{{\theta }^{\ast }}$.

Step III: (M-step) Find

$\hat{\theta}_{p+1} := \arg\max_{\theta \in \Theta} Q(\theta, \theta^{\ast}) .$

Step IV: Replace p by p + 1 and repeat beginning with Step II until a certain stopping criterion is satisfied.
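The four steps can be illustrated on a deliberately simple toy problem that is unrelated to the HMRS-STAR model: estimating the mixing weight w of a two-component Gaussian mixture whose component means (0 and 3) and unit variances are assumed known. Everything in this sketch, including the data, the function names, and the stopping rule based on the increase in log-likelihood, is a hypothetical choice made for illustration.

```python
import math


def norm_pdf(x, m):
    """Density of a normal distribution with mean m and unit variance."""
    return math.exp(-0.5 * (x - m) ** 2) / math.sqrt(2.0 * math.pi)


def log_lik(w, data):
    """Log-likelihood of the mixing weight w for the toy mixture."""
    return sum(math.log(w * norm_pdf(x, 0.0) + (1.0 - w) * norm_pdf(x, 3.0))
               for x in data)


def em_step(w, data):
    """One E-step and M-step for the mixing weight."""
    # E-step: posterior probability that each point comes from the first component.
    resp = [w * norm_pdf(x, 0.0) / (w * norm_pdf(x, 0.0) + (1.0 - w) * norm_pdf(x, 3.0))
            for x in data]
    # M-step: the maximizer of Q is the average responsibility.
    return sum(resp) / len(resp)


def run_em(w0, data, tol=1e-10, max_iter=1000):
    """Steps I-IV: iterate until the log-likelihood gain falls below tol."""
    w = w0
    history = [log_lik(w0, data)]
    for _ in range(max_iter):
        w = em_step(w, data)
        history.append(log_lik(w, data))
        if history[-1] - history[-2] < tol:
            break
    return w, history


data = [-0.3, 0.1, 0.4, 2.8, 3.1, 3.3, -0.2, 2.9]
w_hat, history = run_em(0.2, data)
```

The recorded log-likelihood values should never decrease from one iteration to the next, which is the monotonicity property that makes EM attractive as an iterative scheme.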

Note that the EM algorithm applied here converges, in general, only to a local maximum of the log-likelihood. For a discussion of the EM algorithm and its convergence properties, one may refer to Baum et al. (1970), Dembo and Zeitouni (1986), and Elliott, Aggoun, and Moore (1995).

The set of parameters of interest can be described by the set 𝚯 defined as follows:

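$\boldsymbol{\Theta} := \left\{(\pi_{kl})_{k,l=1,2,\dots,N}, (\boldsymbol{\mu}, \boldsymbol{\theta}, \boldsymbol{\sigma}), r, \delta\right\} ,$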

where $\mathbit{\mu }:=\left({\mu }_{1},{\mu }_{2},\dots ,{\mu }_{N}{\right)}^{\mathrm{\prime }}\in {\mathrm{\Re }}^{N}$; $\mathbit{\theta }:=\left({\theta }_{1},{\theta }_{2},\dots ,{\theta }_{N}{\right)}^{\mathrm{\prime }}\in {\mathrm{\Re }}^{N}$; $\mathbit{\sigma }:=\left({\sigma }_{1},{\sigma }_{2},\dots ,{\sigma }_{N}{\right)}^{\mathrm{\prime }}\in {\mathrm{\Re }}^{N}$.

Since πji’s are transition probabilities, we also require that

$\sum_{j=1}^{N} \pi_{ji} = 1 , \quad i = 1, 2, \dots, N .$

Suppose now that the set of parameters 𝚯 is given and that the set of observed data described by the σ-field 𝒴n is known. We wish to determine a new set of parameters 𝚯(n) defined by:

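$\boldsymbol{\Theta}(n) := \left\{(\pi_{kl}(n))_{k,l=1,2,\dots,N}, (\boldsymbol{\mu}(n), \boldsymbol{\theta}(n), \boldsymbol{\sigma}(n)), r(n), \delta(n)\right\} ,$

where

$\boldsymbol{\mu}(n) := (\mu_1(n), \mu_2(n), \dots, \mu_N(n))^{\prime} \in \Re^N , \quad \boldsymbol{\theta}(n) := (\theta_1(n), \theta_2(n), \dots, \theta_N(n))^{\prime} \in \Re^N , \quad \boldsymbol{\sigma}(n) := (\sigma_1(n), \sigma_2(n), \dots, \sigma_N(n))^{\prime} \in \Re^N .$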

Such a new set of parameters is determined by maximizing the conditional log-likelihoods defined below. The basic idea is that we update one set of parameters at a time starting with the $\left[{\pi }_{kl}{\right]}_{k,l=1,2,\dots ,N}$. Using this method, it is known that a filter-based estimate ${\stackrel{^}{\pi }}_{kl}\left(n\right)$ for πkl given observed data described by the σ-field 𝒴n is given by:

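$\hat{\pi}_{kl}(n) = \frac{\text{E}[J_n^{kl} \mid \mathcal{Y}_n]}{\text{E}[O_n^{l} \mid \mathcal{Y}_n]} = \frac{\sigma(J_n^{kl})}{\sigma(O_n^{l})} .$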

For derivation, interested readers may refer to Elliott, Aggoun, and Moore (1995), (see Chapter 2 therein).

Consider now another set of parameters (𝝁, 𝜽). To change the set of parameters from (𝝁, 𝜽) to ($\stackrel{^}{\mathbit{\mu }}$(n), $\stackrel{^}{\mathbit{\theta }}$(n)) while keeping other parameters (𝝈, r, δ) constant, we must consider the following factors, (t = 1, 2, … , n),

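$\hat{\lambda}_t^1 := \exp\left\{\frac{1}{2\langle \boldsymbol{\sigma}, \mathbf{X}_t \rangle^2}\left[\left\langle \boldsymbol{\mu} + \boldsymbol{\theta} F\left(\frac{Y_{t-d}-r}{\delta}\right), \mathbf{X}_t \right\rangle^2 - \left\langle \hat{\boldsymbol{\mu}} + \hat{\boldsymbol{\theta}} F\left(\frac{Y_{t-d}-r}{\delta}\right), \mathbf{X}_t \right\rangle^2 - 2Y_t \left\langle \boldsymbol{\mu} + \boldsymbol{\theta} F\left(\frac{Y_{t-d}-r}{\delta}\right), \mathbf{X}_t \right\rangle + 2Y_t \left\langle \hat{\boldsymbol{\mu}} + \hat{\boldsymbol{\theta}} F\left(\frac{Y_{t-d}-r}{\delta}\right), \mathbf{X}_t \right\rangle\right]\right\} .$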

Note that, to simplify the notation, we write $\stackrel{^}{\mathbit{\mu }}$ and $\stackrel{^}{\mathbit{\theta }}$ for $\stackrel{^}{\mathbit{\mu }}$(n) and $\stackrel{^}{\mathbit{\theta }}$(n), respectively.

Write

$\hat{\Lambda}_t^1 := \prod_{u=1}^{t} \hat{\lambda}_u^1 , \quad \hat{\Lambda}_0^1 := 1 .$

Then a new probability measure $\hat{P}^1$ is defined so that the restriction of its Radon-Nikodym derivative $\frac{d\hat{P}^1}{dP}$ to $\mathcal{G}_n$ is given by:

$\frac{d\hat{P}^1}{dP}\Big|_{\mathcal{G}_n} = \hat{\Lambda}_n^1 .$

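It is then not difficult to see that under $\hat{P}^1$, the sequence defined by:

$\frac{Y_t - \left\langle \hat{\boldsymbol{\mu}} + \hat{\boldsymbol{\theta}} F\left(\frac{Y_{t-d}-r}{\delta}\right), \mathbf{X}_t \right\rangle}{\langle \boldsymbol{\sigma}, \mathbf{X}_t \rangle} , \quad t = 1, 2, \dots, n ,$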

is a sequence of N(0, 1) i.i.d. random variables.

Now

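$\begin{aligned} \ln \hat{\Lambda}_n^1 &= \sum_{t=1}^{n} \frac{1}{2\langle \boldsymbol{\sigma}, \mathbf{X}_t \rangle^2}\left(\left\langle \boldsymbol{\mu} + \boldsymbol{\theta} F\left(\frac{Y_{t-d}-r}{\delta}\right), \mathbf{X}_t \right\rangle^2 - \left\langle \hat{\boldsymbol{\mu}} + \hat{\boldsymbol{\theta}} F\left(\frac{Y_{t-d}-r}{\delta}\right), \mathbf{X}_t \right\rangle^2 - 2Y_t \left\langle \boldsymbol{\mu} + \boldsymbol{\theta} F\left(\frac{Y_{t-d}-r}{\delta}\right), \mathbf{X}_t \right\rangle + 2Y_t \left\langle \hat{\boldsymbol{\mu}} + \hat{\boldsymbol{\theta}} F\left(\frac{Y_{t-d}-r}{\delta}\right), \mathbf{X}_t \right\rangle\right) \\ &= \sum_{t=1}^{n}\sum_{i=1}^{N} \frac{1}{2\sigma_i^2}\left[\left(\mu_i + \theta_i F\left(\frac{Y_{t-d}-r}{\delta}\right)\right)^2 - \left(\hat{\mu}_i + \hat{\theta}_i F\left(\frac{Y_{t-d}-r}{\delta}\right)\right)^2 - 2Y_t\left(\mu_i + \theta_i F\left(\frac{Y_{t-d}-r}{\delta}\right)\right) + 2Y_t\left(\hat{\mu}_i + \hat{\theta}_i F\left(\frac{Y_{t-d}-r}{\delta}\right)\right)\right]\langle \mathbf{X}_t, \mathbf{e}_i \rangle \\ &= \sum_{t=1}^{n}\sum_{i=1}^{N} \frac{1}{2\sigma_i^2}\left[-\left(\hat{\mu}_i + \hat{\theta}_i F\left(\frac{Y_{t-d}-r}{\delta}\right)\right)^2 + 2Y_t\left(\hat{\mu}_i + \hat{\theta}_i F\left(\frac{Y_{t-d}-r}{\delta}\right)\right)\right]\langle \mathbf{X}_t, \mathbf{e}_i \rangle + R(\boldsymbol{\mu}, \boldsymbol{\theta}, \boldsymbol{\sigma}, r, \delta) , \end{aligned}$

where $R(\boldsymbol{\mu}, \boldsymbol{\theta}, \boldsymbol{\sigma}, r, \delta)$ does not involve $\hat{\boldsymbol{\mu}}$ and $\hat{\boldsymbol{\theta}}$ and it represents a quantity which may change from line to line.

Consequently,

$\ln \hat{\Lambda}_n^1 = \sum_{t=1}^{n}\sum_{i=1}^{N} \frac{1}{2\sigma_i^2}\left[-\hat{\mu}_i^2 - \hat{\theta}_i^2 F^2\left(\frac{Y_{t-d}-r}{\delta}\right) - 2\hat{\mu}_i\hat{\theta}_i F\left(\frac{Y_{t-d}-r}{\delta}\right) + 2\hat{\mu}_i Y_t + 2\hat{\theta}_i Y_t F\left(\frac{Y_{t-d}-r}{\delta}\right)\right]\langle \mathbf{X}_t, \mathbf{e}_i \rangle + R(\boldsymbol{\mu}, \boldsymbol{\theta}, \boldsymbol{\sigma}, r, \delta) .$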

Define the following functions:

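$f_{d+1}^1(Y_t, Y_{t-1}, \dots, Y_{t-d}) := Y_t , \quad f_{d+1}^2(Y_t, Y_{t-1}, \dots, Y_{t-d}) := F\left(\frac{Y_{t-d}-r}{\delta}\right) , \quad f_{d+1}^3(Y_t, Y_{t-1}, \dots, Y_{t-d}) := Y_t F\left(\frac{Y_{t-d}-r}{\delta}\right) , \quad f_{d+1}^4(Y_t, Y_{t-1}, \dots, Y_{t-d}) := F^2\left(\frac{Y_{t-d}-r}{\delta}\right) .$

Consider the following functionals, for i = 1, 2, … , N and j = 1, 2, 3, 4:

$L_{d+1,n}^i(f_{d+1}^j) := \sum_{t=1}^{n} f_{d+1}^j(Y_t, Y_{t-1}, \dots, Y_{t-d}) \langle \mathbf{X}_t, \mathbf{e}_i \rangle .$

Then

$\ln \hat{\Lambda}_n^1 = \sum_{i=1}^{N} \frac{1}{2\sigma_i^2}\left[-\hat{\mu}_i^2 O_n^i - \hat{\theta}_i^2 L_{d+1,n}^i(f_{d+1}^4) - 2\hat{\mu}_i\hat{\theta}_i L_{d+1,n}^i(f_{d+1}^2) + 2\hat{\mu}_i L_{d+1,n}^i(f_{d+1}^1) + 2\hat{\theta}_i L_{d+1,n}^i(f_{d+1}^3)\right] + R(\boldsymbol{\mu}, \boldsymbol{\theta}, \boldsymbol{\sigma}, r, \delta) .$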

Again R(𝝁, 𝜽, 𝝈, r, δ) is a quantity which does not depend on $\stackrel{^}{\mu }$i or $\stackrel{^}{\theta }$i and can change from line to line.

Conditioning on 𝒴n under P gives:

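$\text{E}[\ln \hat{\Lambda}_n^1 \mid \mathcal{Y}_n] = \sum_{i=1}^{N} \frac{1}{2\sigma_i^2}\left[-\hat{\mu}_i^2 \hat{O}_n^i - \hat{\theta}_i^2 \hat{L}_{d+1,n}^i(f_{d+1}^4) - 2\hat{\mu}_i\hat{\theta}_i \hat{L}_{d+1,n}^i(f_{d+1}^2) + 2\hat{\mu}_i \hat{L}_{d+1,n}^i(f_{d+1}^1) + 2\hat{\theta}_i \hat{L}_{d+1,n}^i(f_{d+1}^3)\right] + R(\boldsymbol{\mu}, \boldsymbol{\theta}, \boldsymbol{\sigma}, r, \delta) .$

Differentiating $\text{E}\left[\ln \hat{\Lambda}_n^1 \mid \mathcal{Y}_n\right]$ with respect to $\hat{\mu}_i$ and setting the derivative equal to zero gives:

$\hat{\mu}_i \hat{O}_n^i + \hat{\theta}_i \hat{L}_{d+1,n}^i(f_{d+1}^2) = \hat{L}_{d+1,n}^i(f_{d+1}^1) .$

Solving for $\hat{\mu}_i$ then gives:

$\hat{\mu}_i = \frac{\hat{L}_{d+1,n}^i(f_{d+1}^1) - \hat{\theta}_i \hat{L}_{d+1,n}^i(f_{d+1}^2)}{\hat{O}_n^i} = \frac{\sigma(L_{d+1,n}^i(f_{d+1}^1)) - \hat{\theta}_i \sigma(L_{d+1,n}^i(f_{d+1}^2))}{\sigma(O_n^i)} .$(6)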

Note that the formula for $\hat{\mu}_i$ depends on $\hat{\theta}_i$.

Differentiating $\text{E}\left[\ln \hat{\Lambda}_n^1 \mid \mathcal{Y}_n\right]$ with respect to $\hat{\theta}_i$ and setting the derivative equal to zero gives:

$\hat{\theta}_i \hat{L}_{d+1,n}^i(f_{d+1}^4) + \hat{\mu}_i \hat{L}_{d+1,n}^i(f_{d+1}^2) = \hat{L}_{d+1,n}^i(f_{d+1}^3) ,$

and this gives

$\hat{\theta}_i = \frac{\hat{L}_{d+1,n}^i(f_{d+1}^3) - \hat{\mu}_i \hat{L}_{d+1,n}^i(f_{d+1}^2)}{\hat{L}_{d+1,n}^i(f_{d+1}^4)} = \frac{\sigma(L_{d+1,n}^i(f_{d+1}^3)) - \hat{\mu}_i \sigma(L_{d+1,n}^i(f_{d+1}^2))}{\sigma(L_{d+1,n}^i(f_{d+1}^4))} ,$(7)

where the exact recursive formulae for evaluating $\sigma\left(L_{d+1,n}^i(f_{d+1}^j)\right)$, j = 1, 2, 3, 4, and $\sigma\left(O_n^i\right)$ are given by equations (5) and (4) in Section 3.1, respectively. Note that the formula for $\hat{\theta}_i$ depends on $\hat{\mu}_i$.
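Because (6) depends on the estimate of θ_i and (7) depends on the estimate of μ_i, one way to use them in practice is to treat the two first-order conditions as a 2×2 linear system and solve them jointly. The sketch below does this with Cramer's rule. It is an assumption about how one might organize the computation, not a statement of the authors' procedure, and the function name and argument names (`O`, `L1`, `L2`, `L3`, `L4` for the filtered occupation time and the filtered level sums of the functions f¹ to f⁴) are hypothetical.

```python
def update_mu_theta(O, L1, L2, L3, L4):
    """Solve the two first-order conditions for (mu_hat, theta_hat) in one regime.

    mu * O  + theta * L2 = L1
    mu * L2 + theta * L4 = L3
    """
    det = O * L4 - L2 * L2
    if abs(det) <= 1e-12 * max(1.0, abs(O * L4)):
        raise ValueError("first-order conditions are (nearly) singular")
    mu_hat = (L1 * L4 - L2 * L3) / det
    theta_hat = (O * L3 - L2 * L1) / det
    return mu_hat, theta_hat


# Toy check with a single, fully observed regime and noise-free data y = 2 + 3 f.
f_vals = [0.1, 0.4, 0.7, 0.9]
y_vals = [2.0 + 3.0 * f for f in f_vals]
O = float(len(f_vals))
L1 = sum(y_vals)
L2 = sum(f_vals)
L3 = sum(y * f for y, f in zip(y_vals, f_vals))
L4 = sum(f * f for f in f_vals)
mu_hat, theta_hat = update_mu_theta(O, L1, L2, L3, L4)
```

In the toy check the statistics are computed from noise-free data in one regime, so the solver should recover the intercept and slope used to generate them.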

Consider now the set of parameters σi, i = 1, 2, … , N. To change the parameters from σi to $\stackrel{^}{\sigma }$i(n), i = 1, 2, … , N, while keeping (𝝁, 𝜽, r, δ) fixed, we must consider factors, (t = 1, 2, ⋯ , n):

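$\hat{\lambda}_t^2 := \frac{\langle \boldsymbol{\sigma}, \mathbf{X}_t \rangle}{\langle \hat{\boldsymbol{\sigma}}, \mathbf{X}_t \rangle} \cdot \frac{\exp\left[-\frac{1}{2\langle \hat{\boldsymbol{\sigma}}, \mathbf{X}_t \rangle^2}\left(Y_t - \left\langle \boldsymbol{\mu} + \boldsymbol{\theta} F\left(\frac{Y_{t-d}-r}{\delta}\right), \mathbf{X}_t \right\rangle\right)^2\right]}{\exp\left[-\frac{1}{2\langle \boldsymbol{\sigma}, \mathbf{X}_t \rangle^2}\left(Y_t - \left\langle \boldsymbol{\mu} + \boldsymbol{\theta} F\left(\frac{Y_{t-d}-r}{\delta}\right), \mathbf{X}_t \right\rangle\right)^2\right]} .$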

Here we assume that $\stackrel{^}{\sigma }$i(n) > 0.

Again, we write

$\hat{\Lambda}_t^2 := \prod_{u=1}^{t} \hat{\lambda}_u^2 , \quad \hat{\Lambda}_0^2 := 1 .$

A new probability measure $\hat{P}^2$ can then be defined so that the restriction of its Radon-Nikodym derivative $\frac{d\hat{P}^2}{dP}$ to $\mathcal{G}_n$ is given by:

$\frac{d\hat{P}^2}{dP}\Big|_{\mathcal{G}_n} = \hat{\Lambda}_n^2 .$

Now,

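$\begin{aligned} \ln \hat{\Lambda}_n^2 &= \sum_{t=1}^{n}\left[-\ln \langle \hat{\boldsymbol{\sigma}}, \mathbf{X}_t \rangle - \frac{1}{2\langle \hat{\boldsymbol{\sigma}}, \mathbf{X}_t \rangle^2}\left(Y_t - \left\langle \boldsymbol{\mu} + \boldsymbol{\theta} F\left(\frac{Y_{t-d}-r}{\delta}\right), \mathbf{X}_t \right\rangle\right)^2\right] + R(\boldsymbol{\mu}, \boldsymbol{\theta}, \boldsymbol{\sigma}, r, \delta) \\ &= \sum_{t=1}^{n}\sum_{i=1}^{N}\left[-\ln \hat{\sigma}_i - \frac{1}{2\hat{\sigma}_i^2}\left(Y_t - \mu_i - \theta_i F\left(\frac{Y_{t-d}-r}{\delta}\right)\right)^2\right]\langle \mathbf{X}_t, \mathbf{e}_i \rangle + R(\boldsymbol{\mu}, \boldsymbol{\theta}, \boldsymbol{\sigma}, r, \delta) \\ &= \sum_{t=1}^{n}\sum_{i=1}^{N}\left[-\ln \hat{\sigma}_i - \frac{1}{2\hat{\sigma}_i^2}\left(Y_t^2 - 2Y_t\mu_i - 2\theta_i Y_t F\left(\frac{Y_{t-d}-r}{\delta}\right) + \mu_i^2 + \theta_i^2 F^2\left(\frac{Y_{t-d}-r}{\delta}\right) + 2\mu_i\theta_i F\left(\frac{Y_{t-d}-r}{\delta}\right)\right)\right]\langle \mathbf{X}_t, \mathbf{e}_i \rangle + R(\boldsymbol{\mu}, \boldsymbol{\theta}, \boldsymbol{\sigma}, r, \delta) . \end{aligned}$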

Define

$f_{d+1}^5(Y_t, Y_{t-1}, \dots, Y_{t-d}) := Y_t^2 ,$

and

$L_{d+1,n}^i(f_{d+1}^5) = \sum_{t=1}^{n} f_{d+1}^5(Y_t, Y_{t-1}, \dots, Y_{t-d}) \langle \mathbf{X}_t, \mathbf{e}_i \rangle .$

Again conditioning on 𝒴n under P gives:

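$\text{E}[\ln \hat{\Lambda}_n^2 \mid \mathcal{Y}_n] = \sum_{i=1}^{N}\left[-\ln \hat{\sigma}_i \, \hat{O}_n^i - \frac{1}{2\hat{\sigma}_i^2}\left(\hat{L}_{d+1,n}^i(f_{d+1}^5) - 2\mu_i \hat{L}_{d+1,n}^i(f_{d+1}^1) - 2\theta_i \hat{L}_{d+1,n}^i(f_{d+1}^3) + \mu_i^2 \hat{O}_n^i + \theta_i^2 \hat{L}_{d+1,n}^i(f_{d+1}^4) + 2\mu_i\theta_i \hat{L}_{d+1,n}^i(f_{d+1}^2)\right)\right] + R(\boldsymbol{\mu}, \boldsymbol{\theta}, \boldsymbol{\sigma}, r, \delta) .$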

Differentiating $\text{E}\left[\mathrm{ln}{\stackrel{^}{\mathrm{\Lambda }}}_{n}^{2}|{\mathcal{Y}}_{n}\right]$ with respect to $\stackrel{^}{\sigma }$i and setting the derivative equal to zero gives:

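$\begin{aligned} \hat{\sigma}_i^2 &= \frac{1}{\hat{O}_n^i}\left(\hat{L}_{d+1,n}^i(f_{d+1}^5) - 2\mu_i \hat{L}_{d+1,n}^i(f_{d+1}^1) - 2\theta_i \hat{L}_{d+1,n}^i(f_{d+1}^3) + \mu_i^2 \hat{O}_n^i + \theta_i^2 \hat{L}_{d+1,n}^i(f_{d+1}^4) + 2\mu_i\theta_i \hat{L}_{d+1,n}^i(f_{d+1}^2)\right) \\ &= \frac{1}{\sigma(O_n^i)}\left(\sigma(L_{d+1,n}^i(f_{d+1}^5)) - 2\mu_i \sigma(L_{d+1,n}^i(f_{d+1}^1)) - 2\theta_i \sigma(L_{d+1,n}^i(f_{d+1}^3)) + \mu_i^2 \sigma(O_n^i) + \theta_i^2 \sigma(L_{d+1,n}^i(f_{d+1}^4)) + 2\mu_i\theta_i \sigma(L_{d+1,n}^i(f_{d+1}^2))\right) . \end{aligned}$(8)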

By construction, these estimates are greater than zero. Note that the formula for ${\stackrel{^}{\sigma }}_{i}^{2}$ depends on μi and θi.
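Formula (8) is a ratio of filtered statistics and translates directly into code. The sketch below is a minimal illustration with hypothetical names (`O` for the filtered occupation time and `L1` to `L5` for the filtered level sums of f¹ to f⁵); it is not taken from the authors' implementation. The numerator is the expansion of a sum of squared residuals Y_t − μ_i − θ_i F(·), so in floating-point arithmetic it can suffer from cancellation when the residuals are very small.

```python
def update_sigma2(O, L1, L2, L3, L4, L5, mu, theta):
    """Variance update for one regime, following the structure of equation (8)."""
    numerator = (L5 - 2.0 * mu * L1 - 2.0 * theta * L3
                 + mu * mu * O + theta * theta * L4 + 2.0 * mu * theta * L2)
    return numerator / O


# Toy check with a single, fully observed regime: y = 2 + 3 f + e.
f_vals = [0.1, 0.4, 0.7, 0.9]
e_vals = [0.1, -0.1, 0.2, -0.2]
y_vals = [2.0 + 3.0 * f + e for f, e in zip(f_vals, e_vals)]
O = float(len(f_vals))
L1 = sum(y_vals)
L2 = sum(f_vals)
L3 = sum(y * f for y, f in zip(y_vals, f_vals))
L4 = sum(f * f for f in f_vals)
L5 = sum(y * y for y in y_vals)
sigma2_hat = update_sigma2(O, L1, L2, L3, L4, L5, mu=2.0, theta=3.0)
```

In the toy check the result equals the mean of the squared residuals, which is 0.025 for the residuals chosen here.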

Finally, we consider the set of parameters (r, δ). In this case, we consider a specific form of the smooth transition function F. For illustration, we consider the “normal” smooth transition function given in Example 1. That is,

$F(x)=\Phi(x)=\int_{-\infty}^{x}\frac{1}{\sqrt{2\pi}}e^{-\frac{1}{2}y^{2}}\,dy .$

In this case, we consider a Laplace series expansion for Φ(x) as follows:

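$\Phi(x)=\frac{1}{2}+\frac{1}{\sqrt{2\pi}}\sum_{n=0}^{\infty}\frac{(-1)^{n}x^{2n+1}}{n!\,2^{n}(2n+1)}=\frac{1}{2}+\frac{1}{\sqrt{2\pi}}\left(x-\frac{x^{3}}{6}+\frac{x^{5}}{40}-\cdots\right)=\frac{1}{2}+\frac{1}{\sqrt{2\pi}}x+R(x^{3}) .$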
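To get a feel for how much is lost by truncating the expansion, the short Python sketch below compares Φ(x) with partial sums of the series. The function names are illustrative only, and the evaluation points are arbitrary.

```python
import math


def phi_exact(x):
    """Standard normal distribution function via the error function."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))


def phi_series(x, terms):
    """Partial sum of the series above, using n = 0, ..., terms - 1."""
    total = 0.0
    for n in range(terms):
        total += ((-1) ** n * x ** (2 * n + 1)
                  / (math.factorial(n) * 2 ** n * (2 * n + 1)))
    return 0.5 + total / math.sqrt(2.0 * math.pi)


# Columns: x, exact value, linear approximation, three-term approximation.
for x in (0.1, 0.5, 1.0, 2.0):
    print(x, phi_exact(x), phi_series(x, 1), phi_series(x, 3))
```

The linear approximation is close when the argument is small in absolute value and deteriorates quickly as the argument grows, so its quality in the estimation step depends on the typical size of (Yt−d − r)/δ.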

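To change the parameters (r, δ) to ($\hat{r}(n)$, $\hat{\delta}(n)$) while keeping the other parameters (𝝁, 𝜽, 𝝈) fixed, we consider the following factors (t = 1, 2, ⋯ , n):

$\hat{\lambda}_t^{3} := \exp\left\{\frac{1}{2\langle\boldsymbol{\sigma},X_t\rangle^{2}}\left[\left\langle\boldsymbol{\mu}+\boldsymbol{\theta}\Phi\left(\frac{Y_{t-d}-r}{\delta}\right),X_t\right\rangle^{2}-\left\langle\boldsymbol{\mu}+\boldsymbol{\theta}\Phi\left(\frac{Y_{t-d}-\hat{r}}{\hat{\delta}}\right),X_t\right\rangle^{2}-2Y_t\left\langle\boldsymbol{\mu}+\boldsymbol{\theta}\Phi\left(\frac{Y_{t-d}-r}{\delta}\right),X_t\right\rangle+2Y_t\left\langle\boldsymbol{\mu}+\boldsymbol{\theta}\Phi\left(\frac{Y_{t-d}-\hat{r}}{\hat{\delta}}\right),X_t\right\rangle\right]\right\} .$

Write

$\hat{\Lambda}_t^{3} := \prod_{u=1}^{t}\hat{\lambda}_u^{3} , \quad \hat{\Lambda}_0^{3} := 1 .$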

Again, a new probability measure $\hat{P}^{3}$ is defined so that the restriction of its Radon-Nikodym derivative $\frac{d\hat{P}^{3}}{dP}$ to $\mathcal{G}_n$ is given by:

$\left.\frac{d\hat{P}^{3}}{dP}\right|_{\mathcal{G}_n} = \hat{\Lambda}_n^{3} .$

Consequently,

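$\begin{aligned}
\ln\hat{\Lambda}_n^{3} &= \sum_{t=1}^{n}\sum_{i=1}^{N}\frac{1}{2\sigma_i^{2}}\left[\left(\mu_i+\theta_i\Phi\left(\frac{Y_{t-d}-r}{\delta}\right)\right)^{2}-\left(\mu_i+\theta_i\Phi\left(\frac{Y_{t-d}-\hat{r}}{\hat{\delta}}\right)\right)^{2}-2Y_t\left(\mu_i+\theta_i\Phi\left(\frac{Y_{t-d}-r}{\delta}\right)\right)+2Y_t\left(\mu_i+\theta_i\Phi\left(\frac{Y_{t-d}-\hat{r}}{\hat{\delta}}\right)\right)\right]\langle X_t,e_i\rangle \\
&= \sum_{t=1}^{n}\sum_{i=1}^{N}\frac{1}{2\sigma_i^{2}}\left[-\left(\mu_i+\theta_i\Phi\left(\frac{Y_{t-d}-\hat{r}}{\hat{\delta}}\right)\right)^{2}+2Y_t\theta_i\Phi\left(\frac{Y_{t-d}-\hat{r}}{\hat{\delta}}\right)\right]\langle X_t,e_i\rangle+R(\boldsymbol{\mu},\boldsymbol{\theta},\boldsymbol{\sigma},r,\delta) \\
&= \sum_{t=1}^{n}\sum_{i=1}^{N}\frac{1}{2\sigma_i^{2}}\left[-\theta_i^{2}\Phi^{2}\left(\frac{Y_{t-d}-\hat{r}}{\hat{\delta}}\right)+2(Y_t-\mu_i)\theta_i\Phi\left(\frac{Y_{t-d}-\hat{r}}{\hat{\delta}}\right)\right]\langle X_t,e_i\rangle+R(\boldsymbol{\mu},\boldsymbol{\theta},\boldsymbol{\sigma},r,\delta) \\
&\approx \sum_{t=1}^{n}\sum_{i=1}^{N}\frac{1}{2\sigma_i^{2}}\left\{-\theta_i^{2}\left[\frac{1}{4}+\frac{1}{\sqrt{2\pi}}\left(\frac{Y_{t-d}-\hat{r}}{\hat{\delta}}\right)+\frac{1}{2\pi}\left(\frac{Y_{t-d}-\hat{r}}{\hat{\delta}}\right)^{2}\right]+2(Y_t-\mu_i)\theta_i\left[\frac{1}{2}+\frac{1}{\sqrt{2\pi}}\left(\frac{Y_{t-d}-\hat{r}}{\hat{\delta}}\right)\right]\right\}\langle X_t,e_i\rangle+R(\boldsymbol{\mu},\boldsymbol{\theta},\boldsymbol{\sigma},r,\delta) \\
&= \sum_{t=1}^{n}\sum_{i=1}^{N}\frac{1}{2\sigma_i^{2}}\left[\left(-\frac{1}{4}\theta_i^{2}+\frac{\theta_i^{2}\hat{r}}{\sqrt{2\pi}\hat{\delta}}-\frac{\theta_i^{2}\hat{r}^{2}}{2\pi\hat{\delta}^{2}}-\mu_i\theta_i+\frac{2\mu_i\theta_i\hat{r}}{\sqrt{2\pi}\hat{\delta}}\right)+\left(\frac{\theta_i^{2}\hat{r}}{\pi\hat{\delta}^{2}}-\frac{\theta_i^{2}}{\sqrt{2\pi}\hat{\delta}}-\frac{2\mu_i\theta_i}{\sqrt{2\pi}\hat{\delta}}\right)Y_{t-d}+\left(\theta_i-\frac{2\hat{r}\theta_i}{\sqrt{2\pi}\hat{\delta}}\right)Y_t+\frac{2\theta_i}{\sqrt{2\pi}\hat{\delta}}Y_tY_{t-d}-\frac{\theta_i^{2}}{2\pi\hat{\delta}^{2}}Y_{t-d}^{2}\right]\langle X_t,e_i\rangle+R(\boldsymbol{\mu},\boldsymbol{\theta},\boldsymbol{\sigma},r,\delta) ,
\end{aligned}$

where R(𝝁, 𝜽, 𝝈, r, δ) does not involve $\hat{r}$ and $\hat{\delta}$.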

Write

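$f_{d+1}^{6}(Y_t,Y_{t-1},\cdots,Y_{t-d})=Y_{t-d} , \quad f_{d+1}^{7}(Y_t,Y_{t-1},\cdots,Y_{t-d})=Y_{t-d}^{2} , \quad f_{d+1}^{8}(Y_t,Y_{t-1},\cdots,Y_{t-d})=Y_tY_{t-d} ,$

and for j = 6, 7, 8,

$L_{d+1,n}^{i}(f_{d+1}^{j})=\sum_{t=1}^{n}f_{d+1}^{j}(Y_t,Y_{t-1},\cdots,Y_{t-d})\langle X_t,e_i\rangle .$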

Consequently,

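$\ln\hat{\Lambda}_n^{3}=\sum_{i=1}^{N}\frac{1}{2\sigma_i^{2}}\left[\left(-\frac{1}{4}\theta_i^{2}+\frac{\theta_i^{2}\hat{r}}{\sqrt{2\pi}\hat{\delta}}-\frac{\theta_i^{2}\hat{r}^{2}}{2\pi\hat{\delta}^{2}}-\mu_i\theta_i+\frac{2\mu_i\theta_i\hat{r}}{\sqrt{2\pi}\hat{\delta}}\right)O_n^{i}+\left(\frac{\theta_i^{2}\hat{r}}{\pi\hat{\delta}^{2}}-\frac{\theta_i^{2}}{\sqrt{2\pi}\hat{\delta}}-\frac{2\mu_i\theta_i}{\sqrt{2\pi}\hat{\delta}}\right)L_{d+1,n}^{i}(f_{d+1}^{6})+\left(\theta_i-\frac{2\hat{r}\theta_i}{\sqrt{2\pi}\hat{\delta}}\right)L_{d+1,n}^{i}(f_{d+1}^{1})+\frac{2\theta_i}{\sqrt{2\pi}\hat{\delta}}L_{d+1,n}^{i}(f_{d+1}^{8})-\frac{\theta_i^{2}}{2\pi\hat{\delta}^{2}}L_{d+1,n}^{i}(f_{d+1}^{7})\right]+R(\boldsymbol{\mu},\boldsymbol{\theta},\boldsymbol{\sigma},r,\delta) .$

Conditioning on 𝒴n under P then gives:

$\mathrm{E}[\ln\hat{\Lambda}_n^{3}\mid\mathcal{Y}_n]=\sum_{i=1}^{N}\frac{1}{2\sigma_i^{2}}\left[\left(-\frac{1}{4}\theta_i^{2}+\frac{\theta_i^{2}\hat{r}}{\sqrt{2\pi}\hat{\delta}}-\frac{\theta_i^{2}\hat{r}^{2}}{2\pi\hat{\delta}^{2}}-\mu_i\theta_i+\frac{2\mu_i\theta_i\hat{r}}{\sqrt{2\pi}\hat{\delta}}\right)\hat{O}_n^{i}+\left(\frac{\theta_i^{2}\hat{r}}{\pi\hat{\delta}^{2}}-\frac{\theta_i^{2}}{\sqrt{2\pi}\hat{\delta}}-\frac{2\mu_i\theta_i}{\sqrt{2\pi}\hat{\delta}}\right)\hat{L}_{d+1,n}^{i}(f_{d+1}^{6})+\left(\theta_i-\frac{2\hat{r}\theta_i}{\sqrt{2\pi}\hat{\delta}}\right)\hat{L}_{d+1,n}^{i}(f_{d+1}^{1})+\frac{2\theta_i}{\sqrt{2\pi}\hat{\delta}}\hat{L}_{d+1,n}^{i}(f_{d+1}^{8})-\frac{\theta_i^{2}}{2\pi\hat{\delta}^{2}}\hat{L}_{d+1,n}^{i}(f_{d+1}^{7})\right]+R(\boldsymbol{\mu},\boldsymbol{\theta},\boldsymbol{\sigma},r,\delta) .$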

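Differentiating with respect to $\hat{r}$ and setting the derivative equal to zero gives:

$\begin{aligned}
\hat{r} &= \frac{\sum_{i=1}^{N}\frac{1}{2\sigma_i^{2}}\left[\left[\frac{\theta_i^{2}}{\sqrt{2\pi}\hat{\delta}}+\frac{2\mu_i\theta_i}{\sqrt{2\pi}\hat{\delta}}\right]\hat{O}_n^{i}+\frac{\theta_i^{2}}{\pi\hat{\delta}^{2}}\hat{L}_{d+1,n}^{i}(f_{d+1}^{6})-\frac{2\theta_i}{\sqrt{2\pi}\hat{\delta}}\hat{L}_{d+1,n}^{i}(f_{d+1}^{1})\right]}{\sum_{i=1}^{N}\frac{1}{2\sigma_i^{2}}\frac{\theta_i^{2}}{\pi\hat{\delta}^{2}}\hat{O}_n^{i}} \\
&= \frac{\sum_{i=1}^{N}\frac{1}{2\sigma_i^{2}}\left[\left[\frac{\theta_i^{2}}{\sqrt{2\pi}\hat{\delta}}+\frac{2\mu_i\theta_i}{\sqrt{2\pi}\hat{\delta}}\right]\sigma(O_n^{i})+\frac{\theta_i^{2}}{\pi\hat{\delta}^{2}}\sigma(L_{d+1,n}^{i}(f_{d+1}^{6}))-\frac{2\theta_i}{\sqrt{2\pi}\hat{\delta}}\sigma(L_{d+1,n}^{i}(f_{d+1}^{1}))\right]}{\sum_{i=1}^{N}\frac{1}{2\sigma_i^{2}}\frac{\theta_i^{2}}{\pi\hat{\delta}^{2}}\sigma(O_n^{i})}.
\end{aligned}$ (9)

Note that the formula for $\hat{r}$ depends on $\hat{\delta}$, θi, μi and σi.

Differentiating with respect to $\hat{\delta}$ and setting the derivative equal to zero gives:

$\begin{aligned}
\hat{\delta} &= \frac{\sum_{i=1}^{N}\frac{\theta_i^{2}}{2\pi\sigma_i^{2}}\left[-\hat{L}_{d+1,n}^{i}(f_{d+1}^{7})+2\hat{r}\hat{L}_{d+1,n}^{i}(f_{d+1}^{6})-\hat{r}^{2}\hat{O}_n^{i}\right]}{\sum_{i=1}^{N}\frac{1}{2\sigma_i^{2}}\left[a_i\hat{O}_n^{i}+b_i\hat{L}_{d+1,n}^{i}(f_{d+1}^{6})+c_i\hat{L}_{d+1,n}^{i}(f_{d+1}^{1})+d_i\hat{L}_{d+1,n}^{i}(f_{d+1}^{8})\right]} \\
&= \frac{\sum_{i=1}^{N}\frac{\theta_i^{2}}{2\pi\sigma_i^{2}}\left[-\sigma(L_{d+1,n}^{i}(f_{d+1}^{7}))+2\hat{r}\sigma(L_{d+1,n}^{i}(f_{d+1}^{6}))-\hat{r}^{2}\sigma(O_n^{i})\right]}{\sum_{i=1}^{N}\frac{1}{2\sigma_i^{2}}\left[a_i\sigma(O_n^{i})+b_i\sigma(L_{d+1,n}^{i}(f_{d+1}^{6}))+c_i\sigma(L_{d+1,n}^{i}(f_{d+1}^{1}))+d_i\sigma(L_{d+1,n}^{i}(f_{d+1}^{8}))\right]},
\end{aligned}$ (10)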

where

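$a_i:=-\frac{\theta_i^{2}\hat{r}}{\sqrt{2\pi}}-\frac{2\mu_i\theta_i\hat{r}}{\sqrt{2\pi}} , \quad b_i:=\frac{\theta_i^{2}}{\sqrt{2\pi}}+\frac{2\mu_i\theta_i}{\sqrt{2\pi}} , \quad c_i:=\frac{2\hat{r}\theta_i}{\sqrt{2\pi}} , \quad d_i:=-\frac{2\theta_i}{\sqrt{2\pi}} .$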

Note that the coefficients ai, bi, ci and di depend on $\hat{r}$, μi and θi, so that the formula for $\hat{\delta}$ depends on $\hat{r}$, μi, θi and σi.
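The two updates (9) and (10) can be sketched in Python as follows. This is only an illustration under stated assumptions: the function names are hypothetical, the per-regime lists `O`, `L1`, `L6`, `L7` and `L8` stand in for the filtered quantities (with f1 read as Yt, as implied by the expansion above), and the code simply evaluates each formula for a given value of the other parameter. Since (9) involves $\hat{\delta}$ and (10) involves $\hat{r}$, how the two are combined within an iteration is left open here.

```python
import math

C = 1.0 / math.sqrt(2.0 * math.pi)


def r_update(delta, mu, theta, sigma, O, L1, L6):
    """Equation (9): value of r for a given delta (lists indexed by regime)."""
    num = den = 0.0
    for m, th, s, o, l1, l6 in zip(mu, theta, sigma, O, L1, L6):
        weight = 1.0 / (2.0 * s ** 2)
        num += weight * ((th ** 2 + 2.0 * m * th) * C / delta * o
                         + th ** 2 / (math.pi * delta ** 2) * l6
                         - 2.0 * th * C / delta * l1)
        den += weight * th ** 2 / (math.pi * delta ** 2) * o
    return num / den


def delta_update(r, mu, theta, sigma, O, L1, L6, L7, L8):
    """Equation (10): value of delta for a given r."""
    num = den = 0.0
    for m, th, s, o, l1, l6, l7, l8 in zip(mu, theta, sigma, O, L1, L6, L7, L8):
        a = -(th ** 2 + 2.0 * m * th) * r * C
        b = (th ** 2 + 2.0 * m * th) * C
        c = 2.0 * r * th * C
        d = -2.0 * th * C
        num += th ** 2 / (2.0 * math.pi * s ** 2) * (-l7 + 2.0 * r * l6 - r ** 2 * o)
        den += (a * o + b * l6 + c * l1 + d * l8) / (2.0 * s ** 2)
    return num / den


# Single-observation sanity check with hypothetical numbers (N = 1):
# Y_t = 1.4, Y_{t-d} = 1.0, mu = 0.2, theta = 2.0, sigma = 1.5.
mu, theta, sigma = [0.2], [2.0], [1.5]
O, L1, L6, L7, L8 = [1.0], [1.4], [1.0], [1.0], [1.4]
r_hat = r_update(2.0, mu, theta, sigma, O, L1, L6)
delta_hat = delta_update(0.3, mu, theta, sigma, O, L1, L6, L7, L8)
print(r_hat, delta_hat)
```

In this one-observation case each update places the linearized transition value 1/2 + (Yt−d − r)/(δ√(2π)) at (Yt − μ)/θ, the point at which the fitted mean matches the observation, which gives a quick check on the signs and constants in (9) and (10).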

4 Simulation study

We illustrate the implementation of the proposed filtering and estimation algorithm for the hidden Markov regime-switching smooth transition model using simulated data. The accuracy of the filters of the hidden Markov chain and of the filter-based estimates of the unknown parameters, as well as the convergence of the parameter estimates from the filter-based EM algorithm, are also studied using the simulated data. Here we consider a simple two-state, two-regime HMRS-STAR model, namely an HMRS-STAR(2, 2) model, with lag d = 1 and sample size n = 1000. That is, the following model is considered.

$Y_t=\begin{cases}-1-1\times\Phi\left(\frac{Y_{t-1}-0}{10}\right)+2\times\epsilon_t , & \text{for } X_t=e_1 \\ 1+1\times\Phi\left(\frac{Y_{t-1}-0}{10}\right)+1\times\epsilon_t , & \text{for } X_t=e_2\end{cases}$

Here μ1 = −1, μ2 = 1, θ1 = −1, θ2 = 1, σ1 = 2, σ2 = 1, δ = 10 and r = 0, and ϵt, for t = 1, …, n, are independent standard normal random variables. We consider the following “hypothetical” transition probability matrix $\left(\begin{array}{cc}0.6& 0.3\\ 0.4& 0.7\end{array}\right)$ for the hidden Markov chain. The plot of a simulated series is depicted in Figure 1.
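A series of this kind can be generated with a few lines of Python. The sketch below is not the authors' code. It assumes that each column of the transition matrix gives the probabilities of leaving the corresponding state (the columns sum to one), that the initial state is drawn uniformly, and that the recursion starts from Y0 = 0; none of these details is stated in the text, and the function names are hypothetical.

```python
import math
import random


def std_normal_cdf(x):
    """Standard normal distribution function Phi(x)."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))


def simulate_hmrs_star(n=1000, seed=0):
    """Simulate (Y_t, X_t), t = 1..n, from the two-state model above."""
    rng = random.Random(seed)
    mu, theta, sigma = (-1.0, 1.0), (-1.0, 1.0), (2.0, 1.0)
    r, delta = 0.0, 10.0
    # Assumed reading of the matrix: column j holds the probabilities of
    # moving from state j to states 1 and 2.
    prob_to_state1 = (0.6, 0.3)
    state = rng.randrange(2)  # assumed: initial state drawn uniformly
    y_prev = 0.0              # assumed: starting value Y_0 = 0
    ys, states = [], []
    for _ in range(n):
        state = 0 if rng.random() < prob_to_state1[state] else 1
        y = (mu[state]
             + theta[state] * std_normal_cdf((y_prev - r) / delta)
             + sigma[state] * rng.gauss(0.0, 1.0))
        ys.append(y)
        states.append(state)
        y_prev = y
    return ys, states


ys, states = simulate_hmrs_star(n=1000, seed=0)
# Sample size and the fraction of time spent in the first state.
print(len(ys), sum(s == 0 for s in states) / len(states))
```

Under the assumed reading of the matrix, the chain spends about 3/7 of the time in the first state in the long run, which gives a simple check on the simulated path.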

Figure 1:

Plots of simulated data.

The initial settings are as follows: $\mathbf{q}_0=\mathbf{p}_0=(0.5,0.5)'$ and $\mathbf{\Pi}=\left(\begin{array}{cc}0.5& 0.5\\ 0.5& 0.5\end{array}\right)$; σ1 is the standard deviation of those Y's which are less than the sample average; σ2 is the standard deviation of those Y's which are greater than the sample average; μ1 = θ1 is the average of those Y's which are less than the sample average; μ2 = θ2 is the average of those Y's which are greater than the sample average; and δ = r = 1. The algorithm is terminated when the differences between the current and the previous parameter estimates are all less than 0.005. In this setup, we obtain our final estimates at iteration 87; see Table 1.
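The initialization and the stopping rule described above can be sketched as follows. The helper names and the dictionary layout are hypothetical, the sample standard deviation (divisor n − 1) is an assumption since the text does not say which version is used, and the stopping rule is read as a bound on the absolute change of every parameter.

```python
from statistics import mean, stdev


def initial_values(y):
    """Data-driven starting values for the two-state model."""
    y_bar = mean(y)
    low = [v for v in y if v < y_bar]
    high = [v for v in y if v > y_bar]
    return {
        "mu": [mean(low), mean(high)],
        "theta": [mean(low), mean(high)],    # mu_i = theta_i initially
        "sigma": [stdev(low), stdev(high)],  # assumed: sample std. deviation
        "r": 1.0,
        "delta": 1.0,
        "Pi": [[0.5, 0.5], [0.5, 0.5]],
        "p0": [0.5, 0.5],
    }


def flatten(value):
    """Flatten nested dicts and lists of numbers into one list of floats."""
    if isinstance(value, dict):
        return [x for key in sorted(value) for x in flatten(value[key])]
    if isinstance(value, (list, tuple)):
        return [x for item in value for x in flatten(item)]
    return [float(value)]


def converged(old, new, tol=0.005):
    """True when every parameter changed by less than tol in absolute value."""
    return all(abs(a - b) < tol for a, b in zip(flatten(old), flatten(new)))


# Hypothetical data, used only to show the shape of the output.
init = initial_values([-2.0, -1.0, 1.0, 4.0])
print(init["mu"], init["sigma"])
```

An EM loop would then call `converged` on the parameter sets from two successive iterations and stop at the first iteration for which it returns `True`.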

Table 1:

Parameter estimates from the simulation.

We also provide figures to illustrate the convergence of the parameter estimates. From the estimation results displayed in Table 1, we see that the parameter estimates from the filter-based EM algorithm are reasonably accurate for most of the parameters. However, compared with the other parameter estimates, the estimate $\hat{\delta}$ of the scale parameter δ is less accurate. Note that a sample size of n = 1000 is required to obtain these parameter estimates. One possible explanation for the less accurate estimate of δ is that it is based on an approximate formula derived by taking only the first term in the Laplace series expansion. The accuracy of the estimate of the scale parameter may be improved if higher-order terms in the Laplace series expansion are included. Of course, there is a trade-off between the accuracy of the estimation and the tractability of the computation of the parameter estimates. Furthermore, at an intuitive level, it seems difficult in general to estimate a scale parameter accurately, and hence a large sample size may be required to achieve an accurate estimate. In practice, most financial series are long and easily exceed 1000 observations.¹

Figure 2 plots the values of 𝚷 against iteration. Figure 3 plots the values of μ1, μ2 against iteration. Figure 4 plots the values of θ1, θ2 against iteration. Figure 5 plots the values of σ1, σ2 against iteration. Figure 6 plots the values of r and δ against iteration. Figure 7 presents the filtered estimate of the path of the hidden Markov chain (red line: p1 of p = q/⟨q, 1⟩; blue line: p2 of p = q/⟨q, 1⟩) and the simulated path of the hidden Markov chain based on the hypothetical parameters (black line). From Figures 3–6, we see that the parameter estimates converge at reasonable rates, with most of them converging within 87 iterations. Furthermore, from Figure 7, the filtered estimate of the path of the hidden Markov chain matches the simulated path reasonably well.

Figure 2:

Convergence of 𝚷.

Figure 3:

Convergence of μ1 & μ2.

Figure 4:

Convergence of θ1 & θ2.

Figure 5:

Convergence of σ1 & σ2.

Figure 6:

Convergence of r & δ.

Figure 7:

Estimate of p = q/⟨q, 1⟩.

5 Real data illustration

We apply our HMRS-STAR model and estimation procedure to two indexes, the Hang Seng Index (HSI) and the NASDAQ Composite (IXIC), from 3 January 2012 to 30 December 2016. There are a total of 1246 and 1258 days for the Hang Seng Index and the NASDAQ Composite, respectively. The data are obtained from Yahoo Finance (finance.yahoo.com). Let Pt be the index level on day t; we consider the log return, scaled by 100, defined by Yt = 100 log(Pt/Pt−1). Figure 8 shows the return data.
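For concreteness, the return transformation can be written as a short Python function. The function name and the index levels below are hypothetical, and "log" is taken to mean the natural logarithm, which is an assumption.

```python
import math


def log_returns_pct(prices):
    """Return Y_t = 100 * log(P_t / P_{t-1}) for consecutive index levels."""
    return [100.0 * math.log(p1 / p0) for p0, p1 in zip(prices, prices[1:])]


# Hypothetical index levels, used only to show the shape of the output.
levels = [100.0, 101.0, 99.5, 99.5]
returns = log_returns_pct(levels)
print(returns)
```

A series of n index levels yields n − 1 returns, since the first observation has no predecessor.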

Figure 8:

Plots of Yt = 100 × log-return (left: HSI, right: IXIC).

The returns are fitted to the two-state, two-regime HMRS-STAR model, namely an HMRS-STAR(2, 2) model, with lag d = 1. The initial values are identical to those used in the simulation study, that is, q0 = p0 = (0.5, 0.5)′ and $\mathbf{\Pi }=\left(\begin{array}{cc}0.5& 0.5\\ 0.5& 0.5\end{array}\right)$; σ1 is the standard deviation of those Y's which are less than the sample average; σ2 is the standard deviation of those Y's which are greater than the sample average; μ1 = θ1 is the average of those Y's which are less than the sample average; μ2 = θ2 is the average of those Y's which are greater than the sample average; and δ = r = 1. The algorithm is terminated when the differences between the current and the previous parameter estimates are all less than 0.005. We obtain our final estimates at iterations 394 and 60 for the Hang Seng Index and the NASDAQ Composite, respectively. The estimation algorithm follows the general algorithm (i.e. Steps I–IV) presented in Section 3.2. The estimated values for the log returns are given in Table 2.

Again, we provide figures to illustrate the convergence of the parameter estimates; all of them are based on the estimation for 100 × log-return. Figure 9 plots the values of 𝚷 against iteration. Figure 10 plots the values of μ1, μ2 against iteration. Figure 11 plots the values of θ1, θ2 against iteration. Figure 12 plots the values of σ1, σ2 against iteration. Figure 13 plots the values of r and δ against iteration. Figure 14 presents the filtered estimate of the paths of the hidden Markov chain (red line: p1 of p = q/⟨q, 1⟩; blue line: p2 of p = q/⟨q, 1⟩). From Figures 9–13, we see that the parameter estimates converge at reasonable rates, with most of them converging within 394 and 60 iterations, respectively. The convergence of the parameter estimates is rather stable, except that the two parameters r and δ for the HSI data are markedly unstable around iterations 100–400, as shown in Figure 13. Also, for the two real data sets, it seems that the parameter estimates can disentangle the two regimes for the μ's and σ's, but not for the θ's in the case of the HSI data.

Table 2:

Parameter estimates and summary statistics from the real data.

Figure 9:

Convergence of 𝚷 (left: HSI, right: IXIC).

Figure 10:

Convergence of μ1 & μ2 (left: HSI, right: IXIC).

Figure 11:

Convergence of θ1 & θ2 (left: HSI, right: IXIC).

Figure 12:

Convergence of σ1 & σ2 (left: HSI, right: IXIC).

Figure 13:

Convergence of r & δ (left: HSI, right: IXIC).

Figure 14:

Estimate of p = q/⟨q, 1⟩ (left: HSI, right: IXIC).

6 Discussions on other potential economic applications

The proposed class of HMRS-STAR models may be applied to study some important problems in econometrics and economics as well as real-world problems which may be of socio-economic importance. Here we provide some suggestions for potential applications of the proposed class of models. There may be interesting applications beyond those mentioned below.

As noted in, for example, Hansen (2011), one of the important problems in econometrics is testing for linearity in economic time series against a certain nonlinear alternative which may be described by a parametric nonlinear time series model. This problem has significant economic implications since many economic models were developed under the premise of linearity. However, real-world data may reveal otherwise. A challenging issue is what nonlinear economic models may be used if a linear economic model is not appropriate.² Indeed, different conclusions may be drawn from testing for linearity if different parametric nonlinear time series models are used as the nonlinear alternative. The testing of a linear model against a threshold autoregressive model has been considered by statisticians and econometricians; see, for example, Chan (1990), Chan and Tong (1990), and Hansen (1996). See also Hansen (2011) and Tong (2011) and the relevant references therein. The testing of a linear model against a Markovian regime-switching autoregressive model has been considered in the literature. See, for example, Hamilton (2016) and the relevant references therein. The testing of a linear model against a smooth transition autoregressive model has also been considered. See, for example, van Dijk, Teräsvirta, and Franses (2002) and the related literature therein. However, it seems that the testing of linearity against a composite alternative model, such as a second-generation nonlinear time series model, has received relatively less attention in the literature. Intuitively, it appears that considering a composite alternative model in testing for linearity may provide a more general and flexible way to detect nonlinearity. The proposed class of HMRS-STAR models may be used to form a composite alternative model in testing for linearity in economic time series.

Besides testing for linearity in economic time series, another potential application of the proposed class of HMRS-STAR models is testing for a unit root, which is an important topic in econometrics. As noted in Hansen (2011), testing for a unit root has been investigated using a nonlinear stationary threshold autoregressive model. It may be interesting to investigate the use of the proposed class of HMRS-STAR models in testing for a unit root. Time delay is also an important feature of economic time series. One potential application of the proposed class of HMRS-STAR models is the modeling of time delay in economic time series. A flexibility provided by the HMRS-STAR models is that abrupt regime switches and time delay can be disentangled. Indeed, as inherited from the STAR models, time delay is related to smooth regime switches.

Some major classes of parametric nonlinear time series models such as the threshold autoregressive models have been adopted to investigate the law of one price and transactions costs. See, for example, Sarno, Taylor, and Chowdhury (2004) and Taylor (2001) for the use of threshold autoregressive models to investigate the purchasing power parity puzzle, the law of one price and transaction costs. See also Hansen (2011) for some related discussions. It was noted in Taylor (2001) that a linear time series model such as an autoregressive model cannot provide a realistic description for nonlinear adjustments of prices attributed to the presence of transaction costs. In Taylor (2001), a two-regime threshold autoregressive model was employed to describe such nonlinearity, where the two threshold parameters are used to describe a “band of inaction” which may cause the nonlinearity. Instead of using the threshold autoregressive models, one may explore the use of the proposed class of HMRS-STAR models to study the law of one price and transaction costs. For example, one may investigate the nonlinearity due to transaction costs using both the abrupt and smooth regime switches.

7 Conclusion

We introduced a hidden Markov regime-switching smooth transition model and discussed its filtering and estimation issues. A reference probability approach and a version of Bayes' rule were used to derive filters for the hidden Markov chain and some related quantities. These related quantities were used to derive filter-based estimates for the unknown parameters using the EM algorithm. For the threshold parameter and the scale parameter we employed a Laplace series expansion for the cumulative normal probability distribution to derive approximations to their filter-based estimates. Simulation experiments were presented to illustrate the practical implementation of the HMRS-STAR model as well as the filtering and estimation algorithm. The simulation results reveal that the parameter estimates converge at reasonable rates, say within 87 iterations for most of the parameters, and that the parameter estimates are reasonably accurate for most of the parameters. Furthermore, real financial data were used to illustrate the practical implementation of the model. However, we encountered a challenging issue: for some other datasets, our estimation algorithm does not converge. There is therefore still room to improve the estimation method, in particular its convergence, stability and robustness. These may represent interesting topics for further research. Another potential application of the proposed model is fitting economic series such as unemployment data, which is an important area in economics and econometrics. For this application, one may refer to, for example, Koop and Potter (1999) and Hamilton (2005). It may also be interesting to explore the use of the proposed model to fit financial time series other than equity indices, such as foreign exchange rate series. One may also explore some applications of the proposed model to study time series data in other fields such as climate science, ecology, population dynamics, and the biological, engineering and physical sciences.³

Acknowledgement

The authors would like to thank the Associate Editor and referees for their valuable and helpful comments.

References

• Baum, L. E., T. Petrie, G. Soules, and N. Weiss. 1970. “A Maximization Technique Occurring in the Statistical Analysis of Probabilistic Functions of Markov Chains.” The Annals of Mathematical Statistics 41: 164–171.

• Chan, K. S., and H. Tong. 1986. “On Estimating Thresholds in Autoregressive Models.” Journal of Time Series Analysis 7: 178–190.

• Chan, K. S. 1990. “Testing for Threshold Autoregression.” The Annals of Statistics 18: 1886–1894.

• Chan, K. S., and H. Tong. 1990. “On Likelihood Ratio Tests for Threshold Autoregression.” Journal of the Royal Statistical Society. Series B (Methodological) 52: 469–476.

• Dembo, A., and O. Zeitouni. 1986. “Parameter Estimation of Partially Observed Continuous Time Stochastic Processes.” Stochastic Processes and Their Applications 23: 91–113.

• Engle, R. 1982. “Autoregressive Conditional Heteroscedasticity with Estimates of the Variance of United Kingdom Inflation.” Econometrica 50: 987–1007.

• Elliott, R. J., L. Aggoun, and J. Moore. 1995. Hidden Markov Models: Estimation and Control. New York: Springer.

• Elliott, R. J., C. C. Liew, and T. K. Siu. 2011. “On Filtering and Estimation of a Threshold Stochastic Volatility Model.” Applied Mathematics and Computation 218: 61–75.

• Elliott, R. J., T. K. Siu, and J. W. Lau. 2013. “Filtering a Double Threshold Model with Regime Switching.” IEEE Transactions on Automatic Control 58: 3185–3190.

• Fan, J., and Q. Yao. 2003. Nonlinear Time Series: Nonparametric and Parametric Methods. New York: Springer.

• Gao, J. 2007. Nonlinear Time Series: Semiparametric and Nonparametric Methods. London: Chapman and Hall/CRC.

• Goldfeld, S. M., and R. E. Quandt. 1973. “A Markov Model for Switching Regressions.” Journal of Econometrics 1: 3–15.

• Granger, C. W. J., and A. P. Andersen. 1978. An Introduction to Bilinear Time Series Models. Göttingen: Vandenhoeck and Ruprecht.

• Granger, C. W. J., and T. Teräsvirta. 1993. Modelling Non-linear Econometric Relationships. Oxford: Oxford University Press.

• Hamilton, J. D. 1989. “A New Approach to Economic Analysis of Nonstationary Time Series and the Business Cycle.” Econometrica 57: 357–384.

• Hamilton, J. D. 2005. “What’s Real About the Business Cycle?” Federal Reserve Bank of St. Louis Review 87: 435–452.

• Hamilton, J. D. 2016. “Macroeconomic Regimes and Regime Shifts.” In Handbook of Macroeconomics, edited by H. Uhlig and J. Taylor, Volume 2A, 163–201. Amsterdam: Elsevier.

• Hansen, B. E. 1996. “Inference When a Nuisance Parameter is Not Identified Under the Null Hypothesis.” Econometrica 64: 413–430.

• Hansen, B. E. 2011. “Threshold Autoregression in Economics.” Statistics and Its Interface 4: 123–127.

• Koop, G., and S. M. Potter. 1999. “Dynamic Asymmetries in U.S. Unemployment.” Journal of Business & Economic Statistics 17: 298–312.

• Potter, S. M. 1999. “Nonlinear Time Series Modelling: An Introduction.” Journal of Economic Surveys 13: 505–528.

• Priestley, M. B. 1980. “State-Dependent Models: A General Approach to Non-linear Time Series Analysis.” Journal of Time Series Analysis 1: 47–71.

• Quandt, R. E. 1958. “The Estimation of the Parameters of a Linear Regression System Obeying Two Separate Regimes.” Journal of the American Statistical Association 53: 873–880.

• Sarno, L., M. P. Taylor, and I. Chowdhury. 2004. “Nonlinear Dynamics in Deviations from the Law of One Price: A Broad-Based Empirical Study.” Journal of International Money and Finance 23: 1–25.

• Subba Rao, T., and M. M. Gabr. 1984. An Introduction to Bispectral Analysis and Bilinear Time Series Models. New York: Springer.

• Taylor, S. J. 1982. “Financial Returns Modelled by the Product of Two Stochastic Processes, a Study of Daily Sugar Prices, 1961–79.” In Time Series Analysis: Theory and Practice 1, edited by O. D. Anderson, 203–226. Amsterdam: North Holland.

• Taylor, S. J. 1986. Modeling Financial Time Series. Chichester: Wiley.

• Taylor, A. M. 2001. “Potential Pitfalls for the Purchasing-Power-Parity Puzzle? Sampling and Specification Biases in Mean-Reversion Tests of the Law of One Price.” Econometrica 69: 473–498.

• Taylor, S. J. 2005. Asset Price Dynamics, Volatility, and Prediction. Princeton: Princeton University Press.

• Teräsvirta, T. 1994. “Specification, Estimation, and Evaluation of Smooth Transition Autoregressive Models.” Journal of the American Statistical Association 89: 208–218.

• Teräsvirta, T. 1998. “Modelling Economic Relationships with Smooth Transition Regressions.” In Handbook of Applied Economic Statistics, edited by D. E. A. Giles and A. Ullah, 507–552. New York: Marcel Dekker.

• Tong, H. 1977. “Contribution to the Discussion of the Paper Entitled ‘Stochastic Modelling of Riverflow Time Series’ by A. J. Lawrance and N. T. Kottegoda.” Journal of the Royal Statistical Society: Series A (Statistics in Society) 140: 34–35.

• Tong, H. 1978. “On a Threshold Model.” In Pattern Recognition and Signal Processing, NATO ASI Series E: Applied Sc. No. 29, edited by C. H. Chen, 575–586. The Netherlands: Sijthoff & Noordhoff.

• Tong, H. 1983. Threshold Models in Non-Linear Time Series Analysis. New York: Springer.

• Tong, H. 1990. Nonlinear Time Series Analysis: A Dynamical System Approach. Oxford: Oxford University Press.

• Tong, H. 2011. “Threshold Models in Time Series Analysis – 30 years on.” Statistics and Its Interface 4: 107–118.

• Tong, H., and K. S. Lim. 1980. “Threshold Autoregression, Limit Cycles and Cyclical Data (with Discussion).” Journal of the Royal Statistical Society. Series B (Methodological) 42: 245–292.

• van Dijk, D., T. Teräsvirta, and P. H. Franses. 2002. “Smooth Transition Autoregressive Models – A Survey of Recent Developments.” Econometric Reviews 21: 1–47.

• Yin, G., and C. Zhu. 2010. Hybrid Switching Diffusions: Properties and Applications. New York: Springer.

• Zhu, D. M., W. K. Ching, R. J. Elliott, T. K. Siu, and L. Zhang. 2017. “Hidden Markov Models with Threshold Effects and Their Applications to Oil Price Forecasting.” Journal of Industrial and Management Optimization 13: 757–773.

Code and Datasets

The code and data associated with this article have been published by the author(s) on Code Ocean, a computational reproducibility platform. We recommend Code Ocean to SNDE contributors who wish to share, discover, and run code in published research articles. (See: https://doi.org/10.24433/CO.dd77cb0f-e54e-4693-905b-493a86cfd345).

Footnotes

• 1

We would like to thank one of the referees for stimulating this discussion.

• 2

As noted in Tong (1990), Chapter 2, when one leaves a linear world, there are infinitely many parametric nonlinear alternatives. A challenging issue is which nonlinear alternative may be used.

• 3

We would like to thank one of the referees for stimulating our discussions on some potential applications of the proposed model.

Published Online: 2018-06-29

Citation Information: Studies in Nonlinear Dynamics & Econometrics, Volume 22, Issue 4, 20160061, ISSN (Online) 1558-3708,

©2018 Walter de Gruyter GmbH, Berlin/Boston.