## 1 Introduction

The objective of this paper is to develop a new class of parametric nonlinear time series models which is general and flexible enough to incorporate two types of switching behavior, namely smooth state transitions and abrupt changes in hidden states. We call our class of nonlinear time series models hidden Markov regime-switching smooth transition (HMRS-STAR) models. It is developed by combining two important classes of parametric nonlinear time series models, namely smooth transition (STAR) models and hidden Markov regime-switching (HMRS) models. The reason behind combining the two models to develop the class of HMRS-STAR models is similar to that of combining first-generation models to produce second-generation models, third-generation models, and so on, as proposed in Tong (1990), (see Chapter 3, Section 13.3, therein). The class of HMRS-STAR models proposed here is general and may have potential applications in diverse fields such as economics, finance, actuarial science, biology, engineering and physical sciences, ecology and population dynamics, where nonlinear time series modelling or nonlinear dynamical systems may be highly relevant. Concrete applications of the HMRS-STAR models to financial modeling are presented below. Particularly, the class of HMRS-STAR models can describe the impacts of both abrupt regime switches and smooth regime switches on asset price dynamics or returns. Abrupt regime switches are those sudden changes in asset price dynamics or returns which may be attributed to major economic or market events such as financial crises. Some observed market news or events may be digested more slowly by market participants and their impacts may be more slowly reflected in asset prices or returns. These changes in asset prices or returns are gradual rather than abrupt and are referred to as smooth regime switches. 
Smooth state transitions in asset prices or returns may also have some implications on market efficiency and may be related to a price trend model in the financial econometrics literature.

Standard filtering theory based on the reference probability approach in the literature and its corresponding filter-based expectation-maximization (EM) algorithm [see, for example, Elliott, Aggoun, and Moore (1995)] are applied to estimate the hidden states and unknown parameters in the proposed model. To handle the nonlinear relationships that arise in the filter-based estimation from the nonlinearity of the smoothing function, we employ a Laplace series expansion, which simplifies the estimation procedure and yields analytical formulas for some parameter estimates. Simulated data are used to illustrate the practical implementation of the estimation method and its accuracy. Real data examples based on two financial data sets, namely the Hang Seng Index and the NASDAQ Composite index, are provided to illustrate potential applications of the proposed class of models to real financial data. Other potential applications of the proposed class of models are briefly discussed. In the course of our real data analysis, we have found that for some financial datasets, the estimation algorithm based on filtering theory does not converge. This indicates that there is still room for improving the filter-based estimation method, which may pose a challenging problem for further research. Given the important role played by hidden Markov models in signal processing, it is hoped that the proposed class of HMRS-STAR models may provide some insights into how the link between two important fields, namely nonlinear time series analysis and signal processing, may be consolidated. In what follows, some of the relevant literature and background of the proposed class of models are discussed.

Nonlinear time series analysis is an important topic in econometrics, statistics and dynamical systems theory. Some early developments of nonlinear time series models may be traced back to the late 1970s where limitations of linear time series modelling to fit real datasets in diverse fields were realised. See the monographs by Tong (1983, 1990). Some important classes of parametric nonlinear time series models proposed in the early stage of developments of the discipline are, for example, threshold autoregressive (TAR) models by Tong (1977, 1978) and Tong and Lim (1980), bilinear models by Granger and Anderson (1978) [see also Subba Rao and Gabr (1984)], autoregressive conditional heteroscedastic (ARCH) models by Engle (1982), stochastic volatility models of Taylor (1982) and state-dependent models by Priestley (1980). The monograph by Tong (1990) provides an excellent and authoritative account of parametric nonlinear time series models developed before the 1990s. Nonparametric and semi-parametric approaches to nonlinear time series analysis have also been studied in the literature. See the monographs by Fan and Yao (2003) and Gao (2007).

Regime-switching models represent an important class of parametric nonlinear time series models which have wide ranging applications in diverse fields such as engineering, economics, finance and actuarial science, amongst others. The basic principle of regime switching was conceived some time ago in economics, particularly in econometrics, and engineering, particularly in signal processing and control engineering. For discussions on applications of regime switching models and related models to engineering, one may refer to the monographs by Elliott, Aggoun, and Moore (1995) and Yin and Zhu (2010). Early works in econometrics where the idea of regime switching has been adopted are, for example, Quandt (1958) and Goldfeld and Quandt (1973), where regression models with switching parameters were introduced to discuss nonlinearity in economic data. The principle of regime switching also appeared in pioneering works on parametric nonlinear time series analysis, [see, for example, Tong and Lim (1980) and Tong (1983)]. Since the seminal article by Hamilton (1989) on Markov regime-switching autoregressive models for econometrics, regime switching models have become popular in economics, econometrics and finance. The basic idea of regime switching models is that the model parameters can change over time according to an underlying state process which could be a finite-state hidden Markov chain.

Chan and Tong (1986) and Teräsvirta (1994) introduced a class of time series models, namely, smooth transition autoregressive models. This class of time series models allows “smooth”, or gradual, transitions between regimes and may be thought of as a type of regime switching model in a wide sense; see, for example, Tong (1990) for further discussions. The key feature of this class of models is that “smooth” regime switching is described by introducing a “smooth” transition function, which is a non-decreasing function. Some typical examples of this “smooth” transition function are the cumulative normal distribution function and the logistic function. Like regime-switching autoregressive models, smooth transition autoregressive models can capture cyclical behavior in time series. For a survey on the smooth transition autoregressive models and their applications, interested readers may refer to Granger and Teräsvirta (1993), Potter (1999), and Teräsvirta (1998).

In Elliott, Siu, and Lau (2013), a double threshold model was considered, which features two types of regime switching, namely the regime switching governed by the threshold principle of Tong (1983, 1990) and the regime switching described by transitions of a hidden Markov chain. In Elliott, Liew, and Siu (2011), the filtering of a threshold stochastic volatility model was considered using the reference probability approach for hidden Markov models in, for example, Elliott, Aggoun, and Moore (1995). In a recent paper by Zhu et al. (2017), a hidden Markov model with threshold effects was considered and applied to oil price forecasting, where the threshold regime switching effect was present in the transition probabilities of the hidden Markov chain. Note that smooth transition models may be more general than threshold regime-switching models and may be used to test whether regime switching is attributed to smooth transitions or threshold regime switches [see, for example, Tong (1990), for related discussions]. In this sense, the HMRS-STAR model may be more general than the double threshold model in Elliott, Siu, and Lau (2013), and the former may also be used to generalize threshold-type models in, for example, Elliott, Liew, and Siu (2011) and Zhu et al. (2017).

This paper is organized as follows. The next section gives an introduction to the HMRS-STAR model. Filters for the hidden Markov chain and filter-based estimates of the unknown parameters based on the EM algorithm are presented in Section 3. Simulation experiments and results are discussed in Section 4. Real data examples are presented in Section 5. Section 6 provides discussions on some potential economic applications of the proposed HMRS-STAR model. The final section gives some concluding remarks.

## 2 The HMRS-STAR model

We consider a complete probability space (Ω, *ℱ*, *P*) on which all random variables are defined, where *P* is a real-world probability. Let {**X**_{t} | *t* ∈ *𝒯*} be a discrete-time, *N*-state, hidden Markov chain defined on (Ω, *ℱ*, *P*) with state space being the set of unit vectors {**e**_{1}, **e**_{2}, …, **e**_{N}}. Here *𝒯* := {0, 1, …} and the *j*^{th} component of **e**_{i} is the Kronecker delta function *δ*_{ij} for each *i*, *j* = 1, 2, …, *N*. Indeed the state space we considered here is called the canonical state space of the hidden Markov chain. The state space was adopted in, for example, Elliott, Aggoun, and Moore (1995).

Suppose that the chain **X** is time-homogeneous so that its probability law is completely determined by its transition probabilities and initial distribution. For each *i*, *j* = 1, 2, …, *N*, let

$$\pi_{ji} := P(\mathbf{X}_{t+1} = \mathbf{e}_j \mid \mathbf{X}_t = \mathbf{e}_i),$$

so [*π*_{ji}]_{i, j = 1, 2, …, N} is the transition probability matrix of the chain **X** under *P* and it is denoted by *𝚷*.
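As an illustration of the canonical state space, the following minimal sketch simulates a two-state chain whose states are unit vectors. It assumes the convention that *π*_{ji} is the probability of moving from state **e**_{i} to state **e**_{j}, so each column of the matrix sums to one; the numerical values are the true transition probabilities reported in Table 1. The function names, the seed and the path length are hypothetical choices made for this example only.

```python
import random


def unit_vector(i, n):
    """Return the canonical unit vector with a 1 in position i (0-based)."""
    return [1 if k == i else 0 for k in range(n)]


def simulate_chain(Pi, start, steps, rng):
    """Simulate X_0, ..., X_steps as a list of unit vectors.

    Pi[j][i] is the probability of moving from state i to state j, so each
    column of Pi sums to one (assumed convention for pi_{ji}).
    """
    n = len(Pi)
    state = start
    path = [unit_vector(state, n)]
    for _ in range(steps):
        column = [Pi[j][state] for j in range(n)]
        state = rng.choices(range(n), weights=column)[0]
        path.append(unit_vector(state, n))
    return path


# True transition probabilities reported in Table 1 of the simulation study.
Pi = [[0.6, 0.3],
      [0.4, 0.7]]
rng = random.Random(0)  # fixed seed, chosen only for reproducibility
path = simulate_chain(Pi, start=0, steps=1000, rng=rng)
share_state_1 = sum(x[0] for x in path) / len(path)
print(f"fraction of time in state e_1: {share_state_1:.3f}")
```

Representing each state as a unit vector means that inner products such as ⟨**X**_{t}, **e**_{i}⟩ reduce to reading off one component, which is what makes the counting processes used later easy to express.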

Consider the *P*-completion of the natural filtration

the minimal *σ*-algebra generated by information about the values of the chain **X** up to and including time *t* and the collection *𝒩* of *P*-null sets of *ℱ*. Here for *σ*-algebras *𝒜* and *ℬ*, we denote by *𝒜* ∨ *ℬ* the minimal *σ*-algebra generated by *𝒜* and *ℬ*.

With the canonical state space of the chain, Elliott, Aggoun, and Moore (1995) gave the following semimartingale dynamics for the chain **X** under *P*:

where ^{N}-valued, (*𝔽*^{X}, *P*)-martingale difference process. The above semimartingale dynamics will be used when deriving filters for the hidden Markov chain and filter-based estimates of the unknown parameters in the HMRS-STAR model.

We now describe the HMRS-STAR model. For illustration, we consider a simple case where there are two regimes in the smooth transition part of the model. Let

Here *σ*_{i} > 0 for each *i* = 1, 2, … , *N*; ^{N}.

Let *F* : (−∞, ∞) → [0, 1] be a continuous and non-decreasing function. This function plays the role of smoothing transitions in regimes. Suppose *r* is the threshold parameter and *δ* is the scale parameter, where *r* ∈ ℜ and *δ* > 0. Let *d* be a positive integer representing the delay parameter. Consider a time series *Y*_{0} are given in advance. We suppose the remaining terms of the time series *N*) model defined by:

Here we suppose, for simplicity, that the noise process

When the scale parameter *δ* tends to zero, the smooth transition function approaches a step function at the threshold *r*, and the HMRS-STAR(2, *N*) model becomes a (2, *N*)-double threshold model as in Elliott, Siu, and Lau (2013). One may ask, given that the smooth transition component may be able to model abrupt or smooth transitions for different values of the parameter *δ*, why we need to incorporate abrupt changes using a hidden Markov chain. There are at least two reasons. Firstly, the abrupt changes in the smooth transition component and those incorporated by the hidden Markov chain are different in nature. The abrupt change in the smooth transition component is a self-exciting change in the sense that it depends on a past value of the time series itself, while the abrupt change incorporated by the hidden Markov chain is generated exogenously. This partly motivated the double threshold model in Elliott, Siu, and Lau (2013). Secondly, the HMRS-STAR model can incorporate both abrupt changes and smooth changes simultaneously. The HMRS-STAR model could be applied to financial modelling. For example, we can consider

In general, one could also consider a situation where both the threshold parameter *r* and the scale parameter *δ* are modulated by the hidden Markov chain **X**. However, to simplify our analysis, we assume they are constants. The following two examples give two special cases of the HMRS-STAR model.

**Example 1:** Suppose the smooth transition function *F*(*x*) is the standard Gaussian distribution Φ(*x*), (i.e., the probability distribution function of a zero-mean, unit-variance, normal distribution). Then the HMRS-STAR model becomes:

**Example 2:** Suppose the smooth transition function *F*(*x*) is the logistic distribution given by

The quantity 1/*δ* is called the speed of transition parameter. It could be challenging to estimate using MLE directly, and the EM algorithm circumvents this step.
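To make the role of the threshold *r* and the scale *δ* concrete, here is a small sketch of the two transition functions in Examples 1 and 2. It assumes that the transition function is evaluated at the standardized argument (*y* − *r*)/*δ*, which is consistent with 1/*δ* being the speed of transition but is not confirmed by a surviving equation; the function names are hypothetical.

```python
import math


def gaussian_transition(y, r, delta):
    """Standard normal CDF evaluated at (y - r) / delta (assumed argument)."""
    z = (y - r) / delta
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


def logistic_transition(y, r, delta):
    """Logistic CDF evaluated at (y - r) / delta, written to avoid overflow."""
    z = (y - r) / delta
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


# A smaller delta means a faster transition around the threshold r.
for delta in (10.0, 1.0, 0.01):
    below = gaussian_transition(-0.5, 0.0, delta)
    above = gaussian_transition(0.5, 0.0, delta)
    print(f"delta={delta}: F(-0.5)={below:.4f}, F(0.5)={above:.4f}")
```

For a large *δ* both values stay close to 0.5, so the two regimes are blended almost equally; for a very small *δ* the function is numerically indistinguishable from a step at *r*, which is the threshold-type limit.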

In the next section, we shall illustrate how to estimate the parameters *r* and *δ* using the Expectation Maximization (EM) algorithm and an approximation method for the case of the Gaussian smooth transition function. Indeed, this method also works for the case of the logistic smooth transition function when the following standard linear approximation method for the logistic function *F* is used:

## 3 Filtering and estimation

The basic idea of the reference probability approach is to start with a reference probability measure under which the observation process *Y* has simpler dynamics (i.e., it is independent of the hidden Markov chain). The reference probability is then related to the real-world probability *P* by a measure change. Under *P*, the observed process is governed by the original dynamics. However, filters for the hidden quantities are derived under the reference probability.

### 3.1 Filters for the hidden chain and related quantities

Let

- $\{Y_t \mid t \in \mathcal{T} \setminus \{0\}\}$ is a sequence of i.i.d. standard normal random variables;
- the chain **X** has a transition probability matrix *𝚷*.

Define

and

Write 𝕐 and 𝔾 for the *P*-completion of 𝕐^{0} and 𝔾^{0}, respectively. Given observed information *𝒴*_{t} up to time *t*, we wish to estimate **X**_{t} as:

where E is an expectation with respect to the measure *P*.

By a version of the Bayes’ rule [see, for example, Elliott, Aggoun, and Moore (1995)],

where

For each *i* = 1, 2, … , *N*, let

and write **diag**(*𝜻*(*t*, *Y*_{t})) for the diagonal matrix with diagonal elements being the components in the vector *𝜻*(*t*, *Y*_{t}). Then the unnormalized filter **q**_{t} satisfies the following recursive equation:

Consequently,

Here

The following quantities will be used to derive the filter-based estimates of some unknown parameters in the HMRS-STAR model.

- The number of transitions of the chain **X** from state **e**_{i} to state **e**_{j} up to time *t*, for each *i*, *j* = 1, 2, …, *N*, is:
  $$J_t^{ij} := \sum_{l=1}^{t} \langle \mathbf{X}_{l-1}, \mathbf{e}_i \rangle \langle \mathbf{X}_l, \mathbf{e}_j \rangle.$$
- The occupation time of the chain **X** in state **e**_{i} up to time *t*, for each *i* = 1, 2, …, *N*, is:
  $$O_t^i := \sum_{l=1}^{t} \langle \mathbf{X}_l, \mathbf{e}_i \rangle.$$
- The “generalized” level process associated with state **e**_{i} up to time *t*, for each *i* = 1, 2, …, *N*, is:
  $$L_{d+1,t}^{i} := \sum_{l=1}^{t} f_{d+1}(Y_l, Y_{l-1}, \dots, Y_{l-d}) \langle \mathbf{X}_l, \mathbf{e}_i \rangle.$$
  Here $f_{d+1}: \Re^{d+1} \to \Re$ is a Borel-measurable function.
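The first two quantities in the list above are simple counts when the path of the chain is known. The sketch below computes them for a short, made-up path in which each state is stored as an index rather than a unit vector; the function names and the example path are hypothetical. In the model itself the chain is hidden, so only the conditional expectations of these quantities given the observations are available.

```python
def jump_counts(path, n):
    """J[i][j] = number of l in 1..t with X_{l-1} in state i and X_l in state j."""
    J = [[0] * n for _ in range(n)]
    for prev, curr in zip(path, path[1:]):
        J[prev][curr] += 1
    return J


def occupation_times(path, n):
    """O[i] = number of l in 1..t with X_l in state i (time 0 is excluded)."""
    O = [0] * n
    for state in path[1:]:
        O[state] += 1
    return O


# State indices for X_0, ..., X_5 (0 stands for e_1, 1 stands for e_2).
path = [0, 0, 1, 1, 1, 0]
J = jump_counts(path, 2)
O = occupation_times(path, 2)
print("jump counts:", J)
print("occupation times:", O)
```

Both sums start at *l* = 1, so the initial state **X**_{0} contributes to the jump counts only as a starting point and does not enter the occupation times.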

As in Elliott, Aggoun, and Moore (1995) we define the following ‘unnormalized’ vector quantities:

where

Then the exact recursive equations for

and

for each *i* = 1, 2, … , *N*. See, for example, Elliott, Aggoun, and Moore (1995).
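The filtering idea of this subsection can be sketched for a simplified case. The code below assumes that, given state *i*, the observation is normal with a constant mean and standard deviation (the smooth transition part is ignored), that the chain is propagated through the transition matrix before the observation weights are applied, and that the weights are likelihood ratios against the standard normal reference density. These are assumptions made for illustration, not the recursion of the paper, and all names and data values are hypothetical.

```python
import math


def std_normal_pdf(x):
    return math.exp(-0.5 * x * x) / math.sqrt(2.0 * math.pi)


def zeta(y, mean, sigma):
    """Density of N(mean, sigma^2) at y divided by the N(0, 1) density at y."""
    return std_normal_pdf((y - mean) / sigma) / (sigma * std_normal_pdf(y))


def filter_path(ys, Pi, means, sigmas, p0):
    """Return normalized filtered probabilities, one vector per observation."""
    n = len(p0)
    p = list(p0)
    out = []
    for y in ys:
        # Propagate through the column-stochastic matrix Pi (assumed ordering).
        predicted = [sum(Pi[j][i] * p[i] for i in range(n)) for j in range(n)]
        # Reweight each state by its likelihood ratio.
        q = [zeta(y, means[j], sigmas[j]) * predicted[j] for j in range(n)]
        total = sum(q)  # plays the role of <q, 1>
        p = [v / total for v in q]  # renormalizing each step avoids underflow
        out.append(p)
    return out


Pi = [[0.6, 0.3],
      [0.4, 0.7]]
ys = [2.1, 1.9, -2.2, -1.8]  # made-up observations
probs = filter_path(ys, Pi, means=[-2.0, 2.0], sigmas=[1.0, 1.0], p0=[0.5, 0.5])
for t, p in enumerate(probs, start=1):
    print(t, [round(v, 4) for v in p])
```

Normalizing at every step leaves the filtered probabilities unchanged, because the recursion is linear in the unnormalized filter, and it keeps the numbers in a safe floating-point range for long series.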

Note that

### 3.2 Filter-based estimates and the EM algorithm

The filter-based estimates for the unknown parameters in the HMRS-STAR model are derived using the Expectation Maximization, or EM, algorithm. Here we briefly present the idea of the EM algorithm. For details, one may refer to, for example, Elliott, Aggoun, and Moore (1995). Again the mathematical techniques used here follow from those in Elliott, Aggoun, and Moore (1995).

Let *P*_{0}, where *θ* is an unknown model parameter and Θ is the parameter space. Write *𝒴* for a sub-*σ*-field of *ℱ*. Then the likelihood function for computing an estimate of the unknown parameter *θ* given the available information in *𝒴* is:

Here E_{0} is the expectation under *P*_{0}. Then the maximum likelihood estimate (MLE) *θ* is given by:

It may be challenging to compute the MLE

**Step I:** Set the counter *p* = 0 and choose _{0}.

**Step II:** (E-step) Set

where

**Step III:** (M-step) Find

**Step IV:** Replace *p* by *p* + 1 and repeat beginning with Step II until a certain stopping criterion is satisfied.

Note that the EM algorithm applied here may converge only to a local maximum of the log-likelihood. For a discussion of the EM algorithm and its convergence properties, one may refer to Baum et al. (1970), Dembo and Zeitouni (1986), and Elliott, Aggoun, and Moore (1995).
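The structure of Steps I to IV can be seen on a toy problem that is much simpler than the HMRS-STAR model. The sketch below runs EM for an equally weighted mixture of two unit-variance normal distributions with unknown means; the mixture, the simulated data, the tolerance and every name in the code are assumptions chosen only to show the loop, not the filter-based estimator of this paper.

```python
import math
import random


def em_two_means(data, m1, m2, tol=1e-6, max_iter=500):
    """EM for a 50/50 mixture of N(m1, 1) and N(m2, 1) with unknown means."""
    history = []  # log-likelihood at the start of each iteration
    for _ in range(max_iter):
        # Step II (E-step): posterior probability of component 1 per point.
        resp = []
        loglik = 0.0
        for x in data:
            a = 0.5 * math.exp(-0.5 * (x - m1) ** 2)
            b = 0.5 * math.exp(-0.5 * (x - m2) ** 2)
            resp.append(a / (a + b))
            loglik += math.log((a + b) / math.sqrt(2.0 * math.pi))
        history.append(loglik)
        # Step III (M-step): responsibility-weighted means.
        w1 = sum(resp)
        w2 = len(data) - w1
        new_m1 = sum(r * x for r, x in zip(resp, data)) / w1
        new_m2 = sum((1.0 - r) * x for r, x in zip(resp, data)) / w2
        # Step IV: stop once both estimates have stabilized.
        change = max(abs(new_m1 - m1), abs(new_m2 - m2))
        m1, m2 = new_m1, new_m2
        if change < tol:
            break
    return m1, m2, history


rng = random.Random(1)  # fixed seed for reproducibility
data = [rng.gauss(-2.0, 1.0) for _ in range(250)]
data += [rng.gauss(2.0, 1.0) for _ in range(250)]
# Step I: initial guesses for the two means.
m1_hat, m2_hat, history = em_two_means(data, m1=-1.0, m2=1.0)
print(f"estimated means: {m1_hat:.3f}, {m2_hat:.3f} after {len(history)} iterations")
```

The recorded log-likelihood values do not decrease from one iteration to the next, which is the property that underlies the convergence results cited above; it does not rule out convergence to a local rather than a global maximum.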

The set of parameters of interest can be described by the set *𝚯* defined as follows:

where

Since *π*_{ji}’s are transition probabilities, we also require that

Suppose now that the set of parameters 𝚯 is given and that the set of observed data described by the *σ*-field *𝒴*_{n} is known. We wish to determine a new set of parameters 𝚯(*n*) defined by:

where

Such a new set of parameters is determined by maximizing the conditional log-likelihoods defined below. The basic idea is that we update one set of parameters at a time starting with the *π*_{kl} given observed data described by the *σ*-field *𝒴*_{n} is given by:

For derivation, interested readers may refer to Elliott, Aggoun, and Moore (1995), (see Chapter 2 therein).

Consider now another set of parameters (*𝝁*, *𝜽*). To change the set of parameters from (*𝝁*, *𝜽*) to (*n*), *n*)) while keeping other parameters (*𝝈*, *r*, *δ*) constant, we must consider the following factors, (*t* = 1, 2, … , *n*),

Note that, to simplify the notation, we write *n*) and *n*), respectively.

Write

Then a new probability measure ^{1} is defined so that the restriction of its Radon-Nikodym derivative *𝒴*_{n} is given by:

It is then not difficult to see that under ^{1}, the sequence defined by:

is a sequence of *N*(0, 1) i.i.d. random variables.

Now

where *R*(*𝝁*, *𝜽*, *𝝈*, *r*, *δ*) does not involve

Consequently,

Define the following functions:

Consider the following functionals, for *i* = 1, 2, … , *N* and *j* = 1, 2, 3, 4:

Then

Again *R*(*𝝁*, *𝜽*, *𝝈*, *r*, *δ*) is a quantity which does not depend on _{i} or _{i} and can change from line to line.

Conditioning on *𝒴*_{n} under *P* gives:

Differentiating _{i} and setting the derivative equal to zero gives:

Solving for _{i} then gives:

Note that the formula for _{i} depends on _{i}.

Differentiating _{i} and setting the derivative equal to zero gives:

and this gives

where the exact recursive formulae for evaluating *j* =1, 2, 3, 4, and _{i} depends on _{i}.

Consider now the set of parameters *σ*_{i}, *i* = 1, 2, … , *N*. To change the parameters from *σ*_{i} to _{i}(*n*), *i* = 1, 2, … , *N*, while keeping (*𝝁*, *𝜽*, *r*, *δ*) fixed, we must consider factors, (*t* = 1, 2, ⋯ , *n*):

Here we assume that _{i}(*n*) > 0.

Again, we write

A new probability measure ^{2} can then be defined so that the restriction of its Radon-Nikodym derivative *𝒴*_{n} is given by:

Now,

Define

and

Again conditioning on *𝒴*_{n} under *P* gives:

Differentiating _{i} and setting the derivative equal to zero gives:

By construction, these estimates are greater than zero. Note that the formula for *μ*_{i} and *θ*_{i}.

Finally, we consider the set of parameters (*r*, *δ*). In this case, we consider a specific form of the smooth transition function *F*. For illustration, we consider the “normal” smooth transition function given in Example 1. That is,

In this case, we consider a Laplace series expansion for Φ(*x*) as follows:

To change the parameters (*r*, *δ*) to (*n*), *n*)) while keeping other parameters (*𝝁*, *𝜽*, *𝝈*) fixed, we consider the following factors, (*t* = 1, 2, ⋯ , *n*):

Write

Again a new probability measure ^{3} is defined so that the restriction of its Radon-Nikodym derivative *𝒴*_{n} is given by:

Consequently,

where *R*(*𝝁*, *𝜽*, *𝝈*, *r*, *δ*) does not involve

Write

and for *j* = 6, 7, 8,

Consequently,

Conditioning on *𝒴*_{n} under *P* then gives:

Differentiating with respect to

Note that the formula for *θ*_{i}, *μ*_{i} and *σ*_{i}.

Differentiating with respect to

where

Note that these coefficients depend on *μ*_{i}, *θ*_{i} and *σ*_{i}.

## 4 Simulation study

We illustrate the implementation of the proposed filtering and estimation algorithm for the hidden Markov regime-switching smooth transition model using simulated data. The accuracy of the filters of the hidden Markov chain and of the filter-based estimates of the unknown parameters, as well as the convergence of the parameter estimates of the filter-based EM algorithm, are also studied using the simulated data. Here we consider a simple two-state, two-regime HMRS-STAR model, namely an HMRS-STAR(2, 2) model, with lag *d* = 1 and sample size *n* = 1000. That is, the following model is considered.

It is clear that *μ*_{1} = −1, *μ*_{2} = 1, *θ*_{1} = −1, *θ*_{2} = 1, *σ*_{1} = 2, *σ*_{2} = 1, *δ* = 10 and *r* = 0, and that *ϵ*_{t}, for *t* = 1, …, *n*, are independent standard normal random variables. We consider the following “hypothetical” transition probability matrix

The initial values are set as follows: **q**_{0} = **p**_{0} = (0.5, 0.5)′; *σ*_{1} is equal to the standard deviation of those *Y*'s which are less than the sample average; *σ*_{2} is equal to the standard deviation of those *Y*'s which are greater than the sample average; *μ*_{1} = *θ*_{1} is equal to the average of those *Y*'s which are less than the sample average; *μ*_{2} = *θ*_{2} is equal to the average of those *Y*'s which are greater than the sample average; and *δ* = *r* = 1. The algorithm is terminated when the differences between the current parameter estimates and the previous parameter estimates are all less than 0.005. In this setup, we obtain our final estimates at iteration 87; see Table 1.
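The data-driven starting values and the stopping rule described above are easy to express in code. The following sketch assumes that "standard deviation" means the sample standard deviation with divisor *n* − 1, which the text does not specify, and takes the initial state probabilities (0.5, 0.5) from the real data section; the function and key names are hypothetical.

```python
import statistics


def initial_values(ys):
    """Starting values built from the observations below and above their mean."""
    centre = statistics.mean(ys)
    low = [y for y in ys if y < centre]
    high = [y for y in ys if y > centre]
    return {
        "p0": [0.5, 0.5],
        # Sample standard deviation (divisor n - 1) is an assumption.
        "sigma": [statistics.stdev(low), statistics.stdev(high)],
        "mu": [statistics.mean(low), statistics.mean(high)],
        "theta": [statistics.mean(low), statistics.mean(high)],
        "delta": 1.0,
        "r": 1.0,
    }


def converged(old, new, tol=0.005):
    """True when every parameter moved by less than tol since the last iteration."""
    return all(abs(a - b) < tol for a, b in zip(old, new))


start = initial_values([-3.0, -1.0, 1.0, 3.0])  # made-up series
print(start)
print(converged([1.0, 2.0], [1.004, 2.001]))
```

Splitting the sample at its mean gives two groups with clearly separated levels, so the algorithm starts from two distinguishable regimes rather than from identical parameter values.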

Table 1: Parameter estimates from the simulation.

Parameter | True value | Estimate | Parameter | True value | Estimate |
---|---|---|---|---|---|
π_{11} | 0.6 | 0.56 | π_{21} | 0.4 | 0.44 |
π_{12} | 0.3 | 0.27 | π_{22} | 0.7 | 0.73 |
μ_{1} | −1 | −1.24 | μ_{2} | 1 | 0.88 |
θ_{1} | −1 | −0.95 | θ_{2} | 1 | 1.15 |
σ_{1} | 2 | 1.94 | σ_{2} | 1 | 1.08 |
r | 0 | 0.0025 | δ | 10 | 13.52 |

We also provide figures to illustrate the convergence of the parameter estimates. From the estimation results displayed in the above table, we see that the parameter estimates from the filter-based EM algorithm are reasonably accurate for most of the parameters. However, compared with the other parameter estimates, the estimate of *δ* is less accurate (13.52 against a true value of 10). Note that a sample size of *n* = 1000 is required to obtain these parameter estimates. One possible explanation for the less accurate estimation result for the scale parameter *δ* is that the parameter estimate is based on an approximate formula derived from taking the first term only in the Laplace series expansion. The accuracy of the estimation for the scale parameter may be improved if higher-order terms in the Laplace series expansion are included. Of course, there is a trade-off between the accuracy of the estimation and the tractability of the computation of the parameter estimates. Furthermore, in general, it seems to be difficult to estimate a scale parameter accurately, and hence, at an intuitive level, a large sample size may be required to achieve an accurate estimate. In practice, most financial series are large and easily exceed 1000 observations.^{1}

Figure 2 plots the values of *𝚷* against iteration. Figure 3 plots the values of *μ*_{1}, *μ*_{2} against iteration. Figure 4 plots the values of *θ*_{1}, *θ*_{2} against iteration. Figure 5 plots the values of *σ*_{1}, *σ*_{2} against iteration. Figure 6 plots the values of *r* and *δ* against iteration. Figure 7 presents the filtered estimate of the path of the hidden Markov chain (red line: *p*_{1} of **p** = **q**/⟨**q**, **1**⟩; blue line: *p*_{2} of **p** = **q**/⟨**q**, **1**⟩) and the simulated path of the hidden Markov chain based on the hypothetical parameters (black line). From Figures 3–6, we see that the parameter estimates converge at reasonable rates, with most of them converging at 87 iterations. Furthermore, from Figure 7, it seems that the filtered estimate of the path of the hidden Markov chain matches the simulated path reasonably well.
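The effect of truncating a series for Φ after its first term can be checked numerically. The sketch below assumes that the expansion in question is the power series of Φ(*x*) about zero, Φ(*x*) = 1/2 + (1/√(2π)) Σ_{k≥0} (−1)^{k} *x*^{2k+1} / (*k*! 2^{k} (2*k* + 1)); whether this matches the expansion used in Section 3.2 term by term cannot be confirmed from the text, and the function names are hypothetical.

```python
import math


def phi_exact(x):
    """Standard normal CDF via the error function."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))


def phi_series(x, n_terms):
    """Partial sum of the power series of the normal CDF about zero."""
    total = 0.0
    for k in range(n_terms):
        coeff = math.factorial(k) * 2 ** k * (2 * k + 1)
        total += (-1) ** k * x ** (2 * k + 1) / coeff
    return 0.5 + total / math.sqrt(2.0 * math.pi)


for x in (0.5, 1.0, 2.0):
    one = phi_series(x, 1)
    ten = phi_series(x, 10)
    print(f"x={x}: exact={phi_exact(x):.5f}, 1 term={one:.5f}, 10 terms={ten:.5f}")
```

With a single term the approximation is linear in *x*, so it is adequate near zero but exceeds one for arguments around two, whereas a few more terms track the exact value closely. Under the stated assumption, this illustrates the trade-off between accuracy and tractability mentioned above.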

## 5 Real data illustration

We implement our HMRS-STAR model and estimation procedure on two indices, the Hang Seng Index (HSI) and the NASDAQ Composite (IXIC), from 3rd Jan 2012 to 30th Dec 2016. There are a total of 1246 and 1258 days for the Hang Seng Index and NASDAQ Composite, respectively. The data were obtained from Yahoo Finance. Let *P*_{t} be the index at day *t*; we consider the log return defined by *Y*_{t} = 100 log(*P*_{t}/*P*_{t−1}). Figure 8 shows the return data.
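The return transformation can be written in a few lines. The sketch below assumes that log denotes the natural logarithm and uses made-up index levels; the function name is hypothetical.

```python
import math


def log_returns(prices):
    """Percentage log returns 100 * log(P_t / P_{t-1}) for t = 1, ..., n."""
    return [100.0 * math.log(p1 / p0) for p0, p1 in zip(prices, prices[1:])]


prices = [100.0, 110.0, 99.0]  # made-up index levels
returns = log_returns(prices)
print([round(y, 4) for y in returns])
```

A series of *n* + 1 index levels gives *n* returns, and the returns add up to 100 times the log of the ratio of the last level to the first.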

The returns are fitted to the two-state, two-regime HMRS-STAR model, namely an HMRS-STAR(2, 2) model, with lag *d* = 1, and the initial values are identical to those used in the simulation study, that is, **q**_{0} = **p**_{0} = (0.5, 0.5)′; *σ*_{1} is equal to the standard deviation of those *Y*'s which are less than the sample average; *σ*_{2} is equal to the standard deviation of those *Y*'s which are greater than the sample average; *μ*_{1} = *θ*_{1} is equal to the average of those *Y*'s which are less than the sample average; *μ*_{2} = *θ*_{2} is equal to the average of those *Y*'s which are greater than the sample average; and *δ* = *r* = 1. The algorithm is terminated when the differences between the current parameter estimates and the previous parameter estimates are all less than 0.005. We obtain our final estimates at iterations 394 and 60 for the Hang Seng Index and NASDAQ Composite, respectively. The estimation algorithm follows the general algorithm (i.e., Steps I–IV) presented in Section 3.2. The estimated values with respect to the log returns are given in Table 2.

Again we also provide figures to illustrate the convergence of the parameter estimates. The following figures are plotted based on the estimation of 100×log-return. Figure 9 plots the values of *𝚷* against iteration. Figure 10 plots the values of *μ*_{1}, *μ*_{2} against iteration. Figure 11 plots the values of *θ*_{1}, *θ*_{2} against iteration. Figure 12 plots the values of *σ*_{1}, *σ*_{2} against iteration. Figure 13 plots the values of *r* and *δ* against iteration. Figure 14 presents the filtered estimate of the paths of the hidden Markov chain (red line: *p*_{1} of **p** = **q**/⟨**q**, **1**⟩; blue line: *p*_{2} of **p** = **q**/⟨**q**, **1**⟩). From Figures 9–13, we see that the parameter estimates converge at reasonable rates, with most of them converging at 394 and 60 iterations, respectively. The convergence of the parameters is rather stable, except that the two parameters *r* and *δ* for the HSI data are significantly unstable around iterations 100–400, as shown in Figure 13. Also, for the two real data sets, it seems that the parameter estimates can disentangle the two regimes for the *μ*s and *σ*s, but not for the *θ*s for the HSI data.

Table 2: Parameter estimates and summary statistics from the real data.

Parameter | HSI | IXIC | Parameter | HSI | IXIC |
---|---|---|---|---|---|
π_{11} | 0.59 | 0.88 | π_{21} | 0.41 | 0.12 |
π_{12} | 0.31 | 0.10 | π_{22} | 0.69 | 0.90 |
μ_{1} | −0.36 | 0.12 | μ_{2} | −0.15 | −0.22 |
θ_{1} | 0.62 | −0.43 | θ_{2} | 0.47 | 0.86 |
σ_{1} | 1.45 | 1.23 | σ_{2} | 0.67 | 0.58 |
r | 7.58 | −0.15 | δ | 0.41 | −3.12 |
Sample mean | 0.0123 | 0.0564 | Sample standard deviation | 1.0836 | 0.9392 |

## 6 Discussions on other potential economic applications

The proposed class of HMRS-STAR models may be applied to study some important problems in econometrics and economics as well as real-world problems which may be of socio-economic importance. Here we provide some suggestions for potential applications of the proposed class of models. There may be interesting applications beyond those mentioned below.

As noted in, for example, Hansen (2011), one of the important problems in econometrics is testing for linearity in economic time series against a certain nonlinear alternative which may be described by a parametric nonlinear time series model. This problem has significant economic implications since many economic models were developed under the premise of linearity. However, real-world data may reveal otherwise. A challenging issue is what nonlinear economic models may be used if a linear economic model is not appropriate.^{2} Indeed, different conclusions may be drawn from testing for linearity if different parametric nonlinear time series models are used as the nonlinear alternative. The testing of a linear model against a threshold autoregressive model has been considered by statisticians and econometricians; see, for example, Chan (1990), Chan and Tong (1990), and Hansen (1996). See also Hansen (2011) and Tong (2011) and the relevant references therein. The testing of a linear model against a Markovian regime-switching autoregressive model has been considered in the literature. See, for example, Hamilton (2016) and the relevant references therein. The testing of a linear model against a smooth transition autoregressive model has also been considered. See, for example, van Dijk, Teräsvirta, and Franses (2002) and the related literature therein. However, it seems that the testing of linearity against a composite alternative model such as a second-generation nonlinear time series model may have received relatively less attention in the literature. Intuitively, it appears that considering a composite alternative model in testing for linearity may provide a more general and flexible way to detect nonlinearity. The proposed class of HMRS-STAR models may be used to form a composite alternative model in testing for linearity in economic time series.
Besides testing for linearity in economic time series, another potential application of the proposed class of HMRS-STAR models is testing for a unit root, which is an important topic in econometrics. As noted in Hansen (2011), testing for a unit root has been investigated using a nonlinear stationary threshold autoregressive model. It may be interesting to investigate the use of the proposed class of HMRS-STAR models in unit root testing. Time delay is also an important feature of economic time series. One potential application of the proposed class of HMRS-STAR models is the modeling of time delay in economic time series. One source of flexibility is that abrupt regime switches and time delay can be disentangled under the HMRS-STAR models. Indeed, as inherited from the STAR models, time delay is related to smooth regime switches.

Some major classes of parametric nonlinear time series models such as the threshold autoregressive models have been adopted to investigate the law of one price and transactions costs. See, for example, Sarno, Taylor, and Chowdhury (2004) and Taylor (2001) for the use of threshold autoregressive models to investigate the purchasing power parity puzzle, the law of one price and transaction costs. See also Hansen (2011) for some related discussions. It was noted in Taylor (2001) that a linear time series model such as an autoregressive model cannot provide a realistic description for nonlinear adjustments of prices attributed to the presence of transaction costs. In Taylor (2001), a two-regime threshold autoregressive model was employed to describe such nonlinearity, where the two threshold parameters are used to describe a “band of inaction” which may cause the nonlinearity. Instead of using the threshold autoregressive models, one may explore the use of the proposed class of HMRS-STAR models to study the law of one price and transaction costs. For example, one may investigate the nonlinearity due to transaction costs using both the abrupt and smooth regime switches.

## 7 Conclusion

We introduced a hidden Markov regime-switching smooth transition (HMRS-STAR) model and discussed its filtering and estimation issues. A reference probability approach and a version of the Bayes’ rule were used to derive filters for the hidden Markov chain and some related quantities. These related quantities were used to derive filter-based estimates for the unknown parameters using the EM algorithm. For the threshold parameter and the scale parameter, we employed a Laplace series expansion for the cumulative normal probability distribution to derive approximations to their filter-based estimates. Simulation experiments were presented to illustrate the practical implementation of the HMRS-STAR model as well as the filtering and estimation algorithm. The simulation results reveal that the parameter estimates converge at reasonable rates, say within 87 iterations for most of the parameters, and that the estimates are reasonably accurate for most of the parameters. Furthermore, real financial data were used to illustrate the practical implementation of the model.

However, we encountered a challenging issue: for some other datasets, our estimation algorithm does not converge. There is therefore still room to improve the estimation method, in particular its convergence, stability and robustness. These may represent interesting topics for further research. Another potential application of the proposed model is fitting economic series such as unemployment data, which is an important area in economics and econometrics. For this application, one may refer to, for example, Koop and Potter (1999) and Hamilton (2005). It may also be interesting to explore the use of the proposed model to fit financial time series other than equity indices, such as foreign exchange rate series. One may also explore applications of the proposed model to time series data in other fields such as climate science, ecology, population dynamics, biology, engineering and physical sciences.^{3}

The authors would like to thank the Associate Editor and referees for their valuable and helpful comments.

## References

Baum, L. E., T. Petrie, G. Soules, and N. Weiss. 1970. “A Maximization Technique Occurring in the Statistical Analysis of Probabilistic Functions of Markov Chains.” The Annals of Mathematical Statistics 41: 164–171.

Chan, K. S., and H. Tong. 1986. “On Estimating Thresholds in Autoregressive Models.” Journal of Time Series Analysis 7: 178–190.

Chan, K. S. 1990. “Testing for Threshold Autoregression.” The Annals of Statistics 18: 1886–1894.

Chan, K. S., and H. Tong. 1990. “On Likelihood Ratio Tests for Threshold Autoregression.” Journal of the Royal Statistical Society. Series B (Methodological) 52: 469–476.

Dembo, A., and O. Zeitouni. 1986. “Parameter Estimation of Partially Observed Continuous Time Stochastic Processes.” Stochastic Processes and Their Applications 23: 91–113.

Engle, R. 1982. “Autoregressive Conditional Heteroscedasticity with Estimates of the Variance of United Kingdom Inflation.” Econometrica 50: 987–1007.

Elliott, R. J., L. Aggoun, and J. Moore. 1995. Hidden Markov Models: Estimation and Control. New York: Springer.

Elliott, R. J., C. C. Liew, and T. K. Siu. 2011. “On Filtering and Estimation of a Threshold Stochastic Volatility Model.” Applied Mathematics and Computation 218: 61–75.

Elliott, R. J., T. K. Siu, and J. W. Lau. 2013. “Filtering a Double Threshold Model with Regime Switching.” IEEE Transactions on Automatic Control 58: 3185–3190.

Fan, J., and Q. Yao. 2003. Nonlinear Time Series: Nonparametric and Parametric Methods. New York: Springer.

Gao, J. 2007. Nonlinear Time Series: Semiparametric and Nonparametric Methods. London: Chapman and Hall/CRC.

Goldfeld, S. M., and S. M. Quandt. 1973. “A Markov Model for Switching Regressions.” Journal of Econometrics 1: 3–15.

Granger, C. W., and A. P. Anderson. 1978. An Introduction to Bilinear Time Series Models. Göttingen: Vandenhoeck and Ruprecht.

Granger, C. W. J., and T. Teräsvirta. 1993. Modelling Non-linear Econometric Relationships. Oxford: Oxford University Press.

Hamilton, J. D. 1989. “A New Approach to Economic Analysis of Nonstationary Time Series and the Business Cycle.” Econometrica 57: 357–384.

Hamilton, J. D. 2005. “What’s Real About the Business Cycle?” Federal Reserve Bank of St. Louis Review 87: 435–452.

Hamilton, J. D. 2016. “Macroeconomic Regimes and Regime Shifts.” In Handbook of Macroeconomics, edited by H. Uhlig and J. Taylor, Volume 2A, 163–201. Amsterdam: Elsevier.

Hansen, B. E. 1996. “Inference When a Nuisance Parameter is Not Identified Under the Null Hypothesis.” Econometrica 64: 413–430.

Hansen, B. E. 2011. “Threshold Autoregression in Economics.” Statistics and Its Interface 4: 123–127.

Koop, G., and S. M. Potter. 1999. “Dynamic Asymmetries in U.S. Unemployment.” Journal of Business & Economic Statistics 17: 298–312.

Potter, S. M. 1999. “Nonlinear Time Series Modelling: An Introduction.” Journal of Economic Surveys 13: 505–528.

Priestley, M. B. 1980. “State-Dependent Models: A General Approach to Non-linear Time Series Analysis.” Journal of Time Series Analysis 1: 47–71.

Quandt, R. E. 1958. “The Estimation of the Parameters of a Linear Regression System Obeying Two Separate Regimes.” Journal of the American Statistical Association 53: 873–880.

Sarno, L., M. P. Taylor, and I. Chowdhury. 2004. “Nonlinear Dynamics in Deviations from the Law of One Price: A Broad-Based Empirical Study.” Journal of International Money and Finance 23: 1–25.

Subba Rao, T., and M. M. Gabr. 1984. An Introduction to Bispectral Analysis and Bilinear Time Series Models. New York: Springer.

Taylor, S. J. 1982. “Financial Returns Modelled by the Product of Two Stochastic Processes, a Study of Daily Sugar Prices, 1961–79.” In Time Series Analysis: Theory and Practice 1, edited by O. D. Anderson, 203–226. Amsterdam: North Holland.

Taylor, S. J. 1986. Modeling Financial Time Series. Chichester: Wiley.

Taylor, A. M. 2001. “Potential Pitfalls for the Purchasing-Power-Parity Puzzle? Sampling and Specification Biases in Mean-Reversion Tests of the Law of One Price.” Econometrica 69: 473–498.

Taylor, S. J. 2005. Asset Price Dynamics, Volatility, and Prediction. Princeton: Princeton University Press.

Teräsvirta, T. 1994. “Specification, Estimation, and Evaluation of Smooth Transition Autoregressive Models.” Journal of the American Statistical Association 89: 208–218.

Teräsvirta, T. 1998. “Modelling Economic Relationships with Smooth Transition Regressions.” In Handbook of Applied Economic Statistics, edited by D. E. A. Giles and A. Ullah, 507–552. New York: Marcel Dekker.

Tong, H. 1977. “Contribution to the Discussion of the Paper Entitled ‘Stochastic Modelling of Riverflow Time Series’ by A. J. Lawrance and N. T. Kottegoda.” Journal of the Royal Statistical Society: Series A (Statistics in Society) 140: 34–35.

Tong, H. 1978. “On a Threshold Model, Pattern Recognition and Signal Processing.” In NATO ASI Series E: Applied Sc. No. 29, edited by C. H. Chen, 575–586. The Netherlands: Sijthoff & Noordhoff.

Tong, H. 1983. Threshold Models in Non-Linear Time Series Analysis. New York: Springer.

Tong, H. 1990. Nonlinear Time Series Analysis: A Dynamical System Approach. Oxford: Oxford University Press.

Tong, H. 2011. “Threshold Models in Time Series Analysis – 30 years on.” Statistics and Its Interface 4: 107–118.

Tong, H., and K. S. Lim. 1980. “Threshold Autoregression, Limit Cycles and Cyclical Data (with Discussion).” Journal of the Royal Statistical Society. Series B (Methodological) 42: 245–292.

van Dijk, D., T. Teräsvirta, and P. H. Franses. 2002. “Smooth Transition Autoregressive Models – A Survey of Recent Developments.” Econometric Reviews 21: 1–47.

Yin, G., and C. Zhu. 2010. Hybrid Switching Diffusions: Properties and Applications. New York: Springer.

Zhu, D. M., W. K. Ching, R. J. Elliott, T. K. Siu, and L. Zhang. 2017. “Hidden Markov Models with Threshold Effects and Their Applications to Oil Price Forecasting.” Journal of Industrial and Management Optimization 13: 757–773.

## Supplementary Material

The online version of this article offers supplementary material (DOI:

## Code and Datasets

The code and data associated with this article have been published by the author(s) on Code Ocean, a computational reproducibility platform. We recommend Code Ocean to SNDE contributors who wish to share, discover, and run code in published research articles. (See: https://doi.org/10.24433/CO.dd77cb0f-e54e-4693-905b-493a86cfd345).

## Footnotes

^{1} We would like to thank one of the referees for stimulating this discussion.

^{2} As noted in Tong (1990), Chapter 2, when one leaves a linear world, there are infinitely many parametric nonlinear alternatives. A challenging issue is which nonlinear alternative may be used.

^{3} We would like to thank one of the referees for stimulating our discussions on some potential applications of the proposed model.