It is well known that a number of economic variables reflect cycles in their paths, which can be modeled as non-linear trends. However, non-linearities might complicate the estimation of the models, in particular when trying to use functions which are not linear in parameters to approximate those trends. This paper deals with the analysis of long range dependence in the context of non-linear models. In particular, we employ the Chebyshev polynomials in time to describe the deterministic part of the model, and suppose that the detrended series displays long memory behavior. We use a general definition of long memory that allows the inclusion of one or more poles or singularities in the spectrum at various frequencies. Thus, we consider the standard case of I(d, d>0) behavior, but also other possibilities such as seasonal/cyclical long range dependence and multiple cyclical structures. This is particularly interesting for macroeconomic data with a high seasonal component or cyclical movement due to economic activity.
The main problem with the non-linear deterministic trends in the context of fractional integration is that the interaction of the two structures produces a model with a non-linear structure for the coefficients, implying that linear methods are invalid for the estimation of the parameters unless we impose certain a priori conditions on various coefficients (e.g. Caporale and Gil-Alana 2007). Also, a misspecified deterministic component may affect the power of the tests for the order of integration of the variables (see Perron 1989, amongst many others). Many authors such as Zivot and Andrews (1992), Lumsdaine and Papell (1997), Lee and Strazicich (2003) and Papell and Prodan (2006), inter alia, have proposed unit root tests incorporating structural breaks, so as to improve the performance of the tests. However, structural breaks may still not be a proper specification of the deterministic component. Changes can occur smoothly rather than suddenly. In this line, Ouliaris, Park, and Phillips (1989) proposed regular polynomials to approximate deterministic components in the data generation process. However, as later pointed out by Bierens (1997), Chebyshev polynomials might be a better mathematical approximation of the time functions, since Chebyshev polynomials are bounded and orthogonal. Chebyshev polynomials are cosine functions of time, which according to Bierens (1997), can be very flexible to approximate deterministic trends. With respect to the long range dependence we use a very general framework that allows the incorporation of one or more integer or fractional orders of integration of arbitrary order anywhere on the unit circle in the complex plane. This allows the analysis of a great variety of model specifications, including for example seasonal and cyclical behaviors of any stationary or non-stationary degree. The specific form used in the empirical applications will be based on estimates of the spectral density function. 
Also, given that the inference based on t-statistics remains valid under the fractional integration specification used, we propose a very simple way to choose the order of the Chebyshev polynomials based on a “general to specific” approach and using the statistical significance of the Chebyshev coefficients.
The structure of the paper is as follows: Section 2 describes the statistical model incorporating non-linear (Chebyshev) trends and long range dependence. Section 3 presents a testing procedure for the fractional differencing parameters that includes the estimation of the non-linear trend coefficients. Section 4 contains a simulation study. Section 5 is devoted to the empirical work that includes an application using real effective exchange rates for 40 industrialized countries, and its implications for the purchasing power parity (PPP) theory. Section 6 concludes the paper.
2 The statistical model
We consider the following model,

$$y_t = f(\gamma; z_t) + x_t, \quad t = 1, 2, \ldots, \qquad (1)$$

where yt is the observed time series; f is a general function that might be non-linear and depends on the unknown parameter vector γ, whose dimension depends on the functional form chosen for f(γ;zt); and zt is a vector of deterministic terms that may include linear and non-linear trends. Finally, we suppose that the error term xt can be described in terms of the following model,
$$\rho(L; d)\, x_t = u_t, \quad t = 1, 2, \ldots, \qquad (2)$$

with ut assumed to be I(0). For the purpose of the present work we define an I(0) process as a covariance stationary process with a spectral density function that is positive and bounded at all frequencies in the spectrum. Thus, it includes for ut in (2) stationary and invertible autoregressive and moving average (ARMA) processes. The polynomial ρ(L;d) is given by

$$\rho(L; d) = (1 - L)^{d_1} (1 + L)^{d_2} \prod_{j=3}^{M} \left(1 - 2\cos w_r^{(j)} L + L^2\right)^{d_j}, \qquad (3)$$

with $w_r^{(j)} = 2\pi r^{(j)}/T$, r(j)=T/s(j), and s(j) indicating the number of time periods per cycle in the jth cyclical structure. In (3), L is the backshift operator (i.e. Lxt=xt–1) and d=(d1, d2, …, dM)′ is an (Mx1) vector of real values containing the fractional differencing parameters that correspond to different poles or singularities in the spectrum. We observe that this is a very general specification that includes many cases of interest such as the standard I(d) models (in case of dj=0 for all j≠1, and d1=d); cyclical fractional models based on Gegenbauer processes (when dj=0 for all j≠3); seasonal models (with M=3), etc. (see Section 3.1 below).
Given the above set-up we focus on the estimation and testing of the unknown parameters corresponding to the vectors d and γ referring, respectively, to the differencing parameters and the non-linear deterministic trend coefficients.
The main problem we face with this set-up is the interaction between the equations (1) and (2) in the context of a non-linear function f, and in particular, between the long memory polynomial ρ and the non-linear function f. Under many circumstances the combination of the two produces a non-linear model in parameters, which hinders the task of estimating the parameter vector γ. This problem can be solved by using the Chebyshev time polynomials.
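The Chebyshev time polynomials Pi,T(t) are defined by:

$$P_{0,T}(t) = 1, \qquad P_{i,T}(t) = \sqrt{2}\, \cos\left(\frac{i \pi (t - 0.5)}{T}\right), \quad t = 1, 2, \ldots, T; \; i = 1, 2, \ldots \qquad (4)$$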
See Hamming (1973) and Smyth (1998) for a detailed description of these polynomials. Bierens (1997) uses them in the context of unit root testing. The latter author proposes several unit root tests, which account for a drift and a unit root under the null hypothesis, and stationarity around a linear or non-linear trend under the alternative. Hence, within the analysis of the order of integration of the variables, the Bierens (1997) unit root tests allow us to test whether the process is linear or non-linear trend stationary. In addition, Bierens and Martins (2010) propose the use of Chebyshev polynomials in the framework of time-varying cointegrating parameters. There are several advantages in using these polynomials. First, since they are orthogonal, they avoid the problem of near collinearity in the regressors matrix that arises with regular time polynomials. Second, according to Bierens (1997) and Tomasevic, Tomasevic, and Stanivuk (2009), it is possible to approximate highly non-linear trends with polynomials of a rather low degree. Finally, given their particular shape, they are well suited to approximating cyclical behavior. For instance, it is well known that GDP is characterized as a cyclical process. This particular shape needs to be accounted for when analyzing the statistical properties of this variable. Linked to that through the Balassa-Samuelson effect, the analysis of the Purchasing Power Parity (PPP) hypothesis, through the mean reversion of real exchange rates, might also be affected by cyclicality of the real exchange rate. In a recent contribution, Christopoulos and León-Ledesma (2010) include this cyclical behaviour of the real exchange rate in their analysis of the PPP. This is the basis of our empirical application in Section 5.
Figure 1 depicts the shape of the Chebyshev polynomials in (4) as a function of t, for different i. Hence, for i=0, we have a drift, and for i=1 we have nearly a linear trend. For i≥2, the trend mimics cycles, which increase their frequency as i increases. Given this type of shape, these polynomials are a good way to approximate economic cycles, which are present in many variables, with a rather low i. See the Appendix for a brief discussion of the Chebyshev polynomials.
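The shapes just described can be checked numerically. The following is a minimal sketch in Python (not the Fortran code used in the paper), assuming the cosine form of the Chebyshev time polynomials in Bierens (1997); the function names chebyshev_time_poly and scaled_inner are hypothetical and chosen only for illustration.

```python
import math

def chebyshev_time_poly(i, T):
    """Return [P_{i,T}(1), ..., P_{i,T}(T)], assuming the Bierens (1997) form:
    P_0 = 1 and P_i(t) = sqrt(2) * cos(i * pi * (t - 0.5) / T) for i >= 1."""
    if i == 0:
        return [1.0] * T
    return [math.sqrt(2.0) * math.cos(i * math.pi * (t - 0.5) / T)
            for t in range(1, T + 1)]

def scaled_inner(p, q):
    """Sample inner product (1/T) * sum_t p(t) * q(t)."""
    return sum(a * b for a, b in zip(p, q)) / len(p)

T = 100
# i = 0 is a constant, i = 1 is close to a linear trend, i >= 2 adds cycles.
P = [chebyshev_time_poly(i, T) for i in range(4)]
# Gram matrix of the first four polynomials; orthonormality means it is
# the identity up to rounding error.
gram = [[scaled_inner(P[i], P[j]) for j in range(4)] for i in range(4)]
```

Under this definition the sample Gram matrix is the identity, which is the orthogonality property that removes near collinearity among the trend regressors.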
Throughout the present paper we employ Chebyshev polynomials to describe the deterministic trend in our model. Thus, equation (1) can be approximated by

$$y_t = \sum_{i=0}^{m} \theta_i P_{i,T}(t) + x_t, \quad t = 1, 2, \ldots, \qquad (5)$$
with m indicating the order of the Chebyshev polynomial, and xt following the model given by the equations (2) and (3). As mentioned earlier, if m=0 the model contains an intercept, if m=1 it adds a linear trend, and if m>1 the model becomes non-linear, and the higher m is the less linear the approximated deterministic component becomes. Any combination of values for i might be valid to approximate the deterministic component of the data generation process (DGP). Given the shape of these polynomials, one can approximate economic cycles with a rather low m, saving degrees of freedom. An issue that immediately arises here is the determination of the optimal choice for m. However, as will be argued below, standard t-statistics remain valid under the specification given by (5), (2) and (3), noting that the error term is I(0) by definition. The choice of m will, then, depend on the significance of the Chebyshev coefficients in the joint specification of the model, for a particular choice of the (possibly ARMA) model for the I(0) disturbances. With respect to the functional form of ρ in (3), we do not have any a priori knowledge about the number of differencing parameters required, but the periodogram or any other estimate of the spectral density function can provide useful information about it. Additionally, using the wide specification in (3), the confidence intervals of the different differencing parameters will also indicate whether some of these parameters can be removed because they are not statistically significantly different from zero. Finally, we can also evaluate the impact of misspecification of ρ(L;d) and m using standard methods such as AIC and BIC.
However, it should be borne in mind that these two criteria might not be optimal for applications involving fractional differences, as they focus on the short-term forecasting ability of the fitted model and may not give sufficient attention to the long-run properties of long memory models (see, e.g. Hosking 1981, 1984).
3 The procedure
The method proposed in this paper is a slight modification of Robinson (1994). He considers the same set-up as in (1) and (2) with f in (1) based exclusively on the linear form θ′zt, testing the null hypothesis:

$$H_o: d = d_o, \qquad (6)$$

for any real vector value do. Under Ho, and using the two equations,

$$\tilde{y}_t = \theta' \tilde{z}_t + u_t, \quad t = 1, 2, \ldots, \qquad (7)$$

where $\tilde{y}_t = \rho(L; d_o)\, y_t$, $\tilde{z}_t = \rho(L; d_o)\, z_t$, and the symbol ′ indicates transposition. Then, given the linear structure of the above relationship and the I(0) nature of the error term ut, the coefficients in (7) can be estimated by standard ordinary least squares/generalized least squares (OLS/GLS) methods. The same happens in our approach, whereby f contains the Chebyshev polynomials, noting that in spite of the non-linear structure, the relation is linear in parameters. Thus, combining equations (2) and (5) we get

$$\rho(L; d)\, y_t = \sum_{i=0}^{m} \theta_i\, \rho(L; d)\, P_{i,T}(t) + u_t, \quad t = 1, 2, \ldots, \qquad (8)$$
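which can also be expressed in terms of (7) as in Robinson (1994) and then, using OLS/GLS methods, under the null hypothesis (6), the residuals are

$$\hat{u}_t = \rho(L; d_o)\, y_t - \sum_{i=0}^{m} \hat{\theta}_i \tilde{P}_{i,T}(t), \qquad \hat{\theta} = \left(\sum_{t=1}^{T} \tilde{P}_t \tilde{P}_t'\right)^{-1} \sum_{t=1}^{T} \tilde{P}_t\, \rho(L; d_o)\, y_t, \qquad (9)$$

with $\tilde{P}_{i,T}(t) = \rho(L; d_o)\, P_{i,T}(t)$ and $\tilde{P}_t$ as the ((m+1)x1) vector of transformed Chebyshev polynomials. Based on the above residuals we estimate the variance,

$$\hat{\sigma}^2 = \sigma^2(\hat{\tau}) = \frac{2\pi}{T} \sum_{j=1}^{T-1} g(\lambda_j; \hat{\tau})^{-1} I(\lambda_j), \qquad \lambda_j = \frac{2\pi j}{T},$$

where $I(\lambda_j)$ is the periodogram of $\hat{u}_t$; g is a function related to the spectral density of ut [i.e. s.d.f.(ut)=(σ2/2π)g(λj;τ)]; and the nuisance parameter τ is estimated, for example, by $\hat{\tau} = \arg\min_{\tau \in T^*} \sigma^2(\tau)$, where T* is a suitable subset of the Rq Euclidean space.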
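The transformation and OLS step described above can be sketched as follows. This is an illustrative Python sketch rather than the authors' Fortran implementation, restricted to the single-pole case ρ(L;do)=(1−L)^do and assuming that the filter is truncated at the start of the sample (xt=0 for t≤0). The helper names cheb, frac_diff, solve and fit_under_null are hypothetical.

```python
import math

def cheb(i, T):
    """Chebyshev time polynomial P_{i,T}(t), t = 1..T (Bierens 1997 form)."""
    if i == 0:
        return [1.0] * T
    return [math.sqrt(2.0) * math.cos(i * math.pi * (t - 0.5) / T)
            for t in range(1, T + 1)]

def frac_diff(x, d):
    """Apply (1 - L)^d to x, assuming x_t = 0 for t <= 0 (truncated filter)."""
    T = len(x)
    coef = [1.0]  # binomial expansion: pi_k = pi_{k-1} * (k - 1 - d) / k
    for k in range(1, T):
        coef.append(coef[-1] * (k - 1 - d) / k)
    return [sum(coef[k] * x[t - k] for k in range(t + 1)) for t in range(T)]

def solve(A, b):
    """Solve the linear system A z = b by Gauss-Jordan elimination."""
    n = len(b)
    M = [list(A[r]) + [b[r]] for r in range(n)]
    for c in range(n):
        p = max(range(c, n), key=lambda r: abs(M[r][c]))  # partial pivoting
        M[c], M[p] = M[p], M[c]
        for r in range(n):
            if r != c:
                f = M[r][c] / M[c][c]
                M[r] = [v - f * w for v, w in zip(M[r], M[c])]
    return [M[r][n] / M[r][r] for r in range(n)]

def fit_under_null(y, m, d0):
    """OLS of (1-L)^d0 y_t on (1-L)^d0 P_{i,T}(t), i = 0..m.
    Returns coefficient estimates, their standard errors and the residuals."""
    T = len(y)
    y_tilde = frac_diff(y, d0)
    Z = [frac_diff(cheb(i, T), d0) for i in range(m + 1)]
    ZtZ = [[sum(a * b for a, b in zip(Z[i], Z[j])) for j in range(m + 1)]
           for i in range(m + 1)]
    Zty = [sum(a * b for a, b in zip(Z[i], y_tilde)) for i in range(m + 1)]
    theta = solve(ZtZ, Zty)
    resid = [y_tilde[t] - sum(theta[i] * Z[i][t] for i in range(m + 1))
             for t in range(T)]
    s2 = sum(e * e for e in resid) / (T - m - 1)
    se = []
    for i in range(m + 1):
        unit = [1.0 if k == i else 0.0 for k in range(m + 1)]
        se.append(math.sqrt(s2 * solve(ZtZ, unit)[i]))  # i-th diagonal of inverse
    return theta, se, resid

# Noise-free check: the trend coefficients are recovered for any d0,
# because the filter is linear in the data.
T = 60
true_theta = [2.0, 1.0, -0.5]
y = [sum(th * cheb(i, T)[t] for i, th in enumerate(true_theta)) for t in range(T)]
theta_hat, se_hat, u_hat = fit_under_null(y, m=2, d0=0.4)
```

The ratio of each estimate to its standard error gives the t-value that the paper uses to judge the significance of the Chebyshev coefficients; the residuals are the input to the test statistic.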
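The test statistic, which is clearly based on Robinson (1994), for testing Ho (6) in (5), (2) and (3) uses the Lagrange multiplier (LM) principle, and is given by

$$\hat{R} = \frac{T}{\hat{\sigma}^4}\, \hat{a}' \hat{A}^{-1} \hat{a}, \qquad (10)$$

where T is the sample size, and

$$\hat{a} = -\frac{2\pi}{T} \sum_{j}^{*} \psi(\lambda_j)\, g(\lambda_j; \hat{\tau})^{-1} I(\lambda_j), \qquad \hat{A} = \frac{2}{T} \left( \sum_{j}^{*} \psi(\lambda_j) \psi(\lambda_j)' - \sum_{j}^{*} \psi(\lambda_j) \hat{\varepsilon}(\lambda_j)' \left( \sum_{j}^{*} \hat{\varepsilon}(\lambda_j) \hat{\varepsilon}(\lambda_j)' \right)^{-1} \sum_{j}^{*} \hat{\varepsilon}(\lambda_j) \psi(\lambda_j)' \right), \qquad (11)$$

$$\psi(\lambda_j) = \mathrm{Re}\left[ \frac{\partial}{\partial d} \log \rho\left(e^{i\lambda_j}; d\right) \right], \qquad \hat{\varepsilon}(\lambda_j) = \frac{\partial}{\partial \tau} \log g(\lambda_j; \hat{\tau}),$$

and the sum over * above refers to all the bounded discrete frequencies in the spectrum. Under very mild regularity conditions, it can be shown that, as in Robinson (1994):

$$\hat{R} \rightarrow_d \chi^2_M \quad \text{as } T \rightarrow \infty, \qquad (12)$$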
and, based on Gaussianity of ut, the Pitman efficiency of the test against local departures from the null can also be shown. That means that if we direct the test against local alternatives of the form:

$$H_a: d = d_o + \delta T^{-1/2}, \qquad (13)$$

where δ is a non-null parameter vector, the test statistic has a limiting non-central χ2 distribution with M degrees of freedom and a non-centrality parameter that is optimal under Gaussianity of ut. Note that the method just presented is a testing procedure and therefore we do not directly estimate the fractional differencing parameter vector but simply present confidence intervals based on the non-rejections for a given set of values. Nevertheless, in the empirical application carried out at the end of the paper we display estimates of d, which are based on the values that minimize the absolute value of the test statistic. This approach is found to be appropriate by means of Monte Carlo simulations.
3.1 Simple particular cases
In this section, we simplify the functional form of the above test statistic for some particular cases of interest.
3.1.1 White noise ut
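If we suppose that the disturbances are white noise, then the spectral density function of ut is simply σ2/2π, and therefore g≡1. Also, since there are no nuisance parameters τ to estimate, the terms involving the derivatives of log g with respect to τ vanish. Then, $\hat{A}$ in (11) simplifies, with

$$\hat{A} = \frac{2}{T} \sum_{j}^{*} \psi(\lambda_j)\, \psi(\lambda_j)',$$

where $\psi(\lambda_j)$ is the real part of the derivative of $\log \rho(e^{i\lambda_j}; d)$ with respect to d.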
3.1.2 The case of the standard I(d) model
A very standard case examined in the literature is the one corresponding to ρ(L;d)=(1−L)^d. These processes are called fractionally integrated or I(d); they were introduced by Granger (1980), Granger and Joyeux (1980) and Hosking (1981), and have been widely employed in empirical works in the last 20 years to describe the dynamics of many economic and financial time series (Diebold and Rudebusch 1989; Sowell 1992; Gil-Alana and Robinson 1997; etc.).

In this context, M=1, and the test statistic also substantially simplifies, with $\psi(\lambda_j) = \log\left|2 \sin(\lambda_j/2)\right|$, implying that, for white noise ut,

$$\hat{A} = \frac{2}{T} \sum_{j=1}^{T-1} \left[ \log\left| 2 \sin\frac{\lambda_j}{2} \right| \right]^2,$$

which can be asymptotically approximated by π²/6.
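The π²/6 approximation can be checked numerically. The sketch below is illustrative Python, not part of the paper; it assumes white noise disturbances, a single pole at the zero frequency and Fourier frequencies λj=2πj/T, and the names psi and A_hat are hypothetical.

```python
import math

def psi(lam):
    """psi(lambda) = log|2 sin(lambda / 2)| for the (1 - L)^d filter."""
    return math.log(abs(2.0 * math.sin(lam / 2.0)))

def A_hat(T):
    """(2 / T) * sum over j = 1..T-1 of psi(2*pi*j/T)^2 (white noise u_t)."""
    return (2.0 / T) * sum(psi(2.0 * math.pi * j / T) ** 2 for j in range(1, T))

limit = math.pi ** 2 / 6.0
# Finite-sample values for a few sample sizes, to compare against the limit.
values = {T: A_hat(T) for T in (100, 1000, 5000)}
```

Comparing the entries of values with limit shows how far the finite-sample quantity is from its asymptotic approximation at each sample size; the gap shrinks as T grows.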
Note here that the fact that d can be any real value allows us to consider the case of I(1) processes, but also other structures such as the “1/f noise” model (d=½), widely employed in many areas such as physics, hydrology and traffic flow (Marinari et al. 1983; Wolf 1997; Ninness 1998; Eliazar and Klafter 2010), and the “1/f^{1/2} noise” (d=¼) examined for example in Fox and Taqqu (1985) and Kaulakys (2000).
Other cases are presented in the Appendix.
4 A simulation experiment
In this section we briefly examine the finite sample behavior of some simple versions of the tests by means of Monte Carlo simulations. All calculations were carried out using Fortran and the programs are available from the authors upon request. Given the variety of cases and the number of possibilities covered by the tests, we concentrate on some simple cases, widely employed in the literature such as the case of standard I(d) processes with the singularity or pole in the spectrum occurring at the long run or zero frequency. In particular, we consider the following DGP:
$$y_t = \sum_{i=0}^{m} \theta_i P_{i,T}(t) + x_t, \qquad (1 - L)^d x_t = u_t, \quad t = 1, 2, \ldots, \qquad (14)$$

with m=3 to justify some degree of non-linear behavior, and ut as a white noise process with mean zero and variance 1. Also, for simplicity, we suppose that θi=1 for all i, and take d in (14) equal to 0, 0.25, 0.50, 0.75 and 1, thus including stationary and non-stationary hypotheses. Note that one additional advantage of our testing approach is that it is valid for any real fractional differencing parameter d, thus including stationary (d<0.5) and non-stationary (d≥0.5) hypotheses. We generate Gaussian series using the routines GASDEV and RAN3 of Press et al. (1986), for different sample sizes T=50, 100, 300 and 500, taking 10,000 replications for each case, and present the results for a nominal size of 5%.
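A single replication of this design can be sketched as follows. This is an illustrative Python sketch, not the authors' Fortran code: it uses Python's random module in place of GASDEV and RAN3, assumes the Bierens (1997) cosine form for the trend terms, and generates xt from the truncated expansion of (1−L)^{−d} (that is, with ut=0 for t≤0). The names cheb, frac_integrate and simulate are hypothetical.

```python
import math
import random

def cheb(i, T):
    """Chebyshev time polynomial P_{i,T}(t), t = 1..T (Bierens 1997 form)."""
    if i == 0:
        return [1.0] * T
    return [math.sqrt(2.0) * math.cos(i * math.pi * (t - 0.5) / T)
            for t in range(1, T + 1)]

def frac_integrate(u, d):
    """x_t = (1 - L)^(-d) u_t, assuming u_t = 0 for t <= 0."""
    T = len(u)
    coef = [1.0]  # expansion weights: c_k = c_{k-1} * (k - 1 + d) / k
    for k in range(1, T):
        coef.append(coef[-1] * (k - 1 + d) / k)
    return [sum(coef[k] * u[t - k] for k in range(t + 1)) for t in range(T)]

def simulate(T, d, theta=(1.0, 1.0, 1.0, 1.0), seed=0):
    """One draw of y_t = sum_i theta_i P_{i,T}(t) + x_t with I(d) errors."""
    rng = random.Random(seed)
    u = [rng.gauss(0.0, 1.0) for _ in range(T)]  # N(0, 1) white noise
    x = frac_integrate(u, d)
    polys = [cheb(i, T) for i in range(len(theta))]
    y = [sum(th * p[t] for th, p in zip(theta, polys)) + x[t] for t in range(T)]
    return y, x, u

y, x, u = simulate(T=50, d=0.25)
```

With d=0 the errors reduce to the white noise draws, and with d=1 they are its cumulative sum, which gives a quick check that the filter weights are right.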
Based on the model given by (14) we test the null hypothesis (6) for different do-values. However, noting that in this context M=1, we can consider one-sided alternatives such as Ha:d<do or d>do, and then consider the test statistic:

$$\hat{r} = \left(\frac{T}{\hat{A}}\right)^{1/2} \frac{\hat{a}}{\hat{\sigma}^2}, \qquad (15)$$

which is asymptotically distributed as

$$\hat{r} \rightarrow_d N(0, 1) \quad \text{as } T \rightarrow \infty.$$

See Robinson (1994). Thus, an approximate one-sided 100α%-level test of (6) against the alternative d>do is given by the rule:

“Reject Ho if $\hat{r} > z_\alpha$,”

where the probability that a standard normal variate exceeds zα is α. In the same way, an approximate one-sided 100α%-level test of (6) against the alternative d<do is given by the rule:

“Reject Ho if $\hat{r} < -z_\alpha$.”
We examine the size and the power properties of the test in the case of the model given by (14) with d=1 and look in Table 1 at the rejection frequencies of $\hat{r}$ in (15) with do=0, 0.25, 0.5, 0.75, 1, 1.25, 1.5, 1.75 and 2. Thus, the values corresponding to do=1 indicate the size of the test, since d=1 is the true value in the data. We see in this table that the sizes of the tests are clearly biased if the sample size is small. For example, if T=50 and the tests are directed against d>do, the size is 0.018, whereas against d<do it is 0.109, well above the nominal size of 0.050. As the sample size increases, however, the values approach the 5% level, which is consistent with the asymptotic nature of the tests. If we focus now on the rejection frequencies, we observe that the higher sizes observed in the case of d<do also produce higher rejection probabilities in all cases compared with the cases where do<1. Nevertheless, for departures higher than 0.5, even with small sample sizes, the tests behave fairly well, and if T≥300 the probabilities are very close to 1 in all cases. Note that the null consists of a unit root with Chebyshev time polynomials, so the test performs well even in strongly non-stationary contexts. Performing the experiment with θ-coefficients different from 1, and also with other values of d, leads to essentially the same conclusions, implying that the test performs relatively well if the sample size is large enough.
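The one-sided statistic and its decision rules can be sketched in a few lines. The following Python sketch is illustrative only and makes several assumptions not spelled out in the text above: white noise ut (so g≡1), a single pole at the zero frequency, the Robinson (1994) form of the statistic built from the periodogram of the residuals obtained under the null, and Fourier frequencies λj=2πj/T. The names periodogram, r_hat and decide are hypothetical.

```python
import cmath
import math
import random
from statistics import NormalDist

def periodogram(u, j):
    """I(lambda_j) = |sum_t u_t exp(i lambda_j t)|^2 / (2 pi T)."""
    T = len(u)
    lam = 2.0 * math.pi * j / T
    dft = sum(u[t] * cmath.exp(1j * lam * (t + 1)) for t in range(T))
    return abs(dft) ** 2 / (2.0 * math.pi * T)

def r_hat(u):
    """One-sided statistic computed from residuals u obtained under the null,
    assuming white noise disturbances and rho(L; d) = (1 - L)^d."""
    T = len(u)
    psi = [math.log(abs(2.0 * math.sin(math.pi * j / T))) for j in range(1, T)]
    I = [periodogram(u, j) for j in range(1, T)]
    a = -(2.0 * math.pi / T) * sum(p * v for p, v in zip(psi, I))
    A = (2.0 / T) * sum(p * p for p in psi)
    sigma2 = (2.0 * math.pi / T) * sum(I)
    return math.sqrt(T / A) * a / sigma2

def decide(r, alpha=0.05):
    """Apply both one-sided rules at level alpha."""
    z = NormalDist().inv_cdf(1.0 - alpha)
    return {"reject_for_d_greater": r > z, "reject_for_d_smaller": r < -z}

# Illustration with do = 0: a random walk should point to d > do,
# an over-differenced white noise series to d < do.
rng = random.Random(1)
e = [rng.gauss(0.0, 1.0) for _ in range(200)]
walk, level = [], 0.0
for v in e:
    level += v
    walk.append(level)
over = [e[0]] + [e[t] - e[t - 1] for t in range(1, len(e))]
r_walk, r_over = r_hat(walk), r_hat(over)
```

In a full application the input to r_hat would be the OLS residuals from the regression on the transformed Chebyshev terms, rather than the raw series used in this illustration.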
Next we perform a similar experiment in non-Gaussian contexts. For this purpose, we examine the same null model as in Table 1 but assuming now that the disturbances are Student-t distributed with 3 degrees of freedom. This distribution is interesting because it just satisfies the second moment condition required in the test, its third moments not existing. The results, displayed in Table 2, are competitive with the Gaussian ones, with the sizes being closer to the nominal one of 5% in practically all cases. If we focus on the rejection frequencies, they tend to be slightly larger for values of do<1, and lower when do>1 compared with Table 1. Very similar results were obtained if weak autocorrelation [AR(1) and AR(2)] is permitted for the I(0) disturbance term, and the same applies for other values of d in (14).
Next we examine the possibility of misspecification with respect to the deterministic terms. In particular, we suppose that the true data refers to a pure long memory process with d=0.25 (and d=0.75) and implement first (in Table 3) the test statistic with m in (14) equal to 4, and then, in Table 3, with m=2. As in the previous case we look at the power and the rejection frequencies with do=0, 0.25, 0.50, 0.75 and 1, and we focus on one-sided alternatives.
In Table 3 we focus on m=3. Starting with a pure long memory process with d=0.25, the rejection frequencies are very high even for small sample sizes and small departures from the null. The same happens with d=0.75 though in this case the rejection probabilities are quite small if T=100 and do=1 (0.428).
As an additional robustness check and noting that fractional integration and mean shifts are very much related issues, we examine the possibility of a mean shift in the data generating process, with short memory (d=0) and long memory (d=0.25 and d=0.75) processes, and investigate the rejection frequencies of the test under potential non-linear (m=2 and m=3) structures. Table 4 focuses on the case of m=3 and we observe that in the case of short memory for the data generating process the rejection frequencies are very high even for small sample sizes. Thus, if do=±0.25, the rejection probabilities are 0.979 (with do=−0.25) and 0.848 (with do=0.25) and they are exactly 1 if T=300 or 500. Very similar results are obtained with d=0.25 and d=0.75, and in all cases the size of the test approaches the nominal level of 5% as we increase the number of observations.
As an additional robustness check we tried alternative deterministic components, including monotonic (yt=3t^{−0.1}+xt) and non-monotonic (yt=sin(4πt/T)+xt) deterministic trends as in Qu (2011), with short memory (d=0) and long memory (d>0) processes for xt, and performed the same type of analysis as above, obtaining almost identical results in terms of high rejection frequencies and sizes that approach the nominal level as we increase the sample size. Results in these cases are available from the authors upon request. Similar results were obtained in the context of measurement errors with data contaminated by noise (see, e.g. Grassi and Santucci de Magistris 2014).
5 An empirical application
In this section we apply the fractional integration tests with Chebyshev polynomials to examine the mean reversion of real exchange rates and the PPP theory. The absolute version of the PPP theory postulates that the price levels in two different countries should converge when measured in the same currency, so as to equalize the purchasing power of the currencies in both places. This, therefore, implies that the real exchange rate, defined as the ratio of prices in both countries, translated to a common currency using the nominal exchange rate, should converge to 1. However, it is well known within the literature that the absolute version of the PPP hypothesis may be too restrictive. Hence, a less restrictive version of PPP is the relative PPP hypothesis, which implies that prices in common currency may converge to a constant different from 1. This relative version of the PPP implies then that what is actually expected in the long run is that the real exchange rate should be reverting to a constant, which may be different from 1. The intuition behind this is related to the fact that because of the existence of trade barriers, transport costs, and different measures of price indices, there may be a gap between price levels in different countries. Hence, on average, changes in real exchange rates should be zero, according to the relative version of the PPP theory.
In view of the above comments, testing for mean reversion becomes of paramount importance when testing for the empirical validity of the PPP theory, which, at the same time, can be seen as a measure of the degree of over/under-valuation of the currencies, and is used as a basis for a number of macroeconomic models, e.g. the Dornbusch model. However, convergence of the real exchange rate, on average, to a constant over time may not be very realistic, in particular when countries experience different levels of economic growth and productivity gains, or when countries undergo changes in economic fundamentals, which may indeed change the equilibrium value of real exchange rates. For instance, the well-known dynamic Penn effect and the Balassa-Samuelson effect may induce deterministic trends in the data (see Lothian and Taylor 2000, among others), and the existence of structural changes may, in addition, induce changes in those trends. Hence, it is important to control for non-linear deterministic trends when testing for real exchange rate mean reversion.
In a recent contribution, Cushman (2008) tests for the PPP hypothesis using the Bierens (1997) unit root tests for bilateral exchange rates. He finds evidence to support that real exchange rates may in fact contain non-linear trends. However, it is not possible to test for the significance of these trends, unless the null is rejected. (See also Cuestas 2009, and Cuestas and Mourelle 2011.)
As just mentioned, our newly developed fractional integration testing procedure, taking into account Chebyshev polynomials to approximate non-linear deterministic trends, solves these problems with the flexibility of having non-integer orders of integration. Given that the residuals of the auxiliary regression are I(0) stationary by assumption, t-statistics are valid to test for the significance of the non-linear trends. This novelty solves the problem of choosing the order of the Chebyshev polynomials, which was not clearly defined by Bierens (1997). Thus, we can start from a fairly general degree of non-linearity (e.g. m=3) and check the t-values of the estimated coefficients, removing those which are found statistically insignificant. We stop with the model with all significant coefficients.
The data used in the empirical application are real effective exchange rates against each country’s 27 main trade partners, downloaded from Eurostat (code ert_eff_ic_q) for 40 countries, with different degrees of economic integration and development. We have used quarterly data from 1994:Q1 until 2011:Q3.
Throughout this section we consider the following model,

$$y_t = \sum_{i=0}^{m} \theta_i P_{i,T}(t) + x_t, \qquad (1 - L)^d x_t = u_t, \quad t = 1, 2, \ldots, \qquad (17)$$

assuming that ut is a white noise process. The use of autoregressions for the error term ut in (17) produced coefficients close to 0 in all cases. In fact, we also conducted an LR test to determine whether the error term should be white noise or an AR(1) process, and the results strongly support the white noise specification in all cases.
Although the results are not reported here (they are available upon request), we estimate d and the 95% confidence bands of the non-rejection values of d for the cases of m=0, 1, 2 and 3. Higher values of m lead to non-significant coefficients for θi (i>3) in all cases. These estimates were obtained using the Whittle function in the frequency domain and they coincide with the values of do that produce the lowest statistics in absolute value when using our (one-sided) testing approach with a fine grid of do-values (with 0.001 increments). We observe that the values of d are, in general, very similar across the different values of m, with a slight reduction in the degree of integration as we increase m. We also notice that most of the estimates of d are within the unit root interval and some of them are even significantly above 1. The only evidence of mean reversion (i.e. d significantly below 1) is obtained for the cases of Cyprus, Greece and Malta (for all values of m) and for France and Spain if m=2 or 3, i.e. assuming the existence of non-linearities. The results also point out that it is possible to reduce the order of integration of the variable by artificially increasing the order of the Chebyshev polynomials, m. This is consistent with other works that show that fractional integration and non-linearities are issues which are intimately related (Diebold and Inoue 2001; Granger and Hyung 2004; etc.). Given that, as mentioned above, inference based on t-statistics remains valid, this approach makes the selection of the appropriate deterministic component much easier.
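The grid-based estimate of d described above can be sketched as follows. This Python sketch is a simplified illustration, not the authors' implementation: it omits the Chebyshev terms for brevity, assumes white noise ut and a single pole at the zero frequency, and uses a coarse grid with 0.1 increments instead of the 0.001 increments used in the paper. The names frac_diff, r_hat and estimate_d are hypothetical.

```python
import cmath
import math
import random

def frac_diff(x, d):
    """Apply (1 - L)^d to x, assuming x_t = 0 for t <= 0."""
    T = len(x)
    coef = [1.0]
    for k in range(1, T):
        coef.append(coef[-1] * (k - 1 - d) / k)
    return [sum(coef[k] * x[t - k] for k in range(t + 1)) for t in range(T)]

def r_hat(u):
    """One-sided statistic from residuals u (white noise case, M = 1)."""
    T = len(u)
    a_sum, psi2_sum, i_sum = 0.0, 0.0, 0.0
    for j in range(1, T):
        lam = 2.0 * math.pi * j / T
        dft = sum(u[t] * cmath.exp(1j * lam * (t + 1)) for t in range(T))
        I = abs(dft) ** 2 / (2.0 * math.pi * T)   # periodogram ordinate
        psi = math.log(abs(2.0 * math.sin(lam / 2.0)))
        a_sum += psi * I
        psi2_sum += psi * psi
        i_sum += I
    a = -(2.0 * math.pi / T) * a_sum
    A = (2.0 / T) * psi2_sum
    sigma2 = (2.0 * math.pi / T) * i_sum
    return math.sqrt(T / A) * a / sigma2

def estimate_d(x, grid, z=1.96):
    """Return the grid value minimizing |r_hat| and the non-rejection band."""
    stats = {d0: r_hat(frac_diff(x, d0)) for d0 in grid}
    d_hat = min(grid, key=lambda d0: abs(stats[d0]))
    band = [d0 for d0 in grid if abs(stats[d0]) < z]
    return d_hat, band

# Illustration on a simulated random walk (true d = 1).
rng = random.Random(7)
walk, level = [], 0.0
for _ in range(120):
    level += rng.gauss(0.0, 1.0)
    walk.append(level)
grid = [k / 10.0 for k in range(0, 21)]
d_hat, band = estimate_d(walk, grid)
```

The band collects the do-values at which the null is not rejected in a two-sided sense, which mirrors the confidence bands of non-rejection values mentioned in the text; a finer grid would bring the point estimate closer to what a Whittle estimator delivers.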
The summary of the results (based only on the significant Chebyshev coefficients at the 5% level) is reported in Table 5. We see that strong evidence of non-linearities (with the two non-linear coefficients statistically significantly different from zero) is obtained for the cases of Cyprus, France, Malta, Spain, Germany, Hong Kong and Lithuania. In the first four cases, the unit root hypothesis is rejected in favour of mean reversion, while in the remaining three cases, though the estimated values of d are smaller than 1, the unit root cannot be rejected. Evidence of non-linearity with significant θ2-coefficient is observed for Austria, Greece and Slovakia, the unit root being rejected in favour of mean reversion in the case of Greece. Also, for some countries only one of the two non-linear coefficients is significant, such as China (with only θ3 being statistically significant, and an estimate of d of 0.979) as well as Bulgaria and Latvia (with d=0.827 and 1.197 respectively), and also Belgium, Brazil and the UK (with θ2 significant but not θ3), with the unit root not being rejected. For the remaining cases, only an intercept or a linear trend is required.
We also conducted the analysis based on weakly autocorrelated errors. We tried both seasonal and non-seasonal autoregressions and the results, not displayed, indicate that, though there are some quantitative differences when computing the results based on autocorrelated errors, qualitatively the same conclusions hold, since the cases of “mean reversion,” “unit roots” or “explosive roots” correspond to exactly the same series as in the case of white noise errors. As mentioned earlier, LR tests also support the white noise specification in all cases.
Our results point to a few economic insights. We first observe that in many cases structural breaks in the form of non-linear trends are present in the data. Second, for a number of countries, for instance the Czech Republic and Hungary, a linear trend is enough to approximate the data. This implies that the Balassa-Samuelson effect might be present, which makes economic sense given the process of catching-up with Western Europe during the transition period from communism to market economies. Finally, in all cases of mean reversion, it occurs along with structural breaks. Comparing our results to those of Cushman (2008), although the results are not directly comparable, we can say that we find evidence of mean reversion using a lower order for the Chebyshev polynomials. An approach similar to the one presented here was recently applied by Caporale, Carcel, and Gil-Alana (2015), who examine the inflation rates in five African countries: Angola, Lesotho, Botswana, Namibia and South Africa. Evidence of non-linearities was found in the former two countries but not in the other three.
6 Concluding comments
In this paper we have examined a model that incorporates non-linear Chebyshev polynomials in time in the context of long range dependence. For the latter we use a very general expression that permits us to examine stationary and non-stationary hypotheses with one or more unit or fractional degrees of integration with the singularities in the spectrum occurring at zero and non-zero frequencies. The main advantage of this model is that combining the two structures (non-linear Chebyshev polynomials and fractional integration) leads to a new model that is linear in parameters, permitting the estimation of the Chebyshev coefficients in a very simple way. Moreover, we describe a testing procedure, originally proposed by Robinson (1994) for the linear case, that displays several advantages in the present context. Thus, it allows us to test any real vector value for the differencing parameters, including stationary and non-stationary hypotheses; and the incorporation of the Chebyshev polynomials allows their estimation with a straightforward method, including the assessment of the significance of the coefficients through standard t-values. The limit distribution of the test statistic is a standard χ2, and several Monte Carlo experiments conducted in the paper show it performs well even with small samples. A small empirical application based on this approach and using real effective exchange rates is also conducted in the paper. In this application, we find that it is possible to reduce the order of the polynomials in comparison with previous studies and that non-linear trends tend to be significant when explaining the evolution of real exchange rates in a number of countries. This is the case in sixteen out of the forty countries analysed. Hence, we show that our proposed test could be an effective way to estimate trends/cycles along with the orders of integration of the variables.
The authors gratefully acknowledge Jenny Roberts and Kostas Mouratidis, the editor and two anonymous referees for helpful comments. The usual disclaimer applies. Luis A. Gil-Alana gratefully acknowledges financial support from the Ministry of Education of Spain (ECO2011-2014 ECON Y FINANZAS, Spain) and from a Jeronimo de Ayanz project of the Government of Navarra. Juan Carlos Cuestas acknowledges the MINECO (Ministerio de Economía y Competitividad, Spain) research grant ECO2014-58991-C3-2-R.
The Chebyshev polynomials used by Bierens (1997), Bierens and Martins (2010), and in the present paper, are the ones of the “first kind.” They are a special case of the Gegenbauer polynomials. In their standard form they are defined as T_n(x) = cos(n arccos x), for x in [−1, 1] and n = 0, 1, 2, ….
These polynomials satisfy the Weierstrass theorem and the minimax criterion, which basically establishes that the polynomials can be approximated at the set of zeros occurring when
for i=0, 1, …, n, defining the order of the polynomials, and they only exist in the interval [−1, 1].
Chebyshev polynomials are orthogonal with respect to the weighting function w(x) = (1 − x²)^(−0.5), implying that
with δ_ij being the Kronecker delta. Given that the polynomials are orthogonal, the terms of any expansion in them will be independent. This is a nice property for regression modelling, since it eliminates any collinearity in the regressors matrix. This implies that the estimated coefficients of any non-linear approximation using Chebyshev polynomials are independent of the order of the polynomials. Hence, the estimation of redundant parameters does not affect the rest. Additionally, the standard errors of the estimated parameters of an approximation using these polynomials are the same.
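The orthogonality property can be checked numerically. The sketch below is only an illustration, not part of the paper's procedure: it assumes the standard first-kind definition T_n(x) = cos(n arccos x) and the weight (1 − x²)^(−1/2), and evaluates the weighted integrals by Gauss-Chebyshev quadrature. The function names are hypothetical. Under these assumptions the weighted integrals are zero for i ≠ j, π for i = j = 0, and π/2 for i = j > 0.

```python
import numpy as np


def chebyshev_T(n, x):
    """First-kind Chebyshev polynomial T_n(x) = cos(n * arccos(x)) on [-1, 1]."""
    return np.cos(n * np.arccos(x))


def weighted_inner_product(i, j, nodes=200):
    """Approximate the integral of T_i(x) T_j(x) (1 - x^2)^(-1/2) over [-1, 1].

    Gauss-Chebyshev quadrature: nodes x_k = cos((2k - 1) pi / (2 * nodes)),
    each with weight pi / nodes. The rule is exact while i + j < 2 * nodes.
    """
    k = np.arange(1, nodes + 1)
    x = np.cos((2 * k - 1) * np.pi / (2 * nodes))
    return np.pi / nodes * np.sum(chebyshev_T(i, x) * chebyshev_T(j, x))


def gram_matrix(order):
    """Matrix of weighted inner products for T_0, ..., T_order."""
    return np.array([[weighted_inner_product(i, j) for j in range(order + 1)]
                     for i in range(order + 1)])


if __name__ == "__main__":
    # Off-diagonal entries vanish up to rounding error.
    print(np.round(gram_matrix(4), 6))
```

A diagonal Gram matrix is what removes collinearity among the regressors: adding or dropping a higher-order term leaves the remaining coefficients unchanged.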
We define f(x) as a function to be approximated by non-linear terms using regular polynomials,
One of the interesting properties for mathematical modelling is the fact that it is possible to reduce the degree of polynomials by means of “economization of power series,” which consists of rewriting the series of regular polynomials in terms of Chebyshev polynomials,
for n=0, 1, 2, …. Since T_n(x) takes values in the interval [−1, 1], the number of terms which can be eliminated from the approximation depends on the size of the coefficients β_n.
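As an illustration of economization, the following sketch takes a degree-5 power series (the Taylor expansion of exp(x), chosen here only as an example), rewrites it in the Chebyshev basis with NumPy's `poly2cheb`, drops the two highest-order terms, and converts back with `cheb2poly`. The helper name `economize` and the choice of function are assumptions made for the example. Because |T_n(x)| ≤ 1 on [−1, 1], the error introduced by dropping a term is bounded by the absolute value of its coefficient β_n.

```python
import math

import numpy as np
from numpy.polynomial import chebyshev as C
from numpy.polynomial import polynomial as P


def economize(power_coeffs, keep_degree):
    """Lower the degree of a power series via its Chebyshev expansion.

    power_coeffs: coefficients of 1, x, x^2, ... (ascending powers).
    Returns (economized power-series coefficients, Chebyshev coefficients beta_n).
    """
    beta = C.poly2cheb(power_coeffs)      # same polynomial, Chebyshev basis
    truncated = beta[:keep_degree + 1]    # drop the highest-order T_n terms
    return C.cheb2poly(truncated), beta


if __name__ == "__main__":
    taylor5 = [1.0 / math.factorial(k) for k in range(6)]  # exp(x) up to x^5
    econ3, beta = economize(taylor5, keep_degree=3)

    x = np.linspace(-1.0, 1.0, 2001)
    err_econ = np.max(np.abs(np.exp(x) - P.polyval(x, econ3)))
    err_taylor3 = np.max(np.abs(np.exp(x) - P.polyval(x, taylor5[:4])))
    print("dropped coefficients beta_4, beta_5:", beta[4], beta[5])
    print("max error, economized cubic:", err_econ)
    print("max error, truncated Taylor cubic:", err_taylor3)
```

In this example the economized cubic has a smaller maximum error on [−1, 1] than the cubic obtained by simply truncating the power series.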
a) The case of a cyclical I(d) model
In the previous case, the spectral density function is unbounded at the long run or zero frequency. However, the pole or singularity in the spectrum may occur at a non-zero frequency. In such a case we can consider ρ(L; d) = (1 − 2 cos w_r L + L²)^d, with w_r = 2πr/T, r = T/s, and thus s will indicate the number of time periods per cycle, while r refers to the frequency that has a pole or singularity in the spectrum of the series. Gray, Zhang, and Woodward (1989, 1994) showed that this polynomial can be expressed in terms of the Gegenbauer polynomial, such that, denoting μ = cos w_r, for all d ≠ 0,
where C_{j,d}(μ) are orthogonal Gegenbauer polynomial coefficients defined recursively as:
(see Rainville 1960; Magnus, Oberhettinger, and Soni 1966, etc. for further details on Gegenbauer polynomials). This type of process was introduced by Andel (1986) and subsequently analysed by Gray, Zhang, and Woodward (1989, 1994), Chung (1996a,b), Gil-Alana (2001), Giraitis et al. (2001), Hidalgo (2005), and Dalla and Hidalgo (2005) among many others.
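A minimal sketch of how such coefficients can be computed is given below. It assumes the textbook Gegenbauer recursion, C_0 = 1, C_1 = 2μd, and C_j = 2μ((d − 1)/j + 1) C_{j−1} − (2(d − 1)/j + 1) C_{j−2}, which generates the coefficients of the expansion of (1 − 2μL + L²)^(−d) in powers of L; the function name is hypothetical. Two known special cases serve as checks: for d = 1 the coefficients are sin((j + 1)w)/sin(w) with μ = cos w, and for μ = 1 the filter collapses to (1 − L)^(−2d).

```python
import math


def gegenbauer_coefficients(d, mu, n_terms):
    """First n_terms coefficients C_{j,d}(mu) of (1 - 2*mu*L + L^2)^(-d).

    Uses the standard three-term Gegenbauer recursion (an assumption of this
    sketch, taken from textbook treatments of Gegenbauer polynomials).
    """
    if n_terms < 1:
        raise ValueError("n_terms must be at least 1")
    coeffs = [1.0]
    if n_terms > 1:
        coeffs.append(2.0 * mu * d)
    for j in range(2, n_terms):
        c_j = (2.0 * mu * ((d - 1.0) / j + 1.0) * coeffs[j - 1]
               - (2.0 * (d - 1.0) / j + 1.0) * coeffs[j - 2])
        coeffs.append(c_j)
    return coeffs


if __name__ == "__main__":
    # Cycle of s = 12 periods: w = 2*pi/12, mu = cos(w), fractional order d = 0.3.
    w = 2.0 * math.pi / 12.0
    for j, c in enumerate(gegenbauer_coefficients(0.3, math.cos(w), 8)):
        print(j, round(c, 6))
```

The printed coefficients oscillate with the period of the cycle while decaying slowly, which is the time-domain counterpart of a spectral pole at a non-zero frequency.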
In this case, M is also equal to 1, and
b) The case of multiple cycles
We can also study the case of processes that contain multiple poles or singularities in the spectrum, where ρ(L; d) is a product of factors of the cyclical type described above, one for each pole. These processes were introduced by Giraitis and Leipus (1995), Woodward, Cheng, and Ray (1998), Ferrara and Guegan (2001), and Sadek and Khotanzad (2004) among others. One special case here is the seasonal I(d) model that, using a very simple specification, may be expressed as ρ(L; d) = (1 − L^s)^d,
s indicating the number of time periods per year. Thus, for example, for quarterly data, s=4, and it is a particular case of d) with M=3, and 0, π/2 and π, respectively, for (u)=1, 2 and 3. These processes were introduced by Porter-Hudak (1990) and have been subsequently examined by Ray (1993), Sutcliffe (1994) and Gil-Alana and Robinson (2001) and others.
If s=4 and ρ(L; d) = (1 − L⁴)^d, then M=1,¹³ and ψ(λ_j) becomes:
and allowing for a greater degree of generality, we can consider the case of different orders of integration at each frequency. In this case, M=3 and ψ(λ_j) becomes a (3×1) vector of the form:
(See Gil-Alana and Robinson, 2001; Ooms and Franses, 2001.)
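For quarterly data, the three frequencies 0, π/2 and π come from the factorization 1 − L⁴ = (1 − L)(1 + L)(1 + L²): the factor (1 − L) has its root at frequency 0, (1 + L) at π, and (1 + L²) at π/2. The sketch below checks this numerically; it is only an illustration, and the function names are hypothetical.

```python
from functools import reduce

import numpy as np


def multiply_lag_polynomials(factors):
    """Multiply lag polynomials given as coefficient lists in ascending powers of L."""
    return reduce(np.convolve, [np.asarray(f, dtype=float) for f in factors])


def pole_frequencies(s):
    """Frequencies in [0, pi] of the unit-circle roots of 1 - z^s."""
    descending = np.zeros(s + 1)
    descending[0] = -1.0   # coefficient of z^s
    descending[-1] = 1.0   # constant term
    roots = np.roots(descending)
    return np.unique(np.round(np.abs(np.angle(roots)), 8))


if __name__ == "__main__":
    quarterly = multiply_lag_polynomials([[1, -1], [1, 1], [1, 0, 1]])
    print("product of factors:", quarterly)       # coefficients of 1 - L^4
    print("pole frequencies:", pole_frequencies(4))
```

Allowing a separate differencing parameter for each of the three factors gives the M = 3 case, whereas a single common parameter on 1 − L⁴ gives the M = 1 case.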
c) The case of Bloomfield (1973) disturbances
Finally, we can suppose that the disturbances ut follow a non-parametric approach due to Bloomfield (1973). This model does not provide an explicit formula for the error term, but it is implicitly determined by its spectral density function, which is given by
where N indicates the number of parameters required to describe the short run dynamics. Bloomfield (1973) showed that the logarithm of an estimated spectral density function is often found to be a fairly well behaved function and thus can be approximated by a truncated Fourier series. He showed that (13) approximates the spectral density of an ARMA(p, q) process well when p and q are small, which is usually the case for most economic time series. Like the stationary AR model, this model has exponentially decaying autocorrelations and thus, using this specification, one does not need to rely on as many parameters as in the case of ARMA processes. Moreover, it fits extremely well into the testing procedure presented above. Thus, the formulae for Newton-type iterations for estimating the τ_j are very simple (involving no matrix inversion), the updating formulae when N is increased are also simple, and we can replace Â in the functional form of the test statistic in (11) by the population quantity:
which indeed is constant with respect to the τ_j.¹⁴
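To illustrate how this exponential model can mimic an autoregression, the sketch below assumes the usual form of the Bloomfield (1973) spectral density, f(λ; τ) = (σ²/2π) exp(2 Σ_{r=1}^{N} τ_r cos(λr)), and compares it with an AR(1) spectrum. The choice τ_r = φ^r/r follows from the Fourier expansion of −log(1 − 2φ cos λ + φ²) and is used here only for illustration; the function names are hypothetical.

```python
import numpy as np


def bloomfield_spectral_density(lam, tau, sigma2=1.0):
    """Exponential spectral model: sigma2/(2 pi) * exp(2 * sum_r tau_r cos(lam r))."""
    lam = np.atleast_1d(np.asarray(lam, dtype=float))
    tau = np.asarray(tau, dtype=float)
    r = np.arange(1, tau.size + 1)
    exponent = 2.0 * np.cos(np.outer(lam, r)) @ tau
    return sigma2 / (2.0 * np.pi) * np.exp(exponent)


def ar1_spectral_density(lam, phi, sigma2=1.0):
    """Spectral density of a stationary AR(1) process with coefficient phi."""
    lam = np.atleast_1d(np.asarray(lam, dtype=float))
    return sigma2 / (2.0 * np.pi) / (1.0 - 2.0 * phi * np.cos(lam) + phi ** 2)


if __name__ == "__main__":
    phi = 0.5
    lam = np.linspace(0.0, np.pi, 200)
    target = ar1_spectral_density(lam, phi)
    for n_params in (1, 2, 3, 10):
        tau = [phi ** r / r for r in range(1, n_params + 1)]
        approx = bloomfield_spectral_density(lam, tau)
        rel_err = np.max(np.abs(approx / target - 1.0))
        print(f"N = {n_params:2d}: max relative error = {rel_err:.2e}")
```

The maximum relative error shrinks quickly as N grows, in line with the statement that few parameters are needed when the short run dynamics are simple.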
Other simple specifications can also be obtained from the general formula in (11). If ut is ARMA(p, q) of the form:
then, the function g becomes:
and in the simplest AR(p) case,
implying that the lth element of ε(λ_j) becomes:
Andel, J. 1986. “Long Memory Time Series Models.” Kybernetika 22: 105–123.
Beran, J., R. J. Bhansali, and D. Ocker. 1998. “On Unified Model Selection for Stationary and Nonstationary Short- and Long-memory Autoregressive Processes.” Biometrika 85: 921–934.
Bierens, H. J. 1997. “Testing the Unit Root with Drift Hypothesis Against Trend Stationarity with an Application to the US Price Level and Interest Rate.” Journal of Econometrics 81: 29–64.
Caporale, G. M., and L. A. Gil-Alana. 2007. “Non-linearities and Fractional Integration in the US Unemployment Rate.” Oxford Bulletin of Economics and Statistics 69 (4): 544–608.
Caporale, G. M., H. Carcel, and L. A. Gil-Alana. 2015. “Modelling African Inflation Rates: Nonlinear Deterministic Terms and Long-range Dependence.” Applied Economics Letters 22: 421–424.
Christopoulos, D., and M. A. León-Ledesma. 2010. “Smooth Breaks and Non-linear Mean Reversion: Post-Bretton Woods Real Exchange Rates.” Journal of International Money and Finance 29: 1076–1093.
Gil-Alana, L. A. 2004. “The Use of Bloomfield (1973) Model as an Approximation to ARMA Processes in the Context of Fractional Integration.” Mathematical and Computer Modelling 39: 429–436.
Gil-Alana, L. A., and P. M. Robinson. 2001. “Testing of Seasonal Fractional Integration in the UK and Japanese Consumption and Income.” Journal of Applied Econometrics 16: 95–114.
Godfrey, L. G. 1978a. “Testing Against General Autoregressive and Moving Average Error Models when the Regressors Include Lagged Dependent Variables.” Econometrica 43: 1293–1301.
Granger, C. W. J., and N. Hyung. 2004. “Occasional Structural Breaks and Long Memory with an Application to the S&P 500 Absolute Stock Returns.” Journal of Empirical Finance 11: 399–421.
Hamming, R. W. 1973. Numerical Methods for Scientists and Engineers. Mineola, New York: Dover Publications.
Hannan, E. J., and B. G. Quinn. 1979. “The Determination of the Order of an Autoregression.” Journal of the Royal Statistical Society, Series B, 41: 190–195.
Kaulakys, B. 2000. “On the Inherent Origin of 1/f Noise.” Lithuanian Journal of Physics 40: 281–286.
Lothian, J., and M. P. Taylor. 2000. “Purchasing Power Parity Over Two Centuries: Strengthening the Case for Real Exchange Rate Stability. A Reply to Cuddington and Liang.” Journal of International Money and Finance 19: 759–764.
Magnus, W., F. Oberhettinger, and R. P. Soni. 1966. Formulas and Theorems for the Special Functions of Mathematical Physics. Berlin: Springer.
Müller, U., and M. Watson. 2008. “Testing Models of Low Frequency Variability.” Econometrica 76 (5): 979–1016.
Ninness, B. 1998. “Estimation of 1/f Noise.” IEEE Transactions on Information Theory 44 (1): 32–46.
Ooms, M., and P. H. Franses. 2001. “A Seasonal Periodic Long Memory Model for Monthly River Flows.” Environmental Modelling and Software 16: 559–569.
Ouliaris, S., J. Y. Park, and P. C. B. Phillips. 1989. “Testing for a Unit Root in the Presence of a Maintained Trend.” In Advances in Econometrics and Modelling, edited by B. Ray, 6–28. Dordrecht: Kluwer.
Press, W. H., B. P. Flannery, S. A. Teukolsky, and W. T. Vetterling. 1986. Numerical Recipes: The Art of Scientific Computing. Cambridge: Cambridge University Press.
Rainville, E. D. 1960. Special Functions. New York: MacMillan.
Robinson, P. M. 1978. “Statistical Inference for a Random Coefficient Autoregressive Model.” Scandinavian Journal of Statistics 5: 163–168.
Sadek, N., and A. Khotanzad. 2004. “K-factor Gegenbauer ARMA Process for Network Traffic Simulation.” Computers and Communications 2: 963–968.
Tomasevic, N., M. Tomasevic, and T. Stanivuk. 2009. “Regression Analysis and Approximation by Means of Chebishev Polynomials.” Informatologia 42: 166–172.
Zivot, E., and D. W. K. Andrews. 1992. “Further Evidence on the Great Crash, the Oil Price Shock, and the Unit Root Hypothesis.” Journal of Business and Economic Statistics 10: 251–270.
The online version of this article (DOI: 10.1515/snde-2014-0005) offers supplementary material, available to authorized users.
A paper about model selection in the presence of long and short memory processes is Beran, Bhansali, and Ocker (1998). They propose versions of the AIC, BIC and the HQ (Hannan and Quinn 1979) which are suitable for fractional autoregressions, but do not consider MA components.
Although Robinson (1994) focuses exclusively on the linear case, he argues (page 1421) that “(…) undoubtedly a non-linear regression will also leave our limit distributions unchanged, under standard regularity conditions.” These conditions can be found in Robinson (1994).
Lobato and Velasco (2007) proposed a Wald test, which is asymptotically equivalent to the LM test of Robinson (1994), though it focuses exclusively on the long run or zero frequency, not taking into account alternative poles in the spectrum.
In the case of AR disturbances, the lowest power was obtained when the roots in the AR polynomials were close to the unit circle. This is something that is faced in all procedures when dealing with AR structures and fractional integration given the competition between the two structures in describing the nonstationarity.
We conducted several diagnostic tests on the residuals of the estimated models (tests of no additional serial correlation, Durbin 1970; Godfrey 1978a,b and homoscedasticity, Koenker 1981), and the results pass the diagnostics in all cases for the error terms.
See Gil-Alana (2004) for an explanation of the accommodation of the model of Bloomfield (1973) in the context of fractional integration, and more in particular, in the context of the tests of Robinson (1994).
About the article
Published Online: 2015-06-22
Published in Print: 2016-02-01
Citation Information: Studies in Nonlinear Dynamics & Econometrics, ISSN (Online) 1558-3708, ISSN (Print) 1081-1826, DOI: https://doi.org/10.1515/snde-2014-0005.
©2015, Luis Alberiko Gil-Alana et al., published by De Gruyter. This work is licensed under the Creative Commons Attribution-NonCommercial-NoDerivatives 3.0 License (BY-NC-ND 3.0).