Published by De Gruyter May 21, 2015

A Test of the Long Memory Hypothesis Based on Self-Similarity

James Davidson and Dooruj Rambaccussing


This paper develops a new test of true versus spurious long memory, based on log-periodogram estimation of the long memory parameter using skip-sampled data. A correction factor is derived to overcome the bias in this estimator due to aliasing. The procedure is designed to be used in the context of a conventional test of significance of the long memory parameter, and a composite test procedure is described that has the properties of known asymptotic size and consistency. The test is implemented using the bootstrap, with the distribution under the null hypothesis being approximated using a dependent-sample bootstrap technique to approximate short-run dependence following fractional differencing. The properties of the test are investigated in a set of Monte Carlo experiments. The procedure is illustrated by applications to exchange rate volatility and dividend growth series.

1 Introduction

Estimation of the long memory parameter $d$ by the method due to Geweke and Porter-Hudak (1983), or one of its variants, is a popular methodology in time series analysis. This estimator (henceforth, GPH) exploits the fact that the autocovariances of a long memory process are nonsummable, and the spectral density $f$ accordingly diverges at the origin at a particular rate, with $f(\lambda)=O(|\lambda|^{-2d})$ as $\lambda\to 0$. GPH estimates $d$ by regressing the logarithms of the periodogram points in the neighbourhood of zero onto a suitable trend. However, except in very large samples this method has well-known limitations. As documented by Agiakloglou, Newbold, and Wohar (1993), the neglect of components of $f$ representing short-run autocorrelation implies omitted terms in the regression, resulting in potentially substantial bias. In particular, the method is problematic as a basis for testing the null hypothesis of short memory, the case $d=0$, since the conventional Wald statistic can be severely over-sized.
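As a rough illustration of the log-periodogram regression just described, the following Python sketch regresses the log periodogram on $-2\log(2\sin(\lambda_j/2))$ over the first $M$ Fourier frequencies. It assumes NumPy is available; the function name `gph_estimate`, the default bandwidth exponent and the simulated series are illustrative choices, not taken from the paper.

```python
import numpy as np

def gph_estimate(x, q=0.5):
    """Minimal GPH sketch: OLS slope of log I(lambda_j) on -2*log(2*sin(lambda_j/2))."""
    x = np.asarray(x, dtype=float)
    T = x.size
    M = int(T ** q)                            # bandwidth M = [T^q] (illustrative q)
    j = np.arange(1, M + 1)
    lam = 2.0 * np.pi * j / T                  # Fourier frequencies closest to zero
    dft = np.fft.fft(x - x.mean())[1:M + 1]    # DFT at frequencies lambda_1..lambda_M
    I = np.abs(dft) ** 2 / (2.0 * np.pi * T)   # periodogram ordinates
    X = -2.0 * np.log(2.0 * np.sin(lam / 2.0)) # regressor
    a = X - X.mean()
    return float(a @ np.log(I) / (a @ a))      # slope estimate of d

rng = np.random.default_rng(0)
white = rng.standard_normal(4096)   # short memory, d = 0
walk = np.cumsum(white)             # unit root, d = 1
d_white = gph_estimate(white)       # should lie close to 0
d_walk = gph_estimate(walk)         # should lie close to 1
```

With a bandwidth this narrow the estimator is noisy, which is the slow-convergence issue the text goes on to discuss.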

A simple illustration of this difficulty is provided by the observational equivalence between the fractionally integrated process $(1-L)^d x_t=u_t$ with $d=1$ and the autoregressive process $(1-\phi L)x_t=u_t$ with $\phi=1$. The ARFI model

$$(1-\phi L)(1-L)^d x_t=u_t$$

can exhibit a characteristically bimodal likelihood function when either of the parameters ϕ and d is close to unity in the process generating the sample. For every finite sample size, there exists a ϕ close enough to unity to bias the GPH estimator of d significantly, when its true value is zero. It is desirable to have a means of distinguishing the cases of true d and spurious d, and goodness-of-fit criteria are an unreliable guide.

The approach explored in this paper is to devise a test with null and alternative interchanged. Recent research has highlighted the well-known property of self-similarity of hyperbolic decay processes under transformations such as periodic aggregation and periodic sub-sampling, otherwise known as skip-sampling. Chambers (1998) was the first to point out that if a long memory process is recorded at different rates, the rate of decay of the autocovariances is invariant to the rate of observation.

There are two ways to conceive of lowering the observation rate. Temporal aggregation means taking the sums of N successive observations to create the new sequence. This is the natural transformation for flow data, such that (for example) quarterly flows are each the sum of three successive monthly flows. Ohanissian, Russell, and Tsay (2008) implement a test of long memory based on comparing log-periodogram estimates under different rates of temporal aggregation.

Skip-sampling, by contrast, means taking every $N$th observation and discarding the remainder. This is the natural way of lowering the observation rate for stock or price data, although for the present purpose the nature of the observations is irrelevant, since the required properties of the skip-sampled series hold in all cases. Consider this transformation in the context of hyperbolic memory decay. Let the parameter $\delta$ index the rate of decay such that the autocovariance sequence $\{\gamma_j, j=1,2,\ldots\}$ of a stationary process satisfies

$$\gamma_j=O(j^{-\delta}) \qquad [1]$$

for some $\delta>0$. The hyperbolic memory class includes short memory processes having summable autocovariances, such that $\delta>1$, and long memory processes where $\delta=1-2d$ for $0<d<\frac{1}{2}$, and hence $0<\delta<1$. It is immediately evident that, for any fixed, finite $N$,

$$\gamma_{Nj}=O(j^{-\delta}).$$

It follows that for the long memory class, the property of the spectral density near the origin should likewise be invariant to the sampling frequency.

This is in contrast to the case of exponential memory decay where $\gamma_j=o(j^{-\delta})$ for every finite $\delta$, but there exists $\rho>0$ such that

$$\gamma_j=O(e^{-\rho j}).$$

In this case, note that

$$\gamma_{Nj}=O(e^{-\rho Nj}),$$

so that the memory decay parameter rises from ρ to ρN following skip-sampling. Since the estimator of (spurious) d in the exponential decay case is inevitably sensitive to the value of ρ, this suggests that comparing estimates under different rates of sampling might yield a useful test of the null hypothesis of long memory.
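The contrast between the two decay classes can be checked numerically. The sketch below is only illustrative: the values of `delta`, `rho` and `N` are arbitrary, and the autocovariances are taken to be exactly $j^{-\delta}$ and $e^{-\rho j}$ rather than merely of that order. It fits the decay parameter before and after skip-sampling.

```python
import numpy as np

N = 8
j = np.arange(1, 200, dtype=float)

# Hyperbolic decay: gamma_j = j^(-delta). Skip-sampling gives (N j)^(-delta),
# which only rescales the sequence, so the log-log slope is unchanged.
delta = 0.4
slope_full = np.polyfit(np.log(j), np.log(j ** -delta), 1)[0]
slope_skip = np.polyfit(np.log(j), np.log((N * j) ** -delta), 1)[0]

# Exponential decay: gamma_j = exp(-rho j). Skip-sampling gives exp(-rho N j),
# so the decay rate in the lag index is multiplied by N.
rho = 0.1
rate_full = -np.polyfit(j, -rho * j, 1)[0]
rate_skip = -np.polyfit(j, -rho * N * j, 1)[0]
```

Here `slope_full` and `slope_skip` both recover $-\delta$, whereas `rate_skip` is $N$ times `rate_full`.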

A range of nonlinear models, such as threshold autoregressive and Markov-switching processes, are often thought likely to be mistaken for long memory, since they can exhibit local patterns of apparent persistence: switches of local mean, for example, or, in the case of ESTAR threshold models, unit root-like behaviour in the neighbourhood of the origin. As for the linear autoregressive model, the essential difference between these models and the long-memory case is that the serial dependence decays exponentially as the lag increases beyond a certain point, whereas long memory implies hyperbolic decay. Whether linear or nonlinear, stable difference equations of finite order necessarily exhibit exponential decay (see Gallant and White 1988; Davidson 1994), whereas unstable difference equations are nonstationary, featuring unit roots or explosive behaviour.

The class of cases of [1] with $\delta>1$ count as instances of the alternative hypothesis for present purposes, because the autocovariances are summable. Models with this characteristic have not been significantly exploited in econometrics to date except in two rather special contexts, over-differenced fractional models (where $d<0$ and there is the additional “anti-persistence” property of the autocovariances summing to zero) and stochastic volatility modelling. The FIGARCH (Baillie, Bollerslev and Mikkelsen 1996) and HYGARCH (Davidson 2004) models are cases of the ARCH($\infty$) model where the lag weights in the conditional variance equation decline hyperbolically but are nonetheless summable. The co-moments of fourth order (when they exist) are likewise summable in these latter models. In the present case, by contrast, our null hypothesis is that of true long memory with $d>0$.

This paper considers tests of the long memory hypothesis based on a comparison of the log-periodogram estimator of the d parameter in skip-sampled data with that from the original data. The test statistic is asymptotically standard Gaussian under the null hypothesis, given the usual assumptions of this literature (notably Gaussianity of the observations, see Robinson 1995; Hurvich, Deo, and Brodsky 1998). Convergence to the limit may be slow, and the formulation adopted depends on ancillary assumptions. The test is therefore implemented, for evaluation purposes, both as an asymptotic test and as a bootstrap test. We further recognize that the test is not consistent, for the rejection probabilities must be ultimately decreasing in sample size under the alternative hypothesis. However, we propose that the procedure be utilized as a component of a composite test, in combination with the Wald significance test on the fractional integration parameter in which the roles of null and alternative are reversed. The composite test can be formalized by the construction of a pseudo-p-value, and we show that this defines a test of the null of long memory that is both consistent and asymptotically correctly sized.

The paper is organized as follows. Section 2 reviews the important issue of aliasing in skip-sampled data, and its consequences for the form of the periodogram. Section 3 derives a bias-corrected form of the GPH estimator appropriate to skip-samples. Next, Section 4 describes the test procedure and derives the null asymptotic distribution of the statistic. Section 5 describes the implementation of the bootstrap version of the test. Section 6 describes the composite test, and Section 7 comments on the nonstationary case of the null hypothesis. Monte Carlo findings are reported in Section 8, Section 9 describes two contrasting applications, and Section 10 contains concluding comments. Some proofs are gathered in the appendix.

2 Aliasing

The distribution of the GPH estimator in skip-sampled data has been studied inter alia by Smith and Souza (2002, 2004) and Souza (2005). Skip-sampling induces a bias in the estimator due to the effect of aliasing on the form of the spectral density. For a comprehensive analysis of the aliasing phenomenon, see Hassler (2011). The essential result is that the spectral density of the skip-sampled data can be represented as an average of the spectral densities over the range of aliased frequencies.

Proposition 2.1

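If $\{y_t, t=1,2,\ldots\}$ is a discrete stationary stochastic process with spectral density $f$ and $x_t=y_{tN}$ for $t=1,2,3,\ldots$ and $N>1$, the spectral density of the process $x_t$ is

$$f_N(\lambda)=\frac{1}{N}\sum_{k=0}^{N-1}f\left(\frac{\lambda+2\pi k}{N}\right).$$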

The straightforward proof is given in the appendix. Note that cycles of frequency λ/N in the original data become cycles of frequency λ in the skip-sampled data, and frequencies above π/N are no longer identifiable. Hence, these contributions to the variance of the series are effectively aggregated with the identifiable frequencies.
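The averaging representation can be verified numerically in a case where the skip-sampled spectrum is known in closed form. The sketch below assumes the standard aliasing form $f_N(\lambda)=N^{-1}\sum_{k=0}^{N-1}f((\lambda+2\pi k)/N)$ and uses an AR(1) process purely as a convenient test case; the function names and parameter values are hypothetical. A skip-sampled AR(1) with coefficient $\phi$ is again AR(1), with coefficient $\phi^N$ and innovation variance scaled by $(1-\phi^{2N})/(1-\phi^2)$.

```python
import numpy as np

def ar1_spectrum(lam, phi, sigma2=1.0):
    """Spectral density of y_t = phi*y_{t-1} + u_t with Var(u_t) = sigma2."""
    return sigma2 / (2.0 * np.pi * np.abs(1.0 - phi * np.exp(1j * lam)) ** 2)

def aliased_spectrum(f, lam, N):
    """Average of f over the N aliased frequencies (lam + 2*pi*k)/N, k = 0..N-1."""
    return np.mean([f((lam + 2.0 * np.pi * k) / N) for k in range(N)], axis=0)

phi, N = 0.9, 4
lam = np.linspace(0.05, np.pi, 50)

# Route 1: average the original spectrum over the aliased frequencies.
via_aliasing = aliased_spectrum(lambda w: ar1_spectrum(w, phi), lam, N)

# Route 2: spectrum of the skip-sampled process written down directly.
direct = ar1_spectrum(lam, phi ** N, (1.0 - phi ** (2 * N)) / (1.0 - phi ** 2))

max_gap = float(np.max(np.abs(via_aliasing - direct)))  # agreement up to rounding
```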

In the fractionally integrated case where

$$f(\lambda)=|1-e^{i\lambda}|^{-2d}g(\lambda) \qquad [3]$$

with $0<g(0)<\infty$, we find that $f_N(\lambda)$ cannot be directly log-linearized in the GPH manner. What can be done, following the suggestion of Smith and Souza (2002), is to write




There is, evidently, an omitted term $\log H_N$ in the log-periodogram regression in skip-sampled data, depending on $d$ as well as $\lambda$. The omission of this term will be liable to produce a bias in the GPH regression, and its omission is not rendered negligible by taking frequencies close to the origin. Indeed, what is commonly observed is that estimates of $d>0$ obtained from skip-sampled data are substantially closer to zero than those from the original data.


Note the implication for the standard analysis of a model such as [3], which is revealed to be specifically linked to the frequency of observation. Without this assumption, there is no reason to suppose that the function g does not also depend on d, nor that it is constant near the origin. In this light, the standard long memory analysis appears a little more fragile than is commonly taken for granted. Nonetheless, in this paper we shall work with the standard assumptions for the purposes of developing a test.

3 The Bias-Corrected Estimator

The test we propose is based on the comparison of two narrow-band regression estimators of the memory parameter $d$, one based on the full sample, the other based on skip-sampling of the test series. As before, let $N$ denote the periodicity of the skips. Skip-sampling is done by taking every $N$th observation, so yielding a sample of size $[T/N]$ where $[z]$ denotes the largest integer not exceeding $z$. This can be done $N$ times, by off-setting the initial observation, so that the $N$ skip-samples can be represented as $\{x_{0t}\},\{x_{1t}\},\ldots,\{x_{N-1,t}\}$ where, for $k=0,\ldots,N-1$,


Each of these samples can be used to compute a modified log-periodogram estimator, which we denote $\hat d_{Nk}$ for $k=0,\ldots,N-1$.
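The construction of the $N$ offset skip-samples can be sketched as follows. The indexing convention used here (offset $k$ starts at the $(k+1)$th observation, and every sample is truncated to $[T/N]$ points) is an assumption made for illustration, as is the helper name `skip_samples`.

```python
import numpy as np

def skip_samples(y, N):
    """Return the N offset skip-samples, each truncated to length T // N."""
    y = np.asarray(y)
    n = y.size // N                       # common sample size [T/N]
    return [y[k::N][:n] for k in range(N)]

y = np.arange(1, 21)           # stand-in data y_1, ..., y_20
samples = skip_samples(y, 8)   # 8 offset samples, each of length 20 // 8 = 2
```

Each element of `samples` would then be passed to the log-periodogram estimator, and the $N$ resulting estimates averaged.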

As shown in Section 2, the conventional GPH estimator applied to skip-sampled data is biased. To be more specific, it exhibits a bias different in character from the well-known case of data with short run dependence in the fractional differences, being present even if the original series is a pure fractional without short-run components. Moreover, the bias is not attenuated by choosing a narrow bandwidth. The logarithm of $H_N$ in [5] is a missing term in the log-periodogram regression, and bias correction involves finding a computable surrogate for this function.

Expression [5] as a function of $\lambda$ depends in the first place on the unknown $d$, and the natural approximation is to replace this with the asymptotically unbiased estimator $\hat d$. It also depends on the unknown spectral density component $g$ evaluated at different points, and except in the case of the pure fractional model, the term $g((\lambda+2\pi k)/N)/g(\lambda/N)$ in [5] varies with $\lambda$ in general over the whole of the interval $[0,2\pi]$, including points close to the origin. Approximating it by a constant, in the manner of dealing with $g(\lambda)$ in the narrow-band estimator, is therefore not an attractive option.

Possible methods for estimating this term include constructing a kernel estimator of $g$ from the spectrum of the fractional differences. However, in this implementation we have adopted a semiparametric approach. Let the null hypothesis specify that the random sequence has a representation of the ARFI form

$$\phi(L)(1-L)^d x_t=u_t \qquad [6]$$

where $\phi(L)$ is an invertible lag polynomial, of possibly infinite order, and $u_t\sim NI(0,\sigma^2)$ where “NI” denotes independent Gaussian. To approximate $\phi(L)$ we use the Durbin–Levinson algorithm to fit an autoregression of order $p_T=0.6T^{1/3}$ to the fractional differences $(1-L)^{\hat d}x_t$, where $\hat d$ is the estimator of $d$ based on the full sample. This yields an estimated lag polynomial $\hat\phi(L)$, and we then approximate $g(\lambda)$ by


It suffices for our application that $\hat g(\lambda)$ converges in probability to $g(\lambda)$ pointwise in a neighbourhood of zero, and since $\hat g$ is a smooth differentiable function of the data this property should in fact hold in a wider class of processes than [6]. Absolute summability of the autocovariances of the fractional differences should hold in processes for which log-periodogram regression has good properties. Since the estimators in question depend only on second moments, they will yield the same consistency properties if $u_t$ in [6] is merely white noise. With caveats concerning invertibility, the Wold theorem therefore extends validity to the general covariance stationary case. The issues arising here are carefully analysed, in the bootstrap context, by Kreiss, Paparoditis, and Politis (2011). They show that Gaussianity of the series is certainly sufficient and this is, in any case, an assumption adopted for our subsequent asymptotic analysis and imposed in our experiments. Of course, these considerations strictly relate to the case where $d$ is known, and the largest source of error in finite samples will be due to the replacement of $d$ by $\hat d$.
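For readers unfamiliar with the sieve autoregression step, a minimal sketch of the Durbin–Levinson recursion follows. It takes autocovariances as given and returns the fitted AR coefficients and innovation variance; the function names are hypothetical, and rounding the order $0.6T^{1/3}$ down to an integer is an assumption, since the rounding rule is not stated.

```python
import numpy as np

def sieve_order(T):
    """AR order 0.6*T^(1/3), rounded down (rounding rule assumed)."""
    return max(1, int(0.6 * T ** (1.0 / 3.0)))

def levinson_durbin(gamma, p):
    """AR(p) coefficients and innovation variance from autocovariances gamma[0..p]."""
    phi = np.zeros(p + 1)        # phi[1..k] holds the current AR(k) coefficients
    v = gamma[0]                 # prediction error variance of the order-0 fit
    for k in range(1, p + 1):
        acc = gamma[k] - np.dot(phi[1:k], gamma[k - 1:0:-1])
        kappa = acc / v          # partial autocorrelation at lag k
        new = phi.copy()
        new[k] = kappa
        new[1:k] = phi[1:k] - kappa * phi[k - 1:0:-1]
        phi = new
        v *= 1.0 - kappa ** 2
    return phi[1:], v

# Check on exact AR(1) autocovariances: gamma_h = 0.5^h / (1 - 0.25).
gamma = np.array([0.5 ** h / 0.75 for h in range(4)])
coefs, sigma2 = levinson_durbin(gamma, 3)   # expect (0.5, 0, 0) and variance 1
```

In the application described above, `gamma` would be replaced by sample autocovariances of the fractionally differenced series.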

Letting $\lambda_j=2\pi j/T$ as usual, the skip-sampled series consists of $[T/N]$ observations, and the frequencies at which the periodogram is evaluated are $\lambda_{Nj}=2\pi Nj/T$ for $j=1,\ldots,M_N$ where $M_N=[(T/N)^q]$, for $0<q<1$, represents the usual GPH bandwidth function of sample size. In practice $q$ should be chosen according to the established prescriptions of the literature and, following Hurvich, Deo, and Brodsky (1998) (henceforth HDB), setting $q<4/5$ ensures limiting Gaussianity of the estimator, with bias of $O(T^{2(q-1)})$. Let $I_{Nk}$ denote the periodogram computed from the $k$th skip-sampled data set with period $N$, and let $\hat H_N(\lambda)$ denote the formula in [5] approximated as described, using the estimated parameters and the representation of the short-run spectral density in [7]. The $k$th bias-corrected skip-sample estimator then takes the form


where $X_{Nj}=-2\log(2\sin(\lambda_{Nj}/2))$. Provided $N$ is treated as fixed and not linked to sample size, note that $M_N=O(M)$ where $M=[T^q]$, and this is the assumption we maintain henceforth.

While the formula in [8] employs an estimator of the function $g$ as a component of the aliasing correction, note that this estimator has not been included in the log-periodogram regression itself, and this remains a narrow-band estimator. Note also that $\hat g$ depends on the narrow-band estimator of $d$ based on the full sample, which is used to fractionally difference the data, and hence it does not provide a direct route to a broad-band estimation procedure of the type proposed by Moulines and Soulier (1999), for example.

4 The Skip-Sampling Test

Letting the conventional GPH estimator based on the complete sample be denoted $\hat d$, the test statistic we consider is


where $\hat d_N=N^{-1}\sum_{k=0}^{N-1}\hat d_{Nk}$. We use the signed statistic and perform a one-tailed test, on the assumption that the leading cases of the alternative will give rise to a smaller value of $d$ in the skip-sampled data. Also note that, in view of the form of the estimator, using the average of the $\hat d_{Nk}$ estimates from the $N$ skip samples is equivalent to adopting the average of the log-periodogram points across the $N$ offset samples as regressand. This scheme makes the most efficient use of the available data.

When the sample is large enough, both the conventional GPH estimator $\hat d$ and the skip-sampled estimator $\hat d_N$ defined in [8] can be analysed using the techniques developed in HDB. These authors obtain their results from the following assumptions, which here relate to our null hypothesis under test.

Assumption 1

The process $\{y_t, t=1,2,\ldots\}$ is stationary and Gaussian with the spectral density given in [3] with $0<d<\frac{1}{2}$.

Assumption 2


Assumption 3

$g'(0)=0$, and $g''(\lambda)$, $g'''(\lambda)$ are bounded for all $\lambda$ in a neighbourhood of zero.

Letting $\epsilon_{Nkj}=\log(I_{Nk}(\lambda_{Nj})/f(\lambda_{Nj}))$, there exists a function $f$ such that (analogous to the expression in HDB, page 42)


where $a_{Nj}=X_{Nj}-\bar X_N$ and $S_N=\sum_{j=1}^{M_N}a_{Nj}^2=O(M)$. Under our assumptions, the first right-hand side term is $o(1)$. In the case $N=1$, such that there is no skip sampling and $\hat d_{Nk}=\hat d$, $f_{1j}=f_j=g(\lambda_j)$ and $\epsilon_{10j}=\epsilon_j$. In the cases with $N>1$, on the other hand,


Since $H$ is twice-differentiable with respect to $d$ and $\hat d$ is $M^{1/2}$-consistent under our assumptions, we can expand $\log\hat H(\lambda_{Nj})$ as


Then, using Lemma 1 of HDB, and letting


we have


Note that the relevant properties of the random variables $\epsilon_{Nkj}$ extend from the full-sample to the skip-sampled case, specifically, that their distribution has finite second moments that asymptotically do not depend on nuisance parameters; see Lemmas 2 and 6–8 of HDB. Since the regressors are the same for each $k$, we further find


and hence


where $a_j=X_j-\bar X$ and $S=\sum_{j=1}^{M}a_j^2$. In the appendix, we show the following.

Proposition 4.1

For fixed finite $N$, and $d$ such that $(1-L)^d x_t$ is a weakly dependent process, $B_T(N,d)$ converges in probability to a finite nonstochastic limit $B(N,d)$.

The next thing to note using further results from HDB is that, with q<4/5,


Also note that $S_N^{-1/2}\sum_{j=1}^{M_N}a_{Nj}\epsilon_{Nkj}$ has the same limit in distribution for each $k$, where $S_N=\sum_{j=1}^{M_N}a_{Nj}^2$, and so similarly,


It follows that under these conditions,




where $\bar S=\lim_{T\to\infty}S/M$, $\bar S_N=\lim_{T\to\infty}S_N/M_N$, and


To derive a formula for $C(q,N)$ analytically would entail quite a challenging calculation, and we have circumvented the need for this by a numerical evaluation. Note that $\epsilon_j$ is the logarithm of the periodogram point of an independent Gaussian series (having $d=0$ and $g$ constant) whereas the $\epsilon_{Nkj}$ are the log-periodograms of the corresponding skip-sampled series. Therefore, $C(q,N)$ can be approximated as closely as desired, for given $q$ and $N$, by a simulation based on a sufficiently large sample. The accuracy of the approximation can be monitored by computing the sample variances of the components at the same time, and checking how close these lie to their known asymptotic counterpart of $\pi^2/6$. We have performed the simulation with 200,000 replications in a sample size of 2000, with the results shown in Table 1.
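The variance check mentioned above can be sketched in a few lines. This is not a reproduction of the simulation behind Table 1: it only confirms, for the full-sample component, that the log-periodogram errors of independent Gaussian noise have variance close to $\pi^2/6$. The sample size, number of replications and seed are arbitrary choices for illustration.

```python
import numpy as np

rng = np.random.default_rng(12345)
T, reps = 512, 400
j = np.arange(1, T // 2)                   # Fourier frequencies strictly inside (0, pi)

u = rng.standard_normal((reps, T))         # independent Gaussian series, d = 0
dft = np.fft.fft(u, axis=1)[:, j]
I = np.abs(dft) ** 2 / (2.0 * np.pi * T)   # periodogram ordinates
f = 1.0 / (2.0 * np.pi)                    # flat spectral density of unit-variance noise
eps = np.log(I / f)                        # log-periodogram errors epsilon_j

sample_var = float(eps.var())              # compare with pi^2 / 6
sample_mean = float(eps.mean())            # compare with minus Euler's constant
target_var = np.pi ** 2 / 6.0
```

The same loop applied to skip-sampled versions of `u` would supply the cross-moments needed for a numerical estimate of $C(q,N)$.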

Table 1: Numerical estimates of $C(q,N)$.


Denoting by $\hat V$ the variance formula computed using these approximations, replacing $d$ by its full-sample GPH estimator, the test statistic is calculated as


This statistic is used as the basis for a one-tailed test with rejections in the upper tail.

5 The Bootstrap Test

A difficulty with the semiparametric approach to estimation is the slow convergence to the asymptote, at the rate $M^{1/2}$ rather than $T^{1/2}$. The mean and variance approximations derived in the previous section are accordingly slow to improve, especially with the reduction in effective sample size following skip-sampling. This suggests that the bootstrap may have a useful role to play in implementing the test, while not overlooking that the parametric bootstrap is likewise dependent on slowly converging estimated parameters. Nonetheless, a comparison of the two procedures, asymptotic and bootstrap, may serve to triangulate the uncertainty.

The bootstrap distribution of the statistic has to be estimated by simulating the null hypothesis as a fractionally integrated process, while allowing for the possibility of short-run dependence of the fractional differences. Given an estimator $\hat d$ of the fractional parameter and test statistic $\hat\tau$ computed from the sample, the calculation is performed as follows.

  1. Compute the fractional differences $\hat u_t=(1-L)_+^{\hat d}(y_t-y_1)$ where $(1-L)_+^{\hat d}=\sum_{j=0}^{t}a_jL^j$ and $a_0=1$, $a_j=a_{j-1}(j-\hat d-1)/j$ for $j\ge 1$.

  2. Repeat the following steps for $j=1,\ldots,B$.

    (a) Draw a random sample $\hat u_{1j},\ldots,\hat u_{Tj}$ from the distribution of $\hat u_1,\ldots,\hat u_T$ using a method that preserves the dependence structure; see Remark 1 below.

    (b) Construct the sequence


      where $(1-L)_+^{-\hat d}=\sum_{j=0}^{t}b_jL^j$ with $b_0=1$, $b_j=b_{j-1}(j+\hat d-1)/j$ for $j\ge 1$, and $\hat z_{tj}$ is explained in Remark 2 below.

    (c) Compute the bootstrap test statistic $\hat\tau_j$ as in [13] for the sample $\hat y_{1j},\ldots,\hat y_{Tj}$.

  3. Compute the estimated p-value for the test as 0 if $\hat\tau>\hat\tau_{(B)}$ or else as


where $\hat\tau_{(j)}$ is the $j$th order statistic for the bootstrap statistics $\hat\tau_1,\ldots,\hat\tau_B$.
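The truncated fractional filters used in steps 1 and 2(b) can be sketched as below. Only the filtering recursions are illustrated: the resampling in step 2(a) and the initial-condition correction terms of step 2(b) are omitted, the value of `d_hat` is a placeholder rather than an estimate, and the function names are hypothetical.

```python
import numpy as np

def frac_weights(d, n):
    """First n coefficients of (1 - L)^d: a_0 = 1, a_j = a_{j-1} (j - d - 1) / j."""
    a = np.empty(n)
    a[0] = 1.0
    for j in range(1, n):
        a[j] = a[j - 1] * (j - d - 1.0) / j
    return a

def frac_filter(x, d):
    """Apply the truncated filter (1 - L)_+^d, treating pre-sample values as zero."""
    x = np.asarray(x, dtype=float)
    w = frac_weights(d, x.size)
    return np.convolve(x, w)[: x.size]

rng = np.random.default_rng(1)
y = 0.1 * np.cumsum(rng.standard_normal(200))   # stand-in data
d_hat = 0.3                                     # placeholder for an estimate of d

u_hat = frac_filter(y - y[0], d_hat)    # step 1: fractional differences
y_back = frac_filter(u_hat, -d_hat)     # integrating filter with weights b_j, as in step 2(b)
```

Passing `-d_hat` reproduces the recursion $b_j=b_{j-1}(j+\hat d-1)/j$, and applying the two truncated filters in sequence recovers `y - y[0]` up to rounding error.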


  Remark 1. Methods for constructing the drawings $\hat u_{1j},\ldots,\hat u_{Tj}$ include the stationary bootstrap of Politis and Romano (1994) and the sieve autoregression method of Bühlmann (1997). Note that the latter calculation is also used to obtain expression [7], and the same remarks apply regarding the validity of the sieve AR method in this context; see Kreiss, Paparoditis, and Politis (2011).

  Remark 2. The correction terms $\hat z_{tj}$ are constructed using Gaussian drawings and weights computed from the estimated parameters to have a covariance structure matching the components omitted through truncating the innovation sequence at 0. These replace the sample initial condition which has been truncated in step 1. The resulting sequence is approximately stationary for $|\hat d|<\frac{1}{2}$. If $\hat d\ge\frac{1}{2}$ the data are modelled in differences, replacing $\hat d$ by $\hat d-1$, and the simulation is then integrated using the first observation for the initial condition. Nonstationary processes generated by this procedure converge after normalization to Type I Brownian motion. For details of the simulation procedure, see Davidson and Hashimzade (2009).

  Remark 3. In practice, different estimators of $d$, employing different bandwidths in particular, might be used to compute the statistic and to implement the bootstrap resampling as in Step 1. Using a wider bandwidth in the former case could increase power by emphasizing bias under the alternative, while a different balance between bias and variance might be advantageous in achieving the best bootstrap distribution. Such choices can be guided by simulation experiments.

6 The Composite Test

Testing the degree of persistence of a time series is a problem that has attracted a degree of controversy, as documented by one of the present authors (Davidson 2009). This is one of a class of problems that have been characterized by Dufour (1997) as “ill-posed”, and has close links with the testing frameworks critically analysed by Pötscher (2002) and Faust (1996, 1999), inter alia. Tests of the null hypothesis that the series has summable autocovariances (the “I(0) hypothesis”) face a common difficulty for valid inference. This difficulty manifests itself in different ways in different contexts, but the essential common feature is that the null hypothesis constitutes an open set in the parameter space. It follows that test power cannot exceed test size, where the latter is defined as the supremum of the rejection probabilities over the null set of the model space. While this problem extends to more general parameterizations it is most transparent in the case where the “I(0)” property relates to the modulus of the maximal autoregressive root.[1] The null hypothesis is represented by the interval $[0,1)$ with its closure containing the leading case of the alternative.

Although the null and alternative are interchanged, the present case is clearly similar. The null hypothesis relating to the value of $d$ is the open interval $(0,\infty)$, with its closure containing the cases of the alternative with $d=0$. This is another situation where, under a literal interpretation, power cannot exceed size. The test is based on a comparison of two estimators of $d$, where under the alternative, one (the full-sample estimator) is expected to exhibit more bias than the other (the skip-sampled estimator) as estimators of zero. Since the estimators being compared are both consistent, albeit biased in finite samples, the test is evidently inconsistent. The probability of exceeding the rejection criterion under the alternative evidently cannot be monotone nondecreasing in sample size.

While the test might therefore appear of doubtful value in applications, this conclusion overlooks the context in which such a test might be applied. The question actually being posed, in most cases, is whether a “significantly positive” estimator of $d$ should be treated as a biased estimator of zero. If the significance test does not result in rejection, then we might on these grounds decide to reject the null hypothesis of long memory and either forgo the skip-sampling test or, at least, overlook a non-rejection in the latter test.

To formalize this idea, consider a composite test in which the skip-sampling test is performed in partnership with a one-tailed Wald test of the hypothesis $d\le 0$ with alternative $d>0$. With contamination by short-run positive autocorrelation, we anticipate possible over-rejection in this latter test. Non-rejection in the Wald test implies effective rejection of the null hypothesis of long memory, and there is, arguably, no need to proceed to the skip-sampling test. A composite test can nonetheless be constructed to deliver a p-value that takes account of the outcome of the initial Wald test. Suppose that the Wald test delivers a p-value $\pi_{1T}$ in a sample of size $T$, and the skip-sampling test a p-value $\pi_{2T}$. Consider the pseudo-p-value calculated as

$$\hat\pi_{2T}=\pi_{2T}(1-\pi_{1T})^{T^\delta} \qquad [14]$$

for some δ>0.

Proposition 6.1

The test “reject $H_0$ when $\hat\pi_{2T}<\alpha$” is consistent and asymptotically of size $\alpha$.

To prove the proposition, first consider the behaviour of the statistic $\hat\pi_{2T}$ in the case $d>0$ (the null hypothesis). Let the Wald statistic be denoted $t_T$. The corresponding p-value $\pi_{1T}$ is the area under the upper tail of the null distribution (standard normal) bounded by $t_T$. Since the Wald test is consistent with $t_T=O_p(T^{q/2})$, we have


Hence, consider $\hat\pi_{2T}/\pi_{2T}=(1-\pi_{1T})^{T^\delta}$, and note that for every $\delta<\infty$,


It follows that $\hat\pi_{2T}/\pi_{2T}\to 1$ in probability. Since the null hypothesis is true, $\hat\tau$ converges to $N(0,1)$ in distribution according to [12]. Therefore, $\pi_{2T}$ is asymptotically uniformly distributed on the unit interval. By the indicated convergence in probability the composite test shares this property, and so rejects asymptotically with probability $\alpha$ in an $\alpha$-level test.

Next, suppose that the null is false, with $d\le 0$. Recalling that the Wald test is one-sided, $\pi_{1T}$ is asymptotically uniformly distributed on the unit interval in this case and so, in particular, $P(\pi_{1T}>0)\to 1$. When $\delta>0$, it follows that for any $\epsilon>0$,


as $T\to\infty$, and the proposition is proved.

The convergence of $\hat\pi_{2T}$ to the uniform distribution under the null must be somewhat slower than that of $\pi_{2T}$, depending on the choice of $\delta$. The smaller that $\delta$ is chosen, the nearer $\hat\pi_{2T}/\pi_{2T}$ is to unity in any given sample size and the smaller is the size distortion ceteris paribus, while not overlooking the fact that the test based on $\pi_{2T}$ may itself exhibit size distortion in one direction or another, so that the net distortion in a given sample size is unpredictable. On the other hand, the larger $\delta$ is chosen, the more rapidly $\hat\pi_{2T}$ approaches 0 under the alternative. Hence, the choice of $\delta$ represents a trade-off of power against size.
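The trade-off can be seen in a small numerical sketch. It assumes the pseudo-p-value has the form implied by the ratio stated in the proof, $\hat\pi_{2T}=\pi_{2T}(1-\pi_{1T})^{T^\delta}$; the function name and the p-values fed to it are invented for illustration only.

```python
def composite_p_value(p_wald, p_skip, T, delta):
    """Pseudo-p-value p_skip * (1 - p_wald) ** (T ** delta) (form assumed from the proof)."""
    return p_skip * (1.0 - p_wald) ** (T ** delta)

T = 1000

# Wald test rejects decisively (tiny p-value): the skip-sampling p-value
# passes through almost unchanged.
strong = composite_p_value(1e-6, 0.20, T, delta=0.5)

# Wald test does not reject: the pseudo-p-value is pushed towards zero,
# and more sharply for the larger value of delta.
weak_small_delta = composite_p_value(0.30, 0.20, T, delta=0.25)
weak_large_delta = composite_p_value(0.30, 0.20, T, delta=0.5)
```

A larger `delta` therefore sharpens rejection when the Wald test fails to reject, at the cost of a ratio further from one when the null of long memory is true.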

Note that the consistency of the composite test holds whether or not $\pi_{2T}\to 0$ under the alternative. To appreciate the contribution of the skip-sampling test, it may be helpful to envisage the “test” based on simply drawing a uniform random number from $[0,1]$ at the second stage, instead of computing the quantile of the skip-sampling statistic. Proposition 6.1 holds also for this test! What we have done here is to give an alternative way of formalizing the properties of the Wald test. The tendency of this test to over-reject the conventional null hypothesis $d\le 0$, due to bias, is converted into a case of low finite-sample power to reject the hypothesis $d>0$. However, the expectation is that the power of the composite test in finite samples is greater, to the extent that $\pi_{2T}$ is distributed closer to zero than a uniform variate under the alternative. The simulation experiments reported in Section 8 show that such improvements, judged by the performance of the basic skip-sampling test, can be large.

We emphasize once again that the composite test does not need to be taken literally as an operational procedure. We can think of it as a formalization of the procedure of taking two test results into account in making a decision. If we cannot reject the hypothesis $d\le 0$ on the Wald test, we are unlikely to proceed to the second stage. If we do find $d$ “significantly positive” on conventional criteria, then we want to know how far this outcome might be attributable to bias, and the skip-sampling test can in this case provide countervailing evidence.

7 The Nonstationary Case

As the observational equivalence issue raised in the introduction would lead us to predict, autoregressively generated series with a root in the stable region but close to unity characteristically yield an estimated $d$ in the nonstationary range $\frac{1}{2}\le d\le 1$. However, it is known (from, e.g. Velasco 1999; Kim and Phillips 2006) that log-periodogram regression in this range is consistent, and also asymptotically normal, under regularity conditions, for $d<\frac{3}{4}$. Our test should exhibit similar characteristics in stationary and nonstationary cases of the null hypothesis, and this conjecture is borne out by the simulation experiments reported in Section 8.

In a well-known paper, Diebold and Inoue (2001) point out that in certain models exhibiting structural change, in which the frequency of change has a particular relation with sample size, there is the appearance of hyperbolic memory decay. In some of their examples, the processes in question are “revealed” as really I(1) (stationary in differences) as $T$ is extended with fixed parameters. To understand how the skip-sample test might behave in these cases, we must not overlook the fact that a unit root process, like a serially independent process, is technically a case of the null hypothesis. Both cases exhibit the invariance of memory to skip-sampling characteristic of fractional integration. Thus, a skip-sampled unit root remains a unit root. For this reason we should not expect the present test to have greatest power against local-to-unity autoregressive alternatives. The natural approach, faced with a time series that does not exhibit mean reversion, might be to test for hyperbolic memory in the differences. Diebold and Inoue also propose examples in which processes appearing to show hyperbolic decay in a given sample size are “revealed” as I(0) as $T$ increases, and here our test should perform better. In particular, they consider a simple independent process subject to Markov-switching, which is one of the cases to be studied in the next section.

8 Monte Carlo Experiments

We present some experiments using three sample sizes, T=250, 1000 and 5000, with 5,000 replications in each case. Following preliminary investigations, a bandwidth for the GPH estimator of $M=[T^{0.7}]$ was chosen to compute the tests, with a skip-sampling period of N=8. A relatively wide bandwidth, emphasizing bias, is intended to optimize the performance of the test under the alternative. The skip-sampling periods N=4 and N=12 have also been tried, although the properties of the test do not appear very sensitive to this setting. The chosen settings emerged as the best compromise in performance in null and alternative cases. The experiments returned both asymptotic and bootstrap p-values, using 300 bootstrap replications. The fractional differencing of the series prior to resampling has to be performed with an estimated d, as described in Section 5, and for this purpose a narrower bandwidth $M=[T^{0.55}]$ was used, to attenuate the bias. The simple bootstrap with independent resampling was used in the simulations of the pure fractional null hypothesis.
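To make these settings concrete, here is a minimal sketch of a log-periodogram (GPH) regression with bandwidth $[T^{0.7}]$, applied to a series and to its skip-sample with period 8. It is an illustration only: the function names are hypothetical, the regressor follows the standard Geweke and Porter-Hudak form, and the aliasing bias correction that the paper applies to the skip-sampled estimator is not included. Applying the same bandwidth rule to the skip-sample is also an assumption of this sketch.

```python
import numpy as np


def gph_estimate(x, power=0.7):
    """GPH estimate of d using M = floor(T**power) Fourier frequencies.

    Returns the pair (d_hat, M).
    """
    x = np.asarray(x, dtype=float)
    T = x.size
    M = int(np.floor(T ** power))
    j = np.arange(1, M + 1)
    lam = 2.0 * np.pi * j / T

    # Periodogram ordinates at the first M nonzero Fourier frequencies.
    dft = np.fft.fft(x - x.mean())
    periodogram = np.abs(dft[j]) ** 2 / (2.0 * np.pi * T)

    # Regress log I(lambda_j) on -2*log(2*sin(lambda_j/2)); the slope is d_hat.
    regressor = -2.0 * np.log(2.0 * np.sin(lam / 2.0))
    X = np.column_stack([np.ones(M), regressor])
    coef, *_ = np.linalg.lstsq(X, np.log(periodogram), rcond=None)
    return float(coef[1]), M


def skip_sample(x, N=8, offset=0):
    """Keep every N-th observation, starting at index `offset`."""
    return np.asarray(x)[offset::N]


rng = np.random.default_rng(0)
x = rng.standard_normal(5000)          # white noise, so the true d is zero

d_full, M_full = gph_estimate(x)
d_skip, M_skip = gph_estimate(skip_sample(x, 8))
print("full sample: d_hat =", d_full, "with M =", M_full)
print("skip sample: d_hat =", d_skip, "with M =", M_skip)
```

A test statistic in the spirit of the paper would compare the two estimates after correcting the second for aliasing; the sketch only shows how few frequencies remain once the data are skip-sampled.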

Table 2 shows the results obtained in nominal 5% tests for three cases of the null hypothesis. Under H0, the data are generated as


where $z_t(d)$ is an independent Gaussian term generated by the method of Davidson and Hashimzade (2009), such that the sequence $\{y_t\}_{t=1}^{T}$ is stationary (see also Remark 2 of Section 5). The chosen values of d are shown in the column headings. The table entries show the proportion of replications in which the asymptotic and bootstrap p-values, respectively, fell below 0.05. The rows of the table show the performance in the three sample sizes of the basic skip-sampling test, and three cases of the composite test based on [14], with the values of δ shown in the first column of the table. The bootstrap p-values were computed using the algorithm from Section 5 using simple independent sampling to draw the fractional differences.
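The resampling step mentioned here can be sketched as follows. This is a simplified illustration rather than the Section 5 algorithm itself: it assumes a truncated fractional filter with pre-sample values set to zero, centres the fractional differences before resampling, and omits the additional term that the paper uses to make the generated sequence stationary. All function names are hypothetical.

```python
import numpy as np


def frac_diff_weights(d, n):
    """First n coefficients of (1 - L)**d: w_0 = 1, w_k = w_{k-1} * (k - 1 - d) / k."""
    w = np.empty(n)
    w[0] = 1.0
    for k in range(1, n):
        w[k] = w[k - 1] * (k - 1 - d) / k
    return w


def frac_filter(x, d):
    """Apply the truncated filter (1 - L)**d to x, treating pre-sample values as zero."""
    x = np.asarray(x, dtype=float)
    w = frac_diff_weights(d, x.size)
    return np.convolve(x, w)[: x.size]


def iid_bootstrap_draw(x, d_hat, rng):
    """One bootstrap series: difference with d_hat, resample i.i.d., re-integrate."""
    u = frac_filter(x, d_hat)                       # fractional differences
    u_star = rng.choice(u - u.mean(), size=u.size, replace=True)
    return frac_filter(u_star, -d_hat)              # fractional integration


rng = np.random.default_rng(0)
x = frac_filter(rng.standard_normal(200), -0.3)     # a fractionally integrated series
x_star = iid_bootstrap_draw(x, d_hat=0.3, rng=rng)
print(x_star[:5])
```

Because the truncated filters for d and -d are exact inverses of one another on a finite sample, differencing and then re-integrating without resampling returns the original series, which is a convenient check on the implementation.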

Table 2:

Rejection rates in cases of the pure null (0.05 tests).

Basic Test
Composite test

Under-sizing of the asymptotic test occurs in all sample sizes, and also in the bootstrap test in the larger samples, suggesting that the convergence is non-monotone.[2] If it is found surprising that these errors in rejection probability do not diminish more quickly, it is as well to remember that the components of the statistic depend upon as few as $[T/8]^{0.55}$ periodogram points, a mere 34 even in the case of T=5000. The convergence to the asymptote is inevitably slow.
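The figure of 34 periodogram points can be reproduced with a short calculation. The sketch below assumes that the count is the integer part of $[T/8]^{0.55}$, as written above; the function name is illustrative.

```python
import math


def skip_sample_bandwidth(T, N=8, power=0.55):
    """Integer part of (number of skip-sampled observations) ** power."""
    return math.floor((T // N) ** power)


for T in (250, 1000, 5000):
    print(T, skip_sample_bandwidth(T))
```

For T=5000 this gives 34, in agreement with the text, and the counts for the two smaller sample sizes are smaller still.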

The issue of size distortion would clearly benefit from further study, and alternative estimators and bandwidths could certainly be considered. However, we note that under-rejection is a relatively benign problem provided the rejection rates under null and alternative differ sufficiently. Moreover, because the composite test yields a pseudo p-value that is always smaller than the bootstrap p-value, under-rejection is a desirable feature in the sense that the composite test is less prone to over-rejection under the null. Considering the alternative cases of δ in the composite test, the trade-off between power and size is evident here. The rate of over-rejection by the pseudo p-value under the null can be unacceptably large, but even with δ=0.6 this is a problem chiefly in small samples, or when d is close to zero. The latter results are not surprising, because when d is small, $\pi_{1T}$ in [14] is on average closer to 1 than otherwise, and hence $\hat\pi_{2T}/\pi_{2T}$ is smaller and over-rejection more acute. It is interesting that this effect persists in the largest sample considered but, again, it is important to remember the slow rate of convergence. It is, if anything, more surprising to see how well the test can perform in modest sample sizes.

Next, in Tables 3 and 4 some cases of the alternative are shown. The performance of the simple and composite tests with the data generated as first-order autoregressions is reported in Table 3. This is the exponential decay model

$y_t = \phi y_{t-1} + \epsilon_t$

with $y_0=0$, for three different values of ϕ. Note that these are rejection rates, not power estimates, since the test sizes are uncorrected. With our compound null hypothesis, no consistent scheme for correcting rejection rates can be defined. This table therefore needs to be read in conjunction with Table 2. It is interesting to note that the rejection rate increases as ϕ is increased. This reflects the fact that the test has most power when autocorrelation is substantial but not hyperbolic. When the amount of autocorrelation is small, it is correspondingly difficult for the test to discriminate between exponential and hyperbolic decay, and this fact is reflected in the lower rejection rates observed.
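The exponential decay that distinguishes this alternative from the hyperbolic case can be seen in a small simulation. This is a minimal sketch assuming standard Gaussian innovations and the illustrative value ϕ=0.9, which is not necessarily one of the values used in Table 3; the function names are hypothetical.

```python
import numpy as np


def simulate_ar1(phi, T, rng):
    """Simulate y_t = phi * y_{t-1} + eps_t for t = 1..T, starting from y_0 = 0."""
    eps = rng.standard_normal(T)
    y = np.empty(T)
    prev = 0.0  # initial condition y_0 = 0
    for t in range(T):
        prev = phi * prev + eps[t]
        y[t] = prev
    return y


def sample_autocorr(y, k):
    """Sample autocorrelation of y at lag k >= 1."""
    y = np.asarray(y, dtype=float) - np.mean(y)
    return float(np.dot(y[:-k], y[k:]) / np.dot(y, y))


phi = 0.9  # illustrative value only
y = simulate_ar1(phi, 20000, np.random.default_rng(0))
for k in (1, 5, 10, 20):
    # The theoretical autocorrelation of an AR(1) at lag k is phi**k.
    print(k, sample_autocorr(y, k), phi ** k)
```

The autocorrelations shrink geometrically in the lag, whereas under fractional integration they would fall off only as a power of the lag.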

Table 3:

Rejection rates, cases of the AR alternative (0.05 tests).

Basic Test
Composite Test

Table 4 shows rejection rates, in the basic test only, against a range of nonlinear I(0) processes. Here, we report the averages over replications of the log-periodogram estimates of d, side by side with the rejection frequencies by the bootstrap test, where we know in each case that the true d is zero. The models reported are as follows, where in each case $\epsilon_t \sim NI(0,1)$.

  • “Bilinear” is of the form


with ϕ1=0.8 and ϕ2=ϕ2=0.3.

  • “ESTAR” is the exponential self-exciting threshold AR case,


where α1=1.5, α2=1, γ=0.01.

  • “Markov Mean” is a model with Markov–switching intercepts. This takes the form


where α(1)=1, α(2)=1 and $S_t=1$ or 2 with $P(S_t=1 \mid S_{t-1}=2)=P(S_t=2 \mid S_{t-1}=1)=0.05$.

  • “Markov AR” is an autoregressive Markov-switching model,


    where ϕ(1)=1.0, ϕ(2)=0.6 and $P(S_t=1 \mid S_{t-1}=2)=0.03$ and $P(S_t=2 \mid S_{t-1}=1)=0.05$.
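As an illustration of how one of these processes might be simulated, the following sketch generates a Markov-switching mean series. Several things here are assumptions rather than details taken from the paper: the model is taken to be an intercept that depends on the state plus a standard Gaussian shock, the two intercepts are set to the hypothetical values 1 and -1, and the state switches with probability 0.05 in each period.

```python
import numpy as np


def simulate_markov_mean(T, alpha=(1.0, -1.0), p_switch=0.05, rng=None):
    """Simulate y_t = alpha[S_t] + eps_t with a symmetric two-state Markov chain.

    `alpha` holds hypothetical intercepts; `p_switch` is the per-period
    probability of leaving the current state.
    """
    rng = np.random.default_rng() if rng is None else rng
    state = 0
    states = np.empty(T, dtype=int)
    for t in range(T):
        if rng.random() < p_switch:
            state = 1 - state          # switch regime
        states[t] = state
    y = np.asarray(alpha)[states] + rng.standard_normal(T)
    return y, states


y, states = simulate_markov_mean(5000, rng=np.random.default_rng(0))
print("observed switch rate:", np.mean(states[1:] != states[:-1]))
```

With a symmetric switching probability p, the autocorrelation of the regime indicator at lag k is $(1-2p)^k$, so the memory of the series decays exponentially even though regimes last 20 periods on average when p=0.05.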

All of these models generate I(0) series, in the sense that their memory decay is ultimately exponential, but they have tended to give rise to highly biased GPH estimates, even in quite large samples.

Table 4:

Bootstrap rejection rates in Nonlinear I(0) Alternatives (0.05 tests).

T | Bilinear | ESTAR | Markov Mean | Markov-AR

Our final set of experiments examines rejection rates (basic test only) under what we might call the “contaminated null hypothesis”, in other words, models in which the fractional differences of the process are autocorrelated. In such cases we have to consider resampling the differences using a bootstrap for dependent data. We consider two cases of the stationary bootstrap of Politis and Romano (1994) with exponential block length distributions with mean block lengths of 5 and 10 observations, and also the sieve-autoregressive method of Bühlmann (1997), where the lag length for the autoregression is chosen by the Akaike criterion up to a maximum of 10 lags. Including the asymptotic criterion, this makes for five test variants in all. Table 5 shows the results for models and sample sizes as in Table 2, but with an autoregressive component with ϕ=0.3. Thus, these are cases of the ARFIMA(1,d,0) class, $(1-\phi L)(1-L)^d y_t = \epsilon_t$.

Table 5:

Bootstrap rejection rates for ARFIMA(1,d,0) Models, ϕ =0.3 (0.05 tests).

Independent | Stationary (Block Mean 5) | Stationary (Block Mean 10) | Sieve AR

The asymptotic test acquits itself relatively well here, and the independent bootstrap fails seriously only in the smallest sample. Under-rejection in the larger samples is again a feature of the findings, with the poorest performance delivered by the stationary bootstrap with the block-length 10.
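The stationary bootstrap referred to in these experiments can be sketched as below. This is a generic illustration of the Politis and Romano (1994) scheme, assuming geometric block lengths (the discrete counterpart of the exponential distribution mentioned above) and circular wrapping at the end of the sample; it is not the authors' code, and the function name is hypothetical.

```python
import numpy as np


def stationary_bootstrap_indices(n, mean_block, rng):
    """Resampling indices with geometric block lengths of mean `mean_block`."""
    p = 1.0 / mean_block               # probability of starting a new block
    idx = np.empty(n, dtype=int)
    idx[0] = rng.integers(n)
    for t in range(1, n):
        if rng.random() < p:
            idx[t] = rng.integers(n)            # start a new block at a random point
        else:
            idx[t] = (idx[t - 1] + 1) % n       # continue the block, wrapping around
    return idx


rng = np.random.default_rng(0)
u = rng.standard_normal(1000)          # stands in for the fractional differences
u_star = u[stationary_bootstrap_indices(u.size, mean_block=5, rng=rng)]
print(u_star[:5])
```

In the procedure described in the text, the resampled differences would then be fractionally integrated with the estimated d to give a bootstrap series. Longer mean block lengths preserve more of the short-run dependence but leave fewer effectively independent blocks.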

9 Applications

We report two applications of the skip-sampling test. The first case considers a volatility measure for daily exchange rates of the British pound sterling against six currencies, with in excess of 9,500 observations covering the period January 1975 to October 2012 (source: Bank of England). The measure in question is the logarithm of the absolute value of daily appreciation (log-change) augmented by 0.005. Taking logarithms normalizes the distribution by alleviating asymmetry and excess kurtosis, while adding the small constant overcomes the problem of days when zero change was recorded, on which the volatility measure would otherwise be undefined.
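The volatility measure can be written in a few lines. The sketch below assumes that the daily appreciation is the raw first difference of the log exchange rate and that the constant 0.005 is added to its absolute value before taking logarithms; the function name and the sample rates are hypothetical.

```python
import numpy as np


def log_abs_volatility(rates, shift=0.005):
    """log(|daily log-change| + shift) for a series of exchange rates."""
    log_change = np.diff(np.log(np.asarray(rates, dtype=float)))
    return np.log(np.abs(log_change) + shift)


# Hypothetical rates; the second observation repeats the first (zero change).
rates = [1.500, 1.500, 1.515, 1.498]
print(log_abs_volatility(rates))
```

On a day with no change in the rate the measure equals log(0.005) rather than being undefined, which is the purpose of the added constant.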

Table 6:

Tests for long memory in exchange rate volatility.

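Currency | d̂ | Bootstrap p-values: Wald test | Skip-sample test | Bias test
Australian Dollar | 0.532 | 0 | 0.789 | 0.013
Canadian Dollar | 0.548 | 0 | 0.375 | 0.251
Danish Krone | 0.471 | 0 | 0.883 | 0
Japanese Yen | 0.414 | 0 | 0.987 | 0.278
New Zealand Dollar | 0.523 | 0 | 0.977 | 0
US Dollar | 0.429 | 0 | 0.319 | 0.144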

Table 6 reports in the first column the estimated d from GPH estimation with a bandwidth of $[T^{0.55}]$ where T is sample size. The succeeding columns show the bootstrap p-values for three tests: the usual Wald test (t-test) of significance of d, the skip-sampling test (skip period 8, bandwidth of $[T^{0.7}]$), and lastly the bias test of Davidson and Sibbertsen (2009) with a bandwidth of 0.92. The latter test has the null hypothesis of a pure fractional process, and tests for the presence of short-run autocorrelation in the fractional differences. The stationary block-bootstrap with a mean block-length of 5 was implemented with 299 bootstrap replications. Note that in these cases, given the significance test outcomes, the composite test could not return a result different from the simple skip-sampling test.

As can be seen, none of these skip-sampling tests leads to a rejection, so that the skip-sampling test reinforces the evidence from the Wald test that these series are long memory. The skip-sampling p-values tend to appear at the upper end of the unit interval, which is expected behaviour of the bootstrap in samples of this size, given the Monte Carlo findings. Even allowing for this distortion, however, the evidence in favour of the null hypothesis appears unequivocal.

Our second application is to the growth (log-change) in Robert Shiller’s S&P500 monthly real dividends series for January 1871–June 2012.[3] First, consider the sub-period starting in January 1946 (798 observations) with the results shown in the first row of Table 7. The estimation and test settings are the same here as for the previous example and, once again, note that the composite test cannot return different findings. This result suggests that the long memory indicated by the Wald test is spurious. However, we have a more direct check on this finding in the present case, by extending the sample. The result with the full set of 1,697 observations, starting in February 1871, is shown in the second row of Table 7, where the Wald test p-value falls emphatically in the non-rejection region of the 1-tailed test. This result shows how the biases in log-periodogram regression can persist in large samples, but also how the skip-sampling test offers the possibility of providing counter-evidence to this spurious significance.

Table 7:

Tests for long memory in dividend growth.

Dividend growth | d̂ | Bootstrap p-values: Wald test | Skip-sample test | Bias test

10 Conclusion

In this paper we have investigated the performance of a test for the null hypothesis of long memory, based on the self-similarity property of sequences with hyperbolic memory decay. The idea is to compare GPH log-periodogram estimators in original and skip-sampled versions of the data set. The aliasing phenomenon, which introduces an estimation bias in skip-samples, poses a problem for the implementation of this test, but a bias-corrected estimator permits the construction of an asymptotically pivotal statistic.

The use of a semiparametric method to construct the estimators and test statistic, with correspondingly slow convergence to the asymptote, inevitably poses a challenge for the implementation of the test. The bootstrap variant of the test performs relatively well, in spite of being implemented using semiparametric estimates of the null distribution. Nonetheless, in the Monte Carlo evaluations this combination of factors leads to under-rejection even in quite large samples. A bias reduction strategy such as the double bootstrap (Beran 1988) might alleviate this problem, at the cost of a large computational overhead, but in the settings where the test might be applied, under-rejection is a relatively benign problem, and the experiments indicate reasonable power properties. Alternative choices of test settings, such as GPH bandwidths and skip period, could prove helpful, although investigating these would take us beyond the scope of this study. Notwithstanding these qualifications, the test may prove a useful addition to the arsenal of diagnostic procedures for long memory models, beside the bias test of Davidson and Sibbertsen (2009), which compares log-periodogram estimates with different bandwidths, and the aggregation test of Ohanissian, Russell, and Tsay (2008).


A.1 Proof of Proposition 2.1

Let γk denote the kth autocovariance, defined by the identity


For the skip-sampled data with sampling period N, the autocovariances are $\gamma_{Nk}$ where


where the third equality makes use of the fact that cos(Nkω)=cos(Nkω+2πj), and the fourth one makes the change of variable λ=Nω and the substitution


A.2 Proof of Proposition 4.1



note first that


We obtain a formula for the derivative in the second term, and show that this is bounded in the limit. The terms of the form [7] depend on d because the data used to construct the sieve autoregressive estimates are the fractional differences of the measured data. Assume that p is fixed, let $z_t=(1-L)^d x_t$, and let $Z_0$ ($(T-p)\times p$) be the normalized data matrix whose columns are the vectors $z_j=(z_{p+1-j},\ldots,z_{T-j})$ for $j=1,\ldots,p$. Also, let $Z_j$ for $j=1,\ldots,p$ denote the matrix equal to $Z_0$ except that the jth column has been replaced by $z_0=(z_{p+1},\ldots,z_T)$. Then, note that the coefficients $\hat\phi_j$ in the autoregression of order p can be written using Cramer’s rule as


Let these elements define the $(p+1)\times 1$ vector $\hat\phi$ by also putting $\hat\phi_0=1$.
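The Cramer's rule representation of the autoregressive coefficients can be checked numerically. The sketch below assumes that the coefficient on lag j is the ratio of the determinant of $Z_0'Z_j$ to that of $Z_0'Z_0$, which is what Cramer's rule gives for the least squares normal equations; any common scaling such as a power of T cancels in the ratio. The function name is hypothetical.

```python
import numpy as np


def ar_coeffs_cramer(z, p):
    """AR(p) least squares coefficients via Cramer's rule.

    Returns (phi, Z0, z0), where Z0 holds lags 1..p and z0 is the regressand.
    """
    z = np.asarray(z, dtype=float)
    T = z.size
    z0 = z[p:]                                              # (z_{p+1}, ..., z_T)
    # Column j-1 holds the lag-j vector (z_{p+1-j}, ..., z_{T-j}).
    Z0 = np.column_stack([z[p - j: T - j] for j in range(1, p + 1)])
    det_base = np.linalg.det(Z0.T @ Z0)
    phi = np.empty(p)
    for j in range(p):
        Zj = Z0.copy()
        Zj[:, j] = z0                                       # replace column j by z0
        phi[j] = np.linalg.det(Z0.T @ Zj) / det_base
    return phi, Z0, z0


rng = np.random.default_rng(0)
z = rng.standard_normal(500)
phi, Z0, z0 = ar_coeffs_cramer(z, p=3)
ols = np.linalg.lstsq(Z0, z0, rcond=None)[0]
print(phi)
print(ols)
```

The two printed vectors agree up to rounding error. Writing the coefficients as ratios of determinants is what allows their derivatives with respect to d to be obtained from a determinant differentiation rule later in the proof.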

Now, let $Q(\theta)$ ($(p+1)\times(p+1)$) denote the real part of the Fourier matrix with elements $q_{rs}=\cos\theta(r-s)$ for $r,s=0,\ldots,p$. Setting $\theta_1=(\lambda+2\pi k)/N$ and $\theta_2=\lambda/N$, note that


where b is the $(p+1)$-vector having elements $b_0=T^{-p}|Z_0'Z_0|$ and $b_j=T^{-p}|Z_0'Z_j|$ for $j=1,\ldots,p$. In this notation we have


and it remains to evaluate the second right-hand side factor.
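The role of the matrix with elements $\cos\theta(r-s)$ is that a quadratic form in it reproduces the squared modulus of a lag polynomial evaluated on the unit circle. The following sketch checks this identity numerically for an arbitrary, purely illustrative coefficient vector; the function name is hypothetical.

```python
import numpy as np


def fourier_real_matrix(theta, p):
    """(p+1) x (p+1) matrix with elements cos(theta * (r - s)), r, s = 0..p."""
    r = np.arange(p + 1)
    return np.cos(theta * (r[:, None] - r[None, :]))


phi = np.array([1.0, -0.5, 0.2])       # illustrative coefficients only
theta = 0.7
p = phi.size - 1

quadratic_form = phi @ fourier_real_matrix(theta, p) @ phi
modulus_squared = abs(np.sum(phi * np.exp(1j * theta * np.arange(p + 1)))) ** 2
print(quadratic_form, modulus_squared)
```

The two numbers coincide because the imaginary parts of the cross terms cancel in pairs, leaving only the cosines.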

Start with the elements of the Zj matrices. Considering row t, let m denote the generic lag associated with a column of Zj. Using the argument from Tanaka (1999), Section 3.1, the derivatives with respect to d can be written as


where the last equality defines $\dot z_{t-m}$. We have from Magnus and Neudecker (1988, 149) that, for $j=0,\ldots,p$,

(defining $\dot b_j$) where the $\dot Z_j$ denote the matrices with elements $\dot z_{t-m}$, with the value of m defined as appropriate, according to the construction of $Z_j$. Letting $\dot b$ denote the vector with elements $\dot b_0$ and $\dot b_j$, for $j=1,\ldots,p$, we now have the result


Since $\{z_t\}$ is a weakly dependent process by hypothesis, the derivative process $\dot z_t = \partial z_t/\partial d$ is covariance stationary. It follows directly that, for every finite p, b converges in probability to a nonstochastic limit, depending on the autocovariances of $\{z_t\}$. From the fact that the derivative vector $\dot b = \partial b/\partial d$ converges in the same manner, and the Slutsky theorem, the proposition follows under the conditions stated.
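For reference, the two differentiation steps used in this argument can be written out as follows. This is a sketch under the assumption that the derivative of the fractional difference is obtained from the series expansion of $\log(1-L)$ and that the determinant result invoked is Jacobi's formula; the dot notation for derivatives with respect to d is chosen here for convenience, and the exact forms given in the cited sources may differ.

```latex
\begin{align*}
% Derivative of z_t = (1-L)^d x_t, using \log(1-L) = -\sum_{k \ge 1} L^k / k
\dot z_t = \frac{\partial z_t}{\partial d}
  &= \log(1-L)\,(1-L)^d x_t
   = -\sum_{k=1}^{\infty} \frac{z_{t-k}}{k}, \\[4pt]
% Jacobi's formula applied to the nonsingular matrix A = Z_0' Z_j
\frac{\partial |Z_0' Z_j|}{\partial d}
  &= |Z_0' Z_j| \,
     \operatorname{tr}\!\left[ (Z_0' Z_j)^{-1}
     \left( \dot Z_0' Z_j + Z_0' \dot Z_j \right) \right].
\end{align*}
```

The first line shows that the derivative is a linear filter of the weakly dependent process with square-summable weights 1/k, and the second shows that the derivative of each determinant is a continuous function of sample moments of the process and its derivative.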

Two simplifying assumptions have been made to reach this conclusion. The first is that $\dot z_t$ has been constructed as an infinite order moving average, whereas in practice the sums will be truncated, containing only the first $t-m$ terms. However, since the truncation affects at most a finite number of terms, this cannot change the value of the limit. Second, the lag length p has been assumed fixed. However, since $z_t$ is a weakly dependent process by hypothesis, the autocovariances are summable and hence negligible for lags exceeding some finite value. Letting p tend to infinity with T cannot change the distribution of B(N,d) beyond some point, since the additional elements of b and $\dot b$ have sums converging to zero as p increases.


We thank David Peel for helpful discussions on this problem, and an anonymous referee for perceptive comments which have materially improved the paper.


Agiakloglou, C., P. Newbold, and M. Wohar. 1993. “Bias in an Estimator of the Fractional Difference Parameter.” Journal of Time Series Analysis 14:235–46.

Baillie, R. T., T. Bollerslev, and H. O. Mikkelsen. 1996. “Fractionally Integrated Generalized Autoregressive Conditional Heteroscedasticity.” Journal of Econometrics 74:3–30.

Beran, R. 1988. “Prepivoting Test Statistics: A Bootstrap View of Asymptotic Refinements.” Journal of the American Statistical Association 83:687–97.

Bühlmann, P. 1997. “Sieve Bootstrap for Time Series.” Bernoulli 3:123–48.

Chambers, M. J. 1998. “Long Memory and Aggregation in Macroeconomic Time Series.” International Economic Review 39 (4), Symposium on Forecasting and Empirical Methods in Macroeconomics and Finance (Nov. 1998):1053–72.

Davidson, J. 1994. Stochastic Limit Theory: An Introduction for Econometricians. Oxford: Oxford University Press.

Davidson, J. 2004. “Moment and Memory Properties of Linear Conditional Heteroscedasticity Models, and a New Model.” Journal of Business and Economic Statistics 22 (1):16–29.

Davidson, J. 2009. “When Is a Time Series I(0)?” Chapter 13, pages 322–342, of The Methodology and Practice of Econometrics, a Festschrift for David F. Hendry, edited by J. Castle and N. Shephard. Oxford: Oxford University Press.

Davidson, J., and N. Hashimzade. 2009. “Type I and Type II Fractional Brownian Motions: A Reconsideration.” Computational Statistics and Data Analysis 53 (6):2089–106.

Davidson, J., and P. Sibbertsen. 2009. “Tests of Bias in Log-Periodogram Regression.” Economics Letters 102:83–6.

Diebold, F. X., and A. Inoue. 2001. “Long Memory and Regime Switching.” Journal of Econometrics 105:131–59.

Dufour, J.-M. 1997. “Some Impossibility Theorems in Econometrics with Applications to Structural and Dynamic Models.” Econometrica 65 (6):1365–87.

Faust, J. 1996. “Near Observational Equivalence and Theoretical Size Problems with Unit Root Tests.” Econometric Theory 12 (4):724–31.

Faust, J. 1999. “Conventional Confidence Intervals for Points on Spectrum Have Confidence Level Zero.” Econometrica 67 (3):629–37.

Gallant, A. R., and H. White. 1988. A Unified Theory of Estimation and Inference for Nonlinear Dynamic Models. Oxford: Basil Blackwell.

Geweke, J., and S. Porter-Hudak. 1983. “The Estimation and Application of Long-Memory Time Series Models.” Journal of Time Series Analysis 4:221–37.

Hassler, U. 2011. “Estimation of Fractional Integration under Temporal Aggregation.” Journal of Econometrics 162:240–7.

Hurvich, C. M., R. Deo, and J. Brodsky. 1998. “The Mean Squared Error of Geweke and Porter-Hudak’s Estimator of a Long Memory Time Series.” Journal of Time Series Analysis 19:19–46.

Kim, C. S., and P. C. B. Phillips. 2006. “Log-Periodogram Regression: The Nonstationary Case.” Cowles Foundation Discussion Paper 1587, Yale University.

Kreiss, J.-P., E. Paparoditis, and D. N. Politis. 2011. “On the Range of Validity of the Autoregressive Sieve Bootstrap.” Annals of Statistics 39:2103–30.

Magnus, J. R., and H. Neudecker. 1988. Matrix Differential Calculus with Applications in Statistics and Econometrics. Chichester: John Wiley & Sons.

Moulines, E., and P. Soulier. 1999. “Broad Band Log-Periodogram Estimation of Time Series with Long-Range Dependence.” Annals of Statistics 27:1415–39.

Ohanissian, A., J. R. Russell, and R. S. Tsay. 2008. “True or Spurious Long Memory? A New Test.” Journal of Business and Economic Statistics 26 (2):161–75.

Politis, D. N., and J. P. Romano. 1994. “The Stationary Bootstrap.” Journal of the American Statistical Association 89:1303–13.

Pötscher, B. M. 2002. “Lower Risk Bounds and Properties of Confidence Sets for Ill-Posed Estimation Problems with Applications to Spectral Density and Persistence Estimation, Unit Roots and Estimation of Long Memory Parameters.” Econometrica 70 (3):1035–65.

Robinson, P. 1995. “Log-Periodogram Regression of Time Series with Long-Range Dependence.” Annals of Statistics 23:1048–72.

Smith, J., and L. R. Souza. 2002. “Bias in the Memory Parameter for Different Sampling Rates.” International Journal of Forecasting 18:299–313.

Smith, J., and L. R. Souza. 2004. “Effects of Temporal Aggregation on Estimates and Forecasts of Fractionally Integrated Processes: A Monte-Carlo Study.” International Journal of Forecasting 20:487–502.

Souza, L. R. 2005. “A Note on Chambers’s Long Memory and Aggregation in Macroeconomic Time Series.” International Economic Review 46 (3):1059–62.

Tanaka, K. 1999. “The Nonstationary Fractional Unit Root.” Econometric Theory 15:549–82.

Velasco, C. 1999. “Non-Stationary Log-Periodogram Regression.” Journal of Econometrics 91:325–71.

Published Online: 2015-5-21
Published in Print: 2015-7-1

©2015 by De Gruyter