Published by De Gruyter January 28, 2017

Goodness-of-fit tests for random sequences incorporating several components

Tetiana O. Ianevych, Yuriy V. Kozachenko and Viktor B. Troshki


In this paper we construct goodness-of-fit tests that incorporate several components: the expectation and covariance function, for identification of a non-centered univariate random sequence, or the auto-covariances and cross-covariances, for identification of a centered multivariate random sequence. To construct the corresponding estimators and investigate their properties we use the theory of square Gaussian random variables.

MSC 2010: 60G15; 60G10

1 Introduction

When investigating various processes and phenomena in practice, we very often deal with random sequences, or time series. The latter term is used more frequently, but in this paper we use the former to stress the similarity with random processes.

There are many books and papers devoted to this topic. In particular, the classical books on statistical analysis of time series are written by Anderson [1], Box and Jenkins [2], and Brockwell and Davis [4]. Many of the goodness-of-fit tests in time series analysis are residual-based. For example, the classic portmanteau test of Box and Pierce [3] and its improvement by Ljung and Box [15] are based on the sample autocorrelations of the residuals. Chen and Deo [6] proposed some diagnostic tests based on a spectral approach of the residuals.

Within the model-based approach to time series analysis, estimated residuals are computed once a fitted model has been obtained from the data, and are then tested to see whether they behave like white noise. These tests require the computation of residuals from the fitted model, which can be quite tedious when the model does not have a finite-order autoregressive representation. Moreover, in such cases the residuals are not uniquely defined.

In this paper we use another approach, taken from the theory of stochastic processes. It is well known that random sequences, as well as processes, can be identified through their expectation and covariance function. So, to construct a goodness-of-fit test for a non-centered univariate stationary Gaussian sequence we need to aggregate the information about both of these components, and we do so with the help of quadratic forms. Centered multivariate random sequences are also of interest to us. In this case too, we can incorporate information on every component into the test through quadratic forms.

For the estimator construction we utilize the theory of square Gaussian random variables. This theory was developed in the works [10, 12, 13] for the investigation of stochastic processes. In the book by Buldygin and Kozachenko [5] the properties of the space of square Gaussian random variables were studied.

So, we first investigate the properties of quadratic forms of square Gaussian random variables raised to some power p, and then construct the tests. These tests do not require the computation of residuals, and they can be applied even in the case of infinite-order representations.

This paper continues a series of works. In the papers [14] and [9] we considered centered univariate random sequences and constructed goodness-of-fit tests using two types of statistics: one based on properties of the maximum of square Gaussian random variables, the other built using inequalities for square Gaussian random variables raised to the power p. Statistics using the maximum of square Gaussian random variables were constructed in [11] for non-centered univariate random sequences and for centered multivariate random sequences.

The paper consists of five sections and two annexes. Section 2 is devoted to the theory of square Gaussian random variables and contains the main definitions and results. Sections 3 and 4 apply the estimate obtained in Section 2 to the construction of different aggregated tests. The criterion in Section 3 is constructed for testing the aggregated hypothesis on the expectation and covariance function of a non-centered stationary Gaussian sequence. In Section 4 we consider centered Gaussian multivariate stationary sequences. The residual-based approach dominates in the multivariate case too; see, for example, the papers by Hosking [7, 8], Mahdi and McLeod [16] and the references therein. The goodness-of-fit test we construct is instead based on fitting the covariance function. Section 5 draws some conclusions. Some necessary mathematical calculations are relegated to the annexes at the end.

2 Square Gaussian random variables

Let $\Xi=\{\gamma_i : i\in I\}$ be a family of jointly Gaussian random variables for which $\mathsf{E}\gamma_i=0$ for all $i\in I$.

Definition 2.1 ([13])

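The space $SG_\Xi(\Omega)$ is the space of square Gaussian random variables if any element $\xi\in SG_\Xi(\Omega)$ can be presented as

$$\xi=\gamma^T\mathbf{A}\gamma-\mathsf{E}\gamma^T\mathbf{A}\gamma,\tag{1}$$

where $\gamma^T=(\gamma_1,\dots,\gamma_r)$, $\gamma_i\in\Xi$, $i=1,\dots,r$, and $\mathbf{A}$ is a real-valued matrix; or the element $\xi\in SG_\Xi(\Omega)$ is the mean-square limit of a sequence $\{\xi_n : n\ge 1\}$ of random variables of the form (1), that is,

$$\xi=\operatorname*{l.i.m.}_{n\to\infty}\xi_n.$$

It was proved by Buldygin and Kozachenko in [5] that $SG_\Xi(\Omega)$ is a linear space.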
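As a rough numerical illustration of the definition, the sketch below samples a random variable of the centered quadratic form $\gamma^T\mathbf{A}\gamma-\mathsf{E}\gamma^T\mathbf{A}\gamma$ for a centered Gaussian vector $\gamma$, which is how form (1) is read here. The covariance matrix `cov`, the matrix `A`, and the helper name `sample_square_gaussian` are illustrative assumptions and do not come from the paper; the centering uses the standard identity $\mathsf{E}\gamma^T\mathbf{A}\gamma=\operatorname{tr}(\mathbf{A}\Sigma)$.

```python
import numpy as np

def sample_square_gaussian(cov, A, size, rng):
    """Draw samples of xi = g^T A g - E[g^T A g] for g ~ N(0, cov)."""
    g = rng.multivariate_normal(np.zeros(cov.shape[0]), cov, size=size)
    quad = np.einsum("ni,ij,nj->n", g, A, g)   # g^T A g for each sample
    return quad - np.trace(A @ cov)            # E[g^T A g] = tr(A cov)

rng = np.random.default_rng(0)
cov = np.array([[1.0, 0.5], [0.5, 2.0]])       # hypothetical covariance of gamma
A = np.array([[1.0, -1.0], [0.0, 3.0]])        # any real matrix is allowed
xi = sample_square_gaussian(cov, A, size=200_000, rng=rng)
print(abs(xi.mean()))                          # should be close to 0
```

The sample mean is only approximately zero because of Monte Carlo error; the point of the sketch is that the variable is centered by construction.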

For the square Gaussian random variables the following results hold true.

Theorem 2.2 ([10])

Let $\xi^T=(\xi_1,\xi_2,\dots,\xi_d)$ be a random vector such that $\xi_i\in SG_\Xi(\Omega)$ and let $D$ be a symmetric semi-definite matrix. Then for all $0<|s|<1$ the following inequality is true:


where $R(y)=\frac{1}{\sqrt{1-y}}\exp\{-\frac{y}{2}\}$, $0\le y<1$.

Theorem 2.3

Let $\xi^T=(\xi_1,\xi_2,\dots,\xi_d)$ be a vector such that $\xi_i\in SG_\Xi(\Omega)$ and let $D$ be a symmetric semi-definite matrix. Then for any $0<|s|<1$ and $p>0$ the following inequality holds true:



Since the function $x\mapsto x^pe^{-x}$ attains its maximum on $(0,\infty)$ at $x=p$, for all $x>0$ and $p>0$ we have $x^pe^{-x}\le p^pe^{-p}$, or equivalently, $x^p\le p^pe^{-p}e^x$. Hence, for all $0<|s|<1$,


Here the function R comes from Theorem 2.2. ∎
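The elementary inequality $x^pe^{-x}\le p^pe^{-p}$ used in this proof can be checked numerically. The sketch below is only a sanity check on a finite grid, not a proof; the function names `lhs` and `sup_bound` and the chosen values of $p$ are illustrative.

```python
import math

def lhs(x, p):
    """Left-hand side x**p * exp(-x)."""
    return x**p * math.exp(-x)

def sup_bound(p):
    """Maximum of x**p * exp(-x) over x > 0, attained at x = p."""
    return p**p * math.exp(-p)

for p in (0.5, 1.0, 2.0, 3.7):
    grid = [0.01 * k for k in range(1, 5001)]      # x in (0, 50]
    worst = max(lhs(x, p) for x in grid)
    assert worst <= sup_bound(p) * (1 + 1e-12)     # small slack for rounding
    print(p, worst, sup_bound(p))
```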

Theorem 2.4

Let $\{\eta(m) : 1\le m\le M\}$, $M<\infty$, be a sequence of random variables that can be presented as quadratic forms of square Gaussian random variables, that is,


where $\xi^T(m)=(\xi_1(m),\dots,\xi_d(m))$ with $\xi_i(m)\in SG_\Xi(\Omega)$, and let $D_m$ be a symmetric semi-definite matrix. Then


for all $p>0$ and $\delta>\frac{pM^{1/p}}{\sqrt{2}}\Big(1+\sqrt{1+\frac{2}{p}}\Big)$, where $U(y)=2\sqrt{1+y}\,\exp\{-\frac{y}{2}\}$, $y>0$.


Let us consider $\varepsilon>0$, $0<|s|<1$ and $r\ge 1$. Applying Chebyshev's inequality, we obtain that


The Minkowski inequality and (3) imply that




where $\varphi(r)=r^{rp}a^r$, $r>0$, $a=\frac{Mp^p}{e^p\varepsilon^p}$.

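Let us investigate the behavior of the function $\varphi(r)=r^{rp}a^r$ for $r>0$. It reaches its minimum value at the point $r^*=\frac{\varepsilon}{pM^{1/p}}$ and

$$\varphi(r^*)=\exp\{-pr^*\}=\exp\Big\{-\frac{\varepsilon}{M^{1/p}}\Big\}.$$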

But since the Minkowski inequality is valid only for $r\ge 1$, we must ensure that $r^*\ge 1$. This is true if $\varepsilon\ge pM^{1/p}$.
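The minimization of $\varphi$ can be sanity-checked numerically. The sketch below assumes $\varphi(r)=r^{rp}a^r$ with $a=Mp^p/(e^p\varepsilon^p)$, as read from the text above, and compares the value at $r^*=\varepsilon/(pM^{1/p})$ with the closed form $\exp\{-\varepsilon/M^{1/p}\}$ that follows from setting the derivative of $\ln\varphi$ to zero. The function names and the numerical values of $p$, $M$ and $\varepsilon$ are illustrative assumptions.

```python
import math

def phi(r, p, M, eps):
    """phi(r) = r**(r*p) * a**r with a = M * p**p / (e**p * eps**p)."""
    a = M * p**p / (math.e**p * eps**p)
    return r**(r * p) * a**r

def r_star(p, M, eps):
    """Stationary point of log(phi): p*log(r) + p + log(a) = 0."""
    return eps / (p * M**(1.0 / p))

p, M, eps = 2.0, 5, 12.0                 # hypothetical values with eps >= p * M**(1/p)
rs = r_star(p, M, eps)
closed_form = math.exp(-eps / M**(1.0 / p))
print(rs, phi(rs, p, M, eps), closed_form)
```

With these values $r^*\ge 1$, so the restriction required by the Minkowski inequality is satisfied.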

So, we can minimize the right-hand side of inequality (5) with respect to $r$, assuming $\varepsilon\ge pM^{1/p}$, and obtain that


Denoting $\delta:=\frac{\sqrt{2}}{|s|}\varepsilon>0$ with $0<|s|<1$, we get


where $c=1+\frac{\sqrt{2}\,\delta}{M^{1/p}}$.

The function $f(s)$ on $0<|s|<1$ reaches its minimum at the points $s=\pm\big(1-\frac{1}{c}\big)$. Since this function is even, we only consider $s^*=1-\frac{1}{c}>0$. Since $c=1+\frac{\sqrt{2}\,\delta}{M^{1/p}}>1$ for all $\delta>0$, we have $0<s^*<1$. The function $f$ attains its minimum value at the point $s^*$, and it is equal to


Inequality (4) holds true for δ satisfying the relation


The last condition is fulfilled when $\delta>\frac{pM^{1/p}}{\sqrt{2}}\Big(1+\sqrt{1+\frac{2}{p}}\Big)$. ∎
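The lower bound on $\delta$ can be related to the restriction $\varepsilon\ge pM^{1/p}$ numerically. The sketch below assumes the readings $\varepsilon=|s|\delta/\sqrt{2}$, $c=1+\sqrt{2}\delta/M^{1/p}$ and $s^*=1-1/c$ of the proof above (several displayed formulas are missing from this copy, so these are assumptions), and checks that $\varepsilon$ evaluated at $s^*$ crosses $pM^{1/p}$ exactly at $\delta=\frac{pM^{1/p}}{\sqrt{2}}\big(1+\sqrt{1+2/p}\big)$. All function names are hypothetical.

```python
import math

def c_of(delta, p, M):
    return 1.0 + math.sqrt(2.0) * delta / M**(1.0 / p)

def s_star(delta, p, M):
    return 1.0 - 1.0 / c_of(delta, p, M)

def delta_threshold(p, M):
    return p * M**(1.0 / p) / math.sqrt(2.0) * (1.0 + math.sqrt(1.0 + 2.0 / p))

def eps_at_s_star(delta, p, M):
    # eps = |s| * delta / sqrt(2), evaluated at s = s*
    return s_star(delta, p, M) * delta / math.sqrt(2.0)

p, M = 2.0, 5                            # hypothetical values
d0 = delta_threshold(p, M)
for delta in (0.9 * d0, 1.1 * d0):
    # False below the threshold, True above it
    print(delta, eps_at_s_star(delta, p, M) >= p * M**(1.0 / p))
```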

3 Goodness-of-fit test for non-centered univariate sequences

The results obtained in the previous section can be used to construct tests of a joint hypothesis about the expectation and the covariance function of a non-centered univariate stationary Gaussian sequence.

Let us consider a stationary sequence $\{\gamma(n) : n\ge 1\}$ for which $\mathsf{E}\gamma(n)=a$ is its expectation and


is its covariance function. Hereinafter we consider stationarity in the strict sense.

We assume that we have N+M (N,M>0) consecutive observations of this random sequence. Let us consider the estimators for the expectation and covariance function as follows:


For every estimator above we can evaluate the quantities


Remark 3.1

Quantities (6) and (7) can be rewritten in the following form ($0\le m\le M-1$):




So, the following random variables are square Gaussian:




Let us define the vectors


For any semi-definite matrix $\mathbf{D}_m=(b_{ij}(m))_{i,j=1,2}$ the random variables


are actually the quadratic forms of square Gaussian random variables.

Example 3.2

Suppose that


(that is, 𝐃 is the identity matrix of order 2). Then




All necessary formulas for the terms of $\mathsf{E}\eta_N(m)$ and their derivations are included in Annex 1.

From now on we consider the particular case $\mathbf{D}_m=\mathbf{I}_2$ in relations (6)–(11). In this case the following theorem holds.

Theorem 3.3

Consider the stationary sequence $\{\gamma(n) : n\ge 1\}$ with $\mathsf{E}\gamma(n)=a$ and covariance function


Let the random variables $\eta_N(m)$ be defined by relations (6)–(11) with $D_m=I_2$. Then for fixed $p\ge 1$, $M<N$ with $M,N\in\mathbb{N}$, and $\delta>\sqrt{2}A_N+pM^{1/p}\Big(1+\sqrt{1+\frac{2}{p}}\Big)$,


where $U(y)=2\sqrt{1+y}\,\exp\{-\frac{y}{2}\}$, $y>0$, and



In our case, when $\mathbf{D}_m=\mathbf{I}_2$, the random variables


Let us use the following notation:




since for $a,b>0$

$$(a+b)^p\le\chi_p(a^p+b^p),\quad\text{where }\chi_p=\begin{cases}1, & 0<p<1,\\ 2^{p-1}, & p\ge 1.\end{cases}$$




$$(\chi_{p/2})^{1/p}\chi_{1/p}\le\sqrt{2}\quad\text{for }p\ge 1.$$

Denoting for simplicity


we will get


This implies the statement of the theorem. ∎

Remark 3.4

The desired property for $A_N$ is $A_N\to 0$ as $N\to\infty$. It is fulfilled for many covariance functions, for instance, for $B(m)=e^{-\lambda|m|}$, $m\ge 0$, $\lambda>0$.

Using inequality (12), we can construct the goodness-of-fit test.

Criterion 1

Let the null hypothesis $H_0$ state that for a non-centered Gaussian stationary sequence $\{\gamma(n) : n\ge 1\}$ the expectation is $a=a_0$ and the covariance function is $B(m)=B_0(m)$, $m\ge 0$. The alternative $H_a$ states the opposite. Let the random variables $\eta_N(m)$ be as defined in (6)–(11) with $\mathbf{D}_m=\mathbf{I}_2$. If for significance level $\alpha$, some fixed $p\ge 1$ and $M<N$, with $M,N\in\mathbb{N}$,


then the hypothesis $H_0$ should be rejected; otherwise it should be accepted. Here $\varepsilon_\alpha$ is a critical value that can be found from the equation




and taking into account the restriction $\varepsilon_\alpha>\sqrt{2}A_N+pM^{1/p}\Big(1+\sqrt{1+\frac{2}{p}}\Big)$.


The criterion follows from Theorem 3.3. ∎

Remark 3.5

The formulas for the evaluation of $\mathsf{E}\eta_N(m)$ can be found in Annex 1.

4 Goodness-of-fit tests for the centered multivariate random sequences

Inequality (4) can also be useful for testing hypotheses on the covariance function of centered multivariate random sequences.

Let us assume that the components of the multivariate random sequence $\gamma(n)$, $n\ge 1$, are jointly Gaussian, stationary (in the strict sense) sequences $\{\gamma^{(k)}(n) : n\ge 1,\ k=1,\dots,K\}$ for which $\mathsf{E}\gamma^{(k)}(n)=0$ and


is the covariance function of this multivariate sequence. It is worth mentioning that if $k=l$, then $B^{kk}$ is the ordinary autocovariance function of the $k$-th component, and when $k\ne l$, the $B^{kl}$ are the joint covariances, sometimes called cross-covariances. Hereinafter we shall use the term covariance function of the sequence $\gamma(n)$ for $B^{kl}(m)$.

We suppose that the sequence $\gamma(n)$ is observed at the points $1,2,\dots,N+M$ ($N,M\in\mathbb{N}$). As an estimator of the covariance $B^{kl}(m)$ we choose


The estimator $\hat{B}_N^{kl}(m)$ is unbiased for $B^{kl}(m)$:


The random variables


are square Gaussian since $\hat{B}_N^{kl}(m)$ can be presented as


where the matrix is


Let $\Delta_N(m)$ be the vector with components $\Delta_N^{kl}(m)$ and let $\mathbf{D}_m=(d_{ij}(m))_{i,j=1,2}$ be some semi-definite matrices. In this case we can construct a goodness-of-fit test for centered multivariate random sequences using the results of Theorem 2.4.

Criterion 2

Let the null hypothesis $H_0$ state that for the centered Gaussian stationary multivariate sequence $\gamma(n)=\{\gamma^{(k)}(n) : n\ge 1\}_{k=1,\dots,K}$ the covariance function equals $B^{kl}(m)=B_0^{kl}(m)$, $m\ge 0$, while the alternative $H_a$ states the opposite. If for some fixed $p>0$, $M<N$ with $M,N\in\mathbb{N}$, some significance level $\alpha$ and the corresponding critical value $\delta_\alpha$ ($\delta_\alpha>\frac{pM^{1/p}}{\sqrt{2}}\big(1+\sqrt{1+2/p}\big)$), which can be found from the equation


for any semi-definite matrices $\mathbf{D}_m$ and random variables $\eta_N(m)=\Delta_N^T(m)\mathbf{D}_m\Delta_N(m)$,


then the hypothesis $H_0$ should be rejected; otherwise it should be accepted.


The criterion follows from Theorem 2.4. ∎

Remark 4.1

The probability of type I error for Criterion 2 is less than or equal to α.
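Finding a critical value of this kind amounts to solving a scalar equation with a decreasing tail bound, which can be done by bisection. The sketch below is a generic solver under loudly stated assumptions: the displayed equation is missing from this copy, so the form $U(y)=2\sqrt{1+y}\exp\{-y/2\}$ and the argument $\sqrt{2}\,\delta/M^{1/p}$ passed to it are reconstructions and may differ from the paper. The names `critical_value`, `delta_lower` and `bound` are hypothetical.

```python
import math

def U(y):
    # Assumed form of the bound function (reconstructed, may differ from the paper).
    return 2.0 * math.sqrt(1.0 + y) * math.exp(-y / 2.0)

def delta_lower(p, M):
    """Lower restriction on the critical value stated in Criterion 2."""
    return p * M**(1.0 / p) / math.sqrt(2.0) * (1.0 + math.sqrt(1.0 + 2.0 / p))

def critical_value(alpha, p, M, bound):
    """Smallest delta >= delta_lower(p, M) with bound(delta) <= alpha (bound decreasing)."""
    lo = delta_lower(p, M)
    if bound(lo) <= alpha:
        return lo
    hi = 2.0 * lo
    while bound(hi) > alpha:             # expand until the bound drops below alpha
        hi *= 2.0
    for _ in range(200):                 # bisection keeps bound(lo) > alpha >= bound(hi)
        mid = 0.5 * (lo + hi)
        if bound(mid) > alpha:
            lo = mid
        else:
            hi = mid
    return hi

p, M, alpha = 2.0, 5, 0.05               # hypothetical settings
bound = lambda d: U(math.sqrt(2.0) * d / M**(1.0 / p))   # assumed argument of U
delta_alpha = critical_value(alpha, p, M, bound)
print(delta_alpha)
```

Passing the bound as a function keeps the solver usable if the actual tail bound in the criterion has a different argument or constant.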

Remark 4.2

The simplest choice is to take the matrix $\mathbf{D}_m$ to be the identity matrix of the corresponding order.

Example 4.3

Let us consider the 2-component (K=2) stationary centered Gaussian sequence






If we choose $\mathbf{D}=\mathbf{I}_4$, the identity matrix of order four, then $\eta_N(m)=\Delta_N^T(m)\mathbf{I}_4\Delta_N(m)$ and


The formulas needed for the evaluation of $\mathsf{E}(\Delta_N^{kl}(m))^2$ are included in Annex 2.
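To make the statistic of this example concrete, the following sketch computes $\eta_N(m)$ with the identity weight matrix, that is, the sum of squared deviations of the estimated covariances from the hypothesized ones. The estimator formula is missing from this copy, so the form $\hat{B}_N^{kl}(m)=\frac{1}{N}\sum_{n=1}^{N}\gamma^{(k)}(n)\gamma^{(l)}(n+m)$ used below is an assumption (it is the usual unbiased choice for centered sequences), as are the helper names and the white-noise test data.

```python
import numpy as np

def cross_cov_estimate(x, N, m):
    """K x K matrix of (1/N) * sum_{n=1..N} x_k(n) * x_l(n+m); x is centered, shape (N+M, K)."""
    return x[:N].T @ x[m:m + N] / N

def eta_identity(x, N, m, B0_m):
    """eta_N(m) with D = identity: sum over k, l of (B_hat^{kl}(m) - B0^{kl}(m))**2."""
    delta = cross_cov_estimate(x, N, m) - B0_m
    return float(np.sum(delta**2))

rng = np.random.default_rng(1)
N, M, K = 5000, 3, 2
x = rng.standard_normal((N + M, K))      # white noise: B0(0) = I, B0(m) = 0 for m > 0
B0 = [np.eye(K)] + [np.zeros((K, K))] * (M - 1)
etas = [eta_identity(x, N, m, B0[m]) for m in range(M)]
print(etas)                              # small values when the hypothesized B0 is correct
```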

5 Conclusions

In this paper we estimated the distribution of quadratic forms of square Gaussian random variables raised to the power p. This result made it possible to build criteria for testing a hypothesis on the expectation and covariance function of a non-centered univariate stationary Gaussian sequence and a hypothesis on the covariance function of a centered multivariate stationary Gaussian sequence.

Our test statistics are quite easy to compute and do not require the calculation of residuals from the fitted model. This is especially advantageous when the fitted model is not a finite order autoregressive model.

There is, of course, a lot of room for improvement of the tests. Comparison with other tests and finding the number N for which the null and a simple alternative hypothesis can be distinguished are also very important issues for further investigation.

Annex 1

This annex includes the calculations needed in Section 3. In particular, general formulas are given for $\mathsf{E}(\xi_N^{a}(m))^2$ and $\mathsf{E}(\xi_N^{B}(m))^2$.
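The calculations in the annexes rely on Isserlis' formula, which for a centered Gaussian vector gives $\mathsf{E}x_1x_2x_3x_4=\sigma_{12}\sigma_{34}+\sigma_{13}\sigma_{24}+\sigma_{14}\sigma_{23}$. The sketch below compares this identity with a Monte Carlo estimate; the covariance matrix $\sigma_{ij}=0.5^{|i-j|}$ and the function name are illustrative assumptions rather than quantities from the paper.

```python
import numpy as np

def isserlis_fourth_moment(S):
    """E[x1 x2 x3 x4] for a centered Gaussian vector with 4 x 4 covariance S."""
    return S[0, 1] * S[2, 3] + S[0, 2] * S[1, 3] + S[0, 3] * S[1, 2]

idx = np.arange(4)
S = 0.5 ** np.abs(idx[:, None] - idx[None, :])    # hypothetical covariance, S[i, j] = 0.5**|i-j|
rng = np.random.default_rng(2)
# Rows of x are centered Gaussian vectors with covariance S (via the Cholesky factor).
x = rng.standard_normal((400_000, 4)) @ np.linalg.cholesky(S).T
mc = float(np.mean(np.prod(x, axis=1)))
print(mc, isserlis_fourth_moment(S))              # Monte Carlo estimate vs. exact value 0.375
```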

For $\mathsf{E}(\xi_N^{a}(m))^2$:

We have


Using Isserlis' formula for the centered Gaussian random variables $\tilde{\gamma}(n)=\gamma(n)-a$, we obtain




For $\mathsf{E}(\xi_N^{B}(m))^2$:

We have




Using again Isserlis’ formula, we obtain that








Annex 2

In Section 4 we need to find the expectation of $(\Delta_N^{kl}(m))^2$ in order to calculate the value of $\mathsf{E}(\Delta_N^T(m)\mathbf{I}_4\Delta_N(m))$ (see formula (14)). The needed calculations are given below:


Using Isserlis’ formula for the centered Gaussian random variables, we obtain




Communicated by Enzo Orsingher

Funding statement: The third author’s research was supported by the Visegrad Scholarship Program-EaP (No. 51601704).


[1] Anderson T. W., The Statistical Analysis of Time Series, John Wiley & Sons, New York, 1971.

[2] Box G. E. P., Jenkins G. M. and Reinsel G. C., Time Series Analysis: Forecasting and Control, 4th ed., John Wiley & Sons, Hoboken, 2011.

[3] Box G. E. P. and Pierce D. A., Distribution of the residual autocorrelations in autoregressive integrated moving average time series models, J. Amer. Statist. Assoc. 65 (1970), 1509–1526.

[4] Brockwell P. J. and Davis R. A., Time Series: Theory and Methods, 2nd ed., Springer Ser. Statist., Springer, New York, 2009.

[5] Buldygin V. V. and Kozachenko Y. V., Metric Characterization of Random Variables and Random Processes, American Mathematical Society, Providence, 2000.

[6] Chen W. W. and Deo R. S., A generalized portmanteau goodness-of-fit test for time series models, Econometric Theory 20 (2004), no. 2, 382–416.

[7] Hosking J. R. M., The multivariate portmanteau statistic, J. Amer. Statist. Assoc. 75 (1980), 602–608.

[8] Hosking J. R. M., Lagrange-multiplier tests of multivariate time-series models, J. R. Stat. Soc. Ser. B. Stat. Methodol. 43 (1981), no. 2, 219–230.

[9] Ianevych T. O., An Lp-criterion for testing a hypothesis about the covariance function of a random sequence, Theory Probab. Math. Statist. 92 (2016), 163–173.

[10] Kozachenko Y. V. and Fedoryanych T. V., A criterion for testing hypotheses about the covariance function of a Gaussian stationary process, Theory Probab. Math. Statist. 69 (2004), 85–94.

[11] Kozachenko Y. V. and Ianevych T. O., Some goodness of fit tests for random sequences, Lith. Math. J. Stat. 52 (2013), no. 1, 5–13.

[12] Kozachenko Y. V. and Stadnik A. I., Pre-Gaussian processes and convergence in C(T) of estimators of covariance function, Theory Probab. Math. Statist. 45 (1991), 51–57.

[13] Kozachenko Y. V. and Stus O. V., Square-Gaussian random processes and estimators of covariance functions, Math. Commun. 3 (1998), no. 1, 83–94.

[14] Kozachenko Y. V. and Yakovenko T. O., Criterion for testing the hypothesis about the covariance function of the stationary Gaussian random sequence (in Ukrainian), Bull. Uzhgorod Univ. Ser. Math. Inform. 20 (2010), 39–43.

[15] Ljung G. M. and Box G. E. P., On a measure of lack of fit in time series models, Biometrika 65 (1978), no. 2, 297–303.

[16] Mahdi E. and McLeod A. I., Improved multivariate portmanteau test, J. Time Series Anal. 33 (2012), no. 2, 211–222.

Received: 2016-10-2
Accepted: 2017-1-15
Published Online: 2017-1-28
Published in Print: 2017-3-1

© 2017 by De Gruyter