Realized BEKK-CAW Models

Estimating time-varying conditional covariance matrices of financial returns plays an important role in portfolio analysis, risk management, and financial econometrics research. The availability of high-frequency financial data provides an additional data source for dynamic covariance modeling. In this paper, we propose to use the information in the asset return vector and realized covariance measures simultaneously to develop a new conditional covariance matrix model. We derive the stationary condition of the new model. We use the normal and Wishart distributions to construct the quasi-log-likelihood function. We also consider the variance targeting (VT) method, which plugs in the weighted average of the sample covariance matrix of returns and the sample mean of the realized covariance measure for the unconditional covariance matrix, in order to maximize the quasi-log-likelihood function. We show the consistency and asymptotic normality of the quasi-maximum likelihood (QML) and VT estimators. We investigate the finite-sample properties of these estimators via Monte Carlo experiments. The empirical example for the bivariate data of the Nikkei 225 index and its futures indicates that the first-step VT estimation could have non-negligible effects on the standard errors of the second-step VT estimates.


Introduction
For two decades, realized volatility measures have been a popular tool for analyzing financial time series. Engle and Gallo (2006), Hansen, Huang, and Shek (2012), and Shephard and Sheppard (2010), among others, have extended the class of generalized autoregressive conditional heteroskedasticity (GARCH) models using information such as the range, squared returns, and realized measures of volatility. As the information on returns and realized volatility measures is available simultaneously, Hansen, Huang, and Shek (2012) developed the "realized GARCH" model, which is based on the traditional returns equation and an additional equation of a realized measure. Other studies using realized volatility measures for univariate volatility modeling include So and Xu (2013), Takahashi, Omori, and Watanabe (2009), and Takahashi, Watanabe, and Omori (2016). Recent developments in volatility modeling can be found in So et al. (2022).
In this paper, we propose a new conditional covariance model for return vectors and realized covariance measures, using the Wishart distribution as in Golosnoy, Gribisch, and Liesenfeld (2012) and Gorgi et al. (2019), from which we develop the QML estimator of unknown parameters. Our new model is based on the CAW model of Golosnoy, Gribisch, and Liesenfeld (2012) and the BEKK specification, named after Baba et al. (1985) (see Engle and Kroner 1995). Since our new model accommodates realized covariance measures and cross products of return vectors to construct the conditional covariance matrix, the statistical properties of the model are unclear. The first purpose of the paper is to develop the new model and examine the stationary condition. The second purpose is to show the consistency and asymptotic normality of the QML estimator.
Since our new model uses the information on return vectors and realized covariance matrix measures, we also consider the weighted average of the sample covariance matrix of the former and the sample mean of the latter to adopt the variance targeting (VT) estimation approach as in the univariate and multivariate GARCH framework (see Francq, Horváth, and Zakoïan 2011; Pedersen and Rahbek 2014). We obtain the VT estimator as an alternative to the QML estimator by maximizing the quasi-log-likelihood function given the estimate of the unconditional covariance matrix. We show the consistency and asymptotic normality of the VT estimator, and we examine how the choice of the weight affects the asymptotic covariance matrix of the VT estimator.
The remainder of the paper is organized as follows. Section 2 presents the development of the new realized model and examines the stationary condition. Section 3 explains the QML and VT estimation and shows the consistency and asymptotic normality of the QML and VT estimators. Section 4 provides an empirical example of the bivariate data for the Nikkei 225 index and its futures. Section 5 offers some concluding remarks. All proofs are given in the Appendix.

Realized BEKK-CAW Models
Consider a sequence of $T$ observations of a $d$-dimensional vector process $X_t$ and a $d$-dimensional positive-definite matrix process $W_t$, which have the following new structures (1) (2) where Kroner (1995). Under the assumption that $U_t$ follows the Wishart distribution, Eqs. (2) and (3) with $A = O$ constitute the CAW model of Golosnoy, Gribisch, and Liesenfeld (2012). As the new model encompasses both specifications, we refer to Eqs. (1)-(3) as the "realized BEKK-CAW" model.
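As an illustration, the following Python sketch simulates one path of returns and realized covariance measures. It assumes a particular reading of Eqs. (1)-(3), namely $X_t = H_t^{1/2} Z_t$, $W_t = H_t^{1/2} U_t H_t^{1/2}$, and $H_t = \Gamma + A X_{t-1} X_{t-1}' A' + B H_{t-1} B' + C W_{t-1} C'$, with $Z_t \sim N(0, I_d)$ and $U_t$ Wishart with mean $I_d$. The intercept symbol $\Gamma$, the use of a Cholesky factor for $H_t^{1/2}$, the restriction to integer degrees of freedom, and the function name are assumptions made for this example only.

```python
import numpy as np

def simulate_realized_bekk_caw(T, Gamma, A, B, C, nu, H0, seed=0):
    """Simulate (X_t, W_t, H_t) under the ASSUMED recursion
    H_t = Gamma + A X X' A' + B H B' + C W C' (hypothetical helper)."""
    rng = np.random.default_rng(seed)
    d = Gamma.shape[0]
    X = np.zeros((T, d))
    W = np.zeros((T, d, d))
    Hs = np.zeros((T, d, d))
    H = H0.copy()
    for t in range(T):
        L = np.linalg.cholesky(H)           # assumed choice of H_t^{1/2}
        z = rng.standard_normal(d)          # Z_t ~ N(0, I_d)
        G = rng.standard_normal((d, nu))    # integer nu >= d assumed
        U = G @ G.T / nu                    # Wishart draw with mean I_d
        X[t] = L @ z
        W[t] = L @ U @ L.T                  # conditional mean equals H_t
        Hs[t] = H
        # update the conditional covariance for the next period
        H = (Gamma + A @ np.outer(X[t], X[t]) @ A.T
             + B @ H @ B.T + C @ W[t] @ C.T)
    return X, W, Hs
```

Setting `A` to a zero matrix removes the return cross-product term, which corresponds to the CAW special case mentioned above.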
, and $w_t = \mathrm{vech}(W_t)$, in order to derive where = vech( ), (5) for any symmetric matrix $S$. Based on Eq. (4), we consider a 2d-vector and an error term t = $s_t$ − ′, which provides E( t) = 0. Then Eq. (4) yields a VARMA(1,1) representation for $s_t$, as where We make the following two assumptions for the strict stationarity of $s_t$.
Assumption 1. The matrices $A$, $B$, and $C$ satisfy the following condition:
Assumption 2. The distribution of $u_t = \mathrm{vech}(U_t)$ is absolutely continuous with respect to the Lebesgue measure on $\mathbb{R}^{d^*}$, and the mean vector is located at an interior point of the support of the distribution.
Remark 2.1. Assumption 1 guarantees the existence of the unconditional mean of $s_t$.
As a related issue, a tedious and cumbersome calculation gives det( We can show the strict stationarity of $s_t$ in the following way. Considering $Z_t$ and mean subtracted random vectors, $u_{t-1} - \mathrm{vech}(I_d)$ and vech ) , we can apply an argument similar to Theorem 2.4 in Boussama, Fuchs, and Stelzer (2011) for a Markov chain constructed by the BEKK-type specification (see also Section 2.4 of Doukhan 1994). Assumptions 1 and 2 imply the existence of a unique strictly stationary and ergodic solution to the realized BEKK-CAW model in Eqs. (1)-(3). Furthermore, the stationary solution guarantees that the sec- where $\Omega$ is the mean of $W_t$ and the covariance matrix of $X_t$. Then the solution is given by Note that we can show that the expected value of $s_t$ based on Eq. (6) provides the same solution, = + A* + B* + C* . Equation (7) yields an alternative form of Eq. (3) as We refer to Eq. (8) as the VT representation, since $\Omega$ is the covariance matrix of $X_t$ and the expected value of the realized covariance measure, $W_t$. The VT representation is in line with the works of Francq, Horváth, and Zakoïan (2011) and Pedersen and Rahbek (2014) in the GARCH literature.
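The idea behind the VT representation can be sketched numerically. The code below assumes that Eq. (3) has the form $H_t = \Gamma + A X_{t-1} X_{t-1}' A' + B H_{t-1} B' + C W_{t-1} C'$, so that taking expectations with $E(X_t X_t') = E(W_t) = E(H_t) = \Omega$ gives $\Gamma = \Omega - A \Omega A' - B \Omega B' - C \Omega C'$. The spectral-radius check on $A \otimes A + B \otimes B + C \otimes C$ is a standard sufficient condition for this mean to exist in BEKK-type recursions; it is used here as an assumption and is not claimed to be the exact condition in Assumption 1. All function names are hypothetical.

```python
import numpy as np

def kron_sum(A, B, C):
    """A (x) A + B (x) B + C (x) C, acting on column-stacked vec(Omega)."""
    return np.kron(A, A) + np.kron(B, B) + np.kron(C, C)

def spectral_radius(A, B, C):
    """Largest eigenvalue modulus of the Kronecker sum (assumed check)."""
    return float(np.max(np.abs(np.linalg.eigvals(kron_sum(A, B, C)))))

def vt_intercept(Omega, A, B, C):
    """Intercept implied by targeting the unconditional covariance Omega."""
    return Omega - A @ Omega @ A.T - B @ Omega @ B.T - C @ Omega @ C.T

def unconditional_mean(Gamma, A, B, C):
    """Solve vec(Omega) = (I - M)^{-1} vec(Gamma) with M the Kronecker sum."""
    d = Gamma.shape[0]
    M = kron_sum(A, B, C)
    vec_omega = np.linalg.solve(np.eye(d * d) - M, Gamma.reshape(-1, order="F"))
    return vec_omega.reshape(d, d, order="F")
```

Applying `unconditional_mean` to the output of `vt_intercept` should recover `Omega` whenever the spectral radius is below one, which is the round trip that variance targeting relies on.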

Quasi-Maximum Likelihood and Variance Targeting Estimators
We consider two estimators based on the quasi-log-likelihood function of the proposed realized BEKK-CAW model in Eqs. (1)-(3).
Denote the parameter vector by $\theta = (\omega', \lambda')'$, where $\lambda = (\mathrm{vec}(A)', \mathrm{vec}(B)', \mathrm{vec}(C)')'$. To emphasize the dependence on the parameter vector $\theta$ and the initial value $H_0 = m$, we write $H_{t,m}(\theta)$ for the VT representation (8). Considering $Z_t \sim N(0, I_d)$ and the Wishart distribution with the scale matrix $V$ and the degree-of-freedom parameter $\nu$, we construct the quasi-log-likelihood function in Eq. (9), where $\Gamma_d(x)$ is the multivariate gamma function defined by $\Gamma_d(x) = \pi^{d(d-1)/4} \prod_{i=1}^{d} \Gamma(x + (1 - i)/2)$. Then the QML estimator is given by $(\hat\theta, \hat\nu) = \arg\max_{\theta \in \Theta, \nu \in \Theta_\nu} L_{T,m}(\theta, \nu)$. (10) Note that $\theta$ is the parameter vector for the realized BEKK-CAW model (1), (2), and (8), while $\nu$ is the parameter from the assumption of the Wishart distribution.
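A minimal sketch of one observation's contribution to such a quasi-log-likelihood is given below. It assumes a Gaussian density for $X_t$ with covariance $H_t$ and a Wishart density for $W_t$ with $\nu$ degrees of freedom and scale matrix $H_t/\nu$, so that the conditional mean of $W_t$ is $H_t$. All constants are kept, so the normalization may differ from that of Eq. (9); the function names are hypothetical.

```python
import math
import numpy as np

def multigammaln(x, d):
    """Log of the multivariate gamma function Gamma_d(x)."""
    return (d * (d - 1) / 4.0 * math.log(math.pi)
            + sum(math.lgamma(x + (1 - i) / 2.0) for i in range(1, d + 1)))

def quasi_loglik_t(x, W, H, nu):
    """Gaussian log density of x plus Wishart log density of W,
    under the ASSUMED scale matrix H / nu."""
    d = len(x)
    _, logdet_H = np.linalg.slogdet(H)
    _, logdet_W = np.linalg.slogdet(W)
    H_inv = np.linalg.inv(H)
    normal = -0.5 * (d * math.log(2.0 * math.pi) + logdet_H + x @ H_inv @ x)
    wishart = (0.5 * nu * d * math.log(nu / 2.0)
               - multigammaln(nu / 2.0, d)
               - 0.5 * nu * logdet_H
               + 0.5 * (nu - d - 1) * logdet_W
               - 0.5 * nu * np.trace(H_inv @ W))
    return float(normal + wishart)
```

Averaging `quasi_loglik_t` over $t = 1, \ldots, T$, with $H_t$ generated recursively from the parameters, gives an objective that a numerical optimizer can maximize over the parameters and $\nu$.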
For a non-Wishart variable $U_t$, Proposition 1 of Asai and So (2021) gives the condition for the existence of the pseudo-true value $\nu_0$, which is the target of $\hat\nu$. We make the following assumption.
and $\psi_d(x)$ is the multivariate digamma function defined by $\psi_d(x) = \partial \log \Gamma_d(x) / \partial x = \sum_{i=1}^{d} \psi(x + (1 - i)/2)$. Asai and So (2021) show that $E[l_{t,m}(\theta_0, \nu)]$ has a unique maximum at $\nu_0$.
Rather than maximizing the quasi-log-likelihood function directly, we can develop a concentrated function to reduce the number of parameters to be estimated in one step. For this purpose, we use the VT approach. Two candidate estimators for $\omega$ are given by $\bar\xi = T^{-1} \sum_{t=1}^{T} \xi_t$ and $\bar w = T^{-1} \sum_{t=1}^{T} w_t$, where $\xi_t$ and $w_t$ are stated in Eq. (4). For a predetermined real value $q$, define $\hat\omega(q) = q \bar\xi + (1 - q) \bar w$, (12) which is the weighted average of the two estimators. We use $\hat\omega(q)$ for our VT estimator, and we will discuss the optimal weight, $q$, later. We replace $\omega$ with $\hat\omega(q)$ in Eq. (9), yielding our VT estimator for $(\lambda, \nu)$, as $(\hat\lambda, \hat\nu) = \arg\max_{\lambda \in \Theta_\lambda, \nu \in \Theta_\nu} L_{T,m}(\hat\omega(q), \lambda, \nu)$. (13) In the following two subsections, we examine the asymptotic properties of the QML and VT estimators.
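The first-step estimator in Eq. (12) can be sketched as follows. The code assumes that $\xi_t = \mathrm{vech}(X_t X_t')$ and $w_t = \mathrm{vech}(W_t)$, with vech stacking the lower-triangular part column by column; the function names are hypothetical.

```python
import numpy as np

def vech(M):
    """Stack the lower-triangular part of a square matrix column by column."""
    d = M.shape[0]
    return np.concatenate([M[j:, j] for j in range(d)])

def vt_first_step(X, W, q):
    """Weighted average q * xi_bar + (1 - q) * w_bar of the two candidate
    estimators of omega (xi_t = vech(X_t X_t') is an assumption)."""
    xi_bar = np.mean([vech(np.outer(x, x)) for x in X], axis=0)
    w_bar = np.mean([vech(Wt) for Wt in W], axis=0)
    return q * xi_bar + (1.0 - q) * w_bar
```

With $q = 1$ the estimator uses only return cross products, and with $q = 0$ it uses only the realized covariance measures; values outside $[0, 1]$ are also permitted because $q$ is any real number.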

Asymptotic Property of QML Estimator
Assumptions 1 and 2 imply the existence of a strictly stationary ergodic solution $\{X_t, W_t\}$, while Assumption 3 guarantees the existence of $\nu_0$, which gives a unique maximum. Since empirical analyses in this field use realized covariance matrices based on high-frequency returns, we need to wait for further research on the link between the assumption of a strictly stationary ergodic process and an estimated covariance matrix. We make the following assumption.
Assumption 4(b) covers the identification issue. One identification condition is that the first element in each of the matrices $A$, $B$, and $C$ should be strictly positive, which is a sufficient condition for parameter identification for the BEKK-type structure, as shown in Engle and Kroner (1995). Assumption 4(b) is from Comte and Lieberman (2003), Hafner and Preminger (2009), and Pedersen and Rahbek (2014) for the BEKK-GARCH model, and Asai and So (2021) use the assumption for the CAW model.
The standard arguments found in Theorem 2.1 of Newey and McFadden (1994) establish the consistency of the QML estimator.
We make the following assumption for the asymptotic normality of the QML estimator.
(c) The initial value $H_0$ is drawn from the stationary ergodic distribution.
Remark 3.3. Assumption 5(a) corresponds to the sixth moment condition for the multivariate GARCH models given by Hafner and Preminger (2009), Ling and McAleer (2003), and Pedersen and Rahbek (2014), in order to guarantee the existence of the expected value of the second derivative of the log-likelihood function. To relax Assumption 5(c) to allow for an arbitrary initial value $H_0$, we may use arguments similar to those of either Comte and Lieberman (2003) or Jensen and Rahbek (2004).
Proposition 1. Under Assumptions 1-5, as $T \to \infty$: , where $J_0$ is the nonsingular matrix defined in Eq. (14), with the multivariate trigamma function defined by $\psi_d'(x) = \sum_{i=1}^{d} \psi'(x + (1 - i)/2)$ and the positive semidefinite matrix $\Sigma_0$ stated in Eq. (A.9).
Remark 3.4. If we further assume that the model is correctly specified, i.e.
. The block diagonal structure implies the independence of the asymptotic distributions of $\hat\theta$ and $\hat\nu$.
We can estimate $\Sigma_0$, $P_0$, and $J_0$ using the sample outer product of the gradient and the Hessian matrices, as in Eq. (15).
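A generic sketch of this kind of estimator is shown below. It assumes the usual QML sandwich form $J^{-1} \Sigma J^{-1}$, with $\Sigma$ estimated by the sample outer product of per-observation gradients and $J$ by the sample average Hessian. The array names and functions are hypothetical, and the exact definitions in Eq. (15) may differ.

```python
import numpy as np

def sandwich_covariance(scores, hessian):
    """J^{-1} Sigma J^{-1}' with Sigma the outer product of the gradient.

    scores  : (T, k) array, one gradient vector per observation (assumed input)
    hessian : (k, k) array, sample average of the second derivatives
    """
    T = scores.shape[0]
    Sigma_hat = scores.T @ scores / T
    J_inv = np.linalg.inv(hessian)
    return J_inv @ Sigma_hat @ J_inv.T

def standard_errors(scores, hessian):
    """Standard errors from the sandwich covariance scaled by 1/T."""
    T = scores.shape[0]
    return np.sqrt(np.diag(sandwich_covariance(scores, hessian)) / T)
```

When the information matrix equality holds, the outer product and the negative Hessian coincide and the sandwich collapses to the inverse information matrix, which is the case examined by the IM test in the empirical section.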

Asymptotic Property of VT Estimator
We first introduce the consistency and asymptotic normality of ω(q).
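Remark 3.5. We can find a theoretically optimal $q$ by solving dtr ( for the partitioned matrix ) .
Define the sample covariance matrix of $s_t$ as $S$, with blocks $S_{\xi\xi}$, $S_{\xi w}$, and $S_{ww}$ corresponding to $\xi_t$ and $w_t$. Then, we can use $\hat q^\star = \mathrm{tr}(S_{ww} - S_{\xi w}) / \mathrm{tr}(S_{\xi\xi} - 2 S_{\xi w} + S_{ww})$ (16) for a candidate of $q$.
Remark 3.6. The idea of optimal $q$ corresponds to the weight of the minimum variance portfolio (MVP) in financial analysis. Consider a portfolio of two assets with the variances $\mathrm{tr}(S_{\xi\xi})$ and $\mathrm{tr}(S_{ww})$, and the covariance $\mathrm{tr}(S_{\xi w})$, with weight $(q, 1 - q)$, where $q$ can take any real value. Then we obtain the portfolio variance as $q^2 \mathrm{tr}(S_{\xi\xi}) + 2q(1 - q) \mathrm{tr}(S_{\xi w}) + (1 - q)^2 \mathrm{tr}(S_{ww})$. In this case the weight of the MVP is given by (16). Except for the case that the correlation coefficient, $\mathrm{tr}(S_{\xi w}) / \sqrt{\mathrm{tr}(S_{\xi\xi}) \mathrm{tr}(S_{ww})}$, is equal to one, the portfolio variance is always less than $\mathrm{tr}(S_{\xi\xi})$ and $\mathrm{tr}(S_{ww})$. Note that a negative value of $q^\star$ or $(1 - q^\star)$ refers to short selling in portfolio management and can help reduce portfolio variance.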
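The MVP analogy can be made concrete with a short sketch. The first function applies the two-asset formula to given trace values; the second computes the traces from data, assuming that the rows of `xi` and `w` hold $\xi_t$ and $w_t$. The function names are hypothetical, and `np.cov` divides by $T - 1$, which affects the scale of the reported variance but not the weight.

```python
import numpy as np

def mvp_weight(v_xi, v_w, c):
    """Weight q on the first asset that minimizes
    q^2 v_xi + 2 q (1 - q) c + (1 - q)^2 v_w, and the minimized variance."""
    q = (v_w - c) / (v_xi - 2.0 * c + v_w)
    var = q**2 * v_xi + 2.0 * q * (1.0 - q) * c + (1.0 - q)**2 * v_w
    return q, var

def q_star_from_data(xi, w):
    """Estimate q from (T, d*) arrays of xi_t and w_t via block traces."""
    k = xi.shape[1]
    S = np.cov(np.hstack([xi, w]), rowvar=False)
    return mvp_weight(np.trace(S[:k, :k]), np.trace(S[k:, k:]), np.trace(S[:k, k:]))
```

For example, with variances 4 and 1 and covariance 0.5, `mvp_weight(4.0, 1.0, 0.5)` gives a weight of 0.125 and a variance of 0.9375, below both individual variances.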
To show the consistency of $(\hat\lambda, \hat\nu)$, we can apply an argument in the proof of Theorem 4.1 of Pedersen and Rahbek (2014), since they provide an approach to show the consistency of the VT estimator based on Theorem 2.1 of Newey and McFadden (1994). See also Section 6 in Newey and McFadden (1994) for two-step estimation.
To show the asymptotic normality of the VT estimator, we can use the same assumptions as in Proposition 1.
Remark 3.7. The first-step estimator $\hat\omega(q)$ affects the asymptotic covariance matrix of $\hat\lambda$ via On the other hand, the asymptotic variance of $\hat\nu$ is unaffected by the structure.

Monte Carlo Experiments
We conduct Monte Carlo experiments in order to investigate the finite-sample properties of the QML and VT estimators for the realized BEKK-CAW model. The first experiment is on the ML estimation based on the bivariate data generating process (DGP) with the true parameters reported in Table 1 and $\nu_0 = 10$, under the correctly specified distributions, $Z_t \sim N(0, I_d)$ and the Wishart distribution for $U_t$. The second experiment examines an implication of Remark 3.4, that is, the asymptotic independence of $\hat\theta$ and $\hat\nu$. For this purpose, we use the DGP with $\nu_0 = 10$ as above, but we fix $\nu = 7$ to obtain QML estimates. The third experiment investigates the effects of changing $q$ for the VT estimation. We set two sample sizes, $T = 250$ and $T = 500$. The number of replications is 2,000, and we compute the sample means, standard deviations, and root mean squared errors (RMSEs).
Table 1 shows the results for the ML estimator. For $T = 250$, each sample mean except for those of $(\omega_{11}, \omega_{21}, \omega_{22})$ is close to the true value. On the other hand, the ML estimators for $(\omega_{11}, \omega_{21}, \omega_{22})$ display upward bias for this sample size. Compared with $T = 250$, the results for $T = 500$ have smaller biases, standard deviations, and RMSEs in most cases. Furthermore, the standard deviations, except for those of $(\omega_{11}, \omega_{21}, \omega_{22})$, are close to their RMSEs, implying that their biases are negligible. Figure 1 shows the histograms and QQ plots for the ML estimates of $\omega_{11}$, $a_{11}$, $b_{11}$, $c_{11}$, and $\nu$ for $T = 250$. The distributions of $\hat\omega_{11}$ and $\hat\nu$ are right-skewed, whereas those of $\hat a_{11}$, $\hat b_{11}$, and $\hat c_{11}$ are left-skewed. Figure 2 displays the histograms and QQ plots for $T = 500$. The pattern of skewness is the same as that in Figure 1, but Figure 2 shows that the distributions are closer to a normal distribution than those of Figure 1.
Next we consider the QML estimation, especially under an incorrect $\nu$. Remark 3.4 implies that the estimate of $\nu$ has negligible effects on the consistency and asymptotic covariance matrix of $\hat\theta$, if we use the correct distributions for $Z_t$ and $U_t$. By the theoretical results in Section 3, we may use a fixed value for $\nu$ to estimate $\theta$. To examine the effect, we use $\nu = 10$ as in Table 1 for data generation, while we set $\nu = 7$ for the estimation. Table 2 reports the results for the QML estimator with fixed $\nu$. The values in Table 2 are similar to those of Table 1. Figure 3 displays the histograms and QQ plots for $T = 250$ with fixed $\nu$. The features in Figure 3 are the same as in Figure 1. The result indicates that we can ignore the effect of $\nu$ for estimating $\theta$ even though it is incorrect.
Thirdly, we investigate the property of the VT estimator. Remark 3.5 discusses an optimal $q$ which minimizes $\mathrm{tr}(QSQ)$. Table 3 shows the sample mean and standard deviation for $\hat q^\star$ based on $\nu = \{7, 10, 15\}$ with the parameter values specified above for the data generation. The result indicates that the sample mean is close to $1/(\nu + 1)$, which is implied by the ratio of the traces under the true log-likelihood function (9). Table 4 presents the finite sample results for the VT estimator of $\omega$, $\hat\omega(q)$, under $T = 500$. We use (i) the correctly specified $q$, as $q = 1/(\nu + 1)$; (ii) incorrect $q$, as $q = \nu/(\nu + 1)$; and (iii) the estimate of optimal $q$, as $q = \hat q^\star$. While the incorrect $q$ increases the standard deviations and the RMSEs, the estimates of optimal $q^\star$ provide results that are comparable to those based on the true $q$. The result implies that we can use $\hat q^\star$ in our empirical analysis.
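One heuristic way to see where the value $1/(\nu + 1)$ can come from is sketched below. It assumes, as a simplification, that the covariance between the two candidate estimators is negligible and that the variance of $w_t$ is $1/\nu$ times that of $\xi_t$, as holds for Gaussian returns and a Wishart realized measure sharing a fixed conditional covariance matrix. Under these assumptions the two-asset minimum variance weight reduces to $1/(\nu + 1)$. The function name is hypothetical.

```python
def implied_weight(nu):
    """Two-asset minimum variance weight with variances 1 and 1/nu
    and zero covariance (simplifying assumptions), equal to 1/(nu + 1)."""
    v_xi, v_w, c = 1.0, 1.0 / nu, 0.0
    return (v_w - c) / (v_xi - 2.0 * c + v_w)

# Values of nu used for Table 3
weights = {nu: implied_weight(nu) for nu in (7, 10, 15)}
```

For $\nu = 7$, $10$, and $15$ this gives 0.125, about 0.0909, and 0.0625, respectively, so the weight on the return-based estimator shrinks as the realized measure becomes more precise.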

Empirical Example
As an empirical example, we consider modeling the covariance matrix of the returns of futures contracts and their underlying assets. Baillie and Myers (1991), Kroner and Sultan (1993), and Lai (2016) used such a pair for forecasting an optimal hedge ratio. The data for the empirical analysis in this study consist of high-frequency Nikkei 225 futures contracts and their underlying Nikkei 225 index for the period November 14, 2013, to December 8, 2016. Nikkei 225 futures are traded on the Osaka Exchange (OSE), Japan, and the contract months are March, June, September, and December. The records of nearby contracts on any given day are used and rolled to the next month when the volume of the current contract is exceeded. Trading hours consist of day and night sessions: 9:00 a.m.-3:15 p.m. and 4:30 p.m.-3:00 a.m., respectively. We use the data for the day session for simplicity, and we use 1-min data of returns for the Nikkei 225 index and its futures, provided by the Index Business Office of Nikkei Inc. and the OSE, respectively.
To estimate the daily quadratic covariation, we use the preaveraged estimator suggested by Christensen et al. (2012). Table 5 shows the descriptive statistics for daily open-close returns, volatility, and co-volatility, based on 1211 observations. The daily observations of volatility and co-volatility are skewed to the right and leptokurtic.
In our empirical analysis, we compare the QML and VT estimates, and we examine the appropriateness of assuming the quasi-likelihood function by examining the information matrix (IM) equality. Under the assumption that $H_t$ is correctly specified, we can test the distributional assumption by White's (1982) IM test. The null hypothesis for the IM test regarding the QML estimation is $\Sigma_0 - J_0 = O$, where $\Sigma_0$ and $J_0$ are defined by Eqs. (A.17) and (14), respectively. To construct the IM test statistics, we follow the auxiliary regression of Chesher (1983) and Lancaster (1984), in which $\hat\gamma_t$ and $\hat P_t$ are stated in Eq. (15), the two coefficient vectors have dimensions $k \times 1$ and $k(k+1)/2 \times 1$, $k$ is the number of parameters in $(\theta, \nu)$, and $e_t$ is the error term. Then the IM test statistic is given by $T - \sum_{t=1}^{T} \hat e_t^2$, where $\hat e_t$ is the OLS residual for the auxiliary regression. Under the null hypothesis, the IM test has the asymptotic $\chi^2(k(k+1)/2)$ distribution.
Table 6 shows the QML and VT estimates for the realized BEKK-CAW model, and the corresponding standard errors. The estimates of $\Omega$ differ between the two estimations, implying that there is room to improve the specification for $H_t$. For the QML estimates of $a_{ij}$, $b_{ij}$, and $c_{ij}$ ($i, j = 1, 2$), the diagonal elements are significant, while the off-diagonal elements are insignificant. The VT estimates are close to those of QML, except for $a_{12}$, $b_{12}$, and $c_{21}$, and these exceptions are insignificant in both estimations. The standard errors of the VT estimates are generally greater than the corresponding values in the QML estimates, as expected from Proposition 2.
In particular, the estimate of $a_{11}$ is significant in the QML estimation, while we have the opposite result in the VT estimation. The VT estimate of the degree-of-freedom parameter, $\nu$, is 8.673, which is close to the QML estimate.
Under the correct distributions, the optimal $q^\star$ is $1/(\nu + 1)$. While the estimate of $q^\star$ is −0.102, the value implied by $\hat\nu$ is 0.103. Note that the optimal $q^\star$ can be negative, as discussed in Remark 3.6. The difference implies a non-normal and/or non-Wishart distribution. The IM test rejects the null hypothesis that $\Sigma_0 - J_0 = O$, implying the appropriateness of the QML asymptotic covariance matrix. The variance of the MVP in Remark 3.6 is 0.3983, which is smaller than $\mathrm{tr}(S_{\xi\xi}) = 5.3073$ and $\mathrm{tr}(S_{ww}) = 0.4400$. The result implies that the weighted average of $\bar\xi$ and $\bar w$ produces a more efficient estimator for $\omega$ than $\bar w$ does.
Notes to Table 6: The first and third columns show the QML and VT estimates of the parameters, respectively. The second and fourth columns present the corresponding standard errors. The IM test statistics for the QML (VT) estimation have asymptotic $\chi^2(136)$ ($\chi^2(91)$) distribution, and the critical value at the 5% significance level is 164.22 (114.27). The P-value is given in brackets.
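The mechanics of the IM test statistic can be sketched as follows, assuming the Chesher (1983) and Lancaster (1984) form in which a vector of ones is regressed on the per-observation scores and IM indicators, and the statistic is $T$ minus the residual sum of squares. The construction of the regressor matrix is left to the caller, and the function names are hypothetical.

```python
import numpy as np

def im_test_statistic(regressors):
    """T minus the residual sum of squares from OLS of ones on the regressors.

    regressors : (T, p) array whose columns hold the scores and the IM
                 indicators (assumed to be built elsewhere)
    """
    T = regressors.shape[0]
    ones = np.ones(T)
    beta, *_ = np.linalg.lstsq(regressors, ones, rcond=None)
    resid = ones - regressors @ beta
    return float(T - resid @ resid)

def im_degrees_of_freedom(k):
    """Number of distinct elements of a symmetric k x k matrix."""
    return k * (k + 1) // 2
```

With $k = 16$ parameters the degrees of freedom are 136, and with $k = 13$ they are 91, matching the two chi-squared distributions quoted for the QML and VT estimations.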
The empirical results imply that (i) the VT estimation needs a larger sample for statistical inference; (ii) it is appropriate to consider the QML estimation for the non-normal and/or non-Wishart distributions implied by the data; and (iii) we need to improve the specification for $H_t$. For the latter, we can accommodate alternative asymmetric effects, as discussed in Bollerslev and Patton (2020). As pointed out by Cappiello, Engle, and Sheppard (2006), it is not easy to specify the covariance matrix of negative returns. For this purpose, we need to await further research on the asymptotic theory for the class of realized BEKK-CAW models.

Conclusion
This paper develops the realized BEKK-CAW model, which uses the information on returns and realized covariance measures simultaneously. We derive the stationary condition, and show the consistency and asymptotic normality of the QML and VT estimators. We provide the optimal value of the weight $q$, which is required for the first-step VT estimation. The Monte Carlo results indicate that (i) the finite-sample property of the ML estimator is satisfactory; (ii) we can fix the value of the degree-of-freedom parameter, $\nu$; and (iii) the estimate of the optimal value of $q$ yields a result similar to that of the optimal value. The empirical results for the covariance matrix of returns of the Nikkei 225 and its futures imply that the first-step VT estimation could have a non-negligible effect on the asymptotic covariance matrix of the second-step estimator.
We can consider several important extensions for the realized BEKK-CAW model: (i) theoretical analysis, (ii) empirical analysis, and (iii) applications in realized stochastic volatility models. For the theoretical part, we may develop a test for Wishart variables like the Jarque-Bera test for normality. We may investigate the asymptotic theory for models with explanatory variables by extending the work of Han and Kristensen (2014). Furthermore, we can consider an alternative specification for the structure of $X_t$ as $X_t = W_t^{1/2} Z_t$, as in Asai and So (2013). For the empirical analysis, we can accommodate the asymmetric effects using negative returns and realized semi-covariances, as in Bollerslev and Patton (2020). Recently, Koike (2016) developed an estimator for realized covariance measures, which is consistent under conditions of unsynchronized trading times, jumps, and microstructure noise. With the estimator, we can include the effects of jumps in the BEKK structure, as in the CAW-type models examined by Asai, Gupta, and McAleer (2020). We can develop new specifications for the realized multivariate stochastic volatility models based on the Wishart distribution, instead of the multivariate normal distribution used in Asai, Chang, and McAleer (2022) and Yamauchi and Omori (2020). However, to do so, it is necessary to conduct further research in these fields.

Research funding:
The authors are most grateful to Yoshihisa Baba, the editor, and two anonymous reviewers for their helpful comments and suggestions. The first author acknowledges the financial support of the Japan Society for the Promotion of Science (JSPS 22K01429) and the Australian Academy of Science.

Conflict of interest statement:
The authors declare no conflicts of interest regarding this article.

Author contribution:
All the authors have accepted responsibility for the entire content of this submitted manuscript and approved submission.

A.1 Derivatives of the Quasi-Log-Likelihood Function
Corresponding to $H_{t,m}(\theta)$ with a fixed initial value, $H_{0,m}(\theta) = m$, we define the strictly stationary and ergodic solution to (8), $\{H_t(\theta)\}$. To distinguish $H_t(\theta)$ from $H_{t,m}(\theta)$, we define The gradient of the log-likelihood function is given by ] , (A.2) where The Hessian matrices of the log-likelihood function are given by and

A.2 Proof of Asymptotic Normality of QML Estimator
Lemma A.1. Under Assumptions 1-4, where (A.5) Proof. From Eqs. (4) and (7), we obtain From an argument similar to the proof of Proposition 4.5 of Boussama, Fuchs, and Stelzer (2011), $\rho(B + F) < 1$ implies $\rho(B^*) < 1$ on $\Theta$, and we obtain (A.3). We can verify the latter part with the technique used in the proofs of Lemma 9 of Asai and So (2021) and Lemma B.4 of Pedersen and Rahbek (2014).

A.3 Proof of Asymptotic Normality of VT Estimator
Proof of Lemma 1. Since $X_t$ and $W_t$ are strictly stationary and ergodic processes, and because $E[\|X_t\|^2] < \infty$, it follows, according to the ergodic theorem, that $\hat\omega(q) \to \omega_0$ almost surely, which establishes Lemma 1(a).
Applying a similar argument in the proof of Lemma B.8 of Pedersen and Rahbek (2014)  ) .

Figure 1: Histograms and QQ plots of ML estimates with T = 250.The red line in the histograms indicates the normal density with the same mean and variance.

Figure 2: Histograms and QQ plots of ML estimates with T = 500.The red line in the histograms indicates the normal density with the same mean and variance.

Figure 3: Histograms and QQ plots of QML estimates under fixed $\nu$ with $T = 250$. While the data are generated with $\nu = 10$, QML estimates are obtained under $\nu = 7$. The red line in the histograms indicates the normal density with the same mean and variance.

Table 1: Monte Carlo results for ML estimator.

Table 2: Monte Carlo results for QML estimator with fixed $\nu$.

Table 3: Monte Carlo results for optimal $q^\star$ estimator.

Table 4: Monte Carlo results for first-step VT estimator based on alternative $q$.