Fast inference methods for high-dimensional factor copulas

Abstract: Gaussian factor models allow the statistician to capture multivariate dependence between variables. However, they are computationally cumbersome in high dimensions and are not able to capture multivariate skewness in the data. We propose a copula model that allows for arbitrary margins and multivariate skewness in the data by including a non-Gaussian factor whose dependence structure is the result of a one-factor copula model. Estimation is carried out using a two-step procedure: margins are modelled separately and transformed to the normal scale, after which the dependence structure is estimated. We develop an estimation procedure that allows for fast estimation of the model parameters in a high-dimensional setting. We first prove the theoretical results of the model with up to three Gaussian factors. Then, simulation results confirm that the model performs well as the sample size and dimensionality grow larger. Finally, we apply the model to a selection of stocks of the S&P500, which demonstrates that our model is able to capture cross-sectional skewness in stock market data.


Introduction
In the literature, Gaussian factor models are used as a dimensionality reduction technique to describe a vector of $d$ variables by means of $p$ latent factors, where for meaningful cases $p < d$. Thus, variables form groups based on their correlation, such that correlation within one particular group of variables can be high, while the correlation between variables assigned to different groups can be low [4,5]. Because of the minimal assumptions underlying factor models, this technique is amply used in a wide range of applications. This includes econometrics, where factor models are either used to identify the factor structure itself [1,2,18] or to control for endogenous common factors in factor-augmented panel data regressions [15]. Other applications include the finance literature, see Chapter 5 in [10] or Chapter 9 in [21], and hydrology [19].
Nevertheless, the assumption of multivariate Gaussianity implies that these factor models cannot capture cross-sectional skewness or kurtosis in the underlying multivariate distribution. Copula models, on the other hand, provide a straightforward technique to capture the entire dependence structure between random variables, and allow for non-Gaussian skewness and kurtosis. By Sklar's theorem, every $d$-variate distribution function $F$ with margins $F_1, \ldots, F_d$ can be written as $F(x_1, \ldots, x_d) = C(F_1(x_1), \ldots, F_d(x_d))$ for some copula $C$ [17]. If the margins are continuous, then one speaks of the copula, which is unique and equal to $C(\mathbf{u}) = F(F_1^{\leftarrow}(u_1), \ldots, F_d^{\leftarrow}(u_d))$, where $F_j^{\leftarrow}$ is the generalised inverse function of $F_j$.¹ While copula models provide an elegant technique to model dependence between random variables, they come with certain restrictions. On the one hand, they can be computationally demanding in a high-dimensional setting, while on the other hand they might not be flexible enough to model dependence between a large number of random variables. A solution was provided by Krupskii and Joe [7], who proposed a class of factor copula models that are both parsimonious and flexible. The idea is to assume $p$ latent factors $V_1, \ldots, V_p$ that cause dependence between $d$ observable variables $U_1, \ldots, U_d$ via bivariate linking copulas. These types of factor copula models can be extended to allow for a more structured approach [8]. A different type of factor copula model was proposed by [13]. The latter model uses the simulated method of moments (SMM) estimation method described in [12] and is extended to allow for time-varying parameters in [14]. However, estimation for this type of factor copula model is particularly slow in higher dimensions, as SMM is based on a resampling scheme. Bayesian estimation methods for factor copula models are explored by [3,6,11,16,20], but are generally also computationally demanding.
A novel method to decrease the computational time to estimate factor copula models is proposed by Krupskii and Joe [9], who provide a procedure that allows for fast estimation of the factor copula parameters in non-overlapping factor copula models. The latter include the one-factor copula model in [7] or the nested factor copula model in [8].
The main contribution of this article is to provide a fast and computationally tractable procedure to model very high-dimensional data characterised by non-Gaussian skewness in the multivariate distribution. This is done in a two-step approach. First, marginals are modelled separately and transformed to normal scores. Second, dependence between the marginals is modelled by means of $p + 1$ common factors: one factor resulting from a one-factor copula model as described in [7] and $p$ Gaussian factors.
Factor loadings and the residual correlation parameter are estimated using a generalised method of moments (GMM) approach, which provides consistent and asymptotically normal estimates as the sample size $n$ tends to infinity. In a subsequent step, the unobserved Gaussian factors are estimated using a weighted combination of the observed variables. Weights are obtained by maximising the correlation between the estimated and the true Gaussian factors, while convergence is shown by letting the sample size $n$ tend to infinity and setting the dimension $d$ sufficiently large. To allow for fast inference of the one-factor copula parameter, we use the proxy factor idea proposed by Krupskii and Joe [9]. We show that as $n$ and $d$ tend to infinity, the proxy converges towards the true factor.
In sum, the proposed model exploits both the versatility of Gaussian factor models and the fast inference scheme of non-overlapping factor copula models. The remainder of the article is organised as follows. The new model is proposed in Section 2, its main properties are discussed, and the statistical procedure is outlined. The theoretical results are outlined in Section 3, and the estimation strategy is illustrated with a simulation experiment in Section 4, where we propose a closed-form expression to quickly invert the variance-covariance matrix of the observed data. We illustrate the model by applying it to a real-world data example in Section 5. Section 6 wraps up the article with a discussion and ideas for future research.

Model outline and inference
In Section 2.1, a new factor copula model is presented that combines the characteristics of Gaussian factor models and of factor copulas, while still being computationally tractable in high dimensions. The estimation procedure is subsequently discussed in Section 2.2.

Model outline
Assume one has at hand data from arbitrary margins which in a first step are transformed to $N(0,1)$ marginals. Then, in a $p$-factor model, the observable variables $X_j$ are a linear combination of $p$ independent latent Gaussian factors and an error term:
$$X_j = \lambda_{j1} Z_1 + \cdots + \lambda_{jp} Z_p + \varepsilon_j, \qquad j = 1, \ldots, d. \tag{2.1}$$
If the errors $\varepsilon_j$ are independent, then the model (2.1) is a classical Gaussian $p$-factor model. In this work, we assume that $\operatorname{Corr}(\varepsilon_{j_1}, \varepsilon_{j_2}) = \eta$ for $j_1 \neq j_2$, and that the errors are independent conditional on some unobserved factor $V \sim U(0,1)$, with the respective copula
$$C(u_1, \ldots, u_d) = \int_0^1 \prod_{j=1}^d C_{j|V}(u_j \mid v)\,\mathrm{d}v. \tag{2.2}$$
In other words, $C$ is a one-factor copula; see [7] for more details. Casting the model in (2.1) into matrix notation gives $X = \Lambda Z + \varepsilon$, where $\Lambda$ is the $d \times p$ matrix of factor loadings and $\lambda_{jk}$ captures the impact of factor $Z_k$ on the $j$th variable $X_j$. The $p$ Gaussian factors are elements of the column vector $Z = (Z_1, \ldots, Z_p)^\top$. Finally, dependence between the residuals is modelled by the homogeneous one-factor copula model in (2.2). Denote $H = \operatorname{E}(\varepsilon \varepsilon^\top)$. Then the correlation matrix of $X$ is $\Sigma_X = \Lambda \Lambda^\top + H$. The difference with a standard Gaussian factor model lies in the assumption that $\operatorname{E}(\varepsilon \varepsilon^\top) = H \neq I_d$, which implies the presence of correlation between the residual terms.
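To make the construction concrete, the data-generating process implied by (2.1) and (2.2) can be sketched numerically. The snippet below is a minimal illustration under assumptions of ours, not prescribed by the model: Clayton linking copulas and a single Gaussian factor with equal loadings. Residuals are drawn conditionally on $V$ via the inverse Clayton h-function and rescaled so that each $X_j$ has unit variance.

```python
import numpy as np
from scipy.stats import norm

rng = np.random.default_rng(1)

def clayton_cond_inv(q, v, theta):
    # Inverse Clayton h-function: draws U | V = v from the Clayton linking copula
    return ((q ** (-theta / (theta + 1.0)) - 1.0) * v ** (-theta) + 1.0) ** (-1.0 / theta)

def simulate_model(n, d, Lam, theta):
    """Draw a sample from (2.1) with residual one-factor Clayton dependence (2.2)."""
    p = Lam.shape[1]
    Z = rng.standard_normal((n, p))              # latent Gaussian factors
    V = rng.uniform(size=(n, 1))                 # latent copula factor
    U = clayton_cond_inv(rng.uniform(size=(n, d)), V, theta)
    sig = np.sqrt(1.0 - (Lam ** 2).sum(axis=1))  # residual sd, so Var(X_j) = 1
    eps = sig * norm.ppf(U)                      # residuals on the normal scale
    return Z @ Lam.T + eps, Z, V.ravel()

n, d = 5000, 20
Lam = np.full((d, 1), 0.6)                       # one Gaussian factor, equal loadings
X, Z, V = simulate_model(n, d, Lam, theta=2.0)
print(X.shape)
```

Because the residuals share the copula factor $V$, pairwise correlations exceed the $\lambda_j\lambda_k$ implied by the Gaussian factor alone.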
The following example shows that model (2.1) can capture asymmetric dependence. Figure 1 shows scatterplots of different pairs of the observed variables. It is seen that the pairs $(X_1, X_2)$ and $(X_3, X_4)$ have stronger dependence in the lower tail, while $(X_1, X_3)$ has much weaker dependence and does not exhibit strong asymmetry.

Inference
In this section, we present the statistical procedure that ensures we can estimate the model parameters, the Gaussian factors, and the latent factor of the one-factor copula model. The corresponding theoretical results are outlined in Section 3.
We begin by estimating the parameters in the model, for which we implement a GMM approach.
The model parameters can then be obtained by minimising the $L_2$ distance between the empirical moments and their model-implied counterparts in $\Sigma_X = \Lambda\Lambda^\top + H$. Next, the Gaussian factors are estimated as weighted combinations of the observed variables, with weights obtained by minimising the $L_2$ norm between the proxy factor and the true factor. In Section 4.1, we propose a closed-form expression to quickly invert the $d \times d$ variance-covariance matrix $\Sigma_X$. The next step is to recover the latent factor in the one-factor copula model. To this end, the estimated residuals can be recovered as $\hat\varepsilon = X - \hat\Lambda\hat Z$, transformed to the uniform scale, and averaged to approximate the unknown factor $V$ in the one-factor copula model. The density of the one-factor copula model is obtained by differentiating (2.2), that is,
$$c(u_1, \ldots, u_d) = \int_0^1 \prod_{j=1}^d c_{U_j, V}(u_j, v)\,\mathrm{d}v.$$

Theoretical results

We prove these results for the model with $p \leq 3$ Gaussian factors. All proofs are deferred to Appendix B. We assume the following conditions hold. Here, $\phi_{k_1,k_2}$ is the angle between $\Delta_{k_1}$ and $\Delta_{k_2}$, and there exist constants and an integer $d_0$ such that the stated bounds hold for any $d > d_0$, all $j = 1, \ldots, d$, and all $k, k_1, k_2 = 1, \ldots, p$. The next theorem states that GMM estimation yields parameter estimates that converge in probability to their true values, given some regularity conditions. Moreover, the resulting estimators are asymptotically normal, as stated in the subsequent theorem.
The GMM estimators converge in probability to their true values as $n \to \infty$ if the following conditions are satisfied. Moreover, the GMM estimators are asymptotically normal if, in addition, the derivative matrix of the moment conditions is non-singular.
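The moment-matching step can be illustrated with a small numerical sketch. The code below makes several simplifying assumptions of ours: one Gaussian factor, Gaussian data as a stand-in (the moment conditions involve only the correlation matrix), a plain least-squares objective in place of the paper's exact GMM weighting, and $\lambda_1$ fixed at its true value for identifiability, as in the simulation study.

```python
import numpy as np
from scipy.optimize import minimize

rng = np.random.default_rng(0)
d, n = 8, 20000
lam0 = rng.uniform(0.5, 0.8, size=d)             # true loadings on one factor
eta0 = 0.2                                       # true residual correlation

# Model-implied correlation matrix Σ_X = λλᵀ + H with equicorrelated residuals
s2 = 1.0 - lam0 ** 2
Sigma0 = np.outer(lam0, lam0) + eta0 * np.sqrt(np.outer(s2, s2))
np.fill_diagonal(Sigma0, 1.0)
R = np.corrcoef(rng.multivariate_normal(np.zeros(d), Sigma0, size=n), rowvar=False)

iu = np.triu_indices(d, k=1)                     # off-diagonal moments

def loss(par):
    # λ_1 is fixed at its true value to identify the remaining parameters
    lam = np.r_[lam0[0], par[:-1]]
    eta = par[-1]
    v2 = 1.0 - lam ** 2
    S = np.outer(lam, lam) + eta * np.sqrt(np.outer(v2, v2))
    return np.sum((S[iu] - R[iu]) ** 2)          # L2 distance between moments

res = minimize(loss, x0=np.r_[np.full(d - 1, 0.5), 0.1], method="L-BFGS-B",
               bounds=[(0.01, 0.99)] * (d - 1) + [(-0.5, 0.95)])
lam_hat, eta_hat = np.r_[lam0[0], res.x[:-1]], res.x[-1]
print(round(float(np.abs(lam_hat - lam0).max()), 3), round(float(eta_hat), 3))
```

With $n$ this large, the recovered loadings and residual correlation land close to their true values, mirroring the consistency statement above.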
The following lemma states that we can obtain a proxy of the true Gaussian factors $Z_k$ (for $k = 1, \ldots, p$) by taking a weighted combination of the observable variables. Let $w_{kd}$ be a weight vector of length $d$ (for $k = 1, \ldots, p$), such that the weighted sum of the observed variables $\hat{Z}_{kd} = w_{kd}^\top X$ is a proxy of the Gaussian factor $Z_k$. Then one can obtain the unique optimal weight vector $w_{kd}^\star$. Theorem 3.5 states that $\hat{Z}_{kd}^\star = w_{kd}^{\star\top} X$ will converge to its true value if the dimension is sufficiently large.
Theorem 3.6 states that $\hat{Z}_{knd}^\star$ will also converge to its true value.
Let $\hat{\Sigma}_X^{-1}$ and $\hat{\lambda}_k$ be the estimated counterparts of $\Sigma_X^{-1}$ and $\lambda_k$, respectively. Note that in Theorem 3.6, the order of the limits is crucial to ensure that convergence takes place. Subsequently, in Theorem 3.8 we show that $\bar{U}_d$ converges in probability to a monotone function of the true latent factor if the dimension is sufficiently large, under the assumptions of Theorem 5 in [9] and Theorem 3.5. The next result shows that the observed variables in (2.1) are asymptotically independent in the tails, so this model is not suitable for modelling data with strong tail dependence.
where $\Phi(\cdot)$ is the cdf of a standard normal distribution. This result implies that the copula linking $(X_l, X_m)$ has no lower tail dependence. A similar result holds for the upper tail.
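The weighted-proxy construction behind Theorems 3.5 and 3.6 can be sketched numerically. The weights below, $w \propto \Sigma_X^{-1}\lambda_k$, are a GLS-type choice suggested by the quantities $\hat{\Sigma}_X^{-1}$ and $\hat{\lambda}_k$ appearing above; this is our reading for illustration, not a verbatim reproduction of the optimal weights, and the residuals are drawn as correlated Gaussians for simplicity.

```python
import numpy as np

rng = np.random.default_rng(7)
n, d, eta = 2000, 200, 0.1
lam = rng.uniform(0.4, 0.8, size=d)              # loadings on a single Gaussian factor
sig2 = 1.0 - lam ** 2                            # residual variances
H = eta * np.sqrt(np.outer(sig2, sig2))          # equicorrelated residual covariance
np.fill_diagonal(H, sig2)
Sigma = np.outer(lam, lam) + H                   # model-implied correlation matrix

Z = rng.standard_normal(n)                       # true factor
X = np.outer(Z, lam) + rng.multivariate_normal(np.zeros(d), H, size=n)

w = np.linalg.solve(Sigma, lam)                  # weights proportional to Σ_X⁻¹ λ
Z_hat = X @ w / (lam @ w)                        # normalised weighted proxy of Z
print(round(float(np.corrcoef(Z, Z_hat)[0, 1]), 3))
```

Even with correlated residuals, the proxy tracks the true factor closely at $d = 200$, consistent with the large-$d$ convergence stated above.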

Simulation study
In this section, we perform a simulation study with the goal to evaluate the performance of the estimation methods for model (2.1). The numerical implementation techniques that can be used to efficiently compute the presented estimators of the model are discussed in Section 4.1, and Section 4.2 sets up the simulation experiment for which the results are presented in Section 4.3.

Numerical implementation
The main hurdle in applying the calculations in the previous section is in obtaining the inverse of the correlation matrix, $\Sigma_X^{-1}$. Such an inversion becomes computationally infeasible whenever the dimension $d$ is too large. As mentioned in the proof of Theorem 3.5, the following expression can be obtained using Woodbury's formula [Chapter 2 in 22]:
$$\Sigma_X^{-1} = H^{-1} - H^{-1}\Lambda\left(I_p + \Lambda^\top H^{-1} \Lambda\right)^{-1} \Lambda^\top H^{-1}.$$
As one can see, the bottleneck now lies in the inversion of the symmetric $p \times p$ matrix $I_p + \Lambda^\top H^{-1} \Lambda$, which is computationally cheap for small $p$, while the equicorrelation structure of $H$ allows its inverse to be computed in closed form.
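A minimal numerical check of this identity, assuming (for illustration only) equal residual variances so that $H = (s^2 - b)I + b\,\mathbf{1}\mathbf{1}^\top$ and $H^{-1}$ is available in closed form via the Sherman-Morrison formula:

```python
import numpy as np

def invert_sigma(Lam, s2, b):
    """Σ_X⁻¹ for Σ_X = H + ΛΛᵀ with H = (s2 − b)·I + b·11ᵀ (equal residual variances)."""
    d, p = Lam.shape
    a = s2 - b
    # Sherman-Morrison: closed-form inverse of the diagonal-plus-rank-one matrix H
    Hinv = np.eye(d) / a - (b / (a * (a + d * b))) * np.ones((d, d))
    # Woodbury: Σ_X⁻¹ = H⁻¹ − H⁻¹ Λ (I_p + Λᵀ H⁻¹ Λ)⁻¹ Λᵀ H⁻¹
    HiL = Hinv @ Lam
    core = np.linalg.inv(np.eye(p) + Lam.T @ HiL)   # only a p × p matrix is inverted
    return Hinv - HiL @ core @ HiL.T

rng = np.random.default_rng(3)
d, p = 500, 2
Lam = rng.uniform(0.2, 0.5, size=(d, p))
s2, b = 0.6, 0.05
Sigma = Lam @ Lam.T + (s2 - b) * np.eye(d) + b * np.ones((d, d))
err = np.abs(invert_sigma(Lam, s2, b) @ Sigma - np.eye(d)).max()
print(err)
```

The product with $\Sigma_X$ recovers the identity to machine precision, while no $d \times d$ matrix is ever factorised.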

Simulation setup
We assume that we have at hand a sample of size $n$ and dimension $d$ which admits the factor model in (2.1) with two Gaussian factors and residual one-factor copula dependence:
$$X_j = \lambda_{j1} Z_1 + \lambda_{j2} Z_2 + \varepsilon_j, \qquad j = 1, \ldots, d.$$
The following simulation procedure is used to generate the Gaussian factors and the residuals $\varepsilon_{ij}$. In a subsequent step, the factor model parameters are estimated using GMM, with parameter $\lambda_{11}$ fixed at its true value to make the model parameters identifiable. Then, the factors are estimated using the weighted variables approach, where the weights are computed as in Theorem 3.6. Finally, the factor copula parameter is estimated using standard maximum likelihood. This experiment is repeated for 1,000 independent simulations. For every experiment, one of the following settings is used for the sample size $n$, the dimension $d$, the copula parameter $\theta$, and the copula family $C$: • The families of bivariate linking copulas are Clayton ($C_{\mathrm{Cl}}$) and $t_4$ ($C_{t_4}$).
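The maximum likelihood step for the copula parameter can be sketched as follows, assuming Clayton linking copulas. The integral in the one-factor copula density is approximated with Gauss-Legendre quadrature, which is one possible implementation choice of ours, not necessarily the one used in the paper.

```python
import numpy as np
from scipy.optimize import minimize_scalar

rng = np.random.default_rng(4)

def clayton_cond_inv(q, v, theta):
    # Inverse Clayton h-function: draws U | V = v
    return ((q ** (-theta / (theta + 1.0)) - 1.0) * v ** (-theta) + 1.0) ** (-1.0 / theta)

def clayton_pdf(u, v, theta):
    # Bivariate Clayton copula density
    return ((1.0 + theta) * (u * v) ** (-1.0 - theta)
            * (u ** -theta + v ** -theta - 1.0) ** (-2.0 - 1.0 / theta))

def neg_loglik(theta, U, n_nodes=35):
    # log c(u_1,...,u_d) = log ∫₀¹ ∏_j c(u_j, v) dv via Gauss-Legendre quadrature
    x, w = np.polynomial.legendre.leggauss(n_nodes)
    v, w = 0.5 * (x + 1.0), 0.5 * w              # map nodes from [-1, 1] to [0, 1]
    logc = np.log(clayton_pdf(U[:, :, None], v, theta)).sum(axis=1)   # n × nodes
    m = logc.max(axis=1, keepdims=True)          # log-sum-exp for stability
    return -np.sum(m.ravel() + np.log((w * np.exp(logc - m)).sum(axis=1)))

# Simulate uniforms from a Clayton one-factor copula with θ = 2
n, d, theta0 = 2000, 10, 2.0
V = rng.uniform(size=(n, 1))
U = clayton_cond_inv(rng.uniform(size=(n, d)), V, theta0)

res = minimize_scalar(neg_loglik, bounds=(0.1, 10.0), args=(U,), method="bounded")
print(round(float(res.x), 2))
```

The estimate lands near the true $\theta = 2$, and the per-observation cost is one $d$-fold product per quadrature node, which scales linearly in $d$.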
Next, over all simulation results, we calculate the mean and variance of the correlation between the true and the estimated Gaussian factors, and the Pearson correlation coefficient between the estimated copula factor and the true copula factor $V$.³ Finally, we calculate the average bias of the copula parameter estimate and its variance over all simulations. In the next section, we report these estimation results.

Simulation results
The results of the simulation experiments are presented in Tables 1 and 2. As one can see in both tables, for fixed $n$, the correlation between the estimated and the true factors increases as $d$ becomes larger. Similarly, for the bias of the estimated copula parameter reported in the last column of the tables, we can see that, for each $n$, the bias decreases as the dimension $d$ becomes larger. However, as $d$ increases from 250 to 500, the bias starts to increase again. On the other hand, as the sample size $n$ increases, this effect becomes less pronounced.

Data application
In this section, we apply the model outlined in the article to a real-world dataset. We consider three different sectors of the S&P500 between January 3, 2017, and December 30, 2020. Each series of log-returns is first filtered with an AR(1)-GARCH(1,1) model,
$$r_{j,t} = \mu_j + \phi_j r_{j,t-1} + \kappa_{j,t}, \qquad \kappa_{j,t} = \sigma_{j,t}\,\xi_{j,t}, \qquad \sigma_{j,t}^2 = \omega_j + \alpha_j \kappa_{j,t-1}^2 + \beta_j \sigma_{j,t-1}^2,$$
where $\mu_j$ is the unconditional mean, and $\phi_j$ gives, for stock $j$, the dynamic impact of a change in the log-returns in period $t-1$. Furthermore, $\kappa_{j,t}$ introduces, for stock $j$, non-constant volatility to the process. The latter is a function of the average volatility $\omega_j$, the dynamic impact $\alpha_j$ of a change in the previous period of the squared series, and $\beta_j$, which captures the impact of a change in the volatility at time $t-1$. Finally, the error terms $\xi_{j,t}$ are i.i.d. skew-$t$ distributed with $\nu$ degrees of freedom and skewness parameter $\gamma$. After estimating the univariate parameters, we transform the estimated residuals to a uniform scale.
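The filtering step can be sketched as follows. For brevity, the sketch applies the recursion with fixed illustrative parameters rather than fitted ones, uses simulated returns as a stand-in for stock data, and performs the probability integral transform with a symmetric Student-$t$ instead of the skew-$t$ used in the paper.

```python
import numpy as np
from scipy.stats import t as student_t

def ar_garch_filter(r, mu, phi, omega, alpha, beta):
    """Filter log-returns through AR(1)-GARCH(1,1); return standardised residuals."""
    n = len(r)
    e = np.empty(n)
    h = np.empty(n)
    h[0] = omega / (1.0 - alpha - beta)          # start at the unconditional variance
    e[0] = r[0] - mu
    for ti in range(1, n):
        h[ti] = omega + alpha * e[ti - 1] ** 2 + beta * h[ti - 1]
        e[ti] = r[ti] - mu - phi * r[ti - 1]     # AR(1) conditional mean
    return e / np.sqrt(h)

# Simulated stand-in for one stock's log-returns (parameters are illustrative)
rng = np.random.default_rng(5)
r = 0.0005 + 0.01 * rng.standard_normal(1000)
z = ar_garch_filter(r, mu=0.0005, phi=0.05, omega=1e-6, alpha=0.05, beta=0.9)
u = student_t.cdf(z, df=6)                       # PIT of residuals to the uniform scale
print(round(float(u.min()), 3), round(float(u.max()), 3))
```

The resulting uniforms are then transformed to normal scores before the dependence structure is estimated, as described in Section 2.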
Next, we want to identify the number of underlying factors. In Section 2.4 of [13], it is shown that, under regularity conditions, one can use a scree plot based on the eigenvalues of the rank correlation matrix to determine the appropriate number of factors. The scree plot for the AR(1)-GARCH(1,1) filtered data transformed to uniform ranks is given in Figure 2. One can see that for the Industrials sector, the majority of the variance in the data is explained by one factor, while for the Information Technology and Financials sectors, two common factors appear to be more appropriate to capture the variance in the data. Thus, when we look at the number of factors required to simultaneously model the Industrials, Information Technology, and Financials sectors, it is no surprise that the number of factors appears to be between 2 and 4.

Next, we model the multivariate dependence between the different log-returns of the stock prices. First, we transform the data to normal scores and look at the empirical semi-correlations of the data. The semi-correlation is defined as the correlation in the lower tail ($\rho^{<0}$) or the upper tail ($\rho^{>0}$) of the data. We can see in Table 3 that the average absolute semi-correlations indicate that the data in each of the three sectors are skewed to the left.⁵ This is consistent with stylised facts in the financial literature that show stronger correlation during downturns. We therefore expect a copula with stronger dependence in the lower tail to be more appropriate for these data. For comparison, we use different copula models to see which one fits these characteristics best. First, we model dependence within each sector separately using only one Gaussian factor, and a Clayton, Frank, and normal one-factor copula model to capture the residual dependence. Because the lower tail dependence does not appear to be very strong, we also use a reflected Gumbel copula model to capture the residual dependence.
Then, we subsequently use two and three Gaussian factors while using the same one-factor copula models. These estimation results for each sector are presented in Table 4. As one can see in the model with one Gaussian factor, all three sectors establish a moderate degree of dependence between their constituent stocks. This is not surprising, as linear dependence is already captured by the Gaussian factor.⁶ Similarly, when pairing industry sectors together in a model with two Gaussian factors, we can see that there is a moderate degree of dependence between the stocks in each of the pairs of industry sectors. Finally, when using three Gaussian factors, we can see that the estimated parameter for each of the three copula models is somewhat higher than the estimated dependence parameters in the model with two Gaussian factors.
To evaluate the performance of our model, we use the estimated parameters to simulate a sample of size 1,000,000. We then calculate the semi-correlations of this simulated sample and calculate the average (absolute) bias with respect to the empirical semi-correlations in Table 3. These results are reported in Tables 5 and 6. As we can see, the average absolute bias of the lower semi-correlation, $b(|\rho^{<0}|)$, is smallest for the model using reflected Gumbel linking copulas in the model with one Gaussian factor. This shows that our model is capable of capturing multivariate skewness in high-dimensional data. The average absolute bias of the upper semi-correlation is somewhat lower for the Frank and normal copulas, but the difference with that of the reflected Gumbel copula is relatively small, except for the model with one Gaussian factor for the Financials sector. Finally, we also modelled the data using a multivariate Student $t_\nu$ copula with a three-factor covariance structure. However, the estimated degrees-of-freedom parameter is close to 20, which makes it practically indistinguishable from a normal copula model.
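Empirical semi-correlations of the kind reported in Tables 3, 5, and 6 can be computed as below. The quadrant definition (both normal scores negative, respectively positive) is our reading of the description above, and the bivariate Gaussian sample is only a stand-in to show that a symmetric model yields roughly equal lower and upper semi-correlations.

```python
import numpy as np

def semi_correlations(X):
    """Average lower/upper semi-correlations on normal scores: correlations
    restricted to the quadrant where both variables are negative (resp. positive)."""
    d = X.shape[1]
    lo, up = [], []
    for j in range(d):
        for k in range(j + 1, d):
            both_neg = (X[:, j] < 0) & (X[:, k] < 0)
            both_pos = (X[:, j] > 0) & (X[:, k] > 0)
            lo.append(np.corrcoef(X[both_neg, j], X[both_neg, k])[0, 1])
            up.append(np.corrcoef(X[both_pos, j], X[both_pos, k])[0, 1])
    return float(np.mean(lo)), float(np.mean(up))

rng = np.random.default_rng(2)
n = 50000
# Symmetric Gaussian pair: lower and upper semi-correlations should roughly agree
S = np.array([[1.0, 0.6], [0.6, 1.0]])
X = rng.multivariate_normal(np.zeros(2), S, size=n)
print(np.round(semi_correlations(X), 2))
```

A left-skewed sector, by contrast, would show a lower semi-correlation clearly exceeding the upper one, which is the asymmetry the reflected Gumbel linking copulas capture.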

Discussion
Applying Gaussian factor models to high-dimensional data requires extensive computational power. Moreover, Gaussian factor models do not capture multivariate skewness in the data. In this article, we provided a method that exploits the high dimensionality of the data to accurately estimate the unobserved factors and simultaneously allows us to model multivariate skewness in the data. The proposed model is flexible, as it allows us to separately model the margins from the dependence structure. We proved that our model works for a factor model with up to three Gaussian factors. Simulation studies showed that the accuracy of our estimations improves as the sample size and dimension increase. To illustrate the model, we applied it to a dataset of three constituent sectors of the S&P500 stock data and found evidence that the stocks in the three sectors are skewed, consistent with findings in the finance literature.
There are different avenues for future research. One direction is to include fast inference models that allow for tail dependence in factor models or for a nested structure in the factor copula dependence. Another option is to include multiple variables that capture dependence between the data using different one-factor copula models.
Acknowledgments: The authors would like to thank the referee and Associate Editor for their constructive comments that have helped to improve this paper.

Conflict of interest:
The authors state no conflict of interest.

Note: Parameters used in the simulation are the estimated parameters for each sector.

Table 6: Average bias for simulated semi-correlations, with $n = 1{,}000{,}000$ and dimension matching that of each sector.

Appendix

We first consider the case $p = 1$, and we can assume without loss of generality that $\sigma_1(j) = j$. Furthermore, we consider the case when the loadings $\Lambda_1$ and $\Lambda_2$ are monotonically dependent with $\sigma_1(l) = \sigma_2(l)$ (case 1), and when they are independent (case 2). In each case, the corresponding quantities can be computed explicitly for $k = 1, 2$. It follows that Condition 3.1(iii) fails in case 1.
In case 2, $\cos^2(\phi_{1,2})$ is far from one in all cases, and accurate parameter estimates can be obtained in this case.
Similar results can be obtained when $p = 3$, so they are not reported here.
is invertible. The condition is illustrated for a model with one Gaussian factor; the proof is similar in the general case. The model with one factor has the following derivatives. The numerator and denominator on both sides of the above display are nonzero, even though the ratios are the same. However, looping through every possible combination of the pairs $(i, l)$ and $(j, k)$, the only valid outcome is that all differences $a_{il} - \tilde{a}_{il}$ are zero, which implies that all $\delta$'s are equal. However, this contradicts Condition 3.1(iii), which states that at most $O(d^2)$ such terms can be zero, which concludes the proof. □

Proof for a one-factor model ($p = 1$ and $k = 1$). In the case where $p = 1$, it follows that $\Lambda = \lambda_1$.

Proof for a two-factor model ($p = 2$ and $k = 1$). In the case of a two-factor model, the analogous expression holds; putting everything together, we obtain the stated result.

Proof for a three-factor model ($p = 3$ and $k = 1$). For a three-factor model, we obtain the corresponding expression.