Multivariate Medial Correlation with applications

We define a multivariate medial correlation coefficient that extends the probabilistic interpretation and properties of Blomqvist's β coefficient, incorporates multivariate marginal dependencies, and preserves a partial ordering stronger than the concordance relation. We illustrate the results in some models and provide an application to real datasets.


Introduction
Let us consider that X = (X_1, X_2) is a real random vector, over the probability space (Ω, A, P), with continuous marginal distribution functions F_{X_i}, i = 1, 2, and let (U_1, U_2) represent the corresponding uniformized vector, that is, U_i = F_{X_i}(X_i), i = 1, 2.
The medial correlation coefficient of (X_1, X_2), which we will represent by β(X_1, X_2) or β(X), is defined by

$$\beta(X_1, X_2) = P\left(\left(U_1 - \tfrac12\right)\left(U_2 - \tfrac12\right) > 0\right) - P\left(\left(U_1 - \tfrac12\right)\left(U_2 - \tfrac12\right) < 0\right). \tag{1}$$

The β coefficient, introduced by Blomqvist ([1]), takes values in [−1, 1] and compares the propensity for the margins of (X_1, X_2) to take both values above or both values below their respective medians with the propensity for the contrary event.
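As a worked step, the probabilistic description above can be rewritten in terms of the copula C_X of (X_1, X_2) evaluated at the medians. The derivation below is a sketch that uses only the uniformity of U_1 and U_2. The symbols p_c and p_d (probabilities of the "both on the same side" and "on opposite sides" events) are introduced here for illustration only.

```latex
\begin{align*}
p_c &= P\left(U_1 \le \tfrac12,\, U_2 \le \tfrac12\right) + P\left(U_1 > \tfrac12,\, U_2 > \tfrac12\right) \\
    &= C_X\left(\tfrac12,\tfrac12\right) + \left[1 - \tfrac12 - \tfrac12 + C_X\left(\tfrac12,\tfrac12\right)\right]
     = 2\,C_X\left(\tfrac12,\tfrac12\right), \\
p_d &= 1 - p_c, \\
\beta(X_1, X_2) &= p_c - p_d = 4\,C_X\left(\tfrac12,\tfrac12\right) - 1 .
\end{align*}
```

Since 0 ≤ C_X(1/2, 1/2) ≤ 1/2 for any bivariate copula, this form makes the range [−1, 1] immediate.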
Thus, from the representations (4) or (5), we verify that

if X ≺_c Y then β(X) ≤ β(Y). (6)

In addition to increasing with the concordance ordering, the bivariate medial correlation coefficient β satisfies other properties that shape the definition of a measure of concordance according to Scarsini ([9]).
Considering the countermonotonicity, independence and comonotonicity copulas, β takes the values −1, 0 and 1, respectively; in particular, β(C_Π) = 0 and β(C_M) = 1, and we can also represent β(X_1, X_2) by

For a random vector X = (X_1, ..., X_d) with dimension d > 2, if we think about

(i) interpretation as a measure of propensity for all margins to exceed their respective medians or all margins to be below their medians, and

On the other hand, any generalization of β in the multivariate context must preserve at least the property (i) and also verify

(iii) β(C_Π) = 0 and β(C_M) = 1.
Starting from the multivariate version of (5), 4C_X(1/2, ..., 1/2) − 1, rescaled by considering the quotient between its distance to the corresponding value for C_Π and the maximum value of that distance, one finds again the expression of Úbeda-Flores ([13]). In addition to this extension, Schmid and Schmidt ([10]) make a detailed study of a function resulting from a rescaling of C_X(u) + Ĉ_X(v), u, v ∈ [0, 1]^d, putting emphasis on the tail regions of the copula, which determine the degree of large co-movements between the marginal random variables.
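The rescaling described in words above can be sketched numerically. The snippet below is a minimal illustration, not the authors' code: the function names are hypothetical, and it assumes that the maximum distance to the independence value is attained at the comonotonicity copula C_M, whose value at the medians is 1/2. Any common affine factor (such as the 4 and the −1 in 4C − 1) cancels in the quotient, so the sketch works directly with C(1/2, ..., 1/2).

```python
import math


def rescaled_median_value(copula, d):
    """Quotient (C - C_Pi) / (C_M - C_Pi), all evaluated at (1/2, ..., 1/2).

    `copula` is any callable taking a list of d numbers in [0, 1].
    """
    half = [0.5] * d
    c = copula(half)
    c_pi = 0.5 ** d  # independence copula at the medians
    c_m = 0.5        # comonotonicity copula at the medians (assumed maximum)
    return (c - c_pi) / (c_m - c_pi)


def independence(u):
    """Independence copula C_Pi(u) = u_1 * ... * u_d."""
    return math.prod(u)


def comonotone(u):
    """Comonotonicity copula C_M(u) = min(u_1, ..., u_d)."""
    return min(u)


def mixture(u, weight=0.5):
    """Convex combination of C_Pi and C_M (hypothetical test case)."""
    return weight * independence(u) + (1.0 - weight) * comonotone(u)


if __name__ == "__main__":
    for d in (2, 3, 4):
        print(d,
              rescaled_median_value(independence, d),
              rescaled_median_value(comonotone, d),
              rescaled_median_value(mixture, d))
```

For d = 2 the quotient reduces to 4C(1/2, 1/2) − 1, so the sketch agrees with the bivariate coefficient in that case.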
In order to keep (i), (ii) and (iii), we have Joe's sophisticated proposal ([4]), with an axiomatic approach based on linear combinations of the values C_{σ_{i_1} σ_{i_2} ... σ_{i_k} X}(1/2, ..., 1/2), with i_1, ..., i_k in {1, ..., d}, where σ_j X denotes the j-th reflection of X, that is, the vector (X_1, ..., X_{j−1}, −X_j, X_{j+1}, ..., X_d). Joe's axiomatic definition allows for various extensions of β, including those mentioned above and the arithmetic

The extensions of β referred to above increase with multivariate concordance (Joe [5]). We say that X = (X_1, ..., X_d) is less concordant than Y = (Y_1, ..., Y_d), or that C_X is less concordant than C_Y, and in this case we write X ≺_c Y, when we have

$$C_X(u) \le C_Y(u) \quad \text{and} \quad \hat{C}_X(u) \le \hat{C}_Y(u)$$

for u ∈ [0, 1]^d. In the case of d = 2 the two conditions are equivalent, as we have already mentioned.
The above proposed generalizations start from extensions of the representations of bivariate β in terms of copulas, considering the corresponding multivariate copulas.
The proposal that we will make, in the next section, for a multivariate correlation coefficient β(X) starts from a generalization of the probabilistic interpretation of the definition (1) and satisfies almost all the desirable properties for a multivariate concordance measure (Taylor [11], [12]). It preserves a multivariate partial order relation that we introduce in section 4. We present several representations for β(X), demonstrate the main properties, relate it to the previously mentioned coefficients and illustrate with examples and applications.
Motivation for the multivariate medial correlation coefficient
Here ∨ and ∧ denote the maximum and minimum operators, respectively.
When further clarification is needed, we write M_X(I) and W_X(I). Inequalities between vectors are understood as the corresponding inequalities between homologous coordinates. By X_I we understand the subvector of X with margins in I, and P(D) represents the family of subsets of D. When |I| = 1, where |A| denotes the

Let us fix disjoint I and J in P(D). The propensity for the margins of X_I and the margins of X_J to take simultaneously values below the respective medians, or simultaneously values above the respective medians, is evaluated by

and β(W(I), W(J)) :=

Let us make some comments about (15).

(i) The expressions (13), (14) and (15)

(v) A linear combination of β_{{i},{j}}(X), 1 ≤ i < j ≤ d, takes into account the bivariate dependencies in X, but if we consider some function of the coefficients β_{I,J}(X), with I, J ∈ F, for some family F ⊂ P(D) containing sets with more than one element, then we will be incorporating multivariate marginal dependencies.
The definition we propose, in the next section, for a multivariate medial correlation coefficient will be based on the bivariate coefficients β_{{i},D\{i}}(X), incorporating the dependency between each margin X_i and X_{D\{i}}, 1 ≤ i ≤ d.
Our proposal contains, as a particular case, the Blomqvist bivariate coefficient, extends the probabilistic interpretation (1), and takes values in [−1, 1], becoming null naturally when C_X = C_Π and taking the maximum value when C_X = C_M. The remaining properties that we prove allow us to consider it a measure for a multivariate concordance relation stronger than the concordance order.

A multivariate medial correlation coefficient
We propose to evaluate the multivariate medial correlation by comparing the propensity for all margins of X to take simultaneously values below the respective medians, or all margins to exceed their respective medians, with the propensity of each margin X_i to contradict this behavior. That is, we will take into account the coefficients β_{I,J} with the particular choice of I = {i} and J = D \ {i}, i = 1, ..., d.
Definition 3.1. The multivariate medial correlation coefficient of the vector X with dimension d, or of its copula C_X, is defined as

where

We remark that, from comment (i), it can be concluded that β(X) coincides with the Blomqvist coefficient when d = 2.
Below we present some representations of β(X) that will be useful to clarify its properties and interpretation.
Proposition 3.1. The multivariate medial correlation coefficient of the vector X with dimension d admits the following representations:

The relation (23) rewritten in the form

reinforces the idea that β(X) compares the propensity of each margin X_i to agree with the remaining margins together, X_{D\{i}}, and the propensity to disagree with them, when they are all above or all below their respective medians.
The above representations for β show that β, considered as a mapping on copulas, is linear with respect to convex combinations.
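To make the linearity statement concrete in the simplest case, the following sketch uses only the bivariate representation β(C) = 4C(1/2, 1/2) − 1. The weight λ and the copulas C_1 and C_2 are generic symbols introduced here for illustration.

```latex
\begin{align*}
\beta\big(\lambda C_1 + (1-\lambda) C_2\big)
 &= 4\left[\lambda\, C_1\left(\tfrac12,\tfrac12\right) + (1-\lambda)\, C_2\left(\tfrac12,\tfrac12\right)\right] - 1 \\
 &= \lambda\left[4\, C_1\left(\tfrac12,\tfrac12\right) - 1\right] + (1-\lambda)\left[4\, C_2\left(\tfrac12,\tfrac12\right) - 1\right] \\
 &= \lambda\,\beta(C_1) + (1-\lambda)\,\beta(C_2), \qquad 0 \le \lambda \le 1 .
\end{align*}
```

The constant term is absorbed because the weights sum to one; the same cancellation applies to any representation that is an affine function of copula values.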
In the following, we establish relationships between β(X) and the generalizations referred to in the introduction. By applying the definition (10) of β*, we conclude from the representation (23) that

Note that in the 3-dimensional case, the multivariate medial correlation coefficient

Thus, in the 3-dimensional case β equals β* and hence allows a different view on Blomqvist's β discussed in Úbeda-Flores ([13]).
We present the properties of β(X) in the next section and end this one with three examples.
Example 3.1. Consider C_X(u_1, ..., u_4) = u δ , with 0 ≤ δ, α ≤ 1, that is, C_X is the product of two Marshall-Olkin survival copulas ([5]). It holds that

Therefore,

In the case of δ = α = 0 the result agrees with what we expect, since in this case the margins of X are independent. The expression obtained can be related to β(X_1, X_2) and β(X_3, X_4) through

We verify that β(X) increases with δ and α, generalizing what we already knew for β(X_1, X_2) and β(X_3, X_4). Therefore β(X) increases with the concordance of X.
Example 3.2. Let us consider that X has a Gumbel copula

With simple calculations we can also conclude that

$$\beta(-X_1, X_2, X_3) = \frac{-2^{2-2^{\delta}} + 1}{3}$$

and that

which corresponds to the verification, in this example, of a transition property that we present in the next section. Before we present the general expression of the multivariate correlation coefficient for a Gumbel distribution of dimension d ≥ 1, let us also calculate it specifically for d = 4.
We have

and C_{X_{D\{i}}}

These results for d = 2, 3, 4, calculated directly, can also be obtained from the following general result.

If d is even, we have (considering that a sum with the initial value of the counter greater than the final one is null)

and if d is odd, we have

Then β(X) = 2 × (0 + 0)

It follows that, in this example we have β(

The value of β(X) may not increase with the concordance of X. We can verify this with an example proposed by an anonymous referee.

Consider X and Y 4-dimensional vectors with copulas, respectively,

We have X ≺_c Y and however β(

If X ≺_c Y and, for each i ∈ D,

then, from proposition 3.1, (23), we can conclude that β(X) ≤ β(Y).
The verification of condition (28) together with X ≺_c Y, which can be illustrated with example 3.2, tells us that, in addition to the propensity for all margins to exceed their respective medians, or all margins to be below their medians, being higher in Y, the propensity for each margin to disagree with the remaining ones, in this sense, is also lower in Y, reinforcing the relation X ≺_c Y.
When we have X ≺_c Y and (28), we denote this type of relation by X ≺≺ Y.
The relation ≺≺ is a point-wise partial ordering on the set of d-dimensional copulas that implies the concordance relation. For d = 2 both relations coincide.
Therefore C_M is the maximal copula.
In particular copula classes, the relation ≺≺ can induce a total order, as for example in the family of 3-dimensional copulas of example 3.2. In this class we can also see, from (27), that C_Π is the least element and ≺≺ is a well order.
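As a rough numerical illustration of how a one-parameter family can be ordered by its parameter, the sketch below evaluates a d-dimensional Gumbel copula at the medians. The parametrization C(u) = exp(−(Σ (−ln u_i)^{1/δ})^δ) with 0 < δ ≤ 1 is an assumption, chosen to be consistent with the exponent 2^δ that appears in example 3.2, and the function names are hypothetical. The check concerns only the single point (1/2, ..., 1/2), which is necessary for, but does not establish, the relation ≺≺.

```python
import math


def gumbel_copula(u, delta):
    """Gumbel copula under the assumed parametrization, 0 < delta <= 1.

    delta = 1 gives the independence copula; delta -> 0 approaches C_M.
    Every coordinate of u must lie in (0, 1].
    """
    s = sum((-math.log(ui)) ** (1.0 / delta) for ui in u)
    return math.exp(-(s ** delta))


def gumbel_at_medians(d, delta):
    """C(1/2, ..., 1/2); analytically this equals 2 ** (-(d ** delta))."""
    return gumbel_copula([0.5] * d, delta)


def bivariate_beta_gumbel(delta):
    """Bivariate Blomqvist coefficient 4 C(1/2, 1/2) - 1 for this family."""
    return 4.0 * gumbel_at_medians(2, delta) - 1.0


if __name__ == "__main__":
    for delta in (1.0, 0.75, 0.5, 0.25):
        print(delta, gumbel_at_medians(3, delta), bivariate_beta_gumbel(delta))
```

Under this parametrization the value at the medians grows as δ decreases, starting from the independence value 2^{−d} at δ = 1, which matches the statement that C_Π is the least element of the class.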
A weaker relation, although not so informative, could be considered in this work by

The above properties on the values of the multivariate medial correlation coefficient are arranged in the following proposition.
Proposition 4.1. The values of the multivariate medial correlation coefficient for vectors of dimension d satisfy the following properties:

(iv) If C_X = C_M then β(X) = 1.
In the proposition below we present the properties of continuity, permutation invariance, duality, reflection symmetry and transition, which, together with (i)-(iii) of the previous proposition and following Taylor [11], [12], justify calling the proposed coefficient a measure for the relation ≺≺.

(i) If {C_{X_n}}_{n≥1} converges uniformly to C_X, as n → +∞, then lim_{n→+∞} β(X_n) = β(X).
(ii) The value of β(X) is invariant for permutations of the margins of X.

Application to real data
The multivariate medial correlation coefficient in (16) can be estimated through the bivariate coefficients in (17). Here we consider the respective empirical counterparts.
This estimation procedure has already been addressed in the literature (Blomqvist [1], Schmid and Schmidt [10] and references therein).
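For orientation, a sample version of the bivariate Blomqvist coefficient can be sketched as follows. This is a minimal illustration and not necessarily the estimator used in this paper: the function name is hypothetical, sample medians replace the population medians, and observations lying exactly on a median line are discarded, which is one common convention among several.

```python
import statistics


def sample_blomqvist_beta(x, y):
    """Sample Blomqvist beta from paired observations x and y.

    Counts pairs on the same side of both sample medians (concordant)
    against pairs on opposite sides (discordant). Pairs with a coordinate
    equal to its sample median are dropped (an assumed convention).
    """
    if len(x) != len(y) or len(x) == 0:
        raise ValueError("x and y must be non-empty and of equal length")
    med_x = statistics.median(x)
    med_y = statistics.median(y)
    concordant = 0
    discordant = 0
    for a, b in zip(x, y):
        sign = (a - med_x) * (b - med_y)
        if sign > 0:
            concordant += 1
        elif sign < 0:
            discordant += 1
    used = concordant + discordant
    if used == 0:
        raise ValueError("all observations lie on a median line")
    return (concordant - discordant) / used


if __name__ == "__main__":
    print(sample_blomqvist_beta([1, 2, 3, 4], [1, 2, 3, 4]))
    print(sample_blomqvist_beta([1, 2, 3, 4], [4, 3, 2, 1]))
    print(sample_blomqvist_beta([1, 2, 3, 4], [1, 4, 2, 3]))
```

Because only the position of each observation relative to the medians matters, the statistic is unchanged by strictly increasing transformations of either variable.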
where, according to (17), we take

We are going to apply the multivariate medial correlation coefficient estimator β̂ in (29) to two datasets.

Figure 2: Scatterplots of the variables within the wine dataset: residual sugar versus density (top-left), residual sugar versus alcohol (top-center) and density versus alcohol (top-right); density versus total sulfur dioxide (bottom-left), residual sugar versus total sulfur dioxide (bottom-center) and total sulfur dioxide versus alcohol (bottom-right).

The multivariate medial correlation coefficient that we propose extends the probabilistic interpretation and properties of the Blomqvist β coefficient, is calculable from the copula, incorporates the dependence between each margin of the vector and the vector of the remaining margins, and is a measure of a strong mode of multivariate concordance.
The estimation is addressed on the basis of bivariate inferential methodology available in the literature, and we illustrate its application using real data.
The adopted approach envisages the possibility of considering other functions of bivariate coefficients involving extremes of subvectors of X, as well as the possibility of adapting the method to generalize other coefficients of bivariate dependence.