Hoeffding–Sobol decomposition of homogeneous co-survival functions: from Choquet representation to extreme value theory application

Abstract: The paper investigates the Hoeffding–Sobol decomposition of homogeneous co-survival functions. For this class, the Choquet representation is transferred to the terms of the functional decomposition, as well as to their individual variances and to the superset combinations of those. The domain of integration in the resulting formulae is reduced in comparison with the already known expressions. When the function under study is the stable tail dependence function of a random vector, ranking these superset indices corresponds to clustering the components of the random vector with respect to their asymptotic dependence. Their Choquet representation is the main ingredient in deriving a sharp upper bound for the quantities involved in the tail dependograph, a graph in extreme value theory that summarizes asymptotic dependence.

where $d\lambda_u(x) = \prod_{i\in u} d\lambda_i(x_i)$ and $-v = \{1,\dots,d\}\setminus v$; see [5,18,19]. The term $f_u$ only depends on the components of $x$ associated with $u$. The constant term $f_\emptyset$ is equal to $\int f\,d\lambda$ and the global variance is given by $\sigma^2 = \int (f - f_\emptyset)^2\,d\lambda$. Set $\sigma_u^2 = \int f_u^2\,d\lambda_u$ and $\sigma_\emptyset^2 = 0$. Then, from orthogonality arguments (see, for instance, [2]), the term $f_u$ is centered (except for the empty set) and the FANOVA expression relies on the equality $\sigma^2 = \sum_{u\subseteq\{1,\dots,d\}} \sigma_u^2$. Interest in the individual variances $\sigma_u^2$, and more particularly in their ratio to the total variance $\sigma_u^2/\sigma^2$, traces back to [18] and [12]. The current research problems in Global Sensitivity Analysis (GSA) are varied in nature. Our concern in this paper is not improvements for estimation, cost-saving, construction of surrogate models, or other practical but no less crucial aspects or perspectives; see rather [11,13] and references contained therein for an overall and recent assessment. The main goal here is to reveal simplified theoretical expressions for the quantity $\sigma_u^2$ within a specific class of functions. Knowing such quantities $\sigma_u^2$ allows one to order the importance of the input variables $x_1,\dots,x_d$ with respect to the global variance of $f$, the function under study.
Reducing the number of variables of interest in f is one of the main consequences of this hierarchical ranking.
In this paper we will concentrate on homogeneous co-survival functions [...] is stated in [16, Theorem 2]. Some details are given in Section 2 to make the paper almost self-contained.
In extreme value theory, stable tail dependence functions (stdf), usually denoted by $\ell$, play a central role in describing the asymptotic dependence between the components of a random vector $X = (X_1,\dots,X_d)$. Assuming the existence of a multivariate domain of attraction for the componentwise maxima of $X$ is a classical starting point. This is equivalently written as in terms of $F, F_1,\dots,F_d$, the cumulative distribution functions of $X, X_1,\dots,X_d$. More details on multivariate extreme value theory can be found, e.g., in [1,3,4,7,10]. As pointed out in [16], the stdfs are particular cases of homogeneous co-survival functions. The corresponding probability measures $\nu$ in (3) must satisfy $d$ constraints induced by the fact that a stdf equals 1 at unit vectors. A graph based on the Hoeffding–Sobol decomposition of a stdf, called the tail dependograph, has been introduced in [10]. It reveals the asymptotic dependence structure of the random vector $X$ through the structural analysis of the function $\ell$. Tail superset indices, which are the superset combinations of individual variances, are of prime interest in the tail dependograph. Their pairwise values define the thickness of the edges.
The aim of this paper is twofold. On the one hand, we shall establish a simplified expression for the individual variances $\sigma_u^2$, as well as for their superset combinations, when the function under study is a homogeneous co-survival function. The resulting Choquet representation thus provides new test cases for GSA. On the other hand, we will apply these results to stdfs so that upper bounds for the tail superset indices will be obtained. Proving this majorization initially motivated the current study.
The paper is organized as follows. We first investigate the class of homogeneous co-survival functions: in Section 2, the expressions of the FANOVA effect $\psi_u$ and of the corresponding variance $\sigma_u^2$ are written as integrals of rank-one tensors (which are products of univariate functions in each of the input parameters, as defined by [8]). The numerical performance of our results is analyzed at the end of this part. As an application, the study focuses on stdfs in Section 3. The new expressions allow us to derive sharp upper bounds for the tail superset indices. All proofs are postponed to Section 4. Finally, the last lines summarize conclusions and references.
Notation. Let $\vee$ and $\wedge$ stand respectively for the maximum and the minimum. Set $x_+ = x \vee 0$. The indicator $\mathbb{1}_A$ equals 1 on $A$ and 0 on $A^c$. Set $\mathbf{1} = (1,\dots,1) \in \mathbb{R}^d$. The vector $z_u$ is the concatenation of $z_i$ for $i \in u$ so that $(z_u, x_{-u}) = \sum_{i\in u} z_i e_i + \sum_{i\notin u} x_i e_i$ in the canonical basis $(e_1,\dots,e_d)$. Binary operations are understood componentwise, e.g.
and let $K_i(w; s)$ stand for $K_i(w, w; s, s)$. The notation $\psi$ is used for a homogeneous co-survival function whereas $\ell$ represents a stdf.

2 FANOVA of homogeneous co-survival functions
In this section, the functional decomposition is explored under a new setting by considering homogeneous co-survival functions. Before stating our main result, we give a description of the class under study. It is worth noticing that focusing on the unit hypercube $[0,1]^d$ is not restrictive, by the homogeneity assumption.

2.1 Choquet representation of homogeneous co-survival functions
Like distribution functions, co-survival functions are essentially characterized by a special multivariate monotonicity property. First, we introduce some notation. Let $A_1,\dots,A_d$ be non-empty sets, $A = A_1\times\cdots\times A_d$, and let $f : A \to \mathbb{R}$ be any function. Then for $x, z \in A$ we put Moreover, for a non-empty subset $u \subset \{1,\dots,d\}$ and for $x_{-u} \in \prod_{j\in -u} A_j$, let us define on $\prod_{j\in u} A_j$, and if this inequality also holds whenever some of the variables are fixed, for the function of the remaining variables, i.e. if for each non-empty subset $v \subset \{1,\dots,d\}$, for each $y \in \prod_{j\in -v} A_j$ and any See [17] for a detailed presentation of this concept.
If the reader is not familiar with Radon measures, one should only keep in mind that this assumption ensures that $f$ is well defined and $f(x)$ finite for any $x \in \mathbb{R}^d_+$. By Theorem 3 in [16] one knows that it is equivalent to assuming $f$ $d$-alternating, left continuous, and $f(0) = 0$. Moreover, for any $0 \le x < z$ in $\mathbb{R}^d$ by an application of the inclusion/exclusion principle. Now, if $f$ is additionally assumed to be homogeneous, that is $f(tx) = t f(x)$ for any positive $t$ and vector $x$, then the measure $\mu$ is homogeneous: $\mu(tA) = t\mu(A)$ for any positive $t$ and measurable subset $A$ (and conversely). Note that any homogeneous $d$-alternating function $f : \mathbb{R}^d_+ \to \mathbb{R}$ is automatically continuous, non-negative, with $f(0) = 0$.
An important example of a homogeneous measure is given by the image $\lambda_w$ of the Lebesgue measure $\lambda$ on $\mathbb{R}_+$ under the mapping $s \mapsto s/w$, where $w \in C$. The co-survival function of $\lambda_w$ is then $x \mapsto \max(x\cdot w)$. These functions will play a decisive role in the following, since they are the "building stones" of all homogeneous co-survival functions. More precisely, consider the set of all normalized functions discussed above, $K := \{\psi \text{ homogeneous and } d\text{-alternating on } \mathbb{R}^d_+ : \psi(\mathbf{1}) = 1\}$. Then $K$ is obviously convex and compact (with respect to pointwise convergence). It turns out that $K$ is even a simplex, with $\{x \mapsto \max(x\cdot w) \mid w \in C\} = \mathrm{ex}(K)$ as its set of extreme points, and this set is closed (so compact as well); see [16, Theorem 4 (ii)]. In other words, $K$ is a so-called Bauer simplex, i.e. for each $\psi \in K$ the representing probability measure on $\mathrm{ex}(K)$, guaranteed by the Krein–Milman theorem, is unique. The resulting integral representation is also called the Choquet representation. So, for each $d$-alternating and homogeneous $\psi$ on $\mathbb{R}^d_+$, $\psi \not\equiv 0$, there is a unique probability measure $\nu$ on $C$ such that $\psi(x) = \psi(\mathbf{1}) \int_C \max(x\cdot w)\,d\nu(w)$. It is easily seen that $\psi$ is the co-survival function of the measure $\mu := \psi(\mathbf{1}) \int_C \lambda_w\,d\nu(w)$.
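To make the Choquet representation concrete, the following minimal sketch evaluates $\psi(x) = \psi(\mathbf{1}) \int_C \max(x\cdot w)\,d\nu(w)$ when $\nu$ is a finite mixture of point masses. The function name `choquet_psi`, its arguments, and the restriction to a discrete $\nu$ are illustrative assumptions, as is the reading of $C$ as the set of non-negative vectors with max-norm one.

```python
from typing import Sequence


def choquet_psi(x: Sequence[float],
                atoms: Sequence[Sequence[float]],
                weights: Sequence[float],
                total_mass: float = 1.0) -> float:
    """Evaluate psi(x) = psi(1) * sum_k p_k * max_i(x_i * w_{k,i}).

    Assumption: nu is discrete with atoms w_k and probabilities p_k;
    total_mass plays the role of psi(1).
    """
    if abs(sum(weights) - 1.0) > 1e-12:
        raise ValueError("weights must sum to one")
    for w in atoms:
        # Assumed description of C: non-negative vectors with max-norm one.
        if min(w) < 0.0 or abs(max(w) - 1.0) > 1e-12:
            raise ValueError("each atom must be non-negative with max-norm one")
    return total_mass * sum(
        p * max(xi * wi for xi, wi in zip(x, w))
        for w, p in zip(atoms, weights)
    )


x = (0.2, 0.7, 0.5)

# nu = Dirac mass at (1, 1, 1): psi is the max function.
psi_max = choquet_psi(x, atoms=[(1.0, 1.0, 1.0)], weights=[1.0])

# nu = uniform mixture of the basis vectors, psi(1) = 3: psi is the sum.
basis = [(1.0, 0.0, 0.0), (0.0, 1.0, 0.0), (0.0, 0.0, 1.0)]
psi_sum = choquet_psi(x, atoms=basis, weights=[1 / 3, 1 / 3, 1 / 3], total_mass=3.0)
```

The two calls recover the two extreme dependence structures discussed later in the paper: a single atom on the diagonal gives the max function, and atoms on the coordinate axes give the additive function.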

2.2 Expression of Sobol effects and associated variances
The main result of this paper is stated below. It says that the Sobol effects $\psi_u$ (as well as their variances) have simpler expressions than (2) when $\psi$ is a homogeneous co-survival function. Indeed, they are expressed as integrals on $C \times [0,1]$ of rank-one functions. Recall that $\mathbf{1} = (1,\dots,1)$ in $\mathbb{R}^d$.

Theorem 1. Let $\psi$ be a homogeneous co-survival function (3) associated with a spectral probability measure $\nu$ on $C$. Then, the term $\psi_u$ in the Hoeffding–Sobol decomposition with respect to $\lambda$ satisfies on
for any non-empty subset $u$ of $\{1,\dots,d\}$ and The corresponding variance of $\psi_u$ has the following expression Furthermore,

The link between the Sobol effects $\psi_u$, their variances $\sigma_u^2$ and the spectral measure $\nu$ has thus been made explicit. The main ingredient for proving the previous theorem is to remark that the spectral representation of $\psi$ can be written as an integral of rank-one tensors. Then, all Sobol effects $\psi_u$ and corresponding variances $\sigma_u^2$ inherit the same form by application of the Fubini–Tonelli theorem. As can be seen through Formula (2), the variance $\sigma_u^2$ is usually computed as an alternating combination of cumulated variances. It thus suffers from an accumulation of estimation errors, especially as $d$ becomes larger. Theorem 1 offers a setting where the numerical complexity of $\sigma_u^2$ is the same as that of $\sigma^2$ or other well-known quantities discussed in Subsection 2.3.

Example 1. If the measure λ corresponds to the product of Lebesgue measures dλ
, a particular extreme point. By Theorem 1, with the probability measure $\nu = \delta_{\mathbf{1}}$ on $C$, one

Example 2. Consider an extreme point of the convex and compact set $K$ (mentioned in Subsection 2.1), precisely $\psi(x) = \max(x\cdot w)$ with $w \in C$. It is worth noticing that Theorem 1 furnishes the expressions of the variances $\sigma^2$ and $\sigma_u^2$ as integrals on $[0,1]^2$ of a product of $d$ univariate functions. In comparison with their original definitions, already mentioned in the introduction, this provides an important gain: the number of integrals is reduced (it is no longer an alternating sum) and the domain of integration is smaller. For a given value of $w$, the calculations would give exact expressions after very tedious efforts. One could instead approximate them numerically by Monte-Carlo procedures on $[0,1]^2$.
With $\lambda$ as the two-dimensional Lebesgue measure, we focus here on these extreme points in the bivariate setting. For $w = (w, 1)$, we obtain with the following decomposition of $\sigma^2$

2.3 Consequences for cumulated variances
It turns out that several combinations of variances are of prime interest in order to characterize the importance of a subset $u$ of variables. Justifications can be found in [9,18] in the case of $I_u = \sum_{v\subseteq u} \sigma_v^2$ and $\tau_u = \sum_{v\cap u\neq\emptyset} \sigma_v^2$. We see immediately that $0 \le I_u \le \tau_u \le \sigma^2$ and $I_u + \tau_{-u} = \sigma^2$. Finally, [6] examined the meaning of the sum over the supersets of $u$, $\Upsilon_u = \sum_{v\supseteq u} \sigma_v^2$. Ranking based on the superset quantities $\Upsilon_u$ takes into account the importance of $x_u$ but additionally that of any vector containing these $|u|$ variables. Formulae depending on the spectral measure are now derived for these three types of cumulated variances. The next corollary asserts that they are also written as integrals of rank-one tensors.

Corollary 1. Let $\psi$ be a homogeneous co-survival function (3) associated with a spectral probability measure $\nu$ on $C$. Then,
The Choquet representation of $\Upsilon_u$ will play a crucial role in the proof of the upper bound stated in the extreme value theory setting at the end of the paper.
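The three cumulated variances of this subsection are plain sums of individual variances over families of subsets, which the following sketch makes explicit. It assumes the usual definitions (closed index: subsets of $u$; total index: subsets meeting $u$; superset index: supersets of $u$); the function names are hypothetical. The toy table contains the individual variances of the bivariate function $\max(x_1, x_2)$ under independent uniform inputs, obtained by direct calculation: $1/45$ for each main effect and $1/90$ for the interaction.

```python
from fractions import Fraction


def closed_index(variances, u):
    """I_u: sum of sigma_v^2 over all subsets v of u."""
    return sum(s for v, s in variances.items() if v <= u)


def total_index(variances, u):
    """tau_u: sum of sigma_v^2 over all v that intersect u."""
    return sum(s for v, s in variances.items() if v & u)


def superset_index(variances, u):
    """Upsilon_u: sum of sigma_v^2 over all supersets v of u."""
    return sum(s for v, s in variances.items() if v >= u)


# Individual variances of max(x1, x2) with independent uniform inputs.
variances = {
    frozenset(): Fraction(0),
    frozenset({1}): Fraction(1, 45),
    frozenset({2}): Fraction(1, 45),
    frozenset({1, 2}): Fraction(1, 90),
}
sigma2 = sum(variances.values())  # global variance, equal to 1/18

u, minus_u = frozenset({1}), frozenset({2})
print(closed_index(variances, u), total_index(variances, u),
      superset_index(variances, u))
```

Exact rational arithmetic is used so that the identities $I_u + \tau_{-u} = \sigma^2$ and $0 \le I_u \le \tau_u \le \sigma^2$ can be checked without rounding issues.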

Example 3. Consider again Example 1. One obtains easily
In the authors' opinion, the current example (like its first part, Example 1) looks promising as a convenient test function. It provides a simple but non-trivial function whose individual, cumulated and global variances are all known, for any dimension $d$.
In [9, Theorem 1] the following identity is shown . The gain of the expression of $\Upsilon_u$ claimed in Corollary 1 can be questioned with regard to the dimension of the domain of integration. Similar comments hold for $I_u$ and $\tau_u$ with reference formulae (4) and (5). However, there is no direct formula for $\sigma_u^2$, i.e. one based on $\psi$, except by inversion of (6), for instance. It yields Comparing the already known formula (8) with our result associated with $\sigma_u^2$ in Theorem 1 makes the interest of our expressions more obvious. In Theorem 1 indeed, it is no longer expressed as an alternating sum of integrals, and the dimension of integration has been reduced. Nevertheless, to be also numerically convincing, a wide comparison between the estimation of $\sigma_u^2$ derived from (8) and from Theorem 1 will now be offered. The same comparison is done for the estimation of $\Upsilon_u$, Formula (6) competing with the one from Corollary 1.
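As an illustration of the general-purpose approach, here is a minimal Monte-Carlo sketch of a superset index. It assumes that the identity of [9, Theorem 1] takes its usual form $\Upsilon_u = 2^{-|u|} \int\!\!\int \big(\sum_{v\subseteq u} (-1)^{|u\setminus v|} f(x_v, z_{-v})\big)^2 d\lambda(x)\,d\lambda(z)$ with independent uniform inputs; the function name and signature are hypothetical. Note that the integration runs over $2d$ variables, which is the cost the spectral formulae aim to avoid.

```python
import random
from itertools import combinations


def superset_index_mc(f, d, u, n_samples, rng):
    """Monte-Carlo estimate of Upsilon_u for f on [0, 1]^d (0-based indices in u).

    Assumed identity: Upsilon_u is 2^{-|u|} times the mean of the squared
    alternating sum of f over the hybrid points (x_v, z_{-v}), v subset of u.
    """
    u = tuple(u)
    subsets = [v for k in range(len(u) + 1) for v in combinations(u, k)]
    acc = 0.0
    for _ in range(n_samples):
        x = [rng.random() for _ in range(d)]
        z = [rng.random() for _ in range(d)]
        total = 0.0
        for v in subsets:
            point = list(z)            # start from z ...
            for i in v:
                point[i] = x[i]        # ... and copy the coordinates of x on v
            sign = -1.0 if (len(u) - len(v)) % 2 else 1.0
            total += sign * f(point)
        acc += total * total
    return acc / (n_samples * 2 ** len(u))


rng = random.Random(0)
upsilon_max = superset_index_mc(max, 2, (0, 1), 100_000, rng)  # exact value: 1/90
upsilon_sum = superset_index_mc(sum, 2, (0, 1), 1_000, rng)    # additive: zero
print(upsilon_max, upsilon_sum)
```

For the bivariate max function the exact pairwise index is $1/90$, which is consistent with the normalization by 90 used for the tail dependograph; for an additive function the alternating sum vanishes identically, so the estimate is zero up to rounding.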

2.4 Numerical illustrations
For the sake of simplicity, we assume here that the distributions of the inputs are known and fixed as uniform. Our goal is to compare the effectiveness of the new formulae, obtained for homogeneous co-survival functions, with the already known and general ones. Both are integrals approximated by Monte-Carlo procedures, but neither the domain of integration nor the complexity of the integrand is the same. Our choices must ensure a fair comparison. One possibility is to compare the estimates obtained after a given common execution time. However, this depends strongly on the way the integrands are coded. We thus decide to fix the Monte-Carlo size $N$ on the unit interval.
We will first restrict ourselves to the case of the max function where $\theta$ is the true value and $\hat\theta_{i,N}$ is the $i$-th estimate. The number of replicates here is n = .
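The error criterion can be sketched on the simplest quantity available in closed form. The snippet below assumes that AME denotes the average of the absolute errors $|\hat\theta_{i,N} - \theta|$ over the replicates, and it uses the global variance of the max function under independent uniform inputs, $\sigma^2 = d/((d+1)^2(d+2))$, as the true value $\theta$. The function names, the seed, and the sizes $N$ and $n$ are arbitrary illustrative choices, not the settings of the study.

```python
import random


def true_variance_max(d):
    """Variance of max(U_1, ..., U_d) for independent uniform U_i on [0, 1]."""
    return d / ((d + 1) ** 2 * (d + 2))


def mc_variance(f, d, n_samples, rng):
    """Plain Monte-Carlo estimate of the global variance of f on [0, 1]^d."""
    values = [f([rng.random() for _ in range(d)]) for _ in range(n_samples)]
    mean = sum(values) / n_samples
    return sum((v - mean) ** 2 for v in values) / n_samples


def absolute_mean_error(estimates, theta):
    """Assumed definition of the AME: average of |estimate - theta|."""
    return sum(abs(e - theta) for e in estimates) / len(estimates)


rng = random.Random(2021)
d, n_samples, n_replicates = 3, 2000, 20   # illustrative sizes only
theta = true_variance_max(d)
estimates = [mc_variance(max, d, n_samples, rng) for _ in range(n_replicates)]
ame = absolute_mean_error(estimates, theta)
print(theta, ame)
```

The same loop applies unchanged to any estimator of $\sigma_u^2$ or $\Upsilon_u$ once a true value is available, which is what makes the max function a convenient test case.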
The level of accuracy is the same in each column of Table 1 in order to facilitate the comparison. Two values for $N$ have been handled: N = and N = , . However, for the largest value, the time limit has been reached using the already known formula.

Table 2: AME for the estimation of $\Upsilon_u$ when $\psi(x) = \max(x)$. A missing value (-) refers to exceeding the time limit.
In Table 2 only one estimate could not be obtained because the time limit was exceeded. Again, the level of accuracy is the same in each column to facilitate the comparison.
Let us now consider another homogeneous co-survival function associated with a discrete probability measure $\nu = \sum_{k=1}^m p_k \delta_{w_k}$ where each $w_k$ lies in $C$ and $p_1 + \dots + p_m = 1$ so that Fix arbitrarily m = and d = . The weights, chosen at random, are $(p_1,\dots,p_m)$ = ( . , . , . , . , . , . , . , . , . , . , . , . , . , . , Since the true values are not easily computable, we only provide a graphical comparison of the resulting boxplots obtained from n = repetitions. As expected, this numerical study shows that the estimation from the new formulae is more accurate. This is nothing more than the illustration of the domain of integration being reduced. The reader should be aware that recent studies in GSA provided new methods compared to the classical Monte-Carlo procedure. Going further with a comparison based on the pick-freeze method or any other refinement would clearly exceed our ambitions in this paper.

3 Statistical applications in extreme value theory
In the following, we focus on the Hoeffding–Sobol representation of a stable tail dependence function (stdf). A homogeneous co-survival function $\ell$ is a stdf iff $\ell(e_1) = \dots = \ell(e_d) = 1$, i.e., it is associated with a probability This means that $\mu^*$ is the image of $\mu$ under $x \mapsto 1/x$, so that $\mu$ is directly homogeneous (as is $\ell$) when $\mu^*$ is inversely homogeneous: $\mu^*(tA) = t^{-1}\mu^*(A)$ for any positive $t$ and any measurable set $A$ of $[0,\infty]^d \setminus \{0\}$. Whereas the characterization of stdfs was shown relatively late [16, Theorem 6], their integral representation was known long before: it goes back essentially to [4]. Most uses of their integral representation have been made under the $L^1$- or $L^2$-norm on $\mathbb{R}^d_+$. But as emphasized by de Haan and Resnick, it is an arbitrary choice. As seen in Section 2, the extreme points of $K$ (functions $x \mapsto \max(x\cdot w)$ for $w \in C$) combined with the max-norm were natural choices here.
The main objective of this section is to analyze the theoretical aspect of the functional decomposition for stdfs with respect to the Lebesgue measure $d\lambda(x) = dx_1 \cdots dx_d$. As mentioned in the introduction, this idea has been introduced in [10] but the focus was on the meaning of $\Upsilon_u$ in this context, named tail superset indices, and on their estimation. To illustrate their importance in multivariate extreme value modeling, let us focus for instance on the comparison $\Upsilon_{\{i,j\}}(\ell) < \Upsilon_{\{h,k\}}(\ell)$. This means that the asymptotic dependence between the components $X_i$ and $X_j$ themselves, added to the asymptotic dependence between the pair $(X_i, X_j)$ and the $d - 2$ remaining variables, is weaker than its equivalent in $h, k$. Reducing the dimension of the asymptotic dependence structure consists in selecting subsets $u$ according to their tail superset indices $\Upsilon_u$.
Below, we first obtain a simplified expression for these indices by application of Corollary 1 to $\ell$. Then, we deduce an upper bound for the tail superset indices. The section ends with a short discussion.

3.1 Tail superset indices
The tail dependograph introduced in [10] starts from a non-oriented graph whose vertices represent the components of the random vector $X$ in the domain of attraction of $\ell$. The edge between $i$ and $j$ is drawn proportionally to the pairwise superset index $\Upsilon_{\{i,j\}}$ of $\ell$. This index measures the strength of asymptotic dependence between the components $X_i$ and $X_j$, not only in their associated bivariate model $(X_i, X_j)$, but in the complete model $X$. A thick line reveals a strong asymptotic dependence between the corresponding components, whereas, at the opposite, such an index vanishes when the asymptotic dependence is null. The present paper thus offers a theoretical expression of the tail dependograph indices as Pairwise indices are perhaps the most important since their value on a graph is easily represented by the thickness of a segment. However, more general indices can be defined and an application of the previous section also provides the representation of $\Upsilon_u(\ell)$ as follows

Examples. Asymptotic independence occurs when $\ell_+(x) := \sum_{i=1}^d x_i$ so that $\ell_+(\mathbf{1}) = d$ and $\nu = (\sum_{i=1}^d \delta_{e_i})/d$. All the terms in the integrand of $\Upsilon_{\{i,j\}}(\ell_+)$ cancel since at least the term depending on $i$ or on $j$ (or both) will be reduced to $(1 - 1)$. As a consequence $\Upsilon_u(\ell_+) = 0$ as soon as $|u| \ge 2$.
For in-between strengths of asymptotic dependence, one can use logistic extreme value models. Symmetric versions $\ell(x) = (x_1^{1/r} + \dots + x_d^{1/r})^r$ for $r \in (0,1)$ are obtained with $\ell(\mathbf{1}) = d^r$ and as pointed out in [16]. Unfortunately, the expressions obtained in the present paper do not allow a real simplification under such models.
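The symmetric logistic model is easy to evaluate directly, and its defining properties can be checked numerically. The sketch below is only an illustration (the function name is hypothetical): it verifies homogeneity, the normalization at the canonical basis vectors, the value $d^r$ at $\mathbf{1}$, and the well-known sandwich between the max function (complete dependence) and the sum (independence).

```python
def logistic_stdf(x, r):
    """Symmetric logistic stdf: (sum_i x_i^(1/r))^r for r in (0, 1)."""
    if not 0.0 < r < 1.0:
        raise ValueError("r must lie strictly between 0 and 1")
    return sum(xi ** (1.0 / r) for xi in x) ** r


x = (0.3, 0.8, 0.5)
d = len(x)

for r in (0.2, 0.5, 0.9):
    value = logistic_stdf(x, r)
    # Sandwich between complete dependence (max) and independence (sum).
    print(r, max(x), value, sum(x))

value_at_one = logistic_stdf((1.0,) * d, 0.5)        # equals d ** 0.5
value_on_axis = logistic_stdf((0.0, 1.0, 0.0), 0.5)  # equals 1
```

Small values of $r$ push the function towards the max function, while values close to one push it towards the sum, which is why this family is a natural choice for intermediate dependence strengths.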

3.2 Upper bounds for tail superset indices
Some simple computations yield the following lower and upper bounds.

Lemma 1. Let $\ell$ be a $d$-variate stable tail dependence function. Then,
.
Moreover, for any subset $u$ of $\{1,\dots,d\}$, set $\ell_{\vee,u}(x) = \max_{i\in u} x_i$. Then, The lower bound is given by .
Hence, one can derive lower and upper bounds for $I_u$ of a stable tail dependence function in any dimension, and for any size of the subset $u$, whenever $\ell_\emptyset$ is controlled. However, this does not provide a simple way to deduce bounds for $\sigma_u^2$ or, more interestingly in the tail dependograph context, for $\Upsilon_u$. The following result answers this question.
The proof is postponed to Section 4. However, note that it relies on the following preliminary result.
The authors conjectured the sharp upper bound in Theorem 2 a long time ago, but a rigorous proof only became possible after transferring the Choquet representation of the function to its indices, as investigated in Section 2. The optimization problem dealt with in Theorem 2 might be looked at in the broader perspective of maximizing a convex functional over a compact convex set (which need not be a simplex). Bauer's maximum principle ensures that the maximal value is attained at an extreme point, in our case at an extreme stable tail dependence function. It gives, however, no hint on the location of such a point nor on its uniqueness. Our statement in Theorem 2 completely answers the question: it asserts the existence, the uniqueness and the location of the maximizer (and so finds the maximal value) of the maximization problem.
The following statement is included in the proof of Theorem 2.

3.3 Practical meaning and use
Taking into account the bounds provided by Theorem 2, the tail superset indices are normalized after multiplication by 90, so that the corresponding affinity matrix is $[90\,\Upsilon_{\{i,j\}}(\ell)]_{1\le i,j\le d}$, on which classical clustering algorithms and analyses can be performed. However, even if this pairwise normalization is correct, the use of this bound is more powerful when comparing subsets with different sizes. Indeed, thanks to the renormalization $\Upsilon_u(\ell)/\Upsilon_u(\ell_{\vee,u})$ due to Theorem 2, all the renormalized tail superset indices can now be compared even if the subsets $u$ have unequal sizes. The effective dimension of the asymptotic dependence structure could be defined as .
In the asymptotically independent case, $\Delta(\ell_+) = 1$ since $\Upsilon_{\{i,j\}}$ always vanishes: the additive component of $\ell_+$ explains the whole variance. In the asymptotically completely dependent case, $\Delta(\ell_\vee) = d$ by application of Theorem 2. Now, for models in between, rules can be easily defined: select a subset $u$ that achieves the maximization, or remove (in the asymptotic dependence modeling) subsets associated with small values of the previous brace.
Let us provide an example. Consider a trivariate stdf with value at (x, y, z) given by It is an asymmetric extreme value logistic model. Its associated tail dependograph is drawn below , . and Υ ( ) · Υ − ( ∨, ) .
. These calculations reveal that the strength of asymptotic dependence between the three components, when modelled by $\ell$, is closer to its maximal possible value than its pairwise equivalent. Thus, the effective dimension of $\ell$ is 3 and not 2, according to this criterion. In other words, one should not simplify the representation of $\ell$ by combining only bivariate terms. The knowledge of our bound is crucial to construct this reasoning.

4 Proofs
Proof of Theorem 1. The proof relies on the combination of Then, applying the Fubini–Tonelli theorem yields For non-empty $u$, the term $\psi_u$ is centered so that its variance $\sigma_u^2$ is also the second order moment.
[ , ] |u| The last assertion comes from the computation of $\int \psi^2\,d\lambda(x) = \sigma^2 + \psi_\emptyset^2$. More precisely, The result follows.

Proof of Corollary 1. Following [18] one knows that Again, starting from Now, applying (7), = |u| a,a ⊆u |u| a,a ⊆u (− ) |u\a|+|u\a | s≥x a ·w a s≥z −a ·w −a t≥x a ·v a t≥z −a ·v − a since $\sum_{a\subseteq u} (-1)^{|u\setminus a|} = 0$ as soon as $u$ is non-empty. As usual, we let $a\Delta b = (a\cup b)\setminus(a\cap b)$ be the symmetric difference of the subsets $a$ and $b$. As a consequence, where the last equality comes from the general fact a⊆u,b⊆u Finally, one obtains

Proof of Lemma 3. By Lemma 2, we may assume that x ≤ z . For s ≤ t both in A d we have where both terms on the right hand side are non-positive. Hence and so $r(t) \le r(s)$.
We now go back to the proof of Proposition 1. By iterating Lemma 3, we deduce that It yields .

Proof of Lemma 1. Recall that $\ell_+(x)$ and $\ell_\vee(x)$ now stand for $\sum_{i=1}^d x_i$ and $\max(x)$ respectively. Set also $\ell_{\vee,u}(x) = \max_{i\in u} x_i$. Stable tail dependence functions have the well-known property To prove (12) recall the equality $\ell(x) = \mu([x,\infty]^c)$. Then, the inclusion leads to the result since $\ell$ is the identity on each axis. Indeed, $\ell$ is homogeneous and equals one at the canonical basis vectors. The inequality (12) is easily transferred to first and second order moments .
From (12), one can also prove that by application of Corollary 1 with $\psi(\mathbf{1}) = 1$, $\nu = \delta_{\mathbf{1}}$, and Example 3. Assume now the second assertion of Theorem 2 and recall that if $\ell$ is a $d$-variate stdf then $\ell_{[u]}$ is a $|u|$-variate stdf. Combining the assumption with what precedes and Proposition 1, we obtain so that $\Upsilon_u(\ell_{[u]}) = \Upsilon_u(\ell_{\vee,u})$. The problem is now the same as the last statement of Theorem 2 for $d = |u|$. The result will thus follow if one can prove it directly. Let $\ell$ be any stdf where the maximal value is attained. Then the inequalities after the inequality (13) are in fact equalities, in particular implying by Lemma 4 (below) that the functions $w \mapsto w_i^{1/d}$, $1 \le i \le d$, are proportional, as are then also $w \mapsto w_i$. Since $\int_C w_i\,d\nu(w) = 1/\ell(\mathbf{1})$ for each $i$, we see that $\nu$-almost surely the components $w_i$ are equal, i.e. $\nu$ is concentrated on the diagonal $\{w \mid w_1 = w_2 = \dots = w_d\}$, and the only $w \in C$ with this property is $w = \mathbf{1}$.
Consequently, $\nu = \delta_{\mathbf{1}}$ and $\ell(\mathbf{1}) = 1$. In other words, $\ell(x) = \max(x_1,\dots,x_d)$.

Proof of Lemma 4. If $f_i = \alpha_i f_1$ for all $i$ ($\alpha_1 = 1$) then both sides in (14) have the same value $\int f_1^n\,d\mu \cdot \prod_{i=1}^n \alpha_i$. Supposing now equality in (14), we proceed by induction. For $n = 2$ the inequality (14) is the Cauchy–Schwarz inequality and it is well known that $f_1$ and $f_2$ are proportional in case of equality. We assume now the validity of our assertion for some $n \ge 2$ and consider $n + 1$ functions $f_1,\dots,f_{n+1}$. Hölder's inequality for two functions $g, h \ge 0$ reads