# About the exact simulation of bivariate (reciprocal) Archimax copulas

• Jan-Frederik Mai
From the journal Dependence Modeling

## Abstract

We provide an exact simulation algorithm for bivariate Archimax copulas, including instances with negative association. In contrast to existing simulation approaches, the feasibility of our algorithm is directly linked to the availability of an exact simulation algorithm for the probability measure described by the derivative of the parameterizing Pickands dependence function. We demonstrate that this hypothesis is satisfied in many cases of interest and, in particular, it is satisfied for piece-wise constant Pickands dependence functions, which can approximate the general case to a given level of desired accuracy. Finally, the algorithm can be leveraged to an exact simulation algorithm for bivariate copulas associated with max-infinitely divisible random vectors whose exponent measure has norm-symmetric survival function, so-called reciprocal Archimax copulas.

MSC 2010: 65C05; 60E05; 62H05

## 1 Introduction

We recall that a bivariate copula is the restriction to $[0,1]^2$ of a joint distribution function of a random vector $(X,Y)$ with the property that both $X$ and $Y$ are uniformly distributed on the interval $[0,1]$, i.e., $F_X(x) \coloneqq P(X \le x) = x$ and $F_Y(y) \coloneqq P(Y \le y) = y$ for $0 \le x, y \le 1$. There exists a lot of research on copulas, since the distribution function of an arbitrary bivariate random vector $(X,Y)$ can always be written as $C(F_X(x), F_Y(y))$ for some copula $C$ due to Sklar's theorem, so that copulas are an analytic tool to separate dependence and marginals. Similarly, the survival function of $(X,Y)$ can always be written as $\hat{C}(1-F_X(x), 1-F_Y(y))$ for some copula $\hat{C}$, called the survival copula. If $F_X, F_Y$ are continuous, the copula and survival copula of $(X,Y)$ are even unique, see [5,17] for popular textbooks on the topic.

The present article deals with the specific family of bivariate Archimax copulas that have been introduced in [1] with the intention to create a large parametric family of copulas with desirable properties in the context of extreme-value theory. For detailed background on dependence properties we also refer to [6], and inference for Archimax copulas is investigated in [3]. Bivariate Archimax copulas are parameterized in terms of a pair $(\varphi, \ell)$ of a continuous and convex function $\varphi \colon [0,\infty) \to [0,1]$ with $\varphi(0) = 1$ and $\lim_{x \to \infty} \varphi(x) = 0$, and a norm $\ell$ on $\mathbb{R}^2$ with normalization $\ell(1,0) = \ell(0,1) = 1$ that is orthant-monotonic, which means that $\ell(x,y) = \ell(|x|,|y|)$. Intuitively, this means that its unit circle is axis-symmetric, so that its unit simplex $\{(x,y) \in [0,1]^2 : \ell(x,y) = 1\}$ lies between the ones corresponding to $\|.\|_1$ and $\|.\|_\infty$. It is well-known from [1,2] that there exists a random vector $(X,Y)$ on the positive orthant $(0,\infty)^2$ whose survival function is given by $P(X > x, Y > y) = \varphi(\ell(x,y))$, $x, y \ge 0$. The unique survival copula of $(X,Y)$ is called Archimax copula, given by

(1) $C_{\varphi,\ell}(x,y) = \varphi\big[\ell(\varphi^{-1}(x),\ \varphi^{-1}(y))\big], \quad 0 \le x, y \le 1.$

In Section 5, we will further provide an exact simulation algorithm for reciprocal Archimax copulas, which have the algebraic form

(2) $C_{\vee,\psi,\ell}(x,y) = \dfrac{x\,y}{C_{e^{-\psi},\ell}(x,y)}, \quad 0 \le x, y \le 1,$

where $\ell$ is the same as above, and $\psi \colon (0,\infty) \to [0,\infty)$ is a convex function with $\lim_{x \to \infty} \psi(x) = 0$ (just like $\varphi$) and $\lim_{x \to 0} \psi(x) = \infty$ (unlike $\varphi$). The nomenclature originates from the algebraic reminiscence with Archimax copulas in (1) and was introduced in [7] for the special case $\ell = \|.\|_1$. These arise as copulas of bivariate max-infinitely divisible random vectors whose exponent measure has an $\ell$-symmetric survival function.

The function $\varphi$ is called Archimedean generator, since the subfamily of copulas of the form $C_{\varphi,\|.\|_1}$ coincides with Archimedean copulas, see [16] for background on the matter. Similarly, $\psi$ will be called a generator for the reciprocal Archimax copula, since the special case $C_{\vee,\psi,\|.\|_1}$ coincides with bivariate reciprocal Archimedean copulas introduced in [7]. The norm $\ell$ is called a stable tail dependence function, and the function $A_\ell(x) \coloneqq \ell(x, 1-x)$ for $x \in [0,1]$ is called Pickands dependence function. (Note that $\ell$ is fully determined by $A_\ell$ via $\ell(x,y) = (x+y)\,A_\ell\!\left(\frac{x}{x+y}\right)$.) The nomenclature originates from the field of multivariate extreme-value theory, since the subfamily of copulas of the form $C_{\varphi(x)=\exp(-x),\,\ell}$ coincides with bivariate extreme-value copulas, see [8] for background on the matter.

Regarding the strength of association between the components $X$ and $Y$, there are two different regimes. If the function $\varphi$ happens to be the Laplace transform of a positive random variable $M$, we have non-negative association, while we can have negative association in the complementary case, when $\varphi$ is a so-called strict Archimedean generator (not a Laplace transform). The case of positive association is significantly better understood, because in this case the survival function $\varphi(\ell(x,y)) = E[\exp\{-M\,\ell(x,y)\}]$ equals that of a scale mixture of a bivariate min-stable exponential distribution, which is a well-investigated concept of positive association, see [9, Chapter 6, p. 169 ff]. In particular, for random vectors with survival function $\exp(-\ell(x,y))$, or equivalently for extreme-value copulas $C_{\varphi(x)=\exp(-x),\,\ell}$, there exist exact simulation algorithms due to [4], which are feasible for many $\ell$. Our intention is to provide an algorithm that is equally general with respect to the choice of $\ell$, but that is not restricted to Laplace transforms $\varphi(x) = E[\exp(-Mx)]$ and works also for strict generators $\varphi$, in particular for the important special case $\varphi(x) = (1-x)_+$.

We recap the existing literature that forms the basis for our derivation. On the one hand, there is the following decomposition result due to [2].

## Theorem 1.1

(Charpentier et al. [2]) Let $\ell$ be an orthant-monotonic norm on $\mathbb{R}^2$ subject to the normalization $\ell(1,0) = \ell(0,1) = 1$, and let $\varphi \colon [0,\infty) \to [0,1]$ be a continuous and convex function with $\varphi(0) = 1$ and $\lim_{x \to \infty} \varphi(x) = 0$. Then there exists a random vector $(X_\ell, Y_\ell)$ with survival function $P(X_\ell > x, Y_\ell > y) = (1 - \ell(x,y))_+$ and an independent positive random variable $R_\varphi$ with the property $E[(1 - x/R_\varphi)_+] = \varphi(x)$, whose distribution is uniquely determined by the function $\varphi$. Furthermore, the random vector $(X,Y) \coloneqq R_\varphi\,(X_\ell, Y_\ell)$ has survival function $P(X > x, Y > y) = \varphi(\ell(x,y))$, $x, y \ge 0$, and survival copula $C_{\varphi,\ell}$. Thus, the random vector $(\varphi(R_\varphi X_\ell), \varphi(R_\varphi Y_\ell))$ has distribution function $C_{\varphi,\ell}$.

In words, Theorem 1.1 states that every Archimax copula arises as a scale mixture of the special Archimax copula with generator $\varphi(x) = (1-x)_+$, which equals the Archimax copula with minimal association for a fixed $\ell$. When simulating $(X,Y)$ with survival copula $C_{\varphi,\ell}$, this allows us to perform two independent subsequent simulation steps: first, simulation of $R_\varphi$ and, second, simulation of $(X_\ell, Y_\ell)$ with survival copula given in terms of $\ell$ by

$C_\ell(x,y) \coloneqq C_{\varphi(x)=(1-x)_+,\,\ell}(x,y) = (1 - \ell(1-x,\ 1-y))_+.$

On the other hand, in the original reference [1] the authors propose an exact simulation algorithm, which we are going to refine. According to Theorem 1.1, we may focus on the simulation of $(X_\ell, Y_\ell)$, which corresponds to the special case $\varphi(x) = (1-x)_+$, and in which the method of [1] can be formulated as in the following theorem.

## Theorem 1.2

(Capéraà et al. [1]) Assume that the second derivative $A_\ell''$ of $A_\ell$ exists and is continuous everywhere on $(0,1)$. Let $U$, $V$ and $Z$ be three independent random variables, where $U, V \sim U[0,1]$ and

$P(Z \le z) = z + z(1-z)\,\dfrac{A_\ell'(z)}{A_\ell(z)}, \quad z \in [0,1].$

Further denote the density of $Z$ by $f_Z$ (it exists by the assumed existence of $A_\ell''$) and define the threshold function

$p(z) \coloneqq \dfrac{z(1-z)\,A_\ell''(z)}{f_Z(z)\,A_\ell(z)}, \quad z \in (0,1).$

Under these assumptions, the random vector $(X_\ell, Y_\ell)$ from Theorem 1.1 admits the stochastic representation

(3) $(X_\ell, Y_\ell) \overset{d}{=} \begin{cases} \left(\dfrac{Z}{A_\ell(Z)},\ \dfrac{1-Z}{A_\ell(Z)}\right), & \text{if } U > p(Z),\\[4pt] V\left(\dfrac{Z}{A_\ell(Z)},\ \dfrac{1-Z}{A_\ell(Z)}\right), & \text{if } U \le p(Z). \end{cases}$

Our proposed algorithm below provides concrete implementation advice for simulation from $C_{\varphi,\ell}$. For the simulation of $C_\ell$ we use an alternative approach to that of Theorem 1.2; in particular, we do not require $A_\ell''$ to exist, but we do make use of the distributional statement (3) in our derivation. The main idea is to use the well-known representation

(4) $A_\ell(x) = 2\,E[\max\{Q\,x,\ (1-Q)(1-x)\}] = 1 + \displaystyle\int_0^x \big(1 - 2\,P(Q \le 1-q)\big)\,\mathrm{d}q,$

which defines a one-to-one relationship between the parameterizing norm $\ell$ and the distribution of a random variable $Q \in [0,1]$ with $2\,E[Q] = 1$. This representation of $\ell$ in terms of $Q$ is called Pickands representation in extreme-value theory. The last equality in (4) shows that the right-hand version of the derivative $A_\ell'$, which is well-defined everywhere by convexity of $A_\ell$ and which we simply denote by $A_\ell'$ throughout, essentially determines the distribution function of $Q$ and, if it exists, the second derivative $A_\ell''$ equals twice the density of $1-Q$. Whereas the analytical treatment of extreme-value copulas in terms of $A_\ell$ is traditional, the article [18] provides some convincing arguments as to why a treatment in terms of the cdf of $Q$ (i.e., essentially $A_\ell'$) can sometimes be more convenient. The appearance of $A_\ell'$ and $A_\ell''$ in (3) above thus hints at the possibility to reformulate the simulation algorithm in terms of $Q$. In fact, we manage to provide a simulation algorithm for $C_\ell$ that is exact and feasible whenever the following prerequisites are met:

## Assumption 1

(Exact evaluation of $\ell$) We can evaluate $\ell(x,y)$ exactly and efficiently for arbitrary $x, y \ge 0$.

## Assumption 2

(Exact evaluation of $A_\ell'$) We can evaluate $A_\ell'(x)$ exactly and efficiently for arbitrary $x \in (0,1)$.

## Assumption 3

(Exact simulation of Q ) The random variable Q can be simulated exactly and efficiently.

Assumptions 1 and 2 are clearly mild; in particular, they are minimal requirements for simulation according to (3), or for the inverse Rosenblatt transformation method via partial derivatives, to work at all. The decisive question the thoughtful reader should raise at this point is: "Why is it useful to re-formulate the known model (3) in terms of $Q$?" To this end, we point out that, due to the aforementioned Pickands representation, the random variable $Q$ plays a central role in extreme-value theory; in particular, the law of $Q$ is feasible very often, as we will demonstrate below in terms of a comprehensive survey of models. For instance, [4] provides two simulation algorithms for extreme-value copulas precisely under Assumption 3. While the algorithms of [4] are restricted to extreme-value copulas but treat the general case of $d$-dimensional extreme-value copulas also for $d > 2$, the algorithm we develop is restricted to the bivariate case only but can be leveraged to an algorithm for arbitrary bivariate Archimax copulas. The $d$-dimensional generalization remains an interesting open problem, to date only solved for the special case $\ell = \|.\|_p$, see [15].

The remainder of the article is organized as follows. Section 2 derives our simulation algorithm, while Section 3 demonstrates it by providing examples. Section 4 discusses some interesting aspects of the particular case when is symmetric, and Section 5 treats the situation of bivariate reciprocal Archimax copulas, in particular indicating that our simulation algorithm for bivariate Archimax copulas immediately implies an exact simulation algorithm for bivariate reciprocal Archimax copulas as well. Section 6 finally concludes.

## 2 Exact simulation from $C_\ell$ in terms of $Q$

We derive an alternative stochastic representation for the random vector $(X_\ell, Y_\ell)$ in Theorem 1.1, which serves as a convenient basis for simulation under our Assumptions 1–3. One problem with the stochastic representation of Theorem 1.2 is that it requires $A_\ell''$ to exist, which is restrictive. Our method will resolve this problem. The other problem is that the probability law of $Z$ is rather complicated to simulate in many cases. Our proposed method replaces the requirement to simulate $Z$ with the requirement to simulate $Q$ from the Pickands representation (4), and we demonstrate that this is feasible in many cases.

For the sake of clarity, we recall that the survival function of $(X_\ell, Y_\ell)$ is given by $(1 - \ell(x,y))_+ = C_\ell(1-x, 1-y)$, and the distribution function of $(X_\ell, Y_\ell)$ equals

(5) $\hat{C}_\ell(x,y) \coloneqq x + y - 1 + C_\ell(1-x,\ 1-y),$

which is itself a copula (the survival copula of $C_\ell$), and $(1-X_\ell, 1-Y_\ell) \sim C_\ell$. Our proposed method to simulate $(X_\ell, Y_\ell)$ requires the exact simulation of an absolutely continuous random variable on $(0,1)$ whose density $h_Q$ depends on a realization of $Q$ from the Pickands representation (4). As a first auxiliary step, the following lemma states the explicit form of this density. It furthermore provides the basis for an exact and efficient simulation from $h_Q$ via the so-called rejection–acceptance sampling method, see [12, p. 234–235] for background. The idea of rejection–acceptance sampling for a (possibly complicated) density $h$ is to find an easy-to-simulate density $g$, called comparison density, and a finite constant $c > 0$ such that $h \le c\,g$. Given this, one may iteratively simulate variates $X_g \sim g$ and independent $U \sim U[0,1]$ until finally $U \le h(X_g)/(c\,g(X_g))$, and then return the resulting sample $X_g$ satisfying this condition as a sample from $h$.
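The rejection–acceptance loop just described can be sketched generically as follows; the toy target density $h(x) = 2x$ with uniform comparison density and $c = 2$ is a hypothetical illustration, not part of the article's setup.

```python
import random

def rejection_sample(h, g, sample_g, c):
    """Draw one variate from density h, given a comparison density g
    with sampler sample_g and a constant c such that h <= c * g."""
    while True:
        x = sample_g()
        if random.random() <= h(x) / (c * g(x)):
            return x

# toy illustration (not from the article): target h(x) = 2x on (0,1),
# uniform comparison density g = 1, so h <= c*g holds with c = 2
random.seed(0)
xs = [rejection_sample(lambda x: 2.0 * x, lambda x: 1.0, random.random, 2.0)
      for _ in range(20000)]
mean = sum(xs) / len(xs)  # E[X] = 2/3 for the target density
```

The acceptance test only evaluates the ratio $h/(c\,g)$, so neither density needs to be normalized beyond the validity of the bound $h \le c\,g$.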

## Lemma 2.1

(Auxiliary density and associated comparison density) Let $\ell$ be an orthant-monotonic norm on $\mathbb{R}^2$ which is normalized to satisfy $\ell(0,1) = \ell(1,0) = 1$, and fix $q \in [0,1]$. With the positive constant

$c_q \coloneqq \begin{cases} 1 - \dfrac{1}{\ell\left(\frac{1}{2q},\ \frac{1}{2(1-q)}\right)}, & \text{if } q \in (0,1),\\[4pt] 1, & \text{if } q \in \{0,1\}, \end{cases}$

the function

$h_q(x) \coloneqq \dfrac{1}{c_q}\begin{cases} 1 - \dfrac{1}{A_\ell(x)}\left(1 - 2x - \dfrac{x(1-x)\,A_\ell'(x)}{A_\ell(x)}\right), & \text{if } x < 1-q,\\[4pt] 1 + \dfrac{1}{A_\ell(x)}\left(1 - 2x - \dfrac{x(1-x)\,A_\ell'(x)}{A_\ell(x)}\right), & \text{if } x \ge 1-q, \end{cases} \qquad x \in (0,1),$

defines a probability density on $(0,1)$. Furthermore, we have that $c_q\,h_q \le d_q\,g_q$, where $g_q$ is a piece-wise polynomial probability density on $(0,1)$ that is given by

(6) $g_q(x) \coloneqq \dfrac{1}{d_q}\left[\mathbf{1}_{\{x < 1-q\}}\,(1+2x) + \mathbf{1}_{\{x \ge 1-q\}}\,(3-2x)\right], \quad x \in (0,1),$

with normalizing constant $d_q \coloneqq 2\,(1 - q(1-q)) > 0$.

## Proof

We observe that the function

$H_q(x) \coloneqq \dfrac{1}{c_q}\begin{cases} x - \dfrac{x(1-x)}{A_\ell(x)}, & \text{if } x < 1-q,\\[4pt] x + \dfrac{x(1-x)}{A_\ell(x)} - \dfrac{2\,q(1-q)}{A_\ell(1-q)}, & \text{if } x \ge 1-q, \end{cases}$

is a continuous distribution function on $[0,1]$ for arbitrary $q \in [0,1]$ with $H_q(0) = 0$ and $H_q(1) = 1$. To verify this, we need to check that $x \mapsto x \pm x(1-x)/A_\ell(x)$ is non-decreasing, which follows from the identity

$\dfrac{\partial}{\partial x}\left(x \pm \dfrac{x(1-x)}{A_\ell(x)}\right) = 1 \pm \dfrac{P(Z > x) - x}{A_\ell(x)},$

and the bounds $P(Z > x) \in [0,1]$ as well as $A_\ell(x) \ge \max\{x, 1-x\}$. We further observe easily with similar estimates, using again the similarity of the definition of $h_q(x)$ and $P(Z \le x)$, that the density $h_q = H_q'$ of $H_q$ is bounded from above by $d_q\,g_q/c_q$ for $g_q$ given in (6).□

A random variable $X_{g_q}$ with density $g_q$ can be simulated via the inversion method, see [12, p. 234–235] for background, since its cumulative distribution function

$G_q(x) \coloneqq P(X_{g_q} \le x) = \displaystyle\int_0^x g_q(y)\,\mathrm{d}y, \quad x \in [0,1],$

can be inverted in closed form according to

$G_q^{-1}(v) = \begin{cases} -\dfrac{1}{2} + \dfrac{1}{2}\sqrt{1 + 8\,(1 - q(1-q))\,v}, & \text{if } v < \dfrac{(1-q)(2-q)}{2\,(1 - q(1-q))},\\[4pt] \dfrac{3}{2} - \dfrac{1}{2}\sqrt{9 - 8\,\big((1 - q(1-q))\,v + q(1-q)\big)}, & \text{else}, \end{cases}$

for $v \in (0,1)$. Consequently, Lemma 2.1 implies that a random variable $X_{h_q}$ with density $h_q$ can be simulated exactly via the rejection–acceptance method under Assumptions 1 and 2. Algorithm 1 summarizes the resulting simulation method for $X_{h_q} \sim h_q$ based on this idea.
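The closed-form inverse can be cross-checked numerically; the following sketch (with the arbitrary illustrative value $q = 0.3$) verifies that $G_q^{-1}$ inverts the piece-wise cdf $G_q$ obtained by integrating (6).

```python
import math

def d(q):
    # normalizing constant d_q = 2 (1 - q(1-q)) of the comparison density
    return 2.0 * (1.0 - q * (1.0 - q))

def G(q, x):
    # cdf of g_q, obtained by integrating (6) piece by piece
    if x < 1.0 - q:
        return (x + x * x) / d(q)
    return (3.0 * x - x * x - 2.0 * q * (1.0 - q)) / d(q)

def G_inv(q, v):
    # the closed-form inverse, with branch point v* = G(q, 1-q)
    v_star = (1.0 - q) * (2.0 - q) / d(q)
    if v < v_star:
        return -0.5 + 0.5 * math.sqrt(1.0 + 8.0 * (1.0 - q * (1.0 - q)) * v)
    return 1.5 - 0.5 * math.sqrt(9.0 - 8.0 * ((1.0 - q * (1.0 - q)) * v + q * (1.0 - q)))

q = 0.3
checks = [abs(G(q, G_inv(q, v)) - v) for v in (0.1, 0.5, 0.9)]
```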

Algorithm 1. Simulation of $X_{h_q} \sim h_q$ via rejection–acceptance
1: procedure SimulateH($q$)
2: $U \leftarrow 1$ and $T \leftarrow 0$
3: while $U > T$ do
4: Simulate $U, V$ uniform on $[0,1]$
5: if $V < \frac{(1-q)(2-q)}{2(1-q(1-q))}$ then
6: $X_{g_q} \leftarrow -\frac{1}{2} + \frac{1}{2}\sqrt{1 + 8(1-q(1-q))\,V}$
7: else
8: $X_{g_q} \leftarrow \frac{3}{2} - \frac{1}{2}\sqrt{9 - 8\big((1-q(1-q))\,V + q(1-q)\big)}$
9: end if
10: ▷ $X_{g_q}$ has comparison density $g_q$ given in (6)
11: if $X_{g_q} < 1-q$ then
12: $T \leftarrow \frac{1}{1+2X_{g_q}}\left[1 - \frac{1}{A_\ell(X_{g_q})}\left(1 - 2X_{g_q} - \frac{X_{g_q}(1-X_{g_q})\,A_\ell'(X_{g_q})}{A_\ell(X_{g_q})}\right)\right]$
13: else
14: $T \leftarrow \frac{1}{3-2X_{g_q}}\left[1 + \frac{1}{A_\ell(X_{g_q})}\left(1 - 2X_{g_q} - \frac{X_{g_q}(1-X_{g_q})\,A_\ell'(X_{g_q})}{A_\ell(X_{g_q})}\right)\right]$
15: end if
16: end while
17: return $X_{h_q} \leftarrow X_{g_q}$
18: end procedure
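A direct Python transcription of Algorithm 1 may look as follows; the callables `A` and `A_prime` stand for $A_\ell$ and its right-hand derivative (Assumptions 1 and 2), and the Gumbel choice $p = 2$ at the bottom is merely an illustrative plug-in, cf. Example 3.4.

```python
import math, random

def simulate_h(q, A, A_prime, rng=random):
    """Algorithm 1: rejection-acceptance sampler for the density h_q of
    Lemma 2.1, with the comparison density g_q simulated by inversion."""
    while True:
        u, v = rng.random(), rng.random()
        dq = 1.0 - q * (1.0 - q)
        if v < (1.0 - q) * (2.0 - q) / (2.0 * dq):
            x = -0.5 + 0.5 * math.sqrt(1.0 + 8.0 * dq * v)
        else:
            x = 1.5 - 0.5 * math.sqrt(9.0 - 8.0 * (dq * v + q * (1.0 - q)))
        # acceptance threshold T = c_q h_q(x) / (d_q g_q(x))
        core = (1.0 - 2.0 * x - x * (1.0 - x) * A_prime(x) / A(x)) / A(x)
        if x < 1.0 - q:
            t = (1.0 - core) / (1.0 + 2.0 * x)
        else:
            t = (1.0 + core) / (3.0 - 2.0 * x)
        if u <= t:
            return x

# illustrative plug-in: Gumbel family with p = 2 (cf. Example 3.4)
p = 2.0
A = lambda x: (x**p + (1.0 - x)**p) ** (1.0 / p)
A_prime = lambda x: (x**p + (1.0 - x)**p) ** (1.0 / p - 1.0) * (x**(p - 1.0) - (1.0 - x)**(p - 1.0))
random.seed(1)
samples = [simulate_h(0.4, A, A_prime) for _ in range(2000)]
```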

Based on the availability of an efficient simulation of $X_{h_q}$ via Algorithm 1, the following theorem is the main result of the present article.

## Theorem 2.2

(Exact simulation of $C_\ell$) Let $\ell$ be an orthant-monotonic norm on $\mathbb{R}^2$ subject to the normalization $\ell(0,1) = \ell(1,0) = 1$. Let $V \sim U[0,1]$ and $Q$ from the Pickands representation (4) of $A_\ell$ be independent random variables. Given $Q$ (and independent of $V$), let $X_{h_Q}$ be a random variable with density $h_Q$, simulated according to Algorithm 1. The random vector $(X_\ell, Y_\ell)$ with survival function $C_\ell(1-x, 1-y)$ and distribution function $\hat{C}_\ell$ has the stochastic representation

(7) $(X_\ell, Y_\ell) \overset{d}{=} \begin{cases} \left(\dfrac{X_{h_Q}}{A_\ell(X_{h_Q})},\ \dfrac{1-X_{h_Q}}{A_\ell(X_{h_Q})}\right), & \text{if } \ell\left(\dfrac{V}{2Q},\ \dfrac{V}{2(1-Q)}\right) \ge 1,\\[6pt] \left(\dfrac{V}{2Q},\ \dfrac{V}{2(1-Q)}\right), & \text{if } \ell\left(\dfrac{V}{2Q},\ \dfrac{V}{2(1-Q)}\right) < 1. \end{cases}$

## Proof

It is sufficient to prove the claim for absolutely continuous $Q$. Why? An arbitrary $Q$ (possibly with atoms) with $E[Q] = 1/2$ may be approximated arbitrarily well in distribution by a sequence $Q_n$ of absolutely continuous random variables satisfying $E[Q_n] = 1/2$. These correspond to a sequence of stable tail dependence functions $\ell_n$ that converge point-wise to $\ell$, and this implies that the respective distributional statement also holds in the limit associated with $Q$. Note that this logic relies on the fact that convergence in distribution $Q_n \to Q$ stands in one-to-one relation with the point-wise convergence $\ell_n \to \ell$, resp. $C_{\ell_n} \to C_\ell$.

Under the assumption of absolute continuity of $Q$, $\ell$ is differentiable and it is clear that $\hat{C}_\ell$ has a singular component on the $\ell$-simplex $\{(x,y) \in [0,1]^2 : \ell(x,y) = 1\}$, has zero mass outside the simplex and has a density inside the simplex. Consequently, we may compute this density of $\hat{C}_\ell$ inside the simplex by taking the partial derivative with respect to both components. To this end, we denote by $f_v(q_1, q_2)$ the density of the random vector $\left(\frac{v}{2Q},\ \frac{v}{2(1-Q)}\right)$ for arbitrary but fixed $v \in (0,1)$, and obtain for $x, y$ with $\ell(x,y) < 1$

$\dfrac{\partial^2}{\partial x\,\partial y}\,\hat{C}_\ell(x,y) = -\dfrac{\partial^2}{\partial x\,\partial y}\,\ell(x,y) = -\dfrac{\partial^2}{\partial x\,\partial y}\displaystyle\int_0^1 P\big(\max\{2Qx,\ 2(1-Q)y\} > v\big)\,\mathrm{d}v = \displaystyle\int_0^1 \dfrac{\partial^2}{\partial x\,\partial y}\,P\!\left(\dfrac{v}{2Q} > x,\ \dfrac{v}{2(1-Q)} > y\right)\mathrm{d}v = \displaystyle\int_0^1 f_v(x,y)\,\mathrm{d}v.$

Denoting by $p_{\hat{C}_\ell}$ the mass that $\mathrm{d}\hat{C}_\ell$ allocates to the $\ell$-simplex, by $\mathrm{d}\hat{C}_\ell^{\,s}$ its singular and by $\mathrm{d}\hat{C}_\ell^{\,c}$ its continuous part, we have the decomposition $\hat{C}_\ell = (1 - p_{\hat{C}_\ell})\,\hat{C}_\ell^{\,c} + p_{\hat{C}_\ell}\,\hat{C}_\ell^{\,s}$. We obtain for $x, y \in [0,1]$ from the above expression

$(1 - p_{\hat{C}_\ell})\,\hat{C}_\ell^{\,c}(x,y) = \displaystyle\int_0^x\!\!\displaystyle\int_0^y \displaystyle\int_0^1 f_v(s,t)\,\mathrm{d}v\ \mathbf{1}_{\{\ell(s,t) < 1\}}\,\mathrm{d}s\,\mathrm{d}t = E\left[\displaystyle\int_0^1 \mathbf{1}_{\left\{\frac{v}{2Q} \le x,\ \frac{v}{2(1-Q)} \le y\right\}}\ \mathbf{1}_{\left\{\ell\left(\frac{v}{2Q},\ \frac{v}{2(1-Q)}\right) < 1\right\}}\,\mathrm{d}v\right] = P\left(\dfrac{V}{2Q} \le x,\ \dfrac{V}{2(1-Q)} \le y,\ \ell\left(\dfrac{V}{2Q},\ \dfrac{V}{2(1-Q)}\right) < 1\right).$

Furthermore, we obtain the probability $1 - p_{\hat{C}_\ell}$ by plugging in $x = y = 1$, which yields

$1 - p_{\hat{C}_\ell} = P\left(V < \min\left\{2Q,\ 2(1-Q),\ \dfrac{1}{\ell\left(\frac{1}{2Q},\ \frac{1}{2(1-Q)}\right)}\right\}\right) = P\left(V < \dfrac{1}{\ell\left(\frac{1}{2Q},\ \frac{1}{2(1-Q)}\right)}\right) = P\left(\ell\left(\dfrac{V}{2Q},\ \dfrac{V}{2(1-Q)}\right) < 1\right) = E\left[\dfrac{1}{\ell\left(\frac{1}{2Q},\ \frac{1}{2(1-Q)}\right)}\right].$

Note that the last equality is stated only for later reference, and implicitly these computations rely on the fact that $\min\{2Q,\ 2(1-Q)\} \ge 1\big/\ell\left(\frac{1}{2Q},\ \frac{1}{2(1-Q)}\right)$ almost surely. This already proves the first part of our claimed statement.

For the second part, from Theorem 1.2 we obtain that

$\hat{C}_\ell^{\,s}(x,y) = P\left(\dfrac{Z}{\ell(Z,\ 1-Z)} \le x,\ \dfrac{1-Z}{\ell(Z,\ 1-Z)} \le y \,\Big|\, U > p(Z)\right).$

Using the well-known identity $A_\ell'(z) = 1 - 2\,P(Q \le 1-z)$ from (4) and the definitions of $p$ and $P(Z \le z)$ in Theorem 1.2, we find

$E[1 - p(Z)] = E\Bigg[\underbrace{1 - \dfrac{1}{\ell\left(\frac{1}{2Q},\ \frac{1}{2(1-Q)}\right)}}_{=\,c_Q,\ \text{see Lemma 2.1}}\Bigg] = P\left(\ell\left(\dfrac{V}{2Q},\ \dfrac{V}{2(1-Q)}\right) > 1\right) = p_{\hat{C}_\ell},$

as well as

$p_{\hat{C}_\ell}\ P(Z \le z \mid U > p(Z)) = E\left[z + \dfrac{z(1-z)}{A_\ell(z)} - \begin{cases} \dfrac{1}{\ell\left(\frac{1}{2(1-z)},\ \frac{1}{2z}\right)}, & \text{if } z \le 1-Q,\\[4pt] \dfrac{1}{\ell\left(\frac{1}{2Q},\ \frac{1}{2(1-Q)}\right)}, & \text{if } z > 1-Q. \end{cases}\right]$

In the last expression we recognize the distribution function $H_Q$ from the proof of Lemma 2.1, so that this finally implies

$P(Z \le z \mid U > p(Z)) = E\left[\dfrac{c_Q}{p_{\hat{C}_\ell}}\,H_Q(z)\right] = P\left(X_{h_Q} \le z \,\Big|\, \ell\left(\dfrac{V}{2Q},\ \dfrac{V}{2(1-Q)}\right) > 1\right),$

which establishes the claimed statement, when comparing with (3).□
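Combining Algorithm 1 with the representation (7) yields a complete sampler for $C_\ell$. The sketch below assumes callables for $\ell$, $A_\ell$, $A_\ell'$ and a $Q$-sampler; the Cuadras-Augé family with $\theta = 1/2$ (for which $Q$ takes the values $0$, $1/2$, $1$) serves as an illustrative test case, exploiting that both margins of $(X_\ell, Y_\ell)$ are uniform on $[0,1]$.

```python
import math, random

def simulate_h(q, A, A_prime, rng=random):
    # Algorithm 1: rejection-acceptance sampler for the density h_q
    while True:
        u, v = rng.random(), rng.random()
        dq = 1.0 - q * (1.0 - q)
        if v < (1.0 - q) * (2.0 - q) / (2.0 * dq):
            x = -0.5 + 0.5 * math.sqrt(1.0 + 8.0 * dq * v)
        else:
            x = 1.5 - 0.5 * math.sqrt(9.0 - 8.0 * (dq * v + q * (1.0 - q)))
        core = (1.0 - 2.0 * x - x * (1.0 - x) * A_prime(x) / A(x)) / A(x)
        t = (1.0 - core) / (1.0 + 2.0 * x) if x < 1.0 - q else (1.0 + core) / (3.0 - 2.0 * x)
        if u <= t:
            return x

def simulate_C_ell(ell, A, A_prime, sample_Q, rng=random):
    # representation (7); boundary atoms Q in {0,1} send V/(2Q) resp.
    # V/(2(1-Q)) to +infinity, hence always to the simplex branch
    q, v = sample_Q(), rng.random()
    a = v / (2.0 * q) if q > 0.0 else math.inf
    b = v / (2.0 * (1.0 - q)) if q < 1.0 else math.inf
    if ell(a, b) < 1.0:
        return a, b
    x = simulate_h(q, A, A_prime, rng)
    return x / A(x), (1.0 - x) / A(x)

# illustrative test case: Cuadras-Auge family with theta = 1/2
theta = 0.5
ell = lambda x, y: max(x, y) + (1.0 - theta) * min(x, y)
A = lambda x: ell(x, 1.0 - x)
A_prime = lambda x: -theta if x < 0.5 else theta

def sample_Q():
    u = random.random()
    if u < (1.0 - theta) / 2.0:
        return 0.0
    if u < 1.0 - theta:
        return 1.0
    return 0.5

random.seed(2)
pts = [simulate_C_ell(ell, A, A_prime, sample_Q) for _ in range(4000)]
mean_x = sum(px for px, py in pts) / len(pts)  # first margin is U[0,1]
```

Every returned point lies inside or on the $\ell$-simplex, mirroring the decomposition of $\hat{C}_\ell$ into its continuous and singular parts.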

As mentioned in Section 1, the simulation of an arbitrary bivariate Archimax copula $C_{\varphi,\ell}$ requires simulation from $C_\ell$, which we have now established, and, independently, simulation of a random radius $R_\varphi > 0$, whose probability distribution is determined uniquely by the Williamson-2-transform $\varphi(x) = E[(1 - x/R_\varphi)_+]$. We first note that the distribution function of $R_\varphi$ is given by the Williamson inversion formula

(8) $P(R_\varphi \le x) = 1 - \varphi(x) + x\,\varphi'(x),$

where $\varphi'$ is meant to be the right-hand version of the (by convexity almost everywhere existing) derivative of $\varphi$. Regarding examples for feasible $R_\varphi$, we refer to [16]. Generally speaking, rejection–acceptance sampling of $R_\varphi$ based on (8) is feasible whenever $\varphi''$ exists and is analytically feasible and if the (then existing) density

$f_{R_\varphi}(x) = x\,\varphi''(x)$

can be bounded from above by a scalar multiple of an easy-to-simulate density. In contrast, we end this section by providing a simulation algorithm for the special case

(9) $\varphi(x) = \left(1 - \displaystyle\int_{(0,\infty]} (1 - e^{-sx})\,\nu(\mathrm{d}s)\right)_{\!+},$

where $(1 - e^{-s})\,\nu(\mathrm{d}s)$ is a probability measure on $(0,\infty]$, under the following assumption:

## Assumption 4

(Requirement to simulate $R_\varphi$ with generator (9)) The probability measure $\rho_\nu(\mathrm{d}s) \coloneqq (1 - e^{-s})\,\nu(\mathrm{d}s)$ can be simulated exactly.

A generator of the form (9) always implies $R_\varphi \le 1$. There are examples of generators $\varphi$ of that form which are not analytically tractable at first glimpse, in particular for which it is not straightforward to apply direct acceptance–rejection sampling based on the density $f_{R_\varphi}$; an example is provided below.

The function

$\Psi_\nu(x) \coloneqq \displaystyle\int_{(0,\infty]} (1 - e^{-xs})\,\nu(\mathrm{d}s), \quad x \ge 0,$

is a so-called Bernstein function satisfying $\Psi_\nu(1) = 1$; in particular, it is concave, so that $\varphi(x) \coloneqq (1 - \Psi_\nu(x))_+$ equals the Williamson-2-transform of some random variable taking values in $(0,1]$, i.e., $\varphi$ is a proper Archimedean generator. Every $R_\varphi$ that arises in this way can be simulated under Assumption 4. Why? We observe that

$P(R_\varphi \le x) = \Psi_\nu(x) - x\,\Psi_\nu'(x) = \displaystyle\int_{(0,\infty]} \underbrace{\dfrac{1 - e^{-xs}\,(1 + xs)}{1 - e^{-s}}}_{\eqqcolon\ G_s(x)}\ \rho_\nu(\mathrm{d}s).$

For fixed $s \in (0,\infty]$, the function $G_s \colon [0,1] \to [0,1]$ is the distribution function of a random variable $R_s$ on $[0,1]$. Furthermore, simulation of $R_s \sim G_s$ is efficient and exact via acceptance–rejection, since $R_s$ takes values in $(0,1]$, has an atom at $1$ with probability $s\,e^{-s}/(1 - e^{-s})$ and a density $G_s'$ on $(0,1)$ that is bounded from above by $x \mapsto x\,s^2/(1 - e^{-s})$. Summing up, in order to simulate $R_\varphi$, the following steps need to be carried out:

• Simulate $S \sim \rho_\nu$, using Assumption 4.

• Simulate $B \sim \mathrm{Bernoulli}\big(S\,e^{-S}/(1 - e^{-S})\big)$.

• If $B = 1$, return $R_\varphi = 1$.

• Otherwise, repeatedly simulate $U, V \sim U[0,1]$ independent, until finally $U \le e^{-\sqrt{V}\,S}$, and then return $R_\varphi = \sqrt{V}$; here $\sqrt{V}$ is a sample from the comparison density $g(x) = 2x$ obtained by inversion.
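In code, with the degenerate toy choice $S \equiv 1$ standing in for a genuine $\rho_\nu$-sampler (Assumption 4), the four steps read:

```python
import math, random

def simulate_R(sample_S, rng=random):
    # R_phi for a generator of the form (9), given a sampler for rho_nu
    s = sample_S()
    # atom of G_s at 1
    if rng.random() < s * math.exp(-s) / (1.0 - math.exp(-s)):
        return 1.0
    # rejection from the comparison density 2x (sampled as sqrt(V));
    # acceptance e^{-x s} targets the density proportional to x e^{-x s}
    while True:
        u, x = rng.random(), math.sqrt(rng.random())
        if u <= math.exp(-x * s):
            return x

random.seed(7)
draws = [simulate_R(lambda: 1.0) for _ in range(20000)]
# for S = 1: E[R] = e^{-1}/(1 - e^{-1}) + (2 - 5/e)/(1 - 1/e) ~ 0.836
m = sum(draws) / len(draws)
```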

There exist some examples of closed-form analytic functions $\Psi_\nu$ for which $\Psi_\nu''$ (hence the resulting density $f_{R_\varphi}$) is nasty to deal with, but for which Assumption 4 is still satisfied. For instance, we may extract the following example from [13, Example 3.3]. With parameters $\alpha \in [0,1)$ and $\theta \ge 0$ we consider the Bernstein function

$\Psi_\nu(x) = x\,\dfrac{\Gamma(x + \theta)\,\Gamma(2 + \theta - \alpha)}{\Gamma(1 + \theta)\,\Gamma(x + \theta + 1 - \alpha)}, \quad x \ge 0,$

which corresponds to the measure

$\nu(\mathrm{d}s) = \dfrac{\Gamma(2 + \theta - \alpha)}{\Gamma(\theta + 1)\,\Gamma(1 - \alpha)}\ \dfrac{e^{-\theta s}}{(1 - e^{-s})^{\alpha + 1}}\,\big(\alpha\,e^{-s} + \theta\,(1 - e^{-s})\big)\,\mathrm{d}s.$

The associated random variable $S$ with probability distribution $\rho_\nu$ has density given by $(1-\alpha)\,g_{\theta,\,2-\alpha} + \alpha\,g_{\theta+1,\,1-\alpha}$, where $g_{a,b}$ denotes the density of $-\log(B_{a,b})$ and $B_{a,b}$ has a Beta distribution with density proportional to $x \mapsto x^{a-1}(1-x)^{b-1}$. The simulation of $S$ is thus straightforward as a mixture of (negative) logarithms of Beta distributions, see [12, p. 243] for advice on how to simulate $B_{a,b}$. Apparently, computing the second derivative of $\Psi_\nu$ and developing a rejection–acceptance sampling method for $R_\varphi$ in this example appears to be rather inconvenient, whereas the presented method to simulate $R_\varphi$ and its associated Archimax copulas only requires the evaluation of $\Psi_\nu$ (in order to transform margins to uniforms), which is easy.
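Simulating $S$ for this example is indeed straightforward; a minimal sketch (assuming $\theta > 0$, so that both Beta parameters are positive, and using Python's built-in Beta sampler):

```python
import math, random

def simulate_S(alpha, theta, rng=random):
    # mixture (1-alpha) * g_{theta, 2-alpha} + alpha * g_{theta+1, 1-alpha},
    # where g_{a,b} is the density of -log(Beta(a,b)); requires theta > 0
    if rng.random() < 1.0 - alpha:
        a, b = theta, 2.0 - alpha
    else:
        a, b = theta + 1.0, 1.0 - alpha
    return -math.log(rng.betavariate(a, b))

random.seed(8)
draws = [simulate_S(0.3, 1.5) for _ in range(1000)]
```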

## 3 Examples for feasible ℓ

We review some popular families of stable tail dependence functions $\ell$, for which the simulation of the associated $Q$ is viable, thereby demonstrating that our algorithm is feasible in many interesting cases.

### 3.1 Q corresponding to conditionally iid extreme-value copulas

Many (but not all, see [14]) parametric families of symmetric stable tail dependence functions can be considered jointly under one common umbrella, by observing that they are of the particular form

$\ell(x,y) = \ell_F(x,y) \coloneqq \displaystyle\int_0^\infty \left(1 - F\!\left(\frac{u}{x}\right) F\!\left(\frac{u}{y}\right)\right)\mathrm{d}u,$

where $F \colon [0,\infty) \to [0,1]$ equals the distribution function of a non-negative random variable $X_F$ with unit mean. In probabilistic terms, this means that

$\ell_F(x,y) = E[\max\{x\,X_{F,1},\ y\,X_{F,2}\}], \quad x, y \ge 0,$

where $X_{F,1}, X_{F,2} \sim F$ are iid random variables. These particular stable tail dependence functions are studied in [10], where it is shown that a random vector $(X,Y)$ with survival function $\exp(-\ell_F(x,y))$ is conditionally iid. The probability distribution of $Q = Q_F$ in this case can be simulated by

(10) $Q \overset{d}{=} \mathbf{1}_{\{B=0\}}\ \dfrac{Y_F}{Y_F + X_{F,2}} + \mathbf{1}_{\{B=1\}}\ \dfrac{X_{F,1}}{X_{F,1} + Y_F},$

where $X_{F,1}, X_{F,2} \sim F$, $B \sim \mathrm{Bernoulli}(1/2)$ and $Y_F \sim x\,\mathrm{d}F(x)$ are independent random variables. For many popular models, the distribution $F$ as well as its so-called size bias $Y_F$ can be simulated efficiently. We provide some examples below. Note in particular that for both of the following two examples, the Hüsler-Reiss example and the Galambos example, simulation algorithms based on partial derivatives require numerical root-search algorithms, while the presented algorithm is exact.
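Formula (10) translates directly into code; the samplers below for $F$ and its size bias use the Fréchet specification with $p = 2$ (cf. the Gumbel example below) purely as an illustration.

```python
import math, random

def simulate_Q(sample_F, sample_sizebias, rng=random):
    # formula (10): mixture over a fair Bernoulli B
    x1, x2, y = sample_F(), sample_F(), sample_sizebias()
    if rng.random() < 0.5:       # B = 0
        return y / (y + x2)
    return x1 / (x1 + y)         # B = 1

# illustration: Frechet F with p = 2 (cf. the Gumbel example below)
p = 2.0
c = math.gamma(1.0 - 1.0 / p)    # Gamma(1/2) = sqrt(pi), unit-mean scaling
sample_F = lambda: (-math.log(random.random())) ** (-1.0 / p) / c
sample_Y = lambda: random.gammavariate(1.0 - 1.0 / p, 1.0) ** (-1.0 / p) / c

random.seed(9)
qs = [simulate_Q(sample_F, sample_Y) for _ in range(20000)]
m = sum(qs) / len(qs)  # the Pickands representation forces E[Q] = 1/2
```

Note that $E[Q] = 1/2$ holds for (10) by construction: conditionally on the Bernoulli flip, the two fractions sum to one in expectation.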

### Example 3.1

(Hüsler-Reiss family) With a parameter $\theta > 0$ we consider the norm

$\ell(x,y) = y\,\Phi\left(\theta + \dfrac{1}{2\theta}\log\dfrac{y}{x}\right) + x\,\Phi\left(\theta + \dfrac{1}{2\theta}\log\dfrac{x}{y}\right), \quad x, y \ge 0,$

where $\Phi$ denotes the cdf of a standard normally distributed random variable. This is of the form $\ell_F$ with $F(x) = \Phi\big[(\log(x) + \theta^2)/(\sqrt{2}\,\theta)\big]$ a log-normal distribution. The random variables $X_F \sim F$ and $Y_F \sim x\,\mathrm{d}F(x)$ can be simulated according to

$X_F \overset{d}{=} e^{-\theta^2 + \sqrt{2}\,\theta\,W}, \qquad Y_F \overset{d}{=} e^{\theta^2 + \sqrt{2}\,\theta\,W}, \qquad W \sim \Phi.$

The required derivative of $A_\ell$ is available in closed form, given by

$A_\ell'(x) = \Phi\left(\theta + \dfrac{1}{2\theta}\log\dfrac{x}{1-x}\right) - \Phi\left(\theta + \dfrac{1}{2\theta}\log\dfrac{1-x}{x}\right) + \phi\left(\theta + \dfrac{1}{2\theta}\log\dfrac{x}{1-x}\right)\dfrac{1}{2\theta(1-x)} - \phi\left(\theta + \dfrac{1}{2\theta}\log\dfrac{1-x}{x}\right)\dfrac{1}{2\theta x},$

where $\phi(x) = \Phi'(x) = \exp(-x^2/2)/\sqrt{2\pi}$ equals the density of $W$.
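A small numerical sanity check of the log-normal specification as reconstructed above: $X_F$ has unit mean, and $E[\max\{X_{F,1}, X_{F,2}\}] = \ell(1,1) = 2\,\Phi(\theta)$.

```python
import math, random

theta = 0.8
sigma = math.sqrt(2.0) * theta
Phi = lambda x: 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))

def sample_XF(rng=random):
    # X_F = exp(-theta^2 + sqrt(2) theta W): log-normal with unit mean
    return math.exp(-theta**2 + sigma * rng.gauss(0.0, 1.0))

def sample_YF(rng=random):
    # size bias: the log-normal drift shifts from -theta^2 to +theta^2
    return math.exp(theta**2 + sigma * rng.gauss(0.0, 1.0))

random.seed(10)
n = 40000
m1 = sum(sample_XF() for _ in range(n)) / n                    # E[X_F] = 1
m2 = sum(max(sample_XF(), sample_XF()) for _ in range(n)) / n  # l(1,1) = 2 Phi(theta)
```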

### Example 3.2

(Galambos family) With a parameter $\theta > 0$ we consider the norm

$\ell(x,y) = x + y - \left(x^{-\theta} + y^{-\theta}\right)^{-\frac{1}{\theta}}, \quad x, y \ge 0.$

This is of the form $\ell_F$ with $F(x) = 1 - \exp\{-[\Gamma(1 + 1/\theta)\,x]^{\theta}\}$ a Weibull distribution. (Note that in [10, Example 2] there is a typo, mistaking $\theta$ for $1/\theta$.) The random variable $X_F \sim F$ can be simulated exactly via the inversion method and $Y_F \sim x\,\mathrm{d}F(x)$ according to

$Y_F \overset{d}{=} \dfrac{G^{1/\theta}}{\Gamma(1 + 1/\theta)}, \qquad G \sim \Gamma(1 + 1/\theta,\ 1),$

where $\Gamma(\beta, \eta)$ denotes the Gamma distribution with density proportional to $x \mapsto e^{-\eta x}\,x^{\beta - 1}$. The required derivative of $A_\ell$ is available in closed form, given by

$A_\ell'(x) = \left(x^{-\theta} + (1-x)^{-\theta}\right)^{-\frac{1}{\theta}-1}\left((1-x)^{-\theta-1} - x^{-\theta-1}\right).$
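The closed-form derivative can be cross-checked against a central finite difference of $A_\ell(x) = \ell(x, 1-x)$; the parameter $\theta = 1.7$ and the evaluation point are arbitrary illustrative choices.

```python
theta = 1.7

def A(x):
    # A(x) = l(x, 1-x) for the Galambos norm
    return 1.0 - (x ** -theta + (1.0 - x) ** -theta) ** (-1.0 / theta)

def A_prime(x):
    # the closed-form derivative stated above
    return (x ** -theta + (1.0 - x) ** -theta) ** (-1.0 / theta - 1.0) * \
           ((1.0 - x) ** (-theta - 1.0) - x ** (-theta - 1.0))

x, h = 0.37, 1e-6
num = (A(x + h) - A(x - h)) / (2.0 * h)   # central finite difference
```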

### Example 3.3

(Cuadras-Augé family) With a parameter $\theta \in (0,1]$ we consider the norm

$\ell(x,y) = \max\{x,y\} + (1-\theta)\,\min\{x,y\} = \theta\,\|(x,y)\|_\infty + (1-\theta)\,\|(x,y)\|_1.$

This is of the form $\ell_F$ with $F(x) = 1 - \theta + \theta\,\mathbf{1}_{\{\theta x \ge 1\}}$, meaning that $P(X_F = 0) = 1 - \theta = 1 - P(X_F = 1/\theta)$. Furthermore, the associated size bias $Y_F = 1/\theta$ equals a constant in this case. The required derivative $A_\ell'$ is piece-wise constant, taking the value $-\theta$ on $(0, 1/2)$ and the value $\theta$ on $[1/2, 1)$. The case of general piece-wise constant $A_\ell'$ is further discussed in Section 3.2.

### Example 3.4

(Gumbel family, $\ell = \|.\|_p$) When $\ell(x,y) = \|(x,y)\|_p = (x^p + y^p)^{1/p}$ for some $p > 1$, then $\ell = \ell_F$ with associated Fréchet distribution function $F(x) = \exp\{-[\Gamma(1 - 1/p)\,x]^{-p}\}$. In particular, $X_F \sim F$ may be simulated exactly via the inversion method, and the size bias $Y_F$ may be simulated according to

$Y_F \overset{d}{=} \dfrac{G^{-1/p}}{\Gamma(1 - 1/p)}, \qquad G \sim \Gamma(1 - 1/p,\ 1),$

where $\Gamma(\beta, \eta)$ denotes the Gamma distribution with density proportional to $x \mapsto e^{-\eta x}\,x^{\beta - 1}$. The required derivative $A_\ell'$ is known in closed form, to wit

$A_\ell'(x) = \left(x^p + (1-x)^p\right)^{\frac{1}{p}-1}\left(x^{p-1} - (1-x)^{p-1}\right).$

In [13,11] it is further explained how to simulate Q associated with convex mixtures of stable tail dependence functions of the type F , and how this can be used to construct interesting and tractable new families. The following is an example in this spirit.

### Example 3.5

(Holmgren-Liouville family) Let $S$ be some random variable taking values in $(0,1)$. Then the function

$\Psi(x) = \dfrac{1 - E[(1-S)^x]}{E[S]}, \quad x \ge 0,$

is a so-called Bernstein function. Assume further that $S$ is either absolutely continuous or discrete. In the absolutely continuous case we denote by $f_S$ its density, and in the discrete case we denote by $f_S$ the function $f_S(z) = P(S = z)$, thus treating both the absolutely continuous case and the discrete case jointly with our notation. Similarly, we consider a random variable $Z$ with associated function

$f_Z(z) = \dfrac{1}{E\left[1 + \frac{S}{\log(1-S)}\right]}\left(\dfrac{1}{z} + \dfrac{1}{\log(1-z)}\right) z\,f_S(z), \quad z \in (0,1].$

This function is bounded from above by

$f_Z(z) \le \dfrac{2\,f_S(z)}{E[S]}, \quad z \in (0,1].$

Consequently, we can simulate $Z$ exactly via acceptance–rejection, whenever we can simulate $S$, see [12, p. 235–237]. Now we consider

$\ell(x,y) = 2\,\Psi(2)\,\dfrac{xy}{x+y} - \left(\dfrac{2xy}{x+y} - \max\{x,y\}\right)\Psi\!\left(1 - \dfrac{\min\{x,y\}}{\max\{x,y\}}\right),$

which is an orthant-monotonic norm with $\ell(1,0) = \ell(0,1) = 1$, hence a stable tail dependence function. This norm arises as a convex probability mixture of norms of the form $\ell_{F^z}$ with $F(x) = \min\{\exp(x-1),\ 1\}$, when randomizing the power $z$, as demonstrated in [13, Example 3.6]. The associated random variable $Q$ can be simulated as follows:

• Simulate $Z$ according to the aforementioned rejection–acceptance method (requires simulation of $S$).

• Simulate $Y_F$ from the continuous density

$f_{Y_F}(x) = \dfrac{Z^2\,e^{Z}}{e^{Z} - Z - 1}\ x\,e^{-Zx}, \quad x \in (0,1),$

which can be done via acceptance–rejection with comparison density $g(x) = 2x$.

• Simulate $X_{F,1}, X_{F,2}$ iid from the distribution function $F(x) = \min\{\exp(Zx - Z),\ 1\}$, which is straightforward by the inversion method.

• Apply formula (10) in order to simulate $Q$ from the simulated $X_{F,1}, X_{F,2}, Y_F$ and a Bernoulli variable $B$.
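The four steps can be sketched as follows for an absolutely continuous $S \sim \mathrm{Beta}(a,b)$ (a toy choice, not prescribed by the article); note that $Z$ can be accepted with probability $1 + z/\log(1-z) \in (0,1)$ directly, since $f_Z$ is exactly this factor times $f_S$ up to a constant.

```python
import math, random

def simulate_Q_HL(a, b, rng=random):
    """Q for the Holmgren-Liouville family with S ~ Beta(a, b),
    following the four simulation steps above."""
    # step 1: Z via rejection from f_S, acceptance probability 1 + z/log(1-z)
    while True:
        z = rng.betavariate(a, b)
        if rng.random() <= 1.0 + z / math.log(1.0 - z):
            break
    # step 2: Y_F via rejection with comparison density 2x, sampled as sqrt(V)
    while True:
        x = math.sqrt(rng.random())
        if rng.random() <= math.exp(-z * x):
            y = x
            break
    # step 3: X_{F,1}, X_{F,2} by inversion of F(x) = min(exp(Z x - Z), 1)
    def inv():
        u = rng.random()
        return 0.0 if u <= math.exp(-z) else 1.0 + math.log(u) / z
    x1, x2 = inv(), inv()
    # step 4: formula (10) with a fair Bernoulli B
    if rng.random() < 0.5:
        return y / (y + x2)
    return x1 / (x1 + y)

random.seed(6)
qs = [simulate_Q_HL(2.0, 3.0) for _ in range(20000)]
m = sum(qs) / len(qs)  # the Pickands representation forces E[Q] = 1/2
```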

In order to make this example feasible, the choice of $S$ requires availability of $\Psi$ in closed form, as well as an exact simulation method for $S$. Examples that work certainly include finite-valued $S$, as well as $S$ with a Beta distribution whose density is proportional to $x \mapsto x^{a-1}(1-x)^{b-1}$ for parameters $a, b > 0$. In the latter case, we observe that

(11) $\Psi(x) = \dfrac{a+b}{a}\left(1 - \dfrac{\beta(a,\ b+x)}{\beta(a,\ b)}\right),$

where $\beta(.\,,.)$ denotes the Beta function. This example originates from a random vector $(X,Y)$ with survival function $\exp(-\ell(x,y))$ that is defined by $(X,Y) = (L^{-1}_{\epsilon_X},\ L^{-1}_{\epsilon_Y})$, where $\epsilon_X, \epsilon_Y$ are iid standard exponential random variables and, independently, $L = \{L_t\}_{t \ge 0}$ is a so-called Holmgren-Liouville subordinator. The latter is a non-decreasing stochastic process with long-range dependence properties. In particular, $(X,Y)$ are iid conditioned on $L$. For the sake of completeness, we point out that $A_\ell'$ is explicitly given by

$A_\ell'(x) = 2\,\Psi(2)\,(1-2x) + \big(1 - 4\max\{x, 1-x\}\big)\,\Psi\!\left(1 - \dfrac{\min\{x,1-x\}}{\max\{x,1-x\}}\right) + \dfrac{1 - 2\max\{x,1-x\}}{\max\{x,1-x\}}\,\Psi'\!\left(1 - \dfrac{\min\{x,1-x\}}{\max\{x,1-x\}}\right)$

for $x \le 1/2$; the values for $x > 1/2$ follow from the symmetry $A_\ell'(x) = -A_\ell'(1-x)$.

In the particular special case with S having a beta distribution, the derivative Ψ of  Ψ in (11) is given in terms of the Digamma function D Γ ( x ) = Γ ( x ) / Γ ( x ) , which is pre-implemented in most software packages just like the Gamma function Γ ( . ) and Beta function β ( . , . ) , to wit

Ψ′(x) = ((a + b)/b) (β(a, b + x)/β(a, b)) (D_Γ(a + b + x) − D_Γ(b + x)).
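In the beta case, both (11) and the digamma expression for Ψ′ can thus be evaluated with standard special functions; a minimal sketch (the helper names are ours):

```python
from scipy.special import beta as B, digamma

def Psi(x, a, b):
    # Equation (11): Psi(x) = ((a+b)/b) * (1 - B(a, b+x)/B(a, b))
    return (a + b) / b * (1.0 - B(a, b + x) / B(a, b))

def Psi_prime(x, a, b):
    # Derivative of (11) in terms of the digamma function D_Gamma
    return (a + b) / b * (B(a, b + x) / B(a, b)) * (digamma(a + b + x) - digamma(b + x))
```

A central finite difference of `Psi` reproduces `Psi_prime`, which is a convenient sanity check of the closed form.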

### 3.2 Finite-valued Q , resp. piece-wise constant A_ℓ′

The Pickands dependence function A_ℓ is piece-wise linear if and only if the associated random variable Q is discrete. We may approximate an arbitrary A_ℓ (from above, due to convexity) by the function that takes the identical values on the grid points i/n for n ≥ 1 and i = 0, …, n and is piece-wise linear in between these grid points. The latter approximation corresponds to some finite-valued Q_n. Since the corresponding approximation A_{ℓ,n} converges point-wise to A_ℓ by construction, Q_n converges in distribution to Q, and also the associated copulas C_{ℓ,n} approximate C point-wise. From this perspective, the case of finite-valued Q is of utmost importance.

Regarding the implementation, we recommend to apply the strategy proposed in [18], that is to work on the level of A_ℓ. To this end, we consider a Pickands dependence function A_{ℓ,n} whose derivative is piece-wise constant between given values v_0 ≡ 0 < v_1 < ⋯ < v_n ≡ 1, with n ≥ 1 arbitrary but fixed. According to the methodology in [18], we parameterize this function in terms of the vectors v = (v_0, …, v_n) and a = (a_1, …, a_n) subject to the conditions

−1 ≤ a_1 ≤ a_2 ≤ ⋯ ≤ a_n ≤ 1, ∑_{i=1}^{n} a_i (v_i − v_{i−1}) = 0.

Intuitively, we define A_{ℓ,n} via its piece-wise constant derivative A_{ℓ,n}′ with A_{ℓ,n}′(x) = a_i for v_{i−1} ≤ x < v_i, i = 1, …, n. With i(x) ≔ max{i ∈ {0, …, n} : v_i ≤ x}, the function A_{ℓ,n} is given by

A_{ℓ,n}(x) = 1 + (x − v_{i(x)}) a_{i(x)+1} + ∑_{j=1}^{i(x)} a_j (v_j − v_{j−1}).

With the convenient notation a_0 ≡ −1, the associated random variable Q_n has distribution function P(Q_n ≤ q) = (1 − a_{i(1−q)})/2, where here i(x) denotes the smallest index i ∈ {0, …, n} with v_i ≥ x. The probability mass function is given by

P(Q_n = 0) = (1 − a_n)/2, P(Q_n = 1) = (1 + a_1)/2, P(Q_n = 1 − v_i) = (a_{i+1} − a_i)/2, i = 1, …, n − 1.
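The discrete law above is trivial to tabulate and sample; a minimal sketch, where the helper names are ours and the chosen parameters (v, a) are illustrative values satisfying the constraints (in particular ∑ a_i (v_i − v_{i−1}) = 0):

```python
import numpy as np

def q_atoms(v, a):
    # Atoms and weights of the finite-valued Q_n for parameters (v, a):
    # P(Q_n = 0) = (1 - a_n)/2, P(Q_n = 1) = (1 + a_1)/2,
    # P(Q_n = 1 - v_i) = (a_{i+1} - a_i)/2 for i = 1, ..., n-1.
    n = len(a)
    vals = np.array([0.0, 1.0] + [1.0 - v[i] for i in range(1, n)])
    probs = np.array([(1.0 - a[-1]) / 2.0, (1.0 + a[0]) / 2.0]
                     + [(a[i] - a[i - 1]) / 2.0 for i in range(1, n)])
    return vals, probs

def sample_Q(v, a, rng):
    # Exact simulation of Q_n by table lookup
    vals, probs = q_atoms(v, a)
    return rng.choice(vals, p=probs)
```

Since the a_i are non-decreasing and contained in [−1, 1], all weights are non-negative and sum to one.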

An arbitrary A_ℓ may now be approximated by a piece-wise linear one of the above form A_{ℓ,n} with v = (0, 1/n, …, (n − 1)/n, 1) by computing the parameters a_1, …, a_n iteratively from the relation

A_ℓ(i/n) = 1 + (1/n) ∑_{j=1}^{i} a_j, i = 1, …, n,

meaning that

a_1 = n (A_ℓ(1/n) − 1), a_i = n (A_ℓ(i/n) − 1 − (1/n) ∑_{j=1}^{i−1} a_j), i ≥ 2.
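The recursion can be sketched as follows. As a stand-in for the Hüsler-Reiss family used in Figure 1 we take the logistic (Gumbel) Pickands function A(t) = (t^θ + (1 − t)^θ)^{1/θ}, which is our illustrative choice here because it has a simple closed form; the helper names are ours.

```python
import numpy as np

def pickands_logistic(t, theta=2.0):
    # Logistic (Gumbel) Pickands dependence function; convex on [0,1]
    # with A(0) = A(1) = 1 and slopes in [-1, 1].
    return (t**theta + (1.0 - t)**theta) ** (1.0 / theta)

def slopes_from_A(A, n):
    # Iteratively recover a_1, ..., a_n from the grid values A(i/n):
    # a_i = n*(A(i/n) - 1) - sum_{j<i} a_j.
    a = []
    for i in range(1, n + 1):
        a.append(n * (A(i / n) - 1.0) - sum(a))
    return np.array(a)
```

By construction the a_i are the chord slopes of A between consecutive grid points, so convexity of A guarantees that they are non-decreasing, bounded by 1 in absolute value, and average to zero.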

Since exact simulation of the finite-valued Q_n is possible, as is the evaluation of A_{ℓ,n} and A_{ℓ,n}′, our algorithm is feasible. Figure 1 visualizes points simulated from Ĉ_{ℓ,n} when ℓ is from the Hüsler-Reiss family in Example 3.1.

Figure 1

Scatter plot of 2,500 samples from Ĉ with ℓ given by the Hüsler-Reiss family with parameter θ = 1/4 (gray points), as well as from its approximation Ĉ_{ℓ,n} (black points) with n = 10 (left plot) and n = 80 (right plot).

## 4 The specific case of symmetric norms

From the proof of Theorem 2.2 we recall the decomposition (5) of the survival copula Ĉ of the Archimax copula C into the part on the ℓ-simplex and the part inside the ℓ-simplex. To wit, the distribution function Ĉ of the random vector (X, Y) in Theorem 2.2 admits the decomposition

Ĉ = (1 − p_{Ĉ}) Ĉ_c + p_{Ĉ} Ĉ_s,

where p_{Ĉ} = P(ℓ(X, Y) = 1) denotes the probability mass on the ℓ-simplex, Ĉ_s denotes the cdf of a singular distribution concentrated on the ℓ-simplex, and Ĉ_c denotes the cdf of a probability distribution concentrated inside the ℓ-simplex. Note that Ĉ_c is absolutely continuous only if the associated random variable Q from the Pickands representation (4) of ℓ is absolutely continuous, as assumed in the proof of Theorem 2.2. In the general case, it has singular parts concentrated on straight lines corresponding to atoms of Q, as can be seen in Figure 1.

In the special case of a symmetric ℓ, that is if ℓ(x, y) = ℓ(y, x) for arbitrary x, y, there is one phenomenon that we find interesting enough to discuss in the present context. The distribution function Ĉ_s equals the distribution function of a counter-monotonic random vector, meaning that

Ĉ_s(x, y) = (G_{1,ℓ}(x) + G_{2,ℓ}(y) − 1)^+, x, y ∈ [0, 1],

with marginal distribution functions G_{1,ℓ} and G_{2,ℓ}, respectively. A stochastic representation for (X_{s,ℓ}, Y_{s,ℓ}) ∼ Ĉ_s is hence well-known to be given by (G_{1,ℓ}^{−1}(W), G_{2,ℓ}^{−1}(1 − W)) for W ∼ U[0, 1]. Since we know that this random vector is concentrated on the ℓ-unit simplex, the norm ℓ determines the implicit relationship between G_{1,ℓ} and G_{2,ℓ} via the equation

ℓ(G_{1,ℓ}^{−1}(w), G_{2,ℓ}^{−1}(1 − w)) = 1 for all w ∈ [0, 1],

so that G_{1,ℓ} determines G_{2,ℓ} and vice versa. Now if ℓ is symmetric, it follows from exchangeability of Ĉ_s that G_{1,ℓ} = G_{2,ℓ} ≕ G_ℓ. The distribution function G_ℓ is then uniquely and implicitly determined by ℓ via the equation

(12) ℓ(G_ℓ^{−1}(w), G_ℓ^{−1}(1 − w)) = 1 for all w ∈ [0, 1].

We furthermore see that

ℓ(x, y) = 1 ⟺ there exists w ∈ [0, 1] with (x, y) = (G_ℓ^{−1}(w), G_ℓ^{−1}(1 − w)) ⟺ G_ℓ(x) + G_ℓ(y) = 1,

which provides an alternative way to describe the unit simplex of ℓ in terms of the distribution function G_ℓ. Summarizing, we obtain a correspondence between symmetric ℓ and convex and continuous distribution functions G on [0, 1]. (An arbitrary distribution function on [0, 1] that is convex is always continuous on [0, 1), but in general may have a jump at 1, which we exclude by saying that G is continuous on all of [0, 1].)
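As a concrete illustration of this correspondence and of the counter-monotonic representation: the distribution function G(x) = x² is continuous and convex on [0, 1] (an illustrative choice of ours), its unit simplex {G(x) + G(y) = 1} is the Euclidean unit circle arc, and the singular part can be sampled exactly.

```python
import numpy as np

# G(x) = x^2 is a continuous and convex distribution function on [0,1];
# its unit simplex G(x) + G(y) = 1 is the arc x^2 + y^2 = 1, so this G
# corresponds to the Euclidean norm. The inverse is G^{-1}(w) = sqrt(w).
rng = np.random.default_rng(7)
W = rng.uniform(size=1000)
X_s, Y_s = np.sqrt(W), np.sqrt(1.0 - W)  # (G^{-1}(W), G^{-1}(1 - W))
```

Every simulated pair lies exactly on the Euclidean unit circle, confirming that the counter-monotonic construction concentrates all mass on the unit simplex of the corresponding norm.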

## Lemma 4.1

(Correspondence between symmetric ℓ and G_ℓ) For a symmetric stable tail dependence function ℓ, the distribution function G_ℓ is continuous and convex on [0, 1]. Conversely, any continuous and convex distribution function G on [0, 1] implicitly defines a symmetric stable tail dependence function ℓ = ℓ_G, which is uniquely determined by its unit simplex {(x, y) ∈ [0, 1]² : G(x) + G(y) = 1}.

## Proof

Continuity of G_ℓ is immediate, since a jump would imply a non-empty interval (a, b) ⊂ (0, 1) on which G_ℓ^{−1} would be constant, contradicting the fact that w ↦ (G_ℓ^{−1}(w), G_ℓ^{−1}(1 − w)) is the closed curve that determines the ℓ-unit simplex. To verify convexity, assume the contrary, that is, assume G_ℓ is not convex. Then we necessarily find two distinct x, y ∈ [0, 1] and α ∈ (0, 1) such that

G ( α x + ( 1 α