Probability of ruin in discrete insurance risk model with dependent Pareto claims

Abstract We present basic properties and discuss potential insurance applications of a new class of probability distributions on positive integers with power law tails. The distributions in this class are zero-inflated discrete counterparts of the Pareto distribution. In particular, we obtain the probability of ruin in the compound binomial risk model where the claims are zero-inflated discrete Pareto distributed and correlated by mixture.


Introduction
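Discrete heavy-tailed distributions are an important and active area in non-life insurance research and practice (see, e.g., [4,5,21,29]). It is well known that Pareto and Weibull distributions are used in insurance practice for modelling claim sizes. However, their theoretical implementation in collective risk models is nontrivial. We consider the compound binomial risk model introduced in [9],
$$U_t = u + t - \sum_{i=1}^{t} X_i, \quad t \in \mathbb{N}, \qquad U_0 = u, \tag{1}$$
where u ∈ N_0 is the initial capital, one unit of premium is collected per period, and X_i is the claim amount in period i. The probability of ruin,
$$\psi(u) = P(U_t < 0 \ \text{for some } t > 0 \mid U_0 = u), \tag{2}$$
admits an explicit form when the claim amounts {X_i} have a zero-modified geometric (ZMG) distribution ZMG(q, ρ). The latter is given by the probability mass function (PMF) P(X_i = k) = g(k), where
$$g(k) = q\,\delta_{k0} + (1-q)(1-\delta_{k0})\,\rho\,(1-\rho)^{k-1}, \quad k \in \mathbb{N}_0, \tag{3}$$
with q, ρ ∈ (0, 1), and δ_{kj} is the Kronecker delta function. In this case we have
$$\psi(u) = \min\left\{\frac{1-q}{\rho}\left(\frac{1-\rho}{q}\right)^{u+1},\ 1\right\}, \tag{4}$$
see [34].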
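As a small numerical illustration of the ZMG set-up, the sketch below evaluates the ZMG(q, ρ) PMF and the explicit ruin probability. It assumes the form ψ(u) = min{((1 − q)/ρ)((1 − ρ)/q)^(u+1), 1}, which agrees with the net profit condition ρ ≥ 1 − q discussed in Section 2; the function names `zmg_pmf` and `zmg_ruin` are illustrative only.

```python
def zmg_pmf(k: int, q: float, rho: float) -> float:
    """PMF of ZMG(q, rho): mass q at zero, geometric on {1, 2, ...} otherwise."""
    if k < 0:
        return 0.0
    if k == 0:
        return q
    return (1.0 - q) * rho * (1.0 - rho) ** (k - 1)


def zmg_ruin(u: int, q: float, rho: float) -> float:
    """Ruin probability for IID ZMG(q, rho) claims and initial capital u
    (assumed closed form; equals 1 when the net profit condition fails)."""
    value = (1.0 - q) / rho * ((1.0 - rho) / q) ** (u + 1)
    return min(value, 1.0)


if __name__ == "__main__":
    for u in range(4):
        print(u, zmg_ruin(u, q=0.6, rho=0.8))
```

With q = 0.6 and ρ = 0.8 the expected claim is (1 − q)/ρ = 0.5 < 1, so the ruin probability decays geometrically in u.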
In [8] the authors extended the formula (4) by using a mixing approach as in [1] and [6], and assuming that given Θ = θ, where Θ is a "mixing" random variable on R+, the claim amounts {X_i} are independent, identically distributed (IID) zero-modified geometric ZMG(q, ρ) with the success probability
$$\rho = e^{-\theta}. \tag{5}$$
In this set-up, [8] derived the probability of ruin (2) for three particular cases:
(i) For Θ having exponential distribution with parameter λ, given by the probability density function (PDF)
$$f_\Theta(\theta) = \lambda e^{-\lambda\theta}, \quad \theta > 0, \tag{6}$$
in which case the claim amounts have a zero-modified Yule distribution.
(ii) For Θ having gamma distribution with shape parameter α > 0 and scale parameter λ > 0 (entering the density as a rate), given by the PDF
$$f_\Theta(\theta) = \frac{\lambda^{\alpha}}{\Gamma(\alpha)}\,\theta^{\alpha-1} e^{-\lambda\theta}, \quad \theta > 0. \tag{7}$$
In this case the claim amounts have an explicit PMF (see [8]) and the probability of ruin can be expressed in terms of the incomplete gamma function.
(iii) For Θ having positive stable distribution with index 1/2 (Lévy distribution), given by the PDF (8). In this case the claim amounts again have an explicit PMF (see [8]) and the ruin probability can be expressed in terms of the complementary error function.
The purpose of this note is two-fold. First, we point out that in the above set-up with discrete claims correlated by mixture and, conditionally on Θ = θ, having ZMG(q, ρ) distribution, it is more convenient to assume that
$$\rho = 1 - e^{-\theta} \tag{9}$$
rather than (5) as in [8]. Thus, while in the set-up above the geometric probability of success is taken as e^{−θ}, we use this expression for the probability of failure. Let us note that a geometric distribution with the probability of success given by (9) is a discrete version of an exponential one, since the geometric PMF can be derived as the difference of two consecutive exponential tails with parameter θ:
$$\rho(1-\rho)^{k-1} = e^{-\theta(k-1)} - e^{-\theta k}, \quad k \in \mathbb{N}.$$
As shown below, this modification of the approach leads to convenient formulas for the probability of ruin as well as for the tail probabilities (which were considered in Section 4.2 of [8]). As in [8], the mixing variable Θ will still be taken as exponential, gamma, or positive stable. However, with this choice of Θ, the resulting distributions of the claim amounts are generally quite different from those obtained by [8]:
(i) For Θ having exponential distribution with parameter λ, the claim amounts are zero-modified discrete Pareto (10) with tail index α = 1, which is different from the Yule distribution (unless Θ is standard exponential with λ = 1).
(ii) For Θ having gamma distribution with shape parameter α > 0 and scale parameter λ > 0, the claim amounts are zero-modified discrete Pareto (10).
(iii) For Θ having the Lévy stable distribution (8), the claim amounts have a zero-modified discrete Weibull distribution.

This brings us to the second motivation for this paper, which is the introduction of new classes of discrete probability models resulting from this mixing scheme. Namely, as shown in the sequel, when Θ is gamma distributed with the PDF (7) and, given Θ = θ, the claim amounts are IID ZMG(q, ρ) with ρ as in (9), the PMF of the unconditional distribution of the claim amount X becomes
$$P(X = k) = q\,\delta_{k0} + (1-q)(1-\delta_{k0})\left[\left(\frac{\lambda}{\lambda+k-1}\right)^{\alpha} - \left(\frac{\lambda}{\lambda+k}\right)^{\alpha}\right], \quad k \in \mathbb{N}_0. \tag{10}$$
We obtain a mixture of a point mass at zero with probability q and a heavy-tailed discrete Pareto (DP) distribution of [3], given by the PMF
$$P(N = k) = \left(\frac{\lambda}{\lambda+k-1}\right)^{\alpha} - \left(\frac{\lambda}{\lambda+k}\right)^{\alpha}, \quad k \in \mathbb{N}, \tag{11}$$
with probability 1 − q. Similarly, when Θ has a positive stable distribution with index α ∈ (0, 1), given by the Laplace transform (LT)
$$\varphi_\Theta(t) = E e^{-t\Theta} = e^{-t^{\alpha}}, \quad t \in \mathbb{R}_+, \tag{12}$$
and, given Θ = θ, the claim amounts are IID ZMG(q, ρ) with ρ as in (9), then the PMF of the claim amount X becomes
$$P(X = k) = q\,\delta_{k0} + (1-q)(1-\delta_{k0})\left[e^{-(k-1)^{\alpha}} - e^{-k^{\alpha}}\right], \quad k \in \mathbb{N}_0. \tag{13}$$
We again obtain a mixture, this time involving a discrete version of the Weibull distribution with parameter α ∈ (0, 1).

Let us note that the theory and applications of such zero-modified discrete distributions form an important area in distribution theory, with applications in manufacturing (see, e.g., [20]), econometrics (see, e.g., [24]), economics (see, e.g., [2,16,31]), and accident analysis (see, e.g., [22,30]), among others. Such modifications, also known as zero-adjusted, zero-altered, or zero-inflated discrete distributions, have been developed for many standard discrete distributions to account for disproportionately large (or small) frequencies of zeroes observed in empirical data, compared with the standard models (see, e.g., [17], pp. 312-318). Popular models of this type include those based upon the Poisson distribution (see, e.g., [10,11,13,14,23-26,32]), generalized Poisson distribution (see, e.g., [12]), binomial distribution (see, e.g., [13]), geometric and negative binomial distributions (see, e.g., [2,11,14-16,23,31]), and logarithmic distribution (see, e.g., [18,27]).
In the ruin theory literature, the binomial risk model has been developed in different directions (see, e.g., [7,28,35,36]). Our new, zero-modified discrete Pareto and Weibull distributions may provide a useful addition to an actuary's statistical toolbox, going beyond modeling claim amounts of discrete type. We note that this mixing approach introduces a dependence structure that produces tractable results in a few instances that we analyze in this paper. Specifically, starting from classical ruin theory results for independent light-tailed claims, we explore heavy-tailed scenarios with conditionally independent claims. In fact, the zero-modified DP model with the PMF (10) may be a useful heavy-tailed model for the frequency of claims as well, as it can be extended to a continuous-time, discrete-valued stochastic process in the spirit of the classical Poisson process due to its fundamental property of infinite divisibility, established in the sequel.
The rest of the paper is organized as follows. In Section 2 we derive the probability of ruin in the above set-up within a compound binomial risk model with mixed zero-modified geometric claims, including the case where the claims are conditionally independent, zero-modified discrete Pareto. We illustrate our theory with a concrete example based on real data from an insurance-reinsurance company. In turn, in Section 3, we focus on the zero-modified discrete Pareto model, which provided the best fit to the data. Here, we present basic information on this new stochastic model and develop its important properties, which should provide a useful reference for actuaries and others who use discrete stochastic models in their work.

Compound binomial risk model with mixed zero-modified geometric claims
Consider again the compound binomial risk model (1) where, given Θ = θ, the {X_i} have ZMG distribution given by the PMF (3) with the success probability as in (9). To see why the latter condition is more convenient than the one given by (5), we first derive the PMF of the claim amount X. Let F_Θ be the cumulative distribution function (CDF) of the mixing variable Θ and let f_Θ be the corresponding PDF (if it exists). Clearly, P(X = 0) = q, while for k ≥ 1, we have:
$$P(X = k) = (1-q)\int_0^{\infty} \left(1 - e^{-\theta}\right) e^{-\theta(k-1)}\, dF_\Theta(\theta) = (1-q)\left[\varphi_\Theta(k-1) - \varphi_\Theta(k)\right],$$
where φ_Θ is the Laplace transform (LT) of the variable Θ. This leads to a convenient, general formula for the PMF of X:
$$P(X = k) = q\,\delta_{k0} + (1-q)(1-\delta_{k0})\left[\varphi_\Theta(k-1) - \varphi_\Theta(k)\right], \quad k \in \mathbb{N}_0. \tag{14}$$
Note that when Θ has a gamma distribution with the PDF (7), then the LT is given by
$$\varphi_\Theta(t) = \left(\frac{\lambda}{\lambda+t}\right)^{\alpha}, \quad t \in \mathbb{R}_+, \tag{15}$$
and the PMF of the claim amount X turns into that of the zero-modified discrete Pareto (ZMP) distribution, given by (10). Similarly, when Θ is positive stable with the LT (12), the claim amounts become conditionally independent zero-modified discrete Weibull (13). When comparing the ZMP and the ZMG models (see Figures 1 and 2), we notice that, for the same expected claim, the PMFs of both models take the same value q at zero; however, the PMF drops faster under the ZMG model, reflecting the heavier tail of the ZMP distribution.
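The general recipe above (a point mass q at zero plus differences of the Laplace transform of Θ at consecutive integers) can be sketched in a few lines. The sketch assumes the gamma LT (λ/(λ + t))^α and the standard positive stable LT exp(−t^α); the helper names are hypothetical and not taken from the paper.

```python
import math
from typing import Callable


def mixed_zmg_pmf(k: int, q: float, lt: Callable[[float], float]) -> float:
    """P(X = k) when, given Theta = theta, X is ZMG(q, 1 - exp(-theta)).

    `lt` is the Laplace transform of Theta (assumed to satisfy lt(0) = 1).
    """
    if k < 0:
        return 0.0
    if k == 0:
        return q
    return (1.0 - q) * (lt(k - 1) - lt(k))


def gamma_lt(alpha: float, lam: float) -> Callable[[float], float]:
    """LT of a gamma variable with shape alpha and rate lam (assumption)."""
    return lambda t: (lam / (lam + t)) ** alpha


def stable_lt(alpha: float) -> Callable[[float], float]:
    """LT of a standard positive stable variable with index alpha in (0, 1)."""
    return lambda t: math.exp(-(t ** alpha))


if __name__ == "__main__":
    zmp = [mixed_zmg_pmf(k, 0.3, gamma_lt(2.0, 1.5)) for k in range(6)]
    zmw = [mixed_zmg_pmf(k, 0.3, stable_lt(0.5)) for k in range(6)]
    print("zero-modified discrete Pareto :", zmp)
    print("zero-modified discrete Weibull:", zmw)
```

Because the sum telescopes, the probabilities up to K add to q + (1 − q)(1 − φ(K)), which gives a quick sanity check for any candidate mixing distribution.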
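Similar calculations show that the CDF of the claim distribution in our set-up is given by
$$F_X(x) = P(X \le x) = 1 - (1-q)\,\varphi_\Theta(\lfloor x \rfloor), \quad x \ge 0,$$
while the survival probability becomes
$$P(X > x) = (1-q)\,\varphi_\Theta(\lfloor x \rfloor), \quad x \ge 0,$$
where ⌊x⌋ denotes the integer part of x (the floor function). When Θ is either gamma distributed with the PDF (7) or is positive stable with the LT (12), then the tail probabilities take on particularly simple forms, given by
$$P(X > x) = (1-q)\left(\frac{\lambda}{\lambda+\lfloor x \rfloor}\right)^{\alpha} \quad \text{and} \quad P(X > x) = (1-q)\, e^{-\lfloor x \rfloor^{\alpha}},$$
respectively. The above formulas should be contrasted with the rather inconvenient integral that appears in the first paragraph of Section 4.2 in [8].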
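The tail formula for the gamma-mixed (zero-modified discrete Pareto) case can be checked against the PMF directly. The sketch below assumes the survival function (1 − q)(λ/(λ + ⌊x⌋))^α for x ≥ 0; `zmdp_sf` and `zmdp_cdf` are illustrative names.

```python
import math


def zmdp_sf(x: float, alpha: float, lam: float, q: float) -> float:
    """Survival function P(X > x) of the zero-modified discrete Pareto model."""
    if x < 0:
        return 1.0
    return (1.0 - q) * (lam / (lam + math.floor(x))) ** alpha


def zmdp_cdf(x: float, alpha: float, lam: float, q: float) -> float:
    """CDF P(X <= x), obtained as the complement of the survival function."""
    return 1.0 - zmdp_sf(x, alpha, lam, q)


if __name__ == "__main__":
    alpha, lam, q = 2.0, 1.5, 0.3
    # P(X = k) recovered as a difference of consecutive tail probabilities.
    for k in range(1, 5):
        print(k, zmdp_sf(k - 1, alpha, lam, q) - zmdp_sf(k, alpha, lam, q))
```

The survival function is a step function: it only changes at integer arguments, which is why the floor appears in the formula.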

2.1. The probability of ruin
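Let us now derive the probability of ruin under our set-up. First, let us note that the probability of ruin in (4) becomes
$$\psi(u) = \frac{1-q}{\rho}\left(\frac{1-\rho}{q}\right)^{u+1}$$
if and only if ρ ≥ 1 − q (the net profit condition). To see this, observe that the above holds if and only if
$$\frac{1-q}{\rho}\left(\frac{1-\rho}{q}\right)^{u+1} \le 1. \tag{16}$$
Consider the function h(ρ) = (1 − ρ)^{u+1}/ρ, ρ ∈ (0, 1). Since
$$h'(\rho) = -\frac{(1-\rho)^{u}\,(1+u\rho)}{\rho^{2}} < 0,$$
the function h is decreasing on the interval (0, 1), and h(1 − q) = q^{u+1}/(1 − q), so (16) is equivalent to ρ ≥ 1 − q as desired. Now, if we set 1 − ρ = e^{−θ}, then the net profit condition becomes θ > θ*, where θ* = − log q ∈ (0, ∞). Then, analogously to (10) in [8], the probability of ruin can be written as
$$\psi(u) = F_\Theta(\theta^*) + J(u, \theta^*), \tag{17}$$
where
$$J(u, \theta^*) = \int_{\theta^*}^{\infty} \frac{1-q}{1-e^{-\theta}}\left(\frac{e^{-\theta}}{q}\right)^{u+1} dF_\Theta(\theta). \tag{18}$$
One can obtain a compact formula for the above probability in terms of a geometric random variable N ∼ Geo(p), given by the PMF
$$P(N = n) = p\,(1-p)^{n-1}, \quad n \in \mathbb{N}, \tag{19}$$
and the probability generating function (PGF)
$$G_N(s) = E s^{N} = \frac{ps}{1-(1-p)s}, \tag{20}$$
and the excess random variable
$$\Theta^* \stackrel{d}{=} \Theta - \theta^* \mid \Theta > \theta^*. \tag{21}$$
If Θ is absolutely continuous, then the PDF of the latter is
$$f_{\Theta^*}(x) = \frac{f_\Theta(x+\theta^*)}{1-F_\Theta(\theta^*)}, \quad x > 0. \tag{22}$$
The following result provides relevant details.

Proposition 2.1. Let Θ be a positive random variable with the CDF F_Θ and suppose that, given Θ = θ, the variables {X_i} in (1) are IID zero-modified geometric ZMG(q, ρ) with the PMF (3) and ρ = 1 − e^{−θ}. Then, the probability of ruin is given by
$$\psi(u) = F_\Theta(\theta^*) + \left[1 - F_\Theta(\theta^*)\right] E\, e^{-(N+u)\Theta^*}, \tag{23}$$
where θ* = − log q, Θ* is the excess random variable given by the PDF (22), and N is a geometric random variable (19) with parameter p = 1 − q, independent of Θ*.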
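Proof. Let us work with the quantity J(u, θ*) given by (18). We have
$$J(u, \theta^*) = \left[1 - F_\Theta(\theta^*)\right] \int_{\theta^*}^{\infty} \frac{1-q}{1-e^{-\theta}}\left(\frac{e^{-\theta}}{q}\right)^{u+1} \frac{f_\Theta(\theta)}{1-F_\Theta(\theta^*)}\, d\theta. \tag{24}$$
Note that q = e^{−θ*}, so that e^{−θ}/q = e^{−(θ−θ*)}. Upon the substitution x = θ − θ* in (24) we obtain
$$J(u, \theta^*) = \left[1 - F_\Theta(\theta^*)\right] \int_{0}^{\infty} \frac{(1-q)e^{-x}}{1-qe^{-x}}\, e^{-ux}\, f_{\Theta^*}(x)\, dx. \tag{25}$$
We now recognize the term (1 − q)e^{−x}/(1 − qe^{−x}) under the integral in (25) as the PGF of a geometric variable N with the PMF (19) and p = 1 − q, evaluated at s = e^{−x} (so this is actually the LT of N), so that we can write the above integral as
$$\int_{0}^{\infty} E\left(e^{-xN}\right) e^{-ux} f_{\Theta^*}(x)\, dx = E\, e^{-(N+u)\Theta^*},$$
as desired. This completes the proof.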
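The integral representation used in the proof can be evaluated numerically. The following sketch does this for gamma mixing, assuming the density λ^α θ^(α−1) e^(−λθ)/Γ(α) and a plain composite Simpson rule on a truncated range; `simpson`, `gamma_pdf` and `ruin_prob_gamma` are hypothetical helper names, and the truncation points are ad hoc choices suitable for moderate α.

```python
import math


def simpson(f, a: float, b: float, n: int = 20000) -> float:
    """Composite Simpson rule with n (even) subintervals on [a, b]."""
    if n % 2:
        n += 1
    h = (b - a) / n
    total = f(a) + f(b)
    for i in range(1, n):
        total += (4 if i % 2 else 2) * f(a + i * h)
    return total * h / 3.0


def gamma_pdf(theta: float, alpha: float, lam: float) -> float:
    """Gamma density with shape alpha and rate lam (assumed parameterization)."""
    return lam ** alpha * theta ** (alpha - 1) * math.exp(-lam * theta) / math.gamma(alpha)


def ruin_prob_gamma(u: int, q: float, alpha: float, lam: float) -> float:
    """psi(u) = F(theta*) + int_0^inf G_N(e^-x) e^(-u x) f(x + theta*) dx."""
    theta_star = -math.log(q)
    # P(Theta > theta*), integrating the density beyond theta* (truncated tail).
    surv = simpson(lambda x: gamma_pdf(x + theta_star, alpha, lam), 0.0, 60.0 / lam)

    def integrand(x: float) -> float:
        pgf = (1.0 - q) * math.exp(-x) / (1.0 - q * math.exp(-x))  # PGF of Geo(1 - q)
        return pgf * math.exp(-u * x) * gamma_pdf(x + theta_star, alpha, lam)

    j = simpson(integrand, 0.0, 60.0 / (u + 1 + lam))
    return (1.0 - surv) + j


if __name__ == "__main__":
    for u in (0, 1, 5, 20):
        print(u, ruin_prob_gamma(u, q=0.6, alpha=2.0, lam=2.0))
```

As u grows the integral term vanishes and the output approaches F(θ*), the probability that the net profit condition fails.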
Routine calculations lead to the following result, describing the special case with gamma-distributed Θ and zero-modified discrete Pareto (10) correlated claim amounts. Note that the probability of ruin given below involves the (upper) incomplete gamma function, as it does in an analogous problem considered by [8].
Corollary 2.1. Let Θ have a gamma distribution with the PDF (7) and suppose that, given Θ = θ, the variables {X_i} in (1) are IID zero-modified geometric ZMG(q, ρ) with the PMF (3) and ρ = 1 − e^{−θ}. Then, the probability of ruin ψ(u) follows from Proposition 2.1 and can be expressed in terms of the upper incomplete gamma function Γ(·, ·).

Below we present a special case with exponential mixing distribution, where the probability of ruin may take on an explicit form.

Corollary 2.2. Let Θ have an exponential distribution with parameter λ > 0 and suppose that, given Θ = θ, the variables {X_i} in (1) are IID zero-modified geometric ZMG(q, ρ) with the PMF (3) and ρ = 1 − e^{−θ}. Then, if λ ∈ N, the probability of ruin admits an explicit form.

Remark 2.1. Figure 3 shows a comparison of the ruin probabilities under two different settings with conditional ZMG claims, where, given Θ = θ, the geometric probability of success is given either by (5), as in [8], or by (9), as proposed in this paper. Moreover, in each case Θ has a gamma distribution (7), with parameters α_i chosen so that the expected geometric probabilities of success coincide, E(ρ_1) = E(ρ_2). As can be seen in Figure 3, the ruin probability curves under our model drop faster than those under the model of [8]. Note that the value of the parameter λ affects the position of the ruin probability curves. In addition, the expression for the ruin probability given by [8] is valid only for integer initial capital.

Remark 2.2. As can be seen from the ruin probability formula in the ZMP case, the probability of ruin converges to a non-zero level as u → ∞, namely the probability that the net profit condition is violated. Therefore, in the ZMP model the ruin probability is more stable for large u compared with its behavior under the ZMG model. Furthermore, the rate of convergence can vary with the parameters, as can be seen in the example given in Table 2, computed for the parameter sets 1-4 provided in Table 1 below. When comparing Set 1 with Set 2, and Set 2 with Set 3, one can notice that larger λ and smaller α lead to a larger probability of ruin and faster convergence (convergence being declared once the difference in ruin probabilities between u = n and u = n + 1 falls below a fixed small threshold). In other words, larger λ and lower α flatten the ruin probability. According to Set 4, one can see that as the probability q of no claims increases, the ruin probability decreases. Moreover, from a certain initial capital u onward, the probability has already converged to the level at which the net profit condition is violated. We also notice that the decrease in ψ(u) over the range of u reported in Table 2 is larger for Set 4 than for Set 1. Thus, the larger the q, the lower the ruin probability, the steeper the decrease, and the slower the convergence.

The result below provides the ruin probability for the special case where Θ is Lévy stable with index α = 1/2 and PDF (8), in which case we have conditionally independent zero-modified discrete Weibull (ZMW) claim amounts, with the PMF (13) and α = 1/2. As in the analogous problem considered by [8], the probability of ruin can be expressed in terms of the complementary error function.

Proposition 2.2. Let Θ have a Lévy stable distribution with the PDF (8) and suppose that, given Θ = θ, the variables {X_i} in (1) are IID zero-modified geometric ZMG(q, ρ) with the PMF (3) and ρ = 1 − e^{−θ}. Then, the probability of ruin ψ(u) admits an expression in terms of Γ(·, ·) and erfc(·), the upper incomplete gamma function and the complementary error function given by (26) and (27), respectively.
Proof. Let θ* = − log q. The result follows by inserting the PDF of Θ given by (8) into the formula of Proposition 2.1 and evaluating the resulting integral through a change of variables.
Remark 2.3. Let L = F_Θ(θ*) be the level at which the net profit condition is violated. In Figure 5, one can set up the same level L of ψ(u) as u → ∞ for both the zero-modified Pareto and Weibull models (denoted, respectively, by ZMP and ZMW). From Figure 5, one can see that the ruin probability curve is steeper under the ZMP model and that it starts from a higher initial ruin probability ψ(0). Table 3 below shows how the ruin probability curve decreases, at a given level L, when the value of τ (the parameter in the ZMW model) is increased. This can be observed by increasing the expectation of the claims. Additionally, a smaller τ corresponds to a larger ruin probability and faster convergence to the level L.

2.2. Illustrative data example
As an illustration, we fit the three zero-modified models, ZMG, ZMP and ZMW, to data from a non-life reinsurance company. The data were skewed and scaled for confidentiality reasons. The claims data span a period of 11 years, with claims recorded on a monthly basis. The zero and the non-zero frequencies are shown in Table 4 below. Zero claims refer to accidents for which the company paid nothing, due to deductibles or other contract considerations. The model frequency q of zero claims is estimated by the corresponding sample frequency q̂. The parameters of all three models are estimated by the method of moments, and are provided in Table 5 below. Figure 6 illustrates the ruin probabilities under the three models.

Table 4: Number of zero claims, non-zero claims, and total claims.
Remark 2.4. Note that while fitting the data, we keep the same net profit condition, meaning the same θ* in (23). In Figure 5, the levels of convergence F(θ*) are different due to the different distributions F.

To measure the goodness-of-fit, we use P-P plots and the sum of the squared errors (SSE), shown in Figure 7 and Table 5, respectively. Based on the results, ZMW and ZMP present a much better fit than ZMG. Furthermore, our data analysis leads to the same conclusion as that provided by our theoretical results. Namely, while the ZMG model has the largest ruin probability when u = 0, it decays very quickly as the initial capital increases. As for the ZMP and ZMW models, the ruin probability under the ZMP model is always larger than that under the ZMW model.

A zero-modified discrete Pareto distribution
In this section we present basic properties of the zero-modified discrete Pareto distribution, given by the PMF (10). We shall use the notation ZMDP(α, λ, q), or in short ZMDP, for this distribution. Some of our results presented below shall be stated in an alternative parameterization, which conveniently accounts for the special case α = ∞, corresponding to the zero-modified geometric distribution given by (3). Namely, as in [3], we replace α with its reciprocal and instead of λ we set ρ = 1 − exp(−1/(αλ)), so that 1/λ = −α log(1 − ρ) and the PMF (10) takes on the form
$$P(X = k) = q\,\delta_{k0} + (1-q)(1-\delta_{k0})\left[\left(\frac{1}{1-\alpha(k-1)\log(1-\rho)}\right)^{1/\alpha} - \left(\frac{1}{1-\alpha k\log(1-\rho)}\right)^{1/\alpha}\right]$$
with k ∈ N_0. We use ZMDP*(α, ρ, q) for the zero-modified discrete Pareto distribution with the above PMF. As shown below, the parameter α ≥ 0 is a tail parameter, ρ ∈ [0, 1] has to do with the "size" of X, while the parameter q ∈ [0, 1] controls the point mass at zero. The main motivation for the re-parameterization is that the distribution can be defined at the boundary case α = 0, which is understood as the limit of the ZMDP*(α, ρ, q) distribution with ρ ∈ (0, 1) as α converges to zero. It follows that in the limit we obtain the zero-modified geometric distribution (3). On the other hand, we do not get a proper distribution when α → ∞. We also have a few other special cases as follows:
(i) If q = 1, the distribution is a point mass at k = 0.
(ii) If q = 0, we get the discrete Pareto distribution.
(iii) If q ∈ ( , ) and ρ = , the distribution is a point mass at k = . As mentioned above, the parameter α controls the tails of the ZMDP distributions, which follow a power law just as they do in the case of DP distribution. The following result, which is straightforward to prove using the ZMDP survival function, makes this more precise.

Proposition 3.1. If X ∼ ZMDP(α, λ, q) then
$$\lim_{x\to\infty} x^{\alpha}\, P(X > x) = (1-q)\,\lambda^{\alpha}.$$
Next, we argue that in some sense the parameter λ > 0 controls the "size" of the ZMDP random variable, although it is not a scale parameter in the usual sense. As we show below, as λ increases, the distribution increases in a stochastic sense. Recall that a random variable X_1 is said to be stochastically larger than a random variable X_2 if F_1(x) ≤ F_2(x) for all x, where F_1 and F_2 are the CDFs of X_1 and X_2, respectively. The following result, which is an extension of an analogous property of the DP distribution, is a simple consequence of the particular form of the CDF of the ZMDP distribution given in Proposition 3.3 below.

Proposition 3.2. If X_1 ∼ ZMDP(α, λ_1, q) and X_2 ∼ ZMDP(α, λ_2, q), where λ_1 < λ_2, then X_2 is stochastically larger than X_1.
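Both the power-law tail and the stochastic ordering in λ are easy to check numerically. The sketch below assumes the ZMDP(α, λ, q) survival function (1 − q)(λ/(λ + n))^α at integers n ≥ 0; the function names are illustrative.

```python
def zmdp_sf(n: int, alpha: float, lam: float, q: float) -> float:
    """P(X > n) for X ~ ZMDP(alpha, lam, q), n a non-negative integer."""
    return (1.0 - q) * (lam / (lam + n)) ** alpha


def tail_ratio(n: int, alpha: float, lam: float, q: float) -> float:
    """n^alpha * P(X > n) divided by its limit (1 - q) * lam^alpha."""
    return n ** alpha * zmdp_sf(n, alpha, lam, q) / ((1.0 - q) * lam ** alpha)


def dominates(lam_big: float, lam_small: float, alpha: float, q: float,
              n_max: int = 1000) -> bool:
    """True if the tail under lam_big is never below the tail under lam_small."""
    return all(
        zmdp_sf(n, alpha, lam_big, q) >= zmdp_sf(n, alpha, lam_small, q)
        for n in range(n_max + 1)
    )


if __name__ == "__main__":
    for n in (10, 1000, 100000):
        print(n, tail_ratio(n, alpha=1.5, lam=2.0, q=0.3))
    print(dominates(3.0, 2.0, alpha=1.5, q=0.3))
```

The ratio printed in the first loop tends to 1, which is the power-law statement; the final line illustrates that a larger λ shifts probability mass to the right.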

3.1. The CDF and the quantile functions
In order to describe the CDF, the survival function (SF), and the quantile function connected with the ZMDP model, it is convenient to use the standard floor and ceiling functions. Recall that, for x ∈ R, the floor function, often denoted by ⌊x⌋, is the largest integer that is less than or equal to x. Similarly, the ceiling function, often denoted by ⌈x⌉, is the smallest integer that is larger than or equal to x. With this notation, the CDF and the SF of a ZMDP model admit the expressions given in the following result, whose elementary proof shall be omitted.

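Proposition 3.3. The CDF and the SF of X ∼ ZMDP*(α, ρ, q) are given, for x ≥ 0, by
$$F(x) = 1 - (1-q)\left(\frac{1}{1-\alpha\lfloor x \rfloor\log(1-\rho)}\right)^{1/\alpha}$$
and
$$S(x) = (1-q)\left(\frac{1}{1-\alpha\lfloor x \rfloor\log(1-\rho)}\right)^{1/\alpha}, \tag{28}$$
respectively, with F(x) = 0 and S(x) = 1 for x < 0.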
In turn, the quantile function of the ZMDP model is obtained by inverting its CDF, leading to the result below.
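Inverting the CDF can be sketched as follows. The code works in the (α, λ, q) parameterization of the PMF (10), assumes the CDF 1 − (1 − q)(λ/(λ + n))^α at integers n ≥ 0, and defines the quantile as the smallest integer n with CDF at least p; this is a generic generalized-inverse construction rather than the paper's stated formula, and the names are hypothetical.

```python
import math


def zmdp_cdf_int(n: int, alpha: float, lam: float, q: float) -> float:
    """P(X <= n) for X ~ ZMDP(alpha, lam, q) at an integer n."""
    if n < 0:
        return 0.0
    return 1.0 - (1.0 - q) * (lam / (lam + n)) ** alpha


def zmdp_quantile(p: float, alpha: float, lam: float, q: float) -> int:
    """Smallest integer n >= 0 with P(X <= n) >= p, for p in (0, 1)."""
    if not 0.0 < p < 1.0:
        raise ValueError("p must lie strictly between 0 and 1")
    if p <= q:
        return 0
    # Closed-form candidate from solving (1 - q) * (lam / (lam + n))**alpha <= 1 - p.
    n = max(math.ceil(lam * (((1.0 - q) / (1.0 - p)) ** (1.0 / alpha) - 1.0)), 0)
    # Guard against floating-point rounding near integer boundaries.
    while zmdp_cdf_int(n, alpha, lam, q) < p:
        n += 1
    while n > 0 and zmdp_cdf_int(n - 1, alpha, lam, q) >= p:
        n -= 1
    return n


if __name__ == "__main__":
    for p in (0.2, 0.5, 0.9, 0.99):
        print(p, zmdp_quantile(p, alpha=2.0, lam=1.0, q=0.3))
```

The two correction loops make the result robust even when the closed-form candidate is off by one because of rounding.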

3.2. Moments and related parameters
We start with the probability generating function (PGF) of a ZMDP random variable X, defined as
$$G(s) = E s^{X} = \sum_{n=0}^{\infty} s^{n} P(X = n), \quad s \in (0, 1).$$
Perhaps the most convenient way to derive it is through the mixture representation of Proposition 3.10 coupled with the formula for the PGF of the DP distribution (see [3], Proposition 2.6). This immediately produces the result below.
Proposition 3.5. The PGF of X ∼ ZMDP*(α, ρ, q) is given by
$$G(s) = q + (1-q)\,G_N(s), \quad s \in (0, 1), \tag{29}$$
where G_N is the PGF of the corresponding DP distribution (see [3], Proposition 2.6).

The formulas for the moments connected with the ZMDP distribution are straightforward to derive when we take into account the mixture representation of Proposition 3.10 and results on the moments of the DP distribution (see [3], Proposition 2.7). Note that according to Proposition 3.1, the moments EX^r of X ∼ ZMDP*(α, ρ, q), where r > 0, are finite if and only if r < 1/α. The following result, which is straightforward to derive, provides further details.
Proposition 3.6. Let X ∼ ZMDP*(α, ρ, q) and r > 0. Then EX^r exists if and only if r < 1/α, in which case we have
$$E X^{r} = (1-q)\, E N^{r},$$
where N has the corresponding DP distribution. In particular, the mean exists whenever α < 1, and simplifies to
$$E X = (1-q) \sum_{k=0}^{\infty} \left(\frac{1}{1-\alpha k\log(1-\rho)}\right)^{1/\alpha}.$$

3.3. Stability properties
Due to the close connection between the ZMDP and DP distributions, it is not surprising that the stability properties of the latter (see, e.g., Section 3.1 of [3]) carry over, with some modifications, to the former.

3.3.1. Stability connected with minima
Our first result is related to the minimum of independent ZMDP variables. Due to the particular form of the ZMDP survival function, it can be seen that the minimum M_n = min_{1≤i≤n} {X_i} of n IID ZMDP variables {X_i} will also have a ZMDP distribution, but with different parameters. Indeed, if the SF of the {X_i} is given by S(x) as in (28), then the SF of M_n is of the form
$$S_n(x) = [S(x)]^{n} = (1-q_n)\left(\frac{1}{1-\alpha_n\lfloor x \rfloor\log(1-\rho_n)}\right)^{1/\alpha_n}, \quad x \ge 0, \tag{30}$$
where
$$\alpha_n = \alpha/n, \qquad \rho_n = 1-(1-\rho)^{n}, \qquad q_n = 1-(1-q)^{n},$$
which is seen to be a SF of the ZMDP distribution. In turn, if the SF of M_n is of the form (30), then it follows that the SF of the X_i must be given by (28). This leads to the following result, which is an extension of a similar property of DP distributions [3,19].

Proposition 3.7. Let X_1, ..., X_n be IID random variables. Then M_n ∼ ZMDP*(α_n, ρ_n, q_n), with α_n, ρ_n and q_n as above, if and only if X_i ∼ ZMDP*(α, ρ, q).

This result can be extended to the case of independent but not necessarily identically distributed ZMDP variables, as long as they have a common "scale" parameter.
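The closure under minima can be verified numerically in the (α, λ, q) parameterization, where raising the survival function (1 − q)(λ/(λ + n))^α to the n-th power multiplies the tail index by n, leaves λ unchanged, and replaces q by 1 − (1 − q)^n. The sketch below is an illustration under these assumptions; the function names are hypothetical.

```python
def zmdp_sf(x: int, alpha: float, lam: float, q: float) -> float:
    """P(X > x) for X ~ ZMDP(alpha, lam, q), x a non-negative integer."""
    return (1.0 - q) * (lam / (lam + x)) ** alpha


def min_params(n: int, alpha: float, lam: float, q: float):
    """Parameters (alpha, lam, q) of the minimum of n IID ZMDP variables."""
    return n * alpha, lam, 1.0 - (1.0 - q) ** n


if __name__ == "__main__":
    n, alpha, lam, q = 3, 1.2, 2.0, 0.4
    a_n, l_n, q_n = min_params(n, alpha, lam, q)
    for x in range(5):
        # Both columns should agree: P(min > x) = P(X > x)^n.
        print(x, zmdp_sf(x, alpha, lam, q) ** n, zmdp_sf(x, a_n, l_n, q_n))
```

In particular, the minimum has a lighter tail (index nα) and a larger point mass at zero than each individual variable.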

3.3.2. Stability of the conditional tail
We now consider the "tail" random variable X_u, which is also known as the excess, defined as X − u given that X ≥ u, where u ∈ N is interpreted as a threshold beyond which we have an observation. Recall that the geometric distribution (supported on N_0) is stable, in the sense that the variables X_u and X have the same distribution for each u ∈ N when X is geometric. As shown below, if X is ZMDP then X_u is also ZMDP for each u ∈ N_0, although their distributions have different parameters. This result extends a similar property of the DP distribution [3] to the ZMDP case.
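A direct calculation with the survival function (1 − q)(λ/(λ + n))^α suggests that, for u ≥ 1, the excess X − u given X ≥ u is again of ZMDP type with the same α, with λ replaced by λ + u, and with zero mass 1 − ((λ + u − 1)/(λ + u))^α. These parameter formulas are derived here for illustration and are not quoted from the paper; the sketch below checks them numerically.

```python
def zmdp_sf(x: int, alpha: float, lam: float, q: float) -> float:
    """P(X > x) for X ~ ZMDP(alpha, lam, q); equals 1 for x < 0."""
    if x < 0:
        return 1.0
    return (1.0 - q) * (lam / (lam + x)) ** alpha


def excess_sf(x: int, u: int, alpha: float, lam: float, q: float) -> float:
    """P(X - u > x | X >= u), computed from the definition of the excess."""
    return zmdp_sf(u + x, alpha, lam, q) / zmdp_sf(u - 1, alpha, lam, q)


def excess_params(u: int, alpha: float, lam: float, q: float):
    """Candidate ZMDP parameters of the excess (derived in the lead-in)."""
    if u == 0:
        return alpha, lam, q  # X >= 0 always holds, so the excess is X itself
    q_u = 1.0 - ((lam + u - 1.0) / (lam + u)) ** alpha
    return alpha, lam + u, q_u


if __name__ == "__main__":
    u, alpha, lam, q = 4, 1.5, 2.0, 0.3
    a_u, l_u, q_u = excess_params(u, alpha, lam, q)
    for x in range(5):
        print(x, excess_sf(x, u, alpha, lam, q), zmdp_sf(x, a_u, l_u, q_u))
```

Note that for u ≥ 1 the original zero mass q cancels out of the conditional tail, so the excess distribution depends on q only through the trivial case u = 0.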

3.4. Stochastic representations
Here, we present several useful stochastic representations of the ZMDP distribution. We start with its basic relation to the DP model of [3].
Proposition 3.10. If X ∼ ZMDP(α, λ, q) then
$$X \stackrel{d}{=} I\,N,$$
where the variables I and N are independent, I has a Bernoulli distribution with parameter 1 − q, and N ∼ DP(α, λ) with the PMF (11).
Since, as shown in Proposition 3.4 of [3], the variable N from Proposition 3.10 is (conditionally) geometric with parameter ρ = 1 − e^{−θ} given that Θ = θ, where Θ is a gamma variable given by the LT (15), we obtain the following representation.
One can also relate the ZMDP distribution to a randomly stopped Poisson process. Indeed, it is well known that if {N(t), t ∈ R+} is a standard Poisson process and Z is a standard exponential variable, independent of the process, then N(Z/β) has a geometric distribution (supported on N_0) with parameter ρ = β/(β + 1). In particular, when β = e^{θ} − 1, then ρ = 1 − e^{−θ}. Consequently, in view of Proposition 3.11, we obtain the following result.
Proposition 3.12. If X ∼ ZMDP(α, λ, q), then
$$X \stackrel{d}{=} I\left[1 + N\!\left(\frac{Z}{e^{\Theta}-1}\right)\right],$$
where all the variables on the right-hand side are independent, I has a Bernoulli distribution with parameter 1 − q, the variable Z is standard exponential, Θ has a gamma distribution with the PDF (7), and {N(t), t ∈ R+} is a standard Poisson process.
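The mixture representations suggest a simple way to simulate ZMDP variates: draw the Bernoulli indicator, then Θ from the gamma law, then a conditionally geometric variable. The sketch below assumes a gamma law with shape α and rate λ and generates the geometric variable (on {1, 2, ...}, with success probability 1 − e^(−Θ)) as ⌈E/Θ⌉ with E standard exponential; `sample_zmdp` is a hypothetical helper name.

```python
import math
import random


def sample_zmdp(alpha: float, lam: float, q: float, rng: random.Random) -> int:
    """Draw one ZMDP(alpha, lam, q) variate via X = I * N with N mixed geometric."""
    if rng.random() < q:  # I = 0 with probability q
        return 0
    # random.gammavariate takes (shape, scale); scale = 1 / rate.
    theta = rng.gammavariate(alpha, 1.0 / lam)
    while theta < 1e-300:  # avoid overflow in e / theta for degenerate draws
        theta = rng.gammavariate(alpha, 1.0 / lam)
    e = rng.expovariate(1.0)
    # P(ceil(E / theta) > n) = exp(-n * theta): geometric on {1, 2, ...}.
    return max(1, math.ceil(e / theta))


if __name__ == "__main__":
    rng = random.Random(2021)
    sample = [sample_zmdp(2.0, 1.5, 0.3, rng) for _ in range(100000)]
    print("share of zeros:", sample.count(0) / len(sample))
    print("largest value :", max(sample))
```

Empirical tail frequencies from such a sample can be compared with (1 − q)(λ/(λ + n))^α as a quick consistency check of the representation.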

3.5. Divisibility properties
Recall that the probability distribution of a random variable X is infinitely divisible (ID) if for each n ∈ N we have the equality in distribution
$$X \stackrel{d}{=} X_{n,1} + \cdots + X_{n,n}, \tag{32}$$
where the {X_{n,j}} (1 ≤ j ≤ n) are IID random variables. Additionally, if the distribution of X is supported on N_0, then it is discrete infinitely divisible if it is ID and the variables {X_{n,j}} in (32) are supported on N_0 as well. As shown in [3], the DP distribution is ID (and its shifted version, supported on N_0, is discrete ID). However, as shown below, the infinite divisibility of the zero-modified DP distribution depends on its parameters. Generally speaking, if X ∼ ZMDP(α, λ, q) then the discrete ID property holds when the values of q are near 1 and does not hold if q is near zero. The following result summarizes these facts.
Proposition 3.13. Let X ∼ ZMDP(α, λ, q). Then the distribution of X is discrete ID (and thus ID) when
$$q \ge \frac{1}{1+d_{\alpha,\lambda}} \tag{33}$$
and it is not discrete ID when
$$q < \frac{1}{1+2d_{\alpha,\lambda}}, \tag{34}$$
where
$$d_{\alpha,\lambda} = \frac{\left(\frac{\lambda}{\lambda+1}\right)^{\alpha} - \left(\frac{\lambda}{\lambda+2}\right)^{\alpha}}{\left[1-\left(\frac{\lambda}{\lambda+1}\right)^{\alpha}\right]^{2}}. \tag{35}$$

Proof. To prove discrete ID we shall use a sufficient condition for this property, stating that the sequence of probabilities (p_k)_{k∈N_0}, where p_k = P(X = k), is log-convex, that is, p_k > 0 for all k and the sequence (p_{k+1}/p_k)_{k∈N_0} is non-decreasing (see, e.g., [33], Theorem 10.1, p. 60). We use this condition to establish discrete ID of the ZMDP distribution with parameters satisfying (33) and q < 1, as for q = 1 the distribution reduces to a point mass at zero, which is clearly discrete ID. In this case the probabilities are positive, so it remains to show the inequality
$$\frac{p_k}{p_{k-1}} \le \frac{p_{k+1}}{p_k}, \quad k \in \mathbb{N}, \tag{36}$$
where the {p_k} are given by the right-hand side of (10). For k = 1, this inequality produces (1 − q)/q ≤ d_{α,λ} with d_{α,λ} as in (35), and results in (33) upon solving for q. Next, we establish (36) for any k ≥ 2, which we accomplish by showing that the function p_{k+1}/p_k of real argument k is increasing on (1, ∞). To this end, consider the function
$$g(x) = \frac{(\lambda+x)^{-\alpha} - (\lambda+x+1)^{-\alpha}}{(\lambda+x-1)^{-\alpha} - (\lambda+x)^{-\alpha}},$$
which, according to (10), represents the ratio p_{x+1}/p_x of ZMDP probabilities (evaluated at real argument x > 1). By examining its derivative, we show that the function g is indeed increasing. Straightforward albeit rather lengthy algebra leads to an expression (37) for the derivative of g that involves an auxiliary function h. Our objective is to show that h(x) > 0 (x > 1), in which case the derivative in (37) is positive and the function g is increasing. By setting y = λ + x − 1, we see that the condition h(x) > 0 (x > 1) is equivalent to w(y) < w(y + 1) (y > λ), where w(y) = (y + 1)^{α+1} − y^{α+1}. However, the latter inequality is true since the function w is increasing, as can be verified by taking its derivative. This completes the first part of the result.

We now move to the second part of the result, and show that the distribution of X is not discrete ID when q satisfies the inequality (34). This is clear when q = 0, since in this case the distribution is supported on N (as p_0 = q = 0) and consequently cannot be discrete ID (see, e.g., [33], p. 23). Further, it is well known that the characteristic sequence (r_k)_{k∈N_0} of a discrete ID distribution must be non-negative ([33], Theorem 4.4, p. 36), where the elements of the sequence r_k are defined via the relations
$$(n+1)\,p_{n+1} = \sum_{k=0}^{n} p_k\, r_{n-k}, \quad n \in \mathbb{N}_0. \tag{38}$$
Solving (38) for r_0 and r_1 leads to r_0 = p_1/p_0 and r_1 = (2p_2 − p_1 r_0)/p_0, respectively, and the condition r_1 ≥ 0 becomes (1 − q)/q ≤ 2d_{α,λ} upon taking into account the particular form (10) of ZMDP probabilities. Since the last inequality is equivalent to q ≥ 1/(1 + 2d_{α,λ}), the distribution cannot be discrete ID under (34). The proof is now complete.
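The two thresholds in the divisibility result can be explored numerically. The sketch below assumes d = a_2/a_1², where a_k denotes the discrete Pareto probability (λ/(λ + k − 1))^α − (λ/(λ + k))^α; this form is obtained from the log-convexity inequality at k = 1 and is a reconstruction, and all function names are hypothetical. The log-convexity test only inspects finitely many terms, so it is a numerical indication rather than a proof.

```python
def dp_pmf(k: int, alpha: float, lam: float) -> float:
    """Discrete Pareto probability a_k for k = 1, 2, ..."""
    return (lam / (lam + k - 1)) ** alpha - (lam / (lam + k)) ** alpha


def d_const(alpha: float, lam: float) -> float:
    """Assumed threshold constant d = a_2 / a_1**2."""
    return dp_pmf(2, alpha, lam) / dp_pmf(1, alpha, lam) ** 2


def zmdp_pmf(k: int, alpha: float, lam: float, q: float) -> float:
    return q if k == 0 else (1.0 - q) * dp_pmf(k, alpha, lam)


def is_log_convex(alpha: float, lam: float, q: float, k_max: int = 100) -> bool:
    """Check that p_{k+1} / p_k is non-decreasing for k = 0, ..., k_max - 1."""
    ratios = [zmdp_pmf(k + 1, alpha, lam, q) / zmdp_pmf(k, alpha, lam, q)
              for k in range(k_max)]
    return all(b >= a - 1e-12 for a, b in zip(ratios, ratios[1:]))


def r1(alpha: float, lam: float, q: float) -> float:
    """Second element of the characteristic sequence, from (n+1)p_{n+1} = sum p_k r_{n-k}."""
    p0, p1, p2 = (zmdp_pmf(k, alpha, lam, q) for k in range(3))
    r0 = p1 / p0
    return (2.0 * p2 - p1 * r0) / p0


if __name__ == "__main__":
    alpha, lam = 2.0, 1.0
    d = d_const(alpha, lam)
    print("discrete ID if q >=", 1.0 / (1.0 + d))
    print("not discrete ID if q <", 1.0 / (1.0 + 2.0 * d))
    print(is_log_convex(alpha, lam, 0.85), r1(alpha, lam, 0.5))
```

For q strictly between the two thresholds neither criterion applies, which matches the gap left open by the two-sided statement.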
Remark 3.1. The property of discrete ID shown above allows one to construct continuous-time, discrete-valued stochastic processes based on the ZMDP distribution with appropriate parameters. For example, if 1/(1 + d_{α,λ}) ≤ q < 1, we can define a Lévy motion {X(t), t > 0}, a process with independent and stationary increments, where X(1) is ZMDP(α, λ, q) with the PGF G given by (29), while for each t > 0 the PGF of X(t) is G^t. A similar construction is possible for the unmodified, regular DP distribution as well. Such processes may prove to be useful tools for modeling the claim arrival processes of actuarial risk theory.