The Pareto model corresponds to the power law widely used in physics, biology, and many other fields. In this article, a new generalized Pareto model with a heavy right tail is introduced and studied. It exhibits an upside-down bathtub-shaped failure rate (FR) function. The moments, quantiles, FR function, and mean remaining life function are examined. Then, its parameters are estimated by maximum likelihood, least squared error, and Anderson–Darling (a weighted least squared error) approaches. A simulation study is conducted to verify the efficiency and consistency of the discussed estimators. Analysis of Floyd River flood discharges in James, Iowa, USA, from 1935 to 1973 shows that the proposed model can be quite useful in real applications, especially for extreme value data.
The heavy right-tailed Pareto model which is characterized by the distribution function
and the probability density function (PDF)
occurs in a diverse range of physical phenomena. Generally, it is useful when there is an equilibrium in the distribution of “small” to “large” values, e.g., the sizes of files transmitted on a computer network, which consist of many small files and few large ones, or the sizes of human settlements, which consist of many small villages/hamlets and few large cities. Moreover, the sizes of solar flares, oil reserves in oil fields, earthquakes, corporations, and lunar craters share a similar property, which is referred to as the “power law” property. Newman reviewed some power law forms and the theories explaining them. The Pareto model is recognized in the literature by its heavy right tail and has a decreasing failure rate function. It is useful in biology, reliability engineering, survival analysis, quality control, economics, computer science, geophysics, and many other scientific fields. For detailed information about Pareto and related distributions and their features, see Arnold, Zhang et al., and Zhang et al.
Bak and Sneppen, Sornette, and Carlson and Doyle, among many others, used the Pareto as a power law model in their research. Also, Burroughs and Tebbens fitted the Pareto model to earthquake and wildfire observations, and Schroeder et al. described plate fault data by the Pareto model. Moreover, some researchers defined modified versions of the Pareto model and applied them in their studies. These modified Pareto models are more flexible for data generated by various phenomena. For example, Akinsete et al. introduced a beta Pareto model; Nassar and Nada and Mahmoudi proposed a beta generalization of the Pareto distribution; Alzaatreh et al. used the gamma distribution to propose a modified Pareto model; and Zea et al., Elbatal, and Bourguignon et al. defined extensions of the Pareto distribution. Papastathopoulos and Tawn applied an extended Pareto model for tail estimation. Moreover, Mead, Elbatal and Aryal, Korkmaz et al., Ghitany et al., Tahir et al., Ihtisham et al., Haj Ahmad and Almetwally, Jayakumar et al., and recently Jayakumar et al. defined and studied new models with a heavier right tail than the Pareto model.
In this article, a new flexible generalized Pareto distribution with a heavy right tail and an upside-down bathtub-shaped (UBT) FR function is introduced and studied. The novelty of the model is that it combines the heavy right tail of the Pareto model with a UBT FR form. Thus, the main advantage of the proposed model is that it is useful when the data show a fat right tail and a UBT FR function. Such data can be observed in hydrology or in other situations with extreme values. The remainder of the article is organized as follows. In Section 2, the new model is defined and its basic attributes, such as the moments and quantiles, are discussed. Then, some of its important dynamic measures, such as the FR and mean residual life (MRL) functions, are studied. The aim of Section 3 is to estimate the parameters of the proposed model. In Section 4, the efficiency and consistency of the considered estimators are investigated by simulations. Then, the proposed model and some alternatives are fitted to consecutive flood discharges of the Floyd River located in James, Iowa, USA, during 1935 to 1973 to show its applicability.
2 The new modified Pareto distribution
The new generalized Pareto, , model is defined by the distribution function,
and the PDF
Figure 1 draws the PDF for some values of the parameters and shows that it is unimodal. It seems that the coefficient changes the form of the PDF from decreasing to increasing in an early period and the mode of the model increases with . If , the GP reduces to the baseline Pareto model and if tends to zero, it converges to a model with the distribution function
which is a special case of the modified Weibull model defined by Kayid and Djemili .
Like the baseline Pareto, the proposed GP model has a heavy right tail. For example, in comparison with the well-known Weibull distribution we can write
Moreover, in comparison with the baseline Pareto model
which indicates that GP has a heavy right tail like the baseline Pareto model.
Let and be two positive functions, , for every and . Then, we have .
Since , there exists such that for , . Thus,
which gives the result.□
The expectation of a random variable following the Pareto distribution function (1) is finite and equals for and is infinite for . Moreover, the th moment of this random variable is finite for and infinite for . The following proposition establishes a similar result for the th moment of the proposed GP model. It shows that when , the expectation of is finite and otherwise it is infinite.
The kth moment of is finite for and infinite otherwise.
Let be the reliability function of . Then, the th moment of this model equals
Take and , where and are the reliability functions of the Pareto model with parameters and , respectively. Then, it can be verified that , and by verifying other conditions of Lemma 1, the result follows immediately.□
One good tendency measure which may be applied in place of the moments is the quantile function which at point equals . This function is useful in generating simulations, estimating the parameters and describing the model characteristics like skewness and kurtosis. For example, Bowley  and MacGillivray  defined the skewness based on the quantile function. Moreover, Moors  presented a measure of kurtosis in terms of quantiles. For , the quantile function at could be obtained by solving the following equation in terms of .
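As an illustration of the quantile-based shape measures mentioned above, the following minimal sketch evaluates Bowley's quartile skewness and Moors' octile kurtosis for any quantile function. Since the quantile function of the GP model is not available in closed form here, the function `lomax_quantile` and its parameters `alpha` and `beta` are hypothetical stand-ins (a Pareto type II form), used only to show how the measures are computed.

```python
def bowley_skewness(q):
    """Bowley's skewness (Q3 + Q1 - 2*Q2) / (Q3 - Q1) for a quantile function q(p)."""
    q1, q2, q3 = q(0.25), q(0.5), q(0.75)
    return (q3 + q1 - 2.0 * q2) / (q3 - q1)


def moors_kurtosis(q):
    """Moors' kurtosis ((E7 - E5) + (E3 - E1)) / (E6 - E2) based on the octiles E1..E7."""
    e = [q(k / 8.0) for k in range(1, 8)]  # e[0] = E1, ..., e[6] = E7
    return ((e[6] - e[4]) + (e[2] - e[0])) / (e[5] - e[1])


def lomax_quantile(p, alpha=2.0, beta=1.0):
    # Hypothetical stand-in: quantile function of a Pareto type II (Lomax) model,
    # F(x) = 1 - (1 + x / beta) ** (-alpha); this is NOT the GP model of the article.
    return beta * ((1.0 - p) ** (-1.0 / alpha) - 1.0)


skewness = bowley_skewness(lomax_quantile)
kurtosis = moors_kurtosis(lomax_quantile)
```

For a numerically inverted distribution function, the same two helpers can be used by passing a function that solves F(x) = p for x.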
2.1 Dynamic measures
The FR function of the proposed model is
The FR function of is UBT. Moreover, the point maximizing the FR function is the root of the following equation:
By differentiation of the FR function, we found that the sign of is the same as sign of the following function:
which is a decreasing function and as tends to zero, and as tends to .□
In fact, the coefficient , included in the model, affects the early life and turns the decreasing FR function into an increasing one in an initial period of life.
The proof of the following proposition could be trivially obtained by comparing the reliability functions of the assumed random variables and is omitted.
Let and and . Then, is in stochastic order.
Figure 1 draws the FR function for some parameters and shows that under the conditions of Proposition 3, is not smaller than in FR ordering. Moreover, since likelihood ratio ordering is stronger than FR ordering, would not be smaller than in likelihood ratio ordering.
The MRL function of is finite for and is given by
By Proposition 2, it results that the MRL function has an increasing or bathtub form, see Lai and Xie . Figure 2 shows the MRL for some parameter values.
Another prominent dynamic tendency measure is the -quantile residual life ( -QRL) function, which is given by
The special case is referred to as the median residual life function. The -QRL, and especially the median residual life function, are preferable to the MRL for models with a heavy right tail, especially the Pareto distribution, since the MRL may be infinite. Figure 2 exhibits the median residual life for some values of the parameters. Note that when the FR function is UBT, the -QRL will be increasing or bathtub shaped; see Kayid for detailed information.
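The quantile residual life can be evaluated numerically for any reliability function. The sketch below assumes the usual definition, namely the smallest x such that R(t + x) is at most (1 − α)R(t), and finds it by bisection. The function `lomax_reliability` is a hypothetical stand-in (Pareto type II) rather than the GP reliability function.

```python
def quantile_residual_life(reliability, t, alpha=0.5, tol=1e-10):
    """alpha-quantile residual life at age t for a decreasing reliability function.

    Solves R(t + x) = (1 - alpha) * R(t) for x >= 0 by bracketing and bisection.
    """
    target = (1.0 - alpha) * reliability(t)
    lo, hi = 0.0, 1.0
    # Expand the bracket until the reliability drops below the target.
    while reliability(t + hi) > target:
        lo = hi
        hi *= 2.0
    # Bisect until the bracket is small relative to its size.
    while hi - lo > tol * max(1.0, hi):
        mid = 0.5 * (lo + hi)
        if reliability(t + mid) > target:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


def lomax_reliability(x, alpha=2.0, beta=1.0):
    # Hypothetical stand-in: R(x) = (1 + x / beta) ** (-alpha), not the GP model.
    return (1.0 + x / beta) ** (-alpha)


median_rl_at_3 = quantile_residual_life(lomax_reliability, t=3.0, alpha=0.5)
```

For this stand-in the median residual life has the closed form (β + t)(2^{1/α} − 1), which grows linearly in t and can be used to check the numerical result.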
Assume represents an ordered, independent and identically distributed sample of . In this section, the maximum likelihood (ML), least squared errors (LSE), and Anderson–Darling (AD) methods are discussed for estimating the parameters of the model.
3.1 ML method
The log-likelihood function of equals
The value ( ) which maximizes this function is referred to as the maximum likelihood estimator (MLE). However, the maximum point does not have an algebraic closed form, and it can be computed by maximizing the log-likelihood directly or by solving the following likelihood equations:
The Fisher information matrix can be estimated by replacing the parameters by their ML estimates in the following Fisher information matrix.
It is a well-known and very practical technique to approximate the distribution of the MLE by a multivariate normal distribution. The random vector converges weakly to the multivariate normal , where is the inverse of the observed Fisher information matrix.
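The normal approximation is typically used to build approximate (Wald) confidence intervals for the parameters. The following minimal sketch assumes that the ML estimates and the inverse of the observed Fisher information matrix have already been computed; the numerical values in the example are hypothetical and serve only to show the calculation.

```python
import math
from statistics import NormalDist


def wald_intervals(estimates, covariance, level=0.95):
    """Approximate confidence intervals from the asymptotic normality of the MLE.

    estimates  : list of ML estimates
    covariance : inverse of the observed Fisher information matrix (list of lists)
    """
    z = NormalDist().inv_cdf(0.5 + level / 2.0)  # standard normal quantile
    intervals = []
    for i, estimate in enumerate(estimates):
        standard_error = math.sqrt(covariance[i][i])
        intervals.append((estimate - z * standard_error, estimate + z * standard_error))
    return intervals


# Hypothetical estimates and covariance matrix, for illustration only.
example_estimates = [1.0, 0.2]
example_covariance = [[0.04, 0.001], [0.001, 0.0025]]
example_intervals = wald_intervals(example_estimates, example_covariance)
```

Because the parameters are positive, intervals built this way can extend below zero for small samples; building the interval on the logarithmic scale is a common alternative.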
3.2 LSE and AD methods
In the LSE approach for estimating the parameters, we seek the parameter values minimizing the following expression:
which makes the distance between the estimated and empirical distributions as small as possible. That is, the LSE estimates are given by
The AD approach is a weighted version of the LSE method with weight . Thus, the AD estimate of the parameters is given by
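Both criteria can be written as functions of a candidate distribution function and the ordered sample. The sketch below assumes the commonly used forms, i.e., the plotting position i/(n + 1) for the LSE criterion and the standard computational form of the AD statistic; the exact expressions used in the article may differ. The callable `cdf` stands for the model distribution function at fixed parameter values.

```python
import math


def lse_objective(cdf, sample):
    """Sum of squared distances between F(x_(i)) and the plotting position i / (n + 1)."""
    xs = sorted(sample)
    n = len(xs)
    return sum((cdf(x) - i / (n + 1.0)) ** 2 for i, x in enumerate(xs, start=1))


def ad_objective(cdf, sample):
    """Anderson-Darling criterion in its usual computational form.

    Requires 0 < F(x) < 1 for every observation.
    """
    xs = sorted(sample)
    n = len(xs)
    total = 0.0
    for i in range(1, n + 1):
        total += (2 * i - 1) * (math.log(cdf(xs[i - 1])) + math.log(1.0 - cdf(xs[n - i])))
    return -n - total / n


# Illustration with a uniform distribution function on (0, 1) as a placeholder model.
example_sample = [0.2, 0.4, 0.6, 0.8]
lse_value = lse_objective(lambda x: x, example_sample)
ad_value = ad_objective(lambda x: x, example_sample)
```

In an estimation routine, either function would be minimized over the model parameters by a numerical optimizer, with `cdf` rebuilt at each candidate parameter vector.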
4 Simulation study
To simulate one random variable from , first we generate one random instance from standard uniform distribution and solve the equation in terms of , where is the distribution function of .
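The inversion step described above can be sketched as follows. The root of F(x) = u is found by bracketing and bisection, which only requires that F be continuous and increasing on the positive half-line. The function `lomax_cdf` is a hypothetical stand-in for the distribution function of the model.

```python
import random


def invert_cdf(cdf, u, tol=1e-10):
    """Solve cdf(x) = u for x >= 0 by bracketing and bisection."""
    lo, hi = 0.0, 1.0
    # Expand the bracket; heavy tails may require many doublings.
    while cdf(hi) < u:
        lo = hi
        hi *= 2.0
    while hi - lo > tol * max(1.0, hi):
        mid = 0.5 * (lo + hi)
        if cdf(mid) < u:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


def draw_sample(cdf, size, rng):
    """Generate `size` observations by inverse-transform sampling."""
    return [invert_cdf(cdf, rng.random()) for _ in range(size)]


def lomax_cdf(x, alpha=2.0, beta=1.0):
    # Hypothetical stand-in: F(x) = 1 - (1 + x / beta) ** (-alpha), not the GP model.
    return 1.0 - (1.0 + x / beta) ** (-alpha)


rng = random.Random(2022)  # fixed seed so the run is reproducible
simulated = draw_sample(lomax_cdf, size=150, rng=rng)
```

Bisection is slower than Newton-type root finders but needs no derivative and cannot leave the bracket, which is convenient when the uniform draw is very close to 1.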
In this simulation study, some values for parameters are selected. Then, in every run, repetitions of size or 150 are simulated and the parameters are estimated by one of the ML, LSE, or AD approaches. The bias (B) and mean squared error (MSE) of the estimators are computed by the following relations, respectively.
and similarly for and . All computations are performed in the R environment. The optimization problems are solved by the built-in “optim” function of R. The initial values needed by this function are selected randomly from a uniform distribution, e.g., for , from the uniform distribution on the interval . Table 1 summarizes the results of the simulation study. The results show that all studied estimators are consistent and efficient, but the MLE outperforms the LSE and AD. On the other hand, the AD estimator gives a smaller MSE than the LSE.
|Method|Parameters|B|MSE|B|MSE|
|ML|0.1, 0.1, 1|0.0057|0.0125|−0.0008|0.0077|
||0.2, 0.07, 2|−0.0096|0.0204|−0.0167|0.0122|
||1, 0.2, 0.1|−0.0178|0.0824|−0.0052|0.0431|
|LSE|0.1, 0.1, 1|0.0403|0.0296|0.0215|0.0167|
||0.2, 0.07, 2|0.0314|0.0379|0.0114|0.0218|
||1, 0.2, 0.1|−0.0749|0.1495|−0.0329|0.0843|
|AD|0.1, 0.1, 1|0.0076|0.0173|−0.0046|0.0087|
||0.2, 0.07, 2|−0.0041|0.0242|−0.0072|0.0148|
||1, 0.2, 0.1|−0.0757|0.0965|−0.0488|0.0513|
In every cell, the first, second, and third lines correspond to , , and , respectively.
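The bias and MSE of an estimator are obtained from the estimates collected over the repetitions of a simulation run. A minimal sketch, assuming the usual definitions (average deviation from the true value and average squared deviation), is given below; the list of estimates in the example is hypothetical.

```python
def bias_and_mse(estimates, true_value):
    """Empirical bias and mean squared error over simulation repetitions."""
    r = len(estimates)
    bias = sum(e - true_value for e in estimates) / r
    mse = sum((e - true_value) ** 2 for e in estimates) / r
    return bias, mse


# Hypothetical estimates of a parameter whose true value is 1.0.
example_bias, example_mse = bias_and_mse([0.9, 1.1, 1.2], true_value=1.0)
```

The same function is applied separately to each parameter of the model.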
Table 2 shows the consecutive flood discharges in terms of for the Floyd River located in James, Iowa, USA, during 1935 to 1973; see Mudholkar and Hutson. The flood discharges are extreme values, and analyzing them could be helpful in predicting extreme flood occurrences. This data set was analyzed by Mudholkar and Hutson and by Merovcia and Puka. The box plot of the data presented in Figure 3 shows one extraordinarily large value, 71,500. Figure 3 also shows the total time on test (TTT) plot, which indicates a UBT FR function. In a comparative analysis, the proposed GP is compared with some other UBT FR models and some models with other FR forms, and the results are summarized in Table 3.
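The points of a scaled TTT plot can be computed directly from the ordered sample. The sketch below assumes the usual scaled TTT transform, in which the i-th point is the total time on test up to the i-th order statistic divided by the sample total. A curve that is first concave and then convex is commonly read as evidence of a UBT FR function. The three-point sample in the example is hypothetical.

```python
def scaled_ttt(sample):
    """Return the points (i / n, T_i) of the scaled total time on test plot.

    T_i = (x_(1) + ... + x_(i) + (n - i) * x_(i)) / (x_(1) + ... + x_(n)).
    """
    xs = sorted(sample)
    n = len(xs)
    total = sum(xs)
    points = [(0.0, 0.0)]
    running = 0.0
    for i, x in enumerate(xs, start=1):
        running += x  # sum of the i smallest observations
        points.append((i / n, (running + (n - i) * x) / total))
    return points


# Hypothetical positive observations, for illustration only.
example_points = scaled_ttt([3.0, 1.0, 2.0])
```

Plotting the second coordinate against the first and comparing the curve with the diagonal gives the graphical diagnostic.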
The alternative models are Pareto; exponentiated Pareto (EP); Marshall–Olkin Pareto (MOP); the Dimitrakopoulou, Adamidis, and Loukas (DAL) modified Weibull model proposed by Dimitrakopoulou et al.; inverse Weibull (IW); Marshall–Olkin inverse Weibull (MOIW); gamma; Marshall–Olkin gamma (MOG); Weibull; and Pareto exponential competing risk (PECR).
The parameters of the mentioned models are estimated by the ML method. The R programming language was used for computations, and all optimizations were done by the built-in function “optim” of R. The Akaike information criterion (AIC) and the Cramer–von Mises (CVM), AD, and Kolmogorov–Smirnov (KS) statistics are reported for every model. The proposed GP and MOIW models perform similarly. However, the GP outperforms the other models and provides a good description of the data. Figure 4 draws the empirical and fitted distribution functions for the GP and for those alternatives that show better fits. The estimated FR function is plotted in Figure 5 and confirms a UBT form for the FR function. Also, the histogram of the data and the estimated PDF are plotted on the right side of Figure 5.
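Two of the reported criteria are simple to compute once a model has been fitted. The sketch below assumes the standard definitions: AIC = 2k − 2ℓ, where k is the number of parameters and ℓ the maximized log-likelihood, and the KS statistic as the largest distance between the fitted and empirical distribution functions. The inputs in the example are hypothetical.

```python
def aic(max_log_likelihood, n_params):
    """Akaike information criterion: smaller values indicate a better trade-off."""
    return 2.0 * n_params - 2.0 * max_log_likelihood


def ks_statistic(cdf, sample):
    """Kolmogorov-Smirnov distance between a fitted cdf and the empirical one."""
    xs = sorted(sample)
    n = len(xs)
    return max(
        max(i / n - cdf(x), cdf(x) - (i - 1) / n)
        for i, x in enumerate(xs, start=1)
    )


# Hypothetical log-likelihood value and a uniform placeholder model.
example_aic = aic(max_log_likelihood=-10.0, n_params=3)
example_ks = ks_statistic(lambda x: x, [0.25, 0.5, 0.75])
```

When comparing models with different numbers of parameters, the AIC penalizes the extra parameters, whereas the KS statistic measures fit only.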
A new flexible GP model which preserves the heavy right tail attribute but exhibits an early increasing FR function is introduced. The limiting behavior of the proposed model is similar to that of the baseline Pareto, but the attributes differ at the beginning of the support. The proposed GP model has a UBT FR function. The simulation results show that the ML estimator is efficient and consistent. Applying the model to a flood discharge data set of the Floyd River shows that the proposed GP model could be useful in describing many data sets which occur in a wide variety of physical phenomena. There are many related topics for future work, for example, studying a mixture of the proposed GP model or introducing proper extensions of the GP model based on the underlying physical justifications.
The authors thank the two anonymous reviewers for their careful reading of our manuscript and their many insightful comments and suggestions. This work was supported by Researchers Supporting Project number RSP2022R464, King Saud University, Riyadh, Saudi Arabia.
Author contributions: All authors have accepted responsibility for the entire content of this manuscript and approved its submission.
Conflict of interest: The authors state no conflict of interest.
Newman MEJ. Power laws, Pareto distributions and Zipf’s law. Contemporary Phys. 2005;46:323–51. doi:10.1080/00107510500052444.
Arnold B. Pareto distributions. 2nd edition. London: Chapman and Hall/CRC; 2015. doi:10.1201/b18141.
Zhang Y, Agarwal P, Bhatnagar V, Balochian S, Yan J. Swarm intelligence and its applications. Scientific World J. 2013;2013:528069. doi:10.1155/2013/528069.
Zhang Y, Agarwal P, Bhatnagar V, Balochian S, Zhang X. Swarm intelligence and its applications 2014. Scientific World J. 2014;2014:204294. doi:10.1155/2014/204294.
Bak P, Sneppen K. Punctuated equilibrium and criticality in a simple model of evolution. Phys Rev Lett. 1993;71:4083–6. doi:10.1103/PhysRevLett.71.4083.
Sornette D. Multiplicative processes and power laws. Phys Rev E. 1998;57:4811–3. doi:10.1103/PhysRevE.57.4811.
Carlson JM, Doyle J. Highly optimized tolerance: a mechanism for power laws in designed systems. Phys Rev E. 1999;60:1412–27. doi:10.1103/PhysRevE.60.1412.
Burroughs SM, Tebbens SF. Upper-truncated power law distributions. Fractals. 2001;9:209–22. doi:10.1142/S0218348X01000658.
Schroeder B, Damouras S, Gill P. Understanding latent sector error and how to protect against them. ACM Trans Storage. 2010;6(3):8. doi:10.1145/1837915.1837917.
Akinsete A, Famoye F, Lee C. The beta-Pareto distribution. Statistics. 2008;42:547–63. doi:10.1080/02331880801983876.
Nassar MM, Nada NK. The beta generalized Pareto distribution. J Statistics Adv Theory Appl. 2011;6:1–17.
Mahmoudi E. The beta generalized Pareto distribution with application to lifetime data. Math Comput Simulat. 2011;81:2414–30. doi:10.1016/j.matcom.2011.03.006.
Alzaatreh A, Famoye F, Lee C. Gamma-Pareto distribution and its applications. J Modern Appl Statist Methods. 2012;11(1):78–94. doi:10.22237/jmasm/133584516.
Zea LM, Silva RB, Bourguignon M, Santos AM, Cordeiro GM. The beta exponentiated Pareto distribution with application to bladder cancer susceptibility. Int J Statistics Probability. 2012;2:8–19. doi:10.5539/ijsp.v1n2p8.
Elbatal I. The Kumaraswamy exponentiated Pareto distribution. Econom Quality Control. 2013;28:1–9. doi:10.1515/eqc-2013-0006.
Bourguignon M, Silva RB, Zea LM, Cordeiro GM. The Kumaraswamy Pareto distribution. J Statist Theory Appl. 2013;12:129–44. doi:10.2991/jsta.2013.12.2.1.
Papastathopoulos I, Tawn JA. Extended generalised Pareto models for tail estimation. J Statist Plann Inference. 2013;143(1):131–43. doi:10.1016/j.jspi.2012.07.001.
Mead M. An extended Pareto distribution. Pakistan J Statist Operat Res. 2014;10(3):313–29. doi:10.18187/pjsor.v10i3.766.
Elbatal I, Aryal G. A new generalization of the exponential Pareto distribution. J Inform Optim Sci. 2017;38(5):675–97. doi:10.1080/02522667.2016.1220079.
Korkmaz MC, Altun E, Yousof HM, Afify AZ, Nadarajah S. The Burr X Pareto distribution: properties, applications and VaR estimation. J Risk Financial Manag. 2018;11(1):1–16. doi:10.3390/jrfm11010001.
Ghitany ME, Gómez-Déniz E, Nadarajah S. A new generalization of the Pareto distribution and its application to insurance data. J Risk Financial Manag. 2018;11(1):10. doi:10.3390/jrfm11010010.
Tahir A, Akhter AS, Haq AM. Transmuted new Weibull-Pareto distribution and its applications. Appl Appl Math Int J. 2018;13(1):30–46.
Ihtisham S, Khalil A, Manzoor S, Khan SA, Ali A. Alpha-power Pareto distribution: its properties and applications. PLoS ONE. 2019;14(6):e0218027. doi:10.1371/journal.pone.0218027.
Haj Ahmad H, Almetwally E. Marshall-Olkin generalized Pareto distribution: Bayesian and non Bayesian estimation. Pakistan J Statist Operat Res. 2020;16(1):21–3. doi:10.18187/pjsor.v16i1.2935.
Jayakumar K, Krishnan B, Hamedani GG. On a new generalization of Pareto distribution and its applications. Commun Statist-Simulat Comput. 2020;49(5):1264–84. doi:10.1080/03610918.2018.1494281.
Jayakumar K, Kuttykrishnan AP, Krishnan B. Heavy tailed Pareto distribution: properties and applications. J Data Sci. 2021;18(4):828–45. doi:10.6339/JDS.202010_18(4).0015.
Kayid M, Djemili S. Reliability analysis of the inverse modified Weibull model with applications. Math Probl Eng. 2022;2022:4005896. doi:10.1155/2022/4005896.
Bowley AL. Elements of statistics. London: P.S. King and Son; 1901.
MacGillivray HL. Skewness and asymmetry: measures and orderings. Ann Stat. 1986;14:994–1011. doi:10.1214/aos/1176350046.
Moors J. A quantile alternative for kurtosis. J R Stat Soc D (Statistician). 1988;37(1):25–32. doi:10.2307/2348376.
Lai CD, Xie M. Stochastic ageing and dependence for reliability. New York: Springer; 2006.
Kayid M. Some new results on bathtub-shaped hazard rate models. Math Biosci Eng. 2022;19(2):1239–50. doi:10.3934/mbe.2022057.
Mudholkar GS, Hutson AD. The exponentiated Weibull family: some properties and a flood data application. Commun Statist Theory Methods. 1996;25(12):3059–83. doi:10.1080/03610929608831886.
Merovcia F, Puka L. Transmuted Pareto distribution. ProbStat Forum. 2014;07:1–11.
Dimitrakopoulou T, Adamidis K, Loukas S. A lifetime distribution with an upside-down bathtub-shaped hazard function. IEEE Trans Reliability. 2007;56(2):308–11. doi:10.1109/TR.2007.895304.
© 2022 Mansour Shrahili and Mohamed Kayid, published by De Gruyter
This work is licensed under the Creative Commons Attribution 4.0 International License.