Properties and Inference for a New Class of Generalized Rayleigh Distributions with an Application

Abstract In this paper, we introduce a new form of the generalized Rayleigh distribution, called the Alpha Power generalized Rayleigh (APGR) distribution, by following the idea of extending distribution families with the Alpha Power transformation. The introduced distribution has a more general form than both the Rayleigh and generalized Rayleigh distributions and can provide a better fit than either for a wider variety of data sets. We also obtain explicit forms of some important statistical characteristics of the APGR distribution, such as the hazard function, survival function, mode, moments, characteristic function, Shannon and Rényi entropies, stress-strength probability, Lorenz and Bonferroni curves, and order statistics. The statistical inference problem for the APGR distribution is investigated using the maximum likelihood and least-squares methods. The performances of the resulting estimators are compared in terms of bias and mean square error through a Monte-Carlo simulation with small, moderate and large sample sizes. Finally, a real data analysis is given to show how the proposed model works in practice.


Introduction
Classical distribution families have long been used successfully to model real-world data sets. However, it is well known that their performance in modeling complex real-world data sets is not always at the desired level. In recent years, a number of researchers have therefore focused on introducing flexible distribution families for modeling data sets with a wide variety of complex structures, and have made several breakthroughs by proposing various methods for generating continuous distributions, especially lifetime distributions. These generating methods construct a new distribution from a baseline distribution, and the baseline distribution is always a special case of the newly obtained one. Hence, the generated distribution retains the characteristics of the baseline distribution and fits the data at least as well as the baseline distribution. There are numerous papers in the literature that create a new distribution from a baseline distribution and draw attention to its advantages. We refer readers to [1][2][3][4][5] for further information on generating a new distribution family from a baseline distribution.
The Rayleigh distribution, which has a single scale parameter, was originally introduced in a study by Rayleigh on a problem in acoustics. The distribution is well suited to modeling positive-valued and skewed data from many fields such as engineering, biology, the life sciences and reliability. The Rayleigh distribution is related to the Gamma, Weibull, Exponential and Rice distributions. However, it has the disadvantage that a single parameter must describe all of the behavior of the distribution. To overcome this disadvantage, many generalizations of the Rayleigh distribution have been proposed, such as the generalized Rayleigh distribution [6], the transmuted Rayleigh distribution [7], the Weibull Rayleigh distribution [8], the inverted exponentiated Rayleigh distribution [9] and the slashed exponentiated Rayleigh distribution [10]. The generalized Rayleigh distribution is the most widely used among these generalizations. It has in turn been generalized in several recent papers to improve data fit, for example by the Kumaraswamy generalized Rayleigh [11], the beta generalized Rayleigh [12], the slashed generalized Rayleigh [13] and the Marshall-Olkin extended generalized Rayleigh [14] distributions. There are also many published papers on the estimation of the parameters of the Rayleigh and generalized Rayleigh distributions for various data types, see [15][16][17][18][19][20][21][22][23][24].
The main motivation of this paper is to introduce a lifetime distribution that is more flexible than the Rayleigh and generalized Rayleigh distributions and can be used to model data sets with a wide variety of structures. To this end, a new three-parameter extension of the Rayleigh distribution, named the alpha power generalized Rayleigh (APGR) distribution, is derived using the alpha power transform (APT) method recently introduced by Mahdavi and Kundu [5]. Both the Rayleigh and generalized Rayleigh distributions are special cases of the APGR distribution. Therefore, the APGR distribution has more data modeling capability than the Rayleigh and generalized Rayleigh distributions. Further, the APGR distribution is an important alternative to well-known distributions such as the Gamma, Weibull and exponential distributions for modeling data observed from industrial and physical phenomena.
The rest of the paper is organized as follows. In section 2, we introduce the APGR distribution. We discuss some important statistical characteristics of the APGR distribution in section 3. In section 4, the statistical inference problem for the APGR distribution is investigated using the maximum likelihood (ML) and least-squares (LSq) methods. Section 5 presents a comprehensive Monte-Carlo simulation study that examines the performance of the estimators derived in section 4. A real-world data set is analyzed in section 6 for illustrative purposes. Finally, section 7 concludes the paper.

Definition and properties of the APGR Distribution
In this section, we derive the probability density function (pdf) and cumulative distribution function (cdf) of the APGR distribution by using the APT method given in [5] and study some of its distributional properties. Before proceeding, we recall the generalized Rayleigh distribution. The pdf of the generalized Rayleigh distribution is
$$ f_{GR}(x; \beta, \lambda) = 2\beta\lambda^{2}\, x\, e^{-(\lambda x)^{2}} \left(1 - e^{-(\lambda x)^{2}}\right)^{\beta - 1}, \quad x > 0, \tag{1} $$
and its cdf is
$$ F_{GR}(x; \beta, \lambda) = \left(1 - e^{-(\lambda x)^{2}}\right)^{\beta}, \quad x > 0, \tag{2} $$
where β > 0 and λ > 0 are the shape and scale parameters of the distribution, respectively. The generalized Rayleigh distribution was originally studied by Surles and Padgett [6] as the two-parameter Burr Type X distribution. The distribution was later called the generalized Rayleigh distribution by Raqab and Kundu [25]. Now, we introduce the APGR distribution by using the generalized Rayleigh distribution as the baseline distribution in the APT method.

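Definition 1.
A random variable X is said to have an APGR distribution with parameters α > 0, β > 0 and λ > 0 if it has the following pdf and cdf:
$$ f_{APGR}(x; \alpha, \beta, \lambda) = \begin{cases} \dfrac{2\beta\lambda^{2}\ln\alpha}{\alpha - 1}\, x\, e^{-(\lambda x)^{2}} \left(1 - e^{-(\lambda x)^{2}}\right)^{\beta - 1} \alpha^{\left(1 - e^{-(\lambda x)^{2}}\right)^{\beta}}, & \alpha \neq 1, \\ 2\beta\lambda^{2}\, x\, e^{-(\lambda x)^{2}} \left(1 - e^{-(\lambda x)^{2}}\right)^{\beta - 1}, & \alpha = 1, \end{cases} \quad x > 0, \tag{3} $$
and
$$ F_{APGR}(x; \alpha, \beta, \lambda) = \begin{cases} \dfrac{\alpha^{\left(1 - e^{-(\lambda x)^{2}}\right)^{\beta}} - 1}{\alpha - 1}, & \alpha \neq 1, \\ \left(1 - e^{-(\lambda x)^{2}}\right)^{\beta}, & \alpha = 1, \end{cases} \quad x > 0, \tag{4} $$
respectively.
Using the cdf given by equation (4), the survival and hazard functions of the APGR distribution for α ≠ 1 can be written as
$$ S_{APGR}(x; \alpha, \beta, \lambda) = \frac{\alpha - \alpha^{\left(1 - e^{-(\lambda x)^{2}}\right)^{\beta}}}{\alpha - 1} \tag{5} $$
and
$$ h_{APGR}(x; \alpha, \beta, \lambda) = \frac{2\beta\lambda^{2}\ln\alpha\; x\, e^{-(\lambda x)^{2}} \left(1 - e^{-(\lambda x)^{2}}\right)^{\beta - 1} \alpha^{\left(1 - e^{-(\lambda x)^{2}}\right)^{\beta}}}{\alpha - \alpha^{\left(1 - e^{-(\lambda x)^{2}}\right)^{\beta}}}, \tag{6} $$
respectively; for α = 1 they reduce to those of the generalized Rayleigh distribution. From now on, a random variable X that follows the APGR distribution with parameters α, β and λ is denoted by X ∼ APGR (α, β, λ). By equation (11) in [5], the p-th quantile of the APGR distribution, say Qp, is immediately obtained as
$$ Q_{p} = \frac{1}{\lambda} \left\{ -\ln\left[ 1 - \left( \frac{\ln\left(1 + (\alpha - 1)p\right)}{\ln\alpha} \right)^{1/\beta} \right] \right\}^{1/2}. \tag{7} $$
Thus, when α ≠ 1, the median of the APGR distribution is
$$ Q_{0.5} = \frac{1}{\lambda} \left\{ -\ln\left[ 1 - \left( \frac{\ln\left((\alpha + 1)/2\right)}{\ln\alpha} \right)^{1/\beta} \right] \right\}^{1/2}, \tag{8} $$
and when α = 1, the median of the APGR distribution is equal to the median of the generalized Rayleigh distribution.
Now, we discuss the shape of the pdf f_APGR (x; α, β, λ). As x tends to 0 and as x tends to ∞, the pdf f_APGR (x; α, β, λ) behaves as follows: [...]
Proof. The first derivative of the pdf f_APGR (x; α, β, λ) given by equation (3) is [...] When α = 1, that is, when the distribution is a generalized Rayleigh distribution, the mode of the distribution can be obtained by solving the equation [...] When α ≠ 1, the derivative f′_APGR (x; α, β, λ) is a strictly decreasing and continuous function of x, lim x→0+ f′_APGR (x; α, β, λ) is positive, and f′_APGR (x; α, β, λ) takes negative values as x → ∞. Thus, by the intermediate value theorem, f′_APGR (x; α, β, λ) has exactly one zero, and the pdf f_APGR (x; α, β, λ) is unimodal.
We present a figure to illustrate the shape of the APGR distribution. Fig. 1 a, b, c display some of the possible shapes of the pdf of the APGR distribution for different values of the parameters α, β and λ.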
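As a numerical companion to Definition 1, the pdf and cdf can be coded directly. The sketch below assumes the generalized Rayleigh baseline cdf F(x) = (1 - exp(-(λx)²))^β and the alpha power transform of [5], F_APT = (α^F - 1)/(α - 1) for α ≠ 1; the function names are illustrative and do not come from the paper.

```python
import math


def gr_cdf(x, beta, lam):
    """Generalized Rayleigh cdf: (1 - exp(-(lam*x)^2))^beta for x > 0."""
    if x <= 0.0:
        return 0.0
    # -expm1(-z) equals 1 - exp(-z) but stays accurate for small z
    return (-math.expm1(-(lam * x) ** 2)) ** beta


def gr_pdf(x, beta, lam):
    """Generalized Rayleigh pdf, the derivative of gr_cdf."""
    if x <= 0.0:
        return 0.0
    z = (lam * x) ** 2
    return 2.0 * beta * lam ** 2 * x * math.exp(-z) * (-math.expm1(-z)) ** (beta - 1.0)


def apgr_cdf(x, alpha, beta, lam):
    """Alpha power transform of the generalized Rayleigh cdf."""
    g = gr_cdf(x, beta, lam)
    if alpha == 1.0:
        return g  # alpha = 1 recovers the baseline distribution
    return (alpha ** g - 1.0) / (alpha - 1.0)


def apgr_pdf(x, alpha, beta, lam):
    """Alpha power transform of the generalized Rayleigh pdf."""
    g = gr_pdf(x, beta, lam)
    if alpha == 1.0:
        return g
    return math.log(alpha) / (alpha - 1.0) * g * alpha ** gr_cdf(x, beta, lam)


if __name__ == "__main__":
    for x in (0.25, 0.5, 1.0, 2.0):
        print(x, apgr_pdf(x, 2.0, 1.5, 1.2), apgr_cdf(x, 2.0, 1.5, 1.2))
```

Setting alpha = 1 and beta = 1 gives the Rayleigh special case, which is a convenient sanity check for any implementation.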
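Because the quantile function is available in closed form, samples can be drawn by inverse transform. The following sketch assumes the quantile obtained by inverting the APT cdf, Q_p = (1/λ) sqrt(-ln(1 - u^(1/β))) with u = ln(1 + (α - 1)p)/ln α for α ≠ 1 and u = p for α = 1; the helper names and the clamping constant are illustrative choices, not part of the paper.

```python
import math
import random


def apgr_cdf(x, alpha, beta, lam):
    """APGR cdf, used below to check the quantile function."""
    if x <= 0.0:
        return 0.0
    g = (-math.expm1(-(lam * x) ** 2)) ** beta
    return g if alpha == 1.0 else (alpha ** g - 1.0) / (alpha - 1.0)


def apgr_quantile(p, alpha, beta, lam):
    """p-th quantile of the APGR distribution for 0 < p < 1."""
    if not 0.0 < p < 1.0:
        raise ValueError("p must lie strictly between 0 and 1")
    # Step 1: undo the alpha power transform to get the baseline cdf level u
    u = p if alpha == 1.0 else math.log1p((alpha - 1.0) * p) / math.log(alpha)
    # Step 2: invert the generalized Rayleigh cdf; the clamp guards against
    # rounding to exactly 1, which would make the logarithm undefined
    t = min(u ** (1.0 / beta), 1.0 - 2.0 ** -53)
    return math.sqrt(-math.log1p(-t)) / lam


def apgr_sample(n, alpha, beta, lam, rng):
    """Draw n values by inverse transform using the generator rng."""
    out = []
    while len(out) < n:
        p = rng.random()
        if p > 0.0:  # random() may return exactly 0, which is skipped
            out.append(apgr_quantile(p, alpha, beta, lam))
    return out


if __name__ == "__main__":
    rng = random.Random(2024)
    print("median:", apgr_quantile(0.5, 2.0, 1.5, 0.8))
    print("first draws:", apgr_sample(5, 2.0, 1.5, 0.8, rng))
```

A round trip through the cdf, apgr_cdf(apgr_quantile(p, ...), ...), should return p up to rounding error.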

Some Important Characteristics of the APGR Distribution
In this section, the moments, the moment generating function and related measures such as the mean, variance, skewness and kurtosis are obtained for the APGR distribution. In addition, the distribution of order statistics, the stress-strength probability, the Shannon and Rényi entropies, and the Lorenz and Bonferroni curves of the APGR distribution are also obtained. We first introduce Lemma 1, which is used to obtain the moments of the APGR distribution.

Lemma 1.
Let X be a random variable with pdf given by equation (3). For any real numbers a > , b > , L > , r ≥ and δ ≥ , the integral [...] is calculated as [...] where F (.; .; .) denotes the hypergeometric function, see [27].
Proof. See Appendix A for the proof of Lemma 1.
By using Lemma 1, the r-th moment, moment generating function, characteristic function, mean and variance of the APGR distribution are obtained as [...] and σ² = Var (X) = µ′₂ − µ², respectively, where µ′₂ denotes the second raw moment and µ the mean. Now, we derive the central moments and cumulants of the APGR distribution. By using the raw moments given in equation (13), the r-th central moment of the APGR distribution is obtained as follows: [...] Therefore, using the central moments given by equation (17), the second, third and fourth cumulants κ₂, κ₃ and κ₄ can be expressed as κ₂ = µ₂, κ₃ = µ₃ and κ₄ = µ₄ − 3µ₂², respectively. The skewness and kurtosis coefficients of the APGR distribution are calculated by γ₁ = κ₃/κ₂^(3/2) and γ₂ = κ₄/κ₂².
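The moment-based measures above can be cross-checked numerically without the hypergeometric expressions. The sketch below is not the paper's closed form: it assumes the APGR pdf 2βλ² ln α/(α - 1) · x exp(-(λx)²)(1 - exp(-(λx)²))^(β-1) α^((1 - exp(-(λx)²))^β), integrates x^r f(x) with a composite Simpson rule on (0, 10/λ), where the neglected tail mass is negligible, and then applies the standard relations between raw moments, central moments and cumulants. All names are illustrative.

```python
import math


def apgr_pdf(x, alpha, beta, lam):
    """APGR pdf; alpha = 1 gives the generalized Rayleigh pdf."""
    if x <= 0.0:
        return 0.0
    z = (lam * x) ** 2
    g = -math.expm1(-z)  # 1 - exp(-z)
    base = 2.0 * beta * lam ** 2 * x * math.exp(-z) * g ** (beta - 1.0)
    if alpha == 1.0:
        return base
    return math.log(alpha) / (alpha - 1.0) * base * alpha ** (g ** beta)


def simpson(func, a, b, n=2000):
    """Composite Simpson rule with n (even) subintervals."""
    h = (b - a) / n
    total = func(a) + func(b)
    for k in range(1, n):
        total += (4.0 if k % 2 else 2.0) * func(a + k * h)
    return total * h / 3.0


def raw_moment(r, alpha, beta, lam):
    """Numerical r-th raw moment E[X^r] for r >= 1."""
    upper = 10.0 / lam  # the pdf decays like exp(-(lam*x)^2)
    return simpson(lambda x: x ** r * apgr_pdf(x, alpha, beta, lam), 0.0, upper)


def summary(alpha, beta, lam):
    """Mean, variance, skewness and (excess) kurtosis via cumulants."""
    m1, m2, m3, m4 = (raw_moment(r, alpha, beta, lam) for r in (1, 2, 3, 4))
    mu2 = m2 - m1 ** 2
    mu3 = m3 - 3.0 * m1 * m2 + 2.0 * m1 ** 3
    mu4 = m4 - 4.0 * m1 * m3 + 6.0 * m1 ** 2 * m2 - 3.0 * m1 ** 4
    k2, k3, k4 = mu2, mu3, mu4 - 3.0 * mu2 ** 2
    return {"mean": m1, "variance": k2,
            "skewness": k3 / k2 ** 1.5, "kurtosis": k4 / k2 ** 2}


if __name__ == "__main__":
    print(summary(2.0, 1.5, 0.8))
```

For α = β = 1 the results can be compared with the known Rayleigh values, for example a mean of sqrt(π)/(2λ). Accuracy degrades for small β, where the integrand is not smooth at the origin.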

Stress-strength probability
Suppose that X and Y are independent random variables with X ∼ APGR (α₁, β₁, λ₁) and Y ∼ APGR (α₂, β₂, λ₂). In this situation, the stress-strength probability is R = P (Y < X), where Y represents the 'stress' and X represents the 'strength' available to sustain the stress. For the APGR distribution, the stress-strength probability P (Y < X) is obtained as [...] Further, using Lemma 1, we have [...] where [...]
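A numerical value of R is a useful check on any closed-form expression. The sketch below assumes independent X and Y, so that R = P(Y < X) equals the integral of f_X(x) F_Y(x) over x > 0, and evaluates it with a composite Simpson rule; it does not use Lemma 1, and the function names are illustrative.

```python
import math


def apgr_cdf(x, alpha, beta, lam):
    """APGR cdf."""
    if x <= 0.0:
        return 0.0
    g = (-math.expm1(-(lam * x) ** 2)) ** beta
    return g if alpha == 1.0 else (alpha ** g - 1.0) / (alpha - 1.0)


def apgr_pdf(x, alpha, beta, lam):
    """APGR pdf."""
    if x <= 0.0:
        return 0.0
    z = (lam * x) ** 2
    g = -math.expm1(-z)
    base = 2.0 * beta * lam ** 2 * x * math.exp(-z) * g ** (beta - 1.0)
    if alpha == 1.0:
        return base
    return math.log(alpha) / (alpha - 1.0) * base * alpha ** (g ** beta)


def stress_strength(strength, stress, n=4000):
    """R = P(Y < X) for X ~ APGR(*strength) and Y ~ APGR(*stress).

    Both arguments are (alpha, beta, lam) tuples; n must be even.
    """
    upper = 10.0 / strength[2]  # f_X is negligible beyond this point
    h = upper / n

    def integrand(x):
        return apgr_pdf(x, *strength) * apgr_cdf(x, *stress)

    total = integrand(0.0) + integrand(upper)
    for k in range(1, n):
        total += (4.0 if k % 2 else 2.0) * integrand(k * h)
    return total * h / 3.0


if __name__ == "__main__":
    print(stress_strength((2.0, 2.0, 1.0), (0.5, 1.5, 1.5)))
```

Two simple consistency checks are available: identical parameter tuples must give R = 1/2, and swapping the roles of X and Y must give 1 - R.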

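Shannon and Rényi Entropies
Entropy is an important measure of the variation or uncertainty of a random variable. In this section, we investigate the Shannon and Rényi entropies of the APGR distribution. The Shannon entropy of a random variable X with pdf f (x) is defined as, see [26],
$$ H(X) = -E\left[\ln f(X)\right] = -\int_{-\infty}^{\infty} f(x) \ln f(x)\, dx. \tag{23} $$
Hence, the Shannon entropy of the APGR (α, β, λ) distribution is obtained as [...] By applying Lemma 1 to equation (24), the Shannon entropy H (X) for α ≠ 1 is written as
$$ H(X) = -\ln\left(\frac{2\beta\lambda^{2}\ln\alpha}{\alpha - 1}\right) - \Upsilon_{X} + \lambda^{2} E\left(X^{2}\right) - (\beta - 1)\,\vartheta_{X} - \ln(\alpha)\,\varsigma_{X}, \tag{25} $$
where Υ_X = E [ln (X)], ϑ_X = E [ln (1 − e^{−(λX)²})] and ς_X = E [(1 − e^{−(λX)²})^β], and these expectations can be easily calculated numerically. Now, we calculate the Rényi entropy of the APGR distribution. We first recall the definition of the Rényi entropy. The Rényi entropy of order δ (δ > 0, δ ≠ 1) of a random variable X with pdf f is given by
$$ H_{\delta}(X) = \frac{1}{1 - \delta} \ln \int_{-\infty}^{\infty} f(x)^{\delta}\, dx. \tag{26} $$
By using the pdf (3) in equation (26), the Rényi entropy of the APGR distribution is obtained as [...]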
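The text notes that the expectations entering the Shannon entropy can be calculated numerically. The sketch below illustrates one way to do so under stated assumptions: it takes the APGR pdf with normalizing constant C = 2βλ² ln α/(α - 1), uses the identity H(X) = -ln C - E[ln X] + λ² E[X²] - (β - 1) E[ln(1 - exp(-(λX)²))] - ln α · E[(1 - exp(-(λX)²))^β], which follows from taking -E[ln f(X)] term by term, and approximates each expectation with a midpoint rule. A direct evaluation of -E[ln f(X)] is included for comparison. Function names are illustrative.

```python
import math


def apgr_pdf(x, alpha, beta, lam):
    """APGR pdf for x > 0."""
    z = (lam * x) ** 2
    g = -math.expm1(-z)
    base = 2.0 * beta * lam ** 2 * x * math.exp(-z) * g ** (beta - 1.0)
    if alpha == 1.0:
        return base
    return math.log(alpha) / (alpha - 1.0) * base * alpha ** (g ** beta)


def expect(func, alpha, beta, lam, n=20000):
    """E[func(X)] by the midpoint rule on (0, 10/lam).

    The midpoint rule never evaluates func at x = 0, where ln(x) is undefined.
    """
    h = 10.0 / lam / n
    total = 0.0
    for k in range(n):
        x = (k + 0.5) * h
        total += func(x) * apgr_pdf(x, alpha, beta, lam)
    return total * h


def shannon_entropy(alpha, beta, lam):
    """Shannon entropy assembled from the individual expectations."""
    def g(x):
        return -math.expm1(-(lam * x) ** 2)

    upsilon = expect(math.log, alpha, beta, lam)                   # E[ln X]
    theta = expect(lambda x: math.log(g(x)), alpha, beta, lam)     # E[ln(1 - e^{-(lam X)^2})]
    sigma = expect(lambda x: g(x) ** beta, alpha, beta, lam)       # E[(1 - e^{-(lam X)^2})^beta]
    second = expect(lambda x: x * x, alpha, beta, lam)             # E[X^2]
    log_c = math.log(2.0 * beta * lam ** 2)
    if alpha != 1.0:
        log_c += math.log(math.log(alpha) / (alpha - 1.0))
    return (-log_c - upsilon + lam ** 2 * second
            - (beta - 1.0) * theta - math.log(alpha) * sigma)


def shannon_entropy_direct(alpha, beta, lam):
    """Direct evaluation of -E[ln f(X)] for comparison."""
    return expect(lambda x: -math.log(apgr_pdf(x, alpha, beta, lam)), alpha, beta, lam)


if __name__ == "__main__":
    print(shannon_entropy(2.0, 1.5, 0.8), shannon_entropy_direct(2.0, 1.5, 0.8))
```

For α = β = 1 the result can be compared with the known Rayleigh entropy 1 + ln(σ/√2) + γ/2, where σ = 1/(λ√2) and γ is the Euler-Mascheroni constant.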

Lorenz and Bonferroni Curves
Lorenz and Bonferroni curves are two graphical representations used to measure the inequality of the distribution of a random variable. The Lorenz and Bonferroni curves for a random variable X are defined as the plots of
$$ L(p) = \frac{1}{\mu} \int_{0}^{q} x f(x)\, dx \tag{28} $$
and
$$ B(p) = \frac{1}{p\mu} \int_{0}^{q} x f(x)\, dx, \tag{29} $$
respectively, against p = F (q), where µ denotes the expectation of the random variable X and q = F⁻¹ (p); L (p) and B (p) are called the Lorenz index and Bonferroni index, respectively. If the expectation (16) and pdf (3) are used in equation (28), the Lorenz index of the APGR distribution is obtained as [...] Following the steps of the proof of Lemma 1, the Lorenz index (30) is immediately written as [...] where erf(.) denotes the error function, see [27]. Similarly, the Bonferroni index of the APGR distribution is obtained as [...]
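The two indices can also be evaluated numerically from their definitions. The sketch below assumes L(p) = (1/µ) ∫₀^q x f(x) dx with q the p-th quantile and B(p) = L(p)/p, and uses a midpoint rule rather than the error-function expression; names are illustrative.

```python
import math


def apgr_pdf(x, alpha, beta, lam):
    """APGR pdf for x > 0."""
    z = (lam * x) ** 2
    g = -math.expm1(-z)
    base = 2.0 * beta * lam ** 2 * x * math.exp(-z) * g ** (beta - 1.0)
    if alpha == 1.0:
        return base
    return math.log(alpha) / (alpha - 1.0) * base * alpha ** (g ** beta)


def apgr_quantile(p, alpha, beta, lam):
    """p-th quantile, obtained by inverting the APGR cdf."""
    u = p if alpha == 1.0 else math.log1p((alpha - 1.0) * p) / math.log(alpha)
    t = min(u ** (1.0 / beta), 1.0 - 2.0 ** -53)  # keep the log argument positive
    return math.sqrt(-math.log1p(-t)) / lam


def partial_mean(q, alpha, beta, lam, n=2000):
    """Integral of x f(x) over (0, q) by the midpoint rule."""
    h = q / n
    total = 0.0
    for k in range(n):
        x = (k + 0.5) * h
        total += x * apgr_pdf(x, alpha, beta, lam)
    return total * h


def lorenz(p, alpha, beta, lam):
    """Lorenz index L(p): share of the mean carried by the lowest 100p percent."""
    mu = partial_mean(10.0 / lam, alpha, beta, lam)  # numerical mean
    q = apgr_quantile(p, alpha, beta, lam)
    return partial_mean(q, alpha, beta, lam) / mu


def bonferroni(p, alpha, beta, lam):
    """Bonferroni index B(p) = L(p) / p."""
    return lorenz(p, alpha, beta, lam) / p


if __name__ == "__main__":
    for p in (0.25, 0.5, 0.75):
        print(p, lorenz(p, 2.0, 1.5, 0.8), bonferroni(p, 2.0, 1.5, 0.8))
```

In the Rayleigh special case α = β = λ = 1, the Lorenz index reduces to erf(q) - (2/√π) q exp(-q²) with q = sqrt(-ln(1 - p)), which gives an independent check involving the error function.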

Inference
In this section, we consider the statistical inference problem for the APGR (α, β, λ) distribution. We employ the ML and LSq methods to obtain estimators of the unknown parameters α, β and λ.

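ML estimation
Let X₁, X₂, ..., Xₙ be a random sample from the APGR (α, β, λ) distribution with α ≠ 1. The log-likelihood function of the random variables X_i, i = 1, 2, ..., n, can be written from equation (3) as
$$ \ell(\alpha, \beta, \lambda) = n \ln\left(\frac{2\beta\lambda^{2}\ln\alpha}{\alpha - 1}\right) + \sum_{i=1}^{n} \ln x_{i} - \lambda^{2} \sum_{i=1}^{n} x_{i}^{2} + (\beta - 1) \sum_{i=1}^{n} \ln\left(1 - e^{-(\lambda x_{i})^{2}}\right) + \ln\alpha \sum_{i=1}^{n} \left(1 - e^{-(\lambda x_{i})^{2}}\right)^{\beta}. \tag{33} $$
Thus, by differentiating the log-likelihood function given in equation (33) with respect to the parameters α, β and λ, we can write the following likelihood equations:
$$ \frac{\partial \ell}{\partial \alpha} = \frac{n}{\alpha \ln\alpha} - \frac{n}{\alpha - 1} + \frac{1}{\alpha} \sum_{i=1}^{n} \left(1 - e^{-(\lambda x_{i})^{2}}\right)^{\beta} = 0, \tag{34} $$
$$ \frac{\partial \ell}{\partial \beta} = \frac{n}{\beta} + \sum_{i=1}^{n} \ln\left(1 - e^{-(\lambda x_{i})^{2}}\right) + \ln\alpha \sum_{i=1}^{n} \left(1 - e^{-(\lambda x_{i})^{2}}\right)^{\beta} \ln\left(1 - e^{-(\lambda x_{i})^{2}}\right) = 0 \tag{35} $$
and
$$ \frac{\partial \ell}{\partial \lambda} = \frac{2n}{\lambda} - 2\lambda \sum_{i=1}^{n} x_{i}^{2} + 2\lambda(\beta - 1) \sum_{i=1}^{n} \frac{x_{i}^{2} e^{-(\lambda x_{i})^{2}}}{1 - e^{-(\lambda x_{i})^{2}}} + 2\lambda\beta \ln\alpha \sum_{i=1}^{n} x_{i}^{2} e^{-(\lambda x_{i})^{2}} \left(1 - e^{-(\lambda x_{i})^{2}}\right)^{\beta - 1} = 0. \tag{36} $$
Unfortunately, the ML estimators of the parameters α, β and λ cannot be derived explicitly from equations (34), (35) and (36). However, the ML estimates of the parameters α, β and λ, say α̂_ML, β̂_ML and λ̂_ML, respectively, can be obtained from the simultaneous numerical solution of equations (34), (35) and (36).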
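Since the likelihood equations have no explicit solution, the estimates have to be found numerically. The sketch below is one possible approach and not the procedure used in the paper: it minimizes the negative log-likelihood implied by the APGR pdf over (ln α, ln β, ln λ), which keeps all three parameters positive, using a simple derivative-free compass search. The optimizer, its tuning constants and all function names are assumptions made for illustration; in practice a library optimizer would normally be preferred.

```python
import math
import random


def apgr_quantile(p, alpha, beta, lam):
    """Inverse cdf, used here only to simulate a test sample."""
    u = p if alpha == 1.0 else math.log1p((alpha - 1.0) * p) / math.log(alpha)
    t = min(u ** (1.0 / beta), 1.0 - 2.0 ** -53)  # keep the log argument positive
    return math.sqrt(-math.log1p(-t)) / lam


def neg_log_lik(theta, xs):
    """Negative APGR log-likelihood at theta = (ln alpha, ln beta, ln lam)."""
    try:
        t, beta, lam = theta[0], math.exp(theta[1]), math.exp(theta[2])
        # ln(ln(alpha) / (alpha - 1)) written with t = ln(alpha); limit 0 at t = 0
        log_c = math.log(t / math.expm1(t)) if t != 0.0 else 0.0
        ll = len(xs) * (math.log(2.0 * beta) + 2.0 * theta[2] + log_c)
        for x in xs:
            z = (lam * x) ** 2
            g = -math.expm1(-z)  # 1 - exp(-z)
            ll += math.log(x) - z + (beta - 1.0) * math.log(g) + t * g ** beta
        return -ll
    except (ValueError, OverflowError):
        return math.inf  # treat numerically invalid points as infinitely bad


def compass_search(f, start, step=0.5, tol=1e-4, max_iter=500):
    """Derivative-free minimization: try +/- step on each coordinate,
    accept any improvement, and halve the step when none is found."""
    x, fx = list(start), f(start)
    for _ in range(max_iter):
        if step < tol:
            break
        improved = False
        for i in range(len(x)):
            for d in (step, -step):
                y = list(x)
                y[i] += d
                fy = f(y)
                if fy < fx:
                    x, fx, improved = y, fy, True
                    break
        if not improved:
            step *= 0.5
    return x, fx


def fit_ml(xs, start=(1.0, 1.0, 1.0)):
    """Return ((alpha, beta, lam), negative log-likelihood) from a start value."""
    theta0 = [math.log(v) for v in start]
    theta, nll = compass_search(lambda th: neg_log_lik(th, xs), theta0)
    return tuple(math.exp(v) for v in theta), nll


if __name__ == "__main__":
    rng = random.Random(0)
    data = [apgr_quantile(1.0 - rng.random(), 2.0, 1.5, 0.8) for _ in range(200)]
    print(fit_ml(data))
```

The search only ever accepts points that lower the objective, so the returned value is never worse than the starting point, but it may stop at a local optimum; trying several starting values, for instance LSq estimates, is a sensible precaution.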

LSq Estimation
The LSq estimation method was first used by Swain et al. [28] as a nonlinear method for estimating the parameters of the Beta distribution. In particular, when the maximum likelihood estimators cannot be obtained in explicit form, the LSq estimates are useful as initial values for the numerical methods used to obtain the maximum likelihood estimates. The LSq estimates of the parameters α, β and λ, say α̂_LSq, β̂_LSq and λ̂_LSq, respectively, are obtained by minimizing
$$ Q(\alpha, \beta, \lambda) = \sum_{i=1}^{n} \left[ F_{APGR}\left(x_{(i)}; \alpha, \beta, \lambda\right) - \frac{i}{n + 1} \right]^{2} \tag{37} $$
with respect to α, β and λ, where x_(1) ≤ x_(2) ≤ ... ≤ x_(n) denote the ordered observations. Note that both the LSq estimates and the ML estimates of the unknown parameters must be obtained using numerical methods.
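A least-squares fit can be sketched in the same spirit. The code below assumes the usual Swain-type criterion, the sum over i of (F(x_(i)) - i/(n + 1))² for the ordered sample, and minimizes it over (ln α, ln β, ln λ) with a basic compass search. The optimizer and the function names are illustrative assumptions rather than the paper's implementation.

```python
import math
import random


def apgr_quantile(p, alpha, beta, lam):
    """Inverse cdf, used here only to simulate a test sample."""
    u = p if alpha == 1.0 else math.log1p((alpha - 1.0) * p) / math.log(alpha)
    t = min(u ** (1.0 / beta), 1.0 - 2.0 ** -53)
    return math.sqrt(-math.log1p(-t)) / lam


def apgr_cdf_t(x, t, beta, lam):
    """APGR cdf with t = ln(alpha); expm1 keeps it stable near alpha = 1."""
    g = (-math.expm1(-(lam * x) ** 2)) ** beta
    return g if t == 0.0 else math.expm1(t * g) / math.expm1(t)


def lsq_objective(theta, xs_sorted):
    """Sum of squared gaps between the model cdf and the plotting positions."""
    try:
        t, beta, lam = theta[0], math.exp(theta[1]), math.exp(theta[2])
        n = len(xs_sorted)
        return sum((apgr_cdf_t(x, t, beta, lam) - i / (n + 1.0)) ** 2
                   for i, x in enumerate(xs_sorted, start=1))
    except (ValueError, OverflowError, ZeroDivisionError):
        return math.inf  # numerically invalid points are rejected


def compass_search(f, start, step=0.5, tol=1e-4, max_iter=500):
    """Try +/- step on each coordinate; halve the step when nothing improves."""
    x, fx = list(start), f(start)
    for _ in range(max_iter):
        if step < tol:
            break
        improved = False
        for i in range(len(x)):
            for d in (step, -step):
                y = list(x)
                y[i] += d
                fy = f(y)
                if fy < fx:
                    x, fx, improved = y, fy, True
                    break
        if not improved:
            step *= 0.5
    return x, fx


def fit_lsq(xs, start=(1.0, 1.0, 1.0)):
    """Return ((alpha, beta, lam), objective value) from a start value."""
    ordered = sorted(xs)
    theta0 = [math.log(v) for v in start]
    theta, value = compass_search(lambda th: lsq_objective(th, ordered), theta0)
    return tuple(math.exp(v) for v in theta), value


if __name__ == "__main__":
    rng = random.Random(0)
    data = [apgr_quantile(1.0 - rng.random(), 2.0, 1.5, 0.8) for _ in range(200)]
    print(fit_lsq(data))
```

The fitted values can be passed as the start argument of a likelihood-based routine, which mirrors the role of LSq estimates as initial values described above.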

Monte-Carlo Simulation Study
In this section, simulation studies are presented in order to compare the estimation efficiencies of the ML and LSq estimators obtained in the previous section. Two different cases, α < 1 and α > 1, are considered.
In the first case, the parameter α is chosen as . and the values of the parameters β and λ are set as β = . , , and λ = . , , , respectively. The ML and LSq estimates of the parameters (Est.) are obtained with the simulations performed by replications for the different sample sizes n = , , and . In addition, the bias (Bias) and mean-squared error (MSE) values of the ML and LSq estimators are obtained. The simulated results are given in Table 1.
For the second case of the simulation study, the α parameter is set as . The values of the parameters β and λ are again chosen as β = . , , and λ = . , , , respectively, as in the first case. The simulated results are given in Table 2.
The results in Tables 1 and 2 show that, as the sample size n increases, both sets of estimates move closer to the actual parameter values, and the bias and MSE values of the ML and LSq estimators decrease in all cases. Furthermore, in both cases, the ML estimators outperform the LSq estimators in terms of MSE according to the results given in Tables 1 and 2.
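The bias and MSE criteria used in Tables 1 and 2 can be illustrated with a deliberately simplified experiment. The sketch below does not reproduce the paper's study: it fixes α = β = 1, so that the model is the Rayleigh distribution with cdf 1 - exp(-(λx)²), for which the ML estimator of λ has the closed form sqrt(n / Σ x_i²). The sample sizes, replication count and seed are arbitrary illustrative choices.

```python
import math
import random


def rayleigh_sample(n, lam, rng):
    """Inverse-transform draws from F(x) = 1 - exp(-(lam*x)^2)."""
    return [math.sqrt(-math.log(1.0 - rng.random())) / lam for _ in range(n)]


def ml_lambda(xs):
    """Closed-form ML estimate of lam in the alpha = beta = 1 special case."""
    return math.sqrt(len(xs) / sum(x * x for x in xs))


def bias_and_mse(n, lam, reps, rng):
    """Monte-Carlo estimates of the bias and MSE of ml_lambda at sample size n."""
    estimates = [ml_lambda(rayleigh_sample(n, lam, rng)) for _ in range(reps)]
    bias = sum(estimates) / reps - lam
    mse = sum((e - lam) ** 2 for e in estimates) / reps
    return bias, mse


if __name__ == "__main__":
    rng = random.Random(12345)
    for n in (20, 50, 200):
        bias, mse = bias_and_mse(n, 1.5, 1000, rng)
        print(f"n={n:4d}  bias={bias:+.4f}  mse={mse:.5f}")
```

For the full three-parameter model, ml_lambda would be replaced by a numerical ML or LSq routine and the bias and MSE would be tracked for each of α, β and λ; the bookkeeping stays the same.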

Application to Real Data
In this section, we present an analysis of a real-life data set, the coal mining disaster data set, to illustrate the modeling behavior of the APGR distribution in comparison with the Rayleigh and generalized Rayleigh distributions. The data set consists of 191 observations of the intervals in days between successive coal mining disasters in Great Britain [29].
First, we investigate the underlying distribution of the data set. We apply the Kolmogorov-Smirnov (KS) test to check whether this data set follows the APGR distribution and the most popular lifetime distributions such as the Rayleigh, generalized Rayleigh, exponential, Gamma, Weibull and Log-Normal distributions. The computed values of the KS statistic and the corresponding p-values for each model are tabulated in Table 3.
From Table 3, we can say that the coal mining disaster data set is compatible with the APGR, Gamma, Weibull and Log-Normal distributions. Now, we fit the APGR, Gamma, Weibull and Log-Normal distributions to the coal mining disaster data set and obtain the negative log-likelihood (Neg. Log-Lik) and Akaike information criterion (AIC) values in order to decide on the best model for this data set. The ML and LSq estimates of the parameters, together with the obtained AIC and Neg. Log-Lik values, are summarized in Table 4.
According to Table 4, the APGR distribution gives a better fit to the data set than the Weibull, Gamma and Log-Normal distributions, since it has smaller AIC and Neg. Log-Lik values. The fitting performance of the APGR distribution can be seen in Figure 2, which plots the empirical cdf (ecdf) together with the cdf fitted by the APGR distribution. As can be seen from Figure 2, the fitted cdf closely follows the empirical cdf of the observations, which is the desired outcome in real-life applications.
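The two model-selection quantities used here are straightforward to compute once a model has been fitted. The sketch below shows the KS distance between an empirical cdf and a candidate cdf, and AIC = 2k + 2 × (negative log-likelihood) for a model with k parameters. It runs on simulated data rather than the coal mining data, does not compute p-values, and uses illustrative function names.

```python
import math
import random


def apgr_cdf(x, alpha, beta, lam):
    """APGR cdf."""
    if x <= 0.0:
        return 0.0
    g = (-math.expm1(-(lam * x) ** 2)) ** beta
    return g if alpha == 1.0 else (alpha ** g - 1.0) / (alpha - 1.0)


def apgr_quantile(p, alpha, beta, lam):
    """Inverse cdf, used here only to simulate example data."""
    u = p if alpha == 1.0 else math.log1p((alpha - 1.0) * p) / math.log(alpha)
    t = min(u ** (1.0 / beta), 1.0 - 2.0 ** -53)
    return math.sqrt(-math.log1p(-t)) / lam


def ks_statistic(xs, cdf):
    """Largest vertical gap between the empirical cdf of xs and cdf."""
    ordered = sorted(xs)
    n = len(ordered)
    d = 0.0
    for i, x in enumerate(ordered, start=1):
        f = cdf(x)
        # the empirical cdf jumps from (i - 1)/n to i/n at x
        d = max(d, i / n - f, f - (i - 1) / n)
    return d


def aic(neg_log_lik, k):
    """Akaike information criterion for a model with k fitted parameters."""
    return 2.0 * k + 2.0 * neg_log_lik


if __name__ == "__main__":
    rng = random.Random(5)
    params = (2.0, 1.5, 0.8)  # simulated example, not estimates from real data
    data = [apgr_quantile(1.0 - rng.random(), *params) for _ in range(191)]
    print("KS distance:", ks_statistic(data, lambda x: apgr_cdf(x, *params)))
    print("AIC for Neg. Log-Lik = 100 and k = 3:", aic(100.0, 3))
```

When the parameters are estimated from the same data, standard KS critical values are only approximate, so the statistic is best read as a descriptive measure of fit alongside AIC.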

Conclusion
In this study, a new lifetime distribution named the APGR distribution is introduced. The pdf and cdf of the introduced distribution are derived using the APT method. The behavior of the pdf of the APGR distribution is displayed in Figure 1 for different values of the model parameters. Expressions for basic characteristics of the APGR distribution, such as the hazard function, survival function, moments, characteristic function, skewness, kurtosis, order statistics, Shannon entropy, stress-strength probability, and Lorenz and Bonferroni curves, are derived in the paper. The estimators of the model parameters α, β and λ are obtained using two different methods, ML and LSq. The efficiencies of the ML and LSq estimators are compared through comprehensive simulation studies with small, moderate and large sample sizes. The simulation results show that the efficiencies of both estimators are quite satisfactory according to the bias and MSE criteria for all sample sizes. Further, the simulation results suggest that the ML and LSq estimators are asymptotically unbiased and consistent, since both the bias and MSE values approach zero as the sample size increases.
The APGR distribution presents a better fit to the coal mining disaster data than the Gamma, Weibull and Log-Normal distributions, with smaller Neg. Log-Lik and AIC values. Thus, we can say that the APGR distribution provides a preferable modeling performance for lifetime data and is a powerful alternative to well-known lifetime distributions such as the Gamma, Weibull and Log-Normal distributions. Further, the real data application carried out with the coal mining disaster data set indicates that the APGR distribution is more flexible than its baseline distributions, the generalized Rayleigh and Rayleigh distributions: according to the results of the KS test given in Table 3, the APGR distribution is a suitable model for the coal mining disaster data set, whereas the generalized Rayleigh and Rayleigh distributions are not. Therefore, it can be said that the APGR distribution is capable of modeling more types of data than the baseline generalized Rayleigh and Rayleigh distributions.