Open Life Sciences

formerly Central European Journal of Biology

Editor-in-Chief: Ratajczak, Mariusz

A best-fit probability distribution for the estimation of rainfall in northern regions of Pakistan

M. T. Amin
• Corresponding author
• Department of Environmental Sciences, COMSATS Institute of Information Technology, Abbottabad, 22060, Pakistan
• Alamoudi Water Research Chair, King Saud University, P.O. Box 2460, Riyadh 11451, Kingdom of Saudi Arabia
/ M. Rizwan
/ A. A. Alazba
• Alamoudi Water Research Chair, King Saud University, P.O. Box 2460, Riyadh 11451, Kingdom of Saudi Arabia
Published Online: 2016-12-12 | DOI: https://doi.org/10.1515/biol-2016-0057

Abstract

This study was designed to find the best-fit probability distribution of annual maximum rainfall based on a twenty-four-hour sample in the northern regions of Pakistan using four probability distributions: normal, log-normal, log-Pearson type-III and Gumbel max. Based on the scores of goodness of fit tests, the normal distribution was found to be the best-fit probability distribution at the Mardan rainfall gauging station. The log-Pearson type-III distribution was found to be the best-fit probability distribution at the rest of the rainfall gauging stations. The maximum values of expected rainfall were calculated using the best-fit probability distributions and can be used by design engineers in future research.

1 Introduction

Pakistan is located at a latitude of 33.6667° N and a longitude of 73.1667° E in South Asia, in the northern and eastern hemispheres. The country experiences a diverse climate throughout the year: the minimum temperature is as low as –25°C in northern areas, and the maximum temperature is as high as 55°C in southern areas. Most of Pakistan has a dry climate, while humid conditions prevail in northern areas. Monsoons and evaporation from western depressions are the sources of rainfall in Pakistan, and monsoons contribute 65 to 75% of the total rainfall. Rainfall, which feeds lakes and rivers, is the most vital natural source of water for humans, animals and crops. Predicting the future occurrence and distribution of rainfall from the amounts received in previous years has proved difficult, and the results have been unreliable. Hydrological events such as rainfall are natural phenomena that are observed at the event scale.

The efficient management and use of water resources can be enhanced by rainfall analyses using probability distributions and annual maximum daily rainfall [1]. The expected rainfall in different return periods is determined through probability and frequency analysis of rainfall data [2]. Reducing flood damage and designing and constructing hydrologic projects such as dams, dykes, and urban drainage systems require reliable data on extreme events with high return periods [3]. Because rainfall varies with time and location, various probability distributions are currently used to predict expected rainfall in different return periods [4]. Frequency analyses of rainfall data have been performed for different return periods [5-9]. Expected rainfall values for different return periods, which may be greater or less than the recorded values, are estimated using a fitted distribution.
The damage caused by storms can be reduced by the precise estimation of extreme rainfall, leading to the efficient design of hydraulic structures. A number of probability models have been developed to depict the distribution of extreme rainfall at a site [3]. The choice of an appropriate distribution model is one of the major problems in engineering practice. The selection mainly depends on the available rainfall data at a particular site. To find a suitable distribution model that will provide accurate estimates of extreme rainfall, it is necessary to evaluate the available distribution models. The probability models most commonly used to estimate rainfall frequency are the normal, log-normal, log-Pearson type-III and Gumbel distributions. The objective of the study is to find the best-fit probability model and perform a probability analysis of 24-hour annual maximum rainfall in northern Pakistan, as rainfall in this area is the main source of water for the irrigation network in the country.

2 Material and methods

Probability distributions are basic concepts in statistics. The results of statistical experiments and their probabilities of occurrence are linked by probability distributions. Rainfall data from northern Pakistan were evaluated with four probability models to find the best-fit model. The probability models used include the normal (N), log-normal (LN), log-Pearson type III (LP3) and Gumbel (EVI) probability models.

2.1 Normal distribution

The normal distribution is the most useful continuous distribution of all the distributions. The probability density function (PDF) and cumulative distribution function (CDF) of the normal distribution are calculated using Eqs. (1) and (2), respectively:

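$f(x)=\frac{\exp\left(-\frac{1}{2}\left(\frac{x-\mu}{\sigma}\right)^{2}\right)}{\sigma\sqrt{2\pi}}$(1)

$F(x)=\Phi\left(\frac{x-\mu}{\sigma}\right)=\frac{1}{2}\left[1+\operatorname{erf}\left(\frac{x-\mu}{\sigma\sqrt{2}}\right)\right]$(2)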

where ‘μ’ is the location parameter, ‘σ’ is the scale parameter and ‘Φ’ is the Laplace integral.

In the normal distribution, the maximum value of expected rainfall (XT) corresponding to any return period (T) can be calculated using Eq. (3):

$X_T=\bar{X}(1+C_vK_T)$(3)

where ‘XT’ is the maximum value of expected rainfall, ‘$\bar{X}$’ is the mean, ‘Cv’ is the coefficient of variation and ‘KT’ is the frequency factor, which depends on the return period and probability distribution. ‘KT’ is calculated using the following equation.

$K_T=\frac{X_T-\mu}{\sigma}$(4)

The frequency factor (KT) is the same as the standard normal variate ‘z’, which is calculated using Eq. (5).

$z=w-\frac{2.515517+0.802853w+0.010328w^{2}}{1+1.432788w+0.189269w^{2}+0.001308w^{3}}$(5)

In Eq. (5), ‘w’ is an intermediate variable that can be expressed as follows:

$w=\left[\ln\left(\frac{1}{p^{2}}\right)\right]^{1/2}\quad(0<p\le 0.5)$(6)

where ‘p’ is the exceedance probability (p = 1/T). When p > 0.5, 1 − p is substituted for ‘p’ in Eq. (6), and the value of ‘z’ computed from Eq. (5) is given a negative sign.
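As an illustration of Eqs. (3), (5) and (6), the following minimal Python sketch computes the frequency factor and the corresponding rainfall estimate for the normal distribution. The function names and the sample values (a mean of 100 mm and a coefficient of variation of 0.3) are hypothetical and are not taken from the study data. The w² coefficient in the numerator is taken as 0.010328, the value in the widely tabulated form of this rational approximation.

```python
import math


def standard_normal_variate(p: float) -> float:
    """Approximate z for exceedance probability p using Eqs. (5) and (6)."""
    if not 0.0 < p < 1.0:
        raise ValueError("p must lie strictly between 0 and 1")
    # Eq. (6) is defined for 0 < p <= 0.5; for p > 0.5 use 1 - p instead.
    q = p if p <= 0.5 else 1.0 - p
    w = math.sqrt(math.log(1.0 / q**2))
    z = w - (2.515517 + 0.802853 * w + 0.010328 * w**2) / (
        1.0 + 1.432788 * w + 0.189269 * w**2 + 0.001308 * w**3
    )
    # For p > 0.5 the variate lies below the mean, so z is negative.
    return z if p <= 0.5 else -z


def normal_rainfall(mean: float, cv: float, T: float) -> float:
    """Eq. (3): X_T = mean * (1 + Cv * K_T), with K_T = z for p = 1/T."""
    k_t = standard_normal_variate(1.0 / T)
    return mean * (1.0 + cv * k_t)


# Hypothetical inputs, for illustration only.
for T in (2, 10, 100):
    print(T, round(normal_rainfall(100.0, 0.3, T), 1))
```

The approximation is accurate to roughly three decimal places, which is usually sufficient for frequency analysis; an exact inverse of the normal CDF could be used instead.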

2.2 Log-normal distribution

The log-normal distribution is the distribution of a random variable whose logarithm is normally distributed; that is, a random variable X is log-normally distributed when Y = ln(X) is normally distributed. The probability density function (PDF) and cumulative distribution function (CDF) of the log-normal distribution are calculated using Eqs. (7) and (8), respectively:

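$f(x)=\frac{\exp\left[-\frac{1}{2}\left(\frac{\ln(x-\gamma)-\mu}{\sigma}\right)^{2}\right]}{(x-\gamma)\sigma\sqrt{2\pi}}$(7)

$F(x)=\Phi\left(\frac{\ln(x-\gamma)-\mu}{\sigma}\right)=\frac{1}{2}\left[\operatorname{erfc}\left\{-\frac{\ln(x-\gamma)-\mu}{\sigma\sqrt{2}}\right\}\right]$(8)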

where ‘μ’ is the shape parameter, ‘σ’ is the scale parameter, ‘γ’ is the location parameter and ‘Φ’ is the Laplace integral.

The log-normal distribution assumes that Y = ln(X); therefore, the maximum value of expected rainfall (XT) corresponding to any return period (T) can be calculated using Eq. (9):

$X_T=\exp(Y_T)$(9)

$Y_T=\bar{Y}(1+C_{vy}K_T)$(10)

$K_T=\frac{Y_T-\mu_y}{\sigma_y}$(11)

where ‘$\bar{Y}$’ and ‘Cvy’ are the mean and coefficient of variation of ‘Y’, respectively. ‘KT’ is the frequency factor, which is the same as the standard normal variate and can be computed using Eq. (5).
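Eqs. (9) to (11) can be sketched in Python as follows. This is a minimal illustration under stated assumptions: the location parameter γ is taken as zero (a two-parameter log-normal), the standard normal variate is obtained from the standard library's exact inverse CDF rather than from Eq. (5), and the function name and sample values are hypothetical.

```python
import math
from statistics import NormalDist, mean, stdev


def lognormal_rainfall(annual_maxima, T):
    """Estimate X_T for return period T (> 1) from a sample of annual maxima."""
    logs = [math.log(x) for x in annual_maxima]   # Y = ln(X)
    y_bar = mean(logs)
    s_y = stdev(logs)                             # sample standard deviation of Y
    # K_T equals the standard normal variate with exceedance probability 1/T.
    k_t = NormalDist().inv_cdf(1.0 - 1.0 / T)
    # Eq. (10): Y_T = y_bar * (1 + Cvy * K_T) = y_bar + K_T * s_y
    y_t = y_bar + k_t * s_y
    return math.exp(y_t)                          # Eq. (9)


# Hypothetical annual maximum 24-hour rainfall depths (mm).
sample = [50.0, 80.0, 100.0, 125.0, 200.0]
for T in (2, 10, 100):
    print(T, round(lognormal_rainfall(sample, T), 1))
```

For T = 2 the frequency factor is zero, so the estimate reduces to the geometric mean of the sample.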

2.3 Log-Pearson type-III distribution

The log-Pearson type-III distribution has been widely and frequently used in hydrology and for hydrologic frequency analyses since the recommendation of this distribution by U.S. federal agencies. The probability density function (PDF) and cumulative distribution function (CDF) of the log-Pearson type-III distribution are calculated using Eqs. (12) and (13), respectively:

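$f(x)=\frac{1}{x\beta\Gamma(\alpha)}\left(\frac{\ln x-\gamma}{\beta}\right)^{\alpha-1}\exp\left(-\frac{\ln x-\gamma}{\beta}\right)$(12)

$F(x)=\frac{\Gamma_{(\ln x-\gamma)/\beta}(\alpha)}{\Gamma(\alpha)}$(13)

where ‘α’, ‘β’ and ‘γ’ are shape, scale and location parameters, respectively, ‘Γ(α)’ is the gamma function, and the numerator of Eq. (13) is the incomplete gamma function evaluated at (ln x − γ)/β.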

In the log-Pearson type-III distribution, the maximum value of expected rainfall (XT) corresponding to any return period (T) can be calculated using Eq. (14):

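$X_T=\operatorname{antilog}(\log X)$(14)

$\log(X)=\bar{X}+K_TS_d$(15)

$K_T=\frac{2}{C_s}\left[\left\{\left(z-\frac{C_s}{6}\right)\frac{C_s}{6}+1\right\}^{3}-1\right]$(16)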

where ‘$\bar{X}$’, ‘Sd’, and ‘Cs’ are the mean, standard deviation and coefficient of skewness of the log-transformed rainfall data, respectively, ‘z’ is the standard normal variate from Eq. (5), and ‘KT’ is the frequency factor.
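The calculation in Eqs. (14) to (16) can be sketched as below. Several choices here are assumptions rather than details given in the text: base-10 logarithms are used, the skewness is the bias-adjusted sample coefficient, the standard normal variate comes from the standard library's inverse CDF instead of Eq. (5), and all names and sample values are hypothetical.

```python
import math
from statistics import NormalDist, mean, stdev


def sample_skewness(values):
    """Bias-adjusted sample coefficient of skewness (needs at least 3 values)."""
    n = len(values)
    m = mean(values)
    s = stdev(values)
    return n * sum((v - m) ** 3 for v in values) / ((n - 1) * (n - 2) * s**3)


def lp3_frequency_factor(z, cs):
    """Eq. (16); as Cs approaches zero, K_T approaches z."""
    if abs(cs) < 1e-9:
        return z
    k = cs / 6.0
    return (2.0 / cs) * (((z - k) * k + 1.0) ** 3 - 1.0)


def lp3_rainfall(annual_maxima, T):
    """Eqs. (14) and (15) applied to base-10 logarithms of the data."""
    logs = [math.log10(x) for x in annual_maxima]
    z = NormalDist().inv_cdf(1.0 - 1.0 / T)
    k_t = lp3_frequency_factor(z, sample_skewness(logs))
    return 10.0 ** (mean(logs) + k_t * stdev(logs))


# Hypothetical annual maximum 24-hour rainfall depths (mm).
sample = [42.0, 55.0, 61.0, 70.0, 88.0, 95.0, 130.0]
for T in (2, 10, 100):
    print(T, round(lp3_rainfall(sample, T), 1))
```

Eq. (16) is an approximation to tabulated frequency factors and is generally considered most reliable for moderate skewness values.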

2.4 Gumbel (EV I) distribution

The Gumbel distribution, named in honor of Emil Gumbel and also known as the Extreme Value Type I (EV I) distribution, is a continuous probability distribution. This distribution can be applied to model maximum or minimum values (extreme values) of a random variable. The probability density function (PDF) and cumulative distribution function (CDF) of the Gumbel distribution are calculated using Eqs. (17) and (18), respectively:

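$f(x)=\frac{1}{\sigma}\exp\left(-\frac{x-\mu}{\sigma}-\exp\left(-\frac{x-\mu}{\sigma}\right)\right)$(17)

$F(x)=\exp\left(-\exp\left(-\frac{x-\mu}{\sigma}\right)\right)$(18)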

where ‘σ’ and ‘μ’ are the scale and location parameters, respectively.

The Gumbel distribution can be used to calculate the maximum value of expected rainfall (XT) corresponding to any return period (T) using Eq. (19):

$X_T=\bar{X}(1+C_vK_T)$(19)

$K_T=-\frac{\sqrt{6}}{\pi}\left[0.5772+\ln\left(\ln\left(\frac{T}{T-1}\right)\right)\right]$(20)

where ‘$\bar{X}$’ is the mean, ‘Cv’ is the coefficient of variation and ‘KT’ is the frequency factor, which depends on the return period (T) and probability distribution.
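A minimal Python sketch of Eqs. (19) and (20) follows. The frequency factor is written in its standard form with a leading minus sign, so that long return periods give positive values of KT. The function names and the sample mean and coefficient of variation are hypothetical.

```python
import math


def gumbel_frequency_factor(T: float) -> float:
    """Eq. (20): Gumbel (EV I) frequency factor for return period T > 1."""
    if T <= 1.0:
        raise ValueError("return period T must be greater than 1")
    return -(math.sqrt(6.0) / math.pi) * (
        0.5772 + math.log(math.log(T / (T - 1.0)))
    )


def gumbel_rainfall(mean: float, cv: float, T: float) -> float:
    """Eq. (19): X_T = mean * (1 + Cv * K_T)."""
    return mean * (1.0 + cv * gumbel_frequency_factor(T))


# Hypothetical inputs, for illustration only.
for T in (2, 10, 100):
    print(T, round(gumbel_frequency_factor(T), 3),
          round(gumbel_rainfall(100.0, 0.3, T), 1))
```

Unlike the normal case, KT is slightly negative at T = 2 because the Gumbel distribution is positively skewed, so its median lies below its mean.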

3 Results and discussion

The northern area of Pakistan is surrounded by the Himalayan, Karakoram, Hindu Kush, and Pamir mountain ranges, with high peaks between 6500 m and 8600 m. Snowmelt from these mountains, combined with glacier melt and monsoon rainfall, contributes to many rivers, most notably the Indus River, on which Pakistan has relied to develop an advanced irrigation canal network. However, the distribution and quantity of monsoon rainfall, which occurs due to seasonal winds and western disturbances, vary widely throughout the year. In northern areas, such as Khyber Pakhtunkhwa and Balochistan provinces, the maximum rainfall occurs from December to March, and in Punjab and Sindh, the maximum rainfall (50-75%) occurs during the monsoon season [10-15].

The 24-hour annual maximum rainfall data from six rainfall gauging stations in northern Pakistan were used in this study. The locations of these stations are shown in Figure 1, and a summary of the statistics is presented in Table 1. These statistical parameters are used to calculate the estimated 24-hour annual maximum rainfall in different return periods using different probability distributions. Of the six selected stations, Oghi has 46 years of rainfall data, spanning from 1961 to 2010. Three stations (Kalam, Daggar and Mardan) have 44 years of rainfall data, spanning from 1962 to 2009, 1963 to 2010 and 1963 to 2010, respectively. Two stations (Puran and Besham Qilla) have 38 years of rainfall data, spanning from 1963 to 2004 and 1969 to 2010, respectively.

Fig. 1

Locations of selected rainfall gauging stations.

Table 1

Summary of statistics from the selected rainfall gauging stations.

The distribution of 24-hour maximum rainfall observed during different months of a year is shown in Figure 2. Figure 2 shows that Kalam and Besham Qilla received 42% and 21%, respectively, of observed rainfall in March. Oghi, Daggar and Puran received 37%, 32% and 23%, respectively, of observed rainfall in July. Mardan received 37% of observed rainfall in August. These results suggest that the maximum rainfall at these selected stations occurred between March and August.

Fig. 2

Distributions of 24-hour annual maximum rainfall in a year.

Four probability distributions (normal, log-normal, log-Pearson type-III and Gumbel) were used in this study. The parameters of probability distributions were calculated using the method of moments and are given in Table 2.
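As a rough illustration of the method of moments, the sketch below computes the sample statistics used in the frequency-factor equations and derives parameters for the normal and Gumbel distributions from them. The Gumbel relations used (variance equal to π²σ²/6 and mean equal to μ + 0.5772σ) are standard results; the function name, the dictionary keys and the sample data are hypothetical, and the exact estimators used in the study are not stated in the text.

```python
import math
from statistics import mean, stdev


def moment_estimates(annual_maxima):
    """Sample moments and moment-based parameters (needs at least 3 values)."""
    n = len(annual_maxima)
    x_bar = mean(annual_maxima)
    s = stdev(annual_maxima)                      # sample standard deviation
    skew = n * sum((x - x_bar) ** 3 for x in annual_maxima) / (
        (n - 1) * (n - 2) * s**3
    )
    # Gumbel (EV I): variance = (pi * sigma)^2 / 6, mean = mu + 0.5772 * sigma
    gumbel_scale = s * math.sqrt(6.0) / math.pi
    gumbel_location = x_bar - 0.5772 * gumbel_scale
    return {
        "mean": x_bar,
        "std": s,
        "cv": s / x_bar,
        "skew": skew,
        "normal_location": x_bar,                 # mu of the normal distribution
        "normal_scale": s,                        # sigma of the normal distribution
        "gumbel_location": gumbel_location,
        "gumbel_scale": gumbel_scale,
    }


# Hypothetical annual maximum 24-hour rainfall depths (mm).
print(moment_estimates([60.0, 80.0, 100.0, 120.0, 140.0]))
```

For the log-normal and log-Pearson type-III distributions, the same moments would be computed on the logarithms of the data.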

Table 2

Parameters of probability distributions at rainfall gauging stations.

The four probability distributions were subjected to three goodness-of-fit tests (the Kolmogorov-Smirnov, chi-squared and Anderson-Darling tests) to determine the best-fitting probability distribution model at each rainfall gauging station. The tests were applied following a standard procedure described by several authors [16-18].

For each goodness-of-fit test, the probability distributions were ranked from one (best fit) to four (least fit).

Selection of the best-fit probability distribution is based on the total score from all the goodness of fit tests. The results of goodness of fit tests at each selected rainfall gauging station and for each probability distribution used in this study are shown in Table 3. Based on the results of the goodness of fit tests, the best-fit probability distribution and mathematical expression for the calculation of rainfall in different return periods at each gauging station are shown in Table 4.
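The ranking step can be sketched as follows. This is only one plausible reading of the scoring: it assumes that a smaller test statistic indicates a better fit, that the total score is the sum of the ranks over the three tests, and that the lowest total identifies the best-fit distribution. The function name and the test statistics are hypothetical and are not the values in Table 3.

```python
def rank_distributions(statistics_by_test):
    """Sum per-test ranks (1 = best fit) and return (totals, best distribution)."""
    totals = {}
    for stats in statistics_by_test.values():
        # Smaller statistic = better fit, so sort in ascending order.
        ordered = sorted(stats, key=stats.get)
        for rank, name in enumerate(ordered, start=1):
            totals[name] = totals.get(name, 0) + rank
    best = min(totals, key=totals.get)
    return totals, best


# Hypothetical goodness-of-fit statistics for one station.
example = {
    "Kolmogorov-Smirnov": {"N": 0.10, "LN": 0.08, "LP3": 0.06, "EVI": 0.09},
    "Anderson-Darling": {"N": 0.90, "LN": 0.50, "LP3": 0.40, "EVI": 0.70},
    "Chi-squared": {"N": 3.0, "LN": 2.0, "LP3": 2.5, "EVI": 4.0},
}
totals, best = rank_distributions(example)
print(totals, best)
```

Ties in a test statistic are broken here by dictionary order, which a real analysis would need to handle explicitly.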

Table 3

Results of goodness of fit tests.

Table 4

Best-fit distributions and mathematical expressions.

The normal distribution provides the best-fit at the Mardan rainfall gauging station, while log-Pearson type-III provides the best-fit at the other rainfall gauging stations analyzed in this study. Probability density functions (PDF) and cumulative distribution functions (CDF) at the rainfall gauging stations were calculated using the best-fit distribution, i.e., the normal distribution at Mardan and the log-Pearson type-III distribution at the rest of the rainfall gauging stations, and are shown in Figures 3 and 4.

Fig. 3

PDFs of probability distributions at rainfall gauging stations.

Fig. 4

CDFs of probability distributions at rainfall gauging stations.

The rainfall estimates or maximum values of expected rainfall (mm) for return periods of 2, 5, 10, 20, 50, 100 and 200 years at the rainfall gauging stations were calculated using the best-fit distribution. The rainfall estimates are given in Table 5.

Table 5

Rainfall estimates at the rainfall gauging stations using the best-fit distribution.

4 Conclusions

Annual maximum rainfall data based on a 24-hour duration at six rainfall-gauging stations in northern Pakistan were used in this study. The purpose of the study was to find the best-fit probability distributions at northern rainfall gauging stations. The maximum values of expected rainfall or rainfall estimates calculated using a probability distribution that does not provide the best-fit may yield values that are higher or lower than the actual values. These calculations may be used to influence decisions relating to local economics and hydrologic safety systems.

The normal distribution provided the best-fit probability distribution at the Mardan rainfall gauging station based on the scores of the goodness of fit tests used in this study. The log-Pearson type-III distribution is the best-fit probability distribution at the rest of the rainfall gauging stations. The expected values of designed rainfall or rainfall estimates calculated using the best-fit probability distributions at the rainfall gauging stations might be used by design engineers to safely and feasibly design hydrologic projects.

Acknowledgements

The project was financially supported by King Saud University, Vice Deanship of Research Chairs.

References

[1] Subudhi R., Probability analysis for prediction of annual maximum daily rainfall of Chakapada block of Kandhamal district in Orissa, Indian J. Soil Conser., 2007, 35, 84-85.

[2] Bhakar S. R., Iqbal M., Devanda M., Chhajed N., Bansal A. K., Probability analysis of rainfall at Kota, Indian J. Agri. Res., 2008, 42, 201-206.

[3] Tao D. Q., Nguyen V. T., Bourque A., On selection of probability distributions for representing extreme precipitations in Southern Quebec, Annual Conference of the Canadian Society for Civil Engineering, 5th-8th June 2002, 1-8.

[4] Upadhaya A., Singh S. R., Estimation of consecutive day’s maximum rainfall by various methods and their comparison, Indian J. Soil Conserv., 1998, 26, 193-201.

[5] Bhakar S. R., Bansal A. N., Chhajed N., Purohit R. C., Frequency analysis of consecutive days maximum rainfall at Banswara, Rajasthan, India, ARPN J. Engg. Appl. Sci., 2006, 1, 64-67.

[6] Barkotulla M. A. B., Rahman M. S., Rahman M. M., Characterization and frequency analysis of consecutive days maximum rainfall at Boalia, Rajshahi and Bangladesh, J. Develop. Agri. Econ., 2009, 1, 121-126.

[7] Nemichandrappa M., Ballakrishnan P., Senthilvel S., Probability and confidence limit analysis of rainfall in Raichur region, Karnataka J. Agri. Sci., 2010, 23, 737-741.

[8] Manikandan M., Thiyagarajan G., Vijayakumar G., Probability analysis for estimating annual One day maximum rainfall in Tamil Nadu Agricultural University, Mad. Agri. J., 2011, 98 (1-3), 69-73.

[9] Vivekanandan N., Intercomparison of extreme value distributions for estimation of ADMR, Int. J. Appl. Engg. Technol., 2012, 2 (1), 30-37.

[10] Kazi S. A., Khan M. L., Variability of rainfall and its bearing on agriculture in the arid and semi-arid zones of West Pakistan, Pak. Geographic Rev., 1951, 6 (1), 40-63.

[11] FAO, Pakistan’s experience in rangeland rehabilitation and improvement, Food and Agriculture Organization of the UNO, 70, 1987.

[12] Khan J. A., The climate of Pakistan, Rehber Publishers, Karachi, Pakistan, 1993.

[13] Khan F. K., Pakistan geography, economy and people, Oxford University Press, Karachi, Pakistan, 2002.

[14] Kureshy K. U., Geography of Pakistan, National Book Service Lahore, Pakistan, 1998.

[15] Luo Q., Lin E., Agricultural vulnerability and adaptation in developing countries: the Asia-Pacific region, Climate Change, 1999, 43, 729-743.

[16] Chowdhury J. U., Stedinger J. R., Goodness of fit tests for regional generalized extreme value flood distributions, Water Res., 1991, 27 (7), 1765-1777.

[17] Adegboye O. S., Ipinyomi R. A., Statistical tables for class work and Examination, Tertiary publications Nigeria Limited, Ilorin, Nigeria, 1995, 5-11.

[18] Murray R. S., Larry J. S., Theory and problems of statistics, 3rd Edition, Tata McGraw-Hill Publishing Company Limited, New Delhi, India, 2000, 314-316.

Accepted: 2016-08-21

Published Online: 2016-12-12

Published in Print: 2016-01-01

Conflict of Interest: The authors declare no conflict of interest.

Citation Information: Open Life Sciences, Volume 11, Issue 1, Pages 432–440, ISSN (Online) 2391-5412,
