In the World Energy Trilemma Index 2016 report, the World Energy Council bases its definition of energy sustainability on three core dimensions: energy security, energy equity, and environmental sustainability (https://www.worldenergy.org/wp-content/uploads/2016/10/Full-report_Energy-Trilemma-Index-2016.pdf). This study focuses on energy security, which the World Energy Council defines in the World Energy Trilemma Index 2016 as the “effective management of primary energy supply from domestic and external sources, reliability of energy infrastructure, and ability of energy providers to meet current and future demand”. The World Energy Council used the indicators given in Table 1 to calculate the energy security grade levels.
After the data for each indicator in Table 1 were collected and verified, a score was calculated for each indicator; the indicator-level results were standardized using z-scores and then rescaled to the range 0–100. The balance score grade for energy security was assigned from A to D based on the mean and standard deviation of energy security in the World Energy Trilemma Index 2016.
In this study, following the World Energy Council’s classification stated above, energy security is taken as an ordinal response variable from the multinomial distribution with four ordered categories: the A, B, C, and D grade levels.
To investigate statistically significant effects on world energy security using the generalized linear model (GLM) approach, 61 countries are taken as subjects: 15 from Asia, 26 from Europe, 8 from Latin America and the Caribbean, 8 from the Middle East and North Africa, 3 from North America, and 1 from Sub-Saharan Africa.
Some studies on energy security in the literature are summarized below.
Jacobson reviewed and ranked major proposed energy-related solutions to global warming, air pollution mortality, and energy security. He also estimated CO2 emissions due to leakage for different residence times of carbon dioxide stored in a geological formation by using exponential equations over time. Kruyt et al. provided an overview of available indicators for the long-term security of energy supply, using computation methods such as dual-concept diversity indices, mean-variance portfolio theory, and the Shannon index. Winzer reviewed the multitude of definitions of energy security, characterized according to some basic concepts. He calculated the system average interruption duration index for electricity and heat supply. He also estimated the total gross domestic product loss of Austria, Italy, and the U.K. due to electricity interruptions by an additive mathematical model with multiplicative terms. Chester examined energy security and its complex nature in full detail without giving any mathematical model. Sovacool and Mukherjee provided a synthesized and workable framework for analyzing national energy security policies and performance in terms of energy security indicators and metrics. They also summarized the demographic details of an energy security survey by using pie charts and bar graphs. Bahgat reviewed the concept of energy security by examining Europe’s energy mix and then analyzed European efforts to establish and strengthen energy partnerships in all aspects without giving any mathematical model. Löschel et al. suggested an additional dimension along which indicators of energy security can be classified, calculating an energy security market concentration measure and ex-post and ex-ante indicators of energy security. Skodras et al. suggested, by using nonlinear regression analysis, that production of coal-based synthetic natural gas might be a good alternative with respect to carbon intensity in energy generation and energy security.
Yergin investigated the concept of energy security very broadly in terms of the return of energy security, the dimensions of energy security, the limits of energy independence, and the strategic significance of energy security. Awerbuch investigated the effect of mitigating fossil price volatility on energy security by using a portfolio equation and calculating portfolio risks for the EU, the US, and Mexico. Sovacool and Brown analyzed the energy security performance of the US and 21 OECD countries based on availability, affordability, efficiency, and environmental stewardship between 1970 and 2007. Cherp and Jewell examined energy security from three different perspectives: sovereignty, robustness, and resilience. Belkin examined the energy security challenges of the EU member states and the efforts of the EU to determine its energy strategy policy. Ang et al. surveyed 104 studies in the literature covering 83 different energy security definitions together with energy security indicators and indexes. Further literature studies on energy can be found in [15, 16, 17, 18, 19, 20]. A detailed examination of the energy security literature indicates that GLM-based findings on energy security with respect to the different energy components investigated in this study have not been reported before. This study is therefore expected to make a significant contribution to the literature.
Based on the importance of energy security for the energy sustainability of the world’s energy performance, the main motivation of this study is to provide a new statistical evaluation of energy security in terms of energy imports, energy use per capita, consumption of energy sources, electricity production from energy sources, and some energy reserves, using a GLM for an ordinal response variable under different cumulative link functions. From this point of view, the statistical assessment of energy security presented in this study has not been included in any of the previous studies in the literature.
The statistical contribution of this study is to highlight the importance of selecting the most appropriate cumulative link function between the systematic component and the cumulative probabilities of the ordinal response variable in the GLM, using major information criteria such as AIC, AICC, BIC, and CAIC, and then to emphasize the importance of this selection for the accuracy of parameter estimates, confidence intervals, and hypothesis tests in the systematic component of the GLM for the multinomially distributed response variable. The GLM is needed for the energy security data because it flexibly models an ordinal response variable from the exponential family, which includes the multinomial distribution, as in this study.
2 Generalized linear model for ordinal response variable coming from multinomial distribution
In this study, attention is paid to regression models for the analysis of an ordinal response variable with more than two ordered response categories. The multinomial distribution is an extension of the binomial distribution to more than two response categories.
Consider a response variable Yi; i = 1,2,…,N taking one of several discrete values. Let P(Yi = j) = πij; i = 1,2,…,N, j = 1,2,…,c denote the probability that the response belonging to the ith subject falls in the jth ordinal response category. In this study, the response variable is “energy security”, and it takes the ordered values A, B, C, and D as categories indexed by 1, 2, 3, and 4 for 61 different countries from 6 regions of the world.
Let P(Yi = j) = πij; i = 1,2,…,61, j = 1,2,3,4 denote the probability that the ith country’s response falls in the jth energy security grade level. Thus πi1 is the probability that the ith response is in the A energy security grade level, and so on. Assume that the response categories are mutually exclusive and exhaustive. Additionally, there are N independent trials for the 61 different countries, and each trial results in 1 of 4 mutually exclusive and exhaustive outcomes as energy security grade levels. Then the “energy security” ordinal response variable comes from the multinomial probability distribution. In the generalized linear model (GLM) approach for the energy security ordinal response variable, the main interest is in the cumulative probability of the ith response falling into or below the jth energy security grade level:
$$P(Y_i \le j) = \pi_{i1} + \pi_{i2} + \dots + \pi_{ij}, \quad j = 1, 2, 3, 4 \qquad (1)$$
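As a small illustration of the cumulative probabilities described above, the following sketch accumulates category probabilities into P(Y ≤ j). The function name and the probability values are hypothetical and chosen only for illustration; they are not estimates from the energy security data.

```python
from itertools import accumulate


def cumulative_probabilities(category_probs):
    """Return [P(Y <= 1), ..., P(Y <= c)] from [P(Y = 1), ..., P(Y = c)]."""
    if abs(sum(category_probs) - 1.0) > 1e-9:
        raise ValueError("category probabilities must sum to 1")
    return list(accumulate(category_probs))


# Hypothetical probabilities for grades A, B, C, D of one country (illustrative only).
pi_i = [0.10, 0.40, 0.35, 0.15]
gamma_i = cumulative_probabilities(pi_i)  # cumulative probabilities for A, <=B, <=C, <=D
```

The last cumulative probability is always 1, which is why only c − 1 = 3 cumulative probabilities need to be modelled.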
The random component identifies the ordinal response variable as energy security and assumes that the response variable comes from the exponential family, which includes the multinomial distribution. The systematic component specifies the covariates belonging to the ith country in the GLM. Parameters in the systematic component of the GLM for the multinomial distribution are estimated by the maximum likelihood (ML) method with one of the accompanying iterative methods, such as Newton-Raphson (NR), Fisher scoring (FS), or a hybrid method [21, 22, 23, 24].
Let G−1 denote a cumulative link function given in Table 2, defined as the inverse of a continuous cumulative distribution function G. Then the general form of a cumulative link model for the ith response relates the cumulative probabilities to the linear predictor, which depends on the covariate vector xi, as follows:
$$G^{-1}\left[P(Y_i \le j)\right] = \alpha_j + \beta' x_i, \quad j = 1, 2, \dots, c - 1 \qquad (2)$$
In Eq.(3), αj; j = 1,2,…,c − 1 are the intercept parameters and β = (β1, β2, …, βp)′ are the parameters belonging to the covariates in the systematic component of the GLM for the multinomial distribution. Then the multinomial log-likelihood function for the cumulative link model given in Eq.(2) is as follows; (4)
Let g denote the probability density function obtained as the derivative of the cumulative distribution function G, and let δjk denote the Kronecker delta, with δjk = 1 if j = k and δjk = 0 otherwise. Then the likelihood equations for the αj; j = 1,2,…,c − 1 and β = (β1, β2, …, βp)′ parameters in the GLM for the multinomial distribution are as follows; (5) (6)
The likelihood equations given in Eq.(5) and Eq.(6) can be solved by the NR, FS, or hybrid method. In this study, the hybrid method is used, in which iterations with the FS method are performed before continuing with the NR method. If convergence is achieved before the maximum number of Fisher iterations is reached, the hybrid algorithm continues with the NR method.
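For orientation, the log-likelihood and likelihood equations of a cumulative link model are commonly written in the following form. This is a sketch in the notation defined above, not necessarily the exact layout of the numbered equations. It assumes the linear predictor αj + β′xi, an indicator yij equal to 1 when the ith response falls in category j and 0 otherwise, and the conventions α0 = −∞ and αc = +∞, so that G(α0 + β′xi) = 0, G(αc + β′xi) = 1, and g vanishes at both ends. The symbols yij and xik are introduced here for illustration.

```latex
\begin{aligned}
L(\alpha, \beta) &= \sum_{i=1}^{N} \sum_{j=1}^{c} y_{ij}
  \log\left[ G(\alpha_j + \beta' x_i) - G(\alpha_{j-1} + \beta' x_i) \right], \\
\frac{\partial L}{\partial \beta_k} &= \sum_{i=1}^{N} \sum_{j=1}^{c} y_{ij}\, x_{ik}\,
  \frac{g(\alpha_j + \beta' x_i) - g(\alpha_{j-1} + \beta' x_i)}
       {G(\alpha_j + \beta' x_i) - G(\alpha_{j-1} + \beta' x_i)} = 0,
  \quad k = 1, \dots, p, \\
\frac{\partial L}{\partial \alpha_k} &= \sum_{i=1}^{N} \sum_{j=1}^{c} y_{ij}\,
  \frac{\delta_{jk}\, g(\alpha_j + \beta' x_i) - \delta_{j-1,k}\, g(\alpha_{j-1} + \beta' x_i)}
       {G(\alpha_j + \beta' x_i) - G(\alpha_{j-1} + \beta' x_i)} = 0,
  \quad k = 1, \dots, c - 1.
\end{aligned}
```

The two sets of derivatives follow from differentiating the first line term by term, which is where the density g and the Kronecker delta enter.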
The estimate of the scale parameter is computed as the ratio of the model Pearson chi-square statistic to the model degrees of freedom defined as N(c − 1)− p in the GLM for the multinomial distribution [22,25,26].
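The scale estimate described above is a single ratio, sketched below. The values N = 61, c = 4, and p = 6 (the six principal components used later as covariates) come from this study; the Pearson chi-square value is a hypothetical placeholder, not a result from the analysis, and the function name is an assumption made for illustration.

```python
def pearson_scale_estimate(pearson_chi_square, n_subjects, n_categories, n_covariates):
    """Scale estimate = Pearson chi-square / [N(c - 1) - p]."""
    degrees_of_freedom = n_subjects * (n_categories - 1) - n_covariates
    if degrees_of_freedom <= 0:
        raise ValueError("degrees of freedom must be positive")
    return pearson_chi_square / degrees_of_freedom


# N = 61 countries, c = 4 grade levels, p = 6 covariates give 61 * 3 - 6 = 177.
# The chi-square value 200.0 is hypothetical and used only to show the call.
example_scale = pearson_scale_estimate(200.0, n_subjects=61, n_categories=4, n_covariates=6)
```

An estimate close to 1 is consistent with the assumed multinomial variance, while values above 1 point to over-dispersion.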
In this study, the cumulative link function G−1 [P(Yi ≤ j)] describes the functional relationship between the systematic component of the GLM and the cumulative probabilities of the ith response falling into or below the jth energy security grade level.
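To make this relationship concrete, the sketch below converts intercepts and a linear predictor into category probabilities under the cumulative logit link, where G is the logistic distribution function. It assumes the form G−1[P(Yi ≤ j)] = αj + β′xi. All parameter values and the function names are hypothetical and are not estimates from this study.

```python
import math


def logistic_cdf(z):
    """Standard logistic distribution function G(z)."""
    return 1.0 / (1.0 + math.exp(-z))


def category_probabilities(alphas, beta, x, cdf=logistic_cdf):
    """Return [P(Y = 1), ..., P(Y = c)] for one subject.

    alphas: increasing intercepts alpha_1 < ... < alpha_{c-1}
    beta, x: coefficient and covariate vectors of equal length
    """
    eta = sum(b * v for b, v in zip(beta, x))
    # P(Y <= j) = G(alpha_j + beta'x) for j < c, and P(Y <= c) = 1.
    cumulative = [cdf(a + eta) for a in alphas] + [1.0]
    probs = [cumulative[0]]
    probs += [cumulative[j] - cumulative[j - 1] for j in range(1, len(cumulative))]
    return probs


# Hypothetical intercepts and coefficients (illustrative only).
example_probs = category_probabilities(
    alphas=[-1.0, 1.0, 3.0], beta=[0.5, -0.25], x=[1.0, 2.0]
)
```

Passing a different distribution function as `cdf` switches the link, so the same differencing step serves all five cumulative link functions.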
Cumulative link functions given in Table 2 are used to permit the cumulative probabilities of the ordinal response variable to be linearly related to the covariates as in Eq.(2) (http://share.uoa.gr/public/Software/SPSS/SPSS22/Manuals/IBM%20SPSS%20Advanced%20Statistics.pdf). In Table 2, Φ−1 is the inverse of the standard normal cumulative distribution function.
The information criteria (IC) used as goodness-of-fit statistics for determining the best cumulative link function between the systematic component and the cumulative probabilities in the GLM for the ordinal response variable are given in Table 3.
In Table 3, l is the maximum value of the multinomial log-likelihood function given in Eq.(4) evaluated at the parameter estimates, d = c − 1 + p is the number of parameters in the model, and N is the total number of subjects (https://www.ibm.com/support/knowledgecenter/SSLVMB_22.0.0/com.ibm.spss.statistics.algorithms/alg_genlin_gzlm_modeltest_goof.htm). The smallest IC values identify the best cumulative link function in the GLM for the ordinal response variable.
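The five cumulative link functions considered in this study can be sketched as follows, using their usual textbook definitions applied to a cumulative probability γ = P(Yi ≤ j). The dictionary names are hypothetical, and the forms are assumed to correspond to those listed in Table 2.

```python
import math
from statistics import NormalDist

_STD_NORMAL = NormalDist()  # standard normal, used for the probit link

# Link functions: map a cumulative probability in (0, 1) to the real line.
LINKS = {
    "logit": lambda g: math.log(g / (1.0 - g)),
    "probit": lambda g: _STD_NORMAL.inv_cdf(g),
    "cloglog": lambda g: math.log(-math.log(1.0 - g)),
    "cauchit": lambda g: math.tan(math.pi * (g - 0.5)),
    "negative_loglog": lambda g: -math.log(-math.log(g)),
}

# Inverse links: the distribution functions G that map back to probabilities.
INVERSE_LINKS = {
    "logit": lambda eta: 1.0 / (1.0 + math.exp(-eta)),
    "probit": lambda eta: _STD_NORMAL.cdf(eta),
    "cloglog": lambda eta: 1.0 - math.exp(-math.exp(eta)),
    "cauchit": lambda eta: 0.5 + math.atan(eta) / math.pi,
    "negative_loglog": lambda eta: math.exp(-math.exp(-eta)),
}
```

The logit, probit, and Cauchit links are symmetric around γ = 0.5, whereas the complementary log-log and negative log-log links are asymmetric, which is one reason the choice of link can change the fitted model.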
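A minimal sketch of the four information criteria is given below, using their usual definitions in terms of l, d, and N; these are assumed to match the entries of Table 3. The log-likelihood value in the example is hypothetical, while d = c − 1 + p = 9 and N = 61 follow from the setting of this study with six covariates.

```python
import math


def information_criteria(log_likelihood, n_params, n_subjects):
    """Return AIC, AICC, BIC and CAIC from l, d and N (smaller is better)."""
    minus_two_l = -2.0 * log_likelihood
    d, n = n_params, n_subjects
    return {
        "AIC": minus_two_l + 2 * d,
        "AICC": minus_two_l + 2 * d * n / (n - d - 1),  # finite-sample corrected AIC
        "BIC": minus_two_l + d * math.log(n),
        "CAIC": minus_two_l + d * (math.log(n) + 1),    # consistent AIC
    }


# l = -60.0 is a hypothetical log-likelihood used only to show the call.
example_ic = information_criteria(log_likelihood=-60.0, n_params=9, n_subjects=61)
```

Because all four criteria share the same −2l term, candidate link functions fitted with the same covariates differ only through their maximized log-likelihoods.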
3 Generalized linear model for the world energy security data from multinomial distribution
In this study, “energy security” is taken as the ordinal response variable. Ordinal response variable categories are taken as A, B, C, and D energy grade levels as indicated in the World Energy Trilemma Index, 2016. The response variable probability distribution is taken as multinomial. Cumulative link functions associated with the ordinal response variable are taken as cumulative logit, cumulative probit, cumulative complementary log-log, cumulative Cauchit and cumulative negative log-log. The parameter estimation method for the systematic component or the mean structure of the GLM for multinomial distribution is taken as the ML method with the hybrid iterative algorithm. Also, the scale parameter is estimated by using the Pearson chi-square statistic.
The covariates for modelling world energy security by GLM under different cumulative link functions are: energy imports (% of net energy use); energy use per capita (kg of oil equivalent per capita); consumption of oil (million tonnes) and of natural gas, coal, nuclear energy, hydroelectricity, solar, wind, and geothermal, biomass and other (GBO) renewable energy sources (million tonnes oil equivalent); electricity production (% of total) from coal, hydroelectric, natural gas, nuclear energy, and oil sources; and total proved reserves of oil (thousand million tonnes) and natural gas (trillion cubic metres). All computations and statistical analyses are performed using Microsoft Excel 2010 and IBM SPSS 22.0 (IBM Corp, IBM SPSS Statistics for Windows, Version 22.0. Armonk, NY, 2013).
Before the data analysis, the covariates are centered on their arithmetic means (the “mean centering” method) to obtain more stable and consistent parameter estimates in the GLM and to reduce multicollinearity in the data structure.
Before constructing the GLM for the world energy security data, pairwise correlations between these covariates are examined to measure the strength of the linear relationships between them. According to the Pearson correlation coefficient, highly correlated pairs occur only among the energy source consumption covariates: oil-natural gas 0.865, oil-wind 0.905, oil-GBO 0.834, natural gas-nuclear energy 0.824, coal-hydroelectricity 0.891, solar-wind 0.819, solar-GBO 0.816, and wind-GBO 0.854, all statistically significant (p < 0.001) at the α = 0.05 significance level.
When the covariates are moderately or highly correlated, parameter estimates, hypothesis tests, and confidence intervals may yield inaccurate statistical inferences and conclusions in the GLM for the world energy security data. To overcome this data-based multicollinearity problem, the principal component analysis (PCA) technique is used in this study. Ridge regression and lasso regression, based on a constrained minimization problem with a penalty term and on Bayesian inference, are alternative but more complex approaches to PCA in the case of multicollinearity.
As the measure of sampling adequacy, the Kaiser-Meyer-Olkin (KMO) value is found to be 0.640. Since KMO > 0.5, it can be concluded that the sample is adequate and that PCA is moderately suitable for the world energy security data. The approximate chi-square test statistic for Bartlett’s test of sphericity is found to be 845.979, which is statistically significant (p < 0.001) at the α = 0.05 significance level. This indicates that the original correlation matrix is not an identity matrix, so the covariates are related and PCA is suitable for detecting this correlated covariate structure.
PCA, a popular dimension reduction technique, is performed to form uncorrelated linear combinations of the covariates. In the PCA technique, “varimax” is used as the orthogonal rotation method that minimizes the number of covariates that have high loadings on each component. Six principal components are extracted from the 17 original covariates based on the criterion of eigenvalues greater than 1, and the scores for each principal component are created as the values of the new covariates.
As a key output of PCA, the rotated component matrix containing estimates of the correlations between each of the covariates and the estimated components is given in Table 4. As seen from Table 4, the six components consist of, respectively, the highly correlated energy source consumptions; non-renewable energy (fossil fuel) reserves of natural gas and oil; electricity production from nuclear energy, natural gas, and oil; energy use per capita and energy imports; electricity production from coal; and electricity production from hydroelectric.
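The chi-square statistic of Bartlett’s test of sphericity can be sketched as below, assuming the usual approximation −(n − 1 − (2p + 5)/6) ln|R| with p(p − 1)/2 degrees of freedom, where R is the sample correlation matrix. The data here are synthetic stand-ins, not the energy security data, and the function name is hypothetical. NumPy is assumed to be available.

```python
import numpy as np


def bartlett_sphericity(data):
    """Return (approximate chi-square statistic, degrees of freedom)."""
    data = np.asarray(data, dtype=float)
    n, p = data.shape
    corr = np.corrcoef(data, rowvar=False)
    # slogdet is numerically safer than taking log(det(corr)) directly.
    _, log_det = np.linalg.slogdet(corr)
    statistic = -(n - 1 - (2 * p + 5) / 6.0) * log_det
    dof = p * (p - 1) // 2
    return statistic, dof


# Synthetic stand-in data: 61 rows, 4 columns, two of them strongly related.
rng = np.random.default_rng(0)
demo = rng.normal(size=(61, 4))
demo[:, 1] = demo[:, 0] + 0.3 * rng.normal(size=61)
chi2_stat, chi2_dof = bartlett_sphericity(demo)
```

For 17 covariates the degrees of freedom would be 17 × 16 / 2 = 136; the p-value is then the upper tail of the chi-square distribution at the computed statistic.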
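The extraction step can be sketched with NumPy as follows: mean-center and standardize the covariates, take the eigendecomposition of their correlation matrix, and keep the components whose eigenvalues exceed 1. This sketch produces unrotated component scores and omits the varimax rotation used in the study. The data are synthetic and the function name is hypothetical.

```python
import numpy as np


def pca_kaiser(data):
    """Return (all eigenvalues, scores of components with eigenvalue > 1)."""
    data = np.asarray(data, dtype=float)
    # Mean-center, then scale to unit variance so PCA acts on the correlation matrix.
    centered = data - data.mean(axis=0)
    standardized = centered / data.std(axis=0, ddof=1)
    corr = np.corrcoef(data, rowvar=False)
    eigenvalues, eigenvectors = np.linalg.eigh(corr)
    order = np.argsort(eigenvalues)[::-1]  # largest eigenvalue first
    eigenvalues, eigenvectors = eigenvalues[order], eigenvectors[:, order]
    keep = eigenvalues > 1.0  # eigenvalue-greater-than-1 criterion
    scores = standardized @ eigenvectors[:, keep]  # unrotated component scores
    return eigenvalues, scores


# Synthetic stand-in data: two independent factors, each driving two columns.
rng = np.random.default_rng(0)
factors = rng.normal(size=(61, 2))
demo = np.column_stack([
    factors[:, 0], factors[:, 0] + 0.3 * rng.normal(size=61),
    factors[:, 1], factors[:, 1] + 0.3 * rng.normal(size=61),
])
eigenvalues, scores = pca_kaiser(demo)
```

The retained score columns are mutually uncorrelated by construction, which is the property that removes the multicollinearity among the original covariates.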
Ethical approval: The conducted research is not related to either human or animal use.
4 Results and discussion
In this section, results for modelling the world energy security data by GLM under different cumulative link functions are given comparatively.
First, the GLM for the world energy security data is fitted with 5 different cumulative link functions, using the 6 principal components as the new covariates. The values of the goodness-of-fit statistics comparing these cumulative link functions are given in Table 5. As seen from Table 5, the best cumulative link function between the cumulative probabilities of the ordinal response variable and the new candidate components in the mean structure of the GLM is the “cumulative logit”, which has the smallest values of the IC (AIC, AICC, BIC, and CAIC) and of the goodness-of-fit measures (the deviance, the Pearson chi-square, and the multinomial negative log-likelihood). The worst cumulative link function is the “cumulative complementary log-log”, which has the largest values of the same IC and goodness-of-fit measures in Table 5.
The estimates of the scale parameter are computed based on the value of the Pearson chi-square test statistic as 0.994, 1.084, 1.177, 1.192, and 1.247 for cumulative logit, cumulative Cauchit, cumulative negative log-log, cumulative probit, and cumulative complementary log-log link functions in the GLM for the world energy security data, respectively.
The results belonging to the parameter estimates, hypothesis tests of the parameters, odds ratios estimates and confidence intervals for the odds ratios in the GLM for the world energy security data under the best cumulative link function are given in Table 6.
As seen from Table 6, there are two statistically significant intercept parameter estimates, α̂2 = 1.683 for the B energy security grade level and α̂3 = 4.962 for the C energy security grade level (p < 0.001 at the α = 0.05 significance level). The last grade level of the “energy security” ordinal response variable, D, is taken as the reference category. Then, by using the parameter estimates given in Table 6, the cumulative logit model for the A energy security grade level, indexed by 1, can be given as follows; (8)
Cumulative logit models for B and C energy security grade levels as logit[P(Yi ≤ 2)] and logit[P(Yi ≤ 3)] can be constituted in a similar manner by including the statistically significant intercept parameter estimates as α̂2 =1.683 for B energy security grade level, and α̂3 = 4.962 for C energy security grade level into the model.
To make a comparison between the best and the worst cumulative link functions, results belonging to the GLM for the world energy security data under the worst cumulative link function are given in Table 7.
As seen from Table 7, there are also two statistically significant intercept parameter estimates, α̂2 = 1.666 for the B energy security grade level and α̂3 = 4.199 for the C energy security grade level (p < 0.001 at the α = 0.05 significance level). The last grade level of the “energy security” ordinal response variable, D, is again taken as the reference category. Then, by using the parameter estimates given in Table 7, the cumulative complementary log-log model for the A energy security grade level, indexed by 1, can be given as follows; (9)
Cumulative complementary log-log models for the B and C energy security grade levels, log{−log[1 − P(Yi ≤ 2)]} and log{−log[1 − P(Yi ≤ 3)]}, can again be constituted in a similar manner by including the statistically significant intercept parameter estimates, α̂2 = 1.666 for the B energy security grade level and α̂3 = 4.199 for the C energy security grade level, in the model.
As the final results of this study, by using the cumulative logit model, a 1 million tonnes of oil equivalent increase in energy source consumptions; a 1% increase in electricity production from nuclear energy, natural gas, and oil; a 1 kg of oil equivalent per capita increase in energy use and a 1% increase in energy imports; and a 1% increase in electricity production from hydroelectric increase the odds of falling into or below any energy security grade level by factors of e^1.176 = 3.241, e^1.566 = 4.787, e^0.661 = 1.936, and e^0.849 = 2.337, respectively. On the other hand, a 1% increase in electricity production from coal decreases the odds of falling into or below any energy security grade level by a factor of e^−0.872 = 0.418.
By using the cumulative complementary log-log model, a 1 million tonnes of oil equivalent increase in energy source consumptions and a 1% increase in electricity production from nuclear energy, natural gas, and oil increase the odds of falling into or below any energy security grade level by factors of e^0.897 = 2.452 and e^1.177 = 3.244, respectively. On the other hand, a 1% increase in electricity production from coal decreases the odds of falling into or below any energy security grade level by a factor of e^−0.727 = 0.483.
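The multiplicative effects quoted above are obtained by exponentiating the coefficient estimates, which can be checked with a few lines of Python. The coefficient values are those reported in the text; the dictionary keys and the function name are descriptive labels introduced here for illustration.

```python
import math


def exponentiate(coefficients):
    """Return exp(beta) for each named coefficient estimate."""
    return {name: math.exp(value) for name, value in coefficients.items()}


# Coefficient estimates quoted in the text for the cumulative logit model.
logit_estimates = {
    "energy_sources_consumptions": 1.176,
    "electricity_nuclear_gas_oil": 1.566,
    "energy_use_and_imports": 0.661,
    "electricity_hydroelectric": 0.849,
    "electricity_coal": -0.872,
}
# Coefficient estimates quoted for the cumulative complementary log-log model.
cloglog_estimates = {
    "energy_sources_consumptions": 0.897,
    "electricity_nuclear_gas_oil": 1.177,
    "electricity_coal": -0.727,
}

logit_factors = exponentiate(logit_estimates)
cloglog_factors = exponentiate(cloglog_estimates)
```

A factor above 1 corresponds to a positive coefficient and a factor below 1 to a negative coefficient, which is why coal-based electricity production is the only covariate with a decreasing effect in both models.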
In the light of this study, it can be concluded that overcoming the multicollinearity problem by combining the mean centering transformation with the PCA technique, as an alternative to ridge regression and lasso regression, is vital when the covariates in the GLM for the ordinal response variable are moderately or highly correlated. Additionally, selecting the most appropriate cumulative link function between the systematic component and the cumulative probabilities of the ordinal response variable is one of the most important stages for the accuracy of parameter estimates, hypothesis tests, and confidence intervals in the GLM. Thus, covariates that are statistically significant in the cumulative logit model, such as energy use per capita and energy imports, and also electricity production from hydroelectric, become statistically insignificant in the cumulative complementary log-log model.
Furthermore, inaccurate estimates of the scale parameter greater than 1, which indicate over-dispersion in the data structure, result from misspecification of the cumulative link function in the GLM for the world energy security data. These inaccurate estimates of the scale parameter may also cause misspecification of the systematic component in the GLM for the world energy security data. Therefore, larger inaccurate estimates of the scale parameter lead covariates that are statistically significant in the cumulative logit model to become statistically insignificant in the cumulative complementary log-log model.
As future work, it is intended to extend this study to the joint statistical evaluation of the three core dimensions of energy sustainability, namely energy security, energy equity, and environmental sustainability, in the world’s energy performance by using a GLM for the ordinal response variable.
The author is grateful to Professor Iskender Akkurt, the editors and three anonymous referees for their valuable comments and contributions to the improvement of this paper.
Skodras G., Panagiotidou S., Kokorotsikos P., Serafidou M., Potassium catalyzed hydrogasification of low-rank coal for synthetic natural gas production, Open Chemistry, 2016, 14, 92-102.
Yergin D., The quest: energy, security, and the remaking of the modern world, The Penguin Press, New York, 2011.
Awerbuch S., Portfolio-based electricity generation planning: policy implications for renewables and energy security, Mitigation and Adaptation Strategies for Global Change, 2006, 11(3), 693-710.
Cherp A., Jewell J., The three perspectives on energy security: intellectual history, disciplinary roots and the potential for integration, Current Opinion in Environmental Sustainability, 2011, 3(4), 202-212.
Iliashenko R., Zozulia O., Doroshenko A., High stokes shift long-wavelength energy gap regulated fluorescence in the series of nitro/dimethylamino-substituted ortho-analogs of POPOP, Open Chemistry, 2011, 9(6), 962-971.
Dokuzoglu D., Purutcuoglu V., Comprehensive analyses of gaussian graphical model under different biological networks, Acta Physica Polonica A, 2017, 132(3), 1106-1111.
Loomis D.G., Hayden J., Noll S., Payne J.E., Economic impact of wind energy development in Illinois, Journal of Business Valuation and Economic Loss Analysis, 2016, 11(1), 3-23.
McCullagh P., Nelder J.A., Generalized linear models, 2nd ed., Chapman and Hall, London, UK, 1989.
Agresti A., Foundations of linear and generalized linear models, John Wiley & Sons, Inc., Hoboken, New Jersey, 2015.
Heck R.H., Thomas S.L., Tabata L.N., Multilevel modeling of categorical outcomes using IBM SPSS, Quantitative Methodology Series, Taylor & Francis Group, New York, 2012.
De Jong P., Heller G.Z., Generalized linear models for insurance data, Cambridge University Press, Cambridge, 2008.
Hilbe J.M., Robinson A.P., Methods of statistical model estimation, CRC Press, Boca Raton, FL, 2013.
Ozaltin O., Iyit N., Modelling the US diabetes mortality rates via generalized linear model with the Tweedie distribution, International Journal of Science and Research (IJSR), 2018, 7(2), 1326-1334. https://www.ijsr.net/archive/v7i2/ART2018368.pdf
Pearson K., LIII. On lines and planes of closest fit to systems of points in space, The London, Edinburgh, and Dublin Philosophical Magazine and Journal of Science, 1901, 2(11), 559-572.
Tibshirani R., Regression shrinkage and selection via the lasso, Journal of the Royal Statistical Society, Series B (Methodological), 1996, 58(1), 267-288. http://www.jstor.org/stable/2346178
About the article
Published Online: 2018-04-30
Conflict of interest: The author states no conflict of interest.
Citation Information: Open Chemistry, Volume 16, Issue 1, Pages 377–385, ISSN (Online) 2391-5420, DOI: https://doi.org/10.1515/chem-2018-0053.
© 2018 Neslihan Iyit, published by De Gruyter. This work is licensed under the Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 License. BY-NC-ND 4.0