BY 4.0 license Open Access Published by De Gruyter Open Access December 31, 2022

Application of statistical and machine learning techniques for landslide susceptibility mapping in the Himalayan road corridors

  • Yasir Sarfraz, Muhammad Basharat, Muhammad Tayyib Riaz, Mian Sohail Akram, Chong Xu, Khawaja Shoaib Ahmed, Amir Shahzad, Nadhir Al-Ansari and Nguyen Thi Thuy Linh
From the journal Open Geosciences

Abstract

Landslides are frequent geological hazards along road corridors worldwide, mainly in the rainy season. In the present study, we comparatively analyzed landslide susceptibility along the main road corridors of the Muzaffarabad district by employing integrated geospatial approaches, i.e., data-driven, knowledge-driven, and machine learning (ML). A landslide inventory of three road corridors was developed to evaluate landslide susceptibility, and eleven landslide causative factors (LCFs) were analyzed. After statistical significance analysis, these eleven LCFs were used to generate susceptibility models with the weight of evidence (WoE), analytical hierarchy process (AHP), logistic regression (LR), and random forest (RF) methods. Distance from roads, landcover, lithological units, and slope were found to be the more influential LCFs. The performance of the different landslide susceptibility maps (LSMs) was evaluated through the area under the receiver operating characteristic curve (AUC-ROC), overall accuracy, Kappa index, F1 score, mean absolute error, and root mean square error. The AUC-ROC values for the WoE, AHP, LR, and RF techniques are 0.86, 0.82, 0.91, and 0.97, respectively, along Neelum road; 0.83, 0.81, 0.93, and 0.95, respectively, along Jhelum Valley road; and 0.89, 0.88, 0.89, and 0.92, respectively, along Kohala road. The LSMs produced through ML (i.e., RF and LR) showed better prediction accuracies than those produced through WoE and AHP along these three road corridors. The LSMs are categorized into very high, high, moderate, and low susceptibility zones along these roads. The LSMs generated through hybrid models can help the concerned local agencies implement landslide mitigation policies for the landslide-prone zones along road corridors.

1 Introduction

In mountainous terrains, landslides are a major challenge for settlements, transportation corridors, natural resource management, and tourism. In the Himalayas, landslide risk associated with road corridors is extensive due to steep slopes, high relief and landscape development, active tectonics, complex and fragile geological units, along with unpredictable monsoon changes [1]. The landslide susceptibility map (LSM) has been considered the foremost step toward landslide risk assessment in recent years. LSM can be done with landslide zonation, i.e., classifying and ranking the land surface areas based on the degree of hazard [2]. Different techniques of landslide zonation have been effectively used, such as heuristic, deterministic, probabilistic, statistical, and multi-criteria decision analysis [3]. The deterministic method depends on the mathematical formulation of the physical mechanics contributing to slope failures.

Knowledge-based techniques, that is, heuristic methods, rely on expert opinion to zone landslide areas into classes such as low, moderate, and high. In contrast, deterministic methods are based on properties of the material, e.g., mechanical characteristics and pore water behavior. Recently, heuristic approaches such as the weighted linear combination and the analytical hierarchy process (AHP) have come to be regarded as geographical information systems (GIS)-based decision support tools. These approaches are semi-quantitative; however, they can be efficiently adopted for landslide susceptibility assessment on a medium scale.

The AHP technique adopts landslide causative factors (LCFs) based on experts’ judgment and experience and allocates weights to LCFs based on their significance in evaluating the landslide susceptibility index. Data-driven and statistical methods assume that there is no association among LCFs. These methods include the weight of evidence (WoE), the information content method [4,5], the logistic regression (LR) method [6,7], and the multivariate linear regression method [8]. During the last two decades, machine learning (ML) methods have been frequently adopted for LSM in geosciences and have improved the prediction performance of susceptibility maps. Several ML algorithms, e.g., artificial neural networks (ANN) [9,10], decision trees (DTs) [11,12], support vector machines (SVM) [7,13,14], random forests (RF) [15,16], and LR [17,18], are frequently used because of their superiority and computational efficiency in LSM compared to conventional models. In Azad Kashmir and the northern part of Pakistan, the 2005 Kashmir earthquake activated thousands of landslides that caused about 26,000 casualties, had disastrous effects on the economy and infrastructure, and badly affected the road corridors [19,20]. Different researchers [21,22,23,24,25,26,27,28] have investigated the earthquake-affected landslide region and concluded that most of the landslides are distributed along roads and rivers. Being a significant geological hazard, landslides cause severe blockages and damage to road corridors in the northern areas of Pakistan. Landslides can block road networks temporarily or permanently, with closures lasting from at least a week up to a month, which significantly affects the area’s economy. The existing work is limited to landslide distribution analysis, case studies of individual landslides, and regional susceptibility assessment.

No LSMs are currently available for the main road corridors of District Muzaffarabad, Azad Kashmir, Pakistan, to help understand the landslide hazards there and to implement landslide management policies in the area. Therefore, in the current study, an integrated data-driven WoE method, knowledge-based AHP technique, and advanced ML algorithms (RF and LR) were adopted to produce the LSMs along the main road corridors of the region. The ML algorithms were applied for the first time in the area; the integrated WoE, AHP, LR, and RF methods support landslide probability modeling using ArcGIS extensions and other geospatial tools.

The WoE technique is appropriate for regional areas with unpredictable geo-environmental factors and can handle both continuous and discrete data sets. Moreover, applying probability analysis to calculate the weights of the landslide-associated variables is considered good practice for limiting the influence of any specific input on the model and, consequently, on its outputs. The AHP method provides a pragmatic and accurate result for defining the weights of the influencing factors in the LSM. This method is based on field expertise and the expert’s knowledge about the area along with the controls of different factors on landslides [1]. The AHP method is now established as a multi-objective and multi-dimensional decision-making method, which helps researchers develop a scale of preferences derived from a set of alternatives. Compared with other ML approaches, LR is a statistical approach with low requirements for the quantity and quality of samples; it accepts continuous or discrete independent variables, and its results can be validated with a number of tests [29]. LR was selected for the current study because landslide occurrence is mainly controlled by various linear and non-linear causative factors, and because it allows the road corridors to be categorized into susceptibility classes according to the influencing factors associated with landslides.

RF is a very flexible and powerful ensemble classifier based on DTs. Each individual DT is built on a bootstrapped sample of the data using the classification and regression trees methodology, with a random subset of variables selected at each node [30]. RF provides better conditions for training ML models and produces a more precise estimate of generalization skill, that is, of predicting new landslide events. The specific aim of the present study is to compare statistical (WoE), heuristic (AHP), and ML approaches (LR and RF), both for LSM and for detecting relevant features along the main road corridors of Muzaffarabad. The key difference between the present research and existing works is that the three approaches (i.e., knowledge-driven, data-driven, and ML) are implemented together for the first time for spatial LSM. Furthermore, landslides caused severe damage to road networks during and after the 2005 Kashmir earthquake and during the monsoon months each year [19,24,31]. Additionally, the current research aims to compare the prediction performance of AHP, WoE, LR, and RF for landslide susceptibility assessment. LR and RF are very popular and effective ML algorithms, while WoE has been applied effectively around the globe because of its precision and its ability to evaluate the effect of different LCFs on landslide distribution. This study combines the advantages of data-driven, expert-knowledge-based, and well-known ML methods to assess landslide susceptibility. Hence, the present study allows all concerned agencies to make appropriate decisions regarding future land use planning, compared with the previous single-method-based maps.

2 Study area

The study area comprises the main road corridors of the district Muzaffarabad in northern Pakistan (Figure 1). These roads include the Kohala, Neelum Valley, and Jhelum Valley roads. In the Himalayas, several landslides have been reported along road corridors, especially in Azad Kashmir, which have blocked the routes for several days [5,19,32]. Safety and cost-effectiveness remain the two most influential criteria in planning transportation routes in landslide-prone regions [33]. The studied road corridors encircle the mountainous area of district Muzaffarabad and have remained blocked to traffic for more than a week due to landslides triggered during torrential rainfall and floods. The Donga Kas landslide along Neelum valley road disrupted traffic flow for a week, damaged three houses and six shops, and caused three recorded fatalities [23]. Moreover, according to the local inhabitants, the Langerpura landslide along Jhelum valley road was initially activated after heavy rainfall during the 1992 monsoon season and was then reactivated during the 2005 Kashmir earthquake, damaging residential buildings and injuring several people. The area was known as a landslide-prone region even before the 2005 Kashmir earthquake due to its steep slopes, poor drainage control, and active tectonic features. Extensive slope failures have been reported along the road corridors during rehabilitation works because of improper slope cutting; consequently, the rate of landslide activity has surged enormously.

Figure 1 
               Geographical location map of the study area road corridors.

Tectonically, the study area lies in the northwestern Himalayas’ fold and thrust belt [34]. The Panjal Thrust, Main Boundary Thrust, Deolian Fault, Muzaffarabad Fault (MzF), Minhasa Fault, and Jhelum Fault are the main structural elements in the region. The Precambrian Hazara and Tanol formations, Cambrian Nauseri Granite Gneiss and Muzaffarabad Formation, Carboniferous-Permian Panjal Formation, Paleocene Hangu, Lockhart, and Patala formations, Eocene Margalla Hill Limestone, Chorgali and Kuldana formations, and Miocene Murree and Kamlial formations are exposed in the research area. The region’s geology is very complex, and the lithological units make the slopes of this area potentially susceptible to landslides. The 2005 Kashmir earthquake had catastrophic repercussions for those living in northern Pakistan’s highlands. These mass movements (Figure 2) interrupted communication throughout the afflicted area and devastated communities and infrastructure, resulting in many deaths. A number of field visits along the Neelum, Kohala, and Jhelum Valley roads were carried out during 2019 and 2020, and various landslide types such as rockfalls, rock slides, and shallow landslides were observed along these road corridors (Figure 2).

Figure 2 
               Examples of landslide events: (a) Kohala landslide along Kohala road triggered in 2014 during monsoon, (b) Kulian landslide along Kohala road triggered in 2016, (c) Lohargali landslide along Neelum Valley road, and (d) Majhoi landslide along Jhelum valley road.

3 Procedures and methods

The adopted methodology is an effort to assess landslide susceptibility along the study area road corridors. The choice of LSM technique depends on the extent of the study area and the available data. First, landslide events were identified based on field observations and available literature, and their causative factors were selected for this study. After choosing these parameters, a GIS-based database was established, including topographic, geological, and remote sensing data. Statistical significance tests were applied to select the best subsets of LCFs. The compiled data were imported into Esri ArcGIS 10.4.1 software. The data set involved only the landslides that caused damage or occurred within the 1-km buffer along the three main road corridors (Neelum road, Jhelum Valley road, and Kohala road) within the district Muzaffarabad. Thus, landslides in the district Muzaffarabad but outside the study area were not incorporated into the data set. The topography of the area was evaluated via digital elevation models (DEMs) derived from ALOS PALSAR (12.5 m grid cell size), which is considered a precise source for investigating topographic features at a regional scale.

The present study began by compiling the landslide inventory through comprehensive field visits and satellite imagery. Moreover, eleven selected LCFs, that is, lithology, slope angle, slope aspect, land cover, elevation, distance to roads, distance to faults, distance to streams, normalized difference vegetation index (NDVI), topographic wetness index (TWI), and curvature, were investigated for LSM. After the preparation of the LCFs, their spatial geodatabase was standardized to a common spatial resolution of 12.5 m and the projection system WGS 1984/UTM Zone 43 N in the ArcGIS environment before training the models. Lastly, the susceptibility maps generated along the studied road corridors were compared and analyzed using the validation data sets. The generated susceptibility maps have been categorized into four classes, that is, low, moderate, high, and very high susceptibility zones, following the recently published literature [32,35]. The methodological steps and data sets used in the present research are illustrated in Table 1 and Figure 3.

Table 1

List of data including basic attribute information used in the study

Thematic layers | Data types | Scale/resolution | Data sources
Landslide inventory | Google earth imagery | 30 m | Google Inc.
Landslide inventory | World view 3 | 0.3 m | Land use and Planning Department, Govt. of AJ&K
Landslide inventory | SPOT-5 | 2.5 m | Institute of Geology, University of Azad Jammu and Kashmir, Pakistan
Landslide inventory | Quick bird | 0.6 m | Land use and Planning Department, Govt. of AJ&K
Landslide inventory | Field surveys | | Field visits during 2019
Slope, aspect, elevation, curvature, TWI | ALOS PALSAR DEM | 12.5 m | vertex.dacc.asf.alaska.edu
NDVI, land use | Landsat-8 | 30 m | https://earthexplorer.usgs.gov
Geology, distance to faults | Geological maps | 1:50,000 | Geological Survey of Pakistan (GSP)
Distance to roads | Open street map | 30 m | www.openstreetmap.org
Distance to streams | Google earth imagery | 30 m | Google Inc.

Figure 3 
               Schematic flowchart methodology adopted for the present study.

3.1 Landslide inventory map

The road corridors of the study area are affected by various landslide types, that is, rockslides, rockfalls, and shallow landslides. Approximately 182 landslides along these roads were identified from aerial photographs and satellite imagery and were then confirmed and revised through comprehensive field visits during 2019 and 2020 (Figure 4). Along the Neelum valley road, 91 landslides were marked, comprising 14.29% rockfalls, 24.18% rockslides, and 61.54% shallow landslides. Along the Jhelum valley road, 38 landslides were marked, comprising 36.84% rockfalls, 44.74% rockslides, and 18.42% shallow landslides. Along the Kohala road, 53 landslides were marked, among which 13.21% are rockfalls, 26.42% are rockslides, and 62.26% are shallow landslides (Figure 5). The landslides recorded in the inventories along each road corridor cover areas greater than 0.01 km². The largest landslides along the Neelum valley road (i.e., Panjgaran and Lohargali) cover areas of about 0.65 km² and 0.75 km², respectively, whereas along the Jhelum valley road, the Langerpura landslide covers an area of about 1 km², and along the Kohala road, the largest landslide, that is, the Kohala slide, covers an area of approximately 0.06 km².

Figure 4 
                  Landslide temporal distribution maps along the study area road corridors: (a) Neelum Valley Road, (b) Jhelum Valley Road, and (c) Kohala Road.

Figure 5 
                  Statistics of landslide types along Neelum valley, Jhelum valley, and Kohala roads.

There is no clear and predetermined strategy for splitting landslide samples in the literature; however, the most often used ratio is 70:30 for training and validation data, respectively [5,15].

Therefore, we randomly divided the data set into training (70%) and validation (30%) data sets. Of the 182 landslides, 91 were recorded along Neelum valley road, of which 70% (64 events) were used as training points and 30% (27 events) as validation points. Along Jhelum valley road, 38 landslide events were recorded, of which 70% (27 events) were used as training points and 30% (11 events) as validation points, while along Kohala road, 53 landslide events were recorded, of which 70% (37 events) were used as training points and 30% (16 events) as validation points. The selection of non-landslide locations has a significant impact on the prediction capabilities of the RF and LR models. No rule exists for setting the slope threshold below which locations count as non-landslide; as the threshold increases or decreases, the sampling space increases or decreases accordingly. For a very small sampling space, non-landslide locations will be selected only from flat terrain, so the model will readily differentiate between landslide and non-landslide locations. The ratio of landslide to non-landslide samples was set to 1:2 based on recommendations from Pourghasemi et al. [36]. We adopted random sampling of non-landslide locations because the study area is small and non-landslide areas were easy to identify. Flat plain areas were included among the non-landslide locations because, based on our spatial distribution analysis, these locations have few or no landslides. This non-landslide sampling approach is in line with recent trends in the literature [32].
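The 70:30 split described above can be sketched in a few lines of Python. This is a minimal illustration, not the authors' workflow: the function name `split_inventory`, the random seed, and the use of integer IDs in place of mapped landslides are assumptions, and applying the 1:2 ratio to the full inventory of each corridor is likewise an assumption.

```python
import random

def split_inventory(landslide_ids, train_fraction=0.7, seed=42):
    """Randomly split landslide IDs into training and validation lists."""
    ids = list(landslide_ids)
    random.Random(seed).shuffle(ids)          # reproducible random order
    n_train = round(len(ids) * train_fraction)
    return ids[:n_train], ids[n_train:]

# Landslide counts per corridor, as reported in the text.
corridors = {"Neelum": 91, "Jhelum": 38, "Kohala": 53}

splits = {}
for name, count in corridors.items():
    train, valid = split_inventory(range(count))
    # Assumed reading of the 1:2 ratio: two non-landslide points per landslide.
    n_non_landslide = 2 * count
    splits[name] = (len(train), len(valid), n_non_landslide)
    print(name, splits[name])
```

Rounding 70% of each inventory to the nearest whole event reproduces the training and validation counts reported for the three corridors.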

The authors have local experience with the area’s landslide mechanisms, which reduced the uncertainty. In the present study, we demarcated the landslides from World view 3 satellite imagery (0.3 m resolution) acquired from the Land use and Planning Department of the Government of Azad Jammu and Kashmir (GoAJK), Quickbird images, SPOT-5, and Google Earth imagery for different periods, and later verified them through field visits to compile the landslide inventory.

3.2 Causative factors

The stability of the slope mainly depends on its slope angle, the extreme climatic changes, the vegetation cover, the discontinuity pattern, and lithology [37]. The slope angle is classified into eight classes along Neelum valley road, and seven classes along Jhelum valley and Kohala Road (Figures 6a, 7a and 8a). Various factors, such as climatic conditions (rainfall, sunshine, and snow melting), land cover (forest, green land, urban land, barren land, and water bodies), weathering, and soil properties, that is, infiltration, have a significant impact on the slope aspect. The slope aspect along the study area road corridors is categorized into eight classes (Figures 6b, 7b and 8b).

Figure 6 
                  LCFs along Neelum Valley road: (a) slope, (b) aspect, (c) elevation, (d) curvature, (e) lithology, (f) landcover, (g) distance to streams, (h) distance to roads, (i) distance to fault, (j) TWI, and (k) NDVI.

Figure 7 
                  LCFs along Jhelum Valley road: (a) slope, (b) aspect, (c) elevation, (d) curvature, (e) lithology, (f) landcover, (g) distance to streams, (h) distance to roads, (i) distance to fault, (j) TWI, and (k) NDVI.

Figure 8 
                  LCFs along Kohala road: (a) slope, (b) aspect, (c) elevation, (d) curvature, (e) lithology, (f) landcover, (g) distance to streams, (h) distance to roads, (i) distance to fault, (j) TWI, and (k) NDVI.

Pachauri and Pant [38] found that the elevation of an area has an indirect impact on slope failure (Figures 6c, 7c and 8c). Slope curvature is related to slope failures, especially in hilly areas, through the accumulation or discharge of surface and subsurface water [39] (Figures 6d, 7d and 8d). Lithology is among the most significant factors for landslide occurrence because the structural and geotechnical properties of the rock units control their strength and permeability characteristics [40] (Figures 6e, 7e and 8e).

Land cover is associated with the triggering of slope failures. Riaz et al. [5] reported a strong association between slope stability and vegetation cover, particularly for small-scale landslides. In the research area, the land cover and land-use maps are categorized into five classes, namely forest, green land, urban land, barren land, and water bodies (Figures 6f, 7f and 8f).

Proximity to the streams can adversely influence slope stability through erosion and/or saturating the slope-forming material [41]. In this study, distance to streams was classified into five classes using 50 m interval buffer zones to verify the impact of streams on the slope stability along these roads (Figures 6g, 7g and 8g).

Road construction in hilly terrain is a key anthropogenic factor for landslides because undercutting slopes for road construction commonly causes slope instability [42] (Figures 6h, 7h and 8h). The 7.6 magnitude earthquake of 2005 along the MzF initiated cracks in the rock units along the fault planes of this region. The distance to fault maps along these roads were calculated through buffer distances along the faults and categorized into six classes with 100 m intervals (Figures 6i, 7i and 8i).
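The buffer-based classification used for the distance factors can be illustrated with a short Python sketch. The function name `buffer_class` is hypothetical, and the class boundaries are assumptions: half-open intervals are used, and the last class is taken as open-ended (everything beyond the final buffer), which the text does not state explicitly.

```python
def buffer_class(distance_m, interval_m, n_classes):
    """Return the 1-based buffer class of a distance; the last class is open-ended."""
    if distance_m < 0:
        raise ValueError("distance must be non-negative")
    # Class 1 covers [0, interval), class 2 covers [interval, 2 * interval), ...
    return min(int(distance_m // interval_m) + 1, n_classes)

# Distance to streams: five classes with 50 m intervals.
stream_classes = [buffer_class(d, 50, 5) for d in (30, 120, 1000)]

# Distance to faults: six classes with 100 m intervals.
fault_classes = [buffer_class(d, 100, 6) for d in (250, 5000)]

print(stream_classes, fault_classes)
```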

The TWI is a vital landscape characteristic for LSM [43]. TWI values are positive and non-zero, and they increase as the catchment area increases and the slope angle decreases [44]. The TWI maps of the study area roads were categorized into four classes based on the maximum and minimum values obtained (Figures 6j, 7j and 8j). The NDVI values depict the vegetated slopes in the image. Dense vegetation on the slopes gives high NDVI values because the high chlorophyll concentration causes low reflectance in the red band, while the stacking of leaves causes high reflectance in the near-infrared band, and vice versa. The NDVI maps of roads in the study area were produced by analyzing bands 3 and 4 (Figures 6k, 7k and 8k).
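Both indices can be written as one-line formulas. The sketch below assumes the commonly used definitions, TWI = ln(a / tan β) with specific catchment area a and slope angle β, and NDVI = (NIR − Red) / (NIR + Red); the paper does not print either formula, so these definitions, the function names, and the sample values are illustrative assumptions.

```python
import math

def twi(specific_catchment_area, slope_deg):
    """Topographic wetness index ln(a / tan(beta)); undefined for a flat cell."""
    slope_rad = math.radians(slope_deg)
    return math.log(specific_catchment_area / math.tan(slope_rad))

def ndvi(nir, red):
    """Normalized difference vegetation index from NIR and red reflectance."""
    return (nir - red) / (nir + red)

# TWI rises with catchment area and falls as the slope steepens.
gentle = twi(specific_catchment_area=100.0, slope_deg=10.0)
steep = twi(specific_catchment_area=100.0, slope_deg=40.0)

# Dense vegetation: low red reflectance, high near-infrared reflectance.
dense = ndvi(nir=0.5, red=0.1)
bare = ndvi(nir=0.3, red=0.25)

print(gentle, steep, dense, bare)
```

In practice a flat cell (slope of zero) needs special handling, for example a small minimum slope, because tan β is then zero.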

3.3 Landslide susceptibility methods (LSM)

LSM is a crucial component of hazard management and a significant resource for mitigating the risk to life. On these grounds, the present work incorporates the results of the susceptibility approaches applied along the main road corridors of the District Muzaffarabad for susceptibility analysis.

3.3.1 WoE method

The WoE is a simple, easy-to-conduct, and time-saving approach [45] that can readily be implemented in GIS applications. According to Bonham-Carter [46], the WoE approach computes the weight for each LCF (B) with respect to the existence or non-existence of landslide events (S) in the area. For each predictive variable, the positive weight (W+) expresses the association between the presence of the factor and landslide occurrence, and the negative weight (W−) expresses the association between the absence of the factor and landslide occurrence. These weights represent the degree of relationship between the evidence (predictive variables) and the landslide events, which makes the empirical observations easy to interpret.

(1) $C = W^{+} - W^{-}$,

(2) $W_i^{+} = \log_e \dfrac{P\{B_i \mid S\}}{P\{B_i \mid \bar{S}\}}$,

(3) $W_i^{-} = \log_e \dfrac{P\{\bar{B}_i \mid S\}}{P\{\bar{B}_i \mid \bar{S}\}}$,

where $B_i$ is the existence of a conditioning factor for potential landslide, $\bar{B}_i$ is the non-existence of a conditioning factor for landslide potential, $S$ is the existence of the landslide event, and $\bar{S}$ is the non-existence of a landslide event.

The difference between the weights, the contrast (C), measures the strength of the relationship between the evaluated variables and the landslide events [47]. Arc-SDM, an ArcGIS extension, was then used to compute the weights and other statistics from the landslide training data set. The respective LCFs and methodological steps follow Riaz et al. [5].
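As a worked illustration of equations (1) to (3), the weights for a single factor class can be computed from pixel counts. The sketch below is not the Arc-SDM implementation; the function name and the pixel counts are hypothetical, and the conditional probabilities are estimated as simple area proportions.

```python
import math

def weights_of_evidence(n_class_slide, n_class_total, n_slide, n_total):
    """Return (W+, W-, C) for one factor class from pixel counts."""
    # P{B|S}: share of landslide pixels that fall inside the class.
    p_b_s = n_class_slide / n_slide
    # P{B|not S}: share of non-landslide pixels that fall inside the class.
    p_b_ns = (n_class_total - n_class_slide) / (n_total - n_slide)
    w_plus = math.log(p_b_s / p_b_ns)
    w_minus = math.log((1.0 - p_b_s) / (1.0 - p_b_ns))
    return w_plus, w_minus, w_plus - w_minus

# Hypothetical class: 1,000 of 10,000 pixels, holding 40 of 100 landslide pixels.
w_plus, w_minus, contrast = weights_of_evidence(40, 1000, 100, 10000)
print(w_plus, w_minus, contrast)

# A class holding landslides in proportion to its area carries no evidence.
neutral = weights_of_evidence(10, 1000, 100, 10000)
```

A class that is over-represented among landslide pixels gets a positive W+ and a negative W−, so its contrast C is positive; a class that holds landslides only in proportion to its area gets weights close to zero.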

3.3.2 AHP method

AHP is a versatile and simple technique for analyzing and resolving complicated tasks [48]. This method is based on field expertise and the expert’s knowledge about the area and the controls of different factors on landslides [1,49]. For AHP, it is necessary to break down an unstructured issue into its constituent factors; organize them in a hierarchic order; allocate numerical ratings to individual judgments on the relative significance of each component factor; and finally integrate these judgments to establish the implications related to such factors [50].

In this study, the LCFs were segmented into classes, and each factor class was assigned a rating from 1 to 9 based on its triggering potential, where 1 denotes low susceptibility and 9 denotes high susceptibility to landslides. The consistency index (CI) and consistency ratio (CR) are calculated with formulas (4) and (5):

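(4) $\mathrm{CI} = \dfrac{\lambda_{\max} - n}{n - 1}$,

(5) $\mathrm{CR} = \dfrac{\mathrm{CI}}{\mathrm{RI}}$,

where $\lambda_{\max}$ is the maximum eigenvalue of the matrix, $n$ is the number of studied factors, and RI is the random consistency index.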

The inconsistency of the matrix is acceptable if the CR value is less than or equal to 0.1, but if the CR value is >0.1, the subjective decision should be reconsidered [51].
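The consistency check in formulas (4) and (5) can be sketched as follows. The 3 × 3 pairwise comparison matrix is hypothetical and is not taken from the study; the function names are illustrative, the principal eigenvector is approximated by power iteration, and the RI values are Saaty's commonly tabulated random indices.

```python
# Saaty's random consistency index (RI) for matrix sizes 3 to 10.
RANDOM_INDEX = {3: 0.58, 4: 0.90, 5: 1.12, 6: 1.24,
                7: 1.32, 8: 1.41, 9: 1.45, 10: 1.49}

def ahp_weights(matrix, iterations=100):
    """Approximate the principal eigenvector and lambda_max by power iteration."""
    n = len(matrix)
    weights = [1.0 / n] * n
    for _ in range(iterations):
        product = [sum(matrix[i][j] * weights[j] for j in range(n)) for i in range(n)]
        total = sum(product)
        weights = [value / total for value in product]   # normalise to sum to 1
    product = [sum(matrix[i][j] * weights[j] for j in range(n)) for i in range(n)]
    lambda_max = sum(product[i] / weights[i] for i in range(n)) / n
    return weights, lambda_max

def consistency_ratio(lambda_max, n):
    """CR = CI / RI with CI = (lambda_max - n) / (n - 1)."""
    ci = (lambda_max - n) / (n - 1)
    return ci / RANDOM_INDEX[n]

# Hypothetical judgments: factor 1 is 3x as important as factor 2, 5x factor 3.
pairwise = [
    [1.0, 3.0, 5.0],
    [1.0 / 3.0, 1.0, 3.0],
    [1.0 / 5.0, 1.0 / 3.0, 1.0],
]

weights, lambda_max = ahp_weights(pairwise)
cr = consistency_ratio(lambda_max, len(pairwise))
print(weights, lambda_max, cr)
```

For this matrix the CR falls below the 0.1 threshold, so the judgments would be accepted as sufficiently consistent.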

3.3.3 LR method

Based on the mathematical concept of the logit (the natural logarithm of the odds), LR was initially developed in the late 1960s and early 1970s [52]. The main goal of LR in terms of landslide prediction is to find the best-fitting model for evaluating the spatial association between the absence or presence of a landslide and a set of predictor variables [53]. The logit of LR is computed by equation (6):

(6) $y = f(P) = \ln\left(\dfrac{P}{1 - P}\right) = \beta_0 + \beta_1 X_1 + \beta_2 X_2 + \cdots + \beta_n X_n$,

The probability of a landslide occurrence (P) may be estimated using the equation below:

(7) $P = P(Y \mid X) = \dfrac{e^{\beta_0 + \beta_1 X_1 + \beta_2 X_2 + \cdots + \beta_n X_n}}{1 + e^{\beta_0 + \beta_1 X_1 + \beta_2 X_2 + \cdots + \beta_n X_n}}$,

where $Y$ represents the resultant factors (landslide or non-landslide), $X = X_1, X_2, \ldots, X_n$ indicates predictor variables of landslide influencing factors, $n$ is the number of landslide influencing factors, $\beta_0$ is the intercept, and $\beta_1, \beta_2, \ldots, \beta_n$ are the regression coefficients.
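Equations (6) and (7) can be checked numerically with a short sketch. The coefficients and factor values below are hypothetical and are not fitted to the study's data; the function name is illustrative. The expression 1 / (1 + e^(−y)) used in the code is algebraically identical to e^y / (1 + e^y) in equation (7).

```python
import math

def landslide_probability(intercept, coefficients, factors):
    """Equation (7): probability from the linear predictor of equation (6)."""
    y = intercept + sum(b * x for b, x in zip(coefficients, factors))
    return 1.0 / (1.0 + math.exp(-y))

# Hypothetical model with two standardized factors.
beta_0 = -1.0
betas = [0.5, 2.0]
cell = [2.0, 0.25]          # linear predictor y = -1.0 + 1.0 + 0.5 = 0.5

p = landslide_probability(beta_0, betas, cell)
logit = math.log(p / (1.0 - p))   # inverting equation (7) recovers y
print(p, logit)
```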

3.3.4 RF method

Conventional ML techniques are computationally intensive. Breiman et al. [54] developed the classification tree algorithm, which significantly reduced the calculations required for classification and regression through repeated dichotomous splitting of the data. Breiman [55] later established the RF prediction model, which uses many DTs. Bootstrap resampling of the samples (rows), together with randomly selected variables (columns), is used to generate the trees and their decisions. After model generation, each sample (row) is first classified by each DT individually, and the final class is then decided by the majority vote among all the trees. LSMs can be generated by assessing the proportion of DTs that predict landslide occurrence among all the DTs in the RF model [56]. RF is therefore among the most frequently used ML techniques, with high prediction accuracy. Considering these advantages of the RF model for landslide susceptibility, this method was selected for the present study.

According to Breiman [55], RF is comprehensive and reliable among the key ML methods. The RF approach develops several classification trees to derive a classification [54]. Hansen and Salamon [57] pointed out that ensembles of classification trees are much more precise than their individual members. RF enhances diversity among the classification trees by resampling the data and randomly changing the explanatory factor sets over the successive tree-generation steps. To develop the RF model, two user-defined parameters are required, i.e., the number of trees (k) and the number of predictive factors used to split the nodes (m). For evaluating the conditioning factors, both statistical and categorical variables are used. The value of k is kept sufficiently high to allow convergence. The advantage of employing RF in the present study is that it resists overtraining, and growing a large number of trees does not risk overfitting, because each tree is a completely independent random experiment. As a result, there should be no need to transform, rescale, or adjust the data for the RF algorithm. RF also has the advantage of resistance against outliers in the predictors and deals automatically with missing values [58]. To obtain the best RF model performance, the node size, mtry, and ntree parameters were selected to reduce the out-of-bag (OOB) error. Feature importance was measured by IncNodePurity (i.e., Mean Decrease Gini), which is based on the Gini impurity index used for evaluating the splits in the trees.
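The three ingredients described above (bootstrap resampling of rows, random selection of variables, and susceptibility as the proportion of trees voting for a landslide) can be illustrated with a deliberately simplified sketch. It is not a full CART-based RF and is not the model used in the study: each "tree" is a one-split stump on a single randomly chosen variable (so m = 1), and the eight training samples, the function names, and the seed are all hypothetical.

```python
import random

def predict_stump(stump, row):
    """Classify one sample with a single-split tree (1 = landslide)."""
    feature, threshold, sign = stump
    return 1 if sign * (row[feature] - threshold) >= 0 else 0

def fit_stump(rows, labels, feature):
    """Pick the threshold and direction on one feature with the fewest errors."""
    best = None
    for threshold in sorted({row[feature] for row in rows}):
        for sign in (1, -1):
            stump = (feature, threshold, sign)
            errors = sum(predict_stump(stump, r) != y for r, y in zip(rows, labels))
            if best is None or errors < best[0]:
                best = (errors, stump)
    return best[1]

def fit_forest(rows, labels, n_trees=25, seed=0):
    """Grow k = n_trees stumps, each on a bootstrap sample and a random variable."""
    rng = random.Random(seed)
    n_rows, n_features = len(rows), len(rows[0])
    forest = []
    for _ in range(n_trees):
        sample = [rng.randrange(n_rows) for _ in range(n_rows)]  # rows, with replacement
        feature = rng.randrange(n_features)                      # random column
        forest.append(fit_stump([rows[i] for i in sample],
                                [labels[i] for i in sample], feature))
    return forest

def susceptibility(forest, row):
    """Proportion of trees that vote 'landslide' for this sample."""
    return sum(predict_stump(stump, row) for stump in forest) / len(forest)

# Hypothetical samples: (slope in degrees, distance to road in metres).
rows = [(45, 50), (40, 80), (38, 30), (50, 120),
        (10, 900), (12, 700), (8, 850), (15, 600)]
labels = [1, 1, 1, 1, 0, 0, 0, 0]

forest = fit_forest(rows, labels)
high = susceptibility(forest, (55, 20))    # steep slope next to the road
low = susceptibility(forest, (5, 1000))    # gentle slope far from the road
print(high, low)
```

A production model would grow full trees with m candidate variables per node and tune k, m, and the node size against the OOB error, as described in the text.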

4 Results and discussion

4.1 LCF selection

The feature selection (FS) module is a unique module of the LSM-Tool Pack developed by Sahin et al. [59]. It is meant to help improve the model’s prediction performance by choosing a subset of LCFs or removing redundant parameters from a large set of LCFs. The FS procedure is broken down into three primary phases. In the first phase, the relevance of each feature (i.e., factor) is determined; the primary heuristic used in the FS method is to assess each LCF’s usefulness, and users can choose among three statistical tests for this purpose: information gain, RF importance, and chi-square. In the second phase, the impacts of the LCFs on the prediction model’s performance are evaluated. In the last phase, the optimal subset, comprising the features that ensured higher prediction results, is chosen. A paired statistical test is used to identify the cut-off point (i.e., the optimal subset size) at which adding further LCFs results in no apparent difference in prediction accuracy. For feature ranking and choosing the optimal subset, the FS module provides users with various options and statistical metrics. The FS procedure was carried out using one of the scenarios from the several choices, and the resultant subset was then used as the primary data set in the subsequent LSM phases. Table 2 contains all potential scenarios based on FS and statistical techniques. For example, in Table 3, Test 1 was chosen to implement the LSM employing LR and RF in the study area. Model-11 was found to have the same features as other models (i.e., Model-07, 10, and 08) with different subsets. It should be emphasized that the LSMs were created using the “Test 1” data set.
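Of the three ranking tests, information gain is the simplest to show by hand: it is the reduction in the entropy of the landslide labels once a factor's classes are known. The sketch below is a generic illustration and not the LSM-Tool Pack implementation; the function names and the four-sample data set are hypothetical.

```python
import math
from collections import Counter

def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    total = len(labels)
    return -sum((count / total) * math.log2(count / total)
                for count in Counter(labels).values())

def information_gain(feature_values, labels):
    """Entropy of the labels minus their entropy after grouping by the feature."""
    total = len(labels)
    groups = {}
    for value, label in zip(feature_values, labels):
        groups.setdefault(value, []).append(label)
    conditional = sum(len(group) / total * entropy(group)
                      for group in groups.values())
    return entropy(labels) - conditional

# Hypothetical samples: 1 = landslide, 0 = non-landslide.
labels = [1, 1, 0, 0]
landcover = ["barren", "barren", "forest", "forest"]   # separates the classes
aspect = ["N", "S", "N", "S"]                          # unrelated to the labels

gain_landcover = information_gain(landcover, labels)
gain_aspect = information_gain(aspect, labels)
print(gain_landcover, gain_aspect)
```

A factor that separates landslide from non-landslide samples perfectly reaches the maximum gain (1 bit for balanced labels), while a factor unrelated to the labels scores zero, which is the basis for ranking LCFs by this measure.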

Table 2

Feature importances from the feature ranking algorithms along three road corridors

| Chi-squared, Neelum road | Chi-squared, Jhelum road | Chi-squared, Kohala road | Information gain, Neelum road | Information gain, Jhelum road | Information gain, Kohala road | RF importance, Neelum road | RF importance, Jhelum road | RF importance, Kohala road |
|---|---|---|---|---|---|---|---|---|
| Landcover 0.4072 | Road 0.4249 | Road 0.5863 | Landcover 0.3013 | Slope 0.2409 | Slope 0.2393 | Road 118.1157 | Slope 68.9311 | Slope 87.5124 |
| Road 0.3674 | Slope 0.4140 | Landcover 0.3989 | Slope 0.2960 | Road 0.2023 | Road 0.1998 | Landcover 115.8885 | Road 57.9292 | Road 84.6599 |
| Geology 0.3578 | Landcover 0.4048 | Slope 0.3767 | Geology 0.2553 | Landcover 0.1698 | Landcover 0.0942 | NDVI 96.5700 | Streams 47.0984 | Landcover 71.5810 |
| Slope 0.3205 | Elevation 0.3531 | Streams 0.3072 | Road 0.1398 | Elevation 0.1345 | Aspect 0.0641 | Slope 91.4137 | NDVI 44.9130 | Geology 71.5111 |
| NDVI 0.2495 | TWI 0.2711 | Geology 0.2327 | Elevation 0.0723 | TWI 0.0996 | NDVI 0.0544 | Curvature 74.7858 | Elevation 42.9974 | Faults 66.2489 |
| Elevation 0.2058 | NDVI 0.2526 | Elevation 0.1867 | NDVI 0.0704 | Aspect 0.0696 | Faults 0.0510 | Streams 65.3167 | Landcover 39.5287 | Streams 58.8007 |
| Streams 0.1685 | Geology 0.2481 | Aspect 0.1763 | TWI 0.0596 | NDVI 0.0342 | Geology 0.0421 | Geology 54.5901 | Geology 38.6589 | Aspect 49.9428 |
| Aspect 0.1656 | Streams 0.2023 | NDVI 0.1644 | Aspect 0.0532 | Geology 0.0302 | Streams 0.0166 | Elevation 48.0054 | Faults 38.2451 | NDVI 48.0325 |
| Curvature 0.1636 | Aspect 0.1823 | TWI 0.1550 | Streams 0.0315 | Streams 0.0213 | Curvature 0.0150 | Faults 40.9273 | Aspect 37.1037 | Elevation 46.3186 |
| Faults 0.1330 | Curvature 0.1120 | Curvature 0.1329 | Curvature 0.0244 | Faults 0.0202 | Elevation 0.0120 | TWI 40.5316 | TWI 25.4502 | TWI 26.4103 |
| TWI 0.0755 | Faults 0.0823 | Faults 0.1021 | Faults 0.0025 | Curvature 0.0084 | TWI 0.0112 | Aspect 40.4824 | Curvature 18.5610 | Curvature 12.1708 |

Table 3

Best feature subset size estimated by Chi-square, information gain, and random forest importance along selected road corridors of Muzaffarabad

Feature ranking methods Test no. Statistical tests used for subset selection The best subset size Model Features in the best subsets
Neelum road Jhelum valley road Kohala road Neelum road Jhelum valley road Kohala road Neelum road Jhelum valley road Kohala road
Chi-squared Test 1 F-Test 11 11 11 Model-11 Model-11 Model-11 Landcover, road, geology, slope, NDVI, elevation, streams, aspect, curvature, faults, TWI Road, slope, landcover elevation, TWI, NDVI, geology, streams, aspect, curvature, faults Road, landcover, slope, streams, geology, elevation, aspect, NDVI, TWI, curvature faults
Test 2 Kolmogorov Smirnov test 9 8 7 Model-9 Model-8 Model-7 Landcover, road, geology, slope, elevation, TWI, NDVI, Streams, faults Slope, road, faults, landcover, geology, TWI, streams, NDVI Landcover, road, slope, streams, geology, elevation, aspect
Test 3 One sample T-test 10 9 5 Model-10 Model-9 Model-5 Landcover, road, slope, geology, elevation, NDVI, aspect, faults, streams, TWI Road, slope, landcover, elevation, faults, geology, TWI, streams, aspect Road, slope, landcover, geology, streams
Test 4 Wilcoxon signed-rank test 8 6 6 Model-8 Model-6 Model-6 Landcover, slope, road, NDVI, geology, elevation, streams, faults Road, slope, faults, landcover, geology, elevation Landcover, road, slope, streams, geology, elevation
Information gain Test 5 F-Test 10 9 6 Model-10 Model-9 Model-6 Landcover, slope, geology, road, elevation, NDVI, TWI, aspect, streams, curvature Slope, road, landcover, elevation, geology, faults, NDVI, aspect, streams Slope, road, landcover, streams, aspect, geology
Test 6 Kolmogorov Smirnov test 8 9 7 Model-8 Model-9 Model-7 Landcover, slope, geology, elevation, faults, NDVI, TWI, aspect Slope, geology, road, faults, landcover, aspect, TWI, elevation, NDVI Slope, landcover, road, aspect, streams, elevation, geology
Test 7 One sample T-test 8 10 9 Model-8 Model-10 Model-9 Landcover, slope, geology, elevation, faults, NDVI, TWI, aspect Slope, road, geology, streams, faults, landcover, curvature, aspect, TWI, elevation Slope, landcover, road, aspect, streams, elevation, geology, NDVI, curvature
Test 8 Wilcoxon signed-rank test 5 7 6 Model-5 Model-7 Model-6 Landcover, slope, geology, elevation, faults Slope, road, geology, landcover, faults, TWI, aspect Slope, landcover, road, aspect streams, geology
RF-importance Test 9 F-Test 6 6 5 Model-6 Model-6 Model-5 Slope, road, landcover, geology, elevation, NDVI Slope, streams, NDVI, landcover, road, elevation Road, slope, landcover, elevation, geology
Test 10 Kolmogorov Smirnov test 10 9 6 Model-10 Model-9 Model-6 Slope, road, landcover, geology, aspect, NDVI, streams, TWI, faults, elevation Slope, streams, road, NDVI, landcover, elevation, road, geology, faults Slope, road, landcover, geology, faults, streams
Test 11 One sample T-test 4 6 4 Model-4 Model-6 Model-4 Slope, road, landcover, geology Slope, road, streams, NDVI, landcover, elevation Slope, road, landcover, geology
Test 12 Wilcoxon signed-rank test 5 4 10 Model-5 Model-4 Model-10 Slope, road, geology, elevation, landcover Slope, road, streams, landcover Slope, road, landcover, elevation, streams, faults, geology, curvature, TWI, NDVI, aspect

4.2 LCF’s contribution to LSM models

In LSM, two or more LCFs are often strongly correlated with one another because of their similar impact on landslide events in the area. Employing such variables together may generate ambiguous results and hence less reliable LSMs. To avoid this and to verify that no redundant variables were present before the data analysis, a multi-collinearity analysis based on Pearson’s correlation was performed along each road corridor (Figure 9). The Pearson correlation coefficients among the eleven LCFs range from −0.25 to 0.42 along the Neelum Valley road corridor, from −0.485 to 0.68 along Jhelum Valley road, and from −0.79 to 0.62 along Kohala road (Figure 9). The Pearson correlation results along each road corridor are less than the maximum threshold value of 0.7 specified by Allison [60]. The analysis indicates that lithology, distance to roads and faults, slope, and aspect along the Neelum valley road; slope, distance to faults, aspect, landcover, and TWI along Kohala road; and lithology, distance to roads and faults, slope, aspect, landcover, and NDVI along the remaining corridor are the least inter-correlated factors. In hilly terrain like Muzaffarabad, these factors may therefore be used as the most influential LCFs, as they have a significant impact on landslide events.
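A collinearity screen of this kind can be sketched as follows. The sketch assumes each LCF is sampled at the same points and flags pairs whose absolute Pearson coefficient reaches the 0.7 threshold; the sample values and function names are hypothetical.

```python
import math


def pearson(x, y):
    """Pearson correlation coefficient of two equally long numeric sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    covariance = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    spread_x = math.sqrt(sum((a - mean_x) ** 2 for a in x))
    spread_y = math.sqrt(sum((b - mean_y) ** 2 for b in y))
    return covariance / (spread_x * spread_y)


def collinear_pairs(factors, threshold=0.7):
    """Return (factor_a, factor_b, r) for every pair with |r| >= threshold."""
    names = list(factors)
    flagged = []
    for i, a in enumerate(names):
        for b in names[i + 1:]:
            r = pearson(factors[a], factors[b])
            if abs(r) >= threshold:
                flagged.append((a, b, round(r, 3)))
    return flagged


# Hypothetical values of three LCFs at five sample points.
samples = {
    "slope": [10, 20, 30, 40, 50],
    "twi": [12, 9, 8, 5, 3],
    "aspect": [90, 10, 300, 180, 45],
}
print(collinear_pairs(samples))  # only the slope/twi pair is flagged
```

Using the absolute value of the coefficient treats strong negative correlations the same way as strong positive ones.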

Figure 9 
                  LCFs analysis: (a) Pearson correlation along Neelum Valley road, (b) Pearson correlation along Jhelum Valley road, and (c) Pearson correlation along Kohala road.

The contribution of each LCF to an LSM differs with the predictive model applied. Comparing the normalized importance of the LCFs across susceptibility models therefore helps to establish their relative contribution in the study area. For this reason, the LCFs’ contributions to RF and LR were evaluated to prioritize the factors most responsible for landslide events in the study area (Figure 10). According to the LCF importance values, all 11 factors contribute significantly to the LSMs. The LR relevance analysis indicated that slope, NDVI, landcover, and road are the most influential causative factors along all the studied road corridors, whereas the RF analysis indicated that landcover, road, geology, and slope are the most influential LCFs along the Neelum valley road. Along Jhelum valley road, road, slope, landcover, and elevation are the most influential of the 11 causative factors, and along Kohala road, road, landcover, slope, and streams are the most prominent. The statistical analysis, in turn, indicates a positive relationship with landslide events for road networks, landcover, geology, NDVI, slope, TWI, and faults along the Neelum valley road; for road, slope, landcover, NDVI, aspect, geology, and faults along Jhelum valley road; and for curvature, road, landcover, elevation, geology, slope, and TWI along Kohala road.

Figure 10 
                  LCFs contribution in case of (a) LR and (b) RF.

The variations in the relative contributions of factors between the RF and LR models show that a specific LCF may contribute less in one model than in the other. Therefore, a contributory factor with little influence in one model cannot be ignored, because it may have a high impact in another. On the other hand, LCFs such as slope, road, and landcover have a strong influence in both the RF and LR models, which suggests that they show the least variation in the spatial probability of landslides across the two models.

4.3 WoE analysis

The weights and the relationship of the LCF classes with the training points (landslides) were computed at a 90% confidence level with a unit area of 0.5 km². Along Neelum valley road, the 41°–50° slope class has the maximum weight (0.879), whereas the >70° and 0°–10° classes have the lowest weights (0 and −2.709, respectively). For the slope factor, the 41°–50°, 31°–40°, and 21°–30° classes have the maximum weights along Neelum valley, Jhelum valley, and Kohala roads, respectively, indicating a positive relationship with the training points, i.e., landslides, whereas the minimum weights of −2.71, −0.45, and −2.11 were calculated for the 0°–10° class along the three roads, respectively. The slope aspect analysis reveals a maximum contrast of 0.79 (weight 0.67) for east-facing slopes along Neelum valley road, 0.69 (weight 0.52) for northeast-facing slopes along Jhelum valley road, and 0.36 (weight 0.90) for northwest-facing slopes along Kohala road. The minimum contrasts were −1.14 (weight −1.05) for northwest-facing slopes, −1.26 (weight −0.12) for west-facing slopes, and −2.00 (weight −1.88) for north-facing slopes along Neelum valley, Jhelum valley, and Kohala roads, respectively. The contrast values for the aspect classes, which range between −1.14 and 0.79 along Neelum valley road and between −2.00 and 0.88 along Kohala road, indicate a comparatively weak correlation between landslides and slope aspect compared with slope gradient along these roads. The elevation analysis shows that the highest contrast (1.16), with a weight of 0.79, is obtained for the 772–921 m class along Neelum valley road, whereas the lowest contrast and weight values are observed at elevations above 1521 m asl. Along Jhelum valley road, the 633–832 m elevation classes contain the most landslide points and hence have more influence than all other elevation classes. Along Kohala road, the maximum contrast and weight of elevation were observed in the 537–687 m class, with 36 landslide points and a weight of 0.68. The slope curvature analysis indicates that the maximum contrast values were observed for the concave class rather than the convex class along all these road corridors. Landcover is classified into five classes, and the analysis reveals high contrast values of 2.59 (weight 1.88), 1.57 (weight 1.16), and 1.51 (weight 1.12) for the barren land class along the investigated road corridors. However, low contrast values of −1.61 (weight −1.56) and −1.68 (weight −1.53) were calculated for water bodies along Neelum valley and Kohala roads, respectively, and −1.27 (weight −1.14) for urban land along Jhelum valley road. Based on these results, landcover is considered the main LCF for landslides along the study area road corridors.

The weights for the distance to road network factor show that the maximum contrast values along all the investigated roads were observed within the 50 m buffer zone, while the lowest contrast and weight were observed in the >200 m buffer zone. The weights computed for the 0–50 m class are 2.592, 2.04, and 2.80, whereas the minimum weights for the >200 m class are −0.52, −0.36, and −0.74 along Neelum valley, Jhelum valley, and Kohala roads, respectively. This analysis indicates a strong association between landslides and road networks along the study area road corridors. For distance to faults, the highest contrast and weight values were observed in the >500 m class along Neelum valley and Jhelum valley roads, while the highest weight was observed within the 100 m buffer zone along Kohala road. This suggests that faults have considerably less impact on the landslide triggering mechanism along Neelum and Jhelum valley roads but some influence along Kohala road.

The distance to stream weights show that the 50–100 m class has the maximum contrast (0.87) and the highest weight (0.70) along Neelum valley road, whereas the highest weights of 0.44 and 0.81 were observed within the 50 m buffer zone along Jhelum valley and Kohala roads, respectively. The analysis of geological units indicates that the maximum contrast (1.67), with the maximum weight (1.51), was derived for the Muzaffarabad Formation along Neelum valley road, while the lowest contrast and weight were observed for the Nauseri granite gneiss, stream channel deposits, Paleocene-Eocene sequence, Tanol Formation, and Panjal sequence. Along Jhelum valley road, the highest contrast (0.88), with the maximum weight (0.42), was observed for the Murree Formation, while surficial deposits had the minimum contrast. Along Kohala road, the Nagri Formation had the maximum contrast (0.77) and the highest weight (0.52).

The TWI weight analysis showed that the highest contrast and weight values were observed for the 7–10 class, while the lowest were observed for the >13 class along all the investigated road corridors. These outcomes indicate that TWI is a comparatively weak influential factor along these road corridors. For NDVI, the highest contrasts of 1.95 (weight 1.28) and 1.13 (weight 0.97) were derived for the 0.09–0.2 class, and the lowest contrasts of −1.91 (weight −1.68) and −1.20 (weight −0.93) for the 0.3–1 class, along Neelum valley and Kohala roads, respectively, whereas along Jhelum valley road the highest contrast (1.25, weight 0.89) was observed for the 0.5–1 class and the minimum contrast (−0.83, weight −0.54) for the 0–0.5 class.
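The weights and contrast values reported in this section follow the usual weights-of-evidence definitions, which can be sketched as below. The sketch assumes the standard formulation in which W+ and W− compare the share of landslide units inside and outside a factor class with the corresponding share of non-landslide units, and the contrast is C = W+ − W−. The counts are hypothetical, and classes with zero counts would need separate handling.

```python
import math


def weights_of_evidence(class_slides, class_units, total_slides, total_units):
    """Return (W+, W-, contrast) for one factor class from unit-cell counts."""
    n1 = class_slides                       # landslide units inside the class
    n2 = total_slides - class_slides        # landslide units outside the class
    n3 = class_units - class_slides         # non-landslide units inside the class
    n4 = (total_units - total_slides) - n3  # non-landslide units outside the class
    w_plus = math.log((n1 / (n1 + n2)) / (n3 / (n3 + n4)))
    w_minus = math.log((n2 / (n1 + n2)) / (n4 / (n3 + n4)))
    return w_plus, w_minus, w_plus - w_minus


# Hypothetical class: 200 of 1000 units, containing 50 of 100 landslide units.
w_plus, w_minus, contrast = weights_of_evidence(50, 200, 100, 1000)
print(round(w_plus, 4), round(w_minus, 4), round(contrast, 4))
```

A positive contrast indicates that landslides are over-represented in the class, and a negative contrast indicates that they are under-represented.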

4.4 Calculate response for WoE

The selected causative factors were integrated with their weight tables through the Arc-SDM Calculate Response tool to produce a continuous-scale posterior probability map. The generated susceptibility maps are categorized into low, moderate, high, and very high susceptibility zones (Figures 11b, 12b and 13b). These maps are therefore classified through relative ranking instead of absolute posterior probability values. In the present work, relative ranking is employed rather than conditional independence (CI) testing. Porwal [61] discussed the doubts concerning the reliability of maps affected by the absence of CI and concluded that these causative factors represent the model’s functioning. Without CI, the model’s functioning is not severely degraded, particularly when the susceptibility maps are classified through relative ranking.
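The combination step can be sketched with the standard weights-of-evidence update, in which the posterior logit equals the prior logit plus the sum of the weights of the classes present in a cell. This is a simplified illustration rather than the Arc-SDM implementation; the prior probability is a hypothetical value, while the two weights are the Neelum valley road values quoted above for the 41°–50° slope class and the 0–50 m road buffer.

```python
import math


def posterior_probability(prior, weights):
    """Posterior probability from a prior and the weights of the classes present."""
    logit = math.log(prior / (1.0 - prior)) + sum(weights)
    return 1.0 / (1.0 + math.exp(-logit))


prior = 0.05            # hypothetical prior probability of a landslide per unit cell
cell_weights = [0.879, 2.592]  # slope 41°–50° and road 0–50 m (Neelum valley road)
print(posterior_probability(prior, cell_weights))
```

Because the weights are simply added, correlated factors inflate the posterior values, which is one reason to interpret the map through relative ranking rather than absolute probabilities.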

Figure 11 
                  Landslide susceptibility maps of the Neelum Valley road: (a) AHP model, (b) WoE model, (c) LR model, and (d) RF model.

Figure 12 
                  Landslide susceptibility maps of the Jhelum Valley road: (a) AHP model, (b) WoE model, (c) LR model, and (d) RF model.

Figure 13 
                  Landslide susceptibility maps of the Kohala road: (a) AHP model, (b) WoE model, (c) LR model, and (d) RF model.

4.5 AHP analysis

In the present study, the ranks of the selected LCFs were assigned on the basis of expert opinion, and, drawing on comprehensive field experience, weights were assigned to each parameter class. The PCM of the studied factors, their respective weights, and the CR values along the study area road corridors are presented in Table 4.

Table 4

AHP Pairwise comparison matrix with factor weights of the causative factors selected for this study

The weights of the 11 LCFs and the CR (0.09) were calculated through the AHP method along the study area road corridors. According to the AHP analysis, distance to roads, streams, and faults, landcover, and lithology, with weights of 0.170, 0.1281, 0.125, 0.124, and 0.114, respectively, are the most influential factors for landslide occurrence along Neelum valley road, whereas slope, elevation, and NDVI are moderately significant factors with weights of 0.105, 0.086, and 0.084, respectively. Finally, TWI, aspect, and curvature are the least influential LCFs, with weights of 0.042, 0.039, and 0.027, respectively (Table 4). The internal weights of the classes of the selected causative factors were then calculated using the scale proposed by Saaty [51]. The relative importance of the classes of the most influential causative factors, that is, distance to roads, streams, and faults, landcover, and geology, is based on the landslide concentration in each class. For distance to roads, the landslide occurrence probability is high in the 0–150 m classes. For distance to streams, the 0–200 m classes are the most influential for landslides. For distance to faults, the 0–150 m classes correlate significantly with landslides. For landcover, barren and urban land are significant for landslide occurrence. For geological units, the Muzaffarabad, Hazara, and Murree Formations are more prone to landslides than the other geological units along Neelum valley road, the Murree and Kamlial Formations along Jhelum valley road, and the Murree and Nagri Formations along Kohala road. For the remaining factors, there is no well-defined relationship with landslides because of the low contrast among their classes, so they contribute least to the landslide distribution along these roads. The procedure integrates the several LCFs into a single LSI based on a weighted linear sum.
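How factor weights and a consistency ratio follow from a pairwise comparison matrix can be sketched as below. The sketch approximates the principal eigenvector by power iteration and uses Saaty's commonly tabulated random index values; the 3 × 3 matrix is a hypothetical, perfectly consistent example and not the matrix of Table 4.

```python
# Saaty's random consistency index by matrix size, as commonly tabulated.
RANDOM_INDEX = {3: 0.58, 4: 0.90, 5: 1.12, 6: 1.24, 7: 1.32,
                8: 1.41, 9: 1.45, 10: 1.49, 11: 1.51}


def ahp_weights(matrix, iterations=100):
    """Return (weights, consistency_ratio) for a reciprocal pairwise matrix."""
    n = len(matrix)
    weights = [1.0 / n] * n
    for _ in range(iterations):  # power iteration towards the principal eigenvector
        product = [sum(matrix[i][j] * weights[j] for j in range(n)) for i in range(n)]
        total = sum(product)
        weights = [value / total for value in product]
    product = [sum(matrix[i][j] * weights[j] for j in range(n)) for i in range(n)]
    lambda_max = sum(product[i] / weights[i] for i in range(n)) / n
    consistency_index = (lambda_max - n) / (n - 1)
    return weights, consistency_index / RANDOM_INDEX[n]


# Hypothetical judgments: factor A is twice as important as B and four times as C.
pairwise = [
    [1.0, 2.0, 4.0],
    [0.5, 1.0, 2.0],
    [0.25, 0.5, 1.0],
]
weights, cr = ahp_weights(pairwise)
print([round(w, 3) for w in weights])  # [0.571, 0.286, 0.143]
```

A CR below 0.1 is conventionally taken as acceptable consistency, which the reported CR of 0.09 satisfies.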

4.6 Calculate response for AHP

The final LSMs along the study area road corridors were produced using the AHP technique (Figures 11a, 12a and 13a). These maps were then reclassified through the natural breaks classification method into four susceptibility zones, i.e., low, moderate, high, and very high [62]. Susceptibility along the study area road corridors is high to very high within the 100 m buffer zone of the roads. This zone is mainly characterized by major road cuts, river undercutting, urban land, and fragile geology with fault-controlled lithology. Moderately susceptible zones are widely distributed within the 500 m buffer zone along the roads, and the low-susceptibility zones are mainly found beyond the 500 m buffer. Along Neelum valley road, the low susceptibility zone covers 33.94% of the total area, the moderate zone 33.82%, the high zone 22.82%, and the very high zone 9.42%. This analysis showed that the landslide density in these susceptible zones does not enhance the degree of susceptibility. Finally, the LSI map of each study area road corridor was developed by integrating the investigated causative factors through their weighted linear sum, using the following equation.

LSI (AHP) = 0.170 × distance to roads + 0.128 × distance to streams + 0.125 × distance to faults + 0.124 × landcover + 0.114 × lithological units + 0.105 × slope + 0.086 × elevation + 0.042 × TWI + 0.039 × aspect + 0.034 × NDVI + 0.027 × curvature (08).
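The weighted linear sum can be evaluated per map cell as in the following minimal sketch. The factor weights are copied from the equation above; the class ratings of the example cell are hypothetical values scaled to 0–1, since the rating scale is not restated here.

```python
# Factor weights as they appear in the LSI (AHP) equation.
AHP_WEIGHTS = {
    "distance_to_roads": 0.170, "distance_to_streams": 0.128,
    "distance_to_faults": 0.125, "landcover": 0.124,
    "lithological_units": 0.114, "slope": 0.105, "elevation": 0.086,
    "twi": 0.042, "aspect": 0.039, "ndvi": 0.034, "curvature": 0.027,
}


def landslide_susceptibility_index(ratings, weights=AHP_WEIGHTS):
    """Weighted linear sum of the class ratings of one map cell."""
    return sum(weights[name] * ratings[name] for name in weights)


# Hypothetical cell: close to a road on barren land, otherwise average ratings.
cell = {name: 0.5 for name in AHP_WEIGHTS}
cell["distance_to_roads"] = 1.0
cell["landcover"] = 1.0
print(round(landslide_susceptibility_index(cell), 4))
```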

4.7 LSM using LR

LR is a typical approach for analyzing the dependence of a dichotomous variable, that is, 1 for landslide presence and 0 for landslide absence, on a collection of explanatory variables. Table 5 shows the regression coefficients and statistical test results, i.e., the estimates, standard errors, z-scores, and p-values, for the eleven variables along the three road corridors. The sign of each estimated coefficient indicates the effect of the factor on landslide occurrence: a positive coefficient implies a higher likelihood of landslides, whereas a negative coefficient indicates a lower likelihood. Along Neelum road, the coefficients of road networks, landcover, geology, NDVI, slope, TWI, and faults are all positive, indicating a positive relationship between these LCFs and landslide incidence, whereas elevation, streams, aspect, and curvature have a negative relationship with landslide incidence. Along Jhelum Valley road, road, slope, landcover, NDVI, aspect, geology, and faults are positively associated with landslide locations, while elevation, TWI, curvature, and streams are negatively associated. Along Kohala road, curvature, road, landcover, elevation, geology, slope, and TWI are positively associated with landslide incidence, whereas NDVI, aspect, streams, and faults are negatively associated. Furthermore, the LR model’s statistical significance results (i.e., p-values) demonstrate that all LCFs have a significant impact along the three road sections. The LSMs obtained from LR along the three road sections are presented in Figures 11c, 12c and 13c.

Table 5

Statistical test results for LR along three road corridors of Muzaffarabad district

Estimates, standard errors, z values, and p-values, i.e., Pr(>|z|), are listed per road corridor.

| Factor | Estimate, Neelum | Estimate, Jhelum | Estimate, Kohala | Std. error, Neelum | Std. error, Jhelum | Std. error, Kohala | z value, Neelum | z value, Jhelum | z value, Kohala | p, Neelum | p, Jhelum | p, Kohala |
|---|---|---|---|---|---|---|---|---|---|---|---|---|
| Intercept | 0.89590 | −2.81331 | 0.46108 | 0.40333 | 1.72287 | 0.93379 | 2.221 | −1.633 | 0.494 | 0.026335*** | 0.10248*** | 0.62147*** |
| Road | 0.27705 | 0.81301 | 1.55451 | 0.03123 | 0.26100 | 0.10565 | 8.872 | −3.115 | −14.714 | <2 × 10^−16*** | 0.00184** | <2 × 10^−16*** |
| Slope | 2.11496 | 1.65586 | 2.42450 | 0.05417 | 0.21022 | 0.14514 | 39.040 | 7.877 | 16.704 | <2 × 10^−16*** | 3.36 × 10^−15*** | <2 × 10^−16*** |
| Landcover | 0.42189 | 0.49786 | 0.69626 | 0.04055 | 0.16357 | 0.07839 | −10.405 | −3.044 | 8.882 | <2 × 10^−16*** | 0.00234** | <2 × 10^−16*** |
| Elevation | −0.35795 | −0.18901 | 0.04149 | 0.04714 | 0.19231 | 0.11619 | −7.593 | −0.983 | 0.357 | 3.13 × 10^−14*** | 0.32567* | 0.72106* |
| TWI | 0.14845 | −0.33002 | 0.40416 | 0.05377 | 0.22177 | 0.12964 | 2.761 | −1.488 | 3.118 | 0.005761** | 0.13671* | 0.00182** |
| NDVI | 2.20432 | 1.97226 | −0.92109 | 0.06655 | 0.26714 | 0.18064 | 33.123 | 7.383 | −5.099 | <2 × 10^−16*** | 1.55 × 10^−13*** | 3.42 × 10^−07*** |
| Geology | 0.10201 | 0.38978 | 1.35418 | 0.02782 | 0.15481 | 0.14186 | −3.667 | −2.518 | 9.546 | 0.000246*** | 0.01181* | <2 × 10^−16*** |
| Streams | −0.05610 | −0.64162 | −0.35465 | 0.03063 | 0.11938 | 0.07514 | −1.832 | −5.374 | −4.720 | 0.067007* | 7.68 × 10^−08*** | 2.36 × 10^−06*** |
| Aspect | −0.51037 | 0.10234 | −0.20285 | 0.02074 | 0.06386 | 0.05155 | −24.612 | 1.602 | −3.935 | <2 × 10^−16*** | 0.10905* | 8.31 × 10^−05*** |
| Curvature | −0.17730 | −0.76639 | 0.26358 | 0.08551 | 0.34709 | 0.18505 | −2.074 | −2.208 | 1.424 | 0.038125* | 0.02724* | 0.15433* |
| Faults | 0.39959 | 0.94557 | −0.22032 | 0.02973 | 0.15630 | 0.04658 | 13.441 | 6.050 | −4.730 | <2 × 10^−16*** | 1.45 × 10^−09*** | 2.25 × 10^−06*** |

Significance codes: 0, ‘***’; 0.001, ‘**’; 0.01, ‘*’; 0.05, ‘.’; 0.1, ‘ ’ 1.
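How the coefficients translate into a landslide probability can be sketched with the logistic function. The sketch uses the Neelum road estimates from Table 5; the input factor values are hypothetical, because the scaling of the LCFs used to fit the model is not given in this section, so the resulting probabilities are illustrative only.

```python
import math

# Neelum road estimates copied from Table 5.
NEELUM_INTERCEPT = 0.89590
NEELUM_COEFFICIENTS = {
    "road": 0.27705, "slope": 2.11496, "landcover": 0.42189,
    "elevation": -0.35795, "twi": 0.14845, "ndvi": 2.20432,
    "geology": 0.10201, "streams": -0.05610, "aspect": -0.51037,
    "curvature": -0.17730, "faults": 0.39959,
}


def landslide_probability(factors, coefficients=NEELUM_COEFFICIENTS,
                          intercept=NEELUM_INTERCEPT):
    """Logistic response for one cell; factors missing from the input count as 0."""
    z = intercept + sum(coefficients[name] * factors.get(name, 0.0)
                        for name in coefficients)
    return 1.0 / (1.0 + math.exp(-z))


baseline = landslide_probability({})
steeper = landslide_probability({"slope": 1.0})   # positive coefficient raises p
rotated = landslide_probability({"aspect": 1.0})  # negative coefficient lowers p
print(round(baseline, 3), round(steeper, 3), round(rotated, 3))
```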

4.8 LSM using RF

The RF ensemble learning technique, one of the most popular and frequently used ML methods, was executed with the LSM-Tool Pack. To run the RF module, the user must first choose three parameters, ntree, mtry, and nodesize, so as to reduce the OOB error and achieve acceptable model efficiency. In this study, the RF model’s parameters were 250 for ntree, 4 for mtry, and 5 for nodesize. IncNodePurity (i.e., Mean Decrease Gini) is an optional output that the user can obtain. This feature importance metric is based on the Gini impurity index used to calculate tree splits. Table 6 shows the feature importance results of the RF algorithm used in this study. Based on the IncNodePurity values, landcover, road, geology, slope, and NDVI were the most important LCFs along Neelum road.

Table 6

The feature importance measurement results along three roads by the RF algorithm

| Neelum road | Inc node purity | Jhelum road | Inc node purity | Kohala road | Inc node purity |
|---|---|---|---|---|---|
| Landcover | 934.6311 | Road | 554.9808 | Road | 603.6348 |
| Road | 751.7451 | Slope | 521.6383 | Landcover | 533.9662 |
| Geology | 697.9192 | Landcover | 419.1526 | Slope | 473.6515 |
| Slope | 558.8423 | Elevation | 415.6949 | Streams | 455.6965 |
| NDVI | 471.5872 | TWI | 314.1176 | Geology | 435.9214 |
| Elevation | 348.9673 | NDVI | 311.1932 | Elevation | 330.1840 |
| Streams | 287.7055 | Geology | 210.9038 | Aspect | 328.9769 |
| Aspect | 250.0781 | Streams | 119.8353 | NDVI | 322.6915 |
| Curvature | 197.7780 | Aspect | 115.6166 | TWI | 218.4110 |
| Faults | 173.0320 | Curvature | 59.4003 | Curvature | 124.6412 |
| TWI | 130.04203 | Faults | 47.6296 | Faults | 91.6996 |

Moreover, aspect, curvature, faults, and TWI were the least important LCFs along Neelum road. Along Jhelum valley road, distance to road, slope, landcover, and elevation were the most important LCFs, while streams, aspect, curvature, and faults had the weakest relationship with landslide incidents. Along Kohala road, road, landcover, slope, streams, and geology have the highest importance, while NDVI, TWI, curvature, and faults have the lowest. After running the Random Forest module with the selected parameters, LSMs were obtained along the three studied road corridors (Figures 11d, 12d and 13d).

4.9 Validation of the generated models

The receiver operating characteristic (ROC) curve was obtained for model validation, and the AUC was then calculated. The AUC indicates the ability of a model to predict landslide events reliably [63], and the ROC curve, one of the most important assessment criteria, also allows a visual comparison. AUC has a maximum value of 1.0; an AUC of 0.9–1 is regarded as excellent, 0.8–0.9 as good, 0.7–0.8 as medium, 0.6–0.7 as sufficient, 0.5–0.6 as poor, and less than 0.5 as unsatisfactory. The AUC values for the WoE and AHP models were 0.863 and 0.826, respectively, along Neelum valley road, 0.831 and 0.812 along Jhelum valley road, and 0.886 and 0.882 along Kohala road (Figure 14a–c). The LSM-Tool Pack can also output the ROC curves of the LR and RF models together with their AUC values. The AUC of the LR model was 0.91, 0.93, and 0.89 along Neelum, Jhelum, and Kohala roads, respectively, indicating good to excellent predictive power. The RF model has a high predictive capability, with an AUC of 0.97, 0.95, and 0.92 along Neelum, Jhelum, and Kohala roads, respectively (Figure 14a–c). These findings indicate that ML (particularly the RF model) is more accurate for LSM than the data-driven and knowledge-driven models along these roads.
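The AUC and the qualitative grades above can be sketched as follows. The AUC is computed here through its rank interpretation, i.e., the probability that a randomly chosen landslide cell receives a higher score than a randomly chosen non-landslide cell, which equals the area under the ROC curve. The sample scores are hypothetical, and treating the lower bound of each grade as inclusive is an assumption.

```python
def roc_auc(labels, scores):
    """AUC as the share of (landslide, non-landslide) pairs ranked correctly."""
    positives = [s for y, s in zip(labels, scores) if y == 1]
    negatives = [s for y, s in zip(labels, scores) if y == 0]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0
               for p in positives for n in negatives)
    return wins / (len(positives) * len(negatives))


def auc_grade(auc):
    """Qualitative grade of an AUC value according to the scale quoted above."""
    for lower_bound, grade in [(0.9, "excellent"), (0.8, "good"), (0.7, "medium"),
                               (0.6, "sufficient"), (0.5, "poor")]:
        if auc >= lower_bound:
            return grade
    return "unsatisfactory"


# Hypothetical validation points: 1 = landslide, 0 = non-landslide.
labels = [1, 1, 0, 0]
scores = [0.9, 0.4, 0.5, 0.1]
print(roc_auc(labels, scores))             # 0.75
print(auc_grade(0.97), auc_grade(0.863))   # excellent good
```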

Figure 14 
                  RF, LR, WoE, and AHP generated susceptibility maps performance of the study area road corridors based on AUC-ROC: (a) Neelum Valley road, (b) Jhelum Valley road, and (c) Kohala road.

4.10 Accuracy assessment and comparison of LSMs

The LSMs along the study area road corridors were generated using four GIS-based approaches, i.e., WoE, AHP, LR, and RF. Several performance measures were computed for the LR and RF techniques (Table 7). For the LR model along Neelum road, the values of accuracy, AUC-classified (AUC-C), AUC-non-classified (AUC-NC), MAE, RMSE, Kappa, precision, recall, and F1 were 0.7631, 0.7613, 0.9101, 0.2368, 0.4867, 0.5221, 0.7352, 0.7430, and 0.7390, respectively. Along Jhelum Valley road, the corresponding LR values were 0.9153, 0.7942, 0.9309, 0.0846, 0.2909, 0.5131, 0.7862, 0.6492, and 0.8047, respectively, and along Kohala road, they were 0.7734, 0.7942, 0.8923, 0.2265, 0.4759, 0.5079, 0.7483, 0.8043, and 0.8016, respectively.

Table 7

Performance results in terms of all metrics for best models along all the selected road corridors of Muzaffarabad

| Road corridor | Model | Accuracy | AUC-C | AUC-NC | MAE | RMSE | Kappa | Precision | Recall | F1 |
|---|---|---|---|---|---|---|---|---|---|---|
| Neelum | RF | 0.9127 | 0.9079 | 0.9715 | 0.0872 | 0.2953 | 0.7224 | 0.9429 | 0.8587 | 0.8988 |
| Neelum | LR | 0.7631 | 0.7613 | 0.9101 | 0.2368 | 0.4867 | 0.5221 | 0.7352 | 0.7430 | 0.7390 |
| Jhelum valley | RF | 0.9351 | 0.8176 | 0.9512 | 0.0648 | 0.2546 | 0.6076 | 0.8945 | 0.6769 | 0.8348 |
| Jhelum valley | LR | 0.9153 | 0.7942 | 0.9309 | 0.0846 | 0.2909 | 0.5131 | 0.7862 | 0.6492 | 0.8047 |
| Kohala | RF | 0.7765 | 0.8041 | 0.9211 | 0.2234 | 0.4726 | 0.5139 | 0.8334 | 0.8394 | 0.8162 |
| Kohala | LR | 0.7734 | 0.7942 | 0.8923 | 0.2265 | 0.4759 | 0.5079 | 0.7483 | 0.8043 | 0.8016 |

For the RF model, accuracy, AUC-C, AUC-NC, MAE, RMSE, Kappa, precision, recall, and F1 were 0.9127, 0.9079, 0.9715, 0.0872, 0.2953, 0.7224, 0.9429, 0.8587, and 0.8988, respectively, along Neelum road; 0.9351, 0.8176, 0.9512, 0.0648, 0.2546, 0.6076, 0.8945, 0.6769, and 0.8348, respectively, along Jhelum Valley road; and 0.7765, 0.8041, 0.9211, 0.2234, 0.4726, 0.5139, 0.8334, 0.8394, and 0.8162, respectively, along Kohala road. The high recall value of RF models shows that the RF algorithm is superior at identifying non-landslide regions. As a result, the RF approach is more effective than the LR model. The LR model’s RMSE and MAE were also higher than those of RF along all the studied road corridors, so the RF model, with the minimum MAE and RMSE values, performed better. In terms of the confusion matrix and the accuracy metrics between known and predicted class values, the RF method using optimal factor subsets outperformed the LR model.
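The confusion-matrix metrics listed in Table 7 can be sketched as below. The sketch assumes hard 0/1 predictions with landslide as the positive class, in which case the MAE equals the misclassification rate and the RMSE is its square root. The values in Table 7 are numerically consistent with this (for example, 1 − 0.9127 ≈ 0.0872 and √0.0872 ≈ 0.2953), although the source does not state how they were computed. The sample labels are hypothetical.

```python
import math


def binary_metrics(y_true, y_pred):
    """Accuracy, precision, recall, F1, Cohen's kappa, MAE, and RMSE for 0/1 labels."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    n = len(pairs)
    accuracy = (tp + tn) / n
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    f1 = 2 * precision * recall / (precision + recall)
    # Agreement expected by chance, from the row and column totals.
    expected = ((tp + fp) * (tp + fn) + (tn + fn) * (tn + fp)) / n ** 2
    kappa = (accuracy - expected) / (1 - expected)
    mae = (fp + fn) / n  # mean absolute error of hard 0/1 predictions
    return {"accuracy": accuracy, "precision": precision, "recall": recall,
            "f1": f1, "kappa": kappa, "mae": mae, "rmse": math.sqrt(mae)}


# Hypothetical validation set: 1 = landslide, 0 = non-landslide.
y_true = [1, 1, 1, 1, 0, 0, 0, 0]
y_pred = [1, 1, 1, 0, 0, 0, 0, 1]
metrics = binary_metrics(y_true, y_pred)
print(metrics["accuracy"], metrics["kappa"], metrics["rmse"])  # 0.75 0.5 0.5
```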

The area and percentage distribution of the susceptibility classes along the studied road corridors were calculated for all applied LSMs. To check the accuracy of the AHP-, WoE-, LR-, and RF-based susceptibility maps, they were compared against the landslide inventory generated along these roads, and field verifications were carried out. Finally, the LSMs were produced and compared. The distribution of the validation data was checked against the landslide susceptibility zones. Because each LSM is a continuous surface of numerical values, these values must be split into relative susceptibility classes [49].

Analysis of the WoE-based LSMs of the study area road corridors reveals that 39.18, 37.09, and 29.39% of the total area are low susceptible zones, 27.25, 15.91, and 23.21% are moderate susceptible zones, and 33.56, 47.01, and 47.40% are high and very high landslide-susceptible zones along Neelum, Jhelum Valley, and Kohala roads, respectively (Figure 14).

The LSM produced through the AHP approach indicated that 33.94, 28.76, and 20.23% of the total area are categorized as low landslide susceptible zones, 33.82, 27.65, and 31.06% as moderate susceptible zones, and 32.24, 43.58, and 48.71% as high and very high landslide-susceptible zones along Neelum, Jhelum Valley, and Kohala roads, respectively (Figure 15).

Using the Quantile Classifier in ArcGIS software, the LR and RF LSMs were divided into four classes: low, moderate, high, and very high. Along Neelum road, the LR susceptibility map places 22.26% of the area in the low, 25.02% in the moderate, 27.54% in the high, and 25.18% in the very high susceptible zone. Along Jhelum Valley and Kohala roads, the LR-based LSM categorizes 43.27 and 22.92% of the total area as low susceptible zones, 22.97 and 35.05% as moderate susceptible zones, and 33.76 and 41.03% as high and very high susceptible zones, respectively.
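The quantile classification of a continuous susceptibility surface into four classes can be sketched as follows. This is an illustrative approximation using Python's standard library, not the ArcGIS implementation, whose exact break-point rules may differ; the function names and the sample scores are hypothetical.

```python
import statistics

LABELS = ("low", "moderate", "high", "very high")


def quantile_breaks(values, n_classes=4):
    """Return the n_classes - 1 cut points that give equal-count classes."""
    return statistics.quantiles(values, n=n_classes, method="inclusive")


def classify(value, breaks, labels=LABELS):
    """Assign a susceptibility label by comparing a score to the cut points."""
    for cut, label in zip(breaks, labels):
        if value <= cut:
            return label
    return labels[-1]


# Hypothetical susceptibility scores for twelve map cells.
scores = [0.05, 0.10, 0.20, 0.30, 0.45, 0.55,
          0.70, 0.80, 0.90, 0.95, 0.15, 0.60]
breaks = quantile_breaks(scores)
classes = [classify(s, breaks) for s in scores]

for label in LABELS:
    share = 100 * classes.count(label) / len(classes)
    print(f"{label}: {share:.1f}% of cells")
```

By construction, a quantile scheme places roughly the same number of cells in each class, which is a design choice worth keeping in mind when comparing class areas between methods that use different classification rules.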

Figure 15

Comparison of susceptibility zones for RF, LR, AHP, and WoE along Neelum Valley road, Jhelum Valley road, and Kohala road corridor.

The LSM produced through the RF approach indicated that 30.94, 38.62, and 40.42% of the total area are categorized as low landslide susceptible zones, 19.85, 20.81, and 20.99% as moderate susceptible zones, and 49.21, 40.57, and 38.59% as high and very high landslide-susceptible zones along Neelum, Jhelum Valley, and Kohala roads, respectively. The AUC–ROC curves created for the AHP, WoE, RF, and LR methods are illustrated in Figure 13. The RF model produced the best prediction results, with AUC scores of 0.9715, 0.9512, and 0.9211 for Neelum, Jhelum Valley, and Kohala roads, respectively. The AUC results indicate that the RF and LR models predict landslide susceptibility more accurately than the AHP- and WoE-based models in the studied road sections. The order of accuracies is RF > LR > WoE > AHP along all the road sections. However, in general, the prediction accuracy of all the models along each road is reasonable.
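The AUC values compared above can be understood through the rank interpretation of the ROC curve: the AUC is the probability that a randomly chosen landslide cell receives a higher susceptibility score than a randomly chosen non-landslide cell. The sketch below assumes this standard definition; the function name `roc_auc` and the sample data are hypothetical, and the pairwise loop is chosen for clarity rather than speed.

```python
def roc_auc(labels, scores):
    """Area under the ROC curve via pairwise comparison of scores.

    labels: 1 for landslide cells, 0 for non-landslide cells.
    scores: continuous susceptibility values from any model.
    """
    positives = [s for y, s in zip(labels, scores) if y == 1]
    negatives = [s for y, s in zip(labels, scores) if y == 0]
    wins = 0.0
    for p in positives:
        for q in negatives:
            if p > q:
                wins += 1.0   # landslide cell ranked above stable cell
            elif p == q:
                wins += 0.5   # ties count as half
    return wins / (len(positives) * len(negatives))


# Hypothetical validation sample (not data from the study).
labels = [1, 1, 1, 0, 0, 0, 1, 0]
scores = [0.9, 0.8, 0.4, 0.5, 0.3, 0.2, 0.7, 0.1]
auc = roc_auc(labels, scores)
print(f"AUC = {auc:.4f}")
```

An AUC of 0.5 corresponds to random ranking and 1.0 to perfect separation, which is why values such as 0.97 for RF indicate that almost every landslide cell is ranked above almost every stable cell.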

5 Discussion

Hazard mitigation and mapping are built on the foundation of LSMs. As a result, a more precise and trustworthy LSM can minimize the cost and damage caused by natural catastrophes like landslides. In the present study, expert-opinion-based AHP, data-driven WoE, and ML (LR and RF) approaches were used for a comparative analysis of LSM along the main road corridors of district Muzaffarabad. Across the RF, LR, WoE, and AHP approaches, the common and most critical LCFs among the eleven selected factors are the distance from roads, landcover, slope, geology, and NDVI. The WoE weight values for the distance to roads factor are 2.59, 2.04, and 2.80 along Neelum Valley, Jhelum Valley, and Kohala roads, respectively, for the 0–50 m class. The second most crucial factor is land cover, with weight values of 1.88, 1.16, and 1.12 along Neelum Valley, Jhelum Valley, and Kohala roads, respectively, for the barren land class. The next most influential factor is lithology, with weight values of 1.51 for the Muzaffarabad Formation along Neelum Valley road, 0.42 for the Murree Formation along Jhelum Valley road, and 0.52 for the Nagri Formation along Kohala road. NDVI follows, with contrast values of 1.95 and 1.13 for the 0.09–0.2 class along Neelum Valley and Kohala roads, respectively, and a contrast value of 1.25 for the 0.5–1 class along Jhelum Valley road.

Furthermore, the NDVI contributes significantly to landslide occurrences in the research area, with weight values of 1.28, 0.89, and 0.97 along Neelum, Jhelum Valley, and Kohala roads, respectively. Slope gradient is also an influential factor for landslides in the study area, with a weight value of 0.88 in the 41°–50° class along Neelum Valley road, 1.61 in the 51°–60° class along Jhelum Valley road, and 0.52 in the 21°–30° class along Kohala road. The remaining factors have a comparatively low impact on landslide susceptibility along these road corridors. These observations align with Riaz et al. [5] and Ahmed et al. [19]. Kamp et al. [64] produced the AHP-based LSM for the 2005 Kashmir earthquake-affected areas and considered lithology a crucial LCF along with the slope gradient and faults of the area.
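The WoE weights and contrast values quoted in this discussion can be illustrated with a short calculation. The sketch assumes the standard weights-of-evidence formulation (positive weight W+, negative weight W−, and contrast C = W+ − W−) as commonly attributed to Bonham-Carter [46]; whether the study applied exactly this form is an assumption, and the function name and pixel counts are hypothetical.

```python
import math


def weights_of_evidence(ls_in, ls_out, stable_in, stable_out):
    """Return (W+, W-, contrast) for one class of a causative factor.

    ls_in / ls_out: landslide pixels inside / outside the factor class.
    stable_in / stable_out: non-landslide pixels inside / outside the class.
    """
    # Probability of being in the class, given landslide or no landslide.
    p_class_given_ls = ls_in / (ls_in + ls_out)
    p_class_given_stable = stable_in / (stable_in + stable_out)

    w_plus = math.log(p_class_given_ls / p_class_given_stable)
    w_minus = math.log((1 - p_class_given_ls) / (1 - p_class_given_stable))
    contrast = w_plus - w_minus
    return w_plus, w_minus, contrast


# Hypothetical pixel counts for a "0-50 m from road" class.
w_plus, w_minus, contrast = weights_of_evidence(
    ls_in=400, ls_out=600, stable_in=1000, stable_out=9000
)
print(f"W+ = {w_plus:.3f}, W- = {w_minus:.3f}, C = {contrast:.3f}")
```

A positive W+ means landslides are over-represented in the class relative to stable terrain, so larger weights such as those reported for the 0–50 m road buffer indicate a stronger spatial association with landsliding.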

The AHP-based susceptibility models of these roads suggest that the critical triggering factor in the present research is the distance to roads, which received the maximum weight of 0.170. Hence, the present study concludes that undercutting slopes through non-engineered techniques may trigger slope failures. The distance to streams factor has the second-highest weight value of 0.128 and is therefore the second most influential factor for landsliding, as streams are the major source of erosion and seepage along these road corridors. The next prominent landslide triggering factor is the distance to faults, with a weight value of 0.125; thus, this study concluded that the influence of faults reduces with increasing distance from them. Furthermore, the land cover weight value of 0.1248 for the barren land class shows a positive relationship with the landslides. Moreover, the area’s fragile and weathered rock units, with a weight value of 0.114, are also characterized as a potential causative factor along these roads. The slope gradient factor also showed a relatively positive relationship with landslide probability, with a weight value of 0.105; therefore, steep slopes possess a greater probability for landslides than gentle slopes.

Advanced ML algorithms (LR and RF) were executed via the LSM Tool Pack in ArcGIS. The FS module assists in providing the best subsets of LCFs by eliminating non-influential factors. After configuring all of the parameters, twelve tests were performed along all the road corridors, and Test 1 was chosen as the dataset. Test 1 was created using the chi-square feature ranking, and the F-Test approach found that no factor is statistically insignificant. Model-11 represents the best subset and has higher prediction accuracies.
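The AHP factor weights discussed in this section are derived from a pairwise comparison matrix. The sketch below shows one common way to obtain them, using power iteration for the principal eigenvector and Saaty's consistency ratio (CR); it is an illustration under stated assumptions, not the study's workflow. The three-factor matrix is hypothetical (the study used eleven factors), and the random index table covers only matrices up to order 10.

```python
# Saaty's random consistency index (RI) by matrix order.
RANDOM_INDEX = {1: 0.0, 2: 0.0, 3: 0.58, 4: 0.90, 5: 1.12,
                6: 1.24, 7: 1.32, 8: 1.41, 9: 1.45, 10: 1.49}


def ahp_weights(matrix, iterations=100):
    """Return (weights, consistency_ratio) for a pairwise comparison matrix."""
    n = len(matrix)
    weights = [1.0 / n] * n

    # Power iteration converges to the principal eigenvector.
    for _ in range(iterations):
        product = [sum(matrix[i][j] * weights[j] for j in range(n))
                   for i in range(n)]
        total = sum(product)
        weights = [x / total for x in product]

    # Estimate the principal eigenvalue, then the consistency index and ratio.
    product = [sum(matrix[i][j] * weights[j] for j in range(n))
               for i in range(n)]
    lambda_max = sum(product[i] / weights[i] for i in range(n)) / n
    ci = (lambda_max - n) / (n - 1) if n > 1 else 0.0
    ri = RANDOM_INDEX[n]
    cr = ci / ri if ri > 0 else 0.0
    return weights, cr


# Hypothetical judgments: roads vs streams vs faults (illustrative only).
pairwise = [
    [1.0,     2.0,     6.0],
    [1 / 2.0, 1.0,     3.0],
    [1 / 6.0, 1 / 3.0, 1.0],
]
weights, cr = ahp_weights(pairwise)
print("weights:", [round(w, 3) for w in weights])
print("consistency ratio:", round(cr, 4))
```

A CR below 0.1 is conventionally taken to indicate acceptably consistent expert judgments; the example matrix is perfectly consistent, so its CR is effectively zero.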

All ML models showed a high degree of learning and prediction performance, with small differences that might be attributed to their varied learning approaches and restrictions. The WoE-based models for LSM in the study area indicated that distance to roads, land cover, geology, NDVI, and slope gradient are the major influential factors, whereas the AHP-based models indicated distance to roads, distance to streams, land cover, geology, slope, and distance to faults. The ML-based RF indicated that distance to roads, landcover, slope, and geology are the most influential LCFs along the three selected road sections. The majority of the very high to high susceptibility zones identified through RF, LR, WoE, and AHP analysis are located close to the roads, in barren lands, in weak or crushed rocks, in areas with decreasing vegetation index values, and along steep slopes. Along the Neelum, Jhelum Valley, and Kohala roads, the RF model has the highest AUC values (0.97, 0.95, and 0.92, respectively), followed by the LR model (0.91, 0.93, and 0.89), the WoE model (0.86, 0.83, and 0.89), and the AHP model (0.82, 0.81, and 0.88). The ROC curve findings show that the RF model has the best prediction capacity compared with the other landslide models (LR, WoE, and AHP). ML approaches are well recognized to be more efficient in tackling many real-world issues than conventional models such as analytical or expert opinion-based models [10,11]. Different researchers have observed different AUC values in the region: Basharat et al. [1] calculated an AUC value of 0.76 using AHP in the NW Himalayas, Pakistan; Kamp et al. [64] applied the AHP method in the 2005 Kashmir earthquake areas and observed 67% accuracy; and Riaz et al. [5] developed a WoE model whose classification accuracy was assessed via the SRC (89%) and PRC (86.2%).

Do et al. [65] conducted a comparative study using AHP, WoE, and LR with Flow-R model analysis in Vietnam and reported AUC values for WoE (0.86), AHP (0.80), and LR (0.85). Our LR AUC values align with those of Pham et al. [12], who comparatively analyzed different ML algorithms in Uttarakhand, India. Merghadi et al. [66] comparatively reviewed different ML algorithms and found that RF has higher prediction accuracies. Sun et al. [67] performed a comparative analysis of RF and LR and found that RF has a higher prediction performance than LR, which is in line with our results. Comparison research in Austria found similar results, indicating that the RF model is more precise than the LR and other data-mining approaches [56]. According to Trigila et al. [68], who compared the RF model’s results to those of an FR model and an LR model, the RF model is more accurate than the others. In Lianhua County, China, however, Hong et al. [69] found that the RF model performed poorly compared to the Evidential Belief Function, FR, and LR models. These divergent results might be attributable to various variables, such as variations in the research regions’ geographical settings, the criteria considered when picking LCFs, and the volume of data supplied for model development. Park et al. [70] found that the AUC values for the AHP and LR models were 0.78 and 0.79, respectively. Kayastha et al. [71] also reported an AUC value of 0.77 for the AHP model of the Tinau watershed, west Nepal. Pourghasemi et al. [72] reported AUC values of 0.75 and 0.85 for the AHP and LR models, respectively. Dahal et al. [73,74] employed the WoE approach in the Lesser Himalayas and Nepal and calculated AUC values of 0.77 to 0.85. Guo et al. [75] used WoE-based LSM in the Tibetan Plateau, China, with various landslide triggering factors and calculated AUC values ranging between 0.79 and 0.89. Regmi et al. [76] estimated a prediction accuracy of 0.78 through WoE for the LSM in Western Colorado, USA.

Our prediction accuracies showed good agreement with the AUC values of previous studies. Therefore, our results indicate that the current models can be effectively used in landslide susceptibility and risk management strategies along the study area road corridors. The maps produced during this study identify the landslide-prone areas along these road corridors.

6 Conclusions

Susceptibility assessment helps in understanding landslide processes: it provides basic information regarding the development of landscapes and the basis for hazard management and mitigation strategies. Based on this idea, the present study comprehensively assessed landslide susceptibility along the road corridors of district Muzaffarabad, Azad Kashmir, Pakistan. In this area, landslide events have been reported after precipitation during the past few years. This is because the terrain characteristics, the rehabilitation work on road networks after the devastating 2005 Kashmir earthquake, and the fragile lithological units make the area highly prone to landslide activity. In this research work, a landslide inventory map was generated from satellite imagery and verified in the field along the main road corridors. A comparative analysis of landslide susceptibility was carried out using the RF, LR, WoE, and AHP methods. Eleven LCFs were identified, and their impact on landslides was evaluated. Among these eleven causative factors, road networks, landcover, lithological units, NDVI, faults, and slope gradient were found to be the more influential.

To validate the efficiency of the obtained results, the susceptibility maps produced through the WoE method were compared with the ones made using the AHP technique. The results revealed that the active landslide zones along each road correspond closely to the high and very high susceptibility classes of these maps. The generated susceptibility maps were validated using AUC–ROC curves, which showed that the overall prediction accuracy of RF is better than that of the LR, WoE, and AHP models. The LSMs generated using the RF, LR, WoE, and AHP techniques can help to predict potential future landslides along currently stable road sections, to understand landslide mechanisms, to improve landslide hazard preparedness, to plan mitigation measures in accordance with landslide risk, and to improve land-use strategies in the area. Compared with the earlier studies carried out in the current study area (e.g., ref. [1,5,19,77]), which were based on the AHP and WoE methods, this study upgrades the available information in places where some essential statistics are still lacking. By merging ML algorithms with GIS to map and evaluate landslide susceptibility, the outcomes of this study give relevant personnel a higher level of information. The present analyses also allow stakeholders to make appropriate decisions regarding future land use planning beyond the previous single method-based maps, supporting decision-makers and the public works departments in developing these remote areas with tourism potential.

Acknowledgements

The authors are grateful to the Land Use and Planning Department of the Government of Azad Jammu and Kashmir (GoAJK) for providing the WorldView-3 satellite imagery for this study. The first author is also thankful to the Director, Institute of Geology, for providing transportation facilities during fieldwork.

  1. Funding information: No funding.

  2. Author contributions: Conceptualization, Y.S., M.T.R., M.B., and M.S.A.; methodology, Y.S., M.T.R., and K.S.A.; software, M.T.R. and Y.S.; validation, M.B., M.S.A., A.S., and Y.Y.; formal analysis, M.B., N.T.T.L., and N.A.A.; investigation, Y.S. and A.S.; resources, N.T.T.L. and N.A.A.; data curation, K.S.A., Y.S., and M.T.R.; writing – original draft preparation, Y.S. and M.T.R.; writing – review and editing, M.B., M.S.A., N.A.A., C.X., and M.T.R.; visualization, K.S.A., A.S., and N.T.T.L.; supervision, M.B. and M.S.A.; project administration, M.T.R., N.T.T.L., and M.B.; funding acquisition, N.T.T.L. and N.A.A. All authors have read and agreed to the published version of the manuscript.

  3. Conflict of interest: This manuscript has not been published or presented elsewhere in part or entirety and is not under consideration by another journal. There are no conflicts of interest to declare.

  4. Data availability statement: The data that support the findings of this study are available from the corresponding author, upon reasonable request.

References

[1] Basharat M, Shah HR, Hameed N. Landslide susceptibility mapping using GIS and weighted overlay method: a case study from NW Himalayas, Pakistan. Arab J Geosci. 2016 Apr;9(4):1–9. doi:10.1007/s12517-016-2308-y

[2] Anderson MG, Holcombe E. Community-based landslide risk reduction: managing disasters in small steps. Washington, DC: World Bank Publications; 2013 Jan 22. doi:10.1596/978-0-8213-9456-4

[3] Akgun A, Sezer EA, Nefeslioglu HA, Gokceoglu C, Pradhan B. An easy-to-use MATLAB program (MamLand) for the assessment of landslide susceptibility using a Mamdani fuzzy algorithm. Comput Geosci. 2012 Jan 1;38(1):23–34. doi:10.1016/j.cageo.2011.04.012

[4] Hong H, Ilia I, Tsangaratos P, Chen W, Xu C. A hybrid fuzzy weight of evidence method in landslide susceptibility analysis on the Wuyuan area, China. Geomorphology. 2017 Aug 1;290:1–6. doi:10.1016/j.geomorph.2017.04.002

[5] Riaz MT, Basharat M, Hameed N, Shafique M, Luo J. A data-driven approach to landslide-susceptibility mapping in mountainous terrain: case study from the Northwest Himalayas, Pakistan. Nat Hazards Rev. 2018 Nov 1;19(4):05018007. doi:10.1061/(ASCE)NH.1527-6996.0000302

[6] Lee SA. Application of logistic regression model and its validation for landslide susceptibility mapping using GIS and remote sensing data. Int J Remote Sens. 2005 Apr 1;26(7):1477–91. doi:10.1080/01431160412331331012

[7] Ikram N, Basharat M, Ali A, Usmani NA, Gardezi SA, Hussain ML, et al. Comparison of landslide susceptibility models and their robustness analysis: a case study from the NW Himalayas, Pakistan. Geocarto Int. 2021 Dec 13;36:1–38. doi:10.1080/10106049.2021.2017010

[8] Jiang W, Rao P, Cao R, Tang Z, Chen K. Comparative evaluation of geological disaster susceptibility using multi-regression methods and spatial accuracy validation. J Geographical Sci. 2017 Apr;27(4):439–62. doi:10.1007/s11442-017-1386-4

[9] Buša J, Tornyai R, Bednarik M, Greif V, Rusnák M. Hodnotenie zosuvného hazardu pomocou multivariačnej a bivariačnej štatistickej analýzy v Košickej kotline (Západné Karpaty). Geografický Časopis. 2019;71:383–405. doi:10.31577/geogrcas.2019.71.4.20

[10] Pham BT, Tien Bui D, Pourghasemi HR, Indra P, Dholakia MB. Landslide susceptibility assessment in the Uttarakhand area (India) using GIS: a comparison study of prediction capability of naïve bayes, multilayer perceptron neural networks, and functional trees methods. Theor Appl Climatol. 2017 Apr;128(1):255–73. doi:10.1007/s00704-015-1702-9

[11] Pradhan B. A comparative study on the predictive ability of the decision tree, support vector machine and neuro-fuzzy models in landslide susceptibility mapping using GIS. Comput Geosci. 2013 Feb 1;51:350–65. doi:10.1016/j.cageo.2012.08.023

[12] Pham BT, Pradhan B, Bui DT, Prakash I, Dholakia MB. A comparative study of different machine learning methods for landslide susceptibility assessment: A case study of Uttarakhand area (India). Environ Model Softw. 2016 Oct 1;84:240–50. doi:10.1016/j.envsoft.2016.07.005

[13] Dou J, Yunus AP, Bui DT, Merghadi A, Sahana M, Zhu Z, et al. Improved landslide assessment using support vector machine with bagging, boosting, and stacking ensemble machine learning framework in a mountainous watershed, Japan. Landslides. 2020 Mar;17(3):641–58. doi:10.1007/s10346-019-01286-5

[14] Pandey VK, Pourghasemi HR, Sharma MC. Landslide susceptibility mapping using maximum entropy and support vector machine models along the Highway Corridor, Garhwal Himalaya. Geocarto Int. 2020 Jan 25;35(2):168–87. doi:10.1080/10106049.2018.1510038

[15] Youssef AM, Pourghasemi HR. Landslide susceptibility mapping using machine learning algorithms and comparison of their performance at Abha Basin, Asir Region, Saudi Arabia. Geosci Front. 2021 Mar 1;12(2):639–55. doi:10.1016/j.gsf.2020.05.010

[16] Ali SA, Parvin F, Vojteková J, Costache R, Linh NT, Pham QB, et al. GIS-based landslide susceptibility modeling: A comparison between fuzzy multi-criteria and machine learning algorithms. Geosci Front. 2021 Mar 1;12(2):857–76. doi:10.1016/j.gsf.2020.09.004

[17] Zhao Y, Wang R, Jiang Y, Liu H, Wei Z. GIS-based logistic regression for rainfall-induced landslide susceptibility mapping under different grid sizes in Yueqing, Southeastern China. Eng Geol. 2019 Sep 4;259:105147. doi:10.1016/j.enggeo.2019.105147

[18] Chang KT, Merghadi A, Yunus AP, Pham BT, Dou J. Evaluating scale effects of topographic variables in landslide susceptibility models using GIS-based machine learning techniques. Sci Rep. 2019 Aug 23;9(1):1–21. doi:10.1038/s41598-019-48773-2

[19] Ahmed KS, Basharat M, Riaz MT, Sarfraz Y, Shahzad A. Geotechnical investigation and landslide susceptibility assessment along the Neelum road: a case study from Lesser Himalayas, Pakistan. Arab J Geosci. 2021 Jun;14(11):1–9. doi:10.1007/s12517-021-07396-6

[20] Petley D, Dunning S, Rosser N, Kausar AB. Incipient landslides in the Jhelum Valley, Pakistan following the 8th October 2005 earthquake; Messages v. 2006.

[21] Khan MA, Basharat M, Riaz MT, Sarfraz Y, Farooq M, Khan AY, et al. An integrated geotechnical and geophysical investigation of a catastrophic landslide in the Northeast Himalayas of Pakistan. Geol J. 2021 Sep;56(9):4760–78. doi:10.1002/gj.4209

[22] Shafique M. Spatial and temporal evolution of co-seismic landslides after the 2005 Kashmir earthquake. Geomorphology. 2020 Aug 1;362:107228. doi:10.1016/j.geomorph.2020.107228

[23] Riaz S, Wang G, Basharat M, Takara K. Experimental investigation of a catastrophic landslide in northern Pakistan. Landslides. 2019 Oct;16(10):2017–32. doi:10.1007/s10346-019-01216-5

[24] Basharat M, Riaz MT, Jan MQ, Xu C, Riaz S. A review of landslides related to the 2005 Kashmir Earthquake: implication and future challenges. Nat Hazards. 2021 Aug;108(1):1–30. doi:10.1007/s11069-021-04688-8

[25] Shafique M, van der Meijde M, Khan MA. A review of the 2005 Kashmir earthquake-induced landslides; from a remote sensing prospective. J Asian Earth Sci. 2016 Mar 15;118:68–80. doi:10.1016/j.jseaes.2016.01.002

[26] Basharat M, Rohn J, Baig MS, Khan MR. Spatial distribution analysis of mass movements triggered by the 2005 Kashmir earthquake in the Northeast Himalayas of Pakistan. Geomorphology. 2014 Feb 1;206:203–14. doi:10.1016/j.geomorph.2013.09.025

[27] Saba SB, van der Meijde M, van der Werff H. Spatiotemporal landslide detection for the 2005 Kashmir earthquake region. Geomorphology. 2010 Dec 1;124(1–2):17–25. doi:10.1016/j.geomorph.2010.07.026

[28] Owen LA, Kamp U, Khattak GA, Harp EL, Keefer DK, Bauer MA. Landslides triggered by the 8 October 2005 Kashmir earthquake. Geomorphology. 2008 Feb 1;94(1–2):1–9. doi:10.1016/j.geomorph.2007.04.007

[29] Chen W, Xie X, Wang J, Pradhan B, Hong H, Bui DT, et al. A comparative study of logistic model tree, random forest, and classification and regression tree models for spatial prediction of landslide susceptibility. Catena. 2017 Apr 1;151:147–60. doi:10.1016/j.catena.2016.11.032

[30] Breiman L, Last M, Rice J. Random forests: finding quasars. In Statistical challenges in astronomy. New York, NY: Springer; 2003. p. 243–54. doi:10.1007/0-387-21529-8_16

[31] Riaz S, Kikumoto M, Basharat M, Putra AD. Wetting Induced Deformation of Soils Triggering Landslides in Pakistan. Geotech Geol Eng. 2021 Dec;39(8):5633–49. doi:10.1007/s10706-021-01851-7

[32] Riaz MT, Basharat M, Pham QB, Sarfraz Y, Shahzad A, Ahmed KS, et al. Improvement of the predictive performance of landslide mapping models in mountainous terrains using cluster sampling. Geocarto Int. 2022 Apr 19;37:1–44. doi:10.1080/10106049.2022.2066202

[33] Hearn GJ, editor. Slope engineering for mountain roads. London: Geological Society of London; 2011.

[34] Kazmi AH, Jan MQ. Geology and tectonics of Pakistan. Karachi, Pakistan: Graphic Publishers; 1997.

[35] Pourghasemi HR, Sadhasivam N, Amiri M, Eskandari S, Santosh M. Landslide susceptibility assessment and mapping using state-of-the art machine learning techniques. Nat Hazards. 2021 Aug;108(1):1291–316. doi:10.1007/s11069-021-04732-7

[36] Pourghasemi HR, Kornejady A, Kerle N, Shabani F. Investigating the effects of different landslide positioning techniques, landslide partitioning approaches, and presence-absence balances on landslide susceptibility mapping. Catena. 2020 Apr 1;187:104364. doi:10.1016/j.catena.2019.104364

[37] Devkota KC, Regmi AD, Pourghasemi HR, Yoshida K, Pradhan B, Ryu IC, et al. Landslide susceptibility mapping using certainty factor, index of entropy and logistic regression models in GIS and their comparison at Mugling–Narayanghat road section in Nepal Himalaya. Nat Hazards. 2013 Jan;65(1):135–65. doi:10.1007/s11069-012-0347-6

[38] Pachauri AK, Pant M. Landslide hazard mapping based on geological attributes. Eng Geol. 1992 Feb 1;32(1–2):81–100. doi:10.1016/0013-7952(92)90020-Y

[39] Wilson JP, Gallant JC. Digital terrain analysis. Terrain Analysis: Princ Appl. 2000;6(12):1–27.

[40] Pradhan B, Singh RP, Buchroithner MF. Estimation of stress and its use in evaluation of landslide prone regions using remote sensing data. Adv Space Res. 2006 Jan 1;37(4):698–709. doi:10.1016/j.asr.2005.03.137

[41] Yalcin A. GIS-based landslide susceptibility mapping using analytical hierarchy process and bivariate statistics in Ardesen (Turkey): comparisons of results and confirmations. Catena. 2008 Jan 1;72(1):1–2. doi:10.1016/j.catena.2007.01.003

[42] Pourghasemi HR, Moradi HR, Fatemi Aghda SM, Gokceoglu C, Pradhan B. GIS-based landslide susceptibility mapping with probabilistic likelihood ratio and spatial multi-criteria evaluation models (North of Tehran, Iran). Arab J Geosci. 2014 May;7(5):1857–78. doi:10.1007/s12517-012-0825-x

[43] Blyth EM, Finch J, Robinson M, Rosier P. Can soil moisture be mapped onto the terrain? Hydrol Earth Syst Sci. 2004 Oct 31;8(5):923–30. doi:10.5194/hess-8-923-2004

[44] Hengl T, Reuter HI, editors. Geomorphometry: concepts, software, applications. Amsterdam, The Netherlands: Newnes; 2008 Sep 25.

[45] Soeters R, Van Westen CJ. Slope instability recognition, analysis and zonation. Landslides: Investigation Mitig. 1996 Dec;247:129–77.

[46] Bonham-Carter GF. Geographic information systems for geoscientists. Kidlington, UK: Pergamon (Elsevier); 1994. p. 398.

[47] Barbieri G, Cambuli P. The weight of evidence statistical method in landslide susceptibility mapping of the Rio Pardu Valley (Sardinia, Italy). In 18th World IMACS Congress and MODSIM09 International Congress on Modelling and Simulation: Interfacing Modelling and Simulation with Mathematical and Computational Sciences, Proceedings; 2009 Jul. p. 2658–64.

[48] Saaty TL. What is the analytic hierarchy process? In Mathematical models for decision support. Berlin, Heidelberg: Springer; 1988. p. 109–21. doi:10.1007/978-3-642-83555-1_5

[49] Ayalew L, Yamagishi H, Ugawa N. Landslide susceptibility mapping using GIS-based weighted linear combination, the case in Tsugawa area of Agano River, Niigata Prefecture, Japan. Landslides. 2004 Mar;1(1):73–81. doi:10.1007/s10346-003-0006-9

[50] Saaty TL, Vargas LG. Models, methods, concepts & applications of the Analytic Hierarchy Process. Boston/Dordrecht/London: Kluwer Academic Publishers; 2001. doi:10.1007/978-1-4615-1665-1

[51] Saaty TL. A scaling method for priorities in hierarchical structures. J Math Psychol. 1977 Jun 1;15(3):234–81. doi:10.1016/0022-2496(77)90033-5

[52] Peng CY, Lee KL, Ingersoll GM. An introduction to logistic regression analysis and reporting. J Educ Res. 2002 Sep 1;96(1):3–14. doi:10.1080/00220670209598786

[53] Pradhan B, Lee S. Delineation of landslide hazard areas on Penang Island, Malaysia, by using frequency ratio, logistic regression, and artificial neural network models. Environ Earth Sci. 2010 May;60(5):1037–54. doi:10.1007/s12665-009-0245-8

[54] Breiman L, Friedman JH, Olshen RA, Stone CJ. Classification and Regression Trees (The Wadsworth Statistics/Probability Series). New York, NY: Chapman and Hall; 1984. p. 1–358.

[55] Breiman L. Random forests. Mach Learn. 2001 Oct;45(1):5–32. doi:10.1023/A:1010933404324

[56] Goetz JN, Brenning A, Petschko H, Leopold P. Evaluating machine learning and statistical prediction techniques for landslide susceptibility modeling. Comput Geosci. 2015 Aug 1;81:1. doi:10.1016/j.cageo.2015.04.007

[57] Hansen LK, Salamon P. Neural network ensembles. IEEE Trans Pattern Anal Mach Intell. 1990 Oct;12(10):993–1001. doi:10.1109/34.58871

[58] Crippen RE. Calculating the vegetation index faster. Remote Sens Environ. 1990 Oct 1;34(1):71–3. doi:10.1016/0034-4257(90)90085-Z

[59] Sahin EK, Colkesen I, Acmali SS, Akgun A, Aydinoglu AC. Developing comprehensive geocomputation tools for landslide susceptibility mapping: LSM tool pack. Comput Geosci. 2020 Nov 1;144:104592. doi:10.1016/j.cageo.2020.104592

[60] Allison PD. Logistic Regression using the SAS System: Theory and Application. Multicollinearity. 1999;1:48–51.

[61] Porwal AK. Mineral potential mapping with mathematical geological models. PhD thesis, Utrecht University; 2006 Feb 1.

[62] Pourghasemi HR, Mohammady M, Pradhan B. Landslide susceptibility mapping using index of entropy and conditional probability models in GIS: Safarood Basin, Iran. Catena. 2012 Oct 1;97:71–84. doi:10.1016/j.catena.2012.05.005

[63] Chen W, Li W, Chai H, Hou E, Li X, Ding X. GIS-based landslide susceptibility mapping using analytical hierarchy process (AHP) and certainty factor (CF) models for the Baozhong region of Baoji City, China. Environ Earth Sci. 2016 Jan;75(1):1–4. doi:10.1007/s12665-015-4795-7

[64] Kamp U, Growley BJ, Khattak GA, Owen LA. GIS-based landslide susceptibility mapping for the 2005 Kashmir earthquake region. Geomorphology. 2008 Nov 1;101(4):631–42. doi:10.1016/j.geomorph.2008.03.003

[65] Do HM, Yin KL, Guo ZZ. A comparative study on the integrative ability of the analytical hierarchy process, weights of evidence and logistic regression methods with the Flow-R model for landslide susceptibility assessment. Geomatics, Natural Hazards and Risk. 2020 Jan 1;11(1):2449–85. doi:10.1080/19475705.2020.1846086

[66] Merghadi A, Yunus AP, Dou J, Whiteley J, ThaiPham B, Bui DT, et al. Machine learning methods for landslide susceptibility studies: A comparative overview of algorithm performance. Earth-Sci Rev. 2020 Aug 1;207:103225. doi:10.1016/j.earscirev.2020.103225

[67] Sun D, Xu J, Wen H, Wang D. Assessment of landslide susceptibility mapping based on Bayesian hyperparameter optimization: A comparison between logistic regression and random forest. Eng Geol. 2021 Feb 1;281:105972. doi:10.1016/j.enggeo.2020.105972

[68] Trigila A, Iadanza C, Esposito C, Scarascia-Mugnozza G. Comparison of Logistic Regression and Random Forests techniques for shallow landslide susceptibility assessment in Giampilieri (NE Sicily, Italy). Geomorphology. 2015 Nov 15;249:119–36. doi:10.1016/j.geomorph.2015.06.001

[69] Hong H, Pourghasemi HR, Pourtaghi ZS. Landslide susceptibility assessment in Lianhua County (China): a comparison between a random forest data mining technique and bivariate and multivariate statistical models. Geomorphology. 2016 Apr 15;259:105–18. doi:10.1016/j.geomorph.2016.02.012

[70] Park S, Choi C, Kim B, Kim J. Landslide susceptibility mapping using frequency ratio, analytic hierarchy process, logistic regression, and artificial neural network methods at the Inje area, Korea. Environ Earth Sci. 2013 Mar;68(5):1443–64. doi:10.1007/s12665-012-1842-5

[71] Kayastha P, Dhital MR, De Smedt F. Application of the analytical hierarchy process (AHP) for landslide susceptibility mapping: A case study from the Tinau watershed, west Nepal. Comput Geosci. 2013 Mar 1;52:398–408. doi:10.1016/j.cageo.2012.11.003

[72] Pourghasemi HR, Pradhan B, Gokceoglu C, Mohammadi M, Moradi HR. Application of weights-of-evidence and certainty factor models and their comparison in landslide susceptibility mapping at Haraz watershed, Iran. Arab J Geosci. 2013 Jul;6(7):2351–65. doi:10.1007/s12517-012-0532-7

[73] Dahal RK, Hasegawa S, Nonomura A, Yamanaka M, Dhakal S, Paudyal P. Predictive modelling of rainfall-induced landslide hazard in the Lesser Himalaya of Nepal based on weights-of-evidence. Geomorphology. 2008 Dec 15;102(3–4):496–510. doi:10.1016/j.geomorph.2008.05.041

[74] Dahal RK, Hasegawa S, Nonomura A, Yamanaka M, Masuda T, Nishino K. GIS-based weights-of-evidence modelling of rainfall-induced landslides in small catchments for landslide susceptibility mapping. Environ Geol. 2008 Mar;54(2):311–24. doi:10.1007/s00254-007-0818-3

[75] Guo C, Montgomery DR, Zhang Y, Wang K, Yang Z. Quantitative assessment of landslide susceptibility along the Xianshuihe fault zone, Tibetan Plateau, China. Geomorphology. 2015 Nov 1;248:93–110. doi:10.1016/j.geomorph.2015.07.012

[76] Regmi NR, Giardino JR, Vitek JD. Modeling susceptibility to landslides using the weight of evidence approach: Western Colorado, USA. Geomorphology. 2010 Feb 15;115(1–2):172–87. doi:10.1016/j.geomorph.2009.10.002

[77] Riaz MT, Basharat M, Brunetti MT. Assessing the effectiveness of alternative landslide partitioning in machine learning methods for landslide prediction in the complex Himalayan terrain. Prog Phys Geography: Earth Environ. 2022 Jul 11;03091333221113660. doi:10.1177/03091333221113660

Received: 2022-04-11
Revised: 2022-06-07
Accepted: 2022-10-04
Published Online: 2022-12-31

© 2022 the author(s), published by De Gruyter

This work is licensed under the Creative Commons Attribution 4.0 International License.

Downloaded on 29.11.2023 from https://www.degruyter.com/document/doi/10.1515/geo-2022-0424/html