This study established a probability model based on the landslide spatial and size probabilities to predict the possible volume and locations of landslides in watershed scale under rainfall events. First, we assessed the landslide spatial probability using a random forest landslide susceptibility model including intrinsic causative factors and extrinsic rainfall factors. Second, we calculated the landslide volume probability using the Pearson type V distribution. Lastly, these probabilities were joined to predict possible landslide volume and locations in the study area, the Taipei Water Source Domain, under rainfall events. The possible total landslide volume in the watershed changed from 1.7 million cubic meter under the event with 2-year recurrence interval to 18.2 million cubic meter under the event with 20-year recurrence interval. Approximately 62% of the total landslide volume triggered by the rainfall events was concentrated in 20% of the slope units. As the recurrence interval of the events increased, the slope units with large landslide volume tended to concentrate in the midstream of Nanshi River subwatershed. The results indicated the probability model posited can be used not only to predict total landslide volume in watershed scale, but also to determine the possible locations of the slope units with large landslide volume.
Estimation of total landslide volume in watershed scale is frequently required in research on integrated watershed planning, sediment control for soil and water conservation engineering as well as the change of river morphology. Watershed-scale landslide volume estimation methods can be classified into three types: the first type estimates the landslide depth of each landslide site according to its slope, multiplies the depth by the area to generate the landslide volume of each site, and then sums up the volume to obtain total landslide volume [1,2,3]; the second type establishes a statistical regression formula linking landslide area and volume, uses the formula and landslide area of each site to calculate landslide volume, and then adds up the volume [4,5,6,7]; the third type involves measurement of the whole area in the watershed before and after a landslide event, and treats the changes in the terrain surfaces as the total landslide volume [8,9,10]. With the development of technologies, such as LiDAR and unmanned aerial vehicles (UAV), it has become easier to get terrain surface measurement data before and after landslide events. Although the three foregoing methods are simple, they are applied to estimate total landslide volume from the observation data, rather than to predict the possible total landslide volume under scenario events because of the unestablished relationship between total landslide volume and rainfall factors.
When analyzing landslides induced by the torrential rainfall, Murano  found the more cumulative rainfall, the higher ratio of total landslide area to the watershed area, and the greater total number of individual landslides. Based on the results of Murano , Uchiogi  further investigated the relationship between rainfall and landslide areas, and employed these data from different watersheds to build equations linking the cumulative rainfall with incremental landslide area ratio. The application of Uchiogi’s empirical model requires the determination of both the K and P 0 values. Nevertheless, the K value can be determined only when there is sufficient rainfall and landslide data in the watershed. Also, the P 0 value is not easy to be obtained because the phenomenon that landslides typically occur in mountain areas makes it difficult to record the occurrence time. Uchiogi’s empirical model was used to predict possible incremental landslide areas under cumulative rainfalls as well [13,14,15,16], and the incremental landslide volume can be obtained by multiplying by the average depth of landslides [17,18,19]. Based on Uchiogi’s proposition, different models investigating the relationship between the rainfalls and landslide areas have been developed, and the cumulative rainfall was replaced by other rainfall parameters, while the incremental landslide area ratio was replaced by landslide area ratio. For example, Li et al.  analyzed the relationship between the new and enlarged landslides with precipitation during the wet season. Kuroiwa et al.  established the empirical relationship between shallow landslide area ratio and annual maximum daily precipitation. The empirical relationship between landslide area ratio and peak rainfall intensity built by Chen et al.  was used to predict the impact of climate change on landslide area ratio for the end of the 21st century . Furthermore, the effect of time or an earthquake on the relationship between the rainfall and landslide area was also considered in some studies. For example, Shou et al.  proposed a modified model considering the time effect on the relationship between the incremental landslide area ratio and cumulative rainfall. Liu et al.  proposed that the rainfall-induced incremental landslide density is a function of daily precipitation, impact of an earthquake, and impact of antecedent precipitation. Although these empirical models built from the statistical methods can estimate the total landslide volume, they cannot predict the possible spatial locations of landslides.
Landslide hazard assessment is gaining more and more attention in recent years due to its important role in the landslide risk management. Landslide hazard is defined as the probability of landslide occurrence within a specified period of time and within a given area . Several probabilistic models for landslide hazard assessment have been proposed. For instance, Guzzetti et al.  predicted the locations, occurrence frequencies, and sizes of landslides within a certain area based on a landslide probability model. Tien Bui et al.  integrated the landslide spatial and the temporal probabilities to generate landslide hazard maps. Wu and Chen  employed the landslide inventory, landslide thematic variables, and rainfall data of typhoon events to carry out the landslide hazard analysis. The present study attempted to establish a landslide probability model integrating the landslide spatial probability and landslide volume probability to predict possible total landslide volume in watershed scale under rainfall events with recurrence intervals. Besides, this model could also shed some light on the possible locations of landslides in the watershed, and thus could be used as a reference for future integrated watershed planning.
In the evaluation of landslide hazard, various qualitative or quantitative approaches have been developed to estimate landslide spatial probabilities, known as landslide susceptibility indexes. These approaches are further classified into heuristic, statistical, probability, and deterministic methods . The statistical methods were most often used and can be grouped into a bivariate analysis [31,32] and a multivariate analysis. Multivariate analyses include multivariate regression , logistic regression (LR) [29,34], and discriminant analysis . Very recently, numerous machine-learning techniques have also been developed for this field, including artificial neural networks [36,37], decision trees [38,39], support vector machine [40,41], and random forest (RF) [40,42]. Among the techniques, RF and support vector machine are the most promising methods . In terms of the size probability, studies on landslides induced by rainfall  and earthquakes  have investigated the power law relationship between the landslide area and noncumulative frequency. After verifying the power law relationship, a satisfying analysis outcome was obtained by fitting the probability density function of landslide area with a truncated inverse gamma distribution  and by fitting with a double Pareto distribution . Research on landslides induced by rainfall [10,47], landslides induced by earthquakes , and submarine landslides  have also examined the power law relationship between landslide volume and noncumulative frequency. Brunetti et al.  used data from 19 studies to analyze the probability density function of landslide volume, and found that the scaling exponent of negative power law correlated to the landslide type, triggering factors, geomorphological characteristics, landslide history, and the predominance of large or small landslides in a particular watershed . In the present study, the RF method was adopted in the spatial prediction of landslide susceptibility, and the inverse gamma distribution was used to fit probability density function of landslide volume.
The Taipei Water Source Domain is the main supply of tap water in the Great Taipei Metropolitan Area. In 2015, Typhoon Soudelor caused lots of landslides in the upstream watershed, and the raw water turbidity was beyond the capacity of the water purification plant, which resulted in severe water supply disruptions. After this event, the influence of extreme rainfall events on the landslides and raw water turbidity in the Taipei Water Source Domain has received wide attention from various researchers. Additionally, the soaring raw water turbidity mainly results from the landslides induced by the heavy rainfall events in this domain. Thus, the Taipei Water Source Domain was selected for the present study due to the influence of upstream landslides on the downstream sediment yields and turbidity.
This section summarized the research process of this study as shown in Figure 1. Adopting the concept proposed by Wu and Chen , this study sought to establish a landslide probability model. We incorporated extrinsic rainfall factors into a landslide susceptibility model to facilitate prediction of the corresponding landslide spatial probabilities under rainfall events with recurrence intervals. We also used a probability density function to calculate the cumulative probability of landslide volume using a long-term landslide inventory. Joint probability of the landslide spatial probability and cumulative probability of landslide volume was then obtained to predict the possible landslide volume in each slope unit under rainfall events. Due to the lack of observation data, the performance of this landslide probability model in forecasting was validated indirectly by comparing the possible total landslide volume predicted by the model with that predicted by an empirical formula linking rainfall and landslide volume. Finally, we used standard deviational ellipses to assess the concentration degree of the slope units with large landslide volume.
2.1 Landslide spatial probability analysis
2.1.1 Establishment of a landslide susceptibility model
The RF, an ensemble model introduced by Breiman (2001) , resamples the sample data and employs a subset of variables randomly selected to construct multiple decision trees for classification. The RF model is suitable for analyzing datasets with nonlinear relationships between variables and target variables because no assumptions about the relations need to be made in advance . Besides, by reason of its non-parametric feature as well as its advantages such as the ability to rank the importance of the variables and the supply of an algorithm to estimate the missing values , this model has been widely used in landslide predictions recently to achieve a high predictive performance [40,42,54,55].
The training and the testing sets were mainly constructed by the method proposed by Chang et al. (2019) . The input samples, comprising all landslide samples and an equal number of non-landslide samples selected randomly, were split randomly into the training and testing sets by 70:30 ratios. The training set was innerly resampled using 10-folds cross-validations for tuning the optimal hyperparameters based on the grid search method to prevent overfitting in the training process.
Two main hyperparameters that need to be set in the RF model were the number of tree (numtree) and the number of variables (mtry). The optimum pairs of hyperparameters with the highest area under the receiver operating characteristic (AUROC) were chosen to train the models with the training set. Then, we evaluated the predictive capabilities of the tuned models with the testing set using the accuracy (Acc) and AUROC. After repeating the training and testing process 10 times, 10 optimum pairs of hyperparameters and the corresponding models were obtained. We selected the model with the best predictive capability during testing stage for the following analysis.
2.1.2 Landslide spatial probability under rainfall events with recurrence intervals
This study collected multi-year rainfall data from adjacent rain gauge stations, tested data quality, eliminated erroneous and spotty data, and performed rainfall frequency analysis. The rainfall spatial distributions under events for the duration of 1 and 12 h with recurrence intervals ranging from 2 to 200 years were estimated using the Kriging method and then were input in the RF model to predict the landslide susceptibility indexes in the slope units. Finally, the landslide spatial probabilities in the slope units were calculated based on the relationship between landslide susceptibility index and spatial probability.
2.2 Probability density function of landslide volume
Stark and Hovius  analyzed the distribution of landslide noncumulative frequency and the corresponding area, and discovered that a power law relationship existed between landslide area and noncumulative frequency within a certain area scope. The power law relationship can be converted to a relationship between the landslide size and cumulative frequency. Referring to the previous analysis results [29,45], this study adopted the three-parameter Pearson type V distribution. The probability density function of landslide volume fitting the Pearson distribution can be written as
where V L is the landslide volume (m3), α is the shape parameter, β is the scale parameter, γ is the location parameter, and Γ(α) is the gamma function of α.
After ranking the landslide volume from minimum to maximum for the individual landslide, this study estimated the three parameters by fitting the equation (1) and calculated the cumulative probability of landslide volume. The probability of landslides exceeding or equal to one specific volume in each slope unit could then be predicted.
2.3 Use of the probability model to predict total landslide volume under rainfall events
We employed the concept of conditional probability and calculated the landslide spatial probability and cumulative probability of landslide volume to predict the possible landslide volume in each slope unit under rainfall events with recurrence intervals. By integrating the probability of landslides exceeding or equal to one specific volume, the expected landslide volume in each slope unit was calculated, and then together with the landslide spatial probability to predict the possible landslide volume in each slope unit under a rainfall event with one specific recurrence interval. The landslide volume in each slope unit was summed up to obtain the possible total landslide volume in each subwatershed, as shown in equation (2).
where LV i is the possible landslide volume (m3) in slope unit i, P s is the landslide spatial probability under a rainfall event with one specific recurrence interval, and EV i is the expected landslide volume (m3) in slope unit i.
2.4 Establishment of an empirical formula linking rainfall and incremental landslide volume
According to Uchiogi’s empirical model, this study used event-based landslide inventories to establish an empirical formula linking cumulative rainfall amount and incremental landslide area ratio (equation (3)) by taking the incremental landslide area ratio of each event as the dependent variable and the cumulative rainfall amount of each event as the independent variable. The incremental landslide volume, the product of the incremental landslide area, and the average landslide depth can be derived from equation (3) by multiplying the watershed area and the average landslide depth, as shown in equation (4):
where s/a is the incremental landslide area ratio (%), s is the incremental landslide area (ha) in the watershed, a is the watershed area (ha), P is the cumulative rainfall amount (mm), P 0 is the critical rainfall at which landslides occur (mm), K is a site specific coefficient, V is the incremental landslide volume (m3), and h is the average landslide depth (m).
2.5 Assessment of the concentration degree of slope units with large landslide volume
We employed the area under a concentration curve (AUCC) to assess the concentration degree of the slope units with large landslide volume. Based on the concept of success rates mentioned by Chung and Fabbri , we ordered the slope units in accordance to their landslide volume from maximum to minimum and then calculated the cumulative percentages of slope units and the corresponding cumulative percentages of landslide volume to plot the concentration curve. The greater the AUCC value is, the more the landslide volume concentrated in a small number of slope units.
The standard deviational ellipses were used to analyze the spatial distribution characteristics of landslides induced by an earthquake  or the spatial pattern of socioeconomic landslide vulnerability . This study also employed the standard deviational ellipses to analyze the spatial characteristics, such as central tendency, dispersion, and directional trends, of the slope units with large landslide volume. When using one standard deviation to construct an ellipse, approximately 63% of slope units with large landslide volume was contained within the ellipse. The area chiefly encompassed by the ellipse represented the scope of subwatersheds in which those slope units were concentrated.
3 Research area and materials
3.1 Research area
The study area, the Taipei Water Source Domain, is located in the northeast corner of Taiwan and includes the Feitsui Reservoir and Zhitan Water Purification Plant which supplies tap water for six million people in the Great Taipei Metropolitan Area. It is the only water source district designated under the Taiwan’s urban planning procedure for water quality and quantity conservation. The domain with an area of 69,783 ha has three major river systems containing the Beishi River, Nanshi River, and Xindian River as shown in Figure 2. The Beishi River watershed contains Wantan River, Jingualiao River, Daiyuku River, Beishi River, and Reservoir area subwatersheds, while Nanshi River watershed contains Daloulan River, Tonghou River, Zhakong River, and Nanshi River subwatersheds.
The whole watershed is mainly comprised of hills and mountains, and is generally higher in the southwest and lower in the northeast. Forests covered around 92% of the entire area. Typhoon Soudelor, in 2015, dumping near record volumes of rainfall on Taipei caused severe damage to the Nanshi River watershed. Multiple landslides took place in the upstream watershed, and the raw water turbidity of the Nanshi River reached up to 40,000 NTU which was beyond the capacity of the Zhitan Water Purification Plant. As a result, tap water in many parts of the Great Taipei Metropolitan Area became brownish and turbid.
3.2 Research materials
This study employed a digital elevation model with resolution of 5 m to divide the whole area into subwatersheds and to acquire a slope map using the ArcGIS programs. The basic analysis units used in this study were the slope units created according to the subdivision method proposed by Xie et al. . The entire research area was divided into 26,951 slope units, and their average area was approximately 2.5 ha. Most of the slope units had areas larger than the average landslide area of 0.2 ha in the research area, which reduced the chance that an individual landslide would be divided among different slope units, ensuring relatively optimal analytical results . On the basis of the past landslide susceptibility analysis of the research area , we used 12 intrinsic susceptibility variables and 2 extrinsic triggering variables that were derived from different data sources (Table 1). These variables were analyzed and calculated by the methods proposed , and the resulting maps of the intrinsic variables are shown in Figure 3.
|Data||Resolution/Scale||Derived susceptibility variables||Source|
|DEM||5 m||Highest elevation||Ministry of the Interior|
|Total slope height|
|Geologic map||1:50,000||Lithology||Central geological survey, MOEA|
|Distance from fault|
|Road map||1:5,000||Distance from road||Institute of transportation, MOTC|
|Satellite image||10, 20 m||NDVI||Center for space and remote sensing research, NCU|
|Rainfall data||Maximum 1 h rainfall||Water resources agency, MOEA and central weather bureau, MOTC|
|Maximum 12 h rainfall|
According to the daily turbidity data at Zhitan Dam near the outlet of the watershed from 2003 to 2016 and daily rainfall data at Fushan rain gauge station from 1990 to 2016, this study chose the typhoon events characterized by soaring turbidity in the water or high cumulative rainfall amounts as landslide events, including Typhoons Tim (1994), Nari (2001), Aere (2004), Jangmi (2008), Megi (2010), Saola (2012), and Soudelor (2015). After collecting satellite images before and after three Typhoons, Tim, Jangmi, and Saola, the normalized difference vegetation index (NDVI) analysis was performed to find potential landslide locations. When creating landslide inventory, this study used the slope, river system, and land use maps to eliminate and revise unreasonable locations, which helped to locate landslide sites. To reduce human mapping errors, we followed the recommended procedures proposed in ref.  to map landslides for the inventories before and after each Typhoon event, and each event-based landslide inventory was then created. Landslide volume of each site was estimated by multiplying the landslide area with the landslide depth, which was implemented using the relationship between the slope of landslide sites and the proposed landslide depth . The total landslide areas of 78.03, 61.87, and 54.69 ha in the watershed triggered by the above events were obtained. We also re-examined the event-based landslide inventories of Typhoons Nari, Aere, Megi, and Soudelor created by Wu and Yeh , and employed the seven event-based landslide inventories for the analysis. The landslide distributions after different Typhoon events are shown in Figure 2. The cumulative rainfall amounts and incremental landslide areas of the seven typhoon events are shown in Table 2.
|Cumulative rainfall (mm)||309||1,166||668||556||362||705||646|
|Incremental landslide area (ha)||56.63||26.17||23.99||42.81||11.89||39.95||107.53|
|Incremental landslide area ratio (%)||0.0812||0.0375||0.0344||0.0613||0.0170||0.0572||0.1541|
A total of 34 rain gauge stations located in the vicinity of the research area recorded recent-year rainfall data. This study analyzed rainfall amounts under the events with various recurrence intervals and durations at 26 rain gauge stations with over 10 years of records. Because of the great cumulative rainfall of Typhoon Nari event and the large landslide area of Typhoon Soudelor event, they both were chosen as basic events for building landslide susceptibility model. After calculating the maximum rainfall amounts for the duration of 1 and 12 h at the rain gauge stations, this study used the geostatistical method to estimate the spatial distributions of rainfall amounts for Typhoons Nari and Soudelor (Figure 4).
4.1 Landslide spatial probability analysis
4.1.1 Establishment of landslide susceptibility models
The landslide susceptibility models based on three datasets and three groups of variables were built, respectively, for choosing the best RF model (Table 3). Due to the great cumulative rainfall of Typhoon Nari event and the large landslide area of Typhoon Soudelor event, we included the landslide and non-landslide samples in the inventories of Typhoons Nari and Soudelor for the first dataset. The first dataset was then combined with the inventories of Typhoons Aere and Megi to create the second and the third dataset, respectively. Additionally, 14 variables, such as total slope height, maximum slope, average aspect, lithology, NDVI, distance from road, maximum 1 h rainfall amount, maximum 12 h rainfall amount, terrain roughness, average elevation, average slope, slope roughness, highest elevation, and distance from fault, were included in the first group. When adopting the variable selection method proposed by Chen et al. , we selected the first nine variables in the second group. According to the multicollinearity diagnosis indexes (Table 4), the first eight variables were chosen in the third group by eliminating the terrain roughness with high multicollinearity in the second group. A tolerance value smaller than 0.2 or a variance inflation factor (VIF) larger than 5 indicates high multicollinearity.
|Model||Number of variables||Data set||Optimum parameters||Training stage||Testing stage|
|numtree||mtry||Acc (%)||AUROC||Acc (%)||AUROC|
|1||8||Nari + Soudelor||700||8||76.8||0.887||68.4||0.803|
|2||Nari + Aere + Soudelor||1,400||8||74.5||0.880||74.2||0.854|
|3||Nari + Megi + Soudelor||600||8||77.5||0.886||74.8||0.832|
|4||9||Nari + Soudelor||800||8||77.8||0.889||71.3||0.791|
|5||Nari + Aere + Soudelor||200||9||78.7||0.907||73.2||0.844|
|6||Nari + Megi + Soudelor||600||9||76.3||0.892||70.8||0.837|
|7||14||Nari + Soudelor||400||14||76.6||0.875||70.1||0.799|
|8||Nari + Aere + Soudelor||500||13||81.3||0.897||76.2||0.833|
|9||Nari + Megi + Soudelor||900||14||80.8||0.906||71.3||0.817|
|Maximum slope||0.748||1.337||Distance from road||0.770||1.299|
|Total slope height||0.044||22.902||Maximum 1 h rainfall amount||0.436||2.296|
|Average aspect||0.943||1.060||Maximum 12 h rainfall amount||0.390||2.565|
In the process of model hyperparameter tuning with grid search method, we set numtree ranging from 100 to 1,500 and mtry ranging from 3 to the number of variables. The optimum pairs of hyperparameters obtained in Table 3 were used to train each model, and the model performance was assessed using Acc and AUROC values. Among the nine models, Model 8 yielded the highest overall predictive performance, as shown in Table 3.
The individual predictor weights of the RF models were calculated to understand the importance of the variables based on the information gain ratio (Figure 5). Figure 5 revealed that in the group of eight variables, maximum slope got the highest predictor weights, and the predictor weights of total slope height oscillate more strongly.
4.1.2 Prediction of landslide spatial probability under rainfall events
For the landslide spatial probabilities in the slope units, this study analyzed the rainfall amounts of events with different durations and various recurrence intervals at 26 rain gauge stations, and then used the Kriging method to estimate the spatial distributions of maximum rainfall amounts for the duration of 1 and 12 h with recurrence intervals of 2, 5, 10, 20, 50, 100, and 200 years, respectively. The estimated spatial distributions of the maximum rainfall amounts for the duration of 1 and 12 h with recurrence intervals of 2 and 100 years are shown in Figure 6(a) and (b). The maximum rainfall amounts for the duration of 1 and 12 h were input into the Model 8 to produce landslide susceptibility maps.
The proposed relationship between the landslide ratios and landslide susceptibility indexes  was employed to predict the landslide spatial probability. Equation (5) indicated that the landslide ratio (y) increased with a raise in the landslide susceptibility index (x). We used this equation to calculate landslide ratios which represented the landslide spatial probabilities in slope units under rainfall events. The landslide spatial probabilities under rainfall events with recurrence intervals of 2 and 100 years are shown in Figure 7(a) and (b). It is noteworthy that the resultant landslide spatial probabilities reflected the simultaneous maximum rainfall amounts for the duration of 1 and 12 h for all rain gauge stations, and could be seen as the prediction results for the worst case.
4.2 Probability density function of landslide volume
We used the long-term landslide inventory, which was the combination of the seven event-based landslide inventories, to perform a landslide volume probability analysis. In the long-term inventory, the scope of landslide volume distribution ranged from 49 to 840,577 m3 and the power law relationship existed between the landslide volume and noncumulative frequency. Moreover, the long-term landslide inventory revealed large differences in the landslide volume between the Beishi River, Nanshi River, and Xindian River watersheds. In order to obtain reasonable predictions for each watershed, this study first grouped the slope units into three watersheds and then used the three-parameter Pearson type V distribution to fit the power law relationships to obtain the cumulative probabilities of landslide volume for the three watersheds (Figure 8), which enabled the probability of landslides exceeding or equal to one specific volume to be computed. The results indicated that when landslides occurred in the slope units within the Beishi River, Nanshi River, and Xindian River watersheds, the probability that landslide volume exceeded or equaled 10,000 m3 was approximately 23, 37, and 7%, respectively. The results also showed the highest probability of large landslide volume in the Nanshi River watershed and the lowest in the Xindian River watershed.
We then used the new landslides caused by Typhoon Soudelor to validate the cumulative probabilities of landslide volume. Typhoon Soudelor caused a total of 86, 456, and 47 new landslides in the Beishi River, Nanshi River, and Xindian River watersheds, respectively. The number of the landslides with volume exceeding or equal to 1,000 m3 were 74, 385, and 40, respectively. The probabilities of the landslide greater than or equal to 1,000 m3 in the Beishi River, Nanshi River, and Xindian River watersheds were 86.0, 84.4, and 85.1%, respectively, similar to the probabilities presented in Figure 8. The results showed that the cumulative probabilities of landslide volume were applicable to the three watersheds, and could be employed to predict the expected volume of landslides occurring in the slope units under rainfall events with different recurrence intervals.
4.3 Prediction of landslide volume under rainfall events
This study used the cumulative probabilities of landslide volume in the three watersheds (Figure 8) to calculate the expected landslide volume in each slope unit. By combining the landslide spatial probability with the expected landslide volume, we calculated the possible landslide volume in each slope unit under rainfall events (Figure 9(a) and (b)).
The possible total landslide volume in each subwatershed under rainfall events was computed by equation (2). As shown in Figure 10, the larger possible total landslide volume occurred in Nanshi River, Tonghou River, and Zhakong River subwatersheds. Because the slope and total slope height are the main geomorphological parameters dominating landslide occurrence in the research area, the three subwatersheds contain greater areas of higher landslide spatial probability. In addition, the probability of large landslide volume in the slope units in the Nanshi River watershed tended to be higher than that in Beishi River and Xindian River watersheds, resulting in the greater areas of larger possible landslide volume in Nanshi River, Tonghou River, and Zhakong River subwatersheds.
4.4 Use of an empirical model to predict incremental landslide volume
We used the cumulative rainfall and incremental landslide data from the six Typhoons Tim, Nari, Aere, Jangmi, Megi, and Saola (Table 2) to establish an empirical model, and obtained a K value of 0.95 and a critical rainfall of 270 mm (equation (6)). Employing data acquired from Typhoon Soudelor to verify the empirical model, the incremental landslide area ratio was 0.1343% predicted by equation (6) according to the cumulative rainfall of 646 mm. The predicted ratio was close to the actual incremental landslide area ratio of 0.1541%, which represented the suitability of this empirical model.
where s is the incremental landslide area (ha) in the watershed and P is the cumulative rainfall amount (mm).
The incremental landslide area was generated from the cumulative rainfall of the events with various recurrence intervals by equation (6), and then incremental landslide volume under rainfall events was calculated by using the abovementioned landslide area and the average landslide depth of 3.29 m obtained from the landslide inventory statistics. According to the rainfall durations during the seven typhoon events, the most frequent duration of 3 days was chosen to estimate the cumulative rainfall of the events with recurrence intervals. Table 5 reveals the incremental landslide volume under various rainfall events.
|Recurrence interval||2 years||5 years||10 years||20 years||50 years||100 years||200 years|
|Cumulative rainfall (mm)||416.31||613.73||738.96||888.21||1016.56||1150.75||1290.07|
|Incremental landslide area ratio (%)||0.02||0.11||0.21||0.36||0.53||0.74||0.99|
|Incremental landslide volume (m3)||466,923||2,576,939||4,796,682||8,335,606||12,156,193||16,919,035||22,695,107|
4.5 Assessment of the concentration degree of slope units with large landslide volume
In order to determine whether a small number of slope units occupied most of the predicted landslide volume, the slope units were accumulated in accordance with their landslide volume from maximum to minimum to calculate the proportions of total landslide volume occupied by the top 20% of the slope units. As can be seen from Figure 11, these top 20% of slope units with large landslide volume accounted for 97.1, 74.1, 61.2, 42.2, 54.3, 52.3, and 52.2% of the total landslide volume under rainfall events with recurrence intervals of 2, 5, 10, 20, 50, 100, and 200 years, respectively. An average of roughly 62% of total landslide volume concentrated in the top 20% of the slope units.
This study also used the standard deviational ellipses with one standard deviation to analyze the central tendency, dispersion, and directional trends of the slope units with large landslide volume, as shown in Figure 12. The slope units with large landslide volume were defined as the slope units with landslide volume exceeding the mean value plus three times the standard deviation under the events with recurrence intervals. The slope units with large landslide volume under the event with recurrence interval of 2 years were more scattered than those of 200 years.
Landslides are generally regarded as the most important factor affecting the level of sedimentation and turbidity in the water resources . Apart from the total landslide volume, the locations where landslides would occur under different rainfall events in the watershed were critical. Previously, Guzzetti et al.  proposed a quantitative hazard model using spatial, temporal, and size probability and calculated the joint probabilities according to various combinations of periods and landslide sizes to plot the landslide hazard map. However, without considering the landslide volume, the predictive ability of this model could be weakened. This study further calculated the expected landslide volume in each analysis unit, together with the landslide spatial probability to map the landslide locations and obtained the landslide volume under a rainfall event with one specific recurrence interval. The model established in the present study is found not only to describe different joint probabilities but also to better predict the possible landslide locations and volume for analysis units.
When comparing RF model with the LR employing seven intrinsic causative factors and two extrinsic triggering factors proposed by Chen et al. , the greater Acc value of 76.2% and the higher AUROC value of 0.833 indicated that the RF model performed better for the study area, which was in accordance with the previous results [40,42,55]. Although the multicollinearity between the 14 variables considered in the RF model could affect the quality of the interpretation of each individual variable, it does not impact the model’s ability to accurately predict the value of the response variable . Therefore, the RF model can be employed to predict the landslide spatial probability under the rainfall events with recurrence intervals ranging from 2 to 200 years.
The exponent of the power law tails of the three proposed distributions of landslide volume in Beishi River, Nanshi River, and Xindian River watersheds are, respectively, 2.04, 1.46, and 2.18, indicating the higher relative proportion of large landslides in the Nanshi River watershed and lower in the Xindian River watershed. The values of the scaling exponent in the Beishi River and Xindian River watersheds were close to 2.11 which could be derived from the previous study result .
Due to the lack of observation data, the empirical formula linking rainfall and landslide volume was established employing a K value of 0.95 and a critical rainfall of 270 mm to indirectly assess the rationality of the landslide probability model. The critical rainfall value and K value are generally in the range of 250–500 mm and 0.5–4.0 , respectively. The verification result of Typhoon Soudelor represented the suitability of this empirical model. The differences between the total landslide volume predicted by the landslide probability model and by the empirical model are 71.9, 5.6, 31.5, 0.7, 26.4, 5.8, and 24.5% under the rainfall events with recurrence intervals of 2, 5, 10, 20, 50, 100, and 200 years, respectively. The two model’s predictions were close under the rainfall events with most recurrent intervals except for the 2-year interval, which indicated the rationality of the landslide probability model. Additionally, the average landslide depth is a key parameter to obtain the incremental landslide volume from the incremental landslide area. The empirical model uses only one value representing the average landslide depth in the whole watershed, but the average landslide depth may change in the three watersheds by reason of their geomorphological characteristics. In contrast, the landslide probability model established in the present study can take the differences in the cumulative probabilities of landslide volume among these watersheds into consideration and is therefore able to result in more accurate prediction.
With regard to the concentration degree, as shown in Figure 11, the greater the AUCC value is, the more the landslide volume concentrated in a small number of slope units. The results revealed that under the short recurrence interval of rainfall events, the landslide-prone area is restricted to the slope units with vulnerable geomorphological conditions, and the total landslide volume therefore tended to be concentrated within a small number of slope units. As the recurrence interval of rainfall events became longer, the landslide-prone area would eventually extend to the slope units that were not particularly vulnerable to landslides, the concentration degree of landslide volume within a small number of slope units would gradually decrease, and the slope units with large landslide volume became more spatially concentrated in the midstream of Nanshi River subwatershed (Figure 12), which would result in severer impacts on downstream sediment and turbidity responses. The prediction results obtained in the present study could contribute to developing the watershed strategic planning to lower the incidence of landslides or to reduce downstream sediment delivery.
The soaring raw water turbidity mainly results from the upstream landslides induced by the heavy rainfall events in the Taipei Water Source Domain. Therefore, predicting possible landslide locations and volume in this domain under the rainfall events is important for sustainable water management, especially when considering extreme high rainfall amounts resulting from climate change. In the current study, the validation results of the RF landslide susceptibility model employing 14 variables and the cumulative probability models of landslide volume showed their suitability for predicting landslide locations and volume. We found that the slope units with large landslide volume under the rainfall events with recurrence intervals of more than 50 years were largely concentrated in the midstream of Nanshi River subwatershed. The prediction results of these worst-case scenarios can help to develop watershed strategic planning and the corresponding mitigation measures to reduce impacts on water quality.
Moreover, the probability model established in the present study, which takes the differences in the cumulative probabilities of landslide volume among these watersheds into consideration, could reflect the changes in the total landslide volume and the locations of the slope units with large landslide volume induced by different rainfall events. This model is found not only to describe different joint probabilities but also to better predict the possible landslide locations and volume for analysis units, and can be served as a reference for future study on predicting landslide locations and volume in watershed scale.
Finally, we suggest that, to further enhance the model’s predictive capability, some issues need to be addressed, e.g., the inclusion of additional landslide inventory and susceptibility variables, an improvement in the relationship between the landslide ratios and susceptibility indexes, a significant improvement in cumulative probability models of landslide volume by grouping the slope units with similar geomorphological characteristics, or by considering the types of landslides.
We also suggested that sediment delivery ratio be further included in the analysis to predict the landslide-derived sediment yields in the downstream watershed. The recommended model allows further determination of the relationship between the possible landslide volume and downstream water turbidity under the rainfall events with various recurrence intervals, and thereby could predict the water turbidity under the rainfall events, which can provide further information for watershed management.
The authors would like to thank the Water Resources Agency, MOEA, Taiwan for providing rainfall and water quality data, and Y.C. Yeh for his work on the event-based landslide inventories.
Funding information: This work was supported by the Ministry of Science and Technology, Taiwan, under Grant No. MOST 106-2313-B-005-027.
Author contributions: C.Y.W. and P.K.C. designed the research (conceptualization). C.Y.W. developed the methodology and carried out the analysis. C.Y.W. and P.K.C. performed the acquisition of data and interpretation of the results. C.Y.W. prepared the original draft. The authors applied the SDC approach for the sequence of authors.
Conflict of interest: Authors state no conflict of interest.
 Lida T, Okunishi K. Development of Hillslopes due to landslides. Geomorphology. 1983;46:67–77. Search in Google Scholar
 Chen SC, Wang WN. The evaluation study of landslide susceptibility and landslide size in reservoir watersheds (3/3), water resources agency, ministry of economic affairs. N Taipei City. 2006:123–4 (in Chinese with English summary). Search in Google Scholar
 Hovius N, Stark CP, Allen PA. Sediment flux from a mountain belt derived by landslide mapping. Geology. 1997;25:231–4. 10.1130/0091-7613(1997)025%3C0231:SFFAMB%3E2.3.CO;2. Search in Google Scholar
 Guzzetti F, Ardizzone F, Cardinali M, Galli M, Reichenbach P, Rossi M. Distribution of landslides in the upper tiber river basin, central Italy. Geomorphology. 2008;96:105–22. 10.1016/j.geomorph.2007.07.015. Search in Google Scholar
 Shirzadi A, Shahabi H, Chapi K, Bui DT, Pham BT, Shahedi K, et al. A comparative study between popular statistical and machine learning methods for simulating volume of landslides. Catena. 2017;157:213–26. 10.1016/j.catena.2017.05.016. Search in Google Scholar
 Tseng CM, Lin CW, Stark CP, Liu JK, Fei LY, Hsieh YC. Application of a multi‐temporal, LiDAR‐derived, digital terrain model in a landslide‐volume estimation. Earth Surf Process Landf. 2013;38:1587–601. 10.1002/esp.3454. Search in Google Scholar
 Chen SC, Chen CY, Huang WY. Exploring landslide erosion volume-area scaling relationships by slip depth using changes in DTMs for basin sediment volume estimation. J Mt Sci. 2019;(16):581–94. 10.1007/s11629-018-4888-3. Search in Google Scholar
 Murano Y. Study on landslides of heavy rain type. Shin Sabo. 1966;19:1–6. (in Japanese with English summary). Search in Google Scholar
 Uchiogi T. Landslides due to one continual rainfall. J Jpn Soc Eros Control Eng. 1971;23:21–34 (in Japanese). Search in Google Scholar
 Chiou SJ, Cheng CT, Hsu SM, Lin YH, Chi SY. Evaluating landslides and sediment yields induced by the Chi-Chi Earthquake and followed heavy rainfalls along the Ta-Chia River. J Geoengin. 2007;2(2):73–82. Search in Google Scholar
 Hong CY. Time series analysis of control factors of landslides in central Taiwan. Master Thesis. Taichung, Taiwan: National Chung Hsing University; 2010. p. 186. (in Chinese with English summary). Search in Google Scholar
 Chou HT, Lee CF, Lo CM. The formation and evolution of a coastal alluvial fan in eastern Taiwan caused by rainfall-induced landslides. Landslides. 2017;14:109–22. 10.1007/s10346-016-0678-6. Search in Google Scholar
 Shieh CL, Lin CW, Lee SY, Huang ML, Lee SP, Chen YS. Study on the watershed models of sedimentary management (III), water resources agency, ministry of economic affairs. N Taipei City. 2002;355 (in Chinese with English summary). Search in Google Scholar
 Huang HS, Liu SI, Tsai YJ, Wang RJ, Chen JY. Study and verification of sediment change in Wu-Shin-Keng River Basin. J Chin Soil Water Conserv. 2009;40:297–309. (in Chinese with English summary). Search in Google Scholar
 Liu KF. A study of flood control and sediment management due to climate change in the Jhuoshuei river watershed, water resources planning institute. Water Resour Agency Ministry Econ Affairs Taichung. 2013. p. 7-29-7-30 (in Chinese with English summary). Search in Google Scholar
 Kuroiwa C, Hiramatsu S, Fukuyama T. A method for predicting shallow landslide area in managed and unmanaged forests. Int J Eros Control Eng. 2017;10:92–9. 10.13101/ijece.10.92. Search in Google Scholar
 Chen CW, Tung YS, Liou JJ, Li HC, Cheng CT, Chen YM, et al. Assessing landslide characteristics in a changing climate in northern Taiwan. Catena. 2019;175:263–77. 10.1016/j.catena.2018.12.023. Search in Google Scholar
 Chen YM, Chen CW, Chao YC, Tung YS, Liou JJ, Li HC, et al. Future landslide characteristic assessment using ensemble climate change scenarios: a case study in Taiwan. Water. 2020;12:564. 10.3390/w12020564. Search in Google Scholar
 Shou KJ, Hong CY, Wu CC, Hsu HY, Fei LY, Lee JF, et al. Spatial and temporal analysis of landslides in Central Taiwan after 1999 Chi-Chi earthquake. Eng Geol. 2011;123:122–8. 10.1016/j.enggeo.2011.03.014. Search in Google Scholar
 Liu SH, Lin CW, Tseng CM. A statistical model for the impact of the 1999 Chi-Chi earthquake on the subsequent rainfall-induced landslides. Eng Geol. 2013;156:11–9. 10.1016/j.enggeo.2013.01.005. Search in Google Scholar
 Varnes DJ. IAEG commission on landslides and other mass movements, landslide hazard zonation: a review of principles and practice. Paris: The UNESCO Press; 1984. p. 63. Search in Google Scholar
 Guzzetti F, Reichenbach P, Cardinali M, Galli M, Ardizzone F. Probabilistic landslide hazard assessment at the basin scale. Geomorphology. 2005;72:272–99. 10.1016/j.geomorph.2005.06.002. Search in Google Scholar
 Tien Bui D, Pradhan B, Lofman O, Revhaug I, Dick ØB. Regional prediction of landslide hazard using probability analysis of intense rainfall in the Hoa Binh province (Vietnam). Nat Hazards. 2013;66:707–30. 10.1007/s11069-012-0510-0. Search in Google Scholar
 Wu CY, Chen SC. Integrating spatial, temporal, and size probabilities for the annual landslide hazard maps in the Shihmen watershed (Taiwan). Nat Hazards Earth Syst Sci. 2013;13:2353–67. 10.5194/nhess-13-2353-2013. Search in Google Scholar
 van Westen CJ, van Asch TW, Soeters R. Landslide hazard and risk zonation—why is it still so difficult? Bull Eng Geol Environ. 2006;65:167–84. Search in Google Scholar
 Anis Z, Wissem G, Vali V, Smida H, Mohamed Essghaier G. GIS-based landslide susceptibility mapping using bivariate statistical methods in North-western Tunisia. Open Geosci. 2019;11(1):708–26. 10.1515/geo-2019-0056. Search in Google Scholar
 Milevski I, Dragićević S, Zorn M. Statistical and expert-based landslide susceptibility modeling on a national scale applied to North Macedonia. Open Geosci. 2019;11(1):750–64. 10.1515/geo-2019-0059. Search in Google Scholar
 Chu L, Wang L, Jiang J, Liu X, Sawada K, Zhang J. Comparison of landslide susceptibility maps using random forest and multivariate adaptive regression spline models in combination with catchment map units. Geosci J. 2019;23:341–55. 10.1007/s12303-018-0038-8. Search in Google Scholar
 Wang G, Chen X, Chen W. Spatial prediction of landslide susceptibility based on GIS and discriminant functions. ISPRS Int J Geo-Information. 2020;9:144. 10.3390/ijgi9030144. Search in Google Scholar
 Can A, Dagdelenler G, Ercanoglu M, Sonmez H. Landslide susceptibility mapping at Ovacık-Karabük (Turkey) using different artificial neural network models: comparison of training algorithms. Bull Eng Geol Environ. 2019;78:89–102. Search in Google Scholar
 Kalantar B, Pradhan B, Naghibi SA, Motevalli A, Mansor S. Assessment of the effects of training data selection on the landslide susceptibility mapping: a comparison between support vector machine (SVM), logistic regression (LR) and artificial neural networks (ANN). Geomatics, Nat Hazards Risk. 2018;9:49–69. Search in Google Scholar
 Wang LJ, Guo M, Sawada K, Lin J, Zhang J. A comparative study of landslide susceptibility maps using logistic regression, frequency ratio, decision tree, weights of evidence and artificial neural network. Geosci J. 2016;20:117–36. Search in Google Scholar
 Zhang K, Wu X, Niu R, Yang K, Zhao L. The assessment of landslide susceptibility mapping using random forest and decision tree methods in the three gorges reservoir area, China. Environ Earth Sci. 2017;76:1–20. Search in Google Scholar
 Chang KT, Merghadi A, Yunus AP, Pham BT, Dou J. Evaluating scale effects of topographic variables in landslide susceptibility models using GIS-based machine learning techniques. Sci Rep. 2019;9:1–21. Search in Google Scholar
 Oh HJ, Kadavi PR, Lee CW, Lee S. Evaluation of landslide susceptibility mapping by evidential belief function, logistic regression and support vector machine models. Geomatics, Nat Hazards Risk. 2018;9:1053–70. Search in Google Scholar
 Kim HG, Lee DK, Park C, Ahn Y, Kil SH, Sung S, et al. Estimating landslide susceptibility areas considering the uncertainty inherent in modeling methods. Stoch Environ Res Risk Assess. 2018;32:2987–3019. Search in Google Scholar
 Jaiswal P, van Westen CJ, Jetten V. Quantitative assessment of landslide hazard along transportation lines using historical records. Landslides. 2011;8:279–91. 10.1007/s10346-011-0252-1. Search in Google Scholar
 Guzzetti F, Reichenbach P, Ghigi S. Rockfall hazard and risk assessment in the Nera River Valley, Umbria Region (central Italy). Environ Manag. 2004;34:191–208. Search in Google Scholar
 Weissel JK, Stark CP, Hovius N. Landslides triggered by the 1999 Mw7.6 Chi-Chi earthquake in Taiwan and their relationship to topography. Proceedings of 2001 IEEE International Geoscience and Remote Sensing Symposium. Sydney: Institute of Electrical and Electronics Engineers (IEEE); 9–13 July 2001. p. 759–61. Search in Google Scholar
 Breiman L. Random forests. Mach Learn. 2001;45:5–32. Search in Google Scholar
 Olden JD, Kennard MJ, Pusey BJ. Species invasions and the changing biogeography of Australian freshwater fishes. Glob Ecol Biogeogr. 2008;17:25–37. Search in Google Scholar
 Zhao L, Wu X, Niu R, Wang Y, Zhang K. Using the rotation and random forest models of ensemble learning to predict landslide susceptibility. Geomat Nat Hazards Risk. 2020;11:1542–64. Search in Google Scholar
 Brock J, Schratz P, Petschko H, Muenchow J, Micu M, Brenning A. The performance of landslide susceptibility models critically depends on the quality of digital elevation models. Geomat Nat Hazards Risk. 2020;11:1075–92. Search in Google Scholar
 Chung CJF, Fabbri AG. Probabilistic prediction models for landslide hazard mapping. Photogramm Eng Remote Sens. 1999;65:1389–99. Search in Google Scholar
 Zhang S, Li R, Wang F, Iio A. Characteristics of landslides triggered by the 2018 Hokkaido Eastern Iburi earthquake (Northern Japan). Landslides. 2019;16:1691–708. 10.1007/s10346-019-01207-6. Search in Google Scholar
 Samodra G, Chen G, Sartohadi J, Kasama K, Hadmoko DS. Spatial pattern of socio-economic landslide vulnerability and its spatial prediction by means of GIS-fuzzy logic in Kayangan catchment (Indonesia). Proceedings of the 8th Annual Conference of IIIRR. Kumamoto: The International Institute for Infrastructure, Renewal and Reconstruction (IIIRR); 24–26 Aug 2012. p. 520–9. Search in Google Scholar
 Xie M, Esaki T, Zhou G. GIS-based probabilistic mapping of landslide hazard using a three-dimensional deterministic model. Nat Hazards. 2004;33:265–82. 10.1023/B:NHAZ.0000037036.01850.0d. Search in Google Scholar
 Van Den Eeckhaut M, Reichenbach P, Guzzetti F, Rossi M, Poesen J. Combined landslide inventory and susceptibility assessment based on different mapping units: an example from the Flemish Ardennes, Belgium. Nat Hazards Earth Syst Sci. 2009;9:507–21. 10.5194/nhess-9-507-2009. Search in Google Scholar
 Chen SC, Wu CY, Hsieh CD. The establishment of a landslide hazard analysis model for the Taipei water source domain. J Chin Soil Water Conserv. 2012;43:332–45. (in Chinese with English summary). Search in Google Scholar
 Liu JK, Weng TC, Hung CH, Yang MT. Remote sensing analysis of heavy rainfall induced landslide. Proceedings of 21st Century Civil Engineering Technology and Management Conference. Hsinchu, Taiwan: Minghsin University of Science and Technology; 2001. p. C21–31. (in Chinese with English summary). Search in Google Scholar
 Chen SC, Wang WN, Ho CW, Wu CH, Lai YC. The evaluation study of landslide potential and landslide volume in reservoir watersheds (3/3). Taipei: Water Resour Agency Ministry Econom Affairs; 2006. Search in Google Scholar
 Lin GW, Chen H, Petley DN, Horng MJ, Wu SJ, Chuang B. Impact of rainstorm-triggered landslides on high turbidity in a mountain reservoir. Eng Geol. 2011;117:97–103. Search in Google Scholar
 Weiss NA. Introductory statistics. 10th ed. Boston: Pearson; 2012. Search in Google Scholar
© 2021 Chun-Yi Wu and Po-Kai Chou, published by De Gruyter
This work is licensed under the Creative Commons Attribution 4.0 International License.