Abstract
This study examines rainfall forecasting for the Perfume (Huong) River basin using machine learning methods. Statistical indicators are first used to evaluate the reliability of the observed rainfall records. Two popular models, the multilayer perceptron (MLP) and k-nearest neighbors (kNN), are then applied with different configurations and compared. The rainfall data were obtained from the Hue, Aluoi, and Namdong hydrological stations, whose rainfall had a strong impact on the downstream area from 1980 to 2018. The results show that both models, once properly fine-tuned, perform well on the standard metrics of R_squared, mean absolute error (MAE), Nash–Sutcliffe efficiency (NSE), and root-mean-square error (RMSE). In particular, the MLP model performs significantly better when the Adam stochastic optimizer is used. The promising forecast results encourage applying these models to future data to support continuous natural disaster mitigation in the Perfume River basin.
1 Introduction
Global climate change has extreme effects on the annual volume and pattern of rainfall. It is also the main cause of several droughts and floods worldwide. This situation has negatively affected people whose livelihoods depend on regular rainfall, such as farmers [1,2,3]. Highly accurate rainfall forecasting is therefore urgently needed. Several studies have proposed prediction methods for hydrological processes, using neural networks and other machine learning methods for soil temperature forecasting, rainfall-runoff prediction, water flow forecasting, semiarid precipitation forecasting, and drought prediction [4,5,6,7,8,9]. In addition, recent studies have applied machine learning methods to predict the quality of dykes, water quality in rivers, and the amount of sludge in wastewater treatment plants [10,11,12].
The MLP and k-nearest neighbors (kNN) models are supervised learning techniques used for classification problems [13,14]. A classification problem is handled in three steps: collecting the input training data set, using the test data set to check the classification accuracy, and deploying the classifier to categorize new data [15]. These models can identify highly complex relationships between input and output variables without requiring knowledge of the underlying physical processes [16,17,18,19,20,21,22]. The salient features of the kNN model are that it is nonparametric and that it is among the most straightforward approaches to both regression and classification [23,24,25]. The main advantages of kNN are fast calculation time, a simple algorithm, ease of interpretation, usefulness for both regression and classification, high accuracy, and no assumptions about the data, so there is no need to adjust many parameters or build an explicit model [26,27,28]. Meanwhile, the MLP is a neural network that provides reliable regression and classification; data enter at the input units and pass through the network to the output units. Its hierarchy includes an input layer, one or more hidden layers of computational nodes, and an output layer of computational nodes [29,30,31]. The MLP model is trained with the backpropagation algorithm [32].
Several studies on rainfall forecasting have been published using these models. Dash et al. [33] applied the kNN model to predict rainfall in the summer monsoon season (June–September) and the post-monsoon season (October–December) for 4 years (from 2011 to 2016) in the Kerala state of the Indian Peninsula. The study concluded that kNN performed reasonably well. Wu et al. [34] used the kNN model to forecast rain from February to April every year at 18 major hydrological stations in the Southeastern Mediterranean region. The results indicated that the kNN model narrowed the gap between the global and the coarse forecast models for the Southeastern Mediterranean region. Vallam and Qin [35] developed a kNN model to test long-term rainfall simulation in Singapore over 30 years. The results showed that the kNN model was satisfactory when forecasts were made for the wet seasons. Moreover, the model could closely reproduce extreme rainfall values. Zhang et al. [36] used the MLP model to predict annual and non-monsoon rainfall in Odisha, India. The results indicated that MLP was more accurate when predicting future rainfall for the eight non-monsoon months. Zahmatkesh and Goharian [37] used the MLP model for long-lead monthly rainfall forecasting from 1925 to 2016 in Vancouver, British Columbia, Canada. In that research, the model with the best forecasting performance was selected to forecast rainfall 1 month ahead.
The Perfume River basin in Thua Thien Hue Province is a vulnerable area, sensitive to natural disasters and the impact of climate change. Therefore, this area needs many types of forecasting related to natural disasters. In this study, rainfall prediction for the Perfume River basin is carried out with machine learning on the Python platform. Although this methodology is applied to the basin for the first time, the results may contribute to more accurate predictions and supply a new method for rainfall forecasting in the basin.
This study proposes MLP and kNN models with four configurations to predict rainfall in the Perfume River basin: the MLP with the Adam and L-BFGS methods, and the kNN with the Euclidean and Minkowski distance metrics. The models are compared with one another to find the best one. Several accuracy measures, namely R_squared, Nash–Sutcliffe efficiency (NSE), root-mean-square error (RMSE), and mean absolute error (MAE), are used to evaluate the proposed models. In addition, statistical indicators (the percentage, the average, minimum and maximum values, standard deviation (St Dev), and coefficient of variation (Cv)) are applied to evaluate the reliability of the observed data.
The rest of the paper is structured as follows: Section 2 describes the methodology and study area, Section 3 evaluates the study data and analyzes the study results, Section 4 discusses the study approaches and limitations, and Section 5 presents the conclusion.
2 Methodology, study area, and data collection
2.1 Methodology
2.1.1 Multilayer perceptron
The MLP is a typical feedforward neural network. It includes an input layer, an output layer, and one or more hidden layers in between; the nodes in the hidden layers and the output layer are called neurons. The strength of the signal transmitted from one node to another depends on the weight of the connection between them. Hidden layers improve the network's ability to model complex functions [38,39], but they add considerably to the training process. The MLP is trained with variants of the backpropagation algorithm. Training is the process of calculating the errors made by the network and adjusting the connection weights and biases accordingly, so that the differences between the desired and actual responses of the output layer shrink and the output best fits the desired output [40]. An activation function is applied at each neuron during training, and the rectified linear unit (ReLU) is used here. ReLU is widely used for training machine learning networks [41,42]. Because ReLU converges quickly and its gradient is computed almost instantly, it mitigates exploding and vanishing gradients while maintaining a steady convergence rate [43]. In addition, the ReLU function is simple and effective for rainfall prediction [44]. For weight optimization, the Adam stochastic optimizer or the limited-memory Broyden–Fletcher–Goldfarb–Shanno (L-BFGS) method is commonly applied [45,46].
The specific characteristics of the ReLU function and the Adam and L-BFGS methods are explained in detail below.
The ReLU function is illustrated in Figure 1 and is described as follows:
$$f(x) = \max(0, x) \tag{1}$$
Figure 1
Equation (1) indicates that fʹ(x) = 0 when x < 0 and fʹ(x) = 1 when x ≥ 0.
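The ReLU function and its derivative can be written in a few lines. The following is a minimal sketch in plain Python; the function names `relu` and `relu_derivative` are illustrative and do not come from the study's code, and the derivative at x = 0 is set to 1 to match the convention stated above.

```python
def relu(x: float) -> float:
    """Rectified linear unit: returns x for positive inputs and 0 otherwise."""
    return max(0.0, x)


def relu_derivative(x: float) -> float:
    """Derivative of ReLU, using the convention f'(x) = 1 for x >= 0."""
    return 1.0 if x >= 0 else 0.0


if __name__ == "__main__":
    # Print the input, the activation, and the derivative for a few values.
    for value in (-2.0, 0.0, 3.5):
        print(value, relu(value), relu_derivative(value))
```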
L-BFGS is a quasi-Newton optimization algorithm that is widely used for parameter estimation in machine learning [47,48]. It maintains an approximation of the inverse Hessian matrix to steer its search through the variable space. Because its memory requirement is linear in the number of variables, the L-BFGS method is particularly suitable for optimization problems with many variables [49,50].
The name Adam is derived from adaptive moment estimation. The method performs efficient stochastic optimization that requires only first-order gradients and little memory; it computes individual learning rates for different parameters from estimates of the first and second moments of the gradients. The method has several advantages for deep neural networks: the magnitudes of the parameter updates are invariant to rescaling of the gradient, it does not require a stationary objective, and its step sizes are approximately bounded by the step-size hyperparameter. It also works with sparse gradients and naturally performs a form of step-size annealing.
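To make the moment estimates concrete, the sketch below applies the standard Adam update to a single scalar parameter. It is an illustration only, not the study's implementation: the toy objective f(x) = x², the function names, and the value eps = 1e-8 are assumptions, while the learning rate and the two beta values are those reported for the MLP configuration in this paper.

```python
import math


def adam_step(theta, grad, m, v, t, lr=0.001, beta1=0.9, beta2=0.999, eps=1e-8):
    """One Adam update for a single scalar parameter.

    t is the 1-based step counter; m and v are the running first and
    second moment estimates carried over from the previous step.
    """
    m = beta1 * m + (1.0 - beta1) * grad         # first moment (mean of gradients)
    v = beta2 * v + (1.0 - beta2) * grad * grad  # second moment (uncentered variance)
    m_hat = m / (1.0 - beta1 ** t)               # bias correction for the zero start
    v_hat = v / (1.0 - beta2 ** t)
    theta = theta - lr * m_hat / (math.sqrt(v_hat) + eps)
    return theta, m, v


def minimize_quadratic(start=1.0, steps=5000):
    """Minimize the toy objective f(x) = x**2, whose gradient is 2 * x."""
    x, m, v = start, 0.0, 0.0
    for t in range(1, steps + 1):
        x, m, v = adam_step(x, 2.0 * x, m, v, t)
    return x
```

On the first step the bias-corrected ratio is close to 1, so the parameter moves by roughly the learning rate regardless of the gradient scale, which illustrates the bounded step size mentioned above.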
In this study, Figure 2 shows the structure of the MLP. The input layer has 12 input nodes, $a_{1}$ to $a_{12}$ (corresponding to the 12 months of the year), and the single neuron of the output layer represents the rainfall value. There are three hidden layers: the first contains neurons $H_{11}$ to $H_{112}$, the second $H_{21}$ to $H_{212}$, and the third $H_{31}$ to $H_{312}$. Each neuron of the hidden layers and the output layer has a corresponding weight and bias; for example, $W_{11}^{(2)}$, $B_{1}^{(1)}$ and $W_{12}^{(2)}$, $B_{2}^{(2)}$ are the weights and biases of neurons $H_{11}$ and $H_{12}$, respectively, and so on. Each neuron of a hidden layer takes the outputs of all neurons of the previous layer and combines them in a weighted linear sum, $w_1 x_1 + w_2 x_2 + \dots + w_n x_n$ plus the bias, where n is the number of neurons in the previous layer and $w_i$ is the corresponding component of the weight vector. The output layer receives the values from the last hidden layer. ReLU is the activation function for the three hidden layers. Adam and L-BFGS are the two solvers used for weight optimization, and the rainfall predictions for the three station areas obtained with the two methods are compared. The MLP is trained as a regression model.
Figure 2
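The layer-by-layer computation in this structure can be sketched as a forward pass. The code below is a plain-Python illustration under stated assumptions: the helper names and the constant placeholder weights are hypothetical, the output neuron is assumed to be linear (usual for regression), and trained weights would replace the placeholders in practice.

```python
def relu(x):
    return max(0.0, x)


def dense(inputs, weights, biases, activation=None):
    """Apply one fully connected layer.

    weights[j][i] connects input i to neuron j; biases[j] belongs to neuron j.
    """
    outputs = []
    for neuron_weights, bias in zip(weights, biases):
        z = sum(w * x for w, x in zip(neuron_weights, inputs)) + bias
        outputs.append(activation(z) if activation else z)
    return outputs


def forward(inputs, layers):
    """Run the inputs through the hidden layers (ReLU) and a linear output layer."""
    *hidden, output = layers
    values = inputs
    for weights, biases in hidden:
        values = dense(values, weights, biases, activation=relu)
    weights, biases = output
    return dense(values, weights, biases)[0]


def constant_layers(weight=0.1, bias=0.0):
    """Build a 12-12-12-12-1 network with identical placeholder weights."""
    hidden = [([[weight] * 12 for _ in range(12)], [bias] * 12) for _ in range(3)]
    output = ([[weight] * 12], [bias])
    return hidden + [output]


def count_parameters(layers):
    """Count all weights and biases in the network."""
    return sum(sum(len(row) for row in w) + len(b) for w, b in layers)
```

With 12 inputs, three hidden layers of 12 neurons, and one output, this layout has 3 × (144 + 12) + 13 = 481 trainable weights and biases.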
2.1.2 kNN
The kNN model assigns a value to a query point according to the objects in the training data that lie nearest to it. The kNN algorithm is considered easy to learn and simple to implement [49]. When the kNN model is used for regression, the response value is calculated as a weighted sum over the k nearest neighbors, with weights inversely proportional to the distance from the input record. This distance is called the Minkowski distance. Wilson and Martinez [51] defined the Minkowski distance of order p (p is an integer) between two vectors X, Y as follows:
$$D(X, Y) = \left( \sum_{i=1}^{n} |x_i - y_i|^{p} \right)^{1/p} \tag{2}$$
where $X = (x_1, x_2, \dots, x_n)$ and $Y = (y_1, y_2, \dots, y_n)$. Setting p = 2 gives the Euclidean distance, and letting p → ∞ gives the largest coordinate difference, $\max_i |x_i - y_i|$.
After selecting the value of k, the prediction is the average over the outcomes of the k nearest neighbors, and equation (3) is as follows [28]:
$$\hat{y} = \frac{1}{k} \sum_{i=1}^{k} y_i \tag{3}$$
where $y_i$ is the outcome of the i-th nearest neighbor and $\hat{y}$ is the prediction for the query point.
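The distance and the averaging rule can be sketched together in plain Python. This is an illustration under stated assumptions, not the study's code: the function names are hypothetical, the neighbors are weighted uniformly (a plain average), and the case p = ∞ is handled as the largest coordinate difference.

```python
import math


def minkowski_distance(x, y, p=2.0):
    """Minkowski distance of order p; p = math.inf gives the maximum difference."""
    diffs = [abs(a - b) for a, b in zip(x, y)]
    if math.isinf(p):
        return max(diffs)
    return sum(d ** p for d in diffs) ** (1.0 / p)


def knn_predict(train_x, train_y, query, k=3, p=2.0):
    """Predict by averaging the targets of the k nearest training points."""
    ranked = sorted(
        zip(train_x, train_y),
        key=lambda pair: minkowski_distance(pair[0], query, p),
    )
    nearest = ranked[:k]
    return sum(target for _, target in nearest) / len(nearest)
```

For example, the points (0, 0) and (3, 4) are 5 apart for p = 2 and 4 apart for p = ∞, so the two metrics can rank neighbors differently.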
2.1.3 Accuracy measurements
Forecast values are compared with the actual data to evaluate forecast accuracy. The metrics used include the MAE, the RMSE, and the R_squared. The MAE and RMSE error metrics are as follows:
$$\mathrm{MAE} = \frac{1}{n} \sum_{t=1}^{n} \left| x_{f,t} - x_{a,t} \right|$$
$$\mathrm{RMSE} = \sqrt{\frac{1}{n} \sum_{t=1}^{n} \left( x_{f,t} - x_{a,t} \right)^{2}}$$
where $x_{f,t}$ and $x_{a,t}$ are the forecast value and actual value in time period t, respectively, and n is the number of time periods.
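The error metrics can be computed directly from paired lists of actual and forecast values. The sketch below is a plain-Python illustration; the function names are hypothetical, and the NSE is written in its standard form (one minus the ratio of the squared forecast errors to the squared deviations of the observations from their mean), which is assumed to be the form used in this study. R_squared is left out because its exact definition is not spelled out in the text.

```python
import math


def mae(actual, forecast):
    """Mean absolute error between paired actual and forecast values."""
    return sum(abs(f - a) for a, f in zip(actual, forecast)) / len(actual)


def rmse(actual, forecast):
    """Root-mean-square error between paired actual and forecast values."""
    return math.sqrt(sum((f - a) ** 2 for a, f in zip(actual, forecast)) / len(actual))


def nse(actual, forecast):
    """Nash-Sutcliffe efficiency in its standard form (assumed here)."""
    mean_actual = sum(actual) / len(actual)
    residual = sum((a - f) ** 2 for a, f in zip(actual, forecast))
    spread = sum((a - mean_actual) ** 2 for a in actual)
    return 1.0 - residual / spread
```

A perfect forecast gives MAE = RMSE = 0 and NSE = 1; a forecast no better than the observed mean gives NSE = 0.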
The methodology of this paper is summarized in Figure 3, a flowchart illustrating the experimental steps of this study.
Figure 3
2.2 Study area
2.2.1 Brief geography
Thua Thien Hue Province belongs to the North Central Coast Region of Vietnam. The largest basin in the province is the Perfume River basin (see Figure 4), which is located between the north of Bach Ma mountain and the east of the Truong Son range. Its area is about 2,830 km^{2}, its altitude ranges from 200 m to 1,708 m, and its average slope ranges from 15 to 35°. Its main branches originate from the high areas of Bach Ma mountain and flow from south to north for about 104 km. The basin has three sub-drainage basins: the Huu Trach branch (a catchment of 691 km^{2}, 70 km long), the Ta Trach branch (a catchment of 729 km^{2}, 51 km long), and the Bo River (a drainage basin of 938 km^{2}, 94 km long). The Perfume River basin has the highest rainfall in Vietnam. The dry season in this basin runs from March to August each year, and drought often occurs from the end of July to the end of August. The hurricane season starts in September and finishes in December. The average annual precipitation in the Hue, Aluoi, and Namdong areas is about 2,850, 3,500, and 3,200 mm, respectively (see Figure 5(a)–(b)). The basin topography has no transitional area between the upstream mountains and the plain and lagoon system. This morphology is the main cause of high runoff upstream and large floods downstream during the rainy season.
Figure 4
Figure 5
In addition, the black square dots in Figure 4 mark the Hue, Aluoi, and Namdong hydrological stations. These areas have different climatic characteristics. The precipitation at the three hydrological stations is key to flood or drought seasons downstream. Therefore, the rainfall data obtained are crucially important in this study.
2.3 Data collection
The monthly rainfall data of the Hue, Aluoi, and Namdong hydrological stations were taken from the annual statistical report of the Thua Thien Hue Centre for Hydro-Meteorological Forecasting. The data were also checked against the annual statistical report of Thua Thien Hue Province. This preliminary data evaluation is crucial for the study input. Table 1 shows the features of the data used in this study.
Table 1
Station  Location  Earliest record year  Latest record year  Number of months

Hue  Hue city  1980  2018  468 
Aluoi  Aluoi district  1980  2018  468 
Namdong  Namdong district  1980  2018  468 
Statistical features calculated from the monthly rainfall time series of each hydrological station are listed in Table 2. For comparison, monthly rainfall is measured in millimeters (mm). The ranges of the following characteristics were computed from the observed monthly rainfall time series: the percentage, average, minimum and maximum values, St Dev, and Cv.
Table 2
Station  Percentage (min)  Percentage (max)  Average, mm (min)  Average, mm (max)  St Dev (min)  St Dev (max)  Cv, % (min)  Cv, % (max)  Min, mm (min)  Min, mm (max)  Max, mm (min)  Max, mm (max)

Hue  20  314  50.6  788.3  46.7  451  48  106  3.2  35  353.7  2,452.3
Aluoi  21  275  68.2  912.4  68.2  912.4  31  89  4.5  132.7  499.0  2,590.0
Namdong  20  293  66.2  974.0  50  681.1  43  76  1.6  123.5  412.4  2,672.3
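The per-series statistics named above can be computed with the Python standard library. The sketch below is illustrative only: the function name `describe` is hypothetical, the sample standard deviation is an assumption (the text does not say whether the sample or population form was used), and Cv is taken as St Dev divided by the average, in percent.

```python
import statistics


def describe(rainfall_mm):
    """Return the average, St Dev, Cv (%), minimum, and maximum of a series."""
    average = statistics.mean(rainfall_mm)
    st_dev = statistics.stdev(rainfall_mm)  # sample St Dev; an assumption
    return {
        "average": average,
        "st_dev": st_dev,
        "cv_percent": 100.0 * st_dev / average,
        "min": min(rainfall_mm),
        "max": max(rainfall_mm),
    }
```

Applying such a function to the rainfall of one calendar month across all years, and then taking the smallest and largest results over the 12 months, would give min and max columns of the kind shown in Table 2; this reading of the table layout is an interpretation, not something the text states.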
The dataset includes 468 months of rainfall from January 1980 to December 2018. In this study, the data from January 1980 to December 2003 at each hydrological station are used for the training phase, and the data from January 2004 to December 2018 are used for the test phase.
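The chronological split can be sketched as follows. The helper names are hypothetical and the code only enumerates the (year, month) index of the records, not the rainfall values themselves.

```python
def month_index(first_year=1980, last_year=2018):
    """List every (year, month) pair in the study period, in order."""
    return [(y, m) for y in range(first_year, last_year + 1) for m in range(1, 13)]


def time_based_split(months, last_training_year=2003):
    """Split chronologically: years up to last_training_year train, the rest test."""
    train = [ym for ym in months if ym[0] <= last_training_year]
    test = [ym for ym in months if ym[0] > last_training_year]
    return train, test
```

With these dates the training phase covers 288 months and the test phase 180 months, about 62% and 38% of the 468 records. Splitting by time rather than at random keeps all test months later than all training months.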
3 Results
3.1 The rainfall forecasting of the MLP model
After many experiments with the Adam and L-BFGS methods, the study found the optimal MLP model, whose core parameter values are listed in Table 3.
Table 3
Item  Configuration

Number of inputs  12 
Number of hidden layers  3 
Hidden layer sizes  12/12/12 
Number of outputs  1 
Learning rate init  0.001 
Iter no change  10 
Beta 1  0.9 
Validation_fraction  0.1 
Alpha  0.0001 
Max iter  10000 
Power_t  0.5 
Beta 2  0.999 
Solver  Adam, L-BFGS
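The settings in Table 3 can be collected in a dictionary of keyword arguments. The names below follow scikit-learn's `MLPRegressor`; using that class is an assumption, since the text only states that the models were built on the Python platform, and the helper `params_for` is hypothetical.

```python
# Keyword names follow scikit-learn's MLPRegressor; using that class is an
# assumption, since the text only says the models were built in Python.
BASE_PARAMS = {
    "hidden_layer_sizes": (12, 12, 12),
    "activation": "relu",
    "learning_rate_init": 0.001,
    "n_iter_no_change": 10,
    "beta_1": 0.9,
    "beta_2": 0.999,
    "validation_fraction": 0.1,
    "alpha": 0.0001,
    "max_iter": 10000,
    "power_t": 0.5,
}


def params_for(solver):
    """Return the Table 3 settings with the chosen solver ('adam' or 'lbfgs')."""
    if solver not in ("adam", "lbfgs"):
        raise ValueError("solver must be 'adam' or 'lbfgs'")
    return {**BASE_PARAMS, "solver": solver}


# Usage sketch, if scikit-learn is installed:
#   from sklearn.neural_network import MLPRegressor
#   model = MLPRegressor(**params_for("adam"))
```

In scikit-learn, several of these settings (for example `beta_1`, `beta_2`, and `learning_rate_init`) only take effect with the Adam solver, so the two configurations differ in more than the solver name alone.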
The results of the simulation by the MLP model with the Adam and L-BFGS optimization methods are shown in Figure 6. The line charts in Figure 6(a)–(c) show a relatively good fit between the training and test data for the MLP models with both methods. The difference between the two methods at the three hydrological stations can hardly be distinguished in the figures. Hence, the accuracy measures are given in more detail in Table 4.
Figure 6
Table 4
Parameter  Hue (Adam)  Hue (L-BFGS)  Hue (average)  Namdong (Adam)  Namdong (L-BFGS)  Namdong (average)  Aluoi (Adam)  Aluoi (L-BFGS)  Aluoi (average)

R_squared  0.999  0.997  0.998  0.986  0.984  0.985  0.991  0.988  0.990 
NSE  0.999  0.998  0.999  0.996  0.995  0.996  0.998  0.997  0.998 
MAE  2.97  5.12  4.045  14.37  16.36  15.37  7.81  9.59  8.70 
RMSE  4.38  6.24  5.31  17.18  20.21  18.70  9.85  13.11  11.48 
Table 4 compares the Adam and L-BFGS optimization methods. The statistics show that all three hydrological stations have more accurate values with the Adam method. The best model is Hue with R_squared = 0.999, NSE = 0.999, MAE = 2.97, and RMSE = 4.38; the second-best model is Aluoi with R_squared = 0.991, NSE = 0.998, MAE = 7.81, and RMSE = 9.85; and the third-best model is Namdong with R_squared = 0.986, NSE = 0.996, MAE = 14.37, and RMSE = 17.18.
3.2 The rainfall forecasting of the kNN model
The parameters in Table 5 give the optimal values for the kNN model with distance orders p = {2, ∞}. These values were obtained after many experiments.
Table 5
Item  Configuration

Algorithm  auto
Leaf_size  30
Metric  Minkowski
P  {2, ∞}
N_neighbors  3
Weights  uniform
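The kNN settings can be expressed in the same way as keyword arguments. The names follow scikit-learn's `KNeighborsRegressor`, which is an assumption about the library used; the helper `knn_params` is hypothetical. For p = ∞ the sketch selects the Chebyshev metric, which is the limiting case of the Minkowski distance, rather than passing an infinite order.

```python
import math

# Keyword names follow scikit-learn's KNeighborsRegressor; using that class
# is an assumption, since the text only says the models were built in Python.
BASE_KNN_PARAMS = {
    "n_neighbors": 3,
    "weights": "uniform",
    "algorithm": "auto",
    "leaf_size": 30,
}


def knn_params(p):
    """Return the kNN settings for a Minkowski order p (2 or infinity)."""
    if math.isinf(p):
        # The Minkowski distance tends to the Chebyshev distance as p grows.
        return {**BASE_KNN_PARAMS, "metric": "chebyshev"}
    return {**BASE_KNN_PARAMS, "metric": "minkowski", "p": p}


# Usage sketch, if scikit-learn is installed:
#   from sklearn.neighbors import KNeighborsRegressor
#   model = KNeighborsRegressor(**knn_params(2))
```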
Figure 7 shows the kNN rainfall forecasts using the distance metric with p = 2 and p = ∞ at the Hue, Namdong, and Aluoi hydrological stations. Figure 7(b) and (c) indicate that the predicted and actual rainfall are very close, with no significant differences. Because the better distance metric is difficult to distinguish from these two graphs, Table 6 is provided for the evaluation. Figure 7(a) shows that the kNN with p = ∞ gives the best rainfall forecast at the Hue hydrological station, where the predicted and actual data match very closely. Meanwhile, the fit between predicted and actual rainfall for p = 2 is looser.
Figure 7
Table 6
Parameter  Hue p = 2  Hue p = ∞  Hue (average)  Namdong p = 2  Namdong p = ∞  Namdong (average)  Aluoi p = 2  Aluoi p = ∞  Aluoi (average)

R_squared  0.982  0.993  0.985  0.983  0.981  0.982  0.987  0.979  0.982 
NSE  0.996  0.998  0.997  0.992  0.991  0.992  0.996  0.994  0.995 
MAE  32.83  16.39  24.61  21.67  28.65  25.16  19.05  31.46  25.255 
RMSE  43.62  27.70  35.66  61.88  76.21  69.045  29.36  44.78  37.07 
Table 6 lists the R_squared, NSE, MAE, and RMSE values obtained from the rainfall predictions at the three hydrological stations using the kNN model with distance metrics of p = 2 and p = ∞. The forecast errors are lowest at the Hue station with p = ∞ and second lowest at the Aluoi station with p = 2. Among the best configurations for each station, the forecast errors at the Namdong station with p = 2 are the highest. The R_squared, NSE, MAE, and RMSE of the best model for the Hue, Aluoi, and Namdong hydrological stations are 0.993, 0.998, 16.39, and 27.70; 0.987, 0.996, 19.05, and 29.36; and 0.983, 0.992, 21.67, and 61.88, respectively.
3.3 Comparison and analysis of simulation results between the MLP model and the kNN model
The MLP and kNN models are used to assess rainfall during the 1980 to 2018 period in Thua Thien Hue Province. The line chart in Figure 8 summarizes the best rainfall predictions at the Hue, Aluoi, and Namdong hydrological stations, using the best distance metric for the kNN model and the best optimization method for the MLP model.
Figure 8
Figure 9 and Table 7 show that the average R_squared and NSE values of the two models range from 0.988 to 0.997, which indicates that the simulations are highly accurate when compared with the observed data. The average error indicators of the MLP and kNN models range from 8.38 to 39.65: the average MAE is 19.04 for the kNN model and 8.38 for the MLP model, and the average RMSE is 39.65 and 10.47, respectively. These values indicate a good fit for both the rainfall training data and the rainfall forecasting data.
Figure 9
Table 7
Parameter  kNN Hue p = ∞  kNN Namdong p = 2  kNN Aluoi p = 2  kNN average  MLP Hue Adam  MLP Namdong Adam  MLP Aluoi Adam  MLP average

R_squared  0.993  0.983  0.987  0.988  0.999  0.986  0.991  0.992
NSE  0.998  0.992  0.996  0.995  0.999  0.995  0.998  0.997
MAE  16.39  21.67  19.05  19.04  2.97  14.37  7.81  8.38
RMSE  27.70  61.88  29.36  39.65  4.38  17.18  9.85  10.47
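The average columns can be checked by recomputing them from the per-station values. The sketch below uses the MAE and RMSE values reported in the table (in the order Hue, Namdong, Aluoi); the variable and function names are illustrative.

```python
# Per-station values for the best configurations, in the order Hue, Namdong, Aluoi.
BEST_KNN = {"MAE": [16.39, 21.67, 19.05], "RMSE": [27.70, 61.88, 29.36]}
BEST_MLP = {"MAE": [2.97, 14.37, 7.81], "RMSE": [4.38, 17.18, 9.85]}


def station_average(values):
    """Mean over the three stations, rounded to two decimals as in the table."""
    return round(sum(values) / len(values), 2)


if __name__ == "__main__":
    for name, table in (("kNN", BEST_KNN), ("MLP", BEST_MLP)):
        for metric, values in table.items():
            print(name, metric, station_average(values))
```

The recomputed averages are 19.04 and 39.65 for the kNN model and 8.38 and 10.47 for the MLP model, so the MLP errors are on average less than half of the kNN errors for MAE and roughly a quarter for RMSE.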
In addition, Figure 10 compares the predicted and actual precipitation in the training and testing periods and shows the correlation coefficient for the best kNN and MLP prediction models. The MLP model achieves the better correlation coefficient.
Figure 10
4 Discussion
Simulating rainfall with the MLP and kNN models in four different configurations yielded the following findings. Both models performed well and reliably for rainfall prediction; the forecast values were highly accurate compared with the observations, with R_squared and NSE values of at least 0.979 and RMSE values of at most 76.21. The MLP model with the Adam optimization method gave the best accuracy for rainfall prediction compared with the other methods.
The study predicts a time series of monthly rainfall from 1980 to 2018 at three hydrological stations: the Hue station is located downstream, and the Aluoi and Namdong stations are located upstream.
However, several recent rainfall studies have also incorporated factors that influence precipitation. Choubin et al. [52] evaluated factors that may influence the autumn rainfall forecast in Kerman Province, Iran, which consisted of large-scale oceanic and atmospheric information. Combining these factors with accumulated rainfall data gave high accuracy for the autumn rainfall forecast. Rainfall data vary nonlinearly. Therefore, Choubin et al. [53] applied data normalization in the rainfall study of the Maharlu-Bakhtegan basin, Iran, and the results indicated that the MLP model using normalized data gave a lower RMSE than that of this study. In addition, Najafzadeh et al. [54,55] used the neuro-fuzzy group method of data handling (NF-GMDH) based on self-organized models and the group method of data handling gene-expression programming (GMDH-GEP) model to forecast bridge pier scour depth under debris flow effects and free span expansion rates below pipelines under waves, respectively. Their results showed that the RMSE of these two models is also smaller than the RMSE of this study.
Even though the precipitation at the three hydrological stations varies seasonally with different complexity, applying these two models with four configurations achieved highly reliable results. Hence, the approach can be used for rainfall forecasting in other regions of Vietnam. In addition, the study results are a useful reference for the provincial authority in developing short-term plans for natural disaster mitigation.
5 Conclusion
This study predicts precipitation in the Perfume River basin and indicates that the MLP model is more accurate than the kNN model. The measured rainfall was collected from three hydrological stations in the Hue, Namdong, and Aluoi areas of the province from 1980 to 2018. The dataset is separated using time-based criteria: training data (1980–2003) and test data (2004–2018). The results demonstrate the effectiveness of the models with the core parameters mentioned earlier. In addition, the study results may help the Thua Thien Hue government formulate short-term natural disaster mitigation plans for the basin.
Acknowledgments
We would like to thank the School of Civil Engineering of National Kaohsiung University of Science and Technology, Taiwan, and Thu Dau Mot University, Vietnam, for their implementation and financial support of this study. We also thank the master class of MSE#07HCM at FSB School Of Business and Technology and Dr. Hector Tibo and Dr. June Raymon (who are currently Ph.D. students at the National Kaohsiung University of Science and Technology of Taiwan) for their help in correcting the academic writing of the paper.

Funding information: This research is funded by Thu Dau Mot University, Vietnam. The APC was funded by Thu Dau Mot University.

Author contributions: Conceptualization, discussion, and conclusions, material and methods: Nguyen Hong Giang; writing, original draft preparation: Tran Dinh Hieu, Hoang Ngo Tu Do; writing, review, and editing: Yu Ren Wang, Quan Thanh Tho, Le Anh Phuong; funding acquisition: Tran Dinh Hieu and Nguyen Hong Giang. All authors have read and agreed to the published version of the manuscript.

Conflict of interest: The authors declare no conflict of interest.

Data availability statements: The datasets analyzed during the study are available from the corresponding author on request.
References
[1] Wang B, Xiang B, Li J, Webster PJ, Rajeevan MN, Liu J, et al. Rethinking Indian monsoon rainfall prediction in the context of recent global warming. Nat Commun. 2015;6(1):1–9. Search in Google Scholar
[2] Cramer S, Kampouridis M, Freitas AA, Alexandridis AK. An extensive evaluation of seven machine learning methods for rainfall prediction in weather derivatives. Expert Syst Appl. 2017;85:169–81. Search in Google Scholar
[3] Kusiak A, Wei X, Verma AP, Roz E. Modeling and prediction of rainfall using radar reflectivity data: a datamining approach. IEEE Trans Geosci Remote Sens. 2012;51(4):2337–42. Search in Google Scholar
[4] Bui DT, Pradhan B, Lofman O, Revhaug I, Dick ØB. Regional prediction of landslide hazard using probability analysis of intense rainfall in the Hoa Binh province, Vietnam. Nat Hazards. 2013;66(2):707–30. Search in Google Scholar
[5] Bonakdari H, Moeeni H, Ebtehaj I, Zeynoddin M, Mahoammadian A, Gharabaghi B. New insights into soil temperature time series modeling: linear or nonlinear? Theor Appl Climatol. 2019;135(3):1157–77. Search in Google Scholar
[6] Labat D, Ababou R, Mangin A. Linear and nonlinear input/output models for karstic springflow and flood prediction at different time scales. Stoch Environ Res risk Assess. 1999;13(5):337–64. Search in Google Scholar
[7] Adamowski J, Sun K. Development of a coupled wavelet transform and neural network method for flow forecasting of nonperennial rivers in semiarid watersheds. J Hydrol. 2010;390(1–2):85–91. Search in Google Scholar
[8] Choubin B, KhalighiSigaroodi S, Malekian A, Ahmad S, Attarod P. Drought forecasting in a semiarid watershed using climate signals: a neurofuzzy modeling approach. J Mt Sci. 2014;11(6):1593–605. Search in Google Scholar
[9] Choubin B, Malekian A, Samadi S, Khalighi‐Sigaroodi S, Sajedi‐Hosseini F. An ensemble forecast of semi‐arid rainfall using large‐scale climate predictors. Meteorol Appl. 2017;24(3):376–86. Search in Google Scholar
[10] Zeinolabedini M, Najafzadeh M. Comparative study of different waveletbased neural network models to predict sewage sludge quantity in wastewater treatment plant. Environ Monit Assess. 2019;191(3):1–25. Search in Google Scholar
[11] Najafzadeh M, Oliveto G. Riprap incipient motion for overtopping flows with machine learning models. J Hydroinf. 2020;22(4):749–67. Search in Google Scholar
[12] Najafzadeh M, Ghaemi A. Prediction of the fiveday biochemical oxygen demand and chemical oxygen demand in natural streams using machine learning methods. Environ Monit Assess. 2019;191(6):1–21. Search in Google Scholar
[13] Hosseini S, Azizi M. The hybrid technique for DDoS detection with supervised learning algorithms. Computer Netw. 2019;158:35–45. Search in Google Scholar
[14] Govindarajan M, Chandrasekaran RM. Intrusion detection using neural based hybrid classification methods. Computer Netw. 2011;55(8):1662–71. Search in Google Scholar
[15] Eslamloueyan R. Designing a hierarchical neural network based on fuzzy clustering for fault diagnosis of the Tennessee–Eastman process. Appl Soft Comput. 2011;11(1):1407–15. Search in Google Scholar
[16] Mahsin MD. Modeling rainfall in Dhaka division of Bangladesh using time series analysis. J Math Model Appl. 2011;1(5):67–73. Search in Google Scholar
[17] Alizadeh Z, Yazdi J, Kim JH, AlShamiri AK. Assessment of machine learning techniques for monthly flow prediction. Water. 2018;10(11):1676. Search in Google Scholar
[18] Ren J, Ren B, Zhang Q, Zheng X. A Novel hybrid extreme learning machine approach improved by knearest neighbor method and fireworks algorithm for flood forecasting in medium and small watershed of Loess region. Water. 2019;11(9):1848. Search in Google Scholar
[19] Nkoana R. Artificial neural network modelling of flood prediction and early warning. Master Degree. Bloemfontein: University of the Free State; 2011. ufs.ac.za. Search in Google Scholar
[20] Di Piazza A, Conti FL, Noto LV, Viola F, La Loggia G. Comparative analysis of different techniques for spatial interpolation of rainfall data to create a serially complete monthly time series of precipitation for Sicily, Italy. Int J Appl Earth Obs Geoinf. 2011;13(3):396–408. Search in Google Scholar
[21] Chang TK, Talei A, Alaghmand S, Ooi MPL. Choice of rainfall inputs for eventbased rainfallrunoff modeling in a catchment with multiple rainfall stations using datadriven techniques. J Hydrol. 2017;545:100–8. Search in Google Scholar
[22] MartínezAcosta L, MedranoBarboza JP, LópezRamos Á, Remolina López JF, LópezLambraño ÁA. SARIMA approach to generating synthetic monthly rainfall in the Sinú River watershed in Colombia. Atmosphere. 2020;11(6):602. Search in Google Scholar
[23] Loh WY. Classification and regression trees. Wiley Interdiscip Rev Data Min Knowl Discov. 2011;1(1):14–23. Search in Google Scholar
[24] Ahmed U, Mumtaz R, Anwar H, Shah AA, Irfan R, GarcíaNieto J. Efficient water quality prediction using supervised machine learning. Water. 2019;11(11):2210. Search in Google Scholar
[25] Altman NS. An introduction to kernel and nearestneighbor nonparametric regression. Am Stat. 1992;46(3):175–85. Search in Google Scholar
[26] Gayathri K, Marimuthu A. Text document preprocessing with the KNN for classification using the SVM. 2013 7th International Conference on Intelligent Systems and Control (ISCO). IEEE; 2013. p. 453–7. Search in Google Scholar
[27] Amra IAA, Maghari AY. Students performance prediction using KNN and Naïve Bayesian. 2017 8th International Conference on Information Technology (ICIT). IEEE; 2017 May. p. 909–13. Search in Google Scholar
[28] Imandoust SB, Bolandraftar M. Application of knearest neighbor (knn) approach for predicting economic events: Theoretical background. Int J Eng Res Appl. 2013;3(5):605–10. Search in Google Scholar
[29] Tfwala SS, Wang YM. Estimating sediment discharge using sediment rating curves and artificial neural networks in the Shiwen River, Taiwan. Water. 2016;8(2):53. Search in Google Scholar
[30] Jozdani SE, Johnson BA, Chen D. Comparing deep neural networks, ensemble classifiers, and support vector machine algorithms for objectbased urban land use/land cover classification. Remote Sens. 2019;11(14):1713. Search in Google Scholar
[31] Abdullah S, Ismail M, Ahmed AN, Abdullah AM. Forecasting particulate matter concentration using linear and nonlinear approaches for air quality decision support. Atmosphere. 2019;10(11):667. Search in Google Scholar
[32] Naganna SR, Deka PC, Ghorbani MA, Biazar SM, Al-Ansari N, Yaseen ZM. Dew point temperature estimation: application of artificial intelligence model integrated with nature-inspired optimization algorithms. Water. 2019;11(4):742.
[33] Dash Y, Mishra SK, Panigrahi BK. Rainfall prediction for the Kerala state of India using artificial intelligence approaches. Comput Electr Eng. 2018;70:66–73.
[34] Wu W, Liu Y, Ge M, Rostkier-Edelstein D, Descombes G, Kunin P, et al. Statistical downscaling of climate forecast system seasonal predictions for the Southeastern Mediterranean. Atmos Res. 2012;118:346–56.
[35] Vallam P, Qin XS. Multi‐site rainfall simulation at tropical regions: a comparison of three types of generators. Meteorol Appl. 2016;23(3):425–37.
[36] Zhang X, Mohanty SN, Parida AK, Pani SK, Dong B, Cheng X. Annual and non-monsoon rainfall prediction modelling using SVR-MLP: an empirical study from Odisha. IEEE Access. 2020;8:30223–33.
[37] Zahmatkesh Z, Goharian E. Comparing machine learning and decision making approaches to forecast long lead monthly rainfall: The city of Vancouver, Canada. Hydrology. 2018;5(1):10.
[38] Cao W, Wang X, Ming Z, Gao J. A review on neural networks with random weights. Neurocomputing. 2018;275:278–87.
[39] Patra JC, Pal RN, Chatterji BN, Panda G. Identification of nonlinear dynamic systems using functional link artificial neural networks. IEEE Trans Syst Man Cyber Part B. 1999;29(2):254–62.
[40] Simpson PK. Artificial neural systems: foundations, paradigms, applications, and implementations. 1st ed. Elmsford, NY: Pergamon Press, Inc.; 1990.
[41] Freire-Obregon D, Narducci F, Barra S, Castrillon-Santana M. Deep learning for source camera identification on mobile devices. Pattern Recognit Lett. 2019;126:86–91.
[42] Wang Y, Li Y, Song Y, Rong X. The influence of the activation function in a convolution neural network model of facial expression recognition. Appl Sci. 2020;10(5):1897.
[43] Huang X, Gao L, Crosbie RS, Zhang N, Fu G, Doble R. Groundwater recharge prediction using linear regression, multilayer perception network, and deep learning. Water. 2019;11(9):1879.
[44] Xiang Z, Yan J, Demir I. A rainfall‐runoff model with LSTM‐based sequence‐to‐sequence learning. Water Resour Res. 2020;56(1):e2019WR025326.
[45] Verma C, Stoffová V, Illés Z, Tanwar S, Kumar N. Machine learning-based student’s native place identification for real-time. IEEE Access. 2020;8:130840–54.
[46] Basu M, Kumar S, Gupta P, Kumar Singh R. A quantitative analysis of machine learning based regressors for pressure reconstruction in particle image velocimetry applications. Fluids Engineering Division Summer Meeting. Vol. 83716, American Society of Mechanical Engineers; 2020 July. p. V001T02A016.
[47] Malouf R. A comparison of algorithms for maximum entropy parameter estimation. In COLING-02. The 6th Conference on Natural Language Learning 2002 (CoNLL-2002); 2002.
[48] Andrew G, Gao J. Scalable training of L1-regularized log-linear models. Proceedings of the 24th International Conference on Machine Learning; 2007 June. p. 33–40.
[49] Morales JL, Nocedal J. Remark on “Algorithm 778: L-BFGS-B: Fortran subroutines for large-scale bound constrained optimization”. ACM Trans Math Softw. 2011;38(1):1–4.
[50] Zhu C, Byrd RH, Lu P, Nocedal J. Algorithm 778: L-BFGS-B: Fortran subroutines for large-scale bound-constrained optimization. ACM Trans Math Softw. 1997;23(4):550–60.
[51] Wilson DR, Martinez TR. Reduction techniques for instance-based learning algorithms. Mach Learn. 2000;38(3):257–86.
[52] Choubin B, Zehtabian G, Azareh A, Rafiei-Sardooi E, Sajedi-Hosseini F, Kişi Ö. Precipitation forecasting using classification and regression trees (CART) model: a comparative study of different approaches. Environ Earth Sci. 2018;77(8):1–13.
[53] Choubin B, Malekian A, Golshan M. Application of several data-driven techniques to predict a standardized precipitation index. Atmósfera. 2016;29(2):121–8.
[54] Najafzadeh M, Saberi-Movahed F. GMDH-GEP to predict free span expansion rates below pipelines under waves. Mar Georesour Geotechnol. 2019;37(3):375–92.
[55] Najafzadeh M, Saberi-Movahed F, Sarkamaryan S. NF-GMDH-Based self-organized systems to predict bridge pier scour depth under debris flow effects. Mar Georesour Geotechnol. 2018;36(5):589–602.
© 2021 Nguyen Hong Giang et al., published by De Gruyter
This work is licensed under the Creative Commons Attribution 4.0 International License.