Morphological classification method and data-driven estimation of the joint roughness coefficient by consideration of two-order asperity

Abstract: The roughness of the joint surface plays a significant role in evaluating the shear strength of rock. The waviness (first-order) and unevenness (second-order) of natural joints have different effects on the characterization of joint surface roughness. To accurately quantify the influence of the two-order asperity on the prediction of the joint roughness coefficient (JRC) from joint surface profile curves, the optimal sampling interval (SI) of the asperity was determined from the change in the $R_p$ value of the profile curve. The two-order asperities of 48 joint surface profile curves were separated at the optimal sampling interval, and morphological parameters of the asperity such as $i_{ave}$, $R_{max}$, and $R_p$ were counted from three aspects: asperity angle, asperity degree, and trace length. Based on the statistics of the morphological parameters of the two-order asperity, new nonlinear prediction models were proposed. The results showed that the slope mutation point of the curve, SI = 2 mm, is the optimal separation distance for the two-order asperity of the joint surface profile curve. The refined separation method, which considers the morphological parameters of both waviness and unevenness, can characterize the detailed morphological features of the joint surface in more dimensions. The support vector regression (SVR) and random forest (RF) models that take the two-order asperity separation results into account have higher accuracy than traditional models. The prediction accuracy of the SVR model is 7 to 8% higher than that of the SVR(SO) and RF(SO) models. The nonlinear SVR model that considers the separation of the two orders of joint surface roughness is more suitable for the prediction of JRC.


Introduction
A rock mass is a discontinuous medium composed of rock blocks and joint surfaces. Joint surfaces destroy the integrity of the rock mass and lower its mechanical strength [1,2]. Failure along joint surfaces is the cause of many rock mass failures [3,4], such as the failure of the Malpasset arch dam in France [5] and the Jiweishan landslide in Wulong, Chongqing [6]. These accidents all occurred due to the existence of joint surfaces. As one of the most important mechanical properties of rock masses, the shear strength of the joint surface is of great importance for evaluating the stability of rock masses [7][8][9]. The mechanical response of a rock joint to shear is principally influenced by the characteristics of the rock, the normal load, and the surface roughness [10,11]. Generally, joint surfaces can be divided into hard (rigid) joint surfaces and weak joint surfaces according to the properties of the rock itself. The shear strengths of both kinds of joint surface are affected by factors such as lithology, moisture content, connectivity, plane morphology, filling properties, and stress state [12,13]. Hard joint surfaces, the main object of this article, have a large friction coefficient, and most of them contain no filling material. Their mechanical properties are influenced by the roughness of the joint surface and the normal stress. Weak joint surfaces extend over long distances and have a relatively small interfacial friction coefficient. Their interior is generally filled with clay, mud, rock fragments, and so on, and the filler properties dominate the mechanical properties of the joint surface. As for the normal stress, the higher the normal stress, the higher the shear strength of the joint surface usually is [14]. This is especially evident in hard joint surfaces. The normal stress level is mainly influenced by the regional ground stress. The aforementioned two factors, rock characteristics and normal stress, can be obtained by lithological identification and stress testing.
The interface roughness, an important parameter for describing the morphological characteristics of discontinuities such as joints and faults, also largely determines the shear-slip properties of hard joint surfaces [15,16]. However, the quantitative estimation of the joint roughness coefficient (JRC) is very complicated and remains under debate in recent research.
In early work, the JRCs of joint surfaces were obtained mainly by back-calculation from test results [17,18]. Barton first proposed ten standard JRC profile curves based on a large number of experimental studies. Researchers could then derive the JRC of a joint surface by comparing it with the standard profile lines, and this method was of great significance in the research on JRC. Follow-up practice showed that rock joints exhibit different roughness characteristics at different scales [19,20]. The International Society for Rock Mechanics qualitatively suggests that the roughness of a joint surface is characterized by two parts: waviness (first-order) and unevenness (second-order). Patton [21] found that the shear behavior of rock joints is primarily controlled by the second-order asperity at small displacements and by the first-order asperity at large displacements. Other research results [22] also indicated that the waviness plays the major role in the behavior of the joint surface under low normal stress, whereas under high normal stress the effect of the unevenness of the joint surface should be emphasized. Li et al. [23] showed that asperity degradation controls the shear behavior of the rock joint. For a rock joint subjected to shear under non-extreme normal stress conditions, dilation and asperity degradation occur simultaneously.
However, the morphological parameters of the two-order asperities of joint surface profile curves obtained with different sampling intervals are different, which has a great impact on the accurate estimation of the final JRC values. Yu and Vayssade [24] found that $Z_2$ and SF vary for sampling intervals of 0.25, 0.5, and 1.0 mm, indicating that the coefficients of the fitting equations need to be modified for different sampling intervals. Jang et al. [25] found that $Z_2$ and $R_p$ decrease with increasing sampling interval, but SF increases very rapidly with increasing interval. The quantitative grading and accurate characterization of joint surface roughness are the theoretical basis for analyzing the shear resistance of rock masses. Accurately quantifying the waviness and unevenness is therefore crucial for calculating the JRC value and predicting the shear behavior of a rock joint.
Generally speaking, the main methods for quantifying the roughness of joint surfaces include the experimental method, the straight-edge method and the modified straight-edge method, the statistical parameter method, and the fractal method. Geertsema [26] obtained the JRC of specimens using direct shear tests on the joint surface and verified the applicability of the JRC-JCS model. Du [27] modified the straight-edge method and plotted the graphical solution of the modified straight-edge method to derive a mathematical expression of joint surface roughness. Li and Huang [28] proposed a new method that uses the relative undulation Ra and the elongation R jointly to describe the JRC of the joint surface and established an empirical formula for JRC with Ra and R as the two factors. The results provided conditions for the rapid acquisition of the shear strength of joint surfaces. Chen et al. [29] proposed using multifractal theory to quantify JRC in order to reduce the large influence of subjective factors in Barton's formula. Ge et al. [30] proposed a method to describe the joint surface morphology using the bright area percentage BAP and fitted the regression relationship between BAP and JRC. Among these methods, the experimental method is accurate and recognized by most scholars, but it runs contrary to the purpose of using the JRC-JCS model to study the shear strength of the joint surface, since it requires the shear test that the model is meant to avoid. Moreover, experiments take a lot of time and cost. Thus, the method has not been widely promoted in practical applications [31,32]. The straight-edge method and the modified straight-edge method take into account the large fluctuations of the joint surface but ignore the contribution of the small fluctuations to the overall shear strength.
Among the many studies of JRC based on fractal theory, different scholars obtained different results [33]. The statistical parameter method has therefore gradually been accepted as a method for quantifying JRC that effectively overcomes the aforementioned disadvantages and has practical significance [34,35].
The statistical parameter method obtains predicted JRC values through rigorous mathematical formulas and is easy to implement in a program. It effectively avoids the errors caused by subjective factors and is gradually becoming one of the main methods in research on JRC prediction models. Dozens of statistical parameters have been used to express the morphology of joint surfaces, including the average tilt angle $i_{ave}$, the first-order-derivative root mean square $Z_2$, the modified first-order-derivative root mean square $Z_2'$, the roughness index $R_p - 1$, the roughness profile index $R_p$, the structure function SF, the joint surface roughness metric $\theta^*_{max}/(C+1)$, and so on [36]. However, there were two major difficulties in the calculation of JRC using the statistical parameter method. One was the refined characterization of the undulating morphology of the rock structure surface and the accurate acquisition of the main controlling statistical parameters. The other was the complex nonlinear relationship between the statistical parameters of the joint surface morphology and JRC.
For the first problem, it is well known that natural joint surfaces possess two-order roughness, i.e., waviness (first-order) and unevenness (second-order) [37][38][39]. Asperities of both orders experience dilation and degradation, leading to the nonlinear mechanical response of a rock joint to shear loading [40,41]. The waviness generally has a small dip angle and a large asperity height. Conversely, the unevenness has a large dip angle and a small asperity height [22,42]. Zhu et al. [43] analyzed the effect of the two-order asperity on the shear strength and deformation characteristics of the joint surface through shear tests on joint surfaces containing the two-order asperity. They found that under low normal stress, the shear strength of the joint surface increases with the height of the second-order small asperity, and that the two orders of asperity show different shear characteristics during the shearing process. Huang et al. [44] conducted shear tests on natural joint surface samples to explore the influence of the two-order asperities on the JRC at different shear stages. The test results showed that the two orders of asperity contribute differently to the shear failure of the joint surface, and that research on the shear strength of the joint surface should comprehensively consider the shear contribution of the two-order topography. Guo [45] used the particle flow software PFC2D to simulate a joint surface with the two-order asperity and conducted numerical shear tests, finding that the two orders of asperity have different influences on the failure mode and shear mechanical properties of the joint surface. Therefore, the correct quantification of the contribution of the two-order roughness to JRC is of great significance for determining the shear strength of joint surfaces [46][47][48].
To quantify the contribution of the different orders of roughness to the JRC, it is necessary to separate the two-order asperity. Zou et al. [49] classified the low-frequency and high-frequency components of the joint surface profile curve by the wavelet transform method and characterized the two orders of asperity in the profile curve by the low-frequency and high-frequency components. Based on wavelet transform theory and the critical decomposition-level criterion, Yuan et al. [50] separated the first-order and second-order asperities in the profile curve of the joint surface and established a hierarchical JRC regression equation based on the statistical parameters of the first-order and second-order asperities. The determination of the optimal wavelet basis plays a crucial role in the separation of the first-order and second-order asperities by the wavelet transform method. However, the selection of the optimal wavelet basis is time-consuming, labor-intensive, and complicated, and the selection results vary among researchers. Liu et al. [22] used the fixed sampling interval method to decompose the standard profile curves into curves containing only one order of asperity each, and the morphological parameters were also statistically analyzed. The fixed sampling interval method provided a new solution for the separation of the first-order and second-order asperities. However, existing research has not explored the selection of the sampling interval in depth, and further research is needed on the selection of the separation interval for the two-order asperity.
For the second problem, many researchers separated the two-order asperity and established linear regression models between the morphological parameters and the JRC based on the ten standard profile curves proposed by Barton [15,16]. However, the standard profile curves provide too small a data sample and only a single length, so the models performed poorly when applied to profile curves of different lengths. In addition, JRC is a comprehensive parameter describing the morphology of the joint surface, but many researchers quantified JRC using only a single morphological parameter of the joint surface, without simultaneously considering morphological characteristics such as undulation angle, undulation degree, and trace length. Such linear models select only one or two morphological parameters for fitting and regression, which makes it difficult to fully characterize the influence of the morphological parameters on JRC. Although the aforementioned two-order roughness division method produces a large number of statistical parameters that can provide a fine characterization of the joint surface roughness, the complex nonlinear relationships between the parameters are difficult to characterize by traditional regression analysis methods. In recent years, machine learning has increasingly become a focus of attention in the estimation of rock parameters [51,52]. Because rock structure surface data sets are small and discrete, random forest (RF) and support vector regression (SVR) models show strong advantages among the commonly used machine learning models in handling small-sample, nonlinear problems. They not only meet the requirement on training sample size but also ensure the accuracy of prediction [53,54]. However, little research has been done on the morphological classification and statistical regression of rock joint surfaces with consideration of the two-order asperity.
Therefore, in view of the problem that most previous studies were based on the 10 standard profile curves, this article collected 48 joint surface profile curves with lengths from 64 to 112 mm and known JRC values. The optimal sampling interval for the two-order asperity of the profile curve was explored in depth to achieve a precise separation of the two-order asperity of the 48 joint surface profile curves. The morphological parameters of the two-order asperity describing asperity angle, asperity degree, and trace length were counted. With these refined morphological parameters, the morphology of the first-order and second-order asperities was accurately and effectively quantified. Thirty-eight curves were randomly selected to establish the training database, and SVR and RF regression models that could capture the complex relationship between the morphological parameters and the JRC were constructed to determine the influence of each morphological parameter on the JRC. Finally, the remaining ten profile curves were used as the prediction set, and the prediction accuracy of the data-driven models constructed from the separation results was also verified by the joint surface shear test. This work may provide new research results for JRC prediction and the calculation of the shear strength of rock joints.

Data set
To study the determination of joint surface roughness from the two-order asperity with data-driven models, 48 joint surface profile curves with known JRC were collected, of which curves 1-10 were the standard profile curves, curves 11-22 were derived from Grasselli [55][56][57], and curves 23-48 were derived from Bandis et al. [58]. The JRC values of the 48 profile curves had been determined by the respective researchers through direct shear tests, so the values are reliable. A large number of researchers have studied the influence of the roughness of the joint surface on the shear strength.

Standard JRC profile decomposition method
The first-order and second-order asperities influence the roughness of a rock mass joint surface quite differently. Under a large sampling interval, only the morphological characteristics of the first-order asperity can be collected. Under a small sampling interval, the statistics of the morphological characteristics reflect the combined effect of the first-order and second-order asperities. Therefore, a suitable interval should be used to distinguish the two orders of asperity in the profile curve of the joint surface, and the morphological parameters of each order should be counted. When the sampling interval of the profile curve drops to a certain value, the length of the second-order asperity is included in the total measured length of the profile curve, and the roughness profile index $R_p$ changes markedly. This appears as a change in the slope of the curve fitted to $R_p$ against the sampling interval, and the sampling interval at which the slope changes is taken as the optimal (limit) sampling interval [59]. The profile curve of the joint surface was discretized at the limit sampling interval to obtain the first-order asperity. The second-order asperity was obtained by subtracting the morphological data of the first-order asperity from the morphological data at the minimum sampling interval.
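The dependence of the roughness profile index on the sampling interval can be illustrated with a minimal sketch. It assumes the common definition of $R_p$ as the traced length of the discretized profile divided by its projected length, and it uses a synthetic two-sinusoid profile rather than any of the 48 curves; the function names `resample` and `profile_index` are hypothetical.

```python
import math

def resample(xs, ys, si):
    """Linearly interpolate the profile (xs, ys) at a fixed sampling interval si (mm)."""
    n = int((xs[-1] - xs[0]) / si + 1e-9)          # number of whole intervals that fit
    out_x = [xs[0] + k * si for k in range(n + 1)]
    out_y, j = [], 0
    for x in out_x:
        while j < len(xs) - 2 and xs[j + 1] < x:   # advance to the segment containing x
            j += 1
        t = (x - xs[j]) / (xs[j + 1] - xs[j])
        out_y.append(ys[j] + t * (ys[j + 1] - ys[j]))
    return out_x, out_y

def profile_index(xs, ys):
    """R_p, assumed here to be traced profile length divided by projected length."""
    traced = sum(math.hypot(xs[i + 1] - xs[i], ys[i + 1] - ys[i])
                 for i in range(len(xs) - 1))
    return traced / (xs[-1] - xs[0])

# Synthetic 100 mm profile (illustrative only): long-wavelength waviness plus
# short-wavelength unevenness, digitized every 0.05 mm.
xs = [i * 0.05 for i in range(2001)]
ys = [2.0 * math.sin(2 * math.pi * x / 50.0) + 0.1 * math.sin(2 * math.pi * x / 1.5)
      for x in xs]

intervals = [0.25, 0.5, 1, 2, 5, 10, 20, 25]       # sampling intervals in mm
rp_by_si = {si: profile_index(*resample(xs, ys, si)) for si in intervals}
for si, rp in rp_by_si.items():
    print(f"SI = {si:>5} mm  R_p = {rp:.5f}")
```

Plotting `rp_by_si` against the sampling interval and fitting straight lines to the small-interval and large-interval branches is one way to locate the interval at which the slope changes.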

Data-driven methods
Based on the characteristics of the two-order undulating morphological parameter data of the joint surface, this article selected two data-driven models, RF and support vector machine, and established the machine learning model to determine the JRC of the joint surface.

SVR
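The support vector machine is a learning method based on statistical learning theory and the principle of structural risk minimization, and it is well suited to small samples [60][61][62]. To solve the regression problem, the support vector machine maps the input data x into a high-dimensional feature space F through a nonlinear mapping function $\varphi(x)$ and constructs the estimated function in F. The estimated function is as follows:

$$f(x) = \omega \cdot \varphi(x) + b$$

where f(x) is the output variable, x is the input variable, ω is the weight vector, whose dimension is the dimension of the feature space, and b is the threshold. $\|\omega\|$ is a measure of the generalization ability of the model: the larger $1/\|\omega\|$ is, the better the generalization ability. Assuming that all $l$ training samples $(x_i, y_i)$ can be fitted with accuracy ε, the problem of finding the smallest $\|\omega\|$ is transformed into a quadratic convex optimization problem:

$$\min_{\omega, b} \; \frac{1}{2}\|\omega\|^2$$

s.t.:

$$|y_i - \omega \cdot \varphi(x_i) - b| \le \varepsilon, \quad i = 1, \dots, l$$

Considering the allowable error and introducing the relaxation factors $\xi_i$, $\xi_i^*$ and an equilibrium factor C > 0, the regression estimation problem is transformed into:

$$\min_{\omega, b, \xi, \xi^*} \; \frac{1}{2}\|\omega\|^2 + C \sum_{i=1}^{l} (\xi_i + \xi_i^*)$$

s.t.:

$$y_i - \omega \cdot \varphi(x_i) - b \le \varepsilon + \xi_i, \quad \omega \cdot \varphi(x_i) + b - y_i \le \varepsilon + \xi_i^*, \quad \xi_i, \xi_i^* \ge 0$$

Introducing the Lagrangian multipliers $\alpha_i$, $\alpha_i^*$, $\lambda_i$, and $\lambda_i^*$, the aforementioned problem is written as the Lagrangian function:

$$L = \frac{1}{2}\|\omega\|^2 + C \sum_{i=1}^{l} (\xi_i + \xi_i^*) - \sum_{i=1}^{l} \alpha_i \left(\varepsilon + \xi_i - y_i + \omega \cdot \varphi(x_i) + b\right) - \sum_{i=1}^{l} \alpha_i^* \left(\varepsilon + \xi_i^* + y_i - \omega \cdot \varphi(x_i) - b\right) - \sum_{i=1}^{l} (\lambda_i \xi_i + \lambda_i^* \xi_i^*)$$

The function attains its minimum when the partial derivatives with respect to ω, b, $\xi_i$, and $\xi_i^*$ are 0. Substituting these conditions into the Lagrangian function gives the dual optimization problem:

$$\max_{\alpha, \alpha^*} \; -\frac{1}{2} \sum_{i=1}^{l} \sum_{j=1}^{l} (\alpha_i - \alpha_i^*)(\alpha_j - \alpha_j^*) \langle \varphi(x_i), \varphi(x_j) \rangle - \varepsilon \sum_{i=1}^{l} (\alpha_i + \alpha_i^*) + \sum_{i=1}^{l} y_i (\alpha_i - \alpha_i^*)$$

s.t.:

$$\sum_{i=1}^{l} (\alpha_i - \alpha_i^*) = 0, \quad 0 \le \alpha_i, \alpha_i^* \le C$$

The aforementioned equation is the quadratic programming problem of the support vector machine. Solving this problem gives ω in terms of the data points:

$$\omega = \sum_{i=1}^{l} (\alpha_i - \alpha_i^*) \varphi(x_i)$$

The choice of hyperparameters and kernel functions has an important impact on the prediction accuracy and generalization performance of the model.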
Therefore, an optimal search over hyperparameters and kernel functions was performed, using the MSE as the indicator for parameter tuning and model selection. In support vector machine regression, the kernel function $K(x_i, x_j) = \langle \varphi(x_i), \varphi(x_j) \rangle$ is introduced for nonlinear approximation, and the regression function is obtained from the solution:

$$f(x) = \sum_{i=1}^{l} (\alpha_i - \alpha_i^*) K(x_i, x) + b$$

where $l$ is the number of training samples and $\alpha_i$, $\alpha_i^*$ are the Lagrangian multipliers. Different support vector machines can be generated by selecting different forms of kernel functions such as radial basis functions, polynomial functions, perceptron (sigmoid) functions, and linear functions. The search showed that the highest model accuracy is achieved with the Gaussian radial basis kernel function, which was therefore used for the JRC regression study of the joint surface.
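The hyperparameter search described above can be sketched as follows. This is only an illustration under stated assumptions: the article does not name its software, so the use of scikit-learn (`SVR`, `GridSearchCV`) is an assumption, the data are synthetic stand-ins for the 38 normalized training curves with six morphological parameters, and the candidate grids for C and gamma (the kernel parameter g) are hypothetical.

```python
import numpy as np
from sklearn.model_selection import GridSearchCV
from sklearn.svm import SVR

# Synthetic stand-in data: 38 samples x 6 inputs scaled to [-1, 1] (not the article's data).
rng = np.random.default_rng(0)
X_train = rng.uniform(-1.0, 1.0, size=(38, 6))
weights = np.array([0.9, 0.6, 0.8, 0.3, 0.2, 0.2])
y_train = np.tanh(X_train @ weights) + rng.normal(0.0, 0.02, size=38)

# Hypothetical search grid for the penalty factor C and the RBF kernel parameter gamma.
param_grid = {
    "C": [2.0 ** k for k in range(-2, 5)],
    "gamma": [2.0 ** k for k in range(-4, 3)],
}

# Grid search with 5-fold cross-validation, scored by (negative) mean squared error.
search = GridSearchCV(
    SVR(kernel="rbf", epsilon=0.05),
    param_grid,
    scoring="neg_mean_squared_error",
    cv=5,
)
search.fit(X_train, y_train)

best_svr = search.best_estimator_        # refitted on the full training set
print("best parameters:", search.best_params_)
print("cross-validated MSE:", -search.best_score_)
```

The combination with the lowest cross-validated MSE is kept; a finer grid around that point can then be searched in a second pass if needed.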

RF
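RF is an ensemble learning model that integrates multiple decision trees to overcome the overfitting and large error of a single decision tree. For regression, the RF prediction is the average of the predictions of the individual trees:

$$\bar{h}(X) = \frac{1}{K} \sum_{k=1}^{K} h(X, \theta_k)$$

where X is the predictor variable, $\theta_k$ is an independent and identically distributed random vector that governs the growth of the k-th tree, and K is the number of trees in the RF model.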
The regression process of the RF model is as follows:
1) The bootstrap method is used to draw samples with replacement from the training set, generating subsets of the same size, one for each decision tree.
2) When nodes are generated in each decision tree, m variables (m < n) among the n explanatory variables are randomly selected to participate in the growth of the tree. Using the principle of the minimum Gini coefficient or the principle of maximum information gain, the optimal variable is selected for node segmentation to ensure that each tree generates the optimal branch and realizes the growth of the tree.
3) Each tree is generated from top to bottom. The maximum depth of each tree depends on the initially set hyperparameter (max_depth). Branches that exceed the maximum depth are cut off to prevent the model from overfitting.
4) The prediction results of all decision trees are accumulated and averaged to obtain the final prediction result for the prediction set.
The prediction accuracy of the RF model is affected by model hyperparameters such as the number of decision trees (n_estimators), the maximum depth of the decision tree (max_depth), the minimum number of samples required for node division (min_samples_split), and the minimum number of samples of leaf nodes (min_samples_leaf). Among them, the number of regression trees and the maximum depth of the regression tree have the greatest impact on model accuracy. To improve the predictive performance of the model, the hyperparameters were also optimized based on the MSE value and the number of characteristic variables. It was determined that the highest model accuracy is achieved when the number of regression trees is 400 and the maximum depth of the regression trees is 2 for the JRC regression study of rock joint roughness.
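A minimal sketch of an RF regressor with the reported settings (400 trees, maximum depth 2) is given below. It assumes scikit-learn's `RandomForestRegressor`, which the article does not explicitly name, and uses synthetic stand-in data; the final lines check that the forest prediction is the average of the individual tree predictions, as in step 4 of the regression process.

```python
import numpy as np
from sklearn.ensemble import RandomForestRegressor
from sklearn.model_selection import cross_val_score

# Synthetic stand-in data: 38 samples x 6 inputs scaled to [-1, 1] (not the article's data).
rng = np.random.default_rng(0)
X_train = rng.uniform(-1.0, 1.0, size=(38, 6))
weights = np.array([0.9, 0.6, 0.8, 0.3, 0.2, 0.2])
y_train = np.tanh(X_train @ weights) + rng.normal(0.0, 0.02, size=38)

# Hyperparameters as reported in the text; random_state is fixed only for repeatability.
rf = RandomForestRegressor(n_estimators=400, max_depth=2, random_state=0)

# Cross-validated MSE, the indicator used for hyperparameter selection.
cv_mse = -cross_val_score(rf, X_train, y_train,
                          scoring="neg_mean_squared_error", cv=5).mean()

rf.fit(X_train, y_train)
importances = rf.feature_importances_    # relative contribution of each input

# The forest output equals the mean of the per-tree outputs.
tree_mean = np.mean([tree.predict(X_train) for tree in rf.estimators_], axis=0)
print("CV MSE:", cv_mse)
print("feature importances:", np.round(importances, 3))
```

Repeating the cross-validation over several candidate values of `n_estimators` and `max_depth` and keeping the pair with the lowest MSE reproduces the kind of search described in the text.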

Profile curve separation result
When selecting curves, the samples should characterize the morphological features of the ten standard profile curves as fully as possible. Therefore, samples from both ends and the middle of the JRC standard profile curve library were considered. In this case, selecting the 2nd, 5th, and 8th curves, the 3rd, 6th, and 9th curves, or the 4th, 7th, and 10th curves from the standard profile curves are all reasonable choices. However, the overall JRC values of the 48 profile curves selected in this article, which include those of Grasselli and Bandis et al., are relatively large, and many of them are close to the JRC values of the ninth and tenth standard profile curves (JRC > 16.7). The selection of the 4th, 7th, and 10th curves takes into account both the standard profile curves and the JRC distribution characteristics of the sample data source. Therefore, the 4th, 7th, and 10th standard profile curves were selected as the research objects, as shown in Figure 1.
Based on the pixel analysis function of MATLAB software, the grayscale image processing method was adopted. Histogram equalization and homomorphic filtering were performed with the help of the grayscale histograms and the discrete Fourier transform spectral amplitude maps of the low-illumination images, so as to obtain the profile curve coordinate data at different intervals. To verify the feasibility of the processing method and the accuracy of the data, the standard profile curves were discretized at SI = 0.5 mm, and the resulting $R_p$ values were compared with published data [31]. The comparison results are shown in Figure 2. Because the $R_p$ values were obtained by different researchers with different methods, the results vary slightly.
The values obtained by the grayscale image processing method were highly consistent with, and differed only slightly from, other research results, which demonstrated the rationality of the grayscale image processing method. Therefore, the grayscale image processing method was used to discretize the profile curves of the joint surface at different sampling intervals to count the morphological parameter values of the research objects. The sampling intervals were set to SI = 0.25, 0.5, 1, 2, 5, 10, 20, and 25 mm. The statistical results are shown in Figure 3. It can be seen that at SI = 2 mm, the slope of the straight line fitted to $R_p$ against SI changes significantly. The change in slope means that the previously ignored second-order asperity length is included in the total measured length of the profile curve, so $R_p$ suddenly increases, causing the slope to change. Therefore, it can be determined that the limit sampling interval of the two-order asperity is 2 mm.
To obtain a profile curve of the joint surface that contains only the first-order asperity, the collected profile curves were discretized at the limit sampling interval SI = 2 mm. The resulting discrete data contain only the first-order asperity. The profile curves were then discretized at the minimum sampling interval SI = 0.1 mm. The results are shown in Figure 4. The overall shapes of the profile curve at the limit sampling interval and at the minimum sampling interval are almost the same, but the sampling at the limit interval ignores the second-order asperity, which therefore does not enter the calculation of the morphological parameters. Subtracting the curve at the limit sampling interval from the curve at the minimum sampling interval therefore yields the second-order small undulating topography on its own, and the two-order asperity can be separated successfully. The same method was adopted to separate the two-order asperity of the remaining profile curves before the next step was carried out.
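The separation step can be sketched as follows, under several assumptions: the profile is already digitized at 0.1 mm, the first-order curve is obtained by sampling at 2 mm and connecting the samples with straight lines, and the second-order curve is the pointwise difference. The helper names `interp`, `separate`, and `r_max` are hypothetical, the profile is synthetic, and $R_{max}$ is taken as the difference between the maximum and minimum ordinates.

```python
import math

def interp(x, xs, ys):
    """Piecewise-linear interpolation of (xs, ys) at x; xs must be increasing."""
    if x <= xs[0]:
        return ys[0]
    if x >= xs[-1]:
        return ys[-1]
    lo, hi = 0, len(xs) - 1
    while hi - lo > 1:                      # binary search for the enclosing segment
        mid = (lo + hi) // 2
        if xs[mid] <= x:
            lo = mid
        else:
            hi = mid
    t = (x - xs[lo]) / (xs[hi] - xs[lo])
    return ys[lo] + t * (ys[hi] - ys[lo])

def separate(xs, ys, si_limit=2.0):
    """Split a finely sampled profile into first-order and second-order parts."""
    n = int((xs[-1] - xs[0]) / si_limit + 1e-9)
    coarse_x = [xs[0] + k * si_limit for k in range(n + 1)]
    coarse_y = [interp(x, xs, ys) for x in coarse_x]       # profile sampled at SI = 2 mm
    first = [interp(x, coarse_x, coarse_y) for x in xs]    # first-order curve on the fine grid
    second = [y - f for y, f in zip(ys, first)]            # residual = second-order asperity
    return first, second

def r_max(ys):
    """Asperity degree, assumed here to be y_max - y_min."""
    return max(ys) - min(ys)

# Synthetic 100 mm profile sampled at the minimum interval SI = 0.1 mm (illustrative only).
xs = [i * 0.1 for i in range(1001)]
ys = [2.0 * math.sin(2 * math.pi * x / 50.0) + 0.1 * math.sin(2 * math.pi * x / 1.5)
      for x in xs]

first, second = separate(xs, ys)
print("R_max first-order :", round(r_max(first), 3))
print("R_max second-order:", round(r_max(second), 3))
```

By construction the second-order curve vanishes at the 2 mm sampling points, so it carries only the detail that the limit sampling interval cannot resolve.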

Statistical results of morphological parameters
Based on the separation results of the two-order asperity on the joint surface, the morphological parameters of the two-order asperity were counted to quantitatively characterize the roughness of the joint surface. The morphological parameters of the joint surface were divided into three types: the roughness angle parameter, the roughness degree parameter, and the trace length parameter. Combining the types of morphological parameters with the size effect of the joint surface profile curve, appropriate morphological parameters were selected to predict the roughness coefficient. The selected parameters and the corresponding calculation equations were expressed as follows [24,63] where L is the line length of the joint surface profile curve, $y_i$ is the ordinate of the i-th discrete point of the joint surface, and N is the number of data points.
where $y_{max}$ and $y_{min}$ represent the maximum and minimum values of the discrete data of the joint surface profile curve on the y coordinate.
where $x_i$ is the abscissa of the i-th discrete point of the joint surface profile curve.
4) Roughness profile index ($R_p$)
In this article, superscripts were used to distinguish the same morphological parameter of the two orders of asperity. For example, $R_p^{1st}$ and $R_p^{2nd}$ represent the roughness profile index of the first-order and second-order asperity, respectively. The other morphological parameters adopted the same notation. The statistical results of the morphological parameters of the 48 profile curves and the statistical characteristics of the results are shown in Figures 5 and 6, respectively. It can be seen that there is a certain correlation between the morphological parameters of the first-order asperity ($i_{ave}^{1st}$, $R_{max}^{1st}$, $R_p^{1st}$). For the same morphological parameter of the two orders of asperity, taking the asperity angle as an example, the standard deviation, average value, maximum value, minimum value, and median of $i_{ave}^{1st}$ and $i_{ave}^{2nd}$ differ markedly. The correlation analysis shows that the correlation between $i_{ave}^{1st}$ and $i_{ave}^{2nd}$ is low. A lower correlation indicates that the two parameters carry largely independent information, so that together they describe the morphology more comprehensively. This supports the rationality and effectiveness of the separation results of the two-order asperity of the joint surface profile curve. The correlation coefficients between the six morphological parameters ($i_{ave}^{1st}$, $R_{max}^{1st}$, $R_p^{1st}$, $i_{ave}^{2nd}$, $R_{max}^{2nd}$, and $R_p^{2nd}$) and JRC are 0.938, 0.910, 0.941, 0.626, 0.593, and 0.586, respectively, all of which indicate obvious correlations. Therefore, these six morphological parameters can be used as effective descriptive parameters of JRC. The dimensions and value ranges of the morphological parameters and JRC were different.
To speed up model training and avoid numerical errors, the morphological parameters and JRC were normalized to [−1, 1] before being used in the machine learning models. In this study, the max-min method was used to normalize the input and output data sets. The normalization equation was as follows:

$$x' = \frac{2(x - x_{min})}{x_{max} - x_{min}} - 1$$

where $x'$ is the normalized data, x is the original data, and $x_{min}$ and $x_{max}$ are the minimum and maximum of the original data. After normalization, the data were randomly divided into a training set of 38 curves and a prediction set of 10 curves, a ratio of approximately 0.8:0.2. The training set was used for model learning and training. The prediction set was used to test the performance of the model after it was built and to evaluate the prediction results and accuracy. To evaluate the rationality of the data set partition, the statistical characteristics of the training set and prediction set were computed and a T-test was carried out. The test describes the distribution characteristics of the two data sets and compares the differences between them (Tables 1 and 2).
It means that the two sets have similar distribution characteristics and are not significantly different from each other when the P-value between the two sets is
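The normalization and data partition described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the scaling is assumed to take the common linear form x′ = 2(x − x_min)/(x_max − x_min) − 1, the function names `minmax_scale` and `split_indices` are hypothetical, and the JRC column is made-up data rather than the 48 measured curves.

```python
import random

def minmax_scale(values, lo=-1.0, hi=1.0):
    """Linearly map values onto [lo, hi] using their minimum and maximum."""
    v_min, v_max = min(values), max(values)
    span = v_max - v_min
    if span == 0:
        raise ValueError("cannot scale a constant column")
    return [lo + (hi - lo) * (v - v_min) / span for v in values]

def split_indices(n_samples, n_train, seed=0):
    """Randomly partition sample indices into training and prediction sets."""
    idx = list(range(n_samples))
    random.Random(seed).shuffle(idx)
    return sorted(idx[:n_train]), sorted(idx[n_train:])

# Hypothetical JRC column for 48 profile curves (illustrative values only).
jrc = [0.4 * k for k in range(48)]
jrc_scaled = minmax_scale(jrc)

# 38 training curves and 10 prediction curves, roughly 0.8:0.2.
train_idx, pred_idx = split_indices(len(jrc), n_train=38)
```

Fixing the seed keeps the partition reproducible, which matters when the statistical characteristics of the two subsets are compared afterwards.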

Regression results
The MSE was used as the evaluation indicator, and grid search with cross-validation was used to optimize C and g of the SVR model. The SVR model achieved its optimal prediction accuracy at ε = 0.05, C = 1.4641, and g = 1.1096. After hyperparameter tuning, the RF model achieved its optimal accuracy with n_estimators = 400 and maximum_depth = 2. The prediction results of the SVR and RF models are shown in Figure 7. Both models achieve good precision: the fitted correlation coefficients R 2 are 0.954 and 0.933, respectively (both above 0.9). The deviation of the predicted values from the true values is small, which indicates that the predictions of the SVR and RF models are reasonable and realistic. For the prediction set, the SVR model has a higher R 2 than the RF model, and its mean absolute error (MAE) and root mean square error (RMSE) are lower by 19.5% and 12.8%, respectively. Therefore, both the SVR and RF models are applicable to the prediction of JRC on joint surfaces, but the SVR model is slightly more accurate [23].
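The accuracy indicators used in this comparison can be written out explicitly. The sketch below assumes that R 2 is the usual coefficient of determination and that the percentage reductions are taken relative to the RF value; the helper names and the sample numbers are hypothetical and are not data from this study.

```python
import math

def mae(y_true, y_pred):
    """Mean absolute error."""
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / len(y_true)

def rmse(y_true, y_pred):
    """Root mean square error."""
    return math.sqrt(sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true))

def r2(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean_true = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot

def relative_reduction(reference, improved):
    """Percentage by which `improved` is lower than `reference`."""
    return (reference - improved) / reference * 100.0

# Hypothetical true and predicted JRC values (illustrative only).
jrc_true = [2.0, 8.0, 14.0, 18.0]
jrc_pred = [3.0, 7.0, 15.0, 17.0]

scores = {
    "MAE": mae(jrc_true, jrc_pred),
    "RMSE": rmse(jrc_true, jrc_pred),
    "R2": r2(jrc_true, jrc_pred),
}
```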
To further evaluate the prediction accuracy of the machine learning models based on the profile curve separation results (considering the two-order asperity), machine learning models based on the morphological parameters of the unseparated profile curve (considering only single-order asperity) were also established for comparison; they are denoted SVR(SO) and RF(SO). The comparison of prediction results is shown in Figure 8, and the comparison of accuracy is shown in Table 3. The predictions of the models based on the morphological parameters of the unseparated profile curve do not exceed the value range of JRC (0 to 20), but the models differ in goodness of fit. The SVR and RF models both perform better. The fit of both models is higher than 0.93, the MAE values are less than 1, and the RMSE values are less than 1.2, which indicates that the predicted results of the SVR and RF models are quite reasonable and realistic. The median predicted values of the SVR and RF models differ only slightly from the median true value, and both fall below it. However, the difference between the predicted and true values is smaller for the SVR model than for the RF model, and the SVR model has a higher R 2 and smaller MAE and RMSE than the RF model. This indicates that the SVR model is better suited than the RF model to low-dimensional data sets and to JRC prediction of joint surfaces. In addition, the MAE and RMSE of the models based on the unseparated profile curve are worse than those of the models based on the separation results.
For the two-order asperities of the joint surface, a total of six statistical parameters were selected for the quantitative characterization of roughness. Taking RF as an example, the model is able to assess the importance of the input variables. The model uses cross-validation to estimate the significance of the input variables. The impact of a feature on model accuracy is calculated by randomly shuffling the values of that feature across the samples. Feature importance is measured by the resulting decrease in accuracy, from which the importance score of each variable is obtained: the higher the importance, the greater the impact on model accuracy. The calculated importance scores of the six morphological parameters i ave 1st , R max 1st , R p 1st , i ave 2nd , R max 2nd , and R p 2nd for JRC are 29.4, 26.3, 27.6, 5.4, 5.9, and 5.4%, respectively. The scores show that the waviness (first-order) parameters are more important to JRC, and the unevenness (second-order) parameters contribute less. However, the total importance of the unevenness parameters is greater than 16%, so they also influence the quantitative value of JRC to some extent. Considering the morphological parameters of both waviness and unevenness when quantifying joint profiles and their roughness therefore helps to improve the prediction accuracy of JRC, and the machine learning model based on the separation results is more suitable for JRC prediction.
The classical R p -based linear regression model (LM) of Yu and Vayssade [24] was used to calculate JRC from the profile curve morphological data in the prediction set, and its results were compared with those of the machine learning models in this study. The regression equation is shown in Eq. (15). The prediction results are shown in Figure 9, and the accuracy comparison is shown in Table 4.
Some of the prediction results of the linear model exceed the range of JRC values (0-20), the fitting coefficient is only 0.513, and the correlation is low. The MAE and RMSE are also larger. Therefore, the prediction results are unreasonable, and the linear model performs poorly on the randomly selected prediction set.
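The importance scoring described for the RF model, in which the values of one feature are shuffled and the resulting loss of accuracy is recorded, can be illustrated with a minimal permutation-importance sketch. It is not the implementation used in this study: `toy_model` is a hypothetical stand-in for a trained regressor, the two-feature data set is synthetic, and the increase in MSE is assumed as the accuracy measure.

```python
import random

def mse(y_true, y_pred):
    """Mean squared error."""
    return sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true)

def permutation_importance(predict, X, y, n_repeats=20, seed=0):
    """Mean increase in MSE when each feature column is shuffled in turn."""
    rng = random.Random(seed)
    baseline = mse(y, [predict(row) for row in X])
    scores = []
    for j in range(len(X[0])):
        increase = 0.0
        for _ in range(n_repeats):
            column = [row[j] for row in X]
            rng.shuffle(column)  # break the link between feature j and the target
            X_perm = [row[:j] + [column[i]] + row[j + 1:] for i, row in enumerate(X)]
            increase += mse(y, [predict(r) for r in X_perm]) - baseline
        scores.append(increase / n_repeats)
    return scores

def toy_model(row):
    # Hypothetical stand-in for a trained regressor: feature 0 dominates.
    return 10.0 * row[0] + 1.0 * row[1]

data_rng = random.Random(1)
X = [[data_rng.random(), data_rng.random()] for _ in range(48)]
y = [toy_model(row) for row in X]

raw = permutation_importance(toy_model, X, y)
shares = [100.0 * s / sum(raw) for s in raw]  # importance as percentages
```

Expressed as percentages in this way, the six scores reported above sum to 100%, with the three second-order parameters contributing 5.4 + 5.9 + 5.4 = 16.7%.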
Across the prediction results of the various models, the machine learning models considering the two-order asperities of the joint surface were superior to both the machine learning models based on the morphological parameters of the unseparated profile curve and the linear model in terms of the fitting coefficient, MAE, and RMSE. JRC is a comprehensive parameter describing the morphology of the joint surface. However, many researchers have quantified JRC using only a single joint surface morphological parameter, without simultaneously considering morphological characteristics such as undulation angle, undulation degree, and trace length. Such linear models select only one or two morphological parameters for fitting and regression, which makes it difficult to fully characterize the influence of morphology on JRC. Although the two-order roughness division method produces a large number of statistical parameters that provide a fine characterization of the joint surface roughness, the complex nonlinear relationships between these parameters are difficult to capture with traditional regression analysis. Among the many commonly used machine learning models, nonlinear data-driven models show strong advantages for small-sample, nonlinear problems. They not only suit the available training sample size but also ensure prediction accuracy. This is why the data-driven model proposed in this study can improve the accuracy of the prediction results.
The machine learning model based on the separation results synthesizes the different contributions of the two-order asperity to the roughness of the joint surface and constructs the nonlinear relationship between the morphological parameters and JRC. Therefore, the separation method of joint surface morphological parameters and the nonlinear calculation model proposed in this study are more suitable for the JRC prediction of joint surfaces.

Case application
To verify the accuracy of the roughness prediction of the machine learning model based on the profile curve separation results, direct shear tests of joint surface samples were carried out to measure the peak shear strength and the friction angle of the samples under different stresses. Rebound tests were carried out on the joint surface samples to measure the strength of the wall rock. Based on the results of the shear and rebound tests, the JRC of the joint surface was back-calculated and compared with the predictions of the models used in this study. The test samples were taken from the water inlet of the lower reservoir of the Wuyue Pumped Storage Power Station, located in Yinpeng Township, Guangshan County, Henan Province. The bedrock lithology in the engineering area mainly includes schist of the Mesoproterozoic Guishan Formation, granulite and schist of the Devonian Nanwan Formation, and medium-grained granite intruded in the late Yanshan period. The samples used in this study were medium-grained granite, and the basic physical parameters are shown in Table 5. The sample size was 100 mm × 100 mm × 100 mm. An L-type hammer was used to measure the wall rock strength of the joint surface (after the rebound value is measured, the joint compressive strength (JCS) is obtained from the chart or calculated with the relevant formula). The shear tests were carried out with an automatic bidirectional shear instrument. The constant normal stress was set to 1.0 MPa. The shear load was applied at a rate of 4 mm·min⁻¹, and shearing was stopped when the shear displacement reached 10 mm. The test process and results are shown in Figure 10.
Based on the test results, the JRC of the sample can be calculated by the JRC-JCS equation (Eq. (16)): where τ p is the peak shear strength of the joint surface, σ n is the normal stress on the joint surface, and JCS is the strength of the joint wall rock, which was measured with the rebound hammer on-site. The final calculation results are shown in Table 6. The four JRC values obtained by experiment and classical theoretical calculation are referred to as the experimental joint compressive strength (EX-JCS) results and were used as the reference standard for the JRC prediction of the joint surface.
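The back-calculation of JRC from a shear test can be sketched as follows, assuming that the JRC-JCS equation takes the widely used Barton form τ p = σ n · tan(JRC · log10(JCS/σ n ) + φ b ). The basic friction angle φ b and all numerical values except the 1.0 MPa normal stress are assumptions chosen for illustration; they are not the measured values in Table 6.

```python
import math

def barton_shear_strength(jrc, jcs, sigma_n, phi_b_deg):
    """Peak shear strength (same unit as sigma_n) from Barton's criterion."""
    angle_deg = jrc * math.log10(jcs / sigma_n) + phi_b_deg
    return sigma_n * math.tan(math.radians(angle_deg))

def back_calculate_jrc(tau_p, jcs, sigma_n, phi_b_deg):
    """Invert Barton's criterion for JRC given a measured peak shear strength."""
    mobilized_deg = math.degrees(math.atan(tau_p / sigma_n))
    return (mobilized_deg - phi_b_deg) / math.log10(jcs / sigma_n)

sigma_n = 1.0   # MPa, constant normal stress used in the shear tests
jcs = 100.0     # MPa, hypothetical wall rock strength
phi_b = 30.0    # degrees, hypothetical basic friction angle

# Round trip: a joint with JRC = 10 and the strength it would produce.
tau_p = barton_shear_strength(10.0, jcs, sigma_n, phi_b)
jrc_back = back_calculate_jrc(tau_p, jcs, sigma_n, phi_b)
```

Because the criterion is monotonic in JRC over the usual range of friction angles, the inversion is unique as long as JCS exceeds σ n .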
To verify the accuracy of the machine learning model prediction based on the two-order morphological parameters of the joint surface, the samples were scanned with a three-dimensional structured light scanner, and morphological data with a scanning point spacing of 0.02 mm were obtained. The structured light scanner used in this study has a scanning range of 500 mm × 500 mm and a scanning accuracy of 0.02 mm, and a single scan takes only 4 s. 3DScanner and Geomagic Studio were used to process the point cloud data. Along the shear direction of each sample, 26 joint surface profile curves were extracted at a spacing of 4 mm, and i ave 1st , i ave , and the other morphological parameters of each profile curve were counted. JRC was then predicted for each profile curve. The sample scanning and profile layout are shown in Figure 11.
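The profile layout and the effect of the sampling interval on R p can be illustrated with a short sketch. It assumes that profiles are placed from edge to edge of the 100 mm sample (which gives the 26 positions at 4 mm spacing) and that R p is the ratio of the true trace length to the projected length. The profile itself is synthetic, a long wave with a fine ripple superimposed, and the function names are hypothetical.

```python
import math

def profile_offsets(width_mm, spacing_mm):
    """Offsets of parallel profiles, assuming both sample edges are included."""
    count = int(round(width_mm / spacing_mm)) + 1
    return [i * spacing_mm for i in range(count)]

def roughness_profile_index(x, z, step=1):
    """R_p: true trace length over projected length, keeping every step-th point."""
    xs, zs = x[::step], z[::step]
    true_length = sum(
        math.hypot(xs[i + 1] - xs[i], zs[i + 1] - zs[i]) for i in range(len(xs) - 1)
    )
    return true_length / (xs[-1] - xs[0])

# Synthetic 100 mm profile at 0.02 mm pitch: long wave plus fine ripple.
pitch = 0.02
x = [i * pitch for i in range(5001)]
z = [
    2.0 * math.sin(2 * math.pi * xi / 50.0) + 0.05 * math.sin(2 * math.pi * xi / 0.5)
    for xi in x
]

offsets = profile_offsets(100.0, 4.0)                 # 26 profile positions
rp_fine = roughness_profile_index(x, z, step=1)       # 0.02 mm sampling interval
rp_coarse = roughness_profile_index(x, z, step=100)   # 2 mm sampling interval
```

At the coarse interval the fine ripple is no longer resolved, so R p drops toward the value of the long wave alone; this is the kind of dependence on sampling interval that the separation of the two orders relies on.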
The morphological parameters without the two-order asperity separation were counted based on the profile curves of the joint surface samples. The statistical results of the morphological parameters of the 104 joint surface profile curves from the four samples are shown in Figure 12. Since the ranges and magnitudes of the morphological parameters differ, the data were normalized using Eq. (14) to reduce the order-of-magnitude differences between the parameters, which also makes the distribution characteristics of the morphological parameters easier to observe. For the three waviness (first-order) morphological parameters of the same specimen, although there were local differences in the degree of variation, the overall undulation patterns of the curves were similar. However, the unevenness (second-order) morphological parameters showed a high degree of dispersion. The difference between the morphological parameters of waviness (first-order) and unevenness (second-order) indicated that the refined separation results can characterize the detailed morphological features in more dimensions, features that could not be found in the apparent morphology of interface undulations. In general, the more finely the joint surface morphology is characterized, the more accurate the JRC prediction.
To further validate the applicability of the machine learning model, the SVR and RF models for predicting the JRC of the joint surface based on the profile separation results were established (Figure 13). The predicted values of the SVR and RF models fluctuate slightly around the EX-JCS results. For the four random specimens, the deviation rates of the RF model calculation results from the EX-JCS results are 0. 42   in RF model, and it was 5.57-13.53% in the SVR model. The results indicated that the established test system can meet the requirement of engineering precision (<20%). However, the prediction results of the SVR model were closer to the EX-JCS values than those of the RF model, and the errors were smaller. Therefore, the SVR model based on the profile separation results was more suitable for predicting the joint surface JRC effectively and accurately.
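The deviation-rate check against the EX-JCS reference can be expressed as a small helper. The sketch assumes the deviation rate is the absolute relative difference in percent and uses the 20% engineering tolerance quoted above; the predicted and reference values are hypothetical, not the results for the four specimens.

```python
def deviation_rate(predicted, reference):
    """Absolute relative deviation of a prediction from the reference, in percent."""
    return abs(predicted - reference) / abs(reference) * 100.0

def within_tolerance(predicted, reference, tolerance_pct=20.0):
    """True if the prediction meets the engineering precision requirement."""
    return deviation_rate(predicted, reference) < tolerance_pct

# Hypothetical (predicted JRC, EX-JCS reference) pairs for four specimens.
pairs = [(11.0, 10.0), (8.4, 8.0), (13.1, 12.0), (9.5, 9.0)]
rates = [deviation_rate(p, r) for p, r in pairs]
all_ok = all(within_tolerance(p, r) for p, r in pairs)
```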
To further validate the effectiveness of the SVR and RF models, which consider the waviness and unevenness morphological parameters of the joint surface simultaneously, the morphological parameters without two-order asperity separation were counted from the profile curves of the joint surface samples. Moreover, linear models were also established to predict the JRC of the joint surface. The prediction results of the different methods are shown in Table 7 and Figure 14.
Different models gave different JRC prediction results for the sample profile curves. First, the results calculated by the prediction models were generally larger than those obtained by EX-JCS. This was mainly because the middle 50% (25-75%) of the curves in the model training set had JRC values in the interval 6-15 (as shown in Figure 5). This also indicated that the profile curves of hard joint surfaces of rock masses are generally more undulating in nature, and similar conclusions have been obtained by other scholars [64,65]. Second, the predicted values of the linear equation based on R p exceeded the value range of JRC, so the prediction results were unreasonable. The relative error was also much larger than those of the SVR and RF models. The reason was that the single morphological parameter selected by the linear equation can hardly describe the morphological characteristics of the profile curve, and the equation was established only on the basis of the standard profile curves. Therefore, the error rate was large and the application effect was poor in practical applications. Third, the prediction results of the SVR and RF models based on the profile separation results were well distributed, and none exceeded the range of JRC. The prediction results of the single-order models followed patterns similar to those of the two-order models. However, the relative error between the former and the EX-JCS results was larger than that of the latter: the separation improved the prediction accuracy by 7.2% and 5.0% for the SVR and RF models, respectively. Finally, comparing the SVR and RF models, the SVR prediction results were closer to EX-JCS with less error and were therefore more suitable for predicting and calculating the JRC of joint surfaces.
In summary, the SVR nonlinear model that considers the separation of the two orders of joint surface roughness (waviness and unevenness) was more suitable for the prediction of JRC, and its calculation results are more consistent with the real characteristics of the rock joint surface. It is worth noting that the model in this research applies mainly to igneous rocks with predominantly hard joint surfaces, especially in a dry environment. Moreover, the size effect of joint surface shear strength was not considered. Usually, as the size of the joint surface increases, its shear strength decreases accordingly; these effects should be studied in future work. In addition, more sample data, either from our experimental measurements or from other studies in the literature, are needed for future research.

Conclusions
1) The R p of the profile curve under different sampling intervals was counted by the grayscale image processing method, and the relationship between the sampling interval and R p was studied. The curve slope mutation point SI = 2 mm was the optimal separation distance of the two-order asperity of the joint surface profile curve. Accurate separation of the two-order asperities in 48 profile curves of known roughness was achieved.
2) A quantitative separation method for the two-order asperities of joint surface profile curves was proposed. Based on the separation results of the two-order asperity of the 48 profile curves, the i ave , R max , and R p parameters were counted from three aspects: asperity angle, asperity degree, and trace length. The difference between the morphological parameters of waviness (first-order) and unevenness (second-order) indicated that the refined separation results can characterize the detailed morphological features in more dimensions, which helps to quantify the factors affecting roughness more comprehensively and to improve the prediction accuracy of JRC.
3) A database of morphological parameters of the joint surface was established from the statistics of the separation results of the two-order asperities of the 48 profile curves. Thirty-eight curves were randomly selected as the training set and the remaining 10 curves as the prediction set, and nonlinear prediction models were constructed by machine learning. The comparison showed that the SVR and RF models considering the separation results performed better than both the traditional linear model and the machine learning models built on parameters without two-order asperity separation.
4) Samples with natural joints were prepared for shear tests to calculate the JRC of the samples. The test results were compared with the values predicted by the SVR model, the RF model, the SVR(SO) model, the RF(SO) model, and the linear model. The results showed that, compared with the SVR(SO) and RF(SO) models, the prediction accuracy improved by 7.2% for the SVR model and 5.0% for the RF model. The SVR prediction results were closer to EX-JCS with less error than the RF prediction results and were more suitable for predicting and calculating the JRC of joint surfaces. Thus, the SVR nonlinear model that considers the separation of the two orders of joint surface roughness (waviness and unevenness) was more suitable for JRC prediction, and its calculation results are more consistent with the real characteristics of the rock joint surface.