Open Access. Published by De Gruyter Open Access, February 9, 2022. Licensed under CC BY 4.0.

Rock mass structural surface trace extraction based on transfer learning

Xuefeng Yi, Hao Li, Rongchun Zhang, Zhuoma Gongqiu, Xiufeng He, Lanfa Liu and Yuxuan Sun
From the journal Open Geosciences

Abstract

To solve engineering geological problems in water conservancy, transportation, and mining, it is necessary to obtain information on rock mass structures, such as slopes, foundation pits, and tunnels, in a timely manner. The traditional method for obtaining structural information requires manual measurement, which is time consuming and labor intensive. Because geological information is complicated and diverse, it is not practical to collect structural surface trace images that cover the full range of cases in order to prepare training samples for general deep learning methods. Transfer learning can abstract high-level features from low-level features with a small number of training samples and can thus automatically express the inherent characteristics of objects. This article proposes a rock mass structural surface trace extraction method based on the transfer learning technique that considers the attention mechanism and shape constraints. For the general test set, the accuracy of rock mass structural surface trace recognition with the proposed method reaches 87.2%. Experimental results show that the proposed method has advantages in extracting complicated geological structure information and can provide technical support for the extraction of geological information in the construction of water conservancy, transportation, mining, and related projects.

1 Introduction

The timely acquisition of information regarding rock mass structures such as slopes, foundation pits, and tunnels is a necessary basis for evaluating problems in water conservancy, transportation, mining, and other engineering projects. The two principal traditional methods, scanline and window statistics, mainly rely on measuring tapes and compasses to measure structural surface information; they are inefficient and create heavy workloads. Moreover, in water conservancy and hydropower engineering and in highway and railway projects, most large-scale infrastructure construction encounters alpine gorge environments, where surveying work is very dangerous but necessary. In addition, the accuracy of manual field measurements is difficult to guarantee because it largely depends on subjective judgments. With the development of new technologies, some noncontact measuring methods have gradually been applied to rock mass structural information extraction. Digital cameras mounted on unmanned aerial vehicles (UAVs) or ground platforms are used to collect field geological images, and geological characteristic recognition and extraction can then be completed indoors, which significantly improves both efficiency and safety.

Complicated and diverse rock mass structural surface traces may mix with nonstructural surface information. Manual identification is time consuming and labor intensive, and its results are subjective. Extracting rock mass structural surface information from digital images accurately and automatically therefore remains a difficult problem. Deep learning abstracts high-level features from low-level features and can thus express both the deep and shallow features of objects well, which is advantageous for the extraction of complicated geological structure information. The first mathematical model of neurons, the MP model, was proposed by McCulloch and Pitts in 1943 [1]. In 1986, Rumelhart et al. [2] proposed a multilayer feedforward network trained by an error back propagation algorithm (back propagation (BP) network). The artificial neural network with multiple hidden layers proposed by Hinton and Salakhutdinov has excellent feature learning capabilities and can effectively overcome the training difficulties of deep neural networks through “layerwise pretraining,” which led to the study of deep learning [3]. In recent years, applications of deep learning technologies in the geological field have gradually increased. For example, some studies concentrate on using large-scale images (e.g., satellite and aerial images) to achieve regional geological interpretation and to extract landslides, debris flows, and other macrogeological hazards, while other applications focus on microscale rock lithologies and microfissure identification [4,5]. Singh et al. [6] proposed a basalt rock sample texture recognition method based on deep learning in which 27 parameters are extracted from the RGB or gray image of the rock sample slice to construct inputs of the neural network and classify the rock texture. Li et al. [7] used the transfer learning method to train on sandstone microscopic images and obtained a sandstone microscopic image classification model with high accuracy. Zhang et al. [8] established a deep transfer learning model for rock image set analysis and used the transfer learning method to automatically identify and classify rock lithology. Feng et al. [9] proposed a deep learning method for automatic lithology recognition based on fresh rock surface images and a twin convolutional neural network structure, in which the subchannels of the twin network extract the global and local feature information of the rock to construct a descriptor for lithology identification. Ullo et al. [10] combined transfer learning with a pretrained Mask R-CNN model to detect landslides, although annotating the training images was very time consuming in their method. Xu et al. [11] constructed a labeled postearthquake scene dataset that includes six categories through remote sensing image segmentation and proposed a postearthquake multiple scene recognition model based on the single-shot multibox detector method. Lu et al. [12] generated a landslide sample dataset using orthoimages derived from UAV images and compared four deep learning and transfer learning landslide feature models. Their study analyzed the application of different models to landslides with different scales. Polat et al. [13] classified various types of volcanic rock using transfer learning methods based on DenseNet121 and ResNet50 networks. In their research, the features of thin-section images of rocks were extracted with a single-layer fully connected neural network. Because structural surface traces are diverse and complex, the insufficiency of samples is a difficult problem for the application of general deep learning technology to the identification and extraction of geological information in near-ground engineering projects.
Transfer learning provides an important method for rock mass structure information extraction that requires only a small number of training samples. Zhang et al. [14] applied the Inception-v3 model to identify geological structures and carried out the geological structure classification using k-nearest neighbors, artificial neural networks, and extreme gradient boosting. Chen et al. [15] presented an RS-SMOTE-GBT classification method for rock trace identification, in which the performance of the hybrid classifier was evaluated by combining the SMOTE technique with hyperparameter optimization algorithms. Xu et al. [16] applied a deeply supervised object detector to acquire rock position information and constructed a transfer learning model based on ResNet to extract rock feature information. Geological objects present complicated structural features and carry a variety of geological information. The attention mechanism can train the network model to focus on the decision-making information and ignore irrelevant geological information in the images, which greatly improves its efficiency and accuracy. On the basis of the existing research on rock surface extraction [17], this article conducts research on the extraction of rock mass structural surface traces based on deep learning.

In this article, a ResNet-SA (residual network with shape constraint and attention mechanism) model is proposed for rock mass structural surface trace extraction. The remainder of this article is organized as follows. The detailed methodology of the proposed algorithm for rock mass structural surface trace extraction is introduced in Section 2. In Section 3, the feasibility and accuracy of the algorithm are illustrated through experiments using concrete crack datasets and geological datasets with a small number of samples. Finally, a discussion and our conclusions are given in Section 4.

2 Methodology

2.1 Transfer learning

Transfer learning is a learning process that uses the knowledge from previous models [18,19]. Knowledge can be transferred within different but related domains. Figure 1 shows the schematic of traditional transfer learning between two domains. Instead of starting from scratch, as is the case in most networks, new model parameters can be learned and optimized more efficiently with a small number of samples through transfer learning. As one of the most popular models, residual networks (ResNets) have various applications in computer vision and many other areas [20].

Figure 1: The schematic of traditional transfer learning.

2.1.1 Residual network

In deep learning, a larger number of network layers helps extract more representative characteristics but may cause vanishing gradients during network training. ResNet with skip connections can overcome this problem and improve model training accuracy. The ResNet models mainly include ResNet18, ResNet34, ResNet50, ResNet101, and ResNet152 [21].

Suppose H is a network layer whose input and output are denoted by X and H(X), respectively. In principle, any function H(X) can be fitted by a multilayer neural network. Compared with H(X), the residual function F(X) = H(X) − X is easier to fit during network training and helps avoid vanishing gradients. The residual network module is shown in Figure 2.
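Why the skip connection helps against vanishing gradients can be seen from a short derivation. As a sketch, write the residual function as F(X) = H(X) − X, so that the block output is H(X) = F(X) + X, and let L be a training loss and I the identity matrix (both symbols are introduced here for illustration only):

```latex
H(X) = F(X) + X
\quad\Longrightarrow\quad
\frac{\partial L}{\partial X}
  = \frac{\partial L}{\partial H}\,\frac{\partial H}{\partial X}
  = \frac{\partial L}{\partial H}\left(\frac{\partial F}{\partial X} + I\right)
```

The identity term gives the gradient a direct path to earlier layers, so the gradient with respect to X does not vanish merely because the gradient through the residual branch F becomes small.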

Figure 2: The structure of the residual network module.

2.1.2 ResNet model applicability for rock mass structural surface traces

ImageNet is an important standard dataset to evaluate algorithm performance in image processing research based on deep learning. There are more than 14 million pictures in the ImageNet dataset, which are divided into more than 20,000 classes. In this article, the ResNet50 pretrained neural network model based on the ImageNet dataset officially provided by PyTorch is used for rock mass structural surface trace extraction research.

The rock mass structural surface trace dataset and the ImageNet dataset are two different but related domains. Structural surface trace samples are fed into the pretrained ResNet50 model, and the resulting feature maps are visualized in Figure 3. The main characteristics of the structural surface trace, including color, texture, and edge, are reflected reasonably well in the shallow features, as shown in Figure 3(a)–(c). Therefore, it is appropriate to apply the ResNet50 pretrained model to rock mass structural surface trace recognition and extraction.

Figure 3: The feature map visualization results for ResNet50 with geological inputs: (a) f1_conv1; (b) f3_relu; (c) f5_layer1; (d) f8_layer4_1.

2.2 Attention mechanism

The attention mechanism in machine learning is a technique mimicking cognitive attention [22]: it selectively concentrates on a few relevant parts of the input data and ignores the rest. Bahdanau et al. [23] applied the attention mechanism to language translation research, and it was later extended to other fields. Both channel attention and spatial attention are used in this article to enhance the crucial features of rock mass structural surface traces. The schematic of channel attention and spatial attention is shown in Figure 4.

Figure 4: Schematic of channel attention and spatial attention.

2.2.1 Channel attention and spatial attention

The channel attention module considers each channel of a feature map as a feature detector and focuses only on the meaningful information provided by an input image. It down-weights the channels that have a negative effect on information transmission and thereby strengthens the weights of the effective channels. The input feature map is used to generate a channel attention feature map through global max pooling and global average pooling over the width and height dimensions, followed by a multilayer perceptron; the process is shown in Figure 5(a). The channel attention feature map is calculated by equation (1), in which σ denotes the sigmoid function, and W0 and W1 are the weights of the multilayer perceptron, which is shared by the two pooled descriptors.

$$M_c(F) = \sigma\big(\mathrm{MLP}(\mathrm{AvgPl}(F)) + \mathrm{MLP}(\mathrm{MaxPl}(F))\big) = \sigma\big(W_1(W_0(F^{c}_{\mathrm{avg}})) + W_1(W_0(F^{c}_{\mathrm{max}}))\big). \tag{1}$$
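Equation (1) can be illustrated with a minimal pure-Python sketch on a tiny feature map. The function names, the toy weights, and the ReLU between the two MLP layers are assumptions made for illustration (the ReLU follows common channel attention implementations and is not written out in equation (1)); this is not the authors' implementation.

```python
import math

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

def matvec(w, v):
    """Multiply matrix w (list of rows) by vector v."""
    return [sum(wi * vi for wi, vi in zip(row, v)) for row in w]

def shared_mlp(v, w0, w1):
    """Two-layer MLP W1(ReLU(W0 v)); the ReLU is an assumption (see lead-in)."""
    hidden = [max(0.0, h) for h in matvec(w0, v)]
    return matvec(w1, hidden)

def channel_attention(feature, w0, w1):
    """feature: C x H x W nested lists. Returns one weight in (0, 1) per channel."""
    # Global average and max pooling over the spatial (H, W) dimensions.
    avg = [sum(map(sum, ch)) / (len(ch) * len(ch[0])) for ch in feature]
    mx = [max(map(max, ch)) for ch in feature]
    # The same MLP is applied to both descriptors; the outputs are summed.
    a = shared_mlp(avg, w0, w1)
    m = shared_mlp(mx, w0, w1)
    return [sigmoid(x + y) for x, y in zip(a, m)]

# Toy input: 2 channels of size 2 x 2, MLP with a single hidden unit.
feature = [[[1.0, 2.0], [3.0, 4.0]],
           [[0.0, 0.0], [0.0, 0.0]]]
w0 = [[0.5, 0.5]]      # hidden x C (hypothetical weights)
w1 = [[1.0], [-1.0]]   # C x hidden (hypothetical weights)
weights = channel_attention(feature, w0, w1)

# Re-weight each channel of the feature map by its attention weight.
scaled = [[[wt * v for v in row] for row in ch] for wt, ch in zip(weights, feature)]
```

With these toy weights, the informative first channel receives a weight close to 1 and the empty second channel a weight close to 0, which is the filtering behavior described above.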

Figure 5: The procedures of attention modules: (a) the channel attention; (b) the spatial attention.

The spatial attention module mainly concentrates on position and is shown in Figure 5(b). The output of the channel attention module is taken as the input for spatial attention. Average pooling and max pooling are applied along the channel axis, and the two resulting maps are concatenated to form a feature descriptor, which is then passed through a convolution with a 7 × 7 kernel, denoted f 7×7. The spatial attention feature map is calculated by equation (2).

$$M_s(F) = \sigma\big(f^{7\times 7}([\mathrm{AvgPl}(F);\ \mathrm{MaxPl}(F)])\big) = \sigma\big(f^{7\times 7}([F^{s}_{\mathrm{avg}};\ F^{s}_{\mathrm{max}}])\big). \tag{2}$$
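Equation (2) can be sketched in the same spirit. The code below pools along the channel axis, stacks the two maps, applies a single k × k filter with zero padding, and passes the result through a sigmoid. The function names, the zero padding, and the toy kernel (only the center weights are nonzero) are assumptions chosen to keep the example easy to check by hand.

```python
import math

def spatial_attention(feature, kernel):
    """feature: C x H x W nested lists; kernel: 2 x k x k weights (k odd).

    Returns an H x W map of weights in (0, 1).
    """
    c, h, w = len(feature), len(feature[0]), len(feature[0][0])
    # Pool along the channel axis: one average map and one max map.
    avg = [[sum(feature[ch][i][j] for ch in range(c)) / c for j in range(w)]
           for i in range(h)]
    mx = [[max(feature[ch][i][j] for ch in range(c)) for j in range(w)]
          for i in range(h)]
    stacked = [avg, mx]  # the concatenation [F_avg; F_max]
    k = len(kernel[0])
    pad = k // 2
    out = []
    for i in range(h):
        row = []
        for j in range(w):
            acc = 0.0
            for m in range(2):
                for di in range(k):
                    for dj in range(k):
                        y, x = i + di - pad, j + dj - pad
                        if 0 <= y < h and 0 <= x < w:  # zero padding outside
                            acc += kernel[m][di][dj] * stacked[m][y][x]
            row.append(1.0 / (1.0 + math.exp(-acc)))
        out.append(row)
    return out

def center_kernel(k=7):
    """Hypothetical 2 x k x k kernel whose only nonzero weights are the centers."""
    kern = [[[0.0] * k for _ in range(k)] for _ in range(2)]
    kern[0][k // 2][k // 2] = 1.0
    kern[1][k // 2][k // 2] = 1.0
    return kern

# Toy input: 2 channels of size 3 x 3 with a single active pixel.
feature = [[[0.0, 0.0, 0.0], [0.0, 2.0, 0.0], [0.0, 0.0, 0.0]],
           [[0.0, 0.0, 0.0], [0.0, 0.0, 0.0], [0.0, 0.0, 0.0]]]
attention = spatial_attention(feature, center_kernel(7))
```

At the active pixel the channel average is 1 and the channel maximum is 2, so the attention value is sigmoid(3); everywhere else it is sigmoid(0) = 0.5.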

2.2.2 Visualization validation for ResNet based on the attention mechanism

Model recognition results may vary with different algorithms. The heatmaps obtained with the ResNet50 model alone and with the attention mechanism added are shown in Figure 6(b) and (c), respectively. The network model that considers the attention mechanism concentrates better on the rock mass structural surface traces. For complicated geological objects, traces are easier to recognize with an attention mechanism, which shows the effectiveness of introducing an attention mechanism into the feature extraction network.

Figure 6: The heatmap visualization: (a) an input image, (b) the heatmap for ResNet50, and (c) the heatmap for ResNet50 integrated with the attention mechanism.

2.3 Transfer learning integrating attention mechanism with shape constraints

2.3.1 Sample data processing and augmentation

For a convolutional neural network model, robustness in different situations cannot be guaranteed. For example, if the training data and testing data belong to different domains, the model performance degrades. Rock masses exist in different physical and chemical environments, and their formation and composition in strata are different, which results in diverse morphologies. The geological structural surface has various shapes (e.g., linear and wave-like undulations), its width ranges from millimeters to kilometers, and the fillings are even more diverse, which makes it difficult to prepare complete data samples. Therefore, the lack of data samples is a crucial problem to be solved for structural surface trace extraction with deep learning techniques. This article performs the following preprocessing on sample data.

(1) The small number of sample data for the structural surface traces are acquired by different digital cameras, and various resolutions and multiple views may affect the dataset generalization. Embossed filtering preprocessing is used to emphasize the shape characteristics of the traces and weaken the effects of the nonstructural surface traces in color, texture, and light, as shown in Figure 7.

Figure 7: Examples of structural surface trace characteristics: (a) original images and (b) embossed images.

(2) Existing datasets that approximate geological structural surface traces are identified. The shape of concrete cracks is relatively stable, and the distribution is not particularly complicated. Concrete is an artificial rock that is similar to geological objects, and the reason for concrete crack formation is similar to that for geological structural traces, as both are caused by the mechanism of tension or shear. Concrete cracks have a certain similarity to structural surface traces in surface morphology. Concrete crack datasets for deep learning research are now available as open resources [24,25,26]. In the case of insufficient structural surface trace samples, the concrete crack dataset can increase the number of samples and enhance the model’s generalization of the shape. However, geological structural surface traces cannot be completely replaced by concrete cracks because of their inconsistent color and texture characteristics. In this article, a two-step transfer learning strategy is proposed that considers a small number of geological trace samples and concrete crack datasets for structural surface trace recognition and extraction. Figure 8(a) and (b) show some sample examples of the original data and the corresponding shape data, respectively.

Figure 8: Sample examples for concrete crack dataset: (a) original samples and (b) shape samples.
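The embossed filtering used in preprocessing step (1) can be sketched as a small 3 × 3 filter applied to a grayscale image. The article does not state which kernel or offset was used, so the kernel below (a common zero-sum emboss variant), the offset of 128, and the replicated borders are all assumptions for illustration.

```python
# Assumed zero-sum emboss kernel; the article does not specify one.
EMBOSS_KERNEL = [[-2, -1, 0],
                 [-1,  0, 1],
                 [ 0,  1, 2]]

def emboss(gray, kernel=EMBOSS_KERNEL, offset=128):
    """Apply a 3 x 3 emboss filter to a grayscale image (list of rows, values 0-255).

    Border pixels are replicated. The response is shifted by `offset` and
    clipped to 0-255, so flat regions map to mid-gray and edges stand out.
    """
    h, w = len(gray), len(gray[0])
    out = []
    for i in range(h):
        row = []
        for j in range(w):
            acc = 0
            for di in range(3):
                for dj in range(3):
                    y = min(max(i + di - 1, 0), h - 1)  # replicate border rows
                    x = min(max(j + dj - 1, 0), w - 1)  # replicate border columns
                    acc += kernel[di][dj] * gray[y][x]
            row.append(min(max(acc + offset, 0), 255))
        out.append(row)
    return out

# A flat patch and a patch with a dark vertical line standing in for a trace.
flat = [[100] * 4 for _ in range(4)]
crack = [[200, 200, 50, 200, 200] for _ in range(3)]
flat_out = emboss(flat)
crack_out = emboss(crack)
```

The flat patch maps to uniform mid-gray regardless of its brightness, while the two sides of the dark line are pushed to black and white. This is why the filter emphasizes shape and weakens color and lighting differences.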

2.3.2 Transfer learning based on ResNet-SA model

This article modifies the traditional ResNet model and proposes a ResNet-SA model that enhances the structural surface trace characteristics and ignores other features. The network structure is shown in Figure 9. The RGB sample dataset, which is the regular input of ResNet50, is first preprocessed by embossed filtering to generate the shape sample dataset, and the two datasets together form a dual input. In the proposed ResNet-SA model, the three input channels of the first convolutional layer of ResNet50 are replaced with six channels. To keep training synchronized, the RGB sample dataset is resampled randomly, and the resulting indices are used to extract the corresponding shape samples so that the two datasets remain consistent. Compared with the traditional ResNet50 structure, both channel and spatial attention mechanisms are added to the first convolutional layer and the fully connected layers in the proposed ResNet-SA. The weight parameters of all convolutional layers are fine-tuned with a small learning rate by transfer learning on the ResNet50 model. The output of the fully connected layer is reduced to two nodes. The ResNet-SA model is then trained with the stochastic gradient descent (SGD) optimizer and the cross-entropy loss function.

Figure 9: The structure of the ResNet-SA model.

Because samples of rock mass structural surface traces are insufficient, this article first derives the shape sample dataset from the RGB concrete crack dataset. The shape dataset and the RGB dataset are taken as dual inputs for the first-step transfer learning training of the ResNet-SA model. To account for the color and texture differences between geological traces and concrete cracks, a small geological trace dataset is then used as the input for the second-step transfer learning training, which optimizes the parameters of the ResNet-SA model with a smaller learning rate. The resulting ResNet-SA model generalizes better to structural surface traces. The two-step transfer learning strategy for the ResNet-SA model with the RGB dataset and the shape dataset of geological traces and concrete cracks is shown in Figure 10.

Figure 10: The two-step transfer learning strategy for the ResNet-SA model.

3 Experiments and results

3.1 Datasets

There are few studies on structural surface trace extraction based on deep learning technologies; to the best of our knowledge, there are no related datasets available to the public. This article prepares a small dataset of rock mass structural surface traces acquired from several large-scale engineering projects, including the NJ and Kahala Hydropower Projects in Pakistan, the Larsen Hydropower Project in Laos, the Guyuan Water Project in Ningxia, and the BH Hydropower Project in Xinjiang, China. Figure 11 shows several examples of geological structural surfaces at different project sites. The concrete crack dataset considered in this article is acquired from Kaggle (https://www.kaggle.com/arunrk7/surface-crack-detection) and is used for sample data augmentation and model generalization.

Figure 11: Examples of geological structural surfaces in different project sites: (a) weak interlayer in the exploration adit of the BH hydropower project in Xinjiang, (b) the fracture concentrated belt of the Kahala Hydropower Project in Pakistan, (c) horizontal fractures of the water conveyance tunnel of the Guyuan Water Source Project in Ningxia in China, and (d) ground fractures of the BH hydropower project in Xinjiang.

Several sample examples of the geological dataset are shown in Figure 12. The labeled samples for geological traces and concrete cracks are divided into a training set, a validation set, and a testing set with a ratio of 8:1:1, and the details are listed in Table 1. The training set is used for training (or fine-tuning) networks, while the validation set is used for hyperparameter optimization.

Figure 12: Sample examples of geological structural surface traces: (a) RGB samples and (b) shape samples.

Table 1

The division of sample dataset

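| Type | Concrete dataset: positive | Concrete dataset: negative | Rock mass trace dataset: positive | Rock mass trace dataset: negative |
|---|---|---|---|---|
| Tr | 1,600 | 1,600 | 800 | 800 |
| Val | 200 | 200 | 100 | 100 |
| Test | 200 | 200 | 100 | 100 |
| Total | 2,000 | 2,000 | 1,000 | 1,000 |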

Notes: Tr denotes the training set; Val denotes the validation set; Test denotes the testing set.
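The 8:1:1 division in Table 1 can be reproduced with a short helper. This is a minimal sketch: the function name and the seeding scheme are assumptions, and changing the seed gives a different random split, as needed when results are averaged over several random data splits.

```python
import random

def split_811(samples, seed=0):
    """Shuffle the samples and split them into train/val/test with an 8:1:1 ratio."""
    items = list(samples)
    random.Random(seed).shuffle(items)  # seeded for reproducibility
    n_train = len(items) * 8 // 10
    n_val = len(items) // 10
    train = items[:n_train]
    val = items[n_train:n_train + n_val]
    test = items[n_train + n_val:]
    return train, val, test

# Example: the 1,000 positive rock mass trace samples from Table 1,
# represented here by their indices.
train, val, test = split_811(range(1000), seed=0)
```

Applied per class, this yields the 800/100/100 and 1,600/200/200 counts listed in the table.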

3.2 Experiment results and evaluation

The experimental process of the proposed ResNet-SA model is as follows.

(1) Unlike the ResNet50 pretrained model, the fully connected layer of ResNet-SA has two output features. The three input channels for the RGB data are replaced by six channels composed of the RGB and shape data. Both the spatial and channel attention mechanisms are merged into the first and the final convolutional layers.

(2) The parameters from the second convolutional layer to the fifth convolutional layer in the ResNet50 pretrained model are loaded into the ResNet-SA model for initialization.

(3) The training and validation processes of the ResNet-SA model are conducted in two stages. The concrete sample dataset is applied in the first stage, and the rock mass trace sample dataset is used in the second stage. The dual input preprocessing and the two-stage transfer learning strategy distinguish ResNet-SA from traditional transfer learning methods.

The RGB sample dataset and the shape sample dataset are taken as dual inputs in the proposed method. The dual input data must be loaded with the same sampler to ensure the correct data correspondence between the RGB and shape samples.
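The requirement that both inputs share one sampler can be sketched without any framework: draw a single shuffled index order and use it for both datasets. The function names and the toy string "samples" are assumptions for illustration; in a real pipeline each element would be an image tensor.

```python
import random

def paired_batches(rgb, shape, batch_size, seed=0):
    """Yield (rgb_batch, shape_batch) pairs drawn with one shared index order."""
    if len(rgb) != len(shape):
        raise ValueError("RGB and shape datasets must have the same length")
    order = list(range(len(rgb)))
    random.Random(seed).shuffle(order)  # a single shuffle drives both datasets
    for start in range(0, len(order), batch_size):
        idx = order[start:start + batch_size]
        yield [rgb[i] for i in idx], [shape[i] for i in idx]

def stack_channels(rgb_img, shape_img):
    """Concatenate two channel-first images (lists of channels) along the channel axis."""
    return rgb_img + shape_img

# Toy datasets: sample i of the shape set is derived from sample i of the RGB set.
rgb_set = [f"rgb_{i}" for i in range(10)]
shape_set = [f"shape_{i}" for i in range(10)]
batches = list(paired_batches(rgb_set, shape_set, batch_size=4))

# Two 3-channel 1 x 1 images become one 6-channel input.
six_channel = stack_channels([[[0]], [[0]], [[0]]], [[[1]], [[1]], [[1]]])
```

Shuffling the two datasets independently would pair an RGB image with the shape image of a different sample, which is exactly the mismatch the shared sampler prevents.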

As the earlier layers of the model have been fully trained, a smaller learning rate is generally preferred for fine-tuning in transfer learning training, which mainly aims at all convolutional layers (or all layers excluding frozen layers). In this research, the dual input channel has a large effect on the first convolutional layer, and the learning rate is set by stages and layers, i.e., the learning rate in the first and final convolutional layers is set to 0.001, and that for other layers is set to 0.0001, which helps to enhance the generalization ability of the model. A smaller learning rate is adopted for the second training stage with the rock mass trace sample dataset.
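The layer-wise and stage-wise learning rates can be written down as a small configuration helper. The rates 0.001 and 0.0001 come from the text above; everything else is an assumption for illustration: the layer names follow the torchvision ResNet naming (`conv1`, `layer1` to `layer4`, `fc`), the "first and final convolutional layers" are mapped to `conv1` and `layer4`, and the factor of 0.1 for the second stage is hypothetical because the article only says that a smaller rate is used.

```python
# Learning rates stated in the article.
FAST_LR = 1e-3   # first and final convolutional layers
SLOW_LR = 1e-4   # all other layers

def build_lr_groups(layer_names, fast_layers=("conv1", "layer4"),
                    stage=1, stage2_scale=0.1):
    """Return one {"name", "lr"} entry per layer.

    `fast_layers` and `stage2_scale` are assumptions (see the lead-in);
    stage 1 uses the concrete crack data, stage 2 the rock mass trace data.
    """
    scale = 1.0 if stage == 1 else stage2_scale
    groups = []
    for name in layer_names:
        top = name.split(".")[0]  # e.g. "layer4.0.conv1" -> "layer4"
        base = FAST_LR if top in fast_layers else SLOW_LR
        groups.append({"name": name, "lr": base * scale})
    return groups

layers = ["conv1", "layer1", "layer2", "layer3", "layer4", "fc"]
stage1 = build_lr_groups(layers, stage=1)
stage2 = build_lr_groups(layers, stage=2)
```

A list of this shape maps naturally onto per-parameter-group options in an optimizer, where each group carries its own learning rate.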

The appearance of geological trace samples differs across various sensors and photography conditions, which may affect the generalization performance of the sample set. However, both the color and texture of the trace appear to be obviously different from those of the surrounding rock mass, and the edge detection technique has been widely applied in structural surface trace extraction. In this study, three preprocessing methods, including traditional edge enhancement, binary transform, and embossed filtering, are compared to obtain a more appropriate shape sample dataset. Some shape sample examples with different preprocessing are shown in Figure 13.

Figure 13: Examples of shape samples with different preprocessing. (a) Original samples, (b) samples after edge enhancement, (c) samples after binary transform, and (d) samples after embossed filtering.

Three shape sample datasets obtained with the aforementioned preprocessing methods were used for ResNet-SA model training, with the partition described in Section 3.1. The validation set and the test set were used for the accuracy assessment of the different preprocessing methods, and all results were averaged over five random data splits. The final results of the accuracy comparison for different shape samples are listed in Table 2.

Table 2

Accuracy comparison of different shape sample datasets

| Shape dataset | Accuracy for validation set (%) | Accuracy for test set (%) |
|---|---|---|
| Binary samples | 83.4 ± 0.4 | 82.5 ± 0.3 |
| Edge samples1 | 89.8 ± 0.2 | 87.9 ± 0.3 |
| Edge samples2 | 90.2 ± 0.2 | 88.8 ± 0.2 |

Notes: Edge samples1 are obtained through traditional edge enhancement; Edge samples2 are obtained through edge enhancement with embossed filtering.

Table 2 shows that the accuracy on both the validation set and test set for the ResNet-SA model with the shape sample dataset preprocessed by edge enhancement is obviously higher than that by binary transform. Moreover, compared with traditional edge enhancement, the shape sample dataset from embossed filtering helps to reach higher model accuracy.

To further evaluate the performance of the proposed method, two groups of experiments were conducted in this article: one concentrated on a quantitative evaluation of the attention mechanism in ResNet-SA, and the other on a comparison between the traditional ResNet50 model and the ResNet-SA model. The accuracy comparison results for each model, averaged over five random data splits, are presented in Table 3.

Table 3

Accuracy comparison of different models

| Method | Accuracy for validation set (%) | Accuracy for test set (%) |
|---|---|---|
| ResNet50 | 86.5 ± 0.4 | 83.9 ± 0.2 |
| ResNet-SA1 | 88.6 ± 0.3 | 86.4 ± 0.3 |
| ResNet-SA | 90.2 ± 0.2 | 87.2 ± 0.2 |

Notes: ResNet-SA1 denotes the model with dual inputs but no attention mechanism; ResNet-SA denotes the proposed model.

For the ResNet50 model, only the RGB sample data are taken as inputs, while both the RGB dataset and the shape sample dataset are taken as dual inputs for the ResNet-SA model. Compared with the traditional ResNet50 model, the accuracy of ResNet-SA on the validation set and test set is higher by 3.7 and 3.3 percentage points, respectively. Compared with the model that has dual inputs but no attention mechanism, the accuracy of ResNet-SA is higher by 1.6 and 0.8 percentage points, respectively. The ResNet-SA model, which considers both the attention mechanism and shape constraints, therefore performs best in both accuracy and generalization.
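The accuracy gains quoted above follow directly from the mean values in Table 3. As a quick check, the snippet below recomputes them; the dictionary and function names are illustrative only, and the ± terms of the table are ignored.

```python
# Mean accuracies (validation, test) in percent, copied from Table 3.
TABLE3 = {
    "ResNet50":   (86.5, 83.9),
    "ResNet-SA1": (88.6, 86.4),
    "ResNet-SA":  (90.2, 87.2),
}

def gain(model, baseline):
    """Accuracy difference (validation, test) in percentage points."""
    return tuple(round(m - b, 1) for m, b in zip(TABLE3[model], TABLE3[baseline]))

vs_resnet50 = gain("ResNet-SA", "ResNet50")        # effect of dual inputs plus attention
vs_no_attention = gain("ResNet-SA", "ResNet-SA1")  # effect of attention alone
```

The second comparison isolates the attention mechanism, since ResNet-SA1 and ResNet-SA differ only in whether attention is used.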

The accuracy on the test set is slightly lower than that on the validation set in the aforementioned experiments, which indicates slight overfitting. In traditional transfer learning methods, the parameters in the bottom convolutional layers are generally fixed, and only those in the top convolutional layers are adjusted to avoid overfitting. In the proposed method, however, the dual inputs require a larger learning rate for the first and final convolutional layers, and the sample size used in the experiments is not large enough; both factors may cause overfitting. Future work will try a smaller learning rate and more diverse samples to reduce overfitting.

To show the extraction effect more intuitively, two examples of rock mass structural surface trace extraction with the proposed method and the ResNet50 model are given in Figure 14. The figure shows that the proposed method resists noise more effectively.

Figure 14: Examples of rock mass structural surface trace extraction with the proposed method and ResNet50 model. (a) The original picture of granite, (b) the result of granite fissure extraction with the ResNet50 model, (c) the result of granite fissure extraction with the proposed method, (d) the picture of columnar basalt, (e) the result of columnar basalt extraction with the ResNet50 model, and (f) the result of columnar basalt extraction with the proposed method.

4 Discussion and conclusions

Currently, applications of deep learning in geology are mainly focused on lithology identification of rock thin sections and field rock samples. In engineering construction fields such as water conservancy and transportation, however, geologists pay more attention to the mechanical properties of rock masses, and structural surfaces are one of the key factors determining these properties. Because of the complicated and irregular shape of rock mass structural surface traces, automatic extraction is difficult to achieve. Inspired by the detection of concrete cracks based on deep learning, this article applied the deep transfer learning technique to structural surface trace recognition in tunnels, slopes, and other engineering projects, and integrated the attention mechanism and shape constraints into transfer learning. In addition, both the concrete crack dataset and the geological dataset were used for deep transfer learning model training. The accuracy of rock mass structural surface trace recognition on the test set with the ResNet-SA model reached 87.2%, and the experimental results thus demonstrated the accuracy and reasonableness of the proposed method.

Compared with the traditional deep learning method, the proposed model considers both the color and the shape characteristics of structural surface traces and generalizes better when recognizing traces with complicated shapes. However, the number and variety of structural surface trace samples in the current experiment are still insufficient. To further improve the generalization performance of the model and its recognition accuracy across trace types, future work should expand the structural surface dataset to cover more characteristics and variability. Moreover, semantic features should be considered as inputs for multimodal transfer learning.
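To make the role of shape concrete, the following sketch computes a simple elongation score for a binary region. It is only an assumed, illustrative descriptor (the aspect ratio of the bounding box around the foreground pixels), not the shape constraint used in the proposed model. The name `elongation` and the toy masks are hypothetical.

```python
from typing import List


def elongation(mask: List[List[int]]) -> float:
    """Ratio of the longer to the shorter side of the foreground bounding box.

    Thin, line-like regions (trace candidates) give large values;
    compact blobs give values close to 1.
    """
    rows = [r for r, line in enumerate(mask) if any(line)]
    cols = [c for line in mask for c, v in enumerate(line) if v]
    if not rows:
        raise ValueError("mask contains no foreground pixels")
    height = max(rows) - min(rows) + 1
    width = max(cols) - min(cols) + 1
    return max(height, width) / min(height, width)


# Hypothetical 4x6 masks: a thin horizontal streak and a compact blob.
streak = [
    [0, 0, 0, 0, 0, 0],
    [1, 1, 1, 1, 1, 1],
    [0, 0, 0, 0, 0, 0],
    [0, 0, 0, 0, 0, 0],
]
blob = [
    [0, 0, 0, 0, 0, 0],
    [0, 1, 1, 0, 0, 0],
    [0, 1, 1, 0, 0, 0],
    [0, 0, 0, 0, 0, 0],
]
print(elongation(streak), elongation(blob))
```

Two regions of similar color can thus be separated by shape alone. An axis-aligned bounding box underestimates the elongation of diagonal traces; it is used here only for brevity.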

In this article, deep transfer learning is applied to the identification and extraction of rock mass structural surface traces, and the experiments indicate that the new method is feasible and effective. The method provides a new approach to rock mass structural surface trace extraction in rock mass engineering fields relating to water conservancy, railways, highways, and mining.

Acknowledgements

The authors would like to thank the reviewers and editors for valuable comments and suggestions.

  1. Funding information: This research was funded by the National Natural Science Foundation of China (41830110, 41901401, and 42101070), the Natural Science Foundation of Jiangsu Province (BK20190743), and the China Postdoctoral Science Foundation (2021M691653).

  2. Author contributions: Conceptualization: X.Y. and R.Z.; data curation: Z.G. and Y.S.; formal analysis: X.Y.; methodology: X.Y. and H.L.; supervision: H.L. and R.Z.; validation: X.Y. and L.L.; writing-original draft: X.Y.; writing-review and editing: H.L., R.Z., X.H., and L.L. All authors have read and agreed to the published version of the manuscript.

  3. Conflict of interest: The authors state no conflict of interest.

References

[1] Mcculloch WS, Pitts W. A logical calculus of the ideas immanent in nervous activity. Bull Math Biophys. 1943;5(4):115–33. doi:10.1007/BF02478259

[2] Rumelhart DE, Hinton GE, Williams RJ. Learning representations by back propagating errors. Nature. 1986;323(6088):533–6. doi:10.1038/323533a0

[3] Hinton GE, Salakhutdinov RR. Reducing the dimensionality of data with neural networks. Science. 2006;313(5786):504–11. doi:10.1126/science.1127647

[4] Su C, Wang W. Concrete cracks detection using convolutional neural network based on transfer learning. Math Probl Eng. 2020;2020(13):1–10. doi:10.1155/2020/7240129

[5] Liu X, Wang H, Jing H, Shao A, Wang L. Research on intelligent identification of rock types based on faster R-CNN method. IEEE Access. 2020;8:1. doi:10.1109/ACCESS.2020.2968515

[6] Singh N, Singh TN, Tiwary A, Sarkar KM. Textural identification of basaltic rock mass using image processing and neural network. Comput Geosci. 2010;14(2):301–10. doi:10.1007/s10596-009-9154-x

[7] Li N, Hao H, Gu Q, Wang D, Hu X. A transfer learning method for automatic identification of sandstone microscopic images. Comput Geosci. 2017;103:111–21. doi:10.1016/j.cageo.2017.03.007

[8] Zhang Y, Li MC, Han S. Automatic identification and classification in lithology based on deep learning in rock images. Acta Petrologica Sin. 2018;34(2):333–42.

[9] Feng Y, Gong X, Xu Y. Study on rock identification method based on rock fresh surface image and twin convolutional neural network. Geography and Geoinformation Science. 2019;35(5):89–94 (Chinese).

[10] Ullo SL, Mohan A, Sebastianelli A, Ahamed SE, Sinha GR. A new mask R-CNN based method for improved landslide detection. IEEE J Sel Top Appl Earth Observations Remote Sens. 2020;14:3799–810. doi:10.1109/JSTARS.2021.3064981

[11] Xu Z, Chen Y, Yang F, Chu T, Zhou H. A postearthquake multiple scene recognition model based on classical SSD method and transfer learning. Int J Geo-Information. 2020;9(4):16. doi:10.3390/ijgi9040238

[12] Lu H, Ma L, Fu X, Liu C, Wang Z, Tang M, et al. Landslides information extraction using object-oriented image analysis paradigm based on deep learning and transfer learning. Remote Sens. 2020;12(5):752. doi:10.3390/rs12050752

[13] Polat O, Polat A, Ekici T. Automatic classification of volcanic rocks from thin section images using transfer learning networks. Neural Comput Appl. 2021;33:11531–40. doi:10.1007/s00521-021-05849-3

[14] Zhang Y, Wang G, Li M, Han S. Automated classification analysis of geological structures based on images data and deep learning model. Appl Sci. 2018;8(12):2493. doi:10.3390/app8122493

[15] Chen J, Huang H, Cohn A, Zhang D, Zhou M. Machine learning-based classification of rock discontinuity trace: SMOTE oversampling integrated with gbt ensemble learning. Int J Min Sci Technol. 2021;6. doi:10.1016/j.ijmst.2021.08.004

[16] Xu Z, Ma W, Lin P, Shi H, Liu T, Pan D. Intelligent lithology identification based on transfer learning of rock images. J Basic Sci Eng. 2021;29(5):1075–92 (Chinese).

[17] Yi X, Zhang R, Li H, Chen Y. An MFF-SLIC hybrid superpixel segmentation method with multi-source RS data for rock surface extraction. Appl Sci. 2019;9(5):906. doi:10.3390/app9050906

[18] Ben-David S, Blitzer J, Crammer K, Kulesza A, Pereira O, Vaughan J. A theory of learning from different domains. Mach Learn. 2010;79(1–2):151–75. doi:10.1007/s10994-009-5152-4

[19] Yosinski J, Clune J, Bengio Y, Lipson H. How transferable are features in deep neural networks. Adv Neural Inf Process Syst. 2014(1):3320–8.

[20] He K, Zhang X, Ren S, Sun J. Deep residual learning for image recognition. IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2016, 27 June-30 June 2016. Seattle, WA: IEEE; 2016. p. 770–8. doi:10.1109/CVPR.2016.90

[21] He K, Zhang X, Ren S, Sun J. Identity mappings in deep residual networks. 14th European Conference on Computer Vision (ECCV), 2016, 08 October-16 October 2016. Cham: Springer; 2016. p. 630–45. doi:10.1007/978-3-319-46493-0_38

[22] Woo S, Park J, Lee JY, Kweon IS. CBAM: convolutional block attention module. 15th European Conference on Computer Vision (ECCV), 2018, 08 September-14 September 2018. Cham: Springer; 2018. p. 3–19. doi:10.1007/978-3-030-01234-2_1

[23] Bahdanau D, Cho K, Bengio Y. Neural machine translation by jointly learning to align and translate. Computer Sci. 2014.

[24] Song EP, Eem SH, Jeon H. Concrete crack detection and quantification using deep learning and structured light. Constr Build Mater. 2020;252(5):119096. doi:10.1016/j.conbuildmat.2020.119096

[25] Yokoyama S, Matsumoto T. Development of an automatic detector of cracks in concrete using machine learning. Procedia Eng. 2017;171(Complete):1250–5. doi:10.1016/j.proeng.2017.01.418

[26] Xie B, Ai D, Yang Y. Crack detection and evolution law for rock mass under SHPB impact tests. Shock Vib. 2019;2:1–12.

Received: 2021-09-03
Revised: 2021-11-25
Accepted: 2022-01-16
Published Online: 2022-02-09

© 2022 Xuefeng Yi et al., published by De Gruyter

This work is licensed under the Creative Commons Attribution 4.0 International License.