The size of the texture extraction window affects tree species classification from imagery, and determining the optimal window requires a specific classifier to supervise the search by reporting classification accuracy. It is therefore necessary to analyse which classifier is more suitable for this task. In this study, we extracted eight types of textures, namely mean, variance, homogeneity, contrast, dissimilarity, entropy, second moment and correlation, increased the window size step by step and used maximum likelihood classification (MLC) and random forest (RF) separately to supervise and determine the optimal extraction window of each texture. Finally, the time consumption and classification accuracy of tree species classification were compared. The time consumption of MLC was significantly less than that of RF, although neither was very long; for most textures, the optimal texture extraction window determined by MLC supervision was larger than that determined by RF supervision; and for most feature sets, the overall accuracy obtained by MLC was lower than that of RF. Because texture extraction took much longer than image classification, the overall trade-off indicates that using RF supervision to determine the optimal window for texture extraction is more conducive to tree species recognition.
1 Introduction
Textures of remote sensing images are important features in tree species recognition. Using textures to classify tree species can significantly improve classification accuracy compared with other features such as spectral bands and spectral indices. When textures are combined with these features, the accuracy of tree species classification improves further compared with using textures alone [1,2,3,4,5,6]. Textures therefore play a very important role in tree species classification; however, extracting them with an inappropriate window size will impair tree species identification. An undersized window fails to fully capture the textural arrangement of the tree species, whereas an oversized window blurs the tree species boundary. Therefore, determining the optimal window size for texture extraction is very important.
To find the optimal window for texture extraction, scholars usually start with a small window, such as 3 × 3, and then gradually increase the window size in odd-numbered steps [3,8,9]. As the extraction window grows, the recognition accuracy of the tree species also increases. After a certain critical window is reached, the recognition accuracy no longer increases and instead decreases significantly. This critical window is taken as the optimal window for texture extraction [3,8,9]. The best texture feature set for tree species classification can be constructed by combining different types of textures extracted with their optimal windows [3,9,10,11]. When selecting the optimal window, some scholars use the same window for all types of textures and thus obtain a single shared optimal window [3,7,8,10,11], whereas other scholars search for the optimal window of each type of texture separately [9,11]. The optimal texture feature sets obtained in these two ways generally differ in tree species classification accuracy, and the set constructed by searching for the optimal window of each type of texture achieves slightly higher accuracy. The supervisors used to determine the optimal window for texture extraction are mainly maximum likelihood classification (MLC) [3,10], support vector machine (SVM) [7,9] and random forest (RF) [8,11]. MLC and RF are less time consuming than SVM in model training and image classification.
In our current research on identifying tree species by remote sensing, we use unmanned aerial vehicle (UAV) RedEdge-MX data because the sensor yields a digital surface model (DSM) image containing the elevation information of the trees and a five-band digital orthophoto map (DOM) image containing colour and texture information. We are also interested in the effect of combining the DOM textures extracted under the optimal window with the DOM spectral bands and the DSM. When determining the optimal window for texture extraction, scholars typically choose a single classifier as the supervisor, and the performance of multiple classifiers has not been compared. It is therefore unclear which supervising classifier yields a relatively small optimal window (a small window means less time consumption), or which supervising classifier yields a best texture feature set or mixed feature set with high tree species recognition accuracy.
This study uses RedEdge-MX images as the data source to classify typical greening tree species and grass. MLC and RF, which are less time consuming than SVM, were used to supervise and determine the optimal extraction windows of eight types of textures commonly used in tree species recognition, namely mean (MEA), variance (VAR), homogeneity (HOM), contrast (CON), dissimilarity (DIS), entropy (ENT), second moment (SM) and correlation (COR) [1,3,12]. The time consumption of texture extraction, model training and image classification and the classification accuracy of the related data sets were comprehensively analysed to determine which classifier should be used for optimal window selection during RedEdge-MX texture extraction. The study provides basic information for selecting a suitable classifier to determine the optimal window for texture extraction in tree species classification.
2 Materials and methods
2.1 Data and pre-processing
The data used in this study were acquired by a UAV-mounted RedEdge-MX sensor, which has five bands (blue, green, red, red edge and near infra-red) and can obtain high spatial resolution images. The detailed parameters of the acquired data are shown in Table 1. The imaged site is Luoyang Normal University, China. The images were acquired between 11:00 a.m. and 12:00 noon on 3 January 2020.
|Band number||Band name||Spatial resolution (cm)||Wavelength range (μm)||Central wavelength|
After data acquisition, basic pre-processing, including image mosaicking, band synthesis and image clipping, was performed. The result was an image with a spatial resolution of 16.285 cm covering the whole campus of Luoyang Normal University, an area of about 2.03 km². The false colour display of the image (bands 5, 3 and 2 shown as red, green and blue) is shown in Figure 1.
2.2 Tree species investigation and sample collection
The pre-processed image was cut into two parts vertically from the middle and printed on two sheets of 104 cm × 60 cm and 106 cm × 60 cm papers. The paper images were created for the tree species investigation. We identified the trees corresponding to the paper images in the actual environment, circled the tree crown and recorded their names in the paper image. The results show that there are eight typical tree species that do not lose their leaves in winter. The photos are shown in Figure 2a–h.
After the investigation, we marked the tree species recorded on the paper images on the electronic image as regions of interest. Some of the trees were used as training samples (TSs), and the others were used as accuracy validation samples. Detailed information on the surveyed tree species and the numbers of training and validation pixels is shown in Table 2.
|Latin names||Leaf type and phenology||Pixel number of TS||Precision validation pixels|
|Ligustrum lucidum||Evergreen broad-leaf tree||993||4,491|
|Cedrus deodara||Evergreen conifer||996||4,591|
|Photinia serrulata||Evergreen broad-leaf tree||971||4,535|
|Eriobotrya japonica||Evergreen broad-leaf tree||991||4,589|
|Magnolia grandiflora||Evergreen broad-leaf tree||922||4,632|
|Platycladus orientalis||Evergreen conifer||996||4,483|
|Cinnamomum camphora||Evergreen broad-leaf tree||994||4,522|
|Trachycarpus fortunei||Evergreen broad-leaf tree||978||3,896|
2.3 Extraction of vegetation from image
Vegetation was extracted in four steps. First, a DSM threshold of [133, 187.89] was used to initially mask the buildings and extract the building contour lines. Second, the lines were imported to MapGIS, and mismatched places were modified so that the vector lines better matched the building boundaries; the updated building contour lines were then used to build a mask, and the buildings were masked. Third, the generated mask file was used to mask the normalised difference vegetation index (NDVI) image, edge-based segmentation was performed on the masked NDVI image (the best scale was 60.2) and the full lambda schedule algorithm was used to merge the segments (the best scale was 99). Finally, an object-oriented rule-based method with the threshold [0.11664, 0.63974] was used to extract the vegetation in the image, and the vegetation parts were retained.
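The final thresholding step can be illustrated with a minimal sketch. It assumes that the threshold [0.11664, 0.63974] refers to NDVI, uses the standard definition NDVI = (NIR − Red)/(NIR + Red) and works pixel by pixel, whereas the study applied the rule to segmented objects. The function names and the small reflectance grids are hypothetical.

```python
def ndvi(nir, red):
    """Return NDVI = (NIR - Red) / (NIR + Red); 0.0 when both bands are zero."""
    total = nir + red
    return (nir - red) / total if total != 0 else 0.0


def vegetation_mask(nir_band, red_band, low=0.11664, high=0.63974):
    """Flag pixels whose NDVI lies inside [low, high] (thresholds from the text)."""
    return [
        [low <= ndvi(n, r) <= high for n, r in zip(nir_row, red_row)]
        for nir_row, red_row in zip(nir_band, red_band)
    ]


# Tiny hypothetical reflectance grids (2 x 2 pixels), for illustration only.
nir = [[0.50, 0.30], [0.20, 0.90]]
red = [[0.20, 0.28], [0.20, 0.10]]
mask = vegetation_mask(nir, red)
```

In an object-based workflow, the same interval test would be applied to the mean NDVI of each merged segment rather than to single pixels.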
2.4 Determination of the optimal window for texture extraction
2.4.1 Texture images generation
The co-occurrence measures tool of the ENVI 5.4 software was used to extract the eight textures that are often used in tree species classification: MEA, VAR, HOM, CON, DIS, ENT, SM and COR. The window size for texture extraction started at 3 × 3 and increased in odd-numbered steps (5 × 5, 7 × 7 and so on). For each window size, the eight types of texture images were generated from the 5-band RedEdge-MX data. Textures of the same type extracted under different window sizes were then used one by one for the subsequent tree species classification.
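To make the eight measures concrete, the following sketch computes them for a single window from a grey-level co-occurrence matrix using the common Haralick-style definitions. It is an illustration under stated assumptions, not a reproduction of the ENVI implementation: the pixel shift (1, 0), the symmetric counting, the number of grey levels and the function names are all assumptions.

```python
import math


def glcm(window, levels, dx=1, dy=0):
    """Normalised, symmetric co-occurrence matrix of one window.

    window: 2-D list of integer grey levels in range(levels).
    (dx, dy): offset between the two pixels of each pair (assumed, not from the source).
    """
    counts = [[0.0] * levels for _ in range(levels)]
    rows, cols = len(window), len(window[0])
    for r in range(rows):
        for c in range(cols):
            r2, c2 = r + dy, c + dx
            if 0 <= r2 < rows and 0 <= c2 < cols:
                i, j = window[r][c], window[r2][c2]
                counts[i][j] += 1.0
                counts[j][i] += 1.0  # count each pair in both directions
    total = sum(sum(row) for row in counts)
    return [[v / total for v in row] for row in counts]


def texture_measures(p):
    """Return the eight texture measures for a normalised symmetric matrix p."""
    n = len(p)
    cells = [(i, j, p[i][j]) for i in range(n) for j in range(n)]
    mean = sum(i * v for i, _, v in cells)
    var = sum((i - mean) ** 2 * v for i, _, v in cells)
    # Symmetric matrix: row and column means and variances coincide.
    cov = sum((i - mean) * (j - mean) * v for i, j, v in cells)
    return {
        "MEA": mean,
        "VAR": var,
        "HOM": sum(v / (1 + (i - j) ** 2) for i, j, v in cells),
        "CON": sum((i - j) ** 2 * v for i, j, v in cells),
        "DIS": sum(abs(i - j) * v for i, j, v in cells),
        "ENT": -sum(v * math.log(v) for _, _, v in cells if v > 0),
        "SM": sum(v * v for _, _, v in cells),
        "COR": cov / var if var > 0 else 0.0,
    }


# Hypothetical 3 x 3 window quantised to three grey levels.
window = [[0, 0, 1], [0, 0, 1], [2, 2, 2]]
measures = texture_measures(glcm(window, levels=3))
```

A texture image is obtained by sliding such a window over every pixel of a band and storing the chosen measure at the window centre, which is why the cost grows quickly with window size.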
2.4.2 Selection of the TS size
The size of the TS may affect the time and accuracy of image classification by different classifiers, as well as the size of the optimal window for texture extraction. Therefore, we drew nine random subsets from the TS, with the proportion increasing in steps of 10%, to form subsets of 10–90% TS. These TS subsets were then used to classify the DOM image, and the time consumption and accuracy of the classifiers were analysed. According to the experimental results, the two TS sizes that produced the largest difference in time consumption or classification accuracy were selected to determine the optimal window for texture extraction. The influence of the sample size on the determination of the optimal window and the differences between the classifiers were then further analysed.
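The subset construction can be sketched as follows. The source states only that the subsets were drawn at random; drawing per class (so that every species stays represented), the rounding rule, the seed and the function name are assumptions made for this illustration.

```python
import random


def draw_ts_subsets(ts_by_class, steps=9, seed=42):
    """Return {percent: {class_name: sampled pixels}} for 10%, 20%, ..., steps * 10%.

    Each subset is an independent random draw from the full TS of each class.
    """
    rng = random.Random(seed)
    subsets = {}
    for k in range(1, steps + 1):
        subsets[10 * k] = {
            name: rng.sample(pixels, max(1, round(len(pixels) * k / 10)))
            for name, pixels in ts_by_class.items()
        }
    return subsets


# Hypothetical pixel identifiers; the TS sizes (993 and 996) come from Table 2.
ts = {"Ligustrum lucidum": list(range(993)), "Cedrus deodara": list(range(996))}
subsets = draw_ts_subsets(ts)
```

Fixing the seed keeps the subsets reproducible, so that timing and accuracy differences between classifiers are not confounded with a different random draw.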
2.4.3 Supervision of the optimal window selection
For each texture extracted under a specific window, MLC or RF was used to classify the tree species based on that texture, and the overall accuracy (OA) was recorded. As the texture extraction window grew, the OA generally increased as well. Once a further increase in the window no longer improved the OA and instead decreased it significantly, we stopped extracting the texture with larger windows and took the window corresponding to the maximum OA as the optimal window for that texture. We determined the optimal extraction windows for the different types of textures, classifiers and TS sizes using the same approach.
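The search procedure described above can be sketched as a simple loop. The callable `evaluate_oa`, the upper limit on the window size and the numeric cut-off for a "significant" decrease are hypothetical; the source does not state how a significant decrease was judged.

```python
def find_optimal_window(evaluate_oa, start=3, max_window=99, drop_tolerance=2.0):
    """Grow the window by 2 until OA falls clearly below the best value seen.

    evaluate_oa: hypothetical callable mapping a window size to OA (%), standing in
        for "extract the texture, classify with MLC or RF, compute OA".
    drop_tolerance: assumed cut-off (percentage points) for a significant decrease.
    Returns (best_window, best_oa, history).
    """
    best_window, best_oa = None, float("-inf")
    history = []
    for window in range(start, max_window + 1, 2):
        oa = evaluate_oa(window)
        history.append((window, oa))
        if oa > best_oa:
            best_window, best_oa = window, oa
        elif best_oa - oa >= drop_tolerance:
            break  # OA has dropped clearly; stop testing larger windows
    return best_window, best_oa, history


# Hypothetical OA curve for one texture, for illustration only.
toy_curve = {3: 50.0, 5: 55.0, 7: 60.0, 9: 61.0, 11: 60.5, 13: 58.0, 15: 70.0}
best_window, best_oa, history = find_optimal_window(toy_curve.__getitem__, max_window=15)
```

The toy curve also shows the limitation of early stopping: a later rise at 15 × 15 is never evaluated, which is the kind of risk that testing larger windows would address.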
2.5 Analysis of the suitable classifier for optimal window determination
The two TS sizes with the largest difference in time consumption or classification accuracy are denoted m × 10% TS (large) and n × 10% TS (small), where m and n are coefficients of the TS size, m ranges from 2 to 10, n ranges from 1 to 9 and m is greater than n. The best texture feature sets determined by MLC supervision under the large and small TSs were recorded as TexM-m × 10% TS and TexM-n × 10% TS, respectively, and those determined by RF supervision were recorded as TexR-m × 10% TS and TexR-n × 10% TS, respectively. The large and small TSs were then used to classify tree species based on the best texture feature sets and on the mixed feature sets formed by combining them with DOM and DSM, and the difference in accuracy obtained by the two classifiers was analysed. By combining the time consumption of the two classifiers in model training, image classification and supervision of texture extraction, the supervising classifier that is more suitable for RedEdge-MX texture extraction in tree species classification was determined.
3 Results
3.1 Classification analysis of different sample sizes
Under different TS sizes, the time consumption and OA obtained by the two classifiers for DOM image processing are shown in Table 3.
|Proportion of TS (%)||MLC||RF|
|Classification time (minute:second)||OA%||Training time (minute:second)||Classification time (minute:second)||OA%|
As shown in Table 3, for MLC, the time consumption of DOM image classification was relatively stable across TS sizes, and the OA obtained with 10% TS was slightly, though not significantly, lower than the values obtained with the other sizes. For RF, by contrast, the time consumption of model training and image classification and the OA of the classification results all showed an increasing trend as the TS size increased. The large and small TSs were therefore set to 100% TS and 10% TS, respectively (m = 10 and n = 1).
3.2 Best texture extraction window analysis
As shown in Figure 3, based on MLC with the full TS, the eight types of textures have different optimal extraction windows in tree species classification. The optimal extraction window was small for some textures (e.g. 9 × 9 for MEA) and large for others (e.g. 45 × 45 for HOM), so the optimal window size clearly differed among textures. The maximum OA obtained by each type of texture also differed. The highest OA (68.50%) was obtained using MEA and the lowest (37.36%) using COR, and the OA of the other textures fell between these two values.
Although the OA obtained using 10% TS was lower than that obtained using the full TS in most classifications under different window sizes, the way the OA changed with the texture extraction window was similar for 10% TS and the full TS.
As shown in Figure 4, based on RF with the full TS, the optimal extraction windows of the eight types of texture also differed (except for ENT and SM, which shared the same window), as did the OA obtained by these textures under their optimal windows. When 10% TS was used to determine the optimal texture extraction window, the trend of the OA obtained by the various textures was generally consistent with that obtained with the full TS. However, in almost all windows, the OA obtained with 10% TS was significantly lower than that obtained with the full TS.
Based on MLC and RF, under two different scales of TS used, the optimal window for eight types of texture extraction and the corresponding OA obtained under these optimal windows are shown in Table 4.
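|Texture||MLC: optimal window (TS)||MLC: OA% (TS)||MLC: optimal window (10% TS)||MLC: OA% (10% TS)||RF: optimal window (TS)||RF: OA% (TS)||RF: optimal window (10% TS)||RF: OA% (10% TS)|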
|MEA||9 × 9||68.50||9 × 9||67.21||5 × 5||73.14||9 × 9||65.11|
|VAR||33 × 33||48.98||35 × 35||49.36||21 × 21||69.15||21 × 21||59.94|
|HOM||47 × 47||56.27||47 × 47||55.75||45 × 45||63.82||47 × 47||58.13|
|CON||37 × 37||46.43||39 × 39||45.28||15 × 15||64.21||15 × 15||55.14|
|DIS||27 × 27||54.91||29 × 29||54.92||23 × 23||62.61||27 × 27||56.59|
|ENT||23 × 23||54.91||23 × 23||55.06||25 × 25||64.76||25 × 25||57.79|
|SM||33 × 33||43.79||29 × 29||42.80||25 × 25||60.48||21 × 21||54.03|
|COR||15 × 15||37.36||11 × 11||36.14||27 × 27||67.67||31 × 31||59.54|
As shown in Table 4, with the exception of a few textures (SM and COR for MLC, SM for RF), the optimal extraction windows obtained with 10% TS were not smaller than those obtained with the full TS, whether based on MLC or RF. Constructing the best texture feature set with 10% TS may therefore require larger windows for texture extraction.
In addition, the optimal extraction windows obtained with the full TS under MLC and RF supervision differed in several respects. First, the two classifiers gave different optimal windows for the same texture: the optimal extraction windows of MEA, VAR, HOM, CON, DIS and SM under MLC were larger than those under RF, whereas those of ENT and COR were smaller. Second, when the same texture was used for tree species classification, the OA achieved by MLC was lower than that achieved by RF. Third, the importance ranking of the eight textures (according to OA) differed: under MLC it was MEA, HOM, DIS, ENT, VAR, CON, SM and COR, whereas under RF it was MEA, VAR, COR, ENT, CON, HOM, DIS and SM.
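The comparisons drawn from Table 4 can be re-derived with a few lines of code. The sketch below assumes that the first four value columns of Table 4 belong to MLC and the last four to RF, as the text implies; the constant and function names are hypothetical.

```python
# Table 4 values as (window size, OA%) for MLC/TS, MLC/10% TS, RF/TS and RF/10% TS.
TABLE4 = {
    "MEA": ((9, 68.50), (9, 67.21), (5, 73.14), (9, 65.11)),
    "VAR": ((33, 48.98), (35, 49.36), (21, 69.15), (21, 59.94)),
    "HOM": ((47, 56.27), (47, 55.75), (45, 63.82), (47, 58.13)),
    "CON": ((37, 46.43), (39, 45.28), (15, 64.21), (15, 55.14)),
    "DIS": ((27, 54.91), (29, 54.92), (23, 62.61), (27, 56.59)),
    "ENT": ((23, 54.91), (23, 55.06), (25, 64.76), (25, 57.79)),
    "SM": ((33, 43.79), (29, 42.80), (25, 60.48), (21, 54.03)),
    "COR": ((15, 37.36), (11, 36.14), (27, 67.67), (31, 59.54)),
}
MLC_TS, MLC_10, RF_TS, RF_10 = range(4)


def rank_by_oa(column):
    """Texture names sorted by descending OA; ties keep the table order."""
    return sorted(TABLE4, key=lambda name: -TABLE4[name][column][1])


def window_larger(col_a, col_b):
    """Textures whose optimal window in col_a is larger than in col_b."""
    return [t for t, v in TABLE4.items() if v[col_a][0] > v[col_b][0]]


mlc_ranking = rank_by_oa(MLC_TS)
rf_ranking = rank_by_oa(RF_TS)
mlc_larger_than_rf = window_larger(MLC_TS, RF_TS)
```

Calling `window_larger(MLC_TS, MLC_10)` or `window_larger(RF_TS, RF_10)` lists the textures for which the 10% TS window was smaller than the full-TS window.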
3.3 Optimal feature set combined with other features
After texture feature sets of TexM-TS, TexM-10% TS, TexR-TS and TexR-10% TS were constructed, they were used and combined with DOM and DSM for tree species classification. The OA of these classifications is shown in Table 5.
|Supervision form||Data sets||MLC OA% (TS)||MLC OA% (10% TS)||RF OA% (TS)||RF OA% (10% TS)|
|None||DOM + DSM||68.2103||67.4331||75.9150||65.7918|
|MLC||TexM-TS + DOM + DSM||83.8904||76.5332||83.6992||74.9839|
|MLC||TexM-10% TS + DOM + DSM||82.7854||76.8312||83.8879||75.7809|
|RF||TexR-TS + DOM + DSM||81.3329||73.9062||83.1206||76.2576|
|RF||TexR-10% TS + DOM + DSM||82.1051||75.1006||83.4161||76.8362|
As shown in Table 5, the classification accuracy obtained with the four texture feature sets was greater than that of DOM alone and of the combination of DOM and DSM, regardless of whether MLC or RF was used for classification. Therefore, texture feature sets constructed with the optimal extraction windows play a very important role in tree species classification. For every data set, the OA obtained with the full TS was greater than that obtained with 10% TS, whether based on MLC or RF. The best texture feature sets and mixed feature sets constructed under 10% TS supervision generally achieved slightly better accuracy than those constructed under full-TS supervision. Whether the feature sets were constructed under MLC or RF supervision, RF generally achieved better accuracy than MLC as the classifier; a notable exception was TexM-TS + DOM + DSM classified with the full TS, for which MLC achieved slightly better accuracy than RF.
Overall, using MLC supervision with the large TS to determine the optimal texture extraction windows and construct the best texture feature set and mixed feature set was a good approach for tree species classification, with both MLC (83.8904%) and RF (83.6992%) achieving a high OA. Using RF supervision with the large TS for the same purpose was also a good approach; the accuracy it achieved was not much lower than that obtained under MLC supervision.
3.4 Confusion matrix of the optimal classification results
Based on the mixed feature sets of RedEdge-MX UAV images, the classification of eight tree species and grass using MLC produced the best results, and the confusion matrix of the best result is shown in Table 6.
|Type||Ligustrum lucidum||Cedrus deodara||Photinia serrulata||Eriobotrya japonica||Magnolia grandiflora||Platycladus orientalis||Cinnamomum camphora||Trachycarpus fortunei||Grass|
|Producer accuracy (%)||86.40||92.20||89.42||78.03||65.41||76.18||84.65||92.35||91.86|
|User accuracy (%)||66.87||84.78||66.95||86.92||91.82||99.82||87.20||91.23||98.09|
Overall accuracy = (33,786/40,274) = 83.8904%, Kappa coefficient = 0.8187.
As shown in Table 6, the producer accuracy of tree species recognition ranged from 65.41% (Magnolia grandiflora) to 92.35% (Trachycarpus fortunei), and the user accuracy ranged from 66.87% (Ligustrum lucidum) to 99.82% (Platycladus orientalis). Both producer and user accuracies therefore differed considerably among tree species. Large differences were also observed between the producer and user accuracies of the same tree species (e.g. L. lucidum, Photinia serrulata, M. grandiflora and P. orientalis). Thus, even with this best-performing group of features, the classification of the whole image was not ideal.
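For readers who wish to recompute the accuracy measures, the sketch below derives OA, the Kappa coefficient and the producer and user accuracies from a confusion matrix using their standard definitions. The orientation (rows are classified labels, columns are reference labels), the function name and the 2 × 2 toy matrix are assumptions; the full matrix of Table 6 is not reproduced here.

```python
def accuracy_metrics(matrix):
    """Return (overall accuracy, kappa, producer accuracies, user accuracies).

    matrix[i][j]: number of validation pixels of reference class j labelled as class i.
    Every class is assumed to have at least one reference and one classified pixel.
    """
    n = len(matrix)
    total = sum(sum(row) for row in matrix)
    diagonal = sum(matrix[k][k] for k in range(n))
    row_totals = [sum(row) for row in matrix]  # pixels assigned to each class
    col_totals = [sum(matrix[i][j] for i in range(n)) for j in range(n)]  # reference pixels
    overall = diagonal / total
    expected = sum(r * c for r, c in zip(row_totals, col_totals)) / total ** 2
    kappa = (overall - expected) / (1 - expected)
    producer = [matrix[k][k] / col_totals[k] for k in range(n)]
    user = [matrix[k][k] / row_totals[k] for k in range(n)]
    return overall, kappa, producer, user


# Hypothetical two-class matrix, for illustration only.
toy = [[40, 10], [5, 45]]
overall, kappa, producer, user = accuracy_metrics(toy)

# OA reported with Table 6: correctly classified pixels / all validation pixels.
reported_oa = 100 * 33786 / 40274
```

Producer accuracy is computed per reference class and user accuracy per classified class, which is why the two can differ strongly for the same species.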
3.5 Image of the best classification results
Under the optimal feature combination, the landscape tree species classification results of the whole RedEdge-MX image are shown in Figure 5.
As shown in Figure 5, the grassland area on the campus is relatively large and well-identified. The landscape tree species L. lucidum is mainly distributed on the side of the road, and it had a good recognition effect. P. serrulata is distributed on the edge of the grass and is planted individually in blocks on the campus, which can also be well-identified. In addition to scattered distribution in the campus, Cedrus deodara is planted in a large area in the south of the campus, which can be detected effectively. Eriobotrya japonica, M. grandiflora, P. orientalis, Cinnamomum camphora and T. fortunei are scattered on the campus and can also be effectively detected.
4 Discussion and conclusions
The number of pixels in the TS has little or no impact on the time consumption and accuracy of MLC; however, it has a great impact on RF. As the number of pixels in the TS increased, the training time and classification time of RF increased continually, and the sum of the training and classification time was more than ten times that of MLC. However, as the number of pixels in the TS increased, the accuracy of the RF classification also increased and became significantly greater than that of MLC. In terms of time consumption under changing sample sizes, MLC is better than RF; in terms of classification accuracy, RF is better than MLC.
In texture feature extraction, using a large window is more time consuming than using a small window. For example, it takes more than 10 h to extract a 5-band HOM under a 47 × 47 window and approximately 0.5 h under a 3 × 3 window. In this study, for most types of textures, the optimal windows for texture extraction found with the small TS were larger than those found with the large TS. Because textures must then be extracted up to larger windows, using the small TS to find the optimal window takes more time than using the large TS, which means that the search for the optimal window should be done with the large TS.
In tree species classification based on the best texture feature sets and mixed feature sets, although MLC obtains slightly better accuracy than RF in the classification of TexM-TS + DOM + DSM, the accuracy of RF classification is better than that of MLC for most data sets. In addition, the optimal window for texture extraction found under MLC supervision is generally larger than that found under RF supervision. Although MLC takes less time than RF in image classification, the texture extraction it requires takes longer because of the larger windows. Overall, RF is more time-efficient. Therefore, considering the impact of the TS size, the time consumption of texture extraction, training and classification and the classification accuracy of the data sets, RF should be more suitable for determining the optimal window for texture extraction in tree species classification based on RedEdge-MX data.
When using the two classifiers to supervise and determine the optimal window for texture extraction, we only used the data of one period, which may not be representative. In addition, because of the large amount of calculation, there is a lack of testing of larger windows and repeated experiments. These are shortcomings of this study. In future research, we will improve the experimental design and further clarify the use of a suitable classifier to determine the optimal window for texture extraction.
To analyse whether MLC or RF is more suitable for supervising and determining the optimal window for texture extraction from RedEdge-MX images, the data acquired on 3 January 2020 were used as an example to classify eight tree species and grass using two TSs of different sizes, and the two supervisors were evaluated in terms of time consumption and classification accuracy. The size of the TS had little effect on the time consumption and accuracy of MLC; however, it had a greater impact on RF. The large TS increased both the time consumption and the classification accuracy of RF, and the optimal window for texture extraction obtained using the small TS was generally larger than that obtained with the large TS. Under MLC supervision, the optimal window for texture extraction was generally larger than that under RF supervision. The classification accuracy of RF was greater than that of MLC for most feature sets, and the determination of the best window for RedEdge-MX texture extraction should therefore be supervised by RF.
We would like to express our gratitude to the editors and the anonymous reviewers.
Funding information: This work was supported by the Natural Science Foundation of Henan Province (Grant No. 202300410293), the National Nature Science Foundation of China (Grant Nos. 32001250 and 42071198) and the Scientific and Technological Project of Henan Province (Grant No. 212102310433).
Author contributions: Huaipeng Liu: investigation, methods, visualization, results analysis, writing – original draft, project administration; Xiaoyan Su, Chuancai Zhang and Huijun An: investigation, resources.
Conflict of interest: The authors state that they have no conflict of interest.
Data availability statement: The datasets produced for this study are available from the corresponding author on reasonable request.
[1] Ghosh A, Joshi PK. A comparison of selected classification algorithms for mapping bamboo patches in lower Gangetic plains using very high resolution WorldView 2 imagery. Int J Appl Earth Obs Geoinf. 2014;26:298–311. 10.1016/j.jag.2013.08.011.
[2] Dian Y, Li Z, Pang Y. Spectral and texture features combined for forest tree species classification with airborne hyperspectral imagery. J Indian Soc Remote Sens. 2015;43(1):101–7. 10.1007/s12524-014-0392-6.
[3] Liu HP, An HJ, Wang B, Zhang QL. Tree species classification using WorldView-2 images based on recursive texture feature elimination. J Beijing Univ. 2015;37(8):53–9 (in Chinese with English abstract). 10.13332/j.1000-1522.20140311.
[4] Gini R, Sona G, Ronchetti G, Passoni D, Pinto L. Improving tree species classification using UAS multispectral images and texture measures. ISPRS Int J Geo-Inf. 2018;7(8):315. 10.3390/ijgi7080315.
[5] Ferreira MP, Wagner FH, Arago LEOC, Shimabukuro YES, Filho CRDS. Tree species classification in tropical forests using visible to shortwave infrared worldview-3 images and texture analysis. ISPRS J Photogramm Remote Sens. 2019;149:119–31. 10.1016/j.isprsjprs.2019.01.019.
[6] Tian XM, Chen L, Zhang XL. Classifying tree species in the plantations of southern china based on wavelet analysis and mathematical morphology. Comput Geosci. 2021;151:104757. 10.1016/j.cageo.2021.104757.
[7] Wang T, Zhang H, Lin H, Fang C. Textural-spectral feature-based species classification of mangroves in mai po nature reserve from Worldview-3 imagery. Remote Sens. 2015;8(1):24. 10.3390/rs8010024.
[8] Wang X, Wang Y, Zhou C, Yin L, Feng X. Urban forest monitoring based on multiple features at the single tree scale by uav. Urban Urban Green. 2020;58:126958. 10.1016/j.ufug.2020.126958.
[9] Wu YS, Zhang XL. Object-oriented tree species classification with multi-scale texture features based on airborne hyperspectral images. J Beijing Univ. 2020;42(6):91–101 (in Chinese with English abstract). 10.12171/j.1000-1522.20190155.
[10] Wang N, Peng SK, Li MS. High-resolution remote sensing of textural images for tree species classification. J Zhejiang A&F Univ. 2012;29(2):210–7 (in Chinese with English abstract). 10.11833/j.issn.2095-0756.2012.02.010.
[11] Liu H, Su X, An H. Typical landscape tree species recognition based on RedEdge-MX: suitability analysis of two texture extraction forms under random forest supervision. Pol J Env Stud. 2022;31(2):1475–84. 10.15244/pjoes/141815.
[12] Shi YF, Wang TJ, Skidmore AK, Heurich M. Improving LiDAR-based tree species mapping in Central European mixed forests using multi-temporal digital aerial colour-infrared photographs. Int J Appl Earth Obs Geoinf. 2020;84:101197. 10.1016/j.jag.2019.101970.
[13] Van der Linden S, Rabe A, Held M, Jakimow B, Leitão PJ, Okujeni A, et al. The EnMAP-Box—A toolbox and application programming interface for EnMAP data processing. Remote Sens. 2015;7(9):11249–66. 10.3390/rs70911249.
© 2022 Huaipeng Liu et al., published by De Gruyter
This work is licensed under the Creative Commons Attribution 4.0 International License.