Application of clustering algorithm in complex landscape farmland synthetic aperture radar image segmentation

In the field of synthetic aperture radar (SAR) image segmentation, region-based algorithms have shown great potential. SAR images contain a multiplicity of complex textures, which makes them difficult to segment as a whole. Existing algorithms may produce mixed super-pixels with different labels because of speckle noise. This study presents a technique based on the organizational evolution algorithm (OEA) that improves ISODATA by clustering pixel blocks rather than individual pixels. This approach effectively filters out useless local information and successfully introduces effective information. To verify the accuracy of the OEA-ISODATA algorithm, its segmentation effect is tested on a SAR image and compared with other techniques. The results demonstrate that, in the light-colored farmland category, the OEA-ISODATA algorithm is 10.16% more accurate than the WIPFCM algorithm, 23% more accurate than the K-means algorithm, and 27.14% more accurate than the fuzzy C-means algorithm. The pixel block strategy introduced by the OEA-ISODATA algorithm successfully reduces noise interference in the image, and the effect is more obvious when the image background is complex.


Introduction
ISODATA is one of the most widely used clustering algorithms. Its main function is to automatically adjust the cluster centers and the number of clusters by fusing (merging) and splitting clusters. In 1965, Ball and Hall proposed ISODATA as an unsupervised clustering algorithm based on the K-means clustering algorithm. Unlike the original algorithm, it can also adjust the number of cluster centers. ISODATA controls the splitting and fusion of clusters with two threshold parameters, so the selection of these thresholds has a direct impact on the correctness of the results. The thresholds are, however, difficult to set, and setting them correctly and reasonably has become an urgent problem [1-3].
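The role of the two thresholds can be illustrated with a minimal sketch. It assumes the classical ISODATA tests only (split a cluster whose largest per-dimension standard deviation exceeds the splitting threshold, merge two clusters whose centers are closer than the fusion threshold); the function name `isodata_decisions` and the argument names `x1` and `x2` are illustrative, and the additional conditions of full ISODATA (minimum cluster size, desired cluster count) are omitted.

```python
import math


def isodata_decisions(centers, stds, x1, x2):
    """Return (clusters to split, cluster pairs to merge) for thresholds x1 and x2.

    centers: list of cluster centers (each a list of floats)
    stds:    per-dimension standard deviation of each cluster
    x1:      splitting threshold (assumed to act on the largest standard deviation)
    x2:      fusion threshold (assumed to act on the center-to-center distance)
    """
    to_split = [i for i, s in enumerate(stds) if max(s) > x1]
    to_merge = [
        (i, j)
        for i in range(len(centers))
        for j in range(i + 1, len(centers))
        if math.dist(centers[i], centers[j]) < x2
    ]
    return to_split, to_merge
```

A small change in either threshold can change which clusters are split or merged, which is why the result is so sensitive to their values.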
Synthetic aperture radar (SAR) is a form of radar used to create 2D or 3D images of objects, such as landscapes. SAR uses the motion of the radar antenna over a target region to provide finer spatial resolution than conventional beam-scanning radars. It is a widely used sensing system that produces high-resolution images under all weather conditions, as shown in Figure 1.
3D SAR processing is done in two stages. First, the azimuth and range directions are focused to generate 2D images. Then a digital elevation model is used to measure the phase differences between complex images and recover the height information, which is combined with the azimuth-range coordinates delivered by the 2D SAR focusing. SAR finds applications in fields such as environment, archeology, and military [4]. Correct identification of objectives in SAR applications requires correct segmentation. For interpretation, a SAR image is divided into nonoverlapping, coherent regions: pixels in the same region have similar features, whereas pixels in different regions have different features. Dividing the image into regions in this way makes it easier to process, analyze, and understand [5][6][7][8]. Approaches such as thresholding, clustering algorithms, statistical model-based methods, and morphological methods are generally used for SAR image segmentation.
SAR image segmentation is complicated because an image contains a very large number of samples and because speckle noise is introduced during imaging [9][10][11]. For example, a 256 × 256 SAR image contains 65,536 pixel samples, which makes the accuracy requirement very challenging to meet. A basic SAR image sample is shown in Figure 2.

Contribution
Traditional oversegmentation techniques can produce mixed super-pixels with dissimilar labels because of speckle noise. To remove this drawback, this study presents an improved ISODATA clustering algorithm (OEA-ISODATA). It effectively filters out useless local information and successfully introduces effective information. The proposed algorithm introduces a pixel block strategy to reduce noise interference in the image, and its segmentation accuracy is verified on SAR images.
The study is organized as follows. Section 2 presents the literature survey, and Section 3 details the research methodology adopted. Section 4 provides the results and discussion. Finally, Section 5 concludes the study.

Literature review
In recent years, many scholars have worked on improving the traditional ISODATA clustering algorithm. For example, Ting-Ting et al. applied an evolutionary algorithm to calculate the optimal ISODATA parameters [12]. Aili and Xiuhua used a genetic algorithm to determine the ISODATA parameter settings [13]. Yan and Hui-Lei combined the ISODATA clustering algorithm with a PSO genetic algorithm and used the resulting algorithm for remote sensing image segmentation [14].
Building on this earlier research, this study proposes an improved ISODATA clustering algorithm (OEA-ISODATA) based on the organizational evolution algorithm. The threshold parameters of ISODATA are determined by the global search ability of OEA. The OEA-ISODATA algorithm abandons individual pixel points and uses the pixel block as the basic unit of clustering. This method effectively filters out useless local information, successfully introduces effective information, suppresses noise in SAR image segmentation, and yields a better optimized clustering result.
An efficient multiobjective automatic segmentation framework for SAR images is proposed in [15]. The framework addresses four important issues. In the initial stage, two image preprocessing techniques are detailed. In the next stage, an efficient immune multiobjective optimization algorithm is proposed. Quantitative analysis and validation are performed on both simulated data and real images, and the comparison with state-of-the-art techniques shows better performance for the proposed technique. A technique for segmenting SAR images based on fuzzy clustering is proposed in [16]. It is a fast fuzzy C-means (FCM) clustering method based on key pixels (FKP_FCM). The overall segmentation is accelerated because the time-consuming clustering operation is performed only on a small subset of pixels. Its efficiency is supported by a series of experiments, including the segmentation of twelve simulated images. Compared with state-of-the-art segmentation techniques, the proposed technique gives better computational speed and better suppression of speckle noise. A two-pass clustering algorithm that combines the linear assignment and FCM methods is proposed for polarimetric SAR (PolSAR) image segmentation in [17]. A semantic segmentation algorithm for SAR images based on texture complexity analysis is proposed in [18]. Its effectiveness is verified on various images, and it suppresses noise better than state-of-the-art techniques when performing semantic segmentation of SAR images.
A multilook SAR image segmentation algorithm for an unknown number of clusters is presented in [19]. A Markov random field is used to model the spatial relationship among neighboring pixels through the weighting coefficients of the components in a gamma mixture model. The experimental results, obtained on real multilook SAR images, are compared with state-of-the-art techniques. An algorithm named SC ensemble is proposed for the segmentation of SAR images in [20]. It provides the necessary diversity as well as high-quality components for an efficient ensemble, and the experimental results show that it is effective for SAR image segmentation. A kernel FCM algorithm that uses pixel intensity and location information for SAR image segmentation is proposed in [21].

Individual coding
Each chromosome can be regarded as an individual. In ISODATA, the threshold of the splitting operation is $x_1$ and the threshold of the fusion operation is $x_2$, and each threshold $x_i$, $i = 1, 2$, is searched between a lower limit and an upper limit of the search space. Each individual therefore carries the splitting threshold $x_1$ and the fusion threshold $x_2$ of the ISODATA algorithm in its chromosome, and the individual is evaluated through the clustering result these thresholds produce. Because suitable thresholds depend on the data being processed and their ranges can be very large, an algorithm with good search ability over the solution space is needed to reach the global optimal solution, so OEA is selected for the optimization.
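The encoding described above can be sketched as a small data structure. This is only an illustration under stated assumptions: the class name `Individual`, the helpers `random_individual` and `clip_to_bounds`, and the use of uniform random initialization are not taken from the paper.

```python
import random
from dataclasses import dataclass


@dataclass
class Individual:
    """One chromosome: the two ISODATA thresholds (names are illustrative)."""
    x1: float  # splitting-operation threshold
    x2: float  # fusion-operation threshold
    fitness: float = float("inf")  # DB index of the clustering; lower is better


def random_individual(lower, upper, rng):
    """Draw each threshold uniformly from its interval [lower[i], upper[i]] (assumed)."""
    x1 = rng.uniform(lower[0], upper[0])
    x2 = rng.uniform(lower[1], upper[1])
    return Individual(x1, x2)


def clip_to_bounds(ind, lower, upper):
    """Keep a chromosome inside the search space after an operator has moved it."""
    ind.x1 = min(max(ind.x1, lower[0]), upper[0])
    ind.x2 = min(max(ind.x2, lower[1]), upper[1])
    return ind
```

For example, `random_individual((0.0, 1.0), (2.0, 5.0), random.Random(0))` returns an individual whose `x1` lies in [0, 2] and whose `x2` lies in [1, 5].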

Pixel block model
This study does not cluster individual pixels; instead, the OEA-ISODATA algorithm clusters pixel blocks. Under a dynamic weighting strategy, each pixel in a block is given a weight in proportion to its contribution to the local information, and the weights of all pixels in a block sum to 1. With this strategy, the effective local information of the image is retained and noise is suppressed.
The weight vector of a pixel block is calculated as follows. Let $I = \{I_1, I_2, \ldots, I_n\}$ represent an image. For pixel $I_k$, the $Q \times Q$ window centered on it is denoted $N_k$, and the index in the original image of the $r$-th pixel of the window is denoted $N_k(r)$, $r = 1, \ldots, Q \times Q$. The vector $P_{I_k} = (p_k(r), r \in 1, \ldots, Q \times Q)$ then represents the block of pixels centered on pixel $I_k$, where $p_k(r) = I_{N_k(r)}$.
First, the average variance $\sigma_{kr}$ of each pixel $p_k(r)$ of $P_{I_k}$ within the pixel block is calculated.
Next, the weight of each pixel $p_k(r)$ in the pixel block is obtained from $\sigma_{kr}$ with an exponential kernel function. Finally, the weights of all pixels $p_k(r)$ in the pixel block $P_{I_k}$ are normalized so that they sum to 1.
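The three steps (deviation per pixel, exponential kernel, normalization) can be sketched as follows. The concrete formulas are assumptions in the style of weighted image patch methods, not the paper's own equations: the deviation of pixel $r$ is taken as the root mean square difference to the other pixels of the block, and the kernel is $\exp(-(\sigma_{kr} - \bar{\sigma}_k))$. The function names and the border clamping in `extract_block` are likewise illustrative, and `q` is assumed to be odd.

```python
import math


def extract_block(image, row, col, q):
    """Return the q*q intensities centered on (row, col), clamping at the border."""
    half = q // 2
    h, w = len(image), len(image[0])
    block = []
    for dr in range(-half, half + 1):
        for dc in range(-half, half + 1):
            r = min(max(row + dr, 0), h - 1)
            c = min(max(col + dc, 0), w - 1)
            block.append(float(image[r][c]))
    return block


def block_weights(block):
    """Weights of the pixels in one block; they are positive and sum to 1."""
    n = len(block)
    # Assumed deviation measure: RMS difference between pixel r and the others.
    sigma = [
        math.sqrt(sum((block[s] - block[r]) ** 2 for s in range(n) if s != r) / (n - 1))
        for r in range(n)
    ]
    mean_sigma = sum(sigma) / n
    # Exponential kernel: pixels that deviate more than average get smaller weights.
    xi = [math.exp(-(s - mean_sigma)) for s in sigma]
    total = sum(xi)
    return [x / total for x in xi]
```

With this choice, an isolated bright pixel in an otherwise uniform block (a typical speckle outlier) receives a much smaller weight than its neighbors, which is the noise suppression effect the pixel block strategy aims for.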
The OEA-ISODATA algorithm proceeds as follows. First, the pixel blocks and their weight vectors are constructed for the whole image using the process above. Second, as many pixel blocks as the initial number of clusters are randomly selected to serve as the initial cluster centers of the ISODATA algorithm. Because each cluster center and each data point is a pixel block, the distance between two cluster centers and the standard deviation within each cluster need their own calculation methods, and this study designs special formulas for these clustering variables. Since a cluster center can be treated as a pixel block, let $v_i = (v_{ir}, r \in 1, \ldots, Q \times Q)$, $i = 1, 2, \ldots, C$, denote the cluster centers, where $C$ is the number of clusters and $C > 2$.
In the cluster-center calculation, $Q$ is the size of the pixel block and $N_i$ is the size of the $i$-th cluster (the number of individuals in the cluster). Note the difference between a pixel and a cluster center: both are represented by a pixel block, but a pixel has a weight vector, whereas a cluster center does not. The distance between cluster center $v_i$ and pixel block $P_{I_k}$ and the standard deviation within a cluster are each computed with a formula defined on pixel blocks. In the pixel block strategy, the size $Q$ is one of the most important parameters, because it controls how much local information is introduced. In general, a larger pixel block suppresses noise more strongly, but it also loses more detail and increases the computational complexity.
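A minimal sketch of these block-level quantities is given here. The formulas are assumptions, not the paper's equations: the distance is taken as a weighted Euclidean distance that uses the weight vector of the pixel block only, the cluster center as the component-wise mean of the $N_i$ blocks in the cluster, and the standard deviation as the classical per-component ISODATA form. All function names are illustrative.

```python
import math


def block_distance(block, weights, center):
    """Weighted Euclidean distance between a pixel block and an unweighted center."""
    return math.sqrt(sum(w * (p - v) ** 2 for p, w, v in zip(block, weights, center)))


def cluster_center(blocks):
    """Component-wise mean of the N blocks assigned to one cluster."""
    n = len(blocks)
    return [sum(col) / n for col in zip(*blocks)]


def cluster_std(blocks, center):
    """Per-component standard deviation within a cluster (classical ISODATA form)."""
    n = len(blocks)
    return [
        math.sqrt(sum((b[r] - center[r]) ** 2 for b in blocks) / n)
        for r in range(len(center))
    ]
```

Only the pixel block carries weights, so the distance is asymmetric by design: a noisy pixel with a small weight contributes little to the distance between its block and any center.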

Evaluation function
Each combination of fusion threshold and splitting threshold represents a different individual, and the clustering result generated by the ISODATA algorithm changes with the combination. The OEA algorithm is therefore used to search for the optimal individual, and the quality of the clustering result determines the quality of the individual.
In this study, the classic DB index is used to evaluate the clustering results. The idea of the DB index is to judge the quality of a clustering by the ratio of the compactness within clusters to the separation between clusters. A good clustering is compact within each cluster and well separated between clusters, so a smaller DB index indicates a better result. Because the algorithm clusters pixel blocks rather than individual pixels, the distances in the DB index are calculated differently from the traditional pixel-based form. The distance between clusters $D_{ij}$ and the compactness $S_i$ within cluster $i$ are calculated on pixel blocks, and from them the DB index value used in the OEA-ISODATA algorithm is obtained.
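For reference, the standard Davies-Bouldin form can be sketched as below, taking the per-cluster compactness values $S_i$ as given. Two points are assumptions: the separation $D_{ij}$ is computed as the plain Euclidean distance between cluster centers (centers carry no weight vector), and how $S_i$ is obtained from pixel blocks is left to the caller. The function name `db_index` is illustrative.

```python
import math


def db_index(scatters, centers):
    """Standard Davies-Bouldin index: mean over clusters of the worst
    (S_i + S_j) / D_ij ratio. Requires at least two distinct centers."""
    c = len(centers)
    total = 0.0
    for i in range(c):
        worst = max(
            (scatters[i] + scatters[j]) / math.dist(centers[i], centers[j])
            for j in range(c)
            if j != i
        )
        total += worst
    return total / c
```

Tighter clusters (smaller `scatters`) or more distant centers both lower the value, which matches its use as a fitness value to be minimized.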

Algorithm flow
In the OEA-ISODATA algorithm, the population is made up of many organizations. To gradually produce high-quality individuals in the population, the splitting operator, the annexation operator, and the cooperation operator are applied to the organizations during evolution. For an individual, the fusion threshold and the splitting threshold of ISODATA form the two-dimensional vector of its chromosome, so these two thresholds are the parameters being optimized. The pixel blocks to be segmented are clustered by the ISODATA clustering algorithm with these thresholds, and the resulting DB value is used as the fitness value of the individual [22,23].
After the population has been initialized and evaluated with ISODATA, the organizations are evolved in three steps. In the first step, the size of each organization is checked, and any organization whose size exceeds the set threshold is split. In the second step, pairs of organizations are selected from the current population, and either the annexation (merge) operator or the cooperation operator is applied to them. In the third step, when the population converges or the preset number of generations is reached, the evolution is terminated, and the input image is segmented according to the clustering result represented by the optimal individual in the population. The specific algorithm flow is as follows:
1. Construct the pixel block and the corresponding weight vector for every pixel to be segmented.
2. Initialize the population $P_0$ with $N_0$ organizations, each containing only one individual.
3. For each individual in the population, cluster the pixel blocks of the input image with the ISODATA algorithm using the individual's parameters $x_1$ and $x_2$, and take the DB index of the clustering result as the individual's fitness value. The minimum and maximum values of the standard deviation obtained in this step are taken as the lower and upper limits of $x_1$, respectively, and the minimum and maximum values of the distance between clusters are taken as the lower and upper limits of $x_2$, respectively.
4. $t \leftarrow 0$.
5. If $t \geq$ the maximum number of generations, go to step 10; otherwise, go to step 6.
6. Check the size of each organization and split every organization that meets the splitting condition. Calculate the fitness values of the two offspring organizations, remove the original organization from population $P_t$, and put the two offspring organizations into population $P_{t+1}$.
7. If the number of organizations in population $P_t$ is less than two, go to step 9; otherwise, go to step 8.
8. Randomly select two organizations, $org_{p1}$ and $org_{p2}$, from population $P_t$ and execute the cooperation operator or the merge operator with equal probability. Calculate the fitness values of the two offspring, remove the original organizations from population $P_t$, put the two offspring into population $P_{t+1}$, and go to step 7.
9. $t \leftarrow t + 1$, $P_t \leftarrow P_{t+1}$, and go to step 5.
10. Segment the image according to the clustering result of the optimal individual in population $P_t$ and output the segmentation result.
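The control flow of steps 2 to 10 can be sketched as follows. This is a heavily simplified illustration, not the paper's implementation: the three operators are reduced to toy versions (in particular, annexation here returns a single merged organization), the function name `evolve` and its parameters are hypothetical, and `fitness` is any callable that maps the two thresholds to a value to be minimized. In the paper that role is played by running ISODATA on the pixel blocks and computing the DB index.

```python
import random


def evolve(fitness, lower, upper, n0=20, max_size=5, generations=30, seed=0):
    """Simplified organizational evolution over the two ISODATA thresholds."""
    rng = random.Random(seed)

    def make(x):
        # Clip to the search space, then evaluate; members are (fitness, (x1, x2)).
        x = tuple(min(max(v, lo), hi) for v, lo, hi in zip(x, lower, upper))
        return (fitness(*x), x)

    def split(org):
        # Step 6: an oversized organization is divided into two halves.
        rng.shuffle(org)
        half = len(org) // 2
        return [org[:half], org[half:]]

    def annex(a, b):
        # The organization with the better leader absorbs the other; absorbed
        # members are re-sampled between their position and the winning leader.
        win, lose = (a, b) if min(a) <= min(b) else (b, a)
        lead = min(win)[1]
        moved = [make(tuple(l + rng.random() * (v - l) for l, v in zip(lead, m[1])))
                 for m in lose]
        return [win + moved]

    def cooperate(a, b):
        # The two leaders are blended; a child replaces the worst member if better.
        la, lb = min(a)[1], min(b)[1]
        w = rng.random()
        for org, x in ((a, tuple(w * p + (1 - w) * q for p, q in zip(la, lb))),
                       (b, tuple(w * q + (1 - w) * p for p, q in zip(la, lb)))):
            child = make(x)
            worst = max(range(len(org)), key=lambda i: org[i])
            if child < org[worst]:
                org[worst] = child
        return [a, b]

    # Steps 2 and 3: N0 organizations with one evaluated individual each.
    pop = [[make(tuple(rng.uniform(lo, hi) for lo, hi in zip(lower, upper)))]
           for _ in range(n0)]
    for _ in range(generations):                      # steps 5 and 9
        nxt, rest = [], []
        for org in pop:                               # step 6
            if len(org) > max_size:
                nxt.extend(split(org))
            else:
                rest.append(org)
        rng.shuffle(rest)
        while len(rest) >= 2:                         # steps 7 and 8
            a, b = rest.pop(), rest.pop()
            nxt.extend(annex(a, b) if rng.random() < 0.5 else cooperate(a, b))
        pop = nxt + rest
    return min(m for org in pop for m in org)         # step 10: best individual
```

Calling `evolve(f, (0.0, 0.0), (4.0, 4.0))` with a toy objective such as `f = lambda x1, x2: (x1 - 1) ** 2 + (x2 - 2) ** 2` returns a `(fitness, (x1, x2))` pair. In this sketch the best fitness in the population never gets worse from one generation to the next, because the winning organization and every leader are always retained.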

Results and discussion
The presented technique is evaluated on well-known UCI datasets and on public PolSAR image segmentation data. Both are publicly available, open-access datasets. Image segmentation plays an important role in the image processing field. The data used for the performance evaluation have a dimension of 900 × 1,024 pixels.
A set of experiments is designed to verify the correctness of the OEA-ISODATA algorithm. Its results are compared with those of four other algorithms: the PSO-ISODATA algorithm, the WIPFCM algorithm, the FCM algorithm, and the K-means algorithm. PSO-ISODATA is the ISODATA algorithm optimized by the PSO evolutionary algorithm, and its idea is basically the same as that of the OEA-ISODATA algorithm, which is why it is included in the comparison. WIPFCM introduces weighted pixel blocks into the classic FCM algorithm to exploit the spatial information of the image and suppress noise. Because the OEA-ISODATA algorithm introduces spatial information through the same weighted pixel block strategy, WIPFCM is also a suitable comparison algorithm [24].
The parameters are set as follows: CS = 0.6, AS = 0.8, Max OS = 20, and $N_0$ = 150. The larger the pixel block, the more local information of the image is introduced and the better the noise suppression. However, the amount of computation then increases rapidly, and more details are lost. A series of experiments showed that a pixel block size of 3 removes general noise interference from the image while keeping the amount of computation within a tolerable range. Therefore, the pixel block size of the WIPFCM algorithm and the OEA-ISODATA algorithm is set to 3 in all experiments.
To check the correctness of the segmentation results, pixels of different types are marked with different labels to form a test image. The ratio of the number of correctly segmented pixels to the total number of labeled pixels in the test image is then used as the accuracy of the segmentation result. To test the accuracy of the OEA-ISODATA algorithm, a farmland SAR image of 256 × 256 pixels is used as an example. The farmland SAR image is clustered into two kinds of pixels: dark farmland and light-colored farmland. As shown in Figure 3, 5,218 pixels are marked as light-colored farmland and 9,310 pixels are marked as dark farmland in the test image. The segmentation results are shown in Figure 4. The original SAR image is shown in Figure 4(a), and the segmentation results of the WIPFCM algorithm, K-means algorithm, FCM algorithm, PSO-ISODATA algorithm, and OEA-ISODATA algorithm are shown in the remaining panels of Figure 4. The segmentation effect of each algorithm is quite different. The OEA-ISODATA algorithm gives the best segmentation, and its advantage is most obvious in the classification of light-colored farmland.
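The accuracy measure can be sketched as follows, assuming the test image and the segmentation result are given as flat sequences of class labels and that unlabeled test pixels carry a sentinel value. The function name `segmentation_accuracy` and the sentinel `-1` are illustrative choices, and at least one labeled pixel is assumed.

```python
def segmentation_accuracy(truth, predicted, unlabeled=-1):
    """Per-class and overall accuracy, counted over labeled test pixels only."""
    correct, total = {}, {}
    for t, p in zip(truth, predicted):
        if t == unlabeled:
            continue  # pixels without a test label do not affect the accuracy
        total[t] = total.get(t, 0) + 1
        correct[t] = correct.get(t, 0) + (1 if p == t else 0)
    per_class = {c: correct[c] / total[c] for c in total}
    overall = sum(correct.values()) / sum(total.values())
    return per_class, overall
```

With two classes, the per-class values correspond to the light-colored and dark farmland accuracies, and the overall value corresponds to the total accuracy.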
To quantify the segmentation accuracy, the test image was compared with the segmentation results, and the accuracy for the light-colored farmland category, the accuracy for the dark farmland category, and the total accuracy were computed. The numbers of incorrectly and correctly classified pixels in the segmentation results of each algorithm are shown in Table 1, the segmentation accuracy of each algorithm on the farmland SAR image is compared in Table 2, and the segmentation results are represented graphically in Figures 5 and 6. The tables show that the OEA-ISODATA algorithm has excellent accuracy in all three items. In the light-colored farmland category, its accuracy is 10.16% higher than that of the WIPFCM algorithm, 23% higher than that of the K-means algorithm, and 27.14% higher than that of the FCM algorithm. The algorithm also has an advantage in total accuracy, which is 3-9% higher than that of the other three algorithms [25][26][27]. The PSO-ISODATA algorithm is poorer than the OEA-ISODATA algorithm in all three accuracy indicators, so the OEA-ISODATA algorithm can find better solutions than PSO-ISODATA [28][29][30].
SAR is a widely used sensing system that produces high-resolution images under all weather conditions. It finds applications in fields such as environment, archeology, military, and target recognition. SAR images are especially useful when optical sensors are inoperable, for example at night, because SAR sensors can penetrate clouds and work well in bad weather. Correct segmentation is therefore important for the correct identification of objectives in SAR image applications.

Conclusion
This article studies the ISODATA clustering algorithm, combines it with OEA into a hybrid algorithm (OEA-ISODATA), and applies the OEA-ISODATA algorithm to SAR image segmentation. ISODATA controls the splitting and fusion of clusters with two threshold parameters, so the selection of these thresholds has a direct impact on the correctness of the results. In the proposed algorithm, the global search of OEA obtains the optimal combination of the fusion threshold and the splitting threshold. Because the clustering is performed on pixel blocks rather than on pixel points, spatial information of the image is introduced and noise interference is suppressed. To verify the accuracy of the OEA-ISODATA algorithm, its segmentation effect is tested on a SAR image and compared with the WIPFCM algorithm, the PSO-ISODATA algorithm, the FCM algorithm, and the K-means algorithm. The proposed algorithm performs best among the compared techniques. In the light-colored farmland category, it is 10.16% more accurate than the WIPFCM algorithm, 23% more accurate than the K-means algorithm, and 27.14% more accurate than the FCM algorithm. It is also concluded that OEA-ISODATA makes the ISODATA algorithm more convenient to use, because the thresholds are found by the search. Future work will focus on improving the algorithm so that it better preserves edges and other details of the original image.
Funding information: Innovation Fund of the Industry-University-Research Center for Science and Technology Development of the Ministry of Education in 2018, "Key technologies of smart campus system based on Internet of things in the era of Internet + education" (2018A02002); Research Project of Boda College of Jilin Normal University in 2019, "Intelligent monitoring system of motorist's heart rate based on big data" (2019BD002).

Conflict of interest:
The authors declare no conflict of interest.