A detailed and comparative work for retinal vessel segmentation based on the most effective heuristic approaches

: Computer based imaging and analysis techniques are frequently used for the diagnosis and treat-ment of retinal diseases. Although retinal images are of high resolution, the contrast of the retinal blood vessels is usually very close to the background of the retinal image. The detection of the retinal blood vessels with low contrast or with contrast close to the background of the retinal image is too difficult. Therefore, improving algorithms which can successfully distinguish retinal blood vessels from the retinal image has become an important area of research. In this work, clustering based heuristic artificial bee colony, particle swarm optimization, differential evolution, teaching learning based optimization, grey wolf optimization, firefly and harmony search algorithms were applied for accurate segmentation of retinal vessels and their performances were compared in terms of convergence speed, mean squared error, standard deviation, sensitivity, specificity. accuracy and precision. From the simulation results it is seen that the performance of the algorithms in terms of convergence speed and mean squared error is close to each other. It is observed from the statistical analyses that the algorithms show stable behavior and also the vessel and the background pixels of the retinal image can successfully be clustered by the heuristic algorithms.


Introduction
Retinal vessel segmentation is one of the most important areas of retinal image analysis because some attributes of retinal vessels are usually important symptoms of diseases [1]. Due to the disadvantages of manual retinal image analysis automatic analysis of retinal images becomes an important area of research. Automated segmentation of retinal vessels is accepted as the first step of developments in the area of computer-aided diagnosis systems for ophthalmic disorders [2], The abnormalities caused by diseases of obesity [3], hypertension [4], glaucoma [5] and diabetic retinopathy [6][7][8] are able to display with higher accuracy by means of the automated segmentation of retinal vessels. It also plays an important role in the areas such as evaluation of retinopathy of prematurity [9], vessel diameter measurement [10]. fovea region detection [11], arteriolar stenosis [12], computer assisted laser surgery [13], treatment of ophthalmologic diseases [14][15][16] and optic disk detection [17].
The conventional classical algorithms are well developed for clustering based retinal image analysis. However, there are few works in literature including retinal image analysis by using heuristic algorithms. In this work clustering based heuristic artificial bee colony (ABC), particle swarm optimization (PSO), differential evolution (DE), teaching learning based optimization (TLBO), grey wolf optimization (GWO), firefly (FA) and harmony search (HS) algorithms have been applied to retinal vessel segmentation. The retinal images used in the simulations are taken from the DRIVE and STARE databases. In order to perform a detailed analysis simulations are realized for both normal and abnormal retinal images. The normal and abnormal retinal images taken from DRIVE and STARE databases are given in Figures 1  and 2, respectively.
In order to improve the performance of retinal vessel segmentation, some pre-processing operations have to be applied before clustering. Then the pre-processed retinal images are evaluated by the algorithms. Pre-processing operations used in this work can be described as the following.

Pre-processing operations
The retinal images analyzed in this work consist of red (R), green (G) and blue (B) color components and when each layer of these images was examined separately, it was seen that the highest clustering performance was obtained in the green layer. The G layer has higher illuminance and it also provides optimal contrast and brightness levels when compared to R and B layers. For these reasons, after this step, the image analysis was continued on the green layer. This pre-processing operation can be called as band selection.
As a result, after switching from a three-dimensional RGB image format to a one-dimensional image by band selection process; i.) the image becomes suitable for twodimensional bottom-hat filtering, ii.) the pixel differences between the image background and the vessels become a bit more evident.
The contrast difference between the vessels and the image background was not found to be sufficient for a high clustering performance even after pre-processing operation mentioned above. For this reason, the next step is to apply a second pre-processing operation called bottom-hat transformation. This transformation can be described as extraction of the original image from the morphologically closed image. A bottom-hat filter enhances black spots in a white background by subtracting the morphological Close of the image from the original image. Bottom-hat filter increases the contrast between the bright levels in the retinal image and as a result of this, regions with different brightness levels can successfully be distinguished. Eq. (1) represents the bottom-hat transformation, bottom − hat(g) = g − (g⋅nB) (1) where g is the retinal image. B is the structural element to be used (filter structure), and n is the bottom-hat transformation. In this work, a disk with r=8 pixel radius have been used as building element and bottom-hat transformation is applied as to be n=8. The last pre-processing applied before clustering is the brightness correction. During the photo shoots of biomedical images, low contrast values or high pixel levels may occur due to the device or environment. In brightness correction process, these extreme pixel values creating noise effect on the retinal image are being optimized by using the following Equation for a successful segmentation.
In the equation given above, a and b represent the minimum and maximum pixel values of the image, respectively. Also, c and d are respectively representing the lowest and highest limit values of the pixel range that we want to normalize the image and in which the noise effects are minimized. As a result of brightness correction the contrast between the vessels and the image background becomes optimal for clustering. Figure 3 (b) and (e) show the green layer images obtained as a result of band selection for the retinal images taken from the DRIVE database given in Figure 1

Materials and methods
Artificial bee colony algorithm Swarm intelligence simulates the collective behaviour of animal societies [18] and it provides effective solutions in all areas of engineering. Artificial bee colony algorithm which is proposed by Karaboga in 2005 [19] is one of the most effective and widely used swarm based algorithms.
The pseudo-code of a basic ABC algorithm can be given as the following.
Randomly create an initial population consisting of solutions Calculate the fitness value of each x i solution in the population. Cycle=1

REPEAT
Produce new solutions v ij in the neighbourhood of x ij using v ij = x ij + ϕ ij (x ij − x kj ) and calculate the fitness values of these solutions.
Apply the greedy selection betweenx i andv i and save the selected new solution to memory.
Calculate the probability values p i for the solutions x i by means of their fitness values using the following Eq. (3).
where the p i values are normalized into [0,1]. The fitness value of each solution is calculated by using the following Eq. (4).
Produce the new solutions (new positions) v i from the solutions x i . selected depending on p i . and evaluate them.
Apply the greedy selection between x i and v i then memorize the selected new solution.
Determine the abandoned solution (source), if exists, and replace it with a new randomly produced solution x i using Eq. (5) given below, So far, memorize the best food source position (solution) achieved Cycle=Cycle + 1 UNTIL Cycle=Maximum Cycle Number After an initial population is created randomly the ABC algorithm performs three steps at each cycle of the search. At the first step the fitness (quality) values of the initial solutions are calculated and memorized. Then a new possible solution within the neighborhood of the present solution is determined and its fitness value is calculated. If the fitness value of the new solution is higher, the previous solution is forgotten and the new solution is memorized (greedy selection). At last step new possible solutions are determined within the neighborhood of the solutions evaluated so far and then their fitness values are calculated.
In ABC algorithm, if a solution cannot be improved by a predetermined number of trials, it means that the associated solution has been exhausted and will no more be evaluated. The memorized information about this abandoned solution is replaced with information of a randomly produced new solution. The number of trials for releasing a solution is equal to the value of "limit" which is an important control parameter of ABC algorithm. These three steps are repeated until the termination criteria are satisfied.
Assume that w i is the ith solution to the problem and f (w i ) represents the fitness of this solution. Let P(c) = {w i (c)|i = 1, 2, …, s} (c: cycle. s: number of possible solutions) represent the population of possible solutions. If the fitness value of a solution is high, the probability of this solution to be transferred to the next generation becomes high. The probability function can be defined as, where s is the number of solutions in the population. A new possible solution is selected depending on the probabilities calculated and then a neighbour solution around the chosen one is determined. This process is repeated until all possible solutions evaluated. Assume that a possible solution selected is w i . The neighbour solution of w i might be determined as the following, where ϕ i is a randomly generated number in the interval [−1, +1] and k is a randomly produced index different from i. If f (w i (c + 1)) is higher than f(w i (c)), then the algorithm memorizes w i (c + 1). Namely, w i (c) is replaced with w i (c + 1), otherwise w i (c) is kept as it is. If w i cannot be improved through the predetermined number of trials, it is abandoned. The algorithm starts to search for a new possible solution randomly. and after finding the new one, the new solution is accepted to be w i (c + 1). Similar to PSO, DE, TLBO, GWO, FA and HS algorithms, while performing retinal vessel segmentation with basic ABC algorithm, optimal pixel range consisting of optimal lower and upper pixel values is investigated. In basic ABC algorithm, the segmentation process is initialized via a randomly selected x i pixel position. Firstly, the fitness value of each x i pixel position creating the retinal image is calculated and then these position values are updated by using Eq. (5) until the optimum fitness values for lower and upper bounds is reached. Finally, the retinal vessel segmentation is realized in the interval between the optimum x i lower and upper pixel positions obtained so far.

Particle swarm optimization algorithm
Particle swarm optimization is a population based stochastic optimization technique developed by Eberhart and Kennedy in 1995, inspired by social behavior of bird flocking or fish schooling [20]. In PSO, a population of particles (solutions) is made to move across D-dimensional space. The particles move across the problem space by getting information from the current optimum particles. The position of a given particle at any cycle represents a possible solution of the problem. Evaluating the objective function at a given position gives the fitness of the corresponding solution. The particles also have velocities which direct particles searching in the space.
PSO is initialized with a population including randomly produced particles and then searches for optimal solutions by updating positions in each generation. In every cycle, each particle is updated by best solution [p → (t)] that it has achieved so far. and the best solution of the population. [g → (t)]. When a particle takes part of the population as its topological neighbors, the best value is a local best.
Intuitively, the information about good solutions is distributed through the swarm and thus the particles are directed to move to good areas in the search space. At each time step t. the velocity υ → (t) is updated and the particle is moved to a new position. x → (t + 1). This new position is calculated as the sum of the previous position and the new velocity: The update of the velocity from the previous velocity to the new velocity is determined by the following Eq. (9): where r 1 and r 2 are uniformly distributed random numbers. The parameter ω is called the inertia weight and controls the magnitude of the old velocity υ → (t) in updating velocity. whereas c 1 and c 2 determine the significance of p → (t) and g → (t), respectively. Furthermore, at any time step of the algorithm υ i is constrained by the υ max parameter. The swarm in PSO is initialized by assigning each particle to a uniformly and randomly chosen position in the search space. Velocities are initialized randomly in the [υ min , υ max ] range.
If the sum of accelerations would cause the velocity of that parameter to exceed υ max , the velocity on that dimension is limited to υ max .
While performing retinal vessel segmentation by using basic PSO algorithm, firstly, for each x i pixel position value that is randomly selected and constitutes the initial population, fitness values are

Differential evolution algorithm
Differential evolution algorithm which has been introduced by Storn and Price [21], is a method with crossover and mutation operations that work directly on continuous-valued vectors. An optimization task consisting of D parameters can be represented by a D dimensional vector. Therefore, in DE algorithm a population of NP solution vectors is randomly created at the start. This population is successfully improved by applying mutation, crossover and selection operators.
The main steps of a basic DE algorithm are given below: In the mutation operator, for each target vector x i, G , a mutant vector is produced by where i, r 1 ,r 2 , r 3 ∈ {1, 2, ..., NP} that randomly chosen and must be different from each other. In Eq. (10). F is the scaling factor ∈[0, 2] The pseudo code of the procedure: Initialize REPEAT Calculate fitness values of all particles Modify the best particles in the swarm Choose the particle with the best fitness value of all the particles Calculate the velocities of particles Update the particle positions UNTIL (maximum generations or minimum error criteria is satisfied) Selection UNTİL (maximum generations or minimum error criteria is satisfied) Çetinkaya and Duran: Retinal vessel segmentation based on heuristic approaches affecting on difference vector (x r2, G − x r3, G ), K is the combination factor. In the crossover operator, the parent and the mutated vectors are evaluated together in order to produce a trial vector of u ji, G+1 , where j=1,2, … , D; r j ∈ [0, 1] random number; CR stands for the crossover constant ∈[0, 1] and rn i ∈(1,2, … , D) randomly chosen index. Performance of the trial vector and its parent is compared and the better one is selected. All solutions have the same chance of being selected as parents without dependence of their fitness value. While DE algorithm is applied to retinal vessel segmentation, firstly, an initial population consisting of vectorially defined pixel locations is created. Then, by using the mutation and crossover operations represented in Eqs. (10) and (11), new pixel values are randomly created in the neighbourhood of each pixel value available in the initial population. The fitness values of the previous and the new produced pixel values are compared and the better solutions are stored as the optimal lower and upper pixel values. At each cycle, the optimal pixel values in the memory are updated by mutation and crossover, and as a result of this the most appropriate lower and upper pixel values for retinal vessel segmentation are obtained. Finally, by using the pixel value interval between the lower and upper pixel values obtained optimal retinal vessel segmentation is achieved.

Teaching learning based optimization algorithm
TLBO is an approach based on the learning-teaching interaction between teachers and students in order to solve multi-dimensional, linear and nonlinear problems [22]. The algorithm consists of two main phases so as to be teacher phase and learners phase.
In teacher phase, the teacher transfers knowledge to the students in order to increase the mean knowledge level of the class. Since the teacher is the most experienced and knowledgeable person in a class, he or she represents the best solution of the entire population. Let m be the number of subjects which corresponds to the parameter number to be optimized. Also, if n is the number of learners, population size can be determined as k=1, 2, … , n. At any cycle of i, M j.i represents the mean result of the learners in a particular subject of j=1, 2, … , m. The average difference between the teacher's knowledge capacity and the learning capacity of students is given in Eq. (12).
where, X j, kbest, i is the best solution obtained for subject j. Also, r i is a randomly produced coefficient in the interval of [0,1]. Finally, the variable T f is called as learning factor and its value is randomly decided by the algorithm according to Eq. (13).
As seen from Eq. (12), the value of T f can either be 1 or 2. The available solution is updated according to Difference Mean j, i as given in Eq. (14), where, X ′ j, k, i is the updated value of X j, k, i . The learner phase aims to increase the knowledge owned by the students by means of interaction with the teacher or between themselves. Let, P and Q be the randomly selected students, such that, where X ′ total−P, i and X ′ total−Q, i are the updated values of X total−P, i and X total−Q,i , at the end of the teacher phase. The candidate solution is obtained as the following.
In retinal vessel segmentation based on TLBO algorithm, each pixel in the retinal image represents a student. The pixel with the highest fitness value among these pixels represents the teacher. At each cycle, the new pixel values representing possible solutions are produced by Eq. (14) and then fitness values of these new solutions are calculated. Among these new fitness values, the best two is selected and stored to the memory as the best lower and upper pixel values by using Eqs. (15)- (17). By repeating these steps at each cycle, the optimal lower and upper pixel values producing the best fitness values are found and the retinal vessel segmentation is applied according to these optimal solutions.

Grey wolf optimization algorithm
GWO is a heuristic optimization algorithm inspired from the collective behaviour of gray wolves in nature and proposed by Mirjalili et al. in 2014 [23-25]. It simulates the leadership hierarchy consisting of four types of wolves named as alpha (α), beta (β), delta (δ) and omega (w). In the population, α wolves represent the best solution for the problem to be optimized while the β and δ wolves represent the second and the third best solutions, respectively. The rest of the populations are determined as the w wolves. In GWO algorithm, an optimization task consists of three phases to be encircling, hunting and attacking.
Encircling phase includes the optimization of the positions of gray wolf and prey as given below, where, t represents the current cycle, X → (t) and X p → (t) are the position vectors of the grey wolf and prey, respectively. Also, A where, r 1 → and r 2 → are randomly produced vectors in the range [0,1] and they provide reach to random positions in the solution space. Also, a → are the linearly decreased coefficients from 2 to 0 over the course of cycles.
In hunting phase, the position of the prey corresponds to the global solution. At the beginning of the search it is supposed that the α, β and δ have better knowledge about the position of the prey. The fitness values of each position information are calculated and the solution with the best fitness value is defined as α. The solution with second best fitness value is assigned to β and the solution with third best fitness value is defined as δ. At each cycle of the search, the position of the current optimal solution can be updated according to the Eqs. (22)- (24) in order to reach to the global solution.
where, X ∝ → , X β → and X δ → represents to the initial positions of the α, β and δ.
As seen from Eq. (24), the position updated would be in a random place within a circle which is defined by the positions of α, β and δ in the search space.
Attacking phase can be simulated by reducing the distance between the gray wolf and the prey. This phase can be modeled by decreasing the value of a The pseudo code of a basic GWO algorithm can be given as the following.

Return
Step 3 for current X a → In GWO algorithm, an initial population consisting of X i pixel values each representing position vectors is randomly created. At each cycle of the algorithm the population X i is evaluated and the solution with the best pixel value range producing the highest fitness value is determined as X a → , the solution with the second best pixel value range is determined as X β → and finally the solution with the third best pixel value range is determined as X δ → . At each cycle, the values of X a → , X β → and X δ → are updated by using Eqs. (22)- (24). in order to reach lower and upper bounds of the best pixel range producing the optimal segmentation. As a result, retinal vessel segmentation is realized by using X a → which is producing the optimal fitness value.

Firefly algorithm
FA, which was proposed by Xin-She Yang, is a novel swarm intelligence based algorithm inspired by the phenomenon of bioluminescent communication behaviour of fireflies [26].
The following rules create the philosophy of the FA algorithm [26]. The brightness or light intensity of a firefly is influenced by the landscape of fitness/cost function.
The pseudo code of a basic FA can be given as the following.
In retinal vessel segmentation based on FA algorithm, the optimization task begins with creating an initial population consisting of X i pixel values in an interval limited by lowest and highest pixel values and then the fitness value of each X i is calculated. Then the fitness value of each X i is compared to that of the fitness value of new possible solution produced in the neighbourhood of the relevant X i by the following Eq. (25).
where, β represents the value of attractiveness parameter changing in the interval of [0,1], i represents the index of the selected pixel value, j represents the index of the new pixel produced as a possible solution in the neighbourhood of the ith pixel, k represents the cycle number, X i,k represents the value of ith pixel for kth cycle, X j,k represents the Among the pixel values updated at each cycle, the pixel values producing the highest fitness value for lower and upper pixel value bounds are selected and retinal vessel segmentation is applied for this interval.

Harmony search algorithm
Harmony search (HS) algorithm was developed by Geem at al. in an analogy with music improvisation process where music players improvise the pitches of their instruments to obtain better harmony [27].
The steps in the procedure of HS algorithm can be given as the following [28]. - Step 1. Initialize the problem and algorithm parameters.

-
Step 4. Update the harmony memory.
The pseudo code of a basic HS algorithm can be given as the following [28].

Find the current best estimates
While applying HS algorithm to retinal vessel segmentation, firstly, an initial population consisting of HMS position vectors within an interval limited by lowest and highest pixel values is randomly created and then these position vectors representing possible solutions are registered into the harmony memory. After calculating the fitness values for each pixel representing possible solutions, new position vectors representing possible solutions (pixel values) are created by using the equations given in the pseudo code. The fitness value of the older and the new produced position vectors are compared and the pixel value with higher fitness is transferred to the next generation by replacing with the relevant pixel value in the harmony matrix. Otherwise, the old pixel value in the harmony matrix is preserved. Thus, after a predetermined cycle number the algorithm will be reached the most appropriate lower and upper limits of the pixel interval in which the optimum retinal vessel segmentation can be performed.

Retinal vessel segmentation
The performances of the algorithms have been tested on the images taken from two public databases. The first one is the STARE (Structured Analysis of the Retina) database which is created by Hoover et al. [29]. STARE database contains 20 raw retinal images for blood vessel segmentation and 10 of these images have pathologies. All images were captured by a TopCon TRV-50 fundus camera at 35°field of view (FOV) and digitized to 700 × 605 pixels with 8 bits per color channel.
The second database used in this work is the DRIVE database which was established by Alonso-Montes et al. [30]. In DRIVE database, 20 images are employed for training and 20 images for testing. These images are captured by a CanonCR5 3CCD camera with a 45°F OV and size is 700 × 605 pixels per color channel.
The flowchart of the method applied for retinal vessel segmentation is shown in Figure 4.
At first step, due to its higher contrast the green layer of the RGB retinal image is extracted and in the next steps analysis have been carried out on this layer. Although its higher rate of contrast, it is seen from the analyses that the G layer alone is insufficient for a successful clustering. For this reason, in order to clarify the vessels and the image background and as a result of this to improve the clustering performance, two more pre-processing were applied to the retinal image. The first process is the bottom-hat filtering while the second process is the brightness correction. Retinal images obtained after the pre-processing phase are subjected to segmentation process by the ABC, PSO, DE, TLBO, GWO, FA and HS algorithms. In the segmentation phase. optimal clustering centers have been determined by using ABC, PSO, DE, TLBO, GWO, FA and HS algorithms in order to obtain the highest clustering performance and clustering has been done according to these centers.
In the clustering process cluster centers are randomly determined and all pixels are indexed to the closest cluster center. In order to index each pixel to the nearest cluster center, the Mean-Squared Error (MSE) function is used. Namely, the quality of each solution has been calculated as mean squared error. The MSE function can be expressed as the following, As seen from the equation given above, MSE error function calculates the quality of a solution based on the distance between each pixel and its related cluster center. In this equation, N represents the total number of pixels in the retinal image. Cluster centers are determined by algorithm and f i represents the value of the cluster center closest to the pixel i. Also, y i represents the pixel value of the ith pixel. By using MSE error function it is aimed to group the pixels around the appropriate cluster centers with the minimum error.
The control parameter values of the algorithms used in the simulations are given in Table 1. Figure 5 represents the resulting retinal images obtained after applying segmentation to the images Figure 1 (a) and (b) by using the clustering based ABC, PSO, DE, TLBO, GWO, FA and HS algorithms. From the figures it is seen that some regions that are not vessel but whose pixel values are close to the pixel value of the vessels seems to be segmented as vessel. There are post-processing methods in literature in order to eliminate these regions. But in this work in order to represent the pure performance of the algorithm these postprocessing methods have not been applied. In general, as seen from the figures all three algorithms show a similar performance in terms of clustering. Also it is seen that ABC, PSO, DE, TLBO, GWO, FA and HS algorithms are able to cluster close pixel values at high accuracy.

Results
The clustering performance of the algorithms for Figure 2 (a) and (b) are given in Figure 6. In these images taken from the STARE database, it is seen that the performance of the algorithms is lower than the images taken from the DRIVE database. On the other hand, the results obtained show that the clustering performance of the algorithms is similar but a bit higher when compared with the results obtained in the literature.
where the true positives (TP) represent the number of pixels that are actually vessels and have been detected as vessels.
The false negatives (FN) determine the number of pixels that are actually vessels but which have been detected as background pixels. The true negatives (TN) are the pixels classified as non-vessel and they are really background pixels. The false positives (FP) represent the number of pixels that are actually background pixel and have been detected as vessels. So the sensitivity (Se) is the ratio of correctly classified vessel pixels while specificity (Sp) is the ratio of correctly classified background pixels and accuracy (Acc) is the ratio of correctly classified both the vessels and background pixels. Finally, precision determines the number of accurately true predicted TP pixels. Table 2, Table 3, Table 4, Table 5 show the performance of the ABC, PSO, DE, TLBO, GWO, FA and HS algorithms in terms of sensitivity, specificity, accuracy and precision for 20 retinal images taken from both DRIVE and STARE  databases. These tables also contain the mean values of the results obtained for 20 retinal images. From the results obtained for Se, it is seen that each algorithm is able to reach high performance in terms of the ratio of correctly classified vessel pixels. The Sp values reached represents that each algorithm can successfully distinguish the vessel pixels and background pixels. Also, when the Acc values examined the results show that the ratio of correctly classified both the vessels and background pixels is very high for each algorithm. Finally, it is seen from the results obtained for precision that heuristic algorithms show similar and good performances in terms of correctly classified vessel pixels. In order to compare the performances of the algorithms, the mean values obtained are combined in Table 6 and Table 7. When the mean Se, Sp and Acc values obtained for the DRIVE database are compared it is seen that all the algorithms produce similar results but FA performs a bit better classification. However, the performance of DE algorithm for DRIVE database in terms of precision seems only a bit better than the other algorithms. In the retinal images taken from the STARE database, the PSO algorithm showed the best results in terms of Se, Sp and Acc. It is also seen that the algorithm with the best precision performance for the STARE database is the HS algorithm. In general, it can be said that the clustering performances of the algorithms are successful and too close to each other.

Performance analysis for DRIVE and STARE databases
A detailed statistical performance comparison has been realized between the heuristic approaches analyzed in this work and the studies published in literature. Table 8 and Table 9 contain the results obtained for DRIVE and  Table 8, the sensitivity of the heuristic algorithms is close to [33], but higher than the other approaches. On the other hand, for the STARE database, the sensitivity values produced by the heuristic algorithms are similar to [33] and [49], but higher than the other approaches. Namely, the ABC, PSO, DE, TLBO, GWO, FA and HS algorithms considerably improve the sensitivity of the clustering process for both DRIVE and STARE databases. Furthermore, the clustering performance of the heuristic algorithms in terms of the specificity seems similar with the methods published in literature for both DRIVE and STARE databases. Finally, the higher accuracy values produced by the heuristic algorithms represent that the pixel classification performance of the heuristic methods is higher than or similar to the other methods in literature. Figure 7 represents the convergence speeds of the algorithms for the retinal images given in Figure 1 (b) and Figure 2 (a). For a fair and detailed comparison, the convergence speeds have been given for both an abnormal retinal image from DRIVE database and a normal retinal image from STARE database. The evolution of best solutions has been obtained based on MSE. As seen from Figure 7 (a) which shows the convergence speeds obtained for the normal retinal image taken from DRIVE database, each algorithm needs approximately 10 cycles to converge to the optimal solution and the MSE performance of the GWO algorithm seems a bit better than the other algorithms. On the other hand, Figure 7 (b) shows the convergence speeds of the algorithms for the abnormal retinal image taken from STARE database. It is seen that each algorithm reaches to the global solution at 15 cycles. Also, the MSE performance of GWO seems a bit better than the other algorithms. According to the simulation results it is clear that convergence speeds and MSE performances of the algorithms in terms of clustering is close to each other.
Another important performance criterion for heuristic algorithms is the standard deviation which determines the stability of the algorithm. The low standard deviation indicates that the algorithm approximates similar error values at each random run. Table 10 and Table 11 contain the standard deviations obtained after 20 random runs for the retinal images taken from DRIVE and STARE databases. In addition to the standard deviation, the minimum MSE values reached by the algorithms have been shown in the tables. When the minimum MSE values reached by the algorithms are examined it is seen that all the algorithms produce similar MSE values, in other words, their clustering performance are very close to each other. On the

Discussion
In retinal image analysis it is important to improve algorithms that are flexible and capable of accurate vessel segmentation. The results obtained in this work represent that the clustering based heuristic algorithms provide satisfactory performances in retinal vessel segmentation. In this work, ABC, PSO, DE, TLBO, GWO, FA and HS algorithms which are most commonly used algorithms in literature for engineering problems are applied for the aim of retinal vessel segmentation. The analyses have been realized for both the normal and abnormal retinal images taken from the DRIVE and STARE databases. From the simulation results it is observed that all the algorithms successfully performed the retinal vessel segmentation. They produced similar MSE values and were able to converge to the global solutions. The standard deviation values reached by the algorithms have been representing that each algorithm was performing a stable behavior. On the other hand, statistical results obtained for sensitivity, specificity, accuracy and precision have been representing that the pixels with value very close to each other can successfully be distinguished with ABC, PSO, DE, TLBO, GWO, FA and HS algorithms.
The conventional gradient based algorithms may usually stuck into the local minimum and also their high dependence to the parameter values is an another disadvantage of these algorithms. Also, the nonflexible structures have been decreasing their optimization performance. Due to these disadvantages of conventional gradient based algorithms, more effective methods that can be used in image analysis have been needed. In this Research funding: No financial support was received from any person or organization in this work.
Author contributions: All authors have accepted responsibility for the entire content of this manuscript and approved its submission.
Competing interests: The authors do not have financial and personal relationships with other people or organizations that could inappropriately influence their work.