Evaluation of several initialization methods on arithmetic optimization algorithm performance

Abstract: Arithmetic optimization algorithm (AOA) is one of the recently proposed population-based metaheuristic algorithms. The algorithmic design concept of the AOA is based on the distributive behavior of arithmetic operators, namely, multiplication (M), division (D), subtraction (S), and addition (A). Being a new metaheuristic algorithm, a performance evaluation of AOA is significant to the global optimization research community and specifically to nature-inspired metaheuristic enthusiasts. This article aims to evaluate the influence of the algorithm control parameters, namely, population size and the number of iterations, on the performance of the newly proposed AOA. In addition, we also investigated and validated the influence of different initialization schemes available in the literature on the performance of the AOA. Experiments were conducted using different initialization scenarios: the first is where the population size is large and the number of iterations is low; the second is where the number of iterations is high and the population size is small; and the third is where the population size and the number of iterations are similar. The numerical results from the conducted experiments showed that AOA is sensitive to the population size and requires a large population size for optimal performance. Afterward, we initialized AOA with six initialization schemes, and their performances were tested on the classical functions and the functions defined in the CEC 2020 suite. The results were presented, and their implications were discussed. Our results showed that the performance of AOA can be influenced when the solution is initialized with schemes other than the default random numbers. The Beta distribution outperformed the random number distribution in all cases for both the classical and CEC 2020 functions. The performance of the uniform distribution, Rayleigh distribution, Latin hypercube sampling, and Sobol low discrepancy sequence is relatively competitive with the random number generator. On the basis of our experiments' results, we recommend that a solution size of 6,000, 100 iterations, and initializing the solutions with the Beta distribution will lead to AOA performing optimally for the scenarios considered in our experiments.


Introduction
Optimization techniques have been applied successfully in many real-world problems. These real-world problems are usually complex, with multiple nonlinear constraints and multimodal nature. Solving these complex, nonlinear, and multimodal problems usually requires reliable optimization techniques.
Metaheuristic algorithms are among the reliable optimization techniques that have been used to solve real-world problems in medicine, image processing, and many other fields [1,2].
Nature has been the main inspiration behind most metaheuristic algorithms. Mimicking some natural phenomena is central to these nature-inspired metaheuristic algorithms. Several classifications or taxonomies of metaheuristic algorithms exist in the literature. A common taxonomy is bioinspired and physical-based (based on physical phenomena such as physics and chemistry) [3]. Another taxonomy distinguishes swarm intelligence, evolutionary, physics-based, and human-based algorithms [4]. Metaheuristic algorithms are modeled after nature's best features, and many attribute their popularity to this fact and to their ability to find near-optimal solutions [5].
The population-based metaheuristic algorithms use stochastic methods to generate the population's location vectors. The location vectors are updated after every iteration to help find the location of the global optimum. Finding the global optima usually involves exploring and exploiting the search space. Different metaheuristic algorithms use various mechanisms to achieve exploration and exploitation. A good balance of exploration and exploitation leads to an excellent performance of the algorithm. Most metaheuristic algorithms have the advantage of being gradient-free and usually avoid being stuck in the local optima [6].
The nature and diversity of population-based algorithms play a significant role in their performances. It has also been shown by empirical observation and experiments that the finding of the global optima depends heavily on the initial starting points of the population of metaheuristic algorithms [7]. The population size and the number of iterations also contribute to the metaheuristic algorithm's performance. Some algorithms require a large population size and a small number of iterations to achieve optimality, and others require the reverse setting. Then, others require the population size and the number of iterations to be almost the same [8]. Accordingly, the relative sizes of the population and number of iterations need to be considered carefully to ensure good performance for the algorithm.
New metaheuristic algorithms are proposed daily; each is either wholly novel, improves an existing algorithm, or hybridizes two (or more) existing algorithms. Arithmetic optimization algorithm (AOA) is a recently proposed population-based metaheuristic algorithm [9]. AOA is based on the distributive behavior of arithmetic operators: multiplication (M), division (D), subtraction (S), and addition (A). A detailed description of AOA is presented in Section 2. The authors checked the algorithm's performance using 23 benchmark functions, six hybrid composite functions, and several real-world engineering design problems. The experimental results were compared against results from 11 other well-known optimization algorithms. The outcome showed that the AOA provided promising results in most cases and was very competitive in others.
As AOA is a new metaheuristic algorithm, a performance evaluation of it is needed. The motivation of this research is to evaluate the performance of AOA when the population size (solution size, as is the case here) and the number of iterations are varied. Also, the initial solutions (populations) in AOA are initialized using the random number generator. Previous studies [10][11][12] have shown that the random number generator may not be the optimal scheme for initializing the populations. We also checked the performance of AOA when initialized using various initialization schemes available in the literature. The goal is to propose the balance of population size, number of iterations, and initialization method that will lead to the optimal performance of AOA for the optimization problems considered in this study. The significant contribution of this article is the performance analysis study conducted for the newly developed AOA metaheuristic optimizer. Moreover, the specific contributions of this article are given as follows:
• We evaluate the influence of different initialization schemes on the performance of the newly proposed AOA metaheuristic optimizer by varying initialization conditions, namely, population size and the number of iterations, coupled with a sensitivity analysis test.
• Also, we evaluated the performance of AOA when the solutions are initialized using six different probability distribution initialization schemes available in the literature.
• Finally, we recommended a balance of population size, number of iterations, and probability distribution method to yield high performance for the AOA optimizer under the scenarios considered in this article.
The rest of this article is organized as follows. In Section 2, the AOA is presented and discussed. We provided the methodology, which included the experimental setup in Section 3. Section 4 covers the detailed experiments conducted and discusses the results for the proposed modified AOA optimizer. Finally, Section 5 presents the concluding remarks and future direction.

The arithmetic optimization algorithm
The main inspiration of this algorithm is the use of arithmetic operators (multiplication, division, subtraction, and addition) in solving arithmetic problems. The AOA is modeled after the rules of the arithmetic operators in mathematics. The algorithm randomly initializes the starting solutions, and the best solution of each iteration is considered the near-optimal solution. The pseudocode for AOA, adapted from ref. [9], is presented in Algorithm 1.


Exploration
The next phase after initialization is the exploration or exploitation phase. The math optimizer accelerated (MOA) function is used to determine whether the next phase will be exploration or exploitation. MOA is computed using equation (1), and it is evaluated at every iteration. Depending on the comparison of MOA and a random value r_1, AOA enters the exploration or exploitation phase, as shown in Figure 1. If r_1 > MOA, the exploration phase is activated. The highly dispersed numbers generated by the division and multiplication operators are used for the exploration phase.
MOA(C_Iter) = Min + C_Iter × ((Max − Min) / M_Iter),   (1)

where C_Iter is the current iteration, Max and Min are the maximum and minimum values of the accelerated function, respectively, and M_Iter is the maximum number of iterations. The division operator is activated if the random number r_2 < 0.5, and the multiplication operator is activated otherwise. The position vector is updated using equation (2):

x_{i,j}(C_Iter + 1) = best(x_j) ÷ (MOP + ε) × ((UB_j − LB_j) × μ + LB_j),  if r_2 < 0.5,
x_{i,j}(C_Iter + 1) = best(x_j) × MOP × ((UB_j − LB_j) × μ + LB_j),  otherwise.   (2)

As shown in Figure 1, if r_2 < 0.5, the division operator will continue to be executed until the condition fails; then, the multiplication operator will be activated. The highly dispersed numbers generated at this phase ensure that the search space is searched exhaustively for the optimal solution. The stochastic scaling factor μ ensures the randomness of the generated numbers, which correspond to the position vectors, thereby ensuring that the algorithm does not return to a previously occupied position. The value μ = 0.5 was experimentally selected by the authors.
Here, x_{i,j}(C_Iter + 1) denotes the solution's position vector at the next iteration, best(x_j) is the jth position of the current best solution, ε is a small integer, and UB_j and LB_j are the jth upper and lower bounds, respectively. Also, the math optimizer probability (MOP) is defined by equation (3):

MOP(C_Iter) = 1 − C_Iter^(1/α) / M_Iter^(1/α),   (3)

where M_Iter is the maximum number of iterations and α denotes the exploitation accuracy over the iterations. The authors set α = 5 after a series of experiments.

Exploitation
Now, if r_1 ≤ MOA, AOA enters the exploitation phase. The densely distributed (low dispersion) numbers generated by the addition and subtraction operators help the exploitation phase of the algorithm. The results of the addition and subtraction operators, modeled in equation (4), exploit the search space deeply to find a (near) optimal solution:

x_{i,j}(C_Iter + 1) = best(x_j) − MOP × ((UB_j − LB_j) × μ + LB_j),  if r_3 < 0.5,
x_{i,j}(C_Iter + 1) = best(x_j) + MOP × ((UB_j − LB_j) × μ + LB_j),  otherwise.   (4)

AOA enters the addition or subtraction phase depending on the value of the random number r_3. As shown in Figure 1, if r_3 < 0.5, the subtraction operator is activated; otherwise, the addition operator is activated. The subtraction operator continues to execute until the condition fails. The numbers generated in this phase are densely distributed and thus help the algorithm converge to the (near) optimal solution. Also, the operators at this phase, along with the carefully selected value of μ, help AOA to avoid being trapped in a local optimum.
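Putting the MOA, MOP, and the four operator rules together, the whole search loop is short. The following is a minimal, illustrative Python sketch, not the authors' MATLAB implementation; the parameter defaults are our assumptions, and we use μ = 0.499 rather than 0.5 purely so that the scaling term (UB − LB) × μ + LB does not vanish for symmetric bounds.

```python
import random

def aoa(objective, dim, lb, ub, pop_size=30, max_iter=100,
        moa_min=0.2, moa_max=1.0, alpha=5.0, mu=0.499, eps=1e-12, seed=0):
    """Minimal sketch of AOA following equations (1)-(4).

    moa_min/moa_max are the Min/Max of equation (1). Defaults and the
    clamping of candidates to the bounds are implementation choices here.
    """
    rng = random.Random(seed)
    # Step 2 of Algorithm 1: random initialization within [lb, ub]
    pop = [[lb + rng.random() * (ub - lb) for _ in range(dim)]
           for _ in range(pop_size)]
    best_x, best_f = None, float("inf")
    for x in pop:
        f = objective(x)
        if f < best_f:
            best_x, best_f = list(x), f
    scale = (ub - lb) * mu + lb  # shared scaling term of eqs. (2) and (4)
    for c_iter in range(1, max_iter + 1):
        # Equation (1): math optimizer accelerated (MOA) function
        moa = moa_min + c_iter * ((moa_max - moa_min) / max_iter)
        # Equation (3): math optimizer probability (MOP)
        mop = 1.0 - (c_iter ** (1.0 / alpha)) / (max_iter ** (1.0 / alpha))
        for i in range(pop_size):
            cand = []
            for j in range(dim):
                r1, r2, r3 = rng.random(), rng.random(), rng.random()
                if r1 > moa:          # exploration, equation (2)
                    if r2 < 0.5:      # division operator (D)
                        v = best_x[j] / (mop + eps) * scale
                    else:             # multiplication operator (M)
                        v = best_x[j] * mop * scale
                else:                 # exploitation, equation (4)
                    if r3 < 0.5:      # subtraction operator (S)
                        v = best_x[j] - mop * scale
                    else:             # addition operator (A)
                        v = best_x[j] + mop * scale
                cand.append(min(max(v, lb), ub))  # clamp to bounds
            pop[i] = cand
            f = objective(cand)
            if f < best_f:            # track the best solution found so far
                best_x, best_f = cand, f
    return best_x, best_f
```

On a simple sphere function, aoa(lambda x: sum(t * t for t in x), dim=5, lb=-10.0, ub=10.0) drives the objective close to zero, illustrating how the shrinking MOP narrows the search around the best solution.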

Discussion
AOA is an interesting population-based metaheuristic algorithm that harnesses the powers of the basic arithmetic operators in solving arithmetic problems. AOA uses division (D) and multiplication (M) for exploration, and addition (A) and subtraction (S) for exploitation. Figure 2 shows how the four operators behave. We can see from the direction of the arrowheads that the high dispersion of the numbers generated by the D and M operators makes them unable to converge toward the target solution; however, it gives the algorithm its exploratory ability to scan the search space effectively. Similarly, the direction of the arrowheads of the A and S operators shows that the densely generated numbers make it possible to converge at the target location. As shown in Figure 2, the operators work complementarily to enable convergence. The starting solutions of the original AOA are initialized using the random number generator. The solution size and number of iterations are fixed, and the results are obtained for both test functions and some engineering problems. Research abounds in the literature showing that the performance of population-based metaheuristic algorithms is greatly influenced when they are initialized with, say, low discrepancy sequences such as Sobol, Faure, Halton, and Van der Corput [12][13][14]. The low discrepancy sequences are, however, known to falter as the dimension gets higher. The population of metaheuristic algorithms has also been initialized using Lévy flight [15,16]. Other authors have used chaos theory to improve the diversity of the population of metaheuristic algorithms [17,18]; however, chaos-based algorithms suffer from high computational complexity. Hybrids with other metaheuristic algorithms have also been used to improve population diversity [19,20]. The success of this approach depends heavily on the relevant authors' experience, and time complexity may also be an issue here.
Probability distributions, varying population sizes, and the number of iterations have also been shown to affect the performance of metaheuristic algorithms [8]. Relevant literature elaborated on the performance of the AOA when applied to test functions and some engineering problems. However, the effect of solution size, the number of iterations, and other initializing methods have not been discussed. Therefore, in our work, a performance study of AOA is undertaken in terms of its reaction to solution size, the number of iterations, and other initialization methods. Results of the analysis are presented and discussed in Section 4.

Methodology
In this section, we describe in detail all the steps we took to achieve our objectives. The experimental setup is fully described, the range of values for the solution size and number of iterations are given, and the initialization schemes are also discussed.

Experimental setup
The AOA was implemented using MATLAB R2020b and run on Windows 10 OS with an Intel Core i7-7700@3.60 GHz CPU and 16 GB RAM. The number of function evaluations was set at 15,000, and the number of independent runs was set at 30. We tested our work using 23 classical test functions (Table 1), consisting of a wide variety of unimodal, nonseparable multimodal, and multidimensional problems with varying numbers of local optima. We also tested our work using the benchmark functions defined in CEC 2020 (Table 2). The suite consists of ten functions that are specially designed to make finding the global optimum difficult. The AOA parameters μ and α are set at 0.5 and 5, respectively. To test the effect of solution size and the number of iterations on AOA, we carefully selected a range of numbers that reflect three scenarios. The first scenario is a large solution size and a small number of iterations. In the second scenario, we considered a small solution size and a large number of iterations. The last scenario has an almost equal solution size and number of iterations. Fourteen cases were considered, each consisting of a pair of solution size and number of iterations. Table 3 presents the range of numbers selected for the solution size and the number of iterations. The experiment was carried out for each case using the CEC 2020 test suite, and the results are presented and discussed in Section 4. The best-performing population size and iteration number were then used for the subsequent experiments discussed in Section 3.2.

Initialization methods
We also selected six different initialization schemes available in the literature to test how they would affect the performance of AOA. The initialization methods consist of two variants of beta distribution, a uniform distribution, Rayleigh distribution, Latin hypercube sampling, and Sobol low discrepancy sequence. A detailed description is given in Section 3.2.1.

Beta distribution
The beta distribution is defined over the interval (0, 1), with the probability density function

f(x; a, b) = x^(a−1) (1 − x)^(b−1) / B(a, b),  0 < x < 1,

where B(a, b) is the beta function. It can be written as X ∼ Be(a, b). For our work, we varied the values of a and b to obtain two variants of the beta distribution, which we then used to generate position vectors for the AOA initial solutions. The variants used in our experiments are betarnd(3,2) and betarnd(2.5,2.5).
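In Python, the MATLAB call betarnd(3,2) corresponds to random.betavariate(3, 2). A hypothetical initializer filling a population this way might look as follows; rescaling the (0, 1) beta samples to the search bounds is our assumption about how such an initializer would be wired into AOA.

```python
import random

def init_beta(pop_size, dim, lb, ub, a=3.0, b=2.0, seed=0):
    """Draw an initial population whose coordinates follow Be(a, b),
    rescaled from (0, 1) to the search bounds [lb, ub]."""
    rng = random.Random(seed)
    return [[lb + rng.betavariate(a, b) * (ub - lb) for _ in range(dim)]
            for _ in range(pop_size)]
```

Since Be(3, 2) has mean 0.6, the rescaled coordinates cluster slightly above the middle of the range, unlike the symmetric spread of plain uniform sampling.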

Uniform distribution
The probability density function of the uniform distribution over an interval [a, b] can be defined as follows:

f(x; a, b) = 1 / (b − a) for a ≤ x ≤ b, and f(x; a, b) = 0 otherwise.

It is usually written as X ∼ U(a, b). We used unifrnd(0,1) in this research to generate position vectors for the AOA initial solutions.

Rayleigh distribution
The Rayleigh distribution [21] with scale parameter σ > 0 has the probability density function

f(x; σ) = (x / σ²) exp(−x² / (2σ²)),  x ≥ 0.

It can be written as X ∼ Rayleigh(σ). We used raylrnd(0.4) for our experiments.
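Python's standard library has no direct analogue of raylrnd, but a Rayleigh sampler follows from inverse-transform sampling: if U ∼ U(0, 1), then σ·sqrt(−2 ln(1 − U)) is Rayleigh(σ). A small sketch is given below; clipping the samples at 1 so they can serve as initial positions in the unit range is our assumption (Rayleigh(0.4) occasionally exceeds 1).

```python
import math
import random

def rayleigh_sample(sigma, rng):
    """Inverse-transform sampling: F^{-1}(u) = sigma * sqrt(-2 ln(1 - u))."""
    u = rng.random()
    return sigma * math.sqrt(-2.0 * math.log(1.0 - u))

def init_rayleigh(pop_size, dim, sigma=0.4, seed=0):
    """Initial positions in [0, 1]: Rayleigh(sigma) draws, clipped at 1
    (the clipping is our assumption, not part of the distribution)."""
    rng = random.Random(seed)
    return [[min(rayleigh_sample(sigma, rng), 1.0) for _ in range(dim)]
            for _ in range(pop_size)]
```

With σ = 0.4, the samples have mean σ·sqrt(π/2) ≈ 0.50, so the initial positions concentrate around the middle of the unit interval with a heavier lower tail.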

Latin hypercube sampling
Latin hypercube sampling (LHS) can fill the search space spatially and produce samples that reflect the underlying distribution. A grid is created in the search space by dividing each dimension into equal interval segments and generating random points within the interval [22]. We used a MATLAB function that generates LHS sequences for our experiments.
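MATLAB's lhsdesign can be mimicked with a short stand-alone routine. This sketch implements only the basic stratify-and-shuffle construction described above, without the maximin criterion that lhsdesign also offers:

```python
import random

def lhs(n_samples, dim, seed=0):
    """Latin hypercube sampling in the unit cube: each dimension is cut
    into n_samples equal strata, one random point is drawn per stratum,
    and the strata are independently shuffled across dimensions."""
    rng = random.Random(seed)
    cols = []
    for _ in range(dim):
        # one point per stratum [k/n, (k+1)/n)
        col = [(k + rng.random()) / n_samples for k in range(n_samples)]
        rng.shuffle(col)
        cols.append(col)
    # transpose: one row per sample
    return [[cols[j][i] for j in range(dim)] for i in range(n_samples)]
```

By construction, each dimension of the resulting sample contains exactly one point in every stratum, which is the space-filling property the section describes.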

Sobol low discrepancy sequence
The Sobol sequence [23] is a low discrepancy sequence constructed over the binary field F_2. Starting from the coefficients a_i of the qth primitive polynomial over F_2, a set of direction numbers is generated through a linear recurrence, and the jth component of the nth point, X_n^j, is obtained by XOR-ing the direction numbers selected by the binary digits of n. We generated the Sobol sequence using a MATLAB function for our experiments.
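To give a flavor of how such sequences spread points evenly, note that in the standard construction the first dimension of the Sobol sequence reduces to the base-2 van der Corput sequence, which is easy to write down; a full multi-dimensional Sobol generator needs tabulated direction numbers and is best left to a library.

```python
def vdc_base2(n):
    """Radical inverse of n in base 2: reflect the binary digits of n
    about the binary point, e.g. 6 = 110_2 -> 0.011_2 = 0.375.
    The first Sobol dimension coincides with this sequence."""
    x, denom = 0.0, 1.0
    while n:
        denom *= 2.0
        x += (n & 1) / denom
        n >>= 1
    return x
```

Successive values (0.5, 0.25, 0.75, 0.125, ...) repeatedly bisect the gaps left by earlier points, which is exactly the low discrepancy property the initialization scheme relies on.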
To incorporate these initialization schemes into the AOA, we modified step 2 in Algorithm 1. Instead of initializing the solutions using the rand function, we used the following functions, as shown in Figure 3: betarnd(3,2), betarnd(2.5,2.5), unifrnd(0,1), raylrnd(0.4), lhsdesign( ), and sobol( ). Algorithm 2 shows the modifications, which gave rise to six variants of AOA, each initialized with one of the functions. The box with the star in Figure 3 shows the part of the flowchart that we modified; only the solutions' initial positions were affected. The rest of the algorithm runs as described by the original authors. Each of the six resultant variants is compared with the original AOA, and the results are presented.
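The swap described above amounts to replacing a single sampler call in step 2. A hypothetical dispatch table makes the per-coordinate variants explicit; the keys mirror the MATLAB calls used in this article, the Rayleigh draw uses the identity Rayleigh(σ) = Weibull(σ√2, 2), and the population-level LHS and Sobol schemes are omitted here for brevity.

```python
import random

def make_samplers(seed=0):
    """Map scheme name -> a sampler returning one coordinate in [0, 1].
    Illustrative stand-ins for the MATLAB calls, not the original code."""
    rng = random.Random(seed)
    return {
        "rand":             rng.random,                        # original AOA
        "betarnd(3,2)":     lambda: rng.betavariate(3.0, 2.0),
        "betarnd(2.5,2.5)": lambda: rng.betavariate(2.5, 2.5),
        "unifrnd(0,1)":     lambda: rng.uniform(0.0, 1.0),
        # Rayleigh(sigma) == Weibull(scale=sigma*sqrt(2), shape=2), clipped at 1
        "raylrnd(0.4)":     lambda: min(rng.weibullvariate(0.4 * 2 ** 0.5, 2.0), 1.0),
    }

def init_population(scheme, pop_size, dim, lb, ub, seed=0):
    """Step 2 of Algorithm 1/2: fill the population with the chosen scheme,
    rescaled from [0, 1] to [lb, ub]."""
    draw = make_samplers(seed)[scheme]
    return [[lb + draw() * (ub - lb) for _ in range(dim)]
            for _ in range(pop_size)]
```

Because only the initializer changes, every variant is directly comparable: the search loop, parameters, and termination condition stay identical across the seven algorithms being ranked.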

Results and discussion
In this section, we present and discuss all the results of our experiment. The results are reported using the following performance indicators: best, worst, mean, standard deviation, and the algorithm mean runtime. The statistical analyses of the results were carried out using Friedman's test.

Influence of solution size and number of iterations
This experiment sets out to determine whether there is any significant effect due to the number of solutions and the maximum number of iterations the algorithm uses. Studies in the existing literature have shown that different population sizes and numbers of iterations can significantly influence the performance of metaheuristic algorithms [24]. The results of the experiment are presented in Table 4. The results from Friedman's test presented in Table 5 show a significant influence (with a p value of 0.000, which is less than the significance level of 0.05) when the solution size and the number of iterations vary. We also see that the lowest mean ranking occurred when the solution size was 6,000 and the number of iterations was 100. This is closely followed by a solution size of 3,000 with 200 iterations and then a solution size of 2,000 with 300 iterations. We see that the best results are returned when the solution size is the largest. This means that AOA depends heavily on the number of solutions, in which case it manages to find the optimal solution with a small number of iterations. So, for the remaining experiments, we used a solution size of 6,000 and set the number of iterations to 100 to obtain the results presented in Section 4.2.

Initialization methods
We used the beta distribution, uniform distribution, Rayleigh distributions, the Latin hypercube sampling, and Sobol low discrepancy sequence to initialize the AOA solutions. The goal was to find out if any of the different initialization procedures led to a significant improvement in the performance of AOA. We carried out the experiments using both the classical test functions and the CEC2020 test. The results obtained are presented in Tables 6 and 8.

Classical test functions
A quick look at Table 6 shows that all the AOA variants found the global optima for F1-F4, F9, and F11. For F5, all the variants failed to find the global optimum; however, the variant initialized with unifrnd(0,1) performed better than all the other variants. We see that one or more modified AOAs outperformed the original AOA (initialized with rand( )) for most functions. The modified AOAs' "best" values are closer to the global optimum than the original AOA's. Looking at the deviation from the mean, we see that the variants have lower values for "Stand. dev.," which means they are more stable and closer to the "best" value. The variant initialized with betarnd(3,2) seems to have the lowest mean runtime for most functions. The performance of all the variants is the same for F10 and F16-F19, as indicated by the values returned for all the metrics considered.
A Friedman's ranking test was carried out to better understand the results obtained, and the results are presented in Table 7. The p value is 0.039, which is less than 0.05 (the significance level); this means that there is a significant difference between the respective means compared. AOA initialized with unifrnd(0,1) is ranked first because it has the lowest mean rank. The original AOA initialized with rand is ranked joint third with betarnd(3,2); this clearly shows that the performance of AOA is greatly influenced when it is initialized with other methods, bearing in mind the solution size and the number of iterations.
The convergence curves for the classical functions (F1-F10) are shown in Figure 4. All the AOA variants exhibited similar behavior for these functions. However, our proposed variants showed smoother curves than the original AOA in most cases. The nature of the curves exhibited by all the variants for F1-F4 indicates that the algorithms did not concentrate the search positions around the best result at any particular iteration, which means proper scouring of the search space. Similar behavior is noticed for F9 and F10; however, there is some aggregation around the best solution after the 400th iteration. Early convergence around the best result can be observed for F5-F8; this can be attributed to the inability of the algorithms to find the global optimum. The best run value, however, is close to the optimum.

CEC 2020 test suite
The benchmark test functions defined in CEC 2020 suite are designed to make finding the global optima a challenging task. The seven variants of AOA were tested on this suite, and the results are presented in Table 8.
All the chosen variants of AOA were able to find the global optimum for F6 and failed to find the global optima for F1, F3, F5, and F7-F10. However, even when all the variants fail to find the global optimum, the "best" values are still significantly close to it. For F2, two variants (betarnd(3,2) and rand) were able to find the global optimum. Friedman's test results given in Table 9 show that the lowest mean rank is for AOA initialized with betarnd(3,2), which means it ranked first. The original AOA initialized with rand is ranked joint fourth; this shows that initializing AOA with these schemes significantly influenced the algorithm's performance.
We also looked at the convergence behavior of the AOA variants under study; the results are shown in Figure 5. We see clearly that the algorithms displayed unsteady convergence for most of the functions, which can be attributed to the nature and design of the functions. However, for F1, F4, and F7, a steady convergence curve is observed. The algorithms could quickly converge to the optimum (F2 and F6) and near-optimum for the remaining functions despite this unsteadiness. Although one can see that there is premature convergence for F2, F3, and F8, this can be attributed to the algorithm's ability to find optimal or near-optimal solutions early because of the initial positions of the solutions in the search space. It is also clear from Figure 5 that in most cases, the other variants converged steadily and searched the solution space more effectively than did the original AOA.

Conclusion
This article tested the influence of solution size, the number of iterations, and other initialization schemes on AOA. AOA is a new metaheuristic algorithm based on the behavior of arithmetic operators in mathematical calculations. We started by testing how AOA was affected when the solution size and the number of iterations were varied. Experiments were conducted on three scenarios: where the solution size was large and the number of iterations small, the number of iterations large and the solution size small, and where the solution size and the number of iterations were similar. The results showed that AOA is sensitive to solution size, which must be large for optimal performance. We then initialized AOA with six initialization schemes, and their performances were tested on the classical functions and the functions defined in the CEC 2020 suite. The results were presented, and their implications were discussed.
Our results showed that the performance of the AOA was influenced when the solution was initialized with schemes other than random numbers. The beta distribution outperformed the random number distribution in all cases of both the classical and CEC 2020 functions. The performances of the uniform distribution, Rayleigh distribution, Latin hypercube sampling, and Sobol low discrepancy sequence were on a par with the random number. On the basis of our experimental results, we recommend that setting a solution size of 6,000, using 100 as the number of iterations, and initializing the solutions with the beta distribution will lead to the AOA performing optimally. We agree that the initial population's distribution will play a less significant role in the algorithm's performance for high-dimension problems. However, with the difficulty of finding global optima for most real-world problems, anything that can increase the algorithms' ability to converge at the global optimum is worthwhile in the field of metaheuristic optimization.
Funding information: This work was supported by the Tertiary Education Trust Fund under Grant TETF/ES/UNIV/NASARAWA/TSAS/2019.

Conflict of interest:
The authors declare no conflict of interest.