Automatic Generation and Optimization of Test case using Hybrid Cuckoo Search and Bee Colony Algorithm

Abstract: Software testing is an essential technique for producing fault-free software, and it consumes approximately 60% of the cost and time of the software development life cycle. It is the process of executing a program or application to detect software bugs. Test case generation identifies test data that satisfy the software testing criteria; it is a vital part of software testing, and test cases can be derived from the user requirements specification. An automatic test case generation technique produces the test cases or test data automatically by means of a search-based optimization method. In this paper, a hybrid Cuckoo Search and Bee Colony Algorithm (CSBCA) is used to optimize the test cases and to generate path coverage within minimal execution time. The performance of the proposed CSBCA is compared with that of existing methods such as Particle Swarm Optimization (PSO), Cuckoo Search (CS), Bee Colony Algorithm (BCA), and Firefly Algorithm (FA).


Introduction
Nowadays, software is widely used in applications such as home appliances, banking, nuclear power plants, automobiles, telecommunications, and medical devices [1,2]. Software testing is essential for removing bugs and defects and for improving software quality. Software quality is estimated using several factors such as reliability, efficiency, functionality, and testability [3]. Among these factors, reliability is the most significant because it indicates how consistently the software tolerates failures during its lifetime [4]. Software reliability is defined as the successful running of the system without error over a particular time period. To make a system more efficient, with fewer errors and less maintenance, software reliability needs to be predicted and estimated using recent techniques and methodologies. Earlier research focused on reducing complexity, but the failure rate of the resulting systems remained high. It is difficult to compute the best cost over a large search area with a population of many randomly moving components [5,6]. At present, software testing takes considerable time and cost and makes software development expensive. Automatic test case generation assists the software testing engineer and saves time, and the cost of testing decreases as the testing time is reduced [7]. However, much software is delivered without enough testing because of marketing pressure and the aim of saving testing time and cost, and delivering software without sufficient testing may lead to loss of revenue [8,9].
Software testing is an essential technique and is very helpful for software developers. Several existing research works have been implemented to improve software quality. The traditional PSO optimization technique has been used in structural software testing to generate test data. During the search, the PSO algorithm generates program input data with fitness values as high as possible; it produces path coverage data, and the search direction of the next iteration depends on the coverage data of the previous iteration [10][11][12]. The fuzzy clustering method is used to reduce both the testing period and the number of test cases. It groups similar test cases into clusters, which helps to detect redundant test cases. This methodology uses cyclomatic complexity to define full-fledged conditional coverage, but it is expensive [13]. Precise prediction of software cost is very difficult during the initial stage of a software development project. Fuzzy logic makes it easier to handle the terms and conditions of the software at that stage, and a combination of fuzzy logic and the CS algorithm has been used for software cost prediction with an accurate prediction rate [14]. Another existing method for optimal software testing and maintenance policy is the Neural Network based Model (NNM). The environmental factors of software development are classified into two classes: the testing environment and the operational environment. The NNM technique calculates the optimal software testing period and maintenance limit by reducing the software cost, but it takes more time when the number of test cases is high [15]. In this paper, test cases for the ATM withdrawal operation are generated from a combinational system diagram graph that merges the state chart and sequence diagrams into the State Chart and Sequence Diagram Graph (SCSEDG).
The test cases are optimized through the proposed CSBCA, which maximizes path coverage within minimal execution time. The developed method is compared with other existing methods and hybrid test case generation techniques to analyze its performance.
The paper is organized as follows. Recent research on test case optimization is described in Section II. The proposed system, which generates and optimizes test cases using the CSBCA method, is described in Section III. Section IV presents comparative experimental results for the proposed and existing test case optimization techniques. Section V explains the ATM withdrawal case study and the potential paths generated from the SCSEDG system graph. Finally, the conclusion is given in Section VI.

Literature Review
Test case generation involves code coverage and optimal test case selection. Recent research related to optimal test case generation and code coverage is reviewed in this section.
L.S. de Souza, et al. [16] presented two strategies for test case selection, namely Binary Multi-Objective PSO with Crowding Distance and Roulette Wheel (BMOPSO-CDR) and BMOPSO-CDR combined with Harmony Search (BMOPSO-CDRHS). The goal of test case selection was to find the relevant subset of test cases subject to the test tolerability condition. The BMOPSO-CDRHS algorithm showed the better performance because its execution cost was low. However, the hybrid strategies were investigated on only a small number of programs.
M. Boopathi, et al. [17] proposed a hybrid technique that combines Markov chain and Artificial Bee Colony (ABC) optimization to achieve software code coverage. A number of paths were generated using Linear-Code-Sequence-And-Jump (LCSAJ) coverage, which was employed to reduce the number of independent paths relative to the paths generated by original path testing. The quality of the test cases was enhanced in each iteration of the ABC optimization, which determined the sequences of complete LCSAJ independent paths in the software code. However, the test tolerability and reliability of different kinds of critical software were difficult to calculate through ABC optimization with mutation testing.
M. Mann, et al. [18] presented Hybrid Test Language Processing (HTLP), a keyword-related, data-driven hybrid method that automatically executes the functional test cases for the System Under Test (SUT). Test optimization and software scalability were assessed by performing regression testing of the SUT. The testing time of an iterative regression cycle with HTLP was less than that of manual testing, and a high level of test suite optimization was achieved by using the framework. However, as the size of the application increased, with many different modules interacting with each other, the saving in human effort and time became increasingly trivial.
M. Khari, et al. [19] proposed an improved automated tool built around the two essential components of software testing: test suite generation and test suite optimization. The proposed method provided a minimal set of test cases with maximum path coverage compared to other algorithms, and automated fault detection was implemented by generating an optimal test suite. However, the automated testing model included only the ABC and CS algorithms as separate components; ABC and CS should be hybridized for better results in every aspect, and the algorithm required a large amount of input.
R.K. Sahoo, et al. [20] proposed a hybrid BCA for generating and optimizing test cases from combinational UML diagrams. The objective of the hybrid BCA was to optimize the test cases and to generate path coverage within minimal execution time. It gave better results than the particle swarm and bee colony algorithms, took less time to choose the best test path, and was more capable and reliable for software development. However, the method was unable to enhance test case or test data generation for large programs, and the approach consumed a large amount of time.
Khari and Kumar [21] reviewed research related to software testing and discussed the issues of pointer position. The review [21] also stated that the execution time of software testing can be decreased by using non-functional testing. Marculescu, et al. [22] developed a prototype of Interactive Search-Based Software Testing and examined it on an industrial system under test; the performance of the software testing was low. Chen, et al. [23] performed software testing based on a multi-objective optimization algorithm with a large initial population; the efficiency of the method was lower than that of search-based software testing methods. Malhotra and Khari, et al. [24] investigated heuristic search-based software testing methods and stated that these methods suffer from computational complexity and low feasibility. Khari and Kumar [25] applied the cuckoo search algorithm to find the optimal test case for software testing. This method performs better than the hill climbing algorithm, but its computational complexity needs to be reduced. Khari and Kumar [26] proposed the cuckoo algorithm to reduce the complexity of software testing. This method proved to be more cost-effective and time-effective than the Firefly algorithm, but its effectiveness needs to be improved. Khari and Kumar [27] proposed the Artificial Bee Colony (ABC) algorithm for the software testing process. The ABC method is cost-effective and time-effective, but its feasibility in software testing is low. Malhotra and Khari [28] proposed a mutated ABC algorithm that selects the minimum test suite for testing; the flexibility of the method needs to be improved.
To overcome the limitations addressed above, a CSBCA is implemented to produce automatic and optimized test cases with less execution time.

Proposed Methodology
Model-driven testing is a method that uses a behavioral model to encode the system's behavior under particular terms and conditions. The model includes a group of objects that are expressed by variables and object relationships. This research work obtains automated, optimized test cases or test data, together with the potential test paths, from combinational system graphs. The hybrid CSBCA optimization technique is proposed for producing and improving test cases from combinational UML diagrams. Here, the ATM cash withdrawal operation is considered for generating the test cases using the SCSEDG, and the test cases are optimized with the help of CSBCA. The objective of the proposed method is to optimize the test cases and to generate them within minimal execution time. The proposed method also takes minimal time to select the best path, and it is more reliable for software development.

Conversion of State Chart Diagram to State Chart Diagram Graph
A state chart diagram is a UML diagram that describes the behavior of a software system over time. It mainly consists of transitions between states. A state chart diagram represents the different states, the events, and the effects that change the state. Figure 1 shows the state chart diagram and the state chart diagram graph for the withdrawal task of an ATM. Table 1 gives the dependency table for the overall operation of the ATM shown in the state chart diagram graph.

Conversion of Sequence Diagram to Sequence Diagram Graph
A sequence diagram explains how the objects interact with each other for a particular test scenario. Figure 2 shows the SCSEDG for the withdrawal task of an ATM. Table 2 gives the dependency table for the ATM withdrawal operation shown in the sequence diagram graph.
Based on its activities, the sequence diagram is converted into a graph that shows the function. Each activity symbol is represented as a node in the graph and is connected to the other activities. The activity symbols are listed in Table 2, which provides the information for each activity.
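The conversion of a diagram into a graph can be illustrated with a minimal sketch. It assumes an adjacency-list representation and enumerates the candidate test paths with a depth-first search. The node names below are hypothetical and are not taken from Table 1 or Table 2; the real SCSEDG would use the activity symbols from those tables.

```python
from typing import Dict, List

# Hypothetical activity graph for a withdrawal scenario (illustrative names only).
GRAPH: Dict[str, List[str]] = {
    "start": ["insert_card"],
    "insert_card": ["enter_pin"],
    "enter_pin": ["pin_ok", "pin_rejected"],
    "pin_ok": ["enter_amount"],
    "enter_amount": ["dispense_cash", "insufficient_balance"],
    "pin_rejected": ["end"],
    "insufficient_balance": ["end"],
    "dispense_cash": ["end"],
    "end": [],
}


def enumerate_paths(graph: Dict[str, List[str]], source: str, target: str) -> List[List[str]]:
    """Return every simple path from source to target using depth-first search."""
    paths: List[List[str]] = []
    stack = [(source, [source])]
    while stack:
        node, path = stack.pop()
        if node == target:
            paths.append(path)
            continue
        for successor in graph.get(node, []):
            if successor not in path:  # skip nodes already on the path (no cycles)
                stack.append((successor, path + [successor]))
    return paths


paths = enumerate_paths(GRAPH, "start", "end")
for p in paths:
    print(" -> ".join(p))
```

Each enumerated path is a candidate test case; an optimizer such as CSBCA would then rank these candidates by fitness.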

Generation and Optimization of test cases
After the SCSEDG graph is generated, the next stage generates and improves the test cases. Different types of metaheuristic evolutionary methods can be utilized for optimization; here, the proposed CSBCA method is used to optimize the test cases. The test coverage criteria are calculated from the number of elements covered by the test cases, and the number of generated test cases is reduced. The test cases are given as input to the hybrid method, in which the cuckoo search analyzes them. Each test case is treated as a food source in the cuckoo search, and the optimal test case is found using the Levy flight. The output of the hybrid method is the optimal test case with low execution time. The flowchart of test case generation using the proposed method is presented in Figure 3.
At first, the population size and the number of test case generations (iterations) are fixed based on the number of test cases. The number of test cases is taken as the initial population size of the method, and the fitness value is given in equation (1). The ATM test cases are given as input to the hybrid method, which finds the optimal test case. Then, a preliminary population is randomly generated, and the corresponding fitness values are estimated and stored. The best initial values are estimated. After that, the candidate solutions are ordered by their fitness values; the maximum fitness values represent the solutions nearest to optimality. After sorting, the bottom half, which contains the poorer solutions, is rejected, and the top half, which contains the best solutions, is retained. These solutions then undergo two phases of optimization: CS and BCA.
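The initialization and selection steps described above can be sketched as follows. This is only an illustration under stated assumptions: solutions are modelled as lists of floats, and `toy_fitness` is a hypothetical placeholder that does not reproduce the paper's fitness function.

```python
import random
from typing import Callable, List

Solution = List[float]


def random_population(size: int, dim: int, low: float, high: float,
                      rng: random.Random) -> List[Solution]:
    """Generate `size` random candidate solutions with `dim` components each."""
    return [[rng.uniform(low, high) for _ in range(dim)] for _ in range(size)]


def keep_top_half(population: List[Solution],
                  fitness: Callable[[Solution], float]) -> List[Solution]:
    """Sort by descending fitness and keep the better half (at least one solution)."""
    ranked = sorted(population, key=fitness, reverse=True)
    return ranked[:max(1, len(ranked) // 2)]


def toy_fitness(x: Solution) -> float:
    """Hypothetical placeholder fitness in (0, 1]; NOT the paper's fitness function."""
    return 1.0 / (1.0 + sum(v * v for v in x))


rng = random.Random(0)  # fixed seed so the sketch is repeatable
population = random_population(size=10, dim=2, low=-5.0, high=5.0, rng=rng)
survivors = keep_top_half(population, toy_fitness)
```

The `survivors` list is what would be passed on to the CS and BCA phases.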

Phase 1: Cuckoo Search
CS is inspired by the brood parasitism of several cuckoo species. In the CS algorithm, the Levy flight technique is employed instead of simple isotropic random walks to enhance the search. Each egg in a nest signifies a solution, and a cuckoo egg indicates a novel solution. The objective of the CS method is to employ the new, better solutions to replace the poor solutions in the nests. All solutions are evaluated with the fitness function to obtain their fitness values. A new solution $x_i^{t+1}$ is estimated in equation (1),
$x_i^{t+1} = x_i^{t} + \alpha \oplus \mathrm{Levy}$ (1)
where $\alpha$ is the control parameter and $t$ is the iteration number. Levy is the search vector; it is similar to a random walk but faster. Here $\alpha > 0$ is the step size, which should be related to the scale of the problem of interest; in most cases $\alpha = 1$. In general, a random walk is a Markov chain whose next location or status depends on the present location (the first term in equation (1)) and the transition probability (the second term). The product $\oplus$ denotes entry-wise multiplication. This entry-wise product is similar to the one employed in the PSO technique, but the Levy flight search vector explores the search space more effectively because its step length is much longer in the long run.
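A minimal sketch of one cuckoo move is given below, assuming the commonly used update in which the new solution equals the current solution plus a step-size parameter times a Levy-distributed step for each component. The Levy steps are drawn with Mantegna's algorithm, which is an assumption of this sketch; the paper does not state which sampler was used. The names `beta`, `cuckoo_move`, and `keep_better` are illustrative.

```python
import math
import random
from typing import Callable, List


def mantegna_sigma(beta: float) -> float:
    """Scale of the numerator Gaussian in Mantegna's algorithm (beta in (0, 2])."""
    num = math.gamma(1 + beta) * math.sin(math.pi * beta / 2)
    den = math.gamma((1 + beta) / 2) * beta * 2 ** ((beta - 1) / 2)
    return (num / den) ** (1 / beta)


def levy_step(beta: float, rng: random.Random) -> float:
    """Draw one heavy-tailed step length: u / |v|^(1/beta)."""
    u = rng.gauss(0.0, mantegna_sigma(beta))
    v = rng.gauss(0.0, 1.0)
    while v == 0.0:  # guard against division by zero
        v = rng.gauss(0.0, 1.0)
    return u / abs(v) ** (1 / beta)


def cuckoo_move(x: List[float], alpha: float, beta: float,
                rng: random.Random) -> List[float]:
    """Entry-wise update: each component gets its own Levy step scaled by alpha."""
    return [xi + alpha * levy_step(beta, rng) for xi in x]


def keep_better(nest: List[float], candidate: List[float],
                fitness: Callable[[List[float]], float]) -> List[float]:
    """Replace the nest only if the new egg has a higher fitness."""
    return candidate if fitness(candidate) > fitness(nest) else nest
```

Because Levy steps are heavy-tailed, most moves are small while occasional moves are very long, which is what lets the search escape local optima.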

Phase 2: Bee Colony Algorithm
The BCA has two phases: the employed bee phase and the onlooker bee phase. The major responsibility of the employed bee phase is to verify whether the fitness value of a new solution is better than that of the earlier solution. If the new solution is better, the old solution is replaced by the new one. Then, still in the employed bee phase, the fitness value of every candidate solution is calculated. In the onlooker bee phase, a candidate solution whose relative value is less than a specific constant "pa" is discarded from memory and replaced with a newly generated random solution. The results gained from both phases are merged. Again, all candidate solutions are sorted, and the bottom half, containing the worst solutions, is discarded and replaced with a copy of the top half, containing the best solutions. The top-half solutions again undergo the two phases, and the program repeats until the termination criterion is satisfied. The best solution found so far is taken as the optimal solution. The mathematical description of the BCA is presented below.
The new solution is calculated using equation (2), where $x(j)$ represents a candidate solution at the $j$-th level and $ebf$ is a random number in the range [-1, +1]. The probability of occurrence of each candidate solution is calculated using equation (3),
$prob(j) = fx(j) / tfx$ (3)
where $prob(j)$ is the probability factor, $fx(j)$ is the fitness function value, and $tfx$ is the total fitness value of all candidate solutions. In the onlooker bee phase, the solutions having a probability greater than a random value in the range [0, 1] are selected, and they are improved with the help of equation (4),
$v(j) = x(j) + ebf \cdot x(jj)$ (4)
where $ebf$ is a random number in the range [-0.1, +0.1] and $x(jj)$ is indicated as the improved solution.
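The onlooker bee step can be sketched as follows. The sketch assumes that the probability factor of a solution is its fitness divided by the total fitness, which is how the symbol definitions above read, and it applies the update v(j) = x(j) + ebf · x(jj) with ebf drawn from [-0.1, +0.1]. Two further assumptions are made purely for illustration: x(jj) is taken to be a solution picked at random from the population, and an updated solution is kept only if its fitness improves.

```python
import random
from typing import Callable, List

Solution = List[float]


def selection_probabilities(fitness_values: List[float]) -> List[float]:
    """prob(j) = fx(j) / tfx, where tfx is the total fitness of all candidates."""
    total = sum(fitness_values)
    if total <= 0:
        raise ValueError("total fitness must be positive")
    return [f / total for f in fitness_values]


def onlooker_update(x: Solution, x_jj: Solution, rng: random.Random) -> Solution:
    """v(j) = x(j) + ebf * x(jj), with one ebf in [-0.1, 0.1] per solution."""
    ebf = rng.uniform(-0.1, 0.1)
    return [a + ebf * b for a, b in zip(x, x_jj)]


def onlooker_phase(population: List[Solution],
                   fitness: Callable[[Solution], float],
                   rng: random.Random) -> List[Solution]:
    """Improve the solutions whose probability exceeds a random threshold."""
    probs = selection_probabilities([fitness(x) for x in population])
    result = []
    for x, p in zip(population, probs):
        if p > rng.random():
            partner = rng.choice(population)  # assumption: x(jj) comes from the population
            v = onlooker_update(x, partner, rng)
            if fitness(v) > fitness(x):       # assumption: greedy acceptance
                x = v
        result.append(x)
    return result
```

The fitness function passed to `onlooker_phase` must return positive values so that the probabilities are well defined.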

Fitness Function Value
The fitness value indicates the optimality of a solution, and a higher fitness value indicates greater optimality. The mathematical expression for calculating the fitness function is shown in Eq. (5).
Here, the Successive Amount (suc_amt) is defined in terms of net_bal, the current account balance, and Min_bal, the minimum bank balance limit. Table 3 presents the fitness values and the test cases / test data for various iterations; in this case, 200 iterations are considered. The function value depends on the parametric values of the input variables.
The soft drink Vending Machine Automation system [29] is used to test the performance of the proposed CSBCA method in test case generation. Even though the number of loops in the methods is high, the number of activity states is low. Hence, the proposed CSBCA method achieves a transition coverage of 100% in the case study, as shown in Table 4. Since the developed CSBCA method depends linearly on the input, its computational complexity is O(N).
The proposed CSBCA method reached its optimum value after 90 iterations and generated up to 44000 test cases for the ATM software.
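The transition coverage reported for the case study can be read as the share of model transitions exercised by at least one generated test path. The sketch below illustrates that reading; the state names are hypothetical and are not taken from the vending machine model in [29].

```python
from typing import Iterable, List, Set, Tuple

Transition = Tuple[str, str]


def transition_coverage(all_transitions: Iterable[Transition],
                        test_paths: Iterable[List[str]]) -> float:
    """Fraction of model transitions traversed by at least one test path."""
    model: Set[Transition] = set(all_transitions)
    covered: Set[Transition] = set()
    for path in test_paths:
        covered.update(zip(path, path[1:]))  # consecutive state pairs
    return len(covered & model) / len(model)


# Hypothetical model with three transitions.
MODEL = [("idle", "coin_inserted"),
         ("coin_inserted", "dispensed"),
         ("coin_inserted", "refunded")]

partial = transition_coverage(MODEL, [["idle", "coin_inserted", "dispensed"]])
full = transition_coverage(MODEL, [["idle", "coin_inserted", "dispensed"],
                                   ["idle", "coin_inserted", "refunded"]])
```

With a single path, two of the three transitions are covered; adding the second path covers all of them.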

Experimental Result and Discussion
The proposed CSBCA technique was implemented in NetBeans (version 8.2) on a PC with a 3.2 GHz i5 processor. The proposed CSBCA methodology is used to generate the test cases and the possible test paths from the combinational system diagram graph, the SCSEDG. The state chart diagram is a UML diagram that describes the behavior of a software system over time, and it consists of transitions between states. In our research, the ATM withdrawal task is used as the example for the state chart diagram. The performance of the proposed method is measured using the Mean Time Between Failures (MTBF).

MTBF
The MTBF covers the overall time to failure of the test cases and the overall time to repair them, and it measures software reliability based on the probability of failure, that is, the amount of time required to find a fault and rectify the failure. The mathematical description of MTBF is presented in equation (7).
Table 5 shows that, using PSBCA, around 45% of the test cases or test data have a higher fitness function value f(x), lying in the fitness range between 0.7 and 1.0, whereas with BCA only 25% of the test cases or test data fall within that range. The proposed CSBCA achieved 65% of test cases or test data with a fitness function value f(x) between 0.7 and 1.0. Considering all the fitness function values in Table 5, the proposed CSBCA achieved a better fitness value range than BCA and PSBCA. Figure 4 shows the fitness value ranges with respect to the number of test cases / test data. The proposed scheme obtains automated test cases or test data for the ATM withdrawal operation by employing the CSBCA method.
Figure 5 shows the relation between the test cases and the iterations listed in Table 3, in terms of fitness value. Here, the BCA method achieved the optimal solution after 160 iterations and the PSBCA method attained it after 120 iterations, whereas the CSBCA method attained the optimal solution much earlier, at approximately the 90th iteration. The proposed scheme generated the test cases or test data for the bank ATM withdrawal operation using CSBCA. The performance of the proposed method is presented in Table 6.
The performance of the proposed work was measured through evaluation metrics such as MTBF and execution time. The proposed CSBCA takes 16.4 s of execution time to select the best test path, and it is more capable and reliable for software development.
Table 7 presents a comparative study of the ATM withdrawal operation for the different optimization techniques and the proposed method. All the methods in Table 7 were tested on the same data and in the same environment. All optimization methods were evaluated over the same fitness value range, but the percentage of test data within that range differs. Compared to all existing methods, the proposed CSBCA algorithm achieved 65% of test cases / test data with a higher fitness function value, lying between 0.7 and 1.0.
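For readers unfamiliar with the MTBF metric, a short worked sketch follows. It assumes the common textbook convention for repairable systems, MTBF = MTTF + MTTR, where MTTF is the mean time to failure and MTTR is the mean time to repair; this convention is an assumption of the sketch and is not necessarily the exact form of the paper's equation. The sample durations are made up for illustration.

```python
from typing import Sequence


def mean(values: Sequence[float]) -> float:
    """Arithmetic mean of a non-empty sequence."""
    return sum(values) / len(values)


def mtbf(times_to_failure: Sequence[float], repair_times: Sequence[float]) -> float:
    """MTBF under the assumed convention MTBF = MTTF + MTTR."""
    return mean(times_to_failure) + mean(repair_times)


# Hypothetical observations, in hours.
times_to_failure = [100.0, 200.0]   # running time before each failure
repair_times = [10.0, 20.0]         # time spent locating and fixing each fault

result = mtbf(times_to_failure, repair_times)  # 150.0 + 15.0
print(result)
```

A larger MTBF indicates that failures are further apart on average, which is read as higher reliability.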
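The percentage figures quoted above can be understood as the share of generated test cases whose fitness value falls in a given range. The sketch below shows that computation on made-up fitness values; the values are hypothetical and are not the data behind Table 5 or Table 7.

```python
from typing import Sequence


def share_in_range(fitness_values: Sequence[float],
                   low: float = 0.7, high: float = 1.0) -> float:
    """Percentage of fitness values lying in the closed range [low, high]."""
    inside = sum(1 for v in fitness_values if low <= v <= high)
    return 100.0 * inside / len(fitness_values)


sample = [0.1, 0.75, 0.9, 1.0]        # hypothetical fitness values
percentage = share_in_range(sample)   # three of the four values are in [0.7, 1.0]
print(percentage)
```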

Case Study of Withdrawal task of an ATM machine
Each generated test case path was analyzed in terms of the nodes that represent the functions of the ATM. Seven possible traversal paths are obtained using CSBCA through the SCSEDG system graph. Of these seven paths, only one produces an optimal result; the remaining six do not.
The paths are numbered Path 1 to Path 7. Path 1, Path 2, Path 3, Path 4, Path 5, and Path 6 give non-optimal results and fail to complete the ATM withdrawal operation. Only Path 7 gives the properly optimized solution and represents a successful withdrawal operation.

Conclusion
Test cases are very useful in automated software testing based on a model-driven approach. In this paper, CSBCA, an evolutionary meta-heuristic algorithm, is used to optimize automatically generated test cases and test data. The algorithm generates and optimizes the test cases, taking the withdrawal operation of an ATM as an example. Test data values are selected based on the fitness function. The proposed approach optimizes the test cases with a minimum number of iterations and minimum time. The experimental analysis demonstrated that the proposed CSBCA method takes 16.4 s for the generation of path coverage. Compared to all existing methods, the proposed CSBCA algorithm achieved 65% of test cases / test data with a higher fitness function value, lying between 0.7 and 1.0. The proposed CSBCA method gave better results than PSO, CS, FA, and BCA. The method is applicable to software testing for bank ATMs. In future work, the method will be implemented and tested on test case generation for more complex software.