Investigation of optimized ELM using Invasive Weed optimization and Cuckoo Search optimization

Abstract: In order to classify data and improve the extreme learning machine (ELM), this study explains how a hybrid optimization-driven ELM technique was devised. Input data are pre-processed to impute missing values and to convert data to numerical values using the exponential kernel transform. The Jaro–Winkler distance is used to identify the relevant features. A feed-forward neural network classifier is used to categorize the data; it is trained using a hybrid optimization technique that combines the enhanced Invasive Weed optimization (IWO) algorithm, a metaheuristic, with the enhanced Cuckoo Search (CS) algorithm, a non-linear optimization algorithm, to form the modified CSIWO. The experimental findings presented in this work demonstrate the viability and efficacy of the created CSIWO-based ELM method, with good experimental results compared to other ELM techniques.


Introduction
Extreme learning machines are feed-forward neural networks (FFNNs) containing one or more layers of hidden nodes, used for clustering, sparse approximation, classification, compression, regression, and feature learning, without having to change the hidden bias settings. These nodes might be randomly selected and kept constant, or they could be inherited from their predecessors and kept constant. Typically, discovering the weights of the hidden nodes requires only one step, which results in a system that is rapid to pick up new information. Such models have learning rates thousands of times faster than backpropagation networks and are capable of delivering good generalization performance. In applications involving classification and regression, these models can also outperform support vector machines, according to the research [1].
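The one-step training idea described above can be sketched as follows. This is a minimal illustrative implementation, not the paper's code: the function names, the tanh activation, and the use of NumPy's pseudo-inverse are our own choices.

```python
import numpy as np

def elm_train(X, T, n_hidden=40, seed=0):
    """Basic ELM: random fixed hidden layer, one-shot least-squares output weights."""
    rng = np.random.default_rng(seed)
    a = rng.uniform(-1, 1, size=(X.shape[1], n_hidden))  # random input weights, kept constant
    b = rng.uniform(-1, 1, size=n_hidden)                # random hidden biases, kept constant
    H = np.tanh(X @ a + b)                               # hidden-layer output matrix
    beta = np.linalg.pinv(H) @ T                         # single least-squares step
    return a, b, beta

def elm_predict(X, a, b, beta):
    return np.tanh(X @ a + b) @ beta
```

Because the only learned parameters are the output weights `beta`, training reduces to one pseudo-inverse computation, which is the source of the speed advantage over backpropagation.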
The main bottleneck of the extreme learning machine (ELM) is that its randomly chosen input weights and hidden nodes impair performance. Our model, the ELM based on the combined Cuckoo Search (CS), a non-linear optimization algorithm, and Invasive Weed optimization (CSIWO), offers a strategy to optimize the hidden neurons and input weights. The following outcomes demonstrate the hybrid model's efficacy for enhancing dataset classification [2,3].
The major contributions of this work are:
• Optimize the input weights and hidden nodes with the proposed method.
• Construct the statistical equations of the CSIWO method.
• Furnish different results which show its effectiveness.
• Recommend scientific investigation of the presented model, which confirms the observations.

FFNN approach for data classification in the proposed method
In this section, the created ELM-based CSIWO method used by the FFNN to classify the data is described. The model comprises three steps: data preprocessing, parameter selection, and data classification. During preprocessing, missing values are imputed in the input data and the transformation is carried out using an exponential kernel. Following that, the most important features for categorizing the medical dataset are chosen using the Jaro-Winkler distance, and the selected features are provided to the FFNN for further classification. The suggested modified CSIWO is created by integrating the CS and Invasive Weed optimization (IWO) algorithms; accordingly, the FFNN uses it to find the best option for carrying out the categorization. The architecture of the ELM-based CSIWO FFNN for data classification is shown in Figure 1.

The dataset employed to retrieve the input data can be conceived as Z, of size [X × Y], where the total amount of data in the dataset is represented as X and the total number of features obtained from the dataset is denoted as Y. The mth data value in the nth feature is denoted as Z_mn. In the exponential kernel function, the flattening parameter is displayed as d and the constant is represented by the term C; S provides the sample in the kernel. The flattening parameter is calculated from the damping factor 1 minus the alpha level, i.e., (1 − α). By increasing computation speed and streamlining the computation process, the exponential kernel transformation enhances the data classification process [4][5][6][7]. The preprocessing procedure produces the output data P_H. The similarity of two features and the Jaro distance are displayed as h_1 and h_2.
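As a rough sketch of the preprocessing stage, the hypothetical helpers below perform mean imputation of missing values and apply an exponential kernel. The exact kernel form, C·exp(−distance/d) with d = 1 − α, is our assumption reconstructed from the description, since the equation itself is not reproduced in the text.

```python
import math

def impute_mean(rows):
    """Replace missing values (None) with the mean of the corresponding column."""
    cols = list(zip(*rows))
    means = [sum(v for v in c if v is not None) / sum(v is not None for v in c)
             for c in cols]
    return [[m if v is None else v for v, m in zip(r, means)] for r in rows]

def exponential_kernel(s1, s2, alpha=0.1, C=1.0):
    """Exponential kernel with flattening parameter d = 1 - alpha (assumed form)."""
    d = 1.0 - alpha
    dist = math.sqrt(sum((a - b) ** 2 for a, b in zip(s1, s2)))
    return C * math.exp(-dist / d)
```

For identical samples the kernel returns the constant C, and it decays smoothly with distance, which is what makes it a cheap numerical transform of the raw data.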
where the number of matching sequences is displayed as s and the Jaro similarity metric is presented as f. The number of transpositions is w and the size of the common prefix is given as β. The lengths of the sequences are represented as k_1 and k_2, respectively. The scaling factor is given as λ.
The Jaro distance is used to compare the features that are appropriate for the data selection procedure. The results of the feature selection process are given as P = {P_1, P_2, ..., P_l}. The feature parameters selected by the above procedure are sent to the dataset classification step for further processing [8].

Classification of dataset using CSIWO ELM method
The expanded modified ELM based on the CSIWO FFNN algorithm is utilized for classification of the presented dataset. The CSIWO ELM is used to train the FFNN for data classification, and the output gathered during the feature selection procedure serves as the input to the FFNN classifier. The newly presented CSIWO ELM approach combines the ELM with an intelligent hybrid optimization strategy: the CS-IWO is developed by incorporating the CS into the IWO. The ELM's hidden nodes and input weights are improved strategically using the CS method. In the CS, pairwise rivalry occurs between particles randomly chosen from the swarm; the competition's loser is modified before moving to the next generation, while the winner is moved to it immediately. Along with resolving high-dimensional challenges, the CS offers a better balance between exploitation and exploration [9][10][11]. The IWO is modeled on the colonization behavior of invasive weeds and offers an effective solution across various parameter values. The CS and IWO algorithms are combined to provide a better solution to the optimization problem [12].

The FFNN's architecture and training procedure are explained here. Consider that the dataset Z contains V samples. The final network output is computed from the hidden-node responses, where a_j is the primary weight vector representing the connection between the input and the jth hidden node and b_j is the bias of the jth node. The hidden nodes are connected to the output node through the output weights. To obtain the output weight vector ρ, the minimal-norm least-squares solution is determined as ρ = Q⁺E, where Q⁺ is the Moore-Penrose pseudo-inverse of the hidden-node output matrix Q.
Alternatively, the ELM method computes the output weight as ρ = (I/C + QᵀQ)⁻¹QᵀE, where the regularization term is given as C and the identity matrix is presented as I. The presented modified CSIWO algorithm is used to train the FFNN classifier to classify the data by offering the best possible outcome.
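The regularized output-weight computation can be illustrated as below. This is a generic sketch of the standard regularized ELM solution; treating C as a scalar regularization constant is our simplifying assumption.

```python
import numpy as np

def elm_output_weights(Q, E, C=1e3):
    """Regularized ELM output weights: rho = (I/C + Q^T Q)^{-1} Q^T E."""
    n = Q.shape[1]
    # Solving the linear system is cheaper and more stable than forming the inverse.
    return np.linalg.solve(np.eye(n) / C + Q.T @ Q, Q.T @ E)
```

As C grows, the I/C term vanishes and the solution approaches the plain Moore-Penrose least-squares answer; smaller C trades training fit for smoother output weights.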

Random initialization
The population of the CS technique is initialized randomly. Within a particle, components such as the hidden biases and primary input weights are tuned. The elements are initialized inside the range [−1, 1], and the input weights together with the hidden biases specify a candidate ELM. At iteration δ, the output G(δ) is illustrated by the solution set, where J displays the potential results found inside the solution set k. The v particles of the population are initialized and updated at random.

Calculation of fitness function
The fitness measure, determined using the expression below, should be minimal in order to identify the best solution.
where LF gives the loss function, figured as the square root of the difference between the actual and predicted values. χ implies the function which provides the fitness value, ψ gives the mean squared error (MSE), B represents the accuracy (accu), and μ defines the entropy-based information. The accuracy is a closeness measure, where q indicates the total random samples, E_q* refers to the final target output values, and E_q denotes the actual output values of the FFNN. Additionally, entropy is a measure of randomness in the information being processed.
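As an illustration of the quantities entering the fitness evaluation, the hypothetical helpers below compute the MSE (ψ), a rounding-based accuracy (B), and entropy (μ). How the paper combines them into the fitness χ is not fully specified, so no combined formula is assumed here.

```python
import math

def mse(targets, outputs):
    """Mean squared error psi between target and predicted output values."""
    return sum((t - o) ** 2 for t, o in zip(targets, outputs)) / len(targets)

def accuracy(targets, outputs):
    """Closeness measure B: fraction of rounded outputs matching the targets."""
    return sum(t == round(o) for t, o in zip(targets, outputs)) / len(targets)

def entropy(probs):
    """Entropy mu of a discrete distribution, in bits."""
    return -sum(p * math.log2(p) for p in probs if p > 0)
```

In a minimization setting such as the one described, a candidate ELM with lower MSE (and hence lower fitness) survives into the next generation.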

Calculating the update equation
In this phase, the CS and IWO are altered to form the modified CSIWO model. The CS technique is based on species breeding features, and its velocity update equation is stated in the following expression. The modification in the Cuckoo Search algorithm lies in the step size κ, which we made iterative, where ⊕ represents entry-wise multiplication and G^δ_h symbolizes the solution at the current iteration. The distance term w is defined by the fitness of solution m_x, represented as r_mx (the first-best value), and the fitness of solution m_y, displayed as r_my (the second-best value). Furthermore, Levy flights are critical in enabling random walks when the step sizes are drawn from the Levy distribution; for large steps, the variance and mean of the distribution are infinite.
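A generic Levy-flight update of the kind the CS method uses can be sketched as below. Mantegna's algorithm for drawing Levy-distributed steps and the exact form of the entry-wise step term are standard CS choices, not taken verbatim from the paper.

```python
import math
import random

def levy_step(beta=1.5):
    """Draw one Levy-distributed step via Mantegna's algorithm."""
    num = math.gamma(1 + beta) * math.sin(math.pi * beta / 2)
    den = math.gamma((1 + beta) / 2) * beta * 2 ** ((beta - 1) / 2)
    sigma = (num / den) ** (1 / beta)
    u = random.gauss(0.0, sigma)
    v = random.gauss(0.0, 1.0)
    return u / abs(v) ** (1 / beta)

def cuckoo_update(position, best, kappa=0.01):
    """One CS move: new_i = old_i + kappa * Levy * (old_i - best_i), entry-wise."""
    return [x + kappa * levy_step() * (x - b) for x, b in zip(position, best)]
```

A solution already at the best position is left unchanged by this move, while distant solutions take occasional large Levy jumps, which is what gives CS its exploration behavior.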

Fitness function re-evaluation
Every iteration involves computing the fitness metric, and the best outcome is determined by the fitness metric with the optimal value.

Termination
The aforementioned procedures are repeated until the best solution is attained.

IWO algorithm modification
The properties of weed colonies serve as the inspiration for the overall IWO model, which is population-driven optimization. Here the upgraded IWO methodology is used in place of the IWO approach.

Random initialization
First, using chaotic mapping, the population of plants is formulated, where d denotes the different values within the total solution and G_e denotes the eth output solution.

Calculation of fitness function
For each iteration, the fitness measure is calculated, and the optimal result is determined by selecting the fitness metric with the best value, as shown in Eq. (15).

Updation in solution
After computing the fitness function, solutions are updated using the modified IWO approach. In the typical update of the enhanced IWO to obtain the optimal position, G_best denotes the finest weed in the entire population, G^(δ+1)_h denotes the new weed at the hth position, and γ(δ) presents the standard deviation.
The standard deviation γ(δ) is given in terms of the chaotic mapping g(δ) at the δth iteration. In the IWO, we modify the modulation index k, where the max parameter presents the maximum value of k found so far, k_l gives the iteration of the recent location, and the parameter w gives the distance, given in Eq. (18).
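A sketch of the chaotic map and deviation schedule might look as follows. The logistic map and the standard IWO deviation schedule are assumed forms, since the text does not reproduce the equations themselves.

```python
def logistic_map(g):
    """Chaotic logistic map g(δ+1) = 4·g(δ)·(1 − g(δ)) (assumed form)."""
    return 4.0 * g * (1.0 - g)

def iwo_sigma(it, max_it, sigma_init, sigma_final, k=3.0):
    """Standard IWO deviation schedule: decays from sigma_init to sigma_final
    with modulation index k controlling the decay curve."""
    return ((max_it - it) / max_it) ** k * (sigma_init - sigma_final) + sigma_final
```

Early iterations scatter seeds widely (exploration); as the iteration count approaches the maximum, the deviation shrinks toward sigma_final and the search concentrates around the best weeds (exploitation).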

Verify feasibility of solution using fitness function
The fitness measure is used to determine the optimal solution, and if a new solution is more advantageous than the prior one, the value is updated.

Termination
The aforementioned procedures are repeated until a better answer is found.

Analytical results and discussion
In this section, the proposed ELM-based FFNN modified using CSIWO is discussed along with its findings and assessment metrics, which vary with the training data percentage.

Performance analysis of developed method using hidden layers
This section explicates the results and performance analysis of the ELM based on the updated method. The performance analysis considers the Cleveland, Switzerland, and Hungarian data with varying numbers of hidden layers. A similar hidden-layer analysis of the sensitivity and specificity of the ELM-based modified model can be performed using the Cleveland dataset. The accuracy of the devised technique with hidden layers 250, 500, 750, and 1,000 is 0.8635, 0.8648, 0.8658, and 0.8663 for 80% of the training data, and 0.9108, 0.9111, 0.9125, and 0.9131, respectively, for 90% of the training data.

Performance analysis using Switzerland dataset
We can perform a similar operation on the model and draw the sensitivity and specificity graphs with different values. Likewise, by performing the same operations on the model, we can draw the graph and table of accuracy, sensitivity, and specificity for the Hungarian dataset.

Comparative analysis of different optimization techniques

Analysis using linear value function
This section illustrates the analysis of the proposed ELM-based modified CSIWO FFNN based on the linear function in terms of performance measures using three datasets. Figure 4 illustrates the comparative assessment of the proposed modified CSIWO ELM-based FFNN in terms of sensitivity. When the training data is increased to 90%, the sensitivity obtained by the proposed model is 90%, which reveals its performance enhancement.

The analysis of the proposed ELM based on the updated algorithm (CSIWO) in terms of specificity is depicted in Figure 5. For 90% training data, the specificity gained by the proposed modified CSIWO ELM-based FFNN is 92%, revealing performance enhancements over the classical approaches CS + FFNN, chaotic IWO + FFNN, modified CS + FFNN, modified chaotic IWO + FFNN, IWO-CS ELM, and CSIWO ELM-based FFNN of 6.373, 5.565, 4.995, 3.837, 2.870, and 1.726%, respectively.

Analysis of Cleveland data
Similarly, we can perform the linear value function analysis on the Switzerland and Hungarian datasets with evaluation metrics such as sensitivity, accuracy, specificity, and precision. All related values are presented in the comparative discussion in Table 1. Figure 5 shows the precision-based analysis of the linear value function. We have likewise calculated the sensitivity and specificity for the Cleveland dataset and, accordingly, drawn the accuracy, sensitivity, specificity, and precision graphs for all the datasets, i.e., Cleveland, Switzerland, and Hungarian.

Analysis using objective value function
This function analysis shows how efficiently the developed algorithm uses the search space. We have analyzed the Switzerland, Hungarian, and Cleveland datasets with the parameters precision, accuracy, sensitivity, and specificity. The following graph shows the objective value function analysis of the Cleveland dataset using the sensitivity and specificity parameters; the rest of the parameter evaluations are represented in Table 1. Figure 6 illustrates the comparative assessment of the proposed ELM method based on the updated algorithm (CSIWO) in terms of sensitivity. When the training data is increased to 90%, the sensitivity obtained by the proposed method is 91%, which reveals performance enhancements over the traditional models CS + FFNN, chaotic IWO + FFNN, modified CS + FFNN, modified chaotic IWO + FFNN, IWO-CS ELM, and CSIWO ELM-based FFNN of 7.318, 6.371, 4.584, 4.045, 3.001, and 1.952%, respectively. The conventional techniques delivered sensitivity values of 84.6% for CS + FFNN, 85% for chaotic IWO + FFNN, 87% for modified CS + FFNN, 87% for modified chaotic IWO + FFNN, 88.5% for IWO-CS ELM, and 89% for CSIWO ELM-based FFNN.

Analysis of Cleveland data
The analysis of the proposed ELM method based on the updated algorithm (CSIWO) in terms of specificity is depicted in Figure 7. For 90% training data, the specificity gained by the proposed modified CSIWO ELM-based FFNN is 0.929, which reveals its performance enhancement over the classical models.

The corresponding specificity analysis is depicted in Figure 9. For 90% training data, the specificity gained by the proposed ELM based on the updated algorithm (CSIWO) is 92%, revealing performance enhancements over the classical approaches CS + FFNN, chaotic IWO + FFNN, modified CS + FFNN, modified chaotic IWO + FFNN, IWO-CS ELM, and CSIWO ELM-based FFNN of 6.630, 5.544, 4.055, 2.830, 2.123, and 1.051%, respectively.
We have also determined the accuracy and precision for the Cleveland dataset and, accordingly, created graphs showing the accuracy, sensitivity, specificity, and precision for all three datasets: Cleveland, Switzerland, and Hungarian. Table 1 presents the analysis of the proposed ELM method based on the updated algorithm (CSIWO) against different optimized models. The analysis considers the sensitivity, specificity, accuracy, and precision evaluation techniques on dataset 1, dataset 2, and dataset 3, namely, the Cleveland, Switzerland, and Hungarian data, respectively. We also consider different functions to show the model's stability and feasibility with the related data: the objective value function, the linear value function, and the optimization value function. From the analysis, the proposed model achieved maximum sensitivity, specificity, accuracy, and precision of 91, 93, 95, and 90%, respectively, for the linear value function on dataset 2, i.e., the Switzerland dataset.

Comparative analysis
Our proposed model also works well on dataset 1 and dataset 3, i.e., Cleveland and Hungarian, respectively; the results are shown in Table 1. Table 2 shows the comparative discussion based on the objective value function for datasets 1, 2, and 3. With the proposed approach, we have obtained maximum sensitivity, specificity, accuracy, and precision of 92, 94, 95, and 91%, respectively, on dataset 2, i.e., the Switzerland data. Our proposed model also performs effectively on the other datasets and, compared with other optimized models, provides good and stable results. Here dataset 1 represents the Cleveland dataset, dataset 2 the Switzerland dataset, and dataset 3 the Hungarian dataset. Table 3 shows the optimization value function evaluation. Compared with other optimized models, our proposed approach shows maximum sensitivity, specificity, accuracy, and precision of 95% each for dataset 2, i.e., the Switzerland dataset. The results show that our model performed well and gives enhanced results, with stable performance on dataset 1 as well as dataset 2.

Statistical analysis
When integrating various algorithms, algorithmic performance is evaluated using statistical tests conducted in pairs. A statistical test has been conducted for the sensitivity, specificity, and accuracy values [35,36]. The pair-wise statistical test of the sensitivity-based algorithms is shown in Table 4: comparing the proposed method to the current approaches, it discards the null hypothesis by reaching a t value of 0.02. The statistical pair-wise examination of the algorithms based on specificity is shown in Table 5, where the proposed ELM-based FFNN modified using CSIWO likewise discards the null hypothesis with a t value of 0.02. The accuracy-based t values for the various approaches are shown in Table 6; the t value for the hypothesis test should be less than 0.1, and the proposed ELM based on the updated algorithm (CSIWO) FFNN achieves a value of 0.02, discarding the null hypothesis in the majority of the pairings. The tables show that, compared to other algorithms, the statistical test almost invariably yields lower t values for the proposed method, which frequently rejects the null hypothesis [37][38][39][40].

In this work, we architected a modified hybridized ELM based on the CSIWO algorithm for selecting input weights and hidden neurons. The statistical model represents the method for selecting the input weights, which are then sent to the CSIWO ELM model for prediction of the output.
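The pair-wise tests summarized in Tables 4-6 rest on the paired t statistic, which can be computed as below. This is a generic stdlib sketch of the standard statistic, not the authors' code.

```python
import math

def paired_t_statistic(a, b):
    """Paired t statistic for two equal-length lists of per-run scores."""
    d = [x - y for x, y in zip(a, b)]
    n = len(d)
    mean = sum(d) / n
    var = sum((x - mean) ** 2 for x in d) / (n - 1)  # sample variance of differences
    return mean / math.sqrt(var / n)
```

A large-magnitude t value (compared against the t distribution with n − 1 degrees of freedom) rejects the null hypothesis that the two methods perform identically on average.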
In this research, the proposed modified CSIWO ELM-based FFNN is investigated using evaluation metrics for three functions, namely, the linear value-based, objective value-based, and optimization value-based functions, on three different datasets. Moreover, the proposed ELM based on the updated algorithm (CSIWO) FFNN has achieved a maximum accuracy of 95%, maximum sensitivity of 92%, and maximum specificity of 94% for the objective function on the Switzerland dataset.
The proposed model can be used for ventilation diagnosis, fault diagnosis, and disease prediction.
In future work, we can modify the different parameters in the equations and investigate the results; we can also change the classifier and use hyperparameter tuning to improve the results. The main challenge lies in deploying the model in real-world scenarios. Future efforts should focus more on parameter selection and tuning so that more accurate results can be obtained, and the same model can be deployed on real-life data to check its performance.