Full spectrum and genetic algorithm-selected spectrum-based chemometric methods for simultaneous determination of azilsartan medoxomil, chlorthalidone, and azilsartan: Development, validation, and application on commercial dosage form

Five chemometric methods were established for the simultaneous determination of azilsartan medoxomil (AZM) and chlorthalidone (CHT) in the presence of azilsartan (AZ), which is the core impurity of AZM. The full spectrum-based chemometric techniques applied were partial least squares (PLS), principal component regression (PCR), and artificial neural networks (ANN). In addition, PLS and ANN were extended by a genetic algorithm (GA) wavelength selection procedure (GA-PLS and GA-ANN). The models were developed by applying a multilevel multifactor experimental design. The predictive power of the suggested models was evaluated through a validation set containing nine mixtures with different ratios of the three analytes. All the proposed procedures were applied to the analysis of Edarbyclor tablets, and the best results were achieved with the ANN, GA-ANN, and GA-PLS methods. These three methods proved to be suitable quantitative tools for the analysis of the three components without any interference from the co-formulated excipients and without prior separation procedures. Moreover, the impact of GA on strengthening the predictive power of ANN- and PLS-based models was highlighted.


Introduction
Metabolic activation of a drug leading to reactive metabolite(s) that can covalently modify proteins is considered an initial step that may lead to drug-induced organ toxicities.
Chemometrics is the science of extracting useful information from raw data [4]. The application of numerous multivariate calibration methods has considerably improved quantitative spectroscopy [5][6][7][8][9].
Multivariate statistical models are very useful in spectral analysis because the simultaneous incorporation of multiple spectral intensities greatly improves the accuracy and potential applications of quantitative spectral analysis [10]. A literature survey showed that spectroscopic, RP-HPLC, and LC-MS methods [11,12] have been reported for the quantitation of the binary mixture of azilsartan medoxomil (AZM) and chlorthalidone (CHT). However, no method has been reported for the quantitation of the ternary mixture of AZM, CHT, and azilsartan (AZ), possibly because of the high similarity of the UV spectra of AZM and AZ. These findings motivated us to develop simple chemometric-assisted spectrophotometric methods for resolving this ternary mixture. The aims of this work are therefore to quantify AZM and CHT in the presence of AZ; to compare the traditional principal component regression (PCR) and partial least squares (PLS) models (as examples of full spectrum-based models) with selected-spectrum models, namely PLS and artificial neural network (ANN) preceded by a genetic algorithm (GA) procedure; and ultimately to illustrate the impact of GA on increasing the predictive ability of PLS and ANN models.

Instrument
A SHIMADZU dual-beam UV-visible spectrophotometer, model UV-1800 PC (Kyoto, Japan), was connected to an IBM-compatible computer and an hp1020 laser jet printer. The bundled software, UV-Probe personal spectroscopy software version 2.21 (SHIMADZU), was used to process the absorption spectra. The spectral bandwidth was 2 nm, and the scanning speed was 2,800 nm/min.

Software
All chemometric methods were performed in MATLAB ® 7.0.0.19920 (R14). PCR and PLS were executed using PLS toolbox software version 2.1. GA-PLS and GA-ANN were carried out using PLS toolbox software in conjunction with the Neural Network toolbox. Microsoft ® Excel was adopted to conduct the t-test and F-test.

Pure samples and reagents
Pure AZM, CHT, and AZ were purchased from Weihua Pharma Co. Ltd, Zhejiang, China. Their purity was stated to be 99.69 ± 0.79%, 99.94 ± 0.64%, and 99.54 ± 0.84%, respectively. Edarbyclor ® tablets, labeled to contain 40 mg AZM plus 12.5 or 25 mg of CHT, batch number 459512, were manufactured by Takeda, Japan. The dosage form (Edarbyclor ® tablets) was purchased from the United Arab Emirates market. Methanol (spectroscopic grade) was available in the laboratory.

Spectral characteristics of AZM, CHT, and AZ
The zero-order absorption spectra of 40 μg mL⁻¹ AZM, 12.5 μg mL⁻¹ CHT, and 40 μg mL⁻¹ AZ were recorded against methanol as a blank over the range of 200-350 nm.

Chemometric procedures
Sixteen mixture samples were prepared according to a multilevel multifactor design [13] by transferring various volumes from the stock solutions of the three components separately into a series of 5 mL volumetric flasks, and all flasks were diluted to volume with methanol. The concentration ranges of AZM and CHT in the 16 samples depended on their calibration ranges and their ratio in the marketed pharmaceutical product. The AZ concentrations in the calibration set were chosen according to the percentages of AZ expected in degraded samples (1-15% of AZM). The concentration design matrix is illustrated in Table 1. The regions 200-229 nm and 341-350 nm were not considered. To examine the predictive capability of the developed multivariate models, a validation set of nine laboratory prepared mixtures was prepared with different proportions of AZM, AZ, and CHT.

Preparation of pharmaceutical tablet solutions
Edarbyclor ® 40 mg/12.5 mg tablets were weighed and finely powdered. An amount of powder corresponding to 40 mg of AZM and 12.5 mg of CHT was carefully transferred into a 20 mL volumetric flask containing 10 mL of methanol. The mixture was extracted by sonication for 20 min and then diluted to the mark with methanol. The extract was filtered to obtain final concentrations of 2,000 µg mL⁻¹ AZM and 625 µg mL⁻¹ CHT.
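The stated stock concentrations follow directly from the weighed amounts and the flask volume. The short check below is only an arithmetic sketch; the function name is hypothetical and not part of the published procedure.

```python
def stock_concentration_ug_per_ml(mass_mg: float, volume_ml: float) -> float:
    """Concentration in µg/mL when mass_mg is dissolved and diluted to volume_ml."""
    return mass_mg * 1000.0 / volume_ml  # 1 mg = 1000 µg


FLASK_VOLUME_ML = 20.0  # volumetric flask used for the tablet extract

cht_stock = stock_concentration_ug_per_ml(12.5, FLASK_VOLUME_ML)
azm_stock = stock_concentration_ug_per_ml(40.0, FLASK_VOLUME_ML)

print(f"CHT: {cht_stock:.0f} µg/mL, AZM: {azm_stock:.0f} µg/mL")
```

Both values agree with the concentrations quoted for the tablet extract.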
Ethical approval: The conducted research is not related to either human or animal use.

Results and discussion
The absorption spectra of AZ, CHT, and AZM showed a high degree of overlap (Figure 2). Consequently, none of them could be determined in the mixture by direct spectrophotometry.

PCR and PLS methods
PLS and PCR models were chosen because they are full-spectrum chemometric procedures. Nonetheless, precise results cannot be obtained from noisy and barely informative wavelengths; discarding especially noisy wavelengths can give more accurate results. In all the proposed models, the range of 230-340 nm was used, where AZM and CHT showed good linearity at the concentration ratio defined by the Edarbyclor ® tablets. The region 200-229 nm showed noisy absorbance and strong contributions from AZ and CHT, whereas the region 341-350 nm showed poor absorbance of the three analytes. A calibration set of 16 mixtures containing various ratios of AZM, CHT, and AZ was used to establish the PLS and PCR models, in line with the multilevel multifactor design [13] (Table 1). Before developing the PLS and PCR models, it is crucial to determine the optimum number of latent variables. If more latent variables are retained than required, more noise is incorporated into the model; if too few are retained, information that is meaningful for calibration may be discarded. The optimum number of latent variables can be determined in a number of ways [4][5][6][7][8][9][14,15].
In this work, the number of latent variables was selected by leave-one-out cross-validation on the calibration data [14]. Errors in the estimated concentrations were evaluated by calculating the root mean square error of cross-validation (RMSECV) as a diagnostic tool, which reflects both the accuracy and precision of predictions. The RMSECV was recalculated each time a new factor was added to the PLS and PCR models and was plotted against the number of latent variables to determine the optimum number. An "F" test was used to compare the RMSECV values of the different models, and the smallest number of latent variables whose RMSECV was not significantly greater than the minimum RMSECV was selected as the optimum [8,14]. For both the PLS and PCR models, three latent variables were found to be appropriate (Figure 3).
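The selection rule for the number of latent variables can be sketched in a few lines. This is a minimal illustration, not the PLS toolbox routine used in the study: the function name, the RMSECV values, and the critical F value are all hypothetical, and in practice the critical value would be taken from an F table for the chosen significance level and degrees of freedom.

```python
def select_latent_variables(rmsecv_by_lv, f_critical):
    """Return the smallest number of latent variables whose RMSECV is not
    significantly greater than the minimum RMSECV.

    rmsecv_by_lv[i] is the RMSECV of the model with i + 1 latent variables.
    The comparison uses the ratio of squared errors, (RMSECV / RMSECV_min)^2,
    against a user-supplied critical F value (an assumption of this sketch).
    """
    min_sq = min(rmsecv_by_lv) ** 2
    for n_lv, value in enumerate(rmsecv_by_lv, start=1):
        if value ** 2 <= f_critical * min_sq:
            return n_lv
    return len(rmsecv_by_lv)


# Invented RMSECV values for models with 1 to 6 latent variables.
example_rmsecv = [2.10, 0.95, 0.31, 0.29, 0.28, 0.30]
optimum = select_latent_variables(example_rmsecv, f_critical=1.5)
print("optimum number of latent variables:", optimum)
```

With these invented numbers the minimum RMSECV occurs at five latent variables, but the three-variable model is not significantly worse, so the more parsimonious model is preferred.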

GA-PLS method
Optimization problems can be solved by GAs using techniques inspired by natural evolution. They have become attractive for some types of optimization because the price/performance ratio of computer systems continues to improve. In particular, GAs perform well on mixed (discrete and continuous) combinatorial problems. A number of GA-based applications have been reported in pharmaceutical analysis [16][17][18][19].
One of the objectives of this study is to investigate the impact of GA as a pre-processing step for the PLS and ANN methods in the analysis of AZM, CHT, and AZ in laboratory prepared mixtures and in Edarbyclor ® tablets. Accurate adjustment of the GA parameters is decisive for successful GA performance; Table 2 lists the configuration of the GA parameters. For AZM, CHT, and AZ, the GA was run with PLS regression on 111 variables (in the range 230-340 nm), with the maximum number of latent variables allowed set to the optimum number of components determined by cross-validation. The selected variables were then used for running PLS. To find the best set of wavelengths for each analyte, the GA procedure was repeated 10 times, and a wavelength was retained if its selection percentage exceeded a threshold value. A threshold of 80% was adopted for the three analytes, according to the minimal error of prediction for each analyte. As a result, the GA reduced the absorbance matrix to 51 wavelengths for AZM, 32 wavelengths for AZ, and 27 wavelengths for CHT. The selected wavelengths for AZM, AZ, and CHT were 230-280, 230-261, and 230-256 nm, respectively. The PLS method was then applied to the calibration set absorption spectra at the wavelengths selected by the GA. The number of latent variables in the PLS algorithm was selected by a cross-validation method leaving out one sample at a time, using the calibration set of 16 spectra [14]. Figure 4 depicts the number of latent variables used for each drug.
As seen in Figure 4, the GA did not reduce the optimum number of latent variables compared with that obtained with the full-spectrum PLS model.
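The step that turns repeated GA runs into a final wavelength set can be illustrated as follows. The GA itself is not implemented here; the ten run results are invented, the function names are hypothetical, and treating "exceeds the threshold" as "greater than or equal to 80%" is an assumption of this sketch.

```python
def selection_frequency(runs, wavelengths):
    """Percentage of GA runs in which each wavelength was selected."""
    n_runs = len(runs)
    return {w: 100.0 * sum(w in run for run in runs) / n_runs for w in wavelengths}


def stable_wavelengths(runs, wavelengths, threshold_percent=80.0):
    """Keep wavelengths selected in at least threshold_percent of the runs
    (the >= comparison is an assumption; the text only says 'exceeds')."""
    freq = selection_frequency(runs, wavelengths)
    return [w for w in wavelengths if freq[w] >= threshold_percent]


# 230-340 nm at 1 nm steps gives the 111 candidate variables.
wavelengths = list(range(230, 341))

# Invented outcome of 10 GA runs: eight runs pick 230-256 nm,
# two runs pick 250-299 nm.
runs = [set(range(230, 257)) for _ in range(8)] + [set(range(250, 300)) for _ in range(2)]

kept = stable_wavelengths(runs, wavelengths)
print(len(wavelengths), "candidates ->", len(kept), "kept:", kept[0], "-", kept[-1], "nm")
```

Only wavelengths that recur across most runs survive, which guards against keeping variables that a single stochastic GA run happened to select.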

ANN and GA-ANN methods
Neural networks trained by back-propagation learning can approximate various kinds of functions. Nevertheless, the success of the training process is determined by the choice of the basic parameters [20].
GAs and neural networks are both capable problem solvers. Their principles are simple and straightforward, and their mathematical nature relies on linear and nonlinear iterations [20]. A number of ANN-based applications to pharmaceutical preparations have been reported [16][17][18][19][20]. The predictive ability of the ANN was evaluated with and without the GA for the determination of AZM, CHT, and AZ in laboratory prepared mixtures and in Edarbyclor ® tablets. The network consists of three layers: an input layer, a hidden layer, and an output layer. ANN modeling took extra time because of the large number of nodes in the input layer (i.e., the number of wavelength readings for each solution). Therefore, for GA-ANN the absorbance matrix was reduced to 51 (for AZM), 32 (for AZ), and 27 (for CHT) values before being fed to the network. Accordingly, the inputs were either the absorbances at the selected wavelengths (GA-ANN) or the raw data (ANN). The output layer represents the concentration of one component. A single hidden layer was used, which is considered adequate for solving similar or more complicated problems, whereas additional hidden layers may cause over-fitting [21]. Consequently, three ANNs were used to predict the AZM, CHT, and AZ concentrations from the raw data. A number of parameters should be optimized for proper ANN modeling, including the number of hidden neurons, the learning coefficient, the learning coefficient increase, and the learning coefficient decrease. Table 3 lists the architectures of the proposed ANNs and the parameter values for each component in both the ANN and GA-ANN methods.
Training the ANN with different training functions did not decrease the mean square error of prediction (i.e., the performance was not improved). The Levenberg-Marquardt backpropagation (TRAINLM) training function was used because of its efficiency. TRAINLM is a network training function that updates the weight and bias values according to Levenberg-Marquardt optimization. In the present design, the best results were obtained with the linear transfer function pair (Purelin-Purelin) between the input and hidden layers and between the hidden and output layers, which is attributable to the linear relationship between concentration and absorbance. To prevent overfitting, the nine-mixture validation set was used during the training step, and ANN training stops when the mean square error of the calibration set decreases while that of the validation set increases. Using the optimal parameters, the proposed methods were applied to the calibration data and the concentrations of the three drugs were calculated (Table S1 in Supplementary Materials).
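The stopping criterion used to prevent overfitting can be expressed as a simple check on the two error curves. The sketch below is only an illustration with invented error values and a hypothetical function name; it stops at the first epoch that meets the condition, whereas toolbox implementations may tolerate several consecutive validation failures before stopping.

```python
def early_stopping_epoch(calibration_mse, validation_mse):
    """Return the first epoch at which the calibration MSE decreased while the
    validation MSE increased, or None if this never happens.

    Both arguments are per-epoch error histories of equal length.
    """
    for epoch in range(1, len(calibration_mse)):
        calibration_down = calibration_mse[epoch] < calibration_mse[epoch - 1]
        validation_up = validation_mse[epoch] > validation_mse[epoch - 1]
        if calibration_down and validation_up:
            return epoch
    return None


# Invented error histories: the validation error starts rising at epoch 3.
cal_history = [1.00, 0.50, 0.30, 0.20, 0.15]
val_history = [1.20, 0.70, 0.50, 0.55, 0.60]

stop = early_stopping_epoch(cal_history, val_history)
print("stop at epoch", stop, "and keep the weights from epoch", stop - 1)
```

A rising validation error while the calibration error keeps falling indicates that the network has started to fit noise in the calibration set rather than the concentration-absorbance relationship.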
To assess whether the model accounts for the concentration variability in the calibration set, the predicted concentrations of the calibration samples were plotted against the known concentrations. The points were expected to lie on a straight line with a slope of 1 and an intercept of 0; Table S2 in Supplementary Materials collects the parameters of the linear calibration models.
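The predicted-versus-known check amounts to an ordinary least-squares line fit, with an ideal model giving a slope of 1 and an intercept of 0. The following is a minimal sketch with invented concentrations and a hypothetical function name; it does not reproduce the values in Table S2.

```python
def fit_line(known, predicted):
    """Ordinary least-squares slope and intercept of predicted vs. known."""
    n = len(known)
    mean_x = sum(known) / n
    mean_y = sum(predicted) / n
    sxx = sum((x - mean_x) ** 2 for x in known)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(known, predicted))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


# Invented known and predicted concentrations (µg/mL) for five samples.
known_conc = [10.0, 20.0, 30.0, 40.0, 50.0]
predicted_conc = [10.2, 19.9, 30.1, 39.8, 50.0]

slope, intercept = fit_line(known_conc, predicted_conc)
print(f"slope = {slope:.4f}, intercept = {intercept:.4f}")
```

Slopes far from 1 point to proportional bias, and intercepts far from 0 point to constant bias in the predicted concentrations.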
For model validation, the proposed methods were applied for analyzing a nine-mixture validation set (Table 4). It was clear that PLS and PCR did not provide good results for the prediction of AZ concentrations.
Another diagnostic tool for investigating the errors in the predicted concentrations was the root mean square error of prediction (RMSEP). RMSEP reflects both the accuracy and precision of predictions, because it plays the same role as a standard deviation in describing the spread of the concentration errors [22]. Figure 5 illustrates the calculated RMSEP values for AZM, CHT, and AZ using the five models. Suitable values were obtained for all the models except for AZM and AZ by the PCR and PLS models, and the best results were shown by GA-ANN and GA-PLS. This demonstrates the impact of GA (as a variable selection procedure) on the performance of the two models, in particular the ANN-based model.
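RMSEP is the square root of the mean squared difference between predicted and known concentrations over the validation samples. The sketch below compares two sets of predictions; all numbers are invented for illustration and are not taken from Figure 5, and the variable names are hypothetical.

```python
import math


def rmsep(known, predicted):
    """Root mean square error of prediction over a validation set."""
    n = len(known)
    return math.sqrt(sum((p - k) ** 2 for k, p in zip(known, predicted)) / n)


# Invented validation concentrations (µg/mL) and two sets of predictions.
known_validation = [40.0, 30.0, 20.0]
full_spectrum_pred = [41.0, 29.0, 22.0]
ga_selected_pred = [40.5, 30.0, 19.5]

rmsep_full = rmsep(known_validation, full_spectrum_pred)
rmsep_ga = rmsep(known_validation, ga_selected_pred)
print(f"full spectrum: {rmsep_full:.3f}, GA-selected: {rmsep_ga:.3f}")
```

A lower RMSEP for the GA-selected predictions would be read as improved predictive ability, which is the kind of comparison made across the five models.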
This was attributable to the GA choosing the wavelengths best related to the concentration of each component. Building ANN models with those chosen wavelengths therefore gives more accurate predictions of the concentrations of the desired component. The proposed methods were successfully applied to the analysis of Edarbyclor ® tablets (Table 5). In contrast, PLS and PCR delivered inferior results compared with the other three models (recovery percent around 92 and RSD above 4.3, as shown in Table 5). The standard addition technique was applied to evaluate the validity of the proposed methods (Table 6). The results of the CHT and AZM determination in bulk powder and commercial tablets were compared statistically with those of the reference first-derivative spectrophotometric method [11] (Table 7). The calculated "F" and "t" values were lower than the tabulated ones, except for PLS and PCR in the determination of CHT and AZM in commercial tablets (the t- and F-test values were higher than the tabulated ones for AZM, and the t value was higher than the tabulated one for CHT).
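The figures of merit used in this comparison (recovery percent, RSD, and the F and t statistics) can be computed as sketched below. The study performed these tests in Excel; the pooled-variance form of the t statistic, the function names, and the recovery values here are assumptions made for illustration only.

```python
import math
import statistics


def recovery_percent(found, claimed):
    """Recovery as a percentage of the claimed (or added) amount."""
    return 100.0 * found / claimed


def rsd_percent(values):
    """Relative standard deviation in percent (sample standard deviation)."""
    return 100.0 * statistics.stdev(values) / statistics.mean(values)


def f_and_t(sample_a, sample_b):
    """Variance-ratio F (larger over smaller) and pooled two-sample t statistic.
    Assumes both samples have non-zero variance."""
    na, nb = len(sample_a), len(sample_b)
    va, vb = statistics.variance(sample_a), statistics.variance(sample_b)
    f_value = max(va, vb) / min(va, vb)
    pooled = ((na - 1) * va + (nb - 1) * vb) / (na + nb - 2)
    diff = abs(statistics.mean(sample_a) - statistics.mean(sample_b))
    t_value = diff / math.sqrt(pooled * (1.0 / na + 1.0 / nb))
    return f_value, t_value


# Invented recovery percentages for a proposed and a reference method.
proposed = [99.8, 100.4, 100.1, 99.6, 100.6]
reference = [100.2, 99.9, 100.5, 99.7, 100.3]

f_value, t_value = f_and_t(proposed, reference)
print(f"RSD = {rsd_percent(proposed):.2f}%, F = {f_value:.2f}, t = {t_value:.2f}")
```

The calculated F and t values are then compared with tabulated critical values at the chosen significance level; values below the tabulated ones indicate no significant difference between the proposed and reference methods.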

Conclusions
This study presented five chemometric methods (PCR, PLS, GA-PLS, ANN, and GA-ANN) for resolving AZM, CHT, and AZ in bulk powder and pharmaceutical dosage form. The influence of GA as a pre-processing step for ANN and PLS was highlighted. The results indicate that the proposed GA-PLS, ANN, and GA-ANN methods can be classified as accurate, precise, and sensitive. These advantages demonstrate the potential of the suggested methods for the quality control of AZM, CHT, and AZ in laboratories that do not have liquid chromatographic instruments. Moreover, the findings suggest that intelligent chemometric approaches can be used to analyze pharmaceutical products with simple and inexpensive instruments such as the UV spectrophotometer, even when the spectrum of the interferent is highly overlapping.

Conflicts of interest:
The authors declare no conflict of interest.
Data availability statement: All data generated or analyzed during this study are included in this published article (and its supplementary information files).