Toxicity and severe adverse effects are the primary cause of drug-candidate failures at the late stages of preclinical and clinical trials. Since most xenobiotics undergo biotransformations, their interaction with human organism reveals the effects produced by parent compounds and all metabolites. To increase the chances of successful drug development, estimation of the entire toxicity for drug substance and its metabolites is necessary for filtering out the potentially toxic compounds. We proposed the computational approach to the integral evaluation of xenobiotics’ toxicity based on the structural formula of the drug-like compound. In the framework of this study, the consensus QSAR model was developed based on the analysis of over 3000 compounds with information about their rat acute toxicity for intravenous route of administration. Four different numerical methods, estimating the integral toxicity, were proposed, and their comparative performance was studied using the external evaluation set consisting of 37 structures of drugs and 200 their metabolites. It was shown that, on the average, the best correspondence between the predicted and published data is obtained using the method that takes into account the estimated characteristics for both the parent compound and its most toxic metabolite.
In the 1960s, one of the pathbreakers in the field of drug metabolism postulated that the study of the fate of foreign compounds in human and animals plays an increasingly important role in the assessment of drugs and other chemicals safety in the environment. Such study helps to increase our understanding of the toxic action of such substances and should lead to the production of more effective and less toxic materials for human . Toxicity and serious adverse effects are the primary cause of failures at the late stage of drug development. By the end of the XX century, the development of new drugs was halted since the drug-candidates revealed various toxic effects observed in 16% of cases compared to all cases of the projects stopped . Today, in the second decade of the 21st century this figure has reached 40% . Toxic effects on human may be caused not only by a xenobiotic itself and its major final metabolites but also by the intermediate and final metabolites formed in trace quantities. For example, the toxicity of felbamate is caused by its bioactive metabolites appeared due to the biotransformation through aldehyde dehydrogenase and conjugation of glutathione (GSH) pathways. These hydroxylation pathways are the five-fold factors in rats than in humans, and rats accumulate 4–5 fold less glucuronide, carboxylic acid, and mercapturate derived from the GSH conjugate; that is why the felbamate toxicity may not be observed in preclinical studies .
To decrease the risks of drug failure, when a significant amount of human, financial and temporal resources are already gone, estimation of the entire toxicity of drug substance and all its metabolites is necessary at the early stages of drug discovery. Such estimation provides the basis for selecting the most prospective candidates among the available hits and allows for further study of compounds, filtering out those with the high probability of toxic and adverse effects. Medicinal chemistry in the field of drug discovery and development uses in silico approaches .
Estimation of adverse effects and toxicity of drug-like compounds may be carried out by different computational methods based on the structural formula of the molecule under study , , .
Moreover, a few methods for prediction of drug metabolism are developed. They allow estimating the interaction of the molecule under study with the particular biotransformation enzymes, identification of the most probable sites of biotransformation, and generation of putative metabolite structures , , , .
We use the term “integral toxicity” that reflects the measure of the toxic effect of the parent compound and its metabolites. Four different numerical methods, estimating the integral toxicity, are proposed in this article. Based on our earlier experience regarding the drug metabolism prediction , , , ,  as well as the estimation of rodents acute toxicity , , we showed that it is possible to create the computational approach, which combines the prediction of metabolites with estimation of integral toxicity , . In the scientific literature published, no approach of such integral evaluation of xenobiotics’ toxicity has been found.
In this study, we carefully examine four different numerical methods of integral toxicity estimation and evaluate their comparative performance using the external evaluation set consisting of 37 drug structures and about 200 their metabolites. The best correspondence between the predicted and published data is found for the method that takes into account the estimated characteristics for the parent compound and its most toxic metabolite. Then, using the antiepileptic drug felbamate as a case study, we demonstrated the practical feasibility of this approach.
Materials and methods
Drug metabolism prediction
To predict the particular metabolites and generate the complete metabolic tree, we used the computational method developed earlier, which is based on the prediction of possible classes of nine biotransformation reactions and the subsequent estimation of metabolic sites (SOM) for each reaction. We considered nine classes of biotransformation reactions: aliphatic and aromatic hydroxylation, N- and O-glucuronidation, N-, S- and C-oxidation, and N- and O-dealkylation. These reactions are catalyzed by five major isoforms of human cytochromes P450s (1A2, 2C19, 2C9, 2D6, 3A4) and by human UDP-glucuronosyltransferase. The training set was created using information about drug-like compounds’ biotransformations from the Biovia Metabolite database . The procedure of metabolite generation includes three steps: (1) prediction of the class of reaction for the entire molecule; (2) prediction of SOM for each reaction class; (3) calculation of the probability of metabolite formation. The probability of metabolites formation is calculated using the Bayesian approach based on the analyses of “structure-biotransformation reactions” and “structure-modified atoms” relationships. By a stepwise application of this procedure, we obtained a set of metabolites generated at each particular stage of biotransformation with the estimated probabilities of their formation. This method is described in detail in our publications , , , .
Rat acute toxicity prediction
We have earlier proposed the method for rodents’ acute toxicity prediction and considered its applicability for prediction of rat acute toxicity under four routes of administration: oral, intravenous, intraperitoneal and subcutaneous . The method involves the electrotopological quantitative neighbourhoods of atoms (QNA) , atom centric substructural multilevel neighbourhoods of atoms (MNA) descriptors, topological length and volume of molecules for representation of structures and self-consistent regression for establishing of QSAR . As shown earlier, the method proposed provides the reasonable accuracy of prediction and better performance comparing to the US EPA T.E.S.T. program . This method was implemented in the GUSAR program  used in this study. For a detailed description of this method, see , . The on-line version of web-service for prediction of LD50 values of rat acute toxicity with four routes of administration is freely available via the web interface .
In this work, we consider only one intravenous route of administration because in this case the immediate toxic reaction induced by the interaction of a drug-like compound with the organism is observed. Furthermore, the proper statistical parameters regarding both the R2 and coverage values had been previously found for the intravenous route of administration .
To extend the applicability domain of QSAR model for prediction of rat acute intravenous toxicity, we significantly increased the training set using data from the BIOVIA Toxicity Database . Currently, it includes the information about 3632 chemical structures with the data on 3965 LD50 values of rat acute intravenous toxicity. We systematize the data in accordance with the current “good QSAR modeling practice” ,  by removing salts, mixtures, inorganic compounds, and polymers. The median LD50 value was calculated for those structures, which had more than one experimental LD50 value. To provide the comparability of toxicity data, we transformed the toxicity end-points values from the commonly applied in toxicology mg/kg units to the log10(1/LD50) (mol/g) representation of LD50 values in mol/g. The final training set included 2978 structures.
GUSAR created 320 QSAR models varying parameters of using QNA and MNA descriptors. The complete description of models is presented in supplementary materials. All these models were used as a single averaged consensus model. Table 1 shows the statistical parameters of the model.
aNumber of compounds.
bMean value of log10(1/LD50) (mol/g) values in the set.
cIntervals between the minimal and maximal log10(1/LD50) (mol/g) values in the set.
dStandard deviation of log10(1/LD50) (mol/g) values in the set.
Both Table 1 and the plot in Fig. 1 with the observed vs. predicted LD50 values for the training set (Fig. 1) demonstrate a reasonable range of the toxicity end-point values and reasonable “goodness-of-fit” characteristics.
To validate the quality of the QSAR model developed for the rat acute intravenous toxicity and to compare the performance of four different numerical parameters of the integral toxicity estimation, we created the external test set included 37 drug structures from the 22 ChEMBL database release . The experimental data on metabolic reactions of these drugs, structures of about 200 of their metabolites and the LD50 values for the parent compounds (rats, intravenous route of administration), were extracted from the BIOVIA Toxicity Database . The list of these drugs and their LD50 values calculated as a median lethal dose in the N independent studies in rats when administered intravenously are given in Table 2.
|No||Compound name||ChEMBL ID||Nm||Exp||Pc||M1||M2||M3||M4|
|27||Penicillin g sodium||CHEMBL1430||3||5.07||5.18||5.60||5.56||5.73||5.54|
Nm, Number of metabolites.
Exp, Experimental value of median lethal dose studied in rats by intravenous administration [log10(1/LD50) (mol/g) i.v. rats].
Pc, Prediction value of toxicity, calculated for parent compound, without taking into account compound biotransformations.
M1, Prediction value of integral toxicity, calculated by method 1. (The effect of all metabolites on integral toxicity without taking into account toxicity of the parent compound.)
M2, Prediction value of integral toxicity, calculated by method 2. (The effect of all metabolites and taking into account toxicity of the parent compound on integral toxicity.)
M3, Prediction value of integral toxicity, calculated by method 3. (The effect of the most toxic metabolite on integral toxicity.)
M4, Prediction value of integral toxicity, calculated by method 4. (The effect of the most toxic metabolite and the parent compound on integral toxicity.)
Results and discussion
Methods proposed for evaluation of integral toxicity
The definitive goal of our studies is to improve the general quality of the toxicity risk assessment of drug-like compounds taking into account the complexity of the effects produced in an organism by a particular drug and all its metabolites.
We assume that the doses of metabolites (Di) may be represented from the dose of the initial compound (D) and probabilities (Pi) of its metabolite formation as follows:
Based on the probability of metabolite formation and their acute toxicity values calculated by our method , we proposed four different methods for calculating the integral toxicity.
Method 1. The integral toxicity is calculated as the effect of toxicity of all metabolites without taking into account the toxicity of the parent compound. It is assumed that each of the parent compounds is metabolized and all its metabolites are formed with an equal probability Pi=1/N (N is an amount of metabolites). We estimate an integral LD50 value LDM1 using the LD50 values predicted for metabolites (LDi) as follows:
Method 2. The integral toxicity is calculated as the toxic effects of the parent compounds and all its metabolites. We assumed that the parent compound is not completely metabolized and may be considered together with its metabolites. We estimate an integral LD50 value LDM2 using the LD50 values predicted for metabolites (LDi) and for the parent compound (LD0) as follows:
Method 3. The integral toxicity is calculated as the toxic effects of the most toxic metabolite. We assumed that the effect of the most toxic metabolite is crucial, even if it is formed in a small quantity. We estimate an integral LD50 value LDM3 as follows:
Method 4. The integral toxicity is calculated as the toxic effects of the most toxic metabolite and the parent compound. We assumed that both effects of the most toxic metabolite and the parent compound are important. We estimate an integral LD50 value LDM4 as follows:
Thus, for each of the parent compounds from the test set, we calculated all the four values mentioned above and compared them with the experimental LD50 values (rats, i.v.).
Results of integral toxicity estimation for the test set
Predicted values of rat acute intravenous toxicity for the test set estimated using four different methods are presented in Table 2.
To compare the performance of the integral toxicity estimation, we calculated the R2 and RMSE values characterizing the conformity between the experimental and predicted data presented in log(mol/g) (Table 3). From the data presented in Table 3, we can conclude that the values of the R2 test are equal or exceed 0.759 and the RMSE test values are <0.487 for all four methods.
It is clear from the data presented in Table 3 that the relative performance of different methods arranged in the ascending order, the R2 values are as follows: M1<M2<Parent compound<M3<M4. In terms of RMSE the order is slightly different: M2<M4<M1<M3<Parent compound. Thus, we can conclude that the utilization of the M4 method (the effect of the most toxic metabolite and the parent compound on integral toxicity) provides the best performance. The accuracy obtained is close to the accuracy of estimation that takes into account the toxicity of the parent compound. It may be explained by the fact that the measured experimental values of LD50 (i.v. rats) of the training set already include the hidden values of metabolite toxicity and the most toxic metabolites play a major role.
The results show that the consideration of the LD50 values predicted for both the parent compound and the most toxic metabolite (method 4) provides the better estimates of integral toxicity. The prediction results obtained by method 1 (consideration of the predicted acute toxicity for all metabolites without taking into account those of the parent compound) are less accurate. It allows concluding that the role of the parent compound in integral toxicity is significant in most cases. The values of toxicity obtained by methods 1 and 2 are lower in comparison with those obtained by the other three methods for the test set. It may be explained by the commonly known fact that metabolism usually leads to a decrease in xenobiotics toxicity. An opposite situation is observed in the case of method 3, when the toxicity of the most toxic metabolite is used as the estimate of integral toxicity.
The example of the paroxetine metabolic pathway (see Fig. 2) and the results obtained (see Table 4) for antidepressant drug paroxetine are presented below.
|Parent compound and its metabolites||Predicted LD50 values, mg/kg|
|Paroxetine (experimental value)||30.0|
|Paroxetine (predicted value)||43.1|
|Integral toxicity calculated by method 1||32.0|
|Integral toxicity calculated by method 2||33.8|
|Integral toxicity calculated by method 3||28.4|
|Integral toxicity calculated by method 4||34.2|
As can be seen from the data presented in Table 4, the predicted acute toxicity for paroxetine significantly exceeds the experimental LD50 value. Consideration of the paroxetine biotransformation leads to a decrease in the predicted values in all cases; the best correspondence between the predicted and experimental values is observed for method 3 that estimates the integral toxicity on the basis of the values predicted for the most toxic metabolites. The integral toxicity calculated by method 4 that takes into consideration not only the toxicity of the most toxic metabolite but also the predicted toxicity of the parent compound does not show the best performance due to the low toxicity predicted for the parent compound.
A possibility of the practical usefulness of the integral toxicity prediction is demonstrated in the case study of the antiepileptic drug felbamate mentioned above. Prediction of rat acute toxicity for felbamate (Table 5) makes it possible to suggest that this compound is low toxic. However, the felbamate biotransformation pathway presented in Fig. 3 provides fewer values of felbamate integral toxicity predicted by both methods 3 and 4 (Table 5). According to these estimates, the compound belongs to a more toxic third class of acute intravenous toxicity , which corresponds to the experimental results obtained in human but was not observed in rodents.
|Parent compound and its metabolites||Predicted LD50 values, mg/kg|
|Integral toxicity calculated by method 1||60.1|
|Integral toxicity calculated by method 2||61.9|
|Integral toxicity calculated by method 3||30.5|
|Integral toxicity calculated by method 4||34.0|
Prediction results of the integral toxicity obtained for felbamate directly correspond to the cause of its withdrawal , . Felbamate was approved in 1993 as an antiepileptic drug. However, it was soon withdrawn due to the cases of aplastic anemia, and hepatic failure resulted in deaths. The manifestation of toxic effects was associated with metabolite 2-phenylpropenal as a reactive electrophile responsible for the toxicities of felbamate observed.
The recent estimates confirmed that the safety issues remain a serious factor of drug failures in phase II and phase III of clinical trials . Computer-aided predictions of the metabolic pathways and toxicity may help to detect potentially dangerous molecules at the early stages of drug development. In this study, we investigated four methods of estimating the integral toxicity based on the prediction of rat acute intravenous toxicity of the parent compound and all its metabolites. Using an independent test set of 37 drugs we have shown that, on the average, the best correspondence between the prediction and experiment is found when both the toxicity of the parent compound and its the most toxic metabolite are taken into consideration. In a case study of the known antiepileptic drug paroxetine, we have demonstrated the potential usefulness of the approach proposed. Based on the estimation of integral toxicity, the approach proposed can help to identify whether a molecule actually belongs to a higher toxicity class compared to the method that directly evaluate the toxicity of only one paroxetine molecule. Interestingly, that in the case of paroxetine, the computational prediction of rat acute toxicity provides more accurate results that those obtained in the experimental studies of toxicity in rodents.
Based on the investigations performed, we have added the calculation of integral toxicity option to our freely available web-service MetaTox , which generates the structures of metabolites and builds the metabolic trees.
A collection of invited papers based on presentations at the XX Mendeleev Congress on General and Applied Chemistry (Mendeleev XX), held in Ekaterinburg, Russia, September 25–30, 2016.
Funding source: Russian Science Foundation
Award Identifier / Grant number: 14-15-00449
Funding statement: The Russian Science Foundation (grant No. 14-15-00449) has supported the work.
 R. T. Williams. Pure Appl. Chem. 18, 129 (1969).10.1351/pac196918010129Search in Google Scholar PubMed
 P. J. Sinko. Curr. Opin. Drug Discov. Devel. 2, 42 (1999).Search in Google Scholar
 M. J Waring, J. Arrowsmith, A. R. Leach, P. D. Leeson, S. Mandrell, R. M. Owen, G. Pairaudeau, W. D. Pennie, S. D. Pickett, J. Wang, O. Wallace, A. Weir. Nat. Rev. Drug Discov. 14, 475 (2015).10.1038/nrd4609Search in Google Scholar PubMed
 F. P. Guengerich, J. S. MacDonald. Chem. Res. Toxicol.20, 344 (2007).10.1021/tx600260aSearch in Google Scholar PubMed
 P. W. Erhardt. Pure Appl. Chem.74, 703 (2002).10.1351/pac200274050703Search in Google Scholar
 A. Zakharov, A. Lagunin. “Computational toxicology in drug discovery: Opportunities and limitations”, in Application of Computational Techniques in Pharmacy and Medicine. L. Gorb, V. Kuz’min, E. Muratov (Eds.), pp. 325–367, Springer, Dordrecht, Netherlands (2014).10.1007/978-94-017-9257-8_11Search in Google Scholar
 S. M. Ivanov, A. A. Lagunin, V. V. Poroikov. Drug Discov. Today.21, 58 (2016).10.1016/j.drudis.2015.07.018Search in Google Scholar PubMed
 W. Evangelista, R. L. Weir, S. R. Ellingson, J. B. Harris, K. Kapoor, J. C. Smith, J. Baudry. Bioorg. Med. Chem.24, 4928 (2016).10.1016/j.bmc.2016.07.064Search in Google Scholar PubMed PubMed Central
 J. Kirchmair, A. H. Göller, D. Lang, J. Kunze, B. Testa, I. D. Wilson, R. C. Glen, G. Schneider. Nat. Rev. Drug Discov.14, 387 (2015).10.1038/nrd4581Search in Google Scholar PubMed
 L. Olsen, C. Oostenbrink, F. S. Jørgensen. Adv. Drug Deliv. Rev.86, 61 (2015).10.1016/j.addr.2015.04.020Search in Google Scholar PubMed
 V. M. Bezhentsev, O. A. Tarasova, A. V. Dmitriev, A. V. Rudik, A. A. Lagunin, D. A. Filimonov, V. V. Poroikov. Russ. Chem. Rev.85, 854 (2016).10.1070/RCR4614Search in Google Scholar
 B. Testa, A. L. Balmat, A. Long. Pure Appl. Chem.76, 907 (2004).10.1351/pac200476050907Search in Google Scholar
 Y. Borodina, A. Rudik, D. Filimonov, N. Kharchevnikova, A. Dmitriev, V. Blinova, V. Poroikov. J. Chem. Inf. Comput. Sci.44, 1998 (2004).10.1021/ci049834hSearch in Google Scholar PubMed
 Y. Borodina, A. Sadym, D. Filimonov, V. Blinova, A. Dmitriev, V. Poroikov. J. Chem. Inf. Comput. Sci.43, 1636 (2003).10.1021/ci034078lSearch in Google Scholar PubMed
 A. V. Rudik, A. V. Dmitriev, A. A. Lagunin, D. A. Filimonov, V. V. Poroikov. J. Chem. Inf. Model.54, 498 (2014).10.1021/ci400472jSearch in Google Scholar PubMed
 A. Rudik, A. Dmitriev, A. Lagunin, D. Filimonov, V. Poroikov. Bioinformatics31, 2046 (2015).10.1093/bioinformatics/btv087Search in Google Scholar PubMed
 A. V. Rudik, A. V. Dmitriev, A. A. Lagunin, D. A. Filimonov, V. V. Poroikov. J. Cheminform.8, 68 (2016).10.1186/s13321-016-0183-xSearch in Google Scholar PubMed PubMed Central
 A. A. Lagunin, A. V. Zakharov, D. A. Filimonov, V. V. Poroikov. SAR QSAR Environ. Res.18, 285 (2007).10.1080/10629360701304253Search in Google Scholar PubMed
 A. Lagunin, A. Zakharov, D. Filimonov, V. Poroikov. Mol. Inform.30, 241 (2011).10.1002/minf.201000151Search in Google Scholar PubMed
 A. V. Rudik, V. M. Bezhentsev, A. V. Dmitriev, D. S. Druzilovskiy, A. A. Lagunin, D. A. Filimonov, V. V. Poroikov. J. Chem. Inform. Model.57, 638 (2017).10.1021/acs.jcim.6b00662Search in Google Scholar PubMed
 Web-service MetaTox, http://www.way2drug.com/mg, accessed 2016-12-04.Search in Google Scholar
 BIOVIA Metabolite Database, http://accelrys.com/products/collaborative-science/databases/bioactivity-databases/biovia-metabolite.html, accessed 2016-12-04.Search in Google Scholar
 D. A. Filimonov, A. V. Zakharov, A. A. Lagunin, V. V. Poroikov. SAR QSAR Environ. Res.20, 679 (2009).10.1080/10629360903438370Search in Google Scholar PubMed
 Web-service GUSAR online, http://www.way2drug.com/gusar/acutoxpredict.html, accessed 2016-12-04.Search in Google Scholar
 BIOVIA Toxicity Database, http://accelrys.com/products/collaborative-science/databases/bioactivity-databases/biovia-toxicity.html, accessed 2016-12-04.Search in Google Scholar
 D. Fourches, E. Muratov, A. Tropsha. J. Chem. Inf. Model.50, 1189 (2010).10.1021/ci100176xSearch in Google Scholar PubMed PubMed Central
 D. Fourches, E. Muratov, A. Tropsha. Nat. Chem. Biol.11, 535 (2015).10.1038/nchembio.1881Search in Google Scholar PubMed
 ChEMBL Database, https://www.ebi.ac.uk/chembl/, accessed 2016-12-04.Search in Google Scholar
 Paroxetine metabolism pathway from ChEMBL Database, https://www.ebi.ac.uk/chembl/compound/metabolism/1362484, accessed 2016-12-04.Search in Google Scholar
 A. V. Lyubimov. Encyclopedia of Drug Metabolism and Interactions, Vol. 4, p. 76, Wiley (2012).10.1002/9780470921920Search in Google Scholar
 I. Berezovskaya. Pharm. Chem. Journ. (article in Russian). 37, 3 (2003).10.1023/A:1024586630954Search in Google Scholar
 C. Ioannides, D. F. Lewis. Curr. Top. Med. Chem.4, 1767 (2004).10.2174/1568026043387188Search in Google Scholar PubMed
 R. K. Harrison. Nat. Rev. Drug. Discov.15, 817 (2016).10.1038/nrd.2016.184Search in Google Scholar PubMed
The online version of this article (DOI: https://doi.org/10.1515/pac-2016-1205) offers supplementary material, available to authorized users.
©2017 IUPAC & De Gruyter. This work is licensed under a Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 International License. For more information, please visit: http://creativecommons.org/licenses/by-nc-nd/4.0/