
Physical Sciences Reviews

Ed. by Giamberini, Marta / Jastrzab, Renata / Liou, Juin J. / Luque, Rafael / Nawab, Yasir / Saha, Basudeb / Tylkowski, Bartosz / Xu, Chun-Ping / Cerruti, Pierfrancesco / Ambrogi, Veronica / Marturano, Valentina / Gulaczyk, Iwona


Chemical space of naturally occurring compounds

Fernanda I. Saldívar-González
  • Corresponding author
  • Department of Pharmacy, School of Chemistry, Universidad Nacional Autónoma de Mexico, Av. Universidad 3000, Mexico City 04510, Mexico
B. Angélica Pilón-Jiménez
  • Department of Pharmacy, School of Chemistry, Universidad Nacional Autónoma de Mexico, Av. Universidad 3000, Mexico City 04510, Mexico
José L. Medina-Franco
  • Corresponding author
  • Department of Pharmacy, School of Chemistry, Universidad Nacional Autónoma de Mexico, Av. Universidad 3000, Mexico City 04510, Mexico
Published Online: 2018-12-04 | DOI: https://doi.org/10.1515/psr-2018-0103


The chemical space of naturally occurring compounds is vast and diverse. Other than biologics, naturally occurring small molecules include a large variety of compounds, covering natural products from sources such as plants, marine organisms, and fungi, to name a few, as well as several food chemicals. The systematic exploration of the chemical space of naturally occurring compounds has significant implications in many areas of research, including but not limited to drug discovery, nutrition, and bio- and chemical diversity analysis. The coverage and diversity of the chemical space of compound databases can be explored in different ways. The approach depends largely on the criteria used to define the chemical space, which are commonly selected based on the goals of the study. This chapter discusses major compound databases of natural products and the cheminformatics strategies that have been used to characterize their chemical space. Recent exemplary studies of the chemical space of natural products from different sources, and of their relationships with other compounds, are also discussed. We also present novel chemical descriptors and data mining approaches that are emerging to characterize the chemical space of naturally occurring compounds.

Keywords: biodiversity; BioFacQuim; cheminformatics; consensus diversity plots; drug discovery; foodinformatics; molecular diversity; natural products



About the article

Citation Information: Physical Sciences Reviews, Volume 4, Issue 5, 20180103, ISSN (Online) 2365-659X, DOI: https://doi.org/10.1515/psr-2018-0103.

© 2019 Walter de Gruyter GmbH, Berlin/Boston.
