BY 4.0 license Open Access Published by De Gruyter April 21, 2020

An enumeration of natural products from microbial, marine and terrestrial sources

  • Fidele Ntie-Kang and Daniel Svozil
From the journal Physical Sciences Reviews


The discovery of a new drug is a multidisciplinary and very costly task. One of the major steps is the identification of a lead compound, i.e. a compound that has a certain degree of potency and can be chemically modified to improve its activity, metabolic properties, and pharmacokinetic profile. Terrestrial sources (plants and fungi), microbes and marine organisms are abundant resources for the discovery of new structurally diverse and biologically active compounds. In this chapter, an attempt has been made to quantify the number of known published chemical structures (available in chemical databases) from natural sources. Emphasis has been laid on the number of unique compounds, the most abundant compound classes and the distribution of compounds in terrestrial and marine habitats. Recent investigations indicate that ~500,000 known natural products (NPs) exist in the literature. About 70 % of all NPs come from plants, terpenoids being the most represented compound class (except in bacteria, where amino acids, peptides, and polyketides are the most abundant compound classes). About 2,000 NPs have been co-crystallized in PDB structures.

1 Introduction

In the quest to discover new drugs, researchers have often resorted to natural sources, e.g. plants, marine organisms, bacteria and fungi [1, 2]. This is because these organisms are known to host sophisticated metabolic pathways that have led to complex and intriguing chemical structures. The existence of such structures could hardly have been imagined by any chemist, had nature not synthesized them. Besides, some naturally occurring compounds have required several decades to be synthesized, even after the entire chemical structure had been elucidated [3, 4, 5]. Such compounds are often the products of secondary metabolism in higher organisms, fungi, and microbes. Thus, they are often referred to as secondary metabolites (or natural products).

An attempt has been made to classify secondary metabolites (SMs) according to their structural diversity, bioactivity and ecological functions [6]. In that work, the main natural product (NP) classes were examined according to their metabolic building blocks, e.g. alkaloids, fatty acids, polyketides, phenylpropanoids, aromatic polyketides, and terpenoids. This included a discussion of the structural diversity of NP classes using the scaffold approach, focusing on the characteristic carbon frameworks.

Several key questions still, however, remain to be answered:

  1. How many naturally occurring compounds are currently known?

  2. How diverse are they?

  3. What are the most frequently occurring chemical scaffolds and functional groups (FGs) among secondary metabolites?

  4. Which of the main pool of NPs (marine, terrestrial or microbial) is the most promising source of new and biologically active compounds?

  5. What proportion of the biosphere is still unexplored in terms of organisms and in terms of their metabolite contents?

In summary, what future lies ahead of us in terms of the coverage of the chemical space of secondary metabolites?

It must be mentioned that, although investigations of several NP datasets versus synthetically obtained chemical libraries (SOCLs) have shown that NPs often occupy a much wider chemical space [7, 8, 9, 10, 11, 12, 13], the number of known unique compounds obtained from synthetic chemistry laboratories far exceeds the number obtained from natural sources [14].

This can be illustrated by comparing the number of synthetically derived compounds in the ZINC [15] and PubChem [16] databases with the number obtained biosynthetically, i.e. NPs (Table 1).

Table 1:

The number of NPs versus synthetically obtained compounds currently in the ZINC and PubChem databases.

Database | # SCs (a, *) | # NPs (b, *) | Ratio | Reference
ZINC | 94,774,466 | 856,096 | 110:1 | [15]
PubChem | 97,997,240 | 2,760 | 35,500:1 | [16]
  1. (a) Approximate number of synthetic compounds; (b) Approximate number of natural products; (*) These are not the numbers of unique compound entries, since the number of unique compounds might be much lower than these.

The ratios are 110:1 for ZINC and 35,500:1 for PubChem, showing that chemists have gone much further in mastering the art of making new desired compounds than in learning how to identify and obtain compounds made by nature’s machinery. Besides, although the use of traditional medicinal preparations from source organisms containing NPs is widespread, the art of engineering organisms to make the required compounds in the quantities relevant for drug discovery, with the goal of treating several millions of patients, is relatively new [17, 18].
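The ratios quoted above follow directly from the counts in Table 1. As a minimal sketch (the dictionary layout and the function name are illustrative choices, not part of the original analysis), the arithmetic can be reproduced as follows; the exact quotients lie slightly above the rounded figures of 110:1 and 35,500:1 given in the text.

```python
# Counts copied from Table 1 (approximate, not deduplicated).
TABLE_1 = {
    "ZINC": {"synthetic": 94_774_466, "natural": 856_096},
    "PubChem": {"synthetic": 97_997_240, "natural": 2_760},
}


def sc_to_np_ratio(synthetic: int, natural: int) -> float:
    """Return the number of synthetic compounds listed per natural product."""
    return synthetic / natural


# One ratio per database, keyed by database name.
ratios = {name: sc_to_np_ratio(**counts) for name, counts in TABLE_1.items()}

for name, ratio in ratios.items():
    print(f"{name}: about {ratio:,.0f} synthetic compounds per NP")
```

Because the underlying counts are annotation counts rather than unique structures (see the footnote to Table 1), these ratios should be read as orders of magnitude only.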

With the renewed interest in NPs as sources of new compounds for the discovery of drugs, agrochemicals, cosmetics, etc., several databases and datasets have been collected and made available to the scientific community, either freely or commercially. Table 2 provides a summary of selected NP databases with the most abundant compound annotations (~3,000 or more).

Table 2:

Selected NP databases including ≥3,000 molecular annotations.

Database | # mol. (a) | Origin | Reference
Ambinter and Greenpharma* | ~6 k | Diverse | [19]
Analyticon Discovery MEGx* | ~3 k | Diverse | [20]
AntiBase** | ~40 k | Fungi and microbes | [21]
AntiMarin** (AntiBase + MarinLit) | ~60 k | Marine, fungi and microbes | [22]
Chemical Abstracts Services (CAS) | ~283 k | Diverse | [23]
ConMedNP | ~3 k | Mostly plants | [24]
DMNP** | ~55 k | Marine | [25]
DNP** | ~300 k | Diverse | [26]
HIT | ~5 k | Plants | [27]
iSMART | ~20 k | Mostly plants | [28]
MarinLit | ~29 k | Marine | [29]
NANPDB | ~5 k | Mostly plants | [30]
Natural Products Atlas | ~21 k | Fungi and microbes | [31]
NPASS | ~35 k | Mostly plants | [32]
NPs in PubChem Substance | ~3 k | Diverse | [16, 33]
NPs in ZINC (UEFS) | ~900 k | Diverse | [15]
NPs with known producing organism | ~186 k | Diverse | [34]
OpenNP | ~67 k | Diverse | [35]
Readily obtainable NPs | ~26 k | Diverse | [35]
Reaxys** | ~220 k | Diverse | [36]
StreptomeDB | ~4 k | Bacteria | [37]
SuperNatural II*** | ~326 k | Diverse | [38]
TCM Database@Taiwan | ~53 k | Mostly plants | [39]
TCMID | ~13 k | Plants | [40]
TIPdb | ~9 k | Plants | [41, 42]
UNPD | ~229 k | Mostly plants | [43]
  1. *Compound samples available; **Commercial database; ***Data could be available for collaborative projects; (a) Approximate number of molecular or sample annotations.

Although Table 2 provides a rough idea of the number of known NPs available in the literature, it should be mentioned that this is only based on data that have been curated and included in chemical databases. Non-curated chemical compound data (or data in the process of being curated), as well as compounds in uninvestigated species, have not been included in such estimations. A clearer picture would only be possible if parallel information on the number of known terrestrial, marine, fungal and microbial species that have been completely investigated were made available. This is far beyond the scope of this review.

There have been several attempts to quantify the number of NPs available in nature, by first attempting to provide an approximation of the number of NPs available in the literature:

  1. In his investigation of bioactive microbial metabolites János Bérdy first intuitively estimated the number of NPs to be about 1 million [44].

  2. Although not supported by strong experimental evidence and without providing details of the approximation method, the same author later estimated the number of published NPs to be between 300,000 and double that number, i.e. 600,000 [45].

  3. Blunt et al. also published a similar estimate the same year, based on information available in NP databases [46].

  4. Based on information currently available in the most advanced NP databases like the CAS registry [23], DNP [26], Reaxys [36], SuperNatural II [38] and the UNPD [43] (see Table 2), such a number could be expected to be >300,000 compounds, knowing that the compound annotations from NPs in ZINC (UEFS) rather represent data donated collectively from several vendors (without removing duplicates and including compounds derived from semi-synthesis).

  5. A recent attempt by Chen et al. [35] to collect all known NPs in commercial and freely available virtual databases, including vendor libraries and removing duplicates, led to about 250,000 NPs, not including compound annotations from SuperNatural II and Reaxys. If one also added compounds from fossils and defined them as NPs [47], one could easily place the number within Bérdy’s later estimate of between 300,000 and 600,000 NPs (still an estimate!).

  6. The most recent analysis of NPs included in currently available (free and commercial) databases, by Zeng et al., placed the number at ~470,000 NPs [32], which is still within Bérdy’s later estimate of 300,000 to 600,000.
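Several of the estimates above depend on removing duplicates when databases are merged. The following is a minimal sketch of that counting step, assuming each structure has already been reduced to a canonical text key (for example an InChIKey); the database names, the keys and the trivial normalisation are hypothetical toy values, and the actual pipelines used in the cited studies are not described here.

```python
from typing import Dict, Iterable, Set


def count_unique(collections: Dict[str, Iterable[str]]) -> int:
    """Count distinct structure keys across several compound collections."""
    unique: Set[str] = set()
    for keys in collections.values():
        # Strip whitespace and unify case so that formatting differences
        # between sources do not create spurious "new" compounds.
        unique.update(key.strip().upper() for key in keys)
    return len(unique)


# Hypothetical toy collections; the keys stand in for canonical identifiers.
toy = {
    "database_A": ["KEY-001", "KEY-002", "KEY-003"],
    "database_B": ["key-002", "KEY-004"],
    "vendor_C": ["KEY-001 ", "KEY-004", "KEY-005"],
}

total_annotations = sum(len(keys) for keys in toy.values())
unique_compounds = count_unique(toy)
print(f"{total_annotations} annotations, {unique_compounds} unique compounds")
```

In this toy case eight annotations collapse to five unique compounds, which illustrates why summed annotation counts overstate the number of known NPs.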

This chapter reviews the literature with the aim of providing recent numbers of known or published SMs from the aforementioned major sources of NPs. The discussion is subdivided into several parts, the first summarizing the major efforts from industry and academia towards a universal collection (hence enumeration) of NPs. We then attempt an enumeration by countries/regions of the world, by biological activities, by compound types and according to the major sources of NPs (plants, marine, fungal and microbial), together with the geographical distribution of organisms producing SMs in the marine environment. The last sections focus on the enumeration of NPs present in the major data resources for food chemicals and metabolites in the human body. The discussion is solely based on information available in databases or data collected, analysed and published by various research groups.

2 Attempts to obtain numbers by the generation of universal natural compound libraries

2.1 Enumeration of compound annotations in commercial libraries

2.1.1 The Dictionary of Natural Products (DNP)

To date, the DNP (although only commercially available) is regarded as one of the largest and most comprehensive compilations of compounds from natural sources [35, 48]. The latest version of the DNP (v. 27.1) provides data for ~300,000 NPs, including their physicochemical and biological properties, their systematic and common names, literature references, molecular structures, and origins (e.g. family, genus, and species names) [48]. An analysis of the current content of the DNP showed that:

  1. Of the ~195,000 SMs with available information on the compound origin, most SMs are derived from plants (almost 70 %), animals and bacteria (Figure 1a).

  2. Terpenoids and alkaloids are the two most abundant chemical classes of NPs in plants, animals, fungi, and bacteria put together, representing more than half of all compounds isolated from plants (Figure 1b).

  3. Compositae and Leguminosae are the plant families with the highest number of SMs identified (Figure 2).

  4. Among all kingdoms, NPs isolated from Streptomyces spp. were largely represented.

  5. The most abundant bioactive NPs are of bacterial, botanical and fungal sources (Figure 3).

Figure 1:

Summary of the contents of the DNP; (a) distribution of natural products per kingdom of life; (b) distribution of the main chemical classes of natural products in each kingdom of life [48]. Figure reproduced by permission.

Figure 2:

The top 15 plant families containing NPs and the distribution of the different chemical classes [48]. Figure reproduced by permission.

Figure 3:

The top 10 biological activities exhibited by NPs and their distribution by kingdom of life [48]. Figure reproduced by permission.

However, it could be noted that certain classes of NPs were conspicuously absent from certain kingdoms, e. g. steroids were seldom noted in bacteria, while flavonoids and lignans were almost exclusively seen in plant species and polyketides were almost exclusively found in bacteria, fungi, and protists (Figure 1b) [48].

Animals are the second most productive kingdom in terms of NPs (about 25 % of DNP compounds with source species information), and most compounds produced by animals were found in snake venoms. The reader is referred to the recently published Snake Venom Database (SVDB) [49] for further reading.

2.1.2 Natural products in the Chemical Abstracts Services (CAS) directory

With about 285,000 NPs, the CAS Registry (the search engine of this database is known as SciFinder) has been the gold standard for chemical substance information. This database currently includes >149 million unique and diverse organic and inorganic molecules and substances, e.g. alloys, coordination compounds, minerals, mixtures, polymers and salts, and >67 million sequences, more than any other similar database [23]. The CAS Registry Number® is universally used as a unique, unmistakable identifier for chemical substances. The CAS Registry information is updated daily and includes literature references to the substance, experimental and predicted property data (e.g. boiling and melting points), CA Index Names and synonyms, commercial availability of compounds/substances, preparative methods, spectra, regulatory information from international sources, etc.

2.1.3 Reaxys

This is a vast commercial database that includes about 220,000 NP annotations [36]. Apart from supporting chemistry research, including pharmaceutical development, the chemicals industry, and academic research, Reaxys provides integrated access to the eMolecules database containing more than 8 million unique molecules, including screening compounds and building blocks, from over 150 commercial suppliers, which is updated weekly.

2.2 Academic and other open-access efforts

The quantification of available NPs has also been assisted by efforts from academic groups, often providing the information in open access. A collection of freely accessible small molecule databases is available [50]. The most important efforts focused on NPs are described herein.

2.2.1 The Berlin collection

This is the SuperNatural II database, published and regularly updated by Robert Preissner’s group at Charité Berlin [38]. Published in Open Access, the database also includes information on modes of action, pathways and clusters for the entire dataset of currently 325,508 natural products extracted from various resources. It also provides toxicity predictions for the database compounds. A substructure search can be performed to identify compounds containing a given substructure. Additionally, possible target proteins and pathways are predicted for the natural compounds, based on a 2D structural similarity search against known drug molecules.
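The fingerprint type and similarity measure used by SuperNatural II are not specified in this section. As a hedged illustration of how a 2D similarity search against known drugs can work in principle, the sketch below assumes that each molecule is represented by a set of "on" fingerprint bits and uses the Tanimoto coefficient, a common choice for such comparisons; all names and bit sets are hypothetical.

```python
from typing import Dict, FrozenSet, List, Tuple


def tanimoto(a: FrozenSet[int], b: FrozenSet[int]) -> float:
    """Tanimoto coefficient: shared bits divided by bits set in either set."""
    if not a and not b:
        return 0.0  # convention chosen here for two empty fingerprints
    shared = len(a & b)
    return shared / (len(a) + len(b) - shared)


def rank_by_similarity(
    query: FrozenSet[int], references: Dict[str, FrozenSet[int]]
) -> List[Tuple[str, float]]:
    """Return reference molecules sorted from most to least similar."""
    scores = [(name, tanimoto(query, bits)) for name, bits in references.items()]
    return sorted(scores, key=lambda item: item[1], reverse=True)


# Hypothetical fingerprints: each integer is the index of a set bit.
query_np = frozenset({1, 4, 7, 9})
known_drugs = {
    "drug_X": frozenset({1, 4, 7, 12}),
    "drug_Y": frozenset({2, 3}),
    "drug_Z": frozenset({1, 4, 7, 9}),
}

ranking = rank_by_similarity(query_np, known_drugs)
for name, score in ranking:
    print(f"{name}: {score:.2f}")
```

In a target-prediction setting, the targets annotated for the top-ranked drugs would then be proposed as candidate targets for the query compound, usually above some similarity threshold.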

2.2.2 The Hamburg collection

The efforts of researchers from Johannes Kirchmair’s group in Hamburg led to a collection of close to 250,000 unique NPs from a wide range of commercial and freely available virtual databases, excluding data from SuperNatural II and Reaxys [35]. It was reported that only about 10 % of the unique NPs had readily available samples, either from compound vendors or via collaborations.

2.2.3 The UNPD collection

This database, currently including about 229,000 NP entries, was originally assembled with the aim of obtaining a “universal” database of NPs [43], with ~118,000 unique compounds from Reaxys [36], ~42,000 from the Chinese Natural Product Database (CNPD) [51], ~7,500 from the Traditional Chinese Medicines Database (TCMD) [52] and ~30,000 from the Chinese Herbal Drugs Database (CHDD) [53], after removing duplicates. Although the goal was to obtain a “universal” collection of NPs, this dataset ended up mostly including compounds from plants used in Traditional Chinese Medicine (TCM) and does not include NPs from bacterial and marine sources.

2.2.4 The Natural Product Activity and Species Source (NPASS) collection

NPASS is a freely accessible database focused on the biological activities and source species of NPs, currently including >35,000 NPs [32]. The strength of NPASS is the availability of information on NPs derived from 25,041 species with activities against 5,863 targets (including 2,946 proteins, 1,352 microbial species and 1,227 cell lines). The database also includes 446,552 quantitative activity records (e.g. IC50, Ki, EC50, GI50 or MIC values) in relation to 222,092 NP-target pairs and 288,002 NP-species pairs.

2.2.5 The PubChem and ZINC collection

PubChem [16] and ZINC [15] are the most frequently used and cited small molecule databases in the computer-aided drug design and virtual screening community, because of the availability of compound vendor or supplier information, in addition to the fact that the included data are freely accessible. When counting the NPs in ZINC and PubChem, care must be taken because their open access policy makes duplicates and errors in compound structures inevitable.

2.3 Enumerating natural product sample collections

Several NP sample collections are available within academic groups and from vendors [35], e.g. Ambinter and Greenpharma [19], Analyticon Discovery MEGx [20], etc. However, the number and individual quantities of available samples are often very limited [17]. Although Chen et al. stated that about 10 % of NPs could be directly available through vendors and academic collaborations [35], a major challenge in determining the exact number of available NP samples is that such information is seldom published, not properly organized, inaccessible in existing NP databases, or not constantly updated as the samples are used up. A reference case for the efficient development and management of NP samples is provided by Nature Bank and the Queensland Compound Library (QCL), hosted at the Eskitis Institute for Drug Discovery, Australia [54], which currently holds about 3,250 compounds [55] and started off as a collection of about 800 samples [56]. Nature Bank is a comprehensive collection of mainly terrestrial plants (from Queensland, China, and Papua New Guinea) and marine invertebrates (from the Great Barrier Reef and Tasmania). Meanwhile, QCL is the Australian national resource for compound management and logistics. The two institutions work closely together. The Bioinformatics Institute (Singapore) currently hosts 2,500 compounds and 340,000 crude extracts from 37,000 plants, while the ChromaDex library includes about 3,000 compounds from 1,640 plants [55]. The largest currently available screening library with compounds of natural origin is that of InterBioScreen (Moscow) [57], which includes over 68,000 NPs. Originally, the main contributors of natural compounds and derivatives were research institutes of the former Soviet Union, but the library now includes contributions from Japan, Europe, and the USA. About 13,000 natural and synthetic in-stock building blocks are also available [55, 57].

3 Number of unique molecules in the country and regional datasets

The numbers of NPs included in country or regional databases are summarized in Table 3. These datasets are much smaller, and most of their data have already been included in the majority of the aforementioned databases [35, 38].

Table 3:

Number of NPs included in the country/regional datasets.

Database | Country/region | # mol. | Origin | Reference
AfroDb | Africa | ~1 k | Plants | [58]
BIOFACQUIM | Mexico | ~400 | Diverse | [59]
ChemDP | Pakistan | ~1 k |  | [60]
CamMedNP | Cameroon | ~1.5 k | Mostly plants | [61]
ConMedNP | Central Africa | ~3 k | Mostly plants | [24]
EARNPDB | Eastern Africa |  | Plants | unpublished
Indonesian NPs | Indonesia | ~6,800 | Plants | [62]
MAPS Database | India | >1,200 | Plants | [63]
Mitishamba | Kenya |  | Plants | [64]
NANPDB | Northern Africa | ~5 k | Mostly plants | [30]
NUBBEDB | Brazil | ~3 k | Mostly plants | [65, 66]
Panama NPs | Panama | ~400 | Diverse | [67]
SANCDB | South Africa | ~700 | Diverse organisms | [68]
Phytochemica | Himalaya | ~1 k | Mostly plants | [69]
TCM Database@Taiwan | Taiwan | ~53 k | Mostly plants | [39]
TCMID | China | ~10 k | Plants | [40]
TIPdb | Taiwan | ~9 k | Plants | [41, 42]
TM-MC | Northeast Asia | ~26 k | Medicinal materials | [70]
VIETHERB | Vietnam | ~11 k | Plants | [71]
WADB | West Africa | ~1 k | Plants | unpublished

4 Number of unique molecules in disease or therapeutic use datasets

These are summarized in Table 4 and also represent much smaller numbers, almost never exceeding 1,000 compound annotations per dataset, apart from the datasets related to traditional Chinese medicine, which could also be classified under the country/regional databases. The latter datasets are larger because their compounds are not related to a specific therapeutic class but are derived from plants with many diverse uses.

Table 4:

Number of NPs included in therapeutic use datasets.

Database | Disease/use/characteristic | # mol. (a) | Origin | Reference
AfroCancer | Cancer | ~400 | Plants | [72]
AfroMalariaDB | Malaria | ~500 | Plants | [73]
Afrotryp | Human African trypanosomiasis | ~300 | Plants | [74]
Antimalarial NPs | Malaria | ~1 k | Diverse | [75, 76, 77]
Ayurveda | Ayurvedic medicine | ~1 k |  | [78]
BioPhytMol | Mycobacterial infections | ~600 |  | [79]
CHMIS-C | Bind to anticancer targets | ~9 k |  | [80]
CVDHD | Cardiovascular diseases | ~35 k |  | [81]
NPACT | Cancer | ~1.5 k | Mostly plants | [82]
NPCARE | Cancer | ~6.5 k |  | [83]
SVDB | Snake venom | ~700 | Snake venom | [49]
TCM-ID | Traditional Chinese medicine | ~13 k |  | [40]
TCMSP | Traditional Chinese medicine | ~29 k |  | [84, 85]
TIPdb | Antiplatelet, anticancer and antitubercular | ~9 k | Plants | [41, 42]
  1. (a) Approximate number of molecular annotations.

5 Enumeration by compound types

Beyond the previously described classification of NPs in the DNP into compound classes [48], very little effort has been put into the development of databases focused on compound classes; the exceptions are the Carotenoids Database [86] and the ongoing project to develop a database of flavonoids [87]. At the moment, it would be quite tedious to obtain exact numbers of NPs per compound class, as such information is only included in a few datasets [26, 30, 72, 73, 74].

6 Global enumeration by major natural product pools

6.1 The challenge of classifying natural products into terrestrial, marine and microbial origins

It is quite challenging to get a clear-cut demarcation of natural product sources into terrestrial, marine and microbial for several reasons:

  1. Plants, animals and microbes occur in both terrestrial and aquatic habitats.

  2. Apart from the challenging efforts to develop “universal” databases [23, 26, 32, 36, 38, 43] and databases of marine-based metabolites [25, 29], NP databases are often developed based on types of species, e.g. plant-based [24, 28], animal-based [50], microbial [31, 37], etc., by geographical regions (Table 3), by disease/therapeutic uses (Table 4) or by compound classes (Table 5).

Table 5:

Number of NPs included in compound class datasets.

Database | Compound class | # mol. (a) | Weblink | Origin | Reference
Carotenoids Database | Carotenoids | ~1,100 |  | Mostly plants | [86]
Flavonoid Database | Flavonoids | Not determined | Ongoing | Plants | [87]
ProCarDB | Bacterial carotenoids | 304 |  | Bacteria | [88, 89]
  1. (a) Approximate number of molecular annotations.

6.1.1 Databases of natural products from terrestrial plants

In addition to the enumeration of plant-based SMs from diverse countries (Table 3), the Collective Molecular Activities of Useful Plants (CMAUP) database, which also includes the molecular activities of useful plants, was recently published to cover plants growing in terrestrial habitats [90]. However, no database with a universal collection of plant-based SMs exists [91]. The CMAUP database currently includes 5,645 plants (including 2,567 medicinal plants, 170 food plants, 1,567 edible plants, 3 agricultural plants, and 119 garden plants). These were collected from 153 countries and regions and include 47,645 plant ingredients active against 646 targets in 234 KEGG pathways associated with 2,473 gene ontologies and 656 diseases.

6.1.2 Databases of natural products from fungal species

Most databases that could enable the enumeration of NPs from fungal species are combined with those from microbes (Table 6), e.g. AntiBase [21], AntiMarin [22], and the Natural Products Atlas [31]. A specialized dataset of compounds exclusive to saprophytic fungi such as mushrooms currently includes only about 1,100 NPs [92]. However, this does not include compounds from plant endophytes and parasitic fungi.

Table 6:

Number of NPs from fungal sources.

Database | # mol. (a) | Reference
AntiBase | ~40 k | [21]
AntiMarin | ~60 k | [22]
Natural Products from Mushrooms | ~1,100 | [92]
Natural Products Atlas | ~21 k | [31]
  1. (a) Approximate number of molecular annotations.

6.2 Natural products of microbial origin

Compounds from bacteria and protists do not currently represent a significant proportion of the NPs included in the DNP [48]. However, the Natural Products Atlas was designed to cover all microbially derived natural products published in the peer-reviewed primary scientific literature [31]. This includes about 21,000 bacterial, fungal and cyanobacterial compounds, as well as NPs from lichens, mushrooms and other higher fungi. It excludes compounds from plants, invertebrates or other higher organisms, unless these have also been explicitly identified from a microbial source. Compounds from marine macroalgae and diatoms are also excluded. Pye et al. [91] collected and analysed such data starting from a dataset comprising all published microbial and marine-derived natural products from the period 1941–2015, obtained from the commercial database AntiMarin [22]. They then combined this with data for the period 2012–2015, obtained through manual curation of all published articles from a large panel of journals in the chemistry and chemical biology arena. This resulted in a collection of 40,229 NPs. It was not, however, mentioned how many of these compounds were of microbial origin. The StreptomeDB, for example, currently contains 4,040 NPs that have been biosynthesized by 2,584 bacterial strains from the genus Streptomyces [37]. A recent collection of compounds from cyanobacteria alone led to the identification of 578 NPs distributed between the three major environmental sources, i.e. marine, terrestrial and freshwater [93]. Crüsemann et al. recently developed a rapid method, based on molecular networks, which was applied to 603 samples from 146 marine Salinispora and Streptomyces strains [94]. The method generated ~1.8 million mass spectra, although it was not specified how many SMs this might correspond to [94].

6.3 Natural products of marine origin

Marine organisms are quite varied and include phytoplankton, green, brown, and red algae, sponges, cnidarians, bryozoans, molluscs, tunicates, echinoderms, mangroves and other intertidal plants, and microorganisms. Again, it is quite challenging to enumerate marine NPs separately from microbial NPs, since many NP-producing microbes are marine-based, so the most advanced databases focusing on such metabolites, e.g. AntiBase [21], AntiMarin [22], DMNP [25], and MarinLit [29], are often combined. SMs from marine organisms are known to have several implications in medicinal chemistry [95, 96, 97, 98, 99, 100], although only 8 marine NPs have to date been approved as drugs, while 12 marine-derived metabolites are currently in different phases of clinical trials [1, 101, 102, 103]. Since, for example, most currently used antibiotics have been isolated from terrestrial microbes, accounting for more than 75 % of all antibiotics discovered [44, 104], the marine environment remains an untapped source of new bioactive molecules. For this reason, several thousands of SMs have been collected from marine sources, which could enable us to enumerate the available NPs from marine sources.

6.3.1 Marine databases of natural products

In addition to terrestrial organisms, which remain a promising source of new bioactive metabolites, the marine environment (which covers approximately 70 % of the earth’s surface) represents largely unexplored biodiversity [105]. Apart from the aforementioned commercial databases specialized in marine metabolites, Lei and Zhou published a dataset of 6,000 chemical compounds derived from over 10,000 marine-derived materials, including information on the source organisms (mainly coelenterates, sponges and blue algae) and biological activities of each compound [98]. Another collection, focused on metabolites from red algae of the genus Laurencia, showed that a total of 1,047 secondary metabolites with carbocyclic skeletons (sesquiterpenes, diterpenes, triterpenes, acetogenins, indoles, aromatic compounds, steroids) and miscellaneous compounds had been published for this genus up to 2015 [105]. A detailed analysis and enumeration of NPs from cyanobacteria by biological activities has also been provided by Burja et al. [106]. Davis et al. also published a publicly accessible Seaweed Metabolite Database (SWMD), currently containing >1,100 compounds, mostly from red algae of the genus Laurencia (Ceramiales, Rhodomelaceae) [107]. The authors made an extra effort to include the geographical origin of the seaweed, the extraction method and detailed chemical information on each metabolite.

6.3.2 An analysis of the geographical distribution of marine source organisms

A detailed comparison between the physicochemical properties of NPs from marine and terrestrial sources has recently been published [108], and a summary will be provided in the next chapter. In addition, Principe and Fisher recently reviewed a collection of data associated with 298 pharmacological products originating from marine biota during the past 47 years [97]. The products were developed from 232 different marine species belonging to 15 phyla, obtained from 1,296 collections of specimens from 69 countries and from all 7 continents (Table 7). An investigation of the spatial distribution of the sample collection sites (Figure 4) provides a map of where and when the specimens that yielded marine natural products (MNPs) with pharmacological potential (for which the clinical status is reported) were collected. The goal of the study was not to obtain a representative sample of chemical structures or geographic locations, but rather to identify locations that yield MNPs with demonstrated or potential value. This also led to the identification of the species from which those MNPs had been isolated and the locations where the specimens yielding those MNPs had been collected. The data collected covered 298 MNPs (i.e. 16 FDA-approved drugs, 55 compounds in clinical testing, 51 compounds in preclinical testing and 176 lead compounds or probes) from 1,296 specimen collections. The spatial distribution of such data around Bramble Reef (North East Australia) is shown in Figure 4 [97].

Table 7: Collections by phylum (a) [97].

| Phylum | Number of collections | Percentage |
| --- | --- | --- |
| Actinobacteria (b) | 40 | 3.1 % |
| Bryozoa (c) | 18 | 1.4 % |
| Cnidaria (d) | 31 | 2.4 % |
| Cyanobacteria (e) | 77 | 5.9 % |
| Dinoflagellata (f) | 13 | 1.0 % |
| Echinodermata (g) | 12 | 0.9 % |
| Fish | 29 | 2.2 % |
| Fungi | 3 | 0.2 % |
| Green Algae | 8 | 0.6 % |
| Hemichordata (h) | 2 | 0.2 % |
| Mollusca | 195 | 15.1 % |
| Nemertea (i) | 3 | 0.2 % |
| Porifera (j) | 716 | 55.2 % |
| Rhodophyta (k) | 4 | 0.3 % |
| Tunicata (l) | 145 | 11.2 % |

(a) Fucoxanthin is found in most or all species of the classes Phaeophyceae (brown algae), Chrysophyceae (golden algae) and Bacillariophyceae (diatoms) in the phylum Ochrophyta, plus some species in the phyla Dinoflagellata and Haptophyta, possibly as many as 16,000 species in total, which are not included in this analysis.
(b) Gram-positive bacteria.
(c) Mostly colonial filter feeders.
(d) Includes octocorals.
(e) Formerly known as blue-green algae.
(f) Mostly marine plankton.
(g) Includes starfish, sea urchins, and sea cucumbers.
(h) Acorn worms.
(i) Ribbon worms.
(j) Sponges.
(k) Red algae.
(l) Tunicates.
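The percentage column of Table 7 can be recomputed from the collection counts. The sketch below is a minimal illustration, assuming that each percentage is simply the phylum's count divided by the total number of collections; the function name `percentage_shares` is hypothetical. Recomputed values may differ from the printed ones in the last digit (Mollusca, for example), presumably because of rounding in the source.

```python
# Collection counts per phylum, transcribed from Table 7 [97].
collections_by_phylum = {
    "Actinobacteria": 40, "Bryozoa": 18, "Cnidaria": 31,
    "Cyanobacteria": 77, "Dinoflagellata": 13, "Echinodermata": 12,
    "Fish": 29, "Fungi": 3, "Green Algae": 8, "Hemichordata": 2,
    "Mollusca": 195, "Nemertea": 3, "Porifera": 716,
    "Rhodophyta": 4, "Tunicata": 145,
}


def percentage_shares(counts):
    """Return each count as a percentage of the total, rounded to one decimal."""
    total = sum(counts.values())
    return {name: round(100 * n / total, 1) for name, n in counts.items()}


total_collections = sum(collections_by_phylum.values())
shares = percentage_shares(collections_by_phylum)

# Print the phyla from the most to the least frequently collected.
for name, pct in sorted(shares.items(), key=lambda kv: kv[1], reverse=True):
    print(f"{name:<16}{collections_by_phylum[name]:>6}{pct:>7.1f} %")
print(f"total collections: {total_collections}")
```

The counts sum to the 1,296 specimen collections quoted in the text, and sponges (Porifera) alone account for more than half of them.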

Figure 4: An example of the Google Earth Database showing collections made from the Great Barrier Reef near Townsville, Queensland, Australia [97]. Figure reproduced by permission.

6.4 Natural products from food plants

Food-based chemicals include primary metabolites and secondary metabolites. Table 8 shows some databases of food-based compounds and their approximate numbers of components.

Table 8: Number of NPs from some food sources.

| Database | Origin | No. of molecules (a) | Reference |
| --- | --- | --- | --- |
| DFC | Natural food components and additives | ~30 k | [109] |
| EuroFIR-BASIS | Phytochemicals in foods | ~260 | [110] |
| FooDB | Diverse food sources | ~28 k | [111] |
| GRAS | Flavour chemicals | ~2,300 | [112, 113] |
| NutriChem | Plant-based foods and phytochemicals | ~8 k | [114, 115, 116] |
| Phenol-Explorer | Polyphenols in foods | ~500 | [117, 118] |
| SuperScent | Components of flavours and aromas | ~2,300 | [119] |
| SuperSweet | Sweet compounds | ~8 k | [120] |
| TMDB | Tea (Camellia spp.) | ~1,400 | [121] |
| USDA | Food Components | | [122] |

(a) Approximate number of molecular annotations.

The commercially available Dictionary of Food Compounds (DFC) is the most advanced collection of compounds contained in foods, currently holding more than 30,000 compounds [109], while FooDB holds about the same number of compounds [111]. Although several chemoinformatics analyses of the components of foods, particularly phytochemicals and natural additives present in or added to foods, have been reported [123, 124, 125, 126, 127, 128, 129, 130, 131, 132], the most advanced open access resource for natural compounds in foods is the NutriChem 1.0 server [114], a database generated by text mining of 21 million MEDLINE abstracts, with information that links plant-based foods with their small molecule components and human disease phenotypes. This server contains text-mined data corresponding to 18,478 pairs of 1,772 plant-based foods and 7,898 phytochemicals, along with 6,242 pairs of 1,066 plant-based foods and 751 diseases. Predicted associations for 548 phytochemicals and 252 diseases are also included. This tool provides an up-to-date foundation for a mechanistic understanding of the consequences of eating behaviours on health [115, 116].
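To illustrate how pairwise food–phytochemical and food–disease records of the kind described above can be combined, the following Python sketch joins two toy pair lists on the shared food and collects, for every phytochemical–disease pair, the foods that support it. This is a naive co-occurrence join given for illustration only; it is not the prediction method used by NutriChem, and all names (`candidate_links`, `food_A`, `compound_X`, `disease_1`, and so on) are hypothetical placeholders rather than entries from the server.

```python
from collections import defaultdict

# Toy placeholder pairs; these are NOT records from NutriChem.
food_compound_pairs = [
    ("food_A", "compound_X"),
    ("food_A", "compound_Y"),
    ("food_B", "compound_X"),
]
food_disease_pairs = [
    ("food_A", "disease_1"),
    ("food_B", "disease_1"),
    ("food_B", "disease_2"),
]


def candidate_links(food_compound, food_disease):
    """Map each (compound, disease) pair to the set of foods linking them."""
    # Index the compounds reported for each food.
    compounds_by_food = defaultdict(set)
    for food, compound in food_compound:
        compounds_by_food[food].add(compound)

    # For every food-disease pair, attach the disease to the food's compounds.
    support = defaultdict(set)
    for food, disease in food_disease:
        for compound in compounds_by_food.get(food, ()):
            support[(compound, disease)].add(food)
    return dict(support)


links = candidate_links(food_compound_pairs, food_disease_pairs)
for (compound, disease), foods in sorted(links.items()):
    print(f"{compound} - {disease}: supported by {sorted(foods)}")
```

In such a scheme, the number of supporting foods could serve as a crude ranking signal; any real association would still need statistical and experimental validation.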

6.5 Human metabolites

Metabolites in humans have been collected into the Human Metabolome Database (HMDB) [133, 134, 135, 136], a freely available resource containing detailed information on small molecule metabolites found in the human body, mainly:

  1. chemical data,

  2. clinical data and

  3. molecular biology/biochemistry data.

It currently contains 114,083 metabolite entries, including both water-soluble and lipid-soluble metabolites, as well as metabolites that would be regarded as either abundant (>1 μM) or relatively rare (<1 nM). Moreover, the metabolite entries have been linked to 5,702 protein sequences, chemical/clinical data, and enzymatic or biochemical data. Many data fields are hyperlinked to other databases (e. g. KEGG [137], PubChem [16], MetaCyc [138], ChEBI [139], PDB [140], UniProt [141] and GenBank [142]) and to a variety of structure and pathway viewing tools. Another utility of the HMDB is its links with additional databases like DrugBank [143], T3DB [144, 145], SMPDB [146, 147] and FooDB [111], which contain information on ~2,280 drugs and drug metabolites, ~3,670 common toxins and environmental pollutants, pathway diagrams for ~25,000 human metabolic and disease pathways, and food components and food additives, respectively [136].

7 Conclusions

Owing to the important roles that NPs play in the pharmaceutical, cosmetics and food industries, the extent of their potential for exploration in these areas has long been a matter of inquiry. Several attempts have been made to quantify the number of molecules made by nature. This chapter has focused on estimating this number. The discussion has been based on recently published data on compounds produced in plants, bacteria, marine organisms and humans, or contained in food substances. It has been estimated that more than 450,000 NPs exist in the literature, the majority being of plant origin. However, most bioactive NPs are of marine origin, although the marine environment still remains largely unexplored. A major limitation to the exploitation of NPs in large-scale drug discovery remains the availability of samples, since NPs are quite hard to synthesize and only about 10 % of known NPs have samples readily available for screening. It is hoped that the automatic structure elucidation of metabolites [148] and the exploration of genomic data from NP-producing organisms will revolutionize the world of NP drug discovery.


F.N.K. would also like to acknowledge the European Structural and Investment Funds, OP RDE-funded project “ChemJets” (No. CZ.02.2.69/0.0/0.0/16_027/0008351). D.S. was supported by the Ministry of Education, Youth and Sports of the Czech Republic (project number LM2018130) and by RVO 68378050-KAV-NPUI.


[1] Harvey AL. Natural products in drug discovery. Drug Discov Today. 2008;13:894–901. DOI: 10.1016/j.drudis.2008.07.004.

[2] Rodrigues T, Reker D, Schneider P, Schneider G. Counting on natural products for drug design. Nat Chem. 2016;8:531. DOI: 10.1038/nchem.2479.

[3] Carreira EM. Natural products synthesis: a personal retrospective and outlook. Israel J Chem. 2018;58:114–21. DOI: 10.1002/ijch.201700127.

[4] Lear MJ, Hirai K, Ogawa K, Yamashita S, Hirama M. A convergent total synthesis of the kedarcidin chromophore: 20-years in the making. J Antibiot (Tokyo). 2019. DOI: 10.1038/s41429-019-0175-y.

[5] Ortholand J-Y, Ganesan A. Natural products and combinatorial chemistry: back to the future. Curr Opin Chem Biol. 2004;8:271–80. DOI: 10.1016/j.cbpa.2004.04.011.

[6] Abegaz BM, Kinfe HH. Secondary metabolites, their structural diversity, bioactivity, and ecological functions: an overview. Phys Sci Rev. 2018. DOI: 10.1515/psr-2018-0100.

[7] Feher M, Schmidt JM. Property distributions: differences between drugs, natural products, and molecules from combinatorial chemistry. J Chem Inf Comput Sci. 2003;43:218–27. DOI: 10.1021/ci0200467.

[8] Larsson J, Gottfries J, Muresan S, Backlund A. ChemGPS-NP: tuned for navigation in biologically relevant chemical space. J Nat Prod. 2007;70:789–94. DOI: 10.1021/np070002y.

[9] Singh N, Guha R, Giulianotti MA, Pinilla C, Houghten RA, Medina-Franco JL. Chemoinformatic analysis of combinatorial libraries, drugs, natural products, and molecular libraries small molecule repository. J Chem Inf Model. 2009;49:1010–24. DOI: 10.1021/ci800426u.

[10] Rosén J, Gottfries J, Muresan S, Backlund A, Oprea TI. Novel chemical space exploration via natural products. J Med Chem. 2009;52:1953–62. DOI: 10.1021/jm801514w.

[11] Gu J, Chen L, Yuan G, Xu X. A drug-target network-based approach to evaluate the efficacy of medicinal plants for type II diabetes mellitus. Evid Based Complement Alternat Med. 2013;2013:203614. DOI: 10.1155/2013/203614.

[12] Lachance H, Wetzel S, Kumar K, Waldmann H. Charting, navigating, and populating natural product chemical space for drug discovery. J Med Chem. 2012;55:5989–6001. DOI: 10.1021/jm300288g.

[13] López-Vallejo F, Giulianotti MA, Houghten RA, Medina-Franco JL. Expanding the medicinally relevant chemical space with compound libraries. Drug Discov Today. 2012;17:718–26. DOI: 10.1016/j.drudis.2012.04.001.

[14] Harvey AL, Edrada-Ebel R, Quinn RJ. The re-emergence of natural products for drug discovery in the genomics era. Nat Rev Drug Discov. 2015;14:111–29. DOI: 10.1038/nrd4510.

[15] Sterling T, Irwin JJ. ZINC 15−ligand discovery for everyone. J Chem Inf Model. 2015;55:2324−37. Available at: Accessed: 16 May 2019. DOI: 10.1021/acs.jcim.5b00559.

[16] Kim S, Thiessen PA, Bolton EE, Chen J, Fu G, Gindulyte A, et al. PubChem substance and compound databases. Nucleic Acids Res. 2016;44:D1202–13. DOI: 10.1093/nar/gkv951.

[17] Pan L, Chai HB, Kinghorn AD. Discovery of new anticancer agents from higher plants. Front Biosci (Schol Ed). 2013;4:142–56.

[18] Akone SH, Pham C-D, Chen H, Ola AR, Ntie-Kang F, Proksch P. Epigenetic modification, co-culture and genomic methods for natural product discovery. Phys Sci Rev. 2018. DOI: 10.1515/psr-2018-0118.

[19] AnalytiCon Discovery. Available at: Accessed: 16 May 2019.

[20] Ambinter. Available at: Accessed: 16 May 2019.

[21] Laatsch H. Antibase, version 4.0 – the natural compound identifier (for microbial secondary metabolites as well as higher fungi). Weinheim: Wiley-VCH Verlag GmbH & Co. KGaA, 2012.

[22] Blunt JW, Munro MH, Laatsch H. AntiMarin database. Christchurch, New Zealand: University of Canterbury, 2006.

[23] CAS REGISTRY - The gold standard for chemical substance information. Available at: Accessed: 9 May 2019.

[24] Ntie-Kang F, Onguéné PA, Scharfe M, Owono Owono LC, Megnassan E, Mbaze LM, et al. ConMedNP: a natural product library from Central African medicinal plants for drug discovery. RSC Adv. 2014;4:409–19. Available at: Accessed: 16 May 2019. DOI: 10.1039/C3RA43754J.

[25] Blunt JW, Munro MH, editors. Dictionary of marine natural products with CD-ROM. Boca Raton, FL: Chapman and Hall/CRC, 2007. DOI: 10.1201/9780849382178.

[26] Dictionary of Natural Products (DNP). Available at: Accessed: 16 May 2019.

[27] Ye H, Ye L, Kang H, Tao L, Tang K, Liu X, et al. HIT: linking herbal active ingredients to targets. Nucleic Acids Res. 2011;39:D1055–9. DOI: 10.1093/nar/gkq1165.

[28] Chang K-W, Tsai T-Y, Chen K-C, Yang S-C, Huang H-J, Chang T-T, et al. iSMART: an integrated cloud computing web server for traditional Chinese medicine for online virtual screening, de novo evolution and drug design. J Biomol Struct Dyn. 2011;29:243–50. DOI: 10.1080/073911011010524988.

[29] Blunt JW, Munro MH. MarinLit: A database of the marine natural products literature. Available at: Accessed: 20 Nov 2017.

[30] Ntie-Kang F, Telukunta KK, Döring K, Simoben CV, Moumbock AF, Malange YI, et al. NANPDB: a resource for natural products from Northern African sources. J Nat Prod. 2017;80:2067−76. Available at: Accessed: 16 May 2019. DOI: 10.1021/acs.jnatprod.7b00283.

[31] Natural Product Atlas. Available at:

[32] Zeng X, Zhang P, He W, Qin C, Chen S, Tao L, et al. NPASS: natural product activity and species source database for natural product research, discovery and tool development. Nucleic Acids Res. 2018;46:D1217–22. DOI: 10.1093/nar/gkx1026.

[33] Hao M, Cheng T, Wang Y, Bryant HS. Web search and data mining of natural products and their bioactivities in PubChem. Sci China Chem. 2013;56:1424−35. DOI: 10.1007/s11426-013-4910-0.

[34] Ertl P, Schuhmann T. A systematic cheminformatics analysis of functional groups occurring in natural products. J Nat Prod. 2019;82:1258–63. DOI: 10.1021/acs.jnatprod.8b01022.

[35] Chen Y, de Bruyn Kops C, Kirchmair J. Data resources for the computer-guided discovery of bioactive natural products. J Chem Inf Model. 2017;57:2099−111. DOI: 10.1021/acs.jcim.7b00341.

[36] Reaxys; Elsevier: New York. Available at: (accessed Jul 17, 2017). Accessed: 9 May 2019.

[37] Klementz D, Döring K, Lucas X, Telukunta KK, Erxleben A, Deubel D, et al. StreptomeDB 2.0 - an extended resource of natural products produced by streptomycetes. Nucleic Acids Res. 2016;44:D509−14. Available at: Accessed: 16 May 2019. DOI: 10.1093/nar/gkv1319.

[38] Banerjee P, Erehman J, Gohlke B-O, Wilhelm T, Preissner R, Dunkel M. Super Natural II − a database of natural products. Nucleic Acids Res. 2015;43:D935−9. Available at: Accessed: 9 May 2019. DOI: 10.1093/nar/gku886.

[39] Chen CY. TCM database@Taiwan: the world’s largest traditional chinese medicine database for drug screening in silico. PLoS One. 2011;6:e15939. Available at:

[40] Xue R, Fang Z, Zhang M, Yi Z, Wen C, Shi T. TCMID: traditional Chinese medicine integrative database for herb molecular mechanism analysis. Nucleic Acids Res. 2013;41:D1089−95. Available at: Accessed: 16 May 2019. DOI: 10.1093/nar/gks1100.

[41] Lin Y-C, Wang C-C, Chen I-S, Jheng J-L, Li J-H, Tung C-W. TIPdb: a database of anticancer, antiplatelet, and antituberculosis phytochemicals from indigenous plants in Taiwan. Sci World J. 2013;2013:736386. DOI: 10.1155/2013/736386.

[42] Tung C-W, Lin Y-C, Chang H-S, Wang C-C, Chen I-S, Jheng J-L, et al. TIPdb-3D: the three-dimensional structure database of phytochemicals from Taiwan indigenous plants. Database. 2014;2014:bau055. Available at: Accessed: 16 May 2019. DOI: 10.1093/database/bau055.

[43] Gu J, Gui Y, Chen L, Yuan G, Lu H-Z, Xu X. Use of natural products as chemical library for drug discovery and network pharmacology. PLoS One. 2013;8:e62839. Available at: Accessed: 16 May 2019. DOI: 10.1371/journal.pone.0062839.

[44] Bérdy J. Bioactive microbial metabolites. J Antibiot. 2005;58:1–26. DOI: 10.1038/ja.2005.1.

[45] Bérdy J. Thoughts and facts about antibiotics: where we are now and where we are heading. J Antibiot. 2012;65:385–95. DOI: 10.1038/ja.2012.27.

[46] Blunt J, Munro M, Upjohn M. The role of databases in marine natural products research. In: Fattorusso E, Gerwick WH, Taglialatela-Scafati O, editors. Handbook of marine natural products. Dordrecht: Springer, 2012:389–421. DOI: 10.1007/978-90-481-3834-0_6.

[47] Falk H, Wolkenstein K. Natural product molecular fossils. In: Kinghorn D, Falk H, Gibbons S, Kobayashi J, editors. Progress in the chemistry of organic natural products, vol. 104. Cham: Springer, 2017:1–126. DOI: 10.1007/978-3-319-49712-9.

[48] Chassagne F, Cabanac G, Hubert G, David B, Marti G. The landscape of natural product diversity and their pharmacological relevance from a focus on the dictionary of natural products®. Phytochem Rev. 2019. DOI: 10.1007/s11101-019-09606-2.

[49] Hossain M, Haque A, Mazid ZS, Khan A, Ullah TR, Rumee TA, Jesmin. SVDB: a comprehensive domain specific database of snake venom toxins generated through NCBI. Preprints. 2019. DOI: 10.20944/preprints201809.0454.v1.

[50] Sixty-Four Free Chemistry Databases. Available at: Accessed: 09 May 2019.

[51] Shen JH, Xu XY, Cheng F, Liu H, Luo XM, Shen J, et al. Virtual screening on natural products for discovering active compounds and target information. Curr Med Chem. 2003;10:2327–42. DOI: 10.2174/0929867033456729.

[52] He M, Yan XJ, Zhou JJ, Xie GR. Traditional Chinese medicine database and application on the Web. J Chem Inf Comput Sci. 2001;41:273–7. DOI: 10.1021/ci0003101.

[53] Qiao XB, Hou TJ, Zhang W, Guo SL, Xu XJ. 3D structure database of components from Chinese traditional medicinal herbs. J Chem Inf Comput Sci. 2002;42:481–9. DOI: 10.1021/ci010113h.

[54] Camp D, Newman S, Pham NB, Quinn RJ. Nature Bank and the Queensland Compound Library: unique international resources at the Eskitis Institute for drug discovery. Comb Chem & High Throughput Screen. 2014;17:201–9. DOI: 10.2174/1386207317666140109120515.

[55] Ng SB, Kanagasundaram Y, Fan H, Arumugam P, Eisenhaber B, Eisenhaber F. The 160K Natural Organism Library, a unique resource for natural products research. Nat Biotechnol. 2018;36:570–3. Available at:

[56] Quinn RJ, Carroll AR, Pham NB, Baron P, Palframan ME, Suraweera L, et al. Developing a drug-like natural product library. J Nat Prod. 2008;71:464–8. DOI: 10.1021/np070526y.

[57] InterBioScreen (Moscow). Available at: Accessed: 16 May 2019.

[58] Ntie-Kang F, Zofou D, Babiaka SB, Meudom R, Scharfe M, Lifongo LL, et al. AfroDb: a select highly potent and diverse natural product library from African medicinal plants. PLoS One. 2013;8:e78085. Available at: Accessed: 9 May 2019. DOI: 10.1371/journal.pone.0078085.

[59] Pilón-Jiménez BA, Saldívar-González FI, Díaz-Eufracio BI, Medina-Franco JL. BIOFACQUIM: A Mexican compound database of natural products. Biomolecules. 2019;9:E31. DOI: 10.3390/biom9010031.

[60] Mirza SB, Bokhari H, Fatmi MQ. Exploring natural products from the biodiversity of Pakistan for computational drug discovery studies: collection, optimization, design and development of a chemical database (ChemDP). Curr Comput Aided Drug Des. 2015;11:102–9. DOI: 10.2174/157340991102150904101740.

[61] Ntie-Kang F, Mbah JA, Mbaze LM, Lifongo LL, Scharfe M, Hanna JN, et al. CamMedNP: building the Cameroonian 3D structural natural products database for virtual screening. BMC Complement Altern Med. 2013;13:88. DOI: 10.1186/1472-6882-13-88.

[62] Yanuar A, Mun’im A, Lagho AB, Syahdi RR, Rahmat M, Suhartanto H. Medicinal plants database and three dimensional structure of the chemical compounds from medicinal plants in Indonesia. Int J Comput Sci. 2011;8:180–3.

[63] Ashfaq UA, Mumtaz A, Qamar TU, Fatima T. MAPS database: medicinal plant activities, phytochemical and structural database. Bioinformation. 2013;9:993–5. DOI: 10.6026/97320630009993.

[64] Mitishamba Database: A database of natural products from Kenya for drug discovery. Available at: Accessed: 08 May 2019.

[65] Valli M, Dos Santos RN, Figueira LD, Nakajima CH, Castro-Gamboa I, Andricopulo AD, et al. Development of a natural products database from the biodiversity of Brazil. J Nat Prod. 2013;76:439−44. DOI: 10.1021/np3006875.

[66] Pilon AC, Valli M, Dametto AC, Pinto ME, Freire RT, Castro-Gamboa I, et al. NuBBEDB: an updated database to uncover chemical and biological information from Brazilian biodiversity. Sci Rep. 2017;7:7215. DOI: 10.1038/s41598-017-07451-x.

[67] Olmedo DA, González-Medina M, Gupta MP, Medina-Franco JL. Cheminformatic characterization of natural products from Panama. Mol Divers. 2017;21:779–89. DOI: 10.1007/s11030-017-9781-4.

[68] Hatherley R, Brown DK, Musyoka TM, Penkler DL, Faya N, Lobb KA, et al. SANCDB: a South African natural compound database. J Cheminform. 2015;7:29. DOI: 10.1186/s13321-015-0080-8.

[69] Pathania S, Ramakrishnan SM, Bagler G. Phytochemica: a platform to explore phytochemicals of medicinal plants. Database (Oxford). 2015;2015:bav075. DOI: 10.1093/database/bav075.

[70] Kim SK, Nam S, Jang H, Kim A, Lee JJ. TM-MC: a database of medicinal materials and chemical compounds in Northeast Asian traditional medicine. BMC Complement Altern Med. 2015;15:218. DOI: 10.1186/s12906-015-0758-5.

[71] Nguyen-Vo TH, Le T, Pham D, Nguyen T, Le P, Nguyen A, et al. VIETHERB: a database for vietnamese herbal species. J Chem Inf Model. 2019;59:1–9. DOI: 10.1021/acs.jcim.8b00399.

[72] Ntie-Kang F, Nwodo JN, Ibezim A, Simoben CV, Karaman B, Ngwa VF, et al. Molecular modeling of potential anticancer agents from African medicinal plants. J Chem Inf Model. 2014;54:2433–50. DOI: 10.1021/ci5003697.

[73] Onguéné PA, Ntie-Kang F, Mbah JA, Lifongo LL, Ndom JC, Sippl W, et al. The potential of anti-malarial compounds derived from African medicinal plants, part III: an in silico evaluation of drug metabolism and pharmacokinetics profiling. Org Med Chem Lett. 2014;4:6. DOI: 10.1186/s13588-014-0006-x.

[74] Ibezim A, Debnath B, Ntie-Kang F, Mbah CJ, Nwodo NJ. Binding of anti-Trypanosoma natural products from African flora against selected drug targets: a docking study. Med Chem Res. 2017;26:562−79. DOI: 10.1007/s00044-016-1764-y.

[75] Egieyeh S, Syce J, Christoffels A, Malan SF. Exploration of scaffolds from natural products with antiplasmodial activities, currently registered antimalarial drugs and public malarial screen data. Molecules. 2016;21:104. DOI: 10.3390/molecules21010104.

[76] Egieyeh SA, Syce J, Malan SF, Christoffels A. Prioritization of anti-malarial hits from nature: chemo-informatic profiling of natural products with in vitro antiplasmodial activities and currently registered anti-malarial drugs. Malar J. 2016;15:50. DOI: 10.1186/s12936-016-1087-y.

[77] Egieyeh SA, Syce J, Malan S, Christoffels A. Antimalarial drug development from phytomedicine: chemoinformatic and pharmacological studies. Int J Infect Dis. 2014;21S:1–460. DOI: 10.1016/j.ijid.2014.03.777.

[78] Lagunin AA, Druzhilovsky DS, Rudik AV, Filimonov DA, Gawande D, Suresh K, et al. Computer evaluation of hidden potential of phytochemicals of medicinal plants of the traditional Indian ayurvedic medicine. Biomed Khim. 2015;61:286−97. DOI: 10.18097/PBMC20156102286.

[79] Sharma A, Dutta P, Sharma M, Rajput NK, Dodiya B, Georrge JJ, et al. BioPhytMol: a drug discovery community resource on anti-mycobacterial phytomolecules and plant extracts. J Cheminform. 2014;6:46. DOI: 10.1186/s13321-014-0046-2.

[80] Fang X, Shao L, Zhang H, Wang S. CHMIS-C: a comprehensive herbal medicine information system for cancer. J Med Chem. 2005;48:1481−8. DOI: 10.1021/jm049838d.

[81] Gu J, Gui Y, Chen L, Yuan G, Xu X. CVDHD: a cardiovascular disease herbal database for drug discovery and network pharmacology. J Cheminform. 2013;5:51. DOI: 10.1186/1758-2946-5-51.

[82] Mangal M, Sagar P, Singh H, Raghava GPS, Agarwal SM. NPACT: naturally occurring plant-based anti-cancer compound-activity-target database. Nucleic Acids Res. 2013;41:D1124–9. DOI: 10.1093/nar/gks1047.

[83] Choi H, Cho SY, Pak HJ, Kim Y, Choi JY, Lee YJ, et al. NPCARE: database of natural products and fractional extracts for cancer regulation. J Cheminform. 2017;9:2. DOI: 10.1186/s13321-016-0188-5.

[84] Chen X, Zhou H, Liu YB, Wang JF, Li H, Ung CY, et al. Database of traditional Chinese medicine and its application to studies of mechanism and to prescription validation. Br J Pharmacol. 2006;149:1092−103. DOI: 10.1038/sj.bjp.0706945.

[85] Ru J, Li P, Wang J, Zhou W, Li B, Huang C, et al. TCMSP: a database of systems pharmacology for drug discovery from herbal medicines. J Cheminform. 2014;6:13. DOI: 10.1186/1758-2946-6-13.

[86] Kinoshita T, Lepp Z, Kawai Y, Terao J, Chuman H. An integrated database of flavonoids. Biofactors. 2006;26:179–88. DOI: 10.1002/biof.5520260303.

[87] Yabuzaki J. Carotenoids Database: structures, chemical fingerprints and distribution among organisms. Database (Oxford). 2017;2017:bax004. DOI: 10.1093/database/bax004.

[88] Nupur LN, Vats A, Dhanda SK, Raghava GP, Pinnaka AK, Kumar A. ProCarDB: a database of bacterial carotenoids. BMC Microbiol. 2016;16:96. DOI: 10.1186/s12866-016-0715-6.

[89] Johnson EA, Schroeder WA. Microbial carotenoids. Adv Biochem Eng Biotechnol. 1996;53:119–78. DOI: 10.1007/BFb0102327.

[90] Zeng X, Zhang P, Wang Y, Qin C, Chen S, He W, et al. CMAUP: a database of collective molecular activities of useful plants. Nucleic Acids Res. 2019;47:D1118–27. DOI: 10.1093/nar/gky965.

[91] Pye CR, Bertin MJ, Lokey RS, Gerwick WH, Linington RG. Retrospective analysis of natural products provides insights for future discovery trends. Proc Natl Acad Sci U S A. 2017;114:5601–6. DOI: 10.1073/pnas.1614680114.

[92] Maruca A, Moraca F, Rocca R, Molisani F, Alcaro F, Gidaro MC, et al. Chemoinformatic database building and in silico hit-identification of potential multi-targeting bioactive compounds extracted from mushroom species. Molecules. 2017;22:1571. DOI: 10.3390/molecules22091571.

[93] González-Medina M, Medina-Franco JL. Chemical diversity of cyanobacterial compounds: a chemoinformatics analysis. ACS Omega. 2019;4:6229–37. DOI: 10.1021/acsomega.9b00532.

[94] Crüsemann M, O’Neill EC, Larson CB, Melnik AV, Floros DJ, Da Silva RR, et al. Prioritizing natural product diversity in a collection of 146 bacterial strains based on growth and extraction protocols. J Nat Prod. 2017;80:588–97. DOI: 10.1021/acs.jnatprod.6b00722.

[95] Jiménez C. Marine natural products in medicinal chemistry. ACS Med Chem Lett. 2018;9:959–61. DOI: 10.1021/acsmedchemlett.8b00368.

[96] Miller JH, Field JJ, Kanakkanthara A, Owen JG, Singh AJ, Northcote PT. Marine invertebrate natural products that target microtubules. J Nat Prod. 2018;81:691–702. DOI: 10.1021/acs.jnatprod.7b00964.

[97] Principe PP, Fisher WS. Spatial distribution of collections yielding marine natural products. J Nat Prod. 2018;81:2307–20. DOI: 10.1021/acs.jnatprod.8b00288.

[98] Lei J, Zhou J. A marine natural product database. J Chem Inf Comput Sci. 2002;42:742–8. DOI: 10.1021/ci010111x.

[99] Timmermans ML, Paudel YP, Ross AC. Investigating the biosynthesis of natural products from marine proteobacteria: a survey of molecules and strategies. Mar Drugs. 2017;15:E235. DOI: 10.3390/md15080235.

[100] Pereira F, Aires-de-Sousa J. Computational methodologies in the exploration of marine natural product leads. Mar Drugs. 2018;16:E236. DOI: 10.3390/md16070236.

[101] Mayer AM, Rodriguez AD, Taglialatela-Scafati O, Fusetani N. Marine pharmacology in 2009–2011: Marine compounds with antibacterial, antidiabetic, antifungal, anti-inflammatory, antiprotozoal, antituberculosis, and antiviral activities; affecting the immune and nervous systems, and other miscellaneous mechanisms of action. Mar Drugs. 2013;11:2510–73. DOI: 10.3390/md11072510.

[102] Choudhary A, Naughton LM, Montanchez I, Dobson AD, Rai DK. Current status and future prospects of marine natural products (MNPs) as antimicrobials. Mar Drugs. 2017;15:272. DOI: 10.3390/md15090272.

[103] Blunt JW, Copp BR, Keyzers RA, Munro MH, Prinsep MR. Marine natural products. Nat Prod Rep. 2016;33:382–431. DOI: 10.1039/C5NP00156K.

[104] Wohlleben W, Mast Y, Stegmann E, Ziemert N. Antibiotic drug discovery. Microb Biotechnol. 2016;9:541–8. DOI: 10.1111/1751-7915.12388.

[105] Harizani M, Ioannou E, Roussis V. The Laurencia paradox: an endless source of chemodiversity. Prog Chem Org Nat Prod. 2016;102:91–252. DOI: 10.1007/978-3-319-33172-0_2.

[106] Burja AM, Banaigs B, Abou-Mansour E, Burgess JG, Wright PC. Marine cyanobacteria - a prolific source of natural products. Tetrahedron. 2001;57:9347–77. DOI: 10.1016/S0040-4020(01)00931-0.

[107] Davis GD, Vasanthi AH. Seaweed metabolite database (SWMD): A database of natural compounds from marine algae. Bioinformation. 2011;5:361–4. Available at: Accessed: 16 May 2019. DOI: 10.6026/97320630005361.

[108] Shang J, Hu B, Wang J, Zhu F, Kang Y, Li D, et al. Cheminformatic insight into the differences between terrestrial and marine originated natural products. J Chem Inf Model. 2018;58:1182–93. DOI: 10.1021/acs.jcim.8b00125.

[109] The Dictionary of Food Compounds. Available at: Accessed: 16 May 2019.

[110] Gry J, Black L, Eriksen FD, Pilegaard K, Plumb J, Rhodes M, et al. EuroFIR-BASIS a combined composition and biological activity database for bioactive compounds in plant-based foods. Trends Food Sci Technol. 2007;18:434–44. DOI: 10.1016/j.tifs.2007.05.008.

[111] FooDB (The Food Database) (version 1.0). Available at: Accessed: 16 May 2019.Search in Google Scholar

[112] Burdock GA, Carabin IG. Generally Recognized as Safe (GRAS): history and description. Toxicol Lett. 2004;150:3–18.10.1016/j.toxlet.2003.07.004Search in Google Scholar PubMed

[113] Martinez-Mayorga K, Peppard TL, López-Vallejo F, Yongye AB, Medina-Franco JL. Systematic mining of generally recognized as safe (GRAS) flavor chemicals for bioactive compounds. J Agric Food Chem. 2013;61:7507–14.10.1021/jf401019bSearch in Google Scholar PubMed

[114] Jensen K, Panagiotou G, Kouskoumvekaki I. NutriChem: a systems chemical biology resource to explore the medicinal value of plant-based foods. Nucleic Acids Res. 2015;43:D940–5.10.1093/nar/gku724Search in Google Scholar PubMed PubMed Central

[115] Jensen K, Panagiotou G, Kouskoumvekaki I. Integrated text mining and chemoinformatics analysis associates diet to health benefit at molecular level. PLoS Comput Biol. 2014;10:10.1371.10.1371/annotation/96a702bd-85a5-49d9-8fcc-3aad7aa4afa7Search in Google Scholar

[116] The NutriChem 1.0 server. Available at: Accessed: 16 May 2019.Search in Google Scholar

[117] Neveu V, Perez-Jimenez J, Vos F, Crespy V, Du Chaffaut L, Mennen L, et al. Phenol-Explorer: an online comprehensive database on polyphenol contents in foods. Database. 2010;2010:bap024.10.1093/database/bap024Search in Google Scholar PubMed PubMed Central

[118] Perez-Jimenez J, Neveu V, Vos F, Scalbert A. Systematic analysis of the content of 502 polyphenols in 452 foods and beverages: an application of the Phenol-Explorer database. J Agric Food Chem. 2010;58:4959–69.10.1021/jf100128bSearch in Google Scholar PubMed

[119] Dunkel M, Schmidt U, Struck S, Berger L, Gruening B, Hossbach J, et al. SuperScent–a database of flavors and scents. Nucleic Acids Res. 2009;37:D291–4.10.1093/nar/gkn695Search in Google Scholar PubMed PubMed Central

[120] Ahmed J, Preissner S, Dunkel M, Worth CL, Eckert A, Preissner R. SuperSweet–a resource on natural and artificial sweetening agents. Nucleic Acids Res. 2011;39:D377–82. doi:10.1093/nar/gkq917

[121] Yue Y, Chu GX, Liu XS, Tang X, Wang W, Liu GJ, et al. TMDB: a literature-curated database for small molecular compounds found from tea. BMC Plant Biol. 2014;14:243. doi:10.1186/s12870-014-0243-1

[122] United States Department of Agriculture National Agricultural Library. Available at: Accessed: 16 May 2019.

[123] Peña-Castillo A, Méndez-Lucio O, Owen JR, Martínez-Mayorga K, Medina-Franco JL. Chemoinformatics in food science. In: Engel T, Gasteiger J, editors. Applied chemoinformatics: achievements and future opportunities. Wiley-VCH Verlag GmbH & Co. KGaA, 2018:501–25. ISBN: 9783527342013; Online ISBN: 9783527806539. doi:10.1002/9783527806539.ch10

[124] Naveja JJ, Rico-Hidalgo MP, Medina-Franco JL. Analysis of a large food chemical database: chemical space, diversity, and complexity. F1000 Res. 2018;7:993. doi:10.12688/f1000research.15440.2

[125] Minkiewicz P, Darewicz M, Iwaniak A, Bucholska J, Starowicz P, Czyrko E. Internet databases of the properties, enzymatic reactions, and metabolism of small molecules - search options and applications in food science. Int J Mol Sci. 2016;17:2039. doi:10.3390/ijms17122039

[126] Holton TA, Vijayakumar V, Khaldi N. Bioinformatics: current perspectives and future directions for food and nutritional research facilitated by a food-wiki database. Trends Food Sci Technol. 2013;34:5–17. doi:10.1016/j.tifs.2013.08.009

[127] Minkiewicz P, Iwaniak A, Darewicz M. Using internet databases for food science organic chemistry students to discover chemical compound information. J Chem Educ. 2015;92:874–6. doi:10.1021/ed5006739

[128] Iwaniak A, Minkiewicz P, Darewicz M, Protasiewicz M, Mogut D. Chemometrics and cheminformatics in the analysis of biologically active peptides from food sources. J Funct Foods. 2015;16:334–51. doi:10.1016/j.jff.2015.04.038

[129] Martínez-Mayorga K, Peppard TL, Medina-Franco JL. Software and online resources: perspectives and potential applications. In: Martínez-Mayorga K, Medina-Franco JL, editors. Foodinformatics. Applications of chemical information to food chemistry. Cham, Switzerland: Springer International Publishing AG, 2014:233–48.

[130] Scalbert A, Andres-Lacueva C, Arita M, Kroon P, Manach C, Urpi-Sarda M, et al. Databases on food phytochemicals and their health-promoting effects. J Agric Food Chem. 2011;59:4331–48. doi:10.1021/jf200591d

[131] Malkaram SA, Hassan YI, Zempleni J. Online tools for bioinformatics analyses in nutrition sciences. Adv Nutr. 2012;3:654–65. doi:10.3945/an.112.002477

[132] Minkiewicz P, Miciński J, Darewicz M, Bucholska J. Biological and chemical databases for research into the composition of animal source foods. Food Rev Int. 2013;29:321–51. doi:10.1080/87559129.2013.818011

[133] Wishart DS, Tzur D, Knox C, Eisner R, Guo AC, Young N, et al. HMDB: the human metabolome database. Nucleic Acids Res. 2007;35:D521–6. doi:10.1093/nar/gkl923

[134] Wishart DS, Knox C, Guo AC, Eisner R, Young N, Gautam B, et al. HMDB: a knowledgebase for the human metabolome. Nucleic Acids Res. 2009;37:D603–10. doi:10.1093/nar/gkn810

[135] Wishart DS, Jewison T, Guo AC, Wilson M, Knox C, Liu Y, et al. HMDB 3.0: the human metabolome database in 2013. Nucleic Acids Res. 2013;41:D801–7. doi:10.1093/nar/gks1065

[136] Wishart DS, Feunang YD, Marcu A, Guo AC, Liang K, Vázquez-Fresno R, et al. HMDB 4.0: the human metabolome database for 2018. Nucleic Acids Res. 2018;46:D608–17. doi:10.1093/nar/gkx1089. Available at: Accessed: 16 May 2019.

[137] Kanehisa M, Furumichi M, Tanabe M, Sato Y, Morishima K. KEGG: new perspectives on genomes, pathways, diseases and drugs. Nucleic Acids Res. 2017;45:D353–61. doi:10.1093/nar/gkw1092

[138] Caspi R, Billington R, Ferrer L, Foerster H, Fulcher CA, Keseler IM, et al. The MetaCyc database of metabolic pathways and enzymes and the BioCyc collection of pathway/genome databases. Nucleic Acids Res. 2016;44:D471–80. doi:10.1093/nar/gkv1164

[139] Hastings J, Owen G, Dekker A, Ennis M, Kale N, Muthukrishnan V, et al. ChEBI in 2016: improved services and an expanding collection of metabolites. Nucleic Acids Res. 2016;44:D1214–9. doi:10.1093/nar/gkv1031

[140] Burley SK, Berman HM, Bhikadiya C, Bi C, Chen L, Di Costanzo L, et al. RCSB protein data bank: biological macromolecular structures enabling research and education in fundamental biology, biomedicine, biotechnology and energy. Nucleic Acids Res. 2019;47:D464–74. doi:10.1093/nar/gky1004

[141] The UniProt Consortium. UniProt: the universal protein knowledgebase. Nucleic Acids Res. 2017;45:D158–69. doi:10.1093/nar/gkw1099

[142] Benson DA, Cavanaugh M, Clark K, Karsch-Mizrachi I, Lipman DJ, Ostell J, et al. GenBank. Nucleic Acids Res. 2017;45:D37–42. doi:10.1093/nar/gkw1070

[143] Law V, Knox C, Djoumbou Y, Jewison T, Guo AC, Liu Y, et al. DrugBank 4.0: shedding new light on drug metabolism. Nucleic Acids Res. 2014;42:D1091–7. doi:10.1093/nar/gkt1068

[144] Lim E, Pon A, Djoumbou Y, Knox C, Shrivastava S, Guo AC, et al. T3DB: a comprehensively annotated database of common toxins and their targets. Nucleic Acids Res. 2010;38:D781–6. doi:10.1093/nar/gkp934

[145] Wishart D, Arndt D, Pon A, Sajed T, Guo AC, Djoumbou Y, et al. T3DB: the toxic exposome database. Nucleic Acids Res. 2015;43:D928–34. doi:10.1093/nar/gku1004

[146] Frolkis A, Knox C, Lim E, Jewison T, Law V, Hau DD, et al. SMPDB: the small molecule pathway database. Nucleic Acids Res. 2010;38:D480–7. doi:10.1093/nar/gkp1002

[147] Jewison T, Su Y, Disfany FM, Liang Y, Knox C, Maciejewski A, et al. SMPDB 2.0: big improvements to the small molecule pathway database. Nucleic Acids Res. 2014;42:D478–84. doi:10.1093/nar/gkt1067

[148] Jayaseelan KV, Steinbeck C. Building blocks for automated elucidation of metabolites: natural product-likeness for candidate ranking. BMC Bioinform. 2014;15:234. doi:10.1186/1471-2105-15-234

Published Online: 2020-04-21

© 2020 Fidele Ntie-Kang and Daniel Svozil, published by De Gruyter, Berlin/Boston

This work is licensed under the Creative Commons Attribution 4.0 International License.
