
Statistical Applications in Genetics and Molecular Biology

Editor-in-Chief: Stumpf, Michael P.H.

Volume 15, Issue 4



A joint modeling approach for uncovering associations between gene expression, bioactivity and chemical structure in early drug discovery to guide lead selection and genomic biomarker development

Nolen Perualila-Tan
  • Corresponding author
  • Interuniversity Institute for Biostatistics and Statistical Bioinformatics (I-BioStat), Center for Statistics, Hasselt University, 3590 Diepenbeek, Belgium
Adetayo Kasim / Willem Talloen / Bie Verbist / Hinrich W.H. Göhlmann / QSTAR Consortium / Ziv Shkedy
  • Interuniversity Institute for Biostatistics and Statistical Bioinformatics (I-BioStat), Center for Statistics, Hasselt University, 3590 Diepenbeek, Belgium
Published Online: 2016-05-25 | DOI: https://doi.org/10.1515/sagmb-2014-0086


The modern drug discovery process involves multiple sources of high-dimensional data, which raises the challenge of data integration. A typical example in early drug discovery is the integration of chemical structure data (fingerprint features), phenotypic bioactivity data (bioassay read-outs) for targets of interest, and transcriptomic data (gene expression) to better understand the chemical and biological mechanisms of candidate drugs and to facilitate early detection of safety issues before the later, more expensive phases of the drug development cycle. In this paper, we discuss a joint model for the transcriptomic and phenotypic variables conditioned on the chemical structure. For a given set of compounds, this modeling approach can be used to uncover the association between gene expression and biological activity while taking into account the influence of the compound's chemical structure on both variables. The model allows the detection of genes that are associated with the bioactivity data, facilitating the identification of potential genomic biomarkers for compound efficacy. In addition, the effect of every structural feature on both gene expression and pIC50, as well as the association between the two, can be investigated simultaneously. Two oncology projects are used to illustrate the applicability and usefulness of the joint model for integrating multi-source high-dimensional information to aid drug discovery.

This article offers supplementary material which is provided at the end of the article.

Keywords: bioactivity; biomarkers; chemical structure; joint model; transcriptomic
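
The idea summarized in the abstract (a joint model for gene expression and bioactivity, conditioned on a structural feature) can be illustrated with a small sketch. The abstract does not give the model equations, so everything below is an assumption made for illustration: each gene and the bioactivity read-out are regressed on a single binary fingerprint feature by ordinary least squares, and the correlation between the two sets of residuals is taken as the association adjusted for chemical structure. The function name `joint_model_association`, the simulated data, and all parameter values are hypothetical and are not taken from the article.

```python
import numpy as np


def joint_model_association(expr, activity, feature):
    """Sketch of a gene-by-gene joint model conditioned on one structural feature.

    expr     : (n_compounds, n_genes) gene expression matrix
    activity : (n_compounds,) bioactivity read-out (for example pIC50)
    feature  : (n_compounds,) binary fingerprint feature (0 = absent, 1 = present)

    Assumed model (illustrative, not the article's exact specification):
        expr[:, j] = mu_j + alpha_j * feature + error_j
        activity   = mu_y + beta    * feature + error_y
    with the adjusted association defined as corr(error_j, error_y).
    """
    expr = np.asarray(expr, dtype=float)
    activity = np.asarray(activity, dtype=float)
    feature = np.asarray(feature, dtype=float)
    n = expr.shape[0]

    # Shared design matrix: intercept plus the structural feature.
    design = np.column_stack([np.ones(n), feature])
    # Fit all genes and the bioactivity in a single least-squares call.
    responses = np.column_stack([expr, activity])
    coef, *_ = np.linalg.lstsq(design, responses, rcond=None)
    resid = responses - design @ coef
    gene_resid, act_resid = resid[:, :-1], resid[:, -1]

    # Residuals have mean zero (intercept in the model), so this is a correlation.
    adjusted = (gene_resid.T @ act_resid) / np.sqrt(
        (gene_resid ** 2).sum(axis=0) * (act_resid ** 2).sum()
    )

    # Unadjusted (marginal) correlation, ignoring chemical structure.
    xc = expr - expr.mean(axis=0)
    yc = activity - activity.mean()
    unadjusted = (xc.T @ yc) / np.sqrt((xc ** 2).sum(axis=0) * (yc ** 2).sum())

    return {
        "alpha": coef[1, :-1],      # feature effect on each gene
        "beta": coef[1, -1],        # feature effect on bioactivity
        "adjusted_rho": adjusted,   # gene-activity association given the feature
        "unadjusted_rho": unadjusted,
    }


# Simulated data (hypothetical): 500 compounds, 2 genes.
rng = np.random.default_rng(0)
n = 500
feature = rng.integers(0, 2, size=n)
shared = rng.normal(size=n)  # variation common to gene 1 and the bioactivity
activity = 1.5 * feature + shared + rng.normal(scale=0.5, size=n)
gene0 = 2.0 * feature + rng.normal(size=n)       # linked to activity only via the feature
gene1 = shared + rng.normal(scale=0.5, size=n)   # linked to activity beyond the feature
result = joint_model_association(np.column_stack([gene0, gene1]), activity, feature)
print(result)
```

In this simulation the first gene is correlated with bioactivity only because both respond to the structural feature, so its adjusted association should be close to zero, while the second gene stays associated with bioactivity after conditioning on the feature. Under these assumptions, a gene of the second kind is the sort of candidate a structure-adjusted analysis would flag; in practice such an analysis would be repeated over many genes and features with a multiplicity correction.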



About the article

Published in Print: 2016-08-01

Citation Information: Statistical Applications in Genetics and Molecular Biology, Volume 15, Issue 4, Pages 291–304, ISSN (Online) 1544-6115, ISSN (Print) 2194-6302, DOI: https://doi.org/10.1515/sagmb-2014-0086.

©2016 by De Gruyter.
