
Statistical Applications in Genetics and Molecular Biology

Editor-in-Chief: Stumpf, Michael P.H.

6 Issues per year


IMPACT FACTOR 2016: 0.646
5-year IMPACT FACTOR: 1.191

CiteScore 2016: 0.94

SCImago Journal Rank (SJR) 2016: 0.625
Source Normalized Impact per Paper (SNIP) 2016: 0.596

Mathematical Citation Quotient (MCQ) 2016: 0.06

Online
ISSN
1544-6115
Volume 14, Issue 3 (Jun 2015)


A novel method to prioritize RNAseq data for post-hoc analysis based on absolute changes in transcript abundance

Patrick McNutt
  • Corresponding author
  • US Army Medical Research Institute of Chemical Defense, 3100 Ricketts Point Road, Gunpowder, MD 21010, USA
/ Ian Gut
  • National Biodefense Analysis and Countermeasures Center, 110 Thomas Johnson Drive, Frederick, MD 21702, USA
/ Kyle Hubbard
  • US Army Medical Research Institute of Chemical Defense, 3100 Ricketts Point Road, Gunpowder, MD 21010, USA
/ Phil Beske
  • US Army Medical Research Institute of Chemical Defense, 3100 Ricketts Point Road, Gunpowder, MD 21010, USA
Published Online: 2015-03-11 | DOI: https://doi.org/10.1515/sagmb-2014-0018

Abstract

The use of fold-change (FC) to prioritize differentially expressed genes (DEGs) for post-hoc characterization is a common technique in the analysis of RNA sequencing datasets. However, the use of FC can overlook certain populations of DEGs, such as high copy number transcripts which undergo metabolically expensive changes in expression yet fail to exceed the ratiometric FC cut-off, thereby missing potentially important biological information. Here we evaluate an alternative approach to prioritizing RNAseq data based on absolute changes in normalized transcript counts (ΔT) between control and treatment conditions. In five pairwise comparisons with a wide range of effect sizes, rank-ordering of DEGs based on the magnitude of ΔT produced a power curve-like distribution, in which 4.7–5.0% of transcripts were responsible for 36–50% of the cumulative change. Thus, differential gene expression is characterized by the high production-cost expression of a small number of genes (large ΔT genes), while the differential expression of the majority of genes involves a much smaller metabolic investment by the cell. To determine whether the large ΔT datasets are representative of coordinated changes in the transcriptional program, we evaluated large ΔT genes for enrichment of gene ontologies (GOs) and predicted protein interactions. In comparison to randomly selected DEGs, the large ΔT transcripts were significantly enriched for both GOs and predicted protein interactions. Furthermore, enrichments were consistent with the biological context of each comparison yet distinct from those produced using equal-sized populations of large FC genes, indicating that the large ΔT genes represent an orthogonal transcriptional response. Finally, the composition of the large ΔT gene sets was unique to each pairwise comparison, indicating that they represent coherent and context-specific responses to biological conditions rather than the non-specific upregulation of a family of genes.
These findings suggest that the large ΔT genes are not a product of random or stochastic phenomena, but rather represent biologically meaningful changes in the transcriptional program. They furthermore imply that high abundance transcripts are associated with particular cellular states, and as cells change in response to internal or external conditions, the relative distribution of the abundant transcripts changes accordingly. Thus, prioritization of DEGs based on the concept of metabolic cost is a simple yet powerful method to identify biologically important transcriptional changes and provide novel insights into cellular behaviors.
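The contrast between ranking by FC and ranking by ΔT can be illustrated with a minimal sketch. It is not the authors' pipeline: the gene names, the counts, the pseudocount of 1 and the 25% cut-off are all hypothetical values chosen for illustration, and the input is assumed to be already-normalized transcript counts for one control and one treatment condition.

```python
import math
from typing import Dict, List


def delta_t(control: Dict[str, float], treatment: Dict[str, float]) -> Dict[str, float]:
    """Absolute change in normalized transcript counts (ΔT) for each gene."""
    return {gene: abs(treatment[gene] - control[gene]) for gene in control}


def rank_by_delta_t(dt: Dict[str, float]) -> List[str]:
    """Genes ordered from largest to smallest ΔT."""
    return sorted(dt, key=dt.get, reverse=True)


def rank_by_fold_change(control: Dict[str, float], treatment: Dict[str, float],
                        pseudocount: float = 1.0) -> List[str]:
    """Genes ordered by |log2 FC|; the pseudocount (an assumption) avoids division by zero."""
    log_fc = {
        gene: abs(math.log2((treatment[gene] + pseudocount) / (control[gene] + pseudocount)))
        for gene in control
    }
    return sorted(log_fc, key=log_fc.get, reverse=True)


def top_fraction_share(dt: Dict[str, float], fraction: float) -> float:
    """Share of the cumulative ΔT carried by the top `fraction` of genes."""
    ranked = sorted(dt.values(), reverse=True)
    n_top = max(1, round(len(ranked) * fraction))
    total = sum(ranked)
    return sum(ranked[:n_top]) / total if total else 0.0


# Hypothetical normalized counts: gene "A" is abundant and changes 1.5-fold,
# gene "B" is rare and changes 8-fold.
control = {"A": 10000.0, "B": 10.0, "C": 200.0, "D": 50.0}
treatment = {"A": 15000.0, "B": 80.0, "C": 220.0, "D": 40.0}

dt = delta_t(control, treatment)
dt_ranking = rank_by_delta_t(dt)
fc_ranking = rank_by_fold_change(control, treatment)
share = top_fraction_share(dt, fraction=0.25)

print("ΔT ranking:", dt_ranking)
print("FC ranking:", fc_ranking)
print("Share of cumulative ΔT in top 25% of genes:", round(share, 3))
```

In this toy example the abundant gene "A" would fail a two-fold FC cut-off yet accounts for most of the cumulative ΔT, while the rare gene "B" leads the FC ranking. This is the kind of discrepancy the abstract describes.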

This article offers supplementary material which is provided at the end of the article.

Keywords: bioinformatics; botulinum neurotoxin; differential gene expression; excitotoxicity; fold-change; functional annotation; gene ontologies; RNA sequencing; neurogenesis; neurotoxicity


About the article

Corresponding author: Patrick McNutt, US Army Medical Research Institute of Chemical Defense, 3100 Ricketts Point Road, Gunpowder, MD 21010, USA, e-mail:


Published Online: 2015-03-11

Published in Print: 2015-06-01


Citation Information: Statistical Applications in Genetics and Molecular Biology, ISSN (Online) 1544-6115, ISSN (Print) 2194-6302, DOI: https://doi.org/10.1515/sagmb-2014-0018.


©2015 by De Gruyter.

