Statistical Applications in Genetics and Molecular Biology

Editor-in-Chief: Stumpf, Michael P.H.

6 Issues per year


IMPACT FACTOR 2016: 0.646
5-year IMPACT FACTOR: 1.191

CiteScore 2016: 0.94

SCImago Journal Rank (SJR) 2015: 0.954
Source Normalized Impact per Paper (SNIP) 2015: 0.554

Mathematical Citation Quotient (MCQ) 2015: 0.06

Online ISSN: 1544-6115
Volume 14, Issue 3 (Jun 2015)

A novel method to prioritize RNAseq data for post-hoc analysis based on absolute changes in transcript abundance

Patrick McNutt
  • Corresponding author
  • US Army Medical Research Institute of Chemical Defense, 3100 Ricketts Point Road, Gunpowder, MD 21010, USA
Ian Gut
  • National Biodefense Analysis and Countermeasures Center, 110 Thomas Johnson Drive, Frederick, MD 21702, USA
Kyle Hubbard
  • US Army Medical Research Institute of Chemical Defense, 3100 Ricketts Point Road, Gunpowder, MD 21010, USA
Phil Beske
  • US Army Medical Research Institute of Chemical Defense, 3100 Ricketts Point Road, Gunpowder, MD 21010, USA
Published Online: 2015-03-11 | DOI: https://doi.org/10.1515/sagmb-2014-0018

Abstract

The use of fold-change (FC) to prioritize differentially expressed genes (DEGs) for post-hoc characterization is a common technique in the analysis of RNA sequencing datasets. However, the use of FC can overlook certain populations of DEGs, such as high copy number transcripts which undergo metabolically expensive changes in expression yet fail to exceed the ratiometric FC cut-off, thereby missing potentially important biological information. Here we evaluate an alternative approach to prioritizing RNAseq data based on absolute changes in normalized transcript counts (ΔT) between control and treatment conditions. In five pairwise comparisons with a wide range of effect sizes, rank-ordering of DEGs based on the magnitude of ΔT produced a power curve-like distribution, in which 4.7–5.0% of transcripts were responsible for 36–50% of the cumulative change. Thus, differential gene expression is characterized by the high production-cost expression of a small number of genes (large ΔT genes), while the differential expression of the majority of genes involves a much smaller metabolic investment by the cell. To determine whether the large ΔT datasets are representative of coordinated changes in the transcriptional program, we evaluated large ΔT genes for enrichment of gene ontologies (GOs) and predicted protein interactions. In comparison to randomly selected DEGs, the large ΔT transcripts were significantly enriched for both GOs and predicted protein interactions. Furthermore, enrichments were consistent with the biological context of each comparison yet distinct from those produced using equal-sized populations of large FC genes, indicating that the large ΔT genes represent an orthogonal transcriptional response. Finally, the composition of the large ΔT gene sets was unique to each pairwise comparison, indicating that they represent coherent and context-specific responses to biological conditions rather than the non-specific upregulation of a family of genes.
These findings suggest that the large ΔT genes are not a product of random or stochastic phenomena, but rather represent biologically meaningful changes in the transcriptional program. They furthermore imply that high abundance transcripts are associated with particular cellular states, and as cells change in response to internal or external conditions, the relative distribution of the abundant transcripts changes accordingly. Thus, prioritization of DEGs based on the concept of metabolic cost is a simple yet powerful method to identify biologically important transcriptional changes and provide novel insights into cellular behaviors.
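
The ranking idea described in the abstract can be illustrated with a minimal Python sketch. This is not the authors' implementation: the function names, the toy counts, the pseudocount of 1 and the 2-fold cut-off are all assumptions made for illustration, and the counts are assumed to have been normalized upstream. The sketch ranks genes by the magnitude of ΔT, reports the share of the cumulative change carried by the top fraction, and contrasts this with a log2 fold-change.

```python
import math


def rank_by_delta_t(control, treatment, top_fraction=0.05):
    """Rank genes by absolute change in normalized counts (|ΔT|).

    control, treatment: dicts mapping gene id -> normalized transcript count.
    Returns (ranked gene ids, top gene ids, share of cumulative |ΔT| in top).
    """
    genes = sorted(set(control) & set(treatment))
    delta_t = {g: abs(treatment[g] - control[g]) for g in genes}
    # Largest absolute change first.
    ranked = sorted(genes, key=lambda g: delta_t[g], reverse=True)
    n_top = max(1, math.ceil(top_fraction * len(ranked)))
    top = ranked[:n_top]
    total = sum(delta_t.values())
    share = sum(delta_t[g] for g in top) / total if total > 0 else 0.0
    return ranked, top, share


def log2_fold_change(control, treatment, pseudocount=1.0):
    """Ratiometric change; the pseudocount (an assumption) avoids log(0)."""
    return {
        g: math.log2((treatment[g] + pseudocount) / (control[g] + pseudocount))
        for g in control
        if g in treatment
    }


# Hypothetical toy data, not taken from the article.
control = {"gene_a": 10000.0, "gene_b": 10.0, "gene_c": 500.0, "gene_d": 200.0}
treatment = {"gene_a": 15000.0, "gene_b": 80.0, "gene_c": 450.0, "gene_d": 260.0}

# top_fraction is 0.25 here only because the toy set has four genes.
ranked, top, share = rank_by_delta_t(control, treatment, top_fraction=0.25)
lfc = log2_fold_change(control, treatment)

print("ranked by |ΔT|:", ranked)
print("top genes:", top, "share of cumulative |ΔT|:", round(share, 3))
print("log2 FC:", {g: round(v, 2) for g, v in lfc.items()})
```

In this toy set, gene_a changes by 5000 normalized counts yet stays below an assumed 2-fold cut-off (|log2 FC| < 1), while gene_b clears the cut-off with a change of only 70 counts. This is the kind of high copy number transcript that the abstract says FC-based prioritization can overlook.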

This article offers supplementary material which is provided at the end of the article.

Keywords: bioinformatics; botulinum neurotoxin; differential gene expression; excitotoxicity; fold-change; functional annotation; gene ontologies; RNA sequencing; neurogenesis; neurotoxicity

About the article

Corresponding author: Patrick McNutt, US Army Medical Research Institute of Chemical Defense, 3100 Ricketts Point Road, Gunpowder, MD 21010, USA

Published in Print: 2015-06-01



Citation Information: Statistical Applications in Genetics and Molecular Biology, ISSN (Online) 1544-6115, ISSN (Print) 2194-6302, DOI: https://doi.org/10.1515/sagmb-2014-0018.
