Statistical Applications in Genetics and Molecular Biology

Editor-in-Chief: Sanguinetti, Guido


Volume 12, Issue 4

Simple estimators of false discovery rates given as few as one or two p-values without strong parametric assumptions

David R. Bickel
  • Corresponding author
  • Ottawa Institute of Systems Biology, Department of Biochemistry, Microbiology, and Immunology, University of Ottawa, 451 Smyth Road, Ottawa, Ontario, K1H 8M5, Canada
Published Online: 2013-06-21 | DOI: https://doi.org/10.1515/sagmb-2013-0003

An erratum for this article can be found here: https://doi.org/10.1515/sagmb-2014-0100

Abstract

Multiple comparison procedures that control a family-wise error rate or false discovery rate provide an achieved error rate as the adjusted p-value or q-value for each hypothesis tested. However, since achieved error rates are not understood as probabilities that the null hypotheses are true, empirical Bayes methods have been employed to estimate such posterior probabilities, called local false discovery rates (LFDRs) to emphasize that their priors are unknown and of the frequency type. The main approaches to LFDR estimation, relying either on fully parametric models to maximize likelihood or on the presence of enough hypotheses for nonparametric density estimation, lack the simplicity and generality of adjusted p-values. To begin filling the gap, this paper introduces simple methods of LFDR estimation with proven asymptotic conservatism without assuming the parameter distribution is in a parametric family. Simulations indicate that they remain conservative even for very small numbers of hypotheses. One of the proposed procedures enables interpreting the original FDR control rule in terms of LFDR estimation, thereby facilitating practical use of the former. The most conservative of the new procedures is applied to measured abundance levels of 20 proteins.
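As a point of reference for the "adjusted p-value or q-value" mentioned in the abstract, the following minimal sketch computes adjusted p-values under the standard Benjamini-Hochberg step-up procedure. It is not the paper's LFDR estimators, which are not specified on this page; the function name `bh_adjusted_p_values` and the example p-values are hypothetical and chosen only for illustration.

```python
def bh_adjusted_p_values(p_values):
    """Benjamini-Hochberg adjusted p-values, returned in input order.

    Illustrative sketch of the standard step-up adjustment; the name and
    interface are assumptions, not taken from the article.
    """
    m = len(p_values)
    if m == 0:
        return []
    # Indices of the p-values sorted from smallest to largest.
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down so that the adjusted values
    # are non-decreasing in the original p-values.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, p_values[i] * m / rank)
        adjusted[i] = running_min
    return adjusted


if __name__ == "__main__":
    # Hypothetical p-values for four hypotheses.
    print(bh_adjusted_p_values([0.01, 0.04, 0.03, 0.20]))
```

With a single hypothesis the adjusted value equals the raw p-value, which illustrates why an achieved error rate of this kind cannot by itself be read as a posterior probability that the null hypothesis is true.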

Keywords: Bayesian false discovery rate; confidence distribution; empirical Bayes; local false discovery rate; multiple comparison procedure; multiple testing

About the article

Corresponding author: David R. Bickel, Ottawa Institute of Systems Biology, Department of Biochemistry, Microbiology, and Immunology, University of Ottawa, 451 Smyth Road, Ottawa, Ontario, K1H 8M5, Canada


Published in Print: 2013-08-01


Citation Information: Statistical Applications in Genetics and Molecular Biology, Volume 12, Issue 4, Pages 529–543, ISSN (Online) 1544-6115, ISSN (Print) 2194-6302, DOI: https://doi.org/10.1515/sagmb-2013-0003.

©2013 by Walter de Gruyter Berlin Boston.

Citing Articles

[1]
Céline Aguer, Oliver Fiehn, Erin L. Seifert, Véronic Bézaire, John K. Meissen, Amanda Daniels, Kyle Scott, Jean-Marc Renaud, Marta Padilla, David R. Bickel, Michael Dysart, Sean H. Adams, and Mary-Ellen Harper
The FASEB Journal, 2013, Volume 27, Number 10, Page 4213
[3]
[5]
David R. Bickel and Marta Padilla
Journal of Statistical Planning and Inference, 2014, Volume 145, Page 204
