Multiple comparison procedures that control a family-wise error rate or false discovery rate provide an achieved error rate as the adjusted p-value or q-value for each hypothesis tested. However, since achieved error rates are not understood as probabilities that the null hypotheses are true, empirical Bayes methods have been employed to estimate such posterior probabilities, called local false discovery rates (LFDRs) to emphasize that their priors are unknown and of the frequency type. The main approaches to LFDR estimation, relying either on fully parametric models to maximize likelihood or on the presence of enough hypotheses for nonparametric density estimation, lack the simplicity and generality of adjusted p-values. To begin filling the gap, this paper introduces simple methods of LFDR estimation with proven asymptotic conservatism without assuming the parameter distribution is in a parametric family. Simulations indicate that they remain conservative even for very small numbers of hypotheses. One of the proposed procedures enables interpreting the original FDR control rule in terms of LFDR estimation, thereby facilitating practical use of the former. The most conservative of the new procedures is applied to measured abundance levels of 20 proteins.
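As an illustration of the "achieved error rate" mentioned above, the following minimal sketch computes adjusted p-values for one FDR-controlling procedure. It assumes the standard Benjamini-Hochberg step-up rule, which the text does not name; it is not one of the LFDR estimators proposed in this paper. The function name `bh_adjusted_p_values` and the example p-values are hypothetical.

```python
def bh_adjusted_p_values(p_values):
    """Return Benjamini-Hochberg adjusted p-values in the input order.

    Assumption: this is the standard step-up adjustment, shown only to
    illustrate an achieved FDR level; it is not the paper's LFDR estimator.
    """
    m = len(p_values)
    # Indices of the p-values, sorted from smallest to largest.
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down so that adjusted values
    # never decrease as the raw p-value increases.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, p_values[i] * m / rank)
        adjusted[i] = running_min
    return adjusted


# Made-up p-values for five hypotheses (illustration only).
example_p_values = [0.001, 0.008, 0.039, 0.041, 0.6]
example_adjusted = bh_adjusted_p_values(example_p_values)
```

Each adjusted value is the smallest FDR level at which the corresponding hypothesis would be rejected; as the abstract notes, it is not interpreted as the probability that the null hypothesis is true.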
Corresponding author: David R. Bickel, Ottawa Institute of Systems Biology, Department of Biochemistry, Microbiology, and Immunology, University of Ottawa, 451 Smyth Road, Ottawa, Ontario, K1H 8M5, Canada
SAGMB publishes significant research on the application of statistical ideas to problems arising from computational biology. The range of topics includes linkage mapping, association studies, gene finding and sequence alignment, protein structure prediction, design and analysis of microarray data, molecular evolution and phylogenetic trees, DNA topology, and database search strategies.