Simultaneously performing many hypothesis tests is a problem commonly encountered in high-dimensional biology. In this setting, a large set of p-values is calculated from many related features measured simultaneously. Classical statistics provides a criterion for defining what a “correct” p-value is when performing a single hypothesis test. We show here that even when each p-value is marginally correct under this single hypothesis criterion, it may be the case that the joint behavior of the entire set of p-values is problematic. On the other hand, there are cases where each p-value is marginally incorrect, yet the joint distribution of the set of p-values is satisfactory. Here, we propose a criterion defining a well behaved set of simultaneously calculated p-values that provides precise control of common error rates and we introduce diagnostic procedures for assessing whether the criterion is satisfied with simulations. Multiple testing p-values that satisfy our new criterion avoid potentially large study specific errors, but also satisfy the usual assumptions for strong control of false discovery rates and family-wise error rates. We utilize the new criterion and proposed diagnostics to investigate two common issues in high-dimensional multiple testing for genomics: dependent multiple hypothesis tests and pooled versus test-specific null distributions.
SAGMB publishes significant research on the application of statistical ideas to problems arising from computational biology. The range of topics includes linkage mapping, association studies, gene finding and sequence alignment, protein structure prediction, design and analysis of microarrary data, molecular evolution and phylogenetic trees, DNA topology, and data base search strategies.