Stability Selection, which combines penalized regression with subsampling, is a promising algorithm for variable selection in ultra-high-dimensional settings. This work is motivated by its evaluation in the context of genome-wide association studies (GWAS). A critical aspect of its use is the choice of a decision rule that accounts for the massive number of comparisons performed. The current decision rule controls the family-wise error rate (FWER) by means of a theoretically derived upper bound. As an alternative, we propose setting the detection threshold according to the more liberal false discovery rate (FDR) criterion, which we estimate with a permutation-based procedure. This procedure is evaluated by simulation under several scenarios mimicking various correlation structures of genetic data and is compared with the original FWER upper bound. The proposed procedure is shown to be less conservative and to detect more true signals than the FWER upper bound. Finally, the proposed methodology is illustrated on a GWAS analysis of a lipid phenotype (high-density lipoproteins, HDL) in the Northern Finland Birth Cohort.
©2012 Walter de Gruyter GmbH & Co. KG, Berlin/Boston