Statistical Applications in Genetics and Molecular Biology
Sample Size Calculations for Designing Clinical Proteomic Profiling Studies Using Mass Spectrometry
1 Centre for Molecular Medicine, Ninewells Hospital, and University of Dundee
2 Cancer Research UK and University of Birmingham
3 University of Oxford
4 Cancer Research UK and University of Birmingham
5 Cancer Research UK and University of Birmingham
Citation Information: Statistical Applications in Genetics and Molecular Biology. Volume 11, Issue 3, Pages –, ISSN (Online) 1544-6115, DOI: 10.1515/1544-6115.1686, February 2012
In cancer clinical proteomics, MALDI and SELDI profiling are used to search for biomarkers of potentially curable early-stage disease. A sufficient number of samples must be analysed in order to detect clinically relevant differences between cancers and controls with adequate statistical power. Clinical proteomic profiling studies typically yield expression data for each peak (protein or peptide) from two or more clinically defined groups of subjects. Exposure and confounder information on each subject is usually also available, and the samples are usually not from randomized subjects. Moreover, the data are usually available in replicate. At the design stage, however, covariates are not typically available and are often ignored in sample size calculations. This leads to the use of insufficient numbers of samples and to reduced power when there are imbalances in the numbers of subjects between different phenotypic groups. A method is proposed for accommodating information on covariates, data imbalances and design characteristics, such as the technical replication and the observational nature of these studies, in sample size calculations. It assumes knowledge of a joint distribution for the protein expression values and the covariates. When discretized covariates are considered, the effect of the covariates enters the calculations as a function of the proportions of subjects with specific attributes. This makes it relatively straightforward (even when pilot data on subject covariates are unavailable) to specify and to adjust for the effect of the expected heterogeneities. The new method suggests certain experimental designs that lead to the use of a smaller number of samples when planning a study. Analysis of data from the proteomic profiling of colorectal cancer reveals that fewer samples are needed when a study is balanced than when it is unbalanced, and when the IMAC30 chip-type is used.
The method is implemented in the clippda package for R, which is available from Bioconductor at: http://www.bioconductor.org/help/bioc-views/release/bioc/html/clippda.html.
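To make the roles of imbalance and technical replication concrete, the following is a minimal sketch of a textbook two-group sample size calculation. It is not the method proposed here, which is based on the expected Fisher information and a joint distribution of expression values and covariates, and it does not use the clippda package. The function name `total_sample_size` and all of its parameters (`delta`, `sigma`, `ratio`, `replicates`, `rho`) are hypothetical, and the formula assumes a two-sided z-test on a single peak with known variance.

```python
import math
from statistics import NormalDist


def total_sample_size(delta, sigma, ratio=1.0, replicates=1, rho=0.5,
                      alpha=0.05, power=0.8):
    """Total number of subjects for a two-sided, two-group z-test on one peak.

    Illustrative assumptions (not taken from the paper):
      delta      -- difference in mean expression to be detected
      sigma      -- standard deviation of a single measurement
      ratio      -- allocation ratio n2 / n1 (1.0 means a balanced study)
      replicates -- technical replicates averaged per subject
      rho        -- correlation between replicates from the same subject
    """
    if delta == 0 or sigma <= 0 or ratio <= 0 or replicates < 1:
        raise ValueError("delta must be non-zero; sigma, ratio > 0; replicates >= 1")

    normal = NormalDist()
    z = normal.inv_cdf(1 - alpha / 2) + normal.inv_cdf(power)

    # Variance of a subject's mean over its technical replicates.
    var_subject = sigma ** 2 * (1 + (replicates - 1) * rho) / replicates

    # Size of group 1, then group 2 from the allocation ratio.
    n1 = math.ceil((1 + 1 / ratio) * var_subject * z ** 2 / delta ** 2)
    n2 = math.ceil(ratio * n1)
    return n1 + n2


balanced = total_sample_size(delta=1.0, sigma=1.0)
unbalanced = total_sample_size(delta=1.0, sigma=1.0, ratio=3.0)
duplicated = total_sample_size(delta=1.0, sigma=1.0, replicates=2)
print(balanced, unbalanced, duplicated)
```

Under these assumptions the total is smallest when the groups are of equal size, and averaging replicates reduces the number of subjects only to the extent that replicates from the same subject are not perfectly correlated.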
Keywords: sample size calculations; data imbalance; heterogeneity; covariates; technical replicates; observational study; expected Fisher information; cancer; clinical proteomics; SELDI; designing a proteomic profiling experiment