Statistical Applications in Genetics and Molecular Biology

Editor-in-Chief: Sanguinetti, Guido

IMPACT FACTOR 2018: 0.536
5-year IMPACT FACTOR: 0.764

CiteScore 2018: 0.49

SCImago Journal Rank (SJR) 2018: 0.316
Source Normalized Impact per Paper (SNIP) 2018: 0.342

Mathematical Citation Quotient (MCQ) 2017: 0.04

Volume 4, Issue 1

Multiple Testing and Data Adaptive Regression: An Application to HIV-1 Sequence Data.

Merrill D. Birkner / Sandra E. Sinisi / Mark J. van der Laan
Published Online: 2005-04-18 | DOI: https://doi.org/10.2202/1544-6115.1110

Joint analysis of viral strand sequence data and viral replication capacity could yield biological insight into the replication ability of HIV-1. Identifying specific target codons on the viral strand would facilitate the manufacture of target-specific antiretrovirals. Several algorithmic and statistical techniques can be applied to this problem. In this paper, we apply two techniques to a data set of 317 patients, each with 282 sequenced protease and reverse transcriptase codons. First, we apply recently developed multiple testing procedures to find codons that have significant univariate associations with the replication capacity of the virus. A single-step multiple testing procedure (Pollard and van der Laan 2003) was used to control the family wise error rate (FWER) at the five percent level, and augmentation multiple testing procedures were used to control the generalized family wise error rate (gFWER) or the tail probability of the proportion of false positives (TPPFP). Second, we apply a data adaptive multiple regression algorithm to predict viral replication capacity from an entire mutant/non-mutant sequence profile. This is a loss-based, cross-validated Deletion/Substitution/Addition regression algorithm (Sinisi and van der Laan 2004), which builds candidate estimators for the prediction of a univariate outcome by minimizing an empirical risk. The two methods are separate techniques with distinct goals, both suited to viral data of this structure.
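
As a rough illustration of the multiple testing step described in the abstract, the following Python sketch computes single-step maxT adjusted p-values and then applies the augmentation idea for gFWER and TPPFP control. It is a minimal sketch and not the authors' implementation: all function names are hypothetical, and the joint null distribution of the test statistics is assumed to be supplied already (for example from a bootstrap) as a list of draws.

```python
import math


def single_step_maxt_adjusted_pvalues(observed, null_draws):
    """Single-step maxT adjusted p-values (two-sided).

    observed:   list of M test statistics, one per codon.
    null_draws: list of B lists, each holding M statistics drawn from an
                estimated joint null distribution (assumed given here).
    """
    # Largest absolute statistic within each null draw.
    max_abs = [max(abs(z) for z in draw) for draw in null_draws]
    n_draws = len(max_abs)
    # Share of null maxima at least as extreme as each observed statistic.
    return [sum(m >= abs(t) for m in max_abs) / n_draws for t in observed]


def augment(adjusted_p, alpha, extra):
    """Reject the FWER rejections plus extra(r0) next most significant tests.

    r0 is the number of hypotheses rejected at level alpha by the initial
    FWER-controlling procedure; extra maps r0 to the augmentation size.
    """
    order = sorted(range(len(adjusted_p)), key=lambda j: adjusted_p[j])
    r0 = sum(p <= alpha for p in adjusted_p)
    n_extra = min(extra(r0), len(adjusted_p) - r0)
    return sorted(order[: r0 + n_extra])


def fwer_rejections(adjusted_p, alpha):
    # No augmentation: plain FWER control.
    return augment(adjusted_p, alpha, lambda r0: 0)


def gfwer_rejections(adjusted_p, alpha, k):
    # gFWER(k): tolerate up to k false positives, so add k rejections.
    return augment(adjusted_p, alpha, lambda r0: k)


def tppfp_rejections(adjusted_p, alpha, q):
    # TPPFP(q): add the largest m with m / (m + r0) <= q.
    return augment(adjusted_p, alpha, lambda r0: math.floor(q * r0 / (1 - q)))


if __name__ == "__main__":
    # Toy numbers for illustration only, not data from the paper.
    adj_p = [0.01, 0.20, 0.03, 0.04, 0.002, 0.06, 0.5]
    print(fwer_rejections(adj_p, 0.05))
    print(gfwer_rejections(adj_p, 0.05, k=2))
    print(tppfp_rejections(adj_p, 0.05, q=0.25))
```

The augmentation step only needs the adjusted p-values from the initial FWER-controlling procedure, which is why the sketch separates it from the maxT computation.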

Keywords: Bootstrap; codon; generalized family wise error rate; HIV-1; multiple testing; prediction; tail probability of the proportion of false positives; type I error; variable selection.

About the article

Citation Information: Statistical Applications in Genetics and Molecular Biology, Volume 4, Issue 1, ISSN (Online) 1544-6115, ISSN (Print) 2194-6302, DOI: https://doi.org/10.2202/1544-6115.1110.

©2011 Walter de Gruyter GmbH & Co. KG, Berlin/Boston.

Citing Articles

Thomas Lengauer and Tobias Sing
Nature Reviews Microbiology, 2006, Volume 4, Number 10, Page 790
