Statistical Applications in Genetics and Molecular Biology

Editor-in-Chief: Stumpf, Michael P.H.

6 Issues per year

IMPACT FACTOR 2013: 1.055
Rank 48 out of 119 in category Statistics & Probability in the 2013 Thomson Reuters Journal Citation Report/Science Edition

Learning from Past Treatments and Their Outcome Improves Prediction of In Vivo Response to Anti-HIV Therapy

Hiroto Saigo (1) / Andre Altmann (2) / Jasmina Bogojeska (3) / Fabian Müller (4) / Sebastian Nowozin (5) / Thomas Lengauer (6)

(1) Max Planck Institute for Informatics

(2) Max Planck Institute for Informatics

(3) Max Planck Institute for Informatics

(4) Max Planck Institute for Informatics

(5) Max Planck Institute for Biological Cybernetics

(6) Max Planck Institute for Informatics

Citation Information: Statistical Applications in Genetics and Molecular Biology. Volume 10, Issue 1, Pages 1–32, ISSN (Online) 1544-6115, DOI: 10.2202/1544-6115.1604, January 2011

Publication History

Published Online: 2011-01-14

Infections with the human immunodeficiency virus type 1 (HIV-1) are treated with combinations of drugs. Unfortunately, HIV responds to treatment by developing resistance mutations. Consequently, the genes encoding the viral target proteins are sequenced and inspected for resistance mutations as part of routine diagnostic procedures to ensure effective treatment. To predict the response to a combination therapy, currently available computer-based methods rely on the genotype of the virus and the composition of the regimen as input. However, no available tool takes full advantage of knowledge about the order of, and the response to, previously prescribed regimens. The resulting high-dimensional feature space makes existing methods difficult to apply in a straightforward fashion. The machine learning system proposed in this work, sequence boosting, is tailored to exploiting such high-dimensional information, i.e. to extracting longitudinal features, by drawing on recent advances in data mining and boosting.

When applied to predicting the latest treatment outcome for 3,759 treatment-experienced patients from the EuResist integrated database, sequence boosting achieved superior performance compared to SVMs with RBF kernels. Moreover, sequence boosting allows easy access to the discriminative treatment information.

Analysis of feature importance values provided by our model confirmed known facts regarding HIV treatment. For instance, application of potent and recently licensed drugs was beneficial for patients, and, conversely, the patient group that was subject to NRTI mono-therapies in the past had poor treatment prospects today. Furthermore, our model revealed novel biological insights. Specifically, the combination of previously used drugs with their in vivo response is more informative than information about previously used drugs alone. Using this information improves the performance of systems for predicting therapy outcome.
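
To make the idea of longitudinal features concrete, the following is a minimal sketch, not the authors' implementation, of how a treatment history might be encoded as a sequence and tested for an ordered pattern. The abstract does not specify the encoding, so everything here is an assumption for illustration: the names `encode_history` and `contains_pattern`, the `response=...` item labels, and the example drug codes are all hypothetical. Each past regimen becomes a set of items (its drugs, optionally joined by its observed response), and a pattern counts as present when its elements occur in order, possibly with gaps. Binary indicators of this kind are one plausible form for the discriminative sequence features that a boosting method could select and weight.

```python
from typing import List, Sequence, Tuple

# One past regimen: (set of drug codes, observed response label).
# Drug codes and the "success"/"failure" labels are illustrative assumptions.
Therapy = Tuple[frozenset, str]


def encode_history(history: Sequence[Therapy], with_response: bool = True) -> List[frozenset]:
    """Turn a chronological treatment history into a sequence of item sets."""
    encoded = []
    for drugs, response in history:
        items = set(drugs)
        if with_response:
            # Attach the in vivo response as an extra item of the same regimen.
            items.add(f"response={response}")
        encoded.append(frozenset(items))
    return encoded


def contains_pattern(sequence: Sequence[frozenset], pattern: Sequence[frozenset]) -> bool:
    """Return True if `pattern` occurs in `sequence` in order (gaps allowed).

    Each pattern element must be a subset of the sequence element it matches.
    Greedy left-to-right matching is sufficient for this containment test.
    """
    pos = 0
    for element in sequence:
        if pos < len(pattern) and pattern[pos] <= element:
            pos += 1
    return pos == len(pattern)


# Hypothetical patient: a failed mono-therapy followed by a successful combination.
history = [
    (frozenset({"AZT"}), "failure"),
    (frozenset({"AZT", "3TC", "LPV"}), "success"),
]

# Hypothetical pattern: "AZT failed earlier, and LPV was given later".
pattern = [frozenset({"AZT", "response=failure"}), frozenset({"LPV"})]

feature = int(contains_pattern(encode_history(history), pattern))
print(feature)  # prints 1
```

With `with_response=False`, the same pattern no longer matches, because the response items are absent from the encoded history. This mirrors the comparison in the abstract between using previously used drugs alone and using drugs together with their in vivo response.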

Keywords: data mining; discriminative sequence features; boosting; HIV; clinical; optimization
