
Statistical Applications in Genetics and Molecular Biology

Editor-in-Chief: Sanguinetti, Guido


Volume 11, Issue 1



Improving Hidden Markov Models for Classification of Human Immunodeficiency Virus-1 Subtypes through Linear Classifier Learning

Ingo Bulla / Anne-Kathrin Schultz / Peter Meinicke
Published Online: 2012-01-06 | DOI: https://doi.org/10.2202/1544-6115.1680

Profile Hidden Markov Models (pHMMs) are widely used to model nucleotide or protein sequence families. In many applications, a sequence family classified into several subfamilies is given and each subfamily is modeled separately by one pHMM. A major drawback of this approach is the difficulty of coping with subfamilies composed of very few sequences.

Correct subtyping of human immunodeficiency virus-1 (HIV-1) sequences is one of the most crucial bioinformatic tasks affected by this problem of small subfamilies, i.e., HIV-1 subtypes with a small number of known sequences. To deal with small samples for particular subfamilies of HIV-1, we employ a machine learning approach. More precisely, we make use of an existing HMM architecture and its associated inference engine, while replacing the unsupervised estimation of emission probabilities by a supervised method. For that purpose, we use regularized linear discriminant learning together with a balancing scheme to account for the widely varying sample size. After training the multiclass linear discriminants, the corresponding weights are transformed to valid probabilities using a softmax function.

We apply this modified algorithm to classify HIV-1 sequence data (in the form of partial-length HIV-1 sequences and semi-artificial recombinants) and show that the performance of pHMMs can be significantly improved by the proposed technique.

Keywords: HMMs; linear classifier learning; HIV-1
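
The two generic building blocks named in the abstract, a softmax that turns discriminant weights into valid probabilities and a balancing scheme for unequal sample sizes, can be illustrated with a minimal sketch. This is not the authors' implementation: the weight values, the subtype counts, and the inverse-frequency form of `balanced_class_weights` are all assumptions chosen for illustration, since the abstract does not specify them.

```python
import math


def softmax(weights):
    """Map real-valued weights to probabilities that sum to 1."""
    m = max(weights)  # subtract the maximum for numerical stability
    exps = [math.exp(w - m) for w in weights]
    total = sum(exps)
    return [e / total for e in exps]


def balanced_class_weights(class_sizes):
    """Weight each class inversely to its sample count.

    Assumed scheme (not taken from the paper): w_c = N / (K * n_c),
    so that every class contributes equally to a weighted loss.
    """
    n_total = sum(class_sizes.values())
    n_classes = len(class_sizes)
    return {c: n_total / (n_classes * n) for c, n in class_sizes.items()}


# Hypothetical discriminant weights for one match state over A, C, G, T.
state_weights = {"A": 1.2, "C": -0.3, "G": 0.1, "T": -1.0}
probs = softmax(list(state_weights.values()))
emissions = dict(zip(state_weights, probs))

# Hypothetical numbers of training sequences per subtype.
sizes = {"B": 400, "C": 300, "J": 5}
class_w = balanced_class_weights(sizes)
```

Here `emissions` is a valid probability distribution that preserves the ordering of the weights, and `class_w` gives the rare subtype a much larger per-sequence weight than the common ones.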

About the article

Citation Information: Statistical Applications in Genetics and Molecular Biology, Volume 11, Issue 1, Pages 1–27, ISSN (Online) 1544-6115, DOI: https://doi.org/10.2202/1544-6115.1680.

©2012 Walter de Gruyter GmbH & Co. KG, Berlin/Boston.
