Bulletin of the Polish Academy of Sciences Technical Sciences

The Journal of Polish Academy of Sciences

Open Access
Efficient alternatives to PSI-BLAST

M. Startek
  • Institute of Informatics, University of Warsaw, 2 Banacha St., 02-097 Warszawa, Poland
/ S. Lasota
  • Institute of Informatics, University of Warsaw, 2 Banacha St., 02-097 Warszawa, Poland
/ M. Sykulski
  • Institute of Informatics, University of Warsaw, 2 Banacha St., 02-097 Warszawa, Poland
/ A. Bułak
  • Institute of Informatics, University of Warsaw, 2 Banacha St., 02-097 Warszawa, Poland
/ L. Noé
  • LIFL/CNRS/INRIA, Bât. M3, Campus Scientifique, Villeneuve d’Ascq, France
/ G. Kucherov
  • Laboratoire d’Informatique Gaspard-Monge, Marne-la-Vallée, France
/ A. Gambin
  • Institute of Informatics, University of Warsaw, 2 Banacha St., 02-097 Warszawa, Poland / Mossakowski Medical Research Centre Polish Academy of Sciences, 5 Pawińskiego St., 02-106 Warszawa, Poland
Published Online: 2012-12-22 | DOI: https://doi.org/10.2478/v10175-012-0063-0


In this paper we present two algorithms that may serve as efficient alternatives to the well-known PSI-BLAST tool: SeedBLAST and CTX-PSI-BLAST. Both can exploit knowledge of the amino acid composition specific to a given protein family: SeedBLAST uses deliberately designed seeds, while CTX-PSI-BLAST extends PSI-BLAST with a context-specific substitution model. The seeding technique has become central to the theory of sequence alignment. Several efficient tools apply seeds to DNA homology search, but not to protein homology search. In this paper we fill this gap. We advocate the use of multiple subset seeds derived from a hierarchical tree of amino acid residues. Our method uses an evolutionary algorithm to compute seeds that are specifically designed for a given protein family. The seeds are represented by deterministic finite automata (DFAs) and built into the NCBI-BLAST software. This extended tool, named SeedBLAST, is compared to the original BLAST and PSI-BLAST on several protein families. Our results demonstrate the superiority of SeedBLAST in terms of efficiency, especially in the case of twilight zone hits. The contextual substitution model has been shown to increase the sensitivity of protein alignment. In this paper we take the next step in the contextual alignment program. We introduce a contextual version of the PSI-BLAST algorithm, an iterative version of the NCBI-BLAST tool. An experimental evaluation demonstrates significantly higher sensitivity compared to the ordinary PSI-BLAST algorithm.
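
To make the notion of a subset seed concrete, the following is a minimal Python sketch, not the authors' implementation. It assumes that each seed position is a partition of the amino acid alphabet and that two k-mers form a hit when, at every position, both residues fall into the same group. The groupings EXACT, COARSE and ANY, the example SEED, and the function names are hypothetical and chosen only for illustration; they are not the hierarchical tree or the seeds from the paper. The sketch also compares k-mers directly instead of compiling the seed into a DFA as SeedBLAST does.

```python
# Illustrative subset-seed matching; groupings below are assumptions,
# not the residue tree used in the paper.
ALPHABET = "ACDEFGHIKLMNPQRSTVWY"

# Finest level: every residue is its own group (exact match required).
EXACT = [{a} for a in ALPHABET]
# A hypothetical intermediate level grouping chemically similar residues.
COARSE = [set("ILMV"), set("FWY"), set("KRH"), set("DE"),
          set("STNQ"), set("AG"), {"C"}, {"P"}]
# Coarsest level: a single group, i.e. a "don't care" position.
ANY = [set(ALPHABET)]

# A hypothetical seed of span 4: exact, coarse, don't care, exact.
SEED = [EXACT, COARSE, ANY, EXACT]


def same_group(partition, a, b):
    """Return True if residues a and b lie in the same group of the partition."""
    return any(a in group and b in group for group in partition)


def seed_hit(seed, x, y):
    """Return True if k-mers x and y match the seed at every position."""
    if not (len(x) == len(y) == len(seed)):
        return False
    return all(same_group(p, a, b) for p, a, b in zip(seed, x, y))


def find_hits(seed, query, subject):
    """List all (query_offset, subject_offset) pairs where the seed hits.

    Brute-force quadratic scan, kept simple for clarity.
    """
    k = len(seed)
    return [
        (i, j)
        for i in range(len(query) - k + 1)
        for j in range(len(subject) - k + 1)
        if seed_hit(seed, query[i:i + k], subject[j:j + k])
    ]
```

For instance, `seed_hit(SEED, "AIKW", "AVDW")` holds because A and W match exactly, I and V share a coarse group, and the third position is unconstrained. A family-specific seed, in the sense of the abstract, would correspond to choosing the per-position partitions to suit that family.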

Keywords: PSI-BLAST tool; sequence alignment; seeding technique.


Published in Print: 2012-12-01

Citation Information: Bulletin of the Polish Academy of Sciences: Technical Sciences. Volume 60, Issue 3, Pages 495–505, ISSN (Print) 0239-7528, DOI: https://doi.org/10.2478/v10175-012-0063-0, December 2012


