Efficient seeding techniques for protein similarity search

Open archive

Roytberg, Mikhail | Gambin, Anna | Noé, Laurent | Lasota, Slawomir | Furletova, Eugenia | Szczurek, Ewa | Kucherov, Gregory

Published by CCSD; Springer Berlin Heidelberg

International audience. We apply the concept of subset seeds proposed in [1] to similarity search in protein sequences. The main question studied is the design of efficient seed alphabets to construct seeds with optimal sensitivity/selectivity trade-offs. We propose several different design methods and use them to construct several alphabets. We then perform an analysis of seeds built over those alphabets and compare them with the standard Blastp seeding method [2,3], as well as with the family of vector seeds proposed in [4]. While the formalism of subset seeds is less expressive (but less costly to implement) than the accumulative principle used in Blastp and vector seeds, our seeds show a similar or even better performance than Blastp on Bernoulli models of proteins compatible with the common BLOSUM62 matrix.
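As a rough illustration of the subset-seed idea summarized above, the following Python sketch models each seed letter as a partition of the 20 amino acids: two residues match at a seed position when they fall in the same group. The letter symbols (`#`, `@`, `_`), the residue grouping used for `@`, and the function names are assumptions made for this example; they are not the alphabets or seeds designed in the paper.

```python
# Minimal sketch of subset-seed matching over a protein alphabet.
# All letters and groupings below are illustrative assumptions.
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"


def make_letter(groups):
    """Build a seed letter: map each residue to the index of its group."""
    table = {}
    for index, group in enumerate(groups):
        for residue in group:
            table[residue] = index
    # A seed letter must partition the whole amino-acid alphabet.
    assert set(table) == set(AMINO_ACIDS)
    assert sum(len(group) for group in groups) == len(AMINO_ACIDS)
    return table


SEED_LETTERS = {
    # '#': every residue is its own group, so only identities match.
    "#": make_letter(list(AMINO_ACIDS)),
    # '@': hypothetical grouping of chemically similar residues.
    "@": make_letter(["ILMV", "FWY", "KRH", "DE", "NQ", "ST", "A", "C", "G", "P"]),
    # '_': a single group, so any pair of residues matches.
    "_": make_letter([AMINO_ACIDS]),
}


def seed_key(seed, word):
    """Hash key of a word under a seed: the group index at each position."""
    if len(seed) != len(word):
        raise ValueError("seed and word must have the same length")
    return tuple(SEED_LETTERS[letter][residue] for letter, residue in zip(seed, word))


def seed_hits(seed, word1, word2):
    """True if the two words match the seed at every position."""
    return seed_key(seed, word1) == seed_key(seed, word2)


print(seed_hits("#@_#", "AIKC", "ALWC"))
print(seed_hits("#@_#", "AIKC", "AFWC"))
```

Because a hit is decided position by position, each word maps to a single key, so candidate hits can be located with a direct hash lookup. This is one way to read the abstract's remark that subset seeds are less costly to implement than an accumulative score threshold.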

Suggestions

By the same author

On subset seeds for protein alignment

Open archive | Roytberg, Mikhail, A. | CCSD

International audience. We apply the concept of subset seeds proposed in [1] to similarity search in protein sequences. The main question studied is the design of efficient seed alphabets to construct seeds with opt...

Subset seed extension to Protein BLAST

Open archive | Gambin, Anna | CCSD

International audience. The seeding technique became central in the theory of sequence alignment and there are several efficient tools applying seeds to DNA homology search. Recently, a concept of subset seeds has b...

A unifying framework for seed sensitivity and its application to subset seeds.

Open archive | Kucherov, Gregory | CCSD

International audience. We propose a general approach to computing seed sensitivity that can be applied to different definitions of seeds. It treats separately three components of the seed sensitivity problem--a ...
