A unifying framework for seed sensitivity and its application to subset seeds.

Open archive

Kucherov, Gregory | Noé, Laurent | Roytberg, Mihkail

Published by CCSD; World Scientific Publishing

International audience. We propose a general approach to computing seed sensitivity that can be applied to different definitions of seeds. It treats separately three components of the seed sensitivity problem (a set of target alignments, an associated probability distribution, and a seed model), which are specified by distinct finite automata. The approach is then applied to a new concept of subset seeds, for which we propose an efficient automaton construction. Experimental results confirm that sensitive subset seeds can be designed efficiently using our approach and can then be used in similarity search, producing better results than ordinary spaced seeds.
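To make the notion of seed sensitivity concrete, here is a minimal sketch, not the construction proposed in the paper. It assumes the simplest setting: target alignments are all binary strings of a fixed length (1 = match, 0 = mismatch), the distribution is Bernoulli with match probability `p`, and the seed is an ordinary spaced seed written with `#` (must match) and `-` (don't care). The function names `seed_hits` and `sensitivity` are hypothetical, and the three components are fused into one naive dynamic program here rather than given as separate automata.

```python
from itertools import product


def seed_hits(seed: str, window: tuple) -> bool:
    """True if every '#' position of the seed faces a match (1) in the window."""
    return all(w == 1 for s, w in zip(seed, window) if s == "#")


def sensitivity(seed: str, length: int, p: float) -> float:
    """Probability that the seed hits a random alignment of the given length.

    Assumption: each alignment position is a match independently with
    probability p (Bernoulli model). A state is the last span-1 symbols read;
    a hit is an absorbing state whose probability mass is accumulated in `hit`.
    """
    span = len(seed)
    if length < span:
        return 0.0

    # Distribution over the first span-1 symbols (no hit is possible yet).
    states = {}
    for w in product((0, 1), repeat=span - 1):
        prob = 1.0
        for b in w:
            prob *= p if b else 1.0 - p
        states[w] = prob

    hit = 0.0
    for _ in range(length - span + 1):
        nxt = {}
        for w, prob in states.items():
            for b, pb in ((1, p), (0, 1.0 - p)):
                window = w + (b,)
                if seed_hits(seed, window):
                    hit += prob * pb          # absorbed: the seed has hit
                else:
                    key = window[1:]          # slide the window by one symbol
                    nxt[key] = nxt.get(key, 0.0) + prob * pb
        states = nxt
    return hit


if __name__ == "__main__":
    # Compare a contiguous seed with a spaced seed of the same weight (3).
    for s in ("###", "##-#"):
        print(s, sensitivity(s, 20, 0.7))
```

The state space of this sketch grows as 2^(span-1), which is exactly the cost that a compact automaton construction aims to avoid. Subset seeds, where each seed letter accepts a subset of alignment letters, are not covered by this example.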

Suggestions

By the same author

Efficient seeding techniques for protein similarity search

Open archive | Roytberg, Mihkail | CCSD

International audience. We apply the concept of subset seeds proposed in [1] to similarity search in protein sequences. The main question studied is the design of efficient seed alphabets to construct seeds with opt...

Subset seed automaton

Open archive | Kucherov, Gregory | CCSD

International audience. We study the pattern matching automaton introduced in [KucherovNoeRoytbergJBCB06] for the purpose of seed-based similarity search. We show that our definition provides a compact automaton, mu...

Protein similarity search with subset seeds on a dedicated reconfigurable hardware

Open archive | Peterlongo, Pierre | CCSD

International audience. Genome sequencing of numerous species raises the need for complete genome comparison with precise and fast similarity searches. Today, advanced seed-based techniques (spaced seeds, multiple se...
