Back-translation for discovering distant protein homologies in the presence of frameshift mutations

Open archive

Gîrdea, Marta | Noé, Laurent | Kucherov, Gregory

Published by CCSD; BioMed Central

International audience.
Background
Frameshift mutations in protein-coding DNA sequences drastically change the resulting protein sequence, which prevents classic protein alignment methods from revealing the proteins' common origin. Moreover, when the divergence also involves a large number of substitutions, homology detection becomes difficult even at the DNA level.
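To illustrate why a frameshift hides the common origin at the protein level, here is a minimal Python sketch. It is not taken from the paper: the `translate` helper and the example sequence are hypothetical, and the codon table is the standard genetic code written in the usual compact TCAG order.

```python
# Standard genetic code, codons ordered TTT, TTC, TTA, TTG, TCT, ... (bases in TCAG order).
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    b1 + b2 + b3: AMINO[16 * i + 4 * j + k]
    for i, b1 in enumerate(BASES)
    for j, b2 in enumerate(BASES)
    for k, b3 in enumerate(BASES)
}


def translate(dna):
    """Translate a DNA string codon by codon; a trailing partial codon is ignored."""
    usable = len(dna) - len(dna) % 3
    return "".join(CODON_TABLE[dna[i:i + 3]] for i in range(0, usable, 3))


# Hypothetical coding sequence, chosen only for illustration.
original = "ATGGCTAAAGGTCTGTGG"
# Insert a single base after the first codon: every later codon is read in a new frame.
shifted = original[:3] + "C" + original[3:]

print(translate(original))  # MAKGLW
print(translate(shifted))   # MR*RSV
```

The two DNA strings differ by one inserted nucleotide, yet the translated proteins share only their first residue, so an alignment of the protein sequences alone has almost nothing to match.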
Results
We developed a novel method for inferring distant homology relations between two proteins that accounts for frameshift and point mutations that may have affected their coding sequences. We designed a dynamic programming alignment algorithm that operates on memory-efficient graph representations of the complete set of putative DNA sequences of each protein. Its goal is to determine the two putative DNA sequences with the best-scoring alignment under a scoring system designed to reflect the most probable evolutionary process. Our implementation is freely available at http://bioinfo.lifl.fr/path/.
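The notion of "the complete set of putative DNA sequences" of a protein can be made concrete with a small Python sketch. This is an illustration under stated assumptions, not the authors' implementation: the function names are hypothetical, and the per-residue list of codons used here is a much simpler structure than the graph representation described above. It only shows why a compact representation is needed, since the number of back-translations grows multiplicatively with protein length.

```python
from itertools import product
from math import prod

# Standard genetic code in compact TCAG order, grouped by amino acid.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODONS = {}
for index, amino_acid in enumerate(AMINO):
    codon = BASES[index // 16] + BASES[(index // 4) % 4] + BASES[index % 4]
    CODONS.setdefault(amino_acid, []).append(codon)


def codon_layers(protein):
    """One list of candidate codons per residue (a compact stand-in for the full set)."""
    return [CODONS[residue] for residue in protein]


def count_back_translations(protein):
    """Number of distinct DNA sequences that translate to `protein`."""
    return prod(len(layer) for layer in codon_layers(protein))


def back_translations(protein):
    """Lazily enumerate every putative DNA sequence; feasible for short proteins only."""
    for choice in product(*codon_layers(protein)):
        yield "".join(choice)


print(count_back_translations("MW"))    # 1
print(count_back_translations("LSR"))   # 216
print(list(back_translations("MF")))    # ['ATGTTT', 'ATGTTC']
```

Three residues with six codons each already yield 216 candidate sequences, so explicit enumeration is impractical for real proteins. Storing the choices per position, as `codon_layers` does, keeps the size linear in the protein length, which is the general motivation for aligning over a compact representation rather than over the enumerated set.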
Conclusions
Our approach uncovers evolutionary information that traditional alignment methods do not capture, as confirmed by biologically significant examples.

Suggestions

By the same author

Designing Efficient Spaced Seeds for SOLiD Read Mapping.

Open archive | Noé, Laurent | CCSD

International audience. The advent of high-throughput sequencing technologies constituted a major advance in genomic studies, offering new prospects in a wide range of applications. We propose a rigorous and flexible...

Protein similarity search with subset seeds on a dedicated reconfigurable hardware

Open archive | Peterlongo, Pierre | CCSD

International audience. Genome sequencing of numerous species raises the need for complete genome comparison with precise and fast similarity searches. Today, advanced seed-based techniques (spaced seeds, multiple se...

Improved hit criteria for DNA local alignment.

Open archive | Noé, Laurent | CCSD

International audience. BACKGROUND: The hit criterion is a key component of heuristic local alignment algorithms. It specifies a class of patterns assumed to witness a potential similarity, and this choice is decisi...
