Towards a reliable objective function for multiple sequence alignments

Open archive

Thompson, Julie | Plewniak, Frédéric | Ripp, Raymond | Thierry, Jean-Claude | Poch, Olivier

Published by CCSD; Elsevier

Multiple sequence alignment is a fundamental tool in many domains of modern molecular biology, including functional and evolutionary studies of protein families. Multiple alignments also play an essential role in the new integrated systems for genome annotation and analysis. The development of new multiple alignment scores and statistics is therefore essential, in the spirit of the work dedicated to evaluating pairwise sequence alignments for database searching techniques. We present here norMD, a new objective scoring function for multiple sequence alignments. NorMD combines the advantages of column-scoring techniques with the sensitivity of methods incorporating residue similarity scores. In addition, norMD incorporates ab initio sequence information, such as the number, length and similarity of the sequences to be aligned. The sensitivity and reliability of the norMD objective function are demonstrated using structural alignments in the SCOP and BAliBASE databases. The norMD scores are then applied to the multiple alignments of complete sequences (MACS) detected by BlastP with E-value < 10, for a set of 734 hypothetical proteins encoded by the Vibrio cholerae genome. Unrelated or badly aligned sequences were automatically removed from the MACS, leaving a high-quality multiple alignment that could be reliably exploited in a subsequent functional and/or structural annotation process. After removal of unreliable sequences, 176 (24%) of the alignments contained at least one sequence with a functional annotation. Of these new matches, 103 were supported by significant hits to the InterPro domain and motif database.
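The general idea of scoring an alignment column by column and then discarding sequences that degrade the score can be illustrated with a minimal sketch. This is not the norMD function: the abstract does not give its formula, so the code below assumes a toy identity-based similarity (1 for identical residues, 0 otherwise), ignores gaps, and normalises only by the number of residue pairs and columns. The names `column_score`, `alignment_score` and `drop_worst_sequence` are hypothetical and chosen for illustration only.

```python
from itertools import combinations


def column_score(column, match=1.0, mismatch=0.0):
    """Average pairwise similarity of the non-gap residues in one column.

    Assumption: a toy identity score stands in for a real residue
    similarity matrix; gaps ("-") are simply skipped.
    """
    residues = [r for r in column if r != "-"]
    pairs = list(combinations(residues, 2))
    if not pairs:
        return 0.0
    return sum(match if a == b else mismatch for a, b in pairs) / len(pairs)


def alignment_score(rows):
    """Mean column score over the whole alignment (toy stand-in, not norMD)."""
    if not rows or len({len(r) for r in rows}) != 1:
        raise ValueError("rows must be non-empty and of equal length")
    columns = list(zip(*rows))
    if not columns:
        return 0.0
    return sum(column_score(c) for c in columns) / len(columns)


def drop_worst_sequence(rows):
    """Index of the row whose removal raises the score most, or None.

    Hypothetical filtering step: at least two rows are always kept.
    """
    best_idx, best = None, alignment_score(rows)
    for i in range(len(rows)):
        rest = rows[:i] + rows[i + 1:]
        if len(rest) < 2:
            continue
        score = alignment_score(rest)
        if score > best:
            best_idx, best = i, score
    return best_idx
```

With the rows `["ACDE", "ACDE", "WWWW"]`, the third sequence lowers every column score, so `drop_worst_sequence` flags index 2. A realistic objective function would replace the identity score with a substitution matrix and account for gaps and for the number, length and similarity of the sequences, as the abstract describes.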


Suggestions

By the same author

Multiple alignment of complete sequences (MACS) in the post-genomic era

Open archive | Lecompte, Odile | CCSD

Multiple alignment, since its introduction in the early seventies, has become a cornerstone of modern molecular biology. It has traditionally been used to deduce structure / function by homology, to detect conserved motifs and in ...

An integrated analysis of the genome of the hyperthermophilic archaeon Pyrococcus abyssi

Open archive | Cohen, Georges | CCSD

The hyperthermophilic euryarchaeon Pyrococcus abyssi and the related species Pyrococcus furiosus and Pyrococcus horikoshii, whose genomes have been completely sequenced, are presently used as...

Genome Evolution at the Genus Level: Comparison of Three Complete Genomes of Hyperthermophilic Archaea

Open archive | Lecompte, Odile | CCSD

We have compared three complete genomes of closely related hyperthermophilic species of Archaea belonging to the Pyrococcus genus: Pyrococcus abyssi, Pyrococcus horikoshii, and Pyrococcus fur...
