Accounting for ambiguity in ancestral sequence reconstruction

Open archive

Oliva, Adrien | Pulicani, Sylvain | Lefort, Vincent | Brehelin, Laurent | Gascuel, Olivier | Guindon, Stéphane

Published by CCSD; Oxford University Press (OUP)

International audience. Motivation: The reconstruction of ancestral genetic sequences from the analysis of contemporaneous data is a powerful tool to improve our understanding of molecular evolution. Various statistical criteria defined in a phylogenetic framework can be used to infer nucleotide, amino-acid or codon states at internal nodes of the tree, for every position along the sequence. These approaches generally select the single state that maximizes (or minimizes) the chosen criterion. Although perfectly sensible from a statistical perspective, that strategy fails to convey useful information about the level of uncertainty associated with the inference. Results: The present study introduces a new criterion for ancestral sequence reconstruction, the minimum posterior expected error (MPEE), that selects a single state whenever the signal conveyed by the data is strong, and a combination of multiple states otherwise. We also assess the performance of a criterion based on the Brier scoring scheme which, like MPEE, does not rely on any tuning parameters. The precision and accuracy of several other criteria that involve arbitrarily set tuning parameters are also evaluated. Large-scale simulations demonstrate the benefits of using the MPEE and Brier-based criteria, with a substantial increase in the accuracy of the inference of past sequences compared to the standard approach and realistic compromises on the precision of the solutions returned. Availability and implementation: The software package PhyML (https://github.com/stephaneguindon/phyml) provides an implementation of the Maximum A Posteriori (MAP) and MPEE criteria for reconstructing ancestral nucleotide and amino-acid sequences.
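
To make the contrast between a single-state criterion and a set-valued one concrete, the following is a minimal sketch in Python. It is not taken from the paper or from PhyML. It assumes, purely for illustration, that a candidate set of states is scored as a uniform prediction over its members and that the set with the lowest posterior expected Brier score is returned. The function names (`map_state`, `expected_brier`, `brier_set`) and the example posteriors are hypothetical. The MPEE error function is not specified in the abstract, so it is not reproduced here.

```python
from itertools import combinations


def map_state(posterior):
    """Return the single state with the highest posterior probability (MAP)."""
    return max(posterior, key=posterior.get)


def expected_brier(posterior, subset):
    """Posterior expected Brier score of a uniform prediction over `subset`.

    Assumption (not from the source): a set of k states is read as the
    forecast q(s) = 1/k for s in the set and 0 otherwise.
    """
    q = {s: (1.0 / len(subset) if s in subset else 0.0) for s in posterior}
    total = 0.0
    for true_state, p_true in posterior.items():
        # Squared distance between the forecast and the one-hot truth.
        score = sum(
            (q[s] - (1.0 if s == true_state else 0.0)) ** 2 for s in posterior
        )
        total += p_true * score
    return total


def brier_set(posterior):
    """Return the non-empty set of states minimizing the expected Brier score."""
    states = sorted(posterior)
    candidates = [
        frozenset(c)
        for k in range(1, len(states) + 1)
        for c in combinations(states, k)
    ]
    return min(candidates, key=lambda c: expected_brier(posterior, c))


# Hypothetical posteriors at one site of one internal node.
strong = {"A": 0.97, "C": 0.01, "G": 0.01, "T": 0.01}
weak = {"A": 0.45, "C": 0.05, "G": 0.45, "T": 0.05}

print(map_state(strong), sorted(brier_set(strong)))
print(sorted(brier_set(weak)))
```

Under this assumed scoring, a sharply peaked posterior yields a single state, matching the MAP choice, while a posterior split between two states yields an ambiguous two-state call. Exhaustive enumeration of subsets is practical for four nucleotides but grows as 2^n, so a 20-state amino-acid alphabet would call for a smarter search.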

