Evolution of Coding Microsatellites in Primate Genomes

Open archive

Loire, Etienne | Higuet, Dominique | Netter, Pierre | Achaz, Guillaume

Published by CCSD; Society for Molecular Biology and Evolution

International audience. Microsatellites (SSRs) are highly susceptible to expansions and contractions. When located in a coding sequence, the insertion or the deletion of a single unit for a mono-, di-, tetra-, or penta(nucleotide)-SSR creates a frameshift. As a consequence, one would expect to find only very few of these SSRs in coding sequences because of their strong deleterious potential. Unexpectedly, genomes contain many coding SSRs of all types. Here, we report on a study of their evolution in a phylogenetic context using the genomes of four primates: human, chimpanzee, orangutan, and macaque. In a set of 5,015 orthologous genes unambiguously aligned among the four species, we show that, except for tri- and hexa-SSRs, for which insertions and deletions are frequently observed, SSRs in coding regions evolve mainly by substitutions. We show that the rate of substitution in all types of coding SSRs is typically two times higher than in the rest of coding sequences. Additionally, we observe that although numerous coding SSRs are created and lost by substitutions in the lineages, their numbers remain constant. This last observation suggests that the coding SSRs have reached equilibrium. We hypothesize that this equilibrium involves a combination of mutation, drift, and selection. We thus estimated the fitness cost of mono-SSRs and show that it increases with the number of units. We finally show that the cost of coding mono-SSRs greatly varies from function to function, suggesting that the strength of the selection that acts against them can be correlated to gene functions.
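The frameshift argument in the abstract comes down to simple arithmetic: gaining or losing one repeat unit changes the sequence length by the unit size, and the reading frame is preserved only when that change is a multiple of three. The sketch below is a minimal illustration of that reasoning, not the authors' method; the names `causes_frameshift` and `SSR_TYPES` are assumptions introduced here.

```python
CODON_LENGTH = 3  # nucleotides per codon

# Repeat-unit sizes (in nucleotides) for the SSR types named in the abstract.
# This mapping is an illustrative assumption, not data from the paper.
SSR_TYPES = {"mono": 1, "di": 2, "tri": 3, "tetra": 4, "penta": 5, "hexa": 6}


def causes_frameshift(unit_length: int, units_changed: int = 1) -> bool:
    """Return True if inserting or deleting `units_changed` repeat units
    of size `unit_length` shifts the reading frame."""
    return (unit_length * units_changed) % CODON_LENGTH != 0


# SSR types for which a single-unit insertion or deletion breaks the frame.
frameshifting = [name for name, size in SSR_TYPES.items() if causes_frameshift(size)]
print(frameshifting)
```

Under these assumptions, only tri- and hexa-SSRs tolerate a single-unit change without a frameshift, which fits the abstract's observation that insertions and deletions are frequent only in those two classes.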

Suggestions

By the same author

Hypermutability of genes in Homo sapiens due to the hosting of long mono-SSR.

Open archive | Loire, Etienne | CCSD

International audience. Simple Sequence Repeats (SSRs) are very common short repeats in eukaryotic genomes. Long SSRs are considered "hypermutable" sequences since they exhibit a high rate of expansion and contracti...

Expression Patterns Suggest that Despite Considerable Functional Redundancy, Galectin-4 and -6 Play Distinct Roles in Normal and Damaged Mouse Digestive Tract

Open archive | Houzelstein, Denis | CCSD

International audience. The galectin-4 protein is mostly expressed in the digestive tract and is associated with lipid raft stabilization, protein apical trafficking, wound healing, and inflamma...

MicNeSs: genotyping microsatellite loci from a collection of (NGS) reads

Open archive | Suez, Marie | CCSD

International audience