Hypermutability of genes in Homo sapiens due to the hosting of long mono-SSR.

Open archive

Loire, Etienne | Praz, Françoise | Higuet, Dominique | Netter, Pierre | Achaz, Guillaume

Published by CCSD; Oxford University Press (OUP)

International audience. Simple Sequence Repeats (SSRs) are very common short repeats in eukaryotic genomes. Long SSRs are considered "hypermutable" sequences because they exhibit a high rate of expansion and contraction. Because they are potentially deleterious, long SSRs tend to be uncommon in coding sequences. However, several genes contain "long" SSRs in their exonic sequences. Here, we identify 1,291 human genes that host a mono-nucleotide SSR (mono-SSR) long enough to be prone to expansion or contraction; these genes are referred to as "hypermutable" hereafter. On the basis of Gene Ontology annotations, we show that only a restricted number of functions are overrepresented among these hypermutable genes, including cell cycle and maintenance of DNA integrity. Using a probabilistic model, we show that genes involved in these functions are expected to host long SSRs because they tend to be long and/or are biased in nucleotide composition. Finally, we show that for almost all functions we observe fewer hypermutable sequences than expected under a neutral model. There are, however, interesting exceptions: for example, genes involved in protein and RNA transport, as well as in meiosis and mismatch repair, have as many hypermutable genes as expected under neutrality. Conversely, there are functions (e.g. collagen-related genes) where hypermutable genes are more strongly avoided than in other functions. Our results show that, even though several functions harbor unusually long SSRs in their exons, long SSRs are deleterious sequences in almost all functions and are removed by purifying selection. The strength of this purifying selection, however, varies greatly from function to function. We discuss possible explanations for this intriguing result.
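The two ideas in the abstract, detecting long mono-SSRs and asking how often they would arise by chance, can be illustrated with a minimal Python sketch. It is not the authors' pipeline or model. The function names, the length threshold of 8, and the i.i.d. nucleotide model are all assumptions made for illustration only; the abstract does not state the threshold or the form of the probabilistic model.

```python
import re
from typing import Dict, List, Tuple


def find_mono_ssrs(seq: str, min_len: int = 8) -> List[Tuple[int, str, int]]:
    """Return (start, base, length) for each mononucleotide run of at least
    min_len bases. The default threshold is illustrative, not the paper's."""
    # ([ACGT]) captures one base; \1{min_len-1,} requires that many more copies.
    pattern = re.compile(r"([ACGT])\1{%d,}" % (min_len - 1))
    return [(m.start(), m.group(1), len(m.group(0)))
            for m in pattern.finditer(seq.upper())]


def prob_long_run(n: int, k: int, freqs: Dict[str, float]) -> float:
    """Probability that a random sequence of length n contains at least one
    mononucleotide run of length >= k, assuming independent positions with
    the given base frequencies (a simplifying assumption)."""
    if n < k:
        return 0.0
    if k <= 1:
        return 1.0
    bases = list(freqs)
    # state[b][r]: probability that no run >= k has occurred so far and the
    # sequence currently ends with exactly r consecutive copies of base b.
    state = {b: [0.0] * k for b in bases}
    for b in bases:
        state[b][1] = freqs[b]
    for _ in range(n - 1):
        new = {b: [0.0] * k for b in bases}
        total = {b: sum(state[b]) for b in bases}
        all_total = sum(total.values())
        for b in bases:
            # A different base precedes b: a new run of length 1 starts.
            new[b][1] = freqs[b] * (all_total - total[b])
            # The same base precedes b: the run grows; runs reaching k are
            # dropped from the "no long run yet" probability mass.
            for r in range(1, k - 1):
                new[b][r + 1] = freqs[b] * state[b][r]
        state = new
    surviving = sum(sum(v) for v in state.values())
    return 1.0 - surviving


if __name__ == "__main__":
    uniform = {"A": 0.25, "C": 0.25, "G": 0.25, "T": 0.25}
    at_rich = {"A": 0.40, "C": 0.10, "G": 0.10, "T": 0.40}
    print(find_mono_ssrs("ACGT" + "A" * 9 + "CGT"))
    for length in (1000, 3000):
        print(length, prob_long_run(length, 8, uniform),
              prob_long_run(length, 8, at_rich))
```

Under this toy model, the probability of hosting a long run grows with sequence length and with compositional bias, which is the qualitative effect the abstract attributes to gene length and nucleotide composition.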


Suggestions

By the same author

Evolution of Coding Microsatellites in Primate Genomes

Open archive | Loire, Etienne | CCSD

International audience. Microsatellites (SSRs) are highly susceptible to expansions and contractions. When located in a coding sequence, the insertion or the deletion of a single unit for a mono-, di-, tetra-, or pe...

Expression Patterns Suggest that Despite Considerable Functional Redundancy, Galectin-4 and -6 Play Distinct Roles in Normal and Damaged Mouse Digestive Tract

Open archive | Houzelstein, Denis | CCSD

International audience. The galectin-4 protein is mostly expressed in the digestive tract and is associated with lipid raft stabilization, protein apical trafficking, wound healing, and inflamma...

MicNeSs: genotyping microsatellite loci from a collection of (NGS) reads

Open archive | Suez, Marie | CCSD

International audience
