Detecting the molecular scars of evolution in the Mycobacterium tuberculosis complex by analyzing interrupted coding sequences.

Open archive

Deshayes, Caroline | Perrodou, Emmanuel | Euphrasie, Daniel | Frapy, Eric | Poch, Olivier | Bifani, Pablo | Lecompte, Odile | Reyrat, Jean-Marc

Published by CCSD; BioMed Central

BACKGROUND: Computer-assisted analyses have shown that all bacterial genomes contain a small percentage of open reading frames with a frameshift or in-frame stop codon. We report here a comparative analysis of these interrupted coding sequences (ICDSs) in six isolates of M. tuberculosis, two of M. bovis and one of M. africanum, and question their phenotypic impact and evolutionary significance. RESULTS: ICDSs were classified as "common to all strains" or "strain-specific". Common ICDSs are believed to result from mutations acquired before the divergence of the species, whereas strain-specific ICDSs were acquired after this divergence. Comparative analyses of these ICDSs therefore define the molecular signature of a particular strain, phylogenetic lineage or species, which may be useful for inferring phenotypic traits such as virulence and molecular relationships. For instance, in silico analysis shows that the W-Beijing lineage of M. tuberculosis, an emergent family involved in several outbreaks, is readily distinguishable from other phyla by its smaller number of common ICDSs, including at least one known to be associated with virulence. Our observation was confirmed by sequencing ICDSs in a panel of 21 clinical M. tuberculosis strains. This analysis further illustrates the divergence of the W-Beijing lineage from other phyla in terms of the number of full-length ORFs not containing a frameshift. We further show that ICDS formation is not associated with the presence of a mutated promoter, and suggest that promoter extinction is not the main cause of pseudogene formation. CONCLUSION: The correlation between ICDSs, function and phenotypes could have important evolutionary implications. This study provides population geneticists with a list of targets that could undergo selective pressure and thus alter the relationships between the various lineages of M. tuberculosis strains and their hosts.
This approach could be applied to any closely related bacterial strains or species for which several genome sequences are available.
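
The "common to all strains" versus "strain-specific" classification described in the abstract can be illustrated with plain set operations. The sketch below is not the authors' pipeline: the function name `classify_icds`, the strain labels and the ICDS identifiers are all hypothetical, and it assumes that "strain-specific" means an ICDS found in exactly one strain of the comparison set.

```python
from typing import Dict, Set, Tuple


def classify_icds(
    icds_by_strain: Dict[str, Set[str]]
) -> Tuple[Set[str], Dict[str, Set[str]]]:
    """Split ICDSs into those shared by every strain and those unique to one strain.

    Assumption (not from the source): an ICDS is "strain-specific" when it
    occurs in exactly one strain of the comparison set.
    """
    strains = list(icds_by_strain)
    if not strains:
        return set(), {}

    # Common ICDSs: present in every strain.
    common = set.intersection(*(icds_by_strain[s] for s in strains))

    # Strain-specific ICDSs: present in one strain and absent from all others.
    strain_specific = {}
    for s in strains:
        others = set().union(*(icds_by_strain[t] for t in strains if t != s))
        strain_specific[s] = icds_by_strain[s] - others
    return common, strain_specific


# Hypothetical placeholder data, for illustration only.
example = {
    "strain_1": {"icds_a", "icds_b", "icds_c"},
    "strain_2": {"icds_a", "icds_b"},
    "strain_3": {"icds_a", "icds_d"},
}
common, specific = classify_icds(example)
print(sorted(common))
print({s: sorted(v) for s, v in specific.items()})
```

ICDSs shared by only some of the strains (such as `icds_b` here) fall into neither group in this sketch; a lineage-level analysis would need a further category for them.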

Suggestions

By the same author

Interrupted coding sequences in Mycobacterium smegmatis: authentic mutations or sequencing errors?

Open archive | Deshayes, Caroline | CCSD

BACKGROUND: In silico analysis has shown that all bacterial genomes contain a low percentage of ORFs with undetected frameshifts and in-frame stop codons. These interrupted coding sequences (ICDSs) may really be present in the org...

Ortho-proteogenomics: multiple proteomes investigation through orthology and a new MS-based protocol.

Open archive | Gallien, Sébastien | CCSD

International audience. The progress in sequencing technologies irrigates biology with an ever-increasing number of genome sequences. In most cases, the gene repertoire is predicted in silico and conceptually transl...

ICDS database: interrupted CoDing sequences in prokaryotic genomes.

Open archive | Perrodou, Emmanuel | CCSD

International audience. Unrecognized frameshifts, in-frame stop codons and sequencing errors lead to Interrupted CoDing Sequence (ICDS) that can seriously affect all subsequent steps of functional characterization, ...
