ICDS database: interrupted CoDing sequences in prokaryotic genomes.

Open archive

Perrodou, Emmanuel | Deshayes, Caroline | Muller, Jean | Schaeffer, Christine | van Dorsselaer, Alain | Ripp, Raymond | Poch, Olivier | Reyrat, Jean-Marc | Lecompte, Odile

Published by CCSD; Oxford University Press

International audience. Unrecognized frameshifts, in-frame stop codons and sequencing errors lead to Interrupted CoDing Sequences (ICDSs) that can seriously affect all subsequent steps of functional characterization, from in silico analysis to high-throughput proteomic projects. Here, we describe the Interrupted CoDing Sequence database, which contains ICDSs detected by a similarity-based approach in 80 complete prokaryotic genomes. ICDSs can be retrieved by species browsing or similarity searches via a web interface (http://www-bio3d-igbmc.u-strasbg.fr/ICDS/). The definition of each interrupted gene is provided, as well as the genomic localization of the ICDS with its surrounding sequence. Furthermore, to facilitate the experimental characterization of ICDSs, we propose optimized primers for re-sequencing. The database will be regularly updated with additional data from genomes currently being sequenced. Our strategy has been validated by three independent tests: (i) ICDS prediction on a benchmark of artificially created frameshifts, (ii) comparison of predicted ICDSs with the results obtained by comparing the two genomic sequences of Bacillus licheniformis strain ATCC 14580 and (iii) re-sequencing of 25 predicted ICDSs in the recently sequenced genome of Mycobacterium smegmatis. These tests allow us to estimate the specificity and sensitivity of our program (95 and 82%, respectively) and the efficiency of primer determination.
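As a reading aid for the specificity and sensitivity figures quoted in the abstract, here is a minimal sketch of how such rates are commonly computed from validation counts. It assumes the standard confusion-matrix definitions; the abstract does not say which definitions the authors used, and some gene-prediction work uses "specificity" to mean TP / (TP + FP) instead. The function names and the counts in the usage lines are hypothetical and are not taken from the paper.

```python
def sensitivity(tp: int, fn: int) -> float:
    """Share of true interruptions that were predicted: TP / (TP + FN)."""
    return tp / (tp + fn)


def specificity(tn: int, fp: int) -> float:
    """Share of intact sequences correctly left unflagged: TN / (TN + FP).

    Assumption: standard definition; the source does not state which one it uses.
    """
    return tn / (tn + fp)


# Hypothetical counts, for illustration only (not data from the paper).
example_sensitivity = sensitivity(tp=8, fn=2)
example_specificity = specificity(tn=9, fp=1)
```

With these made-up counts, 8 of 10 true interruptions are recovered and 9 of 10 intact sequences are left unflagged.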

Suggestions

By the same author

Ortho-proteogenomics: multiple proteomes investigation through orthology and a new MS-based protocol.

Open archive | Gallien, Sébastien | CCSD

International audience. The progress in sequencing technologies irrigates biology with an ever-increasing number of genome sequences. In most cases, the gene repertoire is predicted in silico and conceptually transl...

Interrupted coding sequences in Mycobacterium smegmatis: authentic mutations or sequencing errors?

Open archive | Deshayes, Caroline | CCSD

BACKGROUND: In silico analysis has shown that all bacterial genomes contain a low percentage of ORFs with undetected frameshifts and in-frame stop codons. These interrupted coding sequences (ICDSs) may really be present in the org...

Detecting the molecular scars of evolution in the Mycobacterium tuberculosis complex by analyzing interrupted coding sequences.

Open archive | Deshayes, Caroline | CCSD

BACKGROUND: Computer-assisted analyses have shown that all bacterial genomes contain a small percentage of open reading frames with a frameshift or in-frame stop codon. We report here a comparative analysis of these interrupted cod...
