TriAnnot: A Versatile and High Performance Pipeline for the Automated Annotation of Plant Genomes.

Open archive

Leroy, Philippe | Guilhot, Nicolas | Sakai, Hiroaki | Bernard, Aurélien | Choulet, Frédéric | Theil, Sébastien | Reboux, Sébastien | Amano, Naoki | Flutre, Timothée | Pelegrin, Céline | Ohyanagi, Hajime | Seidel, Michael | Giacomoni, Franck | Reichstadt, Matthieu | Alaux, Michael | Gicquello, Emmanuelle | Legeai, Fabrice | Cerutti, Lorenzo | Numa, Hisataka | Tanaka, Tsuyoshi | Mayer, Klaus | Itoh, Takeshi | Quesneville, Hadi | Feuillet, Catherine

Published by CCSD; Frontiers

International audience. In support of the international effort to obtain a reference sequence of the bread wheat genome and to provide plant communities dealing with large and complex genomes with a versatile, easy-to-use online automated tool for annotation, we have developed the TriAnnot pipeline. Its modular architecture allows for the annotation and masking of transposable elements, the structural and functional annotation of protein-coding genes with an evidence-based quality indexing, and the identification of conserved non-coding sequences and molecular markers. The TriAnnot pipeline is parallelized on a 712-CPU computing cluster that can run a 1-Gb sequence annotation in less than 5 days. It is accessible through a web interface for small-scale analyses or through a server for large-scale annotations. The performance of TriAnnot was evaluated in terms of sensitivity, specificity, and general fitness using curated reference sequence sets from rice and wheat. In less than 8 h, TriAnnot was able to predict more than 83% of the 3,748 CDS from rice chromosome 1 with a fitness of 67.4%. On a set of 12 reference Mb-sized contigs from wheat chromosome 3B, TriAnnot predicted and annotated 93.3% of the genes, among which 54% were perfectly identified in accordance with the reference annotation. It also allowed the curation of 12 genes based on new biological evidence, increasing the percentage of perfect gene predictions to 63%. TriAnnot systematically showed a higher fitness than other annotation pipelines that are not improved for wheat. As it is easily adaptable to the annotation of other plant genomes, TriAnnot should become a useful resource for the annotation of large and complex genomes in the future.

Suggestions

By the same author

Shifting the limits in wheat research and breeding using a fully annotated reference genome

Open archive | Appels, Rudi | CCSD

International audience. Insights from the annotated wheat genome: Wheat is one of the major sources of food for much of the world. However, because bread wheat's genome is a large hybrid mix of three separate subgeno...

Genome interplay in the grain transcriptome of hexaploid bread wheat

Open archive | Pfeifer, Matthias | CCSD

International audience. Allohexaploid bread wheat (Triticum aestivum L.) provides approximately 20% of calories consumed by humans. Lack of genome sequence for the three homeologous and highly similar bread wheat ge...

The Rice Annotation Project Database (RAP-DB): 2008 update.

Open archive | Tanaka, Tsuyoshi | CCSD

The Rice Annotation Project Database (RAP-DB) was created to provide the genome sequence assembly of the International Rice Genome Sequencing Project (IRGSP), manually curated annotation of the sequence, and other genomics informa...
