BlastR: fast and accurate database searches for non-coding RNAs

Open archive

Bussotti, Giovanni | Raineri, Emanuele | Erb, Ionas | Zytnicki, Matthias | Wilm, Andreas | Beaudoing, Emmanuel | Bucher, Philipp | Notredame, Cedric

Published by CCSD; Oxford University Press

International audience. We present and validate BlastR, a method for efficiently and accurately searching non-coding RNAs. Our approach relies on the comparison of di-nucleotides using BlosumR, a new log-odds substitution matrix. To use BlosumR for comparison, we recode RNA sequences into protein-like sequences. We then show that BlosumR can be used along with the BlastP algorithm to search non-coding RNA sequences. Using Rfam as a gold standard, we benchmark this approach and show BlastR to be more sensitive than BlastN. We also show that BlastR is both faster and more sensitive than BlastP used with a single-nucleotide log-odds substitution matrix. BlastR, when used in combination with WU-BlastP, is about 5% more accurate than WU-BlastN and about 50 times slower. The approach shown here is equally effective when combined with the NCBI-Blast package. The software is open-source freeware available from www.tcoffee.org/blastr.html.
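The recoding step mentioned in the abstract can be illustrated with a minimal sketch. The abstract does not give the actual di-nucleotide alphabet or say whether the di-nucleotides overlap, so everything specific here is an assumption made for illustration: the function name `recode_rna`, the use of overlapping di-nucleotides, and the arbitrary assignment of the 16 di-nucleotides to 16 amino-acid letters. It is not BlastR's real encoding.

```python
from itertools import product

NUCLEOTIDES = "ACGU"

# Hypothetical alphabet: 16 arbitrary amino-acid letters, one per di-nucleotide.
# The real BlastR mapping is not stated in the abstract.
AMINO_LETTERS = "ACDEFGHIKLMNPQRS"

# Build the lookup table AA->A, AC->C, AG->D, ..., UU->S.
DINUC_TO_LETTER = {
    a + b: letter
    for (a, b), letter in zip(product(NUCLEOTIDES, repeat=2), AMINO_LETTERS)
}


def recode_rna(seq: str) -> str:
    """Recode an RNA/DNA string into a protein-like string.

    Assumes overlapping di-nucleotides (positions i and i+1), so a sequence
    of length n yields n - 1 letters. Di-nucleotides containing an
    unrecognised base (for example N) are written as 'X'.
    """
    seq = seq.upper().replace("T", "U")  # accept DNA-style input as well
    return "".join(
        DINUC_TO_LETTER.get(seq[i:i + 2], "X") for i in range(len(seq) - 1)
    )


if __name__ == "__main__":
    print(recode_rna("GGAUCC"))
```

A sequence recoded this way could then be handed to a protein search tool together with a 16-letter substitution matrix, which is the role the abstract assigns to BlosumR and BlastP.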

Suggestions

By the same author

Long noncoding RNAs with enhancer-like function in human cells

Open archive | Derrien, Thomas | CCSD

International audience. While the long noncoding RNAs (ncRNAs) constitute a large portion of the mammalian transcriptome, their biological functions have remained elusive. A few long ncRNAs that have been studied in ...

The GENCODE v7 catalog of human long noncoding RNAs: Analysis of their gene structure, evolution, and expression

Open archive | Derrien, Thomas | CCSD

Supplemental material is available for this article.

Enhanced transcriptome maps from multiple mouse tissues reveal evolutionary constraint in gene expression

Open archive | Pervouchine, Dmitri | CCSD

International audience. Mice have been a long-standing model for human biology and disease. Here we characterize, by RNA sequencing, the transcriptional profiles of a large and heterogeneous collection of m...
