An integrative method to normalize RNA-Seq data.

Open archive

Filloux, Cyril | Meerssemann, Cédric | Philippe, Romain | Forestier, Lionel | Klopp, Christophe | Rocha, Dominique | Maftah, Abderrahman, A. | Petit, Daniel

Published by CCSD; BioMed Central

Background: Transcriptome sequencing is a powerful tool for measuring gene expression but, as with other technologies, various artifacts and biases affect the quantification. To correct some of them, several normalization approaches have emerged, differing both in the statistical strategy employed and in the type of bias corrected. However, there is no clear standard normalization method.

Results: We present a novel methodology to normalize RNA-Seq data that takes into account transcript size, GC content, and sequencing depth, which are the major quantification-related biases. First, in this study, we found that the expression level of transcripts shorter than 600 bp is underestimated, while that of longer transcripts is overestimated, increasingly so as their length grows. Second, as was already well known, the higher the GC content (>50%), the more the transcripts are underestimated. Third, we demonstrated that the sequencing depth impacts the size bias and proposed a correction allowing the comparison of expression levels among many samples. The efficiency of our approach was then tested by comparing the correlation between normalized RNA-Seq data and qRT-PCR expression measurements. All the steps are automated in a program written in Perl and available on request.

Conclusions: The methodology presented in this article identifies and corrects different biases that influence RNA-Seq quantification, and provides more accurate estimations of gene expression levels. This method can be applied to compare expression quantifications from many samples, preferably from the same tissue. To compare samples from different tissues, a calibration using several reference genes will be required.
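To make the three quantities named in the abstract concrete, here is a minimal Python sketch. It is not the authors' Perl program or their correction: it uses the standard RPKM formula (reads per kilobase per million mapped reads) as a generic stand-in for length and depth scaling, a plain GC-fraction calculation, and a Pearson correlation of the kind that could be used to compare normalized values with qRT-PCR measurements. All function names are hypothetical and chosen for illustration only.

```python
from math import sqrt


def gc_content(seq: str) -> float:
    """Return the fraction of G and C bases in a nucleotide sequence."""
    seq = seq.upper()
    if not seq:
        raise ValueError("sequence must not be empty")
    return (seq.count("G") + seq.count("C")) / len(seq)


def rpkm(read_count: int, transcript_length_bp: int, total_mapped_reads: int) -> float:
    """Standard RPKM: scales a raw read count by transcript length (in kb)
    and by sequencing depth (in millions of mapped reads).
    This is a generic normalization, not the correction from the article."""
    if transcript_length_bp <= 0 or total_mapped_reads <= 0:
        raise ValueError("length and depth must be positive")
    return read_count * 1e9 / (transcript_length_bp * total_mapped_reads)


def pearson(xs, ys) -> float:
    """Pearson correlation coefficient between two equal-length sequences."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for constant input")
    return cov / sqrt(var_x * var_y)
```

In use, one would compute a normalized value per transcript, then call `pearson` on the normalized values and the matching qRT-PCR measurements; a higher correlation would suggest the normalization tracks the reference measurements more closely. The biases described in the abstract (the 600 bp threshold, the GC effect above 50%, the depth-dependent size bias) would need corrections beyond this simple scaling.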

Suggestions

By the same author

Bovine TWINKLE and mitochondrial ribosomal protein L43 genes are regulated by an evolutionary conserved bidirectional promoter

Open archive | Meerssemann, Cédric | CCSD

TWINKLE is a mitochondrial DNA helicase playing an important role in mitochondrial DNA replication. In humans, mutations in this gene cause progressive external ophthalmoplegia and mitochondrial DNA depletion syndrome-7. TWINKLE is ...

Genetic variability of the activity of bidirectional promoters: a pilot study in bovine muscle

Open archive | Meerssemann, Cédric | CCSD

Bidirectional promoters are regulatory regions co-regulating the expression of two neighbouring genes organized in a head-to-head orientation. In recent years, these regulatory regions have been studied in many organisms; however,...

Glycosylation-related gene expression is linked to differentiation status in glioblastomas undifferentiated cells

Open archive | Cheray, Mathilde | CCSD

International audience
