Eoulsan 2: an efficient workflow manager for reproducible bulk, long-read and single-cell transcriptomics analyses

Open archive

Lehmann, Nathalie | Perrin, Sandrine | Wallon, Claire | Bauquet, Xavier | Deshaies, Vivien | Firmo, Cyril | Du, Runxin | Berthelier, Charlotte | Hernandez, Céline | Michaud, Cédric | Thieffry, Denis | Le Crom, Stéphane | Thomas-Chollier, Morgane | Jourdren, Laurent

Published by CCSD

Abstract

Motivation: Core sequencing facilities produce huge amounts of sequencing data that need to be analysed with automated workflows to ensure reproducibility and traceability. Eoulsan is a versatile open-source workflow engine that meets the needs of core facilities by automating the analysis of large numbers of samples. Its core design separates the description of the workflow from the actual commands to be run. This distinctive design simplifies usage, since the user does not need to handle code, while ensuring reproducibility. Eoulsan was initially developed for bulk RNA-seq data, but transcriptomics applications have recently widened with the advent of long-read sequencing and single-cell technologies, calling for the development of new workflows.

Results: We present Eoulsan 2, a major update that (i) enhances the workflow manager itself, (ii) facilitates the development of new modules, and (iii) expands its applications to long-read RNA-seq (Oxford Nanopore Technologies) and scRNA-seq (Smart-seq2 and 10x Genomics). The workflow manager has been rewritten, with support for execution on a wider choice of computational infrastructure (workstations, Hadoop clusters, and various job schedulers for cluster usage). Eoulsan now facilitates the development of new modules by reusing wrappers developed for the Galaxy platform, with support for container images (Docker or Singularity) that package the tools to be executed. Finally, Eoulsan natively integrates novel modules for bulk RNA-seq, as well as others specifically designed for processing long-read RNA-seq and scRNA-seq. Eoulsan 2 is distributed with ready-to-use workflows and companion tutorials.

Availability and implementation: Eoulsan is implemented in Java, supported on Linux systems and distributed under the LGPL and CeCILL-C licenses at: http://outils.genomique.biologie.ens.fr/eoulsan/. The source code and sample workflows are available on GitHub: https://github.com/GenomicParisCentre/eoulsan. A GitHub repository for modules using the Galaxy tool XML syntax is further provided at: https://github.com/GenomicParisCentre/eoulsan-tools

Contact: eoulsan@bio.ens.psl.eu


