Spiked proteomic standard dataset for testing label-free quantitative software and statistical methods

Open archive

Ramus, Claire | Hovasse, Agnès | Marcellin, Marlène | Hesse, Anne-Marie | Mouton-Barbosa, Emmanuelle | Bouyssié, David | Vaca, Sebastian | Carapito, Christine | Chaoui, Karima | Bruley, Christophe | Garin, Jérôme | Cianférani, Sarah | Ferro, Myriam | van Dorssaeler, Alain | Burlet-Schiltz, Odile | Schaeffer, Christine | Couté, Yohann | Gonzalez de Peredo, Anne

Published by CCSD; Elsevier

International audience. This data article describes a controlled, spiked proteomic dataset for which the “ground truth” of variant proteins is known. It is based on the LC-MS analysis of samples composed of a fixed background of yeast lysate and different spiked amounts of the UPS1 mixture of 48 recombinant proteins. It can be used to objectively evaluate bioinformatic pipelines for label-free quantitative analysis, and their ability to detect variant proteins with good sensitivity and a low false discovery rate in large-scale proteomic studies. More specifically, it can be useful for tuning software tool parameters, for testing new algorithms for label-free quantitative analysis, and for evaluating downstream statistical methods. The raw MS files can be downloaded from ProteomeXchange with identifier PXD001819 (http://www.ebi.ac.uk/pride/archive/projects/PXD001819). Starting from some of the raw files of this dataset, we also provide processed data obtained with various bioinformatics tools (including MaxQuant, Skyline, MFPaQ, IRMa-hEIDI and Scaffold) in different workflows, to exemplify the use of such data in software benchmarking, as discussed in detail in the accompanying manuscript [1]. The experimental design used for data processing takes advantage of the different spike levels introduced in the samples composing the dataset, and the processed data are merged into a single file to facilitate the evaluation and illustration of software tool results for the detection of variant proteins with different absolute expression levels and fold change values.
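Because the spiked UPS1 proteins are the only true variants and the yeast background is constant, a list of proteins called "differential" by any workflow can be scored directly against the ground truth. The following is a minimal sketch of such a scoring step, not the procedure used by the authors: the `ProteinResult` structure, the accession names, the toy values and the default cutoffs are all hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass


@dataclass
class ProteinResult:
    accession: str           # hypothetical identifier, not a real database entry
    is_spiked: bool          # ground truth: True for a UPS1 protein, False for yeast background
    log2_fold_change: float  # estimated log2 ratio between two spike levels
    p_value: float           # p-value from any downstream statistical test


def evaluate(results, p_cutoff=0.05, min_abs_log2_fc=1.0):
    """Return (sensitivity, false discovery proportion) for one set of cutoffs.

    The cutoffs are illustrative assumptions, not values recommended by the article.
    """
    # Proteins the workflow would report as variant under these cutoffs.
    called = [
        r for r in results
        if r.p_value <= p_cutoff and abs(r.log2_fold_change) >= min_abs_log2_fc
    ]
    true_pos = sum(r.is_spiked for r in called)
    false_pos = len(called) - true_pos
    total_spiked = sum(r.is_spiked for r in results)

    sensitivity = true_pos / total_spiked if total_spiked else 0.0
    fdp = false_pos / len(called) if called else 0.0
    return sensitivity, fdp


# Toy input with made-up values: three spiked proteins, three background proteins.
toy_results = [
    ProteinResult("UPS_A", True, 2.1, 0.001),
    ProteinResult("UPS_B", True, 1.8, 0.010),
    ProteinResult("UPS_C", True, 0.4, 0.200),     # spiked protein that is missed
    ProteinResult("YEAST_A", False, 0.1, 0.700),
    ProteinResult("YEAST_B", False, 1.5, 0.020),  # background protein wrongly called
    ProteinResult("YEAST_C", False, -0.2, 0.500),
]

sensitivity, fdp = evaluate(toy_results)
print(f"sensitivity={sensitivity:.2f}, false discovery proportion={fdp:.2f}")
```

Running the same function over a range of cutoffs, or over the outputs of different tools, gives a simple way to compare workflows on the same ground truth.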

Suggestions

By the same author

Benchmarking quantitative label-free LC–MS data processing workflows using a complex spiked proteomic standard dataset

Open archive | Ramus, Claire | CCSD

International audience. Proteomic workflows based on nanoLC-MS/MS data-dependent-acquisition analysis have progressed tremendously in recent years. High-resolution and fast sequencing instruments have enabled the us...

Proline: an efficient and user-friendly software suite for large-scale proteomics

Open archive | Bouyssié, David | CCSD

International audience. Motivation: The proteomics field requires the production and publication of reliable mass spectrometry-based identification and quantification results. Although many tools or algorithms exist...

Looking for Missing Proteins in the Proteome of Human Spermatozoa: An Update

Open archive | Vandenbrouck, Yves | CCSD

International audience
