FAIR data pipeline: provenance-driven data management for traceable scientific workflows

Open archive

Mitchell, Sonia Natalie | Lahiff, Andrew | Cummings, Nathan | Hollocombe, Jonathan | Boskamp, Bram | Field, Ryan | Reddyhoff, Dennis | Zarebski, Kristian | Wilson, Antony | Viola, Bruno | Burke, Martin | Archibald, Blair | Bessell, Paul | Blackwell, Richard | Boden, Lisa | Brett, Alys | Brett, Sam | Dundas, Ruth | Enright, Jessica | Gonzalez-Beltran, Alejandra | Harris, Claire | Hinder, Ian | David Hughes, Christopher | Knight, Martin | Mano, Vino | Mcmonagle, Ciaran | Mellor, Dominic | Mohr, Sibylle | Marion, Glenn | Matthews, Louise | Mckendrick, Iain | Mark Pooley, Christopher | Porphyre, Thibaud | Reeves, Aaron | Townsend, Edward | Turner, Robert | Walton, Jeremy | Reeve, Richard

Published by CCSD; The Royal Society

International audience. Modern epidemiological analyses to understand and combat the spread of disease depend critically on access to, and use of, data. Rapidly evolving data, such as data streams changing during a disease outbreak, are particularly challenging. Data management is further complicated by data being imprecisely identified when used. Public trust in policy decisions resulting from such analyses is easily damaged and is often low, with cynicism arising where claims of ‘following the science’ are made without accompanying evidence. Tracing the provenance of such decisions back through open software to primary data would clarify this evidence, enhancing the transparency of the decision-making process. Here, we demonstrate a Findable, Accessible, Interoperable and Reusable (FAIR) data pipeline. Although developed during the COVID-19 pandemic, it allows easy annotation of any data as they are consumed by analyses, or conversely traces the provenance of scientific outputs back through the analytical or modelling source code to primary data. Such a tool provides a mechanism for the public, and fellow scientists, to better assess scientific evidence by inspecting its provenance, while allowing scientists to support policymakers in openly justifying their decisions. We believe that such tools should be promoted for use across all areas of policy-facing research. This article is part of the theme issue ‘Technical challenges of modelling real-life epidemics and examples of overcoming these’.

Suggestions

By the same author

E. coli O157 on Scottish cattle farms: Evidence of local spread and persistence using repeat cross-sectional data

Open archive | Herbert, Liam | CCSD

International audience. Background: Escherichia coli (E. coli) O157 is a virulent zoonotic strain of enterohaemorrhagic E. coli. In Scotland (1998-2008) the annual reported rate of human infection is 4.4 per...

Uptake of Diagnostic Tests by Livestock Farmers: A Stochastic Game Theory Approach

Open archive | Mohr, Sibylle | CCSD

International audience

Disentangling the roles of human mobility and deprivation on the transmission dynamics of COVID-19 using a spatially explicit simulation model

Open archive | Banks, Christopher | CCSD

Summary: Restrictions on mobility are a key component of infectious disease controls, preventing the spread of infections to as yet unexposed areas, or to regions which have previously eliminated outbreaks. However, even under the ...
