Reproducing Deep Learning experiments: common challenges and recommendations for improvement

Open archive

Machicao, Jeaneth | Ben Abbes, Ali | Meneguzzi, Leandro | Corrêa, Pedro | Specht, Alison | David, Romain | Subsol, Gérard | Vellenich, Danton | Devillers, Rodolphe | Stall, Shelley | Chaumont, Marc | Mouquet, Nicolas | Mouillot, David | Berti-Équille, Laure

Published by CCSD

International Data Week (IDW) 2022 was hosted in Seoul, the Republic of Korea, by the Korea Institute of Science and Technology Information (KISTI), committed by the Ministry of Science and ICT, Seoul Metropolitan Government, National Library of Korea, and National Assembly Library, with the support of the Korea Research Institute of Standards and Science (KRISS), Sungkyunkwan University (SKKU), Korea Institute of Oriental Medicine, and the Korea Institute of Geoscience and Mineral Resources. This landmark event brought together data scientists, researchers, industry leaders, entrepreneurs, policymakers, and data stewards from disciplines across the globe to explore how best to exploit the data revolution to improve science and society through data-driven discovery and innovation. IDW 2022 combined the 19th RDA Plenary Meeting, the biannual meeting of this international member organization working to develop and support global infrastructure facilitating data sharing and reuse, and SciDataCon 2022, the scientific conference addressing the frontiers of data in research, organized by CODATA and WDS.

International audience.

One of the challenges in Machine Learning research is to ensure that the presented and published results are sound and reliable. Reproducibility is an important step to promote open and accessible research, thereby allowing the scientific community to quickly integrate new findings and convert ideas to practice. We have already walked this path of darkness: we proposed a set of recommendations ('fixes') to overcome the reproducibility challenges that a researcher may encounter, in order to improve Reproducibility and Replicability (R&R) and reduce the likelihood of wasted effort. These strategies can serve as a "Swiss Army knife" for moving from Deep Learning (DL) to more general areas, as they are organized around (i) the quality of the dataset (and associated metadata), (ii) the Deep Learning method, and (iii) the implementation and the infrastructure used. We identified the main challenges and constraints from these papers and presented them accordingly. Finally, with the lessons learned in the previous step, we propose a set of mitigation strategies to overcome the main reproducibility challenges and help researchers achieve their goals.

Suggestions

By the same author

Mitigation Strategies to Improve Reproducibility of Poverty Estimations From Remote Sensing Images Using Deep Learning

Open archive | Machicao, Jeaneth | CCSD

International audience. The challenges of Reproducibility and Replicability (R & R) in computer science experiments have become a focus of attention in the last decade, as efforts to adhere to good research practice...

Checklist Strategies to Improve the Reproducibility of Deep Learning Experiments with an Illustration

Open archive | Ben Abbes, Ali | CCSD

Poster to be presented during RDA 19th Plenary Meeting, Part Of International Data Week, 20–23 June 2022, Seoul, South Korea. International audience. We report a review of the reproducibility of three publications f...

A Deep-Learning Method for the Prediction of Socio-Economic Indicators from Street-View Imagery Using a Case Study from Brazil

Open archive | Machicao, Jeaneth | CCSD

International audience. Socioeconomic indicators are essential to help design and monitor the impact of public policies on society. Such indicators are usually obtained through census data collected at 10-year inter...
