Secure Extraction of Personal Information from EHR by Federated Machine Learning

Open archive

Azzouzi, Mohamed El | Bellafqira, Reda | Coatrieux, Gouenou | Cuggia, Marc | Bouzillé, Guillaume

Published by CCSD; IOS Press

International audience. Secure extraction of Personally Identifiable Information (PII) from Electronic Health Records (EHRs) presents significant privacy and security challenges. This study explores the application of Federated Learning (FL) to overcome these challenges within the context of French EHRs. Using a multilingual BERT model in an FL simulation involving 20 hospitals, each represented by a unique medical department or pole, we compared the performance of two setups: individual models, where each hospital uses only its own training and validation data without taking part in the FL process, and federated models, where multiple hospitals collaborate to train a global FL model. Our findings demonstrate that FL models not only preserve data confidentiality but also outperform the individual models. The global FL model achieved an F1 score of 75.7%, close to the 78.5% obtained with the centralized approach. This research underscores the potential of FL for extracting PII from EHRs and encourages its broader adoption in health data analysis.
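The abstract does not say which aggregation rule the global FL model uses. As a minimal sketch, assuming a FedAvg-style weighted average (an assumption, not a detail confirmed by this record), the server step that combines the hospitals' locally trained parameters could look like the following. The function name `federated_average`, the parameter names, and the toy values are hypothetical and purely illustrative.

```python
from typing import Dict, List

# A model is represented here as a mapping from parameter name to a flat
# list of floats. This is a simplification chosen for illustration only.
Weights = Dict[str, List[float]]


def federated_average(client_weights: List[Weights],
                      client_sizes: List[int]) -> Weights:
    """Combine client models into one global model (FedAvg-style sketch).

    Each client's parameters are weighted by its number of local
    training examples, so larger sites contribute proportionally more.
    """
    if not client_weights or len(client_weights) != len(client_sizes):
        raise ValueError("need one sample count per client model")
    total = sum(client_sizes)
    if total <= 0:
        raise ValueError("total number of training examples must be positive")

    global_weights: Weights = {}
    for name in client_weights[0]:
        length = len(client_weights[0][name])
        global_weights[name] = [
            sum(w[name][i] * n for w, n in zip(client_weights, client_sizes)) / total
            for i in range(length)
        ]
    return global_weights


# Toy example with two hypothetical hospitals; only parameters are shared,
# never the underlying records.
hospital_a = {"layer.weight": [1.0, 2.0], "layer.bias": [0.0]}
hospital_b = {"layer.weight": [3.0, 6.0], "layer.bias": [4.0]}
global_model = federated_average([hospital_a, hospital_b], [100, 300])
print(global_model)
```

In this sketch the raw EHR text never leaves a hospital: each site sends only its parameters and its example count, which is the property the abstract refers to as preserving data confidentiality.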

Suggestions

By the same author

Automatic de-identification of French electronic health records: a cost-effective approach exploiting distant supervision and deep learning models

Open archive | Azzouzi, Mohamed El | CCSD

International audience. Background: Electronic health records (EHRs) contain valuable information for clinical research; however, the sensitive nature of healthcare data presents security and confidentiality challen...

Ensuring GDPR Compliance and Security in a Clinical Data Warehouse: Challenges and Insights from a University Hospital (Preprint)

Open archive | Riou, Christine | CCSD

International audience. Background: The European Union's General Data Protection Regulation (GDPR) has profoundly influenced health data management, with significant implications for clinical data warehouses (CDWs)....

Sharing health big data for research - A design by use cases: the INSHARE platform approach

Open archive | Bouzillé, Guillaume | CCSD

International audience. Sharing and exploiting efficiently Health Big Data (HBD) lead to tackle great challenges: data protection and governance taking into account legal, ethical and deontological aspects which ena...
