Stockholm University

Research project Federated Health: A Nordic Federated Health Data Network

Electronic health records are full of important information about diagnoses and treatments of patients. Sharing this information in a federated health data network, across hospitals in the Nordic countries, will improve the quality of health care.

Unstructured textual data adds contextual information not found elsewhere in patient records. However, this type of information is the most difficult to exploit or share.

Sharing unstructured data such as clinical text presents significant technical challenges: the text contains personal information, and it is difficult to provide privacy guarantees for it. Data-driven AI models trained on such text may leak that personal information, so the clinical text must be de-identified and pseudonymised before the models are constructed. In addition to the privacy challenges, each country's electronic health record (EHR) data is written in its own language, which further complicates processing.
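As a rough illustration of the de-identification step described above, the following minimal sketch replaces a few kinds of identifiers with placeholder tags. It is not the project's method: the regular expressions, the `known_names` parameter and the placeholder labels are all assumptions made for this example, and a real system would typically rely on trained named entity recognition rather than hand-written patterns.

```python
import re

# Hypothetical patterns, for illustration only. Real clinical text needs
# far broader coverage (addresses, phone numbers, locations, and so on).
PATTERNS = {
    # Assumed shape of a personal identity number: 6 or 8 digits, a separator, 4 digits.
    "ID_NUMBER": re.compile(r"\b(?:\d{8}|\d{6})[-+]\d{4}\b"),
    # ISO-style dates such as 2023-05-01.
    "DATE": re.compile(r"\b\d{4}-\d{2}-\d{2}\b"),
}


def deidentify(text, known_names=()):
    """Replace matched identifiers and listed names with placeholder tags."""
    for label, pattern in PATTERNS.items():
        text = pattern.sub(f"<{label}>", text)
    for name in known_names:
        # re.escape guards against names that contain regex metacharacters.
        text = re.sub(rf"\b{re.escape(name)}\b", "<NAME>", text)
    return text


note = "Patient Anna Svensson (19850312-1234) was admitted on 2023-05-01."
print(deidentify(note, known_names=["Anna Svensson"]))
```

This sketch only masks identifiers. Pseudonymisation would go one step further and substitute realistic surrogate values for the placeholders, so that the text stays natural for model training.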

In this project, a federated health data network will be developed, geared towards secondary use of health data. The project utilizes distributed machine learning, specifically federated learning, to ensure data privacy and ownership. The federated health data network will be built upon privacy-preserving and secure distributed training of multilingual clinical language models in Norwegian, Swedish, Danish, Finnish and Estonian. Additionally, the solution incorporates a distributed ledger with smart contracts and blockchain technology, enhancing transparency and security.
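The core idea of federated learning can be sketched with a toy example: each site trains on its own records, and only model parameters leave the site. The sketch below uses federated averaging on a tiny linear model. This is an assumption made for illustration; the page does not state which aggregation scheme the project uses, and the function names and the model here are hypothetical stand-ins for the clinical language models.

```python
from typing import Dict, List, Tuple

Weights = List[float]
Example = Tuple[float, float]  # (input x, target y)


def local_update(weights: Weights, examples: List[Example], lr: float = 0.1) -> Weights:
    """One pass of stochastic gradient descent on y = w*x + b, run inside a site."""
    w, b = weights
    for x, y in examples:
        err = (w * x + b) - y
        w -= lr * err * x
        b -= lr * err
    return [w, b]


def federated_average(updates: List[Tuple[Weights, int]]) -> Weights:
    """Average site weights, weighting each site by its number of examples."""
    total = sum(n for _, n in updates)
    if total == 0:
        raise ValueError("no training examples reported by any site")
    dim = len(updates[0][0])
    return [sum(w[i] * n for w, n in updates) / total for i in range(dim)]


def run_round(global_weights: Weights, site_data: Dict[str, List[Example]]) -> Weights:
    """One federated round: raw records never leave the loop body for a site."""
    updates = []
    for records in site_data.values():
        local = local_update(list(global_weights), records)
        updates.append((local, len(records)))  # only weights and a count are shared
    return federated_average(updates)


# Hypothetical sites holding private data drawn from y = 2x.
sites = {
    "hospital_a": [(1.0, 2.0), (2.0, 4.0)],
    "hospital_b": [(0.5, 1.0)],
}
weights = [0.0, 0.0]
for _ in range(50):
    weights = run_round(weights, sites)
print(weights)
```

Sharing weights instead of records does not by itself guarantee privacy, which is why the project combines federated learning with de-identification and secure training.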

Two use cases will demonstrate the innovation potential of unstructured clinical text data: the detection of medical implants and the detection of adverse drug reactions (ADRs).

The project aims to advance the field of healthcare data sharing and analysis, unlock the potential of unstructured data in EHR systems and enable innovative advancements in healthcare while maintaining privacy and security.

Project description

Before an MRI (Magnetic Resonance Imaging) examination, it is important to know exactly which medical implants a patient has, including the model of each implant. Otherwise the implants can cause serious harm to the patient, or even death.

Today, establishing whether a patient has a medical implant, and which model, requires reading through the entire patient record.

The AI model to be developed in the project is intended to reduce the time needed to find existing implants and to increase the number of devices that can be detected. Likewise, automatic analysis of clinical text can identify new and serious ADRs more rapidly.

The solution includes a pipeline for handling clinical text in multiple Nordic languages, privacy-preserving distributed training using federated learning, and the integration of a distributed ledger for enhanced transparency and security.

The end product will be a privacy-preserving virtual data warehouse with different user levels for data analysis and research purposes. The implementation involves leveraging large EHR datasets, existing natural language processing tools, and a national primary healthcare infrastructure.

The project is led by the Norwegian Centre for E-health Research, Tromsø, Norway, in collaboration with the following partners:

  • University of Turku, Finland
  • Stockholm University, Sweden
  • County Council of Östergötland/Linköping University Hospital, Sweden
  • DNV, Norway
  • University of Copenhagen, Denmark
  • University of Tartu, Estonia
  • Omilon, Denmark
  • Cambio, Sweden

Project members

Project managers

Hercules Dalianis

Professor

Department of Computer and Systems Sciences

Members

Thomas Vakili

PhD student

Department of Computer and Systems Sciences

Tyr Hullmann

Project assistant

Department of Computer and Systems Sciences

Publications