How to decentralise healthcare AI
The Federated Learning approach to building AI models across institutions
AI has been in use for some time, but it has never been put to mainstream use quite like it is now. With AI adoption accelerating significantly across industries, questions regarding data availability, ownership, and privacy emerge. Alongside these questions, solutions to train AI models without compromising sensitive information are moving to the forefront.
A decentralised approach to artificial intelligence
One approach to collaborative machine learning is Federated Learning (FL)*, in which a model is trained across multiple decentralised devices or servers (also called clients), each holding its own local data samples. Instead of centralising data in one location, the model is sent to each client, where it learns from the local data; the resulting model updates are then aggregated to improve the global model. Because raw data remains with the individual clients, this process supports privacy preservation and offers an efficient way to train AI models without pooling sensitive information.
Figure: Three clients interacting in an FL scenario
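The training loop described above can be sketched in a few lines of Python. This is a minimal illustration, not a production setup: it assumes a sample-weighted averaging rule (in the style of federated averaging), a toy linear model y = w * x + b, and three synthetic clients. All function names (local_update, federated_averaging, make_client) and all data are hypothetical and exist only for this example.

```python
import random


def local_update(weights, data, lr=0.1, epochs=5):
    """Run a few gradient-descent steps on one client's private data.

    The model is y = w * x + b, with weights stored as [w, b].
    """
    w, b = weights
    n = len(data)
    for _ in range(epochs):
        # Gradients of the mean squared error over the local samples.
        grad_w = sum(2 * (w * x + b - y) * x for x, y in data) / n
        grad_b = sum(2 * (w * x + b - y) for x, y in data) / n
        w -= lr * grad_w
        b -= lr * grad_b
    return [w, b]


def federated_averaging(clients, rounds=100):
    """Train a global model; only weights leave a client, never raw data."""
    global_weights = [0.0, 0.0]
    total = sum(len(data) for data in clients)
    for _ in range(rounds):
        # Each client starts from the current global model and trains locally.
        local_weights = [local_update(global_weights, data) for data in clients]
        # The server averages the returned weights, weighted by sample count.
        global_weights = [
            sum(len(data) / total * lw[i] for data, lw in zip(clients, local_weights))
            for i in range(2)
        ]
    return global_weights


def make_client(n, rng, true_w=2.0, true_b=-1.0):
    """Create synthetic (x, y) pairs standing in for one site's local data."""
    data = []
    for _ in range(n):
        x = rng.uniform(-1.0, 1.0)
        y = true_w * x + true_b + rng.gauss(0.0, 0.01)
        data.append((x, y))
    return data


rng = random.Random(0)
# Three clients with different dataset sizes (assumed values).
clients = [make_client(n, rng) for n in (40, 120, 300)]
w, b = federated_averaging(clients)
print(round(w, 2), round(b, 2))
```

Weighting each client's contribution by its sample count is one common design choice; real federations typically add secure communication, client selection, and handling of heterogeneous data, none of which are shown here.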
Federated Learning has a wide range of potential applications across various sectors, including financial services, smart devices, telecommunications, manufacturing, retail, energy, and mobility. Still, an area regarded as particularly promising and critical is the application of Federated Learning in the healthcare domain [1].
The healthcare data governance odyssey
Recent years have seen an escalating volume of healthcare data generated through electronic health records (EHRs), wearable devices, biomedical research, and medical imaging. The global market for big data analytics in healthcare was valued at $29.7 billion in 2022 and is projected to reach $134.9 billion by 2032, growing at a CAGR of 16.7% [2].
However, healthcare data poses several key challenges for analytics: privacy preservation; the siloed distribution of health data across institutions and devices; the dynamic, often multi-modal, and continuously evolving nature of the data; and the lack of diverse datasets.
To date, the vast majority of AI tools available in healthcare, for tasks such as computer-aided diagnosis, image analysis and interpretation, automated screening, and risk assessment, have been trained on centralised data. This approach is limited, however, by privacy concerns and regulations, as well as by the inefficiency of transferring large volumes of data.
Note for investors from our expert:
“While the specific concepts behind FL are rather straightforward to understand, the results that can be achieved implementing a federation are highly dependent on a long list of technical aspects, including data availability and heterogeneity among the different client sites, infrastructural specifications, difficulty of the task(s) at hand to be solved by the final global model, aspects of continual/on-line learning where applicable, only to name a few. As such, expert assessment of FL is highly recommended to assess the viability of any related project.”
Healing healthcare AI
Federated Learning circumvents these limitations by enabling models to be trained on data from various sources without sharing raw patient information, which removes the need to centralise sensitive data. Because models are trained on diverse datasets from different healthcare providers, Federated Learning can also improve model generalisation: the model is more likely to perform well on new, unseen data, contributing to better diagnostic and predictive capabilities. Furthermore, FL reduces the need to transfer data to a central location, minimising bandwidth requirements and the associated costs. This makes it a more cost-effective solution, especially when dealing with large health datasets. Finally, Federated Learning models can be updated in real time as new information becomes available across different devices, helping the model stay current and relevant.
This could enable substantially more accurate and robust AI tools that are much less susceptible to differences in data characteristics between medical institutions.
In the study by Dayan et al. (2021), each participating hospital saw improved performance with the global federated learning model (green) compared to the model trained only on local data (blue). The performance boost was especially dramatic for hospitals with smaller datasets [3].
The readily available magic cure?
While startups like Apheris, Arkhn, Scaleout, Bitfount and FLock, along with heavyweight companies such as Intel and Nvidia, are actively exploring the application of federated approaches to train machine learning models in healthcare, significant commercial breakthroughs are yet to be achieved.
Privacy concerns and regulatory issues remain unresolved. Although data in a federation is never directly shared, it still needs to be accessible at client sites for model training. Using clinical data and patient records for research or commercial purposes requires regulatory clearance, which limits the development of AI tools in healthcare. The traditional reluctance to adopt big data analytics solutions also persists, especially when they are not on-premises solutions.
Infrastructural challenges also emerge, as Federated Learning demands at least a workstation capable of machine learning training at each client site, with access to local data. Heterogeneity among participating devices, including varying hardware capabilities, data distributions, and network conditions, may present additional difficulties.
Cyber security is crucial to gaining user trust and achieving widespread adoption. Robust and provably secure frameworks are essential to safeguard against potential cyber-attacks [4].
Ownership issues over the global model may arise. At the conclusion of Federated Learning training, a global model becomes available, and determining ownership among federation participants and defining allowed applications is not straightforward.
Making healthcare data digitally available remains an ongoing effort, with the goal of digitising patient histories from the pre-EHR era and supporting the standardisation process by transforming static images into machine-readable text.
Winner(s) yet to be picked
Implementing Federated Learning raises challenges beyond the technology itself, especially in large-scale and regulated industries. Companies that rely heavily on legacy systems may struggle with integration and need significant updates to their infrastructure and technology stacks. Industries with stringent regulatory environments, such as finance and healthcare, may also find it difficult to ensure that Federated Learning systems comply with complex regulations.
*Although the basic idea of federated learning remains the same, there are various types of FL that can be distinguished based on the specific scenario [5].
Deepsense is the expert network for science and technology.
We have experts from leading institutions in AI and FL. Reach out to us to schedule expert calls and materials reviews to get support with technical assessments of your next opportunity.
[4] Nguyen, Truc, and My T. Thai. “Preserving privacy and security in federated learning.” IEEE/ACM Transactions on Networking (2023).
[5] Zhang, Chen, et al. “A survey on federated learning.” Knowledge-Based Systems 216 (2021): 106775.