The increasing adoption of technology and AI in business continues to drive concerns regarding sensitive data and the protection of assets. Organizations must implement tools to protect data while also leveraging that data to identify new use cases for AI that can help the business achieve its goals. I’m Ronald van Loon, an industry analyst and an Intel Ambassador, and I’ve been closely examining how these challenges are unfolding.
In response to this complex situation, vendors are proactively developing innovative and effective security solutions embedded into both their software and hardware products. These solutions help ensure that organizations can continue their innovation and AI adoption without risking data privacy or a security breach.
Artificial intelligence is improved by training on vast sets of data, which typically means centralizing and sharing those data sets in a single location. This becomes a concern, however, when the training involves sensitive data, regulated data, or data sets that are too large to move.
Intel is once again out front, pioneering a new machine learning approach to address these issues and those yet to come. Federated learning (FL) is a unique, distributed machine learning (ML) approach designed to enable collaboration while reducing the risk of compromising ML algorithms or sensitive data, and without requiring the relocation of large sets of data.
This approach securely connects multiple datasets and systems, removing the barriers that prevent data from being aggregated for analysis and addressing the security concerns of modern technology and cloud storage from the outset. Because central aggregation is no longer needed, data can continue to live within the provenance of its owners. Federated learning can help industries like retail, healthcare, manufacturing, and financial services drive secure data analysis so that organizations can benefit from all of the valuable insights that data holds. FL also goes a step further with OpenFL, an open-source framework through which a trained AI/ML model can be both productized and deployed for making predictions.
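The core mechanic described above, training where the data lives and aggregating only model updates, can be sketched with federated averaging. The toy below is an illustrative Python/NumPy sketch using hypothetical client datasets and a simple logistic model; it is not Intel's OpenFL implementation.

```python
import numpy as np

def local_update(weights, X, y, lr=0.1, epochs=5):
    """One client's training round: gradient steps on data that
    never leaves the client's premises."""
    w = weights.copy()
    for _ in range(epochs):
        preds = 1.0 / (1.0 + np.exp(-X @ w))   # logistic predictions
        grad = X.T @ (preds - y) / len(y)      # gradient of log-loss
        w -= lr * grad
    return w

def federated_average(client_weights, client_sizes):
    """Server step: aggregate model updates weighted by dataset size.
    Only model weights travel; raw data stays with its owner."""
    total = sum(client_sizes)
    return sum(w * (n / total) for w, n in zip(client_weights, client_sizes))

# Two hypothetical institutions with private, local datasets
rng = np.random.default_rng(0)
clients = [(rng.normal(size=(40, 3)), rng.integers(0, 2, 40).astype(float))
           for _ in range(2)]
global_w = np.zeros(3)

# Each round: broadcast the global model, train locally, aggregate updates
for _ in range(10):
    updates = [local_update(global_w, X, y) for X, y in clients]
    global_w = federated_average(updates, [len(y) for _, y in clients])
```

Production frameworks like OpenFL add what this sketch omits: secure transport, participant authentication, and enforcement of federation rules.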
The Use of Federated Learning
In 2018, Intel and Penn Medicine presented a preliminary study on federated learning in the medical imaging industry. The study showed that FL could train a model to more than 99% of the accuracy of a model trained with traditional, centralized AI modeling. Over the years, the project has continued to demonstrate the benefits of FL in healthcare:
- Increased the number of participating research and healthcare organizations sevenfold.
- Secured and leveraged the largest dataset of glioblastoma patients (6,314 patients and 5 TB of data).
- Increased accuracy in brain tumor detection by 33% compared to models trained with public datasets.
- As much as 4.48x lower latency and 2.29x lower memory utilization compared to the first consensus model (utilizing the Intel® Distribution of OpenVINO™ toolkit to optimize the model).
Many elements had to be combined to create these results, including the four pillars that were essential to success:
- Intel® Software Guard Extensions (Intel® SGX)
- OpenFL framework
- Gramine (an open-source library OS)
- Intel® Distribution of OpenVINO™ toolkit
These components work together to enforce federation rules, protect data, simplify implementation, and optimize AI models. You can read the full case study, which was also published in Nature, for a more detailed review and analysis.
The results from this study were achieved by using a decentralized system to process high volumes of data. Combining Intel federated learning technology and Intel SGX removed barriers, addressed data privacy concerns, and advanced the use cases for AI in healthcare, gains that can be extrapolated to industries like financial services, retail, and manufacturing.
Federated Learning in Financial Services
Financial institutions and financial services organizations face data privacy concerns as pressing as healthcare's, if not more so. The enduring need to protect people's financial information and prevent illegal or illicit financial activity remains a challenge as technology and AI are adopted across financial services, online banking, and other transactions.
According to the United Nations Office on Drugs and Crime, 2% to 5% of global GDP is laundered each year, equivalent to trillions of dollars. This is largely due to ineffective AML/CFT (anti-money laundering and countering the financing of terrorism) systems and to concerns and complications with information sharing. Today, financial institutions largely operate as islands: existing systems don't allow or encourage information sharing or collective learning, creating barriers to identifying fraud and to reducing compliance issues and regulatory risks.
Federated learning’s ML-driven model allows the algorithm to find and analyze data sets across institutions without actually moving or sharing the data. This overcomes the security concerns and the current information silos that exist and leverages federated learning and federated analytics to enable financial institutions and financial services organizations to manage and mitigate risks. It delivers a more effective, efficient, and sustainable solution that preserves accuracy and privacy.
Federated learning reduces errors, cutting false positive rates from around 95% today to as little as 12%, allowing organizations to lower costs, prioritize their efforts, and mitigate risks more effectively. It also preserves the privacy of consumer and user data while still detecting, addressing, and preventing criminal activity in the system. The result is a more effective system, because information, insights, and risks are shared across the industry.
Integrating Federated Learning with Privacy and Security
Federated learning does a lot to enable dynamic collaboration and data analysis, making it easier for organizations to leverage data without compromising privacy or security. However, the approach cannot do this alone. Intel has worked to create hardware-rooted technologies that facilitate federated learning and ensure a trusted environment exists to protect the integrity and confidentiality of data sets and code. Through Intel SGX, intellectual property is protected as it executes in various, potentially untrusted silos, as are the privacy and confidentiality of the data the AI model operates on, assets potentially worth millions of dollars.
Intel SGX is a hardware-based trusted execution environment (TEE) featured in Intel Xeon processors.
It is designed to protect against snooping on, or modification of, data and code in the TEE. This effectively minimizes the trust boundary, reducing the attack surface available to an adversary. It can protect against software attacks and attacks on memory content, and it also offers hardware-based attestation, which measures and verifies code and data signatures, increasing confidence in the integrity of the data and the model itself.
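The measure-and-verify idea behind attestation can be illustrated in miniature. Real Intel SGX attestation involves signed hardware reports and a verification service; the sketch below only shows the underlying principle, comparing a cryptographic measurement of an artifact against an expected value, using hypothetical artifacts.

```python
import hashlib

def measure(artifact: bytes) -> str:
    """Produce a measurement (cryptographic hash) of code or data."""
    return hashlib.sha256(artifact).hexdigest()

def verify(artifact: bytes, expected_measurement: str) -> bool:
    """A relying party compares the reported measurement against the
    value it expects before trusting the environment."""
    return measure(artifact) == expected_measurement

model_code = b"def predict(x): return model(x)"  # hypothetical artifact
expected = measure(model_code)                   # recorded at build time

assert verify(model_code, expected)              # untampered: passes
assert not verify(model_code + b"#", expected)   # any change is detected
```

Because any single-bit change to the code or data yields a different measurement, a verifier can detect tampering before allowing the federation to proceed.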
The Use of OpenFL to Leverage Data with Federated Learning
OpenFL is a Python 3-based open-source framework specifically designed for federated learning. It is a scalable, user-friendly, secure tool that data scientists can use to leverage data for their organization without weakening security. And with the most recent release, OpenFL v1.5, you can run it on the Intel SGX framework to maximize the trusted environment of the hardware and software being accessed. The newest version includes a Privacy Meter, vertical FL, differential privacy, model compression, and Habana Gaudi accelerator support (note: Gaudi doesn't support Intel SGX).
OpenFL allows organizations to train an AI model without having to share or risk the compromise of sensitive data. This platform also addresses many concerns that AI model developers have, including:
- Protection of intellectual property
- Use of TEEs for secure, controlled system interactions
- Data and model confidentiality
- Computation integrity and accuracy
- Enablement of attestation
Federated learning addresses many of the issues surrounding data sharing. However, organizations need the right tools, like OpenFL, to help deliver powerful data insights without compromising, or worrying about, the security of the information being analyzed.
Federated learning is a revolutionary machine learning approach, pioneered by Intel, that is poised to enable industries like healthcare, financial services, manufacturing, and retail to securely gather valuable insights from their most sensitive data.
It’s estimated that the AI industry will be worth as much as $15.7 trillion globally by 2030. A study from Deloitte also found that 79% of those surveyed have deployed or plan to deploy three or more types of AI. AI adoption is happening at an increasingly rapid pace, but it also needs to be done with data security in mind, which is where federated learning makes its mark.
Check out Intel for more information on federated learning and how you can use it to leverage your data insights, scale your AI integrations, and more.
By Ronald van Loon