In this post on Privacy and Security Mechanisms for Federated Learning, I will discuss some of the mechanisms for achieving privacy in the Federated Learning approach. Although these issues are still active topics of research, a few techniques are emerging that we can use to preserve the privacy of user data.
Why Are Privacy Mechanisms Required for Federated Learning?
Although Federated Learning enables model building in a distributed way that protects the privacy of user data, it still needs to employ certain privacy mechanisms. It remains possible for a malicious attacker to extract the original information from the shared model updates in a number of ways, as outlined below.
- Model poisoning, where adversaries corrupt the machine learning model through their own devices or by taking control of other users' devices.
- An adversary may compromise the centralized server that aggregates the partial models.
- The algorithm that builds the model may itself have certain weaknesses.
Given these points, it is advisable that Federated Learning also employ certain privacy mechanisms.
Privacy and Security Mechanisms for Federated Learning
Now we discuss some of the mechanisms that can be used to ensure the privacy of user data when building a machine learning model.
In general, Federated Learning aims to preserve the privacy of user data by training the machine learning model on local devices. As a result, users' raw data is never transferred to the cloud server; only the resulting model weights are sent. However, there is still a chance that an attacker gains knowledge about the user data from those updates.
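To make this concrete, here is a minimal sketch of one such training round in plain Python. The one-parameter model, the `local_step` helper, the learning rate, and the client data are all made up for illustration; real systems train full networks with dedicated frameworks.

```python
import random

def local_step(w, data, lr=0.01):
    """One pass of gradient descent on a 1-D least-squares model y ~ w*x.
    Only the resulting weight leaves the device, never `data`."""
    for x, y in data:
        grad = 2 * (w * x - y) * x
        w -= lr * grad
    return w

def federated_round(global_w, client_datasets):
    # Each client trains locally and sends back only its weight.
    local_weights = [local_step(global_w, d) for d in client_datasets]
    # The server aggregates by simple averaging.
    return sum(local_weights) / len(local_weights)

random.seed(0)
# Hypothetical clients whose private data follows y = 3x plus noise.
clients = [[(x, 3 * x + random.gauss(0, 0.1)) for x in range(1, 5)]
           for _ in range(3)]
w = 0.0
for _ in range(20):
    w = federated_round(w, clients)
print(w)  # converges close to the true slope 3
```

Note that the server only ever sees the three local weights, never any `(x, y)` pair.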
Differential Privacy
To prevent such attacks, several means can be adopted, and differential privacy is one of them. Basically, this technique adds calibrated noise to the model update before the local device transfers it to the server, so that no individual user's data can be reliably inferred from it.
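One common variant of this idea is to clip each client's update to a fixed norm and then add Gaussian noise before upload. The sketch below illustrates only the mechanics; the clip norm and noise scale are arbitrary illustrative values, not recommendations, and real deployments calibrate them to a privacy budget.

```python
import math
import random

def dp_sanitize(update, clip_norm=1.0, noise_std=0.1, rng=random):
    """Clip the update vector to `clip_norm`, then add Gaussian noise,
    so any single user's contribution is bounded and masked."""
    norm = math.sqrt(sum(u * u for u in update))
    scale = min(1.0, clip_norm / norm) if norm > 0 else 1.0
    clipped = [u * scale for u in update]
    return [u + rng.gauss(0.0, noise_std) for u in clipped]

random.seed(42)
raw_update = [0.8, -2.4, 1.6]   # hypothetical local gradient
noisy = dp_sanitize(raw_update)
print(noisy)
```

The server aggregates many such noisy updates; the noise largely cancels in the average while still hiding each individual contribution.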
Homomorphic Encryption
When the client devices build their machine learning models using local data, the result is a local model. This local model is then encrypted before it leaves the device.
Essentially, the encrypted gradient updates from the devices are transferred to the server for aggregation. The server performs its computations directly on the encrypted data to determine the updated global model, which is then sent back to each local device.
Secure Multiparty Computation (SMC)
As we have seen, users are concerned about the privacy of their data and may not want to share it even to perform a computation. It is possible to maintain such privacy by performing the computations at the local devices, and a protocol called Secure Multiparty Computation (SMC) makes this possible.
Basically, SMC allows cryptographic computations to be performed at the local devices, so the users need not share their data. This technique is especially useful when building a machine learning model requires participants from multiple organizations who are unwilling to share their private data. The computation is performed in a distributed manner and the results are combined later.
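The split-and-combine idea can be illustrated with additive secret sharing, one of the simplest building blocks of SMC. The field size, party count, and secret values below are arbitrary choices for the sketch.

```python
import random

PRIME = 2**61 - 1   # field size for the shares (an arbitrary large prime)

def share(secret, n_parties, rng=random):
    """Split `secret` into n random shares that sum to it mod PRIME."""
    shares = [rng.randrange(PRIME) for _ in range(n_parties - 1)]
    shares.append((secret - sum(shares)) % PRIME)
    return shares

def reconstruct(shares):
    return sum(shares) % PRIME

random.seed(1)
# Three users secret-share their private values among three parties.
secrets = [10, 20, 12]
all_shares = [share(s, 3) for s in secrets]

# Party i sums the i-th share of every user: no party ever sees a raw
# value, yet the combined partial sums reconstruct the total.
party_sums = [sum(col) % PRIME for col in zip(*all_shares)]
print(reconstruct(party_sums))  # 42, the sum of the private values
```

Each share on its own is a uniformly random field element, so a single party learns nothing about any user's value; only the final combination reveals the aggregate.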
However, SMC has a drawback: it requires performing complex cryptographic computations at the local devices. This adds computation as well as communication costs, making the whole process slower and more expensive. Therefore, this technique may be most appropriate for building simpler models with less data.
This article on Privacy and Security Mechanisms for Federated Learning highlights the need for privacy mechanisms in the Federated Learning setting, since it still poses certain security vulnerabilities. Further, the article discusses some of the popular privacy mechanisms.