In this post on collaborative model training using Federated Learning, I will explain how a neural network can be trained without collecting the whole training dataset in a centralized location. Federated Learning is a special form of Machine Learning in which the model is trained at each participant’s own location, and only the outcome of that training (the trained model) needs to be sent to a centralized server for overall training.
Collaborative model training using Federated Learning helps to address two major concerns. First, the training data often resides in geographically diverse locations. Second, the users’ data needs to stay private.
Model Training at Local Devices Matters
Users are becoming more concerned about the privacy of their data. Consider the example of smart home users. The devices installed in the home may transfer data to a cloud server for analysis. However, not all users like the idea of uploading their personal data. If the local devices are powerful enough to perform the analysis themselves, they can send only the result of the analysis to the centralized cloud rather than the raw data.
In this way, the personal data of users remains on the local devices. In addition to better data security, this also reduces bandwidth consumption.
Concept of Federated Learning
Local devices such as smartphones create a local model from the data of a specific user. The model is trained only with the locally available data, and the trained model is obtained in the form of a weight matrix. This weight matrix is then transferred to the cloud server for building a global model.
In essence, Federated Learning enables a distributed learning environment in which the individual devices, working at the source of the data, participate in model building.
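To make the flow above concrete, here is a minimal sketch in plain Python. It is an illustration under stated assumptions, not a production implementation: the "model" is a two-parameter linear fit standing in for a neural network, the function names (local_train, federated_average, run_rounds, make_client_data) are hypothetical, and the server combines the local weights with a sample-size-weighted average (in the style of federated averaging), which is one common choice rather than something this post prescribes.

```python
import random

def local_train(weights, data, lr=0.05, epochs=200):
    """Train y = w*x + b on one device's private data with gradient descent.

    Only the resulting weights leave the device; `data` never does.
    """
    w, b = weights
    n = len(data)
    for _ in range(epochs):
        grad_w = sum(2 * (w * x + b - y) * x for x, y in data) / n
        grad_b = sum(2 * (w * x + b - y) for x, y in data) / n
        w -= lr * grad_w
        b -= lr * grad_b
    return [w, b]

def federated_average(client_weights, client_sizes):
    """Server step (assumed): average local weights, weighted by sample count."""
    total = sum(client_sizes)
    n_params = len(client_weights[0])
    return [
        sum(cw[i] * size for cw, size in zip(client_weights, client_sizes)) / total
        for i in range(n_params)
    ]

def run_rounds(client_datasets, rounds=5):
    """Repeat: send global weights out, train locally, aggregate on the server."""
    global_weights = [0.0, 0.0]
    sizes = [len(d) for d in client_datasets]
    for _ in range(rounds):
        updates = [local_train(global_weights, d) for d in client_datasets]
        global_weights = federated_average(updates, sizes)
    return global_weights

def make_client_data(rng, n):
    """Hypothetical private dataset: noisy samples of y = 2x + 1."""
    data = []
    for _ in range(n):
        x = rng.uniform(-1.0, 1.0)
        y = 2.0 * x + 1.0 + rng.uniform(-0.05, 0.05)
        data.append((x, y))
    return data

if __name__ == "__main__":
    rng = random.Random(0)
    clients = [make_client_data(rng, n) for n in (20, 30, 50)]
    print(run_rounds(clients))  # global [w, b], expected to land near [2, 1]
```

Note that the server only ever sees the weight lists returned by local_train. The raw (x, y) pairs stay inside each client, which is the property the concept relies on.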
Federated Learning has applications in a wide range of fields such as healthcare, finance, insurance, retail, smart anything, and agriculture.
Let us take the example of a patient’s medical records. Assume that the patient has undergone treatment at different medical institutions for different ailments, and that a machine learning model is required for predicting the health of the patient. Medical records are very sensitive data, and hospitals usually don’t share them in order to maintain privacy. Therefore, it is very difficult to create a comprehensive medical dataset. Federated Learning can be helpful in such situations, because the data does not need to be shared in order to create a machine learning prediction model.
Pros and Cons of Federated Learning
Pros of this approach
(1) Lower Energy Need. The devices participating in Federated Learning are often small and need less battery power.
(2) Security and Privacy of User Data. Federated Learning doesn’t require the raw data from the user’s location to be available on the cloud server. Therefore, the data remains at the users’ locations, and its privacy is maintained.
(3) Lower Bandwidth Requirement. Federated Learning requires only the trained local model to be transferred, not the raw data. Therefore, substantially less data is sent to the cloud server, and the bandwidth requirement is lower.
(4) No Need for 24x7 Internet Connectivity. Since the model training takes place on local devices, these devices need not be connected to the cloud server all the time. Hence, they will still work when there is no Internet connectivity.
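The bandwidth point in (3) can be checked with simple arithmetic. The sketch below compares uploading raw data once with uploading model weights over several training rounds. All figures (10,000 samples of 2 MB each, a model with 1,000,000 parameters stored as 4-byte floats, 50 rounds) are made-up assumptions for illustration, and the function names are hypothetical.

```python
def raw_upload_bytes(num_samples, bytes_per_sample):
    """Bytes sent if a device uploads all of its raw data once."""
    return num_samples * bytes_per_sample

def model_upload_bytes(num_parameters, rounds, bytes_per_parameter=4):
    """Bytes sent if a device uploads its full weight matrix every round."""
    return num_parameters * bytes_per_parameter * rounds

def saving_ratio(raw_bytes, model_bytes):
    """How many times smaller the model traffic is than the raw-data traffic."""
    return raw_bytes / model_bytes

if __name__ == "__main__":
    raw = raw_upload_bytes(num_samples=10_000, bytes_per_sample=2_000_000)
    model = model_upload_bytes(num_parameters=1_000_000, rounds=50)
    print(raw, model, saving_ratio(raw, model))
```

With these assumed numbers the raw upload is 20 GB and the model traffic is 200 MB, a factor of 100. The saving depends on the inputs: a very large model, a small local dataset, or many rounds can shrink the advantage or even reverse it.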
Cons of this approach
Although Federated Learning aims to maintain the privacy and security of user data, it is still prone to certain kinds of attacks. In particular, the training data is still vulnerable to inference attacks, because the approach requires the continuous transfer of model weights, and these weights may reveal the client’s information to an adversary. Another challenge is that the global model can be corrupted, for example by a participant that sends manipulated weights.
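To see how easily the global model can be corrupted, consider a small sketch in which the server takes a plain, unweighted mean of the clients’ weights. Both the aggregation rule and the numbers are assumptions made for illustration, and average_updates is a hypothetical helper.

```python
def average_updates(updates):
    """Plain mean of the weight lists sent by the clients (assumed server rule)."""
    n = len(updates)
    return [sum(u[i] for u in updates) / n for i in range(len(updates[0]))]

# Four honest clients report weights close to [2.0, 1.0].
honest = [[2.0, 1.0], [2.1, 0.9], [1.9, 1.1], [2.0, 1.0]]

# One malicious client reports deliberately extreme weights.
poisoned = [-100.0, 50.0]

clean_model = average_updates(honest)                   # close to [2.0, 1.0]
corrupted_model = average_updates(honest + [poisoned])  # close to [-18.4, 10.8]

if __name__ == "__main__":
    print(clean_model)
    print(corrupted_model)
```

A single manipulated update out of five is enough to pull the averaged model far away from what the honest clients agreed on, because a plain mean gives every participant the same influence regardless of how plausible its update is.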