Efficient Federated Learning for AIoT Applications Using Knowledge Distillation
- Publisher: Institute of Electrical and Electronics Engineers (IEEE)
- Publication Type: Journal Article
- Citation: IEEE Internet of Things Journal, 2022, PP, (99), pp. 1-1
- Issue Date: 2022-01-01
Closed Access
Filename | Description | Size |
---|---|---|
Efficient_Federated_Learning_for_AIoT_Applications_Using_Knowledge_Distillation.pdf | Published version | 1.26 MB |
This item is closed access and not available.
As a promising distributed machine learning paradigm, Federated Learning (FL) trains a central model on decentralized data without compromising user privacy, which has led to its wide use in Artificial Intelligence Internet of Things (AIoT) applications. However, traditional FL suffers from model inaccuracy because it trains local models using only the hard labels of the data, ignoring the useful information carried by incorrect predictions with small probabilities. Although various solutions try to tackle this bottleneck, most of them introduce significant communication overhead, making large-scale deployment on AIoT devices a great challenge. To address this problem, this paper presents a novel Distillation-based Federated Learning (DFL) method that enables efficient and accurate FL for AIoT applications. Using Knowledge Distillation (KD), in each round of FL training our approach uploads both the soft targets and the local model gradients to the cloud server for aggregation; the aggregation results are then dispatched to the AIoT devices for the next round of local training. During DFL local training, the model predictions are fit not only to the hard labels but also to the soft targets, which can improve model accuracy by leveraging the knowledge the soft targets carry. To further improve DFL model performance, we design a strategy that dynamically adjusts the loss function weights to tune the ratio of KD to FL, which can maximize the synergy between soft targets and hard labels. Comprehensive experimental results on well-known benchmarks show that our approach can significantly improve the model accuracy of FL without introducing significant communication overhead.
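The local training objective described in the abstract, a hard-label term combined with a soft-target term under a weight that changes during training, can be illustrated with a minimal Python sketch. This is not the paper's implementation: the function names, the temperature value, the KL-divergence form of the soft-target term, and the linear weight schedule in `kd_weight` are all assumptions made for illustration, since the abstract does not specify them.

```python
import math


def softmax(logits, temperature=1.0):
    """Numerically stable softmax over a list of logits."""
    scaled = [z / temperature for z in logits]
    peak = max(scaled)
    exps = [math.exp(z - peak) for z in scaled]
    total = sum(exps)
    return [e / total for e in exps]


def hard_label_loss(logits, label):
    """Cross-entropy between the prediction and the one-hot hard label."""
    return -math.log(softmax(logits)[label])


def soft_target_loss(logits, soft_target, temperature=2.0):
    """KL divergence from the soft target to the tempered prediction.

    Assumption: the abstract only says predictions "approximate" the soft
    targets; KL divergence is a common choice, not confirmed by the source.
    """
    pred = softmax(logits, temperature)
    return sum(p * math.log(p / q) for p, q in zip(soft_target, pred) if p > 0)


def kd_weight(round_idx, total_rounds, max_weight=0.5):
    """Hypothetical schedule: ramp the KD weight linearly across rounds."""
    return max_weight * min(1.0, round_idx / max(1, total_rounds))


def dfl_local_loss(logits, label, soft_target, round_idx, total_rounds):
    """Weighted sum of the hard-label term and the soft-target term."""
    alpha = kd_weight(round_idx, total_rounds)
    hard = hard_label_loss(logits, label)
    soft = soft_target_loss(logits, soft_target)
    return (1.0 - alpha) * hard + alpha * soft
```

Under these assumptions, the loss reduces to plain cross-entropy in round 0 and gives the soft targets more influence as training proceeds; any other schedule could be substituted in `kd_weight` without touching the rest of the sketch.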