Multi-Center Federated Learning to Cluster Clients with non-IID data

Publication Type: Thesis
Issue Date: 2020
Federated learning (FL) is a machine learning paradigm in which many clients collaboratively train a model without uploading their local data to a server. Non-IID data across clients is a significant challenge for FL systems, because the distributed machine learning framework that FL inherits was designed for the scenario in which data are IID across clients. Clustered FL is a family of FL methods that addresses the non-IID challenge by clustering clients. However, even clustered FL methods still face some problems, such as instability in the presence of client-wise outliers and degraded model performance under model poisoning attacks. To address these challenges, the main research objective of this thesis is to study how FL can effectively and seamlessly handle non-IID data across clients in the horizontal client partition setting. This objective is studied from four coherently linked perspectives: (I) how to make FL address the non-IID distribution of data across clients in an effective and scalable manner, so that it can be applied to real-world cases involving thousands of clients and various types of devices; (II) how to make clustered FL methods more robust to client-wise outliers; (III) how to better balance the performance of global models against the extent of personalisation of local models; and (IV) how to make FL training more robust to model poisoning attacks using density methods. This thesis proposes a novel FL framework that combines a robust clustering algorithm with measures to secure the models, in order to tackle client-wise outliers as well as model poisoning in the FL system. Specifically, we develop a robust federated aggregation operator using a bootstrap median-of-means mechanism, which attains a higher breakdown point and can therefore tolerate a larger proportion of outliers.
Experiments on three benchmark datasets demonstrate the effectiveness of the proposed method, which outperforms the baseline methods on the evaluation criteria.
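To illustrate why a median-of-means aggregator tolerates outlying client updates better than plain averaging, the following is a minimal sketch only. It is not the thesis's implementation: the function names, the group count, the number of bootstrap rounds, and the choice to combine bootstrap rounds with a coordinate-wise median are all assumptions made for illustration, and client updates are represented as plain lists of floats.

```python
import random
import statistics


def median_of_means(updates, num_groups, rng):
    """Coordinate-wise median of group means (illustrative helper)."""
    shuffled = updates[:]
    rng.shuffle(shuffled)
    # Split the shuffled updates into roughly equal groups, dropping empty ones.
    groups = [shuffled[i::num_groups] for i in range(num_groups)]
    groups = [g for g in groups if g]
    dim = len(updates[0])
    group_means = [
        [sum(vec[d] for vec in g) / len(g) for d in range(dim)]
        for g in groups
    ]
    return [statistics.median(m[d] for m in group_means) for d in range(dim)]


def bootstrap_mom_aggregate(updates, num_groups=5, num_rounds=21, seed=0):
    """Assumed scheme: median over bootstrap rounds of median-of-means."""
    rng = random.Random(seed)
    dim = len(updates[0])
    estimates = []
    for _ in range(num_rounds):
        # Resample the client updates with replacement (bootstrap).
        sample = [rng.choice(updates) for _ in updates]
        estimates.append(median_of_means(sample, num_groups, rng))
    return [statistics.median(e[d] for e in estimates) for d in range(dim)]


def mean_aggregate(updates):
    """Plain averaging, shown only for comparison."""
    dim = len(updates[0])
    return [sum(vec[d] for vec in updates) / len(updates) for d in range(dim)]


# Nine well-behaved clients near (1, -1) and one outlying (e.g. poisoned) client.
updates = [[1.0 + 0.01 * i, -1.0 - 0.01 * i] for i in range(9)]
updates.append([1000.0, 1000.0])

print("mean:", mean_aggregate(updates))
print("bootstrap median-of-means:", bootstrap_mom_aggregate(updates))
```

In this toy setting the single outlier drags the plain mean far away from the honest clients, whereas the median over group means ignores it as long as fewer than half of the groups are contaminated.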