Clustered Federated Learning

Publication Type:
Thesis
Issue Date:
2023
Heterogeneous federated learning without any structural assumptions is challenging because the non-identical data distributions of clients conflict with one another. In practice, clients often form near-homogeneous clusters, so training one server-side model per cluster mitigates these conflicts; this setting is called clustered FL. Drawing on new insights and perspectives, we propose a unified bi-level optimization framework that covers clustered FL methodologies. Based on this framework, we present a fundamental method, Weighted Clustered Federated Learning (WeCFL), together with a novel theoretical framework for its convergence analysis. The analysis accounts for the clusterability among clients to measure the effect of intra-cluster non-IIDness, and a convergence rate of O(1/T) is established.

To make the clustering more robust, we propose Clustered FL with Contrastive Learning (CFL-CON), an add-on that can be integrated into our clustered FL framework as well as many other clustered FL methods; we present two variants operating in the representation space and the parameter space, respectively. To address the lack of knowledge sharing caused by robust clustering and to further improve performance, we propose another generic add-on, Clustered FL with Clustered Knowledge Sharing (CFL-CKS), and provide a theoretical analysis covering the simplification, convergence, and interpretation of the added term, giving a comprehensive understanding of the method. Furthermore, to bridge the trade-off between these two add-ons, we propose Clustered FL with Contrastive Learning and Clustered Knowledge Sharing (CFL-CON&CKS), which applies contrastive learning to the head of the neural network to keep cluster models apart, and knowledge sharing to the backbone so that clusters can still exchange common knowledge.

Lastly, to address clustering collapse and stabilize clustered FL, we propose Clustered Additive Modeling (CAM), which trains a globally shared model alongside the cluster-wise models. The global model captures the features shared by all clusters, so the cluster-wise models are forced to focus on the differences among clusters. We prove the asymptotic convergence rate. Experiments also demonstrate the superiority of our methods in terms of robustness, stability of clustering, effectiveness in mitigating clustering collapse, and overall performance. All methods are implemented with unified datasets, non-IID settings, models, optimizers, and baselines, as detailed in the appendix, to ensure consistency.
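To make the server-side structure described above concrete, the following minimal Python sketch shows one communication round in the spirit of WeCFL (cluster assignment followed by weighted per-cluster aggregation), with an optional CAM-style shared global model. It is an illustrative assumption about the update structure, not the thesis implementation; the names assign_cluster, clustered_fl_round, client_params, and client_weights are hypothetical.

    import numpy as np

    def assign_cluster(client_params, cluster_models):
        # Assign a client to the nearest cluster model in parameter space.
        dists = [np.linalg.norm(client_params - c) for c in cluster_models]
        return int(np.argmin(dists))

    def clustered_fl_round(cluster_models, client_params, client_weights, use_cam=True):
        # One server round: cluster assignment, then weighted per-cluster aggregation.
        # With use_cam=True, a shared global model absorbs the common structure
        # (CAM-style), so each cluster model is rebuilt from the residuals
        # (client - global) and focuses on cluster-specific differences.
        assignments = [assign_cluster(p, cluster_models) for p in client_params]
        total_w = float(sum(client_weights))
        global_model = sum(w * p for w, p in zip(client_weights, client_params)) / total_w
        new_clusters = []
        for k in range(len(cluster_models)):
            members = [i for i, a in enumerate(assignments) if a == k]
            if not members:
                new_clusters.append(cluster_models[k])  # empty cluster: keep stale model
                continue
            w_k = float(sum(client_weights[i] for i in members))
            if use_cam:
                resid = sum(client_weights[i] * (client_params[i] - global_model)
                            for i in members) / w_k
                new_clusters.append(global_model + resid)
            else:
                new_clusters.append(sum(client_weights[i] * client_params[i]
                                        for i in members) / w_k)
        return new_clusters, global_model, assignments

In a typical use of such a round, client_params would come from a few local training epochs on each client and client_weights would be the local dataset sizes, matching the weighted aggregation that gives WeCFL its name.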