Migrating federated learning to centralized learning with the leverage of unlabeled data

Publisher:
Springer Nature
Publication Type:
Journal Article
Citation:
Knowledge and Information Systems, 2023, 65(9), pp. 3725-3752
Issue Date:
2023-09-01
File: Migrating federated learning.pdf (Published version, Adobe PDF, 5.64 MB)
Federated learning carries out cooperative training without sharing local data; the resulting global model generally performs better than independently trained local models. Because raw data never leave the clients, federated learning preserves the privacy of local users. However, the performance of the global model can degrade when clients hold non-IID training data, because the differing distributions of local data cause the weights of local models to diverge. In this paper, we introduce a novel teacher–student framework to alleviate the negative impact of non-IID data. On the one hand, we retain the privacy-preserving advantage of federated learning; on the other hand, we gain the accuracy advantage of centralized learning. We use global models as teachers to assign pseudo-labels to unlabeled data, generating a pseudo-labeled dataset that significantly improves the performance of the global model. At the same time, the global model, acting as a teacher, provides more accurate pseudo-labels. In addition, we perform a model rollback to mitigate the impact of latent noisy labels and data imbalance in the pseudo-labeled dataset. Extensive experiments verify that our teacher ensemble enables more robust training, and the empirical study shows that relying on centralized pseudo-labeled data renders the global model almost immune to non-IID data.
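
The abstract outlines the core mechanism: the global model obtained from federated training acts as a teacher that pseudo-labels a pool of unlabeled data, and the pseudo-labeled set then drives centralized training, with a rollback step guarding against noisy labels. Below is a minimal PyTorch sketch of those two steps under stated assumptions: the function names, the confidence threshold, and the caller-supplied train_step/evaluate callables are illustrative, not details taken from the paper.

    import copy
    import torch
    import torch.nn.functional as F
    from torch.utils.data import DataLoader, TensorDataset


    def build_pseudo_labeled_set(teacher: torch.nn.Module,
                                 unlabeled: DataLoader,
                                 threshold: float = 0.9) -> TensorDataset:
        """Label unlabeled samples with the global (teacher) model,
        keeping only predictions above a confidence threshold.
        `threshold` is an assumed cutoff, not a value from the paper."""
        teacher.eval()
        xs, ys = [], []
        with torch.no_grad():
            for (x,) in unlabeled:
                probs = F.softmax(teacher(x), dim=1)
                conf, labels = probs.max(dim=1)
                keep = conf >= threshold  # discard low-confidence pseudo-labels
                xs.append(x[keep])
                ys.append(labels[keep])
        return TensorDataset(torch.cat(xs), torch.cat(ys))


    def train_with_rollback(model, train_step, evaluate, epochs=5):
        """Roll the model back to its previous snapshot whenever an epoch
        on pseudo-labeled data lowers validation accuracy (a simplified
        reading of the paper's rollback idea)."""
        best_acc = evaluate(model)
        snapshot = copy.deepcopy(model.state_dict())
        for _ in range(epochs):
            train_step(model)  # one epoch on the pseudo-labeled dataset
            acc = evaluate(model)
            if acc < best_acc:
                model.load_state_dict(snapshot)  # undo the harmful update
            else:
                best_acc = acc
                snapshot = copy.deepcopy(model.state_dict())
        return model

Note that this sketch uses a single teacher for simplicity; the paper describes a teacher ensemble, so in practice predictions from multiple global models would be aggregated before the confidence filter is applied.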