Dynamic Sample Selection for Federated Learning with Heterogeneous Data in Fog Computing

Cai, L; Lin, D; Zhang, J; Yu, S

Publisher: IEEE
Publication Type: Conference Proceeding
Citation: IEEE International Conference on Communications, 2020, 2020-June, pp. 1-6
Issue Date: 2020-06-01

Closed Access

Filename	Description	Size
09148586.pdf	Published version	153 kB

This item is closed access and not available.

Full metadata record

Field	Value	Language
dc.contributor.author	Cai, L
dc.contributor.author	Lin, D
dc.contributor.author	Zhang, J
dc.contributor.author	Yu, S https://orcid.org/0000-0003-4485-6743
dc.date	2020-06-07
dc.date.accessioned	2021-04-14T20:52:25Z
dc.date.available	2021-04-14T20:52:25Z
dc.date.issued	2020-06-01
dc.identifier.citation	IEEE International Conference on Communications, 2020, 2020-June, pp. 1-6
dc.identifier.isbn	9781728150895
dc.identifier.issn	1550-3607
dc.identifier.uri	http://hdl.handle.net/10453/148111
dc.description.abstract	Federated learning is a state-of-the-art technique used in fog computing that allows distributed learning to train on cross-device data while achieving efficient performance. Many current works have optimized the federated learning algorithm in homogeneous networks. However, in real distributed learning scenarios, data is generated independently by each device, and this data from different sources has different distribution characteristics. Therefore, the data used by each device for local learning is unbalanced and non-IID, and this heterogeneity degrades the performance of federated learning and slows down convergence. In this paper, we present a dynamic sample selection optimization algorithm, FedSS, to tackle heterogeneous data in federated learning. FedSS dynamically selects the training sample size during the gradient iteration based on the locally available data size, to address the expensive evaluation of the local objective function on massive datasets. We theoretically analyze the convergence and present complexity estimates for our framework when learning from large, unbalanced data. Our experimental results show that dynamic sampling can effectively improve the convergence speed with heterogeneous data and keep computational costs low while achieving the desired accuracy.
dc.language	en
dc.publisher	IEEE
dc.relation.ispartof	IEEE International Conference on Communications
dc.relation.ispartof	International Conference on Communications
dc.relation.isbasedon	10.1109/ICC40277.2020.9148586
dc.rights	info:eu-repo/semantics/closedAccess
dc.title	Dynamic Sample Selection for Federated Learning with Heterogeneous Data in Fog Computing
dc.type	Conference Proceeding
utslib.citation.volume	2020-June
utslib.location.activity	Dublin, Ireland
pubs.organisational-group	/University of Technology Sydney
pubs.organisational-group	/University of Technology Sydney/Faculty of Engineering and Information Technology
pubs.organisational-group	/University of Technology Sydney/Faculty of Engineering and Information Technology/School of Computer Science
utslib.copyright.status	closed_access	*
pubs.consider-herdc	false
dc.date.updated	2021-04-14T20:52:24Z
pubs.finish-date	2020-06-11
pubs.place-of-publication	Piscataway, USA
pubs.publication-status	Published
pubs.start-date	2020-06-07
pubs.volume	2020-June
dc.location	Piscataway, USA

Abstract:

Federated learning is a state-of-the-art technique used in fog computing that allows distributed learning to train on cross-device data while achieving efficient performance. Many current works have optimized the federated learning algorithm in homogeneous networks. However, in real distributed learning scenarios, data is generated independently by each device, and this data from different sources has different distribution characteristics. Therefore, the data used by each device for local learning is unbalanced and non-IID, and this heterogeneity degrades the performance of federated learning and slows down convergence. In this paper, we present a dynamic sample selection optimization algorithm, FedSS, to tackle heterogeneous data in federated learning. FedSS dynamically selects the training sample size during the gradient iteration based on the locally available data size, to address the expensive evaluation of the local objective function on massive datasets. We theoretically analyze the convergence and present complexity estimates for our framework when learning from large, unbalanced data. Our experimental results show that dynamic sampling can effectively improve the convergence speed with heterogeneous data and keep computational costs low while achieving the desired accuracy.
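
The abstract describes choosing the training sample size at each gradient iteration according to how much data a device holds locally. The sketch below illustrates that general idea only; it is not the FedSS selection rule, which this record does not reproduce. The geometric growth schedule, the function names (dynamic_sample_size, local_update, federated_round), the 1-D least-squares loss, and the size-weighted averaging step are all assumptions chosen for illustration.

```python
import math
import random

def dynamic_sample_size(local_size, iteration, initial=8, growth=1.5):
    """Hypothetical schedule (an assumption, not from the paper): the sample
    size grows geometrically and is capped by the locally available data size."""
    return min(local_size, math.ceil(initial * growth ** iteration))

def local_update(w, data, iterations, lr, rng):
    """Gradient steps on the 1-D least-squares loss mean((w*x - y)^2) / 2,
    drawing a differently sized sample at every iteration."""
    for t in range(iterations):
        size = dynamic_sample_size(len(data), t)
        batch = rng.sample(data, size)  # sample without replacement
        grad = sum((w * x - y) * x for x, y in batch) / size
        w -= lr * grad
    return w

def federated_round(w, clients, iterations=5, lr=0.1, rng=None):
    """One round: each client updates locally, then the server averages the
    results weighted by local data size (assumed aggregation rule)."""
    rng = rng or random.Random(0)
    total = sum(len(d) for d in clients)
    updates = [local_update(w, d, iterations, lr, rng) for d in clients]
    return sum(len(d) * u for d, u in zip(clients, updates)) / total

# Two clients with unbalanced, differently distributed data; both follow y = 3x.
gen = random.Random(42)
small = [(x, 3 * x) for x in (gen.uniform(0, 1) for _ in range(5))]
large = [(x, 3 * x) for x in (gen.uniform(1, 2) for _ in range(200))]

w = 0.0
for _ in range(20):
    w = federated_round(w, [small, large])
print(w)  # approaches the true slope 3 on this noise-free toy problem
```

In this toy setup the small client always uses all 5 of its samples, while the large client starts from 8 samples and uses more as the local iterations proceed, so early gradient evaluations stay cheap. How the sample size should actually be chosen, and the convergence guarantees that follow, are the subject of the paper itself.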

Please use this identifier to cite or link to this item:

http://hdl.handle.net/10453/148111