Chronos: Accelerating Federated Learning with Resource Aware Training Volume Tuning at Network Edges

Publisher:
Institute of Electrical and Electronics Engineers (IEEE)
Publication Type:
Journal Article
Citation:
IEEE Transactions on Vehicular Technology, vol. PP, no. 99, pp. 1-15, 2022
Issue Date:
2022-01-01
Due to limited resources and data privacy concerns, the last decade has witnessed the rapid development of Distributed Machine Learning (DML) at network edges. Among existing DML paradigms, Federated Learning (FL) is a promising one, since in FL each client trains its local model without sharing its raw data with others. A community of clients with a common interest can join together to derive a high-performance model by periodically synchronizing the parameters of their local models with the help of a coordination server. However, FL encounters the straggler problem at network edges, which makes the synchronization among clients inefficient and slows down the convergence of the learning process. To alleviate the straggler problem, this paper proposes Chronos, a method that accelerates FL with training volume tuning. More specifically, Chronos is a resource-aware method that adaptively adjusts the amount of data each client uses for training (i.e., the training volume) in each iteration, so as to eliminate the synchronization waiting time caused by heterogeneous and dynamic computing and communication resources. In addition, we theoretically analyze the convergence of Chronos in a non-convex setting and feed the results back into the algorithm design to guarantee convergence. Extensive experiments show that, compared with the benchmark algorithms (i.e., BSP and SSP), Chronos improves convergence speed by up to 6.4×.
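The core idea, rescaling each client's per-round training volume to its measured speed so that all clients finish a synchronization round at roughly the same time, can be illustrated with a minimal sketch. The names (Client, tune_volumes) and the proportional rescaling rule below are illustrative assumptions, not the algorithm published in the paper.

```python
import random

# Hypothetical sketch of resource-aware training-volume tuning in the
# spirit of Chronos. The proportional update rule is an assumption for
# illustration, not the paper's published algorithm.

class Client:
    def __init__(self, name, speed):
        self.name = name
        self.speed = speed     # nominal samples processed per second
        self.volume = 512      # training volume: samples per round

    def run_round(self):
        # Simulate heterogeneous, time-varying compute/communication speed.
        effective_speed = self.speed * random.uniform(0.7, 1.3)
        return self.volume / effective_speed  # wall-clock time this round


def tune_volumes(clients, times, target):
    """Rescale each client's training volume so its next round is
    expected to finish near the common target duration, shrinking the
    synchronization waiting time imposed by stragglers."""
    for c, t in zip(clients, times):
        c.volume = max(1, int(c.volume * target / t))


clients = [Client("edge-a", 400), Client("edge-b", 150), Client("edge-c", 60)]
target = 1.0  # desired seconds per synchronization round (assumed knob)

for rnd in range(5):
    times = [c.run_round() for c in clients]
    waiting = max(times) - min(times)  # fast clients idle for this long
    print(f"round {rnd}: waiting {waiting:.2f}s, "
          f"volumes {[c.volume for c in clients]}")
    tune_volumes(clients, times, target)
```

Running the sketch shows the waiting-time gap shrinking over rounds as each client's volume converges toward what its resources can process within the target duration.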