Optimal Online Data Partitioning for Geo-Distributed Machine Learning in Edge of Wireless Networks

Lyu, X; Ren, C; Ni, W; Tian, H; Liu, RP; Dutkiewicz, E

Publisher: IEEE-INST ELECTRICAL ELECTRONICS ENGINEERS INC
Publication Type: Journal Article
Citation: IEEE Journal on Selected Areas in Communications, 2019, 37, (10), pp. 2393-2406
Issue Date: 2019-10-01

Closed Access

Filename	Description	Size	Format
08793221.pdf	Published version	1.87 MB	Adobe PDF

This item is closed access and not available.

Full metadata record

Field	Value	Language
dc.contributor.author	Lyu, X https://orcid.org/0000-0003-0404-4310
dc.contributor.author	Ren, C
dc.contributor.author	Ni, W https://orcid.org/0000-0002-4933-594X
dc.contributor.author	Tian, H
dc.contributor.author	Liu, RP
dc.contributor.author	Dutkiewicz, E https://orcid.org/0000-0002-4268-9286
dc.date.accessioned	2020-05-06T06:50:38Z
dc.date.available	2020-05-06T06:50:38Z
dc.date.issued	2019-10-01
dc.identifier.citation	IEEE Journal on Selected Areas in Communications, 2019, 37, (10), pp. 2393-2406
dc.identifier.issn	0733-8716
dc.identifier.issn	1558-0008
dc.identifier.uri	http://hdl.handle.net/10453/140518
dc.description.abstract	© 1983-2012 IEEE. Enabling machine learning at the edge of wireless networks (such as edge cloud), close to mobile users, is critical for future wireless networks, but challenging since the lower layers in edge cloud are substantially different from existing machine learning configurations in the cloud. In such a geo-distributed computing environment, streaming data need to be evenly and cost-efficiently partitioned among different workers to produce an unbiased learning model with reduced parameter synchronization frequency. This paper presents a new online approach to optimally partitioning streaming data under time-varying network conditions. A new measure is proposed to quantify the evenness of data partitioning and restrain the optimization of data admission, partitioning, and processing. Stochastic gradient descent is applied to learn the optimal decisions online and asymptotically maximize the time-average utility of data partitioning. A new protocol is designed to further reduce the measurements of link costs, while preserving the asymptotic optimality, data evenness, and stability of the platform. Simulation results show that the proposed approach is superior to the state of the art in terms of throughput and cost efficiency, while only 24% of the links need to be measured to achieve asymptotic optimality.
dc.language	English
dc.publisher	IEEE-INST ELECTRICAL ELECTRONICS ENGINEERS INC
dc.relation.ispartof	IEEE Journal on Selected Areas in Communications
dc.relation.isbasedon	10.1109/JSAC.2019.2934002
dc.rights	© 2019 IEEE. Personal use of this material is permitted. Permission from IEEE must be obtained for all other uses, in any current or future media, including reprinting/republishing this material for advertising or promotional purposes, creating new collective works, for resale or redistribution to servers or lists, or reuse of any copyrighted component of this work in other works.	en_US
dc.rights	info:eu-repo/semantics/closedAccess
dc.subject	0805 Distributed Computing, 0906 Electrical and Electronic Engineering, 1005 Communications Technologies
dc.subject.classification	Networking & Telecommunications
dc.title	Optimal Online Data Partitioning for Geo-Distributed Machine Learning in Edge of Wireless Networks
dc.type	Journal Article
utslib.citation.volume	37
utslib.for	0805 Distributed Computing
utslib.for	0906 Electrical and Electronic Engineering
utslib.for	1005 Communications Technologies
pubs.organisational-group	/University of Technology Sydney/Faculty of Engineering and Information Technology
pubs.organisational-group	/University of Technology Sydney/Strength - GBDTC - Global Big Data Technologies
pubs.organisational-group	/University of Technology Sydney/Faculty of Engineering and Information Technology/School of Electrical and Data Engineering
pubs.organisational-group	/University of Technology Sydney
pubs.organisational-group	/University of Technology Sydney/Students
utslib.copyright.status	closed_access	*
pubs.consider-herdc	false
dc.date.updated	2020-05-06T06:50:36Z
pubs.issue	10
pubs.publication-status	Published
pubs.volume	37
utslib.start-page	2393
utslib.citation.issue	10

Please use this identifier to cite or link to this item:

http://hdl.handle.net/10453/140518