Joint Flight Cruise Control and Data Collection in UAV-Aided Internet of Things: An Onboard Deep Reinforcement Learning Approach

Li, K; Ni, W; Tovar, E; Guizani, M

Publisher: Institute of Electrical and Electronics Engineers (IEEE)
Publication Type: Journal Article
Citation: IEEE Internet of Things Journal, 2021, 8, (12), pp. 9787-9799
Issue Date: 2021-06-15

Closed Access

Filename	Description	Size	Format
Joint_Flight_Cruise_Control_and_Data_Collection_in_UAV-Aided_Internet_of_Things_An_Onboard_Deep_Reinforcement_Learning_Approach.pdf	Published version	2.11 MB	Adobe PDF

This item is closed access and not available.

Full metadata record

Field	Value	Language
dc.contributor.author	Li, K
dc.contributor.author	Ni, W https://orcid.org/0000-0002-4933-594X
dc.contributor.author	Tovar, E
dc.contributor.author	Guizani, M
dc.date.accessioned	2021-09-13T23:04:14Z
dc.date.available	2021-09-13T23:04:14Z
dc.date.issued	2021-06-15
dc.identifier.citation	IEEE Internet of Things Journal, 2021, 8, (12), pp. 9787-9799
dc.identifier.issn	2327-4662
dc.identifier.uri	http://hdl.handle.net/10453/150517
dc.language	en
dc.publisher	Institute of Electrical and Electronics Engineers (IEEE)
dc.relation.ispartof	IEEE Internet of Things Journal
dc.relation.isbasedon	10.1109/JIOT.2020.3019186
dc.rights	info:eu-repo/semantics/closedAccess
dc.subject	0805 Distributed Computing, 1005 Communications Technologies
dc.title	Joint Flight Cruise Control and Data Collection in UAV-Aided Internet of Things: An Onboard Deep Reinforcement Learning Approach
dc.type	Journal Article
utslib.citation.volume	8
utslib.for	0805 Distributed Computing
utslib.for	1005 Communications Technologies
pubs.organisational-group	/University of Technology Sydney
pubs.organisational-group	/University of Technology Sydney/Faculty of Engineering and Information Technology
pubs.organisational-group	/University of Technology Sydney/Faculty of Engineering and Information Technology/School of Electrical and Data Engineering
utslib.copyright.status	closed_access	*
dc.date.updated	2021-09-13T23:04:13Z
pubs.issue	12
pubs.publication-status	Published
pubs.volume	8
utslib.citation.issue	12

Abstract:

Employing unmanned aerial vehicles (UAVs) as aerial data collectors in Internet-of-Things (IoT) networks is a promising technology for large-scale environment sensing. A key challenge in UAV-aided data collection is that UAV maneuvering gives rise to buffer overflows at the IoT nodes and to unsuccessful transmissions over lossy airborne channels. This article formulates the joint optimization of flight cruise control and data collection scheduling to minimize network data loss as a partially observable Markov decision process (POMDP), in which the states of individual IoT nodes can be obscure to the UAV. The problem can be solved optimally by reinforcement learning, but it suffers from the curse of dimensionality and rapidly becomes intractable as the number of IoT nodes grows. In practice, the POMDP of a UAV-aided IoT network has a large number of network states and actions, while up-to-date knowledge of them is not available at the UAV. We propose an onboard deep Q-network-based flight resource allocation scheme (DQN-FRAS) to optimize the online flight cruise control of the UAV and the data scheduling given outdated knowledge of the network states. Numerical results demonstrate that DQN-FRAS reduces packet loss by over 51% compared with existing nonlearning heuristics.
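The two loss sources named in the abstract (buffer overflow at a node and failed transmission over a lossy channel), together with the UAV's outdated view of node states, can be illustrated with a small toy model. The sketch below is not the article's system model or DQN-FRAS: the class `ToyCollectionEnv`, the function `train_q_learning`, and every parameter value are hypothetical, and plain tabular Q-learning over the outdated observations stands in for the deep Q-network. It also assumes a failed transmission simply drops the packet.

```python
import random


class ToyCollectionEnv:
    """Toy UAV data-collection model (illustrative assumption, not the paper's model)."""

    def __init__(self, num_nodes=3, buffer_size=4, arrival_prob=0.3,
                 success_prob=0.8, seed=0):
        self.num_nodes = num_nodes
        self.buffer_size = buffer_size
        self.arrival_prob = arrival_prob
        self.success_prob = success_prob
        self.rng = random.Random(seed)
        self.buffers = [0] * num_nodes    # true queue lengths, hidden from the UAV
        self.last_seen = [0] * num_nodes  # outdated queue lengths known to the UAV

    def observe(self):
        """Return the UAV's (possibly stale) view of all node buffers."""
        return tuple(self.last_seen)

    def step(self, node):
        """Serve one node for one slot and return the packets lost in that slot."""
        lost = 0
        # The selected node sends one packet over a lossy channel.
        if self.buffers[node] > 0:
            self.buffers[node] -= 1
            if self.rng.random() > self.success_prob:
                lost += 1  # unsuccessful transmission (assumed to drop the packet)
        # New arrivals at every node; a full buffer drops the arriving packet.
        for i in range(self.num_nodes):
            if self.rng.random() < self.arrival_prob:
                if self.buffers[i] < self.buffer_size:
                    self.buffers[i] += 1
                else:
                    lost += 1  # buffer overflow
        # Only the visited node's queue length is refreshed at the UAV.
        self.last_seen[node] = self.buffers[node]
        return lost


def train_q_learning(env, steps=20000, alpha=0.1, gamma=0.9, epsilon=0.1):
    """Tabular Q-learning on outdated observations; reward is minus the packets lost."""
    q = {}
    obs = env.observe()
    for _ in range(steps):
        values = q.setdefault(obs, [0.0] * env.num_nodes)
        if env.rng.random() < epsilon:
            action = env.rng.randrange(env.num_nodes)  # explore
        else:
            action = max(range(env.num_nodes), key=values.__getitem__)  # exploit
        lost = env.step(action)
        next_obs = env.observe()
        next_values = q.setdefault(next_obs, [0.0] * env.num_nodes)
        target = -lost + gamma * max(next_values)
        values[action] += alpha * (target - values[action])
        obs = next_obs
    return q


if __name__ == "__main__":
    table = train_q_learning(ToyCollectionEnv(seed=0))
    print(len(table), "observations visited")
```

The table grows with every distinct combination of observed buffer levels, so it scales exponentially in the number of nodes. This is the curse of dimensionality the abstract mentions, and the reason a function approximator such as a deep Q-network is attractive for larger networks.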

Please use this identifier to cite or link to this item:

http://hdl.handle.net/10453/150517