On-Board Deep Q-Network for UAV-Assisted Online Power Transfer and Data Collection

Li, K; Ni, W; Tovar, E; Jamalipour, A

On-Board Deep Q-Network for UAV-Assisted Online Power Transfer and Data Collection

Li, K Ni, W

Tovar, E Jamalipour, A

Permalink

Publisher:: Institute of Electrical and Electronics Engineers (IEEE)
Publication Type:: Journal Article
Citation:: IEEE Transactions on Vehicular Technology, 2019, 68, (12), pp. 12215-12226
Issue Date:: 2019-12-01

Closed Access

	Filename	Description	Size
	08854903.pdf	Published version	2.22 MB	Adobe PDF	View/Open

Copyright Clearance Process

Recently Added
In Progress
Closed Access

This item is closed access and not available.

Full metadata record

Field	Value	Language
dc.contributor.author	Li, K
dc.contributor.author	Ni, W https://orcid.org/0000-0002-4933-594X
dc.contributor.author	Tovar, E
dc.contributor.author	Jamalipour, A
dc.date.accessioned	2020-05-06T03:13:15Z
dc.date.available	2020-05-06T03:13:15Z
dc.date.issued	2019-12-01
dc.identifier.citation	IEEE Transactions on Vehicular Technology, 2019, 68, (12), pp. 12215-12226
dc.identifier.issn	0018-9545
dc.identifier.issn	1939-9359
dc.identifier.uri	http://hdl.handle.net/10453/140505
dc.description.abstract	© 1967-2012 IEEE. Unmanned Aerial Vehicles (UAVs) with Microwave Power Transfer (MPT) capability provide a practical means to deploy a large number of wireless powered sensing devices into areas with no access to persistent power supplies. The UAV can charge the sensing devices remotely and harvest their data. A key challenge is online MPT and data collection in the presence of on-board control of a UAV (e.g., patrolling velocity) for preventing battery drainage and data queue overflow of the devices, while up-To-date knowledge on battery level and data queue of the devices is not available at the UAV. In this paper, an on-board deep Q-network is developed to minimize the overall data packet loss of the sensing devices, by optimally deciding the device to be charged and interrogated for data collection, and the instantaneous patrolling velocity of the UAV. Specifically, we formulate a Markov Decision Process (MDP) with the states of battery level and data queue length of devices, channel conditions, and waypoints given the trajectory of the UAV; and solve it optimally with Q-learning. Furthermore, we propose the on-board deep Q-network that enlarges the state space of the MDP, and a deep reinforcement learning based scheduling algorithm that asymptotically derives the optimal solution online, even when the UAV has only outdated knowledge on the MDP states. Numerical results demonstrate that our deep reinforcement learning algorithm reduces the packet loss by at least 69.2%, as compared to existing non-learning greedy algorithms.
dc.language	en
dc.publisher	Institute of Electrical and Electronics Engineers (IEEE)
dc.relation.ispartof	IEEE Transactions on Vehicular Technology
dc.relation.isbasedon	10.1109/TVT.2019.2945037
dc.rights	© 2019 IEEE. Personal use of this material is permitted. Permission from IEEE must be obtained for all other uses, in any current or future media, including reprinting/republishing this material for advertising or promotional purposes, creating new collective works, for resale or redistribution to servers or lists, or reuse of any copyrighted component of this work in other works.	en_US
dc.rights	info:eu-repo/semantics/closedAccess
dc.subject	08 Information and Computing Sciences, 09 Engineering, 10 Technology
dc.subject.classification	Automobile Design & Engineering
dc.title	On-Board Deep Q-Network for UAV-Assisted Online Power Transfer and Data Collection
dc.type	Journal Article
utslib.citation.volume	68
utslib.for	08 Information and Computing Sciences
utslib.for	09 Engineering
utslib.for	10 Technology
pubs.organisational-group	/University of Technology Sydney/Faculty of Engineering and Information Technology
pubs.organisational-group	/University of Technology Sydney/Faculty of Engineering and Information Technology/School of Electrical and Data Engineering
pubs.organisational-group	/University of Technology Sydney
utslib.copyright.status	closed_access	*
pubs.consider-herdc	false
dc.date.updated	2020-05-06T03:13:14Z
pubs.issue	12
pubs.publication-status	Published
pubs.volume	68
utslib.start-page	12215
utslib.citation.issue	12

Abstract:

© 1967-2012 IEEE. Unmanned Aerial Vehicles (UAVs) with Microwave Power Transfer (MPT) capability provide a practical means to deploy a large number of wireless powered sensing devices into areas with no access to persistent power supplies. The UAV can charge the sensing devices remotely and harvest their data. A key challenge is online MPT and data collection in the presence of on-board control of a UAV (e.g., patrolling velocity) for preventing battery drainage and data queue overflow of the devices, while up-To-date knowledge on battery level and data queue of the devices is not available at the UAV. In this paper, an on-board deep Q-network is developed to minimize the overall data packet loss of the sensing devices, by optimally deciding the device to be charged and interrogated for data collection, and the instantaneous patrolling velocity of the UAV. Specifically, we formulate a Markov Decision Process (MDP) with the states of battery level and data queue length of devices, channel conditions, and waypoints given the trajectory of the UAV; and solve it optimally with Q-learning. Furthermore, we propose the on-board deep Q-network that enlarges the state space of the MDP, and a deep reinforcement learning based scheduling algorithm that asymptotically derives the optimal solution online, even when the UAV has only outdated knowledge on the MDP states. Numerical results demonstrate that our deep reinforcement learning algorithm reduces the packet loss by at least 69.2%, as compared to existing non-learning greedy algorithms.

Please use this identifier to cite or link to this item:

http://hdl.handle.net/10453/140505