An Autonomous Learning-Based Algorithm for Joint Channel and Power Level Selection by D2D Pairs in Heterogeneous Cellular Networks

Asheralieva, A; Miyanaga, Y

Publisher: IEEE-INST ELECTRICAL ELECTRONICS ENGINEERS INC
Publication Type: Journal Article
Citation: IEEE Transactions on Communications, 2016, 64, (9), pp. 3996-4012
Issue Date: 2016-09-01

Closed Access

Filename	Description	Size	Format
An_Autonomous_Learning-Based_Algorithm_for_Joint_Channel_and_Power_Level_Selection_by_D2D_Pairs_in_Heterogeneous_Cellular_Networks.pdf	Published version	2.9 MB	Adobe PDF

This item is closed access and not available.

Full metadata record

Field	Value	Language
dc.contributor.author	Asheralieva, A
dc.contributor.author	Miyanaga, Y https://orcid.org/0000-0002-2795-2234
dc.date.accessioned	2022-08-21T22:20:00Z
dc.date.available	2022-08-21T22:20:00Z
dc.date.issued	2016-09-01
dc.identifier.citation	IEEE Transactions on Communications, 2016, 64, (9), pp. 3996-4012
dc.identifier.issn	0090-6778
dc.identifier.issn	1558-0857
dc.identifier.uri	http://hdl.handle.net/10453/160633
dc.language	English
dc.publisher	IEEE-INST ELECTRICAL ELECTRONICS ENGINEERS INC
dc.relation.ispartof	IEEE Transactions on Communications
dc.relation.isbasedon	10.1109/TCOMM.2016.2593468
dc.rights	info:eu-repo/semantics/closedAccess
dc.subject	0804 Data Format, 0906 Electrical and Electronic Engineering, 1005 Communications Technologies
dc.title	An Autonomous Learning-Based Algorithm for Joint Channel and Power Level Selection by D2D Pairs in Heterogeneous Cellular Networks
dc.type	Journal Article
utslib.citation.volume	64
utslib.for	0804 Data Format
utslib.for	0906 Electrical and Electronic Engineering
utslib.for	1005 Communications Technologies
pubs.organisational-group	/University of Technology Sydney
pubs.organisational-group	/University of Technology Sydney/Faculty of Engineering and Information Technology
pubs.organisational-group	/University of Technology Sydney/Faculty of Engineering and Information Technology/School of Electrical and Data Engineering
utslib.copyright.status	closed_access	*
dc.date.updated	2022-08-21T22:19:57Z
pubs.issue	9
pubs.publication-status	Published
pubs.volume	64
utslib.citation.issue	9

Abstract:

We study the problem of autonomous operation of the device-to-device (D2D) pairs in a heterogeneous cellular network with multiple base stations (BSs). The spectrum bands of the BSs (that may overlap with each other) comprise the sets of orthogonal wireless channels. We consider the following spectrum usage scenarios: 1) the D2D pairs transmit over the dedicated frequency bands and 2) the D2D pairs operate on the shared cellular/D2D channels. The goal of each device pair is to jointly select the wireless channel and power level to maximize its reward, defined as the difference between the achieved throughput and the cost of power consumption, constrained by its minimum tolerable signal-to-interference-plus-noise ratio requirements. We formulate this problem as a stochastic non-cooperative game with multiple players (D2D pairs) where each player becomes a learning agent whose task is to learn its best strategy (based on the locally observed information) and develop a fully autonomous multi-agent Q-learning algorithm converging to a mixed-strategy Nash equilibrium. The proposed learning method is implemented in a long term evolution-advanced network and evaluated via the OPNET-based simulations. The algorithm shows relatively fast convergence and near-optimal performance after a small number of iterations.
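
The reward and learning loop described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' algorithm: the abstract does not give the formulas, so the Shannon-rate throughput model, the zero reward when the SINR constraint is violated, the single-state (stateless) Q-update, the Boltzmann action selection, and every name and parameter below (`reward`, `D2DAgent`, `alpha`, `temperature`, `power_cost`) are assumptions made for illustration only.

```python
import math
import random


def reward(sinr, power, bandwidth, power_cost, sinr_min):
    """Throughput minus power cost, in normalized units.

    Assumption: a pair that misses its minimum SINR earns zero reward.
    Assumption: throughput follows the Shannon rate B * log2(1 + SINR).
    """
    if sinr < sinr_min:
        return 0.0
    throughput = bandwidth * math.log2(1.0 + sinr)
    return throughput - power_cost * power


class D2DAgent:
    """One D2D pair learning over joint (channel, power level) actions."""

    def __init__(self, channels, power_levels, alpha=0.1, temperature=1.0, seed=None):
        # Joint action space: every channel combined with every power level.
        self.actions = [(c, p) for c in channels for p in power_levels]
        self.q = {a: 0.0 for a in self.actions}
        self.alpha = alpha              # learning rate (hypothetical value)
        self.temperature = temperature  # softmax temperature (hypothetical value)
        self.rng = random.Random(seed)

    def policy(self):
        """Boltzmann (softmax) distribution over actions, i.e. a mixed strategy."""
        q_max = max(self.q.values())  # subtract the maximum for numerical stability
        weights = [math.exp((self.q[a] - q_max) / self.temperature) for a in self.actions]
        total = sum(weights)
        return [w / total for w in weights]

    def choose(self):
        """Sample one (channel, power level) action from the current mixed strategy."""
        return self.rng.choices(self.actions, weights=self.policy(), k=1)[0]

    def update(self, action, observed_reward):
        """Stateless Q-update using only the locally observed reward."""
        self.q[action] += self.alpha * (observed_reward - self.q[action])
```

In this sketch each pair would repeatedly call `choose()`, transmit, measure its own SINR, compute `reward(...)`, and call `update(...)`, without exchanging information with other pairs. The full stochastic-game formulation in the paper also involves states and a convergence argument, neither of which is modeled here.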

Please use this identifier to cite or link to this item:

http://hdl.handle.net/10453/160633