Online Reinforcement Learning for Beam Tracking and Rate Adaptation in Millimeter-Wave Systems

Publisher: IEEE COMPUTER SOC
Publication Type: Journal Article
Citation: IEEE Transactions on Mobile Computing, 2024, 23, (2), pp. 1830-1845
Issue Date: 2024-02-01
In this article, we propose MAMBA, a restless multi-armed bandit framework for beam tracking in directional millimeter-wave (mmW) cellular systems. Instead of relying on explicit control messages, MAMBA uses the ACK/NACK packets that user equipment (UEs) transmit to the base station (BS) as part of the hybrid automatic repeat request (HARQ) procedure. These packets are used to measure the quality of the currently operating downlink beam and to select a new downlink beam, along with an appropriate modulation and coding scheme (MCS), for future transmissions. At its core, MAMBA implements an online reinforcement learning technique called adaptive Thompson sampling (ATS), which determines a good beam and associated MCS for upcoming transmissions. To evaluate MAMBA's performance, we conduct extensive simulations and over-the-air (OTA) experiments in the 28 GHz band using phased-array antennas. We study fixed-rate as well as adaptive-rate variants of MAMBA and compare them with four other beam tracking strategies: a beam selection scheme similar to the one used in 5G NR (called 'static oracle'), a theoretically optimal but practically infeasible beam tracking scheme (called 'dynamic oracle'), an ϵ-greedy algorithm (Mohamed 2021), and the Unimodal Beam Alignment (UBA) algorithm (Hashemi et al. 2018). Our results show that MAMBA achieves a 182% throughput gain over the 'static oracle' and comes reasonably close to the throughput of the 'dynamic oracle'. Compared to UBA, MAMBA achieves a 25-35% gain in throughput, depending on UE mobility. Finally, when operated at a fixed MCS, MAMBA/ATS achieves a 21% gain over the ϵ-greedy algorithm at the lowest applied MCS index and a 255% gain at the highest MCS index.
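The idea of choosing a beam from ACK/NACK feedback alone can be illustrated with a small Thompson sampling sketch for the fixed-MCS case. This is not the ATS algorithm from the article, whose details are not given in the abstract. It is a generic Beta-Bernoulli Thompson sampler in which an exponential discount (an assumption made here) stands in for adaptation to a changing channel. The class name, the discount value, and the per-beam ACK probabilities in the toy simulation are all hypothetical.

```python
import random


class DiscountedThompsonBeamSelector:
    """Beta-Bernoulli Thompson sampling over beams, driven by ACK/NACK.

    Hypothetical illustration only: the discounting rule is an assumed
    stand-in for adaptivity, not the ATS procedure from the article.
    """

    def __init__(self, num_beams, discount=0.99, seed=None):
        self.num_beams = num_beams
        self.discount = discount          # assumed forgetting factor in (0, 1]
        self.alpha = [1.0] * num_beams    # 1 + discounted ACK count per beam
        self.beta = [1.0] * num_beams     # 1 + discounted NACK count per beam
        self.rng = random.Random(seed)

    def select_beam(self):
        # Draw one ACK-probability sample per beam; use the largest draw.
        samples = [
            self.rng.betavariate(a, b) for a, b in zip(self.alpha, self.beta)
        ]
        return max(range(self.num_beams), key=samples.__getitem__)

    def update(self, beam, ack):
        # Shrink all evidence toward the uniform Beta(1, 1) prior so that
        # old observations gradually lose influence.
        for i in range(self.num_beams):
            self.alpha[i] = 1.0 + self.discount * (self.alpha[i] - 1.0)
            self.beta[i] = 1.0 + self.discount * (self.beta[i] - 1.0)
        # Add the new HARQ outcome for the beam that was actually used.
        if ack:
            self.alpha[beam] += 1.0
        else:
            self.beta[beam] += 1.0


def simulate(steps=2000, seed=0):
    """Toy channel: the best beam switches halfway (made-up probabilities)."""
    env_rng = random.Random(seed)
    selector = DiscountedThompsonBeamSelector(num_beams=4, seed=seed + 1)
    acks = 0
    for t in range(steps):
        if t < steps // 2:
            ack_prob = [0.9, 0.3, 0.2, 0.1]
        else:
            ack_prob = [0.1, 0.2, 0.3, 0.9]
        beam = selector.select_beam()
        ack = env_rng.random() < ack_prob[beam]
        selector.update(beam, ack)
        acks += int(ack)
    return acks / steps  # fraction of transmissions that were ACKed
```

Calling `simulate()` returns the fraction of ACKed transmissions in the toy scenario. An adaptive-rate variant could, under the same assumptions, treat each (beam, MCS) pair as an arm and weight the sampled ACK probability by the rate of that MCS.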