A²: Extracting cyclic switchings from DOB-nets for rejecting excessive disturbances

Publisher:
Elsevier
Publication Type:
Journal Article
Citation:
Neurocomputing, 2020, 400, pp. 161-172
Issue Date:
2020-08-04
Abstract:
© 2020. Reinforcement Learning (RL) remains limited in practice by its poor explainability, which leads to insufficient trust from users, unsatisfactory interpretation for human intervention, and inadequate analysis for future improvement. This paper seeks to partially characterize the interplay between dynamical environments and a previously proposed Disturbance OBserver net (DOB-net). The DOB-net is trained via RL and offers optimal control for a set of Partially Observable Markov Decision Processes (POMDPs), whose transition functions are largely determined by the environments (excessive external disturbances). This paper proposes an Attention-based Abstraction (A²) approach to extract a finite-state automaton, referred to as a Key Moore Machine Network (KMMN), that captures the switching mechanisms the DOB-net exhibits in dealing with multiple such POMDPs. A² first quantizes the controlled platform by learning continuous-discrete interfaces; it then extracts the KMMN by finding the key hidden states and transitions that attract sufficient attention from the DOB-net. Within the resulting KMMN, three patterns of cyclic switchings between key hidden states are found, and saturated controls are shown to be synchronized with the unknown disturbances. Interestingly, the discovered switchings have previously appeared in control designs for often-saturated systems, and they are interpreted via an analogy to the discrete-event subsystem of hybrid control.
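The abstract's extraction pipeline (quantize continuous hidden states, then read off a Moore-machine-like automaton from the observed transitions) can be illustrated with a minimal sketch. This is not the paper's algorithm: the learned continuous-discrete interface and the attention-based pruning are replaced here by a plain k-means quantizer and a transition tally, and the function name `extract_automaton` and all parameters are hypothetical.

```python
import numpy as np

def extract_automaton(hidden_states, actions, n_clusters=4, seed=0):
    """Quantize continuous RNN hidden states into discrete modes and tally
    transitions between consecutive modes, yielding a small Moore-machine-like
    graph: each discrete state emits one (majority) action.

    hidden_states: (T, d) array of hidden vectors along one rollout.
    actions:       (T,) array of non-negative integer action labels.
    """
    rng = np.random.default_rng(seed)
    # Plain k-means quantization (a stand-in for the learned
    # continuous-discrete interface described in the abstract).
    centers = hidden_states[rng.choice(len(hidden_states), n_clusters, replace=False)]
    for _ in range(20):
        dists = ((hidden_states[:, None] - centers[None]) ** 2).sum(-1)
        labels = np.argmin(dists, axis=1)
        for k in range(n_clusters):
            pts = hidden_states[labels == k]
            if len(pts):
                centers[k] = pts.mean(axis=0)
    # Moore property: output depends only on the discrete state,
    # taken here as the majority action within each cluster.
    outputs = {k: int(np.bincount(actions[labels == k]).argmax())
               for k in range(n_clusters) if (labels == k).any()}
    # Transition counts between consecutive discrete states; the paper's
    # attention-based step would instead keep only "key" states/transitions.
    transitions = {}
    for a, b in zip(labels[:-1], labels[1:]):
        transitions[(int(a), int(b))] = transitions.get((int(a), int(b)), 0) + 1
    return labels, outputs, transitions
```

On a rollout whose hidden states form well-separated regimes, the self-loop counts dominate and the off-diagonal entries of `transitions` expose the switching pattern between modes, which is the kind of structure the KMMN is meant to capture.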