Empowerment-driven Policy Gradient Learning with Counterfactual Augmentation in Recommender Systems

Chen, X; Yao, L; Chang, X; Wang, S

Publisher: Institute of Electrical and Electronics Engineers (IEEE)
Publication Type: Conference Proceeding
Citation: Proceedings - IEEE International Conference on Data Mining, ICDM, 2022, 2022-November, pp. 885-890
Issue Date: 2022-01-01

Closed Access

Filename	Description	Size	Format
Empowerment-driven_Policy_Gradient_Learning_with_Counterfactual_Augmentation_in_Recommender_Systems.pdf	Published version	392.98 kB	Adobe PDF

This item is closed access and not available.

Full metadata record

Field	Value	Language
dc.contributor.author	Chen, X
dc.contributor.author	Yao, L
dc.contributor.author	Chang, X https://orcid.org/0000-0002-7778-8807
dc.contributor.author	Wang, S
dc.date	2022-11-28
dc.date.accessioned	2023-03-31T03:50:25Z
dc.date.available	2023-03-31T03:50:25Z
dc.date.issued	2022-01-01
dc.identifier.citation	Proceedings - IEEE International Conference on Data Mining, ICDM, 2022, 2022-November, pp. 885-890
dc.identifier.isbn	9781665450997
dc.identifier.issn	1550-4786
dc.identifier.uri	http://hdl.handle.net/10453/168920
dc.description.abstract	Recent literature has shown that deep reinforcement learning (DRL) is effective at capturing users' dynamic interests. However, training a DRL agent is challenging: because the environment in recommender systems (RS) is sparse, a DRL agent must divide its time between exploring informative user-item interaction trajectories and using existing trajectories for policy learning. This is known as the exploration-exploitation trade-off, and it affects recommendation performance significantly when the environment is sparse. Balancing exploration and exploitation is even more challenging in DRL-based RS, where the agent needs to explore informative trajectories deeply and exploit them efficiently. As a step toward addressing this issue, we design a novel empowerment-driven exploration method that increases the agent's capability to explore informative interaction trajectories in the sparse environment; these trajectories are further enriched via a counterfactual augmentation strategy for more efficient exploitation. Extensive experiments on four offline datasets and an online simulation platform demonstrate the superiority of our model over a set of existing state-of-the-art methods.
dc.language	en
dc.publisher	Institute of Electrical and Electronics Engineers (IEEE)
dc.relation.ispartof	Proceedings - IEEE International Conference on Data Mining, ICDM
dc.relation.ispartof	2022 IEEE International Conference on Data Mining (ICDM)
dc.relation.isbasedon	10.1109/ICDM54844.2022.00102
dc.rights	info:eu-repo/semantics/closedAccess
dc.title	Empowerment-driven Policy Gradient Learning with Counterfactual Augmentation in Recommender Systems
dc.type	Conference Proceeding
utslib.citation.volume	2022-November
pubs.organisational-group	/University of Technology Sydney
pubs.organisational-group	/University of Technology Sydney/Faculty of Engineering and Information Technology
pubs.organisational-group	/University of Technology Sydney/Strength - AAII - Australian Artificial Intelligence Institute
utslib.copyright.status	closed_access	*
dc.date.updated	2023-03-31T03:50:24Z
pubs.finish-date	2022-12-01
pubs.publication-status	Published
pubs.start-date	2022-11-28
pubs.volume	2022-November

Abstract:

Recent literature has shown that deep reinforcement learning (DRL) is effective at capturing users' dynamic interests. However, training a DRL agent is challenging: because the environment in recommender systems (RS) is sparse, a DRL agent must divide its time between exploring informative user-item interaction trajectories and using existing trajectories for policy learning. This is known as the exploration-exploitation trade-off, and it affects recommendation performance significantly when the environment is sparse. Balancing exploration and exploitation is even more challenging in DRL-based RS, where the agent needs to explore informative trajectories deeply and exploit them efficiently. As a step toward addressing this issue, we design a novel empowerment-driven exploration method that increases the agent's capability to explore informative interaction trajectories in the sparse environment; these trajectories are further enriched via a counterfactual augmentation strategy for more efficient exploitation. Extensive experiments on four offline datasets and an online simulation platform demonstrate the superiority of our model over a set of existing state-of-the-art methods.

Please use this identifier to cite or link to this item:

http://hdl.handle.net/10453/168920