Towards Capable and Ethical Text-based Game Agents
- Publication Type: Thesis
- Issue Date: 2025
This item is open access.
Language-based agents constitute an important direction in artificial intelligence, as they enable autonomous systems to perceive, reason, and act through natural language. Text-based games provide a rigorous evaluation setting for such agents, requiring semantic understanding, long-horizon decision-making, and exploration under partial observability. However, the combinatorial nature of free-form actions, sparse rewards, and the risk of unethical behaviour pose persistent challenges. This thesis aims to develop text-based game agents that are both capable and ethically aligned.
First, to address the difficulty of navigating vast action spaces, we propose a confidence-based self-imitation learning method that adapts a pretrained language model to generate compact, high-quality action candidates. By filtering actions through confidence-driven pruning, the approach markedly improves sample efficiency and task performance.
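The pruning step described above can be sketched roughly as follows. This is an illustrative reading only: the length-normalised confidence score, the threshold, and the top-k cut-off are assumptions for the sketch, not the thesis's exact criterion, and the token log-probabilities stand in for a pretrained language model's output.

```python
import math

def prune_actions(candidates, threshold=0.05, top_k=3):
    """Keep the top-k candidate actions whose confidence exceeds a
    threshold. `candidates` maps an action string to the token
    log-probs a language model assigned when generating it."""
    scored = []
    for action, token_logps in candidates.items():
        # Length-normalised confidence: exp(mean token log-prob),
        # so longer actions are not penalised for having more tokens.
        conf = math.exp(sum(token_logps) / len(token_logps))
        if conf >= threshold:
            scored.append((conf, action))
    scored.sort(reverse=True)
    return [action for _, action in scored[:top_k]]

# Toy log-probs standing in for a language model's scores.
candidates = {
    "open mailbox": [-0.2, -0.4],
    "take leaflet": [-0.5, -0.7],
    "eat mailbox":  [-3.0, -4.5],  # implausible, low confidence
    "go north":     [-0.3],
}
print(prune_actions(candidates))
```

The pruned list, rather than the full combinatorial action space, is what the agent would then explore, which is where the sample-efficiency gain comes from.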
Second, to strengthen long-horizon reasoning, we introduce an LLM-enhanced Monte Carlo planning algorithm that integrates semantic priors with in-trial and cross-trial memory. This design enables more reliable planning under sparse feedback and outperforms strong RL- and LLM-based baselines.
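One common way to fold a semantic prior into Monte Carlo tree search is a PUCT-style selection rule, sketched below. The prior values here are plain numbers standing in for an LLM's action preferences, and the exploration constant `c` is an illustrative choice; the thesis's actual integration of in-trial and cross-trial memory is not reproduced here.

```python
import math

def puct_score(child_value, child_visits, parent_visits, prior, c=1.5):
    """PUCT-style score: mean value (exploitation) plus a
    prior-weighted exploration bonus that decays with visits."""
    q = child_value / child_visits if child_visits else 0.0
    u = c * prior * math.sqrt(parent_visits) / (1 + child_visits)
    return q + u

# (total value, visit count) per action at the current node.
stats = {"open door": (3.0, 5), "go east": (0.0, 0)}
# Stand-in for an LLM prior over actions.
priors = {"open door": 0.2, "go east": 0.7}
parent_visits = sum(visits for _, visits in stats.values())
best = max(stats, key=lambda a: puct_score(*stats[a], parent_visits, priors[a]))
print(best)
```

Note how the unvisited but prior-favoured action wins selection: the prior steers early exploration, which is what makes planning workable when reward feedback is sparse.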
Third, we tackle the challenge of moral value alignment by developing a learning framework that interleaves task optimisation with morality-guided policy shaping. Through a soft mixture of task and moral policies, the agent reduces harmful actions while preserving competitive task effectiveness.
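One plausible reading of a "soft mixture" of task and moral policies is a convex combination of their action distributions, sketched below. The mixing weight `lam` and the toy probabilities are illustrative assumptions, not the thesis's actual shaping scheme.

```python
def mix_policies(task_probs, moral_probs, lam=0.5):
    """Blend a task policy with a morality policy as a convex
    combination, then renormalise. lam = 0 recovers the pure task
    policy; lam = 1 the pure moral policy."""
    actions = set(task_probs) | set(moral_probs)
    mixed = {a: (1 - lam) * task_probs.get(a, 0.0)
                + lam * moral_probs.get(a, 0.0)
             for a in actions}
    z = sum(mixed.values())
    return {a: p / z for a, p in mixed.items()}

# A task policy that favours a harmful shortcut, tempered by a
# moral policy that strongly prefers the benign alternative.
mixed = mix_policies({"steal key": 0.6, "ask for key": 0.4},
                     {"steal key": 0.05, "ask for key": 0.95})
print(mixed)
```

Because the harmful action's probability is damped rather than zeroed, the agent can still complete the task when no benign alternative exists, which is the trade-off the soft mixture is designed to preserve.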
Finally, we extend ethical alignment to personalised human preferences by proposing a human-in-the-loop mechanism that infers value judgements from minimal feedback and generalises them across new scenarios.
Together, these contributions demonstrate that language-based agents can be both capable and ethically aligned. The proposed methods advance action generation, planning, and value alignment, offering a principled foundation for trustworthy autonomous agents in text-based environments and beyond.