Towards Capable and Ethical Text-based Game Agents
- Publication Type: Thesis
- Issue Date: 2025
This item is open access.
Language-based agents constitute an important direction in artificial intelligence, as they enable autonomous systems to perceive, reason, and act through natural language. Text-based games provide a rigorous evaluation setting for such agents, requiring semantic understanding, long-horizon decision-making, and exploration under partial observability. However, the combinatorial nature of free-form actions, sparse rewards, and the risk of unethical behaviour pose persistent challenges. This thesis aims to develop text-based game agents that are both capable and ethically aligned.
First, to address the difficulty of navigating vast action spaces, we propose a confidence-based self-imitation learning method that adapts a pretrained language model to generate compact, high-quality action candidates. By filtering actions through confidence-driven pruning, the approach markedly improves sample efficiency and task performance.
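The pruning step described above can be sketched roughly as follows. This is an illustrative reading only: the length-normalised confidence score, the threshold, and the top-k cut-off are assumptions for the sketch, not the thesis's exact criterion, and the token log-probabilities stand in for a pretrained language model's output.

```python
import math

def prune_actions(candidates, threshold=0.05, top_k=3):
    """Keep the top-k candidate actions whose confidence exceeds a
    threshold. `candidates` maps an action string to the token
    log-probs a language model assigned when generating it."""
    scored = []
    for action, token_logps in candidates.items():
        # Length-normalised confidence: exp(mean token log-prob),
        # so longer actions are not penalised for having more tokens.
        conf = math.exp(sum(token_logps) / len(token_logps))
        if conf >= threshold:
            scored.append((conf, action))
    scored.sort(reverse=True)
    return [action for _, action in scored[:top_k]]

# Toy log-probs standing in for a language model's scores.
candidates = {
    "open mailbox": [-0.2, -0.4],
    "take leaflet": [-0.5, -0.7],
    "eat mailbox":  [-3.0, -4.5],  # implausible, low confidence
    "go north":     [-0.3],
}
print(prune_actions(candidates))
```

The pruned list, rather than the full combinatorial action space, is what the agent would then explore, which is where the sample-efficiency gain comes from.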
Second, to strengthen long-horizon reasoning, we introduce an LLM-enhanced Monte Carlo planning algorithm that integrates semantic priors with in-trial and cross-trial memory. This design enables more reliable planning under sparse feedback and outperforms strong RL- and LLM-based baselines.
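One common way to fold a semantic prior into Monte Carlo tree search is a PUCT-style selection rule, sketched below. The prior values here are plain numbers standing in for an LLM's action preferences, and the exploration constant `c` is an illustrative choice; the thesis's actual integration of in-trial and cross-trial memory is not reproduced here.

```python
import math

def puct_score(child_value, child_visits, parent_visits, prior, c=1.5):
    """PUCT-style score: mean value (exploitation) plus a
    prior-weighted exploration bonus that decays with visits."""
    q = child_value / child_visits if child_visits else 0.0
    u = c * prior * math.sqrt(parent_visits) / (1 + child_visits)
    return q + u

# (total value, visit count) per action at the current node.
stats = {"open door": (3.0, 5), "go east": (0.0, 0)}
# Stand-in for an LLM prior over actions.
priors = {"open door": 0.2, "go east": 0.7}
parent_visits = sum(visits for _, visits in stats.values())
best = max(stats, key=lambda a: puct_score(*stats[a], parent_visits, priors[a]))
print(best)
```

Note how the unvisited but prior-favoured action wins selection: the prior steers early exploration, which is what makes planning workable when reward feedback is sparse.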
Third, we tackle the challenge of moral value alignment by developing a learning framework that interleaves task optimisation with morality-guided policy shaping. Through a soft mixture of task and moral policies, the agent reduces harmful actions while preserving competitive task effectiveness.
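One plausible reading of a "soft mixture" of task and moral policies is a convex combination of their action distributions, sketched below. The mixing weight `lam` and the toy probabilities are illustrative assumptions, not the thesis's actual shaping scheme.

```python
def mix_policies(task_probs, moral_probs, lam=0.5):
    """Blend a task policy with a morality policy as a convex
    combination, then renormalise. lam = 0 recovers the pure task
    policy; lam = 1 the pure moral policy."""
    actions = set(task_probs) | set(moral_probs)
    mixed = {a: (1 - lam) * task_probs.get(a, 0.0)
                + lam * moral_probs.get(a, 0.0)
             for a in actions}
    z = sum(mixed.values())
    return {a: p / z for a, p in mixed.items()}

# A task policy that favours a harmful shortcut, tempered by a
# moral policy that strongly prefers the benign alternative.
mixed = mix_policies({"steal key": 0.6, "ask for key": 0.4},
                     {"steal key": 0.05, "ask for key": 0.95})
print(mixed)
```

Because the harmful action's probability is damped rather than zeroed, the agent can still complete the task when no benign alternative exists, which is the trade-off the soft mixture is designed to preserve.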
Finally, we extend ethical alignment to personalised human preferences by proposing a human-in-the-loop mechanism that infers value judgements from minimal feedback and generalises them across new scenarios.
Together, these contributions demonstrate that language-based agents can be both capable and ethically aligned. The proposed methods advance action generation, planning, and value alignment, offering a principled foundation for trustworthy autonomous agents in text-based environments and beyond.