Advancing Training Objectives for Neural Document Summarisation

Publication Type: Thesis
Issue Date: 2023
Document summarisation is a natural language processing (NLP) task that condenses the salient information of large documents into concise summaries. It is a fundamental application spanning many facets, including abstractive and extractive methods, summarisation of single or multiple documents, and summarisation within the same language or across different languages. The field has made major strides in recent years, driven by the advent of complex language models and dedicated training resources. However, these models and datasets have their own limitations, such as the inability to cater to low-resource languages or to optimise the diversity of sequence generation, which temper the otherwise rapid progress observed in recent years. This underscores the need to advance summarisation models through dedicated training objectives and novel architectures.

Traditional summarisation approaches rely on maximum-likelihood training objectives for efficient training, yet this limits the language model’s generation capability to pre-defined reference summaries. Reinforcement learning has been used as an alternative to improve the articulation and quality of summaries by sampling diverse phrases; however, the choice of suitable techniques depends on the specific summarisation use-case and requires tailored objective functions. More recently, “prompt-based” generation has emerged as a popular method to elicit desirable linguistic behaviours from language models by allowing a “prompt” to tailor the summary generation to certain aspects (e.g., style, topics) of the input document. However, the design of an optimal prompt for a given input document remains an open problem. Finally, the advent of multilingual models has led to advances in cross-lingual summarisation, but uneven training resources across languages, together with their intrinsic linguistic differences, have often hindered performance in low-resource languages.

For all these reasons, throughout this thesis we examine contemporary reinforcement learning methodologies, prompt-based language generation, and cross-lingual document summarisation within the scope of abstractive summarisation. We present novel solutions that target a language model’s ability to focus its generation space at no cost to its performance, through reinforcement learning and prompting techniques with parity of training resources and model capacity. Additionally, we achieve further improvements by utilising the wealth of existing off-the-shelf resources to help bridge the performance gap between high- and low-resource languages. Pursuing these directions has paved the way for qualitative and quantitative improvements in summary generation, in both the monolingual and cross-lingual settings.
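To make the contrast between maximum-likelihood and reinforcement-learning training concrete, the following is a minimal sketch of the two standard objectives commonly used in abstractive summarisation; the notation (input document x, reference summary y*, sampled summary y^s, summary-level reward r such as a ROUGE score, and baseline b) is a generic formulation given for illustration only, not the specific objectives proposed in the thesis.

% Maximum-likelihood (teacher-forcing) objective over the reference summary y^*
\mathcal{L}_{\mathrm{MLE}}(\theta) = -\sum_{t=1}^{T} \log p_\theta\!\left(y^*_t \mid y^*_{<t},\, x\right)

% REINFORCE-style objective over a sampled summary y^s, weighted by a
% summary-level reward r(\cdot) (e.g., ROUGE) minus a baseline b to reduce variance
\mathcal{L}_{\mathrm{RL}}(\theta) = -\left(r(y^s) - b\right)\sum_{t=1}^{T} \log p_\theta\!\left(y^s_t \mid y^s_{<t},\, x\right)

Under the first objective the model is pushed towards the single reference summary, whereas the second rewards any sampled summary in proportion to its quality, which is why reinforcement learning can encourage more diverse generation at the cost of requiring a well-designed reward.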