Improving Machine Translation and Summarization with the Sinkhorn Divergence

Publisher:
Springer Nature
Publication Type:
Conference Proceeding
Citation:
Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics), 2023, 13938 LNCS, pp. 149-161
Issue Date:
2023-01-01
Abstract:
Important natural language processing tasks such as machine translation and document summarization have made enormous strides in recent years. However, their performance is still partially limited by the standard training objectives, which operate on single tokens rather than on more global features. Moreover, such objectives do not explicitly consider the source documents, potentially weakening the alignment between the predictions and their sources. For these reasons, in this paper we propose using an Optimal Transport (OT) training objective to promote a global alignment between the model’s predictions and the source documents. In addition, we present an original implementation of the OT objective based on the Sinkhorn divergence between the final hidden states of the model’s encoder and decoder. Experimental results on machine translation and abstractive summarization tasks show that the proposed approach achieves statistically significant improvements over our baseline and other alternative objectives across all experimental settings. A qualitative analysis of the results also shows that, thanks to the supervision of the proposed objective, the predictions align better with the source sentences.
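Since the abstract only sketches the objective, the following is a minimal, illustrative PyTorch sketch of a debiased Sinkhorn divergence computed between two sets of hidden states. The squared-Euclidean cost, uniform point weights, the hyperparameters eps and n_iters, and all function names are assumptions made for illustration; this is not the authors' implementation.

```python
import math
import torch

def _entropic_ot(x, y, eps=0.1, n_iters=100):
    """Entropy-regularized OT cost between two sets of vectors with uniform weights."""
    cost = torch.cdist(x, y, p=2) ** 2                     # (m, n) squared Euclidean costs
    m, n = cost.shape
    log_a = torch.full((m,), -math.log(m), dtype=x.dtype, device=x.device)
    log_b = torch.full((n,), -math.log(n), dtype=x.dtype, device=x.device)
    f = torch.zeros_like(log_a)
    g = torch.zeros_like(log_b)
    for _ in range(n_iters):                               # log-domain Sinkhorn updates
        f = -eps * torch.logsumexp((g[None, :] - cost) / eps + log_b[None, :], dim=1)
        g = -eps * torch.logsumexp((f[:, None] - cost) / eps + log_a[:, None], dim=0)
    # Transport plan P_ij = a_i * b_j * exp((f_i + g_j - C_ij) / eps); return <P, C>.
    log_plan = (f[:, None] + g[None, :] - cost) / eps + log_a[:, None] + log_b[None, :]
    return (log_plan.exp() * cost).sum()

def sinkhorn_divergence(x, y, eps=0.1, n_iters=100):
    """Debiased Sinkhorn divergence: OT(x, y) - 0.5 * OT(x, x) - 0.5 * OT(y, y)."""
    return (_entropic_ot(x, y, eps, n_iters)
            - 0.5 * _entropic_ot(x, x, eps, n_iters)
            - 0.5 * _entropic_ot(y, y, eps, n_iters))

# Hypothetical final hidden states for one example: source length 17, target
# length 23, hidden size 512. In the setting the paper describes, a term like
# this would supplement the usual token-level loss as a global alignment signal.
enc_states = torch.randn(17, 512)
dec_states = torch.randn(23, 512)
print(sinkhorn_divergence(enc_states, dec_states))
```

The log-domain updates keep the Sinkhorn iterations numerically stable at small eps, and the two self-transport terms debias the entropic cost so the divergence tends to zero when the two sets of states coincide.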