A hybrid neural machine translation technique for translating low resource languages

Publication Type:
Conference Proceeding
Citation:
Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics), 2018, vol. 10935 LNAI, pp. 347-356
Issue Date:
2018-01-01
© Springer International Publishing AG, part of Springer Nature 2018. Neural machine translation (NMT) has produced very promising results on high-resource languages that have sizeable parallel datasets. However, low-resource languages that lack sufficient parallel data remain a challenge for automated translation. The core of an NMT system is a recurrent neural network, which can handle sequential data at the word and sentence levels, provided the sequences are not too long. Because of the large number of possible word and sequence combinations, a large parallel dataset is required, which unfortunately is not always available, particularly for low-resource languages. We therefore adapted a character-level neural translation model based on a combined recurrent and convolutional neural network architecture. The model was trained on the IWSLT 2016 Arabic-English and the IWSLT 2015 English-Vietnamese datasets, and it produced encouraging results, particularly on the Arabic dataset, Arabic being a morphologically rich language.
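
The abstract describes a hybrid character-level model that combines a convolutional layer over character embeddings with a recurrent layer. The sketch below is not the authors' implementation; it is a minimal illustration of that general idea, with all layer names, sizes, and hyperparameters chosen as assumptions for demonstration.

```python
# Minimal sketch of a character-level encoder combining a CNN over character
# embeddings with a bidirectional GRU, in the spirit of the hybrid CNN+RNN
# model described in the abstract. All dimensions are illustrative assumptions.
import torch
import torch.nn as nn


class CharHybridEncoder(nn.Module):
    def __init__(self, char_vocab_size, char_emb_dim=64, conv_channels=128,
                 kernel_size=5, hidden_dim=256):
        super().__init__()
        self.char_emb = nn.Embedding(char_vocab_size, char_emb_dim)
        # Convolution extracts local character n-gram features.
        self.conv = nn.Conv1d(char_emb_dim, conv_channels, kernel_size,
                              padding=kernel_size // 2)
        # Recurrent layer models longer-range dependencies over the
        # convolutional feature sequence.
        self.rnn = nn.GRU(conv_channels, hidden_dim, batch_first=True,
                          bidirectional=True)

    def forward(self, char_ids):
        # char_ids: (batch, seq_len) tensor of character indices
        x = self.char_emb(char_ids)         # (batch, seq_len, emb_dim)
        x = x.transpose(1, 2)               # (batch, emb_dim, seq_len)
        x = torch.relu(self.conv(x))        # (batch, channels, seq_len)
        x = x.transpose(1, 2)               # (batch, seq_len, channels)
        outputs, final_state = self.rnn(x)  # states for an attentional decoder
        return outputs, final_state


if __name__ == "__main__":
    # Usage example with dummy character indices (hypothetical vocabulary).
    encoder = CharHybridEncoder(char_vocab_size=100)
    dummy = torch.randint(0, 100, (2, 50))  # batch of 2 sequences, 50 chars
    outs, state = encoder(dummy)
    print(outs.shape)                       # torch.Size([2, 50, 512])
```

Working at the character level keeps the input vocabulary small, which is one reason such models are attractive for morphologically rich, low-resource languages like Arabic, where word-level vocabularies grow quickly.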