English-Basque statistical and neural machine translation

Conference Proceeding
LREC 2018 - 11th International Conference on Language Resources and Evaluation, 2019, pp. 880-885
© LREC 2018 - 11th International Conference on Language Resources and Evaluation. All rights reserved.
Neural Machine Translation (NMT) has attracted increasing attention in recent years. However, it tends to require very large training corpora, which can be problematic for low-resource languages. For this reason, Statistical Machine Translation (SMT) continues to be a popular approach for low-resource language pairs. In this work, we address English-Basque translation and compare the performance of three contemporary statistical and neural machine translation systems: OpenNMT, Moses SMT and Google Translate. For evaluation, we employ an open-domain corpus and an IT-domain corpus from the WMT16 resources for machine translation. In addition, we release a small dataset (Berriak) of 500 highly accurate English-Basque translations of complex sentences, useful for thorough testing of translation systems.
