Word Preordering as the Next Step for Commercial MT
Dr. Dimitar Shterionov & Anita Ramm
Thursday, 22 June, 2017
Neural MT has gained a lot of momentum recently, but despite the growing body of technology and research on the topic, Statistical Machine Translation (SMT) still appears to be the de facto MT paradigm in commercial translation services. This is mainly because an SMT engine has traditionally taken less time to train and to translate with. However, SMT is susceptible to morphological and syntactic errors, one of the most common being incorrect word order.
Preordering (typically, reordering the words of source sentences prior to training or translation) has proven to be an effective way of dealing with this problem. In this webinar, Dimitar and Anita discuss how to integrate a preordering component into a commercial custom MT platform. Although the component was motivated by the goal of improving SMT, it is quite general and can be applied to both SMT and NMT.
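To make the idea concrete, here is a minimal sketch of rule-based preordering for English-to-German. It is not the component presented in the webinar: the function name `preorder_verb_final`, the coarse tags (`SCONJ`, `VERB`, `PUNCT`) and the single rule are all assumptions chosen for illustration. The rule moves verbs in a subordinate clause to the end of that clause, so that the English source follows German verb-final order before it reaches the MT engine.

```python
from typing import List, Tuple

Token = Tuple[str, str]  # (word, coarse POS tag); the tag set is hypothetical


def preorder_verb_final(tokens: List[Token]) -> List[Token]:
    """Toy rule: inside a clause opened by a subordinating conjunction,
    hold back the verbs and emit them at the end of the clause."""
    result: List[Token] = []
    held_verbs: List[Token] = []
    in_subclause = False

    for word, tag in tokens:
        if tag == "PUNCT":
            # Punctuation closes the clause: release held verbs first.
            result.extend(held_verbs)
            held_verbs = []
            in_subclause = False
            result.append((word, tag))
        elif tag == "SCONJ":
            # A new subordinate clause starts; flush any pending verbs.
            result.extend(held_verbs)
            held_verbs = []
            in_subclause = True
            result.append((word, tag))
        elif in_subclause and tag == "VERB":
            held_verbs.append((word, tag))
        else:
            result.append((word, tag))

    # End of sentence without closing punctuation.
    result.extend(held_verbs)
    return result


sentence = [
    ("I", "PRON"), ("know", "VERB"), ("that", "SCONJ"), ("she", "PRON"),
    ("reads", "VERB"), ("the", "DET"), ("book", "NOUN"), (".", "PUNCT"),
]
reordered = [word for word, _ in preorder_verb_final(sentence)]
print(" ".join(reordered))
```

The reordered source mirrors the word order of the German "Ich weiß, dass sie das Buch liest", which leaves the translation model with less long-distance reordering to do. A real system would rely on a parser and a much richer rule set or a learned model; this single rule only illustrates the principle.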
What will viewers learn from watching?
- Application of preordering research in a commercial platform
- Understanding of tools that can be used to improve MT
- Differences and similarities in applying preordering to SMT and NMT
Dr. Dimitar Shterionov
Dimitar is Head MT Researcher at KantanMT. He holds a PhD in Computer Science from KU Leuven, Belgium, and has worked on the design and development of Artificial Intelligence software for learning and reasoning with uncertain data. Since 2016, Dimitar has led KantanLabs, a research and development group committed to advancing language technology. Within KantanLabs, Dimitar and the team work on introducing innovative technology into the KantanMT platform, such as efficient word reordering, improved alignment, and Neural MT.
Anita Ramm
Anita Ramm is a PhD student at the University of Stuttgart/Munich, Germany. Her research focuses on German verbs in the context of English-to-German Statistical Machine Translation (SMT), where she works on both positional and inflectional problems. She is involved in the EU project "HiML - Health In My Language", where she works on modelling German morphology in SMT/NMT for medical data. Previously, she was involved in the EU project "TTC - Terminology Extraction, Translation Tools and Comparable Corpora", where she worked on terminology extraction from domain-specific texts.