QTLaunchPad Workshop on User-Centric Machine Translation & Evaluation at MT Summit XIV
The QTLaunchPad project recently held a full-day workshop on User-Centric Machine Translation & Evaluation at the Machine Translation Summit XIV in Nice, France, with the aim of bringing together developers, researchers, industry stakeholders, and users of translation technologies.
With the uptake of MT in production workflows, the issues of usability and realistic assessment of MT quality have led to increased calls for user-centric approaches to evaluation, accessibility, and effective usage. Users and their feedback are vital at many key points in the MT process, ranging from sourcing and using data, through developing systems and evaluating them and their output, to end-user consumption.
Various technological approaches in human-computer interaction have made their way into MT research and development, where metrics have been adopted to assess aspects of cognitive effort, translation performance, user reception, and the effective deployment of cutting-edge tools and cloud-based solutions. However, findings from user data and feedback have yet to demonstrate their growing value to MT itself. They are particularly important in areas of development and industrial application at a time when more and more users have access to MT technologies and have higher expectations of them.
As users come in many forms, e.g. translators, post-editors, developers, and evaluators, the tools, methods, and resources available to them are of critical importance, especially in the context of high-quality MT. The quality of these resources therefore matters greatly, and highlights issues surrounding the sourcing of appropriate high-quality parallel corpora, standardized quality ratings for resources, comparability of corpora and data, annotation, and evaluation. There is a growing need for appropriate tools and resources that support MT in this regard and go beyond the crude scores of one-size-fits-all standard automatic metrics and resource-heavy human evaluation. A top desideratum is to enhance the diagnostic value of MT evaluation in order to help developers fine-tune and optimize the performance of their systems and to prepare the systems for actual usage in specific production environments.
Highlights from the Workshop
Josef van Genabith of the Centre for Next Generation Localisation at Dublin City University presented an overview of the QTLaunchPad project and unveiled plans for QT21, the next step in collaborative research and development that is currently being developed from the QTLP project.
Following this, the workshop keynote was given by Philipp Koehn of the University of Edinburgh on the topic "Quality for Whom? Evaluation for Different Uses of Machine Translation". Philipp’s talk reminded us of the wide variety of application scenarios for evaluation methods: metrics such as BLEU are still valuable to MT researchers and developers, while the findings and data from MT evaluation research need to be useful and incorporated into MT to make the most of both worlds.
Focusing on the topic of translation quality, Alan Melby of Brigham Young University presented a usable and inclusive definition of translation quality and discussed the difficulties in finding and applying operational definitions of quality in such a wide range of evaluation tasks as are present in today’s translation landscape. Building upon this, Arle Lommel of DFKI presented the Multidimensional Quality Metrics [MQM] being developed by the QTLaunchPad project, highlighting the framework’s focus on users rather than on metrics solely for the research community. An integral part of the development and testing of MQM has been the ongoing collaboration of GALA and QTLaunchPad, in this case with Kim Harris and her team at text & form. Kim’s talk presented the experience of error annotation using MQM and related post-editing processes, and gave valuable insight into the preparation and lessons learned from such a unique experience from an industry viewpoint.
The full programme listing for the workshop can be found at: https://docs.google.com/document/d/1GBw2OkRszNZCebtlbGVRnLJiHXnRHWksfZ8LiCLWkUw/edit