QTLaunchPad: An Opportunity for GALA Members to Influence the Future of Translation Quality
By: Aljoscha Burchardt (DFKI)
18 December 2012
Translation quality is a major topic of discussion in the globalization industry. Quality translation is in higher demand today than ever before. Progress in Machine Translation (MT) has enabled many new applications for automatic translation, but the quality barriers for translations to be published or distributed outside an organization remain unconquered. As a consequence, fewer translations are produced today than are needed for optimal business or required by law. Industry and society urgently need progress in translation technology and workflows to fulfill existing translation requirements, extend multilingual communication to additional languages and services (for addressing untapped markets), reduce costs, and fulfill commitments to linguistic diversity.
For years, the state of the art consisted of static “error-count” metrics like SAE J2450 and the LISA QA Model, but these have proven increasingly insufficient for modern translation production environments that integrate MT, traditional human translation, crowd-sourced localization, and other methods to address content that ranges from standard documentation to social media, locally authored content, dynamic text, and beyond. While these older standard metrics were a step forward in their time, they suffer from the one-size-fits-all problem: most metrics are not “tunable” to reflect different requirements, different production environments, and different user expectations. At the same time, standard methods used for various types of localization are not comparable, leaving users at a loss for how they should compare a BLEU or METEOR score for MT with a LISA QA Model score or the output from an automatic checking tool for human translation. This is not to say that MT metrics, which have required reference translations, are not important, but rather that they are inadequate for the ongoing assessment of translation projects. Both human and machine translation evaluation need to be embedded in a larger consistent framework.
Multidimensional Quality: A Paradigm Shift
What is needed is a fundamental paradigm shift in how we consider quality. Recent efforts, such as TAUS’ “Dynamic Quality Framework,” are an important step in the right direction. But there is a need for an integrative approach that ties standards, industry needs, and the latest research together in an open environment that is free of charge for all users. To produce such an outcome, the European Commission-funded QTLaunchPad project has started to team up with many different stakeholder groups such as GALA.
One of the core deliverables from QTLaunchPad is a *framework* for defining quality assessment methods. At the heart of this quality framework is the notion of “dimensions,” aspects of the translation task that influence what quality issues are considered and the relevance they are assigned. Dimensions can be thought of as basic questions about the translation that can be used to help select issues that matter for the assessment task.
Dimensions are vital because no single set of quality issues, no matter how well thought out, will meet all requirements: the needs of someone assessing an MT system for translating internal emails will be very different from someone assessing the human translation of a multi-billion euro business contract that must be filed in three jurisdictions with all versions legally binding. At the same time, however, the judgments about quality should be comparable and not unnecessarily complex: not all Dimensions are needed all the time. By providing a master set of quality issues (over 150 at this time) that encompasses all possible tasks and allowing users to select the appropriate ones, the multidimensional metrics will allow tasks to be tuned to different needs while still allowing compatibility between them.
Based on an analysis of all major quality assessment metrics, tools, and methods, the project is currently preparing a tool to allow users to build quality metrics of various sorts based on dimensions such as vertical industry, output modality (e.g., software user interface, subtitles, print), publication intent (e.g., “gist”, dissemination, internal use), style, assessment task type, and so forth. This tool will be integrated with a revision environment and will allow users to tag texts with quality data based on the framework in the forthcoming Internationalization Tag Set (ITS) 2.0 standard.
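The selection idea can be illustrated with a small sketch. It is not the QTLaunchPad tool or its data model: the issue names, the three-entry master list, and the `build_metric` function are hypothetical, and only one dimension (publication intent) is modeled. It simply shows how a task-specific metric might be derived as a subset of a shared master list of issues.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Issue:
    name: str
    # Publication intents for which this issue is assumed to matter (hypothetical).
    relevant_for: frozenset


# Tiny hypothetical master list; the article mentions over 150 issues.
MASTER_ISSUES = [
    Issue("mistranslation", frozenset({"gist", "internal use", "dissemination"})),
    Issue("terminology", frozenset({"internal use", "dissemination"})),
    Issue("typography", frozenset({"dissemination"})),
]


def build_metric(publication_intent):
    """Return the names of the master-list issues relevant to one publication intent."""
    return [
        issue.name
        for issue in MASTER_ISSUES
        if publication_intent in issue.relevant_for
    ]


gist_metric = build_metric("gist")
dissemination_metric = build_metric("dissemination")
print(gist_metric)
print(dissemination_metric)
```

Because both metrics draw on the same master list, an issue counted in one means the same thing in the other, which is what would keep the resulting scores comparable across tasks.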
As it would be hopeless to try to design such a tool in vitro, its design phase will include many feedback rounds with beta testers and with the larger public through workshops and similar events. To ensure that the quality framework and tool represent industry best practice, QTLaunchPad is working towards compatibility with the “translation parameters” of ISO/TS 11669 (see http://www.ttt.org/specs for a summary), ITS 2.0, and other standards. The planned tool (to be released in March 2013) will work with XLIFF files. For users who are working with existing quality metrics (such as SAE J2450 and the LISA QA Model), no adjustment will be necessary, since these models can be represented as applications of the Multidimensional Quality Framework. But users will gain access to more sophisticated tools for using these metrics that will enable audit trails, resolution tracking, and better integration with other localization tools, all at the segment or sub-segment level, instead of the document-level metrics currently available.
The QTLaunchPad Planning Panel
Another activity of QTLaunchPad relevant to GALA members is the preparation of a major push in translation technology concentrating on overcoming existing barriers to translation quality. The project is currently assembling a Planning Panel of the best European centers of MT research, the most sophisticated and enthusiastic large-scale users of translation technology (including enlightened potential users and technophile, quality-conscious providers of translation services and language tools), resource creators, and experienced technology integrators.
The Planning Panel in its first meetings has chosen to focus on a number of core areas (Research Innovation Application Scenarios or RIASes) for improving translation quality that may in the end include areas such as Medical, Corporate, Public, or Media. For each of these areas smaller bodies will be established to discuss issues and come up with proposals for ongoing research, industry collaboration, and promotion that will result in improved quality and industry positioning. Some of these smaller groups have begun their work, but in 2013 they will open to additional input and the QTLaunchPad Project will host workshops on them to gather feedback.
GALA plays a vital role in the QTLaunchPad project. As the metrics and other materials are developed, GALA will be engaged in an active program of outreach to its members to make sure that the results are usable by industry. Far too often, publicly funded projects have resulted in deliverables of interest to the research community that have been unusable by industry. Currently the QTLaunchPad project has two meetings planned where GALA members will be able to learn more about the Multidimensional Quality Assessment Model, the RIASes, and other aspects of QTLaunchPad, and provide feedback. The first will be held on 14 March in Rome (a half day on each topic) alongside the MultilingualWeb Workshop (http://www.multilingualweb.eu/rome), and the second will be held at GALA 2013 in Miami on 17-20 March. Interested parties are invited to contact Arle Lommel ([email protected]) to learn more about these opportunities and how GALA members can provide their input.
Aljoscha Burchardt works in the Language Technology Lab of the German Research Center for Artificial Intelligence (DFKI GmbH). He manages the EC-funded project QTLaunchPad that is preparing a big European translation quality initiative. He is also a manager within the European Network of Excellence META-NET that is preparing a Technology Alliance for Multilingual Europe and project leader of the taraXÜ project that develops hybrid machine translation technology in a consortium with industry partners. Burchardt has a background in semantic Language Technology. After his PhD in Computational Linguistics at Saarland University he coordinated the Center of Research Excellence "E-Learning 2.0" at Technische Universität Darmstadt.