taraXŰ – The First End-to-End Machine Translation Environment
Project Status as of October 2012
Berlin/Bertrange - The rising demand for machine translation (MT) and the resulting implementations have shown that workflows and system architectures can become very complex. In particular, the difficulty of setting up combination or hybrid systems that run different MT engines in parallel has kept many language service providers from introducing MT technology.
To address this issue, euroscript teamed up in 2010 with a strong project consortium in the highly ambitious R&D project taraXŰ. The project has so far produced the first advanced machine translation environment, which provides, among other things:
- Corpus and translation management (running different rule-based, template-based and statistical machine translation engines in parallel)
- Automated source- and target-language quality assessment
- Automatic selection of the best translation through innovative methods
- A web-interface allowing human translators to correct machine translation output and assess its quality by ranking, post-editing, error-analysis, etc.
This cutting-edge approach takes into account existing language resources in the form of translation memories and terminological resources derived from previous translations. To choose the best translation, an intelligent self-calibrating selection mechanism evaluates all translation proposals and picks the best candidate for further processing. It uses highly sophisticated metrics based on characteristics of the source and target language, requirements of the translation job and other metadata. This selection mechanism can, for example, relieve the post-editor of the tedious task of comparing different machine translations and translation proposals against a translation memory.
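The idea of scoring several translation proposals and picking one can be illustrated with a minimal sketch. It is not the taraXŰ implementation: the feature names, the weights and the `Candidate` type are hypothetical, and the weights are fixed here, whereas a self-calibrating mechanism would adjust them over time.

```python
from dataclasses import dataclass


@dataclass
class Candidate:
    source: str     # hypothetical label for the producer, e.g. "rbmt", "smt", "tm"
    text: str       # the proposed translation
    features: dict  # hypothetical quality features, each assumed to lie in [0, 1]


# Hypothetical, hand-picked weights; a real system would calibrate these.
WEIGHTS = {"fluency": 0.5, "adequacy": 0.4, "tm_match": 0.1}


def score(candidate, weights=WEIGHTS):
    """Weighted sum of the candidate's features; missing features count as 0."""
    return sum(w * candidate.features.get(name, 0.0) for name, w in weights.items())


def select_best(candidates, weights=WEIGHTS):
    """Return the candidate with the highest weighted score."""
    if not candidates:
        raise ValueError("no translation candidates to choose from")
    return max(candidates, key=lambda c: score(c, weights))


# Made-up proposals for one source sentence, for illustration only.
example = [
    Candidate("rbmt", "Proposal A", {"fluency": 0.6, "adequacy": 0.8}),
    Candidate("smt", "Proposal B", {"fluency": 0.9, "adequacy": 0.7}),
    Candidate("tm", "Proposal C", {"fluency": 0.8, "adequacy": 0.5, "tm_match": 0.75}),
]
best = select_best(example)
```

In this sketch the post-editor would only be shown `best` instead of all three proposals. Job requirements and other metadata could enter as further features or as different weight sets.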
Compared with other MT-related R&D projects, taraXŰ achieves high practical relevance.
The end-to-end workflow developed in the project covers all aspects of a translation process driven by a language service provider (LSP). This includes processing various file formats, preserving formatting information, incorporating target-language quality requirements and factoring in existing language resources. It also supports the use of a translation memory system as a manual post-editing environment in order to correct machine translations and regenerate the original file formats. One crucial feature that further sets the taraXŰ system apart is that it supports the use of corrections resulting from post-editing for the controlled improvement of the translation engines. This goes far beyond the common practice of simply feeding ever more bilingual data into the machine translation systems in the hope that performance will improve.
"The taraXŰ project gives us an unprecedented opportunity to perform applied research in a human-centric setting. The early inclusion of human translators in the development process is definitely a winning strategy to further improve machine translation quality," says Hans Uszkoreit, Scientific Director at the German Research Center for Artificial Intelligence (DFKI) and Head of the DFKI Language Technology Lab.
The taraXŰ system is currently a research prototype. It has been used for various machine translation experiments involving euroscript and several other language service providers. An interesting preliminary result, which still needs further validation, indicates that post-editing machine translation output is 26% faster than translating from scratch.
The taraXŰ project consortium relies upon expertise in translation services, machine translation and evaluation, language checking, language technology, and related fields. Project partners are Acrolinx, the German Research Center for Artificial Intelligence (DFKI), euroscript Deutschland and yocoy.
More information can also be found at: taraxu.dfki.de.
euroscript International is a leader in providing customers with global solutions in content lifecycle management. The euroscript divisions deliver comprehensive solutions that help customers design, build and run content management operations of all sizes. Thanks to its employees’ expertise in the fields of consulting, system integration, language services as well as content and document management, euroscript is able to help businesses worldwide to manage content more efficiently.
With a market presence in over 18 countries, euroscript serves customers in a variety of business sectors including the public sector, aerospace, defence and transport, manufacturing, life sciences, financial services and energy and environment.
Acrolinx develops technology for Information Quality Management. Our flagship product Acrolinx IQ is an enterprise system that promotes quality and efficiency during content development. Acrolinx IQ gives writers access to best-practice writing standards, terminology and intelligent reuse of existing content. The tools provided by Acrolinx IQ improve productivity, decrease localization costs and accelerate time to market. Companies including Adobe, Cisco, IBM, Philips, Siemens and many more rely on Acrolinx IQ for managing information quality.
The German Research Center for Artificial Intelligence (Deutsches Forschungszentrum für Künstliche Intelligenz GmbH - DFKI), with facilities in Kaiserslautern, Saarbrücken, Bremen, and a project office in Berlin, is the country's leading research center in the area of innovative software technology for commercial application. In the international scientific community, DFKI is recognized as one of the most important "Centers of Excellence" in the world for its proven ability to rapidly bring leading-edge research to commercially relevant application solutions. DFKI was founded in 1988 as a nonprofit organization by several renowned German IT companies and two research facilities. Since then, DFKI has established a reputation for proactive and customer-oriented work and is known both nationally and internationally as a competent and reliable partner for commercial innovation.
Yocoy Technologies GmbH was founded in June 2007 in Berlin, as the 50th spin-off company of the German Research Center for Artificial Intelligence (DFKI).
The vision and ultimate goal of the founders of Yocoy is to develop software that enables people to overcome barriers between the most common languages in the world. This is pursued by developing applications for mobile devices which allow their users to literally carry a new language with them in their pocket all over the world.
The soul of the young Berlin enterprise is an international team of highly skilled specialists thoroughly trained in mobile interfaces, crosslingual/crosscultural communication and tourism information systems.
Project financed by TSB Technologiestiftung Berlin – Zukunftsfonds Berlin
Co-financed by the European Union – European Regional Development Fund