What is Post-Editing and Why is It So Hard?
By: Jörgen Danielsen (Eule Lokalisierung) - Eule Lokalisierung GmbH
06 March 2013
Jörgen Danielsen describes the challenges in getting started with Machine Translation and the difficulties faced with post-editing.
Modern technology now covers more than the basics, but it has grown more complex along the way. I am an experienced driver (I am German, so I have to be), but when I bought my current car, the salesperson sat inside the car with me and explained its features for more than 30 minutes before I was able to take it for a ride.
On the other hand, the selection of the “right” technology is not as crucial as it used to be. One particular application might be better suited for your specific purposes than another, but even the latter could solve your issues if used properly. And here is the catch – today’s software does an awful lot, but it gets harder and harder to locate the features and functions you need.
My former IT manager always claimed “The problem sits in front of the computer” and, as arrogant as this statement sounds, he was usually right. Now, that does not mean that the user is outright stupid. In most cases, he or she is just not trained to use the hardware/software combination properly (or to fix the issue at hand).
Machine translation (MT) software, in particular, is a hard nut to crack. No matter which software you buy (or rent), you cannot use it “out of the box”. Even if an expert sets it up for your purposes, you still need to figure out how to get the best out of it.
You either integrate the MT application into a publishing environment (e.g. your intranet) in order to provide quick and dirty translations of huge amounts of text to your target audience, or you employ post-editors or translators in order to create good, “human-legible” output. And here we are back again at the “problem sitting in front of the computer.”
In today’s translation and localization market, there is a severe lack of knowledge on how to post-edit machine-translated texts. It starts with the question: Who should do the post-editing? A translator? A subject-matter expert who understands the target language only? A language student? Some other low-cost resource? A specially trained post-editor (does such a person exist)?
Then there is the question of who provides the MT output and who does the post-editing. Should it all be handled by one party or not? If not, how can you make sure that the results of the post-editor’s efforts are used to improve the MT engine? Post-editors hate to make the same corrections over and over again!
And, to make matters worse, you will have a hard time finding a nice and efficient application supporting the post-editing efforts. The obvious post-editing environment seems to be a translation memory (TM) system, but it is intended for a slightly different purpose and it needs to be directly linked to the MT engine in order to facilitate post-editing work (e.g. by avoiding rework on inline tags). TM providers have made efforts to support MT engines, but in general, there are still features missing.
On top of these process issues, there is an even more severe lack of properly trained post-editors. No matter whom you select as your post-editor, he or she needs to be trained, since post-editing requirements differ from those for translation. Even worse, they differ between post-editing the output of statistical MT (SMT) engines and the output of rule-based MT (RBMT) engines. In the case of RBMT, you frequently need to move words around in order to make a sentence readable, or you just scrap a sentence because the MT engine got it completely wrong. In the case of SMT, you often get incomplete sentences or even “very nice” translations that happen to have just one minor flaw: they are completely unrelated to the source sentence. Thus you need to look for different types of issues and practice different techniques for the two types of MT engines.
Finally, the expectations of what you get after post-editing and what you pay or receive for the service are all over the place. They range from paying next to nothing (an expectation created by some claims of MT providers that their engines are able to provide “virtually-perfect translations”) to requesting almost full translation rates due to “unusable MT outputs”.
At GALA 2013 in Miami, colleagues gathered for several sessions to discuss precisely these issues. David Canek (Memsource) started off with a rousing discussion of MT Post-editing in Practice. Andy Reid and Riz Karim (SDL) discussed Leveraging Language Technology to Build Your Post-Editing Practice. Olga Beregovaya and Alex Yanishevsky provided a case study on Measuring and Managing Post-editing Productivity Gains at Welocalize, and Harald Elsen (DELTA International) and Jennifer Brundage (Lucy Software) looked at post-editing fundamentals during a three-part small-group discussion. Post-editing was on everybody’s minds in Miami; no doubt we’ll be talking about these issues for the foreseeable future as well!
Jörgen Danielsen is the Managing Director of Eule Lokalisierung. After studying mathematics and physics, Jörgen founded a translation company and bought his first MT system in 1990. Over the next decade, he held several executive-level positions with a focus on ERP, CRM, and technology. After three more years balancing multiple roles at the same time, he decided it was time to implement his own ideas regarding customer-based services again, and founded Eule Lokalisierung.