5 Content and Quality Considerations that Affect Translation Memory Use
By: Erik Vogt (Director of Solutions) - RWS Moravia
08 June 2016
Translation memory (TM) is an established language technology known to save costs and time, and it usually improves quality, but not all organizations are able to leverage translation memory to the fullest. Why does this happen?
There are many reasons, the main one being that the relevance of your translation memory to any particular content is influenced by your content and quality strategy. Can having clearly established content and quality strategies help you get more out of your TM? We tackle these questions in this post.
1. Formulating a quality strategy
A well-defined content strategy will mean, among other things, that there is clarity on the purpose of the content as well as its end-user profile. This will shape not only the content in the target languages, but also quality expectations. Of course, not all types of content are equally important, and hence they don’t need to adhere to the same levels of quality. In addition, not all content is a good fit for all use cases.
For example, leveraging a translation memory containing the collected works of Monty Python would hardly be appropriate for the UN. As quality guru Philip B. Crosby observed, quality is not an inherent property — quality is defined by how well something meets a specific purpose or standard.
Quality starts with the source language content, which in turn strongly influences the quality of the target. Once you are clear on the level of quality you need for a specific project — in other words, the degree to which your content meets the needs of your target audience — you can decide which TMs to use and how the project should be reviewed.
For instance, if the quality expectation is very high, you should only be using a TM that is known to be completely clean and from a highly relevant source, and any matches you use from TMs that aren’t ideal will need to be carefully reviewed. On the other hand, the priority of another project might be to output the highest possible volume of content within your budget.
Quality requirements naturally differ with content types, target language tolerance levels, and sometimes even with individual projects. Hence, you need a clearly defined quality policy to make the most of your TMs. If clarity is lacking on how to measure quality, or if quality is being managed inconsistently across projects, it is going to impact TM health.
2. Grouping like with like — organizing TMs by type
If you have different types of content, those categories naturally guide TM segregation. One way to tier content is given below, but there are many other ways. Some categories naturally go together, such as help and user documentation (and if you are doing it right they are nearly the exact same thing), whereas marketing, legal and social media posts usually belong on their own.
A great example is Apple's slogan “Think Different”; it was very effective as a slogan, but grammatically it’s incorrect. In a marketing campaign it’s a home run, but in the body of a User Guide it would be regarded as an error. Source authors and translators can and probably should take more creative liberties with marketing copy than with user guides. Some common content types are:
- Audio / video content
- Marketing material
- Online help
- Social media content
- Training material
- User interface text
- User documentation
- Website content
Of course some of these categories may have relevant overlapping content, but carefully managing the information you know about the content such as its category, the products that went into each TM, when the TMs were created, the standards to which they comply, and who translated the content, can make the TMs much more valuable in the future.
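As a rough illustration, the information listed above can be kept as a small metadata record per TM, which then makes it easy to pick only the relevant TMs for a new project. This is a minimal sketch in Python; the `TMRecord` fields, the `select_tms` helper, and the sample names (`help_2015`, `WidgetPro`, and so on) are all hypothetical and not tied to any particular TM tool.

```python
from dataclasses import dataclass
from datetime import date


@dataclass
class TMRecord:
    """Hypothetical metadata record describing one translation memory."""
    name: str
    category: str           # e.g. "Online help", "Marketing material"
    products: list[str]     # products whose content went into the TM
    created: date           # when the TM was created
    standards: list[str]    # style guides or glossaries it complies with
    translators: list[str]  # who translated the content


def select_tms(records: list[TMRecord], category: str, product: str) -> list[TMRecord]:
    """Return only the TMs whose category and product match the new project."""
    return [r for r in records if r.category == category and product in r.products]


# Illustrative data only.
tms = [
    TMRecord("help_2015", "Online help", ["WidgetPro"], date(2015, 3, 1),
             ["StyleGuide v2"], ["vendor_a"]),
    TMRecord("mkt_2014", "Marketing material", ["WidgetPro"], date(2014, 6, 1),
             ["Brand v1"], ["vendor_b"]),
]

selected = select_tms(tms, "Online help", "WidgetPro")
print([r.name for r in selected])
```

In practice, most TM tools let you attach this kind of information as attributes or custom fields, so the record above is only a stand-in for whatever your system provides.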
By carefully including what you trust to be a good match and excluding (or penalizing) what isn’t, you will get the most relevant match recommendations on each project. You will also reduce the costs of non-compliance, such as post-translation QA and potentially more expensive post-sale failures.
Segregation is important. If you keep dumping all of your translated strings into one big translation memory database, it’s a surefire way to mess up your TM. TMs lose utility as more types of content, projects and styles blur the focus.
3. Aging TMs
The age of a TM matters. For example, a relatively clean TM that is more than five years old may not comply with new standards, terminology and even spelling. Spelling rules and conventions formally change periodically, as has recently happened with Brazilian Portuguese and Dutch, but everyday usage also shifts as new terms replace old ones in common use and the meanings of familiar words change over time.
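If each TM carries a creation date, flagging the ones that are due for a check is simple. The sketch below assumes a five-year threshold taken from the example above; the function names and the threshold default are illustrative, and a marketing TM might reasonably get a shorter limit.

```python
from datetime import date


def tm_age_years(created: date, today: date) -> float:
    """Approximate age of a TM in years."""
    return (today - created).days / 365.25


def is_stale(created: date, today: date, max_age_years: float = 5) -> bool:
    """True if the TM is older than the allowed age and should be re-checked."""
    return tm_age_years(created, today) > max_age_years


# Hypothetical creation dates, checked against a fixed reference date.
reference = date(2016, 6, 8)
print(is_stale(date(2010, 1, 1), reference))
print(is_stale(date(2014, 1, 1), reference))
```

A stale flag does not mean the TM is unusable, only that its terminology, spelling and style should be checked against current standards before it is leveraged.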
Although it has been variously attributed, it’s commonly quoted that St. Paul’s Cathedral was once described by a monarch as amusing, awful, and artificial — and at the time it was high praise meaning artful and inspiring. More relevant to today’s business environment is the tendency for corporate branding guidelines to change. The voice that matched your company’s style half a decade ago may sound dated and no longer fit with the image you want to convey today. (This is one reason marketing TMs “degrade” more quickly than technical ones.)
Another challenge is that the older a TM is, the higher the number of translators that would likely have worked on it. Each author and translator has their own style. Differences in writing style can range from obvious to subtle, but unless the guidelines are very clear from the beginning, the content will feel disjointed and confused — even if it is all technically correct.
4. Target audience expectations vary
It's worth noting that some target audiences have a much lower tolerance for errors than others. French or Japanese consumers are stereotypically more demanding of compliance with standards, but one might expect a user guide for software designed for authors to meet higher grammatical standards than one of the required small-print micro-manuals that accompanied your latest keyboard. If anyone actually reads them at all, they are likely not going to be picky as long as the product works. Generally, academic circles will expect more than younger non-professionals. In fact, in some cases highly polished content has a negative effect on a specific demographic.
5. Making mud-pies — how collecting goes wrong
If you’re working with broadly-aggregated language assets, their language quality is usually more difficult to know. If you have different vendors who are each managing their TMs themselves, chances are that the quality is going to be inconsistent. Blindly leveraging them just to save money can seriously backfire.
The same is the case with TM content that isn’t tagged correctly. If you don’t know how a TM originated, what quality requirements were enforced during the projects, or how many rounds of review the leveraged content necessitated, you will probably run into problems.
As you go along the content creation and translation journey, you are going to end up with many TMs. Knowing what is in each TM is critical. If you know what you can trust, you can reduce leverage penalties, or consider leveraging without review at all, especially for “perfect matches” (a.k.a. “ICE” or 101% matches). Focus your limited budget on reviewing just what you need to.
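One way to picture this is as a penalty subtracted from the raw match percentage, with the size of the penalty depending on how much you trust the TM. The sketch below is only an illustration: the trust levels, the penalty values, and the rule for skipping review are assumptions that would be set by your own quality policy, not defaults of any TM tool.

```python
# Hypothetical penalty points per trust level; real values are a policy decision.
PENALTY_BY_TRUST = {"trusted": 0, "unverified": 5, "suspect": 15}


def effective_score(raw_score: int, trust: str) -> int:
    """Lower a raw match percentage by the penalty for the TM's trust level."""
    return max(0, raw_score - PENALTY_BY_TRUST[trust])


def needs_review(raw_score: int, trust: str) -> bool:
    """Skip review only for in-context exact (101%) matches from a trusted TM."""
    return not (trust == "trusted" and raw_score >= 101)


print(effective_score(100, "unverified"))
print(needs_review(101, "trusted"))
print(needs_review(101, "unverified"))
```

With these assumed values, a 100% match from an unverified TM is treated as a 95% fuzzy match and goes to review, while a 101% match from a trusted TM passes straight through.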
Non-ideal TMs also include projects that were completed in a hurry or even ones about which the quality is unknown. A TM can only be trusted up to the standards you know for certain it complies with, and that goes for ALL of the projects that are included in the TM.
This also doesn’t automatically mean that TMs of known low quality need to be trashed, just that their quality characteristics should be labelled appropriately and used for projects where quality expectations match.
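Labelling can be as simple as assigning each TM a quality level and only offering it to projects whose requirement it meets. The following sketch assumes four illustrative levels and treats "match" as "meets or exceeds the required level"; both the level names and the TM names are hypothetical.

```python
# Hypothetical quality labels, ordered from least to most trusted.
QUALITY_LEVELS = {"unknown": 0, "low": 1, "standard": 2, "high": 3}


def usable_tms(tm_labels: dict[str, str], required: str) -> list[str]:
    """Return the names of TMs whose quality label meets the project's requirement."""
    needed = QUALITY_LEVELS[required]
    return sorted(
        name for name, label in tm_labels.items()
        if QUALITY_LEVELS[label] >= needed
    )


# Illustrative labels only.
labels = {"help_2015": "high", "forum_posts": "low", "legacy_import": "unknown"}

print(usable_tms(labels, "high"))
print(usable_tms(labels, "low"))
```

Under this scheme a low-quality TM is still available for a high-volume, low-risk project, but it never appears as a candidate for content with strict quality requirements.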
Thus, the relationship between content, quality, and translation memory is very intricate. You will not be able to reap the benefits of a TM unless you have rock-solid content creation and quality measurement and maintenance practices.
This article was originally published on April 25, 2016 on the Moravia Blog.