
What is Machine Translation?

Machine translation (MT) refers to fully automated software that can translate source content into target languages. Humans may use MT to help them render text and speech into another language, or the MT software may operate without human intervention.

MT tools are often used to translate vast amounts of information involving millions of words that could not possibly be translated the traditional way. The quality of MT output can vary considerably; MT systems require “training” in the desired domain and language pair to increase quality.

Translation companies use MT to increase the productivity of their translators, cut costs, and provide post-editing services to clients. MT use by language service providers is growing quickly. In 2016, SDL, one of the largest translation companies in the world, announced that it translates 20 times more content with MT than with human teams.

MT Systems

Generic MT usually refers to platforms such as Google Translate, Bing, Yandex, and Naver. These platforms provide MT for ad hoc translations to millions of people. Companies can buy generic MT for batch pre-translation and connect to their own systems via API.

Customizable MT refers to MT software that has a basic component and can be trained to improve terminology accuracy in a chosen domain (medical, legal, IP, or a company’s own preferred terminology). For example, WIPO’s specialist MT engine translates patents more accurately than generalist MT engines, and eBay’s solution can understand and render into other languages hundreds of abbreviations used in electronic commerce.

Adaptive MT offers suggestions to translators as they type in their computer-assisted translation (CAT) tool, and learns from their input continuously in real time. Introduced by Lilt in 2016 and by SDL in 2017, adaptive MT is believed to improve translator productivity significantly and may challenge translation memory technology in the future.

There are over 100 providers of MT technologies. Some of them are strictly MT developers; others are translation firms and IT giants.

Examples of MT Providers

Google Translate

Microsoft Translator / Bing

SDL BeGlobal

Yandex Translate

Amazon Web Services translator

Naver

IBM - Watson Language Translator

Automatic Trans

BABYLON

CCID TransTech Co.

CSLi

East Linden

Eleka Ingeniaritza Linguistikoa

GrammarSoft ApS

Iconic Translation Machines

K2E-PAT

KantanMT

Kodensha

Language Engineering Company

Lighthouse IP Group

Lingenio

Lingosail Technology Co.

LionBridge

Lucy Software / ULG

MorphoLogic / Globalese

Multilizer

NICT

Omniscien

Pangeanic

Precision Translation Tools (Slate)

Prompsit Language Engineering

PROMT

Raytheon

Reverso Softissimo

SkyCode

Smart Communications

Sovee

SyNTHEMA

SYSTRAN

tauyou

Tilde

Trident Software

UTH International

Worldlingo

Based on a TAUS report*

MT Approaches

There are three main approaches to machine translation:

  • First-generation rule-based (RbMT) systems rely on large sets of hand-written rules that encode the grammar, syntax, and phraseology of a language.
  • Statistical systems (SMT) arrived with search and big data. With lots of parallel texts becoming available, SMT developers learned to pattern-match reference texts to find translations that are statistically most likely to be suitable. These systems train faster than RbMT, provided there is enough existing language material to reference.
  • Neural MT (NMT) uses machine learning, specifically neural networks, to teach software how to produce the best result. Training consumes large amounts of processing power, which is why it is often run on graphics processing units (GPUs). NMT started gaining visibility in 2016, and many MT providers are now switching to this technology.

A combination of two different MT methods is called Hybrid MT.
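The statistical idea of picking the translation that is "statistically most likely" can be illustrated with a toy sketch. It assumes the parallel texts have already been split into aligned phrase pairs, which is the hard part in a real SMT system; the function names and sample data below are hypothetical and not taken from any toolkit.

```python
from collections import Counter, defaultdict

def train_phrase_table(aligned_pairs):
    """Count how often each source phrase was translated as each target phrase."""
    counts = defaultdict(Counter)
    for source, target in aligned_pairs:
        counts[source][target] += 1
    return counts

def most_likely_translation(phrase_table, source):
    """Return (target, relative frequency) for the most frequent translation,
    or None if the source phrase never appeared in the training data."""
    candidates = phrase_table.get(source)
    if not candidates:
        return None
    total = sum(candidates.values())
    target, count = candidates.most_common(1)[0]
    return target, count / total

# Hypothetical aligned English-French phrase pairs.
pairs = [("bank", "banque"), ("bank", "banque"), ("bank", "rive")]
table = train_phrase_table(pairs)
print(most_likely_translation(table, "bank"))
```

The sketch also shows why SMT depends on having enough reference material: a phrase that never occurs in the training pairs has no translation at all.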

Availability: API, Cloud, Server, Desktop

Google, Microsoft, IBM, Amazon, Yandex, and many others run MT software on their own infrastructure and provide it as a cloud API service, priced per character. For example, it costs $20 to translate 1 million characters with Google Translate. In contrast, developers of customizable MT, including Systran and Promt, offer server and desktop products priced per license.
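As a rough illustration of per-character pricing, the sketch below estimates the cost of a batch of texts. The default rate is the $20 per 1 million characters quoted above; the function name is hypothetical, and counting every character with `len()` is an assumption, since each provider defines its own rules for what is billable.

```python
from decimal import Decimal

def estimate_mt_cost(texts, price_per_million_chars=Decimal("20")):
    """Return (total characters, estimated cost in dollars) for a batch of texts.

    Assumption: every character, including spaces, is billable.
    """
    total_chars = sum(len(text) for text in texts)
    cost = Decimal(total_chars) * price_per_million_chars / Decimal(1_000_000)
    return total_chars, cost

# A hypothetical batch of 2,000 segments of 250 characters each.
segments = ["x" * 250] * 2000
chars, cost = estimate_mt_cost(segments)
print(chars, cost)  # 500000 characters, 10 dollars at the quoted rate
```

`Decimal` is used instead of floating point so that the monetary arithmetic does not pick up binary rounding errors.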

In professional translation, MT is most often integrated into the CAT tool. As linguists work through the text, they can pick an MT suggestion whenever the translation memory does not offer a better match.
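The "translation memory first, MT as fallback" workflow can be sketched as follows. This is a simplified illustration, not how any particular CAT tool works: the fuzzy-match score comes from Python's `difflib`, the 0.75 threshold is an arbitrary assumption, and `mt_translate` is a hypothetical stand-in for whatever MT engine is connected.

```python
from difflib import SequenceMatcher

def best_tm_match(segment, translation_memory):
    """Return (similarity, stored translation) for the closest TM source segment."""
    best_score, best_target = 0.0, None
    for source, target in translation_memory.items():
        score = SequenceMatcher(None, segment.lower(), source.lower()).ratio()
        if score > best_score:
            best_score, best_target = score, target
    return best_score, best_target

def suggest(segment, translation_memory, mt_translate, threshold=0.75):
    """Offer the TM match if it is close enough, otherwise fall back to MT."""
    score, target = best_tm_match(segment, translation_memory)
    if target is not None and score >= threshold:
        return "TM", target
    return "MT", mt_translate(segment)

# Hypothetical translation memory and a placeholder MT function.
tm = {"Press the start button.": "Appuyez sur le bouton de démarrage."}
fake_mt = lambda text: f"[MT] {text}"

print(suggest("Press the start button.", tm, fake_mt))
print(suggest("Completely unrelated sentence about weather", tm, fake_mt))
```

In practice the linguist, not the software, makes the final choice; the sketch only shows which suggestion would be offered first.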

Build Your Own MT Engine

There are open-source toolkits anyone can use to build their own engines for any domain and language combination. The most popular baseline toolkits are Moses for SMT, OpenNMT for NMT, and Apertium for RbMT. Training statistical and neural engines requires a large collection of parallel texts in two languages. Some organizations, such as TAUS, have made a service out of providing baseline data, which companies can further expand by adding their own specialist translations.

Evaluating MT Quality

Translation companies and departments typically evaluate MT quality by the effort it takes for a human to post-edit the output. This effort is often measured in pages per hour, or in the number of keystrokes per segment.
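When real keystroke logs are not available, one possible proxy for post-editing effort is the character-level edit distance between the raw MT output and the post-edited text. The sketch below rests on that assumption; the function names are hypothetical, and the numbers it produces only approximate actual keystrokes.

```python
def edit_distance(a, b):
    """Levenshtein distance: minimum number of single-character insertions,
    deletions, and substitutions needed to turn a into b."""
    prev = list(range(len(b) + 1))
    for i, char_a in enumerate(a, 1):
        curr = [i]
        for j, char_b in enumerate(b, 1):
            cost = 0 if char_a == char_b else 1
            curr.append(min(prev[j] + 1,          # delete char_a
                            curr[j - 1] + 1,      # insert char_b
                            prev[j - 1] + cost))  # substitute or keep
        prev = curr
    return prev[-1]

def mean_edits_per_segment(segment_pairs):
    """Average edit distance over (mt_output, post_edited) pairs."""
    if not segment_pairs:
        return 0.0
    total = sum(edit_distance(mt, edited) for mt, edited in segment_pairs)
    return total / len(segment_pairs)

print(edit_distance("kitten", "sitting"))  # 3
```

A lower average suggests the MT output needed less correction, although typing speed, navigation, and thinking time are not captured.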

Specialists training MT engines rely on automated tests and metrics. These are better suited to A/B testing and experimentation, and they can reveal the impact of even small changes that humans might not notice.

The mainstay metric for automated testing is BLEU (bilingual evaluation understudy), which shows how closely MT output corresponds to a human translation of the same text. It compares the MT output with one or more reference translations and produces a score between 0 (worst) and 1 (best). While BLEU scores are widely used by MT researchers, they can be manipulated, and it takes a specialist to make sense of the results.
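To make the idea concrete, here is a minimal sentence-level sketch of BLEU with a single reference: clipped n-gram precision for n = 1 to 4, combined by geometric mean and multiplied by a brevity penalty. It assumes whitespace tokenization and no smoothing, so it is an illustration only; scores reported in research are normally computed over a whole corpus with an established implementation.

```python
import math
from collections import Counter

def ngrams(tokens, n):
    """Count all n-grams of length n in a token list."""
    return Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))

def bleu(candidate, reference, max_n=4):
    """Unsmoothed sentence-level BLEU against one reference, in [0, 1]."""
    cand, ref = candidate.split(), reference.split()
    if not cand:
        return 0.0
    log_precisions = []
    for n in range(1, max_n + 1):
        cand_ngrams, ref_ngrams = ngrams(cand, n), ngrams(ref, n)
        # Clip each candidate n-gram count by its count in the reference.
        overlap = sum(min(count, ref_ngrams[gram])
                      for gram, count in cand_ngrams.items())
        total = sum(cand_ngrams.values())
        if overlap == 0 or total == 0:
            return 0.0  # without smoothing, any zero precision gives BLEU 0
        log_precisions.append(math.log(overlap / total))
    # Brevity penalty: penalize candidates shorter than the reference.
    if len(cand) > len(ref):
        brevity_penalty = 1.0
    else:
        brevity_penalty = math.exp(1 - len(ref) / len(cand))
    return brevity_penalty * math.exp(sum(log_precisions) / max_n)

reference = "the cat sat on the mat"
print(bleu("the cat sat on the mat", reference))        # 1.0
print(bleu("the cat sat on the mat today", reference))  # below 1.0
```

The sketch also hints at why the scores need expert interpretation: tokenization, the number of references, and smoothing choices all change the result.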

Other MT quality metrics include METEOR, ROUGE, HyTER, and NIST. Quality metrics are the focus of the QT21 program supported by GALA.

Ethics for Translation Providers using MT

Confidentiality - Content translated by free MT platforms such as Google Translate and Microsoft Translator is not confidential. It is stored by the platform owners and may be reused for later translations.

Notifying the Client about MT Use - Whether a translation company should notify clients about the use of MT on their projects is a point of debate in the industry. Many commentators favor informing the customer, while some providers do not disclose it. If you have questions about MT usage, ask your provider.

*TAUS Machine Translation Market Report 2017
