Algorithm Enables Translation Memory Systems to Identify More Relevant Matches in Less Time
By Keva Marable Blair & Michael Bloodgood, University of Maryland Center for Advanced Study of Language (CASL)
Researchers at the University of Maryland Center for Advanced Study of Language (CASL) developed an algorithm for computer-aided translation software that produces quality translation examples by enabling translation memory systems to identify more relevant matches.
The translation services industry includes a range of specializations and related technologies from interpreting to localization and is projected to grow to $39 billion in revenue by 2018. Many assume the industry employs hosts of linguists to manually translate text and audio recordings, but translation services are becoming more technology driven.
Translators now rely upon large repositories of translated material, also known as translation memory banks, to produce accurate translations quickly and inexpensively. Algorithms compare the material to be translated against the translation memory bank and retrieve the closest matches. Two important factors in determining the usefulness of translation memory technologies are:
- How well the bank of translated resources matches the content to be processed, and
- Whether the available technology is powerful enough to handle large amounts of data.
Due to technological developments and increasing sophistication of machine translation, customers are seeking new translation services that can help them better communicate with their clients quickly and at lower cost. In addition, more consistent translations that fit the audience help global companies maintain solid branding and a consistent voice.
The core concepts behind translation memory systems were published as early as 1980. Since then, further research has enhanced the technology and broadened its practical application. Interest in improving computer-aided translation technologies, particularly translation memory systems, has surged.
The European Chapter of the Association for Computational Linguistics (EACL) held a workshop on human and computer-assisted translation at its 2014 Conference. The conference proceedings also included papers on translation memory retrieval methods and on a computer-assisted translation workbench that offers advanced functionality for computer-aided translation and the scientific study of human translation, including interaction with translation memory matches.
CASL's algorithm addresses this need by enabling translation memory systems to identify more relevant matches. Translators have found it especially helpful across multiple domains and language pairs.
The translation memory system stores and recalls translations by pulling from a bank that grows with each entry. The algorithm continually pairs new material with stored translations and identifies relevant matches across large amounts of data. When translators encounter words they are not familiar with, including scientific and technical terms, they rely on these systems for support.
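The store-and-recall cycle can be pictured with a minimal Python sketch. This is an illustration only and not CASL's algorithm: the class name `TranslationMemory`, its methods, the 0.6 threshold, and the example sentences are all hypothetical, and the similarity score comes from the standard library's `difflib.SequenceMatcher` applied to word tokens.

```python
from difflib import SequenceMatcher


class TranslationMemory:
    """Toy translation memory (hypothetical names, illustration only)."""

    def __init__(self):
        # Each entry is a (source sentence, stored translation) pair.
        self.entries = []

    def add(self, source, target):
        # The bank grows by one pair with each entry.
        self.entries.append((source, target))

    def lookup(self, query, min_score=0.6):
        # Return (source, target, score) for the most similar stored
        # source sentence, or None if nothing reaches min_score.
        query_tokens = query.lower().split()
        best = None
        best_score = min_score
        for source, target in self.entries:
            source_tokens = source.lower().split()
            score = SequenceMatcher(None, query_tokens, source_tokens).ratio()
            if score >= best_score:
                best, best_score = (source, target, score), score
        return best


tm = TranslationMemory()
tm.add("The patient has a fever.", "Le patient a de la fièvre.")
tm.add("Restart the computer.", "Redémarrez l'ordinateur.")

match = tm.lookup("The patient has a cough.")
if match is not None:
    source, target, score = match
    print(f"{score:.2f}  {source}  ->  {target}")
```

A translator would see the stored French sentence offered as a starting point for the new, similar English sentence. The scan over every entry is the simplest possible design; a production system would need indexing to handle a large bank.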
There is enormous potential for this technology to reduce the time and cost needed to produce quality, professional translations. The translation memory system retrieves more helpful translations for:
- Multiple languages in very different language families, from French to Chinese.
- Difficult-to-translate materials including technical jargon.
- Several domains and genres, from medical documents to computer software documentation.
In addition, the algorithm has lower time complexity than commonly used edit-distance-based algorithms, allows length preferences to be adjusted, and is expected to work well on material with recurring phrases where sentences are not an exact match, such as patent translations and legal documents.
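For context, the edit-distance baseline mentioned above can be sketched in a few lines of Python. This shows the commonly used comparison approach, not CASL's algorithm; the function names and the normalization by the longer sentence's length are assumptions made for illustration. Each comparison fills a table whose size is the product of the two sentence lengths, which is why the cost adds up when every incoming sentence is checked against a large bank.

```python
def edit_distance(a, b):
    """Levenshtein distance between two sequences (words or characters).

    Runs in O(len(a) * len(b)) time, keeping only one previous row.
    """
    prev = list(range(len(b) + 1))
    for i, x in enumerate(a, start=1):
        curr = [i]
        for j, y in enumerate(b, start=1):
            cost = 0 if x == y else 1
            curr.append(min(prev[j] + 1,          # delete x
                            curr[j - 1] + 1,      # insert y
                            prev[j - 1] + cost))  # substitute or keep
        prev = curr
    return prev[-1]


def fuzzy_match_score(query, candidate):
    """Word-level similarity in [0, 1]; 1.0 means identical sentences.

    Hypothetical scoring: one minus the edit distance divided by the
    length of the longer sentence.
    """
    q, c = query.split(), candidate.split()
    longest = max(len(q), len(c))
    if longest == 0:
        return 1.0
    return 1.0 - edit_distance(q, c) / longest


score = fuzzy_match_score("the claim of the patent",
                          "the claim of the invention")
print(round(score, 2))
```

The two example sentences differ in one word out of five, the kind of near-repetition common in patents and legal documents.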
CASL’s algorithm for computer-aided translation is available for licensing. To learn more about this technology or begin the licensing process, click here. If you are interested in partnering with CASL on future research projects, click here.
Keva Marable Blair is the Online Communications Manager at CASL. She holds an MA in Publications Design from the University of Baltimore and a BS in Electronic Media and Film from Towson University. Keva specializes in marketing and creative strategies, including web presence, digital media, online marketing, and graphic design.
Michael Bloodgood is a CASL research scientist in the field of computational linguistics. He received his PhD in Computer Science from the University of Delaware and is an expert in machine learning and the use of statistical approaches for processing language data. Dr. Bloodgood is interested in improving the efficiency with which both people and computers process language data.
ABOUT UMD CASL
The University of Maryland Center for Advanced Study of Language (CASL) conducts innovative, academically rigorous research in language and cognition that supports national security. CASL research is interdisciplinary and collaborative, bringing together people from the government, academia, and the private sector. CASL research is both strategic and tactical, so that it not only advances areas of knowledge, but also directly serves the critical needs of the nation. For more information, visit www.casl.umd.edu.