Today GALA launches a new series of blog posts, where we ask translation industry experts inside and outside the GALA community for their insights and advice on managing business processes and digital transformation. Do you have a burning or knotty question? Send it to us and we’ll ask our experts.
We kick off with a question on terminology and AI:
Given that artificial intelligence is permeating all activities related to translation and content production, we can't help but pause to consider what role AI will have (or maybe already has) in terminology management. Here's what some experts have answered.
In my opinion, AI will have a twofold impact on terminology management. On the one hand, it will make companies more aware of the importance of terminology. Since terminology is the number one factor for improving AI output quality, investing in terminology will have much more tangible benefits than we’ve seen so far. In the past, it was very difficult to calculate the objective benefits of sound terminology management, apart from improved efficiency in several different global content processes. However, when terminology serves as direct input into AI and deep learning engines, for example in machine translation, chatbots, or data mining, the improvements it brings can be expressed in actual numbers and facts. Deep learning projects of any kind require structured and harmonized data, which is exactly what terminology management delivers. In marketing, too, terminology can feed into global SEO rollouts or social media management. Thus, completely new fields of application and new divisions inside companies are now looking for terminology. This is a game changer, because it turns terminology from a pure cost center into an actual and tangible revenue generator, or at least a revenue contributor, for companies.
On the other hand, AI will also play a role in terminology management itself. For starters, not many terminology management systems currently offer the functionality we know from authoring tools, such as consistency or style assurance. They will gain it. AI will also start playing a role in terminology creation itself. This might range from automatic definition extraction or generation to finding potential synonyms in a company's data. AI will help make corporate terminology processes more efficient. For example, machine learning could predict which user groups might have an issue with which terms and route those terms accordingly. It will be able to predict the preferred term out of a group of synonyms based on guidelines or previous selections. Or, again, it might help tune terminology extraction tools to predict whether a term candidate will make it through review based on previous data.
Terminology management employed artificial intelligence long before AI became the buzzword it is today. After all, the best terminology extraction engines have always relied on natural language processing (NLP), one of the fields within AI. At the same time, a granular terminological data collection represents knowledge and is thus the backbone of AI applications, such as machine translation (MT). A recent project of mine involved term mining in preparation for a 9-million-word MT project. Once the terminology was identified and appropriately documented, it supported the human editing process. Texts that consistently use the correct terminology in turn result in improved translation memories (TMs) and better-trained MT engines. While we can predict certain improvements in terminology or information extraction from the field of artificial intelligence, the contribution of terminology management to AI applications can be expected to be far greater.
The term "artificial intelligence" (AI) is often used to describe computers that mimic cognitive functions normally associated with the human mind, such as learning and problem solving (source: Wikipedia). Today, the more advanced term mining tools already apply a form of AI: they use sophisticated algorithms coupled with a reference corpus to identify term candidates that show strong “termhood,” or semantic relevance.
I wouldn’t say that AI has an impact on term mining. It’s rather the other way around. Term mining has an impact on AI by enabling the effective identification of conceptually relevant terms, which in turn can be used to support AI applications, such as search engine optimization, automatic content classification, machine translation, and sentiment analysis. As for how AI impacts terminology management, the first thing that comes to mind is that online corpora are growing exponentially, and continually evolving search techniques are making it possible to access those corpora like never before. Terminology data itself constitutes the DNA of digital content, so managing and developing terminological resources is like building a toolbox for accessing and leveraging digital content for various end purposes. The availability of large-scale corpora, corpus analysis tools (such as WordSmith), and programming languages (such as Python) unlocks huge opportunities for terminology workers to build high-performing, multi-purpose terminology resources, which in turn aid the advancement of AI. In this landscape, the job description of the terminologist is changing dramatically.
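To make the idea of "termhood" against a reference corpus more concrete, here is a minimal Python sketch. It is only an illustration under stated assumptions: the tokenizer is deliberately naive, the score is a simple smoothed ratio of relative frequencies, and the function names (`termhood_scores`, `top_candidates`) are hypothetical rather than taken from any term mining tool. Real extraction engines also handle multi-word terms, part-of-speech patterns, and much larger corpora.

```python
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and split it into word tokens (naive on purpose)."""
    return re.findall(r"[a-z]+(?:-[a-z]+)*", text.lower())


def termhood_scores(domain_text, reference_text, min_count=2):
    """Score words by how much more frequent they are in the domain text
    than in a general reference text (ratio of relative frequencies)."""
    domain = Counter(tokenize(domain_text))
    reference = Counter(tokenize(reference_text))
    domain_total = sum(domain.values())
    reference_total = sum(reference.values())
    vocab_size = len(set(domain) | set(reference))

    scores = {}
    for word, count in domain.items():
        if count < min_count:
            continue  # ignore words too rare to judge
        # Add-one smoothing so words missing from the reference
        # corpus do not cause a division by zero.
        domain_rate = (count + 1) / (domain_total + vocab_size)
        reference_rate = (reference[word] + 1) / (reference_total + vocab_size)
        scores[word] = domain_rate / reference_rate
    return scores


def top_candidates(domain_text, reference_text, n=5):
    """Return the n highest-scoring term candidates (ties sorted by name)."""
    scores = termhood_scores(domain_text, reference_text)
    return sorted(scores, key=lambda w: (-scores[w], w))[:n]


# Toy texts, invented purely for illustration.
domain_text = "the termbase stores the term and the termbase exports the term"
reference_text = "the cat sat on the mat and the dog sat on the log"
candidates = top_candidates(domain_text, reference_text, n=2)
```

In this toy example, "term" and "termbase" score higher than "the" because they are frequent in the domain text but absent from the reference text, which is the basic intuition behind corpus-comparison approaches to term mining.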
My team provides guidance on word choice to a large organization, and we are tasked with providing that guidance before authors write the first line of content in a project or initiative. This type of proactive terminology management relies primarily on structured processes and close collaboration between terminologists and content owners. Our goal is to enable authors to use the right terms the first time (as opposed to correcting terminology during a review stage), and to have reviewed and validated translated terms available at the beginning of a translation project (as opposed to starting the term translation process at that time). I strongly believe that proactive terminology management that begins at the planning stage is more effective and efficient, and produces more consistent content (across all languages) faster, than terminology management that relies on term mining late in the content lifecycle. So, while we have very sophisticated term mining software in our toolbox, we do not use this type of technology as part of our standard terminology development process. And I don’t expect that to change any time soon, regardless of any progress made in developing term mining capabilities.
To discuss the impact of AI on terminology management, we need to better understand the AI engine involved. There are currently no global "all-knowing" AI engines. Instead, each one is focused on a certain area and trained with relevant material in that field. The key to obtaining intelligence through human interaction with an engine is for the engine to grasp contextual information. What is this text about? What is the person asking for? Therefore, contextual information, semantic relations between terms, and categorization of terminology all play major parts. Another important factor is the use of variants, which some termbases are built to avoid. Humans typically use various expressions to describe the same thing, which means synonyms must be included in the termbase. This also impacts term extraction processes. They must become more intelligent in order to interpret the context of a term, while also being flexible enough to allow more variants and synonyms in the resulting termbase. Terminology management software also needs to scale up to make the information flow manageable. We are now talking about hundreds, if not thousands, of term suggestions entering the termbase per week. There will need to be workflow support and automation functionality so that users can collaborate effectively in updating AI-supporting termbases. In summary, terminology will clearly play a key role in the functioning of AI, and this is already impacting how people are working with their termbases.
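As a rough illustration of a termbase that records variants and a category instead of excluding them, the following Python sketch maps every surface form to its concept within a domain. All names here (`Concept`, `Termbase`, `resolve`) and the sample entries are hypothetical, and the domain label is only a stand-in for the richer contextual information described above. It does not reflect the data model of any specific terminology management system.

```python
from dataclasses import dataclass, field


@dataclass
class Concept:
    """One concept entry: a preferred term, its domain, and allowed variants."""
    preferred: str
    domain: str
    variants: set = field(default_factory=set)


class Termbase:
    """Toy termbase that indexes preferred terms and variants per domain."""

    def __init__(self):
        # Maps (domain, case-folded surface form) -> Concept.
        self._index = {}

    def add(self, concept):
        """Register the preferred term and every variant for lookup."""
        for form in {concept.preferred, *concept.variants}:
            self._index[(concept.domain, form.casefold())] = concept

    def resolve(self, surface_form, domain):
        """Return the preferred term for a surface form in a given domain,
        or None if the form is unknown in that domain."""
        concept = self._index.get((domain, surface_form.casefold()))
        return concept.preferred if concept else None


# Sample entries, invented purely for illustration.
termbase = Termbase()
termbase.add(Concept("notebook", "hardware", {"laptop", "portable computer"}))
termbase.add(Concept("notebook", "stationery", {"notepad", "jotter"}))

hardware_hit = termbase.resolve("Laptop", "hardware")
stationery_miss = termbase.resolve("laptop", "stationery")
```

Keying the index by domain means the same word can belong to different concepts in different contexts, so "laptop" resolves to "notebook" in the hardware domain but is unknown in the stationery domain.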
For more on terminology, visit the GALA Knowledge Center.