Human Problems: Machine Learning and the Language Industry
By: Rita Pang - University of Strasbourg
20 February 2019
As part of the GALA Rising Star 2019 Scholarship Contest, students across the world were asked to answer this question: "What should stakeholders in the language industry do to prepare themselves for the machine learning evolution? What will be the role of the humans?" Participants from more than xx academic programs submitted responses. The winners received a free registration to the GALA 2019 Munich conference and a travel stipend.
The following essay by Rita Pang (University of Strasbourg) was one of two winners.
In fall 2007, I went to a conference in the suburbs of Istanbul. Hours before our departure, my friend and I realized that we did not know how to get to the main bus terminal, where we had to catch a bus to Bulgaria. Panicking as time was running out, we ran back inside the conference hall and got on its Wi-Fi, and on my friend’s brand-new iPhone I typed “we go to bus terminal” in Google Translate. I remember specifically using only one verb, as I feared that anything more complex would produce an unusable translation. With that translated sentence, we meandered across the city by taxi and a local bus, eventually getting dropped off at the terminal close to midnight.
Machine translation tools have gained tremendous ground in the decade that followed this eventful trip. As these services absorb more patterns from huge amounts of translated texts and input from real people, they learn to produce results similar to a human speaking with proper grammar instead of word-by-word utterances.
A 2018 study conducted by Grand View Research showed that the machine translation market is expected to reach USD 983.3 million by 2022 [1]. As lucrative as it is for the industry, a very real challenge exists when I work with machine translation as a freelance supplier. I am often asked to accept volume discounts or pour hours into post-editing machine translation (PEMT), or I am simply refused work, as the client only wants “a rough idea of what the text is saying”. For a while, I sneered at this concept without realizing that machine learning has become an essential part of our lives. The speed and high-quality results that I get from Google searches, for example, are made possible by machine learning. Search engines’ recognition of synonyms and their ability to predict meanings behind queries are also extremely helpful [2].
Let us take a look at some of the things the language industry can do to integrate into the machine learning evolution, as well as roles that we, as businesses and as individual service suppliers, can take on as humans.
Using the technology as a selling point
The first rule of thumb is to never bury your head in the sand. The technological evolution is here to stay, with 73% of companies projected to adopt machine learning by 2022 [3]. One Hour Translation, one of the biggest players in the language industry, works with not one but multiple tools [4]. One of their selling points is their advisory service for customers and service providers, where they suggest using a specific translation tool for each project, based on subject matter and language pair.
While businesses are adept at selling the technology as a cost-saving tool to customers, the pitch to translators is not always the same. I have come across businesses that showcase the technology in every negative way possible, either as an ultimatum (“use the tool or you will not get projects from us again”), as an excuse to underpay, or as a way to shorten or even remove quality assurance. This has more to do with a change of mindset in the industry in general, which leads to my next point on the role of education.
Learning to learn
Today we hear a lot of talk about the benefits and the power of machine learning, but these discussions say little about helping adults transition into this world. Change is hard, especially for a reluctant translator or a smaller business that is already struggling to keep up with the technology. To adapt to these changes and avoid a race to the bottom, stakeholders need to educate themselves. This extends to unlearning obsolete habits and learning how to process information differently with the aid of technologies, or designing and adapting to new project flows. More importantly, stakeholders need to step away from the “man vs. machine” mindset. Companies must work on developing their human strengths to complement the technology, for example by investing in training and education.
Re-skilling and up-skilling today’s workforce
The World Economic Forum’s 2016 study on the Future of Jobs asserted that technological change is often accompanied by talent shortage and growing inequality. It is simply not possible to “weather the current technological revolution by waiting for the next generation’s workforce to become better prepared” [5]. Businesses, individuals and governments all need to take an active role in retraining their current workforces and ensuring that the skill gap is kept to a minimum. The industry needs to deliver more strongly on the “human” aspects such as customer service, training and company culture. The 2018 edition of the Future of Jobs report further found that roles leveraging “distinctively ‘human’ skills” such as sales, marketing, training and organizational development are expected to grow [6]. These activities require creativity, collaboration and persuasion, which can be complemented but not replaced by technology.
Exploring new business activities and models
Machine learning technologies have created new jobs for our industry. Factors such as language morphology mean that machine translation output may have a higher “print-ready” accuracy rate in one language, but require more extensive editing in another before the output can be used. The demand for expert reviewers, editors and Quality Assurance (QA) testers is therefore high; working with PEMT text now takes up a regular portion of my monthly business activities. For businesses, new specialist roles requiring extensive understanding of the latest technologies, such as process automation experts, are on the rise. Now, more than ever, one needs to have a diverse business offering. You need not be available for every single type of language service activity, but presenting yourself as a knowledgeable, well-rounded supplier requires the ability to perform different tasks. Consider adding skills such as software testing and training to your business offering, for example.
Fixing the bias in artificial intelligence
A study conducted in 2017 shows that when trained on a text corpus containing historic biases, whether morally neutral (a preference between insects and flowers) or problematic (towards race or gender), artificial intelligence learns to adopt the biased semantic associations [7]. How does this concern the language industry then? Imagine a word is fed into a terminology database with contexts and meanings containing biases. Every time a translator works with that semantic occurrence, the output is an intrinsically biased piece of text. Think about the effect this may have on the most vulnerable members of society, especially when data related to the justice system and recruitment is processed in this way.
In March 2018, the World Economic Forum released a statement stressing the need for strong standards to prevent marginalization of humans in artificial intelligence. As policies and regulations are unlikely to keep up with the pace of new technological developments, the challenge to the language industry is one of self-governance. Many ML systems today are almost entirely developed by homogeneously male teams, for example. A White Paper from the Global Future Council on Human Rights 2016-2018 suggests multiple actions that businesses could take to mitigate the challenge [8]. The following are the ones that I find especially pertinent to the language industry:
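To make the idea of a learned semantic association concrete, here is a minimal sketch in Python. It scores how much closer a word sits to one set of attribute words than to another, using cosine similarity between word vectors, which is the general kind of measurement such studies rely on. The two-dimensional vectors, the word lists and the function names are all hypothetical and chosen by hand for illustration; they are not taken from the cited study or from any real corpus.

```python
import math


def cosine(u, v):
    """Cosine similarity between two vectors of equal length."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def association(word_vec, attrs_a, attrs_b):
    """Mean similarity to attribute set A minus mean similarity to set B.

    A positive score means the word sits closer to set A,
    a negative score means it sits closer to set B.
    """
    mean_a = sum(cosine(word_vec, a) for a in attrs_a) / len(attrs_a)
    mean_b = sum(cosine(word_vec, b) for b in attrs_b) / len(attrs_b)
    return mean_a - mean_b


# Hypothetical toy vectors, invented for this example only.
# A real system would learn vectors like these from a large text corpus.
TOY_VECTORS = {
    "flower": (0.9, 0.1),
    "insect": (0.1, 0.9),
    "pleasant": (1.0, 0.0),
    "lovely": (0.8, 0.2),
    "unpleasant": (0.0, 1.0),
    "nasty": (0.2, 0.8),
}

pleasant = [TOY_VECTORS["pleasant"], TOY_VECTORS["lovely"]]
unpleasant = [TOY_VECTORS["unpleasant"], TOY_VECTORS["nasty"]]

flower_score = association(TOY_VECTORS["flower"], pleasant, unpleasant)
insect_score = association(TOY_VECTORS["insect"], pleasant, unpleasant)

print(f"flower: {flower_score:+.3f}")
print(f"insect: {insect_score:+.3f}")
```

In this toy setup "flower" scores positive and "insect" scores negative, simply because the vectors were built that way. The same arithmetic applied to vectors learned from real text is how a preference that exists in the source material can surface as a measurable association in the system.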
• Ensure diversity in development teams and participate in open source data sharing
• Train system designers/developers on human rights responsibilities
• Organize research and understand how the company’s chosen technology has performed
• Calibrate systems to include fairness criteria where appropriate – for example, linguists working with a high-context language can be brought in to input as many applicable contexts to a word as possible
The points I have touched on here are part of many extensive, open discussions within the language industry at large. One thing is certain: machine learning was first created to provide solutions to human problems. It is therefore imperative for all stakeholders in the industry to act immediately. They need to actively learn about it, interact with it, and use it for productive ends and for the common good.
Rita Pang is a freelance English/Chinese linguist, social media manager and travel writer. She is currently pursuing a Master's degree in Technical Communication and Localization (TCLoc) at the University of Strasbourg, France.
1 Grand View Research, Report summary on “Machine Translation (MT) Market Size, Share & Trends Analysis Report By Application (Automotive, Military & Defense, Electronics, IT, Healthcare), By Technology, By Region, And Segment Forecasts, 2012 – 2022”, May 2018.
2 Rowe, Kevin. “How Search Engines Use Machine Learning: 9 Things We Know for Sure”, February 2018.
3 World Economic Forum, Future of Jobs Survey 2018.
4 Marr, Bernard. “Will Machine Learning AI Make Human Translators An Endangered Species?” August 2018.
5 World Economic Forum, The Future of Jobs: Employment, Skills and Workforce Strategy for the Fourth Industrial Revolution, January 2016.
6 World Economic Forum, The Future of Jobs Report 2018, 2018.
7 Caliskan, Aylin; Bryson, Joanna J.; Narayanan, Arvind. “Semantics derived automatically from language corpora contain human-like biases”, April 2017.
8 World Economic Forum, “How to Prevent Discriminatory Outcomes in Machine Learning”, March 2018.