As part of the GALA Rising Star 2019 Scholarship Contest, students across the world were asked to answer this question: "What should stakeholders in the language industry do to prepare themselves for the machine learning evolution? What will be the role of the humans?" Participants from more than xx academic programs submitted responses. The winners received a free registration to the GALA 2019 Munich conference and a travel stipend.
The following essay by Katherine Tse (University of Leeds) was one of two winners.
The Writing on Wall
“Go to my hometown,” he said, “and check out our guji. We’re famous for them.”
Guji...like old chickens? Out at dinner with classmates, I was still getting my bearings as a new expat in Taiwan, but I was certain there wasn’t an old chicken-famous place on the island. “What old chickens?” I asked.
No, no, he shook his head as the table laughed. “We’re one of the oldest cities in Taiwan,” he said as he drew the characters in the air: guji, “historical sights.”
Ah, a Chinese homophone.
My error certainly wasn’t for a lack of machine tools at my disposal; Google Translate or a dictionary app lurked a finger’s swipe away, and as machine learning continues its steady march across the language industry, it will become even harder to commit such linguistic stumbles.
The much-touted work efficiency of machine learning (in 2016, Google boasted 143 billion words a day across 100 language combinations)1 underscores its growing significance – and justifiably so. No wonder, then, that talk of machine learning often seems predicated on a fear that burrows deep in the language industry: that human translators might one day be replaced by machines. But that’s not the whole story. Machine learning might replace some tasks, and it might redefine the way jobs look now, yet our capacity to express ourselves ultimately transcends the capacity of machines to replicate language entirely. Lifting the veil on the doomsday tone will reveal that despite (or because of) machine learning, humans still have a role to play as informed and open-minded contributors.
It’s worth remembering that machine learning exists because language contains such intricacies, such complexities that automating tasks makes LSPs more efficient and frees them to focus on creative aspects. As language evolves, so does our worldview. Just this past holiday season, “Baby It’s Cold Outside” came under the microscope for the advances made by the male voice that were repeatedly rebuffed by the female voice. Portrayed through the ages as a vindictive sorceress who turned men into swine, Circe regained agency and humanity in author Madeline Miller’s best-selling narrative. Wicked reimagined a classic villain, the Wicked Witch of the West, as simply misunderstood.
As the most retranslated work in history, the Bible is no exception either. Take the story of Genesis 21 when Sarah gives birth to a long-awaited son and names him for laughter, saying, “God hath made me to laugh, so that all that hear will laugh with me.”2 Her words mark her as an active participant in the laughter, an act of communal joyfulness.
Yet in an interview with The New York Times3, noted scholar-translator Robert Alter takes a different tack in his reading of the Hebrew and produces instead: “Laughter has God made me,/ Whoever hears will laugh at me.” Sarah becomes the object of unwanted ridicule, the laughter made malevolent. The slightest re-reading reverberates in this Biblical story. As Alter would have it, this gives Sarah a layer of humanity, anger, and complexity.
That the same story in Hebrew inspires two different versions – one a story of a miracle, the other a story of social isolation; a heroine in one, a fallible human in another – comes from the human drive to re-examine our most fundamental narratives, breathe new poeticism into well-worn verses, and grapple for meaning with languages that have far preceded us. Imagine that same line handled by machine learning. It will seek patterns in data and try to match the segment faithfully, but what happens to that transposition from the active voice to the passive voice, from her image as a joyful mother to a scorned woman? How does that diminish her sense of injury and her wounded pride, two universal feelings that connect with people across any era?
What we contribute now, more than ever, is our very human ability to understand each other. We can prepare for our place in a machine era by focusing on that singular element that distinguishes us from our digital counterparts: our humanity, wrapped in our needs and expressions and nuances. Our ability to understand and engage with each other through language unifies us, despite ostensibly deepening polarization.
In the age of machine learning, then, stakeholders can prepare by returning to the roots of the industry: language. Read more. Read books of soaring fiction and heroic sagas. Read books that question our very existence and ponder the foundations of our society. Read age-old tales and sagas about good against evil; read all the retellings they inspired and discussions they provoked. Think and question and engage more deeply with each other.
Basic? Perhaps. Talking about the importance of reading might be the oldest lecture in the book, but stakeholder preparation need not focus only on sweeping policy changes or groundbreaking approaches. Being a part of the industry means working with language at all its varying levels and realizing that there never can be complete mastery of language. My current translation program focuses on learning software and practicing translation methods, but when it comes down to independent learning, to what fundamental resource do we turn but supplementary readings?
Speaking of linguistic mastery, US Homeland Security Secretary Kirstjen Nielsen drew attention last month for telling a judiciary committee, “From Congress I would ask for wall. We need wall.”4 Just before that, her department had written in a press release that they were “committed to building wall and building wall quickly.”5
The terseness drew instant notice. Was it a wall, any ordinary brick structure? Was it the wall, the singular structure that has inspired legions of red-hatted chanters? Was it plural walls unified under one moniker or even a walled system? The absence of the definite article could make all the difference in policy, intention, and meaning. (No doubt Mexico would agree!)
Sometimes we mean only what we say, and sometimes we mean only what we don’t say. Context and connotations help us understand one another with greater clarity, yet machines see only one segment at a time to translate, a short-sightedness that could result in mistranslation or sow confusion. Absent the context, a post-editor might assume a grammatical gaffe on Nielsen’s part. With the context, it becomes a matter of intention: toeing that line between campaign promises and real policy. Machine learning cares not for that delicate game, but the translator bears the burden of relaying this information in full, complete with all the guesswork that a missing definite article entails. To render words accurately is not the same as rendering words meaningfully.
In turn, stakeholders have a responsibility to themselves and their clients to be informed. On a fundamental level, it sharpens their understanding of the nuances of the times. But more than that, it serves as a reminder that words matter and that the context of the situation affects language beyond the semantic level – a reminder that applies across business, journalism, literature, and beyond. Words nowadays are increasingly rooted in context, as surely as any text is rooted in its source culture. Recognizing that reality means recognizing the responsibility of being informed and the imperative to be open-minded.
There is a greater responsibility on translators to be more informed, not less informed and certainly not merely informed enough, and that dictates a greater need for human translators. Perhaps in decades past, translations that overlooked context could pass muster with people across different cultures – but not now, no longer. Machine learning might shoulder part of the process for us, but translators, interpreters, and stakeholders still need to be the informed parties: the ones capable of imparting contextual significance to readers.
We are more effective communicators when we are informed and open-minded. Having those qualities can reshape our mindset toward machine learning, helping us as an industry move away from the fear of being replaced by machine learning. Perhaps in time, we might come to see these “fancy prediction systems”6 as incredibly efficient but inherently flawed complements to our understanding of the world. How fortunate we are that language soars beyond semantics, beyond tokens and characters that can be codified into artificial systems, connecting us even when we misunderstand the difference between “old chicken” and “historical sights.”
Katherine Tse is currently an MA candidate in Applied Translation Studies at the University of Leeds, specializing in Chinese-to-English translation. Prior to her studies, she worked as a freelance translator based in the Greater China region.