Embracing Generative AI and Linguist Prompt Engineering
(Image generated by Stable Diffusion)
In the tsunami of new generative AI technologies and services entering the market, AI-based translation is a hot topic among localization professionals. Personally, I embrace this new technology. I have always considered it something that was bound to arrive sooner or later.
After six months of rigorous testing, I've concluded that generative AI can be a reliable collaborator for translation. It still has shortcomings, though these may shrink as the models improve. Even so, it outperforms traditional machine translation by understanding the context of a text or a given prompt, which significantly improves translation quality.
However, the quality of the translation depends heavily on the linguist prompt engineer, a new form of human translator.
In this post, I will share insights on how to improve translation quality through human-machine collaboration with generative AI (especially chat-enabled AI) and thoughts on whether human translators will remain an essential part of the localization industry.
How can we make AI a better translator?
1. Source processing to meet the eyes of AI
The quality of the source text greatly impacts AI translation. Preprocessing the source text is crucial, especially for languages that often omit subjects or objects, such as Korean and Japanese. Also, clarifying ambiguous words can prevent misinterpretations and improve the translation outcome.
Image 1 is an example of a hallucinated mistranslation caused by the lack of source text preprocessing. In Image 2, the AI produced a better translation once it was given fuller details on the subject and object of the sentence and definitions of newly coined words.
2. Keep it simple
Write clear, concise prompts for the AI, including its role, tone, format, and context. Using bullet points can help keep the AI from getting confused and prevent possible hallucinations. Moreover, it helps save tokens.
The prompt can be as detailed as you like. For instance, you may specify settings such as tone, style, temperature, and more. However, providing just the essential information, such as the source and target languages, is effective in itself and minimizes oversights and hallucinations. Surprisingly, the ChatGPT GUI powered by GPT-4 has demonstrated the ability to grasp tone and style when given document-type information.
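As a rough illustration of the bullet-point style described above, here is a minimal Python sketch that assembles a prompt from the fields mentioned in this section. The function name, field labels, and sample values are my own assumptions for illustration, not a required format for any particular AI service.

```python
def build_translation_prompt(source_lang, target_lang, role=None, tone=None,
                             output_format=None, context=None):
    """Assemble a bullet-point translation prompt from the fields that are set.

    Hypothetical helper: the labels below are illustrative only.
    """
    fields = [
        ("Role", role),
        ("Source language", source_lang),
        ("Target language", target_lang),
        ("Tone", tone),
        ("Format", output_format),
        ("Context", context),
    ]
    # Skip empty fields so the prompt stays short and uses fewer tokens.
    lines = [f"- {label}: {value}" for label, value in fields if value]
    return "\n".join(lines)


prompt = build_translation_prompt(
    source_lang="Korean",
    target_lang="English",
    role="Professional translator",
    context="Press release",
)
print(prompt)
```

Only the source and target languages are mandatory here, which mirrors the point that the essentials alone already go a long way.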
3. Provide categorized and step-by-step instructions
Provide the AI with information in a categorized and step-by-step manner. Start with basic information and then provide additional directions or references in subsequent prompts. This approach can lead to a more stable output from the AI.
Example of a step-by-step instruction for chat-enabled AI.
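The same ordering can be sketched in code. The snippet below is a minimal example, assuming you keep your prompts as plain strings and send them to the chat one at a time. The function name and the category labels are hypothetical and only serve to show the order: basic information first, then directions, then references.

```python
def plan_prompts(basic_info, directions=(), references=()):
    """Return prompts in the order they should be sent to the chat.

    Hypothetical helper: basic information comes first, followed by
    additional directions and then references, each labeled by category.
    """
    steps = [("Basic information", basic_info)]
    steps += [("Direction", text) for text in directions]
    steps += [("Reference", text) for text in references]
    return [f"[{category}] {text}" for category, text in steps]


prompts = plan_prompts(
    "Translate the following press release from Korean to English.",
    directions=["Keep product names in their original form."],
    references=["Glossary: use the client's approved terminology."],
)
for number, text in enumerate(prompts, start=1):
    print(f"Prompt {number}: {text}")
```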
4. Don't micromanage
Avoid intervening during the AI's first draft of the translation. Too much direction at this stage can confuse the AI and lead to an outcome you did not intend. As a result, you will have to do more post-editing. Allow the AI to complete its draft before making any adjustments, keeping in mind the token limitation.
Previously, I had given specific instructions on the use of upper and lower case for a specific portion of the text. Then, when I entered the prompt above (Image 1), the AI simply overlooked it and executed the previous instruction instead (Image 2).
5. AI as a learner
Providing good references is important for improving translation quality, particularly in domains involving wordplay and puns. AI can learn from well-chosen references and even localize puns or wordplay effectively by figuring out the underlying logic or patterns.
After letting the AI do its job without intervention, we can come back to give specific feedback. As the interactions accumulate and it gains more information from the human translator, it can produce results that align with our intention.
6. Be careful with numbers
AI struggles with numbers. Pay close attention when dealing with financial documents or content where numbers are crucial to quality.
The English equivalent of “1조500억원” is “1.05 trillion won.”
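One way to double-check a figure like this during review is to compute it yourself. The following is a minimal sketch, assuming amounts are written with Arabic digits followed by the Korean units 조 (10^12), 억 (10^8), and 만 (10^4). The function name is hypothetical, and larger units or fully spelled-out numerals are not handled.

```python
import re

# Korean large-number units that follow Arabic digits (each a power of 10,000).
UNITS = {"조": 10**12, "억": 10**8, "만": 10**4}


def parse_korean_amount(text):
    """Convert a string such as '1조500억원' to an integer amount."""
    # Remove thousands separators and spaces; other characters such as
    # the currency marker are simply ignored by the pattern below.
    cleaned = text.replace(",", "").replace(" ", "")
    total = 0
    for digits, unit in re.findall(r"(\d+)(조|억|만)?", cleaned):
        # Digits without a unit count as ones.
        total += int(digits) * UNITS.get(unit, 1)
    return total


amount = parse_korean_amount("1조500억원")
print(amount)           # 1050000000000
print(amount / 10**12)  # 1.05
```

Comparing the computed value with the number in the AI's output makes unit errors easy to spot before they reach the client.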
7. Post-editing is crucial
Post-editing is necessary for high-quality translations. One shortcoming of AI is that it doesn’t adequately adapt and personalize the source text, and it may make mistakes that are unacceptable in a formal context. Review and revise translations to produce a final product that meets your quality standards.
Post-editing by a human translator ensures that the translation is more personalized to meet the client’s requirements and preferences.
The future of human translators
Embracing AI and harnessing its power will be key to competitiveness in the localization industry. However, the human touch is an essential component that will never be replaced due to the nature of human language.
Human language is essentially like an organism that evolves as humans interact with new environments. Unless AI can interact with the environment in real time and learn from it through powerful inference capabilities, humans are the only ones who can keep up with the instantaneous, enormous inflow of interactions and the new words and expressions they generate. In other words, to stay competitive in their profession, human translators must be able to understand and embrace the growing variety and depth of human interactions, both in the expanding communities on the Internet and in the physical sphere of our lives, which keeps changing through technological advancement.
With the power of human-machine collaboration, linguist prompt engineers, standing at the very doorstep of the age of AI, can represent an evolution of the human translator and a demonstration that human beings are irreplaceable in dealing with human language.