[GALA Connected 2021] Extending NLP Support for Less-Resourced Language with the use of Word Vectors

25 Mar 2021

This event has ended. Scroll down to see the recording.

Natural language processing tools are now omnipresent in the localization industry. Automatic creation and querying of electronic dictionaries, enhanced translation memory lookup, automatic post-editing and machine translation are must-haves in today’s world. Leveraging their potential both speeds up translation and improves its quality. However, the availability of these tools for a given language relies heavily on the availability of electronic language resources in that language. The most valuable resources include monolingual and bilingual dictionaries and parallel or monolingual text corpora. A lack of these resources is a significant impediment to the development of natural language processing for some languages, which are often referred to as less-resourced languages.

Importantly, the status of a less-resourced language does not necessarily correlate with the number of people speaking the language and, in consequence, with the market demand for translation services involving it. In our scenario, we focused on several official languages of India: Assamese, Bengali, Gujarati, Hindi, Kannada, Malayalam, Marathi, Nepali, Panjabi and Sindhi. We aimed to provide bilingual word-level alignment, a fundamental functionality that can serve as a basis for developing several other natural language processing mechanisms.
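
To make the idea of word-level alignment with word vectors more concrete, here is a minimal sketch in Python. It is not the method presented in the talk, whose details this abstract does not give. It assumes that word vectors for both languages already live in a shared cross-lingual space, and it links each source word to the target word with the highest cosine similarity. The function names, the tiny hand-made vectors and the romanized Hindi tokens are all hypothetical and serve only as an illustration.

```python
import math


def cosine(u, v):
    """Cosine similarity of two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0
    return dot / (norm_u * norm_v)


def align_words(source_tokens, target_tokens, source_vectors, target_vectors):
    """Greedy alignment: link each source token to its most similar target token.

    Returns a list of (source_index, target_index) pairs. Tokens without a
    vector are skipped, which is a simplification made for this sketch.
    """
    alignment = []
    for i, src in enumerate(source_tokens):
        if src not in source_vectors:
            continue
        best_j, best_score = None, -1.0
        for j, tgt in enumerate(target_tokens):
            if tgt not in target_vectors:
                continue
            score = cosine(source_vectors[src], target_vectors[tgt])
            if score > best_score:
                best_j, best_score = j, score
        if best_j is not None:
            alignment.append((i, best_j))
    return alignment


# Toy, hand-made vectors (assumption: both vocabularies share one vector space).
EN_VECTORS = {"water": [0.9, 0.1, 0.0], "cold": [0.1, 0.9, 0.1]}
HI_VECTORS = {"pani": [0.8, 0.2, 0.1], "thanda": [0.2, 0.8, 0.0]}

pairs = align_words(["cold", "water"], ["thanda", "pani"], EN_VECTORS, HI_VECTORS)
print(pairs)
```

In practice the vectors would come from models trained on monolingual corpora and mapped into a common space, and a production aligner would need to handle unknown words, one-to-many links and word order, none of which this sketch attempts.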

Host organization: XTM International

Event Speakers

Rafał Jaworski
XTM International

Rafał Jaworski, PhD, is an academic lecturer and scientist specializing in natural language processing techniques. His alma mater is Adam Mickiewicz University in Poznań, Poland, where he works at the Department of Artificial Intelligence (AI). Rafał’s scientific work concentrates on developing robust AI algorithms for the needs of computer-assisted translation. These include automatic lookup of linguistic resources and computer-assisted post-editing. Apart from research and teaching, he works as a linguistic AI expert at XTM International, leading a team of young and talented developers who put his visions and ideas into practice.