DNNs for Kannada-English Machine Translation

1-StopAsia - DNNs for Kannada-English Machine Translation: A Breakthrough

India is commonly associated with linguistic diversity. In fact, apart from Sanskrit, the Indian constitution recognizes 21 modern Indian languages. Among these are Gujarati, Hindi, Kashmiri, Malayalam, Nepali, Punjabi, Tamil, Telugu, and Urdu. One of these 21 languages is the lesser-known Kannada.

Spoken natively by around 47 million people, mainly in Karnataka in southwestern India, Kannada is the second-oldest of the four major Dravidian languages. It also has an extensive literary tradition, with the oldest known Kannada inscription dating back to 450 CE. Kannada is the official language of the state of Karnataka and was previously also known as Canarese.

Little research has been done on English-to-Kannada or Kannada-to-English translation. In 2021, however, five researchers set out to measure how accurately a Deep Neural Network (DNN) can handle this kind of machine translation. Their paper is titled “Kannada to English Machine Translation Using Deep Neural Network”. The results are quite impressive. Let’s take a closer look below.

What is Kannada in the context of machine translation?

The Kannada language has a rich history dating back centuries. However, it is deemed resource-poor “in terms of computational linguistics”. As such, machine translation becomes a difficult task because of the syntactic and semantic variance in its literature. In statistical machine translation (SMT), the more traditional approach to machine translation, most studies involving Kannada have focused on translating from English into the South Dravidian languages (Kannada/Malayalam).

Kannada-to-English machine translation, by contrast, remains a considerably unexplored area. Earlier work generally involved translating simple sentences in a transliterated Kannada corpus using lexicon analysis and a phrase mapper. Recent research, however, has applied neural machine translation (NMT) to translate Kannada to English using the Encoder-Decoder mechanism.
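
To make the Encoder-Decoder idea more concrete, here is a minimal sketch in plain Python. It is not the researchers' model: the tiny vocabularies, the transliterated example sentence, the random (untrained) weights, and the simple tanh recurrent cell are all assumptions chosen for illustration. The encoder folds the Kannada tokens into a single context vector, and the decoder then emits English tokens one at a time until it produces an end-of-sentence marker.

```python
import math
import random

random.seed(0)  # fixed seed so this untrained toy model is reproducible

HIDDEN = 4
# Hypothetical toy vocabularies (assumptions, not taken from the paper).
SRC_VOCAB = ["naanu", "shaalege", "hoguttene"]   # transliterated Kannada
TGT_VOCAB = ["<eos>", "i", "go", "to", "school"]  # index 0 doubles as start symbol

def rand_matrix(rows, cols):
    return [[random.uniform(-1, 1) for _ in range(cols)] for _ in range(rows)]

def matvec(matrix, vector):
    return [sum(w * x for w, x in zip(row, vector)) for row in matrix]

SRC_EMB = rand_matrix(len(SRC_VOCAB), HIDDEN)   # source token embeddings
TGT_EMB = rand_matrix(len(TGT_VOCAB), HIDDEN)   # target token embeddings
W_ENC = rand_matrix(HIDDEN, 2 * HIDDEN)         # encoder recurrence weights
W_DEC = rand_matrix(HIDDEN, 2 * HIDDEN)         # decoder recurrence weights
W_OUT = rand_matrix(len(TGT_VOCAB), HIDDEN)     # hidden state -> token scores

def rnn_step(weights, hidden, embedding):
    # One recurrent step: new_hidden = tanh(W · [hidden; embedding])
    return [math.tanh(z) for z in matvec(weights, hidden + embedding)]

def encode(src_tokens):
    # Read the source sentence token by token into one context vector.
    hidden = [0.0] * HIDDEN
    for token in src_tokens:
        hidden = rnn_step(W_ENC, hidden, SRC_EMB[SRC_VOCAB.index(token)])
    return hidden

def decode(context, max_len=6):
    # Greedy decoding: always pick the highest-scoring next token.
    hidden, prev, output = context, 0, []
    for _ in range(max_len):
        hidden = rnn_step(W_DEC, hidden, TGT_EMB[prev])
        scores = matvec(W_OUT, hidden)
        prev = max(range(len(scores)), key=scores.__getitem__)
        if TGT_VOCAB[prev] == "<eos>":
            break
        output.append(TGT_VOCAB[prev])
    return output

sentence = ["naanu", "shaalege", "hoguttene"]
translation = decode(encode(sentence))
print(translation)
```

Because the weights are random, the printed "translation" is meaningless. Training on a parallel Kannada-English corpus is what would turn this same structure into a working translator.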

What is a Deep Neural Network (DNN)?

A Deep Neural Network or DNN is considered to be a “hierarchical organization of hidden networks (layers) that connect input and output”. DNNs generally have at least two hidden layers between the input and the output, which is what makes them “deep” and allows them to model complex relationships.

They are used in artificial intelligence, mathematical modeling, statistics, deep learning, machine learning, and even in linguistics in terms of translation. 

In the context of this study, the DNN learned the mathematical manipulation needed to transform an input into the desired output. In this case, the input was Kannada text and the output was its English translation.
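
As a rough illustration of what “transforming the input into an output” means, the sketch below pushes two numbers through two hidden layers and one output layer. The layer sizes, the hand-picked weights, and the ReLU activation are assumptions made for this example; a real translation model would learn its weights from data and would be far larger.

```python
def dense(inputs, weights, biases, activation):
    """One fully connected layer: activation(W · x + b)."""
    return [
        activation(sum(w * x for w, x in zip(row, inputs)) + b)
        for row, b in zip(weights, biases)
    ]

def relu(z):
    return max(0.0, z)

def identity(z):
    return z

# Hand-picked illustrative weights (hypothetical, not learned).
W1, B1 = [[0.5, -0.2], [0.1, 0.4], [-0.3, 0.8]], [0.0, 0.1, -0.1]  # 2 -> 3
W2, B2 = [[0.2, 0.7, -0.5], [-0.6, 0.3, 0.9]], [0.05, 0.0]         # 3 -> 2
W3, B3 = [[1.0, -1.0]], [0.0]                                      # 2 -> 1

def forward(x):
    h1 = dense(x, W1, B1, relu)         # first hidden layer
    h2 = dense(h1, W2, B2, relu)        # second hidden layer
    return dense(h2, W3, B3, identity)  # output layer

y = forward([1.0, 2.0])
print(y)
```

Each layer is just a weighted sum followed by a simple non-linear function. Stacking several of them is what lets the network represent more complicated input-to-output mappings.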

Results of the research

By applying a DNN to Kannada-to-English machine translation, the researchers produced results that are considered impressive and advanced for a field in which research remains limited.

Some of the results noted as part of this research study include:

  • Translation time for the model was between two and five seconds, depending on the length of the input sentence;
  • The validation loss obtained was 0.849;
  • Validation accuracy was approximately 74.84% after the first epoch and rose to 86.32% as the number of epochs increased;
  • The Bilingual Evaluation Understudy (BLEU) score is a metric that compares a predicted sentence with a target sentence, with 1 depicting a perfect match and 0 a complete mismatch. The results were impressive in this regard, too.
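
For readers curious about how a BLEU score is calculated, here is a simplified sketch. It assumes a single reference sentence, whitespace tokenization, n-grams up to length four, and no smoothing; the function names and example sentences are made up for illustration, and the paper's exact BLEU configuration is not described in this article.

```python
import math
from collections import Counter

def ngrams(tokens, n):
    # Count every run of n consecutive tokens.
    return Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))

def bleu(candidate, reference, max_n=4):
    """Sentence-level BLEU against one reference, without smoothing."""
    if not candidate:
        return 0.0
    log_precisions = []
    for n in range(1, max_n + 1):
        cand, ref = ngrams(candidate, n), ngrams(reference, n)
        # Clipped overlap: a candidate n-gram is credited at most as
        # many times as it occurs in the reference.
        overlap = sum(min(count, ref[gram]) for gram, count in cand.items())
        total = sum(cand.values())
        if overlap == 0 or total == 0:
            return 0.0  # one zero precision makes the geometric mean zero
        log_precisions.append(math.log(overlap / total))
    # Brevity penalty: punish candidates shorter than the reference.
    if len(candidate) > len(reference):
        penalty = 1.0
    else:
        penalty = math.exp(1 - len(reference) / len(candidate))
    return penalty * math.exp(sum(log_precisions) / max_n)

reference = "i go to school every day".split()
perfect = bleu(reference, reference)
partial = bleu("i go to school today".split(), reference)
mismatch = bleu("the cat sat down".split(), reference)
print(perfect, partial, mismatch)
```

A prediction identical to the target scores 1, a prediction sharing no words with it scores 0, and a partly correct prediction lands somewhere in between.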

The future scope of English to Kannada machine translation: could it be applied to other languages?

The results of the study are quite significant for linguists, translators, localization experts, academics, businesses, and many others who work within the ecosystem of the Kannada language. The Kannada script differs drastically from the English alphabet, and differences in sentence structure, lexicon, and other linguistic nuances pose significant challenges to both humans and machines when translating between English and Kannada. Even so, with a validation accuracy of 86.32%, the results are outstanding and suggest that the researchers have achieved what few have managed before them.

This breakthrough could possibly be applied to English-to-Kannada machine translation in the future as well. Although more research in the field is proposed, this is a good sign that two languages with completely different roots can be captured by mathematical modeling and still yield a highly accurate result. Because the output is not perfect, the human touch of a translator will still be required for the finishing touches. But receiving highly accurate output in mere seconds could save a great deal of time, effort, and resources, which is an impressive feat indeed.

 

 


Desi Tzoneva

Desi Tzoneva has a Bachelor of Laws degree from the University of South Africa and a Master's in International Relations from the University of Johannesburg. For the past five years, she's been a content writer and enjoys unraveling the intricacies of the translation and localization industry. She loves traveling and has visited many countries in Asia, Europe, Africa, and the Middle East. In her spare time, she enjoys reading. She will also never say no to sushi.