Hyderabad: Marking 25 years of innovation and research, the Language Technologies Research Centre (LTRC) at the International Institute of Information Technology Hyderabad (IIITH) unveiled the ‘BhashaVerse Model’, an unprecedented multi-language translation tool. This encoder-decoder model translates between 36 Indian languages, including low-resource languages such as Tulu, Bodo, and Santhali. It also supports tasks such as error identification, machine translation evaluation, and automatic post-editing, leveraging 10 billion curated sentence pairs.
-
Expanding Capabilities with BhashaVerse LLM
The Centre also announced the ‘BhashaVerse LLM decoder model’, which can handle tasks such as summarization and question-answering in Indian languages with some fine-tuning. Additionally, LTRC introduced the synthetically generated and curated Bhashik datasets, at a scale of 10 billion, for Indian language to Indian language pairs. They comprise a generic dataset; one in the Education domain that works across 17 different fields in English and 5 Indian languages; and one for the Health domain in English and 8 Indian languages. This is the first time an automatic post-editing and evaluation dataset has been made available for Indian languages.
-
A Legacy of Innovation
Established in 1999, LTRC was India’s first centre dedicated to natural language processing, focusing on enabling machines to understand Indian languages. Over the years, it has expanded its research to include speech recognition, dialogue systems, sentiment analysis, and more, becoming South Asia’s largest academic hub for language technology.
-
Transforming Research and Industry
Prof. Vasudeva Varma, Head of LTRC, highlighted its contributions, stating, “As the first natural language processing centre in the country we have pioneered several aspects of research and education. We have trained brilliant minds who are leading advances worldwide. Our lasting contribution to research, open datasets, tools and technologies have made a huge impact. Our successful technology transfers have brought industry and academia closer. We look forward to continuing to push the boundaries and our legacy of innovation.”
With the launch of BhashaVerse, LTRC reaffirms its commitment to bridging the gap between Indian languages and technology, fostering advancements in both academia and industry.