Hyderabad: In a step toward legislative transparency and digital accessibility, the Punjab Vidhan Sabha has made its entire legislative history — dating back to 1947 — available online. The project, developed in collaboration with the International Institute of Information Technology, Hyderabad (IIIT Hyderabad), marks a significant leap in making governance more open, inclusive, and tech-driven.
The project was executed under the guidance of Prof. Gurpreet Lehal, Consultant at Punjabi University, Patiala, and Prof. C.V. Jawahar from IIIT Hyderabad, in association with C-DAC, Noida. It is a key initiative of the National Language Translation Mission, Bhashini.
Prof. Lehal explained that one of the biggest challenges was creating an Optical Character Recognition (OCR) system capable of recognising three different scripts — English, Hindi (Devanagari), and Punjabi (Gurmukhi) — often found within the same document. The OCR was developed to accurately convert these scripts into machine-readable text, paving the way for efficient search functionality across the database.
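Once OCR output is in Unicode, the script of each character can be read off its code-point block, which is one way mixed-script pages can be tagged for search. The sketch below is illustrative only and is not the project's OCR code; the function names `script_of` and `script_counts` are hypothetical. It relies on the standard Unicode blocks for Devanagari (U+0900 to U+097F) and Gurmukhi (U+0A00 to U+0A7F).

```python
from collections import Counter

# Standard Unicode blocks for the two Indic scripts named in the article.
DEVANAGARI = range(0x0900, 0x0980)  # Hindi
GURMUKHI = range(0x0A00, 0x0A80)    # Punjabi


def script_of(ch: str) -> str:
    """Classify a single character by its Unicode block (hypothetical helper)."""
    cp = ord(ch)
    if cp in DEVANAGARI:
        return "Devanagari"
    if cp in GURMUKHI:
        return "Gurmukhi"
    if ch.isascii() and ch.isalpha():
        return "Latin"
    return "Other"  # digits, punctuation, anything else


def script_counts(text: str) -> Counter:
    """Count characters per script in a line of text, ignoring whitespace."""
    return Counter(script_of(ch) for ch in text if not ch.isspace())


if __name__ == "__main__":
    # A line mixing all three scripts, as the debates often do.
    print(script_counts("Punjab पंजाब ਪੰਜਾਬ"))
```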
Smarter and Inclusive Search Features
The search engine developed as part of this initiative can handle variations in spelling and pronunciation. It auto-corrects minor spelling errors, so users do not need to know the exact spelling of a term. For example, a search for ‘Punjabi Suba’, a key historical movement, will pull up all relevant references in English, Hindi, and Punjabi across two lakh (200,000) pages of debates.
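The article does not describe how the search engine is built. As a rough illustration of spelling-tolerant, cross-script lookup, the sketch below uses Python's standard `difflib` module to match a misspelled query to a known topic and then returns that topic's spellings in all three scripts. The `TOPIC_VARIANTS` table, the `expand_query` function, and the 0.75 cutoff are assumptions made for this example.

```python
import difflib

# Hypothetical lookup table: a normalised topic key mapped to the
# spellings that might appear in English, Hindi, and Punjabi text.
TOPIC_VARIANTS = {
    "punjabi suba": ["Punjabi Suba", "पंजाबी सूबा", "ਪੰਜਾਬੀ ਸੂਬਾ"],
}


def expand_query(query: str, cutoff: float = 0.75) -> list[str]:
    """Map a possibly misspelled query to every script variant of the closest topic.

    Returns an empty list when no known topic is similar enough.
    """
    key = query.strip().lower()
    matches = difflib.get_close_matches(key, list(TOPIC_VARIANTS), n=1, cutoff=cutoff)
    return TOPIC_VARIANTS[matches[0]] if matches else []


if __name__ == "__main__":
    # A misspelled query still resolves to the topic and its three spellings.
    for term in expand_query("Punjabi Subha"):
        print(term)
```

Each returned variant could then be passed to an ordinary full-text search, so a single query reaches pages in any of the three languages.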
Additionally, the search engine allows users to explore topics debated by any specific MLA, view their participation frequency, and access other insightful legislative data.
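Per-member statistics of this kind can be derived from the extracted text once each speech is tagged with its speaker. The sketch below is a minimal illustration under that assumption; the record layout, the placeholder member names, and the functions `participation` and `topics_by_speaker` are hypothetical and are not taken from the project.

```python
from collections import Counter, defaultdict

# Hypothetical records: one (speaker, topic) pair per speech in the archive.
RECORDS = [
    ("Member A", "Irrigation"),
    ("Member B", "Education"),
    ("Member A", "Education"),
    ("Member A", "Irrigation"),
]


def participation(records) -> Counter:
    """Count how many speeches each member made."""
    return Counter(speaker for speaker, _topic in records)


def topics_by_speaker(records) -> dict:
    """Collect the set of distinct topics each member spoke on."""
    topics = defaultdict(set)
    for speaker, topic in records:
        topics[speaker].add(topic)
    return dict(topics)


if __name__ == "__main__":
    print(participation(RECORDS).most_common())
    print(topics_by_speaker(RECORDS))
```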
Audiobooks for Visually Impaired Users
Ensuring accessibility for all, the team has also converted the legislative archives into audiobooks for the visually impaired. This has been made possible using Bhashini’s Text-to-Speech (TTS) technology, which transforms extracted Unicode text into speech. The audio files are available in reader-compatible formats such as MP3 and DAISY, and can either be played directly on the platform or downloaded for offline use.
Krishna Tulsyan, a researcher from IIIT Hyderabad working on the project, highlighted that this initiative significantly enhances digital accessibility, ensuring that legislative data reaches diverse user groups.
Breaking the Language Barrier
The next phase of the project focuses on bridging the language divide. Using Bhashini’s machine translation system, debates held in Punjabi will soon be available in other Indian languages, such as Marathi, Tamil, or Bengali, so that speakers of those languages can read the legislative discussions in their own language.
According to Prof. Lehal, converting text to Unicode has created a foundation for services like search, translation, and speech conversion. He noted that integrating this data with machine translation systems will make multilingual access seamless.