New Delhi: The BharatGen team at IIIT Hyderabad, in partnership with IIT Bombay, has launched Patram 7B-Instruct, India’s first vision-language foundation model for documents.
The team was led by Dr Ravi Kiran Sarvadevabhatla, Associate Professor at IIIT Hyderabad, and Dr Ganesh Ramakrishnan, Professor at IIT Bombay.
Alongside Patram, the team also launched DocBodh, a generative AI suite for Indic document intelligence. It will be used across governance, education, law and business.
BharatGen is a government-supported initiative for developing India-centric multimodal large language models (LLMs).
The team developed Patram from scratch for complex document understanding tasks, with funding from the Department of Science and Technology (DST).
Patram 7B-Instruct is a seven-billion-parameter vision-language AI model trained on a large and diverse collection of Indian documents.
It can analyse and understand scanned or photographed documents and respond to natural language instructions.
The model is now freely available as an open-source release on Hugging Face and MeitY IndiaAI Mission’s AIKosh platform.
Patram outperforms several larger international models, such as DeepSeek-VL-2, on key benchmarks including DocVQA and VisualMRC. It also shows strong results on Patram-Bench, a custom benchmark reflecting real-world Indian document scenarios.