Script · Digitised

Upload a scan.
Get an editable Word file.
In your language.

In minutes.

DocLipi uses AI to transcribe handwritten and printed documents in Gurmukhi, Devanagari, and English into searchable, exportable text. Built for lawyers, archivists, and government offices.

Start for free · See how it works

No credit card required

Gurmukhi · Devanagari · English

Export to Word & PDF

Original scan

Transcription · Gemini 2.0 Flash

94% confidence

ਦ Gurmukhi detected

AI transcription complete · ready to verify

₹10,800 Cr

India OCR market by 2033

CAGR 11.8% · 2024–2033

eCourts III

Supreme Court DPR mandates OCR + AI for case file digitisation

Govt of India

DILRMP

Land records modernisation driving bulk regional-language scanning

Ministry of Rural Development

IndiaAI IDP

MeitY challenge seeking Indic multilingual document processing

MeitY · Nov 2025

How it works

From paper to editable Word — four steps

Every page goes through pre-processing, multi-model AI transcription, human verification, and clean export. The review step is built for legal and government auditability.

01 — Upload

↑

Upload your scans

PDF, JPG, PNG, TIFF. Up to 500 pages per batch. Auto-rotate, deskew, and denoise run automatically on upload, so no preparation is needed.
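As a rough illustration of the upload limits described above, a batch check might look like the following. This is a minimal sketch, not DocLipi's actual code: the function name, the input shape (a mapping of file name to page count), and the `.jpeg` and `.tif` spellings are assumptions made for the example.

```python
from pathlib import Path

# Formats and page limit come from the text above.
# The ".jpeg" and ".tif" variants are an assumption of this sketch.
ALLOWED_EXTENSIONS = {".pdf", ".jpg", ".jpeg", ".png", ".tif", ".tiff"}
MAX_PAGES_PER_BATCH = 500


def validate_batch(files: dict[str, int]) -> list[str]:
    """Return a list of problems for a batch given as {file name: page count}.

    An empty list means the batch is acceptable. Hypothetical helper.
    """
    problems = []
    for name in files:
        # Compare extensions case-insensitively, so "SCAN.TIFF" is accepted.
        if Path(name).suffix.lower() not in ALLOWED_EXTENSIONS:
            problems.append(f"unsupported file type: {name}")
    total_pages = sum(files.values())
    if total_pages > MAX_PAGES_PER_BATCH:
        problems.append(
            f"batch has {total_pages} pages; limit is {MAX_PAGES_PER_BATCH}"
        )
    return problems
```

Returning every problem at once, rather than failing on the first, lets an uploader fix a whole batch in one pass.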

02 — AI reads it

ਦ

Multi-model transcription

Gemini 2.0 Flash is the primary engine. Low-confidence lines are automatically routed to a fallback model. Per-line confidence scores flag what needs review.
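The fallback step above can be sketched in a few lines. This is an illustrative sketch only: the 0.85 threshold, the `Line` type, and the idea that each engine is a callable returning `(text, confidence)` are assumptions, since the page states neither a cut-off nor an interface.

```python
from dataclasses import dataclass
from typing import Callable

# Assumed cut-off; the page does not state the real threshold.
REVIEW_THRESHOLD = 0.85

# An engine is modelled as a callable: line image -> (text, confidence 0..1).
Engine = Callable[[object], tuple[str, float]]


@dataclass
class Line:
    text: str
    confidence: float
    engine: str  # "primary" or "fallback"


def transcribe_with_fallback(
    line_images: list,
    primary: Engine,
    fallback: Engine,
    threshold: float = REVIEW_THRESHOLD,
) -> list[Line]:
    """Run the primary engine on every line; retry weak lines on the fallback."""
    results = []
    for image in line_images:
        text, confidence = primary(image)
        best = Line(text, confidence, "primary")
        if confidence < threshold:
            fb_text, fb_confidence = fallback(image)
            # Keep the fallback result only if it is actually more confident.
            if fb_confidence > confidence:
                best = Line(fb_text, fb_confidence, "fallback")
        results.append(best)
    return results


def needs_review(lines: list[Line], threshold: float = REVIEW_THRESHOLD) -> list[int]:
    """Indexes of lines that are still below the threshold after fallback."""
    return [i for i, line in enumerate(lines) if line.confidence < threshold]
```

In this sketch a line can still be flagged after the fallback has run, which is what hands it to the human review step.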

03 — Verify

⊞

Side-by-side review

Original scan on the left, extracted text on the right — fully editable. Correct errors inline, mark pages verified. Every change is logged for audit.
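One way to picture the audit trail described above is an append-only list of edit events per page. The sketch below is hypothetical: the class names, the fields, and the rule that an edit clears the verified flag are assumptions for illustration, not the product's documented behaviour.

```python
from dataclasses import dataclass, field
from datetime import datetime, timezone


@dataclass(frozen=True)
class EditEvent:
    """One immutable audit record: who changed which line, and how."""
    page: int
    line: int
    before: str
    after: str
    editor: str
    at: datetime


@dataclass
class PageReview:
    page: int
    lines: list[str]
    verified: bool = False
    log: list[EditEvent] = field(default_factory=list)

    def correct(self, line: int, new_text: str, editor: str) -> None:
        """Apply an inline correction and append it to the audit log."""
        old_text = self.lines[line]
        if new_text == old_text:
            return  # nothing changed, so nothing to log
        self.lines[line] = new_text
        # Assumption of this sketch: editing a verified page un-verifies it.
        self.verified = False
        self.log.append(
            EditEvent(self.page, line, old_text, new_text, editor,
                      datetime.now(timezone.utc))
        )

    def mark_verified(self) -> None:
        self.verified = True
```

Storing the text before and after each change means the original machine transcription can always be reconstructed from the log.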

04 — Export

⎙

Download as Word

Proper Unicode, correct Noto font embedding for all scripts, A4 portrait, page-numbered. Optionally include romanisation and English translation layers.

AI engine

Multi-model routing — the best engine for each page

No single OCR model wins across all Indic scripts and document types. DocLipi routes each page to the right engine based on script, print vs. handwriting, and per-line confidence — then falls back intelligently.

✓ Handwriting: Gemini 2.0 Flash primary; Claude Sonnet on low-confidence pages

✓ Printed text: Gemini + e-Aksharayan (C-DAC, 90–95% accuracy, 7 scripts)

✓ Mixed documents: per-page script detection, different engines on different pages
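The routing rules above could be sketched roughly as follows. This is a simplified illustration under stated assumptions: it detects the script from text (for example a rough first-pass transcription) by counting characters in the Devanagari and Gurmukhi Unicode blocks, whereas detecting the script directly from a scanned image would need a vision model and is not shown. The function names and engine labels are hypothetical.

```python
from collections import Counter

# Unicode block ranges for the two Indic scripts named on this page.
SCRIPT_RANGES = {
    "Devanagari": (0x0900, 0x097F),
    "Gurmukhi": (0x0A00, 0x0A7F),
}


def detect_script(text: str) -> str:
    """Return the script with the most characters in `text`.

    ASCII letters are counted as "Latin" (English). Returns "Unknown"
    when no letter from a known script is present.
    """
    counts = Counter()
    for ch in text:
        if ch.isascii():
            if ch.isalpha():
                counts["Latin"] += 1
            continue
        code_point = ord(ch)
        for script, (low, high) in SCRIPT_RANGES.items():
            if low <= code_point <= high:
                counts[script] += 1
                break
    return counts.most_common(1)[0][0] if counts else "Unknown"


def choose_engines(handwritten: bool) -> list[str]:
    """Engines for one page, in the order listed above (labels are illustrative)."""
    if handwritten:
        # Primary first, then the low-confidence fallback.
        return ["gemini-2.0-flash", "claude-sonnet"]
    return ["gemini-2.0-flash", "e-aksharayan"]
```

Because both functions work on one page at a time, a mixed document can get a different script label and engine list for every page.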

Scanned document (PDF · JPG · PNG · TIFF)
↓ pre-process + script detect
Gemini 2.0 Flash (primary · handwriting) | e-Aksharayan (printed · C-DAC) | Claude Sonnet (fallback · low confidence) | Google Vision (selective · Hindi/Marathi)
↓ confidence scoring + review editor
✓ Verified .docx (Unicode · Noto fonts · A4)

Who uses DocLipi

Built for those who work with records

Starting with Punjab & Haryana's legal ecosystem, where Gurmukhi (Punjab) and Devanagari (Haryana) are the official scripts and the backlog of undigitised records is enormous.

Beachhead

District court advocates

Punjab & Haryana bar associations. Convert handwritten case files, property deeds, and historical pleadings into verified, citable Word documents in hours — not weeks.

Government

State archives & land records

Revenue registers, census files, administrative records. Make decades of paper searchable and shareable across departments — aligned with DILRMP requirements.

Legal

High court chambers

Digitise case backlogs at scale. The side-by-side review editor and per-page audit log meet the accountability standards required for court submissions.

Upload a scan.
Get an editable Word file.
In your language.