Script · Digitised

Upload a scan.
Get an editable Word file.
In your language.

In minutes.

DocLipi uses AI to transcribe handwritten and printed documents in Gurmukhi, Devanagari, and English into searchable, exportable text. Built for lawyers, archivists, and government offices.

No credit card required
Gurmukhi · Devanagari · English
Export to Word & PDF
← Documents · Revenue records 1921 · Gurmukhi · Pg 2 of 5 · Verified
Original scan
Transcription · Gemini 2.0 Flash
94% confidence
Gurmukhi detected
AI transcription complete · ready to verify
₹10,800 Cr
India OCR market by 2033
CAGR 11.8% · 2024–2033
eCourts III
Supreme Court DPR mandates OCR + AI for case file digitisation
Govt of India
DILRMP
Land records modernisation driving bulk regional-language scanning
Ministry of Rural Development
IndiaAI IDP
MeitY challenge seeking Indic multilingual document processing
MeitY · Nov 2025
How it works
From paper to editable Word — four steps

Every page goes through pre-processing, multi-model AI transcription, human verification, and clean export. The review step is built for legal and government auditability.

01 — Upload
Upload your scans

PDF, JPG, PNG, TIFF. Up to 500 pages per batch. Auto-rotate, deskew, and denoise happen instantly — no preparation needed.

02 — AI reads it
Multi-model transcription

Gemini 2.0 Flash is the primary engine. Low-confidence lines are automatically routed to a fallback model. Per-line confidence scores flag what needs review.
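The per-line confidence check can be pictured roughly as follows. This is a minimal sketch, not DocLipi's actual code: the 0.85 threshold, the `Line` type, and the `split_by_confidence` helper are all assumptions made for illustration.

```python
from dataclasses import dataclass

# Hypothetical cut-off: the real threshold is not published.
REVIEW_THRESHOLD = 0.85


@dataclass
class Line:
    text: str
    confidence: float  # 0.0 to 1.0, as reported by the primary engine


def split_by_confidence(lines, threshold=REVIEW_THRESHOLD):
    """Return (accepted, needs_fallback), each in original line order."""
    accepted = [ln for ln in lines if ln.confidence >= threshold]
    needs_fallback = [ln for ln in lines if ln.confidence < threshold]
    return accepted, needs_fallback


# Illustrative page: the middle line scores below the threshold.
page = [
    Line("ਖਸਰਾ ਨੰਬਰ 42", 0.97),
    Line("ਮਾਲਕ ਦਾ ਨਾਮ", 0.61),
    Line("ਰਕਬਾ 3 ਕਨਾਲ", 0.90),
]
accepted, needs_fallback = split_by_confidence(page)
```

In this sketch, lines in `needs_fallback` would go to the fallback model and be flagged for the reviewer. Everything else passes straight through to verification.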

03 — Verify
Side-by-side review

Original scan on the left, extracted text on the right — fully editable. Correct errors inline, mark pages verified. Every change is logged for audit.
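One plausible shape for the change log is an append-only list of edit records. The sketch below is illustrative only: the field names, the `AuditLog` class, and the example editor address are assumptions, not the product's schema.

```python
from dataclasses import dataclass, field
from datetime import datetime, timezone


@dataclass(frozen=True)
class EditEntry:
    page: int
    line: int
    before: str
    after: str
    editor: str
    at: str  # ISO 8601 timestamp in UTC


@dataclass
class AuditLog:
    entries: list = field(default_factory=list)

    def record(self, page, line, before, after, editor):
        """Append one immutable entry describing a single correction."""
        entry = EditEntry(page, line, before, after, editor,
                          datetime.now(timezone.utc).isoformat())
        self.entries.append(entry)
        return entry

    def for_page(self, page):
        """Return all corrections made on one page, oldest first."""
        return [e for e in self.entries if e.page == page]


log = AuditLog()
log.record(page=2, line=7, before="1912", after="1921",
           editor="reviewer@example.com")
```

Entries are frozen, so a correction is recorded by appending a new entry and never by rewriting an old one. That is the property an auditor would typically look for.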

04 — Export
Download as Word

Proper Unicode, correct Noto font embedding for all scripts, A4 portrait, page-numbered. Optionally include romanisation and English translation layers.

AI engine
Multi-model routing — the best engine for each page

No single OCR model wins across all Indic scripts and document types. DocLipi routes each page to the right engine based on script, print vs. handwriting, and per-line confidence — then falls back intelligently.

Handwriting — Gemini 2.0 Flash primary; Claude Sonnet on low-confidence pages
Printed text — Gemini + e-Aksharayan (C-DAC, 90–95% accuracy, 7 scripts)
Mixed documents — per-page script detection, different engines on different pages
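The three routing rules above could be expressed along these lines. Treat it as a rough sketch under stated assumptions: the engine labels are plain strings and not real API identifiers, the `Page` type is hypothetical, and the sketch branches only on print vs. handwriting because per-script rules are not spelled out here.

```python
from dataclasses import dataclass


@dataclass
class Page:
    number: int
    script: str        # detected per page, e.g. "gurmukhi" or "devanagari"
    handwritten: bool


def engines_for(page):
    """Return the ordered engine labels to try for one page."""
    if page.handwritten:
        # Primary engine first, then the low-confidence fallback.
        return ["gemini-2.0-flash", "claude-sonnet"]
    # Printed text: Gemini alongside e-Aksharayan.
    return ["gemini-2.0-flash", "e-aksharayan"]


def route_document(pages):
    """Build a per-page plan, so mixed documents get different engines."""
    return {
        p.number: {"script": p.script, "engines": engines_for(p)}
        for p in pages
    }


plan = route_document([
    Page(1, "gurmukhi", handwritten=True),
    Page(2, "devanagari", handwritten=False),
])
```

Here `plan[1]` lists the handwriting engines and `plan[2]` the printed-text engines, which is the per-page behaviour described for mixed documents.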
Scanned document
PDF · JPG · PNG · TIFF
↓ pre-process + script detect
Gemini 2.0 Flash
Primary · handwriting
e-Aksharayan
Printed · C-DAC
Claude Sonnet
Fallback · low confidence
Google Vision
Selective · Hindi/Marathi
↓ confidence scoring + review editor
Verified .docx
Unicode · Noto fonts · A4
Who uses DocLipi
Built for those who work with records

Starting with Punjab & Haryana's legal ecosystem — where Gurmukhi is the dominant script and the backlog of undigitised records is enormous.

Beachhead
District court advocates

Punjab & Haryana bar associations. Convert handwritten case files, property deeds, and historical pleadings into verified, citable Word documents in hours — not weeks.

Government
State archives & land records

Revenue registers, census files, administrative records. Make decades of paper searchable and shareable across departments — aligned with DILRMP requirements.

Legal
High court chambers

Digitise case backlogs at scale. The side-by-side review editor and per-page audit log meet the accountability standards required for court submissions.

Archives
Universities & museums

Build searchable collections from fragile manuscripts. Romanisation and translation outputs make regional-language holdings accessible to a global audience.

Publishing
Academic & literary publishers

Bring manuscripts and correspondence back into print. Source script preserved; optional romanised and English translation layers for wider readership.

Data entry
Digitisation BPOs

Automate the bulk of handwriting recognition. Human operators focus only on flagged low-confidence lines — dramatically improving throughput per operator.

Supported scripts
Where we are, and where we're going
Priority 1 · Available now
ਪੰਜਾਬੀ
Gurmukhi

Punjabi. Beachhead market. Deep accuracy tuning for handwritten records from Punjab & Haryana courts and revenue departments.

Available now
हिन्दी
Devanagari

Hindi, Marathi, Sanskrit. Largest volume of undigitised records nationally. Aligned with eCourts Phase III and DILRMP mandates.

On demand · Roadmap
+
More scripts

Additional Indian scripts added on demand. Contact us to request support for your language.

Pricing
Simple, transparent pricing

Pay as you go, or subscribe monthly. Institutional pricing available for courts, government departments, and bulk digitisation teams.

Need more pages? Top up 100 pages for ₹500 — works with any plan, no expiry.
Free
₹0
forever · 20 pages to start
20 pages — try it today
Gurmukhi & Devanagari
Word & PDF export
Batch upload
Romanisation & translation
Start free
Pro Max
₹5,000
per month · 1,200 pages included
1,200 pages per month
All scripts + priority routing
Romanisation & translation
Batch upload (unlimited)
Top-up at ₹500 / 100 pages
Get started

GST applicable. Annual plans include 2 months free. Need 50+ users or on-premise deployment? Contact us for institutional pricing.
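For a quick estimate of a Pro Max bill, the arithmetic looks like this. It is a sketch using the figures quoted in this section, before GST. It assumes top-ups are bought in whole 100-page packs and that "2 months free" means paying for 10 of 12 months; the function names are illustrative.

```python
import math

PRO_MAX_MONTHLY_INR = 5000
PRO_MAX_INCLUDED_PAGES = 1200
TOP_UP_PRICE_INR = 500
TOP_UP_PAGES = 100


def monthly_cost_inr(pages_needed):
    """Pre-GST cost for one month, buying whole top-up packs as needed."""
    extra = max(0, pages_needed - PRO_MAX_INCLUDED_PAGES)
    packs = math.ceil(extra / TOP_UP_PAGES)
    return PRO_MAX_MONTHLY_INR + packs * TOP_UP_PRICE_INR


def annual_cost_inr():
    # Assumption: "2 months free" means 10 paid months out of 12.
    return PRO_MAX_MONTHLY_INR * 10


cost_for_1450_pages = monthly_cost_inr(1450)  # 250 extra pages, so 3 packs
```

Because top-up pages do not expire, unused pages from a pack carry into later months. The sketch ignores that carry-over for simplicity.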

ਪੁਰਾਣੀਆਂ ਲਿਖਤਾਂ, ਨਵਾਂ ਰੂਪ (Old writings, a new form) · Every page, perfectly transcribed. · India's scripts, now searchable.