PDF OCR — Extract Text from Scanned PDF

Use OCR to recognize and extract text from scanned PDF documents in your browser.

Maximum file size: 100 MB.

Recognize text in scanned PDF documents using optical character recognition powered by Tesseract.js. Choose your document language, then export extracted text or generate a searchable PDF with an invisible text layer.

Last reviewed: June 2026

How to use this tool

1. Upload a scanned PDF document.
2. Select the document language from the dropdown.
3. Choose the output format: plain text or searchable PDF.
4. Click Process to start OCR; progress is shown per page.
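
For readers curious how per-page progress can be folded into a single figure, here is a minimal sketch. It assumes the OCR engine reports a fraction between 0 and 1 for the page currently being processed; the function names are hypothetical and are not taken from this tool's code.

```typescript
// Combine "which page" and "how far through that page" into one 0..1 value.
// pageIndex is zero-based; pageFraction is the engine's 0..1 progress for that page.
function overallProgress(pageIndex: number, pageFraction: number, totalPages: number): number {
  if (totalPages <= 0) throw new RangeError("totalPages must be positive");
  // Clamp so a noisy progress report cannot push the bar outside its range.
  const fraction = Math.min(1, Math.max(0, pageFraction));
  return (pageIndex + fraction) / totalPages;
}

// Human-readable label, using a one-based page number for display.
function progressLabel(pageIndex: number, pageFraction: number, totalPages: number): string {
  const percent = Math.round(overallProgress(pageIndex, pageFraction, totalPages) * 100);
  return `Page ${pageIndex + 1} of ${totalPages} (${percent}%)`;
}

console.log(progressLabel(2, 0.5, 10)); // "Page 3 of 10 (25%)"
```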

Common use cases

Make scanned contracts searchable for specific clauses.
Digitize paper archives into text for indexing and search.
Extract text from image-based PDFs that have no text layer.

Technical notes

Uses the Tesseract.js WebAssembly (WASM) engine, which runs entirely in your browser.
Language data is downloaded on first use (roughly 4 to 50 MB, depending on the language) and cached by your browser.
Pages are rendered at 2x scale for better recognition accuracy.
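
To make the 2x rendering note concrete, the sketch below works out the bitmap size and effective resolution for a page. It assumes the common renderer convention that scale 1 maps one PDF point (1/72 inch) to one pixel; the type and function names are illustrative and not part of this tool.

```typescript
// PDF page sizes are measured in points; 1 point = 1/72 inch.
const POINTS_PER_INCH = 72;

interface PageSizePt {
  widthPt: number;
  heightPt: number;
}

interface RenderTarget {
  widthPx: number;
  heightPx: number;
  effectiveDpi: number;
}

// Compute the canvas size needed to rasterize a page at a given scale.
// Assumption: scale 1 renders one point as one pixel, i.e. 72 DPI.
function renderTargetFor(page: PageSizePt, scale: number = 2): RenderTarget {
  if (!(scale > 0)) throw new RangeError("scale must be positive");
  return {
    widthPx: Math.ceil(page.widthPt * scale),
    heightPx: Math.ceil(page.heightPt * scale),
    effectiveDpi: POINTS_PER_INCH * scale,
  };
}

// US Letter is 612 x 792 points (8.5 x 11 inches).
const letter = renderTargetFor({ widthPt: 612, heightPt: 792 });
console.log(letter); // { widthPx: 1224, heightPx: 1584, effectiveDpi: 144 }
```

Under that assumption, a 2x render corresponds to about 144 DPI. Rendering larger costs more memory and time per page, and it cannot add detail that the original scan does not contain.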

Private by design

This tool runs in your browser. Your file is not uploaded to our server while you use the tool.

Limitations

Recognition accuracy for handwritten text is significantly lower than for printed text.
Complex layouts with multiple columns or tables may produce disordered text.
The first-time language pack download requires an internet connection.

Frequently Asked Questions

Which languages are supported?

Over 100 languages are available including English, Chinese (Simplified and Traditional), Japanese, Spanish, French, German, and Korean. Select from the dropdown before processing.

Why is the first run slower?

The language recognition data must be downloaded on first use (roughly 4 to 50 MB, depending on the language). After that, your browser caches it, so subsequent runs start faster.

How can I improve OCR accuracy?

Use high-resolution scans (300 DPI or higher), ensure the document is not skewed, and choose the correct language.
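
If you are unsure whether a scan reaches 300 DPI, you can estimate it from the pixel width of the scanned image and the physical page width. The following is a minimal sketch of that arithmetic; the function names are hypothetical, and it assumes the image covers the full width of the page.

```typescript
// One inch equals 72 PDF points.
const PT_PER_INCH = 72;

// Estimate scan resolution from the image's pixel width and the page width in points.
// Assumption: the scanned image spans the entire page width.
function estimateScanDpi(imageWidthPx: number, pageWidthPt: number): number {
  if (imageWidthPx <= 0 || pageWidthPt <= 0) {
    throw new RangeError("dimensions must be positive");
  }
  const pageWidthInches = pageWidthPt / PT_PER_INCH;
  return imageWidthPx / pageWidthInches;
}

// Check an estimate against a target resolution (300 DPI by default).
function meetsDpiTarget(dpi: number, targetDpi: number = 300): boolean {
  return dpi >= targetDpi;
}

// A US Letter page (612 points wide) scanned to a 2550-pixel-wide image.
const dpi = estimateScanDpi(2550, 612);
console.log(dpi, meetsDpiTarget(dpi)); // 300 true
```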

What is a 'searchable PDF'?

A searchable PDF contains an invisible text layer on top of the original scanned image. You can use Ctrl+F to find text while the visual appearance stays the same.
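
One step in building such a text layer is placing each recognized word at the right spot on the page. The sketch below shows the coordinate conversion under stated assumptions: OCR boxes are in pixels with the origin at the top left of the rendered image, PDF coordinates are in points with the origin at the bottom left, and the page has no rotation or offset. The interfaces and function name are illustrative; they are not this tool's code.

```typescript
// Word box as reported in image space: pixels, origin at top left, y grows downward.
interface PixelBox {
  x0: number;
  y0: number;
  x1: number;
  y1: number;
}

// Rectangle in PDF space: points, origin at bottom left, y grows upward.
interface PdfRect {
  x: number;
  y: number;
  width: number;
  height: number;
}

// Convert an OCR word box to PDF coordinates.
// scale is the render scale used before OCR (for example 2 for a 2x render).
function pixelBoxToPdfRect(box: PixelBox, pageHeightPt: number, scale: number): PdfRect {
  if (!(scale > 0)) throw new RangeError("scale must be positive");
  return {
    x: box.x0 / scale,
    // Flip the vertical axis: the box's bottom edge (y1) becomes the rect's baseline-side y.
    y: pageHeightPt - box.y1 / scale,
    width: (box.x1 - box.x0) / scale,
    height: (box.y1 - box.y0) / scale,
  };
}

// A word found at pixels (200,100)-(400,140) on a US Letter page rendered at 2x.
const rect = pixelBoxToPdfRect({ x0: 200, y0: 100, x1: 400, y1: 140 }, 792, 2);
console.log(rect); // { x: 100, y: 722, width: 100, height: 20 }
```

The word would then be written at that rectangle without being painted, for instance using the PDF text rendering mode that neither fills nor strokes glyphs, so the scan stays visually unchanged while the text remains selectable and searchable.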

Is my scanned document uploaded anywhere?

No. OCR processing runs entirely in your browser using WebAssembly. Your document never leaves your device.
