OCRFlux: Lightweight tool for converting PDFs and images to Markdown

2025-07-22

OCRFlux is an open-source, lightweight tool for converting PDF files and images to clean Markdown. Developed by the ChatDOC team, it is built on a 3B-parameter multimodal large model and can run on consumer hardware such as an RTX 3090. The tool specializes in complex document layouts: it parses multi-column formats and complex tables, and automatically merges content that spans pages. Compared with other open-source OCR models, OCRFlux reports higher accuracy, especially on tables and paragraphs. It offers a simple command-line interface and suits developers, researchers, and anyone who needs to convert documents to Markdown. The project is open source on GitHub under the Apache 2.0 license, with an active community and 1.7k stars.

Function List

Convert PDFs and images to Markdown format, preserving the natural reading order.
Support for complex layout processing, including multi-column documents, illustrations and embedded content.
Automatically parses complex tables and supports rowspan and colspan HTML table output.
Cross-page content merging, which automatically detects and integrates tables and paragraphs across pages.
Provides high accuracy text recognition with Edit Distance Similarity (EDS) up to 0.967.
Built on a 3B-parameter multimodal model that runs on common consumer GPUs.
Open source and free: code and documentation are publicly available on GitHub, and community contributions are welcome.
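Because complex tables are emitted as HTML with rowspan and colspan, downstream code often needs to flatten them into a rectangular grid. The following is a minimal sketch of that step using only the Python standard library. It is not part of OCRFlux: the names TableGridParser and expand_table and the sample table are hypothetical, and the sketch assumes a single, non-nested table.

```python
from html.parser import HTMLParser


class TableGridParser(HTMLParser):
    """Expand one HTML table with rowspan/colspan into a rectangular grid."""

    def __init__(self):
        super().__init__()
        self.grid = []       # finished rows, each a list of cell strings
        self._pending = {}   # (row, col) -> text carried down by a rowspan
        self._row = None     # cells of the row currently being read
        self._cell = None    # [text_parts, rowspan, colspan] of the open cell

    def handle_starttag(self, tag, attrs):
        if tag == "tr":
            self._row = []
        elif tag in ("td", "th") and self._row is not None:
            a = dict(attrs)
            self._cell = [[], int(a.get("rowspan") or 1), int(a.get("colspan") or 1)]

    def handle_data(self, data):
        if self._cell is not None:
            self._cell[0].append(data)

    def _fill_pending(self):
        # Copy down cells that an earlier rowspan reserved at this position.
        r = len(self.grid)
        while (r, len(self._row)) in self._pending:
            self._row.append(self._pending.pop((r, len(self._row))))

    def handle_endtag(self, tag):
        if tag in ("td", "th") and self._cell is not None:
            parts, rowspan, colspan = self._cell
            text = "".join(parts).strip()
            self._fill_pending()
            r, c = len(self.grid), len(self._row)
            for dc in range(colspan):
                self._row.append(text)  # repeat the text across the colspan
                for dr in range(1, rowspan):
                    self._pending[(r + dr, c + dc)] = text  # reserve rows below
            self._cell = None
        elif tag == "tr" and self._row is not None:
            self._fill_pending()
            self.grid.append(self._row)
            self._row = None


def expand_table(html):
    """Return the table in `html` as a list of rows of cell strings."""
    parser = TableGridParser()
    parser.feed(html)
    parser.close()
    return parser.grid


# Hypothetical sample, not actual OCRFlux output.
SAMPLE = (
    "<table>"
    '<tr><th rowspan="2">Item</th><th colspan="2">Price</th></tr>'
    "<tr><th>Net</th><th>Gross</th></tr>"
    "<tr><td>Pen</td><td>1.00</td><td>1.19</td></tr>"
    "</table>"
)

if __name__ == "__main__":
    for row in expand_table(SAMPLE):
        print(row)
```

Each spanned cell's text is repeated in every grid position it covers, so the sample yields three rows of three cells that can be written to CSV or a plain Markdown table.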

Usage Guide

Installation Process

OCRFlux is a Docker-based tool, so a Docker environment is required to install and run it. The installation steps are as follows:

Installing Docker
Make sure Docker is installed on your system. If it is not, visit the Docker website to download and install the version for your operating system. Once the installation is complete, run the following command to verify it:
```
docker --version
```
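The same check can be scripted before launching a batch job. This is a small sketch, assuming Python 3.7 or later; the function name docker_version is hypothetical, and it only wraps the `docker --version` command shown above.

```python
import shutil
import subprocess


def docker_version():
    """Return the output of `docker --version`, or None if Docker is unavailable."""
    if shutil.which("docker") is None:
        return None  # no docker executable on PATH
    try:
        result = subprocess.run(
            ["docker", "--version"], capture_output=True, text=True, check=True
        )
    except (OSError, subprocess.CalledProcessError):
        return None  # the executable exists but could not be run successfully
    return result.stdout.strip()


if __name__ == "__main__":
    print(docker_version() or "Docker not found on PATH")
```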

Pulling the OCRFlux Image
Run the following command in a terminal to pull the latest OCRFlux image from Docker Hub:
```
docker pull chatdoc/ocrflux:latest
```
Prepare the file paths
Create a local working directory (e.g. /path/to/localworkspace) to store input and output files. Make sure you also have the following directories:
- Input PDF directory (e.g. /path/to/test_pdf_dir).
- OCRFlux model directory (e.g. /path/to/OCRFlux-3B). Download the model files from the official GitHub repository or from a link provided by ChatDOC.
Running OCRFlux
Use the following command to start the OCRFlux container, mount the local directory and specify the input PDF and model paths:
```
docker run -it --gpus all \
-v /path/to/localworkspace:/localworkspace \
-v /path/to/test_pdf_dir:/test_pdf_dir \
-v /path/to/OCRFlux-3B:/OCRFlux-3B \
chatdoc/ocrflux:latest /localworkspace --data /test_pdf_dir/* --model /OCRFlux-3B/
```
- --gpus all: Enables GPU acceleration (remove this flag if no GPU is available).
- -v: Mounts a local directory into the container.
- --data: Specifies the path to the input PDF files.
- --model: Specifies the path to the model files.
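When the command is launched from a script, it can help to assemble the arguments programmatically so the GPU flag is easy to toggle. The sketch below only rebuilds the `docker run` invocation from this article; the function name build_ocrflux_command and its parameters are hypothetical, and the container-side paths are copied from the command above.

```python
import shlex


def build_ocrflux_command(workspace, pdf_dir, model_dir, use_gpu=True,
                          image="chatdoc/ocrflux:latest"):
    """Assemble the `docker run` argument list used in this article."""
    cmd = ["docker", "run", "-it"]
    if use_gpu:
        cmd += ["--gpus", "all"]  # omit on machines without a GPU
    cmd += [
        "-v", f"{workspace}:/localworkspace",  # working directory for results
        "-v", f"{pdf_dir}:/test_pdf_dir",      # input PDFs
        "-v", f"{model_dir}:/OCRFlux-3B",      # model files
        image,
        "/localworkspace",
        "--data", "/test_pdf_dir/*",
        "--model", "/OCRFlux-3B/",
    ]
    return cmd


if __name__ == "__main__":
    cmd = build_ocrflux_command(
        "/path/to/localworkspace", "/path/to/test_pdf_dir", "/path/to/OCRFlux-3B"
    )
    print(shlex.join(cmd))  # show the command as a copy-pasteable shell line
```

Passing the list to `subprocess.run(cmd, check=True)` would start the container. Note that `-it` expects an interactive terminal, so it may need to be dropped in non-interactive jobs.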
Generate Markdown files
When the run completes, the results are stored in the workspace in JSONL format. Use the following command to convert them to Markdown; the output is saved in the ./localworkspace/markdowns/DOCUMENT_NAME directory:
```
python -m ocrflux.jsonl_to_markdown ./localworkspace
```
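To collect the converted files afterwards, a short script can walk the markdowns directory mentioned above. This is a sketch under assumptions: the function name find_markdown_outputs is hypothetical, and the `.md` file extension is assumed rather than stated in this article.

```python
from pathlib import Path


def find_markdown_outputs(workspace):
    """Map each document folder under <workspace>/markdowns to its .md files."""
    root = Path(workspace) / "markdowns"
    outputs = {}
    if not root.is_dir():
        return outputs  # nothing has been converted yet
    for doc_dir in sorted(p for p in root.iterdir() if p.is_dir()):
        # Assumption: generated files use the .md extension.
        outputs[doc_dir.name] = sorted(doc_dir.rglob("*.md"))
    return outputs


if __name__ == "__main__":
    for name, files in find_markdown_outputs("./localworkspace").items():
        print(name, [str(f) for f in files])
```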

Usage Process

The core function of OCRFlux is converting PDFs or images to Markdown. The steps are as follows:

Preparing the input file
Place the PDF files or images to be converted in the /path/to/test_pdf_dir directory. Common PDF and image formats (e.g. PNG, JPG) are supported.
Run the conversion task
Use the Docker commands above to start the conversion. OCRFlux automatically analyzes the document layout and recognizes text, tables and cross-page content. The conversion may take a few minutes, depending on file size and hardware performance.
Checking the output
After the conversion is complete, open the ./localworkspace/markdowns/DOCUMENT_NAME directory to view the generated Markdown file. The file retains the natural reading order of the document, and tables are rendered in Markdown or HTML format.
Handling complex tables
OCRFlux can handle complex tables containing rowspan and colspan. The resulting Markdown file structures the table into a clear format suitable for direct editing or importing into other tools.
Cross-page content merging
For tables or paragraphs that span pages, OCRFlux automatically detects and merges the content. For example, tables spanning two pages are consolidated into one complete table, and paragraphs are spliced together in a logical order.
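To make the idea of cross-page paragraph merging concrete, here is a toy sketch. It is not how OCRFlux works internally (the article attributes detection to the model itself); it uses a simple punctuation heuristic instead, and the function name merge_pages is hypothetical.

```python
def merge_pages(pages):
    """Join paragraphs that were split by a page break.

    `pages` is a list of pages, each a list of paragraph strings.
    Toy rule: the last paragraph of a page continues on the next page
    when it does not end with sentence-final punctuation.
    """
    terminal = (".", "!", "?", ":", '"')
    merged = []
    continues = False  # does the previous page end mid-paragraph?
    for page in pages:
        for i, para in enumerate(page):
            if i == 0 and continues and merged:
                # Splice the first paragraph onto the unfinished one.
                merged[-1] = merged[-1].rstrip() + " " + para.lstrip()
            else:
                merged.append(para)
        if page:
            continues = not merged[-1].rstrip().endswith(terminal)
    return merged


if __name__ == "__main__":
    pages = [
        ["Intro paragraph.", "This sentence runs"],
        ["onto the next page.", "New paragraph."],
    ]
    for paragraph in merge_pages(pages):
        print(paragraph)
```

A real system also has to handle hyphenated words, headings, and tables that continue across pages, which is why a learned detector is a more robust choice than a rule like this.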

Featured Function Operation

Complex Layout Processing: OCRFlux supports parsing of multi-column documents and embedded illustrations. No additional configuration is required at runtime; the tool recognizes the document structure automatically.
High-precision recognition: In the OCRFlux-bench-single test, the tool achieves an EDS score of 0.967, outperforming olmOCR-7B (0.872), Nanonets-OCR-s (0.858) and MonkeyOCR (0.780).
Cross-page merging: This is a distinguishing feature of OCRFlux. The tool analyzes consecutive pages, detects tables or paragraphs that need to be merged, and outputs the complete content.
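For readers unfamiliar with edit-distance-based scores, the sketch below shows one common formulation: one minus the Levenshtein distance divided by the length of the longer string. This normalization is an assumption; the exact EDS definition used by the OCRFlux benchmark is not given in this article, and the function names are hypothetical.

```python
def levenshtein(a, b):
    """Minimum number of single-character edits turning `a` into `b`."""
    if len(a) < len(b):
        a, b = b, a  # keep the inner row as short as possible
    previous = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        current = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            current.append(min(
                previous[j] + 1,         # deletion
                current[j - 1] + 1,      # insertion
                previous[j - 1] + cost,  # substitution or match
            ))
        previous = current
    return previous[-1]


def edit_distance_similarity(predicted, reference):
    """Assumed normalization: 1 - distance / length of the longer string."""
    longest = max(len(predicted), len(reference))
    if longest == 0:
        return 1.0  # two empty strings are identical
    return 1.0 - levenshtein(predicted, reference) / longest


if __name__ == "__main__":
    print(edit_distance_similarity("# Titel\nSome text", "# Title\nSome text"))
```

Under this formulation a score of 1.0 means the predicted Markdown matches the reference exactly, and lower scores mean more character-level edits are needed.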

Caveats

Ensure that the input PDF files are legible; a scan resolution of 300 DPI or higher is recommended.
If no GPU is available, the conversion may be slow, so a high-performance CPU is recommended.
Check that the model files are complete; missing files may cause the conversion to fail.
Visit the GitHub repository regularly for the latest version and update instructions.

Application Scenarios

Academic research
Researchers can convert academic paper PDFs into Markdown for easy editing and sharing. OCRFlux handles multi-column layouts and complex tables, ensuring clear formatting of formulas and references.
Technical documentation
Developers can convert technical manuals or API documentation from PDF to Markdown for importing into a knowledge base or blog. Cross-page merging avoids content fragmentation.
Invoice and form processing
Finance staff can convert invoice or form PDFs to Markdown, extracting key information such as purchaser, unit price and price/tax totals for easy data analysis.
Content creation
Creators can convert scanned books or notes into Markdown, organizing them into publishable files suitable for direct use on websites or in documents.

QA

What file formats does OCRFlux support?
It supports PDF and common image formats (e.g. PNG, JPG). Input files need to be clear documents or scans.
Is high-performance hardware required?
No. OCRFlux is based on a 3B-parameter model and can run on a consumer GPU (e.g. RTX 3090) or a high-performance CPU.
How do I handle cross-page tables?
OCRFlux automatically detects tables and paragraphs that span pages and merges them into complete Markdown output, without manual intervention.
What if the conversion results are inaccurate?
Check the resolution of the input file (300 DPI or higher is recommended). If the problem persists, file an issue on GitHub for community help.
Does it need to be networked to operate?
No internet connection is required. OCRFlux runs in a local Docker environment, and models and data are processed offline.

Source: Chief AI Sharing Circle, "OCRFlux: Lightweight tool for converting PDFs and images to Markdown", posted on 2025-07-22.
