
dots.ocr: a unified vision-language model for multilingual document layout parsing
dots.ocr is a multilingual document parsing tool built on a 1.7B-parameter vision-language model (VLM) that handles both layout detection and content recognition. It reports state-of-the-art results on benchmarks such as OmniDocBench, and is especially strong at text, table and reading-order parsing....

SnippAI: A tool for recognizing and analyzing screenshot content using AI
Snippai is an AI-based screenshot tool designed to enhance the screenshot experience through advanced AI algorithms. It not only captures screen content, but also intelligently analyzes and converts formulas, text, tables, images, etc. in the screenshot. Users can use Snippai to convert complex visual information into editable formats such as LaTeX formulas...

AI Fast Station: document parsing tool for comparing OCR models in one click
AI Fast Station is a free, open source OCR model arena that focuses on intelligent parsing of documents and images. Users can upload PDF or image files and quickly find a suitable parsing solution by comparing seven mainstream OCR models with one click. The site supports a wide range of file formats and is easy to operate, with no complex installation required. AI Fast Station provides high-precision recognition, fast processing and secure...

OCRmyPDF: an open source tool that turns scanned PDFs into searchable text
OCRmyPDF is an open source command line tool that adds an Optical Character Recognition (OCR) text layer to scanned PDF files, turning them into searchable, copyable documents. It is written in Python and uses the Tesseract OCR engine to accurately recognize the text in page images and embed it in the PDF to keep ...
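As a rough illustration of the command line workflow described above, the following minimal sketch assembles an `ocrmypdf` invocation from Python. The helper names `build_ocrmypdf_command` and `add_text_layer` and the file names are hypothetical, chosen only for this example. The sketch assumes the `ocrmypdf` executable is installed and on the PATH, and it uses only the `--language` and `--skip-text` options.

```python
import shutil
import subprocess


def build_ocrmypdf_command(src, dst, language="eng", skip_text=True):
    """Assemble an argument list for the ocrmypdf command line tool.

    Hypothetical helper for illustration; it only builds the list.
    """
    cmd = ["ocrmypdf", "--language", language]
    if skip_text:
        # --skip-text leaves pages that already contain text untouched
        cmd.append("--skip-text")
    cmd += [src, dst]
    return cmd


def add_text_layer(src, dst, **options):
    """Run ocrmypdf if it is installed.

    Returns True when the command ran successfully and False when the
    executable was not found on the PATH.
    """
    if shutil.which("ocrmypdf") is None:
        return False
    subprocess.run(build_ocrmypdf_command(src, dst, **options), check=True)
    return True
```

Calling `add_text_layer("scan.pdf", "scan_ocr.pdf", language="deu")` would, under these assumptions, write a searchable copy of a German-language scan, provided the matching Tesseract language data is installed.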

Docstrange: a tool for extracting data from documents and images and converting them to multiple formats
Docstrange is an open source document processing tool that focuses on extracting data from documents and images in multiple formats and converting it to formats such as Markdown, JSON, CSV or HTML. It uses artificial intelligence and advanced OCR technology, and supports processing PDF, Word documents, Exce...

Guava Intelligent Document Recognition: Intelligent Recognition Tool for Offline Documents and Forms
Guava Intelligent Document Recognition (intelligent_document_recognition) is open source desktop software developed by jiangnanboy and hosted on GitHub. It focuses on intelligent offline recognition of documents and forms. The software integrates Optical Character Recognition (OCR) and form junction...

OCRFlux: Lightweight tool for converting PDFs and images to Markdown
OCRFlux is an open source, lightweight tool focused on converting PDF files and images to clean Markdown. It is developed by the ChatDOC team, built on a multimodal large model with 3B parameters, and can run on consumer hardware such as an RTX 3090. The tool specializes in complex document layouts,...

VOP: OCR Tool for Extracting Complex Diagrams and Math Formulas
Versatile OCR Program is an open source Optical Character Recognition (OCR) tool designed specifically for processing complex academic and educational documents. It can extract text, tables, mathematical formulas, diagrams and schematics from PDF, images and other documents and generate structured data suitable for machine learning training. Supports multiple languages, including English...

An open source service that automatically parses PDF content and extracts text and tables
It automatically analyzes the layout of PDF documents, identifies text, titles, images, tables, formulas and other elements on the page, and determines their correct order. The tool supports OCR, so scanned PDFs can be converted to searchable text. It runs on Docker and provides two models: visual model (Vision Grid Transfor...

Bob
Bob is a translation and OCR (Optical Character Recognition) software designed for the macOS platform. Users can use Bob for translation and OCR operations in any application, supporting a wide range of translation services, including Volcano, Tencent, Ali, Baidu, Youdao, Apple, Google, Microsoft,...

Ollama OCR: Extracting Text from Images Using Visual Models in Ollama
Ollama OCR is an Optical Character Recognition (OCR) toolkit that uses the vision-language models available through the Ollama platform to extract text from images. The project is available both as a Python package and as a user-friendly Streamlit web application. It supports a wide range of visual models, including...

Doc2X
Doc2X is a document, image and formula recognition and conversion tool that aims to provide efficient, intelligent document processing. Whether the input is an academic research paper, a textbook, a corporate document or a financial report, Doc2X can accurately recognize PDF tables and formulas, and convert them with one click to Word, LaTeX, HTML...

STranslate
STranslate is a ready-to-use translation and OCR tool developed with WPF. The tool is designed to provide efficient and convenient translation and optical character recognition (OCR) functionality for a wide range of languages and text types. STranslate is an open source project that is free for users to download and use, and also accepts custom development...

Llama OCR: OCR library that converts images to Markdown in three lines of code using the free Llama 3.2 Vision interface
Llama OCR is an OCR (Optical Character Recognition) library based on Llama 3.2 Vision that converts documents to Markdown format. The library was developed by Nutlope and uses the free Llama 3.2 interface provided by Together AI for graph...

Easydict
Easydict is a simple and elegant dictionary and translation application designed for macOS users. With support for multiple translation services and offline OCR, it makes looking up words or translating text easy. Easydict works right out of the box and supports input translation, text-selection translation, and screenshot translation, providing a convenient multi-language translation experience. ...

Datalab: dedicated OCR recognition AI model, PDF to Markdown (open source/API)
Datalab offers a range of advanced AI models focused on OCR, layout analysis, PDF to Markdown, and more. These models are not only high performing, but also easy to use and open source. The Marker model on the platform can quickly and accurately convert PDF to Markdown, including tables and formulas. Su...

TTime
TTime, a project published on GitHub by InkTimeRecord, is a simple and efficient translation application. It mainly provides input, screenshot, text-selection and floating-ball translation functions, and supports multiple translation sources and text recognition services, allowing users to quickly perform language conversion and text recognition. In addition, TTime also has...