OCR

DeepSeek-OCR: An Open Source Optical Character Recognition (OCR) Tool
DeepSeek-OCR is an optical character recognition (OCR) tool developed and open sourced by DeepSeek-AI. It proposes a new approach called “Contextual Optical Compression”, which rethinks the role of the vision encoder from the perspective of the large language model (LLM). The tool does not simply recognize graphs...
10-25 2.3K 0 kudos
dots.ocr: a unified vision-language model for multilingual document layout parsing
dots.ocr is a multilingual document parsing tool based on a 1.7B-parameter vision-language model (VLM) that handles both layout detection and content recognition. It reports state-of-the-art performance on benchmarks such as OmniDocBench, and does especially well at text, table and reading order parsing....
08-10 7.3K 0 kudos
SnippAI: A tool for recognizing and analyzing screenshot content using AI
Snippai is an AI-based screenshot tool designed to enhance the screenshot experience through advanced AI algorithms. It not only captures screen content, but also intelligently analyzes and converts formulas, text, tables, images, etc. in the screenshot. Users can use Snippai to convert complex visual information into editable formats such as LaTeX formulas...
08-10 2.2 K0kudos
AI Fast Station: document parsing tool for comparing OCR models in one click
AI Fast Station is a free, open source OCR model arena that focuses on intelligent parsing of documents and images. Users can upload PDF or image files and quickly find a suitable parsing solution by comparing seven mainstream OCR models with one click. The site supports a wide range of file formats and is easy to operate, with no complex installation required. AI Fast Station provides high-precision recognition, fast processing and secure...
08-09 2.0K 0 kudos
Docstrange: a tool for extracting data from documents and images and converting them to multiple formats
Docstrange is an open source document processing tool that focuses on extracting data from documents and images in multiple formats and converting it to formats such as Markdown, JSON, CSV or HTML. It uses artificial intelligence and advanced OCR technology, and supports processing PDF, Word documents, Exce...
08-04 3.7K 0 kudos
Guava Intelligent Document Recognition: Intelligent Recognition Tool for Offline Documents and Forms
Guava Intelligent Document Recognition (intelligent_document_recognition) is open source desktop software developed by jiangnanboy and hosted on GitHub. It focuses on intelligent recognition of documents and forms, processed offline. The software integrates Optical Character Recognition (OCR) and form junction...
07-29 1.7K 0 kudos
OCRFlux: Lightweight tool for converting PDFs and images to Markdown
OCRFlux is an open source, lightweight tool focused on converting PDF files and images to clean Markdown. It is developed by the ChatDOC team, built on a multimodal large model with 3B parameters, and can run on consumer GPUs such as the RTX 3090. The tool specializes in complex document layouts,...
07-22 2.6K 0 kudos
VOP: OCR Tool for Extracting Complex Diagrams and Math Formulas
Versatile OCR Program is an open source Optical Character Recognition (OCR) tool designed specifically for processing complex academic and educational documents. It can extract text, tables, mathematical formulas, diagrams and schematics from PDF, images and other documents and generate structured data suitable for machine learning training. Supports multiple languages, including English...
04-12 2.7 K0kudos
An open source service that automatically parses PDF content and extracts text and tables
It automatically analyzes the layout of PDF documents, identifies text, titles, images, tables, formulas and other elements on the page, and determines their correct order. The tool supports OCR, so scanned PDFs can be converted to searchable text. It runs on Docker and provides two models: visual model (Vision Grid Transfor...
04-09 3.2K 0 kudos
RolmOCR: Document OCR Model for Recognizing Handwritten and Slanted Characters
RolmOCR is an open source Optical Character Recognition (OCR) tool developed by the Reducto AI team, based on the Qwen2.5-VL-7B vision-language model. It can extract text from images and PDF files faster than the similar tool olmOCR, with a lower memory footprint. RolmOCR...
04-07 3.9K 0 kudos
uniOCR: cross-platform open source text recognition tool
uniOCR is an open source text recognition tool developed by the mediar-ai team. It is written in Rust and supports macOS, Windows and Linux. It allows users to extract text from images, and is easy and free to use. uniOCR's core feature is cross-platform support...
04-04 2.6K 0 kudos
PDF Craft: an open source tool for converting scanned PDFs to Markdown
PDF Craft is an open source tool designed for converting scanned PDFs of books to Markdown format. It is developed by oomol-lab and hosted on GitHub, and is aimed at users who like to organize their eBooks. The tool runs on a local AI model and does not require an internet connection, which protects privacy and simplifies operation. It...
03-24 3.7K 0 kudos
SmolDocling: a compact vision-language model for efficient document processing
SmolDocling is a vision-language model (VLM) developed by the ds4sd team in collaboration with IBM, based on SmolVLM-256M and hosted on the Hugging Face platform. With only 256M parameters, it is billed as the world's smallest VLM.
03-18 3.2K 0 kudos
Mistral OCR: 94.89% Overall Accuracy, 1000 Pages/30 Seconds, Only $1
In the long history of human civilization, every leap in the way information is acquired and analyzed has profoundly contributed to social progress. From the ancient hieroglyphics, to the portable papyrus, to the later emergence of the printing press and today's wave of digitalization, each technological innovation has greatly expanded the scope of dissemination and depth of application of human knowledge, which in turn has become a breeding ground for a new round of innovation...
03-07 3.3 K0kudos
Ollama OCR: Extracting Text from Images Using Vision Models in Ollama
Ollama OCR is an Optical Character Recognition (OCR) toolkit that uses vision-language models available through the Ollama platform to extract text from images. The project is available both as a Python package and as a user-friendly Streamlit web application. It supports a wide range of vision models, including...
01-10 6.7K 0 kudos
STranslate
STranslate is a ready-to-use translation and OCR tool built with WPF. The tool is designed to provide efficient and convenient translation and optical character recognition (OCR) functionality for a wide range of languages and text types. STranslate is an open source project that is free for users to download and use, and also accepts custom development...
12-25 3.0K 0 kudos
VisionParser: OCR tool for high-precision processing of receipts and invoices, API available
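VisionParser is an OCR (optical character recognition) tool designed for processing receipts and invoices. Using advanced generative AI, VisionParser quickly and accurately converts all kinds of receipts and invoices into structured data, and suits business scenarios such as retail, food service, and B2B services. Its flexible AP...
12-18 2.5K 0 kudos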
Chunkr: An All-in-One Service for Document Ingestion and Intelligent Chunking Based on Text Paragraph Hierarchy Using Visual Models
Chunkr is a self-hosted API that specializes in converting PDF, PPTX, DOCX, and Excel files into data suitable for use with RAG (retrieval-augmented generation) and LLMs (large language models). It was developed by Lumina AI Inc. and utilizes advanced vision models for document...
12-13 2.9K 0 kudos
Llama OCR: OCR library that converts images to Markdown in three lines of code using the free Llama 3.2 Vision API
Llama OCR is an OCR (Optical Character Recognition) library based on Llama 3.2 Vision that converts documents to Markdown format. The library was developed by Nutlope and uses the free Llama 3.2 API provided by Together AI for graph...
12-11 3.6K 0 kudos