
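dots.ocr: a unified vision-language model for multilingual document layout parsing
dots.ocr is a powerful multilingual document parsing tool built on a 1.7B-parameter vision-language model (VLM) that performs layout detection and content recognition together. It achieves state-of-the-art performance on benchmarks such as OmniDocBench and is especially strong at text, table, and reading-order parsing...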

SnippAI: A tool for recognizing and analyzing screenshot content using AI
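Snippai is an AI-based screenshot tool designed to improve the screenshot experience with advanced AI algorithms. Beyond capturing screen content, it can analyze and convert formulas, text, tables, images, and other elements in a screenshot. Users can use Snippai to turn complex visual information into editable formats such as LaTeX formulas...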

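AI Fast Station: a document parsing tool for comparing OCR models in one click
AI Fast Station (AI快站) is a free, open source OCR model arena focused on intelligent parsing of documents and images. Users can upload PDF or image files and compare seven mainstream OCR models in one click to quickly find a suitable parsing solution. The site supports multiple file formats, is simple to use, and requires no complicated installation. AI Fast Station offers high-accuracy recognition, fast processing, and security...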

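OCRmyPDF: an open source tool that turns scanned PDFs into searchable text
OCRmyPDF is an open source command-line tool for adding an optical character recognition (OCR) text layer to scanned PDF files, making them searchable and copyable. It is written in Python and uses the Tesseract OCR engine to accurately recognize text in images and embed it in the PDF, while keeping...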

Docstrange: a tool for extracting data from documents and images and converting them to multiple formats
Docstrange is an open source document processing tool that extracts data from documents and images in multiple formats and converts it to formats such as Markdown, JSON, CSV, or HTML. It uses artificial intelligence and advanced OCR technology, and supports processing PDF, Word documents, Exce...

Guava Intelligent Document Recognition: an intelligent recognition tool for offline documents and forms
Guava Intelligent Document Recognition (intelligent_document_recognition) is open source desktop software developed by jiangnanboy and hosted on GitHub, focused on offline intelligent recognition of documents and forms. The software integrates Optical Character Recognition (OCR) and form junction...

OCRFlux: Lightweight tool for converting PDFs and images to Markdown
OCRFlux is an open source, lightweight tool focused on converting PDF files and images to clean Markdown. It is developed by the ChatDOC team, built on a multimodal large language model with 3B parameters, and can run on common hardware such as an RTX 3090. The tool specializes in complex document layouts,...

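VOP: an OCR tool for extracting complex diagrams and math formulas
Versatile OCR Program is an open source optical character recognition (OCR) tool designed for complex academic and educational documents. It extracts text, tables, math formulas, charts, and diagrams from PDFs, images, and other files, and produces structured data suitable for machine-learning training. It supports multiple languages, including English...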

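An open source service that automatically parses PDF content and extracts text and tables
It automatically analyzes the layout of PDF documents, identifies elements on each page such as text, headings, images, tables, and formulas, and determines their correct order. The tool supports OCR and can convert scanned PDFs into searchable text. It runs on Docker and provides two models: a vision model (Vision Grid Transfor...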

Bob
Bob is a translation and OCR (Optical Character Recognition) software designed for the macOS platform. Users can use Bob for translation and OCR operations in any application, supporting a wide range of translation services, including Volcano, Tencent, Ali, Baidu, Youdao, Apple, Google, Microsoft,...

Ollama OCR: Extracting Text from Images Using Visual Models in Ollama
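Ollama OCR is a powerful optical character recognition (OCR) toolkit that uses state-of-the-art vision-language models available through the Ollama platform to extract text from images. The project can be used as a Python package and also provides a user-friendly Streamlit web application. It supports multiple vision models, including...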

Doc2X
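Doc2X is a powerful tool for recognizing and converting documents, images, and formulas, aimed at providing efficient, intelligent document processing. Whether for academic papers, study guides, corporate documents, or financial and research reports, Doc2X accurately recognizes tables and formulas in PDFs and converts them in one click to Word, LaTeX, HTML...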

STranslate
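STranslate is a ready-to-use translation and OCR tool built with WPF. It aims to provide efficient, convenient translation and optical character recognition (OCR) for a variety of languages and text types. STranslate is an open source project that users can freely download and use, and it also accepts custom development...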

Llama OCR: OCR library that converts images to Markdown in three lines of code using the free Llama 3.2 Vision interface
Llama OCR is an OCR (Optical Character Recognition) library based on Llama 3.2 Vision that converts documents to Markdown format. The library was developed by Nutlope and uses the free Llama 3.2 interface provided by Together AI for graph...

Easydict
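Easydict is a clean, elegant dictionary and translation app designed for macOS users. It supports multiple translation services and offline OCR, making it easy to look up words or translate text. Easydict works out of the box, supports input translation, selected-text translation, and screenshot translation, and offers a convenient multilingual translation experience. ...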

Datalab: dedicated OCR AI models, PDF to Markdown (open source/API)
Datalab offers a range of advanced AI models focused on OCR, layout analysis, PDF to Markdown conversion, and more. These models are high performing, easy to use, and open source. The Marker model on the platform can quickly and accurately convert PDF to Markdown, including tables and formulas. Su...

TTime
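TTime is a project published on GitHub by InkTimeRecord: a simple, efficient translation app. It mainly offers input, screenshot, selected-text, and floating-ball translation, and supports multiple translation sources and text-recognition services so that users can quickly translate and recognize text. In addition, TTime also...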