Guava Intelligent Document Recognition (intelligent_document_recognition
) is open source desktop software developed by developer jiangnanboy, hosted on GitHub, focusing on intelligent recognition of documents and forms for offline processing. The software integrates Optical Character Recognition (OCR) and form structure recognition, without the need to run online to ensure data privacy and security. Users can extract text and tables from images or PDFs and save them in txt, html or excel formats. The software supports both English and Chinese interfaces, and the latest version v2.1 adds screenshot recognition and image list deletion for more convenient operation. Guava Intelligent Document Recognition is suitable for personal, business or educational users to deal with documents, especially for the scenarios that need to organize data efficiently.
Function List
- Offline OCR Recognition: Extract text from images or PDFs without an Internet connection.
- Form structure recognition: automatically parses form content and outputs it in html or excel format.
- Screenshot Recognition (v2.1): Mouse box the screen content and extract text in real time.
- Picture list management: support for deleting picture files in the left sidebar.
- Multi-format output: recognition results can be saved as txt, html or excel files.
- Chinese and English interface: Chinese and English versions are available, with friendly operation interface.
Using Help
Installation process
Guava Smart Document Recognition is a desktop software that needs to be downloaded and installed on your local device. Below are the detailed installation steps:
- Download Software
Installation packages are available in Chinese and English. The latest version (v2.1) can be downloaded from the following channels:- Chinese version ::
- Baidu.com:
https://pan.baidu.com/s/1owzG74DLPxq6czEQC7ZNwQ
(Extract code: nt3z) - Hugging Face:
https://huggingface.co/jiangnanboy/intelligent_document_recognition
- Baidu.com:
- English version ::
- Baidu.com:
https://pan.baidu.com/s/1Cv-hG6fMDUhj9dd3Et1RuA
(Extract code: rkrd) - Hugging Face:
https://huggingface.co/jiangnanboy/intelligent_document_recognition
After downloading, unzip the zip to a local directory such asC:\guava_document_recognition
The
- Baidu.com:
- Chinese version ::
- Installing Tesseract OCR
The software relies on the Tesseract OCR engine for text recognition. The installation steps are as follows:- Windows (computer) : Download the installer from Tesseract GitHub and install it.
- Linux : Run command
sudo apt-get install tesseract-ocr
The - Mac : Run command
brew install tesseract
The
After the installation is complete, make sure that the path to the Tesseract executable has been added to the system environment variables (Windows users will need to configure this manually).
- operating software
After unzipping the package, double-click to runintelligent_document_recognition.exe
(Windows) or the corresponding executable. The first run will load the OCR model, which may take a few seconds. After the software starts, select the Chinese or English interface (depending on the version downloaded).
Usage
Guava Intelligent Document Recognition provides an intuitive graphical interface that supports the operation of the following functions:
- Offline OCR Recognition
- Open the software and click the "File Upload" button to import images (JPG, PNG) or PDF files.
- Click the "OCR Recognition" button, the software automatically extracts the text in the file.
- The recognition results are displayed in the text box on the right and can be edited or saved by the user as
txt
maybehtml
Format:- Click the "Save" button to select the output format and save path.
- Example: Upload a picture of the meeting minutes, the software extracts the text and saves it as
notes.txt
The
- Form Structure Recognition
- Upload an image or PDF file containing the form.
- Select the "Form Recognition" option and the software will automatically parse the form content.
- The results can be saved as
html
maybeexcel
Format:- Click on the "Export Table" button, select the format and save it.
- Example: Upload financial statement PDF, software generated
report.xlsx
file containing the complete table data.
- Screenshot Recognition (new in v2.1)
- Click the "Screenshot" button and the software interface will be hidden automatically.
- Use the mouse to frame a target area on the screen (e.g., a web page or document content).
- After releasing the mouse, the software recognizes the text in the boxed area and displays it in the text box.
- The user can edit or save the result as
txt
maybehtml
The - Example: Box the course schedule on the screen, the software extracts the text and saves it as
schedule.txt
The
- Image List Management
- The left column of the software displays a list of uploaded images.
- Select the unwanted images and click the "Delete" button or press the
Delete
key to remove it. - This function is suitable for batch processing to clean up useless files.
- Chinese and English interface switching
- The software displays a Chinese or English interface depending on the version downloaded, with the same operating logic.
- For example, the Chinese version shows "File Upload" and the English version shows "Upload File".
- Users can choose the appropriate language version according to their needs.
- batch file
- Place multiple images or PDFs into a specified folder in the software (e.g.
input
(folder). - Select the "Batch Recognition" function, the software automatically processes all files and saves the results.
- The output file is saved by default in the
output
folder, you can change the path in the settings.
- Place multiple images or PDFs into a specified folder in the software (e.g.
Configuration and Optimization
- Adjusting the output format : Edit the root directory of the software
config.ini
file, set the default output format or save path:
[Output]
default_format = txt
save_path = ./output
- Improved recognition accuracy : Ensure that the input file is clear, high-resolution images (at least 300 DPI) work best. Fuzzy or low-quality files may result in recognition errors.
- log debugging : If the identification is inaccurate, check the
logs
log file in the folder to analyze the cause of the error. - performance optimization : When processing large files, close other resource-hogging programs to increase processing speed.
caveat
- Quality of documents : Uploaded images or PDFs need to be clear and not blurry or skewed to ensure accurate recognition.
- System compatibility : The software is supported on Windows, Linux and Mac and requires Tesseract OCR to be properly installed.
- data security : The software runs completely offline and data is not uploaded to the cloud, making it suitable for handling sensitive information.
- Updated software : Just check Baidu.com or Hugging Face regularly to download the latest version and overwrite the old version folder.
- Contact Support If you have any questions, please contact the developer through the public number "Guava AI" on WeChat.
application scenario
- Enterprise Document Management
Business users upload scanned contracts, invoices or statements, extract text and tables, and quickly generate editable documents to improve office efficiency. - Academic research support
Researchers process academic paper PDFs, extract key text or tables, and organize them into txt or excel files for easy data analysis. - Organization of educational resources
Teachers upload scanned copies of test papers or textbooks, extract topics or table contents, organize teaching materials, and support offline operation. - Personal Efficiency Improvement
Users use the screenshot function to quickly extract text from the screen, such as meeting minutes or web content, and save it as an editable file.
QA
- Does Guava Smart Document Recognition require an internet connection?
The software runs completely offline and data processing is done locally to ensure privacy and security. - What file formats are supported?
JPG, PNG, PDF and other formats are supported, and high-resolution files are recommended to improve recognition. - How do I handle misrecognized text?
Check the clarity of the input file or adjust the OCR sensitivity in the software settings. If the problem is not resolved, contact the developer for feedback. - Does form recognition support complex forms?
Support for regular tables, complex nested tables may require pre-processing of images to improve accuracy. - How do I update to the latest version?
Download v2.1 from Baidu.com or Hugging Face, unzip it and overwrite the old version folder.