dots.ocr addresses the problem of scrambled reading order in mixed-language and non-Latin documents with several dedicated features:
- Intelligent Sorting Algorithm: A built-in reading-order optimization automatically arranges text blocks according to human reading habits.
- Unified Output Format: Generates standardized, structured JSON data containing each element's position and hierarchy information.
- Language Adaptation: Automatically adjusts the parsing logic to each language's writing direction (e.g., right-to-left for Arabic).
- Visual Debugging: Outputs images with numbered bounding boxes so the reading order can be verified visually.
It is recommended to use the prompt_layout_all_en prompt to get the complete layout analysis results.
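
As a rough illustration of how the structured output might be consumed, the sketch below numbers layout elements in the order they appear and joins their text. It assumes the result is a JSON list whose list order is the reading order, and that each element carries `bbox`, `category`, and `text` fields; these field names, the sample data, and the helper functions `number_elements` and `to_plain_text` are assumptions for illustration, not a confirmed dots.ocr API.

```python
import json

# Hypothetical sample shaped like a layout result: a JSON list of elements.
# Assumption: list order is the reading order, and each element has
# "bbox" ([x1, y1, x2, y2]), "category", and "text" fields.
SAMPLE = """
[
  {"bbox": [400, 40, 760, 90], "category": "Title", "text": "Heading"},
  {"bbox": [400, 120, 760, 300], "category": "Text", "text": "Right column (read first)"},
  {"bbox": [40, 120, 380, 300], "category": "Text", "text": "Left column (read second)"}
]
"""


def number_elements(layout_json):
    """Attach a 1-based reading-order index to each element, keeping list order."""
    elements = json.loads(layout_json)
    return [
        {
            "order": index,
            "category": element.get("category", ""),
            "bbox": element["bbox"],
            "text": element.get("text", ""),
        }
        for index, element in enumerate(elements, start=1)
    ]


def to_plain_text(numbered):
    """Join the non-empty text of all elements in reading order, one per line."""
    return "\n".join(item["text"] for item in numbered if item["text"])


if __name__ == "__main__":
    numbered = number_elements(SAMPLE)
    for item in numbered:
        # Compare these indices against the numbered boxes in the debug image.
        print(item["order"], item["category"], item["bbox"])
    print(to_plain_text(numbered))
```

In this right-to-left sample, the right-hand column precedes the left-hand one, which is the kind of ordering to check against the numbered bounding box image.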
This answer comes from the article "dots.ocr: a unified visual-linguistic model for multilingual document layout parsing".