SmolDocling's standard output DocTags can be converted to multiple formats using the docling_core library:
Basic Conversion Method
- To Markdown: Use the export_to_markdown() method, which preserves the header hierarchy and code blocks.
- To HTML: Suitable for web publishing; keeps the table styling as it is.
- To LaTeX: Academic users can use this for math formulas
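The export choices above can be wrapped in a small dispatch helper. This is a minimal sketch, assuming the document object exposes export_to_markdown() and export_to_html() the way DoclingDocument does. The helper name export_document and the format keys are hypothetical and not part of docling_core. LaTeX is left out because the text does not name a method for it.

```python
from typing import Any

# Format key -> name of the export method expected on the document object.
# Only the Markdown and HTML exporters are listed (assumption: both exist
# on the object passed in, as they do on DoclingDocument).
EXPORTERS = {
    "markdown": "export_to_markdown",
    "html": "export_to_html",
}


def export_document(doc: Any, fmt: str) -> str:
    """Return `doc` rendered in `fmt` (hypothetical helper)."""
    try:
        method_name = EXPORTERS[fmt.lower()]
    except KeyError:
        raise ValueError(f"unsupported format: {fmt!r}") from None
    # Look up the export method by name and call it with default arguments.
    return getattr(doc, method_name)()
```

Keeping the mapping in one dictionary means a further format can be added in a single line once its export method is confirmed.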
Advanced Processing Techniques
- Merge multi-page documents: first collect the DocTags of each page in a list, then use Document.merge().
- Style customization: Adjust the HTML output style by modifying the CSS template.
- Batch conversion: Batch processing of folders in conjunction with the glob module
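The batch conversion idea can be sketched with the glob module as follows. This is an illustration under stated assumptions: the DocTags for each document are already saved as .txt files in one folder, and the actual conversion is passed in as a callable. The names convert_folder, convert, and pattern are hypothetical.

```python
import glob
import os
from typing import Callable, List


def convert_folder(src_dir: str, dst_dir: str,
                   convert: Callable[[str], str],
                   pattern: str = "*.txt") -> List[str]:
    """Convert every file in src_dir matching `pattern`; return written paths.

    `convert` takes the DocTags text of one file and returns Markdown text.
    """
    os.makedirs(dst_dir, exist_ok=True)
    written = []
    # sorted() keeps the processing order stable across runs and platforms.
    for src_path in sorted(glob.glob(os.path.join(src_dir, pattern))):
        with open(src_path, encoding="utf-8") as f:
            doctags = f.read()
        # Reuse the input file name, swapping the extension for .md.
        stem = os.path.splitext(os.path.basename(src_path))[0]
        dst_path = os.path.join(dst_dir, stem + ".md")
        with open(dst_path, "w", encoding="utf-8") as f:
            f.write(convert(doctags))
        written.append(dst_path)
    return written
```

In practice, `convert` would wrap the load and export_to_markdown() steps from the conversion example. Passing it in as a parameter keeps the folder handling independent of the docling_core version in use.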
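Conversion example code:
from docling_core.types.doc import DoclingDocument

# `doctags` holds the DocTags output from SmolDocling, in the form load_from_doctags() expects
doc = DoclingDocument(name="Report")
doc.load_from_doctags(doctags)
with open("output.md", "w", encoding="utf-8") as f:
    f.write(doc.export_to_markdown())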
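For the style customization mentioned earlier, one option that does not depend on any particular template mechanism is to post-process the exported HTML string. The sketch below assumes the HTML comes from an export such as export_to_html() and is handled as plain text. inject_css is a hypothetical helper, not a docling_core API.

```python
import re


def inject_css(html: str, css: str) -> str:
    """Insert a <style> block holding `css` just before the closing </head> tag.

    If the page has no </head> tag, the style block is prepended instead.
    """
    style_block = f"<style>\n{css}\n</style>\n"
    # Case-insensitive search so </HEAD> is matched as well.
    match = re.search(r"</head\s*>", html, flags=re.IGNORECASE)
    if match is None:
        return style_block + html
    return html[:match.start()] + style_block + html[match.start():]
```

Because the style block is placed last inside the head, its rules come after any styles already present there and take precedence at equal specificity.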
This answer comes from the article "SmolDocling: a visual language model for efficient document processing in a small volume".