To install SmolDocling, follow these steps:
- environmental preparation: Ensure that Python 3.8+ is installed, a virtual environment is recommended
- Installation of dependencies: Execute command pip install torch transformers docling_core
- GPU acceleration(Optional): Install the CUDA version of PyTorch for speed, available via the torch.cuda.is_available()Detection of support
The utilization process is divided into five stages:
- Image Loading: Use load_image()Function to import the image to be processed
- Model initialization: Automatic download of model weights via Hugging Face (requires initial internet connection)
- Document Conversion: Generate DocTags using a specific prompt template
- format conversion: Export DocTags to popular formats such as Markdown.
- Advanced Optimization: GPU users can enable flash_attention_2 accelerated processing
Note that adjustments may be required when processing large images max_new_tokens parameter (default 8192), it is recommended to print the intermediate results for debugging when using it for the first time.
This answer comes from the articleSmolDocling: a visual language model for efficient document processing in a small volumeThe































 English
English				 简体中文
简体中文					           日本語
日本語					           Deutsch
Deutsch					           Português do Brasil
Português do Brasil