
SmolDocling is the world's smallest visual language model

2025-08-28

SmolDocling is the world's smallest visual language model (VLM) by parameter count, with only 256M parameters. It was jointly developed by the ds4sd team and IBM. Built on the lean SmolVLM-256M architecture, it delivers efficient document-processing capabilities while keeping a tiny footprint. Whereas traditional large-scale VLMs usually require billions of parameters, SmolDocling is specially optimized with model-compression techniques so that it runs smoothly on ordinary computing devices. Open-source hosting on the Hugging Face platform further lowers the barrier to adopting the technology.

The model's miniaturized design brings several advantages: it cuts GPU memory usage by more than 70%, speeds up inference by more than 10x, and supports running in GPU-less environments. Reported results show that document recognition accuracy of 88.7% is maintained at the 256M-parameter scale, making the model particularly well suited to embedded devices and edge-computing scenarios. This miniaturization path marks an important step in making VLM technology lightweight and accessible to everyday users.
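To illustrate what running a Hugging Face-hosted model of this kind looks like in practice, here is a minimal sketch using the transformers library. The repository id ds4sd/SmolDocling-256M-preview, the prompt wording, and the file name page.png are assumptions based on common SmolVLM-style usage, not confirmed by this article; check the official model card before relying on them.

```python
# Minimal sketch: loading SmolDocling from the Hugging Face Hub and
# converting one scanned page. Repo id, prompt, and file name are assumptions.
import torch
from PIL import Image
from transformers import AutoProcessor, AutoModelForVision2Seq

MODEL_ID = "ds4sd/SmolDocling-256M-preview"  # assumed repo id; verify on the Hub

processor = AutoProcessor.from_pretrained(MODEL_ID)
model = AutoModelForVision2Seq.from_pretrained(
    MODEL_ID,
    torch_dtype=torch.float32,  # float32 keeps CPU-only (GPU-less) inference simple
)

image = Image.open("page.png")  # hypothetical scanned document page

# SmolVLM-style chat prompt with one image and one instruction.
messages = [
    {
        "role": "user",
        "content": [
            {"type": "image"},
            {"type": "text", "text": "Convert this page to docling."},
        ],
    }
]
prompt = processor.apply_chat_template(messages, add_generation_prompt=True)
inputs = processor(text=prompt, images=[image], return_tensors="pt")

with torch.no_grad():
    generated = model.generate(**inputs, max_new_tokens=1024)

print(processor.batch_decode(generated, skip_special_tokens=True)[0])
```

Because the model has only 256M parameters, this sketch runs on a plain CPU machine; on a GPU, passing a half-precision dtype and moving the model and inputs to the device would speed up generation further.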
