Dolphin, developed by ByteDance, is an open-source document image parsing tool. It focuses on complex document images, such as scanned pages or PDF files that contain text, tables, formulas, and images. It follows an "analyze first, then parse" approach, using a two-stage process to parse efficiently: first analyze the text ...
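
The general shape of a two-stage pipeline like this can be illustrated with a minimal Python sketch. It assumes that stage one lists the elements found on a page and that stage two parses each element according to its type. Every name here (`Element`, `analyze_page`, `parse_element`, `parse_document`) is hypothetical and is not Dolphin's actual API. The "page" is a plain list of tuples standing in for a real image.

```python
from dataclasses import dataclass


@dataclass
class Element:
    kind: str     # e.g. "text", "table", "formula", "image"
    order: int    # position of the element on the page
    content: str  # stand-in for the cropped image region


def analyze_page(page):
    """Stage 1 (hypothetical stub): list the elements found on the page."""
    return [Element(kind, i, raw) for i, (kind, raw) in enumerate(page)]


def parse_element(element):
    """Stage 2 (hypothetical stub): parse one element according to its type."""
    return f"[{element.kind}] {element.content}"


def parse_document(page):
    """Run both stages: analyze the page, then parse each element in order."""
    elements = analyze_page(page)
    return [parse_element(e) for e in sorted(elements, key=lambda e: e.order)]


# Toy input: (type, raw content) pairs instead of a real document image.
page = [("text", "Introduction"), ("table", "a | b"), ("formula", "E = mc^2")]
for line in parse_document(page):
    print(line)
```

Separating the stages this way means each element type can get its own parsing logic without changing how the page is analyzed.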