Technological Innovations in Data Synthesis and Model Training
UNO's data pipeline rests on two core techniques. The first is a diffusion-model-based data augmentation process that automatically generates multi-view, multi-scene training samples from a single reference image. The second is a subject-aware data sampling strategy, which ensures that the features of each entity in a multi-subject scene are learned in a balanced manner.
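The idea behind subject-aware sampling can be illustrated with a minimal sketch. It is not UNO's actual implementation: the inverse-frequency weighting and the names `subject_balanced_weights` and `draw_batch` are assumptions chosen for illustration. Each training sample lists the subjects it contains, and samples with rare subjects are drawn more often so that no entity dominates training.

```python
import random
from collections import Counter


def subject_balanced_weights(samples):
    """Return one normalized sampling weight per sample.

    `samples` is a list of non-empty lists of subject labels.
    Hypothetical scheme: a sample's weight is the mean inverse
    frequency of its subjects across the whole dataset.
    """
    counts = Counter(label for subjects in samples for label in subjects)
    weights = []
    for subjects in samples:
        if not subjects:
            raise ValueError("every sample must contain at least one subject")
        weights.append(sum(1.0 / counts[s] for s in subjects) / len(subjects))
    total = sum(weights)
    return [w / total for w in weights]


def draw_batch(samples, batch_size, seed=0):
    """Draw sample indices with replacement using the balanced weights."""
    rng = random.Random(seed)
    weights = subject_balanced_weights(samples)
    return rng.choices(range(len(samples)), weights=weights, k=batch_size)
```

With three samples containing only "cat" and one containing only "dog", the single "dog" sample receives half of the total sampling weight, so both subjects are seen about equally often.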
For training, the team adopts a three-stage optimization scheme: pre-training on large-scale general-purpose data, then fine-tuning on synthetic data, and finally adversarial training to improve the quality of details. This scheme allows the model to achieve a feature retention rate of over 85% with only 1-4 reference images. The project's open-source training code supports fine-tuning on custom datasets, and researchers can quickly start new tasks by modifying the configs/training.yaml configuration file.
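The three-stage scheme can be pictured as a simple step schedule. The sketch below only models the order of the stages described above. The step counts, the `Stage` class, and the `stage_at` helper are hypothetical and do not come from the UNO codebase or its configs/training.yaml file.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Stage:
    name: str
    description: str
    steps: int  # hypothetical duration, not taken from the project


# Order follows the text; the step counts are placeholders.
STAGES = [
    Stage("pretrain", "large-scale general-purpose data", 1000),
    Stage("finetune", "synthetic data", 300),
    Stage("adversarial", "adversarial training for detail quality", 100),
]


def stage_at(step, stages=STAGES):
    """Return the stage that is active at a 0-based global step."""
    if step < 0:
        raise ValueError("step must be non-negative")
    end = 0
    for stage in stages:
        end += stage.steps
        if step < end:
            return stage
    raise ValueError("step is beyond the end of the schedule")
```

A training loop could call `stage_at(step)` once per step to decide which data source and loss to use.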
This answer comes from the article "UNO: Support for single-subject and multi-subject customized image generation tools (suitable for e-commerce graphics)".