DiffBIR's training methodology explained in detail
The two-stage training scheme used by DiffBIR is the technical basis for its performance advantages. The first stage (train_stage1.py) focuses on learning a base feature representation of images, pre-training on a mixed dataset of roughly 5 million images. The second stage (train_stage2.py) then fine-tunes the model for specific degradation types, a process that typically takes 2-4 days of distributed training on 8 GPUs.
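The staged scheme can be illustrated with a deliberately tiny numeric sketch: stage 1 learns a base model, whose weights are then frozen while stage 2 fine-tunes a small correction on top. The class and function names below are illustrative assumptions, not the project's actual API.

```python
# Hypothetical sketch of a two-stage training scheme (toy 1-D "models";
# RestorationModule, train_stage1, train_stage2 are illustrative names).

class RestorationModule:
    """Stage-1 model: learns a base representation (here, a single bias)."""
    def __init__(self):
        self.bias = 0.0
        self.frozen = False

    def fit(self, pairs, lr=0.1, steps=200):
        # Plain gradient descent on mean squared error.
        for _ in range(steps):
            grad = sum((x + self.bias) - y for x, y in pairs) / len(pairs)
            self.bias -= lr * grad

    def __call__(self, x):
        return x + self.bias


def train_stage1(pairs):
    """Pre-train the base module, then freeze it for stage 2."""
    model = RestorationModule()
    model.fit(pairs)
    model.frozen = True  # stage-1 weights stay fixed afterwards
    return model


def train_stage2(stage1, pairs, lr=0.1, steps=200):
    """Fine-tune a small scale factor on top of the frozen stage-1 model."""
    assert stage1.frozen
    scale = 1.0
    for _ in range(steps):
        grad = sum((scale * stage1(x) - y) * stage1(x)
                   for x, y in pairs) / len(pairs)
        scale -= lr * grad
    return scale
```

The point of the structure, mirrored from the description above, is that stage 2 never touches stage-1 weights: only the new, degradation-specific parameters are updated.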
The training process's technical innovations lie mainly in: 1) a progressive learning-rate scheduling strategy, 2) a dynamic balancing mechanism for the weighted loss function, and 3) the combined use of adversarial training and perceptual loss. Experimental data show that this staged approach yields an average advantage of 1.2 dB in PSNR over end-to-end training.
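The first two ideas can be sketched concretely. The schedule below (linear warmup, then cosine decay) is one plausible reading of a "progressive" schedule, and the EMA-based normalisation is a common heuristic for dynamically balancing loss terms; neither is confirmed to be DiffBIR's exact formulation.

```python
import math

def progressive_lr(step, warmup_steps=500, total_steps=10_000,
                   base_lr=1e-4, min_lr=1e-6):
    """Linear warmup followed by cosine decay (assumed schedule shape)."""
    if step < warmup_steps:
        return base_lr * (step + 1) / warmup_steps
    t = (step - warmup_steps) / max(1, total_steps - warmup_steps)
    return min_lr + 0.5 * (base_lr - min_lr) * (1 + math.cos(math.pi * t))


def balanced_loss(pixel, perceptual, adversarial, ema, beta=0.99):
    """Dynamic balancing: normalise each term by an EMA of its own
    magnitude so no single loss dominates. A common heuristic, not
    necessarily DiffBIR's exact weighting rule."""
    losses = {"pixel": pixel, "perceptual": perceptual,
              "adversarial": adversarial}
    total = 0.0
    for name, value in losses.items():
        ema[name] = beta * ema.get(name, value) + (1 - beta) * value
        total += value / (ema[name] + 1e-8)  # scale-free contribution
    return total
```

With the EMA normalisation, a term that is numerically 100x larger than another still contributes on roughly the same scale, which is the practical purpose of such balancing.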
The train_stage1.yaml and train_stage2.yaml configuration files provided with the project contain complete hyperparameter settings, which users can adapt to the characteristics of their own datasets. Of particular note, the system supports transfer learning, requiring only about 1,000 domain-specific images for effective model adaptation.
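One way such adaptation typically works is overlaying a small set of user overrides onto the shipped defaults. The keys below are illustrative assumptions styled after a train_stage2.yaml-like config, not the project's actual schema.

```python
# Hypothetical sketch: overlaying transfer-learning overrides onto base
# hyperparameters. All keys/values here are illustrative assumptions.

BASE_CONFIG = {
    "train": {"batch_size": 64, "learning_rate": 1e-4, "max_steps": 150_000},
    "data": {"root": "datasets/mixed", "num_images": 5_000_000},
}


def merge_config(base, override):
    """Recursively overlay user overrides onto the base configuration,
    leaving unspecified fields at their defaults."""
    merged = dict(base)
    for key, value in override.items():
        if isinstance(value, dict) and isinstance(merged.get(key), dict):
            merged[key] = merge_config(merged[key], value)
        else:
            merged[key] = value
    return merged


# Transfer learning: far fewer images, smaller LR, shorter schedule.
transfer_config = merge_config(BASE_CONFIG, {
    "train": {"learning_rate": 2e-5, "max_steps": 10_000},
    "data": {"root": "datasets/my_domain", "num_images": 1_000},
})
```

The untouched fields (e.g. batch size) fall through from the base config, so a domain adaptation run only needs to state what actually changes.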
This answer comes from the article "DiffBIR: Intelligent Repair Tool to Improve Image Quality".