Customizing the training process
YOLOE supports adapting to new objects in three ways:
Option 1: Text-prompt fine-tuning (5-minute crash course)
- Prepare 10-20 labeled images (COCO format)
- Add the new category (e.g. "defective_part") to class_names.txt
- Run:
python train_text_prompt.py --data custom.yaml --weights yoloe-s.pt
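The class-list edit above is a one-line file change. A minimal sketch (the file name class_names.txt comes from the step above; its location and one-name-per-line layout are assumptions):

```python
from pathlib import Path

# Assumed location of the class list read by the text-prompt script
names_file = Path("class_names.txt")

# Existing categories, one per line (empty list if the file is missing)
classes = names_file.read_text().splitlines() if names_file.exists() else []

# Append the new category only once, then write the list back
if "defective_part" not in classes:
    classes.append("defective_part")
    names_file.write_text("\n".join(classes) + "\n")
```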
Option 2: Visual-prompt training (for categories that are hard to name in text)
- Steps:
- Use generate_sam_masks.py to generate segmentation masks for the reference images
- Run train_vp.py to train the visual encoder
- Load vp_model.pt at prediction time as the reference
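The internals of train_vp.py are not shown here, but the idea behind a visual prompt can be sketched: the reference image's features are pooled under its segmentation mask into a prototype embedding, which query features are then matched against. A hypothetical NumPy illustration (function names and shapes are mine, not the repo's):

```python
import numpy as np

def mask_pooled_prototype(features, mask):
    """Average the feature vectors that fall inside a binary mask.

    features: (H, W, C) feature map from the backbone
    mask:     (H, W) binary segmentation mask of the reference object
    """
    m = mask.astype(bool)
    return features[m].mean(axis=0)  # (C,) prototype embedding

def cosine_similarity(a, b):
    return float(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b)))

# Toy example: a 4x4 feature map with 2 channels
rng = np.random.default_rng(0)
features = rng.normal(size=(4, 4, 2))
mask = np.zeros((4, 4), dtype=np.uint8)
mask[1:3, 1:3] = 1  # the reference object covers the centre

proto = mask_pooled_prototype(features, mask)
# Score every location of a query feature map against the prototype
scores = [cosine_similarity(q, proto) for q in features.reshape(-1, 2)]
```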
Option 3: Full model training (highest accuracy)
Requires >1000 labeled images:
- Convert the dataset:
python tools/convert2yolo.py --data_path ./custom
- Generate a grounding cache:
python generate_grounding_cache.py --img-path ./train --json-path annotations.json
- Start training (A100 GPU recommended):
python train_seg.py --batch 64 --epochs 100
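The conversion script itself is not reproduced here, but the core of any COCO-to-YOLO conversion is the box-format change: COCO stores pixel [x_min, y_min, width, height], while YOLO expects a normalized [x_center, y_center, width, height]. A sketch (function name is mine):

```python
def coco_box_to_yolo(box, img_w, img_h):
    """Convert a COCO [x_min, y_min, width, height] box (pixels)
    to YOLO [x_center, y_center, width, height], normalized to 0-1."""
    x, y, w, h = box
    return [(x + w / 2) / img_w, (y + h / 2) / img_h, w / img_w, h / img_h]

# A 100x100 box at (50, 50) in a 200x200 image sits exactly in the centre
print(coco_box_to_yolo([50, 50, 100, 100], 200, 200))  # [0.5, 0.5, 0.5, 0.5]
```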
This answer comes from the article "YOLOE: an open source tool for real-time video detection and segmentation of objects".