
RF-DETR: An Open Source Model for Real-Time Visual Object Detection

2025-03-25

RF-DETR is an open source object detection model developed by the Roboflow team. It is built on the Transformer architecture, and its core feature is real-time efficiency. It is the first real-time model to exceed 60 AP on the Microsoft COCO dataset, and it also performs strongly on the RF100-VL benchmark, adapting well to a variety of real-world scenarios. It is available in two versions: RF-DETR-base (29 million parameters) and RF-DETR-large (128 million parameters). The model is small enough for edge device deployment. The code and pre-trained weights are released under the Apache 2.0 license and are free for community use. Users can obtain both from GitHub for training or deployment.


 

Feature List

  • Real-time object detection: fast recognition of objects in images or videos with low latency.
  • Custom dataset training: support for tuning models with your own data.
  • Running on edge devices: the model is lightweight and suitable for resource-limited devices.
  • Adjustable resolution: users can trade off detection speed against accuracy.
  • Pre-trained model support: provides pre-trained weights based on the COCO dataset.
  • Video stream processing: can analyze the video in real time and output the results.
  • ONNX Export: Supports conversion to ONNX format for easy cross-platform deployment.
  • Multi-GPU training: You can accelerate the training process with multiple graphics cards.

 

Usage Guide

The use of RF-DETR is divided into three parts: installation, inference and training. Below are detailed steps to help you get started quickly.

Installation process

  1. Environment preparation
    Requires Python 3.9 or higher and PyTorch 1.13.0 or higher. If using a GPU, run nvidia-smi to check the driver.

    • Install PyTorch:
      pip install "torch>=1.13.0" "torchvision>=0.14.0"
      
    • Download code:
      git clone https://github.com/roboflow/rf-detr.git
      cd rf-detr
      
    • Install the dependencies:
      pip install rfdetr
      

      This automatically installs numpy, supervision, and other required libraries.

  2. Verify Installation
    Run the following code:

    from rfdetr import RFDETRBase
    print("Installation successful")

If no errors are reported, the installation is complete.
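Beyond the import test, it can help to see which versions are actually installed. The following is a minimal sketch using only the Python standard library; the helper names `installed_version` and `environment_report` are hypothetical and not part of rfdetr, and the package list is an assumption based on the libraries named in the steps above.

```python
import sys
from importlib import metadata


def installed_version(package):
    """Return the installed version string of `package`, or None if it is absent."""
    try:
        return metadata.version(package)
    except metadata.PackageNotFoundError:
        return None


def environment_report(packages=("torch", "torchvision", "rfdetr", "supervision")):
    """Collect the Python version and the version of each listed package."""
    report = {"python": ".".join(str(part) for part in sys.version_info[:3])}
    for name in packages:
        report[name] = installed_version(name)
    return report


if __name__ == "__main__":
    # Print one line per entry; missing packages are flagged explicitly.
    for name, version in environment_report().items():
        print(f"{name}: {version or 'not installed'}")
```

Any entry reported as "not installed" points at the installation step that needs to be repeated.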

Inference

RF-DETR ships with a model pre-trained on the COCO dataset, so it can detect objects in images or videos out of the box.

  1. Image detection
    • Sample code:
      import io
      import requests
      from PIL import Image
      from rfdetr import RFDETRBase
      import supervision as sv
      # Load the COCO pre-trained base model
      model = RFDETRBase()
      # Download the sample image and open it with PIL
      url = "https://media.roboflow.com/notebooks/examples/dog-2.jpeg"
      image = Image.open(io.BytesIO(requests.get(url).content))
      # Run detection with a confidence threshold of 0.5
      detections = model.predict(image, threshold=0.5)
      # Label each box with its numeric class ID and confidence
      labels = [f"{class_id} {confidence:.2f}" for class_id, confidence in zip(detections.class_id, detections.confidence)]
      # Draw boxes and labels on a copy of the image, then display it
      annotated_image = image.copy()
      annotated_image = sv.BoxAnnotator().annotate(annotated_image, detections)
      annotated_image = sv.LabelAnnotator().annotate(annotated_image, detections, labels)
      sv.plot_image(annotated_image)
      
    • This code detects objects in the image, draws each bounding box with its class ID and confidence, and then displays the result.
  2. Video Detection
    • First install opencv-python:
      pip install opencv-python
      
    • Sample code:
      import cv2
      from PIL import Image
      from rfdetr import RFDETRBase
      import supervision as sv
      model = RFDETRBase()
      box_annotator = sv.BoxAnnotator()
      cap = cv2.VideoCapture("video.mp4")  # Replace with the path to your video
      while cap.isOpened():
          ret, frame = cap.read()
          if not ret:
              break  # End of the video or a read error
          # OpenCV returns BGR frames; convert to an RGB PIL image for prediction
          image = Image.fromarray(cv2.cvtColor(frame, cv2.COLOR_BGR2RGB))
          detections = model.predict(image, threshold=0.5)
          annotated_frame = box_annotator.annotate(frame, detections)
          cv2.imshow("RF-DETR Detection", annotated_frame)
          if cv2.waitKey(1) & 0xFF == ord('q'):  # Press q to quit
              break
      cap.release()
      cv2.destroyAllWindows()
      
    • This will detect objects in the video frame by frame and display them in real time.
  3. Adjusting the resolution
    • The resolution can be set at initialization (must be a multiple of 56):
      model = RFDETRBase(resolution=560)
      
    • Higher resolutions generally improve accuracy but slow down inference.
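Because the resolution must be a multiple of 56, an arbitrary target size has to be rounded before it is passed to the model. A minimal sketch of that rounding follows; `snap_resolution` is a hypothetical helper, not part of rfdetr.

```python
def snap_resolution(requested, multiple=56):
    """Round `requested` to the nearest positive multiple of `multiple`."""
    if requested <= 0:
        raise ValueError("resolution must be positive")
    # Never return 0: the smallest allowed value is one full multiple.
    steps = max(1, round(requested / multiple))
    return steps * multiple


if __name__ == "__main__":
    for target in (560, 600, 20):
        print(target, "->", snap_resolution(target))
```

The result can then be passed at initialization, for example RFDETRBase(resolution=snap_resolution(600)).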

Training customized models

RF-DETR supports fine-tuning on your own dataset. The dataset must be in COCO format and contain three subdirectories: train, valid, and test.
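Before starting a training run, the layout can be checked with a short script. This is a sketch under stated assumptions: `check_coco_layout` is a hypothetical helper, each split is expected to hold a file named _annotations.coco.json, and the three top-level keys it looks for are the ones used by standard COCO detection annotations.

```python
import json
from pathlib import Path

SPLITS = ("train", "valid", "test")
ANNOTATION_FILE = "_annotations.coco.json"


def check_coco_layout(dataset_dir):
    """Return a list of problems found in the dataset layout (empty if none)."""
    root = Path(dataset_dir)
    problems = []
    for split in SPLITS:
        annotation_path = root / split / ANNOTATION_FILE
        if not annotation_path.is_file():
            problems.append(f"missing {annotation_path}")
            continue
        try:
            with annotation_path.open(encoding="utf-8") as handle:
                data = json.load(handle)
        except json.JSONDecodeError:
            problems.append(f"{annotation_path} is not valid JSON")
            continue
        # Standard COCO detection files carry these three top-level keys.
        for key in ("images", "annotations", "categories"):
            if key not in data:
                problems.append(f"{annotation_path} lacks the '{key}' key")
    return problems


if __name__ == "__main__":
    for problem in check_coco_layout("./dataset"):
        print(problem)
```

An empty result means the directory structure matches what the training step expects; it does not validate the annotations themselves.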

  1. Preparing the dataset
    • Example directory structure:
      dataset/
      ├── train/
      │   ├── _annotations.coco.json
      │   ├── image1.jpg
      │   └── image2.jpg
      ├── valid/
      │   ├── _annotations.coco.json
      │   ├── image1.jpg
      │   └── image2.jpg
      └── test/
          ├── _annotations.coco.json
          ├── image1.jpg
          └── image2.jpg
      
    • COCO format datasets can be generated using the Roboflow platform:
      from roboflow import Roboflow
      rf = Roboflow(api_key="YOUR_API_KEY")
      project = rf.workspace("rf-100-vl").project("mahjong-vtacs-mexax-m4vyu-sjtd")
      dataset = project.version(2).download("coco")
      
  2. Start training
    • Sample code:
      from rfdetr import RFDETRBase
      model = RFDETRBase()
      model.train(dataset_dir="./mahjong-vtacs-mexax-m4vyu-sjtd-2", epochs=10, batch_size=4, grad_accum_steps=4, lr=1e-4)
      
    • For training, the recommended total batch size (batch_size * grad_accum_steps) is 16. For example, on an A100 GPU use batch_size=16, grad_accum_steps=1; on a T4 GPU use batch_size=4, grad_accum_steps=4.
  3. Multi-GPU Training
    • Create a main.py file:
      from rfdetr import RFDETRBase
      model = RFDETRBase()
      model.train(dataset_dir="./dataset", epochs=10, batch_size=4, grad_accum_steps=4, lr=1e-4)
      
    • Run in the terminal:
      python -m torch.distributed.launch --nproc_per_node=8 --use_env main.py
      
    • Replace 8 with the number of GPUs you are using. Adjust batch_size so that the total batch size stays the same.
  4. Load training results
    • Two weight files are generated after training: regular weights and EMA weights (more stable). Loading method:
      model = RFDETRBase(pretrain_weights="./output/model_ema.pt")
      detections = model.predict("image.jpg")
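The batch-size advice in steps 2 and 3 can be sketched as a small calculation. `plan_batch` is a hypothetical helper, not part of rfdetr. It assumes a target total of 16, which is what both example settings above (16 × 1 and 4 × 4) multiply to, and it assumes that with several GPUs the total becomes batch_size × grad_accum_steps × number of GPUs.

```python
def plan_batch(per_gpu_batch, num_gpus=1, target_total=16):
    """Pick grad_accum_steps so that the total batch size equals target_total.

    Assumption: total = per_gpu_batch * num_gpus * grad_accum_steps.
    """
    if per_gpu_batch < 1 or num_gpus < 1:
        raise ValueError("per_gpu_batch and num_gpus must be at least 1")
    samples_per_step = per_gpu_batch * num_gpus
    if target_total % samples_per_step != 0:
        raise ValueError(
            f"{samples_per_step} samples per step does not divide {target_total}"
        )
    return {
        "batch_size": per_gpu_batch,
        "grad_accum_steps": target_total // samples_per_step,
    }


if __name__ == "__main__":
    print(plan_batch(16))             # single large GPU
    print(plan_batch(4))              # single small GPU
    print(plan_batch(2, num_gpus=8))  # eight GPUs, two samples each
```

The returned values can be passed straight to model.train as batch_size and grad_accum_steps. A ValueError signals that the chosen per-GPU batch cannot reach the target exactly and should be lowered.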
      

ONNX Export

  • Export to ONNX format for easy deployment on other platforms:
    from rfdetr import RFDETRBase
    model = RFDETRBase()
    model.export()
    
  • The exported file is saved in the output directory and can be used for optimized inference on edge devices.

 

Application Scenarios

  1. Autonomous driving
    RF-DETR detects vehicles and pedestrians on the road in real time. Its low latency and high accuracy suit embedded systems.
  2. Industrial quality control
    RF-DETR quickly identifies part defects on factory assembly lines. The model is lightweight and can run directly on the equipment.
  3. Video surveillance
    RF-DETR processes surveillance video to detect abnormal objects or behavior in real time. It supports video streams and is suitable for 24/7 security monitoring.

 

QA

  1. What dataset formats are supported?
    Only the COCO format is supported. The dataset must contain train, valid, and test subdirectories, each with its own _annotations.coco.json file.
  2. How do I get a Roboflow API key?
    Log in to https://app.roboflow.com, find the API key in your account settings, copy it, and set it as the environment variable ROBOFLOW_API_KEY.
  3. How long does the training take?
    Depends on hardware and dataset size. On a T4 GPU, 10 epochs might take a couple hours. Smaller datasets can be run on a CPU, but it's slow.
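Reading the key from the environment avoids hard-coding it in scripts. A minimal sketch, assuming the key is stored in ROBOFLOW_API_KEY as described in the second answer; `get_roboflow_api_key` is a hypothetical helper, not part of the roboflow package.

```python
import os


def get_roboflow_api_key(env=None):
    """Return the key stored in ROBOFLOW_API_KEY, or raise if it is unset or empty."""
    env = os.environ if env is None else env
    key = env.get("ROBOFLOW_API_KEY", "").strip()
    if not key:
        raise RuntimeError("ROBOFLOW_API_KEY is not set; export it before running")
    return key


if __name__ == "__main__":
    try:
        get_roboflow_api_key()
        print("API key found")
    except RuntimeError as error:
        print(error)
```

The returned value can replace the literal string in the download example, for instance Roboflow(api_key=get_roboflow_api_key()).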
