RF-DETR is an open-source object detection model developed by the Roboflow team. It is built on a Transformer architecture and is designed for real-time efficiency. It is the first real-time detection model to exceed 60 AP on the Microsoft COCO dataset, and it also performs strongly on the RF100-VL benchmark, adapting well to a variety of real-world scenarios. Two versions are available: RF-DETR-base (29 million parameters) and RF-DETR-large (128 million parameters). The model is small enough for deployment on edge devices. The code and pre-trained weights are released under the Apache 2.0 license and are free for community use. Users can obtain both from GitHub for training or deployment.

Feature List
- Real-time object detection: recognizes objects in images or videos with low latency.
- Custom dataset training: supports fine-tuning the model on your own data.
- Edge device support: the model is lightweight and suitable for resource-limited devices.
- Adjustable resolution: users can trade off detection speed against accuracy.
- Pre-trained model support: provides pre-trained weights based on the COCO dataset.
- Video stream processing: analyzes video in real time and outputs the results.
- ONNX export: supports conversion to ONNX format for cross-platform deployment.
- Multi-GPU training: accelerates training with multiple graphics cards.
Usage Guide
The use of RF-DETR is divided into three parts: installation, inference and training. Below are detailed steps to help you get started quickly.
Installation process
- Environment preparation
 Requires Python 3.9 or higher and PyTorch 1.13.0 or higher. If using a GPU, run nvidia-smi to check the driver.
- Install PyTorch (the quotes stop the shell from treating > as a redirect):
    pip install "torch>=1.13.0" "torchvision>=0.14.0"
- Download the code:
    git clone https://github.com/roboflow/rf-detr.git
    cd rf-detr
- Install the dependencies:
    pip install rfdetr
 This automatically installs numpy, supervision, and other required libraries.
- Verify the installation
 Run the following code:
    from rfdetr import RFDETRBase
    print("Installation successful")
If no errors are reported, the installation is complete.
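The requirements above can also be checked before importing anything heavy. The following is a minimal sketch using only the standard library; the function name check_environment and the default package list are illustrative choices, not part of RF-DETR.

```python
import importlib.util
import sys

# Minimum Python version stated in the requirements above.
MIN_PYTHON = (3, 9)


def check_environment(packages=("torch", "torchvision", "rfdetr", "supervision")):
    """Return a dict describing whether the interpreter and packages are ready."""
    report = {"python_ok": sys.version_info[:2] >= MIN_PYTHON}
    for name in packages:
        # find_spec returns None when a top-level package is not installed.
        report[name] = importlib.util.find_spec(name) is not None
    return report


if __name__ == "__main__":
    for key, ok in check_environment().items():
        print(f"{key}: {'OK' if ok else 'MISSING'}")
```

This only confirms that the packages can be found; it does not check their versions or whether a GPU is usable.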
Inference
RF-DETR ships with a model pre-trained on the COCO dataset, so it can detect objects in images or videos out of the box.
- Image detection
- Sample code:
    import io
    import requests
    from PIL import Image
    from rfdetr import RFDETRBase
    import supervision as sv

    # Load the COCO pre-trained base model
    model = RFDETRBase()

    # Download the sample image
    url = "https://media.roboflow.com/notebooks/examples/dog-2.jpeg"
    image = Image.open(io.BytesIO(requests.get(url).content))

    # Run detection with a confidence threshold of 0.5
    detections = model.predict(image, threshold=0.5)
    labels = [
        f"{class_id} {confidence:.2f}"
        for class_id, confidence in zip(detections.class_id, detections.confidence)
    ]

    # Draw boxes and labels on a copy, then display it
    annotated_image = image.copy()
    annotated_image = sv.BoxAnnotator().annotate(annotated_image, detections)
    annotated_image = sv.LabelAnnotator().annotate(annotated_image, detections, labels)
    sv.plot_image(annotated_image)
- This code detects objects in the image, draws each bounding box with its class ID and confidence, and then displays the result.
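The labels in the sample show numeric class IDs, which are hard to read. One way to make them friendlier is to look each ID up in a name table, as sketched below. The helper format_labels and the table example_names are hypothetical and for illustration only; check which IDs the model actually emits before relying on any mapping.

```python
def format_labels(class_ids, confidences, class_names=None):
    """Build one "name confidence" string per detection."""
    class_names = class_names or {}
    labels = []
    for class_id, confidence in zip(class_ids, confidences):
        # Fall back to the raw ID when no name is known for it.
        name = class_names.get(int(class_id), str(int(class_id)))
        labels.append(f"{name} {confidence:.2f}")
    return labels


# Hypothetical ID-to-name table, for illustration only.
example_names = {1: "person", 18: "dog"}
print(format_labels([18, 1, 44], [0.91, 0.663, 0.5], example_names))
# prints ['dog 0.91', 'person 0.66', '44 0.50']
```

The resulting list can be passed as the labels argument in the same way as the list built in the sample code.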
 
- Video detection
- First install opencv-python:
    pip install opencv-python
- Sample code:
    import cv2
    from PIL import Image
    from rfdetr import RFDETRBase
    import supervision as sv

    model = RFDETRBase()
    cap = cv2.VideoCapture("video.mp4")  # replace with the path to your video

    while cap.isOpened():
        ret, frame = cap.read()
        if not ret:
            break
        # OpenCV reads frames as BGR; convert to RGB before prediction
        image = Image.fromarray(cv2.cvtColor(frame, cv2.COLOR_BGR2RGB))
        detections = model.predict(image, threshold=0.5)
        annotated_frame = sv.BoxAnnotator().annotate(frame, detections)
        cv2.imshow("RF-DETR Detection", annotated_frame)
        # Press q to quit
        if cv2.waitKey(1) & 0xFF == ord('q'):
            break

    cap.release()
    cv2.destroyAllWindows()
- This detects objects in the video frame by frame and displays the results in real time.
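Whether the loop really keeps up with the video depends on the hardware, so it can be worth measuring throughput. The sketch below times any per-frame function; measure_fps is a hypothetical helper, and the lambda in the usage example is a stand-in for the predict-and-annotate step, not real model code.

```python
import time


def measure_fps(process_frame, frames, warmup=1):
    """Return the frames per second achieved by process_frame over frames."""
    frames = list(frames)
    # Warm-up calls are excluded from timing, since first calls are often slower.
    for frame in frames[:warmup]:
        process_frame(frame)
    timed = frames[warmup:]
    if not timed:
        raise ValueError("need more frames than warm-up iterations")
    start = time.perf_counter()
    for frame in timed:
        process_frame(frame)
    elapsed = time.perf_counter() - start
    return len(timed) / elapsed if elapsed > 0 else float("inf")


if __name__ == "__main__":
    # Stand-in workload; replace the lambda with your own per-frame function.
    dummy_frames = [[1, 2, 3]] * 100
    print(f"{measure_fps(lambda frame: sum(frame), dummy_frames):.0f} frames per second")
```

A result comfortably above the frame rate of the source video suggests the pipeline can run in real time on that machine.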
 
- Adjusting the resolution
- The resolution can be set at initialization (it must be a multiple of 56):
    model = RFDETRBase(resolution=560)
- Higher resolutions generally improve accuracy but slow down inference.
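Because the resolution must be a multiple of 56, an arbitrary target size has to be rounded first. A minimal sketch of that rounding follows; snap_resolution is a hypothetical helper, not an RF-DETR function.

```python
# The multiple required for the resolution argument, as stated above.
RESOLUTION_MULTIPLE = 56


def snap_resolution(requested, multiple=RESOLUTION_MULTIPLE):
    """Round a requested resolution to the nearest valid multiple."""
    if requested <= 0:
        raise ValueError("resolution must be positive")
    # Integer arithmetic avoids floating-point rounding surprises.
    snapped = (requested + multiple // 2) // multiple * multiple
    # Never return less than one full multiple.
    return max(multiple, snapped)


print(snap_resolution(560))  # prints 560
print(snap_resolution(640))  # prints 616
```

The snapped value can then be passed as the resolution argument when creating the model.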
 
Training customized models
RF-DETR supports fine-tuning on your own dataset. The dataset must be in COCO format and contain three subdirectories: train, valid, and test.
- Preparing the dataset
- Example directory structure:
    dataset/
    ├── train/
    │   ├── _annotations.coco.json
    │   ├── image1.jpg
    │   └── image2.jpg
    ├── valid/
    │   ├── _annotations.coco.json
    │   ├── image1.jpg
    │   └── image2.jpg
    └── test/
        ├── _annotations.coco.json
        ├── image1.jpg
        └── image2.jpg
- COCO-format datasets can be generated using the Roboflow platform:
    from roboflow import Roboflow

    rf = Roboflow(api_key="YOUR_API_KEY")  # replace with your API key
    project = rf.workspace("rf-100-vl").project("mahjong-vtacs-mexax-m4vyu-sjtd")
    dataset = project.version(2).download("coco")
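Before training, it can help to confirm that a dataset folder matches the layout described above. The sketch below checks for the three subdirectories and their annotation files, and that each file contains the standard top-level COCO keys. The function name find_dataset_problems is hypothetical, and the check is deliberately shallow: it does not validate individual annotations.

```python
import json
from pathlib import Path

REQUIRED_SPLITS = ("train", "valid", "test")
ANNOTATION_FILE = "_annotations.coco.json"
# Top-level keys of a COCO detection annotation file.
COCO_KEYS = ("images", "annotations", "categories")


def find_dataset_problems(dataset_dir):
    """Return a list of problems; an empty list means the layout looks right."""
    problems = []
    root = Path(dataset_dir)
    for split in REQUIRED_SPLITS:
        annotation_path = root / split / ANNOTATION_FILE
        if not annotation_path.is_file():
            problems.append(f"missing {split}/{ANNOTATION_FILE}")
            continue
        try:
            data = json.loads(annotation_path.read_text(encoding="utf-8"))
        except json.JSONDecodeError:
            problems.append(f"{split}/{ANNOTATION_FILE} is not valid JSON")
            continue
        for key in COCO_KEYS:
            if key not in data:
                problems.append(f"{split}/{ANNOTATION_FILE} lacks the '{key}' key")
    return problems


if __name__ == "__main__":
    for problem in find_dataset_problems("./dataset"):
        print(problem)
```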
 
- Start training
- Sample code:
    from rfdetr import RFDETRBase

    model = RFDETRBase()
    model.train(
        dataset_dir="./mahjong-vtacs-mexax-m4vyu-sjtd-2",
        epochs=10,
        batch_size=4,
        grad_accum_steps=4,
        lr=1e-4,
    )
- For training, the recommended total batch size (batch_size * grad_accum_steps) is 16. For example, use batch_size=16, grad_accum_steps=1 on an A100 GPU, or batch_size=4, grad_accum_steps=4 on a T4 GPU.
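Both example settings multiply out to a total batch size of 16 (16 * 1 and 4 * 4). Given the largest batch_size that fits in GPU memory, the matching grad_accum_steps can be worked out as sketched below. The helper grad_accum_steps_for is hypothetical, and the target of 16 is taken from those two examples.

```python
# Total batch size implied by the two example settings above.
TARGET_TOTAL_BATCH = 16


def grad_accum_steps_for(batch_size, target_total=TARGET_TOTAL_BATCH):
    """Return the accumulation steps that reach target_total with batch_size."""
    if batch_size <= 0:
        raise ValueError("batch_size must be positive")
    if target_total % batch_size != 0:
        raise ValueError(f"{batch_size} does not divide {target_total} evenly")
    return target_total // batch_size


print(grad_accum_steps_for(16))  # prints 1
print(grad_accum_steps_for(4))   # prints 4
print(grad_accum_steps_for(2))   # prints 8
```

Gradient accumulation trades training speed for memory: a smaller batch_size needs less GPU memory per step but takes more steps per weight update.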
 
- Multi-GPU training
- Create a main.py file:
    from rfdetr import RFDETRBase

    model = RFDETRBase()
    model.train(dataset_dir="./dataset", epochs=10, batch_size=4, grad_accum_steps=4, lr=1e-4)
- Run it in the terminal:
    python -m torch.distributed.launch --nproc_per_node=8 --use_env main.py
- Replace 8 with the number of GPUs you are using. Adjust batch_size so that the total batch size stays the same.
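Keeping the total batch size stable across GPUs is a small arithmetic exercise. The sketch below assumes that each GPU process uses batch_size on its own, so the effective total is num_gpus * batch_size * grad_accum_steps; that assumption, the helper name multi_gpu_plan, and the default target of 16 are not confirmed by the text above and should be checked against your setup.

```python
def multi_gpu_plan(num_gpus, batch_size, target_total=16):
    """Return train() settings that keep the effective batch at target_total.

    Assumes the effective batch is num_gpus * batch_size * grad_accum_steps.
    """
    if num_gpus <= 0 or batch_size <= 0:
        raise ValueError("num_gpus and batch_size must be positive")
    per_step = num_gpus * batch_size
    if target_total % per_step != 0:
        raise ValueError(
            f"{num_gpus} GPUs x batch_size {batch_size} cannot reach {target_total} evenly"
        )
    return {"batch_size": batch_size, "grad_accum_steps": target_total // per_step}


print(multi_gpu_plan(num_gpus=2, batch_size=4))
# prints {'batch_size': 4, 'grad_accum_steps': 2}
print(multi_gpu_plan(num_gpus=8, batch_size=2))
# prints {'batch_size': 2, 'grad_accum_steps': 1}
```

Under this assumption, running batch_size=4 and grad_accum_steps=4 unchanged on 8 GPUs would give an effective batch of 128, which is why batch_size needs adjusting.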
 
- Load training results
- Training produces two weight files: regular weights and EMA weights (more stable). To load the EMA weights:
    from rfdetr import RFDETRBase

    model = RFDETRBase(pretrain_weights="./output/model_ema.pt")
    detections = model.predict("image.jpg")
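Since the EMA weights are described as more stable, a script can prefer them and fall back to the regular weights when they are absent. In the sketch below, model_ema.pt comes from the example above, while the regular file name model.pt and the helper pick_checkpoint are assumptions; check the actual file names in your output directory.

```python
from pathlib import Path


def pick_checkpoint(output_dir, ema_name="model_ema.pt", regular_name="model.pt"):
    """Return the EMA checkpoint if present, otherwise the regular one."""
    root = Path(output_dir)
    # Order matters: the EMA file is tried first.
    for name in (ema_name, regular_name):
        candidate = root / name
        if candidate.is_file():
            return candidate
    raise FileNotFoundError(f"no checkpoint found in {root}")


if __name__ == "__main__":
    try:
        print(pick_checkpoint("./output"))
    except FileNotFoundError as error:
        print(error)
```

The returned path can be converted with str() and passed as pretrain_weights.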
 
ONNX Export
- Export to ONNX format for deployment on other platforms:
    from rfdetr import RFDETRBase

    model = RFDETRBase()
    model.export()
- The exported file is saved in the output directory and can be used for optimized inference on edge devices.
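The name of the exported file is not stated here, so a deployment script may need to search for it. The sketch below returns the most recently modified .onnx file in a directory; latest_onnx_file is a hypothetical helper, and the default directory name follows the description above.

```python
from pathlib import Path


def latest_onnx_file(output_dir="output"):
    """Return the most recently modified .onnx file in output_dir, or None."""
    candidates = sorted(
        Path(output_dir).glob("*.onnx"),
        key=lambda path: path.stat().st_mtime,
    )
    return candidates[-1] if candidates else None


if __name__ == "__main__":
    exported = latest_onnx_file()
    print(exported if exported else "no .onnx file found in ./output")
```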
Application scenarios
- Autonomous driving
 RF-DETR detects vehicles and pedestrians on the road in real time. Its low latency and high accuracy make it suitable for embedded systems.
- Industrial quality control
 RF-DETR quickly identifies part defects on factory assembly lines. The model is lightweight and can run directly on the equipment.
- Video surveillance
 RF-DETR processes surveillance video to detect abnormal objects or behavior in real time. It supports video streams and is suitable for 24/7 security monitoring.
QA
- What dataset formats are supported?
 Only the COCO format is supported. The dataset must contain train, valid, and test subdirectories, each with a corresponding _annotations.coco.json file.
- How do I get a Roboflow API key?
 Log in to https://app.roboflow.com, find the API key in your account settings, copy it, and set it as the environment variable ROBOFLOW_API_KEY.
- How long does training take?
 It depends on the hardware and dataset size. On a T4 GPU, 10 epochs might take a couple of hours. Smaller datasets can be trained on a CPU, but it is slow.
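For the API key question above, reading the key from the ROBOFLOW_API_KEY environment variable keeps it out of source code. A minimal sketch follows; get_roboflow_api_key is a hypothetical helper, and the env parameter exists only to make the function easy to try out.

```python
import os


def get_roboflow_api_key(env=None):
    """Return the API key from ROBOFLOW_API_KEY, or raise if it is not set."""
    env = os.environ if env is None else env
    key = env.get("ROBOFLOW_API_KEY", "").strip()
    if not key:
        raise RuntimeError(
            "ROBOFLOW_API_KEY is not set; copy the key from your Roboflow account settings"
        )
    return key


if __name__ == "__main__":
    try:
        get_roboflow_api_key()
        print("API key found")
    except RuntimeError as error:
        print(error)
```

The returned value can be passed as the api_key argument when creating the Roboflow client.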






























