FastDeploy is an open-source tool developed by the PaddlePaddle team that focuses on rapid deployment of deep learning models. It supports a wide range of hardware and frameworks, covers more than 20 scenarios including image, video, text, and speech, and includes more than 150 mainstream models. FastDeploy provides out-of-the-box deployment solutions for production environments, simplifying development and improving inference performance. It supports deployment from the cloud to mobile and edge devices, making it suitable for enterprises and developers who want to bring AI applications online quickly. The project is licensed under Apache-2.0, has an active community, and is well documented.

Function List

  • Multi-hardware support: NVIDIA GPU, Kunlun XPU, Ascend NPU, RK3588, and more, adapting to a wide range of chips.
  • Broad model coverage: more than 20 scenarios such as image classification, object detection, OCR, and speech synthesis, with 150+ supported models.
  • Efficient inference acceleration: quantization (e.g. W8A16, FP8), speculative decoding, multi-token prediction, and other techniques.
  • Production-ready serving: compatible with vLLM and the OpenAI API protocol, simplifying service deployment (see the client sketch after this list).
  • Visual deployment: together with VisualDL, supports model configuration changes, performance monitoring, and service management.
  • Cross-platform deployment: cloud, mobile, edge devices, and the web.
  • Flexible compilation options: developers can select backend modules as needed to reduce resource consumption.
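
Because the serving layer is described as OpenAI API compatible, a standard OpenAI client can be pointed at a FastDeploy service. The sketch below is illustrative only: the base URL, API key, and model name are placeholders, not values taken from the FastDeploy documentation.

from openai import OpenAI

# Point the standard OpenAI Python client at a locally running FastDeploy service.
# base_url, api_key and model are placeholder values.
client = OpenAI(base_url="http://localhost:8000/v1", api_key="none")

response = client.chat.completions.create(
    model="my-model",
    messages=[{"role": "user", "content": "Hello, FastDeploy!"}],
)
print(response.choices[0].message.content)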

Usage Help

Installation process

FastDeploy can be installed for Python or C++, depending on your development needs. The following example covers a Python installation on an Ubuntu system. Make sure Python 3.6+ and the required dependencies are installed.

  1. Preparing the environment
    Install the necessary dependencies:

    sudo apt update
    sudo apt install -y python3 python3-dev python3-pip gcc python3-opencv python3-numpy
    

It is recommended to isolate dependencies using a virtual environment such as conda:

conda create -n fastdeploy python=3.8
conda activate fastdeploy
  2. Installing PaddlePaddle
    FastDeploy depends on the PaddlePaddle framework; install the development (nightly) build:

    python -m pip install paddlepaddle==0.0.0 -f https://www.paddlepaddle.org.cn/whl/linux/cpu-mkl/develop.html
    
  3. Installing FastDeploy
    Pre-compiled packages can be installed via pip:

    pip install fastdeploy-python -f https://www.paddlepaddle.org.cn/whl/fastdeploy.html
    

    or compile from source:

    git clone https://github.com/PaddlePaddle/FastDeploy.git
    cd FastDeploy/python
    export ENABLE_ORT_BACKEND=ON
    export ENABLE_PADDLE_BACKEND=ON
    export ENABLE_VISION=ON
    python setup.py build
    python setup.py bdist_wheel
    pip install dist/fastdeploy_python-*-linux_x86_64.whl
    

    If you are compiling for a device such as the RK3588, you also need to set ENABLE_RKNPU2_BACKEND=ON and RKNN2_TARGET_SOC=RK3588.

  4. Verify Installation
    After installation, run the sample code to verify:

    import fastdeploy
    print(fastdeploy.__version__)
    

Feature walkthrough

1. Model deployment

FastDeploy supports one-line deployment of many models. Taking object detection as an example, run a PaddleDetection model:

import cv2
from fastdeploy.vision import detection

# Load the exported PP-YOLOE model (model, params and preprocessing config).
model = detection.PPYOLOE(
    model_file="ppyoloe_crn_l_300e_coco/model.pdmodel",
    params_file="ppyoloe_crn_l_300e_coco/model.pdiparams",
    config_file="ppyoloe_crn_l_300e_coco/infer_cfg.yml"
)

# predict() expects a decoded image (numpy array), not a file path.
im = cv2.imread("000000014439.jpg")
result = model.predict(im)
print(result)

You first need to download and extract the model archive (e.g. ppyoloe_crn_l_300e_coco.tgz), which is available from the official link; a minimal download-and-extract sketch follows.
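
Below is a minimal sketch of that step using only the Python standard library. The URL here is a placeholder, not the official download link.

import tarfile
import urllib.request

# Placeholder URL: replace with the official download link for the model archive.
url = "https://example.com/ppyoloe_crn_l_300e_coco.tgz"
archive = "ppyoloe_crn_l_300e_coco.tgz"

urllib.request.urlretrieve(url, archive)   # download the archive
with tarfile.open(archive) as tar:
    tar.extractall(".")                    # unpacks model.pdmodel, model.pdiparams, infer_cfg.yml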

2. Hardware adaptation

FastDeploy supports deployment on multiple hardware targets. For example, on the RK3588:

cd demos/vision/detection/paddledetection/rknpu2/python
python infer.py --model_file picodet_s_416_coco_lcnet_rk3588.rknn \
--config_file picodet_s_416_coco_lcnet/infer_cfg.yml \
--image 000000014439.jpg

Ensure that the device has the corresponding driver installed (e.g. rknpu2).
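
In the Python API, the target hardware is usually selected through a RuntimeOption that is passed to the model. Below is a minimal sketch reusing the PP-YOLOE files from the deployment example; use_rknpu2 is shown for the RK3588 case and may differ depending on how your package was compiled.

import fastdeploy as fd
from fastdeploy.vision import detection

option = fd.RuntimeOption()
option.use_gpu(0)        # run on NVIDIA GPU 0; use option.use_cpu() for CPU-only machines
# option.use_rknpu2()    # on an RK3588 build, select the RKNPU2 backend instead

model = detection.PPYOLOE(
    model_file="ppyoloe_crn_l_300e_coco/model.pdmodel",
    params_file="ppyoloe_crn_l_300e_coco/model.pdiparams",
    config_file="ppyoloe_crn_l_300e_coco/infer_cfg.yml",
    runtime_option=option
)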

3. Inference acceleration

FastDeploy offers several acceleration techniques, for example W8A16 quantization:

model.enable_quantization("W8A16")

or enable speculative decoding:

model.enable_speculative_decoding()

These features can significantly improve inference speed and are suitable for scenarios with demanding performance requirements; you can quantify the gain on your own workload by timing predictions, as in the sketch below.
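
A minimal sketch for measuring average latency with the standard library, assuming the model and im objects from the deployment example above:

import time

# Warm up once so one-off initialization cost is not counted.
model.predict(im)

runs = 50
start = time.perf_counter()
for _ in range(runs):
    model.predict(im)
elapsed = time.perf_counter() - start
print(f"average latency: {elapsed / runs * 1000:.2f} ms")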

4. Visual deployment

In conjunction with VisualDL, users can manage models through a web interface:

  • Start the VisualDL service:
    visualdl --model-dir model_path --host 0.0.0.0 --port 8040
    
  • Open http://localhost:8040 in a browser to adjust model configurations and monitor performance (see the logging sketch below for recording custom metrics).
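
If you also want to record your own inference metrics and inspect them in VisualDL, its Python API can write scalars to a log directory. A minimal sketch; the log directory and tag name are illustrative, and model and im come from the deployment example above:

import time
from visualdl import LogWriter

# Write latency samples to ./vdl_log; start VisualDL with --logdir ./vdl_log to view them.
with LogWriter(logdir="./vdl_log") as writer:
    for step in range(20):
        start = time.perf_counter()
        model.predict(im)
        latency_ms = (time.perf_counter() - start) * 1000
        writer.add_scalar(tag="infer/latency_ms", step=step, value=latency_ms)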

5. Documentation support

FastDeploy provides detailed documentation in the <FastDeploy>/docs directory of the GitHub repository. Useful references include:

  • Model support list: <FastDeploy>/docs/supported_models.md
  • Hardware adaptation guide: <FastDeploy>/docs/cn/build_and_install

Caveats

  • Ensure that hardware drivers are installed, such as the rknpu2 driver for the RK3588.
  • Insufficient memory may cause compilation to fail; on the RK3588 it is recommended to add a swap partition of at least 4 GB.
  • The project is updated frequently, so it is recommended to check GitHub regularly for the latest version.

Application scenarios

  1. Intelligent security
    FastDeploy deploys object detection and face recognition models for surveillance systems. Developers can quickly run PaddleDetection models on edge devices such as the RK3588 to detect anomalous behavior in real time.
  2. Smart retail
    Supports OCR and image classification models for foot traffic counting and product identification. With FastDeploy, retailers can deploy models on mobile devices to analyze customer behavior.
  3. Industrial automation
    Deploy image segmentation models on production lines with FastDeploy to inspect product quality. Multi-hardware adaptation suits complex factory environments.
  4. Voice interaction
    Deploy speech synthesis models for intelligent customer service or voice assistants. FastDeploy's multi-token prediction improves the speed of speech generation.

FAQ

  1. What hardware does FastDeploy support?
    NVIDIA GPU, Kunlun XPU, Ascend NPU, RK3588, Iluvatar GPU, and more are supported. Some hardware, such as MetaX GPU, is still being adapted.
  2. How do I switch inference backends?
    Enable the desired backend at compile time with environment variables (e.g. ENABLE_ORT_BACKEND=ON), or select it in code through a RuntimeOption (see the sketch after this list).
  3. Does FastDeploy support web deployment?
    Yes. Web and mini-program deployment is supported via Paddle.js; see <FastDeploy>/docs/web_deployment.md.
  4. What should I do if I run out of memory?
    Limit the number of parallel build jobs at compile time (e.g. python setup.py build -j 4), or add a swap partition to the device.
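
A minimal sketch of selecting the backend in code through a RuntimeOption; the option is then passed as runtime_option when constructing the model, as in the hardware adaptation example above:

import fastdeploy as fd

option = fd.RuntimeOption()
option.use_paddle_backend()   # Paddle Inference
# option.use_ort_backend()    # ONNX Runtime
# option.use_trt_backend()    # TensorRT (GPU builds only)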