The following steps are required to install realtime-transcription-fastrtc:
System environment preparation
- Ensure that Python ≥3.10 is installed
- Install ffmpeg for audio processing (available via brew for macOS, apt for Linux, manual configuration for Windows)
- GPU acceleration (MPS or CUDA) is recommended; the tool also runs on CPU, but with lower performance
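The prerequisites above can be checked before installing anything. The following is a minimal sketch, not part of the project itself: the function names `detect_device` and `check_environment` are hypothetical, and the sketch assumes PyTorch is the backend that reports CUDA or MPS availability (if `torch` is not installed yet, it simply reports `cpu`).

```python
import shutil
import sys

MIN_PYTHON = (3, 10)  # minimum version stated in the steps above


def detect_device() -> str:
    """Return "cuda", "mps", or "cpu" based on what PyTorch reports.

    Assumption: the project relies on PyTorch for acceleration. If torch
    is not importable, this sketch falls back to "cpu".
    """
    try:
        import torch
    except ImportError:
        return "cpu"
    if torch.cuda.is_available():
        return "cuda"
    mps = getattr(torch.backends, "mps", None)  # absent in older torch builds
    if mps is not None and mps.is_available():
        return "mps"
    return "cpu"


def check_environment() -> dict:
    """Collect the three prerequisite checks into one report."""
    return {
        "python_ok": sys.version_info[:2] >= MIN_PYTHON,
        "ffmpeg_path": shutil.which("ffmpeg"),  # None if ffmpeg is not on PATH
        "device": detect_device(),
    }


if __name__ == "__main__":
    report = check_environment()
    print(f"Python >= 3.10: {report['python_ok']}")
    print(f"ffmpeg found at: {report['ffmpeg_path']}")
    print(f"Best available device: {report['device']}")
```

A `ffmpeg_path` of `None` means ffmpeg still needs to be installed or added to `PATH`; a device of `cpu` means transcription should work but will be slower.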
Project Deployment Process
- Clone the repository:
git clone https://github.com/sofi444/realtime-transcription-fastrtc
- Create a virtual environment: the uv tool is recommended (or the traditional pip method)
- Install the dependencies: run
uv pip install -r requirements.txt
or the corresponding pip command
- Configure the .env file: set UI_MODE, APP_MODE, MODEL_ID, and other key parameters
Key Configuration Description
- UI_MODE: gradio (simple interface) or fastapi (customizable interface)
- MODEL_ID: openai/whisper-large-v3-turbo is used by default, and can be replaced with other Hugging Face models.
- PORT: Service run port, default 7860
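To illustrate how these settings fit together, here is a small sketch that parses `.env`-style text and applies the defaults listed above. It is an assumption-laden illustration, not the project's own loader: `parse_env_text`, `load_settings`, and `EXAMPLE_ENV` are hypothetical names, and treating UI_MODE as required is this sketch's choice because the source does not state a default for it. APP_MODE is left out because its allowed values are not described here.

```python
ALLOWED_UI_MODES = {"gradio", "fastapi"}

# Defaults taken from the configuration description above.
DEFAULTS = {
    "MODEL_ID": "openai/whisper-large-v3-turbo",
    "PORT": "7860",
}

# Example values only; adjust to your own setup.
EXAMPLE_ENV = """\
# UI_MODE has no documented default, so it is set explicitly here
UI_MODE=fastapi
MODEL_ID=openai/whisper-large-v3-turbo
"""


def parse_env_text(text: str) -> dict[str, str]:
    """Parse simple KEY=VALUE lines, skipping blank lines and # comments."""
    values: dict[str, str] = {}
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        values[key.strip()] = value.strip().strip("\"'")
    return values


def load_settings(values: dict[str, str]) -> dict:
    """Merge parsed values over the defaults and validate UI_MODE."""
    merged = {**DEFAULTS, **values}
    ui_mode = merged.get("UI_MODE", "")
    if ui_mode not in ALLOWED_UI_MODES:
        raise ValueError(
            f"UI_MODE must be one of {sorted(ALLOWED_UI_MODES)}, got {ui_mode!r}"
        )
    return {
        "ui_mode": ui_mode,
        "model_id": merged["MODEL_ID"],
        "port": int(merged["PORT"]),  # PORT is stored as text in .env files
    }


if __name__ == "__main__":
    print(load_settings(parse_env_text(EXAMPLE_ENV)))
```

With the example text, PORT is not set, so the result falls back to 7860; adding a line such as `PORT=8000` overrides it.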
After completing the configuration, run python main.py to start the service, then open the URL displayed in the terminal in a browser.
This answer comes from the article "Open source tool for real-time speech to text".