ShortGPT is an open-source AI framework for automating video content creation. It streamlines the entire production process, including script writing, footage gathering, speech synthesis, subtitle generation, and video editing. The framework uses a large language model (LLM) to understand and execute editing commands, can automatically find images and video clips on the Internet, and integrates text-to-speech services such as ElevenLabs and Microsoft EdgeTTS to generate natural-sounding narration. ShortGPT is designed to help content creators, especially those running automated channels on platforms like YouTube and TikTok, produce videos in bulk quickly and efficiently. It offers separate engines for short and long videos, as well as a module dedicated to translating and dubbing existing videos.

Function List
- Automated editing framework: Streamlines the video creation process with a video editing language designed for large language models (LLMs).
- Multilingual dubbing: Integrates with ElevenLabs and Microsoft EdgeTTS for speech synthesis in more than 30 languages, generating natural-sounding narration.
- Online material access: Automatically fetches video footage from sites such as Pexels and searches Bing Images for pictures to provide visual material for videos.
- Automatic subtitle generation: Automatically generates subtitles and adds them to produced videos.
- Video translation and dubbing: Provides a specialized translation engine that transcribes the content of a video (from a file or YouTube link), translates it, re-dubs it in the target language, and generates a new version of the video in that language.
- Scripts and prompts: The framework includes a variety of built-in scripts and prompts that can be used directly for different automated video editing tasks.
- Customization options: Users can customize the output to their needs, for example by choosing a voice-over language or adding their own watermark to the video.
- Data persistence: Uses TinyDB to save variables and settings from the automated editing process so that they persist over time.
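To make the subtitle feature listed above more concrete, here is a minimal sketch of turning timed narration segments into SubRip (SRT) caption text. It assumes the segments are already available as start time, end time, and text; the `Segment` class and both helper functions are hypothetical illustrations, not ShortGPT's own API, and the framework may use a different subtitle format internally.

```python
from dataclasses import dataclass


@dataclass
class Segment:
    """One timed caption: start/end in seconds plus the spoken text."""
    start: float
    end: float
    text: str


def format_timestamp(seconds: float) -> str:
    """Convert seconds to the SRT timestamp form HH:MM:SS,mmm."""
    total_ms = round(seconds * 1000)
    hours, rem = divmod(total_ms, 3_600_000)
    minutes, rem = divmod(rem, 60_000)
    secs, ms = divmod(rem, 1000)
    return f"{hours:02d}:{minutes:02d}:{secs:02d},{ms:03d}"


def to_srt(segments: list[Segment]) -> str:
    """Render segments as numbered SRT cues separated by blank lines."""
    cues = []
    for index, seg in enumerate(segments, start=1):
        cues.append(
            f"{index}\n"
            f"{format_timestamp(seg.start)} --> {format_timestamp(seg.end)}\n"
            f"{seg.text}\n"
        )
    return "\n".join(cues)


if __name__ == "__main__":
    # Hypothetical narration timings, for illustration only.
    demo = [
        Segment(0.0, 2.5, "Welcome to the channel."),
        Segment(2.5, 5.0, "Today we look at three facts."),
    ]
    print(to_srt(demo))
```

In a real pipeline the timings would come from the speech synthesis or transcription step rather than being typed in by hand.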
Usage Guide
ShortGPT is a powerful AI video automation framework that you can use in two main ways: running it on Google Colab or locally via a Docker environment. For beginners or users who don't want to configure a complex environment on their own computer, Google Colab is officially recommended.
Method 1: Use Google Colab (Recommended)
This is the easiest and fastest way to do it without installing any dependencies locally.
- Open the Colab notebook: First, you need a Google account. Then open the official Google Colab link:
https://colab.research.google.com/drive/1_2UKdpFqxCqWaAcZb3rwMVQqtbisdE?usp=sharing
- Run the code cells in order: When the page opens, you will see a series of code cells. Run each cell in turn, from top to bottom, by clicking the "Play" button to the left of the cell or by selecting the cell and pressing Shift+Enter.
- Configure the API keys: During execution, the notebook asks for API keys for services such as OpenAI and ElevenLabs. Register with these services and obtain the keys in advance, then enter them in the corresponding input boxes.
- Launch the web interface: Once all cells have run successfully, a public link to the Gradio interface is generated. Click this link to use ShortGPT's GUI in your browser.
Method 2: Run locally with Docker
If you want to run ShortGPT on your own computer with more control, you can use Docker. This approach requires a basic understanding of the command line and Docker.
- Install Docker: First, make sure Docker is installed on your computer. You can download the version for your operating system (Windows, macOS, or Linux) from the Docker website and complete the installation.
- Download the ShortGPT project files:
  - Open a command line tool (terminal).
  - Use git to clone the project repository:
    git clone https://github.com/RayVentura/ShortGPT.git
  - Go to the project directory:
    cd ShortGPT
- Configure environment variables:
  - In the project root directory, find the file named .env.example.
  - Make a copy of this file and rename it .env.
  - Open the .env file and fill in your own API keys, e.g. OPENAI_API_KEY and ELEVENLABS_API_KEY.
- Build and run the Docker container:
  - In the project root directory, execute the following command to build the Docker image. This may take some time because it downloads and installs all the dependencies.
    docker build -t short_gpt_docker:latest .
  - Once the build is complete, use the following command to run the container:
    docker run -p 31415:31415 --env-file .env short_gpt_docker:latest
- Access the web interface: After the container has started successfully, open your browser and visit http://localhost:31415. You will see the same Gradio interface as in the Colab version and can start creating videos.
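Before starting the container, it can help to confirm that the .env file actually contains the keys you intend to pass in. The following is a small sketch of such a check, not part of ShortGPT. The two key names come from the example above; treating both as required is an assumption, since ElevenLabs is optional if you use EdgeTTS. The parser handles only plain KEY=VALUE lines and # comments.

```python
from pathlib import Path

# Key names taken from the example in the text; treating both as
# required is an assumption, so adjust the tuple to your own setup.
REQUIRED_KEYS = ("OPENAI_API_KEY", "ELEVENLABS_API_KEY")


def parse_env(text: str) -> dict[str, str]:
    """Parse simple KEY=VALUE lines, skipping blanks and # comments."""
    values = {}
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        # Quotes are stripped only so that empty placeholders like ""
        # are detected as missing.
        values[key.strip()] = value.strip().strip('"').strip("'")
    return values


def missing_keys(text: str, required=REQUIRED_KEYS) -> list[str]:
    """Return the required keys that are absent or have an empty value."""
    env = parse_env(text)
    return [key for key in required if not env.get(key)]


if __name__ == "__main__":
    env_path = Path(".env")
    if not env_path.exists():
        print("No .env file found; copy .env.example to .env first.")
    else:
        missing = missing_keys(env_path.read_text(encoding="utf-8"))
        print("Missing keys:", ", ".join(missing) if missing else "none")
```

Run the script from the project root directory so that it picks up the same .env file that the docker run command reads.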
Core Function Operation Flow
ShortGPT divides the different video creation tasks into three main engines:
- ContentShortEngine (short video engine): Designed for YouTube Shorts or TikTok style videos. Its workflow is usually: receive a topic or script -> generate narration audio -> automatically search for matching background video clips or images -> combine the footage and audio into a short video -> automatically add subtitles -> optionally generate metadata such as the video's title and description.
- ContentVideoEngine (long video engine): Used to produce standard-length videos. Its process is similar to the short video engine's, but it focuses on handling longer scripts, generating longer audio, and aligning footage and subtitles over a longer timeline.
- ContentTranslationEngine (video translation engine): Takes an existing video file or YouTube link, automatically transcribes the speech in the video into text, translates the text into the target language of your choice, synthesizes a new dub in that language, and generates a version of the video with the new dub and translated subtitles.
In the web interface, you can choose which engine to use according to your needs and enter the appropriate information (e.g., video theme, language, dubbing style, etc.) according to the prompts, then start the task and wait for the AI to finish the video.
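The short video workflow described above is essentially a fixed sequence of stages, each adding something to a shared job state. The sketch below illustrates that idea only. Every function name and placeholder value is hypothetical and not taken from ShortGPT's code; a real engine would call an LLM, a text-to-speech service, and a footage search at the marked points.

```python
from typing import Callable

Context = dict[str, object]
Step = Callable[[Context], Context]


def write_script(ctx: Context) -> Context:
    # Placeholder: a real engine would ask an LLM for a script.
    return {**ctx, "script": f"A short script about {ctx['topic']}."}


def synthesize_narration(ctx: Context) -> Context:
    # Placeholder: a real engine would call a TTS service here.
    return {**ctx, "audio": f"narration-{ctx['language']}.wav"}


def gather_footage(ctx: Context) -> Context:
    # Placeholder: a real engine would search a stock-footage site.
    return {**ctx, "clips": [f"{ctx['topic']}-clip-{i}.mp4" for i in range(2)]}


def compose_video(ctx: Context) -> Context:
    # Placeholder: combine clips and narration into one file.
    return {**ctx, "video": "draft.mp4"}


def add_subtitles(ctx: Context) -> Context:
    # Placeholder: burn captions into the composed video.
    return {**ctx, "subtitled": True}


def write_metadata(ctx: Context) -> Context:
    # Placeholder: derive a title from the topic.
    return {**ctx, "title": f"{str(ctx['topic']).title()} in 60 seconds"}


# Stage order mirrors the workflow listed for the short video engine.
SHORT_VIDEO_STEPS: list[Step] = [
    write_script,
    synthesize_narration,
    gather_footage,
    compose_video,
    add_subtitles,
    write_metadata,
]


def run_pipeline(steps: list[Step], ctx: Context) -> Context:
    """Run each stage in order, passing the growing job state along."""
    for step in steps:
        ctx = step(ctx)
    return ctx


if __name__ == "__main__":
    job = run_pipeline(SHORT_VIDEO_STEPS, {"topic": "ocean facts", "language": "en"})
    for key, value in job.items():
        print(f"{key}: {value}")
```

Modeling the stages as a list makes it easy to see how a long video or translation workflow could reuse some stages and swap others, for example replacing script writing with transcription and translation.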
Application Scenarios
- Automating social media content
Users who need to publish short videos continuously on platforms such as YouTube Shorts, TikTok, or Instagram Reels can give ShortGPT a theme and let it automate script generation, material collection, dubbing, and editing, which saves considerable time and labor.
- Multilingual content distribution
A video creator who wants to reach audiences in other languages can use the ContentTranslationEngine. Provide a link to the original video, and ShortGPT generates versions in other languages, such as Spanish, French, or Japanese, with matching voice-overs and subtitles.
- Quickly creating informational videos
For knowledge explainers, product introductions, or news broadcast videos produced in large numbers, creators can supply only the script and let ShortGPT match it with visual material and generate narration, turning text into a finished video.
- Localizing video content
Enterprises and educational institutions often need to localize training materials or promotional videos for global markets. ShortGPT can translate and dub these videos into the language of the target market, reducing the cost and complexity of localization.
QA
- Is ShortGPT free?
ShortGPT itself is an open-source framework, so the software is free to use. However, it calls third-party API services during operation, such as OpenAI (GPT models) for script generation and ElevenLabs for speech synthesis, which may charge fees. It also supports Microsoft's free EdgeTTS voice service as an alternative.
- Do I need to know how to program to use ShortGPT?
Not necessarily. With the officially recommended Google Colab method, you do not need to write code; you only run the cells in order and fill in the required information. If you install locally via Docker, some basic knowledge of command line operations is required.
- What languages does ShortGPT support?
With the help of speech synthesis services such as ElevenLabs and EdgeTTS, ShortGPT supports voice-overs and content creation in more than 30 languages, including English, Spanish, French, German, Chinese, Japanese, Korean, and Hindi.
- Are there any copyright issues with the generated video footage?
ShortGPT obtains its videos and images primarily from websites that offer free footage, such as Pexels. Content on these platforms usually allows both commercial and non-commercial use, but users should still check the license for each piece of footage before use to avoid copyright risks.






























