SkyReels-V2 is an open-source video generation model developed by SkyworkAI. It uses a Diffusion Forcing technique to generate videos of unlimited length for both text-to-video (T2V) and image-to-video (I2V) tasks. Users can generate high-quality, cinematic video content from text descriptions or input images. The model is well regarded in the open-source community, with reported performance comparable to commercial models such as Kling and Runway-Gen4. It provides flexible inference modes suitable for developers, creators, and researchers. The code is publicly available on GitHub, and the model weights are hosted on Hugging Face, for easy download and deployment.

Feature List
- Unlimited-length video generation: Generates videos of any length, from short films to full-length movies.
- Text-to-video (T2V): Generates video content that matches a text prompt.
- Image-to-video (I2V): Generates dynamic video from an input image while preserving the image's characteristics.
- Multimodal support: Combines a multimodal large language model (MLLM) with reinforcement learning to improve video generation quality.
- Story generation: Automatically generates video storyboards that follow the narrative logic.
- Camera control: Provides a director's perspective with customizable camera angles and movements.
- Multi-subject coherence: Keeps multi-character videos visually consistent by using the SkyReels-A2 system.
- Efficient inference framework: Supports multi-GPU inference to optimize generation speed and resource usage.
Usage Guide
Installation process
SkyReels-V2 is a Python-based open-source project that must be set up locally or on a server. The detailed installation steps are as follows:
- Clone the repository
 Open a terminal and run the following commands to get the SkyReels-V2 code:
     git clone https://github.com/SkyworkAI/SkyReels-V2
     cd SkyReels-V2
- Create a virtual environment
 Creating a virtual environment with Python 3.10 (3.10.12 is recommended) helps avoid dependency conflicts:
     conda create -n skyreels-v2 python=3.10
     conda activate skyreels-v2
- Install the dependencies
 Install the Python libraries the project needs:
     pip install -r requirements.txt
- Download the model weights
 The model weights for SkyReels-V2 are hosted on Hugging Face. Download them using the following commands:
     pip install -U "huggingface_hub[cli]"
     huggingface-cli download Skywork/SkyReels-V2 --local-dir ./models
 Make sure you have enough disk space (the models can be tens of gigabytes).
- Hardware requirements
 - Minimum configuration: A single RTX 4090 (24 GB VRAM), using FP8 quantization to reduce memory requirements.
 - Recommended configuration: Multiple GPUs (e.g., 4-8 A100s) for efficient parallel inference.
 - At least 32 GB of system memory and 100 GB of disk space.
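Before downloading the weights, it can help to confirm that the target drive meets the disk requirement listed above. The following is a minimal sketch using only the Python standard library; the helper names free_disk_gb and has_enough_disk are hypothetical and not part of SkyReels-V2, and the 100 GB figure is simply the one quoted in this guide.

```python
import shutil

# Disk requirement quoted in the hardware requirements above.
REQUIRED_DISK_GB = 100


def free_disk_gb(path="."):
    """Return the free space, in GiB, on the filesystem containing `path`."""
    usage = shutil.disk_usage(path)
    return usage.free / (1024 ** 3)


def has_enough_disk(path=".", required_gb=REQUIRED_DISK_GB):
    """Return True if `path` has at least `required_gb` GiB free."""
    return free_disk_gb(path) >= required_gb


if __name__ == "__main__":
    free = free_disk_gb(".")
    status = "OK" if has_enough_disk(".") else "insufficient"
    print(f"Free disk space: {free:.1f} GiB ({status}, need {REQUIRED_DISK_GB} GiB)")
```

Run it from the directory where the models will be stored, since the check applies to whichever filesystem contains the given path.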
 
Usage
SkyReels-V2 provides two main functions: text-to-video (T2V) and image-to-video (I2V). The steps for each are described below.
Text to video (T2V)
- Prepare the prompt
 Write a text prompt describing the content of the video, for example: A serene lake surrounded by towering mountains, with swans gliding across the water.
 A negative prompt can be added to avoid unwanted elements, for example: low quality, deformation, bad composition
- Run the generation script
 Modify the parameters of generate_video.py to set the resolution, frame rate, and so on:
     python generate_video.py --model_id "Skywork/SkyReels-V2-T2V-14B-540P" --prompt "A serene lake surrounded by mountains" --num_frames 97 --fps 24 --outdir ./output
 - --model_id: Selects the model (e.g., the 540P or 720P variant).
 - --num_frames: Sets the number of frames to generate (default 97).
 - --fps: Sets the frame rate (default 24).
 - --outdir: Sets the directory where the output video is saved.
 
- View the output
 The generated video is saved in MP4 format, e.g., output/serene_lake_42_0.mp4.
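For scripted or batch use, the T2V command above can be assembled in Python rather than typed by hand. This is a sketch only: build_t2v_command and clip_seconds are hypothetical helpers, and the flag names and defaults are taken from this guide, so they should be checked against the repository before use. The second helper shows the arithmetic linking --num_frames and --fps: 97 frames at 24 fps is roughly 4 seconds of video.

```python
import shlex


def build_t2v_command(prompt, model_id="Skywork/SkyReels-V2-T2V-14B-540P",
                      num_frames=97, fps=24, outdir="./output"):
    """Assemble the argument list for generate_video.py (flags as listed in this guide)."""
    return [
        "python", "generate_video.py",
        "--model_id", model_id,
        "--prompt", prompt,
        "--num_frames", str(num_frames),
        "--fps", str(fps),
        "--outdir", outdir,
    ]


def clip_seconds(num_frames, fps):
    """Playback length of a clip in seconds: frame count divided by frame rate."""
    return num_frames / fps


if __name__ == "__main__":
    cmd = build_t2v_command("A serene lake surrounded by mountains")
    print(shlex.join(cmd))  # a copy-pasteable shell command
    print(f"Clip length: {clip_seconds(97, 24):.2f} s")
    # To launch it from the repository root:
    # import subprocess; subprocess.run(cmd, check=True)
```

Passing the arguments as a list, rather than as one formatted string, avoids quoting problems when the prompt contains spaces or quotation marks.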
Image to video (I2V)
- Prepare the input image
 Provide a high-quality image (e.g., PNG or JPG) whose resolution matches the model (960×544 by default).
- Run the generation script
 Specify the image path when calling generate_video.py:
     python generate_video.py --model_id "Skywork/SkyReels-V2-I2V-14B-540P" --prompt "A warrior fighting in a forest" --image ./input_image.jpg --num_frames 97 --fps 24 --outdir ./output
 - --image: Path to the input image.
 - The other parameters are the same as for T2V.
 
- Optimize the settings
 - Use --guidance_scale (default 6.0) to adjust how strongly the text prompt guides generation.
 - Use --inference_steps (default 30) to control generation quality: more steps give higher quality but take longer.
 - Use --offload to reduce memory usage on devices with limited VRAM.
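Because the input image is expected to match the model's resolution, a small check before launching I2V can catch mismatches early. The sketch below assumes that the model id ends with its resolution label, as the ids shown in this guide do; the resolutions are the two this guide lists, and the helper names are hypothetical.

```python
# Resolutions stated in this guide for the two model variants.
RESOLUTIONS = {"540P": (960, 544), "720P": (1280, 720)}


def expected_resolution(model_id):
    """Look up (width, height) from the resolution suffix of a model id."""
    for suffix, size in RESOLUTIONS.items():
        if model_id.upper().endswith(suffix):
            return size
    raise ValueError(f"No known resolution for model id: {model_id}")


def image_matches_model(image_size, model_id):
    """Return True if an image's (width, height) equals the model's resolution."""
    return tuple(image_size) == expected_resolution(model_id)


if __name__ == "__main__":
    model = "Skywork/SkyReels-V2-I2V-14B-540P"
    print(expected_resolution(model))
    print(image_matches_model((1920, 1080), model))
```

If Pillow is installed, Image.open(path).size returns the (width, height) tuple that can be passed as image_size.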
 
Using the Featured Functions
- Unlimited-length video
 SkyReels-V2 uses Diffusion Forcing to support the generation of very long videos. Run the long-video inference script:
     python inference_long_video.py --model_id "Skywork/SkyReels-V2-T2V-14B-720P" --prompt "A sci-fi movie scene" --num_frames 1000
 - It is recommended to generate long videos in segments of 97-192 frames each and then stitch them together with post-processing tools.
 
- Story generation
 Use the story generation feature of the SkyReels-A2 system by entering a plot description, for example: A hero's journey through a futuristic city, facing challenges.
 Then run:
     python story_generate.py --prompt "A hero's journey" --output story_video.mp4
 The system generates a video with storyboards, arranging scenes and shots automatically.
- Camera control
 Use the --camera_angle parameter to set the camera view (e.g., "frontal" or "profile"):
     python generate_video.py --prompt "A car chase" --camera_angle "profile" --outdir ./output
- Multi-subject coherence
 SkyReels-A2 supports multi-character scenes. Provide multiple reference images and run:
     python multi_subject.py --prompt "Two characters talking" --images "char1.jpg,char2.jpg" --outdir ./output
 Check that the characters remain visually consistent in the generated video.
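The segmented approach recommended for long videos (97-192 frames per segment, stitched afterwards) comes down to simple frame bookkeeping. The sketch below shows only that bookkeeping; plan_segments is a hypothetical helper, and it does not model any frame overlap or how the inference script conditions one segment on the previous one.

```python
def plan_segments(total_frames, segment_frames=97):
    """Split total_frames into consecutive (start, length) segments.

    Every segment has segment_frames frames except possibly the last,
    which holds whatever remains.
    """
    if total_frames <= 0 or segment_frames <= 0:
        raise ValueError("frame counts must be positive")
    segments = []
    start = 0
    while start < total_frames:
        length = min(segment_frames, total_frames - start)
        segments.append((start, length))
        start += length
    return segments


if __name__ == "__main__":
    for start, length in plan_segments(1000, 97):
        print(f"segment starting at frame {start}: {length} frames")
```

For 1000 frames in 97-frame segments this yields ten full segments and a final segment of 30 frames. A very short tail like that may be worth folding into the previous segment, which is possible here because 97 + 30 frames still fits within the 192-frame upper bound.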
Optimization and Debugging
- Insufficient memory: Enable FP8 quantization with --quant, or use --offload to move some computation to the CPU.
- Generation quality: Increase --inference_steps (e.g., to 50) or adjust --guidance_scale (e.g., to 8.0).
- Community support: Check GitHub Issues for known problems, or join the SkyReels community discussion.
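The memory and quality adjustments above can be layered onto an existing command programmatically. This is a minimal sketch: apply_tuning is a hypothetical helper, and the flag names and example values (--quant, --offload, 50 steps, guidance scale 8.0) are the ones given in this guide rather than verified against the repository.

```python
def apply_tuning(cmd, low_memory=False, high_quality=False):
    """Return a copy of cmd with the tuning flags from this guide appended."""
    tuned = list(cmd)  # copy so the caller's list is left unchanged
    if low_memory:
        # FP8 quantization plus CPU offloading for GPUs with limited VRAM.
        tuned += ["--quant", "--offload"]
    if high_quality:
        # More inference steps and stronger prompt guidance, at the cost of speed.
        tuned += ["--inference_steps", "50", "--guidance_scale", "8.0"]
    return tuned


if __name__ == "__main__":
    base = ["python", "generate_video.py", "--prompt", "A car chase"]
    print(apply_tuning(base, low_memory=True, high_quality=True))
```

Keeping the base command untouched makes it easy to retry the same generation with different settings, for example falling back to low_memory=True after an out-of-memory error.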
Application Scenarios
- Short video creation
 Creators can use the T2V feature to quickly generate short video clips from text, suitable for social media content production.
- Movie pre-production
 Directors can utilize the unlimited length video and story generation features to create movie trailers or concepts and reduce upfront costs.
- Virtual E-Commerce Showcase
 Use the I2V function to turn product pictures into dynamic videos to show how the product is used in a virtual scene.
- Educational animation
 Teachers can generate instructional animations from text descriptions to visualize complex concepts, such as the process of a science experiment.
- Game development
 Developers can generate game scenes or character animations to be used as material for prototyping or transitions.
FAQ
- What resolutions does SkyReels-V2 support?
 Currently supports 540P (960×544) and 720P (1280×720), with the possibility of expanding to higher resolutions in the future.
- How much video memory do I need to run it?
 A single RTX 4090 (24 GB) can run basic inference; multi-GPU configurations can accelerate the generation of long videos.
- How to improve the quality of generated videos?
 Increase the number of inference steps (--inference_steps), refine the prompt, or use high-quality input images.
- Does it support real-time generation?
 Not currently. Generation is offline; real-time generation would require more powerful hardware and may be addressed by future optimizations.
- Are model weights free?
 Yes, SkyReels-V2 is completely open source and the weights can be downloaded for free from Hugging Face.