Dual-channel input system expands creative possibilities
SkyReels-V1's dual-channel input system lets creators choose the generation method that best suits each need:
- Text to Video (T2V): Generates dynamic content directly from a text description. For example, entering "FPS-24, A dog running in a park" produces a park scene at 24 fps.
- Image to Video (I2V): Converts a static portrait into motion video, retaining the original features while adding natural movement, with support for professional-grade resolutions such as 544×960.
Both modes share the same high-quality action library and generate 97 frames of video by default (about 4 seconds at 24 fps). For hardware, a GPU such as the NVIDIA RTX 4090 with a CUDA 12.2 environment is recommended, and multi-GPU parallel acceleration is available through the SkyReelsInfer inference framework.
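The prompt format and the default clip length can be checked with a few lines of Python. This is only a sketch: `build_prompt` and `clip_duration_seconds` are hypothetical helpers, not part of SkyReels-V1 or SkyReelsInfer, and the assumption that the "FPS-24" prefix generalizes to other frame rates as "FPS-<n>" is not confirmed by the article.

```python
def build_prompt(description: str, fps: int = 24) -> str:
    """Prefix a text description with a frame-rate tag such as 'FPS-24'.

    Hypothetical helper: the article only shows the 'FPS-24' form, so other
    frame rates are an assumption.
    """
    if fps <= 0:
        raise ValueError("fps must be positive")
    return f"FPS-{fps}, {description.strip()}"


def clip_duration_seconds(num_frames: int = 97, fps: int = 24) -> float:
    """Playback length of a clip: frame count divided by frame rate."""
    if num_frames <= 0 or fps <= 0:
        raise ValueError("num_frames and fps must be positive")
    return num_frames / fps


if __name__ == "__main__":
    print(build_prompt("A dog running in a park"))  # FPS-24, A dog running in a park
    print(f"{clip_duration_seconds():.2f} s")       # 4.04 s
```

At the default settings, 97 frames at 24 fps works out to roughly 4.04 seconds, which matches the "about 4 seconds" figure above.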
This answer comes from the article "SkyReels-V1: An Open Source Video Model for Generating High Quality Human Action Video".