
Hallo2: audio-driven generation of lip-synchronized/expression-synchronized portrait videos (Windows one-click installation)

2024-11-10 1.1 K

Hallo2 is an open-source project jointly developed by Fudan University and Baidu that generates high-resolution portrait animations driven by audio. The project uses a latent diffusion model together with temporal alignment techniques to reach 4K resolution and video lengths of up to 1 hour. Hallo2 also supports text prompts to improve the diversity and controllability of the generated content.

Hallo3 has since been released. It improves lip synchronization by introducing a cross-attention mechanism for audio conditioning, which captures the relationship between audio signals and facial expressions.

Note: Hallo3 places the following requirements on the input data for inference:

  • Reference image: the image must have an aspect ratio of 1:1 or 3:2.
  • Driving audio: the audio must be in WAV format.
  • Audio language: the audio must be in English, as the model's training dataset contains only English speech.
  • Audio clarity: vocals must be clearly audible; background music is acceptable.
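
The first two requirements can be checked mechanically before running inference. The following is a minimal sketch and not part of the Hallo3 tooling: the helper names `check_aspect` and `is_wav` are hypothetical, the image width and height are passed in by hand, 3:2 is assumed to mean width:height, and the WAV test only inspects the RIFF/WAVE header bytes rather than decoding the audio.

```shell
#!/bin/sh
# Hypothetical pre-flight helpers; not part of the Hallo3 repository.

# check_aspect WIDTH HEIGHT
# Succeeds when WIDTH:HEIGHT is exactly 1:1 or 3:2 (assumed width:height).
check_aspect() {
    w=$1
    h=$2
    # Cross-multiply so no floating-point division is needed.
    [ "$w" -eq "$h" ] || [ $((w * 2)) -eq $((h * 3)) ]
}

# is_wav FILE
# Succeeds when FILE begins with a RIFF header whose form type is WAVE.
is_wav() {
    [ -f "$1" ] || return 1
    riff=$(dd if="$1" bs=1 count=4 2>/dev/null)
    form=$(dd if="$1" bs=1 skip=8 count=4 2>/dev/null)
    [ "$riff" = "RIFF" ] && [ "$form" = "WAVE" ]
}

# Example:
#   check_aspect 768 512 && echo "aspect ratio ok"
#   is_wav speech.wav || echo "not a WAV file"
```

If the audio is in another format, `ffmpeg -i input.mp3 output.wav` converts it to WAV; the file names here are placeholders.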

[Figure: Hallo2, audio-driven generation of long-duration, high-resolution portrait animation video]

 

Feature List

  • Audio Driven Animation Generation: Generate corresponding portrait animations from input audio files.
  • High Resolution Support: Support for generating videos with 4K resolution to ensure clear picture quality.
  • Long video generation: Can generate video content up to 1 hour long.
  • Text prompt enhancement: Control the expressions and actions of the generated portrait with semantic text labels.
  • Open source: Full source code and pre-trained models are provided to facilitate further development.
  • Multi-platform support: Supports running on multiple platforms such as Windows, Linux, etc.

 

Usage Guide

Installation process

  1. System requirements:
    • Operating system: Ubuntu 20.04/22.04
    • GPU: a graphics card that supports CUDA 11.8 (e.g. A100)
  2. Create a virtual environment::
    conda create -n hallo python=3.10
    conda activate hallo
    
  3. Install the dependencies::
    pip install torch==2.2.2 torchvision==0.17.2 torchaudio==2.2.2 --index-url https://download.pytorch.org/whl/cu118
    pip install -r requirements.txt
    sudo apt-get install ffmpeg
    
  4. Download the pre-trained models::
    git lfs install
    git clone https://huggingface.co/fudan-generative-ai/hallo2 pretrained_models
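
Before moving on, it can help to confirm that the tools installed above are visible in the active environment. The following is a minimal sketch under stated assumptions: `require_cmd` is a hypothetical helper name, and the list of commands to check is inferred from the installation steps rather than taken from the Hallo2 repository.

```shell
#!/bin/sh
# require_cmd NAME...
# Prints each missing command to stderr; returns non-zero if any is missing.
require_cmd() {
    missing=0
    for cmd in "$@"; do
        if ! command -v "$cmd" >/dev/null 2>&1; then
            echo "missing: $cmd" >&2
            missing=1
        fi
    done
    return "$missing"
}

# Example (run inside the activated "hallo" environment):
#   require_cmd python pip ffmpeg git
#   python -c "import torch; print(torch.__version__, torch.cuda.is_available())"
```

The second example line prints the installed PyTorch version and whether PyTorch can see a CUDA device.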
    

Usage Process

  1. Prepare the input data:
    • Download and prepare the required pre-trained models.
    • Prepare the source image and the driving audio file.
  2. Run the inference script::
    python scripts/inference.py --source_image path/to/image --driving_audio path/to/audio
    
  3. View the generated results:
    • The generated video file is saved in the specified output directory and can be opened with any video player.

Detailed steps

  1. Download the code::
    git clone https://github.com/fudan-generative-vision/hallo2
    cd hallo2
    
  2. Create and activate a virtual environment::
    conda create -n hallo python=3.10
    conda activate hallo
    
  3. Install the necessary Python packages::
    pip install torch==2.2.2 torchvision==0.17.2 torchaudio==2.2.2 --index-url https://download.pytorch.org/whl/cu118
    pip install -r requirements.txt
    
  4. Install ffmpeg::
    sudo apt-get install ffmpeg
    
  5. Download the pre-trained models::
    git lfs install
    git clone https://huggingface.co/fudan-generative-ai/hallo2 pretrained_models
    
  6. Run the inference script::
    python scripts/inference.py --source_image path/to/image --driving_audio path/to/audio
    
  7. View the generated results:
    • The generated video file is saved in the specified output directory and can be opened with any video player.
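
When several portraits need to be animated, the inference command from the steps above can be wrapped in a loop. The following is a sketch only: `run_pairs` is a hypothetical helper, and the pairing convention (each `NAME.wav` is matched with a `NAME.png` in the same directory) is an assumption made for illustration, not something the project prescribes.

```shell
#!/bin/sh
# run_pairs DIR
# For every DIR/NAME.wav with a matching DIR/NAME.png, print the inference
# command from the steps above. Set DRY_RUN=0 to execute it instead.
run_pairs() {
    dir=$1
    for audio in "$dir"/*.wav; do
        [ -f "$audio" ] || continue            # glob matched nothing
        image="${audio%.wav}.png"
        if [ ! -f "$image" ]; then
            echo "skip: no image for $audio" >&2
            continue
        fi
        if [ "${DRY_RUN:-1}" = "1" ]; then
            echo python scripts/inference.py --source_image "$image" --driving_audio "$audio"
        else
            python scripts/inference.py --source_image "$image" --driving_audio "$audio"
        fi
    done
}

# Example:
#   run_pairs ./inputs              # print the commands for review
#   DRY_RUN=0 run_pairs ./inputs    # actually run inference
```

The default dry run only prints the commands so that they can be reviewed before any GPU time is spent.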

 

