
 

InstantID is a state-of-the-art technique that generates personalized images in different styles or poses within seconds from a single reference ID image, while maintaining high fidelity. It is a diffusion-model-based approach that guides image generation by combining a face image, a landmark image, and a text prompt. Key features include high-fidelity generation, strong facial fidelity combined with text editability, and compatibility with popular pre-trained text-to-image diffusion models, with no need for extensive fine-tuning or multiple reference images.

 

InstantID is a new state-of-the-art tuning-free method for identity-preserving generation from a single image, and it supports a variety of downstream tasks. It clones a face from just one photo and uses text prompts to generate images of the same face in different styles.

 


 

 

Feature List

 

  • Zero-shot identity-preserving generation: no need for multiple images; a single frontal face image is enough to generate portraits in multiple styles.
  • High-fidelity generation: the results preserve the identity features of the original image well.
  • Multiple downstream tasks: supports tasks such as style transfer and image editing.
  • Open-source code and models: the code and pre-trained models are openly available for download and use.
  • Strong compatibility: can be used together with other projects such as InstantStyle and Kolors.

 

 

Usage Help

  1. Upload an image of a person. If the image contains several people, only the largest face is detected. Make sure the face is not too small, visibly obscured, or blurred.
  2. (Optional) Upload another image of a person as a reference pose. If none is uploaded, the facial landmarks are extracted from the first image. If you used a cropped face in step 1, uploading a pose reference is recommended so that a new pose can be extracted.
  3. Enter a text prompt, just as you would for a normal text-to-image model.
  4. Click the Submit button to start generating.

  • Only a single reference ID image is required.
  • Different styles and poses can be selected for personalized image generation.
  • No fine-tuning is needed at inference time, and there is no need to collect multiple images for fine-tuning.
  • Works directly with popular pre-trained models and ControlNets.
  • Identity attributes can be flexibly added to non-human characters.

 

Installation process

  1. Clone the GitHub repository:
    git clone https://github.com/instantX-research/InstantID.git
    cd InstantID
    

     

  2. Install the dependencies:
    pip install -r requirements.txt
    

     

  3. Download the pre-trained models:
    from huggingface_hub import hf_hub_download
    hf_hub_download(repo_id="InstantX/InstantID", filename="ControlNetModel/config.json", local_dir="./checkpoints")
    hf_hub_download(repo_id="InstantX/InstantID", filename="ControlNetModel/diffusion_pytorch_model.safetensors", local_dir="./checkpoints")
    hf_hub_download(repo_id="InstantX/InstantID", filename="ip-adapter.bin", local_dir="./checkpoints")
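
As a quick sanity check once the downloads finish, the sketch below lists any of the three files that are still missing from ./checkpoints. The helper name missing_checkpoints is hypothetical and not part of InstantID; the file list simply mirrors the hf_hub_download calls above.

```python
from pathlib import Path

# Relative paths mirror the three hf_hub_download calls above.
EXPECTED_FILES = (
    "ControlNetModel/config.json",
    "ControlNetModel/diffusion_pytorch_model.safetensors",
    "ip-adapter.bin",
)


def missing_checkpoints(root="./checkpoints"):
    """Return the expected files that do not exist under `root`.

    Hypothetical helper for illustration; it only checks that the files
    exist, not that their contents are valid.
    """
    base = Path(root)
    return [name for name in EXPECTED_FILES if not (base / name).is_file()]
```

Calling missing_checkpoints() from the repository root returns an empty list when all three files are in place; any returned names point to downloads that should be repeated.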
    

 

Usage Process

  1. Prepare the image:
    from diffusers.utils import load_image
    image = load_image("your-example.jpg")
    
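  2. Load the model:
    import torch
    from diffusers import ControlNetModel
    # The pipeline class ships with the cloned repository (pipeline_stable_diffusion_xl_instantid.py), not with diffusers, so run this from the repository root
    from pipeline_stable_diffusion_xl_instantid import StableDiffusionXLInstantIDPipeline
    # Load the ControlNet weights downloaded during installation
    controlnet = ControlNetModel.from_pretrained("./checkpoints/ControlNetModel", torch_dtype=torch.float16)
    pipe = StableDiffusionXLInstantIDPipeline.from_pretrained("stabilityai/stable-diffusion-xl-base-1.0", controlnet=controlnet, torch_dtype=torch.float16)
    pipe.cuda()
    # Attach the InstantID face adapter
    pipe.load_ip_adapter_instantid("./checkpoints/ip-adapter.bin")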
    
  3. Generate an image:
    # face_emb (identity embedding) and face_kps (facial keypoint image) must first be extracted from the reference image, e.g. with insightface as in the repository's example
    prompt = "analog film photo of a man. faded film, desaturated, 35mm photo, grainy, vignette, vintage, Kodachrome, Lomography, stained, highly detailed, found footage, masterpiece, best quality"
    negative_prompt = "(lowres, low quality, worst quality:1.2), (text:1.2), watermark, painting, drawing, illustration, glitch, deformed, mutated, cross-eyed, ugly, disfigured"
    image = pipe(prompt, negative_prompt=negative_prompt, image_embeds=face_emb, image=face_kps, controlnet_conditioning_scale=0.8).images[0]
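
The generation call relies on face_emb and face_kps, which have to be derived from the reference image. The sketch below shows only the selection step in plain Python: it picks the largest detected face, matching the "largest face only" behavior described under Usage Help. It assumes each detected face is a mapping with "bbox", "embedding", and "kps" entries, as insightface face objects provide; the helper names bbox_area and select_face_inputs are hypothetical.

```python
def bbox_area(face):
    """Area of a face bounding box given as (x1, y1, x2, y2)."""
    x1, y1, x2, y2 = face["bbox"]
    return (x2 - x1) * (y2 - y1)


def select_face_inputs(faces):
    """Pick the largest detected face and return (embedding, keypoints).

    Hypothetical helper: assumes each face is a mapping with "bbox",
    "embedding", and "kps" entries.
    """
    if not faces:
        raise ValueError("no face detected in the reference image")
    largest = max(faces, key=bbox_area)
    return largest["embedding"], largest["kps"]


# Assumed wiring (not executed here), following the repository's example:
#   app = FaceAnalysis(name="antelopev2", root="./")   # from insightface.app
#   app.prepare(ctx_id=0, det_size=(640, 640))
#   faces = app.get(cv2.cvtColor(np.array(image), cv2.COLOR_RGB2BGR))
#   face_emb, kps = select_face_inputs(faces)
#   face_kps = draw_kps(image, kps)   # draw_kps ships with the repository
```

Raising an error on an empty detection list makes a missing or obscured face fail early instead of surfacing later as an undefined input to the pipeline.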
    

Detailed Operation Procedure

  1. Prepare the environment: make sure the required dependencies are installed and the pre-trained models are downloaded.
  2. Load the image: use the load_image function to load the image to be processed.
  3. Load the models: use the from_pretrained method to load the pre-trained ControlNet model and the StableDiffusionXLInstantIDPipeline.
  4. Generate the image: set the prompt and negative prompt, then call pipe to generate the image.

With the steps above, users can generate high-fidelity, identity-preserving images with InstantID.

 

 

ComfyUI Implementation

 

Select an SDXL base model. You can also try SDXL Turbo with 4 steps, which works well for quick tests.

The first load usually takes more than 60 seconds, but the node does its best to cache the model.
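
A slow first load followed by faster runs is the usual effect of caching. As an illustration only (this is not the ComfyUI node's actual code), a loader can be memoized so that the expensive work happens once per checkpoint path. Here load_model, its placeholder return value, and the file name sdxl_base.safetensors are all stand-ins.

```python
from functools import lru_cache

LOAD_CALLS = []  # records each real (uncached) load, for demonstration


@lru_cache(maxsize=None)
def load_model(checkpoint_path):
    """Stand-in for an expensive model load; runs once per distinct path."""
    LOAD_CALLS.append(checkpoint_path)
    return {"path": checkpoint_path}  # placeholder for a real model object


first = load_model("sdxl_base.safetensors")   # performs the (slow) load
second = load_model("sdxl_base.safetensors")  # served from the cache
```

Both calls return the same object, and the loader body runs only for the first one.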

https://github.com/huxiuhan/ComfyUI-InstantID

 
