
HiDream-I1 is an open-source image generation foundation model with 17 billion parameters that quickly produces high-quality images. Users only need to enter a text description, and the model can generate images in a variety of styles, including realistic, cartoon, and artistic. Developed by the HiDream.ai team, the project is hosted on GitHub under the MIT license and supports personal, scientific, and commercial use. HiDream-I1 has excelled in a number of benchmarks, such as HPS v2.1, GenEval, and DPG, reaching industry-leading levels in both image quality and prompt-following ability. Users can try the model on the Hugging Face platform or download the model weights to run it locally. The project also provides a Gradio demo interface for interactive image generation.


 

Feature List

  • Text-to-image: generate high-quality images from text descriptions entered by the user.
  • Multi-style support: generate images in realistic, cartoon, artistic, and other styles.
  • Fast generation: with optimized inference steps, images can be generated in seconds.
  • Model variants: Full (HiDream-I1-Full), Development (HiDream-I1-Dev), and Fast (HiDream-I1-Fast) versions are available.
  • Image editing: with the companion HiDream-E1-Full model, images can be modified through text instructions.
  • Open source and commercial use: the MIT license allows free use of the generated images.
  • Gradio interactive interface: an online demo lets users experience image generation directly.

 

Usage Guide

Installation process

To use HiDream-I1, you need to set up the model's runtime environment locally. The detailed installation steps are as follows:

  1. Prepare the environment
    Python 3.12 is recommended; create a new virtual environment to avoid dependency conflicts. Run the following commands:

    conda create -n hdi1 python=3.12
    conda activate hdi1

    Or use Python's built-in venv instead:

    python3 -m venv venv
    source venv/bin/activate    # Linux/macOS
    .\venv\Scripts\activate     # Windows
  2. Install dependencies
    Install the necessary libraries, especially the Hugging Face Diffusers library. Installing from source is recommended to ensure compatibility:

    pip install git+https://github.com/huggingface/diffusers.git
    

    In addition, install Flash Attention to optimize performance; CUDA 12.4 is recommended:

    pip install flash-attn
    
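    Note that flash-attn builds against an existing PyTorch installation, so install torch first if it is not already present. A quick sanity check afterwards might look like this (a minimal sketch that simply confirms the GPU and the flash_attn import are usable):

    pip install torch torchvision   # skip if PyTorch is already installed
    python -c "import torch, flash_attn; print(torch.cuda.is_available())"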
  3. Download the model
    HiDream-I1 model weights are available on Hugging Face in three variants:

    • HiDream-ai/HiDream-I1-Full: complete model, suitable for high quality generation.
    • HiDream-ai/HiDream-I1-Dev: development version with fewer inference steps, making it faster.
    • HiDream-ai/HiDream-I1-Fast: fast version, suited to rapid generation.
      Running the inference script will automatically download the meta-llama/Meta-Llama-3.1-8B-Instruct model. If your network is unstable, download it from Hugging Face in advance and place it in the cache directory.
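    A hedged pre-download sketch using the huggingface_hub API (the default cache directory, ~/.cache/huggingface, is assumed; the Llama repository is gated, so accept its license and log in first):

    from huggingface_hub import snapshot_download

    # Fetch both repositories into the local Hugging Face cache so the
    # inference script can find them without re-downloading.
    snapshot_download("meta-llama/Meta-Llama-3.1-8B-Instruct")
    snapshot_download("HiDream-ai/HiDream-I1-Full")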
  4. Run inference
    Use the following Python code to generate an image:

    import torch
    from transformers import PreTrainedTokenizerFast, LlamaForCausalLM
    from diffusers import HiDreamImagePipeline

    # Llama-3.1-8B-Instruct serves as the fourth text encoder of the pipeline.
    tokenizer_4 = PreTrainedTokenizerFast.from_pretrained("meta-llama/Meta-Llama-3.1-8B-Instruct")
    text_encoder_4 = LlamaForCausalLM.from_pretrained(
        "meta-llama/Meta-Llama-3.1-8B-Instruct",
        output_hidden_states=True,
        output_attentions=True,
        torch_dtype=torch.bfloat16,
    )

    # Load the HiDream-I1 pipeline and move it to the GPU.
    pipe = HiDreamImagePipeline.from_pretrained(
        "HiDream-ai/HiDream-I1-Full",
        tokenizer_4=tokenizer_4,
        text_encoder_4=text_encoder_4,
        torch_dtype=torch.bfloat16,
    )
    pipe = pipe.to("cuda")

    # Generate a 1024x1024 image with a fixed seed for reproducibility.
    image = pipe(
        'A cat holding a sign that says "HiDream.ai"',
        height=1024,
        width=1024,
        guidance_scale=5.0,
        num_inference_steps=50,
        generator=torch.Generator("cuda").manual_seed(0),
    ).images[0]
    image.save("output.png")
    

    Parameter description:

    • height and width: set the resolution of the generated image; 1024 x 1024 is recommended.
    • guidance_scale: controls how closely the prompt is followed; 5.0 is recommended.
    • num_inference_steps: the number of inference steps; 50 for the Full version, 28 for the Dev version, and 16 for the Fast version (see the sketch after this list).
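    These settings can be collected per variant, as in the following sketch. It assumes the pipe object from the code above was loaded from the matching repository; the guidance_scale of 0.0 for the distilled Dev and Fast variants is an assumption to verify against the official model cards:

    # Illustrative per-variant settings; load the pipeline from the same
    # repository name as the config you pick.
    VARIANT_CONFIGS = {
        "HiDream-ai/HiDream-I1-Full": {"num_inference_steps": 50, "guidance_scale": 5.0},
        "HiDream-ai/HiDream-I1-Dev": {"num_inference_steps": 28, "guidance_scale": 0.0},
        "HiDream-ai/HiDream-I1-Fast": {"num_inference_steps": 16, "guidance_scale": 0.0},
    }

    cfg = VARIANT_CONFIGS["HiDream-ai/HiDream-I1-Full"]
    image = pipe(
        'A cat holding a sign that says "HiDream.ai"',
        height=1024,
        width=1024,
        generator=torch.Generator("cuda").manual_seed(0),
        **cfg,
    ).images[0]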
  5. Run the Gradio demo
    The project provides a Gradio interface for interactive image generation. Start it with:

    python gradio_demo.py
    

    Once launched, open the local web address it prints and enter a text description to generate an image.
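    If you want a custom interface instead, a minimal sketch with the Gradio API follows (it reuses the pipe object from the inference example above and is not the project's bundled gradio_demo.py):

    import gradio as gr

    def generate(prompt):
        # Reuse the already-loaded HiDream pipeline for each request.
        return pipe(
            prompt,
            height=1024,
            width=1024,
            guidance_scale=5.0,
            num_inference_steps=50,
        ).images[0]

    gr.Interface(
        fn=generate,
        inputs=gr.Textbox(label="Prompt"),
        outputs=gr.Image(label="Generated image"),
    ).launch()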

Feature Walkthrough

  • Text-to-image: enter descriptive text in the Gradio interface, e.g. "A cat holding up a sign that says 'HiDream.ai'". Select the model variant, adjust the resolution, and click Generate to get the image; a seed-sweep sketch follows this list.
  • Image editing: in the HiDream-E1-Full Hugging Face Space (https://huggingface.co/spaces/HiDream-ai/HiDream-E1-Full), upload an image and enter an editing instruction, such as "Change the background to a forest". The model adjusts the image according to the instruction while keeping the subject consistent.
  • Model selection: the Full version suits high-quality generation, the Dev version development and testing, and the Fast version rapid prototyping.
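A hedged seed-sweep sketch for the text-to-image workflow mentioned above (it assumes the pipe object from the inference example; only the seed changes between candidates):

    import torch

    prompt = 'A cat holding a sign that says "HiDream.ai"'
    # Vary only the seed to produce distinct candidates for the same prompt.
    for seed in range(4):
        image = pipe(
            prompt,
            height=1024,
            width=1024,
            guidance_scale=5.0,
            num_inference_steps=50,
            generator=torch.Generator("cuda").manual_seed(seed),
        ).images[0]
        image.save(f"candidate_{seed}.png")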

Caveats

  • Hardware requirements: an NVIDIA GPU (e.g. A100, RTX 3090) with the Ampere architecture or newer is required. The 4-bit quantized version (hykilpikonna/HiDream-I1-nf4) can run in 16 GB of video memory.
  • License: you must accept the meta-llama/Meta-Llama-3.1-8B-Instruct community license and log in on Hugging Face:
    huggingface-cli login
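    Logging in can also be done from Python via the huggingface_hub API (the token below is a hypothetical placeholder; use an access token from your Hugging Face account settings):

    from huggingface_hub import login

    # Authenticate with an access token instead of the interactive CLI prompt.
    login(token="hf_...")  # hypothetical placeholder token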
    

 

Application Scenarios

  1. Content creation
    Creators can use HiDream-I1 to generate illustrations, advertising graphics, or concept art. For example, entering "futuristic city night scene" produces a sci-fi style image for a novel cover or game design.
  2. Education and research
    Researchers can use the model to run image-generation experiments, test the effects of different prompts, or develop new applications under the MIT license.
  3. Commercial use
    Businesses can generate product promotional images or marketing materials. The MIT license allows free use of the generated images without additional licensing.

 

FAQ

  1. What hardware does HiDream-I1 require?
    An NVIDIA GPU (e.g. RTX 3090, A100) with the Ampere architecture or newer. The 4-bit quantized version runs in 16 GB of video memory.
  2. How do I choose a model variant?
    The Full version is best for high-quality generation, the Dev version for fast development, and the Fast version for rapid generation at slightly lower quality.
  3. Can the generated images be used commercially?
    Yes. The MIT license allows the generated images to be used for personal, scientific, and commercial purposes.
  4. How do I fix a model download failure?
    Download the meta-llama/Meta-Llama-3.1-8B-Instruct model from Hugging Face in advance and place it in the cache directory (see the pre-download sketch in the installation steps).