The following is a detailed procedure for generating images using Janus-4o:
1. Load the model
from transformers import AutoModelForCausalLM
from janus.models import VLChatProcessor  # VLChatProcessor comes from the Janus package, not from transformers
model_path = "FreedomIntelligence/Janus-4o-7B"
vl_chat_processor = VLChatProcessor.from_pretrained(model_path)
vl_gpt = AutoModelForCausalLM.from_pretrained(model_path, trust_remote_code=True).cuda().eval()
2. Define the generation function
Use the text_to_image_generate function (see the GitHub repository for sample code):
- Input parameters: the text prompt (e.g. "a desert under a starry sky"), the output path, and the processor and model objects.
- Optional parameters: temperature (controls how diverse the generated images are), parallel size, the classifier-free guidance (CFG) weight, etc.
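The parameters listed above can be gathered and sanity-checked before the generation function is called. The following is only a minimal sketch: the class name `GenerationSettings`, the field names (including `cfg_weight` for the weight parameter), and all default values are assumptions made for illustration and are not taken from the Janus-4o repository.

```python
from dataclasses import dataclass, asdict


@dataclass
class GenerationSettings:
    """Hypothetical container for the arguments described in the list above."""

    prompt: str
    output_path: str
    temperature: float = 1.0  # placeholder default, not from the repository
    parallel_size: int = 1    # placeholder default, not from the repository
    cfg_weight: float = 5.0   # placeholder default, not from the repository

    def __post_init__(self):
        # Reject values that could not produce a meaningful generation run.
        if not self.prompt.strip():
            raise ValueError("prompt must not be empty")
        if self.temperature <= 0:
            raise ValueError("temperature must be positive")
        if self.parallel_size < 1:
            raise ValueError("parallel_size must be at least 1")


settings = GenerationSettings(
    prompt="a desert under a starry sky",
    output_path="outputs/desert.png",
)

# Plain dict of the settings; whether these keys match the real function's
# argument names must be checked against the sample code on GitHub.
kwargs = asdict(settings)
print(kwargs)
```

Keeping the settings in one object makes it easier to log or compare runs while experimenting with temperature and the other optional values.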
3. Run the generation
The function saves the generated image to the specified path. During generation it relies on the Hugging Face processor and model objects to handle the text and image data. When generation finishes, you can open the saved file in any image viewer to preview the result.
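Because the result is written to a file, it can help to make sure the target directory exists beforehand and to confirm afterwards that a file was actually produced. The helpers below are a small sketch using only the Python standard library; the names `prepare_output_path` and `output_was_written` are hypothetical and not part of Janus-4o.

```python
from pathlib import Path


def prepare_output_path(output_path: str) -> Path:
    """Create the parent directory of output_path if it does not exist yet."""
    path = Path(output_path)
    path.parent.mkdir(parents=True, exist_ok=True)
    return path


def output_was_written(output_path: str) -> bool:
    """Return True if a non-empty file exists at output_path."""
    path = Path(output_path)
    return path.is_file() and path.stat().st_size > 0
```

Calling `prepare_output_path` before the generation step and `output_was_written` after it gives a quick check before opening the image in a viewer.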
Note: Make sure a GPU is available (the loading code calls .cuda()), and refer to the GitHub documentation when tuning the parameters for best results.
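GPU availability can be checked before loading the model. This is a minimal sketch that assumes PyTorch is the backend in use; the helper name `cuda_is_available` is made up for this example, and the check falls back to reporting no GPU when PyTorch is not installed.

```python
import importlib.util


def cuda_is_available() -> bool:
    """Return True only if PyTorch is installed and reports a usable CUDA device."""
    if importlib.util.find_spec("torch") is None:
        return False
    import torch  # imported lazily so the check also works without PyTorch

    return torch.cuda.is_available()


device = "cuda" if cuda_is_available() else "cpu"
print(f"Selected device: {device}")
```

If the selected device is "cpu", the `.cuda()` call in the loading step will fail, so it is better to resolve the GPU setup first.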
This answer comes from the article "ShareGPT-4o-Image: an open source multimodal image generation dataset".