OmniInsert: A tool for inserting any reference image into video without masking
OmniInsert is a research project developed by ByteDance Intelligent Creation Lab. It is a tool that seamlessly inserts any reference object into a video without the use of a mask (Mask). In the traditional video editing process, if you want to add a new object into the video, you usually need to manually create a...
Qwen-Image-Edit: an AI model for editing images based on textual commands
Qwen-Image-Edit is an image editing AI model developed by Alibaba Tongyi Qianqian team. It is trained based on the Qwen-Image model with 20 billion parameters, and its core function is to allow users to modify images through simple Chinese or English text commands. This model also utilizes visual...
Qwen-Image: an AI tool for generating high-fidelity images with accurate text rendering
Qwen-Image is a 20B parametric multimodal diffusion model (MMDiT) developed by the Qwen team, focusing on high-fidelity image generation and accurate text rendering. It excels in complex text processing (especially Chinese and English) and image editing. The model supports a variety of art styles such as realistic,...
SkyworkUniPic: An Open Source Model for Unified Processing Image Understanding and Generation
SkyworkUniPic is an open-source multimodal model developed by SkyworkAI that focuses on image understanding, text-generated images, and image editing. It integrates three visual language tasks using a single 150 million parameter architecture. Users can run 102 on consumer GPUs such as RTX 4090...
FLUX.1 Krea: Free Open Source Tool for Generating Highly Realistic Images
FLUX.1 Krea [dev] is an open source image generation tool developed by Black Forest Labs in collaboration with Krea AI and hosted on the Hugging Face platform. It is based on a 12 billion parameter rectified flow transfo...
Diffuman4D: Generating High-Fidelity 4D Human Body Views from Sparse Video
Diffuman4D is a project developed by the ZJU3DV research team at Zhejiang University, focusing on generating high-fidelity 4D human body views from sparse view videos. The project combines spatio-temporal diffusion modeling and 4DGS (4D Gaussian Splatting) technology, which solves the difficulty of traditional methods in generating sparse input...
Launch of FLUX.1 Kontext and BFL Playground
Today, we are proud to release FLUX.1 Kontext -- a set of generative flow matching models to support image generation and editing. Unlike existing text-based image generation models, the FLUX.1 Kontext family supports context-sensitive...
PartCrafter: Generating Editable 3D Part Models from a Single Image
PartCrafter is an innovative open source project focused on generating editable 3D part models from a single RGB image. It uses advanced structured 3D generation technology to generate multiple semantically meaningful 3D parts simultaneously from a single image , applicable to game development, product design and other fields. The project is based on pre-training...
HiDream-I1
HiDream-I1 is an open source image generation base model with 17 billion parameters to quickly generate high quality images. Users only need to enter a textual description, and the model can generate images in a variety of styles including realistic, cartoon, art, and more. Developed by the HiDream.ai team and hosted on GitHub, the project picks...
Imagen 4
Google DeepMind's recently launched Imagen 4 model, the latest iteration of its image generation technology, is quickly becoming an industry focal point. The model has made significant progress in improving the richness, accuracy of detail, and speed of image generation, working to bring the user's imagination to life in ways never before...
StarVector: Basic model for generating SVG vector graphics from images and text
StarVector is an open source project created by developers such as Juan A. Rodriguez to convert images and text into Scalable Vector Graphics (SVG). This tool uses a visual language model that understands image content and text instructions to generate high-quality SVG code. Its core ...
AnyText
AnyText is a revolutionary multilingual visual text generation and editing tool developed based on the diffusion model. It generates natural, high-quality multilingual text in images and supports flexible text editing capabilities. It was developed by a team of researchers and received Spotlight honors at the ICLR 2024 conference...
OmniGen
OmniGen is a "universal" image generation model developed by VectorSpaceLab that allows users to create diverse and contextually rich visuals with simple text prompts or multimodal inputs. It is particularly well suited for scenes that require character recognition and consistent character rendering. Users...
CogView3: Wisdom Spectrum Light Word open source cascade diffusion text to generate image models
Comprehensive Introduction CogView3 is an advanced text generation image system developed by Tsinghua University and Think Tank Team (Chi Spectrum Qingyan). It is based on the cascading diffusion model and generates high-resolution images through multiple stages.The key features of CogView3 include multi-stage generation, innovative architecture and efficient performance for artistic creation...
Top