AI Image Gen

 Submit Website

xAI Grok Imagine API: out-of-the-box multimodal audio and video generation service for production environments
In January 2026, xAI officially launched the Grok Imagine API, a production-grade multimodal video generation service for developers and enterprises. Built on xAI's internally developed “Aurora” model, the service's core capability is the ability to generate text based on...
1.3 Kgo nonstop to0kudos
0Bookmarked
OmniInsert: A tool for inserting any reference image into video without masking
OmniInsert is a research project developed by ByteDance Intelligent Creation Lab. It is a tool that seamlessly inserts any reference object into a video without the use of a mask. In the traditional video editing process, if you want to add a new object to the video, you usually need to manually create a precise “mask” to frame out the...
1.7 Kgo nonstop to0kudos
0Bookmarked
Qwen-Image-Edit: an AI model for editing images based on textual commands
Qwen-Image-Edit is an image editing AI model developed by Alibaba Tongyi Qianqian team. It is trained based on the Qwen-Image model with 20 billion parameters, and its core function is to allow users to modify images through simple Chinese or English text commands. This model utilizes both visual semantic understanding and...
3.3 Kgo nonstop to0kudos
0Bookmarked
Qwen-Image: an AI tool for generating high-fidelity images with accurate text rendering
Qwen-Image is a 20B parametric multimodal diffusion model (MMDiT) developed by the Qwen team, specializing in high-fidelity image generation and accurate text rendering. It excels in complex text processing (especially Chinese and English) and image editing. The model supports a variety of art styles, such as realistic, anime, and high-definition posters,...
3.4 Kgo nonstop to0kudos
0Bookmarked
SkyworkUniPic: An Open Source Model for Unified Processing Image Understanding and Generation
SkyworkUniPic is an open-source multimodal model developed by SkyworkAI that focuses on image understanding, text-generated images, and image editing. It integrates three visual language tasks using a single 150 million parameter architecture. Users can run 102 on consumer GPUs such as RTX 4090...
1.8 Kgo nonstop to0kudos
0Bookmarked
FLUX.1 Krea: Free Open Source Tool for Generating Highly Realistic Images
FLUX.1 Krea [dev] is an open source image generation tool developed by Black Forest Labs in collaboration with Krea AI and hosted on the Hugging Face platform. It is based on a 12 billion parameter rectified flow transfo...
2.3 Kgo nonstop to0kudos
0Bookmarked
Diffuman4D: Generating High-Fidelity 4D Human Body Views from Sparse Video
Diffuman4D is a project developed by the ZJU3DV research team at Zhejiang University, focusing on generating high-fidelity 4D human body views from sparse view videos. The project combines spatio-temporal diffusion modeling and 4DGS (4D Gaussian Splatting) technology, which solves the difficulty of traditional methods in generating sparse input...
2.2 Kgo nonstop to1kudos
0Bookmarked
Launch of FLUX.1 Kontext and BFL Playground
Today, we are proud to release FLUX.1 Kontext -- a set of generative flow matching models to support image generation and editing. Unlike existing text-based image generation models, the FLUX.1 Kontext family supports context-sensitive...
1.9 K0kudos
0Bookmarked
PartCrafter: Generating Editable 3D Part Models from a Single Image
PartCrafter is an innovative open source project focused on generating editable 3D part models from a single RGB image. It uses advanced structured 3D generation techniques to simultaneously generate multiple semantically meaningful 3D parts from a single image, suitable for game development, product design and other fields. The project is based on a pre-trained 3D mesh diffusion transformer...
2.7 Kgo nonstop to0kudos
0Bookmarked
HiDream-I1
HiDream-I1 is an open source image generation base model with 17 billion parameters to quickly generate high quality images. Users only need to enter a text description, the model can generate a variety of styles including realistic, cartoon, art and other images. The project is developed by the HiDream.ai team, hosted on GitHub under the MIT license...
2.9 Kgo nonstop to0kudos
0Bookmarked
Imagen 4
Google DeepMind's recently launched Imagen 4 model, the latest iteration of its image generation technology, is quickly becoming an industry sensation. The model has made significant progress in improving image richness, detail accuracy, and generation speed, working to bring users' imaginations to life in ways never before possible. Currently, the use of ...
2.7 K0kudos
0Bookmarked
StarVector: Basic model for generating SVG vector graphics from images and text
StarVector is an open source project created by developers such as Juan A. Rodriguez to convert images and text into Scalable Vector Graphics (SVG). This tool uses a visual language model that understands image content and textual instructions to generate high-quality SVG code. Its core features are...
4.5 Kgo nonstop to0kudos
0Bookmarked
AnyText
AnyText is a revolutionary multilingual visual text generation and editing tool developed based on the diffusion model. It generates natural, high-quality multilingual text in images and supports flexible text editing capabilities. It was developed by a team of researchers and received Spotlight honors at the ICLR 2024 conference.AnyText's...
3.7 Kgo nonstop to0kudos
0Bookmarked
OmniGen
OmniGen is a “universal” image generation model developed by VectorSpaceLab that allows users to create diverse and contextually rich visuals with simple text prompts or multimodal inputs. It is particularly suited to scenes that require character recognition and consistent character rendering. Users can upload up to three images...
4.2 Kgo nonstop to0kudos
0Bookmarked
CogView3: Wisdom Spectrum Light Word open source cascade diffusion text to generate image models
Comprehensive Introduction CogView3 is an advanced text generation image system developed by Tsinghua University and Think Tank Team (Smart Spectrum Qingyan). It is based on the cascading diffusion model and generates high-resolution images through multiple stages.The main features of CogView3 include multi-stage generation, innovative architecture and efficient performance, which is suitable for art creation, advertisement design, game development, and many other...
3.0 Kgo nonstop to0kudos
0Bookmarked

AI Image Gen

Quick query station AI tool