GenAI Processors is an open-source Python library from Google DeepMind for efficient, parallel processing of multimodal content. Built on Python's asyncio framework, it provides a modular, reusable processor interface that simplifies the development of complex AI applications. The library can process text, audio, video, and other data streams, and integrates seamlessly with the Gemini API. It supports both real-time stream processing and turn-based interaction, making it well suited to AI applications that need fast response times. The code is hosted on GitHub, and the community can contribute processor modules to extend its functionality. The project is licensed under Apache 2.0, so developers can use it to rapidly build production-ready AI applications.
Features
- Asynchronous parallel processing: built on Python's asyncio, with efficient handling of I/O-bound and compute-intensive tasks.
- Modular processor design: provides Processor and PartProcessor units that can be chained or run in parallel over complex data streams.
- Gemini API integration: built-in GenaiModel and LiveProcessor support both turn-based and real-time streaming interaction.
- Multimodal stream processing: supports splitting, merging, and processing text, audio, video, and other data streams.
- Real-time interaction: LiveProcessor handles live audio and video streams, ideal for building real-time AI agents.
- Community extensions: users can add custom processors to the contrib/ directory.
- Tool integration: built-in tools such as Google Search give AI agents richer context.
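The chained-processor idea behind Processor and PartProcessor can be illustrated with a small, library-free asyncio sketch. The `Processor` class and `+` overload below are a simplified stand-in for illustration only, not the real genai-processors API:

```python
# Minimal, library-free sketch of chaining async stream processors with `+`.
# This mimics the composition style described above; the class below is an
# illustrative stand-in, NOT the real genai-processors API.
import asyncio
from typing import AsyncIterator, Callable

class Processor:
    """Wraps an async-generator function over a stream; `+` chains stages."""
    def __init__(self, fn: Callable[[AsyncIterator], AsyncIterator]):
        self.fn = fn

    def __call__(self, stream: AsyncIterator) -> AsyncIterator:
        return self.fn(stream)

    def __add__(self, other: "Processor") -> "Processor":
        # The output of this stage becomes the input stream of the next.
        return Processor(lambda stream: other(self(stream)))

async def source(items):
    # Turn a plain list into an async stream.
    for item in items:
        yield item

def uppercase() -> Processor:
    async def fn(stream):
        async for part in stream:
            yield part.upper()
    return Processor(fn)

def exclaim() -> Processor:
    async def fn(stream):
        async for part in stream:
            yield part + "!"
    return Processor(fn)

async def run_pipeline(items):
    pipeline = uppercase() + exclaim()  # two stages chained with `+`
    return [part async for part in pipeline(source(items))]

print(asyncio.run(run_pipeline(["hello", "world"])))  # ['HELLO!', 'WORLD!']
```

The real library composes processors with `+` in the same spirit, as the real-time agent example later in this document shows.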
Usage Guide
Installation
GenAI Processors requires Python 3.10 or higher. Here are the detailed installation steps:
- Set up the environment:
  - Make sure Python 3.10+ is installed on your system.
  - Install Git so you can clone the repository.

```shell
sudo apt update && sudo apt install python3.10 git
```
- Clone the repository:
  - Clone the GenAI Processors repository from GitHub.

```shell
git clone https://github.com/google-gemini/genai-processors
cd genai-processors
```
- Install dependencies:
  - Use pip to install the required packages, including pyaudio, google-genai, and termcolor.

```shell
pip install --upgrade pyaudio genai-processors google-genai termcolor
```
- Configure API keys:
  - Get an API key from Google AI Studio.
  - Set the GOOGLE_API_KEY and GOOGLE_PROJECT_ID environment variables:

```shell
export GOOGLE_API_KEY="your-api-key"
export GOOGLE_PROJECT_ID="your-project-id"
```
Usage
At the heart of GenAI Processors is the Processor, which consumes an input stream and produces an output stream. The main features are walked through below:
1. Creating a simple text processor
- Functionality: processes text input and outputs the results.
- Workflow:
  - Import the modules and create an input stream.
  - Use stream_content to convert the text into a ProcessorPart stream.
  - Apply the processor and iterate over the output.

```python
from genai_processors import content_api, streams

input_parts = ["Hello", content_api.ProcessorPart("World")]
input_stream = streams.stream_content(input_parts)

# `simple_text_processor` is any Processor that handles text parts.
async for part in simple_text_processor(input_stream):
    print(part.text)
```
- Result: processes and prints the input text part by part; suitable for simple text tasks.
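The snippet above assumes a `simple_text_processor` already exists. As a purely illustrative stand-in (not the real library interface), such a text processor can be sketched as a plain async generator:

```python
# Library-free sketch of a "simple text processor": an async generator that
# transforms each text part as it streams through. This illustrates the
# streaming pattern only; the real genai-processors interface differs.
import asyncio
from typing import AsyncIterator

async def stream_content(parts) -> AsyncIterator[str]:
    # Stand-in for streams.stream_content: turn a list into an async stream.
    for part in parts:
        yield part

async def simple_text_processor(stream: AsyncIterator[str]) -> AsyncIterator[str]:
    # Example transformation: prefix each part to show per-part processing.
    async for part in stream:
        yield f"processed: {part}"

async def main():
    return [part async for part in simple_text_processor(stream_content(["Hello", "World"]))]

print(asyncio.run(main()))  # ['processed: Hello', 'processed: World']
```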
2. Building a real-time audio and video agent
- Functionality: processes live audio and video streams through LiveProcessor.
- Workflow:
  - Initialize an audio input device (such as PyAudio).
  - Configure the video input (e.g. a camera or screen stream).
  - Use LiveProcessor to call the Gemini Live API.
  - Chain the input, processing, and output modules.

```python
# `text` provides terminal_input() for the driving stream.
from genai_processors.core import audio_io, live_model, text, video
import pyaudio

pya = pyaudio.PyAudio()
input_processor = video.VideoIn() + audio_io.PyAudioIn(pya, use_pcm_mimetype=True)
live_processor = live_model.LiveProcessor(
    api_key="your-api-key",
    model_name="gemini-2.5-flash-preview-native-audio-dialog",
)
play_output = audio_io.PyAudioOut(pya)
live_agent = input_processor + live_processor + play_output

async for part in live_agent(text.terminal_input()):
    print(part)
```
- Result: captures microphone and camera input and plays back audio processed via the Gemini API; suitable for real-time conversational agents.
3. Research topic generation
- Functionality: generates research topics from user input.
- Workflow:
  - Use the topic_generator.py example and configure GenaiModel.
  - Set model parameters such as the number of topics and the output format.
  - Enter a research query to get a list of topics in JSON format.

```python
from genai_processors.examples.research.processors import topic_generator

processor = topic_generator.TopicGenerator(api_key="your-api-key")
async for part in processor(["Research AI applications in healthcare"]):
    print(part.text)
```
- Result: generates the requested number of research topics and their relationship to the input; suitable for academic research scenarios.
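Since the topics come back as JSON text, downstream code can parse them with the standard library. The schema below (a "topics" array with "title" and "relation" fields) is a hypothetical example for illustration, not the documented output format:

```python
# Sketch of consuming a JSON topic list. The field names used here are an
# assumption for illustration; check the actual topic_generator output.
import json

raw = (
    '{"topics": ['
    '{"title": "AI triage systems", "relation": "direct"}, '
    '{"title": "Medical imaging models", "relation": "related"}]}'
)

data = json.loads(raw)
titles = [t["title"] for t in data["topics"]]
print(titles)  # ['AI triage systems', 'Medical imaging models']
```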
4. Custom processors
- Functionality: create custom processors to handle specific tasks.
- Workflow:
  - Consult the create_your_own_processor.ipynb notebook.
  - Define a processor class that inherits from processor.Processor.
  - Implement the call method to handle the input stream.
  - Add the custom processor to the pipeline.
- Result: users can extend the library as needed, for example to handle specific file formats or integrate other APIs.
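The subclass-and-implement-`call` pattern described above can be sketched without the library. The base class here is a structural stand-in, not the real `processor.Processor` API:

```python
# Library-free structural sketch of a custom processor: subclass a base
# class and implement an async `call` over the input stream. BaseProcessor
# is a stand-in, NOT the real genai_processors processor.Processor.
import asyncio
from typing import AsyncIterator

class BaseProcessor:
    async def call(self, stream: AsyncIterator) -> AsyncIterator:
        raise NotImplementedError

    async def __call__(self, stream):
        # Delegate to the subclass's `call` implementation.
        async for part in self.call(stream):
            yield part

class WordCounter(BaseProcessor):
    """Custom processor: annotate each text part with its word count."""
    async def call(self, stream):
        async for part in stream:
            yield (part, len(part.split()))

async def source(items):
    for item in items:
        yield item

async def run(items):
    counter = WordCounter()
    return [out async for out in counter(source(items))]

print(asyncio.run(run(["hello world", "one two three"])))
# [('hello world', 2), ('one two three', 3)]
```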
Running the examples
- Real-time CLI example:
  - Run realtime_simple_cli.py to create an audio dialog agent.

```shell
python3 examples/realtime_simple_cli.py
```

  - Speak into the microphone; the system converts the speech to text, processes it, and plays back a voice response.
- Trip-planner CLI:
  - Run trip_request_cli.py to generate a travel plan.

```shell
python3 examples/trip_request_cli.py
```

  - Enter your destination and dates to get a detailed plan.
Notes
- Make sure the API key is valid to avoid failed requests.
- Set --debug=True to view the logs.
- Real-time processing requires a stable network and capable hardware.
Application Scenarios
- Real-time dialog agents
  - Description: develop voice- or video-driven AI assistants that process live user input, suitable for customer service or virtual assistants.
- Academic research support
  - Description: generate research topics or analyze literature, helping students and researchers organize their thoughts quickly.
- Multimodal content processing
  - Description: process audio and video streams to generate subtitles or real-time narration, suitable for live streaming or video analysis.
- Automated workflows
  - Description: build automated pipelines to process data in batches, suitable for enterprise data processing.
FAQ
- What are the prerequisites?
  - Python 3.10+, the pyaudio and google-genai libraries, and a Google API key.
- How do I debug the processing flow?
  - Run the script with --debug=True and inspect the log output to check the input and output streams.
- What data types are supported?
  - Text, audio, video, and custom data streams, all handled as ProcessorPart objects.
- How do I contribute code?
  - See CONTRIBUTING.md and submit a custom processor in the contrib/ directory.