DiffMem is a lightweight memory backend designed for AI agents and conversational systems. It uses Git as the core of the memory store, saving AI memories as human-readable Markdown files. Git's commit history tracks the evolution of memories over time, while an in-memory BM25 index provides fast, interpretable retrieval. The project is currently a proof of concept (PoC) exploring how a version control system can serve as an efficient, scalable memory foundation for AI applications. DiffMem treats memory as a versioned knowledge base: the "current state" of knowledge is stored in editable files, while historical changes live in the Git commit graph. This design lets agents query a compact, up-to-date layer of knowledge while still being able to dig into the memory's evolution when needed.
Features
- Git-driven memory storage: Leverage Git's version control capabilities to manage and track the evolution of AI memories, with each memory update corresponding to a Git commit.
- Human-readable format: Memories are stored as simple Markdown files, so developers can read, edit, and manage them directly.
- Current State Focus: By default, only the "current state" of knowledge documents are indexed and searched, reducing the scope of queries and improving retrieval efficiency and token economy in the context of large language models (LLMs).
- Differential evolution tracking: Using `git diff` and related commands, agents can efficiently query how a specific piece of information changed over time without loading the full history.
- Fast text search: A built-in, in-memory BM25 index delivers millisecond-level responses to keyword searches.
- Modular components: The system consists of several core modules: a writer agent that analyzes conversations and commits updates, a context manager that assembles query context, and a search agent that performs searches.
- Lightweight and easy to integrate: The project has few dependencies, requires no separate server, and can be integrated directly into existing projects as a Python module.
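The BM25 scoring behind the fast text search can be illustrated with a minimal pure-Python sketch. DiffMem itself uses the `rank-bm25` library; the scoring function and toy corpus below are ours, shown only to make the ranking behavior concrete.

```python
import math
from collections import Counter

def bm25_scores(query_tokens, docs, k1=1.5, b=0.75):
    """Score each tokenized document against the query using Okapi BM25."""
    N = len(docs)
    avgdl = sum(len(d) for d in docs) / N
    df = Counter()                      # document frequency of each term
    for d in docs:
        df.update(set(d))
    scores = []
    for d in docs:
        tf = Counter(d)
        score = 0.0
        for term in query_tokens:
            if term not in tf:
                continue
            idf = math.log((N - df[term] + 0.5) / (df[term] + 0.5) + 1)
            norm = tf[term] + k1 * (1 - b + b * len(d) / avgdl)
            score += idf * tf[term] * (k1 + 1) / norm
        scores.append(score)
    return scores

# Three toy "memory documents", already tokenized
docs = [
    "mom plans a trip next week".split(),
    "the weather is sunny today".split(),
    "mom likes coffee in the morning".split(),
]
scores = bm25_scores("mom trip".split(), docs)
```

The first document scores highest because it contains both query terms; the second, which contains neither, scores zero. This term-overlap behavior is what makes BM25 results easy to interpret compared with opaque vector similarity.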
Usage Guide
DiffMem is designed to be imported as a simple Python module, with no complex server deployment required. Below is a detailed installation and usage procedure.
Environment Preparation and Installation
- Clone the repository
First, clone the DiffMem source repository from GitHub to your local machine. Open a terminal and run:
```
git clone https://github.com/Growth-Kinetics/DiffMem.git
```
Once the command finishes, the repository is downloaded into a folder named `DiffMem` in the current directory.
- Enter the project directory
Use the `cd` command to move into the project folder:
```
cd DiffMem
```
- Install dependencies
DiffMem depends on several Python libraries, listed in the `requirements.txt` file. Install them with `pip`:
```
pip install -r requirements.txt
```
This command automatically downloads and installs the required libraries, including `gitpython`, `rank-bm25`, and `sentence-transformers`.
- Set the API key
DiffMem works together with a large language model (LLM), for example to analyze conversation content. The project uses OpenRouter to manage LLM calls, so you need to set your API key as an environment variable.
On Linux or macOS, use the `export` command:
```
export OPENROUTER_API_KEY='your-key'
```
On Windows, use the `set` command:
```
set OPENROUTER_API_KEY=your-key
```
Replace `your-key` with your own valid API key.
Core Functionality
DiffMem's main functionality is exposed through the `DiffMemory` class. Initialize this class, then call its methods to read, write, and query memory.
- Initialize the memory store
First, import the `DiffMemory` class and initialize it with a local path. This path serves as the Git repository where memories are stored.
```python
from src.diffmem import DiffMemory

# Initialize the memory store with a repository path, user name, and API key.
# If the path does not exist, a new Git repository is created automatically.
memory = DiffMemory(
    repo_path="/path/to/your/memory_repo",
    user_name="alex",
    api_key="your-OpenRouter-key"
)
```
In the code above, replace `/path/to/your/memory_repo` with the path to the folder where you want to store memories.
- Process and commit memories
You can pass a piece of conversation content to the `process_and_commit_session` method. DiffMem's writer agent automatically analyzes the text, extracts or updates entity information, and saves the changes as a single Git commit.
```python
# Suppose you have some new conversation content
conversation_text = "Had coffee with Mom today; she mentioned she is traveling next week."
session_id = "session-12345"  # assign a unique ID to this session

# Process and commit the memory from this session
memory.process_and_commit_session(conversation_text, session_id)
print("Memory processed and committed.")
```
Upon execution, the relevant knowledge is updated in the Markdown files and a new Git commit is created; the commit message contains the session ID.
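Because the session ID is embedded in the commit message, the commit for a given session can later be located with plain `git log --grep`. A minimal sketch, assuming only that the `git` CLI is on the PATH (the helper function below is ours, not part of DiffMem's API):

```python
import subprocess

def find_session_commits(repo_path, session_id):
    """Return short hashes of commits whose message mentions session_id."""
    out = subprocess.run(
        ["git", "-C", repo_path, "log", "--all", "--grep", session_id, "--format=%h"],
        capture_output=True, text=True, check=True,
    )
    return out.stdout.split()
```

For example, `find_session_commits("/path/to/your/memory_repo", "session-12345")` would list the commits produced by the session above, which is useful for auditing exactly what a given conversation wrote into memory.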
- Get context
When interacting with the AI, use the `get_context` method to retrieve background information relevant to the current conversation. The method supports different `depth` values to control the level of detail returned:
- `depth="basic"`: returns the core information blocks.
- `depth="wide"`: performs a semantic search and returns more broadly related information.
- `depth="deep"`: returns the complete contents of the files related to the query.
- `depth="temporal"`: returns temporal information drawn from Git history.
```python
# Suppose the current conversation is about "Mom's travel plans"
current_conversation = "Are Mom's travel plans settled?"

# Retrieve deep context
context = memory.get_context(current_conversation, depth="deep")

# Print the retrieved context
print("Retrieved context:")
print(context)
```
This context can be fed to the LLM to generate more accurate, contextually informed responses.
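Conceptually, the four depth levels amount to a dispatch over retrieval strategies. The sketch below is purely illustrative: the handler bodies and the function name are our assumptions and do not reflect DiffMem's actual internals.

```python
# Hypothetical sketch: each depth maps to a different retrieval strategy.
# None of these handlers implement DiffMem's real logic.
def get_context_sketch(query, depth="basic"):
    handlers = {
        "basic": lambda q: f"[core blocks matching {q!r}]",
        "wide": lambda q: f"[semantic-search matches for {q!r}]",
        "deep": lambda q: f"[full files relevant to {q!r}]",
        "temporal": lambda q: f"[git-history timeline for {q!r}]",
    }
    if depth not in handlers:
        raise ValueError(f"unknown depth: {depth}")
    return handlers[depth](query)
```

Structuring depth as a dispatch table makes the trade-off explicit: shallower levels cost fewer tokens when the context is passed to an LLM, while deeper levels trade token economy for completeness.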
- Run searches directly
You can also call the `search` method directly to retrieve information from the memory store.
```python
query = "information about Mom"
search_results = memory.search(query)

print(f"Search results for '{query}':")
for result in search_results:
    print(f"- {result}")
```
Running the sample code
A complete usage demo, `usage.py`, is available in the project's `examples/` directory. You can run it directly to observe DiffMem's full workflow.
Execute the following command in the terminal:
```
python examples/usage.py
```
This script demonstrates how to initialize the memory store, commit new memories, and retrieve context for new conversation content, showing DiffMem's entire pipeline from message input to output.
Application Scenarios
- Personal AI assistants
DiffMem can give personal AI assistants long-term memory. An assistant can remember user preferences, past conversations, important dates, and events. Because memories evolve over time, the assistant can accurately recall "what we discussed last week" or "how old my daughter is now": it focuses on the most recent state of the information while still retaining the historical trail.
- AI systems that require continuous learning
In areas such as customer service and technical support, AI agents must continually learn new product knowledge and business processes. DiffMem can record how this knowledge evolves: when an operating guide is updated, the system saves the new version and also logs the change in Git history, ensuring that the AI always provides the most accurate information and can trace any knowledge point back through its historical versions.
- Multi-agent collaboration
In a multi-agent system, different agents can share the same DiffMem memory. Through Git's branching and merging mechanisms, agents can collaboratively update shared knowledge and resolve potential "memory conflicts" to form a consistent, versioned team memory.
- Interpretability and debugging
To developers, AI can behave like a "black box". By storing memories as human-readable text with a Git commit history, DiffMem greatly improves their interpretability. Developers can review memories the way they review code, using `git log` and `git diff` to see what the AI has "learned" and how its knowledge has changed, which is very helpful when debugging the AI's behavior and decision-making.
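This `git diff` workflow is easy to script. A minimal sketch, assuming only that the `git` CLI is on the PATH (the helper function and its default revisions are ours, not DiffMem's API):

```python
import subprocess

def memory_diff(repo_path, file_path, rev_a="HEAD~1", rev_b="HEAD"):
    """Return the unified diff of one memory file between two commits."""
    out = subprocess.run(
        ["git", "-C", repo_path, "diff", f"{rev_a}..{rev_b}", "--", file_path],
        capture_output=True, text=True, check=True,
    )
    return out.stdout
```

Printing `memory_diff(repo_path, "alex.md")` shows exactly which lines of that memory file changed in the latest commit, i.e., what the AI "learned" in its most recent update.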
FAQ
- How is DiffMem different from traditional vector databases?
Vector databases are primarily built for similarity search over high-dimensional data: information (e.g., text) is converted into vectors and stored, and similar content is found by computing distances between vectors. DiffMem takes a completely different approach: it does not rely on vector embeddings, but manages memories as versioned text documents. Its core strength is handling information that evolves over time, clearly tracking changes to a fact (e.g., a person's age changing from 9 to 10), whereas vector databases may retain outdated, noisy entries when faced with this kind of "factual update".
- Why choose Git as the backend technology?
Git was chosen because it provides a mature, powerful solution for versioned document management, and its strengths map well onto the needs of AI memory: it natively supports tracking changes (`diff`), recording history (`log`), reverting to any point in time (`checkout`), and branch management (`branch`). In addition, Git is distributed and stores data as plain files, which makes the repository highly portable and durable, with no dependence on proprietary formats.
- Is DiffMem suitable for production environments?
Currently, DiffMem is a proof-of-concept (PoC) project, and its authors state clearly that it has not been hardened for production. It has several limitations: remote Git synchronization (`push`/`pull`) must be performed manually, error handling is relatively basic, and there is no locking mechanism for concurrent multi-user access. Further development and testing are therefore required before using it directly in large-scale commercial applications.
- What are the main software dependencies needed to run DiffMem?
DiffMem is a lightweight project. Its main dependencies are the `GitPython` library (for manipulating Git repositories from Python), the `rank-bm25` library (for efficient text retrieval), and `sentence-transformers` (for semantic search).
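A quick way to confirm these dependencies are installed is to probe for their import names. Note that the import names differ from the package names: `gitpython` is imported as `git` and `rank-bm25` as `rank_bm25`.

```python
import importlib.util

# Map each PyPI package name to the module name it is imported under.
packages = {
    "gitpython": "git",
    "rank-bm25": "rank_bm25",
    "sentence-transformers": "sentence_transformers",
}

# find_spec returns None when the module cannot be located.
status = {pkg: importlib.util.find_spec(mod) is not None
          for pkg, mod in packages.items()}
for pkg, ok in status.items():
    print(f"{pkg}: {'installed' if ok else 'missing'}")
```

Running this after `pip install -r requirements.txt` should report all three as installed.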