Overseas access: www.kdjingpai.com
Bookmark Us
Current Position:fig. beginning " AI How-Tos

Uncovering Claude Code: A Deep Reverse Engineering and Open Source Implementation

2025-09-25 74

Anthropic launches AI programming assistant Claude Code With its powerful code generation and interaction capabilities, it has gained a lot of attention in the developer community. However, as a closed-source product, its internal working mechanism is always a "black box" to the outside world, which not only makes it difficult for developers to fully understand its decision-making logic, but also restricts in-depth customization and optimization in specific technology stacks.

In this article, we'll take apart in detail one time for the Claude Code The reverse engineering practice. By intercepting and analyzing its API communications with the model, the researchers succeeded in revealing its complex Agent operation mechanism and eventually migrated its core capabilities to open source. Python organizing plan Pywen in a transparent and customizable alternative for the community.

Motivation: why deconstruct a closed-source AI assistant?

(go ahead and do it) without hesitating Claude Code Powerful, but its closed-source nature poses several core challenges:

  • Non-transparent decision-making logic: Developers are unable to explore their internal cue word design, tool invocation strategies, and context management mechanisms, making it difficult to debug and optimize when problems arise.
  • technology stack integration barrier: its native environment is based on TypeScriptFor the majority of Python There are certain integration and usage barriers for developers.
  • Limited ability to customize: The inability to tailor their behavioral patterns and response styles to specific business needs or personal preferences.
  • Cost control unknown: Without the ability to intervene in their API call strategy, it is difficult for users to effectively control the cost of usage by optimizing model selection or context length.

Therefore, the goal of this reverse engineering is very clear: complete deconstruction of the Claude Code works, extracts its core cue words and tool definitions, and replicates its capabilities natively to the Pywen In the framework.

Reverse Engineering: Uncovering the Black Box with Monkey Patch

Due to lack of direct access Claude Code of the source code, reverse engineering chose a clever entry point:Intercepting its relationship with Claude All API communication between models. All of its intelligent behavior, whether code generation or tool invocation, must be done through API requests.

The core technology uses "Monkey Patching", a technique that dynamically modifies or replaces code while the program is running. This is accomplished by claude-code-reverse program offered by the project, the researchers were able to Claude Code The web request module injects custom logic to capture the complete request and response data for each API call.

Run the modified Claude Code The API will generate detailed communication logs. By analyzing these logs, an originally opaque "thought process" is completely restored.

Reverse Discovery: The Eight Core Mechanisms of Claude Code

Through in-depth analysis of massive API logs, theClaude Code The complete workflow is gradually decrypted, and its core mechanism mainly contains the following eight parts:

1. Quota Query

Each time the program starts, it will first send a request to the Claude The model sends a lightweight dialog request to verify that the user's API quota is sufficient.

  • importation"quota"
  • mouldHaiku 3.5
  • goal: Selected Haiku Models enable fast, low-cost API quota validation while maintaining functionality.

2. Topic Detection

After receiving each input from the user, the system will use a file named check-new-topic of a particular prompt word to determine whether the current input opens a new topic. It is worth noting that this detection does not include the historical dialog context, and its main purpose is to update the session title of the terminal interface.

For example, in the same session, when the discussion switches content, the model returns the following JSON object:

{ "isNewTopic": true, "title": "Python文件创建" }

3. Core Agent process

this is Claude Code The main work cycle of a process is driven by a master Agent. This process uses the more capable Sonnet 4 model and consists of five core prompter components that together build the model's system-level instructions (System Prompt).

  • system-identity.prompt.md: Defines the identity and role of the Agent.
  • system-workflow.prompt.md:: Describe core workflows, codes of conduct, and response styles.
  • system-reminder-start.prompt.md: Inject dynamically generated environment information, such as the current working directory, IDE status, and so on.
  • Actual user input:: The user's original request.
  • system-reminder-end.prompt.md: Inject the task status, such as the current Todo list.

4. Context compression (Compact)

When the amount of Token in the dialog history approaches the upper limit of the model's context window, theClaude Code A context compression mechanism will be activated automatically or upon manual trigger by the user.

  • mouldSonnet 4
  • machine: Distill the entire history of the conversation into a single summary using a specialized compressed cue word.
  • goal:: This summary will serve as the initial context for the next conversation, thus freeing up space for new conversational content while preserving key information.

5. IDE integration

Claude Code Tight integration with IDEs (e.g. VS Code) through MCP (a communication protocol), the ability to read information about the currently open file, and the registration of specialized tools to interact with the IDE, for example:

  • getDiagnostics: Get diagnostic information such as errors and warnings in your code.
  • executeCode: Execute code snippets in an IDE environment.
  • In addition, it automatically fixes linting errors with these tools.

6. Todo short-term memory management

Claude Code A short-term memory system based on a local file system for managing task lists is implemented.

  • stockpile: The task status is stored in JSON format in the ~/.claude/todos/ Catalog.
  • framework:
    {
    "todos": [
    {
    "content": "Run tests",
    "status": "in_progress",
    "activeForm": "Running tests"
    }
    ]
    }
    
  • machine: The system loads and updates this JSON file in real time, ensuring that the Agent remembers and continues the current task between interactions.

7. Sub Agent system

In order to handle complex tasks and avoid contamination of the main Agent's context, theClaude Code The Sub Agent mechanism has been introduced, which is implemented through a Task Tools for multi-agent collaboration.

  • workflows: The Main Agent can assign a complex subtask.
  • incommunicado: The Sub Agent performs this task in a completely isolated context.
  • come (or go) back: After the task is completed, the Sub Agent returns the final result to the Main Agent. This mechanism effectively avoids the redundant information in the intermediate steps from interfering with the main dialog process, which is the so-called "dirty context" problem.

8. Historical summary of the dialogue

At the end of the session, the system uses the lightweight Haiku 3.5 The model generates a concise title for this conversation, which makes it easy for the user to manage and review the history of the conversation.

Test Cases: Breaking Down a Complete Interaction Flow

The following is a simple task to "create and modify a file", which clearly shows how the above mechanisms work together.

User input 1. > 当前目录创建test_py文件夹,写一个hello world Python文件

  1. Quota Query: (Completed at startup)
  2. Topic Detection:

    • System Prompt:
      Analyze if this message indicates a new conversation topic. If it does, extract a 2-3 word title that captures the new topic. Format your response as a JSON object with two fields: 'isNewTopic' (boolean) and 'title' (string, or null if isNewTopic is false). Only include these fields, no other text.
      
    • model output:
      { "isNewTopic": true, "title": "Python文件创建" }
      
  3. Core Agent Reminder & Tool Call:

    • System Prompt: Splicing System-Identity,System-Workflow Multiple cue word components such as.
    • Message: Inject dynamic alerts, such as code selected by the user in the IDE.
      <system-reminder> The user selected the lines 16 to 16 from /home/capios/.nvm/versions/node/v18.20.8/lib/node_modules/@anthropic-ai/claude-code/cli.js: USERPROFILE
      This may or may not be related to the current task. </system-reminder>
      
    • Tools: Pass in a detailed description of all available tools.
    • Model output (Assistant):: First export the thought process "I'll create a test_py folder and write a hello world Python file in it."and then call the Bash Tools.
      {
      "command": "mkdir test_py",
      "description": "Create test_py directory"
      }
      

      Notes:Bash The description of the tool is very detailed, including security tips, git operation specification, PR creation process and so on more than 400 lines of content, not shown here in full.

  4. Bash Tool Enforcement Policy:

    • Before executing a specific bash command, theClaude Code Another model will be invoked to analyze the security of the command and extract the command prefix to prevent command injection attacks.
    • System Prompt:
      Your task is to process Bash commands that an AI coding agent wants to run. This policy spec defines how to determine the prefix of a Bash command...
      
    • Message:
      ... Command: mkdir test_py
      
    • model output:
      mkdir
      
  5. The Bash utility performs post-processing:

    • After execution, the system calls the model again to determine if the command has displayed the contents of the file to decide whether to display the output on the CLI interface.mkdir command does not display the contents of the file, so the model returns falseThe
  6. Calling the Write utility:

    • After creating the directory, the Agent proceeds to the next step of the task by calling the Write tool to create the file.
    • Model Output (Tool Call):
      {
      "name": "Write",
      "input": {
      "file_path": "/home/capios/test_py/hello_world.py",
      "content": "print(\"Hello, World!\")"
      }
      }
      
  7. End of Write tool execution:

    • Model output (Assistant):
      Done! Created test_py/hello_world.py with a simple Hello World program.
      

User input 2. > 修改为hello Hangzhou

  1. New Topic Detection:

    • Same process, the model determines that this is a new theme.
    • model output:
      { "isNewTopic": true, "title": "Hangzhou Greeting" }
      
  2. Calling the Edit Tool:

    • The Agent understands that the user's intent is to modify the file, so it chooses to call the Edit Tools.
    • Model Output (Tool Call):
      {
      "name": "Edit",
      "input": {
      "file_path": "/home/capios/test_py/hello_world.py",
      "old_string": "print(\"Hello, World!\")",
      "new_string": "print(\"Hello, Hangzhou!\")"
      }
      }
      
  3. The Edit tool has been executed.:

    • Model output (Assistant):
      Modified to "Hello, Hangzhou!"
      

Migrating to Pywen: Open Source Migration from TypeScript to Python

In a complete understanding of Claude Code The next step, after the core mechanism of the Pywen framework. This process is not only a translation of code, but also a reproduction of design ideas.

Core migration strategy

All the core components extracted through reverse engineering were individually added in the Pywen A modular implementation was carried out within the framework:

  1. Cue word system complete migration:
    • placementpywen/agents/claudecode/prompts.py
    • realization: All cue words extracted from logs, including identity definitions, workflows, dynamic alerts, topic detection, and contextual compression policies, are integrated into the prompts.py in the file. These cue words contain a large number of detailed instructions that are Claude Code Cornerstone of the behavioral model.
  2. Tool Definition Accurate Reproduction:
    • placementpywen/agents/claudecode/tools/tool_adapter.py
    • realization: Utilizes the Adapter Pattern of design patterns for the Pywen The native tool dynamically replaces the Claude Code version of the description. This means that the underlying tool functionality remains the same, but the tool descriptions provided to the larger language model (i.e., the Tool Schema as seen by the LLM) are the same as those of the Claude Code exactly the same, thus accurately replicating its tool invocation behavior.
  3. Systematic reminder mechanism resurfaces:
    • placementpywen/agents/claudecode/system_reminder.py
    • realizationClaude Code of system alerts is based on contextual state smart injection.Pywen is passed through a SystemReminder module to reproduce this mechanism, such as alerting task management when the Todo list is empty, or suggesting running a test after multiple file edits.
  4. Todo Short-term Memory Management:
    • placementpywen/agents/claudecode/tools/todo_tool.py
    • realization: File system-based task state persistence is encapsulated in a TodoTool, which realizes adding, deleting, changing, and checking tasks, and manages them through three states (pending/in_progress/completed).

Architectural Benefits of the Pywen Framework

Pywen A flexible and complete architectural design made this complex migration process efficient and smooth:

  • Complete infrastructure: The framework provides BaseAgent Core components such as base classes, tool registration system, session management, track logging, etc., developers don't need to build from scratch.
  • Flexible Agent SystemMultiple Agent coexistence is supported. /agent command on different Agents (e.g. claude The Agent can be switched at any time between the default mode and the default mode, and each Agent has its own configuration without interfering with each other.
  • Rich ecosystems: The framework has built-in basic tools for file manipulation, code editing, web searching, etc., which can be used in combination as needed to replicate the Claude Code provides the basis for a complex toolset.

Result: Claude Code in Pywen

Once the migration is complete, users can now Pywen framework to seamlessly experience the same benefits as native Claude Code Highly consistent and powerful.

  • GitHub Links: https://github.com/PAMPAS-Lab/Pywen
# 启动 Pywen
$ pywen
# 切换到 Claude Code Agent 模式
> /agent claude
# 现在可以开始使用完整的 Claude Code 能力
> Help me refactor this code and add tests
# Todo 工具将自动激活并进行任务管理
Creating todo list:
1. Analyzing code structure
2. Refactoring implementation
3. Adding test cases

This successful reverse engineering and open source implementation not only reveals the internal design of a top-notch closed-source AI product, but more importantly, it brings these advanced Agent architectural ideas to the open source community, driving transparency and common progress in the technology.

All analysis in this article is based on publicly available API logs and is intended for technical learning and communication.

Recommended

Can't find AI tools? Try here!

Just type in the keyword Accessibility Bing SearchYou can quickly find all the AI tools on this site.

Top

en_USEnglish