Deep Deconstruction of Claude Code Source Code: The Agent Architecture Philosophy Behind 510,000 Lines of Code

2026-03-31

Anthropic recently released its native AI coding assistant, Claude Code. This article is based on a review of the @anthropic-ai/claude-code v2.1.88 source code. The engineering system, 510,000 lines of TypeScript spread across 1,902 files, not only shows how to build a full-featured terminal CLI tool; analyzed at the engineering level, it also reveals how a leading AI lab handles the interaction of large models with real physical systems, permission control, and context management.

[Figure: Overview of the analysis report]
(Note: the analysis of 500,000 lines of code took over an hour to run)

II. Project panorama and design philosophy

2.1 Code Size and Core Responsibilities

The distribution of code volume across directories shows clearly where the project's resources are concentrated. The infrastructure layer and the UI rendering layer take up the lion's share:

Module	Lines	Percentage	Core responsibilities
utils	180,472	35.2%	Permission control, Bash security interception, message processing pipelines, Git interactions, the MCP client, and other underlying infrastructure
components	81,546	15.9%	React terminal UI components (permission confirmation dialogs, code diff display, message rendering engine)
services	53,680	10.5%	API call wrappers, multi-layer context compression algorithms, MCP backend services, analytics, and OAuth authentication
tools	50,828	9.9%	Implementations of more than 40 core tools (Bash execution, FileEdit editing, Agent dispatching, MCP tools, etc.)
commands	26,428	5.2%	Terminal entry points for over 90 slash commands (e.g. /compact, /model, /mcp)
ink	19,842	3.9%	In-house fork of the Ink framework, a high-performance rendering engine for React in terminal environments
hooks	19,204	3.7%	React hooks that decouple permission handling, IDE state integration, and voice interaction logic
bridge	12,613	2.5%	Remote control protocol that lets a local machine execute tasks as a bridged environment
cli	12,353	2.4%	CLI argument parsing and lifecycle management of background sessions

2.2 Aerial view of the architecture

The bottom layer of the system relies on a set of basic services, while the upper layer drives the Agent's decisions and actions through a state machine and an event mechanism, giving the whole a highly modular structure.

[Figure: Architecture overview]

2.3 Five cross-cutting design principles

After disassembling these 510,000 lines of code, it is possible to distill five core design guidelines that dominate the direction of the architecture:

Tools as capability boundaries: The Agent has no backdoor for bypassing the toolset and manipulating the environment directly. Reading a file requires the FileReadTool, modifying a file relies on the FileEditTool, and system commands can only be executed through the BashTool. Extending the system's capabilities is exactly equivalent to adding new tools.
Fail-closed security defaults: Every default that touches a security property is extremely conservative. Tools are not allowed to run in parallel by default (isConcurrencySafe: false), are assumed to write by default (isReadOnly: false), and all operation permissions are blocked by default until the user authorizes them.
Context Engineering over Prompt Engineering: The engineering focus is not on using a long prompt to tell the model “who you are”, but on dynamically assembling the complete context in every round of dialog through segmented caching, dynamic state injection, and multi-level context compression.
Highly composable: Modules reuse one another seamlessly. Sub-Agents directly reuse the main thread's query() engine, external MCP tools reuse the system's internal permission-checking pipeline, and the Team collaboration model reuses the Subagent execution engine.
Compile-time elimination over run-time checks: The feature() macro of the Bun runtime [7] enables dead code elimination (DCE) at build time. Features that are not enabled are not merely skipped at runtime; their code is physically absent from the final bundle.

III. Agent Loop: The Executive Heart of the System

The core that drives the Agent lives in src/QueryEngine.ts (1,295 lines) and src/query.ts (1,729 lines), together with the tool execution layer in StreamingToolExecutor.ts (530 lines) and toolExecution.ts (1,745 lines).

3.1 Two-tier cyclic model

Rather than a bare while loop, the Agent Loop is an implicit state machine with 7 recovery paths and 10 termination conditions, structured in two layers:

[Figure: Two-tier loop model]

QueryEngine (outer layer, session management): Handles multi-round state, on-disk persistence of the Transcript, SDK protocol adaptation, and cumulative API token usage accounting.
queryLoop (inner layer, single-round execution): Focuses on issuing API calls, parsing and executing tool calls, and local error recovery.

The two layers are connected by an AsyncGenerator: the inner queryLoop yields message objects, and the outer QueryEngine consumes them. This design solves three engineering pain points:

Back-pressure control: The caller pulls data on demand, so the model's massive streaming output cannot flood memory.
Interruption propagation: Calling the generator's .return() cascades through and closes all nested generator instances, so a user cancellation propagates immediately to every underlying task.
Streaming composition: A Sub-Agent's runAgent() also returns an AsyncGenerator, so it can be nested directly into the main Agent's data stream.
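
The three properties above can be illustrated with a minimal sketch. It is not taken from the Claude Code source: the function bodies, the Message shape, and the cleanedUp log are hypothetical, and only the names queryLoop and queryEngine echo the article. The sketch shows pull-based consumption, nesting via yield*, and a cancellation that cascades into the nested generator's finally block.

```typescript
type Message = { role: "assistant" | "tool"; text: string };

// Hypothetical log used only to observe which generators were closed.
const cleanedUp: string[] = [];

// Inner loop: yields one message per turn; `finally` also runs on cancellation.
async function* queryLoop(name: string, turns: number): AsyncGenerator<Message, string> {
  try {
    for (let i = 1; i <= turns; i++) {
      yield { role: "assistant", text: `${name} turn ${i}` };
    }
    return "completed";
  } finally {
    cleanedUp.push(name);
  }
}

// Outer layer: nests a sub-agent stream and the main stream with yield*.
async function* queryEngine(): AsyncGenerator<Message, void> {
  yield* queryLoop("sub-agent", 2);
  yield* queryLoop("main", 1000);
}

// Consumer: pulls messages on demand and cancels after the third one.
async function demo(): Promise<string[]> {
  const seen: string[] = [];
  for await (const msg of queryEngine()) {
    seen.push(msg.text);
    // Leaving the loop calls .return() on the outer generator,
    // which yield* forwards to the generator currently running inside it.
    if (seen.length === 3) break;
  }
  return seen;
}
```

Because the consumer pulls, the "main" loop never produces its remaining 999 turns; its finally block runs as soon as the consumer stops.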

3.2 State Machine Design for queryLoop

queryLoop is essentially a while(true) loop in which each iteration represents one “API call + tool execution” cycle. Each iteration ends in one of two ways:

Terminal: A termination condition is reached; the loop ends and returns the specific reason.
Continue: A recovery path is triggered; state = next; continue carries the new state into the next iteration.

To prevent state from being missed, the system does not use piecemeal variable assignments; it maintains a single State structure instead:

type State = {
  messages: Message[]
  toolUseContext: ToolUseContext
  autoCompactTracking: AutoCompactTrackingState | undefined
  maxOutputTokensRecoveryCount: number
  hasAttemptedReactiveCompact: boolean
  pendingToolUseSummary: Promise<ToolUseSummaryMessage | null> | undefined
  turnCount: number
  transition: Continue | undefined
}

The comments at query.ts:266-268 state this motivation explicitly: replacing multiple independent variable updates with a single whole-State assignment forces the developer to declare every piece of state explicitly at each continue site, which avoids inconsistent context.
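
A reduced sketch of this pattern follows. It uses a hypothetical four-field LoopState rather than the real State type, and the step callback and the two transition reasons are assumptions made for illustration. The point is that each continue site builds the complete next state as one object literal, so the compiler rejects the literal if a field is forgotten.

```typescript
type Transition = { reason: "tool_results" | "max_tokens_recovery" };
type LoopState = {
  messages: string[];
  turnCount: number;
  maxOutputTokensRecoveryCount: number;
  transition: Transition | undefined;
};
type Terminal = { reason: "completed" | "max_turns"; state: LoopState };
type StepResult = "done" | "tool_use" | "max_tokens";

function runLoop(step: (s: LoopState) => StepResult, maxTurns: number): Terminal {
  let state: LoopState = {
    messages: [],
    turnCount: 0,
    maxOutputTokensRecoveryCount: 0,
    transition: undefined,
  };
  while (true) {
    // Terminal exits return a reason instead of mutating flags.
    if (state.turnCount >= maxTurns) return { reason: "max_turns", state };
    const result = step(state);
    if (result === "done") return { reason: "completed", state };

    // Continue exits: the whole next state is declared in one place.
    const next: LoopState =
      result === "max_tokens"
        ? {
            messages: [...state.messages, "[continue]"],
            turnCount: state.turnCount + 1,
            maxOutputTokensRecoveryCount: state.maxOutputTokensRecoveryCount + 1,
            transition: { reason: "max_tokens_recovery" },
          }
        : {
            messages: [...state.messages, "[tool result]"],
            turnCount: state.turnCount + 1,
            maxOutputTokensRecoveryCount: 0,
            transition: { reason: "tool_results" },
          };
    state = next;
    continue;
  }
}
```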

[Figure: queryLoop state transition diagram]

3.3 Message preprocessing pipeline: light to heavy context management

Before an API request is initiated, the list of messages goes through a pre-processing pipeline that strictly follows the “lightest to heaviest” principle. The system prioritizes low-cost, local operations before heavy-duty operations that consume API quota.

[Figure: Message preprocessing pipeline]

Context Collapse is performed before AutoCompact is called, because AutoCompact is costly and destroys fine-grained context, while Context Collapse preserves the original request as much as possible by collapsing secondary messages.

An exact mathematical model exists for the triggering of AutoCompact, with the following formula:

Effective Context Window = Model Context Window - max(max_output_tokens, 20000)

Trigger Threshold = Effective Context Window - 13000

For a model with a 200k context window, compression kicks in at around 167k tokens. Notably, a circuit breaker stops retrying after 3 consecutive failures. The source code comments cite real telemetry data: “1,279 sessions globally had more than 50 consecutive compression failures (up to 3,272), wasting about 250K API calls per day”. This small mechanism plugs a resource black hole in large-scale deployments.
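
The arithmetic and the circuit breaker can be checked with a short sketch. The constants come from the formula and the text above; the function and class names are hypothetical, and resetting the failure counter on success is an assumption, since the article only describes the stop-after-3 behavior.

```typescript
// Constants as stated in the formula above.
const OUTPUT_RESERVE_FLOOR = 20000;
const AUTOCOMPACT_BUFFER = 13000;
const MAX_CONSECUTIVE_FAILURES = 3;

// Trigger Threshold = (Model Context Window - max(max_output_tokens, 20000)) - 13000
function autoCompactThreshold(contextWindow: number, maxOutputTokens: number): number {
  const effectiveWindow = contextWindow - Math.max(maxOutputTokens, OUTPUT_RESERVE_FLOOR);
  return effectiveWindow - AUTOCOMPACT_BUFFER;
}

// Hypothetical breaker: opens after 3 consecutive failures.
class CompactCircuitBreaker {
  private consecutiveFailures = 0;

  isOpen(): boolean {
    return this.consecutiveFailures >= MAX_CONSECUTIVE_FAILURES;
  }

  // Assumption: a successful compaction resets the counter.
  record(succeeded: boolean): void {
    this.consecutiveFailures = succeeded ? 0 : this.consecutiveFailures + 1;
  }
}

function shouldAutoCompact(
  tokenCount: number,
  contextWindow: number,
  maxOutputTokens: number,
  breaker: CompactCircuitBreaker,
): boolean {
  if (breaker.isOpen()) return false; // stop burning API calls on a hopeless session
  return tokenCount >= autoCompactThreshold(contextWindow, maxOutputTokens);
}
```

With a 200,000-token window and max_output_tokens at or below 20,000, autoCompactThreshold returns 200,000 - 20,000 - 13,000 = 167,000, which matches the figure quoted in the text.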

3.4 Concurrency Control in the Streaming Tool Executor

When the model emits multiple tool call blocks within a single response, the system uses the StreamingToolExecutor (streaming execution) by default. During the API streaming phase, execution starts as soon as a complete tool_use block has been received.

The streaming executor's concurrency model is based on partitioning the tools:

[Figure: Streaming tool executor]

The executor reads each tool's isConcurrencySafe(input) property and groups consecutive concurrency-safe tools into a “parallel partition”. Whenever an unsafe tool is encountered (e.g. FileEdit, where two parallel FileEdits modifying the same file would inevitably cause line offsets and overwrites), a new partition is opened. Execution is strictly serial between partitions and fully parallel within a partition.
One defensive design choice: if the safety-check function itself throws a parsing exception, the system follows the fail-closed principle and treats the call as unsafe, preferring to sacrifice performance rather than risk a concurrency conflict.
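
The partitioning rule can be sketched as a pure function. This is an illustration rather than the StreamingToolExecutor implementation: the ToolCall and ToolDef shapes, the registry parameter, and the function names are hypothetical, and the sketch gives every unsafe call its own partition.

```typescript
type ToolCall = { name: string; input: unknown };
type ToolDef = { isConcurrencySafe: (input: unknown) => boolean };

// Fail-closed: unknown tools and throwing checks are treated as unsafe.
function isSafe(tool: ToolDef | undefined, input: unknown): boolean {
  if (!tool) return false;
  try {
    return tool.isConcurrencySafe(input);
  } catch {
    return false;
  }
}

// Groups consecutive safe calls together; each unsafe call stands alone.
// Partitions are meant to run one after another, calls inside one in parallel.
function partitionToolCalls(calls: ToolCall[], registry: Record<string, ToolDef>): ToolCall[][] {
  const partitions: { safe: boolean; calls: ToolCall[] }[] = [];
  for (const call of calls) {
    const safe = isSafe(registry[call.name], call.input);
    const last = partitions[partitions.length - 1];
    if (safe && last !== undefined && last.safe) {
      last.calls.push(call);
    } else {
      partitions.push({ safe, calls: [call] });
    }
  }
  return partitions.map((p) => p.calls);
}
```

A caller would then await Promise.all over each partition in turn, which yields serial execution between partitions and parallel execution within them.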

3.5 Message Withholding and Token Budget Management

Not every message returned by the API is passed straight through to the front end. The system withholds three types of messages:

prompt-too-long error: Withheld by reactiveCompact and retried internally after attempting compression.
media-size error: Retried internally after attempting to strip oversized image attachments.
max_output_tokens error: Held while the system decides whether to inject a continuation instruction.
The point of this withholding mechanism is to protect SDK consumers (e.g., desktop clients). Consumers tend to terminate the session as soon as they receive an error field, so hiding intermediate errors ensures that internal recovery loops are not interrupted prematurely from outside.

When the model stops normally but has not finished the task because the output grew too long, the system injects a nudge message prompting the model to continue, as long as the Token Budget allows. Sub-Agents are prohibited from using this budget, to prevent infinite hangs. Diminishing-returns detection also applies: if the increment is below 500 tokens for 3 consecutive checks, the model is judged to be spinning idle and the system forces termination.
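
The diminishing-returns rule can be sketched as a small tracker. The thresholds (500 tokens, 3 checks) come from the text; the class name, its fields, and the way the budget is compared are assumptions for illustration.

```typescript
const MIN_INCREMENT_TOKENS = 500;
const MAX_LOW_INCREMENT_CHECKS = 3;

// Hypothetical tracker deciding whether to inject another nudge message.
class ContinuationTracker {
  private readonly budget: number;
  private lastTotal = 0;
  private lowStreak = 0;

  constructor(budget: number) {
    this.budget = budget;
  }

  // Called after each stop with the cumulative output token count.
  shouldNudge(totalOutputTokens: number): boolean {
    const increment = totalOutputTokens - this.lastTotal;
    this.lastTotal = totalOutputTokens;
    this.lowStreak = increment < MIN_INCREMENT_TOKENS ? this.lowStreak + 1 : 0;
    if (this.lowStreak >= MAX_LOW_INCREMENT_CHECKS) return false; // spinning idle
    return totalOutputTokens < this.budget; // budget exhausted means stop
  }
}
```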

IV. The Tool System: Constraints and Capabilities

The tool system spans more than 40 directories and nearly 50,000 lines of code, and everything Agent can do is completely limited by the tool libraries provided by the system.

4.1 Six Functional Groups of the Tool Interface

The generic Tool interface specifies about 30 methods, divided functionally into six groups. The buildTool factory function injects strict safety defaults:

[Figure: The six functional groups of the Tool interface]

Property	Default value	Design motivation
`isConcurrencySafe`	`false`	Forces serial queuing on the assumption that concurrent execution will conflict
`isReadOnly`	`false`	Assumes the tool writes, which triggers strict write-permission auditing
`isDestructive`	`false`	Not flagged as destructive by default, to avoid warning fatigue in the UI
`checkPermissions`	`allow`	Allows by default at the tool level; actual interception is handled by the outer global permission system
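
A factory with these defaults might look like the sketch below. Only the four property names and their default values come from the table; the reduced Tool type, the PermissionResult shape, and the spread-based merge are assumptions, and the real interface has roughly 30 methods.

```typescript
type PermissionResult = { behavior: "allow" | "ask" | "deny" };

// Reduced, hypothetical view of the Tool interface.
type Tool = {
  name: string;
  isConcurrencySafe: (input: unknown) => boolean;
  isReadOnly: (input: unknown) => boolean;
  isDestructive: (input: unknown) => boolean;
  checkPermissions: (input: unknown) => PermissionResult;
};

// Defaults from the table: a tool must opt in to being parallel or read-only.
const TOOL_DEFAULTS: Omit<Tool, "name"> = {
  isConcurrencySafe: () => false,
  isReadOnly: () => false,
  isDestructive: () => false,
  checkPermissions: () => ({ behavior: "allow" }),
};

// A definition only overrides what it explicitly declares.
function buildTool(def: Pick<Tool, "name"> & Partial<Tool>): Tool {
  return { ...TOOL_DEFAULTS, ...def };
}

// Example: a search-style tool opts in to parallel, read-only execution.
const grepLike = buildTool({
  name: "GrepLike",
  isConcurrencySafe: () => true,
  isReadOnly: () => true,
});
```

A tool author who forgets to declare anything ends up with the slow but safe behavior: serial execution and write-level auditing.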

4.2 The 40-Plus Fields of ToolUseContext

A tool's call() method receives a very large ToolUseContext object, which shows that a tool is by no means a pure function:

Context field	Purpose	Why it cannot be omitted
`readFileState`	File read record cache	FileEditTool must check this field to refuse edits to files the model has not read.
`abortController`	Cancellation signal handle	Lets the user interrupt long-running build commands executed by the BashTool at any time.
`setToolJSX`	UI render callback injection	Lets tools such as BashTool draw progress-bar components directly in the terminal.
`agentId`	Instance identifier	Distinguishes the main process from subagents so that each can be bound to its own CWD.
`contentReplacementState`	Output budget control	Intercepts runaway tool output before it blows up the context.
`updateFileHistoryState`	File history state pointer	Tracks file changes; the underlying dependency of the `/rewind` undo command.

If a tool needs to modify global state after it runs (e.g. cd, to switch directories), it can only do so in a controlled way by returning a contextModifier field. This capability is open only to serial tools; concurrently executed tools are strictly forbidden from altering the global environment.

4.3 Compile-time Elimination and Partition Registration

src/tools.ts uses three strategies for tool registration and loading:

[Figure: Three loading strategies for tool registration]

Bun's feature() macro plays a key role here. When a require() branch guarded by a feature gate is determined to be false, the bundler not only skips the logic but removes it from the build output entirely through DCE (dead code elimination). A dynamic require() can be wrapped in a conditional whereas a static import cannot, which is why require() is used.
In the assembly phase, assembleToolPool applies a strict partitioned ordering: built-in tools come first, sorted alphabetically, and externally introduced MCP tools go at the end of the list. This unmixed layout ensures that the server-side Prompt Cache breakpoint consistently falls after the last built-in tool, so dynamically loaded MCP tools cannot invalidate long-lived caches.
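
The ordering rule can be sketched in a few lines. The function name matches the one mentioned above, but the signature, the ToolEntry type, and the choice to leave MCP tools in their incoming order are assumptions; the article only states that built-ins are sorted alphabetically and that MCP tools come last.

```typescript
type ToolEntry = { name: string };

// Code-unit comparison keeps the order identical across machines and locales.
function byName(a: ToolEntry, b: ToolEntry): number {
  return a.name < b.name ? -1 : a.name > b.name ? 1 : 0;
}

// Built-ins form a stable, sorted prefix; MCP tools are appended after it.
function assembleToolPool(builtIn: ToolEntry[], mcp: ToolEntry[]): ToolEntry[] {
  return [...[...builtIn].sort(byName), ...mcp];
}

// The built-in prefix is the same regardless of which MCP servers are connected,
// which is the property a prefix cache needs.
function builtInPrefix(pool: ToolEntry[], builtInCount: number): string {
  return pool.slice(0, builtInCount).map((t) => t.name).join(",");
}
```

Mixing the two groups into one alphabetical list would let a newly connected MCP tool land in the middle and shift everything after it, which is exactly what the partitioned layout avoids.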

4.4 BashTool: An 18-File Security Fortress

Because shell commands have nearly unlimited destructive power, BashTool alone occupies 18 files and builds an 8-layer defense system.

[Figure: BashTool's 8 layers of security checks]

Several particularly instructive security designs:
Compound command isolation: The system uses tree-sitter to parse the shell AST and splits a command such as cd /path && python3 evil.py precisely into separate SimpleCommands; the entire compound command is rejected if any sub-command fails the audit. To prevent ReDoS attacks and event-loop starvation, a single command may be split into at most 50 subcommands.
Flag-level whitelist validation: Not only is the command name checked; the flags and their values are examined in depth as well. For example, the accepted form of xargs -I is restricted to guard against exploitation of the GNU -i variant's semantics for process injection.
25 syntax injection checks: bashSecurity.ts contains more than 25 syntax-specific detection routines, covering command substitution (backquotes or $()), process substitution (<()), high-risk Zsh built-ins (such as zmodload and syswrite), control characters, and various kinds of disguised Unicode whitespace.
Sandbox as a backstop: SandboxManager forms the final safety net by constraining file-system read and write paths and restricting network addresses and Unix sockets.

4.5 FileEditTool's Search-Replace Invariants

Instead of line-number-based add/delete logic, FileEditTool enforces “search-and-replace”. This requires that the old_string supplied by the model match exactly one location in the target source file. If there is more than one match, the system aborts the modification and asks the model to supply more surrounding context until the match is unique. This rigid but reliable constraint eliminates the catastrophic case in which the model edits the wrong location.
Meanwhile, the previously mentioned readFileState forms a second invariant: editing unread files is prohibited. The model cannot blindly edit a file from its own “hallucinated memory”; it must first use FileRead to confirm the file's current state.
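
Both invariants fit in a short sketch. The in-memory Map standing in for the file system, the Set standing in for readFileState, the error strings, and the function names are all hypothetical; only the two rules (read before edit, exactly one match) come from the text.

```typescript
type EditResult = { ok: true; content: string } | { ok: false; error: string };

// Counts non-overlapping occurrences of needle in haystack.
function countOccurrences(haystack: string, needle: string): number {
  if (needle === "") return 0;
  let count = 0;
  let index = haystack.indexOf(needle);
  while (index !== -1) {
    count++;
    index = haystack.indexOf(needle, index + needle.length);
  }
  return count;
}

function applyEdit(
  files: Map<string, string>,   // stand-in for the file system
  readFileState: Set<string>,   // paths the model has already read
  path: string,
  oldString: string,
  newString: string,
): EditResult {
  // Invariant 1: no editing from memory; the file must have been read first.
  if (!readFileState.has(path)) return { ok: false, error: "file has not been read" };
  const content = files.get(path);
  if (content === undefined) return { ok: false, error: "file not found" };

  // Invariant 2: old_string must identify exactly one location.
  const matches = countOccurrences(content, oldString);
  if (matches === 0) return { ok: false, error: "old_string not found" };
  if (matches > 1) return { ok: false, error: "old_string is not unique; add more context" };

  // Slice-based replacement avoids the special "$" patterns of String.replace.
  const at = content.indexOf(oldString);
  const updated = content.slice(0, at) + newString + content.slice(at + oldString.length);
  files.set(path, updated);
  return { ok: true, content: updated };
}
```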

V. The Permission System: The System's Immune Barrier

This subsystem of more than 70 files carefully calibrates trust between “letting the AI act autonomously” and “preventing systemic catastrophe”.

5.1 A Continuous Spectrum of Trust Levels

The permission system defines six levels, from lockdown to full delegation:

[Figure: Permission modes]

Mode	Core behavior	Applicable workflows
`plan`	Strips all write permissions; the model can only be used for architectural planning and code review	Exploratory analysis and audit
`default`	Default startup mode; each tool call pops up a UI prompt requesting manual confirmation	Routine day-to-day development collaboration
`acceptEdits`	File reads and writes inside the workspace are allowed silently; everything else still requires confirmation	Code refactoring with a very high level of trust
`auto`	Introduces a model-side classifier that reviews safety in real time (internal environment)	High-frequency automation
`bypassPermissions`	Skips the regular manual confirmation flow entirely, retaining only the lowest-level core interception	CI environments or controlled, isolated containers
`dontAsk`	Converts any would-be prompt directly to Deny and skips the task without a pop-up	Fully automated unattended scripts

Alongside these modes, the system embeds a remote kill switch, bypassPermissionsKillswitch.ts, built on Statsig gating. When Anthropic detects a large-scale security incident, it can instantly and remotely downgrade every client running in bypass mode. The auto mode has a similar cut-off mechanism, autoModeCircuitBroken.

5.2 Multi-Layer Evaluation Pipeline and Rule Overrides

Sensitive commands such as rm -rf / are evaluated by a multi-layer pipeline:

[Figure: Main permission decision flow]

Explicit ask rules take absolute precedence: Even in bypassPermissions mode, if the user has manually configured ask: ["Bash(npm publish:*)"], the system still halts automatic execution and shows a prompt. This implements the design philosophy that “the user's explicit instructions always override the global mode”.
Hard-coded path immunity: Modifications to .git/, .claude/, .vscode/, and shell configuration files (e.g. .bashrc, .zshrc) are hard-coded to be immune to bypass mode and must be confirmed manually in all cases.
Classifier fuse protection: In auto mode, if the AI classifier rejects commands 3 times in a row, or 20 times in total within a single session, the system falls back from automatic rejection to a mandatory prompt asking the user to intervene. If the system is running in headless mode, it throws an AbortError and terminates the entire Agent process.
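
The classifier fuse can be sketched as a small state update plus a decision function. The two limits (3 consecutive, 20 total) come from the text; the type names, the action labels, and the assumption that an approval resets the consecutive counter are illustrative only.

```typescript
const MAX_CONSECUTIVE_DENIALS = 3;
const MAX_TOTAL_DENIALS = 20;

type DenialState = { consecutive: number; total: number };
type FuseAction = "keep_classifying" | "prompt_user" | "abort";

// Returns a new state; an approval is assumed to reset the consecutive count.
function recordDecision(state: DenialState, denied: boolean): DenialState {
  return denied
    ? { consecutive: state.consecutive + 1, total: state.total + 1 }
    : { consecutive: 0, total: state.total };
}

// Once the fuse trips, interactive sessions fall back to a prompt,
// while headless sessions have nobody to ask and must abort.
function nextAction(state: DenialState, headless: boolean): FuseAction {
  const tripped =
    state.consecutive >= MAX_CONSECUTIVE_DENIALS || state.total >= MAX_TOTAL_DENIALS;
  if (!tripped) return "keep_classifying";
  return headless ? "abort" : "prompt_user";
}
```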

5.3 Rule Sources, Syntax and Masking Detection

Permission rules can come from 8 different source levels. The policySettings level gives enterprise administrators absolute authority: once allowManagedPermissionRulesOnly: true is enabled, the system honors only the rules issued by the enterprise.

[Figure: Permission rule sources]

The shell matching syntax supports exact matching, legacy prefix matching (npm:*), and wildcard patterns. The system adds a nice touch here: when a pattern ends with a space followed by a wildcard (as in git *), the tail is compiled as an optional match, so the rule matches git add as well as the bare command git.
To address a common usability pitfall, shadowedRuleDetection.ts checks the configuration for shadowed rules. When a user mistakenly places deny: ["Bash"] ahead of allow: ["Bash(ls:*)"], the latter can never be reached because of the order of the evaluation pipeline, and the UI immediately renders a red warning telling the user to adjust the order.
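
One way to realize the optional-tail behavior is to compile each pattern to a regular expression. The sketch below is an assumption about how such a compiler could look, not the actual matcher: the function names are hypothetical, and the exact semantics chosen for the legacy npm:* form (the prefix followed by nothing or by whitespace and arguments) are a guess.

```typescript
// Escapes characters that are special inside a regular expression.
function escapeRegExp(text: string): string {
  return text.replace(/[.*+?^${}()|[\]\\]/g, "\\$&");
}

// Turns "a * b" into a regex source where each * matches any run of characters.
function wildcardToSource(pattern: string): string {
  return pattern.split("*").map(escapeRegExp).join(".*");
}

function compileRule(pattern: string): RegExp {
  if (pattern.endsWith(":*")) {
    // Legacy prefix form (assumed semantics): the prefix alone or followed by arguments.
    return new RegExp(`^${escapeRegExp(pattern.slice(0, -2))}(\\s.*)?$`);
  }
  if (pattern.endsWith(" *")) {
    // Trailing " *" is optional, so "git *" also matches the bare command "git".
    return new RegExp(`^${wildcardToSource(pattern.slice(0, -2))}( .*)?$`);
  }
  // Exact match, with any inner * treated as a wildcard.
  return new RegExp(`^${wildcardToSource(pattern)}$`);
}

function ruleMatches(pattern: string, command: string): boolean {
  return compileRule(pattern).test(command);
}
```

Anchoring with ^ and $ matters here: without it, a rule for git would also match gitk or a command that merely contains the word.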

5.4 Three Permission Handlers for Multi-Agent Setups

The permission handler splits into three paths for different types of subordinate Agents:

interactiveHandler: The standard interactive mode; requests are made through a UI prompt.
coordinatorHandler: Works with Classifiers and Hooks to attempt silent approval first, falling back to manual intervention when that fails.
swarmWorkerHandler: Uses the bubble mechanism to send requests across processes to the Leader Permission Bridge on the main terminal.
If the Agent is an asynchronous task marked shouldAvoidPermissionPrompts: true, it has no interface on which to show a prompt; when it hits a confirmation request, it silently rejects it (auto-reject).

VI. Multi-Agent Collaboration: Building Swarm Intelligence

For complex, long-running development tasks, the system builds a robust multi-agent architecture out of task decomposition and workspace isolation.

6.1 Three-Layer Abstraction Boundary

[Figure: Three-layer collaboration architecture]

Subagent: An extremely lightweight child node, usually launched synchronously or asynchronously by the parent node, used for atomic tasks such as looking up a function definition.
Team/Swarm: A team topology with a complete lifecycle. Members are divided into Leader and Teammate roles and can communicate peer to peer, which makes this layer well suited to parallel refactoring across front end and back end.
Coordinator: A pure task orchestrator. Nodes in this mode are barred from invoking the underlying read/write tools and focus entirely on parsing child-node reports and issuing commands.

6.2 AgentTool's Unified Routing

Every action that launches a child agent converges on a single AgentTool. This greatly reduces the cognitive overhead the model faces when calling tools.

[Figure: AgentTool routing design]

An Agent definition is abstracted as a three-way union type: built-in agents (BuiltIn), user-defined agents (Custom), and external plug-ins (Plugin). Conflicts are resolved by layered overriding in the order built-in < plugin < userSettings < projectSettings < flagSettings < policySettings.
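
The override order can be expressed as a rank lookup. The six source names and their order come from the text; the AgentDefinition shape, the function name, and the rule that a later definition wins a tie within the same source are assumptions.

```typescript
// Lowest to highest precedence, as listed in the text.
const SOURCE_PRECEDENCE = [
  "built-in",
  "plugin",
  "userSettings",
  "projectSettings",
  "flagSettings",
  "policySettings",
] as const;

type AgentSource = (typeof SOURCE_PRECEDENCE)[number];
type AgentDefinition = { agentType: string; source: AgentSource; prompt: string };

// For each agentType, keeps the definition from the highest-ranked source.
function resolveAgents(definitions: AgentDefinition[]): Map<string, AgentDefinition> {
  const rank = (source: AgentSource): number => SOURCE_PRECEDENCE.indexOf(source);
  const resolved = new Map<string, AgentDefinition>();
  for (const def of definitions) {
    const current = resolved.get(def.agentType);
    if (current === undefined || rank(def.source) >= rank(current.source)) {
      resolved.set(def.agentType, def);
    }
  }
  return resolved;
}
```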

6.3 Highly Specialized Built-in Agents

Several special-purpose Agents are hard-coded into the system:
Explore (search-specific): It is locked to the Haiku model (the cheapest and fastest), with Edit/Write permissions completely disabled. More aggressively still, the source code comments show that it deliberately drops two large pieces of read-only context: CLAUDE.md (the project conventions file) and Git status. At large-scale deployment, with 34 million spawned calls per week, omitting just these two items saves 5-15 Gtok/week.
Verification: It has the longest prompt, at nearly 120 lines. The prompt is designed to counter the self-deceiving rationalization LLMs are prone to (“the code looks fine, the tests should pass”) by requiring concrete test commands and their standard output logs before any conclusion is reached. It also carries the background: true marker and always runs asynchronously in the background.

6.4 Fine-grained control of sub-node execution engines

runAgent() carries the burden of child-node execution. Its core logic is to fully reuse the main node's query() loop.
For permission overrides, the system enforces a mandatory safety constraint: a child node may declare its own permissionMode, but it can never pull a parent running in bypassPermissions, acceptEdits, or auto mode back into default mode. In short, a subsystem cannot break the global trust flow by masquerading as a higher security level.

Tool filtering goes through three levels of screening:

The global deny list (ALL_AGENT_DISALLOWED_TOOLS) blocks tools across the board.
The deny list for non-built-in Agents (CUSTOM_AGENT_DISALLOWED_TOOLS) blocks tools that custom Agents may not use.
An allowlist specific to asynchronous Agents (ASYNC_AGENT_ALLOWED_TOOLS).
Notably, all external MCP tools whose names start with the mcp__ prefix are let through unconditionally, bypassing all of the rules above, so that external extensions are not restricted by the internal topology type.
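
The three screening levels and the mcp__ exemption can be sketched as one filter. The contents of the real lists are not given in the article, so the lists are passed in as parameters here; the function name, the option names, and the order in which the checks run are assumptions.

```typescript
type AgentKind = { isBuiltIn: boolean; isAsync: boolean };
type ToolLists = {
  allDisallowed: Set<string>;    // stands in for ALL_AGENT_DISALLOWED_TOOLS
  customDisallowed: Set<string>; // stands in for CUSTOM_AGENT_DISALLOWED_TOOLS
  asyncAllowed: Set<string>;     // stands in for ASYNC_AGENT_ALLOWED_TOOLS
};

function filterToolsForAgent(toolNames: string[], agent: AgentKind, lists: ToolLists): string[] {
  return toolNames.filter((name) => {
    // External MCP tools pass through regardless of the lists below.
    if (name.startsWith("mcp__")) return true;
    // Level 1: blocked for every sub-agent.
    if (lists.allDisallowed.has(name)) return false;
    // Level 2: additionally blocked for non-built-in agents.
    if (!agent.isBuiltIn && lists.customDisallowed.has(name)) return false;
    // Level 3: asynchronous agents only get what is explicitly allowed.
    if (agent.isAsync && !lists.asyncAllowed.has(name)) return false;
    return true;
  });
}
```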

At the end of the lifecycle, the cleanup phase performs exactly eight cleanup tasks: disconnecting dedicated MCP connections, removing session hook listeners, purging the allocated Prompt Cache, freeing the file state cache, deregistering Perfetto tracing handles, clearing the Transcript mapping table, sweeping orphaned todo tasks, and killing background bash processes that have not exited.

6.5 Fork Subagent: Squeezing the Most out of Cache Hit Rates

As an experimental mode, Fork aims to inherit the main node's full in-memory context and conversation history.
Its core goal is to maximize the Prompt Cache hit rate. All incoming history message structures and tool_use placeholders must be kept bit-for-bit identical, with only a single instruction specific to the current child node appended at the very end of the data stream. When multiple Forks are launched concurrently, they hit the remote prefix cache directly, because the first 99% of their data sequences overlap completely.
The protection against recursive explosion is equally strict: the system implements a double check. It relies on both querySource (a check that survives compaction) and a scan of the message stream for the <fork-boilerplate> tag, which together act as double insurance against infinite spawning.
The final injected instruction uses the blunt all-caps English “STOP. READ THIS FIRST.” to force the LLM to shift its attention, requiring it to ignore the inherited identity and take over the newly assigned task.

6.6 Team/Swarm dual-track back-end mechanism

To support teamwork, the system is built on two completely different backends:

[Figure: The two Team/Swarm backends]

The system detects the environment and degrades accordingly: inside tmux it enables TmuxBackend; inside iTerm2 it enables ITermBackend; if neither applies but tmux is detected on the system, it launches an external tmux session; if nothing is available, it aborts with an error and asks the user to configure the environment. In pure SDK-driven mode, the system is locked to the in-process backend.
The differences between the backends are hidden behind the unified TeammateExecutor interface, and the API exposed to the upper layers is simply spawn(), sendMessage(), and terminate().

6.7 The Highly Complex In-Process Engine

At nearly 1,400 lines, src/utils/swarm/inProcessRunner.ts is the most complex hub in the collaboration network.

[Figure: In-process Teammate runner]

Protection against permission escapes under shared memory: createInProcessCanUseTool() builds a refined three-tier approval flow:

Perform the usual allow / deny checks.
If the result is ask, the backend Classifier gets the first attempt at a decision.
If manual intervention is necessary, the core component, the Leader Permission Bridge, is engaged.
Through the bridge, the background task calls back into the setToolUseConfirmQueue exposed by the foreground REPL, and an approval prompt carrying the Teammate's badge is drawn on the main screen. Approval results are returned with the preserveMode: true flag, which keeps subsystems from probing the main system's global permission mode through this path.

Lifecycle control introduces an Idle standby state. Instead of being destroyed when it completes its task, a Teammate is suspended and sends the main node an idle message containing a peer-to-peer communication digest (peer DM digest).
To guard against runaway memory growth, the component hard-codes TEAMMATE_MESSAGES_UI_CAP = 50. The source code comments reveal that this limit is a lesson learned from a disastrous “whale session” stress test, in which the system spawned 292 Agents in 2 minutes and memory consumption exceeded 36.8GB.

6.8 Cross-Process Mailboxes and the Coordinator Design

Communication between independent OS processes is built on the local file-system path ~/.claude/teams/<teamName>/mailbox/<agentName>/, with routing unified through SendMessageTool.

[Figure: Teammate communication routing]

The communication protocol includes structured control frames (such as shutdown_request, shutdown_response, and plan_approval_request), which give the network self-recovery and request/response flow handling.

One level up, coordinatorMode.ts reveals the design principles of the top-level commander. It holds only 6 command tools, such as TeamCreate, TaskStop, and SendMessage. Its 260-line dedicated prompt defines a four-phase logic that includes synthesizing, analyzing, and issuing commands.
A core anti-pattern rule is hidden here: “Never delegate the process of understanding.” The coordinator is forbidden from sending child nodes vague instructions like “fix the bug based on your analysis”; it must digest all the context itself and translate it into execution instructions with precise file and line-number pointers.
To compensate for the Coordinator's inability to edit files, it has a dedicated Scratchpad directory. All workers can drop intermediate analysis into this directory without going through the regular read/write approvals. In addition, this mode is completely mutually exclusive with Fork: a commander that has no real authority over file state has nothing valid for a fork to inherit.

6.9 Infrastructure for Asynchronous Work: The Task System

Any logic that would block the main interface for a long time is moved into AppState.tasks for centralized management.

[Figure: Task system]

LocalAgentTask provides fine-grained progress reporting: a live count of tool invocations, a global token ledger covering input and output, and text descriptions of the last 5 invocations.
InProcessTeammateTask maintains a dual-AbortController design: the main abortController tears down the entire teammate, while currentWorkAbortController cancels only the tool round currently in progress.
DreamTask (the dream organizer task): This is the most distinctive of these concepts. While the terminal is idle, it starts automatically as a daemon, replays recent conversation data, and organizes and condenses it into a memory file. If it is interrupted unexpectedly, its kill() routine automatically rolls back the lock file's mtime timestamp so that the work can be picked up seamlessly on the next start.

6.10 Passing the full set of permissions between multiple nodes

These six chains of transmission strictly adhere to the “minimum exposure + prevention of transboundary proliferation” paradigm.

[Figure: Permission propagation rules]

VII. System Prompt Engineering: Rebuilding the Context

Step into src/constants/prompts.ts and src/context.ts, and the logic there declares hand-written, static prompts obsolete. The system performs extremely sophisticated Context Engineering.

7.1 Memoized Segmented Cache System

Rather than being assembled into one giant string, the large body of prompt text is split into separate string[] segments. The main motivation is Prompt Cache optimization at the API layer.

Figure: System Prompt segmented cache architecture

SYSTEM_PROMPT_DYNAMIC_BOUNDARY acts as a watershed. Everything above it is a static constant that carries a scope: 'global' cache marker and is shared across all users. When a large number of requests hit the same constant prefix, API cost drops sharply and response latency shrinks as well.
Everything below the boundary is specific to the user. Functions that intentionally break the cache carry the warning-style name DANGEROUS_uncachedSystemPromptSection, which forces developers to supply a literal _reason parameter that reviewers can check during code review.
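
A minimal sketch of the boundary split described above, under stated assumptions: only the name SYSTEM_PROMPT_DYNAMIC_BOUNDARY and the 'global' scope come from the article. The sentinel value, the PromptBlock shape, the "none" scope, and the function splitAtBoundary are hypothetical.

```typescript
// Hypothetical sketch: the sentinel value and helper names are assumptions.
const SYSTEM_PROMPT_DYNAMIC_BOUNDARY = "__SYSTEM_PROMPT_DYNAMIC_BOUNDARY__";

interface PromptBlock {
  text: string;
  cacheScope: "global" | "none";
}

// Split prompt segments into a shareable static prefix and a per-user suffix.
function splitAtBoundary(segments: readonly string[]): PromptBlock[] {
  const idx = segments.indexOf(SYSTEM_PROMPT_DYNAMIC_BOUNDARY);
  // Without a boundary marker nothing is assumed safe to share, so treat it all as dynamic.
  const staticPart = idx === -1 ? [] : segments.slice(0, idx);
  const dynamicPart = idx === -1 ? [...segments] : segments.slice(idx + 1);

  const blocks: PromptBlock[] = [];
  if (staticPart.length > 0) {
    blocks.push({ text: staticPart.join("\n\n"), cacheScope: "global" });
  }
  if (dynamicPart.length > 0) {
    blocks.push({ text: dynamicPart.join("\n\n"), cacheScope: "none" });
  }
  return blocks;
}

const promptBlocks = splitAtBoundary([
  "You are a coding assistant.",
  "Prefer dedicated tools over shell commands.",
  SYSTEM_PROMPT_DYNAMIC_BOUNDARY,
  "Working directory: /home/alice/project",
]);
```

Because a cached prefix must match byte for byte, anything that varies per user has to sit after the marker; a single dynamic value placed above it would invalidate the shared prefix for everyone.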

7.2 Static Constitution and Specialization Strategies

The Doing Tasks section repeatedly stresses code simplicity to the model: “Three similar lines of code is better than a premature abstraction.” This is essentially a defensive directive that suppresses the LLM's inherent tendency to over-engineer and show off.
The Actions section explicitly states the principle of authorization isolation: a user approving an action once does not mean it is approved in every context.
The internal build's configuration even carries the peculiar constraint “don't write any code comments by default.” Source comments confirm the reason: in early evaluations of the Capybara v8 model, speculative logic generated by the model produced a false-statement rate of 29-30%, so banning comments outright became the fastest interim compromise.
In the tools section, the model is strictly instructed to use dedicated tools instead of Bash command chains (e.g., FileRead instead of cat); the structured output and controlled environment of dedicated tools significantly reduce surprises.

7.3 Environmental Detection and Dynamic Incremental Injection

During session initialization, src/context.ts walks up the directory tree, detects and merges the CLAUDE.md files it finds, and injects them into userContext. A --bare start disables this automatic discovery but still honors directories mounted manually through --add-dir; in other words, “bare means not adding anything on its own initiative, not refusing input.”
The design also breaks a circular dependency involving the classifier: yoloClassifier → claudemd → filesystem → permissions → yoloClassifier. After its first load, the CLAUDE.md content is cached in bootstrap/state.ts, which cuts the cycle.
Git state injection fetches five pieces of information concurrently (current branch, main branch, working-tree status, the last five commits, and the current Git user). To prevent information overload, the working-tree status is hard-truncated to 2000 characters, and the prompt tells the model to fetch the full output through BashTool when it needs it.
When communicating with external MCP servers, the isMcpInstructionsDeltaEnabled switch makes the system send instructions incrementally, and only when the server topology changes, which avoids retransmitting duplicate bytes.
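
The Git injection step described above can be sketched as follows. This is a hedged illustration only: the 2000-character limit and the five categories come from the article, while the function names, the GitRunner type, the truncation notice, and the specific git commands chosen are assumptions.

```typescript
// Hypothetical sketch: function names and the exact git commands are assumptions.
const MAX_STATUS_CHARS = 2000; // limit stated in the article

// Hard-truncate long status output and tell the model where to get the rest.
function truncateStatus(status: string, limit: number = MAX_STATUS_CHARS): string {
  if (status.length <= limit) return status;
  return status.slice(0, limit) + "\n... (truncated; run `git status` via BashTool for the full output)";
}

// The caller injects a runner so the sketch stays independent of any process API.
type GitRunner = (args: string[]) => Promise<string>;

async function collectGitContext(run: GitRunner): Promise<string> {
  // Promise.all starts all five queries at once instead of awaiting them one by one.
  const [branch, mainBranch, status, log, user] = await Promise.all([
    run(["rev-parse", "--abbrev-ref", "HEAD"]),
    run(["symbolic-ref", "--short", "refs/remotes/origin/HEAD"]),
    run(["status", "--short"]),
    run(["log", "--oneline", "-n", "5"]),
    run(["config", "user.name"]),
  ]);
  return [
    `Current branch: ${branch.trim()}`,
    `Main branch: ${mainBranch.trim()}`,
    `Git user: ${user.trim()}`,
    `Status:\n${truncateStatus(status)}`,
    `Recent commits:\n${log.trim()}`,
  ].join("\n");
}
```

Truncating only the status field reflects the fact that it is the one value with unbounded size; the branch names, user, and five-commit log are short by construction.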

7.4 Four Layers of Defense: Compression and Reconstruction

When the limited token context is close to exhaustion, the system activates four levels of compression defense:

Figure: Four-layer compression strategy

The AutoCompact strategy uses an extremely fine-grained guiding prompt that forces the summary to retain nine categories of information, including error paths. The central rule is that “all original user messages that are not tool output must be retained.” LLMs tend to drop incidental user requests such as “I just said I don't want to use Redux” during summarization, so this rule is essentially an engineered anti-forgetting technique.
After compression finishes and reconstruction begins, the system re-injects up to 5 core files (capped at 5000 tokens each) plus up to 25000 tokens of Skill instructions, so that the model does not completely lose sight of file details.
Because AutoCompact is so destructive, it is mutually exclusive with Context Collapse, a gentler, progressive folding scheme: AutoCompact is forcibly suppressed whenever collapse mode is active.
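
The post-compression file re-injection budget described above can be sketched like this. The two limits (5 files, 5000 tokens each) come from the article; everything else is an assumption, including the names, the most-recent-first ordering, and the crude estimate of four characters per token, which stands in for a real tokenizer.

```typescript
// Hypothetical sketch: names and the 4-chars-per-token estimate are assumptions.
interface FileSnapshot {
  path: string;
  content: string;
}

const MAX_REINJECTED_FILES = 5;     // limit stated in the article
const MAX_TOKENS_PER_FILE = 5000;   // limit stated in the article
const CHARS_PER_TOKEN = 4;          // rough stand-in for a real tokenizer

function estimateTokens(text: string): number {
  return Math.ceil(text.length / CHARS_PER_TOKEN);
}

function truncateToTokens(text: string, maxTokens: number): string {
  const maxChars = maxTokens * CHARS_PER_TOKEN;
  return text.length <= maxChars ? text : text.slice(0, maxChars);
}

// recentFiles is assumed to be ordered most-recently-used first.
function selectFilesForReinjection(recentFiles: readonly FileSnapshot[]): FileSnapshot[] {
  return recentFiles.slice(0, MAX_REINJECTED_FILES).map((file) => ({
    path: file.path,
    content: truncateToTokens(file.content, MAX_TOKENS_PER_FILE),
  }));
}

// Usage: seven candidate files, the first of which is far over budget.
const candidates: FileSnapshot[] = [];
for (let i = 0; i < 7; i++) {
  candidates.push({ path: `src/file${i}.ts`, content: "x".repeat(i === 0 ? 30000 : 100) });
}
const reinjected = selectFilesForReinjection(candidates);
```

Capping both the file count and the per-file size bounds the worst case at 25000 tokens for files, so the reconstruction step cannot refill the context it just freed.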

VIII. Terminal UI: Refactoring React's Advanced Rendering Engine

The entire CLI interface is driven by the 90 files in the src/ink/ directory, nearly 20,000 lines of React code. This overturns the image of a CLI as crude text output.

8.1 Five-Stage Rendering Pipeline and TypeScript Yoga

Figure: Terminal UI rendering pipeline

React Reconciler: react-reconciler turns React <Box> elements into ink-box nodes in the underlying terminal DOM, each with a Yoga layout node attached. It uses ConcurrentRoot, so React 19 concurrent features are supported.
Pure-TypeScript Yoga rewrite: this removes the WASM-based layout computation used by the original Ink, avoiding the first-screen delay of await loadYoga() and the linear-memory growth seen after long runs. All rendering code targets the LayoutNode abstraction layer instead.
Resident object pools: CharPool resolves ASCII characters with an O(1) index lookup into an Int32Array; StylePool directly caches the transition sequences between styles; HyperlinkPool handles link output for the OSC 8 protocol. A periodic cleanup that runs every five minutes uses the migrateScreenPools method to reclaim pool space and migrate the cells that are still active.
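
The idea behind an interning pool with an O(1) ASCII fast path can be illustrated as follows. This is a sketch of the general technique, not the actual CharPool: the class name InternPool, its methods, and the Map fallback for non-ASCII characters are assumptions.

```typescript
// Hypothetical sketch of character interning; not the actual CharPool implementation.
class InternPool {
  private readonly chars: string[] = [];
  // Direct lookup table for ASCII: index = char code, value = pool id (-1 = not interned yet).
  private readonly asciiIds = new Int32Array(128).fill(-1);
  // Fallback for everything outside ASCII (CJK, emoji, combined graphemes).
  private readonly otherIds = new Map<string, number>();

  // Return a small integer id for the character, reusing the id on repeat calls.
  intern(ch: string): number {
    const code = ch.length === 1 ? ch.charCodeAt(0) : -1;
    if (code >= 0 && code < 128) {
      const asciiId = this.asciiIds[code] ?? -1;
      if (asciiId !== -1) return asciiId;
      const newId = this.add(ch);
      this.asciiIds[code] = newId;
      return newId;
    }
    const otherId = this.otherIds.get(ch);
    if (otherId !== undefined) return otherId;
    const newId = this.add(ch);
    this.otherIds.set(ch, newId);
    return newId;
  }

  // Resolve an id back to its character.
  get(id: number): string | undefined {
    return this.chars[id];
  }

  size(): number {
    return this.chars.length;
  }

  private add(ch: string): number {
    this.chars.push(ch);
    return this.chars.length - 1;
  }
}

const pool = new InternPool();
const idA = pool.intern("a");
const idB = pool.intern("b");
const idAAgain = pool.intern("a");
const idWide = pool.intern("你");
```

Storing ids instead of strings lets each screen cell be a few integers, so comparing two frames becomes integer comparison rather than string comparison.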

8.2 Ultimate anti-flicker hardware-level optimization

The system traverses only the regions marked dirty by a change or a move. As a result, the CPU cost of a clock animation or a spinner component scales only with the area it occupies.
For large volumes of scrolling text, the renderer avoids brute-force redraws and instead relies on the terminal's DECSTBM scroll-region mechanism (the CSI top;bot r and CSI n S sequences). It applies the same shift to its internal prev.screen, so the diff algorithm only has to handle the new lines that scroll onto the screen.
Double buffering maintains a front frame and a back frame, combined with Lodash throttling at 16ms (about 60fps) in leading + trailing mode. Using DEC mode 2026, each batch of update data is wrapped in BSU (Begin Synchronized Update) and ESU (End Synchronized Update), so the terminal refreshes each frame atomically and flicker is eliminated. Line parsing additionally uses a charCache that reuses previously computed ANSI results and skips the expensive work.
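
The escape sequences mentioned above can be written out as a short sketch. The sequences themselves are standard (DECSTBM is CSI top;bot r, Scroll Up is CSI n S, and synchronized output is DEC private mode 2026); the helper function names are hypothetical and not taken from the source.

```typescript
// Hypothetical helper names; the escape sequences are the standard ones.
const ESC = "\x1b";
const BSU = `${ESC}[?2026h`; // Begin Synchronized Update (set DEC private mode 2026)
const ESU = `${ESC}[?2026l`; // End Synchronized Update (reset DEC private mode 2026)

// DECSTBM: restrict scrolling to rows top..bottom (1-based, inclusive).
function setScrollRegion(top: number, bottom: number): string {
  return `${ESC}[${top};${bottom}r`;
}

// SU: scroll the active region up by n lines.
function scrollUp(n: number): string {
  return `${ESC}[${n}S`;
}

// Scroll part of the screen, then reset the region to the full screen (CSI r).
function scrollRegionUp(top: number, bottom: number, n: number): string {
  return setScrollRegion(top, bottom) + scrollUp(n) + `${ESC}[r`;
}

// Wrap one frame so a supporting terminal paints it in a single step.
function synchronizedFrame(payload: string): string {
  return BSU + payload + ESU;
}

const frameBytes = synchronizedFrame(scrollRegionUp(2, 10, 3) + "new line");
```

Terminals that do not recognize mode 2026 ignore the set and reset sequences, so the wrapper degrades to an ordinary write rather than breaking output.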

8.3 Capture/Bubble Event Handling and Redux-Free State

The code includes a well-built DOM-level event dispatcher. Keyboard input gets DiscreteEventPriority (processed immediately, ahead of other work), while events such as window resize are assigned ContinuousEventPriority and are debounced and merged.
Instead of the heavyweight Redux, the DeepImmutable-based AppState holds about 50 core state fields and is injected through native React Context. Globally, only the onChangeAppState hook captures high-priority side effects, such as permission mode switches, and dispatches them to the relevant components in real time.
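
A minimal sketch of a Redux-free store with a single change hook, in the spirit of the description above. It is an illustration under assumptions: createStore, DemoState, and the permission mode values "default" and "plan" are hypothetical, and the real AppState is far larger.

```typescript
// Hypothetical sketch: createStore, DemoState, and the mode names are assumptions.
type ChangeHook<S> = (next: Readonly<S>, prev: Readonly<S>) => void;

function createStore<S extends object>(initial: S, onChange?: ChangeHook<S>) {
  let state: Readonly<S> = initial;
  const subscribers = new Set<() => void>();
  return {
    getState(): Readonly<S> {
      return state;
    },
    // Updates are pure functions from the old state to a new object.
    setState(update: (prev: Readonly<S>) => Readonly<S>): void {
      const prev = state;
      const next = update(prev);
      if (next === prev) return; // nothing changed, skip side effects
      state = next;
      if (onChange) onChange(next, prev); // the single global side-effect hook
      subscribers.forEach((notify) => notify());
    },
    subscribe(notify: () => void): () => void {
      subscribers.add(notify);
      return () => {
        subscribers.delete(notify);
      };
    },
  };
}

interface DemoState {
  permissionMode: "default" | "plan";
  taskCount: number;
}

const sideEffects: string[] = [];
const store = createStore<DemoState>({ permissionMode: "default", taskCount: 0 }, (next, prev) => {
  // React only to the high-priority transition, ignore everything else.
  if (next.permissionMode !== prev.permissionMode) {
    sideEffects.push(`${prev.permissionMode} -> ${next.permissionMode}`);
  }
});

store.setState((s) => ({ ...s, taskCount: 1 }));
store.setState((s) => ({ ...s, permissionMode: "plan" }));
```

Concentrating side effects in one hook that compares the previous and next state keeps individual setState call sites free of bookkeeping, at the cost of one place that must know about every important transition.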

IX. MCP Integration: Standardized Capability Plug-In Modules

Support for the Model Context Protocol (MCP), which is becoming an industry standard, is concentrated in src/services/mcp/.

Figure: MCP four-layer architecture

At the bottom layer, config.ts pulls server configurations from six different sources, such as enterprise policy and local project settings. When it encounters the server list synced from the Claude.ai cloud environment, dedupClaudeAiMcpServers handles de-duplication and gives priority to the local configuration.
Newly discovered external tools are re-wrapped when they are connected: the name is forcibly converted to the mcp__<serverName>__<toolName> format to prevent naming collisions, and the JSON Schema definition is mapped into the system's own Zod-based validation. The Agent also forces a refreshTools() call to pick up the latest capability changes from external servers.
The built-in McpAuthTool provides a dedicated path for servers that require additional authentication. When a model operation is blocked, the flow can pause, direct the user through the OAuth handshake in the browser, and then reissue the command with the returned credentials.
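
The mcp__<serverName>__<toolName> naming convention described above can be sketched as a pair of helpers. Only the name format comes from the article; the function names and the character normalization rule are assumptions made for illustration.

```typescript
// Hypothetical sketch: helper names and the normalization rule are assumptions.
const MCP_PREFIX = "mcp";
const SEPARATOR = "__";

// Replace anything outside a conservative character set so the name stays tool-safe.
function normalizeNamePart(name: string): string {
  return name.replace(/[^a-zA-Z0-9_-]/g, "_");
}

function buildMcpToolName(serverName: string, toolName: string): string {
  return [MCP_PREFIX, normalizeNamePart(serverName), normalizeNamePart(toolName)].join(SEPARATOR);
}

// Returns null for names that do not follow the mcp__server__tool shape.
function parseMcpToolName(fullName: string): { serverName: string; toolName: string } | null {
  const [prefix, serverName, ...rest] = fullName.split(SEPARATOR);
  if (prefix !== MCP_PREFIX || serverName === undefined || serverName === "" || rest.length === 0) {
    return null;
  }
  // Tool names may themselves contain the separator, so re-join the remainder.
  return { serverName, toolName: rest.join(SEPARATOR) };
}

const qualifiedName = buildMcpToolName("github", "create_issue");
const parsedName = parseMcpToolName(qualifiedName);
```

Prefixing with the server name means two servers can both expose a tool called search without overwriting each other. Note that in this sketch a server name containing a double underscore would make parsing ambiguous, which a real implementation would need to rule out.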

X. Architectural Lessons and Engineering Reflections

Dissecting this large codebase yields the following design patterns, which are valuable to any developer building an agentic system:

AsyncGenerator as the core cornerstone: asynchronous generators carry the entire execution chain, showing clear advantages over Promise chains for model streaming, back-pressure, safely cascading cancellation signals downstream, and composable invocation.

Strict fail-closed security defaults: a tool that does not declare a concurrency attribute is not allowed to run concurrently, and a tool that does not declare its read/write attribute is treated as a potentially harmful write. This kind of design, where gaps default to denial at the framework layer, is worth emulating in the foundations of any system.

Build-time isolation instead of if-else: feature() macros [7] remove unfinished functionality from the production build at compile time, avoiding potential runtime anomalies and leaks through decompilation.

Design everything around the Prompt Cache: from the careful ordering of the tool list, to the boundary slicing of the system prompt, to the unusual suffix padding used by Fork operations, large system architectures should treat cache hit rate as a “first-class citizen” rather than an afterthought.

Engineering countermeasures for LLM weaknesses: for example, history compression mandates that no original user message that is not tool output may be discarded, using engineering design to compensate for the LLM's tendency to drop small user requirements and forget them.
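
The fail-closed defaults in the list above can be sketched as follows. This is a minimal illustration under assumptions: the interface names, the optional isConcurrencySafe and isReadOnly fields, and the batching function are hypothetical and are not claimed to match the source.

```typescript
// Hypothetical sketch: interface and function names are assumptions.
interface ToolDefinition {
  name: string;
  isConcurrencySafe?: () => boolean; // optional: authors may forget to declare it
  isReadOnly?: () => boolean;        // optional: authors may forget to declare it
}

interface ResolvedTool {
  name: string;
  isConcurrencySafe: () => boolean;
  isReadOnly: () => boolean;
}

// Fill in the restrictive answer wherever the tool author declared nothing.
function resolveTool(def: ToolDefinition): ResolvedTool {
  return {
    name: def.name,
    isConcurrencySafe: def.isConcurrencySafe ?? (() => false), // undeclared: run alone
    isReadOnly: def.isReadOnly ?? (() => false),               // undeclared: treat as a write
  };
}

// Group consecutive concurrency-safe tools; every other tool gets its own batch.
function planBatches(tools: readonly ResolvedTool[]): string[][] {
  const batches: string[][] = [];
  let current: string[] = [];
  for (const tool of tools) {
    if (tool.isConcurrencySafe()) {
      current.push(tool.name);
      continue;
    }
    if (current.length > 0) {
      batches.push(current);
      current = [];
    }
    batches.push([tool.name]);
  }
  if (current.length > 0) batches.push(current);
  return batches;
}

const resolvedTools = [
  { name: "read", isConcurrencySafe: () => true, isReadOnly: () => true },
  { name: "grep", isConcurrencySafe: () => true, isReadOnly: () => true },
  { name: "thirdPartyTool" }, // declares nothing, so it is serialized and treated as a write
  { name: "glob", isConcurrencySafe: () => true, isReadOnly: () => true },
].map(resolveTool);

const batches = planBatches(resolvedTools);
```

The point of placing the defaults in one resolver is that a forgotten declaration can only make a tool slower or more heavily gated, never more permissive.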

Remaining architectural pitfalls: for all its rigor, bootstrap/state.ts still maintains a global object with more than 200 fields, topped by a “DO NOT ADD MORE STATE HERE” warning. For a complex system of 500,000 lines, the lack of fine-grained, dependency-injection-based state management is a red flag.
In addition, BashTool is well designed, but its 8-layer validation mechanism makes the code large and complex; extensibility would improve considerably if the validation logic were extracted into a declarative policy engine in the style of OPA (Open Policy Agent) and Rego.
At the data persistence level, tool results are still stored as flat strings. Restructuring them into a data layer with structured retrieval would give the long-term memory system and Context Collapse more precise folding entry points.
As Anthropic extracts the core modules into an open-source SDK distribution, future evolution will move toward decoupling and plugins.

Appendix: Core Engine and Control Surface Index

System module	Implementation files	Code volume
Agent Loop Engine Core	`src/QueryEngine.ts`, `src/query.ts`	3,024 lines
Tool Triage and Concurrency Scheduler	`src/services/tools/StreamingToolExecutor.ts`, `toolExecution.ts`	2,275 lines
Tool Abstraction and Assembly	`src/Tool.ts`, `src/tools.ts`	1,181 lines
Bash Sandbox Control Surface	`src/tools/BashTool/` directory (18 files)	Approx. 5,000 lines
Global Security Authorization Audit Center	`src/utils/permissions/permissions.ts`	Approx. 1,400 lines
Rules Masking and Assessment of Interceptor Networks	`src/utils/permissions/` directory (24 files)	Approx. 5,000 lines
Subagent Task Topology Network	`src/tools/AgentTool/` directory (20 files)	Approx. 6,000 lines
Swarm Inter-Team Connectivity and Mailboxes	`src/utils/swarm/` directory (22 files)	Approx. 5,000 lines
System Prompt Built-in Constants	`src/constants/prompts.ts`	914 lines
Global Information Compression and Folding Module	`src/services/compact/` directory (11 files)	3,960 lines
React-Ink Terminal Rendering Engine	`src/ink/` directory (90 files)	19,842 lines
MCP Protocol Connectivity and Conversion Layer	`src/services/mcp/client.ts`	3,348 lines
MCP Multi-Source Consolidation and De-duplication	`src/services/mcp/config.ts`	1,578 lines
Bootstrap Initialization State Buffer	`src/bootstrap/state.ts`	Approx. 800 lines
React App Immutable State Tree	`src/state/AppStateStore.ts`	Approx. 400 lines
