Learning Agent Orchestration from the Design of OpenClaw and Claude Code (2026)

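Agents have evolved very quickly over the past year, and new concepts keep appearing: MCP, Multi-Agent, Skill, RAG, knowledge graphs. In real development work, one fundamental question cannot be avoided: should a project use a single-agent or a multi-agent design, and if multi-agent, how should it orchestrate its subagents? Claude Code's source code happened to leak, and the breakout agent product OpenClaw is open source, so this article uses these two runnable codebases to examine how each one implements subagent orchestration.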

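OpenClaw has a built-in subagent mechanism whose core is two tools, sessions_spawn and subagents. Both are registered in OpenClaw's tool set and live under src/agents/tools. In other words, OpenClaw's multi-agent orchestration is not a hard-coded supervisor-worker pipeline. The top-level agent receives a set of tools and decides for itself whether to call sessions_spawn. Once it does, the current session temporarily acts as the orchestrator, and each spawned child session becomes either an orchestrator or a leaf depending on its depth.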

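An OpenClaw agent can take one of three roles:

  • main: the top-level session;
  • orchestrator: a middle layer that can still spawn further subagents;
  • leaf: a leaf node that cannot spawn anything below it.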
    export const SUBAGENT_SESSION_ROLES = ["main", "orchestrator", "leaf"] as const;
    export type SubagentSessionRole = (typeof SUBAGENT_SESSION_ROLES)[number];

    export const SUBAGENT_CONTROL_SCOPES = ["children", "none"] as const;
    export type SubagentControlScope = (typeof SUBAGENT_CONTROL_SCOPES)[number];

    type SessionCapabilityEntry = {
      sessionId?: unknown;            // allows looking up the entry by session id
      spawnDepth?: unknown;           // which spawn level this session is at
      subagentRole?: unknown;         // persisted role: main | orchestrator | leaf
      subagentControlScope?: unknown; // persisted control scope: children | none
    };
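How a spawn depth maps onto one of these roles is not shown in the excerpt. A minimal sketch, assuming a hypothetical `maxSpawnDepth` limit and two hypothetical helpers (`resolveRoleForDepth`, `resolveControlScope`), none of which are taken from the OpenClaw source, might look like this:

```typescript
// Sketch only: maxSpawnDepth and both helper names are assumptions,
// not identifiers from the OpenClaw source.
const ROLES = ["main", "orchestrator", "leaf"] as const;
type Role = (typeof ROLES)[number];
type ControlScope = "children" | "none";

function resolveRoleForDepth(spawnDepth: number, maxSpawnDepth: number): Role {
  // Depth 0 is the top-level session started by the user.
  if (spawnDepth <= 0) return "main";
  // Below the limit a child may still delegate; at the limit it is a leaf.
  return spawnDepth < maxSpawnDepth ? "orchestrator" : "leaf";
}

function resolveControlScope(role: Role): ControlScope {
  // A leaf cannot spawn, so it has no children to control.
  return role === "leaf" ? "none" : "children";
}
```

With a limit of 2, a session at depth 1 would be an orchestrator and a session at depth 2 would be a leaf.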

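sessions_spawn is the create/launch tool. It opens a new isolated session, which can be either an OpenClaw subagent or an ACP runtime session. Its parameters include task, agentId, runtime, thread, mode, model, and thinking, and it ultimately calls spawnSubagentDirect() or spawnAcpDirect(), so in essence it spawns a new child session. subagents is the tool for managing subagents that already exist. It does not create anything; it only performs list, kill, and steer on subagents already running under the current requester session.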
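The split between "create" and "manage" can be pictured with a small in-memory registry. This is only a sketch: `SubagentRegistry`, its fields, and the reduced parameter set are illustrative assumptions, not OpenClaw's actual data structures.

```typescript
// Sketch only: the registry and field names are illustrative assumptions.
type SpawnParams = { task: string; agentId?: string; model?: string };
type RunStatus = "running" | "killed";
type SubagentRun = {
  id: string;
  requester: string;
  task: string;
  status: RunStatus;
  inbox: string[]; // steering messages delivered to the child
};

type SubagentsAction =
  | { action: "list" }
  | { action: "kill"; id: string }
  | { action: "steer"; id: string; message: string };

class SubagentRegistry {
  private runs = new Map<string, SubagentRun>();
  private nextId = 1;

  // Creation path: the only place a new child session comes into existence.
  spawn(requester: string, params: SpawnParams): SubagentRun {
    const run: SubagentRun = {
      id: `sub-${this.nextId++}`,
      requester,
      task: params.task,
      status: "running",
      inbox: [],
    };
    this.runs.set(run.id, run);
    return run;
  }

  // Management path: never creates, only acts on the requester's own children.
  manage(requester: string, req: SubagentsAction): SubagentRun[] {
    const own = Array.from(this.runs.values()).filter((r) => r.requester === requester);
    if (req.action === "list") return own;
    const target = own.find((r) => r.id === req.id);
    if (!target) return [];
    if (req.action === "kill") target.status = "killed";
    else target.inbox.push(req.message);
    return [target];
  }
}
```

Keeping the two paths separate means a session that is only allowed to manage can never create new children by accident.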

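Whether a subagent gets started is decided by three layers together. Unless the user triggers it directly, the final call is the model's: it decides from the prompt and context whether to invoke sessions_spawn.

  1. Whether the user asks explicitly: you can run /subagents spawn ... directly. In that case the LLM is not deciding at all; the user triggers the spawn.
  2. Whether the tool is available: the current run must actually have the sessions_spawn tool. OpenClaw injects it into the tool set, and if policy disables the tool, nothing in the prompt can bring it back.
  3. Whether the prompt encourages it: the main system prompt does contain explicit guidance: If a task is more complex or takes longer, spawn a sub-agent.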
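The three layers can be summarized as a small decision function. This is a sketch under assumptions: `RunContext`, `SpawnTrigger`, and `resolveSpawnTrigger` are hypothetical names, and the real code may order or combine these checks differently.

```typescript
// Sketch only: these types and names are assumptions for illustration.
type SpawnTrigger = "user-command" | "model-may-choose" | "unavailable";

interface RunContext {
  userInput: string;   // raw text the user typed
  toolNames: string[]; // tools left after policy filtering
}

function resolveSpawnTrigger(ctx: RunContext): SpawnTrigger {
  // Layer 1: an explicit slash command bypasses the model's judgment.
  if (ctx.userInput.trim().startsWith("/subagents spawn")) return "user-command";
  // Layer 2: the tool must have survived policy filtering.
  if (!ctx.toolNames.includes("sessions_spawn")) return "unavailable";
  // Layer 3: the prompt only nudges; the model makes the final call.
  return "model-may-choose";
}
```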

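At src/agents/subagent-spawn.ts:512, OpenClaw calls buildSubagentSystemPrompt() to generate extraSystemPrompt, an additional system prompt written specifically for the subagent. This prompt spells out that you are a subagent, who your parent is, what your role is, how you should wrap up, whether you may spawn further subagents, that you must not busy-poll, and that your result will be auto-announced back. Further down, when the child session runs its own embedded agent, OpenClaw merges extraSystemPrompt into the full system prompt; in minimal mode it is placed in the Subagent Context section. Because a subagent is a new, independent session, what it receives by default is: the new session's own context, the task description passed down by the parent, the subagent-specific system prompt passed down by the parent, and the bootstrap and skill context of the same workspace/agent environment.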
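To make the shape of such an extra prompt concrete, here is a rough sketch. The function `buildSubagentContext` and its option names are hypothetical, and the wording is abbreviated; the real buildSubagentSystemPrompt() in OpenClaw may take different inputs and emit different text.

```typescript
// Sketch only: option names and wording are assumptions; the real
// buildSubagentSystemPrompt() in OpenClaw may differ.
interface SubagentPromptOptions {
  task: string;
  requesterSession: string;
  childSession: string;
  canSpawn: boolean;
}

function buildSubagentContext(opts: SubagentPromptOptions): string {
  // The spawning rule depends on whether this child is an orchestrator or a leaf.
  const spawning = opts.canSpawn
    ? "You CAN spawn your own sub-agents."
    : "You are a leaf worker and CANNOT spawn further sub-agents.";
  return [
    "# Subagent Context",
    `You were created to handle: ${opts.task}`,
    "Do not busy-poll; your result is auto-announced to the requester.",
    spawning,
    `Requester session: ${opts.requesterSession}`,
    `Your session: ${opts.childSession}`,
  ].join("\n");
}
```

The point of the design is that the child learns its identity and limits from text alone, so no separate agent class is needed for subagents.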

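Here is an example. Suppose we give OpenClaw this instruction: "Analyze the subagent-related implementation in openclaw and identify how the responsibilities of sessions_spawn and subagents differ; if needed, have a subagent read the relevant code in parallel and give me a summary."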

The main agent's prompt looks like this:

    You are a personal assistant running inside OpenClaw.

    Tooling
    Tool availability (filtered by policy):
    - read: ...
    - write: ...
    - edit: ...
    - apply_patch: ...
    - exec: ...
    - process: ...
    - sessions_list: ...
    - sessions_history: ...
    - sessions_send: ...
    - sessions_spawn: Spawn an isolated sub-agent session
    - subagents: List, steer, or kill sub-agent runs
    - session_status: ...
    - browser: ...
    - web_search: ...
    - ...
    TOOLS.md does not control tool availability; it is user guidance for how to use external tools.
    For long waits, avoid rapid poll loops ...
    If a task is more complex or takes longer, spawn a sub-agent. Completion is push-based: it will auto-announce when done.
    Do not poll `subagents list` / `sessions_list` in a loop ...

    Tool Call Style
    Default: do not narrate routine, low-risk tool calls ...
    When a first-class tool exists for an action, use the tool directly ...

    Safety
    ...

    OpenClaw CLI Quick Reference
    ...

    Skills
    [the skills prompt is inserted here]

    Workspace
    Your working directory is: /.../workspace
    Treat this directory as the single global workspace ...

    Current Date & Time
    Time zone: ...

    Workspace Files (injected)
    These user-editable files are loaded by OpenClaw ...

    Messaging
    - Reply in current session ...
    - Cross-session messaging → use sessions_send(sessionKey, message)
    - Sub-agent orchestration → use subagents(action=list|steer|kill)

    Project Context
    [the contents of AGENTS.md / SOUL.md / USER.md / other context files are injected here]

The subagent's prompt looks like this:

    You are a personal assistant running inside OpenClaw.

    Tooling
    Tool availability (filtered by policy):
    - read: ...
    - exec: ...
    - process: ...
    - sessions_spawn: ...
    - subagents: ...
    - session_status: ...
    - ...
    TOOLS.md does not control tool availability ...
    For long waits, avoid rapid poll loops ...
    If a task is more complex or takes longer, spawn a sub-agent. Completion is push-based: it will auto-announce when done.
    Do not poll `subagents list` / `sessions_list` in a loop ...

    Tool Call Style
    ...

    Safety
    ...

    Workspace
    Your working directory is: /.../workspace

    Current Date & Time
    Time zone: ...

    Workspace Files (injected)
    These user-editable files are loaded by OpenClaw ...

    Subagent Context
    # Subagent Context
    You are a subagent spawned by the main agent for a specific task.

    Your Role
    - You were created to handle: read `src/agents/tools/sessions-spawn-tool.ts`, `src/agents/tools/subagents-tool.ts`, and `src/agents/subagent-capabilities.ts`, then summarize how the three relate
    - Complete this task. That's your entire purpose.
    - You are NOT the main agent. Don't try to be.

    Rules
    1. Stay focused
    2. Complete the task
    3. Don't initiate
    4. Be ephemeral
    5. Trust push-based completion
    6. Recover from compacted/truncated tool output

    Output Format
    - What you accomplished or found
    - Any relevant details the main agent should know
    - Keep it concise but informative

    What You DON'T Do
    - NO user conversations
    - NO pretending to be the main agent
    - Only use the `message` tool when explicitly instructed
    ...

    Sub-Agent Spawning
    You CAN spawn your own sub-agents ...                        // present only if this session can still spawn
    or
    You are a leaf worker and CANNOT spawn further sub-agents.   // present for a leaf

    Session Context
    - Label: code-reader
    - Requester session: agent:main:main
    - Requester channel: discord
    - Your session: agent:main:subagent:abcd-efgh

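At this point it is clear that OpenClaw's agent orchestration is simple to the point of bluntness: the model itself calls a tool to start a subagent. Tool calls will therefore inevitably fail at times, and OpenClaw deals with tool-call failures through the three mechanisms below.

Before a tool is called, OpenClaw performs three steps:

  1. Loop detection: when a sessionKey is present, it lazily loads the runtime dependencies and runs detectToolCallLoop() against the current session's diagnostic state to determine whether the session is stuck in a tool-call loop.
  2. Running the plugin before_tool_call hook: if a before_tool_call hook is registered on the global hook runner, it receives the current toolName, parameters, runId, toolCallId, and session context. The hook can produce two effects: block, which intercepts the call outright, and params, which rewrites the parameters.
  3. Recording the parameters actually executed and the result: parameters rewritten by the hook are stored in an in-memory Map keyed by runId + toolCallId, and after the call succeeds or fails, recordLoopOutcome() writes the result back into the loop detection state.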
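The three steps can be sketched as one guard object. Everything here is an assumption made for illustration: the class name `ToolCallGuard`, the hook signature, and the loop rule itself (blocking after the same call has been recorded `maxRepeats` times in a row), which is a simple stand-in for whatever detectToolCallLoop() really checks.

```typescript
// Sketch only: the hook shape, the repeat threshold and all names are assumptions.
type Params = Record<string, unknown>;
type HookResult = { block?: string; params?: Params } | undefined;
type BeforeToolCallHook = (toolName: string, params: Params) => HookResult;
type Outcome = { blocked: true; reason: string } | { blocked: false; params: Params };

class ToolCallGuard {
  private recent: string[] = []; // signatures of calls already executed
  private hook?: BeforeToolCallHook;
  private maxRepeats: number;
  readonly adjustedParams = new Map<string, Params>(); // keyed by runId + toolCallId

  constructor(hook?: BeforeToolCallHook, maxRepeats = 3) {
    this.hook = hook;
    this.maxRepeats = maxRepeats;
  }

  before(runId: string, toolCallId: string, toolName: string, params: Params): Outcome {
    // 1. Loop detection: count how many times this exact call was just repeated.
    const signature = `${toolName}:${JSON.stringify(params)}`;
    let repeats = 0;
    for (let i = this.recent.length - 1; i >= 0 && this.recent[i] === signature; i--) repeats++;
    if (repeats >= this.maxRepeats) return { blocked: true, reason: "tool call loop detected" };

    // 2. Plugin hook: may block the call or rewrite its parameters.
    let effective = params;
    try {
      const result = this.hook?.(toolName, params);
      if (result?.block) return { blocked: true, reason: result.block };
      if (result?.params) effective = result.params;
    } catch {
      // A failing hook is not fatal: continue with the original parameters.
    }

    // 3. Remember the parameters that will really be executed.
    this.adjustedParams.set(`${runId}:${toolCallId}`, effective);
    return { blocked: false, params: effective };
  }

  // Called after execution so that later loop checks can see this call.
  recordOutcome(toolName: string, params: Params): void {
    this.recent.push(`${toolName}:${JSON.stringify(params)}`);
  }
}
```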

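The adapter file, src/agents/pi-tool-definition-adapter.ts, adapts OpenClaw's internal tools into the ToolDefinition that pi-coding-agent expects; its entry point is toToolDefinitions(). Many internal tools do not strictly return a standard AgentToolResult, so after execution the adapter calls normalizeToolExecutionResult(), which applies these rules:

  1. If the result already has content[], return it as a standard result;
  2. If it has no content[], force-wrap it as: content: [{ type: "text", text: ... }] details: ...
  3. If the tool returns only a string, a number, or a plain object, that too is coerced into a valid result.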
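A minimal sketch of these normalization rules, assuming a simplified result shape (`ToolResult`) in place of the real AgentToolResult and a hypothetical function name `normalizeResult`:

```typescript
// Sketch only: a simplified stand-in for the AgentToolResult shape.
type TextContent = { type: "text"; text: string };
type ToolResult = { content: TextContent[]; details?: unknown };

function normalizeResult(raw: unknown): ToolResult {
  // Rule 1: anything that already carries content[] passes through untouched.
  if (
    typeof raw === "object" &&
    raw !== null &&
    Array.isArray((raw as { content?: unknown }).content)
  ) {
    return raw as ToolResult;
  }
  // Rules 2 and 3: strings are used as-is, everything else is serialized to text.
  const text = typeof raw === "string" ? raw : JSON.stringify(raw) ?? String(raw);
  return { content: [{ type: "text", text }], details: raw };
}
```

Whatever a tool returns, the caller can then rely on `content[0].text` existing, which keeps the model-facing side of the loop uniform.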

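If tool.execute() throws, the adapter does the following:

  1. If the cause is signal.aborted or an AbortError, it rethrows directly, because this means the run was cancelled rather than that the tool failed in the ordinary sense;
  2. Otherwise, it extracts the error message/stack, writes a debug stack and an error log, and returns a structured result wrapped by jsonResult(...), of the form:
{ "status": "error", "tool": "xxx", "error": "..." }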
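The abort-versus-failure distinction can be sketched as follows. `isAbort` and `safeExecute` are hypothetical helper names, the signal is reduced to an object with an `aborted` flag, and execution is kept synchronous for brevity; logging is omitted.

```typescript
// Sketch only: isAbort and safeExecute are hypothetical helper names.
type ErrorPayload = { status: "error"; tool: string; error: string };

function isAbort(err: unknown, signal?: { aborted: boolean }): boolean {
  return signal?.aborted === true || (err instanceof Error && err.name === "AbortError");
}

function safeExecute<T>(
  toolName: string,
  execute: () => T,
  signal?: { aborted: boolean },
): T | ErrorPayload {
  try {
    return execute();
  } catch (err) {
    // Cancellation is not a tool failure: let it propagate and stop the run.
    if (isAbort(err, signal)) throw err;
    const message = err instanceof Error ? err.message : String(err);
    // Ordinary failures become data the model can read and react to.
    return { status: "error", tool: toolName, error: message };
  }
}
```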

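Put together, the failure fallback chain is:

  1. The tool is first wrapped by wrapToolWithBeforeToolCallHook(), see src/agents/pi-tools.ts:609;
  2. The before_tool_call stage runs loop detection and the plugin hook first, see src/agents/pi-tools.before-tool-call.ts:89;
  3. If only the hook itself fails, a warning is logged and the call continues with the original parameters, see src/agents/pi-tools.before-tool-call.ts:186;
  4. When the tool actually executes, the adapter normalizes the return value into an AgentToolResult, see src/agents/pi-tool-definition-adapter.ts:78;
  5. If execution throws and the cause is not an abort, a structured result with status: "error" is returned, see src/agents/pi-tool-definition-adapter.ts:185;
  6. The LLM receives this error result and then decides whether to retry, switch tools, or degrade gracefully.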

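Claude Code's core is a single-agent loop. An ordinary subagent is created only when the main agent decides it needs to delegate and emits a tool_use for AgentTool. The flow is: the model outputs a tool_use(name=AgentTool, ...); query() hands the tool_use to runTools() for execution; AgentTool.call() parses the arguments and decides whether the target is a teammate, an ordinary subagent, or a fork subagent. An ordinary subagent eventually enters runAgent(), which creates a subagent context and then calls query() again, so in essence the subagent runs the same query loop. If FORK_SUBAGENT is enabled and AgentTool is called without subagent_type, it takes the fork path and produces a forked subagent that inherits the parent's context. AgentTool lives in the tools/AgentTool directory.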
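The idea that a subagent is just the same loop run again can be shown with a toy version. This is a sketch, not Claude Code's implementation: a scripted function stands in for the LLM, messages are plain strings, and every name except the tool name "AgentTool" is made up for illustration.

```typescript
// Sketch only: a scripted fake model stands in for the LLM, and the types
// here are illustrative rather than taken from the Claude Code source.
type ToolUse = { type: "tool_use"; name: string; input: { prompt: string } };
type Final = { type: "final"; text: string };
type ModelTurn = ToolUse | Final;
type Model = (messages: string[]) => ModelTurn;

function query(model: Model, messages: string[], maxTurns = 10): string {
  const history = [...messages];
  for (let turn = 0; turn < maxTurns; turn++) {
    const output = model(history);
    if (output.type === "final") return output.text;
    // Delegation is an ordinary tool call: the child runs this same loop
    // with a fresh message list, and only its final text comes back.
    const result =
      output.name === "AgentTool"
        ? query(model, [output.input.prompt], maxTurns)
        : `unknown tool: ${output.name}`;
    history.push(`tool_result: ${result}`);
  }
  throw new Error("turn limit reached");
}
```

Because the child starts from its own message list, the parent's history grows by a single tool_result no matter how many turns the child needed.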

Claude Code has three kinds of subagent:

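  1. Ordinary subagent: the system prompt comes from agentDefinition.getSystemPrompt() and is then passed through enhanceSystemPromptWithEnvDetails(), which adds environment details, the absolute-path requirement, the no-emoji rule, and similar instructions. The initial user message it receives is simply the prompt string that AgentTool passes in:
 promptMessages = [createUserMessage({ content: prompt })];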
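  2. Built-in subagent: each one defines its own system prompt independently. For example, the Explore agent, defined in tools/AgentTool/built-in/exploreAgent.ts, is strictly read-only and search-oriented. Its prompt is: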
    import { BASH_TOOL_NAME } from 'src/tools/BashTool/toolName.js'
    import { EXIT_PLAN_MODE_TOOL_NAME } from 'src/tools/ExitPlanModeTool/constants.js'
    import { FILE_EDIT_TOOL_NAME } from 'src/tools/FileEditTool/constants.js'
    import { FILE_READ_TOOL_NAME } from 'src/tools/FileReadTool/prompt.js'
    import { FILE_WRITE_TOOL_NAME } from 'src/tools/FileWriteTool/prompt.js'
    import { GLOB_TOOL_NAME } from 'src/tools/GlobTool/prompt.js'
    import { GREP_TOOL_NAME } from 'src/tools/GrepTool/prompt.js'
    import { NOTEBOOK_EDIT_TOOL_NAME } from 'src/tools/NotebookEditTool/constants.js'
    import { hasEmbeddedSearchTools } from 'src/utils/embeddedTools.js'
    import { AGENT_TOOL_NAME } from '../constants.js'
    import type { BuiltInAgentDefinition } from '../loadAgentsDir.js'

    function getExploreSystemPrompt(): string {
      // Ant-native builds alias find/grep to embedded bfs/ugrep and remove the
      // dedicated Glob/Grep tools, so point at find/grep via Bash instead.
      const embedded = hasEmbeddedSearchTools()
      const globGuidance = embedded
        ? `- Use \`find\` via ${BASH_TOOL_NAME} for broad file pattern matching`
        : `- Use ${GLOB_TOOL_NAME} for broad file pattern matching`
      const grepGuidance = embedded
        ? `- Use \`grep\` via ${BASH_TOOL_NAME} for searching file contents with regex`
        : `- Use ${GREP_TOOL_NAME} for searching file contents with regex`

      return `You are a file search specialist for Claude Code, Anthropic's official CLI for Claude. You excel at thoroughly navigating and exploring codebases.

    === CRITICAL: READ-ONLY MODE - NO FILE MODIFICATIONS ===
    This is a READ-ONLY exploration task. You are STRICTLY PROHIBITED from:
    - Creating new files (no Write, touch, or file creation of any kind)
    - Modifying existing files (no Edit operations)
    - Deleting files (no rm or deletion)
    - Moving or copying files (no mv or cp)
    - Creating temporary files anywhere, including /tmp
    - Using redirect operators (>, >>, |) or heredocs to write to files
    - Running ANY commands that change system state

    Your role is EXCLUSIVELY to search and analyze existing code. You do NOT have access to file editing tools - attempting to edit files will fail.

    Your strengths:
    - Rapidly finding files using glob patterns
    - Searching code and text with powerful regex patterns
    - Reading and analyzing file contents

    Guidelines:
    ${globGuidance}
    ${grepGuidance}
    - Use ${FILE_READ_TOOL_NAME} when you know the specific file path you need to read
    - Use ${BASH_TOOL_NAME} ONLY for read-only operations (ls, git status, git log, git diff, find${embedded ? ', grep' : ''}, cat, head, tail)
    - NEVER use ${BASH_TOOL_NAME} for: mkdir, touch, rm, cp, mv, git add, git commit, npm install, pip install, or any file creation/modification
    - Adapt your search approach based on the thoroughness level specified by the caller
    - Communicate your final report directly as a regular message - do NOT attempt to create files

    NOTE: You are meant to be a fast agent that returns output as quickly as possible. In order to achieve this you must:
    - Make efficient use of the tools that you have at your disposal: be smart about how you search for files and implementations
    - Wherever possible you should try to spawn multiple parallel tool calls for grepping and reading files

    Complete the user's search request efficiently and report your findings clearly.`
    }

    export const EXPLORE_AGENT_MIN_QUERIES = 3

    const EXPLORE_WHEN_TO_USE =
      'Fast agent specialized for exploring codebases. Use this when you need to quickly find files by patterns (eg. "src/components/**/*.tsx"), search code for keywords (eg. "API endpoints"), or answer questions about the codebase (eg. "how do API endpoints work?"). When calling this agent, specify the desired thoroughness level: "quick" for basic searches, "medium" for moderate exploration, or "very thorough" for comprehensive analysis across multiple locations and naming conventions.'

    export const EXPLORE_AGENT: BuiltInAgentDefinition =

The flow for invoking this built-in subagent is:

    Main agent
      ↓ decides to delegate
    Calls AgentTool
      ↓
    Sets subagent_type = "Explore"
      ↓
    Loads the EXPLORE_AGENT definition
      ↓
    Calls getSystemPrompt()
      ↓
    Obtains the prompt
      ↓
    Starts the subagent (Explore agent)

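  3. Fork subagent: it does not generate its own agent system prompt. Instead it directly inherits the parent agent's already-rendered system prompt, then assembles the parent assistant message, placeholder tool_results, and the current directive into the fork's input messages.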
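How a fork's input might be assembled can be sketched as below. The message shapes are simplified, and `buildForkInput`, the `ForkInput` type, and the placeholder text are all assumptions for illustration; the real structures in Claude Code are richer.

```typescript
// Sketch only: message shapes, names and the placeholder text are assumptions.
type ToolUseBlock = { type: "tool_use"; id: string; name: string };
type UserBlock =
  | { type: "tool_result"; tool_use_id: string; content: string }
  | { type: "text"; text: string };
type Message =
  | { role: "assistant"; content: ToolUseBlock[] }
  | { role: "user"; content: UserBlock[] };

interface ForkInput {
  systemPrompt: string;
  messages: Message[];
}

function buildForkInput(
  parentSystemPrompt: string,
  parentAssistant: ToolUseBlock[],
  directive: string,
): ForkInput {
  // Every pending tool_use gets a placeholder result so each call has an answer.
  const userContent: UserBlock[] = parentAssistant.map((block): UserBlock => ({
    type: "tool_result",
    tool_use_id: block.id,
    content: "(placeholder)",
  }));
  // The directive tells the fork what to do with the inherited context.
  userContent.push({ type: "text", text: directive });
  return {
    systemPrompt: parentSystemPrompt, // inherited verbatim, not regenerated
    messages: [
      { role: "assistant", content: parentAssistant },
      { role: "user", content: userContent },
    ],
  };
}
```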

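Each tool is written with its own input validation logic. If validation fails, the error is logged, an error event is reported, and the error is wrapped in a tool_result and returned to the model.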

    // Validate input values. Each tool has its own validation logic
    const isValidCall = await tool.validateInput?.(
      parsedInput.data,
      toolUseContext,
    )
    if (isValidCall?.result === false) {
      logForDebugging(
        `${tool.name} tool validation error: ${isValidCall.message?.slice(0, 200)}`,
      )
      logEvent('tengu_tool_use_error', {
        messageID:
          messageId as AnalyticsMetadata_I_VERIFIED_THIS_IS_NOT_CODE_OR_FILEPATHS,
        toolName: sanitizeToolNameForAnalytics(tool.name),
        error:
          isValidCall.message as AnalyticsMetadata_I_VERIFIED_THIS_IS_NOT_CODE_OR_FILEPATHS,
        errorCode: isValidCall.errorCode,
        isMcp: tool.isMcp ?? false,
        queryChainId: toolUseContext.queryTracking
          ?.chainId as AnalyticsMetadata_I_VERIFIED_THIS_IS_NOT_CODE_OR_FILEPATHS,
        queryDepth: toolUseContext.queryTracking?.depth,
        ...(mcpServerType && {
          mcpServerType:
            mcpServerType as AnalyticsMetadata_I_VERIFIED_THIS_IS_NOT_CODE_OR_FILEPATHS,
        }),
        ...(mcpServerBaseUrl && {
          mcpServerBaseUrl:
            mcpServerBaseUrl as AnalyticsMetadata_I_VERIFIED_THIS_IS_NOT_CODE_OR_FILEPATHS,
        }),
        ...(requestId && {
          requestId:
            requestId as AnalyticsMetadata_I_VERIFIED_THIS_IS_NOT_CODE_OR_FILEPATHS,
        }),
        ...mcpToolDetailsForAnalytics(tool.name, mcpServerType, mcpServerBaseUrl),
      })
      return [
        {
          message: createUserMessage({
            content: [
              {
                type: 'tool_result',
                content: `
      ${isValidCall.message}
    `,
                is_error: true,
                tool_use_id: toolUseID,
              },
            ],
            toolUseResult: `Error: ${isValidCall.message}`,
            sourceToolAssistantUUID: assistantMessage.uuid,
          }),
        },
      ]
    }

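When tool.call() itself throws during execution, the error is formatted uniformly and the failure hooks run. A runtime exception does not kill the main loop directly. Instead, Claude Code formats the error content, special-cases MCP auth failures by switching the client state to needs-auth, runs the PostToolUseFailure hooks, and still returns a tool_result to the model, so the model can see why the call failed and decide on its next step.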
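The "fail into a tool_result instead of crashing the loop" pattern can be sketched like this. The names `runToolSafely` and `FailureHook` are hypothetical, the call is synchronous for brevity, the MCP auth branch is left out, and swallowing errors thrown by a hook is an assumption of this sketch rather than documented behavior.

```typescript
// Sketch only: hook and message shapes are simplified assumptions.
type ToolResultBlock = {
  type: "tool_result";
  tool_use_id: string;
  content: string;
  is_error: boolean;
};
type FailureHook = (toolName: string, error: string) => void;

function runToolSafely(
  toolName: string,
  toolUseId: string,
  call: () => string,
  failureHooks: FailureHook[] = [],
): ToolResultBlock {
  try {
    return { type: "tool_result", tool_use_id: toolUseId, content: call(), is_error: false };
  } catch (err) {
    const message = err instanceof Error ? err.message : String(err);
    // Hooks observe the failure; in this sketch a broken hook is ignored
    // so that it cannot mask the original error.
    for (const hook of failureHooks) {
      try {
        hook(toolName, message);
      } catch {
        /* ignore hook failures */
      }
    }
    // The loop keeps going: the model receives the error as an ordinary tool_result.
    return {
      type: "tool_result",
      tool_use_id: toolUseId,
      content: `Error: ${message}`,
      is_error: true,
    };
  }
}
```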

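Lines 995-1185 of the code cover error handling: they record error logs and analytics events, route image errors through dedicated handling, fill in any missing tool_result, and surface the real error to the user.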

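As we can see, neither OpenClaw nor Claude Code uses a centralized multi-agent architecture; both treat the subagent as a tool, an idea well worth learning from. Many open-source projects today still use a Router / Planner / Specialist Agent architecture. Anyone with LangGraph development experience will recognize the trade-off: the advantage is strong controllability, but the drawback is just as clear. Context becomes very hard to manage, because each subagent's context and the orchestrator's context are difficult to coordinate. My takeaway from dissecting these two projects is that a subagent can exist as a skill or a prompt and be loaded only when needed. Controllability of course remains the most critical issue in putting agents into production, so to achieve it, the prompt can specify where a subagent should be started.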

If you see things differently, feel free to share and add your thoughts. Let's learn together!

If you repost this article, please keep the source: https://51itzy.com/kjqy/252967.html