
12-Factor Agents 7. Contact Humans with Tool Calls

2025-07-22


By default, large language model (LLM) APIs rely on a fundamental, high-stakes token choice: return plain-text content, or return structured data?


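You are putting a lot of weight on the choice of that first token. In the "the weather in tokyo" case, it is

"the"

but in the fetch_weather case, it is a special token that marks the start of a JSON object.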

|JSON

You might get better results by having the LLM always output JSON from start to finish, and then declare its intent with natural-language tokens such as request_human_input or done_for_now (as opposed to a "proper" tool like check_weather_in_city).

Again, this might not give you any performance boost, but you should experiment, and make sure you are free to try unconventional approaches to get the best results.
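As a minimal sketch of the "always output JSON, then declare intent" idea, the snippet below parses a raw model response and checks whether the declared intent is a human-contact intent. The NextStep class, the parse_next_step and is_human_contact helpers, and the HUMAN_INTENTS set are hypothetical names introduced for illustration; only the intent names themselves come from the text above.

```python
import json
from dataclasses import dataclass
from typing import Any, Dict

# Assumption: these two intents mean "talk to a human"; any other intent is a regular tool.
HUMAN_INTENTS = {"request_human_input", "done_for_now"}


@dataclass
class NextStep:
    intent: str
    payload: Dict[str, Any]


def parse_next_step(raw: str) -> NextStep:
    """Parse model output that is expected to always be a JSON object."""
    data = json.loads(raw)
    if not isinstance(data, dict) or "intent" not in data:
        raise ValueError("model output must be a JSON object with an 'intent' field")
    intent = data.pop("intent")
    return NextStep(intent=intent, payload=data)


def is_human_contact(step: NextStep) -> bool:
    """Return True when the step addresses a human rather than calling a tool."""
    return step.intent in HUMAN_INTENTS
```

Because every response goes through the same JSON path, the agent loop only has to branch on the intent field instead of guessing whether the model meant plain text or a tool call.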

from typing import List, Literal

class Options:
    urgency: Literal["low", "medium", "high"]
    format: Literal["free_text", "yes_no", "multiple_choice"]
    choices: List[str]

# Tool definition for human interaction
class RequestHumanInput:
    intent: Literal["request_human_input"]
    question: str
    context: str
    options: Options

# Example usage in the agent loop (a fragment that runs inside an async agent function)
if next_step.intent == 'request_human_input':
    thread.events.append({
        'type': 'human_input_requested',
        'data': next_step
    })
    thread_id = await save_state(thread)
    await notify_human(next_step, thread_id)
    return  # Break the loop and wait for a response to come back with the thread ID
else:
    pass  # ... other cases

Later, you might receive a webhook from a system that handles Slack, email, SMS, or other events.

@app.post('/webhook')
async def webhook(req: Request):
    thread_id = req.body.threadId
    thread = await load_state(thread_id)
    thread.events.append({
        'type': 'response_from_human',
        'data': req.body
    })
    # ... simplified for brevity; you likely don't want to block the web worker here
    next_step = await determine_next_step(thread_to_prompt(thread))
    thread.events.append(next_step)
    result = await handle_next_step(thread, next_step)
    # todo - loop, break, or whatever you want
    return {"status": "ok"}

The above includes patterns from factor 5 - unify execution state and business state, factor 8 - own your control flow, factor 3 - own your context window, and factor 4 - tools are just structured outputs, as well as several other factors.

If we were using the XML-like format from factor 3 - own your context window, our context window after a few turns might look like this:

(snipped for brevity)
<slack_message>
From: @alex
Channel: #deployments
Text: Can you deploy backend v1.2.3 to production?
Thread: []
</slack_message>
<request_human_input>
intent: "request_human_input"
question: "Would you like to proceed with deploying v1.2.3 to production?"
context: "This is a production deployment that will affect live users."
options: {
  urgency: "high"
  format: "yes_no"
}
</request_human_input>
<human_response>
response: "yes please proceed"
approved: true
timestamp: "2024-03-15T10:30:00Z"
user: "alex@company.com"
</human_response>
<deploy_backend>
intent: "deploy_backend"
tag: "v1.2.3"
environment: "production"
</deploy_backend>
<deploy_backend_result>
status: "success"
message: "Deployment v1.2.3 to production completed successfully."
timestamp: "2024-03-15T10:30:00Z"
</deploy_backend_result>
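One possible way to produce a context window in this shape is sketched below. It assumes every event is a dict with a type and a data mapping, which is an assumption made for this illustration, and it renders values with json.dumps, so nested options come out as one-line JSON rather than the multi-line layout shown above. The event_to_prompt helper is a hypothetical name, and this thread_to_prompt takes a plain list of events.

```python
import json
from typing import Any, Dict, List

Event = Dict[str, Any]


def event_to_prompt(event: Event) -> str:
    """Render one event as an XML-like block with one 'key: value' line per field."""
    tag = event["type"]
    lines = [f"<{tag}>"]
    for key, value in event["data"].items():
        # json.dumps quotes strings and lowercases booleans (True -> true).
        lines.append(f"{key}: {json.dumps(value)}")
    lines.append(f"</{tag}>")
    return "\n".join(lines)


def thread_to_prompt(events: List[Event]) -> str:
    """Join all rendered events into a single context-window string."""
    return "\n".join(event_to_prompt(e) for e in events)
```

Keeping the serializer this small makes it easy to change the format later without touching the code that records events.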

Benefits:

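1. Clear instructions: Separate tools for different types of human contact allow for more specificity from the LLM.
2. Inner vs. outer loop: Enables agent workflows outside the traditional ChatGPT-style interface, where the control flow and context initialization may be Agent->Human rather than Human->Agent (for example, agents kicked off by a cron job or an event).
3. Multiple human access: Input from different humans can be easily tracked and coordinated through structured events.
4. Multi-agent: The simple abstraction can be easily extended to support Agent->Agent requests and responses.
5. Durable: Combined with factor 6 - launch/pause/resume with simple APIs, this makes for durable, reliable, and introspectable multiplayer workflows.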