smolagents 项目说明书

Doramagic 项目包 · 项目说明书

smolagents 项目

🤗 smolagents：一个极简的智能体（agent）库，让智能体以代码形式进行思考。

Overview & Core Architecture

smolagents 是由 Hugging Face 维护的轻量级 LLM 智能体框架。它以"最小化抽象、可执行的 Python 智能体"为核心设计目标，提供两类范式：CodeAgent（在隔离的 Python 解释器中执行生成的代码）与 ToolCallingAgent（通过结构化 JSON 工具调用驱动）。两类智能体共享同一个 MultiStepAgent 父类，统一管...

章节 相关页面

继续阅读本节完整说明和来源证据。

章节 2.1 智能体层级

继续阅读本节完整说明和来源证据。

章节 2.2 提示与规划

继续阅读本节完整说明和来源证据。

章节 2.3 工具与函数式 schema

继续阅读本节完整说明和来源证据。

一、项目目的与定位

smolagents 是由 Hugging Face 维护的轻量级 LLM 智能体框架。它以"最小化抽象、可执行的 Python 智能体"为核心设计目标，提供两类范式：CodeAgent（在隔离的 Python 解释器中执行生成的代码）与 ToolCallingAgent（通过结构化 JSON 工具调用驱动）。两类智能体共享同一个 MultiStepAgent 父类，统一管理多轮循环、规划、记忆与回调资料来源：src/smolagents/agents.py:73-120。

最新稳定版本为 v1.26.0，开发节奏活跃，社区反馈集中在内存治理（issues #694、#901、#1216、#2129）、异步化（#145）、人机协同（#52）以及更多推理后端支持（#449、LlamaCppModel）等方向。

二、核心架构总览

graph TD
    U[用户任务] --> MA[MultiStepAgent]
    MA -->|code 范式| CA[CodeAgent]
    MA -->|tool 范式| TC[ToolCallingAgent]
    CA --> PE[LocalPythonExecutor<br/>或远程 Executor]
    TC --> TR[Tool Registry]
    MA --> MEM[AgentMemory<br/>步骤序列]
    MA --> M[Model 抽象层]
    PE -.安全沙箱.-> EXT[E2B / Docker / Blaxel / Modal]

整体采用"智能体调度 + 模型抽象 + 工具/代码执行器 + 记忆"的分层设计：

层级	主要模块	职责
智能体调度	`agents.py:MultiStepAgent`	多步循环、规划、最终答案判定
模型抽象	`models.py:Model` / `MODEL_REGISTRY`	统一 `generate()` 接口，屏蔽 provider 差异
工具系统	`tools.py` / `default_tools.py`	`BaseTool`、`Tool`、`PipelineTool`、`FinalAnswerTool`
执行器	`local_python_executor.py` / `remote_executors.py`	安全执行 Python 代码或转发到远程沙箱
记忆	`memory.py:AgentMemory`	以 `MemoryStep` 列表保存系统提示、任务、动作、规划、最终答案
提示	`prompts/*.yaml`	Jinja2 模板，按范式注入工具/计划/团队成员

2.1 智能体层级

MultiStepAgent 是所有智能体的基类，提供 run()、step()、_run() 等核心方法，并维护 step_callbacks 与 final_answer_checks。CodeAgent 与 ToolCallingAgent 各自覆盖 _step() 实现：前者解析代码块并送入执行器，后者解析 JSON Action。ManagedAgent 则允许将一个智能体作为"团队成员"被另一个智能体调用，对应提示模板 managed_agent 与 report 字段资料来源：src/smolagents/agents.py:150-260。

2.2 提示与规划

三种提示模板分别服务于三种使用场景：code_agent.yaml 用于经典 ReAct + 代码块范式；structured_code_agent.yaml 在每步前额外要求模型输出"事实清单 + 步骤计划"，使推理更稳健；toolcalling_agent.yaml 则面向 JSON 工具调用风格的模型，示例中使用 python_interpreter、web_search、image_generator 等虚构工具演示动作/观察循环资料来源：src/smolagents/prompts/structured_code_agent.yaml:1-60、资料来源：src/smolagents/prompts/toolcalling_agent.yaml:1-50。

规划（PlanningStep）由独立的"planner 模型"周期性触发：模型先汇总已知/未知事实，再生成高层步骤，最后以 <end_plan> 结束；规划步骤会被注入后续 prompt，保证长任务不偏离目标。

2.3 工具与函数式 schema

tools.py:BaseTool 定义了 name、description、inputs、output_type 与 forward()。default_tools.py 内置了若干开箱即用工具：

工具	用途	关键依赖
`VisitWebpageTool`	抓取网页并转 Markdown，按 `max_output_length` 截断	`requests`、`markdownify`
`WikipediaSearchTool`	检索 Wikipedia 摘要或全文，需自定义 `user_agent`	`wikipedia-api`
`SpeechToTextTool`	基于 Whisper 的音频转写	`transformers`
`FinalAnswerTool`	标记终止并返回结果	无外部依赖

对于直接由函数生成工具描述的场景，_function_type_hints_utils.py:get_json_schema() 会解析 Google 风格 docstring（Args: / Returns:），自动生成符合 OpenAI/HF chat template 的 JSON Schema，并支持 (choices: ...) 形式的枚举提示资料来源：src/smolagents/_function_type_hints_utils.py:60-140。

2.4 模型抽象层

models.py 通过 MODEL_REGISTRY 注册多种后端（InferenceClientModel、LiteLLMModel、OpenAIServerModel、TransformersModel、MLXModel 等），并暴露统一的 generate()、generate_stream()、agenerate() 异步接口。ChatMessage、ChatMessageToolCall、ChatMessageStreamDelta 等数据结构屏蔽不同 provider 的消息格式差异，方便上层智能体复用资料来源：src/smolagents/models.py:1-80。

2.5 记忆系统

memory.py:AgentMemory 以有序列表保存 MemoryStep 子类：TaskStep、SystemPromptStep、ActionStep、PlanningStep、FinalAnswerStep。每一步记录模型输出、工具调用、观察、耗时（Timing）和 token 使用（TokenUsage），供后续 prompt 注入与可视化（monitoring.Monitor）使用。

然而，社区指出当前记忆仅驻留内存，且随交互线性增长，缺乏内置摘要/压缩策略；当上下文超出窗口时只能依赖外部截断资料来源：GitHub Issue #694、资料来源：GitHub Issue #901。相关讨论还涉及保存/加载（#1216）、多轮 chat 历史注入（#1579）、OWASP 内存投毒防护（#2332）以及合并事件的可观测性 hook（#2129）。

2.6 代码执行与安全

LocalPythonExecutor 提供受限的 Python 子集，支持 BASE_BUILTIN_MODULES 内置模块与可配置的 authorized_imports。当需要更强隔离时，可通过 executor 参数切换到 E2BExecutor、DockerExecutor、BlaxelExecutor 或 ModalExecutor；社区也已请求基于 libkrun 的 microsandbox 集成（issue #2368，已关闭）。代码与 JSON 解析失败时抛出 AgentParsingError、AgentExecutionError 等专用异常，便于上层捕获与重试资料来源：src/smolagents/agents.py:60-72、资料来源：GitHub Issue #2368。

三、典型执行流程

以 CodeAgent.run() 为例：① 组装 SystemPromptStep 与 TaskStep 写入 AgentMemory；② 渲染 code_agent.yaml，调用模型；③ 解析代码块，送入 PythonExecutor；④ 将 ActionStep（含观察、错误、耗时）回写记忆；⑤ 在达到 max_steps 之前循环，必要时触发 PlanningStep；⑥ 调用 FinalAnswerTool 终止循环并返回结果。ToolCallingAgent 的差异仅在第 ②、③ 步：解析 JSON Action 后直接调用 Tool.forward() 而非解释器。

四、社区与演进方向

围绕"Overview & Core Architecture"，社区讨论最密集的话题包括：

异步化（#145）：当前 run() 为同步调用，但底层模型与 LiteLLMModel 已暴露 agenerate，存在统一异步 API 的空间。
人机协同（#52）：可通过 step_callbacks 介入，或在自定义工具中暂停等待输入。
记忆治理（#694、#901、#2129）：摘要压缩、可插拔后端（mem0 等）与合并事件 hook 是重点诉求。
后端扩展（#449、#1848）：对 LlamaCpp、Anthropic Skills 等本地/外部能力有持续需求。

Agent Loop, Planning & Multi-Agent Hierarchies

在 smolagents 库中，MultiStepAgent 是所有智能体（CodeAgent、ToolCallingAgent 等）的基类，其核心由三部分组成：Agent Loop（执行循环）、Planning（规划机制）与 Multi-Agent Hierarchies（多代理层级）。它们共同决定了智能体如何理解任务、如何规划步骤、如何调度子代理，以及如何在记忆系统中...

章节 相关页面

继续阅读本节完整说明和来源证据。

章节 1.1. Facts given in the task

继续阅读本节完整说明和来源证据。

章节 1.2. Facts to look up

继续阅读本节完整说明和来源证据。

章节 1.3. Facts to derive

继续阅读本节完整说明和来源证据。

Agent Loop、Planning 与多代理层级

一、概览：智能体执行的核心三要素

在 smolagents 库中，MultiStepAgent 是所有智能体（CodeAgent、ToolCallingAgent 等）的基类，其核心由三部分组成：Agent Loop（执行循环）、Planning（规划机制） 与 Multi-Agent Hierarchies（多代理层级）。它们共同决定了智能体如何理解任务、如何规划步骤、如何调度子代理，以及如何在记忆系统中累积上下文。

智能体的运行入口为 agent.run(task, ...)，其内部反复调用 _step() 方法，直到产生 FinalAnswerStep 或达到 max_steps 上限（参见 src/smolagents/agents.py 中的 MultiStepAgent 类）。社区中关于"如何在多轮对话中保存/加载记忆"的诉求（如 issue #1216、#1579），正是因为记忆是 Agent Loop 与 Planning 协作的载体。

二、Agent Loop：步进式状态机

MultiStepAgent.run() 实际上是一个 步进循环（step loop），每一次迭代都会向 AgentMemory 写入一个 MemoryStep 子类对象。记忆模块定义了多种步骤类型（参见 src/smolagents/memory.py）：

步骤类型	作用
`TaskStep`	记录用户原始任务，作为循环起点
`SystemPromptStep`	注入系统提示与工具说明
`PlanningStep`	阶段性生成或更新高层计划
`ActionStep`	模型输出 + 代码/工具执行结果
`FinalAnswerStep`	终止步骤，包含最终答复

下图展示了智能体单次执行循环的核心数据流：

flowchart TD
    A[TaskStep: 用户任务] --> B[SystemPromptStep: 注入系统提示]
    B --> C{是否达到 planning_interval?}
    C -- 是 --> D[PlanningStep: 生成/更新计划]
    C -- 否 --> E[ActionStep: 模型生成 + 执行]
    D --> E
    E --> F{返回 final_answer?}
    F -- 是 --> G[FinalAnswerStep: 终止]
    F -- 否 --> H{达到 max_steps?}
    H -- 否 --> C
    H -- 是 --> I[AgentMaxStepsError]

执行循环支持通过 step_callbacks 字典注入回调，例如 examples/plan_customization/README.md 中演示的 {PlanningStep: interrupt_after_plan} 即在每次 PlanningStep 后暂停以进行人工审阅，这是社区关注的"Human-in-the-Loop"模式（issue #52）。

资料来源：src/smolagents/agents.py

三、Planning：周期性高层计划与更新

Planning 机制由 planning_interval 参数控制。当循环步数达到该整数倍时，智能体将基于当前历史生成一份"事实调查 + 高层计划"的结构化输出，并以 PlanningStep 写入记忆。

初始计划模板 来自 src/smolagents/prompts/code_agent.yaml：

planning:
  initial_plan: |-
    ## 1. Facts survey
    ### 1.1. Facts given in the task
    ### 1.2. Facts to look up
    ### 1.3. Facts to derive
    ## 2. Plan
    ...
    <end_plan>

计划更新模板（update_plan_pre_messages 与 update_plan_post_messages）则要求模型回顾此前历史、修正事实并提出新计划。ToolCallingAgent 同样具备规划能力，但其工具清单以文本列表形式呈现（参见 src/smolagents/prompts/toolcalling_agent.yaml）；StructuredCodeAgent 则进一步要求结构化输出（参见 src/smolagents/prompts/structured_code_agent.yaml）。

规划步骤常通过 step_callbacks={PlanningStep: ...} 被拦截，用于实现"计划定制"，例如 examples/plan_customization/README.md 演示了用户如何修改计划后再让智能体继续运行，从而缓解"上下文无限增长"的问题（issue #901、#694）。

资料来源：src/smolagents/prompts/code_agent.yaml、src/smolagents/prompts/toolcalling_agent.yaml、examples/plan_customization/README.md

四、多代理层级（Managed Agents）

smolagents 通过 managed_agents 参数构造 层级式多代理系统：在主代理的提示中，被管理的子代理以"伪 Python 函数"或"伪工具"形式出现，主代理可像调用工具一样委派任务给它们。

子代理的 name 与 description 取自其构造参数；主代理读取这些元数据并插入到系统提示模板的相应占位符中。例如在 src/smolagents/prompts/code_agent.yaml 中：

{%- if managed_agents and managed_agents.values() | list %}
Calling a team member works similarly to calling a tool: ...
{%- for agent in managed_agents.values() %}
def {{ agent.name }}(task: str, additional_args: dict[str, Any]) -> str:
    """{{ agent.description }}"""
{% endfor %}
{%- endif %}

这一设计使得主代理可以将研究、编码、检索等职责分配给专用子代理。CLI（src/smolagents/cli.py）提供交互式工具与模型选择流程，便于快速搭建带有多代理层级的应用；默认工具如 WikipediaSearchTool、DuckDuckGoSearchTool 则注册于 TOOL_MAPPING 中供子代理复用（参见 src/smolagents/default_tools.py）。

序列化与持久化

MultiStepAgent.save() 会递归将所有 managed_agents 与工具写入 managed_agents/ 与 tools/ 子目录，把 prompt_templates 序列化为 prompts.yaml，并把代理元数据写入 JSON，从而支持从 Hub 加载（from_hub()）以及在多次运行间复用。社区关注的"记忆保存/加载"（issue #1216、#945）即依赖此序列化机制。

资料来源：src/smolagents/agents.py、src/smolagents/default_tools.py、src/smolagents/cli.py

五、常见失败模式与最佳实践

上下文爆炸：未设置 planning_interval 时，长任务可能耗尽上下文窗口。建议结合 step_callbacks 手动压缩历史。
计划与执行脱节：若计划模板未被覆盖（例如自定义 prompt_templates 时漏掉 planning 段），模型将不会生成 PlanningStep。
多代理死循环：主代理反复委派同一任务给子代理，需通过 max_steps 或自定义 step_callbacks 强制终止。
异步需求：issue #145 提议为库加入异步 API，但当前执行循环仍是同步的，若需并发只能从外部调度多个 agent.run()。

另请参阅

Tools & Default Tools
Memory & Step Types
Code Execution & Security
Models & Inference Providers

资料来源：src/smolagents/agents.py

Memory, Tools & Model Integrations

smolagents 的运行时由三个相互衔接的子系统组成：记忆（Memory）记录任务轨迹，工具（Tools）提供可执行能力，模型（Models）驱动决策与规划。本页基于源码描述这三者的职责、协作方式以及社区中关注度较高的扩展点。

章节 相关页面

继续阅读本节完整说明和来源证据。

记忆系统（Memory）

记忆系统是 agent 步进循环的状态容器。所有 step 都派生自 MemoryStep，按职责可拆分为五种类型，由 src/smolagents/memory.py 定义：

Step 类型	作用
`TaskStep`	记录用户输入的原始任务
`SystemPromptStep`	持久化 system prompt 模板内容
`PlanningStep`	持有由规划模型生成的高层计划（facts survey + plan）
`ActionStep`	记录模型输出、tool 调用、token 消耗、计时与错误
`FinalAnswerStep`	标记 agent 已给出最终结果

AgentMemory 负责追加、重放（replay）与压缩这些 step，是 MultiStepAgent._step 调用之间传递上下文的唯一通道。ActionStep 内嵌 ToolCall 与 Timing 字段，可为监控（monitoring.py）提供可观测性。

flowchart LR
    T[TaskStep] --> A1[ActionStep]
    A1 --> P[PlanningStep]
    P --> A2[ActionStep]
    A2 --> F[FinalAnswerStep]
    A1 -.token_usage.-> M[AgentMemory]
    P -.facts.-> M

社区长期关注记忆的两个痛点：持久化与压缩。Issue #1216 指出当前没有 save/load 接口，Issue #901 与 #694 要求在 step 数累积后自动摘要，否则上下文将突破模型窗口。Issue #2129 进一步提出为压缩事件暴露 hook，便于在多层 agent 中追踪“行为指纹”。在现有代码中，开发者可通过 agent.memory.steps 直接读写，或在 step 回调中观察 CallbackRegistry 事件，实现自定义摘要/重放。

工具系统（Tools）

工具在 src/smolagents/tools.py 中以 BaseTool 为抽象基类，常用子类包括 Tool（同步）与 PipelineTool（基于 🤗 Transformers 的 pipeline()）。每个工具需要声明 name、description、inputs（类型映射）和 output_type，并实现 forward()。

src/smolagents/default_tools.py 提供了开箱即用的实现：

VisitWebpageTool：使用 requests 与 markdownify 抓取并转换页面，输出按 max_output_length（默认 40000 字符）截断。
WikipediaSearchTool：基于 wikipedia-api，需提供 user_agent（Wikipedia 政策要求），可返回 summary 或全文，输出为 Markdown。
SpeechToTextTool：继承 PipelineTool，默认 checkpoint 为 openai/whisper-large-v3-turbo，将音频转写为文本。
FinalAnswerTool：通过 TOOL_MAPPING 注入到所有 agent，强制模型在结束时返回结构化结果。

工具在 agent 启动时被渲染到 prompt。MultiStepAgent._step（agents.py）会先调用 validate_tool_arguments（src/smolagents/tool_validation.py）做参数校验，再将执行结果写回 ActionStep.observations。对外部工具协议的桥接由 src/smolagents/mcp_client.py 提供，使 MCP server 可作为 Tool 集合直接挂载。

examples/plan_customization/README.md 演示了如何通过 step 回调暂停、修改并恢复 PlanningStep，实现 human-in-the-loop 审批，对应社区 Issue #52 中“如何注入用户确认”的常见诉求。

模型集成（Models）

模型层位于 src/smolagents/models.py，通过 Model 抽象类与 MODEL_REGISTRY 注册表管理多提供者。统一的 ChatMessage / ChatMessageToolCall 数据结构屏蔽了 LiteLLM、HF Inference API、OpenAI、Anthropic 等后端差异；流式输出由 ChatMessageStreamDelta 与 agglomerate_stream_deltas 聚合。

根据所选 agent 类型，加载不同的 prompt 模板：

prompts/code_agent.yaml —— CodeAgent 使用 Thought → Code → Observation 循环，模型在代码块内直接编写 Python 并调用工具。
prompts/toolcalling_agent.yaml —— ToolCallingAgent 引导模型输出 JSON Action，由框架解析为结构化工具调用。
prompts/structured_code_agent.yaml —— structured_code_agent 强制模型以 JSON 形式返回 {"thought", "code"}，结合 CODEAGENT_RESPONSE_FORMAT 实现解析短路。

MultiStepAgent._step 会根据是否使用结构化输出切换 JSON 解析与 parse_code_blobs 两条路径；当未启用结构化输出时，框架会向 LLM 输出追加闭合的 code block 标签以稳定终止序列。社区 Issue #145 长期请求将 Model.generate 改造为 async，而 #449 与 #1848 分别呼吁增加 LlamaCppModel 与 Anthropic agent skills 的注册实现。

协作关系与常见失败模式

记忆、工具、模型通过 ActionStep 形成完整反馈环：模型在 ActionStep 中产出 tool call → validate_tool_arguments 校验 → 工具将结果写回 observations → 序列化为下一轮 prompt 的 Observation。规划阶段引入 PlanningStep，将事实清单与高层计划注入 update_plan_pre_messages / update_plan_post_messages，让模型能在长时任务中自我纠偏。

常见失败模式包括：

上下文超限：未触发压缩时 ActionStep 累积导致 prompt 越界，可手动调用 agent.memory.replay() 与 step 回调实现滚动摘要（参见 #694）。
解析失败：当模型未按 code_block_tags 闭合代码块时，parse_code_blobs 抛 AgentParsingError；toolcalling_agent 下则因 JSON 缺失字段报 AgentToolCallError。
外部依赖缺失：WikipediaSearchTool / SpeechToTextTool 在 import 失败时显式抛出 ImportError，指引用户安装可选 extras。
记忆污染：长期持久化场景下需在加载时清洗 ActionStep.observations，对应 #2332 的 OWASP memory guard 提议。

Secure Code Execution, Monitoring & Deployment

smolagents 让大语言模型直接生成并执行 Python 代码，因此围绕"执行—观测—审计—部署"构建了一整套工程化体系。本页聚焦该体系的三个核心维度：本地安全执行、远程隔离执行、运行监控以及可视化部署入口。CodeAgent 与 ToolCallingAgent 均通过 executor 参数接入执行器，并在每一步骤通过 AgentMemory 与 Monitor ...

章节 相关页面

继续阅读本节完整说明和来源证据。

概述

smolagents 让大语言模型直接生成并执行 Python 代码，因此围绕"执行—观测—审计—部署"构建了一整套工程化体系。本页聚焦该体系的三个核心维度：本地安全执行、远程隔离执行、运行监控以及可视化部署入口。CodeAgent 与 ToolCallingAgent 均通过 executor 参数接入执行器，并在每一步骤通过 AgentMemory 与 Monitor 记录可观测数据，使运行轨迹既能回放也能持久化。该能力直接对应社区对 memory consolidation（#901、#694）与 save/load（#1216）的诉求。

本地安全代码执行

local_python_executor.py 提供了沙箱式的 Python 执行环境，其核心抽象是 PythonExecutor 与 LocalPythonExecutor，并通过白名单 BASE_BUILTIN_MODULES 控制可被 import 的内建模块。入口函数为：

evaluate_python_code(
    code, state, static_tools,
    authorized_imports, timeout_seconds,
    max_print_outputs_length=DEFAULT_MAX_PRINT_OUTPUT_LENGTH,
)

资料来源：src/smolagents/local_python_executor.py

其中 authorized_imports 用于追加允许的第三方包，timeout_seconds（默认取 MAX_EXECUTION_TIME_SECONDS）避免长任务阻塞，max_print_outputs_length 限制 print() 回显长度以防止上下文窗口溢出；社区 issue #2372 已指出该函数文档缺失 authorized_imports 与 max_print_outputs_length 两个参数，应在阅读源码时留意。fix_final_answer_code 负责规整 final_answer(...) 调用，使其能被上层解析。

default_tools.py 中的 PythonInterpreterTool 把这套执行器包装为模型可调用的工具：

class PythonInterpreterTool(Tool):
    def __init__(self, *args, authorized_imports=None,
                 timeout_seconds=MAX_EXECUTION_TIME_SECONDS, **kwargs):
        self.authorized_imports = list(set(BASE_BUILTIN_MODULES) | set(authorized_imports or []))
        ...

资料来源：src/smolagents/default_tools.py

forward 中把 code、状态、静态工具、超时一并传入 evaluate_python_code，最终把 state['_print_outputs'] 与返回值拼成 "Stdout:\n...\nOutput: ..." 字符串返回。

远程隔离执行

agents.py 顶层从 remote_executors 导入 BlaxelExecutor、DockerExecutor、E2BExecutor、ModalExecutor，分别对应 Blaxel 托管沙箱、本地 Docker 容器、E2B 微虚拟机、Modal 云函数四种隔离后端：

from .remote_executors import BlaxelExecutor, DockerExecutor, E2BExecutor, ModalExecutor

资料来源：src/smolagents/agents.py

开发者只需在 CodeAgent(executor=...) 处替换即可，无需改动 agent 其余逻辑——这一抽象也契合社区 #2368 关于 microsandbox/libkrun 进程隔离执行器的诉求方向。examples/sandboxed_execution.py 提供了完整示例，演示如何把不可信代码迁移到远端机器运行。

监控与可观测性

monitoring.py 定义了 AgentLogger、LogLevel、Monitor、TokenUsage 等基础设施。Monitor 在每一步对 step 进行装饰，捕获 stdout、错误与耗时；TokenUsage 聚合输入/输出 token 计数。agents.py 通过 rich 的 Panel、Group、Live 把 Thought、Code、Observation 流式渲染为终端面板，并同步写入 AgentMemory.steps 列表供外部回放。该链路为社区 #694、#901 提出的 memory consolidation 提供了落地基础——开发者可在外部对 steps 进行摘要压缩或持久化，从而回应 #1216 关于 save/load 的需求。

flowchart LR
    Agent[CodeAgent / ToolCallingAgent] --> Exec{Executor 选择}
    Exec -->|本地| LPE[LocalPythonExecutor<br/>evaluate_python_code]
    Exec -->|远端| Docker[DockerExecutor]
    Exec -->|远端| E2B[E2BExecutor]
    Exec -->|远端| Modal[ModalExecutor]
    Exec -->|远端| Blaxel[BlaxelExecutor]
    LPE --> Mon[Monitor / AgentLogger]
    Docker --> Mon
    E2B --> Mon
    Modal --> Mon
    Blaxel --> Mon
    Mon --> Mem[AgentMemory.steps<br/>TokenUsage]
    Mem --> UI[Gradio UI<br/>gradio_ui.py]
    UI --> Browser[vision_web_browser.py]

部署与 UI

gradio_ui.py 提供 create_agent_gradio_app_template（在 agents.py 中被复用），能够把任意 agent 包装为 Gradio Web 应用；vision_web_browser.py 则实现带视觉理解的浏览器子 agent，可作为多 agent 编排的子模块部署。结合 examples/sandboxed_execution.py，即可拼装"沙箱执行 + 监控日志 + Gradio 前端"的完整部署单元。

常见失败模式

现象	根因	处置方式
`ImportError: pandas not allowed`	`authorized_imports` 未配置	在 `PythonInterpreterTool(authorized_imports=[...])` 或 agent 构造参数中追加
`AgentMaxStepsError`	长任务超出 `timeout_seconds` 或 step 上限	引入 memory 摘要压缩（参考 #694 / #901）
远程执行器初始化失败	缺少 E2B_API_KEY、Modal token 等凭证	提前注入环境变量；v1.26.0 (#2275) 改善了 LiteLLM 类无 key 场景的错误提示
步骤输出截断	`max_print_outputs_length` 限制	调大参数或自行裁剪

资料来源：src/smolagents/local_python_executor.py src/smolagents/agents.py src/smolagents/monitoring.py

参见

代理与工具调用：参见 agents.py、default_tools.py。
模型接入与流式响应：参见 models.py。
监控与 Token 计量：参见 monitoring.py。
安全执行示例：参见 examples/sandboxed_execution.py。

来源：https://github.com/huggingface/smolagents / 项目说明书

失败模式与踩坑日记

保留 Doramagic 在发现、验证和编译中沉淀的项目专属风险，不把社区讨论只当作装饰信息。

high 来源证据：ENH: Add a way to insert custom chat history for multi-turn chat and what's the current best way?

可能增加新用户试用和生产接入成本。

high 来源证据：Real Memory summary for on-going conversations (avoid LLM size limits)

可能增加新用户试用和生产接入成本。

high 失败模式：security_permissions: [Feature Request] Memory Poisoning Protection for smolagents via OWASP Agent Memory Guard

Developers may expose sensitive permissions or credentials: [Feature Request] Memory Poisoning Protection for smolagents via OWASP Agent Memory Guard

high 来源证据：Feature: behavioral fingerprint hook for memory consolidation events in MultiStepAgent

可能影响授权、密钥配置或安全边界。

Pitfall Log / 踩坑日志

项目：huggingface/smolagents

摘要：发现 28 个潜在踩坑项，其中 6 个为 high/blocking；最高优先级：维护坑 - 来源证据：ENH: Add a way to insert custom chat history for multi-turn chat and what's the current best way?。

1. 维护坑 · 来源证据：ENH: Add a way to insert custom chat history for multi-turn chat and what's the current best way?

严重度：high
证据强度：source_linked
发现：GitHub 社区证据显示该项目存在一个维护/版本相关的待验证问题：ENH: Add a way to insert custom chat history for multi-turn chat and what's the current best way?
对用户的影响：可能增加新用户试用和生产接入成本。
证据：community_evidence:github | https://github.com/huggingface/smolagents/issues/1579 | 来源讨论提到 python 相关条件，需在安装/试用前复核。

2. 维护坑 · 来源证据：Real Memory summary for on-going conversations (avoid LLM size limits)

严重度：high
证据强度：source_linked
发现：GitHub 社区证据显示该项目存在一个维护/版本相关的待验证问题：Real Memory summary for on-going conversations (avoid LLM size limits)
对用户的影响：可能增加新用户试用和生产接入成本。
证据：community_evidence:github | https://github.com/huggingface/smolagents/issues/694 | 来源讨论提到 python 相关条件，需在安装/试用前复核。

3. 安全/权限坑 · 失败模式：security_permissions: [Feature Request] Memory Poisoning Protection for smolagents via OWASP Agent Memory Guard

严重度：high
证据强度：source_linked
发现：Developers should check this security_permissions risk before relying on the project: [Feature Request] Memory Poisoning Protection for smolagents via OWASP Agent Memory Guard
对用户的影响：Developers may expose sensitive permissions or credentials: [Feature Request] Memory Poisoning Protection for smolagents via OWASP Agent Memory Guard
证据：failure_mode_cluster:github_issue | https://github.com/huggingface/smolagents/issues/2332 | [Feature Request] Memory Poisoning Protection for smolagents via OWASP Agent Memory Guard

4. 安全/权限坑 · 来源证据：Feature: behavioral fingerprint hook for memory consolidation events in MultiStepAgent

严重度：high
证据强度：source_linked
发现：GitHub 社区证据显示该项目存在一个安全/权限相关的待验证问题：Feature: behavioral fingerprint hook for memory consolidation events in MultiStepAgent
对用户的影响：可能影响授权、密钥配置或安全边界。
证据：community_evidence:github | https://github.com/huggingface/smolagents/issues/2129 | 来源讨论提到 python 相关条件，需在安装/试用前复核。

5. 安全/权限坑 · 来源证据：[Feature Request] Memory Poisoning Protection for smolagents via OWASP Agent Memory Guard

严重度：high
证据强度：source_linked
发现：GitHub 社区证据显示该项目存在一个安全/权限相关的待验证问题：[Feature Request] Memory Poisoning Protection for smolagents via OWASP Agent Memory Guard
对用户的影响：可能阻塞安装或首次运行。
证据：community_evidence:github | https://github.com/huggingface/smolagents/issues/2332 | 来源讨论提到 python 相关条件，需在安装/试用前复核。

6. 安全/权限坑 · 来源证据：[Feature] Agent memory/history consolidation after a number of interactions

严重度：high
证据强度：source_linked
发现：GitHub 社区证据显示该项目存在一个安全/权限相关的待验证问题：[Feature] Agent memory/history consolidation after a number of interactions
对用户的影响：可能影响升级、迁移或版本选择。
证据：community_evidence:github | https://github.com/huggingface/smolagents/issues/901 | 来源讨论提到 python 相关条件，需在安装/试用前复核。

7. 安装坑 · 失败模式：installation: Save/Load agent memory

严重度：medium
证据强度：source_linked
发现：Developers should check this installation risk before relying on the project: Save/Load agent memory
对用户的影响：Developers may fail before the first successful local run: Save/Load agent memory
证据：failure_mode_cluster:github_issue | https://github.com/huggingface/smolagents/issues/1216 | Save/Load agent memory

8. 安装坑 · 失败模式：installation: v1.21.0

严重度：medium
证据强度：source_linked
发现：Developers should check this installation risk before relying on the project: v1.21.0
对用户的影响：Upgrade or migration may change expected behavior: v1.21.0
证据：failure_mode_cluster:github_release | https://github.com/huggingface/smolagents/releases/tag/v1.21.0 | v1.21.0

9. 安装坑 · 失败模式：installation: v1.22.0

严重度：medium
证据强度：source_linked
发现：Developers should check this installation risk before relying on the project: v1.22.0
对用户的影响：Upgrade or migration may change expected behavior: v1.22.0
证据：failure_mode_cluster:github_release | https://github.com/huggingface/smolagents/releases/tag/v1.22.0 | v1.22.0

10. 配置坑 · 失败模式：configuration: DOC: `evaluate_python_code` docstring is missing two parameters

严重度：medium
证据强度：source_linked
发现：Developers should check this configuration risk before relying on the project: DOC: evaluate_python_code docstring is missing two parameters
对用户的影响：Developers may misconfigure credentials, environment, or host setup: DOC: evaluate_python_code docstring is missing two parameters
证据：failure_mode_cluster:github_issue | https://github.com/huggingface/smolagents/issues/2372 | DOC: evaluate_python_code docstring is missing two parameters

11. 配置坑 · 失败模式：configuration: Enhanced memory module with integrations

严重度：medium
证据强度：source_linked
发现：Developers should check this configuration risk before relying on the project: Enhanced memory module with integrations
对用户的影响：Developers may misconfigure credentials, environment, or host setup: Enhanced memory module with integrations
证据：failure_mode_cluster:github_issue | https://github.com/huggingface/smolagents/issues/945 | Enhanced memory module with integrations

12. 配置坑 · 失败模式：configuration: Feature: behavioral fingerprint hook for memory consolidation events in MultiStepAgent

严重度：medium
证据强度：source_linked
发现：Developers should check this configuration risk before relying on the project: Feature: behavioral fingerprint hook for memory consolidation events in MultiStepAgent
对用户的影响：Developers may misconfigure credentials, environment, or host setup: Feature: behavioral fingerprint hook for memory consolidation events in MultiStepAgent
证据：failure_mode_cluster:github_issue | https://github.com/huggingface/smolagents/issues/2129 | Feature: behavioral fingerprint hook for memory consolidation events in MultiStepAgent

13. 配置坑 · 失败模式：configuration: [Feature] Agent memory/history consolidation after a number of interactions

严重度：medium
证据强度：source_linked
发现：Developers should check this configuration risk before relying on the project: [Feature] Agent memory/history consolidation after a number of interactions
对用户的影响：Developers may misconfigure credentials, environment, or host setup: [Feature] Agent memory/history consolidation after a number of interactions
证据：failure_mode_cluster:github_issue | https://github.com/huggingface/smolagents/issues/901 | [Feature] Agent memory/history consolidation after a number of interactions

14. 配置坑 · 失败模式：configuration: v1.20.0

严重度：medium
证据强度：source_linked
发现：Developers should check this configuration risk before relying on the project: v1.20.0
对用户的影响：Upgrade or migration may change expected behavior: v1.20.0
证据：failure_mode_cluster:github_release | https://github.com/huggingface/smolagents/releases/tag/v1.20.0 | v1.20.0

15. 能力坑 · 能力判断依赖假设

严重度：medium
证据强度：source_linked
发现：README/documentation is current enough for a first validation pass.
对用户的影响：假设不成立时，用户拿不到承诺的能力。
证据：capability.assumptions | github_repo:898968194 | https://github.com/huggingface/smolagents | README/documentation is current enough for a first validation pass.

16. 运行坑 · 失败模式：runtime: v1.25.0

严重度：medium
证据强度：source_linked
发现：Developers should check this runtime risk before relying on the project: v1.25.0
对用户的影响：Upgrade or migration may change expected behavior: v1.25.0
证据：failure_mode_cluster:github_release | https://github.com/huggingface/smolagents/releases/tag/v1.25.0 | v1.25.0

17. 维护坑 · 失败模式：migration: v1.24.0

严重度：medium
证据强度：source_linked
发现：Developers should check this migration risk before relying on the project: v1.24.0
对用户的影响：Upgrade or migration may change expected behavior: v1.24.0
证据：failure_mode_cluster:github_release | https://github.com/huggingface/smolagents/releases/tag/v1.24.0 | v1.24.0

18. 维护坑 · 维护活跃度未知

严重度：medium
证据强度：source_linked
发现：未记录 last_activity_observed。
对用户的影响：新项目、停更项目和活跃项目会被混在一起，推荐信任度下降。
证据：evidence.maintainer_signals | github_repo:898968194 | https://github.com/huggingface/smolagents | last_activity_observed missing

严重度：medium
证据强度：source_linked
发现：no_demo
证据：downstream_validation.risk_items | github_repo:898968194 | https://github.com/huggingface/smolagents | no_demo; severity=medium

20. 安全/权限坑 · 存在评分风险

严重度：medium
证据强度：source_linked
发现：no_demo
对用户的影响：风险会影响是否适合普通用户安装。
证据：risks.scoring_risks | github_repo:898968194 | https://github.com/huggingface/smolagents | no_demo; severity=medium

21. 安全/权限坑 · 来源证据：DOC: `evaluate_python_code` docstring is missing two parameters

严重度：medium
证据强度：source_linked
发现：GitHub 社区证据显示该项目存在一个安全/权限相关的待验证问题：DOC: evaluate_python_code docstring is missing two parameters
对用户的影响：可能影响授权、密钥配置或安全边界。
证据：community_evidence:github | https://github.com/huggingface/smolagents/issues/2372 | 来源讨论提到 python 相关条件，需在安装/试用前复核。

22. 能力坑 · 失败模式：capability: ENH: Executor support for microsandbox

严重度：low
证据强度：source_linked
发现：Developers should check this capability risk before relying on the project: ENH: Executor support for microsandbox
对用户的影响：Developers may hit a documented source-backed failure mode: ENH: Executor support for microsandbox
证据：failure_mode_cluster:github_issue | https://github.com/huggingface/smolagents/issues/2368 | ENH: Executor support for microsandbox

23. 运行坑 · 失败模式：performance: ENH: Add a way to insert custom chat history for multi-turn chat and what's the current best...

严重度：low
证据强度：source_linked
发现：Developers should check this performance risk before relying on the project: ENH: Add a way to insert custom chat history for multi-turn chat and what's the current best way?
对用户的影响：Developers may hit a documented source-backed failure mode: ENH: Add a way to insert custom chat history for multi-turn chat and what's the current best way?
证据：failure_mode_cluster:github_issue | https://github.com/huggingface/smolagents/issues/1579 | ENH: Add a way to insert custom chat history for multi-turn chat and what's the current best way?

24. 运行坑 · 失败模式：performance: Real Memory summary for on-going conversations (avoid LLM size limits)

严重度：low
证据强度：source_linked
发现：Developers should check this performance risk before relying on the project: Real Memory summary for on-going conversations (avoid LLM size limits)
对用户的影响：Developers may hit a documented source-backed failure mode: Real Memory summary for on-going conversations (avoid LLM size limits)
证据：failure_mode_cluster:github_issue | https://github.com/huggingface/smolagents/issues/694 | Real Memory summary for on-going conversations (avoid LLM size limits)

25. 运行坑 · 失败模式：performance: v1.23.0

严重度：low
证据强度：source_linked
发现：Developers should check this performance risk before relying on the project: v1.23.0
对用户的影响：Upgrade or migration may change expected behavior: v1.23.0
证据：failure_mode_cluster:github_release | https://github.com/huggingface/smolagents/releases/tag/v1.23.0 | v1.23.0

26. 维护坑 · issue/PR 响应质量未知

严重度：low
证据强度：source_linked
发现：issue_or_pr_quality=unknown。
对用户的影响：用户无法判断遇到问题后是否有人维护。
证据：evidence.maintainer_signals | github_repo:898968194 | https://github.com/huggingface/smolagents | issue_or_pr_quality=unknown

27. 维护坑 · 发布节奏不明确

严重度：low
证据强度：source_linked
发现：release_recency=unknown。
对用户的影响：安装命令和文档可能落后于代码，用户踩坑概率升高。
证据：evidence.maintainer_signals | github_repo:898968194 | https://github.com/huggingface/smolagents | release_recency=unknown

28. 维护坑 · 失败模式：maintenance: v1.26.0

严重度：low
证据强度：source_linked
发现：Developers should check this maintenance risk before relying on the project: v1.26.0
对用户的影响：Upgrade or migration may change expected behavior: v1.26.0
证据：failure_mode_cluster:github_release | https://github.com/huggingface/smolagents/releases/tag/v1.26.0 | v1.26.0

来源：Doramagic 发现、验证与编译记录