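browser-use Project Manual

Doramagic Project Pack · Project Manual

browser-use Project

browser-use is an open-source browser automation project for LLM agents. Doramagic has compiled the installation entry point, this manual, a context pack, and the risk boundaries so that readers can evaluate the project before trying it.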

Overview and Quickstart

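1. Project Positioning and Core Capabilities

browser-use is an open-source browser automation library for large language model (LLM) agents. Its goal is to let AI agents drive a real browser from natural-language instructions to complete tasks such as web navigation, form submission, information extraction, and workflow orchestration. Source: README.md:1-80.

The project is built around three parts, Agent, BrowserSession, and Tools:

Agent: an interpreter built on an LLM decision loop that executes actions step by step according to the user request;
BrowserSession: the underlying browser and CDP (Chrome DevTools Protocol) client, responsible for opening task pages, managing tabs, and switching contexts;
Tools: the built-in action set (click, input, scroll, extract, file operations, and so on), which also lets users register custom tools through the Tools.action decorator. Sources: README.md:60-120, browser_use/actor/README.md:1-60.

The official system prompt states that the agent works in an iterative loop: at each step it receives user_request, agent_history, browser_state, browser_vision, and read_state as input and outputs a structured action. Source: browser_use/agent/system_prompts/system_prompt.md:1-60.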

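2. Installation and a Minimal Runnable Example

The most common first-time installation path in the community is pip:

pip install -U browser-use
playwright install

After that, pass llm explicitly when constructing the Agent to start a browser task: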

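import asyncio
from browser_use import Agent, ChatBrowserUse

async def main() -> None:
    agent = Agent(
        task="Find the latest AI news on Hacker News",
        llm=ChatBrowserUse(),
    )
    await agent.run()  # run() is a coroutine and needs an event loop

asyncio.run(main())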

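The repository recommends ChatBrowserUse as the default LLM provider because it is optimized for browser agent scenarios. Source: README.md:40-80. To switch to another model vendor (such as OpenAI, Azure, Gemini, Ollama, or OpenRouter), replace the llm= argument. Community issue #4755 reports a documentation problem with incorrect model import paths, so check the supported-models section of the official documentation for the current import paths. Source: README.md:60-90.

Developers who want to reuse a local Chromium profile (including login state and cookies) can use the real browser profile mode shown in examples/browser/real_browser.py. Source: README.md:120-150.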

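3. Core Architecture and Execution Flow

The browser-use runtime is driven by Agent, which runs the following steps in order: parse the task → capture browser state → build the LLM prompt → call the LLM → parse the structured action → execute it in the browser → record history, repeating until the task completes or the max_steps limit is reached. Source: browser_use/agent/service.py:200-400.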

flowchart TD
    A[User task] --> B[Agent.run]
    B --> C[Capture browser_state + screenshot]
    C --> D[Inject system prompt and history]
    D --> E[Call LLM for a decision]
    E --> F[Parse AgentOutput action]
    F --> G[BrowserSession executes action]
    G --> H[Record in agent_history]
    H --> I{done reached?}
    I -- No --> C
    I -- Yes --> J[Return result]
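
The observe, decide, act, record cycle can be illustrated with a toy model. This is a minimal sketch and not browser-use code: ToyBrowser, StepRecord, run_loop, and scripted_policy are hypothetical stand-ins, and only the step order and the max_steps cap come from the description in this section.

```python
from dataclasses import dataclass


@dataclass
class StepRecord:
    """One history entry (hypothetical structure, not the library's)."""
    step: int
    action: str
    result: str


@dataclass
class ToyBrowser:
    """Stand-in for a browser session; it only tracks a current URL."""
    url: str = "about:blank"

    def state(self) -> str:
        return f"url={self.url}"

    def execute(self, action: str) -> str:
        # Only 'navigate:<url>' changes state in this toy model.
        if action.startswith("navigate:"):
            self.url = action.split(":", 1)[1]
            return f"navigated to {self.url}"
        return f"ignored {action}"


def run_loop(task, decide, browser, max_steps=5):
    """Repeat observe -> decide -> act -> record until 'done' or max_steps."""
    history = []
    for step in range(1, max_steps + 1):
        state = browser.state()                 # observe
        action = decide(task, state, history)   # decide (an LLM call in practice)
        if action == "done":
            break
        result = browser.execute(action)        # act
        history.append(StepRecord(step, action, result))  # record
    return history


def scripted_policy(task, state, history):
    """Deterministic stand-in for the LLM decision."""
    return "navigate:https://example.com" if "about:blank" in state else "done"
```

Calling run_loop("open example", scripted_policy, ToyBrowser()) yields a single history entry, because the scripted policy reports done once the URL has changed. A policy that never reports done stops at max_steps.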

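The system prompt is organized into four parts (language, input, reasoning rules, and efficiency guidelines), and the template switches with the LLM type. For example, system_prompt_no_thinking.md suits models without chain-of-thought, while system_prompt_anthropic_flash.md targets the fast path for the Claude family. Sources: browser_use/agent/system_prompts/system_prompt_no_thinking.md:1-40, browser_use/agent/system_prompts/system_prompt_anthropic_flash.md:1-40, browser_use/agent/system_prompts/__init__.py:1-20.

The page extraction actions extract and ai_step share one prompt template whose core rule is to answer only from the page content and never fabricate. Source: browser_use/agent/prompts.py:1-60.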

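4. Common Usage Patterns and Community Feedback

Continuous monitoring tasks: examples/apps/news-use/ demonstrates a complete flow that polls news sites on a 5-minute cycle, extracts the latest reports, and generates summaries. Source: examples/apps/news-use/README.md:1-50.
Third-party integrations: third-party integration examples belong in examples/integrations/<provider>/, while integrations shipped with the package go in browser_use/integrations/<provider>/. Source: examples/integrations/README.md:1-30.
Low-level CDP operations: when finer-grained control is needed, use the BrowserSession, Page, Element, and Mouse classes exposed in browser_use/actor to drive Chrome directly, bypassing the high-level Agent. Source: browser_use/actor/README.md:1-80.

Frequent community pain points and suggested handling: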

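Common issue	Explanation and suggestion
Blank browser window after launch (issue #1020)	Usually `playwright install` was not run, or `executable_path` is misconfigured in agent mode; run `playwright install chromium` first. Source: README.md:30-60
Azure OpenAI triggers content moderation (issue #4783)	Azure's `content_filter` is sensitive to automation prompts; consider switching to `ChatBrowserUse` or adjusting the filter level in the Azure portal. Source: README.md:60-90
Missing `hover` action (issue #4964)	Not yet exposed in the high-level API; trigger the CSS `:hover` state through `evaluate`, or register a custom action in `Tools`. Source: browser_use/agent/system_prompts/system_prompt.md:60-100
Need for human-in-the-loop (issue #221)	0.12.x has no built-in pause mechanism; add approval logic inside a custom tool, or fork the Agent main loop. Source: browser_use/agent/service.py:200-400
JSON parsing failures with local models such as Ollama (issue #2605)	Usually the local model has not enabled `tool calling`; disable `flash_mode` or switch to a model that supports structured output. Source: browser_use/agent/prompts.py:60-120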

System Architecture

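Architecture Overview

browser-use is an LLM-oriented browser automation framework whose core goal is to turn a natural-language request to "complete any task on the web" into a sequence of actions executable in the browser. The system uses a layered, event-driven architecture, divided from top to bottom into roughly four layers:

Agent orchestration layer (browser_use/agent/): handles the conversation loop, message management, prompt engineering, and LLM scheduling.
Message and state layer (browser_use/agent/message_manager/): maintains multi-turn history and provides a compaction mechanism.
Browser execution layer / Actor layer (browser_use/actor/): controls Chromium directly over CDP and manages sessions, tabs, pages, and elements.
Tool and action layer: maps the structured actions output by the LLM to operations executable in the browser (click, input, scroll, extract, file read/write, and so on).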

flowchart TB
    User["User Request"] --> AgentService["Agent Service (service.py)"]
    AgentService --> MsgMgr["Message Manager (with Compaction)"]
    MsgMgr --> SysPrompt["System Prompt template"]
    SysPrompt --> LLM["LLM (ChatBrowserUse / OpenAI / Anthropic / Gemini)"]
    LLM -->|"Structured Action"| ActionLayer["Action validation and dispatch"]
    ActionLayer --> Actor["Actor layer (BrowserSession / Page / Element)"]
    Actor -->|"CDP"| Browser["Chromium browser"]
    Browser -->|"DOM / screenshot / state"| MsgMgr

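This closed loop is driven by the Agent class in browser_use/agent/service.py. Its constructor accepts parameters such as llm, browser_session, page_extraction_llm, and judge_llm, registers the models with the TokenCost service for cost accounting, and uses AgentSettings to manage runtime switches such as use_vision, max_failures, use_thinking, flash_mode, enable_planning, and message_compaction. Source: browser_use/agent/service.py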

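Agent Orchestration Layer

The Agent orchestration layer is the "brain" of the framework. Its core components are:

System Prompt templates: the browser_use/agent/system_prompts/ directory provides several variants, selected automatically by model and run mode:
system_prompt.md: the standard template. It requires the LLM to maintain todo.md and to produce four fields at every step: evaluation_previous_goal, memory, next_goal, and action. Source: browser_use/agent/system_prompts/system_prompt.md
system_prompt_no_thinking.md: turns off the explicit thinking block to save tokens. Source: browser_use/agent/system_prompts/system_prompt_no_thinking.md
system_prompt_anthropic_flash.md: a trimmed template for Anthropic's Flash mode. Source: browser_use/agent/system_prompts/system_prompt_anthropic_flash.md
Per-step state assembly: prompts.py assembles <browser_state>, <file_system>, <todo_contents>, the current screenshot, and <read_state> (present only after extract / read_file has run) into the LLM input at each step, and appends step metadata at the end of the message to maximize prefix cache hits. Source: browser_use/agent/prompts.py
Message compaction: the compact_messages method in message_manager/service.py uses a dedicated summarization system prompt that preserves task requirements, key facts, decisions, errors, and next steps, and filters the history through sensitive_data before sending it to the compaction LLM. Only steps explicitly confirmed as successful in the history are marked complete, which avoids hallucinated "done" claims. Source: browser_use/agent/message_manager/service.py
Data models: views.py uses Pydantic to define serializable structures such as DetectedVariable and VariableMetadata for passing data across layers. Source: browser_use/agent/views.py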

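Browser Execution Layer (Actor)

The Actor layer turns abstract actions into in-browser events and acts as the framework's "hands and feet".

BrowserSession / Browser: according to browser_use/actor/README.md, BrowserSession is the session manager (also exposed under the alias Browser). It provides tab and lifecycle methods such as start() / stop() / kill() / new_page() / get_pages() / get_current_page() / close_page(), and communicates with Chromium through a CDP client. Source: browser_use/actor/README.md
Page / Element / Mouse: Page represents a single tab or iframe, Element wraps a DOM node, and Mouse provides low-level interactions such as click / move / scroll.
AI-driven content extraction: Page.extract_content(query, schema, llm) lets any LLM act as an extractor and return structured results from the current page. A typical example is news_monitor.py, described in examples/apps/news-use/README.md, which calls Gemini to summarize headline articles and analyze their sentiment. Source: examples/apps/news-use/README.md

One engineering trade-off worth noting: ChatBrowserUse() is the default and recommended option, and other providers should be chosen only when the model really needs to be replaced. The integration examples directory states the same convention. Source: examples/integrations/README.md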

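Tools, Actions, and Extension Points

The "action" the LLM produces at each step comes from a controlled vocabulary. Its execution paths and extension points include:

Restricted files / restricted profiles: in service.py, an extension whitelist (md/csv/json/pdf/zip/png/…) and a set of "blacklist keywords" (never/dont/not/…) jointly constrain what the Agent can write on a page, reducing the risk of privilege overreach.
Runtime configuration: AgentSettings groups use_vision, vision_detail_level, max_actions_per_step, loop_detection_window, planning_exploration_limit, final_response_after_failure, and others; fallback_llm together with the _using_fallback_llm flag provides the runtime semantics for switching between the primary and backup models.
Integration example location convention: examples/integrations/<provider>/ is for third-party service demos, examples/custom-functions/ is for vendor-neutral custom tools, and browser_use/integrations/<provider>/ is used only when the integration ships with the main package and has tests. Source: examples/integrations/README.md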

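Architecture Signals from the Community and Version History

Release notes for recent versions (0.12.3–0.12.9) reveal several threads that run through the architecture's evolution and relate directly to the design above:

CLI 2.0 (0.12.3) switched the underlying layer from Playwright to direct CDP plus a persistent background daemon, bringing command latency down to about 50 ms. This is the motivation for the CDP-centered design of the Actor layer.
0.12.5 removed litellm from the core dependencies in response to the litellm supply chain poisoning incident, but kept the ChatLiteLLM wrapper, which users must install explicitly. This reflects a "thin core, extensions on demand" dependency strategy.
0.12.8 tightened the Unix socket file permissions on the daemon and made evaluate() refuse to run on restricted browser profiles, which corresponds directly to the execution boundary of the Actor layer.
0.12.9 fixed the check that skipped screenshots on new tabs and passed the session id through to the judge LLM, strengthening the message management layer and multi-session isolation.
Long-standing community requests (such as #221 human takeover, #2605 support for new open-source models, #4964 an explicit hover action, and #947 human-like behavior) will further drive extensions to the Actor layer's action table and the Agent orchestration layer's prompt engineering.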

LLM and Model Integration

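Overview

The LLM and model integration layer is the core abstraction of browser-use. It unifies any large language model behind the BaseChatModel interface and handles cross-cutting concerns such as structured output, message construction, token accounting, and prompt assembly. All upper-level Agent logic (browser_use/agent/service.py), tool calls, and Browser Session control are decoupled from the underlying model through this layer, so models from OpenAI, Anthropic, Google, Ollama, and other vendors can be swapped without changing business code. Sources: README.md, browser_use/llm/README.md.

1. Officially Supported Models and Providers

browser_use/llm/README.md lists the officially supported providers: OpenAI, Anthropic, Google, Groq, Ollama, DeepSeek, Mistral, and Cerebras. Each provider corresponds to a submodule under browser_use/llm/ (such as browser_use/llm/openai/ and browser_use/llm/anthropic/) that implements the unified ainvoke() entry point.

Community note: Issue #4755 reports that some OpenRouter model import paths listed in the documentation do not exist; rely on the classes actually exported from browser_use/llm/__init__.py. Sources: README.md, community Issue #4755.

Security note: in version 0.12.5, litellm was removed from the core dependencies to avoid the supply chain backdoor incident of 2026-03-24. The ChatLiteLLM wrapper remains, and users must run pip install litellm themselves. Source: Release 0.12.5 Changelog.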

2. The BaseChatModel Abstraction and Message Protocol

flowchart LR
    A[Agent business logic] --> B[BaseChatModel.ainvoke]
    B --> C{provider}
    C -->|OpenAI| D[ChatOpenAI]
    C -->|Anthropic| E[ChatAnthropic]
    C -->|Google| F[ChatGoogle]
    C -->|Ollama/local| G[ChatOllama]
    C -->|Browser-Use SaaS| H[ChatBrowserUse]
    D --> I[Structured output validation]
    E --> I
    F --> I
    G --> I
    H --> I
    I --> J[ActionModel / AgentOutput]

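All Chat* classes inherit from BaseChatModel in browser_use/llm/base.py. The core contract is the async method ainvoke(messages, output_format=None), whose return object exposes the structured result through the .completion field. Sources: browser_use/llm/base.py, the ai_step call in browser_use/agent/service.py. Elements of the messages list follow the BaseMessage protocol in browser_use/llm/messages.py; the main types are SystemMessage, UserMessage, and AssistantMessage, which keeps semantics consistent across vendors.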
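
The shape of that contract can be shown with a self-contained stand-in. This is a sketch under assumptions rather than the library's classes: Message, Completion, ChatModel, and EchoModel are hypothetical names, and only the ainvoke(messages, output_format=None) signature and the .completion field are taken from the text above.

```python
import asyncio
import json
from dataclasses import dataclass
from typing import Any, Callable, Optional, Protocol


@dataclass
class Message:
    role: str      # "system", "user", or "assistant"
    content: str


@dataclass
class Completion:
    completion: Any  # raw text, or a parsed object when output_format is given


class ChatModel(Protocol):
    """Provider-neutral interface: every provider exposes the same coroutine."""

    async def ainvoke(
        self,
        messages: list[Message],
        output_format: Optional[Callable[[dict], Any]] = None,
    ) -> Completion: ...


class EchoModel:
    """Fake provider that replies with JSON echoing the last user message."""

    async def ainvoke(self, messages, output_format=None):
        last_user = next(m.content for m in reversed(messages) if m.role == "user")
        raw = json.dumps({"echo": last_user})
        if output_format is None:
            return Completion(completion=raw)
        # With an output format, parse the raw JSON into the requested shape.
        return Completion(completion=output_format(json.loads(raw)))


demo_messages = [Message("system", "Be brief."), Message("user", "hi")]
```

Because callers depend only on ainvoke and .completion, swapping EchoModel for another implementation requires no change on the calling side, which is the decoupling this section describes.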

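3. Multi-LLM Collaboration in the Agent

The Agent constructor accepts up to four independent LLMs at once, each with a distinct role. Sources: browser_use/agent/service.py, browser_use/agent/views.py.

Field	Role	Typical use
`llm`	Main conversation model, responsible for reasoning and action selection	Any supported model
`page_extraction_llm`	Dedicated model for structured page extraction	`extract_content` / `ai_step`
`judge_llm`	Task judging and ground-truth checking	Used with `use_judge=True`
`fallback_llm`	Fallback when the main LLM fails	Switches automatically on network or quota errors

The benefit of this split is that a more capable but more expensive model (such as Claude or Gemini) can handle the main reasoning while a faster, cheaper local model (Ollama) handles page extraction. The Gpt-OSS + Ollama parsing failure reported in community Issue #2605 is a typical case where page_extraction_llm needs more robust JSON parsing during structured output validation. Source: Issue #2605.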
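
The fallback_llm row in the table can be pictured as a try-then-switch wrapper. The sketch below is an illustration only: ProviderError, invoke_with_fallback, and the two stub models are hypothetical, and the actual switching logic in browser-use may differ.

```python
import asyncio


class ProviderError(Exception):
    """Hypothetical error for a network or quota failure of a model call."""


async def invoke_with_fallback(primary, fallback, prompt):
    """Return (answer, used_fallback); switch models only on ProviderError."""
    try:
        return await primary(prompt), False
    except ProviderError:
        if fallback is None:
            raise  # no backup configured, surface the original error
        return await fallback(prompt), True


async def flaky_primary(prompt):
    # Stub main model that always fails, to exercise the fallback path.
    raise ProviderError("quota exceeded")


async def steady_backup(prompt):
    # Stub backup model.
    return f"backup answer to: {prompt}"
```

Returning the used_fallback flag alongside the answer mirrors the idea of a runtime flag that records which model is currently active.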

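4. Prompts and Output Schema

The model integration layer is tightly coupled with the prompt system. The browser_use/agent/system_prompts/ directory provides separate templates for different model families:

system_prompt.md: the general template, with example blocks such as <todo_examples>, <evaluation_examples>, and <memory_examples>. Source: browser_use/agent/system_prompts/system_prompt.md.
system_prompt_no_thinking.md: for models with extended chain-of-thought disabled, with more compact instructions. Source: browser_use/agent/system_prompts/system_prompt_no_thinking.md.
system_prompt_anthropic_flash.md: for Claude Flash mode, with a quick reference of 11 common actions. Source: browser_use/agent/system_prompts/system_prompt_anthropic_flash.md.

The Agent's default output is the AgentOutput Pydantic model, which contains four fields: evaluation_previous_goal, memory, next_goal, and action. The LLM layer uses the output_format parameter to force the model to generate JSON that conforms to this schema; on failure, Pydantic validation raises an error and triggers a retry. Sources: browser_use/agent/views.py, MessageManager in browser_use/agent/service.py.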
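
The validate-then-retry behavior can be sketched with the standard library alone. This example assumes nothing about the real implementation, which uses Pydantic: parse_agent_output and call_with_retry are hypothetical helpers, and only the four field names come from the text above.

```python
import json

REQUIRED_FIELDS = ("evaluation_previous_goal", "memory", "next_goal", "action")


def parse_agent_output(raw: str) -> dict:
    """Parse model output and check that the four expected fields are present."""
    data = json.loads(raw)  # raises json.JSONDecodeError, a ValueError subclass
    if not isinstance(data, dict):
        raise ValueError("output must be a JSON object")
    missing = [name for name in REQUIRED_FIELDS if name not in data]
    if missing:
        raise ValueError(f"missing fields: {missing}")
    return data


def call_with_retry(generate, max_attempts=3):
    """Call generate(attempt) until its output validates or attempts run out."""
    last_error = None
    for attempt in range(1, max_attempts + 1):
        try:
            return parse_agent_output(generate(attempt))
        except ValueError as exc:
            last_error = exc
    raise RuntimeError(f"no valid output after {max_attempts} attempts") from last_error


def truncated_then_valid(attempt):
    """Stub model: first reply is cut off mid-JSON, second reply is complete."""
    if attempt == 1:
        return '{"memory": "opened page", "next_goal": '
    return json.dumps({
        "evaluation_previous_goal": "success",
        "memory": "opened page",
        "next_goal": "click search",
        "action": [{"click": {"index": 3}}],
    })
```

The truncated first reply resembles the "EOF while parsing" symptom reported for local models; the retry recovers once a complete object arrives.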

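5. Message Compaction and Context Management

As the number of task steps grows, message history expands quickly. MessageCompactionSettings (defined in browser_use/agent/views.py) provides an automatic compaction mechanism:

compact_every_n_steps: trigger a summary every N steps;
trigger_char_count / trigger_token_count: compact immediately once a threshold is exceeded;
summary_max_chars: maximum number of characters in the summary;
keep_last_items: keep the most recent raw messages to preserve local context.

During compaction, older messages are replaced by a summary generated by compaction_llm, which substantially reduces token consumption. Source: browser_use/agent/views.py, MessageCompactionSettings class. In version 0.12.9, the session id is passed through to judge_llm calls, which improves observability in multi-session scenarios. Source: Release 0.12.9.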
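
How these settings might interact is sketched below. The field names follow the list above, but the default values, the way the two triggers are combined, and the helpers should_compact and compact are assumptions made for illustration; the summarize argument stands in for the compaction LLM.

```python
from dataclasses import dataclass


@dataclass
class CompactionSettings:
    """Toy settings object; the default values here are made up."""
    compact_every_n_steps: int = 10
    trigger_char_count: int = 2000
    summary_max_chars: int = 200
    keep_last_items: int = 2


def should_compact(step, messages, settings):
    """Compact on every Nth step, or as soon as the history is too long."""
    total_chars = sum(len(m) for m in messages)
    on_schedule = step > 0 and step % settings.compact_every_n_steps == 0
    return on_schedule or total_chars > settings.trigger_char_count


def compact(messages, settings, summarize):
    """Replace older messages with one summary and keep the newest ones raw."""
    cut = len(messages) - settings.keep_last_items
    if cut <= 0:
        return list(messages)  # nothing old enough to summarize
    old, recent = messages[:cut], messages[cut:]
    summary = summarize(old)[: settings.summary_max_chars]
    return [f"[summary] {summary}"] + recent
```

Keeping the last few messages verbatim preserves the local context the model needs for its next action, while the summary carries the longer-range facts.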

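6. Common Failures and Community Feedback

Symptom	Root cause	Suggested fix
Azure OpenAI raises `ResponsibleAIPolicyViolation`	Keywords such as "jailbreak" in the system prompt are misclassified	Reword via `override_system_message`, or switch to a non-Azure endpoint (Issue #4783)
Ollama/Gpt-OSS returns `Invalid JSON: EOF while parsing`	The local model has JSON mode disabled or insufficient context	Pass `output_format` explicitly, or reduce `max_clickable_elements_length` (Issue #2605)
Gemini-3 returns empty output	Temperature set too low	0.12.6 sets the default temperature to 1.0 (PR #4489)
Blank page after Chromium starts	Missing `playwright install` dependencies or insufficient permissions	Run `playwright install chromium` again (Issue #1020)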

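7. Adding Custom Models

For models not on the official list, there are two integration paths:

LangChain bridge: wrap any LangChain BaseChatModel with ChatLangchain; see examples/models/langchain/example.py. Source: browser_use/llm/README.md.
Implement BaseChatModel: subclass it and implement ainvoke(); once it is exported from browser_use/llm/__init__.py, the Agent can use it directly.

After integration, place the example under examples/integrations/<provider>/ and register it in the "Community integrations" list in the README. Source: examples/integrations/README.md.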

Browser Session, DOM, and Watchdogs

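Overview

browser-use is a browser automation library built on CDP (Chrome DevTools Protocol). Its core consists of three cooperating layers: the Browser Session handles low-level communication with the Chrome instance, the DOM extraction layer converts web pages into structured information an LLM can consume, and Watchdogs continuously maintain state, health, and security policy within the action loop. Together they support the closed loop "language model → browser action → screenshot/DOM → next decision".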

flowchart LR
    A[Agent / LLM] -->|outputs action| B[Action Loop]
    B --> C[BrowserSession<br/>CDP communication]
    C -->|DOM / screenshot| D[DOM extraction layer<br/>extract_clean_markdown]
    D -->|browser_state + vision| A
    C --> E[Watchdogs<br/>default actions / guards]
    E --> C

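Browser Session

Browser Actor is a low-level abstraction over CDP. The code exposes Browser as the public alias of BrowserSession. Users can start a browser with Browser() and directly await browser.new_page("https://..."), or use objects such as Page / Element / Mouse for fine-grained control. Source: browser_use/actor/README.md:1-30

The session layer supports a full set of tab and navigation primitives: new_page, close_page, get_pages, and get_current_page manage multiple tabs; page.goto, go_back, go_forward, and reload control the navigation lifecycle. Source: browser_use/actor/README.md:30-45

The element layer offers three lookup methods: batch lookup by CSS selector (get_elements_by_css_selector), direct lookup by backend node ID (get_element(backend_node_id=...)), and LLM-assisted semantic lookup (get_element_by_prompt / must_get_element_by_prompt). Source: browser_use/actor/README.md:45-65

The Agent can reuse the same session directly. For example, ai_step internally calls self.browser_session.take_screenshot(full_page=False) to obtain a screenshot, base64-encodes it, and passes it to the model together with the extracted Markdown. Source: browser_use/agent/service.py:1-50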

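DOM Extraction and Page State

The DOM layer serializes the real browser-side state into part of the LLM prompt. Agent._run_ai_step calls extract_clean_markdown, which reports three statistics (raw HTML character count, initial Markdown character count, and filtered Markdown character count). These are joined into stats_summary and fed to the model so that it can gauge the information density before and after filtering. Source: browser_use/agent/service.py:1-50

When an action includes extract or read_file, the result is injected into the prompt in the next round through <read_state>. The extraction model itself is instructed to use only information present on the page, never to fabricate, and to state clearly when the query cannot be answered, and it responds in a compact, non-conversational form. Source: browser_use/agent/prompts.py:1-25

<browser_state> carries the URL, the tab list, the interactive elements numbered as [index], and the visible content. Only numbered elements are valid action targets, and newly added elements are marked with *. Source: browser_use/agent/system_prompts/system_prompt_anthropic_flash.md:1-30

<browser_vision> provides a screenshot with bounding boxes. It is explicitly labeled "GROUND TRUTH" and is the final basis for judging whether an action succeeded. Source: browser_use/agent/system_prompts/system_prompt.md:1-30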

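Watchdog Mechanism and Execution Loop

In the sources cited on this page, Watchdogs do not appear as a standalone set of files. They are described as constraints, recovery strategies, and action containers embedded in the Agent loop and carried by the system prompt templates for each LLM family. The core rules are:

Maximum actions per step: controlled by max_actions. In a multi-action chain, before a "consequential" action such as clicking a submit button or pressing Enter in a form, the preceding state change must be confirmed first. Source: browser_use/agent/system_prompts/system_prompt_flash.md:1-25
Error recovery priority: treat the screenshot as ground truth first; close popups or overlays when they appear; scroll and retry when an element is not found; switch strategy after 2–3 occurrences of the same error; use an alternative site on login walls or 403 responses; CAPTCHAs are handled automatically by the browser and should not be attempted. Source: browser_use/agent/system_prompts/system_prompt_anthropic_flash.md:25-45
Reasoning constraints: the model must explicitly judge the previous step as success, failure, or uncertain, verifying primarily against the screenshot and secondarily against <browser_state>. The optimistic assumption that an action succeeded because it appears in the history is prohibited. Source: browser_use/agent/system_prompts/system_prompt.md:1-30
Multi-step task planning: long tasks of more than 10 steps must track progress with a todo.md checklist and update completed items with replace_file_str. Source: browser_use/agent/system_prompts/system_prompt_anthropic_flash.md:1-30

Starting with v0.12.8, the CLI tightens its security policy further: evaluate() is refused under restricted browser profiles, and the Unix socket file is restricted to owner-only access. From the Watchdog perspective, these are concrete safety guardrails. Source: examples/integrations/README.md:1-20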

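Community Troubleshooting Quick Reference

Symptom	Trigger	Suggested handling
Blank Chromium page on the first step (#1020)	DOM readiness was not awaited after launch	Explicitly `wait` first, or verify with a screenshot, in the system prompt
`evaluate()` refused	Restricted profile	Switch to an unrestricted profile, or use `click` / `input` instead
Missing hover action (#4964)	Menus or tooltips triggered by CSS `:hover`	Currently requires dispatching events through `evaluate`; a dedicated community request exists
Azure OpenAI misclassifies prompts as jailbreak (#4783)	content_filter / ResponsibleAIPolicy too strict	Adjust the filtering policy on the deployment side, or change the LLM provider
Wrong model import path (#4755)	Documentation examples reference symbols that do not exist	Prefer `ChatBrowserUse()`, then choose from the official supported-models list as needed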

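See Also

examples/apps/news-use/README.md: end-to-end news monitoring example that ties together BrowserSession, DOM extraction, and the watchdog loop
examples/integrations/README.md: directory conventions for third-party integrations and notes on safety guardrails
browser_use/actor/README.md: Browser Actor / CDP low-level API reference

Source: https://github.com/browser-use/browser-use / Project Manual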

Agent Loop, Tools, and Customization

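Overview

The core of Browser-Use is an iterative Agent loop: the LLM observes the browser state, reasons about the next step, calls a tool to act, then observes and reasons again, repeating until the user task is complete. The Agent class is the entry point of this loop. In browser_use/agent/service.py it registers the LLM, Browser, Tools, and Settings, and in each round it builds messages, calls the LLM, parses the output, and executes actions. System prompts are managed as templates through browser_use/agent/system_prompts/__init__.py, and the tool set supports custom actions registered through Tools. Together, prompt templates, tool registration, and message management determine how customizable the loop is.

Sources: browser_use/agent/service.py:1-50, browser_use/agent/system_prompts/__init__.py:1-3.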

Agent Loop Core Flow

The diagram below shows how the modules cooperate within a single loop step:

flowchart LR
  A[User Task] --> B[MessageManager]
  B --> C[LLM call]
  C --> D{Parse output}
  D -->|action| E[Tools / Browser]
  E --> F[Execution result]
  F --> G[BrowserState + Screenshot]
  G --> B
  D -->|done| H[Return final result]

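Key design points of the loop:

State aggregation: MessageManager combines <file_system>, <todo_contents>, <read_state>, <browser_state>, <browser_vision>, and other parts into a single LLM input. Source: browser_use/agent/message_manager/service.py:1-50.
History persistence: AgentHistoryList maintains history records with metadata, including each step's duration_seconds, results, and model output, and supports saving to JSON and generating GIF replays. Sources: browser_use/agent/views.py:1-30, browser_use/agent/gif.py:1-20.
Message compaction: when the context grows too long, MessageManager calls the compaction LLM to generate a summary, with the explicit requirement to mark only explicitly confirmed steps as complete. Source: browser_use/agent/message_manager/service.py:1-40.

Common "stuck loop" problems in the community (such as the blank page symptom described in Issue #1020) are usually related to the loop_detection_window and loop_detection_enabled settings. Developers can adjust these two parameters together with planning_replan_on_stall to break out of a stall.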
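
One way to picture a detection window is a sliding buffer of recent action signatures. The sketch below is a generic illustration and does not reproduce browser-use's detector: LoopDetector and its threshold parameter are hypothetical, and only the idea of a window size is borrowed from loop_detection_window.

```python
from collections import Counter, deque


class LoopDetector:
    """Flags an action that repeats too often within the last `window` steps."""

    def __init__(self, window: int = 5, threshold: int = 3):
        self.recent = deque(maxlen=window)  # oldest entries fall out automatically
        self.threshold = threshold

    def record(self, action_signature: str) -> bool:
        """Store the action and return True if it now looks like a loop."""
        self.recent.append(action_signature)
        return Counter(self.recent)[action_signature] >= self.threshold


detector = LoopDetector(window=4, threshold=3)
flags = [detector.record(a) for a in
         ["click:1", "click:1", "scroll", "click:1", "scroll", "scroll"]]
```

A larger window catches slower cycles at the cost of more false positives on pages that legitimately need repeated scrolling, which is why the window is worth tuning per task.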

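System Prompts and LLM Adaptation

The browser_use/agent/system_prompts/ directory provides several prompt templates, each aimed at different LLMs:

Template file	Use case
`system_prompt.md`	Default full-featured version, with complete reasoning_rules and todo_examples
`system_prompt_no_thinking.md`	Models with the explicit thinking block turned off
`system_prompt_flash.md` / `system_prompt_flash_anthropic.md`	Trimmed versions for the Flash series
`system_prompt_anthropic_flash.md`	Anthropic Flash mode, emphasizing the screenshot as ground truth

The templates consistently use structured tags such as <user_request> / <browser_state> / <file_system> / <action_rules> / <output>, so the same business code can adapt to multiple LLMs. Sources: browser_use/agent/system_prompts/system_prompt.md:1-40, system_prompt_flash.md:1-30, system_prompt_anthropic_flash.md:1-30.

For models without native tool_call support, the prompt asks the model to output JSON of the form {memory, evaluation_previous_goal, next_goal, action}; for models with native tool calling, the schema of the AgentOutput tool is used. Source: browser_use/agent/system_prompts/system_prompt_flash.md:1-20.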

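Tool Registration and Custom Actions

Tools is the standard entry point for extending Agent capabilities. Custom actions are registered through a decorator: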

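from browser_use import Agent, Browser, ChatBrowserUse, Tools

tools = Tools()

# Register a custom action; the description tells the LLM when to call it.
@tools.action(description='Description of what this tool does.')
def custom_tool(param: str) -> str:
    return f"Result: {param}"

llm = ChatBrowserUse()  # any supported chat model can be used here
browser = Browser()
agent = Agent(task="Your task", llm=llm, browser=browser, tools=tools)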

Source: README.md:1-30.
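
To show what a decorator-based registry does in general, here is a small stand-alone sketch. MiniTools is a hypothetical class written for this illustration and is not how browser-use implements Tools; it only demonstrates the pattern of recording a function, its description, and its parameter names at decoration time.

```python
import inspect


class MiniTools:
    """Toy action registry keyed by function name."""

    def __init__(self):
        self.registry = {}

    def action(self, description: str):
        def decorator(func):
            self.registry[func.__name__] = {
                "description": description,  # shown to the model when choosing
                "params": list(inspect.signature(func).parameters),
                "func": func,
            }
            return func  # the function stays directly callable
        return decorator

    def call(self, name: str, **kwargs):
        """Dispatch a structured action (name plus arguments) to its function."""
        return self.registry[name]["func"](**kwargs)


mini_tools = MiniTools()


@mini_tools.action(description="Echo a value back.")
def custom_tool(param: str) -> str:
    return f"Result: {param}"
```

With this in place, mini_tools.call("custom_tool", param="x") dispatches by name, which is the step a structured action from the model would go through.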

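Each built-in action (such as navigate / click / input / scroll / extract / screenshot / switch_tab / go_back / done / write_file / read_file / replace_file_str) has an explicit schema, and the template requires the model to verify the result of the previous step against the screenshot before calling one. Source: browser_use/agent/system_prompts/system_prompt_anthropic_flash.md:1-30.

Security restriction (from the 0.12.8 release notes): evaluate() is refused on restricted browser profiles, a safeguard that prevents the model from bypassing the sandbox with arbitrary JS. Community Issue #4964 proposes a new hover action to trigger CSS hover behavior; with the current implementation, hover has to be simulated through evaluate() with dispatchEvent.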

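LLM Providers and Best Practices

Aggregators such as OpenRouter: the documentation once contained examples where the model import does not exist (Issue #4755); use explicit wrapper classes such as ChatOpenRouter instead and check the latest documentation.
Azure OpenAI (Issue #4783): content moderation can produce false positives. Separately, the prompt requires the model to treat the screenshot as ground truth rather than overclaim; the corresponding rule is in system_prompt_anthropic_flash.md.
Codex CLI (Issue #4895): CLI 2.0 (v0.12.3) replaces Playwright with direct CDP and provides low-latency command execution of about 50 ms. Source: README.md:1-30.
Example projects: examples/apps/news-use/README.md shows news monitoring built with Gemini + Agent; examples/integrations/README.md gives the directory conventions for third-party integrations.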

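Common Failure Modes

Symptom	Trigger	Tuning suggestion
Loop stuck on a blank page	Screenshot not updated / element not rendered	Increase `max_clickable_elements_length`, lower `step_timeout`
Task wrongly judged complete	Model overclaims	Enable `use_judge` and `ground_truth`
Token limit exceeded	History growth in long tasks	Enable `message_compaction`, reduce `summary_max_chars`
Content blocked by Azure in error	The LLM prompt triggers the safety policy	Use safer wording in the style of `system_prompt_anthropic_flash.md`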

See Also

Agent Service and Settings configuration
Browser session and Profile management
DOM extraction and BrowserState serialization
Community and Issue tracking

Sources: browser_use/agent/service.py:1-50, browser_use/agent/system_prompts/__init__.py:1-3.

CLI 2.0, Skills, and Coding-Agent Integration

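CLI 2.0 (introduced in version 0.12.3) is browser-use's new command-line entry point for AI coding agents (Claude Code, Codex, and others). Its core design goal is to provide the "fastest" browser automation without relying on a Playwright middle layer.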

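Overview and Design Goals

According to the 0.12.3 release notes, CLI 2.0 is built directly on CDP (Chrome DevTools Protocol) and communicates with a persistent background daemon, achieving about 50 ms command latency. It claims roughly a 2x speedup over the old implementation and about 50% lower token consumption. Source: releases/tag/0.12.3.

The CLI 2.0 implementation is organized in the browser_use/skill_cli/ subpackage, which contains the entry module, the daemon, the browser session, and a commands subdirectory. Sources: browser_use/skill_cli/main.py, browser_use/skill_cli/daemon.py.

Architecture and Components

The CLI 2.0 runtime has two levels: the upper level is the skill_cli entry point and command set invoked by terminal users or coding agents, and the lower level is a resident background daemon. The two communicate over a local socket (Unix socket). Sources: browser_use/skill_cli/README.md, browser_use/skill_cli/daemon.py.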

flowchart LR
  A[CLI coding agent<br/>Claude Code / Codex] --> B[skill_cli/main.py<br/>command dispatch]
  B --> C[skill_cli/commands/browser.py<br/>browser subcommands]
  B --> D[skill_cli/browser.py<br/>session wrapper]
  D <-->|local socket| E[skill_cli/daemon.py<br/>persistent daemon]
  E -->|direct CDP| F[Chromium browser]

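Key Components

Component	Responsibility	Source
`skill_cli/main.py`	CLI entry point; parses subcommands and forwards them	main.py
`skill_cli/daemon.py`	Resident background process; owns the browser and serves CDP requests	daemon.py
`skill_cli/browser.py`	Browser session wrapper; exposes a high-level API	browser.py
`skill_cli/commands/browser.py`	Implementation of the `browser` subcommands (such as open, close, screenshot)	commands/browser.py
`browser_use/cli.py`	A separate CLI path that coexists with the traditional `Agent` entry point	cli.py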

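Known Changes and Security Fixes

CLI 2.0 continues to evolve across the 0.12.x series. Notable changes:

0.12.7: #4514 introduced another large CLI upgrade (@ShawnPana), and #4590 then fixed security and correctness issues found in review (@sauravpanda). Source: releases/tag/0.12.7.
0.12.8: #4870 fixed the access permissions of the daemon's Unix socket file so that only the owner can access it, preventing unauthorized calls in multi-user environments on the same machine. Source: releases/tag/0.12.8.
0.12.9: #4920 fixed an issue where the agent took repeated screenshots in new-tab scenarios and inflated the context, which matters especially for CLI agents. Source: releases/tag/0.12.9.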
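
The owner-only socket restriction from 0.12.8 can be illustrated with the standard library. This is a generic POSIX sketch, not the daemon's code: open_owner_only_socket is a hypothetical helper, and the combination of umask and chmod is one common way to reach mode 0600.

```python
import os
import socket
import stat
import tempfile


def open_owner_only_socket(path: str) -> socket.socket:
    """Bind a Unix socket whose file is readable and writable by the owner only."""
    server = socket.socket(socket.AF_UNIX, socket.SOCK_STREAM)
    old_umask = os.umask(0o177)  # mask group/other bits before the file exists
    try:
        server.bind(path)        # creates the socket file on disk
    finally:
        os.umask(old_umask)      # always restore the process umask
    os.chmod(path, 0o600)        # explicit, in case the umask was not honored
    server.listen(1)
    return server


socket_dir = tempfile.mkdtemp()
socket_path = os.path.join(socket_dir, "daemon.sock")
server = open_owner_only_socket(socket_path)
socket_mode = stat.S_IMODE(os.stat(socket_path).st_mode)
server.close()
os.unlink(socket_path)
os.rmdir(socket_dir)
```

Setting the umask before bind closes the short window in which the socket file would otherwise exist with broader permissions.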

Some community requests also ask for the CLI to work seamlessly with codex-cli rather than only in API key mode. Source: issue #4895.

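Usage Patterns and Community Feedback

The typical use of CLI 2.0 is driving a browser directly from a coding agent: a Claude Code or Codex process calls the subcommands exposed by skill_cli, and the daemon reuses the same browser instance to avoid repeated cold starts. The operations available under the browser subcommand are defined in commands/browser.py; combined with the session wrapper they cover navigation, clicking, typing, screenshots, and similar actions. Sources: browser_use/skill_cli/commands/browser.py, browser_use/skill_cli/browser.py.

The community has reported that on game-like pages (such as Gold Miner) the CLI can open the page but struggles to judge dynamic states such as when to release the hook, which shows that decision-making in CLI mode is still limited by the visual understanding of the underlying LLM. Source: issue #4939. For dropdowns and tooltip components triggered by :hover, the community has requested a dedicated hover action; the current workaround is to inject JS events through evaluate(). Source: issue #4964.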

Cloud, Deployment, and Production

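Browser-Use is a Python-based browser automation agent framework that can run locally as an open-source library and is also offered as a hosted Cloud API service. This page focuses on Cloud integration and production deployment patterns and best practices, covering the programming interface for cloud browser sessions, the CLI tools, authentication, and operating recommendations for production environments.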

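Overview

According to README.md, Browser-Use is open source under the MIT license. Users can choose LLM providers such as OpenAI, Google, or ChatBrowserUse, or run local models through Ollama and similar tools; the project also offers an official Cloud service for hosting browser sessions and agent execution.

Architecture Overview

The Browser-Use Cloud deployment model consists of three layers: the local SDK layer, the Cloud API gateway layer, and the remote browser runtime maintained by Cloud.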

flowchart LR
    A[Local app/script] -->|HTTPS API call| B[Browser-Use Cloud]
    B -->|session routing| C[Remote browser instance]
    B -->|LLM call| D[LLM provider]
    C -->|page state/screenshot| B
    B -->|results returned| A
    A -->|CLI: browser-use cloud| E[CLI tool]
    E --> B

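As described in examples/cloud/README.md, the Cloud API targets production scenarios that need scalable, hosted browser automation, and provides production-grade features such as timeout control, retry logic, and status code validation.

Cloud API and Example Code

The examples/cloud/ directory contains a set of numbered runnable examples covering scenarios from basic tasks to complex integrations. Source: examples/cloud/README.md

Example file	Purpose
`01_basic_task.py`	Basic task execution example; recommended starting point
`02_*.py` and higher numbers	Advanced patterns such as authentication, callbacks, Webhooks, and file handling

All examples include a built-in 30-second timeout and retry mechanism, and use environment variables to manage keys and domain restriction settings. In production, drive them through CLI arguments rather than interactive prompts to simplify CI/CD integration.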
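
The timeout-plus-retry pattern mentioned here can be sketched with asyncio alone. The code below does not call the Cloud API: run_with_timeout_and_retry and slow_then_fast are hypothetical, the 30-second default mirrors the figure quoted above, and the set of retried exceptions is an assumption.

```python
import asyncio


async def run_with_timeout_and_retry(make_call, timeout_s=30.0,
                                     max_attempts=3, backoff_s=0.0):
    """Await make_call(attempt) with a per-attempt timeout, retrying on failure."""
    last_error = None
    for attempt in range(1, max_attempts + 1):
        try:
            return await asyncio.wait_for(make_call(attempt), timeout=timeout_s)
        except (asyncio.TimeoutError, ConnectionError) as exc:
            last_error = exc
            # Linear backoff between attempts; zero by default for quick demos.
            await asyncio.sleep(backoff_s * attempt)
    raise RuntimeError(f"all {max_attempts} attempts failed") from last_error


async def slow_then_fast(attempt):
    """Stub request: the first attempt hangs, later attempts answer at once."""
    if attempt == 1:
        await asyncio.sleep(0.2)
    return f"ok on attempt {attempt}"
```

Running asyncio.run(run_with_timeout_and_retry(slow_then_fast, timeout_s=0.05)) times out the first attempt and succeeds on the second. In a real client the key would come from an environment variable, as the examples recommend.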

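CLI Tools and Cloud Commands

The browser_use/skill_cli/commands/cloud.py module provides the command-line tooling for interacting with the Cloud service. Internal helper functions such as _example_value automatically generate request body examples from the OpenAPI schema, helping users quickly build valid request payloads before calling the Cloud API. Source: browser_use/skill_cli/commands/cloud.py

The CLI tool is one of the core components of Browser Use CLI 2.0, which is built on direct CDP (Chrome DevTools Protocol) rather than Playwright and, together with a persistent background daemon, achieves about 50 ms command latency. It can work alongside CLI agents such as Claude Code and Codex.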

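Integrations and Custom Tools

For scenarios that need to extend Agent capabilities, Browser-Use offers two main extension paths:

Third-party service integrations: place small runnable examples under examples/integrations/<provider>/, and use browser_use/integrations/<provider>/ only when the integration ships with the package and has tests. Source: examples/integrations/README.md
Custom tool functions: use the Tools registry to add custom actions that are independent of the LLM: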

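from browser_use import Agent, Browser, ChatBrowserUse, Tools

tools = Tools()

# Register a custom action; the description tells the LLM when to call it.
@tools.action(description='Description of what this tool does.')
def custom_tool(param: str) -> str:
    return f"Result: {param}"

llm = ChatBrowserUse()  # any supported chat model can be used here
browser = Browser()
agent = Agent(task="Your task", llm=llm, browser=browser, tools=tools)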

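Source: README.md

Production Best Practices

The table below summarizes production recommendations drawn from several source files:

Category	Recommendation	Source
Timeouts and retries	All examples include a built-in 30-second timeout and retry logic	examples/cloud/README.md
Secret management	Use environment variables; never commit tokens or credentials	examples/integrations/README.md
LLM choice	Prefer `ChatBrowserUse()` unless the example is specific to another model	examples/integrations/README.md
Browser configuration	`evaluate()` is refused on restricted browser profiles	0.12.8 release notes
CLI daemon	The Unix socket file is restricted to owner-only access	0.12.8 release notes
Dependency security	`litellm` has been removed from the core dependencies and must be installed separately	0.12.5 release notes
Async programming	The Cloud client uses the `async/await` pattern and must be run with `asyncio.run`	browser_use/llm/oci_raw/README.md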

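Deployment Scenario Example: News-Use Monitoring

examples/apps/news-use/ provides a complete application example aimed at production scenarios, showing how to run a Browser-Use Agent persistently. The application supports:

Periodic scraping of news sites (5-minute interval by default, customizable with --interval)
Extraction of structured fields such as title, URL, publication time, and content
Long and short summary generation and sentiment analysis
Deduplication that persists across restarts

Source: examples/apps/news-use/README.md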
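
Deduplication that survives restarts usually comes down to persisting a set of seen keys. The sketch below shows one simple way to do that and is not taken from news_monitor.py: SeenStore, its JSON file format, and the URL-plus-title key are all assumptions made for illustration.

```python
import hashlib
import json
from pathlib import Path


class SeenStore:
    """Remembers article keys in a JSON file so restarts do not re-report them."""

    def __init__(self, path):
        self.path = Path(path)
        # Load previously seen keys if the file exists from an earlier run.
        self.seen = set(json.loads(self.path.read_text())) if self.path.exists() else set()

    @staticmethod
    def key(url: str, title: str) -> str:
        return hashlib.sha256(f"{url}|{title}".encode("utf-8")).hexdigest()

    def add_if_new(self, url: str, title: str) -> bool:
        """Return True and persist the key if the article was not seen before."""
        k = self.key(url, title)
        if k in self.seen:
            return False
        self.seen.add(k)
        self.path.write_text(json.dumps(sorted(self.seen)))
        return True
```

A polling loop would call add_if_new for each extracted article and summarize only those that return True, then sleep for the configured interval.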

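This example shows that Browser-Use is suitable not only for one-off tasks but also for building long-running monitoring and data collection services.

Common Issues and Limitations

Based on community feedback, watch for the following kinds of issues when deploying Browser-Use to production:

Azure OpenAI content filtering: in some navigation or login prompt scenarios, Azure OpenAI may raise ResponsibleAIPolicyViolation and interrupt normal Agent execution.
CLI integration: some users want to call browser-use through codex-cli rather than with an API key; the request is being tracked.
HTTPS and downloads: behind a corporate proxy or on a restricted network, make sure outbound HTTPS connections and browser downloads work.

For a deeper look at the Agent's internal prompts and reasoning mechanisms, see the pages on system prompts.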

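Failure Modes and Pitfall Diary

This section preserves the project-specific risks that Doramagic accumulated during discovery, validation, and compilation, so that community discussion is treated as more than decoration.

high Source evidence: Great project! I tried playing Gold Miner via browser-harness in Codex. It can successfully open the webpage and load t…

May increase the cost of trials and production onboarding for new users.

high Source evidence: Bug: ...Azure OpenAI false content_filter / ResponsibleAIPolicyViolation (jailbreak detected) on normal browser-use nav…

May block installation or the first run.

high Source evidence: Documentation: some model import does not exist at all.

May increase the cost of trials and production onboarding for new users.

medium Failure mode: installation: 0.12.0

Upgrade or migration may change expected behavior: 0.12.0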

Pitfall Log

Project: browser-use/browser-use

Summary: 26 potential pitfalls found, 3 of them high/blocking; top priority: Installation pitfall - Source evidence: Great project! I tried playing Gold Miner via browser-harness in Codex. It can successfully open the webpage and load t….

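1. Installation pitfall · Source evidence: Great project! I tried playing Gold Miner via browser-harness in Codex. It can successfully open the webpage and load t…

Severity: high
Evidence strength: source_linked
Finding: GitHub community evidence shows an unverified installation-related issue in this project: Great project! I tried playing Gold Miner via browser-harness in Codex. It can successfully open the webpage and load the game, but fails to determine when to…
User impact: May increase the cost of trials and production onboarding for new users.
Evidence: community_evidence:github | https://github.com/browser-use/browser-use/issues/4939 | The source discussion mentions Python-related conditions; re-check before installing or trying.

2. Configuration pitfall · Source evidence: Bug: ...Azure OpenAI false content_filter / ResponsibleAIPolicyViolation (jailbreak detected) on normal browser-use nav…

Severity: high
Evidence strength: source_linked
Finding: GitHub community evidence shows an unverified configuration-related issue in this project: Bug: ...Azure OpenAI false content_filter / ResponsibleAIPolicyViolation (jailbreak detected) on normal browser-use navigation/login prompts
User impact: May block installation or the first run.
Evidence: community_evidence:github | https://github.com/browser-use/browser-use/issues/4783 | The source discussion mentions Python-related conditions; re-check before installing or trying.

3. Capability pitfall · Source evidence: Documentation: some model import does not exist at all.

Severity: high
Evidence strength: source_linked
Finding: GitHub community evidence shows an unverified capability-understanding issue in this project: Documentation: some model import does not exist at all.
User impact: May increase the cost of trials and production onboarding for new users.
Evidence: community_evidence:github | https://github.com/browser-use/browser-use/issues/4755 | The source discussion mentions Python-related conditions; re-check before installing or trying.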

4. Installation pitfall · Failure mode: installation: 0.12.0

Severity: medium
Evidence strength: source_linked
Finding: Developers should check this installation risk before relying on the project: 0.12.0
User impact: Upgrade or migration may change expected behavior: 0.12.0
Evidence: failure_mode_cluster:github_release | https://github.com/browser-use/browser-use/releases/tag/0.12.0 | 0.12.0

5. Installation pitfall · Failure mode: installation: 0.12.3 - Browser Use CLI 2.0

Severity: medium
Evidence strength: source_linked
Finding: Developers should check this installation risk before relying on the project: 0.12.3 - Browser Use CLI 2.0
User impact: Upgrade or migration may change expected behavior: 0.12.3 - Browser Use CLI 2.0
Evidence: failure_mode_cluster:github_release | https://github.com/browser-use/browser-use/releases/tag/0.12.3 | 0.12.3 - Browser Use CLI 2.0

6. Installation pitfall · Failure mode: installation: 0.12.4

Severity: medium
Evidence strength: source_linked
Finding: Developers should check this installation risk before relying on the project: 0.12.4
User impact: Upgrade or migration may change expected behavior: 0.12.4
Evidence: failure_mode_cluster:github_release | https://github.com/browser-use/browser-use/releases/tag/0.12.4 | 0.12.4

7. Installation pitfall · Failure mode: installation: 0.12.5

Severity: medium
Evidence strength: source_linked
Finding: Developers should check this installation risk before relying on the project: 0.12.5
User impact: Upgrade or migration may change expected behavior: 0.12.5
Evidence: failure_mode_cluster:github_release | https://github.com/browser-use/browser-use/releases/tag/0.12.5 | 0.12.5

8. Installation pitfall · Failure mode: installation: 0.12.6

Severity: medium
Evidence strength: source_linked
Finding: Developers should check this installation risk before relying on the project: 0.12.6
User impact: Upgrade or migration may change expected behavior: 0.12.6
Evidence: failure_mode_cluster:github_release | https://github.com/browser-use/browser-use/releases/tag/0.12.6 | 0.12.6

9. Installation pitfall · Failure mode: installation: Great project! I tried playing Gold Miner via browser-harness in Codex. It can successfully o...

Severity: medium
Evidence strength: source_linked
Finding: Developers should check this installation risk before relying on the project: Great project! I tried playing Gold Miner via browser-harness in Codex. It can successfully open the webpage and load the game, but fails to determine when to aim the claw and r...
User impact: Developers may fail before the first successful local run: Great project! I tried playing Gold Miner via browser-harness in Codex. It can successfully open the webpage and load the game, but fails to determine when to aim the claw and r...
Evidence: failure_mode_cluster:github_issue | https://github.com/browser-use/browser-use/issues/4939 | Great project! I tried playing Gold Miner via browser-harness in Codex. It can successfully open the webpage and load the game, but fails to determine when to aim the claw and r...

10. Configuration pitfall · Failure mode: configuration: 0.12.1

Severity: medium
Evidence strength: source_linked
Finding: Developers should check this configuration risk before relying on the project: 0.12.1
User impact: Upgrade or migration may change expected behavior: 0.12.1
Evidence: failure_mode_cluster:github_release | https://github.com/browser-use/browser-use/releases/tag/0.12.1 | 0.12.1

11. Configuration pitfall · Failure mode: configuration: 0.12.7

Severity: medium
Evidence strength: source_linked
Finding: Developers should check this configuration risk before relying on the project: 0.12.7
User impact: Upgrade or migration may change expected behavior: 0.12.7
Evidence: failure_mode_cluster:github_release | https://github.com/browser-use/browser-use/releases/tag/0.12.7 | 0.12.7

12. Configuration pitfall · Failure mode: configuration: Bug: ...Azure OpenAI false content_filter / ResponsibleAIPolicyViolation (jailbreak detected)...

Severity: medium
Evidence strength: source_linked
Finding: Developers should check this configuration risk before relying on the project: Bug: ...Azure OpenAI false content_filter / ResponsibleAIPolicyViolation (jailbreak detected) on normal browser-use navigation/login prompts
User impact: Developers may misconfigure credentials, environment, or host setup: Bug: ...Azure OpenAI false content_filter / ResponsibleAIPolicyViolation (jailbreak detected) on normal browser-use navigation/login prompts
Evidence: failure_mode_cluster:github_issue | https://github.com/browser-use/browser-use/issues/4783 | Bug: ...Azure OpenAI false content_filter / ResponsibleAIPolicyViolation (jailbreak detected) on normal browser-use navigation/login prompts

13. Configuration pitfall · Failure mode: configuration: Feature Request: ...

Severity: medium
Evidence strength: source_linked
Finding: Developers should check this configuration risk before relying on the project: Feature Request: ...
User impact: Developers may misconfigure credentials, environment, or host setup: Feature Request: ...
Evidence: failure_mode_cluster:github_issue | https://github.com/browser-use/browser-use/issues/4895 | Feature Request: ...

14. Capability pitfall · Capability assessment relies on assumptions

Severity: medium
Evidence strength: source_linked
Finding: README/documentation is current enough for a first validation pass.
User impact: If the assumption does not hold, users will not get the promised capability.
Evidence: capability.assumptions | github_repo:881458615 | https://github.com/browser-use/browser-use | README/documentation is current enough for a first validation pass.

15. Runtime pitfall · Failure mode: runtime: Feature Request: Add hover action for triggering CSS :hover dropdowns, tooltips, and hover-re...

Severity: medium
Evidence strength: source_linked
Finding: Developers should check this runtime risk before relying on the project: Feature Request: Add hover action for triggering CSS :hover dropdowns, tooltips, and hover-reveal patterns
User impact: Developers may hit a documented source-backed failure mode: Feature Request: Add hover action for triggering CSS :hover dropdowns, tooltips, and hover-reveal patterns
Evidence: failure_mode_cluster:github_issue | https://github.com/browser-use/browser-use/issues/4964 | Feature Request: Add hover action for triggering CSS :hover dropdowns, tooltips, and hover-reveal patterns

16. Maintenance pitfall · Maintenance activity unknown

Severity: medium
Evidence strength: source_linked
Finding: last_activity_observed is not recorded.
User impact: New, abandoned, and active projects get mixed together, lowering confidence in the recommendation.
Evidence: evidence.maintainer_signals | github_repo:881458615 | https://github.com/browser-use/browser-use | last_activity_observed missing

Severity: medium
Evidence strength: source_linked
Finding: no_demo
Evidence: downstream_validation.risk_items | github_repo:881458615 | https://github.com/browser-use/browser-use | no_demo; severity=medium

18. Security/permissions pitfall · Scoring risk present

Severity: medium
Evidence strength: source_linked
Finding: no_demo
User impact: The risk affects whether the project is suitable for ordinary users to install.
Evidence: risks.scoring_risks | github_repo:881458615 | https://github.com/browser-use/browser-use | no_demo; severity=medium

19. Security/permissions pitfall · Source evidence: Feature Request: ...

Severity: medium
Evidence strength: source_linked
Finding: GitHub community evidence shows an unverified security/permissions-related issue in this project: Feature Request: ...
User impact: May affect authorization, key configuration, or security boundaries.
Evidence: community_evidence:github | https://github.com/browser-use/browser-use/issues/4895 | The source discussion mentions API key-related conditions; re-check before installing or trying.

20. Security/permissions pitfall · Source evidence: Feature Request: Add hover action for triggering CSS :hover dropdowns, tooltips, and hover-reveal patterns

Severity: medium
Evidence strength: source_linked
Finding: GitHub community evidence shows an unverified security/permissions-related issue in this project: Feature Request: Add hover action for triggering CSS :hover dropdowns, tooltips, and hover-reveal patterns
User impact: May affect authorization, key configuration, or security boundaries.
Evidence: community_evidence:github | https://github.com/browser-use/browser-use/issues/4964 | Unverified usage condition surfaced by source type github_issue.

21. Capability pitfall · Failure mode: conceptual: Documentation: some model import does not exist at all.

Severity: low
Evidence strength: source_linked
Finding: Developers should check this conceptual risk before relying on the project: Documentation: some model import does not exist at all.
User impact: Developers may hit a documented source-backed failure mode: Documentation: some model import does not exist at all.
Evidence: failure_mode_cluster:github_issue | https://github.com/browser-use/browser-use/issues/4755 | Documentation: some model import does not exist at all.

22. Maintenance pitfall · Issue/PR response quality unknown

Severity: low
Evidence strength: source_linked
Finding: issue_or_pr_quality=unknown.
User impact: Users cannot tell whether anyone will provide maintenance when they hit a problem.
Evidence: evidence.maintainer_signals | github_repo:881458615 | https://github.com/browser-use/browser-use | issue_or_pr_quality=unknown

23. Maintenance pitfall · Release cadence unclear

Severity: low
Evidence strength: source_linked
Finding: release_recency=unknown.
User impact: Installation commands and documentation may lag behind the code, raising the chance of pitfalls.
Evidence: evidence.maintainer_signals | github_repo:881458615 | https://github.com/browser-use/browser-use | release_recency=unknown

24. Maintenance pitfall · Failure mode: maintenance: 0.12.2

Severity: low
Evidence strength: source_linked
Finding: Developers should check this maintenance risk before relying on the project: 0.12.2
User impact: Upgrade or migration may change expected behavior: 0.12.2
Evidence: failure_mode_cluster:github_release | https://github.com/browser-use/browser-use/releases/tag/0.12.2 | 0.12.2

25. Maintenance pitfall · Failure mode: maintenance: 0.12.8

Severity: low
Evidence strength: source_linked
Finding: Developers should check this maintenance risk before relying on the project: 0.12.8
User impact: Upgrade or migration may change expected behavior: 0.12.8
Evidence: failure_mode_cluster:github_release | https://github.com/browser-use/browser-use/releases/tag/0.12.8 | 0.12.8

26. Maintenance pitfall · Failure mode: maintenance: 0.12.9

Severity: low
Evidence strength: source_linked
Finding: Developers should check this maintenance risk before relying on the project: 0.12.9
User impact: Upgrade or migration may change expected behavior: 0.12.9
Evidence: failure_mode_cluster:github_release | https://github.com/browser-use/browser-use/releases/tag/0.12.9 | 0.12.9

Source: Doramagic discovery, validation, and compilation records