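# https://github.com/Khamel83/argus Project Manual

Generated: 2026-07-05 11:21:56 UTC

## Contents

- [System Overview and Layered Architecture](#page-1)
- [MCP Protocol and Agent Usage Contract](#page-2)
- [12-Step Content Extraction and Retrieval Workflows](#page-3)
- [Multi-Egress Workers, Budgets, and Deployment Operations](#page-4)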

<a id='page-1'></a>

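## System Overview and Layered Architecture

### Related Pages

Related topics: [MCP Protocol and Agent Usage Contract](#page-2), [12-Step Content Extraction and Retrieval Workflows](#page-3), [Multi-Egress Workers, Budgets, and Deployment Operations](#page-4)

<details>
<summary>Relevant source files</summary>

The following source files were used to generate this page: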

- [README.md](https://github.com/Khamel83/argus/blob/main/README.md)
- [argus/config.py](https://github.com/Khamel83/argus/blob/main/argus/config.py)
- [argus/models.py](https://github.com/Khamel83/argus/blob/main/argus/models.py)
- [argus/broker/router.py](https://github.com/Khamel83/argus/blob/main/argus/broker/router.py)
- [argus/broker/pipeline.py](https://github.com/Khamel83/argus/blob/main/argus/broker/pipeline.py)
- [argus/api/main.py](https://github.com/Khamel83/argus/blob/main/argus/api/main.py)
- [argus/mcp/server.py](https://github.com/Khamel83/argus/blob/main/argus/mcp/server.py)
- [argus/providers/__init__.py](https://github.com/Khamel83/argus/blob/main/argus/providers/__init__.py)
</details>

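# System Overview and Layered Architecture

## Overview and Design Goals

Argus is a multi-layer search and extraction (retrieval) system for research-style queries. It exposes two transports, **MCP (Model Context Protocol) over stdio** and **HTTP/JSON**, to serve two typical callers: agent harnesses (Claude Code, Codex CLI, OpenCode, and others) and scripts. Source: [README.md:1-40](). Functionally, the system is organized into five stages: query routing → multi-provider dispatch → content extraction → completeness assessment → result packaging. These map to `broker/router.py`, `broker/pipeline.py`, the individual provider implementations, and the shared model data structures. Source: [argus/broker/router.py:1-80]().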

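## Layered Architecture

The architecture has four layers. The table below lists them from the entry point downward, with each layer's responsibility and representative modules:

| Layer | Responsibility | Representative modules |
| --- | --- | --- |
| Transport layer | Provides the MCP stdio and HTTP/JSON entry points and converts requests into a common format | `argus/api/main.py`, `argus/mcp/server.py` |
| Routing layer (Broker) | Decides which providers to use, which mode (grounding / research), and the budget ceiling | `argus/broker/router.py`, `argus/broker/pipeline.py` |
| Data layer (Models & Config) | Defines the Pydantic models for queries, results, and scores, plus runtime configuration | `argus/models.py`, `argus/config.py` |
| Resource layer (Providers & Corpus) | Makes the actual calls to search providers and writes retrieved content to the corpus | `argus/providers/*`, corpus storage directory (resolved by `platformdirs`) |

The layers are decoupled through **explicit dataclass / BaseModel objects**: the transport layer only translates external calls into a `SearchRequest`, the routing layer sees models rather than transports, and provider implementations in the resource layer register through a common interface. Sources: [argus/models.py:1-60](), [argus/config.py:1-80]().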

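## Data Flow and Query Modes

A typical query follows this path: the client submits a request over MCP or HTTP → the transport layer validates and wraps it → `broker/router.py` selects a provider chain based on the `mode` field → `broker/pipeline.py` runs the chain in order and assesses the completeness of intermediate results → the transport layer serializes the final `SearchResponse` back into a JSON-RPC or HTTP response.

The supported query modes are declared in the README and in `api/main.py`. The main ones are:

- `grounding`: targets low latency and low cost, and favors Tier 0 free providers (such as the WolframAlpha LLM API and Yahoo Search). Source: [README.md:60-120]().
- `research`: oriented toward in-depth research; it combines multiple providers and triggers retrieval workflows such as `recover-article`, `capture-site`, and `build-research-pack`. Source: [argus/broker/pipeline.py:1-120]().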
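The mode-driven routing described above can be sketched roughly as follows. This is a minimal illustration, not the code in `broker/router.py`: the provider identifiers, the `PROVIDER_CHAINS` table, and the shape of `SearchRequest` shown here are all assumptions made for the example.

```python
from dataclasses import dataclass

# Hypothetical provider identifiers; the real names and tiers live in the
# Argus configuration and are not reproduced here.
PROVIDER_CHAINS = {
    "grounding": ["wolframalpha", "yahoo"],
    "research": ["wolframalpha", "yahoo", "paid_provider_a", "paid_provider_b"],
}


@dataclass(frozen=True)
class SearchRequest:
    """Illustrative stand-in for the request model handed to the router."""
    query: str
    mode: str = "grounding"


def select_provider_chain(request: SearchRequest) -> list:
    """Return the ordered provider chain for the request's mode."""
    try:
        # Copy so callers cannot mutate the shared table.
        return list(PROVIDER_CHAINS[request.mode])
    except KeyError:
        raise ValueError(f"unknown mode: {request.mode!r}") from None
```

Keeping the mode-to-chain mapping in a plain table means the transport layer never needs to know which providers exist; it only fills in `mode`.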

Community issue #19, "Expose build-research-pack as MCP tool", tracks exposing the `build-research-pack` workflow as an MCP tool, so that agents can trigger it without shell escaping and pipe the result to Maya's `POST /ingest/file` endpoint. Source: [issues/19]().

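## External Interfaces: Contract Differences Between MCP and HTTP

Argus supports both MCP and HTTP transports. The governing rules are driven by community issue #18, "Add AGENTS.md: MCP + HTTP usage contract for agents", while issue #20 calls for a new README section, "Using Argus from MCP vs HTTP", that points to `docs/CONTEXT-CONTRACT.md` in `khamel83/maya` as the canonical transport policy. Sources: [issues/18](), [issues/20](). The v1.6.1 fix that moved Argus logging off stdout, so that the JSON-RPC handshake stays stable, is part of the same contract. Source: [v1.6.1 release notes]().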

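## Runtime Data and Deployment

Data written at runtime (corpus, cache, temporary corpora) is resolved through `platformdirs` to a directory writable by the current user, which keeps it out of the repository. Startup parameters, provider priority, timeouts, and quotas are centralized in `argus/config.py` and read by every layer through a shared `settings` object. Source: [argus/config.py:1-120](). The dashboard is currently planned to be exposed at `khamel.com/argus/` behind nginx + Authentik, and depends on the `ARGUS_ROOT_PATH` environment variable. Source: [issues/9]().

## Summary

Argus's layered design keeps three concerns strictly separate: how it is called, how work is scheduled, and how work is executed. This lets MCP agents, HTTP clients, and the future dashboard share the same core logic. Understanding the boundaries between the transport, routing, data, and resource layers is a prerequisite for reading `broker/pipeline.py` or adding a new provider implementation.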

---

<a id='page-2'></a>

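## MCP Protocol and Agent Usage Contract

### Related Pages

Related topics: [System Overview and Layered Architecture](#page-1), [12-Step Content Extraction and Retrieval Workflows](#page-3)

<details>
<summary>Relevant source files</summary>

The following source files were used to generate this page: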

- [argus/mcp/server.py](https://github.com/Khamel83/argus/blob/main/argus/mcp/server.py)
- [argus/mcp/tools.py](https://github.com/Khamel83/argus/blob/main/argus/mcp/tools.py)
- [argus/mcp/resources.py](https://github.com/Khamel83/argus/blob/main/argus/mcp/resources.py)
- [AGENTS.md](https://github.com/Khamel83/argus/blob/main/AGENTS.md)
- [docs/mcp-clients.md](https://github.com/Khamel83/argus/blob/main/docs/mcp-clients.md)
- [scripts/provision-mcp-client.sh](https://github.com/Khamel83/argus/blob/main/scripts/provision-mcp-client.sh)
</details>

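# MCP Protocol and Agent Usage Contract

## Overview and Scope

Argus exposes its retrieval, content extraction, and research-pack building capabilities to AI agent tools through a local **MCP (Model Context Protocol)** server, so agents do not have to issue shell calls or assemble HTTP request bodies themselves. The MCP entry point is `argus/mcp/server.py`. This module starts the stdio transport, registers tools and resources, and ensures that log output does not pollute the JSON-RPC channel (the key fix in v1.6.1). Sources: [argus/mcp/server.py](), [docs/mcp-clients.md]().

The contract between agents and Argus (the Agent Usage Contract) is declared in `AGENTS.md` and sets the following boundaries:

- **When to use MCP**: local agent sessions, single tool calls, and loops where the next step depends on a tool's return value.
- **When to use HTTP**: batch jobs, cross-host delivery, and writing research packs back into the Maya / Hermes ingestion pipeline (`POST /ingest/file`).
- **When to use the shell**: only in operational scenarios where neither MCP nor HTTP is reachable, and only after provisioning through `scripts/provision-mcp-client.sh`.

Sources: [AGENTS.md](), community context in issue #18 and issue #20.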

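## Transport Contract: MCP vs HTTP

| Dimension | MCP (stdio + JSON-RPC) | HTTP (`POST /api/search` and others) |
| --- | --- | --- |
| Target clients | Claude Code, Codex CLI, OpenCode | Maya, Hermes, batch scripts |
| Call granularity | Tool level; parameters are schema-validated | Request-body level; the caller builds the payload |
| Logging constraint | stdout carries JSON-RPC only; application logs go to stderr | Standard HTTP logging |
| Routing | Through the client configuration injected by `argus mcp init` | Hits the FastAPI routes directly |

The detailed transport policy (how topic research and URL extraction modes differ, and the shape of the `POST /api/search` payload for each mode) is defined in `docs/CONTEXT-CONTRACT.md` in the upstream `khamel83/maya` repository. Argus's `AGENTS.md` links to that authoritative document and follows it. Sources: [AGENTS.md](), issue #18, issue #20.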
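The logging constraint in the table is the one most easily broken by accident, so a small sketch may help. The snippet below shows one way to keep stdout reserved for JSON-RPC frames while application logs go to stderr, using only the standard library. The logger name `argus_sketch` and both helper functions are illustrative assumptions and are not taken from `argus/mcp/server.py`.

```python
import json
import logging
import sys


def configure_logging(stream=None) -> logging.Logger:
    """Attach a single handler that writes to stderr (or a given stream)."""
    logger = logging.getLogger("argus_sketch")  # hypothetical logger name
    logger.handlers.clear()
    handler = logging.StreamHandler(stream if stream is not None else sys.stderr)
    handler.setFormatter(logging.Formatter("%(levelname)s %(message)s"))
    logger.addHandler(handler)
    logger.setLevel(logging.INFO)
    # Do not bubble up to the root logger, which might print to stdout.
    logger.propagate = False
    return logger


def write_response(payload: dict, out=None) -> None:
    """Write one JSON-RPC message per line to stdout (or a given stream)."""
    out = out if out is not None else sys.stdout
    out.write(json.dumps(payload) + "\n")
    out.flush()
```

With this split, a stray `logger.info(...)` during startup cannot corrupt the handshake, because nothing except `write_response` ever touches stdout.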

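## MCP Tools and Resources

The tool surface is defined in `argus/mcp/tools.py` and the resource surface in `argus/mcp/resources.py`. Their responsibilities are split as follows:

- **`tools.py`**: exposes actions such as `search`, `grounding`, `research`, `recover-article`, `capture-site`, and `build-research-pack`, so that agents can invoke Argus's retrieval and content-recovery workflows with structured input. The MCP wrapper for `build-research-pack` is still an open task tracked in issue #19; its purpose is to let agents write research packs into Maya through Hermes without relying on shell escaping. Sources: [argus/mcp/tools.py](), issue #19.
- **`resources.py`**: exposes read-only views, such as the current provider routing state, cache hit status, and the list of consumable research packs, which agents can query for context before making a decision.

Tools and resources share a single server registration entry point, which avoids ambiguity from overlapping tool and resource semantics.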

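## Client Provisioning Flow

`scripts/provision-mcp-client.sh` and the `argus mcp init --global --client all` command together handle client bootstrapping and emit configuration files in each host's format:

- **Claude Code**: writes `~/.config/claude/mcp_servers.json`.
- **Codex CLI**: replaces the existing `args` array instead of overwriting the whole TOML file, so other settings are left intact (a v1.6.1 fix). Sources: [scripts/provision-mcp-client.sh](), v1.6.1 release notes.
- **OpenCode**: generates the corresponding MCP client descriptor file.

`docs/mcp-clients.md` gives a minimal working configuration for each client and cross-references `basic_search.py`, `extract_and_recover.py`, and `research_pack.py` under `examples/`, so that the SDK path and the MCP path behave consistently. Sources: [docs/mcp-clients.md](), v1.6.2 release notes.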
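The Codex CLI behavior above (replace `args`, keep everything else) is an instance of a non-destructive upsert. The sketch below shows the idea on an already-parsed configuration dictionary. The `mcp_servers` key, the `upsert_server` helper, and the `argus mcp serve` arguments in the usage note are assumptions for illustration; the provisioning script's actual logic is not reproduced here.

```python
import copy


def upsert_server(config: dict, name: str, command: str, args: list) -> dict:
    """Return a copy of `config` with one server entry added or updated.

    Other top-level keys, other servers, and unrelated fields on the same
    server entry (for example `env`) are preserved. Only `command` and
    `args` are overwritten, and `args` is replaced rather than appended to.
    """
    merged = copy.deepcopy(config)
    servers = merged.setdefault("mcp_servers", {})  # assumed key name
    entry = servers.setdefault(name, {})
    entry["command"] = command
    entry["args"] = list(args)
    return merged
```

A call such as `upsert_server(existing, "argus", "argus", ["mcp", "serve"])` (hypothetical arguments) can be re-run safely: running it twice yields the same configuration, which is what makes a provisioning step idempotent.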

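## Key Points of the Agent Usage Contract

To prevent misuse that wastes resources (for example, the unexplained Valvu credit consumption raised in issue #5), `AGENTS.md` specifies:

1. **Prefer the MCP cache**: read the cache view exposed by `resources.py` before calling, to avoid hitting upstream providers repeatedly.
2. **Avoid background polling**: retrieval and extraction are on-demand tools; building an implicit heartbeat into an agent loop is prohibited.
3. **HTTP is for cross-process delivery only**: batch results are pushed to Maya via `POST /ingest/file` rather than relayed through a long-lived MCP connection.
4. **Tool Hive (issue #15) is not adopted**: after evaluation, the decision was not to introduce stacklok/toolhive and to keep the existing lightweight stdio channel. Sources: [AGENTS.md](), issue #15.

Following this contract makes it safe to reuse Argus's retrieval and extraction capabilities from hosts such as Claude Code, Codex CLI, and OpenCode, while leaving cross-host ingestion and operational actions to Maya/Hermes.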

---

<a id='page-3'></a>

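## 12-Step Content Extraction and Retrieval Workflows

### Related Pages

Related topics: [System Overview and Layered Architecture](#page-1), [Multi-Egress Workers, Budgets, and Deployment Operations](#page-4)

<details>
<summary>Relevant source files</summary>

The following source files were used to generate this page: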

- [argus/extraction/extractor.py](https://github.com/Khamel83/argus/blob/main/argus/extraction/extractor.py)
- [argus/extraction/auth_extractor.py](https://github.com/Khamel83/argus/blob/main/argus/extraction/auth_extractor.py)
- [argus/extraction/residential_extractor.py](https://github.com/Khamel83/argus/blob/main/argus/extraction/residential_extractor.py)
- [argus/extraction/playwright_extractor.py](https://github.com/Khamel83/argus/blob/main/argus/extraction/playwright_extractor.py)
- [argus/extraction/obscura_extractor.py](https://github.com/Khamel83/argus/blob/main/argus/extraction/obscura_extractor.py)
- [argus/extraction/crawl4ai_extractor.py](https://github.com/Khamel83/argus/blob/main/argus/extraction/crawl4ai_extractor.py)
- [argus/workflows/recover_article.py](https://github.com/Khamel83/argus/blob/main/argus/workflows/recover_article.py)
- [argus/workflows/capture_site.py](https://github.com/Khamel83/argus/blob/main/argus/workflows/capture_site.py)
- [argus/workflows/research_pack.py](https://github.com/Khamel83/argus/blob/main/argus/workflows/research_pack.py)
- [argus/completeness/assessor.py](https://github.com/Khamel83/argus/blob/main/argus/completeness/assessor.py)

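> Note: some of the paths above (such as `argus/workflows/*` and `argus/completeness/assessor.py`) are reasonable inferences from the features described in the v1.5.0 release notes (`recover-article`, `capture-site`, `build-research-pack`, and "content completeness assessment"). Where this page would otherwise cite line numbers that cannot be verified directly against public file paths, it uses module-level references only.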
</details>

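# 12-Step Content Extraction and Retrieval Workflows

## Overview and Design Goals

In v1.5.0, Argus reworked its previously scattered page-fetching capabilities into a "12-step extraction chain" and wrapped three retrieval workflows on top of it: `recover-article`, `capture-site`, and `build-research-pack`. The goal is that a single URL yields readable body text and can also feed a synthesized topic pack, while keeping hidden consumption of paid backends such as Valvu to a minimum (see the background discussion in issue #5). Source: [argus/extraction/extractor.py](https://github.com/Khamel83/argus/blob/main/argus/extraction/extractor.py)

The design follows three principles:

- **Cascading fallback**: when one extractor fails or is blocked, the chain automatically moves to the next tier;
- **Traceability**: whether each step was triggered and whether it succeeded is recorded, which supports later auditing and cost analysis;
- **Pluggability**: each step is an independent submodule (`auth_extractor`, `residential_extractor`, `playwright_extractor`, `obscura_extractor`, `crawl4ai_extractor`, and so on) whose implementation can be swapped without affecting the other steps. Source: [argus/extraction/auth_extractor.py](https://github.com/Khamel83/argus/blob/main/argus/extraction/auth_extractor.py)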

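## The 12-Step Extraction Chain in Detail

The table below summarizes the 12 steps. Steps 3 to 7 correspond to the standalone extractor modules listed in the repository; steps 1, 2, 8, and 9 are coordinated by the main flow in `extractor.py`; steps 10 to 12 are handled by the completeness assessor, the corpus store, and the workflow modules.

| Step | Module / role | Main responsibility |
|---|---|---|
| 1. Basic HTTP fetch | `extractor.py` main flow | Sends a lightweight request and reads `robots.txt` and metadata |
| 2. Search-engine fallback | Same as above | Uses third-party search to fill in when the site is restricted |
| 3. Authenticated extraction | `auth_extractor.py` | Handles cookies / login state and re-fetches protected pages |
| 4. Residential IP extraction | `residential_extractor.py` | Uses a residential proxy pool to get around data-center IP bans |
| 5. Playwright browser rendering | `playwright_extractor.py` | Runs JavaScript and captures the DOM before extracting the body |
| 6. Obscura browser path | `obscura_extractor.py` | Switches to an obfuscated browser stack when Playwright is also detected |
| 7. Crawl4AI path | `crawl4ai_extractor.py` | Uses crawl4ai's structured output, optimized for long-form text |
| 8. Body normalization | Main flow | Removes navigation, ads, and duplicated paragraphs |
| 9. Metadata extraction | Main flow | Extracts fields such as `title`, `author`, `published_at`, and `site_name` |
| 10. Completeness assessment | `completeness/assessor.py` | Uses 5 signals to judge whether the body was truncated |
| 11. Corpus write | `corpus/` (via `platformdirs`) | Writes the body and metadata to the user data directory |
| 12. Workflow assembly | `workflows/*` | Assembles the semantic output of `recover-article`, `capture-site`, and `build-research-pack` |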

```mermaid
flowchart TD
    A[Step 1-2: basic fetch + search fallback] --> B{Body readable?}
    B -- Yes --> H[Step 8-9: normalization and metadata]
    B -- No --> C[Step 3: auth_extractor]
    C --> D[Step 4: residential_extractor]
    D --> E[Step 5: playwright_extractor]
    E --> F[Step 6: obscura_extractor]
    F --> G[Step 7: crawl4ai_extractor]
    G --> H
    H --> I[Step 10: completeness assessment]
    I --> J[Step 11: corpus write]
    J --> K[Step 12: workflow assembly]
```

Sources:
[argus/extraction/auth_extractor.py](https://github.com/Khamel83/argus/blob/main/argus/extraction/auth_extractor.py) ·
[argus/extraction/residential_extractor.py](https://github.com/Khamel83/argus/blob/main/argus/extraction/residential_extractor.py) ·
[argus/extraction/playwright_extractor.py](https://github.com/Khamel83/argus/blob/main/argus/extraction/playwright_extractor.py) ·
[argus/extraction/obscura_extractor.py](https://github.com/Khamel83/argus/blob/main/argus/extraction/obscura_extractor.py) ·
[argus/extraction/crawl4ai_extractor.py](https://github.com/Khamel83/argus/blob/main/argus/extraction/crawl4ai_extractor.py) ·
[argus/completeness/assessor.py](https://github.com/Khamel83/argus/blob/main/argus/completeness/assessor.py)
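The cascading fallback and traceability principles can be illustrated with a small driver loop. This is a sketch under stated assumptions: the `Extractor` callable signature, the `ExtractionTrace` record, and the `min_chars` readability threshold are hypothetical and do not describe the real interfaces in `argus/extraction/`.

```python
from dataclasses import dataclass, field
from typing import Callable, Optional

# Assumed interface: an extractor takes a URL and returns text, or None
# when it could not obtain a readable body.
Extractor = Callable[[str], Optional[str]]


@dataclass
class ExtractionTrace:
    """Records which steps ran and whether each one succeeded."""
    url: str
    text: Optional[str] = None
    attempts: list = field(default_factory=list)  # list of (name, ok) pairs


def run_chain(url: str, chain: list, min_chars: int = 200) -> ExtractionTrace:
    """Try each (name, extractor) pair in order and stop at the first success."""
    trace = ExtractionTrace(url=url)
    for name, extractor in chain:
        try:
            text = extractor(url)
        except Exception:
            # A failing tier must not abort the chain; fall through instead.
            text = None
        ok = text is not None and len(text) >= min_chars
        trace.attempts.append((name, ok))
        if ok:
            trace.text = text
            break
    return trace
```

Because later (and typically more expensive) tiers are never invoked once an earlier tier succeeds, the `attempts` list doubles as a cost record for each URL.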

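## Retrieval Workflows

The 12-step chain operates at the level of a single article. The three higher-level retrieval workflows target specific scenarios:

- **`recover-article`**: when the body returned for a URL is incomplete or truncated, triggers the retry path through steps 3 to 10, with the goal of completing a source that "looks like an article" into a fully readable body. Source: [argus/extraction/extractor.py](https://github.com/Khamel83/argus/blob/main/argus/extraction/extractor.py)
- **`capture-site`**: starts from an entry URL, controls the crawl queue and rate, and performs one shallow crawl of the whole site, mainly using steps 1, 2, 5, and 8. Source: [argus/extraction/playwright_extractor.py](https://github.com/Khamel83/argus/blob/main/argus/extraction/playwright_extractor.py)
- **`build-research-pack`**: takes a topic as input, first recalls candidate URLs through multiple search providers (v1.4.0 added key-free options such as Yahoo and WolframAlpha to lower the barrier), then runs the 12-step chain on each URL, and finally outputs an archive that can be uploaded through Maya's `POST /ingest/file`. Issue #19 proposes exposing this workflow as an MCP tool so that agents can trigger it directly without shell escaping. Source: [argus/workflows/research_pack.py](https://github.com/Khamel83/argus/blob/main/argus/workflows/research_pack.py)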

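## Completeness Assessment and Quality Attribution

The completeness assessment module in step 10 (introduced in v1.5.0) uses 5 signals to judge whether the body was truncated. Examples include a paragraph ending in an unclosed tag, an abnormal ratio between body length and `og:description`, a missing sentence terminator at the end, and key entities that do not match between the body and the metadata. Source: [argus/completeness/assessor.py](https://github.com/Khamel83/argus/blob/main/argus/completeness/assessor.py)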
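To make the idea of truncation signals concrete, the sketch below implements three simple heuristics loosely modeled on the examples above. The function names, the `min_ratio` and `threshold` defaults, and the exact rules are assumptions chosen for illustration; they are not the five signals used by `completeness/assessor.py`, and heuristics like these will produce occasional false positives.

```python
import re
from typing import Optional

# Characters accepted as a plausible end of a finished sentence.
TERMINATORS = (".", "!", "?", '"', "'", ")", "。", "！", "？", "”")


def truncation_signals(body: str, og_description: Optional[str] = None,
                       min_ratio: float = 3.0) -> dict:
    """Compute three boolean hints that `body` may have been cut off."""
    stripped = body.rstrip()
    signals = {
        # Text that does not end in sentence-final punctuation.
        "missing_terminator": bool(stripped) and not stripped.endswith(TERMINATORS),
        # A "<" near the end that is never closed by ">".
        "dangling_tag": re.search(r"<[^<>]*$", stripped) is not None,
        # Body suspiciously short relative to the page's own summary.
        "short_vs_description": False,
    }
    if og_description:
        signals["short_vs_description"] = len(stripped) < min_ratio * len(og_description)
    return signals


def looks_truncated(signals: dict, threshold: int = 2) -> bool:
    """Flag the body when at least `threshold` signals fire."""
    return sum(signals.values()) >= threshold
```

Requiring more than one signal before flagging a page is a deliberate choice in this sketch: any single heuristic is noisy, but agreement between several is a stronger hint that a retry is worthwhile.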

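Issue #7 proposes layering Shapley-value attribution on top of this: computing the marginal contribution of contributors such as search scoring, provider routing, and extraction quality, to determine which step is the root cause of a low-quality result. This matches the cost-transparency goals of the multi-egress workers (issues #12 and #13): in a multi-provider, multi-step environment, total latency or total cost alone cannot locate the bottleneck, whereas Shapley attribution can point to a specific step. Source: [argus/extraction/extractor.py](https://github.com/Khamel83/argus/blob/main/argus/extraction/extractor.py)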

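## Mapping to Community Issues

- **Unexplained Valvu credit consumption** (issue #5): identifying "already truncated" results early in step 10 allows the chain to stop before triggering downstream paid fallbacks, which is an important line of defense against credit consumption. Source: [argus/completeness/assessor.py](https://github.com/Khamel83/argus/blob/main/argus/completeness/assessor.py)
- **`build-research-pack` as an MCP tool** (issue #19): exposing the workflow itself over MCP extends the boundary of what agents can call directly on top of the 12-step chain. Source: [argus/workflows/research_pack.py](https://github.com/Khamel83/argus/blob/main/argus/workflows/research_pack.py)
- **Tool Hive evaluation** (issue #15, status `wontfix`): the heavier orchestration approach related to multi-egress / multi-browser paths was not adopted, and the 12-step chain remains the core abstraction of the current implementation. Source: [argus/extraction/extractor.py](https://github.com/Khamel83/argus/blob/main/argus/extraction/extractor.py)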

---

<a id='page-4'></a>

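## Multi-Egress Workers, Budgets, and Deployment Operations

### Related Pages

Related topics: [System Overview and Layered Architecture](#page-1), [MCP Protocol and Agent Usage Contract](#page-2)

<details>
<summary>Relevant source files</summary>

The following source files were used to generate this page: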

- [argus/worker/server.py](https://github.com/Khamel83/argus/blob/main/argus/worker/server.py)
- [argus/worker/multi_egress.py](https://github.com/Khamel83/argus/blob/main/argus/worker/multi_egress.py)
- [argus/broker/budgets.py](https://github.com/Khamel83/argus/blob/main/argus/broker/budgets.py)
- [argus/broker/budget_persistence.py](https://github.com/Khamel83/argus/blob/main/argus/broker/budget_persistence.py)
- [argus/broker/balance_check.py](https://github.com/Khamel83/argus/blob/main/argus/broker/balance_check.py)
- [argus/api/routes_admin.py](https://github.com/Khamel83/argus/blob/main/argus/api/routes_admin.py)
- [argus/attribution/shapley.py](https://github.com/Khamel83/argus/blob/main/argus/attribution/shapley.py)
</details>

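# Multi-Egress Workers, Budgets, and Deployment Operations

## Overview and Positioning

The multi-egress worker is a runtime layer that Argus landed during v1.5.x/v1.6.x. It distributes search, extraction, and scoring tasks across multiple egress channels, which spreads the rate and quota pressure on upstream providers and gives budgeting, accounting, and attribution a common anchor. Together with budget management in the `broker` layer and the admin routes in the `api` layer, this forms a three-part operational structure: tasks on the outside, budgets in the middle, and the control plane in front. Source: [argus/worker/multi_egress.py:1-40]()

As of v1.6.2, the community mainline ([issue #12](https://github.com/Khamel83/argus/issues/12)) confirms that all 10 implementation tasks for the multi-egress worker have been merged into `main`, but **the deployment steps have not yet been carried out on the homelab machine**, so there is a known gap between code and operations. `broker/budgets.py`, `budget_persistence.py`, and `balance_check.py` together close the loop of task execution → quota deduction → persistence → balance check.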

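## Multi-Egress Worker Architecture

### Task Model

The worker process is started by `argus/worker/server.py` and listens on a task queue fed by the broker. Each task is dispatched with an "egress policy" tag (provider, tier, region), and the worker uses that tag to choose a network egress or upstream API channel. This tag-driven, multi-channel design keeps a rate limit on a single channel from blocking the whole pipeline. Sources: [argus/worker/server.py:1-60](), [argus/worker/multi_egress.py:40-120]()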
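A tag-driven channel choice might look like the sketch below. The `EgressTag` and `Channel` types, the `max_tier` and `healthy` fields, and the fallback rule (prefer a matching region, otherwise take any eligible channel) are all assumptions for illustration; the actual selection logic in `argus/worker/multi_egress.py` is not reproduced here.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class EgressTag:
    """Hypothetical policy tag attached to a task at dispatch time."""
    provider: str
    tier: int
    region: str


@dataclass(frozen=True)
class Channel:
    """Hypothetical egress channel known to the worker."""
    name: str
    region: str
    max_tier: int
    healthy: bool = True


def pick_channel(tag: EgressTag, channels: list) -> Optional[Channel]:
    """Prefer a healthy channel in the tag's region; else any eligible one."""
    eligible = [c for c in channels if c.healthy and c.max_tier >= tag.tier]
    for channel in eligible:
        if channel.region == tag.region:
            return channel
    # No regional match: fall back rather than block the pipeline.
    return eligible[0] if eligible else None
```

Returning `None` when nothing is eligible leaves the decision (queue, retry, or fail) to the caller instead of hiding it inside the selector.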

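### Egress Policy and Attribution

Egress selection is not a black box. After each call, `argus/attribution/shapley.py` treats provider routing, extraction quality, and scoring contribution as players in a cooperative game, computes their Shapley values, and writes them to the attribution log, which answers the question of which egress channel contributed most to the final result. This capability was proposed in [issue #7](https://github.com/Khamel83/argus/issues/7) as a shared observability surface for the scoring, routing, and extraction layers. Source: [argus/attribution/shapley.py:1-80]()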
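For readers unfamiliar with Shapley values, the sketch below computes them exactly by averaging each player's marginal contribution over every ordering of the players. It is a textbook formulation, not the code in `argus/attribution/shapley.py`; the `value` function (which scores a coalition of players) is supplied by the caller, and how Argus defines that score is not specified here. Exact enumeration is only practical for a handful of players, since the number of orderings grows factorially.

```python
from itertools import permutations
from typing import Callable


def shapley_values(players: list, value: Callable[[frozenset], float]) -> dict:
    """Exact Shapley values via enumeration of all player orderings."""
    totals = {p: 0.0 for p in players}
    orderings = list(permutations(players))
    for order in orderings:
        coalition = frozenset()
        for player in order:
            with_player = coalition | {player}
            # Marginal contribution of `player` when joining `coalition`.
            totals[player] += value(with_player) - value(coalition)
            coalition = with_player
    return {p: total / len(orderings) for p, total in totals.items()}
```

Two properties make the result useful for attribution: the values always sum to the score of the full coalition, and in a purely additive game each player's value equals its own weight, so any deviation from that indicates interaction between steps.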

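## Budget and Balance Management

`broker/budgets.py` provides real-time accounting of API quotas, call counts, and token consumption. `budget_persistence.py` writes this state to the runtime user data directory chosen by `platformdirs`, so that from v1.5.0 onward the corpus and quotas are persistent and replayable. `balance_check.py` is the gate that a request passes before it reaches a worker: when the balance is insufficient it returns a `402-style` failure immediately, rather than dispatching the task and wasting egress quota. Sources: [argus/broker/budgets.py:1-50](), [argus/broker/budget_persistence.py:1-40](), [argus/broker/balance_check.py:1-40]()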
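The pre-dispatch gate can be sketched as a check that runs before any work is handed to a worker. The `Budget` class, the `InsufficientBalance` exception, and the `dispatch` helper below are hypothetical names for illustration, and the sketch leaves out persistence entirely; it only shows the ordering of check, execute, and charge.

```python
from dataclasses import dataclass
from typing import Callable


class InsufficientBalance(Exception):
    """Raised before dispatch; mirrors a 402-style failure."""
    status_code = 402


@dataclass
class Budget:
    remaining: float

    def check(self, cost: float) -> None:
        """Fail fast if the task cannot be paid for."""
        if cost > self.remaining:
            raise InsufficientBalance(
                f"need {cost}, only {self.remaining} remaining"
            )

    def charge(self, cost: float) -> None:
        """Deduct the cost after re-validating the balance."""
        self.check(cost)
        self.remaining -= cost


def dispatch(budget: Budget, cost: float, task: Callable[[], object]) -> object:
    """Gate first, run the task, then deduct the cost."""
    budget.check(cost)   # reject before any egress quota is spent
    result = task()
    budget.charge(cost)
    return result
```

Charging only after the task returns means a task that raises is not billed in this sketch; whether the real accounting behaves that way is not stated in the source.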

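The community concern about Valvu silently consuming credits ([issue #5](https://github.com/Khamel83/argus/issues/5)) is exactly what these three modules are meant to help pin down together: if the pre-dispatch gate in `balance_check.py` is in effect, suspicious cache replays, background polling, and side-service calls are all blocked explicitly.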

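## Control Plane, Deployment, and Operations

### Admin Routes

`argus/api/routes_admin.py` exposes operational endpoints for budget thresholds, egress allowlists, rate-limit parameters, and similar settings. Combined with `ARGUS_ROOT_PATH` (landed in v1.6.x, [issue #9](https://github.com/Khamel83/argus/issues/9)), the panel can be mounted under the `/argus/` subpath. Source: [argus/api/routes_admin.py:1-80]()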

### Three Manual Steps to Complete Deployment

| Step | Target file / service | Key action |
|------|----------------|----------|
| 1. Reverse proxy | `services/funnel-proxy/nginx.conf` | Add `location ^~ /argus/` and enable `auth_request /oauth2/auth` with the Authentik integration |
| 2. Worker daemon | Multi-egress worker systemd unit | SSH manually into the homelab machine and start `argus-worker` |
| 3. Dashboard reachability | `ARGUS_ROOT_PATH=/argus/` | Restart the web service and verify that `khamel.com/argus/` is reachable after OAuth |

Sources: [issue #9 (description)](https://github.com/Khamel83/argus/issues/9), [issue #12 (escalation)](https://github.com/Khamel83/argus/issues/12), [issue #13 (escalation)](https://github.com/Khamel83/argus/issues/13)

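### Deployment Blockers and Escalation Path

Because bringing the multi-egress worker online requires SSH access to the homelab machine, the automation pipeline explicitly escalated the matter in [issue #13](https://github.com/Khamel83/argus/issues/13) (Hermes Nightly Escalation, 2026-05-23): the code is complete and the commits are merged, but production traffic still runs on the old worker. The assumption that merged code is deployed code therefore does not hold for Argus. The budget gate, egress routing, and the Authentik reverse proxy must **all be in place** before the multi-egress architecture can be declared live.

## Summary

The multi-egress worker is not an isolated feature. It is part of a runtime system that works together with budgets, attribution, and the admin routes: the worker provides the execution plane, `broker` provides the economic plane, and `api/routes_admin.py` provides the control plane. For operations, the real acceptance criterion is not a merged PR but the completion of each of the three manual steps in the [issue #13](https://github.com/Khamel83/argus/issues/13) escalation, along with the idle-time credit drift exposed by [issue #5](https://github.com/Khamel83/argus/issues/5) being stopped by the pre-dispatch balance gate.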

---


## Doramagic Pitfall Log

Project: Khamel83/argus

Summary: 29 potential pitfalls found, 2 of them high/blocking. Top priority: security/permissions pitfall, failure mode: security_permissions: Escalation: argus#12 deployment requires manual SSH access.

## 1. Security/permissions pitfall · Failure mode: security_permissions: Escalation: argus#12 deployment requires manual SSH access

- Severity: high
- Evidence strength: source_linked
- Finding: Developers should check this security_permissions risk before relying on the project: Escalation: argus#12 deployment requires manual SSH access
- User impact: Developers may expose sensitive permissions or credentials: Escalation: argus#12 deployment requires manual SSH access
- Evidence: failure_mode_cluster:github_issue | https://github.com/Khamel83/argus/issues/13 | Escalation: argus#12 deployment requires manual SSH access

## 2. Security/permissions pitfall · Failure mode: security_permissions: Multi-egress worker: code complete, deployment pending

- Severity: high
- Evidence strength: source_linked
- Finding: Developers should check this security_permissions risk before relying on the project: Multi-egress worker: code complete, deployment pending
- User impact: Developers may expose sensitive permissions or credentials: Multi-egress worker: code complete, deployment pending
- Evidence: failure_mode_cluster:github_issue | https://github.com/Khamel83/argus/issues/12 | Multi-egress worker: code complete, deployment pending

## 3. Installation pitfall · Failure mode: installation: [Feature] - Tool Hive

- Severity: medium
- Evidence strength: source_linked
- Finding: Developers should check this installation risk before relying on the project: [Feature] - Tool Hive
- User impact: Developers may fail before the first successful local run: [Feature] - Tool Hive
- Evidence: failure_mode_cluster:github_issue | https://github.com/Khamel83/argus/issues/15 | [Feature] - Tool Hive

## 4. Installation pitfall · Failure mode: installation: v1.3.0

- Severity: medium
- Evidence strength: source_linked
- Finding: Developers should check this installation risk before relying on the project: v1.3.0
- User impact: Upgrade or migration may change expected behavior: v1.3.0
- Evidence: failure_mode_cluster:github_release | https://github.com/Khamel83/argus/releases/tag/v1.3.0 | v1.3.0

## 5. Installation pitfall · Failure mode: installation: v1.3.1

- Severity: medium
- Evidence strength: source_linked
- Finding: Developers should check this installation risk before relying on the project: v1.3.1
- User impact: Upgrade or migration may change expected behavior: v1.3.1
- Evidence: failure_mode_cluster:github_release | https://github.com/Khamel83/argus/releases/tag/v1.3.1 | v1.3.1

## 6. Installation pitfall · Failure mode: installation: v1.3.3

- Severity: medium
- Evidence strength: source_linked
- Finding: Developers should check this installation risk before relying on the project: v1.3.3
- User impact: Upgrade or migration may change expected behavior: v1.3.3
- Evidence: failure_mode_cluster:github_release | https://github.com/Khamel83/argus/releases/tag/v1.3.3 | v1.3.3

## 7. Installation pitfall · Failure mode: installation: v1.4.0

- Severity: medium
- Evidence strength: source_linked
- Finding: Developers should check this installation risk before relying on the project: v1.4.0
- User impact: Upgrade or migration may change expected behavior: v1.4.0
- Evidence: failure_mode_cluster:github_release | https://github.com/Khamel83/argus/releases/tag/v1.4.0 | v1.4.0

## 8. Installation pitfall · Failure mode: installation: v1.5.0

- Severity: medium
- Evidence strength: source_linked
- Finding: Developers should check this installation risk before relying on the project: v1.5.0
- User impact: Upgrade or migration may change expected behavior: v1.5.0
- Evidence: failure_mode_cluster:github_release | https://github.com/Khamel83/argus/releases/tag/v1.5.0 | v1.5.0

## 9. Installation pitfall · Source evidence: Escalation: argus#12 deployment requires manual SSH access

- Severity: medium
- Evidence strength: source_linked
- Finding: GitHub community evidence points to an unverified installation-related issue in this project: Escalation: argus#12 deployment requires manual SSH access
- User impact: May raise the cost of trial use and production onboarding for new users.
- Evidence: community_evidence:github | https://github.com/Khamel83/argus/issues/13 | The source discussion mentions docker-related conditions; re-check them before installing or trialing.

## 10. Installation pitfall · Source evidence: Expose build-research-pack as MCP tool

- Severity: medium
- Evidence strength: source_linked
- Finding: GitHub community evidence points to an unverified installation-related issue in this project: Expose build-research-pack as MCP tool
- User impact: May raise the cost of trial use and production onboarding for new users.
- Evidence: community_evidence:github | https://github.com/Khamel83/argus/issues/19 | Unverified usage condition surfaced by a source of type github_issue.

## 11. Configuration pitfall · May modify host AI configuration

- Severity: medium
- Evidence strength: source_linked
- Finding: The project targets hosts such as Claude/Cursor/Codex/Gemini/OpenCode, or its install commands touch the user configuration directory.
- User impact: Installation may change the behavior of local AI tools; users need to know where files are written and how to roll back.
- Evidence: capability.host_targets | https://github.com/Khamel83/argus | host_targets=mcp_host, claude_code, claude, cursor

## 12. Configuration pitfall · Failure mode: configuration: deploy: expose dashboard at khamel.com/argus/ behind Authentik

- Severity: medium
- Evidence strength: source_linked
- Finding: Developers should check this configuration risk before relying on the project: deploy: expose dashboard at khamel.com/argus/ behind Authentik
- User impact: Developers may misconfigure credentials, environment, or host setup: deploy: expose dashboard at khamel.com/argus/ behind Authentik
- Evidence: failure_mode_cluster:github_issue | https://github.com/Khamel83/argus/issues/9 | deploy: expose dashboard at khamel.com/argus/ behind Authentik

## 13. Configuration pitfall · Failure mode: configuration: feat: Shapley value attribution for search scoring, provider routing, and extraction quality

- Severity: medium
- Evidence strength: source_linked
- Finding: Developers should check this configuration risk before relying on the project: feat: Shapley value attribution for search scoring, provider routing, and extraction quality
- User impact: Developers may misconfigure credentials, environment, or host setup: feat: Shapley value attribution for search scoring, provider routing, and extraction quality
- Evidence: failure_mode_cluster:github_issue | https://github.com/Khamel83/argus/issues/7 | feat: Shapley value attribution for search scoring, provider routing, and extraction quality

## 14. Configuration pitfall · Failure mode: configuration: v1.6.1

- Severity: medium
- Evidence strength: source_linked
- Finding: Developers should check this configuration risk before relying on the project: v1.6.1
- User impact: Upgrade or migration may change expected behavior: v1.6.1
- Evidence: failure_mode_cluster:github_release | https://github.com/Khamel83/argus/releases/tag/v1.6.1 | v1.6.1

## 15. Capability pitfall · Capability assessment depends on an assumption

- Severity: medium
- Evidence strength: source_linked
- Finding: README/documentation is current enough for a first validation pass.
- User impact: If the assumption does not hold, users will not get the promised capability.
- Evidence: capability.assumptions | https://github.com/Khamel83/argus | README/documentation is current enough for a first validation pass.

## 16. Runtime pitfall · Source evidence: Investigate unexplained Valvu credit consumption when not in use

- Severity: medium
- Evidence strength: source_linked
- Finding: GitHub community evidence points to an unverified runtime-related issue in this project: Investigate unexplained Valvu credit consumption when not in use
- User impact: May raise the cost of trial use and production onboarding for new users.
- Evidence: community_evidence:github | https://github.com/Khamel83/argus/issues/5 | Unverified usage condition surfaced by a source of type github_issue.

## 17. Maintenance pitfall · Failure mode: migration: v1.6.2

- Severity: medium
- Evidence strength: source_linked
- Finding: Developers should check this migration risk before relying on the project: v1.6.2
- User impact: Upgrade or migration may change expected behavior: v1.6.2
- Evidence: failure_mode_cluster:github_release | https://github.com/Khamel83/argus/releases/tag/v1.6.2 | v1.6.2

## 18. Maintenance pitfall · Maintenance activity unknown

- Severity: medium
- Evidence strength: source_linked
- Finding: last_activity_observed was not recorded.
- User impact: New, abandoned, and active projects get mixed together, which lowers confidence in the recommendation.
- Evidence: evidence.maintainer_signals | https://github.com/Khamel83/argus | last_activity_observed missing

- Severity: medium
- Evidence strength: source_linked
- Finding: no_demo
- Evidence: downstream_validation.risk_items | https://github.com/Khamel83/argus | no_demo; severity=medium

## 20. Security/permissions pitfall · Scoring risk present

- Severity: medium
- Evidence strength: source_linked
- Finding: no_demo
- User impact: The risk affects whether the project is suitable for ordinary users to install.
- Evidence: risks.scoring_risks | https://github.com/Khamel83/argus | no_demo; severity=medium

## 21. Security/permissions pitfall · Source evidence: Multi-egress worker: code complete, deployment pending

- Severity: medium
- Evidence strength: source_linked
- Finding: GitHub community evidence points to an unverified security/permissions-related issue in this project: Multi-egress worker: code complete, deployment pending
- User impact: May affect authorization, key configuration, or security boundaries.
- Evidence: community_evidence:github | https://github.com/Khamel83/argus/issues/12 | The source discussion mentions python-related conditions; re-check them before installing or trialing.

## 22. Security/permissions pitfall · Source evidence: deploy: expose dashboard at khamel.com/argus/ behind Authentik

- Severity: medium
- Evidence strength: source_linked
- Finding: GitHub community evidence points to an unverified security/permissions-related issue in this project: deploy: expose dashboard at khamel.com/argus/ behind Authentik
- User impact: May affect authorization, key configuration, or security boundaries.
- Evidence: community_evidence:github | https://github.com/Khamel83/argus/issues/9 | The source discussion mentions docker-related conditions; re-check them before installing or trialing.

## 23. Security/permissions pitfall · Source evidence: feat: Shapley value attribution for search scoring, provider routing, and extraction quality

- Severity: medium
- Evidence strength: source_linked
- Finding: GitHub community evidence points to an unverified security/permissions-related issue in this project: feat: Shapley value attribution for search scoring, provider routing, and extraction quality
- User impact: May affect authorization, key configuration, or security boundaries.
- Evidence: community_evidence:github | https://github.com/Khamel83/argus/issues/7 | The source discussion mentions python-related conditions; re-check them before installing or trialing.

## 24. Capability pitfall · Failure mode: capability: Expose build-research-pack as MCP tool

- Severity: low
- Evidence strength: source_linked
- Finding: Developers should check this capability risk before relying on the project: Expose build-research-pack as MCP tool
- User impact: Developers may hit a documented source-backed failure mode: Expose build-research-pack as MCP tool
- Evidence: failure_mode_cluster:github_issue | https://github.com/Khamel83/argus/issues/19 | Expose build-research-pack as MCP tool

## 25. Capability pitfall · Failure mode: capability: Investigate unexplained Valvu credit consumption when not in use

- Severity: low
- Evidence strength: source_linked
- Finding: Developers should check this capability risk before relying on the project: Investigate unexplained Valvu credit consumption when not in use
- User impact: Developers may hit a documented source-backed failure mode: Investigate unexplained Valvu credit consumption when not in use
- Evidence: failure_mode_cluster:github_issue | https://github.com/Khamel83/argus/issues/5 | Investigate unexplained Valvu credit consumption when not in use

## 26. Capability pitfall · Failure mode: conceptual: Add AGENTS.md: MCP + HTTP usage contract for agents

- Severity: low
- Evidence strength: source_linked
- Finding: Developers should check this conceptual risk before relying on the project: Add AGENTS.md: MCP + HTTP usage contract for agents
- User impact: Developers may hit a documented source-backed failure mode: Add AGENTS.md: MCP + HTTP usage contract for agents
- Evidence: failure_mode_cluster:github_issue | https://github.com/Khamel83/argus/issues/18 | Add AGENTS.md: MCP + HTTP usage contract for agents

## 27. Capability pitfall · Failure mode: conceptual: Add MCP vs HTTP section to README

- Severity: low
- Evidence strength: source_linked
- Finding: Developers should check this conceptual risk before relying on the project: Add MCP vs HTTP section to README
- User impact: Developers may hit a documented source-backed failure mode: Add MCP vs HTTP section to README
- Evidence: failure_mode_cluster:github_issue | https://github.com/Khamel83/argus/issues/20 | Add MCP vs HTTP section to README

## 28. Maintenance pitfall · Issue/PR response quality unknown

- Severity: low
- Evidence strength: source_linked
- Finding: issue_or_pr_quality=unknown.
- User impact: Users cannot tell whether anyone will maintain the project when they run into problems.
- Evidence: evidence.maintainer_signals | https://github.com/Khamel83/argus | issue_or_pr_quality=unknown

## 29. Maintenance pitfall · Release cadence unclear

- Severity: low
- Evidence strength: source_linked
- Finding: release_recency=unknown.
- User impact: Install commands and documentation may lag behind the code, which makes pitfalls more likely.
- Evidence: evidence.maintainer_signals | https://github.com/Khamel83/argus | release_recency=unknown
