# https://github.com/callstack/agent-device Project Manual

Generated: 2026-06-18 04:04:02 UTC

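## Table of Contents

- [Project Overview and System Architecture](#page-overview)
- [CLI Commands, Sessions, and Evidence Capture](#page-cli-sessions)
- [Platform Backends and Device Automation](#page-platforms)
- [Replay Scripts, Maestro Compatibility, and End-to-End Testing](#page-replay)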

<a id='page-overview'></a>

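## Project Overview and System Architecture

### Related Pages

Related topics: [CLI Commands, Sessions, and Evidence Capture](#page-cli-sessions), [Platform Backends and Device Automation](#page-platforms)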

<details>
<summary>Relevant source files</summary>

The following source files were used to generate this page:

- [README.md](https://github.com/callstack/agent-device/blob/main/README.md)
- [package.json](https://github.com/callstack/agent-device/blob/main/package.json)
- [ios-runner/README.md](https://github.com/callstack/agent-device/blob/main/ios-runner/README.md)
- [src/commands/capture/runtime/snapshot.ts](https://github.com/callstack/agent-device/blob/main/src/commands/capture/runtime/snapshot.ts)
- [src/commands/interaction/runtime/index.ts](https://github.com/callstack/agent-device/blob/main/src/commands/interaction/runtime/index.ts)
- [src/commands/management/runtime/admin.ts](https://github.com/callstack/agent-device/blob/main/src/commands/management/runtime/admin.ts)
- [src/commands/management/runtime/apps.ts](https://github.com/callstack/agent-device/blob/main/src/commands/management/runtime/apps.ts)
- [src/commands/observability/runtime/index.ts](https://github.com/callstack/agent-device/blob/main/src/commands/observability/runtime/index.ts)
- [src/commands/observability/runtime/diagnostics.ts](https://github.com/callstack/agent-device/blob/main/src/commands/observability/runtime/diagnostics.ts)
- [src/commands/observability/runtime/diagnostics-format.ts](https://github.com/callstack/agent-device/blob/main/src/commands/observability/runtime/diagnostics-format.ts)
- [src/commands/recording/runtime/recording.ts](https://github.com/callstack/agent-device/blob/main/src/commands/recording/runtime/recording.ts)
</details>

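# Project Overview and System Architecture

## Project Positioning and Goals

`agent-device` is a device automation CLI maintained by Callstack for AI coding agents. It focuses on verifying, interacting with, and diagnosing iOS, Android, tvOS, macOS desktop, and React Native apps on real devices. `README.md` positions it as the counterpart of Vercel's `agent-browser` for mobile, TV, and desktop, and stresses that "agents need to verify behavior on real devices, not just reason about code". Its core value is letting an agent observe and control a running app in a verifiable way, through token-efficient snapshots, semantic refs, and evidence captured on demand.

Source: [README.md:1-18]()

The latest released version is **v0.17.6**. That release mainly covers Rslib startup and build performance work, external XCTest runner support, and several iOS fixes, which suggests the project is still refining its build pipeline and platform adaptation layer.

Source: [package.json:1-30]()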

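## Core Architecture

The system has a four-layer structure: CLI/MCP entry points → command runtime → backend abstraction → platform drivers. The keywords in `package.json` (`mcp`, `model-context-protocol`, `mcp-server`, `ai-agent`, `mobile-automation`, `xcuitest`) show first-class support for AI agent workflows: the CLI and the MCP server are two entry points to the same set of commands.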

```mermaid
graph TD
  A[AI Agent / CLI / MCP Client] --> B[CLI & MCP entry points]
  B --> C[Command runtime]
  C --> D1[capture]
  C --> D2[interaction]
  C --> D3[management]
  C --> D4[observability]
  C --> D5[recording]
  D1 --> E[Backend abstraction layer]
  D2 --> E
  D3 --> E
  D4 --> E
  D5 --> E
  E --> F1[iOS XCUITest Runner]
  E --> F2[Android Emulator driver]
  E --> F3[macOS / tvOS driver]
```

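`src/commands/observability/runtime/index.ts` provides two binding factories, `diagnosticsCommands` and `bindObservabilityCommands`. They wrap backend primitives as runtime commands and also expose bound versions with the `runtime` parameter removed. This "runtime to bound" bridge is the common pattern between the command domains and the backend. The other command domains (`apps.ts`, `admin.ts`, `recording.ts`) follow the same pattern, so one implementation can be reused from both the MCP and CLI entry points.

Source: [src/commands/observability/runtime/index.ts:1-56](), [src/commands/management/runtime/admin.ts:1-40]()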
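The "runtime to bound" bridge described above can be sketched as follows. This is a minimal illustration, not the project's code: the `Runtime` shape, the `Bound` helper type, and `bindCommands` are hypothetical names introduced here, and the real `RuntimeCommand` type in the repository may differ (for example, it may be asynchronous).

```typescript
// Hypothetical shapes for illustration; the real agent-device types may differ.
type Runtime = { backend: { readLogs?: (limit: number) => string[] } };

// A runtime command takes the runtime first, then its options.
type RuntimeCommand<Options, Result> = (runtime: Runtime, options: Options) => Result;

// Bound form of a command: same options and result, no runtime parameter.
type Bound<C> = C extends RuntimeCommand<infer O, infer R> ? (options: O) => R : never;

// Bind every command in a record to one runtime instance.
function bindCommands<T extends Record<string, RuntimeCommand<any, any>>>(
  runtime: Runtime,
  commands: T,
): { [K in keyof T]: Bound<T[K]> } {
  const bound: Record<string, (options: unknown) => unknown> = {};
  for (const name of Object.keys(commands)) {
    const command = commands[name];
    bound[name] = (options) => command(runtime, options);
  }
  return bound as unknown as { [K in keyof T]: Bound<T[K]> };
}

// A toy command that forwards to the backend primitive when it exists.
const logs: RuntimeCommand<{ limit: number }, string[]> = (runtime, options) =>
  runtime.backend.readLogs?.(options.limit) ?? [];

const runtime: Runtime = {
  backend: { readLogs: (limit) => ["a", "b", "c"].slice(0, limit) },
};
const diagnostics = bindCommands(runtime, { logs });
const firstTwo = diagnostics.logs({ limit: 2 });
```

The caller of `diagnostics.logs` never passes the runtime, which is what allows a CLI handler and an MCP tool handler to share the same bound object.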

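## Command Domains and Key Capabilities

Commands under `src/commands/` are split by responsibility into five domains. Every domain uses the `RuntimeCommand<Options, Result>` type as its common signature, and options and results are fixed as TypeScript types, so IDEs and agents get a contract they can infer.

- **Capture**: `snapshot.ts` provides UI snapshots, diffs, and visibility analysis. It is the base primitive that lets an agent "see" the interface.
- **Interaction**: `runtime/index.ts` exposes actions such as tap, fill, typeText, swipe, scroll, longPress, pinch, and focus, plus target-bound helpers such as `getText` / `getAttrs` / `isVisible` / `waitForText`.
- **Management**: `apps.ts` and `admin.ts` handle the app lifecycle (open / close / listApps / install / reinstall / installFromSource) and the device lifecycle (listDevices / boot / shutdown). They are the core of test setup and teardown.
- **Observability**: `diagnostics.ts` provides three primitives: logs, network, and perf. They call the backend's `readLogs`, `dumpNetwork`, and `measurePerf`, then redact and truncate the results.
- **Recording**: `recording.ts` provides start/stop control for screen recording and traces. Output is described by an `ArtifactDescriptor`.

The redaction and truncation rules for observability results live in `diagnostics-format.ts`. `SECRET_KEY_PATTERN` matches sensitive field names such as `authorization`, `cookie`, `token`, `secret`, `password`, `passwd`, and `api[-_]?key`. `PAYLOAD_MAX_CHARS = 2048` and `MESSAGE_MAX_CHARS = 4096` set hard length limits on payloads and messages. `redactAndTruncate` and `redactUnknown` work together, and the result carries a "redacted" marker that agents can recognize.

Source: [src/commands/observability/runtime/diagnostics-format.ts:1-30](), [src/commands/capture/runtime/snapshot.ts:1-40](), [src/commands/recording/runtime/recording.ts:1-40]()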
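To make the redaction and truncation rule concrete, here is a rough sketch. The pattern alternatives and the two limits are the ones quoted in this document; everything else is an assumption for illustration: the case-insensitive flag, the `[REDACTED]` marker text, and the function `redactUnknownSketch` are not taken from the repository.

```typescript
// Values quoted in the text; the /i flag is an assumption.
const SECRET_KEY_PATTERN = /authorization|cookie|token|secret|password|passwd|api[-_]?key/i;
const PAYLOAD_MAX_CHARS = 2048;
const MESSAGE_MAX_CHARS = 4096;

// Hypothetical marker text; the real marker may differ.
const REDACTED = "[REDACTED]";

// Recursively replace values stored under sensitive keys and truncate long strings.
function redactUnknownSketch(value: unknown, maxChars: number): unknown {
  if (typeof value === "string") {
    return value.length > maxChars ? value.slice(0, maxChars) : value;
  }
  if (Array.isArray(value)) {
    return value.map((item) => redactUnknownSketch(item, maxChars));
  }
  if (value !== null && typeof value === "object") {
    const source = value as Record<string, unknown>;
    const out: Record<string, unknown> = {};
    for (const key of Object.keys(source)) {
      out[key] = SECRET_KEY_PATTERN.test(key)
        ? REDACTED
        : redactUnknownSketch(source[key], maxChars);
    }
    return out;
  }
  return value; // numbers, booleans, null, undefined pass through unchanged
}

// Example: request headers are treated as a payload, a log line as a message.
const headers = redactUnknownSketch(
  { Authorization: "Bearer abc", "x-api-key": "k", accept: "a".repeat(3000) },
  PAYLOAD_MAX_CHARS,
) as Record<string, string>;
const message = redactUnknownSketch("m".repeat(5000), MESSAGE_MAX_CHARS) as string;
```

Matching on key names rather than on values keeps the check cheap, at the cost of missing secrets stored under innocuous keys.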

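## Platform Support and the iOS Runner

The `package.json` keywords list ios, android, tvos, macos, react-native, expo, xcuitest, ios-simulator, android-emulator, maestro, detox, and other platform and ecosystem tags, which indicates the goal of sharing one command contract across platforms.

On Apple platforms the implementation lives in the `ios-runner/` directory: a lightweight XCUITest target that exposes UI automation primitives through a small HTTP server. According to `ios-runner/README.md`, it is split into several focused files to reduce the context that contributors and LLM agents have to load:

- `RunnerTests.swift`: shared state, constants, and the `testCommand()` entry point;
- `RunnerTests+Models.swift`: wire protocol models (`Command` / `Response` and snapshot payloads);
- `RunnerTests+Environment.swift` / `+Transport.swift`: environment variables and TCP/HTTP parsing;
- `RunnerTests+CommandExecution.swift` / `+Interaction.swift` / `+Lifecycle.swift`: command dispatch, interaction, and lifecycle helpers.

The TypeScript client is at `src/platforms/ios/runner-client.ts` and mirrors the Swift models one to one. The recent v0.17.6 release added "external xctest runner artifact support" and "external runner flags classification", both of which concern this Runner protocol.

Source: [ios-runner/README.md:1-30]()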

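## Replay Scripts and Community Concerns

`agent-device` also offers agent-oriented replayable workflows through `.ad` replay scripts. Community issue **#432** points out that `.ad` scripts currently do not support parameterization: every value must be written into the file as a literal, so app variants (`com.example.debug` vs. `com.example.prod`), different throttle durations, and similar cases require multiple copies of a script. The issue bears directly on the product goal of "reusable scripts for AI agents" and suggests that a variable resolution layer is still missing between the recording domain and replay scheduling.

Source: [README.md:1-18]()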

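## Development and Toolchain

The scripts in `package.json` cover the full engineering loop:

- **Static checks and formatting**: `lint` (`oxlint . --deny-warnings`), `format` / `format:check` (`oxfmt`);
- **Performance baselines**: `perf` / `perf:ios` / `perf:android` run performance scripts through `scripts/perf/run.ts` and emit a size report;
- **Code audit**: `fallow` / `fallow:all` / `fallow:baseline` / `check:fallow` manage dead-code and health baselines;
- **MCP metadata sync**: `sync:mcp-metadata` / `check:mcp-metadata` keep the repository and the MCP server metadata consistent;
- **Quality gate**: `check:quick` chains lint and typecheck as a lightweight pre-check.

The toolchain uses `pnpm` as the package manager, TypeScript 6.0 with Rslib 0.22 for builds, Vitest 4.1 for unit tests, and oxlint/oxfmt for fast checks. Together these reflect an emphasis on engineering discipline, automation, and auditability.

Source: [package.json:1-60]()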

## See Also

- Command runtime and backend contract: see `src/backend.ts` and `src/runtime-contract.ts`.
- MCP server metadata sync script: `scripts/sync-mcp-metadata.mjs`.
- iOS Runner protocol details: `ios-runner/RUNNER_PROTOCOL.md`.
- Sample observability unit tests: `src/commands/observability/runtime/diagnostics-router.test.ts`.

---

<a id='page-cli-sessions'></a>

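## CLI Commands, Sessions, and Evidence Capture

### Related Pages

Related topics: [Project Overview and System Architecture](#page-overview), [Platform Backends and Device Automation](#page-platforms)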

<details>
<summary>Relevant source files</summary>

The following source files were used to generate this page:

- [package.json](https://github.com/callstack/agent-device/blob/main/package.json)
- [README.md](https://github.com/callstack/agent-device/blob/main/README.md)
- [src/commands/management/runtime/admin.ts](https://github.com/callstack/agent-device/blob/main/src/commands/management/runtime/admin.ts)
- [src/commands/management/runtime/apps.ts](https://github.com/callstack/agent-device/blob/main/src/commands/management/runtime/apps.ts)
- [src/commands/observability/runtime/diagnostics.ts](https://github.com/callstack/agent-device/blob/main/src/commands/observability/runtime/diagnostics.ts)
- [src/commands/observability/runtime/diagnostics-format.ts](https://github.com/callstack/agent-device/blob/main/src/commands/observability/runtime/diagnostics-format.ts)
- [src/commands/observability/runtime/index.ts](https://github.com/callstack/agent-device/blob/main/src/commands/observability/runtime/index.ts)
- [src/commands/recording/runtime/recording.ts](https://github.com/callstack/agent-device/blob/main/src/commands/recording/runtime/recording.ts)
- [src/commands/capture/runtime/snapshot.ts](https://github.com/callstack/agent-device/blob/main/src/commands/capture/runtime/snapshot.ts)
- [src/commands/interaction/runtime/index.ts](https://github.com/callstack/agent-device/blob/main/src/commands/interaction/runtime/index.ts)
- [ios-runner/README.md](https://github.com/callstack/agent-device/blob/main/ios-runner/README.md)

</details>

# CLI Commands, Sessions, and Evidence Capture

## 1. Project Positioning and Overall CLI Architecture

`agent-device` is a device automation CLI for AI agents. Its goal is to let agents open real apps on iOS, Android, TV, and desktop platforms, capture UI snapshots, interact with visible elements, and collect debugging evidence on demand. The repository README states: "A device automation CLI for real apps on iOS, Android, TV, and desktop. Agents get token-efficient snapshots, semantic refs, and evidence captured only when needed." Source: [README.md:1-15]().

Internally the CLI splits commands by domain into five modules: management, interaction, capture, observability, and recording:

| Domain | Key commands | Entry file |
| --- | --- | --- |
| Management (apps/admin) | `openApp`, `closeApp`, `listApps`, `devices`, `boot`, `shutdown`, `install` | `src/commands/management/runtime/apps.ts`, `admin.ts` |
| Interaction | `tap`, `swipe`, `typeText`, `find`, `get`, `wait` | `src/commands/interaction/runtime/index.ts` |
| Capture | `snapshot`, `diffSnapshot` | `src/commands/capture/runtime/snapshot.ts` |
| Observability | `logs`, `network`, `perf` | `src/commands/observability/runtime/diagnostics.ts` |
| Recording | `record`, `trace` | `src/commands/recording/runtime/recording.ts` |

```mermaid
flowchart LR
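  CLI[CLI / MCP caller] --> Cmd[Command router]
  Cmd --> Mgmt[Management domain]
  Cmd --> Int[Interaction domain]
  Cmd --> Cap[Capture domain]
  Cmd --> Obs[Observability domain]
  Cmd --> Rec[Recording domain]
  Cap --> Snap[snapshot node tree]
  Obs --> Logs[redacted logs]
  Rec --> Vid[video / trace artifacts]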
```

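## 2. Command Runtime and Observability Evidence

At runtime every command passes through a common `RuntimeCommand` adapter layer before being forwarded to a concrete backend. For example, `logsCommand` checks whether the backend implements `readLogs` and otherwise throws an `UNSUPPORTED_OPERATION` error (source: [src/commands/observability/runtime/diagnostics.ts:30-50]()). Observability commands are bound to the runtime object by `bindObservabilityCommands`, which exposes three functions: `logs`, `network`, and `perf` (source: [src/commands/observability/runtime/index.ts:20-35]()).

Evidence collection redacts sensitive fields, including key names such as `authorization`, `cookie`, `token`, `secret`, `password`, `passwd`, and `api[-_]?key`, and truncates content according to `PAYLOAD_MAX_CHARS=2048` and `MESSAGE_MAX_CHARS=4096`. `formatLogsResult` sets `redacted: true` to tell the caller that the content has been rewritten (source: [src/commands/observability/runtime/diagnostics-format.ts:8-40]()). Log queries default to `limit=100` with a maximum of `500`; network queries default to `limit=25` with a maximum of `200`; perf sampling ranges from 100 to 60,000 ms with at most 20 metrics (source: [src/commands/observability/runtime/diagnostics.ts:22-28]()).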
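The defaults and bounds above can be sketched as small validation helpers. The numbers come from this document; the helper names `resolveLimit` and `requireSampleMs`, and the choice to reject out-of-range values rather than clamp them, are assumptions made for illustration.

```typescript
// Numbers quoted in the text; the object layout and helper names are hypothetical.
type LimitBounds = { defaultValue: number; max: number };

const LOGS_LIMIT: LimitBounds = { defaultValue: 100, max: 500 };
const NETWORK_LIMIT: LimitBounds = { defaultValue: 25, max: 200 };
const PERF_SAMPLE_MIN_MS = 100;
const PERF_SAMPLE_MAX_MS = 60_000;

// Use the default when no limit is given; otherwise require an integer in [1, max].
// Assumption: out-of-range values are rejected, not silently clamped.
function resolveLimit(requested: number | undefined, bounds: LimitBounds): number {
  if (requested === undefined) return bounds.defaultValue;
  if (!Number.isInteger(requested) || requested < 1 || requested > bounds.max) {
    throw new RangeError(`limit must be an integer between 1 and ${bounds.max}`);
  }
  return requested;
}

// Require a perf sampling window inside the documented 100 to 60,000 ms range.
function requireSampleMs(sampleMs: number): number {
  if (
    !Number.isInteger(sampleMs) ||
    sampleMs < PERF_SAMPLE_MIN_MS ||
    sampleMs > PERF_SAMPLE_MAX_MS
  ) {
    throw new RangeError(
      `sampleMs must be an integer between ${PERF_SAMPLE_MIN_MS} and ${PERF_SAMPLE_MAX_MS}`,
    );
  }
  return sampleMs;
}

const defaultLogsLimit = resolveLimit(undefined, LOGS_LIMIT);
const explicitNetworkLimit = resolveLimit(200, NETWORK_LIMIT);
```

Failing fast on an invalid limit gives an agent a clear error to correct, whereas clamping would quietly return fewer rows than it asked for.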

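## 3. Recording, Tracing, and Session Evidence

The `record` and `trace` commands both switch lifecycle state through `action: 'start' | 'stop'`. They accept an `out` field for the output location (`FileOutputRef`) and parameters such as `fps`, `quality`, and `hideTouches`. The result carries `path`, `telemetryPath`, and an `ArtifactDescriptor`, so an agent can upload or reference the artifact in later steps (source: [src/commands/recording/runtime/recording.ts:18-60]()).

The `snapshot` and `diffSnapshot` commands are the core of token-efficient evidence gathering. `SnapshotCommandResult` returns the node tree, a `truncated` flag, the app identifier, and an `unchanged` marker, so the agent can decide whether a fresh capture is needed (source: [src/commands/capture/runtime/snapshot.ts:38-58]()). In the management domain, `openApp`/`closeApp`/`listApps` together with `admin.devices`/`admin.boot`/`admin.install` maintain the "session" context. For example, `OpenAppCommandOptions` includes `launchArgs` and `relaunch`, and the returned `BackendResultEnvelope` carries the raw backend receipt (source: [src/commands/management/runtime/apps.ts:28-46](), [src/commands/management/runtime/admin.ts:60-110]()).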
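As an illustration of how an agent might use the `truncated` and `unchanged` flags, consider the sketch below. The result shape is hypothetical: only the flag names come from the text, while `nodes`, `appId`, `SnapshotNode`, and the three decision labels are invented for this example.

```typescript
// Hypothetical shapes; only `truncated` and `unchanged` are named in the text.
type SnapshotNode = { ref: string; label?: string; children?: SnapshotNode[] };
type SnapshotResultSketch = {
  nodes: SnapshotNode[];
  truncated: boolean;
  unchanged?: boolean;
  appId?: string;
};

type NextStep = "reuse-previous" | "narrow-scope" | "use-snapshot";

// Decide what to do with a snapshot result before spending tokens on it.
function decideNextStep(result: SnapshotResultSketch): NextStep {
  if (result.unchanged) return "reuse-previous"; // nothing new on screen
  if (result.truncated) return "narrow-scope"; // tree was cut off; ask for less
  return "use-snapshot";
}

const fresh = decideNextStep({ nodes: [{ ref: "e1", label: "Login" }], truncated: false });
const same = decideNextStep({ nodes: [], truncated: false, unchanged: true });
const cut = decideNextStep({ nodes: [{ ref: "e1" }], truncated: true });
```

Checking `unchanged` first means an idle screen costs the agent almost nothing, which is the point of a token-efficient snapshot.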

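## 4. Cross-Platform Runner and `.ad` Replay Scripts

On iOS, a lightweight XCUITest Runner provides atomic UI automation. The Runner embeds a small HTTP/TCP server and works with the TypeScript client [`src/platforms/ios/runner-client.ts`](https://github.com/callstack/agent-device/blob/main/src/platforms/ios/runner-client.ts). On the Swift side, the `Command` / `Response` models and the lifecycle, interaction, transport, and command dispatch responsibilities are split across several files so that contributors and LLM agents can load only what they need (source: [ios-runner/README.md:1-38]()). This is consistent with changes in the latest v0.17.6, such as "support external xctest runner artifact" and "add a no-op placeholder for the XCTest runner", which avoid building from source every time.

The community discussion about ".ad replay scripts do not support parameterization" (#432) reflects a gap in the current session/replay model: every value in a script must be a literal, so a script cannot be reused between `com.example.debug` and `com.example.prod`, and wait times cannot be tuned per device. CLI commands and session storage would need an interface for parameterized replay. One possible design injects values through environment variables or a `--set key=value` option and passes them along in the `launchArgs` of `OpenAppCommandOptions` / `CommandContext`. Source: [src/commands/management/runtime/apps.ts:24-34]() (community feedback, cited as a reference for future extension).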

## See Also

- [iOS Runner protocol and file layout](ios-runner/README.md)
- [Observability and the redaction pipeline](src/commands/observability/runtime/diagnostics-format.ts)
- [Snapshots and diff capture](src/commands/capture/runtime/snapshot.ts)
- [Management commands (apps/admin)](src/commands/management/runtime/apps.ts)

---

<a id='page-platforms'></a>

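## Platform Backends and Device Automation

### Related Pages

Related topics: [Project Overview and System Architecture](#page-overview), [CLI Commands, Sessions, and Evidence Capture](#page-cli-sessions), [Replay Scripts, Maestro Compatibility, and End-to-End Testing](#page-replay)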

<details>
<summary>Relevant source files</summary>

The following source files were used to generate this page:

- [README.md](https://github.com/callstack/agent-device/blob/main/README.md)
- [package.json](https://github.com/callstack/agent-device/blob/main/package.json)
- [ios-runner/README.md](https://github.com/callstack/agent-device/blob/main/ios-runner/README.md)
- [src/commands/management/runtime/admin.ts](https://github.com/callstack/agent-device/blob/main/src/commands/management/runtime/admin.ts)
- [src/commands/management/runtime/apps.ts](https://github.com/callstack/agent-device/blob/main/src/commands/management/runtime/apps.ts)
- [src/commands/observability/runtime/diagnostics.ts](https://github.com/callstack/agent-device/blob/main/src/commands/observability/runtime/diagnostics.ts)
- [src/commands/observability/runtime/diagnostics-format.ts](https://github.com/callstack/agent-device/blob/main/src/commands/observability/runtime/diagnostics-format.ts)
- [src/commands/recording/runtime/recording.ts](https://github.com/callstack/agent-device/blob/main/src/commands/recording/runtime/recording.ts)
- [src/commands/interaction/runtime/index.ts](https://github.com/callstack/agent-device/blob/main/src/commands/interaction/runtime/index.ts)
- [src/commands/observability/runtime/index.ts](https://github.com/callstack/agent-device/blob/main/src/commands/observability/runtime/index.ts)
- [src/commands/management/runtime/admin-router.test.ts](https://github.com/callstack/agent-device/blob/main/src/commands/management/runtime/admin-router.test.ts)
- [src/commands/observability/runtime/diagnostics-router.test.ts](https://github.com/callstack/agent-device/blob/main/src/commands/observability/runtime/diagnostics-router.test.ts)
</details>

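# Platform Backends and Device Automation

## Overview and Architecture

`agent-device` is a device automation CLI for AI agents. Target platforms cover iOS, Android, tvOS, macOS, React Native, and desktop apps [`README.md:1-12`](https://github.com/callstack/agent-device/blob/main/README.md). The project positions it as the "mobile version of `agent-browser`": it gives coding agents token-efficient UI snapshots, semantic refs, and on-demand evidence capture, and a single CLI can drive real apps on real devices.

The runtime is built on two layers of abstraction. The upper layer is the command layer for users and MCP (management, interaction, observability, recording). The lower layer is the `AgentDeviceBackend` interface and its per-platform implementations (such as the iOS XCUITest Runner and Android ADB). This layering lets modules such as `apps`, `admin`, and `diagnostics` share the same semantics across platforms [`package.json:39-58`](https://github.com/callstack/agent-device/blob/main/package.json). Keywords there such as `ios-simulator`, `android-emulator`, `xcuitest`, and `appium` also support the multi-platform positioning.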

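## iOS Platform Backend and the Backend Abstraction

The core of the iOS backend is the XCUITest Runner in the `ios-runner/` directory [`ios-runner/README.md:1-10`](https://github.com/callstack/agent-device/blob/main/ios-runner/README.md). It is a "lightweight XCUITest target that exposes UI automation through a small HTTP server": after startup the Runner listens for TCP connections, parses HTTP requests, and dispatches them to `execute*` implementations in Swift. The source is deliberately split into focused files (`RunnerTests+Transport.swift`, `RunnerTests+Interaction.swift`, `RunnerTests+Lifecycle.swift`, and others) to reduce the context that contributors and LLM agents need to read.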

```mermaid
graph LR
  A[CLI / MCP Server] --> B[Runtime Commands<br/>management · interaction · observability · recording]
  B --> C[AgentDeviceBackend<br/>unified interface]
  C --> D[iOS Runner<br/>XCUITest + HTTP]
  C --> E[Android Backend<br/>ADB + SnapshotHelper]
  C --> F[Other backends]
  D --> G[Simulator / device]
  E --> H[Emulator / device]
```

Each backend exposes its capabilities upward through the `AgentDeviceBackend` interface [`src/commands/management/runtime/admin-router.test.ts:1-30`](https://github.com/callstack/agent-device/blob/main/src/commands/management/runtime/admin-router.test.ts) (the tests inject a stub implementation). When a backend does not implement a primitive such as `listDevices`, `bootDevice`, `reinstallApp`, `readLogs`, `dumpNetwork`, or `measurePerf`, the runtime throws `AppError('UNSUPPORTED_OPERATION', ...)`. This is the uniform fallback strategy used by commands such as those in [`src/commands/management/runtime/admin.ts:43-58`](https://github.com/callstack/agent-device/blob/main/src/commands/management/runtime/admin.ts).
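The capability check can be pictured roughly as follows. This is a sketch under assumptions: the `AppError` class body and the `BackendSketch` type are written here for illustration, and only the error code and the message text `admin.devices is not supported by this backend` are taken from this document.

```typescript
// Illustrative error class; the real AppError in the repository may carry more fields.
class AppError extends Error {
  readonly code: string;
  constructor(code: string, message: string) {
    super(message);
    this.code = code;
    this.name = "AppError";
  }
}

// Hypothetical backend shape: every primitive is optional.
type BackendSketch = {
  listDevices?: () => string[];
  bootDevice?: (id: string) => void;
};

// Forward to the backend primitive, or fail with a uniform error code.
function devicesCommand(backend: BackendSketch): string[] {
  if (!backend.listDevices) {
    throw new AppError(
      "UNSUPPORTED_OPERATION",
      "admin.devices is not supported by this backend",
    );
  }
  return backend.listDevices();
}

const devices = devicesCommand({ listDevices: () => ["sim-1"] });

// A caller can branch on the code instead of parsing the message.
let unsupportedCode = "";
try {
  devicesCommand({});
} catch (error) {
  if (error instanceof AppError) unsupportedCode = error.code;
}
```

Because the command layer contains no platform branches, adding a backend only means implementing more of the optional primitives.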

## Device and App Management Commands

Device and app lifecycle commands are exposed by [`src/commands/management/runtime/admin.ts`](https://github.com/callstack/agent-device/blob/main/src/commands/management/runtime/admin.ts) and [`src/commands/management/runtime/apps.ts`](https://github.com/callstack/agent-device/blob/main/src/commands/management/runtime/apps.ts) together:

- **Device layer**: `devicesCommand` / `bootCommand` / `shutdownCommand` enumerate, boot, and shut down simulators or devices, with support for `BackendDeviceFilter` and `BackendDeviceTarget`.
- **Install layer**: `installCommand` / `reinstallCommand` / `installFromSourceCommand` share one helper, `runInstallCommand`; they differ only in whether the backend's `installApp` or `reinstallApp` primitive is used [`admin.ts:88-118`](https://github.com/callstack/agent-device/blob/main/src/commands/management/runtime/admin.ts).
- **App layer**: `OpenAppCommandOptions` supports `BackendOpenTarget`, `launchArgs`, and a `relaunch` flag. App event names must match the regular expression `^[A-Za-z0-9_.:-]{1,64}$`, and the payload size is limited to 8 KiB [`apps.ts:23-25`](https://github.com/callstack/agent-device/blob/main/src/commands/management/runtime/apps.ts).
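The app event constraints in the last item can be sketched as a validator. The regular expression and the 8 KiB figure are quoted from this document. The constant name `MAX_PAYLOAD_BYTES`, the function names, and the assumption that the limit applies to the UTF-8 size of the JSON-serialized payload are all hypothetical.

```typescript
// Pattern quoted in the text.
const APP_EVENT_NAME_PATTERN = /^[A-Za-z0-9_.:-]{1,64}$/;
// Hypothetical constant name for the documented 8 KiB limit.
const MAX_PAYLOAD_BYTES = 8 * 1024;

// Count UTF-8 bytes without relying on platform globals such as TextEncoder.
function utf8ByteLength(text: string): number {
  let bytes = 0;
  for (let i = 0; i < text.length; i++) {
    const code = text.charCodeAt(i);
    if (code < 0x80) bytes += 1;
    else if (code < 0x800) bytes += 2;
    else if (code >= 0xd800 && code <= 0xdbff) {
      bytes += 4; // surrogate pair encodes one 4-byte code point
      i++; // skip the low surrogate
    } else bytes += 3;
  }
  return bytes;
}

// Assumption: the limit is measured on the JSON-serialized payload.
function validateAppEvent(name: string, payload: unknown): void {
  if (!APP_EVENT_NAME_PATTERN.test(name)) {
    throw new Error(`Invalid app event name: ${name}`);
  }
  const size = utf8ByteLength(JSON.stringify(payload ?? null));
  if (size > MAX_PAYLOAD_BYTES) {
    throw new Error(`Payload is ${size} bytes; the limit is ${MAX_PAYLOAD_BYTES}`);
  }
}

validateAppEvent("cart.item:added", { id: 1 }); // passes silently
```

Measuring bytes rather than characters matters for non-ASCII payloads, where one character can take up to four bytes.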

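[`src/commands/management/runtime/admin-router.test.ts`](https://github.com/callstack/agent-device/blob/main/src/commands/management/runtime/admin-router.test.ts) uses a stub backend to verify that every command "calls the typed backend primitive". For example, `devices`, `install`, `boot`, and `shutdown` all forward directly to the underlying method without platform branching in the command layer, so adding a new backend does not require changes to the commands above it.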

## Diagnostics, Recording, and Interaction

The diagnostics layer consists of [`src/commands/observability/runtime/diagnostics.ts`](https://github.com/callstack/agent-device/blob/main/src/commands/observability/runtime/diagnostics.ts) and [`src/commands/observability/runtime/diagnostics-format.ts`](https://github.com/callstack/agent-device/blob/main/src/commands/observability/runtime/diagnostics-format.ts). It exposes three capabilities, aggregated into the `diagnostics` namespace by [`src/commands/observability/runtime/index.ts`](https://github.com/callstack/agent-device/blob/main/src/commands/observability/runtime/index.ts):

- **Log reading** (`backend.readLogs`): 100 entries by default, 500 at most;
- **Network dump** (`backend.dumpNetwork`): 25 entries by default, 200 at most;
- **Perf sampling** (`backend.measurePerf`): a sampling window of 100 to 60,000 ms, 20 metrics at most.

The output layer redacts and truncates during formatting: messages are capped at 4096 characters and payloads at 2048 characters, and sensitive fields are detected with `SECRET_KEY_PATTERN` (`authorization|cookie|token|secret|password|passwd|api[-_]?key`) [`diagnostics-format.ts:14-23`](https://github.com/callstack/agent-device/blob/main/src/commands/observability/runtime/diagnostics-format.ts). The test [`src/commands/observability/runtime/diagnostics-router.test.ts:1-40`](https://github.com/callstack/agent-device/blob/main/src/commands/observability/runtime/diagnostics-router.test.ts) explicitly asserts `redacted: true`, which shows that sensitive data is masked by default. This matters a great deal when logs are fed to an LLM agent.

The recording layer [`src/commands/recording/runtime/recording.ts`](https://github.com/callstack/agent-device/blob/main/src/commands/recording/runtime/recording.ts) starts and stops screen recording and system traces through `recordCommand` and `traceCommand`. It supports parameters such as `fps`, `quality`, and `hideTouches`, and publishes artifacts to the client or server through `ArtifactAdapter.reserveOutput`.

The interaction layer [`src/commands/interaction/runtime/index.ts`](https://github.com/callstack/agent-device/blob/main/src/commands/interaction/runtime/index.ts) exposes query commands such as `find`, `getText`, `getAttrs`, `is`, `wait`, and `waitForText`, and interaction commands such as `tap`, `doubleTap`, `fill`, `typeText`, `focus`, `longPress`, `swipe`, `scroll`, and `pinch`. `BoundSelectorCommands` binds the current `target` automatically, which reduces how much input the agent has to repeat.

Community issue **#432** points out that `.ad` replay scripts currently do not support parameterization, so one script is hard to reuse across app variants (such as `com.example.debug` and `com.example.prod`) and has to be copied instead. The v0.17.6 release notes show that the iOS Runner is evolving: PR #806 introduced support for "external XCTest runner artifacts", and PR #810 added recognition and classification of external runner launch flags. This suggests the Runner is moving from a built-in-only model to a dual model (built-in plus externally injected) to fit more complex CI environments and cross-team reuse.

## See Also

- iOS Runner protocol: [`ios-runner/RUNNER_PROTOCOL.md`](https://github.com/callstack/agent-device/blob/main/ios-runner/RUNNER_PROTOCOL.md)
- iOS Runner TypeScript client: [`src/platforms/ios/runner-client.ts`](https://github.com/callstack/agent-device/blob/main/src/platforms/ios/runner-client.ts)
- Android snapshot helper: [`android-snapshot-helper/src/main/java/com/callstack/agentdevice/snapshothelper/SnapshotInstrumentation.java`](https://github.com/callstack/agent-device/blob/main/android-snapshot-helper/src/main/java/com/callstack/agentdevice/snapshothelper/SnapshotInstrumentation.java)

---

<a id='page-replay'></a>

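## Replay Scripts, Maestro Compatibility, and End-to-End Testing

### Related Pages

Related topics: [CLI Commands, Sessions, and Evidence Capture](#page-cli-sessions), [Platform Backends and Device Automation](#page-platforms)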

<details>
<summary>Relevant source files</summary>

The following source files were used to generate this page:

- [src/commands/management/runtime/admin.ts](https://github.com/callstack/agent-device/blob/main/src/commands/management/runtime/admin.ts)
- [src/commands/management/runtime/apps.ts](https://github.com/callstack/agent-device/blob/main/src/commands/management/runtime/apps.ts)
- [src/commands/observability/runtime/diagnostics.ts](https://github.com/callstack/agent-device/blob/main/src/commands/observability/runtime/diagnostics.ts)
- [src/commands/observability/runtime/diagnostics-format.ts](https://github.com/callstack/agent-device/blob/main/src/commands/observability/runtime/diagnostics-format.ts)
- [src/commands/observability/runtime/index.ts](https://github.com/callstack/agent-device/blob/main/src/commands/observability/runtime/index.ts)
- [src/commands/recording/runtime/recording.ts](https://github.com/callstack/agent-device/blob/main/src/commands/recording/runtime/recording.ts)
- [src/commands/interaction/runtime/index.ts](https://github.com/callstack/agent-device/blob/main/src/commands/interaction/runtime/index.ts)
- [src/commands/capture/runtime/snapshot.ts](https://github.com/callstack/agent-device/blob/main/src/commands/capture/runtime/snapshot.ts)
- [README.md](https://github.com/callstack/agent-device/blob/main/README.md)
- [package.json](https://github.com/callstack/agent-device/blob/main/package.json)
</details>

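# Replay Scripts, Maestro Compatibility, and End-to-End Testing

## Project Positioning and the Role of Replay and Compatibility

`agent-device` is a mobile device automation CLI for AI coding agents. Its stated purpose is to "let agents verify app behavior on real devices and simulators instead of only reasoning about code". Source: [README.md:1-15](). The project is published as an npm package (`agent-device`), and the keywords in `package.json` explicitly include `maestro`, `appium`, `detox`, `e2e-testing`, `mobile-testing`, `qa-automation`, and `ai-testing`, which shows where it places itself in the testing and orchestration ecosystem. Source: [package.json:55-78]().

Given this positioning, **replay scripts (replay / `.ad` scripts)** serve to capture a successful human or agent session so that it can be re-run and wired into CI. The community discussion (issue #432) has already pointed out that `.ad` scripts lack parameterization: for example, they cannot switch between the `com.example.debug` and `com.example.prod` package names without copying the script, and wait durations cannot be parameterized.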

```mermaid
flowchart LR
  Agent[AI agent / human QA] -->|record / write| Script[".ad replay script"]
  Script -->|schedule| Runtime[AgentDeviceRuntime]
  Runtime --> Mgmt[Management commands<br/>admin / apps]
  Runtime --> Interact[Interaction commands<br/>click / type / swipe]
  Runtime --> Obs[Observability commands<br/>logs / network / perf]
  Runtime --> Rec[Recording commands<br/>record / trace]
  Runtime --> Cap[Capture commands<br/>snapshot]
  Interact --> Device[(iOS / Android / TV)]
  Obs --> Device
  Mgmt --> Device
```

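## Runtime Command Matrix: the Foundation for End-to-End Capabilities

What a `.ad` replay script can execute is determined by a set of runtime commands split by responsibility:

| Category | Entry | Key capabilities |
|---|---|---|
| Management | `admin` / `apps` | Device listing, boot/shutdown, app install/reinstall/install from source, app listing and open/close |
| Interaction | `selector` / `interactions` | Find elements by semantic ref, tap, fill, focus, long press, swipe, scroll, pinch |
| Observability | `diagnostics` | Logs, network, perf sampling, with redaction of sensitive data |
| Recording | `record` / `trace` | Screen recording and tracing, with performance and telemetry artifacts written to disk |
| Capture | `snapshot` / `diff` | UI snapshots, diff comparison, quality warnings |

For example, the `admin` namespace wraps the device lifecycle and software installation: `devicesCommand` lists the devices the backend supports; `bootCommand` boots the target device; `installCommand` / `reinstallCommand` / `installFromSourceCommand` reuse validation and backend calls by `mode` inside a shared `runInstallCommand`. Source: [src/commands/management/runtime/admin.ts:33-104](). The `apps` namespace provides `OpenAppCommandOptions` (with `launchArgs` and `relaunch`) and `CloseAppCommandOptions`, constrains app event names with the regular expression `APP_EVENT_NAME_PATTERN = /^[A-Za-z0-9_.:-]{1,64}$/`, and limits push payloads to 8 KiB. Source: [src/commands/management/runtime/apps.ts:9-23]().

The interaction layer uses `SelectorCommands` and `BoundSelectorCommands` to wrap selectors "by ref, text, or attribute" as semantic actions such as `find / get / getText / getAttrs / is / isVisible / isHidden / wait / waitForText`. These map down to base gestures such as `clickCommand / fillCommand / typeTextCommand / focusCommand / longPressCommand / swipeCommand / scrollCommand / pinchCommand`. Source: [src/commands/interaction/runtime/index.ts:1-52](). This layering lets `.ad` scripts express high-level semantics and still fall back to low-level coordinates.

The observability layer supplies the evidence side of end-to-end testing. The `logs / network / perf` commands all accept `appId` or `appBundleId` to scope the query and support `since / until / cursor / limit` paging through `DiagnosticsPageOptions`. Constants such as `PERF_SAMPLE_MIN_MS = 100`, `PERF_SAMPLE_MAX_MS = 60_000`, and `PERF_METRICS_MAX = 20` guard against misuse at the source. Source: [src/commands/observability/runtime/diagnostics.ts:9-30](). Results pass through `formatLogsResult` / `formatNetworkResult` / `formatPerfResult`: sensitive fields (such as `authorization/cookie/token/secret/password/api[-_]?key`) are replaced by `redactAndTruncate`, content is truncated to `MESSAGE_MAX_CHARS = 4096` and `PAYLOAD_MAX_CHARS = 2048`, and the result is marked `redacted: true`. Source: [src/commands/observability/runtime/diagnostics-format.ts:1-18](). `diagnosticsCommands` is further bound through `bindObservabilityCommands` to give the SDK convenience entry points that take no `runtime` argument. Source: [src/commands/observability/runtime/index.ts:1-35]().

The recording layer provides `recordCommand` (with `fps / quality / hideTouches`) and the trace command; all artifacts reserve their output path through `reserveCommandOutput`. Source: [src/commands/recording/runtime/recording.ts:1-30](). The capture layer's `snapshot` also produces a `SnapshotDiagnosticsSummary`, the unchanged marker `SnapshotUnchanged`, and React Native overlay warnings, which makes it a natural basis for before/after assertions in `.ad` scripts. Source: [src/commands/capture/runtime/snapshot.ts:1-30]().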

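## Current State of `.ad` Replay Scripts and the Case for Parameterization

According to community feedback (issue #432), every value in a `.ad` replay script must currently be a literal and cannot be parameterized. Typical pain points include:

- **Reuse across package names**: switching `appId`/`appBundleId` between debug and prod builds requires copying the script;
- **Timing tuning**: wait durations (`wait / sleep`) and retry counts are hard-coded, so differences between CI and a local machine mean editing the file;
- **Environment switching**: URLs, usernames, passwords, and similar values differ between environments (staging/prod).

Although the source snapshot used here does not show the full implementation under `src/replay/*`, the runtime commands already have type boundaries that would accommodate parameterization and variable interpolation: `DiagnosticsPageOptions` supports optional fields such as `since/until/limit/cursor`; `OpenAppCommandOptions` accepts `launchArgs: string[]`; `RecordingRecordCommandOptions` accepts `fps/quality/hideTouches`; and every command returns a `*CommandResult` with a `kind` discriminator. This suggests the `.ad` script engine could introduce variable binding at parse time (for example a `${VAR}` syntax with environment or CLI overrides) without changing the command contracts themselves.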
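A parse-time variable binding of the kind suggested above might look like the sketch below. None of this exists in `agent-device` today according to issue #432: the `${VAR}` syntax, the `interpolate` function, the variable names, and the placeholder script line `open ${APP_ID}` are all hypothetical and do not describe the real `.ad` grammar.

```typescript
// Hypothetical `${VAR}` substitution for one script line.
// Unknown variables raise an error instead of silently becoming empty strings.
function interpolate(line: string, variables: Record<string, string>): string {
  return line.replace(/\$\{([A-Za-z_][A-Za-z0-9_]*)\}/g, (_match, name: string) => {
    const value = variables[name];
    if (value === undefined) {
      throw new Error(`Undefined variable: ${name}`);
    }
    return value;
  });
}

// Defaults could come from the script header, overrides from the environment or CLI.
const defaults: Record<string, string> = { APP_ID: "com.example.debug", WAIT_MS: "500" };
const overrides: Record<string, string> = { APP_ID: "com.example.prod" };

// Later sources win, so the CLI override replaces the default package name.
const resolved = interpolate("open ${APP_ID}", { ...defaults, ...overrides });
const untouched = interpolate("snapshot", defaults);
```

Resolving variables before commands are dispatched keeps the typed command options unchanged, which is why the command contracts would not need to move.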

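## Compatibility Positioning Relative to Maestro/Appium/Detox

`package.json` lists `maestro`, `appium`, and `detox` together as keywords, which implies a strategy of competing in the same space while treating those tools as interoperability references. Source: [package.json:55-78](). Architecturally, this compatibility rests on three layers of abstraction:

1. **Backend adapter layer**: whether `runtime.backend` provides methods such as `listDevices / bootDevice / installApp / reinstallApp / readLogs / dumpNetwork / measurePerf / recordScreen / recordTrace` is the only precondition for a command to run. When a method is missing, the command throws `UNSUPPORTED_OPERATION`, for example `admin.devices is not supported by this backend`. Source: [src/commands/management/runtime/admin.ts:33-40]().
2. **Result envelope (BackendResultEnvelope)**: every `*CommandResult` extends `BackendResultEnvelope`, which makes it easier to serialize evidence from different backends (XCUITest, UIAutomator, a Maestro bridge, an Appium bridge) in one format.
3. **Observability redaction and paging**: the `diagnostics` subsystem builds secret detection and truncation into `formatXxxResult`, so collecting network or log evidence in E2E cases does not leak credentials. Source: [src/commands/observability/runtime/diagnostics-format.ts:5-18]().

For `.ad` script authors this means a script only describes an action sequence plus the expected evidence, and different backends can execute it. If the backend were switched to a Maestro-compatible channel, the same script should be able to re-run without modification. This is also where the community's request for "parameterization plus multi-backend reuse" would pay off.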

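## Common Failure Modes and Where to Start Troubleshooting

- **Missing backend capability**: commands such as `admin.devices` or `diagnostics.network` throw `UNSUPPORTED_OPERATION`, which indicates that the current backend does not implement the corresponding primitive. Source: [src/commands/management/runtime/admin.ts:33-40]() and [src/commands/observability/runtime/diagnostics.ts:38-50]().
- **Input out of range**: checks such as `requireIntInRange` reject perf sampling requests with `sampleMs < 100` or `> 60_000`. Source: [src/commands/observability/runtime/diagnostics.ts:13-19]().
- **Invalid app event name or payload**: `APP_EVENT_NAME_PATTERN` and the 8 KiB limit block push / event payloads that do not meet the rules. Source: [src/commands/management/runtime/apps.ts:9-13]().
- **Concerns about sensitive data leaks**: fields whose names match the sensitive-key pattern are replaced by `redactAndTruncate`, content beyond `MESSAGE_MAX_CHARS` and `PAYLOAD_MAX_CHARS` is truncated, and the result is marked `redacted: true` for auditing. Source: [src/commands/observability/runtime/diagnostics-format.ts:5-18]().
- **Scripts coupled to literals**: this is the "cannot parameterize" problem described in issue #432; it would be eased by a future variable binding mechanism with `.env` / CLI overrides.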

## See Also

- Management commands: `src/commands/management/runtime/admin.ts`, `src/commands/management/runtime/apps.ts`
- Interaction commands: `src/commands/interaction/runtime/index.ts`
- Observability commands: `src/commands/observability/runtime/diagnostics.ts`, `src/commands/observability/runtime/index.ts`
- Recording commands: `src/commands/recording/runtime/recording.ts`
- Snapshots and diffs: `src/commands/capture/runtime/snapshot.ts`
- Project overview: [README.md](https://github.com/callstack/agent-device/blob/main/README.md)
- Community discussion: [issue #432 Parametrise `.ad` replay scripts](https://github.com/callstack/agent-device/issues/432)

---


## Doramagic Pitfall Log

Project: callstack/agent-device

Summary: 12 potential pitfalls found, 0 of them high/blocking. Highest priority: installation pitfall, source evidence: Apps discovery should default to user-installed apps and make all-apps explicit.

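## 1. Installation pitfall · Source evidence: Apps discovery should default to user-installed apps and make all-apps explicit

- Severity: medium
- Evidence strength: source_linked
- Finding: GitHub community evidence points to an unverified installation-related issue in this project: Apps discovery should default to user-installed apps and make all-apps explicit
- User impact: may raise the cost of trial use and production adoption for new users.
- Evidence: community_evidence:github | https://github.com/callstack/agent-device/issues/538 | Unverified usage condition surfaced by source type github_issue.

## 2. Installation pitfall · Source evidence: Native diagnostics API rollout

- Severity: medium
- Evidence strength: source_linked
- Finding: GitHub community evidence points to an unverified installation-related issue in this project: Native diagnostics API rollout
- User impact: may block installation or first run.
- Evidence: community_evidence:github | https://github.com/callstack/agent-device/issues/694 | The source discussion mentions node-related conditions; re-check before installing or trying it out.

## 3. Configuration pitfall · Source evidence: Implement Android CPU and memory sampling in perf payload

- Severity: medium
- Evidence strength: source_linked
- Finding: GitHub community evidence points to an unverified configuration-related issue in this project: Implement Android CPU and memory sampling in perf payload
- User impact: may raise the cost of trial use and production adoption for new users.
- Evidence: community_evidence:github | https://github.com/callstack/agent-device/issues/126 | Unverified usage condition surfaced by source type github_issue.

## 4. Capability pitfall · Source evidence: Comparisons with https://mobilenext.ai/#products

- Severity: medium
- Evidence strength: source_linked
- Finding: GitHub community evidence points to an unverified issue in this project about understanding its capabilities: Comparisons with https://mobilenext.ai/#products
- User impact: may raise the cost of trial use and production adoption for new users.
- Evidence: community_evidence:github | https://github.com/callstack/agent-device/issues/808 | Unverified usage condition surfaced by source type github_issue.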

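## 5. Capability pitfall · Capability assessment relies on assumptions

- Severity: medium
- Evidence strength: source_linked
- Finding: README/documentation is current enough for a first validation pass.
- User impact: if the assumption does not hold, users will not get the promised capability.
- Evidence: capability.assumptions | https://www.npmjs.com/package/agent-device | README/documentation is current enough for a first validation pass.

## 6. Maintenance pitfall · Maintenance activity unknown

- Severity: medium
- Evidence strength: source_linked
- Finding: `last_activity_observed` was not recorded.
- User impact: new, abandoned, and active projects become indistinguishable, which lowers confidence in recommendations.
- Evidence: evidence.maintainer_signals | https://www.npmjs.com/package/agent-device | last_activity_observed missing

- Severity: medium
- Evidence strength: source_linked
- Finding: no_demo
- Evidence: downstream_validation.risk_items | https://www.npmjs.com/package/agent-device | no_demo; severity=medium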

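## 8. Security/permissions pitfall · Scoring risk present

- Severity: medium
- Evidence strength: source_linked
- Finding: no_demo
- User impact: the risk affects whether the package is suitable for ordinary users to install.
- Evidence: risks.scoring_risks | https://www.npmjs.com/package/agent-device | no_demo; severity=medium

## 9. Security/permissions pitfall · Source evidence: Add Android native CPU and Perfetto profiling under perf

- Severity: medium
- Evidence strength: source_linked
- Finding: GitHub community evidence points to an unverified security/permissions-related issue in this project: Add Android native CPU and Perfetto profiling under perf
- User impact: may block installation or first run.
- Evidence: community_evidence:github | https://github.com/callstack/agent-device/issues/696 | The source discussion mentions npm-related conditions; re-check before installing or trying it out.

## 10. Security/permissions pitfall · Source evidence: Add narrow debug symbols workflow for crash symbolication

- Severity: medium
- Evidence strength: source_linked
- Finding: GitHub community evidence points to an unverified security/permissions-related issue in this project: Add narrow debug symbols workflow for crash symbolication
- User impact: may block installation or first run.
- Evidence: community_evidence:github | https://github.com/callstack/agent-device/issues/699 | The source discussion mentions node-related conditions; re-check before installing or trying it out.

## 11. Maintenance pitfall · Issue/PR response quality unknown

- Severity: low
- Evidence strength: source_linked
- Finding: issue_or_pr_quality=unknown.
- User impact: users cannot tell whether anyone will respond when they hit a problem.
- Evidence: evidence.maintainer_signals | https://www.npmjs.com/package/agent-device | issue_or_pr_quality=unknown

## 12. Maintenance pitfall · Release cadence unclear

- Severity: low
- Evidence strength: source_linked
- Finding: release_recency=unknown.
- User impact: install commands and docs may lag behind the code, raising the chance of users hitting problems.
- Evidence: evidence.maintainer_signals | https://www.npmjs.com/package/agent-device | release_recency=unknown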
