Doramagic Project Pack · Human Manual
AIPex
AIPex: AI browser automation assistant, no migration and privacy first. Alternative to Manus Browser Operator, Claude Chrome, and Agent Browser.
Repository Overview & Architecture
Related topics: Browser Automation: DOM Snapshot, Locator & Tools, AI Chat, MCP Bridge & Skills, Extension UI, Build, Deployment & Operations
1. Purpose and Scope
AIPex is an open-source browser automation project that lets users drive a Chromium-based browser with natural-language commands. Per the top-level README.md, the project's headline positioning is "Automate your browser with natural language commands — The open source browser-use solution." The repository combines:
- A published Chrome / Edge extension (the user-facing product).
- A pure-TypeScript AI agent framework reusable outside the extension.
- An MCP bridge that exposes AIPex tools to external agents such as Cursor, Claude Code, and GitHub Copilot.
- A reusable DOM snapshot library that avoids the Chrome DevTools Protocol (CDP) Accessibility Tree.
According to packages/browser-ext/package.json, the extension is published under the name @aipexstudio/cool-aipex with display name "AIPex". The README.md confirms distribution on the Chrome Web Store and Microsoft Edge Add-ons. Firefox support is tracked as an open enhancement request in issue #1.
2. Repository Layout
The root package.json declares private: true and configures workspaces: ["packages/*"], making this a pnpm monorepo. The root scripts provide workspace-wide commands such as build, dev, preflight (format + lint + typecheck + test), and lint:dependencies (via knip). A postinstall hook runs prek install to install pre-commit hooks.
The monorepo contains six top-level deliverables:
| Path | Package name | Role | Source of record |
|---|---|---|---|
| packages/core | @aipexstudio/aipex-core | Pure-TS AI agent framework, depends on @openai/agents | packages/core/package.json |
| packages/aipex-react | @aipexstudio/aipex-react | React UI toolkit, hooks, chat & omni components, adapters | packages/aipex-react/package.json |
| packages/browser-runtime | @aipexstudio/browser-runtime | Browser automation runtime, CDP/DOM locator strategies, VM, ZenFS, skill manager | packages/browser-runtime/package.json |
| packages/browser-ext | @aipexstudio/cool-aipex | The AIPex browser extension (Side Panel, Options, Chatbot UI, MCP bridge panel) | packages/browser-ext/package.json |
| packages/dom-snapshot | @aipexstudio/dom-snapshot | DOM-based accessibility snapshot library | packages/dom-snapshot/package.json |
| mcp-bridge/ | (MCP bridge server) | Standalone WebSocket bridge exposing AIPex over MCP | mcp-bridge/package.json |
3. Architecture and Data Flow
AIPex follows a layered design. The extension is the user-facing shell; below it, the runtime exposes 30+ browser tools; the core package provides the agent loop; and the MCP bridge lets external agents reuse the same tools.
flowchart TB
subgraph Client["Browser (Chrome / Edge)"]
EXT["@aipexstudio/cool-aipex<br/>(Side Panel + Options UI)"]
UI["aipex-react components<br/>(chatbot, omni)"]
DOM["@aipexstudio/dom-snapshot<br/>(pure-DOM AXTree)"]
end
subgraph Runtime["Browser Runtime Layer"]
BR["@aipexstudio/browser-runtime<br/>CDP & DOM locators, VM, ZenFS, skills"]
end
subgraph Core["Agent Layer"]
CORE["@aipexstudio/aipex-core<br/>Agent loop (Zod, lru-cache)"]
SDK["@openai/agents + extensions"]
end
subgraph External["External Agents"]
MCP["mcp-bridge (WebSocket + MCP)"]
EXT_AGENTS["Cursor / Claude Code / Copilot"]
end
EXT --> UI --> BR
BR --> DOM
BR --> CORE --> SDK
MCP -. tools/list, tools/call .-> BR
EXT_AGENTS --> MCP
Key data flows:
- Snapshot capture: packages/dom-snapshot/README.md documents `collectDomSnapshotInPage()` and `collectDomSnapshot(document, options)` producing a serialized tree with stable `data-aipex-nodeid` attributes, hidden-element filtering, and same-origin iframe traversal.
- Tool execution: packages/browser-runtime/README.md describes two locator modes (`"cdp"` and `"dom"`), `SmartLocator`/`DomLocator` element handles, and runtime contracts (`RuntimeAddon`, `NoopBrowserAutomationHost`, `InMemoryOmniActionRegistry`, `NullInterventionHost`, `NoopContextProvider`).
- Sandboxed execution: packages/browser-runtime/src/lib/vm/zenfs-manager.ts shows file-system and rename primitives backed by QuickJS + ZenFS, including extension whitelisting for text previews surfaced through packages/browser-ext/src/pages/options/file-components/FilePreview.tsx.
- Skill lifecycle: packages/browser-runtime/src/skill/lib/services/skill-manager.ts bootstraps built-in skills (e.g. `wcag22-a11y-audit`) into IndexedDB with metadata describing version, capabilities, and enabled state.
- UI ↔ runtime messaging: packages/browser-ext/src/lib/message-adapter.ts converts between runtime `UIMessage` shapes and tool-call parts, handling business failures returned as `{ success: false, error }`.
- MCP integration: packages/browser-ext/src/pages/options/mcp-bridge-panel.tsx documents that the bridge exposes `tools/list` and `tools/call` over MCP and only accepts localhost connections (127.0.0.1, ::1).
4. Cross-Cutting Concerns and Community-Known Limitations
Several recurring themes from community issues directly map to architectural decisions:
- Browser coverage: Per issue #1, Firefox support is still open. The README advertises only Chrome and Edge store badges, and the codebase relies on Chromium-specific extension APIs, so Firefox support would require a framework-level refactor.
- Repository size: issue #53 tracks reducing `size-pack` from ~97 MiB. packages/browser-runtime/package.json lists heavy WebAssembly and file-system dependencies (`@jitl/quickjs-ng-wasmfile-release-sync`, `@zenfs/core`), although `size-pack` measures packed Git objects, so these contribute only to the extent that their artifacts are committed to the repository.
- Iframe handling in snapshots: issue #100 reports that `search_elements` does not recurse into iframes. packages/dom-snapshot/README.md confirms the current scope is "same-origin iframe support"; cross-origin iframes remain out of scope.
- MCP multi-session: issue #202 reports that only one Claude session can connect at a time. mcp-bridge/README.md describes support for multiple simultaneous clients, so the report points to a gap between documented and observed behavior. The localhost-only restriction noted in mcp-bridge-panel.tsx limits where clients may connect from, not how many.
- Conversation sharing: packages/browser-ext/src/services/share-conversation.ts shows that shareable conversations are uploaded via `fetch` with an auth cookie; screenshots are omitted from the share payload, which reads as a deliberate privacy choice rather than a bug.
Together these patterns position AIPex as a layered monorepo: a stable core agent and snapshot library, a feature-rich runtime, a React UI toolkit, and a thin extension shell that can be re-targeted (in principle) at additional hosts beyond Chromium.
See Also
- packages/dom-snapshot/README.md: DOM snapshot API and iframe scope.
- packages/browser-runtime/README.md: CDP/DOM locator contracts and test setup.
- mcp-bridge/README.md: Advanced MCP bridge options.
- skill/SKILL.md: The `aipex-browser` skill definition for agent runtimes.
- DEVELOPMENT.md: Local development setup.
Source: https://github.com/AIPexStudio/AIPex / Human Manual
Browser Automation: DOM Snapshot, Locator & Tools
Related topics: Repository Overview & Architecture, AI Chat, MCP Bridge & Skills
Overview
AIPex is an open-source browser-automation agent that lets users drive Chrome/Edge with natural-language commands. The automation stack is split across the monorepo into two cooperating layers: @aipexstudio/dom-snapshot, a pure-JS/TS page serializer, and @aipexstudio/browser-runtime, which packages that serializer with CDP-based automation, React-agnostic locators, and a curated FunctionTool bundle exposed to LLM agents. Source: packages/dom-snapshot/README.md and packages/browser-runtime/README.md.
The package manifest confirms the dependency direction: browser-runtime declares @aipexstudio/dom-snapshot: workspace:* as a dependency, and browser-ext (the actual Chrome MV3 extension) consumes browser-runtime and aipex-core. Source: packages/browser-runtime/package.json.
flowchart LR
LLM[LLM Agent] -->|FunctionTool call| Tools[allBrowserTools]
Tools --> Locator{Strategy}
Locator -->|cdp| Smart[SmartLocator / DebuggerManager]
Locator -->|dom| Dom[DomLocator / DomElementHandle]
Smart --> Page[CDP via chrome.debugger]
Dom --> Snap[collectDomSnapshot]
Snap --> Iframe[Same-origin iframe traversal]
Snap --> Text[buildTextSnapshot / formatSnapshot]
Text -->|searchSnapshotText| Tools
DOM Snapshot: Collection, Formatting, Search
@aipexstudio/dom-snapshot was created specifically to avoid the Chrome DevTools Protocol (CDP) Accessibility Tree dependency, which the README cites as having "browser dependency," "performance overhead," and "complex setup" drawbacks. Source: packages/dom-snapshot/README.md.
The public API exposes two top-level helpers:
- `collectDomSnapshotInPage()`: snapshots the current document.
- `collectDomSnapshot(document, options?)`: snapshots an explicit document, with options (`maxTextLength`, `includeHidden`, `captureTextNodes`).
Each call returns a serialized tree (root, idToNode, totalNodes, metadata.url) and stamps every interactive element with a stable data-aipex-nodeid attribute. Source: packages/dom-snapshot/README.md.
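To make the returned shape concrete, the following is a minimal sketch rather than the library's actual types. Only the top-level field names (`root`, `idToNode`, `totalNodes`, `metadata.url`) come from the README; the per-node fields and the helper `indexSnapshot` are hypothetical, introduced purely for illustration.

```typescript
// Hypothetical node shape; the real library's node fields may differ.
interface SketchNode {
  id: string; // assumed to mirror the data-aipex-nodeid value
  role: string;
  name: string;
  children: SketchNode[];
}

// Top-level fields follow the README's description of the result.
interface SketchSnapshot {
  root: SketchNode;
  idToNode: Record<string, SketchNode>;
  totalNodes: number;
  metadata: { url: string };
}

// Build the id index and the node count by walking the tree once.
function indexSnapshot(root: SketchNode, url: string): SketchSnapshot {
  const idToNode: Record<string, SketchNode> = {};
  const stack: SketchNode[] = [root];
  while (stack.length > 0) {
    const node = stack.pop()!;
    idToNode[node.id] = node;
    for (const child of node.children) stack.push(child);
  }
  return {
    root,
    idToNode,
    totalNodes: Object.keys(idToNode).length,
    metadata: { url },
  };
}

const demoSnapshot = indexSnapshot(
  {
    id: "n1",
    role: "main",
    name: "",
    children: [
      { id: "n2", role: "button", name: "Submit", children: [] },
      { id: "n3", role: "textbox", name: "Email", children: [] },
    ],
  },
  "https://example.com/",
);
```

A flat `idToNode` index is what lets a tool resolve a node id from the model's output back to an element without re-walking the tree.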
Text rendering is a two-step pipeline. buildTextSnapshot produces a TextSnapshot; formatSnapshot emits the human/LLM-readable form with three prefix markers: * for the focused element, → for ancestors of focus, and a single space for normal nodes. Source: packages/dom-snapshot/README.md.
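The prefix markers can be illustrated with a small sketch. This is not the library's `formatSnapshot`; the node type `FocusNode`, the line layout, and the function `formatSketch` are assumptions. Only the three markers (`*`, `→`, and a single space) come from the README.

```typescript
// Hypothetical input node; the real TextSnapshot shape is not shown in the source.
interface FocusNode {
  role: string;
  name: string;
  focused?: boolean;
  children?: FocusNode[];
}

// True when the node itself or any descendant holds focus.
function containsFocus(node: FocusNode): boolean {
  return node.focused === true || (node.children ?? []).some(containsFocus);
}

// Emit one line per node: marker, two-space indentation per level, role and name.
function formatSketch(node: FocusNode, depth = 0, out: string[] = []): string[] {
  const marker = node.focused ? "*" : containsFocus(node) ? "→" : " ";
  out.push(`${marker}${"  ".repeat(depth)}${node.role} "${node.name}"`);
  for (const child of node.children ?? []) formatSketch(child, depth + 1, out);
  return out;
}

const formLines = formatSketch({
  role: "form",
  name: "Login",
  children: [
    { role: "textbox", name: "Email", focused: true },
    { role: "button", name: "Sign in" },
  ],
});
```

Here the form is marked `→` because it is an ancestor of the focused textbox, the textbox is marked `*`, and the button keeps the plain space prefix.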
searchSnapshotText(formatted, query, options?) supports both literal and |-separated multi-term queries, plus glob patterns when useGlob is enabled. Source: packages/dom-snapshot/README.md.
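A rough sketch of the two query modes follows. It is not the library's implementation: `searchSketch` and `globToRegExp` are hypothetical names, and line-level, case-insensitive matching is an assumption made for this example.

```typescript
// Convert a glob using * and ? into a case-insensitive RegExp (sketch only).
function globToRegExp(glob: string): RegExp {
  const escaped = glob.replace(/[.+^${}()|[\]\\]/g, "\\$&");
  return new RegExp(escaped.replace(/\*/g, ".*").replace(/\?/g, "."), "i");
}

// Return every line of the formatted snapshot that matches at least one term.
// Terms are separated by "|"; with useGlob each term is treated as a glob.
function searchSketch(formatted: string, query: string, useGlob = false): string[] {
  const terms = query
    .split("|")
    .map((term) => term.trim())
    .filter((term) => term.length > 0);
  const matchers = terms.map((term) => {
    if (useGlob) {
      const pattern = globToRegExp(term);
      return (line: string) => pattern.test(line);
    }
    const needle = term.toLowerCase();
    return (line: string) => line.toLowerCase().includes(needle);
  });
  return formatted.split("\n").filter((line) => matchers.some((match) => match(line)));
}
```

With this sketch, a query such as `sign in | email` returns lines that contain either term, while `forgot*word` only matches when glob mode is on.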
The release history shows steady refinement of the serializer: cursor-pointer detection for interactive elements (v0.0.14), explicit accessibility-label support (v0.0.13), and StaticText extraction for non-interactive containers (v0.0.11). Source: GitHub Releases v0.0.11–v0.0.14.
Locators and Element Handles
browser-runtime ships two parallel locator strategies, both surfaced through SnapshotManager:
| Component | Strategy | Use when |
|---|---|---|
| SmartLocator / SmartElementHandle | cdp | Chrome/Edge with chrome.debugger available; needs full AXTree |
| DomLocator / DomElementHandle | dom | Any browser context; works without CDP using @aipexstudio/dom-snapshot |
Source: packages/browser-runtime/README.md.
The DOM-based path automatically tracks coordinate offsets for nested same-origin iframes, which is necessary because an element's bounding rectangle is only valid relative to its own document. The CDP path is the higher-fidelity option when an extension can attach the debugger to a tab.
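The offset bookkeeping can be sketched as follows. This is an illustration under simplifying assumptions, not the runtime's code: `toTopLevelRect` is a hypothetical helper, and iframe borders, padding, and CSS transforms are ignored.

```typescript
interface Rect {
  x: number;
  y: number;
  width: number;
  height: number;
}

// frameOrigins lists the top-left corner of each enclosing iframe, measured in
// its parent document's viewport, ordered from outermost to innermost.
function toTopLevelRect(local: Rect, frameOrigins: Array<{ x: number; y: number }>): Rect {
  let dx = 0;
  let dy = 0;
  for (const origin of frameOrigins) {
    dx += origin.x;
    dy += origin.y;
  }
  // Size is unchanged; only the position is translated into top-level coordinates.
  return { x: local.x + dx, y: local.y + dy, width: local.width, height: local.height };
}

const topLevelRect = toTopLevelRect(
  { x: 10, y: 20, width: 100, height: 30 },
  [
    { x: 50, y: 100 },
    { x: 5, y: 5 },
  ],
);
```

Summing the origins of every enclosing frame is what turns a rectangle measured inside a nested iframe into coordinates a top-level click or highlight can use.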
Tools Bundle and Tool Manager
allBrowserTools is the curated agent surface — 32 FunctionTools grouped into seven categories: tabs (7), UI ops (7), page content (4), screenshots (2), downloads (2), interventions (4), and skills (6). Source: packages/browser-runtime/README.md.
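As a quick consistency check on the stated breakdown, the per-category counts can be tallied. The object below simply restates the numbers from the paragraph above; the variable names are illustrative.

```typescript
// Category counts as stated in the runtime README summary above.
const toolCountsByCategory: Record<string, number> = {
  tabs: 7,
  "UI ops": 7,
  "page content": 4,
  screenshots: 2,
  downloads: 2,
  interventions: 4,
  skills: 6,
};

// Sum the counts; this should equal the advertised size of the bundle.
const totalTools = Object.values(toolCountsByCategory).reduce((sum, n) => sum + n, 0);
```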
The browser extension wraps that bundle in a ToolManager singleton that supports dynamic registration/unregistration, subscriber notifications via ToolEventType, and exposes category metadata. Source: packages/browser-ext/src/services/tool-manager.ts.
Key tool names used by the agent:
- `search_elements`: runs `searchSnapshotText` over the current snapshot.
- `click`, `hover_element_by_uid`, `fill_element_by_uid`, `fill_form`, `get_editor_value`.
- `get_page_metadata`, `scroll_to_element`, `highlight_element`, `highlight_text_inline`.
- `capture_screenshot`, `capture_tab_screenshot`: see `isCaptureScreenshotTool` in packages/aipex-react/src/lib/screenshot-utils.ts.
- Skill tools: `load_skill`, `execute_skill_script`, plus four others. Built-in skills such as `wcag22-a11y-audit` are auto-seeded into IndexedDB. Source: packages/browser-runtime/src/skill/lib/services/skill-manager.ts.
The extension wires these into the side-panel chat via the useBrowserTools hook inside app-root.tsx, which composes BrowserContextProviders, ChromeStorageAdapters, and the InputModeProvider. Source: packages/browser-ext/src/pages/common/app-root.tsx.
Known Limitations and Community Issues
Several open issues are worth flagging for any contributor touching this stack:
- Iframe coverage is limited: Issue #100 reports that `search_elements` cannot locate elements inside pages with iframes and points to chrome-devtools-mcp, Playwright, and Stagehand as reference implementations. The README states only "same-origin iframe support" today. Source: packages/dom-snapshot/README.md.
- Firefox support is not implemented: Issue #1 tracks this. Because the current runtime depends on Chromium `chrome.*` extension APIs such as `chrome.debugger`, porting would require the planned framework refactor mentioned in the issue. Source: packages/browser-runtime/README.md.
- "Requested device not found": Issue #80 discusses a runtime error with this message. The material summarized here does not establish a root cause, so reproduce the error against the issue thread before assuming one.
- MCP bridge sessions: Issue #202 notes that opening a second Claude window fails to connect, which suggests that in practice the bridge serves only one consumer at a time. Source: packages/browser-ext/src/pages/options/mcp-bridge-panel.tsx.
- Repository size: Issue #53 tracks a 97 MiB pack size; contributors should be mindful before adding large binary assets to history.
See Also
- README.md: top-level project introduction and installation
- `packages/core` (`@aipexstudio/aipex-core`): platform-agnostic agent loop and event bus
- `packages/aipex-react`: React UI toolkit consumed by the side panel
- `packages/browser-ext`: the Manifest V3 extension that ships the user-facing product
AI Chat, MCP Bridge & Skills
Related topics: Repository Overview & Architecture, Browser Automation: DOM Snapshot, Locator & Tools, Extension UI, Build, Deployment & Operations
Overview
AIPex exposes a layered stack that lets an end user (or an external AI agent) drive a real browser through natural-language commands. The three pieces that sit on top of the browser runtime are:
- AI Chat: the in-extension assistant rendered by `@aipexstudio/aipex-react` and powered by `@aipexstudio/aipex-core` (which wraps `@openai/agents`). Source: README.md, packages/core/README.md.
- MCP Bridge: a local Node service (`mcp-bridge/`) and an in-extension panel that expose AIPex browser tools to external coding agents via the Model Context Protocol. Source: mcp-bridge/README.md.
- Skill: a packaged `aipex-browser` skill definition in `skill/SKILL.md` that ships tool-usage strategy, schemas, and automation patterns to skill-protocol agents such as Claude Code. Source: README.md.
The repository is a pnpm workspace monorepo; the relevant packages are declared under workspaces in the root and the extension depends on the core and react packages. Source: package.json, packages/browser-ext/package.json.
AI Chat Layer
Core Agent (`@aipexstudio/aipex-core`)
The core package is platform-agnostic — it has no Chrome API or Node-only assumptions — and exposes a streaming-first agent API. AIPex.chat() returns an AsyncGenerator<AgentEvent> that emits structured events such as content_delta, tool_call_start, tool_call_complete, tool_call_args_streaming_start and tool_call_args_streaming_complete. Source: packages/core/README.md.
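Consuming such a stream typically looks like the sketch below. The event names come from the core README, but the payload fields (`delta`, `toolName`) and the stand-in generator `fakeChat` are assumptions; the real `AIPex.chat()` signature and `AgentEvent` shape are not reproduced here.

```typescript
// Event names follow the README; payload fields are hypothetical.
type SketchEvent =
  | { type: "content_delta"; delta: string }
  | { type: "tool_call_start"; toolName: string }
  | { type: "tool_call_complete"; toolName: string };

// Stand-in for AIPex.chat(): yields a fixed sequence of events.
async function* fakeChat(): AsyncGenerator<SketchEvent> {
  yield { type: "content_delta", delta: "Opening " };
  yield { type: "tool_call_start", toolName: "click" };
  yield { type: "tool_call_complete", toolName: "click" };
  yield { type: "content_delta", delta: "done." };
}

// Fold the stream into final text plus the list of completed tool calls.
async function collect(
  events: AsyncGenerator<SketchEvent>,
): Promise<{ text: string; tools: string[] }> {
  let text = "";
  const tools: string[] = [];
  for await (const event of events) {
    switch (event.type) {
      case "content_delta":
        text += event.delta;
        break;
      case "tool_call_complete":
        tools.push(event.toolName);
        break;
      default:
        break; // other event types are ignored in this sketch
    }
  }
  return { text, tools };
}
```

Because the generator is pulled with `for await`, a UI can render each `content_delta` as it arrives instead of waiting for the whole turn to finish.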
Optional layers that productize the agent include:
- Conversation/session management (persistence, listing, forking).
- Conversation compression for long-running sessions.
- A `ContextManager`/`ContextProvider` system for attaching external data to a turn.
- A schema-first `ToolRegistry`.
The core package depends on @openai/agents, @openai/agents-extensions, lru-cache, and zod, with optional peer integrations for OpenAI, Anthropic, Google, and OpenRouter. Source: packages/core/package.json.
React UI (`@aipexstudio/aipex-react`)
The React package depends only on core and provides a drop-in chat surface, headless hooks, and extension building blocks:
- `<Chatbot />` and related component slots/themes.
- `useChat` and `ChatAdapter` (converts `AgentEvent` streams into UI messages).
- `<SettingsPage />` and `useChatConfig` with `KeyValueStorage` (localStorage by default, swappable for `chrome.storage` or IndexedDB).
- `<ContentScript />`, `<Omni />`, intervention cards, and a fake mouse cursor.
- Optional i18n and theme providers via subpath exports.
Source: packages/aipex-react/README.md and packages/aipex-react/package.json.
AI Provider Factory
The extension resolves an AI provider in ai-provider.ts using one of two modes:
- BYOK (Bring Your Own Key): the user supplies `apiKey` plus optional `baseURL`. Providers are constructed via `@ai-sdk/anthropic`, `@ai-sdk/google`, `@ai-sdk/openai`, and `@ai-sdk/openai-compatible`.
- Proxy mode: falls back to `https://www.claudechrome.com/api/ai` with cookie-based auth. The default model in this mode is `deepseek/deepseek-chat-v3.1`.
Host URLs are validated to reject private/internal addresses (SSRF guard). Source: packages/browser-ext/src/lib/ai-provider.ts.
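One common way to build such a guard is sketched below. It is not the code in ai-provider.ts: `isPrivateHost` is a hypothetical helper, it checks only literal IPv4 ranges plus `localhost` and `::1`, and it does not resolve DNS names or cover other IPv6 private ranges.

```typescript
// Returns true when the URL's host is a loopback, private, or link-local address.
function isPrivateHost(rawUrl: string): boolean {
  const host = new URL(rawUrl).hostname.toLowerCase();
  if (host === "localhost" || host.endsWith(".localhost") || host === "[::1]") {
    return true;
  }
  const match = /^(\d{1,3})\.(\d{1,3})\.(\d{1,3})\.(\d{1,3})$/.exec(host);
  if (!match) return false; // other hostnames are not resolved in this sketch
  const a = Number(match[1]);
  const b = Number(match[2]);
  return (
    a === 0 || // "this network"
    a === 10 || // 10.0.0.0/8
    a === 127 || // loopback
    (a === 172 && b >= 16 && b <= 31) || // 172.16.0.0/12
    (a === 192 && b === 168) || // 192.168.0.0/16
    (a === 169 && b === 254) // link-local
  );
}
```

A caller would reject a user-supplied `baseURL` when this returns true. A production guard would also need to handle DNS names that resolve to private addresses, which this sketch deliberately leaves out.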
The model catalog is fetched from https://www.claudechrome.com/api/models, cached under the cachedModelList storage key, and price-bucketed into cheap, normal, or expensive tiers. Source: packages/aipex-react/src/lib/models.ts.
MCP Bridge
Architecture
The bridge is a local server plus an in-extension sidecar that lets external coding agents drive the user's browser. It supports multiple simultaneous clients (Cursor, Claude Code, VS Code Copilot) over StreamableHTTP. Source: mcp-bridge/README.md.
flowchart LR
A[Cursor] -->|HTTP POST /mcp| S
B[Claude Code] -->|HTTP POST /mcp| S
C[VS Code Copilot] -->|HTTP POST /mcp| S
S[aipex-mcp-server<br/>localhost:9223] -->|WebSocket| E[AIPex Extension]
E -->|chrome.* APIs| BR[Browser]
classDef ext fill:#eef,stroke:#557;
class E,BR ext;
Endpoints
| Endpoint | Protocol | Purpose |
|---|---|---|
| /mcp | StreamableHTTP | MCP client entry point for agents |
| /extension | WebSocket | Bidirectional link to the AIPex extension |
| /health | HTTP | Liveness check |
The default listen port is 9223 and only 127.0.0.1 / ::1 connections are accepted, as reflected in the options page UI. Source: packages/browser-ext/src/pages/options/mcp-bridge-panel.tsx, mcp-bridge/README.md.
Client Configuration
Run the server with npx aipex-mcp-server and point an agent at http://localhost:9223/mcp. Example for Cursor (.cursor/mcp.json):
{
"mcpServers": {
"aipex-browser": { "url": "http://localhost:9223/mcp" }
}
}
Claude Code users can run claude mcp add --transport http aipex-browser http://localhost:9223/mcp. Source: mcp-bridge/README.md.
In-Extension Panel
The options page exposes a connection panel that shows status (connected, connecting, disconnected), reconnect attempt counters, the since <time> label, and a URL input that defaults to ws://localhost:9223. It exposes tools/list and tools/call over MCP and enforces the localhost-only rule. Source: packages/browser-ext/src/pages/options/mcp-bridge-panel.tsx.
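For orientation, an MCP `tools/call` request is a JSON-RPC 2.0 message whose `params` carry the tool `name` and its `arguments`. The sketch below builds such a message; the helper `buildToolsCall` and the `query` argument passed to `search_elements` are hypothetical, since the tool's real parameter schema is not shown here.

```typescript
interface JsonRpcRequest<P> {
  jsonrpc: "2.0";
  id: number;
  method: string;
  params: P;
}

interface ToolsCallParams {
  name: string;
  arguments: Record<string, unknown>;
}

// Build a tools/call request envelope for a named tool.
function buildToolsCall(
  id: number,
  name: string,
  args: Record<string, unknown>,
): JsonRpcRequest<ToolsCallParams> {
  return { jsonrpc: "2.0", id, method: "tools/call", params: { name, arguments: args } };
}

// The "query" argument is an assumption made for illustration.
const searchRequest = buildToolsCall(1, "search_elements", { query: "Sign in" });
```

An MCP client library normally constructs this envelope for you; the sketch only shows what travels over the wire.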
Known Limitation
Issue #202 reports that running multiple Claude sessions against the bridge causes the second window to fail to connect while the first one keeps working. This conflicts with the multi-client design described in mcp-bridge/README.md, so treat concurrent sessions as unverified until the issue is resolved. Source: community context, #202.
Skills
AIPex ships an aipex-browser skill — a packaged skill definition for agents that speak the skill protocol (Claude Code, OpenClaw-compatible runtimes). The bundle includes tool-usage strategy, full parameter schemas for the 30+ browser tools, and common automation patterns so an agent can drive the browser without rediscovering the tool surface. The canonical definition lives at skill/SKILL.md. Source: README.md.
Skills are loaded by the agent runtime; the chat layer does not embed them. They complement the MCP bridge: an agent that uses the skill can be paired with the bridge to translate high-level intentions into extension tool calls.
See Also
- DOM Snapshot Library: the structured page representation the chat and MCP tools reason over.
- MCP Bridge Server: configuration and endpoint reference.
- Core Agent README: `AgentEvent` stream and tool registry.
- React Toolkit README: `<Chatbot />`, hooks, and settings.
- Community tracking: Firefox support (#1), repository size (#53), iframe-aware snapshots (#100), "Requested device not found" (#80).
Extension UI, Build, Deployment & Operations
Related topics: Repository Overview & Architecture, AI Chat, MCP Bridge & Skills
Overview
AIPex is a browser automation platform distributed as a browser extension that lets users control their browser using natural-language commands. The product is delivered as a pnpm monorepo containing the browser extension package (@aipexstudio/cool-aipex), a runtime package (@aipexstudio/browser-runtime), the core agent package (@aipexstudio/aipex-core), and a DOM snapshot package (@aipexstudio/dom-snapshot). The extension UI is built with Vite, React, TypeScript, and Tailwind CSS, and is wired into a chat-based agent that uses the Vercel AI SDK and OpenAI Agents SDK to call into 30+ browser tools.
This page covers the extension's user-facing UI surfaces, how the monorepo and extension are built, what the deployment artifact looks like, and the operational concerns (tool registry, message adaptation, conversation sharing) that govern day-to-day behavior.
Extension UI Surfaces
Workspace layout
The root package.json declares a workspaces: ["packages/*"] arrangement with pnpm. The browser extension is the umbrella workspace, declared in packages/browser-ext/package.json as @aipexstudio/cool-aipex with display name "AIPex" and version 0.1.0. It pulls in @aipexstudio/aipex-core, @aipexstudio/aipex-react, and @aipexstudio/browser-runtime via workspace:* references, so the extension is always built against the in-repo versions of the runtime and core packages.
Source: package.json:1-15 Source: packages/browser-ext/package.json:1-25
Tool-driven UI
The extension uses a ToolManager that wraps allBrowserTools from the runtime into a unified interface for the extension context. Tools are organized into categories that the UI can render and filter on:
| Category | Purpose |
|---|---|
| browser | Tab management, navigation |
| ui | Locate, click, hover, fill elements |
| page | Metadata, scroll, highlight |
| screenshot | Capture to data URL |
| download | Save images from agent workflow |
| intervention | Human-in-the-loop requests |
| skill | Load and execute skill scripts |
The ToolManager exposes a subscriber model (tool_registered / tool_unregistered events) so UI components can react as tools become available, and it is implemented as a singleton via ToolManager.getInstance(). This pattern lets the extension's chat panel render the live tool inventory as skills and tools are added at runtime.
Source: packages/browser-ext/src/services/tool-manager.ts:15-80
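The singleton-plus-subscriber pattern described above can be sketched in a few lines. This is an illustration, not the contents of tool-manager.ts: the class `SketchToolManager` and its method names other than `getInstance` are hypothetical, and only the two event names come from the source.

```typescript
type SketchToolEvent = "tool_registered" | "tool_unregistered";
type SketchListener = (event: SketchToolEvent, toolName: string) => void;

class SketchToolManager {
  private static instance: SketchToolManager | undefined;
  private readonly tools = new Map<string, { category: string }>();
  private readonly listeners = new Set<SketchListener>();

  private constructor() {}

  // Lazily create the single shared instance.
  static getInstance(): SketchToolManager {
    if (!SketchToolManager.instance) SketchToolManager.instance = new SketchToolManager();
    return SketchToolManager.instance;
  }

  // Returns an unsubscribe function so UI components can clean up.
  subscribe(listener: SketchListener): () => void {
    this.listeners.add(listener);
    return () => {
      this.listeners.delete(listener);
    };
  }

  register(name: string, category: string): void {
    this.tools.set(name, { category });
    this.emit("tool_registered", name);
  }

  unregister(name: string): void {
    if (this.tools.delete(name)) this.emit("tool_unregistered", name);
  }

  listByCategory(category: string): string[] {
    const names: string[] = [];
    this.tools.forEach((tool, name) => {
      if (tool.category === category) names.push(name);
    });
    return names;
  }

  private emit(event: SketchToolEvent, name: string): void {
    this.listeners.forEach((listener) => listener(event, name));
  }
}
```

A chat panel would call `subscribe` on mount, re-render on each event, and invoke the returned function on unmount.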
Automation mode toolbar
The chat input is augmented by an AutomationModeInputToolbar that lets the user toggle between focus mode (visual feedback, window focus) and background mode (silent operation). The component reads the current mode from storage using the useStorage hook from @aipexstudio/browser-runtime/hooks and validates writes through validateAutomationMode from @aipexstudio/aipex-core. The toolbar also shows a TokenUsageIndicator and stop/send affordances, demonstrating how extension UI components are composed from shared React primitives in @aipexstudio/aipex-react.
Source: packages/browser-ext/src/lib/automation-mode-toolbar.tsx:1-45
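Validating a stored mode before use might look like the sketch below. The literal values `"focus"` and `"background"`, the fallback default, and the function `validateModeSketch` are assumptions; the real `validateAutomationMode` in core may behave differently.

```typescript
// Assumed mode identifiers, based on the two modes named in the text.
type SketchAutomationMode = "focus" | "background";

// Accept only known modes; anything else (missing key, stale value) falls back.
function validateModeSketch(
  value: unknown,
  fallback: SketchAutomationMode = "focus",
): SketchAutomationMode {
  return value === "focus" || value === "background" ? value : fallback;
}
```

Treating the stored value as `unknown` keeps a corrupted or outdated storage entry from reaching the automation layer.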
File preview and conversation sharing
The options page includes a FilePreview dialog for browsing files uploaded to the skill sandbox. It renders text content with SyntaxHighlighter (word-wrap, line numbers, monospace font) and shows a binary file alert with a formatted byte count when preview is not possible. Conversation sharing is handled by shareConversation, which serializes UIMessages to a ShareableMessage[] (omitting screenshots and system messages), and posts the payload to a share API using an auth cookie header.
Source: packages/browser-ext/src/pages/options/file-components/FilePreview.tsx:1-45 Source: packages/browser-ext/src/services/share-conversation.ts:1-60
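The filtering step of sharing can be sketched as follows. The types and the function `toShareable` are hypothetical stand-ins for `UIMessage`, `ShareableMessage`, and `shareConversation`; only the rule (drop system messages and screenshots) comes from the source, and the upload itself is not shown.

```typescript
type SketchPart =
  | { type: "text"; text: string }
  | { type: "screenshot"; dataUrl: string };

interface SketchUIMessage {
  role: "system" | "user" | "assistant";
  parts: SketchPart[];
}

interface SketchShareable {
  role: "user" | "assistant";
  text: string;
}

// Keep user and assistant text only; system messages and screenshots are dropped.
function toShareable(messages: SketchUIMessage[]): SketchShareable[] {
  const out: SketchShareable[] = [];
  for (const message of messages) {
    const role = message.role;
    if (role === "system") continue;
    const texts: string[] = [];
    for (const part of message.parts) {
      if (part.type === "text") texts.push(part.text);
    }
    out.push({ role, text: texts.join("\n") });
  }
  return out;
}
```

Stripping image data before serialization keeps page contents captured in screenshots out of the shared payload.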
flowchart LR
UI[Extension UI<br/>Sidepanel / Options] --> TM[ToolManager]
TM --> RT[allBrowserTools<br/>@aipexstudio/browser-runtime]
RT --> Core[aipex-core<br/>Agent / Sessions]
Core --> LLM[AI SDK Providers]
Core --> DOM[dom-snapshot]
TM -->|tool events| UI
UI -->|share| API[Share API]
Build System
Root scripts
The root package.json exposes a small set of orchestration scripts: build runs pnpm -r --if-present build to build every workspace; dev runs the extension's Vite dev server; preflight chains format, lint, typecheck, and test; lint:dependencies uses knip --strict to catch unused dependencies and exports. The postinstall hook runs prek install, which sets up the pre-commit hooks (prek is a fast pre-commit runner).
Source: package.json:10-35
Extension build
The extension package's build scripts are intentionally thin: dev → vite, build → vite build, preview → vite preview, build:css → Tailwind CLI minification, typecheck → tsc --project tsconfig.json, and test → vitest. CSS is pre-built to src/style.css and shipped alongside the JavaScript bundle. The published files array for the extension is ["build/", "README.md", "package.json", "host-access-config.json"], so the deployment artifact is the Vite build/ directory plus the host access policy file.
Source: packages/browser-ext/package.json:14-40
Runtime build
The runtime package (@aipexstudio/browser-runtime) compiles with tsc -b (project references), with typecheck via tsc --project tsconfig.json and tests via vitest. It exposes two entry points, the main module and a ./hooks subpath for React hooks, and depends on @aipexstudio/aipex-core and @aipexstudio/dom-snapshot from the workspace. Heavy dependencies such as @zenfs/core and the QuickJS WebAssembly file are declared at the runtime layer rather than in the extension package, which keeps the extension's own dependency list short even though its final bundle still includes whatever runtime code it imports.
Source: packages/browser-runtime/package.json:14-50
Linting and formatting
preflight runs Biome's format and check --fix --unsafe, then tsc and the test suite, for every workspace. This enforces a single style and import-order rule across packages.
Deployment Artifact
The extension is published as the @aipexstudio/cool-aipex package on npm. Its description field reads "Automate your browser with natural language commands - The open source browser-use solution". The core package lists several AI SDK peer dependencies as optional (@ai-sdk/openai, @ai-sdk/anthropic, @ai-sdk/google, @openrouter/ai-sdk-provider), so users can install only the providers they need; the core agent and the OpenAI Agents SDK are required runtime dependencies.
Source: packages/browser-ext/package.json:5-60 Source: packages/core/package.json:25-50
For the MCP bridge, mcp-bridge/package.json targets Node >=18.0.0 and ships as a separate package with keywords such as mcp, model-context-protocol, cursor, claude, and copilot. It is the recommended way to wire AI clients (Cursor, Claude Code, GitHub Copilot) to the running extension.
Source: mcp-bridge/package.json:1-30
Operations
Message adaptation
Internal messages travel through a message-adapter that maps UIMessage parts to a runtime-friendly shape: text parts are passed through, tool parts are projected to { type: "tool_use", id, name, input }, and unrecognized part types fall back to { type: "text", text: "[<type>]" }. The adapter also implements safeJsonParse and extractBusinessFailure, which detect business-level failures ({ success: false, error: "..." }) so the UI can surface them as user-visible errors instead of silent JSON strings.
Source: packages/browser-ext/src/lib/message-adapter.ts:1-70
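The mapping rules above can be sketched as below. The input type `InPart`, the `"tool"` discriminator, and the `Sketch`-suffixed function names are assumptions made for illustration; only the output shapes and the `{ success: false, error }` convention come from the source.

```typescript
// Hypothetical input part; the real UIMessage part types are richer.
interface InPart {
  type: string;
  text?: string;
  id?: string;
  name?: string;
  input?: unknown;
}

type OutPart =
  | { type: "text"; text: string }
  | { type: "tool_use"; id: string; name: string; input: unknown };

// Text passes through, tool parts become tool_use, anything else becomes "[<type>]".
function adaptPart(part: InPart): OutPart {
  if (part.type === "text") return { type: "text", text: part.text ?? "" };
  if (part.type === "tool" && part.id !== undefined && part.name !== undefined) {
    return { type: "tool_use", id: part.id, name: part.name, input: part.input };
  }
  return { type: "text", text: `[${part.type}]` };
}

// Parse JSON without throwing; undefined signals "not JSON".
function safeJsonParseSketch(raw: string): unknown {
  try {
    return JSON.parse(raw);
  } catch {
    return undefined;
  }
}

// Return the error message when a tool result encodes a business failure.
function extractBusinessFailureSketch(raw: string): string | undefined {
  const value = safeJsonParseSketch(raw);
  if (typeof value !== "object" || value === null) return undefined;
  const record = value as Record<string, unknown>;
  const error = record["error"];
  return record["success"] === false && typeof error === "string" ? error : undefined;
}
```

Separating "the tool call itself failed" from "the tool ran and reported a failure" is what lets the UI show the second case as a readable error rather than a raw JSON string.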
Tool bundle composition
allBrowserTools ships 32 enabled tools by default. A few additional tools are intentionally left out to avoid known issues: switch_to_tab (context switching), duplicate_tab (not enabled), and the wait_* helpers (disabled for stability). The 32 enabled tools cover seven tab-management actions, seven UI operations, four page-level helpers, two screenshot tools, two download tools, four intervention helpers, and six skill-management tools.
Source: packages/browser-runtime/README.md:1-55
Known operational issues from the community
- Repository size is large (≈97 MiB packed); tracking is logged in issue #53.
- The MCP bridge has been reported to serve only a single Claude session at a time (issue #202); opening a second Claude window fails to connect while the first keeps working.
- Element search does not yet descend into all iframes (issue #100); the snapshot library documents same-origin iframe support only, so elements inside cross-origin frames cannot be located.
- A "Requested device not found" error has been reported (issue #80); the root cause is not established in the material summarized here.
- Firefox support is an open feature request (issue #1); the team has stated the codebase will be refactored to support multiple browsers, but Firefox is not yet a supported target.
Source: README.md:1-30
See Also
- Core Agent & Sessions
- DOM Snapshot Library
- Browser Runtime & Tool Registry
- MCP Bridge
Doramagic Pitfall Log
Source-linked risks stay visible on the manual page so the preview does not read like a recommendation.
Found 8 structured pitfall item(s), including 1 high/blocking item(s). Top priority: Security or permission risk - Security or permission risk requires verification.
1. Security or permission risk: Security or permission risk requires verification
- Severity: high
- Finding: Project evidence flags a security or permission risk. Review the linked source before relying on this workflow.
- User impact: May increase setup, validation, or first-run risk for the user.
- Recommended check: Reproduce the official install and quickstart path in an isolated environment.
- Evidence: packet_text.keyword_scan | https://github.com/AIPexStudio/AIPex
2. Configuration risk: Configuration risk requires verification
- Severity: medium
- Finding: Project evidence flags a configuration risk. Review the linked source before relying on this workflow.
- User impact: May increase setup, validation, or first-run risk for the user.
- Recommended check: Reproduce the official install and quickstart path in an isolated environment.
- Evidence: capability.host_targets | https://github.com/AIPexStudio/AIPex
3. Capability evidence risk: Capability evidence risk requires verification
- Severity: medium
- Finding: README/documentation is current enough for a first validation pass.
- User impact: May increase setup, validation, or first-run risk for the user.
- Recommended check: Reproduce the official install and quickstart path in an isolated environment.
- Evidence: capability.assumptions | https://github.com/AIPexStudio/AIPex
4. Maintenance risk: Maintenance risk requires verification
- Severity: medium
- Finding: Project evidence flags a maintenance risk. Review the linked source before relying on this workflow.
- User impact: May increase setup, validation, or first-run risk for the user.
- Recommended check: Reproduce the official install and quickstart path in an isolated environment.
- Evidence: evidence.maintainer_signals | https://github.com/AIPexStudio/AIPex
5. Security or permission risk: Security or permission risk requires verification
- Severity: medium
- Finding: no_demo
- User impact: May increase setup, validation, or first-run risk for the user.
- Recommended check: Reproduce the official install and quickstart path in an isolated environment.
- Evidence: downstream_validation.risk_items | https://github.com/AIPexStudio/AIPex
6. Security or permission risk: Security or permission risk requires verification
- Severity: medium
- Finding: no_demo
- User impact: May increase setup, validation, or first-run risk for the user.
- Recommended check: Reproduce the official install and quickstart path in an isolated environment.
- Evidence: risks.scoring_risks | https://github.com/AIPexStudio/AIPex
7. Maintenance risk: Maintenance risk requires verification
- Severity: low
- Finding: issue_or_pr_quality=unknown.
- User impact: May increase setup, validation, or first-run risk for the user.
- Recommended check: Reproduce the official install and quickstart path in an isolated environment.
- Evidence: evidence.maintainer_signals | https://github.com/AIPexStudio/AIPex
8. Maintenance risk: Maintenance risk requires verification
- Severity: low
- Finding: release_recency=unknown.
- User impact: May increase setup, validation, or first-run risk for the user.
- Recommended check: Reproduce the official install and quickstart path in an isolated environment.
- Evidence: evidence.maintainer_signals | https://github.com/AIPexStudio/AIPex
Source: Doramagic discovery, validation, and Project Pack records
Community Discussion Evidence
These external discussion links are review inputs, not standalone proof that the project is production-ready.
Open the linked issues or discussions before treating the pack as ready for your environment.
Doramagic exposes project-level community discussion separately from official documentation. Review these links before using AIPex with real data or production workflows.
- firefox support - github / github_issue
- v0.1.0 - github / github_release
- v0.0.16 - github / github_release
- v0.0.15 - github / github_release
- v0.0.14 - github / github_release
- v0.0.13 - github / github_release
- v0.0.12 - github / github_release
- v0.0.11 - github / github_release
- v0.0.10 - github / github_release
- v0.0.9 - github / github_release
- v0.0.8 - github / github_release
- Security or permission risk requires verification - GitHub / issue
Source: Project Pack community evidence and pitfall evidence