mcp-chrome Manual - Doramagic.ai

Doramagic Project Pack · Human Manual

mcp-chrome

Introduction to Chrome MCP Server

Chrome MCP Server is a Model Context Protocol (MCP) implementation that connects AI assistants to Chrome/Chromium browsers, enabling browser automation, content analysis, and control through a set of tools. The project consists of two main components: a Chrome extension that runs in the browser and a native server that communicates with AI clients via MCP.

Overview and Purpose

Chrome MCP Server provides AI assistants with the ability to interact with web pages, manage browser tabs, capture screenshots, monitor network traffic, and perform automated actions. By exposing browser functionality as MCP tools, developers can create AI agents that can browse the web, fill forms, click elements, search history, and analyze page content.

Sources: README.md:1-20

System Architecture

Requests flow from the AI client through mcp-chrome-bridge and the Node.js native server to the WXT-based Chrome extension, which controls the browser with the help of content scripts and injected scripts.

Component Overview

graph TD
    A["AI Client<br/>(Claude, etc.)"] --> B["MCP Chrome Bridge<br/>(mcp-chrome-bridge)"]
    B --> C["Native Server<br/>(Node.js)"]
    C <--> D["Chrome Extension<br/>(WXT-based)"]
    D --> E["Chrome Browser"]
    F["Web Content Scripts"] --> D
    G["Injected Scripts"] --> E

Native Messaging Host Configuration

The native server registers itself as a Native Messaging Host with the operating system, enabling secure communication between native applications and Chrome extensions.

Platform	User-Level Path	System-Level Path
Windows	`%APPDATA%\Google\Chrome\NativeMessagingHosts\`	`%ProgramFiles%\Google\Chrome\NativeMessagingHosts\`
macOS	`~/Library/Application Support/Google/Chrome/NativeMessagingHosts/`	`/Library/Google/Chrome/NativeMessagingHosts/`
Linux	`~/.config/google-chrome/NativeMessagingHosts/`	`/etc/opt/chrome/native-messaging-hosts/`

Sources: app/native-server/src/scripts/utils.ts:1-50

The HOST_NAME constant identifies the native messaging host configuration file, which Chrome reads to establish the connection with the native server.
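
As a rough illustration of what such a configuration file contains, the sketch below builds a manifest object using the field names Chrome documents for native messaging hosts (`name`, `description`, `path`, `type`, `allowed_origins`). The helper name, host name, launcher path, and extension ID are all placeholders chosen for this example, not values taken from the project.

```typescript
// Shape of a Chrome native messaging host manifest (field names per Chrome's docs).
interface NativeHostManifest {
  name: string;
  description: string;
  path: string;
  type: 'stdio';
  allowed_origins: string[];
}

// Hypothetical helper: all three arguments are placeholders supplied by the caller.
function buildHostManifest(
  hostName: string,
  launcherPath: string,
  extensionId: string,
): NativeHostManifest {
  return {
    name: hostName,
    description: 'Native messaging host for the Chrome MCP Server',
    path: launcherPath,
    type: 'stdio', // Chrome exchanges messages with the host over stdin/stdout
    allowed_origins: ['chrome-extension://' + extensionId + '/'],
  };
}

// Example with placeholder values; the project's real HOST_NAME is defined in utils.ts.
const exampleManifest = buildHostManifest(
  'com.example.native_host',
  '/usr/local/bin/example-host',
  'abcdefghijklmnopabcdefghijklmnop',
);
const exampleManifestJson = JSON.stringify(exampleManifest, null, 2);
```

The serialized JSON is what would be written into one of the NativeMessagingHosts directories listed in the table above.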

Chrome Extension Structure

The Chrome extension is built using WXT, a modern build tool for Chrome extensions. The extension configuration includes security policies, content scripts, and web-accessible resources.

// Key configuration from wxt.config.ts
web_accessible_resources: [
  {
    resources: [
      '/models/*',      // ML models for semantic search
      '/workers/*',     // Web Workers for background tasks
      '/inject-scripts/*', // Scripts injected into web pages
    ],
    matches: ['<all_urls>'],
  },
]

Sources: app/chrome-extension/wxt.config.ts:1-30

Tool Categories

Chrome MCP Server organizes its functionality into several tool categories, each targeting specific browser interaction needs.

Navigation and Tab Management

Tool	Purpose
`chrome_navigate`	Navigate to URLs with viewport control
`chrome_switch_tab`	Switch the current active tab
`chrome_close_tabs`	Close specific tabs or windows
`chrome_go_back_or_forward`	Browser navigation control

These tools enable AI assistants to control the browser's navigation state and manage multiple tabs programmatically. The chrome_close_tabs function, for example, accepts URL patterns to match and close multiple tabs at once.

Sources: app/chrome-extension/entrypoints/background/tools/browser/common.ts:1-80

Content Interaction

Tool	Purpose
`chrome_click_element`	Click elements using CSS selectors
`chrome_fill_or_select`	Fill forms and select options
`chrome_keyboard`	Simulate keyboard input and shortcuts
`chrome_inject_script`	Inject content scripts into web pages
`chrome_send_command_to_inject_script`	Send commands to injected content scripts

The interaction tools support complex user interaction scenarios by translating high-level commands into browser automation actions.

Network Monitoring

Tool	Purpose
`chrome_network_capture_start/stop`	Capture network requests via webRequest API
`chrome_network_debugger_start/stop`	Debugger API with response bodies
`chrome_network_request`	Send custom HTTP requests

Sources: README.md:40-55

Content Analysis

Tool	Purpose
`search_tabs_content`	AI-powered semantic search across browser tabs
`chrome_get_web_content`	Extract HTML/text content from pages
`chrome_get_interactive_elements`	Find clickable elements
`chrome_console`	Capture and retrieve console output

The content analysis tools leverage semantic search capabilities powered by WebAssembly-optimized math functions for cosine similarity and vector operations.

Sources: packages/wasm-simd/package.json:1-30
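
For readers unfamiliar with the underlying math, here is a minimal plain-TypeScript sketch of cosine similarity and of ranking tabs by it. It is not the project's WebAssembly implementation; the `TabEmbedding` type and `rankTabs` function are hypothetical names used only for illustration.

```typescript
// Plain TypeScript cosine similarity. The wasm-simd package is described as an
// optimized equivalent; its actual API is not reproduced here.
function cosineSimilarity(a: Float32Array, b: Float32Array): number {
  if (a.length !== b.length) {
    throw new Error('Vectors must have the same length');
  }
  let dot = 0;
  let normA = 0;
  let normB = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    normA += a[i] * a[i];
    normB += b[i] * b[i];
  }
  if (normA === 0 || normB === 0) return 0; // avoid division by zero
  return dot / (Math.sqrt(normA) * Math.sqrt(normB));
}

// Hypothetical shape: one embedding vector per indexed tab.
interface TabEmbedding {
  tabId: number;
  vector: Float32Array;
}

// Return tab IDs ordered from most to least similar to the query vector.
function rankTabs(query: Float32Array, tabs: TabEmbedding[]): number[] {
  return tabs
    .map((t) => ({ tabId: t.tabId, score: cosineSimilarity(query, t.vector) }))
    .sort((x, y) => y.score - x.score)
    .map((r) => r.tabId);
}
```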

Screenshots

Tool	Purpose
`chrome_screenshot`	Advanced screenshot capture with element targeting, full-page support, and custom dimensions

Data Management

Tool	Purpose
`chrome_history`	Search browser history with time filters
`chrome_bookmark_search`	Find bookmarks by keywords
`chrome_bookmark_add`	Add new bookmarks with folder support
`chrome_bookmark_delete`	Delete bookmarks

Content Indexing System

The Chrome extension includes a content indexing system that enables semantic search across open browser tabs.

graph LR
    A["Tab Load Complete"] --> B["ContentIndexer"]
    B --> C{"URL Excluded?"}
    C -->|Yes| D["Skip"]
    C -->|No| E["Execute web-fetcher-helper.js"]
    E --> F["Extract Metadata"]
    F --> G["Index Content"]
    G --> H["Searchable Index"]

The indexer automatically processes tabs when they complete loading, extracting text content, titles, and metadata. It excludes internal Chrome URLs, extensions, and local files.

Sources: app/chrome-extension/utils/content-indexer.ts:1-60

URL Exclusion Patterns

The content indexer skips indexing for the following URL patterns:

Pattern	Description
`chrome://*`	Internal Chrome pages
`chrome-extension://*`	Extension pages
`edge://*`	Microsoft Edge pages
`about:*`	About pages
`moz-extension://*`	Firefox extension pages
`file://*`	Local files
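
The exclusion table can be expressed as a small predicate. The sketch below assumes each pattern is a simple URL-prefix match; the function name `isIndexableUrl` is hypothetical.

```typescript
// Regular expressions mirroring the exclusion table above.
const EXCLUDED_URL_PATTERNS: RegExp[] = [
  /^chrome:\/\//,
  /^chrome-extension:\/\//,
  /^edge:\/\//,
  /^about:/,
  /^moz-extension:\/\//,
  /^file:\/\//,
];

// Hypothetical helper: returns true when a URL is eligible for indexing.
function isIndexableUrl(url: string): boolean {
  for (let i = 0; i < EXCLUDED_URL_PATTERNS.length; i++) {
    if (EXCLUDED_URL_PATTERNS[i].test(url)) return false;
  }
  return true;
}
```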

Web Fetcher Tool

The chrome_get_web_content tool provides flexible content extraction from web pages.

interface WebFetcherToolParams {
  htmlContent?: boolean;  // Get visible HTML content
  textContent?: boolean;  // Get visible text content
  url?: string;           // Optional URL to fetch
  selector?: string;      // CSS selector for specific elements
  tabId?: number;         // Target existing tab
  background?: boolean;   // Don't activate/focus tab
  windowId?: number;      // Target window
}

Sources: app/chrome-extension/entrypoints/background/tools/browser/web-fetcher.ts:1-30
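
To show how these parameters might be used, the sketch below wraps them in a `call_tool` message of the shape `{ type, name, args }` that the background routing code in this manual accepts. The helper `buildGetWebContentCall` is hypothetical, and the fallback to `textContent` when no format is requested is an assumption of this sketch, not documented behavior.

```typescript
interface WebFetcherToolParams {
  htmlContent?: boolean;
  textContent?: boolean;
  url?: string;
  selector?: string;
  tabId?: number;
  background?: boolean;
  windowId?: number;
}

interface CallToolMessage {
  type: 'call_tool';
  name: string;
  args: WebFetcherToolParams;
}

// Hypothetical helper that wraps the parameters in a call_tool message.
function buildGetWebContentCall(params: WebFetcherToolParams): CallToolMessage {
  const args: WebFetcherToolParams = { ...params };
  // Assumption of this sketch only: request text when no format is specified.
  if (!args.htmlContent && !args.textContent) {
    args.textContent = true;
  }
  return { type: 'call_tool', name: 'chrome_get_web_content', args };
}

// Example: extract the text of the <article> element without focusing the tab.
const articleCall = buildGetWebContentCall({ selector: 'article', background: true });
```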

Metadata Extraction

The injected web fetcher script extracts comprehensive metadata from web pages:

Field	Source Priority
title	JSON-LD → Open Graph → Twitter Cards
byline	JSON-LD → Meta tags
excerpt	JSON-LD → Open Graph → Twitter
siteName	Open Graph
publishedTime	JSON-LD → Article tags

All values are automatically unescaped from HTML entities to ensure proper formatting.

Sources: app/chrome-extension/inject-scripts/web-fetcher-helper.js:1-100
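
The priority order and the entity unescaping can be sketched as follows. This is an illustration of the idea rather than the code in web-fetcher-helper.js; all function names are hypothetical, and only a handful of common entities are handled.

```typescript
// Return the first non-empty candidate, in priority order.
function firstNonEmpty(candidates: Array<string | null | undefined>): string | undefined {
  for (let i = 0; i < candidates.length; i++) {
    const value = candidates[i];
    if (typeof value === 'string' && value.trim() !== '') return value.trim();
  }
  return undefined;
}

// A few common entities; a real implementation would likely cover more.
const ENTITY_MAP: { [entity: string]: string } = {
  '&amp;': '&',
  '&lt;': '<',
  '&gt;': '>',
  '&quot;': '"',
  '&#39;': "'",
};

// Decode in a single pass so that "&amp;lt;" becomes "&lt;" rather than "<".
function unescapeHtmlEntities(text: string): string {
  return text.replace(/&(?:amp|lt|gt|quot|#39);/g, (m: string) => ENTITY_MAP[m]);
}

// Title resolution following the table: JSON-LD, then Open Graph, then Twitter Cards.
function resolveTitle(jsonLd?: string, openGraph?: string, twitter?: string): string | undefined {
  const raw = firstNonEmpty([jsonLd, openGraph, twitter]);
  return raw === undefined ? undefined : unescapeHtmlEntities(raw);
}
```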

Record and Replay System

The extension includes a record-replay engine for automating browser interactions. The wait policies ensure that automated actions wait for appropriate page states before proceeding.

// Wait policies monitor various browser events
chrome.webNavigation.onCommitted  // Page navigation committed
chrome.webNavigation.onCompleted  // Page fully loaded
chrome.tabs.onUpdated             // Tab state changes

Sources: app/chrome-extension/entrypoints/background/record-replay/engine/policies/wait.ts:1-50

Security Configuration

The extension implements a multi-layered security model:

Environment	COEP	COOP	CSP
Development	Disabled (WXT default)	Disabled (WXT default)	Relaxed for HMR
Production	`require-corp`	`same-origin`	Strict

The Content Security Policy for production enforces:

Scripts: 'self' 'wasm-unsafe-eval'
Objects: 'self'
Styles: 'self' 'unsafe-inline'
Images: 'self' data: blob:

Sources: app/chrome-extension/wxt.config.ts:25-40
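
The four directives listed above combine into a single policy string. The sketch below shows one way to serialize them; `serializeCsp` and `PRODUCTION_CSP` are hypothetical names, and the project may simply hard-code the string.

```typescript
// Directives in the order listed above; each value is a list of source expressions.
const PRODUCTION_CSP: Array<[string, string[]]> = [
  ['script-src', ["'self'", "'wasm-unsafe-eval'"]],
  ['object-src', ["'self'"]],
  ['style-src', ["'self'", "'unsafe-inline'"]],
  ['img-src', ["'self'", 'data:', 'blob:']],
];

// Join each directive as "<name> <sources>;" and separate directives with a space.
function serializeCsp(directives: Array<[string, string[]]>): string {
  return directives
    .map((d) => d[0] + ' ' + d[1].join(' ') + ';')
    .join(' ');
}

const extensionPagesCsp = serializeCsp(PRODUCTION_CSP);
```

The resulting string is suitable for the `extension_pages` key of a Manifest V3 `content_security_policy` entry.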

Browser Configuration Support

The native server supports multiple browsers and platforms:

Browser	Windows	macOS	Linux
Chrome	✅	✅	✅
Chromium	✅	✅	✅

The configuration system automatically detects the platform and selects appropriate paths for native messaging host manifests.

Sources: app/native-server/src/scripts/browser-config.ts:1-100
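
Platform-based path selection could look roughly like the sketch below, which covers only the user-level Chrome paths from the table earlier in this manual. The function and type names are hypothetical, and the environment values are passed in explicitly so the logic stays free of Node.js-specific APIs.

```typescript
type SupportedPlatform = 'win32' | 'darwin' | 'linux';

interface PathEnv {
  home: string;    // user's home directory (macOS/Linux)
  appData: string; // value of %APPDATA% (Windows)
}

// User-level Chrome manifest directories, following the table in this manual.
function userManifestDir(platform: SupportedPlatform, env: PathEnv): string {
  switch (platform) {
    case 'win32':
      return env.appData + '\\Google\\Chrome\\NativeMessagingHosts';
    case 'darwin':
      return env.home + '/Library/Application Support/Google/Chrome/NativeMessagingHosts';
    case 'linux':
      return env.home + '/.config/google-chrome/NativeMessagingHosts';
    default:
      throw new Error('Unsupported platform: ' + String(platform));
  }
}

// Chrome expects the manifest file to be named "<host name>.json" on macOS and Linux.
function userManifestPath(platform: SupportedPlatform, env: PathEnv, hostName: string): string {
  const separator = platform === 'win32' ? '\\' : '/';
  return userManifestDir(platform, env) + separator + hostName + '.json';
}
```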

Installation Flow

graph TD
    A["Install mcp-chrome-bridge"] --> B{"pnpm config set enable-pre-post-scripts"}
    B -->|pnpm| C["Auto-register native host"]
    B -->|npm| C
    C --> D["Load Chrome Extension"]
    D --> E["Click Extension Icon"]
    E --> F["Connect to Bridge"]
    F --> G["Get MCP Configuration"]

Prerequisites

Node.js >= 20.0.0
pnpm/npm
Chrome/Chromium browser

Registration Methods

Automatic (pnpm):

pnpm config set enable-pre-post-scripts true
pnpm install -g mcp-chrome-bridge

Manual:

npm install -g mcp-chrome-bridge
mcp-chrome-bridge register

Sources: README.md:60-100

Future Roadmap

The project has planned several enhancements:

Feature	Status
Authentication	Planned
Recording and Playback	Planned
Workflow Automation	Planned
Firefox Extension	Planned

Project Structure

mcp-chrome/
├── app/
│   ├── chrome-extension/     # WXT-based Chrome extension
│   │   ├── entrypoints/      # Extension entry points (background, sidepanel, etc.)
│   │   ├── inject-scripts/   # Scripts injected into web pages
│   │   └── utils/            # Utility modules
│   └── native-server/        # Node.js MCP server
│       └── src/
│           ├── agent/        # AI agent integration
│           └── scripts/      # Native messaging setup
└── packages/
    └── wasm-simd/           # WebAssembly SIMD math functions

This structure enables clear separation between the browser extension (UI and web interaction), the native server (MCP protocol and AI integration), and shared packages (common utilities and optimized algorithms).

Sources: README.md:1-20

Quick Start Guide

Overview

The mcp-chrome project provides a Model Context Protocol (MCP) server that enables AI assistants to control and interact with Chrome browser instances. This integration allows AI-powered automation of web browsing tasks through a comprehensive set of tools for navigation, interaction, content extraction, and network monitoring.

The project consists of two primary components:

Component	Location	Purpose
Chrome Extension	`app/chrome-extension/`	Handles browser-level operations via Chrome Extension APIs
Native Server	`app/native-server/`	Bridges MCP clients (Claude Desktop, Cursor, etc.) with the extension

Architecture Overview

graph TD
    A[MCP Client<br/>Claude Desktop / Cursor] -->|MCP Protocol| B[Native Server<br/>mcp-chrome-bridge]
    B -->|Chrome Native Messaging| C[Chrome Extension<br/>Background Script]
    C -->|Chrome APIs| D[Browser Tabs & Content]
    E[Content Scripts] -->|Injection| D
    C -->|scripting.executeScript| E

Prerequisites

Before starting, ensure the following requirements are met:

Requirement	Version/Details
Chrome/Chromium Browser	Version 88+
Node.js	>= 20.0.0
npm	v8+
MCP Client	Claude Desktop, Cursor, or compatible

Installation Steps

1. Chrome Extension Installation

The Chrome extension must be loaded in developer mode:

Open Chrome and navigate to chrome://extensions/
Enable Developer mode (toggle in top-right corner)
Click Load unpacked
Select the app/chrome-extension/ directory from the repository

Sources: app/chrome-extension/README.md

The extension provides the following entry points:

Entry Point	File	Purpose
Side Panel	`entrypoints/sidepanel/`	Main AI chat interface
Popup	`entrypoints/popup/`	Quick access toolbar popup
Options	`entrypoints/options/`	Extension settings
Welcome	`entrypoints/welcome/`	Onboarding page
Builder	`entrypoints/builder/`	Workflow editor

2. Native Server Setup

The native server acts as the bridge between MCP clients and the Chrome extension.

#### Installation via npm

npm install -g mcp-chrome-bridge

#### Manual Installation

For manual installation, use the register command:

# User-level installation (default)
mcp-chrome-bridge register

# System-level installation (requires admin)
mcp-chrome-bridge register --system

Sources: app/native-server/install.md

3. Registry Configuration

On Windows, the native messaging host must be registered in the Windows Registry:

HKCU\Software\Google\Chrome\NativeMessagingHosts\com.mcp.chrome

Or for system-wide installation:

HKLM\Software\Google\Chrome\NativeMessagingHosts\com.mcp.chrome

Sources: app/native-server/install.md

MCP Client Configuration

Claude Desktop

Add the following to your Claude Desktop configuration file:

OS	Config Location
macOS	`~/Library/Application Support/Claude/claude_desktop_config.json`
Windows	`%APPDATA%\Claude\claude_desktop_config.json`

{
  "mcpServers": {
    "chrome": {
      "command": "npx",
      "args": ["-y", "@mcp-chrome/native-server"]
    }
  }
}

Sources: app/native-server/README.md

Available MCP Tools

Navigation (4 tools)

Tool	Description
`chrome_navigate`	Navigate to URL or perform search
`chrome_go_back_or_forward`	Browser history navigation
`chrome_reload`	Reload current page
`chrome_control_viewport`	Zoom and scroll control

Tab Management (5 tools)

Tool	Description
`chrome_new_tab`	Open new tab with optional URL
`chrome_switch_tab`	Switch to specific tab
`chrome_close_tabs`	Close tabs by URL pattern or tab IDs
`chrome_get_tabs`	List all open tabs
`chrome_duplicate_tab`	Duplicate current tab

Sources: README.md

Interaction (3 tools)

Tool	Parameters	Description
`chrome_click_element`	`selector` (CSS)	Click elements using CSS selectors
`chrome_fill_or_select`	`selector`, `value`	Fill forms and select options
`chrome_keyboard`	`text`, `shortcut`	Simulate keyboard input

Content Analysis (4 tools)

Tool	Description
`chrome_get_web_content`	Extract HTML/text content from pages
`chrome_get_interactive_elements`	Find clickable elements
`chrome_console`	Capture browser console output
`search_tabs_content`	AI-powered semantic search across tabs

Screenshot & Visual (1 tool)

Tool	Features
`chrome_screenshot`	Element targeting, full-page capture, custom dimensions

Network Monitoring (5 tools)

Tool	Description
`chrome_network_capture_start/stop`	webRequest API network capture
`chrome_network_debugger_start/stop`	Debugger API with response bodies
`chrome_network_request`	Send custom HTTP requests

Data Management (4 tools)

Tool	Description
`chrome_history`	Search browser history with time filters
`chrome_bookmark_search`	Find bookmarks by keywords
`chrome_bookmark_add`	Add bookmarks with folder support
`chrome_bookmark_delete`	Delete bookmarks

Sources: README.md

Content Script Injection

The extension uses content scripts for web page interaction. Scripts are injected via the Chrome scripting API:

await chrome.scripting.executeScript({
  target: { tabId },
  files: ['inject-scripts/web-fetcher-helper.js'],
});

Sources: app/chrome-extension/utils/content-indexer.ts

Web-accessible resources are configured in wxt.config.ts:

Resource Path	Purpose
`/models/*`	Local AI models
`/workers/*`	Web workers
`/inject-scripts/*`	Helper scripts for content scripts

web_accessible_resources: [
  {
    resources: [
      '/models/*',
      '/workers/*',
      '/inject-scripts/*',
    ],
    matches: ['<all_urls>'],
  },
]

Sources: app/chrome-extension/wxt.config.ts

Security Configuration

The extension implements Content Security Policy (CSP) for production builds:

content_security_policy: {
  extension_pages:
    "script-src 'self' 'wasm-unsafe-eval'; object-src 'self'; style-src 'self' 'unsafe-inline'; img-src 'self' data: blob:;",
}

Note: Security policies are disabled in development mode to allow Vite dev server resource loading.

Sources: app/chrome-extension/wxt.config.ts

URL Exclusion Patterns

The content indexer automatically excludes internal browser URLs:

const excludePatterns = [
  /^chrome:\/\//,
  /^chrome-extension:\/\//,
  /^edge:\/\//,
  /^about:/,
  /^moz-extension:\/\//,
  /^file:\/\//,
];

Sources: app/chrome-extension/utils/content-indexer.ts

Agent Thread Visualization

The sidepanel provides visual representation of agent activities:

// Tool kinds tracked in useAgentThreads
const toolKinds = ['run', 'grep', 'edit', 'read', 'search'] as const;

Each tool execution is categorized and displayed with:

Property	Description
`kind`	Tool category (edit, run, grep, etc.)
`title`	File name or command
`details`	Output content
`severity`	Error, success, or info status
`phase`	Execution phase (input, output, result)

Sources: app/chrome-extension/entrypoints/sidepanel/composables/useAgentThreads.ts

Troubleshooting

Common Issues

Issue	Solution
Extension not responding	Reload the extension at `chrome://extensions/`
Native messaging fails	Verify registry entries on Windows
Permission denied	Check that extension has required permissions
Tab access fails	Ensure tab ID is valid and accessible

Sources: app/native-server/install.md

Verbose Logging

Enable verbose logging for debugging:

mcp-chrome-bridge --verbose

Quick Usage Example

Once installed, you can interact with Chrome using natural language:

Navigate to github.com and search for mcp-chrome repositories

This triggers the following workflow:

sequenceDiagram
    participant User
    participant MCP
    participant NativeServer
    participant Extension
    participant Chrome
    
    User->>MCP: "Navigate to github.com..."
    MCP->>NativeServer: chrome_navigate
    NativeServer->>Extension: chrome.tabs.update()
    Extension->>Chrome: Navigate URL
    Chrome-->>Extension: Page loaded
    Extension-->>NativeServer: Success response
    NativeServer-->>MCP: Result
    MCP-->>User: Confirmation

Next Steps

Explore the Workflow Editor for advanced automation
Configure AI model settings in the Options page
Review Content Analysis capabilities
Set up Bookmark Management workflows

Sources: app/chrome-extension/README.md

System Architecture

Overview

mcp-chrome is a Model Context Protocol (MCP) server implementation that extends AI assistant capabilities to browser automation through a Chrome extension and native messaging host architecture. The system enables AI models to interact with web pages, control browser tabs, capture network traffic, and perform semantic search across browser content.

High-Level Architecture

The architecture consists of three primary layers:

Layer	Component	Responsibility
Chrome Extension	Background Service Worker	Hosts MCP tools, manages tab state, handles browser API calls
Native Server	Node.js Application	Manages browser configuration, handles native messaging, coordinates with AI engines
AI Integration	Claude Engine / Codex Engine	Executes AI logic, processes tool requests, manages sessions

Sources: app/chrome-extension/wxt.config.ts:1-50

Extension Entrypoint Architecture

The Chrome extension uses multiple HTML entrypoints, each serving a specific purpose:

graph TD
    A[Chrome Extension] --> B[Background Service Worker]
    A --> C[Popup]
    A --> D[Side Panel]
    A --> E[Options Page]
    A --> F[Welcome Page]
    A --> G[Builder]
    A --> H[Offscreen Document]
    
    B --> I[MCP Tools]
    B --> J[Tab Management]
    B --> K[Content Indexer]
    
    C --> L[Quick Actions]
    D --> M[Workflow Management]
    E --> N[Userscripts Manager]

Entrypoint Components

Entrypoint	File Path	Purpose
Background	`entrypoints/background/`	Service worker hosting all MCP tools
Popup	`entrypoints/popup/`	Quick access to AI chat interface
Side Panel	`entrypoints/sidepanel/`	Workflow management and agent threads
Builder	`entrypoints/builder/`	Visual workflow editor
Options	`entrypoints/options/`	Userscripts configuration
Welcome	`entrypoints/welcome/`	Onboarding page
Offscreen	`entrypoints/offscreen/`	Background document for long-running tasks

Sources: app/chrome-extension/entrypoints/popup/index.html:1-12, app/chrome-extension/entrypoints/sidepanel/index.html:1-12

Native Messaging Architecture

Cross-Platform Manifest Configuration

The native server uses platform-specific manifest paths for Chrome Native Messaging, enabling secure communication between the native application and Chrome extension:

graph TD
    A[Native Server] --> B{Platform Detection}
    B -->|win32| C[Windows Registry + JSON]
    B -->|darwin| D[macOS JSON]
    B -->|linux| E[Linux JSON]
    
    C --> F[User Manifest]
    C --> G[System Manifest]
    D --> H[User Manifest]
    D --> I[System Manifest]
    E --> J[User Manifest]
    E --> K[System Manifest]

Manifest Path Locations by Platform:

Platform	User-Level Path	System-Level Path
Windows	`%APPDATA%\Google\Chrome\NativeMessagingHosts\`	`%ProgramFiles%\Google\Chrome\NativeMessagingHosts\`
macOS	`~/Library/Application Support/Google/Chrome/NativeMessagingHosts/`	`/Library/Google/Chrome/NativeMessagingHosts/`
Linux	`~/.config/google-chrome/NativeMessagingHosts/`	`/etc/opt/chrome/native-messaging-hosts/`

Sources: app/native-server/src/scripts/browser-config.ts:1-100, app/native-server/src/scripts/utils.ts:1-50

Browser Support Matrix

On Windows, the native server registers the following registry keys for Chromium-based browsers:

Registry Scope	Chrome	Chromium	Edge
User-level (HKCU)	`HKCU\Software\Google\Chrome\`	`HKCU\Software\Chromium\`	Not configured
System-level (HKLM)	`HKLM\Software\Google\Chrome\`	`HKLM\Software\Chromium\`	Not configured

MCP Tool Architecture

Tool Categories

The extension exposes MCP tools organized into functional categories:

graph LR
    A[MCP Tools] --> B[Viewport & Navigation]
    A --> C[Interaction]
    A --> D[Data Management]
    A --> E[Screenshots & Visual]
    A --> F[Network Monitoring]
    A --> G[Content Analysis]
    
    B --> B1[chrome_navigate]
    B --> B2[chrome_switch_tab]
    B --> B3[chrome_close_tabs]
    B --> B4[chrome_go_back_or_forward]
    
    C --> C1[chrome_click_element]
    C --> C2[chrome_fill_or_select]
    C --> C3[chrome_keyboard]
    
    D --> D1[chrome_history]
    D --> D2[chrome_bookmark_*]
    
    E --> E1[chrome_screenshot]
    
    F --> F1[chrome_network_*]
    
    G --> G1[search_tabs_content]
    G --> G2[chrome_get_web_content]

Tab Management Tools

Tabs can be queried and closed using URL pattern matching:

const tabs = await chrome.tabs.query({ url: urlPattern });
await chrome.tabs.remove(tabIdsToClose);

Sources: app/chrome-extension/entrypoints/background/tools/browser/common.ts:1-80
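
The selection step can be tried outside the browser with a pure function over tab-like objects. The sketch below uses a simple `*` wildcard, which is a deliberate simplification of Chrome's match-pattern grammar; `wildcardToRegExp` and `selectTabIdsToClose` are hypothetical names.

```typescript
// Minimal stand-in for chrome.tabs.Tab, limited to the fields used here.
interface TabLike {
  id?: number;
  url?: string;
}

// Convert a "*" wildcard pattern to an anchored RegExp (illustrative only;
// Chrome's own match patterns have stricter scheme/host/path rules).
function wildcardToRegExp(pattern: string): RegExp {
  const escaped = pattern.replace(/[.+?^${}()|[\]\\]/g, '\\$&');
  return new RegExp('^' + escaped.replace(/\*/g, '.*') + '$');
}

// Collect the IDs of tabs whose URL matches the pattern.
function selectTabIdsToClose(tabs: TabLike[], pattern: string): number[] {
  const re = wildcardToRegExp(pattern);
  const ids: number[] = [];
  for (let i = 0; i < tabs.length; i++) {
    const tab = tabs[i];
    if (typeof tab.id === 'number' && typeof tab.url === 'string' && re.test(tab.url)) {
      ids.push(tab.id);
    }
  }
  return ids;
}
```

In the extension, the resulting ID list would be what gets passed to `chrome.tabs.remove`.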

Dialog Handling

The extension can handle JavaScript dialogs using Chrome DevTools Protocol (CDP):

await cdpSessionManager.sendCommand(tabId, 'Page.handleJavaScriptDialog', {
  accept: action === 'accept',
  promptText: action === 'accept' ? promptText : undefined,
});

Sources: app/chrome-extension/entrypoints/background/tools/browser/dialog.ts:1-50
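
The parameter object in the call above can be built and checked in isolation. This sketch uses the `accept` and `promptText` fields that CDP defines for `Page.handleJavaScriptDialog`; the helper name and the `DialogAction` type are hypothetical.

```typescript
type DialogAction = 'accept' | 'dismiss';

// Parameters of the CDP command Page.handleJavaScriptDialog.
interface HandleDialogParams {
  accept: boolean;
  promptText?: string;
}

// Hypothetical helper that mirrors the conditional logic in the excerpt above.
function buildDialogParams(action: DialogAction, promptText?: string): HandleDialogParams {
  const params: HandleDialogParams = { accept: action === 'accept' };
  // promptText is only sent when the dialog is being accepted.
  if (params.accept && promptText !== undefined) {
    params.promptText = promptText;
  }
  return params;
}
```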

Content Indexer System

The content indexer provides semantic search across open browser tabs:

Indexing Logic

Step	Action	Trigger
1	Detect tab update	`chrome.tabs.onUpdated` with `status === 'complete'`
2	Check semantic engine readiness	2-second delay to allow engine initialization
3	Extract content	Execute `web-fetcher-helper.js` injection script
4	Index content	Store in semantic engine for `search_tabs_content`
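
The four steps can be condensed into a single decision function. This is a sketch of the control flow only: the `'defer'` outcome stands in for the delay while the semantic engine initializes, and the names `decideIndexing`, `TabUpdate`, and `IndexDecision` are hypothetical.

```typescript
type IndexDecision = 'skip' | 'defer' | 'index';

interface TabUpdate {
  changeStatus?: string; // mirrors changeInfo.status from chrome.tabs.onUpdated
  url?: string;
}

// Decide what to do with a tab update, given engine readiness and an exclusion check.
function decideIndexing(
  update: TabUpdate,
  engineReady: boolean,
  isExcluded: (url: string) => boolean,
): IndexDecision {
  if (update.changeStatus !== 'complete') return 'skip'; // step 1: only finished loads
  if (!update.url || isExcluded(update.url)) return 'skip'; // internal or unknown URLs
  if (!engineReady) return 'defer'; // step 2: wait for the semantic engine
  return 'index'; // steps 3 and 4: extract and store content
}
```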

URL Exclusion Patterns

The indexer automatically excludes internal Chrome URLs:

const excludePatterns = [
  /^chrome:\/\//,
  /^chrome-extension:\/\//,
  /^edge:\/\//,
  /^about:/,
  /^moz-extension:\/\//,
  /^file:\/\//,
];

Sources: app/chrome-extension/utils/content-indexer.ts:1-100

Agent Session Management

Session Service Architecture

The native server manages AI agent sessions with structured metadata:

graph TD
    A[SessionService] --> B[ManagementInfo]
    A --> C[AgentSessionPreviewMeta]
    A --> D[Engine Configuration]
    
    B --> B1[models]
    B --> B2[commands]
    B --> B3[mcpServers]
    B --> B4[plugins]
    B --> B5[skills]
    
    D --> D1[ClaudeEngine]
    D --> D2[CodexEngine]

Session Interface

interface ManagementInfo {
  models?: Array<{ value: string; displayName: string; description: string }>;
  commands?: Array<{ name: string; description: string; argumentHint: string }>;
  account?: { email?: string; organization?: string; subscriptionType?: string };
  mcpServers?: Array<{ name: string; status: string }>;
  tools?: string[];
  agents?: string[];
  plugins?: Array<{ name: string; path?: string }>;
  skills?: string[];
  model?: string;
  permissionMode?: string;
  cwd?: string;
}

Sources: app/native-server/src/agent/session-service.ts:1-60

Claude Engine Integration

Tool Message Processing

The Claude engine handles streaming tool execution results:

sequenceDiagram
    participant AI as Claude API
    participant Engine as ClaudeEngine
    participant Dispatch as Tool Dispatch
    
    AI->>Engine: content_block_start (tool_use)
    Engine->>Dispatch: Register pending tool
    AI->>Engine: content_block_delta (input_json_delta)
    Engine->>Engine: Accumulate JSON parts
    AI->>Engine: content_block_stop
    Engine->>Engine: Parse accumulated JSON
    Engine->>Dispatch: Execute tool with full input
    Dispatch->>Engine: Result content
    Engine->>AI: tool_result
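
The accumulate-then-parse step in the diagram can be sketched as a small class keyed by content block index. The event shapes below are a reduced subset of the streaming events named in the diagram, and `ToolInputAccumulator` is a hypothetical name; the engine's real bookkeeping in claude.ts may differ.

```typescript
// Reduced subset of the streaming events relevant to tool input.
type StreamEvent =
  | { type: 'content_block_start'; index: number; content_block: { type: string; id?: string; name?: string } }
  | { type: 'content_block_delta'; index: number; delta: { type: string; partial_json?: string } }
  | { type: 'content_block_stop'; index: number };

interface CompletedToolCall {
  id: string;
  name: string;
  input: unknown;
}

class ToolInputAccumulator {
  private pending: { [index: number]: { id: string; name: string; parts: string[] } } = {};

  // Returns a completed call when a tool_use block ends, otherwise null.
  handle(event: StreamEvent): CompletedToolCall | null {
    if (event.type === 'content_block_start') {
      const block = event.content_block;
      if (block.type === 'tool_use') {
        this.pending[event.index] = { id: block.id || '', name: block.name || '', parts: [] };
      }
      return null;
    }
    if (event.type === 'content_block_delta') {
      const entry = this.pending[event.index];
      if (entry && event.delta.type === 'input_json_delta') {
        entry.parts.push(event.delta.partial_json || '');
      }
      return null;
    }
    // content_block_stop: parse the JSON fragments collected for this block.
    const done = this.pending[event.index];
    if (!done) return null;
    delete this.pending[event.index];
    const json = done.parts.join('');
    return { id: done.id, name: done.name, input: json === '' ? {} : JSON.parse(json) };
  }
}
```

Parsing only at `content_block_stop` matters because individual `partial_json` fragments are generally not valid JSON on their own.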

Tool Result Processing

if (contentBlock.type === 'tool_result') {
  const metadata = this.buildToolResultMetadata(contentBlock);
  const content = this.extractToolResultContent(contentBlock);
  const isError = contentBlock.is_error === true;
  
  dispatchToolMessage(
    isError ? `Error: ${content || 'Tool execution failed'}` : content || 'Tool completed',
    metadata,
    'tool_result',
    false
  );
}

Sources: app/native-server/src/agent/engines/claude.ts:1-100

Security Architecture

Content Security Policy

The extension implements strict CSP in production:

Policy	Value	Purpose
script-src	`'self' 'wasm-unsafe-eval'`	Restrict script execution
object-src	`'self'`	Limit object embedding
style-src	`'self' 'unsafe-inline'`	Allow Vite-compiled styles
img-src	`'self' data: blob:`	Permit data URIs for thumbnails

Cross-Origin Policies

Policy	Value	Development	Production
COOP	`same-origin`	Disabled	Enabled
COEP	`require-corp`	Disabled	Enabled

Sources: app/chrome-extension/wxt.config.ts:20-35

Web Accessible Resources

The extension exposes specific directories to content scripts:

Directory	Purpose
`/models/*`	AI model files
`/workers/*`	Web Worker scripts
`/inject-scripts/*`	Content script helpers

Record-Replay System

The background service includes a record-replay engine for browser automation testing:

Wait Policy

The waitForNetworkIdle function monitors browser events:

Event Type	Listener	Purpose
`onCommitted`	webNavigation	Detect navigation commits
`onCompleted`	webNavigation	Detect page load completion
`onHistoryStateUpdated`	webNavigation (optional)	Detect SPA route changes
`onUpdated`	tabs	Detect tab status changes

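// Excerpt: tabId, prevUrl, and mark() come from the enclosing scope in wait.ts
const onUpdated = (updatedId: number, change: chrome.tabs.TabChangeInfo) => {
  if (updatedId !== tabId) return; // ignore updates for other tabs
  if (change.status === 'loading') mark(); // the tab started loading
  // the tab's URL changed (or a URL is seen for the first time)
  if (typeof change.url === 'string' && (!prevUrl || change.url !== prevUrl)) mark();
};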

Sources: app/chrome-extension/entrypoints/background/record-replay/engine/policies/wait.ts:1-60
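
One plausible reading of the `mark()` calls above is a "last activity" timestamp that an idle check compares against a quiet period. The sketch below illustrates that pattern with an explicit clock so it can run without browser events; `ActivityTracker` is a hypothetical name, and the actual logic in wait.ts is not shown in this manual.

```typescript
// Tracks the time of the most recent activity signal.
class ActivityTracker {
  private lastActivity: number;

  constructor(startTime: number) {
    this.lastActivity = startTime;
  }

  // Called whenever a navigation or tab event fires.
  mark(now: number): void {
    this.lastActivity = now;
  }

  // True once no activity has been seen for at least quietMs milliseconds.
  isIdle(now: number, quietMs: number): boolean {
    return now - this.lastActivity >= quietMs;
  }
}

// Example timeline (milliseconds): activity at t=1000, checks at t=1200 and t=1500.
const tracker = new ActivityTracker(0);
tracker.mark(1000);
const idleEarly = tracker.isIdle(1200, 500); // only 200 ms of quiet
const idleLater = tracker.isIdle(1500, 500); // 500 ms of quiet
```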

Data Flow Diagram

graph LR
    A[User] -->|Command| B[AI Assistant]
    B -->|MCP Request| C[Native Server]
    C -->|Native Messaging| D[Chrome Extension]
    D -->|Chrome APIs| E[Browser]
    E -->|Response| D
    D -->|Tool Result| C
    C -->|Stream| B
    B -->|Response| A

Extension Development Configuration

Keyboard Shortcuts

Shortcut	Windows/Linux	macOS	Action
Quick Panel	`Alt+Shift+U`	`Cmd+Shift+U`	Toggle AI Chat

Vite Plugin Stack

Plugin	Purpose
TailwindCSS v4	Styling without PostCSS
`@vitejs/plugin-vue`	Vue single-file component (SFC) support
WXT	Extension framework

Sources: app/chrome-extension/wxt.config.ts:40-50

Sources: app/chrome-extension/wxt.config.ts:1-50

Communication Protocols

Overview

The mcp-chrome project connects a Chrome extension to a native Node.js server. Several communication mechanisms work together to enable AI-powered browser automation through the Model Context Protocol (MCP).

The system employs two primary communication mechanisms:

Native Messaging Protocol - Chrome extension to native host communication using Chrome's runtime.connectNative API
Internal Message Routing - Communication between background scripts and tool handlers within the extension

Architecture Overview

graph TD
    A[Chrome Extension UI/Popup] -->|chrome.runtime.sendMessage| B[Background Script]
    B -->|call_tool messages| C[Tool Handlers]
    C -->|CDP Commands| D[Chrome DevTools Protocol]
    B -->|connectNative| E[Native Host]
    E -->|stdio| F[Node.js MCP Server]
    F -->|WebSocket/CDP| G[Browser Instance]
    
    H[Content Scripts] -->|injected scripts| G
    
    I[Offscreen Document] -->|file operations| E

Native Messaging Protocol

Purpose and Scope

The Native Messaging Protocol enables secure communication between the Chrome extension and a standalone native application. This is essential for operations that require direct system access, such as file system operations, native module execution, and port management for the MCP server.

Sources: app/native-server/README.md

Connection Establishment

The extension initiates native connections using Chrome's runtime.connectNative API with a predefined application ID:

nativePort = chrome.runtime.connectNative('com.yourcompany.fastify_native_host');

Sources: app/native-server/README.md:40

Auto-Connect Behavior

The extension attempts to connect to the native host automatically on browser startup and on extension install or update:

// Auto-connect on Chrome browser startup
chrome.runtime.onStartup.addListener(() => {
  void ensureNativeConnected('onStartup').catch(() => {});
});

// Auto-connect on extension install/update
chrome.runtime.onInstalled.addListener(() => {
  void ensureNativeConnected('onInstalled').catch(() => {});
});

Sources: app/chrome-extension/entrypoints/background/native-host.ts:1-20

Message Types

Message Type	Direction	Purpose
`call_tool`	Extension → Native	Request tool execution
`ENSURE_NATIVE`	UI → Background	Trigger connection check
`CONNECT_NATIVE`	UI → Background	User-initiated connection
`forward_to_native`	Background → Native	Route messages to native host
`file_operation`	Extension → Native	File preparation for uploads
`started`	Native → Extension	Server startup confirmation
`stopped`	Native → Extension	Server shutdown notification
`error`	Native → Extension	Error reporting

Message Format

Messages follow a structured format with type and payload fields:

chrome.runtime.sendMessage({
  type: 'forward_to_native',
  message: {
    type: 'file_operation',
    requestId: requestId,
    payload: {
      action: 'prepareFile',
      fileUrl,
      base64Data,
      fileName,
    },
  },
});

Sources: app/chrome-extension/entrypoints/background/tools/browser/file-upload.ts:58-75
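
The envelope shown above can be modeled with TypeScript types. The following is a minimal sketch rather than the project's actual type definitions: the names `FileOperationMessage`, `ForwardToNativeMessage`, `buildPrepareFileMessage`, and `isForwardToNative` are hypothetical, and only the field names are taken from the snippet above.

```typescript
// Hypothetical types mirroring the fields shown in the message snippet.
interface FileOperationMessage {
  type: 'file_operation';
  requestId: string;
  payload: {
    action: 'prepareFile';
    fileUrl?: string;
    base64Data?: string;
    fileName?: string;
  };
}

interface ForwardToNativeMessage {
  type: 'forward_to_native';
  message: FileOperationMessage;
}

// Build the envelope that the background script would route to the native host.
function buildPrepareFileMessage(
  requestId: string,
  file: { fileUrl?: string; base64Data?: string; fileName?: string },
): ForwardToNativeMessage {
  return {
    type: 'forward_to_native',
    message: {
      type: 'file_operation',
      requestId,
      payload: { action: 'prepareFile', ...file },
    },
  };
}

// Narrow an unknown runtime message to the envelope type.
function isForwardToNative(value: unknown): value is ForwardToNativeMessage {
  if (typeof value !== 'object' || value === null) return false;
  const candidate = value as { type?: unknown; message?: unknown };
  return (
    candidate.type === 'forward_to_native' &&
    typeof candidate.message === 'object' &&
    candidate.message !== null
  );
}
```

Keeping the outer `type` (routing) separate from the inner `type` (operation) lets the background script forward messages without inspecting their payload.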

Internal Message Routing

Tool Call Routing

The background script routes tool calls from various sources (UI, content scripts) to appropriate handlers:

chrome.runtime.onMessage.addListener((message, _sender, sendResponse) => {
  // Allow UI to call tools directly
  if (message && message.type === 'call_tool' && message.name) {
    handleCallTool({ name: message.name, args: message.args })
      .then((res) => sendResponse({ success: true, result: res }))
      .catch((err) =>
        sendResponse({ success: false, error: err instanceof Error ? err.message : String(err) }),
      );
    return true;
  }
  // ... additional routing logic
});

Sources: app/chrome-extension/entrypoints/background/native-host.ts:28-45

Connection Management States

stateDiagram-v2
    [*] --> Disconnected
    Disconnected --> Connecting: ensureNativeConnected()
    Connecting --> Connected: Port established
    Connecting --> Disconnected: Connection failed
    Connected --> Disconnected: Port disconnect
    Connected --> AutoConnectDisabled: User explicit disconnect
    AutoConnectDisabled --> Connecting: CONNECT_NATIVE message
    AutoConnectDisabled --> Connecting: ensureNativeConnected()
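
The diagram can be expressed as a transition table. This is a sketch of the state logic only, assuming the event names below as stand-ins for the diagram's edge labels; `ConnectionState`, `ConnectionEvent`, and `nextState` are hypothetical names, not identifiers from the extension.

```typescript
type ConnectionState = 'Disconnected' | 'Connecting' | 'Connected' | 'AutoConnectDisabled';

type ConnectionEvent =
  | 'ensureNativeConnected'
  | 'CONNECT_NATIVE'
  | 'portEstablished'
  | 'connectionFailed'
  | 'portDisconnect'
  | 'userDisconnect';

// Transition table derived from the diagram; unlisted pairs leave the state unchanged.
const transitions: Record<ConnectionState, Partial<Record<ConnectionEvent, ConnectionState>>> = {
  Disconnected: { ensureNativeConnected: 'Connecting' },
  Connecting: { portEstablished: 'Connected', connectionFailed: 'Disconnected' },
  Connected: { portDisconnect: 'Disconnected', userDisconnect: 'AutoConnectDisabled' },
  AutoConnectDisabled: { CONNECT_NATIVE: 'Connecting', ensureNativeConnected: 'Connecting' },
};

// Compute the next state for an event, ignoring events that do not apply.
function nextState(state: ConnectionState, event: ConnectionEvent): ConnectionState {
  return transitions[state][event] ?? state;
}
```

For example, `nextState('Connected', 'userDisconnect')` yields `'AutoConnectDisabled'`, while an event that has no edge in the diagram, such as `portEstablished` in `Disconnected`, leaves the state as it was.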

File Operation Protocol

File Upload Flow

The file upload mechanism uses a request-response pattern with unique request IDs:

const requestId = `${Date.now()}-${Math.random().toString(36).substring(2, 9)}`;

const handleMessage = (message: any) => {
  if (
    message.type === 'file_operation_response' &&
    message.responseToRequestId === requestId
  ) {
    clearTimeout(timeout);
    chrome.runtime.onMessage.removeListener(handleMessage);
    
    if (message.payload?.success && message.payload?.filePath) {
      resolve(message.payload.filePath);
    } else {
      resolve(null);
    }
  }
};

Sources: app/chrome-extension/entrypoints/background/tools/browser/file-upload.ts:25-48

Timeout Handling

File operations implement a 30-second timeout to prevent hanging connections:

const timeout = setTimeout(() => {
  console.error('File preparation request timed out');
  resolve(null);
}, 30000); // 30 second timeout

Sources: app/chrome-extension/entrypoints/background/tools/browser/file-upload.ts:14-18
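
The request ID correlation and the timeout fit together as a small registry of pending requests. The class below is an illustrative sketch, not the extension's code: `PendingFileRequests` and its methods are hypothetical, and it replaces the `chrome.runtime.onMessage` listener with a plain `handle` method so that the logic can run outside the browser.

```typescript
interface FileOperationResponse {
  type: 'file_operation_response';
  responseToRequestId: string;
  payload?: { success?: boolean; filePath?: string };
}

type Settle = (filePath: string | null) => void;

class PendingFileRequests {
  private readonly pending = new Map<
    string,
    { settle: Settle; timer: ReturnType<typeof setTimeout> }
  >();

  // Register a request; settle(null) is called if no response arrives in time.
  register(requestId: string, settle: Settle, timeoutMs = 30000): void {
    const timer = setTimeout(() => {
      this.pending.delete(requestId);
      settle(null);
    }, timeoutMs);
    this.pending.set(requestId, { settle, timer });
  }

  // Returns true when the response matched a pending request.
  handle(response: FileOperationResponse): boolean {
    const entry = this.pending.get(response.responseToRequestId);
    if (!entry) return false;
    clearTimeout(entry.timer);
    this.pending.delete(response.responseToRequestId);
    const { success, filePath } = response.payload ?? {};
    entry.settle(success && filePath ? filePath : null);
    return true;
  }

  get size(): number {
    return this.pending.size;
  }
}
```

Clearing the timer when a response arrives matters as much as the timeout itself: otherwise a late timer would settle a request that has already completed.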

Native Host Manifest Configuration

Platform-Specific Paths

The native messaging host manifest location varies by operating system:

Platform	User-Level Path
Windows	`%APPDATA%\Google\Chrome\NativeMessagingHosts\`
macOS	`~/Library/Application Support/Google/Chrome/NativeMessagingHosts/`
Linux	`~/.config/google-chrome/NativeMessagingHosts/`

Platform	System-Level Path
Windows	`%ProgramFiles%\Google\Chrome\NativeMessagingHosts\`
macOS	`/Library/Google/Chrome/NativeMessagingHosts/`
Linux	`/etc/opt/chrome/native-messaging-hosts/`

Sources: app/native-server/src/scripts/utils.ts:1-35

Browser-Specific Configuration

Different Chromium-based browsers have distinct manifest locations:

switch (browser) {
  case BrowserType.CHROME:
    return path.join(home, 'Library', 'Application Support', 'Google', 'Chrome', 'NativeMessagingHosts', `${HOST_NAME}.json`);
  case BrowserType.CHROMIUM:
    return path.join(home, 'Library', 'Application Support', 'Chromium', 'NativeMessagingHosts', `${HOST_NAME}.json`);
}

Sources: app/native-server/src/scripts/browser-config.ts:1-50

Manifest Validation

The doctor command validates manifest files for correctness:

Validation Check	Error Message
File existence	Manifest not found
JSON parsing	Failed to parse manifest
Name field	name != HOST_NAME
Type field	type != stdio
Path field	path is missing
Path existence	path target does not exist

Sources: app/native-server/src/scripts/doctor.ts:1-50
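
The checks in the table can be read as a short validation routine. The function below is a sketch under stated assumptions: `validateManifest` is a hypothetical name, the file read and the path lookup are passed in by the caller (`rawJson` is `null` when the file is missing), and the error strings are copied from the table rather than from `doctor.ts`.

```typescript
interface ManifestCheckResult {
  ok: boolean;
  errors: string[];
}

function validateManifest(
  rawJson: string | null,
  hostName: string,
  pathExists: (path: string) => boolean,
): ManifestCheckResult {
  if (rawJson === null) return { ok: false, errors: ['Manifest not found'] };

  let parsed: unknown;
  try {
    parsed = JSON.parse(rawJson);
  } catch {
    return { ok: false, errors: ['Failed to parse manifest'] };
  }
  if (typeof parsed !== 'object' || parsed === null) {
    return { ok: false, errors: ['Failed to parse manifest'] };
  }

  const manifest = parsed as { name?: unknown; type?: unknown; path?: unknown };
  const errors: string[] = [];
  if (manifest.name !== hostName) errors.push('name != HOST_NAME');
  if (manifest.type !== 'stdio') errors.push('type != stdio');
  if (typeof manifest.path !== 'string' || manifest.path.length === 0) {
    errors.push('path is missing');
  } else if (!pathExists(manifest.path)) {
    // Only meaningful once a path is present.
    errors.push('path target does not exist');
  }
  return { ok: errors.length === 0, errors };
}
```

The field checks are collected rather than short-circuited, so a single run can report every problem in a manifest that parses.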

Tab Operation Protocol

URL Pattern Matching

The close tabs functionality supports glob-style URL pattern matching:

let urlPattern = url;
if (!urlPattern.includes('*')) {
  try {
    new URL(urlPattern);
    urlPattern = urlPattern.endsWith('/') ? `${urlPattern}*` : `${urlPattern}/*`;
  } catch {
    urlPattern = urlPattern.endsWith('*')
      ? urlPattern
      : urlPattern.endsWith('/')
        ? `${urlPattern}*`
        : `${urlPattern}/*`;
  }
}

Sources: app/chrome-extension/entrypoints/background/tools/browser/common.ts:1-30
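
For input that contains no wildcard, both branches of the snippet above end up appending either `*` or `/*`, so the outcome can be summarized in a smaller function. The helper below is a hypothetical condensation for illustration (`toTabQueryPattern` is not a name from the source); it shows what pattern each kind of input produces.

```typescript
// Turn a plain URL into a trailing-wildcard pattern; leave existing patterns alone.
function toTabQueryPattern(url: string): string {
  if (url.includes('*')) return url; // already a pattern
  return url.endsWith('/') ? `${url}*` : `${url}/*`;
}

const examples = [
  'https://github.com',
  'https://github.com/',
  'https://*.github.com/issues',
  'github.com',
].map((input) => ({ input, pattern: toTabQueryPattern(input) }));
```

Here `https://github.com` and `https://github.com/` both become `https://github.com/*`, the third input is returned unchanged, and the scheme-less `github.com` becomes `github.com/*`.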

Tab Query and Response Format

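const tabs = await chrome.tabs.query({ url: urlPattern });
// Collect the IDs of the matching tabs (tab.id can be undefined)
const tabIdsToClose = tabs
  .map((tab) => tab.id)
  .filter((id): id is number => id !== undefined);
await chrome.tabs.remove(tabIdsToClose);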

return {
  content: [{
    type: 'text',
    text: JSON.stringify({
      success: true,
      message: `Closed ${tabIdsToClose.length} tabs with URL: ${url}`,
      closedCount: tabIdsToClose.length,
      closedTabIds: tabIdsToClose,
    }),
  }],
  isError: false,
};

Sources: app/chrome-extension/entrypoints/background/tools/browser/common.ts:60-80

Port Configuration Protocol

Stdio Configuration

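The doctor command checks that the port in the configured server URL matches the expected port. On a mismatch it reports an error and suggests the update-port command as the fix: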

const url = new URL(configValue.url as string);
const port = Number(url.port);
const portOk = port === EXPECTED_PORT;

checks.push({
  id: 'port.config',
  title: 'Port config',
  status: portOk ? 'ok' : 'error',
  message: configValue.url as string,
  details: {
    expectedPort: EXPECTED_PORT,
    actualPort: port,
    fix: portOk ? undefined : [`${COMMAND_NAME} update-port ${EXPECTED_PORT}`],
  },
});

Sources: app/native-server/src/scripts/doctor.ts:80-100
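
One detail of this check is easy to miss: `URL.port` is an empty string when the URL has no explicit port or uses the scheme's default port, and `Number('')` evaluates to `0`. The sketch below isolates the comparison to show that behavior. `checkConfiguredPort` is a hypothetical helper, and the port `12306` is used purely as an example value.

```typescript
interface PortCheck {
  status: 'ok' | 'error';
  expectedPort: number;
  actualPort: number;
}

function checkConfiguredPort(configUrl: string, expectedPort: number): PortCheck {
  const url = new URL(configUrl);
  // '' (omitted or scheme-default port) converts to 0 here.
  const actualPort = Number(url.port);
  return {
    status: actualPort === expectedPort ? 'ok' : 'error',
    expectedPort,
    actualPort,
  };
}

const explicit = checkConfiguredPort('http://127.0.0.1:12306/mcp', 12306);
const omitted = checkConfiguredPort('http://127.0.0.1/mcp', 12306);
const schemeDefault = checkConfiguredPort('http://127.0.0.1:80/mcp', 80);
```

`explicit` passes, `omitted` reports an actual port of `0`, and `schemeDefault` also reports `0` because the URL parser drops a port equal to the scheme default. A check written this way therefore cannot succeed when the expected port is 80 for `http` or 443 for `https`.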

Node.js Path Management

Version Consistency

The system ensures consistent Node.js versions between installation and runtime to avoid native module version mismatches:

export function writeNodePathFile(distDir: string, nodeExecPath = process.execPath): void {
  try {
    const nodePathFile = path.join(distDir, 'node_path.txt');
    fs.mkdirSync(distDir, { recursive: true });
    // Write Node.js executable path for runtime consistency
  } catch (error) {
    // Error handling
  }
}

Sources: app/native-server/src/scripts/utils.ts:80-100

Security Considerations

Cross-Origin Policies

Production builds implement strict COEP and COOP headers:

cross_origin_embedder_policy: { value: 'require-corp' as const },
cross_origin_opener_policy: { value: 'same-origin' as const },
content_security_policy: {
  extension_pages: "script-src 'self' 'wasm-unsafe-eval'; object-src 'self'; style-src 'self' 'unsafe-inline'; img-src 'self' data: blob:;",
}

Sources: app/chrome-extension/wxt.config.ts:1-30

Web Accessible Resources

Only specific resource paths are accessible to web pages:

web_accessible_resources: [
  {
    resources: [
      '/models/*',
      '/workers/*',
      '/inject-scripts/*',
    ],
    matches: ['<all_urls>'],
  },
],

Sources: app/chrome-extension/wxt.config.ts:20-30

Summary

The mcp-chrome communication architecture consists of multiple protocol layers:

Native Messaging Layer - Chrome-to-native bridge using Chrome's native messaging API
Internal Message Routing - Efficient routing of tool calls within the extension
File Operation Protocol - Request-response pattern with timeout handling
Tab Management Protocol - URL pattern-based tab operations
Port Configuration Protocol - Stdio-based MCP server communication

All protocols emphasize reliability through timeout handling, error reporting, and connection state management.

Sources: app/native-server/README.md

Chrome Extension Structure

Related topics: Browser Tools and APIs, Storage and Data Management

Overview

The mcp-chrome project implements a Chrome Extension that serves as a bridge between a Chrome browser instance and an AI-powered automation system. The extension enables Large Language Models (LLMs) to control browser behavior, capture web content, manage tabs, inject scripts, and automate user interactions through the Model Context Protocol (MCP).

The extension is built using WXT, a modern framework for Chrome extension development, which handles manifest generation, build configuration, and hot module replacement. The architecture follows a multi-process model with background service workers, content scripts, and UI entrypoints communicating through Chrome's message passing APIs and native messaging.

Sources: app/chrome-extension/wxt.config.ts:1-100

Browser Tools and APIs

Related topics: Chrome Extension Structure, Record and Replay Engine

Overview

The Browser Tools and APIs module is a core component of the mcp-chrome extension that provides AI agents with programmatic control over browser functionality. This module bridges the gap between AI decision-making and browser operations, enabling automated web navigation, content extraction, tab management, and user interaction simulation.

The system operates as a collection of Chrome extension background script tools that expose browser capabilities through a structured MCP (Model Context Protocol) interface, allowing LLM-powered agents to interact with the browser as if performing actions themselves.

Sources: docs/TOOLS.md

Architecture

High-Level Architecture

graph TD
    A[MCP Client / AI Agent] -->|MCP Protocol| B[Native Server]
    B -->|Native Messaging| C[Chrome Extension Background]
    C -->|Chrome APIs| D[Browser Environment]
    
    E[Content Scripts] -->|Injection| D
    F[Injected Scripts] -->|Web Fetcher| D
    
    G[Sidepanel UI] -->|User Interaction| C
    H[Popup UI] -->|Quick Actions| C

Tool Registration Flow

sequenceDiagram
    participant T as Tool Registry
    participant B as Background Script
    participant C as Chrome API
    participant M as MCP Bridge
    
    T->>B: Register tool handlers
    B->>C: Initialize chrome.tabs listeners
    C->>M: Expose via native messaging
    M->>T: Forward tool calls

The extension registers tools through the shared tools.ts module, which defines the complete tool schema including names, descriptions, input parameters, and response formats. The background script then connects these tool definitions to actual Chrome API implementations.

Sources: app/chrome-extension/shared/tools.ts:1-50

Tool Categories

The browser tools are organized into the following functional categories:

Category	Tools	Purpose
Navigation	`chrome_navigate`, `chrome_go_back_or_forward`	Page navigation and history control
Tab Management	`chrome_switch_tab`, `chrome_close_tabs`, `chrome_close_other_tabs`	Tab lifecycle management
View Control	`chrome_set_viewport`, `chrome_screenshot`	Visual viewport manipulation and capture
Content Extraction	`chrome_get_web_content`, `chrome_get_interactive_elements`	Page content retrieval
User Interaction	`chrome_click_element`, `chrome_fill_or_select`, `chrome_keyboard`	Simulated user actions
Data Management	`chrome_bookmark_*`, `chrome_history`	Bookmark and history access
Network Monitoring	`chrome_network_*`	HTTP traffic inspection
Script Injection	`chrome_inject_script`, `chrome_send_command_to_inject_script`	Custom script execution

Sources: docs/TOOLS.md

chrome_navigate

Navigates the current tab to a specified URL or performs search queries.

Parameters:

Parameter	Type	Required	Description
`url`	`string`	Yes	Target URL or search query
`options`	`object`	No	Navigation options

Options:

Option	Type	Default	Description
`waitUntil`	`string`	`"networkidle2"`	When to consider navigation complete
`timeout`	`number`	`30000`	Navigation timeout in milliseconds

Implementation Details:

The navigation tool first validates the input and converts search queries into search engine URLs. It then looks for an open tab that already matches the URL and activates it; a new tab is created only when no match exists, which avoids duplicate tabs.

// Simplified navigation flow
async function chrome_navigate(url: string, options?: NavigationOptions) {
  const normalizedUrl = validateAndNormalizeUrl(url);
  const existingTab = await findExistingTab(normalizedUrl);
  
  if (existingTab) {
    await chrome.tabs.update(existingTab.id, { active: true });
    return existingTab;
  }
  
  return await chrome.tabs.create({ url: normalizedUrl, active: true });
}

Sources: app/chrome-extension/entrypoints/background/tools/browser/index.ts
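
The input-handling step can be illustrated with a small pure function. Everything in this sketch is an assumption made for illustration: `normalizeNavigationTarget` is a hypothetical stand-in for `validateAndNormalizeUrl`, the heuristics for telling a URL from a query are deliberately simple, and the search URL prefix is an example default rather than the engine the extension uses.

```typescript
function normalizeNavigationTarget(
  input: string,
  searchBase = 'https://www.google.com/search?q=', // example default, not from the source
): string {
  const trimmed = input.trim();

  // Input with an explicit scheme (https://, http://, ...) is parsed as-is.
  if (/^[a-z][a-z0-9+.-]*:\/\//i.test(trimmed)) {
    return new URL(trimmed).href;
  }

  // A bare host such as "example.com/docs" gets https:// prepended.
  if (/^[^\s/]+\.[^\s/]+(\/\S*)?$/.test(trimmed)) {
    return new URL(`https://${trimmed}`).href;
  }

  // Anything else is treated as a search query.
  return `${searchBase}${encodeURIComponent(trimmed)}`;
}
```

With these rules, `example.com/docs` becomes `https://example.com/docs`, while `typescript best practices` becomes a search URL with the query percent-encoded. A heuristic like this misclassifies some inputs (for instance, the query `node.js` looks like a host), so a production version would need stricter rules.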

chrome_go_back_or_forward

Navigates browser history by a specified number of steps.

Parameters:

Parameter	Type	Required	Description
`historySteps`	`number`	Yes	Number of steps (negative for back, positive for forward)

Tab Management Tools

Tab Query System

graph LR
    A[URL Pattern] -->|Parse| B[Tab Filter]
    B -->|Query| C[chrome.tabs.query]
    C -->|Results| D[Tab IDs Array]
    
    E[Tab ID] -->|Direct| D
    F[Window ID] -->|Window Filter| D

The tab management system supports multiple query mechanisms:

URL Pattern Matching: Glob-style patterns with wildcard support
Direct Tab ID: Integer-based tab identification
Window Scoping: Restricting operations to specific windows

Pattern Matching Implementation:

private async queryTabsByUrl(urlPattern: string): Promise<number[]> {
  // Normalize pattern with proper glob-to-regex conversion
  const pattern = urlPattern
    .replace(/\./g, '\\.')
    .replace(/\*/g, '.*')
    .replace(/\?/g, '.');
  
  const tabs = await chrome.tabs.query({ url: `<all_urls>` });
  return tabs
    .filter(tab => tab.url?.match(new RegExp(pattern)))
    .map(tab => tab.id)
    .filter((id): id is number => id !== undefined);
}

Sources: app/chrome-extension/entrypoints/background/tools/browser/common.ts
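
The conversion above escapes only the dot, so other regular-expression metacharacters in a URL (such as `+`, `(`, or `[`) keep their special meaning, and the resulting expression is not anchored. A stricter variant is sketched below; `globToRegExp` is a hypothetical helper, and anchoring the pattern to the whole URL is a design choice of this sketch rather than documented behavior.

```typescript
// Convert a glob with * and ? wildcards into an anchored regular expression.
function globToRegExp(glob: string): RegExp {
  const source = glob
    .replace(/[.+^${}()|[\]\\]/g, '\\$&') // escape every metacharacter except * and ?
    .replace(/\*/g, '.*') // * matches any run of characters
    .replace(/\?/g, '.'); // ? matches exactly one character
  return new RegExp(`^${source}$`);
}

// Filter a list of URLs with a glob, compiling the expression once.
function filterUrlsByGlob(urls: string[], glob: string): string[] {
  const pattern = globToRegExp(glob);
  return urls.filter((url) => pattern.test(url));
}
```

Compiling the expression once outside the filter callback also avoids rebuilding it for every tab.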

chrome_close_tabs

Closes tabs matching a URL pattern or specified tab IDs.

Response Schema:

interface CloseTabsResponse {
  success: boolean;
  message: string;
  closedCount: number;
  closedTabIds: number[];
}

Example Response:

{
  "success": true,
  "message": "Closed 3 tabs with URL: https://github.com/*",
  "closedCount": 3,
  "closedTabIds": [123, 456, 789]
}

chrome_switch_tab

Activates a specific tab by ID or switches to adjacent tabs relative to the current tab.

Parameters:

Parameter	Type	Description
`tabId`	`number`	Target tab ID
`offset`	`number`	Relative offset from current tab

View Control Tools

chrome_screenshot

Captures screenshots with multiple targeting modes.

Parameters:

Parameter	Type	Required	Description
`captureArea`	`string`	No	`"fullpage"`, `"viewport"`, or CSS selector
`selector`	`string`	No	CSS selector for element targeting
`options`	`object`	No	Screenshot options (quality, format)

Capture Modes:

Mode	Description	Use Case
`viewport`	Current visible area	Quick preview
`fullpage`	Entire scrollable page	Complete documentation
`element`	Specific element by selector	Targeted capture

Sources: docs/TOOLS.md

chrome_set_viewport

Controls the browser viewport dimensions for responsive testing and content adaptation.

Parameters:

Parameter	Type	Required	Description
`width`	`number`	Yes	Viewport width in pixels
`height`	`number`	Yes	Viewport height in pixels
`deviceScaleFactor`	`number`	No	Device pixel ratio (default: 1)

Content Extraction Tools

chrome_get_web_content

Extracts structured content from web pages including text, metadata, and semantic elements.

Parameters:

Parameter	Type	Description
`extractOptions`	`object`	Configuration for content extraction
`metadata`	`boolean`	Include page metadata (default: true)

Extracted Metadata Fields:

Field	Source	Description
`title`	`<title>`, `og:title`, JSON-LD	Page title
`byline`	`author`, `article:author`	Content author
`excerpt`	`description`, `og:description`	Page summary
`siteName`	`og:site_name`	Website name
`publishedTime`	`article:published_time`	Publication date

Content Script Injection:

// Web fetcher helper extracts structured data from pages
const response = await chrome.scripting.executeScript({
  target: { tabId },
  files: ['inject-scripts/web-fetcher-helper.js'],
});

Sources: app/chrome-extension/inject-scripts/web-fetcher-helper.js

chrome_get_interactive_elements

Retrieves clickable and interactive elements from a page for automation targeting.

Response:

interface InteractiveElementsResponse {
  elements: Array<{
    selector: string;      // CSS selector for the element
    tagName: string;       // HTML tag name
    text: string;          // Visible text content
    attributes: Record<string, string>;
    isClickable: boolean;
    boundingRect: DOMRect;
  }>;
}

search_tabs_content

AI-powered semantic search across all open browser tabs.

Parameters:

Parameter	Type	Description
`query`	`string`	Semantic search query
`maxResults`	`number`	Maximum results to return

User Interaction Tools

chrome_click_element

Simulates mouse clicks on elements identified by CSS selectors.

Parameters:

Parameter	Type	Required	Description
`selector`	`string`	Yes	CSS selector for target element
`button`	`string`	No	Mouse button (`left`, `middle`, `right`)
`clickCount`	`number`	No	Number of clicks

chrome_fill_or_select

Fills form inputs and selects options.

Supported Input Types:

Input Type	Action
`input[type=text]`	Text input
`input[type=email]`	Email input
`input[type=password]`	Password input
`textarea`	Multi-line text
`select`	Dropdown selection
`input[type=checkbox]`	Toggle state
`input[type=radio]`	Radio selection

Parameters:

Parameter	Type	Required	Description
`selector`	`string`	Yes	Target element selector
`value`	`string`	Yes	Value to input or select
`options`	`object`	No	Additional options

chrome_keyboard

Simulates keyboard input and shortcuts.

Parameters:

Parameter	Type	Required	Description
`text`	`string`	No	Text to type
`shortcuts`	`string[]`	No	Keyboard shortcuts to execute
`options`	`object`	No	Keyboard simulation options

Supported Shortcuts:

const shortcuts = [
  'Ctrl+C',      // Copy
  'Ctrl+V',      // Paste
  'Ctrl+A',      // Select all
  'Ctrl+K',      // Clear
  'Enter',       // Submit
  'Escape',      // Cancel/Dismiss
  'Tab',         // Next element
  'Shift+Tab',   // Previous element
];

Network Monitoring Tools

chrome_network_capture_start / stop

Controls webRequest API-based network traffic capture.

Parameters:

Parameter	Type	Description
`options`	`object`	Capture configuration
`filterUrls`	`string[]`	URL patterns to monitor

Captured Data:

Field	Description
`url`	Request URL
`method`	HTTP method
`headers`	Request/response headers
`status`	HTTP status code
`timing`	Request timing data
`bodySize`	Response body size

chrome_network_debugger_start / stop

Enables Chrome DevTools Protocol debugger for detailed request/response inspection including response bodies.

chrome_network_request

Sends custom HTTP requests through the browser context, useful for bypassing CORS and maintaining session state.

Script Injection System

chrome_inject_script

Injects JavaScript files into web pages for extended functionality.

Implementation:

await chrome.scripting.executeScript({
  target: { tabId },
  files: ['inject-scripts/web-fetcher-helper.js'],
});

Web Accessible Resources Configuration:

The extension allows injection of specific resource categories defined in wxt.config.ts:

web_accessible_resources: [
  {
    resources: [
      '/models/*',           // ML models
      '/workers/*',          // Web workers
      '/inject-scripts/*',   // Helper scripts
    ],
    matches: ['<all_urls>'],
  },
]

Sources: app/chrome-extension/wxt.config.ts

chrome_send_command_to_inject_script

Sends structured commands to injected content scripts via a messaging bridge.

Command Flow:

sequenceDiagram
    participant B as Background Script
    participant C as Content Script
    participant P as Web Page
    
    B->>C: chrome.tabs.sendMessage
    C->>P: postMessage to page context
    P-->>C: Response message
    C-->>B: sendResponse callback

Bookmark and History Tools

Bookmark Operations

Tool	Function
`chrome_bookmark_search`	Search bookmarks by keywords
`chrome_bookmark_add`	Create new bookmarks with folder support
`chrome_bookmark_delete`	Remove bookmarks

chrome_history

Searches browser history with temporal filtering.

Parameters:

Parameter	Type	Description
`query`	`string`	Search query
`maxResults`	`number`	Maximum entries
`startTime`	`number`	Unix timestamp lower bound
`endTime`	`number`	Unix timestamp upper bound

Error Handling

All browser tools implement consistent error handling patterns:

interface ToolResponse {
  content: Array<{
    type: 'text';
    text: string;  // JSON stringified response or error
  }>;
  isError: boolean;
}

Error Response Format:

{
  "success": false,
  "error": "Error description",
  "code": "ERROR_CODE"
}
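
The two formats combine as follows: the JSON payload is serialized into the `text` field of the first content item, and `isError` signals the outcome at the envelope level. The helpers below are a sketch with hypothetical names (`okResponse`, `errorResponse`, `parseToolResponse`), and the error code in the usage note is invented for illustration.

```typescript
interface ToolResponse {
  content: Array<{ type: 'text'; text: string }>;
  isError: boolean;
}

// Wrap a successful result in the tool response envelope.
function okResponse(data: Record<string, unknown>): ToolResponse {
  return {
    content: [{ type: 'text', text: JSON.stringify({ success: true, ...data }) }],
    isError: false,
  };
}

// Wrap a failure in the same envelope, using the error payload shape.
function errorResponse(error: string, code: string): ToolResponse {
  return {
    content: [{ type: 'text', text: JSON.stringify({ success: false, error, code }) }],
    isError: true,
  };
}

// Client side: decode the JSON payload carried in the first text item.
function parseToolResponse(response: ToolResponse): Record<string, unknown> {
  const first = response.content[0];
  if (!first) throw new Error('Tool response has no content');
  return JSON.parse(first.text) as Record<string, unknown>;
}
```

A caller can branch on `isError` first and parse the payload only when it needs the details, for example `parseToolResponse(errorResponse('Tab not found', 'TAB_NOT_FOUND')).code`.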

URL Exclusion Patterns:

Internal Chrome URLs are automatically excluded from operations:

const excludePatterns = [
  /^chrome:\/\//,
  /^chrome-extension:\/\//,
  /^edge:\/\//,
  /^about:/,
  /^moz-extension:\/\//,
  /^file:\/\//,
];

Sources: app/chrome-extension/utils/content-indexer.ts
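
Applying the list is a matter of testing each URL against every pattern. The sketch below repeats the patterns so that it is self-contained; `isExcludedUrl` and `filterIndexableUrls` are hypothetical helper names.

```typescript
const excludedUrlPatterns: RegExp[] = [
  /^chrome:\/\//,
  /^chrome-extension:\/\//,
  /^edge:\/\//,
  /^about:/,
  /^moz-extension:\/\//,
  /^file:\/\//,
];

// True when the URL belongs to a browser-internal or local scheme.
function isExcludedUrl(url: string): boolean {
  return excludedUrlPatterns.some((pattern) => pattern.test(url));
}

// Keep only the URLs that tools are allowed to operate on.
function filterIndexableUrls(urls: string[]): string[] {
  return urls.filter((url) => !isExcludedUrl(url));
}
```

Because none of the patterns use the `g` flag, `RegExp.prototype.test` keeps no state between calls and the same pattern objects can be reused safely.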

Security Considerations

Cross-Origin Policies

The extension implements COOP/COEP headers in production to enable SharedArrayBuffer and other features:

...(IS_DEV
  ? {}
  : {
      cross_origin_embedder_policy: { value: 'require-corp' as const },
      cross_origin_opener_policy: { value: 'same-origin' as const },
    })

Content Security Policy

Production builds enforce strict CSP:

script-src 'self' 'wasm-unsafe-eval'; 
object-src 'self'; 
style-src 'self' 'unsafe-inline'; 
img-src 'self' data: blob:;

Sources: app/chrome-extension/wxt.config.ts

Usage with AI Agents

Because every browser tool is exposed through MCP, an agent can chain several tools within a single task:

// Example: Agent task to research a topic
const task = {
  instruction: "Search for TypeScript best practices and summarize the top 5 tips",
  tools: [
    "chrome_navigate",      // Go to search engine
    "chrome_get_web_content", // Extract results
    "chrome_screenshot",    // Document findings
  ]
};

The tools provide structured, parseable responses that AI agents can use for decision-making and subsequent actions, enabling complex multi-step workflows in the browser environment.

Sources: docs/TOOLS.md

Record and Replay Engine

Related topics: Browser Tools and APIs, Storage and Data Management

Overview

The Record and Replay Engine is a Chrome extension feature within mcp-chrome that enables automatic capture and reproduction of user browser interactions. It records navigation events, user actions, and DOM interactions into a structured flow, then replays these flows to automate repetitive browser tasks.

The engine operates as a background service within the Chrome extension, using Chrome's extension APIs to monitor and control browser behavior across tabs and windows.

Architecture

graph TD
    subgraph Recording
        BEL[Browser Event Listener] -->|captures| Session[Recording Session]
        Session -->|stores| Flow[Flow Data]
    end
    
    subgraph Replay
        FR[Flow Runner] -->|executes| Steps[Flow Steps]
        Steps -->|waits for| NI[Network Idle Detection]
        Steps -->|injects| IS[Injected Scripts]
    end
    
    subgraph Triggers
        TriggerStore[Trigger Store] -->|activates| FR
        DOMObs[DOM Observer] -->|detects| TriggerStore
    end
    
    Flow -->|loads| FR
    TriggerStore -->|manages| Flow

Core Components

Session Manager

The session manager tracks the current recording state and maintains a list of active tabs under observation.

Key Responsibilities:

Track recording status (recording, idle, paused)
Manage set of active tabs during recording
Store and retrieve flow data from Chrome storage

State Management:

session.getStatus() !== 'recording'  // True when no recording is in progress
session.addActiveTab(tabId)           // Track tab for targeted STOP
session.removeActiveTab(tabId)        // Cleanup when tab closes
session.getFlow()                     // Retrieve current flow data

Sources: browser-event-listener.ts:25-30

Browser Event Listener

The event listener hooks into Chrome's navigation and tab APIs to capture user interactions during recording.

Monitored Events:

Event Source	Event Type	Recording Behavior
`chrome.webNavigation.onCommitted`	Link navigation	Record if `transitionType === 'link'`
`chrome.webNavigation.onCommitted`	reload/typed/generated	Always record
`chrome.webNavigation.onCommitted`	auto_bookmark/keyword	Record
`chrome.webNavigation.onCommitted`	form_submit	Record for search navigations
`chrome.tabs.onRemoved`	Tab close	Remove from active set
`chrome.tabs.onUpdated`	Status/URL change	Mark navigation event

Transition Types Tracked:

const shouldRecord = 
  t === 'reload' ||
  t === 'typed' ||
  t === 'generated' ||
  t === 'auto_bookmark' ||
  t === 'keyword' ||
  t === 'form_submit';

Sources: browser-event-listener.ts:8-22

Flow Runner

The flow runner executes recorded flows by iterating through steps and performing the recorded actions.

Execution Pipeline:

Load flow from storage by flowId
Resolve template variables from execution arguments
Execute steps sequentially
Handle variable assignment between steps
Wait for network idle between navigation steps

Key Features:

Template variable expansion using {variable} syntax
Variable assignment from step outputs via assign map
Network idle detection for reliable page load waits
Support for both DOM-based and programmatic triggers

Sources: index.ts:1-50

DOM Trigger System

Triggers enable automatic replay initiation based on DOM element presence or changes.

Trigger Configuration:

{
  id: string,
  type: 'dom',
  enabled: boolean,
  selector: string,       // CSS selector for target element
  appear: boolean,        // Trigger on element appearance
  once: boolean,          // Fire only once per session
  debounceMs: number      // Default: 800ms
}

Trigger Lifecycle:

Store triggers in chrome.storage.local under STORAGE_KEYS.RR_TRIGGERS
Inject dom-observer.js into target tab
Send trigger configuration via message to injected script
DOM observer monitors for selector matches
On match, initiate flow replay

Sources: index.ts:60-85
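
The interaction between `once` and `debounceMs` can be sketched as a small gate that decides whether a selector match should start a replay. This is an illustration under stated assumptions: `createTriggerGate` is a hypothetical helper, and it treats `debounceMs` as a minimum interval between firings (a leading-edge throttle), which may differ from how `dom-observer.js` actually applies the value.

```typescript
interface DomTriggerOptions {
  once: boolean;
  debounceMs: number;
}

// Returns a function that reports whether a match observed at `nowMs` should fire.
function createTriggerGate(options: DomTriggerOptions): (nowMs: number) => boolean {
  let lastFiredAt: number | null = null;
  return (nowMs: number): boolean => {
    if (lastFiredAt !== null) {
      if (options.once) return false; // already fired in this session
      if (nowMs - lastFiredAt < options.debounceMs) return false; // too soon
    }
    lastFiredAt = nowMs;
    return true;
  };
}
```

Passing the timestamp in, instead of reading the clock inside the gate, keeps the logic deterministic and easy to test.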

Utility Functions

applyAssign: Maps values from a source object to a target using path notation.

applyAssign(target, source, {
  'variableName': 'path.to.value',
  'nested.value': 'data[0].field'
});

expandTemplatesDeep: Recursively expands template variables in any value type.

// Input: "Navigate to {baseUrl}/home"
// Scope: { baseUrl: 'https://example.com' }
// Output: "Navigate to https://example.com/home"

Sources: rr-utils.ts:1-50
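
To make the two utilities concrete, the sketch below reimplements their described behavior from scratch. It is not the code in `rr-utils.ts`: the `getByPath` helper is hypothetical, placeholders use the `{variable}` form, unresolved placeholders are left untouched, and assignment keys are treated as flat variable names. All of these are assumptions of this sketch.

```typescript
type Scope = Record<string, unknown>;

// Read a value by a path such as 'path.to.value' or 'data[0].field'.
function getByPath(source: unknown, path: string): unknown {
  const keys = path.replace(/\[(\d+)\]/g, '.$1').split('.').filter(Boolean);
  let current: unknown = source;
  for (const key of keys) {
    if (current === null || typeof current !== 'object') return undefined;
    current = (current as Record<string, unknown>)[key];
  }
  return current;
}

// Recursively replace {name} placeholders in strings, arrays, and objects.
function expandTemplatesDeep(value: unknown, scope: Scope): unknown {
  if (typeof value === 'string') {
    return value.replace(/\{([^{}]+)\}/g, (match: string, name: string) => {
      const resolved = getByPath(scope, name.trim());
      return resolved === undefined ? match : String(resolved);
    });
  }
  if (Array.isArray(value)) return value.map((item) => expandTemplatesDeep(item, scope));
  if (value !== null && typeof value === 'object') {
    const out: Record<string, unknown> = {};
    for (const [key, item] of Object.entries(value)) {
      out[key] = expandTemplatesDeep(item, scope);
    }
    return out;
  }
  return value; // numbers, booleans, null, undefined pass through
}

// Copy values from a step output into the variable scope.
function applyAssign(target: Scope, source: unknown, assign: Record<string, string>): void {
  for (const [name, path] of Object.entries(assign)) {
    target[name] = getByPath(source, path);
  }
}
```

Together they support chaining: `applyAssign` stores part of one step's output in the scope, and `expandTemplatesDeep` substitutes it into the next step's parameters.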

Recording Process

1. Initialization

When recording starts:

Session status transitions to recording
Active tab list is initialized
Event listeners are attached to Chrome APIs

2. Event Capture

sequenceDiagram
    participant U as User
    participant C as Chrome API
    participant L as Event Listener
    participant S as Session
    
    U->>C: Click/Type/Navigate
    C->>L: webNavigation.onCommitted
    L->>L: Check transition type
    L->>S: getFlow()
    S-->>L: Current flow object
    L->>S: addNavigationStep(url)
    L->>L: ensureRecorderInjected(tabId)
    L->>L: broadcastControlToTab(START)
    S->>S: addActiveTab(tabId)

Navigation events are captured with the current tab URL:

const tab = await chrome.tabs.get(tabId);
const url = tab.url || details.url;
if (flow && url) addNavigationStep(flow, url);

3. Script Injection

After each navigation, the extension re-injects the core and recorder content scripts into the tab and restarts recording there, since a page load discards previously injected scripts:

await ensureCoreInjected(details.tabId);
await ensureRecorderInjected(tabId);
await broadcastControlToTab(tabId, REC_CMD.START);

Replay Process

1. Flow Loading

Flows are retrieved by ID and validated before execution:

const flow = await getFlow(t.flowId);
if (!flow) return;
await runFlow(flow, { args: t.args || {}, returnLogs: false });

2. Variable Expansion

Template variables in flow steps are resolved from the execution arguments:

expandTemplatesDeep(value, { ...scope, ...args });

3. Step Execution

Each step in the flow is executed in topological order, with:

Network idle waits between navigation steps
DOM query waits for element visibility
Error handling for selector mismatches

4. Network Idle Detection

The engine waits for network activity to settle before proceeding:

export async function waitForNetworkIdle(
  tabId: number,
  sniffMs: number = 2000
): Promise<void>

Detection Strategy:

Monitor onCommitted, onCompleted, onHistoryStateUpdated
Track tab loading status via tabs.onUpdated
Set timeout for sniff period (default: 2000ms)

Sources: wait.ts:1-50
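
The core idea of a sniff period is a quiet window: the page counts as idle once no navigation or loading activity has been seen for `sniffMs`. The class below sketches that bookkeeping only. `QuietWindow` is a hypothetical name, timestamps are passed in explicitly, and whether `waitForNetworkIdle` restarts its timer on every event in this way is an assumption.

```typescript
class QuietWindow {
  private readonly sniffMs: number;
  private lastActivityMs: number;

  constructor(sniffMs: number, startMs: number) {
    this.sniffMs = sniffMs;
    this.lastActivityMs = startMs;
  }

  // Call whenever a navigation or tab-loading event is observed.
  recordActivity(nowMs: number): void {
    this.lastActivityMs = nowMs;
  }

  // Idle once a full sniff period has passed without activity.
  isIdle(nowMs: number): boolean {
    return nowMs - this.lastActivityMs >= this.sniffMs;
  }
}
```

In a real implementation, `recordActivity` would be wired to the monitored events and `isIdle` would be polled or replaced by a timer that is reset on each event.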

Data Models

Flow Structure

interface Flow {
  id: string;
  name?: string;
  steps: Step[];
  variables?: Record<string, any>;
  createdAt: number;
}

interface Step {
  id: string;
  type: 'navigation' | 'interaction' | 'wait';
  action?: string;
  selector?: string;
  url?: string;
  value?: any;
  assign?: Record<string, string>;  // Variable assignments
}

Trigger Storage

interface StoredTriggers {
  triggers: Array<{
    id: string;
    type: 'dom' | 'network';
    enabled: boolean;
    selector?: string;
    appear?: boolean;
    once?: boolean;
    debounceMs?: number;
  }>;
}

Security Considerations

Content Security Policy

The extension enforces strict CSP in production:

content_security_policy: {
  extension_pages: 
    "script-src 'self' 'wasm-unsafe-eval'; object-src 'self'; " +
    "style-src 'self' 'unsafe-inline'; img-src 'self' data: blob:;"
}

Script Injection Safety

Scripts are injected only into controlled pages
DOM observers use isolated worlds
Script injection is preheated only for valid flow replays

Configuration

Flow Execution Options

interface RunFlowOptions {
  args?: Record<string, string>;      // Template variable values
  returnLogs?: boolean;               // Include execution logs
  triggerId?: string;                 // Associated trigger ID
}

Recording Commands

enum REC_CMD {
  START = 'start',      // Begin recording in tab
  STOP = 'stop',        // End recording in tab
  PAUSE = 'pause',      // Pause current recording
  RESUME = 'resume'     // Resume paused recording
}

Extension Points

Custom Event Handlers

The engine supports extension through the shared utility layer:

import { TOOL_NAMES, topoOrder, mapNodeToStep } from 'chrome-mcp-shared';

Policy-Based Waiting

Additional wait policies can be registered for specialized scenarios:

// In engine/policies/wait.ts
export { waitForNetworkIdle };

Component	Location	Purpose
DOM Observer	`inject-scripts/dom-observer.js`	Monitors DOM for trigger elements
Web Fetcher	`inject-scripts/web-fetcher-helper.js`	Extracts page content
Element Marker	`inject-scripts/element-marker.js`	Visual element selection UI

Usage Example

Recording a flow:

Open Chrome DevTools → MCP tab
Click "Start Recording"
Perform browser actions (navigate, fill forms, click)
Click "Stop Recording"
Flow is saved with captured steps

Replaying a flow:

Load saved flow by ID
Optionally provide template variable values
Engine executes steps with network idle waits
Variables are assigned and can be used in subsequent steps

Sources: browser-event-listener.ts:25-30

MCP Server Implementation

Related topics: Communication Protocols, Browser Tools and APIs, AI Agent Engines

Overview

The MCP Server Implementation provides a bridge between AI coding assistants and Chrome browser automation through the Model Context Protocol (MCP). This component enables AI agents to interact with the browser by exposing a comprehensive set of tools for navigation, content extraction, interaction, and network monitoring.

Sources: app/native-server/src/mcp/register-tools.ts

Architecture

System Components

The MCP Server Implementation consists of four primary components that work together to enable AI-browser interaction:

Component	File	Purpose
MCP Server Core	`mcp-server.ts`	Core protocol implementation and message handling
Tool Registry	`register-tools.ts`	Tool registration and definition
Stdio Transport	`mcp-server-stdio.ts`	Standard I/O communication layer
Server Entry	`server/index.ts`	Server initialization and lifecycle management

Sources: app/native-server/src/mcp/mcp-server.ts, app/native-server/src/mcp/register-tools.ts

Communication Flow

graph TD
    A[AI Client] -->|MCP Protocol| B[Stdio Transport Layer]
    B --> C[MCP Server Core]
    C --> D[Tool Registry]
    D --> E[Chrome Extension]
    E --> F[Browser Tabs]
    F -->|Results| E
    E -->|Tool Results| D
    D -->|Structured Response| C
    C -->|JSON-RPC| B
    B -->|stdout| A

MCP Server Core

The core server implementation handles the MCP protocol lifecycle, managing tool invocations, tool results, and resource operations.

Sources: app/native-server/src/mcp/mcp-server.ts

Message Handling

Tool results are dispatched to the client as tool_result messages, with an Error: prefix applied when execution fails. The excerpt below comes from the Claude engine:

// Message dispatch pattern
dispatchToolMessage(
  isError
    ? `Error: ${content || 'Tool execution failed'}`
    : content || 'Tool completed',
  metadata,
  'tool_result',
  false,
);

Sources: app/native-server/src/agent/engines/claude.ts:50-56

Tool Metadata Building

Tool metadata is constructed with full input details to provide context to AI clients:

const metadata = buildToolMetadata({
  name: pending.toolName,
  id: pending.toolId,
  input,
});

Sources: app/native-server/src/agent/engines/claude.ts:85-89

Tool Categories

Browser Management (7 tools)

Tool Name	Description	Key Parameters
`chrome_navigate`	Navigate to URLs or perform searches	`url`, `query`
`chrome_go_back_or_forward`	Browser navigation control	`direction`
`chrome_switch_tab`	Switch the current active tab	`tabId`
`chrome_close_tabs`	Close specific tabs or windows	`tabIds`, `windowIds`
`chrome_reload`	Reload current page or tabs	`tabId`
`chrome_inject_script`	Inject content scripts into web pages	`script`, `tabId`
`chrome_send_command_to_inject_script`	Send commands to injected scripts	`command`, `tabId`

Interaction Tools (3 tools)

Tool Name	Description
`chrome_click_element`	Click elements using CSS selectors
`chrome_fill_or_select`	Fill forms and select options
`chrome_keyboard`	Simulate keyboard input and shortcuts

Data Management (4 tools)

Tool Name	Description	Parameters
`chrome_history`	Search browser history with time filters	`query`, `startTime`, `endTime`
`chrome_bookmark_search`	Find bookmarks by keywords	`query`
`chrome_bookmark_add`	Add new bookmarks with folder support	`url`, `title`, `folder`
`chrome_bookmark_delete`	Delete bookmarks	`bookmarkId`

Network Monitoring (5 tools)

Tool Name	Description	API Used
`chrome_network_capture_start/stop`	webRequest API network capture	`chrome.webRequest`
`chrome_network_debugger_start/stop`	Debugger API with response bodies	`chrome.debugger`
`chrome_network_request`	Send custom HTTP requests	`fetch` via background

Content Analysis (4 tools)

Tool Name	Description
`search_tabs_content`	AI-powered semantic search across browser tabs
`chrome_get_web_content`	Extract HTML/text content from pages
`chrome_get_interactive_elements`	Find clickable elements
`chrome_console`	Capture and retrieve console output

Visual Tools (1 tool)

Tool Name	Description	Parameters
`chrome_screenshot`	Advanced screenshot capture	`element`, `fullPage`, `width`, `height`

Sources: app/native-server/src/mcp/register-tools.ts

Tool Parameter Extraction

The Claude engine extracts structured metadata from each tool invocation, keyed on the tool name, to give clients meaningful context:

// Bash/shell - command extraction
if (normalizedName === 'bash' || normalizedName.includes('shell')) {
  if (typeof input.command === 'string') {
    metadata.command = input.command;
  }
  if (typeof input.description === 'string') {
    metadata.commandDescription = input.description;
  }
}

Search Tool Metadata

// Search tools (grep, glob)
if (normalizedName === 'grep' || normalizedName.includes('search')) {
  if (typeof input.pattern === 'string') metadata.pattern = input.pattern;
  if (typeof input.path === 'string') metadata.searchPath = input.path;
  if (typeof input.glob === 'string') metadata.glob = input.glob;
  if (typeof input.output_mode === 'string') metadata.outputMode = input.output_mode;
}

Sources: app/native-server/src/agent/engines/claude.ts:120-140

Stdio Transport Layer

The stdio transport layer handles communication between the MCP server and AI clients through standard input/output streams.

sequenceDiagram
    participant AI as AI Client
    participant Stdio as Stdio Transport
    participant Server as MCP Server Core
    participant Chrome as Chrome Extension

    AI->>Stdio: JSON-RPC Request (stdin)
    Stdio->>Server: Parsed Request
    Server->>Chrome: Tool Invocation
    Chrome->>Chrome: Execute in Browser
    Chrome-->>Server: Tool Result
    Server-->>Stdio: JSON-RPC Response
    Stdio-->>AI: stdout Response

Sources: app/native-server/src/mcp/mcp-server-stdio.ts
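
MCP's stdio transport exchanges JSON-RPC 2.0 messages as newline-delimited JSON. The sketch below frames a `tools/call` request and splits a buffered stream back into messages. The helper names (`frame`, `parseFrames`) are hypothetical and the code does not reflect how `mcp-server-stdio.ts` is written; in practice an MCP SDK normally handles framing.

```typescript
interface JsonRpcRequest {
  jsonrpc: '2.0';
  id: number;
  method: string;
  params?: Record<string, unknown>;
}

// Serializes one message per line; JSON.stringify escapes newlines inside strings.
function frame(message: JsonRpcRequest): string {
  return JSON.stringify(message) + '\n';
}

// Splits buffered input into complete messages and returns the unconsumed tail.
function parseFrames(buffer: string): { messages: JsonRpcRequest[]; rest: string } {
  const lines = buffer.split('\n');
  const rest = lines.pop() ?? ''; // last piece is an incomplete line (or empty)
  const messages = lines
    .filter((line) => line.trim() !== '')
    .map((line) => JSON.parse(line) as JsonRpcRequest);
  return { messages, rest };
}

const request: JsonRpcRequest = {
  jsonrpc: '2.0',
  id: 1,
  method: 'tools/call',
  params: { name: 'chrome_navigate', arguments: { url: 'https://example.com' } },
};
const wire = frame(request);
```

Returning the tail lets the caller prepend it to the next chunk read from stdin, so a message split across two reads is still parsed once it is complete.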

Port Configuration

The stdio transport requires proper port configuration to communicate with the Chrome extension:

const url = new URL(configValue.url as string);
const port = Number(url.port);
const portOk = port === EXPECTED_PORT;

Sources: app/native-server/src/scripts/doctor.ts:120-128
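
One subtlety in this kind of check: `URL.port` is an empty string when the URL uses its scheme's default port, so `Number(url.port)` yields 0 in that case. A defensive variant, sketched below with a hypothetical `checkPort` helper, resolves default ports and tolerates malformed URLs. The port 3000 in the usage note is an arbitrary example value.

```typescript
interface PortCheck {
  ok: boolean;
  actualPort: number | null; // null when the URL cannot be parsed
}

function checkPort(rawUrl: string, expectedPort: number): PortCheck {
  let url: URL;
  try {
    url = new URL(rawUrl);
  } catch {
    return { ok: false, actualPort: null };
  }
  // An empty port string means the scheme default (80 for http, 443 for https).
  const defaults: Record<string, number> = { 'http:': 80, 'https:': 443 };
  const actualPort =
    url.port !== '' ? Number(url.port) : (defaults[url.protocol] ?? null);
  return { ok: actualPort === expectedPort, actualPort };
}
```

For example, `checkPort('http://127.0.0.1:3000/mcp', 3000)` reports `ok: true`, while a URL without an explicit port reports the scheme default as `actualPort`.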

Agent Session Integration

The MCP server integrates with the agent session service to support multi-session environments:

export interface AgentSessionPreviewMeta {
  /** Compact display text */
  displayText?: string;
  /** Client metadata for special rendering */
  clientMeta?: {
    kind?: string;
    // ...
  };
}

Session Management Configuration

export interface SessionConfig {
  mcpServers?: Record<string, unknown>;
  outputFormat?: Record<string, unknown>;
  enableFileCheckpointing?: boolean;
  sandbox?: Record<string, unknown>;
  env?: Record<string, string>;
  codexConfig?: Partial<CodexEngineConfig>;
}

Sources: app/native-server/src/agent/session-service.ts:15-35

Claude Engine Integration

The Claude engine reassembles streamed tool input and derives preview metadata from it:

Tool Input Parsing

// Parse accumulated JSON input
const fullJsonStr = pending.inputJsonParts.join('');
let input: Record<string, unknown> = {};
try {
  if (fullJsonStr) {
    input = JSON.parse(fullJsonStr);
  }
} catch (e) {
  console.error(`[ClaudeEngine] Failed to parse tool input JSON: ${e}`);
}

Content Preview Extraction

// Write tool - content preview
if (normalizedName.includes('write') || normalizedName === 'create_file') {
  if (typeof input.content === 'string') {
    metadata.contentPreview = input.content.slice(0, 200);
    metadata.totalLines = input.content.split('\n').length;
  }
}

// Read tool - offset/limit
if (normalizedName.includes('read')) {
  if (typeof input.offset === 'number') metadata.offset = input.offset;
  if (typeof input.limit === 'number') metadata.limit = input.limit;
}

Sources: app/native-server/src/agent/engines/claude.ts:90-110

Chrome Extension Communication

The MCP server communicates with the Chrome extension through a defined message protocol:

graph LR
    A[MCP Server] -->|Native Messaging| B[Background Script]
    B --> C[Content Scripts]
    C -->|DOM Access| D[Web Pages]
    B -->|Tabs API| E[Browser Tabs]

Web Accessible Resources

The extension exposes necessary resources for tool execution:

web_accessible_resources: [
  {
    resources: [
      '/models/*',        // AI models
      '/workers/*',       // Web workers
      '/inject-scripts/*' // Content script helpers
    ],
    matches: ['<all_urls>'],
  },
]

Sources: app/chrome-extension/wxt.config.ts

Error Handling

A tool result is treated as an error when it carries an explicit error flag or when its content begins with Error:.

Error Detection Patterns

const isError =
  meta.is_error === true ||
  meta.isError === true ||
  (typeof msg.content === 'string' && msg.content.trimStart().startsWith('Error:'));

Tool Severity Classification

Condition	Severity
`isError` is true	`error`
Tool execution successful	`success`
Tool execution in progress	`info`
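
The detection expression and the severity table can be combined into one small classifier. This is a sketch: the `ToolMessage` shape and its `done` flag are assumptions introduced for the example.

```typescript
type Severity = 'error' | 'success' | 'info';

interface ToolMessage {
  content?: unknown;
  meta?: { is_error?: boolean; isError?: boolean };
  done: boolean; // hypothetical flag: false while the tool is still running
}

// Mirrors the three indicators: two explicit flags and the "Error:" content prefix.
function isToolError(msg: ToolMessage): boolean {
  const meta = msg.meta ?? {};
  return (
    meta.is_error === true ||
    meta.isError === true ||
    (typeof msg.content === 'string' && msg.content.trimStart().startsWith('Error:'))
  );
}

// Errors win over completion state; otherwise completion decides success vs. info.
function classifySeverity(msg: ToolMessage): Severity {
  if (isToolError(msg)) return 'error';
  return msg.done ? 'success' : 'info';
}
```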

Configuration Management

Environment-Specific Security

...(IS_DEV
  ? {}
  : {
      cross_origin_embedder_policy: { value: 'require-corp' as const },
      cross_origin_opener_policy: { value: 'same-origin' as const },
      content_security_policy: {
        extension_pages:
          "script-src 'self' 'wasm-unsafe-eval'; object-src 'self'; style-src 'self' 'unsafe-inline'; img-src 'self' data: blob:;",
      },
    })

Sources: app/chrome-extension/wxt.config.ts

Diagnostic Tools

The doctor script (doctor.ts) includes checks for port configuration:

Check ID	Title	Purpose
`port.config`	Port config	Verify stdio-config.json port
`port.constant`	Port constant	Verify native server port constant

Doctor Script Output

checks.push({
  id: 'port.config',
  title: 'Port config',
  status: portOk ? 'ok' : 'error',
  message: configValue.url as string,
  details: {
    expectedPort: EXPECTED_PORT,
    actualPort: port,
    fix: portOk ? undefined : [`${COMMAND_NAME} update-port ${EXPECTED_PORT}`],
  },
});

Sources: app/native-server/src/scripts/doctor.ts:130-145

Summary

The MCP Server Implementation provides a robust bridge between AI coding assistants and Chrome browser automation. Key features include:

Protocol Compliance: Full MCP protocol implementation with JSON-RPC message handling
Comprehensive Tools: 24+ tools covering navigation, interaction, data management, network monitoring, and content analysis
Structured Metadata: Rich tool result metadata for AI context understanding
Flexible Transport: Stdio-based communication for easy integration with various AI clients
Security: Environment-aware security policies and proper isolation
Debugging: Built-in diagnostic capabilities for troubleshooting connectivity issues

Sources: app/native-server/src/mcp/register-tools.ts

AI Agent Engines

Related topics: MCP Server Implementation, Browser Tools and APIs

Overview

The AI Agent Engines system is a core architectural component of the MCP Chrome project that provides an abstraction layer for interacting with different LLM (Large Language Model) backends. This module enables the browser extension to leverage various AI providers (Claude, Codex) for intelligent automation, workflow execution, and browser control.

The engine system follows a unified interface pattern, allowing seamless switching between different AI providers while maintaining consistent behavior for tool execution, message handling, and state management.

Sources: app/native-server/src/agent/engines/claude.ts:1-50, app/native-server/src/agent/engines/codex.ts:1-50

Architecture

System Components

graph TD
    A[Chat Service] --> B[Tool Bridge]
    B --> C[Claude Engine]
    B --> D[Codex Engine]
    C --> E[Claude API]
    D --> F[Codex API]
    E --> G[Stream Events]
    F --> G
    G --> H[Tool Dispatcher]
    H --> I[Browser/Tools]

Supported Engines

Engine	Provider	Status	Primary Use Case
Claude	Anthropic	Active	General AI assistance, code generation
Codex	OpenAI	Active	Code-focused tasks, GitHub integration

Sources: app/native-server/src/agent/engines/claude.ts:1-30, app/native-server/src/agent/engines/codex.ts:1-30

Engine Interface

Base Capabilities

All agent engines implement a common interface that handles:

Message Streaming: Real-time event streaming from AI providers
Tool Execution: Routing tool calls to appropriate handlers
Content Parsing: Processing multi-modal content blocks
Error Handling: Graceful error management and recovery

Event Processing

The engines process events through a unified event loop:

sequenceDiagram
    participant API as AI API
    participant Engine as Agent Engine
    participant Bridge as Tool Bridge
    participant Browser as Browser/Tools
    
    API->>Engine: content_block_start
    Engine->>Engine: accumulateToolInput()
    API->>Engine: content_block_delta
    Engine->>Engine: parseToolInput()
    API->>Engine: content_block_stop
    Engine->>Bridge: dispatchToolMessage()
    Bridge->>Browser: executeTool()
    Browser->>Bridge: toolResult
    Bridge->>Engine: forwardResult
    Engine->>API: streamResponse

Sources: app/native-server/src/agent/engines/claude.ts:50-120, app/native-server/src/agent/engines/codex.ts:50-100

Claude Engine

Overview

The Claude Engine (claude.ts) implements the Anthropic Claude API integration, handling streaming responses and tool execution through the Claude Messages API.

Core Features

Content Block Handling

The engine processes different content block types:

content_block_start: Initializes tool input accumulation
content_block_delta: Accumulates tool input JSON chunks
content_block_stop: Finalizes and parses accumulated tool input

Tool Result Processing

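// Extract text from a tool_result block; its content may be a string or an array of blocks
const extractToolResultContent = (contentBlock: {
  type: string;
  content?: string | Array<{ type: string; text?: string }>;
}): string | null => {
  if (contentBlock.type !== 'tool_result') return null;
  const { content } = contentBlock;
  if (typeof content === 'string') return content;
  if (!Array.isArray(content)) return null;
  return content.map((block) => block.text ?? '').join('') || null;
};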

Sources: app/native-server/src/agent/engines/claude.ts:80-100

Error Handling

The engine detects errors through multiple indicators:

Error Indicator	Source	Priority
`is_error: true`	API response	High
`isError: true`	Metadata	High
Content starts with "Error:"	Message content	Medium

Sources: app/native-server/src/agent/engines/claude.ts:40-60

Metadata Building

The Claude engine constructs comprehensive tool metadata:

// Signature sketch (body elided); call sites pass pending.toolName, pending.toolId, and the parsed input
declare function buildToolMetadata(args: {
  name: string;
  id: string;
  input: Record<string, unknown>;
}): Record<string, unknown>;

Sources: app/native-server/src/agent/engines/claude.ts:100-130

Codex Engine

Overview

The Codex Engine (codex.ts) integrates with OpenAI's Codex API, providing specialized handling for code-related tasks and GitHub integration.

Core Features

Tool Message Dispatch

The engine implements a sophisticated tool dispatching system:

const dispatchToolMessage = (
  content: string,
  metadata: Record<string, unknown>,
  type: string,
  isUpdate: boolean
) => {
  // Dispatches tool execution results with metadata
};

Sources: app/native-server/src/agent/engines/codex.ts:30-60

Todo List Management

The Codex engine includes specialized support for task tracking:

graph LR
    A[Agent Response] --> B[emitTodoListUpdate]
    B --> C{Phase}
    C -->|started| D[Tool Use Event]
    C -->|update| D
    C -->|completed| E[Tool Result Event]
    
    F[Raw Items] --> G[normalizeTodoListItems]
    G --> H[buildTodoListContent]

Item Event Types

Event Type	Handler	Output
`command_execution`	`emitCommandStart`	Command start message
`todo_list`	`emitTodoListUpdate`	Todo list state
`agent_message`	Text extraction	Text content

Sources: app/native-server/src/agent/engines/codex.ts:60-100

Command Execution Tracking

// Excerpt: a method on the Codex engine class, hence the use of `this`
private emitCommandStart(record: Record<string, unknown>): void {
  const command = this.pickFirstString(record.command);
  const description = this.pickFirstString(record.description);
  // Emits command start event for UI tracking
}

Sources: app/native-server/src/agent/engines/codex.ts:40-50

Tool Bridge Integration

Purpose

The Tool Bridge (tool-bridge.ts) acts as the intermediary between agent engines and actual tool implementations, providing:

Tool Routing: Directs tool calls to appropriate handlers
Parameter Transformation: Converts engine-specific formats to tool formats
Result Formatting: Standardizes tool responses for engines

Tool Execution Flow

graph LR
    A[Engine Tool Call] --> B[Tool Bridge]
    B --> C{toolId match?}
    C -->|Yes| D[Execute Tool]
    C -->|No| E[Error Handler]
    D --> F[Format Result]
    E --> G[Error Response]
    F --> H[Return to Engine]
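
The routing decision in the diagram can be sketched with a handler map. All names here (`ToolHandler`, `ToolBridge`, `register`, `execute`) are hypothetical and do not mirror `tool-bridge.ts`; handlers are synchronous only to keep the example short.

```typescript
type ToolHandler = (input: Record<string, unknown>) => string;

interface ToolResponse {
  isError: boolean;
  content: string;
}

class ToolBridge {
  private handlers = new Map<string, ToolHandler>();

  register(toolId: string, handler: ToolHandler): void {
    this.handlers.set(toolId, handler);
  }

  execute(toolId: string, input: Record<string, unknown>): ToolResponse {
    const handler = this.handlers.get(toolId);
    if (!handler) {
      // No match: take the error branch of the diagram.
      return { isError: true, content: `Error: unknown tool "${toolId}"` };
    }
    try {
      return { isError: false, content: handler(input) };
    } catch (e) {
      // A throwing handler is also reported as a formatted error response.
      return { isError: true, content: `Error: ${e instanceof Error ? e.message : String(e)}` };
    }
  }
}

const bridge = new ToolBridge();
bridge.register('echo', (input) => String(input.text ?? ''));
```

Prefixing failures with `Error:` keeps the response compatible with the content-prefix error check described elsewhere in this manual.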

Chat Service Coordination

Service Layer

The Chat Service (chat-service.ts) orchestrates the interaction between user interfaces and agent engines:

graph TD
    A[User Request] --> B[Chat Service]
    B --> C[Select Engine]
    C --> D[Claude or Codex]
    D --> E[Stream Events]
    E --> F[Tool Bridge]
    F --> G[Execute Tools]
    G --> H[Return Results]
    H --> E
    E --> I[User Response]

Message Handling

All engines communicate through a standardized message format defined in agent-types.ts:

Field	Type	Description
`content`	string	Message text content
`metadata`	object	Engine-specific metadata
`type`	string	Message type (tool_use, tool_result, etc.)
`isUpdate`	boolean	Whether this is an incremental update

Sources: packages/shared/src/agent-types.ts:1-50
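
Expressed as a TypeScript type, the table corresponds roughly to the following. The `AgentToolMessage` name and the `isAgentToolMessage` guard are illustrative; the authoritative definitions live in `agent-types.ts`.

```typescript
interface AgentToolMessage {
  content: string;
  metadata: Record<string, unknown>;
  type: string; // e.g. 'tool_use' or 'tool_result'
  isUpdate: boolean;
}

// Runtime guard for values arriving from an untyped channel.
function isAgentToolMessage(value: unknown): value is AgentToolMessage {
  if (typeof value !== 'object' || value === null) return false;
  const v = value as Record<string, unknown>;
  return (
    typeof v.content === 'string' &&
    typeof v.metadata === 'object' &&
    v.metadata !== null &&
    typeof v.type === 'string' &&
    typeof v.isUpdate === 'boolean'
  );
}
```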

Common Patterns

Tool Input Accumulation

Both engines implement a similar pattern for handling streaming tool inputs:

Start: Initialize input accumulator on content_block_start
Delta: Append JSON chunks on content_block_delta
Stop: Parse complete JSON on content_block_stop

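// Sketch of the accumulation pattern (names are illustrative)
interface PendingInput {
  toolName: string;
  toolId: string;
  inputJsonParts: string[];
}

const pendingToolInputs = new Map<number, PendingInput>();

function handleContentBlockStart(index: number, toolName: string, toolId: string): void {
  pendingToolInputs.set(index, { toolName, toolId, inputJsonParts: [] });
}

function handleContentBlockDelta(index: number, chunk: string): void {
  // Deltas for blocks that never started a tool input are ignored
  pendingToolInputs.get(index)?.inputJsonParts.push(chunk);
}

function handleContentBlockStop(index: number): Record<string, unknown> {
  const pending = pendingToolInputs.get(index);
  pendingToolInputs.delete(index); // release the accumulator once the block ends
  const json = pending?.inputJsonParts.join('') ?? '';
  try {
    return json ? JSON.parse(json) : {};
  } catch {
    return {}; // malformed input falls back to an empty object
  }
}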

Error Detection Strategy

Engines use a layered error detection approach:

Layer	Check	Action
1	`is_error` flag	Immediate error
2	`isError` flag	Error from metadata
3	Content prefix	Parse "Error:" prefix

Severity Classification

Tool execution results are classified by severity:

Severity	Trigger	Visual Indicator
`error`	Error detected	Red highlight
`success`	Tool completed	Green checkmark
`info`	In progress	Neutral/informational

Configuration

Engine Selection

Engines are selected based on:

User preference/settings
Task requirements (code vs. general)
API availability

Environment Variables

Variable	Engine	Purpose
`ANTHROPIC_API_KEY`	Claude	Authentication
`OPENAI_API_KEY`	Codex	Authentication
`API_BASE_URL`	Both	Endpoint configuration
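
The selection criteria and the environment variables above suggest logic along these lines. This is a sketch only: `selectEngine` is a hypothetical helper, and the rule that a preferred engine is skipped when its key is missing is an assumption, not documented behavior.

```typescript
type EngineName = 'claude' | 'codex';

const KEY_BY_ENGINE: Record<EngineName, string> = {
  claude: 'ANTHROPIC_API_KEY',
  codex: 'OPENAI_API_KEY',
};

// Returns the preferred engine if its API key is set, else the first engine that has one.
function selectEngine(
  env: Record<string, string | undefined>,
  preferred?: EngineName,
): EngineName | null {
  const hasKey = (engine: EngineName): boolean => Boolean(env[KEY_BY_ENGINE[engine]]);
  if (preferred && hasKey(preferred)) return preferred;
  const order: EngineName[] = ['claude', 'codex'];
  return order.find(hasKey) ?? null;
}
```

In Node, `process.env` can be passed directly as the `env` argument.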

Extension Points

Adding New Engines

To add a new engine:

Create engine file in engines/ directory
Implement common interface methods
Register in engine factory/registry
Add tool handlers in Tool Bridge

Custom Tool Handlers

Tool handlers can be extended by:

Implementing handler in tool-bridge.ts
Registering handler in tool registry
Adding type definitions in agent-types.ts

Best Practices

Error Handling

Always check multiple error indicators
Provide meaningful error messages
Log errors with context for debugging

Streaming

Handle partial content gracefully
Accumulate tool inputs properly
Update UI incrementally

Resource Management

Clean up pending inputs after processing
Limit retry attempts for failed calls
Monitor API rate limits

Sources: app/native-server/src/agent/engines/claude.ts:1-50

Storage and Data Management

Related topics: Chrome Extension Structure, Record and Replay Engine

Overview

The mcp-chrome project implements a dual-layer storage architecture that separates concerns between the Chrome Extension (client-side) and the Native Server (server-side). This design enables the extension to operate independently for recording, replaying, and managing browser-related data while delegating persistent storage of agent sessions and workflows to the native backend.

Sources: app/native-server/src/agent/db/client.ts:1-50

The storage system handles several key data domains:

Data Domain	Storage Layer	Primary Use
Recording Flows	IndexedDB	Browser session recording and replay
Semantic Index	chrome.storage.local	Content indexing and similarity search
Agent Sessions	SQLite (better-sqlite3)	Persistent agent workflow management
Model State	chrome.storage.local	ML model download and status tracking
Tab Content	IndexedDB	Web page content caching

Sources: app/chrome-extension/entrypoints/background/record-replay/storage/flows.ts:1-30, app/native-server/src/agent/session-service.ts:1-80

Architecture Overview

graph TD
    subgraph ChromeExtension["Chrome Extension"]
        A[IndexedDB] -->|Flow Recording| B[indexeddb-manager.ts]
        C[chrome.storage.local] -->|Settings & State| D[Background Scripts]
        E[Semantic Engine] -->|Indexing| F[content-indexer.ts]
    end
    
    subgraph NativeServer["Native Server"]
        G[SQLite Database] -->|Sessions| H[db/client.ts]
        I[Schema Definitions] -->|Tables| G
        J[Session Service] -->|CRUD| H
    end
    
    K[MCP Protocol] -->|Communication| L[Native Messaging]
    
    B -->|Storage Ops| A
    F -->|Cache| C
    H -->|Sync Queries| G
    J -->|Session Mgmt| H

Sources: app/native-server/src/agent/db/schema.ts:1-100

Chrome Extension Storage

IndexedDB Manager

The IndexedDB manager (indexeddb-manager.ts) provides structured access to browser-native storage for recording flows and session data.

Core Operations

Method	Purpose
`initialize()`	Open/create IndexedDB database
`getFlow(id)`	Retrieve a recording flow by ID
`getAllFlows()`	List all stored flows
`saveFlow(flow)`	Persist a new or updated flow
`deleteFlow(id)`	Remove a flow from storage
`clearAll()`	Wipe all stored data

Sources: app/chrome-extension/entrypoints/background/record-replay/storage/indexeddb-manager.ts:1-150

Database Schema

interface FlowRecord {
  id: string;
  name: string;
  createdAt: number;
  updatedAt: number;
  steps: FlowStep[];
  triggers?: TriggerConfig[];
  args?: Record<string, unknown>;
}

The IndexedDB schema supports:

Flows: Complete recording sessions with metadata
Steps: Individual navigation and interaction steps
Triggers: DOM-based or URL-based automation triggers
Args: Configuration parameters for flow execution

Sources: app/chrome-extension/entrypoints/background/record-replay/storage/flows.ts:1-50

Flow Storage Module

The flows.ts module wraps the IndexedDB manager with higher-level operations for flow management.

graph LR
    A[Recording Start] --> B[Create Flow Record]
    B --> C[Capture Events]
    C --> D[Append Steps]
    D --> E[Trigger Detection]
    E --> F[Save to IndexedDB]
    F --> G[Recording Complete]

Flow Lifecycle

Creation: Initialize a new flow with timestamp and metadata
Capture: Record DOM events, network requests, and navigation
Trigger Binding: Associate DOM selectors or URL patterns
Persistence: Batch write to IndexedDB on completion or intervals
Retrieval: Load flows for replay with full state reconstruction

Sources: app/chrome-extension/entrypoints/background/record-replay/storage/flows.ts:50-150

chrome.storage.local Usage

The extension uses Chrome's storage.local API for lightweight key-value state. The API is asynchronous: reads and writes return promises (or take callbacks).

// Model state tracking (stored under the modelState key)
interface ModelState {
  status: string;
  downloadProgress: number;
  isDownloading: boolean;
  lastUpdated: number;
  errorMessage: string;
  errorType: 'network' | 'file' | 'unknown';
}

// Semantic engine state (stored under the semanticEngineState key)
interface SemanticEngineState {
  isReady: boolean;
  isInitializing: boolean;
  modelPath?: string;
}

Sources: app/chrome-extension/entrypoints/background/semantic-similarity.ts:1-80

Storage Keys

Key	Type	Purpose
`modelState`	Object	ML model download status
`semanticEngineState`	Object	Content indexing engine status
`RR_TRIGGERS`	Array	Record-replay DOM triggers
`contentIndex`	Object	Cached page content metadata

Sources: app/chrome-extension/utils/content-indexer.ts:1-60

Native Server Database

Database Client

The native server uses better-sqlite3 for synchronous, high-performance SQLite access.

import Database from 'better-sqlite3';

class AgentDatabase {
  private db: Database.Database;
  
  constructor(dbPath: string);
  initialize(): void;
  close(): void;
  // ... query methods
}

Sources: app/native-server/src/agent/db/client.ts:1-80

Configuration

Parameter	Default	Description
`dbPath`	`~/.claude/mcp-chrome.db`	SQLite database file path
WAL Mode	Enabled	Write-Ahead Logging for concurrency
Foreign Keys	Enabled	Referential integrity enforcement

Sources: app/native-server/src/agent/db/client.ts:80-120

Database Schema

erDiagram
    SESSIONS {
        string id PK
        string name
        string description
        string engine
        timestamp created_at
        timestamp updated_at
        json metadata
    }
    
    MESSAGES {
        string id PK
        string session_id FK
        string role
        text content
        timestamp created_at
        json attachments
        json raw
    }
    
    MESSAGES }o--|| SESSIONS : belongs_to

Sessions Table

Column	Type	Constraints	Description
`id`	TEXT	PRIMARY KEY	Unique session identifier
`name`	TEXT	NOT NULL	Display name for session
`description`	TEXT		Session description
`engine`	TEXT		AI engine (claude, codex, etc.)
`created_at`	INTEGER	NOT NULL	Unix timestamp
`updated_at`	INTEGER	NOT NULL	Last modification time
`metadata`	TEXT		JSON-encoded metadata

Sources: app/native-server/src/agent/db/schema.ts:1-100

Messages Table

Column	Type	Constraints	Description
`id`	TEXT	PRIMARY KEY	Message UUID
`session_id`	TEXT	FOREIGN KEY	Parent session reference
`role`	TEXT	NOT NULL	Message role (user/assistant/system)
`content`	TEXT		Message text content
`created_at`	INTEGER	NOT NULL	Timestamp
`attachments`	TEXT		JSON array of attachments
`raw`	TEXT		Raw metadata JSON

Sources: app/native-server/src/agent/db/schema.ts:100-200
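
The two tables and the configuration above could be created with DDL along these lines. This is a sketch derived from the column tables, not the project's actual schema file. `SqliteLike` is a hypothetical minimal interface that a better-sqlite3 `Database` instance satisfies through its `pragma` and `exec` methods, which lets the function be exercised without a real database.

```typescript
interface SqliteLike {
  pragma(source: string): unknown;
  exec(sql: string): unknown;
}

const SCHEMA_SQL = `
CREATE TABLE IF NOT EXISTS sessions (
  id TEXT PRIMARY KEY,
  name TEXT NOT NULL,
  description TEXT,
  engine TEXT,
  created_at INTEGER NOT NULL,
  updated_at INTEGER NOT NULL,
  metadata TEXT
);
CREATE TABLE IF NOT EXISTS messages (
  id TEXT PRIMARY KEY,
  session_id TEXT REFERENCES sessions(id),
  role TEXT NOT NULL,
  content TEXT,
  created_at INTEGER NOT NULL,
  attachments TEXT,
  raw TEXT
);
`;

function initializeSchema(db: SqliteLike): void {
  db.pragma('journal_mode = WAL'); // write-ahead logging
  db.pragma('foreign_keys = ON'); // SQLite leaves foreign-key enforcement off by default
  db.exec(SCHEMA_SQL);
}
```

With better-sqlite3 installed, `initializeSchema(new Database(dbPath))` would apply the same statements to a real file.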

Session Service

The session service provides the high-level interface for session and message management:

export interface ManagementInfo {
  models?: Array<{ value: string; displayName: string; description: string }>;
  commands?: Array<{ name: string; description: string; argumentHint: string }>;
  account?: { email?: string; organization?: string; subscriptionType?: string };
  mcpServers?: Array<{ name: string; status: string }>;
  tools?: string[];
  agents?: string[];
  plugins?: Array<{ name: string; path?: string }>;
  skills?: string[];
  slashCommands?: string[];
  model?: string;
  permissionMode?: string;
  cwd?: string;
  outputStyle?: string;
  betas?: string[];
  claudeCodeVersion?: string;
  apiKeySource?: string;
  lastUpdated?: string;
}

Sources: app/native-server/src/agent/session-service.ts:30-70

Session Preview Metadata

export interface AgentSessionPreviewMeta {
  displayText?: string;  // Compact display text
  clientMeta?: {
    kind?: 'web_editor_apply_batch' | 'web_editor_apply_single';
    elementCount?: number;
  };
}

This metadata enables special UI rendering for web editor apply operations.

Sources: app/native-server/src/agent/session-service.ts:75-90

Content Indexing System

ContentIndexer Class

The content indexer manages semantic search capabilities for browser tab content:

class ContentIndexer {
  private options: IndexerOptions;
  private semanticEngine: SemanticEngine;
  
  async indexTabContent(tabId: number): Promise<void>;
  async removeTabIndex(tabId: number): Promise<void>;
  private shouldIndexUrl(url: string): boolean;
  private async extractTabContent(tabId: number): Promise<ContentResult>;
}

Sources: app/chrome-extension/utils/content-indexer.ts:20-80

URL Filtering

Certain URLs are automatically excluded from indexing:

Pattern	Reason
`chrome://*`	Chrome internal pages
`chrome-extension://*`	Extension pages
`edge://*`	Edge internal pages
`about:*`	Browser about pages
`moz-extension://*`	Firefox extension pages
`file://*`	Local file system

Sources: app/chrome-extension/utils/content-indexer.ts:50-70
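
A prefix-based filter is one straightforward way to apply these exclusions. The sketch below is not the body of `shouldIndexUrl` from `content-indexer.ts`; the `EXCLUDED_PREFIXES` constant is a name introduced here, and case-insensitive matching is an assumption.

```typescript
const EXCLUDED_PREFIXES = [
  'chrome://',
  'chrome-extension://',
  'edge://',
  'about:',
  'moz-extension://',
  'file://',
];

// Returns false for empty URLs and for any URL starting with an excluded scheme.
function shouldIndexUrl(url: string): boolean {
  const normalized = url.trim().toLowerCase();
  if (normalized === '') return false;
  return !EXCLUDED_PREFIXES.some((prefix) => normalized.startsWith(prefix));
}
```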

Auto-Indexing Behavior

graph TD
    A[Tab Load Complete] --> B{URL Valid?}
    B -->|No| C[Skip Indexing]
    B -->|Yes| D{Engine Ready?}
    D -->|No| E[Wait 2 seconds]
    E --> D
    D -->|Yes| F[Execute Fetcher Script]
    F --> G[Extract Page Content]
    G --> H[Update Semantic Index]
    H --> I[Store in IndexedDB]

The auto-indexing is triggered with a 2-second delay to allow dynamic content to load.

Sources: app/chrome-extension/utils/content-indexer.ts:10-40

Data Flow Summary

Recording Flow

sequenceDiagram
    participant U as User
    participant E as Extension
    participant IDB as IndexedDB
    participant NS as Native Server
    
    U->>E: Start Recording
    E->>IDB: Create Flow Record
    loop User Actions
        E->>E: Capture Event
        E->>IDB: Append Step
    end
    U->>E: Stop Recording
    E->>IDB: Finalize Flow
    E->>NS: Sync Flow Metadata

Replay Flow

sequenceDiagram
    participant T as Trigger
    participant E as Extension
    participant IDB as IndexedDB
    participant B as Browser
    
    T->>E: Trigger Matched
    E->>IDB: Load Flow
    loop Flow Steps
        E->>E: Process Step
        E->>B: Execute Action
    end

Error Handling

Each storage layer applies its own error strategy:

Layer	Error Strategy
IndexedDB	Graceful degradation, console logging
chrome.storage	Fallback messaging to background script
SQLite	Transaction rollback, WAL recovery
Network sync	Retry with exponential backoff

Sources: app/chrome-extension/entrypoints/offscreen/main.ts:40-80
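
Retry with exponential backoff, listed above for network sync, usually doubles the delay after each failure up to a cap. The sketch below shows such a delay schedule; the base delay, the cap, and the `backoffDelay` name are assumptions, since this section does not specify the project's values.

```typescript
// Delay before retry number `attempt` (0-based), doubling each time up to maxMs.
function backoffDelay(attempt: number, baseMs = 500, maxMs = 8000): number {
  return Math.min(maxMs, baseMs * 2 ** attempt);
}

const schedule = [0, 1, 2, 3, 4, 5].map((n) => backoffDelay(n));
// schedule: [500, 1000, 2000, 4000, 8000, 8000]
```

Production implementations often add random jitter to each delay so that many clients do not retry in lockstep.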

Best Practices

Batch Writes: Group multiple IndexedDB operations to reduce I/O
Index Maintenance: Periodically clean stale tab indices
Transaction Scope: Keep SQLite transactions short to prevent locks
Memory Management: Release DOM references after content extraction
Storage Quotas: Monitor chrome.storage.local usage for extension limits

Sources: app/native-server/src/agent/db/client.ts:1-50

Doramagic Pitfall Log

Source-linked risks stay visible on the manual page so the preview does not read like a recommendation.

high Bug: Singleton McpServer causes 'Already connected to a transport' on HTTP endpoint

First-time setup may fail or require extra isolation and rollback planning.

high Opencode cannot use chrome-mcp via mcp-server-stdio on Windows (Failed to connect to MCP server)

First-time setup may fail or require extra isolation and rollback planning.

medium Codex CLI setup guide still points to ~/.codex/config.json and a port-only flow Codex does not pick up

First-time setup may fail or require extra isolation and rollback planning.

medium [Feature] Profile-aware bridge — multi-Chrome-profile support without port collision

First-time setup may fail or require extra isolation and rollback planning.

Doramagic extracted 16 source-linked risk signals. Review them before installing or handing real data to the project.

1. Installation risk: Bug: Singleton McpServer causes 'Already connected to a transport' on HTTP endpoint

Severity: high
Finding: Installation risk is backed by a source signal: Bug: Singleton McpServer causes 'Already connected to a transport' on HTTP endpoint. Treat it as a review item until the current version is checked.
User impact: First-time setup may fail or require extra isolation and rollback planning.
Recommended check: Open the linked source, confirm whether it still applies to the current version, and keep the first run isolated.
Evidence: Source-linked evidence: https://github.com/hangwin/mcp-chrome/issues/321

2. Installation risk: Opencode cannot use chrome-mcp via mcp-server-stdio on Windows (Failed to connect to MCP server)

Severity: high
Finding: Installation risk is backed by a source signal: Opencode cannot use chrome-mcp via mcp-server-stdio on Windows (Failed to connect to MCP server). Treat it as a review item until the current version is checked.
User impact: First-time setup may fail or require extra isolation and rollback planning.
Recommended check: Open the linked source, confirm whether it still applies to the current version, and keep the first run isolated.
Evidence: Source-linked evidence: https://github.com/hangwin/mcp-chrome/issues/319

3. Installation risk: Codex CLI setup guide still points to ~/.codex/config.json and a port-only flow Codex does not pick up

Severity: medium
Finding: Installation risk is backed by a source signal: Codex CLI setup guide still points to ~/.codex/config.json and a port-only flow Codex does not pick up. Treat it as a review item until the current version is checked.
User impact: First-time setup may fail or require extra isolation and rollback planning.
Recommended check: Open the linked source, confirm whether it still applies to the current version, and keep the first run isolated.
Evidence: Source-linked evidence: https://github.com/hangwin/mcp-chrome/issues/339

4. Installation risk: [Feature] Profile-aware bridge — multi-Chrome-profile support without port collision

Severity: medium
Finding: Installation risk is backed by a source signal: [Feature] Profile-aware bridge — multi-Chrome-profile support without port collision. Treat it as a review item until the current version is checked.
User impact: First-time setup may fail or require extra isolation and rollback planning.
Recommended check: Open the linked source, confirm whether it still applies to the current version, and keep the first run isolated.
Evidence: Source-linked evidence: https://github.com/hangwin/mcp-chrome/issues/347

5. Installation risk: Installation risk needs validation

Severity: medium
Finding: Installation risk is backed by a source signal: Installation risk needs validation. Treat it as a review item until the current version is checked.
User impact: First-time setup may fail or require extra isolation and rollback planning.
Recommended check: Open the linked source, confirm whether it still applies to the current version, and keep the first run isolated.
Evidence: Source-linked evidence: https://github.com/hangwin/mcp-chrome/issues/333

6. Installation risk: 🐛 Status indicator stuck on yellow - Service shows as "not started" after connection

Severity: medium
Finding: Installation risk is backed by a source signal: 🐛 Status indicator stuck on yellow - Service shows as "not started" after connection. Treat it as a review item until the current version is checked.
User impact: First-time setup may fail or require extra isolation and rollback planning.
Recommended check: Open the linked source, confirm whether it still applies to the current version, and keep the first run isolated.
Evidence: Source-linked evidence: https://github.com/hangwin/mcp-chrome/issues/342

7. Configuration risk: Configuration risk needs validation

Severity: medium
Finding: Configuration risk is backed by a source signal: Configuration risk needs validation. Treat it as a review item until the current version is checked.
User impact: Users may get misleading failures or incomplete behavior unless configuration is checked carefully.
Recommended check: Open the linked source, confirm whether it still applies to the current version, and keep the first run isolated.
Evidence: capability.host_targets | github_repo:998796026 | https://github.com/hangwin/mcp-chrome | host_targets=mcp_host, claude

8. Capability assumption: README/documentation is current enough for a first validation pass.

Severity: medium
Finding: README/documentation is current enough for a first validation pass.
User impact: The project should not be treated as fully validated until this signal is reviewed.
Recommended check: Open the linked source, confirm whether it still applies to the current version, and keep the first run isolated.
Evidence: capability.assumptions | github_repo:998796026 | https://github.com/hangwin/mcp-chrome | README/documentation is current enough for a first validation pass.

9. Project risk: v0.0.6

Severity: medium
Finding: Project risk is backed by a source signal: v0.0.6. Treat it as a review item until the current version is checked.
User impact: The project should not be treated as fully validated until this signal is reviewed.
Recommended check: Open the linked source, confirm whether it still applies to the current version, and keep the first run isolated.
Evidence: Source-linked evidence: https://github.com/hangwin/mcp-chrome/releases/tag/v0.0.6

10. Maintenance risk: [Feature/Bug] Multi-client MCP support — second Claude Code session kills first via shared MCP Server singleton

Severity: medium
Finding: Maintenance risk is backed by a source signal: [Feature/Bug] Multi-client MCP support — second Claude Code session kills first via shared MCP Server singleton. Treat it as a review item until the current version is checked.
User impact: Users cannot judge support quality until recent activity, releases, and issue response are checked.
Recommended check: Open the linked source, confirm whether it still applies to the current version, and keep the first run isolated.
Evidence: Source-linked evidence: https://github.com/hangwin/mcp-chrome/issues/345

11. Maintenance risk: Maintainer activity is unknown

Severity: medium
Finding: Maintenance risk is backed by a source signal: Maintainer activity is unknown. Treat it as a review item until the current version is checked.
User impact: Users cannot judge support quality until recent activity, releases, and issue response are checked.
Recommended check: Open the linked source, confirm whether it still applies to the current version, and keep the first run isolated.
Evidence: evidence.maintainer_signals | github_repo:998796026 | https://github.com/hangwin/mcp-chrome | last_activity_observed missing

12. Security or permission risk: no_demo

Severity: medium
Finding: no_demo
User impact: The project may affect permissions, credentials, data exposure, or host boundaries.
Recommended check: Open the linked source, confirm whether it still applies to the current version, and keep the first run isolated.
Evidence: downstream_validation.risk_items | github_repo:998796026 | https://github.com/hangwin/mcp-chrome | no_demo; severity=medium

Source: Doramagic discovery, validation, and Project Pack records

Community Discussion Evidence

These external discussion links are review inputs, not standalone proof that the project is production-ready.

Sources: 10 project-level external discussion links are exposed on this manual page.

Review before install: open the linked issues or discussions before treating the pack as ready for your environment.

Doramagic exposes project-level community discussion separately from official documentation. Review these links before using mcp-chrome with real data or production workflows.

Bug: Singleton McpServer causes 'Already connected to a transport' on HT - github / github_issue
[[Feature] Profile-aware bridge — multi-Chrome-profile support without po](https://github.com/hangwin/mcp-chrome/issues/347) - github / github_issue
[[Feature/Bug] Multi-client MCP support — second Claude Code session kill](https://github.com/hangwin/mcp-chrome/issues/345) - github / github_issue
🐛 Status indicator stuck on yellow - Service shows as "not started" afte - github / github_issue
Opencode cannot use chrome-mcp via mcp-server-stdio on Windows (Failed t - github / github_issue
Codex CLI setup guide still points to ~/.codex/config.json and a port-on - github / github_issue
Community source 7 - github / github_issue
issue - github / github_issue
v0.0.6 - github / github_release
Configuration risk needs validation - GitHub / issue

Source: Project Pack community evidence and pitfall evidence
