Doramagic Project Pack · Human Manual

mobile-mcp

Related topics: Installation and Configuration, System Architecture, MCP Tools Reference

Overview

Mobile MCP (Model Context Protocol) is an open-source project that enables AI assistants to interact with mobile devices through the MCP protocol. It provides a bridge between AI agents and mobile device automation, supporting both iOS and Android platforms including simulators, emulators, and physical devices.

Sources: README.md

Project Purpose

The project serves as an MCP server implementation that allows AI assistants to:

  • Automate native iOS and Android applications for testing or data entry scenarios
  • Execute scripted flows and form interactions without manual device control
  • Automate multi-step user journeys driven by large language models
  • Enable general-purpose mobile application interaction for agent-based frameworks
  • Facilitate agent-to-agent communication for mobile automation use cases and data extraction

Sources: README.md

Core Architecture

Mobile MCP follows a modular architecture that separates platform-specific implementations from the core server logic.

graph TD
    subgraph "MCP Clients"
        C1[Claude Desktop]
        C2[Cursor]
        C3[VS Code]
        C4[Codex]
        C5[Other MCP Clients]
    end
    
    subgraph "Mobile MCP Server"
        Server[Core Server<br/>src/server.ts]
        Tools[MCP Tools Layer]
        Robot[Robot Interface<br/>src/robot.ts]
    end
    
    subgraph "Platform Implementations"
        iOS_WDA[WebDriver Agent<br/>src/webdriver-agent.ts]
        iOS_Sim[iOS Simulator<br/>src/iphone-simulator.ts]
        Android[Android Device Manager]
    end
    
    subgraph "Target Devices"
        iOS_Real[iOS Physical Device]
        Android_Real[Android Physical Device]
        Simulators[iOS Simulators]
        Emulators[Android Emulators]
    end
    
    C1 & C2 & C3 & C4 & C5 --> Server
    Server --> Tools
    Tools --> Robot
    Robot --> iOS_WDA & iOS_Sim & Android
    iOS_WDA --> iOS_Real & Simulators
    iOS_Sim --> Simulators
    Android --> Android_Real & Emulators

Key Components

| Component | File | Purpose |
| --- | --- | --- |
| Core Server | src/server.ts | MCP protocol implementation, device management, tool routing |
| Robot Interface | src/robot.ts | Abstract interface defining device interaction methods |
| WebDriver Agent | src/webdriver-agent.ts | iOS real device and simulator automation via WebDriverAgent |
| iOS Simulator | src/iphone-simulator.ts | Direct iOS simulator control using simctl |
| Android Manager | (platform-specific) | Android device automation |

Sources: src/server.ts:1-30

Device Detection Flow

The server implements a device detection mechanism that identifies the platform type when a device ID is provided.

graph TD
    Start[Get Device ID] --> Check_iOS{iOS Device?}
    Check_iOS -->|Yes| Return_iOS[Return IosRobot]
    Check_iOS -->|No| Check_Android{Android Device?}
    Check_Android -->|Yes| Return_Android[Return AndroidRobot]
    Check_Android -->|No| Check_Simulator{iOS Simulator?}
    Check_Simulator -->|Yes| Return_Simulator[Return Simulator Robot]
    Check_Simulator -->|No| Error[Throw Error]
    
    style Return_iOS fill:#90EE90
    style Return_Android fill:#90EE90
    style Return_Simulator fill:#90EE90
    style Error fill:#FFB6C1

Sources: src/server.ts:20-45
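
The flow above can be sketched as a pure lookup. This is a minimal illustration and not the code in src/server.ts: the `DeviceInventory` shape and the `resolveRobotKind` name are hypothetical, and the real server returns robot instances rather than string labels.

```typescript
// Hypothetical labels standing in for the robot classes the server returns.
type RobotKind = "ios" | "android" | "ios-simulator";

// Hypothetical snapshot of the device IDs known to each platform manager.
interface DeviceInventory {
    iosDevices: string[];
    androidDevices: string[];
    iosSimulators: string[];
}

// Checks the platforms in the same order as the diagram:
// iOS device, then Android device, then iOS simulator.
function resolveRobotKind(deviceId: string, inventory: DeviceInventory): RobotKind {
    const has = (ids: string[]): boolean => ids.indexOf(deviceId) !== -1;

    if (has(inventory.iosDevices)) {
        return "ios";
    }
    if (has(inventory.androidDevices)) {
        return "android";
    }
    if (has(inventory.iosSimulators)) {
        return "ios-simulator";
    }
    // No platform recognises the ID, which corresponds to the error branch.
    throw new Error(`Device "${deviceId}" not found`);
}
```

Because the checks run in a fixed order, an ID that appeared in more than one list would resolve to the first matching platform.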

Available Tools

Mobile MCP exposes the following tool categories through the MCP protocol:

Device Management

| Tool | Description |
| --- | --- |
| mobile_list_available_devices | List all available devices (simulators, emulators, and real devices) |
| mobile_get_screen_size | Get screen dimensions in pixels |
| mobile_get_orientation | Get current screen orientation |
| mobile_set_orientation | Change screen orientation (portrait/landscape) |

App Management

| Tool | Description |
| --- | --- |
| mobile_list_apps | List all installed apps on the device |
| mobile_launch_app | Launch an app using package name |
| mobile_terminate_app | Stop and terminate a running app |
| mobile_install_app | Install an app from file (.apk, .ipa, .app, .zip) |
| mobile_uninstall_app | Uninstall an app using bundle ID or package name |

Screen Interaction

| Tool | Description |
| --- | --- |
| mobile_take_screenshot | Capture screenshot to understand screen content |
| mobile_save_screenshot | Save screenshot to a file |
| mobile_list_elements_on_screen | List UI elements with coordinates and properties |
| mobile_click_on_screen_at_coordinates | Click at specific x,y coordinates |
| mobile_double_tap_on_screen | Double-tap at specific coordinates |
| mobile_long_press_on_screen_at_coordinates | Long press at specific coordinates |
| mobile_swipe_on_screen | Swipe in any direction (up, down, left, right) |

Input & Navigation

| Tool | Description |
| --- | --- |
| mobile_type_keys | Type text into focused elements with optional submit |
| mobile_press_button | Press device buttons (home, back, etc.) |

Sources: README.md

Robot Interface

The Robot interface defines the contract for all platform implementations:

interface Robot {
    // Screen operations
    getScreenshot(): Promise<Buffer>;
    getScreenSize(): Promise<ScreenSize>;
    getOrientation(): Promise<Orientation>;
    setOrientation(orientation: Orientation): Promise<void>;
    
    // Element operations
    getElementsOnScreen(): Promise<ScreenElement[]>;
    
    // Touch operations
    tap(x: number, y: number): Promise<void>;
    doubleTap(x: number, y: number): Promise<void>;
    longPress(x: number, y: number, duration: number): Promise<void>;
    swipeFromCoordinate(x: number, y: number, direction: SwipeDirection, distance?: number): Promise<void>;
    
    // Text and keys
    sendKeys(text: string): Promise<void>;
    pressButton(button: Button): Promise<void>;
    
    // App management
    listApps(): Promise<InstalledApp[]>;
    launchApp(packageName: string, locale?: string): Promise<void>;
    terminateApp(packageName: string): Promise<void>;
    installApp(path: string): Promise<void>;
    uninstallApp(bundleId: string): Promise<void>;
    
    // URL handling
    openUrl(url: string): Promise<void>;
}

Sources: src/robot.ts
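
To illustrate what a call such as `swipeFromCoordinate(x, y, direction, distance?)` has to work out, the sketch below computes the end point of a swipe. It is a hedged example, not the project's implementation: the default distance of 30% of the screen, the clamping to screen bounds, and the convention that the direction names the way the finger moves are all assumptions, and `swipeEndPoint` is a hypothetical helper.

```typescript
type SwipeDirection = "up" | "down" | "left" | "right";

interface Point { x: number; y: number; }
interface ScreenSize { width: number; height: number; }

// Assumed default: swipe 30% of the screen dimension along the swipe axis.
const DEFAULT_DISTANCE_RATIO = 0.3;

function clamp(value: number, min: number, max: number): number {
    return Math.min(Math.max(value, min), max);
}

// Assumption: the direction names the way the finger moves, so "up" decreases y.
function swipeEndPoint(start: Point, direction: SwipeDirection, screen: ScreenSize, distance?: number): Point {
    const vertical = direction === "up" || direction === "down";
    const span = distance !== undefined
        ? distance
        : Math.round((vertical ? screen.height : screen.width) * DEFAULT_DISTANCE_RATIO);
    const sign = direction === "up" || direction === "left" ? -1 : 1;

    const x = vertical ? start.x : start.x + sign * span;
    const y = vertical ? start.y + sign * span : start.y;

    // Keep the end point on screen so the gesture stays valid near the edges.
    return {
        x: clamp(x, 0, screen.width - 1),
        y: clamp(y, 0, screen.height - 1),
    };
}
```

For a 1000 by 2000 screen, a default swipe up from (500, 1000) under these assumptions ends at (500, 400).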

Platform Support

iOS Platform

iOS automation is handled through two mechanisms:

  1. WebDriverAgent (src/webdriver-agent.ts): Used for real iOS devices and Xcode-managed simulators
     • Communicates via HTTP with WDA session
     • Filters elements by accepted types: TextField, Button, Switch, Icon, SearchField, StaticText, Image
     • Uses element visibility and accessibility properties for filtering
  2. iOS Simulator (src/iphone-simulator.ts): Direct simctl control for booted simulators
     • Handles .app bundle installation
     • Supports .zip file extraction with zip-slip vulnerability protection
     • Uses simctl command-line tool

Sources: src/webdriver-agent.ts:1-50

Android Platform

Android automation uses platform-specific managers to:

  • List connected devices
  • Query UI elements via accessibility tree
  • Execute touch and input operations
  • Manage app lifecycle

Communication Modes

STDIO Mode (Default)

The server communicates over standard input/output:

npx -y @mobilenext/mobile-mcp@latest

SSE Server Mode

For HTTP-based connections, the server can listen on a specified port:

npx @mobilenext/mobile-mcp@latest --listen 3000

Optional Bearer token authentication can be enabled:

MOBILEMCP_AUTH=my-secret-token npx @mobilenext/mobile-mcp@latest --listen 3000

Sources: README.md

Technology Stack

| Component | Technology | Version |
| --- | --- | --- |
| Runtime | Node.js | >=18 |
| MCP SDK | @modelcontextprotocol/sdk | 1.26.0 |
| HTTP Framework | express | 5.1.0 |
| CLI Framework | commander | 14.0.0 |
| Validation | zod | 4.1.13 |
| XML Parsing | fast-xml-parser | 5.5.7 |
| Native CLI | mobilecli | 0.3.70 (optional) |

Sources: package.json

Key Features

  • Fast and lightweight: Uses native accessibility trees for most interactions, or screenshot-based coordinates where accessibility labels are not available
  • LLM-friendly: No computer vision model is required when working from accessibility snapshots
  • Visual Sense: Evaluates and analyses what's actually rendered on screen
  • Cross-platform: Supports iOS, Android, simulators, emulators, and real devices
  • Standard protocol: Built on Model Context Protocol for seamless AI assistant integration

Sources: README.md

Version History

The project follows semantic versioning with active development:

| Version | Date | Key Changes |
| --- | --- | --- |
| 0.0.49 | 2026-03-24 | Path traversal fix in save screenshot and record video |
| 0.0.48 | 2026-03-20 | fast-xml-parser security updates, error handling fixes |
| 0.0.47 | 2026-03-09 | Zod coerce for number parameter parsing, locale support for iOS |
| 0.0.42 | 2026-02-03 | mobilecli upgrade, fast-xml-parser security update |
| 0.0.41 | 2026-01-27 | Android element filtering improvements |

Sources: CHANGELOG.md

Getting Started

Installation

Add to your MCP client configuration:

{
  "mcpServers": {
    "mobile-mcp": {
      "command": "npx",
      "args": ["-y", "@mobilenext/mobile-mcp@latest"]
    }
  }
}

Prerequisites

  • Node.js >= 18
  • For iOS: Xcode Command Line Tools, WebDriverAgent (for real devices)
  • For Android: Android SDK, ADB configured

Client Support

Mobile MCP is compatible with multiple AI coding assistants:

  • Claude Desktop
  • Claude Code
  • Cursor
  • VS Code
  • Codex
  • Copilot
  • Gemini CLI
  • Goose
  • Cline
  • Windsurf
  • Qodo Gen
  • Amp
  • Kiro
  • opencode

Sources: README.md

Installation and Configuration

Related topics: Overview, Prerequisites

Overview

Mobile MCP is a Model Context Protocol (MCP) server that enables mobile automation for iOS and Android devices. The server provides a standardized interface for AI assistants to interact with mobile devices through simulators, emulators, and real hardware.

This page covers the complete installation process, configuration options, and server deployment modes.

Prerequisites

System Requirements

| Requirement | Specification |
| --- | --- |
| Node.js | Version 18 or higher |
| Package Manager | npm, yarn, or pnpm |
| Mobile CLI Tools | mobilecli (auto-installed as optional dependency) |
| Platform Tools | Xcode Command Line Tools (iOS) / Android SDK (Android) |

The server requires Node.js 18+ as specified in package.json:

"engines": {
  "node": ">=18"
}

Sources: package.json:12
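
The `>=18` constraint can also be checked by hand against a version string. The helper below is a minimal sketch with hypothetical names (`parseMajorVersion`, `satisfiesMinimumNode`); it only compares the major version and does not implement full semver range matching.

```typescript
// Extracts the major version from strings such as "v18.19.0" or "20.11.1".
function parseMajorVersion(version: string): number | null {
    const match = /^v?(\d+)\./.exec(version.trim());
    return match ? parseInt(match[1], 10) : null;
}

// Returns true when the version's major component meets the minimum (18 by default).
function satisfiesMinimumNode(version: string, minimumMajor: number = 18): boolean {
    const major = parseMajorVersion(version);
    return major !== null && major >= minimumMajor;
}
```

In a Node.js process the running version is available as `process.version`, so `satisfiesMinimumNode(process.version)` would perform the check at startup.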

Required Mobile Tools

Mobile MCP depends on mobilecli for device communication. The server checks that mobilecli is available at startup:

const ensureMobilecliAvailable = (): void => {
    try {
        const version = mobilecli.getVersion();
        if (version.startsWith("failed")) {
            throw new Error("mobilecli version check failed");
        }
    } catch (error: any) {
        throw new ActionableError(`mobilecli is not available or not working properly...`);
    }
};

Sources: src/server.ts:1-20

Installation Methods

Standard NPM Installation

The recommended installation method uses npx to run the package directly:

npx -y @mobilenext/mobile-mcp@latest

This command downloads and executes the latest version without requiring local installation.

Local Installation

For development or customization, install locally:

npm install @mobilenext/mobile-mcp

The package provides a binary entry point:

"bin": {
  "mcp-server-mobile": "lib/index.js"
}

Sources: package.json:62-65

Building from Source

To build from source:

git clone https://github.com/mobile-next/mobile-mcp.git
cd mobile-mcp
npm install
npm run build

Build artifacts are output to the lib/ directory:

npm run build  # Compiles TypeScript and sets executable permissions
npm run watch  # Watch mode for development

Sources: package.json:22-30

Server Configuration

Standard MCP Configuration

The following JSON configuration works across most MCP clients:

{
  "mcpServers": {
    "mobile-mcp": {
      "command": "npx",
      "args": ["-y", "@mobilenext/mobile-mcp@latest"]
    }
  }
}

Sources: README.md:1

Configuration Schema

The server adheres to the MCP protocol schema:

{
  "$schema": "https://static.modelcontextprotocol.io/schemas/2025-12-11/server.schema.json",
  "name": "io.github.mobile-next/mobile-mcp",
  "description": "MCP server for iOS and Android Mobile Development, Automation and Testing",
  "version": "{{VERSION}}",
  "packages": [
    {
      "registryType": "npm",
      "registryBaseUrl": "https://registry.npmjs.org",
      "identifier": "@mobilenext/mobile-mcp",
      "transport": {
        "type": "stdio"
      }
    }
  ]
}

Sources: server.json:1-20

Client-Specific Configuration

Claude Code

claude mcp add mobile-mcp -- npx -y @mobilenext/mobile-mcp@latest

Sources: README.md:1

Claude Desktop

Follow the MCP install guide and use the standard JSON configuration above.

Codex

CLI Installation:

codex mcp add mobile-mcp npx "@mobilenext/mobile-mcp@latest"

Manual Configuration (~/.codex/config.toml):

[mcp_servers.mobile-mcp]
command = "npx"
args = ["@mobilenext/mobile-mcp@latest"]

Sources: README.md:1

Copilot

Configuration file (~/.copilot/mcp-config.json):

{
  "mcpServers": {
    "mobile-mcp": {
      "type": "local",
      "command": "npx",
      "tools": ["*"],
      "args": ["@mobilenext/mobile-mcp@latest"]
    }
  }
}

Sources: README.md:1

Cursor

Installation Button: Click the provided deeplink or navigate to Cursor Settings → MCP → Add new MCP Server.

Manual Configuration:

{
  "mcpServers": {
    "mobile-mcp": {
      "command": "npx",
      "args": ["@mobilenext/mobile-mcp@latest"]
    }
  }
}

Sources: README.md:1

Gemini CLI

gemini mcp add mobile-mcp npx -y @mobilenext/mobile-mcp@latest

Sources: README.md:1

Goose

UI Installation: Use the extension install button or navigate to Advanced settings → Extensions → Add custom extension.

Manual Configuration:

  • Type: STDIO
  • Command: npx -y @mobilenext/mobile-mcp@latest

Sources: README.md:1

Windsurf

Navigate to Windsurf settings → MCP servers → Add new server:

npx @mobilenext/mobile-mcp@latest

Sources: README.md:1

Amp

CLI Installation:

amp mcp add mobile-mcp -- npx @mobilenext/mobile-mcp@latest

VS Code Extension: Add via settings.json:

"amp.mcpServers": {
  "mobile-mcp": {
    "command": "npx",
    "args": ["@mobilenext/mobile-mcp@latest"]
  }
}

Sources: README.md:1

Cline

Add the standard JSON configuration to your MCP settings file.

Sources: README.md:1

Kiro

Configuration file (~/.kiro/settings/mcp.json):

{
  "mcpServers": {
    "mobile-mcp": {
      "command": "npx",
      "args": ["@mobilenext/mobile-mcp@latest"]
    }
  }
}

Sources: README.md:1

opencode

Configuration file (~/.config/opencode/opencode.json):

{
  "$schema": "https://opencode.ai/config.json",
  "mcp": {
    "mobile-mcp": {
      "type": "local",
      "command": ["npx", "@mobilenext/mobile-mcp@latest"],
      "enabled": true
    }
  }
}

Sources: README.md:1

Qodo Gen

Open Qodo Gen chat panel → Connect more tools → + Add new MCP → Paste the standard configuration.

Sources: README.md:1

SSE Server Mode

By default, Mobile MCP communicates over stdio. For remote or web-based deployments, enable SSE (Server-Sent Events) mode.

Starting the SSE Server

Basic Usage:

npx @mobilenext/mobile-mcp@latest --listen 3000

Binding to Specific Interface:

npx @mobilenext/mobile-mcp@latest --listen 0.0.0.0:3000

This binds the server to all network interfaces on port 3000.
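
The two forms accepted by `--listen` in the examples above (a bare port, or `host:port`) could be parsed along these lines. This is a sketch under stated assumptions, not the server's own argument handling: `parseListenValue` and `ListenAddress` are hypothetical, IPv6 literals are not handled, and a missing host is returned as `null` rather than guessing which interface the server would pick.

```typescript
interface ListenAddress {
    host: string | null; // null means "no host given; the server decides"
    port: number;
}

function parseListenValue(value: string): ListenAddress {
    // Split on the last colon so "0.0.0.0:3000" yields host and port.
    const separator = value.lastIndexOf(":");
    const host = separator === -1 ? null : value.slice(0, separator);
    const portText = separator === -1 ? value : value.slice(separator + 1);

    if (!/^\d+$/.test(portText)) {
        throw new Error(`Invalid port in "${value}"`);
    }
    const port = parseInt(portText, 10);
    if (port < 1 || port > 65535) {
        throw new Error(`Port out of range in "${value}"`);
    }
    if (host !== null && host.length === 0) {
        throw new Error(`Missing host in "${value}"`);
    }
    return { host, port };
}
```

Binding to 0.0.0.0 exposes the server to every network the machine is attached to, so it is best combined with the Bearer token authentication described in the Authorization section.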

Client Configuration for SSE

Configure your MCP client to connect to the SSE endpoint:

http://<host>:3000/mcp

Architecture Diagram

graph TD
    A[MCP Client] -->|HTTP/MCP Protocol| B[SSE Server]
    B --> C{Mobile MCP Server}
    C -->|iOS Devices| D[IosRobot]
    C -->|Android Devices| E[AndroidRobot]
    D --> F[mobilecli]
    E --> F
    F --> G[iOS Simulator/Device]
    F --> H[Android Emulator/Device]

Authorization

When running in SSE mode, secure the server with Bearer token authentication.

Configuration

Set the MOBILEMCP_AUTH environment variable:

MOBILEMCP_AUTH=my-secret-token npx @mobilenext/mobile-mcp@latest --listen 3000

Client Request Format

All requests must include the authorization header:

Authorization: Bearer my-secret-token
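
A Bearer check of this kind can be sketched as a small pure function. The code below is illustrative only and is not taken from the server: `isAuthorized` and `constantTimeEquals` are hypothetical names, and treating an unset token as "authentication disabled" is an assumption based on the feature being described as optional.

```typescript
// Returns true when the Authorization header carries the expected Bearer token.
// When no token is configured, authentication is treated as disabled (assumption).
function isAuthorized(authorizationHeader: string | undefined, expectedToken: string | undefined): boolean {
    if (expectedToken === undefined || expectedToken.length === 0) {
        return true;
    }
    if (authorizationHeader === undefined) {
        return false;
    }
    const prefix = "Bearer ";
    if (authorizationHeader.slice(0, prefix.length) !== prefix) {
        return false;
    }
    const presented = authorizationHeader.slice(prefix.length);
    return constantTimeEquals(presented, expectedToken);
}

function constantTimeEquals(a: string, b: string): boolean {
    // Compare every character so the loop length does not depend on where the strings differ.
    let diff = a.length ^ b.length;
    const length = Math.max(a.length, b.length);
    for (let i = 0; i < length; i++) {
        diff |= (a.charCodeAt(i) || 0) ^ (b.charCodeAt(i) || 0);
    }
    return diff === 0;
}
```

In a Node.js server, `crypto.timingSafeEqual` is the usual choice for the comparison; the hand-written loop here only keeps the example free of imports.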

Dependencies

Production Dependencies

| Package | Version | Purpose |
| --- | --- | --- |
| @modelcontextprotocol/sdk | 1.26.0 | MCP protocol implementation |
| ajv | ^8.18.0 | JSON schema validation |
| commander | 14.0.0 | CLI argument parsing |
| express | 5.1.0 | SSE server framework |
| fast-xml-parser | 5.5.7 | XML parsing for mobile protocols |
| qs | ^6.15.0 | Query string parsing |
| zod | ^4.1.13 | Schema validation |
| zod-to-json-schema | 3.25.0 | Zod to JSON Schema conversion |
| mobilecli | 0.3.70 | Mobile device communication (optional) |

Sources: package.json:31-48

Dev Dependencies

Key development dependencies include:

  • TypeScript 5.8.2
  • ESLint 9.19.0
  • Mocha 11.1.0 (testing)
  • ts-node 10.9.2
  • husky 9.1.7 (git hooks)

Sources: package.json:49-60

Troubleshooting

mobilecli Not Available

If the server fails to start with "mobilecli is not available":

  1. Ensure the optional dependency is installed:

npm install mobilecli

  2. Verify installation:

mobilecli --version

  3. Check platform compatibility using the binary resolution logic in src/mobilecli.ts.

Binary Resolution Path

The server searches for mobilecli in this order:

graph TD
    A[Start] --> B{Current path contains node_modules?}
    B -->|Yes| C[Find last node_modules directory]
    C --> D[Check mobilecli/bin/<platform-specific-binary>]
    D --> E{Binary exists?}
    E -->|Yes| F[Return path]
    E -->|No| G[Check parentDir/node_modules/mobilecli/bin/...]
    B -->|No| G
    G --> H{Binary exists?}
    H -->|Yes| F
    H -->|No| I[Throw error]

Sources: src/mobilecli.ts:1-35
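
The search order in the diagram can be approximated as follows. This is a hedged sketch rather than the logic in src/mobilecli.ts: `resolveMobilecliPath` is a hypothetical function, the platform-specific binary name is passed in because the manual does not list it, paths are assumed to use POSIX separators, and file existence is injected as a predicate so the example stays free of file-system calls.

```typescript
// Hypothetical inputs: the directory of the running module, the platform-specific
// binary file name, and a predicate that reports whether a path exists.
function resolveMobilecliPath(
    currentDir: string,
    binaryName: string,
    exists: (path: string) => boolean,
): string {
    const segments = currentDir.split("/");
    const candidates: string[] = [];

    // Step 1: if we are inside node_modules, look under the last node_modules directory.
    const index = segments.lastIndexOf("node_modules");
    if (index !== -1) {
        const nodeModulesDir = segments.slice(0, index + 1).join("/");
        candidates.push(`${nodeModulesDir}/mobilecli/bin/${binaryName}`);
    }

    // Step 2: fall back to node_modules under the parent directory.
    const parentDir = segments.slice(0, -1).join("/");
    candidates.push(`${parentDir}/node_modules/mobilecli/bin/${binaryName}`);

    for (const candidate of candidates) {
        if (exists(candidate)) {
            return candidate;
        }
    }
    throw new Error(`mobilecli binary not found; tried: ${candidates.join(", ")}`);
}
```

Listing the attempted paths in the error message makes a failed lookup easier to diagnose than a bare "not found".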

Node.js Version

Ensure Node.js 18+ is installed:

node --version

iOS Simulator Issues

For iOS simulators, ensure Xcode Command Line Tools are installed:

xcode-select --install

List available simulators:

xcrun simctl list

Boot a simulator before use:

xcrun simctl boot "iPhone 16"

Next Steps

After successful installation:

  1. Connect a Device: Use physical devices, simulators (iOS), or emulators (Android)
  2. Verify Connection: Run the mobile_list_available_devices tool
  3. Take a Screenshot: Use mobile_take_screenshot to verify communication
  4. Explore Tools: Review available tools in the main documentation

Sources: package.json:12

Prerequisites

Related topics: Installation and Configuration, iOS Implementation

Mobile MCP (Model Context Protocol) enables AI agents to interact with mobile devices for automation, testing, and mobile application manipulation. Before using Mobile MCP, you must ensure your environment meets the necessary requirements for both the MCP server and the target mobile devices.

System Requirements

Node.js Environment

Mobile MCP requires Node.js version 18 or higher. This requirement is enforced through the package.json engine specification:

"engines": {
  "node": ">=18"
}

Sources: package.json:11

The MCP SDK version 1.26.0 is used as the core protocol implementation, which also requires a modern Node.js environment with support for ES modules and async/await patterns.

Supported Platforms

Mobile MCP supports automation across multiple platform types:

| Platform Type | Examples | Access Method |
| --- | --- | --- |
| iOS Simulators | iPhone Simulator, iPad Simulator | Local Xcode tools |
| iOS Real Devices | Physical iPhones, iPads | WebDriverAgent via tunnel |
| Android Emulators | Android Studio Emulator, Genymotion | ADB |
| Android Real Devices | Samsung, Google Pixel, etc. | ADB |

Sources: src/server.ts:32-55

The server automatically detects which platform type a device belongs to by checking against iOS devices, Android devices, and iOS simulators in sequence.

Required Software Components

mobilecli

The mobilecli package is the core CLI tool that Mobile MCP relies on for device communication. It is listed as an optional dependency in package.json:

"optionalDependencies": {
  "mobilecli": "0.3.70"
}

Sources: package.json:31-33

The server performs a version check on startup to ensure mobilecli is available:

const ensureMobilecliAvailable = (): void => {
    try {
        const version = mobilecli.getVersion();
        if (version.startsWith("failed")) {
            throw new Error("mobilecli version check failed");
        }
    } catch (error: any) {
        throw new ActionableError(`mobilecli is not available or not working properly.`);
    }
};

Sources: src/server.ts:18-28

Installation Methods

#### Via npm (Recommended)

npx @mobilenext/mobile-mcp@latest

#### Via mobilecli package

npm install -g mobilecli

iOS-Specific Prerequisites

WebDriverAgent (Real Devices)

For physical iOS devices, WebDriverAgent (WDA) must be installed and running. WDA is a WebDriver server for iOS, originally developed at Facebook and now maintained by the Appium project. Mobile MCP uses it for device control and element inspection.

The IosRobot class (src/ios.ts) manages WDA connections:

private async wda(): Promise<WebDriverAgent> {
    await this.assertTunnelRunning();

    if (!(await this.isWdaForwardRunning())) {
        throw new ActionableError("Port forwarding to WebDriverAgent is not running");
    }

    const wda = new WebDriverAgent("localhost", WDA_PORT);

    if (!(await wda.isRunning())) {
        throw new ActionableError("WebDriverAgent is not running on device");
    }

    return wda;
}

Sources: src/ios.ts:82-97

#### WebDriverAgent Port Configuration

| Setting | Default Value | Purpose |
| --- | --- | --- |
| WDA_PORT | 8100 | WebDriverAgent local port |
| IOS_TUNNEL_PORT | 9222 | iOS tunnel port |

iOS Tunnel Requirements

When connecting to remote iOS devices, an iOS tunnel must be established. The tunnel allows the MCP server to communicate with devices over a network connection.

private async assertTunnelRunning(): Promise<void> {
    if (await this.isTunnelRequired()) {
        if (!(await this.isTunnelRunning())) {
            throw new ActionableError("iOS tunnel is not running");
        }
    }
}

Sources: src/ios.ts:68-74
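
Taken together, the two excerpts check three preconditions in a fixed order: tunnel, port forwarding, then WebDriverAgent itself. The sketch below restates that order as a pure function for illustration. The `WdaStatus` shape and `firstWdaProblem` name are hypothetical; only the error messages are taken from the excerpts.

```typescript
// Hypothetical snapshot of the checks performed before WebDriverAgent is used.
interface WdaStatus {
    tunnelRequired: boolean;
    tunnelRunning: boolean;
    portForwardRunning: boolean;
    wdaRunning: boolean;
}

// Returns the first failing precondition, in the order the excerpts check them,
// or null when WebDriverAgent should be reachable.
function firstWdaProblem(status: WdaStatus): string | null {
    if (status.tunnelRequired && !status.tunnelRunning) {
        return "iOS tunnel is not running";
    }
    if (!status.portForwardRunning) {
        return "Port forwarding to WebDriverAgent is not running";
    }
    if (!status.wdaRunning) {
        return "WebDriverAgent is not running on device";
    }
    return null;
}
```

Because the checks are ordered, fixing the reported problem and retrying surfaces the next one, which gives a natural troubleshooting sequence.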

#### Tunnel Setup Requirements

  1. Port forwarding must be active - The tunnel forwards WDA traffic to the local MCP server
  2. Firewall configuration - Port 9222 must be accessible for tunnel connections
  3. Network connectivity - Both server and device must have network access

iOS Simulator Requirements

For iOS simulators, the system uses simctl commands through the Xcode toolchain:

this.simctl("install", this.simulatorUuid, installPath);

Sources: src/iphone-simulator.ts:89

Required tools for simulators:

  • Xcode Command Line Tools
  • simctl utility
  • Bootable simulator instances

App Installation on iOS

The iOS simulator implementation (src/iphone-simulator.ts) handles .zip and .app bundle installations:

if (extname(path).toLowerCase() === ".zip") {
    this.validateZipPaths(path);
    // Extract and install .app bundle
}

Sources: src/iphone-simulator.ts:58-65

Supported formats: .zip, .app
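
The call to `validateZipPaths` guards against zip-slip, where an archive entry such as `../../evil` escapes the extraction directory. The manual does not show that function, so the following is a generic sketch of such a check with a hypothetical name (`isSafeZipEntry`), not the project's implementation.

```typescript
// Returns true when a zip entry name stays inside the extraction directory.
function isSafeZipEntry(entryName: string): boolean {
    // Treat backslashes as separators too, since archives may be built on Windows.
    const normalized = entryName.replace(/\\/g, "/");

    // Reject absolute paths and Windows drive prefixes such as "C:/".
    if (normalized.charAt(0) === "/" || /^[A-Za-z]:/.test(normalized)) {
        return false;
    }

    // Walk the segments and make sure ".." never climbs above the root.
    let depth = 0;
    for (const segment of normalized.split("/")) {
        if (segment === "" || segment === ".") {
            continue;
        }
        if (segment === "..") {
            depth -= 1;
            if (depth < 0) {
                return false;
            }
        } else {
            depth += 1;
        }
    }
    return true;
}
```

An installer would run a check like this over every entry name before extracting anything, and refuse the whole archive if any entry fails.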

Android-Specific Prerequisites

Android Debug Bridge (ADB)

Android devices communicate through ADB (Android Debug Bridge). The MobileDevice class executes ADB commands for device interaction:

private runCommand(args: string[]): string {
    const result = execFileSync("adb", ["-s", this.deviceId, ...args]);
    return result.toString();
}

Sources: src/mobile-device.ts:17-20

#### Common ADB Operations

| Command | Purpose |
| --- | --- |
| adb shell | Execute shell commands on device |
| adb install | Install APK packages |
| adb uninstall | Remove applications |
| adb shell screencap | Capture screenshots |
| adb shell input | Send touch/keyboard input |

Android Device Requirements

  1. USB Debugging enabled - Required for ADB communication
  2. Device authorization - Device must approve computer for debugging
  3. Proper USB drivers - Especially on Windows systems

Android UI Automation Commands

The implementation retrieves UI elements by running a dump ui command and parsing its JSON response:

public async getElementsOnScreen(): Promise<ScreenElement[]> {
    const response = JSON.parse(this.runCommand(["dump", "ui"])) as DumpUIResponse;
    return response.data.elements.map(element => ({
        type: element.type,
        label: element.label,
        text: element.text,
        // ... other properties
    }));
}

Sources: src/mobile-device.ts:52-61
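
Element lists are typically consumed by picking an element and tapping its centre. The sketch below shows that step under assumptions: the `rect` field with `x`, `y`, `width`, and `height` is a hypothetical shape (the excerpt elides the remaining properties), and `centerOf` and `findTapTarget` are illustrative helpers rather than project functions.

```typescript
// Hypothetical geometry attached to each element; the real property names may differ.
interface Rect { x: number; y: number; width: number; height: number; }
interface ElementLike { label?: string; text?: string; rect: Rect; }

// Centre point of a rectangle, rounded to whole pixels.
function centerOf(rect: Rect): { x: number; y: number } {
    return {
        x: Math.round(rect.x + rect.width / 2),
        y: Math.round(rect.y + rect.height / 2),
    };
}

// Finds the first element whose label or text matches exactly and returns
// the coordinates to tap, or null when nothing matches.
function findTapTarget(elements: ElementLike[], wanted: string): { x: number; y: number } | null {
    for (const element of elements) {
        if (element.label === wanted || element.text === wanted) {
            return centerOf(element.rect);
        }
    }
    return null;
}
```

The returned point is what a caller would pass on to a coordinate-based tool such as mobile_click_on_screen_at_coordinates.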

Supported Android Input Operations

| Operation | ADB Command |
| --- | --- |
| Tap | input tap <x> <y> |
| Swipe | input swipe <x1> <y1> <x2> <y2> |
| Text input | input text <text> |
| Button press | input keyevent <code> |
| Long press | Custom implementation with duration |
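
As an illustration of these operations, the helpers below build the argument arrays that would follow `adb -s <deviceId>`. They are a sketch with hypothetical function names and execute nothing. The long press shown here uses one common technique, a swipe that starts and ends at the same point with a duration, which is not necessarily how the project implements it.

```typescript
// Builds argument arrays for `adb -s <deviceId> ...`; nothing is executed here.
function tapArgs(x: number, y: number): string[] {
    return ["shell", "input", "tap", String(x), String(y)];
}

// `input swipe` accepts an optional duration in milliseconds as a fifth value.
function swipeArgs(x1: number, y1: number, x2: number, y2: number, durationMs?: number): string[] {
    const args = ["shell", "input", "swipe", String(x1), String(y1), String(x2), String(y2)];
    if (durationMs !== undefined) {
        args.push(String(durationMs));
    }
    return args;
}

// One common way to emulate a long press: a swipe that does not move.
function longPressArgs(x: number, y: number, durationMs: number): string[] {
    return swipeArgs(x, y, x, y, durationMs);
}

// Key codes are passed by name, for example "KEYCODE_HOME" or "KEYCODE_BACK".
function keyEventArgs(keyCode: string): string[] {
    return ["shell", "input", "keyevent", keyCode];
}
```

Passing arguments as an array, as the `execFileSync` excerpt earlier does, avoids shell quoting issues on the host side.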

Architecture Overview

graph TB
    subgraph "MCP Client" 
        A["AI Agent / IDE"]
    end
    
    subgraph "Mobile MCP Server"
        B["server.ts<br/>MCP Protocol Handler"]
        C["Robot Interface<br/>Abstract Layer"]
    end
    
    subgraph "Device Abstraction Layer"
        D["IosRobot<br/>iOS Implementation"]
        E["AndroidRobot<br/>Android Implementation"]
        F["IosSimulatorRobot<br/>Simulator Implementation"]
    end
    
    subgraph "Device Communication"
        G["mobilecli"]
        H["WebDriverAgent<br/>iOS Real Devices"]
        I["ADB<br/>Android Devices"]
        J["simctl<br/>iOS Simulators"]
    end
    
    A --> B
    B --> C
    C --> D
    C --> E
    C --> F
    D --> G
    D --> H
    E --> G
    E --> I
    F --> G
    F --> J

Prerequisites Checklist

Before running Mobile MCP, verify the following:

Environment Checklist

  • [ ] Node.js >= 18 installed
  • [ ] mobilecli package accessible
  • [ ] Network connectivity for remote devices

iOS Device Checklist (Real Devices)

  • [ ] WebDriverAgent installed on device
  • [ ] iOS tunnel established (for remote access)
  • [ ] Port forwarding active (port 8100)
  • [ ] Device connected via USB or network

iOS Simulator Checklist

  • [ ] Xcode installed
  • [ ] Simulator booted and available
  • [ ] simctl command accessible

Android Device Checklist

  • [ ] ADB installed and in PATH
  • [ ] USB debugging enabled on device
  • [ ] Device authorized for debugging
  • [ ] Device connected (USB or network)

Troubleshooting Prerequisites Issues

mobilecli Not Found

Error: mobilecli is not available or not working properly

Solution: Install mobilecli globally:

npm install -g mobilecli

iOS Tunnel Not Running

Error: iOS tunnel is not running

Solution: Establish tunnel using go-ios:

ios tunnel start

WebDriverAgent Not Running

Error: WebDriverAgent is not running on device

Solution:

  1. Ensure WDA is installed on the device
  2. Verify port forwarding: iproxy 8100 8100
  3. Restart WebDriverAgent if needed

Android Device Not Detected

Error: No Android devices found

Solution:

  1. Verify ADB is running: adb devices
  2. Enable USB debugging on device
  3. Reconnect device or restart ADB: adb kill-server && adb start-server

Sources: package.json:11

System Architecture

Related topics: Device Abstraction Layer, iOS Implementation

Overview

Mobile MCP is a Model Context Protocol (MCP) server that enables AI agents to interact with mobile devices (iOS and Android) through a standardized interface. The system acts as a bridge between LLM-powered agents and mobile device automation, supporting physical devices, simulators, and emulators.

Sources: README.md:1-50

High-Level Architecture

The architecture follows a layered design pattern with clear separation of concerns:

graph TD
    subgraph "Client Layer"
        A["MCP Client<br/>(Cursor, Claude, etc.)"]
    end
    
    subgraph "MCP Server Layer"
        B["server.ts<br/>(MCP Protocol Handler)"]
        C["Tool Definitions"]
        D["Device Manager"]
    end
    
    subgraph "Abstraction Layer"
        E["Robot Interface<br/>(robot.ts)"]
    end
    
    subgraph "Device Implementation Layer"
        F["IosRobot<br/>(ios.ts)"]
        G["AndroidRobot<br/>(android.ts)"]
        H["MobileDevice<br/>(mobile-device.ts)"]
    end
    
    subgraph "External Dependencies"
        I["mobilecli"]
        J["WebDriverAgent"]
        K["ADB"]
    end
    
    A --> B
    B --> C
    B --> D
    D --> E
    E --> F
    E --> G
    E --> H
    F --> J
    G --> K
    H --> I

Core Components

1. MCP Server (`server.ts`)

The main entry point that implements the MCP protocol. Key responsibilities:

  • Tool registration and discovery
  • Device listing and selection
  • Request routing to appropriate robot implementations
  • Authentication handling (SSE mode)

2. Robot Interface (`robot.ts`)

Defines the abstract interface that all device implementations must follow:

interface Robot {
  openUrl(url: string): Promise<void>;
  sendKeys(text: string): Promise<void>;
  pressButton(button: Button): Promise<void>;
  tap(x: number, y: number): Promise<void>;
  doubleTap(x: number, y: number): Promise<void>;
  longPress(x: number, y: number, duration: number): Promise<void>;
  swipeFromCoordinate(x: number, y: number, direction: SwipeDirection, distance?: number): Promise<void>;
  getScreenshot(): Promise<Buffer>;
  getElementsOnScreen(): Promise<ScreenElement[]>;
  listApps(): Promise<InstalledApp[]>;
  launchApp(packageName: string, locale?: string): Promise<void>;
  terminateApp(packageName: string): Promise<void>;
  installApp(path: string): Promise<void>;
  uninstallApp(bundleId: string): Promise<void>;
  setOrientation(orientation: Orientation): Promise<void>;
  getOrientation(): Promise<Orientation>;
}

Sources: src/robot.ts:1-100
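
One practical consequence of coding against an interface is that a test double can stand in for a real device. The sketch below uses a narrowed, hypothetical subset of the interface (`TouchRobot`) and a hypothetical `RecordingRobot` class. It is not part of the project and only illustrates the pattern.

```typescript
// A narrowed, hypothetical subset of the Robot interface shown above.
interface TouchRobot {
    tap(x: number, y: number): Promise<void>;
    doubleTap(x: number, y: number): Promise<void>;
    longPress(x: number, y: number, duration: number): Promise<void>;
}

// Test double that records calls instead of talking to a device.
class RecordingRobot implements TouchRobot {
    public readonly calls: string[] = [];

    public async tap(x: number, y: number): Promise<void> {
        this.calls.push(`tap(${x}, ${y})`);
    }

    public async doubleTap(x: number, y: number): Promise<void> {
        this.calls.push(`doubleTap(${x}, ${y})`);
    }

    public async longPress(x: number, y: number, duration: number): Promise<void> {
        this.calls.push(`longPress(${x}, ${y}, ${duration})`);
    }
}
```

Code that accepts a `TouchRobot` parameter can then be exercised in unit tests without a simulator, emulator, or physical device attached.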

3. Device Implementations

#### IosRobot (ios.ts)

Manages iOS device interactions through WebDriverAgent. Key features:

  • Establishes tunnel connections for remote device access
  • Manages WebDriverAgent port forwarding
  • Provides iOS-specific automation commands

#### AndroidRobot (android.ts)

Manages Android device interactions through ADB:

  • Direct ADB command execution
  • Device discovery and listing
  • APK installation and management

#### MobileDevice (mobile-device.ts)

Uses mobilecli for unified device control:

  • Works across both platforms through mobilecli abstraction
  • UI element dumping and interaction
  • Device orientation management

Sources: src/mobile-device.ts:1-80

Device Selection Flow

graph TD
    A["mobile_list_available_devices"] --> B["Check iOS Devices"]
    B --> C["Check Android Devices"]
    C --> D["Check iOS Simulators"]
    D --> E["Return Combined List"]
    
    F["getRobotFromDevice<br/>(deviceId)"] --> G{"Is iOS Device?"}
    G -->|Yes| H["Return IosRobot"]
    G -->|No| I{"Is Android Device?"}
    I -->|Yes| J["Return AndroidRobot"]
    I -->|No| K{"Is iOS Simulator?"}
    K -->|Yes| L["Check Agent Status"]
    L -->|Install if needed| M["Return MobileDevice"]
    K -->|No| N["Throw Error"]

Sources: src/server.ts:60-100
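
The selection flow above can be sketched as a single lookup function. This is a minimal illustration rather than the code in src/server.ts: the `DeviceSources` shape, the `RobotKind` type, and `pickRobotKind` are hypothetical names, and the sketch returns a string label where the real function returns a robot instance. The agent check for simulators is left out here.

```typescript
// Hypothetical stand-ins for the three discovery sources named in the diagram.
interface DeviceSources {
  iosDeviceIds: string[];      // physical iOS devices
  androidDeviceIds: string[];  // Android devices and emulators
  iosSimulatorIds: string[];   // iOS simulators
}

type RobotKind = "IosRobot" | "AndroidRobot" | "MobileDevice";

// Mirrors the order in the diagram: iOS devices, then Android, then simulators.
function pickRobotKind(deviceId: string, sources: DeviceSources): RobotKind {
  if (sources.iosDeviceIds.includes(deviceId)) {
    return "IosRobot";
  }
  if (sources.androidDeviceIds.includes(deviceId)) {
    return "AndroidRobot";
  }
  if (sources.iosSimulatorIds.includes(deviceId)) {
    return "MobileDevice";
  }
  throw new Error(
    `Device "${deviceId}" not found. Use the mobile_list_available_devices tool to see available devices.`
  );
}

// Example inventory with made-up identifiers.
const exampleSources: DeviceSources = {
  iosDeviceIds: ["ios-phone-1"],
  androidDeviceIds: ["emulator-5554"],
  iosSimulatorIds: ["sim-1234"],
};

const simulatorKind = pickRobotKind("sim-1234", exampleSources);
```

Because the checks run in a fixed order, an identifier that appeared in more than one source would resolve to the first match.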

Tool Architecture

All MCP tools follow a consistent pattern defined in server.ts:

| Category | Tool Name | Purpose |
| --- | --- | --- |
| Device Info | mobile_list_available_devices | List all connected devices |
| Device Info | mobile_get_screen_size | Get device screen dimensions |
| Device Info | mobile_get_orientation | Get current screen orientation |
| Device Info | mobile_set_orientation | Set screen orientation |
| App Management | mobile_list_apps | List installed apps |
| App Management | mobile_launch_app | Launch an app |
| App Management | mobile_terminate_app | Stop an app |
| App Management | mobile_install_app | Install from file |
| App Management | mobile_uninstall_app | Uninstall app |
| Screen Interaction | mobile_take_screenshot | Capture screen |
| Screen Interaction | mobile_click_on_screen_at_coordinates | Tap at coordinates |
| Screen Interaction | mobile_double_tap_on_screen | Double tap |
| Screen Interaction | mobile_long_press_on_screen_at_coordinates | Long press |
| Screen Interaction | mobile_swipe_on_screen | Swipe gesture |
| Input | mobile_type_keys | Send text input |

Communication Protocols

Stdio Mode (Default)

The default communication mode uses standard input/output:

{
  "mcpServers": {
    "mobile-mcp": {
      "command": "npx",
      "args": ["-y", "@mobilenext/mobile-mcp@latest"]
    }
  }
}

SSE Server Mode

Optional HTTP server mode for remote access:

npx @mobilenext/mobile-mcp@latest --listen 3000

Supports optional Bearer token authentication:

MOBILEMCP_AUTH=my-secret-token npx @mobilenext/mobile-mcp@latest --listen 3000

WebDriverAgent Integration (iOS)

For iOS devices, the architecture leverages WebDriverAgent:

sequenceDiagram
    participant MCP as MCP Server
    participant IosRobot
    participant WDA as WebDriverAgent
    participant Tunnel
    
    MCP->>IosRobot: tap(x, y)
    IosRobot->>Tunnel: Ensure tunnel running
    Tunnel-->>IosRobot: OK
    IosRobot->>WDA: POST /session/:id/actions
    WDA-->>IosRobot: Response
    IosRobot-->>MCP: Success

Sources: src/webdriver-agent.ts:50-100

Dependency Architecture

graph LR
    A["@modelcontextprotocol/sdk"] --> B["MCP Server"]
    C["mobilecli"] --> D["MobileDevice"]
    E["express"] --> F["SSE Server"]
    G["zod"] --> H["Schema Validation"]

Key Dependencies

| Package | Version | Purpose |
| --- | --- | --- |
| @modelcontextprotocol/sdk | 1.26.0 | MCP protocol implementation |
| mobilecli | 0.3.70 (optional) | Cross-platform mobile control |
| express | 5.1.0 | HTTP server for SSE mode |
| zod | ^4.1.13 | Runtime type validation |
| commander | 14.0.0 | CLI argument parsing |

Sources: package.json:1-50

Error Handling

The system uses ActionableError for user-friendly error messages:

throw new ActionableError(
  `Device "${deviceId}" not found. Use the mobile_list_available_devices tool to see available devices.`
);

Sources: src/server.ts:90-95
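
To illustrate the idea behind an actionable error, the sketch below defines a small error class and a wrapper that turns it into a readable tool result. It is an assumption-laden example: the real `ActionableError` class and the way the server surfaces it are not shown in this document, and `ToolResult` and `runTool` are hypothetical names.

```typescript
// Hypothetical re-creation of an "actionable" error type: the message tells
// the caller what to do next, not only what went wrong.
class ActionableError extends Error {
  constructor(message: string) {
    super(message);
    this.name = "ActionableError";
    // Keeps instanceof working even when compiled to an ES5 target.
    Object.setPrototypeOf(this, ActionableError.prototype);
  }
}

interface ToolResult {
  text: string;
  isError: boolean;
}

// Runs a tool body. An ActionableError becomes a readable result;
// any other error is rethrown so that genuine bugs stay visible.
function runTool(body: () => string): ToolResult {
  try {
    return { text: body(), isError: false };
  } catch (error) {
    if (error instanceof ActionableError) {
      return { text: error.message, isError: true };
    }
    throw error;
  }
}

const missingDevice = runTool(() => {
  throw new ActionableError(
    'Device "abc" not found. Use the mobile_list_available_devices tool to see available devices.'
  );
});
```

Separating expected, user-fixable failures from unexpected ones lets an AI client read the message and retry with a different tool call.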

Platform Detection Logic

graph TD
    A["getRobotFromDevice"] --> B["Check IosManager"]
    B --> C{"Found in iOS devices?"}
    C -->|Yes| D["Return IosRobot"]
    C -->|No| E["Check AndroidManager"]
    E --> F{"Found in Android devices?"}
    F -->|Yes| G["Return AndroidRobot"]
    F -->|No| H["Check iOS Simulators"]
    H --> I{"Found?"}
    I -->|Yes| J["Return MobileDevice"]
    I -->|No| K["Throw ActionableError"]

Security Considerations

  • URL scheme validation: only http:// and https:// are allowed by default (Sources: src/server.ts:150-160)
  • Optional unsafe URL access via MOBILEMCP_ALLOW_UNSAFE_URLS=1
  • Bearer token authentication for SSE mode
  • No arbitrary command execution on host system
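
The URL scheme rule in the list above can be sketched as a small predicate. This is an illustration only: `schemeOf` and `isUrlAllowed` are hypothetical helpers, and the opt-in flag is passed as a parameter because this document does not show how the server reads MOBILEMCP_ALLOW_UNSAFE_URLS.

```typescript
// Schemes accepted by default, per the list above.
const SAFE_SCHEMES = ["http:", "https:"];

// Extracts the scheme (for example "https:") from a URL string,
// or returns null when the string has no scheme.
function schemeOf(url: string): string | null {
  const match = /^([a-zA-Z][a-zA-Z0-9+.-]*:)/.exec(url.trim());
  return match ? match[1].toLowerCase() : null;
}

// allowUnsafe models MOBILEMCP_ALLOW_UNSAFE_URLS=1 (assumption: a boolean opt-in).
function isUrlAllowed(url: string, allowUnsafe: boolean): boolean {
  const scheme = schemeOf(url);
  if (scheme === null) {
    return false;
  }
  return allowUnsafe || SAFE_SCHEMES.includes(scheme);
}

const webUrlAllowed = isUrlAllowed("https://example.com", false);
const deepLinkAllowed = isUrlAllowed("myapp://action", false);
const deepLinkWithOptIn = isUrlAllowed("myapp://action", true);
```

An allowlist keeps the default behavior narrow, so custom schemes only work after an explicit opt-in.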

Extensibility

The Robot interface design allows for additional platform implementations:

  1. Implement the Robot interface
  2. Add device detection in getRobotFromDevice()
  3. Register new tools in server.ts

// Example: Adding a new platform
const newPlatformRobot = new NewPlatformRobot(deviceId);
return newPlatformRobot;

This modular architecture enables easy addition of new device types or automation backends without modifying existing implementations.

Sources: README.md:1-50

Device Abstraction Layer

Related topics: System Architecture, iOS Implementation

Overview

The Device Abstraction Layer (DAL) is the core architectural component of mobile-mcp that provides a unified interface for interacting with mobile devices across different platforms. It abstracts platform-specific implementations (iOS and Android) behind a common Robot interface, enabling AI agents and automated workflows to control mobile devices without knowledge of the underlying platform details.

The DAL serves as the bridge between the Model Context Protocol (MCP) server and the physical mobile devices, handling device detection, robot instantiation, and operation delegation.

Architecture

graph TD
    A[MCP Server] --> B[Device Abstraction Layer]
    B --> C[Robot Factory]
    C --> D{iOS Device?}
    C --> E{Android Device?}
    C --> F{Simulator?}
    D --> G[IosRobot]
    E --> H[AndroidRobot]
    F --> I[MobileDevice]
    G --> J[mobilecli / iOS Device Kit]
    H --> K[ADB / UI Automator]
    I --> J

Core Components

Robot Interface

The Robot interface (src/robot.ts) defines the contract that all platform-specific implementations must follow. This ensures consistent behavior regardless of the target device.

File: src/robot.ts:1-80

| Method | Description | Parameters | Return Type |
| --- | --- | --- | --- |
| openUrl | Open a URL in the device browser | url: string | Promise<void> |
| sendKeys | Send keyboard input to device | text: string | Promise<void> |
| pressButton | Simulate physical button press | button: Button | Promise<void> |
| tap | Tap at screen coordinates | x: number, y: number | Promise<void> |
| doubleTap | Double-tap at coordinates | x: number, y: number | Promise<void> |
| longPress | Long press at coordinates | x: number, y: number, duration: number | Promise<void> |
| getElementsOnScreen | Get interactive UI elements | - | Promise<ScreenElement[]> |
| setOrientation | Change screen orientation | orientation: Orientation | Promise<void> |
| getOrientation | Get current orientation | - | Promise<Orientation> |

Platform Implementations

#### IosRobot

Handles iOS device interactions using WebDriverAgent or iOS Device Kit as the underlying protocol.

File: src/ios.ts

| Capability | Description |
| --- | --- |
| Simulator Support | Full support for iOS simulators |
| Real Device Support | Uses iOS Device Kit for physical devices |
| WebDriverAgent | Legacy support via go-ios |

#### AndroidRobot

Manages Android device interactions through ADB (Android Debug Bridge) and UI Automator.

File: src/android.ts

| Capability | Description |
| --- | --- |
| Emulator Support | Full support for Android emulators |
| Real Device Support | Direct ADB communication |
| Foldable Support | Multi-screen device handling (v0.0.23+) |

#### MobileDevice

Unified device class for modern simulator/emulator management. Used as the primary interface for iOS simulators.

File: src/mobile-device.ts

| Feature | Description |
| --- | --- |
| Agent Status Check | Verifies device readiness |
| Agent Install | Automatically installs required agents |
| Platform Abstraction | Works across iOS and Android platforms |

Device Detection Flow

sequenceDiagram
    participant Client
    participant Server
    participant DAL as Device Abstraction Layer
    participant IosMgr as IosManager
    participant AndroidMgr as AndroidDeviceManager
    participant mobilecli

    Client->>Server: Tool Request (deviceId)
    Server->>DAL: getRobotFromDevice(deviceId)
    DAL->>IosMgr: listDevices()
    DAL->>AndroidMgr: getConnectedDevices()
    
    alt iOS Device Found
        DAL->>DAL: Create IosRobot(deviceId)
    else Android Device Found
        DAL->>DAL: Create AndroidRobot(deviceId)
    else Simulator Check
        DAL->>mobilecli: getDevices(platform: "ios", type: "simulator")
        alt Simulator Found
            DAL->>DAL: Check agentVerifiedSimulators
            DAL->>mobilecli: agentStatus(deviceId)
            alt Agent Not Installed
                DAL->>mobilecli: agentInstall(deviceId)
            end
            DAL->>DAL: Create MobileDevice(deviceId)
        else Not Found
            DAL-->>Client: Error: Device not found
        end
    end
    DAL-->>Server: Robot Instance
    Server-->>Client: Execute Tool

Device Manager Classes

IosManager

Manages iOS device discovery and listing.

File: src/ios.ts

// Signature only; the method body is omitted here.
declare class IosManager {
    listDevices(): IosDevice[];
}

AndroidDeviceManager

Manages Android device discovery via ADB.

File: src/android.ts

// Signature only; the method body is omitted here.
declare class AndroidDeviceManager {
    getConnectedDevices(): AndroidDevice[];
}

mobilecli Integration

The mobilecli tool (src/mobilecli.ts) is the underlying CLI that provides cross-platform device management. Although package.json lists it under optionalDependencies, the Device Abstraction Layer needs it at runtime, so the server checks that it is available and working.

File: src/server.ts:8-16

const ensureMobilecliAvailable = (): void => {
    try {
        const version = mobilecli.getVersion();
        if (version.startsWith("failed")) {
            throw new Error("mobilecli version check failed");
        }
    } catch (error: any) {
        throw new ActionableError(`mobilecli is not available or not working properly...`);
    }
};

Key mobilecli Functions

| Function | Purpose |
| --- | --- |
| getVersion() | Verify mobilecli installation |
| getDevices() | List available devices by platform and type |
| agentStatus() | Check if agent is installed on device |
| agentInstall() | Install required agent on device |
| agentUninstall() | Remove agent from device |

Simulator Agent Management

For iOS simulators, the DAL implements an agent verification and auto-installation system.

File: src/server.ts:45-58

if (!agentVerifiedSimulators.has(deviceId)) {
    const agentStatus = mobilecli.agentStatus(deviceId);
    if (agentStatus.status === "fail") {
        mobilecli.agentInstall(deviceId);
    }
    agentVerifiedSimulators.add(deviceId);
}

This ensures that:

  1. Each simulator is checked at most once per server session
  2. Missing agents are automatically installed
  3. The agentVerifiedSimulators Set prevents redundant installations
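
The caching effect described in the three points above can be demonstrated with a fake client that counts calls. This is a sketch under stated assumptions: `AgentClient`, `ensureAgent`, and the `"ok"` status value are hypothetical, and only the `"fail"` status and the two function names come from the snippet above.

```typescript
// Hypothetical minimal view of the two mobilecli calls used above.
interface AgentClient {
  agentStatus(deviceId: string): { status: "ok" | "fail" };
  agentInstall(deviceId: string): void;
}

const agentVerifiedSimulators = new Set<string>();

function ensureAgent(client: AgentClient, deviceId: string): void {
  if (agentVerifiedSimulators.has(deviceId)) {
    return; // already verified during this session
  }
  if (client.agentStatus(deviceId).status === "fail") {
    client.agentInstall(deviceId);
  }
  agentVerifiedSimulators.add(deviceId);
}

// Fake client that counts calls and always reports a missing agent.
const counters = { status: 0, install: 0 };
const fakeClient: AgentClient = {
  agentStatus: () => {
    counters.status += 1;
    return { status: "fail" };
  },
  agentInstall: () => {
    counters.install += 1;
  },
};

ensureAgent(fakeClient, "sim-1");
ensureAgent(fakeClient, "sim-1"); // skipped: "sim-1" is already in the Set
ensureAgent(fakeClient, "sim-2");
```

After these three calls the fake client has been queried once per distinct simulator, which is the behavior the Set is meant to guarantee.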

Error Handling

The DAL provides actionable error messages when device operations fail.

| Error Condition | Response |
| --- | --- |
| mobilecli not available | Links to installation wiki |
| Device not found | Points to the mobile_list_available_devices tool |
| iOS Device Kit failure | Fallback mechanisms |

File: src/server.ts:15-16

throw new ActionableError(
    `Device "${deviceId}" not found. Use the mobile_list_available_devices tool to see available devices.`
);

Usage in MCP Tools

The Device Abstraction Layer is invoked by all device interaction tools defined in the MCP server:

File: src/server.ts:63-90

| Tool | Operation |
| --- | --- |
| mobile_list_available_devices | Lists all connected devices |
| mobile_get_screen_size | Delegates to active Robot |
| mobile_take_screenshot | Delegates to active Robot |
| mobile_click_on_screen_at_coordinates | Delegates to active Robot |
| mobile_type_keys | Delegates to active Robot |
| mobile_press_button | Delegates to active Robot |

Configuration

The DAL depends on mobilecli, which package.json declares as an optional dependency:

File: package.json:22-23

"optionalDependencies": {
    "mobilecli": "0.3.70"
}

Summary

The Device Abstraction Layer provides:

  • Unified Interface: A single Robot interface for all platform operations
  • Platform Detection: Automatic identification of iOS, Android, and simulator devices
  • Factory Pattern: Dynamic robot instantiation based on device type
  • Agent Management: Automatic verification and installation of required agents
  • Error Context: Actionable error messages with documentation links

This architecture enables mobile-mcp to provide a consistent automation experience across the diverse mobile device ecosystem while maintaining platform-specific optimizations.

Source: https://github.com/mobile-next/mobile-mcp / Human Manual

MCP Tools Reference

Related topics: Device Management, App Management, Screen Interaction and Input

Overview

The Mobile MCP server exposes a set of tools that let AI assistants interact with iOS and Android devices through the Model Context Protocol. The tools cover device management, screen interaction, app lifecycle management, and UI automation.

Sources: README.md

The tools are implemented in TypeScript and follow a consistent pattern: each tool accepts a device parameter that identifies the target device.

Sources: src/server.ts:1-100

Architecture

Tool Layer Hierarchy

graph TD
    A[MCP Client<br/>Claude, Cursor, Codex] --> B[Mobile MCP Server]
    B --> C[Robot Interface]
    C --> D[iOS Robot]
    C --> E[Android Robot]
    D --> F[WebDriverAgent]
    D --> G[iPhone Simulator]
    E --> H[Android ADB]
    F --> I[Real iOS Device]
    G --> J[iOS Simulator]
    H --> K[Android Emulator]
    H --> L[Android Device]

Device Detection Flow

graph TD
    A[getRobotFromDevice<br/>deviceId: string] --> B{Is iOS Device?}
    B -->|Yes| C[Return IosRobot]
    B -->|No| D{Is Android Device?}
    D -->|Yes| E[Return AndroidRobot]
    D -->|No| F{Check iOS<br/>Simulators?}
    F -->|Found| G[Return MobileDevice]
    F -->|Not Found| H[Throw ActionableError]

The getRobotFromDevice function determines the appropriate robot implementation based on the device type by querying device managers and checking against known device identifiers.

Sources: src/server.ts:15-50

Tool Categories

Device Management Tools

#### mobile_list_available_devices

Lists all available mobile devices including simulators, emulators, and connected physical devices.

| Parameter | Type | Required | Description |
| --- | --- | --- | --- |
| device | string | No | Optional device filter |

Response: JSON array of device objects with deviceId, name, platform, and status.

#### mobile_get_screen_size

Retrieves the screen dimensions of the connected device in pixels.

| Parameter | Type | Required | Description |
| --- | --- | --- | --- |
| device | string | Yes | Target device identifier |

Response: { "width": number, "height": number }

#### mobile_get_orientation

Returns the current screen orientation of the device.

| Parameter | Type | Required | Description |
| --- | --- | --- | --- |
| device | string | Yes | Target device identifier |

Response: "portrait" or "landscape"

Sources: src/mobile-device.ts:80-90

#### mobile_set_orientation

Changes the screen orientation of the device.

| Parameter | Type | Required | Description |
| --- | --- | --- | --- |
| device | string | Yes | Target device identifier |
| orientation | string | Yes | "portrait" or "landscape" |

public async setOrientation(orientation: Orientation): Promise<void> {
    this.runCommand(["device", "orientation", "set", orientation]);
}

Sources: src/mobile-device.ts:75-79

Device Management

Related topics: MCP Tools Reference, Device Abstraction Layer

Overview

Device Management in Mobile MCP provides a unified abstraction layer for controlling mobile devices across iOS and Android platforms. It enables AI agents to interact with physical devices, simulators, and emulators through a consistent set of tools and APIs, abstracting platform-specific implementation details.

Sources: src/server.ts:1-50

Architecture

The Device Management system follows a layered architecture:

graph TD
    A[MCP Server] --> B[Device Manager]
    B --> C[iOS Manager]
    B --> D[Android Manager]
    B --> E[Mobile CLI]
    C --> F[iOS Robot]
    D --> G[Android Robot]
    E --> H[Mobile Device]
    F --> I[Physical Device / Simulator]
    G --> J[Emulator / Physical Device]
    H --> K[iOS Simulator]

Core Components

| Component | File | Purpose |
| --- | --- | --- |
| Robot interface | src/robot.ts | Defines unified device interaction contract |
| IosRobot | src/ios.ts | Handles iOS device communication |
| AndroidRobot | src/android.ts | Handles Android device communication |
| MobileDevice | src/mobile-device.ts | Handles iOS simulators via mobilecli |
| IosManager | src/ios.ts | iOS device discovery and management |
| AndroidDeviceManager | src/android.ts | Android device discovery and management |

Sources: src/robot.ts:1-50

Device Discovery

Device discovery is performed through the getRobotFromDevice() function, which identifies the device type and returns the appropriate robot implementation.

graph TD
    A[Start] --> B{Check iOS Devices}
    B -->|Found| C[Return IosRobot]
    B -->|Not Found| D{Check Android Devices}
    D -->|Found| E[Return AndroidRobot]
    D -->|Not Found| F{Check Simulators via mobilecli}
    F -->|Simulator Found| G[Verify Agent Status]
    G -->|Status Failed| H[Install Agent]
    G -->|Status OK| I[Return MobileDevice]
    F -->|Not Found| J[Throw Error]

Device Detection Flow

The detection sequence in src/server.ts:

  1. iOS Physical Devices: Uses IosManager.listDevices() to enumerate iOS devices
  2. Android Physical Devices: Uses AndroidDeviceManager.getConnectedDevices() to enumerate Android devices
  3. iOS Simulators: Uses mobilecli.getDevices() with platform: "ios" and type: "simulator"

Sources: src/server.ts:50-100

Device Types Supported

Platform Matrix

| Platform | Physical Devices | Simulators/Emulators | Key Technologies |
| --- | --- | --- | --- |
| iOS | iPhone, iPad | iOS Simulator | WebDriverAgent, iOS Device Kit |
| Android | Samsung, Pixel, etc. | Android Emulator | ADB, UI Automator |

Simulator Agent Verification

When connecting to iOS simulators, Mobile MCP performs agent verification:

if (!agentVerifiedSimulators.has(deviceId)) {
    const agentStatus = mobilecli.agentStatus(deviceId);
    if (agentStatus.status === "fail") {
        mobilecli.agentInstall(deviceId);
    }
    agentVerifiedSimulators.add(deviceId);
}

Sources: src/server.ts:75-85

Robot Interface

The Robot interface in src/robot.ts defines all device interaction methods:

Screen Operations

| Method | Purpose | Return Type |
| --- | --- | --- |
| getScreenshot() | Capture screen as PNG | Promise<Buffer> |
| getScreenSize() | Get screen dimensions | Promise<{width, height}> |
| getOrientation() | Get current orientation | Promise<Orientation> |
| setOrientation() | Set portrait/landscape | Promise<void> |

Touch Interactions

| Method | Purpose | Parameters |
| --- | --- | --- |
| tap(x, y) | Single tap | x, y coordinates |
| doubleTap(x, y) | Double tap | x, y coordinates |
| longPress(x, y, duration) | Long press | x, y, duration (ms) |
| swipeFromCoordinate(x, y, direction, distance?) | Swipe gesture | x, y, direction, optional distance |

App Management

| Method | Purpose | Parameters |
| --- | --- | --- |
| listApps() | List installed apps | None |
| launchApp(packageName, locale?) | Launch app | package name, optional locale |
| terminateApp(packageName) | Stop app | package name |
| installApp(path) | Install from file | file path (.apk, .ipa, .app, .zip) |
| uninstallApp(bundleId) | Uninstall app | bundle ID/package name |

Navigation & Input

| Method | Purpose | Parameters |
| --- | --- | --- |
| sendKeys(text) | Type text | string |
| pressButton(button) | Press button | Button enum (HOME, BACK, etc.) |
| openUrl(url) | Open URL/browser | URL string |

Sources: src/robot.ts:50-120
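
Because every implementation shares these method signatures, a flow can be written once against the interface and exercised without a device. The sketch below uses a hypothetical subset of the interface (`InputRobot`), a hypothetical test double (`RecordingRobot`), and a hypothetical helper (`fillField`). The `ENTER` button value is an assumption, since the tables above only name HOME and BACK.

```typescript
// Assumed button values; only HOME and BACK are named in the tables above.
type Button = "HOME" | "BACK" | "ENTER";

// Hypothetical subset of the Robot interface, limited to input methods.
interface InputRobot {
  tap(x: number, y: number): Promise<void>;
  sendKeys(text: string): Promise<void>;
  pressButton(button: Button): Promise<void>;
}

// Test double that records calls instead of touching a device.
class RecordingRobot implements InputRobot {
  public readonly calls: string[] = [];

  async tap(x: number, y: number): Promise<void> {
    this.calls.push(`tap(${x},${y})`);
  }

  async sendKeys(text: string): Promise<void> {
    this.calls.push(`sendKeys(${text})`);
  }

  async pressButton(button: Button): Promise<void> {
    this.calls.push(`pressButton(${button})`);
  }
}

// Platform-agnostic flow: focus a field, type into it, then submit.
async function fillField(robot: InputRobot, x: number, y: number, text: string): Promise<void> {
  await robot.tap(x, y);
  await robot.sendKeys(text);
  await robot.pressButton("ENTER");
}
```

Passing a `RecordingRobot` to `fillField` leaves the three calls in `calls`, in the order they were issued. The same function would accept any object that implements the three methods.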

Available Tools

Mobile MCP exposes the following device management tools to MCP clients:

Device Information

  • mobile_list_available_devices - List all connected devices (physical and simulators)
  • mobile_get_screen_size - Get screen dimensions in pixels
  • mobile_get_orientation - Get current screen orientation
  • mobile_set_orientation - Change screen orientation (portrait/landscape)
  • mobile_list_crashes - List crash reports on device
  • mobile_get_crash - Retrieve full crash report content

Screen Interaction

  • mobile_take_screenshot - Capture screenshot
  • mobile_save_screenshot - Save screenshot to file
  • mobile_list_elements_on_screen - Get UI elements with coordinates
  • mobile_click_on_screen_at_coordinates - Click at x,y
  • mobile_double_tap_on_screen - Double tap at x,y
  • mobile_long_press_on_screen_at_coordinates - Long press at x,y
  • mobile_swipe_on_screen - Swipe in direction

Input & Navigation

  • mobile_type_keys - Type text into focused elements
  • mobile_press_button - Press device buttons
  • mobile_open_url - Open URL in browser

App Management

  • mobile_list_apps - List installed apps
  • mobile_launch_app - Launch app by package name
  • mobile_terminate_app - Stop running app
  • mobile_install_app - Install from file
  • mobile_uninstall_app - Uninstall app

Prerequisites

Device management requires the mobilecli tool to be available on the system. The server validates this on startup:

const ensureMobilecliAvailable = (): void => {
    try {
        const version = mobilecli.getVersion();
        if (version.startsWith("failed")) {
            throw new Error("mobilecli version check failed");
        }
    } catch (error: any) {
        throw new ActionableError(`mobilecli is not available...`);
    }
};

Sources: src/server.ts:20-35

Configuration

MCP Server Configuration

{
  "mcpServers": {
    "mobile-mcp": {
      "command": "npx",
      "args": ["-y", "@mobilenext/mobile-mcp@latest"]
    }
  }
}

SSE Server Mode with Authentication

MOBILEMCP_AUTH=my-secret-token npx @mobilenext/mobile-mcp@latest --listen 3000

When MOBILEMCP_AUTH is set, all requests require the header:

Authorization: Bearer my-secret-token

Sources: package.json:1-30
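
The header check can be sketched as a pure function. This is an illustrative assumption rather than the server's implementation: `isAuthorized` is a hypothetical helper, and the expected token is passed in explicitly instead of being read from MOBILEMCP_AUTH.

```typescript
// Returns true when the Authorization header carries the expected Bearer token.
// When no token is configured (MOBILEMCP_AUTH unset), every request passes,
// which matches authentication being optional.
function isAuthorized(
  authorizationHeader: string | undefined,
  expectedToken: string | undefined
): boolean {
  if (!expectedToken) {
    return true;
  }
  if (!authorizationHeader) {
    return false;
  }
  const prefix = "Bearer ";
  if (!authorizationHeader.startsWith(prefix)) {
    return false;
  }
  return authorizationHeader.slice(prefix.length) === expectedToken;
}

const acceptedRequest = isAuthorized("Bearer my-secret-token", "my-secret-token");
const rejectedRequest = isAuthorized("Bearer wrong-token", "my-secret-token");
```

A production check would typically compare the tokens in constant time to avoid leaking information through timing. The plain comparison here keeps the sketch short.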

Error Handling

The device management system uses ActionableError to provide users with actionable error messages:

throw new ActionableError(
    `Device "${deviceId}" not found. Use the mobile_list_available_devices tool to see available devices.`
);

This pattern ensures users receive guidance on how to resolve issues, not just failure messages.

Sources: src/server.ts:95-100

Dependencies

| Dependency | Version | Purpose |
| --- | --- | --- |
| @modelcontextprotocol/sdk | 1.26.0 | MCP protocol implementation |
| mobilecli | 0.3.70 (optional) | Cross-platform mobile CLI |
| express | 5.1.0 | SSE server transport |
| zod | ^4.1.13 | Schema validation |

Sources: package.json:25-50

Sources: src/server.ts:1-50

App Management

Related topics: MCP Tools Reference, iOS Implementation

App Management in Mobile MCP enables AI agents to interact with mobile applications on connected devices through a unified interface. This system provides capabilities for listing, launching, terminating, installing, and uninstalling applications across both iOS and Android platforms.

Overview

The App Management feature abstracts platform-specific implementation details behind a consistent Robot interface. This allows AI agents to manage applications without understanding the underlying differences between iOS and Android platforms.

graph TD
    A[MCP Client / AI Agent] --> B[Mobile MCP Server]
    B --> C[Robot Interface]
    C --> D[iOS Robot]
    C --> E[Android Robot]
    D --> F[WebDriverAgent / go-ios]
    E --> G[ADB / Android Manager]

Sources: src/robot.ts:1-100

Available Tools

Mobile MCP exposes five primary tools for application management:

| Tool | Description | Platform |
| --- | --- | --- |
| mobile_list_apps | List all installed applications | iOS, Android |
| mobile_launch_app | Launch an app by package/bundle name | iOS, Android |
| mobile_terminate_app | Stop a running application | iOS, Android |
| mobile_install_app | Install app from file (.apk, .ipa, .app, .zip) | iOS, Android |
| mobile_uninstall_app | Uninstall app by bundle ID or package name | iOS, Android |

Sources: README.md

Robot Interface

The core App Management functionality is defined in the Robot interface. This interface declares abstract methods that platform-specific implementations must provide.

interface Robot {
    listApps(): Promise<InstalledApp[]>;
    launchApp(packageName: string, locale?: string): Promise<void>;
    terminateApp(packageName: string): Promise<void>;
    installApp(path: string): Promise<void>;
    uninstallApp(bundleId: string): Promise<void>;
    openUrl(url: string): Promise<void>;
}

Sources: src/robot.ts:40-75

Method Specifications

#### listApps()

Returns all installed applications on the device.

listApps(): Promise<InstalledApp[]>;

Return Type: InstalledApp[] - Array of objects containing package names (Android) or bundle identifiers (iOS).

#### launchApp()

Launches an application with optional locale specification.

launchApp(packageName: string, locale?: string): Promise<void>;

Parameters:

| Parameter | Type | Description |
| --- | --- | --- |
| packageName | string | The package name (Android) or bundle ID (iOS) of the app |
| locale | string (optional) | Locale to launch the app with (e.g., "en_US") |

Sources: src/robot.ts:47-49

#### terminateApp()

Terminates a running application. If the app is not running or doesn't exist, this operation is a no-op.

terminateApp(packageName: string): Promise<void>;

Parameters:

| Parameter | Type | Description |
| --- | --- | --- |
| packageName | string | The package name (Android) or bundle ID (iOS) |

#### installApp()

Installs an application from a local file path. Supports multiple formats across platforms.

installApp(path: string): Promise<void>;

Supported Formats:

| Platform | Formats |
| --- | --- |
| Android | .apk, .zip |
| iOS | .ipa, .app, .zip |
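
A caller might want to reject an unsupported file before invoking installApp. The sketch below encodes the format table as a lookup. `Platform`, `SUPPORTED_EXTENSIONS`, and `isSupportedInstallFile` are hypothetical names, and nothing here implies that the server performs this check itself.

```typescript
type Platform = "android" | "ios";

// Extensions taken from the Supported Formats table above.
const SUPPORTED_EXTENSIONS: Record<Platform, string[]> = {
  android: [".apk", ".zip"],
  ios: [".ipa", ".app", ".zip"],
};

// Case-insensitive check on the file extension only; file contents are not inspected.
function isSupportedInstallFile(platform: Platform, path: string): boolean {
  const lower = path.toLowerCase();
  return SUPPORTED_EXTENSIONS[platform].some(ext => lower.endsWith(ext));
}

const apkOnAndroid = isSupportedInstallFile("android", "/tmp/demo.apk");
const apkOnIos = isSupportedInstallFile("ios", "/tmp/demo.apk");
```

An extension check only catches obvious mismatches. A corrupted or mislabeled package would still fail at install time.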

#### uninstallApp()

Uninstalls an application from the device.

uninstallApp(bundleId: string): Promise<void>;

Parameters:

| Parameter | Type | Description |
| --- | --- | --- |
| bundleId | string | The app's bundle identifier (iOS) or package name (Android) |

#### openUrl()

Opens a URL in the device's web browser.

openUrl(url: string): Promise<void>;

Supported URL Schemes:

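| Type | Example | Availability |
| --- | --- | --- |
| HTTP/HTTPS | https://example.com | Allowed by default |
| Custom Schemes | myapp://action | Requires MOBILEMCP_ALLOW_UNSAFE_URLS=1 |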

Sources: src/robot.ts:59-63

Architecture

Device Selection Flow

When a tool requires device-specific implementation, the server determines the appropriate Robot based on the device type:

graph TD
    A[Device ID Provided] --> B{Is iOS Device?}
    B -->|Yes| C[Return IosRobot]
    B -->|No| D{Is Android Device?}
    D -->|Yes| E[Return AndroidRobot]
    D -->|No| F{Is Simulator?}
    F -->|Yes| G[Return MobileDevice]
    F -->|No| H[Throw ActionableError]

Sources: src/server.ts:30-60

iOS Implementation

The iOS implementation leverages go-ios (via mobilecli) for device communication. The IosManager handles device detection and the WebDriverAgent protocol manages application lifecycle.

Key components in iOS app management:

  1. WebDriverAgent (WDA) - A WebDriver server for iOS built on Apple's XCTest framework
  2. go-ios - Command-line interface for iOS device control
  3. Tunnel Service - Required for real device communication

Sources: src/ios.ts:1-50

Android Implementation

Android implementation uses the Android Debug Bridge (ADB) through the Android Manager:

  1. ADB - Primary communication protocol with Android devices
  2. Android Manager - Handles device enumeration and app operations
  3. Package Manager - Manages app installation and uninstallation

Tool Registration

App Management tools are registered in the server with descriptive schemas:

tool(
    "mobile_list_apps",
    "List Apps",
    "List all installed apps on the device",
    {},
    { readOnlyHint: true },
    async ({}) => { /* implementation */ }
);

Sources: src/server.ts:80-100

Prerequisites

iOS Requirements

  • WebDriverAgent must be running on the device
  • For real devices: iOS tunnel must be established
  • go-ios must be installed and functional

Sources: src/ios.ts:35-45

Android Requirements

  • ADB must be available on the host machine
  • Device must be connected and authorized
  • USB debugging must be enabled

Usage Examples

List All Installed Apps

{
  "tool": "mobile_list_apps",
  "arguments": {}
}

Launch an App with Locale

{
  "tool": "mobile_launch_app",
  "arguments": {
    "packageName": "com.example.app",
    "locale": "en_US"
  }
}

Install an App

{
  "tool": "mobile_install_app",
  "arguments": {
    "path": "/path/to/application.apk"
  }
}

Uninstall an App

{
  "tool": "mobile_uninstall_app",
  "arguments": {
    "bundleId": "com.example.app"
  }
}

Open URL in Browser

{
  "tool": "mobile_open_url",
  "arguments": {
    "url": "https://example.com"
  }
}

Error Handling

The system provides actionable error messages when operations fail. Common error scenarios include:

| Error Condition | Cause | Resolution |
| --- | --- | --- |
| Device not found | Invalid device ID or disconnected device | Check device connection |
| App not installed | Attempting to launch non-existent app | Install app first |
| Installation failed | Invalid file format or corrupted package | Verify file integrity |
| Permission denied | Insufficient device permissions | Grant required permissions |

Sources: src/server.ts:45-50

Dependencies

App Management relies on the following package:

| Package | Version | Purpose |
| --- | --- | --- |
| mobilecli | 0.3.70 | Cross-platform mobile device CLI |

Sources: package.json:25-30

Screen Interaction and Input

Related topics: MCP Tools Reference, iOS Implementation

Overview

The Screen Interaction and Input system in Mobile MCP provides a comprehensive abstraction layer for controlling mobile devices (iOS, Android, simulators, and emulators) through a unified Robot interface. This system enables AI agents to interact with mobile applications by simulating user input events, capturing screen content, and managing device orientation.

The architecture is designed to work with multiple device platforms while presenting a consistent API to MCP clients. The system leverages native accessibility trees for most interactions, falling back to screenshot-based coordinate operations when accessibility labels are unavailable.

Sources: src/robot.ts:1-30

Architecture

Core Components

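The system consists of the Robot interface and its platform-specific implementations:

| Component | Platform | Protocol | File |
| --- | --- | --- | --- |
| IosRobot | iOS | WebDriverAgent (WDA) | src/webdriver-agent.ts:1-50 |
| AndroidRobot | Android | ADB | src/android.ts |
| MobileDevice | iOS simulators | mobilecli | src/mobile-device.ts:1-30 |
| Robot (interface) | All | Abstract | src/robot.ts:1-50 |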

Robot Interface Contract

The Robot interface defines the contract that all platform-specific implementations must fulfill:

interface Robot {
  openUrl(url: string): Promise<void>;
  sendKeys(text: string): Promise<void>;
  pressButton(button: Button): Promise<void>;
  tap(x: number, y: number): Promise<void>;
  doubleTap(x: number, y: number): Promise<void>;
  longPress(x: number, y: number, duration: number): Promise<void>;
  getElementsOnScreen(): Promise<ScreenElement[]>;
  setOrientation(orientation: Orientation): Promise<void>;
  getOrientation(): Promise<Orientation>;
  swipeFromCoordinate(x: number, y: number, direction: SwipeDirection, distance?: number): Promise<void>;
  getScreenshot(): Promise<Buffer>;
  listApps(): Promise<InstalledApp[]>;
  launchApp(packageName: string, locale?: string): Promise<void>;
  terminateApp(packageName: string): Promise<void>;
  installApp(path: string): Promise<void>;
  uninstallApp(bundleId: string): Promise<void>;
}

Sources: src/robot.ts:30-100

Device Selection Flow

The server determines which Robot implementation to use based on the device identifier:

graph TD
    A[MCP Request with device ID] --> B{Device Type Check}
    B -->|iOS Device| C[IosRobot]
    B -->|Android Device| D[AndroidRobot]
    B -->|Simulator| E[IosRobot]
    C --> F[WebDriverAgent Connection]
    D --> G[mobilecli Commands]
    E --> F

Sources: src/server.ts:50-80

Touch Interactions

Tap Operations

#### Single Tap

Single tap is implemented using pointer actions sent through the WebDriverAgent protocol for iOS:

public async tap(x: number, y: number) {
  await this.withinSession(async sessionUrl => {
    const url = `${sessionUrl}/actions`;
    await fetch(url, {
      method: "POST",
      headers: { "Content-Type": "application/json" },
      body: JSON.stringify({
        actions: [{
          type: "pointer",
          id: "finger1",
          parameters: { pointerType: "touch" },
          actions: [
            { type: "pointerMove", duration: 0, x, y },
            { type: "pointerDown", button: 0 },
            { type: "pause", duration: 100 },
            { type: "pointerUp", button: 0 }
          ]
        }]
      }),
    });
  });
}

Sources: src/webdriver-agent.ts:180-210

For Android, the tap operation uses the mobilecli command:

public async tap(x: number, y: number): Promise<void> {
  this.runCommand(["io", "tap", `${x},${y}`]);
}

Sources: src/mobile-device.ts:50-52

#### Double Tap

Double tap is implemented by executing two consecutive tap operations:

public async doubleTap(x: number, y: number): Promise<void> {
  await this.tap(x, y);
  await this.tap(x, y);
}

Sources: src/mobile-device.ts:56-60

In iOS WebDriverAgent, the double tap uses the W3C Actions API with multiple pointer sequences.
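The iOS double-tap payload is not shown in the source excerpt. The following is a minimal sketch, assuming it reuses the single-tap steps twice inside one finger sequence. The builder name `buildTapActions` and the 100 ms press and gap timings are illustrative assumptions, not values taken from the project.

```typescript
// One step in a W3C WebDriver pointer action sequence.
type PointerStep =
  | { type: "pointerMove"; duration: number; x: number; y: number }
  | { type: "pointerDown"; button: number }
  | { type: "pointerUp"; button: number }
  | { type: "pause"; duration: number };

interface PointerSequence {
  type: "pointer";
  id: string;
  parameters: { pointerType: "touch" };
  actions: PointerStep[];
}

// Builds a body for POST {sessionUrl}/actions that presses `taps` times at (x, y).
// pressMs and gapMs are hypothetical defaults chosen for illustration.
function buildTapActions(
  x: number,
  y: number,
  taps = 2,
  pressMs = 100,
  gapMs = 100,
): { actions: PointerSequence[] } {
  const steps: PointerStep[] = [{ type: "pointerMove", duration: 0, x, y }];
  for (let i = 0; i < taps; i++) {
    if (i > 0) {
      // Short wait between consecutive presses.
      steps.push({ type: "pause", duration: gapMs });
    }
    steps.push(
      { type: "pointerDown", button: 0 },
      { type: "pause", duration: pressMs },
      { type: "pointerUp", button: 0 },
    );
  }
  return {
    actions: [
      { type: "pointer", id: "finger1", parameters: { pointerType: "touch" }, actions: steps },
    ],
  };
}

const doubleTapBody = buildTapActions(150, 300);
console.log(JSON.stringify(doubleTapBody.actions[0].actions.map(step => step.type)));
```

Calling the same builder with `taps = 1` and a larger `pressMs` yields a long-press shaped sequence, which is why the three touch operations can share one code path.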

#### Long Press

Long press requires specifying the duration in milliseconds:

public async longPress(x: number, y: number, duration: number): Promise<void> {
  this.runCommand(["io", "longpress", `${x},${y}`, "--duration", `${duration}`]);
}

Sources: src/mobile-device.ts:62-64

The iOS implementation uses the Actions API and holds the press by placing a pause of the requested duration between pointerDown and pointerUp:

{ type: "pointerDown", button: 0 },
{ type: "pause", duration: duration },
{ type: "pointerUp", button: 0 }

Swipe Gestures

Swipe gestures calculate coordinates based on screen dimensions, using 60% of the screen width or height as the swipe distance:

const verticalDistance = Math.floor(screenSize.height * 0.6);
const horizontalDistance = Math.floor(screenSize.width * 0.6);
const centerX = Math.floor(screenSize.width / 2);
const centerY = Math.floor(screenSize.height / 2);

Sources: src/webdriver-agent.ts:130-135

The swipe direction determines the start and end coordinates:

| Direction | X Movement | Y Movement |
| --- | --- | --- |
| up | centerX | centerY ± verticalDistance/2 |
| down | centerX | centerY ± verticalDistance/2 |
| left | centerX ± horizontalDistance/2 | centerY |
| right | centerX ± horizontalDistance/2 | centerY |
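The table does not say which end of each range is the start point. The sketch below resolves that under one assumption: the direction names the way the finger moves, so "up" starts below the center and ends above it. The function name `swipePath` and the `x0/y0/x1/y1` field names are hypothetical.

```typescript
type SwipeDirection = "up" | "down" | "left" | "right";

interface ScreenSize { width: number; height: number }
interface SwipePath { x0: number; y0: number; x1: number; y1: number }

// Computes start (x0, y0) and end (x1, y1) of a centered swipe that covers
// 60% of the relevant screen dimension.
// Assumption: the direction is the direction of finger travel.
function swipePath(size: ScreenSize, direction: SwipeDirection): SwipePath {
  const halfVertical = Math.floor(Math.floor(size.height * 0.6) / 2);
  const halfHorizontal = Math.floor(Math.floor(size.width * 0.6) / 2);
  const centerX = Math.floor(size.width / 2);
  const centerY = Math.floor(size.height / 2);

  switch (direction) {
    case "up":
      return { x0: centerX, y0: centerY + halfVertical, x1: centerX, y1: centerY - halfVertical };
    case "down":
      return { x0: centerX, y0: centerY - halfVertical, x1: centerX, y1: centerY + halfVertical };
    case "left":
      return { x0: centerX + halfHorizontal, y0: centerY, x1: centerX - halfHorizontal, y1: centerY };
    case "right":
      return { x0: centerX - halfHorizontal, y0: centerY, x1: centerX + halfHorizontal, y1: centerY };
  }
}

// On a 1000x2000 screen an upward swipe runs from y=1600 to y=400 at x=500.
console.log(swipePath({ width: 1000, height: 2000 }, "up"));
```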

Text Input

Keyboard Input

Text input is handled through different mechanisms per platform:

iOS (via WebDriverAgent):

public async sendKeys(text: string): Promise<void> {
  await this.withinSession(async sessionUrl => {
    await fetch(`${sessionUrl}/wda/keys`, {
      method: "POST",
      body: JSON.stringify({ value: text.split("") }),
    });
  });
}

Android (via mobilecli):

public async sendKeys(text: string): Promise<void> {
  this.runCommand(["io", "text", text]);
}

Sources: src/mobile-device.ts:42-44
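One detail of the iOS snippet is worth illustrating: `text.split("")` splits a string into UTF-16 code units, so a character outside the Basic Multilingual Plane (most emoji, for example) becomes two separate entries. The sketch below contrasts that with splitting by code point. The helper name `toKeyList` is hypothetical, and whether WebDriverAgent handles either form correctly is not verified here.

```typescript
// Splits text into one entry per Unicode code point instead of per UTF-16
// code unit, so surrogate pairs stay together.
function toKeyList(text: string): string[] {
  return Array.from(text);
}

const sampleText = "hi👍";

// split("") yields 4 entries: "h", "i", and two lone surrogate halves.
console.log(sampleText.split("").length);

// Array.from yields 3 entries: "h", "i", "👍".
console.log(toKeyList(sampleText).length);
```

Characters inside the Basic Multilingual Plane, including most CJK text, split identically under both approaches, so this sketch only addresses surrogate pairs.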

Button Presses

Physical device buttons are mapped to platform-specific commands:

| Button | iOS Action | Android Command |
| --- | --- | --- |
| HOME | wda/pressButton | io button home |
| BACK | N/A | io button back |
| POWER | N/A | io button power |
| ENTER | sendKeys "\n" | io button enter |
| VOLUME_UP | wda/pressButton | io button volume_up |
| VOLUME_DOWN | wda/pressButton | io button volume_down |

Sources: src/webdriver-agent.ts:150-175

Screen Capture

Screenshot Retrieval

Screenshots are retrieved as PNG buffers through platform-specific protocols:

iOS WebDriverAgent:

public async getScreenshot(): Promise<Buffer> {
  const url = `http://${this.host}:${this.port}/screenshot`;
  const response = await fetch(url);
  const json = await response.json();
  return Buffer.from(json.value, "base64");
}

Sources: src/webdriver-agent.ts:100-106
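Because the screenshot arrives as a base64 string inside a JSON response, a caller may want to confirm that the decoded bytes really are a PNG before passing them on. This is a minimal sketch of such a check using the fixed 8-byte PNG file signature. The function name `decodePngScreenshot` is hypothetical and the project may not perform this validation.

```typescript
import { Buffer } from "node:buffer";

// Every PNG file starts with these eight bytes.
const PNG_SIGNATURE = [0x89, 0x50, 0x4e, 0x47, 0x0d, 0x0a, 0x1a, 0x0a];

// Decodes a base64 payload and rejects it unless it starts with the PNG signature.
function decodePngScreenshot(base64Value: string): Buffer {
  const data = Buffer.from(base64Value, "base64");
  const looksLikePng =
    data.length >= PNG_SIGNATURE.length &&
    PNG_SIGNATURE.every((byte, index) => data[index] === byte);
  if (!looksLikePng) {
    throw new Error("screenshot payload is not a PNG image");
  }
  return data;
}

// A payload containing only the signature is enough to exercise the check.
const signatureOnly = Buffer.from(PNG_SIGNATURE).toString("base64");
console.log(decodePngScreenshot(signatureOnly).length);
```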

Android (via mobilecli): Screenshots are captured using the platform's native screenshot mechanism through the dump command.

UI Element Discovery

Element Filtering

The system filters accessibility elements to return only actionable UI components:

const acceptedTypes = [
  "TextField", 
  "Button", 
  "Switch", 
  "Icon", 
  "SearchField", 
  "StaticText", 
  "Image"
];

Sources: src/webdriver-agent.ts:30-32

Element visibility is determined by checking both the isVisible flag and bounds:

if (acceptedTypes.includes(source.type)) {
  if (source.isVisible === "1" && this.isVisible(source.rect)) {
    if (source.label !== null || source.name !== null || source.rawIdentifier !== null) {
      output.push({ /* element data */ });
    }
  }
}
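The filtering fragment above can be sketched as a self-contained tree walk. Several parts are assumptions: the `SourceNode` shape, the `isOnScreen` rule (non-negative origin), and treating `undefined` the same as `null` when checking for an identifying field. The source only shows the `!== null` comparison.

```typescript
interface ElementRect { x: number; y: number; width: number; height: number }

// Hypothetical shape of a node in the accessibility tree.
interface SourceNode {
  type: string;
  isVisible?: string;
  label?: string | null;
  name?: string | null;
  rawIdentifier?: string | null;
  rect: ElementRect;
  children?: SourceNode[];
}

const ACCEPTED_TYPES = ["TextField", "Button", "Switch", "Icon", "SearchField", "StaticText", "Image"];

// Assumption: an element counts as on screen when its origin is not negative.
function isOnScreen(rect: ElementRect): boolean {
  return rect.x >= 0 && rect.y >= 0;
}

// Depth-first walk that keeps visible, identifiable nodes of an accepted type.
function collectElements(node: SourceNode, out: SourceNode[] = []): SourceNode[] {
  const identifiable = [node.label, node.name, node.rawIdentifier].some(value => value != null);
  if (ACCEPTED_TYPES.includes(node.type) && node.isVisible === "1" && isOnScreen(node.rect) && identifiable) {
    out.push(node);
  }
  for (const child of node.children ?? []) {
    collectElements(child, out);
  }
  return out;
}

const sampleTree: SourceNode = {
  type: "Window",
  isVisible: "1",
  rect: { x: 0, y: 0, width: 390, height: 844 },
  children: [
    { type: "Button", isVisible: "1", label: "Login", rect: { x: 20, y: 700, width: 350, height: 44 } },
    { type: "Button", isVisible: "0", label: "Hidden", rect: { x: 20, y: 760, width: 350, height: 44 } },
    { type: "Image", isVisible: "1", rect: { x: 0, y: 0, width: 390, height: 200 } },
  ],
};

// Only the visible, labelled Login button passes all three checks.
console.log(collectElements(sampleTree).map(node => node.label));
```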

Element Data Structure

Each screen element contains:

| Property | Type | Description |
| --- | --- | --- |
| type | string | Element type (Button, TextField, etc.) |
| label | string | Accessibility label |
| name | string | Element name |
| value | string | Current value (for text fields) |
| identifier | string | Platform-specific identifier |
| rect | {x, y, width, height} | Bounding rectangle |
| focused | boolean | Focus state |

Sources: src/mobile-device.ts:66-76
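Since tap operations take coordinates while element discovery returns rectangles, callers usually need to turn a `rect` into a tap point. The sketch below models the properties listed above as a TypeScript interface and computes the rectangle's center. Which fields are optional is an assumption, and `centerOf` and `findByText` are hypothetical helpers.

```typescript
// Assumed typing of the element properties; optionality is a guess.
interface ScreenElement {
  type: string;
  label?: string;
  name?: string;
  value?: string;
  identifier?: string;
  rect: { x: number; y: number; width: number; height: number };
  focused?: boolean;
}

// Returns the center of the element's bounding rectangle, rounded to whole pixels.
function centerOf(element: ScreenElement): { x: number; y: number } {
  return {
    x: Math.round(element.rect.x + element.rect.width / 2),
    y: Math.round(element.rect.y + element.rect.height / 2),
  };
}

// Finds the first element whose label or name matches the given text exactly.
function findByText(elements: ScreenElement[], text: string): ScreenElement | undefined {
  return elements.find(element => element.label === text || element.name === text);
}

const sampleElements: ScreenElement[] = [
  { type: "Button", label: "Login", rect: { x: 100, y: 200, width: 80, height: 40 } },
];
const loginElement = findByText(sampleElements, "Login");
if (loginElement) {
  console.log(centerOf(loginElement)); // center of the 80x40 box at (100, 200) is (140, 220)
}
```

Tapping the center rather than a fixed offset from the top-left corner keeps the tap inside the element even when it is very small.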

Orientation Management

Device orientation can be queried and changed:

public async setOrientation(orientation: Orientation): Promise<void> {
  this.runCommand(["device", "orientation", "set", orientation]);
}

public async getOrientation(): Promise<Orientation> {
  const response = JSON.parse(this.runCommand(["device", "orientation", "get"])) as OrientationResponse;
  return response.data.orientation;
}

Sources: src/mobile-device.ts:78-85

Supported orientations: "portrait" | "landscape"
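The `getOrientation` snippet casts the parsed JSON directly to `OrientationResponse`. A more defensive variant, sketched here under the assumption that the response has the `{ data: { orientation } }` shape implied above, validates the value against the two supported orientations before returning it. `parseOrientation` and `isOrientation` are hypothetical names.

```typescript
type Orientation = "portrait" | "landscape";

// Type guard for the two supported values.
function isOrientation(value: unknown): value is Orientation {
  return value === "portrait" || value === "landscape";
}

// Parses command output of the assumed form {"data":{"orientation":"..."}}
// and throws if the value is missing or unsupported.
function parseOrientation(raw: string): Orientation {
  const parsed: unknown = JSON.parse(raw);
  const value = (parsed as { data?: { orientation?: unknown } } | null)?.data?.orientation;
  if (!isOrientation(value)) {
    throw new Error(`unexpected orientation value: ${String(value)}`);
  }
  return value;
}

console.log(parseOrientation('{"data":{"orientation":"landscape"}}'));
```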

Connection Prerequisites

iOS Requirements

iOS devices require several connection layers to be operational:

graph LR
    A[MCP Server] --> B[iOS Tunnel]
    B --> C[WDA Port Forward]
    C --> D[WebDriverAgent]
    D --> E[iOS Device]
    
    F[Check: Tunnel Running] --> B
    G[Check: WDA Forward Running] --> C
    H[Check: WDA isRunning] --> D

Sources: src/ios.ts:30-60

The system verifies:

  1. Tunnel Running - Required for remote iOS devices
  2. WDA Port Forward - TCP port forwarding to WebDriverAgent
  3. WebDriverAgent Status - Actual running state of WDA on device

The tunnel check, for example, throws an actionable error when a required tunnel is not running:

private async assertTunnelRunning(): Promise<void> {
  if (await this.isTunnelRequired()) {
    if (!(await this.isTunnelRunning())) {
      throw new ActionableError("iOS tunnel is not running...");
    }
  }
}

MCP Tools Interface

The server exposes the following screen interaction tools:

| Tool Name | Description | Parameters |
| --- | --- | --- |
| mobile_take_screenshot | Capture current screen | device |
| mobile_list_elements_on_screen | Get UI elements | device |
| mobile_click_on_screen_at_coordinates | Tap at coordinates | device, x, y |
| mobile_double_tap_on_screen | Double tap | device, x, y |
| mobile_long_press_on_screen_at_coordinates | Long press | device, x, y, duration |
| mobile_swipe_on_screen | Swipe gesture | device, direction |
| mobile_type_keys | Text input | device, text, submit |
| mobile_get_orientation | Query orientation | device |
| mobile_set_orientation | Set orientation | device, orientation |
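MCP clients invoke these tools with a JSON-RPC `tools/call` request. The sketch below builds such a request for the tap tool, using the parameter names from the table. The helper `toolCall` is hypothetical and the device identifier `emulator-5554` is only a placeholder.

```typescript
// Shape of an MCP tools/call request (JSON-RPC 2.0).
interface ToolCallRequest {
  jsonrpc: "2.0";
  id: number;
  method: "tools/call";
  params: { name: string; arguments: Record<string, unknown> };
}

// Wraps a tool name and its arguments in a tools/call envelope.
function toolCall(id: number, name: string, args: Record<string, unknown>): ToolCallRequest {
  return { jsonrpc: "2.0", id, method: "tools/call", params: { name, arguments: args } };
}

// Placeholder device id; a real one would come from the device listing tool.
const tapRequest = toolCall(1, "mobile_click_on_screen_at_coordinates", {
  device: "emulator-5554",
  x: 150,
  y: 300,
});

console.log(JSON.stringify(tapRequest, null, 2));
```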

Usage Examples

Basic Screen Interaction Workflow

// 1. List available devices
const devices = await listAvailableDevices();

// 2. Get screen elements
const elements = await getElementsOnScreen(deviceId);

// 3. Tap on a button by coordinates
await tap(deviceId, 150, 300);

// 4. Type text into a field
await typeKeys(deviceId, "Hello World", false);

// 5. Swipe up to scroll
await swipe(deviceId, "up");

// 6. Take screenshot to verify
const screenshot = await getScreenshot(deviceId);

Complete User Journey Automation

// Open app and perform multi-step interaction
await launchApp(deviceId, "com.example.app");
await swipe(deviceId, "up");
const elements = await getElementsOnScreen(deviceId);
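const loginButton = elements.find(e => e.name === "Login");
// find() returns undefined when no element matches, so guard before tapping
if (!loginButton) throw new Error("Login button not found on screen");
await tap(deviceId, loginButton.rect.x + 10, loginButton.rect.y + 10);
// Placeholder e-mail address
await typeKeys(deviceId, "user@example.com", false);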
await pressButton(deviceId, "ENTER");

Error Handling

The system provides actionable error messages with troubleshooting links:

throw new ActionableError(
  `mobilecli is not available or not working properly. 
   Please review the documentation at 
   https://github.com/mobile-next/mobile-mcp/wiki`
);

Sources: src/server.ts:20-25
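A dedicated error class lets the server tell user-fixable problems apart from unexpected failures. The following is a rough sketch of that pattern, assuming `ActionableError` is a plain `Error` subclass. The wrapper `toToolResult` and its result shape are hypothetical and are not taken from the project.

```typescript
// Assumption: ActionableError is a simple Error subclass with no extra fields.
class ActionableError extends Error {
  constructor(message: string) {
    super(message);
    this.name = "ActionableError";
    // Keeps instanceof working when compiling to older JavaScript targets.
    Object.setPrototypeOf(this, ActionableError.prototype);
  }
}

// Runs a tool body. Actionable errors become a readable result for the user,
// while any other error is rethrown as a genuine failure.
function toToolResult(run: () => string): { text: string; isError: boolean } {
  try {
    return { text: run(), isError: false };
  } catch (error) {
    if (error instanceof ActionableError) {
      return { text: error.message, isError: true };
    }
    throw error;
  }
}

const tunnelResult = toToolResult(() => {
  throw new ActionableError("iOS tunnel is not running");
});
console.log(tunnelResult);
```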

Common error scenarios:

| Error Condition | Cause | Resolution |
| --- | --- | --- |
| "iOS tunnel is not running" | Remote device connection issue | Start tunnel per wiki instructions |
| "Port forwarding not running" | WDA port not forwarded | Configure port forwarding |
| "WebDriverAgent not running" | WDA crashed on device | Restart WDA on iOS device |

Performance Considerations

  • Element queries use native accessibility trees for speed
  • Screenshots are transferred as base64-encoded PNG
  • Gestures are calculated as percentages of screen dimensions
  • Timeouts for recording operations default to 5 minutes (300 seconds)

Sources: src/robot.ts:30-100

iOS Implementation

Related topics: Device Abstraction Layer, Screen Interaction and Input

Overview

The iOS Implementation module provides automation capabilities for iOS physical devices and simulators. It bridges the Mobile MCP server and iOS testing infrastructure, enabling AI agents to interact with iOS applications through accessibility trees and coordinate-based input.

Purpose: The iOS Implementation enables native app automation for iOS testing, scripted flows, and multi-step user journeys driven by LLMs.

Scope: The module handles device detection, connection management, WebDriverAgent communication, tunnel port forwarding, and command execution via the go-ios library.

Sources: src/ios.ts:1-50

Architecture

The iOS implementation follows a layered architecture that separates device management, robot control, and protocol communication.

graph TD
    A[MCP Server] --> B[IosManager]
    A --> C[IosRobot]
    B --> D[go-ios / mobilecli]
    C --> E[WebDriverAgent]
    C --> F[iOS Device Kit]
    E --> G[Physical iOS Device]
    F --> G
    D --> H[iOS Simulator]
    I[iPhone Simulator] --> H

Core Components

| Component | File | Responsibility |
| --- | --- | --- |
| IosManager | src/server.ts | Device discovery and listing |
| IosRobot | src/server.ts | Device interaction via Robot interface |
| IosDeviceConnection | src/ios.ts | Connection and tunnel management |
| WebDriverAgent | src/webdriver-agent.ts | UI automation protocol |
| iPhone Simulator | src/iphone-simulator.ts | Simulator-specific operations |

Sources: src/server.ts:30-80

Device Detection

The system uses a multi-step detection process to identify iOS device types and establish appropriate communication channels.

Device Type Resolution

graph TD
    A[Device ID] --> B{IosManager listDevices?}
    B -->|Found| C[IosRobot]
    B -->|Not Found| D{AndroidDeviceManager?}
    D -->|Found| E[AndroidRobot]
    D -->|Not Found| F{mobilecli getDevices?}
    F -->|Simulator Found| G{MobileDevice}
    F -->|Not Found| H[Error: Device not found]

The getRobotFromDevice function performs the following checks in order:

  1. iOS Physical Devices: Queries IosManager for connected iOS devices (Sources: src/server.ts:45-50)
  2. Android Devices: Checks AndroidDeviceManager for matching device ID (Sources: src/server.ts:52-56)
  3. iOS Simulators: Uses mobilecli with platform filter for simulators (Sources: src/server.ts:58-75)
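The ordered checks above can be sketched as a pure function over three device lists. This is an illustration of the lookup order only. The `DeviceSources` shape, the `resolveRobotKind` name, and the error text are assumptions, and the real function returns a robot instance rather than a label.

```typescript
type RobotKind = "IosRobot" | "AndroidRobot" | "MobileDevice";

// Hypothetical snapshot of what each discovery source currently reports.
interface DeviceSources {
  iosDeviceIds: string[];
  androidDeviceIds: string[];
  simulatorIds: string[];
}

// Applies the checks in the documented order and fails if nothing matches.
function resolveRobotKind(deviceId: string, sources: DeviceSources): RobotKind {
  if (sources.iosDeviceIds.includes(deviceId)) {
    return "IosRobot";
  }
  if (sources.androidDeviceIds.includes(deviceId)) {
    return "AndroidRobot";
  }
  if (sources.simulatorIds.includes(deviceId)) {
    return "MobileDevice";
  }
  throw new Error(`Device "${deviceId}" not found`);
}

// Placeholder identifiers for illustration.
const knownDevices: DeviceSources = {
  iosDeviceIds: ["ios-physical-1"],
  androidDeviceIds: ["emulator-5554"],
  simulatorIds: ["ios-simulator-1"],
};

console.log(resolveRobotKind("ios-simulator-1", knownDevices));
```

Because the first match wins, an identifier reported by more than one source resolves to the earliest check in the list.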

Simulator Agent Verification

For iOS simulators, the system automatically verifies and installs the agent if needed:

if (!agentVerifiedSimulators.has(deviceId)) {
    const agentStatus = mobilecli.agentStatus(deviceId);
    if (agentStatus.status === "fail") {
        mobilecli.agentInstall(deviceId);
    }
    agentVerifiedSimulators.add(deviceId);
}

Sources: src/server.ts:65-71
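The effect of the `agentVerifiedSimulators` set is that the status check runs at most once per simulator for the lifetime of the process. The sketch below demonstrates this with a fake client that counts calls. The `AgentClient` interface and `ensureAgent` wrapper are hypothetical stand-ins for the mobilecli wrapper shown above.

```typescript
// Minimal stand-in for the two mobilecli calls used by the source snippet.
interface AgentClient {
  agentStatus(deviceId: string): { status: string };
  agentInstall(deviceId: string): void;
}

// Verifies the agent once per device and installs it if the status check fails.
function ensureAgent(client: AgentClient, deviceId: string, verified: Set<string>): void {
  if (verified.has(deviceId)) {
    return;
  }
  if (client.agentStatus(deviceId).status === "fail") {
    client.agentInstall(deviceId);
  }
  // Reached only if neither call threw, so a failed install is retried next time.
  verified.add(deviceId);
}

// Fake client that reports a missing agent and counts how often it is called.
const agentCalls = { statusChecks: 0, installs: 0 };
const fakeClient: AgentClient = {
  agentStatus: () => {
    agentCalls.statusChecks++;
    return { status: "fail" };
  },
  agentInstall: () => {
    agentCalls.installs++;
  },
};

const verifiedSimulators = new Set<string>();
ensureAgent(fakeClient, "SIM-1", verifiedSimulators);
ensureAgent(fakeClient, "SIM-1", verifiedSimulators);

// The second call is skipped entirely: one status check, one install.
console.log(agentCalls);
```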

Connection Management

Port-Based Connection State

The iOS implementation uses port checking to verify connection states:

graph LR
    A[Device] -->|USB Tunnel| B[localhost:PORT]
    B --> C{WDA Port Check}
    C -->|Listening| D[WebDriverAgent Ready]
    C -->|Not Listening| E[Error: Port forwarding not running]

#### Tunnel Requirements

The isTunnelRequired method determines when tunnel port forwarding is necessary based on connection type. When a tunnel is required but not running, the system throws an ActionableError:

private async assertTunnelRunning(): Promise<void> {
    if (await this.isTunnelRequired()) {
        if (!(await this.isTunnelRunning())) {
            throw new ActionableError("iOS tunnel is not running, please see https://github.com/mobile-next/mobile-mcp/wiki/");
        }
    }
}

Sources: src/ios.ts:45-50

#### WebDriverAgent Port Forwarding

Connection to WebDriverAgent requires both tunnel and port forwarding verification:

private async wda(): Promise<WebDriverAgent> {
    await this.assertTunnelRunning();

    if (!(await this.isWdaForwardRunning())) {
        throw new ActionableError("Port forwarding to WebDriverAgent is not running (tunnel okay), please see https://github.com/mobile-next/mobile-mcp/wiki/");
    }

    const wda = new WebDriverAgent("localhost", WDA_PORT);

    if (!(await wda.isRunning())) {
        throw new ActionableError("WebDriverAgent is not running on device (tunnel okay, port forwarding okay), please see https://github.com/mobile-next/mobile-mcp/wiki/");
    }

    return wda;
}

Sources: src/ios.ts:55-70
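The three checks in `wda()` form a layered diagnosis: each message states which lower layers were already confirmed. A simplified, synchronous sketch of that ordering is shown below. The `ConnectionState` shape and `firstConnectionProblem` name are hypothetical, and the real checks are asynchronous probes rather than precomputed flags.

```typescript
// Hypothetical snapshot of the probes that wda() performs one after another.
interface ConnectionState {
  tunnelRequired: boolean;
  tunnelRunning: boolean;
  wdaForwardRunning: boolean;
  wdaRunning: boolean;
}

// Returns the first failing layer from the outside in, or null when all pass.
function firstConnectionProblem(state: ConnectionState): string | null {
  if (state.tunnelRequired && !state.tunnelRunning) {
    return "iOS tunnel is not running";
  }
  if (!state.wdaForwardRunning) {
    return "Port forwarding to WebDriverAgent is not running (tunnel okay)";
  }
  if (!state.wdaRunning) {
    return "WebDriverAgent is not running on device (tunnel okay, port forwarding okay)";
  }
  return null;
}

console.log(
  firstConnectionProblem({
    tunnelRequired: true,
    tunnelRunning: true,
    wdaForwardRunning: false,
    wdaRunning: false,
  }),
);
```

Reporting only the outermost failure keeps the troubleshooting path short: fixing a later layer is pointless while an earlier one is still down.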

Robot Interface Implementation

The IosRobot class implements the Robot interface, providing unified access to iOS device capabilities.

Supported Operations

| Category | Operations |
| --- | --- |
| Screen | getScreenshot, getScreenSize, getOrientation, setOrientation |
| Touch | tap, doubleTap, longPress, swipeFromCoordinate |
| Input | sendKeys, pressButton |
| Navigation | home, back, openUrl |
| Apps | listApps, launchApp, terminateApp, installApp, uninstallApp |
| Elements | getElementsOnScreen |

Sources: src/robot.ts:1-80

iOS-Specific Considerations

#### Accessibility Tree

The implementation uses native accessibility trees for element detection rather than computer vision, making it LLM-friendly and fast.

Note: The getElementsOnScreen method works only on native apps and will not function within webviews.

Sources: src/robot.ts:40-45

#### Long Press Duration

The long press operation accepts a custom duration parameter:

longPress(x: number, y: number, duration: number): Promise<void>;

| Parameter | Type | Description |
| --- | --- | --- |
| x | number | X coordinate on screen |
| y | number | Y coordinate on screen |
| duration | number | Duration in milliseconds |

Sources: src/robot.ts:35-38

WebDriverAgent Integration

WebDriverAgent (WDA) is the automation server that Mobile MCP talks to when controlling iOS physical devices.

Connection Flow

sequenceDiagram
    participant MCP as MCP Server
    participant Ios as IosDeviceConnection
    participant WDA as WebDriverAgent
    participant Device as iOS Device

    MCP->>Ios: wda()
    Ios->>Ios: assertTunnelRunning()
    Ios->>Ios: isWdaForwardRunning()
    Ios->>WDA: new WebDriverAgent(localhost, PORT)
    WDA->>Device: isRunning()
    Device-->>WDA: status
    WDA-->>Ios: WebDriverAgent instance
    Ios-->>MCP: Ready for commands

Port Configuration

| Port | Purpose | Configuration |
| --- | --- | --- |
| WDA_PORT | WebDriverAgent communication | localhost:PORT |
| IOS_TUNNEL_PORT | USB tunnel forwarding | localhost:PORT |

Sources: src/webdriver-agent.ts:1-30

iOS Simulator Support

The iPhone Simulator module provides specialized handling for iOS Simulator instances.

Simulator Detection

Simulators are detected through mobilecli with the following parameters:

const response = mobilecli.getDevices({
    platform: "ios",
    type: "simulator",
    includeOffline: false,
});

Sources: src/server.ts:58-62

Simulator vs Physical Device

| Aspect | Simulator | Physical Device |
| --- | --- | --- |
| Connection | Direct via mobilecli | USB tunnel + WDA |
| Agent | Auto-installed on demand | Manual setup required |
| Port Forwarding | Not required | Required (WDA_PORT) |
| WebDriverAgent | Optional auto-start | Required |

Sources: src/iphone-simulator.ts:1-40

go-ios Integration

The implementation uses the go-ios binary for low-level iOS device communication.

Command Execution

private async ios(...args: string[]): Promise<string> {
    return execFileSync(getGoIosPath(), ["--udid", this.deviceId, ...args], {}).toString();
}

Sources: src/ios.ts:75-77

Utility Functions

The system checks for mobilecli availability on startup:

const ensureMobilecliAvailable = (): void => {
    try {
        const version = mobilecli.getVersion();
        if (version.startsWith("failed")) {
            throw new Error("mobilecli version check failed");
        }
    } catch (error: any) {
        throw new ActionableError(`mobilecli is not available or not working properly. Please review the documentation at https://github.com/mobile-next/mobile-mcp/wiki for installation instructions`);
    }
};

Sources: src/server.ts:22-32

Error Handling

Actionable Errors

The system throws ActionableError with user-friendly messages and documentation links:

| Error Scenario | Message | Resolution Link |
| --- | --- | --- |
| mobilecli unavailable | Installation instructions | Wiki Installation |
| Tunnel not running | Setup guide | Wiki Tunnel Setup |
| Port forwarding failed | Troubleshooting | Wiki Debugging |
| WDA not running | Configuration guide | Wiki WDA Setup |
| Device not found | Device list | mobile_list_available_devices tool |

Sources: src/ios.ts:48-70

Configuration

Environment Variables

| Variable | Description | Default |
| --- | --- | --- |
| MOBILEMCP_AUTH | Bearer token for SSE authorization | None |

Port Constants

The system uses standard port configurations for iOS tunnel and WDA communication. Refer to the source code for current port values.

Platform Support Matrix

| Platform | Version | Automation Method |
| --- | --- | --- |
| iOS Simulator | All versions | mobilecli + iOS Device Kit |
| Physical iOS | iOS 13+ | WebDriverAgent via go-ios |
| Real Device (debug) | iOS 13+ | USB tunnel + WDA |

Sources: src/webdriver-agent.ts:20-30

Sources: src/ios.ts:1-50

Doramagic Pitfall Log

Source-linked risks stay visible on the manual page so the preview does not read like a recommendation.

Doramagic extracted 16 source-linked risk signals. Review them before installing or handing real data to the project.

1. Configuration risk: feat(iOS): add Unix Domain Socket support for usbmuxd (/var/run/usbmuxd) with TCP fallback

  • Severity: high
  • Finding: Configuration risk is backed by a source signal: feat(iOS): add Unix Domain Socket support for usbmuxd (/var/run/usbmuxd) with TCP fallback. Treat it as a review item until the current version is checked.
  • User impact: Users may get misleading failures or incomplete behavior unless configuration is checked carefully.
  • Recommended check: Open the linked source, confirm whether it still applies to the current version, and keep the first run isolated.
  • Evidence: Source-linked evidence: https://github.com/mobile-next/mobile-mcp/issues/322

2. Configuration risk: mobile_type_keys not support Chinese words

  • Severity: high
  • Finding: Configuration risk is backed by a source signal: mobile_type_keys not support Chinese words. Treat it as a review item until the current version is checked.
  • User impact: Users may get misleading failures or incomplete behavior unless configuration is checked carefully.
  • Recommended check: Open the linked source, confirm whether it still applies to the current version, and keep the first run isolated.
  • Evidence: Source-linked evidence: https://github.com/mobile-next/mobile-mcp/issues/238

3. Security or permission risk: Default-on telemetry creates security and compliance risk for enterprise users

  • Severity: high
  • Finding: Security or permission risk is backed by a source signal: Default-on telemetry creates security and compliance risk for enterprise users. Treat it as a review item until the current version is checked.
  • User impact: The project may affect permissions, credentials, data exposure, or host boundaries.
  • Recommended check: Open the linked source, confirm whether it still applies to the current version, and keep the first run isolated.
  • Evidence: Source-linked evidence: https://github.com/mobile-next/mobile-mcp/issues/330

4. Security or permission risk: iOS physical-device support blocked on macOS 26 (Tahoe) — go-ios can't read /var/db/lockdown/RemotePairing/

  • Severity: high
  • Finding: Security or permission risk is backed by a source signal: iOS physical-device support blocked on macOS 26 (Tahoe) — go-ios can't read /var/db/lockdown/RemotePairing/. Treat it as a review item until the current version is checked.
  • User impact: The project may affect permissions, credentials, data exposure, or host boundaries.
  • Recommended check: Open the linked source, confirm whether it still applies to the current version, and keep the first run isolated.
  • Evidence: Source-linked evidence: https://github.com/mobile-next/mobile-mcp/issues/323

5. Configuration risk: start/stop_screen_recording produces corrupt files on Android and silently fails on iOS physical devices

  • Severity: medium
  • Finding: Configuration risk is backed by a source signal: start/stop_screen_recording produces corrupt files on Android and silently fails on iOS physical devices. Treat it as a review item until the current version is checked.
  • User impact: Users may get misleading failures or incomplete behavior unless configuration is checked carefully.
  • Recommended check: Open the linked source, confirm whether it still applies to the current version, and keep the first run isolated.
  • Evidence: Source-linked evidence: https://github.com/mobile-next/mobile-mcp/issues/321

6. Capability assumption: mobile fleet documentation link broken

  • Severity: medium
  • Finding: Capability assumption is backed by a source signal: mobile fleet documentation link broken. Treat it as a review item until the current version is checked.
  • User impact: The project should not be treated as fully validated until this signal is reviewed.
  • Recommended check: Open the linked source, confirm whether it still applies to the current version, and keep the first run isolated.
  • Evidence: Source-linked evidence: https://github.com/mobile-next/mobile-mcp/issues/328

7. Capability assumption: README/documentation is current enough for a first validation pass.

  • Severity: medium
  • Finding: README/documentation is current enough for a first validation pass.
  • User impact: The project should not be treated as fully validated until this signal is reviewed.
  • Recommended check: Open the linked source, confirm whether it still applies to the current version, and keep the first run isolated.
  • Evidence: capability.assumptions | github_repo:956657893 | https://github.com/mobile-next/mobile-mcp | README/documentation is current enough for a first validation pass.

8. Maintenance risk: Version 0.0.49

  • Severity: medium
  • Finding: Maintenance risk is backed by a source signal: Version 0.0.49. Treat it as a review item until the current version is checked.
  • User impact: Users cannot judge support quality until recent activity, releases, and issue response are checked.
  • Recommended check: Open the linked source, confirm whether it still applies to the current version, and keep the first run isolated.
  • Evidence: Source-linked evidence: https://github.com/mobile-next/mobile-mcp/releases/tag/0.0.49

9. Maintenance risk: Version 0.0.54

  • Severity: medium
  • Finding: Maintenance risk is backed by a source signal: Version 0.0.54. Treat it as a review item until the current version is checked.
  • User impact: Users cannot judge support quality until recent activity, releases, and issue response are checked.
  • Recommended check: Open the linked source, confirm whether it still applies to the current version, and keep the first run isolated.
  • Evidence: Source-linked evidence: https://github.com/mobile-next/mobile-mcp/releases/tag/0.0.54

10. Maintenance risk: Maintainer activity is unknown

  • Severity: medium
  • Finding: Maintenance risk is backed by a source signal: Maintainer activity is unknown. Treat it as a review item until the current version is checked.
  • User impact: Users cannot judge support quality until recent activity, releases, and issue response are checked.
  • Recommended check: Open the linked source, confirm whether it still applies to the current version, and keep the first run isolated.
  • Evidence: evidence.maintainer_signals | github_repo:956657893 | https://github.com/mobile-next/mobile-mcp | last_activity_observed missing

11. Security or permission risk: no_demo

  • Severity: medium
  • Finding: no_demo
  • User impact: The project may affect permissions, credentials, data exposure, or host boundaries.
  • Recommended check: Open the linked source, confirm whether it still applies to the current version, and keep the first run isolated.
  • Evidence: downstream_validation.risk_items | github_repo:956657893 | https://github.com/mobile-next/mobile-mcp | no_demo; severity=medium

12. Security or permission risk: no_demo

  • Severity: medium
  • Finding: no_demo
  • User impact: The project may affect permissions, credentials, data exposure, or host boundaries.
  • Recommended check: Open the linked source, confirm whether it still applies to the current version, and keep the first run isolated.
  • Evidence: risks.scoring_risks | github_repo:956657893 | https://github.com/mobile-next/mobile-mcp | no_demo; severity=medium

Source: Doramagic discovery, validation, and Project Pack records

Community Discussion Evidence

These external discussion links are review inputs, not standalone proof that the project is production-ready.

Sources: 12 project-level external discussion links are exposed on this manual page.

Review before install: open the linked issues or discussions before treating the pack as ready for your environment.

Doramagic exposes project-level community discussion separately from official documentation. Review these links before using mobile-mcp with real data or production workflows.

Source: Project Pack community evidence and pitfall evidence