Doramagic Project Pack · Human Manual
mobile-mcp
Related topics: Installation and Configuration, System Architecture, MCP Tools Reference
Overview
Mobile MCP is an open-source Model Context Protocol (MCP) server that lets AI assistants interact with mobile devices. It bridges AI agents and mobile device automation on both iOS and Android, covering simulators, emulators, and physical devices.
Sources: README.md
Project Purpose
The project serves as an MCP server implementation that allows AI assistants to:
- Automate native iOS and Android applications for testing or data entry scenarios
- Execute scripted flows and form interactions without manual device control
- Automate multi-step user journeys driven by large language models
- Enable general-purpose mobile application interaction for agent-based frameworks
- Facilitate agent-to-agent communication for mobile automation use cases and data extraction
Sources: README.md
Core Architecture
Mobile MCP follows a modular architecture that separates platform-specific implementations from the core server logic.
graph TD
subgraph "MCP Clients"
C1[Claude Desktop]
C2[Cursor]
C3[VS Code]
C4[Codex]
C5[Other MCP Clients]
end
subgraph "Mobile MCP Server"
Server[Core Server<br/>src/server.ts]
Tools[MCP Tools Layer]
Robot[Robot Interface<br/>src/robot.ts]
end
subgraph "Platform Implementations"
iOS_WDA[WebDriver Agent<br/>src/webdriver-agent.ts]
iOS_Sim[iOS Simulator<br/>src/iphone-simulator.ts]
Android[Android Device Manager]
end
subgraph "Target Devices"
iOS_Real[iOS Physical Device]
Android_Real[Android Physical Device]
Simulators[iOS Simulators]
Emulators[Android Emulators]
end
C1 & C2 & C3 & C4 & C5 --> Server
Server --> Tools
Tools --> Robot
Robot --> iOS_WDA & iOS_Sim & Android
iOS_WDA --> iOS_Real & Simulators
iOS_Sim --> Simulators
Android --> Android_Real & Emulators
Key Components
| Component | File | Purpose |
|---|---|---|
| Core Server | src/server.ts | MCP protocol implementation, device management, tool routing |
| Robot Interface | src/robot.ts | Abstract interface defining device interaction methods |
| WebDriver Agent | src/webdriver-agent.ts | iOS real device and simulator automation via WebDriverAgent |
| iOS Simulator | src/iphone-simulator.ts | Direct iOS simulator control using simctl |
| Android Manager | (platform-specific) | Android device automation |
Sources: src/server.ts:1-30
Device Detection Flow
The server implements a device detection mechanism that identifies the platform type when a device ID is provided.
graph TD
Start[Get Device ID] --> Check_iOS{iOS Device?}
Check_iOS -->|Yes| Return_iOS[Return IosRobot]
Check_iOS -->|No| Check_Android{Android Device?}
Check_Android -->|Yes| Return_Android[Return AndroidRobot]
Check_Android -->|No| Check_Simulator{iOS Simulator?}
Check_Simulator -->|Yes| Return_Simulator[Return Simulator Robot]
Check_Simulator -->|No| Error[Throw Error]
style Return_iOS fill:#90EE90
style Return_Android fill:#90EE90
style Return_Simulator fill:#90EE90
style Error fill:#FFB6C1
Sources: src/server.ts:20-45
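The detection order in the diagram can be sketched as a chain of lookups. This is a minimal illustration and not the code in src/server.ts: the `DeviceCatalog` shape, the `RobotKind` labels, and the `resolveRobotKind` function are hypothetical names introduced here, and the real server returns robot instances rather than string labels.

```typescript
// Hypothetical sketch of the detection order: iOS device first, then
// Android device, then iOS simulator. All names here are assumptions.
type RobotKind = "ios" | "android" | "ios-simulator";

interface DeviceCatalog {
  iosDevices: string[];
  androidDevices: string[];
  iosSimulators: string[];
}

function resolveRobotKind(catalog: DeviceCatalog, deviceId: string): RobotKind {
  if (catalog.iosDevices.indexOf(deviceId) !== -1) {
    return "ios";
  }
  if (catalog.androidDevices.indexOf(deviceId) !== -1) {
    return "android";
  }
  if (catalog.iosSimulators.indexOf(deviceId) !== -1) {
    return "ios-simulator";
  }
  // No platform claimed the ID, so fail with a message that suggests a next step.
  throw new Error(
    `Device "${deviceId}" not found. Use the mobile_list_available_devices tool to see available devices.`
  );
}
```

Because the checks run in a fixed order, an ID that somehow appeared in more than one list would resolve to the first match.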
Available Tools
Mobile MCP exposes the following tool categories through the MCP protocol:
Device Management
| Tool | Description |
|---|---|
| mobile_list_available_devices | List all available devices (simulators, emulators, and real devices) |
| mobile_get_screen_size | Get screen dimensions in pixels |
| mobile_get_orientation | Get current screen orientation |
| mobile_set_orientation | Change screen orientation (portrait/landscape) |
App Management
| Tool | Description |
|---|---|
| mobile_list_apps | List all installed apps on the device |
| mobile_launch_app | Launch an app using package name |
| mobile_terminate_app | Stop and terminate a running app |
| mobile_install_app | Install an app from file (.apk, .ipa, .app, .zip) |
| mobile_uninstall_app | Uninstall an app using bundle ID or package name |
Screen Interaction
| Tool | Description |
|---|---|
| mobile_take_screenshot | Capture screenshot to understand screen content |
| mobile_save_screenshot | Save screenshot to a file |
| mobile_list_elements_on_screen | List UI elements with coordinates and properties |
| mobile_click_on_screen_at_coordinates | Click at specific x,y coordinates |
| mobile_double_tap_on_screen | Double-tap at specific coordinates |
| mobile_long_press_on_screen_at_coordinates | Long press at specific coordinates |
| mobile_swipe_on_screen | Swipe in any direction (up, down, left, right) |
Input & Navigation
| Tool | Description |
|---|---|
| mobile_type_keys | Type text into focused elements with optional submit |
| mobile_press_button | Press device buttons (home, back, etc.) |
Sources: README.md
Robot Interface
The Robot interface defines the contract for all platform implementations:
interface Robot {
// Screen operations
getScreenshot(): Promise<Buffer>;
getScreenSize(): Promise<ScreenSize>;
getOrientation(): Promise<Orientation>;
setOrientation(orientation: Orientation): Promise<void>;
// Element operations
getElementsOnScreen(): Promise<ScreenElement[]>;
// Touch operations
tap(x: number, y: number): Promise<void>;
doubleTap(x: number, y: number): Promise<void>;
longPress(x: number, y: number, duration: number): Promise<void>;
swipeFromCoordinate(x: number, y: number, direction: SwipeDirection, distance?: number): Promise<void>;
// Text and keys
sendKeys(text: string): Promise<void>;
pressButton(button: Button): Promise<void>;
// App management
listApps(): Promise<InstalledApp[]>;
launchApp(packageName: string, locale?: string): Promise<void>;
terminateApp(packageName: string): Promise<void>;
installApp(path: string): Promise<void>;
uninstallApp(bundleId: string): Promise<void>;
// URL handling
openUrl(url: string): Promise<void>;
}
Sources: src/robot.ts
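The `swipeFromCoordinate` method takes a start point, a direction, and an optional distance, so an implementation has to turn those into concrete end coordinates. The sketch below shows one way to do that. The default distance of 400 pixels, the clamping to screen bounds, and the convention that "up" decreases `y` are assumptions made for illustration, and `swipeEndPoint` is a hypothetical helper rather than part of the project.

```typescript
type SwipeDirection = "up" | "down" | "left" | "right";

interface Point { x: number; y: number; }
interface ScreenSize { width: number; height: number; }

// Assumed default; the project's real default distance is not documented here.
const DEFAULT_SWIPE_DISTANCE = 400;

// Unit vectors per direction, assuming the origin is the top-left corner.
const DELTAS: Record<SwipeDirection, Point> = {
  up: { x: 0, y: -1 },
  down: { x: 0, y: 1 },
  left: { x: -1, y: 0 },
  right: { x: 1, y: 0 },
};

function clamp(value: number, min: number, max: number): number {
  return Math.min(Math.max(value, min), max);
}

// Computes where a swipe ends, keeping the result inside the screen.
function swipeEndPoint(
  start: Point,
  direction: SwipeDirection,
  screen: ScreenSize,
  distance: number = DEFAULT_SWIPE_DISTANCE
): Point {
  const delta = DELTAS[direction];
  return {
    x: clamp(start.x + delta.x * distance, 0, screen.width - 1),
    y: clamp(start.y + delta.y * distance, 0, screen.height - 1),
  };
}
```

A platform implementation could then pass the start and end points to its native swipe command.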
Platform Support
iOS Platform
iOS automation is handled through two mechanisms:
- WebDriverAgent (src/webdriver-agent.ts): Used for real iOS devices and Xcode-managed simulators
  - Communicates via HTTP with WDA session
  - Filters elements by accepted types: TextField, Button, Switch, Icon, SearchField, StaticText, Image
  - Uses element visibility and accessibility properties for filtering
- iOS Simulator (src/iphone-simulator.ts): Direct simctl control for booted simulators
  - Handles .app bundle installation
  - Supports .zip file extraction with zip-slip vulnerability protection
  - Uses simctl command-line tool
Sources: src/webdriver-agent.ts:1-50
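The element filtering described for WebDriverAgent can be illustrated with a small filter function. The accepted type names come from the list above. The `RawElement` shape, its `visible` and `accessible` flags, and the rule that both must be true are assumptions for this sketch, not the actual logic in src/webdriver-agent.ts.

```typescript
// Accepted element types, taken from the list in this section.
const ACCEPTED_TYPES: string[] = [
  "TextField", "Button", "Switch", "Icon", "SearchField", "StaticText", "Image",
];

// Hypothetical element shape used only for this illustration.
interface RawElement {
  type: string;
  label: string;
  visible: boolean;
  accessible: boolean;
}

// Keeps elements whose type is accepted and that are visible and accessible.
function filterElements(elements: RawElement[]): RawElement[] {
  return elements.filter(
    (element) =>
      ACCEPTED_TYPES.indexOf(element.type) !== -1 &&
      element.visible &&
      element.accessible
  );
}
```

Filtering this way keeps the element list short, which matters when the list is handed to a language model as context.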
Android Platform
Android automation uses platform-specific managers to:
- List connected devices
- Query UI elements via accessibility tree
- Execute touch and input operations
- Manage app lifecycle
Communication Modes
STDIO Mode (Default)
The server communicates over standard input/output:
npx -y @mobilenext/mobile-mcp@latest
SSE Server Mode
For HTTP-based connections, the server can listen on a specified port:
npx @mobilenext/mobile-mcp@latest --listen 3000
Optional Bearer token authentication can be enabled:
MOBILEMCP_AUTH=my-secret-token npx @mobilenext/mobile-mcp@latest --listen 3000
Sources: README.md
Technology Stack
| Component | Technology | Version |
|---|---|---|
| Runtime | Node.js | >=18 |
| MCP SDK | @modelcontextprotocol/sdk | 1.26.0 |
| HTTP Framework | express | 5.1.0 |
| CLI Framework | commander | 14.0.0 |
| Validation | zod | 4.1.13 |
| XML Parsing | fast-xml-parser | 5.5.7 |
| Native CLI | mobilecli | 0.3.70 (optional) |
Sources: package.json
Key Features
- Fast and lightweight: Uses native accessibility trees for most interactions, or screenshot-based coordinates where accessibility labels are not available
- LLM-friendly: No computer vision model required in Accessibility (Snapshot) mode
- Visual Sense: Evaluates and analyses what's actually rendered on screen
- Cross-platform: Supports iOS, Android, simulators, emulators, and real devices
- Standard protocol: Built on Model Context Protocol for seamless AI assistant integration
Sources: README.md
Version History
The project follows semantic versioning with active development:
| Version | Date | Key Changes |
|---|---|---|
| 0.0.49 | 2026-03-24 | Path traversal fix in save screenshot and record video |
| 0.0.48 | 2026-03-20 | fast-xml-parser security updates, error handling fixes |
| 0.0.47 | 2026-03-09 | Zod coerce for number parameter parsing, locale support for iOS |
| 0.0.42 | 2026-02-03 | mobilecli upgrade, fast-xml-parser security update |
| 0.0.41 | 2026-01-27 | Android element filtering improvements |
Sources: CHANGELOG.md
Getting Started
Installation
Add to your MCP client configuration:
{
"mcpServers": {
"mobile-mcp": {
"command": "npx",
"args": ["-y", "@mobilenext/mobile-mcp@latest"]
}
}
}
Prerequisites
- Node.js >= 18
- For iOS: Xcode Command Line Tools, WebDriverAgent (for real devices)
- For Android: Android SDK, ADB configured
Client Support
Mobile MCP is compatible with multiple AI coding assistants:
- Claude Desktop
- Claude Code
- Cursor
- VS Code
- Codex
- Copilot
- Gemini CLI
- Goose
- Cline
- Windsurf
- Qodo Gen
- Amp
- Kiro
- opencode
Sources: README.md
Installation and Configuration
Related topics: Overview, Prerequisites
Overview
Mobile MCP is a Model Context Protocol (MCP) server that enables mobile automation for iOS and Android devices. The server provides a standardized interface for AI assistants to interact with mobile devices through simulators, emulators, and real hardware.
This page covers the complete installation process, configuration options, and server deployment modes.
Prerequisites
System Requirements
| Requirement | Specification |
|---|---|
| Node.js | Version 18 or higher |
| Package Manager | npm, yarn, or pnpm |
| Mobile CLI Tools | mobilecli (auto-installed as optional dependency) |
| Platform Tools | Xcode Command Line Tools (iOS) / Android SDK (Android) |
The server requires Node.js 18+ as specified in package.json:
"engines": {
"node": ">=18"
}
Sources: package.json:12
Required Mobile Tools
Mobile MCP depends on mobilecli for device communication. The server checks for mobilecli availability at startup:
const ensureMobilecliAvailable = (): void => {
try {
const version = mobilecli.getVersion();
if (version.startsWith("failed")) {
throw new Error("mobilecli version check failed");
}
} catch (error: any) {
throw new ActionableError(`mobilecli is not available or not working properly...`);
}
};
Sources: src/server.ts:1-20
Installation Methods
Standard NPM Installation
The recommended installation method uses npx to run the package directly:
npx -y @mobilenext/mobile-mcp@latest
This command downloads and executes the latest version without requiring local installation.
Local Installation
For development or customization, install locally:
npm install @mobilenext/mobile-mcp
The package provides a binary entry point:
"bin": {
"mcp-server-mobile": "lib/index.js"
}
Sources: package.json:62-65
Building from Source
To build from source:
git clone https://github.com/mobile-next/mobile-mcp.git
cd mobile-mcp
npm install
npm run build
Build artifacts are output to the lib/ directory:
npm run build # Compiles TypeScript and sets executable permissions
npm run watch # Watch mode for development
Sources: package.json:22-30
Server Configuration
Standard MCP Configuration
The following JSON configuration works across most MCP clients:
{
"mcpServers": {
"mobile-mcp": {
"command": "npx",
"args": ["-y", "@mobilenext/mobile-mcp@latest"]
}
}
}
Sources: README.md:1
Configuration Schema
The server adheres to the MCP protocol schema:
{
"$schema": "https://static.modelcontextprotocol.io/schemas/2025-12-11/server.schema.json",
"name": "io.github.mobile-next/mobile-mcp",
"description": "MCP server for iOS and Android Mobile Development, Automation and Testing",
"version": "{{VERSION}}",
"packages": [
{
"registryType": "npm",
"registryBaseUrl": "https://registry.npmjs.org",
"identifier": "@mobilenext/mobile-mcp",
"transport": {
"type": "stdio"
}
}
]
}
Sources: server.json:1-20
Client-Specific Configuration
Claude Code
claude mcp add mobile-mcp -- npx -y @mobilenext/mobile-mcp@latest
Sources: README.md:1
Claude Desktop
Follow the MCP install guide and use the standard JSON configuration above.
Codex
CLI Installation:
codex mcp add mobile-mcp npx "@mobilenext/mobile-mcp@latest"
Manual Configuration (~/.codex/config.toml):
[mcp_servers.mobile-mcp]
command = "npx"
args = ["@mobilenext/mobile-mcp@latest"]
Sources: README.md:1
Copilot
Configuration file (~/.copilot/mcp-config.json):
{
"mcpServers": {
"mobile-mcp": {
"type": "local",
"command": "npx",
"tools": ["*"],
"args": ["@mobilenext/mobile-mcp@latest"]
}
}
}
Sources: README.md:1
Cursor
Installation Button: Click the provided deeplink or navigate to Cursor Settings → MCP → Add new MCP Server.
Manual Configuration:
{
"mcpServers": {
"mobile-mcp": {
"command": "npx",
"args": ["@mobilenext/mobile-mcp@latest"]
}
}
}
Sources: README.md:1
Gemini CLI
gemini mcp add mobile-mcp npx -y @mobilenext/mobile-mcp@latest
Sources: README.md:1
Goose
UI Installation: Use the extension install button or navigate to Advanced settings → Extensions → Add custom extension.
Manual Configuration:
- Type: STDIO
- Command: npx -y @mobilenext/mobile-mcp@latest
Sources: README.md:1
Windsurf
Navigate to Windsurf settings → MCP servers → Add new server:
npx @mobilenext/mobile-mcp@latest
Sources: README.md:1
Amp
CLI Installation:
amp mcp add mobile-mcp -- npx @mobilenext/mobile-mcp@latest
VS Code Extension: Add via settings.json:
"amp.mcpServers": {
"mobile-mcp": {
"command": "npx",
"args": ["@mobilenext/mobile-mcp@latest"]
}
}
Sources: README.md:1
Cline
Add the standard JSON configuration to your MCP settings file.
Sources: README.md:1
Kiro
Configuration file (~/.kiro/settings/mcp.json):
{
"mcpServers": {
"mobile-mcp": {
"command": "npx",
"args": ["@mobilenext/mobile-mcp@latest"]
}
}
}
Sources: README.md:1
opencode
Configuration file (~/.config/opencode/opencode.json):
{
"$schema": "https://opencode.ai/config.json",
"mcp": {
"mobile-mcp": {
"type": "local",
"command": ["npx", "@mobilenext/mobile-mcp@latest"],
"enabled": true
}
}
}
Sources: README.md:1
Qodo Gen
Open Qodo Gen chat panel → Connect more tools → + Add new MCP → Paste the standard configuration.
Sources: README.md:1
SSE Server Mode
By default, Mobile MCP communicates over stdio. For remote or web-based deployments, enable SSE (Server-Sent Events) mode.
Starting the SSE Server
Basic Usage:
npx @mobilenext/mobile-mcp@latest --listen 3000
Binding to Specific Interface:
npx @mobilenext/mobile-mcp@latest --listen 0.0.0.0:3000
This binds the server to all network interfaces on port 3000.
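The two `--listen` forms shown above (a bare port, or `host:port`) suggest a small parsing step. The following is a sketch of how such a value could be split, not the project's actual parser: `parseListenArg` is a hypothetical helper, and the `localhost` default for the bare-port form is an assumption, since this document does not state which host is used by default.

```typescript
interface ListenAddress {
  host: string;
  port: number;
}

// Assumed default when only a port is given; not confirmed by the source.
const DEFAULT_HOST = "localhost";

// Parses "3000" or "0.0.0.0:3000" into a host and a validated port.
function parseListenArg(value: string): ListenAddress {
  const separator = value.lastIndexOf(":");
  const host = separator === -1 ? DEFAULT_HOST : value.slice(0, separator);
  const portText = separator === -1 ? value : value.slice(separator + 1);

  if (!/^\d+$/.test(portText)) {
    throw new Error(`Invalid port in --listen value: "${value}"`);
  }
  const port = Number(portText);
  if (port < 1 || port > 65535) {
    throw new Error(`Port out of range in --listen value: "${value}"`);
  }
  if (host.length === 0) {
    throw new Error(`Missing host in --listen value: "${value}"`);
  }
  return { host, port };
}
```

Binding to `0.0.0.0` exposes the server on every interface, so it is worth pairing with the Bearer token authentication described in the Authorization section.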
Client Configuration for SSE
Configure your MCP client to connect to the SSE endpoint:
http://<host>:3000/mcp
Architecture Diagram
graph TD
A[MCP Client] -->|HTTP/MCP Protocol| B[SSE Server]
B --> C{Mobile MCP Server}
C -->|iOS Devices| D[IosRobot]
C -->|Android Devices| E[AndroidRobot]
D --> F[mobilecli]
E --> F
F --> G[iOS Simulator/Device]
F --> H[Android Emulator/Device]
Authorization
When running in SSE mode, secure the server with Bearer token authentication.
Configuration
Set the MOBILEMCP_AUTH environment variable:
MOBILEMCP_AUTH=my-secret-token npx @mobilenext/mobile-mcp@latest --listen 3000
Client Request Format
All requests must include the authorization header:
Authorization: Bearer my-secret-token
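To make the header format concrete, here is a sketch of how a server could validate it. This is an illustration under assumptions and not the project's implementation: `isAuthorized` is a hypothetical function, and the constant-time comparison via Node's `timingSafeEqual` is a common hardening choice rather than something the source confirms.

```typescript
import { timingSafeEqual } from "node:crypto";

// Returns true only when the header has the form "Bearer <token>" and the
// token matches the expected value (for example, the MOBILEMCP_AUTH value).
function isAuthorized(authorizationHeader: string | undefined, expectedToken: string): boolean {
  if (authorizationHeader === undefined) {
    return false;
  }
  const prefix = "Bearer ";
  if (authorizationHeader.slice(0, prefix.length) !== prefix) {
    return false;
  }
  const provided = Buffer.from(authorizationHeader.slice(prefix.length), "utf8");
  const expected = Buffer.from(expectedToken, "utf8");
  // timingSafeEqual throws on length mismatch, so compare lengths first.
  return provided.length === expected.length && timingSafeEqual(provided, expected);
}
```

A request handler would typically respond with HTTP 401 when this check returns false.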
Dependencies
Production Dependencies
Sources: package.json:31-48
| Package | Version | Purpose |
|---|---|---|
| @modelcontextprotocol/sdk | 1.26.0 | MCP protocol implementation |
| ajv | ^8.18.0 | JSON schema validation |
| commander | 14.0.0 | CLI argument parsing |
| express | 5.1.0 | SSE server framework |
| fast-xml-parser | 5.5.7 | XML parsing for mobile protocols |
| qs | ^6.15.0 | Query string parsing |
| zod | ^4.1.13 | Schema validation |
| zod-to-json-schema | 3.25.0 | Zod to JSON Schema conversion |
| mobilecli | 0.3.70 | Mobile device communication (optional) |
Dev Dependencies
Key development dependencies include:
- TypeScript 5.8.2
- ESLint 9.19.0
- Mocha 11.1.0 (testing)
- ts-node 10.9.2
- husky 9.1.7 (git hooks)
Sources: package.json:49-60
Troubleshooting
mobilecli Not Available
If the server fails to start with "mobilecli is not available":
- Ensure the optional dependency is installed:
npm install mobilecli
- Verify installation:
mobilecli --version
- Check platform compatibility using the binary resolution logic in src/mobilecli.ts.
Binary Resolution Path
The server searches for mobilecli in this order:
graph TD
A[Start] --> B{Current path contains node_modules?}
B -->|Yes| C[Find last node_modules directory]
C --> D[Check mobilecli/bin/<platform-specific-binary>]
D --> E{Binary exists?}
E -->|Yes| F[Return path]
E -->|No| G[Check parentDir/node_modules/mobilecli/bin/...]
B -->|No| G
G --> H{Binary exists?}
H -->|Yes| F
H -->|No| I[Throw error]
Sources: src/mobilecli.ts:1-35
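The search order in the diagram can be sketched as a function that builds an ordered list of candidate paths and returns the first one that exists. This is an approximation of the flow, not the code in src/mobilecli.ts: the function names are hypothetical, the binary name in the usage note is a placeholder, "parentDir" is assumed to mean the parent of the current directory, and POSIX-style paths are used to keep the example deterministic.

```typescript
import { posix as path } from "node:path";

// Builds candidate locations in the order shown in the diagram.
function candidateBinaryPaths(currentDir: string, binaryName: string): string[] {
  const candidates: string[] = [];
  const segments = currentDir.split("/");
  const lastIndex = segments.lastIndexOf("node_modules");
  if (lastIndex !== -1) {
    // Use the last node_modules directory found in the current path.
    const nodeModulesDir = segments.slice(0, lastIndex + 1).join("/");
    candidates.push(path.join(nodeModulesDir, "mobilecli", "bin", binaryName));
  }
  // Fallback: node_modules next to the parent directory (assumed meaning of parentDir).
  candidates.push(
    path.join(path.dirname(currentDir), "node_modules", "mobilecli", "bin", binaryName)
  );
  return candidates;
}

// Returns the first existing candidate; the existence check is injected.
function resolveBinary(candidates: string[], exists: (candidate: string) => boolean): string {
  for (const candidate of candidates) {
    if (exists(candidate)) {
      return candidate;
    }
  }
  throw new Error(`mobilecli binary not found. Looked in: ${candidates.join(", ")}`);
}
```

In real use, the `exists` argument could be `existsSync` from `node:fs`, and `binaryName` would be whatever platform-specific file name the package ships (a placeholder such as `mobilecli-linux-x64` works for experimenting).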
Node.js Version
Ensure Node.js 18+ is installed:
node --version
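The same check can be done programmatically. The helper below is a hypothetical sketch that parses a version string in the format printed by `node --version` (for example `v18.19.0`) and compares the major version against the minimum of 18 stated in package.json.

```typescript
// Returns true when the major version is at least minimumMajor.
// Accepts strings with or without the leading "v".
function meetsMinimumNode(version: string, minimumMajor: number = 18): boolean {
  const match = /^v?(\d+)\./.exec(version);
  if (match === null) {
    throw new Error(`Unrecognized Node.js version: "${version}"`);
  }
  return Number(match[1]) >= minimumMajor;
}
```

Inside a running Node.js process, `meetsMinimumNode(process.version)` performs the check against the current runtime.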
iOS Simulator Issues
For iOS simulators, ensure Xcode Command Line Tools are installed:
xcode-select --install
List available simulators:
xcrun simctl list
Boot a simulator before use:
xcrun simctl boot "iPhone 16"
Next Steps
After successful installation:
- Connect a Device: Use physical devices, simulators (iOS), or emulators (Android)
- Verify Connection: Run the mobile_list_available_devices tool
- Take a Screenshot: Use mobile_take_screenshot to verify communication
- Explore Tools: Review available tools in the main documentation
Sources: package.json:12
Prerequisites
Related topics: Installation and Configuration, iOS Implementation
Mobile MCP (Model Context Protocol) enables AI agents to interact with mobile devices for automation, testing, and mobile application manipulation. Before using Mobile MCP, you must ensure your environment meets the necessary requirements for both the MCP server and the target mobile devices.
System Requirements
Node.js Environment
Mobile MCP requires Node.js version 18 or higher. This requirement is enforced through the package.json engine specification:
"engines": {
"node": ">=18"
}
Sources: package.json:11
The MCP SDK version 1.26.0 is used as the core protocol implementation, which also requires a modern Node.js environment with support for ES modules and async/await patterns.
Supported Platforms
Mobile MCP supports automation across multiple platform types:
| Platform Type | Examples | Access Method |
|---|---|---|
| iOS Simulators | iPhone Simulator, iPad Simulator | Local Xcode tools |
| iOS Real Devices | Physical iPhones, iPads | WebDriverAgent via tunnel |
| Android Emulators | Android Studio Emulator, Genymotion | ADB |
| Android Real Devices | Samsung, Google Pixel, etc. | ADB |
Sources: src/server.ts:32-55
The server automatically detects which platform type a device belongs to by checking against iOS devices, Android devices, and iOS simulators in sequence.
Required Software Components
mobilecli
The mobilecli package is the core CLI tool that Mobile MCP relies on for device communication. It is listed as an optional dependency in package.json:
"optionalDependencies": {
"mobilecli": "0.3.70"
}
Sources: package.json:31-33
The server performs a version check on startup to ensure mobilecli is available:
const ensureMobilecliAvailable = (): void => {
try {
const version = mobilecli.getVersion();
if (version.startsWith("failed")) {
throw new Error("mobilecli version check failed");
}
} catch (error: any) {
throw new ActionableError(`mobilecli is not available or not working properly.`);
}
};
Sources: src/server.ts:18-28
Installation Methods
#### Via npm (Recommended)
npx @mobilenext/mobile-mcp@latest
#### Via mobilecli package
npm install -g mobilecli
iOS-Specific Prerequisites
WebDriverAgent (Real Devices)
For physical iOS devices, WebDriverAgent (WDA) must be installed and running. WDA is Facebook's WebDriver protocol implementation for iOS used for device control and element inspection.
The IosManager class manages WDA connections:
private async wda(): Promise<WebDriverAgent> {
await this.assertTunnelRunning();
if (!(await this.isWdaForwardRunning())) {
throw new ActionableError("Port forwarding to WebDriverAgent is not running");
}
const wda = new WebDriverAgent("localhost", WDA_PORT);
if (!(await wda.isRunning())) {
throw new ActionableError("WebDriverAgent is not running on device");
}
return wda;
}
Sources: src/ios.ts:82-97
#### WebDriverAgent Port Configuration
| Setting | Default Value | Purpose |
|---|---|---|
| WDA_PORT | 8100 | WebDriverAgent local port |
| IOS_TUNNEL_PORT | 9222 | iOS tunnel port |
iOS Tunnel Requirements
When connecting to remote iOS devices, an iOS tunnel must be established. The tunnel allows the MCP server to communicate with devices over a network connection.
private async assertTunnelRunning(): Promise<void> {
if (await this.isTunnelRequired()) {
if (!(await this.isTunnelRunning())) {
throw new ActionableError("iOS tunnel is not running");
}
}
}
Sources: src/ios.ts:68-74
#### Tunnel Setup Requirements
- Port forwarding must be active - The tunnel forwards WDA traffic to the local MCP server
- Firewall configuration - Port 9222 must be accessible for tunnel connections
- Network connectivity - Both server and device must have network access
iOS Simulator Requirements
For iOS simulators, the system uses simctl commands through the Xcode toolchain:
this.simctl("install", this.simulatorUuid, installPath);
Sources: src/iphone-simulator.ts:89
Required tools for simulators:
- Xcode Command Line Tools
- simctl utility
- Bootable simulator instances
App Installation on iOS
The iOS manager handles .zip and .app bundle installations:
if (extname(path).toLowerCase() === ".zip") {
this.validateZipPaths(path);
// Extract and install .app bundle
}
Sources: src/iphone-simulator.ts:58-65
Supported formats: .zip, .app
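The `validateZipPaths` call above guards against zip-slip, where an archive entry such as `../../etc/passwd` escapes the extraction directory. The sketch below shows the general technique, resolving each entry against the destination and rejecting anything that lands outside it. It is not the project's implementation: `assertSafeZipEntries` is a hypothetical name, the entry names are assumed to have been read from the archive already, and POSIX paths are used for clarity.

```typescript
import { posix as path } from "node:path";

// Throws if any archive entry would be written outside destinationDir.
// destinationDir is expected to be an absolute path other than "/".
function assertSafeZipEntries(entryNames: string[], destinationDir: string): void {
  const root = path.resolve(destinationDir);
  for (const name of entryNames) {
    // resolve() normalizes ".." segments and lets absolute entries override the root.
    const target = path.resolve(root, name);
    if (target !== root && target.indexOf(root + "/") !== 0) {
      throw new Error(`Unsafe zip entry path: "${name}"`);
    }
  }
}
```

Running the check before extraction means a malicious archive is rejected without any file being written.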
Android-Specific Prerequisites
Android Debug Bridge (ADB)
Android devices communicate through ADB (Android Debug Bridge). The MobileDevice class executes ADB commands for device interaction:
private runCommand(args: string[]): string {
const result = execFileSync("adb", ["-s", this.deviceId, ...args]);
return result.toString();
}
Sources: src/mobile-device.ts:17-20
#### Common ADB Operations
| Command | Purpose |
|---|---|
| adb shell | Execute shell commands on device |
| adb install | Install APK packages |
| adb uninstall | Remove applications |
| adb screencap | Capture screen screenshots |
| adb input | Send touch/keyboard input |
Android Device Requirements
- USB Debugging enabled - Required for ADB communication
- Device authorization - Device must approve computer for debugging
- Proper USB drivers - Especially on Windows systems
Android UI Automation Commands
The Android implementation uses uiautomator2 commands through ADB shell:
public async getElementsOnScreen(): Promise<ScreenElement[]> {
const response = JSON.parse(this.runCommand(["dump", "ui"])) as DumpUIResponse;
return response.data.elements.map(element => ({
type: element.type,
label: element.label,
text: element.text,
// ... other properties
}));
}
Sources: src/mobile-device.ts:52-61
Supported Android Input Operations
| Operation | ADB Command |
|---|---|
| Tap | input tap x y |
| Swipe | input swipe x1 y1 x2 y2 |
| Text input | input text <text> |
| Button press | input keyevent <code> |
| Long press | Custom implementation with duration |
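The table can be read as a mapping from high-level operations to `adb` argument lists. The sketch below builds those argument arrays in the shape that could be passed to `execFileSync("adb", args)`. It is illustrative only: `buildAdbInputArgs` and the `InputOperation` type are hypothetical, the long press is modeled as a zero-distance swipe with a duration (a common technique, not confirmed as this project's approach), and the text case handles spaces only, without any other escaping.

```typescript
type InputOperation =
  | { kind: "tap"; x: number; y: number }
  | { kind: "swipe"; x1: number; y1: number; x2: number; y2: number }
  | { kind: "text"; text: string }
  | { kind: "longPress"; x: number; y: number; durationMs: number };

// Builds the argument list for `adb`, targeting one device with -s.
function buildAdbInputArgs(deviceId: string, op: InputOperation): string[] {
  const base = ["-s", deviceId, "shell", "input"];
  switch (op.kind) {
    case "tap":
      return base.concat(["tap", String(op.x), String(op.y)]);
    case "swipe":
      return base.concat(["swipe", String(op.x1), String(op.y1), String(op.x2), String(op.y2)]);
    case "text":
      // `input text` interprets %s as a space.
      return base.concat(["text", op.text.replace(/ /g, "%s")]);
    case "longPress":
      // Same start and end point, held for durationMs.
      return base.concat([
        "swipe", String(op.x), String(op.y), String(op.x), String(op.y), String(op.durationMs),
      ]);
  }
}
```

Passing arguments as an array rather than a single command string avoids quoting problems on the host side, although the device shell still interprets special characters in the text.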
Architecture Overview
graph TB
subgraph "MCP Client"
A["AI Agent / IDE"]
end
subgraph "Mobile MCP Server"
B["server.ts<br/>MCP Protocol Handler"]
C["Robot Interface<br/>Abstract Layer"]
end
subgraph "Device Abstraction Layer"
D["IosRobot<br/>iOS Implementation"]
E["AndroidRobot<br/>Android Implementation"]
F["IosSimulatorRobot<br/>Simulator Implementation"]
end
subgraph "Device Communication"
G["mobilecli"]
H["WebDriverAgent<br/>iOS Real Devices"]
I["ADB<br/>Android Devices"]
J["simctl<br/>iOS Simulators"]
end
A --> B
B --> C
C --> D
C --> E
C --> F
D --> G
D --> H
E --> G
E --> I
F --> G
F --> J
Prerequisites Checklist
Before running Mobile MCP, verify the following:
Environment Checklist
- [ ] Node.js >= 18 installed
- [ ] mobilecli package accessible
- [ ] Network connectivity for remote devices
iOS Device Checklist (Real Devices)
- [ ] WebDriverAgent installed on device
- [ ] iOS tunnel established (for remote access)
- [ ] Port forwarding active (port 8100)
- [ ] Device connected via USB or network
iOS Simulator Checklist
- [ ] Xcode installed
- [ ] Simulator booted and available
- [ ] simctl command accessible
Android Device Checklist
- [ ] ADB installed and in PATH
- [ ] USB debugging enabled on device
- [ ] Device authorized for debugging
- [ ] Device connected (USB or network)
Troubleshooting Prerequisites Issues
mobilecli Not Found
Error: mobilecli is not available or not working properly
Solution: Install mobilecli globally:
npm install -g mobilecli
iOS Tunnel Not Running
Error: iOS tunnel is not running
Solution: Establish tunnel using go-ios:
ios tunnel start
WebDriverAgent Not Running
Error: WebDriverAgent is not running on device
Solution:
- Ensure WDA is installed on the device
- Verify port forwarding: iproxy 8100 8100
- Restart WebDriverAgent if needed
Android Device Not Detected
Error: No Android devices found
Solution:
- Verify ADB is running: adb devices
- Enable USB debugging on device
- Reconnect device or restart ADB: adb kill-server && adb start-server
Sources: package.json:11
System Architecture
Related topics: Device Abstraction Layer, iOS Implementation
Overview
Mobile MCP is a Model Context Protocol (MCP) server that enables AI agents to interact with mobile devices (iOS and Android) through a standardized interface. The system acts as a bridge between LLM-powered agents and mobile device automation, supporting physical devices, simulators, and emulators.
Sources: README.md:1-50
High-Level Architecture
The architecture follows a layered design pattern with clear separation of concerns:
graph TD
subgraph "Client Layer"
A["MCP Client<br/>(Cursor, Claude, etc.)"]
end
subgraph "MCP Server Layer"
B["server.ts<br/>(MCP Protocol Handler)"]
C["Tool Definitions"]
D["Device Manager"]
end
subgraph "Abstraction Layer"
E["Robot Interface<br/>(robot.ts)"]
end
subgraph "Device Implementation Layer"
F["IosRobot<br/>(ios.ts)"]
G["AndroidRobot<br/>(android.ts)"]
H["MobileDevice<br/>(mobile-device.ts)"]
end
subgraph "External Dependencies"
I["mobilecli"]
J["WebDriverAgent"]
K["ADB"]
end
A --> B
B --> C
B --> D
D --> E
E --> F
E --> G
E --> H
F --> J
G --> K
H --> I
Core Components
1. MCP Server (`server.ts`)
The main entry point that implements the MCP protocol. It handles:
- Tool registration and discovery
- Device listing and selection
- Request routing to appropriate robot implementations
- Authentication handling (SSE mode)
Key responsibilities:
- Initializes the MCP SDK and registers all tools (Sources: src/server.ts:1-50)
- Validates mobilecli availability at startup (Sources: src/server.ts:20-30)
- Routes requests based on device type (Sources: src/server.ts:40-70)
2. Robot Interface (`robot.ts`)
Defines the abstract interface that all device implementations must follow:
interface Robot {
openUrl(url: string): Promise<void>;
sendKeys(text: string): Promise<void>;
pressButton(button: Button): Promise<void>;
tap(x: number, y: number): Promise<void>;
doubleTap(x: number, y: number): Promise<void>;
longPress(x: number, y: number, duration: number): Promise<void>;
swipeFromCoordinate(x: number, y: number, direction: SwipeDirection, distance?: number): Promise<void>;
getScreenshot(): Promise<Buffer>;
getElementsOnScreen(): Promise<ScreenElement[]>;
listApps(): Promise<InstalledApp[]>;
launchApp(packageName: string, locale?: string): Promise<void>;
terminateApp(packageName: string): Promise<void>;
installApp(path: string): Promise<void>;
uninstallApp(bundleId: string): Promise<void>;
setOrientation(orientation: Orientation): Promise<void>;
getOrientation(): Promise<Orientation>;
}
Sources: src/robot.ts:1-100
3. Device Implementations
#### IosRobot (ios.ts)
Manages iOS device interactions through WebDriverAgent:
- Establishes tunnel connections for remote device access
- Manages WebDriverAgent port forwarding
- Provides iOS-specific automation commands
Key features:
- Port forwarding management (WDA_PORT = 8100, IOS_TUNNEL_PORT = 20021) (Sources: src/ios.ts:30-50)
- WebDriverAgent lifecycle management (Sources: src/ios.ts:60-80)
- Tunnel status validation (Sources: src/ios.ts:55-70)
#### AndroidRobot (android.ts)
Manages Android device interactions through ADB:
- Direct ADB command execution
- Device discovery and listing
- APK installation and management
#### MobileDevice (mobile-device.ts)
Uses mobilecli for unified device control:
- Works across both platforms through mobilecli abstraction
- UI element dumping and interaction
- Device orientation management
Sources: src/mobile-device.ts:1-80
Device Selection Flow
graph TD
A["mobile_list_available_devices"] --> B["Check iOS Devices"]
B --> C["Check Android Devices"]
C --> D["Check iOS Simulators"]
D --> E["Return Combined List"]
F["getRobotFromDevice<br/>(deviceId)"] --> G{"Is iOS Device?"}
G -->|Yes| H["Return IosRobot"]
G -->|No| I{"Is Android Device?"}
I -->|Yes| J["Return AndroidRobot"]
I -->|No| K{"Is iOS Simulator?"}
K -->|Yes| L["Check Agent Status"]
L -->|Install if needed| M["Return MobileDevice"]
K -->|No| N["Throw Error"]
Sources: src/server.ts:60-100
Tool Architecture
All MCP tools follow a consistent pattern defined in server.ts:
| Category | Tool Name | Purpose |
|---|---|---|
| Device Info | mobile_list_available_devices | List all connected devices |
| Device Info | mobile_get_screen_size | Get device screen dimensions |
| Device Info | mobile_get_orientation | Get current screen orientation |
| Device Info | mobile_set_orientation | Set screen orientation |
| App Management | mobile_list_apps | List installed apps |
| App Management | mobile_launch_app | Launch an app |
| App Management | mobile_terminate_app | Stop an app |
| App Management | mobile_install_app | Install from file |
| App Management | mobile_uninstall_app | Uninstall app |
| Screen Interaction | mobile_take_screenshot | Capture screen |
| Screen Interaction | mobile_click_on_screen_at_coordinates | Tap at coordinates |
| Screen Interaction | mobile_double_tap_on_screen | Double tap |
| Screen Interaction | mobile_long_press_on_screen_at_coordinates | Long press |
| Screen Interaction | mobile_swipe_on_screen | Swipe gesture |
| Input | mobile_type_keys | Send text input |
Communication Protocols
Stdio Mode (Default)
The default communication mode uses standard input/output:
{
"mcpServers": {
"mobile-mcp": {
"command": "npx",
"args": ["-y", "@mobilenext/mobile-mcp@latest"]
}
}
}
SSE Server Mode
Optional HTTP server mode for remote access:
npx @mobilenext/mobile-mcp@latest --listen 3000
Supports optional Bearer token authentication:
MOBILEMCP_AUTH=my-secret-token npx @mobilenext/mobile-mcp@latest --listen 3000
WebDriverAgent Integration (iOS)
For iOS devices, the architecture leverages WebDriverAgent:
sequenceDiagram
participant MCP as MCP Server
participant IosRobot
participant WDA as WebDriverAgent
participant Tunnel
MCP->>IosRobot: tap(x, y)
IosRobot->>Tunnel: Ensure tunnel running
Tunnel-->>IosRobot: OK
IosRobot->>WDA: POST /wda/tap/0
WDA-->>IosRobot: Response
IosRobot-->>MCP: Success
Sources: src/webdriver-agent.ts:50-100
Dependency Architecture
graph LR
A["@modelcontextprotocol/sdk"] --> B["MCP Server"]
C["mobilecli"] --> D["MobileDevice"]
E["express"] --> F["SSE Server"]
G["zod"] --> H["Schema Validation"]
Key Dependencies
| Package | Version | Purpose |
|---|---|---|
| @modelcontextprotocol/sdk | 1.26.0 | MCP protocol implementation |
| mobilecli | 0.3.70 (optional) | Cross-platform mobile control |
| express | 5.1.0 | HTTP server for SSE mode |
| zod | ^4.1.13 | Runtime type validation |
| commander | 14.0.0 | CLI argument parsing |
Sources: package.json:1-50
Error Handling
The system uses ActionableError for user-friendly error messages:
throw new ActionableError(
`Device "${deviceId}" not found. Use the mobile_list_available_devices tool to see available devices.`
);
Sources: src/server.ts:90-95
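The idea behind an actionable error can be sketched as follows. This is a minimal sketch under assumptions: the class body and the `describeFailure` helper are hypothetical and only illustrate how a handler might separate errors the user can fix from unexpected failures. The project's real class and handler may differ.

```typescript
// Minimal sketch; the real ActionableError class and its handling may differ.
class ActionableError extends Error {
  constructor(message: string) {
    super(message);
    this.name = "ActionableError";
    // Keeps instanceof checks working even when compiled to an ES5 target.
    Object.setPrototypeOf(this, new.target.prototype);
  }
}

// Hypothetical helper: turn a thrown value into text for the MCP client.
function describeFailure(error: unknown): { text: string; isError: boolean } {
  if (error instanceof ActionableError) {
    // The message already tells the user what to do next, so pass it through unchanged.
    return { text: error.message, isError: false };
  }
  const message = error instanceof Error ? error.message : String(error);
  return { text: `Unexpected error: ${message}`, isError: true };
}
```

Separating the two cases lets an AI client read the guidance and retry (for example by listing devices first) instead of treating every failure as fatal.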
Platform Detection Logic
graph TD
A["getRobotFromDevice"] --> B["Check IosManager"]
B --> C{"Found in iOS devices?"}
C -->|Yes| D["Return IosRobot"]
C -->|No| E["Check AndroidManager"]
E --> F{"Found in Android devices?"}
F -->|Yes| G["Return AndroidRobot"]
F -->|No| H["Check iOS Simulators"]
H --> I{"Found?"}
I -->|Yes| J["Return MobileDevice"]
I -->|No| K["Throw ActionableError"]
Security Considerations
- URL scheme validation: Only http:// and https:// allowed by default. Sources: src/server.ts:150-160
- Optional unsafe URL access via MOBILEMCP_ALLOW_UNSAFE_URLS=1
- Bearer token authentication for SSE mode
- No arbitrary command execution on host system
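The URL scheme rule can be sketched as a small predicate. Only the environment variable name comes from the list above; the `isUrlAllowed` function and its `env` parameter are hypothetical, and the server's actual check may be stricter or differently structured.

```typescript
// Hypothetical helper; only the MOBILEMCP_ALLOW_UNSAFE_URLS name comes from the documentation.
function isUrlAllowed(
  url: string,
  env: Record<string, string | undefined> = {}
): boolean {
  // Explicit opt-in lifts the scheme restriction.
  if (env["MOBILEMCP_ALLOW_UNSAFE_URLS"] === "1") return true;
  // Schemes are case-insensitive, so compare on a lowercased copy.
  const lower = url.toLowerCase();
  return lower.startsWith("http://") || lower.startsWith("https://");
}
```

In a Node.js process the second argument would typically be `process.env`. Passing it in explicitly keeps the function easy to test.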
Extensibility
The Robot interface design allows for additional platform implementations:
- Implement the Robot interface
- Add device detection in getRobotFromDevice()
- Register new tools in server.ts
// Example: adding a new platform (NewPlatformRobot is a placeholder for a class that implements Robot)
const newPlatformRobot = new NewPlatformRobot(deviceId);
return newPlatformRobot;
This modular architecture enables easy addition of new device types or automation backends without modifying existing implementations.
Sources: README.md:1-50
Device Abstraction Layer
Related topics: System Architecture, iOS Implementation
Overview
The Device Abstraction Layer (DAL) is the core architectural component of mobile-mcp that provides a unified interface for interacting with mobile devices across different platforms. It abstracts platform-specific implementations (iOS and Android) behind a common Robot interface, enabling AI agents and automated workflows to control mobile devices without knowledge of the underlying platform details.
The DAL serves as the bridge between the Model Context Protocol (MCP) server and the physical mobile devices, handling device detection, robot instantiation, and operation delegation.
Architecture
graph TD
A[MCP Server] --> B[Device Abstraction Layer]
B --> C[Robot Factory]
C --> D{iOS Device?}
C --> E{Android Device?}
C --> F{Simulator?}
D --> G[IosRobot]
E --> H[AndroidRobot]
F --> I[MobileDevice]
G --> J[mobilecli / iOS Device Kit]
H --> K[ADB / UI Automator]
I --> J
Core Components
Robot Interface
The Robot interface (src/robot.ts) defines the contract that all platform-specific implementations must follow. This ensures consistent behavior regardless of the target device.
File: src/robot.ts:1-80
| Method | Description | Parameters | Return Type |
|---|---|---|---|
| openUrl | Open a URL in the device browser | url: string | Promise<void> |
| sendKeys | Send keyboard input to device | text: string | Promise<void> |
| pressButton | Simulate physical button press | button: Button | Promise<void> |
| tap | Tap at screen coordinates | x: number, y: number | Promise<void> |
| doubleTap | Double-tap at coordinates | x: number, y: number | Promise<void> |
| longPress | Long press at coordinates | x: number, y: number, duration: number | Promise<void> |
| getElementsOnScreen | Get interactive UI elements | - | Promise<ScreenElement[]> |
| setOrientation | Change screen orientation | orientation: Orientation | Promise<void> |
| getOrientation | Get current orientation | - | Promise<Orientation> |
Platform Implementations
#### IosRobot
Handles iOS device interactions using WebDriverAgent or iOS Device Kit as the underlying protocol.
File: src/ios.ts
| Capability | Description |
|---|---|
| Simulator Support | Full support for iOS simulators |
| Real Device Support | Uses iOS Device Kit for physical devices |
| WebDriverAgent | Legacy support via go-ios |
#### AndroidRobot
Manages Android device interactions through ADB (Android Debug Bridge) and UI Automator.
File: src/android.ts
| Capability | Description |
|---|---|
| Emulator Support | Full support for Android emulators |
| Real Device Support | Direct ADB communication |
| Foldable Support | Multi-screen device handling (v0.0.23+) |
#### MobileDevice
Unified device class for modern simulator/emulator management. Used as the primary interface for iOS simulators.
File: src/mobile-device.ts
| Feature | Description |
|---|---|
| Agent Status Check | Verifies device readiness |
| Agent Install | Automatically installs required agents |
| Platform Abstraction | Works across iOS and Android platforms |
Device Detection Flow
sequenceDiagram
participant Client
participant Server
participant DAL as Device Abstraction Layer
participant IosMgr as IosManager
participant AndroidMgr as AndroidDeviceManager
participant mobilecli
Client->>Server: Tool Request (deviceId)
Server->>DAL: getRobotFromDevice(deviceId)
DAL->>IosMgr: listDevices()
DAL->>AndroidMgr: getConnectedDevices()
alt iOS Device Found
DAL->>DAL: Create IosRobot(deviceId)
else Android Device Found
DAL->>DAL: Create AndroidRobot(deviceId)
else Simulator Check
DAL->>mobilecli: getDevices(platform: "ios", type: "simulator")
alt Simulator Found
DAL->>DAL: Check agentVerifiedSimulators
DAL->>mobilecli: agentStatus(deviceId)
alt Agent Not Installed
DAL->>mobilecli: agentInstall(deviceId)
end
DAL->>DAL: Create MobileDevice(deviceId)
else Not Found
DAL-->>Client: Error: Device not found
end
end
DAL-->>Server: Robot Instance
Server-->>Client: Execute Tool
Device Manager Classes
IosManager
Manages iOS device discovery and listing.
File: src/ios.ts
class IosManager {
listDevices(): IosDevice[];
}
AndroidDeviceManager
Manages Android device discovery via ADB.
File: src/android.ts
class AndroidDeviceManager {
getConnectedDevices(): AndroidDevice[];
}
mobilecli Integration
The mobilecli tool (src/mobilecli.ts) is the underlying CLI that provides cross-platform device management. It is a required dependency for the Device Abstraction Layer to function.
File: src/server.ts:8-16
const ensureMobilecliAvailable = (): void => {
try {
const version = mobilecli.getVersion();
if (version.startsWith("failed")) {
throw new Error("mobilecli version check failed");
}
} catch (error: any) {
throw new ActionableError(`mobilecli is not available or not working properly...`);
}
};
Key mobilecli Functions
| Function | Purpose |
|---|---|
| getVersion() | Verify mobilecli installation |
| getDevices() | List available devices by platform and type |
| agentStatus() | Check if agent is installed on device |
| agentInstall() | Install required agent on device |
| agentUninstall() | Remove agent from device |
Simulator Agent Management
For iOS simulators, the DAL implements an agent verification and auto-installation system.
File: src/server.ts:45-58
if (!agentVerifiedSimulators.has(deviceId)) {
const agentStatus = mobilecli.agentStatus(deviceId);
if (agentStatus.status === "fail") {
mobilecli.agentInstall(deviceId);
}
agentVerifiedSimulators.add(deviceId);
}
This ensures that:
- Each simulator is checked at most once per server session
- Missing agents are automatically installed
- The agentVerifiedSimulators Set prevents redundant installations
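The caching behaviour described above can be demonstrated with a fake CLI that counts calls. This is a sketch under assumptions: `AgentCli`, `ensureAgent`, and `fakeCli` are hypothetical names introduced for illustration, and the real mobilecli wrapper has a wider API.

```typescript
// Hypothetical stand-in for the two mobilecli calls used in the snippet above.
interface AgentCli {
  agentStatus(deviceId: string): { status: string };
  agentInstall(deviceId: string): void;
}

// Session-wide cache of simulators whose agent has already been checked.
const agentVerifiedSimulators = new Set<string>();

function ensureAgent(cli: AgentCli, deviceId: string): void {
  if (agentVerifiedSimulators.has(deviceId)) return; // already checked this session
  if (cli.agentStatus(deviceId).status === "fail") {
    cli.agentInstall(deviceId);
  }
  agentVerifiedSimulators.add(deviceId);
}

// Fake CLI that always reports a missing agent and counts how often it is called.
const calls = { status: 0, install: 0 };
const fakeCli: AgentCli = {
  agentStatus: () => { calls.status++; return { status: "fail" }; },
  agentInstall: () => { calls.install++; },
};

ensureAgent(fakeCli, "sim-1");
ensureAgent(fakeCli, "sim-1"); // second call is served from the Set
console.log(calls); // → { status: 1, install: 1 }
```

One consequence of this design is that an agent removed while the server is running is not reinstalled until the server restarts, because the Set is never cleared.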
Error Handling
The DAL provides actionable error messages when device operations fail.
| Error Condition | Response |
|---|---|
| mobilecli not available | Links to installation wiki |
| Device not found | Lists available devices tool |
| iOS Device Kit failure | Fallback mechanisms |
File: src/server.ts:15-16
throw new ActionableError(
`Device "${deviceId}" not found. Use the mobile_list_available_devices tool to see available devices.`
);
Usage in MCP Tools
The Device Abstraction Layer is invoked by all device interaction tools defined in the MCP server:
File: src/server.ts:63-90
| Tool | Operation |
|---|---|
| mobile_list_available_devices | Lists all connected devices |
| mobile_get_screen_size | Delegates to active Robot |
| mobile_take_screenshot | Delegates to active Robot |
| mobile_click_on_screen_at_coordinates | Delegates to active Robot |
| mobile_type_keys | Delegates to active Robot |
| mobile_press_button | Delegates to active Robot |
Configuration
mobilecli is declared in package.json as an optional dependency, although the DAL cannot function without it:
File: package.json:22-23
"optionalDependencies": {
"mobilecli": "0.3.70"
}
Summary
The Device Abstraction Layer provides:
- Unified Interface: A single Robot interface for all platform operations
- Platform Detection: Automatic identification of iOS, Android, and simulator devices
- Factory Pattern: Dynamic robot instantiation based on device type
- Agent Management: Automatic verification and installation of required agents
- Error Context: Actionable error messages with documentation links
This architecture enables mobile-mcp to provide a consistent automation experience across the diverse mobile device ecosystem while maintaining platform-specific optimizations.
Source: https://github.com/mobile-next/mobile-mcp / Human Manual
MCP Tools Reference
Related topics: Device Management, App Management, Screen Interaction and Input
Overview
The Mobile MCP server exposes a comprehensive set of tools that enable AI assistants to interact with mobile devices (iOS and Android) through the Model Context Protocol. These tools provide capabilities for device management, screen interaction, app lifecycle management, and UI automation. Sources: README.md
The tools are implemented in TypeScript and follow a consistent pattern where each tool accepts a device parameter to specify the target device identifier. Sources: src/server.ts:1-100
Architecture
Tool Layer Hierarchy
graph TD
A[MCP Client<br/>Claude, Cursor, Codex] --> B[Mobile MCP Server]
B --> C[Robot Interface]
C --> D[iOS Robot]
C --> E[Android Robot]
D --> F[WebDriverAgent]
D --> G[iPhone Simulator]
E --> H[Android ADB]
F --> I[Real iOS Device]
G --> J[iOS Simulator]
H --> K[Android Emulator]
H --> L[Android Device]
Device Detection Flow
graph TD
A[getRobotFromDevice<br/>deviceId: string] --> B{Is iOS Device?}
B -->|Yes| C[Return IosRobot]
B -->|No| D{Is Android Device?}
D -->|Yes| E[Return AndroidRobot]
D -->|No| F{Check iOS<br/>Simulators?}
F -->|Found| G[Return MobileDevice]
F -->|Not Found| H[Return Error]
The getRobotFromDevice function determines the appropriate robot implementation based on the device type by querying device managers and checking against known device identifiers. Sources: src/server.ts:15-50
Tool Categories
Device Management Tools
#### mobile_list_available_devices
Lists all available mobile devices including simulators, emulators, and connected physical devices.
| Parameter | Type | Required | Description |
|---|---|---|---|
| device | string | No | Optional device filter |
Response: JSON array of device objects with deviceId, name, platform, and status.
#### mobile_get_screen_size
Retrieves the screen dimensions of the connected device in pixels.
| Parameter | Type | Required | Description |
|---|---|---|---|
| device | string | Yes | Target device identifier |
Response: { "width": number, "height": number }
#### mobile_get_orientation
Returns the current screen orientation of the device.
| Parameter | Type | Required | Description |
|---|---|---|---|
| device | string | Yes | Target device identifier |
Response: "portrait" or "landscape" Sources: src/mobile-device.ts:80-90
#### mobile_set_orientation
Changes the screen orientation of the device.
| Parameter | Type | Required | Description |
|---|---|---|---|
| device | string | Yes | Target device identifier |
| orientation | string | Yes | "portrait" or "landscape" |
public async setOrientation(orientation: Orientation): Promise<void> {
this.runCommand(["device", "orientation", "set", orientation]);
}
Sources: src/mobile-device.ts:75-79
Device Management
Related topics: MCP Tools Reference, Device Abstraction Layer
Overview
Device Management in Mobile MCP provides a unified abstraction layer for controlling mobile devices across iOS and Android platforms. It enables AI agents to interact with physical devices, simulators, and emulators through a consistent set of tools and APIs, abstracting platform-specific implementation details.
Sources: src/server.ts:1-50
Architecture
The Device Management system follows a layered architecture:
graph TD
A[MCP Server] --> B[Device Manager]
B --> C[iOS Manager]
B --> D[Android Manager]
B --> E[Mobile CLI]
C --> F[iOS Robot]
D --> G[Android Robot]
E --> H[Mobile Device]
F --> I[Physical Device / Simulator]
G --> J[Emulator / Physical Device]
H --> K[iOS Simulator]
Core Components
| Component | File | Purpose |
|---|---|---|
| Robot interface | src/robot.ts | Defines unified device interaction contract |
| IosRobot | src/ios.ts | Handles iOS device communication |
| AndroidRobot | src/android.ts | Handles Android device communication |
| MobileDevice | src/mobile-device.ts | Handles iOS simulators via mobilecli |
| IosManager | src/ios.ts | iOS device discovery and management |
| AndroidDeviceManager | src/android.ts | Android device discovery and management |
Sources: src/robot.ts:1-50
Device Discovery
Device discovery is performed through the getRobotFromDevice() function, which identifies the device type and returns the appropriate robot implementation.
graph TD
A[Start] --> B{Check iOS Devices}
B -->|Found| C[Return IosRobot]
B -->|Not Found| D{Check Android Devices}
D -->|Found| E[Return AndroidRobot]
D -->|Not Found| F{Check Simulators via mobilecli}
F -->|Simulator Found| G[Verify Agent Status]
G -->|Status Failed| H[Install Agent]
G -->|Status OK| I[Return MobileDevice]
F -->|Not Found| J[Throw Error]
Device Detection Flow
The detection sequence in src/server.ts:
- iOS Physical Devices: Uses IosManager.listDevices() to enumerate iOS devices
- Android Physical Devices: Uses AndroidDeviceManager.getConnectedDevices() to enumerate Android devices
- iOS Simulators: Uses mobilecli.getDevices() with platform: "ios" and type: "simulator"
Sources: src/server.ts:50-100
Device Types Supported
Platform Matrix
| Platform | Physical Devices | Simulators/Emulators | Key Technologies |
|---|---|---|---|
| iOS | iPhone, iPad | iOS Simulator | WebDriverAgent, iOS Device Kit |
| Android | Samsung, Pixel, etc. | Android Emulator | ADB, UI Automator |
Simulator Agent Verification
When connecting to iOS simulators, Mobile MCP performs agent verification:
if (!agentVerifiedSimulators.has(deviceId)) {
const agentStatus = mobilecli.agentStatus(deviceId);
if (agentStatus.status === "fail") {
mobilecli.agentInstall(deviceId);
}
agentVerifiedSimulators.add(deviceId);
}
Sources: src/server.ts:75-85
Robot Interface
The Robot interface in src/robot.ts defines all device interaction methods:
Screen Operations
| Method | Purpose | Return Type |
|---|---|---|
| getScreenshot() | Capture screen as PNG | Promise<Buffer> |
| getScreenSize() | Get screen dimensions | Promise<{width, height}> |
| getOrientation() | Get current orientation | Promise<Orientation> |
| setOrientation() | Set portrait/landscape | Promise<void> |
Touch Interactions
| Method | Purpose | Parameters |
|---|---|---|
| tap(x, y) | Single tap | x, y coordinates |
| doubleTap(x, y) | Double tap | x, y coordinates |
| longPress(x, y, duration) | Long press | x, y, duration (ms) |
| swipeFromCoordinate(x, y, direction, distance?) | Swipe gesture | x, y, direction, optional distance |
App Management
| Method | Purpose | Parameters |
|---|---|---|
| listApps() | List installed apps | None |
| launchApp(packageName, locale?) | Launch app | package name, optional locale |
| terminateApp(packageName) | Stop app | package name |
| installApp(path) | Install from file | file path (.apk, .ipa, .app, .zip) |
| uninstallApp(bundleId) | Uninstall app | bundle ID/package name |
Navigation & Input
| Method | Purpose | Parameters |
|---|---|---|
| sendKeys(text) | Type text | string |
| pressButton(button) | Press button | Button enum (HOME, BACK, etc.) |
| openUrl(url) | Open URL/browser | URL string |
Sources: src/robot.ts:50-120
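Because every platform exposes the same methods, calling code can be written once against the interface. The sketch below assumes only two of the methods from the tables above. `ScreenRobot` is a hypothetical subset of the real Robot interface and `tapCenter` is an illustrative helper, not part of the project.

```typescript
// Hypothetical subset of the Robot methods listed above; the real interface has more members.
interface ScreenRobot {
  getScreenSize(): Promise<{ width: number; height: number }>;
  tap(x: number, y: number): Promise<void>;
}

// Illustrative helper: tap the middle of the screen on any platform.
async function tapCenter(robot: ScreenRobot): Promise<{ x: number; y: number }> {
  const { width, height } = await robot.getScreenSize();
  const x = Math.floor(width / 2);
  const y = Math.floor(height / 2);
  await robot.tap(x, y);
  return { x, y };
}
```

The helper never checks which platform it is talking to. Any object returned by the device lookup that satisfies these two methods would work.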
Available Tools
Mobile MCP exposes the following device management tools to MCP clients:
Device Information
- mobile_list_available_devices - List all connected devices (physical and simulators)
- mobile_get_screen_size - Get screen dimensions in pixels
- mobile_get_orientation - Get current screen orientation
- mobile_set_orientation - Change screen orientation (portrait/landscape)
- mobile_list_crashes - List crash reports on device
- mobile_get_crash - Retrieve full crash report content
Screen Interaction
- mobile_take_screenshot - Capture screenshot
- mobile_save_screenshot - Save screenshot to file
- mobile_list_elements_on_screen - Get UI elements with coordinates
- mobile_click_on_screen_at_coordinates - Click at x,y
- mobile_double_tap_on_screen - Double tap at x,y
- mobile_long_press_on_screen_at_coordinates - Long press at x,y
- mobile_swipe_on_screen - Swipe in direction
Input & Navigation
- mobile_type_keys - Type text into focused elements
- mobile_press_button - Press device buttons
- mobile_open_url - Open URL in browser
App Management
- mobile_list_apps - List installed apps
- mobile_launch_app - Launch app by package name
- mobile_terminate_app - Stop running app
- mobile_install_app - Install from file
- mobile_uninstall_app - Uninstall app
Prerequisites
Device management requires the mobilecli tool to be available on the system. The server validates this on startup:
const ensureMobilecliAvailable = (): void => {
try {
const version = mobilecli.getVersion();
if (version.startsWith("failed")) {
throw new Error("mobilecli version check failed");
}
} catch (error: any) {
throw new ActionableError(`mobilecli is not available...`);
}
};
Sources: src/server.ts:20-35
Configuration
MCP Server Configuration
{
"mcpServers": {
"mobile-mcp": {
"command": "npx",
"args": ["-y", "@mobilenext/mobile-mcp@latest"]
}
}
}
SSE Server Mode with Authentication
MOBILEMCP_AUTH=my-secret-token npx @mobilenext/mobile-mcp@latest --listen 3000
When MOBILEMCP_AUTH is set, all requests require the header:
Authorization: Bearer my-secret-token
Sources: package.json:1-30
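The header rule above can be sketched as a small predicate. This is a hypothetical check, not the server's middleware: `isAuthorized` and its parameters are illustrative names, and the real implementation may compare tokens differently.

```typescript
// Hypothetical check; the server's actual middleware may differ.
function isAuthorized(
  authorizationHeader: string | undefined,
  expectedToken: string | undefined
): boolean {
  // MOBILEMCP_AUTH unset or empty: authentication is not enforced.
  if (!expectedToken) return true;
  // Otherwise the header must be exactly "Bearer <token>".
  return authorizationHeader === `Bearer ${expectedToken}`;
}
```

A production check would usually also compare the token in constant time to avoid leaking information through response timing. That detail is left out to keep the sketch short.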
Error Handling
The device management system uses ActionableError to provide users with actionable error messages:
throw new ActionableError(
`Device "${deviceId}" not found. Use the mobile_list_available_devices tool to see available devices.`
);
This pattern ensures users receive guidance on how to resolve issues, not just failure messages.
Sources: src/server.ts:95-100
Dependencies
| Dependency | Version | Purpose |
|---|---|---|
| @modelcontextprotocol/sdk | 1.26.0 | MCP protocol implementation |
| mobilecli | 0.3.70 (optional) | Cross-platform mobile CLI |
| express | 5.1.0 | SSE server transport |
| zod | ^4.1.13 | Schema validation |
Sources: package.json:25-50
Sources: src/server.ts:1-50
App Management
Related topics: MCP Tools Reference, iOS Implementation
App Management in Mobile MCP enables AI agents to interact with mobile applications on connected devices through a unified interface. This system provides capabilities for listing, launching, terminating, installing, and uninstalling applications across both iOS and Android platforms.
Overview
The App Management feature abstracts platform-specific implementation details behind a consistent Robot interface. This allows AI agents to manage applications without understanding the underlying differences between iOS and Android platforms.
graph TD
A[MCP Client / AI Agent] --> B[Mobile MCP Server]
B --> C[Robot Interface]
C --> D[iOS Robot]
C --> E[Android Robot]
D --> F[WebDriverAgent / go-ios]
E --> G[ADB / Android Manager]
Sources: src/robot.ts:1-100
Available Tools
Mobile MCP exposes five primary tools for application management:
| Tool | Description | Platform |
|---|---|---|
| mobile_list_apps | List all installed applications | iOS, Android |
| mobile_launch_app | Launch an app by package/bundle name | iOS, Android |
| mobile_terminate_app | Stop a running application | iOS, Android |
| mobile_install_app | Install app from file (.apk, .ipa, .app, .zip) | iOS, Android |
| mobile_uninstall_app | Uninstall app by bundle ID or package name | iOS, Android |
Sources: README.md
Robot Interface
The core App Management functionality is defined in the Robot interface. This interface declares abstract methods that platform-specific implementations must provide.
interface Robot {
listApps(): Promise<InstalledApp[]>;
launchApp(packageName: string, locale?: string): Promise<void>;
terminateApp(packageName: string): Promise<void>;
installApp(path: string): Promise<void>;
uninstallApp(bundleId: string): Promise<void>;
openUrl(url: string): Promise<void>;
}
Sources: src/robot.ts:40-75
Method Specifications
#### listApps()
Returns all installed applications on the device.
listApps(): Promise<InstalledApp[]>;
Return Type: InstalledApp[] - Array of objects containing package names (Android) or bundle identifiers (iOS).
#### launchApp()
Launches an application with optional locale specification.
launchApp(packageName: string, locale?: string): Promise<void>;
Parameters:
| Parameter | Type | Description |
|---|---|---|
| packageName | string | The package name (Android) or bundle ID (iOS) of the app |
| locale | string (optional) | Locale to launch the app with (e.g., "en_US") |
Sources: src/robot.ts:47-49
#### terminateApp()
Terminates a running application. If the app is not running or doesn't exist, this operation is a no-op.
terminateApp(packageName: string): Promise<void>;
Parameters:
| Parameter | Type | Description |
|---|---|---|
| packageName | string | The package name (Android) or bundle ID (iOS) |
#### installApp()
Installs an application from a local file path. Supports multiple formats across platforms.
installApp(path: string): Promise<void>;
Supported Formats:
| Platform | Formats |
|---|---|
| Android | .apk, .zip |
| iOS | .ipa, .app, .zip |
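The format table can be expressed as a lookup that is checked before an install is attempted. The sketch below is illustrative only: `SUPPORTED_FORMATS` and `isSupportedInstallFile` are hypothetical names, and the project may detect the file type differently.

```typescript
type Platform = "android" | "ios";

// Extensions copied from the table above.
const SUPPORTED_FORMATS: Record<Platform, string[]> = {
  android: [".apk", ".zip"],
  ios: [".ipa", ".app", ".zip"],
};

// Hypothetical pre-flight check run before calling installApp(path).
function isSupportedInstallFile(platform: Platform, path: string): boolean {
  // .app bundles are directories, so tolerate a trailing slash.
  const normalized = path.replace(/\/+$/, "").toLowerCase();
  return SUPPORTED_FORMATS[platform].some(ext => normalized.endsWith(ext));
}
```

Rejecting an unsupported extension up front gives the caller a clearer message than a failure from the underlying install command.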
#### uninstallApp()
Uninstalls an application from the device.
uninstallApp(bundleId: string): Promise<void>;
Parameters:
| Parameter | Type | Description |
|---|---|---|
| bundleId | string | The app's bundle identifier (iOS) or package name (Android) |
#### openUrl()
Opens a URL in the device's web browser.
openUrl(url: string): Promise<void>;
Supported URL Schemes:
| Type | Example |
|---|---|
| HTTP/HTTPS | https://example.com |
| Custom Schemes | myapp://action |
Sources: src/robot.ts:59-63
Architecture
Device Selection Flow
When a tool requires device-specific implementation, the server determines the appropriate Robot based on the device type:
graph TD
A[Device ID Provided] --> B{Is iOS Device?}
B -->|Yes| C[Return IosRobot]
B -->|No| D{Is Android Device?}
D -->|Yes| E[Return AndroidRobot]
D -->|No| F{Is Simulator?}
F -->|Yes| G[Return MobileDevice]
F -->|No| H[Throw ActionableError]
Sources: src/server.ts:30-60
iOS Implementation
The iOS implementation leverages go-ios (via mobilecli) for device communication. The IosManager handles device detection and the WebDriverAgent protocol manages application lifecycle.
Key components in iOS app management:
- WebDriverAgent (WDA) - WebDriver server for iOS, built on Apple's XCTest framework
- go-ios - Command-line interface for iOS device control
- Tunnel Service - Required for real device communication
Sources: src/ios.ts:1-50
Android Implementation
Android implementation uses the Android Debug Bridge (ADB) through the Android Manager:
- ADB - Primary communication protocol with Android devices
- Android Manager - Handles device enumeration and app operations
- Package Manager - Manages app installation and uninstallation
Tool Registration
App Management tools are registered in the server with descriptive schemas:
tool(
"mobile_list_apps",
"List Apps",
"List all installed apps on the device",
{},
{ readOnlyHint: true },
async ({}) => { /* implementation */ }
);
Sources: src/server.ts:80-100
Prerequisites
iOS Requirements
- WebDriverAgent must be running on the device
- For real devices: iOS tunnel must be established
- go-ios must be installed and functional
Sources: src/ios.ts:35-45
Android Requirements
- ADB must be enabled on the device
- Device must be connected and authorized
- USB debugging must be enabled
Usage Examples
List All Installed Apps
{
"tool": "mobile_list_apps",
"arguments": {}
}
Launch an App with Locale
{
"tool": "mobile_launch_app",
"arguments": {
"packageName": "com.example.app",
"locale": "en_US"
}
}
Install an App
{
"tool": "mobile_install_app",
"arguments": {
"path": "/path/to/application.apk"
}
}
Uninstall an App
{
"tool": "mobile_uninstall_app",
"arguments": {
"bundleId": "com.example.app"
}
}
Open URL in Browser
{
"tool": "mobile_open_url",
"arguments": {
"url": "https://example.com"
}
}
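The examples above show the tool name and arguments in simplified form. On the wire, an MCP client wraps them in a JSON-RPC 2.0 request with the method tools/call. The sketch below builds such a request. `buildToolCall` is a hypothetical helper, and the `device` value is an invented identifier standing in for whatever mobile_list_available_devices returns.

```typescript
// Shape of an MCP tool invocation as a JSON-RPC 2.0 request.
interface ToolCallRequest {
  jsonrpc: "2.0";
  id: number;
  method: "tools/call";
  params: { name: string; arguments: Record<string, unknown> };
}

// Hypothetical helper that wraps a tool name and its arguments.
function buildToolCall(
  id: number,
  name: string,
  args: Record<string, unknown>
): ToolCallRequest {
  return { jsonrpc: "2.0", id, method: "tools/call", params: { name, arguments: args } };
}

const request = buildToolCall(1, "mobile_launch_app", {
  device: "emulator-5554", // hypothetical device ID
  packageName: "com.example.app",
  locale: "en_US",
});
console.log(JSON.stringify(request, null, 2));
```

MCP client libraries normally build this envelope for you, so the helper is mainly useful for understanding logs or writing a minimal test client.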
Error Handling
The system provides actionable error messages when operations fail. Common error scenarios include:
| Error Condition | Cause | Resolution |
|---|---|---|
| Device not found | Invalid device ID or disconnected device | Check device connection |
| App not installed | Attempting to launch non-existent app | Install app first |
| Installation failed | Invalid file format or corrupted package | Verify file integrity |
| Permission denied | Insufficient device permissions | Grant required permissions |
Sources: src/server.ts:45-50
Dependencies
App Management relies on the following package:
| Package | Version | Purpose |
|---|---|---|
| mobilecli | 0.3.70 | Cross-platform mobile device CLI |
Sources: package.json:25-30
Screen Interaction and Input
Related topics: MCP Tools Reference, iOS Implementation
Overview
The Screen Interaction and Input system in Mobile MCP provides a comprehensive abstraction layer for controlling mobile devices (iOS, Android, simulators, and emulators) through a unified Robot interface. This system enables AI agents to interact with mobile applications by simulating user input events, capturing screen content, and managing device orientation.
The architecture is designed to work with multiple device platforms while presenting a consistent API to MCP clients. The system leverages native accessibility trees for most interactions, falling back to screenshot-based coordinate operations when accessibility labels are unavailable. Sources: src/robot.ts:1-30
Architecture
Core Components
The system consists of the following device abstraction components:
| Component | Platform | Protocol | File |
|---|---|---|---|
| IosRobot | iOS | WebDriverAgent (WDA) | src/webdriver-agent.ts:1-50 |
| AndroidRobot | Android | ADB | src/android.ts |
| MobileDevice | iOS simulators | mobilecli | src/mobile-device.ts:1-30 |
| Robot (interface) | All | Abstract | src/robot.ts:1-50 |
Robot Interface Contract
The Robot interface defines the contract that all platform-specific implementations must fulfill:
interface Robot {
openUrl(url: string): Promise<void>;
sendKeys(text: string): Promise<void>;
pressButton(button: Button): Promise<void>;
tap(x: number, y: number): Promise<void>;
doubleTap(x: number, y: number): Promise<void>;
longPress(x: number, y: number, duration: number): Promise<void>;
getElementsOnScreen(): Promise<ScreenElement[]>;
setOrientation(orientation: Orientation): Promise<void>;
getOrientation(): Promise<Orientation>;
swipeFromCoordinate(x: number, y: number, direction: SwipeDirection, distance?: number): Promise<void>;
getScreenshot(): Promise<Buffer>;
listApps(): Promise<InstalledApp[]>;
launchApp(packageName: string, locale?: string): Promise<void>;
terminateApp(packageName: string): Promise<void>;
installApp(path: string): Promise<void>;
uninstallApp(bundleId: string): Promise<void>;
}
Sources: src/robot.ts:30-100
Device Selection Flow
The server determines which Robot implementation to use based on the device identifier:
graph TD
A[MCP Request with device ID] --> B{Device Type Check}
B -->|iOS Device| C[IosRobot]
B -->|Android Device| D[AndroidRobot]
B -->|Simulator| E[MobileDevice]
C --> F[WebDriverAgent Connection]
D --> G[mobilecli Commands]
E --> F
Sources: src/server.ts:50-80
Touch Interactions
Tap Operations
#### Single Tap
Single tap is implemented using pointer actions sent through the WebDriverAgent protocol for iOS:
public async tap(x: number, y: number) {
await this.withinSession(async sessionUrl => {
const url = `${sessionUrl}/actions`;
await fetch(url, {
method: "POST",
headers: { "Content-Type": "application/json" },
body: JSON.stringify({
actions: [{
type: "pointer",
id: "finger1",
parameters: { pointerType: "touch" },
actions: [
{ type: "pointerMove", duration: 0, x, y },
{ type: "pointerDown", button: 0 },
{ type: "pause", duration: 100 },
{ type: "pointerUp", button: 0 }
]
}]
}),
});
});
}
Sources: src/webdriver-agent.ts:180-210
For devices driven through mobilecli (the MobileDevice class), the tap operation is a single CLI command:
public async tap(x: number, y: number): Promise<void> {
this.runCommand(["io", "tap", `${x},${y}`]);
}
Sources: src/mobile-device.ts:50-52
#### Double Tap
Double tap is implemented by executing two consecutive tap operations:
public async doubleTap(x: number, y: number): Promise<void> {
await this.tap(x, y);
await this.tap(x, y);
}
Sources: src/mobile-device.ts:56-60
In iOS WebDriverAgent, the double tap uses the W3C Actions API with multiple pointer sequences.
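As a rough illustration of a W3C Actions double tap, the sketch below builds a payload with two press-and-release cycles on one touch pointer. The helper name `buildDoubleTapActions` and the 50 ms and 100 ms pause values are assumptions made for this example; they are not taken from the project source, which may structure the sequence differently.

```typescript
// Hypothetical helper: builds a W3C Actions payload for a double tap.
// Pause durations are illustrative, not values from mobile-mcp.
type PointerAction =
  | { type: "pointerMove"; duration: number; x: number; y: number }
  | { type: "pointerDown"; button: number }
  | { type: "pointerUp"; button: number }
  | { type: "pause"; duration: number };

interface PointerSource {
  type: "pointer";
  id: string;
  parameters: { pointerType: "touch" };
  actions: PointerAction[];
}

function buildDoubleTapActions(x: number, y: number, gapMs = 100): { actions: PointerSource[] } {
  // One press-and-release cycle at the current pointer position.
  const press: PointerAction[] = [
    { type: "pointerDown", button: 0 },
    { type: "pause", duration: 50 },
    { type: "pointerUp", button: 0 },
  ];
  return {
    actions: [{
      type: "pointer",
      id: "finger1",
      parameters: { pointerType: "touch" },
      actions: [
        { type: "pointerMove", duration: 0, x, y },
        ...press,
        { type: "pause", duration: gapMs }, // gap between the two taps
        ...press,
      ],
    }],
  };
}
```

The resulting object could be serialized with `JSON.stringify` and posted to the session's `/actions` endpoint in the same way as the single-tap example.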
#### Long Press
Long press requires specifying the duration in milliseconds:
public async longPress(x: number, y: number, duration: number): Promise<void> {
this.runCommand(["io", "longpress", `${x},${y}`, "--duration", `${duration}`]);
}
Sources: src/mobile-device.ts:62-64
The iOS implementation uses the Actions API and holds the pointer down by inserting a pause of the requested duration between pointerDown and pointerUp:
{ type: "pointerDown", button: 0 },
{ type: "pause", duration: duration },
{ type: "pointerUp", button: 0 }
Swipe Gestures
Swipe gestures calculate coordinates based on screen dimensions, using 60% of the screen width or height as the swipe distance:
const verticalDistance = Math.floor(screenSize.height * 0.6);
const horizontalDistance = Math.floor(screenSize.width * 0.6);
const centerX = Math.floor(screenSize.width / 2);
const centerY = Math.floor(screenSize.height / 2);
Sources: src/webdriver-agent.ts:130-135
The swipe direction determines the start and end coordinates:
| Direction | X Movement | Y Movement |
|---|---|---|
| up | centerX | centerY ± verticalDistance/2 |
| down | centerX | centerY ± verticalDistance/2 |
| left | centerX ± horizontalDistance/2 | centerY |
| right | centerX ± horizontalDistance/2 | centerY |
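The table does not say which sign applies to the start point and which to the end point. The sketch below shows one plausible reading, assuming that a swipe "up" moves the finger from the lower half of the screen toward the top (y decreases) and that the other directions follow the same convention. The function name `swipeEndpoints` is hypothetical.

```typescript
type SwipeDirection = "up" | "down" | "left" | "right";
interface Point { x: number; y: number }
interface ScreenSize { width: number; height: number }

// Assumption: the direction names the way the finger travels.
function swipeEndpoints(screen: ScreenSize, direction: SwipeDirection): { start: Point; end: Point } {
  const verticalDistance = Math.floor(screen.height * 0.6);
  const horizontalDistance = Math.floor(screen.width * 0.6);
  const centerX = Math.floor(screen.width / 2);
  const centerY = Math.floor(screen.height / 2);
  const halfV = Math.floor(verticalDistance / 2);
  const halfH = Math.floor(horizontalDistance / 2);

  switch (direction) {
    case "up":
      return { start: { x: centerX, y: centerY + halfV }, end: { x: centerX, y: centerY - halfV } };
    case "down":
      return { start: { x: centerX, y: centerY - halfV }, end: { x: centerX, y: centerY + halfV } };
    case "left":
      return { start: { x: centerX + halfH, y: centerY }, end: { x: centerX - halfH, y: centerY } };
    case "right":
      return { start: { x: centerX - halfH, y: centerY }, end: { x: centerX + halfH, y: centerY } };
  }
}
```

Centering the gesture keeps both endpoints well inside the screen, which avoids triggering edge gestures such as the system back swipe.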
Text Input
Keyboard Input
Text input is handled through different mechanisms per platform:
iOS (via WebDriverAgent):
public async sendKeys(text: string): Promise<void> {
await this.withinSession(async sessionUrl => {
await fetch(`${sessionUrl}/wda/keys`, {
method: "POST",
body: JSON.stringify({ value: text.split("") }),
});
});
}
Android (via mobilecli):
public async typeText(text: string): Promise<void> {
this.runCommand(["io", "text", text]);
}
Sources: src/mobile-device.ts:42-44
Button Presses
Physical device buttons are mapped to platform-specific commands:
| Button | iOS Action | Android Command |
|---|---|---|
| HOME | wda/pressButton | io button home |
| BACK | N/A | io button back |
| POWER | N/A | io button power |
| ENTER | sendKeys "\n" | io button enter |
| VOLUME_UP | wda/pressButton | io button volume_up |
| VOLUME_DOWN | wda/pressButton | io button volume_down |
Sources: src/webdriver-agent.ts:150-175
Screen Capture
Screenshot Retrieval
Screenshots are retrieved as PNG buffers through platform-specific protocols:
iOS WebDriverAgent:
public async getScreenshot(): Promise<Buffer> {
const url = `http://${this.host}:${this.port}/screenshot`;
const response = await fetch(url);
const json = await response.json();
return Buffer.from(json.value, "base64");
}
Sources: src/webdriver-agent.ts:100-106
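Because the screenshot arrives as a base64 string inside a JSON body, a caller may want to confirm that the decoded bytes really are a PNG before passing them on. The following is a minimal sketch of such a check; `decodePngScreenshot` is a hypothetical helper and not part of the project.

```typescript
// The first eight bytes of every PNG file, per the PNG specification.
const PNG_SIGNATURE = Buffer.from([0x89, 0x50, 0x4e, 0x47, 0x0d, 0x0a, 0x1a, 0x0a]);

// Hypothetical helper: decodes the base64 `value` field and rejects non-PNG data.
function decodePngScreenshot(base64Value: string): Buffer {
  const buffer = Buffer.from(base64Value, "base64");
  const header = buffer.subarray(0, PNG_SIGNATURE.length);
  if (buffer.length < PNG_SIGNATURE.length || !header.equals(PNG_SIGNATURE)) {
    throw new Error("Screenshot payload is not a PNG image");
  }
  return buffer;
}
```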
Android (via mobilecli): Screenshots are captured using the platform's native screenshot mechanism through the dump command.
UI Element Discovery
Element Filtering
The system filters accessibility elements to return only actionable UI components:
const acceptedTypes = [
"TextField",
"Button",
"Switch",
"Icon",
"SearchField",
"StaticText",
"Image"
];
Sources: src/webdriver-agent.ts:30-32
Element visibility is determined by checking both the isVisible flag and bounds:
if (acceptedTypes.includes(source.type)) {
if (source.isVisible === "1" && this.isVisible(source.rect)) {
if (source.label !== null || source.name !== null || source.rawIdentifier !== null) {
output.push({ /* element data */ });
}
}
}
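The filter above can be read as a recursive walk over the accessibility tree. The sketch below shows that walk under some assumptions: the `children` field, the `SourceNode` shape, and the bounds check in `rectIsVisible` are guesses made for illustration, since the source excerpt does not show them.

```typescript
interface Rect { x: number; y: number; width: number; height: number }

// Assumed node shape; only the fields used by the filter are modeled.
interface SourceNode {
  type: string;
  isVisible?: string;
  rect: Rect;
  label: string | null;
  name: string | null;
  rawIdentifier: string | null;
  children?: SourceNode[];
}

const acceptedTypes = ["TextField", "Button", "Switch", "Icon", "SearchField", "StaticText", "Image"];

// Assumed bounds check: the real isVisible(rect) is not shown in the source.
function rectIsVisible(rect: Rect): boolean {
  return rect.x >= 0 && rect.y >= 0;
}

function collectElements(node: SourceNode, output: SourceNode[] = []): SourceNode[] {
  const hasIdentity = node.label !== null || node.name !== null || node.rawIdentifier !== null;
  if (acceptedTypes.includes(node.type) && node.isVisible === "1" && rectIsVisible(node.rect) && hasIdentity) {
    output.push(node);
  }
  // Containers are skipped themselves, but their children are still visited.
  for (const child of node.children ?? []) {
    collectElements(child, output);
  }
  return output;
}
```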
Element Data Structure
Each screen element contains:
| Property | Type | Description |
|---|---|---|
| type | string | Element type (Button, TextField, etc.) |
| label | string | Accessibility label |
| name | string | Element name |
| value | string | Current value (for text fields) |
| identifier | string | Platform-specific identifier |
| rect | {x, y, width, height} | Bounding rectangle |
| focused | boolean | Focus state |
Sources: src/mobile-device.ts:66-76
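Since the tap tools take coordinates rather than element references, a common step is converting an element's bounding rectangle into a tap point. A minimal sketch follows, assuming the properties in the table map onto a `ScreenElement` interface like the one below; which fields are optional is a guess, and `centerOf` is a hypothetical helper.

```typescript
// Assumed interface based on the property table; optionality is a guess.
interface ScreenElement {
  type: string;
  label?: string;
  name?: string;
  value?: string;
  identifier?: string;
  rect: { x: number; y: number; width: number; height: number };
  focused?: boolean;
}

// Returns the center of the element's bounding rectangle, rounded down.
function centerOf(element: ScreenElement): { x: number; y: number } {
  const { x, y, width, height } = element.rect;
  return { x: Math.floor(x + width / 2), y: Math.floor(y + height / 2) };
}
```

Tapping the center rather than a fixed offset from the top-left corner is less likely to miss small elements.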
Orientation Management
Device orientation can be queried and changed:
public async setOrientation(orientation: Orientation): Promise<void> {
this.runCommand(["device", "orientation", "set", orientation]);
}
public async getOrientation(): Promise<Orientation> {
const response = JSON.parse(this.runCommand(["device", "orientation", "get"])) as OrientationResponse;
return response.data.orientation;
}
Sources: src/mobile-device.ts:78-85
Supported orientations: "portrait" | "landscape"
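The cast in `getOrientation` trusts the command output. A slightly more defensive variant could validate the parsed value against the two supported orientations, as sketched here. `parseOrientation` is a hypothetical helper; the `{ data: { orientation } }` shape is inferred from the snippet above.

```typescript
type Orientation = "portrait" | "landscape";
interface OrientationResponse { data: { orientation: Orientation } }

// Parses the raw command output and rejects anything outside the supported set.
function parseOrientation(raw: string): Orientation {
  const parsed = JSON.parse(raw) as Partial<OrientationResponse>;
  const value = parsed.data?.orientation;
  if (value !== "portrait" && value !== "landscape") {
    throw new Error(`Unexpected orientation: ${String(value)}`);
  }
  return value;
}
```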
Connection Prerequisites
iOS Requirements
iOS devices require several connection layers to be operational:
graph LR
A[MCP Server] --> B[iOS Tunnel]
B --> C[WDA Port Forward]
C --> D[WebDriverAgent]
D --> E[iOS Device]
F[Check: Tunnel Running] --> B
G[Check: WDA Forward Running] --> C
H[Check: WDA isRunning] --> D
Sources: src/ios.ts:30-60
The system verifies:
- Tunnel Running - Required for remote iOS devices
- WDA Port Forward - TCP port forwarding to WebDriverAgent
- WebDriverAgent Status - Actual running state of WDA on device
private async assertTunnelRunning(): Promise<void> {
if (await this.isTunnelRequired()) {
if (!(await this.isTunnelRunning())) {
throw new ActionableError("iOS tunnel is not running...");
}
}
}
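The three checks run in a fixed order, so the first failure identifies which layer to fix. The sketch below models that ordering as a pure function over a snapshot of connection state; the `IosConnectionState` type and `firstFailedPrerequisite` name are assumptions made for illustration.

```typescript
// Hypothetical snapshot of the connection layers described above.
interface IosConnectionState {
  tunnelRequired: boolean;
  tunnelRunning: boolean;
  wdaForwardRunning: boolean;
  wdaRunning: boolean;
}

// Returns the message for the first failing layer, or null when all pass.
function firstFailedPrerequisite(state: IosConnectionState): string | null {
  if (state.tunnelRequired && !state.tunnelRunning) {
    return "iOS tunnel is not running";
  }
  if (!state.wdaForwardRunning) {
    return "Port forwarding to WebDriverAgent is not running";
  }
  if (!state.wdaRunning) {
    return "WebDriverAgent is not running on device";
  }
  return null;
}
```

Checking from the outermost layer inward means a later message implies that the earlier layers were already working.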
MCP Tools Interface
The server exposes the following screen interaction tools:
| Tool Name | Description | Parameters |
|---|---|---|
| mobile_take_screenshot | Capture current screen | device |
| mobile_list_elements_on_screen | Get UI elements | device |
| mobile_click_on_screen_at_coordinates | Tap at coordinates | device, x, y |
| mobile_double_tap_on_screen | Double tap | device, x, y |
| mobile_long_press_on_screen_at_coordinates | Long press | device, x, y, duration |
| mobile_swipe_on_screen | Swipe gesture | device, direction |
| mobile_type_keys | Text input | device, text, submit |
| mobile_get_orientation | Query orientation | device |
| mobile_set_orientation | Set orientation | device, orientation |
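MCP clients invoke these tools with a JSON-RPC `tools/call` request. The sketch below builds such a request for the tap tool, using the parameter names from the table. The device identifier `emulator-5554` is only a placeholder, and `buildToolCall` is a hypothetical helper, not an API of the project.

```typescript
// Shape of an MCP tools/call request (JSON-RPC 2.0).
interface ToolCallRequest {
  jsonrpc: "2.0";
  id: number;
  method: "tools/call";
  params: { name: string; arguments: Record<string, unknown> };
}

function buildToolCall(id: number, name: string, args: Record<string, unknown>): ToolCallRequest {
  return { jsonrpc: "2.0", id, method: "tools/call", params: { name, arguments: args } };
}

// Placeholder device ID; a real one would come from the device listing tool.
const tapRequest = buildToolCall(1, "mobile_click_on_screen_at_coordinates", {
  device: "emulator-5554",
  x: 150,
  y: 300,
});
```

In practice an MCP client library usually builds this envelope, so the sketch is mainly useful for reading logs or debugging raw traffic.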
Usage Examples
Basic Screen Interaction Workflow
// 1. List available devices
const devices = await listAvailableDevices();
// 2. Get screen elements
const elements = await getElementsOnScreen(deviceId);
// 3. Tap on a button by coordinates
await tap(deviceId, 150, 300);
// 4. Type text into a field
await typeKeys(deviceId, "Hello World", false);
// 5. Swipe up to scroll
await swipe(deviceId, "up");
// 6. Take screenshot to verify
const screenshot = await getScreenshot(deviceId);
Complete User Journey Automation
// Open app and perform multi-step interaction
await launchApp(deviceId, "com.example.app");
await swipe(deviceId, "up");
const elements = await getElementsOnScreen(deviceId);
const loginButton = elements.find(e => e.name === "Login");
if (!loginButton) throw new Error("Login button not found on screen");
await tap(deviceId, loginButton.rect.x + 10, loginButton.rect.y + 10);
await typeKeys(deviceId, "[email protected]", false);
await pressButton(deviceId, "ENTER");
Error Handling
The system provides actionable error messages with troubleshooting links:
throw new ActionableError(
`mobilecli is not available or not working properly.
Please review the documentation at
https://github.com/mobile-next/mobile-mcp/wiki`
);
Sources: src/server.ts:20-25
Common error scenarios:
| Error Condition | Cause | Resolution |
|---|---|---|
| "iOS tunnel is not running" | Remote device connection issue | Start tunnel per wiki instructions |
| "Port forwarding not running" | WDA port not forwarded | Configure port forwarding |
| "WebDriverAgent not running" | WDA crashed on device | Restart WDA on iOS device |
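The source excerpts only show `ActionableError` being thrown, not how it is defined or handled. One plausible arrangement, sketched below purely as an assumption, is a thin `Error` subclass whose message is passed through to the user, while any other error is reported as unexpected. `describeError` is a hypothetical helper.

```typescript
// Assumed definition: the source only shows ActionableError being thrown.
class ActionableError extends Error {
  constructor(message: string) {
    super(message);
    this.name = "ActionableError";
    // Keeps instanceof working when compiled to older JavaScript targets.
    Object.setPrototypeOf(this, ActionableError.prototype);
  }
}

// Hypothetical handler: turns a thrown value into text for a tool response.
function describeError(error: unknown): string {
  if (error instanceof ActionableError) {
    return `${error.message} Please fix the issue and try again.`;
  }
  return error instanceof Error ? `Unexpected error: ${error.message}` : "Unknown error";
}
```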
Performance Considerations
- Element queries use native accessibility trees for speed
- Screenshots are transferred as base64-encoded PNG
- Gestures are calculated as percentages of screen dimensions
- Timeouts for recording operations default to 5 minutes (300 seconds)
Sources: src/robot.ts:30-100
iOS Implementation
Related topics: Device Abstraction Layer, Screen Interaction and Input
Overview
The iOS Implementation module provides automation for iOS physical devices and simulators. It bridges the Mobile MCP server and iOS testing infrastructure, letting AI agents interact with iOS applications through accessibility trees and coordinate-based input.
Purpose: The iOS Implementation enables native app automation for iOS testing, scripted flows, and multi-step user journeys driven by LLMs.
Scope: The module handles device detection, connection management, WebDriverAgent communication, tunnel port forwarding, and command execution via the go-ios library.
Sources: src/ios.ts:1-50
Architecture
The iOS implementation follows a layered architecture that separates device management, robot control, and protocol communication.
graph TD
A[MCP Server] --> B[IosManager]
A --> C[IosRobot]
B --> D[go-ios / mobilecli]
C --> E[WebDriverAgent]
C --> F[iOS Device Kit]
E --> G[Physical iOS Device]
F --> G
D --> H[iOS Simulator]
I[iPhone Simulator] --> H
Core Components
| Component | File | Responsibility |
|---|---|---|
| IosManager | src/server.ts | Device discovery and listing |
| IosRobot | src/server.ts | Device interaction via Robot interface |
| IosDeviceConnection | src/ios.ts | Connection and tunnel management |
| WebDriverAgent | src/webdriver-agent.ts | UI automation protocol |
| iPhone Simulator | src/iphone-simulator.ts | Simulator-specific operations |
Sources: src/server.ts:30-80
Device Detection
The system uses a multi-step detection process to identify iOS device types and establish appropriate communication channels.
Device Type Resolution
graph TD
A[Device ID] --> B{IosManager listDevices?}
B -->|Found| C[IosRobot]
B -->|Not Found| D{AndroidDeviceManager?}
D -->|Found| E[AndroidRobot]
D -->|Not Found| F{mobilecli getDevices?}
F -->|Simulator Found| G[MobileDevice]
F -->|Not Found| H[Error: Device not found]
The getRobotFromDevice function performs the following checks in order:
- iOS Physical Devices: Queries IosManager for connected iOS devices. Sources: src/server.ts:45-50
- Android Devices: Checks AndroidDeviceManager for a matching device ID. Sources: src/server.ts:52-56
- iOS Simulators: Uses mobilecli with a platform filter for simulators. Sources: src/server.ts:58-75
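The lookup order above can be sketched as a pure function over three lists of known device IDs. This is an illustration only: the `DeviceSources` type and `resolveRobotKind` name are hypothetical, and the real getRobotFromDevice queries the managers and mobilecli rather than receiving lists.

```typescript
type RobotKind = "IosRobot" | "AndroidRobot" | "MobileDevice";

// Hypothetical stand-in for the results of the three discovery calls.
interface DeviceSources {
  iosDeviceIds: string[];
  androidDeviceIds: string[];
  simulatorIds: string[];
}

// Checks the sources in the documented order and fails if none match.
function resolveRobotKind(deviceId: string, sources: DeviceSources): RobotKind {
  if (sources.iosDeviceIds.includes(deviceId)) return "IosRobot";
  if (sources.androidDeviceIds.includes(deviceId)) return "AndroidRobot";
  if (sources.simulatorIds.includes(deviceId)) return "MobileDevice";
  throw new Error(`Device "${deviceId}" not found`);
}
```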
Simulator Agent Verification
For iOS simulators, the system automatically verifies and installs the agent if needed:
if (!agentVerifiedSimulators.has(deviceId)) {
const agentStatus = mobilecli.agentStatus(deviceId);
if (agentStatus.status === "fail") {
mobilecli.agentInstall(deviceId);
}
agentVerifiedSimulators.add(deviceId);
}
Sources: src/server.ts:65-71
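The effect of the `agentVerifiedSimulators` set is that the status check runs at most once per simulator for the lifetime of the process. The sketch below reproduces that behavior with an injected client so it can be exercised without a simulator; the `AgentCli` interface and `ensureAgent` name are assumptions based on the two calls shown above.

```typescript
// Assumed minimal interface covering the two mobilecli calls shown above.
interface AgentCli {
  agentStatus(deviceId: string): { status: string };
  agentInstall(deviceId: string): void;
}

const agentVerifiedSimulators = new Set<string>();

// Verifies the agent once per device and installs it if the check fails.
function ensureAgent(cli: AgentCli, deviceId: string): void {
  if (agentVerifiedSimulators.has(deviceId)) return;
  if (cli.agentStatus(deviceId).status === "fail") {
    cli.agentInstall(deviceId);
  }
  agentVerifiedSimulators.add(deviceId);
}
```

One consequence of this design is that an agent removed after the first check would not be reinstalled until the server restarts.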
Connection Management
Port-Based Connection State
The iOS implementation uses port checking to verify connection states:
graph LR
A[Device] -->|USB Tunnel| B[localhost:PORT]
B --> C{WDA Port Check}
C -->|Listening| D[WebDriverAgent Ready]
C -->|Not Listening| E[Error: Port forwarding not running]
#### Tunnel Requirements
The isTunnelRequired method determines when tunnel port forwarding is necessary based on connection type. When a tunnel is required but not running, the system throws an ActionableError:
private async assertTunnelRunning(): Promise<void> {
if (await this.isTunnelRequired()) {
if (!(await this.isTunnelRunning())) {
throw new ActionableError("iOS tunnel is not running, please see https://github.com/mobile-next/mobile-mcp/wiki/");
}
}
}
Sources: src/ios.ts:45-50
#### WebDriverAgent Port Forwarding
Connection to WebDriverAgent requires both tunnel and port forwarding verification:
private async wda(): Promise<WebDriverAgent> {
await this.assertTunnelRunning();
if (!(await this.isWdaForwardRunning())) {
throw new ActionableError("Port forwarding to WebDriverAgent is not running (tunnel okay), please see https://github.com/mobile-next/mobile-mcp/wiki/");
}
const wda = new WebDriverAgent("localhost", WDA_PORT);
if (!(await wda.isRunning())) {
throw new ActionableError("WebDriverAgent is not running on device (tunnel okay, port forwarding okay), please see https://github.com/mobile-next/mobile-mcp/wiki/");
}
return wda;
}
Sources: src/ios.ts:55-70
Robot Interface Implementation
The IosRobot class implements the Robot interface, providing unified access to iOS device capabilities.
Supported Operations
| Category | Operations |
|---|---|
| Screen | getScreenshot, getScreenSize, getOrientation, setOrientation |
| Touch | tap, doubleTap, longPress, swipeFromCoordinate |
| Input | sendKeys, pressButton |
| Navigation | home, back, openUrl |
| Apps | listApps, launchApp, terminateApp, installApp, uninstallApp |
| Elements | getElementsOnScreen |
Sources: src/robot.ts:1-80
iOS-Specific Considerations
#### Accessibility Tree
The implementation uses native accessibility trees for element detection rather than computer vision, making it LLM-friendly and fast.
Note: The getElementsOnScreen method works only on native apps and will not function within webviews.
Sources: src/robot.ts:40-45
#### Long Press Duration
The long press operation accepts a custom duration parameter:
longPress(x: number, y: number, duration: number): Promise<void>;
| Parameter | Type | Description |
|---|---|---|
| x | number | X coordinate on screen |
| y | number | Y coordinate on screen |
| duration | number | Duration in milliseconds |
Sources: src/robot.ts:35-38
WebDriverAgent Integration
WebDriverAgent (WDA) is the primary automation backend for iOS physical devices. The server talks to it over HTTP.
Connection Flow
sequenceDiagram
participant MCP as MCP Server
participant Ios as IosDeviceConnection
participant WDA as WebDriverAgent
participant Device as iOS Device
MCP->>Ios: wda()
Ios->>Ios: assertTunnelRunning()
Ios->>Ios: isWdaForwardRunning()
Ios->>WDA: new WebDriverAgent(localhost, PORT)
WDA->>Device: isRunning()
Device-->>WDA: status
WDA-->>Ios: WebDriverAgent instance
Ios-->>MCP: Ready for commands
Port Configuration
| Port | Purpose | Configuration |
|---|---|---|
| WDA_PORT | WebDriverAgent communication | localhost:PORT |
| IOS_TUNNEL_PORT | USB tunnel forwarding | localhost:PORT |
Sources: src/webdriver-agent.ts:1-30
iOS Simulator Support
The iPhone Simulator module provides specialized handling for iOS Simulator instances.
Simulator Detection
Simulators are detected through mobilecli with the following parameters:
const response = mobilecli.getDevices({
platform: "ios",
type: "simulator",
includeOffline: false,
});
Sources: src/server.ts:58-62
Simulator vs Physical Device
| Aspect | Simulator | Physical Device |
|---|---|---|
| Connection | Direct via mobilecli | USB tunnel + WDA |
| Agent | Auto-installed on demand | Manual setup required |
| Port Forwarding | Not required | Required (WDA_PORT) |
| WebDriverAgent | Optional auto-start | Required |
Sources: src/iphone-simulator.ts:1-40
go-ios Integration
The implementation uses the go-ios binary for low-level iOS device communication.
Command Execution
private async ios(...args: string[]): Promise<string> {
return execFileSync(getGoIosPath(), ["--udid", this.deviceId, ...args], {}).toString();
}
Sources: src/ios.ts:75-77
Utility Functions
The system checks for mobilecli availability on startup:
const ensureMobilecliAvailable = (): void => {
try {
const version = mobilecli.getVersion();
if (version.startsWith("failed")) {
throw new Error("mobilecli version check failed");
}
} catch (error: any) {
throw new ActionableError(`mobilecli is not available or not working properly. Please review the documentation at https://github.com/mobile-next/mobile-mcp/wiki for installation instructions`);
}
};
Sources: src/server.ts:22-32
Error Handling
Actionable Errors
The system throws ActionableError with user-friendly messages and documentation links:
| Error Scenario | Message | Resolution Link |
|---|---|---|
| mobilecli unavailable | Installation instructions | Wiki Installation |
| Tunnel not running | Setup guide | Wiki Tunnel Setup |
| Port forwarding failed | Troubleshooting | Wiki Debugging |
| WDA not running | Configuration guide | Wiki WDA Setup |
| Device not found | Device list | mobile_list_available_devices tool |
Sources: src/ios.ts:48-70
Configuration
Environment Variables
| Variable | Description | Default |
|---|---|---|
| MOBILEMCP_AUTH | Bearer token for SSE authorization | None |
Port Constants
The system uses standard port configurations for iOS tunnel and WDA communication. Refer to the source code for current port values.
Platform Support Matrix
| Platform | Version | Automation Method |
|---|---|---|
| iOS Simulator | All versions | mobilecli + iOS Device Kit |
| Physical iOS | iOS 13+ | WebDriverAgent via go-ios |
| Real Device (debug) | iOS 13+ | USB tunnel + WDA |
Sources: src/webdriver-agent.ts:20-30
Doramagic Pitfall Log
Source-linked risks stay visible on the manual page so the preview does not read like a recommendation.
Users may get misleading failures or incomplete behavior unless configuration is checked carefully.
The project may affect permissions, credentials, data exposure, or host boundaries.
Doramagic extracted 16 source-linked risk signals. Review them before installing or handing real data to the project.
1. Configuration risk: feat(iOS): add Unix Domain Socket support for usbmuxd (/var/run/usbmuxd) with TCP fallback
- Severity: high
- Finding: Configuration risk is backed by a source signal: feat(iOS): add Unix Domain Socket support for usbmuxd (/var/run/usbmuxd) with TCP fallback. Treat it as a review item until the current version is checked.
- User impact: Users may get misleading failures or incomplete behavior unless configuration is checked carefully.
- Recommended check: Open the linked source, confirm whether it still applies to the current version, and keep the first run isolated.
- Evidence: Source-linked evidence: https://github.com/mobile-next/mobile-mcp/issues/322
2. Configuration risk: mobile_type_keys not support Chinese words
- Severity: high
- Finding: Configuration risk is backed by a source signal: mobile_type_keys not support Chinese words. Treat it as a review item until the current version is checked.
- User impact: Users may get misleading failures or incomplete behavior unless configuration is checked carefully.
- Recommended check: Open the linked source, confirm whether it still applies to the current version, and keep the first run isolated.
- Evidence: Source-linked evidence: https://github.com/mobile-next/mobile-mcp/issues/238
3. Security or permission risk: Default-on telemetry creates security and compliance risk for enterprise users
- Severity: high
- Finding: Security or permission risk is backed by a source signal: Default-on telemetry creates security and compliance risk for enterprise users. Treat it as a review item until the current version is checked.
- User impact: The project may affect permissions, credentials, data exposure, or host boundaries.
- Recommended check: Open the linked source, confirm whether it still applies to the current version, and keep the first run isolated.
- Evidence: Source-linked evidence: https://github.com/mobile-next/mobile-mcp/issues/330
4. Security or permission risk: iOS physical-device support blocked on macOS 26 (Tahoe) — go-ios can't read /var/db/lockdown/RemotePairing/
- Severity: high
- Finding: Security or permission risk is backed by a source signal: iOS physical-device support blocked on macOS 26 (Tahoe) — go-ios can't read /var/db/lockdown/RemotePairing/. Treat it as a review item until the current version is checked.
- User impact: The project may affect permissions, credentials, data exposure, or host boundaries.
- Recommended check: Open the linked source, confirm whether it still applies to the current version, and keep the first run isolated.
- Evidence: Source-linked evidence: https://github.com/mobile-next/mobile-mcp/issues/323
5. Configuration risk: start/stop_screen_recording produces corrupt files on Android and silently fails on iOS physical devices
- Severity: medium
- Finding: Configuration risk is backed by a source signal: start/stop_screen_recording produces corrupt files on Android and silently fails on iOS physical devices. Treat it as a review item until the current version is checked.
- User impact: Users may get misleading failures or incomplete behavior unless configuration is checked carefully.
- Recommended check: Open the linked source, confirm whether it still applies to the current version, and keep the first run isolated.
- Evidence: Source-linked evidence: https://github.com/mobile-next/mobile-mcp/issues/321
6. Capability assumption: mobile fleet documentation link broken
- Severity: medium
- Finding: Capability assumption is backed by a source signal: mobile fleet documentation link broken. Treat it as a review item until the current version is checked.
- User impact: The project should not be treated as fully validated until this signal is reviewed.
- Recommended check: Open the linked source, confirm whether it still applies to the current version, and keep the first run isolated.
- Evidence: Source-linked evidence: https://github.com/mobile-next/mobile-mcp/issues/328
7. Capability assumption: README/documentation is current enough for a first validation pass.
- Severity: medium
- Finding: README/documentation is current enough for a first validation pass.
- User impact: The project should not be treated as fully validated until this signal is reviewed.
- Recommended check: Open the linked source, confirm whether it still applies to the current version, and keep the first run isolated.
- Evidence: capability.assumptions | github_repo:956657893 | https://github.com/mobile-next/mobile-mcp | README/documentation is current enough for a first validation pass.
8. Maintenance risk: Version 0.0.49
- Severity: medium
- Finding: Maintenance risk is backed by a source signal: Version 0.0.49. Treat it as a review item until the current version is checked.
- User impact: Users cannot judge support quality until recent activity, releases, and issue response are checked.
- Recommended check: Open the linked source, confirm whether it still applies to the current version, and keep the first run isolated.
- Evidence: Source-linked evidence: https://github.com/mobile-next/mobile-mcp/releases/tag/0.0.49
9. Maintenance risk: Version 0.0.54
- Severity: medium
- Finding: Maintenance risk is backed by a source signal: Version 0.0.54. Treat it as a review item until the current version is checked.
- User impact: Users cannot judge support quality until recent activity, releases, and issue response are checked.
- Recommended check: Open the linked source, confirm whether it still applies to the current version, and keep the first run isolated.
- Evidence: Source-linked evidence: https://github.com/mobile-next/mobile-mcp/releases/tag/0.0.54
10. Maintenance risk: Maintainer activity is unknown
- Severity: medium
- Finding: Maintenance risk is backed by a source signal: Maintainer activity is unknown. Treat it as a review item until the current version is checked.
- User impact: Users cannot judge support quality until recent activity, releases, and issue response are checked.
- Recommended check: Open the linked source, confirm whether it still applies to the current version, and keep the first run isolated.
- Evidence: evidence.maintainer_signals | github_repo:956657893 | https://github.com/mobile-next/mobile-mcp | last_activity_observed missing
11. Security or permission risk: no_demo
- Severity: medium
- Finding: no_demo
- User impact: The project may affect permissions, credentials, data exposure, or host boundaries.
- Recommended check: Open the linked source, confirm whether it still applies to the current version, and keep the first run isolated.
- Evidence: downstream_validation.risk_items | github_repo:956657893 | https://github.com/mobile-next/mobile-mcp | no_demo; severity=medium
12. Security or permission risk: no_demo
- Severity: medium
- Finding: no_demo
- User impact: The project may affect permissions, credentials, data exposure, or host boundaries.
- Recommended check: Open the linked source, confirm whether it still applies to the current version, and keep the first run isolated.
- Evidence: risks.scoring_risks | github_repo:956657893 | https://github.com/mobile-next/mobile-mcp | no_demo; severity=medium
Source: Doramagic discovery, validation, and Project Pack records
Community Discussion Evidence
These external discussion links are review inputs, not standalone proof that the project is production-ready.
Open the linked issues or discussions before treating the pack as ready for your environment.
Doramagic exposes project-level community discussion separately from official documentation. Review these links before using mobile-mcp with real data or production workflows.
- start/stop_screen_recording produces corrupt files on Android and silent - github / github_issue
- Default-on telemetry creates security and compliance risk for enterprise - github / github_issue
- mobile fleet documentation link broken - github / github_issue
- iOS physical-device support blocked on macOS 26 (Tahoe) — go-ios can't r - github / github_issue
- feat(iOS): add Unix Domain Socket support for usbmuxd (/var/run/usbmuxd) - github / github_issue
- mobile_type_keys not support Chinese words - github / github_issue
- Founding Harness Doctor audit for Mobile MCP - github / github_issue
- Version 0.0.55 - github / github_release
- Version 0.0.54 - github / github_release
- Version 0.0.53 - github / github_release
- Version 0.0.52 - github / github_release
- README/documentation is current enough for a first validation pass. - GitHub / issue
Source: Project Pack community evidence and pitfall evidence