# https://github.com/ARAS-Workspace/claude-kvm Project Manual

Generated at: 2026-05-31 04:26:50 UTC

## Table of Contents

- [Installation Guide](#installation)
- [Quick Start Guide](#quickstart)
- [System Architecture](#system-architecture)
- [PC (Procedure Call) Protocol](#pc-protocol)
- [Coordinate Scaling](#coordinate-scaling)
- [Screen Operations](#screen-operations)
- [Input Control (Mouse & Keyboard)](#input-control)
- [OCR Detection](#ocr-detection)
- [Authentication Methods](#authentication)
- [Environment Variables](#environment-variables)

<a id='installation'></a>

## Installation Guide

### Related Pages

Related topics: [Quick Start Guide](#quickstart), [Environment Variables](#environment-variables)

<details>
<summary>Related Source Files</summary>

The following source files were used to generate this page:

- [README.md](https://github.com/ARAS-Workspace/claude-kvm/blob/main/README.md)
- [README_TR.md](https://github.com/ARAS-Workspace/claude-kvm/blob/main/README_TR.md)
- [server.json](https://github.com/ARAS-Workspace/claude-kvm/blob/main/server.json)
- [LAUNCHGUIDE.md](https://github.com/ARAS-Workspace/claude-kvm/blob/main/LAUNCHGUIDE.md)
- [index.js](https://github.com/ARAS-Workspace/claude-kvm/blob/main/index.js)
</details>

This guide covers the complete installation process for Claude KVM, a Model Context Protocol (MCP) server that enables Claude to control remote desktops via VNC with native Apple Silicon performance and on-device OCR capabilities.

## System Architecture Overview

Claude KVM consists of two primary components that work together to provide remote desktop control:

```mermaid
graph TD
    A["Claude Desktop"] <-->|"MCP JSON-RPC"| B["MCP Proxy<br/>(Node.js)"]
    B <-->|"PC Protocol<br/>NDJSON"| C["VNC Daemon<br/>(Swift/C)"]
    C <-->|"VNC Protocol"| D["Remote Desktop<br/>(macOS/Target)"]
    
    B -->|spawns| C
    C -->|screen capture<br/>input injection| D
```

| Component | Language | Role | Distribution |
|-----------|----------|------|--------------|
| **MCP Proxy** | JavaScript (Node.js) | Communicates with Claude over MCP protocol, manages daemon lifecycle | npm package `claude-kvm` |
| **VNC Daemon** | Swift/C (Apple Silicon) | VNC connection, screen capture, mouse/keyboard input injection | Homebrew package `claude-kvm-daemon` |

Source: [README.md:Layer Architecture](README.md)

## Prerequisites

Before installing Claude KVM, ensure your environment meets the following requirements:

### Target Machine (Remote Desktop)

- **Platform**: macOS (Apple Silicon recommended for native performance)
- **VNC Server**: A VNC server must be running and accessible
- **Network**: The machine running Claude must have network access to the VNC server

### Control Machine (Where Claude Runs)

| Requirement | Version/Type | Notes |
|-------------|--------------|-------|
| **Operating System** | macOS on Apple Silicon | The proxy spawns the native daemon locally, and the daemon is built for Apple Silicon |
| **Node.js** | LTS version | Required for MCP proxy |
| **Package Manager** | npm or npx | For installing MCP proxy |

Source: [LAUNCHGUIDE.md:Requirements](LAUNCHGUIDE.md)

## Installation Steps

### Step 1: Install the Native VNC Daemon

The native daemon (`claude-kvm-daemon`) handles low-level VNC communication and input injection. It is distributed via Homebrew.

```bash
# Add the Homebrew tap
brew tap ARAS-Workspace/tap

# Install the daemon
brew install claude-kvm-daemon
```

#### Daemon Build and Distribution

The daemon is built and code-signed on GitHub Actions CI:

| Artifact | Location | Purpose |
|----------|----------|---------|
| `.tar.gz` archive | GitHub Releases | Homebrew distribution |
| `.dmg` disk image | CI Artifacts | Notarized installer package |

The notarization process validates the binary with Apple, enabling Gatekeeper approval on modern macOS systems. Source: [README_TR.md:Daemon Build Process](README_TR.md)

### Step 2: Configure MCP Server

Create a `.mcp.json` file in your project directory to configure the MCP server connection:

```json
{
  "mcpServers": {
    "claude-kvm": {
      "command": "npx",
      "args": ["-y", "claude-kvm"],
      "env": {
        "VNC_HOST": "192.168.1.100",
        "VNC_PORT": "5900",
        "VNC_USERNAME": "user",
        "VNC_PASSWORD": "pass",
        "CLAUDE_KVM_DAEMON_PATH": "/opt/homebrew/bin/claude-kvm-daemon"
      }
    }
  }
}
```

Source: [README_TR.md:MCP Configuration](README_TR.md)

#### Environment Variables Reference

| Variable | Required | Description |
|----------|----------|-------------|
| `VNC_HOST` | Yes | VNC server hostname or IP address |
| `VNC_PORT` | Yes | VNC server port (default: 5900) |
| `VNC_USERNAME` | No | Username for VNC/ARD authentication |
| `VNC_PASSWORD` | No | Password for VNC/ARD authentication |
| `CLAUDE_KVM_DAEMON_PATH` | No | Custom path to daemon binary (default: auto-detected) |

Source: [server.json:Environment Variables](server.json)
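
As a rough illustration of how these variables fit together, the sketch below validates an environment object before anything is launched. `readVncConfig` is a hypothetical helper, not part of claude-kvm, and it simply takes the "Required" column above at face value; check `server.json` for what the real proxy enforces.

```javascript
// Hypothetical helper (not part of claude-kvm): validate the variables
// from the table above before starting the MCP proxy.
function readVncConfig(env) {
  const missing = ['VNC_HOST', 'VNC_PORT'].filter((name) => !env[name]);
  if (missing.length > 0) {
    throw new Error(`Missing required variables: ${missing.join(', ')}`);
  }

  const port = Number(env.VNC_PORT);
  if (!Number.isInteger(port) || port < 1 || port > 65535) {
    throw new Error(`VNC_PORT must be an integer in 1-65535, got "${env.VNC_PORT}"`);
  }

  return {
    host: env.VNC_HOST,
    port,
    // Optional values stay undefined when they are not set.
    username: env.VNC_USERNAME || undefined,
    password: env.VNC_PASSWORD || undefined,
    daemonPath: env.CLAUDE_KVM_DAEMON_PATH || undefined,
  };
}
```

Calling `readVncConfig(process.env)` in a small preflight script surfaces a missing or malformed value before Claude attempts its first tool call.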

### Step 3: Verify Installation

After configuration, verify the installation by checking the connection status:

```bash
# Test the MCP server loads correctly
npx -y claude-kvm --help
```

## Supported Authentication Methods

Claude KVM supports multiple VNC authentication mechanisms:

| Method | Protocol | Key Exchange | Credential Protection |
|--------|----------|--------------|-----------------------|
| **VNC Auth** | Standard VNC | None | DES challenge-response keyed by the password |
| **ARD** | Apple Remote Desktop | Diffie-Hellman | AES-128-ECB |

When ARD authentication is detected (auth type 30), the daemon automatically remaps Meta keys to Super for Command key compatibility on macOS targets. Source: [README.md:Authentication](README.md)

## Daemon Installation Paths

The daemon binary is installed to different locations depending on your macOS architecture:

| Architecture | Typical Path |
|--------------|--------------|
| Apple Silicon (arm64) | `/opt/homebrew/bin/claude-kvm-daemon` |
| Intel (x86_64) | `/usr/local/bin/claude-kvm-daemon` |

The MCP proxy automatically detects the correct path, but you can override it using `CLAUDE_KVM_DAEMON_PATH`. Source: [README_TR.md:Daemon Path Configuration](README_TR.md)
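
The lookup order described above can be sketched as follows. This is an assumption about how such detection might work, not the proxy's actual code: `resolveDaemonPath` and `CANDIDATE_PATHS` are hypothetical names, and the file-existence check is injected so the function has no dependencies of its own.

```javascript
// Hypothetical sketch of daemon lookup; the real proxy's detection logic may differ.
const CANDIDATE_PATHS = [
  '/opt/homebrew/bin/claude-kvm-daemon', // Homebrew prefix on Apple Silicon
  '/usr/local/bin/claude-kvm-daemon',    // Homebrew prefix on Intel
];

// `exists` is injected (for example fs.existsSync) so the function is easy to test.
function resolveDaemonPath(env, exists) {
  const override = env.CLAUDE_KVM_DAEMON_PATH;
  if (override) {
    if (!exists(override)) {
      throw new Error(`CLAUDE_KVM_DAEMON_PATH does not exist: ${override}`);
    }
    return override;
  }
  const found = CANDIDATE_PATHS.find((candidate) => exists(candidate));
  if (!found) {
    throw new Error('claude-kvm-daemon not found; set CLAUDE_KVM_DAEMON_PATH');
  }
  return found;
}
```

In a real script, `exists` would typically be `existsSync` from `node:fs`, and `env` would be `process.env`.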

## Quick Start with npx

For temporary or testing purposes, you can run the MCP proxy directly without persistent configuration:

```bash
# Set environment variables and run
VNC_HOST=192.168.1.100 VNC_PORT=5900 VNC_PASSWORD=secret npx -y claude-kvm
```

## Post-Installation: macOS-Specific Considerations

If running on bare-metal Mac hardware, review the [Mac M1 Preparation Tricks](https://gist.github.com/remrearas/a3f300635b02f2587a134882a51f7114) guide for:

- VNC security hardening
- SSH tunneling configuration
- Session stability improvements

> [!NOTE]
> This is particularly relevant if you encounter VNC connection drops during extended test sessions. Source: [README.md:VNC Security Note](README.md)

## Troubleshooting Common Installation Issues

### Daemon Not Found

If you see errors about the daemon not being found:

1. Verify Homebrew installation: `brew list claude-kvm-daemon`
2. Check the binary exists at the expected path
3. Set `CLAUDE_KVM_DAEMON_PATH` explicitly in your `.mcp.json`

### Connection Failures

VNC connection drops have been reported during long-running sessions. If you experience them:

- Check whether SSH to the target still works. If it does, the fault is on the VNC/ARD server side rather than in the daemon or the network.
- Consider adding reconnection logic or connection monitoring.

Source: [Community Issue #8:VNC Connection Drops](https://github.com/ARAS-Workspace/claude-kvm/issues/8)

## Updating Claude KVM

| Component | Update Method |
|-----------|---------------|
| MCP Proxy | `npx -y claude-kvm@latest` (plain `npx -y claude-kvm` may reuse a cached copy) |
| Daemon | `brew upgrade claude-kvm-daemon` |

## Version Information

Current releases:

| Component | Version |
|-----------|---------|
| MCP Proxy | 2.0.11 |
| Daemon | 1.0.1 |

Source: [server.json:Version](server.json), [Community:Daemon Releases](https://github.com/ARAS-Workspace/claude-kvm/releases/tag/daemon-v1.0.1)

---

<a id='quickstart'></a>

## Quick Start Guide

### Related Pages

Related topics: [Installation Guide](#installation), [Environment Variables](#environment-variables)

<details>
<summary>Related Source Files</summary>

The following source files were used to generate this page:

- [README.md](https://github.com/ARAS-Workspace/claude-kvm/blob/main/README.md)
- [LAUNCHGUIDE.md](https://github.com/ARAS-Workspace/claude-kvm/blob/main/LAUNCHGUIDE.md)
- [README_TR.md](https://github.com/ARAS-Workspace/claude-kvm/blob/main/README_TR.md)
- [package.json](https://github.com/ARAS-Workspace/claude-kvm/blob/main/package.json)
- [server.json](https://github.com/ARAS-Workspace/claude-kvm/blob/main/server.json)
- [index.js](https://github.com/ARAS-Workspace/claude-kvm/blob/main/index.js)
- [tools/index.js](https://github.com/ARAS-Workspace/claude-kvm/blob/main/tools/index.js)
</details>

Claude KVM enables AI assistants like Claude to control remote macOS desktops via VNC. This guide walks you through prerequisites, installation, configuration, and basic workflows to get started with automated remote desktop control.

---

## Overview

Claude KVM consists of two main components:

| Component | Language | Role |
|-----------|----------|------|
| **MCP Proxy** | JavaScript (Node.js) | Communicates with Claude over MCP protocol, manages daemon lifecycle |
| **VNC Daemon** | Swift/C (Apple Silicon) | VNC connection, screen capture, mouse/keyboard input injection |

Source: [README.md](https://github.com/ARAS-Workspace/claude-kvm/blob/main/README.md)

The proxy and daemon communicate using the PC (Procedure Call) protocol, carried as NDJSON (one JSON message per line):

```
Request:      {"method":"<name>","params":{...},"id":<int|string>}
Response:     {"result":{...},"id":<int|string>}
Error:        {"error":{"code":<int>,"message":"..."},"id":<int|string>}
```

Source: [README.md](https://github.com/ARAS-Workspace/claude-kvm/blob/main/README.md)
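
To make the request shape concrete, here is a minimal sketch of how one request could be serialized as a single NDJSON line. `encodeRequest` is an illustrative name and not taken from the claude-kvm source.

```javascript
// Illustrative helper: serialize one PC-protocol request as an NDJSON line.
function encodeRequest(method, params = {}, id) {
  // The format above allows integer or string ids.
  if (!Number.isInteger(id) && typeof id !== 'string') {
    throw new TypeError('id must be an integer or a string');
  }
  // JSON.stringify escapes newlines inside strings, so the only raw
  // newline on the wire is the terminator appended here.
  return JSON.stringify({ method, params, id }) + '\n';
}
```

The trailing newline is what delimits messages, so each request must be written as exactly one line.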

---

## Prerequisites

Before installing Claude KVM, ensure you have the following:

| Requirement | Details |
|-------------|---------|
| **macOS** | Apple Silicon (aarch64) |
| **Node.js** | LTS version recommended |
| **VNC Server** | Running on the target remote Mac |
| **Homebrew** | For daemon installation |

Source: [LAUNCHGUIDE.md](https://github.com/ARAS-Workspace/claude-kvm/blob/main/LAUNCHGUIDE.md)

### VNC Server Setup on Target Mac

For best results on bare-metal Macs, configure your VNC server with SSH tunneling for security and stability. Refer to the [Mac M1 Preparation Tricks](https://gist.github.com/remrearas/a3f300635b02f2587a134882a51f7114) guide for detailed setup instructions.

Source: [README.md](https://github.com/ARAS-Workspace/claude-kvm/blob/main/README.md)

---

## Installation

### Step 1: Install the VNC Daemon

Install the native daemon using Homebrew:

```bash
brew tap ARAS-Workspace/tap
brew install claude-kvm-daemon
```

The daemon is compiled on GitHub Actions, code-signed, and notarized. The Homebrew installation follows the latest release.

Source: [README_TR.md](https://github.com/ARAS-Workspace/claude-kvm/blob/main/README_TR.md)

### Step 2: Install the MCP Proxy

Install the MCP proxy via npm:

```bash
npm install -g claude-kvm
```

Or add to your project:

```bash
npm install claude-kvm
```

Source: [package.json](https://github.com/ARAS-Workspace/claude-kvm/blob/main/package.json)

### Step 3: Configure MCP Server

Create a `.mcp.json` file in your project directory:

```json
{
  "mcpServers": {
    "claude-kvm": {
      "command": "npx",
      "args": ["-y", "claude-kvm"],
      "env": {
        "VNC_HOST": "192.168.1.100",
        "VNC_PORT": "5900",
        "VNC_USERNAME": "user",
        "VNC_PASSWORD": "pass",
        "CLAUDE_KVM_DAEMON_PATH": "/opt/homebrew/bin/claude-kvm-daemon"
      }
    }
  }
}
```

Source: [README_TR.md](https://github.com/ARAS-Workspace/claude-kvm/blob/main/README_TR.md)

---

## Configuration Reference

### Environment Variables

| Variable | Required | Description |
|----------|----------|-------------|
| `VNC_HOST` | Yes | VNC server hostname or IP |
| `VNC_PORT` | Yes | VNC server port |
| `VNC_USERNAME` | No | VNC/ARD username for authentication |
| `VNC_PASSWORD` | Yes | VNC/ARD password |
| `CLAUDE_KVM_DAEMON_PATH` | No | Path to daemon binary (auto-detected if omitted) |
| `CLAUDE_KVM_MAX_DIMENSION` | No | Max screen dimension in pixels (default: 1280) |
| `CLAUDE_KVM_NO_RECONNECT` | No | Disable auto-reconnection |
| `CLAUDE_KVM_VERBOSE` | No | Enable verbose logging |

Source: [server.json](https://github.com/ARAS-Workspace/claude-kvm/blob/main/server.json)

### Authentication Methods

Claude KVM supports two VNC authentication methods:

| Method | Description |
|--------|-------------|
| **VNC Auth** | Password-based challenge-response (DES) |
| **ARD** | Apple Remote Desktop (Diffie-Hellman + AES-128-ECB) |

When ARD authentication is detected, Meta keys are automatically remapped to Super for macOS Command key compatibility.

Source: [README.md](https://github.com/ARAS-Workspace/claude-kvm/blob/main/README.md)

---

## Available Tools

Claude KVM exposes two MCP tools for desktop control.

### vnc_command

The primary tool for all VNC operations. Actions are categorized into **Screen**, **Mouse**, **Keyboard**, **Detection**, **Configuration**, and **Control**.

Source: [tools/index.js](https://github.com/ARAS-Workspace/claude-kvm/blob/main/tools/index.js)

### action_queue

Batch up to 20 sequential actions in a single tool call for efficient multi-step workflows.

Source: [LAUNCHGUIDE.md](https://github.com/ARAS-Workspace/claude-kvm/blob/main/LAUNCHGUIDE.md)
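
Because a queue holds at most 20 actions, longer sequences have to be split across several calls. The helper below is a hypothetical client-side sketch of that splitting; only the limit of 20 comes from the description above.

```javascript
// The limit of 20 comes from the description above; `chunkActions` is a
// hypothetical client-side helper, not part of the claude-kvm API.
const MAX_QUEUE_LENGTH = 20;

function chunkActions(actions, size = MAX_QUEUE_LENGTH) {
  const batches = [];
  for (let start = 0; start < actions.length; start += size) {
    batches.push(actions.slice(start, start + size));
  }
  return batches;
}
```

Each returned batch preserves the original order, so sending the batches one after another keeps the overall sequence intact.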

---

## Basic Workflows

### Workflow: Screenshot and Analyze

The recommended pattern for desktop automation:

```
screenshot → analyze → act → verify
```

Source: [tools/index.js](https://github.com/ARAS-Workspace/claude-kvm/blob/main/tools/index.js)

**Example sequence:**

1. Take a screenshot to see the current state:
   ```
   vnc_command({ "action": "screenshot" })
   ```

2. Analyze the image to understand the UI

3. Interact with elements using mouse or keyboard

4. Take another screenshot to verify the result

### Workflow: Click and Type

Navigate to an element and input text:

```json
{
  "action": "action_queue",
  "actions": [
    { "action": "mouse_click", "x": 640, "y": 400 },
    { "action": "key_type", "text": "Hello World" }
  ]
}
```

### Workflow: Detect and Click

Use OCR to find text elements and click them:

1. Call `detect_elements` to get all text with bounding boxes
2. Parse the response for your target text and coordinates
3. Click the element using the returned x/y coordinates

---

## Coordinate System

The VNC server's native resolution is scaled down so that its longer side fits within `--max-dimension` (default: 1280 px, also listed above as `CLAUDE_KVM_MAX_DIMENSION`). Claude works in scaled coordinates, and the daemon converts them to native coordinates automatically.

```
Native:  4220 x 2568  (VNC server framebuffer)
Scaled:  1280 x 779   (what Claude sees and targets)

mouse_click(640, 400) → VNC receives ≈ (2110, 1319)
```

Source: [README.md](https://github.com/ARAS-Workspace/claude-kvm/blob/main/README.md)
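
The arithmetic behind this mapping can be sketched as below. It assumes uniform scaling that fits the longer side to the maximum dimension and rounds to the nearest pixel; the daemon's exact rounding is not documented here, and both function names are illustrative.

```javascript
// Sketch of uniform down-scaling, assuming the longer side is fitted to
// maxDimension and results are rounded to the nearest pixel.
function scaledSize(nativeWidth, nativeHeight, maxDimension = 1280) {
  const factor = Math.min(1, maxDimension / Math.max(nativeWidth, nativeHeight));
  return {
    width: Math.round(nativeWidth * factor),
    height: Math.round(nativeHeight * factor),
  };
}

// Map a point in scaled space back to the native framebuffer.
function toNative(x, y, native, scaled) {
  return {
    x: Math.round((x * native.width) / scaled.width),
    y: Math.round((y * native.height) / scaled.height),
  };
}

const native = { width: 4220, height: 2568 };
const scaled = scaledSize(native.width, native.height); // { width: 1280, height: 779 }
const target = toNative(640, 400, native, scaled);      // { x: 2110, y: 1319 }
```

Under these assumptions the scale factor is roughly 3.3 on both axes, so a scaled y of 400 lands near native row 1319.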

---

## Timing Parameters

All timing parameters can be configured at runtime via the `configure` action or set in the MCP proxy configuration.

### Default Timing Values

| Parameter | Default | Description |
|-----------|---------|-------------|
| `click_hold_ms` | 50 | Mouse button press duration |
| `double_click_gap_ms` | 50 | Gap between double-click events |
| `hover_settle_ms` | 400 | Wait time after hover |
| `drag_position_ms` | 30 | Wait before drag start |
| `drag_press_ms` | 50 | Drag button press duration |
| `drag_step_ms` | 5 | Delay between drag interpolation steps |
| `drag_settle_ms` | 30 | Wait after releasing drag |
| `drag_pixels_per_step` | 20 | Interpolation density |
| `drag_min_steps` | 10 | Minimum interpolation steps |
| `scroll_press_ms` | 10 | Scroll press duration |
| `scroll_tick_ms` | 20 | Delay between scroll ticks |
| `key_hold_ms` | 30 | Key press duration |
| `combo_mod_ms` | 10 | Modifier key settle delay |
| `type_key_ms` | 20 | Key press during typing |
| `type_inter_key_ms` | 20 | Delay between characters |
| `type_shift_ms` | 10 | Shift key settle delay |
| `paste_settle_ms` | 30 | Wait after paste |

Source: [README_TR.md](https://github.com/ARAS-Workspace/claude-kvm/blob/main/README_TR.md)
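
To show how `drag_pixels_per_step` and `drag_min_steps` might interact, the sketch below interpolates a drag path. The formula (take the larger of the minimum step count and the distance divided by the pixels per step) is an assumption made for illustration; this manual does not give the daemon's actual formula.

```javascript
// Assumed relationship between the two drag parameters; the daemon's actual
// formula is not given in this manual.
function dragPoints(from, to, pixelsPerStep = 20, minSteps = 10) {
  const distance = Math.hypot(to.x - from.x, to.y - from.y);
  const steps = Math.max(minSteps, Math.ceil(distance / pixelsPerStep));
  const points = [];
  for (let i = 1; i <= steps; i += 1) {
    const t = i / steps; // fraction of the path covered after step i
    points.push({
      x: Math.round(from.x + (to.x - from.x) * t),
      y: Math.round(from.y + (to.y - from.y) * t),
    });
  }
  return points;
}
```

With the defaults, a 500-pixel drag yields 25 intermediate positions, while a 50-pixel drag still yields the minimum of 10. With `drag_step_ms` at 5, those would add about 125 ms and 50 ms of stepping delay respectively, again under the assumed formula.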

### Runtime Configuration

Adjust timing at runtime:

```json
{
  "method": "configure",
  "params": {
    "click_hold_ms": 80,
    "key_hold_ms": 50
  }
}
```

Reset to defaults:

```json
{
  "method": "configure",
  "params": {
    "reset": true
  }
}
```

Source: [README_TR.md](https://github.com/ARAS-Workspace/claude-kvm/blob/main/README_TR.md)

---

## Common Issues and Troubleshooting

### VNC Connection Drops During Long Sessions

> [!CAUTION]
> **Known Issue**: During extended test sessions (~50 turns), the VNC connection may drop and the daemon becomes unresponsive. SSH to the remote Mac remains active, confirming the issue is on the VNC/ARD server side—not the daemon or network layer.

**Community Discussion**: [Issue #8](https://github.com/ARAS-Workspace/claude-kvm/issues/8)

**Workarounds:**

1. Use SSH tunneling for VNC to improve connection stability
2. Leave `CLAUDE_KVM_NO_RECONNECT` unset so that auto-reconnection stays enabled
3. Reduce session length and restart the daemon periodically
4. Adjust VNC server settings on the target Mac for longer timeouts

### Daemon Path Not Found

If the daemon isn't found automatically, explicitly set the path:

```json
{
  "env": {
    "CLAUDE_KVM_DAEMON_PATH": "/opt/homebrew/bin/claude-kvm-daemon"
  }
}
```

### Coordinate Mismatch

If clicks miss their targets, verify that the display scaling matches your expectations. Use the `health` action to check the current display dimensions:

```json
{
  "action": "health"
}
```

---

## Verifying Installation

### Test the Setup

Run a basic health check:

```json
{
  "action": "health"
}
```

The expected response includes the connection status and display information (`scaledWidth`, `scaledHeight`).

### Run Integration Tests

The project publishes CI runs of its live tests, which you can use as a reference:

- [Mac Integration Test](https://github.com/ARAS-Workspace/claude-kvm/actions/runs/22261487249)
- [Mac Calculator Test](https://github.com/ARAS-Workspace/claude-kvm/actions/runs/22261139721)

Source: [README_TR.md](https://github.com/ARAS-Workspace/claude-kvm/blob/main/README_TR.md)

---

## Next Steps

| Topic | Description |
|-------|-------------|
| [README.md](https://github.com/ARAS-Workspace/claude-kvm/blob/main/README.md) | Complete feature documentation and architecture |
| [LAUNCHGUIDE.md](https://github.com/ARAS-Workspace/claude-kvm/blob/main/LAUNCHGUIDE.md) | Detailed launch configuration options |
| [Mac M1 Preparation Tricks](https://gist.github.com/remrearas/a3f300635b02f2587a134882a51f7114) | VNC security and stability tips |

---

## Summary

| Component | Version | Purpose |
|-----------|---------|---------|
| claude-kvm (npm) | 2.0.11 | MCP proxy server |
| claude-kvm-daemon | 1.0.1 | Native VNC control daemon |

Getting started requires installing both components, configuring environment variables for your VNC server, and using the `vnc_command` tool to interact with the remote desktop. Remember that all coordinates work in scaled space, and for long-running sessions, consider implementing connection recovery strategies.

---

<a id='system-architecture'></a>

## System Architecture

### Related Pages

Related topics: [PC (Procedure Call) Protocol](#pc-protocol), [Coordinate Scaling](#coordinate-scaling)

<details>
<summary>Related Source Files</summary>

The following source files were used to generate this page:

- [README.md](https://github.com/ARAS-Workspace/claude-kvm/blob/main/README.md)
- [index.js](https://github.com/ARAS-Workspace/claude-kvm/blob/main/index.js)
- [tools/index.js](https://github.com/ARAS-Workspace/claude-kvm/blob/main/tools/index.js)
- [server.json](https://github.com/ARAS-Workspace/claude-kvm/blob/main/server.json)
- [jsconfig.json](https://github.com/ARAS-Workspace/claude-kvm/blob/main/jsconfig.json)
</details>

## Overview

Claude KVM is a Model Context Protocol (MCP) server that enables Claude to control remote macOS desktops via VNC. The system bridges Claude's AI capabilities with low-level desktop automation by combining a JavaScript-based MCP proxy with a native Swift daemon.

**Purpose and Scope:**

- Remote desktop automation through VNC protocol
- On-device OCR using Apple Vision framework
- Action orchestration via the MCP protocol
- Cross-platform control (macOS target, any Claude host)
- Native performance on Apple Silicon hardware

**Repository:** [ARAS-Workspace/claude-kvm](https://github.com/ARAS-Workspace/claude-kvm)  
**Current Version:** 2.0.11 (MCP Server) | 1.0.1 (Daemon)

Source: [server.json:14](https://github.com/ARAS-Workspace/claude-kvm/blob/main/server.json)

## High-Level Architecture

The system follows a layered architecture separating protocol handling from low-level VNC operations.

```mermaid
graph TD
    subgraph "Claude Host"
        A[Claude AI]
    end
    
    subgraph "MCP Proxy (Node.js)"
        B[MCP SDK Server]
        C[Daemon Lifecycle Manager]
        D[PC Protocol Encoder/Decoder]
    end
    
    subgraph "VNC Daemon (Swift/C)"
        E[VNC Client]
        F[Input Injector]
        G[Screen Capture]
        H[Apple Vision OCR]
    end
    
    subgraph "Remote Target (macOS)"
        I[VNC Server]
        J[Remote Desktop]
    end
    
    A --> B
    B <--> C
    C <--> D
    D <--> E
    E <--> I
    F --> J
    G --> J
    H --> J
```

**Key Architectural Principle:** The MCP proxy handles JSON-RPC communication with Claude, while the native daemon performs all VNC operations. This separation ensures that AI-related processing remains on the Claude host while performance-critical desktop operations execute natively on Apple Silicon.

## Component Layers

### Layer Comparison

| Layer | Language | Role | Communication |
|-------|----------|------|---------------|
| **MCP Proxy** | JavaScript (Node.js) | Claude protocol handling, daemon lifecycle, action batching | stdio JSON-RPC |
| **VNC Daemon** | Swift/C (Apple Silicon) | VNC connection, screen capture, input injection, OCR | stdin/stdout NDJSON |

Source: [README.md:1-30](https://github.com/ARAS-Workspace/claude-kvm/blob/main/README.md)

---

## MCP Proxy Layer

**File:** `index.js`

The MCP proxy is the entry point for the entire system. It initializes the MCP SDK server and manages the daemon lifecycle.

### Responsibilities

1. **Daemon Spawning** — Launches the `claude-kvm-daemon` binary as a child process
2. **Protocol Translation** — Converts MCP tool calls to PC protocol requests
3. **Lifecycle Management** — Handles daemon startup, health monitoring, and graceful shutdown
4. **Action Batching** — Supports queuing up to 20 sequential actions in a single call

### Entry Point

```javascript
#!/usr/bin/env node
// index.js

import { spawn } from 'node:child_process';
import { McpServer } from '@modelcontextprotocol/sdk/server/mcp.js';
import { StdioServerTransport } from '@modelcontextprotocol/sdk/server/stdio.js';
import { vncCommandTool, actionQueueTool, controlTools } from './tools/index.js';
```

Source: [index.js:1-25](https://github.com/ARAS-Workspace/claude-kvm/blob/main/index.js)

### Configuration

The proxy reads configuration from environment variables:

| Variable | Required | Description |
|----------|----------|-------------|
| `VNC_HOST` | Yes | VNC server hostname or IP |
| `VNC_PORT` | Yes | VNC server port |
| `VNC_USERNAME` | No | VNC/ARD username for authentication |
| `VNC_PASSWORD` | No | VNC/ARD password |
| `CLAUDE_KVM_DAEMON_PATH` | No | Custom path to daemon binary |

Source: [server.json:18-35](https://github.com/ARAS-Workspace/claude-kvm/blob/main/server.json)

### Module Configuration

```json
{
  "compilerOptions": {
    "module": "ESNext",
    "moduleResolution": "node",
    "target": "ESNext",
    "checkJs": true,
    "strict": false,
    "baseUrl": ".",
    "types": ["node"]
  }
}
```

Source: [jsconfig.json:1-10](https://github.com/ARAS-Workspace/claude-kvm/blob/main/jsconfig.json)

---

## VNC Daemon Layer

**Component:** `claude-kvm-daemon`

The native daemon is built for Apple Silicon and handles all low-level operations.

### Responsibilities

1. **VNC Connection Management** — Establishes and maintains RFB protocol connection
2. **Framebuffer Operations** — Captures screen at native resolution
3. **Input Injection** — Sends mouse and keyboard events to the VNC server
4. **OCR Processing** — Uses Apple Vision framework for text detection
5. **Coordinate Transformation** — Maps between scaled and native coordinate spaces

### Build Configuration

The daemon uses static linking with LibVNC for consistent behavior across macOS versions. Releases include:

- **Homebrew distribution:** `.tar.gz` archive
- **Notarized installer:** `.dmg` disk image (for direct download)

Source: [Daemon release v1.0.1](https://github.com/ARAS-Workspace/claude-kvm/releases/tag/daemon-v1.0.1)

---

## PC (Procedure Call) Protocol

Communication between the MCP proxy and daemon uses the PC protocol over NDJSON (Newline-Delimited JSON).

### Message Types

| Type | Direction | Format |
|------|-----------|--------|
| **Request** | Proxy → Daemon | `{"method":"<name>","params":{...},"id":<int\|string>}` |
| **Response** | Daemon → Proxy | `{"result":{...},"id":<int\|string>}` |
| **Error** | Daemon → Proxy | `{"error":{"code":<int>,"message":"..."},"id":<int\|string>}` |
| **Notification** | Bidirectional | `{"method":"<name>","params":{...}}` |

Source: [README.md:35-45](https://github.com/ARAS-Workspace/claude-kvm/blob/main/README.md)
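
The four message types in the table can be told apart purely by which fields are present. The sketch below shows one way to do that for an already-parsed message; `classifyMessage` is an illustrative name, not a function from the repository.

```javascript
// Classify a parsed PC-protocol message by the fields listed in the table.
function classifyMessage(message) {
  const hasId = message.id !== undefined && message.id !== null;
  if (typeof message.method === 'string') {
    // A method with an id expects a reply; without one it is fire-and-forget.
    return hasId ? 'request' : 'notification';
  }
  if (hasId && 'error' in message) return 'error';
  if (hasId && 'result' in message) return 'response';
  throw new Error('Unrecognized PC message shape');
}
```

Checking for `method` first matters because requests and notifications differ only in whether an `id` is attached.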

### Example Communication Flow

```mermaid
sequenceDiagram
    participant Claude
    participant MCP as MCP Proxy
    participant Daemon as VNC Daemon
    participant VNC as VNC Server
    
    Claude->>MCP: vnc_command(screenshot)
    MCP->>Daemon: {"method":"screenshot","params":{},"id":1}
    Daemon->>VNC: Request framebuffer
    VNC->>Daemon: Raw pixel data
    Daemon->>MCP: {"result":{"image":"base64..."},"id":1}
    MCP->>Claude: Screenshot result
```
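
The `id` field is what ties each daemon reply to the request that caused it. A minimal sketch of that correlation follows, assuming a `send` function that writes one line to the daemon's stdin; `createClient` and its methods are hypothetical names used only for this illustration.

```javascript
// Sketch of id-based request/response matching. `send` stands in for a
// function that writes one line to the daemon's stdin.
function createClient(send) {
  let nextId = 1;
  const pending = new Map(); // id -> callback(error, result)

  return {
    call(method, params, callback) {
      const id = nextId;
      nextId += 1;
      pending.set(id, callback);
      send(JSON.stringify({ method, params, id }) + '\n');
      return id;
    },
    // Feed each complete line read from the daemon's stdout.
    handleLine(line) {
      const message = JSON.parse(line);
      const callback = pending.get(message.id);
      if (!callback) return false; // notification or unknown id
      pending.delete(message.id);
      if (message.error) {
        callback(new Error(message.error.message));
      } else {
        callback(null, message.result);
      }
      return true;
    },
  };
}
```

Keeping pending callbacks in a map means replies may arrive in any order, and a line without a matching `id` is simply reported as unhandled.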

### Action Methods

The daemon exposes the following methods:

| Method | Description |
|--------|-------------|
| `screenshot` | Full screen PNG capture |
| `cursor_crop` | Crop around cursor with crosshair overlay |
| `diff_check` | Detect screen changes since baseline |
| `set_baseline` | Set current screen as diff reference |
| `mouse_click` | Click at coordinates (with optional button) |
| `mouse_double_click` | Double click at coordinates |
| `mouse_move` | Move cursor to coordinates |
| `hover` | Move cursor and wait for settle |
| `nudge` | Relative cursor movement |
| `mouse_drag` | Drag from start to end coordinates |
| `scroll` | Scroll direction (up\|down\|left\|right) |
| `key_tap` | Single key press |
| `key_combo` | Modifier combo (e.g., "cmd+c") |
| `key_type` | Character-by-character typing |
| `paste` | Paste via clipboard |
| `detect_elements` | OCR text detection with bounding boxes |
| `configure` | Runtime parameter adjustment |
| `get_timing` | Query current parameters |
| `wait` | Pause execution |
| `health` | Connection and display status |
| `shutdown` | Graceful daemon exit |

Source: [tools/index.js:1-30](https://github.com/ARAS-Workspace/claude-kvm/blob/main/tools/index.js)

---

## Coordinate Scaling

The VNC server's native resolution is scaled down to fit within `--max-dimension` (default: 1280 px). Claude operates in this scaled space, and the daemon translates every coordinate back to native resolution before sending it to the VNC server.

### Scaling Flow

```
Native Resolution:    4220 × 2568 pixels (VNC server framebuffer)
Scaled Resolution:    1280 × 779 pixels  (what Claude sees)

User Command:        mouse_click(640, 400)
Translated:          VNC receives ≈ (2110, 1319)
```

Source: [README.md:50-60](https://github.com/ARAS-Workspace/claude-kvm/blob/main/README.md)

### Scaling Parameters

| Parameter | Default | Description |
|-----------|---------|-------------|
| `max_dimension` | `1280` | Maximum display dimension in scaled space |
| `cursor_crop_radius` | `150` | Radius for cursor crop action |

Source: [README.md:80-85](https://github.com/ARAS-Workspace/claude-kvm/blob/main/README.md)

---

## Authentication

The system supports multiple VNC authentication methods with automatic detection.

### Supported Methods

| Method | Protocol | Notes |
|--------|----------|-------|
| **VNC Auth** | DES challenge-response | Standard VNC password authentication |
| **ARD (Apple Remote Desktop)** | Diffie-Hellman + AES-128-ECB | For macOS targets |

Source: [README.md:105-110](https://github.com/ARAS-Workspace/claude-kvm/blob/main/README.md)

### macOS Detection

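When the daemon negotiates ARD authentication (RFB security type 30), it treats the target as macOS and remaps Meta keys to Super so that the Command key works. The daemon itself is native code; the JavaScript below only illustrates the rule, using the standard X11 keysym values:

```javascript
// Illustration only: the real remapping happens inside the native daemon.
const ARD_AUTH_TYPE = 30;
// X11 keysyms: Meta_L/Meta_R map to Super_L/Super_R.
const META_TO_SUPER = { 0xffe7: 0xffeb, 0xffe8: 0xffec };
function remapKeysym(authType, keysym) {
  // Only ARD sessions are remapped; other auth types pass keys through.
  return authType === ARD_AUTH_TYPE ? (META_TO_SUPER[keysym] ?? keysym) : keysym;
}
```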

Source: [README.md:108-115](https://github.com/ARAS-Workspace/claude-kvm/blob/main/README.md)

---

## Timing Parameters

All timing parameters can be configured at runtime via the `configure` action.

### Default Values

| Parameter | Default | Description |
|-----------|---------|-------------|
| `click_hold_ms` | `50` | Mouse press duration |
| `double_click_gap_ms` | `50` | Gap between double-click events |
| `hover_settle_ms` | `400` | Wait after hover before next action |
| `drag_position_ms` | `30` | Wait before drag start |
| `drag_press_ms` | `50` | Initial drag press duration |
| `drag_step_ms` | `5` | Delay between interpolation steps |
| `drag_settle_ms` | `30` | Wait after drag release |
| `drag_pixels_per_step` | `20` | Pixel density per step |
| `drag_min_steps` | `10` | Minimum interpolation steps |
| `scroll_press_ms` | `10` | Scroll press duration |
| `scroll_tick_ms` | `20` | Delay between scroll ticks |
| `key_hold_ms` | `30` | Key press duration |
| `combo_mod_ms` | `10` | Modifier key settle delay |
| `type_key_ms` | `20` | Character key duration |
| `type_inter_key_ms` | `20` | Delay between characters |
| `type_shift_ms` | `10` | Shift key settle delay |
| `paste_settle_ms` | `30` | Wait after clipboard paste |

Source: [README.md:70-80](https://github.com/ARAS-Workspace/claude-kvm/blob/main/README.md)

### Runtime Configuration

```json
{"method":"configure","params":{"click_hold_ms":80,"key_hold_ms":50}}
```

```json
{"result":{"detail":"OK — changed: click_hold_ms, key_hold_ms"}}
```

Reset to defaults:

```json
{"method":"configure","params":{"reset":true}}
```

Source: [README.md:55-70](https://github.com/ARAS-Workspace/claude-kvm/blob/main/README.md)

---

## OCR Detection

The daemon uses the Apple Vision framework for on-device text detection, so OCR incurs no API cost and takes approximately 50 ms.

### Output Format

```json
{
  "method": "detect_elements",
  "result": {
    "detail": "13 elements",
    "elements": [
      {"confidence": 1, "text": "Finder", "w": 32, "h": 9, "x": 37, "y": 6},
      {"confidence": 1, "text": "File", "w": 15, "h": 9, "x": 84, "y": 6}
    ],
    "scaledHeight": 717,
    "scaledWidth": 1280
  }
}
```

Source: [README.md:1-25](https://github.com/ARAS-Workspace/claude-kvm/blob/main/README.md)
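
Given output in the shape above, a caller can look up an element by its text and derive a click point. The sketch below assumes that `x`/`y` mark the top-left corner of each bounding box in scaled coordinates, which this manual does not state explicitly, so verify it against real output. `findClickTarget` is a hypothetical helper.

```javascript
// Assumption: x/y is the top-left corner of each bounding box, in scaled
// coordinates. Verify this against real output before relying on it.
function findClickTarget(result, text) {
  const wanted = text.toLowerCase();
  const element = result.elements.find((e) => e.text.toLowerCase() === wanted);
  if (!element) return null;
  // Aim for the center of the box rather than its corner.
  return {
    x: Math.round(element.x + element.w / 2),
    y: Math.round(element.y + element.h / 2),
  };
}
```

The returned point is already in scaled space, so it can be passed directly to `mouse_click` without further conversion.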

---

## Data Flow Summary

```mermaid
graph LR
    A[Claude] -->|MCP Tool Call| B[MCP Proxy]
    B -->|NDJSON Request| C[VNC Daemon]
    C -->|RFB Protocol| D[VNC Server]
    D -->|Pixel Data| C
    C -->|Processed Image| B
    B -->|MCP Response| A
    
    C -->|Apple Vision| E[OCR Results]
    C -->|Input Events| F[Mouse/Keyboard]
```

---

## Known Limitations

### VNC Connection Stability

A documented issue affects long-running test sessions:

> During extended test sessions (~50 turns), the VNC connection drops and the daemon becomes unresponsive. SSH to the remote Mac remains active, confirming the issue is on the VNC/ARD server side — not the daemon or network layer.

**Reference:** [GitHub Issue #8](https://github.com/ARAS-Workspace/claude-kvm/issues/8)

**Workaround:** For bare-metal Mac deployments, refer to the [Mac M1 Preparation Tricks](https://gist.github.com/remrearas/a3f300635b02f2587a134882a51f7114) guide for VNC hardening and SSH tunneling recommendations.

---

## Architecture Files Reference

| File | Purpose |
|------|---------|
| `index.js` | MCP proxy entry point, daemon spawning, tool registration |
| `tools/index.js` | Tool definitions with Zod schemas for all VNC actions |
| `server.json` | MCP server metadata and package configuration |
| `jsconfig.json` | JavaScript language-service settings (editor type checking via `checkJs`) |

---

<a id='pc-protocol'></a>

## PC (Procedure Call) Protocol

### Related Pages

Related topics: [System Architecture](#system-architecture)

<details>
<summary>Related Source Files</summary>

The following source files were used to generate this page:

- [README.md](https://github.com/ARAS-Workspace/claude-kvm/blob/main/README.md)
- [tools/index.js](https://github.com/ARAS-Workspace/claude-kvm/blob/main/tools/index.js)
- [index.js](https://github.com/ARAS-Workspace/claude-kvm/blob/main/index.js)
- [server.json](https://github.com/ARAS-Workspace/claude-kvm/blob/main/server.json)
- [README_TR.md](https://github.com/ARAS-Workspace/claude-kvm/blob/main/README_TR.md)
- [LAUNCHGUIDE.md](https://github.com/ARAS-Workspace/claude-kvm/blob/main/LAUNCHGUIDE.md)
</details>

# PC (Procedure Call) Protocol

## Overview

The **PC (Procedure Call) Protocol** is the bidirectional, JSON-RPC-style communication layer between the MCP Proxy (Node.js) and the VNC Daemon (Swift/C). Messages travel over the daemon's stdin/stdout as NDJSON (newline-delimited JSON): the proxy writes requests to the daemon's stdin and reads responses from its stdout. This is the layer that turns Claude's MCP tool calls into VNC operations on the remote desktop.

**Source:** [README.md](https://github.com/ARAS-Workspace/claude-kvm/blob/main/README.md)

## Architecture

The PC Protocol sits between two distinct layers:

| Layer | Language | Role |
|-------|----------|------|
| **MCP Proxy** | JavaScript (Node.js) | Communicates with Claude over MCP protocol, manages daemon lifecycle |
| **VNC Daemon** | Swift/C (Apple Silicon) | VNC connection, screen capture, mouse/keyboard input injection |

**Source:** [README.md](https://github.com/ARAS-Workspace/claude-kvm/blob/main/README.md)

```mermaid
graph LR
    subgraph Proxy["MCP Proxy Layer"]
        A[Claude Desktop]
        B[MCP Protocol Handler]
    end
    
    subgraph Daemon["VNC Daemon Layer"]
        C[PC Protocol Parser]
        D[VNC Client Module]
        E[Input Injection Module]
        F[Screen Capture Module]
    end
    
    subgraph Target["VNC Server"]
        G[VNC Server :5900]
        H[Remote Desktop]
    end
    
    A -->|"MCP JSON-RPC"| B
    B -->|"PC Protocol NDJSON"| C
    C --> D
    D <-->|"RFB Protocol"| G
    E --> G
    F --> G
    G --> H
    
    style Proxy fill:#1a1a2e,stroke:#16213e,color:#e5e5e5
    style Daemon fill:#0f3460,stroke:#533483,color:#e5e5e5
    style Target fill:#1a1a2e,stroke:#e94560,color:#e5e5e5
```

## Message Format

All PC Protocol messages are JSON objects separated by newlines (NDJSON). Each message must be on its own line.

**Source:** [README.md](https://github.com/ARAS-Workspace/claude-kvm/blob/main/README.md)
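
The framing rule above can be sketched as a small line splitter. This is an illustrative sketch, not code from the repository: `createNdjsonDecoder` and `encodeNdjson` are hypothetical names, and the payloads are made up for the demo. The decoder buffers incoming text because a chunk read from stdout may end in the middle of a line.

```javascript
// Hypothetical helper: turns a stream of text chunks into parsed NDJSON messages.
function createNdjsonDecoder(onMessage) {
  let buffer = '';
  return function push(chunk) {
    buffer += chunk;
    let newline;
    // Emit every complete line; keep the trailing partial line buffered.
    while ((newline = buffer.indexOf('\n')) !== -1) {
      const line = buffer.slice(0, newline).trim();
      buffer = buffer.slice(newline + 1);
      if (line.length > 0) onMessage(JSON.parse(line));
    }
  };
}

// Encoding is the inverse: one JSON object per line.
function encodeNdjson(message) {
  return JSON.stringify(message) + '\n';
}

// Usage: a message split across two chunks is delivered once it is complete.
const received = [];
const push = createNdjsonDecoder((msg) => received.push(msg));
push('{"method":"health","id":1}\n{"resu');
push('lt":{"detail":"OK"},"id":1}\n');
```

Because `JSON.stringify` never emits a raw newline, each encoded message is guaranteed to occupy exactly one line.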

### Request Message

```json
{"method":"<name>","params":{...},"id":<int|string>}
```

| Field | Type | Required | Description |
|-------|------|----------|-------------|
| `method` | string | Yes | The procedure name to invoke |
| `params` | object | No | Named parameters for the procedure |
| `id` | int, string, or null | Yes | Request identifier for correlation with response |

### Response Message

```json
{"result":{...},"id":<int|string>}
```

| Field | Type | Required | Description |
|-------|------|----------|-------------|
| `result` | object | Yes | The result data returned by the procedure |
| `id` | int, string, or null | Yes | Must match the request ID |

### Error Message

```json
{"error":{"code":<int>,"message":"..."},"id":<int|string>}
```

| Field | Type | Required | Description |
|-------|------|----------|-------------|
| `error.code` | integer | Yes | Numeric error code |
| `error.message` | string | Yes | Human-readable error description |
| `id` | int, string, or null | Yes | Must match the request ID |

### Notification Message

```json
{"method":"<name>","params":{...}}
```

| Field | Type | Required | Description |
|-------|------|----------|-------------|
| `method` | string | Yes | The notification event name |
| `params` | object | No | Event-specific data |

**Source:** [README.md](https://github.com/ARAS-Workspace/claude-kvm/blob/main/README.md)
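
The four message shapes can be told apart by which fields are present. The classifier below is a minimal sketch under that assumption; `classifyMessage` is a hypothetical name, and the error code in the example is invented for illustration.

```javascript
// Hypothetical classifier based on the four message shapes documented above.
function classifyMessage(msg) {
  const hasId = Object.prototype.hasOwnProperty.call(msg, 'id');
  if ('error' in msg) return 'error';
  if ('result' in msg) return 'response';
  // A method with an id expects a reply; without an id it is a notification.
  if (typeof msg.method === 'string') return hasId ? 'request' : 'notification';
  return 'unknown';
}

const kinds = [
  classifyMessage({ method: 'screenshot', id: 7 }),
  classifyMessage({ result: { detail: 'OK' }, id: 7 }),
  classifyMessage({ error: { code: -1, message: 'example failure' }, id: 7 }),
  classifyMessage({ method: 'shutdown' }),
];
// kinds is ['request', 'response', 'error', 'notification']
```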

## Supported Procedures

### Screen Actions

| Procedure | Parameters | Description | Latency |
|-----------|------------|-------------|---------|
| `screenshot` | — | Full screen capture as PNG | ~200ms |
| `cursor_crop` | — | Crop around cursor with crosshair overlay | — |
| `diff_check` | — | Detect screen changes since baseline | — |
| `set_baseline` | — | Save current screen as diff reference | — |

**Source:** [tools/index.js](https://github.com/ARAS-Workspace/claude-kvm/blob/main/tools/index.js)

### Mouse Actions

| Procedure | Parameters | Description |
|-----------|------------|-------------|
| `mouse_click` | `x`, `y`, `button?` | Click at coordinates (left/right/middle) |
| `mouse_double_click` | `x`, `y` | Double click at coordinates |
| `mouse_move` | `x`, `y` | Move cursor to position |
| `hover` | `x`, `y` | Move cursor + settle wait |
| `nudge` | `dx`, `dy` | Relative cursor movement (±50px) |
| `mouse_drag` | `x`, `y`, `toX`, `toY` | Drag from start to end coordinates |
| `scroll` | `x`, `y`, `direction`, `amount?` | Scroll (up/down/left/right) |

**Source:** [tools/index.js](https://github.com/ARAS-Workspace/claude-kvm/blob/main/tools/index.js)
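
Since `nudge` is limited to ±50px per call, a larger relative move has to be split into several calls. The helper below is a sketch of one way to do that; `splitNudge` is a hypothetical name, and the `{ action, dx, dy }` argument shape is assumed from the `vnc_command` examples elsewhere in this manual.

```javascript
const NUDGE_LIMIT = 50; // per-call limit from the table above

// Hypothetical helper: break (dx, dy) into nudge calls that each stay within the limit.
function splitNudge(dx, dy, limit = NUDGE_LIMIT) {
  const longest = Math.max(Math.abs(dx), Math.abs(dy));
  const steps = Math.max(1, Math.ceil(longest / limit));
  const calls = [];
  let sentX = 0;
  let sentY = 0;
  for (let i = 1; i <= steps; i++) {
    // Cumulative offset that should have been sent after step i, in whole pixels.
    const targetX = Math.round((dx * i) / steps);
    const targetY = Math.round((dy * i) / steps);
    calls.push({ action: 'nudge', dx: targetX - sentX, dy: targetY - sentY });
    sentX = targetX;
    sentY = targetY;
  }
  return calls;
}

const nudges = splitNudge(120, -30); // three calls of dx 40, dy -10
```

Tracking the cumulative target rather than rounding each step independently keeps rounding errors from accumulating, so the steps always sum to the requested offset.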

### Keyboard Actions

| Procedure | Parameters | Description |
|-----------|------------|-------------|
| `key_tap` | `key` | Single key press (enter/escape/tab/space/...) |
| `key_combo` | `key` or `keys[]` | Modifier combo (e.g., "cmd+c" or ["cmd","shift","3"]) |
| `key_type` | `text` | Type text character by character |
| `paste` | `text` | Paste text via clipboard |

**Source:** [tools/index.js](https://github.com/ARAS-Workspace/claude-kvm/blob/main/tools/index.js)
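
Because `key_combo` accepts either a `"cmd+c"` style string or a `keys[]` array, a caller may want to normalize both forms before sending. The sketch below assumes a `+` separator and a minimum of two keys; `normalizeCombo` and `keyComboArgs` are hypothetical names, and a combo that includes the literal `+` key is not handled.

```javascript
// Hypothetical helper: accept "cmd+shift+3" or ["cmd","shift","3"] and return a key list.
function normalizeCombo(combo) {
  const keys = Array.isArray(combo) ? combo : String(combo).split('+');
  const cleaned = keys
    .map((key) => String(key).trim().toLowerCase())
    .filter((key) => key.length > 0);
  if (cleaned.length < 2) {
    throw new Error('a combo needs at least one modifier and one key');
  }
  return cleaned;
}

// Build key_combo arguments in the keys[] form.
function keyComboArgs(combo) {
  return { action: 'key_combo', keys: normalizeCombo(combo) };
}

const copy = keyComboArgs('cmd+c');                       // keys: ['cmd', 'c']
const screenshotKeys = normalizeCombo(['Cmd', 'Shift', '3']); // ['cmd', 'shift', '3']
```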

### Detection Actions

| Procedure | Parameters | Description |
|-----------|------------|-------------|
| `detect_elements` | — | OCR text detection with bounding boxes (Apple Vision) |

Returns text elements with bounding box coordinates in scaled space:

```json
{
  "method": "detect_elements"
}
```
```json
{
  "result": {
    "detail": "13 elements",
    "elements": [
      {"confidence": 1, "h": 9, "text": "Finder", "w": 32, "x": 37, "y": 6},
      {"confidence": 1, "h": 9, "text": "File", "w": 15, "x": 84, "y": 6}
    ],
    "scaledHeight": 717,
    "scaledWidth": 1280
  }
}
```

**Source:** [README.md](https://github.com/ARAS-Workspace/claude-kvm/blob/main/README.md)

### Configuration Procedures

| Procedure | Parameters | Description |
|-----------|------------|-------------|
| `configure` | `{<params>}` | Set timing/display params at runtime |
| `configure` | `{reset: true}` | Reset all params to defaults |
| `get_timing` | — | Get current timing + display params |

**Source:** [README.md](https://github.com/ARAS-Workspace/claude-kvm/blob/main/README.md)

### Control Procedures

| Procedure | Parameters | Description |
|-----------|------------|-------------|
| `wait` | `ms?` | Pause execution (default 500ms) |
| `health` | — | Connection status + display info |
| `shutdown` | — | Graceful daemon shutdown |

**Source:** [tools/index.js](https://github.com/ARAS-Workspace/claude-kvm/blob/main/tools/index.js)

## Timing Parameters

All timing parameters can be configured at runtime via the `configure` method.

| Parameter | Default | Description |
|-----------|---------|-------------|
| `click_hold_ms` | 50 | Mouse button hold duration |
| `double_click_gap_ms` | 50 | Gap between double-click events |
| `hover_settle_ms` | 400 | Wait after hover before next action |
| `drag_press_ms` | 50 | Initial press duration for drag |
| `drag_position_ms` | 30 | Position update interval |
| `drag_step_ms` | 5 | Step duration during drag |
| `drag_settle_ms` | 30 | Settle time before mouse release |
| `drag_pixels_per_step` | 20 | Pixels moved per step during drag |
| `drag_min_steps` | 10 | Minimum interpolation steps |
| `scroll_press_ms` | 10 | Scroll button press duration |
| `scroll_tick_ms` | 20 | Interval between scroll ticks |
| `key_hold_ms` | 30 | Key press duration |
| `combo_mod_ms` | 10 | Modifier key settle time |
| `type_key_ms` | 20 | Key press duration during typing |
| `type_inter_key_ms` | 20 | Delay between characters |
| `type_shift_ms` | 10 | Shift key settle time |
| `paste_settle_ms` | 30 | Settle time after paste |

**Source:** [README_TR.md](https://github.com/ARAS-Workspace/claude-kvm/blob/main/README_TR.md)

### Configuration Examples

Set timing parameters:
```json
{"method":"configure","params":{"click_hold_ms":80,"key_hold_ms":50}}
```
```json
{"result":{"detail":"OK — changed: click_hold_ms, key_hold_ms"}}
```

Change screen scaling:
```json
{"method":"configure","params":{"max_dimension":960}}
```
```json
{"result":{"detail":"OK — changed: max_dimension","scaledWidth":960,"scaledHeight":584}}
```

Reset to defaults:
```json
{"method":"configure","params":{"reset":true}}
```
```json
{"result":{"detail":"OK — reset to defaults","timing":{"click_hold_ms":50,"combo_mod_ms":10,...}}}
```

**Source:** [README_TR.md](https://github.com/ARAS-Workspace/claude-kvm/blob/main/README_TR.md)
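
A caller can guard against typos in parameter names before sending `configure`. The builder below is a sketch: the name list is copied from the parameter tables in this manual, `buildConfigure` is a hypothetical name, and the non-negative-integer rule is an assumption rather than a documented constraint.

```javascript
// Parameter names taken from the tables in this manual.
const CONFIGURABLE = new Set([
  'click_hold_ms', 'double_click_gap_ms', 'hover_settle_ms',
  'drag_press_ms', 'drag_position_ms', 'drag_step_ms', 'drag_settle_ms',
  'drag_pixels_per_step', 'drag_min_steps',
  'scroll_press_ms', 'scroll_tick_ms',
  'key_hold_ms', 'combo_mod_ms',
  'type_key_ms', 'type_inter_key_ms', 'type_shift_ms', 'paste_settle_ms',
  'max_dimension', 'cursor_crop_radius',
]);

// Hypothetical builder: validates names and values, then returns one NDJSON line.
function buildConfigure(params) {
  for (const [name, value] of Object.entries(params)) {
    if (!CONFIGURABLE.has(name)) throw new Error(`unknown parameter: ${name}`);
    if (!Number.isInteger(value) || value < 0) {
      throw new Error(`${name} must be a non-negative integer`);
    }
  }
  return JSON.stringify({ method: 'configure', params }) + '\n';
}

const line = buildConfigure({ click_hold_ms: 80, key_hold_ms: 50 });
```

The resulting line matches the first request shown above, with a trailing newline for NDJSON framing.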

## Coordinate System

### Coordinate Scaling

The VNC server's native resolution is scaled down to fit within `--max-dimension` (default: 1280px). Claude works more consistently with scaled coordinates, and the daemon handles the conversion transparently.

```
Native:  4220 × 2568  (VNC server framebuffer)
Scaled:  1280 × 779   (what Claude sees and targets)

mouse_click(640, 400) → VNC receives (2110, 1319)
```

**Source:** [README.md](https://github.com/ARAS-Workspace/claude-kvm/blob/main/README.md)

### Display Configuration

The `health` procedure returns current display dimensions:

| Field | Description |
|-------|-------------|
| `scaledWidth` | Width in scaled coordinate space |
| `scaledHeight` | Height in scaled coordinate space |
| `connected` | VNC connection status |

## Authentication

The daemon supports two VNC authentication methods when connecting to the VNC server:

| Method | Protocol | Security |
|--------|----------|----------|
| **VNC Auth** | Password-based challenge-response (DES) | Basic |
| **ARD** | Apple Remote Desktop (Diffie-Hellman + AES-128-ECB) | Enhanced |

**Source:** [README.md](https://github.com/ARAS-Workspace/claude-kvm/blob/main/README.md)

### ARD Detection

macOS is auto-detected via the ARD auth type 30 credential request. When detected, Meta keys are remapped to Super keys for Command key compatibility.
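
The detection and remapping described above can be illustrated as follows. This is a JavaScript sketch of the idea only; the daemon itself is written in Swift/C, and `isMacServer` and `remapKeysym` are hypothetical names. The security type numbers (2 for VNC Authentication, 30 for Apple Remote Desktop) and the keysym values are the standard RFB and X11 constants. Detecting macOS from the list of offered security types is an assumption made for the sketch.

```javascript
// RFB security type numbers: 2 is VNC Authentication, 30 is Apple Remote Desktop.
const SECURITY_VNC_AUTH = 2;
const SECURITY_ARD = 30;

// X11 keysym values carried in RFB key events.
const KEYSYM = { Meta_L: 0xffe7, Meta_R: 0xffe8, Super_L: 0xffeb, Super_R: 0xffec };

// Hypothetical check: treat the server as macOS when it offers security type 30.
function isMacServer(offeredSecurityTypes) {
  return offeredSecurityTypes.includes(SECURITY_ARD);
}

// Remap Meta to Super only for macOS targets; other keysyms pass through unchanged.
function remapKeysym(keysym, isMac) {
  if (!isMac) return keysym;
  if (keysym === KEYSYM.Meta_L) return KEYSYM.Super_L;
  if (keysym === KEYSYM.Meta_R) return KEYSYM.Super_R;
  return keysym;
}

const mac = isMacServer([SECURITY_ARD, SECURITY_VNC_AUTH]);
const commandKey = remapKeysym(KEYSYM.Meta_L, mac); // 0xffeb (Super_L)
```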

## Daemon Lifecycle

The MCP proxy manages the daemon lifecycle through the PC Protocol:

```mermaid
sequenceDiagram
    participant Claude
    participant Proxy as MCP Proxy
    participant Daemon as VNC Daemon
    participant VNC as VNC Server
    
    Claude->>Proxy: Start MCP server
    Proxy->>Daemon: Spawn process (stdin/stdout)
    Daemon->>VNC: Connect :5900
    VNC-->>Daemon: Authentication
    Daemon-->>Proxy: Ready (health response)
    Proxy-->>Claude: MCP tools available
    
    loop Session
        Claude->>Proxy: vnc_command request
        Proxy->>Daemon: PC Protocol request
        Daemon->>VNC: RFB operations
        Daemon-->>Proxy: PC Protocol response
        Proxy-->>Claude: MCP response
    end
    
    Claude->>Proxy: shutdown request
    Proxy->>Daemon: {"method":"shutdown"}
    Daemon->>Daemon: Cleanup
    Daemon-->>Proxy: Exit
```

**Source:** [index.js](https://github.com/ARAS-Workspace/claude-kvm/blob/main/index.js)
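
Inside the session loop, each PC Protocol response has to be matched to the request that caused it. The tracker below is a minimal sketch of id-based correlation; it is not taken from `index.js`, and `createRequestTracker` and its callback style are hypothetical.

```javascript
// Hypothetical request tracker: assigns ids and routes replies to callbacks.
function createRequestTracker(writeLine) {
  let nextId = 1;
  const pending = new Map();

  function send(method, params, callback) {
    const id = nextId++;
    pending.set(id, callback);
    const message = params === undefined ? { method, id } : { method, params, id };
    writeLine(JSON.stringify(message) + '\n');
    return id;
  }

  // Call with each parsed message received from the daemon.
  function handleMessage(msg) {
    const callback = pending.get(msg.id);
    if (!callback) return false; // unknown id, or a notification without an id
    pending.delete(msg.id);
    if (msg.error) callback(new Error(msg.error.message), null);
    else callback(null, msg.result);
    return true;
  }

  return { send, handleMessage, pendingCount: () => pending.size };
}

// Usage with an in-memory stand-in for the daemon's stdin.
const written = [];
const tracker = createRequestTracker((line) => written.push(line));
let healthResult = null;
const requestId = tracker.send('health', undefined, (err, result) => { healthResult = result; });
tracker.handleMessage({ result: { connected: true }, id: requestId });
```

In a real proxy, `writeLine` would write to the spawned daemon's stdin and `handleMessage` would be fed from its stdout.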

### Startup Configuration

The daemon is configured via environment variables:

| Variable | Required | Description |
|----------|----------|-------------|
| `VNC_HOST` | Yes | VNC server hostname or IP |
| `VNC_PORT` | Yes | VNC server port |
| `VNC_USERNAME` | No | VNC/ARD username |
| `VNC_PASSWORD` | No | VNC/ARD password |
| `CLAUDE_KVM_DAEMON_PATH` | No | Path to daemon binary |

**Source:** [server.json](https://github.com/ARAS-Workspace/claude-kvm/blob/main/server.json)
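
The table above translates into a small validation step at startup. The reader below is a sketch, not the proxy's actual code; `readVncConfig` and the returned field names are hypothetical, and the port range check is an added assumption.

```javascript
// Hypothetical reader: pulls the documented variables out of an env object.
function readVncConfig(env) {
  const missing = ['VNC_HOST', 'VNC_PORT'].filter((name) => !env[name]);
  if (missing.length > 0) {
    throw new Error(`missing required variables: ${missing.join(', ')}`);
  }
  const port = Number(env.VNC_PORT);
  if (!Number.isInteger(port) || port < 1 || port > 65535) {
    throw new Error(`VNC_PORT is not a valid port: ${env.VNC_PORT}`);
  }
  return {
    host: env.VNC_HOST,
    port,
    username: env.VNC_USERNAME,             // optional
    password: env.VNC_PASSWORD,             // optional
    daemonPath: env.CLAUDE_KVM_DAEMON_PATH, // optional
  };
}

// In the proxy this would be called as readVncConfig(process.env).
const vncConfig = readVncConfig({ VNC_HOST: '192.168.1.100', VNC_PORT: '5900' });
```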

### Graceful Shutdown

The proxy handles SIGINT by sending a shutdown notification:

```javascript
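// Excerpt: `log` and `daemon` come from the surrounding module.
// On Ctrl+C, ask the daemon to exit cleanly before the proxy itself exits.
process.on('SIGINT', () => {
  log('Shutting down...');
  if (daemon) {
    // Send a shutdown notification (no id) over the daemon's stdin.
    daemon.stdin.write(JSON.stringify({ method: 'shutdown' }) + '\n');
    // Give the daemon 500 ms to clean up, then force-kill it and exit.
    setTimeout(() => { daemon?.kill(); process.exit(0); }, 500);
  } else {
    process.exit(0);
  }
});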
```

**Source:** [index.js](https://github.com/ARAS-Workspace/claude-kvm/blob/main/index.js)

## Known Limitations

### VNC Connection Stability

During extended test sessions (~50 turns), the VNC connection may drop and the daemon may become unresponsive. SSH to the remote Mac remains active, confirming the issue is on the VNC/ARD server side — not the daemon or network layer.

**Related Issue:** [#8 - VNC connection drops during long-running test sessions](https://github.com/ARAS-Workspace/claude-kvm/issues/8)

### ARD Session Management

When targeting macOS via Apple Remote Desktop, the ARD server may terminate sessions after extended inactivity. For improved stability, consider:

- Using SSH tunneling for VNC connections
- Configuring ARD server for extended session timeouts
- Implementing periodic `health` checks to maintain connection

## Related Documentation

- [MCP Server Configuration](https://github.com/ARAS-Workspace/claude-kvm/blob/main/server.json)
- [Tool Definitions](https://github.com/ARAS-Workspace/claude-kvm/blob/main/tools/index.js)
- [Mac M1 Preparation Tricks](https://gist.github.com/remrearas/a3f300635b02f2587a134882a51f7114) — VNC security, SSH tunneling, and session stability

---

<a id='coordinate-scaling'></a>

## Coordinate Scaling

### Related Pages

Related topics: [System Architecture](#system-architecture), [Screen Operations](#screen-operations)

<details>
<summary>Related Source Files</summary>

The following source files were used to generate this page:

- [README.md](https://github.com/ARAS-Workspace/claude-kvm/blob/main/README.md)
- [README_TR.md](https://github.com/ARAS-Workspace/claude-kvm/blob/main/README_TR.md)
- [tools/index.js](https://github.com/ARAS-Workspace/claude-kvm/blob/main/tools/index.js)
- [server.json](https://github.com/ARAS-Workspace/claude-kvm/blob/main/server.json)
- [LAUNCHGUIDE.md](https://github.com/ARAS-Workspace/claude-kvm/blob/main/LAUNCHGUIDE.md)
</details>

# Coordinate Scaling

Coordinate scaling is a fundamental mechanism in claude-kvm that enables AI agents like Claude to interact with remote desktops at any native resolution through a consistent, normalized coordinate space. The daemon transparently maps between the scaled coordinates Claude uses and the actual pixel positions on the VNC server's framebuffer.

## Overview

When connecting to a VNC server, the remote desktop may have a native resolution far larger than is practical for AI processing. High-resolution displays (e.g., 4K, Retina, or multi-monitor setups) would produce very large screenshots that consume excessive image tokens, and their coordinate ranges would differ from one target to the next. Source: [README.md](https://github.com/ARAS-Workspace/claude-kvm/blob/main/README.md)

The coordinate scaling system solves this by:

1. **Downscaling** the framebuffer capture to fit within a configurable maximum dimension
2. **Presenting** scaled coordinates to Claude for all operations
3. **Upscaling** coordinates back to native resolution before sending to the VNC server

This approach allows Claude to work consistently with predictable coordinate ranges regardless of the actual remote display size. Source: [README.md](https://github.com/ARAS-Workspace/claude-kvm/blob/main/README.md)

## Architecture

The scaling system involves multiple components across the proxy and daemon layers:

```mermaid
graph TD
    subgraph Proxy["MCP Proxy (Node.js)"]
        Tools["Tools Definition<br/>(tools/index.js)"]
        Server["Server Handler"]
    end
    
    subgraph Daemon["VNC Daemon (Swift/C)"]
        Scale["Scale Module"]
        Capture["Screen Capture"]
        Mouse["Mouse Controller"]
        VNC["VNC Bridge<br/>(LibVNCClient)"]
    end
    
    Tools -->|"x, y bounds<br/>from health"| Server
    Server -->|"NDJSON<br/>{method, params}"| Scale
    Scale -->|"scaled coords"| Mouse
    Scale -->|"native coords"| VNC
    Capture -->|"native framebuffer"| Scale
    Scale -->|"scaled PNG"| Server
    
    classDef proxy fill:#1a1a2e,stroke:#16213e,color:#e5e5e5
    classDef daemon fill:#0f3460,stroke:#533483,color:#e5e5e5
    
    class Tools,Server proxy
    class Scale,Capture,Mouse,VNC daemon
```

| Component | Layer | Role |
|-----------|-------|------|
| Tools Definition | Proxy | Defines coordinate bounds for Zod validation |
| Scale Module | Daemon | Performs coordinate transformations |
| Screen Capture | Daemon | Captures native framebuffer, outputs scaled image |
| Mouse Controller | Daemon | Receives scaled coordinates, applies scaling |
| VNC Bridge | Daemon | Sends native coordinates via RFB protocol |

Source: [README.md](https://github.com/ARAS-Workspace/claude-kvm/blob/main/README.md)

## Coordinate Transformation

The transformation follows a simple aspect-ratio-preserving algorithm. The scaled dimensions maintain the same aspect ratio as the native framebuffer while ensuring neither width nor height exceeds `max_dimension`.

### Scaling Formula

```
scale_factor = max_dimension / max(native_width, native_height)
scaled_width = round(native_width * scale_factor)
scaled_height = round(native_height * scale_factor)
```

### Coordinate Mapping

When Claude specifies a coordinate in the scaled space, the daemon applies the inverse transformation:

```
native_x = round(scaled_x / scale_factor)
native_y = round(scaled_y / scale_factor)
```

### Example Transformation

For a native resolution of 4220×2568 scaled to a maximum dimension of 1280:

```
scale_factor = 1280 / 4220 ≈ 0.3033
scaled_width = round(4220 * 0.3033) = 1280
scaled_height = round(2568 * 0.3033) = 779
```

When Claude clicks at scaled coordinates (640, 400):

```
native_x = round(640 / 0.3033) = 2110
native_y = round(400 / 0.3033) = 1319
```

Source: [README.md](https://github.com/ARAS-Workspace/claude-kvm/blob/main/README.md)
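
The formulas in this section can be written as two small functions. This is a sketch of the arithmetic as documented here, not the daemon's Swift implementation; `scaledSize` and `toNative` are hypothetical names. Like the formula above, it does not clamp the factor, so a display smaller than `max_dimension` would be scaled up.

```javascript
// Scaled size that fits within maxDimension while preserving the aspect ratio.
function scaledSize(nativeWidth, nativeHeight, maxDimension = 1280) {
  const factor = maxDimension / Math.max(nativeWidth, nativeHeight);
  return {
    factor,
    width: Math.round(nativeWidth * factor),
    height: Math.round(nativeHeight * factor),
  };
}

// Inverse mapping: scaled coordinates back to native framebuffer pixels.
function toNative(point, factor) {
  return { x: Math.round(point.x / factor), y: Math.round(point.y / factor) };
}

const display = scaledSize(4220, 2568); // width 1280, height 779
const nativeTarget = toNative({ x: 640, y: 400 }, display.factor);
```

Keeping the unrounded `factor` around avoids the small drift that appears when a three-digit approximation is reused for the inverse mapping.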

## Configuration Options

### CLI Parameter

The `--max-dimension` flag controls the maximum dimension at daemon startup:

| Parameter | Default | Description |
|-----------|---------|-------------|
| `--max-dimension` | `1280` | Maximum width or height in pixels for scaled display |

```bash
# Launch with custom max dimension
claude-kvm-daemon --max-dimension 1920 --host 192.168.1.100 --port 5900
```

Source: [README_TR.md](https://github.com/ARAS-Workspace/claude-kvm/blob/main/README_TR.md)

### Runtime Configuration

The `configure` method allows changing the max dimension during operation:

```json
{"method":"configure","params":{"max_dimension":960}}
```

Response:

```json
{"result":{"detail":"OK — changed: max_dimension","scaledWidth":960,"scaledHeight":584}}
```

This immediately re-scales the display and recalculates all coordinate bounds. Source: [README_TR.md](https://github.com/ARAS-Workspace/claude-kvm/blob/main/README_TR.md)

### All Timing/Display Parameters

The following display and timing parameters affect coordinate-based operations:

| Parameter | Default | Description |
|-----------|---------|-------------|
| `max_dimension` | `1280` | Maximum display dimension |
| `cursor_crop_radius` | `150` | Cursor crop radius in pixels |
| `click_hold_ms` | `50` | Click press duration |
| `double_click_gap_ms` | `50` | Gap between double-click events |
| `hover_settle_ms` | `400` | Wait after hover before next action |
| `drag_position_ms` | `30` | Wait before drag movement |
| `drag_press_ms` | `50` | Press duration to initiate drag |
| `drag_step_ms` | `5` | Interval between interpolation points |
| `drag_pixels_per_step` | `20` | Pixels per interpolation step |
| `drag_min_steps` | `10` | Minimum interpolation steps |
| `drag_settle_ms` | `30` | Wait before mouse release |
| `scroll_press_ms` | `10` | Scroll button press duration |
| `scroll_tick_ms` | `20` | Interval between scroll ticks |

Source: [README_TR.md](https://github.com/ARAS-Workspace/claude-kvm/blob/main/README_TR.md)

## Tools Using Scaled Coordinates

All mouse and scroll operations accept coordinates in the scaled space. The daemon automatically transforms them to native coordinates before sending to the VNC server.

### Mouse Actions

| Action | Parameters | Description |
|--------|------------|-------------|
| `mouse_click` | `x, y, button?` | Click at scaled coordinates |
| `mouse_double_click` | `x, y` | Double-click at scaled coordinates |
| `mouse_move` | `x, y` | Move cursor to scaled coordinates |
| `hover` | `x, y` | Move cursor and wait for settling |
| `nudge` | `dx, dy` | Relative move (max ±50 pixels) |
| `mouse_drag` | `x, y, toX, toY` | Drag from start to end coordinates |

### Scroll Actions

| Action | Parameters | Description |
|--------|------------|-------------|
| `scroll` | `x, y, direction, amount?` | Scroll at position (up/down/left/right) |

### Input Validation

The tools definition validates that all coordinates fall within the scaled bounds:

```javascript
x: z.number().int().min(0).max(width - 1).optional().describe('X coordinate'),
y: z.number().int().min(0).max(height - 1).optional().describe('Y coordinate'),
toX: z.number().int().min(0).max(width - 1).optional().describe('Drag target X'),
toY: z.number().int().min(0).max(height - 1).optional().describe('Drag target Y'),
```

Where `width` and `height` are the current scaled dimensions obtained from the daemon's `health` response. Source: [tools/index.js](https://github.com/ARAS-Workspace/claude-kvm/blob/main/tools/index.js)
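
For callers that want to pre-check coordinates without Zod, the same bounds rule can be expressed in plain JavaScript. This is a sketch under the assumption that the bounds come from a `health` response; `validatePoint` is a hypothetical name.

```javascript
// Hypothetical plain-JS equivalent of the bounds check, without Zod.
function validatePoint(x, y, bounds) {
  const errors = [];
  if (!Number.isInteger(x) || x < 0 || x > bounds.scaledWidth - 1) {
    errors.push(`x must be an integer in [0, ${bounds.scaledWidth - 1}]`);
  }
  if (!Number.isInteger(y) || y < 0 || y > bounds.scaledHeight - 1) {
    errors.push(`y must be an integer in [0, ${bounds.scaledHeight - 1}]`);
  }
  return errors;
}

const bounds = { scaledWidth: 1280, scaledHeight: 779 }; // as reported by health
const ok = validatePoint(640, 400, bounds);        // []
const tooWide = validatePoint(1280, 400, bounds);  // ['x must be an integer in [0, 1279]']
```

The upper bound is `width - 1` because coordinates are zero-based, so x = 1280 is already off-screen on a 1280-pixel-wide display.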

## Querying Current State

### Health Check

The `health` action returns the current scaled dimensions:

```json
{"action":"health"}
```

Response includes:

```json
{
  "result": {
    "scaledWidth": 1280,
    "scaledHeight": 779,
    ...
  }
}
```

### Get Timing

The `get_timing` action returns all current timing and display parameters:

```json
{"action":"get_timing"}
```

### Reset to Defaults

To reset all parameters including `max_dimension` to defaults:

```json
{"method":"configure","params":{"reset":true}}
```

Source: [README_TR.md](https://github.com/ARAS-Workspace/claude-kvm/blob/main/README_TR.md)

## Detection with Bounding Boxes

When using `detect_elements` for OCR, bounding box coordinates are returned in scaled space:

```json
{"action":"detect_elements"}
```

Response:

```json
{
  "result": {
    "scaledWidth": 1280,
    "scaledHeight": 717,
    "elements": [
      {"confidence":1,"h":9,"text":"Finder","w":32,"x":37,"y":6},
      {"confidence":1,"h":93,"text":"PHANTOM","w":633,"x":322,"y":477}
    ]
  }
}
```

These coordinates can be directly used with mouse actions without any conversion. Source: [README.md](https://github.com/ARAS-Workspace/claude-kvm/blob/main/README.md)
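
As a sketch of how a caller might use this, the helpers below look up an element by its text and turn its bounding box into click arguments. `findElement` and `centerOf` are hypothetical names, the confidence floor of 0.5 is an arbitrary assumption, and the `{ action, x, y }` shape follows the `vnc_command` examples in this manual.

```javascript
// Hypothetical lookup: first element whose text matches, above a confidence floor.
function findElement(elements, text, minConfidence = 0.5) {
  const wanted = text.toLowerCase();
  const match = elements.find(
    (el) => el.confidence >= minConfidence && el.text.toLowerCase() === wanted
  );
  return match || null;
}

// Center of a bounding box; it is already in scaled space, so no conversion is applied.
function centerOf(el) {
  return { x: Math.round(el.x + el.w / 2), y: Math.round(el.y + el.h / 2) };
}

const elements = [
  { confidence: 1, h: 9, text: 'Finder', w: 32, x: 37, y: 6 },
  { confidence: 1, h: 93, text: 'PHANTOM', w: 633, x: 322, y: 477 },
];
const finder = findElement(elements, 'finder');
const clickArgs = { action: 'mouse_click', ...centerOf(finder) };
```

`findElement` returns `null` when nothing matches, so a caller should check the result before computing a click target.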

## Practical Usage Patterns

### Progressive Verification Strategy

Claude uses a tiered approach to minimize token usage while maintaining accuracy:

```
diff_check       →  changeDetected: true/false     ~5ms    (text only)
detect_elements  →  OCR text + bounding boxes       ~50ms   (text only)
cursor_crop      →  crop around cursor location     ~50ms   (small image)
screenshot       →  full screen capture              ~200ms  (complete image)
```

This strategy allows Claude to verify changes at low cost before committing to expensive screenshot operations.
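
One way to picture the tiers is as a decision function that always picks the cheapest check still able to answer the question. The policy below is a hypothetical sketch, not logic from the repository; `nextCheck` and the shape of its `state` argument are invented for illustration.

```javascript
// Hypothetical policy: escalate from cheap text-only checks to image captures.
function nextCheck(state) {
  if (state.changeDetected === undefined) return 'diff_check';
  if (state.changeDetected === false) return 'done'; // nothing changed on screen
  if (state.expectedText && state.elements === undefined) return 'detect_elements';
  if (state.expectedText && state.elements.some((el) => el.text === state.expectedText)) {
    return 'done'; // expected text confirmed via OCR, no image needed
  }
  if (!state.cursorCropTaken) return 'cursor_crop';
  return 'screenshot';
}

const plan = [
  nextCheck({ expectedText: 'Saved' }),
  nextCheck({ expectedText: 'Saved', changeDetected: true }),
  nextCheck({ expectedText: 'Saved', changeDetected: true, elements: [{ text: 'Saved' }] }),
  nextCheck({ expectedText: 'Saved', changeDetected: true, elements: [] }),
];
// plan is ['diff_check', 'detect_elements', 'done', 'cursor_crop']
```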

### Clicking Text Elements

A common workflow combines OCR detection with coordinate-based clicking:

1. Call `detect_elements` to find text and their bounding boxes
2. Calculate center point of target element
3. Call `mouse_click` with scaled coordinates

```javascript
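// Element as returned by detect_elements (bounding box in scaled space)
const element = { text: 'Submit', x: 100, y: 200, w: 80, h: 30 };

// Click the center of the bounding box
const clickX = Math.round(element.x + element.w / 2); // 100 + 40 = 140
const clickY = Math.round(element.y + element.h / 2); // 200 + 15 = 215

// Arguments for the mouse_click action
const click = { action: 'mouse_click', x: clickX, y: clickY };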
```

## Related Features

| Feature | Description |
|---------|-------------|
| [Apple Remote Desktop Support](#authentication) | ARD authentication with Meta-to-Super key remapping |
| [Screen Capture](#screen-operations) | Screenshot and visual diff capabilities |
| Action Queue | Batch multiple coordinate operations |

## Known Limitations

### VNC Connection Stability

During extended test sessions (~50 turns), VNC connections may drop. SSH connections remain active, indicating the issue is at the VNC/ARD server layer rather than the daemon. This affects coordinate operations that require active connection.

Issue: [#8](https://github.com/ARAS-Workspace/claude-kvm/issues/8)

---

## Summary

Coordinate scaling in claude-kvm provides a transparent abstraction layer that allows AI agents to work with a normalized coordinate space regardless of the remote desktop's native resolution. By configuring the maximum dimension and letting the daemon handle transformations, Claude can efficiently control high-resolution displays while minimizing token costs.

---

<a id='screen-operations'></a>

## Screen Operations

### Related Pages

Related topics: [OCR Detection](#ocr-detection), [Coordinate Scaling](#coordinate-scaling)

<details>
<summary>Related Source Files</summary>

The following source files were used to generate this page:

- [README.md](https://github.com/ARAS-Workspace/claude-kvm/blob/main/README.md)
- [tools/index.js](https://github.com/ARAS-Workspace/claude-kvm/blob/main/tools/index.js)
- [LAUNCHGUIDE.md](https://github.com/ARAS-Workspace/claude-kvm/blob/main/LAUNCHGUIDE.md)
- [README_TR.md](https://github.com/ARAS-Workspace/claude-kvm/blob/main/README_TR.md)
- [server.json](https://github.com/ARAS-Workspace/claude-kvm/blob/main/server.json)
</details>

# Screen Operations

Screen Operations provide the visual foundation for Claude KVM's remote desktop control capabilities. These operations capture the remote display state, enable visual comparison, and integrate with OCR detection to support AI-driven automation workflows.

## Overview

The Screen module delivers four primary capabilities through the `vnc_command` tool:

| Action | Description |
|--------|-------------|
| `screenshot` | Full screen PNG capture |
| `cursor_crop` | Crop around cursor with crosshair overlay |
| `diff_check` | Detect screen changes against baseline |
| `set_baseline` | Save current screen as diff reference |

Screen capture operations form the observe phase in the standard automation pattern: screenshot → analyze → act → verify. Each capture operates on the scaled display space, so positions seen in a capture line up with the coordinates that mouse actions expect. Source: [README.md](https://github.com/ARAS-Workspace/claude-kvm/blob/main/README.md)

## Coordinate System

The VNC server's native resolution is scaled down to fit within `--max-dimension` (default: 1280px). Claude works more consistently with scaled coordinates — the daemon handles the conversion in the background. Source: [README.md](https://github.com/ARAS-Workspace/claude-kvm/blob/main/README.md)

```
Native:  4220 x 2568  (VNC server framebuffer)
Scaled:  1280 x 779   (what Claude sees and targets)

mouse_click(640, 400) → VNC receives (2110, 1319)
```

All screen coordinates returned by screen operations use scaled space, matching the coordinate system expected by mouse and scroll actions. Source: [README_TR.md](https://github.com/ARAS-Workspace/claude-kvm/blob/main/README_TR.md)

## Screenshot Capture

### Full Screen Capture

The `screenshot` action captures the entire remote display as a PNG image. This is the primary observation mechanism for AI-driven desktop automation.

```json
{"action": "screenshot"}
```

Response includes the image data encoded as base64, along with display metadata:

```json
{
  "result": {
    "image": "<base64-encoded-png>",
    "width": 1280,
    "height": 779
  }
}
```
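
A caller can sanity-check the payload by decoding it and reading the dimensions from the PNG header. The helper below is a sketch that assumes the image arrives base64-encoded in an `image` field as shown above; `pngDimensions` is a hypothetical name. It relies only on the fixed PNG layout: an 8-byte signature followed by the IHDR chunk.

```javascript
// PNG files always start with this 8-byte signature.
const PNG_SIGNATURE = Buffer.from([0x89, 0x50, 0x4e, 0x47, 0x0d, 0x0a, 0x1a, 0x0a]);

// Hypothetical helper: decode the base64 payload and read the PNG dimensions.
function pngDimensions(base64) {
  const png = Buffer.from(base64, 'base64');
  if (png.length < 24 || !png.subarray(0, 8).equals(PNG_SIGNATURE)) {
    throw new Error('not a PNG image');
  }
  // IHDR follows the signature: width at byte 16, height at byte 20 (big-endian).
  return {
    width: png.readUInt32BE(16),
    height: png.readUInt32BE(20),
    bytes: png.length,
  };
}
```

Comparing the decoded size with `scaledWidth` and `scaledHeight` from `health` is a cheap way to confirm the capture matches the coordinate space used for clicks.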

### Performance Characteristics

| Operation | Approximate Latency |
|-----------|---------------------|
| screenshot | ~200ms (full display) |

Source: [README_TR.md](https://github.com/ARAS-Workspace/claude-kvm/blob/main/README_TR.md)

## Cursor Crop

The `cursor_crop` action captures a cropped region centered on the current cursor position with a crosshair overlay. This is useful for focusing on UI elements near the pointer during detailed automation tasks.

```json
{"action": "cursor_crop"}
```

The action uses a configurable crop radius (default: 150px) around the cursor position. Source: [tools/index.js](https://github.com/ARAS-Workspace/claude-kvm/blob/main/tools/index.js)
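
To get a feel for how much of the screen a crop covers, the geometry can be sketched as a square of side `2 * radius` centered on the cursor and clipped to the screen. This is an assumed model for illustration; the daemon's actual cropping logic may differ, and `cropRect` is a hypothetical name.

```javascript
// Hypothetical geometry: a square around the cursor, clipped to the scaled screen.
function cropRect(cursor, radius, screen) {
  const left = Math.max(0, cursor.x - radius);
  const top = Math.max(0, cursor.y - radius);
  const right = Math.min(screen.width, cursor.x + radius);
  const bottom = Math.min(screen.height, cursor.y + radius);
  return { x: left, y: top, w: right - left, h: bottom - top };
}

const screen = { width: 1280, height: 779 };
const centered = cropRect({ x: 640, y: 400 }, 150, screen);
// → { x: 490, y: 250, w: 300, h: 300 }
const nearCorner = cropRect({ x: 20, y: 10 }, 150, screen);
// → { x: 0, y: 0, w: 170, h: 160 }
```

With the default radius, a crop covers at most 300 × 300 pixels, a small fraction of a full 1280 × 779 capture.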

## Visual Diff

Visual diff operations enable verification of state changes after performing actions, which is critical for reliable automation scripts.

### Workflow

```mermaid
graph TD
    A[Initial Screen] --> B[set_baseline]
    B --> C[Perform Actions]
    C --> D[diff_check]
    D --> E{Changes Detected?}
    E -->|Yes| F[Continue Workflow]
    E -->|No| G[Retry or Debug]
```

### Set Baseline

The `set_baseline` action saves the current screen state as a reference for future diff comparisons:

```json
{"action": "set_baseline"}
```

```json
{"result": {"detail": "Baseline saved"}}
```

### Diff Check

The `diff_check` action compares the current screen against the saved baseline and returns whether changes were detected:

```json
{"action": "diff_check"}
```

```json
{
  "result": {
    "changeDetected": true,
    "diffPixels": 1247,
    "percentChanged": 0.125
  }
}
```

This pattern is particularly valuable in CI/CD environments where step-by-step verification ensures automation reliability. Source: [README.md](https://github.com/ARAS-Workspace/claude-kvm/blob/main/README.md)

## Integration with OCR Detection

Screen operations integrate with `detect_elements` to provide text-aware automation:

1. The daemon captures the current frame of the remote display
2. `detect_elements` runs OCR on that frame via Apple Vision
3. The response lists the detected text elements with bounding boxes in scaled coordinates:

```json
{
  "result": {
    "elements": [
      {"confidence": 1, "h": 9, "text": "Finder", "w": 32, "x": 37, "y": 6},
      {"confidence": 1, "h": 93, "text": "PHANTOM", "w": 633, "x": 322, "y": 477}
    ],
    "scaledWidth": 1280,
    "scaledHeight": 717
  }
}
```

Source: [README.md](https://github.com/ARAS-Workspace/claude-kvm/blob/main/README.md)

This enables click-targeting of UI elements without consuming image tokens. The OCR operates on-device using Apple Vision framework with zero API cost and approximately 50ms latency. Source: [LAUNCHGUIDE.md](https://github.com/ARAS-Workspace/claude-kvm/blob/main/LAUNCHGUIDE.md)

## Configuration

Screen-related timing parameters can be adjusted at runtime:

| Parameter | Default | Description |
|-----------|--------|-------------|
| `cursor_crop_radius` | 150 | Cursor crop region radius in pixels |
| `hover_settle_ms` | 400 | Wait time after cursor movement |

Configuration changes apply immediately without reconnection:

```json
{"method": "configure", "params": {"cursor_crop_radius": 200}}
```

```json
{"result": {"detail": "OK — changed: cursor_crop_radius"}}
```

Source: [README_TR.md](https://github.com/ARAS-Workspace/claude-kvm/blob/main/README_TR.md)

## Connection Reliability

During extended test sessions, VNC connections may drop, causing screen operations to fail. This is a known issue with VNC/ARD servers during long-running sessions (~50 turns). The SSH connection to the remote Mac typically remains active, indicating the issue is on the VNC server side rather than the daemon or network layer. Source: [GitHub Issue #8](https://github.com/ARAS-Workspace/claude-kvm/issues/8)

For CI/CD integration, consider implementing retry logic around screen capture operations. The daemon supports automatic reconnection unless `--no-reconnect` is specified. Source: [README_TR.md](https://github.com/ARAS-Workspace/claude-kvm/blob/main/README_TR.md)

## Tool Schema

Screen operations are defined in the MCP tool schema:

| Parameter | Type | Required | Description |
|-----------|------|----------|-------------|
| action | enum | Yes | One of: screenshot, cursor_crop, diff_check, set_baseline |

Source: [tools/index.js](https://github.com/ARAS-Workspace/claude-kvm/blob/main/tools/index.js)

## Architecture

```mermaid
graph TD
    subgraph MCP_Proxy["MCP Proxy (Node.js)"]
        Tools["Tools Interface"]
    end
    
    subgraph VNC_Daemon["VNC Daemon (Swift/C)"]
        CMD["CMD Handler"]
        Capture["Capture Module"]
        VNC["VNC Client"]
    end
    
    subgraph VNC_Server["VNC Server"]
        Framebuffer["Framebuffer<br/>:5900"]
        Desktop["Desktop Environment"]
    end
    
    Tools -->|"vnc_command"| CMD
    CMD --> Capture
    Capture -->|"RFB Protocol"| VNC
    VNC --> Framebuffer
    Framebuffer --> Desktop
    
    classDef proxy fill:#1a1a2e,stroke:#16213e,color:#e5e5e5
    classDef daemon fill:#0f3460,stroke:#533483,color:#e5e5e5
    classDef target fill:#1a1a2e,stroke:#e94560,color:#e5e5e5
    
    class Tools proxy
    class CMD,Capture,VNC daemon
    class Framebuffer,Desktop target
```

The Capture module receives raw framebuffer updates from the VNC server and encodes them as PNG for transmission to the MCP proxy layer. Source: [README.md](https://github.com/ARAS-Workspace/claude-kvm/blob/main/README.md)

## See Also

- [Mouse Operations](#input-control): Cursor movement and clicking
- [Keyboard Operations](#input-control): Key input and typing
- [OCR Detection](#ocr-detection): Text element identification
- [Configuration](#input-control): Runtime parameter adjustment

---

<a id='input-control'></a>

## Input Control (Mouse & Keyboard)

<details>
<summary>Related Source Files</summary>

The following source files were used to generate this page:

- [tools/index.js](https://github.com/ARAS-Workspace/claude-kvm/blob/main/tools/index.js)
- [README.md](https://github.com/ARAS-Workspace/claude-kvm/blob/main/README.md)
- [LAUNCHGUIDE.md](https://github.com/ARAS-Workspace/claude-kvm/blob/main/LAUNCHGUIDE.md)
- [server.json](https://github.com/ARAS-Workspace/claude-kvm/blob/main/server.json)
- [package.json](https://github.com/ARAS-Workspace/claude-kvm/blob/main/package.json)
</details>

# Input Control (Mouse & Keyboard)

## Overview

Input Control in Claude KVM provides comprehensive mouse and keyboard automation capabilities for remote VNC desktop environments. The Swift daemon translates high-level MCP tool calls into RFB (Remote Framebuffer) protocol events, enabling precise interaction with remote desktops from AI agents. All coordinate-based actions operate within a scaled coordinate space that maps to the native framebuffer resolution, ensuring consistent behavior across different display configurations.

The input system supports seven mouse actions and four keyboard actions, each with configurable timing parameters that control press duration, inter-action delays, and settle times. This design allows the system to adapt to varying VNC server responsiveness and target application latency, which is particularly important for macOS targets using Apple Remote Desktop where timing-sensitive applications may require tuning.

## Architecture

```mermaid
graph TD
    subgraph MCP_Proxy["MCP Proxy Layer"]
        Server["MCP Server<br/>JSON-RPC stdio"]
        Tools["Tools Handler"]
    end
    
    subgraph Daemon["VNC Daemon Layer"]
        CMD["Command Parser<br/>PC Protocol"]
        Mouse["Mouse Controller"]
        KB["Keyboard Controller"]
        Scale["Coordinate Scaler"]
    end
    
    subgraph VNC["VNC/RFB Layer"]
        VNC_Client["VNC Client<br/>RFB Protocol"]
    end
    
    Server -->|method + params| Tools
    Tools -->|vnc_command| CMD
    CMD -->|scaled coords| Scale
    CMD -->|key commands| KB
    Scale -->|native coords| Mouse
    Mouse -->|pointer events| VNC_Client
    KB -->|key events| VNC_Client
    
    classDef proxy fill:#1a1a2e,stroke:#16213e,color:#e5e5e5
    classDef daemon fill:#0f3460,stroke:#533483,color:#e5e5e5
    classDef vnc fill:#1a1a2e,stroke:#e94560,color:#e5e5e5
    
    class Server,Tools proxy
    class CMD,Mouse,KB,Scale daemon
    class VNC_Client vnc
```

The daemon receives commands via stdin using the PC (Procedure Call) protocol over NDJSON. Each command specifies an action type and parameters, which the command parser routes to the appropriate controller. The coordinate scaler transforms all XY coordinates from the scaled display space (default max 1280px) to the native framebuffer resolution before injection.

Source: [tools/index.js:1-85](https://github.com/ARAS-Workspace/claude-kvm/blob/main/tools/index.js)

## Mouse Control

### Supported Actions

The mouse controller handles all pointer-based interactions through the VNC connection. Each action maps to specific RFB pointer events that the VNC server interprets and forwards to the desktop environment.

| Action | Parameters | Description |
|--------|------------|-------------|
| `mouse_click` | `x`, `y`, `button?` | Single click at coordinates (default: left button) |
| `mouse_double_click` | `x`, `y` | Double click at coordinates |
| `mouse_move` | `x`, `y` | Move cursor to position without clicking |
| `hover` | `x`, `y` | Move cursor and wait for settle |
| `nudge` | `dx`, `dy` | Relative cursor movement from current position |
| `mouse_drag` | `x`, `y`, `toX`, `toY` | Press, drag, and release from start to end |
| `scroll` | `x`, `y`, `direction`, `amount?` | Scroll in direction (up/down/left/right) |

Source: [tools/index.js:45-55](https://github.com/ARAS-Workspace/claude-kvm/blob/main/tools/index.js)

### Coordinate System

Mouse coordinates operate in the scaled coordinate space, not native screen pixels. The daemon automatically scales coordinates based on the `max_dimension` parameter (default 1280px) and the actual VNC framebuffer dimensions.

```
Native:  4220 × 2568  (VNC server framebuffer)
Scaled:  1280 × 779   (what Claude targets)

mouse_click(640, 400) → Daemon scales to → VNC receives (2110, 1319)
```

Source: [README.md:148-156](https://github.com/ARAS-Workspace/claude-kvm/blob/main/README.md)
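
The mapping between the two spaces can be sketched as below. The scaling rule is an assumption inferred from the figures above (the longer native side is fitted to `max_dimension` and both axes share one factor); the function names are hypothetical and the daemon's exact rounding may differ.

```javascript
// Assumed rule: fit the longer native side to maxDimension; never upscale.
function scaleFactor(nativeWidth, nativeHeight, maxDimension = 1280) {
  const longest = Math.max(nativeWidth, nativeHeight);
  return longest > maxDimension ? longest / maxDimension : 1;
}

// Size of the scaled display that Claude targets.
function scaledSize(nativeWidth, nativeHeight, maxDimension = 1280) {
  const s = scaleFactor(nativeWidth, nativeHeight, maxDimension);
  return { width: Math.round(nativeWidth / s), height: Math.round(nativeHeight / s) };
}

// Map a point from scaled space to native framebuffer pixels.
function toNative(x, y, nativeWidth, nativeHeight, maxDimension = 1280) {
  const s = scaleFactor(nativeWidth, nativeHeight, maxDimension);
  return { x: Math.round(x * s), y: Math.round(y * s) };
}

console.log(scaledSize(4220, 2568));          // { width: 1280, height: 779 }
console.log(toNative(320, 200, 4220, 2568));  // { x: 1055, y: 659 }
```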

### Drag Implementation

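Mouse dragging uses interpolated movement between the start and end points. As the action table describes, the daemon presses the button at the start point, moves the pointer through a series of discrete steps so that the target application receives intermediate motion events, and releases the button at the end point. The parameters below control the waits around each phase and the granularity of the steps.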

| Parameter | Default | Description |
|-----------|---------|-------------|
| `drag_position_ms` | 30 | Wait before starting drag movement |
| `drag_press_ms` | 50 | Button press duration before drag |
| `drag_step_ms` | 5 | Delay between interpolation steps |
| `drag_settle_ms` | 30 | Wait after releasing button |
| `drag_pixels_per_step` | 20 | Pixels traveled per interpolation step |
| `drag_min_steps` | 10 | Minimum number of interpolation steps |

Source: [README.md:198-210](https://github.com/ARAS-Workspace/claude-kvm/blob/main/README.md)
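
One plausible reading of `drag_pixels_per_step` and `drag_min_steps` is sketched below. The rule (one step per 20 pixels of straight-line distance, with a floor of 10 steps) is an assumption based on the parameter descriptions, not a confirmed description of the daemon's code, and `dragSteps` and `dragPath` are hypothetical names.

```javascript
// Defaults taken from the parameter table above.
const DRAG_DEFAULTS = { drag_pixels_per_step: 20, drag_min_steps: 10 };

// Assumed rule: one step per `drag_pixels_per_step` of distance,
// but never fewer than `drag_min_steps`.
function dragSteps(from, to, cfg = DRAG_DEFAULTS) {
  const distance = Math.hypot(to.x - from.x, to.y - from.y);
  return Math.max(cfg.drag_min_steps, Math.ceil(distance / cfg.drag_pixels_per_step));
}

// Linearly interpolated pointer positions, ending exactly on `to`.
function dragPath(from, to, cfg = DRAG_DEFAULTS) {
  const steps = dragSteps(from, to, cfg);
  const path = [];
  for (let i = 1; i <= steps; i++) {
    const t = i / steps;
    path.push({
      x: Math.round(from.x + (to.x - from.x) * t),
      y: Math.round(from.y + (to.y - from.y) * t),
    });
  }
  return path;
}

console.log(dragSteps({ x: 100, y: 200 }, { x: 400, y: 300 })); // 16
console.log(dragSteps({ x: 0, y: 0 }, { x: 30, y: 0 }));        // 10
```

Under this reading, short drags are still spread over at least ten motion events, which helps applications that ignore a press and release with no movement in between.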

### Scroll Behavior

Scroll actions generate wheel events with configurable tick parameters. The scroll operation moves the mouse cursor to the specified coordinates and then sends wheel events in the requested direction.

| Parameter | Default | Description |
|-----------|---------|-------------|
| `scroll_press_ms` | 10 | Button down duration for scroll event |
| `scroll_tick_ms` | 20 | Delay between consecutive scroll ticks |

Source: [README.md:213-216](https://github.com/ARAS-Workspace/claude-kvm/blob/main/README.md)

## Keyboard Control

### Supported Actions

The keyboard controller translates text and key commands into VNC key events. It handles both individual key presses and complex modifier combinations required for macOS shortcuts.

| Action | Parameters | Description |
|--------|------------|-------------|
| `key_tap` | `key` | Single key press (enter, escape, tab, space, etc.) |
| `key_combo` | `key` or `keys[]` | Modifier combination (e.g., "cmd+c" or ["cmd","shift","3"]) |
| `key_type` | `text` | Type text character by character |
| `paste` | `text` | Paste text via clipboard |

Source: [tools/index.js:56-60](https://github.com/ARAS-Workspace/claude-kvm/blob/main/tools/index.js)

### Character-by-Character Typing

The `key_type` action sends each character individually rather than as a string, which ensures compatibility with applications that do not support direct string input. The daemon inserts configurable delays between keystrokes to account for input latency.

| Parameter | Default | Description |
|-----------|---------|-------------|
| `type_key_ms` | 20 | Key press duration during typing |
| `type_inter_key_ms` | 20 | Delay between character inputs |
| `type_shift_ms` | 10 | Settle time for shift modifier |

Source: [README.md:222-225](https://github.com/ARAS-Workspace/claude-kvm/blob/main/README.md)

### Modifier Combinations

Modifier combinations support both string shortcuts ("cmd+c") and array syntax (["cmd","shift","3"]). The daemon handles Meta-to-Super key remapping for macOS targets, converting between Windows/Linux and macOS modifier key semantics automatically.

| Parameter | Default | Description |
|-----------|---------|-------------|
| `combo_mod_ms` | 10 | Settle time for modifier key press |
| `key_hold_ms` | 30 | Modifier key hold duration |

Source: [README.md:217-220](https://github.com/ARAS-Workspace/claude-kvm/blob/main/README.md)
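
Because `key_combo` accepts both a string and an array, a caller may want to normalize its own input before building the request. The helper below is a hypothetical client-side sketch; lowercasing key names and requiring at least two keys are choices made for this example, not documented daemon behavior.

```javascript
// Accepts "cmd+c" or ["cmd", "shift", "3"] and returns a lowercase key list.
function normalizeCombo(input) {
  const parts = Array.isArray(input) ? input : String(input).split("+");
  const keys = parts.map((k) => k.trim().toLowerCase()).filter((k) => k.length > 0);
  if (keys.length < 2) {
    throw new Error("a combo needs at least one modifier and one key");
  }
  return keys;
}

// Build the params for a key_combo call using the array form.
function keyComboParams(input) {
  return { action: "key_combo", keys: normalizeCombo(input) };
}

console.log(keyComboParams("cmd+c"));
// { action: 'key_combo', keys: [ 'cmd', 'c' ] }
```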

### Clipboard Paste

The paste action writes text to the system clipboard and then simulates the paste keyboard shortcut. This approach bypasses character encoding issues that can occur with direct keyboard input.

| Parameter | Default | Description |
|-----------|---------|-------------|
| `paste_settle_ms` | 30 | Wait time after clipboard write |

Source: [README.md:226](https://github.com/ARAS-Workspace/claude-kvm/blob/main/README.md)

## Timing Configuration

### Runtime Configuration

All timing parameters can be adjusted at runtime using the `configure` action without reconnecting to the VNC server. Current values are retrievable via the `get_timing` action.

```json
{"method":"configure","params":{"click_hold_ms":80,"key_hold_ms":50}}
```

```json
{"result":{"detail":"OK — changed: click_hold_ms, key_hold_ms"}}
```

Reset all parameters to defaults:

```json
{"method":"configure","params":{"reset":true}}
```

Source: [README.md:183-197](https://github.com/ARAS-Workspace/claude-kvm/blob/main/README.md)
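
For scripted use, the request and response shapes shown above can be handled with two small helpers. This is a sketch with hypothetical function names; it assumes the `detail` string always lists changed parameters after the text `changed:`, as in the example response.

```javascript
// Serialize a configure request as one NDJSON line (one JSON object per line).
function configureLine(params) {
  return JSON.stringify({ method: "configure", params }) + "\n";
}

// Extract parameter names from the text after "changed:" in a detail string.
// Returns an empty array when the marker is absent.
function changedParams(detail) {
  const marker = "changed:";
  const at = detail.indexOf(marker);
  if (at === -1) return [];
  return detail
    .slice(at + marker.length)
    .split(",")
    .map((name) => name.trim())
    .filter((name) => name.length > 0);
}

console.log(configureLine({ click_hold_ms: 80, key_hold_ms: 50 }));
```

Comparing the returned names with the keys that were sent is a cheap way to notice a misspelled parameter.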

### Complete Parameter Reference

| Parameter | Default | Description |
|-----------|---------|-------------|
| `click_hold_ms` | 50 | Mouse button press duration |
| `double_click_gap_ms` | 50 | Interval between double-click events |
| `hover_settle_ms` | 400 | Wait time after hover movement |
| `drag_position_ms` | 30 | Pre-drag position wait |
| `drag_press_ms` | 50 | Button press before drag |
| `drag_step_ms` | 5 | Interpolation step delay |
| `drag_settle_ms` | 30 | Post-drag release wait |
| `drag_pixels_per_step` | 20 | Pixels per drag interpolation |
| `drag_min_steps` | 10 | Minimum drag steps |
| `scroll_press_ms` | 10 | Scroll event duration |
| `scroll_tick_ms` | 20 | Between scroll ticks |
| `key_hold_ms` | 30 | Key press duration |
| `combo_mod_ms` | 10 | Modifier settle time |
| `type_key_ms` | 20 | Type character press |
| `type_inter_key_ms` | 20 | Between characters |
| `type_shift_ms` | 10 | Shift modifier settle |
| `paste_settle_ms` | 30 | Post-paste wait |

Source: [README.md:198-226](https://github.com/ARAS-Workspace/claude-kvm/blob/main/README.md)

## Action Queue

The `action_queue` tool enables batching up to 20 sequential input actions in a single tool call. This reduces round-trip latency for multi-step workflows such as selecting text or navigating menus.

```json
{
  "name": "action_queue",
  "description": "Batch up to 20 sequential VNC actions, returns text-only results",
  "inputSchema": {
    "queue": [
      {
        "action": "key_tap",
        "key": "cmd"
      },
      {
        "action": "key_type",
        "text": "hello world"
      }
    ]
  }
}
```

Source: [tools/index.js:98-130](https://github.com/ARAS-Workspace/claude-kvm/blob/main/tools/index.js)
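
Workflows longer than the 20-action limit have to be split across several `action_queue` calls. A minimal sketch of that split, using a hypothetical `batchActions` helper:

```javascript
const MAX_QUEUE_LENGTH = 20; // limit stated for action_queue

// Split a long list of actions into consecutive batches of at most `size`.
function batchActions(actions, size = MAX_QUEUE_LENGTH) {
  const batches = [];
  for (let i = 0; i < actions.length; i += size) {
    batches.push(actions.slice(i, i + size));
  }
  return batches;
}

// 45 key taps become three batches of 20, 20, and 5 actions.
const taps = Array.from({ length: 45 }, () => ({ action: "key_tap", key: "tab" }));
console.log(batchActions(taps).map((b) => b.length)); // [ 20, 20, 5 ]
```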

## Known Limitations

### VNC Connection Stability

During extended test sessions with high-frequency input operations, VNC connections may become unstable. The community has reported connection drops occurring after approximately 50 turns of interaction, though SSH connectivity to the remote Mac remains active. This indicates the issue originates in the VNC/ARD server layer rather than the daemon or network infrastructure.

Users experiencing this behavior should consider implementing periodic reconnection logic or monitoring connection health via the `health` action. The daemon provides connection status and display information through this action, allowing automated systems to detect and recover from degraded states.

> **Community Issue**: [VNC connection drops during long-running test sessions](https://github.com/ARAS-Workspace/claude-kvm/issues/8) — VNC connections may drop during extended sessions with frequent input operations. SSH remains functional, confirming the issue is VNC/ARD server-side.

### Timing Sensitivity on macOS

Applications with strict timing requirements may need parameter tuning. The default values are optimized for general desktop interaction but may not suit latency-sensitive operations. Adjust `click_hold_ms` and `type_inter_key_ms` to accommodate applications that require faster or slower input rates.

## Tool Definitions

### vnc_command Schema

```typescript
{
  action: z.enum([
    'screenshot', 'cursor_crop', 'diff_check', 'set_baseline',
    'mouse_click', 'mouse_double_click', 'mouse_move', 'hover', 'nudge',
    'mouse_drag', 'scroll',
    'key_tap', 'key_combo', 'key_type', 'paste',
    'detect_elements',
    'configure', 'get_timing',
    'wait', 'health', 'shutdown',
  ]),
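  // x, y, toX, toY are scaled-space coordinates; `width` and `height` are
  // presumably the scaled display dimensions computed elsewhere in the file.
  x: z.number().int().min(0).max(width - 1).optional(),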
  y: z.number().int().min(0).max(height - 1).optional(),
  toX: z.number().int().min(0).max(width - 1).optional(),
  toY: z.number().int().min(0).max(height - 1).optional(),
  dx: z.number().int().min(-50).max(50).optional(),
  dy: z.number().int().min(-50).max(50).optional(),
  direction: z.enum(['up', 'down', 'left', 'right']).optional(),
  amount: z.number().int().min(1).max(200).optional(),
  key: z.string().optional(),
  keys: z.array(z.string()).optional(),
  text: z.string().optional(),
  button: z.enum(['left', 'right', 'middle']).optional(),
}
```

Source: [tools/index.js:42-85](https://github.com/ARAS-Workspace/claude-kvm/blob/main/tools/index.js)
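
The schema restricts `dx` and `dy` to the range -50 to 50, so a larger relative move has to be issued as several `nudge` actions. The helper below is a hypothetical client-side sketch of that decomposition.

```javascript
const NUDGE_LIMIT = 50; // dx and dy are restricted to [-50, 50] in the schema

// Clamp a value into [-NUDGE_LIMIT, NUDGE_LIMIT].
function clampNudge(v) {
  return Math.max(-NUDGE_LIMIT, Math.min(NUDGE_LIMIT, v));
}

// Break one relative move into nudge actions that each fit the schema.
function splitNudge(dx, dy) {
  const actions = [];
  let remainingX = Math.trunc(dx);
  let remainingY = Math.trunc(dy);
  while (remainingX !== 0 || remainingY !== 0) {
    const stepX = clampNudge(remainingX);
    const stepY = clampNudge(remainingY);
    actions.push({ action: "nudge", dx: stepX, dy: stepY });
    remainingX -= stepX;
    remainingY -= stepY;
  }
  return actions;
}

console.log(splitNudge(120, -30).length); // 3
```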

## Usage Examples

### Basic Mouse Click

```json
{"method":"vnc_command","params":{"action":"mouse_click","x":640,"y":400}}
```

### Drag Operation

```json
{"method":"vnc_command","params":{"action":"mouse_drag","x":100,"y":200,"toX":400,"toY":300}}
```

### Keyboard Shortcut

```json
{"method":"vnc_command","params":{"action":"key_combo","keys":["cmd","shift","3"]}}
```

### Type Text

```json
{"method":"vnc_command","params":{"action":"key_type","text":"Hello, World!"}}
```

Source: [README.md:160-179](https://github.com/ARAS-Workspace/claude-kvm/blob/main/README.md)

## Related Documentation

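- [Server Configuration](https://github.com/ARAS-Workspace/claude-kvm/blob/main/server.json): Environment variables and setup
- [Package Information](https://github.com/ARAS-Workspace/claude-kvm/blob/main/package.json): MCP server metadata
- [Launch Guide](https://github.com/ARAS-Workspace/claude-kvm/blob/main/LAUNCHGUIDE.md): Initial setup and daemon installation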

---

<a id='ocr-detection'></a>

## OCR Detection

### Related Pages

Related topics: [Screen Operations](#screen-operations)

<details>
<summary>Related Source Files</summary>

The following source files were used to generate this page:

- [README.md](https://github.com/ARAS-Workspace/claude-kvm/blob/main/README.md)
- [README_TR.md](https://github.com/ARAS-Workspace/claude-kvm/blob/main/README_TR.md)
- [tools/index.js](https://github.com/ARAS-Workspace/claude-kvm/blob/main/tools/index.js)
- [server.json](https://github.com/ARAS-Workspace/claude-kvm/blob/main/server.json)
- [index.js](https://github.com/ARAS-Workspace/claude-kvm/blob/main/index.js)
- [LAUNCHGUIDE.md](https://github.com/ARAS-Workspace/claude-kvm/blob/main/LAUNCHGUIDE.md)
</details>

# OCR Detection

## Overview

OCR Detection (`detect_elements`) is a core capability of the claude-kvm tool that performs on-device text recognition using Apple's Vision framework. It scans the current screen state and returns all detected text elements with their bounding box coordinates in the scaled coordinate space used by the VNC session.

**Key characteristics:**

- **On-device processing**: The daemon runs the Apple Vision framework locally on the captured frame, so recognition incurs no API cost and no extra network round trip
- **Bounding box coordinates**: Returns x, y, width, and height for each detected element
- **Scaled coordinate space**: All coordinates match the scaled display dimensions, enabling direct use with mouse actions
- **Fast performance**: Approximately 50ms per detection operation
- **Confidence scoring**: Each detected element includes a confidence value

Source: [README_TR.md](https://github.com/ARAS-Workspace/claude-kvm/blob/main/README_TR.md)

## Architecture

```mermaid
graph TD
    A[Claude Agent] -->|vnc_command<br/>detect_elements| B[MCP Proxy<br/>index.js]
    B -->|NDJSON PC Protocol| C[VNC Daemon<br/>claude-kvm-daemon]
    C -->|RFB Protocol| D[Remote Desktop<br/>VNC Server]
    D -->|Framebuffer for Vision OCR| C
    C -->|OCR Results| B
    B -->|JSON Response| A
```

The OCR Detection feature operates within the broader VNC daemon architecture:

| Layer | Technology | Role |
|-------|------------|------|
| MCP Proxy | JavaScript/Node.js | Receives tool calls, spawns daemon, relays results |
| VNC Daemon | Swift/C | Connects to VNC, captures screen, runs Vision OCR |
| Remote desktop | VNC server | Supplies the framebuffer that the daemon captures and analyzes |

Source: [README.md](https://github.com/ARAS-Workspace/claude-kvm/blob/main/README.md)

## Tool Definition

The `detect_elements` action is defined as part of the `vnc_command` tool schema:

```javascript
action: z.enum([
  'screenshot', 'cursor_crop', 'diff_check', 'set_baseline',
  'mouse_click', 'mouse_double_click', 'mouse_move', 'hover', 'nudge',
  'mouse_drag', 'scroll',
  'key_tap', 'key_combo', 'key_type', 'paste',
  'detect_elements',  // <-- OCR action
  'configure', 'get_timing',
  'wait', 'health', 'shutdown',
]).describe('The action to perform'),
```

Source: [tools/index.js](https://github.com/ARAS-Workspace/claude-kvm/blob/main/tools/index.js)

## API Reference

### Request

```json
{
  "method": "vnc_command",
  "params": {
    "action": "detect_elements"
  }
}
```

### Response

| Field | Type | Description |
|-------|------|-------------|
| `detail` | string | Human-readable count of detected elements |
| `elements` | array | Array of detected text elements |
| `scaledWidth` | integer | Current scaled display width |
| `scaledHeight` | integer | Current scaled display height |

### Element Object

| Field | Type | Description |
|-------|------|-------------|
| `text` | string | Detected text content |
| `x` | integer | X coordinate of bounding box top-left |
| `y` | integer | Y coordinate of bounding box top-left |
| `w` | integer | Width of bounding box |
| `h` | integer | Height of bounding box |
| `confidence` | number | Recognition confidence (0-1) |

### Example Response

```json
{
  "result": {
    "detail": "13 elements",
    "elements": [
      {"confidence": 1, "h": 9, "text": "Finder", "w": 32, "x": 37, "y": 6},
      {"confidence": 1, "h": 9, "text": "File", "w": 15, "x": 84, "y": 6},
      {"confidence": 1, "h": 9, "text": "Edit", "w": 19, "x": 112, "y": 6},
      {"confidence": 1, "h": 11, "text": "Go", "w": 15, "x": 179, "y": 6},
      {"confidence": 1, "h": 9, "text": "Window", "w": 35, "x": 207, "y": 6},
      {"confidence": 1, "h": 11, "text": "Help", "w": 22, "x": 255, "y": 6},
      {"confidence": 1, "h": 11, "text": "8•", "w": 26, "x": 1161, "y": 6},
      {"confidence": 1, "h": 9, "text": "Fri Feb 20 22:19", "w": 80, "x": 1189, "y": 6},
      {"confidence": 1, "h": 93, "text": "PHANTOM", "w": 633, "x": 322, "y": 477},
      {"confidence": 1, "h": 32, "text": "YOUR SERVER, YOUR NETWORK, YOUR PRIVACY", "w": 629, "x": 325, "y": 568}
    ],
    "scaledHeight": 717,
    "scaledWidth": 1280
  }
}
```

The `detail` field reports 13 elements, while the `elements` array shown above lists 10 of them.

Source: [README_TR.md](https://github.com/ARAS-Workspace/claude-kvm/blob/main/README_TR.md)

## Coordinate System

All bounding box coordinates returned by `detect_elements` are in the **scaled coordinate space**. This means:

1. The VNC server's native resolution is scaled down to fit within `--max-dimension` (default: 1280px)
2. The coordinates can be used directly with mouse actions without conversion
3. The `scaledWidth` and `scaledHeight` fields indicate the current display dimensions

```
Native:  4220 x 2568  (VNC server framebuffer)
Scaled:  1280 x 779   (what detect_elements returns)
```

Source: [README.md](https://github.com/ARAS-Workspace/claude-kvm/blob/main/README.md)

## Usage Patterns

### Basic Text Detection

1. Take a screenshot to verify the current state
2. Run `detect_elements` to find all text
3. Identify the target element by its text content
4. Use the bounding box coordinates for `mouse_click`

### Workflow Integration

The `detect_elements` action fits into the standard interaction pattern:

| Step | Action | Purpose |
|------|--------|---------|
| 1 | `screenshot` | Verify current screen state |
| 2 | `detect_elements` | Get text positions |
| 3 | `mouse_click` | Click on target element |
| 4 | `screenshot` | Verify action result |

### Precise Click Targeting

Instead of guessing click positions, use `detect_elements` to locate buttons or text by their content:

```javascript
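// Element as returned by detect_elements:
const el = { text: "Submit", x: 450, y: 320, w: 80, h: 30 };

// Click the center of the bounding box (x + w/2, y + h/2):
const click = {
  action: "mouse_click",
  x: Math.round(el.x + el.w / 2), // 490
  y: Math.round(el.y + el.h / 2), // 335
};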
```
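
Looking an element up by its text and turning it into a click can be wrapped in one helper. The sketch below is hypothetical client-side code; exact text matching and the 0.9 confidence threshold are choices made for the example.

```javascript
// Find the first OCR element whose text matches exactly and whose confidence
// meets the threshold, then return a mouse_click at the center of its box.
function clickTargetFor(result, text, minConfidence = 0.9) {
  const match = result.elements.find(
    (e) => e.text === text && e.confidence >= minConfidence
  );
  if (!match) return null;
  return {
    action: "mouse_click",
    x: Math.round(match.x + match.w / 2),
    y: Math.round(match.y + match.h / 2),
  };
}

// Two elements copied from the example response earlier in this section.
const sample = {
  elements: [
    { confidence: 1, h: 9, text: "Finder", w: 32, x: 37, y: 6 },
    { confidence: 1, h: 9, text: "File", w: 15, x: 84, y: 6 },
  ],
};

console.log(clickTargetFor(sample, "Finder")); // { action: 'mouse_click', x: 53, y: 11 }
console.log(clickTargetFor(sample, "Quit"));   // null
```

Returning `null` rather than guessing a position lets the caller fall back to a screenshot when the expected label is not on screen.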

## Performance Characteristics

| Metric | Value | Notes |
|--------|-------|-------|
| Detection time | ~50ms | Apple Vision on Apple Silicon |
| Memory usage | Moderate | Cached Vision request |
| Accuracy | High | Confidence values returned |

## Limitations and Considerations

### VNC Session Stability

During long-running test sessions with many operations, VNC connections may drop. This is a known issue on the VNC/ARD server side and is unrelated to the OCR feature itself:

> During extended test sessions (~50 turns), the VNC connection drops and the daemon becomes unresponsive. SSH to the remote Mac remains active, confirming the issue is on the VNC/ARD server side — not the daemon or network layer.

Source: [Issue #8](https://github.com/ARAS-Workspace/claude-kvm/issues/8)

**Mitigation strategies:**
- Implement reconnection logic in automation scripts
- Use `--no-reconnect` flag and handle reconnection manually
- Consider restarting the VNC server if drops occur frequently

### Confidence Values

Not all elements return `confidence: 1`. Lower confidence values indicate uncertain recognition. Consider adding validation logic for critical UI elements:

```javascript
// Filter high-confidence elements only
const reliableElements = result.elements.filter(el => el.confidence >= 0.9);
```

### Scaling Dependencies

OCR results are tied to the current display scaling. If `--max-dimension` changes, bounding box coordinates will differ. Always use coordinates from the same session's detection.

## Configuration

The OCR feature itself has no separate configuration, but display scaling affects results:

| Parameter | Env Variable | Default | Effect |
|-----------|--------------|---------|--------|
| Max dimension | `CLAUDE_KVM_DAEMON_PARAMETERS` | `--max-dimension 1280` | Controls scaled resolution |

```json
{
  "env": {
    "CLAUDE_KVM_DAEMON_PARAMETERS": "--max-dimension 960"
  }
}
```

Source: [server.json](https://github.com/ARAS-Workspace/claude-kvm/blob/main/server.json)

## See Also

- [Mouse Actions](#input-control): Using coordinates returned by detect_elements
- [Screenshot Actions](#screen-operations): Capturing screen state before detection
- [Configuration](#environment-variables): Runtime timing and display parameters

---

<a id='authentication'></a>

## Authentication Methods

### Related Pages

Related topics: [Quick Start Guide](#quickstart), [Environment Variables](#environment-variables)

<details>
<summary>Related Source Files</summary>

The following source files were used to generate this page:

- [README.md](https://github.com/ARAS-Workspace/claude-kvm/blob/main/README.md)
- [README_TR.md](https://github.com/ARAS-Workspace/claude-kvm/blob/main/README_TR.md)
- [server.json](https://github.com/ARAS-Workspace/claude-kvm/blob/main/server.json)
- [index.js](https://github.com/ARAS-Workspace/claude-kvm/blob/main/index.js)
- [tools/index.js](https://github.com/ARAS-Workspace/claude-kvm/blob/main/tools/index.js)
- [package.json](https://github.com/ARAS-Workspace/claude-kvm/blob/main/package.json)
</details>

# Authentication Methods

Claude KVM supports two primary VNC authentication mechanisms for remote desktop control: **VNC Auth** (standard challenge-response) and **Apple Remote Desktop (ARD)** authentication. The daemon automatically detects and negotiates the appropriate authentication method based on the VNC server's capabilities.

## Overview

Authentication takes place between the native VNC daemon (Swift/C) and the VNC server; the MCP proxy (JavaScript/Node.js) only supplies the credentials. When establishing a remote desktop connection, the daemon performs protocol negotiation with the VNC server to determine which authentication method to use.

```mermaid
graph TD
    subgraph Client_Side
        MCP[MCP Proxy<br/>JavaScript]
        CREDS[Credentials<br/>from Environment]
    end
    
    subgraph Daemon_Side
        DAEMON[claude-kvm-daemon<br/>Swift/C]
        LIBVNC[LibVNC<br/>Core]
    end
    
    subgraph Remote_Server
        VNC_SERVER[VNC Server<br/>macOS]
    end
    
    MCP -->|spawn process| DAEMON
    CREDS -->|VNC_USERNAME<br/>VNC_PASSWORD| MCP
    MCP -->|forward via stdin| DAEMON
    DAEMON -->|RFB Protocol| VNC_SERVER
    VNC_SERVER -->|auth challenge| DAEMON
    
    style VNC_SERVER fill:#f96
    style DAEMON fill:#bbf
    style MCP fill:#bfb
```

Source: [index.js:1-30](https://github.com/ARAS-Workspace/claude-kvm/blob/main/index.js), [README.md](https://github.com/ARAS-Workspace/claude-kvm/blob/main/README.md)

## Supported Authentication Types

### VNC Auth (Standard)

VNC Auth is the traditional password-based authentication method used by most VNC servers. It implements a challenge-response protocol using the DES cipher.

| Property | Value |
|----------|-------|
| **Security Level** | Basic |
| **Encryption** | DES-based challenge-response |
| **Key Exchange** | None (password stored in VNC server) |
| **Credential Format** | Password only |

The authentication flow for VNC Auth:

1. Client connects to VNC server
2. Server sends a 16-byte random challenge
3. Client encrypts challenge using password as DES key
4. Client sends 16-byte response
5. Server validates response against stored password

Source: [README.md](https://github.com/ARAS-Workspace/claude-kvm/blob/main/README.md)
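
Step 3 hides one well-known quirk of VNC Auth: the password is truncated or zero-padded to 8 bytes, and common implementations mirror the bits of each byte before using it as the DES key. The sketch below shows only that key preparation. It reflects general VNC behavior rather than anything stated in the claude-kvm sources, and the function names are hypothetical.

```javascript
// Reverse the bit order of one byte, e.g. 0b01100001 becomes 0b10000110.
function reverseBits(byte) {
  let out = 0;
  for (let i = 0; i < 8; i++) {
    out = (out << 1) | ((byte >> i) & 1);
  }
  return out;
}

// Prepare the 8-byte DES key: truncate or zero-pad the password to 8 bytes,
// then mirror the bits of every byte.
function vncDesKey(password) {
  const raw = Buffer.from(password, "latin1");
  const key = Buffer.alloc(8); // zero-filled
  for (let i = 0; i < 8 && i < raw.length; i++) {
    key[i] = reverseBits(raw[i]);
  }
  return key;
}

console.log(vncDesKey("a").toString("hex")); // 8600000000000000
```

The 16-byte challenge is then encrypted with this key as two 8-byte DES blocks, which is why the response is also 16 bytes. A practical consequence is that only the first 8 characters of the password matter.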

### Apple Remote Desktop (ARD)

ARD authentication is a proprietary macOS authentication method that provides enhanced security through cryptographic key exchange and symmetric encryption.

| Property | Value |
|----------|-------|
| **Security Level** | Enhanced |
| **Key Exchange** | Diffie-Hellman |
| **Encryption** | AES-128-ECB |
| **Auth Type** | 30 (ARD credential request) |
| **Credential Format** | Username + Password |

The authentication flow for ARD:

1. Client initiates connection with ARD auth type request
2. Server responds with Diffie-Hellman parameters
3. Client and server perform key exchange
4. Client encrypts credentials using derived session key (AES-128-ECB)
5. Server decrypts and validates credentials

Source: [README.md](https://github.com/ARAS-Workspace/claude-kvm/blob/main/README.md)

## Auto-Detection Mechanism

The daemon automatically detects whether the remote VNC server is a standard VNC server or an Apple Remote Desktop server. This detection occurs during the RFB protocol handshake phase.

```mermaid
sequenceDiagram
    participant C as MCP Proxy
    participant D as claude-kvm-daemon
    participant S as VNC Server
    
    C->>D: Spawn daemon process
    D->>S: TCP Connection
    S-->>D: Server Protocol Version
    D->>S: Client Protocol Version
    S-->>D: Security Types List
    Note over D: Checks for auth type 30<br/>(ARD)
    alt ARD Detected
        Note over D: Use DH key exchange<br/>+ AES-128-ECB
        D->>S: Auth Type 30 Request
        S-->>D: DH Parameters
        D->>S: DH Public Key + AES-Encrypted Credentials
    else Standard VNC
        Note over D: Use DES Challenge-Response
        D->>S: Auth Type 2 Request
        S-->>D: 16-byte Challenge
        D->>S: 16-byte Response
    end
    S-->>D: Auth Result
```

The auto-detection is triggered by observing the **auth type 30** credential request from the server, which is specific to macOS ARD implementations.

Source: [README.md](https://github.com/ARAS-Workspace/claude-kvm/blob/main/README.md)
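
The selection logic in the diagram can be summarized in a few lines. This is a simplified sketch, not the daemon's Swift/C code: `chooseAuth` and the shape of its return value are hypothetical, and only the two security types discussed in this section are handled.

```javascript
const SECURITY_TYPE_VNC_AUTH = 2; // standard VNC challenge-response
const SECURITY_TYPE_ARD = 30;     // Apple Remote Desktop

// Pick an authentication method from the server's security types list,
// preferring ARD when it is offered.
function chooseAuth(securityTypes) {
  if (securityTypes.includes(SECURITY_TYPE_ARD)) {
    return { method: "ard", needsUsername: true, macTarget: true };
  }
  if (securityTypes.includes(SECURITY_TYPE_VNC_AUTH)) {
    return { method: "vnc", needsUsername: false, macTarget: false };
  }
  throw new Error("no supported security type offered: " + securityTypes.join(", "));
}

console.log(chooseAuth([2, 30]).method); // ard
console.log(chooseAuth([2]).method);     // vnc
```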

## Credential Configuration

### Environment Variables

Authentication credentials are passed to the daemon through environment variables configured in the `.mcp.json` file.

| Variable | Required | Description |
|----------|----------|-------------|
| `VNC_HOST` | Yes | VNC server hostname or IP address |
| `VNC_PORT` | No | VNC server port (default: `5900`) |
| `VNC_USERNAME` | For ARD | Username for ARD authentication |
| `VNC_PASSWORD` | Yes | Password for VNC/ARD authentication |

Example `.mcp.json` configuration:

```json
{
  "mcpServers": {
    "claude-kvm": {
      "command": "npx",
      "args": ["-y", "claude-kvm"],
      "env": {
        "VNC_HOST": "192.168.1.100",
        "VNC_PORT": "5900",
        "VNC_USERNAME": "admin",
        "VNC_PASSWORD": "secretpassword"
      }
    }
  }
}
```

Source: [server.json](https://github.com/ARAS-Workspace/claude-kvm/blob/main/server.json), [README.md](https://github.com/ARAS-Workspace/claude-kvm/blob/main/README.md)

### Username Requirements

| Server Type | Username Required |
|-------------|-------------------|
| Standard VNC | No |
| Apple Remote Desktop | Yes |

For ARD servers, the `VNC_USERNAME` environment variable is mandatory. The daemon passes both username and password to the authentication handler when auth type 30 is detected.

Source: [README.md](https://github.com/ARAS-Workspace/claude-kvm/blob/main/README.md)
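
A launcher script can check these variables before starting the server. The function below is a hypothetical pre-flight check; `expectArd` is a flag the caller sets when the target is known to be a Mac, and the port falls back to the documented default of 5900.

```javascript
// Validate the connection variables described above.
function checkVncEnv(env, expectArd = false) {
  const errors = [];
  if (!env.VNC_HOST) errors.push("VNC_HOST is required");
  if (!env.VNC_PASSWORD) errors.push("VNC_PASSWORD is required");
  if (expectArd && !env.VNC_USERNAME) {
    errors.push("VNC_USERNAME is required for ARD servers");
  }
  const port = Number(env.VNC_PORT || "5900"); // documented default
  if (!Number.isInteger(port) || port < 1 || port > 65535) {
    errors.push("VNC_PORT must be an integer between 1 and 65535");
  }
  return { ok: errors.length === 0, port, errors };
}

console.log(checkVncEnv({ VNC_HOST: "192.168.1.100", VNC_PASSWORD: "secret" }, true).errors);
// [ 'VNC_USERNAME is required for ARD servers' ]
```

In practice the first argument would be `process.env`.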

## Key Remapping for macOS Targets

When connecting to macOS systems via ARD authentication, a **Meta-to-Super key remapping** occurs automatically. This ensures keyboard compatibility between the MCP client and the macOS target.

| Key requested | Keysym sent to macOS | Result on the target |
|---------------|----------------------|----------------------|
| Command (⌘), expressed as Meta | Super (`Super_L`/`Super_R`) | Shortcut is received as Command |

This remapping is applied specifically when the daemon detects the ARD auth type 30 credential request, indicating a macOS target. The remapping ensures that keyboard shortcuts (e.g., ⌘+C for copy) work correctly from remote clients.

Source: [README.md](https://github.com/ARAS-Workspace/claude-kvm/blob/main/README.md)

## Connection Parameters

### Daemon CLI Parameters

Authentication-related connection parameters can be configured via the `CLAUDE_KVM_DAEMON_PARAMETERS` environment variable:

| Parameter | Description |
|-----------|-------------|
| `--connect-timeout <seconds>` | VNC connection timeout |
| `--no-reconnect` | Disable automatic reconnection |
| `-v, --verbose` | Enable verbose logging (including auth details) |

Example with authentication parameters:

```json
{
  "env": {
    "CLAUDE_KVM_DAEMON_PARAMETERS": "--connect-timeout 30 --no-reconnect"
  }
}
```

Source: [README.md](https://github.com/ARAS-Workspace/claude-kvm/blob/main/README.md), [README_TR.md](https://github.com/ARAS-Workspace/claude-kvm/blob/main/README_TR.md)
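
When the parameter string is generated by a script, a small builder avoids quoting mistakes. This sketch emits only flags mentioned in this manual; the option names on the JavaScript side (`maxDimension`, `connectTimeout`, and so on) are hypothetical.

```javascript
// Build the value of CLAUDE_KVM_DAEMON_PARAMETERS from an options object.
function buildDaemonParameters(opts = {}) {
  const args = [];
  if (opts.maxDimension !== undefined) args.push("--max-dimension", String(opts.maxDimension));
  if (opts.connectTimeout !== undefined) args.push("--connect-timeout", String(opts.connectTimeout));
  if (opts.noReconnect) args.push("--no-reconnect");
  if (opts.verbose) args.push("--verbose");
  return args.join(" ");
}

console.log(buildDaemonParameters({ connectTimeout: 30, noReconnect: true }));
// --connect-timeout 30 --no-reconnect
```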

### Timeout Behavior

The `--connect-timeout` parameter affects how long the daemon waits during the authentication handshake. If authentication is slow (common with ARD's Diffie-Hellman exchange), the timeout should be set appropriately:

| Scenario | Recommended Timeout |
|----------|---------------------|
| Local network | 10-15 seconds |
| WAN/Remote | 30-60 seconds |
| High latency | 60+ seconds |

## Health Check and Connection Status

The `health` action provides connection status including authentication state:

```json
{"method":"health"}
```

Response includes:
- Connection status
- Display information
- Authentication method in use

This allows verifying successful authentication before attempting remote control operations.

Source: [tools/index.js](https://github.com/ARAS-Workspace/claude-kvm/blob/main/tools/index.js)

## Protocol Communication

The MCP proxy communicates with the daemon using the PC (Procedure Call) protocol over stdin/stdout NDJSON. Authentication-related requests flow through this channel.

| Direction | Message Type | Content |
|-----------|--------------|---------|
| Proxy → Daemon | Request | `{"method":"health","params":{},"id":1}` |
| Daemon → Proxy | Response | `{"result":{...},"id":1}` |
| Daemon → Proxy | Notification | `{"method":"vnc_connected","params":{...}}` |

The daemon handles all authentication internally; the proxy only receives success/failure notifications and connection state updates.

Source: [index.js:1-30](https://github.com/ARAS-Workspace/claude-kvm/blob/main/index.js), [README.md](https://github.com/ARAS-Workspace/claude-kvm/blob/main/README.md)
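
The framing described above can be sketched in a few lines. This is an illustrative reading of NDJSON and of the message shapes in the table, not the proxy's actual implementation. `encodeMessage`, `createLineDecoder`, and `classify` are hypothetical names.

```javascript
// Serialize one message as a single NDJSON line.
function encodeMessage(message) {
  return JSON.stringify(message) + "\n";
}

// Returns a function that accepts arbitrary text chunks and returns the
// complete messages seen so far, keeping any partial line buffered.
function createLineDecoder() {
  let buffer = "";
  return function push(chunk) {
    buffer += chunk;
    const lines = buffer.split("\n");
    buffer = lines.pop(); // trailing element is an incomplete line (or "")
    return lines
      .filter((line) => line.trim() !== "")
      .map((line) => JSON.parse(line));
  };
}

// Classify a message by shape, following the table above.
function classify(message) {
  const hasId = "id" in message;
  const hasMethod = "method" in message;
  if (hasMethod && hasId) return "request";
  if (hasId) return "response";
  if (hasMethod) return "notification";
  return "unknown";
}

// A chunk boundary may fall in the middle of a line.
const decode = createLineDecoder();
console.log(decode('{"result":{},"id":1}\n{"method":"vnc_conn').map(classify));
// prints [ 'response' ]
console.log(decode('ected","params":{}}\n').map(classify));
// prints [ 'notification' ]
```

Buffering the trailing partial line matters because stdout data arrives in arbitrary chunks rather than one message at a time.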

## Security Considerations

### Network Security

| Concern | Mitigation |
|---------|------------|
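| Plaintext password over network | Credentials are not sent in the clear: VNC Auth uses a DES challenge-response, and ARD encrypts them with AES-128 after a Diffie-Hellman exchange. Both methods protect the credentials only, not the rest of the session |
| Man-in-the-middle | Tunnel VNC traffic through SSH or a VPN |
| Credential exposure in logs | Avoid `-v` outside of debugging, since verbose logging includes auth details |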

### Credential Storage

| Environment | Recommendation |
|-------------|----------------|
| Development | Use `.env` file with restricted permissions |
| CI/CD | Use secrets management (GitHub Secrets, etc.) |
| Production | Consider VPN + SSH tunnel for additional isolation |

### macOS VNC Security

For bare-metal Mac connections, consider enabling **SSH tunneling** to add an additional layer of security. The [Mac M1 Preparation Tricks](https://gist.github.com/remrearas/a3f300635b02f2587a134882a51f7114) guide covers VNC security, SSH tunneling, and session stability configurations.

Source: [README.md](https://github.com/ARAS-Workspace/claude-kvm/blob/main/README.md)

## Troubleshooting Authentication Issues

### Common Issues

| Issue | Cause | Solution |
|-------|-------|----------|
| Auth fails immediately | Wrong password | Verify `VNC_PASSWORD` |
| Auth hangs on ARD servers | Missing username | Set `VNC_USERNAME` for ARD targets |
| Auth succeeds but screen blank | Resolution mismatch | Check `--max-dimension` parameter |
| Connection drops after ~50 turns | ARD server instability | See [Issue #8](https://github.com/ARAS-Workspace/claude-kvm/issues/8) |

### Debug Authentication

Enable verbose logging to see authentication handshake details:

```bash
VNC_HOST=192.168.1.100 \
VNC_PASSWORD=secret \
CLAUDE_KVM_DAEMON_PARAMETERS="-v" \
npx -y claude-kvm
```

Verbose output will show:
- Security types offered by server
- Authentication method selected
- Key exchange progress (ARD)
- Challenge/response exchange (VNC Auth)

Source: [README.md](https://github.com/ARAS-Workspace/claude-kvm/blob/main/README.md)

## Summary

| Aspect | VNC Auth | Apple Remote Desktop |
|--------|----------|----------------------|
| Encryption | DES challenge-response | Diffie-Hellman + AES-128-ECB |
| Credentials | Password only | Username + Password |
| Target OS | Any VNC server | macOS (ARD server) |
| Auto-detected | Default fallback | Auth type 30 request |
| Key remapping | No | Yes (Meta → Super) |

The daemon's automatic detection and negotiation capabilities allow seamless connections to both standard VNC servers and macOS ARD servers without manual authentication method selection.

Source: [README.md](https://github.com/ARAS-Workspace/claude-kvm/blob/main/README.md), [server.json](https://github.com/ARAS-Workspace/claude-kvm/blob/main/server.json)

---

<a id='environment-variables'></a>

## Environment Variables

### Related Pages

Related topics: [Installation Guide](#installation)

<details>
<summary>Related Source Files</summary>

The following source files were used to generate this page:

- [server.json](https://github.com/ARAS-Workspace/claude-kvm/blob/main/server.json)
- [README.md](https://github.com/ARAS-Workspace/claude-kvm/blob/main/README.md)
- [package.json](https://github.com/ARAS-Workspace/claude-kvm/blob/main/package.json)
- [README_TR.md](https://github.com/ARAS-Workspace/claude-kvm/blob/main/README_TR.md)
- [tools/index.js](https://github.com/ARAS-Workspace/claude-kvm/blob/main/tools/index.js)
- [jsconfig.json](https://github.com/ARAS-Workspace/claude-kvm/blob/main/jsconfig.json)
</details>

# Environment Variables

This page documents all environment variables used by the claude-kvm MCP server for VNC connection configuration and daemon operation.

## Overview

Environment variables in claude-kvm serve as the primary configuration mechanism for establishing and maintaining VNC connections to remote desktops. These variables control connection parameters, authentication credentials, and daemon execution paths. The MCP server reads these variables at startup and uses them to configure the underlying VNC daemon process.

```mermaid
graph TD
    A[Claude Code] -->|stdio JSON-RPC| B[MCP Proxy<br/>Node.js]
    B -->|Reads Env Vars| C[VNC_HOST<br/>VNC_PORT<br/>VNC_USERNAME<br/>VNC_PASSWORD]
    B -->|Spawns Daemon| D[claude-kvm-daemon]
    D -->|VNC Protocol| E[Remote Desktop]
    
    C -->|Credential Injection| D
```

The environment variables are defined in the MCP server manifest (`server.json`) and validated against the Zod schema in the proxy layer. Source: [server.json](https://github.com/ARAS-Workspace/claude-kvm/blob/main/server.json)

## Required Variables

These variables must be set for the MCP server to establish a VNC connection.

| Variable | Type | Required | Secret | Description |
|----------|------|----------|--------|-------------|
| `VNC_HOST` | string | Yes | No | VNC server hostname or IP address |
| `VNC_PORT` | string | Yes | No | VNC server port number |

### VNC_HOST

Specifies the hostname or IP address of the remote VNC server to connect to.

```bash
VNC_HOST="192.168.1.100"
VNC_HOST="vnc.example.com"
VNC_HOST="10.0.0.50"
```

**Constraints:**
- Must be a valid hostname or IPv4/IPv6 address
- Must be reachable from the machine running the MCP server
- Connection validation occurs when the daemon attempts to connect

Source: [server.json](https://github.com/ARAS-Workspace/claude-kvm/blob/main/server.json)

### VNC_PORT

Specifies the TCP port number on which the VNC server is listening.

```bash
VNC_PORT="5900"    # Default VNC port
VNC_PORT="5901"    # Display :1
VNC_PORT="5902"    # Display :2
```

**Common Port Values:**
- `5900` - Default VNC port (display :0)
- `5901` - Display :1
- `5902` - Display :2
- Screen Sharing on macOS typically uses port `5900`

Source: [server.json](https://github.com/ARAS-Workspace/claude-kvm/blob/main/server.json)
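
Because `VNC_PORT` is passed as a string, it is worth checking that it holds a valid TCP port before connecting. The following is a minimal sketch. `parseVncPort` and `portForDisplay` are hypothetical helpers, and the proxy's own validation may differ.

```javascript
// Hypothetical validator for the VNC_PORT string.
function parseVncPort(value) {
  const text = String(value).trim();
  if (!/^\d+$/.test(text)) {
    throw new RangeError(`VNC_PORT must be a decimal integer, got "${value}"`);
  }
  const port = Number(text);
  if (port < 1 || port > 65535) {
    throw new RangeError(`VNC_PORT out of range (1-65535): ${port}`);
  }
  return port;
}

// VNC convention: display :N listens on TCP port 5900 + N.
function portForDisplay(display) {
  return 5900 + display;
}

console.log(parseVncPort("5901"), portForDisplay(2)); // prints 5901 5902
```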

## Optional Variables

These variables enhance functionality or are required for specific VNC server configurations.

| Variable | Type | Required | Secret | Description |
|----------|------|----------|--------|-------------|
| `VNC_USERNAME` | string | No | Yes | Username for VNC/ARD authentication |
| `VNC_PASSWORD` | string | No | Yes | Password for VNC/ARD authentication |
| `CLAUDE_KVM_DAEMON_PATH` | string | No | No | Absolute path to daemon executable |

### VNC_USERNAME

The username for authenticating with the VNC server. This is required when using Apple Remote Desktop (ARD) authentication.

```bash
VNC_USERNAME="admin"
VNC_USERNAME="remoteuser"
```

**Use Cases:**
- Apple Remote Desktop (ARD) authentication
- VNC servers with username/password authentication
- Enterprise VNC configurations

Source: [server.json](https://github.com/ARAS-Workspace/claude-kvm/blob/main/server.json)

### VNC_PASSWORD

The password for authenticating with the VNC server. This value is marked as a secret and should never be committed to version control.

```bash
VNC_PASSWORD="secure_password_here"
```

**Security Considerations:**
- Store in environment files that are excluded from version control
- Consider using secret management tools (e.g., 1Password, HashiCorp Vault)
- Never log or display this value in error messages

Source: [server.json](https://github.com/ARAS-Workspace/claude-kvm/blob/main/server.json)
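
One way to follow the "never log" rule is to mask secret variables before any configuration is printed. The sketch below is an assumption about how a wrapper script might do this, not part of claude-kvm. `redactEnv` and `SECRET_VARIABLES` are hypothetical names.

```javascript
// Variables marked "Secret" in the tables on this page.
const SECRET_VARIABLES = new Set(["VNC_USERNAME", "VNC_PASSWORD"]);

// Returns a copy of `env` that is safe to log: secret values are masked.
function redactEnv(env) {
  return Object.fromEntries(
    Object.entries(env).map(([name, value]) => [
      name,
      SECRET_VARIABLES.has(name) && value ? "***" : value,
    ])
  );
}

console.log(redactEnv({ VNC_HOST: "192.168.1.100", VNC_PASSWORD: "secret" }));
// prints { VNC_HOST: '192.168.1.100', VNC_PASSWORD: '***' }
```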

### CLAUDE_KVM_DAEMON_PATH

Specifies the absolute filesystem path to the `claude-kvm-daemon` executable. When not specified, the MCP proxy attempts to locate the daemon in the system PATH.

```bash
# Homebrew installation (Apple Silicon)
CLAUDE_KVM_DAEMON_PATH="/opt/homebrew/bin/claude-kvm-daemon"

# Homebrew installation (Intel)
CLAUDE_KVM_DAEMON_PATH="/usr/local/bin/claude-kvm-daemon"

# Custom installation
CLAUDE_KVM_DAEMON_PATH="/path/to/claude-kvm-daemon"
```

**Installation Methods:**
- Homebrew tap: `brew install claude-kvm-daemon`
- Manual: Download from releases and install to desired location

Source: [README.md](https://github.com/ARAS-Workspace/claude-kvm/blob/main/README.md)
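
The lookup order described above (explicit path first, then `PATH`) can be sketched as follows. `resolveDaemonCommand` is a hypothetical name and the proxy's real logic may differ. Returning the bare executable name works because process-spawning APIs such as Node's `child_process.spawn` search `PATH` for bare names.

```javascript
// Hypothetical resolver: prefer the explicit path, otherwise fall back to
// the bare executable name so that the OS performs a PATH lookup.
function resolveDaemonCommand(env) {
  const explicit = (env.CLAUDE_KVM_DAEMON_PATH ?? "").trim();
  return explicit !== "" ? explicit : "claude-kvm-daemon";
}

console.log(resolveDaemonCommand({})); // prints claude-kvm-daemon
```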

## Configuration Methods

claude-kvm supports multiple methods for setting environment variables depending on your deployment scenario.

### MCP Configuration File

The recommended approach for Claude Code is a `.mcp.json` configuration file in the project directory:

```json
{
  "mcpServers": {
    "claude-kvm": {
      "command": "npx",
      "args": ["-y", "claude-kvm"],
      "env": {
        "VNC_HOST": "192.168.1.100",
        "VNC_PORT": "5900",
        "VNC_USERNAME": "user",
        "VNC_PASSWORD": "pass",
        "CLAUDE_KVM_DAEMON_PATH": "/opt/homebrew/bin/claude-kvm-daemon"
      }
    }
  }
}
```

Source: [README.md](https://github.com/ARAS-Workspace/claude-kvm/blob/main/README.md)

### Shell Environment

For CLI usage or testing, export variables in the shell:

```bash
export VNC_HOST="192.168.1.100"
export VNC_PORT="5900"
export VNC_USERNAME="admin"
export VNC_PASSWORD="secret"
export CLAUDE_KVM_DAEMON_PATH="/opt/homebrew/bin/claude-kvm-daemon"

npx claude-kvm
```

### Environment File (.env)

Create a `.env` file (ensure it's in `.gitignore`):

```bash
VNC_HOST=192.168.1.100
VNC_PORT=5900
VNC_USERNAME=admin
VNC_PASSWORD=secret
CLAUDE_KVM_DAEMON_PATH=/opt/homebrew/bin/claude-kvm-daemon
```

Then load it in your shell:

```bash
set -a          # export every variable assigned while sourcing
source .env
set +a
npx claude-kvm
```
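
As an alternative to sourcing the file in a shell, the same `KEY=VALUE` format can be read by a launcher script. The parser below is a deliberately minimal sketch under stated assumptions: no variable expansion, no escape sequences, and at most one pair of surrounding quotes. `parseEnvFile` is a hypothetical name.

```javascript
// Minimal KEY=VALUE parser: skips blank lines and # comments, and strips
// one pair of matching surrounding quotes from each value.
function parseEnvFile(text) {
  const result = {};
  for (const rawLine of text.split(/\r?\n/)) {
    const line = rawLine.trim();
    if (line === "" || line.startsWith("#")) continue;
    const eq = line.indexOf("=");
    if (eq <= 0) continue; // not a KEY=VALUE line
    const key = line.slice(0, eq).trim();
    let value = line.slice(eq + 1).trim();
    const quote = value[0];
    if (value.length >= 2 && (quote === '"' || quote === "'") && value.endsWith(quote)) {
      value = value.slice(1, -1);
    }
    result[key] = value;
  }
  return result;
}

console.log(parseEnvFile('VNC_HOST=192.168.1.100\nVNC_PORT="5900"\n'));
// prints { VNC_HOST: '192.168.1.100', VNC_PORT: '5900' }
```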

## Authentication Methods

The environment variables interact with two supported VNC authentication mechanisms.

### VNC Authentication

Standard VNC password-based challenge-response using DES encryption:

```bash
VNC_HOST="vnc.example.com"
VNC_PORT="5900"
VNC_PASSWORD="vnc_password"
```

### Apple Remote Desktop (ARD)

For macOS targets, ARD uses Diffie-Hellman key exchange with AES-128-ECB encryption. The username is required:

```bash
VNC_HOST="mac-mini.local"
VNC_PORT="5900"
VNC_USERNAME="admin"
VNC_PASSWORD="ard_password"
```

**macOS Detection:**
The daemon automatically detects ARD servers by the auth type 30 credential request and applies Meta-to-Super key remapping for Command key compatibility. Source: [README.md](https://github.com/ARAS-Workspace/claude-kvm/blob/main/README.md)

## Runtime Configuration

While environment variables set initial configuration, the running daemon supports runtime parameter adjustments through the `configure` action:

```json
{
  "action": "configure",
  "params": {
    "max_dimension": 1280,
    "click_hold_ms": 80,
    "key_hold_ms": 50
  }
}
```

**Configurable Parameters:**

| Parameter | Default | Description |
|-----------|---------|-------------|
| `max_dimension` | `1280` | Maximum screen dimension, in pixels |
| `click_hold_ms` | `50` | Mouse button hold duration, in milliseconds |
| `key_hold_ms` | `30` | Key press hold duration, in milliseconds |
| `hover_settle_ms` | `400` | Wait time after a hover, in milliseconds |
| `double_click_gap_ms` | `50` | Gap between the two clicks of a double-click, in milliseconds |

Source: [README.md](https://github.com/ARAS-Workspace/claude-kvm/blob/main/README.md)
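
The effect of a `configure` call can be modeled as merging the supplied values over the current settings. The defaults below are copied from the table above. The validation rules (known keys only, positive integers only) and the name `applyConfigure` are assumptions for illustration and may not match what the daemon enforces.

```javascript
// Defaults copied from the parameter table.
const CONFIG_DEFAULTS = Object.freeze({
  max_dimension: 1280,
  click_hold_ms: 50,
  key_hold_ms: 30,
  hover_settle_ms: 400,
  double_click_gap_ms: 50,
});

// Hypothetical merge: returns a new settings object and rejects unknown
// keys or invalid values so that a typo cannot silently change behavior.
function applyConfigure(current, params) {
  const next = { ...current };
  for (const [key, value] of Object.entries(params)) {
    if (!Object.hasOwn(CONFIG_DEFAULTS, key)) {
      throw new Error(`Unknown parameter: ${key}`);
    }
    if (!Number.isInteger(value) || value <= 0) {
      throw new RangeError(`${key} must be a positive integer`);
    }
    next[key] = value;
  }
  return next;
}

const settings = applyConfigure(CONFIG_DEFAULTS, { click_hold_ms: 80, key_hold_ms: 50 });
console.log(settings.click_hold_ms, settings.hover_settle_ms); // prints 80 400
```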

## Security Best Practices

### Credential Management

1. **Never commit credentials** - Add `.env` and `.mcp.json` to `.gitignore`
2. **Use secret managers** - Integrate with 1Password, AWS Secrets Manager, or similar
3. **Rotate passwords** - Update VNC passwords regularly
4. **Limit access** - Ensure only necessary users can read configuration

### Network Security

For remote VNC connections over untrusted networks:

1. **Use SSH tunneling** - Tunnel VNC traffic through SSH
2. **Enable encryption** - Configure VNC server with TLS/SSL
3. **Restrict access** - Use firewall rules to limit VNC server exposure

### Daemon Path Security

- Verify the daemon binary is owned by root and not writable by other users
- Use absolute paths to prevent PATH hijacking
- On macOS, ensure the daemon is code-signed and notarized

## Troubleshooting

### Connection Issues

**Problem:** VNC connection drops during long-running test sessions

This is a known issue (#8) related to VNC/ARD server stability, not the daemon or network layer. Workarounds:

- Pass `--no-reconnect` to disable automatic reconnection when your own retry logic should handle drops
- Implement application-level retry logic
- Consider SSH tunneling for more stable connections

Source: [GitHub Issue #8](https://github.com/ARAS-Workspace/claude-kvm/issues/8)
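
Application-level retry logic, as suggested in the workarounds above, is commonly built on capped exponential backoff. The sketch below is generic and not specific to claude-kvm. `withRetry`, `backoffDelayMs`, and the default timings are illustrative assumptions.

```javascript
// Delay before retry number `attempt` (0-based): base * 2^attempt, capped.
function backoffDelayMs(attempt, baseMs = 500, maxMs = 10_000) {
  return Math.min(maxMs, baseMs * 2 ** attempt);
}

function defaultSleep(ms) {
  return new Promise((resolve) => setTimeout(resolve, ms));
}

// Runs `operation` until it resolves or `maxAttempts` is exhausted, then
// rethrows the last error. `sleep` is injectable to keep tests fast.
async function withRetry(operation, maxAttempts = 5, sleep = defaultSleep) {
  let lastError;
  for (let attempt = 0; attempt < maxAttempts; attempt++) {
    try {
      return await operation(attempt);
    } catch (error) {
      lastError = error;
      if (attempt < maxAttempts - 1) await sleep(backoffDelayMs(attempt));
    }
  }
  throw lastError;
}

console.log([0, 1, 2, 3].map((n) => backoffDelayMs(n)));
// prints [ 500, 1000, 2000, 4000 ]
```

In practice `operation` would wrap whatever step depends on the VNC session, so that a dropped connection triggers a fresh attempt instead of failing the whole run.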

### Variable Validation

The MCP proxy validates environment variables against the schema defined in `server.json`. Invalid values result in startup errors with descriptive messages.

### Daemon Path Issues

If the daemon cannot be found:

1. Verify the path exists: `ls -la /opt/homebrew/bin/claude-kvm-daemon`
2. Check PATH: `which claude-kvm-daemon`
3. Explicitly set `CLAUDE_KVM_DAEMON_PATH`

## Summary

| Variable | Required | Default | Secret |
|----------|----------|---------|--------|
| `VNC_HOST` | Yes | - | No |
| `VNC_PORT` | Yes | - | No |
| `VNC_USERNAME` | No | - | Yes |
| `VNC_PASSWORD` | No | - | Yes |
| `CLAUDE_KVM_DAEMON_PATH` | No | PATH lookup | No |

Only `VNC_HOST` and `VNC_PORT` are always required. Add `VNC_PASSWORD` for servers that use standard VNC authentication, and both `VNC_USERNAME` and `VNC_PASSWORD` for macOS targets that use ARD authentication.

---

## Pitfall Log

Project: aras-workspace/claude-kvm

Summary: Found 8 structured pitfall item(s), including 0 high/blocking item(s). Top priority: Installation risk - Installation risk requires verification.

## 1. Installation risk - Installation risk requires verification

- Severity: medium
- Evidence strength: source_linked
- Finding: Project evidence flags an installation risk. Review the linked source before relying on this workflow.
- User impact: May increase setup, validation, or first-run risk for the user.
- Suggested check: Reproduce the official install and quickstart path in an isolated environment.
- Evidence: community_evidence:github | cevd_c5f8215134dd43ca8306d08dc0e808ac | https://github.com/ARAS-Workspace/claude-kvm/issues/8

## 2. Configuration risk - Configuration risk requires verification

- Severity: medium
- Evidence strength: source_linked
- Finding: Project evidence flags a configuration risk. Review the linked source before relying on this workflow.
- User impact: May increase setup, validation, or first-run risk for the user.
- Suggested check: Reproduce the official install and quickstart path in an isolated environment.
- Evidence: capability.host_targets | mcp_registry:io.github.ARAS-Workspace/claude-kvm:2.0.11 | https://registry.modelcontextprotocol.io/v0.1/servers/io.github.ARAS-Workspace%2Fclaude-kvm/versions/2.0.11

## 3. Capability evidence risk - Capability evidence risk requires verification

- Severity: medium
- Evidence strength: source_linked
- Finding: README/documentation is current enough for a first validation pass.
- User impact: May increase setup, validation, or first-run risk for the user.
- Suggested check: Reproduce the official install and quickstart path in an isolated environment.
- Evidence: capability.assumptions | mcp_registry:io.github.ARAS-Workspace/claude-kvm:2.0.11 | https://registry.modelcontextprotocol.io/v0.1/servers/io.github.ARAS-Workspace%2Fclaude-kvm/versions/2.0.11

## 4. Maintenance risk - Maintenance risk requires verification

- Severity: medium
- Evidence strength: source_linked
- Finding: Project evidence flags a maintenance risk. Review the linked source before relying on this workflow.
- User impact: May increase setup, validation, or first-run risk for the user.
- Suggested check: Reproduce the official install and quickstart path in an isolated environment.
- Evidence: evidence.maintainer_signals | mcp_registry:io.github.ARAS-Workspace/claude-kvm:2.0.11 | https://registry.modelcontextprotocol.io/v0.1/servers/io.github.ARAS-Workspace%2Fclaude-kvm/versions/2.0.11

## 5. Security or permission risk - Security or permission risk requires verification

- Severity: medium
- Evidence strength: source_linked
- Finding: no_demo
- User impact: May increase setup, validation, or first-run risk for the user.
- Suggested check: Reproduce the official install and quickstart path in an isolated environment.
- Evidence: downstream_validation.risk_items | mcp_registry:io.github.ARAS-Workspace/claude-kvm:2.0.11 | https://registry.modelcontextprotocol.io/v0.1/servers/io.github.ARAS-Workspace%2Fclaude-kvm/versions/2.0.11

## 6. Security or permission risk - Security or permission risk requires verification

- Severity: medium
- Evidence strength: source_linked
- Finding: no_demo
- User impact: May increase setup, validation, or first-run risk for the user.
- Suggested check: Reproduce the official install and quickstart path in an isolated environment.
- Evidence: risks.scoring_risks | mcp_registry:io.github.ARAS-Workspace/claude-kvm:2.0.11 | https://registry.modelcontextprotocol.io/v0.1/servers/io.github.ARAS-Workspace%2Fclaude-kvm/versions/2.0.11

## 7. Maintenance risk - Maintenance risk requires verification

- Severity: low
- Evidence strength: source_linked
- Finding: issue_or_pr_quality=unknown.
- User impact: May increase setup, validation, or first-run risk for the user.
- Suggested check: Reproduce the official install and quickstart path in an isolated environment.
- Evidence: evidence.maintainer_signals | mcp_registry:io.github.ARAS-Workspace/claude-kvm:2.0.11 | https://registry.modelcontextprotocol.io/v0.1/servers/io.github.ARAS-Workspace%2Fclaude-kvm/versions/2.0.11

## 8. Maintenance risk - Maintenance risk requires verification

- Severity: low
- Evidence strength: source_linked
- Finding: release_recency=unknown.
- User impact: May increase setup, validation, or first-run risk for the user.
- Suggested check: Reproduce the official install and quickstart path in an isolated environment.
- Evidence: evidence.maintainer_signals | mcp_registry:io.github.ARAS-Workspace/claude-kvm:2.0.11 | https://registry.modelcontextprotocol.io/v0.1/servers/io.github.ARAS-Workspace%2Fclaude-kvm/versions/2.0.11

