# gemini-image-mcp Project Manual

Generated at: 2026-05-31 02:47:26 UTC

## Table of Contents

- [Home](#home)
- [Installation Guide](#installation)
- [MCP Client Configuration](#mcp-client-setup)
- [generate_image Tool Reference](#generate-image-tool)
- [process_image Tool Reference](#process-image-tool)
- [Configuration Guide](#configuration-guide)
- [Server Architecture](#server-architecture)
- [Image Generation Internals](#image-generation-internals)
- [Image Processing Internals](#image-processing-internals)
- [Cost Tracking and Rate Limiting](#cost-tracking)

<a id='home'></a>

## Home

### Related Pages

Related topics: [Installation Guide](#installation), [MCP Client Configuration](#mcp-client-setup)

<details>
<summary>Related Source Files</summary>

The following source files were used to generate this page:

- [README.md](https://github.com/JimothySnicket/gemini-image-mcp/blob/main/README.md)
- [package.json](https://github.com/JimothySnicket/gemini-image-mcp/blob/main/package.json)
- [CHANGELOG.md](https://github.com/JimothySnicket/gemini-image-mcp/blob/main/CHANGELOG.md)
- [CONTRIBUTING.md](https://github.com/JimothySnicket/gemini-image-mcp/blob/main/CONTRIBUTING.md)
- [server.json](https://github.com/JimothySnicket/gemini-image-mcp/blob/main/server.json)
- [skills/image-generation/SKILL.md](https://github.com/JimothySnicket/gemini-image-mcp/blob/main/skills/image-generation/SKILL.md)
- [src/index.ts](https://github.com/JimothySnicket/gemini-image-mcp/blob/main/src/index.ts)
- [SECURITY.md](https://github.com/JimothySnicket/gemini-image-mcp/blob/main/SECURITY.md)
</details>

## Overview

`gemini-image-mcp` is a Model Context Protocol (MCP) server that provides Google Gemini-powered image generation, editing, and local image processing. It lets MCP-compatible AI assistants (such as Claude Code and Claude Desktop) generate, edit, and post-process images from within a conversation. Source: [package.json](https://github.com/JimothySnicket/gemini-image-mcp/blob/main/package.json)

The project exposes two primary tools: `generate_image` for AI-powered image creation and editing via the Gemini API, and `process_image` for local image manipulation using the `sharp` library—free and fast with no API calls required. Source: [README.md](https://github.com/JimothySnicket/gemini-image-mcp/blob/main/README.md)

## Project Metadata

| Property | Value |
|----------|-------|
| **Package Name** | `@jimothy-snicket/gemini-image-mcp` |
| **Version** | 0.4.0 |
| **MCP Server Name** | `io.github.JimothySnicket/gemini-image` |
| **License** | MIT |
| **Author** | Jamie Donaldson |
| **Runtime** | Node.js >= 18.0.0 |
| **Package Manager** | Bun (primary), npm compatible |
| **Repository** | https://github.com/JimothySnicket/gemini-image-mcp |

Source: [package.json](https://github.com/JimothySnicket/gemini-image-mcp/blob/main/package.json), [server.json](https://github.com/JimothySnicket/gemini-image-mcp/blob/main/server.json)

## Architecture

```mermaid
graph TD
    A[MCP Client<br/>Claude Code / Claude Desktop] --> B[gemini-image-mcp Server]
    B --> C[generate_image Tool]
    B --> D[process_image Tool]
    C --> E[Google Gemini API]
    D --> F[sharp Library<br/>Local Processing]
    
    E --> G[Image Models]
    G --> G1[gemini-2.5-flash-image]
    G --> G2[gemini-3-pro-image-preview]
    G --> G3[gemini-3.1-flash-image-preview]
    
    F --> H[Crop / Resize]
    F --> I[Background Removal]
    F --> J[Trim / Format]
    
    C --> K[Output Directory]
    D --> K
    K --> L[generations.jsonl<br/>Manifest Log]
```

### System Components

| Component | Technology | Purpose |
|-----------|------------|---------|
| MCP Server | `@modelcontextprotocol/sdk` v1.22.0 | Protocol implementation for AI tool integration |
| Gemini SDK | `@google/genai` v1.44.0 | Google AI API client |
| Image Processing | `sharp` v0.34.5 | Local image manipulation |
| Schema Validation | `zod` v3.24.0 | Parameter validation for both tools |
| Language | TypeScript 6.0.3 | Type-safe source code |

Source: [package.json](https://github.com/JimothySnicket/gemini-image-mcp/blob/main/package.json)

## Supported Gemini Models

| Model | Speed | Cost | Resolution Support | Max Reference Images | Special Features |
|-------|-------|------|-------------------|---------------------|-------------------|
| `gemini-2.5-flash-image` | Fast (~6s) | ~$0.04/image | 1K only | 1 | Default model, scheduled for deprecation in Oct 2026 |
| `gemini-3-pro-image-preview` | Slow (~16s) | ~$0.15/image | 1K, 2K, 4K | 14 | Best quality, text rendering |
| `gemini-3.1-flash-image-preview` | Balanced | Variable | 512, 1K, 2K, 4K | 1 | Google Search grounding |

Source: [README.md](https://github.com/JimothySnicket/gemini-image-mcp/blob/main/README.md)
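
The resolution column above can be expressed as a small lookup for checking a `resolution` value before a request is sent. This is a minimal sketch: `MODEL_RESOLUTIONS` and `isResolutionSupported` are hypothetical names, not identifiers from the server source, and the data is copied from the table.

```typescript
// Resolutions per model, copied from the table above.
// Hypothetical helper: not taken from the gemini-image-mcp source.
const MODEL_RESOLUTIONS = new Map<string, readonly string[]>([
  ["gemini-2.5-flash-image", ["1K"]],
  ["gemini-3-pro-image-preview", ["1K", "2K", "4K"]],
  ["gemini-3.1-flash-image-preview", ["512", "1K", "2K", "4K"]],
]);

// Returns true only when the model is known and lists the requested resolution.
function isResolutionSupported(model: string, resolution: string): boolean {
  return MODEL_RESOLUTIONS.get(model)?.includes(resolution) ?? false;
}
```

For example, `isResolutionSupported("gemini-2.5-flash-image", "2K")` returns `false`, which a caller could use to fall back to `1K` or to a different model.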

## Available Tools

### Tool: `generate_image`

AI-powered image generation and editing via Google Gemini API.

**Parameters:**

| Parameter | Required | Type | Description |
|-----------|----------|------|-------------|
| `prompt` | Yes | string | Text description or editing instruction |
| `images` | No | string[] | Array of file paths to input/reference images |
| `model` | No | string | Gemini model ID (auto-detected if omitted) |
| `aspectRatio` | No | string | Image ratio: `1:1`, `16:9`, `9:16`, `3:2`, `2:3`, `4:3`, `3:4`, `21:9` |
| `resolution` | No | string | `1K`, `2K`, `4K` |
| `outputDir` | No | string | Override output directory |
| `filename` | No | string | Base name for saved file (auto-versioned if duplicate) |
| `subfolder` | No | string | Subfolder within output directory |
| `sessionId` | No | string | Continue multi-turn editing session |
| `seed` | No | integer | Reproducible generation seed |
| `useSearchGrounding` | No | boolean | Enable Google Search grounding (gemini-3.1-flash only) |

Source: [README.md](https://github.com/JimothySnicket/gemini-image-mcp/blob/main/README.md), [src/index.ts](https://github.com/JimothySnicket/gemini-image-mcp/blob/main/src/index.ts)
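
MCP clients invoke tools with a JSON-RPC `tools/call` request. The sketch below shows roughly what such a request could look like for `generate_image`, using the parameters from the table. `GenerateImageArgs` and `buildGenerateImageCall` are hypothetical names introduced for illustration, and the prompt and seed values are made up.

```typescript
// Arguments accepted by generate_image, per the parameter table above.
interface GenerateImageArgs {
  prompt: string;
  images?: string[];
  model?: string;
  aspectRatio?: string;
  resolution?: string;
  outputDir?: string;
  filename?: string;
  subfolder?: string;
  sessionId?: string;
  seed?: number;
  useSearchGrounding?: boolean;
}

// Builds a JSON-RPC 2.0 "tools/call" request, the MCP method for invoking a tool.
// buildGenerateImageCall is a hypothetical helper, not part of the server.
function buildGenerateImageCall(id: number, args: GenerateImageArgs) {
  return {
    jsonrpc: "2.0" as const,
    id,
    method: "tools/call" as const,
    params: { name: "generate_image", arguments: args },
  };
}

// Illustrative values only.
const exampleCall = buildGenerateImageCall(1, {
  prompt: "A lighthouse at dusk, watercolor style",
  aspectRatio: "16:9",
  seed: 42,
});
```

In practice the MCP client library builds this envelope for you; the sketch only makes the shape of the arguments explicit.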

### Tool: `process_image`

Local image processing via sharp. Free, fast, no API calls.

**Parameters:**

| Parameter | Required | Type | Description |
|-----------|----------|------|-------------|
| `imagePath` | Yes | string | Path to image file to process |
| `crop` | No | object | Pixel dimensions, aspect ratio, or focal point strategy |
| `resize` | No | object | Resize to width/height (maintains aspect ratio) |
| `removeBackground` | No | object | Threshold (white) or chroma key (any solid color) |
| `trim` | No | boolean | Auto-remove whitespace/transparent borders |
| `format` | No | string | Convert to: `png`, `jpeg`, `webp` |
| `quality` | No | number | Output quality for JPEG/WebP (1-100) |
| `outputDir` | No | string | Override output directory |
| `filename` | No | string | Base name for saved file |
| `subfolder` | No | string | Subfolder within output directory |

**Crop Options:**

```jsonc
// Pixel-exact
{"width": 500, "height": 300, "left": 100, "top": 50}

// Aspect ratio (center crop)
{"aspectRatio": "16:9"}

// Focal point strategies
{"aspectRatio": "16:9", "strategy": "attention"}  // Visually interesting region
{"aspectRatio": "16:9", "strategy": "entropy"}    // Most detailed region
```

**Background Removal Options:**

```jsonc
// Threshold-based (white backgrounds)
{"threshold": 230}

// Chroma key (green screen / any solid color)
{"color": "#00FF00"}
```

Source: [README.md](https://github.com/JimothySnicket/gemini-image-mcp/blob/main/README.md), [skills/image-generation/SKILL.md](https://github.com/JimothySnicket/gemini-image-mcp/blob/main/skills/image-generation/SKILL.md)
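
To make the aspect-ratio center crop concrete, the following sketch computes the largest centered rectangle with a given ratio. It is an illustration under stated assumptions: `centerCropRect` is a hypothetical helper, and the server's own rounding may differ.

```typescript
interface CropRect {
  left: number;
  top: number;
  width: number;
  height: number;
}

// Computes the largest centered rectangle with the requested aspect ratio.
// Hypothetical helper: the server's own rounding and strategy may differ.
function centerCropRect(imageWidth: number, imageHeight: number, aspectRatio: string): CropRect {
  const parts = aspectRatio.split(":").map(Number);
  const ratioW = parts[0] ?? NaN;
  const ratioH = parts[1] ?? NaN;
  if (!(ratioW > 0) || !(ratioH > 0)) {
    throw new Error(`Invalid aspect ratio: ${aspectRatio}`);
  }
  // Start from full width; if that is too tall, use full height instead.
  let width = imageWidth;
  let height = Math.round((imageWidth * ratioH) / ratioW);
  if (height > imageHeight) {
    height = imageHeight;
    width = Math.round((imageHeight * ratioW) / ratioH);
  }
  return {
    left: Math.floor((imageWidth - width) / 2),
    top: Math.floor((imageHeight - height) / 2),
    width,
    height,
  };
}

// centerCropRect(1920, 1200, "16:9") yields { left: 0, top: 60, width: 1920, height: 1080 }
```

The resulting rectangle has the same shape as the pixel-exact crop object shown above, so it could be passed as `crop` when a fixed region is needed instead of a focal-point strategy.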

## Feature Summary

### Generate Image Features

- **Text-to-image** — Describe desired output, receive generated image
- **Image editing** — Provide reference images with editing instructions
- **Multi-turn sessions** — Iteratively refine images using conversation history
- **Multi-image input** — Up to 14 reference images on gemini-3-pro
- **Cost reporting** — Token counts, estimated USD cost, and session totals in every response
- **Rate limiting** — Configurable per-hour caps on requests and cost
- **Auto model discovery** — Detects available image models from API key at startup
- **Seed support** — Reproducible generation with integer seeds
- **Google Search grounding** — Real-world accuracy on gemini-3.1-flash

Source: [README.md](https://github.com/JimothySnicket/gemini-image-mcp/blob/main/README.md)

### Process Image Features

- **Crop** — Pixel-exact, aspect ratio (center), or focal point (attention/entropy)
- **Resize** — To width, height, or both (maintains aspect ratio)
- **Background removal** — Threshold-based (white backgrounds) or chroma key (any solid color)
- **Chroma key pipeline** — HSV keying with smoothstep feather, spill suppression, 5-pass 3x3 edge anti-aliasing
- **Trim** — Auto-remove whitespace borders
- **Format conversion** — PNG, JPEG, WebP with quality control

Source: [CHANGELOG.md](https://github.com/JimothySnicket/gemini-image-mcp/blob/main/CHANGELOG.md)
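
The threshold-based removal and the smoothstep feather can be illustrated on a raw RGBA buffer. This is a simplified sketch, not the server's pipeline: it omits HSV keying, spill suppression, and the anti-aliasing passes, and the `feather` parameter and both function names are assumptions made for the example.

```typescript
// Standard smoothstep: 0 below edge0, 1 above edge1, smooth cubic in between.
function smoothstep(edge0: number, edge1: number, x: number): number {
  if (edge0 >= edge1) return x >= edge1 ? 1 : 0; // degenerate band: hard step
  const t = Math.min(1, Math.max(0, (x - edge0) / (edge1 - edge0)));
  return t * t * (3 - 2 * t);
}

// Makes near-white pixels transparent in an RGBA buffer (4 bytes per pixel).
// A pixel's "whiteness" is its darkest channel; alpha fades from opaque to
// transparent over the band [threshold - feather, threshold].
// The feather width is an assumption made for this sketch.
function removeWhiteBackground(
  rgba: Uint8ClampedArray,
  threshold: number,
  feather = 10,
): Uint8ClampedArray {
  const out = new Uint8ClampedArray(rgba);
  for (let i = 0; i < out.length; i += 4) {
    const whiteness = Math.min(out[i], out[i + 1], out[i + 2]);
    const transparency = smoothstep(threshold - feather, threshold, whiteness);
    out[i + 3] = Math.round(out[i + 3] * (1 - transparency));
  }
  return out;
}
```

The feathered band avoids the hard, jagged edge that a plain `whiteness >= threshold` test would produce around the subject.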

### Shared Features

- **Output organization** — Meaningful filenames with auto-versioning, subfolders
- **Generation manifest** — `generations.jsonl` logs every generation with prompt, params, cost
- **Full aspect ratio support** — 1:1, 16:9, 9:16, 3:2, 2:3, 4:3, 3:4, 21:9
- **Resolution control** — 1K, 2K, 4K

Source: [README.md](https://github.com/JimothySnicket/gemini-image-mcp/blob/main/README.md)
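
JSONL stores one JSON object per line, which is what makes an append-only manifest such as `generations.jsonl` cheap to write and easy to scan. The sketch below shows the idea; the `ManifestEntry` field names are assumptions based on the description above (prompt, params, cost) and are not the server's actual schema.

```typescript
// Hypothetical manifest entry: field names are assumptions, not the real schema.
interface ManifestEntry {
  timestamp: string;
  prompt: string;
  params: Record<string, unknown>;
  estimatedCostUsd: number;
}

// Serializes one entry as a single line. JSON.stringify escapes newlines in
// strings, so each entry always occupies exactly one line.
function toManifestLine(entry: ManifestEntry): string {
  return JSON.stringify(entry) + "\n";
}

// Parses JSONL text back into entries, skipping blank lines.
function parseManifest(text: string): ManifestEntry[] {
  return text
    .split("\n")
    .filter((line) => line.trim() !== "")
    .map((line) => JSON.parse(line) as ManifestEntry);
}
```

A reader of the manifest could, for instance, sum `estimatedCostUsd` over the parsed entries to audit spend for a project folder.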

## Workflow Diagram

```mermaid
graph LR
    subgraph "Text-to-Image"
        A1[User Prompt] --> B1[generate_image]
        B1 --> C1[Gemini API]
        C1 --> D1[Save PNG/JPEG]
    end
    
    subgraph "Image Editing"
        A2[User Prompt + Reference Image] --> B2[generate_image]
        B2 --> C2[Gemini API]
        C2 --> D2[Save + sessionId]
    end
    
    subgraph "Local Processing"
        A3[Input Image] --> B3[process_image]
        B3 --> C3[sharp Pipeline]
        C3 --> D3[Processed Output]
    end
    
    subgraph "Multi-Turn Refinement"
        D2 --> E1[Pass sessionId]
        E1 --> B2
        B2 --> D4[Refined Image]
    end
```

## Setup and Configuration

### Prerequisites

1. **Gemini API Key** — Obtain from [Google AI Studio](https://aistudio.google.com/apikey)
2. **Node.js >= 18.0.0** or **Bun** runtime
3. **MCP-compatible client** (Claude Code, Claude Desktop, or other MCP clients)

### Environment Setup

**Windows (PowerShell):**
```powershell
[System.Environment]::SetEnvironmentVariable('GEMINI_API_KEY', 'your-key-here', 'User')
```

**macOS / Linux:**
```bash
echo 'export GEMINI_API_KEY="your-key-here"' >> ~/.bashrc
source ~/.bashrc
```

Source: [README.md](https://github.com/JimothySnicket/gemini-image-mcp/blob/main/README.md)

### Configuration File

Create a config file using the `--init` flag:
```bash
npx @jimothy-snicket/gemini-image-mcp --init
```

This creates `~/.gemini-image-mcp.json` with all defaults and inline documentation.

**Configuration Priority:**
```
Environment Variables > Local Config (.gemini-image-mcp.json in CWD) > Global Config (~/.gemini-image-mcp.json) > Defaults
```

**Example Config Structure:**
```json
{
  "defaultModel": "gemini-3.1-flash-image-preview",
  "defaults": {
    "generate": {
      "aspectRatio": "16:9",
      "resolution": "2K"
    },
    "process": {
      "removeBackground": { "color": "#00FF00" },
      "trim": true
    }
  }
}
```

Source: [README.md](https://github.com/JimothySnicket/gemini-image-mcp/blob/main/README.md)

### Rate Limiting

Configure rate limits to prevent runaway agent costs:
- `MAX_REQUESTS_PER_HOUR` — Maximum API requests per hour (e.g., 20)
- `MAX_COST_PER_HOUR` — Maximum cost in USD per hour (e.g., 5)

Source: [README.md](https://github.com/JimothySnicket/gemini-image-mcp/blob/main/README.md), [skills/image-generation/SKILL.md](https://github.com/JimothySnicket/gemini-image-mcp/blob/main/skills/image-generation/SKILL.md)

## Development

### Build Commands

```bash
bun install        # Install dependencies
bun run build      # TypeScript -> dist/
bun run dev        # Run directly with Bun
npm run start      # Run production build with Node
```

Source: [CONTRIBUTING.md](https://github.com/JimothySnicket/gemini-image-mcp/blob/main/CONTRIBUTING.md)

### Project Structure

| Path | Purpose |
|------|---------|
| `src/index.ts` | Main MCP server implementation with tool definitions |
| `dist/` | Compiled JavaScript output |
| `skills/image-generation/SKILL.md` | Claude Code plugin skill documentation |
| `plugin.json` | Claude Code plugin manifest |
| `server.json` | MCP server registry configuration |

Source: [package.json](https://github.com/JimothySnicket/gemini-image-mcp/blob/main/package.json)

## Version History

| Version | Release Date | Key Changes |
|---------|--------------|-------------|
| 0.4.0 | 2026-05 | Config module, JSONC parsing, security hardening, prototype pollution guards |
| 0.2.0 | 2026-04-01 | `process_image` tool, chroma key pipeline, session tracking, rate limiting |
| 0.1.0 | 2026-01 | Initial release with basic generation |

Source: [CHANGELOG.md](https://github.com/JimothySnicket/gemini-image-mcp/blob/main/CHANGELOG.md)

## Security

For security vulnerabilities, report through [GitHub Security Advisories](https://github.com/JimothySnicket/gemini-image-mcp/security/advisories/new) rather than opening a public issue. Source: [SECURITY.md](https://github.com/JimothySnicket/gemini-image-mcp/blob/main/SECURITY.md)

**Security Features in v0.4.0:**

- API keys rejected from config files with warning
- String-aware JSONC comment stripping (won't mangle URLs in quoted strings)
- Prototype pollution guard on config deep merge (`__proto__`, `constructor`, `prototype`)
- Unknown config keys warned and dropped

Source: [CHANGELOG.md](https://github.com/JimothySnicket/gemini-image-mcp/blob/main/CHANGELOG.md)
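
A prototype pollution guard on a deep merge generally means refusing to copy the keys `__proto__`, `constructor`, and `prototype` from untrusted input. The following is a minimal sketch of that technique, not the project's implementation; `safeDeepMerge` and the other names are hypothetical.

```typescript
type PlainObject = Record<string, unknown>;

// Keys that can be abused to reach Object.prototype during a merge.
const BLOCKED_KEYS = new Set(["__proto__", "constructor", "prototype"]);

function isPlainObject(value: unknown): value is PlainObject {
  return typeof value === "object" && value !== null && !Array.isArray(value);
}

// Recursively merges `override` into a copy of `base`, skipping blocked keys.
// Hypothetical helper: a sketch of the guard, not the server's code.
function safeDeepMerge(base: PlainObject, override: PlainObject): PlainObject {
  const result: PlainObject = { ...base };
  for (const key of Object.keys(override)) {
    if (BLOCKED_KEYS.has(key)) continue;
    const incoming = override[key];
    const existing = result[key];
    result[key] =
      isPlainObject(existing) && isPlainObject(incoming)
        ? safeDeepMerge(existing, incoming)
        : incoming;
  }
  return result;
}
```

The guard matters because `JSON.parse` creates `__proto__` as an ordinary own property, and a naive recursive merge would then write its contents onto the shared prototype.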

## Contributing

Before contributing, open an issue to discuss the bug or feature. Development follows these guidelines:

- One thing per PR
- Ensure `bun run build` succeeds with no errors
- Test changes manually against the actual Gemini API
- Keep scope tight—open separate issues for unrelated fixes

Source: [CONTRIBUTING.md](https://github.com/JimothySnicket/gemini-image-mcp/blob/main/CONTRIBUTING.md)

---

<a id='installation'></a>

## Installation Guide

### Related Pages

Related topics: [Home](#home), [MCP Client Configuration](#mcp-client-setup), [Configuration Guide](#configuration-guide)

<details>
<summary>Related Source Files</summary>

The following source files were used to generate this page:

- [README.md](https://github.com/JimothySnicket/gemini-image-mcp/blob/main/README.md)
- [package.json](https://github.com/JimothySnicket/gemini-image-mcp/blob/main/package.json)
- [CONTRIBUTING.md](https://github.com/JimothySnicket/gemini-image-mcp/blob/main/CONTRIBUTING.md)
- [server.json](https://github.com/JimothySnicket/gemini-image-mcp/blob/main/server.json)
</details>

This guide covers all methods to install and configure the gemini-image-mcp server, a Model Context Protocol (MCP) server that provides Google Gemini image generation, editing, and local image processing capabilities.

## Overview

The gemini-image-mcp server provides two primary tools:

| Tool | Description |
|------|-------------|
| `generate_image` | AI-powered image generation and editing via Gemini API |
| `process_image` | Local image processing (crop, resize, background removal) via `sharp` |

**Package Details:**

| Property | Value |
|----------|-------|
| Package Name | `@jimothy-snicket/gemini-image-mcp` |
| Version | 0.4.0 |
| Engine | Node.js >= 18.0.0 |
| License | MIT |
| Transport | stdio |

Source: [package.json](https://github.com/JimothySnicket/gemini-image-mcp/blob/main/package.json)

## Prerequisites

Before installation, ensure the following requirements are met:

### System Requirements

- **Node.js:** Version 18.0.0 or higher
- **Package Manager:** npm (comes with Node.js)
- **MCP Client:** A compatible MCP client such as Claude Code, Claude Desktop, or any MCP-compatible tool

### Required Accounts

- **Google Gemini API Key:** Obtain from [Google AI Studio](https://aistudio.google.com/apikey)

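> **Note:** Creating an API key in Google AI Studio is free. Whether image models are covered by a free tier depends on your account, so check current Gemini API pricing before running `generate_image` in a loop.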

Source: [README.md](https://github.com/JimothySnicket/gemini-image-mcp/blob/main/README.md)

## Installation Methods

### Method 1: Global npm Installation

Install the package globally for system-wide access:

```bash
npm install -g @jimothy-snicket/gemini-image-mcp
```

After installation, the server can be invoked via the `gemini-image-mcp` command:

```bash
gemini-image-mcp
```

Source: [package.json:14](https://github.com/JimothySnicket/gemini-image-mcp/blob/main/package.json), [README.md](https://github.com/JimothySnicket/gemini-image-mcp/blob/main/README.md)

### Method 2: NPX (No Installation Required)

Run directly without installation using npx:

```bash
npx -y @jimothy-snicket/gemini-image-mcp
```

This method automatically downloads and executes the package, making it ideal for quick testing or temporary use.

Source: [README.md](https://github.com/JimothySnicket/gemini-image-mcp/blob/main/README.md)

### Method 3: Claude Code Plugin

Add the MCP server to Claude Code with a single command:

```bash
claude mcp add gemini-image -- npx -y @jimothy-snicket/gemini-image-mcp
```

Claude Code automatically picks up the `GEMINI_API_KEY` environment variable from your shell.

Source: [README.md](https://github.com/JimothySnicket/gemini-image-mcp/blob/main/README.md)

### Method 4: Manual MCP Configuration

Create a `.mcp.json` configuration file in your project root or `~/.claude/.mcp.json` for global access:

```json
{
  "mcpServers": {
    "gemini-image": {
      "command": "npx",
      "args": ["-y", "@jimothy-snicket/gemini-image-mcp"],
      "env": {
        "GEMINI_API_KEY": "${GEMINI_API_KEY}"
      }
    }
  }
}
```

> **Security Note:** The `${GEMINI_API_KEY}` syntax reads the value from your shell environment, ensuring your actual API key is never written into configuration files.

Source: [README.md](https://github.com/JimothySnicket/gemini-image-mcp/blob/main/README.md)

### Method 5: Claude Desktop

For Claude Desktop users, edit the configuration file:

| OS | File Path |
|----|-----------|
| macOS | `~/Library/Application Support/Claude/claude_desktop_config.json` |
| Windows | `%APPDATA%\Claude\claude_desktop_config.json` |

```json
{
  "mcpServers": {
    "gemini-image": {
      "command": "npx",
      "args": ["-y", "@jimothy-snicket/gemini-image-mcp"],
      "env": {
        "GEMINI_API_KEY": "${GEMINI_API_KEY}"
      }
    }
  }
}
```

Source: [README.md](https://github.com/JimothySnicket/gemini-image-mcp/blob/main/README.md)

## Environment Setup

### Setting the GEMINI_API_KEY

The server requires a `GEMINI_API_KEY` environment variable to authenticate with the Google Gemini API.

#### Windows (PowerShell)

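Open PowerShell and run the following. The `'User'` scope sets the variable for the current user only and does not require administrator rights: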

```powershell
[System.Environment]::SetEnvironmentVariable('GEMINI_API_KEY', 'your-key-here', 'User')
```

After setting the environment variable, restart your terminal to ensure the variable is loaded.

#### macOS / Linux

Add the export statement to your shell configuration file:

```bash
echo 'export GEMINI_API_KEY="your-key-here"' >> ~/.bashrc
source ~/.bashrc
```

For zsh users, use:

```bash
echo 'export GEMINI_API_KEY="your-key-here"' >> ~/.zshrc
source ~/.zshrc
```

#### Verification

Confirm the API key is set correctly:

```bash
echo $GEMINI_API_KEY
```

This should display your API key. In PowerShell, use `echo $env:GEMINI_API_KEY` instead. If the output is empty, restart your terminal or run `source ~/.bashrc` (or the equivalent for your shell).

Source: [README.md](https://github.com/JimothySnicket/gemini-image-mcp/blob/main/README.md)

## Configuration File Setup

The server supports configuration files for persistent settings. Two methods are available:

### Initialize Default Config File

Create a global configuration file at `~/.gemini-image-mcp.json`:

```bash
npx @jimothy-snicket/gemini-image-mcp --init
```

### Initialize Local Config File

Create a project-local configuration file at `.gemini-image-mcp.json` in the current working directory:

```bash
npx @jimothy-snicket/gemini-image-mcp --init --local
```

Source: [README.md](https://github.com/JimothySnicket/gemini-image-mcp/blob/main/README.md)

### Configuration Priority

Settings are resolved in the following order of precedence:

```
Environment Variables > Local Config (.gemini-image-mcp.json) > Global Config (~/.gemini-image-mcp.json) > Default Values
```

Per-request parameters always override all configuration defaults.

Source: [README.md](https://github.com/JimothySnicket/gemini-image-mcp/blob/main/README.md)
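
The precedence rules above amount to applying the sources from lowest to highest priority and letting later values win. The sketch below illustrates that order; `Settings`, `resolveSettings`, and the sample values are hypothetical, and only a few settings are modeled.

```typescript
// A small subset of settings, for illustration only.
type Settings = Partial<Record<"defaultModel" | "outputDir" | "aspectRatio" | "resolution", string>>;

// Applies sources in order; later sources win, and undefined never overwrites.
// Hypothetical helper: not taken from the server source.
function resolveSettings(...sourcesLowToHigh: Settings[]): Settings {
  const resolved: Settings = {};
  for (const source of sourcesLowToHigh) {
    for (const key of Object.keys(source) as (keyof Settings)[]) {
      const value = source[key];
      if (value !== undefined) resolved[key] = value;
    }
  }
  return resolved;
}

// Sample values, lowest priority first.
const builtInDefaults: Settings = { defaultModel: "gemini-2.5-flash-image", aspectRatio: "1:1", resolution: "1K" };
const globalConfig: Settings = { defaultModel: "gemini-3-pro-image-preview" };
const localConfig: Settings = { resolution: "2K" };
const envVars: Settings = { defaultModel: "gemini-3.1-flash-image-preview" };
const requestParams: Settings = { aspectRatio: "16:9" };

const effectiveSettings = resolveSettings(builtInDefaults, globalConfig, localConfig, envVars, requestParams);
// effectiveSettings: defaultModel from the environment, resolution from the
// local config, aspectRatio from the request.
```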

### Configuration Schema

The configuration file supports the following structure:

```json
{
  "defaultModel": "gemini-3.1-flash-image-preview",
  "defaults": {
    "generate": {
      "aspectRatio": "16:9",
      "resolution": "2K"
    },
    "process": {
      "removeBackground": { "color": "#00FF00" },
      "trim": true
    }
  }
}
```

#### Configuration Parameters

| Parameter | Type | Description | Default |
|-----------|------|-------------|---------|
| `defaultModel` | string | Default Gemini model for image generation | `gemini-2.5-flash-image` |
| `defaults.generate.aspectRatio` | string | Default aspect ratio | `1:1` |
| `defaults.generate.resolution` | string | Default resolution | `1K` |
| `defaults.process.removeBackground` | object | Default background removal settings | `{}` |
| `defaults.process.trim` | boolean | Default trim setting | `false` |
| `defaults.process.format` | string | Default output format | `png` |
| `defaults.process.quality` | number | Default quality (1-100) | `90` |

Source: [README.md](https://github.com/JimothySnicket/gemini-image-mcp/blob/main/README.md)

## Development Setup

For contributing to the project or running from source:

### 1. Clone the Repository

```bash
git clone https://github.com/JimothySnicket/gemini-image-mcp.git
cd gemini-image-mcp
```

### 2. Install Dependencies

The project uses **Bun** as its package manager:

```bash
bun install
```

### 3. Build the Project

Compile TypeScript to JavaScript:

```bash
bun run build
```

This produces output in the `dist/` directory.

### 4. Run in Development Mode

Execute directly from source using Bun:

```bash
bun run dev
```

### 5. Run the Compiled Version

After building, start the compiled server:

```bash
bun run start
```

Or with Node.js:

```bash
node dist/index.js
```

Source: [CONTRIBUTING.md](https://github.com/JimothySnicket/gemini-image-mcp/blob/main/CONTRIBUTING.md), [package.json](https://github.com/JimothySnicket/gemini-image-mcp/blob/main/package.json)

## Rate Limiting Configuration

The server supports rate limiting to prevent runaway agents or excessive costs:

| Environment Variable | Description | Example |
|---------------------|-------------|---------|
| `MAX_REQUESTS_PER_HOUR` | Maximum API requests per hour | `20` |
| `MAX_COST_PER_HOUR` | Maximum cost per hour in USD | `5` |

Sensible starting values for an agent loop:

```bash
export MAX_REQUESTS_PER_HOUR=20
export MAX_COST_PER_HOUR=5
```

> **Note:** The server logs a warning at startup if no rate limits are configured.

Source: [README.md](https://github.com/JimothySnicket/gemini-image-mcp/blob/main/README.md)

## Supported Gemini Models

| Model | Strengths | Supported Resolutions |
|-------|-----------|----------------------|
| `gemini-2.5-flash-image` | Fast, cheap (~$0.04/image) | 1K only (scheduled for deprecation in Oct 2026) |
| `gemini-3-pro-image-preview` | Best quality, text rendering | 1K, 2K, 4K |
| `gemini-3.1-flash-image-preview` | Speed + quality balance, Google Search grounding | 512, 1K, 2K, 4K |

The server performs automatic model discovery at startup, detecting image-capable models available with your API key.

Source: [README.md](https://github.com/JimothySnicket/gemini-image-mcp/blob/main/README.md)

## Server Metadata

The server is registered with the MCP registry:

```json
{
  "name": "io.github.JimothySnicket/gemini-image",
  "version": "0.4.0",
  "description": "Google Gemini image generation, editing, and local processing via MCP"
}
```

Source: [server.json](https://github.com/JimothySnicket/gemini-image-mcp/blob/main/server.json)

## Installation Flow Diagram

```mermaid
graph TD
    A[Start Installation] --> B{Have GEMINI_API_KEY?}
    B -->|No| C[Get API Key from Google AI Studio]
    B -->|Yes| D{Choose Installation Method}
    C --> D
    D -->|Global| E[npm install -g]
    D -->|Temporary| F[npx -y]
    D -->|Claude Code| G[claude mcp add command]
    D -->|Claude Desktop| H[Edit claude_desktop_config.json]
    D -->|Development| I[Clone repo + bun install]
    E --> J{Setup Config File?}
    F --> J
    G --> J
    H --> J
    I --> J
    J -->|Yes| K[Run --init or --init --local]
    J -->|No| L[Use Defaults]
    K --> M[Start Using MCP Server]
    L --> M
```

## Verification Checklist

After installation, verify your setup by checking:

- [ ] `echo $GEMINI_API_KEY` returns your API key
- [ ] Server starts without errors
- [ ] MCP client recognizes the gemini-image server
- [ ] Test `generate_image` tool with a simple prompt
- [ ] Rate limiting is configured (recommended for agent use)

## Troubleshooting

### API Key Not Found

If the server reports that `GEMINI_API_KEY` is not set:

1. Verify the environment variable is set: `echo $GEMINI_API_KEY`
2. Restart your terminal session
3. For Claude Desktop, ensure the env variable is set before starting the application

### Model Not Available

If you receive a model not available error:

1. Restart the server, since automatic model discovery runs at startup
2. Verify your API key has access to the requested model
3. Check [Google AI Studio](https://aistudio.google.com/apikey) for model availability

### Build Errors

If `bun run build` fails:

1. Ensure Bun is installed: `bun --version`
2. Clear node_modules and reinstall: `rm -rf node_modules && bun install`
3. Check TypeScript version compatibility

Source: [CONTRIBUTING.md](https://github.com/JimothySnicket/gemini-image-mcp/blob/main/CONTRIBUTING.md), [README.md](https://github.com/JimothySnicket/gemini-image-mcp/blob/main/README.md)

---

<a id='mcp-client-setup'></a>

## MCP Client Configuration

### Related Pages

Related topics: [Installation Guide](#installation), [Home](#home)

<details>
<summary>Related Source Files</summary>

The following source files were used to generate this page:

- [README.md](https://github.com/JimothySnicket/gemini-image-mcp/blob/main/README.md)
- [server.json](https://github.com/JimothySnicket/gemini-image-mcp/blob/main/server.json)
- [plugin.json](https://github.com/JimothySnicket/gemini-image-mcp/blob/main/plugin.json)
- [package.json](https://github.com/JimothySnicket/gemini-image-mcp/blob/main/package.json)
- [src/index.ts](https://github.com/JimothySnicket/gemini-image-mcp/blob/main/src/index.ts)
- [src/config.ts](https://github.com/JimothySnicket/gemini-image-mcp/blob/main/src/config.ts)
</details>

## Overview

The gemini-image-mcp project provides an MCP (Model Context Protocol) server for AI-powered image generation and local image processing through Google Gemini. This page covers how to connect MCP-compatible clients to the server, how to pass the required credentials, and how to customize server behavior through environment variables or configuration files.

The server exposes two primary tools: `generate_image` for AI-powered image generation via the Gemini API, and `process_image` for local image manipulation using the sharp library. Both tools are accessible to any MCP-compatible client once the connection is established. Source: [README.md:1-15](https://github.com/JimothySnicket/gemini-image-mcp/blob/main/README.md)

## Architecture

```mermaid
graph TD
    A[MCP Client<br/>Claude Code / Claude Desktop] --> B[gemini-image-mcp Server]
    B --> C[Google Gemini API]
    B --> D[Local Processing<br/>sharp library]
    
    E[Environment Variables] --> B
    F[Config File<br/>~/.gemini-image-mcp.json] --> B
    G[Local Config<br/>.gemini-image-mcp.json] --> B
    
    H[GEMINI_API_KEY] --> E
    I[OUTPUT_DIR] --> E
    J[DEFAULT_MODEL] --> E
```

## MCP Server Registration

The server registers with the MCP protocol using the official `@modelcontextprotocol/sdk`. Upon connection, clients receive metadata describing available tools and server capabilities. Source: [src/index.ts:1-35](https://github.com/JimothySnicket/gemini-image-mcp/blob/main/src/index.ts)

### Server Identity

| Property | Value |
|----------|-------|
| Server Name | `gemini-image-mcp` |
| Version | Dynamic from `package.json` |
| MCP Name | `io.github.JimothySnicket/gemini-image` |
| Transport | stdio |
| Protocol Version | 2025-12-11 |

The server name and version are read dynamically from `package.json` at runtime, ensuring the MCP handshake always reports the correct version. Source: [package.json:3-7](https://github.com/JimothySnicket/gemini-image-mcp/blob/main/package.json)

### Server Instructions

The MCP server provides structured instructions to connecting clients describing the available tools and configuration hierarchy:

```
Gemini image generation and local image processing. Two tools: generate_image (AI-powered, costs money) 
and process_image (local via sharp, free). Configuration can be set via a JSON config file — run 
`npx @jimothy-snicket/gemini-image-mcp --init` to create ~/.gemini-image-mcp.json with commented defaults. 
A local .gemini-image-mcp.json in the project directory can override global settings. 
Priority: per-request params > env vars > local config > global config > defaults.
```

Source: [src/index.ts:17-26](https://github.com/JimothySnicket/gemini-image-mcp/blob/main/src/index.ts)

## Environment Variables

Environment variables provide the primary mechanism for configuring the MCP server. They are read at server startup and apply globally to all requests.

### Required Variables

| Variable | Description | Required |
|----------|-------------|----------|
| `GEMINI_API_KEY` | Google Gemini API key from AI Studio | Yes |

The `GEMINI_API_KEY` is mandatory. The server will fail to start without it, displaying a clear error message indicating the missing credential. Source: [README.md:68-80](https://github.com/JimothySnicket/gemini-image-mcp/blob/main/README.md)

### Optional Variables

| Variable | Default | Description |
|----------|---------|-------------|
| `OUTPUT_DIR` | `~/gemini-images` | Default directory for saved images |
| `DEFAULT_MODEL` | `gemini-2.5-flash-image` | Default Gemini model |
| `LOG_LEVEL` | `info` | Log level: `debug`, `info`, or `error` |
| `REQUEST_TIMEOUT_MS` | `60000` | API request timeout in milliseconds |
| `SESSION_TIMEOUT_MS` | `1800000` | Multi-turn session expiry (30 minutes) |
| `MAX_REQUESTS_PER_HOUR` | `0` | Max image generations per rolling hour (0 = unlimited) |
| `MAX_COST_PER_HOUR` | `0` | Max estimated cost (USD) per rolling hour (0 = unlimited) |

Source: [src/config.ts:4-30](https://github.com/JimothySnicket/gemini-image-mcp/blob/main/src/config.ts)
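
As an illustration of how the optional variables and their defaults fit together, the sketch below reads them from an environment map. It is not the project's `config.ts`: `readEnvConfig`, `numberFromEnv`, and the choice to fall back to the default on an invalid value are all assumptions.

```typescript
interface ServerEnvConfig {
  outputDir: string;
  defaultModel: string;
  logLevel: "debug" | "info" | "error";
  requestTimeoutMs: number;
  sessionTimeoutMs: number;
  maxRequestsPerHour: number;
  maxCostPerHour: number;
}

type EnvMap = Record<string, string | undefined>;

// Parses a non-negative number, falling back when the value is missing or
// invalid. The fallback-on-invalid behavior is an assumption of this sketch.
function numberFromEnv(env: EnvMap, key: string, fallback: number): number {
  const raw = env[key];
  if (raw === undefined || raw.trim() === "") return fallback;
  const parsed = Number(raw);
  return Number.isFinite(parsed) && parsed >= 0 ? parsed : fallback;
}

// Defaults mirror the table above. In Node.js, pass `process.env` as `env`.
function readEnvConfig(env: EnvMap): ServerEnvConfig {
  const level = env["LOG_LEVEL"];
  return {
    outputDir: env["OUTPUT_DIR"] ?? "~/gemini-images",
    defaultModel: env["DEFAULT_MODEL"] ?? "gemini-2.5-flash-image",
    logLevel: level === "debug" || level === "error" ? level : "info",
    requestTimeoutMs: numberFromEnv(env, "REQUEST_TIMEOUT_MS", 60000),
    sessionTimeoutMs: numberFromEnv(env, "SESSION_TIMEOUT_MS", 1800000),
    maxRequestsPerHour: numberFromEnv(env, "MAX_REQUESTS_PER_HOUR", 0),
    maxCostPerHour: numberFromEnv(env, "MAX_COST_PER_HOUR", 0),
  };
}
```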

### Setting the API Key

**macOS / Linux:**
```bash
echo 'export GEMINI_API_KEY="your-key-here"' >> ~/.bashrc
source ~/.bashrc
```

**Windows (PowerShell):**
```powershell
[System.Environment]::SetEnvironmentVariable('GEMINI_API_KEY', 'your-key-here', 'User')
```

Source: [README.md:75-88](https://github.com/JimothySnicket/gemini-image-mcp/blob/main/README.md)

### Rate Limiting Configuration

Rate limiting is strongly recommended when agents have access to the `generate_image` tool, as an agent in a loop can generate images rapidly.

```bash
# Example: Limit to 20 requests or $5 per rolling hour
export MAX_REQUESTS_PER_HOUR=20
export MAX_COST_PER_HOUR=5
```

Source: [README.md:166-170](https://github.com/JimothySnicket/gemini-image-mcp/blob/main/README.md)
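
A rolling-hour limit can be pictured as a list of recent events that is pruned on every check. The following sketch shows one way to enforce both caps, treating `0` as unlimited as described above. It is an illustration only: `RollingHourLimiter` and its messages are hypothetical, and the server's bookkeeping may differ.

```typescript
interface UsageEvent {
  at: number; // timestamp in milliseconds
  costUsd: number;
}

const ONE_HOUR_MS = 60 * 60 * 1000;

// Hypothetical limiter: a sketch of rolling-window caps, not the server's code.
class RollingHourLimiter {
  private events: UsageEvent[] = [];
  private readonly maxRequests: number;
  private readonly maxCostUsd: number;

  constructor(maxRequests: number, maxCostUsd: number) {
    this.maxRequests = maxRequests;
    this.maxCostUsd = maxCostUsd;
  }

  // Returns null when a request may proceed, otherwise the reason it is blocked.
  check(now: number): string | null {
    // Drop events that have left the one-hour window.
    this.events = this.events.filter((e) => now - e.at < ONE_HOUR_MS);
    if (this.maxRequests > 0 && this.events.length >= this.maxRequests) {
      return "request limit reached";
    }
    const spent = this.events.reduce((sum, e) => sum + e.costUsd, 0);
    if (this.maxCostUsd > 0 && spent >= this.maxCostUsd) {
      return "cost limit reached";
    }
    return null;
  }

  // Records a completed generation and its estimated cost.
  record(now: number, costUsd: number): void {
    this.events.push({ at: now, costUsd });
  }
}
```

A caller would run `check(Date.now())` before each generation and `record(Date.now(), cost)` afterwards, so the window slides continuously instead of resetting on the hour.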

## Configuration Files

Beyond environment variables, the server supports persistent JSON configuration files with comments (JSONC format).
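
Since `JSON.parse` rejects comments, a JSONC file has to be cleaned first, and the cleaning must not touch `//` sequences inside strings such as URLs. The sketch below shows a string-aware approach; `stripJsonComments` is a hypothetical helper, not the project's parser, and it does not handle trailing commas.

```typescript
// Removes // and /* */ comments from JSONC text while leaving string
// contents (such as "https://...") untouched.
// Hypothetical helper: a sketch of the technique, not the server's code.
function stripJsonComments(text: string): string {
  let out = "";
  let i = 0;
  let inString = false;
  while (i < text.length) {
    const ch = text.charAt(i);
    const next = text.charAt(i + 1);
    if (inString) {
      out += ch;
      if (ch === "\\" && next !== "") {
        out += next; // copy the escaped character verbatim
        i += 2;
        continue;
      }
      if (ch === '"') inString = false;
      i += 1;
    } else if (ch === '"') {
      inString = true;
      out += ch;
      i += 1;
    } else if (ch === "/" && next === "/") {
      // Line comment: skip to end of line but keep the newline itself.
      while (i < text.length && text.charAt(i) !== "\n") i += 1;
    } else if (ch === "/" && next === "*") {
      // Block comment: skip past the closing marker, or to the end if unclosed.
      const end = text.indexOf("*/", i + 2);
      i = end === -1 ? text.length : end + 2;
    } else {
      out += ch;
      i += 1;
    }
  }
  return out;
}
```

The result can then be passed to `JSON.parse`. Tracking the `inString` state is what prevents a value like `"https://example.com"` from being truncated at the `//`.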

### Config File Locations

| Location | Purpose |
|----------|---------|
| `~/.gemini-image-mcp.json` | Global configuration for all projects |
| `.gemini-image-mcp.json` | Project-specific overrides |

Source: [src/config.ts:1-20](https://github.com/JimothySnicket/gemini-image-mcp/blob/main/src/config.ts)

### Configuration Priority

```mermaid
graph LR
    A[Per-request Parameters<br/>Highest Priority] --> B[Environment Variables]
    B --> C[Local Config<br/>.gemini-image-mcp.json]
    C --> D[Global Config<br/>~/.gemini-image-mcp.json]
    D --> E[Built-in Defaults<br/>Lowest Priority]
    
    style A fill:#90EE90
    style E fill:#FFB6C1
```

Priority order (highest to lowest):
1. Per-request tool parameters
2. Environment variables
3. Local config file (`.gemini-image-mcp.json` in project)
4. Global config file (`~/.gemini-image-mcp.json`)
5. Built-in defaults

Source: [README.md:148-155]()
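
The priority order listed above amounts to layering the sources from lowest to highest priority and letting later layers overwrite earlier ones. The sketch below illustrates that under stated assumptions: `ConfigLayer` and `resolveSettings` are hypothetical names, and only three string settings are modeled.

```typescript
// Hypothetical sketch of layered resolution; not the project's config loader.
interface ConfigLayer {
  outputDir?: string;
  defaultModel?: string;
  logLevel?: string;
}

// Layers are passed lowest priority first; a later layer wins when it sets a key.
function resolveSettings(...layersLowToHigh: ConfigLayer[]): ConfigLayer {
  const resolved: ConfigLayer = {};
  for (const layer of layersLowToHigh) {
    for (const key of Object.keys(layer) as (keyof ConfigLayer)[]) {
      const value = layer[key];
      // Unset keys fall through to whatever a lower layer provided.
      if (value !== undefined) resolved[key] = value;
    }
  }
  return resolved;
}
```

Calling `resolveSettings(builtInDefaults, globalConfig, localConfig, envVars, requestParams)` would then yield the per-request value wherever one is given, and fall back down the list otherwise.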

### Initializing Config Files

Create a new config file with documented defaults:

```bash
# Global config
npx @jimothy-snicket/gemini-image-mcp --init

# Project-specific config
npx @jimothy-snicket/gemini-image-mcp --init --local

# Overwrite existing
npx @jimothy-snicket/gemini-image-mcp --init --force
```

Source: [README.md:53-62]()

### Config File Template

```jsonc
{
  // gemini-image-mcp configuration
  // Docs: https://github.com/JimothySnicket/gemini-image-mcp

  // Directory where generated/processed images are saved
  "outputDir": "~/gemini-images",

  // Default Gemini model for image generation
  // gemini-2.5-flash-image         — fast, ~$0.04/image, 1K only
  // gemini-3.1-flash-image-preview  — fast, ~$0.08/image, up to 4K
  // gemini-3-pro-image-preview      — best quality, ~$0.16/image, up to 4K
  "defaultModel": "gemini-2.5-flash-image",

  "logLevel": "info",
  "requestTimeout": 60000,
  "sessionTimeout": 1800000,
  "maxRequestsPerHour": 0,
  "maxCostPerHour": 0,

  "defaults": {
    "generate": {
      // "aspectRatio": "1:1",
      // "resolution": "1K"
    }
  }
}
```

Source: [src/config.ts:1-35]()

### Security Considerations

API keys are **rejected from config files** with a warning. This prevents accidental exposure when config files get committed to repositories. Source: [CHANGELOG.md:45-50]()

The config system includes:
- String-aware JSONC comment stripping (won't mangle URLs in quoted strings)
- Prototype pollution guard on config deep merge
- Unknown config keys warned and dropped
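
To make the first two safeguards concrete, here is a small sketch of string-aware comment stripping and a guarded deep merge. It is an illustration written for this page, not the code in `src/config.ts`; `stripJsonComments`, `safeDeepMerge`, and `FORBIDDEN_KEYS` are hypothetical names.

```typescript
// Hypothetical sketch; the project's own implementation may differ.
type JsonObject = { [key: string]: unknown };

// Removes // and /* */ comments, but leaves anything inside a quoted string alone.
function stripJsonComments(text: string): string {
  let out = "";
  let inString = false;
  let i = 0;
  while (i < text.length) {
    const ch = text.charAt(i);
    const next = text.charAt(i + 1);
    if (inString) {
      out += ch;
      if (ch === "\\" && i + 1 < text.length) {
        out += next; // keep the escaped character verbatim
        i += 2;
        continue;
      }
      if (ch === '"') inString = false;
      i++;
    } else if (ch === '"') {
      inString = true;
      out += ch;
      i++;
    } else if (ch === "/" && next === "/") {
      while (i < text.length && text.charAt(i) !== "\n") i++; // skip to end of line
    } else if (ch === "/" && next === "*") {
      const end = text.indexOf("*/", i + 2);
      i = end === -1 ? text.length : end + 2;
    } else {
      out += ch;
      i++;
    }
  }
  return out;
}

const FORBIDDEN_KEYS = ["__proto__", "constructor", "prototype"];

function isPlainObject(value: unknown): value is JsonObject {
  return typeof value === "object" && value !== null && !Array.isArray(value);
}

// Deep-merges override into base, skipping keys that could alter prototypes.
function safeDeepMerge(base: JsonObject, override: JsonObject): JsonObject {
  const result: JsonObject = { ...base };
  for (const key of Object.keys(override)) {
    if (FORBIDDEN_KEYS.indexOf(key) !== -1) continue; // prototype pollution guard
    const current = result[key];
    const incoming = override[key];
    result[key] =
      isPlainObject(current) && isPlainObject(incoming)
        ? safeDeepMerge(current, incoming)
        : incoming;
  }
  return result;
}
```

Tracking whether the scanner is inside a string is what keeps a value such as `"https://github.com/..."` intact, since its `//` would otherwise look like the start of a line comment.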

## MCP Client Setup Examples

### Claude Code (One-Liner)

The simplest setup method using Claude Code's built-in MCP management:

```bash
claude mcp add gemini-image -- npx -y @jimothy-snicket/gemini-image-mcp
```

Claude Code automatically inherits `GEMINI_API_KEY` from the shell environment. Source: [README.md:38-45]()

### Claude Code (Manual Configuration)

For explicit control, add to `.mcp.json` in your project root or `~/.claude/.mcp.json` for global access:

```json
{
  "mcpServers": {
    "gemini-image": {
      "command": "npx",
      "args": ["-y", "@jimothy-snicket/gemini-image-mcp"],
      "env": {
        "GEMINI_API_KEY": "${GEMINI_API_KEY}"
      }
    }
  }
}
```

The `${GEMINI_API_KEY}` syntax reads the value from your shell environment without storing the actual key in the config file. Source: [README.md:95-110]()

### Claude Desktop

Edit the Claude Desktop configuration file:

| OS | Path |
|----|------|
| macOS | `~/Library/Application Support/Claude/claude_desktop_config.json` |
| Windows | `%APPDATA%\Claude\claude_desktop_config.json` |

```json
{
  "mcpServers": {
    "gemini-image": {
      "command": "npx",
      "args": ["-y", "@jimothy-snicket/gemini-image-mcp"],
      "env": {
        "GEMINI_API_KEY": "${GEMINI_API_KEY}"
      }
    }
  }
}
```

Source: [README.md:111-130]()

### Plugin-Based Configuration

For environments using the Claude plugin system, configure via `plugin.json`:

```json
{
  "name": "gemini-image-mcp",
  "version": "0.2.0",
  "description": "Google Gemini image generation and editing via MCP",
  "mcpServers": {
    "gemini-image": {
      "command": "node",
      "args": ["${CLAUDE_PLUGIN_ROOT}/dist/index.js"],
      "env": {
        "GEMINI_API_KEY": "${GEMINI_API_KEY}"
      }
    }
  },
  "skills": ["skills/image-generation/SKILL.md"]
}
```

The `${CLAUDE_PLUGIN_ROOT}` variable is replaced at runtime with the plugin installation directory. Source: [plugin.json:1-16]()

### Enhanced Security Setup

For environments requiring extra security, use a wrapper script that retrieves credentials from the OS keychain:

```bash
#!/bin/bash
# Wrapper script example (macOS Keychain)
# The shebang must be the first line of the file for the script to be executable.
API_KEY=$(security find-generic-password -s "GEMINI_API_KEY" -w)
GEMINI_API_KEY="$API_KEY" node /path/to/gemini-image-mcp/dist/index.js
```

Source: [README.md:145-155]()

## Server.json Schema

The `server.json` file advertises the server's package, transport, and required environment variables to MCP registries and compatible clients:

```json
{
  "$schema": "https://static.modelcontextprotocol.io/schemas/2025-12-11/server.schema.json",
  "name": "io.github.JimothySnicket/gemini-image",
  "description": "Google Gemini image generation, editing, and local processing via MCP",
  "repository": {
    "url": "https://github.com/JimothySnicket/gemini-image-mcp",
    "source": "github"
  },
  "version": "0.4.0",
  "packages": [
    {
      "registryType": "npm",
      "identifier": "@jimothy-snicket/gemini-image-mcp",
      "version": "0.4.0",
      "transport": {
        "type": "stdio"
      },
      "environmentVariables": [
        {
          "description": "Google Gemini API key from https://aistudio.google.com/apikey",
          "isRequired": true,
          "format": "string",
          "isSecret": true,
          "name": "GEMINI_API_KEY"
        }
      ]
    }
  ]
}
```

This schema allows MCP clients to automatically discover server requirements and display appropriate configuration prompts. Source: [server.json:1-32]()

## Verifying Configuration

### Check Environment Variables

```bash
echo $GEMINI_API_KEY
```

A non-empty response confirms the variable is set. Source: [README.md:89-91]()

### Test Server Startup

```bash
npx @jimothy-snicket/gemini-image-mcp --help
```

Successful startup displays diagnostics including Node version, PID, working directory, API key status, default model, and output directory. Source: [CHANGELOG.md:35-40]()

## Tool Registration

The MCP server registers two tools with their input schemas:

### generate_image Tool

```typescript
server.registerTool(
  "generate_image",
  {
    title: "Generate Image",
    description: "Generate or edit images using Google Gemini...",
    inputSchema: {
      prompt: z.string().describe("Text description of the image..."),
      images: z.optional(z.array(z.string()).max(14)),
      model: z.optional(z.string()),
      aspectRatio: z.optional(z.enum(["1:1", "16:9", "9:16", "3:2", "2:3", "3:4", "4:3", "21:9"])),
      resolution: z.optional(z.enum(["1K", "2K", "4K"])),
      // ... additional parameters
    }
  }
);
```

Source: [src/index.ts:37-60]()

### process_image Tool

Local image processing via `sharp`. It is free and requires no API calls, and it supports crop, resize, background removal, trim, and format conversion. Source: [README.md:28-40]()

## Development Setup

For local development of the MCP server:

```bash
bun install
bun run build     # TypeScript -> dist/
bun run dev       # Run directly with Bun
```

Requires `GEMINI_API_KEY` environment variable for testing image generation. Source: [CONTRIBUTING.md:1-15]()

## Summary

MCP Client Configuration for gemini-image-mcp supports multiple integration strategies:

| Method | Best For |
|--------|----------|
| `claude mcp add` | Quick setup in Claude Code |
| `.mcp.json` manual | Explicit control, version control of config |
| Claude Desktop | Desktop Claude applications |
| Plugin system | Shared team configurations |
| Environment variables | CI/CD pipelines, containerized deployments |
| Config files | Persistent, documented defaults |

The configuration system follows a clear priority hierarchy, with per-request parameters taking precedence over environment variables, which take precedence over local and global config files. Rate limiting is configurable to prevent runaway costs in agent-based workflows.

---

<a id='generate-image-tool'></a>

## generate_image Tool Reference

### Related Pages

Related topics: [process_image Tool Reference](#process-image-tool), [Image Generation Internals](#image-generation-internals), [Cost Tracking and Rate Limiting](#cost-tracking)

<details>
<summary>Related Source Files</summary>

The following source files were used to generate this page:

- [src/generate.ts](https://github.com/JimothySnicket/gemini-image-mcp/blob/main/src/generate.ts)
- [src/index.ts](https://github.com/JimothySnicket/gemini-image-mcp/blob/main/src/index.ts)
- [src/pricing.ts](https://github.com/JimothySnicket/gemini-image-mcp/blob/main/src/pricing.ts)
- [src/config.ts](https://github.com/JimothySnicket/gemini-image-mcp/blob/main/src/config.ts)
- [package.json](https://github.com/JimothySnicket/gemini-image-mcp/blob/main/package.json)
</details>

# generate_image Tool Reference

The `generate_image` tool is the primary AI-powered component of the `gemini-image-mcp` server. It leverages Google Gemini's native image generation API (`generateContent`) to create and edit images based on text prompts, with optional reference images for contextual guidance. This tool is designed for scenarios requiring intelligent, model-driven image creation including text-to-image generation, iterative editing through multi-turn sessions, and AI-assisted image composition with up to 14 reference images.

Unlike integrations built on the deprecated Imagen-based services, this tool uses Gemini's native image generation, which gives it access to features such as multi-turn conversation context and Google Search grounding.

## Overview

| Property | Value |
|----------|-------|
| **Tool Name** | `generate_image` |
| **API Backend** | Google Gemini `generateContent` |
| **Cost** | Per-request (see [Pricing](#pricing-and-cost-reporting)) |
| **Free Tier** | No (requires Gemini API key) |
| **Transport** | STDIO (MCP Protocol) |

Source: [src/index.ts:1-50](https://github.com/JimothySnicket/gemini-image-mcp/blob/main/src/index.ts)

## Supported Models

The tool automatically discovers available image-capable models at startup by querying the Gemini API. However, three primary models are documented and supported:

| Model ID | Strengths | Resolution Support | Reference Images | Cost Tier |
|----------|-----------|-------------------|------------------|-----------|
| `gemini-2.5-flash-image` | Fast, affordable | 1K only | Up to 5 | ~$0.04/image |
| `gemini-3-pro-image-preview` | Best quality, superior text rendering | 1K, 2K, 4K | Up to 14 | ~$0.15/image |
| `gemini-3.1-flash-image-preview` | Speed/quality balance, Google Search grounding | 512, 1K, 2K, 4K | Up to 14 | ~$0.08/image |

The model auto-discovery mechanism filters models based on naming patterns to exclude deprecated Imagen-based services:

```typescript
const IMAGE_MODEL_PATTERNS = ["image", "img"];
const EXCLUDED_PREFIXES = ["imagen"];
```

Source: [src/generate.ts:1-50](https://github.com/JimothySnicket/gemini-image-mcp/blob/main/src/generate.ts)
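
As a rough illustration of how those two constants could drive the filter, the sketch below keeps a model when its name contains one of the patterns and does not start with an excluded prefix. The `isImageModel` helper and the handling of a leading `models/` segment are assumptions made for this example, not code from `src/generate.ts`.

```typescript
// Hypothetical sketch of the name-based filter described above.
const IMAGE_MODEL_PATTERNS = ["image", "img"];
const EXCLUDED_PREFIXES = ["imagen"];

function isImageModel(modelName: string): boolean {
  // Assumption: names may arrive as "models/<id>", so drop that segment first.
  const id = modelName.toLowerCase().replace(/^models\//, "");
  const excluded = EXCLUDED_PREFIXES.some((prefix) => id.indexOf(prefix) === 0);
  const matches = IMAGE_MODEL_PATTERNS.some((pattern) => id.indexOf(pattern) !== -1);
  return matches && !excluded;
}
```

Note that an Imagen model name contains the substring `image`, which is why the prefix exclusion is needed in addition to the pattern match.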

## Parameters

The following table documents all parameters accepted by the `generate_image` tool:

| Parameter | Type | Required | Default | Description |
|-----------|------|----------|---------|-------------|
| `prompt` | `string` | Yes | — | Text description or editing instruction |
| `images` | `string[]` | No | — | Array of file paths to input/reference images |
| `model` | `string` | No | Config `defaultModel` | Gemini model ID |
| `aspectRatio` | `string` | No | Config default | Output aspect ratio |
| `resolution` | `string` | No | Config default | Output resolution (1K, 2K, 4K) |
| `outputDir` | `string` | No | `~/gemini-images` | Override output directory |
| `filename` | `string` | No | Auto-generated | Base name with auto-versioning |
| `subfolder` | `string` | No | — | Subdirectory within output |
| `seed` | `integer` | No | Random | Reproducible generation seed |
| `sessionId` | `string` | No | — | Multi-turn session identifier |
| `useSearchGrounding` | `boolean` | No | `false` | Enable Google Search grounding |

### Supported Aspect Ratios

| Ratio | Use Case |
|-------|----------|
| `1:1` | Square images, social posts, icons |
| `16:9` | Widescreen, hero banners, videos |
| `9:16` | Vertical stories, mobile content |
| `3:2` | Standard photography |
| `2:3` | Portrait photography |
| `4:3` | Classic aspect ratio |
| `3:4` | Portrait standard |
| `21:9` | Ultra-widescreen |

### Resolution Options

| Resolution | Availability |
|------------|--------------|
| `1K` | All models |
| `2K` | gemini-3-pro-image-preview, gemini-3.1-flash-image-preview |
| `4K` | gemini-3-pro-image-preview, gemini-3.1-flash-image-preview |

Source: [src/index.ts:50-150](https://github.com/JimothySnicket/gemini-image-mcp/blob/main/src/index.ts)

## Architecture

### High-Level Flow

```mermaid
graph TD
    A[User Request: generate_image] --> B[Load Config & Validate Params]
    B --> C{images provided?}
    C -->|No| D[Text-to-Image Mode]
    C -->|Yes| E[Image Editing Mode]
    D --> F[Build Prompt Content]
    E --> G[Read Image Files as InlineData]
    G --> F
    F --> H{Model supports grounding?}
    H -->|Yes & enabled| I[Add Google Search Tool]
    H -->|No| J[Skip Grounding]
    I --> K[Call Gemini generateContent API]
    J --> K
    K --> L[Extract Generated Image]
    L --> M[Apply Filename & Subfolder Logic]
    M --> N[Save to Output Directory]
    N --> O[Log to generations.jsonl]
    O --> P[Return Response with Usage Report]
```

### Multi-Turn Session Management

Multi-turn sessions enable iterative refinement of images by preserving conversation history across multiple requests:

```mermaid
graph LR
    A[Request 1: sessionId=abc123] --> B[Create New Session]
    B --> C[Generate Image]
    C --> D[Response: sessionId=abc123]
    D --> E[Request 2: sessionId=abc123]
    E --> F[Retrieve Existing Session]
    F --> G[Append to History]
    G --> H[Generate with Context]
    H --> D2[Updated Response]
```

Sessions are managed through an in-memory Map with automatic cleanup:

| Setting | Value |
|---------|-------|
| Max conversation turns per session | 10 |
| Session timeout | 30 minutes (1800000ms) |
| History storage | Array of `Content` objects |

Source: [src/generate.ts:50-150](https://github.com/JimothySnicket/gemini-image-mcp/blob/main/src/generate.ts)
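
A minimal sketch of such a session store follows, using the turn limit and timeout from the table. `SessionStore`, `HistoryEntry`, and `ImageSession` are hypothetical names, `HistoryEntry` is a simplified stand-in for the SDK's `Content` type, and dropping the oldest turns once the limit is reached is an assumption; the real server may handle a full session differently.

```typescript
// Hypothetical sketch of an in-memory session map with expiry.
interface HistoryEntry {
  role: string;
  parts: unknown[];
}

interface ImageSession {
  history: HistoryEntry[];
  lastUsedMs: number;
}

const MAX_TURNS = 10;
const SESSION_TIMEOUT_MS = 1800000; // 30 minutes

class SessionStore {
  private sessions = new Map<string, ImageSession>();

  // Returns the history for a session, creating it if it is new or has expired.
  getHistory(sessionId: string, nowMs: number): HistoryEntry[] {
    this.cleanup(nowMs);
    let session = this.sessions.get(sessionId);
    if (!session) {
      session = { history: [], lastUsedMs: nowMs };
      this.sessions.set(sessionId, session);
    }
    session.lastUsedMs = nowMs;
    return session.history;
  }

  // Appends one turn (user request plus model reply) to the session.
  appendTurn(sessionId: string, user: HistoryEntry, model: HistoryEntry, nowMs: number): void {
    const history = this.getHistory(sessionId, nowMs);
    history.push(user, model);
    // Assumption: keep only the most recent MAX_TURNS turns.
    const maxEntries = MAX_TURNS * 2;
    if (history.length > maxEntries) history.splice(0, history.length - maxEntries);
  }

  // Removes sessions that have been idle for longer than the timeout.
  private cleanup(nowMs: number): void {
    this.sessions.forEach((session, id) => {
      if (nowMs - session.lastUsedMs > SESSION_TIMEOUT_MS) this.sessions.delete(id);
    });
  }
}
```

Because the map lives in process memory, sessions in a design like this do not survive a server restart.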

### Image Input Processing

When reference images are provided, they are converted to Gemini's inline data format:

```typescript
const MIME_TYPES: Record<string, string> = {
  ".png": "image/png",
  ".jpg": "image/jpeg",
  ".jpeg": "image/jpeg",
  ".webp": "image/webp",
  ".gif": "image/gif",
};
```

| Constraint | Limit |
|------------|-------|
| Max image file size | 50MB |
| Max reference images (gemini-3-pro) | 14 |
| Max reference images (gemini-2.5-flash) | 5 |

Source: [src/generate.ts:150-200](https://github.com/JimothySnicket/gemini-image-mcp/blob/main/src/generate.ts)
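
The checks implied by the MIME table and the size limit can be sketched as a single validation step. This is an illustrative helper, not the project's code: `validateImageInput` and `MAX_IMAGE_BYTES` are hypothetical names, the 50MB limit is assumed to mean 50 × 1024 × 1024 bytes, and the error strings follow the messages listed under Error Handling.

```typescript
// Hypothetical sketch of reference-image validation.
const MIME_TYPES: Record<string, string> = {
  ".png": "image/png",
  ".jpg": "image/jpeg",
  ".jpeg": "image/jpeg",
  ".webp": "image/webp",
  ".gif": "image/gif",
};

const MAX_IMAGE_BYTES = 50 * 1024 * 1024; // assumption: binary megabytes

// Returns the MIME type for a supported image, or throws a descriptive error.
function validateImageInput(filePath: string, sizeBytes: number): string {
  // Look at the file name only, so dots in directory names are ignored.
  const lastSeparator = Math.max(filePath.lastIndexOf("/"), filePath.lastIndexOf("\\"));
  const fileName = filePath.slice(lastSeparator + 1);
  const dot = fileName.lastIndexOf(".");
  const ext = dot === -1 ? "" : fileName.slice(dot).toLowerCase();
  const mimeType = MIME_TYPES[ext];
  if (!mimeType) {
    throw new Error(`Unsupported image format "${ext}" for file: ${filePath}`);
  }
  if (sizeBytes > MAX_IMAGE_BYTES) {
    const sizeMb = Math.round(sizeBytes / (1024 * 1024));
    throw new Error(`Image file is ${sizeMb}MB, max is 50MB`);
  }
  return mimeType;
}
```

The returned MIME type is what would accompany the base64-encoded bytes in an inline data part sent to the API.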

## Google Search Grounding

Google Search grounding enhances generation accuracy by incorporating real-world information through the Gemini `googleSearch` tool. This feature is restricted to specific models:

```typescript
export const GROUNDING_SUPPORTED_MODELS = ["gemini-3.1-flash-image-preview"];

export function validateGrounding(model: string, useSearchGrounding: boolean | undefined): void {
  if (useSearchGrounding && !GROUNDING_SUPPORTED_MODELS.includes(model)) {
    throw new Error(
      `useSearchGrounding is only supported on ${GROUNDING_SUPPORTED_MODELS.join(", ")}. ` +
        `You requested ${model}.`,
    );
  }
}
```

**Supported Model:** `gemini-3.1-flash-image-preview`

Attempting to enable grounding on other models results in a validation error. This restriction ensures users receive accurate error messages rather than silent failures from the API.

Source: [src/generate.ts:1-50](https://github.com/JimothySnicket/gemini-image-mcp/blob/main/src/generate.ts)

## Pricing and Cost Reporting

Every `generate_image` response includes detailed cost information through the `UsageReport` structure:

```typescript
interface UsageReport {
  inputTokens: number;
  outputTokens: number;
  totalTokens: number;
  estimatedCostUsd: number;
}

interface SessionStats {
  totalGenerations: number;
  totalCostUsd: number;
  requestsThisHour: number;
  costThisHour: number;
}
```

| Metric | Description |
|--------|-------------|
| `inputTokens` | Tokens consumed by the prompt and reference images |
| `outputTokens` | Tokens in the API response (including image data) |
| `totalTokens` | Sum of input and output tokens |
| `estimatedCostUsd` | Calculated cost in US dollars |
| `totalGenerations` | Running count in current session |
| `totalCostUsd` | Cumulative cost for the session |

The pricing module calculates costs based on model-specific rates. Rate limiting is available through configuration to prevent runaway costs:

| Environment Variable | Purpose |
|---------------------|---------|
| `MAX_REQUESTS_PER_HOUR` | Request rate limit |
| `MAX_COST_PER_HOUR` | Cost threshold limit |

Source: [src/pricing.ts](https://github.com/JimothySnicket/gemini-image-mcp/blob/main/src/pricing.ts)

## Configuration

The tool respects a layered configuration system with the following priority:

```mermaid
graph TD
    A[Priority 1: Per-Request Parameters] --> B[Priority 2: Environment Variables]
    B --> C[Priority 3: Local Config ./.gemini-image-mcp.json]
    C --> D[Priority 4: Global Config ~/.gemini-image-mcp.json]
    D --> E[Priority 5: Built-in Defaults]
```

### Config File Structure

```json
{
  "defaultModel": "gemini-2.5-flash-image",
  "defaults": {
    "generate": {
      "aspectRatio": "16:9",
      "resolution": "2K"
    }
  }
}
```

Source: [src/config.ts](https://github.com/JimothySnicket/gemini-image-mcp/blob/main/src/config.ts)

## Output Organization

Generated images are saved with intelligent naming and versioning:

| Input `filename` | Existing Files | Output Saved As |
|------------------|-----------------|-----------------|
| `"hero"` | None | `hero.png` |
| `"hero"` | `hero.png` exists | `hero-v2.png` |
| `"hero"` | `hero.png`, `hero-v2.png` exist | `hero-v3.png` |
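
The versioning pattern in the table can be sketched as a lookup against the names already present in the output directory. `nextVersionedName` is a hypothetical helper written for this page, and passing the existing names as a `Set` (rather than reading the directory) is a simplification.

```typescript
// Hypothetical sketch of filename auto-versioning.
function nextVersionedName(base: string, ext: string, existing: Set<string>): string {
  const first = `${base}.${ext}`;
  if (!existing.has(first)) return first;
  // Start at v2 and take the first suffix that is not already in use.
  let version = 2;
  while (existing.has(`${base}-v${version}.${ext}`)) version++;
  return `${base}-v${version}.${ext}`;
}
```

With `hero.png` and `hero-v2.png` already on disk, a call for base `hero` and extension `png` returns `hero-v3.png`, matching the last row of the table.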

### Subfolder Organization

Images can be organized into subdirectories using the `subfolder` parameter:

| Parameters | Result |
|------------|--------|
| `filename: "hero"`, `subfolder: "landing-page"` | `~/gemini-images/landing-page/hero.png` |

### Generation Manifest

All generations are logged to `generations.jsonl` for audit and reproducibility:

```jsonl
{"timestamp":"2024-01-15T10:30:00Z","prompt":"A modern dashboard","model":"gemini-2.5-flash-image","aspectRatio":"16:9","resolution":"1K","cost":0.04,"path":"~/gemini-images/dashboard.png"}
```

Source: [src/generate.ts:200-300](https://github.com/JimothySnicket/gemini-image-mcp/blob/main/src/generate.ts)

## Usage Examples

### Text-to-Image Generation

```json
{
  "prompt": "A modern dashboard UI with dark theme and blue accent colours",
  "aspectRatio": "16:9",
  "resolution": "2K",
  "filename": "dashboard-hero",
  "subfolder": "landing-page"
}
```

### Image Editing with Reference

```json
{
  "prompt": "Change the background to a sunset over water",
  "images": ["./src/assets/hero.png"],
  "aspectRatio": "16:9"
}
```

### Multi-Turn Refinement

```json
{
  "prompt": "Make the colours warmer and add more contrast",
  "sessionId": "session-1711929600000-a1b2c3"
}
```

### Reproducible Generation with Seed

```json
{
  "prompt": "A photorealistic mountain landscape",
  "seed": 42,
  "aspectRatio": "16:9"
}
```

### Google Search Grounding

```json
{
  "prompt": "Current design trends for AI product landing pages",
  "model": "gemini-3.1-flash-image-preview",
  "useSearchGrounding": true
}
```

## Error Handling

The tool provides specific error messages for common failure scenarios:

| Error Condition | Message |
|-----------------|---------|
| Invalid API key | `Failed to list models (is your API key valid?)` |
| Unsupported image format | `Unsupported image format ".bmp" for file: path/to/image.bmp` |
| Image too large | `Image file is 52MB, max is 50MB` |
| Grounding on unsupported model | `useSearchGrounding is only supported on gemini-3.1-flash-image-preview` |
| Rate limit exceeded | Clear error with remaining budget information |
| Session model mismatch | Error if session uses different model than original |

Source: [src/generate.ts:100-200](https://github.com/JimothySnicket/gemini-image-mcp/blob/main/src/generate.ts)

## Dependencies

| Package | Version | Purpose |
|---------|---------|---------|
| `@google/genai` | ^1.44.0 | Gemini API client |
| `@modelcontextprotocol/sdk` | ^1.22.0 | MCP protocol implementation |
| `zod` | ^3.24.0 | Schema validation |
| `sharp` | ^0.34.5 | Image processing (for output) |

Source: [package.json](https://github.com/JimothySnicket/gemini-image-mcp/blob/main/package.json)

---

<a id='process-image-tool'></a>

## process_image Tool Reference

### Related Pages

Related topics: [generate_image Tool Reference](#generate-image-tool), [Image Processing Internals](#image-processing-internals)

<details>
<summary>Related Source Files</summary>

The following source files were used to generate this page:

- [src/process.ts](https://github.com/JimothySnicket/gemini-image-mcp/blob/main/src/process.ts)
- [src/index.ts](https://github.com/JimothySnicket/gemini-image-mcp/blob/main/src/index.ts)
- [src/config.ts](https://github.com/JimothySnicket/gemini-image-mcp/blob/main/src/config.ts)
- [README.md](https://github.com/JimothySnicket/gemini-image-mcp/blob/main/README.md)
- [skills/image-generation/SKILL.md](https://github.com/JimothySnicket/gemini-image-mcp/blob/main/skills/image-generation/SKILL.md)
- [package.json](https://github.com/JimothySnicket/gemini-image-mcp/blob/main/package.json)
</details>

# process_image Tool Reference

## Overview

The `process_image` tool is a local image processing utility within the gemini-image-mcp MCP server. It leverages the `sharp` library to perform CPU-bound image transformations without making any API calls, making it completely free to use. Source: [package.json:14](https://github.com/JimothySnicket/gemini-image-mcp/blob/main/package.json)

Unlike the AI-powered `generate_image` tool, which sends requests to Google's Gemini API and incurs a cost per operation, `process_image` operates entirely on the local machine. This creates an efficient two-tool workflow where AI generation can be followed by local processing at zero additional cost. Source: [README.md:features](https://github.com/JimothySnicket/gemini-image-mcp/blob/main/README.md)

## Architecture

### Tool Registration Flow

The `process_image` tool is registered with the MCP server using the `@modelcontextprotocol/sdk` framework. The tool definition includes Zod schemas for parameter validation and a handler function that orchestrates the processing pipeline. Source: [src/index.ts:tool-registration](https://github.com/JimothySnicket/gemini-image-mcp/blob/main/src/index.ts)

```mermaid
graph TD
    A[MCP Client Request] --> B[index.ts Tool Handler]
    B --> C[Load Configuration]
    C --> D[processImage Function]
    D --> E[sharp Operations Pipeline]
    E --> F[Output File System]
    D --> G[Return JSON Result]
    
    H[Config Sources] --> C
    H --> I[Environment Variables]
    H --> J[Local Config .json]
    H --> K[Global Config .json]
    H --> L[Default Values]
```

### Processing Pipeline

The tool chains multiple operations into a single call. Operations are applied in a fixed order: background removal first, then cropping, resizing, trimming, and finally format conversion. Source: [src/index.ts:75-89](https://github.com/JimothySnicket/gemini-image-mcp/blob/main/src/index.ts)

```mermaid
graph LR
    A[Input Image] --> B[Background Removal]
    B --> C[Crop]
    C --> D[Resize]
    D --> E[Trim]
    E --> F[Format Conversion]
    F --> G[Saved Output]
```
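
One way to picture this ordering is a planner that ignores the order in which parameters were supplied and always emits the requested steps in pipeline order. The sketch below is illustrative only: `planOperations`, `ProcessParams`, and `PIPELINE_ORDER` are hypothetical names, and the parameter shapes are reduced to the minimum needed.

```typescript
// Hypothetical sketch of ordering the requested operations.
type Operation = "removeBackground" | "crop" | "resize" | "trim" | "format";

// Order taken from the pipeline diagram above.
const PIPELINE_ORDER: Operation[] = ["removeBackground", "crop", "resize", "trim", "format"];

interface ProcessParams {
  removeBackground?: object;
  crop?: object;
  resize?: object;
  trim?: boolean;
  format?: string;
}

// Returns only the operations that were requested, in pipeline order.
function planOperations(params: ProcessParams): Operation[] {
  return PIPELINE_ORDER.filter((op) => Boolean(params[op]));
}
```

For example, a request that sets `resize`, `trim`, and `removeBackground` (in that order) would be planned as background removal, then resize, then trim.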

## Tool Parameters

### Required Parameters

| Parameter | Type | Description |
|-----------|------|-------------|
| `imagePath` | string | Path to the image file to process |

Source: [src/index.ts:48-50](https://github.com/JimothySnicket/gemini-image-mcp/blob/main/src/index.ts)

### Optional Parameters

| Parameter | Type | Default | Description |
|-----------|------|---------|-------------|
| `crop` | CropConfig | none | Crop by pixel dimensions, aspect ratio, or focal point strategy |
| `resize` | ResizeConfig | none | Resize to width/height (maintains aspect ratio) |
| `removeBackground` | RemoveBackgroundConfig | config default | Remove background by threshold or chroma key |
| `trim` | boolean | config default | Auto-remove whitespace/transparent borders |
| `format` | "png" \| "jpeg" \| "webp" | original | Convert to specified format |
| `quality` | number (1-100) | 90 | Output quality for JPEG/WebP |
| `outputDir` | string | ~/gemini-images | Directory to save output |
| `filename` | string | auto | Base name for saved file with auto-versioning |
| `subfolder` | string | none | Subfolder within output directory |

Source: [src/index.ts:51-79](https://github.com/JimothySnicket/gemini-image-mcp/blob/main/src/index.ts)

## Operations

### Crop

The crop operation supports three distinct modes for targeting specific regions of an image.

**Pixel-Exact Crop**

```json
{
  "width": 500,
  "height": 300,
  "left": 100,
  "top": 50
}
```

**Aspect Ratio Crop**

```jsonc
{
  "aspectRatio": "16:9",
  "strategy": "center"  // or "attention" or "entropy"
}
```

**Supported Aspect Ratios**

| Ratio | Use Case |
|-------|----------|
| `1:1` | Square images, avatars |
| `16:9` | Hero banners, video thumbnails |
| `9:16` | Mobile stories, vertical content |
| `3:2` | Standard photography |
| `2:3` | Portrait photography |
| `4:3` | Classic monitors |
| `3:4` | Portrait prints |
| `21:9` | Ultrawide displays |

Source: [README.md:aspect-ratios](https://github.com/JimothySnicket/gemini-image-mcp/blob/main/README.md)

**Focal Point Strategies**

| Strategy | Behavior |
|----------|----------|
| `center` | Default. Crops from the center of the image |
| `attention` | Shifts crop toward the most visually interesting region based on saliency detection |
| `entropy` | Shifts crop toward the region with the most visual detail (high information entropy) |

Source: [skills/image-generation/SKILL.md:crop](https://github.com/JimothySnicket/gemini-image-mcp/blob/main/skills/image-generation/SKILL.md)
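
For the default `center` strategy, the crop region follows directly from the source dimensions and the target ratio. The sketch below shows that arithmetic; `centerCropBox` and `CropBox` are hypothetical names, and the `attention` and `entropy` strategies (which shift the box toward salient or detailed regions) are not modeled here.

```typescript
// Hypothetical sketch of a center crop to a target aspect ratio.
interface CropBox {
  left: number;
  top: number;
  width: number;
  height: number;
}

function centerCropBox(srcWidth: number, srcHeight: number, aspectRatio: string): CropBox {
  const [w, h] = aspectRatio.split(":").map(Number);
  if (!w || !h) throw new Error(`Invalid aspect ratio: ${aspectRatio}`);
  let width = srcWidth;
  let height = srcHeight;
  if (srcWidth * h > srcHeight * w) {
    // Source is wider than the target ratio: keep full height, trim the sides.
    width = Math.round((srcHeight * w) / h);
  } else {
    // Source is taller than (or equal to) the target ratio: keep full width.
    height = Math.round((srcWidth * h) / w);
  }
  return {
    left: Math.floor((srcWidth - width) / 2),
    top: Math.floor((srcHeight - height) / 2),
    width,
    height,
  };
}
```

A 1920×1200 source cropped to `16:9` keeps its full width and loses 60 pixels from the top and bottom, giving a 1920×1080 region.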

### Resize

The resize operation maintains aspect ratio when only one dimension is specified.

```jsonc
{
  "width": 1200        // Auto-calculate height
}
// OR
{
  "height": 800        // Auto-calculate width
}
// OR
{
  "width": 1200,
  "height": 800        // Both specified
}
```

When both width and height are provided, the resize operation respects the crop configuration if present. Source: [src/index.ts:58-62](https://github.com/JimothySnicket/gemini-image-mcp/blob/main/src/index.ts)
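
The single-dimension case can be worked through with simple proportional arithmetic. The following sketch assumes that when both dimensions are given they are used as-is; `resolveResize` is a hypothetical helper and does not model how the real tool fits or crops in that case.

```typescript
// Hypothetical sketch of computing output dimensions for a resize request.
function resolveResize(
  srcWidth: number,
  srcHeight: number,
  target: { width?: number; height?: number },
): { width: number; height: number } {
  const { width, height } = target;
  if (width !== undefined && height !== undefined) return { width, height };
  if (width !== undefined) {
    // Scale height by the same factor as width.
    return { width, height: Math.round((srcHeight * width) / srcWidth) };
  }
  if (height !== undefined) {
    // Scale width by the same factor as height.
    return { width: Math.round((srcWidth * height) / srcHeight), height };
  }
  return { width: srcWidth, height: srcHeight };
}
```

For a 2400×1600 source, requesting `width: 1200` yields 1200×800, and requesting `height: 800` yields the same result.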

### Background Removal

Two distinct algorithms handle background removal depending on the background type.

**Threshold-Based (White Backgrounds)**

```json
{
  "threshold": 230
}
```

The threshold parameter specifies the brightness level above which pixels are treated as background. Lower values remove more near-white pixels, while values closer to 255 remove only the brightest ones. This method works well for studio product shots on white backdrops. Source: [README.md:background-removal](https://github.com/JimothySnicket/gemini-image-mcp/blob/main/README.md)
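
A simplified version of threshold-based removal on raw RGBA pixel data might look like the sketch below. It assumes a pixel counts as background when all three colour channels are at or above the threshold, which is one plausible reading for white backdrops; `removeLightBackground` is a hypothetical helper, and the real implementation may measure brightness differently.

```typescript
// Hypothetical sketch: make bright (near-white) pixels fully transparent.
function removeLightBackground(rgba: Uint8ClampedArray, threshold: number): Uint8ClampedArray {
  const out = new Uint8ClampedArray(rgba); // work on a copy
  for (let i = 0; i + 3 < out.length; i += 4) {
    const r = out[i];
    const g = out[i + 1];
    const b = out[i + 2];
    // Assumption: background means every channel is at or above the threshold.
    if (r >= threshold && g >= threshold && b >= threshold) {
      out[i + 3] = 0; // alpha channel
    }
  }
  return out;
}
```

With a threshold of 230, a pure white pixel becomes transparent while a dark pixel keeps its original alpha.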

**Chroma Key (Green Screen / Any Solid Colour)**

```jsonc
{
  "color": "#00FF00"    // Any hex colour
}
```

The chroma key pipeline performs HSV-based colour keying with advanced compositing techniques:

| Stage | Description |
|-------|-------------|
| HSV Keying | Converts to HSV colour space for colour-based selection |
| Smoothstep Feather | Softens the edges using smoothstep interpolation |
| Spill Suppression | Removes colour contamination from the subject |
| Edge Anti-Aliasing | 5-pass 3x3 kernel smoothing |

Source: [README.md:chroma-key](https://github.com/JimothySnicket/gemini-image-mcp/blob/main/README.md)
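
The first two stages in the table (HSV keying and the smoothstep feather) can be sketched per pixel as follows. This is a simplified illustration, not the project's pipeline: the function names, the 20° and 40° hue tolerances, and the 0.2 saturation cut-off are all assumptions chosen for the example, and spill suppression and edge anti-aliasing are left out.

```typescript
// Hypothetical sketch of hue-based keying with a smoothstep feather.
type Rgb = [number, number, number];

function rgbToHsv(rgb: Rgb): { h: number; s: number; v: number } {
  const rn = rgb[0] / 255;
  const gn = rgb[1] / 255;
  const bn = rgb[2] / 255;
  const max = Math.max(rn, gn, bn);
  const delta = max - Math.min(rn, gn, bn);
  let h = 0;
  if (delta > 0) {
    if (max === rn) h = 60 * (((gn - bn) / delta) % 6);
    else if (max === gn) h = 60 * ((bn - rn) / delta + 2);
    else h = 60 * ((rn - gn) / delta + 4);
  }
  if (h < 0) h += 360;
  return { h, s: max === 0 ? 0 : delta / max, v: max };
}

// Standard smoothstep: 0 below edge0, 1 above edge1, smooth curve in between.
function smoothstep(edge0: number, edge1: number, x: number): number {
  const t = Math.min(1, Math.max(0, (x - edge0) / (edge1 - edge0)));
  return t * t * (3 - 2 * t);
}

// Shortest angular distance between two hues, in degrees.
function hueDistance(a: number, b: number): number {
  const d = Math.abs(a - b) % 360;
  return d > 180 ? 360 - d : d;
}

// Returns opacity in [0, 1]: 0 for the key colour, 1 for clearly different hues.
function chromaKeyAlpha(pixel: Rgb, key: Rgb, innerDeg = 20, outerDeg = 40): number {
  const p = rgbToHsv(pixel);
  const k = rgbToHsv(key);
  // Greys and whites have no meaningful hue, so keep them opaque (assumed cut-off).
  if (p.s < 0.2) return 1;
  return smoothstep(innerDeg, outerDeg, hueDistance(p.h, k.h));
}
```

Working in HSV lets the key match on hue rather than exact RGB values, so slightly darker or lighter shades of the backdrop are still removed, and the smoothstep ramp gives partially transparent edge pixels instead of a hard cut.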

### Trim

The trim operation automatically removes whitespace and transparent borders from images. This is particularly useful after background removal when residual padding remains around the subject. Source: [README.md:trim](https://github.com/JimothySnicket/gemini-image-mcp/blob/main/README.md)

```json
{
  "trim": true
}
```

### Format Conversion

The format parameter converts images to different output formats with quality control.

| Format | Quality Range | Default | Description |
|--------|---------------|---------|-------------|
| `png` | N/A (lossless) | - | Portable Network Graphics |
| `jpeg` | 1-100 | 90 | Joint Photographic Experts Group |
| `webp` | 1-100 | 90 | Web Picture format |

Source: [src/index.ts:63-67](https://github.com/JimothySnicket/gemini-image-mcp/blob/main/src/index.ts)

## Output Organization

### Filename Auto-Versioning

When a filename already exists in the output directory, the tool automatically versions the filename:

- `hero.png` (first save)
- `hero-v2.png` (second save)
- `hero-v3.png` (third save)

This prevents overwriting existing files and maintains a clear history of processed images. Source: [README.md:filename](https://github.com/JimothySnicket/gemini-image-mcp/blob/main/README.md)

### Subfolder Organization

The `subfolder` parameter creates organized directory structures within the output directory:

| Parameter | Example | Result |
|-----------|---------|--------|
| `filename` only | `"hero"` | `~/gemini-images/hero.png` |
| `subfolder` only | `"landing-page"` | `~/gemini-images/landing-page/original.png` |
| Both | `"hero"` + `"landing-page"` | `~/gemini-images/landing-page/hero.png` |

Source: [README.md:output](https://github.com/JimothySnicket/gemini-image-mcp/blob/main/README.md)

## Configuration Defaults

The tool reads default values from multiple configuration sources with the following priority:

```mermaid
graph TD
    A[Priority Order] --> B[1. Per-Request Parameters]
    B --> C[2. Environment Variables]
    C --> D[3. Local Config .json]
    D --> E[4. Global Config .json]
    E --> F[5. Hardcoded Defaults]
```

### Configuration File Structure

Create a config file using:

```bash
npx @jimothy-snicket/gemini-image-mcp --init
```

This creates `~/.gemini-image-mcp.json` with commented defaults. For project-specific overrides:

```bash
npx @jimothy-snicket/gemini-image-mcp --init --local
```

Source: [README.md:config-file](https://github.com/JimothySnicket/gemini-image-mcp/blob/main/README.md)

### Default Process Settings

```json
{
  "defaults": {
    "process": {
      "removeBackground": { "color": "#00FF00" },
      "trim": true,
      "format": "png"
    }
  }
}
```

Source: [README.md:config-defaults](https://github.com/JimothySnicket/gemini-image-mcp/blob/main/README.md)

## Common Pipelines

### Favicon from Logo

Remove the background, trim whitespace, and resize to favicon dimensions in a single call:

```json
{
  "imagePath": "./logo.png",
  "removeBackground": {"threshold": 230},
  "trim": true,
  "resize": {"width": 192, "height": 192},
  "filename": "favicon-192"
}
```

Source: [skills/image-generation/SKILL.md:favicon](https://github.com/JimothySnicket/gemini-image-mcp/blob/main/skills/image-generation/SKILL.md)

### Social Card from Photo

Crop to 16:9 aspect ratio using attention-based focal point and resize to standard social card width:

```json
{
  "imagePath": "./photo.png",
  "crop": {"aspectRatio": "16:9", "strategy": "attention"},
  "resize": {"width": 1200},
  "filename": "hero-banner"
}
```

Source: [skills/image-generation/SKILL.md:social-card](https://github.com/JimothySnicket/gemini-image-mcp/blob/main/skills/image-generation/SKILL.md)

### WebP Conversion for Web

Convert an existing PNG to WebP format with optimized quality for web delivery:

```json
{
  "imagePath": "./image.png",
  "format": "webp",
  "quality": 85,
  "filename": "optimized"
}
```

Source: [README.md:webp](https://github.com/JimothySnicket/gemini-image-mcp/blob/main/README.md)

### Transparent Asset from Green Screen

Generate an image on a green background, then remove it locally:

**Step 1: Generate on green screen**

```json
{
  "prompt": "A product photo on a bright green background",
  "filename": "product-green"
}
```

**Step 2: Remove green background**

```json
{
  "imagePath": "./product-green.png",
  "removeBackground": {"color": "#00FF00"},
  "trim": true,
  "filename": "product-transparent"
}
```

This two-step approach works best for high-contrast subjects (dark, red, blue, or white on green). Always use `#00FF00` as it handles Gemini's actual green shade more reliably than trying to match it precisely. Source: [README.md:green-screen](https://github.com/JimothySnicket/gemini-image-mcp/blob/main/README.md)
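To illustrate why a single key color with some tolerance can cover slightly different greens, here is a minimal sketch of hue-based chroma keying on raw RGBA pixels. It is an assumption-laden illustration, not the project's implementation: `chromaKey`, `rgbToHsv`, and the `hueTolerance` and `minSaturation` values are all hypothetical.

```typescript
// Hypothetical sketch of hue-based chroma keying; not the project's code.
interface Hsv {
  h: number; // hue in degrees, 0..360
  s: number; // saturation, 0..1
  v: number; // value, 0..1
}

function rgbToHsv(r: number, g: number, b: number): Hsv {
  const max = Math.max(r, g, b);
  const min = Math.min(r, g, b);
  const delta = max - min;
  let h = 0;
  if (delta > 0) {
    if (max === r) h = 60 * (((g - b) / delta) % 6);
    else if (max === g) h = 60 * ((b - r) / delta + 2);
    else h = 60 * ((r - g) / delta + 4);
    if (h < 0) h += 360;
  }
  return { h, s: max === 0 ? 0 : delta / max, v: max / 255 };
}

// Returns a copy of the RGBA buffer with alpha set to 0 for every pixel
// whose hue is close to the key color and whose saturation is high enough.
function chromaKey(
  rgba: Uint8Array,
  keyHex: string,
  hueTolerance = 40, // assumed tolerance in degrees
  minSaturation = 0.4, // assumed cutoff so greys and whites are kept
): Uint8Array {
  const key = parseInt(keyHex.replace("#", ""), 16);
  const keyHue = rgbToHsv((key >> 16) & 255, (key >> 8) & 255, key & 255).h;
  const out = Uint8Array.from(rgba);
  for (let i = 0; i < out.length; i += 4) {
    const { h, s } = rgbToHsv(out[i], out[i + 1], out[i + 2]);
    const diff = Math.abs(h - keyHue);
    const hueDistance = Math.min(diff, 360 - diff);
    if (s >= minSaturation && hueDistance <= hueTolerance) out[i + 3] = 0;
  }
  return out;
}

// Three pixels: pure green, an off-green shade, and red.
const pixels = Uint8Array.from([0, 255, 0, 255, 30, 220, 40, 255, 200, 30, 30, 255]);
const keyed = chromaKey(pixels, "#00FF00");
```

In this sketch both green pixels become transparent while the red pixel keeps its alpha, which is the behavior the `#00FF00` recommendation relies on.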

### Subject on Specific Background (Canvas Approach)

For yellow, green, or glass/reflective subjects where chroma key struggles, use the AI-powered canvas approach:

```json
{
  "prompt": "Place a yellow rubber duck on this background. Product photography, studio lighting, centered.",
  "images": ["./canvas-white.png"],
  "filename": "duck-on-white"
}
```

This technique generates the subject with correct lighting and shadows for the specific background in a single API call. Source: [skills/image-generation/SKILL.md:canvas](https://github.com/JimothySnicket/gemini-image-mcp/blob/main/skills/image-generation/SKILL.md)

## Return Value

The tool returns an MCP response whose single text item contains the processing results serialized as JSON:

```typescript
{
  "content": [
    {
      "type": "text",
      "text": JSON.stringify({
        input: {
          path: string,
          operations: string[]
        },
        output: {
          path: string,
          format: string,
          dimensions: { width: number, height: number },
          size: number
        }
      }, null, 2)
    }
  ]
}
```

Source: [src/index.ts:80-95](https://github.com/JimothySnicket/gemini-image-mcp/blob/main/src/index.ts)

## Error Handling

The tool wraps all operations in try-catch blocks to provide meaningful error messages:

```typescript
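try {
  // ... run the requested processing operations
} catch (err) {
  // Normalize anything that was thrown into a readable message
  const message = err instanceof Error ? err.message : String(err);
  log.error("process_image failed:", message);
  // Returns error to MCP client
}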
```

Common error scenarios include unsupported image formats and file access issues. Source: [src/index.ts:96-100](https://github.com/JimothySnicket/gemini-image-mcp/blob/main/src/index.ts)
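One way the "returns error to MCP client" step might look is sketched below. The `isError` flag is part of the MCP tool result shape; the `ToolResult` interface and the `toErrorResult` helper are hypothetical names introduced for illustration, and the exact message format used by the server is not confirmed by the source.

```typescript
// Hypothetical helper: converts a thrown value into an MCP-style error result.
interface ToolResult {
  content: { type: "text"; text: string }[];
  isError?: boolean;
}

function toErrorResult(toolName: string, err: unknown): ToolResult {
  // Thrown values are not always Error instances, so normalize first.
  const message = err instanceof Error ? err.message : String(err);
  return {
    content: [{ type: "text", text: `${toolName} failed: ${message}` }],
    isError: true,
  };
}

const failure = toErrorResult("process_image", new Error("Unsupported image format"));
```

Returning a result with `isError: true`, rather than letting the exception escape, lets the client show the message to the model so it can adjust the request.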

## Limitations

- Maximum input image size: 50MB (enforced by file stat check)
- Supported input formats depend on sharp library capabilities
- Output format availability depends on sharp library compilation options
- Processing is single-threaded per operation; large images may take longer to process
- No GPU acceleration; all processing uses CPU

Source: [src/process.ts:file-validation](https://github.com/JimothySnicket/gemini-image-mcp/blob/main/src/process.ts) (inferred from README file size documentation)

---

<a id='configuration-guide'></a>

## Configuration Guide

### Related Pages

Related topics: [Installation Guide](#installation), [Cost Tracking and Rate Limiting](#cost-tracking)

<details>
<summary>Related Source Files</summary>

The following source files were used to generate this page:

- [src/config.ts](https://github.com/JimothySnicket/gemini-image-mcp/blob/main/src/config.ts)
- [src/index.ts](https://github.com/JimothySnicket/gemini-image-mcp/blob/main/src/index.ts)
- [src/config.template.ts](https://github.com/JimothySnicket/gemini-image-mcp/blob/main/src/config.template.ts)
- [README.md](https://github.com/JimothySnicket/gemini-image-mcp/blob/main/README.md)
- [CHANGELOG.md](https://github.com/JimothySnicket/gemini-image-mcp/blob/main/CHANGELOG.md)
</details>

# Configuration Guide

The gemini-image-mcp server provides a centralized configuration system that allows users to customize all aspects of image generation and processing. The configuration system replaces scattered environment variables with a unified approach using JSON config files with JSONC support (JSON with comments).

## Overview

The configuration system serves as the single source of truth for all server settings. Instead of reading from `process.env` directly throughout the codebase, all modules now read settings from the centralized config, ensuring consistency and maintainability. Source: [CHANGELOG.md](https://github.com/JimothySnicket/gemini-image-mcp/blob/main/CHANGELOG.md)

### Key Features

| Feature | Description |
|---------|-------------|
| JSONC Support | JSON with comments for inline documentation |
| Hierarchical Priority | Per-request parameters → Env vars → Local config → Global config → Defaults |
| Deep Merge | Nested configuration objects merge correctly |
| Config Caching | Configuration is cached after first load |
| Security Guards | API key rejection, prototype pollution protection |
| Validation | Whitelist of known keys, unknown keys warned |

## Configuration Priority

The system follows a clear hierarchy where more specific configurations override more general ones:

```mermaid
graph TD
    A[Per-Request Parameters] --> C[Environment Variables]
    C --> D[Local Config ./.gemini-image-mcp.json]
    D --> E[Global Config ~/.gemini-image-mcp.json]
    E --> F[Hardcoded Defaults]
    
    style A fill:#90EE90
    style F fill:#FFE4B5
```

Environment variables take precedence over all config files. If a setting exists in both an env var and a config file, the env var wins. Source: [README.md](https://github.com/JimothySnicket/gemini-image-mcp/blob/main/README.md)
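The hierarchy can be read as "take the first layer that defines the setting". The sketch below shows that rule for a single setting; `resolveSetting` is a hypothetical helper and the example values are made up, so treat it as an illustration of the ordering rather than the loader's actual code.

```typescript
// Hypothetical helper: returns the first defined value, scanning layers
// from highest priority to lowest.
function resolveSetting<T>(...layers: (T | undefined)[]): T | undefined {
  return layers.find((value) => value !== undefined);
}

const outputDir = resolveSetting<string>(
  undefined, // per-request parameter (not supplied in this call)
  "/tmp/env-images", // environment variable, e.g. OUTPUT_DIR
  "./images", // local config
  undefined, // global config (key not set)
  "~/gemini-images", // hardcoded default
);
```

Here the environment variable wins because no per-request value was supplied, even though the local config also defines the setting.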

## Configuration File Format

The configuration file uses JSONC (JSON with Comments) format, allowing inline documentation and making it easy to understand each setting.

### Default Configuration Template

```jsonc
{
  // gemini-image-mcp configuration
  // Docs: https://github.com/JimothySnicket/gemini-image-mcp

  // Directory where generated/processed images are saved
  // Supports ~ for home directory
  "outputDir": "~/gemini-images",

  // Default Gemini model for image generation
  // gemini-2.5-flash-image         — fast, ~$0.04/image, 1K only (deprecates Oct 2026)
  // gemini-3.1-flash-image-preview  — fast, ~$0.08/image, up to 4K, search grounding
  // gemini-3-pro-image-preview      — best quality, ~$0.16/image, up to 4K, 14 ref images
  "defaultModel": "gemini-2.5-flash-image",

  // Log level: "debug", "info", or "error"
  "logLevel": "info",

  // Timeout for a single API request (ms)
  "requestTimeout": 60000,

  // Timeout for multi-turn editing sessions (ms)
  "sessionTimeout": 1800000,

  // Rate limiting (0 = unlimited)
  "maxRequestsPerHour": 0,
  "maxCostPerHour": 0,

  // Per-tool default parameters
  "defaults": {
    "generate": {
      // "aspectRatio": "1:1",
      // "resolution": "1K"
    },
    "process": {
      // "removeBackground": { "color": "#00FF00" },
      // "trim": true
    }
  }
}
```

Source: [src/config.template.ts](https://github.com/JimothySnicket/gemini-image-mcp/blob/main/src/config.template.ts)

## Configuration Options

### Top-Level Settings

| Option | Type | Default | Description |
|--------|------|---------|-------------|
| `outputDir` | string | `~/gemini-images` | Directory for saved images. Supports `~` for home directory |
| `defaultModel` | string | `gemini-2.5-flash-image` | Default Gemini model for image generation |
| `logLevel` | string | `info` | Log verbosity: `debug`, `info`, or `error` |
| `requestTimeout` | number | `60000` | Timeout for a single API request in milliseconds |
| `sessionTimeout` | number | `1800000` | Timeout for multi-turn editing sessions in milliseconds |
| `maxRequestsPerHour` | number | `0` | Rate limit: max requests per hour (0 = unlimited) |
| `maxCostPerHour` | number | `0` | Rate limit: max cost per hour in USD (0 = unlimited) |

### Per-Tool Defaults

The `defaults` object allows setting default parameters for each tool:

#### Generate Tool Defaults

| Option | Type | Description |
|--------|------|-------------|
| `defaults.generate.aspectRatio` | string | Default aspect ratio: `1:1`, `16:9`, `9:16`, `3:2`, `2:3`, `4:3`, `3:4`, `21:9` |
| `defaults.generate.resolution` | string | Default resolution: `1K`, `2K`, `4K` |

#### Process Tool Defaults

| Option | Type | Description |
|--------|------|-------------|
| `defaults.process.removeBackground` | object | Default background removal settings |
| `defaults.process.trim` | boolean | Default trim setting |
| `defaults.process.format` | string | Default output format: `png`, `jpeg`, `webp` |
| `defaults.process.quality` | number | Default quality (1-100) |

## Creating a Configuration File

### Automatic Initialization

The easiest way to create a configuration file is using the `--init` flag:

```bash
npx @jimothy-snicket/gemini-image-mcp --init
```

This creates `~/.gemini-image-mcp.json` with all defaults and inline documentation.

### Local Configuration

To create a local configuration file in the current working directory:

```bash
npx @jimothy-snicket/gemini-image-mcp --init --local
```

This creates `.gemini-image-mcp.json` in the CWD, which takes precedence over the global config.

## Deep Merge Behavior

The configuration system uses deep merging for nested objects. This means you can specify only the settings you want to change, and the rest will inherit from the underlying defaults. Source: [CHANGELOG.md](https://github.com/JimothySnicket/gemini-image-mcp/blob/main/CHANGELOG.md)

### Example: Partial Configuration

```jsonc
{
  "logLevel": "debug",
  "defaults": {
    "generate": {
      "aspectRatio": "16:9"
    }
  }
}
```

This configuration overrides only `logLevel` and `defaults.generate.aspectRatio`; all other settings retain their defaults.
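A deep merge of this kind can be sketched in a few lines. The version below is a minimal illustration under stated assumptions, not the project's `deepMerge`: the `Plain` type, `isPlainObject`, and the choice to replace arrays wholesale are hypothetical. It also skips the keys named in the Prototype Pollution Guard section.

```typescript
// Minimal deep-merge sketch; names and array handling are assumptions.
type Plain = Record<string, unknown>;

const BLOCKED_KEYS = new Set(["__proto__", "constructor", "prototype"]);

function isPlainObject(value: unknown): value is Plain {
  return typeof value === "object" && value !== null && !Array.isArray(value);
}

function deepMerge(base: Plain, override: Plain): Plain {
  const result: Plain = { ...base };
  for (const key of Object.keys(override)) {
    if (BLOCKED_KEYS.has(key)) continue; // never copy prototype-related keys
    const prev = result[key];
    const next = override[key];
    // Recurse only when both sides are plain objects; otherwise override wins.
    result[key] = isPlainObject(prev) && isPlainObject(next) ? deepMerge(prev, next) : next;
  }
  return result;
}

const merged = deepMerge(
  { logLevel: "info", defaults: { generate: { aspectRatio: "1:1", resolution: "1K" } } },
  { logLevel: "debug", defaults: { generate: { aspectRatio: "16:9" } } },
);
```

In `merged`, `logLevel` and `aspectRatio` come from the override while `resolution` survives from the base, which is the partial-override behavior described above.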

## Security Features

### API Key Protection

The configuration system explicitly rejects API keys found in config files with a warning:

```
[config] WARNING: "apiKey" found in ~/.gemini-image-mcp.json — API keys must not be in config files. Stripped.
```

This prevents accidental commits of API credentials to repositories. Source: [CHANGELOG.md](https://github.com/JimothySnicket/gemini-image-mcp/blob/main/CHANGELOG.md)

### Prototype Pollution Guard

The deep merge implementation protects against prototype pollution attacks by explicitly blocking dangerous keys:

- `__proto__`
- `constructor`
- `prototype`

If any of these keys are encountered during config merging, they are silently ignored. Source: [CHANGELOG.md](https://github.com/JimothySnicket/gemini-image-mcp/blob/main/CHANGELOG.md)

### Unknown Key Warnings

The system maintains a whitelist of known configuration keys. If an unknown key is found in a config file, a warning is logged and the key is dropped:

```
[config] WARNING: unknown key "someUnknownKey" in ~/.gemini-image-mcp.json — ignored.
```

This prevents unexpected data injection and helps users catch typos in configuration. Source: [CHANGELOG.md](https://github.com/JimothySnicket/gemini-image-mcp/blob/main/CHANGELOG.md)
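The two key-level checks (API key stripping and the known-key whitelist) can be sketched together. This is an illustrative sketch only: `sanitizeConfig` and `KNOWN_KEYS` are hypothetical names, the whitelist is taken from the top-level options documented on this page, the `/api.?key/i` pattern is the one listed under Validation Rules, and the warning strings are paraphrased rather than copied from the server.

```typescript
// Hypothetical sketch of top-level key validation; not the project's code.
const KNOWN_KEYS = new Set([
  "outputDir",
  "defaultModel",
  "logLevel",
  "requestTimeout",
  "sessionTimeout",
  "maxRequestsPerHour",
  "maxCostPerHour",
  "defaults",
]);
const API_KEY_PATTERN = /api.?key/i;

interface SanitizeResult {
  config: Record<string, unknown>;
  warnings: string[];
}

function sanitizeConfig(raw: Record<string, unknown>, source: string): SanitizeResult {
  const config: Record<string, unknown> = {};
  const warnings: string[] = [];
  for (const [key, value] of Object.entries(raw)) {
    if (API_KEY_PATTERN.test(key)) {
      // Check API keys first so they are reported as such, not as "unknown".
      warnings.push(`[config] WARNING: "${key}" found in ${source}; stripped.`);
    } else if (!KNOWN_KEYS.has(key)) {
      warnings.push(`[config] WARNING: unknown key "${key}" in ${source}; ignored.`);
    } else {
      config[key] = value;
    }
  }
  return { config, warnings };
}

const sanitized = sanitizeConfig(
  { logLevel: "debug", apiKey: "secret", someUnknownKey: 1 },
  "~/.gemini-image-mcp.json",
);
```

Only `logLevel` survives in `sanitized.config`; the other two keys each produce one warning.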

## JSONC Parsing

The configuration system supports JSONC (JSON with Comments), which extends standard JSON with:

- Single-line comments: `// comment`
- Multi-line comments: `/* comment */`

### String-Aware Comment Stripping

The JSONC parser is string-aware, meaning it won't mangle URLs or other quoted strings that contain comment-like patterns. For example:

```jsonc
{
  // This is a comment
  "url": "https://example.com/api?query=1&filter=//something",
  "note": "Use // for comments in code"
}
```

Will correctly parse without affecting the URLs or notes. Source: [CHANGELOG.md](https://github.com/JimothySnicket/gemini-image-mcp/blob/main/CHANGELOG.md)

### Trailing Comma Handling

The parser automatically strips trailing commas left by commented-out lines, preventing parse failures. Source: [CHANGELOG.md](https://github.com/JimothySnicket/gemini-image-mcp/blob/main/CHANGELOG.md)
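A string-aware stripper can be written as a single pass that tracks whether the cursor is inside a quoted string. The sketch below is one possible approach, not the project's `stripJsoncComments`: the function name `stripJsonc` and the way trailing commas are dropped (when a closing brace or bracket is emitted) are assumptions.

```typescript
// Hypothetical single-pass JSONC stripper; not the project's implementation.
function stripJsonc(text: string): string {
  let out = "";
  let inString = false;
  let i = 0;
  while (i < text.length) {
    const ch = text[i];
    const next = text[i + 1];
    if (inString) {
      out += ch;
      if (ch === "\\" && i + 1 < text.length) {
        out += next; // copy the escaped character so \" does not end the string
        i += 2;
        continue;
      }
      if (ch === '"') inString = false;
      i += 1;
    } else if (ch === '"') {
      inString = true;
      out += ch;
      i += 1;
    } else if (ch === "/" && next === "/") {
      while (i < text.length && text[i] !== "\n") i += 1; // skip to end of line
    } else if (ch === "/" && next === "*") {
      const end = text.indexOf("*/", i + 2);
      i = end === -1 ? text.length : end + 2; // skip the whole block comment
    } else {
      // A comma followed only by whitespace before } or ] is a trailing comma.
      if (ch === "}" || ch === "]") out = out.replace(/,\s*$/, "");
      out += ch;
      i += 1;
    }
  }
  return out;
}

const sampleJsonc = `{
  // This is a comment
  "url": "https://example.com/api?query=1&filter=//something",
  "note": "Use // for comments in code",
  /* "disabled": true */
}`;
const parsedSample = JSON.parse(stripJsonc(sampleJsonc));
```

Because the trailing-comma check runs only outside strings, a value such as `",}"` is left untouched, which a plain regular expression over the whole text would not guarantee.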

## Configuration Caching

After the configuration is loaded for the first time, it is cached in memory. Subsequent calls to `loadConfig()` return the cached value immediately without re-reading files. Source: [src/config.ts](https://github.com/JimothySnicket/gemini-image-mcp/blob/main/src/config.ts)

### Cache Invalidation

The cache lives in memory for the lifetime of the process, so edits to a config file are not picked up while the server is running. To force a reload (useful during development), restart the server process.
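The caching behavior amounts to simple memoization. The sketch below illustrates the idea with stand-in names: `loadConfigCached`, `readConfigFromDisk`, and `resetConfigCache` are hypothetical, and the source does not confirm that the real module exposes any reset function.

```typescript
// Hypothetical memoization sketch; the real loader reads and merges files.
interface Config {
  logLevel: string;
}

let cachedConfig: Config | null = null;
let diskReads = 0; // counter kept only to make the caching visible

function readConfigFromDisk(): Config {
  diskReads += 1;
  return { logLevel: "info" }; // stand-in for reading and merging config files
}

function loadConfigCached(): Config {
  if (cachedConfig === null) cachedConfig = readConfigFromDisk();
  return cachedConfig;
}

// Assumed test-only helper; restarting the process has the same effect.
function resetConfigCache(): void {
  cachedConfig = null;
}

const first = loadConfigCached();
const second = loadConfigCached(); // served from memory, no second read
```

Both calls return the same object and the stand-in loader runs once, which is why file edits go unnoticed until the cache is cleared.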

## Environment Variables

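The config file is the recommended approach for most settings; environment variables remain supported for backward compatibility. `GEMINI_API_KEY` must always be set this way, because API keys are stripped from config files:

| Environment Variable | Description |
|---------------------|-------------|
| `GEMINI_API_KEY` | Google Gemini API key (required) |
| `OUTPUT_DIR` | Override output directory |
| `MAX_REQUESTS_PER_HOUR` | Rate limit for requests |
| `MAX_COST_PER_HOUR` | Rate limit for cost (USD) |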

Environment variables always take precedence over config file values. Source: [README.md](https://github.com/JimothySnicket/gemini-image-mcp/blob/main/README.md)

## Per-Request Overrides

Configuration defaults can be overridden on a per-request basis. Per-request parameters always override config defaults. Source: [README.md](https://github.com/JimothySnicket/gemini-image-mcp/blob/main/README.md)

### Example: Per-Tool Defaults with Overrides

```json
{
  "defaultModel": "gemini-3.1-flash-image-preview",
  "defaults": {
    "generate": {
      "aspectRatio": "16:9",
      "resolution": "2K"
    },
    "process": {
      "removeBackground": { "color": "#00FF00" },
      "trim": true
    }
  }
}
```

With this configuration:
- All `generate_image` calls use `16:9` aspect ratio and `2K` resolution by default
- All `process_image` calls auto-remove green backgrounds and trim by default
- Any individual request can override these by specifying different values
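The override rule can be sketched as a merge in which only the arguments the caller actually supplied replace the configured defaults. `applyDefaults` is a hypothetical helper, and the shallow (top-level) merge shown here is an assumption; the server may merge nested objects such as `removeBackground` differently.

```typescript
// Hypothetical helper: request arguments override per-tool defaults.
type Params = Record<string, unknown>;

function applyDefaults(defaults: Params, args: Params): Params {
  const merged: Params = { ...defaults };
  for (const key of Object.keys(args)) {
    // An argument left undefined should not erase a configured default.
    if (args[key] !== undefined) merged[key] = args[key];
  }
  return merged;
}

const effective = applyDefaults(
  { aspectRatio: "16:9", resolution: "2K" }, // from defaults.generate
  { prompt: "A lighthouse at dusk", resolution: "4K", aspectRatio: undefined },
);
```

In `effective`, the request's `4K` replaces the configured `2K`, while `16:9` is kept because the request did not set an aspect ratio.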

## Programmatic Usage

The configuration module exports the following for use within the codebase:

```typescript
import { loadConfig, initConfig, CONFIG_TEMPLATE } from './config';
```

| Export | Description |
|----------|-------------|
| `loadConfig()` | Load and return the merged configuration (uses cache) |
| `initConfig(options?)` | Create a new config file interactively |
| `CONFIG_TEMPLATE` | The default configuration template as a string |

Source: [src/config.ts](https://github.com/JimothySnicket/gemini-image-mcp/blob/main/src/config.ts)

## File Locations

The system searches for configuration files in the following order:

1. **Current Working Directory**: `./.gemini-image-mcp.json`
2. **Home Directory**: `~/.gemini-image-mcp.json`

Settings from these files are merged on top of the hardcoded defaults; per the priority order, values in the local file override those in the global file.

## Configuration Schema

The complete configuration schema includes:

```typescript
interface GeminiImageConfig {
  outputDir: string;
  defaultModel: string;
  logLevel: 'debug' | 'info' | 'error';
  requestTimeout: number;
  sessionTimeout: number;
  maxRequestsPerHour: number;
  maxCostPerHour: number;
  defaults: {
    generate?: {
      aspectRatio?: string;
      resolution?: string;
    };
    process?: {
      removeBackground?: object;
      trim?: boolean;
      format?: 'png' | 'jpeg' | 'webp';
      quality?: number;
    };
  };
}
```

## Validation Rules

The configuration loader performs the following validations:

| Rule | Behavior |
|------|----------|
| API key detection | Strips any key matching `/api.?key/i`, logs warning |
| Unknown keys | Drops unknown keys, logs warning |
| Prototype pollution | Silently skips `__proto__`, `constructor`, `prototype` |
| JSONC syntax | Parses comments, strips trailing commas |
| File existence | Returns `null` if file doesn't exist |

## Testing

The configuration module has comprehensive test coverage including:

- `stripJsoncComments`: String-aware comment removal
- `deepMerge`: Nested object merging with pollution protection
- `loadConfig`: Full configuration loading and precedence
- `initConfig`: Interactive config file creation

Source: [CHANGELOG.md](https://github.com/JimothySnicket/gemini-image-mcp/blob/main/CHANGELOG.md)

---

<a id='server-architecture'></a>

## Server Architecture

### Related Pages

Related topics: [Image Generation Internals](#image-generation-internals), [Image Processing Internals](#image-processing-internals)

<details>
<summary>Related Source Files</summary>

The following source files were used to generate this page:

- [src/index.ts](https://github.com/JimothySnicket/gemini-image-mcp/blob/main/src/index.ts)
- [src/generate.ts](https://github.com/JimothySnicket/gemini-image-mcp/blob/main/src/generate.ts)
- [src/process.ts](https://github.com/JimothySnicket/gemini-image-mcp/blob/main/src/process.ts)
- [src/config.ts](https://github.com/JimothySnicket/gemini-image-mcp/blob/main/src/config.ts)
- [src/tracker.ts](https://github.com/JimothySnicket/gemini-image-mcp/blob/main/src/tracker.ts)
- [src/utils.ts](https://github.com/JimothySnicket/gemini-image-mcp/blob/main/src/utils.ts)
- [package.json](https://github.com/JimothySnicket/gemini-image-mcp/blob/main/package.json)
</details>

# Server Architecture

**gemini-image-mcp** is an MCP (Model Context Protocol) server that provides two primary tools: one for image generation and editing through the Google Gemini API, and one for local image processing. The server is built on `@modelcontextprotocol/sdk` and communicates over STDIO transport, making it compatible with MCP clients such as Claude Code.

## Overview

The server architecture follows a modular design pattern with clear separation of concerns:

| Component | File | Responsibility |
|-----------|------|----------------|
| Entry Point | `src/index.ts` | Server initialization, tool registration, request routing |
| Image Generation | `src/generate.ts` | Gemini API integration, model discovery, image generation |
| Image Processing | `src/process.ts` | Local image manipulation using Sharp |
| Configuration | `src/config.ts` | Config file loading, validation, environment variable management |
| Usage Tracking | `src/tracker.ts` | Token usage logging, cost estimation |
| Utilities | `src/utils.ts` | Logging, file operations, path resolution |

### Technology Stack

| Dependency | Version | Purpose |
|------------|---------|---------|
| `@google/genai` | ^1.44.0 | Gemini API client for image generation |
| `@modelcontextprotocol/sdk` | ^1.22.0 | MCP protocol implementation |
| `sharp` | ^0.34.5 | Local image processing |
| `zod` | ^3.24.0 | Schema validation for tool parameters |

Source: [package.json:18-22](https://github.com/JimothySnicket/gemini-image-mcp/blob/main/package.json)

## Architecture Diagram

```mermaid
graph TD
    A[MCP Client] <-->|STDIO| B[src/index.ts<br/>MCP Server]
    B --> C[generate_image tool]
    B --> D[process_image tool]
    C --> E[src/generate.ts<br/>Gemini API]
    C --> F[src/tracker.ts<br/>Usage Logger]
    D --> G[src/process.ts<br/>Sharp Library]
    E --> H[Model Discovery]
    I[Config System] -.->|Priority Resolution| B
    I --> J[src/config.ts]
    J --> K[Environment Variables]
    J --> L[Config Files]
    J --> M[Defaults]
```

## Core Components

### Entry Point (`src/index.ts`)

The server initializes a single MCP server instance that registers two tools:

```typescript
const server = new McpServer(
  {
    name: "gemini-image-mcp",
    version: pkg.version,
  },
  {
    instructions: "Gemini image generation and local image processing...",
  },
);
```

**Key initialization steps:**
1. Load configuration via `loadConfig()`
2. Initialize usage tracker via `initTracker()`
3. Register both `generate_image` and `process_image` tools
4. Establish STDIO transport connection

Source: [src/index.ts:1-20](https://github.com/JimothySnicket/gemini-image-mcp/blob/main/src/index.ts)

### Tool Registration Pattern

Each tool follows a consistent registration pattern using Zod schemas for parameter validation:

```typescript
server.registerTool(
  "tool-name",
  {
    description: "...",
    inputSchema: { /* Zod raw shape, e.g. prompt: z.string() */ },
  },
  async (args) => {
    const config = loadConfig();
    // Merge config defaults with args
    // Execute tool logic
    // Return formatted response
  },
);
```

Source: [src/index.ts:35-80](https://github.com/JimothySnicket/gemini-image-mcp/blob/main/src/index.ts)

## Tool: `generate_image`

The `generate_image` tool handles AI-powered image generation and editing via the Gemini API.

### Parameters

| Parameter | Type | Required | Description |
|-----------|------|----------|-------------|
| `prompt` | string | Yes | Text description or editing instruction |
| `images` | string[] | No | File paths to reference images |
| `model` | string | No | Gemini model ID |
| `aspectRatio` | string | No | Image aspect ratio |
| `resolution` | string | No | Output resolution (1K, 2K, 4K) |
| `outputDir` | string | No | Override output directory |
| `filename` | string | No | Base name for saved file |
| `subfolder` | string | No | Subfolder within output directory |
| `sessionId` | string | No | Continue multi-turn session |
| `seed` | number | No | Integer seed for reproducibility |
| `useSearchGrounding` | boolean | No | Enable Google Search grounding |

Source: [src/index.ts:35-70](https://github.com/JimothySnicket/gemini-image-mcp/blob/main/src/index.ts)

### Model Discovery

The `generate.ts` module implements automatic model discovery to detect available image-capable models:

```typescript
const IMAGE_MODEL_PATTERNS = ["image", "img"];
const EXCLUDED_PREFIXES = ["imagen"];

async function discoverModels(apiKey: string): Promise<string[]> {
  // Paginate through available models
  // Filter by image capability patterns
  // Exclude specific prefixes
  // Cache results
}
```

Source: [src/generate.ts:85-100](https://github.com/JimothySnicket/gemini-image-mcp/blob/main/src/generate.ts)

### Image Input Handling

Local images are converted to inline data for API submission:

```typescript
async function readImageAsInlineData(filepath: string): Promise<{
  inlineData: { data: string; mimeType: string };
}> {
  const mimeType = MIME_TYPES[ext];
  // Validate file exists and is under 50MB
  // Return base64-encoded data with MIME type
}
```

Source: [src/generate.ts:105-130](https://github.com/JimothySnicket/gemini-image-mcp/blob/main/src/generate.ts)
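The steps named in the comments above (extension lookup, size check, base64 encoding) might be filled in as follows. This is a hedged sketch, not the project's function: it is synchronous for brevity and named `readImageAsInlineDataSync` to make that clear, the MIME table beyond `.png` is assumed, and treating the 50MB limit as 50 MiB is also an assumption.

```typescript
import * as fs from "node:fs";
import * as os from "node:os";
import * as path from "node:path";

// Assumed limit: the docs say 50MB; 50 MiB is used here for illustration.
const MAX_IMAGE_BYTES = 50 * 1024 * 1024;

// Assumed extension table; only the inline-data shape comes from the docs.
const MIME_TYPES: Record<string, string> = {
  ".png": "image/png",
  ".jpg": "image/jpeg",
  ".jpeg": "image/jpeg",
  ".webp": "image/webp",
};

interface InlineImage {
  inlineData: { data: string; mimeType: string };
}

function readImageAsInlineDataSync(filepath: string): InlineImage {
  const ext = path.extname(filepath).toLowerCase();
  const mimeType = MIME_TYPES[ext];
  if (!mimeType) throw new Error(`Unsupported image format: ${ext || "(none)"}`);
  const { size } = fs.statSync(filepath); // throws if the file does not exist
  if (size > MAX_IMAGE_BYTES) {
    throw new Error(`Image is ${size} bytes; the limit is ${MAX_IMAGE_BYTES}`);
  }
  return { inlineData: { data: fs.readFileSync(filepath).toString("base64"), mimeType } };
}

// Usage: write a 4-byte stand-in file and encode it.
const sampleDir = fs.mkdtempSync(path.join(os.tmpdir(), "inline-data-"));
const samplePath = path.join(sampleDir, "sample.png");
fs.writeFileSync(samplePath, Buffer.from([0x89, 0x50, 0x4e, 0x47]));
const part = readImageAsInlineDataSync(samplePath);
```

Checking the size with `statSync` before reading avoids loading an oversized file into memory just to reject it.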

## Tool: `process_image`

The `process_image` tool provides local, free image processing via the Sharp library.

### Parameters

| Parameter | Type | Required | Description |
|-----------|------|----------|-------------|
| `imagePath` | string | Yes | Path to image file |
| `crop` | object | No | Crop by pixels, aspect ratio, or focal point |
| `resize` | object | No | Resize to width/height |
| `removeBackground` | object | No | Threshold or chroma key removal |
| `trim` | boolean | No | Auto-remove whitespace borders |
| `format` | string | No | Output format (png, jpeg, webp) |
| `quality` | number | No | Quality 1-100 for JPEG/WebP |
| `outputDir` | string | No | Override output directory |
| `filename` | string | No | Base name for saved file |
| `subfolder` | string | No | Subfolder within output directory |

Source: [src/index.ts:75-120](https://github.com/JimothySnicket/gemini-image-mcp/blob/main/src/index.ts)

### Processing Capabilities

| Operation | Description |
|-----------|-------------|
| **Crop** | Pixel-exact, aspect ratio center crop, focal point (attention/entropy) |
| **Resize** | Width, height, or both with aspect ratio preservation |
| **Background Removal** | Threshold-based (white backgrounds) or chroma key (HSV keying) |
| **Trim** | Auto-remove whitespace/transparent borders |
| **Format Conversion** | PNG, JPEG, WebP with quality control |

## Configuration System (`src/config.ts`)

The configuration system implements a hierarchical priority system for settings:

### Priority Order

```
Per-request parameters > Environment Variables > Local config (.gemini-image-mcp.json) > Global config (~/.gemini-image-mcp.json) > Defaults
```

### Security Features

| Feature | Implementation |
|---------|----------------|
| API key rejection | Keys from config files are rejected with warning |
| JSONC parsing | String-aware comment stripping (preserves URLs) |
| Prototype pollution guard | `__proto__`, `constructor`, `prototype` blocked in deep merge |
| Unknown key warnings | Invalid config keys are warned and dropped |

Source: [CHANGELOG.md](https://github.com/JimothySnicket/gemini-image-mcp/blob/main/CHANGELOG.md)

### Config Structure

```json
{
  "defaultModel": "gemini-3.1-flash-image-preview",
  "defaults": {
    "generate": {
      "aspectRatio": "16:9",
      "resolution": "2K"
    },
    "process": {
      "removeBackground": { "color": "#00FF00" },
      "trim": true
    }
  }
}
```

## Usage Tracking (`src/tracker.ts`)

The tracker module logs all image generation operations to a manifest file (`generations.jsonl`).

### Tracked Data

Each generation logs:
- Prompt text
- Model used
- Parameters (aspect ratio, resolution, etc.)
- Token counts (prompt, output, image, thinking)
- Estimated USD cost
- Session information

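Source: [src/tracker.ts](https://github.com/JimothySnicket/gemini-image-mcp/blob/main/src/tracker.ts) (referenced in [src/index.ts:18](https://github.com/JimothySnicket/gemini-image-mcp/blob/main/src/index.ts))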
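A JSONL manifest stores one JSON object per line, which makes appends cheap and keeps earlier entries readable even if a later write is interrupted. The sketch below shows the pattern with a hypothetical record shape; the field names in `GenerationRecord` and both helper functions are assumptions, not the tracker's actual schema.

```typescript
import * as fs from "node:fs";
import * as os from "node:os";
import * as path from "node:path";

// Hypothetical record shape; the real manifest fields may differ.
interface GenerationRecord {
  timestamp: string;
  prompt: string;
  model: string;
  estimatedCostUsd: number;
  sessionId?: string;
}

// Appends one record as a single line of JSON.
function appendGeneration(manifestPath: string, record: GenerationRecord): void {
  fs.appendFileSync(manifestPath, JSON.stringify(record) + "\n", "utf8");
}

// Reads the manifest back, skipping blank lines.
function readGenerations(manifestPath: string): GenerationRecord[] {
  return fs
    .readFileSync(manifestPath, "utf8")
    .split("\n")
    .filter((line) => line.trim() !== "")
    .map((line) => JSON.parse(line) as GenerationRecord);
}

// Usage: log two generations to a temporary manifest and total the cost.
const manifestDir = fs.mkdtempSync(path.join(os.tmpdir(), "manifest-"));
const manifestPath = path.join(manifestDir, "generations.jsonl");
appendGeneration(manifestPath, {
  timestamp: new Date(0).toISOString(),
  prompt: "A lighthouse at dusk",
  model: "gemini-2.5-flash-image",
  estimatedCostUsd: 0.04,
});
appendGeneration(manifestPath, {
  timestamp: new Date(1000).toISOString(),
  prompt: "Make the sky stormier",
  model: "gemini-2.5-flash-image",
  estimatedCostUsd: 0.04,
  sessionId: "session-1",
});
const totalCost = readGenerations(manifestPath).reduce((sum, r) => sum + r.estimatedCostUsd, 0);
```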

## Logging System (`src/utils.ts`)

The utility module provides structured logging capabilities:

```typescript
import { log, setLogLevel, setLogDir } from "./utils.js";
```

Features:
- Configurable log levels
- Directory-based log output
- Error message formatting

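Source: [src/utils.ts](https://github.com/JimothySnicket/gemini-image-mcp/blob/main/src/utils.ts) (referenced in [src/index.ts:19](https://github.com/JimothySnicket/gemini-image-mcp/blob/main/src/index.ts))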

## Request Flow

```mermaid
sequenceDiagram
    participant Client as MCP Client
    participant Server as MCP Server
    participant Config as Config System
    participant Tool as Tool Handler
    participant API as External API

    Client->>Server: Tool Request
    Server->>Config: Load Config
    Config-->>Server: Merged Config
    Server->>Tool: Request + Config
    Tool->>Config: Get Defaults
    Config-->>Tool: Tool Defaults
    Tool->>API: Process Request
    API-->>Tool: Response
    Tool->>Server: Formatted Result
    Server-->>Client: JSON Response
```

## Environment Variables

| Variable | Required | Description |
|----------|----------|-------------|
| `GEMINI_API_KEY` | Yes | Google Gemini API key from AI Studio |
| `MAX_REQUESTS_PER_HOUR` | No | Rate limit for requests |
| `MAX_COST_PER_HOUR` | No | Rate limit for cost (USD) |
| `OUTPUT_DIR` | No | Default output directory |

## Initialization

A config file can be created in two modes:

```bash
# Global config
npx @jimothy-snicket/gemini-image-mcp --init

# Local config (in current directory)
npx @jimothy-snicket/gemini-image-mcp --init --local
```

The first command creates `~/.gemini-image-mcp.json` (global) and the second creates `.gemini-image-mcp.json` in the current directory (local); each file contains inline documentation of all available options.

---

<a id='image-generation-internals'></a>

## Image Generation Internals

### Related Pages

Related topics: [generate_image Tool Reference](#generate-image-tool), [Server Architecture](#server-architecture), [Cost Tracking and Rate Limiting](#cost-tracking)

<details>
<summary>Related Source Files</summary>

The following source files were used to generate this page:

- [src/generate.ts](https://github.com/JimothySnicket/gemini-image-mcp/blob/main/src/generate.ts)
- [src/index.ts](https://github.com/JimothySnicket/gemini-image-mcp/blob/main/src/index.ts)
- [src/config.ts](https://github.com/JimothySnicket/gemini-image-mcp/blob/main/src/config.ts)
- [package.json](https://github.com/JimothySnicket/gemini-image-mcp/blob/main/package.json)
- [README.md](https://github.com/JimothySnicket/gemini-image-mcp/blob/main/README.md)
- [CHANGELOG.md](https://github.com/JimothySnicket/gemini-image-mcp/blob/main/CHANGELOG.md)
</details>

# Image Generation Internals

This document provides a comprehensive technical overview of the image generation subsystem within `gemini-image-mcp`. It covers the architecture, API integration patterns, session management, model discovery, and configuration system.

## Overview

The image generation system is built on Google Gemini's native image generation API (`generateContent`), not the deprecated Imagen API. The system provides both text-to-image generation and image editing capabilities through a Model Context Protocol (MCP) server interface. Source: [README.md](https://github.com/JimothySnicket/gemini-image-mcp/blob/main/README.md)

### Core Dependencies

| Package | Version | Purpose |
|---------|---------|---------|
| `@google/genai` | ^1.44.0 | Gemini API client |
| `@modelcontextprotocol/sdk` | ^1.22.0 | MCP server implementation |
| `zod` | ^3.24.0 | Schema validation for tool parameters |

Source: [package.json:18-21](https://github.com/JimothySnicket/gemini-image-mcp/blob/main/package.json)

## Architecture

### System Components

```mermaid
graph TD
    A[MCP Client] -->|Tool Request| B[McpServer]
    B --> C[generateImage Function]
    C --> D[Model Discovery]
    C --> E[Session Manager]
    C --> F[API Client]
    D --> G[Gemini API<br/>List Models]
    E --> H[Session Store<br/>Map&lt;sessionId, ConversationSession&gt;]
    F --> I[Gemini generateContent API]
    I --> J[Image Response]
    J --> K[File System<br/>Output Directory]
```

### Flow Diagram

```mermaid
sequenceDiagram
    participant Client
    participant Server as MCP Server
    participant Session as Session Manager
    participant API as Gemini API
    participant FS as File System

    Client->>Server: generate_image(prompt, sessionId?)
    Server->>Session: Check/Create Session
    Session-->>Server: ConversationSession
    alt New Session
        Server->>API: List Models
        API-->>Server: Available Models
        Server->>Session: Create New Session
    else Existing Session
        Server->>Session: Get Session History
    end
    Server->>API: generateContent(prompt, history)
    API-->>Server: Generated Image
    Server->>FS: Save Image
    Server-->>Client: Result + Usage Stats
```

## Model Discovery System

### Auto Model Detection

The system automatically discovers available image-capable models by querying the Gemini API at startup. This eliminates the need for hardcoded model lists and ensures compatibility as new models are released. Source: [src/generate.ts:95-116](https://github.com/JimothySnicket/gemini-image-mcp/blob/main/src/generate.ts)

```typescript
// Known image-capable model name fragments (Gemini native only)
const IMAGE_MODEL_PATTERNS = ["image", "img"];
// Imagen uses a different API (generateImages) and is deprecated June 2026
const EXCLUDED_PREFIXES = ["imagen"];
```

### Model Filtering Logic

| Filter Type | Criteria | Purpose |
|-------------|----------|---------|
| Include | Name contains "image" or "img" | Match Gemini image models |
| Exclude | Name starts with "imagen" | Avoid deprecated Imagen API |

Source: [src/generate.ts:95-97](https://github.com/JimothySnicket/gemini-image-mcp/blob/main/src/generate.ts)
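Applied to a list of model names, the two rules in the table could look like the sketch below. The constants mirror the values shown above; `isImageModel`, the lowercase normalization, and the stripping of a leading `models/` prefix are assumptions made for illustration.

```typescript
// Constants mirror the documented values; the function itself is hypothetical.
const IMAGE_MODEL_PATTERNS = ["image", "img"];
const EXCLUDED_PREFIXES = ["imagen"];

function isImageModel(name: string): boolean {
  // Assumption: API names may carry a "models/" prefix that should be ignored.
  const id = name.replace(/^models\//, "").toLowerCase();
  // Exclusion runs first because "imagen" also contains the "image" fragment.
  if (EXCLUDED_PREFIXES.some((prefix) => id.startsWith(prefix))) return false;
  return IMAGE_MODEL_PATTERNS.some((pattern) => id.includes(pattern));
}

const discovered = [
  "models/gemini-2.5-flash-image",
  "models/gemini-3-pro-image-preview",
  "models/imagen-4.0-generate-001", // example Imagen-style name
  "models/gemini-2.5-flash",
].filter(isImageModel);
```

Only the two Gemini image models remain in `discovered`; the Imagen-style name is rejected by the prefix rule and the plain text model matches no pattern.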

### Caching Mechanism

Available models are cached after the first discovery call to reduce API overhead:

```typescript
let cachedAvailableModels: string[] | null = null;

export function getAvailableModels(): string[] | null {
  return cachedAvailableModels;
}
```

Source: [src/generate.ts:63](https://github.com/JimothySnicket/gemini-image-mcp/blob/main/src/generate.ts)

## Supported Models

| Model | Resolution | Cost | Grounding | Notes |
|-------|------------|------|-----------|-------|
| `gemini-2.5-flash-image` | 1K only | ~$0.04/image | No | Default, deprecates Oct 2026 |
| `gemini-3-pro-image-preview` | 1K, 2K, 4K | ~$0.15/image | No | Best quality, up to 14 reference images |
| `gemini-3.1-flash-image-preview` | 512, 1K, 2K, 4K | ~$0.08/image | Yes | Search grounding support |

Source: [README.md:45-49](https://github.com/JimothySnicket/gemini-image-mcp/blob/main/README.md)

## Google Search Grounding

### Supported Models

Only `gemini-3.1-flash-image-preview` supports Google Search grounding. The system validates this at runtime and throws a descriptive error if unsupported. Source: [src/generate.ts:99-108](https://github.com/JimothySnicket/gemini-image-mcp/blob/main/src/generate.ts)

```typescript
export const GROUNDING_SUPPORTED_MODELS = ["gemini-3.1-flash-image-preview"];

export function validateGrounding(model: string, useSearchGrounding: boolean | undefined): void {
  if (useSearchGrounding && !GROUNDING_SUPPORTED_MODELS.includes(model)) {
    throw new Error(
      `useSearchGrounding is only supported on ${GROUNDING_SUPPORTED_MODELS.join(", ")}. ` +
        `You requested ${model}.`,
    );
  }
}
```

Source: [src/generate.ts:99-108](https://github.com/JimothySnicket/gemini-image-mcp/blob/main/src/generate.ts)

## Multi-Turn Session Management

### Session Data Model

```typescript
interface ConversationSession {
  history: Content[];      // Previous conversation turns
  model: string;           // Model used in this session
  lastAccessed: number;    // Timestamp for TTL cleanup
}
```

Source: [src/generate.ts:67-71](https://github.com/JimothySnicket/gemini-image-mcp/blob/main/src/generate.ts)

### Session Store

```typescript
const sessions = new Map<string, ConversationSession>();
const MAX_SESSION_TURNS = 10;
```

### Session Lifecycle

```mermaid
graph LR
    A[Create Session] --> B[Store with TTL]
    B --> C[Each Request]
    C -->|Within TTL| D[Extend TTL]
    C -->|Exceeds TTL| E[Cleanup on Access]
    E --> F[Return Error]
    D --> G[Append to History]
    G --> H[Return Response]
    H --> I[Max 10 Turns]
    I -->|Exceeded| J[Prune Oldest]
```

### Session Configuration

| Parameter | Default | Description |
|-----------|---------|-------------|
| `sessionTimeout` | 1800000ms (30 min) | Inactivity timeout before session expiry |
| `MAX_SESSION_TURNS` | 10 | Maximum conversation turns per session |

Source: [src/generate.ts:69](https://github.com/JimothySnicket/gemini-image-mcp/blob/main/src/generate.ts) and [src/config.ts](https://github.com/JimothySnicket/gemini-image-mcp/blob/main/src/config.ts)
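
The "Prune Oldest" step in the lifecycle diagram could look roughly like the sketch below. The `Turn` shape and the `appendTurn` helper are hypothetical; the real session stores Gemini `Content[]` entries, and how it counts a turn is not shown in this page.

```typescript
const MAX_SESSION_TURNS = 10;

// Hypothetical turn shape: one user request plus the model's reply.
interface Turn {
  request: string;
  reply: string;
}

// Returns a new history with the turn appended, keeping only the newest
// `maxTurns` entries so the oldest ones are pruned first.
function appendTurn(history: Turn[], turn: Turn, maxTurns: number = MAX_SESSION_TURNS): Turn[] {
  const next = [...history, turn];
  return next.length > maxTurns ? next.slice(next.length - maxTurns) : next;
}

let history: Turn[] = [];
for (let i = 1; i <= 12; i++) {
  history = appendTurn(history, { request: `edit ${i}`, reply: `image ${i}` });
}
```

After twelve appends the history holds ten entries and starts at the third request.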

### Session Cleanup

Sessions are automatically cleaned up based on the configured timeout:

```typescript
function getSessionTimeout(): number {
  return loadConfig().sessionTimeout;
}

function cleanupSessions(): void {
  const timeout = getSessionTimeout();
  const now = Date.now();
  for (const [id, session] of sessions) {
    if (now - session.lastAccessed > timeout) {
      sessions.delete(id);
    }
  }
}
```

Source: [src/generate.ts:73-84](https://github.com/JimothySnicket/gemini-image-mcp/blob/main/src/generate.ts)
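
The per-request half of the lifecycle (expire on access, otherwise extend the TTL) might be sketched like this. `SessionEntry`, `sessionStore`, and `touchSession` are illustrative names, and passing `now` as a parameter is an assumption made here so the behavior is easy to test.

```typescript
// Hypothetical, trimmed-down session record for illustration.
interface SessionEntry {
  model: string;
  lastAccessed: number;
}

const sessionStore = new Map<string, SessionEntry>();

// Returns the live session and refreshes its timestamp, or removes it and
// returns undefined when it has been idle longer than `timeoutMs`.
function touchSession(id: string, timeoutMs: number, now: number = Date.now()): SessionEntry | undefined {
  const entry = sessionStore.get(id);
  if (!entry) return undefined;
  if (now - entry.lastAccessed > timeoutMs) {
    sessionStore.delete(id); // expired: clean up on access
    return undefined;
  }
  entry.lastAccessed = now; // still live: extend the TTL
  return entry;
}

const THIRTY_MINUTES = 1_800_000;
sessionStore.set("s1", { model: "gemini-2.5-flash-image", lastAccessed: 0 });
const stillLive = touchSession("s1", THIRTY_MINUTES, 1_000_000);
const expired = touchSession("s1", THIRTY_MINUTES, 1_000_000 + THIRTY_MINUTES + 1);
```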

## Image Input Processing

### Supported Formats

The system supports multiple image formats through MIME type mapping:

| Extension | MIME Type |
|-----------|-----------|
| `.png` | `image/png` |
| `.jpg` / `.jpeg` | `image/jpeg` |
| `.webp` | `image/webp` |
| `.gif` | `image/gif` |
| `.avif` | `image/avif` |

### File Validation

| Check | Limit | Error Message |
|-------|-------|---------------|
| File size | 50MB max | "Image file is {size}MB, max is 50MB" |
| Format support | Defined MIME map | "Unsupported image format" |

Source: [src/generate.ts:119-135](https://github.com/JimothySnicket/gemini-image-mcp/blob/main/src/generate.ts)
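
A minimal sketch of the two checks, assuming the caller already knows the file size in bytes. The names `MIME_TYPES`, `MAX_IMAGE_BYTES`, and `validateImageInput` are hypothetical, and treating 50MB as 50 × 1024 × 1024 bytes is an assumption rather than something confirmed by the source.

```typescript
// Extension-to-MIME map mirroring the Supported Formats table.
const MIME_TYPES: Record<string, string> = {
  ".png": "image/png",
  ".jpg": "image/jpeg",
  ".jpeg": "image/jpeg",
  ".webp": "image/webp",
  ".gif": "image/gif",
  ".avif": "image/avif",
};

// Assumption: the 50MB limit is measured in binary megabytes.
const MAX_IMAGE_BYTES = 50 * 1024 * 1024;

// Returns the MIME type for a supported file, or throws with a message
// modeled on the error strings in the File Validation table.
function validateImageInput(filePath: string, sizeBytes: number): string {
  const dot = filePath.lastIndexOf(".");
  const ext = dot >= 0 ? filePath.slice(dot).toLowerCase() : "";
  const mime = MIME_TYPES[ext];
  if (!mime) {
    throw new Error(`Unsupported image format: ${ext || "(no extension)"}`);
  }
  if (sizeBytes > MAX_IMAGE_BYTES) {
    const sizeMb = (sizeBytes / (1024 * 1024)).toFixed(1);
    throw new Error(`Image file is ${sizeMb}MB, max is 50MB`);
  }
  return mime;
}
```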

## Configuration System

### Configuration Priority

```
per-request params > env vars > local config > global config > defaults
```

Source: [README.md](https://github.com/JimothySnicket/gemini-image-mcp/blob/main/README.md)
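
One way to picture this ordering is a layered merge in which later layers win. The sketch below is not the project's `loadConfig()`; `resolveConfig` and the sample values are hypothetical and only show how the precedence chain plays out.

```typescript
// Merges override layers onto defaults. Layers are passed lowest priority
// first, so a later layer replaces values set by an earlier one.
// Keys whose value is undefined are skipped and never mask a lower layer.
function resolveConfig<T extends Record<string, unknown>>(
  defaults: T,
  ...layersLowToHigh: Partial<T>[]
): T {
  const result: Record<string, unknown> = { ...defaults };
  for (const layer of layersLowToHigh) {
    for (const [key, value] of Object.entries(layer)) {
      if (value !== undefined) result[key] = value;
    }
  }
  return result as T;
}

const resolved = resolveConfig(
  { defaultModel: "gemini-2.5-flash-image", logLevel: "info" }, // defaults
  { logLevel: "warn" },                                         // global config
  { logLevel: "debug" },                                        // local config
  { defaultModel: "gemini-3.1-flash-image-preview" },           // env vars
  {},                                                           // per-request params
);
```

Here the local config overrides the global `logLevel`, and the environment overrides the default model because no per-request value was given.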

### Configuration Template

```json
{
  "outputDir": "~/gemini-images",
  "defaultModel": "gemini-2.5-flash-image",
  "logLevel": "info",
  "requestTimeout": 60000,
  "sessionTimeout": 1800000,
  "maxRequestsPerHour": 0,
  "maxCostPerHour": 0,
  "defaults": {
    "generate": {
      "aspectRatio": "16:9",
      "resolution": "1K"
    }
  }
}
```

Source: [src/config.ts](https://github.com/JimothySnicket/gemini-image-mcp/blob/main/src/config.ts)

### Per-Tool Defaults

Users can configure default parameters for each tool to avoid repetition:

```json
{
  "defaultModel": "gemini-3.1-flash-image-preview",
  "defaults": {
    "generate": {
      "aspectRatio": "16:9",
      "resolution": "2K"
    },
    "process": {
      "removeBackground": { "color": "#00FF00" },
      "trim": true
    }
  }
}
```

Source: [README.md:95-109](https://github.com/JimothySnicket/gemini-image-mcp/blob/main/README.md)

## Rate Limiting

### Configuration Parameters

| Variable | Purpose |
|----------|---------|
| `MAX_REQUESTS_PER_HOUR` | Maximum API requests per hour |
| `MAX_COST_PER_HOUR` | Maximum USD cost per hour |

Source: [README.md:79](https://github.com/JimothySnicket/gemini-image-mcp/blob/main/README.md)

### Rate Limit Behavior

The server tracks both the request count and the estimated cost over the past hour. When either limit is reached, it rejects the request with an error that reports usage against the configured limit (for example, `20/20 generations used this hour`). Source: [CHANGELOG.md](https://github.com/JimothySnicket/gemini-image-mcp/blob/main/CHANGELOG.md)

## Response Structure

Each generation response includes:

| Field | Type | Description |
|-------|------|-------------|
| `sessionId` | string | Unique ID for multi-turn sessions |
| `imagePath` | string | Path to saved image |
| `generation` | object | Generation parameters used |
| `usage` | object | Token counts and estimated cost |
| `session` | object | Running totals (generations, cost, hourly count) |

Source: [src/generate.ts](https://github.com/JimothySnicket/gemini-image-mcp/blob/main/src/generate.ts)

## MCP Tool Registration

The `generate_image` tool is registered with the MCP SDK using Zod schemas for parameter validation:

```typescript
server.registerTool(
  "generate_image",
  {
    // registerTool takes a config object; the Zod shape goes under inputSchema
    inputSchema: {
      prompt: z.string(),
      images: z.array(z.string()).optional(),
      model: z.string().optional(),
      aspectRatio: z.string().optional(),
      // ... additional parameters
    },
  },
  async (args) => {
    const config = loadConfig();
    const result = await generateImage({ ...args, config });
    // The result object is returned to the client as a JSON text content block
    return { content: [{ type: "text", text: JSON.stringify(result) }] };
  }
);
```

Source: [src/index.ts](https://github.com/JimothySnicket/gemini-image-mcp/blob/main/src/index.ts)

## Error Handling

### Model Mismatch Detection

Sessions verify that the requested model matches the original session model to prevent inconsistent generation behavior:

```typescript
// Model mismatch detection: error if session uses a different model than the original
```

Source: [CHANGELOG.md](https://github.com/JimothySnicket/gemini-image-mcp/blob/main/CHANGELOG.md)
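
A minimal sketch of such a check, assuming a request may omit `model` to reuse the session's model. The function name and the error wording are hypothetical and are not taken from the source.

```typescript
// Throws when a follow-up request names a different model than the one the
// session was created with. Omitting the model reuses the session's model.
function assertSessionModel(sessionModel: string, requestedModel: string | undefined): void {
  if (requestedModel !== undefined && requestedModel !== sessionModel) {
    throw new Error(
      `Session was started with ${sessionModel} but this request asked for ${requestedModel}. ` +
        `Start a new session to switch models.`,
    );
  }
}
```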

### Seed-Based Reproducibility

Integer seeds enable reproducible generation results:

```typescript
// seed param: integer for reproducible generation
```

Source: [CHANGELOG.md](https://github.com/JimothySnicket/gemini-image-mcp/blob/main/CHANGELOG.md)

---

<a id='image-processing-internals'></a>

## Image Processing Internals

### Related Pages

Related topics: [process_image Tool Reference](#process-image-tool), [Server Architecture](#server-architecture)

<details>
<summary>Related Source Files</summary>

The following source files were used to generate this page:

- [src/process.ts](https://github.com/JimothySnicket/gemini-image-mcp/blob/main/src/process.ts) - Core image processing implementation
- [src/generate.ts](https://github.com/JimothySnicket/gemini-image-mcp/blob/main/src/generate.ts) - Image generation with model discovery
- [src/config.ts](https://github.com/JimothySnicket/gemini-image-mcp/blob/main/src/config.ts) - Configuration management
- [src/index.ts](https://github.com/JimothySnicket/gemini-image-mcp/blob/main/src/index.ts) - MCP server and tool registration
- [src/tracker.ts](https://github.com/JimothySnicket/gemini-image-mcp/blob/main/src/tracker.ts) - Usage tracking and cost reporting
- [src/utils.js](https://github.com/JimothySnicket/gemini-image-mcp/blob/main/src/utils.js) - Logging and utility functions
- [package.json](https://github.com/JimothySnicket/gemini-image-mcp/blob/main/package.json) - Dependencies and project metadata
</details>

# Image Processing Internals

## Overview

The `process_image` tool provides free, local image processing powered by the `sharp` library. Unlike `generate_image`, which calls Google's Gemini API, `process_image` runs entirely on the local machine, which suits batch operations, asset preparation, and transformations that should not incur API cost. Source: [package.json:17](https://github.com/JimothySnicket/gemini-image-mcp/blob/main/package.json)

The module supports chaining multiple operations in a single tool call, including cropping, resizing, background removal (threshold and chroma key), border trimming, and format conversion. This design allows complex pipelines like favicon generation or transparent asset extraction without multiple API round-trips.

## Architecture

### Component Diagram

```mermaid
graph TD
    A["process_image Tool"] --> B["Input Validation"]
    B --> C["sharp Pipeline"]
    C --> D["Operation Chain"]
    
    D --> E1["Crop Operations"]
    D --> E2["Resize Operations"]
    D --> E3["Background Removal"]
    D --> E4["Trim Operations"]
    D --> E5["Format Conversion"]
    
    E3 --> F1["Threshold Detection"]
    E3 --> F2["HSV Chroma Key"]
    
    F2 --> G1["Smoothstep Feather"]
    F2 --> G2["Spill Suppression"]
    F2 --> G3["Edge Anti-aliasing"]
    
    E1 --> H["Output Writer"]
    E2 --> H
    E4 --> H
    E5 --> H
    
    H --> I["generations.jsonl"]
    H --> J["File System"]
```

### Technology Stack

| Component | Technology | Version | Purpose |
|-----------|------------|---------|---------|
| Image Processing | sharp | ^0.34.5 | High-performance image manipulation |
| Validation | zod | ^3.24.0 | Runtime type checking for parameters |
| MCP SDK | @modelcontextprotocol/sdk | ^1.22.0 | Tool registration and communication |

Source: [package.json:13-15](https://github.com/JimothySnicket/gemini-image-mcp/blob/main/package.json)

## Input Validation

The tool validates all parameters before processing begins. The Zod schema enforces strict type constraints and ranges.

### Parameter Schema

| Parameter | Type | Required | Default | Description |
|-----------|------|----------|---------|-------------|
| `imagePath` | string | Yes | — | Path to source image file |
| `crop` | CropConfig | No | undefined | Crop configuration object |
| `resize` | ResizeConfig | No | undefined | Resize configuration object |
| `removeBackground` | BackgroundConfig | No | config default | Background removal settings |
| `trim` | boolean | No | config default | Auto-remove whitespace borders |
| `format` | "png" \| "jpeg" \| "webp" | No | original | Output format |
| `quality` | number (1-100) | No | 90 | JPEG/WebP quality |
| `outputDir` | string | No | ~/gemini-images | Output directory |
| `filename` | string | No | auto-generated | Base filename |
| `subfolder` | string | No | none | Subdirectory path |

Source: [src/index.ts:67-95](https://github.com/JimothySnicket/gemini-image-mcp/blob/main/src/index.ts)

### Crop Configuration

```typescript
// Pixel-exact dimensions
{ width: 500, height: 300, left: 100, top: 50 }

// Aspect ratio (center crop)
{ aspectRatio: "16:9" }

// Focal point strategies
{ aspectRatio: "16:9", strategy: "attention" }
{ aspectRatio: "16:9", strategy: "entropy" }
```

### Resize Configuration

```typescript
// Width only (aspect ratio preserved)
{ width: 1200 }

// Height only (aspect ratio preserved)
{ height: 800 }

// Both dimensions (may affect aspect ratio)
{ width: 192, height: 192 }
```

### Background Removal Configuration

```typescript
// Threshold-based (white backgrounds)
{ threshold: 230 }

// Chroma key (green screen / any solid color)
{ color: "#00FF00" }

// Custom color with threshold tolerance
{ color: "#00FF00", threshold: 30 }
```

## Processing Pipeline

### Operation Flow

```mermaid
graph LR
    A[Input Image] --> B[Load with sharp]
    B --> C{Crop Specified?}
    C -->|Yes| D[Apply Crop]
    C -->|No| E[Resize Specified?]
    D --> E
    E -->|Yes| F[Apply Resize]
    E -->|No| G[Background Removal?]
    F --> G
    G -->|Yes| H[Apply Background Removal]
    G -->|No| I[Trim Specified?]
    H --> I
    I -->|Yes| J[Apply Trim]
    I -->|No| K[Format Conversion?]
    J --> K
    K -->|Yes| L[Apply Format & Quality]
    K -->|No| M[Write to Output]
    L --> M
    M --> N[Log to generations.jsonl]
```

## Crop Operations

### Pixel-Exact Cropping

Accepts explicit `left`, `top`, `width`, and `height` parameters in pixels. The crop is applied using sharp's region extraction, which reads only the specified portion of the source image.

```typescript
// Extract the requested region; all four values are integer pixel counts
const cropped = await sharp(input)
  .extract({
    left: crop.left,
    top: crop.top,
    width: crop.width,
    height: crop.height,
  })
  .toBuffer();
```

### Aspect Ratio Cropping

When `aspectRatio` is specified without explicit dimensions, the system calculates the largest crop region matching the target ratio. The `strategy` parameter determines which region to select:

| Strategy | Behavior |
|----------|----------|
| `center` (default) | Crops from the geometric center of the image |
| `attention` | Shifts crop toward the most visually interesting region based on saliency detection |
| `entropy` | Shifts crop toward the region with highest information density (detail) |
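
For the default `center` strategy, the largest matching region can be worked out with plain arithmetic. The sketch below is an assumption about how such a calculation could look, with hypothetical names `Region` and `centerCropRegion`; the saliency and entropy strategies are not modeled.

```typescript
interface Region {
  left: number;
  top: number;
  width: number;
  height: number;
}

// Computes the largest centered region of an image that matches a
// "W:H" aspect ratio string such as "16:9".
function centerCropRegion(imageWidth: number, imageHeight: number, aspectRatio: string): Region {
  const [ratioW, ratioH] = aspectRatio.split(":").map(Number);
  if (!(ratioW > 0) || !(ratioH > 0)) {
    throw new Error(`Invalid aspect ratio: ${aspectRatio}`);
  }
  const target = ratioW / ratioH;
  let width = imageWidth;
  let height = imageHeight;
  if (imageWidth / imageHeight > target) {
    // Image is too wide: keep full height and narrow the width.
    width = Math.round(imageHeight * target);
  } else {
    // Image is too tall (or already matches): keep full width.
    height = Math.round(imageWidth / target);
  }
  return {
    left: Math.floor((imageWidth - width) / 2),
    top: Math.floor((imageHeight - height) / 2),
    width,
    height,
  };
}

const widescreen = centerCropRegion(1920, 1200, "16:9");
const square = centerCropRegion(2000, 1000, "1:1");
```

The resulting region has the same shape as the pixel-exact crop configuration, so it could be passed to an `extract`-style call.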

## Resize Operations

### Dimension Handling

The resize operation follows sharp's resize semantics:

- **Width only**: Height is calculated to maintain aspect ratio
- **Height only**: Width is calculated to maintain aspect ratio
- **Both specified**: Resizes to exact dimensions (may alter aspect ratio)
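
The three cases can be expressed as a small dimension calculation. This is an illustrative sketch only: `resolveResize` is a hypothetical helper, and rounding to the nearest pixel is an assumption rather than a statement about sharp's exact rounding.

```typescript
interface Size {
  width: number;
  height: number;
}

// Works out the output size for a resize request given the source size.
function resolveResize(source: Size, target: Partial<Size>): Size {
  const { width, height } = target;
  if (width !== undefined && height !== undefined) {
    return { width, height }; // exact dimensions, ratio may change
  }
  if (width !== undefined) {
    // Width only: scale the height by the same factor.
    return { width, height: Math.round(source.height * (width / source.width)) };
  }
  if (height !== undefined) {
    // Height only: scale the width by the same factor.
    return { width: Math.round(source.width * (height / source.height)), height };
  }
  return { ...source }; // nothing requested: unchanged
}

const byWidth = resolveResize({ width: 2400, height: 1600 }, { width: 1200 });
const byHeight = resolveResize({ width: 2400, height: 1600 }, { height: 800 });
const exact = resolveResize({ width: 2400, height: 1600 }, { width: 192, height: 192 });
```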

### Resolution Presets

Resize takes explicit pixel values. The resolution presets (1K, 2K, 4K) belong to the `generate_image` tool, where they map to the following standard dimensions:

| Preset | Dimensions |
|--------|------------|
| 1K | 1024 × 1024 (or proportional) |
| 2K | 2048 × 2048 (or proportional) |
| 4K | 4096 × 4096 (or proportional) |

## Background Removal

### Threshold-Based Detection

For images with white or light backgrounds, threshold-based detection identifies pixels above a brightness value and makes them transparent.

**Algorithm:**
1. Convert image to grayscale
2. Identify pixels exceeding the threshold (default: 230 on 0-255 scale)
3. Set identified pixels to transparent
4. Apply a slight blur to smooth edges

**Best for:** Product photos on plain white backgrounds, scanned documents, screenshots
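
Steps 1 to 3 of the algorithm can be sketched on a raw RGBA pixel buffer. The function name is hypothetical, the Rec. 601 luma weights used for the grayscale step are an assumption, and the edge blur from step 4 is left out.

```typescript
// Makes every pixel brighter than `threshold` fully transparent.
// `rgba` holds 4 bytes per pixel: red, green, blue, alpha.
function removeLightBackground(rgba: Uint8Array, threshold: number = 230): Uint8Array {
  const out = new Uint8Array(rgba); // copy so the input stays untouched
  for (let i = 0; i < out.length; i += 4) {
    // Grayscale brightness via Rec. 601 luma weights (an assumption here).
    const brightness = 0.299 * out[i] + 0.587 * out[i + 1] + 0.114 * out[i + 2];
    if (brightness > threshold) {
      out[i + 3] = 0; // alpha channel: fully transparent
    }
  }
  return out;
}

// Two pixels: pure white background and a dark red subject pixel.
const pixels = new Uint8Array([255, 255, 255, 255, 120, 10, 10, 255]);
const keyed = removeLightBackground(pixels);
```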

### Chroma Key Pipeline

For green screen or solid color backgrounds, the chroma key pipeline performs sophisticated color extraction:

```mermaid
graph TD
    A[Input Image] --> B[Convert to HSV]
    B --> C[Color Range Detection]
    C --> D[Create Mask]
    D --> E[Smoothstep Feather]
    E --> F[Spill Suppression]
    F --> G[Edge Anti-aliasing]
    G --> H[Composite with Transparency]
```

**Stage Details:**

| Stage | Description |
|-------|-------------|
| HSV Keying | Converts to Hue-Saturation-Value color space for better color discrimination |
| Smoothstep Feather | Applies smooth edge transition using smoothstep function (not linear) |
| Spill Suppression | Removes color contamination from edges of subject |
| Edge Anti-aliasing | 5-pass 3×3 kernel anti-aliasing for smooth edges |
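
The smoothstep function named in the feather stage is the standard cubic `t² (3 − 2t)` applied to a clamped input. The sketch below shows it alongside a hypothetical `featheredAlpha` helper that maps a pixel's distance from the key color to an alpha value; the `inner` and `outer` parameters are illustrative and not taken from `src/process.ts`.

```typescript
// Standard smoothstep: 0 below edge0, 1 above edge1, smooth cubic between.
function smoothstep(edge0: number, edge1: number, x: number): number {
  const t = Math.min(1, Math.max(0, (x - edge0) / (edge1 - edge0)));
  return t * t * (3 - 2 * t);
}

// Hypothetical mapping from color distance to alpha (0 = transparent).
// Pixels closer to the key color than `inner` are removed entirely;
// pixels farther than `outer` are kept fully opaque.
function featheredAlpha(distance: number, inner: number, outer: number): number {
  return Math.round(255 * smoothstep(inner, outer, distance));
}
```

Compared with a linear ramp, the curve has zero slope at both ends, which avoids a visible crease where the feathered band meets fully opaque or fully transparent pixels.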

**Recommended Settings:**

| Subject Type | Color | Notes |
|-------------|-------|-------|
| High contrast (red, blue, black, white) | #00FF00 | Best results |
| Yellow subjects | canvas approach | Use `generate_image` instead |
| Green subjects | canvas approach | Use `generate_image` instead |
| Glass/reflective | canvas approach | Use `generate_image` instead |

## Trim Operations

The trim operation automatically removes whitespace and transparent borders from images.

**Algorithm:**
1. Scan the image row-by-row and column-by-column
2. Identify the bounding box of non-white, non-transparent content
3. Extract the content region
4. Apply minimal padding (optional)

This operation is particularly useful after background removal to eliminate any leftover border artifacts.
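
The bounding-box scan in steps 1 to 3 could be sketched as below, again on a raw RGBA buffer. `contentBounds` and its `whiteThreshold` parameter are hypothetical, and the optional padding step is not included.

```typescript
interface Box {
  left: number;
  top: number;
  width: number;
  height: number;
}

// Finds the smallest box containing every pixel that is neither fully
// transparent nor near-white. Returns null when no content is found.
function contentBounds(rgba: Uint8Array, width: number, height: number, whiteThreshold: number = 250): Box | null {
  let minX = width;
  let minY = height;
  let maxX = -1;
  let maxY = -1;
  for (let y = 0; y < height; y++) {
    for (let x = 0; x < width; x++) {
      const i = (y * width + x) * 4;
      const transparent = rgba[i + 3] === 0;
      const white =
        rgba[i] >= whiteThreshold && rgba[i + 1] >= whiteThreshold && rgba[i + 2] >= whiteThreshold;
      if (transparent || white) continue; // border pixel, skip
      if (x < minX) minX = x;
      if (x > maxX) maxX = x;
      if (y < minY) minY = y;
      if (y > maxY) maxY = y;
    }
  }
  if (maxX < 0) return null;
  return { left: minX, top: minY, width: maxX - minX + 1, height: maxY - minY + 1 };
}

// 3x3 opaque white image with a single black pixel in the center.
const canvas = new Uint8Array(3 * 3 * 4).fill(255);
const center = (1 * 3 + 1) * 4;
canvas[center] = 0;
canvas[center + 1] = 0;
canvas[center + 2] = 0;
const bounds = contentBounds(canvas, 3, 3);
```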

## Format Conversion

### Supported Formats

| Format | Extension | Quality Range | Use Case |
|--------|-----------|---------------|----------|
| PNG | .png | N/A (lossless) | Transparency, icons, diagrams |
| JPEG | .jpg/.jpeg | 1-100 | Photographs, final output |
| WebP | .webp | 1-100 | Web optimization, smaller files |

### Quality Control

For JPEG and WebP, the `quality` parameter controls the compression level:

- **90** (default): Balanced quality and file size
- **100**: Maximum quality, larger file size
- **70-85**: Smaller files, visible compression artifacts
- **1-69**: Heavy compression, significant quality loss

## Output Organization

### Filename Auto-Versioning

When a filename collision occurs, the system automatically increments a version suffix:

| Attempt | Filename |
|---------|----------|
| 1st | `hero.png` |
| 2nd | `hero-v2.png` |
| 3rd | `hero-v3.png` |
| nth | `hero-v{n}.png` |
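
The naming pattern in the table can be sketched with an injected existence check, which keeps the example free of file-system calls. `nextAvailableName` and the `exists` callback are hypothetical names used for illustration.

```typescript
// Returns the first filename in the sequence base.ext, base-v2.ext,
// base-v3.ext, ... for which `exists` reports false.
function nextAvailableName(base: string, ext: string, exists: (name: string) => boolean): string {
  const first = `${base}.${ext}`;
  if (!exists(first)) return first;
  for (let version = 2; ; version++) {
    const candidate = `${base}-v${version}.${ext}`;
    if (!exists(candidate)) return candidate;
  }
}

const taken = new Set(["hero.png", "hero-v2.png"]);
const nextName = nextAvailableName("hero", "png", (name) => taken.has(name));
```

In a real server the callback would typically wrap a file-system check against the resolved output directory.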

### Directory Structure

Output is organized as: `{outputDir}/{subfolder}/{filename}.{format}`

**Examples:**

| Parameters | Result |
|------------|--------|
| `filename: "hero"`, no subfolder | `~/gemini-images/hero.png` |
| `filename: "logo"`, `subfolder: "brand"` | `~/gemini-images/brand/logo.png` |
| `outputDir: "./output"`, `subfolder: "icons"` | `./output/icons/{filename}.png` |

### Generation Manifest

Every processed image is logged to `generations.jsonl` in the output directory. Each entry is a JSON object on a single line:

```json
{"timestamp":"2024-01-15T10:30:00.000Z","type":"process","operation":"background-removal","input":"product.jpg","output":"product-transparent.png","duration_ms":145}
```

## Configuration Integration

### Config Precedence

Parameters can be specified at multiple levels with this priority:

```mermaid
graph TD
    A["Per-Request Parameters (highest priority)"] --> B["Environment Variables"]
    B --> C["Local Config .gemini-image-mcp.json"]
    C --> D["Global Config ~/.gemini-image-mcp.json"]
    D --> E["Code Defaults (lowest priority)"]
```

### Default Configuration Template

```json
{
  "outputDir": "~/gemini-images",
  "defaultModel": "gemini-2.5-flash-image",
  "logLevel": "info",
  "requestTimeout": 60000,
  "sessionTimeout": 1800000,
  "maxRequestsPerHour": 0,
  "maxCostPerHour": 0,
  "defaults": {
    "process": {
      "removeBackground": { "color": "#00FF00" },
      "trim": true,
      "format": "png",
      "quality": 90
    }
  }
}
```

Source: [src/config.ts:17-47](https://github.com/JimothySnicket/gemini-image-mcp/blob/main/src/config.ts)

## Common Pipelines

### Favicon Generation Pipeline

```
process_image → removeBackground {threshold: 230} + trim + resize {width: 192, height: 192}
```

**Steps:**
1. Remove white background using threshold detection
2. Trim any remaining whitespace
3. Resize to 192×192 favicon dimensions

### Transparent Asset from Green Screen

```
generate_image → "A product photo on a bright green background"
process_image → removeBackground {color: "#00FF00"} + trim
```

**Steps:**
1. Generate subject on green screen (one API call)
2. Apply chroma key to remove green (free, local)
3. Trim excess border

### Social Card from Photo

```
process_image → crop {aspectRatio: "16:9", strategy: "attention"} + resize {width: 1200}
```

**Steps:**
1. Crop to 16:9 ratio, focusing on the most interesting region
2. Resize to optimal width for social platforms

## Performance Characteristics

### Processing Speed

Since all operations run locally via sharp, `process_image` is significantly faster than API-based alternatives:

| Operation | Typical Duration |
|-----------|------------------|
| Crop/Resize | < 100ms |
| Background Removal (threshold) | 100-300ms |
| Background Removal (chroma key) | 300-800ms |
| Trim | < 50ms |
| Format Conversion | 50-200ms |

### Memory Usage

Sharp processes images in memory and uses libvips, which is designed for efficient memory usage even with large images. A 4K image (4096×4096) typically requires 50-100MB of working memory depending on the operations performed.

## Error Handling

### Common Error Cases

| Error | Cause | Resolution |
|-------|-------|------------|
| Unsupported format | Invalid file extension | Use PNG, JPEG, WebP, GIF, or AVIF |
| File too large | Image exceeds 50MB limit | Reduce image size before processing |
| File not found | Invalid path | Verify imagePath is correct and accessible |
| Invalid crop dimensions | Crop region exceeds image bounds | Adjust width, height, left, top values |
| Invalid hex color | Malformed color string | Use format: `#RRGGBB` or `#RGB` |

Source: [src/generate.ts:107-112](https://github.com/JimothySnicket/gemini-image-mcp/blob/main/src/generate.ts)

---

<a id='cost-tracking'></a>

## Cost Tracking and Rate Limiting

### Related Pages

Related topics: [generate_image Tool Reference](#generate-image-tool), [Configuration Guide](#configuration-guide), [Image Generation Internals](#image-generation-internals)

<details>
<summary>Related Source Files</summary>

The following source files were used to generate this page:

- [src/tracker.ts](https://github.com/JimothySnicket/gemini-image-mcp/blob/main/src/tracker.ts)
- [src/tracker.test.ts](https://github.com/JimothySnicket/gemini-image-mcp/blob/main/src/tracker.test.ts)
- [src/pricing.ts](https://github.com/JimothySnicket/gemini-image-mcp/blob/main/src/pricing.ts)
- [src/pricing.test.ts](https://github.com/JimothySnicket/gemini-image-mcp/blob/main/src/pricing.test.ts)
- [src/config.ts](https://github.com/JimothySnicket/gemini-image-mcp/blob/main/src/config.ts)
</details>

# Cost Tracking and Rate Limiting

The gemini-image-mcp server tracks cost and enforces rate limits so users can monitor and control their API spending. It works at several levels, from per-generation cost calculation to hourly request and budget caps, to keep expenditure predictable when using Gemini image generation.

## Overview

The cost tracking and rate limiting subsystem serves three primary purposes:

1. **Cost Transparency** — Every image generation returns detailed token counts and estimated USD cost, allowing users to understand their API consumption.
2. **Budget Enforcement** — Configurable hourly limits prevent runaway agents or iterative workflows from exceeding intended spending.
3. **Session Context** — Generation costs are tracked per session, providing cumulative cost summaries across multi-turn editing workflows.

Source: [src/pricing.ts:1-50](https://github.com/JimothySnicket/gemini-image-mcp/blob/main/src/pricing.ts)

## Architecture

The system comprises two interconnected modules:

| Module | File | Responsibility |
|--------|------|----------------|
| Pricing | `src/pricing.ts` | Token counting, cost calculation, and pricing table |
| Tracker | `src/tracker.ts` | Rate limiting, session tracking, and manifest logging |

```mermaid
graph TD
    A[generate_image Tool Call] --> B[checkRateLimit]
    B --> C{Within Limits?}
    C -->|No| D[Throw RateLimitError]
    C -->|Yes| E[Call Gemini API]
    E --> F[calculateUsage]
    F --> G[UsageReport]
    G --> H[recordGeneration]
    H --> I[Update Session Stats]
    H --> J[Append to generations.jsonl]
    K[Config: MAX_REQUESTS_PER_HOUR] -.-> B
    L[Config: MAX_COST_PER_HOUR] -.-> B
```

Source: [src/tracker.ts:40-60](https://github.com/JimothySnicket/gemini-image-mcp/blob/main/src/tracker.ts)

## Pricing Module

### Pricing Table

The `PRICING` object in `src/pricing.ts` contains the authoritative pricing rates for all supported Gemini image models. All rates are expressed as USD per million tokens.

| Model | Input ($/M) | Text Output ($/M) | Image Output ($/M) | Thinking ($/M) |
|-------|-------------|-------------------|--------------------|----------------|
| `gemini-2.5-flash-image` | 0.30 | 2.50 | 30.00 | 2.50 |
| `gemini-3-pro-image-preview` | 2.00 | 120.00 | 120.00 | 120.00 |
| `gemini-3.1-flash-image-preview` | 0.50 | 60.00 | 60.00 | 60.00 |

Source: [src/pricing.ts:31-45](https://github.com/JimothySnicket/gemini-image-mcp/blob/main/src/pricing.ts)

The pricing data is verified against Google AI Studio as of `2026-04-01`, which is stored in the `PRICING_VERIFIED_DATE` constant and included in every `UsageReport`.

### Cost Calculation Formula

The `calculateUsage()` function computes the estimated cost using the following formula:

```
cost = (promptTokens / 1,000,000) × inputPerMillion
     + (textTokens / 1,000,000) × textOutputPerMillion
     + (imageTokens / 1,000,000) × imageOutputPerMillion
     + (thinkingTokens / 1,000,000) × thinkingPerMillion
```

Source: [src/pricing.ts:66-72](https://github.com/JimothySnicket/gemini-image-mcp/blob/main/src/pricing.ts)
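
As a worked illustration of the formula, the sketch below applies the rates from the pricing table. It is not the project's `calculateUsage()`: the names `PRICING_SKETCH`, `TokenCounts`, and `estimateCostUsd` are hypothetical, and returning `undefined` for a model with no rates is a simplification chosen for this example.

```typescript
interface ModelPricing {
  inputPerMillion: number;
  textOutputPerMillion: number;
  imageOutputPerMillion: number;
  thinkingPerMillion: number;
}

// Rates in USD per million tokens, copied from the pricing table above.
const PRICING_SKETCH: Record<string, ModelPricing> = {
  "gemini-2.5-flash-image": { inputPerMillion: 0.3, textOutputPerMillion: 2.5, imageOutputPerMillion: 30, thinkingPerMillion: 2.5 },
  "gemini-3-pro-image-preview": { inputPerMillion: 2, textOutputPerMillion: 120, imageOutputPerMillion: 120, thinkingPerMillion: 120 },
  "gemini-3.1-flash-image-preview": { inputPerMillion: 0.5, textOutputPerMillion: 60, imageOutputPerMillion: 60, thinkingPerMillion: 60 },
};

interface TokenCounts {
  promptTokens: number;
  textTokens: number;
  imageTokens: number;
  thinkingTokens: number;
}

// Applies the four-term formula; returns undefined for unlisted models.
function estimateCostUsd(model: string, tokens: TokenCounts): number | undefined {
  const rates = PRICING_SKETCH[model];
  if (!rates) return undefined;
  const term = (count: number, ratePerMillion: number) => (count / 1e6) * ratePerMillion;
  return (
    term(tokens.promptTokens, rates.inputPerMillion) +
    term(tokens.textTokens, rates.textOutputPerMillion) +
    term(tokens.imageTokens, rates.imageOutputPerMillion) +
    term(tokens.thinkingTokens, rates.thinkingPerMillion)
  );
}

// 5 prompt tokens plus 1290 image tokens on the default model:
// (5 / 1e6) * 0.30 + (1290 / 1e6) * 30.00, roughly $0.0387.
const sampleCost = estimateCostUsd("gemini-2.5-flash-image", {
  promptTokens: 5,
  textTokens: 0,
  imageTokens: 1290,
  thinkingTokens: 0,
});
```

Image output tokens dominate the total for typical generations, since their rate is far higher than the input rate.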

### UsageReport Interface

Every image generation returns a `UsageReport` containing:

| Field | Type | Description |
|-------|------|-------------|
| `promptTokens` | number | Input token count |
| `outputTokens` | number | Total output tokens |
| `imageTokens` | number | Image modality output tokens |
| `thinkingTokens` | number | Internal reasoning tokens |
| `totalTokens` | number | Combined token count |
| `estimatedCost` | string | Formatted cost (e.g., "$0.0412") |
| `pricingVerifiedDate` | string | Date pricing was last verified |

Source: [src/pricing.ts:47-54](https://github.com/JimothySnicket/gemini-image-mcp/blob/main/src/pricing.ts)

### Handling Unknown Models

If a model is not found in the pricing table, the system returns `"unknown (model not in pricing table)"` as the estimated cost while still populating token counts. This ensures graceful degradation without breaking workflows for new or custom models.

Source: [src/pricing.ts:74-80](https://github.com/JimothySnicket/gemini-image-mcp/blob/main/src/pricing.ts)

## Rate Limiting Module

### Configuration

Rate limits are configured through environment variables or the JSON config file:

| Environment Variable | Config Key | Type | Default | Description |
|---------------------|------------|------|---------|-------------|
| `MAX_REQUESTS_PER_HOUR` | `maxRequestsPerHour` | number | 0 (disabled) | Maximum generations per rolling hour |
| `MAX_COST_PER_HOUR` | `maxCostPerHour` | number | 0 (disabled) | Maximum USD spend per rolling hour |

Source: [src/tracker.ts:35-50](https://github.com/JimothySnicket/gemini-image-mcp/blob/main/src/tracker.ts)

Configuration priority follows this order (highest to lowest):
1. Environment variables
2. Local config file (`.gemini-image-mcp.json` in CWD)
3. Global config file (`~/.gemini-image-mcp.json`)
4. Built-in defaults

### Rate Limit Enforcement

The `checkRateLimit()` function performs two checks against a rolling one-hour window:

```mermaid
graph LR
    A[Load Config] --> B[countRecentGenerations]
    B --> C{Hourly Request Limit?}
    C -->|Exceeded| D[Throw Error with count/limit]
    C -->|OK| E{Hourly Cost Limit?}
    E -->|Exceeded| F[Throw Error with $spent/$limit]
    E -->|OK| G[Continue to API Call]
```

Source: [src/tracker.ts:42-58](https://github.com/JimothySnicket/gemini-image-mcp/blob/main/src/tracker.ts)
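
The two checks in the diagram might be sketched as follows. This is not the code in `src/tracker.ts`: the `GenerationRecord` fields are borrowed from the manifest format, the function name and error wording are hypothetical, and treating a limit as reached once usage is equal to it is an assumption.

```typescript
// Minimal record shape, using field names from the generation manifest.
interface GenerationRecord {
  timestamp: string; // ISO 8601
  costCents: number;
}

interface HourlyLimits {
  maxRequestsPerHour: number; // 0 disables the check
  maxCostPerHour: number;     // USD, 0 disables the check
}

const HOUR_MS = 60 * 60 * 1000;

// Throws when either hourly limit has been reached within the rolling
// window ending at `now`; otherwise returns normally.
function checkRateLimitSketch(records: GenerationRecord[], limits: HourlyLimits, now: number = Date.now()): void {
  const recent = records.filter((r) => now - Date.parse(r.timestamp) < HOUR_MS);
  if (limits.maxRequestsPerHour > 0 && recent.length >= limits.maxRequestsPerHour) {
    throw new Error(`Rate limit reached: ${recent.length}/${limits.maxRequestsPerHour} generations used this hour.`);
  }
  const spentUsd = recent.reduce((sum, r) => sum + r.costCents, 0) / 100;
  if (limits.maxCostPerHour > 0 && spentUsd >= limits.maxCostPerHour) {
    throw new Error(`Cost limit reached: $${spentUsd.toFixed(2)}/$${limits.maxCostPerHour.toFixed(2)} spent this hour.`);
  }
}

const noon = Date.parse("2026-04-01T12:00:00.000Z");
const manifest: GenerationRecord[] = [
  { timestamp: "2026-04-01T10:00:00.000Z", costCents: 4 }, // outside the window
  { timestamp: "2026-04-01T11:30:00.000Z", costCents: 4 },
  { timestamp: "2026-04-01T11:50:00.000Z", costCents: 4 },
];
```

Because the window is computed relative to `now` on every call, older generations drop out of the count on their own without a reset timer.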

### Error Messages

When rate limits are exceeded, the system throws descriptive errors:

**Request limit reached:**
```
Rate limit reached — 20/20 generations used this hour. To change: set MAX_REQUESTS_PER_HOUR env var.
```

**Cost limit reached:**
```
Cost limit reached — $4.50/$5.00 spent this hour. To change: set MAX_COST_PER_HOUR env var.
```

Source: [src/tracker.ts:48-56](https://github.com/JimothySnicket/gemini-image-mcp/blob/main/src/tracker.ts)

## Session Tracking

### Session Statistics

Multi-turn editing sessions maintain running totals across all generations within that session:

| Stat | Type | Description |
|------|------|-------------|
| `sessionGenerations` | number | Count of generations in current session |
| `sessionCostCents` | number | Cumulative cost in cents for the session |

Source: [src/tracker.ts:20-25](https://github.com/JimothySnicket/gemini-image-mcp/blob/main/src/tracker.ts)

### SessionStats Interface

Each tool response includes a `session` object with:

| Field | Type | Description |
|-------|------|-------------|
| `sessionId` | string | Unique session identifier |
| `sessionGenerations` | number | Generations in this session |
| `sessionCostCents` | number | Session cost in cents |

Source: [src/tracker.ts:1-20](https://github.com/JimothySnicket/gemini-image-mcp/blob/main/src/tracker.ts)

### Session Management

- Sessions expire after 30 minutes of inactivity
- The `sessionId` parameter continues editing from prior conversation context
- Model mismatch detection prevents mixing models within a session

Source: [CHANGELOG.md](https://github.com/JimothySnicket/gemini-image-mcp/blob/main/CHANGELOG.md)

## Generation Manifest

All generations are logged to `generations.jsonl` in the output directory for auditing and analytics:

```jsonl
{"timestamp":"2026-04-01T12:00:00.000Z","model":"gemini-2.5-flash-image","prompt":"A modern dashboard UI","aspectRatio":"16:9","resolution":"2K","filename":"dashboard-hero","costCents":4.12,"tokens":1295}
```

Source: [src/tracker.ts:60-65](https://github.com/JimothySnicket/gemini-image-mcp/blob/main/src/tracker.ts)

## Tool Response Structure

Every `generate_image` response includes complete cost and tracking information:

```json
{
  "imagePath": "/home/user/gemini-images/hero-banner.png",
  "mimeType": "image/png",
  "model": "gemini-2.5-flash-image",
  "sessionId": "session-1711929600000-a1b2c3",
  "sessionTurn": 1,
  "usage": {
    "promptTokens": 5,
    "outputTokens": 1295,
    "imageTokens": 1290,
    "thinkingTokens": 412,
    "totalTokens": 1295,
    "estimatedCost": "$0.0412",
    "pricingVerifiedDate": "2026-04-01"
  },
  "session": {
    "sessionId": "session-1711929600000-a1b2c3",
    "sessionGenerations": 1,
    "sessionCostCents": 4.12
  }
}
```

## Recommended Settings

For agentic workflows with iterative image refinement:

| Setting | Value | Rationale |
|---------|-------|-----------|
| `MAX_REQUESTS_PER_HOUR` | 20 | Prevents runaway loops |
| `MAX_COST_PER_HOUR` | 5.00 | Caps hourly spend at $5 |

Source: [README.md](https://github.com/JimothySnicket/gemini-image-mcp/blob/main/README.md)

## Testing

The pricing and tracking modules have comprehensive test coverage:

| Test File | Coverage |
|-----------|----------|
| `src/pricing.test.ts` | Cost calculation, unknown models, missing metadata, pricing table verification |
| `src/tracker.test.ts` | Rate limit enforcement, session tracking, manifest appending |

Source: [src/pricing.test.ts:1-50](https://github.com/JimothySnicket/gemini-image-mcp/blob/main/src/pricing.test.ts)

## Summary

The cost tracking and rate limiting system provides transparency and control over API usage through:

- **Per-generation pricing** with detailed token breakdowns across input, text output, image output, and thinking tokens
- **Hourly rate limiting** on both request count and dollar amount
- **Session-aware tracking** for multi-turn editing workflows
- **Manifest logging** for historical analysis and auditing
- **Graceful degradation** when encountering unknown models

---

<!-- evidence_pipeline_checked: true -->


## Pitfall Log

Project: jimothysnicket/gemini-image-mcp

Summary: Found 6 structured pitfall item(s), including 0 high/blocking item(s). Top priority: Capability evidence risk - Capability evidence risk requires verification.

## 1. Capability evidence risk - Capability evidence risk requires verification

- Severity: medium
- Evidence strength: source_linked
- Finding: README/documentation is current enough for a first validation pass.
- User impact: May increase setup, validation, or first-run risk for the user.
- Suggested check: Reproduce the official install and quickstart path in an isolated environment.
- Evidence: capability.assumptions | mcp_registry:io.github.JimothySnicket/gemini-image:0.2.2 | https://registry.modelcontextprotocol.io/v0.1/servers/io.github.JimothySnicket%2Fgemini-image/versions/0.2.2

## 2. Maintenance risk - Maintenance risk requires verification

- Severity: medium
- Evidence strength: source_linked
- Finding: Project evidence flags a maintenance risk. Review the linked source before relying on this workflow.
- User impact: May increase setup, validation, or first-run risk for the user.
- Suggested check: Reproduce the official install and quickstart path in an isolated environment.
- Evidence: evidence.maintainer_signals | mcp_registry:io.github.JimothySnicket/gemini-image:0.2.2 | https://registry.modelcontextprotocol.io/v0.1/servers/io.github.JimothySnicket%2Fgemini-image/versions/0.2.2

## 3. Security or permission risk - Security or permission risk requires verification

- Severity: medium
- Evidence strength: source_linked
- Finding: no_demo
- User impact: May increase setup, validation, or first-run risk for the user.
- Suggested check: Reproduce the official install and quickstart path in an isolated environment.
- Evidence: downstream_validation.risk_items | mcp_registry:io.github.JimothySnicket/gemini-image:0.2.2 | https://registry.modelcontextprotocol.io/v0.1/servers/io.github.JimothySnicket%2Fgemini-image/versions/0.2.2

## 4. Security or permission risk - Security or permission risk requires verification

- Severity: medium
- Evidence strength: source_linked
- Finding: no_demo
- User impact: May increase setup, validation, or first-run risk for the user.
- Suggested check: Reproduce the official install and quickstart path in an isolated environment.
- Evidence: risks.scoring_risks | mcp_registry:io.github.JimothySnicket/gemini-image:0.2.2 | https://registry.modelcontextprotocol.io/v0.1/servers/io.github.JimothySnicket%2Fgemini-image/versions/0.2.2

## 5. Maintenance risk - Maintenance risk requires verification

- Severity: low
- Evidence strength: source_linked
- Finding: issue_or_pr_quality=unknown.
- User impact: May increase setup, validation, or first-run risk for the user.
- Suggested check: Reproduce the official install and quickstart path in an isolated environment.
- Evidence: evidence.maintainer_signals | mcp_registry:io.github.JimothySnicket/gemini-image:0.2.2 | https://registry.modelcontextprotocol.io/v0.1/servers/io.github.JimothySnicket%2Fgemini-image/versions/0.2.2

## 6. Maintenance risk - Maintenance risk requires verification

- Severity: low
- Evidence strength: source_linked
- Finding: release_recency=unknown.
- User impact: May increase setup, validation, or first-run risk for the user.
- Suggested check: Reproduce the official install and quickstart path in an isolated environment.
- Evidence: evidence.maintainer_signals | mcp_registry:io.github.JimothySnicket/gemini-image:0.2.2 | https://registry.modelcontextprotocol.io/v0.1/servers/io.github.JimothySnicket%2Fgemini-image/versions/0.2.2

<!-- canonical_name: jimothysnicket/gemini-image-mcp; human_manual_source: deepwiki_human_wiki -->