gemini-image-mcp Manual

Doramagic Project Pack · Human Manual

gemini-image-mcp

Home

Overview

gemini-image-mcp is a Model Context Protocol (MCP) server that provides Google Gemini-powered image generation, editing, and local image processing. It integrates with MCP-compatible AI assistants (such as Claude Code and Claude Desktop) so that image workflows can be run directly from a conversation. Source: package.json

The project exposes two primary tools: generate_image for AI-powered image creation and editing via the Gemini API, and process_image for local image manipulation using the sharp library—free and fast with no API calls required. Source: README.md

Project Metadata

Property	Value
Package Name	`@jimothy-snicket/gemini-image-mcp`
Version	0.4.0
MCP Server Name	`io.github.JimothySnicket/gemini-image`
License	MIT
Author	Jamie Donaldson
Runtime	Node.js >= 18.0.0
Package Manager	Bun (primary), npm compatible
Repository	https://github.com/JimothySnicket/gemini-image-mcp

Source: package.json, server.json

Architecture

graph TD
    A[MCP Client<br/>Claude Code / Claude Desktop] --> B[gemini-image-mcp Server]
    B --> C[generate_image Tool]
    B --> D[process_image Tool]
    C --> E[Google Gemini API]
    D --> F[sharp Library<br/>Local Processing]
    
    E --> G[Image Models]
    G --> G1[gemini-2.5-flash-image]
    G --> G2[gemini-3-pro-image-preview]
    G --> G3[gemini-3.1-flash-image-preview]
    
    F --> H[Crop / Resize]
    F --> I[Background Removal]
    F --> J[Trim / Format]
    
    C --> K[Output Directory]
    D --> K
    K --> L[generations.jsonl<br/>Manifest Log]

System Components

Component	Technology	Purpose
MCP Server	`@modelcontextprotocol/sdk` v1.22.0	Protocol implementation for AI tool integration
Gemini SDK	`@google/genai` v1.44.0	Google AI API client
Image Processing	`sharp` v0.34.5	Local image manipulation
Schema Validation	`zod` v3.24.0	Parameter validation for both tools
Language	TypeScript 6.0.3	Type-safe source code

Source: package.json

Supported Gemini Models

Model	Speed	Cost	Resolution Support	Max Reference Images	Special Features
`gemini-2.5-flash-image`	Fast (~6s)	~$0.04/image	1K only	1	Default model, scheduled for deprecation Oct 2026
`gemini-3-pro-image-preview`	Slow (~16s)	~$0.15/image	1K, 2K, 4K	14	Best quality, text rendering
`gemini-3.1-flash-image-preview`	Balanced	Variable	512, 1K, 2K, 4K	1	Google Search grounding

Source: README.md

Available Tools

Tool: `generate_image`

AI-powered image generation and editing via Google Gemini API.

Parameters:

Parameter	Required	Type	Description
`prompt`	Yes	string	Text description or editing instruction
`images`	No	string[]	Array of file paths to input/reference images
`model`	No	string	Gemini model ID (auto-detected if omitted)
`aspectRatio`	No	string	Image ratio: `1:1`, `16:9`, `9:16`, `3:2`, `2:3`, `4:3`, `3:4`, `21:9`
`resolution`	No	string	`1K`, `2K`, `4K`
`outputDir`	No	string	Override output directory
`filename`	No	string	Base name for saved file (auto-versioned if duplicate)
`subfolder`	No	string	Subfolder within output directory
`sessionId`	No	string	Continue multi-turn editing session
`seed`	No	integer	Reproducible generation seed
`useSearchGrounding`	No	boolean	Enable Google Search grounding (gemini-3.1-flash only)

Source: README.md, src/index.ts
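
To make the parameter table concrete, here is a minimal sketch of the JSON-RPC tools/call message an MCP client sends when it invokes generate_image. The GenerateImageArgs type and the buildGenerateImageCall helper are hypothetical names introduced for illustration; they are not part of the package, and in practice the MCP client builds this message for you.

```typescript
// Hypothetical argument type mirroring the parameter table above.
interface GenerateImageArgs {
  prompt: string;
  images?: string[];
  model?: string;
  aspectRatio?: string;
  resolution?: "1K" | "2K" | "4K";
  outputDir?: string;
  filename?: string;
  subfolder?: string;
  sessionId?: string;
  seed?: number;
  useSearchGrounding?: boolean;
}

interface ToolCallRequest {
  jsonrpc: "2.0";
  id: number;
  method: "tools/call";
  params: { name: string; arguments: GenerateImageArgs };
}

// Illustrative helper (an assumption, not the server's code): wraps the
// arguments in the JSON-RPC envelope MCP uses for tool invocations.
function buildGenerateImageCall(id: number, args: GenerateImageArgs): ToolCallRequest {
  if (args.prompt.trim() === "") {
    throw new Error("prompt is required");
  }
  return {
    jsonrpc: "2.0",
    id,
    method: "tools/call",
    params: { name: "generate_image", arguments: args },
  };
}

// Example: a reproducible 16:9 text-to-image request.
const request = buildGenerateImageCall(1, {
  prompt: "A lighthouse at dusk, watercolor style",
  aspectRatio: "16:9",
  resolution: "1K",
  seed: 42,
});
```

Only prompt is required; every other field can be omitted and falls back to the configured defaults.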

Tool: `process_image`

Local image processing via sharp. Free, fast, no API calls.

Parameters:

Parameter	Required	Type	Description
`imagePath`	Yes	string	Path to image file to process
`crop`	No	object	Pixel dimensions, aspect ratio, or focal point strategy
`resize`	No	object	Resize to width/height (maintains aspect ratio)
`removeBackground`	No	object	Threshold (white) or chroma key (any solid color)
`trim`	No	boolean	Auto-remove whitespace/transparent borders
`format`	No	string	Convert to: `png`, `jpeg`, `webp`
`quality`	No	number	Output quality for JPEG/WebP (1-100)
`outputDir`	No	string	Override output directory
`filename`	No	string	Base name for saved file
`subfolder`	No	string	Subfolder within output directory

Crop Options:

// Pixel-exact
{"width": 500, "height": 300, "left": 100, "top": 50}

// Aspect ratio (center crop)
{"aspectRatio": "16:9"}

// Focal point strategies
{"aspectRatio": "16:9", "strategy": "attention"}  // Visually interesting region
{"aspectRatio": "16:9", "strategy": "entropy"}    // Most detailed region
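
The aspect-ratio center crop can be pictured as choosing the largest rectangle of the requested ratio that fits inside the image and centering it. The sketch below shows that arithmetic only; centerCropRect and CropRect are hypothetical names, and the rounding choices are assumptions rather than the server's actual behavior.

```typescript
interface CropRect {
  left: number;
  top: number;
  width: number;
  height: number;
}

// Largest centered rectangle with the given "W:H" ratio that fits the image.
function centerCropRect(imgW: number, imgH: number, aspectRatio: string): CropRect {
  const parts = aspectRatio.split(":").map(Number);
  const rw = parts[0] ?? NaN;
  const rh = parts[1] ?? NaN;
  if (!(rw > 0) || !(rh > 0)) {
    throw new Error(`Invalid aspect ratio: ${aspectRatio}`);
  }
  const target = rw / rh;

  // Start from full width; if that is too tall, constrain by height instead.
  let width = imgW;
  let height = Math.round(imgW / target);
  if (height > imgH) {
    height = imgH;
    width = Math.round(imgH * target);
  }
  return {
    left: Math.floor((imgW - width) / 2),
    top: Math.floor((imgH - height) / 2),
    width,
    height,
  };
}

// A 1920x1200 image cropped to 16:9 keeps the full width and trims 60px
// from the top and bottom: { left: 0, top: 60, width: 1920, height: 1080 }
const rect = centerCropRect(1920, 1200, "16:9");
```

The resulting rectangle has the same shape as the pixel-exact crop object, so it could be passed as the crop parameter directly.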

Background Removal Options:

// Threshold-based (white backgrounds)
{"threshold": 230}

// Chroma key (green screen / any solid color)
{"color": "#00FF00"}

Source: README.md, skills/image-generation/SKILL.md
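
As a rough illustration of threshold-based removal, the sketch below makes a pixel transparent when all three color channels are at or above the threshold. It operates on a raw RGBA buffer and is an assumption about how such a threshold could be applied; removeWhiteBackground is a hypothetical name and the server's real pipeline may differ.

```typescript
// Makes near-white pixels transparent in raw RGBA data (4 bytes per pixel).
function removeWhiteBackground(rgba: Uint8Array, threshold: number): Uint8Array {
  if (rgba.length % 4 !== 0) {
    throw new Error("Expected RGBA data (length must be a multiple of 4)");
  }
  const out = new Uint8Array(rgba); // copy so the input stays untouched
  for (let i = 0; i < out.length; i += 4) {
    const r = out[i] ?? 0;
    const g = out[i + 1] ?? 0;
    const b = out[i + 2] ?? 0;
    if (r >= threshold && g >= threshold && b >= threshold) {
      out[i + 3] = 0; // alpha channel: fully transparent
    }
  }
  return out;
}

// Three pixels: pure white, light grey, and red, keyed with threshold 230.
const pixels = new Uint8Array([255, 255, 255, 255, 240, 240, 240, 255, 200, 30, 30, 255]);
const keyed = removeWhiteBackground(pixels, 230);
```

With a threshold of 230, the white and light grey pixels become transparent while the red pixel is left opaque.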

Feature Summary

Generate Image Features

Text-to-image — Describe desired output, receive generated image
Image editing — Provide reference images with editing instructions
Multi-turn sessions — Iteratively refine images using conversation history
Multi-image input — Up to 14 reference images on gemini-3-pro
Cost reporting — Token counts, estimated USD cost, and session totals in every response
Rate limiting — Configurable per-hour caps on requests and cost
Auto model discovery — Detects available image models from API key at startup
Seed support — Reproducible generation with integer seeds
Google Search grounding — Real-world accuracy on gemini-3.1-flash

Source: README.md

Process Image Features

Crop — Pixel-exact, aspect ratio (center), or focal point (attention/entropy)
Resize — To width, height, or both (maintains aspect ratio)
Background removal — Threshold-based (white backgrounds) or chroma key (any solid color)
Chroma key pipeline — HSV keying with smoothstep feather, spill suppression, 5-pass 3x3 edge anti-aliasing
Trim — Auto-remove whitespace borders
Format conversion — PNG, JPEG, WebP with quality control

Source: CHANGELOG.md
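
The smoothstep feather mentioned for the chroma key pipeline can be sketched as follows. This is a simplified illustration that keys on hue distance alone; the real pipeline also involves spill suppression and edge anti-aliasing, and the function names and the inner/outer tolerance parameters here are assumptions.

```typescript
// Standard smoothstep: 0 at or below edge0, 1 at or above edge1,
// with a smooth cubic ramp in between.
function smoothstep(edge0: number, edge1: number, x: number): number {
  const t = Math.min(1, Math.max(0, (x - edge0) / (edge1 - edge0)));
  return t * t * (3 - 2 * t);
}

// Shortest angular distance between two hues, in degrees (0 to 180).
function hueDistance(a: number, b: number): number {
  const d = Math.abs(a - b) % 360;
  return d > 180 ? 360 - d : d;
}

// Hypothetical alpha function: pixels within `inner` degrees of the key hue
// are fully transparent (0), pixels beyond `outer` are fully opaque (1),
// and the band in between is feathered.
function chromaAlpha(pixelHue: number, keyHue: number, inner: number, outer: number): number {
  return smoothstep(inner, outer, hueDistance(pixelHue, keyHue));
}

const onKey = chromaAlpha(120, 120, 10, 30); // exactly the key hue
const farFromKey = chromaAlpha(0, 120, 10, 30); // red against a green key
const feathered = chromaAlpha(350, 10, 10, 30); // 20 degrees away, across the 0/360 wrap
```

The feathered band is what avoids a hard, jagged edge between keyed and kept pixels.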

Shared Features

Output organization — Meaningful filenames with auto-versioning, subfolders
Generation manifest — generations.jsonl logs every generation with prompt, params, cost
Full aspect ratio support — 1:1, 16:9, 9:16, 3:2, 2:3, 4:3, 3:4, 21:9
Resolution control — 1K, 2K, 4K

Source: README.md

Workflow Diagram

graph LR
    subgraph "Text-to-Image"
        A1[User Prompt] --> B1[generate_image]
        B1 --> C1[Gemini API]
        C1 --> D1[Save PNG/JPEG]
    end
    
    subgraph "Image Editing"
        A2[User Prompt + Reference Image] --> B2[generate_image]
        B2 --> C2[Gemini API]
        C2 --> D2[Save + sessionId]
    end
    
    subgraph "Local Processing"
        A3[Input Image] --> B3[process_image]
        B3 --> C3[sharp Pipeline]
        C3 --> D3[Processed Output]
    end
    
    subgraph "Multi-Turn Refinement"
        D2 --> E1[Pass sessionId]
        E1 --> B2
        B2 --> D4[Refined Image]
    end

Setup and Configuration

Prerequisites

Gemini API Key — Obtain from Google AI Studio
Node.js >= 18.0.0 or Bun runtime
MCP-compatible client (Claude Code, Claude Desktop, or other MCP clients)

Environment Setup

Windows (PowerShell):

[System.Environment]::SetEnvironmentVariable('GEMINI_API_KEY', 'your-key-here', 'User')

macOS / Linux:

echo 'export GEMINI_API_KEY="your-key-here"' >> ~/.bashrc
source ~/.bashrc

Source: README.md

Configuration File

Create a config file using the --init flag:

npx @jimothy-snicket/gemini-image-mcp --init

This creates ~/.gemini-image-mcp.json with all defaults and inline documentation.

Configuration Priority:

Per-request Parameters > Environment Variables > Local Config (.gemini-image-mcp.json in CWD) > Global Config (~/.gemini-image-mcp.json) > Defaults

Example Config Structure:

{
  "defaultModel": "gemini-3.1-flash-image-preview",
  "defaults": {
    "generate": {
      "aspectRatio": "16:9",
      "resolution": "2K"
    },
    "process": {
      "removeBackground": { "color": "#00FF00" },
      "trim": true
    }
  }
}

Source: README.md

Rate Limiting

Configure rate limits to prevent runaway agent costs:

MAX_REQUESTS_PER_HOUR — Maximum API requests per hour (e.g., 20)
MAX_COST_PER_HOUR — Maximum cost in USD per hour (e.g., 5)

Source: README.md, skills/image-generation/SKILL.md

Development

Build Commands

bun install        # Install dependencies
bun run build      # TypeScript -> dist/
bun run dev        # Run directly with Bun
npm run start      # Run production build with Node

Source: CONTRIBUTING.md

Project Structure

Path	Purpose
`src/index.ts`	Main MCP server implementation with tool definitions
`dist/`	Compiled JavaScript output
`skills/image-generation/SKILL.md`	Claude Code plugin skill documentation
`plugin.json`	Claude Code plugin manifest
`server.json`	MCP server registry configuration

Source: package.json

Version History

Version	Release Date	Key Changes
0.4.0	2026-05	Config module, JSONC parsing, security hardening, prototype pollution guards
0.2.0	2026-04-01	process_image tool, chroma key pipeline, session tracking, rate limiting
0.1.0	2026-01	Initial release with basic generation

Source: CHANGELOG.md

Security

For security vulnerabilities, report through GitHub Security Advisories rather than opening a public issue. Source: SECURITY.md

Security Features in v0.4.0:

API keys rejected from config files with warning
String-aware JSONC comment stripping (won't mangle URLs in quoted strings)
Prototype pollution guard on config deep merge (__proto__, constructor, prototype)
Unknown config keys warned and dropped

Source: CHANGELOG.md
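
To show what a prototype pollution guard on a deep merge looks like, here is a minimal sketch that skips the three keys listed above at every nesting level. It is an illustration of the general technique, not the project's code; safeDeepMerge, isPlainObject, and PlainObject are hypothetical names.

```typescript
type PlainObject = { [key: string]: unknown };

// Keys that must never be copied from untrusted input.
const BLOCKED_KEYS = new Set(["__proto__", "constructor", "prototype"]);

function isPlainObject(value: unknown): value is PlainObject {
  return typeof value === "object" && value !== null && !Array.isArray(value);
}

// Recursively merges `override` into a copy of `base`, dropping blocked keys.
function safeDeepMerge(base: PlainObject, override: PlainObject): PlainObject {
  const result: PlainObject = { ...base };
  for (const key of Object.keys(override)) {
    if (BLOCKED_KEYS.has(key)) continue; // guard against prototype pollution
    const incoming = override[key];
    const existing = result[key];
    // Always rebuild nested objects so blocked keys are stripped at any depth.
    result[key] = isPlainObject(incoming)
      ? safeDeepMerge(isPlainObject(existing) ? existing : {}, incoming)
      : incoming;
  }
  return result;
}

// JSON.parse creates "__proto__" as an ordinary own property, which a naive
// recursive merge would then write into Object.prototype.
const untrusted = JSON.parse(
  '{"__proto__": {"polluted": true}, "defaults": {"generate": {"resolution": "2K"}}}'
) as PlainObject;
const mergedConfig = safeDeepMerge({ defaults: { generate: { aspectRatio: "1:1" } } }, untrusted);
```

After the merge, the nested defaults are combined and nothing has been written to Object.prototype.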

Contributing

Before contributing, open an issue to discuss the bug or feature. Development follows these guidelines:

One thing per PR
Ensure bun run build succeeds with no errors
Test changes manually against the actual Gemini API
Keep scope tight—open separate issues for unrelated fixes

Source: CONTRIBUTING.md

Source: https://github.com/JimothySnicket/gemini-image-mcp / Human Manual

Installation Guide

This guide covers all methods to install and configure the gemini-image-mcp server, a Model Context Protocol (MCP) server that provides Google Gemini image generation, editing, and local image processing capabilities.

Overview

The gemini-image-mcp server provides two primary tools:

Tool	Description
`generate_image`	AI-powered image generation and editing via Gemini API
`process_image`	Local image processing (crop, resize, background removal) via sharp

Package Details:

Property	Value
Package Name	`@jimothy-snicket/gemini-image-mcp`
Version	0.4.0
Engine	Node.js >= 18.0.0
License	MIT
Transport	stdio

Source: package.json

Prerequisites

Before installation, ensure the following requirements are met:

System Requirements

Node.js: Version 18.0.0 or higher
Package Manager: npm (comes with Node.js)
MCP Client: A compatible MCP client such as Claude Code, Claude Desktop, or any MCP-compatible tool

Required Accounts

Google Gemini API Key: Obtain from Google AI Studio

Note: Creating an API key in Google AI Studio is free. Calls to generate_image are billed per request (see Supported Gemini Models), while process_image runs locally at no cost.

Source: README.md

Installation Methods

Method 1: Global npm Installation

Install the package globally for system-wide access:

npm install -g @jimothy-snicket/gemini-image-mcp

After installation, the server can be invoked via the gemini-image-mcp command:

gemini-image-mcp

Source: package.json:14 README.md

Method 2: NPX (No Installation Required)

Run directly without installation using npx:

npx -y @jimothy-snicket/gemini-image-mcp

This method automatically downloads and executes the package, making it ideal for quick testing or temporary use.

Source: README.md

Method 3: Claude Code Plugin

Add the MCP server to Claude Code with a single command:

claude mcp add gemini-image -- npx -y @jimothy-snicket/gemini-image-mcp

Claude Code automatically picks up the GEMINI_API_KEY environment variable from your shell.

Source: README.md

Method 4: Manual MCP Configuration

Create a .mcp.json configuration file in your project root or ~/.claude/.mcp.json for global access:

{
  "mcpServers": {
    "gemini-image": {
      "command": "npx",
      "args": ["-y", "@jimothy-snicket/gemini-image-mcp"],
      "env": {
        "GEMINI_API_KEY": "${GEMINI_API_KEY}"
      }
    }
  }
}

Security Note: The ${GEMINI_API_KEY} syntax reads the value from your shell environment, ensuring your actual API key is never written into configuration files.

Source: README.md
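
The placeholder substitution described in the security note can be pictured with the sketch below. It illustrates the idea of expanding ${NAME} from an environment map and is not the client's actual implementation; expandEnvPlaceholders is a hypothetical name, and failing on a missing variable is an assumption.

```typescript
// Replaces every ${NAME} placeholder with the matching value from `env`.
// Throws when a referenced variable is missing, so a misconfiguration
// fails loudly instead of passing an empty key to the server.
function expandEnvPlaceholders(
  value: string,
  env: Record<string, string | undefined>
): string {
  return value.replace(/\$\{([A-Za-z_][A-Za-z0-9_]*)\}/g, (_match, name: string) => {
    const resolved = env[name];
    if (resolved === undefined) {
      throw new Error(`Environment variable ${name} is not set`);
    }
    return resolved;
  });
}

// The config file holds only the placeholder; the key lives in the environment.
const fakeEnv = { GEMINI_API_KEY: "example-key-123" };
const expanded = expandEnvPlaceholders("${GEMINI_API_KEY}", fakeEnv);
```

Because the file contains only the placeholder, it can be committed to version control without exposing the key.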

Method 5: Claude Desktop

For Claude Desktop users, edit the configuration file:

OS	File Path
macOS	`~/Library/Application Support/Claude/claude_desktop_config.json`
Windows	`%APPDATA%\Claude\claude_desktop_config.json`

{
  "mcpServers": {
    "gemini-image": {
      "command": "npx",
      "args": ["-y", "@jimothy-snicket/gemini-image-mcp"],
      "env": {
        "GEMINI_API_KEY": "${GEMINI_API_KEY}"
      }
    }
  }
}

Source: README.md

Environment Setup

Setting the GEMINI_API_KEY

The server requires a GEMINI_API_KEY environment variable to authenticate with the Google Gemini API.

#### Windows (PowerShell)

Open PowerShell (administrator rights are not required for a User-scoped variable) and run:

[System.Environment]::SetEnvironmentVariable('GEMINI_API_KEY', 'your-key-here', 'User')

After setting the environment variable, restart your terminal to ensure the variable is loaded.

#### macOS / Linux

Add the export statement to your shell configuration file:

echo 'export GEMINI_API_KEY="your-key-here"' >> ~/.bashrc
source ~/.bashrc

For zsh users, use:

echo 'export GEMINI_API_KEY="your-key-here"' >> ~/.zshrc
source ~/.zshrc

#### Verification

Confirm the API key is set correctly:

echo $GEMINI_API_KEY

This should display your API key. In PowerShell, use echo $env:GEMINI_API_KEY instead. If the output is empty, restart your terminal or run source ~/.bashrc (or the equivalent for your shell).

Source: README.md

Configuration File Setup

The server supports configuration files for persistent settings. Two methods are available:

Initialize Default Config File

Create a global configuration file at ~/.gemini-image-mcp.json:

npx @jimothy-snicket/gemini-image-mcp --init

Initialize Local Config File

Create a project-local configuration file at .gemini-image-mcp.json in the current working directory:

npx @jimothy-snicket/gemini-image-mcp --init --local

Source: README.md

Configuration Priority

Settings are resolved in the following order of precedence:

Environment Variables > Local Config (.gemini-image-mcp.json) > Global Config (~/.gemini-image-mcp.json) > Default Values

Per-request parameters always override all configuration defaults.

Source: README.md

Configuration Schema

The configuration file supports the following structure:

{
  "defaultModel": "gemini-3.1-flash-image-preview",
  "defaults": {
    "generate": {
      "aspectRatio": "16:9",
      "resolution": "2K"
    },
    "process": {
      "removeBackground": { "color": "#00FF00" },
      "trim": true
    }
  }
}

#### Configuration Parameters

Parameter	Type	Description	Default
`defaultModel`	string	Default Gemini model for image generation	`gemini-2.5-flash-image`
`defaults.generate.aspectRatio`	string	Default aspect ratio	`1:1`
`defaults.generate.resolution`	string	Default resolution	`1K`
`defaults.process.removeBackground`	object	Default background removal settings	`{}`
`defaults.process.trim`	boolean	Default trim setting	`false`
`defaults.process.format`	string	Default output format	`png`
`defaults.process.quality`	number	Default quality (1-100)	`90`

Source: README.md

Development Setup

For contributing to the project or running from source:

1. Clone the Repository

git clone https://github.com/JimothySnicket/gemini-image-mcp.git
cd gemini-image-mcp

2. Install Dependencies

The project uses Bun as its package manager:

bun install

3. Build the Project

Compile TypeScript to JavaScript:

bun run build

This produces output in the dist/ directory.

4. Run in Development Mode

Execute directly from source using Bun:

bun run dev

5. Run the Compiled Version

After building, start the compiled server:

bun run start

Or with Node.js:

node dist/index.js

Source: CONTRIBUTING.md package.json

Rate Limiting Configuration

The server supports rate limiting to prevent runaway agents or excessive costs:

Environment Variable	Description	Example
`MAX_REQUESTS_PER_HOUR`	Maximum API requests per hour	`20`
`MAX_COST_PER_HOUR`	Maximum cost per hour in USD	`5`

Example sensible defaults for an agent loop:

export MAX_REQUESTS_PER_HOUR=20
export MAX_COST_PER_HOUR=5

Note: The server logs a warning at startup if no rate limits are configured.

Source: README.md

Supported Gemini Models

Model	Strengths	Supported Resolutions
`gemini-2.5-flash-image`	Fast, cheap (~$0.04/image)	1K only (scheduled for deprecation Oct 2026)
`gemini-3-pro-image-preview`	Best quality, text rendering	1K, 2K, 4K
`gemini-3.1-flash-image-preview`	Speed + quality balance, Google Search grounding	512, 1K, 2K, 4K

The server performs automatic model discovery at startup, detecting image-capable models available with your API key.

Source: README.md

Server Metadata

The server is registered with the MCP registry:

{
  "name": "io.github.JimothySnicket/gemini-image",
  "version": "0.4.0",
  "description": "Google Gemini image generation, editing, and local processing via MCP"
}

Source: server.json

Installation Flow Diagram

graph TD
    A[Start Installation] --> B{Have GEMINI_API_KEY?}
    B -->|No| C[Get API Key from Google AI Studio]
    B -->|Yes| D{Choose Installation Method}
    C --> D
    D -->|Global| E[npm install -g]
    D -->|Temporary| F[npx -y]
    D -->|Claude Code| G[claude mcp add command]
    D -->|Claude Desktop| H[Edit claude_desktop_config.json]
    D -->|Development| I[Clone repo + bun install]
    E --> J{Setup Config File?}
    F --> J
    G --> J
    H --> J
    I --> J
    J -->|Yes| K[Run --init or --init --local]
    J -->|No| L[Use Defaults]
    K --> M[Start Using MCP Server]
    L --> M

Verification Checklist

After installation, verify your setup by checking:

[ ] echo $GEMINI_API_KEY returns your API key
[ ] Server starts without errors
[ ] MCP client recognizes the gemini-image server
[ ] Test generate_image tool with a simple prompt
[ ] Rate limiting is configured (recommended for agent use)

Troubleshooting

API Key Not Found

If the server reports that GEMINI_API_KEY is not set:

Verify the environment variable is set: echo $GEMINI_API_KEY
Restart your terminal session
For Claude Desktop, ensure the env variable is set before starting the application

Model Not Available

If you receive a model not available error:

The server performs automatic model discovery at startup
Verify your API key has access to the requested model
Check Google AI Studio for model availability

Build Errors

If bun run build fails:

Ensure Bun is installed: bun --version
Clear node_modules and reinstall: rm -rf node_modules && bun install
Check TypeScript version compatibility

Source: CONTRIBUTING.md README.md

MCP Client Configuration

Overview

The gemini-image-mcp project provides an MCP (Model Context Protocol) server that enables AI-powered image generation and local image processing through Google Gemini. MCP Client Configuration encompasses all methods and mechanisms available to connect MCP-compatible clients to this server, pass required authentication credentials, and customize server behavior through environment variables or configuration files.

The server exposes two primary tools: generate_image for AI-powered image generation via the Gemini API, and process_image for local image manipulation using the sharp library. Both tools are accessible to any MCP-compatible client once the connection is established. Source: README.md:1-15

Architecture

graph TD
    A[MCP Client<br/>Claude Code / Claude Desktop] --> B[gemini-image-mcp Server]
    B --> C[Google Gemini API]
    B --> D[Local Processing<br/>sharp library]
    
    E[Environment Variables] --> B
    F[Config File<br/>~/.gemini-image-mcp.json] --> B
    G[Local Config<br/>.gemini-image-mcp.json] --> B
    
    H[GEMINI_API_KEY] --> E
    I[OUTPUT_DIR] --> E
    J[DEFAULT_MODEL] --> E

MCP Server Registration

The server registers with the MCP protocol using the official @modelcontextprotocol/sdk. Upon connection, clients receive metadata describing available tools and server capabilities. Source: src/index.ts:1-35

Server Identity

Property	Value
Server Name	`gemini-image-mcp`
Version	Dynamic from `package.json`
MCP Name	`io.github.JimothySnicket/gemini-image`
Transport	stdio
server.json Schema Version	2025-12-11

The server name and version are read dynamically from package.json at runtime, ensuring the MCP handshake always reports the correct version. Source: package.json:3-7

Server Instructions

The MCP server provides structured instructions to connecting clients describing the available tools and configuration hierarchy:

Gemini image generation and local image processing. Two tools: generate_image (AI-powered, costs money) 
and process_image (local via sharp, free). Configuration can be set via a JSON config file — run 
`npx @jimothy-snicket/gemini-image-mcp --init` to create ~/.gemini-image-mcp.json with commented defaults. 
A local .gemini-image-mcp.json in the project directory can override global settings. 
Priority: per-request params > env vars > local config > global config > defaults.

Source: src/index.ts:17-26

Environment Variables

Environment variables provide the primary mechanism for configuring the MCP server. They are read at server startup and apply globally to all requests.

Required Variables

Variable	Description	Required
`GEMINI_API_KEY`	Google Gemini API key from AI Studio	Yes

The GEMINI_API_KEY is mandatory. The server will fail to start without it, displaying a clear error message indicating the missing credential. Source: README.md:68-80

Optional Variables

Variable	Default	Description
`OUTPUT_DIR`	`~/gemini-images`	Default directory for saved images
`DEFAULT_MODEL`	`gemini-2.5-flash-image`	Default Gemini model
`LOG_LEVEL`	`info`	Log level: `debug`, `info`, or `error`
`REQUEST_TIMEOUT_MS`	`60000`	API request timeout in milliseconds
`SESSION_TIMEOUT_MS`	`1800000`	Multi-turn session expiry (30 minutes)
`MAX_REQUESTS_PER_HOUR`	`0`	Max image generations per rolling hour (0 = unlimited)
`MAX_COST_PER_HOUR`	`0`	Max estimated cost (USD) per rolling hour (0 = unlimited)

Source: src/config.ts:4-30

Setting the API Key

macOS / Linux:

echo 'export GEMINI_API_KEY="your-key-here"' >> ~/.bashrc
source ~/.bashrc

Windows (PowerShell):

[System.Environment]::SetEnvironmentVariable('GEMINI_API_KEY', 'your-key-here', 'User')

Source: README.md:75-88

Rate Limiting Configuration

Rate limiting is strongly recommended when agents have access to the generate_image tool, as an agent in a loop can generate images rapidly.

# Example: Limit to 20 requests or $5 per rolling hour
export MAX_REQUESTS_PER_HOUR=20
export MAX_COST_PER_HOUR=5

Source: README.md:166-170
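
A rolling-hour limit of this kind can be sketched as a list of timestamped events that is pruned on every check. The class below is an illustration under stated assumptions, not the server's implementation: RollingHourLimiter and tryRecord are hypothetical names, and treating 0 as unlimited follows the defaults in the Optional Variables table.

```typescript
const HOUR_MS = 60 * 60 * 1000;

class RollingHourLimiter {
  private events: Array<{ at: number; cost: number }> = [];
  private readonly maxRequests: number;
  private readonly maxCost: number;

  // A limit of 0 means "unlimited", mirroring the documented defaults.
  constructor(maxRequests: number, maxCost: number) {
    this.maxRequests = maxRequests;
    this.maxCost = maxCost;
  }

  // Returns true and records the request if it fits within both limits.
  tryRecord(cost: number, now: number): boolean {
    // Drop events that have left the rolling one-hour window.
    this.events = this.events.filter((e) => now - e.at < HOUR_MS);
    const spent = this.events.reduce((sum, e) => sum + e.cost, 0);
    if (this.maxRequests > 0 && this.events.length >= this.maxRequests) return false;
    if (this.maxCost > 0 && spent + cost > this.maxCost) return false;
    this.events.push({ at: now, cost });
    return true;
  }
}

// Equivalent of MAX_REQUESTS_PER_HOUR=20 and MAX_COST_PER_HOUR=5.
const limiter = new RollingHourLimiter(20, 5);
const allowed = limiter.tryRecord(0.04, Date.now());
```

Passing the current time in as an argument keeps the limiter deterministic and easy to test.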

Configuration Files

Beyond environment variables, the server supports persistent JSON configuration files with comments (JSONC format).

Config File Locations

Location	Purpose
`~/.gemini-image-mcp.json`	Global configuration for all projects
`.gemini-image-mcp.json`	Project-specific overrides

Source: src/config.ts:1-20

Configuration Priority

graph LR
    A[Per-request Parameters] --> Z[Highest Priority]
    B[Environment Variables] --> Y
    C[Local Config<br/>.gemini-image-mcp.json] --> X
    D[Global Config<br/>~/.gemini-image-mcp.json] --> W
    E[Built-in Defaults] --> V[Lowest Priority]
    
    style A fill:#90EE90
    style E fill:#FFB6C1

Priority order (highest to lowest):

Per-request tool parameters
Environment variables
Local config file (.gemini-image-mcp.json in project)
Global config file (~/.gemini-image-mcp.json)
Built-in defaults

Source: README.md:148-155
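
The priority order above amounts to layering settings so that a higher-priority source overwrites a lower one. The sketch below shows that idea with flat key/value layers; resolveSettings and Layer are hypothetical names, and the flat shape is a simplification of the nested config file.

```typescript
type Layer = Record<string, string | undefined>;

// Takes layers ordered from highest to lowest priority and returns the
// effective settings. Undefined values never override a lower layer.
function resolveSettings(layersHighToLow: Layer[]): Record<string, string> {
  const resolved: Record<string, string> = {};
  // Apply lowest priority first so that later (higher) layers win.
  for (const layer of [...layersHighToLow].reverse()) {
    for (const [key, value] of Object.entries(layer)) {
      if (value !== undefined) resolved[key] = value;
    }
  }
  return resolved;
}

const builtInDefaults: Layer = {
  defaultModel: "gemini-2.5-flash-image",
  aspectRatio: "1:1",
  resolution: "1K",
};
const globalConfig: Layer = { resolution: "2K" };
const localConfig: Layer = { aspectRatio: "16:9" };
const envVars: Layer = { defaultModel: "gemini-3-pro-image-preview" };
const perRequest: Layer = { resolution: "4K" };

// Effective result: model from the environment, aspect ratio from the local
// config, and resolution from the per-request parameter.
const settings = resolveSettings([perRequest, envVars, localConfig, globalConfig, builtInDefaults]);
```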

Initializing Config Files

Create a new config file with documented defaults:

# Global config
npx @jimothy-snicket/gemini-image-mcp --init

# Project-specific config
npx @jimothy-snicket/gemini-image-mcp --init --local

# Overwrite existing
npx @jimothy-snicket/gemini-image-mcp --init --force

Source: README.md:53-62

Config File Template

{
  // gemini-image-mcp configuration
  // Docs: https://github.com/JimothySnicket/gemini-image-mcp

  // Directory where generated/processed images are saved
  "outputDir": "~/gemini-images",

  // Default Gemini model for image generation
  // gemini-2.5-flash-image         — fast, ~$0.04/image, 1K only
  // gemini-3.1-flash-image-preview  — fast, ~$0.08/image, up to 4K
  // gemini-3-pro-image-preview      — best quality, ~$0.16/image, up to 4K
  "defaultModel": "gemini-2.5-flash-image",

  "logLevel": "info",
  "requestTimeout": 60000,
  "sessionTimeout": 1800000,
  "maxRequestsPerHour": 0,
  "maxCostPerHour": 0,

  "defaults": {
    "generate": {
      // "aspectRatio": "1:1",
      // "resolution": "1K"
    }
  }
}

Source: src/config.ts:1-35

Security Considerations

API keys are rejected from config files with a warning. This prevents accidental exposure when config files get committed to repositories. Source: CHANGELOG.md:45-50

The config system includes:

String-aware JSONC comment stripping (won't mangle URLs in quoted strings)
Prototype pollution guard on config deep merge
Unknown config keys warned and dropped
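
String-aware comment stripping means tracking whether the scanner is inside a quoted string, so that the // in a URL is not mistaken for a comment. The sketch below is one minimal way to do that and is not the project's implementation; stripJsonComments is a hypothetical name.

```typescript
// Removes // line comments and /* block */ comments from JSONC text,
// leaving anything inside double-quoted strings untouched.
function stripJsonComments(text: string): string {
  let out = "";
  let inString = false;
  let i = 0;
  while (i < text.length) {
    const ch = text.charAt(i);
    const next = text.charAt(i + 1);
    if (inString) {
      if (ch === "\\") {
        // Copy the escape pair verbatim so \" does not end the string.
        out += ch + next;
        i += 2;
        continue;
      }
      if (ch === '"') inString = false;
      out += ch;
      i += 1;
    } else if (ch === '"') {
      inString = true;
      out += ch;
      i += 1;
    } else if (ch === "/" && next === "/") {
      // Skip to the end of the line but keep the newline itself.
      while (i < text.length && text.charAt(i) !== "\n") i += 1;
    } else if (ch === "/" && next === "*") {
      const end = text.indexOf("*/", i + 2);
      i = end === -1 ? text.length : end + 2;
    } else {
      out += ch;
      i += 1;
    }
  }
  return out;
}

const sample =
  '{\n  // Docs: https://github.com/JimothySnicket/gemini-image-mcp\n' +
  '  "outputDir": "~/gemini-images", /* inline */ "maxCostPerHour": 0\n}';
const parsedConfig = JSON.parse(stripJsonComments(sample));
```

A regex-only approach would typically cut a quoted URL at its //, which is the failure this state machine avoids.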

MCP Client Setup Examples

Claude Code (One-Liner)

The simplest setup method using Claude Code's built-in MCP management:

claude mcp add gemini-image -- npx -y @jimothy-snicket/gemini-image-mcp

Claude Code automatically inherits GEMINI_API_KEY from the shell environment. Source: README.md:38-45

Claude Code (Manual Configuration)

For explicit control, add to .mcp.json in your project root or ~/.claude/.mcp.json for global access:

{
  "mcpServers": {
    "gemini-image": {
      "command": "npx",
      "args": ["-y", "@jimothy-snicket/gemini-image-mcp"],
      "env": {
        "GEMINI_API_KEY": "${GEMINI_API_KEY}"
      }
    }
  }
}

The ${GEMINI_API_KEY} syntax reads the value from your shell environment without storing the actual key in the config file. Source: README.md:95-110

Claude Desktop

Edit the Claude Desktop configuration file:

OS	Path
macOS	`~/Library/Application Support/Claude/claude_desktop_config.json`
Windows	`%APPDATA%\Claude\claude_desktop_config.json`

{
  "mcpServers": {
    "gemini-image": {
      "command": "npx",
      "args": ["-y", "@jimothy-snicket/gemini-image-mcp"],
      "env": {
        "GEMINI_API_KEY": "${GEMINI_API_KEY}"
      }
    }
  }
}

Source: README.md:111-130

Plugin-Based Configuration

For environments using the Claude plugin system, configure via plugin.json:

{
  "name": "gemini-image-mcp",
  "version": "0.2.0",
  "description": "Google Gemini image generation and editing via MCP",
  "mcpServers": {
    "gemini-image": {
      "command": "node",
      "args": ["${CLAUDE_PLUGIN_ROOT}/dist/index.js"],
      "env": {
        "GEMINI_API_KEY": "${GEMINI_API_KEY}"
      }
    }
  },
  "skills": ["skills/image-generation/SKILL.md"]
}

The ${CLAUDE_PLUGIN_ROOT} variable is replaced at runtime with the plugin installation directory. Source: plugin.json:1-16

Enhanced Security Setup

For environments requiring extra security, use a wrapper script that retrieves credentials from the OS keychain:

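#!/bin/bash
# Wrapper script example (macOS Keychain)
# Read the key from the keychain; -w prints only the stored password
API_KEY=$(security find-generic-password -s "GEMINI_API_KEY" -w)
# exec replaces the shell so the MCP client talks to the server's stdio directly
GEMINI_API_KEY="$API_KEY" exec node /path/to/gemini-image-mcp/dist/index.js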

Source: README.md:145-155

Server.json Schema

The MCP registry uses server.json to describe the server to compatible clients: its package, transport, and required environment variables.

{
  "$schema": "https://static.modelcontextprotocol.io/schemas/2025-12-11/server.schema.json",
  "name": "io.github.JimothySnicket/gemini-image",
  "description": "Google Gemini image generation, editing, and local processing via MCP",
  "repository": {
    "url": "https://github.com/JimothySnicket/gemini-image-mcp",
    "source": "github"
  },
  "version": "0.4.0",
  "packages": [
    {
      "registryType": "npm",
      "identifier": "@jimothy-snicket/gemini-image-mcp",
      "version": "0.4.0",
      "transport": {
        "type": "stdio"
      },
      "environmentVariables": [
        {
          "description": "Google Gemini API key from https://aistudio.google.com/apikey",
          "isRequired": true,
          "format": "string",
          "isSecret": true,
          "name": "GEMINI_API_KEY"
        }
      ]
    }
  ]
}

This schema allows MCP clients to automatically discover server requirements and display appropriate configuration prompts. Source: server.json:1-32

Verifying Configuration

Check Environment Variables

echo $GEMINI_API_KEY

A non-empty response confirms the variable is set. Source: README.md:89-91

Test Server Startup

npx @jimothy-snicket/gemini-image-mcp --help

Successful startup displays diagnostics including Node version, PID, working directory, API key status, default model, and output directory. Source: CHANGELOG.md:35-40

Tool Registration

The MCP server registers two tools with their input schemas:

generate_image Tool

server.registerTool(
  "generate_image",
  {
    title: "Generate Image",
    description: "Generate or edit images using Google Gemini...",
    inputSchema: {
      prompt: z.string().describe("Text description of the image..."),
      images: z.optional(z.array(z.string()).max(14)),
      model: z.optional(z.string()),
      aspectRatio: z.optional(z.enum(["1:1", "16:9", "9:16", "3:2", "2:3", "3:4", "4:3", "21:9"])),
      resolution: z.optional(z.enum(["1K", "2K", "4K"])),
      // ... additional parameters
    }
  }
);

Source: src/index.ts:37-60

process_image Tool

Local image processing via sharp, free and requires no API calls. Supports crop, resize, background removal, trim, and format conversion. Source: README.md:28-40

Development Setup

For local development of the MCP server:

bun install
bun run build     # TypeScript -> dist/
bun run dev       # Run directly with Bun

Requires GEMINI_API_KEY environment variable for testing image generation. Source: CONTRIBUTING.md:1-15

Summary

MCP Client Configuration for gemini-image-mcp supports multiple integration strategies:

Method	Best For
`claude mcp add`	Quick setup in Claude Code
`.mcp.json` manual	Explicit control, version control of config
Claude Desktop	Desktop Claude applications
Plugin system	Shared team configurations
Environment variables	CI/CD pipelines, containerized deployments
Config files	Persistent, documented defaults

The configuration system follows a clear priority hierarchy, with per-request parameters taking precedence over environment variables, which take precedence over local and global config files. Rate limiting is configurable to prevent runaway costs in agent-based workflows.

Source: https://github.com/JimothySnicket/gemini-image-mcp / Human Manual

generate_image Tool Reference

Related topics: process_image Tool Reference, Image Generation Internals, Cost Tracking and Rate Limiting

The generate_image tool is the AI-powered component of the gemini-image-mcp server. It uses Google Gemini's native image generation API (generateContent) to create and edit images from text prompts, with optional reference images for guidance. It covers text-to-image generation, iterative editing through multi-turn sessions, and image composition with up to 14 reference images (the limit depends on the model).

The tool is built on Gemini's native generateContent image generation rather than the deprecated Imagen-based services, which gives it access to features such as multi-turn conversation context and Google Search grounding.

Overview

Property	Value
Tool Name	`generate_image`
API Backend	Google Gemini `generateContent`
Cost	Per-request (see Pricing)
Free Tier	No (requires Gemini API key)
Transport	STDIO (MCP Protocol)

Source: src/index.ts:1-50

Supported Models

The tool discovers available image-capable models at startup by querying the Gemini API. Three models are documented and supported:

Model ID	Strengths	Resolution Support	Reference Images	Cost Tier
`gemini-2.5-flash-image`	Fast, affordable	1K only	Up to 5	~$0.04/image
`gemini-3-pro-image-preview`	Best quality, superior text rendering	1K, 2K, 4K	Up to 14	~$0.15/image
`gemini-3.1-flash-image-preview`	Speed/quality balance, Google Search grounding	512, 1K, 2K, 4K	Up to 14	~$0.08/image

The model auto-discovery mechanism filters models based on naming patterns to exclude deprecated Imagen-based services:

const IMAGE_MODEL_PATTERNS = ["image", "img"];
const EXCLUDED_PREFIXES = ["imagen"];

Source: src/generate.ts:1-50
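
The filtering described above can be sketched as follows. This is a minimal illustration, not the project's actual code: `isImageModel` is a hypothetical helper, and stripping a leading `models/` from the ID is an assumption about how model names are listed. Note that "imagen" itself contains the substring "image", which is why the exclusion has to run before the pattern match.

```typescript
// Patterns quoted from the section above.
const IMAGE_MODEL_PATTERNS = ["image", "img"];
const EXCLUDED_PREFIXES = ["imagen"];

// Hypothetical helper: the real matching logic in src/generate.ts may differ.
function isImageModel(modelId: string): boolean {
  // Assumption: listed names may carry a "models/" prefix.
  const id = modelId.toLowerCase().replace(/^models\//, "");
  // Exclusion must be checked first, because "imagen" also contains "image".
  if (EXCLUDED_PREFIXES.some((prefix) => id.startsWith(prefix))) return false;
  return IMAGE_MODEL_PATTERNS.some((pattern) => id.includes(pattern));
}

const discovered = [
  "models/gemini-2.5-flash-image",
  "models/imagen-4.0-generate-001",
  "models/gemini-2.5-flash",
];
const imageModels = discovered.filter(isImageModel);
console.log(imageModels);
```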

Parameters

The following table documents all parameters accepted by the generate_image tool:

Parameter	Type	Required	Default	Description
`prompt`	`string`	Yes	—	Text description or editing instruction
`images`	`string[]`	No	—	Array of file paths to input/reference images
`model`	`string`	No	Config `defaultModel`	Gemini model ID
`aspectRatio`	`string`	No	Config default	Output aspect ratio
`resolution`	`string`	No	Config default	Output resolution (1K, 2K, 4K)
`outputDir`	`string`	No	`~/gemini-images`	Override output directory
`filename`	`string`	No	Auto-generated	Base name with auto-versioning
`subfolder`	`string`	No	—	Subdirectory within output
`seed`	`integer`	No	Random	Reproducible generation seed
`sessionId`	`string`	No	—	Multi-turn session identifier
`useSearchGrounding`	`boolean`	No	`false`	Enable Google Search grounding

Supported Aspect Ratios

Ratio	Use Case
`1:1`	Square images, social posts, icons
`16:9`	Widescreen, hero banners, videos
`9:16`	Vertical stories, mobile content
`3:2`	Standard photography
`2:3`	Portrait photography
`4:3`	Classic aspect ratio
`3:4`	Portrait standard
`21:9`	Ultra-widescreen

Resolution Options

Resolution	Availability
`1K`	All models
`2K`	gemini-3-pro-image-preview, gemini-3.1-flash-image-preview
`4K`	gemini-3-pro-image-preview, gemini-3.1-flash-image-preview

Source: src/index.ts:50-150

Architecture

High-Level Flow

graph TD
    A[User Request: generate_image] --> B[Load Config & Validate Params]
    B --> C{images provided?}
    C -->|No| D[Text-to-Image Mode]
    C -->|Yes| E[Image Editing Mode]
    D --> F[Build Prompt Content]
    E --> G[Read Image Files as InlineData]
    G --> F
    F --> H{Model supports grounding?}
    H -->|Yes & enabled| I[Add Google Search Tool]
    H -->|No| J[Skip Grounding]
    I --> K[Call Gemini generateContent API]
    J --> K
    K --> L[Extract Generated Image]
    L --> M[Apply Filename & Subfolder Logic]
    M --> N[Save to Output Directory]
    N --> O[Log to generations.jsonl]
    O --> P[Return Response with Usage Report]

Multi-Turn Session Management

Multi-turn sessions enable iterative refinement of images by preserving conversation history across multiple requests:

graph LR
    A[Request 1: sessionId=abc123] --> B[Create New Session]
    B --> C[Generate Image]
    C --> D[Response: sessionId=abc123]
    D --> E[Request 2: sessionId=abc123]
    E --> F[Retrieve Existing Session]
    F --> G[Append to History]
    G --> H[Generate with Context]
    H --> D2[Updated Response]

Sessions are managed through an in-memory Map with automatic cleanup:

Setting	Value
Max conversation turns per session	10
Session timeout	30 minutes (1800000ms)
History storage	Array of `Content` objects

Source: src/generate.ts:50-150
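
A Map-based session store with the limits from the table above might look like the sketch below. The names (`recordTurn`, `Session`) are hypothetical, history entries are simplified to strings instead of `Content` objects, and throwing once the turn limit is reached is an assumption; the real server may handle a full session differently.

```typescript
interface Session {
  history: string[]; // stand-in for the array of Content objects
  lastUsed: number; // epoch milliseconds
}

const MAX_TURNS = 10;
const SESSION_TIMEOUT_MS = 1_800_000; // 30 minutes

const sessions = new Map<string, Session>();

// Hypothetical helper: appends one turn, expiring stale sessions first.
function recordTurn(id: string, entry: string, now: number): Session {
  // Automatic cleanup: drop every session idle for longer than the timeout.
  sessions.forEach((session, key) => {
    if (now - session.lastUsed > SESSION_TIMEOUT_MS) sessions.delete(key);
  });

  const session = sessions.get(id) ?? { history: [], lastUsed: now };
  if (session.history.length >= MAX_TURNS) {
    // Assumption: a full session is rejected rather than truncated.
    throw new Error(`Session ${id} reached the ${MAX_TURNS}-turn limit`);
  }
  session.history.push(entry);
  session.lastUsed = now;
  sessions.set(id, session);
  return session;
}
```

Passing `now` explicitly keeps the sketch deterministic; a server would typically use `Date.now()`.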

Image Input Processing

When reference images are provided, they are converted to Gemini's inline data format:

const MIME_TYPES: Record<string, string> = {
  ".png": "image/png",
  ".jpg": "image/jpeg",
  ".jpeg": "image/jpeg",
  ".webp": "image/webp",
  ".gif": "image/gif",
};

Constraint	Limit
Max image file size	50MB
Max reference images (gemini-3-pro)	14
Max reference images (gemini-2.5-flash)	5

Source: src/generate.ts:150-200
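
The checks implied by the MIME table and size limit can be sketched as a pure function. `describeImage` is a hypothetical name, the size is passed in rather than read from disk, and treating "50MB" as 50 × 1024 × 1024 bytes is an assumption.

```typescript
const MIME_TYPES: Record<string, string> = {
  ".png": "image/png",
  ".jpg": "image/jpeg",
  ".jpeg": "image/jpeg",
  ".webp": "image/webp",
  ".gif": "image/gif",
};

// Assumption: "50MB" is interpreted as 50 MiB.
const MAX_IMAGE_BYTES = 50 * 1024 * 1024;

// Hypothetical helper: validates a reference image before it is sent as inline data.
function describeImage(filePath: string, sizeBytes: number): { mimeType: string } {
  // Take the file name only, so dots in directory names (e.g. "./") are ignored.
  const lastSep = Math.max(filePath.lastIndexOf("/"), filePath.lastIndexOf("\\"));
  const base = filePath.slice(lastSep + 1);
  const dot = base.lastIndexOf(".");
  const ext = dot > 0 ? base.slice(dot).toLowerCase() : "";

  const mimeType = MIME_TYPES[ext];
  if (!mimeType) {
    throw new Error(`Unsupported image format "${ext}" for file: ${filePath}`);
  }
  if (sizeBytes > MAX_IMAGE_BYTES) {
    const mb = Math.round(sizeBytes / (1024 * 1024));
    throw new Error(`Image file is ${mb}MB, max is 50MB`);
  }
  return { mimeType };
}
```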

Google Search Grounding

Google Search grounding enhances generation accuracy by incorporating real-world information through the Gemini googleSearch tool. This feature is restricted to specific models:

export const GROUNDING_SUPPORTED_MODELS = ["gemini-3.1-flash-image-preview"];

export function validateGrounding(model: string, useSearchGrounding: boolean | undefined): void {
  if (useSearchGrounding && !GROUNDING_SUPPORTED_MODELS.includes(model)) {
    throw new Error(
      `useSearchGrounding is only supported on ${GROUNDING_SUPPORTED_MODELS.join(", ")}. ` +
        `You requested ${model}.`,
    );
  }
}

Supported Model: gemini-3.1-flash-image-preview

Attempting to enable grounding on other models results in a validation error. This restriction ensures users receive accurate error messages rather than silent failures from the API.

Source: src/generate.ts:1-50

Pricing and Cost Reporting

Every generate_image response includes detailed cost information through the UsageReport structure:

interface UsageReport {
  inputTokens: number;
  outputTokens: number;
  totalTokens: number;
  estimatedCostUsd: number;
}

interface SessionStats {
  totalGenerations: number;
  totalCostUsd: number;
  requestsThisHour: number;
  costThisHour: number;
}

Metric	Description
`inputTokens`	Tokens consumed by the prompt and reference images
`outputTokens`	Tokens in the API response (including image data)
`totalTokens`	Sum of input and output tokens
`estimatedCostUsd`	Calculated cost in US dollars
`totalGenerations`	Running count in current session
`totalCostUsd`	Cumulative cost for the session

The pricing module calculates costs based on model-specific rates. Rate limiting is available through configuration to prevent runaway costs:

Environment Variable	Purpose
`MAX_REQUESTS_PER_HOUR`	Request rate limit
`MAX_COST_PER_HOUR`	Cost threshold limit

Source: src/pricing.ts
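
To illustrate how hourly limits of this kind can be enforced, here is a minimal sketch. It assumes a sliding one-hour window over a log of past requests and treats 0 as "unlimited", as the configuration template does; `checkLimits` and the error wording are hypothetical, not the server's actual implementation.

```typescript
interface Limits {
  maxRequestsPerHour: number; // 0 = unlimited
  maxCostPerHour: number; // USD, 0 = unlimited
}

interface LoggedRequest {
  at: number; // epoch milliseconds
  costUsd: number;
}

const HOUR_MS = 3_600_000;

// Hypothetical helper: throws if another request would exceed a configured limit.
function checkLimits(log: LoggedRequest[], limits: Limits, now: number): void {
  // Assumption: a sliding window covering the last 60 minutes.
  const recent = log.filter((entry) => now - entry.at < HOUR_MS);
  const spent = recent.reduce((sum, entry) => sum + entry.costUsd, 0);

  if (limits.maxRequestsPerHour > 0 && recent.length >= limits.maxRequestsPerHour) {
    throw new Error(
      `Rate limit reached: ${recent.length}/${limits.maxRequestsPerHour} requests in the last hour`,
    );
  }
  if (limits.maxCostPerHour > 0 && spent >= limits.maxCostPerHour) {
    throw new Error(
      `Cost limit reached: $${spent.toFixed(2)} of $${limits.maxCostPerHour.toFixed(2)} spent in the last hour`,
    );
  }
}
```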

Configuration

The tool respects a layered configuration system with the following priority:

graph TD
    A[Priority 1: Per-Request Parameters] --> B[Priority 2: Environment Variables]
    B --> C[Priority 3: Local Config ./.gemini-image-mcp.json]
    C --> D[Priority 4: Global Config ~/.gemini-image-mcp.json]
    D --> E[Priority 5: Built-in Defaults]

Config File Structure

{
  "defaultModel": "gemini-2.5-flash-image",
  "defaults": {
    "generate": {
      "aspectRatio": "16:9",
      "resolution": "1K"
    }
  }
}

Source: src/config.ts
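
The priority chain amounts to taking the first source that defines a value. The sketch below shows that idea for a single setting; `resolveSetting` is a hypothetical helper and the sample values are made up for illustration.

```typescript
// Returns the first defined value, scanning sources from highest to lowest priority.
function resolveSetting<T>(...sources: Array<T | undefined>): T | undefined {
  return sources.find((value) => value !== undefined);
}

// Example: resolving the output directory for one request.
const outputDir = resolveSetting<string>(
  undefined, // 1. per-request parameter (not supplied)
  undefined, // 2. environment variable (not set)
  "./assets/generated", // 3. local config
  "~/pictures", // 4. global config
  "~/gemini-images", // 5. built-in default
);
console.log(outputDir);
```

Here the local config wins because neither the request nor the environment supplies a value.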

Output Organization

Generated images are saved under the requested filename, with a version suffix added when that name already exists:

Input `filename`	Existing Files	Output Saved As
`"hero"`	None	`hero.png`
`"hero"`	`hero.png` exists	`hero-v2.png`
`"hero"`	`hero.png`, `hero-v2.png` exist	`hero-v3.png`
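
The versioning rule in the table can be expressed as a small pure function. This is a sketch under the assumption that versions start at `-v2` and increase until a free name is found; `nextVersionedName` is a hypothetical helper, and the set of existing names stands in for a directory listing.

```typescript
// Hypothetical helper: picks the first unused name following the hero / hero-v2 / hero-v3 scheme.
function nextVersionedName(base: string, ext: string, existing: Set<string>): string {
  const first = `${base}.${ext}`;
  if (!existing.has(first)) return first;

  let version = 2;
  while (existing.has(`${base}-v${version}.${ext}`)) version++;
  return `${base}-v${version}.${ext}`;
}
```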

Subfolder Organization

Images can be organized into subdirectories using the subfolder parameter:

Parameters	Result
`filename: "hero"`, `subfolder: "landing-page"`	`~/gemini-images/landing-page/hero.png`

Generation Manifest

All generations are logged to generations.jsonl for audit and reproducibility:

{"timestamp":"2024-01-15T10:30:00Z","prompt":"A modern dashboard","model":"gemini-2.5-flash-image","aspectRatio":"16:9","resolution":"1K","cost":0.04,"path":"~/gemini-images/dashboard.png"}

Source: src/generate.ts:200-300
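
Because the manifest is JSON Lines, it can be audited with a few lines of code. The sketch below sums the `cost` field across entries; the field names are taken from the sample line above, and `totalCost` is a hypothetical helper that works on the file contents as a string.

```typescript
// Shape inferred from the sample manifest line; the real entries may carry more fields.
interface GenerationRecord {
  timestamp: string;
  prompt: string;
  model: string;
  cost: number;
  path: string;
}

// Hypothetical helper: sums the recorded cost of every non-empty line.
function totalCost(jsonl: string): number {
  return jsonl
    .split("\n")
    .filter((line) => line.trim() !== "")
    .map((line) => JSON.parse(line) as GenerationRecord)
    .reduce((sum, record) => sum + record.cost, 0);
}
```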

Usage Examples

Text-to-Image Generation

{
  "prompt": "A modern dashboard UI with dark theme and blue accent colours",
  "aspectRatio": "16:9",
  "resolution": "2K",
  "filename": "dashboard-hero",
  "subfolder": "landing-page"
}

Image Editing with Reference

{
  "prompt": "Change the background to a sunset over water",
  "images": ["./src/assets/hero.png"],
  "aspectRatio": "16:9"
}

Multi-Turn Editing Follow-Up

{
  "prompt": "Make the colours warmer and add more contrast",
  "sessionId": "session-1711929600000-a1b2c3"
}

Reproducible Generation with Seed

{
  "prompt": "A photorealistic mountain landscape",
  "seed": 42,
  "aspectRatio": "16:9"
}

Google Search Grounding

{
  "prompt": "Current design trends for AI product landing pages",
  "model": "gemini-3.1-flash-image-preview",
  "useSearchGrounding": true
}

Error Handling

The tool provides specific error messages for common failure scenarios:

Error Condition	Message
Invalid API key	`Failed to list models (is your API key valid?)`
Unsupported image format	`Unsupported image format ".bmp" for file: path/to/image.bmp`
Image too large	`Image file is 52MB, max is 50MB`
Grounding on unsupported model	`useSearchGrounding is only supported on gemini-3.1-flash-image-preview`
Rate limit exceeded	Clear error with remaining budget information
Session model mismatch	Error if session uses different model than original

Source: src/generate.ts:100-200

Dependencies

Package	Version	Purpose
`@google/genai`	^1.44.0	Gemini API client
`@modelcontextprotocol/sdk`	^1.22.0	MCP protocol implementation
`zod`	^3.24.0	Schema validation
`sharp`	^0.34.5	Image processing (for output)

Source: package.json

process_image Tool Reference

Related topics: generate_image Tool Reference, Image Processing Internals

Overview

The process_image tool is a local image processing utility within the gemini-image-mcp MCP server. It leverages the sharp library to perform CPU-bound image transformations without making any API calls, making it completely free to use. Source: package.json:14

Unlike the AI-powered generate_image tool which sends requests to Google's Gemini API and incurs costs per operation, process_image operates entirely on the local machine. This creates an efficient two-tool workflow where AI generation can be followed by local processing at zero additional cost. Source: README.md:features

Architecture

Tool Registration Flow

The process_image tool is registered with the MCP server using the @modelcontextprotocol/sdk framework. The tool definition includes Zod schemas for parameter validation and a handler function that orchestrates the processing pipeline. Source: src/index.ts:tool-registration

graph TD
    A[MCP Client Request] --> B[index.ts Tool Handler]
    B --> C[Load Configuration]
    C --> D[processImage Function]
    D --> E[sharp Operations Pipeline]
    E --> F[Output File System]
    D --> G[Return JSON Result]
    
    H[Config Sources] --> C
    H --> I[Environment Variables]
    H --> J[Local Config .json]
    H --> K[Global Config .json]
    H --> L[Default Values]

Processing Pipeline

The tool chains multiple operations into a single execution call. Operations are applied in a logical order: background removal first, then cropping, resizing, trimming, and finally format conversion. Source: src/index.ts:75-89

graph LR
    A[Input Image] --> B[Background Removal]
    B --> C[Crop]
    C --> D[Resize]
    D --> E[Trim]
    E --> F[Format Conversion]
    F --> G[Saved Output]

Tool Parameters

Required Parameters

Parameter	Type	Description
`imagePath`	string	Path to the image file to process

Source: src/index.ts:48-50

Optional Parameters

Parameter	Type	Default	Description
`crop`	CropConfig	none	Crop by pixel dimensions, aspect ratio, or focal point strategy
`resize`	ResizeConfig	none	Resize to width/height (maintains aspect ratio)
`removeBackground`	RemoveBackgroundConfig	config default	Remove background by threshold or chroma key
`trim`	boolean	config default	Auto-remove whitespace/transparent borders
`format`	"png" | "jpeg" | "webp"	original	Convert to specified format
`quality`	number (1-100)	90	Output quality for JPEG/WebP
`outputDir`	string	~/gemini-images	Directory to save output
`filename`	string	auto	Base name for saved file with auto-versioning
`subfolder`	string	none	Subfolder within output directory

Source: src/index.ts:51-79

Operations

Crop

The crop operation supports three distinct modes for targeting specific regions of an image.

Pixel-Exact Crop

{
  "width": 500,
  "height": 300,
  "left": 100,
  "top": 50
}

Aspect Ratio Crop

{
  "aspectRatio": "16:9",
  "strategy": "center"  // or "attention" or "entropy"
}

Supported Aspect Ratios

Ratio	Use Case
`1:1`	Square images, avatars
`16:9`	Hero banners, video thumbnails
`9:16`	Mobile stories, vertical content
`3:2`	Standard photography
`2:3`	Portrait photography
`4:3`	Classic monitors
`3:4`	Portrait prints
`21:9`	Ultrawide displays

Source: README.md:aspect-ratios

Focal Point Strategies

Strategy	Behavior
`center`	Default. Crops from the center of the image
`attention`	Shifts crop toward the most visually interesting region based on saliency detection
`entropy`	Shifts crop toward the region with the most visual detail (high information entropy)

Source: skills/image-generation/SKILL.md:crop
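
For the `center` strategy, the crop rectangle follows directly from the source dimensions and the target ratio. The sketch below shows that arithmetic only; `centerCrop` is a hypothetical helper, and the `attention` and `entropy` strategies (which depend on image content) are not modelled here.

```typescript
interface Rect {
  left: number;
  top: number;
  width: number;
  height: number;
}

// Hypothetical helper: largest centered rectangle with the given "W:H" ratio.
function centerCrop(srcW: number, srcH: number, aspectRatio: string): Rect {
  const parts = aspectRatio.split(":").map(Number);
  const rw = parts[0] ?? NaN;
  const rh = parts[1] ?? NaN;
  if (!(rw > 0 && rh > 0)) throw new Error(`Invalid aspect ratio: ${aspectRatio}`);

  let width = srcW;
  let height = srcH;
  if (srcW * rh > srcH * rw) {
    // Source is wider than the target ratio: keep full height, trim the sides.
    width = Math.round((srcH * rw) / rh);
  } else {
    // Source is taller than (or equal to) the target ratio: keep full width.
    height = Math.round((srcW * rh) / rw);
  }
  return {
    left: Math.floor((srcW - width) / 2),
    top: Math.floor((srcH - height) / 2),
    width,
    height,
  };
}
```

For a 1920×1200 source and `16:9`, this yields a 1920×1080 region offset 60 pixels from the top.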

Resize

The resize operation maintains aspect ratio when only one dimension is specified.

{
  "width": 1200        // Auto-calculate height
}
// OR
{
  "height": 800        // Auto-calculate width
}
// OR
{
  "width": 1200,
  "height": 800        // Both specified
}

When both width and height are provided, the resize operation respects the crop configuration if present. Source: src/index.ts:58-62
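
The single-dimension case is simple proportional scaling. The sketch below computes the output size for each of the three request shapes; `resolveSize` is a hypothetical helper, and returning both values unchanged when both are given is an assumption (the actual fit behaviour is decided by the server and sharp).

```typescript
interface Size {
  width: number;
  height: number;
}

// Hypothetical helper: derives the missing dimension so the aspect ratio is preserved.
function resolveSize(srcW: number, srcH: number, target: Partial<Size>): Size {
  const { width, height } = target;
  if (width !== undefined && height !== undefined) return { width, height };
  if (width !== undefined) return { width, height: Math.round((srcH * width) / srcW) };
  if (height !== undefined) return { width: Math.round((srcW * height) / srcH), height };
  return { width: srcW, height: srcH }; // nothing requested: keep the original size
}
```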

Background Removal

Two distinct algorithms handle background removal depending on the background type.

Threshold-Based (White Backgrounds)

{
  "threshold": 230
}

The threshold parameter specifies the brightness level (0-255) above which pixels are considered background. Values closer to 255 restrict removal to near-pure-white pixels. This method works well for studio product shots on white backdrops. Source: README.md:background-removal
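
As a rough illustration of threshold keying on a white backdrop, the sketch below makes a pixel transparent when all three colour channels are at or above the threshold. Both the brightness measure (the darkest channel) and the comparison direction are assumptions chosen to suit white backgrounds; `removeLightBackground` is a hypothetical helper operating on raw RGBA bytes, not the server's actual algorithm.

```typescript
// Hypothetical helper: input is raw RGBA data, 4 bytes per pixel.
function removeLightBackground(rgba: Uint8Array, threshold: number): Uint8Array {
  const out = new Uint8Array(rgba); // copy so the input is left untouched
  for (let i = 0; i + 3 < out.length; i += 4) {
    // Assumption: a pixel counts as background when its darkest channel is still bright.
    const darkest = Math.min(out[i], out[i + 1], out[i + 2]);
    if (darkest >= threshold) out[i + 3] = 0; // fully transparent
  }
  return out;
}
```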

Chroma Key (Green Screen / Any Solid Colour)

{
  "color": "#00FF00"    // Any hex colour
}

The chroma key pipeline performs HSV-based colour keying with advanced compositing techniques:

Stage	Description
HSV Keying	Converts to HSV colour space for colour-based selection
Smoothstep Feather	Softens the edges using smoothstep interpolation
Spill Suppression	Removes colour contamination from the subject
Edge Anti-Aliasing	5-pass 3x3 kernel smoothing

Source: README.md:chroma-key
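
The first two stages can be illustrated with the standard smoothstep function applied to hue distance. This is a simplified sketch: it keys on hue only (a fuller implementation would likely also weigh saturation and value), and the `inner`/`outer` feather widths and the `keyAlpha` helper are hypothetical values and names chosen for illustration.

```typescript
// Standard smoothstep: 0 below edge0, 1 above edge1, smooth cubic in between.
function smoothstep(edge0: number, edge1: number, x: number): number {
  const t = Math.min(1, Math.max(0, (x - edge0) / (edge1 - edge0)));
  return t * t * (3 - 2 * t);
}

// Hue in degrees [0, 360) from 8-bit RGB (the H component of HSV).
function hueOf(r: number, g: number, b: number): number {
  const max = Math.max(r, g, b);
  const delta = max - Math.min(r, g, b);
  if (delta === 0) return 0; // grey: hue is undefined, report 0
  let h: number;
  if (max === r) h = ((g - b) / delta) % 6;
  else if (max === g) h = (b - r) / delta + 2;
  else h = (r - g) / delta + 4;
  return (h * 60 + 360) % 360;
}

// Alpha in [0, 1]: 0 = key colour (transparent), 1 = subject (opaque).
// The feather widths are illustrative, not the server's values.
function keyAlpha(pixelHue: number, keyHue: number, inner = 20, outer = 40): number {
  const diff = Math.abs(pixelHue - keyHue);
  const distance = Math.min(diff, 360 - diff); // shortest way around the hue circle
  return smoothstep(inner, outer, distance);
}
```

Pixels whose hue lies within `inner` degrees of the key become fully transparent, pixels beyond `outer` stay opaque, and the band in between is feathered.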

Trim

The trim operation automatically removes whitespace and transparent borders from images. This is particularly useful after background removal when residual padding remains around the subject. Source: README.md:trim

{
  "trim": true
}

Format Conversion

The format parameter converts images to different output formats with quality control.

Format	Quality Range	Default	Description
`png`	N/A (lossless)	-	Portable Network Graphics
`jpeg`	1-100	90	Joint Photographic Experts Group
`webp`	1-100	90	Web Picture format

Source: src/index.ts:63-67

Output Organization

Filename Auto-Versioning

When a filename already exists in the output directory, the tool automatically versions the filename:

hero.png (first save)
hero-v2.png (second save)
hero-v3.png (third save)

This prevents overwriting existing files and maintains a clear history of processed images. Source: README.md:filename

Subfolder Organization

The subfolder parameter creates organized directory structures within the output directory:

Parameter	Example	Result
`filename` only	`"hero"`	`~/gemini-images/hero.png`
`subfolder` only	`"landing-page"`	`~/gemini-images/landing-page/original.png`
Both	`"hero"` + `"landing-page"`	`~/gemini-images/landing-page/hero.png`

Source: README.md:output

Configuration Defaults

The tool reads default values from multiple configuration sources with the following priority:

graph TD
    A[Priority Order] --> B[1. Per-Request Parameters]
    B --> C[2. Environment Variables]
    C --> D[3. Local Config .json]
    D --> E[4. Global Config .json]
    E --> F[5. Hardcoded Defaults]

Configuration File Structure

Create a config file using:

npx @jimothy-snicket/gemini-image-mcp --init

This creates ~/.gemini-image-mcp.json with commented defaults. For project-specific overrides:

npx @jimothy-snicket/gemini-image-mcp --init --local

Source: README.md:config-file

Default Process Settings

{
  "defaults": {
    "process": {
      "removeBackground": { "color": "#00FF00" },
      "trim": true,
      "format": "png"
    }
  }
}

Source: README.md:config-defaults

Common Pipelines

Favicon from Logo

Remove the white background, trim whitespace, and resize to favicon dimensions in a single pipeline:

{
  "imagePath": "./logo.png",
  "removeBackground": {"threshold": 230},
  "trim": true,
  "resize": {"width": 192, "height": 192},
  "filename": "favicon-192"
}

Source: skills/image-generation/SKILL.md:favicon

Social Card

Crop to a 16:9 aspect ratio using the attention-based focal point strategy and resize to a standard social card width:

{
  "imagePath": "./photo.png",
  "crop": {"aspectRatio": "16:9", "strategy": "attention"},
  "resize": {"width": 1200},
  "filename": "hero-banner"
}

Source: skills/image-generation/SKILL.md:social-card

WebP Conversion for Web

Convert an existing PNG to WebP format with optimized quality for web delivery:

{
  "imagePath": "./image.png",
  "format": "webp",
  "quality": 85,
  "filename": "optimized"
}

Source: README.md:webp

Transparent Asset from Green Screen

Generate an image on a green background, then remove it locally:

Step 1: Generate on green screen

{
  "prompt": "A product photo on a bright green background",
  "filename": "product-green"
}

Step 2: Remove green background

{
  "imagePath": "./product-green.png",
  "removeBackground": {"color": "#00FF00"},
  "trim": true,
  "filename": "product-transparent"
}

This two-step approach works best for high-contrast subjects (dark, red, blue, or white on green). Always key on #00FF00: it handles Gemini's actual green shade more reliably than trying to match the rendered colour precisely. Source: README.md:green-screen

Subject on Specific Background (Canvas Approach)

For yellow, green, or glass/reflective subjects where chroma key struggles, use the AI-powered canvas approach:

{
  "prompt": "Place a yellow rubber duck on this background. Product photography, studio lighting, centered.",
  "images": ["./canvas-white.png"],
  "filename": "duck-on-white"
}

This technique generates the subject with correct lighting and shadows for the specific background in a single API call. Source: skills/image-generation/SKILL.md:canvas

Return Value

The tool returns a JSON object containing the processing results:

{
  "content": [
    {
      "type": "text",
      "text": JSON.stringify({
        input: {
          path: string,
          operations: string[]
        },
        output: {
          path: string,
          format: string,
          dimensions: { width: number, height: number },
          size: number
        }
      }, null, 2)
    }
  ]
}

Source: src/index.ts:80-95
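
On the client side, the payload is a JSON string inside the first text content item. A typed reader might look like the sketch below; the interfaces mirror the shape shown above, while `parseProcessResult` is a hypothetical helper and `size` is assumed to be a byte count.

```typescript
interface ProcessResult {
  input: { path: string; operations: string[] };
  output: {
    path: string;
    format: string;
    dimensions: { width: number; height: number };
    size: number; // assumed to be bytes
  };
}

interface ToolResponse {
  content: Array<{ type: string; text: string }>;
}

// Hypothetical helper: extracts and parses the JSON payload from a tool response.
function parseProcessResult(response: ToolResponse): ProcessResult {
  const first = response.content.find((item) => item.type === "text");
  if (!first) throw new Error("No text content in tool response");
  return JSON.parse(first.text) as ProcessResult;
}
```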

Error Handling

The tool wraps all operations in try-catch blocks to provide meaningful error messages:

try {
  // ... run the processing pipeline and build the result ...
} catch (err) {
  const message = err instanceof Error ? err.message : String(err);
  log.error("process_image failed:", message);
  // Returns error to MCP client
}

Common error scenarios include unsupported image formats and file access issues. Source: src/index.ts:96-100

Limitations

Maximum input image size: 50MB (enforced by file stat check)
Supported input formats depend on sharp library capabilities
Output format availability depends on sharp library compilation options
Processing is single-threaded per operation; large images may take longer to process
No GPU acceleration; all processing uses CPU

Source: src/process.ts:file-validation (inferred from README file size documentation)

Configuration Guide

Related topics: Installation Guide, Cost Tracking and Rate Limiting

The gemini-image-mcp server provides a centralized configuration system that allows users to customize all aspects of image generation and processing. The configuration system replaces scattered environment variables with a unified approach using JSON config files with JSONC support (JSON with comments).

Overview

The configuration system serves as the single source of truth for all server settings. Instead of reading from process.env directly throughout the codebase, all modules now read settings from the centralized config, ensuring consistency and maintainability. Source: CHANGELOG.md

Key Features

Feature	Description
JSONC Support	JSON with comments for inline documentation
Hierarchical Priority	Env vars → Local config → Global config → Defaults
Deep Merge	Nested configuration objects merge correctly
Config Caching	Configuration is cached after first load
Security Guards	API key rejection, prototype pollution protection
Validation	Whitelist of known keys, unknown keys warned

Configuration Priority

The system follows a clear hierarchy where more specific configurations override more general ones:

graph TD
    A[Request Parameters<br/>Override Everything] --> C[Environment Variables]
    C --> D[Local Config ./.gemini-image-mcp.json]
    D --> E[Global Config ~/.gemini-image-mcp.json]
    E --> F[Hardcoded Defaults]
    
    style A fill:#90EE90
    style F fill:#FFE4B5

Environment variables take precedence over all config files. If a setting exists in both an env var and a config file, the env var wins. Source: README.md

Configuration File Format

The configuration file uses JSONC (JSON with Comments) format, allowing inline documentation and making it easy to understand each setting.

Default Configuration Template

{
  // gemini-image-mcp configuration
  // Docs: https://github.com/JimothySnicket/gemini-image-mcp

  // Directory where generated/processed images are saved
  // Supports ~ for home directory
  "outputDir": "~/gemini-images",

  // Default Gemini model for image generation
  // gemini-2.5-flash-image         — fast, ~$0.04/image, 1K only (deprecates Oct 2026)
  // gemini-3.1-flash-image-preview  — fast, ~$0.08/image, up to 4K, search grounding
  // gemini-3-pro-image-preview      — best quality, ~$0.16/image, up to 4K, 14 ref images
  "defaultModel": "gemini-2.5-flash-image",

  // Log level: "debug", "info", or "error"
  "logLevel": "info",

  // Timeout for a single API request (ms)
  "requestTimeout": 60000,

  // Timeout for multi-turn editing sessions (ms)
  "sessionTimeout": 1800000,

  // Rate limiting (0 = unlimited)
  "maxRequestsPerHour": 0,
  "maxCostPerHour": 0,

  // Per-tool default parameters
  "defaults": {
    "generate": {
      // "aspectRatio": "1:1",
      // "resolution": "1K"
    },
    "process": {
      // "removeBackground": { "color": "#00FF00" },
      // "trim": true
    }
  }
}

Source: src/config.template.ts

Configuration Options

Top-Level Settings

Option	Type	Default	Description
`outputDir`	string	`~/gemini-images`	Directory for saved images. Supports `~` for home directory
`defaultModel`	string	`gemini-2.5-flash-image`	Default Gemini model for image generation
`logLevel`	string	`info`	Log verbosity: `debug`, `info`, or `error`
`requestTimeout`	number	`60000`	Timeout for a single API request in milliseconds
`sessionTimeout`	number	`1800000`	Timeout for multi-turn editing sessions in milliseconds
`maxRequestsPerHour`	number	`0`	Rate limit: max requests per hour (0 = unlimited)
`maxCostPerHour`	number	`0`	Rate limit: max cost per hour in USD (0 = unlimited)

Per-Tool Defaults

The defaults object allows setting default parameters for each tool:

#### Generate Tool Defaults

Option	Type	Description
`defaults.generate.aspectRatio`	string	Default aspect ratio: `1:1`, `16:9`, `9:16`, `3:2`, `2:3`, `4:3`, `3:4`, `21:9`
`defaults.generate.resolution`	string	Default resolution: `1K`, `2K`, `4K`

#### Process Tool Defaults

Option	Type	Description
`defaults.process.removeBackground`	object	Default background removal settings
`defaults.process.trim`	boolean	Default trim setting
`defaults.process.format`	string	Default output format: `png`, `jpeg`, `webp`
`defaults.process.quality`	number	Default quality (1-100)

Creating a Configuration File

Automatic Initialization

The easiest way to create a configuration file is using the --init flag:

npx @jimothy-snicket/gemini-image-mcp --init

This creates ~/.gemini-image-mcp.json with all defaults and inline documentation.

Local Configuration

To create a local configuration file in the current working directory:

npx @jimothy-snicket/gemini-image-mcp --init --local

This creates .gemini-image-mcp.json in the CWD, which takes precedence over the global config.

Deep Merge Behavior

The configuration system uses deep merging for nested objects. This means you can specify only the settings you want to change, and the rest will inherit from the underlying defaults. Source: CHANGELOG.md

Example: Partial Configuration

{
  "logLevel": "debug",
  "defaults": {
    "generate": {
      "aspectRatio": "16:9"
    }
  }
}

This configuration only overrides logLevel and the generate.aspectRatio, while all other settings retain their defaults.

Security Features

API Key Protection

The configuration system explicitly rejects API keys found in config files with a warning:

[config] WARNING: "apiKey" found in ~/.gemini-image-mcp.json — API keys must not be in config files. Stripped.

This prevents accidental commits of API credentials to repositories. Source: CHANGELOG.md

Prototype Pollution Guard

The deep merge implementation protects against prototype pollution attacks by explicitly blocking dangerous keys:

__proto__
constructor
prototype

If any of these keys are encountered during config merging, they are silently ignored. Source: CHANGELOG.md
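
A deep merge with this kind of guard can be sketched as follows. It is an illustrative implementation, not the project's `deepMerge`: arrays and primitives from the override simply replace the base value, and the blocked keys are skipped without any warning.

```typescript
type PlainObject = Record<string, unknown>;

const BLOCKED_KEYS = new Set(["__proto__", "constructor", "prototype"]);

function isPlainObject(value: unknown): value is PlainObject {
  return typeof value === "object" && value !== null && !Array.isArray(value);
}

// Sketch: recursively merges `override` on top of `base` without mutating either.
function deepMerge(base: PlainObject, override: PlainObject): PlainObject {
  const result: PlainObject = { ...base };
  for (const key of Object.keys(override)) {
    if (BLOCKED_KEYS.has(key)) continue; // silently ignore dangerous keys
    const current = result[key];
    const incoming = override[key];
    result[key] =
      isPlainObject(current) && isPlainObject(incoming)
        ? deepMerge(current, incoming) // nested objects merge key by key
        : incoming; // everything else is replaced
  }
  return result;
}
```

Merging a partial config that sets only `defaults.generate.aspectRatio` leaves sibling keys such as `defaults.generate.resolution` intact.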

Unknown Key Warnings

The system maintains a whitelist of known configuration keys. If an unknown key is found in a config file, a warning is logged and the key is dropped:

[config] WARNING: unknown key "someUnknownKey" in ~/.gemini-image-mcp.json — ignored.

This prevents unexpected data injection and helps users catch typos in configuration. Source: CHANGELOG.md
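
The two key checks can be combined in one pass over the top-level keys. The sketch below uses the key names from the configuration template and the `/api.?key/i` pattern listed under Validation Rules; `sanitizeConfig`, its `warn` callback, and the exact warning wording are hypothetical.

```typescript
// Top-level keys taken from the default configuration template.
const KNOWN_KEYS = new Set([
  "outputDir",
  "defaultModel",
  "logLevel",
  "requestTimeout",
  "sessionTimeout",
  "maxRequestsPerHour",
  "maxCostPerHour",
  "defaults",
]);

const API_KEY_PATTERN = /api.?key/i;

// Hypothetical helper: returns a copy containing only known, non-secret keys.
function sanitizeConfig(
  raw: Record<string, unknown>,
  source: string,
  warn: (message: string) => void,
): Record<string, unknown> {
  const clean: Record<string, unknown> = {};
  for (const key of Object.keys(raw)) {
    if (API_KEY_PATTERN.test(key)) {
      warn(`[config] WARNING: "${key}" found in ${source}; API keys must not be in config files. Stripped.`);
      continue;
    }
    if (!KNOWN_KEYS.has(key)) {
      warn(`[config] WARNING: unknown key "${key}" in ${source}; ignored.`);
      continue;
    }
    clean[key] = raw[key];
  }
  return clean;
}
```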

JSONC Parsing

The configuration system supports JSONC (JSON with Comments), which extends standard JSON with:

Single-line comments: // comment
Multi-line comments: /* comment */

String-Aware Comment Stripping

The JSONC parser is string-aware, meaning it won't mangle URLs or other quoted strings that contain comment-like patterns. For example:

{
  // This is a comment
  "url": "https://example.com/api?query=1&filter=//something",
  "note": "Use // for comments in code"
}

Will correctly parse without affecting the URLs or notes. Source: CHANGELOG.md

Trailing Comma Handling

The parser automatically strips trailing commas left by commented-out lines, preventing parse failures. Source: CHANGELOG.md
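
A string-aware stripper can be written as a small state machine that tracks whether the cursor is inside a quoted string. The sketch below is one possible approach, not the project's `stripJsoncComments`: it removes `//` and `/* */` comments outside strings and drops a comma that directly precedes a closing `}` or `]`.

```typescript
// Sketch: returns plain JSON text that JSON.parse can consume.
function stripJsonc(text: string): string {
  let out = "";
  let inString = false;
  let i = 0;
  while (i < text.length) {
    const ch = text[i];
    const next = text[i + 1];
    if (inString) {
      out += ch;
      if (ch === "\\" && next !== undefined) {
        out += next; // keep escaped characters (e.g. \") verbatim
        i += 2;
        continue;
      }
      if (ch === '"') inString = false;
      i++;
    } else if (ch === '"') {
      inString = true;
      out += ch;
      i++;
    } else if (ch === "/" && next === "/") {
      while (i < text.length && text[i] !== "\n") i++; // skip to end of line
    } else if (ch === "/" && next === "*") {
      const end = text.indexOf("*/", i + 2);
      i = end === -1 ? text.length : end + 2; // skip the whole block comment
    } else if (ch === "}" || ch === "]") {
      // We are outside a string here, so a trailing comma in `out` is structural.
      out = out.replace(/,\s*$/, "") + ch;
      i++;
    } else {
      out += ch;
      i++;
    }
  }
  return out;
}
```

Because comments are only recognised outside strings, a value such as `"https://example.com"` passes through unchanged.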

Configuration Caching

After the configuration is loaded for the first time, it is cached in memory. Subsequent calls to loadConfig() return the cached value immediately without re-reading files. Source: src/config.ts

Cache Invalidation

No cache-invalidation mechanism is documented here. To force a reload of the configuration (useful during development), restart the server process.

Environment Variables

While the config file is the recommended approach, the system still supports environment variables for backward compatibility:

Environment Variable	Description
`GEMINI_API_KEY`	Google Gemini API key (required)
`OUTPUT_DIR`	Override output directory

Environment variables always take precedence over config file values. Source: README.md

Per-Request Overrides

Configuration defaults can be overridden on a per-request basis. Per-request parameters always override config defaults. Source: README.md

Example: Per-Tool Defaults with Overrides

{
  "defaultModel": "gemini-3.1-flash-image-preview",
  "defaults": {
    "generate": {
      "aspectRatio": "16:9",
      "resolution": "2K"
    },
    "process": {
      "removeBackground": { "color": "#00FF00" },
      "trim": true
    }
  }
}

With this configuration:

All generate_image calls use 16:9 aspect ratio and 2K resolution by default
All process_image calls auto-remove green backgrounds and trim by default
Any individual request can override these by specifying different values

Programmatic Usage

The configuration module exports several functions for use within the codebase:

import { loadConfig, initConfig, CONFIG_TEMPLATE } from './config';

Function	Description
`loadConfig()`	Load and return the merged configuration (uses cache)
`initConfig(options?)`	Create a new config file interactively
`CONFIG_TEMPLATE`	The default configuration template as a string

Source: src/config.ts

File Locations

The system looks for configuration files in two locations:

Current Working Directory: ./.gemini-image-mcp.json
Home Directory: ~/.gemini-image-mcp.json

Consistent with the configuration priority described earlier, settings from the local file override those from the global file, and both are merged on top of the defaults.

Configuration Schema

The complete configuration schema includes:

interface GeminiImageConfig {
  outputDir: string;
  defaultModel: string;
  logLevel: 'debug' | 'info' | 'error';
  requestTimeout: number;
  sessionTimeout: number;
  maxRequestsPerHour: number;
  maxCostPerHour: number;
  defaults: {
    generate?: {
      aspectRatio?: string;
      resolution?: string;
    };
    process?: {
      removeBackground?: object;
      trim?: boolean;
      format?: 'png' | 'jpeg' | 'webp';
      quality?: number;
    };
  };
}

Validation Rules

The configuration loader performs the following validations:

Rule	Behavior
API key detection	Strips any key matching `/api.?key/i`, logs warning
Unknown keys	Drops unknown keys, logs warning
Prototype pollution	Silently skips `__proto__`, `constructor`, `prototype`
JSONC syntax	Parses comments, strips trailing commas
File existence	Returns `null` if file doesn't exist
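
The first three rules in the table can be sketched roughly as follows. This is an assumption-laden illustration: `sanitizeConfig`, `KNOWN_KEYS`, and the warning texts are hypothetical, and the allow-list simply mirrors the top-level keys of the schema shown earlier.

```typescript
// Hypothetical allow-list mirroring the top-level keys of GeminiImageConfig.
const KNOWN_KEYS = new Set([
  "outputDir", "defaultModel", "logLevel", "requestTimeout",
  "sessionTimeout", "maxRequestsPerHour", "maxCostPerHour", "defaults",
]);
const API_KEY_PATTERN = /api.?key/i;
const UNSAFE_KEYS = new Set(["__proto__", "constructor", "prototype"]);

interface SanitizeResult {
  config: Record<string, unknown>;
  warnings: string[];
}

function sanitizeConfig(raw: Record<string, unknown>): SanitizeResult {
  const config: Record<string, unknown> = {};
  const warnings: string[] = [];
  for (const key of Object.keys(raw)) {
    if (UNSAFE_KEYS.has(key)) continue; // prototype pollution: skipped silently
    if (API_KEY_PATTERN.test(key)) {
      warnings.push(`Ignoring "${key}": API keys do not belong in config files`);
      continue;
    }
    if (!KNOWN_KEYS.has(key)) {
      warnings.push(`Ignoring unknown config key "${key}"`);
      continue;
    }
    config[key] = raw[key];
  }
  return { config, warnings };
}
```

Collecting warnings in an array rather than logging directly keeps the sketch free of side effects; the real loader logs them.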

Testing

The configuration module has comprehensive test coverage including:

stripJsoncComments: String-aware comment removal
deepMerge: Nested object merging with pollution protection
loadConfig: Full configuration loading and precedence
initConfig: Interactive config file creation

Source: CHANGELOG.md
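
To show what "string-aware" comment removal means in practice, here is a minimal sketch of a scanner that tracks whether it is inside a JSON string. It is not the project's `stripJsoncComments` implementation, only an illustration under that name, and it leaves out trailing-comma handling.

```typescript
// Sketch only: removes // and /* */ comments while leaving string contents intact.
function stripJsoncComments(text: string): string {
  let out = "";
  let i = 0;
  let inString = false;
  while (i < text.length) {
    const ch = text[i];
    const next = text[i + 1];
    if (inString) {
      out += ch;
      if (ch === "\\") {
        // Copy the escaped character verbatim so \" does not end the string.
        out += next ?? "";
        i += 2;
        continue;
      }
      if (ch === '"') inString = false;
      i++;
    } else if (ch === '"') {
      inString = true;
      out += ch;
      i++;
    } else if (ch === "/" && next === "/") {
      // Line comment: skip to the end of the line, keeping the newline itself.
      while (i < text.length && text[i] !== "\n") i++;
    } else if (ch === "/" && next === "*") {
      // Block comment: skip past the closing marker, or to the end if unterminated.
      const end = text.indexOf("*/", i + 2);
      i = end === -1 ? text.length : end + 2;
    } else {
      out += ch;
      i++;
    }
  }
  return out;
}
```

Because the scanner only looks for comment markers outside strings, a value such as "https://example.com/a" passes through unchanged.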

Source: https://github.com/JimothySnicket/gemini-image-mcp / Human Manual

Server Architecture

Related topics: Image Generation Internals, Image Processing Internals

gemini-image-mcp is an MCP (Model Context Protocol) server that exposes two primary tools: generate_image, which calls the Google Gemini API, and process_image, which runs locally on the sharp library. The server is built on @modelcontextprotocol/sdk and communicates over STDIO transport, making it compatible with MCP clients such as Claude Code.

Overview

The server architecture follows a modular design pattern with clear separation of concerns:

Component	File	Responsibility
Entry Point	`src/index.ts`	Server initialization, tool registration, request routing
Image Generation	`src/generate.ts`	Gemini API integration, model discovery, image generation
Image Processing	`src/process.ts`	Local image manipulation using Sharp
Configuration	`src/config.ts`	Config file loading, validation, environment variable management
Usage Tracking	`src/tracker.ts`	Token usage logging, cost estimation
Utilities	`src/utils.ts`	Logging, file operations, path resolution

Technology Stack

Dependency	Version	Purpose
`@google/genai`	^1.44.0	Gemini API client for image generation
`@modelcontextprotocol/sdk`	^1.22.0	MCP protocol implementation
`sharp`	^0.34.5	Local image processing
`zod`	^3.24.0	Schema validation for tool parameters

Source: package.json:18-22

Architecture Diagram

graph TD
    A[MCP Client] <-->|STDIO| B[src/index.ts<br/>MCP Server]
    B --> C[generate_image tool]
    B --> D[process_image tool]
    C --> E[src/generate.ts<br/>Gemini API]
    C --> F[src/tracker.ts<br/>Usage Logger]
    D --> G[src/process.ts<br/>Sharp Library]
    E --> H[Model Discovery]
    I[Config System] -.->|Priority Resolution| B
    I --> J[src/config.ts]
    J --> K[Environment Variables]
    J --> L[Config Files]
    J --> M[Defaults]

Core Components

Entry Point (`src/index.ts`)

The server initializes a single MCP server instance that registers two tools:

const server = new McpServer(
  {
    name: "gemini-image-mcp",
    version: pkg.version,
  },
  {
    instructions: "Gemini image generation and local image processing...",
  },
);

Key initialization steps:

Load configuration via loadConfig()
Initialize usage tracker via initTracker()
Register both generate_image and process_image tools
Establish STDIO transport connection

Source: src/index.ts:1-20

Tool Registration Pattern

Each tool follows a consistent registration pattern using Zod schemas for parameter validation:

server.registerTool(
  "tool-name",
  {
    description: "...",
    // registerTool reads the parameter schema from inputSchema
    inputSchema: { /* Zod fields, e.g. prompt: z.string() */ },
  },
  async (args) => {
    const config = loadConfig();
    // Merge config defaults with args
    // Execute tool logic
    // Return formatted response
  },
);

Source: src/index.ts:35-80

Tool: `generate_image`

The generate_image tool handles AI-powered image generation and editing via the Gemini API.

Parameters

Parameter	Type	Required	Description
`prompt`	string	Yes	Text description or editing instruction
`images`	string[]	No	File paths to reference images
`model`	string	No	Gemini model ID
`aspectRatio`	string	No	Image aspect ratio
`resolution`	string	No	Output resolution (1K, 2K, 4K)
`outputDir`	string	No	Override output directory
`filename`	string	No	Base name for saved file
`subfolder`	string	No	Subfolder within output directory
`sessionId`	string	No	Continue multi-turn session
`seed`	number	No	Integer seed for reproducibility
`useSearchGrounding`	boolean	No	Enable Google Search grounding

Source: src/index.ts:35-70

Model Discovery

The generate.ts module implements automatic model discovery to detect available image-capable models:

const IMAGE_MODEL_PATTERNS = ["image", "img"];
const EXCLUDED_PREFIXES = ["imagen"];

async function discoverModels(apiKey: string): Promise<string[]> {
  // Paginate through available models
  // Filter by image capability patterns
  // Exclude specific prefixes
  // Cache results
}

Source: src/generate.ts:85-100

Image Input Handling

Local images are converted to inline data for API submission:

async function readImageAsInlineData(filepath: string): Promise<{
  inlineData: { data: string; mimeType: string };
}> {
  const mimeType = MIME_TYPES[ext];
  // Validate file exists and is under 50MB
  // Return base64-encoded data with MIME type
}

Source: src/generate.ts:105-130

Tool: `process_image`

The process_image tool provides local, free image processing via the Sharp library.

Parameters

Parameter	Type	Required	Description
`imagePath`	string	Yes	Path to image file
`crop`	object	No	Crop by pixels, aspect ratio, or focal point
`resize`	object	No	Resize to width/height
`removeBackground`	object	No	Threshold or chroma key removal
`trim`	boolean	No	Auto-remove whitespace borders
`format`	string	No	Output format (png, jpeg, webp)
`quality`	number	No	Quality 1-100 for JPEG/WebP
`outputDir`	string	No	Override output directory
`filename`	string	No	Base name for saved file
`subfolder`	string	No	Subfolder within output directory

Source: src/index.ts:75-120

Processing Capabilities

Operation	Description
Crop	Pixel-exact, aspect ratio center crop, focal point (attention/entropy)
Resize	Width, height, or both with aspect ratio preservation
Background Removal	Threshold-based (white backgrounds) or chroma key (HSV keying)
Trim	Auto-remove whitespace/transparent borders
Format Conversion	PNG, JPEG, WebP with quality control

Configuration System (`src/config.ts`)

The configuration system implements a hierarchical priority system for settings:

Priority Order

Per-request parameters > Environment Variables > Local config (.gemini-image-mcp.json) > Global config (~/.gemini-image-mcp.json) > Defaults
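
One way to picture this chain is a lookup that returns the first defined value, walking the sources from highest to lowest priority. The sketch below is illustrative only: `resolveSetting` is a hypothetical helper and the sample values are made up.

```typescript
// Hypothetical helper: returns the first defined value in priority order.
function resolveSetting<T>(...sources: (T | undefined)[]): T | undefined {
  return sources.find((value) => value !== undefined);
}

// Made-up inputs for a single setting, listed from highest to lowest priority.
const resolvedOutputDir = resolveSetting<string>(
  undefined,         // per-request parameter (not supplied)
  undefined,         // OUTPUT_DIR environment variable (unset)
  "./images",        // local .gemini-image-mcp.json
  "~/pictures",      // global ~/.gemini-image-mcp.json
  "~/gemini-images", // built-in default
);
```

With no request parameter and no environment variable set, the local config value is the one that takes effect.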

Security Features

Feature	Implementation
API key rejection	Keys from config files are rejected with warning
JSONC parsing	String-aware comment stripping (preserves URLs)
Prototype pollution guard	`__proto__`, `constructor`, `prototype` blocked in deep merge
Unknown key warnings	Invalid config keys are warned and dropped

Source: CHANGELOG.md
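
A deep merge with a prototype pollution guard could look roughly like the sketch below. It borrows the `deepMerge` name from the project's test list, but the body is an assumption for illustration, not the code in src/config.ts.

```typescript
const BLOCKED_KEYS = new Set(["__proto__", "constructor", "prototype"]);

type PlainObject = Record<string, unknown>;

function isPlainObject(value: unknown): value is PlainObject {
  return typeof value === "object" && value !== null && !Array.isArray(value);
}

// Recursively merge override into base without mutating either input.
function deepMerge(base: PlainObject, override: PlainObject): PlainObject {
  const result: PlainObject = { ...base };
  for (const key of Object.keys(override)) {
    if (BLOCKED_KEYS.has(key)) continue; // never copy keys that can reach a prototype
    const current = result[key];
    const incoming = override[key];
    result[key] =
      isPlainObject(current) && isPlainObject(incoming)
        ? deepMerge(current, incoming)
        : incoming;
  }
  return result;
}
```

Nested objects are merged key by key, so a config that sets only defaults.generate.resolution does not wipe out a default aspectRatio.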

Config Structure

{
  "defaultModel": "gemini-3.1-flash-image-preview",
  "defaults": {
    "generate": {
      "aspectRatio": "16:9",
      "resolution": "2K"
    },
    "process": {
      "removeBackground": { "color": "#00FF00" },
      "trim": true
    }
  }
}

Usage Tracking (`src/tracker.ts`)

The tracker module logs all image generation operations to a manifest file (generations.jsonl).

Tracked Data

Each generation logs:

Prompt text
Model used
Parameters (aspect ratio, resolution, etc.)
Token counts (prompt, output, image, thinking)
Estimated USD cost
Session information

Source: src/tracker.ts (referenced in src/index.ts:18)

Logging System (`src/utils.ts`)

The utility module provides structured logging capabilities:

import { log, setLogLevel, setLogDir } from "./utils.js";

Features:

Configurable log levels
Directory-based log output
Error message formatting

Source: src/utils.ts (referenced in src/index.ts:19)

Request Flow

sequenceDiagram
    participant Client as MCP Client
    participant Server as MCP Server
    participant Config as Config System
    participant Tool as Tool Handler
    participant API as External API

    Client->>Server: Tool Request
    Server->>Config: Load Config
    Config-->>Server: Merged Config
    Server->>Tool: Request + Config
    Tool->>Config: Get Defaults
    Config-->>Tool: Tool Defaults
    Tool->>API: Process Request
    API-->>Tool: Response
    Tool->>Server: Formatted Result
    Server-->>Client: JSON Response

Environment Variables

Variable	Required	Description
`GEMINI_API_KEY`	Yes	Google Gemini API key from AI Studio
`MAX_REQUESTS_PER_HOUR`	No	Rate limit for requests
`MAX_COST_PER_HOUR`	No	Rate limit for cost (USD)
`OUTPUT_DIR`	No	Default output directory

Initialization

The --init flag creates a configuration file in one of two locations:

# Global config
npx @jimothy-snicket/gemini-image-mcp --init

# Local config (in current directory)
npx @jimothy-snicket/gemini-image-mcp --init --local

The first command creates ~/.gemini-image-mcp.json (global); the second creates .gemini-image-mcp.json in the current directory (local). Either file contains inline documentation of all available options.

Image Generation Internals

Related topics: generateimage Tool Reference, Server Architecture, Cost Tracking and Rate Limiting

This document provides a comprehensive technical overview of the image generation subsystem within gemini-image-mcp. It covers the architecture, API integration patterns, session management, model discovery, and configuration system.

Overview

The image generation system is built on Google Gemini's native image generation API (generateContent), not the deprecated Imagen API. The system provides both text-to-image generation and image editing capabilities through a Model Context Protocol (MCP) server interface. Source: README.md

Core Dependencies

Package	Version	Purpose
`@google/genai`	^1.44.0	Gemini API client
`@modelcontextprotocol/sdk`	^1.22.0	MCP server implementation
`zod`	^3.24.0	Schema validation for tool parameters

Source: package.json:18-21

Architecture

System Components

graph TD
    A[MCP Client] -->|Tool Request| B[McpServer]
    B --> C[generateImage Function]
    C --> D[Model Discovery]
    C --> E[Session Manager]
    C --> F[API Client]
    D --> G[Gemini API<br/>List Models]
    E --> H[Session Store<br/>Map&lt;sessionId, ConversationSession&gt;]
    F --> I[Gemini generateContent API]
    I --> J[Image Response]
    J --> K[File System<br/>Output Directory]

Flow Diagram

sequenceDiagram
    participant Client
    participant Server as MCP Server
    participant Session as Session Manager
    participant API as Gemini API
    participant FS as File System

    Client->>Server: generate_image(prompt, sessionId?)
    Server->>Session: Check/Create Session
    Session-->>Server: ConversationSession
    alt New Session
        Server->>API: List Models
        API-->>Server: Available Models
        Server->>Session: Create New Session
    else Existing Session
        Server->>Session: Get Session History
    end
    Server->>API: generateContent(prompt, history)
    API-->>Server: Generated Image
    Server->>FS: Save Image
    Server-->>Client: Result + Usage Stats

Model Discovery System

Auto Model Detection

The system automatically discovers available image-capable models by querying the Gemini API at startup. This eliminates the need for hardcoded model lists and ensures compatibility as new models are released. Source: src/generate.ts:95-116

// Known image-capable model name fragments (Gemini native only)
const IMAGE_MODEL_PATTERNS = ["image", "img"];
// Imagen uses a different API (generateImages) and is deprecated June 2026
const EXCLUDED_PREFIXES = ["imagen"];

Model Filtering Logic

Filter Type	Criteria	Purpose
Include	Name contains "image" or "img"	Match Gemini image models
Exclude	Name starts with "imagen"	Avoid deprecated Imagen API

Source: src/generate.ts:95-97
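
The include and exclude rules combine into a single predicate. The sketch below reuses the two constants shown above; `isImageModel` is a hypothetical name, and stripping a leading "models/" from the identifier is an assumption about how the model list is returned.

```typescript
const IMAGE_MODEL_PATTERNS = ["image", "img"];
const EXCLUDED_PREFIXES = ["imagen"];

// Hypothetical predicate: true for Gemini-native image models only.
function isImageModel(name: string): boolean {
  // Assumption: listed names may carry a "models/" prefix, which is dropped here.
  const id = name.toLowerCase().replace(/^models\//, "");
  if (EXCLUDED_PREFIXES.some((prefix) => id.startsWith(prefix))) return false;
  return IMAGE_MODEL_PATTERNS.some((pattern) => id.includes(pattern));
}
```

The exclusion has to run first because "imagen" itself contains the fragment "image" and would otherwise pass the include check.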

Caching Mechanism

Available models are cached after the first discovery call to reduce API overhead:

let cachedAvailableModels: string[] | null = null;

export function getAvailableModels(): string[] | null {
  return cachedAvailableModels;
}

Source: src/generate.ts:63

Supported Models

Model	Resolution	Cost	Grounding	Notes
`gemini-2.5-flash-image`	1K only	~$0.04/image	No	Default; scheduled for deprecation Oct 2026
`gemini-3-pro-image-preview`	1K, 2K, 4K	~$0.15/image	No	Best quality, up to 14 reference images
`gemini-3.1-flash-image-preview`	512, 1K, 2K, 4K	~$0.08/image	Yes	Search grounding support

Source: README.md:45-49

Google Search Grounding

Supported Models

Only gemini-3.1-flash-image-preview supports Google Search grounding. The system validates this at runtime and throws a descriptive error if unsupported. Source: src/generate.ts:99-108

export const GROUNDING_SUPPORTED_MODELS = ["gemini-3.1-flash-image-preview"];

export function validateGrounding(model: string, useSearchGrounding: boolean | undefined): void {
  if (useSearchGrounding && !GROUNDING_SUPPORTED_MODELS.includes(model)) {
    throw new Error(
      `useSearchGrounding is only supported on ${GROUNDING_SUPPORTED_MODELS.join(", ")}. ` +
        `You requested ${model}.`,
    );
  }
}

Source: src/generate.ts:99-108

Multi-Turn Session Management

Session Data Model

interface ConversationSession {
  history: Content[];      // Previous conversation turns
  model: string;           // Model used in this session
  lastAccessed: number;    // Timestamp for TTL cleanup
}

Source: src/generate.ts:67-71

Session Store

const sessions = new Map<string, ConversationSession>();
const MAX_SESSION_TURNS = 10;

Session Lifecycle

graph LR
    A[Create Session] --> B[Store with TTL]
    B --> C[Each Request]
    C -->|Within TTL| D[Extend TTL]
    C -->|Exceeds TTL| E[Cleanup on Access]
    E --> F[Return Error]
    D --> G[Append to History]
    G --> H[Return Response]
    H --> I[Max 10 Turns]
    I -->|Exceeded| J[Prune Oldest]

Session Configuration

Parameter	Default	Description
`sessionTimeout`	1800000ms (30 min)	Inactivity timeout before session expiry
`MAX_SESSION_TURNS`	10	Maximum conversation turns per session

Source: src/generate.ts:69 and src/config.ts
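
The turn cap from the lifecycle diagram ("Prune Oldest") might be applied as in the following sketch. Several details are assumptions: `recordTurn` is a hypothetical function, `Turn` is a simplified stand-in for the SDK's `Content` type, and one turn is taken to mean one user entry plus one model entry.

```typescript
const MAX_SESSION_TURNS = 10;

// Simplified stand-in for the Content type used in the real session history.
interface Turn {
  role: "user" | "model";
  text: string;
}

interface ConversationSession {
  history: Turn[];
  model: string;
  lastAccessed: number;
}

// Append one exchange, drop the oldest entries beyond the cap, refresh the TTL clock.
function recordTurn(session: ConversationSession, user: Turn, model: Turn, now: number): void {
  session.history.push(user, model);
  const maxEntries = MAX_SESSION_TURNS * 2; // assumption: two entries per turn
  if (session.history.length > maxEntries) {
    session.history.splice(0, session.history.length - maxEntries);
  }
  session.lastAccessed = now;
}
```

Passing `now` in explicitly, rather than calling Date.now() inside, keeps the sketch deterministic.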

Session Cleanup

Sessions are automatically cleaned up based on the configured timeout:

function getSessionTimeout(): number {
  return loadConfig().sessionTimeout;
}

function cleanupSessions(): void {
  const timeout = getSessionTimeout();
  const now = Date.now();
  for (const [id, session] of sessions) {
    if (now - session.lastAccessed > timeout) {
      sessions.delete(id);
    }
  }
}

Source: src/generate.ts:73-84

Image Input Processing

Supported Formats

The system supports multiple image formats through MIME type mapping:

Extension	MIME Type
`.png`	`image/png`
`.jpg` / `.jpeg`	`image/jpeg`
`.webp`	`image/webp`
`.gif`	`image/gif`
`.avif`	`image/avif`

File Validation

Check	Limit	Error Message
File size	50MB max	"Image file is {size}MB, max is 50MB"
Format support	Defined MIME map	"Unsupported image format"

Source: src/generate.ts:119-135
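
The two checks can be expressed as a pure function over a path and a byte count. This is a sketch under stated assumptions: `checkImageInput` is a hypothetical name, 1MB is taken as 1024 × 1024 bytes, and the caller is expected to obtain the size from the file system.

```typescript
const MIME_TYPES: Record<string, string> = {
  ".png": "image/png",
  ".jpg": "image/jpeg",
  ".jpeg": "image/jpeg",
  ".webp": "image/webp",
  ".gif": "image/gif",
  ".avif": "image/avif",
};
const MAX_IMAGE_BYTES = 50 * 1024 * 1024; // assumption: binary megabytes

// Returns the MIME type for a supported file, or throws with a descriptive message.
function checkImageInput(filepath: string, sizeBytes: number): string {
  const dot = filepath.lastIndexOf(".");
  const ext = dot === -1 ? "" : filepath.slice(dot).toLowerCase();
  const mimeType = MIME_TYPES[ext];
  if (!mimeType) {
    throw new Error(`Unsupported image format: ${ext || "(no extension)"}`);
  }
  if (sizeBytes > MAX_IMAGE_BYTES) {
    const sizeMb = (sizeBytes / (1024 * 1024)).toFixed(1);
    throw new Error(`Image file is ${sizeMb}MB, max is 50MB`);
  }
  return mimeType;
}
```

Lower-casing the extension means files such as photo.JPG are accepted as JPEG.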

Configuration System

Configuration Priority

per-request params > env vars > local config > global config > defaults

Source: README.md

Configuration Template

{
  "outputDir": "~/gemini-images",
  "defaultModel": "gemini-2.5-flash-image",
  "logLevel": "info",
  "requestTimeout": 60000,
  "sessionTimeout": 1800000,
  "maxRequestsPerHour": 0,
  "maxCostPerHour": 0,
  "defaults": {
    "generate": {
      "aspectRatio": "16:9",
      "resolution": "1K"
    }
  }
}

Source: src/config.ts

Per-Tool Defaults

Users can configure default parameters for each tool to avoid repetition:

{
  "defaultModel": "gemini-3.1-flash-image-preview",
  "defaults": {
    "generate": {
      "aspectRatio": "16:9",
      "resolution": "2K"
    },
    "process": {
      "removeBackground": { "color": "#00FF00" },
      "trim": true
    }
  }
}

Source: README.md:95-109

Rate Limiting

Configuration Parameters

Variable	Purpose
`MAX_REQUESTS_PER_HOUR`	Maximum API requests per hour
`MAX_COST_PER_HOUR`	Maximum USD cost per hour

Source: README.md:79

Rate Limit Behavior

The system monitors both request count and cost per hour. When a limit is reached, the tool call fails with a clear error message indicating the remaining budget. Source: CHANGELOG.md
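
A rolling-hour check over past generations could be sketched as below. This is not the tracker's actual code: the record shape, the function signature, and the error wording are assumptions. Treating a limit of 0 as "disabled" follows the defaults shown in the configuration template.

```typescript
// Hypothetical record of one past generation.
interface GenerationRecord {
  timestamp: number; // milliseconds since epoch
  costUsd: number;
}

interface Limits {
  maxRequestsPerHour: number; // 0 disables the check
  maxCostPerHour: number;     // 0 disables the check
}

const HOUR_MS = 60 * 60 * 1000;

// Throws when either hourly limit is already used up; otherwise returns quietly.
function checkRateLimit(records: GenerationRecord[], limits: Limits, now: number): void {
  const recent = records.filter((r) => now - r.timestamp < HOUR_MS);
  const spent = recent.reduce((sum, r) => sum + r.costUsd, 0);
  if (limits.maxRequestsPerHour > 0 && recent.length >= limits.maxRequestsPerHour) {
    throw new Error(
      `Rate limit reached: ${recent.length}/${limits.maxRequestsPerHour} requests in the last hour`,
    );
  }
  if (limits.maxCostPerHour > 0 && spent >= limits.maxCostPerHour) {
    throw new Error(
      `Cost limit reached: $${spent.toFixed(4)} of $${limits.maxCostPerHour.toFixed(2)} spent in the last hour`,
    );
  }
}
```

Filtering by timestamp on every call gives a rolling window, so old generations age out without a separate reset step.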

Response Structure

Each generation response includes:

Field	Type	Description
`sessionId`	string	Unique ID for multi-turn sessions
`imagePath`	string	Path to saved image
`generation`	object	Generation parameters used
`usage`	object	Token counts and estimated cost
`session`	object	Running totals (generations, cost, hourly count)

Source: src/generate.ts

MCP Tool Registration

The generate_image tool is registered with the MCP SDK using Zod schemas for parameter validation:

server.registerTool(
  "generate_image",
  {
    description: "...",
    inputSchema: {
      prompt: z.string(),
      images: z.array(z.string()).optional(),
      model: z.string().optional(),
      aspectRatio: z.string().optional(),
      // ... additional parameters
    },
  },
  async (args) => {
    const config = loadConfig();
    const result = await generateImage({ ...args, config });
    return { content: [{ type: "text", text: JSON.stringify(result) }] };
  }
);

Source: src/index.ts

Error Handling

Model Mismatch Detection

When a request continues an existing session, the server compares the requested model with the model recorded for that session and returns an error if they differ. This prevents inconsistent generation behavior within a session.

Source: CHANGELOG.md

Seed-Based Reproducibility

The optional seed parameter takes an integer. Reusing the same seed with the same prompt and settings is intended to make generation results reproducible.

Source: CHANGELOG.md

Image Processing Internals

Related topics: processimage Tool Reference, Server Architecture

Overview

The process_image tool provides local, free image processing capabilities powered by the sharp library. Unlike generate_image which makes API calls to Google's Gemini, process_image operates entirely on the local machine, making it ideal for batch operations, asset preparation, and cost-free transformations. Source: package.json:17

The module supports chaining multiple operations in a single tool call, including cropping, resizing, background removal (threshold and chroma key), border trimming, and format conversion. This design allows complex pipelines like favicon generation or transparent asset extraction without multiple API round-trips.

Architecture

Component Diagram

graph TD
    A["process_image Tool"] --> B["Input Validation"]
    B --> C["sharp Pipeline"]
    C --> D["Operation Chain"]
    
    D --> E1["Crop Operations"]
    D --> E2["Resize Operations"]
    D --> E3["Background Removal"]
    D --> E4["Trim Operations"]
    D --> E5["Format Conversion"]
    
    E3 --> F1["Threshold Detection"]
    E3 --> F2["HSV Chroma Key"]
    
    F2 --> G1["Smoothstep Feather"]
    F2 --> G2["Spill Suppression"]
    F2 --> G3["Edge Anti-aliasing"]
    
    E1 --> H["Output Writer"]
    E2 --> H
    E4 --> H
    E5 --> H
    
    H --> I["generations.jsonl"]
    H --> J["File System"]

Technology Stack

Component	Technology	Version	Purpose
Image Processing	sharp	^0.34.5	High-performance image manipulation
Validation	zod	^3.24.0	Runtime type checking for parameters
MCP SDK	@modelcontextprotocol/sdk	^1.22.0	Tool registration and communication

Source: package.json:13-15

Input Validation

The tool validates all parameters before processing begins. The Zod schema enforces strict type constraints and ranges.

Parameter Schema

Parameter	Type	Required	Default	Description
`imagePath`	string	Yes	—	Path to source image file
`crop`	CropConfig	No	undefined	Crop configuration object
`resize`	ResizeConfig	No	undefined	Resize configuration object
`removeBackground`	BackgroundConfig	No	config default	Background removal settings
`trim`	boolean	No	config default	Auto-remove whitespace borders
`format`	"png" | "jpeg" | "webp"	No	original	Output format
`quality`	number (1-100)	No	90	JPEG/WebP quality
`outputDir`	string	No	~/gemini-images	Output directory
`filename`	string	No	auto-generated	Base filename
`subfolder`	string	No	none	Subdirectory path

Source: src/index.ts:67-95

Crop Configuration

// Pixel-exact dimensions
{ width: 500, height: 300, left: 100, top: 50 }

// Aspect ratio (center crop)
{ aspectRatio: "16:9" }

// Focal point strategies
{ aspectRatio: "16:9", strategy: "attention" }
{ aspectRatio: "16:9", strategy: "entropy" }

Resize Configuration

// Width only (aspect ratio preserved)
{ width: 1200 }

// Height only (aspect ratio preserved)
{ height: 800 }

// Both dimensions (may affect aspect ratio)
{ width: 192, height: 192 }

Background Removal Configuration

// Threshold-based (white backgrounds)
{ threshold: 230 }

// Chroma key (green screen / any solid color)
{ color: "#00FF00" }

// Custom color with threshold tolerance
{ color: "#00FF00", threshold: 30 }

Processing Pipeline

Operation Flow

graph LR
    A[Input Image] --> B[Load with sharp]
    B --> C{Crop Specified?}
    C -->|Yes| D[Apply Crop]
    C -->|No| E{Resize Specified?}
    D --> E
    E -->|Yes| F[Apply Resize]
    E -->|No| G{Background Removal?}
    F --> G
    G -->|Yes| H[Apply Background Removal]
    G -->|No| I{Trim Specified?}
    H --> I
    I -->|Yes| J[Apply Trim]
    I -->|No| K{Format Conversion?}
    J --> K
    K -->|Yes| L[Apply Format & Quality]
    K -->|No| M[Write to Output]
    L --> M
    M --> N[Log to generations.jsonl]

Crop Operations

Pixel-Exact Cropping

Accepts explicit left, top, width, and height parameters in pixels. The crop is applied using sharp's region extraction, which reads only the specified portion of the source image.

await sharp(input)
  .extract({ 
    left: crop.left, 
    top: crop.top, 
    width: crop.width, 
    height: crop.height 
  })
  .toBuffer();

Aspect Ratio Cropping

When aspectRatio is specified without explicit dimensions, the system calculates the largest crop region matching the target ratio. The strategy parameter determines which region to select:

Strategy	Behavior
`center` (default)	Crops from the geometric center of the image
`attention`	Shifts crop toward the most visually interesting region based on saliency detection
`entropy`	Shifts crop toward the region with highest information density (detail)
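
For the default `center` strategy, the largest matching region follows from simple arithmetic. The sketch below shows one way to compute it; `centerCropRegion` is a hypothetical helper and the rounding choices are assumptions. The `attention` and `entropy` strategies are not covered here.

```typescript
interface Region {
  left: number;
  top: number;
  width: number;
  height: number;
}

// Largest region with the target ratio that fits the image, centered on both axes.
function centerCropRegion(imageWidth: number, imageHeight: number, aspectRatio: string): Region {
  const [w, h] = aspectRatio.split(":").map(Number);
  if (!w || !h || w <= 0 || h <= 0) {
    throw new Error(`Invalid aspect ratio: ${aspectRatio}`);
  }
  const target = w / h;
  // Try full width first; fall back to full height if that would be too tall.
  let width = imageWidth;
  let height = Math.round(imageWidth / target);
  if (height > imageHeight) {
    height = imageHeight;
    width = Math.round(imageHeight * target);
  }
  return {
    left: Math.floor((imageWidth - width) / 2),
    top: Math.floor((imageHeight - height) / 2),
    width,
    height,
  };
}
```

The resulting region has the same shape as the pixel-exact crop object, so it could be passed to an extract-style operation directly.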

Resize Operations

Dimension Handling

The resize operation follows sharp's resize semantics:

Width only: Height is calculated to maintain aspect ratio
Height only: Width is calculated to maintain aspect ratio
Both specified: Resizes to exact dimensions (may alter aspect ratio)

Resolution Presets

The resize operation takes explicit pixel values. Resolution presets (1K, 2K, 4K) belong to the generate_image tool instead and map to standard dimensions:

Preset	Dimensions
1K	1024 × 1024 (or proportional)
2K	2048 × 2048 (or proportional)
4K	4096 × 4096 (or proportional)

Background Removal

Threshold-Based Detection

For images with white or light backgrounds, threshold-based detection identifies pixels above a brightness value and makes them transparent.

Algorithm:

Convert image to grayscale
Identify pixels exceeding the threshold (default: 230 on 0-255 scale)
Set identified pixels to transparent
Apply a slight blur to smooth edges

Best for: Product photos on plain white backgrounds, scanned documents, screenshots
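
The core of the threshold step can be illustrated on a raw RGBA buffer. This is a simplified sketch, not the module's implementation: `removeLightBackground` is a hypothetical name, the Rec. 601 luma weights are an assumed grayscale conversion, and the edge blur is omitted.

```typescript
// Sketch: make every pixel brighter than the threshold fully transparent.
// Expects tightly packed RGBA bytes; returns a copy and leaves the input untouched.
function removeLightBackground(rgba: Uint8Array, threshold = 230): Uint8Array {
  const out = new Uint8Array(rgba);
  for (let i = 0; i + 3 < out.length; i += 4) {
    // Assumed grayscale conversion (Rec. 601 luma weights).
    const luma = 0.299 * out[i] + 0.587 * out[i + 1] + 0.114 * out[i + 2];
    if (luma > threshold) {
      out[i + 3] = 0; // alpha channel
    }
  }
  return out;
}
```

With sharp, a buffer in this layout can be obtained from raw pixel output after ensuring an alpha channel, then written back as PNG to keep the transparency.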

Chroma Key Pipeline

For green screen or solid color backgrounds, the chroma key pipeline performs sophisticated color extraction:

graph TD
    A[Input Image] --> B[Convert to HSV]
    B --> C[Color Range Detection]
    C --> D[Create Mask]
    D --> E[Smoothstep Feather]
    E --> F[Spill Suppression]
    F --> G[Edge Anti-aliasing]
    G --> H[Composite with Transparency]

Stage Details:

Stage	Description
HSV Keying	Converts to Hue-Saturation-Value color space for better color discrimination
Smoothstep Feather	Applies smooth edge transition using smoothstep function (not linear)
Spill Suppression	Removes color contamination from edges of subject
Edge Anti-aliasing	5-pass 3×3 kernel anti-aliasing for smooth edges
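
The first two stages can be sketched with a hue conversion and a smoothstep ramp. Everything here is illustrative: the function names and the `inner` and `outer` tolerances (in degrees of hue) are hypothetical, and saturation, value, spill suppression, and anti-aliasing are left out.

```typescript
// Standard smoothstep: 0 below edge0, 1 above edge1, smooth cubic ramp in between.
function smoothstep(edge0: number, edge1: number, x: number): number {
  const t = Math.min(1, Math.max(0, (x - edge0) / (edge1 - edge0)));
  return t * t * (3 - 2 * t);
}

// Hue in degrees [0, 360) from 8-bit RGB; returns 0 for gray pixels.
function rgbToHue(r: number, g: number, b: number): number {
  const max = Math.max(r, g, b);
  const min = Math.min(r, g, b);
  const delta = max - min;
  if (delta === 0) return 0;
  let hue: number;
  if (max === r) hue = ((g - b) / delta) % 6;
  else if (max === g) hue = (b - r) / delta + 2;
  else hue = (r - g) / delta + 4;
  return (hue * 60 + 360) % 360;
}

// Alpha in [0, 1]: 0 close to the key hue, 1 far from it, feathered in between.
// inner and outer are hypothetical tolerances in degrees.
function chromaAlpha(pixelHue: number, keyHue: number, inner = 20, outer = 40): number {
  const diff = Math.abs(pixelHue - keyHue) % 360;
  const distance = Math.min(diff, 360 - diff); // hue wraps around at 360
  return smoothstep(inner, outer, distance);
}
```

Compared with a linear ramp, smoothstep flattens out at both ends of the feather band, which avoids a visible hard line where the mask starts and stops.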

Recommended Settings:

Subject Type	Color	Notes
High contrast (red, blue, black, white)	#00FF00	Best results
Yellow subjects	canvas approach	Use `generate_image` instead
Green subjects	canvas approach	Use `generate_image` instead
Glass/reflective	canvas approach	Use `generate_image` instead

Trim Operations

The trim operation automatically removes whitespace and transparent borders from images.

Algorithm:

Scan the image row-by-row and column-by-column
Identify the bounding box of non-white, non-transparent content
Extract the content region
Apply minimal padding (optional)

This operation is particularly useful after background removal to eliminate any leftover border artifacts.
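
The bounding-box scan from the steps above can be sketched on a raw RGBA buffer. This is an illustration rather than the module's code: `contentBounds` and the `whiteCutoff` value are assumptions, padding is omitted, and the project may rely on sharp's own trim support instead.

```typescript
interface Box {
  left: number;
  top: number;
  width: number;
  height: number;
}

// Returns the bounding box of pixels that are neither transparent nor near-white,
// or null when the whole image is background.
function contentBounds(rgba: Uint8Array, width: number, height: number, whiteCutoff = 250): Box | null {
  let minX = width;
  let minY = height;
  let maxX = -1;
  let maxY = -1;
  for (let y = 0; y < height; y++) {
    for (let x = 0; x < width; x++) {
      const i = (y * width + x) * 4;
      const transparent = rgba[i + 3] === 0;
      const white =
        rgba[i] >= whiteCutoff && rgba[i + 1] >= whiteCutoff && rgba[i + 2] >= whiteCutoff;
      if (transparent || white) continue;
      if (x < minX) minX = x;
      if (x > maxX) maxX = x;
      if (y < minY) minY = y;
      if (y > maxY) maxY = y;
    }
  }
  if (maxX < 0) return null;
  return { left: minX, top: minY, width: maxX - minX + 1, height: maxY - minY + 1 };
}
```

The returned box has the same fields as a pixel-exact crop, so extracting the content region is a single crop step.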

Format Conversion

Supported Formats

Format	Extension	Quality Range	Use Case
PNG	.png	N/A (lossless)	Transparency, icons, diagrams
JPEG	.jpg/.jpeg	1-100	Photographs, final output
WebP	.webp	1-100	Web optimization, smaller files

Quality Control

For JPEG and WebP, the quality parameter controls the compression level:

90 (default): Balanced quality and file size
100: Maximum quality, larger file size
70-85: Smaller files, visible compression artifacts
1-69: Heavy compression, significant quality loss

Output Organization

Filename Auto-Versioning

When a filename collision occurs, the system automatically increments a version suffix:

Attempt	Filename
1st	`hero.png`
2nd	`hero-v2.png`
3rd	`hero-v3.png`
nth	`hero-v{n}.png`
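
The naming scheme in the table can be reproduced with a short loop. In this sketch `nextAvailableFilename` is a hypothetical helper, and the set of existing names stands in for a directory listing.

```typescript
// Returns base.ext if free, otherwise the first free base-vN.ext starting at N = 2.
function nextAvailableFilename(base: string, ext: string, existing: ReadonlySet<string>): string {
  const first = `${base}.${ext}`;
  if (!existing.has(first)) return first;
  for (let version = 2; ; version++) {
    const candidate = `${base}-v${version}.${ext}`;
    if (!existing.has(candidate)) return candidate;
  }
}
```

Starting the suffix at v2 matches the table: the unsuffixed file counts as the first version.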

Directory Structure

Output is organized as: {outputDir}/{subfolder}/{filename}.{format}

Examples:

Parameters	Result
`filename: "hero"`, no subfolder	`~/gemini-images/hero.png`
`filename: "logo"`, `subfolder: "brand"`	`~/gemini-images/brand/logo.png`
`outputDir: "./output"`, `subfolder: "icons"`	`./output/icons/{filename}.png`

Generation Manifest

Every processed image is logged to generations.jsonl in the output directory. Each entry is a JSON object on a single line:

{"timestamp":"2024-01-15T10:30:00.000Z","type":"process","operation":"background-removal","input":"product.jpg","output":"product-transparent.png","duration_ms":145}

Configuration Integration

Config Precedence

Parameters can be specified at multiple levels with this priority:

graph TD
    A[Per-Request Parameters<br/>Highest Priority] --> B[Environment Variables]
    B --> C[Local Config .gemini-image-mcp.json]
    C --> D[Global Config ~/.gemini-image-mcp.json]
    D --> E[Code Defaults<br/>Lowest Priority]

Default Configuration Template

{
  "outputDir": "~/gemini-images",
  "defaultModel": "gemini-2.5-flash-image",
  "logLevel": "info",
  "requestTimeout": 60000,
  "sessionTimeout": 1800000,
  "maxRequestsPerHour": 0,
  "maxCostPerHour": 0,
  "defaults": {
    "process": {
      "removeBackground": { "color": "#00FF00" },
      "trim": true,
      "format": "png",
      "quality": 90
    }
  }
}

Source: src/config.ts:17-47

Common Pipelines

Favicon Generation Pipeline

process_image → removeBackground {threshold: 230} + trim + resize {width: 192, height: 192}

Steps:

Remove white background using threshold detection
Trim any remaining whitespace
Resize to 192×192 favicon dimensions

Transparent Asset from Green Screen

generate_image → "A product photo on a bright green background"
process_image → removeBackground {color: "#00FF00"} + trim

Steps:

Generate subject on green screen (one API call)
Apply chroma key to remove green (free, local)
Trim excess border

Social Media Crop

process_image → crop {aspectRatio: "16:9", strategy: "attention"} + resize {width: 1200}

Steps:

Crop to 16:9 ratio, focusing on the most interesting region
Resize to optimal width for social platforms

Performance Characteristics

Processing Speed

Since all operations run locally via sharp, process_image is significantly faster than API-based alternatives:

Operation	Typical Duration
Crop/Resize	< 100ms
Background Removal (threshold)	100-300ms
Background Removal (chroma key)	300-800ms
Trim	< 50ms
Format Conversion	50-200ms

Memory Usage

Sharp processes images in memory and uses libvips, which is designed for efficient memory usage even with large images. A 4K image (4096×4096) typically requires 50-100MB of working memory depending on the operations performed.

Error Handling

Common Error Cases

Error	Cause	Resolution
Unsupported format	Invalid file extension	Use PNG, JPEG, WebP, GIF, or TIFF
File too large	Image exceeds 50MB limit	Reduce image size before processing
File not found	Invalid path	Verify imagePath is correct and accessible
Invalid crop dimensions	Crop region exceeds image bounds	Adjust width, height, left, top values
Invalid hex color	Malformed color string	Use format: `#RRGGBB` or `#RGB`

Source: src/generate.ts:107-112

Cost Tracking and Rate Limiting

Related topics: generateimage Tool Reference, Configuration Guide, Image Generation Internals

The gemini-image-mcp server implements a comprehensive cost tracking and rate limiting system to help users monitor and control their API spending. This system operates at multiple levels—from per-generation cost calculation to hourly request and budget caps—ensuring predictable expenditure when using Gemini image generation capabilities.

Overview

The cost tracking and rate limiting subsystem serves three primary purposes:

Cost Transparency — Every image generation returns detailed token counts and estimated USD cost, allowing users to understand their API consumption.
Budget Enforcement — Configurable hourly limits prevent runaway agents or iterative workflows from exceeding intended spending.
Session Context — Generation costs are tracked per session, providing cumulative cost summaries across multi-turn editing workflows.

Source: src/pricing.ts:1-50

Architecture

The system comprises two interconnected modules:

Module	File	Responsibility
Pricing	`src/pricing.ts`	Token counting, cost calculation, and pricing table
Tracker	`src/tracker.ts`	Rate limiting, session tracking, and manifest logging

graph TD
    A[generate_image Tool Call] --> B[checkRateLimit]
    B --> C{Within Limits?}
    C -->|No| D[Throw RateLimitError]
    C -->|Yes| E[Call Gemini API]
    E --> F[calculateUsage]
    F --> G[UsageReport]
    G --> H[recordGeneration]
    H --> I[Update Session Stats]
    H --> J[Append to generations.jsonl]
    K[Config: MAX_REQUESTS_PER_HOUR] -.-> B
    L[Config: MAX_COST_PER_HOUR] -.-> B

Source: src/tracker.ts:40-60

Pricing Module

Pricing Table

The PRICING object in src/pricing.ts contains the authoritative pricing rates for all supported Gemini image models. All rates are expressed as USD per million tokens.

Model	Input ($/M)	Text Output ($/M)	Image Output ($/M)	Thinking ($/M)
`gemini-2.5-flash-image`	0.30	2.50	30.00	2.50
`gemini-3-pro-image-preview`	2.00	120.00	120.00	120.00
`gemini-3.1-flash-image-preview`	0.50	60.00	60.00	60.00

Source: src/pricing.ts:31-45

The pricing data was last verified against Google AI Studio on 2026-04-01. That date is stored in the PRICING_VERIFIED_DATE constant and included in every UsageReport.

Cost Calculation Formula

The calculateUsage() function computes the estimated cost using the following formula:

cost = (promptTokens / 1,000,000) × inputPerMillion
     + (textTokens / 1,000,000) × textOutputPerMillion
     + (imageTokens / 1,000,000) × imageOutputPerMillion
     + (thinkingTokens / 1,000,000) × thinkingPerMillion

Source: src/pricing.ts:66-72
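
The formula above can be sketched in TypeScript as follows. This is a minimal illustration, not the project's actual calculateUsage() code: the rate values are copied from the pricing table in this section, while the names RATES, TokenCounts, and estimateCostUsd are assumptions introduced here for the example.

```typescript
// Rates in USD per million tokens, copied from the pricing table in this section.
interface ModelRates {
  inputPerMillion: number;
  textOutputPerMillion: number;
  imageOutputPerMillion: number;
  thinkingPerMillion: number;
}

// Hypothetical container name; the manual calls the real object PRICING.
const RATES = {
  "gemini-2.5-flash-image": { inputPerMillion: 0.3, textOutputPerMillion: 2.5, imageOutputPerMillion: 30, thinkingPerMillion: 2.5 },
  "gemini-3-pro-image-preview": { inputPerMillion: 2, textOutputPerMillion: 120, imageOutputPerMillion: 120, thinkingPerMillion: 120 },
  "gemini-3.1-flash-image-preview": { inputPerMillion: 0.5, textOutputPerMillion: 60, imageOutputPerMillion: 60, thinkingPerMillion: 60 },
};

// Hypothetical input shape: one count per billed token category.
interface TokenCounts {
  promptTokens: number;
  textTokens: number;
  imageTokens: number;
  thinkingTokens: number;
}

const TOKENS_PER_MILLION = 1_000_000;

// Each category is billed at its own per-million rate and the results are summed.
function estimateCostUsd(rates: ModelRates, tokens: TokenCounts): number {
  return (
    (tokens.promptTokens / TOKENS_PER_MILLION) * rates.inputPerMillion +
    (tokens.textTokens / TOKENS_PER_MILLION) * rates.textOutputPerMillion +
    (tokens.imageTokens / TOKENS_PER_MILLION) * rates.imageOutputPerMillion +
    (tokens.thinkingTokens / TOKENS_PER_MILLION) * rates.thinkingPerMillion
  );
}

// Illustrative token counts, not measured values.
const exampleCostUsd = estimateCostUsd(RATES["gemini-2.5-flash-image"], {
  promptTokens: 100,
  textTokens: 0,
  imageTokens: 1290,
  thinkingTokens: 0,
});
console.log(exampleCostUsd);
```

With these rates, image output tokens dominate the total: in the example, the 1290 image tokens account for almost the entire estimate, while the 100 prompt tokens contribute a fraction of a cent.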

UsageReport Interface

Every image generation returns a UsageReport containing:

Field	Type	Description
`promptTokens`	number	Input token count
`outputTokens`	number	Total output tokens
`imageTokens`	number	Image modality output tokens
`thinkingTokens`	number	Internal reasoning tokens
`totalTokens`	number	Combined token count
`estimatedCost`	string	Formatted cost (e.g., "$0.0412")
`pricingVerifiedDate`	string	Date pricing was last verified

Source: src/pricing.ts:47-54
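
As a sketch, the table above maps onto a TypeScript interface like the one below. The field names and types come from the table. The formatEstimatedCost helper and the sample values are assumptions for illustration, and the four-decimal format is inferred only from the "$0.0412" example.

```typescript
// Field names and types follow the UsageReport table in this section.
interface UsageReport {
  promptTokens: number;
  outputTokens: number;
  imageTokens: number;
  thinkingTokens: number;
  totalTokens: number;
  estimatedCost: string;
  pricingVerifiedDate: string;
}

// Hypothetical helper: formats a USD amount with four decimals, like "$0.0412".
function formatEstimatedCost(costUsd: number): string {
  return `$${costUsd.toFixed(4)}`;
}

// Illustrative values only, not output captured from the server.
const sampleReport: UsageReport = {
  promptTokens: 100,
  outputTokens: 1290,
  imageTokens: 1290,
  thinkingTokens: 0,
  totalTokens: 1390,
  estimatedCost: formatEstimatedCost(0.03873),
  pricingVerifiedDate: "2026-04-01",
};
console.log(sampleReport.estimatedCost);
```

Keeping estimatedCost as a preformatted string, as the table specifies, lets the same field carry a non-numeric message when no price is available.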

Handling Unknown Models

If a model is not found in the pricing table, the system returns "unknown (model not in pricing table)" as the estimated cost while still populating token counts. This ensures graceful degradation without breaking workflows for new or custom models.

Source: src/pricing.ts:74-80
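
The fallback described above might look roughly like this. The fallback string is the one quoted in this section. Everything else (the KNOWN_IMAGE_RATES map, the describeCost function, and pricing by image tokens alone) is a simplified assumption for the example rather than the server's implementation.

```typescript
// Simplified, hypothetical lookup: image-output rate only, in USD per million tokens.
const KNOWN_IMAGE_RATES = new Map<string, number>([
  ["gemini-2.5-flash-image", 30],
  ["gemini-3-pro-image-preview", 120],
  ["gemini-3.1-flash-image-preview", 60],
]);

// Fallback text quoted from this section of the manual.
const UNKNOWN_COST = "unknown (model not in pricing table)";

// Returns a formatted cost for known models and the fallback string otherwise.
// Token counts stay available to the caller either way.
function describeCost(model: string, imageTokens: number): string {
  const rate = KNOWN_IMAGE_RATES.get(model);
  if (rate === undefined) {
    return UNKNOWN_COST;
  }
  return `$${((imageTokens / 1_000_000) * rate).toFixed(4)}`;
}

console.log(describeCost("gemini-2.5-flash-image", 1290));
console.log(describeCost("some-future-model", 1290));
```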

Rate Limiting Module

Configuration

Rate limits are configured through environment variables or the JSON config file:

Environment Variable	Config Key	Type	Default	Description
`MAX_REQUESTS_PER_HOUR`	`maxRequestsPerHour`	number	0 (disabled)	Maximum generations per rolling hour
`MAX_COST_PER_HOUR`	`maxCostPerHour`	number	0 (disabled)	Maximum USD spend per rolling hour

Source: src/tracker.ts:35-50

Configuration priority follows this order (highest to lowest):

Environment variables
Local config file (.gemini-image-mcp.json in CWD)
Global config file (~/.gemini-image-mcp.json)
Built-in defaults
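
The priority order above can be sketched as a chain of fallbacks. The environment variable names, config keys, and defaults come from the configuration table. The resolveRateLimits and parseNumber functions are hypothetical, and the sketch assumes the two config files have already been read and parsed into plain objects.

```typescript
interface RateLimitConfig {
  maxRequestsPerHour: number;
  maxCostPerHour: number;
}

// Built-in defaults from the configuration table: 0 means disabled.
const DEFAULT_LIMITS: RateLimitConfig = { maxRequestsPerHour: 0, maxCostPerHour: 0 };

// Treats missing, empty, or non-numeric values as "not set".
function parseNumber(raw: string | undefined): number | undefined {
  if (raw === undefined || raw.trim() === "") return undefined;
  const parsed = Number(raw);
  return Number.isFinite(parsed) ? parsed : undefined;
}

// Hypothetical resolver: env beats local file, local beats global, global beats defaults.
function resolveRateLimits(
  env: Record<string, string | undefined>,
  localFile: Partial<RateLimitConfig>,
  globalFile: Partial<RateLimitConfig>,
): RateLimitConfig {
  return {
    maxRequestsPerHour:
      parseNumber(env["MAX_REQUESTS_PER_HOUR"]) ??
      localFile.maxRequestsPerHour ??
      globalFile.maxRequestsPerHour ??
      DEFAULT_LIMITS.maxRequestsPerHour,
    maxCostPerHour:
      parseNumber(env["MAX_COST_PER_HOUR"]) ??
      localFile.maxCostPerHour ??
      globalFile.maxCostPerHour ??
      DEFAULT_LIMITS.maxCostPerHour,
  };
}

const resolved = resolveRateLimits(
  { MAX_REQUESTS_PER_HOUR: "20" },
  { maxRequestsPerHour: 50, maxCostPerHour: 5 },
  { maxCostPerHour: 10 },
);
console.log(resolved);
```

In this example the request limit comes from the environment and the cost limit from the local file, because each key is resolved independently.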

Rate Limit Enforcement

The checkRateLimit() function performs two checks against a rolling one-hour window:

graph LR
    A[Load Config] --> B[countRecentGenerations]
    B --> C{Hourly Request Limit?}
    C -->|Exceeded| D[Throw Error with count/limit]
    C -->|OK| E{Hourly Cost Limit?}
    E -->|Exceeded| F[Throw Error with $spent/$limit]
    E -->|OK| G[Continue to API Call]

Source: src/tracker.ts:42-58

Error Messages

When rate limits are exceeded, the system throws descriptive errors:

Request limit reached:

Rate limit reached — 20/20 generations used this hour. To change: set MAX_REQUESTS_PER_HOUR env var.

Cost limit reached:

Cost limit reached — $4.50/$5.00 spent this hour. To change: set MAX_COST_PER_HOUR env var.

Source: src/tracker.ts:48-56
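
A rolling-window check of the kind described in this section could be sketched as below. This is an assumption-laden illustration, not the code in src/tracker.ts: the GenerationRecord shape, the function name checkRateLimitSketch, the use of >= comparisons, and the shortened error wording are all choices made for the example.

```typescript
// Hypothetical record of one past generation.
interface GenerationRecord {
  timestampMs: number;
  costUsd: number;
}

// A limit of 0 disables the corresponding check, as in the configuration table.
interface HourlyLimits {
  maxRequestsPerHour: number;
  maxCostPerHour: number;
}

const ONE_HOUR_MS = 60 * 60 * 1000;

// Throws if either hourly limit is already used up; returns normally otherwise.
function checkRateLimitSketch(history: GenerationRecord[], limits: HourlyLimits, nowMs: number): void {
  // Rolling window: only generations from the last 60 minutes count.
  const recent = history.filter((record) => nowMs - record.timestampMs < ONE_HOUR_MS);

  if (limits.maxRequestsPerHour > 0 && recent.length >= limits.maxRequestsPerHour) {
    throw new Error(
      `Rate limit reached: ${recent.length}/${limits.maxRequestsPerHour} generations used this hour.`,
    );
  }

  const spentUsd = recent.reduce((sum, record) => sum + record.costUsd, 0);
  if (limits.maxCostPerHour > 0 && spentUsd >= limits.maxCostPerHour) {
    throw new Error(
      `Cost limit reached: $${spentUsd.toFixed(2)}/$${limits.maxCostPerHour.toFixed(2)} spent this hour.`,
    );
  }
}
```

Passing the current time as a parameter keeps the function deterministic, which makes the window boundary easy to exercise in tests.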

Session Tracking

Session Statistics

Multi-turn editing sessions maintain running totals across all generations within that session:

Stat	Type	Description
`sessionGenerations`	number	Count of generations in current session
`sessionCostCents`	number	Cumulative cost in cents for the session

Source: src/tracker.ts:20-25

SessionStats Interface

Each tool response includes a session object with:

Field	Type	Description
`sessionId`	string	Unique session identifier
`sessionGenerations`	number	Generations in this session
`sessionCostCents`	number	Session cost in cents

Source: src/tracker.ts:1-20
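
To illustrate how the running totals might accumulate, here is a small sketch. The SessionStats fields come from the table above. The addGeneration function is hypothetical, and converting a USD cost to cents by multiplying by 100 is an assumption based on the field name sessionCostCents.

```typescript
// Field names follow the SessionStats table in this section.
interface SessionStats {
  sessionId: string;
  sessionGenerations: number;
  sessionCostCents: number;
}

// Hypothetical update step: returns new stats with one more generation recorded.
function addGeneration(stats: SessionStats, costUsd: number): SessionStats {
  return {
    ...stats,
    sessionGenerations: stats.sessionGenerations + 1,
    sessionCostCents: stats.sessionCostCents + costUsd * 100,
  };
}

// Illustrative session: two generations at $0.0412 each.
const emptyStats: SessionStats = { sessionId: "session-example", sessionGenerations: 0, sessionCostCents: 0 };
const afterTwo = addGeneration(addGeneration(emptyStats, 0.0412), 0.0412);
console.log(afterTwo.sessionGenerations, afterTwo.sessionCostCents.toFixed(2));
```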

Session Management

Sessions expire after 30 minutes of inactivity.
Passing a sessionId continues editing from that session's prior conversation context.
Model mismatch detection prevents mixing models within a session.

Source: CHANGELOG.md
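
The three session rules above could be enforced along these lines. Only the 30-minute expiry and the model mismatch rule come from this manual. The StoredSession shape, the resumeSession function, and the error wording are hypothetical.

```typescript
// 30 minutes of inactivity, per the session rules in this section.
const SESSION_TTL_MS = 30 * 60 * 1000;

// Hypothetical stored session state.
interface StoredSession {
  sessionId: string;
  model: string;
  lastUsedMs: number;
}

// Returns the session if it can be continued; throws if it is unknown,
// expired, or was started with a different model.
function resumeSession(
  sessions: Map<string, StoredSession>,
  sessionId: string,
  model: string,
  nowMs: number,
): StoredSession {
  const session = sessions.get(sessionId);
  if (session === undefined) {
    throw new Error(`Unknown session: ${sessionId}`);
  }
  if (nowMs - session.lastUsedMs > SESSION_TTL_MS) {
    sessions.delete(sessionId);
    throw new Error(`Session expired: ${sessionId}`);
  }
  if (session.model !== model) {
    throw new Error(`Model mismatch: session uses ${session.model}, request uses ${model}`);
  }
  session.lastUsedMs = nowMs; // Activity resets the inactivity timer.
  return session;
}
```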

Generation Manifest

All generations are logged to generations.jsonl in the output directory for auditing and analytics:

{"timestamp":"2026-04-01T12:00:00.000Z","model":"gemini-2.5-flash-image","prompt":"A modern dashboard UI","aspectRatio":"16:9","resolution":"2K","filename":"dashboard-hero","costCents":4.12,"tokens":1295}

Source: src/tracker.ts:60-65
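
Because each manifest line is a standalone JSON object, the file is straightforward to analyze. The sketch below works on the file contents as a string. The ManifestEntry fields mirror the example line above, while toManifestLine and sumRecentCostCents are hypothetical helpers, not functions exported by the project.

```typescript
// Fields mirror the example manifest line in this section.
interface ManifestEntry {
  timestamp: string;
  model: string;
  prompt: string;
  aspectRatio: string;
  resolution: string;
  filename: string;
  costCents: number;
  tokens: number;
}

// One JSON object per line, terminated by a newline (JSON Lines).
function toManifestLine(entry: ManifestEntry): string {
  return JSON.stringify(entry) + "\n";
}

// Sums costCents for entries whose timestamp falls within the last windowMs.
function sumRecentCostCents(jsonl: string, nowMs: number, windowMs: number): number {
  let totalCents = 0;
  for (const line of jsonl.split("\n")) {
    if (line.trim() === "") continue; // Skip the trailing blank line.
    const entry = JSON.parse(line) as ManifestEntry;
    if (nowMs - Date.parse(entry.timestamp) < windowMs) {
      totalCents += entry.costCents;
    }
  }
  return totalCents;
}

// Illustrative manifest with one recent and one older entry.
const baseEntry: ManifestEntry = {
  timestamp: "2026-04-01T12:00:00.000Z",
  model: "gemini-2.5-flash-image",
  prompt: "A modern dashboard UI",
  aspectRatio: "16:9",
  resolution: "2K",
  filename: "dashboard-hero",
  costCents: 4.12,
  tokens: 1295,
};
const manifestText =
  toManifestLine({ ...baseEntry, timestamp: "2026-04-01T10:00:00.000Z", filename: "older" }) +
  toManifestLine(baseEntry);
const lastHourCents = sumRecentCostCents(manifestText, Date.parse("2026-04-01T12:30:00.000Z"), 60 * 60 * 1000);
console.log(lastHourCents);
```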

Tool Response Structure

Every generate_image response includes complete cost and tracking information:

{
  "imagePath": "/home/user/gemini-images/hero-banner.png",
  "mimeType": "image/png",
  "model": "gemini-2.5-flash-image",
  "sessionId": "session-1711929600000-a1b2c3",
  "sessionTurn": 1,
  "usage": {
    "promptTokens": 5,
    "outputTokens": 1295,
    "imageTokens": 1290,
    "thinkingTokens": 412,
    "totalTokens": 1295,
    "estimatedCost": "$0.0412",
    "pricingVerifiedDate": "2026-04-01"
  },
  "session": {
    "sessionId": "session-1711929600000-a1b2c3",
    "sessionGenerations": 1,
    "sessionCostCents": 4.12
  }
}

Recommended Settings

For agentic workflows with iterative image refinement:

Setting	Value	Rationale
`MAX_REQUESTS_PER_HOUR`	20	Prevents runaway loops
`MAX_COST_PER_HOUR`	5.00	Caps hourly spend at $5

Source: README.md

Testing

The pricing and tracking modules each have a dedicated test file:

Test File	Coverage
`src/pricing.test.ts`	Cost calculation, unknown models, missing metadata, pricing table verification
`src/tracker.test.ts`	Rate limit enforcement, session tracking, manifest appending

Source: src/pricing.test.ts:1-50

Summary

The cost tracking and rate limiting system provides transparency and control over API usage through:

Per-generation pricing with detailed token breakdowns across input, text output, image output, and thinking tokens
Hourly rate limiting on both request count and dollar amount
Session-aware tracking for multi-turn editing workflows
Manifest logging for historical analysis and auditing
Graceful degradation when encountering unknown models

Source: https://github.com/JimothySnicket/gemini-image-mcp / Human Manual

Doramagic Pitfall Log

Source-linked risks stay visible on the manual page so the preview does not read like a recommendation.

Found 6 structured pitfall item(s), including 0 high/blocking item(s). Top priority: Capability evidence risk - Capability evidence risk requires verification.

1. Capability evidence risk: Capability evidence risk requires verification

Severity: medium
Finding: README/documentation is current enough for a first validation pass.
User impact: May increase setup, validation, or first-run risk for the user.
Recommended check: Reproduce the official install and quickstart path in an isolated environment.
Evidence: capability.assumptions | mcp_registry:io.github.JimothySnicket/gemini-image:0.2.2 | https://registry.modelcontextprotocol.io/v0.1/servers/io.github.JimothySnicket%2Fgemini-image/versions/0.2.2

2. Maintenance risk: Maintenance risk requires verification

Severity: medium
Finding: Project evidence flags a maintenance risk. Review the linked source before relying on this workflow.
User impact: May increase setup, validation, or first-run risk for the user.
Recommended check: Reproduce the official install and quickstart path in an isolated environment.
Evidence: evidence.maintainer_signals | mcp_registry:io.github.JimothySnicket/gemini-image:0.2.2 | https://registry.modelcontextprotocol.io/v0.1/servers/io.github.JimothySnicket%2Fgemini-image/versions/0.2.2

3. Security or permission risk: Security or permission risk requires verification

Severity: medium
Finding: no_demo
User impact: May increase setup, validation, or first-run risk for the user.
Recommended check: Reproduce the official install and quickstart path in an isolated environment.
Evidence: downstream_validation.risk_items | mcp_registry:io.github.JimothySnicket/gemini-image:0.2.2 | https://registry.modelcontextprotocol.io/v0.1/servers/io.github.JimothySnicket%2Fgemini-image/versions/0.2.2

4. Security or permission risk: Security or permission risk requires verification

Severity: medium
Finding: no_demo
User impact: May increase setup, validation, or first-run risk for the user.
Recommended check: Reproduce the official install and quickstart path in an isolated environment.
Evidence: risks.scoring_risks | mcp_registry:io.github.JimothySnicket/gemini-image:0.2.2 | https://registry.modelcontextprotocol.io/v0.1/servers/io.github.JimothySnicket%2Fgemini-image/versions/0.2.2

5. Maintenance risk: Maintenance risk requires verification

Severity: low
Finding: issue_or_pr_quality=unknown.
User impact: May increase setup, validation, or first-run risk for the user.
Recommended check: Reproduce the official install and quickstart path in an isolated environment.
Evidence: evidence.maintainer_signals | mcp_registry:io.github.JimothySnicket/gemini-image:0.2.2 | https://registry.modelcontextprotocol.io/v0.1/servers/io.github.JimothySnicket%2Fgemini-image/versions/0.2.2

6. Maintenance risk: Maintenance risk requires verification

Severity: low
Finding: release_recency=unknown.
User impact: May increase setup, validation, or first-run risk for the user.
Recommended check: Reproduce the official install and quickstart path in an isolated environment.
Evidence: evidence.maintainer_signals | mcp_registry:io.github.JimothySnicket/gemini-image:0.2.2 | https://registry.modelcontextprotocol.io/v0.1/servers/io.github.JimothySnicket%2Fgemini-image/versions/0.2.2

Source: Doramagic discovery, validation, and Project Pack records

Community Discussion Evidence

These external discussion links are review inputs, not standalone proof that the project is production-ready.

Doramagic exposes project-level community discussion separately from official documentation. Review these links before using gemini-image-mcp with real data or production workflows.

Capability evidence risk requires verification - GitHub / issue

Source: Project Pack community evidence and pitfall evidence