Doramagic Project Pack · Human Manual
gemini-image-mcp
Home
Related topics: Installation Guide, MCP Client Configuration
Overview
gemini-image-mcp is a Model Context Protocol (MCP) server that provides Google Gemini-powered image generation, editing, and local image processing capabilities. It integrates with MCP-compatible AI assistants (such as Claude Code and Claude Desktop) to enable seamless AI-driven image workflows directly from conversational interfaces. Source: package.json
The project exposes two primary tools: generate_image for AI-powered image creation and editing via the Gemini API, and process_image for local image manipulation using the sharp library—free and fast with no API calls required. Source: README.md
Project Metadata
| Property | Value |
|---|---|
| Package Name | @jimothy-snicket/gemini-image-mcp |
| Version | 0.4.0 |
| MCP Server Name | io.github.JimothySnicket/gemini-image |
| License | MIT |
| Author | Jamie Donaldson |
| Runtime | Node.js >= 18.0.0 |
| Package Manager | Bun (primary), npm compatible |
| Repository | https://github.com/JimothySnicket/gemini-image-mcp |
Source: package.json, server.json
Architecture
graph TD
A[MCP Client<br/>Claude Code / Claude Desktop] --> B[gemini-image-mcp Server]
B --> C[generate_image Tool]
B --> D[process_image Tool]
C --> E[Google Gemini API]
D --> F[sharp Library<br/>Local Processing]
E --> G[Image Models]
G --> G1[gemini-2.5-flash-image]
G --> G2[gemini-3-pro-image-preview]
G --> G3[gemini-3.1-flash-image-preview]
F --> H[Crop / Resize]
F --> I[Background Removal]
F --> J[Trim / Format]
C --> K[Output Directory]
D --> K
K --> L[generations.jsonl<br/>Manifest Log]
System Components
| Component | Technology | Purpose |
|---|---|---|
| MCP Server | @modelcontextprotocol/sdk v1.22.0 | Protocol implementation for AI tool integration |
| Gemini SDK | @google/genai v1.44.0 | Google AI API client |
| Image Processing | sharp v0.34.5 | Local image manipulation |
| Schema Validation | zod v3.24.0 | Parameter validation for both tools |
| Language | TypeScript 6.0.3 | Type-safe source code |
Source: package.json
Supported Gemini Models
| Model | Speed | Cost | Resolution Support | Max Reference Images | Special Features |
|---|---|---|---|---|---|
| gemini-2.5-flash-image | Fast (~6s) | ~$0.04/image | 1K only | 1 | Default model, deprecates Oct 2026 |
| gemini-3-pro-image-preview | Slow (~16s) | ~$0.15/image | 1K, 2K, 4K | 14 | Best quality, text rendering |
| gemini-3.1-flash-image-preview | Balanced | Variable | 512, 1K, 2K, 4K | 1 | Google Search grounding |
Source: README.md
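The resolution column above lends itself to a pre-flight check before a request is sent. The following is a minimal sketch, assuming a hand-written lookup table transcribed from that column; `SUPPORTED_RESOLUTIONS` and `supportsResolution` are hypothetical names and are not taken from the server's source.

```typescript
// Hypothetical lookup table transcribed from the "Resolution Support" column.
type Resolution = "512" | "1K" | "2K" | "4K";

const SUPPORTED_RESOLUTIONS: Record<string, readonly Resolution[]> = {
  "gemini-2.5-flash-image": ["1K"],
  "gemini-3-pro-image-preview": ["1K", "2K", "4K"],
  "gemini-3.1-flash-image-preview": ["512", "1K", "2K", "4K"],
};

// Returns true only when the model is listed and supports the resolution.
function supportsResolution(model: string, resolution: Resolution): boolean {
  const supported = SUPPORTED_RESOLUTIONS[model];
  return supported !== undefined && supported.indexOf(resolution) !== -1;
}

console.log(supportsResolution("gemini-2.5-flash-image", "4K"));
console.log(supportsResolution("gemini-3-pro-image-preview", "4K"));
```

Unknown model IDs return false here; a real client might prefer to let the API decide, since the server discovers models at startup.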
Available Tools
Tool: `generate_image`
AI-powered image generation and editing via Google Gemini API.
Parameters:
| Parameter | Required | Type | Description |
|---|---|---|---|
| prompt | Yes | string | Text description or editing instruction |
| images | No | string[] | Array of file paths to input/reference images |
| model | No | string | Gemini model ID (auto-detected if omitted) |
| aspectRatio | No | string | Image ratio: 1:1, 16:9, 9:16, 3:2, 2:3, 4:3, 3:4, 21:9 |
| resolution | No | string | 1K, 2K, 4K |
| outputDir | No | string | Override output directory |
| filename | No | string | Base name for saved file (auto-versioned if duplicate) |
| subfolder | No | string | Subfolder within output directory |
| sessionId | No | string | Continue multi-turn editing session |
| seed | No | integer | Reproducible generation seed |
| useSearchGrounding | No | boolean | Enable Google Search grounding (gemini-3.1-flash only) |
Source: README.md, src/index.ts
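MCP clients invoke a tool by sending a JSON-RPC `tools/call` request whose `params` carry the tool name and its arguments. The sketch below builds such a request for generate_image using parameters from the table above; the helper name `buildGenerateImageCall` and the prompt text are illustrative assumptions, and most clients construct this message for you.

```typescript
// Shape of an MCP "tools/call" JSON-RPC request.
interface ToolCallRequest {
  jsonrpc: "2.0";
  id: number;
  method: "tools/call";
  params: { name: string; arguments: Record<string, unknown> };
}

// Hypothetical helper that wraps generate_image arguments in a request envelope.
function buildGenerateImageCall(id: number, args: Record<string, unknown>): ToolCallRequest {
  return {
    jsonrpc: "2.0",
    id,
    method: "tools/call",
    params: { name: "generate_image", arguments: args },
  };
}

const request = buildGenerateImageCall(1, {
  prompt: "A lighthouse at dusk, watercolor style", // only required parameter
  aspectRatio: "16:9",
  resolution: "1K",
});

console.log(JSON.stringify(request, null, 2));
```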
Tool: `process_image`
Local image processing via sharp. Free, fast, no API calls.
Parameters:
| Parameter | Required | Type | Description |
|---|---|---|---|
| imagePath | Yes | string | Path to image file to process |
| crop | No | object | Pixel dimensions, aspect ratio, or focal point strategy |
| resize | No | object | Resize to width/height (maintains aspect ratio) |
| removeBackground | No | object | Threshold (white) or chroma key (any solid color) |
| trim | No | boolean | Auto-remove whitespace/transparent borders |
| format | No | string | Convert to: png, jpeg, webp |
| quality | No | number | Output quality for JPEG/WebP (1-100) |
| outputDir | No | string | Override output directory |
| filename | No | string | Base name for saved file |
| subfolder | No | string | Subfolder within output directory |
Crop Options:
// Pixel-exact
{"width": 500, "height": 300, "left": 100, "top": 50}
// Aspect ratio (center crop)
{"aspectRatio": "16:9"}
// Focal point strategies
{"aspectRatio": "16:9", "strategy": "attention"} // Visually interesting region
{"aspectRatio": "16:9", "strategy": "entropy"} // Most detailed region
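To make the center-crop option concrete, the geometry can be worked out by hand: keep the largest rectangle of the requested ratio that fits inside the image, then center it. This is a sketch of that arithmetic only, not the server's or sharp's implementation; `centerCropBox` and `CropBox` are hypothetical names, and the rounding choices are assumptions.

```typescript
interface CropBox {
  left: number;
  top: number;
  width: number;
  height: number;
}

// Largest centered rectangle with the given "W:H" ratio that fits in the image.
function centerCropBox(imageWidth: number, imageHeight: number, aspectRatio: string): CropBox {
  const [w, h] = aspectRatio.split(":").map(Number);
  if (!(w > 0) || !(h > 0)) {
    throw new Error(`Invalid aspect ratio: ${aspectRatio}`);
  }
  // Start from full width; if the result is too tall, fall back to full height.
  let width = imageWidth;
  let height = Math.round((imageWidth * h) / w);
  if (height > imageHeight) {
    height = imageHeight;
    width = Math.round((imageHeight * w) / h);
  }
  return {
    left: Math.floor((imageWidth - width) / 2),
    top: Math.floor((imageHeight - height) / 2),
    width,
    height,
  };
}

console.log(centerCropBox(1920, 1200, "16:9"));
```

The resulting box has the same fields as the pixel-exact crop object shown above, so it could be passed in that form.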
Background Removal Options:
// Threshold-based (white backgrounds)
{"threshold": 230}
// Chroma key (green screen / any solid color)
{"color": "#00FF00"}
Source: README.md, skills/image-generation/SKILL.md
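The threshold option can be pictured as a per-pixel test on raw RGBA data. The sketch below assumes a pixel counts as background when all three color channels are at or above the threshold; that rule, and the name `removeWhiteBackground`, are assumptions for illustration rather than the server's documented behavior.

```typescript
// Returns a copy of the RGBA buffer with near-white pixels made fully transparent.
function removeWhiteBackground(rgba: Uint8Array, threshold: number): Uint8Array {
  const out = new Uint8Array(rgba); // copy so the input stays untouched
  for (let i = 0; i + 3 < out.length; i += 4) {
    const isNearWhite =
      out[i] >= threshold && out[i + 1] >= threshold && out[i + 2] >= threshold;
    if (isNearWhite) {
      out[i + 3] = 0; // alpha channel
    }
  }
  return out;
}

// Two pixels: pure white and a dark blue-gray, both opaque.
const pixels = new Uint8Array([255, 255, 255, 255, 10, 20, 30, 255]);
const keyed = removeWhiteBackground(pixels, 230);
console.log(Array.from(keyed));
```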
Feature Summary
Generate Image Features
- Text-to-image — Describe desired output, receive generated image
- Image editing — Provide reference images with editing instructions
- Multi-turn sessions — Iteratively refine images using conversation history
- Multi-image input — Up to 14 reference images on gemini-3-pro
- Cost reporting — Token counts, estimated USD cost, and session totals in every response
- Rate limiting — Configurable per-hour caps on requests and cost
- Auto model discovery — Detects available image models from API key at startup
- Seed support — Reproducible generation with integer seeds
- Google Search grounding — Real-world accuracy on gemini-3.1-flash
Source: README.md
Process Image Features
- Crop — Pixel-exact, aspect ratio (center), or focal point (attention/entropy)
- Resize — To width, height, or both (maintains aspect ratio)
- Background removal — Threshold-based (white backgrounds) or chroma key (any solid color)
- Chroma key pipeline — HSV keying with smoothstep feather, spill suppression, 5-pass 3x3 edge anti-aliasing
- Trim — Auto-remove whitespace borders
- Format conversion — PNG, JPEG, WebP with quality control
Source: CHANGELOG.md
Shared Features
- Output organization — Meaningful filenames with auto-versioning, subfolders
- Generation manifest — generations.jsonl logs every generation with prompt, params, cost
- Full aspect ratio support — 1:1, 16:9, 9:16, 3:2, 2:3, 4:3, 3:4, 21:9
- Resolution control — 1K, 2K, 4K
Source: README.md
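A `.jsonl` manifest stores one JSON object per line, which makes it easy to append to and to scan later. The following sketch shows the serialization and parsing side under an assumed record shape; the field names in `ManifestEntry` are hypothetical and the real manifest may record different fields.

```typescript
// Hypothetical record shape; the actual manifest fields may differ.
interface ManifestEntry {
  timestamp: string;
  prompt: string;
  model: string;
  file: string;
  estimatedCostUsd: number;
}

// One JSON object per line, terminated by a newline (JSON Lines).
function toManifestLine(entry: ManifestEntry): string {
  return JSON.stringify(entry) + "\n";
}

// Parses manifest text back into entries, skipping blank lines.
function parseManifest(text: string): ManifestEntry[] {
  return text
    .split("\n")
    .filter((line) => line.trim() !== "")
    .map((line) => JSON.parse(line) as ManifestEntry);
}

const manifestText =
  toManifestLine({
    timestamp: "2026-04-01T12:00:00Z",
    prompt: "A red fox in snow",
    model: "gemini-2.5-flash-image",
    file: "red-fox.png",
    estimatedCostUsd: 0.04,
  }) +
  toManifestLine({
    timestamp: "2026-04-01T12:05:00Z",
    prompt: "Same fox,\nnow at night",
    model: "gemini-3-pro-image-preview",
    file: "red-fox-v2.png",
    estimatedCostUsd: 0.15,
  });

console.log(parseManifest(manifestText).map((entry) => entry.file));
```

Because `JSON.stringify` escapes newlines inside strings, a multi-line prompt still occupies a single manifest line.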
Workflow Diagram
graph LR
subgraph "Text-to-Image"
A1[User Prompt] --> B1[generate_image]
B1 --> C1[Gemini API]
C1 --> D1[Save PNG/JPEG]
end
subgraph "Image Editing"
A2[User Prompt + Reference Image] --> B2[generate_image]
B2 --> C2[Gemini API]
C2 --> D2[Save + sessionId]
end
subgraph "Local Processing"
A3[Input Image] --> B3[process_image]
B3 --> C3[sharp Pipeline]
C3 --> D3[Processed Output]
end
subgraph "Multi-Turn Refinement"
D2 --> E1[Pass sessionId]
E1 --> B2
B2 --> D4[Refined Image]
end
Setup and Configuration
Prerequisites
- Gemini API Key — Obtain from Google AI Studio
- Node.js >= 18.0.0 or Bun runtime
- MCP-compatible client (Claude Code, Claude Desktop, or other MCP clients)
Environment Setup
Windows (PowerShell):
[System.Environment]::SetEnvironmentVariable('GEMINI_API_KEY', 'your-key-here', 'User')
macOS / Linux:
echo 'export GEMINI_API_KEY="your-key-here"' >> ~/.bashrc
source ~/.bashrc
Source: README.md
Configuration File
Create a config file using the --init flag:
npx @jimothy-snicket/gemini-image-mcp --init
This creates ~/.gemini-image-mcp.json with all defaults and inline documentation.
Configuration Priority:
Environment Variables > Local Config (.gemini-image-mcp.json in CWD) > Global Config (~/.gemini-image-mcp.json) > Defaults
Example Config Structure:
{
  "defaultModel": "gemini-3.1-flash-image-preview",
  "defaults": {
    "generate": {
      "aspectRatio": "16:9",
      "resolution": "2K"
    },
    "process": {
      "removeBackground": { "color": "#00FF00" },
      "trim": true
    }
  }
}
Source: README.md
Rate Limiting
Configure rate limits to prevent runaway agent costs:
- MAX_REQUESTS_PER_HOUR — Maximum API requests per hour (e.g., 20)
- MAX_COST_PER_HOUR — Maximum cost in USD per hour (e.g., 5)
Source: README.md, skills/image-generation/SKILL.md
Development
Build Commands
bun install # Install dependencies
bun run build # TypeScript -> dist/
bun run dev # Run directly with Bun
npm run start # Run production build with Node
Source: CONTRIBUTING.md
Project Structure
| Path | Purpose |
|---|---|
| src/index.ts | Main MCP server implementation with tool definitions |
| dist/ | Compiled JavaScript output |
| skills/image-generation/SKILL.md | Claude Code plugin skill documentation |
| plugin.json | Claude Code plugin manifest |
| server.json | MCP server registry configuration |
Source: package.json
Version History
| Version | Release Date | Key Changes |
|---|---|---|
| 0.4.0 | 2026-05 | Config module, JSONC parsing, security hardening, prototype pollution guards |
| 0.2.0 | 2026-04-01 | process_image tool, chroma key pipeline, session tracking, rate limiting |
| 0.1.0 | 2026-01 | Initial release with basic generation |
Source: CHANGELOG.md
Security
For security vulnerabilities, report through GitHub Security Advisories rather than opening a public issue. Source: SECURITY.md
Security Features in v0.4.0:
- API keys rejected from config files with warning
- String-aware JSONC comment stripping (won't mangle URLs in quoted strings)
- Prototype pollution guard on config deep merge (__proto__, constructor, prototype)
- Unknown config keys warned and dropped
Source: CHANGELOG.md
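The prototype pollution guard listed above can be illustrated with a deep merge that refuses to copy the three dangerous keys. This is a minimal sketch of the general technique, not the project's code; `safeDeepMerge`, `BLOCKED_KEYS`, and `isPlainObject` are hypothetical names.

```typescript
type PlainObject = { [key: string]: unknown };

// Keys that could reach Object.prototype if copied blindly.
const BLOCKED_KEYS = new Set(["__proto__", "constructor", "prototype"]);

function isPlainObject(value: unknown): value is PlainObject {
  return typeof value === "object" && value !== null && !Array.isArray(value);
}

// Merges override into base, recursing into nested objects and skipping blocked keys.
function safeDeepMerge(base: PlainObject, override: PlainObject): PlainObject {
  const result: PlainObject = {};
  for (const source of [base, override]) {
    for (const key of Object.keys(source)) {
      if (BLOCKED_KEYS.has(key)) continue;
      const current = result[key];
      const next = source[key];
      // Nested objects are always rebuilt so blocked keys are filtered at every depth.
      result[key] = isPlainObject(next)
        ? safeDeepMerge(isPlainObject(current) ? current : {}, next)
        : next;
    }
  }
  return result;
}

// JSON.parse creates an own "__proto__" property, which a naive merge would follow.
const untrusted = JSON.parse(
  '{"__proto__": {"polluted": true}, "defaults": {"generate": {"resolution": "2K"}}}'
) as PlainObject;
const mergedConfig = safeDeepMerge({ defaults: { generate: { aspectRatio: "1:1" } } }, untrusted);
console.log(JSON.stringify(mergedConfig));
```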
Contributing
Before contributing, open an issue to discuss the bug or feature. Development follows these guidelines:
- One thing per PR
- Ensure bun run build succeeds with no errors
- Test changes manually against the actual Gemini API
- Keep scope tight—open separate issues for unrelated fixes
Source: CONTRIBUTING.md
Source: https://github.com/JimothySnicket/gemini-image-mcp / Human Manual
Installation Guide
Related topics: Home, MCP Client Configuration, Configuration Guide
This guide covers all methods to install and configure the gemini-image-mcp server, a Model Context Protocol (MCP) server that provides Google Gemini image generation, editing, and local image processing capabilities.
Overview
The gemini-image-mcp server provides two primary tools:
| Tool | Description |
|---|---|
| generate_image | AI-powered image generation and editing via Gemini API |
| process_image | Local image processing (crop, resize, background removal) via Sharp |
Package Details:
| Property | Value |
|---|---|
| Package Name | @jimothy-snicket/gemini-image-mcp |
| Version | 0.4.0 |
| Engine | Node.js >= 18.0.0 |
| License | MIT |
| Transport | stdio |
Source: package.json
Prerequisites
Before installation, ensure the following requirements are met:
System Requirements
- Node.js: Version 18.0.0 or higher
- Package Manager: npm (comes with Node.js)
- MCP Client: A compatible MCP client such as Claude Code, Claude Desktop, or any MCP-compatible tool
Required Accounts
- Google Gemini API Key: Obtain from Google AI Studio
Note: Google AI Studio provides generous rate limits for the Gemini API at no cost to start.
Source: README.md
Installation Methods
Method 1: Global npm Installation
Install the package globally for system-wide access:
npm install -g @jimothy-snicket/gemini-image-mcp
After installation, the server can be invoked via the gemini-image-mcp command:
gemini-image-mcp
Source: package.json:14 README.md
Method 2: NPX (No Installation Required)
Run directly without installation using npx:
npx -y @jimothy-snicket/gemini-image-mcp
This method automatically downloads and executes the package, making it ideal for quick testing or temporary use.
Source: README.md
Method 3: Claude Code Plugin
Add the MCP server to Claude Code with a single command:
claude mcp add gemini-image -- npx -y @jimothy-snicket/gemini-image-mcp
Claude Code automatically picks up the GEMINI_API_KEY environment variable from your shell.
Source: README.md
Method 4: Manual MCP Configuration
Create a .mcp.json configuration file in your project root or ~/.claude/.mcp.json for global access:
{
  "mcpServers": {
    "gemini-image": {
      "command": "npx",
      "args": ["-y", "@jimothy-snicket/gemini-image-mcp"],
      "env": {
        "GEMINI_API_KEY": "${GEMINI_API_KEY}"
      }
    }
  }
}
Security Note: The ${GEMINI_API_KEY} syntax reads the value from your shell environment, ensuring your actual API key is never written into configuration files.
Source: README.md
Method 5: Claude Desktop
For Claude Desktop users, edit the configuration file:
| OS | File Path |
|---|---|
| macOS | ~/Library/Application Support/Claude/claude_desktop_config.json |
| Windows | %APPDATA%\Claude\claude_desktop_config.json |
{
  "mcpServers": {
    "gemini-image": {
      "command": "npx",
      "args": ["-y", "@jimothy-snicket/gemini-image-mcp"],
      "env": {
        "GEMINI_API_KEY": "${GEMINI_API_KEY}"
      }
    }
  }
}
Source: README.md
Environment Setup
Setting the GEMINI_API_KEY
The server requires a GEMINI_API_KEY environment variable to authenticate with the Google Gemini API.
#### Windows (PowerShell)
Open PowerShell (setting a User-scope variable does not require administrator rights) and execute:
[System.Environment]::SetEnvironmentVariable('GEMINI_API_KEY', 'your-key-here', 'User')
After setting the environment variable, restart your terminal to ensure the variable is loaded.
#### macOS / Linux
Add the export statement to your shell configuration file:
echo 'export GEMINI_API_KEY="your-key-here"' >> ~/.bashrc
source ~/.bashrc
For zsh users, use:
echo 'export GEMINI_API_KEY="your-key-here"' >> ~/.zshrc
source ~/.zshrc
#### Verification
Confirm the API key is set correctly:
echo $GEMINI_API_KEY
This should display your API key. In PowerShell, use echo $env:GEMINI_API_KEY instead. If the output is empty, restart your terminal or run source ~/.bashrc (or the equivalent for your shell).
Source: README.md
Configuration File Setup
The server supports configuration files for persistent settings. Two methods are available:
Initialize Default Config File
Create a global configuration file at ~/.gemini-image-mcp.json:
npx @jimothy-snicket/gemini-image-mcp --init
Initialize Local Config File
Create a project-local configuration file at .gemini-image-mcp.json in the current working directory:
npx @jimothy-snicket/gemini-image-mcp --init --local
Source: README.md
Configuration Priority
Settings are resolved in the following order of precedence:
Environment Variables > Local Config (.gemini-image-mcp.json) > Global Config (~/.gemini-image-mcp.json) > Default Values
Per-request parameters always override all configuration defaults.
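The precedence rules can be read as layers applied from lowest to highest priority, where a layer only overrides a setting it actually defines. The sketch below illustrates that idea for two settings; `resolveSettings` and the sample layer values are assumptions for illustration, not the server's configuration code.

```typescript
interface Settings {
  defaultModel?: string;
  outputDir?: string;
}

// Applies layers in order; later layers win, and undefined never overwrites a value.
function resolveSettings(...layersLowToHigh: Settings[]): Settings {
  const resolved: Settings = {};
  for (const layer of layersLowToHigh) {
    if (layer.defaultModel !== undefined) resolved.defaultModel = layer.defaultModel;
    if (layer.outputDir !== undefined) resolved.outputDir = layer.outputDir;
  }
  return resolved;
}

// Illustrative layer contents, listed from lowest to highest priority.
const builtInDefaults: Settings = { defaultModel: "gemini-2.5-flash-image", outputDir: "~/gemini-images" };
const globalConfig: Settings = { defaultModel: "gemini-3.1-flash-image-preview" };
const localConfig: Settings = { outputDir: "./images" };
const envVars: Settings = { defaultModel: "gemini-3-pro-image-preview" };
const perRequest: Settings = {};

const effective = resolveSettings(builtInDefaults, globalConfig, localConfig, envVars, perRequest);
console.log(effective);
```

In this example the environment variable decides the model, while the local config decides the output directory because no higher layer sets it.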
Source: README.md
Configuration Schema
The configuration file supports the following structure:
{
  "defaultModel": "gemini-3.1-flash-image-preview",
  "defaults": {
    "generate": {
      "aspectRatio": "16:9",
      "resolution": "2K"
    },
    "process": {
      "removeBackground": { "color": "#00FF00" },
      "trim": true
    }
  }
}
#### Configuration Parameters
| Parameter | Type | Description | Default |
|---|---|---|---|
| defaultModel | string | Default Gemini model for image generation | gemini-2.5-flash-image |
| defaults.generate.aspectRatio | string | Default aspect ratio | 1:1 |
| defaults.generate.resolution | string | Default resolution | 1K |
| defaults.process.removeBackground | object | Default background removal settings | {} |
| defaults.process.trim | boolean | Default trim setting | false |
| defaults.process.format | string | Default output format | png |
| defaults.process.quality | number | Default quality (1-100) | 90 |
Source: README.md
Development Setup
For contributing to the project or running from source:
1. Clone the Repository
git clone https://github.com/JimothySnicket/gemini-image-mcp.git
cd gemini-image-mcp
2. Install Dependencies
The project uses Bun as its package manager:
bun install
3. Build the Project
Compile TypeScript to JavaScript:
bun run build
This produces output in the dist/ directory.
4. Run in Development Mode
Execute directly from source using Bun:
bun run dev
5. Run the Compiled Version
After building, start the compiled server:
bun run start
Or with Node.js:
node dist/index.js
Source: CONTRIBUTING.md package.json
Rate Limiting Configuration
The server supports rate limiting to prevent runaway agents or excessive costs:
| Environment Variable | Description | Example |
|---|---|---|
| MAX_REQUESTS_PER_HOUR | Maximum API requests per hour | 20 |
| MAX_COST_PER_HOUR | Maximum cost per hour in USD | 5 |
Example sensible defaults for an agent loop:
export MAX_REQUESTS_PER_HOUR=20
export MAX_COST_PER_HOUR=5
Note: The server logs a warning at startup if no rate limits are configured.
Source: README.md
Supported Gemini Models
| Model | Strengths | Supported Resolutions |
|---|---|---|
| gemini-2.5-flash-image | Fast, cheap (~$0.04/image) | 1K only (deprecates Oct 2026) |
| gemini-3-pro-image-preview | Best quality, text rendering | 1K, 2K, 4K |
| gemini-3.1-flash-image-preview | Speed + quality balance, Google Search grounding | 512, 1K, 2K, 4K |
The server performs automatic model discovery at startup, detecting image-capable models available with your API key.
Source: README.md
Server Metadata
The server is registered with the MCP registry:
{
  "name": "io.github.JimothySnicket/gemini-image",
  "version": "0.4.0",
  "description": "Google Gemini image generation, editing, and local processing via MCP"
}
Source: server.json
Installation Flow Diagram
graph TD
A[Start Installation] --> B{Have GEMINI_API_KEY?}
B -->|No| C[Get API Key from Google AI Studio]
B -->|Yes| D{Choose Installation Method}
C --> D
D -->|Global| E[npm install -g]
D -->|Temporary| F[npx -y]
D -->|Claude Code| G[claude mcp add command]
D -->|Claude Desktop| H[Edit claude_desktop_config.json]
D -->|Development| I[Clone repo + bun install]
E --> J{Setup Config File?}
F --> J
G --> J
H --> J
I --> J
J -->|Yes| K[Run --init or --init --local]
J -->|No| L[Use Defaults]
K --> M[Start Using MCP Server]
L --> M
Verification Checklist
After installation, verify your setup by checking:
- [ ] echo $GEMINI_API_KEY returns your API key
- [ ] Server starts without errors
- [ ] MCP client recognizes the gemini-image server
- [ ] Test generate_image tool with a simple prompt
- [ ] Rate limiting is configured (recommended for agent use)
Troubleshooting
API Key Not Found
If the server reports that GEMINI_API_KEY is not set:
- Verify the environment variable is set: echo $GEMINI_API_KEY
- Restart your terminal session
- For Claude Desktop, ensure the env variable is set before starting the application
Model Not Available
If you receive a model not available error:
- The server performs automatic model discovery at startup
- Verify your API key has access to the requested model
- Check Google AI Studio for model availability
Build Errors
If bun run build fails:
- Ensure Bun is installed: bun --version
- Clear node_modules and reinstall: rm -rf node_modules && bun install
- Check TypeScript version compatibility
Source: CONTRIBUTING.md README.md
MCP Client Configuration
Related topics: Installation Guide, Home
Overview
The gemini-image-mcp project provides an MCP (Model Context Protocol) server that enables AI-powered image generation and local image processing through Google Gemini. MCP Client Configuration encompasses all methods and mechanisms available to connect MCP-compatible clients to this server, pass required authentication credentials, and customize server behavior through environment variables or configuration files.
The server exposes two primary tools: generate_image for AI-powered image generation via the Gemini API, and process_image for local image manipulation using the sharp library. Both tools are accessible to any MCP-compatible client once the connection is established. Source: README.md:1-15
Architecture
graph TD
A[MCP Client<br/>Claude Code / Claude Desktop] --> B[gemini-image-mcp Server]
B --> C[Google Gemini API]
B --> D[Local Processing<br/>sharp library]
E[Environment Variables] --> B
F[Config File<br/>~/.gemini-image-mcp.json] --> B
G[Local Config<br/>.gemini-image-mcp.json] --> B
H[GEMINI_API_KEY] --> E
I[OUTPUT_DIR] --> E
J[DEFAULT_MODEL] --> E
MCP Server Registration
The server registers with the MCP protocol using the official @modelcontextprotocol/sdk. Upon connection, clients receive metadata describing available tools and server capabilities. Source: src/index.ts:1-35
Server Identity
| Property | Value |
|---|---|
| Server Name | gemini-image-mcp |
| Version | Dynamic from package.json |
| MCP Name | io.github.JimothySnicket/gemini-image |
| Transport | stdio |
| Protocol Version | 2025-12-11 |
The server name and version are read dynamically from package.json at runtime, ensuring the MCP handshake always reports the correct version. Source: package.json:3-7
Server Instructions
The MCP server provides structured instructions to connecting clients describing the available tools and configuration hierarchy:
Gemini image generation and local image processing. Two tools: generate_image (AI-powered, costs money)
and process_image (local via sharp, free). Configuration can be set via a JSON config file — run
`npx @jimothy-snicket/gemini-image-mcp --init` to create ~/.gemini-image-mcp.json with commented defaults.
A local .gemini-image-mcp.json in the project directory can override global settings.
Priority: per-request params > env vars > local config > global config > defaults.
Source: src/index.ts:17-26
Environment Variables
Environment variables provide the primary mechanism for configuring the MCP server. They are read at server startup and apply globally to all requests.
Required Variables
| Variable | Description | Required |
|---|---|---|
| GEMINI_API_KEY | Google Gemini API key from AI Studio | Yes |
The GEMINI_API_KEY is mandatory. The server will fail to start without it, displaying a clear error message indicating the missing credential. Source: README.md:68-80
Optional Variables
| Variable | Default | Description |
|---|---|---|
| OUTPUT_DIR | ~/gemini-images | Default directory for saved images |
| DEFAULT_MODEL | gemini-2.5-flash-image | Default Gemini model |
| LOG_LEVEL | info | Log level: debug, info, or error |
| REQUEST_TIMEOUT_MS | 60000 | API request timeout in milliseconds |
| SESSION_TIMEOUT_MS | 1800000 | Multi-turn session expiry (30 minutes) |
| MAX_REQUESTS_PER_HOUR | 0 | Max image generations per rolling hour (0 = unlimited) |
| MAX_COST_PER_HOUR | 0 | Max estimated cost (USD) per rolling hour (0 = unlimited) |
Source: src/config.ts:4-30
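Numeric variables such as the timeouts arrive as strings, so they need parsing with a fallback to the documented default. The sketch below shows one way to do that; `readNonNegativeInt` is a hypothetical helper, and falling back to the default on invalid input is an assumption, since the source does not say how the server treats malformed values.

```typescript
// Reads a non-negative integer from an environment-like map, or returns the fallback.
function readNonNegativeInt(
  env: Record<string, string | undefined>,
  name: string,
  fallback: number
): number {
  const raw = env[name];
  if (raw === undefined || raw.trim() === "") return fallback;
  const value = Number(raw);
  return Number.isInteger(value) && value >= 0 ? value : fallback;
}

// Illustrative environment; in Node this map would typically be process.env.
const sampleEnv: Record<string, string | undefined> = {
  REQUEST_TIMEOUT_MS: "90000",
  MAX_REQUESTS_PER_HOUR: "not-a-number",
};

const requestTimeoutMs = readNonNegativeInt(sampleEnv, "REQUEST_TIMEOUT_MS", 60000);
const sessionTimeoutMs = readNonNegativeInt(sampleEnv, "SESSION_TIMEOUT_MS", 1800000);
const maxRequestsPerHour = readNonNegativeInt(sampleEnv, "MAX_REQUESTS_PER_HOUR", 0);
console.log({ requestTimeoutMs, sessionTimeoutMs, maxRequestsPerHour });
```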
Setting the API Key
macOS / Linux:
echo 'export GEMINI_API_KEY="your-key-here"' >> ~/.bashrc
source ~/.bashrc
Windows (PowerShell):
[System.Environment]::SetEnvironmentVariable('GEMINI_API_KEY', 'your-key-here', 'User')
Source: README.md:75-88
Rate Limiting Configuration
Rate limiting is strongly recommended when agents have access to the generate_image tool, as an agent in a loop can generate images rapidly.
# Example: Limit to 20 requests or $5 per rolling hour
export MAX_REQUESTS_PER_HOUR=20
export MAX_COST_PER_HOUR=5
Source: README.md:166-170
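A rolling-hour cap of this kind can be modeled by remembering recent requests and discarding those older than sixty minutes before each check. The class below is a minimal sketch of that idea, treating 0 as "unlimited" as in the variable table; `RollingHourLimiter` is a hypothetical name, and details such as whether rejected requests count toward the window are assumptions.

```typescript
class RollingHourLimiter {
  private events: { at: number; costUsd: number }[] = [];
  private readonly maxRequests: number;
  private readonly maxCostUsd: number;

  constructor(maxRequests: number, maxCostUsd: number) {
    this.maxRequests = maxRequests;
    this.maxCostUsd = maxCostUsd;
  }

  // Records the request and returns true if it fits both caps; a cap of 0 is disabled.
  tryRecord(costUsd: number, now: number = Date.now()): boolean {
    const windowStart = now - 60 * 60 * 1000;
    this.events = this.events.filter((event) => event.at > windowStart);
    const spent = this.events.reduce((sum, event) => sum + event.costUsd, 0);
    if (this.maxRequests > 0 && this.events.length >= this.maxRequests) return false;
    if (this.maxCostUsd > 0 && spent + costUsd > this.maxCostUsd) return false;
    this.events.push({ at: now, costUsd });
    return true;
  }
}

// At most 2 requests per rolling hour, no cost cap.
const limiter = new RollingHourLimiter(2, 0);
const start = 1000000;
console.log(limiter.tryRecord(1, start), limiter.tryRecord(1, start + 1), limiter.tryRecord(1, start + 2));
```

Rejected requests are not recorded in this sketch, so a blocked agent does not extend its own lockout.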
Configuration Files
Beyond environment variables, the server supports persistent JSON configuration files with comments (JSONC format).
Config File Locations
| Location | Purpose |
|---|---|
| ~/.gemini-image-mcp.json | Global configuration for all projects |
| .gemini-image-mcp.json | Project-specific overrides |
Source: src/config.ts:1-20
Configuration Priority
graph LR
A[Per-request Parameters] --> Z[Highest Priority]
B[Environment Variables] --> Y
C[Local Config<br/>.gemini-image-mcp.json] --> X
D[Global Config<br/>~/.gemini-image-mcp.json] --> W
E[Built-in Defaults] --> V[Lowest Priority]
style A fill:#90EE90
style E fill:#FFB6C1
Priority order (highest to lowest):
- Per-request tool parameters
- Environment variables
- Local config file (.gemini-image-mcp.json in project)
- Global config file (~/.gemini-image-mcp.json)
- Built-in defaults
Source: README.md:148-155
Initializing Config Files
Create a new config file with documented defaults:
# Global config
npx @jimothy-snicket/gemini-image-mcp --init
# Project-specific config
npx @jimothy-snicket/gemini-image-mcp --init --local
# Overwrite existing
npx @jimothy-snicket/gemini-image-mcp --init --force
Source: README.md:53-62
Config File Template
{
  // gemini-image-mcp configuration
  // Docs: https://github.com/JimothySnicket/gemini-image-mcp
  // Directory where generated/processed images are saved
  "outputDir": "~/gemini-images",
  // Default Gemini model for image generation
  // gemini-2.5-flash-image — fast, ~$0.04/image, 1K only
  // gemini-3.1-flash-image-preview — fast, ~$0.08/image, up to 4K
  // gemini-3-pro-image-preview — best quality, ~$0.16/image, up to 4K
  "defaultModel": "gemini-2.5-flash-image",
  "logLevel": "info",
  "requestTimeout": 60000,
  "sessionTimeout": 1800000,
  "maxRequestsPerHour": 0,
  "maxCostPerHour": 0,
  "defaults": {
    "generate": {
      // "aspectRatio": "1:1",
      // "resolution": "1K"
    }
  }
}
Source: src/config.ts:1-35
Security Considerations
API keys are rejected from config files with a warning. This prevents accidental exposure when config files get committed to repositories. Source: CHANGELOG.md:45-50
The config system includes:
- String-aware JSONC comment stripping (won't mangle URLs in quoted strings)
- Prototype pollution guard on config deep merge
- Unknown config keys warned and dropped
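String-aware comment stripping means tracking whether the scanner is inside a quoted string, so that a `//` in a URL is not mistaken for a comment. The function below is a sketch of that general technique, not the project's parser; `stripJsonComments` is a hypothetical name.

```typescript
// Removes // and /* */ comments from JSONC text, leaving string contents untouched.
function stripJsonComments(text: string): string {
  let out = "";
  let inString = false;
  let i = 0;
  while (i < text.length) {
    const ch = text[i];
    const next = text[i + 1];
    if (inString) {
      out += ch;
      if (ch === "\\" && i + 1 < text.length) {
        out += next; // keep the escaped character verbatim
        i += 2;
        continue;
      }
      if (ch === '"') inString = false;
      i += 1;
    } else if (ch === '"') {
      inString = true;
      out += ch;
      i += 1;
    } else if (ch === "/" && next === "/") {
      // Line comment: skip to the end of the line, keeping the newline itself.
      while (i < text.length && text[i] !== "\n") i += 1;
    } else if (ch === "/" && next === "*") {
      // Block comment: skip past the closing marker, or to the end if unterminated.
      const end = text.indexOf("*/", i + 2);
      i = end === -1 ? text.length : end + 2;
    } else {
      out += ch;
      i += 1;
    }
  }
  return out;
}

const jsonc = '{\n  // Docs link\n  "docs": "https://github.com/JimothySnicket/gemini-image-mcp", /* inline */ "maxCostPerHour": 0\n}';
console.log(JSON.parse(stripJsonComments(jsonc)));
```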
MCP Client Setup Examples
Claude Code (One-Liner)
The simplest setup method using Claude Code's built-in MCP management:
claude mcp add gemini-image -- npx -y @jimothy-snicket/gemini-image-mcp
Claude Code automatically inherits GEMINI_API_KEY from the shell environment. Source: README.md:38-45
Claude Code (Manual Configuration)
For explicit control, add to .mcp.json in your project root or ~/.claude/.mcp.json for global access:
{
  "mcpServers": {
    "gemini-image": {
      "command": "npx",
      "args": ["-y", "@jimothy-snicket/gemini-image-mcp"],
      "env": {
        "GEMINI_API_KEY": "${GEMINI_API_KEY}"
      }
    }
  }
}
The ${GEMINI_API_KEY} syntax reads the value from your shell environment without storing the actual key in the config file. Source: README.md:95-110
Claude Desktop
Edit the Claude Desktop configuration file:
| OS | Path |
|---|---|
| macOS | ~/Library/Application Support/Claude/claude_desktop_config.json |
| Windows | %APPDATA%\Claude\claude_desktop_config.json |
{
  "mcpServers": {
    "gemini-image": {
      "command": "npx",
      "args": ["-y", "@jimothy-snicket/gemini-image-mcp"],
      "env": {
        "GEMINI_API_KEY": "${GEMINI_API_KEY}"
      }
    }
  }
}
Source: README.md:111-130
Plugin-Based Configuration
For environments using the Claude plugin system, configure via plugin.json:
{
  "name": "gemini-image-mcp",
  "version": "0.2.0",
  "description": "Google Gemini image generation and editing via MCP",
  "mcpServers": {
    "gemini-image": {
      "command": "node",
      "args": ["${CLAUDE_PLUGIN_ROOT}/dist/index.js"],
      "env": {
        "GEMINI_API_KEY": "${GEMINI_API_KEY}"
      }
    }
  },
  "skills": ["skills/image-generation/SKILL.md"]
}
The ${CLAUDE_PLUGIN_ROOT} variable is replaced at runtime with the plugin installation directory. Source: plugin.json:1-16
Enhanced Security Setup
For environments requiring extra security, use a wrapper script that retrieves credentials from the OS keychain:
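#!/bin/bash
# Wrapper script example (macOS Keychain)
# Read the key from the keychain; -w prints only the stored password
API_KEY=$(security find-generic-password -s "GEMINI_API_KEY" -w)
# exec replaces the shell so the MCP client talks to the server process directly
GEMINI_API_KEY="$API_KEY" exec node /path/to/gemini-image-mcp/dist/index.js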
Source: README.md:145-155
Server.json Schema
The MCP protocol uses server.json to advertise server capabilities to compatible clients:
{
  "$schema": "https://static.modelcontextprotocol.io/schemas/2025-12-11/server.schema.json",
  "name": "io.github.JimothySnicket/gemini-image",
  "description": "Google Gemini image generation, editing, and local processing via MCP",
  "repository": {
    "url": "https://github.com/JimothySnicket/gemini-image-mcp",
    "source": "github"
  },
  "version": "0.4.0",
  "packages": [
    {
      "registryType": "npm",
      "identifier": "@jimothy-snicket/gemini-image-mcp",
      "version": "0.4.0",
      "transport": {
        "type": "stdio"
      },
      "environmentVariables": [
        {
          "description": "Google Gemini API key from https://aistudio.google.com/apikey",
          "isRequired": true,
          "format": "string",
          "isSecret": true,
          "name": "GEMINI_API_KEY"
        }
      ]
    }
  ]
}
This schema allows MCP clients to automatically discover server requirements and display appropriate configuration prompts. Source: server.json:1-32
Verifying Configuration
Check Environment Variables
echo $GEMINI_API_KEY
A non-empty response confirms the variable is set. Source: README.md:89-91
Test Server Startup
npx @jimothy-snicket/gemini-image-mcp --help
Successful startup displays diagnostics including Node version, PID, working directory, API key status, default model, and output directory. Source: CHANGELOG.md:35-40
Tool Registration
The MCP server registers two tools with their input schemas:
generate_image Tool
server.registerTool(
  "generate_image",
  {
    title: "Generate Image",
    description: "Generate or edit images using Google Gemini...",
    inputSchema: {
      prompt: z.string().describe("Text description of the image..."),
      images: z.optional(z.array(z.string()).max(14)),
      model: z.optional(z.string()),
      aspectRatio: z.optional(z.enum(["1:1", "16:9", "9:16", "3:2", "2:3", "3:4", "4:3", "21:9"])),
      resolution: z.optional(z.enum(["1K", "2K", "4K"])),
      // ... additional parameters
    }
  }
);
Source: src/index.ts:37-60
process_image Tool
Local image processing via sharp, free and requires no API calls. Supports crop, resize, background removal, trim, and format conversion. Source: README.md:28-40
Development Setup
For local development of the MCP server:
bun install
bun run build # TypeScript -> dist/
bun run dev # Run directly with Bun
Requires GEMINI_API_KEY environment variable for testing image generation. Source: CONTRIBUTING.md:1-15
Summary
MCP Client Configuration for gemini-image-mcp supports multiple integration strategies:
| Method | Best For |
|---|---|
| claude mcp add | Quick setup in Claude Code |
| .mcp.json manual | Explicit control, version control of config |
| Claude Desktop | Desktop Claude applications |
| Plugin system | Shared team configurations |
| Environment variables | CI/CD pipelines, containerized deployments |
| Config files | Persistent, documented defaults |
The configuration system follows a clear priority hierarchy, with per-request parameters taking precedence over environment variables, which take precedence over local and global config files. Rate limiting is configurable to prevent runaway costs in agent-based workflows.
generate_image Tool Reference
Related topics: process_image Tool Reference, Image Generation Internals, Cost Tracking and Rate Limiting
The generate_image tool is the primary AI-powered component of the gemini-image-mcp server. It leverages Google Gemini's native image generation API (generateContent) to create and edit images based on text prompts, with optional reference images for contextual guidance. This tool is designed for scenarios requiring intelligent, model-driven image creation including text-to-image generation, iterative editing through multi-turn sessions, and AI-assisted image composition with up to 14 reference images.
Unlike integrations built on the deprecated Imagen-based services, this tool uses Gemini's native image generation, which gives it access to features such as multi-turn conversation context and Google Search grounding.
Overview
| Property | Value |
|---|---|
| Tool Name | generate_image |
| API Backend | Google Gemini generateContent |
| Cost | Per-request (see Pricing) |
| Free Tier | No (requires Gemini API key) |
| Transport | STDIO (MCP Protocol) |
Source: src/index.ts:1-50
Supported Models
The tool automatically discovers available image-capable models at startup by querying the Gemini API. However, three primary models are documented and supported:
| Model ID | Strengths | Resolution Support | Reference Images | Cost Tier |
|---|---|---|---|---|
| gemini-2.5-flash-image | Fast, affordable | 1K only | Up to 5 | ~$0.04/image |
| gemini-3-pro-image-preview | Best quality, superior text rendering | 1K, 2K, 4K | Up to 14 | ~$0.15/image |
| gemini-3.1-flash-image-preview | Speed/quality balance, Google Search grounding | 512, 1K, 2K, 4K | Up to 14 | ~$0.08/image |
The model auto-discovery mechanism filters models based on naming patterns to exclude deprecated Imagen-based services:
const IMAGE_MODEL_PATTERNS = ["image", "img"];
const EXCLUDED_PREFIXES = ["imagen"];
Source: src/generate.ts:1-50
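The filtering rule can be illustrated with a minimal sketch. It assumes a model qualifies when its ID contains one of the patterns and does not start with an excluded prefix. The helper name isImageModel, the handling of a "models/" prefix, and the sample "imagen-example" ID are illustrative assumptions, not the project's actual code.

```typescript
const IMAGE_MODEL_PATTERNS = ["image", "img"];
const EXCLUDED_PREFIXES = ["imagen"];

// Hypothetical helper: decide whether a model ID looks image-capable.
function isImageModel(modelId: string): boolean {
  // Assumption: IDs may arrive with a "models/" prefix, so strip it first.
  const id = modelId.toLowerCase().replace(/^models\//, "");
  // "imagen..." contains "image", so the exclusion must run before the match.
  if (EXCLUDED_PREFIXES.some((prefix) => id.startsWith(prefix))) return false;
  return IMAGE_MODEL_PATTERNS.some((pattern) => id.includes(pattern));
}

// Sample IDs; "imagen-example" is a made-up ID used only for illustration.
const discoveredIds = [
  "models/gemini-2.5-flash-image",
  "models/imagen-example",
  "models/gemini-2.5-pro",
];
const imageCapableIds = discoveredIds.filter(isImageModel);
console.log(imageCapableIds);
```

The exclusion check matters because a plain substring match on "image" would also accept any ID beginning with "imagen".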
Parameters
The following table documents all parameters accepted by the generate_image tool:
| Parameter | Type | Required | Default | Description |
|---|---|---|---|---|
| prompt | string | Yes | — | Text description or editing instruction |
| images | string[] | No | — | Array of file paths to input/reference images |
| model | string | No | Config defaultModel | Gemini model ID |
| aspectRatio | string | No | Config default | Output aspect ratio |
| resolution | string | No | Config default | Output resolution (1K, 2K, 4K) |
| outputDir | string | No | ~/gemini-images | Override output directory |
| filename | string | No | Auto-generated | Base name with auto-versioning |
| subfolder | string | No | — | Subdirectory within output |
| seed | integer | No | Random | Reproducible generation seed |
| sessionId | string | No | — | Multi-turn session identifier |
| useSearchGrounding | boolean | No | false | Enable Google Search grounding |
Supported Aspect Ratios
| Ratio | Use Case |
|---|---|
| 1:1 | Square images, social posts, icons |
| 16:9 | Widescreen, hero banners, videos |
| 9:16 | Vertical stories, mobile content |
| 3:2 | Standard photography |
| 2:3 | Portrait photography |
| 4:3 | Classic aspect ratio |
| 3:4 | Portrait standard |
| 21:9 | Ultra-widescreen |
Resolution Options
| Resolution | Availability |
|---|---|
| 1K | All models |
| 2K | gemini-3-pro-image-preview, gemini-3.1-flash-image-preview |
| 4K | gemini-3-pro-image-preview, gemini-3.1-flash-image-preview |
Source: src/index.ts:50-150
Architecture
High-Level Flow
graph TD
A[User Request: generate_image] --> B[Load Config & Validate Params]
B --> C{images provided?}
C -->|No| D[Text-to-Image Mode]
C -->|Yes| E[Image Editing Mode]
D --> F[Build Prompt Content]
E --> G[Read Image Files as InlineData]
G --> F
F --> H{Model supports grounding?}
H -->|Yes & enabled| I[Add Google Search Tool]
H -->|No| J[Skip Grounding]
I --> K[Call Gemini generateContent API]
J --> K
K --> L[Extract Generated Image]
L --> M[Apply Filename & Subfolder Logic]
M --> N[Save to Output Directory]
N --> O[Log to generations.jsonl]
O --> P[Return Response with Usage Report]
Multi-Turn Session Management
Multi-turn sessions enable iterative refinement of images by preserving conversation history across multiple requests:
graph LR
A[Request 1: sessionId=abc123] --> B[Create New Session]
B --> C[Generate Image]
C --> D[Response: sessionId=abc123]
D --> E[Request 2: sessionId=abc123]
E --> F[Retrieve Existing Session]
F --> G[Append to History]
G --> H[Generate with Context]
H --> D2[Updated Response]
Sessions are managed through an in-memory Map with automatic cleanup:
| Setting | Value |
|---|---|
| Max conversation turns per session | 10 |
| Session timeout | 30 minutes (1800000ms) |
| History storage | Array of Content objects |
Source: src/generate.ts:50-150
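The session behaviour described above might look roughly like the following sketch. It assumes a turn is one user prompt plus one model reply, and that expired sessions are purged lazily on each lookup. The Content type here is a simplified stand-in, and the helper names getOrCreateSession and appendTurn are hypothetical.

```typescript
// Simplified stand-in for the Gemini Content type (assumption).
interface Content {
  role: "user" | "model";
  text: string;
}

interface Session {
  model: string;
  history: Content[];
  lastUsed: number; // epoch milliseconds
}

const SESSION_TIMEOUT_MS = 30 * 60 * 1000; // 30 minutes (1800000ms)
const MAX_TURNS = 10;

const sessions = new Map<string, Session>();

// Hypothetical helper: purge expired sessions, then fetch or create one.
function getOrCreateSession(id: string, model: string, now: number = Date.now()): Session {
  sessions.forEach((s, key) => {
    if (now - s.lastUsed > SESSION_TIMEOUT_MS) sessions.delete(key);
  });
  let session = sessions.get(id);
  if (!session) {
    session = { model, history: [], lastUsed: now };
    sessions.set(id, session);
  } else if (session.model !== model) {
    throw new Error(`Session ${id} was started with ${session.model}, not ${model}`);
  }
  session.lastUsed = now;
  return session;
}

// Hypothetical helper: record one turn (user prompt plus model reply).
function appendTurn(session: Session, prompt: string, reply: string): void {
  if (session.history.length / 2 >= MAX_TURNS) {
    throw new Error(`Session has reached the ${MAX_TURNS}-turn limit`);
  }
  session.history.push({ role: "user", text: prompt }, { role: "model", text: reply });
}
```

Because the store lives in process memory, sessions in a design like this would not survive a server restart.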
Image Input Processing
When reference images are provided, they are converted to Gemini's inline data format:
const MIME_TYPES: Record<string, string> = {
".png": "image/png",
".jpg": "image/jpeg",
".jpeg": "image/jpeg",
".webp": "image/webp",
".gif": "image/gif",
};
| Constraint | Limit |
|---|---|
| Max image file size | 50MB |
| Max reference images (gemini-3-pro) | 14 |
| Max reference images (gemini-2.5-flash) | 5 |
Source: src/generate.ts:150-200
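As a rough sketch of the conversion step, the helper below validates the extension and size and wraps already-encoded data in an inline data part. It assumes the file has been read and base64-encoded elsewhere, and that 50MB means binary megabytes. The name toInlineDataPart and its parameters are hypothetical.

```typescript
const MIME_TYPES: Record<string, string> = {
  ".png": "image/png",
  ".jpg": "image/jpeg",
  ".jpeg": "image/jpeg",
  ".webp": "image/webp",
  ".gif": "image/gif",
};

// 50MB limit from the table above, assuming binary megabytes.
const MAX_IMAGE_BYTES = 50 * 1024 * 1024;

// Shape of an inline image part in a generateContent request.
interface InlineDataPart {
  inlineData: { mimeType: string; data: string };
}

// Hypothetical helper: validate a reference image and wrap it as inline data.
function toInlineDataPart(filePath: string, sizeBytes: number, base64Data: string): InlineDataPart {
  // Take everything from the last dot as the extension, case-insensitively.
  const dot = filePath.lastIndexOf(".");
  const ext = dot >= 0 ? filePath.slice(dot).toLowerCase() : "";
  const mimeType = MIME_TYPES[ext];
  if (!mimeType) {
    throw new Error(`Unsupported image format "${ext}" for file: ${filePath}`);
  }
  if (sizeBytes > MAX_IMAGE_BYTES) {
    const sizeMb = Math.round(sizeBytes / (1024 * 1024));
    throw new Error(`Image file is ${sizeMb}MB, max is 50MB`);
  }
  return { inlineData: { mimeType, data: base64Data } };
}
```

Checking the extension and size before building the request keeps invalid inputs from reaching the API at all.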
Google Search Grounding
Google Search grounding enhances generation accuracy by incorporating real-world information through the Gemini googleSearch tool. This feature is restricted to specific models:
export function validateGrounding(model: string, useSearchGrounding: boolean | undefined): void {
if (useSearchGrounding && !GROUNDING_SUPPORTED_MODELS.includes(model)) {
throw new Error(
`useSearchGrounding is only supported on ${GROUNDING_SUPPORTED_MODELS.join(", ")}. ` +
`You requested ${model}.`,
);
}
}
export const GROUNDING_SUPPORTED_MODELS = ["gemini-3.1-flash-image-preview"];
Supported Model: gemini-3.1-flash-image-preview
Attempting to enable grounding on other models results in a validation error. This restriction ensures users receive accurate error messages rather than silent failures from the API.
Source: src/generate.ts:1-50
Pricing and Cost Reporting
Every generate_image response includes detailed cost information through the UsageReport structure:
interface UsageReport {
inputTokens: number;
outputTokens: number;
totalTokens: number;
estimatedCostUsd: number;
}
interface SessionStats {
totalGenerations: number;
totalCostUsd: number;
requestsThisHour: number;
costThisHour: number;
}
| Metric | Description |
|---|---|
| inputTokens | Tokens consumed by the prompt and reference images |
| outputTokens | Tokens in the API response (including image data) |
| totalTokens | Sum of input and output tokens |
| estimatedCostUsd | Calculated cost in US dollars |
| totalGenerations | Running count in current session |
| totalCostUsd | Cumulative cost for the session |
The pricing module calculates costs based on model-specific rates. Rate limiting is available through configuration to prevent runaway costs:
| Environment Variable | Purpose |
|---|---|
| MAX_REQUESTS_PER_HOUR | Request rate limit |
| MAX_COST_PER_HOUR | Cost threshold limit |
Source: src/pricing.ts
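One way such hourly limits could be enforced is a sliding one-hour window over recorded requests, sketched below. It assumes a limit of 0 means unlimited, matching the config template. The HourlyLimiter class and its methods are hypothetical and not taken from the project's source.

```typescript
interface RequestRecord {
  timestamp: number; // epoch milliseconds
  costUsd: number;
}

const HOUR_MS = 60 * 60 * 1000;

// Hypothetical limiter: tracks requests and cost over a sliding one-hour window.
class HourlyLimiter {
  private records: RequestRecord[] = [];
  private maxRequestsPerHour: number;
  private maxCostPerHour: number;

  constructor(maxRequestsPerHour: number, maxCostPerHour: number) {
    this.maxRequestsPerHour = maxRequestsPerHour;
    this.maxCostPerHour = maxCostPerHour;
  }

  // Throws if another request would exceed a limit; 0 disables that limit.
  check(now: number = Date.now()): void {
    this.records = this.records.filter((r) => now - r.timestamp < HOUR_MS);
    const requests = this.records.length;
    const cost = this.records.reduce((sum, r) => sum + r.costUsd, 0);
    if (this.maxRequestsPerHour > 0 && requests >= this.maxRequestsPerHour) {
      throw new Error(`Rate limit reached: ${requests}/${this.maxRequestsPerHour} requests this hour`);
    }
    if (this.maxCostPerHour > 0 && cost >= this.maxCostPerHour) {
      throw new Error(`Cost limit reached: $${cost.toFixed(2)} of $${this.maxCostPerHour.toFixed(2)} this hour`);
    }
  }

  // Call after a successful generation with its estimated cost.
  record(costUsd: number, now: number = Date.now()): void {
    this.records.push({ timestamp: now, costUsd });
  }
}
```

A caller would run check() before each generation and record() after it, so a runaway agent loop stops at the first request over budget.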
Configuration
The tool respects a layered configuration system with the following priority:
graph TD
A[Priority 1: Per-Request Parameters] --> B[Priority 2: Environment Variables]
B --> C[Priority 3: Local Config ./.gemini-image-mcp.json]
C --> D[Priority 4: Global Config ~/.gemini-image-mcp.json]
D --> E[Priority 5: Built-in Defaults]
Config File Structure
{
"defaultModel": "gemini-3.1-flash-image-preview",
"defaults": {
"generate": {
"aspectRatio": "16:9",
"resolution": "2K"
}
}
}
Source: src/config.ts
Output Organization
Generated images are saved with intelligent naming and versioning:
| Input filename | Existing Files | Output Saved As |
|---|---|---|
| "hero" | None | hero.png |
| "hero" | hero.png exists | hero-v2.png |
| "hero" | hero.png, hero-v2.png exist | hero-v3.png |
Subfolder Organization
Images can be organized into subdirectories using the subfolder parameter:
| Parameters | Result |
|---|---|
| filename: "hero", subfolder: "landing-page" | ~/gemini-images/landing-page/hero.png |
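The versioning and subfolder rules above can be sketched as a small pure function. It assumes the set of existing paths is supplied by the caller and that the extension defaults to png. The name resolveOutputPath and its signature are illustrative only.

```typescript
// Hypothetical helper: pick an output path that does not collide with existing files.
function resolveOutputPath(
  outputDir: string,
  filename: string,
  existing: ReadonlySet<string>,
  subfolder?: string,
  ext: string = "png",
): string {
  const dir = subfolder ? `${outputDir}/${subfolder}` : outputDir;
  let candidate = `${dir}/${filename}.${ext}`;
  // First collision yields -v2, the next -v3, and so on.
  for (let version = 2; existing.has(candidate); version++) {
    candidate = `${dir}/${filename}-v${version}.${ext}`;
  }
  return candidate;
}

const taken = new Set(["~/gemini-images/hero.png", "~/gemini-images/hero-v2.png"]);
console.log(resolveOutputPath("~/gemini-images", "hero", taken));
console.log(resolveOutputPath("~/gemini-images", "hero", taken, "landing-page"));
```

Passing the existing paths in as a set keeps the naming rule testable without touching the file system.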
Generation Manifest
All generations are logged to generations.jsonl for audit and reproducibility:
{"timestamp":"2024-01-15T10:30:00Z","prompt":"A modern dashboard","model":"gemini-2.5-flash-image","aspectRatio":"16:9","resolution":"1K","cost":0.04,"path":"~/gemini-images/dashboard.png"}
Source: src/generate.ts:200-300
Usage Examples
Text-to-Image Generation
{
"prompt": "A modern dashboard UI with dark theme and blue accent colours",
"aspectRatio": "16:9",
"resolution": "2K",
"filename": "dashboard-hero",
"subfolder": "landing-page"
}
Image Editing with Reference
{
"prompt": "Change the background to a sunset over water",
"images": ["./src/assets/hero.png"],
"aspectRatio": "16:9"
}
Multi-Turn Refinement
{
"prompt": "Make the colours warmer and add more contrast",
"sessionId": "session-1711929600000-a1b2c3"
}
Reproducible Generation with Seed
{
"prompt": "A photorealistic mountain landscape",
"seed": 42,
"aspectRatio": "16:9"
}
Google Search Grounding
{
"prompt": "Current design trends for AI product landing pages",
"model": "gemini-3.1-flash-image-preview",
"useSearchGrounding": true
}
Error Handling
The tool provides specific error messages for common failure scenarios:
| Error Condition | Message |
|---|---|
| Invalid API key | Failed to list models (is your API key valid?) |
| Unsupported image format | Unsupported image format ".bmp" for file: path/to/image.bmp |
| Image too large | Image file is 52MB, max is 50MB |
| Grounding on unsupported model | useSearchGrounding is only supported on gemini-3.1-flash-image-preview |
| Rate limit exceeded | Clear error with remaining budget information |
| Session model mismatch | Error if session uses different model than original |
Source: src/generate.ts:100-200
Dependencies
| Package | Version | Purpose |
|---|---|---|
| @google/genai | ^1.44.0 | Gemini API client |
| @modelcontextprotocol/sdk | ^1.22.0 | MCP protocol implementation |
| zod | ^3.24.0 | Schema validation |
| sharp | ^0.34.5 | Image processing (for output) |
Source: package.json
process_image Tool Reference
Related topics: generate_image Tool Reference, Image Processing Internals
Overview
The process_image tool is a local image processing utility within the gemini-image-mcp MCP server. It leverages the sharp library to perform CPU-bound image transformations without making any API calls, making it completely free to use. Source: package.json:14
Unlike the AI-powered generate_image tool which sends requests to Google's Gemini API and incurs costs per operation, process_image operates entirely on the local machine. This creates an efficient two-tool workflow where AI generation can be followed by local processing at zero additional cost. Source: README.md:features
Architecture
Tool Registration Flow
The process_image tool is registered with the MCP server using the @modelcontextprotocol/sdk framework. The tool definition includes Zod schemas for parameter validation and a handler function that orchestrates the processing pipeline. Source: src/index.ts:tool-registration
graph TD
A[MCP Client Request] --> B[index.ts Tool Handler]
B --> C[Load Configuration]
C --> D[processImage Function]
D --> E[sharp Operations Pipeline]
E --> F[Output File System]
D --> G[Return JSON Result]
H[Config Sources] --> C
H --> I[Environment Variables]
H --> J[Local Config .json]
H --> K[Global Config .json]
H --> L[Default Values]
Processing Pipeline
The tool chains multiple operations into a single execution call. Operations are applied in a logical order: background removal first, then cropping, resizing, trimming, and finally format conversion. Source: src/index.ts:75-89
graph LR
A[Input Image] --> B[Background Removal]
B --> C[Crop]
C --> D[Resize]
D --> E[Trim]
E --> F[Format Conversion]
F --> G[Saved Output]
Tool Parameters
Required Parameters
| Parameter | Type | Description |
|---|---|---|
| imagePath | string | Path to the image file to process |
Source: src/index.ts:48-50
Optional Parameters
| Parameter | Type | Default | Description |
|---|---|---|---|
| crop | CropConfig | none | Crop by pixel dimensions, aspect ratio, or focal point strategy |
| resize | ResizeConfig | none | Resize to width/height (maintains aspect ratio) |
| removeBackground | RemoveBackgroundConfig | config default | Remove background by threshold or chroma key |
| trim | boolean | config default | Auto-remove whitespace/transparent borders |
| format | "png" \| "jpeg" \| "webp" | original | Convert to specified format |
| quality | number (1-100) | 90 | Output quality for JPEG/WebP |
| outputDir | string | ~/gemini-images | Directory to save output |
| filename | string | auto | Base name for saved file with auto-versioning |
| subfolder | string | none | Subfolder within output directory |
Source: src/index.ts:51-79
Operations
Crop
The crop operation supports three distinct modes for targeting specific regions of an image.
Pixel-Exact Crop
{
"width": 500,
"height": 300,
"left": 100,
"top": 50
}
Aspect Ratio Crop
{
"aspectRatio": "16:9",
"strategy": "center" // or "attention" or "entropy"
}
Supported Aspect Ratios
| Ratio | Use Case |
|---|---|
| 1:1 | Square images, avatars |
| 16:9 | Hero banners, video thumbnails |
| 9:16 | Mobile stories, vertical content |
| 3:2 | Standard photography |
| 2:3 | Portrait photography |
| 4:3 | Classic monitors |
| 3:4 | Portrait prints |
| 21:9 | Ultrawide displays |
Source: README.md:aspect-ratios
Focal Point Strategies
| Strategy | Behavior |
|---|---|
| center | Default. Crops from the center of the image |
| attention | Shifts crop toward the most visually interesting region based on saliency detection |
| entropy | Shifts crop toward the region with the most visual detail (high information entropy) |
Source: skills/image-generation/SKILL.md:crop
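For the default center strategy, the crop rectangle for a given aspect ratio follows from simple arithmetic. The sketch below assumes the largest rectangle of the target ratio is taken and centered in the image. The function name centerCropRect is hypothetical, and the attention and entropy strategies are not modelled here.

```typescript
interface CropRect {
  left: number;
  top: number;
  width: number;
  height: number;
}

// Hypothetical helper: largest centered rectangle with the given "W:H" ratio.
function centerCropRect(imageWidth: number, imageHeight: number, aspectRatio: string): CropRect {
  const parts = aspectRatio.split(":").map(Number);
  const ratioW = parts[0];
  const ratioH = parts[1];
  if (parts.length !== 2 || !(ratioW > 0) || !(ratioH > 0)) {
    throw new Error(`Invalid aspect ratio: ${aspectRatio}`);
  }
  let width = imageWidth;
  let height = imageHeight;
  // Compare the ratios by cross-multiplication to avoid division rounding.
  if (imageWidth * ratioH > imageHeight * ratioW) {
    // Image is wider than the target: keep full height, trim the sides.
    width = Math.round((imageHeight * ratioW) / ratioH);
  } else {
    // Image is taller than (or equal to) the target: keep full width.
    height = Math.round((imageWidth * ratioH) / ratioW);
  }
  return {
    left: Math.floor((imageWidth - width) / 2),
    top: Math.floor((imageHeight - height) / 2),
    width,
    height,
  };
}

console.log(centerCropRect(1920, 1200, "16:9"));
```

The resulting left, top, width, and height values have the same shape as the pixel-exact crop configuration shown earlier.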
Resize
The resize operation maintains aspect ratio when only one dimension is specified.
{
"width": 1200 // Auto-calculate height
}
// OR
{
"height": 800 // Auto-calculate width
}
// OR
{
"width": 1200,
"height": 800 // Both specified
}
When both width and height are provided, the resize operation respects the crop configuration if present. Source: src/index.ts:58-62
Background Removal
Two distinct algorithms handle background removal depending on the background type.
Threshold-Based (White Backgrounds)
{
"threshold": 230
}
The threshold parameter specifies the brightness level (0-255) above which pixels are considered background. Values closer to 255 match only the lightest, near-white backgrounds. This method works well for studio product shots on white backdrops. Source: README.md:background-removal
Chroma Key (Green Screen / Any Solid Colour)
{
"color": "#00FF00" // Any hex colour
}
The chroma key pipeline performs HSV-based colour keying with advanced compositing techniques:
| Stage | Description |
|---|---|
| HSV Keying | Converts to HSV colour space for colour-based selection |
| Smoothstep Feather | Softens the edges using smoothstep interpolation |
| Spill Suppression | Removes colour contamination from the subject |
| Edge Anti-Aliasing | 5-pass 3x3 kernel smoothing |
Source: README.md:chroma-key
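The keying and feathering stages can be illustrated with a simplified sketch. It assumes alpha is derived only from the hue distance to the key colour, with illustrative inner and outer tolerances of 20 and 40 degrees. Spill suppression and edge anti-aliasing are omitted, and the helper names are hypothetical rather than the project's own.

```typescript
// Standard smoothstep: 0 at or below edge0, 1 at or above edge1, smooth cubic between.
function smoothstep(edge0: number, edge1: number, x: number): number {
  const t = Math.min(1, Math.max(0, (x - edge0) / (edge1 - edge0)));
  return t * t * (3 - 2 * t);
}

// Hue of an RGB colour in degrees [0, 360); greys are reported as 0.
function rgbToHue(r: number, g: number, b: number): number {
  const max = Math.max(r, g, b);
  const min = Math.min(r, g, b);
  const delta = max - min;
  if (delta === 0) return 0;
  let h: number;
  if (max === r) h = ((g - b) / delta) % 6;
  else if (max === g) h = (b - r) / delta + 2;
  else h = (r - g) / delta + 4;
  h *= 60;
  return h < 0 ? h + 360 : h;
}

// Shortest angular distance between two hues, in degrees [0, 180].
function hueDistance(a: number, b: number): number {
  const d = Math.abs(a - b) % 360;
  return d > 180 ? 360 - d : d;
}

// Hypothetical keying step: alpha 0 near the key hue, 255 far from it,
// feathered in between. A fuller version would also weigh saturation and value.
function chromaAlpha(r: number, g: number, b: number, keyHue: number, inner = 20, outer = 40): number {
  const dist = hueDistance(rgbToHue(r, g, b), keyHue);
  return Math.round(255 * smoothstep(inner, outer, dist));
}

const greenHue = rgbToHue(0, 255, 0);
console.log(chromaAlpha(0, 255, 0, greenHue), chromaAlpha(255, 0, 0, greenHue));
```

The smoothstep ramp is what produces soft edges: pixels whose hue sits between the two tolerances get partial transparency instead of a hard cut.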
Trim
The trim operation automatically removes whitespace and transparent borders from images. This is particularly useful after background removal when residual padding remains around the subject. Source: README.md:trim
{
"trim": true
}
Format Conversion
The format parameter converts images to different output formats with quality control.
| Format | Quality Range | Default | Description |
|---|---|---|---|
| png | N/A (lossless) | - | Portable Network Graphics |
| jpeg | 1-100 | 90 | Joint Photographic Experts Group |
| webp | 1-100 | 90 | Web Picture format |
Source: src/index.ts:63-67
Output Organization
Filename Auto-Versioning
When a filename already exists in the output directory, the tool automatically versions the filename:
- hero.png (first save)
- hero-v2.png (second save)
- hero-v3.png (third save)
This prevents overwriting existing files and maintains a clear history of processed images. Source: README.md:filename
Subfolder Organization
The subfolder parameter creates organized directory structures within the output directory:
| Parameter | Example | Result |
|---|---|---|
| filename only | "hero" | ~/gemini-images/hero.png |
| subfolder only | "landing-page" | ~/gemini-images/landing-page/original.png |
| Both | "hero" + "landing-page" | ~/gemini-images/landing-page/hero.png |
Source: README.md:output
Configuration Defaults
The tool reads default values from multiple configuration sources with the following priority:
graph TD
A[Priority Order] --> B[1. Per-Request Parameters]
B --> C[2. Environment Variables]
C --> D[3. Local Config .json]
D --> E[4. Global Config .json]
E --> F[5. Hardcoded Defaults]
Configuration File Structure
Create a config file using:
npx @jimothy-snicket/gemini-image-mcp --init
This creates ~/.gemini-image-mcp.json with commented defaults. For project-specific overrides:
npx @jimothy-snicket/gemini-image-mcp --init --local
Source: README.md:config-file
Default Process Settings
{
"defaults": {
"process": {
"removeBackground": { "color": "#00FF00" },
"trim": true,
"format": "png"
}
}
}
Source: README.md:config-defaults
Common Pipelines
Favicon from Logo
Extract a transparent background, trim whitespace, and resize to favicon dimensions in a single pipeline:
{
"imagePath": "./logo.png",
"removeBackground": {"threshold": 230},
"trim": true,
"resize": {"width": 192, "height": 192},
"filename": "favicon-192"
}
Source: skills/image-generation/SKILL.md:favicon
Social Card from Photo
Crop to 16:9 aspect ratio using attention-based focal point and resize to standard social card width:
{
"imagePath": "./photo.png",
"crop": {"aspectRatio": "16:9", "strategy": "attention"},
"resize": {"width": 1200},
"filename": "hero-banner"
}
Source: skills/image-generation/SKILL.md:social-card
WebP Conversion for Web
Convert an existing PNG to WebP format with optimized quality for web delivery:
{
"imagePath": "./image.png",
"format": "webp",
"quality": 85,
"filename": "optimized"
}
Source: README.md:webp
Transparent Asset from Green Screen
Generate an image on a green background, then remove it locally:
Step 1: Generate on green screen
{
"prompt": "A product photo on a bright green background",
"filename": "product-green"
}
Step 2: Remove green background
{
"imagePath": "./product-green.png",
"removeBackground": {"color": "#00FF00"},
"trim": true,
"filename": "product-transparent"
}
This two-step approach works best for high-contrast subjects (dark, red, blue, or white on green). Always use #00FF00 as it handles Gemini's actual green shade more reliably than trying to match it precisely. Source: README.md:green-screen
Subject on Specific Background (Canvas Approach)
For yellow, green, or glass/reflective subjects where chroma key struggles, use the AI-powered canvas approach:
{
"prompt": "Place a yellow rubber duck on this background. Product photography, studio lighting, centered.",
"images": ["./canvas-white.png"],
"filename": "duck-on-white"
}
This technique generates the subject with correct lighting and shadows for the specific background in a single API call. Source: skills/image-generation/SKILL.md:canvas
Return Value
The tool returns a JSON object containing the processing results:
{
"content": [
{
"type": "text",
"text": JSON.stringify({
input: {
path: string,
operations: string[]
},
output: {
path: string,
format: string,
dimensions: { width: number, height: number },
size: number
}
}, null, 2)
}
]
}
Source: src/index.ts:80-95
Error Handling
The tool wraps all operations in try-catch blocks to provide meaningful error messages:
catch (err) {
const message = err instanceof Error ? err.message : String(err);
log.error("process_image failed:", message);
// Returns error to MCP client
}
Common error scenarios include unsupported image formats and file access issues. Source: src/index.ts:96-100
Limitations
- Maximum input image size: 50MB (enforced by file stat check)
- Supported input formats depend on sharp library capabilities
- Output format availability depends on sharp library compilation options
- Processing is single-threaded per operation; large images may take longer to process
- No GPU acceleration; all processing uses CPU
Source: src/process.ts:file-validation (inferred from README file size documentation)
Configuration Guide
Related topics: Installation Guide, Cost Tracking and Rate Limiting
The gemini-image-mcp server provides a centralized configuration system that allows users to customize all aspects of image generation and processing. The configuration system replaces scattered environment variables with a unified approach using JSON config files with JSONC support (JSON with comments).
Overview
The configuration system serves as the single source of truth for all server settings. Instead of reading from process.env directly throughout the codebase, all modules now read settings from the centralized config, ensuring consistency and maintainability. Source: CHANGELOG.md
Key Features
| Feature | Description |
|---|---|
| JSONC Support | JSON with comments for inline documentation |
| Hierarchical Priority | Env vars → Local config → Global config → Defaults |
| Deep Merge | Nested configuration objects merge correctly |
| Config Caching | Configuration is cached after first load |
| Security Guards | API key rejection, prototype pollution protection |
| Validation | Whitelist of known keys, unknown keys warned |
Configuration Priority
The system follows a clear hierarchy where more specific configurations override more general ones:
graph TD
A[Request Parameters] --> B[Override Everything]
B --> C[Environment Variables]
C --> D[Local Config ./.gemini-image-mcp.json]
D --> E[Global Config ~/.gemini-image-mcp.json]
E --> F[Hardcoded Defaults]
style A fill:#90EE90
style F fill:#FFE4B5
Environment variables take precedence over all config files. If a setting exists in both an env var and a config file, the env var wins. Source: README.md
Configuration File Format
The configuration file uses JSONC (JSON with Comments) format, allowing inline documentation and making it easy to understand each setting.
Default Configuration Template
{
// gemini-image-mcp configuration
// Docs: https://github.com/JimothySnicket/gemini-image-mcp
// Directory where generated/processed images are saved
// Supports ~ for home directory
"outputDir": "~/gemini-images",
// Default Gemini model for image generation
// gemini-2.5-flash-image — fast, ~$0.04/image, 1K only (deprecates Oct 2026)
// gemini-3.1-flash-image-preview — fast, ~$0.08/image, up to 4K, search grounding
// gemini-3-pro-image-preview — best quality, ~$0.16/image, up to 4K, 14 ref images
"defaultModel": "gemini-2.5-flash-image",
// Log level: "debug", "info", or "error"
"logLevel": "info",
// Timeout for a single API request (ms)
"requestTimeout": 60000,
// Timeout for multi-turn editing sessions (ms)
"sessionTimeout": 1800000,
// Rate limiting (0 = unlimited)
"maxRequestsPerHour": 0,
"maxCostPerHour": 0,
// Per-tool default parameters
"defaults": {
"generate": {
// "aspectRatio": "1:1",
// "resolution": "1K"
},
"process": {
// "removeBackground": { "color": "#00FF00" },
// "trim": true
}
}
}
Source: src/config.template.ts
Configuration Options
Top-Level Settings
| Option | Type | Default | Description |
|---|---|---|---|
| outputDir | string | ~/gemini-images | Directory for saved images. Supports ~ for home directory |
| defaultModel | string | gemini-2.5-flash-image | Default Gemini model for image generation |
| logLevel | string | info | Log verbosity: debug, info, or error |
| requestTimeout | number | 60000 | Timeout for a single API request in milliseconds |
| sessionTimeout | number | 1800000 | Timeout for multi-turn editing sessions in milliseconds |
| maxRequestsPerHour | number | 0 | Rate limit: max requests per hour (0 = unlimited) |
| maxCostPerHour | number | 0 | Rate limit: max cost per hour in USD (0 = unlimited) |
Per-Tool Defaults
The defaults object allows setting default parameters for each tool:
#### Generate Tool Defaults
| Option | Type | Description |
|---|---|---|
| defaults.generate.aspectRatio | string | Default aspect ratio: 1:1, 16:9, 9:16, 3:2, 2:3, 4:3, 3:4, 21:9 |
| defaults.generate.resolution | string | Default resolution: 1K, 2K, 4K |
#### Process Tool Defaults
| Option | Type | Description |
|---|---|---|
| defaults.process.removeBackground | object | Default background removal settings |
| defaults.process.trim | boolean | Default trim setting |
| defaults.process.format | string | Default output format: png, jpeg, webp |
| defaults.process.quality | number | Default quality (1-100) |
Creating a Configuration File
Automatic Initialization
The easiest way to create a configuration file is using the --init flag:
npx @jimothy-snicket/gemini-image-mcp --init
This creates ~/.gemini-image-mcp.json with all defaults and inline documentation.
Local Configuration
To create a local configuration file in the current working directory:
npx @jimothy-snicket/gemini-image-mcp --init --local
This creates .gemini-image-mcp.json in the CWD, which takes precedence over the global config.
Deep Merge Behavior
The configuration system uses deep merging for nested objects. This means you can specify only the settings you want to change, and the rest will inherit from the underlying defaults. Source: CHANGELOG.md
Example: Partial Configuration
{
"logLevel": "debug",
"defaults": {
"generate": {
"aspectRatio": "16:9"
}
}
}
This configuration only overrides logLevel and the generate.aspectRatio, while all other settings retain their defaults.
Security Features
API Key Protection
The configuration system explicitly rejects API keys found in config files with a warning:
[config] WARNING: "apiKey" found in ~/.gemini-image-mcp.json — API keys must not be in config files. Stripped.
This prevents accidental commits of API credentials to repositories. Source: CHANGELOG.md
Prototype Pollution Guard
The deep merge implementation protects against prototype pollution attacks by explicitly blocking dangerous keys:
- __proto__
- constructor
- prototype
If any of these keys are encountered during config merging, they are silently ignored. Source: CHANGELOG.md
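A deep merge with this kind of guard might look like the sketch below. It assumes arrays and primitives are replaced wholesale while plain objects are merged key by key. The names deepMerge, isPlainObject, and BLOCKED_KEYS mirror the description but the code is an illustration, not the project's implementation.

```typescript
type PlainObject = { [key: string]: unknown };

const BLOCKED_KEYS = ["__proto__", "constructor", "prototype"];

function isPlainObject(value: unknown): value is PlainObject {
  return typeof value === "object" && value !== null && !Array.isArray(value);
}

// Sketch of a guarded deep merge: values in `override` win, nested plain
// objects merge recursively, and dangerous keys are skipped silently.
function deepMerge(base: PlainObject, override: PlainObject): PlainObject {
  const result: PlainObject = { ...base };
  for (const key of Object.keys(override)) {
    if (BLOCKED_KEYS.indexOf(key) !== -1) continue;
    const current = result[key];
    const incoming = override[key];
    result[key] =
      isPlainObject(current) && isPlainObject(incoming)
        ? deepMerge(current, incoming)
        : incoming;
  }
  return result;
}

const builtInDefaults: PlainObject = {
  logLevel: "info",
  defaults: { generate: { aspectRatio: "1:1", resolution: "1K" } },
};
// JSON.parse creates "__proto__" as an own property, which is how a
// malicious config file could try to smuggle it in.
const userConfig = JSON.parse(
  '{"logLevel":"debug","defaults":{"generate":{"aspectRatio":"16:9"}},"__proto__":{"polluted":true}}',
) as PlainObject;
console.log(JSON.stringify(deepMerge(builtInDefaults, userConfig)));
```

Skipping the blocked keys before any assignment means a crafted config can never reach Object.prototype through the merge.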
Unknown Key Warnings
The system maintains a whitelist of known configuration keys. If an unknown key is found in a config file, a warning is logged and the key is dropped:
[config] WARNING: unknown key "someUnknownKey" in ~/.gemini-image-mcp.json — ignored.
This prevents unexpected data injection and helps users catch typos in configuration. Source: CHANGELOG.md
JSONC Parsing
The configuration system supports JSONC (JSON with Comments), which extends standard JSON with:
- Single-line comments: // comment
- Multi-line comments: /* comment */
String-Aware Comment Stripping
The JSONC parser is string-aware, meaning it won't mangle URLs or other quoted strings that contain comment-like patterns. For example:
{
// This is a comment
"url": "https://example.com/api?query=1&filter=//something",
"note": "Use // for comments in code"
}
Will correctly parse without affecting the URLs or notes. Source: CHANGELOG.md
Trailing Comma Handling
The parser automatically strips trailing commas left by commented-out lines, preventing parse failures. Source: CHANGELOG.md
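String-aware stripping can be done with a small state machine that tracks whether the scanner is inside a quoted string. The sketch below is one possible approach under that assumption, with trailing commas removed when a closing brace or bracket is reached. The function name stripJsonc is hypothetical and the project's parser may differ.

```typescript
// Sketch: remove // and /* */ comments and trailing commas without
// touching anything inside double-quoted strings.
function stripJsonc(text: string): string {
  let out = "";
  let inString = false;
  let i = 0;
  while (i < text.length) {
    const ch = text[i];
    const next = text[i + 1];
    if (inString) {
      out += ch;
      if (ch === "\\") {
        // Copy the escaped character verbatim so \" does not end the string.
        if (i + 1 < text.length) out += text[i + 1];
        i += 2;
        continue;
      }
      if (ch === '"') inString = false;
      i++;
    } else if (ch === '"') {
      inString = true;
      out += ch;
      i++;
    } else if (ch === "/" && next === "/") {
      // Line comment: skip to the end of the line, keeping the newline.
      while (i < text.length && text[i] !== "\n") i++;
    } else if (ch === "/" && next === "*") {
      // Block comment: skip past the closing */ (or to the end if unterminated).
      const end = text.indexOf("*/", i + 2);
      i = end === -1 ? text.length : end + 2;
    } else {
      if (ch === "}" || ch === "]") {
        // Drop a trailing comma left behind by a commented-out entry.
        out = out.replace(/,(\s*)$/, "$1");
      }
      out += ch;
      i++;
    }
  }
  return out;
}

const sample = `{
  // This is a comment
  "url": "https://example.com/api?query=1&filter=//something",
  "note": "Use // for comments in code",
  // "disabled": true
}`;
console.log(JSON.parse(stripJsonc(sample)));
```

Handling the trailing comma inside the same pass, rather than with a global regular expression afterwards, keeps commas inside string values untouched.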
Configuration Caching
After the configuration is loaded for the first time, it is cached in memory. Subsequent calls to loadConfig() return the cached value immediately without re-reading files. Source: src/config.ts
Cache Invalidation
To force a reload of the configuration (useful during development), you may need to restart the server process.
Environment Variables
While the config file is the recommended approach, the system still supports environment variables for backward compatibility:
| Environment Variable | Description |
|---|---|
GEMINI_API_KEY | Google Gemini API key (required) |
OUTPUT_DIR | Override output directory |
Environment variables always take precedence over config file values. Source: README.md
Per-Request Overrides
Configuration defaults can be overridden on a per-request basis. Per-request parameters always override config defaults. Source: README.md
Example: Per-Tool Defaults with Overrides
{
"defaultModel": "gemini-3.1-flash-image-preview",
"defaults": {
"generate": {
"aspectRatio": "16:9",
"resolution": "2K"
},
"process": {
"removeBackground": { "color": "#00FF00" },
"trim": true
}
}
}
With this configuration:
- All generate_image calls use 16:9 aspect ratio and 2K resolution by default
- All process_image calls auto-remove green backgrounds and trim by default
- Any individual request can override these by specifying different values
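The override behaviour amounts to taking, for each setting, the first defined value from an ordered list of layers. The sketch below assumes the layers are passed highest priority first (request, then config sources, then built-in defaults). The type GenerateSettings and the function resolveGenerateSettings are illustrative names.

```typescript
interface GenerateSettings {
  aspectRatio?: string;
  resolution?: string;
}

// Hypothetical resolver: layers are ordered from highest to lowest priority,
// and each setting takes the first value that is defined.
function resolveGenerateSettings(layers: GenerateSettings[]): GenerateSettings {
  const pick = (key: keyof GenerateSettings): string | undefined => {
    for (const layer of layers) {
      const value = layer[key];
      if (value !== undefined) return value;
    }
    return undefined;
  };
  return { aspectRatio: pick("aspectRatio"), resolution: pick("resolution") };
}

const requestParams: GenerateSettings = { resolution: "4K" };
const configDefaults: GenerateSettings = { aspectRatio: "16:9", resolution: "2K" };
const builtIn: GenerateSettings = { aspectRatio: "1:1", resolution: "1K" };

console.log(resolveGenerateSettings([requestParams, configDefaults, builtIn]));
```

In this example the request supplies only a resolution, so the aspect ratio falls through to the config file value.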
Programmatic Usage
The configuration module exports several functions for use within the codebase:
import { loadConfig, initConfig, CONFIG_TEMPLATE } from './config';
| Function | Description |
|---|---|
| loadConfig() | Load and return the merged configuration (uses cache) |
| initConfig(options?) | Create a new config file interactively |
| CONFIG_TEMPLATE | The default configuration template as a string |
Source: src/config.ts
File Locations
The system searches for configuration files in the following order:
- Current Working Directory: ./.gemini-image-mcp.json
- Home Directory: ~/.gemini-image-mcp.json
The first file found is used, with settings merged on top of defaults.
Configuration Schema
The complete configuration schema includes:
interface GeminiImageConfig {
outputDir: string;
defaultModel: string;
logLevel: 'debug' | 'info' | 'error';
requestTimeout: number;
sessionTimeout: number;
maxRequestsPerHour: number;
maxCostPerHour: number;
defaults: {
generate?: {
aspectRatio?: string;
resolution?: string;
};
process?: {
removeBackground?: object;
trim?: boolean;
format?: 'png' | 'jpeg' | 'webp';
quality?: number;
};
};
}
Validation Rules
The configuration loader performs the following validations:
| Rule | Behavior |
|---|---|
| API key detection | Strips any key matching /api.?key/i, logs warning |
| Unknown keys | Drops unknown keys, logs warning |
| Prototype pollution | Silently skips __proto__, constructor, prototype |
| JSONC syntax | Parses comments, strips trailing commas |
| File existence | Returns null if file doesn't exist |
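The key-level rules in this table could be applied in a single pass over the top-level keys, as in the sketch below. It assumes the whitelist matches the top-level options documented earlier and that warnings go through a caller-supplied function. The name sanitizeConfig, the KNOWN_KEYS list, and the exact warning wording are assumptions for illustration.

```typescript
// Assumed whitelist, taken from the top-level options documented above.
const KNOWN_KEYS = [
  "outputDir",
  "defaultModel",
  "logLevel",
  "requestTimeout",
  "sessionTimeout",
  "maxRequestsPerHour",
  "maxCostPerHour",
  "defaults",
];
const API_KEY_PATTERN = /api.?key/i;
const POLLUTION_KEYS = ["__proto__", "constructor", "prototype"];

// Hypothetical sanitizer: returns a copy containing only safe, known keys.
function sanitizeConfig(
  raw: { [key: string]: unknown },
  source: string,
  warn: (message: string) => void = console.warn,
): { [key: string]: unknown } {
  const clean: { [key: string]: unknown } = {};
  for (const key of Object.keys(raw)) {
    if (POLLUTION_KEYS.indexOf(key) !== -1) continue; // skipped silently
    if (API_KEY_PATTERN.test(key)) {
      warn(`[config] WARNING: "${key}" found in ${source}; API keys must not be in config files. Stripped.`);
      continue;
    }
    if (KNOWN_KEYS.indexOf(key) === -1) {
      warn(`[config] WARNING: unknown key "${key}" in ${source}; ignored.`);
      continue;
    }
    clean[key] = raw[key];
  }
  return clean;
}

const warnings: string[] = [];
const cleaned = sanitizeConfig(
  { apiKey: "secret", logLevel: "debug", someUnknownKey: 1 },
  "~/.gemini-image-mcp.json",
  (message) => warnings.push(message),
);
console.log(cleaned, warnings.length);
```

Checking for API keys before the whitelist means a misplaced credential gets the specific warning rather than the generic unknown-key one.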
Testing
The configuration module has comprehensive test coverage including:
- stripJsoncComments: String-aware comment removal
- deepMerge: Nested object merging with pollution protection
- loadConfig: Full configuration loading and precedence
- initConfig: Interactive config file creation
Source: CHANGELOG.md
Server Architecture
Related topics: Image Generation Internals, Image Processing Internals
gemini-image-mcp is an MCP (Model Context Protocol) server that exposes two tools: generate_image, which creates and edits images through the Google Gemini API, and process_image, which manipulates images locally with sharp. The server is built on @modelcontextprotocol/sdk and communicates over STDIO transport, making it compatible with MCP clients such as Claude Code.
Overview
The server architecture follows a modular design pattern with clear separation of concerns:
| Component | File | Responsibility |
|---|---|---|
| Entry Point | src/index.ts | Server initialization, tool registration, request routing |
| Image Generation | src/generate.ts | Gemini API integration, model discovery, image generation |
| Image Processing | src/process.ts | Local image manipulation using Sharp |
| Configuration | src/config.ts | Config file loading, validation, environment variable management |
| Usage Tracking | src/tracker.ts | Token usage logging, cost estimation |
| Utilities | src/utils.ts | Logging, file operations, path resolution |
Technology Stack
| Dependency | Version | Purpose |
|---|---|---|
| `@google/genai` | ^1.44.0 | Gemini API client for image generation |
| `@modelcontextprotocol/sdk` | ^1.22.0 | MCP protocol implementation |
| `sharp` | ^0.34.5 | Local image processing |
| `zod` | ^3.24.0 | Schema validation for tool parameters |
Source: package.json:18-22
Architecture Diagram
graph TD
A[MCP Client] <-->|STDIO| B[src/index.ts<br/>MCP Server]
B --> C[generate_image tool]
B --> D[process_image tool]
C --> E[src/generate.ts<br/>Gemini API]
C --> F[src/tracker.ts<br/>Usage Logger]
D --> G[src/process.ts<br/>Sharp Library]
E --> H[Model Discovery]
I[Config System] -.->|Priority Resolution| B
I --> J[src/config.ts]
J --> K[Environment Variables]
J --> L[Config Files]
J --> M[Defaults]
Core Components
Entry Point (`src/index.ts`)
The server initializes a single MCP server instance that registers two tools:
const server = new McpServer(
{
name: "gemini-image-mcp",
version: pkg.version,
},
{
instructions: "Gemini image generation and local image processing...",
},
);
Key initialization steps:
- Load configuration via `loadConfig()`
- Initialize usage tracker via `initTracker()`
- Register both `generate_image` and `process_image` tools
- Establish STDIO transport connection
Source: src/index.ts:1-20
Tool Registration Pattern
Each tool follows a consistent registration pattern using Zod schemas for parameter validation:
server.registerTool(
"tool-name",
{
description: "...",
inputSchema: { /* Zod raw shape, e.g. prompt: z.string() */ },
},
async (args) => {
const config = loadConfig();
// Merge config defaults with args
// Execute tool logic
// Return formatted response
},
);
Source: src/index.ts:35-80
Tool: `generate_image`
The generate_image tool handles AI-powered image generation and editing via the Gemini API.
Parameters
| Parameter | Type | Required | Description |
|---|---|---|---|
| `prompt` | string | Yes | Text description or editing instruction |
| `images` | string[] | No | File paths to reference images |
| `model` | string | No | Gemini model ID |
| `aspectRatio` | string | No | Image aspect ratio |
| `resolution` | string | No | Output resolution (1K, 2K, 4K) |
| `outputDir` | string | No | Override output directory |
| `filename` | string | No | Base name for saved file |
| `subfolder` | string | No | Subfolder within output directory |
| `sessionId` | string | No | Continue multi-turn session |
| `seed` | number | No | Integer seed for reproducibility |
| `useSearchGrounding` | boolean | No | Enable Google Search grounding |
Source: src/index.ts:35-70
Model Discovery
The generate.ts module implements automatic model discovery to detect available image-capable models:
// Values as listed under "Model Discovery System" in Image Generation Internals
const IMAGE_MODEL_PATTERNS = ["image", "img"];
const EXCLUDED_PREFIXES = ["imagen"];
async function discoverModels(apiKey: string): Promise<string[]> {
// Paginate through available models
// Filter by image capability patterns
// Exclude specific prefixes
// Cache results
}
Source: src/generate.ts:85-100
Image Input Handling
Local images are converted to inline data for API submission:
async function readImageAsInlineData(filepath: string): Promise<{
inlineData: { data: string; mimeType: string };
}> {
const mimeType = MIME_TYPES[ext];
// Validate file exists and is under 50MB
// Return base64-encoded data with MIME type
}
Source: src/generate.ts:105-130
Tool: `process_image`
The process_image tool provides local, free image processing via the Sharp library.
Parameters
| Parameter | Type | Required | Description |
|---|---|---|---|
| `imagePath` | string | Yes | Path to image file |
| `crop` | object | No | Crop by pixels, aspect ratio, or focal point |
| `resize` | object | No | Resize to width/height |
| `removeBackground` | object | No | Threshold or chroma key removal |
| `trim` | boolean | No | Auto-remove whitespace borders |
| `format` | string | No | Output format (png, jpeg, webp) |
| `quality` | number | No | Quality 1-100 for JPEG/WebP |
| `outputDir` | string | No | Override output directory |
| `filename` | string | No | Base name for saved file |
| `subfolder` | string | No | Subfolder within output directory |
Source: src/index.ts:75-120
Processing Capabilities
| Operation | Description |
|---|---|
| Crop | Pixel-exact, aspect ratio center crop, focal point (attention/entropy) |
| Resize | Width, height, or both with aspect ratio preservation |
| Background Removal | Threshold-based (white backgrounds) or chroma key (HSV keying) |
| Trim | Auto-remove whitespace/transparent borders |
| Format Conversion | PNG, JPEG, WebP with quality control |
Configuration System (`src/config.ts`)
The configuration system implements a hierarchical priority system for settings:
Priority Order
Per-request parameters > Environment Variables > Local config (.gemini-image-mcp.json) > Global config (~/.gemini-image-mcp.json) > Defaults
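As a rough illustration of this priority chain, a setting can be resolved by taking the first defined value, highest priority first. The helper `resolveSetting` and the example paths are hypothetical; the actual loader merges whole config objects rather than single values.

```typescript
// Hypothetical helper: returns the first defined candidate, highest priority first.
function resolveSetting<T>(...candidates: (T | undefined)[]): T | undefined {
  return candidates.find((value) => value !== undefined);
}

const outputDir = resolveSetting<string>(
  undefined,          // per-request parameter (not supplied)
  undefined,          // environment variable OUTPUT_DIR (not set)
  "./project-images", // local config (example value)
  "~/shared-images",  // global config (example value)
  "~/gemini-images",  // built-in default
);
```

Only `undefined` counts as "not set" here, so a legitimate value such as `0` for a disabled rate limit still wins over lower-priority sources.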
Security Features
| Feature | Implementation |
|---|---|
| API key rejection | Keys from config files are rejected with warning |
| JSONC parsing | String-aware comment stripping (preserves URLs) |
| Prototype pollution guard | __proto__, constructor, prototype blocked in deep merge |
| Unknown key warnings | Invalid config keys are warned and dropped |
Source: CHANGELOG.md
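To show what "string-aware" comment stripping means, here is a minimal sketch of a scanner that removes `//` and `/* */` comments while leaving string contents (including URLs) untouched. The name `stripCommentsSketch` is hypothetical, and trailing-comma handling is left out; the project's own `stripJsoncComments` may work differently.

```typescript
// Sketch: remove JSONC comments without touching text inside string literals.
function stripCommentsSketch(text: string): string {
  let out = "";
  let inString = false;
  let i = 0;
  while (i < text.length) {
    const ch = text.charAt(i);
    const next = text.charAt(i + 1);
    if (inString) {
      out += ch;
      if (ch === "\\") {
        out += next; // keep the escaped character verbatim
        i += 2;
        continue;
      }
      if (ch === '"') inString = false;
      i += 1;
    } else if (ch === '"') {
      inString = true;
      out += ch;
      i += 1;
    } else if (ch === "/" && next === "/") {
      // Line comment: skip to end of line, keep the newline itself.
      while (i < text.length && text.charAt(i) !== "\n") i += 1;
    } else if (ch === "/" && next === "*") {
      // Block comment: skip past the closing marker (or to end of input).
      const end = text.indexOf("*/", i + 2);
      i = end === -1 ? text.length : end + 2;
    } else {
      out += ch;
      i += 1;
    }
  }
  return out;
}
```

A naive regular expression such as `/\/\/.*$/gm` would truncate `"https://example.com"` at the double slash, which is the failure this approach avoids.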
Config Structure
{
"defaultModel": "gemini-3.1-flash-image-preview",
"defaults": {
"generate": {
"aspectRatio": "16:9",
"resolution": "2K"
},
"process": {
"removeBackground": { "color": "#00FF00" },
"trim": true
}
}
}
Usage Tracking (`src/tracker.ts`)
The tracker module logs all image generation operations to a manifest file (generations.jsonl).
Tracked Data
Each generation logs:
- Prompt text
- Model used
- Parameters (aspect ratio, resolution, etc.)
- Token counts (prompt, output, image, thinking)
- Estimated USD cost
- Session information
Source: src/tracker.ts (referenced in src/index.ts:18)
Logging System (`src/utils.ts`)
The utility module provides structured logging capabilities:
import { log, setLogLevel, setLogDir } from "./utils.js";
Features:
- Configurable log levels
- Directory-based log output
- Error message formatting
Source: src/utils.ts (referenced in src/index.ts:19)
Request Flow
sequenceDiagram
participant Client as MCP Client
participant Server as MCP Server
participant Config as Config System
participant Tool as Tool Handler
participant API as External API
Client->>Server: Tool Request
Server->>Config: Load Config
Config-->>Server: Merged Config
Server->>Tool: Request + Config
Tool->>Config: Get Defaults
Config-->>Tool: Tool Defaults
Tool->>API: Process Request
API-->>Tool: Response
Tool->>Server: Formatted Result
Server-->>Client: JSON Response
Environment Variables
| Variable | Required | Description |
|---|---|---|
| `GEMINI_API_KEY` | Yes | Google Gemini API key from AI Studio |
| `MAX_REQUESTS_PER_HOUR` | No | Rate limit for requests |
| `MAX_COST_PER_HOUR` | No | Rate limit for cost (USD) |
| `OUTPUT_DIR` | No | Default output directory |
Initialization
A configuration file can be created in two modes with the --init flag:
# Global config
npx @jimothy-snicket/gemini-image-mcp --init
# Local config (in current directory)
npx @jimothy-snicket/gemini-image-mcp --init --local
This creates a ~/.gemini-image-mcp.json or .gemini-image-mcp.json file with inline documentation of all available options.
Image Generation Internals
Related topics: generate_image Tool Reference, Server Architecture, Cost Tracking and Rate Limiting
This document provides a comprehensive technical overview of the image generation subsystem within gemini-image-mcp. It covers the architecture, API integration patterns, session management, model discovery, and configuration system.
Overview
The image generation system is built on Google Gemini's native image generation API (generateContent), not the deprecated Imagen API. The system provides both text-to-image generation and image editing capabilities through a Model Context Protocol (MCP) server interface. Source: README.md
Core Dependencies
| Package | Version | Purpose |
|---|---|---|
| `@google/genai` | ^1.44.0 | Gemini API client |
| `@modelcontextprotocol/sdk` | ^1.22.0 | MCP server implementation |
| `zod` | ^3.24.0 | Schema validation for tool parameters |
Source: package.json:18-21
Architecture
System Components
graph TD
A[MCP Client] -->|Tool Request| B[McpServer]
B --> C[generateImage Function]
C --> D[Model Discovery]
C --> E[Session Manager]
C --> F[API Client]
D --> G[Gemini API<br/>List Models]
E --> H[Session Store<br/>Map<sessionId, ConversationSession>]
F --> I[Gemini generateContent API]
I --> J[Image Response]
J --> K[File System<br/>Output Directory]
Flow Diagram
sequenceDiagram
participant Client
participant Server as MCP Server
participant Session as Session Manager
participant API as Gemini API
participant FS as File System
Client->>Server: generate_image(prompt, sessionId?)
Server->>Session: Check/Create Session
Session-->>Server: ConversationSession
alt New Session
Server->>API: List Models
API-->>Server: Available Models
Server->>Session: Create New Session
else Existing Session
Server->>Session: Get Session History
end
Server->>API: generateContent(prompt, history)
API-->>Server: Generated Image
Server->>FS: Save Image
Server-->>Client: Result + Usage Stats
Model Discovery System
Auto Model Detection
The system automatically discovers available image-capable models by querying the Gemini API at startup. This eliminates the need for hardcoded model lists and ensures compatibility as new models are released. Source: src/generate.ts:95-116
// Known image-capable model name fragments (Gemini native only)
const IMAGE_MODEL_PATTERNS = ["image", "img"];
// Imagen uses a different API (generateImages) and is deprecated June 2026
const EXCLUDED_PREFIXES = ["imagen"];
Model Filtering Logic
| Filter Type | Criteria | Purpose |
|---|---|---|
| Include | Name contains "image" or "img" | Match Gemini image models |
| Exclude | Name starts with "imagen" | Avoid deprecated Imagen API |
Source: src/generate.ts:95-97
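The include/exclude rules in the table can be sketched as a single predicate. The function name `isImageModel` and the `models/` prefix handling are assumptions for illustration, and the model names in the usage example are sample strings only.

```typescript
const IMAGE_MODEL_PATTERNS = ["image", "img"];
const EXCLUDED_PREFIXES = ["imagen"];

// Hypothetical predicate: true when a listed model name passes both filters.
function isImageModel(name: string): boolean {
  // Assumption: listed names may carry a "models/" resource prefix.
  const id = name.replace(/^models\//, "").toLowerCase();
  if (EXCLUDED_PREFIXES.some((prefix) => id.startsWith(prefix))) return false;
  return IMAGE_MODEL_PATTERNS.some((pattern) => id.includes(pattern));
}

const discovered = [
  "models/gemini-2.5-flash-image",
  "models/imagen-4.0-generate-001",
  "models/gemini-2.5-flash",
].filter(isImageModel);
```

The exclusion has to run first because the string "imagen" itself contains "image" and would otherwise match the include pattern.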
Caching Mechanism
Available models are cached after the first discovery call to reduce API overhead:
let cachedAvailableModels: string[] | null = null;
export function getAvailableModels(): string[] | null {
return cachedAvailableModels;
}
Source: src/generate.ts:63
Supported Models
| Model | Resolution | Cost | Grounding | Notes |
|---|---|---|---|---|
| `gemini-2.5-flash-image` | 1K only | ~$0.04/image | No | Default, deprecates Oct 2026 |
| `gemini-3-pro-image-preview` | 1K, 2K, 4K | ~$0.15/image | No | Best quality, up to 14 reference images |
| `gemini-3.1-flash-image-preview` | 512, 1K, 2K, 4K | ~$0.08/image | Yes | Search grounding support |
Source: README.md:45-49
Google Search Grounding
Supported Models
Only gemini-3.1-flash-image-preview supports Google Search grounding. The system validates this at runtime and throws a descriptive error if unsupported. Source: src/generate.ts:99-108
export const GROUNDING_SUPPORTED_MODELS = ["gemini-3.1-flash-image-preview"];
export function validateGrounding(model: string, useSearchGrounding: boolean | undefined): void {
if (useSearchGrounding && !GROUNDING_SUPPORTED_MODELS.includes(model)) {
throw new Error(
`useSearchGrounding is only supported on ${GROUNDING_SUPPORTED_MODELS.join(", ")}. ` +
`You requested ${model}.`,
);
}
}
Source: src/generate.ts:99-108
Multi-Turn Session Management
Session Data Model
interface ConversationSession {
history: Content[]; // Previous conversation turns
model: string; // Model used in this session
lastAccessed: number; // Timestamp for TTL cleanup
}
Source: src/generate.ts:67-71
Session Store
const sessions = new Map<string, ConversationSession>();
const MAX_SESSION_TURNS = 10;
Session Lifecycle
graph LR
A[Create Session] --> B[Store with TTL]
B --> C[Each Request]
C -->|Within TTL| D[Extend TTL]
C -->|Exceeds TTL| E[Cleanup on Access]
E --> F[Return Error]
D --> G[Append to History]
G --> H[Return Response]
H --> I[Max 10 Turns]
I -->|Exceeded| J[Prune Oldest]
Session Configuration
| Parameter | Default | Description |
|---|---|---|
| `sessionTimeout` | 1800000ms (30 min) | Inactivity timeout before session expiry |
| `MAX_SESSION_TURNS` | 10 | Maximum conversation turns per session |
Source: src/generate.ts:69 and src/config.ts
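The "prune oldest" step from the lifecycle diagram might look like the following. This is a sketch under stated assumptions: `appendTurn` is a hypothetical helper, the `Content` type is a simplified stand-in for the one exported by `@google/genai`, and one turn is assumed to be a user message plus the model reply.

```typescript
// Simplified stand-in for the Content type from @google/genai (assumption).
type Content = { role: string; parts: { text?: string }[] };

interface ConversationSession {
  history: Content[];
  model: string;
  lastAccessed: number;
}

const MAX_SESSION_TURNS = 10;

// Assumption: one turn = one user entry + one model entry in the history.
function appendTurn(
  session: ConversationSession,
  user: Content,
  reply: Content,
  now: number,
): void {
  session.history.push(user, reply);
  const maxEntries = MAX_SESSION_TURNS * 2;
  if (session.history.length > maxEntries) {
    // Drop the oldest entries so only the most recent turns remain.
    session.history.splice(0, session.history.length - maxEntries);
  }
  session.lastAccessed = now; // extends the inactivity window
}
```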
Session Cleanup
Sessions are automatically cleaned up based on the configured timeout:
function getSessionTimeout(): number {
return loadConfig().sessionTimeout;
}
function cleanupSessions(): void {
const timeout = getSessionTimeout();
const now = Date.now();
for (const [id, session] of sessions) {
if (now - session.lastAccessed > timeout) {
sessions.delete(id);
}
}
}
Source: src/generate.ts:73-84
Image Input Processing
Supported Formats
The system supports multiple image formats through MIME type mapping:
| Extension | MIME Type |
|---|---|
| `.png` | image/png |
| `.jpg` / `.jpeg` | image/jpeg |
| `.webp` | image/webp |
| `.gif` | image/gif |
| `.avif` | image/avif |
File Validation
| Check | Limit | Error Message |
|---|---|---|
| File size | 50MB max | "Image file is {size}MB, max is 50MB" |
| Format support | Defined MIME map | "Unsupported image format" |
Source: src/generate.ts:119-135
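The two checks can be sketched as a pure function that takes the path and a file size, so it stays independent of the file system. The name `validateImageInput`, the binary-megabyte interpretation of "50MB", and the exact wording of the unsupported-format message are assumptions.

```typescript
const MIME_TYPES: Record<string, string> = {
  ".png": "image/png",
  ".jpg": "image/jpeg",
  ".jpeg": "image/jpeg",
  ".webp": "image/webp",
  ".gif": "image/gif",
  ".avif": "image/avif",
};

// Assumption: 50MB is interpreted as 50 * 1024 * 1024 bytes.
const MAX_IMAGE_BYTES = 50 * 1024 * 1024;

// Hypothetical validator: returns the MIME type or throws a descriptive error.
function validateImageInput(filepath: string, sizeBytes: number): string {
  const dot = filepath.lastIndexOf(".");
  const ext = dot === -1 ? "" : filepath.slice(dot).toLowerCase();
  const mimeType = MIME_TYPES[ext];
  if (!mimeType) {
    throw new Error(`Unsupported image format: ${ext || "(none)"}`);
  }
  if (sizeBytes > MAX_IMAGE_BYTES) {
    const sizeMb = (sizeBytes / (1024 * 1024)).toFixed(1);
    throw new Error(`Image file is ${sizeMb}MB, max is 50MB`);
  }
  return mimeType;
}
```

In a real handler the size would come from a file stat call before the file is read and base64-encoded.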
Configuration System
Configuration Priority
per-request params > env vars > local config > global config > defaults
Source: README.md
Configuration Template
{
"outputDir": "~/gemini-images",
"defaultModel": "gemini-2.5-flash-image",
"logLevel": "info",
"requestTimeout": 60000,
"sessionTimeout": 1800000,
"maxRequestsPerHour": 0,
"maxCostPerHour": 0,
"defaults": {
"generate": {
"aspectRatio": "16:9",
"resolution": "1K"
}
}
}
Source: src/config.ts
Per-Tool Defaults
Users can configure default parameters for each tool to avoid repetition:
{
"defaultModel": "gemini-3.1-flash-image-preview",
"defaults": {
"generate": {
"aspectRatio": "16:9",
"resolution": "2K"
},
"process": {
"removeBackground": { "color": "#00FF00" },
"trim": true
}
}
}
Source: README.md:95-109
Rate Limiting
Configuration Parameters
| Variable | Purpose |
|---|---|
MAX_REQUESTS_PER_HOUR | Maximum API requests per hour |
MAX_COST_PER_HOUR | Maximum USD cost per hour |
Source: README.md:79
Rate Limit Behavior
The system monitors both the request count and the estimated cost per hour. When a limit is reached, the server rejects the tool call with a clear error message indicating the remaining budget; the Gemini API is not called. Source: CHANGELOG.md
Response Structure
Each generation response includes:
| Field | Type | Description |
|---|---|---|
| `sessionId` | string | Unique ID for multi-turn sessions |
| `imagePath` | string | Path to saved image |
| `generation` | object | Generation parameters used |
| `usage` | object | Token counts and estimated cost |
| `session` | object | Running totals (generations, cost, hourly count) |
Source: src/generate.ts
MCP Tool Registration
The generate_image tool is registered with the MCP SDK using Zod schemas for parameter validation:
server.registerTool(
  "generate_image",
  {
    description: "...",
    // registerTool takes the Zod shape under inputSchema
    inputSchema: {
      prompt: z.string(),
      images: z.array(z.string()).optional(),
      model: z.string().optional(),
      aspectRatio: z.string().optional(),
      // ... additional parameters
    },
  },
  async (args) => {
    const config = loadConfig();
    const result = await generateImage({ ...args, config });
    return { content: [{ type: "text", text: JSON.stringify(result) }] };
  }
);
Source: src/index.ts
Error Handling
Model Mismatch Detection
Sessions verify that the requested model matches the original session model to prevent inconsistent generation behavior:
// Model mismatch detection: error if session uses a different model than the original
Source: CHANGELOG.md
Seed-Based Reproducibility
Integer seeds enable reproducible generation results:
// seed param: integer for reproducible generation
Source: CHANGELOG.md
Image Processing Internals
Related topics: process_image Tool Reference, Server Architecture
Overview
The process_image tool provides local, free image processing capabilities powered by the sharp library. Unlike generate_image which makes API calls to Google's Gemini, process_image operates entirely on the local machine, making it ideal for batch operations, asset preparation, and cost-free transformations. Source: package.json:17
The module supports chaining multiple operations in a single tool call, including cropping, resizing, background removal (threshold and chroma key), border trimming, and format conversion. This design allows complex pipelines like favicon generation or transparent asset extraction without multiple API round-trips.
Architecture
Component Diagram
graph TD
A["process_image Tool"] --> B["Input Validation"]
B --> C["sharp Pipeline"]
C --> D["Operation Chain"]
D --> E1["Crop Operations"]
D --> E2["Resize Operations"]
D --> E3["Background Removal"]
D --> E4["Trim Operations"]
D --> E5["Format Conversion"]
E3 --> F1["Threshold Detection"]
E3 --> F2["HSV Chroma Key"]
F2 --> G1["Smoothstep Feather"]
F2 --> G2["Spill Suppression"]
F2 --> G3["Edge Anti-aliasing"]
E1 --> H["Output Writer"]
E2 --> H
E4 --> H
E5 --> H
H --> I["generations.jsonl"]
H --> J["File System"]
Technology Stack
| Component | Technology | Version | Purpose |
|---|---|---|---|
| Image Processing | sharp | ^0.34.5 | High-performance image manipulation |
| Validation | zod | ^3.24.0 | Runtime type checking for parameters |
| MCP SDK | @modelcontextprotocol/sdk | ^1.22.0 | Tool registration and communication |
Source: package.json:13-15
Input Validation
The tool validates all parameters before processing begins. The Zod schema enforces strict type constraints and ranges.
Parameter Schema
| Parameter | Type | Required | Default | Description |
|---|---|---|---|---|
| `imagePath` | string | Yes | none | Path to source image file |
| `crop` | CropConfig | No | undefined | Crop configuration object |
| `resize` | ResizeConfig | No | undefined | Resize configuration object |
| `removeBackground` | BackgroundConfig | No | config default | Background removal settings |
| `trim` | boolean | No | config default | Auto-remove whitespace borders |
| `format` | "png" \| "jpeg" \| "webp" | No | original | Output format |
| `quality` | number (1-100) | No | 90 | JPEG/WebP quality |
| `outputDir` | string | No | ~/gemini-images | Output directory |
| `filename` | string | No | auto-generated | Base filename |
| `subfolder` | string | No | none | Subdirectory path |
Source: src/index.ts:67-95
Crop Configuration
// Pixel-exact dimensions
{ width: 500, height: 300, left: 100, top: 50 }
// Aspect ratio (center crop)
{ aspectRatio: "16:9" }
// Focal point strategies
{ aspectRatio: "16:9", strategy: "attention" }
{ aspectRatio: "16:9", strategy: "entropy" }
Resize Configuration
// Width only (aspect ratio preserved)
{ width: 1200 }
// Height only (aspect ratio preserved)
{ height: 800 }
// Both dimensions (may affect aspect ratio)
{ width: 192, height: 192 }
Background Removal Configuration
// Threshold-based (white backgrounds)
{ threshold: 230 }
// Chroma key (green screen / any solid color)
{ color: "#00FF00" }
// Custom color with threshold tolerance
{ color: "#00FF00", threshold: 30 }
Processing Pipeline
Operation Flow
graph LR
A[Input Image] --> B[Load with sharp]
B --> C{Crop Specified?}
C -->|Yes| D[Apply Crop]
C -->|No| E[Resize Specified?]
D --> E
E -->|Yes| F[Apply Resize]
E -->|No| G[Background Removal?]
F --> G
G -->|Yes| H[Apply Background Removal]
G -->|No| I[Trim Specified?]
H --> I
I -->|Yes| J[Apply Trim]
I -->|No| K[Format Conversion?]
J --> K
K -->|Yes| L[Apply Format & Quality]
K -->|No| M[Write to Output]
L --> M
M --> N[Log to generations.jsonl]
Crop Operations
Pixel-Exact Cropping
Accepts explicit left, top, width, and height parameters in pixels. The crop is applied using sharp's region extraction, which reads only the specified portion of the source image.
await sharp(input)
.extract({
left: crop.left,
top: crop.top,
width: crop.width,
height: crop.height
})
.toBuffer();
Aspect Ratio Cropping
When aspectRatio is specified without explicit dimensions, the system calculates the largest crop region matching the target ratio. The strategy parameter determines which region to select:
| Strategy | Behavior |
|---|---|
| `center` (default) | Crops from the geometric center of the image |
| `attention` | Shifts crop toward the most visually interesting region based on saliency detection |
| `entropy` | Shifts crop toward the region with highest information density (detail) |
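For the default center strategy, the "largest crop region matching the target ratio" can be computed with plain arithmetic. The following is a sketch of that calculation only; the function name `centerCropRegion` and the rounding choices are assumptions, and the attention and entropy strategies are not modeled here.

```typescript
interface Region {
  left: number;
  top: number;
  width: number;
  height: number;
}

// Hypothetical helper: largest centered region with the requested aspect ratio.
function centerCropRegion(srcWidth: number, srcHeight: number, aspectRatio: string): Region {
  const [w, h] = aspectRatio.split(":").map(Number);
  if (!w || !h || w <= 0 || h <= 0) {
    throw new Error(`Invalid aspect ratio: ${aspectRatio}`);
  }
  const target = w / h;
  // Start from full width; fall back to full height if that is too tall.
  let width = srcWidth;
  let height = Math.round(srcWidth / target);
  if (height > srcHeight) {
    height = srcHeight;
    width = Math.round(srcHeight * target);
  }
  return {
    left: Math.floor((srcWidth - width) / 2),
    top: Math.floor((srcHeight - height) / 2),
    width,
    height,
  };
}
```

The returned object has the same shape as the pixel-exact crop parameters, so it could be passed straight to a region extraction call.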
Resize Operations
Dimension Handling
The resize operation follows sharp's resize semantics:
- Width only: Height is calculated to maintain aspect ratio
- Height only: Width is calculated to maintain aspect ratio
- Both specified: Resizes to exact dimensions (may alter aspect ratio)
Resolution Presets
The resize operation takes explicit pixel values. Resolution presets (1K, 2K, 4K) belong to the generate_image tool instead and map to standard dimensions:
| Preset | Dimensions |
|---|---|
| 1K | 1024 × 1024 (or proportional) |
| 2K | 2048 × 2048 (or proportional) |
| 4K | 4096 × 4096 (or proportional) |
Background Removal
Threshold-Based Detection
For images with white or light backgrounds, threshold-based detection identifies pixels above a brightness value and makes them transparent.
Algorithm:
- Convert image to grayscale
- Identify pixels exceeding the threshold (default: 230 on 0-255 scale)
- Set identified pixels to transparent
- Apply a slight blur to smooth edges
Best for: Product photos on plain white backgrounds, scanned documents, screenshots
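The core of the threshold step can be sketched on a raw RGBA byte buffer. This is an illustration, not the project's implementation: the function name `removeLightBackground` and the Rec. 601 luminance weights are assumptions, and the edge-smoothing blur is omitted.

```typescript
// Sketch: make every pixel brighter than the threshold fully transparent.
// Input is raw RGBA data (4 bytes per pixel); a new buffer is returned.
function removeLightBackground(rgba: Uint8Array, threshold = 230): Uint8Array {
  const out = new Uint8Array(rgba); // copy so the input stays untouched
  for (let i = 0; i + 3 < out.length; i += 4) {
    // Grayscale value via Rec. 601 weights (assumed; any luminance formula works).
    const luminance = 0.299 * out[i] + 0.587 * out[i + 1] + 0.114 * out[i + 2];
    if (luminance > threshold) {
      out[i + 3] = 0; // alpha channel
    }
  }
  return out;
}
```

Because the test is a hard cutoff, light areas inside the subject (a white shirt, for example) are removed as well, which is why this method suits plain backgrounds best.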
Chroma Key Pipeline
For green screen or solid color backgrounds, the chroma key pipeline performs sophisticated color extraction:
graph TD
A[Input Image] --> B[Convert to HSV]
B --> C[Color Range Detection]
C --> D[Create Mask]
D --> E[Smoothstep Feather]
E --> F[Spill Suppression]
F --> G[Edge Anti-aliasing]
G --> H[Composite with Transparency]
Stage Details:
| Stage | Description |
|---|---|
| HSV Keying | Converts to Hue-Saturation-Value color space for better color discrimination |
| Smoothstep Feather | Applies smooth edge transition using smoothstep function (not linear) |
| Spill Suppression | Removes color contamination from edges of subject |
| Edge Anti-aliasing | 5-pass 3×3 kernel anti-aliasing for smooth edges |
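The difference between a smoothstep feather and a linear ramp is easiest to see in code. `smoothstep` below is the standard cubic Hermite form; `featherAlpha` and its inner and outer distances of 10 and 30 degrees are hypothetical values chosen only for illustration.

```typescript
// Standard smoothstep: 0 below edge0, 1 above edge1, smooth cubic in between.
function smoothstep(edge0: number, edge1: number, x: number): number {
  const t = Math.min(1, Math.max(0, (x - edge0) / (edge1 - edge0)));
  return t * t * (3 - 2 * t);
}

// Hypothetical mapping from hue distance (degrees away from the key color)
// to an alpha value: close to the key color is transparent, far away is opaque.
function featherAlpha(hueDistance: number, inner = 10, outer = 30): number {
  return Math.round(255 * smoothstep(inner, outer, hueDistance));
}
```

Compared with a linear ramp, the curve is flat near both ends, so the mask eases into full transparency and full opacity instead of showing a visible kink at the feather boundaries.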
Recommended Settings:
| Subject Type | Color | Notes |
|---|---|---|
| High contrast (red, blue, black, white) | #00FF00 | Best results |
| Yellow subjects | canvas approach | Use generate_image instead |
| Green subjects | canvas approach | Use generate_image instead |
| Glass/reflective | canvas approach | Use generate_image instead |
Trim Operations
The trim operation automatically removes whitespace and transparent borders from images.
Algorithm:
- Scan the image row-by-row and column-by-column
- Identify the bounding box of non-white, non-transparent content
- Extract the content region
- Apply minimal padding (optional)
This operation is particularly useful after background removal to eliminate any leftover border artifacts.
Format Conversion
Supported Formats
| Format | Extension | Quality Range | Use Case |
|---|---|---|---|
| PNG | .png | N/A (lossless) | Transparency, icons, diagrams |
| JPEG | .jpg/.jpeg | 1-100 | Photographs, final output |
| WebP | .webp | 1-100 | Web optimization, smaller files |
Quality Control
For JPEG and WebP, the quality parameter controls the compression level:
- 90 (default): Balanced quality and file size
- 100: Maximum quality, larger file size
- 70-85: Smaller files, visible compression artifacts
- 1-69: Heavy compression, significant quality loss
Output Organization
Filename Auto-Versioning
When a filename collision occurs, the system automatically increments a version suffix:
| Attempt | Filename |
|---|---|
| 1st | hero.png |
| 2nd | hero-v2.png |
| 3rd | hero-v3.png |
| nth | hero-v{n}.png |
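The versioning scheme in the table can be sketched as a loop over candidate names. The helper `nextAvailableName` is hypothetical, and the `exists` callback stands in for whatever file system check the server performs.

```typescript
// Hypothetical helper: first free name following the hero, hero-v2, hero-v3 pattern.
function nextAvailableName(
  base: string,
  ext: string,
  exists: (name: string) => boolean,
): string {
  const first = `${base}.${ext}`;
  if (!exists(first)) return first;
  for (let version = 2; ; version++) {
    const candidate = `${base}-v${version}.${ext}`;
    if (!exists(candidate)) return candidate;
  }
}

// Usage with an in-memory set standing in for the output directory.
const taken = new Set(["hero.png", "hero-v2.png"]);
const nextName = nextAvailableName("hero", "png", (name) => taken.has(name));
```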
Directory Structure
Output is organized as: {outputDir}/{subfolder}/{filename}.{format}
Examples:
| Parameters | Result |
|---|---|
| `filename: "hero"`, no subfolder | ~/gemini-images/hero.png |
| `filename: "logo"`, `subfolder: "brand"` | ~/gemini-images/brand/logo.png |
| `outputDir: "./output"`, `subfolder: "icons"` | ./output/icons/{filename}.png |
Generation Manifest
Every processed image is logged to generations.jsonl in the output directory. Each entry is a JSON object on a single line:
{"timestamp":"2024-01-15T10:30:00.000Z","type":"process","operation":"background-removal","input":"product.jpg","output":"product-transparent.png","duration_ms":145}
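Because the manifest is JSON Lines, it can be read back with a few lines of code. The sketch below assumes only what the example entry shows; the `ManifestEntry` type and the `parseManifest` helper are hypothetical, and real entries may carry additional fields.

```typescript
// Hypothetical entry type: only the fields visible in the example are named.
interface ManifestEntry {
  timestamp: string;
  type: string;
  [key: string]: unknown;
}

// Parse JSONL text: one JSON object per non-empty line.
function parseManifest(jsonl: string): ManifestEntry[] {
  return jsonl
    .split("\n")
    .map((line) => line.trim())
    .filter((line) => line.length > 0)
    .map((line) => JSON.parse(line) as ManifestEntry);
}
```

Appending one line per operation keeps writes cheap and lets the file be tailed or filtered with ordinary line-oriented tools.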
Configuration Integration
Config Precedence
Parameters can be specified at multiple levels with this priority:
graph TD
A[Per-Request Parameters<br/>Highest Priority] --> B[Environment Variables]
B --> C[Local Config .gemini-image-mcp.json]
C --> D[Global Config ~/.gemini-image-mcp.json]
D --> E[Code Defaults<br/>Lowest Priority]
Default Configuration Template
{
"outputDir": "~/gemini-images",
"defaultModel": "gemini-2.5-flash-image",
"logLevel": "info",
"requestTimeout": 60000,
"sessionTimeout": 1800000,
"maxRequestsPerHour": 0,
"maxCostPerHour": 0,
"defaults": {
"process": {
"removeBackground": { "color": "#00FF00" },
"trim": true,
"format": "png",
"quality": 90
}
}
}
Source: src/config.ts:17-47
Common Pipelines
Favicon Generation Pipeline
process_image → removeBackground {threshold: 230} + trim + resize {width: 192, height: 192}
Steps:
- Remove white background using threshold detection
- Trim any remaining whitespace
- Resize to 192×192 favicon dimensions
Transparent Asset from Green Screen
generate_image → "A product photo on a bright green background"
process_image → removeBackground {color: "#00FF00"} + trim
Steps:
- Generate subject on green screen (one API call)
- Apply chroma key to remove green (free, local)
- Trim excess border
Social Card from Photo
process_image → crop {aspectRatio: "16:9", strategy: "attention"} + resize {width: 1200}
Steps:
- Crop to 16:9 ratio, focusing on the most interesting region
- Resize to optimal width for social platforms
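Written out as tool arguments, the process_image steps of the pipelines above might look like this. The `ProcessImageArgs` interface is a hypothetical shape inferred from the parameter tables in this manual, and the file paths are placeholders.

```typescript
// Hypothetical argument shape, inferred from the parameter tables in this manual.
interface ProcessImageArgs {
  imagePath: string;
  crop?: { aspectRatio?: string; strategy?: "center" | "attention" | "entropy" };
  resize?: { width?: number; height?: number };
  removeBackground?: { threshold?: number; color?: string };
  trim?: boolean;
}

// Favicon: threshold removal, trim, then 192x192.
const favicon: ProcessImageArgs = {
  imagePath: "./logo.png", // placeholder path
  removeBackground: { threshold: 230 },
  trim: true,
  resize: { width: 192, height: 192 },
};

// Green screen: chroma key on the generated image, then trim.
const greenScreen: ProcessImageArgs = {
  imagePath: "./product.png", // placeholder: output of the generate_image step
  removeBackground: { color: "#00FF00" },
  trim: true,
};

// Social card: attention-based 16:9 crop, then resize to 1200 wide.
const socialCard: ProcessImageArgs = {
  imagePath: "./photo.jpg", // placeholder path
  crop: { aspectRatio: "16:9", strategy: "attention" },
  resize: { width: 1200 },
};
```

Each object represents one tool call, which is what allows a multi-step pipeline to run without extra round-trips.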
Performance Characteristics
Processing Speed
Since all operations run locally via sharp, process_image is significantly faster than API-based alternatives:
| Operation | Typical Duration |
|---|---|
| Crop/Resize | < 100ms |
| Background Removal (threshold) | 100-300ms |
| Background Removal (chroma key) | 300-800ms |
| Trim | < 50ms |
| Format Conversion | 50-200ms |
Memory Usage
Sharp processes images in memory and uses libvips, which is designed for efficient memory usage even with large images. A 4K image (4096×4096) typically requires 50-100MB of working memory depending on the operations performed.
Error Handling
Common Error Cases
| Error | Cause | Resolution |
|---|---|---|
| Unsupported format | Invalid file extension | Use PNG, JPEG, WebP, GIF, or TIFF |
| File too large | Image exceeds 50MB limit | Reduce image size before processing |
| File not found | Invalid path | Verify imagePath is correct and accessible |
| Invalid crop dimensions | Crop region exceeds image bounds | Adjust width, height, left, top values |
| Invalid hex color | Malformed color string | Use format: #RRGGBB or #RGB |
Source: src/generate.ts:107-112
Cost Tracking and Rate Limiting
Related topics: generate_image Tool Reference, Configuration Guide, Image Generation Internals
The gemini-image-mcp server implements a comprehensive cost tracking and rate limiting system to help users monitor and control their API spending. This system operates at multiple levels—from per-generation cost calculation to hourly request and budget caps—ensuring predictable expenditure when using Gemini image generation capabilities.
Overview
The cost tracking and rate limiting subsystem serves three primary purposes:
- Cost Transparency — Every image generation returns detailed token counts and estimated USD cost, allowing users to understand their API consumption.
- Budget Enforcement — Configurable hourly limits prevent runaway agents or iterative workflows from exceeding intended spending.
- Session Context — Generation costs are tracked per session, providing cumulative cost summaries across multi-turn editing workflows.
Source: src/pricing.ts:1-50
Architecture
The system comprises two interconnected modules:
| Module | File | Responsibility |
|---|---|---|
| Pricing | src/pricing.ts | Token counting, cost calculation, and pricing table |
| Tracker | src/tracker.ts | Rate limiting, session tracking, and manifest logging |
graph TD
A[generate_image Tool Call] --> B[checkRateLimit]
B --> C{Within Limits?}
C -->|No| D[Throw RateLimitError]
C -->|Yes| E[Call Gemini API]
E --> F[calculateUsage]
F --> G[UsageReport]
G --> H[recordGeneration]
H --> I[Update Session Stats]
H --> J[Append to generations.jsonl]
K[Config: MAX_REQUESTS_PER_HOUR] -.-> B
L[Config: MAX_COST_PER_HOUR] -.-> B
Source: src/tracker.ts:40-60
Pricing Module
Pricing Table
The PRICING object in src/pricing.ts contains the authoritative pricing rates for all supported Gemini image models. All rates are expressed as USD per million tokens.
| Model | Input ($/M) | Text Output ($/M) | Image Output ($/M) | Thinking ($/M) |
|---|---|---|---|---|
| `gemini-2.5-flash-image` | 0.30 | 2.50 | 30.00 | 2.50 |
| `gemini-3-pro-image-preview` | 2.00 | 120.00 | 120.00 | 120.00 |
| `gemini-3.1-flash-image-preview` | 0.50 | 60.00 | 60.00 | 60.00 |
Source: src/pricing.ts:31-45
The pricing data was verified against Google AI Studio as of 2026-04-01. That date is stored in the PRICING_VERIFIED_DATE constant and included in every UsageReport.
Cost Calculation Formula
The calculateUsage() function computes the estimated cost using the following formula:
cost = (promptTokens / 1,000,000) × inputPerMillion
+ (textTokens / 1,000,000) × textOutputPerMillion
+ (imageTokens / 1,000,000) × imageOutputPerMillion
+ (thinkingTokens / 1,000,000) × thinkingPerMillion
Source: src/pricing.ts:66-72
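The formula above can be worked through in a few lines of TypeScript. This is a minimal sketch, not the project's calculateUsage() implementation: the `ModelRates` and `TokenCounts` shapes and the `estimateCostUsd` name are assumptions made for illustration, while the rates are the gemini-2.5-flash-image row from the pricing table.

```typescript
// Rates in USD per million tokens (gemini-2.5-flash-image row of the pricing table).
interface ModelRates {
  inputPerMillion: number;
  textOutputPerMillion: number;
  imageOutputPerMillion: number;
  thinkingPerMillion: number;
}

// Hypothetical token breakdown; the real input shape of calculateUsage() is not shown in this section.
interface TokenCounts {
  promptTokens: number;
  textTokens: number;
  imageTokens: number;
  thinkingTokens: number;
}

const FLASH_IMAGE_RATES: ModelRates = {
  inputPerMillion: 0.3,
  textOutputPerMillion: 2.5,
  imageOutputPerMillion: 30.0,
  thinkingPerMillion: 2.5,
};

const TOKENS_PER_MILLION = 1_000_000;

// Applies the four-term formula: each token count is scaled to millions
// and multiplied by the matching per-million rate.
function estimateCostUsd(tokens: TokenCounts, rates: ModelRates): number {
  return (
    (tokens.promptTokens / TOKENS_PER_MILLION) * rates.inputPerMillion +
    (tokens.textTokens / TOKENS_PER_MILLION) * rates.textOutputPerMillion +
    (tokens.imageTokens / TOKENS_PER_MILLION) * rates.imageOutputPerMillion +
    (tokens.thinkingTokens / TOKENS_PER_MILLION) * rates.thinkingPerMillion
  );
}

// Illustrative token counts (not taken from a real API response).
const exampleCost = estimateCostUsd(
  { promptTokens: 1000, textTokens: 0, imageTokens: 1290, thinkingTokens: 0 },
  FLASH_IMAGE_RATES,
);
console.log(`$${exampleCost.toFixed(4)}`); // prints "$0.0390"
```

In this example the image output term dominates: 1,290 image tokens account for $0.0387 of the $0.0390 total, while 1,000 prompt tokens add only $0.0003.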
UsageReport Interface
Every image generation returns a UsageReport containing:
| Field | Type | Description |
|---|---|---|
| promptTokens | number | Input token count |
| outputTokens | number | Total output tokens |
| imageTokens | number | Image modality output tokens |
| thinkingTokens | number | Internal reasoning tokens |
| totalTokens | number | Combined token count |
| estimatedCost | string | Formatted cost (e.g., "$0.0412") |
| pricingVerifiedDate | string | Date pricing was last verified |
Source: src/pricing.ts:47-54
Handling Unknown Models
If a model is not found in the pricing table, the system returns "unknown (model not in pricing table)" as the estimated cost while still populating token counts. This ensures graceful degradation without breaking workflows for new or custom models.
Source: src/pricing.ts:74-80
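The fallback described above might look roughly like the following. This is a sketch under assumptions: the `IMAGE_RATES` map, the `PartialReport` shape, and the `reportFor` helper are hypothetical, and only the image output rate is used to keep it short. The fallback string and the four-decimal cost format follow the text and the UsageReport example in this section.

```typescript
// Hypothetical subset of the pricing table: image output rate in USD per million tokens.
const IMAGE_RATES = new Map<string, number>([
  ["gemini-2.5-flash-image", 30.0],
  ["gemini-3-pro-image-preview", 120.0],
  ["gemini-3.1-flash-image-preview", 60.0],
]);

// Hypothetical, trimmed-down report; the real UsageReport has more fields.
interface PartialReport {
  imageTokens: number;
  estimatedCost: string;
}

function reportFor(model: string, imageTokens: number): PartialReport {
  const ratePerMillion = IMAGE_RATES.get(model);
  if (ratePerMillion === undefined) {
    // Unknown model: keep the token counts, but do not guess at a price.
    return { imageTokens, estimatedCost: "unknown (model not in pricing table)" };
  }
  const cost = (imageTokens / 1_000_000) * ratePerMillion;
  return { imageTokens, estimatedCost: `$${cost.toFixed(4)}` };
}

console.log(reportFor("gemini-2.5-flash-image", 1290).estimatedCost);
console.log(reportFor("my-custom-model", 1290).estimatedCost);
```

Returning a string in both branches means callers can display estimatedCost without a special case, at the price of having to parse it if they need a number.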
Rate Limiting Module
Configuration
Rate limits are configured through environment variables or the JSON config file:
| Environment Variable | Config Key | Type | Default | Description |
|---|---|---|---|---|
| MAX_REQUESTS_PER_HOUR | maxRequestsPerHour | number | 0 (disabled) | Maximum generations per rolling hour |
| MAX_COST_PER_HOUR | maxCostPerHour | number | 0 (disabled) | Maximum USD spend per rolling hour |
Source: src/tracker.ts:35-50
Configuration priority follows this order (highest to lowest):
- Environment variables
- Local config file (.gemini-image-mcp.json in CWD)
- Global config file (~/.gemini-image-mcp.json)
- Built-in defaults
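The priority order can be sketched as a chain of fallbacks. The `resolveConfig` and `parseEnvNumber` helpers below are hypothetical and take already-loaded values as arguments, so the sketch does not read files or `process.env` itself. How the real loader treats malformed values is not stated here, so ignoring them is an assumption.

```typescript
interface RateLimitConfig {
  maxRequestsPerHour: number;
  maxCostPerHour: number;
}

// Built-in defaults from the configuration table: 0 means the limit is disabled.
const DEFAULTS: RateLimitConfig = { maxRequestsPerHour: 0, maxCostPerHour: 0 };

// Parses a numeric environment value; unset, empty, or non-numeric values are ignored (assumption).
function parseEnvNumber(value: string | undefined): number | undefined {
  if (value === undefined || value.trim() === "") return undefined;
  const parsed = Number(value);
  return Number.isFinite(parsed) ? parsed : undefined;
}

// Highest priority first: environment, local file, global file, defaults.
function resolveConfig(
  env: Record<string, string | undefined>,
  localFile: Partial<RateLimitConfig>,
  globalFile: Partial<RateLimitConfig>,
): RateLimitConfig {
  return {
    maxRequestsPerHour:
      parseEnvNumber(env["MAX_REQUESTS_PER_HOUR"]) ??
      localFile.maxRequestsPerHour ??
      globalFile.maxRequestsPerHour ??
      DEFAULTS.maxRequestsPerHour,
    maxCostPerHour:
      parseEnvNumber(env["MAX_COST_PER_HOUR"]) ??
      localFile.maxCostPerHour ??
      globalFile.maxCostPerHour ??
      DEFAULTS.maxCostPerHour,
  };
}

// The env var wins for requests; the local file wins over the global file for cost.
const resolved = resolveConfig(
  { MAX_REQUESTS_PER_HOUR: "20" },
  { maxRequestsPerHour: 10, maxCostPerHour: 5 },
  { maxCostPerHour: 2 },
);
console.log(resolved); // { maxRequestsPerHour: 20, maxCostPerHour: 5 }
```

Using `??` rather than `||` matters here: an explicit 0 at a higher-priority level disables the limit instead of falling through to a lower level.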
Rate Limit Enforcement
The checkRateLimit() function performs two checks against a rolling one-hour window:
graph LR
A[Load Config] --> B[countRecentGenerations]
B --> C{Hourly Request Limit?}
C -->|Exceeded| D[Throw Error with count/limit]
C -->|OK| E{Hourly Cost Limit?}
E -->|Exceeded| F[Throw Error with $spent/$limit]
E -->|OK| G[Continue to API Call]
Source: src/tracker.ts:42-58
Error Messages
When rate limits are exceeded, the system throws descriptive errors:
Request limit reached:
Rate limit reached — 20/20 generations used this hour. To change: set MAX_REQUESTS_PER_HOUR env var.
Cost limit reached:
Cost limit reached — $4.50/$5.00 spent this hour. To change: set MAX_COST_PER_HOUR env var.
Source: src/tracker.ts:48-56
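A rolling-hour check of this kind can be sketched as follows. The `GenerationRecord` and `Limits` types and the `checkLimits` name are hypothetical, and the error wording is simplified rather than copied from the project. The `>=` comparisons are also an assumption: the sample cost message above reports $4.50 against a $5.00 limit, so the real cost check may use a different rule.

```typescript
// Hypothetical in-memory record of one past generation.
interface GenerationRecord {
  timestampMs: number;
  costUsd: number;
}

interface Limits {
  maxRequestsPerHour: number; // 0 disables the check
  maxCostPerHour: number; // 0 disables the check
}

const ONE_HOUR_MS = 60 * 60 * 1000;

// Throws if either hourly limit has been reached within the rolling window ending at nowMs.
function checkLimits(history: GenerationRecord[], limits: Limits, nowMs: number): void {
  // Only generations from the last 60 minutes count toward the limits.
  const recent = history.filter((g) => nowMs - g.timestampMs < ONE_HOUR_MS);

  if (limits.maxRequestsPerHour > 0 && recent.length >= limits.maxRequestsPerHour) {
    throw new Error(
      `Rate limit reached: ${recent.length}/${limits.maxRequestsPerHour} generations used this hour.`,
    );
  }

  const spentUsd = recent.reduce((sum, g) => sum + g.costUsd, 0);
  if (limits.maxCostPerHour > 0 && spentUsd >= limits.maxCostPerHour) {
    throw new Error(
      `Cost limit reached: $${spentUsd.toFixed(2)}/$${limits.maxCostPerHour.toFixed(2)} spent this hour.`,
    );
  }
}
```

Because the window rolls, a request that is rejected now can succeed a few minutes later, once the oldest generation ages out of the hour. No counter needs to be reset.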
Session Tracking
Session Statistics
Multi-turn editing sessions maintain running totals across all generations within that session:
| Stat | Type | Description |
|---|---|---|
| sessionGenerations | number | Count of generations in current session |
| sessionCostCents | number | Cumulative cost in cents for the session |
Source: src/tracker.ts:20-25
SessionStats Interface
Each tool response includes a session object with:
| Field | Type | Description |
|---|---|---|
| sessionId | string | Unique session identifier |
| sessionGenerations | number | Generations in this session |
| sessionCostCents | number | Session cost in cents |
Source: src/tracker.ts:1-20
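As an illustration of how these running totals accumulate across turns, the sketch below folds each generation's cost into a SessionStats value. The field names come from the table above. The `addGeneration` helper is hypothetical and is not the project's recordGeneration function.

```typescript
interface SessionStats {
  sessionId: string;
  sessionGenerations: number;
  sessionCostCents: number;
}

// Returns a new stats object with one more generation and its cost added.
function addGeneration(stats: SessionStats, costCents: number): SessionStats {
  return {
    ...stats,
    sessionGenerations: stats.sessionGenerations + 1,
    sessionCostCents: stats.sessionCostCents + costCents,
  };
}

// Two editing turns in the same session, each costing 4.12 cents (illustrative values).
let stats: SessionStats = {
  sessionId: "session-1711929600000-a1b2c3",
  sessionGenerations: 0,
  sessionCostCents: 0,
};
stats = addGeneration(stats, 4.12);
stats = addGeneration(stats, 4.12);
console.log(stats.sessionGenerations, stats.sessionCostCents.toFixed(2));
```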
Session Management
- Sessions expire after 30 minutes of inactivity
- The sessionId parameter continues editing from prior conversation context
- Model mismatch detection prevents mixing models within a session
Source: CHANGELOG.md
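The three session rules above can be sketched as a single lookup step. Everything in this block is an assumption for illustration: the `Session` shape, the `resumeSession` name, and the choice to throw on an unknown, expired, or mismatched session (the real server may instead start a fresh session or return a structured error).

```typescript
// Hypothetical in-memory session record.
interface Session {
  sessionId: string;
  model: string;
  lastUsedMs: number;
}

const SESSION_TTL_MS = 30 * 60 * 1000; // 30 minutes of inactivity

function resumeSession(
  sessions: Map<string, Session>,
  sessionId: string,
  model: string,
  nowMs: number,
): Session {
  const session = sessions.get(sessionId);
  if (session === undefined) {
    throw new Error(`Unknown session: ${sessionId}`);
  }
  if (nowMs - session.lastUsedMs > SESSION_TTL_MS) {
    // Expired sessions are dropped so their context is not reused.
    sessions.delete(sessionId);
    throw new Error(`Session expired: ${sessionId}`);
  }
  if (session.model !== model) {
    throw new Error(`Model mismatch: session uses ${session.model}, request asked for ${model}`);
  }
  // Any successful use resets the inactivity timer.
  session.lastUsedMs = nowMs;
  return session;
}
```

Checking expiry before the model comparison means a stale session is always reported as expired, even when the caller also changed models.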
Generation Manifest
All generations are logged to generations.jsonl in the output directory for auditing and analytics:
{"timestamp":"2026-04-01T12:00:00.000Z","model":"gemini-2.5-flash-image","prompt":"A modern dashboard UI","aspectRatio":"16:9","resolution":"2K","filename":"dashboard-hero","costCents":4.12,"tokens":1295}
Source: src/tracker.ts:60-65
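Because the manifest is JSON Lines (one JSON object per line), it is straightforward to write and to analyze. The sketch below is hypothetical and works on strings rather than the file system: `toManifestLine` and `totalCostCents` are illustrative names, and `ManifestEntry` lists only a subset of the fields shown in the sample record.

```typescript
// Subset of the fields in the sample manifest record above.
interface ManifestEntry {
  timestamp: string;
  model: string;
  prompt: string;
  costCents: number;
  tokens: number;
}

// Serializes one entry as a single newline-terminated line, ready to be appended to the file.
function toManifestLine(entry: ManifestEntry): string {
  return JSON.stringify(entry) + "\n";
}

// Sums costCents over the full contents of a generations.jsonl file, skipping blank lines.
function totalCostCents(jsonl: string): number {
  return jsonl
    .split("\n")
    .filter((line) => line.trim() !== "")
    .map((line) => JSON.parse(line) as ManifestEntry)
    .reduce((sum, entry) => sum + entry.costCents, 0);
}

// Illustrative records, not real log output.
const manifest =
  toManifestLine({
    timestamp: "2026-04-01T12:00:00.000Z",
    model: "gemini-2.5-flash-image",
    prompt: "A modern dashboard UI",
    costCents: 4.12,
    tokens: 1295,
  }) +
  toManifestLine({
    timestamp: "2026-04-01T12:05:00.000Z",
    model: "gemini-2.5-flash-image",
    prompt: "Same dashboard, dark theme",
    costCents: 1.5,
    tokens: 480,
  });

console.log(totalCostCents(manifest).toFixed(2)); // prints "5.62"
```

An append-only line format suits this use: each generation adds one line without rewriting earlier records, and a partially written final line affects only itself.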
Tool Response Structure
Every generate_image response includes complete cost and tracking information:
{
"imagePath": "/home/user/gemini-images/hero-banner.png",
"mimeType": "image/png",
"model": "gemini-2.5-flash-image",
"sessionId": "session-1711929600000-a1b2c3",
"sessionTurn": 1,
"usage": {
"promptTokens": 5,
"outputTokens": 1295,
"imageTokens": 1290,
"thinkingTokens": 412,
"totalTokens": 1295,
"estimatedCost": "$0.0412",
"pricingVerifiedDate": "2026-04-01"
},
"session": {
"sessionId": "session-1711929600000-a1b2c3",
"sessionGenerations": 1,
"sessionCostCents": 4.12
}
}
Recommended Settings
For agentic workflows with iterative image refinement:
| Setting | Value | Rationale |
|---|---|---|
| MAX_REQUESTS_PER_HOUR | 20 | Prevents runaway loops |
| MAX_COST_PER_HOUR | 5.00 | Caps hourly spend at $5 |
Source: README.md
Testing
The pricing and tracking modules each have a dedicated test file:
| Test File | Coverage |
|---|---|
| src/pricing.test.ts | Cost calculation, unknown models, missing metadata, pricing table verification |
| src/tracker.test.ts | Rate limit enforcement, session tracking, manifest appending |
Source: src/pricing.test.ts:1-50
Summary
The cost tracking and rate limiting system provides transparency and control over API usage through:
- Per-generation pricing with detailed token breakdowns across input, text output, image output, and thinking tokens
- Hourly rate limiting on both request count and dollar amount
- Session-aware tracking for multi-turn editing workflows
- Manifest logging for historical analysis and auditing
- Graceful degradation when encountering unknown models
Source: https://github.com/JimothySnicket/gemini-image-mcp / Human Manual
Doramagic Pitfall Log
Source-linked risks stay visible on the manual page so the preview does not read like a recommendation.
Found 6 structured pitfall item(s), including 0 high/blocking item(s). Top priority: Capability evidence risk - Capability evidence risk requires verification.
1. Capability evidence risk: Capability evidence risk requires verification
- Severity: medium
- Finding: README/documentation is current enough for a first validation pass.
- User impact: May increase setup, validation, or first-run risk for the user.
- Recommended check: Reproduce the official install and quickstart path in an isolated environment.
- Evidence: capability.assumptions | mcp_registry:io.github.JimothySnicket/gemini-image:0.2.2 | https://registry.modelcontextprotocol.io/v0.1/servers/io.github.JimothySnicket%2Fgemini-image/versions/0.2.2
2. Maintenance risk: Maintenance risk requires verification
- Severity: medium
- Finding: Project evidence flags a maintenance risk. Review the linked source before relying on this workflow.
- User impact: May increase setup, validation, or first-run risk for the user.
- Recommended check: Reproduce the official install and quickstart path in an isolated environment.
- Evidence: evidence.maintainer_signals | mcp_registry:io.github.JimothySnicket/gemini-image:0.2.2 | https://registry.modelcontextprotocol.io/v0.1/servers/io.github.JimothySnicket%2Fgemini-image/versions/0.2.2
3. Security or permission risk: Security or permission risk requires verification
- Severity: medium
- Finding: no_demo
- User impact: May increase setup, validation, or first-run risk for the user.
- Recommended check: Reproduce the official install and quickstart path in an isolated environment.
- Evidence: downstream_validation.risk_items | mcp_registry:io.github.JimothySnicket/gemini-image:0.2.2 | https://registry.modelcontextprotocol.io/v0.1/servers/io.github.JimothySnicket%2Fgemini-image/versions/0.2.2
4. Security or permission risk: Security or permission risk requires verification
- Severity: medium
- Finding: no_demo
- User impact: May increase setup, validation, or first-run risk for the user.
- Recommended check: Reproduce the official install and quickstart path in an isolated environment.
- Evidence: risks.scoring_risks | mcp_registry:io.github.JimothySnicket/gemini-image:0.2.2 | https://registry.modelcontextprotocol.io/v0.1/servers/io.github.JimothySnicket%2Fgemini-image/versions/0.2.2
5. Maintenance risk: Maintenance risk requires verification
- Severity: low
- Finding: issue_or_pr_quality=unknown.
- User impact: May increase setup, validation, or first-run risk for the user.
- Recommended check: Reproduce the official install and quickstart path in an isolated environment.
- Evidence: evidence.maintainer_signals | mcp_registry:io.github.JimothySnicket/gemini-image:0.2.2 | https://registry.modelcontextprotocol.io/v0.1/servers/io.github.JimothySnicket%2Fgemini-image/versions/0.2.2
6. Maintenance risk: Maintenance risk requires verification
- Severity: low
- Finding: release_recency=unknown.
- User impact: May increase setup, validation, or first-run risk for the user.
- Recommended check: Reproduce the official install and quickstart path in an isolated environment.
- Evidence: evidence.maintainer_signals | mcp_registry:io.github.JimothySnicket/gemini-image:0.2.2 | https://registry.modelcontextprotocol.io/v0.1/servers/io.github.JimothySnicket%2Fgemini-image/versions/0.2.2
Source: Doramagic discovery, validation, and Project Pack records
Community Discussion Evidence
These external discussion links are review inputs, not standalone proof that the project is production-ready.
Open the linked issues or discussions before treating the pack as ready for your environment.
Doramagic exposes project-level community discussion separately from official documentation. Review these links before using gemini-image-mcp with real data or production workflows.
- Capability evidence risk requires verification - GitHub / issue
Source: Project Pack community evidence and pitfall evidence