mcp-memory-service Manual

Doramagic Project Pack · Human Manual

mcp-memory-service

Related topics: System Architecture, Installation and Setup, Quick Start Guide

Overview and Key Concepts

The MCP Memory Service is a semantic memory storage and retrieval system designed for AI-assisted development workflows. It provides persistent memory capabilities for Claude Code and other MCP-compatible clients, enabling intelligent context retention across development sessions.

What is MCP Memory Service?

MCP Memory Service solves the context loss problem in AI-assisted development by maintaining a persistent, searchable store of development decisions, architectural choices, and project knowledge. When you work on a project, the service captures important decisions and makes them available in future sessions.

Sources: README.md

Architecture Overview

The system follows a layered architecture with clear separation between storage, API, and client integration layers.

graph TD
    A[Claude Code / MCP Clients] -->|MCP Protocol| B[MCP Server Layer]
    B --> C[REST API Layer]
    C --> D[Service Layer]
    D --> E[Storage Backend]
    
    E -->|SQLite-vec| F[Local Vector Storage]
    E -->|Cloudflare| G[D1 + Vectorize]
    E -->|Hybrid| H[Combined Approach]
    
    I[Web Dashboard] -->|HTTP| C
    J[Claude Hooks] -->|Session Events| B

Supported Storage Backends

Backend	Description	Use Case
SQLite-vec	Local vector storage with SQLite	Single-machine deployments
Cloudflare D1 + Vectorize	Cloud-hosted serverless	Multi-device, global access
Hybrid	Combined local and cloud	Redundancy and performance

Sources: src/mcp_memory_service/api/__init__.py

Core Components

1. MCP Server

The MCP Server implements the Model Context Protocol, providing tools and prompts for memory operations. It handles tool execution, prompt management, and bidirectional communication with MCP clients.

Key Tools:

store_memory - Store a new memory with automatic embedding generation
search_memories - Run a semantic similarity search using embeddings
retrieve_memory - Retrieve a specific memory by content hash
delete_memory - Remove a memory and its associated embeddings
list_memories - List all memories with pagination

Key Prompts:

knowledge_retrieval - Structured memory retrieval with relevance ranking
memory_summary - Generate summaries of stored memories
knowledge_export - Export memories in various formats
memory_cleanup - Identify and remove duplicates

Sources: src/mcp_memory_service/server_impl.py

2. REST API Layer

The REST API provides HTTP endpoints for direct access to memory operations, useful for integrations and the web dashboard.

graph LR
    A[Memory Management] -->|POST /api/memories| B[Store]
    A -->|GET /api/memories| C[List]
    A -->|GET /api/memories/{hash}| D[Retrieve]
    A -->|DELETE /api/memories/{hash}| E[Delete]
    
    F[Search Operations] -->|POST /api/search| G[Semantic Search]
    F -->|GET /api/search/similar/{hash}| H[Similar Search]
    
    I[Real-time Events] -->|GET /api/events| J[SSE Stream]
    I -->|GET /api/events/stats| K[Statistics]

Endpoint	Method	Description
`/api/memories`	POST	Store a new memory with automatic embedding generation
`/api/memories`	GET	List all memories with pagination support
`/api/memories/{hash}`	GET	Retrieve a specific memory by content hash
`/api/memories/{hash}`	DELETE	Delete a memory and its embeddings
`/api/search`	POST	Semantic similarity search using embeddings
`/api/search/similar/{hash}`	GET	Find memories similar to a specific one
`/api/events`	GET	Subscribe to real-time memory events stream

Sources: src/mcp_memory_service/web/app.py

3. Web Dashboard

The built-in web interface provides:

Interactive API documentation (Swagger UI and ReDoc)
Real-time statistics display
SSE testing interface
Health monitoring

Sources: src/mcp_memory_service/web/app.py

Key Features

Semantic Search with Embeddings

The service uses the all-MiniLM-L6-v2 embedding model to convert memory content into vector representations. This enables semantic similarity search that understands meaning rather than just keyword matching.
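
To make the idea concrete, the sketch below ranks stored vectors by cosine similarity to a query vector. It is illustrative only: the three-dimensional vectors are made up (all-MiniLM-L6-v2 produces 384-dimensional embeddings), and `cosine_similarity` and `rank_by_similarity` are hypothetical helper names, not functions from the service.

```python
import math


def cosine_similarity(a, b):
    """Return the cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0  # a zero vector has no direction, so treat it as unrelated
    return dot / (norm_a * norm_b)


def rank_by_similarity(query_vec, stored, top_k=5):
    """Return up to top_k (key, score) pairs, most similar first."""
    scored = [(cosine_similarity(query_vec, vec), key) for key, vec in stored.items()]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [(key, score) for score, key in scored[:top_k]]


# Toy stand-ins for real embeddings (hypothetical values).
stored_vectors = {
    "use sqlite-vec for local storage": [0.9, 0.1, 0.0],
    "switch CI to GitHub Actions": [0.0, 0.2, 0.9],
}
query = [0.8, 0.2, 0.1]  # pretend embedding of "which database did we pick?"
results = rank_by_similarity(query, stored_vectors, top_k=1)
```

Because the score depends on vector direction rather than shared words, a query can match a memory that uses entirely different vocabulary, which is the behaviour the paragraph above describes.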

Performance Characteristics:

Metric	Value
First call latency	~50ms (includes storage initialization)
Subsequent calls	~5-10ms (connection reused)
Memory overhead	<10MB
Cost at $0.15/1M tokens	$16.43/year per 10-user deployment

Sources: src/mcp_memory_service/api/__init__.py

Response Size Limiter

To prevent context window overflow in LLM clients, the service includes a response limiter that truncates large responses at memory boundaries. This ensures that large memory retrieval operations don't crash Claude or other LLM clients.

Configuration:

Environment Variable	Default	Description
`MCP_MAX_RESPONSE_CHARS`	0 (unlimited)	Maximum characters in responses
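
One way to truncate at memory boundaries is sketched below. This is an assumption about the general approach, not the code in response_limiter.py; `limit_response` and its `separator` parameter are hypothetical names chosen for illustration.

```python
def limit_response(memories, max_chars=0, separator="\n\n"):
    """Keep whole memories until adding the next one would exceed max_chars.

    Returns (kept_memories, truncated). max_chars <= 0 means unlimited,
    mirroring the documented default of MCP_MAX_RESPONSE_CHARS=0.
    """
    if max_chars <= 0:
        return list(memories), False
    kept = []
    used = 0
    for text in memories:
        # Every memory after the first also pays for the separator before it.
        cost = len(text) + (len(separator) if kept else 0)
        if used + cost > max_chars:
            return kept, True  # stop at a memory boundary, never mid-memory
        kept.append(text)
        used += cost
    return kept, False


kept, truncated = limit_response(["aaaa", "bbbb", "cccc"], max_chars=10, separator="\n")
```

Here the first two memories fit within ten characters (including one separator), so the third is dropped whole and `truncated` is set so the caller can report that results were cut.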

Sources: src/mcp_memory_service/server/utils/response_limiter.py

Claude Code Integration

The service provides deep integration with Claude Code through hooks and slash commands.

Available Commands:

Command	Purpose
`/memory-store`	Store important decisions and context
`/memory-recall`	Retrieve memories using natural language
`/memory-search`	Search by tags and content keywords
`/memory-context`	Capture current session context
`/memory-health`	Check service health and statistics

Sources: claude_commands/README.md

Automatic Hooks:

Session Start: Load relevant project memories when Claude Code starts
Session End: Store insights and decisions from completed sessions
Memory Retrieval: On-demand memory access during conversations
Permission Requests: Automated handling of MCP permission requests

Sources: claude-hooks/README.md

Memory Types Taxonomy

The service organizes memories into a standardized taxonomy:

Category	Types
Content Types	note, reference, document, guide
Activity Types	session, implementation, analysis

This standardized taxonomy helps with memory organization and retrieval. The maintenance scripts can consolidate fragmented types into this standardized set.

Sources: scripts/maintenance/README.md

Data Model

Each memory in the system has the following structure:

{
  "content": "Memory content here",
  "content_hash": "sha256hash",
  "tags": ["tag1", "tag2"],
  "created_at": 1673545200.0,
  "updated_at": 1673545200.0,
  "memory_type": "note",
  "metadata": {},
  "export_source": "machine-name"
}

Field	Type	Description
content	string	The actual memory content
content_hash	string	SHA256 hash of content for deduplication
tags	array	User-defined tags for categorization
created_at	float	Unix timestamp of creation
updated_at	float	Unix timestamp of last update
memory_type	string	Standardized type classification
metadata	object	Additional metadata storage
export_source	string	Source machine identifier
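
The structure above can be modelled as a small dataclass, shown here as a sketch. `MemoryRecord` is a hypothetical name, and the hashing rule (SHA-256 over the UTF-8 bytes of `content` alone) is an assumption based on the field description, not a confirmed detail of the service.

```python
import hashlib
import time
from dataclasses import asdict, dataclass, field


@dataclass
class MemoryRecord:
    content: str
    tags: list = field(default_factory=list)
    memory_type: str = "note"
    metadata: dict = field(default_factory=dict)
    export_source: str = "machine-name"
    created_at: float = field(default_factory=time.time)
    updated_at: float = 0.0
    content_hash: str = ""

    def __post_init__(self):
        if not self.content_hash:
            # Assumption: the hash covers the UTF-8 bytes of `content` only.
            self.content_hash = hashlib.sha256(self.content.encode("utf-8")).hexdigest()
        if not self.updated_at:
            # A freshly created record has never been updated.
            self.updated_at = self.created_at


record = MemoryRecord("Chose sqlite-vec as the default backend", tags=["architecture"])
payload = asdict(record)  # plain dict with the eight fields listed in the table
```

Under this assumption two records with identical content always share a hash regardless of tags or timestamps, which is what makes hash-based deduplication possible.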

Synchronization and Backup

Export/Import

Memories can be exported and imported in JSON format for backup and migration purposes:

Export: ~1000 memories/second
Import: ~500 memories/second with deduplication
File Size: ~1KB per memory

Litestream Integration

For real-time replication, the service supports Litestream configuration, providing continuous backup to object storage.

Sources: src/mcp_memory_service/sync/litestream_config.py

Maintenance Tools

The service includes several maintenance scripts:

Script	Purpose
`check_memory_types.py`	Analyze type distribution and fragmentation
`consolidate_memory_types.py`	Consolidate fragmented types into standardized taxonomy
`export_memories.py`	Export memories to JSON format
`import_memories.py`	Import memories from JSON with deduplication

Sources: scripts/maintenance/README.md

Installation Methods

Python Package Installation

pip install mcp-memory-service

Claude Code Integration

cd claude-hooks
python install_hooks.py --natural-triggers

Sources: claude-hooks/README.md

Summary

MCP Memory Service maintains persistent, searchable memory across AI-assisted development sessions. Its layered architecture supports multiple storage backends, and its Claude Code hooks and commands fit it into existing workflows. Response limiting, content-hash deduplication, and the maintenance tools keep the memory store reliable and organized.

Sources: README.md

Installation and Setup

Overview

The MCP Memory Service provides a comprehensive installation framework supporting multiple deployment scenarios including pip installation, Claude Code integration, OpenCode plugin support, and standalone HTTP server deployment. The setup system is designed to be modular, allowing users to install only the components they need while maintaining cross-platform compatibility across Windows, macOS, and Linux environments.

System Requirements

Prerequisites

The installation system requires several external dependencies that must be present before deployment:

Dependency	Purpose	Platform-Specific Notes
Python 3.10+	Runtime environment	Required on all platforms
Node.js	Hooks execution	Required for Claude Code hooks
`jq`	Status line features	macOS: `brew install jq`; Linux: `sudo apt install jq`; Windows: `choco install jq`
pip	Package management	Included with Python 3.10+

The dependency checking system (src/mcp_memory_service/dependency_check.py) validates all required dependencies during initialization and provides clear guidance if any are missing.

Environment Requirements

The service expects the following paths:

# Standard data directory
~/.local/share/mcp-memory/

# Database file
~/.local/share/mcp-memory/sqlite_vec.db

# Hooks configuration file (Claude Code integration)
~/.claude/hooks/config.json

Installation Methods

Method 1: Pip Installation

The primary installation method uses pip to install the mcp-memory-service package:

pip install mcp-memory-service

The pyproject.toml file defines all package dependencies and metadata for PyPI distribution. This installation provides:

Core memory service library
HTTP server implementation
API endpoints
CLI tools

Sources: pyproject.toml

Method 2: Standalone HTTP Server Installation

For HTTP server deployment, the installation script provides a guided setup process:

# From the repository root
python scripts/installation/install.py

# With specific options
python scripts/installation/install.py --install-claude-commands  # Install Claude commands
python scripts/installation/install.py --skip-claude-commands-prompt  # Skip command prompt

The installer performs the following operations:

Validates system prerequisites
Installs the mcp-memory-service package
Creates required directories
Configures environment variables
Sets up systemd services (on Linux)
Optionally installs Claude Code commands

Sources: scripts/installation/install.py

Method 3: Claude Code Hooks Installation

Claude Code hooks provide automatic memory awareness and context injection. The unified installer supports multiple installation modes:

cd claude-hooks

# Install Natural Memory Triggers (recommended)
python install_hooks.py --natural-triggers

# OR install basic memory awareness hooks
python install_hooks.py --basic

The hooks system consists of several components:

Component	File	Purpose
Session Start Hook	`session-start.js`	Loads relevant memories when Claude Code starts
Session End Hook	`session-end.js`	Stores session insights and decisions
Memory Retrieval	`memory-retrieval.js`	On-demand memory retrieval
Permission Request	`permission-request.js`	MCP server permission automation

Sources: claude-hooks/README.md

Method 4: Claude Commands Installation

For Claude Code CLI integration, custom commands can be installed:

# Automatic installation during main setup
python scripts/installation/install.py --install-claude-commands

# Manual installation
python scripts/claude_commands_utils.py

# Test prerequisites
python scripts/claude_commands_utils.py --test

# Uninstall
python scripts/claude_commands_utils.py --uninstall

Commands are installed to ~/.claude/commands/ and provide:

/memory-store - Store memories with tags
/memory-recall - Time-based memory retrieval
/memory-search - Tag and content search
/memory-context - Session context integration
/memory-health - Service health check

Sources: claude_commands/README.md

Method 5: OpenCode Plugin Installation

The OpenCode plugin provides memory awareness for the OpenCode editor:

# Install plugin file
mkdir -p ~/.config/opencode/plugins
cp opencode/memory-plugin.js ~/.config/opencode/plugins/

# Install example configuration
cp opencode/memory-plugin.config.example.json ~/.config/opencode/memory-plugin.json

Sources: opencode/README.md

Environment Configuration

Environment Variables

The .env.example file provides configuration templates:

# MCP Memory Service Configuration
MCP_API_KEY=your-api-key-here
MCP_MEMORY_ENDPOINT=https://localhost:8443
MCP_MEMORY_TIMEOUT_MS=30000
MCP_MEMORY_LOAD_TIMEOUT_MS=10000

Configuration Hierarchy

The OpenCode plugin demonstrates the configuration precedence system:

graph TD
    A[Config Options] --> B[Environment Variables]
    A --> C[Default Config File]
    B --> D[OPENCODE_MEMORY_ENDPOINT]
    B --> E[OPENCODE_MEMORY_API_KEY]
    B --> F[OPENCODE_MEMORY_TIMEOUT_MS]
    C --> G[~/.config/opencode/memory-plugin.json]

Configuration order of precedence (highest to lowest):

Explicit plugin options
Environment variables
User config file (~/.config/opencode/memory-plugin.json)
Project-local config (.opencode/memory-plugin.json)
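
The precedence order can be expressed as a layered merge, sketched below in Python for illustration (the plugin itself is a JavaScript file). The setting keys `endpoint`, `apiKey`, and `timeoutMs` and the function name `resolve_config` are assumptions; only the environment variable names and the ordering come from the list above.

```python
# Hypothetical setting keys mapped to the documented environment variables.
ENV_NAMES = {
    "endpoint": "OPENCODE_MEMORY_ENDPOINT",
    "apiKey": "OPENCODE_MEMORY_API_KEY",
    "timeoutMs": "OPENCODE_MEMORY_TIMEOUT_MS",
}


def resolve_config(explicit=None, environ=None, user_config=None, project_config=None):
    """Merge the four sources so that higher-precedence values win."""
    environ = environ or {}
    # Values read from the environment are strings; callers convert as needed.
    env_layer = {key: environ[name] for key, name in ENV_NAMES.items() if name in environ}
    # Apply lowest precedence first so later layers overwrite earlier ones.
    layers = [project_config or {}, user_config or {}, env_layer, explicit or {}]
    resolved = {}
    for layer in layers:
        resolved.update({k: v for k, v in layer.items() if v is not None})
    return resolved


settings = resolve_config(
    explicit={"timeoutMs": 5000},
    environ={"OPENCODE_MEMORY_ENDPOINT": "https://env.example:8443"},
    user_config={"endpoint": "https://user.example:8443", "apiKey": "user-key"},
    project_config={"endpoint": "https://project.example:8443", "timeoutMs": 30000},
)
```

In this example the endpoint comes from the environment, the API key from the user config file, and the timeout from the explicit options, since each is the highest-precedence source that defines that setting.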

Sources: opencode/README.md

HTTP Server Deployment

Service Setup on Linux/macOS

The Litestream configuration system supports streaming SQLite replication for data durability:

# Install Litestream (this archive is the Linux amd64 build)
curl -LsS https://github.com/benbjohnson/litestream/releases/latest/download/litestream-linux-amd64.tar.gz | tar -xzf -
sudo mv litestream /usr/local/bin/

# Generate configuration
python scripts/sync/litestream_config.py

Systemd Service Configuration

On Linux, production deployments use a systemd user service for service management:

# Start the service
systemctl --user start mcp-memory-http.service

# Enable on boot
systemctl --user enable mcp-memory-http.service

# Check status
systemctl --user status mcp-memory-http.service

Sources: src/mcp_memory_service/sync/litestream_config.py

macOS Service Setup

For macOS deployment, a LaunchAgent plist configuration is provided:

<dict>
    <key>Label</key>
    <string>com.mcp-memory-service</string>
    <key>ProgramArguments</key>
    <array>
        <string>/local/bin/litestream</string>
        <string>replicate</string>
        <string>-config</string>
        <string>{config_path}</string>
    </array>
    <key>RunAtLoad</key>
    <true/>
</dict>

Sources: src/mcp_memory_service/sync/litestream_config.py

Data Synchronization

Database Export/Import

For cross-machine synchronization, the sync scripts provide export and import functionality:

# Export memories to JSON
python scripts/sync/export_memories.py --output ./memories_export.json

# Import memories from JSON
python scripts/sync/import_memories.py --input ./memories_export.json

The export format preserves all memory metadata:

{
  "export_metadata": {
    "source_machine": "machine-name",
    "export_timestamp": "2025-08-12T10:30:00Z",
    "total_memories": 450,
    "database_path": "/path/to/sqlite_vec.db"
  },
  "memories": [
    {
      "content": "Memory content",
      "content_hash": "sha256hash",
      "tags": ["tag1", "tag2"],
      "created_at": 1673545200.0,
      "memory_type": "note"
    }
  ]
}

Deduplication is based on content hash during import.

Sources: scripts/sync/README.md

Maintenance Procedures

Database Type Consolidation

Over time, memory types may become fragmented. The consolidation script standardizes the taxonomy:

# Preview changes (safe, read-only)
python scripts/maintenance/consolidate_memory_types.py --dry-run

# Execute consolidation
python scripts/maintenance/consolidate_memory_types.py

# With custom mapping
python scripts/maintenance/consolidate_memory_types.py --config custom_mappings.json

The standard 24-type taxonomy includes the following types, among others:

Category	Types
Content Types	`note`, `reference`, `document`, `guide`
Activity Types	`session`, `implementation`, `analysis`
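
The consolidation step can be pictured as a mapping pass over stored types. The sketch below is an assumption about the general technique, not the logic of consolidate_memory_types.py: `STANDARD_TYPES` holds only the types listed in the table above, and `EXAMPLE_MAPPING`, the `fallback` parameter, and `consolidate_types` are hypothetical.

```python
from collections import Counter

# Only the types shown in the table above; the full taxonomy is larger.
STANDARD_TYPES = {"note", "reference", "document", "guide",
                  "session", "implementation", "analysis"}

# Hypothetical mapping from fragmented spellings to standard types.
EXAMPLE_MAPPING = {"notes": "note", "docs": "document",
                   "howto": "guide", "impl": "implementation"}


def consolidate_types(memories, mapping, fallback="note", dry_run=True):
    """Return a Counter of (old_type, new_type) changes.

    With dry_run=True the memories are left untouched, which corresponds
    to previewing the changes before applying them.
    """
    changes = Counter()
    for memory in memories:
        old = memory.get("memory_type") or ""
        key = old.strip().lower()
        new = key if key in STANDARD_TYPES else mapping.get(key, fallback)
        if new != old:
            changes[(old, new)] += 1
            if not dry_run:
                memory["memory_type"] = new
    return changes


sample = [{"memory_type": "Notes"}, {"memory_type": "docs"},
          {"memory_type": "note"}, {"memory_type": "mystery"}]
preview = consolidate_types(sample, EXAMPLE_MAPPING)  # nothing is modified yet
```

Running the preview first shows exactly which types would be rewritten, so a surprising mapping can be corrected before any data changes.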

Sources: scripts/maintenance/README.md

Verification

After installation, verify the setup with these checks:

# Verify Claude hooks installation
claude --debug hooks

# Run integration tests
cd ~/.claude/hooks && node tests/integration-test.js

# Test API health
curl -k https://your-endpoint:8443/api/health

# Verify database
sqlite3 ~/.local/share/mcp-memory/sqlite_vec.db "SELECT COUNT(*) FROM memories;"

Troubleshooting

Common Issues

Issue	Solution
Hooks not detected	Check `ls ~/.claude/settings.json`; reinstall if missing
JSON parse errors	Update to latest version
Connection failed	Verify endpoint with `curl -k https://your-endpoint:8443/api/health`
Wrong directory	Move `~/.claude-code/hooks/*` to `~/.claude/hooks/`
Missing `jq`	Install per platform instructions in prerequisites

Debug Mode

Enable detailed logging for troubleshooting:

# Claude Code debug mode
claude --debug hooks

# Test individual hooks
node ~/.claude/hooks/core/session-start.js

PyPI Placeholder Packages

For users who may install the wrong package name, placeholder packages redirect to the correct installation:

# These packages emit deprecation warnings and redirect to mcp-memory-service
pip install mcp-memory
pip install memory-service
pip install agent-mem

Sources: tools/pypi-placeholders/README.md

Architecture Overview

graph TD
    subgraph "Installation Methods"
        A[Pip Installation] --> E[Core Service]
        B[Claude Hooks] --> F[Memory Awareness]
        C[Claude Commands] --> G[CLI Integration]
        D[OpenCode Plugin] --> H[Editor Integration]
    end
    
    subgraph "Runtime Components"
        E --> I[HTTP Server]
        I --> J[API Endpoints]
        J --> K[Memory Storage]
        K --> L[sqlite_vec]
    end
    
    subgraph "Data Layer"
        L --> M[Local Storage]
        L --> N[Litestream Sync]
    end

Sources: pyproject.toml

Quick Start Guide

Related topics: Overview and Key Concepts, REST API Reference, Agent Framework Integration

Overview

The MCP Memory Service Quick Start Guide provides developers and users with a streamlined path to deploy and begin using the persistent semantic memory system for AI agents. This guide covers installation methods, initial configuration, core functionality verification, and essential commands for day-to-day operations.

The service serves as a centralized memory backend that stores, retrieves, and manages semantic memories with automatic embedding generation. It supports multiple deployment configurations including local SQLite-vec storage, Cloudflare D1 + Vectorize for cloud deployments, and hybrid configurations for distributed workflows across multiple machines.

Prerequisites

Before beginning the installation, ensure your development environment meets the following requirements:

Requirement	Minimum Version	Notes
Python	3.10+	Required for core service
Node.js	16+	Required for hooks execution
SQLite	3.35+	Bundled with Python
jq	1.6+	Required for statusLine feature
Claude Code CLI	Latest	Optional, for Claude command integration

Installing jq (Required for StatusLine Feature)

# macOS
brew install jq

# Linux (Ubuntu/Debian)
sudo apt install jq

# Windows
choco install jq

Installation Methods

Automatic Installation (Recommended)

The recommended approach uses the unified installer which handles all components:

cd claude-hooks
python install_hooks.py

The installer supports multiple installation modes:

Mode	Command	Description
Full Installation	`python install_hooks.py`	Installs all features including basic hooks and natural triggers
Basic Only	`python install_hooks.py --basic`	Installs memory hooks only
Natural Triggers	`python install_hooks.py --natural-triggers`	Installs natural memory triggers only

MCP Memory Service Installation

For the core memory service with HTTP server:

# Install with commands (will prompt if Claude Code CLI is detected)
python scripts/installation/install.py

# Force install commands
python scripts/installation/install.py --install-claude-commands

# Skip command installation prompt
python scripts/installation/install.py --skip-claude-commands-prompt

Initial Configuration

Claude Desktop Configuration

To integrate MCP Memory Service with Claude Desktop, create or modify the Claude Desktop configuration file. The configuration path varies by operating system:

macOS: ~/Library/Application Support/Claude/claude_desktop_config.json
Linux: ~/.config/Claude/claude_desktop_config.json
Windows: %APPDATA%\Claude\claude_desktop_config.json

Example configuration structure:

{
  "mcpServers": {
    "mcp-memory-service": {
      "command": "python",
      "args": [
        "-m",
        "mcp_memory_service",
        "server"
      ],
      "env": {
        "MCP_MEMORY_DB_PATH": "~/.local/share/mcp-memory/sqlite_vec.db"
      }
    }
  }
}
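
If the configuration file already lists other servers, the entry can be merged in rather than pasted by hand. The following is a minimal sketch: `add_memory_server` is a hypothetical helper, and expanding `~` up front rests on the assumption that the client passes `env` values to the process verbatim, without shell expansion.

```python
import copy
import json
import os


def add_memory_server(config, db_path="~/.local/share/mcp-memory/sqlite_vec.db"):
    """Return a copy of `config` with the mcp-memory-service entry added."""
    updated = copy.deepcopy(config)  # leave the caller's dict untouched
    servers = updated.setdefault("mcpServers", {})
    servers["mcp-memory-service"] = {
        "command": "python",
        "args": ["-m", "mcp_memory_service", "server"],
        # Assumption: env values are not shell-expanded, so resolve "~" here.
        "env": {"MCP_MEMORY_DB_PATH": os.path.expanduser(db_path)},
    }
    return updated


existing = {"mcpServers": {"other-server": {"command": "node", "args": ["server.js"]}}}
merged = add_memory_server(existing)
config_text = json.dumps(merged, indent=2)  # text to save as claude_desktop_config.json
```

The helper only builds the JSON text; writing it to the platform-specific path is left to the caller so that an existing file can be backed up first.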

Environment Variables

The service recognizes several environment variables for configuration:

Variable	Default	Description
`MCP_MEMORY_DB_PATH`	`~/.local/share/mcp-memory/sqlite_vec.db`	Database storage path
`MCP_MEMORY_BACKEND`	`sqlite_vec`	Storage backend type
`MCP_MEMORY_PORT`	`8443`	HTTP server port
`MCP_MEMORY_API_KEY`	None	Optional API key authentication

Backend Configuration Options

The service supports multiple backend configurations:

Backend	Use Case	Configuration Value
SQLite-vec	Local development, single machine	`sqlite_vec` or `sqlite-vec`
Cloudflare D1	Cloud deployment	`cloudflare`
Hybrid	Multi-machine sync	`hybrid`

Sources: src/mcp_memory_service/web/app.py

Starting the Service

HTTP Server Mode

Start the HTTP server for REST API access:

python -m mcp_memory_service server

The server provides an interactive dashboard at the root URL (/) displaying:

Total memory count
Active embedding model
Server health status
Response time metrics

MCP Server Mode

Start as an MCP server for Claude integration:

python -m mcp_memory_service mcp

Service Health Verification

Verify the installation by checking service health:

curl -k https://localhost:8443/api/health

Expected response includes:

backend: Current storage backend (e.g., sqlite_vec)
count: Total number of stored memories
status: Service health status
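
A script that polls the health endpoint can check for these three fields before trusting the response. The sketch below works on an already-decoded JSON object; `summarize_health` is a hypothetical helper and the sample values, including the status string `healthy`, are made up.

```python
REQUIRED_HEALTH_FIELDS = ("backend", "count", "status")


def summarize_health(payload):
    """Validate a decoded health response and return a one-line summary."""
    missing = [name for name in REQUIRED_HEALTH_FIELDS if name not in payload]
    if missing:
        raise ValueError(f"health response is missing fields: {', '.join(missing)}")
    return f"{payload['status']}: {payload['count']} memories on {payload['backend']}"


# Made-up sample; a real response may carry additional fields.
sample_health = {"status": "healthy", "backend": "sqlite_vec", "count": 450}
summary = summarize_health(sample_health)
```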

Core API Endpoints

The service exposes RESTful endpoints for memory operations:

Sources: src/mcp_memory_service/web/app.py

Memory Management Endpoints

Method	Endpoint	Description
POST	`/api/memories`	Store a new memory with automatic embedding
GET	`/api/memories`	List all memories with pagination
GET	`/api/memories/{hash}`	Retrieve specific memory by content hash
DELETE	`/api/memories/{hash}`	Delete a memory and its embeddings

Search Operations

Method	Endpoint	Description
POST	`/api/search`	Semantic similarity search using embeddings
GET	`/api/search/tags/{tag}`	Find memories by specific tag
GET	`/api/search/similar/{hash}`	Find memories similar to a specific one

Real-time Events

Method	Endpoint	Description
GET	`/api/events`	Subscribe to real-time memory event stream (SSE)
GET	`/api/events/stats`	View SSE connection statistics

API Documentation

Interactive API documentation is available at:

Swagger UI: /api/docs
ReDoc: /api/redoc

Claude Commands Integration

After installation, several slash commands become available within Claude Code:

Sources: claude_commands/README.md

Available Commands

/memory-store - Store New Memories

Store a new memory with automatic content hashing and embedding generation.

claude /memory-store "Implemented OAuth 2.1 authentication for improved security"
claude /memory-store "Refactored storage backend to support sqlite-vec for performance"

/memory-recall - Time-based Memory Retrieval

Retrieve memories using natural language time expressions.

claude /memory-recall "what did we decide about the database last week?"
claude /memory-recall "yesterday's architectural discussions"

/memory-search - Tag and Content Search

Search through stored memories using tags, content keywords, and semantic similarity.

claude /memory-search --tags "architecture,database"
claude /memory-search "SQLite performance optimization"

/memory-context - Session Context Integration

Capture the current conversation and project context as a memory.

claude /memory-context
claude /memory-context --summary "Architecture planning session"

/memory-health - Service Health Check

Check the health and status of the MCP Memory Service.

claude /memory-health
claude /memory-health --detailed

Claude Hooks Configuration

The hooks system provides automatic memory capture during Claude Code sessions:

Sources: claude-hooks/README.md

Hook Types

Hook	Trigger	Action
`session-start`	Claude Code launch	Load relevant project memories
`session-end`	Claude Code exit	Store insights and decisions
`auto-capture`	Automatic capture trigger	Capture important context
`mid-conversation`	Periodic intervals	Summarize and store progress

Configuration File

Edit ~/.claude/hooks/config.json to customize hook behavior:

{
  "memoryService": {
    "endpoint": "https://your-server:8443",
    "apiKey": "optional-api-key"
  },
  "hooks": {
    "verbose": true,
    "showMemoryDetails": false,
    "cleanMode": false,
    "autoCapture": true,
    "forceRemember": "#remember",
    "forceSkip": "#skip",
    "applyTo": ["auto-capture", "session-start", "mid-conversation", "session-end"]
  }
}

Verbosity Levels

Level	Settings	Output
Normal	`verbose: true`, others `false`	Essential information only
Detailed	`showMemoryDetails: true`	Includes memory scoring details
Clean	`cleanMode: true`	Minimal output, success/error only
Silent	`verbose: false`	Background operation only
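
The table can be read as a small decision procedure over the three flags in the `hooks` block of config.json. The sketch below encodes one plausible reading; the order in which the flags are checked and the default of `verbose` when it is absent are assumptions, and `verbosity_level` is a hypothetical name.

```python
def verbosity_level(hooks):
    """Map the `hooks` settings from config.json to a verbosity level name."""
    if not hooks.get("verbose", True):  # assumption: verbose defaults to true
        return "Silent"
    if hooks.get("cleanMode", False):
        return "Clean"
    if hooks.get("showMemoryDetails", False):
        return "Detailed"
    return "Normal"


# The flag values from the example configuration file above.
example_hooks = {"verbose": True, "showMemoryDetails": False, "cleanMode": False}
level = verbosity_level(example_hooks)
```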

Multi-Machine Synchronization

For users working across multiple machines, the service supports database synchronization:

Sources: scripts/sync/README.md

Export/Import Workflow

graph TD
    A[Machine A] -->|export| B[JSON File]
    B -->|import| C[Machine B]
    B -->|import| D[Machine C]
    
    E[Central Database] -->|replicate| A
    E -->|replicate| C

Export Command

python scripts/sync/export_memories.py \
    --source-db ~/.local/share/mcp-memory/sqlite_vec.db \
    --output ~/memories-export.json \
    --machine-name "workstation-1"

Import Command

python scripts/sync/import_memories.py \
    --input ~/memories-export.json \
    --target-db ~/.local/share/mcp-memory/sqlite_vec.db

Deduplication

Memories are deduplicated based on content hash during import:

Same content hash: Treated as duplicate, skipped
Different content hash: Imported as unique memory
Original timestamps: Preserved from source
Source machine tags: Added automatically for tracking
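
These rules can be sketched as a single pass over the export file. The code below illustrates the technique and is not the logic of import_memories.py: the in-memory `store` dict stands in for the database, and the `source:<machine>` tag format is a hypothetical choice.

```python
def import_memories(store, export):
    """Merge exported memories into `store`, a dict keyed by content_hash.

    Returns (imported, skipped) counts.
    """
    source = export["export_metadata"]["source_machine"]
    source_tag = f"source:{source}"  # hypothetical tag format
    imported = skipped = 0
    for memory in export["memories"]:
        digest = memory["content_hash"]
        if digest in store:
            skipped += 1  # same content hash: treat as a duplicate
            continue
        record = dict(memory)  # copy, so source timestamps are kept as-is
        record["tags"] = list(record.get("tags", []))
        if source_tag not in record["tags"]:
            record["tags"].append(source_tag)
        store[digest] = record
        imported += 1
    return imported, skipped


local_store = {"abc": {"content": "Already here", "content_hash": "abc", "tags": []}}
export_file = {
    "export_metadata": {"source_machine": "workstation-1"},
    "memories": [
        {"content": "Already here", "content_hash": "abc", "tags": [], "created_at": 1.0},
        {"content": "New decision", "content_hash": "def", "tags": ["db"], "created_at": 2.0},
    ],
}
counts = import_memories(local_store, export_file)
```

Because the check is by hash alone, importing the same file twice is harmless: the second run skips every memory.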

Litestream Configuration

For continuous database replication, configure Litestream:

dbs:
  - path: ~/.local/share/mcp-memory/sqlite_vec.db
    replicas:
      - url: s3://your-bucket/memory-db
        sync-interval: 1s

Verification

Verify Hook Installation

claude --debug hooks

Expected output: Found 1 hook matchers in settings

Run Integration Tests

cd ~/.claude/hooks
node tests/integration-test.js

Expected: All 14 integration tests pass

Test API Endpoints

# Health check (-k skips TLS certificate verification, as in the earlier health check)
curl -sk https://localhost:8443/api/health | jq .

# Store a test memory
curl -k -X POST https://localhost:8443/api/memories \
  -H "Content-Type: application/json" \
  -d '{"content": "Test memory from quick start", "tags": ["test"]}'

# Search memories
curl -k -X POST https://localhost:8443/api/search \
  -H "Content-Type: application/json" \
  -d '{"query": "test", "limit": 5}'
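
The same checks can be scripted in Python with the standard library, as sketched below. `BASE_URL`, `build_request`, and `send` are hypothetical names; the paths and JSON bodies mirror the curl calls above, and no authentication header is sent, so a server configured with an API key would need one added.

```python
import json
import ssl
import urllib.request

BASE_URL = "https://localhost:8443"  # assumption: default local endpoint


def build_request(method, path, payload=None):
    """Build (but do not send) a request against the memory service."""
    data = json.dumps(payload).encode("utf-8") if payload is not None else None
    headers = {"Content-Type": "application/json"} if data is not None else {}
    return urllib.request.Request(BASE_URL + path, data=data, headers=headers, method=method)


def send(request, verify_tls=True):
    """Send a request and decode the JSON response body."""
    context = ssl.create_default_context()
    if not verify_tls:
        # Equivalent to curl's -k flag; only sensible for a local test server.
        context.check_hostname = False
        context.verify_mode = ssl.CERT_NONE
    with urllib.request.urlopen(request, context=context, timeout=30) as response:
        return json.loads(response.read().decode("utf-8"))


health_request = build_request("GET", "/api/health")
store_request = build_request(
    "POST", "/api/memories",
    {"content": "Test memory from quick start", "tags": ["test"]},
)
search_request = build_request("POST", "/api/search", {"query": "test", "limit": 5})
```

With the service running, `send(health_request, verify_tls=False)` performs the health check; building and sending are kept separate so the requests can be inspected without a live server.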

Troubleshooting

Common Issues

Issue	Solution
Hooks not detected	Verify `~/.claude/settings.json` exists; reinstall if missing
JSON parse errors	Update to the latest version, which includes Python dict conversion
Connection failed	Check `curl -k https://your-endpoint:8443/api/health`
Wrong directory	Move `~/.claude-code/hooks/*` to `~/.claude/hooks/`

Debug Mode

Enable verbose debugging:

# Claude hooks debug
claude --debug hooks

# Individual hook testing
node ~/.claude/hooks/core/session-start.js

Windows-Specific Considerations

Directory Structure: Hooks install to %USERPROFILE%\.claude\hooks\
JSON Path Format: Use forward slashes or escaped (doubled) backslashes in JSON paths
Python Executable: Ensure Python is in PATH

Performance Characteristics

The service is optimized for low-latency operations:

Operation	First Call	Subsequent Calls
Search	~50ms	~5-10ms
Store	~50ms	~5-10ms
Health Check	~5ms	~1ms

Metric	Value
Memory Overhead	<10MB
Cost per 10-user deployment	~$16.43/year

Sources: src/mcp_memory_service/api/__init__.py

Next Steps

After completing this quick start guide, consider exploring:

API Reference: Full endpoint documentation at /api/docs
Maintenance Scripts: Database cleanup and type consolidation tools in scripts/maintenance/
Advanced Configuration: Backend switching and hybrid deployment options
Claude Hooks Customization: Fine-tune auto-capture behavior and verbosity levels

Sources: src/mcp_memory_service/web/app.py

System Architecture

Related topics: Storage Backends, Knowledge Graph and Entity Extraction, Memory Consolidation Engine

Overview

The MCP Memory Service is a semantic memory service built on the Model Context Protocol (MCP). It provides storage, retrieval, and search capabilities for AI-assisted workflows by maintaining a persistent vector-based memory store with automatic embedding generation.

Sources: src/mcp_memory_service/api/__init__.py:1-20

The architecture follows a layered design pattern with clear separation between the MCP protocol layer, business logic services, and storage abstraction layer.

High-Level Architecture

graph TD
    subgraph Client_Layer["Client Layer"]
        Claude[Claude Code]
        CLI[CLI Client]
        HTTP[HTTP Client]
        MCP[MCP Client]
    end

    subgraph Interface_Layer["Interface Layer"]
        FastAPI[FastAPI Server]
        MCP_Server[MCP Server]
        CLI_Interface[CLI Interface]
    end

    subgraph Service_Layer["Service Layer"]
        Memory_Service[Memory Service]
        Graph_Service[Graph Service]
        Response_Limiter[Response Limiter]
    end

    subgraph Storage_Layer["Storage Layer"]
        Factory[Storage Factory]
        SQLiteVec[sqlite_vec]
        Litestream[Litestream Sync]
    end

    Client_Layer --> Interface_Layer
    Claude --> MCP_Server
    CLI --> CLI_Interface
    HTTP --> FastAPI

    Interface_Layer --> Service_Layer
    Memory_Service --> Factory
    Graph_Service --> Factory
    Factory --> SQLiteVec
    SQLiteVec --> Litestream

Core Components

MCP Server

The MCP Server (mcp_server.py) implements the Model Context Protocol, allowing AI assistants like Claude Code to interact with the memory service directly through standardized MCP tools.

Key responsibilities:

Expose MCP tools for memory operations
Handle tool invocation requests from MCP clients
Coordinate with the Memory Service for all operations

Sources: src/mcp_memory_service/mcp_server.py:1-50

Server Implementation

server_impl.py is the primary server implementation. It provides the HTTP/REST interface alongside MCP support.

Sources: src/mcp_memory_service/server_impl.py:1-30

Memory Service

The Memory Service (memory_service.py) is the core business logic layer that handles:

Operation	Description
`store()`	Store new memories with automatic embedding generation
`search()`	Semantic similarity search using vector embeddings
`recall()`	Time-based memory retrieval with natural language expressions
`delete()`	Remove memories and their associated embeddings
`list_memories()`	Paginated listing of all memories

Sources: src/mcp_memory_service/services/memory_service.py:1-60

Graph Service

The Graph Service (graph_service.py) manages relationship tracking between memories, enabling complex queries about memory connections and dependencies.

Sources: src/mcp_memory_service/services/graph_service.py:1-40

Storage Architecture

Storage Factory Pattern

The storage layer uses a factory pattern (storage/factory.py) to abstract the underlying database implementation:

graph LR
    Factory[Storage Factory] -->|Creates| SQLiteVec_Storage[SQLite Vec Storage]
    Factory -->|Creates| InMemory_Storage[In-Memory Storage]
    
    SQLiteVec_Storage -->|Uses| SQLite[(SQLite with vec extension)]
    InMemory_Storage -->|Uses| RAM[(In-Memory)]

Sources: src/mcp_memory_service/storage/factory.py:1-30

SQLite Vec Backend

The default storage backend uses sqlite_vec, which provides:

Vector storage: Native support for embedding storage and similarity search
SQLite reliability: ACID transactions, proven durability
Low overhead: <10MB memory footprint
Performance: ~5-10ms per call once the first call has initialized storage

Sources: src/mcp_memory_service/api/__init__.py:15-18

Litestream Synchronization

For production deployments, Litestream provides continuous replication of the SQLite database. It is installed separately on each platform:

Sources: src/mcp_memory_service/sync/litestream_config.py:1-80

Platform	Installation Command
macOS	`brew install benbjohnson/litestream/litestream`
Linux	`curl -LsS https://... | tar -xzf -`
Windows	Manual download from GitHub releases

API Architecture

REST API Endpoints

Sources: src/mcp_memory_service/web/app.py:30-80

#### Memory Management

Method	Endpoint	Description
POST	`/api/memories`	Store a new memory with automatic embedding generation
GET	`/api/memories`	List all memories with pagination support
GET	`/api/memories/{hash}`	Retrieve a specific memory by content hash
DELETE	`/api/memories/{hash}`	Delete a memory and its embeddings

#### Search Operations

Method	Endpoint	Description
POST	`/api/search`	Semantic similarity search using embeddings
GET	`/api/search/similar/{hash}`	Find memories similar to a specific one

#### Real-time Events

Method	Endpoint	Description
GET	`/api/events`	Subscribe to real-time memory events stream (SSE)
GET	`/api/events/stats`	View SSE connection statistics
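
As a rough illustration of how the endpoints above might be called from Python, the sketch below builds (but does not send) requests with the standard library. The base URL, the JSON field names (`content`, `tags`, `query`, `limit`), and the helper names are assumptions made for this example and are not confirmed by the source, so check the service's actual request schema before relying on them.

```python
import json
import urllib.request

BASE_URL = "http://localhost:8000"  # assumed address; adjust to your deployment


def _json_request(method, path, payload=None, base_url=BASE_URL):
    """Build a urllib Request with an optional JSON body (nothing is sent here)."""
    data = json.dumps(payload).encode("utf-8") if payload is not None else None
    headers = {"Content-Type": "application/json"} if data is not None else {}
    return urllib.request.Request(f"{base_url}{path}", data=data, headers=headers, method=method)


def build_store_request(content, tags=None):
    # POST /api/memories; the payload field names are assumed
    return _json_request("POST", "/api/memories", {"content": content, "tags": tags or []})


def build_search_request(query, limit=5):
    # POST /api/search; the payload field names are assumed
    return _json_request("POST", "/api/search", {"query": query, "limit": limit})


def build_delete_request(content_hash):
    # DELETE /api/memories/{hash}
    return _json_request("DELETE", f"/api/memories/{content_hash}")


store_req = build_store_request("Chose SQLite for local storage", tags=["decision"])
print(store_req.get_method(), store_req.full_url)
```

To actually send a request, pass it to `urllib.request.urlopen()` while the service is running.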

Python API

The Python API provides programmatic access with low token overhead:

from mcp_memory_service.api import search, store, health

# Search memories (~20 tokens)
results = search("architecture decisions", limit=5)

# Store memory (~15 tokens)
content_hash = store("New memory", tags=["note", "important"])

# Health check (~5 tokens)
info = health()

Sources: src/mcp_memory_service/api/__init__.py:20-40

Response Management

Response Limiter

The response_limiter.py module prevents context window overflow by truncating responses at memory boundaries.

Sources: src/mcp_memory_service/server/utils/response_limiter.py:1-50

graph TD
    Request[Large Memory Request] --> Check{Under limit?}
    Check -->|Yes| Return_Full[Return all memories]
    Check -->|No| Truncate[Truncate at boundary]
    Truncate --> Add_Header[Add truncation header]
    Add_Header --> Return_Partial[Return partial results]

Configuration

Environment Variable	Default	Description
`MCP_MAX_RESPONSE_CHARS`	`0` (unlimited)	Maximum characters in responses
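
The truncation flow above can be sketched as follows. This is a minimal illustration, not the code in response_limiter.py: the function name `limit_response`, the plain-string representation of memories, and the returned `truncated` flag (standing in for the truncation header) are all assumptions.

```python
def limit_response(memories, max_chars=0):
    """Return (kept_memories, truncated) without ever splitting a memory.

    max_chars == 0 is treated as "unlimited", mirroring the documented default.
    """
    if max_chars <= 0:
        return list(memories), False

    kept, used = [], 0
    for text in memories:
        if used + len(text) > max_chars:
            break  # stop at the memory boundary instead of cutting mid-memory
        kept.append(text)
        used += len(text)
    return kept, len(kept) < len(memories)


memories = ["a" * 40, "b" * 40, "c" * 40]
kept, truncated = limit_response(memories, max_chars=100)
print(len(kept), truncated)
```

Stopping at whole-memory boundaries means a client never receives half of a memory, at the cost of sometimes returning noticeably fewer characters than the limit allows.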

CLI Architecture

Sources: src/mcp_memory_service/cli/main.py:1-50

The CLI provides command-line access to all memory operations:

memory server              # Start HTTP server
memory health              # Check service health
memory logs --lines 30     # Show recent log entries

Compatibility entry points:

memory-server (deprecated, redirects to memory server)

Integration Points

Claude Code Hooks

The system integrates with Claude Code through hooks that provide:

Automatic memory loading on session start
Context injection of relevant memories
Session insight storage on session end

Sources: claude-hooks/README.md:1-30

Claude Commands

Custom slash commands for memory operations:

Command	Purpose
`/memory-save`	Save current conversation as memory
`/memory-recall`	Time-based memory retrieval
`/memory-search`	Tag and content search
`/memory-context`	Session context integration
`/memory-health`	Service health check

Sources: claude_commands/README.md:1-40

Data Flow

sequenceDiagram
    participant Client
    participant API
    participant MemoryService
    participant Storage
    participant Litestream

    Client->>API: Store memory request
    API->>MemoryService: Save operation
    MemoryService->>MemoryService: Generate embedding
    MemoryService->>Storage: Store content + vector
    Storage->>Litestream: Replicate to S3/GCS
    Storage-->>MemoryService: Confirmation
    MemoryService-->>API: Success response
    API-->>Client: Memory hash returned

    Client->>API: Search request
    API->>MemoryService: Query
    MemoryService->>MemoryService: Generate query embedding
    MemoryService->>Storage: Vector similarity search
    Storage-->>MemoryService: Top K results
    MemoryService-->>API: Ranked results
    API-->>Client: Search results
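
The store and search flows in the diagram can be mimicked with a toy in-memory version. Everything in this sketch is a stand-in: `toy_embed` is a letter-frequency vector rather than a real embedding model, `ToyMemoryStore` is a hypothetical class, and the use of SHA-256 for the content hash is an assumption. It omits the API layer and Litestream replication entirely.

```python
import hashlib
import math


def toy_embed(text):
    """Stand-in for a real embedding model: a 26-dim letter-frequency vector."""
    vec = [0.0] * 26
    for ch in text.lower():
        if "a" <= ch <= "z":
            vec[ord(ch) - ord("a")] += 1.0
    return vec


def cosine(a, b):
    """Cosine similarity; returns 0.0 if either vector is all zeros."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


class ToyMemoryStore:
    def __init__(self):
        self._rows = {}  # content_hash -> (content, embedding)

    def store(self, content):
        # Generate the embedding at write time and key the row by content hash.
        content_hash = hashlib.sha256(content.encode("utf-8")).hexdigest()
        self._rows[content_hash] = (content, toy_embed(content))
        return content_hash

    def search(self, query, limit=5):
        # Embed the query, rank stored rows by similarity, return the top K.
        q = toy_embed(query)
        ranked = sorted(self._rows.values(), key=lambda row: cosine(q, row[1]), reverse=True)
        return [content for content, _ in ranked[:limit]]


memory_store = ToyMemoryStore()
first_hash = memory_store.store("database schema migration plan")
memory_store.store("zzz qqq xxx")
top = memory_store.search("database migration", limit=1)
print(top)
```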

Maintenance Operations

Memory Type Consolidation

The system supports consolidating fragmented memory types into a standardized taxonomy:

Sources: scripts/maintenance/README.md:1-50

Standard 24-Type Taxonomy (partial list):

Content Types: note, reference, document, guide
Activity Types: session, implementation, analysis

Sync Export/Import

Sources: scripts/sync/README.md:1-40

Operation	Performance
Export	~1000 memories/second
Import	~500 memories/second with deduplication
File Size	~1KB per memory

Performance Characteristics

Metric	Value
First call latency	~50ms (includes storage initialization)
Subsequent calls	~5-10ms (connection reused)
Memory overhead	<10MB
Annual cost	~$16.43/year per 10-user deployment (at $0.15/1M tokens)

Sources: src/mcp_memory_service/api/__init__.py:12-15
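
To put the annual figure in perspective, the arithmetic below works backwards from the two numbers in the table to an implied token volume. The even split across users and the 365-day year are assumptions of this example, not statements from the source.

```python
annual_cost_usd = 16.43          # from the table above
price_per_million_tokens = 0.15  # from the table above
users = 10                       # from the table above

# cost = tokens / 1e6 * price  =>  tokens = cost / price * 1e6
tokens_per_year = annual_cost_usd / price_per_million_tokens * 1_000_000

# Assumption: usage is spread evenly over users and over 365 days.
tokens_per_user_per_day = tokens_per_year / users / 365

print(f"{tokens_per_year:,.0f} tokens/year")
print(f"{tokens_per_user_per_day:,.0f} tokens/user/day")
```

Under those assumptions the figure corresponds to roughly 110 million tokens per year, or about 30,000 tokens per user per day.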

Architecture Summary

The MCP Memory Service architecture is designed around three principles:

Separation of Concerns: Clear boundaries between protocol handling, business logic, and storage
Multiple Interfaces: Support for MCP, REST, Python API, and CLI access patterns
Production Ready: Built-in replication, response limiting, and maintenance tools

Source: https://github.com/doobidoo/mcp-memory-service / Human Manual

Storage Backends

The mcp-memory-service project implements a pluggable storage backend architecture that enables users to choose between different vector database technologies for storing and searching semantic memories. This abstraction layer decouples the core memory service logic from specific database implementations, providing flexibility in deployment scenarios.

Architecture Overview

The storage system follows a factory pattern with a unified interface. All storage implementations inherit from a common base class that defines the contract for memory operations: store, retrieve, search, and delete.

graph TD
    A[Memory Service] --> B[Storage Factory]
    B --> C[sqlite_vec Storage]
    B --> D[Cloudflare Storage]
    B --> E[Milvus Storage]
    B --> F[Hybrid Storage]
    
    C --> G[(Local SQLite DB)]
    D --> H[(Cloudflare D1 + Vectorize)]
    E --> I[(Milvus Collection)]
    F --> J[(Cloudflare + SQLite)]

Sources: src/mcp_memory_service/storage/factory.py

Supported Storage Backends

Backend	Description	Best For
`sqlite_vec`	Local SQLite database with vec0 extension	Single-user, offline-first, privacy-focused
`cloudflare`	Cloudflare D1 database + Vectorize API	Cloud-native, multi-device sync
`milvus`	Milvus vector database	Enterprise-scale, high-performance
`hybrid`	Cloudflare + SQLite combination	Multi-device with local backup

Sources: docs/guides/STORAGE_BACKENDS.md

Core Interface

All storage backends implement a common interface defined through the base storage class. The interface includes:

Storage Methods

Method	Purpose	Parameters
`store(memory)`	Store a new memory with embedding	Memory object with content, tags, metadata
`retrieve(hash)`	Retrieve memory by content hash	content_hash: str
`search(query, limit)`	Semantic search using embeddings	query: str, limit: int
`search_by_tag(tags, match_all)`	Tag-based filtering	tags: list, match_all: bool
`list_memories(page, page_size)`	Paginated memory listing	page: int, page_size: int
`delete(hash)`	Remove memory and embeddings	content_hash: str

Sources: src/mcp_memory_service/api/operations.py
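
A minimal sketch of what such a common interface could look like is shown below. The names `StorageSketch`, `DictStorage`, and this simplified `Memory` dataclass are hypothetical, the methods are synchronous for brevity, semantic `search` is left out because it needs an embedding model, and both the SHA-256 content hash and 1-based page numbering are assumptions.

```python
import hashlib
from abc import ABC, abstractmethod
from dataclasses import dataclass, field


@dataclass
class Memory:
    content: str
    tags: list = field(default_factory=list)
    metadata: dict = field(default_factory=dict)

    @property
    def content_hash(self):
        # Assumption: the hash is a SHA-256 digest of the content.
        return hashlib.sha256(self.content.encode("utf-8")).hexdigest()


class StorageSketch(ABC):
    """Contract that every backend in this sketch must satisfy."""

    @abstractmethod
    def store(self, memory): ...

    @abstractmethod
    def retrieve(self, content_hash): ...

    @abstractmethod
    def search_by_tag(self, tags, match_all=False): ...

    @abstractmethod
    def list_memories(self, page=1, page_size=10): ...

    @abstractmethod
    def delete(self, content_hash): ...


class DictStorage(StorageSketch):
    """Trivial in-memory backend used only to exercise the interface."""

    def __init__(self):
        self._items = {}

    def store(self, memory):
        self._items[memory.content_hash] = memory
        return memory.content_hash

    def retrieve(self, content_hash):
        return self._items.get(content_hash)

    def search_by_tag(self, tags, match_all=False):
        combine = all if match_all else any
        return [m for m in self._items.values() if combine(t in m.tags for t in tags)]

    def list_memories(self, page=1, page_size=10):
        start = (page - 1) * page_size  # pages are 1-based in this sketch
        return list(self._items.values())[start:start + page_size]

    def delete(self, content_hash):
        return self._items.pop(content_hash, None) is not None


backend = DictStorage()
sqlite_hash = backend.store(Memory("use sqlite", tags=["db", "decision"]))
backend.store(Memory("use react", tags=["ui", "decision"]))
print(len(backend.search_by_tag(["decision"])))
```

Because callers depend only on the abstract methods, swapping one backend for another does not require changes in the service layer.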

SQLite Vec Backend

The sqlite_vec backend is the default and most widely used storage option. It uses the sqlite-vec extension to run vector similarity search directly within SQLite.

Configuration

Environment Variable	Default	Description
`SQLITE_VEC_DB_PATH`	`~/.local/share/mcp-memory/sqlite_vec.db`	Path to SQLite database file
`EMBEDDING_MODEL_NAME`	`sentence-transformers/all-MiniLM-L6-v2`	Embedding model for vectorization

Database Schema

The SQLite backend stores memories in a relational table with the following structure:

CREATE TABLE memories (
    content TEXT NOT NULL,
    content_hash TEXT PRIMARY KEY,
    tags TEXT,
    memory_type TEXT,
    metadata TEXT,
    created_at REAL,
    updated_at REAL,
    created_at_iso TEXT,
    updated_at_iso TEXT
);

CREATE VIRTUAL TABLE memories_embeddings USING vec0(
    content_hash TEXT PRIMARY KEY,
    embedding FLOAT[384]
);

Sources: docs/sqlite-vec-backend.md
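
As an illustration of how a row fits the `memories` table, the sketch below creates that table in an in-memory database with Python's built-in `sqlite3` module and inserts one record. The `vec0` virtual table is skipped because it requires the sqlite-vec extension to be loaded. The comma-separated tag encoding, the JSON-encoded metadata, and the SHA-256 content hash are assumptions of this example.

```python
import hashlib
import json
import sqlite3
import time

conn = sqlite3.connect(":memory:")
conn.execute(
    """
    CREATE TABLE memories (
        content TEXT NOT NULL,
        content_hash TEXT PRIMARY KEY,
        tags TEXT,
        memory_type TEXT,
        metadata TEXT,
        created_at REAL,
        updated_at REAL,
        created_at_iso TEXT,
        updated_at_iso TEXT
    )
    """
)

content = "Adopted sqlite_vec as the default backend"
content_hash = hashlib.sha256(content.encode("utf-8")).hexdigest()  # assumed hashing scheme
now = time.time()
now_iso = time.strftime("%Y-%m-%dT%H:%M:%SZ", time.gmtime(now))

conn.execute(
    "INSERT INTO memories VALUES (?, ?, ?, ?, ?, ?, ?, ?, ?)",
    (
        content,
        content_hash,
        "decision,storage",               # assumed: tags stored as comma-separated text
        "note",
        json.dumps({"source": "example"}),  # assumed: metadata stored as JSON text
        now,
        now,
        now_iso,
        now_iso,
    ),
)

row = conn.execute(
    "SELECT content, tags FROM memories WHERE content_hash = ?", (content_hash,)
).fetchone()
print(row)
```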

Performance Characteristics

Metric	Value
Search latency	5-10ms
Store latency	10-20ms (includes embedding)
Memory overhead	<10MB
Capacity	Limited by disk space

Cloudflare Backend

The Cloudflare backend provides cloud-native storage using Cloudflare's D1 (SQLite-compatible) database and Vectorize (vector search) API.

Configuration

Environment Variable	Required	Description
`CLOUDFLARE_API_TOKEN`	Yes	Cloudflare API authentication token
`CLOUDFLARE_ACCOUNT_ID`	Yes	Cloudflare account identifier
`CLOUDFLARE_VECTORIZE_INDEX`	Yes	Vectorize index name
`CLOUDFLARE_D1_DATABASE_ID`	Yes	D1 database identifier
`CLOUDFLARE_R2_BUCKET`	No	R2 bucket for large content storage
`CLOUDFLARE_EMBEDDING_MODEL`	No	Embedding model override
`CLOUDFLARE_LARGE_CONTENT_THRESHOLD`	No	Size threshold for R2 storage
`CLOUDFLARE_MAX_RETRIES`	No	Retry attempts for API calls
`CLOUDFLARE_BASE_DELAY`	No	Initial retry delay in seconds
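
One way to check these settings before constructing the backend is sketched below. `load_cloudflare_config` is a hypothetical helper, not part of the project. It only enforces the required/optional split from the table and leaves optional values unset rather than guessing defaults.

```python
import os

REQUIRED = (
    "CLOUDFLARE_API_TOKEN",
    "CLOUDFLARE_ACCOUNT_ID",
    "CLOUDFLARE_VECTORIZE_INDEX",
    "CLOUDFLARE_D1_DATABASE_ID",
)
OPTIONAL = (
    "CLOUDFLARE_R2_BUCKET",
    "CLOUDFLARE_EMBEDDING_MODEL",
    "CLOUDFLARE_LARGE_CONTENT_THRESHOLD",
    "CLOUDFLARE_MAX_RETRIES",
    "CLOUDFLARE_BASE_DELAY",
)


def load_cloudflare_config(env=None):
    """Collect Cloudflare settings, failing early if a required one is missing."""
    env = os.environ if env is None else env
    missing = [name for name in REQUIRED if not env.get(name)]
    if missing:
        raise ValueError("Missing required Cloudflare settings: " + ", ".join(missing))
    config = {name: env[name] for name in REQUIRED}
    # Optional settings are included only when present; no defaults are invented.
    config.update({name: env[name] for name in OPTIONAL if env.get(name)})
    return config


example_env = {
    "CLOUDFLARE_API_TOKEN": "token-placeholder",
    "CLOUDFLARE_ACCOUNT_ID": "account-placeholder",
    "CLOUDFLARE_VECTORIZE_INDEX": "index-placeholder",
    "CLOUDFLARE_D1_DATABASE_ID": "db-placeholder",
    "CLOUDFLARE_R2_BUCKET": "bucket-placeholder",
}
config = load_cloudflare_config(example_env)
print(sorted(config))
```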

Initialization

storage = CloudflareStorage(
    api_token=CLOUDFLARE_API_TOKEN,
    account_id=CLOUDFLARE_ACCOUNT_ID,
    vectorize_index=CLOUDFLARE_VECTORIZE_INDEX,
    d1_database_id=CLOUDFLARE_D1_DATABASE_ID,
    r2_bucket=CLOUDFLARE_R2_BUCKET,
    embedding_model=CLOUDFLARE_EMBEDDING_MODEL,
    large_content_threshold=CLOUDFLARE_LARGE_CONTENT_THRESHOLD,
    max_retries=CLOUDFLARE_MAX_RETRIES,
    base_delay=CLOUDFLARE_BASE_DELAY
)

Sources: src/mcp_memory_service/storage/factory.py

Milvus Backend

The Milvus backend targets enterprise deployments requiring high-performance, distributed vector search capabilities.

Configuration

Environment Variable	Required	Description
`MILVUS_URI`	Yes	Milvus server URI
`MILVUS_TOKEN`	No	Authentication token
`MILVUS_COLLECTION_NAME`	No	Collection name (default: `memories`)
`EMBEDDING_MODEL_NAME`	No	Embedding model

Initialization

storage = MilvusMemoryStorage(
    uri=MILVUS_URI,
    token=MILVUS_TOKEN,
    collection_name=MILVUS_COLLECTION_NAME,
    embedding_model=EMBEDDING_MODEL_NAME,
)

Sources: docs/milvus-backend.md

Hybrid Backend

The hybrid backend combines Cloudflare storage with local SQLite-vec backup, enabling multi-device synchronization while maintaining offline capability.

Architecture

graph LR
    A[Local SQLite DB] <--> B[Hybrid Storage]
    B <--> C[Cloudflare D1]
    B <--> D[Cloudflare Vectorize]
    
    E[Write Operations] --> B
    F[Read Operations] --> B

Configuration

The hybrid backend requires both Cloudflare and SQLite configuration:

cloudflare_config = {
    'api_token': CLOUDFLARE_API_TOKEN,
    'account_id': CLOUDFLARE_ACCOUNT_ID,
    'vectorize_index': CLOUDFLARE_VECTORIZE_INDEX,
    'd1_database_id': CLOUDFLARE_D1_DATABASE_ID,
}

storage = HybridMemoryStorage(
    cloudflare_config=cloudflare_config,
    local_db_path=LOCAL_DB_PATH,
)

Storage Factory

The get_storage() function in the factory module handles backend instantiation based on configuration:

import os
from typing import Optional

async def get_storage(backend: Optional[str] = None) -> BaseStorage:
    """Get storage instance based on configured or specified backend."""
    # An explicit argument wins; otherwise fall back to the environment variable.
    backend = backend or os.getenv("MCP_MEMORY_STORAGE_BACKEND", "sqlite_vec")
    
    if backend == "cloudflare":
        return CloudflareStorage(...)
    elif backend == "milvus":
        return MilvusMemoryStorage(...)
    elif backend == "hybrid":
        return HybridMemoryStorage(...)
    else:
        # Any other value falls through to the default backend.
        return SQLiteVecStorage(...)

Sources: src/mcp_memory_service/storage/factory.py

HTTP Client Backend

For distributed deployments, the project supports HTTP-based storage access through an HttpStorage client. This enables communication with remote MCP Memory Service instances.

HTTP API Endpoints

Method	Endpoint	Purpose
POST	`/api/memories`	Store a new memory
GET	`/api/memories`	List memories with pagination
GET	`/api/memories/{hash}`	Retrieve specific memory
DELETE	`/api/memories/{hash}`	Delete memory
POST	`/api/search`	Semantic search
GET	`/api/search/similar/{hash}`	Find similar memories
GET	`/api/events`	SSE event stream

Time Filter Format

When searching by time ranges, the HTTP client converts Unix timestamps to ISO date format:

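def _build_time_filter(time_start: Optional[float], time_end: Optional[float]) -> Optional[str]:
    # Compare against None so that a timestamp of 0 still counts as set.
    if time_start is not None and time_end is not None:
        return f"between {_to_date(time_start)} and {_to_date(time_end)}"
    if time_start is not None:
        return _to_date(time_start)
    if time_end is not None:
        return _to_date(time_end)
    return None  # no bounds given, so no time filter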

Sources: src/mcp_memory_service/storage/http_client.py
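
The date conversion itself is not shown in this manual. The sketch below is one plausible version, assuming dates are formatted as `YYYY-MM-DD` in UTC. `to_date` and `build_time_filter` are illustrative names, not the project's helpers, and the explicit `None` checks are a choice made for this example.

```python
from datetime import datetime, timezone
from typing import Optional


def to_date(timestamp: float) -> str:
    """Format a Unix timestamp as a UTC calendar date (YYYY-MM-DD)."""
    return datetime.fromtimestamp(timestamp, tz=timezone.utc).strftime("%Y-%m-%d")


def build_time_filter(time_start: Optional[float], time_end: Optional[float]) -> Optional[str]:
    """Turn optional start/end timestamps into a natural-language time expression."""
    if time_start is not None and time_end is not None:
        return f"between {to_date(time_start)} and {to_date(time_end)}"
    if time_start is not None:
        return to_date(time_start)
    if time_end is not None:
        return to_date(time_end)
    return None  # neither bound supplied


print(build_time_filter(1700000000, 1700086400))
```

Formatting in UTC keeps the result independent of the machine's local time zone, which matters when the client and the remote service run in different regions.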

Backend Selection

Decision Matrix

Use Case	Recommended Backend
Single machine, privacy-sensitive	`sqlite_vec`
Multi-device with cloud sync	`cloudflare`
Enterprise with high volume	`milvus`
Local backup + cloud sync	`hybrid`

Selecting Backend via CLI

The backend can also be chosen per command with the --storage-backend option:

# Use SQLite backend (default)
memory ingest-document doc.pdf

# Use Cloudflare backend
memory ingest-document doc.pdf --storage-backend cloudflare

# Use hybrid backend
memory ingest-document doc.pdf --storage-backend hybrid

Maintenance Operations

Embedding Migration

To migrate to a different embedding model (handles dimension changes):

python scripts/maintenance/migrate_embeddings.py --url http://localhost:8000 --model new-model --dry-run

Database Repair Scripts

Script	Purpose
`repair_memories.py`	Repair corrupted memory entries
`repair_sqlite_vec_embeddings.py`	Fix embedding inconsistencies
`repair_zero_embeddings.py`	Fix zero/null embeddings
`cleanup_corrupted_encoding.py`	Fix corrupted emoji encoding

Backup Strategy

For the sqlite_vec backend, Litestream provides continuous replication:

# Configure in litestream.yml
dbs:
  - path: ~/.local/share/mcp-memory/sqlite_vec.db
    replicas:
      - url: s3://your-bucket/mcp-memory/

Sources: scripts/sync/litestream/README.md

Response Limiting

All storage backends support response size limiting through the ResponseLimiter utility to prevent oversized responses:

Parameter	Default	Description
`max_chars`	10000	Maximum characters in response
`max_results`	50	Maximum memories to return

The limiter calculates estimated memory sizes including overhead and truncates at memory boundaries to ensure consistent response sizes.

Sources: src/mcp_memory_service/server/utils/response_limiter.py

Sources: src/mcp_memory_service/storage/factory.py

Knowledge Graph and Entity Extraction

Related topics: System Architecture, Memory Consolidation Engine, Quality Scoring System


Overview

The MCP Memory Service includes a Knowledge Graph system that links memories through semantic relationships, extracts entities automatically, and infers relationship types. It turns isolated memory entries into an interconnected network that supports graph traversal, entity-based retrieval, and similarity analysis.

The knowledge graph layer sits above the core memory storage, providing:

Entity Extraction: Automatic identification of people, organizations, locations, and concepts from memory content
Relationship Mapping: Discovery and storage of connections between memories based on semantic similarity and content analysis
Graph Traversal: Navigation of memory relationships using configurable depth and radius parameters
Association Memory: Automatic creation of memory entries that document discovered relationships between existing memories
Relationship Inference: AI-powered inference of relationship types (causes, fixes, contradicts, supports, follows)

Architecture

graph TD
    subgraph "Memory Layer"
        M1[Memory 1]
        M2[Memory 2]
        M3[Memory 3]
    end

    subgraph "Entity Extraction"
        EE[EntityExtractor]
        NER[NER Processing]
        PAT[Pattern Matching]
    end

    subgraph "Graph Storage"
        GS[GraphStorage]
        RI[RelationshipInference]
        AM[AssociationMemory]
    end

    subgraph "Query Interfaces"
        GT[Graph Tools]
        ST[Search Tools]
    end

    M1 --> EE
    M2 --> EE
    M3 --> EE

    EE --> NER
    EE --> PAT

    NER --> GS
    PAT --> GS

    GS --> RI
    RI --> AM
    AM --> GS

    GT --> GS
    ST --> GS

Core Components

Component	File	Purpose
`EntityExtractor`	`src/mcp_memory_service/reasoning/entities.py`	Extracts entities from memory content
`RelationshipInferenceEngine`	`src/mcp_memory_service/reasoning/inference.py`	Infers relationship types between memories
`GraphStorage`	`src/mcp_memory_service/storage/graph.py`	SQLite-based graph storage backend
`MilvusGraphStorage`	`src/mcp_memory_service/storage/milvus_graph.py`	Milvus vector database backend
`MemoryConsolidator`	`src/mcp_memory_service/consolidation/consolidator.py`	Creates and manages association memories
`Association`	`src/mcp_memory_service/models/association.py`	Data model for memory associations

Sources: src/mcp_memory_service/reasoning/entities.py:1-50

Entity Extraction

Overview

Entity extraction automatically identifies and categorizes entities within memory content. The system supports multiple entity types and uses both pattern-based and NER (Named Entity Recognition) approaches.

Sources: src/mcp_memory_service/reasoning/entities.py:1-30

Supported Entity Types

The system recognizes the following entity categories:

Entity Type	Description	Examples
`person`	Human individuals	"John", "Alice Chen"
`organization`	Companies, teams, agencies	"Acme Corp", "Engineering Team"
`location`	Physical places	"San Francisco", "Building A"
`concept`	Abstract ideas and concepts	"machine learning", "agile methodology"
`technology`	Tools, frameworks, languages	"Python", "React", "Docker"
`date`	Temporal references	"December 2024", "Q1"
`project`	Named projects or initiatives	"Project Alpha", "Apollo Initiative"

Extraction Process

graph LR
    A[Memory Content] --> B[Pattern Matching]
    A --> C[NER Processing]
    B --> D[Entity Deduplication]
    C --> D
    D --> E[Entity Metadata]
    E --> F[Store Entity Links]

The extraction process follows these steps:

Pattern Matching: Initial scan for known patterns (email, URL, date formats, capitalization patterns)
NER Processing: Language model-based entity recognition for contextual entity types
Entity Deduplication: Normalization and merging of duplicate entities
Metadata Generation: Creation of entity metadata including confidence scores
Storage: Persisting entity links to the graph storage
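
The pattern-matching, deduplication, and confidence-filtering steps above can be sketched as follows. This is a simplified illustration, not the project's `EntityExtractor`: the NER step is omitted, `KNOWN_TECHNOLOGIES` is a hypothetical vocabulary, and the confidence values are arbitrary placeholders.

```python
import re

DATE_PATTERN = re.compile(r"\b\d{4}-\d{2}-\d{2}\b")
WORD_PATTERN = re.compile(r"[A-Za-z][A-Za-z0-9_+-]*")
KNOWN_TECHNOLOGIES = {"python", "react", "docker", "fastapi"}  # hypothetical vocabulary


def extract_entities(text, min_confidence=0.5):
    """Pattern-based extraction followed by case-insensitive deduplication."""
    candidates = []
    for match in DATE_PATTERN.finditer(text):
        candidates.append({"name": match.group(0), "type": "date", "confidence": 0.9})
    for word in WORD_PATTERN.findall(text):
        if word.lower() in KNOWN_TECHNOLOGIES:
            candidates.append({"name": word, "type": "technology", "confidence": 0.8})

    deduped = {}
    for entity in candidates:
        key = (entity["name"].lower(), entity["type"])  # normalize before merging
        deduped.setdefault(key, entity)  # keep the first spelling seen
    return [e for e in deduped.values() if e["confidence"] >= min_confidence]


entities = extract_entities(
    "Migrated the Python service to Docker on 2024-12-01; python tests pass."
)
print([e["name"] for e in entities])
```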

MCP Tool Interface

Entities can be extracted and stored via the MCP memory_graph tool:

Sources: src/mcp_memory_service/server/handlers/graph.py:50-75

{
  "action": "extract_entities",
  "hash": "abc123def456"
}

Response Example:

{
  "hash": "abc123def456",
  "entities_extracted": 5,
  "entities": [
    {"name": "Python", "type": "technology", "confidence": 0.95},
    {"name": "FastAPI", "type": "technology", "confidence": 0.92}
  ]
}

Knowledge Graph Storage

Storage Backends

The knowledge graph supports multiple storage backends:

Backend	File	Use Case
SQLite Graph	`src/mcp_memory_service/storage/graph.py`	Default, single-node deployments
Milvus Graph	`src/mcp_memory_service/storage/milvus_graph.py`	Large-scale, distributed deployments

Sources: src/mcp_memory_service/storage/graph.py:1-100

Graph Data Model

The graph storage maintains the following data structures:

#### Entities Table

Stores extracted entities linked to memories:

Field	Type	Description
`entity_id`	TEXT	Unique entity identifier
`name`	TEXT	Entity name
`entity_type`	TEXT	Entity category
`memory_hash`	TEXT	Associated memory hash
`confidence`	REAL	Extraction confidence score

#### Relationships Table

Stores relationships between memories:

Field	Type	Description
`relationship_id`	TEXT	Unique relationship identifier
`source_hash`	TEXT	Source memory hash
`target_hash`	TEXT	Target memory hash
`relationship_type`	TEXT	Type (similar, causes, fixes, etc.)
`similarity_score`	REAL	Calculated similarity (0.0-1.0)
`metadata`	JSON	Additional relationship metadata

Sources: src/mcp_memory_service/models/association.py:1-60

Graph Operations

#### Store Entity Link

async def store_entity_link(
    memory_hash: str,
    entity_name: str,
    entity_type: str
) -> bool:
    ...

Links an extracted entity to a memory for future retrieval and graph queries.

Sources: src/mcp_memory_service/storage/graph.py:150-180

#### Get Memory Subgraph

Retrieves a local subgraph centered on a specific memory:

async def get_memory_subgraph(
    memory_hash: str,
    radius: int = 2
) -> Dict[str, Any]:
    ...

Parameters:

memory_hash: Center memory for subgraph traversal
radius: Maximum traversal depth (default: 2)

#### Graph Traversal

async def traverse_graph(
    hash1: str,
    hash2: str,
    max_depth: int = 5
) -> List[Dict[str, Any]]:
    ...

Finds paths between two memories up to a specified depth.

Sources: src/mcp_memory_service/server/handlers/graph.py:20-45
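
The radius-limited subgraph and depth-limited path search described above are both breadth-first traversals. The sketch below illustrates the idea on a plain edge list. It is not the project's implementation: `get_subgraph` and `find_path` are hypothetical synchronous helpers, relationships are treated as undirected, and results are returned as plain hashes rather than the dictionaries the real functions return.

```python
from collections import deque


def _adjacency(edges):
    """Build an undirected adjacency map from (source_hash, target_hash) pairs."""
    neighbors = {}
    for a, b in edges:
        neighbors.setdefault(a, set()).add(b)
        neighbors.setdefault(b, set()).add(a)
    return neighbors


def get_subgraph(edges, center, radius=2):
    """Return the set of hashes within `radius` hops of `center`."""
    neighbors = _adjacency(edges)
    seen = {center}
    frontier = deque([(center, 0)])
    while frontier:
        node, depth = frontier.popleft()
        if depth == radius:
            continue  # do not expand beyond the requested radius
        for nxt in neighbors.get(node, ()):
            if nxt not in seen:
                seen.add(nxt)
                frontier.append((nxt, depth + 1))
    return seen


def find_path(edges, start, goal, max_depth=5):
    """Return the shortest path as a list of hashes, or None if none exists within max_depth hops."""
    neighbors = _adjacency(edges)
    seen = {start}
    queue = deque([[start]])
    while queue:
        path = queue.popleft()
        if path[-1] == goal:
            return path
        if len(path) - 1 >= max_depth:
            continue  # path already uses the maximum number of hops
        for nxt in sorted(neighbors.get(path[-1], ())):
            if nxt not in seen:
                seen.add(nxt)
                queue.append(path + [nxt])
    return None


edges = [("a", "b"), ("b", "c"), ("c", "d")]
print(sorted(get_subgraph(edges, "a", radius=2)))
print(find_path(edges, "a", "d"))
```

Bounding the radius or depth keeps the visited set small even when a few memories are connected to a large share of the graph.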

Relationship Inference

RelationshipInferenceEngine

The relationship inference engine analyzes memory content pairs to determine semantic relationships:

Sources: src/mcp_memory_service/reasoning/inference.py:1-80

Supported Relationship Types

Type	Description	Example
`causes`	Source leads to target	"Changed config" → "System crashed"
`fixes`	Source resolves target	"Applied patch" → "Bug #123"
`contradicts`	Sources conflict	"Use X" vs "Don't use X"
`supports`	Source validates target	"Test results" → "Implementation works"
`follows`	Temporal sequence	"Phase 1 complete" → "Phase 2 started"
`related`	General connection	Topic similarity without specific type

Inference Process

graph TD
    M1[Memory 1] --> C1[Content Analysis]
    M2[Memory 2] --> C2[Content Analysis]
    C1 --> SIM[Similarity Check]
    C2 --> SIM
    SIM --> RT{Relationship Type?}
    RT -->|High Similarity| SA[Same Aspect]
    RT -->|Causal Keywords| CA[Causal Link]
    RT -->|Action Keywords| AC[Action Link]
    RT -->|Negation| CN[Contradiction]
    SA --> ST[Store Relationship]
    CA --> ST
    AC --> ST
    CN --> ST
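
A keyword-driven classifier in the spirit of this flow might look like the sketch below. It is an illustration only: the keyword lists, the order of the checks, the `threshold` default, and the function name `infer_relationship` are assumptions, and it does not reproduce the logic of `RelationshipInferenceEngine`.

```python
CAUSAL_KEYWORDS = ("caused", "because", "led to", "resulted in")  # assumed keywords
FIX_KEYWORDS = ("fixed", "resolved", "patched")                   # assumed keywords
NEGATION_KEYWORDS = ("don't", "do not", "never", "avoid")         # assumed keywords


def _contains_any(text, keywords):
    return any(keyword in text for keyword in keywords)


def infer_relationship(source_text, target_text, similarity, threshold=0.8):
    """Classify the link from source to target using simple keyword rules."""
    src = source_text.lower()
    tgt = target_text.lower()

    if _contains_any(src, FIX_KEYWORDS):
        return "fixes"
    if _contains_any(src, CAUSAL_KEYWORDS):
        return "causes"

    # Very similar texts where exactly one side is negated are treated as conflicting.
    src_negated = _contains_any(src, NEGATION_KEYWORDS)
    tgt_negated = _contains_any(tgt, NEGATION_KEYWORDS)
    if src_negated != tgt_negated and similarity >= threshold:
        return "contradicts"

    return "related"  # general connection without a more specific type


print(infer_relationship("Use Redis for caching", "Don't use Redis for caching", 0.9))
```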

Association Memory

Overview

Association memories are automatically generated entries that document discovered relationships between existing memories. They provide a memory-level representation of graph connections, enabling search and retrieval of relationship information.

Sources: src/mcp_memory_service/consolidation/consolidator.py:150-200

Association Data Model

class Association:
    source_memory_hashes: List[str]
    similarity_score: float
    connection_type: str
    discovery_method: str
    discovery_date: datetime
    metadata: Dict[str, Any]

Association Memory Structure

When an association is stored as a memory:

association_memory = Memory(
    content=f"Connected {source_hashes[0][:8]} and {source_hashes[1][:8]} by {connection_type}",
    content_hash=f"assoc_{source_hashes[0][:8]}_{source_hashes[1][:8]}",
    tags=["association", "discovered", connection_type],
    memory_type="observation",
    metadata={
        "source_memory_hashes": source_hashes,
        "similarity_score": similarity,
        "connection_type": connection_type,
        "discovery_method": association.discovery_method,
        "discovery_date": association.discovery_date.isoformat(),
    }
)

Sources: src/mcp_memory_service/consolidation/consolidator.py:170-195

Storage Process

graph LR
    A[Memory Pair] --> B[Similarity Analysis]
    B --> C{Score >= Threshold?}
    C -->|Yes| D[Type Classification]
    C -->|No| E[Skip]
    D --> F[Create Association]
    F --> G[Store Association Memory]
    G --> H[Update Graph]

The consolidator stores association memories with skip_semantic_dedup=True to prevent deduplication conflicts with templated content.

Sources: src/mcp_memory_service/consolidation/consolidator.py:195-210
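
The threshold gate and the templated content can be combined in a small sketch. `make_association` is a hypothetical helper that returns a plain dictionary instead of a `Memory` object, and the default `threshold` of 0.7 is an arbitrary placeholder, since the real threshold is not stated here.

```python
from datetime import datetime, timezone


def make_association(hash_a, hash_b, similarity, connection_type, threshold=0.7):
    """Return an association record for a memory pair, or None if the score is too low."""
    if similarity < threshold:
        return None  # the "Skip" branch: score below the threshold
    return {
        "content": f"Connected {hash_a[:8]} and {hash_b[:8]} by {connection_type}",
        "content_hash": f"assoc_{hash_a[:8]}_{hash_b[:8]}",
        "tags": ["association", "discovered", connection_type],
        "memory_type": "observation",
        "metadata": {
            "source_memory_hashes": [hash_a, hash_b],
            "similarity_score": similarity,
            "connection_type": connection_type,
            "discovery_date": datetime.now(timezone.utc).isoformat(),
        },
    }


record = make_association("abc123def4567890", "0123456789abcdef", 0.82, "related")
print(record["content"])
```

Because every association shares the same sentence template, two records can look nearly identical to a semantic deduplicator, which is consistent with the need for `skip_semantic_dedup=True` noted above.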

MCP Tools Reference

memory_graph

Main tool for graph operations:

Parameter	Type	Required	Description
`action`	string	Yes	Operation: `traverse`, `subgraph`, `extract_entities`
`hash`	string	Conditional	Memory hash for subgraph/entities actions
`hash1`	string	Conditional	Source hash for traversal
`hash2`	string	Conditional	Target hash for traversal
`radius`	integer	No	Subgraph radius (default: 2)
`max_depth`	integer	No	Traversal max depth (default: 5)

#### Action Examples

Extract Entities:

{
  "action": "extract_entities",
  "hash": "memory_hash_here"
}

Get Subgraph:

{
  "action": "subgraph",
  "hash": "memory_hash_here",
  "radius": 3
}

Traverse Graph:

{
  "action": "traverse",
  "hash1": "memory_hash_1",
  "hash2": "memory_hash_2",
  "max_depth": 5
}

Sources: src/mcp_memory_service/server/handlers/graph.py:1-80
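
The conditional requirements in the parameter table can be checked before a request is sent. The sketch below is a hypothetical client-side validator, not the handler's own code: `validate_graph_args` and the wording of its error messages are assumptions, while the required parameters and defaults follow the table.

```python
REQUIRED_BY_ACTION = {
    "traverse": ("hash1", "hash2"),
    "subgraph": ("hash",),
    "extract_entities": ("hash",),
}


def validate_graph_args(args):
    """Return normalized arguments, or a dict with an "error" key if they are invalid."""
    action = args.get("action")
    if action not in REQUIRED_BY_ACTION:
        return {"error": f"Unknown action: {action!r}"}

    missing = [name for name in REQUIRED_BY_ACTION[action] if not args.get(name)]
    if missing:
        return {"error": f"Missing parameter(s) for {action}: {', '.join(missing)}"}

    normalized = dict(args)
    if action == "subgraph":
        normalized.setdefault("radius", 2)      # documented default
    if action == "traverse":
        normalized.setdefault("max_depth", 5)   # documented default
    return normalized


print(validate_graph_args({"action": "subgraph", "hash": "memory_hash_here"}))
```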

Maintenance Scripts

update_graph_relationship_types.py

Located in scripts/maintenance/, this script infers relationship types for existing graph associations:

# Dry run (preview changes)
python scripts/maintenance/update_graph_relationship_types.py --dry-run

# Execute inference
python scripts/maintenance/update_graph_relationship_types.py

Features:

Uses RelationshipInferenceEngine for type inference
Supports dry-run mode for safety
Automatic backup before execution
Updates relationship metadata in graph storage

Sources: scripts/maintenance/README.md

Configuration

Graph Storage Configuration

The graph storage is automatically initialized when the storage service starts. No explicit configuration is required for SQLite-based storage.

For Milvus-based storage, configure the following environment variables:

Variable	Description	Default
`MILVUS_HOST`	Milvus server host	localhost
`MILVUS_PORT`	Milvus server port	19530
`MILVUS_COLLECTION`	Collection name	memory_graph

Entity Extraction Configuration

Entity extraction settings can be configured in the service configuration:

Setting	Type	Description
`entity_types`	List[str]	Enabled entity types
`min_confidence`	float	Minimum confidence threshold (0.0-1.0)
`enable_ner`	bool	Enable NER processing
`pattern_weight`	float	Weight for pattern matching

Error Handling

The graph operations return standardized error responses:

{
  "error": "Error message describing the issue"
}

Common error scenarios:

Memory not found for entity extraction
Graph storage not initialized
Invalid hash format
Maximum depth exceeded during traversal

Sources: src/mcp_memory_service/server/handlers/graph.py:55-65

Best Practices

Entity Naming: Use consistent entity naming conventions for better graph queries
Memory Organization: Group related content in the same memory to strengthen association discovery
Regular Maintenance: Run update_graph_relationship_types.py periodically to classify new associations
Tag Usage: Tag memories with semantic tags to improve entity and relationship extraction
Graph Traversal: Use appropriate radius/depth limits to prevent performance issues with highly connected memories

Memory Consolidation Engine

Related topics: Knowledge Graph and Entity Extraction, Quality Scoring System, System Architecture


The Memory Consolidation Engine is a core subsystem of the MCP Memory Service that manages the lifecycle of stored memories through intelligent compression, decay analysis, and selective forgetting mechanisms. It ensures the memory store remains efficient, relevant, and optimized for long-term semantic retrieval.

Overview

As memories accumulate over time, the consolidation engine performs background operations to:

Compress redundant memories into consolidated summaries
Apply decay algorithms to age out less relevant information
Forget obsolete entries based on configurable time horizons
Generate insights from consolidated memory patterns

graph TD
    A[New Memory] --> B[Memory Store]
    B --> C{Consolidation Scheduler}
    C --> D[Decay Analysis]
    C --> E[Compression]
    C --> F[Forgetting Check]
    D --> G[Relevance Score Update]
    E --> H[Consolidated Memory]
    F --> I[Memory Deletion]
    G --> H
    H --> J[Insights Generation]
    J --> K[Updated Memory Store]

Architecture

The consolidation engine comprises five primary modules located in src/mcp_memory_service/consolidation/:

Module	Purpose	Key Functions
`consolidator.py`	Core consolidation orchestration	Main consolidation loop, memory merging
`scheduler.py`	Automated scheduling of consolidation tasks	Daily, weekly, monthly job scheduling
`decay.py`	Relevance decay calculations	Age-based score reduction algorithms
`forgetting.py`	Selective memory removal	Time-horizon based forgetting policies
`insights.py`	Post-consolidation analysis	Pattern detection, summary generation

Sources: src/mcp_memory_service/consolidation/scheduler.py

Consolidation Time Horizons

The engine operates on three configurable time horizons that determine consolidation aggressiveness:

Weekly Consolidation

Processes recent memories (typically 7 days)
Light compression to maintain detail
Preserves high-relevance memories unchanged

Monthly Consolidation

Reviews memories from past 30 days
Moderate compression and decay application
Identifies patterns across recent sessions

Quarterly Consolidation

Full store analysis
Aggressive forgetting of stale entries
Major compression of redundant information

Sources: src/mcp_memory_service/api/operations.py:1-50

API Integration

REST API Endpoint

POST /api/consolidate

Parameters:

Parameter	Type	Required	Description
`time_horizon`	string	Yes	One of: `daily`, `weekly`, `monthly`

Example Request:

curl -X POST https://localhost:8443/api/consolidate \
  -H "Content-Type: application/json" \
  -d '{"time_horizon": "weekly"}'

Example Response:

{
  "status": "completed",
  "time_horizon": "weekly",
  "memories_processed": 2418,
  "compressed": 156,
  "forgotten": 43
}

Sources: src/mcp_memory_service/api/operations.py
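
The same request can be prepared from Python. The following is a minimal sketch using only the standard library; the helper name `build_consolidate_request` and the `BASE_URL` constant are assumptions introduced for illustration, and the accepted horizons are taken from the parameter table above. The request is built but not sent, so it can be inspected first.

```python
import json
import urllib.request

# Assumed base URL, copied from the curl example; adjust for your deployment.
BASE_URL = "https://localhost:8443"

# Accepted values per the parameter table above.
VALID_HORIZONS = ("daily", "weekly", "monthly")


def build_consolidate_request(time_horizon: str, base_url: str = BASE_URL) -> urllib.request.Request:
    """Build (but do not send) a POST /api/consolidate request."""
    if time_horizon not in VALID_HORIZONS:
        raise ValueError(f"time_horizon must be one of {VALID_HORIZONS}")
    body = json.dumps({"time_horizon": time_horizon}).encode("utf-8")
    return urllib.request.Request(
        url=f"{base_url}/api/consolidate",
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


request = build_consolidate_request("weekly")
print(request.get_method(), request.full_url)
```

Passing the request to `urllib.request.urlopen` would send it; a server using a self-signed certificate would also need a suitable SSL context.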

Python API

import asyncio
from mcp_memory_service.api import consolidate

async def main() -> None:
    # consolidate() is awaited, so run it inside an event loop
    result = await consolidate('weekly')
    print(f"Compressed: {result.compressed}, Forgotten: {result.forgotten}")

asyncio.run(main())

Return Value Properties:

Property	Type	Description
`status`	string	Completion status
`time_horizon`	string	Horizon used
`memories_processed`	int	Total memories reviewed
`compressed`	int	Number of memories consolidated
`forgotten`	int	Number of memories deleted

Scheduler Configuration

The consolidation scheduler runs automated tasks based on configured schedules:

graph LR
    A[Scheduler Init] --> B{Job Queue}
    B --> C[Daily Job<br/>T+00:00]
    B --> D[Weekly Job<br/>Sunday T+02:00]
    B --> E[Monthly Job<br/>1st T+03:00]

Scheduler Status Model:

CompactSchedulerStatus:
    running: bool
    next_daily: datetime | None
    next_weekly: datetime | None
    next_monthly: datetime | None
    jobs_executed: int
    jobs_failed: int

Sources: src/mcp_memory_service/consolidation/scheduler.py
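
To make the schedule in the diagram concrete, the sketch below computes the next run time for each job (daily at 00:00, weekly on Sunday at 02:00, monthly on the 1st at 03:00). It is an illustration with the standard library only; the function names are hypothetical and the real scheduler may compute these values differently.

```python
from datetime import datetime, timedelta


def next_daily(now: datetime) -> datetime:
    """Next 00:00 strictly after `now`."""
    midnight = now.replace(hour=0, minute=0, second=0, microsecond=0)
    return midnight + timedelta(days=1)


def next_weekly(now: datetime) -> datetime:
    """Next Sunday 02:00 strictly after `now` (Sunday is weekday() == 6)."""
    candidate = now.replace(hour=2, minute=0, second=0, microsecond=0)
    candidate += timedelta(days=(6 - now.weekday()) % 7)
    if candidate <= now:
        candidate += timedelta(days=7)
    return candidate


def next_monthly(now: datetime) -> datetime:
    """Next 1st-of-month 03:00 strictly after `now`."""
    candidate = now.replace(day=1, hour=3, minute=0, second=0, microsecond=0)
    if candidate <= now:
        if now.month == 12:
            candidate = candidate.replace(year=now.year + 1, month=1)
        else:
            candidate = candidate.replace(month=now.month + 1)
    return candidate


reference = datetime(2025, 1, 15, 10, 30)
print(next_daily(reference), next_weekly(reference), next_monthly(reference))
```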

Decay Algorithm

The decay module applies relevance scoring based on memory age and access patterns:

def calculate_decay_score(memory_age_days: int, access_frequency: float) -> float:
    """
    Returns relevance score between 0.0 and 1.0
    Lower scores indicate memories approaching forgetting threshold
    """

Decay Factors:

Factor	Description	Weight
Time Since Creation	Older memories decay faster	0.4
Access Frequency	Frequently accessed memories decay slower	0.3
Tag Relevance	Tagged memories maintain higher scores	0.2
Memory Type	System vs user memory decay rates differ	0.1

Sources: src/mcp_memory_service/consolidation/decay.py
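
One way to read the Decay Factors table is as a weighted sum of four normalized terms. The sketch below uses the weights from the table; everything else (the function name `decay_score_sketch`, the 90-day half-life, the access saturation point, and the per-type values) is an assumption chosen for illustration and is not taken from `decay.py`.

```python
# Weights copied from the Decay Factors table above.
WEIGHTS = {"age": 0.4, "access": 0.3, "tags": 0.2, "type": 0.1}


def decay_score_sketch(
    age_days: float,
    accesses_per_week: float,
    has_tags: bool,
    is_system: bool,
    half_life_days: float = 90.0,  # assumed value, not from the source
) -> float:
    """Combine four normalized factors into a relevance score in [0.0, 1.0]."""
    # Assumption: the age term halves every `half_life_days`.
    age_term = 0.5 ** (max(age_days, 0.0) / half_life_days)
    # Assumption: access frequency saturates at five accesses per week.
    access_term = min(max(accesses_per_week, 0.0) / 5.0, 1.0)
    # Assumption: tagged memories get full credit, untagged ones none.
    tag_term = 1.0 if has_tags else 0.0
    # Assumption: system memories keep a higher type term than user memories.
    type_term = 1.0 if is_system else 0.5
    score = (
        WEIGHTS["age"] * age_term
        + WEIGHTS["access"] * access_term
        + WEIGHTS["tags"] * tag_term
        + WEIGHTS["type"] * type_term
    )
    return max(0.0, min(1.0, score))


fresh = decay_score_sketch(age_days=0, accesses_per_week=5, has_tags=True, is_system=True)
stale = decay_score_sketch(age_days=365, accesses_per_week=0, has_tags=False, is_system=False)
print(f"fresh={fresh:.2f} stale={stale:.2f}")
```

Because the weights sum to 1.0, a memory that maximizes every term scores close to 1.0, and the score falls toward 0.0 as a memory ages without being accessed.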

Forgetting Mechanism

The forgetting module determines which memories should be permanently removed:

def should_forget(memory: Memory, horizon: str) -> bool:
    """
    Evaluates if a memory meets forgetting criteria
    Returns True if memory should be deleted
    """

Forgetting Criteria by Horizon:

Horizon	Age Threshold	Min Decay Score	Additional Checks
`daily`	90 days	0.1	None
`weekly`	180 days	0.2	Duplicate detection
`monthly`	365 days	0.3	Pattern analysis

Safety Features:

Automatic backup before deletion
Transaction-based deletion (atomic rollback on failure)
Database lock detection to prevent concurrent access issues
Disk space verification before execution

Sources: src/mcp_memory_service/consolidation/forgetting.py
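
The Forgetting Criteria by Horizon table can be sketched as a lookup plus a two-part test. The sketch assumes a memory is forgotten only when it is older than the age threshold and its decay score is below the minimum; the source does not state how the two columns combine, and the additional checks (duplicate detection, pattern analysis) are left out. `ForgettingRule` and `should_forget_sketch` are hypothetical names.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ForgettingRule:
    age_threshold_days: int
    min_decay_score: float


# Values copied from the Forgetting Criteria by Horizon table above.
RULES = {
    "daily": ForgettingRule(age_threshold_days=90, min_decay_score=0.1),
    "weekly": ForgettingRule(age_threshold_days=180, min_decay_score=0.2),
    "monthly": ForgettingRule(age_threshold_days=365, min_decay_score=0.3),
}


def should_forget_sketch(age_days: int, decay_score: float, horizon: str) -> bool:
    """Assumed rule: old enough AND scored below the horizon's minimum."""
    rule = RULES[horizon]  # raises KeyError for an unknown horizon
    return age_days > rule.age_threshold_days and decay_score < rule.min_decay_score


print(should_forget_sketch(age_days=200, decay_score=0.15, horizon="weekly"))
```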

Performance Characteristics

Metric	Value
Typical Duration	10-30 seconds (varies with memory count)
Scaling	~10ms per memory processed
Memory Overhead	<10MB during operation
Background Operation	Non-blocking in HTTP server context

Performance Tips:

Schedule consolidation during low-usage periods
Large stores (>10,000 memories) take proportionally longer: at ~10ms per memory, 10,000 memories need roughly 100 seconds
Disable automatic scheduling for resource-constrained environments

Insights Generation

After consolidation, the insights module analyzes patterns:

def generate_insights(consolidated_memories: List[Memory]) -> List[Insight]:
    """
    Produces actionable insights from consolidated memory patterns
    """

Insight Types:

Type	Description
`pattern`	Recurring themes detected
`summary`	Condensed representation of grouped memories
`recommendation`	Suggestions based on memory patterns
`conflict`	Detected contradictions between memories

Sources: src/mcp_memory_service/consolidation/insights.py

Configuration

Environment Variables

Variable	Default	Description
`MCP_CONSOLIDATION_ENABLED`	`true`	Enable/disable automatic consolidation
`MCP_CONSOLIDATION_SCHEDULE`	`daily`	Default schedule
`MCP_MAX_RESPONSE_CHARS`	`0`	Response truncation (0 = unlimited)

Scheduler Settings

Configure in ~/.claude/hooks/config.json:

{
  "consolidation": {
    "schedule": {
      "daily": "0 0 * * *",
      "weekly": "0 2 * * 0",
      "monthly": "0 3 1 * *"
    },
    "enabled": true,
    "horizons": ["daily", "weekly"]
  }
}

Troubleshooting

Common Issues

Issue	Cause	Solution
Consolidation fails silently	Database locked	Stop MCP clients before running
High memory usage during consolidation	Large store size	Run consolidation during low-usage periods, or disable automatic scheduling with `MCP_CONSOLIDATION_ENABLED=false`
Scheduler not running	Service not started	Check `systemctl --user status mcp-memory.service`

Verification

# Check scheduler status
curl https://localhost:8443/api/consolidate/status

# Manual consolidation (dry-run)
curl -X POST https://localhost:8443/api/consolidate \
  -H "Content-Type: application/json" \
  -d '{"time_horizon": "weekly", "dry_run": true}'

Further Reference

Memory Consolidation Guide - Detailed usage documentation
API Reference - Full API specification
CLI Commands - Command-line consolidation tools

Sources: src/mcp_memory_service/consolidation/scheduler.py

Quality Scoring System

Related topics: Memory Consolidation Engine, System Architecture, Quick Start Guide

The Quality Scoring System is a multi-layered evaluation framework within the MCP Memory Service that assesses, ranks, and maintains memory content quality. It provides both automatic quality assessment through machine learning models and explicit user feedback mechanisms to ensure that memories retain high-value information over time.

Overview

The system addresses a fundamental challenge in semantic memory systems: not all stored memories have equal importance or relevance. Over time, memory stores can become cluttered with transient information, low-value summaries, and outdated content that degrades the overall utility of the memory service.

The Quality Scoring System solves this by implementing:

Automatic Quality Assessment: Uses ONNX-based ML models to evaluate content quality without manual intervention
Implicit Signal Detection: Analyzes behavioral patterns to infer memory importance
Manual Rating Support: Allows users to explicitly rate memory quality
Quality-Aware Search: Boosts high-quality memories in search results
Maintenance Operations: Provides tools for quality-based memory management

Architecture

The Quality Scoring System follows a modular architecture: a QualityScorer orchestrator coordinates three evaluators (ONNXRankerModel, QualityEvaluator, and ImplicitSignalsEvaluator), all configured through QualityConfig.

graph TD
    A[Memory Content] --> B[QualityScorer]
    B --> C[ONNXRankerModel]
    B --> D[QualityEvaluator]
    B --> E[ImplicitSignalsEvaluator]
    
    C --> F[CompactSearchResult]
    D --> F
    E --> F
    
    G[User Rating] --> H[handle_rate_memory]
    H --> B
    
    I[Search Query] --> J[quality_boost Parameter]
    J --> F
    
    K[Maintenance Operations] --> L[handle_maintain]
    L --> B

Core Components

QualityConfig

The QualityConfig class provides centralized configuration for the quality scoring system. It defines thresholds, weights, and behavioral parameters that control how quality is evaluated.

Parameter	Type	Default	Description
`min_quality`	float	0.0	Minimum quality threshold for inclusion
`max_quality`	float	1.0	Maximum possible quality score
`boost_weight`	float	varies	Weight given to quality in search ranking
`implicit_weight`	float	varies	Weight for implicit signal evaluation

Sources: src/mcp_memory_service/quality/config.py

QualityScorer

The QualityScorer is the main orchestrator class that coordinates quality evaluation across all sub-components. It aggregates scores from different evaluation methods and produces a unified quality score.

class QualityScorer:
    def __init__(self, config: QualityConfig):
        self.config = config
        self.ranker = ONNXRankerModel()
        self.evaluator = QualityEvaluator()
        self.implicit_evaluator = ImplicitSignalsEvaluator()

Key Responsibilities:

Aggregates scores from ONNX ranker, AI evaluator, and implicit signals
Provides a unified get_score() interface
Handles caching and optimization for repeated evaluations
Manages configuration propagation to sub-components

Sources: src/mcp_memory_service/quality/scorer.py
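
The aggregation and caching responsibilities listed above can be illustrated with a small stand-in. This is a sketch, not the real `QualityScorer`: the class name `ScoreAggregatorSketch`, the weighted-average formula, the weights, and the fake evaluators are all assumptions made so the example runs without the package or an ONNX model.

```python
from typing import Callable, Dict


class ScoreAggregatorSketch:
    """Hypothetical stand-in for the aggregation step; not the real QualityScorer."""

    def __init__(
        self,
        evaluators: Dict[str, Callable[[str], float]],
        weights: Dict[str, float],
    ) -> None:
        self.evaluators = evaluators
        self.weights = weights
        self._cache: Dict[str, float] = {}

    def get_score(self, content: str) -> float:
        # Return the cached score for content that was already evaluated.
        if content in self._cache:
            return self._cache[content]
        total_weight = sum(self.weights[name] for name in self.evaluators)
        if total_weight <= 0:
            raise ValueError("weights must sum to a positive value")
        weighted_sum = sum(
            self.weights[name] * evaluate(content)
            for name, evaluate in self.evaluators.items()
        )
        # Weighted average, clamped to the 0.0-1.0 range.
        score = max(0.0, min(1.0, weighted_sum / total_weight))
        self._cache[content] = score
        return score


calls = {"count": 0}


def fake_ranker(content: str) -> float:
    """Stands in for the ONNX ranker and counts how often it runs."""
    calls["count"] += 1
    return 0.8


aggregator = ScoreAggregatorSketch(
    evaluators={"onnx": fake_ranker, "ai": lambda c: 0.6, "implicit": lambda c: 0.4},
    weights={"onnx": 0.5, "ai": 0.3, "implicit": 0.2},  # assumed weights
)
first = aggregator.get_score("Use SQLite-vec for local storage")
second = aggregator.get_score("Use SQLite-vec for local storage")
print(first, second, calls["count"])
```

The second call is served from the cache, so the fake ranker runs only once.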

ONNXRankerModel

The ONNXRankerModel provides fast, offline-capable quality ranking using ONNX Runtime. This model evaluates content based on learned patterns of what constitutes high-quality memory content.

Advantages of ONNX-based ranking:

Runs entirely offline without external API dependencies
Fast inference suitable for real-time evaluation
Portable across different platforms and hardware
No per-token costs unlike cloud-based alternatives

Sources: src/mcp_memory_service/quality/onnx_ranker.py

QualityEvaluator

The QualityEvaluator provides AI-based quality assessment, likely using language-model analysis to judge content quality in more nuanced ways than the ONNX ranker.

Evaluation Criteria:

Content specificity and detail level
Actionability of information
Temporal relevance
Uniqueness of content

Sources: src/mcp_memory_service/quality/ai_evaluator.py

ImplicitSignalsEvaluator

The ImplicitSignalsEvaluator analyzes behavioral patterns to infer memory importance without explicit user feedback. This component detects signals that suggest a memory's value based on how it's accessed and used.

Implicit Signals:

Retrieval frequency
Retrieval timing patterns
Context of retrieval requests
Cross-referencing with other memories

Sources: src/mcp_memory_service/quality/implicit_signals.py

Module Exports

All quality scoring components are exported through the main quality module interface:

from mcp_memory_service.quality import (
    QualityScorer,
    ONNXRankerModel,
    QualityEvaluator,
    ImplicitSignalsEvaluator,
    QualityConfig
)

Sources: src/mcp_memory_service/quality/__init__.py

API Integration

Quality-Aware Search

The quality scoring system integrates with the search API through the quality_boost parameter. When performing semantic or hybrid searches, memories with higher quality scores receive ranking boosts.

Search Parameters Related to Quality:

Parameter	Type	Default	Description
`quality_boost`	float	0.0	Weight for quality-based ranking (0.0-1.0)
`min_quality`	float	0.0	Minimum quality threshold filter
`include_debug`	bool	false	Include quality scoring details in response

Implementation in Memory Handler:

quality_boost=arguments.get("quality_boost", 0.0),
limit=limit,
include_debug=arguments.get("include_debug", False),

Sources: src/mcp_memory_service/server/handlers/memory.py:26-32
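
To show what a ranking boost could look like, the sketch below blends relevance and quality linearly, using `quality_boost` as the mixing weight. The blend formula, the `Candidate` type, and the `rerank` function are assumptions for illustration; the source only states that higher-quality memories receive a boost.

```python
from typing import List, NamedTuple, Sequence


class Candidate(NamedTuple):
    content_hash: str
    relevance: float  # semantic relevance, 0.0-1.0
    quality: float    # quality score, 0.0-1.0


def rerank(candidates: Sequence[Candidate], quality_boost: float) -> List[Candidate]:
    """Re-rank by an assumed linear blend of relevance and quality."""
    if not 0.0 <= quality_boost <= 1.0:
        raise ValueError("quality_boost must be between 0.0 and 1.0")
    if quality_boost == 0.0:
        # Default behaviour: keep the original relevance ordering.
        return list(candidates)

    def combined(candidate: Candidate) -> float:
        return (1.0 - quality_boost) * candidate.relevance + quality_boost * candidate.quality

    return sorted(candidates, key=combined, reverse=True)


initial = [Candidate("aaaa1111", 0.9, 0.2), Candidate("bbbb2222", 0.8, 0.9)]
boosted = rerank(initial, quality_boost=0.5)
print([c.content_hash for c in boosted])
```

With `quality_boost=0.5` the second candidate overtakes the first because its higher quality outweighs its slightly lower relevance.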

Quality Scoring in Results

Each CompactMemory entry in a CompactSearchResult carries a score field holding its relevance score, which incorporates quality assessment when quality_boost is greater than 0.

CompactMemory Score Field:

from typing import NamedTuple

class CompactMemory(NamedTuple):
    hash: str              # 8-character content hash
    preview: str           # First 200 characters
    tags: tuple[str, ...]  # Immutable tags tuple
    created: float         # Unix timestamp
    score: float           # Relevance score 0-1

Sources: src/mcp_memory_service/api/types.py:48-56

Quality Actions Handler

The quality system exposes several actions through the memory_quality tool handler, providing programmatic access to quality operations.

Action Types

Action	Description	Required Parameters
`rate`	Manually rate a memory's quality	`content_hash`, `rating`
`analyze`	Analyze quality distribution	`min_quality`, `max_quality`
`maintain`	Run quality-based maintenance	varies
`maintain_status`	Check maintenance status	none

Rate Memory Action

Allows explicit user feedback on memory quality:

async def handle_rate_memory(server, arguments: dict) -> List[types.TextContent]:
    content_hash = arguments.get("content_hash")
    rating = arguments.get("rating")
    feedback = arguments.get("feedback", "")

Parameters:

Parameter	Type	Required	Description
`content_hash`	string	Yes	Hash of the memory to rate
`rating`	integer/string	Yes	Quality rating (converted to integer)
`feedback`	string	No	Optional feedback text

Sources: src/mcp_memory_service/server/handlers/quality.py:47-59

Analyze Quality Distribution

Provides statistical analysis of memory quality across the store:

elif action == "analyze":
    return await handle_analyze_quality_distribution(server, {
        "min_quality": arguments.get("min_quality", 0.0),
        "max_quality": arguments.get("max_quality", 1.0)
    })

Maintenance Operations

The maintain action provides automated quality-based memory management:

elif action == "maintain":
    return await handle_maintain(server, arguments)

elif action == "maintain_status":
    return await handle_maintain_status()

Sources: src/mcp_memory_service/server/handlers/quality.py:30-38

Workflow Diagrams

Quality Evaluation Flow

graph LR
    A[Incoming Memory] --> B{Is cached?}
    B -->|No| C[Run ONNX Ranker]
    C --> D[Run AI Evaluator]
    D --> E[Run Implicit Signals]
    E --> F[Aggregate Scores]
    F --> G[Compute Final Quality]
    G --> H[Cache Result]
    H --> I[Return Score]
    
    B -->|Yes| I

Quality-Aware Search Flow

graph TD
    A[Search Query] --> B[Semantic Search]
    B --> C[Initial Results]
    C --> D{quality_boost > 0?}
    D -->|Yes| E[Fetch Quality Scores]
    D -->|No| H[Return Results]
    E --> F[Apply Quality Boost]
    F --> G[Re-rank Results]
    G --> H

Usage Examples

Basic Quality Evaluation

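from mcp_memory_service.quality import QualityScorer, QualityConfig

config = QualityConfig(min_quality=0.5, boost_weight=0.3)
scorer = QualityScorer(config)

# Example content; any memory text can be scored
memory_content = "Chose SQLite-vec as the storage backend for single-machine deployments"
quality_score = scorer.get_score(memory_content)
print(f"Quality score: {quality_score}")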

Quality-Boosted Search

When searching with quality consideration:

results = await search_memories(
    query="architecture decisions",
    quality_boost=0.5,  # Apply quality weighting
    limit=10
)

Manual Rating

result = await handle_rate_memory(server, {
    "content_hash": "abc12345",
    "rating": 4,
    "feedback": "Important architectural decision"
})

Configuration Best Practices

Performance vs Accuracy

Use Case	Configuration	Rationale
Real-time search	`quality_boost` between 0.2 and 0.3	Balance relevance with performance
High-precision retrieval	`quality_boost` between 0.5 and 0.7	Prioritize quality over recall
Maintenance/cleanup	`min_quality` between 0.3 and 0.5	Filter low-value memories

Thresholds

Memory Type	Recommended `min_quality`
Session summaries	0.4
Implementation details	0.5
Architectural decisions	0.6
Bug fixes	0.3
Reference documentation	0.5

Summary

The Quality Scoring System provides a comprehensive framework for evaluating and managing memory content quality in the MCP Memory Service. By combining ONNX-based offline ranking, AI-powered evaluation, and implicit behavioral signals, the system ensures that high-value memories are surfaced first while providing tools for ongoing quality maintenance.

The modular architecture allows for:

Scalability: ONNX Runtime enables efficient inference at scale
Flexibility: Configurable weights and thresholds for different use cases
Offline capability: Core ranking works without network dependencies
User control: Manual rating provides explicit feedback pathways
Maintenance: Built-in tools for quality-based memory management

Sources: src/mcp_memory_service/quality/config.py

REST API Reference

Related topics: Agent Framework Integration, Quick Start Guide

Overview

The MCP Memory Service exposes a comprehensive REST API for managing semantic memories, performing advanced searches, and subscribing to real-time events. The API serves as the primary interface for external clients, web applications, and programmatic integrations that need to interact with the memory storage backend without using the MCP (Model Context Protocol) tools.

The REST API provides four major functional areas:

Area	Description
Memory Management	Store, retrieve, list, and delete semantic memories
Search Operations	Semantic similarity, tag-based, and time-based search
Real-time Events	Server-Sent Events (SSE) for live updates
MCP Protocol	JSON-RPC interface compatible with MCP clients

Sources: src/mcp_memory_service/web/app.py

Architecture

The REST API is built on FastAPI and serves as a thin layer over the core MemoryService and storage backends. Requests flow through the API router to the appropriate handlers, which delegate business logic to shared services.

graph TD
    A[HTTP Client] --> B[FastAPI Router]
    B --> C["/api/memories"]
    B --> D["/api/search"]
    B --> E["/api/events"]
    B --> F["/api/mcp"]
    C --> G[Memory Handler]
    D --> H[Search Handler]
    E --> I[SSE Publisher]
    F --> J[MCP Handler]
    G --> K[MemoryService]
    H --> K
    K --> L[(SQLite + sqlite_vec)]

Sources: src/mcp_memory_service/web/api/mcp.py:1-50

Base Configuration

Server Address

Environment Variable	Default	Description
`MCP_HOST`	`0.0.0.0`	Bind address
`MCP_PORT`	`8080`	HTTP port
`MCP_MAX_RESPONSE_CHARS`	`0` (unlimited)	Response truncation limit

Sources: src/mcp_memory_service/server/utils/response_limiter.py:1-40

Memory Management Endpoints

Store a Memory

Creates a new memory with automatic embedding generation.

POST /api/memories

Parameter	Type	Location	Required	Description
`content`	string	body	Yes	Memory content text
`tags`	string[]	body	No	List of tags
`memory_type`	string	body	No	Classification type (default: "note")
`metadata`	object	body	No	Custom metadata key-value pairs

Response:

{
  "content_hash": "abc12345",
  "message": "Memory stored successfully"
}

Sources: src/mcp_memory_service/api/operations.py:50-100

List All Memories

Retrieves memories with pagination and optional filtering.

GET /api/memories

Parameter	Type	Location	Required	Default	Description
`page`	integer	query	No	`1`	1-based page number
`page_size`	integer	query	No	`20`	Results per page (max: 100)
`tags`	string	query	No	-	Comma-separated tag filter
`tag_match`	string	query	No	`any`	Match logic: `any` (OR) or `all` (AND)
`memory_type`	string	query	No	-	Filter by memory type
`stale_days`	integer	query	No	-	Filter memories not accessed in N days

Response:

{
  "memories": [
    {
      "content": "Memory content here",
      "content_hash": "abc12345",
      "tags": ["tag1", "tag2"],
      "memory_type": "note",
      "created_at": "2025-01-15T10:30:00Z",
      "updated_at": "2025-01-15T10:30:00Z"
    }
  ],
  "pagination": {
    "page": 1,
    "page_size": 20,
    "total_count": 150
  }
}

Sources: src/mcp_memory_service/server/handlers/memory.py:1-50
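
The query parameters above can be assembled with the standard library. The sketch below is an illustration only: `list_memories_url` and `total_pages` are hypothetical helpers, and the base URL assumes a local server on the default `MCP_PORT` of 8080.

```python
from math import ceil
from typing import Optional, Sequence
from urllib.parse import urlencode


def list_memories_url(
    base_url: str,
    page: int = 1,
    page_size: int = 20,
    tags: Optional[Sequence[str]] = None,
    tag_match: str = "any",
) -> str:
    """Build a GET /api/memories URL from the documented query parameters."""
    if page < 1:
        raise ValueError("page is 1-based")
    if not 1 <= page_size <= 100:
        raise ValueError("page_size must be between 1 and 100")
    if tag_match not in ("any", "all"):
        raise ValueError("tag_match must be 'any' or 'all'")
    params = {"page": page, "page_size": page_size}
    if tags:
        # Tags are sent as one comma-separated value.
        params["tags"] = ",".join(tags)
        params["tag_match"] = tag_match
    return f"{base_url}/api/memories?{urlencode(params)}"


def total_pages(total_count: int, page_size: int) -> int:
    """Number of pages implied by the pagination block of a response."""
    return ceil(total_count / page_size)


url = list_memories_url("http://localhost:8080", page=2, tags=["python", "reference"], tag_match="all")
print(url)
print(total_pages(total_count=150, page_size=20))
```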

Retrieve a Memory

Retrieves a specific memory by its content hash.

GET /api/memories/{hash}

Parameter	Type	Location	Required	Description
`hash`	string	path	Yes	8-character content hash

Response:

{
  "content": "Memory content here",
  "content_hash": "abc12345",
  "tags": ["architecture", "database"],
  "memory_type": "reference",
  "created_at": "2025-01-15T10:30:00Z",
  "updated_at": "2025-01-15T10:30:00Z",
  "metadata": {}
}

Sources: src/mcp_memory_service/web/app.py

Delete a Memory

Removes a memory and its associated embeddings.

DELETE /api/memories/{hash}

Parameter	Type	Location	Required	Description
`hash`	string	path	Yes	8-character content hash

Response:

{
  "success": true,
  "message": "Memory deleted successfully"
}

Sources: src/mcp_memory_service/web/app.py

Search Operations

Semantic Search

Performs vector similarity search using text embeddings.

POST /api/search

Parameter	Type	Location	Required	Default	Description
`query`	string	body	Yes	-	Search query text
`limit`	integer	body	No	`5`	Maximum results (1-100)
`tags`	string[]	body	No	-	Filter by tags
`threshold`	float	body	No	`0.0`	Minimum relevance score
`hybrid`	boolean	body	No	`false`	Enable BM25 fallback at threshold 0.4

Response:

{
  "memories": [
    {
      "content_hash": "abc12345",
      "content": "Memory content...",
      "tags": ["tag1"],
      "relevance_score": 0.87,
      "match_method": "vector"
    }
  ],
  "found": 1,
  "shown": 1
}

Sources: src/mcp_memory_service/api/operations.py:100-150

Tag-Based Search

Searches memories using tags with AND/OR logic.

POST /api/search/by-tag

Parameter	Type	Location	Required	Description
`tags`	string[]	body	Yes	List of tags to search
`match_all`	boolean	body	No	`true` for AND, `false` for OR

Response:

{
  "memories": [
    {
      "content": "Memory content",
      "content_hash": "abc12345",
      "tags": ["python", "reference"]
    }
  ],
  "total_found": 5
}

Sources: src/mcp_memory_service/server/handlers/memory.py:50-100

Time-Based Search

Natural language time-based queries for temporal memory retrieval.

POST /api/search/by-time

Parameter	Type	Location	Required	Description
`time_query`	string	body	Yes	Natural language time expression
`time_start`	float	body	No	Unix timestamp start
`time_end`	float	body	No	Unix timestamp end

Example Queries:

Expression	Interpretation
`last week`	7 days ago to now
`yesterday's architectural discussions`	Previous day
`between 2025-01-01 and 2025-01-31`	Date range

Sources: scripts/sync/README.md

Find Similar Memories

Finds memories semantically similar to a specific memory by hash.

GET /api/search/similar/{hash}

Parameter	Type	Location	Required	Description
`hash`	string	path	Yes	Content hash of reference memory
`limit`	integer	query	No	Maximum similar memories to return

Response:

{
  "reference_hash": "abc12345",
  "similar": [
    {
      "content_hash": "def67890",
      "content": "Similar memory content...",
      "similarity_score": 0.92
    }
  ]
}

Real-time Events (SSE)

Server-Sent Events stream for live memory activity.

GET /api/events

Event Types:

Event	Description
`memory_stored`	New memory added
`memory_deleted`	Memory removed
`memory_updated`	Memory modified
`embedding_complete`	Async embedding finished

Example SSE Payload:

event: memory_stored
data: {"content_hash": "abc12345", "timestamp": "2025-01-15T10:30:00Z"}

event: memory_deleted
data: {"content_hash": "def67890", "timestamp": "2025-01-15T10:35:00Z"}

Sources: src/mcp_memory_service/web/app.py
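
A client consuming this stream has to split it into events on blank lines. The sketch below is a minimal parser for payloads shaped like the example above, assuming each `data:` field holds JSON; `parse_sse` is a hypothetical helper, and it handles only the `event:` and `data:` fields rather than the full SSE format.

```python
import json
from typing import Iterable, Iterator, List, Tuple


def parse_sse(lines: Iterable[str]) -> Iterator[Tuple[str, dict]]:
    """Yield (event_name, decoded_data) pairs from SSE text lines."""
    event_name = "message"  # SSE default when no event field is given
    data_lines: List[str] = []
    for raw in lines:
        line = raw.rstrip("\r\n")
        if line == "":
            # A blank line terminates the current event.
            if data_lines:
                yield event_name, json.loads("\n".join(data_lines))
            event_name, data_lines = "message", []
        elif line.startswith("event:"):
            event_name = line[len("event:"):].strip()
        elif line.startswith("data:"):
            data_lines.append(line[len("data:"):].strip())
    if data_lines:
        # Flush a final event that was not followed by a blank line.
        yield event_name, json.loads("\n".join(data_lines))


SAMPLE = (
    'event: memory_stored\n'
    'data: {"content_hash": "abc12345", "timestamp": "2025-01-15T10:30:00Z"}\n'
    '\n'
    'event: memory_deleted\n'
    'data: {"content_hash": "def67890", "timestamp": "2025-01-15T10:35:00Z"}\n'
    '\n'
)
events = list(parse_sse(SAMPLE.splitlines()))
for name, data in events:
    print(name, data["content_hash"])
```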

SSE Statistics

View connection statistics for SSE endpoints.

GET /api/events/stats

Response:

{
  "active_connections": 2,
  "total_events_sent": 1450,
  "uptime_seconds": 86400
}

MCP Protocol Endpoint

The MCP-compatible JSON-RPC endpoint enables integration with MCP clients.

POST /api/mcp

Supported Methods

Method	Description
`initialize`	Initialize MCP session, returns server capabilities
`tools/list`	List available MCP tools
`tools/call`	Execute a named MCP tool

Initialize Request

{
  "jsonrpc": "2.0",
  "id": 1,
  "method": "initialize",
  "params": {
    "protocolVersion": "2024-11-05",
    "capabilities": {}
  }
}

Response:

{
  "jsonrpc": "2.0",
  "id": 1,
  "result": {
    "protocolVersion": "2024-11-05",
    "capabilities": {"tools": {}},
    "serverInfo": {
      "name": "mcp-memory-service",
      "version": "4.1.1"
    }
  }
}

Tools List

Returns all available MCP tools with their schemas.

{
  "jsonrpc": "2.0",
  "id": 2,
  "method": "tools/list"
}

Tool Call

Execute a named tool with arguments.

{
  "jsonrpc": "2.0",
  "id": 3,
  "method": "tools/call",
  "params": {
    "name": "memory_search",
    "arguments": {
      "query": "architecture decisions",
      "limit": 5
    }
  }
}

Sources: src/mcp_memory_service/web/api/mcp.py:20-60
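
The three request shapes above differ only in `method`, `params`, and an increasing `id`. The sketch below builds them with a small helper; `JsonRpcRequestBuilder` is a hypothetical name, and sending the payload to `POST /api/mcp` is left to whichever HTTP client is in use.

```python
import itertools
import json
from typing import Any, Dict, Optional


class JsonRpcRequestBuilder:
    """Builds JSON-RPC 2.0 request objects with sequential ids."""

    def __init__(self) -> None:
        self._ids = itertools.count(1)

    def build(self, method: str, params: Optional[Dict[str, Any]] = None) -> Dict[str, Any]:
        request: Dict[str, Any] = {"jsonrpc": "2.0", "id": next(self._ids), "method": method}
        if params is not None:
            # Omit "params" entirely when the method takes none (e.g. tools/list).
            request["params"] = params
        return request


builder = JsonRpcRequestBuilder()
init = builder.build("initialize", {"protocolVersion": "2024-11-05", "capabilities": {}})
tools = builder.build("tools/list")
call = builder.build(
    "tools/call",
    {"name": "memory_search", "arguments": {"query": "architecture decisions", "limit": 5}},
)
print(json.dumps(call, indent=2))
```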

Response Limiting

The API implements automatic response truncation to prevent context window overflow.

Behavior

Scenario	Action
`MCP_MAX_RESPONSE_CHARS=0`	No limit (backward compatible)
`MCP_MAX_RESPONSE_CHARS>0`	Truncate at memory boundaries
Response truncated	Include warning header

Truncation Metadata

When responses are truncated, the following metadata is included:

{
  "warning": "RESPONSE TRUNCATED: Showing 5 of 25 results (15000 of 75000 chars)",
  "meta": {
    "truncated": true,
    "shown_results": 5,
    "total_results": 25,
    "shown_chars": 15000,
    "total_chars": 75000,
    "omitted_count": 20
  }
}

Sources: src/mcp_memory_service/server/utils/response_limiter.py:60-120
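
Truncating at memory boundaries means whole results are kept or dropped, never cut mid-text. The sketch below illustrates that idea and reproduces the `meta` fields shown above; `truncate_at_boundaries` is a hypothetical function, and the real limiter in `response_limiter.py` may count characters differently, for example by including formatting overhead.

```python
from typing import Dict, List


def truncate_at_boundaries(results: List[str], max_chars: int) -> Dict[str, object]:
    """Keep whole results until adding the next one would exceed max_chars."""
    total_chars = sum(len(result) for result in results)
    shown: List[str] = []
    shown_chars = 0
    if max_chars <= 0:
        # 0 means unlimited, matching MCP_MAX_RESPONSE_CHARS=0.
        shown, shown_chars = list(results), total_chars
    else:
        for result in results:
            if shown_chars + len(result) > max_chars:
                break
            shown.append(result)
            shown_chars += len(result)
    return {
        "results": shown,
        "meta": {
            "truncated": len(shown) < len(results),
            "shown_results": len(shown),
            "total_results": len(results),
            "shown_chars": shown_chars,
            "total_chars": total_chars,
            "omitted_count": len(results) - len(shown),
        },
    }


response = truncate_at_boundaries(["x" * 3000] * 25, max_chars=15000)
print(response["meta"])
```

With 25 results of 3,000 characters each and a 15,000-character limit, the sketch yields the same counts as the metadata example above.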

Data Models

Memory Object

Field	Type	Description
`content`	string	Memory text content
`content_hash`	string	8-character SHA256 hash
`tags`	string[]	Associated tags
`memory_type`	string	Classification (note, reference, decision, etc.)
`created_at`	ISO8601	Creation timestamp
`updated_at`	ISO8601	Last modification timestamp
`metadata`	object	Custom key-value pairs
`created_at_iso`	string	ISO format creation time
`updated_at_iso`	string	ISO format modification time

Search Result

Field	Type	Description
`memories`	Memory[]	Matching memories
`found`	integer	Total matches
`shown`	integer	Results returned (after limit)
`query_time_ms`	float	Search duration

Export Endpoints

Export Memories

Export memories in text or JSON format for backup and migration.

GET /api/export

Parameter	Type	Location	Description
`format`	string	query	`text` or `json` (default: `text`)

JSON Export Format:

{
  "export_metadata": {
    "source_machine": "machine-name",
    "export_timestamp": "2025-08-12T10:30:00Z",
    "total_memories": 450,
    "database_path": "/path/to/sqlite_vec.db",
    "platform": "Windows",
    "exporter_version": "5.0.0"
  },
  "memories": [
    {
      "content": "Memory content here",
      "content_hash": "sha256hash",
      "tags": ["tag1", "tag2"],
      "created_at": 1673545200.0,
      "updated_at": 1673545200.0,
      "memory_type": "note",
      "metadata": {},
      "export_source": "machine-name"
    }
  ]
}

Sources: scripts/sync/README.md

Performance Characteristics

Operation	First Call	Subsequent Calls
Search	~50ms	~5-10ms
Store	~50ms	~10-20ms
Health Check	~50ms	~5ms

Cost Estimate: At $0.15/1M tokens: ~$16.43/year per 10-user deployment.

Sources: src/mcp_memory_service/api/__init__.py

API Client Library

For programmatic access, use the Python client:

from mcp_memory_service.api import search, store, health

# Store a memory
content_hash = store("New memory", tags=["note", "important"])

# Search memories
results = search("architecture decisions", limit=5)
for m in results.memories:
    print(f"{m.hash}: {m.preview[:50]}...")

# Health check
info = health()
print(f"Backend: {info.backend}, Count: {info.count}")

Sources: src/mcp_memory_service/api/__init__.py

Error Handling

Standard Error Response

{
  "error": {
    "code": -32601,
    "message": "Method not found: {method}"
  }
}

Error Codes

Code	Meaning
`-32600`	Invalid Request
`-32601`	Method not found
`-32602`	Invalid params
`-32603`	Internal error
`-32000`	Storage unavailable
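
Client code can turn these codes into exceptions. The sketch below is one possible approach, assuming error responses follow the Standard Error Response shape shown above; `RpcErrorCode`, `MemoryServiceError`, and `raise_for_error` are hypothetical names, and treating only `-32000` as retryable is an assumption.

```python
from enum import IntEnum


class RpcErrorCode(IntEnum):
    """Codes from the Error Codes table above."""
    INVALID_REQUEST = -32600
    METHOD_NOT_FOUND = -32601
    INVALID_PARAMS = -32602
    INTERNAL_ERROR = -32603
    STORAGE_UNAVAILABLE = -32000


class MemoryServiceError(Exception):
    def __init__(self, code: int, message: str) -> None:
        super().__init__(f"[{code}] {message}")
        self.code = code
        self.message = message

    @property
    def retryable(self) -> bool:
        # Assumption: only a storage outage is worth retrying.
        return self.code == RpcErrorCode.STORAGE_UNAVAILABLE


def raise_for_error(payload: dict) -> dict:
    """Return the payload unchanged, or raise if it carries an error object."""
    error = payload.get("error")
    if error is not None:
        raise MemoryServiceError(error["code"], error["message"])
    return payload


try:
    raise_for_error({"error": {"code": -32000, "message": "Storage unavailable"}})
except MemoryServiceError as exc:
    print(exc, "retryable:", exc.retryable)
```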

Documentation

Interactive API documentation is available at:

URL	Format
`/api/docs`	Swagger UI
`/api/redoc`	ReDoc

Sources: src/mcp_memory_service/web/app.py

Agent Framework Integration

Related topics: REST API Reference, Quick Start Guide

The MCP Memory Service provides comprehensive integration capabilities with various AI agent frameworks, enabling persistent semantic memory storage and retrieval across multi-agent architectures. This integration layer allows autonomous agents to maintain contextual awareness, share knowledge, and persist learning across sessions.

Overview

The MCP Memory Service functions as a central knowledge backbone for AI agent frameworks. Rather than requiring each agent to maintain isolated memory stores, the service provides a unified semantic memory layer that can be accessed via:

Model Context Protocol (MCP) Tools: Native integration for MCP-compatible agents
REST API: HTTP-based access for any framework with HTTP client capabilities
Direct Python API: Programmatic access for Python-native frameworks

This architecture enables agents to:

Store discovered facts and learned patterns
Retrieve contextually relevant memories during reasoning
Share knowledge across agent teams
Maintain persistent state across sessions

Sources: src/mcp_memory_service/api/__init__.py:1-50

Supported Agent Frameworks

The MCP Memory Service integrates with the following agent frameworks through dedicated documentation and examples:

Framework	Integration Type	Documentation
LangGraph	SDK/Graph-based	`docs/agents/langgraph.md`
CrewAI	Team-based agents	`docs/agents/crewai.md`
AutoGen	Multi-agent conversations	`docs/agents/autogen.md`
Custom Frameworks	HTTP/REST	`docs/agents/http-generic.md`

Sources: docs/agents/README.md

Architecture for Multi-Agent Systems

graph TD
    A[Agent Framework] --> B[MCP Memory Service API]
    B --> C[Memory Management Layer]
    C --> D[(SQLite-vec Storage)]
    C --> E[(Cloudflare D1)]
    
    F[Agent 1] --> B
    G[Agent 2] --> B
    H[Agent N] --> B
    
    I[Embedding Generation] --> C
    J[all-MiniLM-L6-v2] --> I

Memory Flow in Agent Workflows

sequenceDiagram
    participant Agent
    participant MCP as MCP Memory Service
    participant Storage as Vector Storage
    
    Agent->>MCP: Store Memory (content, tags)
    MCP->>MCP: Generate Embedding
    MCP->>Storage: Store + Index
    Storage-->>MCP: Confirm
    MCP-->>Agent: Content Hash
    
    Agent->>MCP: Semantic Search (query)
    MCP->>MCP: Generate Query Embedding
    MCP->>Storage: Similarity Search
    Storage-->>MCP: Top-K Results
    MCP-->>Agent: Relevant Memories

Sources: docs/guides/AGENTS.md

MCP Prompt Integration

The service exposes specialized prompts for agent workflows beyond basic memory operations:

Available Agent Prompts

Prompt Name	Purpose	Arguments
`knowledge_export`	Export memories in specific formats	`format` (required; json/markdown/text)
`memory_cleanup`	Remove duplicates/outdated memories	`older_than`, `similarity_threshold` (both optional)
`learning_session`	Store structured learning notes	`topic`, `key_points`

Sources: src/mcp_memory_service/server_impl.py:150-200

Prompt Argument Specifications

types.PromptArgument(
    name="format",
    description="Export format (json, markdown, text)",
    required=True
)
types.PromptArgument(
    name="older_than",
    description="Remove memories older than (e.g., '6 months', '1 year')",
    required=False
)
types.PromptArgument(
    name="similarity_threshold",
    description="Similarity threshold for duplicates (0.0-1.0)",
    required=False
)

REST API Integration

For agent frameworks that prefer HTTP-based communication, the service provides a comprehensive REST API:

Core Endpoints

Method	Endpoint	Purpose
POST	`/api/memories`	Store new memory
GET	`/api/memories`	List memories with pagination
GET	`/api/memories/{hash}`	Retrieve specific memory
DELETE	`/api/memories/{hash}`	Delete memory
POST	`/api/search`	Semantic similarity search
GET	`/api/search/similar/{hash}`	Find similar memories

Sources: src/mcp_memory_service/web/app.py:50-100
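
The endpoint table maps directly onto request objects. The sketch below only builds requests with the standard library and sends nothing. `BASE_URL` is a placeholder, the helper names are hypothetical, and the `query` field in the search body is an assumption that should be checked against the running API.

```python
import json
from urllib.parse import quote
from urllib.request import Request

BASE_URL = "https://your-endpoint:8443"  # placeholder, replace with your own endpoint


def _json_post(path: str, payload: dict) -> Request:
    """Build a POST request carrying a JSON body."""
    return Request(
        f"{BASE_URL}{path}",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def store_request(content: str, tags: list[str]) -> Request:
    # POST /api/memories
    return _json_post("/api/memories", {"content": content, "tags": tags})


def search_request(query: str) -> Request:
    # POST /api/search; the "query" field name is an assumption.
    return _json_post("/api/search", {"query": query})


def memory_request(content_hash: str, method: str = "GET") -> Request:
    # GET or DELETE /api/memories/{hash}; the hash is percent-encoded to be safe.
    if method not in {"GET", "DELETE"}:
        raise ValueError("method must be GET or DELETE")
    return Request(f"{BASE_URL}/api/memories/{quote(content_hash, safe='')}", method=method)


req = store_request("Chose SQLite-vec for local storage", ["decision"])
print(req.get_method(), req.full_url)
```

Passing any of these objects to `urllib.request.urlopen` would perform the call; keeping construction separate from sending makes the requests easy to inspect and test.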

Response Truncation for Agent Context

To prevent context overflow in agent prompts, the response limiter truncates oversized result sets and reports how much was dropped:

[!] RESPONSE TRUNCATED: Showing 5 of 20 results
(1500 of 8000 chars).
15 result(s) omitted to prevent context overflow.
Use specific queries or hash-based retrieval for full content.

Sources: src/mcp_memory_service/server/utils/response_limiter.py:30-60
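
One way to produce a notice of this shape is to keep whole results until a character budget is spent. The function below is a minimal sketch under that assumption; `truncate_results` and `max_chars` are hypothetical names, and the real limiter may count or cut differently.

```python
def truncate_results(results: list[str], max_chars: int) -> tuple[list[str], str]:
    """Keep whole results, in order, until the character budget is spent."""
    kept: list[str] = []
    used = 0
    for text in results:
        if used + len(text) > max_chars:
            break
        kept.append(text)
        used += len(text)

    omitted = len(results) - len(kept)
    if omitted == 0:
        return kept, ""  # nothing dropped, so no notice is needed

    total_chars = sum(len(t) for t in results)
    notice = (
        f"[!] RESPONSE TRUNCATED: Showing {len(kept)} of {len(results)} results\n"
        f"({used} of {total_chars} chars).\n"
        f"{omitted} result(s) omitted to prevent context overflow.\n"
        "Use specific queries or hash-based retrieval for full content."
    )
    return kept, notice


memories = ["a" * 400] * 20  # 20 results, 8000 characters in total
kept, notice = truncate_results(memories, max_chars=2000)
print(notice)
```

Cutting at result boundaries keeps every returned memory intact, at the cost of sometimes leaving part of the budget unused.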

Claude Code Integration

The MCP Memory Service includes specialized hooks for Claude Code CLI integration:

Available Claude Commands

Command	Purpose
`/memory-recall`	Time-based memory retrieval using natural language
`/memory-search`	Tag and content search
`/memory-context`	Session context integration with machine source ID
`/memory-health`	Service health diagnostics

Sources: claude_commands/README.md

Session Hook Workflow

graph LR
    A[Session Start] --> B[Load Relevant Memories]
    B --> C[Inject into Context]
    
    D[During Session] --> E[Track Decisions]
    E --> F[Store Key Insights]
    
    G[Session End] --> H[Save Decisions]
    H --> I[Archive Insights]
    I --> J[Persistent Storage]

Performance Characteristics for Agents

Metric	Value	Notes
First Call Latency	~50ms	Includes storage initialization
Subsequent Calls	~5-10ms	Connection reused
Memory Overhead	<10MB	Per agent instance
Embedding Model	all-MiniLM-L6-v2	384-dimensional vectors
Cost (10 users)	$16.43/year	At $0.15/1M tokens

Sources: src/mcp_memory_service/api/__init__.py:10-20
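
The cost row can be back-solved to see what usage it implies. The arithmetic below uses only the figures from the table; the float32 storage size for a 384-dimensional vector is an added assumption, since the table does not state the component type.

```python
PRICE_PER_MILLION_TOKENS = 0.15  # USD, from the table
ANNUAL_COST = 16.43              # USD for 10 users, from the table
USERS = 10

# Tokens per year that would cost ANNUAL_COST at the listed price.
tokens_per_year = ANNUAL_COST / PRICE_PER_MILLION_TOKENS * 1_000_000
tokens_per_user_per_day = tokens_per_year / USERS / 365

# Storage for one embedding, assuming 4-byte float32 components (an assumption).
EMBEDDING_DIMS = 384
bytes_per_vector = EMBEDDING_DIMS * 4

print(f"{tokens_per_year:,.0f} tokens per year")
print(f"{tokens_per_user_per_day:,.0f} tokens per user per day")
print(f"{bytes_per_vector} bytes per embedding")
```

The figure therefore corresponds to roughly 110 million tokens per year, or about 30,000 tokens per user per day.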

Installation and Setup

Automatic Installation

# Install with commands (detects Claude Code CLI)
python scripts/installation/install.py

# Force install commands
python scripts/installation/install.py --install-claude-commands

Manual HTTP Integration

For custom agent frameworks:

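import httpx

async def store_agent_memory(content: str, tags: list[str]) -> dict:
    """Store one memory over the REST API and return the decoded JSON response."""
    # Replace the placeholder host and port with your own endpoint.
    async with httpx.AsyncClient(timeout=10.0) as client:
        response = await client.post(
            "https://your-endpoint:8443/api/memories",
            json={"content": content, "tags": tags},
        )
        # Raise on 4xx/5xx instead of returning an error payload as if it were a memory.
        response.raise_for_status()
        return response.json()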

Sources: docs/agents/http-generic.md

Troubleshooting

Issue	Solution
Hooks not detected	Confirm the settings file exists with `ls ~/.claude/settings.json`, then reinstall
JSON parse errors	Update to the latest version, which includes Python dict conversion
Connection failed	Test the endpoint with `curl -k https://your-endpoint:8443/api/health`
Wrong directory	Move `~/.claude-code/hooks/*` to `~/.claude/hooks/`

Sources: claude-hooks/README.md
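
The "Wrong directory" fix can be scripted. The helper below is a hypothetical sketch, not something shipped with the project: it moves entries from an old hooks directory into the new one and skips anything that already exists at the destination. The demonstration runs against a temporary directory so nothing in the real home directory is touched.

```python
import shutil
import tempfile
from pathlib import Path


def migrate_hooks(old_dir: Path, new_dir: Path) -> list[str]:
    """Move every entry from old_dir into new_dir; return the names that moved."""
    if not old_dir.is_dir():
        return []
    new_dir.mkdir(parents=True, exist_ok=True)
    moved: list[str] = []
    for entry in sorted(old_dir.iterdir()):
        target = new_dir / entry.name
        if target.exists():
            continue  # never overwrite a hook that is already in place
        shutil.move(str(entry), str(target))
        moved.append(entry.name)
    return moved


# Demonstration in a throwaway directory tree.
with tempfile.TemporaryDirectory() as tmp:
    old = Path(tmp) / ".claude-code" / "hooks"
    new = Path(tmp) / ".claude" / "hooks"
    old.mkdir(parents=True)
    (old / "session-end.js").write_text("// hook body\n")
    moved = migrate_hooks(old, new)
    moved_ok = (new / "session-end.js").is_file()

print(moved, moved_ok)
```

To apply it for real, call `migrate_hooks(Path.home() / ".claude-code" / "hooks", Path.home() / ".claude" / "hooks")` after backing up both directories.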

Sources: src/mcp_memory_service/api/__init__.py:1-50

Doramagic Pitfall Log

Doramagic extracted 16 source-linked risk signals. Review them before installing or handing real data to the project.

1. Security or permission risk: [Bug]: hardcoded port in memory-client.js, breaking HTTP/HTTPS tunnels (e.g., Cloudflare)

Severity: high
Finding: Security or permission risk is backed by a source signal: [Bug]: hardcoded port in memory-client.js, breaking HTTP/HTTPS tunnels (e.g., Cloudflare). Treat it as a review item until the current version is checked.
User impact: The project may affect permissions, credentials, data exposure, or host boundaries.
Recommended check: Open the linked source, confirm whether it still applies to the current version, and keep the first run isolated.
Evidence: Source-linked evidence: https://github.com/doobidoo/mcp-memory-service/issues/950

2. Installation risk: Developers should check this installation risk before relying on the project: [Bug]: hardcoded port in memory-client.js, breaking HTTP/HTTPS tunnels (e.g., Cloudflare)

Severity: medium
Finding: Developers should check this installation risk before relying on the project: [Bug]: hardcoded port in memory-client.js, breaking HTTP/HTTPS tunnels (e.g., Cloudflare)
User impact: Developers may fail before the first successful local run: [Bug]: hardcoded port in memory-client.js, breaking HTTP/HTTPS tunnels (e.g., Cloudflare)
Recommended check: Before packaging this project, run the relevant install/config/quickstart check for: [Bug]: hardcoded port in memory-client.js, breaking HTTP/HTTPS tunnels (e.g., Cloudflare). Context: Observed when using node, python, docker, macos
Evidence: failure_mode_cluster:github_issue | fmev_dd89642370c2dba2d6aacf12756658a6 | https://github.com/doobidoo/mcp-memory-service/issues/950 | [Bug]: hardcoded port in memory-client.js, breaking HTTP/HTTPS tunnels (e.g., Cloudflare)

3. Installation risk: Developers should check this installation risk before relying on the project: chore(milvus): track optional BaseStorage overrides + test coverage gaps

Severity: medium
Finding: Developers should check this installation risk before relying on the project: chore(milvus): track optional BaseStorage overrides + test coverage gaps
User impact: Developers may fail before the first successful local run: chore(milvus): track optional BaseStorage overrides + test coverage gaps
Recommended check: Before packaging this project, run the relevant install/config/quickstart check for: chore(milvus): track optional BaseStorage overrides + test coverage gaps. Context: Observed when using docker
Evidence: failure_mode_cluster:github_issue | fmev_74209176888c160a35483f3156117496 | https://github.com/doobidoo/mcp-memory-service/issues/888 | chore(milvus): track optional BaseStorage overrides + test coverage gaps

4. Installation risk: Developers should check this installation risk before relying on the project: fix(hooks): PR #952 missed `core/session-end.js` — same Cloudflare Tunnel port-fallback bug

Severity: medium
Finding: Developers should check this installation risk before relying on the project: fix(hooks): PR #952 missed core/session-end.js — same Cloudflare Tunnel port-fallback bug
User impact: Developers may fail before the first successful local run: fix(hooks): PR #952 missed core/session-end.js — same Cloudflare Tunnel port-fallback bug
Recommended check: Before packaging this project, run the relevant install/config/quickstart check for: fix(hooks): PR #952 missed core/session-end.js — same Cloudflare Tunnel port-fallback bug. Context: Observed when using windows
Evidence: failure_mode_cluster:github_issue | fmev_b14a35b730602b08a29e3abbdfa0c377 | https://github.com/doobidoo/mcp-memory-service/issues/957 | fix(hooks): PR #952 missed core/session-end.js — same Cloudflare Tunnel port-fallback bug

5. Installation risk: Developers should check this installation risk before relying on the project: v10.59.0 — OAuth PEM key files, IDE redirect URI schemes, memory-scorer affinity fix

Severity: medium
Finding: Developers should check this installation risk before relying on the project: v10.59.0 — OAuth PEM key files, IDE redirect URI schemes, memory-scorer affinity fix
User impact: Upgrade or migration may change expected behavior: v10.59.0 — OAuth PEM key files, IDE redirect URI schemes, memory-scorer affinity fix
Recommended check: Before packaging this project, run the relevant install/config/quickstart check for: v10.59.0 — OAuth PEM key files, IDE redirect URI schemes, memory-scorer affinity fix. Context: Observed when using python
Evidence: failure_mode_cluster:github_release | fmev_d0ce94252816336aa4ecbd45eeb73603 | https://github.com/doobidoo/mcp-memory-service/releases/tag/v10.59.0 | v10.59.0 — OAuth PEM key files, IDE redirect URI schemes, memory-scorer affinity fix

6. Installation risk: Developers should check this installation risk before relying on the project: v10.59.1 — OAuth state parameter RFC 6749 compliance fix

Severity: medium
Finding: Developers should check this installation risk before relying on the project: v10.59.1 — OAuth state parameter RFC 6749 compliance fix
User impact: Upgrade or migration may change expected behavior: v10.59.1 — OAuth state parameter RFC 6749 compliance fix
Recommended check: Before packaging this project, run the relevant install/config/quickstart check for: v10.59.1 — OAuth state parameter RFC 6749 compliance fix. Context: Observed when using python
Evidence: failure_mode_cluster:github_release | fmev_a0594e0fe855897f4612a17f520e81d4 | https://github.com/doobidoo/mcp-memory-service/releases/tag/v10.59.1 | v10.59.1 — OAuth state parameter RFC 6749 compliance fix

7. Installation risk: Developers should check this installation risk before relying on the project: v10.60.2 — fix(milvus): brute-force query() for semantic dedup growing-segment visibility

Severity: medium
Finding: Developers should check this installation risk before relying on the project: v10.60.2 — fix(milvus): brute-force query() for semantic dedup growing-segment visibility
User impact: Upgrade or migration may change expected behavior: v10.60.2 — fix(milvus): brute-force query() for semantic dedup growing-segment visibility
Recommended check: Before packaging this project, run the relevant install/config/quickstart check for: v10.60.2 — fix(milvus): brute-force query() for semantic dedup growing-segment visibility. Context: Observed when using python
Evidence: failure_mode_cluster:github_release | fmev_8a1390fb930fb3d5c55aee894e53c0e3 | https://github.com/doobidoo/mcp-memory-service/releases/tag/v10.60.2 | v10.60.2 — fix(milvus): brute-force query() for semantic dedup growing-segment visibility

8. Installation risk: Developers should check this installation risk before relying on the project: v10.63.0 — Milvus Issue #888 Complete + Kiro CLI Harvest Fix

Severity: medium
Finding: Developers should check this installation risk before relying on the project: v10.63.0 — Milvus Issue #888 Complete + Kiro CLI Harvest Fix
User impact: Upgrade or migration may change expected behavior: v10.63.0 — Milvus Issue #888 Complete + Kiro CLI Harvest Fix
Recommended check: Before packaging this project, run the relevant install/config/quickstart check for: v10.63.0 — Milvus Issue #888 Complete + Kiro CLI Harvest Fix. Context: Observed when using python
Evidence: failure_mode_cluster:github_release | fmev_cce458b1322f9ccb8db2498d3499650d | https://github.com/doobidoo/mcp-memory-service/releases/tag/v10.63.0 | v10.63.0 — Milvus Issue #888 Complete + Kiro CLI Harvest Fix

9. Installation risk: Quality trends endpoint AttributeError on sqlite_vec backend: 'SqliteVecMemoryStorage' object has no attribute 'search_…

Severity: medium
Finding: Installation risk is backed by a source signal: Quality trends endpoint AttributeError on sqlite_vec backend: 'SqliteVecMemoryStorage' object has no attribute 'search_…. Treat it as a review item until the current version is checked.
User impact: First-time setup may fail or require extra isolation and rollback planning.
Recommended check: Open the linked source, confirm whether it still applies to the current version, and keep the first run isolated.
Evidence: Source-linked evidence: https://github.com/doobidoo/mcp-memory-service/issues/981

10. Installation risk: chore(milvus): track optional BaseStorage overrides + test coverage gaps

Severity: medium
Finding: Installation risk is backed by a source signal: chore(milvus): track optional BaseStorage overrides + test coverage gaps. Treat it as a review item until the current version is checked.
User impact: First-time setup may fail or require extra isolation and rollback planning.
Recommended check: Open the linked source, confirm whether it still applies to the current version, and keep the first run isolated.
Evidence: Source-linked evidence: https://github.com/doobidoo/mcp-memory-service/issues/888

11. Installation risk: fix(hooks): PR #952 missed `core/session-end.js` — same Cloudflare Tunnel port-fallback bug

Severity: medium
Finding: Installation risk is backed by a source signal: fix(hooks): PR #952 missed core/session-end.js — same Cloudflare Tunnel port-fallback bug. Treat it as a review item until the current version is checked.
User impact: First-time setup may fail or require extra isolation and rollback planning.
Recommended check: Open the linked source, confirm whether it still applies to the current version, and keep the first run isolated.
Evidence: Source-linked evidence: https://github.com/doobidoo/mcp-memory-service/issues/957

12. Installation risk: fix(milvus): test_semantic_dedup_blocks_near_duplicate still fails after consistency_level=Session fix

Severity: medium
Finding: Installation risk is backed by a source signal: fix(milvus): test_semantic_dedup_blocks_near_duplicate still fails after consistency_level=Session fix. Treat it as a review item until the current version is checked.
User impact: First-time setup may fail or require extra isolation and rollback planning.
Recommended check: Open the linked source, confirm whether it still applies to the current version, and keep the first run isolated.
Evidence: Source-linked evidence: https://github.com/doobidoo/mcp-memory-service/issues/938

Source: Doramagic discovery, validation, and Project Pack records

Community Discussion Evidence

These external discussion links are review inputs, not standalone proof that the project is production-ready. This manual page exposes 12 project-level external discussion links. Open the linked issues or discussions before treating the pack as ready for your environment.

Doramagic exposes project-level community discussion separately from official documentation. Review these links before using mcp-memory-service with real data or production workflows.

Quality trends endpoint AttributeError on sqlite_vec backend: 'SqliteVec - github / github_issue
[[automated] Contributor activity digest](https://github.com/doobidoo/mcp-memory-service/issues/937) - github / github_issue
chore(milvus): track optional BaseStorage overrides + test coverage gaps - github / github_issue
bug(harvest): Kiro CLI parser misses 80% of messages — wrong kind mappin - github / github_issue
fix(hooks): PR #952 missed core/session-end.js — same Cloudflare Tunne - github / github_issue
fix(milvus): test_semantic_dedup_blocks_near_duplicate still fails after - github / github_issue
[[Bug]: hardcoded port in memory-client.js, breaking HTTP/HTTPS tunnels (e.g., Cloudflare)](https://github.com/doobidoo/mcp-memory-service/issues/950) - GitHub / issue
Developers should check this installation risk before relying on the project: v10.59.0 — OAuth PEM key files, IDE redirect URI schemes, memory-scorer affinity fix - GitHub / issue
Developers should check this installation risk before relying on the project: v10.59.1 — OAuth state parameter RFC 6749 compliance fix - GitHub / issue
Developers should check this installation risk before relying on the project: v10.60.2 — fix(milvus): brute-force query() for semantic dedup growing-segment visibility - GitHub / issue
Developers should check this installation risk before relying on the project: v10.63.0 — Milvus Issue #888 Complete + Kiro CLI Harvest Fix - GitHub / issue
v10.54.0 — AND/OR tag filtering for memory_search - GitHub / issue

Source: Project Pack community evidence and pitfall evidence