Doramagic Project Pack · Human Manual
mcp-memory-service
Related topics: System Architecture, Installation and Setup, Quick Start Guide
Overview and Key Concepts
The MCP Memory Service is a semantic memory storage and retrieval system designed for AI-assisted development workflows. It provides persistent memory capabilities for Claude Code and other MCP-compatible clients, enabling intelligent context retention across development sessions.
What is MCP Memory Service?
MCP Memory Service solves the context loss problem in AI-assisted development by maintaining a persistent, searchable store of development decisions, architectural choices, and project knowledge. When you work on a project, the service captures important decisions and makes them available in future sessions.
Sources: README.md
Architecture Overview
The system follows a layered architecture with clear separation between storage, API, and client integration layers.
graph TD
A[Claude Code / MCP Clients] -->|MCP Protocol| B[MCP Server Layer]
B --> C[REST API Layer]
C --> D[Service Layer]
D --> E[Storage Backend]
E -->|SQLite-vec| F[Local Vector Storage]
E -->|Cloudflare| G[D1 + Vectorize]
E -->|Hybrid| H[Combined Approach]
I[Web Dashboard] -->|HTTP| C
J[Claude Hooks] -->|Session Events| B
Supported Storage Backends
| Backend | Description | Use Case |
|---|---|---|
| SQLite-vec | Local vector storage with SQLite | Single-machine deployments |
| Cloudflare D1 + Vectorize | Cloud-hosted serverless | Multi-device, global access |
| Hybrid | Combined local and cloud | Redundancy and performance |
Sources: src/mcp_memory_service/api/__init__.py
Core Components
1. MCP Server
The MCP Server implements the Model Context Protocol, providing tools and prompts for memory operations. It handles tool execution, prompt management, and bidirectional communication with MCP clients.
Key Tools:
- store_memory: Store new memories with automatic embedding generation
- search_memories: Semantic similarity search using embeddings
- retrieve_memory: Retrieve specific memory by content hash
- delete_memory: Remove memory and associated embeddings
- list_memories: List all memories with pagination
Key Prompts:
- knowledge_retrieval: Structured memory retrieval with relevance ranking
- memory_summary: Generate summaries of stored memories
- knowledge_export: Export memories in various formats
- memory_cleanup: Identify and remove duplicates
Sources: src/mcp_memory_service/server_impl.py
2. REST API Layer
The REST API provides HTTP endpoints for direct access to memory operations, useful for integrations and the web dashboard.
graph LR
A[Memory Management] -->|POST /api/memories| B[Store]
A -->|GET /api/memories| C[List]
A -->|GET /api/memories/{hash}| D[Retrieve]
A -->|DELETE /api/memories/{hash}| E[Delete]
F[Search Operations] -->|POST /api/search| G[Semantic Search]
F -->|GET /api/search/similar/{hash}| H[Similar Search]
I[Real-time Events] -->|GET /api/events| J[SSE Stream]
I -->|GET /api/events/stats| K[Statistics]
| Endpoint | Method | Description |
|---|---|---|
| /api/memories | POST | Store a new memory with automatic embedding generation |
| /api/memories | GET | List all memories with pagination support |
| /api/memories/{hash} | GET | Retrieve a specific memory by content hash |
| /api/memories/{hash} | DELETE | Delete a memory and its embeddings |
| /api/search | POST | Semantic similarity search using embeddings |
| /api/search/similar/{hash} | GET | Find memories similar to a specific one |
| /api/events | GET | Subscribe to real-time memory events stream |
Sources: src/mcp_memory_service/web/app.py
3. Web Dashboard
The built-in web interface provides:
- Interactive API documentation (Swagger UI and ReDoc)
- Real-time statistics display
- SSE testing interface
- Health monitoring
Sources: src/mcp_memory_service/web/app.py
Key Features
Semantic Search with Embeddings
The service uses the all-MiniLM-L6-v2 embedding model to convert memory content into vector representations. This enables semantic similarity search that understands meaning rather than just keyword matching.
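To make the idea of similarity search concrete, here is a minimal sketch of ranking memories by cosine similarity. It assumes the embeddings are already computed and uses toy 3-dimensional vectors (real all-MiniLM-L6-v2 vectors have 384 dimensions). The function names and the choice of cosine similarity as the distance metric are illustrative assumptions, not the service's confirmed implementation.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def rank_by_similarity(query_vec, memories, limit=5):
    """Return up to `limit` (content, score) pairs, most similar first."""
    scored = [(content, cosine_similarity(query_vec, vec)) for content, vec in memories]
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:limit]


# Hypothetical toy data: (content, embedding) pairs.
toy_memories = [
    ("Use SQLite-vec for local storage", [0.9, 0.1, 0.0]),
    ("OAuth 2.1 login flow", [0.0, 0.2, 0.9]),
    ("Database indexing notes", [0.8, 0.3, 0.1]),
]

top = rank_by_similarity([1.0, 0.0, 0.0], toy_memories, limit=2)
for content, score in top:
    print(f"{score:.3f}  {content}")
```

Because the score depends on vector direction rather than shared words, two memories can rank as similar even when they have no keywords in common.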
Performance Characteristics:
| Metric | Value |
|---|---|
| First call latency | ~50ms (includes storage initialization) |
| Subsequent calls | ~5-10ms (connection reused) |
| Memory overhead | <10MB |
| Cost at $0.15/1M tokens | $16.43/year per 10-user deployment |
Sources: src/mcp_memory_service/api/__init__.py
Response Size Limiter
To prevent context window overflow in LLM clients, the service includes a response limiter that truncates large responses at memory boundaries. This ensures that large memory retrieval operations don't crash Claude or other LLM clients.
Configuration:
| Environment Variable | Default | Description |
|---|---|---|
| MCP_MAX_RESPONSE_CHARS | 0 (unlimited) | Maximum characters in responses |
Sources: src/mcp_memory_service/server/utils/response_limiter.py
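Truncating "at memory boundaries" can be sketched as follows: whole memories are kept until the next one would exceed the limit, so no memory is cut in half. This is a simplified illustration under stated assumptions. The function names, the separator, and the return shape are hypothetical and are not taken from response_limiter.py. Only the MCP_MAX_RESPONSE_CHARS variable and its "0 means unlimited" default come from the table above.

```python
def read_limit(env):
    """Read MCP_MAX_RESPONSE_CHARS from a mapping; invalid or missing values mean 0 (unlimited)."""
    raw = env.get("MCP_MAX_RESPONSE_CHARS", "0")
    try:
        return max(int(raw), 0)
    except ValueError:
        return 0


def limit_response(memories, max_chars=0, separator="\n\n"):
    """Join whole memories until the next one would exceed max_chars.

    Returns (text, included_count, truncated). max_chars <= 0 disables the limit.
    """
    if max_chars <= 0:
        return separator.join(memories), len(memories), False
    kept = []
    used = 0
    for text in memories:
        # The separator is only counted between memories, not before the first one.
        extra = len(text) + (len(separator) if kept else 0)
        if used + extra > max_chars:
            break
        kept.append(text)
        used += extra
    return separator.join(kept), len(kept), len(kept) < len(memories)


sample = ["a" * 10, "b" * 10, "c" * 10]
text, included, truncated = limit_response(sample, max_chars=25)
print(included, truncated)
```

With a 25-character limit, the first two 10-character memories fit (22 characters including the separator) and the third is dropped whole.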
Claude Code Integration
The service provides deep integration with Claude Code through hooks and slash commands.
Available Commands:
| Command | Purpose |
|---|---|
| /memory-store | Store important decisions and context |
| /memory-recall | Retrieve memories using natural language |
| /memory-search | Search by tags and content keywords |
| /memory-context | Capture current session context |
| /memory-health | Check service health and statistics |
Sources: claude_commands/README.md
Automatic Hooks:
- Session Start: Load relevant project memories when Claude Code starts
- Session End: Store insights and decisions from completed sessions
- Memory Retrieval: On-demand memory access during conversations
- Permission Requests: Automated handling of MCP permission requests
Sources: claude-hooks/README.md
Memory Types Taxonomy
The service organizes memories into a standardized taxonomy:
| Category | Types |
|---|---|
| Content Types | note, reference, document, guide |
| Activity Types | session, implementation, analysis |
This standardized taxonomy helps with memory organization and retrieval. The maintenance scripts can consolidate fragmented types into this standardized set.
Sources: scripts/maintenance/README.md
Data Model
Each memory in the system has the following structure:
{
"content": "Memory content here",
"content_hash": "sha256hash",
"tags": ["tag1", "tag2"],
"created_at": 1673545200.0,
"updated_at": 1673545200.0,
"memory_type": "note",
"metadata": {},
"export_source": "machine-name"
}
| Field | Type | Description |
|---|---|---|
| content | string | The actual memory content |
| content_hash | string | SHA256 hash of content for deduplication |
| tags | array | User-defined tags for categorization |
| created_at | float | Unix timestamp of creation |
| updated_at | float | Unix timestamp of last update |
| memory_type | string | Standardized type classification |
| metadata | object | Additional metadata storage |
| export_source | string | Source machine identifier |
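The structure above maps naturally onto a small data class. The sketch below mirrors the documented field names. The class itself, the helper name, and the choice to hash only the UTF-8 bytes of content are assumptions for illustration, so the service's real model may differ.

```python
import hashlib
import time
from dataclasses import asdict, dataclass, field


def compute_content_hash(content: str) -> str:
    """SHA-256 hex digest of the UTF-8 encoded content (assumed hashing scheme)."""
    return hashlib.sha256(content.encode("utf-8")).hexdigest()


@dataclass
class Memory:
    content: str
    tags: list = field(default_factory=list)
    memory_type: str = "note"
    metadata: dict = field(default_factory=dict)
    export_source: str = ""
    created_at: float = field(default_factory=time.time)
    updated_at: float = 0.0
    content_hash: str = ""

    def __post_init__(self):
        # Derive the hash and the initial update time when they are not supplied.
        if not self.content_hash:
            self.content_hash = compute_content_hash(self.content)
        if not self.updated_at:
            self.updated_at = self.created_at


first = Memory(content="abc", tags=["tag1"])
second = Memory(content="abc", tags=["other"])
print(first.content_hash == second.content_hash)  # identical content, identical hash
record = asdict(first)  # plain dict, ready for json.dumps
```

Since the hash depends only on content in this sketch, two memories with the same text but different tags would be treated as duplicates.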
Synchronization and Backup
Export/Import
Memories can be exported and imported in JSON format for backup and migration purposes:
- Export: ~1000 memories/second
- Import: ~500 memories/second with deduplication
- File Size: ~1KB per memory
Litestream Integration
For real-time replication, the service supports Litestream configuration, providing continuous backup to object storage.
Sources: src/mcp_memory_service/sync/litestream_config.py
Maintenance Tools
The service includes several maintenance scripts:
| Script | Purpose |
|---|---|
| check_memory_types.py | Analyze type distribution and fragmentation |
| consolidate_memory_types.py | Consolidate fragmented types into standardized taxonomy |
| export_memories.py | Export memories to JSON format |
| import_memories.py | Import memories from JSON with deduplication |
Sources: scripts/maintenance/README.md
Installation Methods
Python Package Installation
pip install mcp-memory-service
Claude Code Integration
cd claude-hooks
python install_hooks.py --natural-triggers
Sources: claude-hooks/README.md
Summary
MCP Memory Service maintains persistent, searchable memory across AI-assisted development sessions. Its layered architecture supports multiple storage backends, and its Claude Code hooks and slash commands bring stored context into everyday workflows. Response limiting, content-hash deduplication, and maintenance tools keep the memory store reliable and organized.
Sources: README.md
Installation and Setup
Related topics: Quick Start Guide
Overview
The MCP Memory Service provides a comprehensive installation framework supporting multiple deployment scenarios including pip installation, Claude Code integration, OpenCode plugin support, and standalone HTTP server deployment. The setup system is designed to be modular, allowing users to install only the components they need while maintaining cross-platform compatibility across Windows, macOS, and Linux environments.
System Requirements
Prerequisites
The installation system requires several external dependencies that must be present before deployment:
| Dependency | Purpose | Platform-Specific Notes |
|---|---|---|
| Python 3.10+ | Runtime environment | Required on all platforms |
| Node.js | Hooks execution | Required for Claude Code hooks |
| jq | Status line features | macOS: brew install jq; Linux: sudo apt install jq; Windows: choco install jq |
| pip | Package management | Included with Python 3.10+ |
The dependency checking system (src/mcp_memory_service/dependency_check.py) validates all required dependencies during initialization and provides clear guidance if any are missing.
Environment Requirements
The service requires specific directory structures and environment variables:
# Standard data directory
~/.local/share/mcp-memory/
# Database file
~/.local/share/mcp-memory/sqlite_vec.db
# Configuration directory
~/.claude/hooks/config.json # For Claude Code integration
Installation Methods
Method 1: Pip Installation
The primary installation method uses pip to install the mcp-memory-service package:
pip install mcp-memory-service
The pyproject.toml file defines all package dependencies and metadata for PyPI distribution. This installation provides:
- Core memory service library
- HTTP server implementation
- API endpoints
- CLI tools
Sources: pyproject.toml
Method 2: Standalone HTTP Server Installation
For HTTP server deployment, the installation script provides a guided setup process:
# From the repository root
python scripts/installation/install.py
# With specific options
python scripts/installation/install.py --install-claude-commands # Install Claude commands
python scripts/installation/install.py --skip-claude-commands-prompt # Skip command prompt
The installer performs the following operations:
- Validates system prerequisites
- Installs the mcp-memory-service package
- Creates required directories
- Configures environment variables
- Sets up systemd services (on Linux)
- Optionally installs Claude Code commands
Sources: scripts/installation/install.py
Method 3: Claude Code Hooks Installation
Claude Code hooks provide automatic memory awareness and context injection. The unified installer supports multiple installation modes:
cd claude-hooks
# Install Natural Memory Triggers (recommended)
python install_hooks.py --natural-triggers
# OR install basic memory awareness hooks
python install_hooks.py --basic
The hooks system consists of several components:
| Component | File | Purpose |
|---|---|---|
| Session Start Hook | session-start.js | Loads relevant memories when Claude Code starts |
| Session End Hook | session-end.js | Stores session insights and decisions |
| Memory Retrieval | memory-retrieval.js | On-demand memory retrieval |
| Permission Request | permission-request.js | MCP server permission automation |
Sources: claude-hooks/README.md
Method 4: Claude Commands Installation
For Claude Code CLI integration, custom commands can be installed:
# Automatic installation during main setup
python scripts/installation/install.py --install-claude-commands
# Manual installation
python scripts/claude_commands_utils.py
# Test prerequisites
python scripts/claude_commands_utils.py --test
# Uninstall
python scripts/claude_commands_utils.py --uninstall
Commands are installed to ~/.claude/commands/ and provide:
- /memory-store: Save memories with tags
- /memory-recall: Time-based memory retrieval
- /memory-search: Tag and content search
- /memory-context: Session context integration
- /memory-health: Service health check
Sources: claude_commands/README.md
Method 5: OpenCode Plugin Installation
The OpenCode plugin provides memory awareness for the OpenCode editor:
# Install plugin file
mkdir -p ~/.config/opencode/plugins
cp opencode/memory-plugin.js ~/.config/opencode/plugins/
# Install example configuration
cp opencode/memory-plugin.config.example.json ~/.config/opencode/memory-plugin.json
Sources: opencode/README.md
Environment Configuration
Environment Variables
The .env.example file provides configuration templates:
# MCP Memory Service Configuration
MCP_API_KEY=your-api-key-here
MCP_MEMORY_ENDPOINT=https://localhost:8443
MCP_MEMORY_TIMEOUT_MS=30000
MCP_MEMORY_LOAD_TIMEOUT_MS=10000
Configuration Hierarchy
The OpenCode plugin demonstrates the configuration precedence system:
graph TD
A[Config Options] --> B[Environment Variables]
A --> C[Default Config File]
B --> D[OPENCODE_MEMORY_ENDPOINT]
B --> E[OPENCODE_MEMORY_API_KEY]
B --> F[OPENCODE_MEMORY_TIMEOUT_MS]
C --> G[~/.config/opencode/memory-plugin.json]
Configuration order of precedence (highest to lowest):
- Explicit plugin options
- Environment variables
- User config file (~/.config/opencode/memory-plugin.json)
- Project-local config (.opencode/memory-plugin.json)
Sources: opencode/README.md
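The precedence order can be sketched as a layered merge in which higher-priority sources overwrite lower ones. This is a Python illustration of the idea only (the plugin itself is a JavaScript file). The option keys endpoint, apiKey, and timeoutMs and both function names are hypothetical. The environment variable names and file locations are the ones listed above.

```python
import json
import os
from pathlib import Path

# Hypothetical option keys mapped to the documented environment variables.
ENV_KEYS = {
    "endpoint": "OPENCODE_MEMORY_ENDPOINT",
    "apiKey": "OPENCODE_MEMORY_API_KEY",
    "timeoutMs": "OPENCODE_MEMORY_TIMEOUT_MS",
}


def load_json_file(path):
    """Return the parsed JSON object at `path`, or {} if it is missing or invalid."""
    try:
        return json.loads(Path(path).expanduser().read_text(encoding="utf-8"))
    except (OSError, ValueError):
        return {}


def resolve_config(options=None, env=None, user_config=None, project_config=None):
    """Merge config layers; later layers in the loop win (lowest priority first)."""
    env = os.environ if env is None else env
    env_values = {key: env[name] for key, name in ENV_KEYS.items() if name in env}
    merged = {}
    for layer in (project_config or {}, user_config or {}, env_values, options or {}):
        merged.update(layer)
    return merged


config = resolve_config(
    options={"timeoutMs": "9000"},
    env={"OPENCODE_MEMORY_ENDPOINT": "https://env-host:8443"},
    user_config={"endpoint": "https://user-host:8443", "timeoutMs": "1000"},
    project_config={"endpoint": "https://project-host:8443", "apiKey": "project-key"},
)
print(config)
```

In real use, user_config and project_config would come from load_json_file("~/.config/opencode/memory-plugin.json") and load_json_file(".opencode/memory-plugin.json").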
HTTP Server Deployment
Service Setup on Linux/macOS
The Litestream configuration system supports streaming SQLite replication for data durability:
# Install Litestream
curl -LsS https://github.com/benbjohnson/litestream/releases/latest/download/litestream-linux-amd64.tar.gz | tar -xzf -
sudo mv litestream /usr/local/bin/
# Generate configuration
python scripts/sync/litestream_config.py
Systemd Service Configuration
Production deployment uses systemd for service management:
# Start the service
systemctl --user start mcp-memory-http.service
# Enable on boot
systemctl --user enable mcp-memory-http.service
# Check status
systemctl --user status mcp-memory-http.service
Sources: src/mcp_memory_service/sync/litestream_config.py
macOS Service Setup
For macOS deployment, a launchd LaunchAgent plist configuration is provided:
<dict>
<key>Label</key>
<string>com.mcp-memory-service</string>
<key>ProgramArguments</key>
<array>
<string>/local/bin/litestream</string>
<string>replicate</string>
<string>-config</string>
<string>{config_path}</string>
</array>
<key>RunAtLoad</key>
<true/>
</dict>
Sources: src/mcp_memory_service/sync/litestream_config.py
Data Synchronization
Database Export/Import
For cross-machine synchronization, the sync scripts provide export and import functionality:
# Export memories to JSON
python scripts/sync/export_memories.py --output ./memories_export.json
# Import memories from JSON
python scripts/sync/import_memories.py --input ./memories_export.json
The export format preserves all memory metadata:
{
"export_metadata": {
"source_machine": "machine-name",
"export_timestamp": "2025-08-12T10:30:00Z",
"total_memories": 450,
"database_path": "/path/to/sqlite_vec.db"
},
"memories": [
{
"content": "Memory content",
"content_hash": "sha256hash",
"tags": ["tag1", "tag2"],
"created_at": 1673545200.0,
"memory_type": "note"
}
]
}
Deduplication is based on content hash during import.
Sources: scripts/sync/README.md
Maintenance Procedures
Database Type Consolidation
Over time, memory types may become fragmented. The consolidation script standardizes the taxonomy:
# Preview changes (safe, read-only)
python scripts/maintenance/consolidate_memory_types.py --dry-run
# Execute consolidation
python scripts/maintenance/consolidate_memory_types.py
# With custom mapping
python scripts/maintenance/consolidate_memory_types.py --config custom_mappings.json
The standard 24-type taxonomy includes:
| Category | Types |
|---|---|
| Content Types | note, reference, document, guide |
| Activity Types | session, implementation, analysis |
Sources: scripts/maintenance/README.md
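The dry-run versus execute distinction can be sketched as a function that always reports what would change but rewrites types only when dry_run is off. The mapping entries and function name below are hypothetical examples, and the actual format expected by --config is not documented here. The standard types listed are only those shown in the table above.

```python
from collections import Counter

# Types taken from the table above (a subset of the full taxonomy).
STANDARD_TYPES = {"note", "reference", "document", "guide", "session", "implementation", "analysis"}

# Hypothetical mapping from fragmented names to standard ones.
EXAMPLE_MAPPING = {"notes": "note", "docs": "document", "impl": "implementation"}


def plan_consolidation(memory_types, mapping, dry_run=True):
    """Return (resulting_types, changes).

    `changes` counts each (old, new) rename. With dry_run=True the input
    types are returned unchanged, so the call is safe to use as a preview.
    """
    changes = Counter()
    result = []
    for original in memory_types:
        normalized = original.strip().lower()
        target = mapping.get(normalized, normalized)
        if target != original:
            changes[(original, target)] += 1
        result.append(original if dry_run else target)
    return result, changes


stored = ["notes", "note", "Docs", "impl", "custom"]
preview, planned = plan_consolidation(stored, EXAMPLE_MAPPING, dry_run=True)
applied, _ = plan_consolidation(stored, EXAMPLE_MAPPING, dry_run=False)
unmapped = sorted(set(applied) - STANDARD_TYPES)  # types that still need a mapping entry
print(planned, unmapped)
```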
Verification
After installation, verify the setup with these checks:
# Verify Claude hooks installation
claude --debug hooks
# Run integration tests
cd ~/.claude/hooks && node tests/integration-test.js
# Test API health
curl -k https://your-endpoint:8443/api/health
# Verify database
sqlite3 ~/.local/share/mcp-memory/sqlite_vec.db "SELECT COUNT(*) FROM memories;"
Troubleshooting
Common Issues
| Issue | Solution |
|---|---|
| Hooks not detected | Check ls ~/.claude/settings.json; reinstall if missing |
| JSON parse errors | Update to latest version |
| Connection failed | Verify endpoint with curl -k https://your-endpoint:8443/api/health |
| Wrong directory | Move ~/.claude-code/hooks/* to ~/.claude/hooks/ |
| Missing jq | Install per platform instructions in prerequisites |
Debug Mode
Enable detailed logging for troubleshooting:
# Claude Code debug mode
claude --debug hooks
# Test individual hooks
node ~/.claude/hooks/core/session-start.js
PyPI Placeholder Packages
For users who may install the wrong package name, placeholder packages redirect to the correct installation:
# These packages emit deprecation warnings and redirect to mcp-memory-service
pip install mcp-memory
pip install memory-service
pip install agent-mem
Sources: tools/pypi-placeholders/README.md
Architecture Overview
graph TD
subgraph "Installation Methods"
A[Pip Installation] --> E[Core Service]
B[Claude Hooks] --> F[Memory Awareness]
C[Claude Commands] --> G[CLI Integration]
D[OpenCode Plugin] --> H[Editor Integration]
end
subgraph "Runtime Components"
E --> I[HTTP Server]
I --> J[API Endpoints]
J --> K[Memory Storage]
K --> L[sqlite_vec]
end
subgraph "Data Layer"
L --> M[Local Storage]
L --> N[Litestream Sync]
end
Sources: pyproject.toml
Quick Start Guide
Related topics: Overview and Key Concepts, REST API Reference, Agent Framework Integration
Overview
The MCP Memory Service Quick Start Guide provides developers and users with a streamlined path to deploy and begin using the persistent semantic memory system for AI agents. This guide covers installation methods, initial configuration, core functionality verification, and essential commands for day-to-day operations.
The service serves as a centralized memory backend that stores, retrieves, and manages semantic memories with automatic embedding generation. It supports multiple deployment configurations including local SQLite-vec storage, Cloudflare D1 + Vectorize for cloud deployments, and hybrid configurations for distributed workflows across multiple machines.
Prerequisites
Before beginning the installation, ensure your development environment meets the following requirements:
| Requirement | Minimum Version | Notes |
|---|---|---|
| Python | 3.10+ | Required for core service |
| Node.js | 16+ | Required for hooks execution |
| SQLite | 3.35+ | Bundled with Python |
| jq | 1.6+ | Required for statusLine feature |
| Claude Code CLI | Latest | Optional, for Claude command integration |
Installing jq (Required for StatusLine Feature)
# macOS
brew install jq
# Linux (Ubuntu/Debian)
sudo apt install jq
# Windows
choco install jq
Installation Methods
Automatic Installation (Recommended)
The recommended approach uses the unified installer which handles all components:
cd claude-hooks
python install_hooks.py
The installer supports multiple installation modes:
| Mode | Command | Description |
|---|---|---|
| Full Installation | python install_hooks.py | Installs all features including basic hooks and natural triggers |
| Basic Only | python install_hooks.py --basic | Installs memory hooks only |
| Natural Triggers | python install_hooks.py --natural-triggers | Installs natural memory triggers only |
MCP Memory Service Installation
For the core memory service with HTTP server:
# Install with commands (will prompt if Claude Code CLI is detected)
python scripts/installation/install.py
# Force install commands
python scripts/installation/install.py --install-claude-commands
# Skip command installation prompt
python scripts/installation/install.py --skip-claude-commands-prompt
Initial Configuration
Claude Desktop Configuration
To integrate MCP Memory Service with Claude Desktop, create or modify the Claude Desktop configuration file. The configuration path varies by operating system:
- macOS: ~/Library/Application Support/Claude/claude_desktop_config.json
- Windows: %APPDATA%\Claude\claude_desktop_config.json
Example configuration structure:
{
"mcpServers": {
"mcp-memory-service": {
"command": "python",
"args": [
"-m",
"mcp_memory_service",
"server"
],
"env": {
"MCP_MEMORY_DB_PATH": "~/.local/share/mcp-memory/sqlite_vec.db"
}
}
}
}
Environment Variables
The service recognizes several environment variables for configuration:
| Variable | Default | Description |
|---|---|---|
| MCP_MEMORY_DB_PATH | ~/.local/share/mcp-memory/sqlite_vec.db | Database storage path |
| MCP_MEMORY_BACKEND | sqlite_vec | Storage backend type |
| MCP_MEMORY_PORT | 8443 | HTTP server port |
| MCP_MEMORY_API_KEY | None | Optional API key authentication |
Backend Configuration Options
The service supports multiple backend configurations:
| Backend | Use Case | Configuration Value |
|---|---|---|
| SQLite-vec | Local development, single machine | sqlite_vec or sqlite-vec |
| Cloudflare D1 | Cloud deployment | cloudflare |
| Hybrid | Multi-machine sync | hybrid |
Sources: src/mcp_memory_service/web/app.py
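Because the SQLite-vec backend accepts two spellings, a client-side script that reads MCP_MEMORY_BACKEND may want to normalize the value before branching on it. The sketch below is one way to do that. The function name, the case-insensitive matching, and the error behavior are assumptions rather than the service's own logic.

```python
import os

# Accepted configuration values from the table above, mapped to one canonical name each.
BACKEND_ALIASES = {
    "sqlite_vec": "sqlite_vec",
    "sqlite-vec": "sqlite_vec",
    "cloudflare": "cloudflare",
    "hybrid": "hybrid",
}


def resolve_backend(env=None):
    """Return the canonical backend name from MCP_MEMORY_BACKEND (default: sqlite_vec)."""
    env = os.environ if env is None else env
    raw = env.get("MCP_MEMORY_BACKEND", "sqlite_vec")
    key = raw.strip().lower()  # tolerant matching is an assumption of this sketch
    if key not in BACKEND_ALIASES:
        allowed = ", ".join(sorted(BACKEND_ALIASES))
        raise ValueError(f"Unsupported backend {raw!r}; expected one of: {allowed}")
    return BACKEND_ALIASES[key]


print(resolve_backend({"MCP_MEMORY_BACKEND": "sqlite-vec"}))
```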
Starting the Service
HTTP Server Mode
Start the HTTP server for REST API access:
python -m mcp_memory_service server
The server provides an interactive dashboard at the root URL (/) displaying:
- Total memory count
- Active embedding model
- Server health status
- Response time metrics
MCP Server Mode
Start as an MCP server for Claude integration:
python -m mcp_memory_service mcp
Service Health Verification
Verify the installation by checking service health:
curl -k https://localhost:8443/api/health
Expected response includes:
- backend: Current storage backend (e.g., sqlite_vec)
- count: Total number of stored memories
- status: Service health status
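A script that checks the service can validate these three fields before trusting the response. The sketch below parses a sample body. The sample values (including the "healthy" status string) and the function name are hypothetical, and the real response may contain additional fields.

```python
import json

REQUIRED_FIELDS = ("status", "backend", "count")


def summarize_health(body: str) -> str:
    """Parse a health response body and return a one-line summary.

    Raises KeyError if any of the documented fields is missing.
    """
    data = json.loads(body)
    missing = [name for name in REQUIRED_FIELDS if name not in data]
    if missing:
        raise KeyError(f"health response is missing: {', '.join(missing)}")
    return f"{data['status']}: {data['count']} memories on {data['backend']}"


# Hypothetical response body for illustration.
sample_body = '{"status": "healthy", "backend": "sqlite_vec", "count": 450}'
summary = summarize_health(sample_body)
print(summary)
```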
Core API Endpoints
The service exposes RESTful endpoints for memory operations:
Sources: src/mcp_memory_service/web/app.py
Memory Management Endpoints
| Method | Endpoint | Description |
|---|---|---|
| POST | /api/memories | Store a new memory with automatic embedding |
| GET | /api/memories | List all memories with pagination |
| GET | /api/memories/{hash} | Retrieve specific memory by content hash |
| DELETE | /api/memories/{hash} | Delete a memory and its embeddings |
Search Operations
| Method | Endpoint | Description |
|---|---|---|
| POST | /api/search | Semantic similarity search using embeddings |
| GET | /api/search/tags/{tag} | Find memories by specific tag |
| GET | /api/search/similar/{hash} | Find memories similar to a specific one |
Real-time Events
| Method | Endpoint | Description |
|---|---|---|
| GET | /api/events | Subscribe to real-time memory event stream (SSE) |
| GET | /api/events/stats | View SSE connection statistics |
API Documentation
Interactive API documentation is available at:
- Swagger UI: /api/docs
- ReDoc: /api/redoc
Claude Commands Integration
After installation, several slash commands become available within Claude Code:
Sources: claude_commands/README.md
Available Commands
#### /memory-store - Store New Memories
Store a new memory with automatic content hashing and embedding generation.
claude /memory-store "Implemented OAuth 2.1 authentication for improved security"
claude /memory-store "Refactored storage backend to support sqlite-vec for performance"
#### /memory-recall - Time-based Memory Retrieval
Retrieve memories using natural language time expressions.
claude /memory-recall "what did we decide about the database last week?"
claude /memory-recall "yesterday's architectural discussions"
#### /memory-search - Tag and Content Search
Search through stored memories using tags, content keywords, and semantic similarity.
claude /memory-search --tags "architecture,database"
claude /memory-search "SQLite performance optimization"
#### /memory-context - Session Context Integration
Capture the current conversation and project context as a memory.
claude /memory-context
claude /memory-context --summary "Architecture planning session"
#### /memory-health - Service Health Check
Check the health and status of the MCP Memory Service.
claude /memory-health
claude /memory-health --detailed
Claude Hooks Configuration
The hooks system provides automatic memory capture during Claude Code sessions:
Sources: claude-hooks/README.md
Hook Types
| Hook | Trigger | Action |
|---|---|---|
| session-start | Claude Code launch | Load relevant project memories |
| session-end | Claude Code exit | Store insights and decisions |
| auto-capture | Automatic capture trigger | Capture important context |
| mid-conversation | Periodic intervals | Summarize and store progress |
Configuration File
Edit ~/.claude/hooks/config.json to customize hook behavior:
{
"memoryService": {
"endpoint": "https://your-server:8443",
"apiKey": "optional-api-key"
},
"hooks": {
"verbose": true,
"showMemoryDetails": false,
"cleanMode": false,
"autoCapture": true,
"forceRemember": "#remember",
"forceSkip": "#skip",
"applyTo": ["auto-capture", "session-start", "mid-conversation", "session-end"]
}
}
Verbosity Levels
| Level | Settings | Output |
|---|---|---|
| Normal | verbose: true, others false | Essential information only |
| Detailed | showMemoryDetails: true | Includes memory scoring details |
| Clean | cleanMode: true | Minimal output, success/error only |
| Silent | verbose: false | Background operation only |
Multi-Machine Synchronization
For users working across multiple machines, the service supports database synchronization:
Sources: scripts/sync/README.md
Export/Import Workflow
graph TD
A[Machine A] -->|export| B[JSON File]
B -->|import| C[Machine B]
B -->|import| D[Machine C]
E[Central Database] -->|replicate| A
E -->|replicate| C
Export Command
python scripts/sync/export_memories.py \
--source-db ~/.local/share/mcp-memory/sqlite_vec.db \
--output ~/memories-export.json \
--machine-name "workstation-1"
Import Command
python scripts/sync/import_memories.py \
--input ~/memories-export.json \
--target-db ~/.local/share/mcp-memory/sqlite_vec.db
Deduplication
Memories are deduplicated based on content hash during import:
- Same content hash: Treated as duplicate, skipped
- Different content hash: Imported as unique memory
- Original timestamps: Preserved from source
- Source machine tags: Added automatically for tracking
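The rules above can be sketched as a single pass over the incoming export. This is an illustrative model, not the code of import_memories.py: the function name, the in-memory dictionary standing in for the database, and the source:<machine> tag format are all assumptions.

```python
def import_memories(existing, incoming, source_machine):
    """Merge `incoming` memory dicts into `existing` (a dict keyed by content_hash).

    Returns (imported_count, skipped_count). `existing` is updated in place.
    """
    imported = skipped = 0
    source_tag = f"source:{source_machine}"  # hypothetical tag format
    for memory in incoming:
        digest = memory["content_hash"]
        if digest in existing:
            skipped += 1  # same content hash: duplicate, leave the stored copy alone
            continue
        record = dict(memory)  # copy, so created_at and other fields stay as exported
        tags = list(record.get("tags", []))
        if source_tag not in tags:
            tags.append(source_tag)
        record["tags"] = tags
        existing[digest] = record
        imported += 1
    return imported, skipped


database = {"h1": {"content": "already here", "content_hash": "h1", "tags": [], "created_at": 1.0}}
export = [
    {"content": "already here", "content_hash": "h1", "tags": [], "created_at": 5.0},
    {"content": "new decision", "content_hash": "h2", "tags": ["arch"], "created_at": 2.0},
    {"content": "new decision", "content_hash": "h2", "tags": ["arch"], "created_at": 3.0},
]
counts = import_memories(database, export, "workstation-1")
print(counts)
```

Running the same import twice is harmless in this model, since every memory is already present on the second pass and is skipped.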
Litestream Configuration
For continuous database replication, configure Litestream:
dbs:
- path: ~/.local/share/mcp-memory/sqlite_vec.db
replicas:
- url: s3://your-bucket/memory-db
sync-interval: 1s
Verification
Verify Hook Installation
claude --debug hooks
Expected output: Found 1 hook matchers in settings
Run Integration Tests
cd ~/.claude/hooks
node tests/integration-test.js
Expected: All 14 integration tests pass
Test API Endpoints
# Health check (-k skips certificate verification, as needed for a self-signed localhost certificate)
curl -sk https://localhost:8443/api/health | jq
# Store a test memory
curl -k -X POST https://localhost:8443/api/memories \
-H "Content-Type: application/json" \
-d '{"content": "Test memory from quick start", "tags": ["test"]}'
# Search memories
curl -k -X POST https://localhost:8443/api/search \
-H "Content-Type: application/json" \
-d '{"query": "test", "limit": 5}'
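The same store and search calls can be prepared from Python with only the standard library. The sketch below builds the requests without sending them, so it runs without a server. The helper names are hypothetical, and the payload fields are the ones used in the curl commands above. Authentication headers are omitted because their format is not documented in this section.

```python
import json
import ssl
import urllib.request


def build_json_post(base_url, path, payload):
    """Create a POST request with a JSON body for the given API path."""
    return urllib.request.Request(
        url=base_url.rstrip("/") + path,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def store_request(base_url, content, tags=()):
    return build_json_post(base_url, "/api/memories", {"content": content, "tags": list(tags)})


def search_request(base_url, query, limit=5):
    return build_json_post(base_url, "/api/search", {"query": query, "limit": limit})


def send(request, verify_tls=True, timeout=30):
    """Send a prepared request and return the decoded JSON response.

    verify_tls=False disables certificate checks (the equivalent of curl -k);
    only use it against a local server with a self-signed certificate.
    """
    context = ssl.create_default_context()
    if not verify_tls:
        context.check_hostname = False
        context.verify_mode = ssl.CERT_NONE
    with urllib.request.urlopen(request, timeout=timeout, context=context) as response:
        return json.loads(response.read().decode("utf-8"))


store = store_request("https://localhost:8443", "Test memory from quick start", ["test"])
search = search_request("https://localhost:8443/", "test", limit=5)
print(store.get_method(), store.full_url)
```

With the server running, send(store, verify_tls=False) would submit the memory.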
Troubleshooting
Common Issues
| Issue | Solution |
|---|---|
| Hooks not detected | Verify ~/.claude/settings.json exists; reinstall if missing |
| JSON parse errors | Update to latest version with Python dict conversion |
| Connection failed | Check curl -k https://your-endpoint:8443/api/health |
| Wrong directory | Move ~/.claude-code/hooks/* to ~/.claude/hooks/ |
Debug Mode
Enable verbose debugging:
# Claude hooks debug
claude --debug hooks
# Individual hook testing
node ~/.claude/hooks/core/session-start.js
Windows-Specific Considerations
- Directory Structure: Hooks install to %USERPROFILE%\.claude\hooks\
- JSON Path Format: Use backslashes or forward slashes
- Python Executable: Ensure Python is in PATH
Performance Characteristics
The service is optimized for low-latency operations:
| Operation | First Call | Subsequent Calls |
|---|---|---|
| Search | ~50ms | ~5-10ms |
| Store | ~50ms | ~5-10ms |
| Health Check | ~5ms | ~1ms |
| Metric | Value |
|---|---|
| Memory Overhead | <10MB |
| Cost per 10-user deployment | ~$16.43/year |
Sources: src/mcp_memory_service/api/__init__.py
Next Steps
After completing this quick start guide, consider exploring:
- API Reference: Full endpoint documentation at /api/docs
- Maintenance Scripts: Database cleanup and type consolidation tools in scripts/maintenance/
- Advanced Configuration: Backend switching and hybrid deployment options
- Claude Hooks Customization: Fine-tune auto-capture behavior and verbosity levels
Sources: src/mcp_memory_service/web/app.py
System Architecture
Related topics: Storage Backends, Knowledge Graph and Entity Extraction, Memory Consolidation Engine
Overview
The MCP Memory Service is a semantic memory service built on the Model Context Protocol (MCP). It provides storage, retrieval, and search capabilities for AI-assisted workflows by maintaining a persistent vector-based memory store with automatic embedding generation.
Sources: src/mcp_memory_service/api/__init__.py:1-20
The architecture follows a layered design pattern with clear separation between the MCP protocol layer, business logic services, and storage abstraction layer.
High-Level Architecture
graph TD
subgraph "Client Layer"
Claude[Claude Code]
CLI[CLI Client]
HTTP[HTTP Client]
MCP[MCP Client]
end
subgraph "Interface Layer"
FastAPI[FastAPI Server]
MCP_Server[MCP Server]
CLI_Interface[CLI Interface]
end
subgraph "Service Layer"
Memory_Service[Memory Service]
Graph_Service[Graph Service]
Response_Limiter[Response Limiter]
end
subgraph "Storage Layer"
Factory[Storage Factory]
SQLiteVec[sqlite_vec]
Litestream[Litestream Sync]
end
Client_Layer --> Interface_Layer
Claude --> MCP_Server
CLI --> CLI_Interface
HTTP --> FastAPI
Interface_Layer --> Service_Layer
Memory_Service --> Factory
Graph_Service --> Factory
Factory --> SQLiteVec
SQLiteVec --> Litestream
Core Components
MCP Server
The MCP Server (mcp_server.py) implements the Model Context Protocol, allowing AI assistants like Claude Code to interact with the memory service directly through standardized MCP tools.
Key responsibilities:
- Expose MCP tools for memory operations
- Handle tool invocation requests from MCP clients
- Coordinate with the Memory Service for all operations
Sources: src/mcp_memory_service/mcp_server.py:1-50
Server Implementation
server_impl.py is the primary server implementation. It provides the HTTP/REST interface alongside MCP support.
Sources: src/mcp_memory_service/server_impl.py:1-30
Memory Service
The Memory Service (memory_service.py) is the core business logic layer that handles:
| Operation | Description |
|---|---|
| store() | Store new memories with automatic embedding generation |
| search() | Semantic similarity search using vector embeddings |
| recall() | Time-based memory retrieval with natural language expressions |
| delete() | Remove memories and their associated embeddings |
| list_memories() | Paginated listing of all memories |
Sources: src/mcp_memory_service/services/memory_service.py:1-60
Graph Service
The Graph Service (graph_service.py) manages relationship tracking between memories, enabling complex queries about memory connections and dependencies.
Sources: src/mcp_memory_service/services/graph_service.py:1-40
Storage Architecture
Storage Factory Pattern
The storage layer uses a factory pattern (storage/factory.py) to abstract the underlying database implementation:
graph LR
Factory[Storage Factory] -->|Creates| SQLiteVec_Storage[SQLite Vec Storage]
Factory -->|Creates| InMemory_Storage[In-Memory Storage]
SQLiteVec_Storage -->|Uses| SQLite[(SQLite with vec extension)]
InMemory_Storage -->|Uses| RAM[(In-Memory)]
Sources: src/mcp_memory_service/storage/factory.py:1-30
SQLite Vec Backend
The default storage backend uses sqlite_vec, which provides:
- Vector storage: Native support for embedding storage and similarity search
- SQLite reliability: ACID transactions, proven durability
- Low overhead: <10MB memory footprint
- Performance: ~5-10ms per call once the first call has initialized storage
Sources: src/mcp_memory_service/api/__init__.py:15-18
Litestream Synchronization
For production deployments, Litestream provides continuous database replication:
Sources: src/mcp_memory_service/sync/litestream_config.py:1-80
| Platform | Installation Command |
|---|---|
| macOS | brew install benbjohnson/litestream/litestream |
| Linux | `curl -LsS https://... \| tar -xzf -` |
| Windows | Manual download from GitHub releases |
API Architecture
REST API Endpoints
Sources: src/mcp_memory_service/web/app.py:30-80
#### Memory Management
| Method | Endpoint | Description |
|---|---|---|
| POST | /api/memories | Store a new memory with automatic embedding generation |
| GET | /api/memories | List all memories with pagination support |
| GET | /api/memories/{hash} | Retrieve a specific memory by content hash |
| DELETE | /api/memories/{hash} | Delete a memory and its embeddings |
#### Search Operations
| Method | Endpoint | Description |
|---|---|---|
| POST | /api/search | Semantic similarity search using embeddings |
| GET | /api/search/similar/{hash} | Find memories similar to a specific one |
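As a rough illustration, the store and search endpoints in the tables above could be called from Python's standard library along these lines. The base URL and the JSON field names (`content`, `tags`, `query`, `limit`) are assumptions made for this sketch, not a confirmed schema; the generated API docs are the place to check the real request bodies.

```python
import json
import urllib.request

BASE_URL = "http://localhost:8000"  # assumed address; adjust to your deployment


def build_json_post(path: str, payload: dict) -> urllib.request.Request:
    """Build (but do not send) a JSON POST request for the given API path."""
    body = json.dumps(payload).encode("utf-8")
    return urllib.request.Request(
        url=f"{BASE_URL}{path}",
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


# Field names below are illustrative guesses, not the confirmed schema.
store_req = build_json_post("/api/memories", {"content": "Chose SQLite-vec", "tags": ["decision"]})
search_req = build_json_post("/api/search", {"query": "storage decisions", "limit": 5})

# To actually send a request against a running server:
# with urllib.request.urlopen(store_req) as resp:
#     print(json.load(resp))
```

Building the request separately from sending it keeps the sketch runnable without a server and makes the payload easy to inspect.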
#### Real-time Events
| Method | Endpoint | Description |
|---|---|---|
| GET | /api/events | Subscribe to real-time memory events stream (SSE) |
| GET | /api/events/stats | View SSE connection statistics |
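A client consuming the /api/events stream has to split it into events. The sketch below parses the standard Server-Sent Events wire format (`event:` and `data:` fields, a blank line ending each event). The event name and payload in the sample are made up for illustration; the names the service actually emits are not listed in this manual.

```python
from typing import Dict, Iterable, Iterator


def parse_sse(lines: Iterable[str]) -> Iterator[Dict[str, str]]:
    """Yield one dict per SSE event from an iterable of text lines."""
    event: Dict[str, str] = {}
    data_parts = []
    for raw in lines:
        line = raw.rstrip("\r\n")
        if line == "":
            # A blank line terminates the current event.
            if event or data_parts:
                event["data"] = "\n".join(data_parts)
                yield event
            event, data_parts = {}, []
            continue
        if line.startswith(":"):
            continue  # comment / keep-alive line
        field, _, value = line.partition(":")
        value = value[1:] if value.startswith(" ") else value
        if field == "data":
            data_parts.append(value)
        else:
            event[field] = value


# Hypothetical stream content; real event names may differ.
sample = [
    ": keep-alive\n",
    "event: memory_stored\n",
    'data: {"content_hash": "abc123"}\n',
    "\n",
]
events = list(parse_sse(sample))
```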
Python API
The Python API provides programmatic access with low token overhead:
from mcp_memory_service.api import search, store, health
# Search memories (~20 tokens)
results = search("architecture decisions", limit=5)
# Store memory (~15 tokens)
content_hash = store("New memory", tags=["note", "important"])
# Health check (~5 tokens)
info = health()
Sources: src/mcp_memory_service/api/__init__.py:20-40
Response Management
Response Limiter
The response_limiter.py module prevents context window overflow by truncating responses at memory boundaries.
Sources: src/mcp_memory_service/server/utils/response_limiter.py:1-50
graph TD
Request[Large Memory Request] --> Check{Under limit?}
Check -->|Yes| Return_Full[Return all memories]
Check -->|No| Truncate[Truncate at boundary]
Truncate --> Add_Header[Add truncation header]
Add_Header --> Return_Partial[Return partial results]
Configuration
| Environment Variable | Default | Description |
|---|---|---|
| MCP_MAX_RESPONSE_CHARS | 0 (unlimited) | Maximum characters in responses |
CLI Architecture
Sources: src/mcp_memory_service/cli/main.py:1-50
The CLI provides command-line access to all memory operations:
memory server # Start HTTP server
memory health # Check service health
memory logs --lines 30 # Show recent log entries
Compatibility entry points:
- memory-server (deprecated, redirects to memory server)
Integration Points
Claude Code Hooks
The system integrates with Claude Code through hooks that provide:
- Automatic memory loading on session start
- Context injection of relevant memories
- Session insight storage on session end
Sources: claude-hooks/README.md:1-30
Claude Commands
Custom slash commands for memory operations:
| Command | Purpose |
|---|---|
| /memory-save | Save current conversation as memory |
| /memory-recall | Time-based memory retrieval |
| /memory-search | Tag and content search |
| /memory-context | Session context integration |
| /memory-health | Service health check |
Sources: claude_commands/README.md:1-40
Data Flow
sequenceDiagram
participant Client
participant API
participant MemoryService
participant Storage
participant Litestream
Client->>API: Store memory request
API->>MemoryService: Save operation
MemoryService->>MemoryService: Generate embedding
MemoryService->>Storage: Store content + vector
Storage->>Litestream: Replicate to S3/GCS
Storage-->>MemoryService: Confirmation
MemoryService-->>API: Success response
API-->>Client: Memory hash returned
Client->>API: Search request
API->>MemoryService: Query
MemoryService->>MemoryService: Generate query embedding
MemoryService->>Storage: Vector similarity search
Storage-->>MemoryService: Top K results
MemoryService-->>API: Ranked results
API-->>Client: Search results
Maintenance Operations
Memory Type Consolidation
The system supports consolidating fragmented memory types into a standardized taxonomy:
Sources: scripts/maintenance/README.md:1-50
Standard 24-Type Taxonomy:
- Content Types: note, reference, document, guide
- Activity Types: session, implementation, analysis
Sync Export/Import
Sources: scripts/sync/README.md:1-40
| Operation | Performance |
|---|---|
| Export | ~1000 memories/second |
| Import | ~500 memories/second with deduplication |
| File Size | ~1KB per memory |
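Import with deduplication can be pictured as a merge keyed by content hash. The following is a minimal sketch under stated assumptions: the export is taken to be a JSON array of objects with a `content` field, and SHA-256 stands in for whatever hashing the service really uses. Neither detail is confirmed by this manual.

```python
import hashlib
import json
from typing import Dict


def content_hash(content: str) -> str:
    """SHA-256 of the content; the service's real hashing scheme may differ."""
    return hashlib.sha256(content.encode("utf-8")).hexdigest()


def import_memories(store: Dict[str, dict], exported_json: str) -> int:
    """Merge exported memories into `store`, skipping ones already present.

    Returns the number of newly imported memories.
    """
    imported = 0
    for memory in json.loads(exported_json):
        key = content_hash(memory["content"])
        if key in store:
            continue  # duplicate content: keep the existing copy
        store[key] = memory
        imported += 1
    return imported


local: Dict[str, dict] = {}
export = json.dumps([
    {"content": "Use SQLite-vec locally", "tags": ["decision"]},
    {"content": "Use SQLite-vec locally", "tags": ["duplicate"]},
    {"content": "Replicate with Litestream", "tags": ["ops"]},
])
new_count = import_memories(local, export)
```

Here the second entry is skipped because its content matches the first, so two memories are imported.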
Performance Characteristics
| Metric | Value |
|---|---|
| First call latency | ~50ms (includes storage initialization) |
| Subsequent calls | ~5-10ms (connection reused) |
| Memory overhead | <10MB |
| Annual cost | ~$16.43/year per 10-user deployment (at $0.15/1M tokens) |
Sources: src/mcp_memory_service/api/__init__.py:12-15
Architecture Summary
The MCP Memory Service architecture is designed around three principles:
- Separation of Concerns: Clear boundaries between protocol handling, business logic, and storage
- Multiple Interfaces: Support for MCP, REST, Python API, and CLI access patterns
- Production Ready: Built-in replication, response limiting, and maintenance tools
Source: https://github.com/doobidoo/mcp-memory-service / Human Manual
Storage Backends
Related topics: System Architecture
The mcp-memory-service project implements a pluggable storage backend architecture that enables users to choose between different vector database technologies for storing and searching semantic memories. This abstraction layer decouples the core memory service logic from specific database implementations, providing flexibility in deployment scenarios.
Architecture Overview
The storage system follows a factory pattern with a unified interface. All storage implementations inherit from a common base class that defines the contract for memory operations: store, retrieve, search, and delete.
graph TD
A[Memory Service] --> B[Storage Factory]
B --> C[sqlite_vec Storage]
B --> D[Cloudflare Storage]
B --> E[Milvus Storage]
B --> F[Hybrid Storage]
C --> G[(Local SQLite DB)]
D --> H[(Cloudflare D1 + Vectorize)]
E --> I[(Milvus Collection)]
F --> J[(Cloudflare + SQLite)]
Sources: src/mcp_memory_service/storage/factory.py
Supported Storage Backends
| Backend | Description | Best For |
|---|---|---|
| sqlite_vec | Local SQLite database with vec0 extension | Single-user, offline-first, privacy-focused |
| cloudflare | Cloudflare D1 database + Vectorize API | Cloud-native, multi-device sync |
| milvus | Milvus vector database | Enterprise-scale, high-performance |
| hybrid | Cloudflare + SQLite combination | Multi-device with local backup |
Sources: docs/guides/STORAGE_BACKENDS.md
Core Interface
All storage backends implement a common interface defined through the base storage class. The interface includes:
Storage Methods
| Method | Purpose | Parameters |
|---|---|---|
| store(memory) | Store a new memory with embedding | Memory object with content, tags, metadata |
| retrieve(hash) | Retrieve memory by content hash | content_hash: str |
| search(query, limit) | Semantic search using embeddings | query: str, limit: int |
| search_by_tag(tags, match_all) | Tag-based filtering | tags: list, match_all: bool |
| list_memories(page, page_size) | Paginated memory listing | page: int, page_size: int |
| delete(hash) | Remove memory and embeddings | content_hash: str |
Sources: src/mcp_memory_service/api/operations.py
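To make the contract in the table concrete, here is a minimal sketch of an abstract base class with a toy in-memory implementation. The class names, the synchronous signatures, and the reduced method set are assumptions chosen for brevity; the project's real base class is likely asynchronous and richer, and the toy backend performs no embedding at all.

```python
from abc import ABC, abstractmethod
from dataclasses import dataclass, field
from typing import Dict, List, Optional


@dataclass
class Memory:
    content: str
    content_hash: str
    tags: List[str] = field(default_factory=list)


class BaseStorage(ABC):
    """Hypothetical contract mirroring part of the method table above."""

    @abstractmethod
    def store(self, memory: Memory) -> bool: ...

    @abstractmethod
    def retrieve(self, content_hash: str) -> Optional[Memory]: ...

    @abstractmethod
    def search_by_tag(self, tags: List[str], match_all: bool = False) -> List[Memory]: ...

    @abstractmethod
    def delete(self, content_hash: str) -> bool: ...


class InMemoryStorage(BaseStorage):
    """Toy backend: a dict keyed by content hash, no embeddings."""

    def __init__(self) -> None:
        self._items: Dict[str, Memory] = {}

    def store(self, memory: Memory) -> bool:
        if memory.content_hash in self._items:
            return False  # already stored
        self._items[memory.content_hash] = memory
        return True

    def retrieve(self, content_hash: str) -> Optional[Memory]:
        return self._items.get(content_hash)

    def search_by_tag(self, tags: List[str], match_all: bool = False) -> List[Memory]:
        wanted = set(tags)
        # match_all requires every wanted tag; otherwise any overlap is enough.
        check = wanted.issubset if match_all else (lambda have: bool(wanted & have))
        return [m for m in self._items.values() if check(set(m.tags))]

    def delete(self, content_hash: str) -> bool:
        return self._items.pop(content_hash, None) is not None


backend = InMemoryStorage()
backend.store(Memory("Adopt FastAPI", "h1", ["decision", "api"]))
backend.store(Memory("Fix hook timeout", "h2", ["bugfix", "api"]))
both = backend.search_by_tag(["decision", "api"], match_all=True)
either = backend.search_by_tag(["decision", "bugfix"])
```

Because callers only depend on the base class, a different backend can be swapped in without touching service code, which is the point of the abstraction described above.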
SQLite Vec Backend
The sqlite_vec backend is the default and most widely-used storage option. It leverages the sqlite-vec extension to enable vector similarity search directly within SQLite.
Configuration
| Environment Variable | Default | Description |
|---|---|---|
| SQLITE_VEC_DB_PATH | ~/.local/share/mcp-memory/sqlite_vec.db | Path to SQLite database file |
| EMBEDDING_MODEL_NAME | sentence-transformers/all-MiniLM-L6-v2 | Embedding model for vectorization |
Database Schema
The SQLite backend stores memories in a relational table with the following structure:
CREATE TABLE memories (
    content TEXT NOT NULL,
    content_hash TEXT PRIMARY KEY,
    tags TEXT,
    memory_type TEXT,
    metadata TEXT,
    created_at REAL,
    updated_at REAL,
    created_at_iso TEXT,
    updated_at_iso TEXT
);
CREATE VIRTUAL TABLE memories_embeddings USING vec0(
    content_hash TEXT PRIMARY KEY,
    embedding FLOAT[384]
);
Sources: docs/sqlite-vec-backend.md
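The relational half of this schema can be exercised with Python's built-in sqlite3 module, as in the sketch below. The vec0 virtual table is left out because it needs the sqlite-vec extension to be loaded. Storing tags as a comma-separated string, metadata as JSON text, and hashing with SHA-256 are assumptions for illustration only.

```python
import hashlib
import json
import sqlite3
import time

conn = sqlite3.connect(":memory:")
conn.execute(
    """
    CREATE TABLE memories (
        content TEXT NOT NULL,
        content_hash TEXT PRIMARY KEY,
        tags TEXT,
        memory_type TEXT,
        metadata TEXT,
        created_at REAL,
        updated_at REAL,
        created_at_iso TEXT,
        updated_at_iso TEXT
    )
    """
)


def insert_memory(content: str, tags: list, memory_type: str = "note") -> str:
    """Insert one row (ignored if the hash already exists) and return its hash."""
    digest = hashlib.sha256(content.encode("utf-8")).hexdigest()
    now = time.time()
    iso = time.strftime("%Y-%m-%dT%H:%M:%SZ", time.gmtime(now))
    conn.execute(
        "INSERT OR IGNORE INTO memories VALUES (?, ?, ?, ?, ?, ?, ?, ?, ?)",
        # Assumption: tags as a comma-separated string, metadata as JSON text.
        (content, digest, ",".join(tags), memory_type, json.dumps({}), now, now, iso, iso),
    )
    return digest


h = insert_memory("Switched to hybrid backend", ["decision", "storage"])
insert_memory("Switched to hybrid backend", ["decision"])  # same hash, ignored
row = conn.execute("SELECT tags FROM memories WHERE content_hash = ?", (h,)).fetchone()
count = conn.execute("SELECT COUNT(*) FROM memories").fetchone()[0]
```

Using the content hash as the primary key means storing identical content twice leaves a single row.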
Performance Characteristics
| Metric | Value |
|---|---|
| Search latency | 5-10ms |
| Store latency | 10-20ms (includes embedding) |
| Memory overhead | <10MB |
| Capacity | Limited by disk space |
Cloudflare Backend
The Cloudflare backend provides cloud-native storage using Cloudflare's D1 (SQLite-compatible) database and Vectorize (vector search) API.
Configuration
| Environment Variable | Required | Description |
|---|---|---|
| CLOUDFLARE_API_TOKEN | Yes | Cloudflare API authentication token |
| CLOUDFLARE_ACCOUNT_ID | Yes | Cloudflare account identifier |
| CLOUDFLARE_VECTORIZE_INDEX | Yes | Vectorize index name |
| CLOUDFLARE_D1_DATABASE_ID | Yes | D1 database identifier |
| CLOUDFLARE_R2_BUCKET | No | R2 bucket for large content storage |
| CLOUDFLARE_EMBEDDING_MODEL | No | Embedding model override |
| CLOUDFLARE_LARGE_CONTENT_THRESHOLD | No | Size threshold for R2 storage |
| CLOUDFLARE_MAX_RETRIES | No | Retry attempts for API calls |
| CLOUDFLARE_BASE_DELAY | No | Initial retry delay in seconds |
Initialization
storage = CloudflareStorage(
    api_token=CLOUDFLARE_API_TOKEN,
    account_id=CLOUDFLARE_ACCOUNT_ID,
    vectorize_index=CLOUDFLARE_VECTORIZE_INDEX,
    d1_database_id=CLOUDFLARE_D1_DATABASE_ID,
    r2_bucket=CLOUDFLARE_R2_BUCKET,
    embedding_model=CLOUDFLARE_EMBEDDING_MODEL,
    large_content_threshold=CLOUDFLARE_LARGE_CONTENT_THRESHOLD,
    max_retries=CLOUDFLARE_MAX_RETRIES,
    base_delay=CLOUDFLARE_BASE_DELAY
)
Sources: src/mcp_memory_service/storage/factory.py
Milvus Backend
The Milvus backend targets enterprise deployments requiring high-performance, distributed vector search capabilities.
Configuration
| Environment Variable | Required | Description |
|---|---|---|
| MILVUS_URI | Yes | Milvus server URI |
| MILVUS_TOKEN | No | Authentication token |
| MILVUS_COLLECTION_NAME | No | Collection name (default: memories) |
| EMBEDDING_MODEL_NAME | No | Embedding model |
Initialization
storage = MilvusMemoryStorage(
    uri=MILVUS_URI,
    token=MILVUS_TOKEN,
    collection_name=MILVUS_COLLECTION_NAME,
    embedding_model=EMBEDDING_MODEL_NAME,
)
Sources: docs/milvus-backend.md
Hybrid Backend
The hybrid backend combines Cloudflare storage with local SQLite-vec backup, enabling multi-device synchronization while maintaining offline capability.
Architecture
graph LR
A[Local SQLite DB] <--> B[Hybrid Storage]
B <--> C[Cloudflare D1]
B <--> D[Cloudflare Vectorize]
E[Write Operations] --> B
F[Read Operations] --> B
Configuration
The hybrid backend requires both Cloudflare and SQLite configuration:
cloudflare_config = {
    'api_token': CLOUDFLARE_API_TOKEN,
    'account_id': CLOUDFLARE_ACCOUNT_ID,
    'vectorize_index': CLOUDFLARE_VECTORIZE_INDEX,
    'd1_database_id': CLOUDFLARE_D1_DATABASE_ID,
}
storage = HybridMemoryStorage(
    cloudflare_config=cloudflare_config,
    local_db_path=LOCAL_DB_PATH,
)
Storage Factory
The get_storage() function in the factory module handles backend instantiation based on configuration:
import os
from typing import Optional
async def get_storage(backend: Optional[str] = None) -> BaseStorage:
    """Get storage instance based on configured or specified backend."""
    backend = backend or os.getenv("MCP_MEMORY_STORAGE_BACKEND", "sqlite_vec")
    if backend == "cloudflare":
        return CloudflareStorage(...)
    elif backend == "milvus":
        return MilvusMemoryStorage(...)
    elif backend == "hybrid":
        return HybridMemoryStorage(...)
    else:
        return SQLiteVecStorage(...)
Sources: src/mcp_memory_service/storage/factory.py
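One alternative way to express the same selection logic is a registry that maps backend names to constructors. The sketch below is not the project's factory: the stub classes and the `resolve_backend` name are hypothetical, and only the `MCP_MEMORY_STORAGE_BACKEND` variable and its `sqlite_vec` default come from the excerpt above.

```python
import os
from typing import Callable, Dict, Optional


class SQLiteVecStub:
    name = "sqlite_vec"


class CloudflareStub:
    name = "cloudflare"


# Hypothetical registry: backend name -> zero-argument constructor.
BACKENDS: Dict[str, Callable[[], object]] = {
    "sqlite_vec": SQLiteVecStub,
    "cloudflare": CloudflareStub,
}


def resolve_backend(backend: Optional[str] = None) -> object:
    """Pick a backend by argument, then environment variable, then default."""
    name = backend or os.getenv("MCP_MEMORY_STORAGE_BACKEND", "sqlite_vec")
    try:
        return BACKENDS[name]()
    except KeyError:
        known = ", ".join(sorted(BACKENDS))
        raise ValueError(f"Unknown storage backend {name!r}; expected one of: {known}") from None


explicit = resolve_backend("cloudflare")
```

Unlike the `else` branch in the excerpt, which falls back to SQLite for any unrecognized name, this variant raises on a typo. That is a design choice of the sketch, not a description of the service's behavior.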
HTTP Client Backend
For distributed deployments, the project supports HTTP-based storage access through an HttpStorage client. This enables communication with remote MCP Memory Service instances.
HTTP API Endpoints
| Method | Endpoint | Purpose |
|---|---|---|
| POST | /api/memories | Store a new memory |
| GET | /api/memories | List memories with pagination |
| GET | /api/memories/{hash} | Retrieve specific memory |
| DELETE | /api/memories/{hash} | Delete memory |
| POST | /api/search | Semantic search |
| GET | /api/search/similar/{hash} | Find similar memories |
| GET | /api/events | SSE event stream |
Time Filter Format
When searching by time ranges, the HTTP client converts Unix timestamps to ISO date format:
def _build_time_filter(time_start: Optional[float], time_end: Optional[float]) -> Optional[str]:
    if time_start and time_end:
        return f"between {_to_date(time_start)} and {_to_date(time_end)}"
    elif time_start:
        return _to_date(time_start)
    return _to_date(time_end)
Sources: src/mcp_memory_service/storage/http_client.py
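A self-contained variant of this conversion is sketched below. The `_to_date` helper here is hypothetical (the excerpt does not show the original), and it assumes UTC calendar dates. The sketch also checks bounds with `is not None`, so a timestamp of 0.0 still counts as a bound and the no-bounds case returns None; whether the real client behaves this way is not confirmed.

```python
from datetime import datetime, timezone
from typing import Optional


def _to_date(timestamp: float) -> str:
    """Hypothetical helper: Unix seconds -> YYYY-MM-DD in UTC."""
    return datetime.fromtimestamp(timestamp, tz=timezone.utc).strftime("%Y-%m-%d")


def build_time_filter(time_start: Optional[float], time_end: Optional[float]) -> Optional[str]:
    """Return a date expression, or None when no bound is given."""
    if time_start is not None and time_end is not None:
        return f"between {_to_date(time_start)} and {_to_date(time_end)}"
    if time_start is not None:
        return _to_date(time_start)
    if time_end is not None:
        return _to_date(time_end)
    return None


both_bounds = build_time_filter(0.0, 86400.0)
no_bounds = build_time_filter(None, None)
```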
Backend Selection
Decision Matrix
| Use Case | Recommended Backend |
|---|---|
| Single machine, privacy-sensitive | sqlite_vec |
| Multi-device with cloud sync | cloudflare |
| Enterprise with high volume | milvus |
| Local backup + cloud sync | hybrid |
Selecting Backend via CLI
When using the CLI for operations:
# Use SQLite backend (default)
memory ingest-document doc.pdf
# Use Cloudflare backend
memory ingest-document doc.pdf --storage-backend cloudflare
# Use hybrid backend
memory ingest-document doc.pdf --storage-backend hybrid
Maintenance Operations
Embedding Migration
To migrate to a different embedding model (handles dimension changes):
python scripts/maintenance/migrate_embeddings.py --url http://localhost:8000 --model new-model --dry-run
Database Repair Scripts
| Script | Purpose |
|---|---|
| repair_memories.py | Repair corrupted memory entries |
| repair_sqlite_vec_embeddings.py | Fix embedding inconsistencies |
| repair_zero_embeddings.py | Fix zero/null embeddings |
| cleanup_corrupted_encoding.py | Fix corrupted emoji encoding |
Backup Strategy
For sqlite_vec backend, Litestream provides continuous replication:
# Configure in litestream.yml
dbs:
  - path: ~/.local/share/mcp-memory/sqlite_vec.db
    replicas:
      - url: s3://your-bucket/mcp-memory/
Sources: scripts/sync/litestream/README.md
Response Limiting
All storage backends support response size limiting through the ResponseLimiter utility to prevent oversized responses:
| Parameter | Default | Description |
|---|---|---|
| max_chars | 10000 | Maximum characters in response |
| max_results | 50 | Maximum memories to return |
The limiter calculates estimated memory sizes including overhead and truncates at memory boundaries to ensure consistent response sizes.
Sources: src/mcp_memory_service/server/utils/response_limiter.py
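Truncating at memory boundaries means a memory is either returned whole or dropped, never cut mid-text. A minimal sketch of that idea follows. The function name, the fixed per-memory `overhead`, treating a non-positive `max_chars` as unlimited, and always returning at least one memory are all assumptions of this sketch rather than documented behavior of ResponseLimiter.

```python
from typing import List, Tuple


def limit_response(memories: List[str], max_chars: int = 10000,
                   max_results: int = 50, overhead: int = 20) -> Tuple[List[str], bool]:
    """Keep whole memories until a budget is hit.

    Returns (kept_memories, truncated). `overhead` is an assumed per-memory
    formatting cost; max_chars <= 0 is treated as unlimited.
    """
    kept: List[str] = []
    used = 0
    for text in memories[:max_results]:
        cost = len(text) + overhead
        if max_chars > 0 and used + cost > max_chars and kept:
            break  # stop at a memory boundary instead of cutting mid-text
        kept.append(text)
        used += cost
    return kept, len(kept) < len(memories)


kept, truncated = limit_response(["a" * 40, "b" * 40, "c" * 40], max_chars=130)
```

With a 130-character budget and an estimated cost of 60 per memory, the first two memories fit and the third is dropped, so the caller can attach a truncation notice.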
Knowledge Graph and Entity Extraction
Related topics: System Architecture, Memory Consolidation Engine, Quality Scoring System
Overview
The MCP Memory Service includes a sophisticated Knowledge Graph system that enables semantic relationships between memories, automatic entity extraction, and intelligent relationship inference. This feature transforms isolated memory entries into an interconnected knowledge network that supports advanced queries like graph traversal, entity-based retrieval, and similarity analysis.
The knowledge graph layer sits above the core memory storage, providing:
- Entity Extraction: Automatic identification of people, organizations, locations, and concepts from memory content
- Relationship Mapping: Discovery and storage of connections between memories based on semantic similarity and content analysis
- Graph Traversal: Navigation of memory relationships using configurable depth and radius parameters
- Association Memory: Automatic creation of memory entries that document discovered relationships between existing memories
- Relationship Inference: AI-powered inference of relationship types (causes, fixes, contradicts, supports, follows)
Architecture
graph TD
subgraph "Memory Layer"
M1[Memory 1]
M2[Memory 2]
M3[Memory 3]
end
subgraph "Entity Extraction"
EE[EntityExtractor]
NER[NER Processing]
PAT[Pattern Matching]
end
subgraph "Graph Storage"
GS[GraphStorage]
RI[RelationshipInference]
AM[AssociationMemory]
end
subgraph "Query Interfaces"
GT[Graph Tools]
ST[Search Tools]
end
M1 --> EE
M2 --> EE
M3 --> EE
EE --> NER
EE --> PAT
NER --> GS
PAT --> GS
GS --> RI
RI --> AM
AM --> GS
GT --> GS
ST --> GS
Core Components
| Component | File | Purpose |
|---|---|---|
| EntityExtractor | src/mcp_memory_service/reasoning/entities.py | Extracts entities from memory content |
| RelationshipInferenceEngine | src/mcp_memory_service/reasoning/inference.py | Infers relationship types between memories |
| GraphStorage | src/mcp_memory_service/storage/graph.py | SQLite-based graph storage backend |
| MilvusGraphStorage | src/mcp_memory_service/storage/milvus_graph.py | Milvus vector database backend |
| MemoryConsolidator | src/mcp_memory_service/consolidation/consolidator.py | Creates and manages association memories |
| Association | src/mcp_memory_service/models/association.py | Data model for memory associations |
Sources: src/mcp_memory_service/reasoning/entities.py:1-50
Entity Extraction
Overview
Entity extraction automatically identifies and categorizes entities within memory content. The system supports multiple entity types and uses both pattern-based and NER (Named Entity Recognition) approaches.
Sources: src/mcp_memory_service/reasoning/entities.py:1-30
Supported Entity Types
The system recognizes the following entity categories:
| Entity Type | Description | Examples |
|---|---|---|
| person | Human individuals | "John", "Alice Chen" |
| organization | Companies, teams, agencies | "Acme Corp", "Engineering Team" |
| location | Physical places | "San Francisco", "Building A" |
| concept | Abstract ideas and concepts | "machine learning", "agile methodology" |
| technology | Tools, frameworks, languages | "Python", "React", "Docker" |
| date | Temporal references | "December 2024", "Q1" |
| project | Named projects or initiatives | "Project Alpha", "Apollo Initiative" |
Extraction Process
graph LR
A[Memory Content] --> B[Pattern Matching]
A --> C[NER Processing]
B --> D[Entity Deduplication]
C --> D
D --> E[Entity Metadata]
E --> F[Store Entity Links]
The extraction process follows these steps:
- Pattern Matching: Initial scan for known patterns (email, URL, date formats, capitalization patterns)
- NER Processing: Language model-based entity recognition for contextual entity types
- Entity Deduplication: Normalization and merging of duplicate entities
- Metadata Generation: Creation of entity metadata including confidence scores
- Storage: Persisting entity links to the graph storage
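The pattern-matching and deduplication steps above can be sketched with regular expressions. The patterns, the two entity types covered, and the case-insensitive deduplication key are illustrative assumptions; the real EntityExtractor's rules and its NER stage are not reproduced here.

```python
import re
from typing import Dict, List, Tuple

# Illustrative patterns only; the real extractor's rules are not shown in this manual.
PATTERNS: Dict[str, re.Pattern] = {
    "technology": re.compile(r"\b(Python|React|Docker|FastAPI)\b", re.IGNORECASE),
    "date": re.compile(r"\b(?:Q[1-4]|(?:January|December) \d{4})\b"),
}


def extract_entities(text: str) -> List[Tuple[str, str]]:
    """Return deduplicated (name, type) pairs, grouped by entity type."""
    seen = set()
    entities: List[Tuple[str, str]] = []
    for entity_type, pattern in PATTERNS.items():
        for match in pattern.finditer(text):
            name = match.group(0)
            key = (name.lower(), entity_type)  # case-insensitive dedup key
            if key not in seen:
                seen.add(key)
                entities.append((name, entity_type))
    return entities


found = extract_entities("Migrated the python service to Docker in Q1; Python 3.12 and docker compose.")
```

The first spelling encountered is the one that is kept, so "python" and "Python" collapse into a single entity.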
MCP Tool Interface
Entities can be extracted and stored via the MCP memory_graph tool:
Sources: src/mcp_memory_service/server/handlers/graph.py:50-75
{
"action": "extract_entities",
"hash": "abc123def456"
}
Response Example:
{
"hash": "abc123def456",
"entities_extracted": 5,
"entities": [
{"name": "Python", "type": "technology", "confidence": 0.95},
{"name": "FastAPI", "type": "technology", "confidence": 0.92}
]
}
Knowledge Graph Storage
Storage Backends
The knowledge graph supports multiple storage backends:
| Backend | File | Use Case |
|---|---|---|
| SQLite Graph | src/mcp_memory_service/storage/graph.py | Default, single-node deployments |
| Milvus Graph | src/mcp_memory_service/storage/milvus_graph.py | Large-scale, distributed deployments |
Sources: src/mcp_memory_service/storage/graph.py:1-100
Graph Data Model
The graph storage maintains the following data structures:
#### Entities Table
Stores extracted entities linked to memories:
| Field | Type | Description |
|---|---|---|
| entity_id | TEXT | Unique entity identifier |
| name | TEXT | Entity name |
| entity_type | TEXT | Entity category |
| memory_hash | TEXT | Associated memory hash |
| confidence | REAL | Extraction confidence score |
#### Relationships Table
Stores relationships between memories:
| Field | Type | Description |
|---|---|---|
| relationship_id | TEXT | Unique relationship identifier |
| source_hash | TEXT | Source memory hash |
| target_hash | TEXT | Target memory hash |
| relationship_type | TEXT | Type (similar, causes, fixes, etc.) |
| similarity_score | REAL | Calculated similarity (0.0-1.0) |
| metadata | JSON | Additional relationship metadata |
Sources: src/mcp_memory_service/models/association.py:1-60
Graph Operations
#### Store Entity Link
async def store_entity_link(
    memory_hash: str,
    entity_name: str,
    entity_type: str
) -> bool:
    ...  # body omitted in this excerpt
Links an extracted entity to a memory for future retrieval and graph queries.
Sources: src/mcp_memory_service/storage/graph.py:150-180
#### Get Memory Subgraph
Retrieves a local subgraph centered on a specific memory:
async def get_memory_subgraph(
    memory_hash: str,
    radius: int = 2
) -> Dict[str, Any]:
    ...  # body omitted in this excerpt
Parameters:
- memory_hash: Center memory for subgraph traversal
- radius: Maximum traversal depth (default: 2)
#### Graph Traversal
async def traverse_graph(
    hash1: str,
    hash2: str,
    max_depth: int = 5
) -> List[Dict[str, Any]]:
    ...  # body omitted in this excerpt
Finds paths between two memories up to a specified depth.
Sources: src/mcp_memory_service/server/handlers/graph.py:20-45
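Both operations are classic bounded graph searches. The sketch below uses a toy in-memory adjacency list and plain breadth-first search. It illustrates the idea behind `max_depth` and `radius` only: the function names, the undirected graph, and the return shapes (lists and sets of hashes rather than dictionaries) are simplifications assumed for this example.

```python
from collections import deque
from typing import Dict, List, Optional, Set

# Toy adjacency list keyed by memory hash (undirected for simplicity).
GRAPH: Dict[str, Set[str]] = {
    "a": {"b"},
    "b": {"a", "c"},
    "c": {"b", "d"},
    "d": {"c"},
}


def find_path(start: str, goal: str, max_depth: int = 5) -> Optional[List[str]]:
    """Breadth-first search for a shortest path of at most max_depth edges."""
    queue = deque([[start]])
    visited = {start}
    while queue:
        path = queue.popleft()
        node = path[-1]
        if node == goal:
            return path
        if len(path) - 1 >= max_depth:
            continue  # depth budget exhausted on this branch
        for neighbor in sorted(GRAPH.get(node, ())):
            if neighbor not in visited:
                visited.add(neighbor)
                queue.append(path + [neighbor])
    return None


def subgraph(center: str, radius: int = 2) -> Set[str]:
    """Collect every node within `radius` hops of `center`."""
    frontier, reached = {center}, {center}
    for _ in range(radius):
        frontier = {n for node in frontier for n in GRAPH.get(node, ())} - reached
        reached |= frontier
    return reached


path = find_path("a", "d")
near = subgraph("a", radius=2)
```

Capping depth and radius keeps the cost bounded even when a memory has many connections.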
Relationship Inference
RelationshipInferenceEngine
The relationship inference engine analyzes memory content pairs to determine semantic relationships:
Sources: src/mcp_memory_service/reasoning/inference.py:1-80
Supported Relationship Types
| Type | Description | Example |
|---|---|---|
| causes | Source leads to target | "Changed config" → "System crashed" |
| fixes | Source resolves target | "Applied patch" → "Bug #123" |
| contradicts | Sources conflict | "Use X" vs "Don't use X" |
| supports | Source validates target | "Test results" → "Implementation works" |
| follows | Temporal sequence | "Phase 1 complete" → "Phase 2 started" |
| related | General connection | Topic similarity without specific type |
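A keyword heuristic is one simple way such types could be assigned. The sketch below is only an illustration: the cue words, their priority order, and the `infer_relationship` name are hypothetical, and the actual RelationshipInferenceEngine may combine similarity scores and other signals that are not shown in this manual.

```python
import re
from typing import List, Tuple

# Hypothetical cue words; the actual engine's rules are not reproduced here.
KEYWORD_RULES: List[Tuple[str, Tuple[str, ...]]] = [
    ("fixes", ("fixed", "resolved", "patched")),
    ("causes", ("caused", "led to", "because of")),
    ("contradicts", ("don't", "do not", "never", "instead of")),
    ("follows", ("after", "next", "then")),
]


def infer_relationship(source: str, target: str) -> str:
    """Return the first relationship type whose cue appears as a whole word."""
    text = f"{source} {target}".lower()
    for relationship, cues in KEYWORD_RULES:
        if any(re.search(rf"\b{re.escape(cue)}\b", text) for cue in cues):
            return relationship
    return "related"  # fallback when no cue matches


label = infer_relationship("Applied patch that fixed the login crash", "Bug #123: login crash")
fallback = infer_relationship("Notes on embeddings", "Notes on vector search")
```

Word-boundary matching avoids false hits such as "then" inside "authentication".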
Inference Process
graph TD
M1[Memory 1] --> C1[Content Analysis]
M2[Memory 2] --> C2[Content Analysis]
C1 --> SIM[Similarity Check]
C2 --> SIM
SIM --> RT{Relationship Type?}
RT -->|High Similarity| SA[Same Aspect]
RT -->|Causal Keywords| CA[Causal Link]
RT -->|Action Keywords| AC[Action Link]
RT -->|Negation| CN[Contradiction]
SA --> ST[Store Relationship]
CA --> ST
AC --> ST
CN --> ST
Association Memory
Overview
Association memories are automatically generated entries that document discovered relationships between existing memories. They provide a memory-level representation of graph connections, enabling search and retrieval of relationship information.
Sources: src/mcp_memory_service/consolidation/consolidator.py:150-200
Association Data Model
class Association:
    source_memory_hashes: List[str]
    similarity_score: float
    connection_type: str
    discovery_method: str
    discovery_date: datetime
    metadata: Dict[str, Any]
Association Memory Structure
When an association is stored as a memory:
association_memory = Memory(
    content=f"Connected {source_hashes[0][:8]} and {source_hashes[1][:8]} by {connection_type}",
    content_hash=f"assoc_{source_hashes[0][:8]}_{source_hashes[1][:8]}",
    tags=["association", "discovered", connection_type],
    memory_type="observation",
    metadata={
        "source_memory_hashes": source_hashes,
        "similarity_score": similarity,
        "connection_type": connection_type,
        "discovery_method": association.discovery_method,
        "discovery_date": association.discovery_date.isoformat(),
    }
)
Sources: src/mcp_memory_service/consolidation/consolidator.py:170-195
Storage Process
graph LR
A[Memory Pair] --> B[Similarity Analysis]
B --> C{Score >= Threshold?}
C -->|Yes| D[Type Classification]
C -->|No| E[Skip]
D --> F[Create Association]
F --> G[Store Association Memory]
G --> H[Update Graph]
The consolidator stores association memories with skip_semantic_dedup=True to prevent deduplication conflicts with templated content.
Sources: src/mcp_memory_service/consolidation/consolidator.py:195-210
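The similarity and threshold steps of this process can be sketched with cosine similarity over pairs of embeddings. The 0.8 threshold, the function names, and the tiny three-dimensional vectors are assumptions for illustration; the consolidator's actual threshold and scoring are not stated in this manual.

```python
import math
from itertools import combinations
from typing import Dict, List, Tuple


def cosine(a: List[float], b: List[float]) -> float:
    """Cosine similarity of two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def discover_associations(embeddings: Dict[str, List[float]],
                          threshold: float = 0.8) -> List[Tuple[str, str, float]]:
    """Return (hash_a, hash_b, score) for every pair at or above the threshold."""
    found = []
    for (hash_a, vec_a), (hash_b, vec_b) in combinations(sorted(embeddings.items()), 2):
        score = cosine(vec_a, vec_b)
        if score >= threshold:
            found.append((hash_a, hash_b, round(score, 3)))
    return found


# Tiny made-up vectors; real embeddings are much longer.
vectors = {
    "aaa11111": [1.0, 0.0, 0.0],
    "bbb22222": [0.9, 0.1, 0.0],
    "ccc33333": [0.0, 0.0, 1.0],
}
pairs = discover_associations(vectors)
```

Comparing every pair is quadratic in the number of memories, so a real implementation would likely sample pairs or use a vector index instead of this brute-force loop.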
MCP Tools Reference
memory_graph
Main tool for graph operations:
| Parameter | Type | Required | Description |
|---|---|---|---|
| action | string | Yes | Operation: traverse, subgraph, extract_entities |
| hash | string | Conditional | Memory hash for subgraph/entities actions |
| hash1 | string | Conditional | Source hash for traversal |
| hash2 | string | Conditional | Target hash for traversal |
| radius | integer | No | Subgraph radius (default: 2) |
| max_depth | integer | No | Traversal max depth (default: 5) |
#### Action Examples
Extract Entities:
{
"action": "extract_entities",
"hash": "memory_hash_here"
}
Get Subgraph:
{
"action": "subgraph",
"hash": "memory_hash_here",
"radius": 3
}
Traverse Graph:
{
"action": "traverse",
"hash1": "memory_hash_1",
"hash2": "memory_hash_2",
"max_depth": 5
}
Sources: src/mcp_memory_service/server/handlers/graph.py:1-80
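The conditional requirements in the parameter table lend themselves to a small validation step before dispatch. The sketch below is a hypothetical helper, not the handler's real code: the function name and error wording are assumptions, while the actions, parameter names, and defaults come from the table above.

```python
from typing import Any, Dict, List

REQUIRED_BY_ACTION: Dict[str, List[str]] = {
    "extract_entities": ["hash"],
    "subgraph": ["hash"],
    "traverse": ["hash1", "hash2"],
}
DEFAULTS: Dict[str, Any] = {"radius": 2, "max_depth": 5}


def validate_graph_args(args: Dict[str, Any]) -> Dict[str, Any]:
    """Return normalized arguments, or an {"error": ...} dict on bad input."""
    action = args.get("action")
    if action not in REQUIRED_BY_ACTION:
        return {"error": f"Unknown action: {action!r}"}
    missing = [name for name in REQUIRED_BY_ACTION[action] if not args.get(name)]
    if missing:
        return {"error": f"Missing required parameter(s) for {action}: {', '.join(missing)}"}
    # Caller-supplied values override the documented defaults.
    return {**DEFAULTS, **args}


ok = validate_graph_args({"action": "subgraph", "hash": "abc123", "radius": 3})
bad = validate_graph_args({"action": "traverse", "hash1": "abc123"})
```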
Maintenance Scripts
update_graph_relationship_types.py
Located in scripts/maintenance/, this script infers relationship types for existing graph associations:
# Dry run (preview changes)
python scripts/maintenance/update_graph_relationship_types.py --dry-run
# Execute inference
python scripts/maintenance/update_graph_relationship_types.py
Features:
- Uses RelationshipInferenceEngine for type inference
- Supports dry-run mode for safety
- Automatic backup before execution
- Updates relationship metadata in graph storage
Sources: scripts/maintenance/README.md
Configuration
Graph Storage Configuration
The graph storage is automatically initialized when the storage service starts. No explicit configuration is required for SQLite-based storage.
For Milvus-based storage, configure the following environment variables:
| Variable | Description | Default |
|---|---|---|
| MILVUS_HOST | Milvus server host | localhost |
| MILVUS_PORT | Milvus server port | 19530 |
| MILVUS_COLLECTION | Collection name | memory_graph |
Entity Extraction Configuration
Entity extraction settings can be configured in the service configuration:
| Setting | Type | Description |
|---|---|---|
| entity_types | List[str] | Enabled entity types |
| min_confidence | float | Minimum confidence threshold (0.0-1.0) |
| enable_ner | bool | Enable NER processing |
| pattern_weight | float | Weight for pattern matching |
Error Handling
The graph operations return standardized error responses:
{
"error": "Error message describing the issue"
}
Common error scenarios:
- Memory not found for entity extraction
- Graph storage not initialized
- Invalid hash format
- Maximum depth exceeded during traversal
Sources: src/mcp_memory_service/server/handlers/graph.py:55-65
Best Practices
- Entity Naming: Use consistent entity naming conventions for better graph queries
- Memory Organization: Group related content in the same memory to strengthen association discovery
- Regular Maintenance: Run update_graph_relationship_types.py periodically to classify new associations
- Tag Usage: Tag memories with semantic tags to improve entity and relationship extraction
- Graph Traversal: Use appropriate radius/depth limits to prevent performance issues with highly connected memories
See Also
- Memory Storage - Core memory storage architecture
- Consolidation System - Association memory generation
- Reasoning Engine - Entity extraction and inference details
- Maintenance Scripts - Graph maintenance utilities
Memory Consolidation Engine
Related topics: Knowledge Graph and Entity Extraction, Quality Scoring System, System Architecture
The Memory Consolidation Engine is a core subsystem of the MCP Memory Service that manages the lifecycle of stored memories through intelligent compression, decay analysis, and selective forgetting mechanisms. It ensures the memory store remains efficient, relevant, and optimized for long-term semantic retrieval.
Overview
As memories accumulate over time, the consolidation engine performs background operations to:
- Compress redundant memories into consolidated summaries
- Apply decay algorithms to age out less relevant information
- Forget obsolete entries based on configurable time horizons
- Generate insights from consolidated memory patterns
graph TD
A[New Memory] --> B[Memory Store]
B --> C{Consolidation Scheduler}
C --> D[Decay Analysis]
C --> E[Compression]
C --> F[Forgetting Check]
D --> G[Relevance Score Update]
E --> H[Consolidated Memory]
F --> I[Memory Deletion]
G --> H
H --> J[Insights Generation]
J --> K[Updated Memory Store]
Architecture
The consolidation engine comprises five primary modules located in src/mcp_memory_service/consolidation/:
| Module | Purpose | Key Functions |
|---|---|---|
| consolidator.py | Core consolidation orchestration | Main consolidation loop, memory merging |
| scheduler.py | Automated scheduling of consolidation tasks | Daily, weekly, monthly job scheduling |
| decay.py | Relevance decay calculations | Age-based score reduction algorithms |
| forgetting.py | Selective memory removal | Time-horizon based forgetting policies |
| insights.py | Post-consolidation analysis | Pattern detection, summary generation |
Sources: src/mcp_memory_service/consolidation/scheduler.py
Consolidation Time Horizons
The engine operates on three configurable time horizons that determine consolidation aggressiveness:
Weekly Consolidation
- Processes recent memories (typically 7 days)
- Light compression to maintain detail
- Preserves high-relevance memories unchanged
Monthly Consolidation
- Reviews memories from past 30 days
- Moderate compression and decay application
- Identifies patterns across recent sessions
Quarterly Consolidation
- Full store analysis
- Aggressive forgetting of stale entries
- Major compression of redundant information
Sources: src/mcp_memory_service/api/operations.py:1-50
API Integration
REST API Endpoint
POST /api/consolidate
Parameters:
| Parameter | Type | Required | Description |
|---|---|---|---|
| time_horizon | string | Yes | One of: daily, weekly, monthly |
Example Request:
curl -X POST https://localhost:8443/api/consolidate \
-H "Content-Type: application/json" \
-d '{"time_horizon": "weekly"}'
Example Response:
{
"status": "completed",
"time_horizon": "weekly",
"memories_processed": 2418,
"compressed": 156,
"forgotten": 43
}
Sources: src/mcp_memory_service/api/operations.py
Python API
from mcp_memory_service.api import consolidate
# Async consolidation
result = await consolidate('weekly')
print(f"Compressed: {result.compressed}, Forgotten: {result.forgotten}")
Return Value Properties:
| Property | Type | Description |
|---|---|---|
| status | string | Completion status |
| time_horizon | string | Horizon used |
| memories_processed | int | Total memories reviewed |
| compressed | int | Number of memories consolidated |
| forgotten | int | Number of memories deleted |
Scheduler Configuration
The consolidation scheduler runs automated tasks based on configured schedules:
graph LR
A[Scheduler Init] --> B{Job Queue}
B --> C[Daily Job<br/>T+00:00]
B --> D[Weekly Job<br/>Sunday T+02:00]
B --> E[Monthly Job<br/>1st T+03:00]
Scheduler Status Model:
CompactSchedulerStatus:
    running: bool
    next_daily: datetime | None
    next_weekly: datetime | None
    next_monthly: datetime | None
    jobs_executed: int
    jobs_failed: int
Sources: src/mcp_memory_service/consolidation/scheduler.py
Decay Algorithm
The decay module applies relevance scoring based on memory age and access patterns:
def calculate_decay_score(memory_age_days: int, access_frequency: float) -> float:
    """
    Returns relevance score between 0.0 and 1.0
    Lower scores indicate memories approaching forgetting threshold
    """
Decay Factors:
| Factor | Description | Weight |
|---|---|---|
| Time Since Creation | Older memories decay faster | 0.4 |
| Access Frequency | Frequently accessed memories decay slower | 0.3 |
| Tag Relevance | Tagged memories maintain higher scores | 0.2 |
| Memory Type | System vs user memory decay rates differ | 0.1 |
Sources: src/mcp_memory_service/consolidation/decay.py
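One way to read the Decay Factors table is as a weighted sum of four normalized factors. The sketch below illustrates that reading only: the `decay_score` name, the 90-day half-life, and every normalization formula are assumptions made for the example, not the contents of decay.py.

```python
import math

# Weights taken from the Decay Factors table above.
WEIGHTS = {"age": 0.4, "access": 0.3, "tags": 0.2, "type": 0.1}


def decay_score(age_days: float, access_frequency: float, has_tags: bool,
                is_system: bool, half_life_days: float = 90.0) -> float:
    """Combine four normalized factors into a relevance score in [0.0, 1.0].

    All normalizations here are illustrative assumptions.
    """
    age_factor = 0.5 ** (age_days / half_life_days)    # 1.0 when new, halves every half-life
    access_factor = 1.0 - math.exp(-access_frequency)  # approaches 1.0 with frequent access
    tag_factor = 1.0 if has_tags else 0.0
    type_factor = 1.0 if is_system else 0.5            # hypothetical per-type rates
    score = (WEIGHTS["age"] * age_factor
             + WEIGHTS["access"] * access_factor
             + WEIGHTS["tags"] * tag_factor
             + WEIGHTS["type"] * type_factor)
    return max(0.0, min(1.0, score))


fresh = decay_score(age_days=0, access_frequency=5.0, has_tags=True, is_system=True)
stale = decay_score(age_days=720, access_frequency=0.0, has_tags=False, is_system=False)
```

Under these assumptions a new, tagged, frequently accessed memory scores close to 1.0, while an old, untagged, never-accessed one falls toward the forgetting thresholds.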
Forgetting Mechanism
The forgetting module determines which memories should be permanently removed:
def should_forget(memory: Memory, horizon: str) -> bool:
    """
    Evaluates if a memory meets forgetting criteria
    Returns True if memory should be deleted
    """
Forgetting Criteria by Horizon:
| Horizon | Age Threshold | Min Decay Score | Additional Checks |
|---|---|---|---|
| daily | 90 days | 0.1 | None |
| weekly | 180 days | 0.2 | Duplicate detection |
| monthly | 365 days | 0.3 | Pattern analysis |
Safety Features:
- Automatic backup before deletion
- Transaction-based deletion (atomic rollback on failure)
- Database lock detection to prevent concurrent access issues
- Disk space verification before execution
Sources: src/mcp_memory_service/consolidation/forgetting.py
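The criteria table can be expressed as a small lookup. This is a minimal sketch, assuming a memory is forgotten only when it is both older than the age threshold and scored below the minimum decay score. `MemoryRecord` is a hypothetical stand-in for the service's Memory type, and the additional checks (duplicate detection, pattern analysis) are omitted.

```python
from dataclasses import dataclass

# (age threshold in days, minimum decay score) per horizon, from the table above.
FORGETTING_CRITERIA = {
    "daily": (90, 0.1),
    "weekly": (180, 0.2),
    "monthly": (365, 0.3),
}


@dataclass
class MemoryRecord:
    """Hypothetical stand-in for the service's Memory type."""
    age_days: int
    decay_score: float


def should_forget(memory: MemoryRecord, horizon: str) -> bool:
    """Return True when the memory is old enough AND scored below the minimum."""
    if horizon not in FORGETTING_CRITERIA:
        raise ValueError(f"unknown horizon: {horizon!r}")
    age_threshold, min_score = FORGETTING_CRITERIA[horizon]
    return memory.age_days >= age_threshold and memory.decay_score < min_score
```

Requiring both conditions is the conservative choice: a recent memory with a low score, or an old memory that is still relevant, is kept.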
Performance Characteristics
| Metric | Value |
|---|---|
| Typical Duration | 10-30 seconds (varies with memory count) |
| Scaling | ~10ms per memory processed |
| Memory Overhead | <10MB during operation |
| Background Operation | Non-blocking in HTTP server context |
Performance Tips:
- Schedule consolidation during low-usage periods
- Large stores (>10,000 memories) may take longer
- Disable automatic scheduling for resource-constrained environments
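The scaling figure gives a quick way to estimate run time before scheduling a job. The sketch below assumes the cost stays linear at roughly 10 ms per memory; `estimate_duration_seconds` is an illustrative helper, not part of the service.

```python
PER_MEMORY_SECONDS = 0.010  # ~10ms per memory, from the table above


def estimate_duration_seconds(memory_count: int) -> float:
    """Linear estimate of consolidation time; real runs vary with backend and load."""
    return memory_count * PER_MEMORY_SECONDS


typical = estimate_duration_seconds(2418)    # the store size from the example response
large = estimate_duration_seconds(10_000)    # the "large store" boundary
```

With these numbers the 2,418-memory example lands at about 24 seconds, inside the typical 10-30 second range, while a 10,000-memory store would take around 100 seconds.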
Insights Generation
After consolidation, the insights module analyzes patterns:
def generate_insights(consolidated_memories: List[Memory]) -> List[Insight]:
    """
    Produces actionable insights from consolidated memory patterns
    """
Insight Types:
| Type | Description |
|---|---|
| pattern | Recurring themes detected |
| summary | Condensed representation of grouped memories |
| recommendation | Suggestions based on memory patterns |
| conflict | Detected contradictions between memories |
Sources: src/mcp_memory_service/consolidation/insights.py
Configuration
Environment Variables
| Variable | Default | Description |
|---|---|---|
| MCP_CONSOLIDATION_ENABLED | true | Enable/disable automatic consolidation |
| MCP_CONSOLIDATION_SCHEDULE | daily | Default schedule |
| MCP_MAX_RESPONSE_CHARS | 0 | Response truncation (0 = unlimited) |
Scheduler Settings
Configure in ~/.claude/hooks/config.json:
{
"consolidation": {
"schedule": {
"daily": "0 0 * * *",
"weekly": "0 2 * * 0",
"monthly": "0 3 1 * *"
},
"enabled": true,
"horizons": ["daily", "weekly"]
}
}
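Before relying on a hand-edited config, it can help to sanity-check it. The sketch below loads the JSON shown above and checks two things: each schedule entry has the five fields of a standard cron expression, and every enabled horizon has a schedule. `validate_consolidation_config` is a hypothetical helper, not something the service is known to ship.

```python
import json

CONFIG_TEXT = """
{
  "consolidation": {
    "schedule": {
      "daily": "0 0 * * *",
      "weekly": "0 2 * * 0",
      "monthly": "0 3 1 * *"
    },
    "enabled": true,
    "horizons": ["daily", "weekly"]
  }
}
"""


def validate_consolidation_config(config: dict) -> list:
    """Return a list of problems found; an empty list means the config looks sane."""
    problems = []
    section = config.get("consolidation", {})
    schedule = section.get("schedule", {})
    for name, cron in schedule.items():
        # A standard cron expression has five whitespace-separated fields.
        if len(cron.split()) != 5:
            problems.append(f"schedule.{name}: expected 5 cron fields, got {cron!r}")
    for horizon in section.get("horizons", []):
        if horizon not in schedule:
            problems.append(f"horizons: {horizon!r} has no schedule entry")
    return problems


problems = validate_consolidation_config(json.loads(CONFIG_TEXT))
```

This only checks the shape of the file; it does not verify that the cron field values themselves are in range.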
Troubleshooting
Common Issues
| Issue | Cause | Solution |
|---|---|---|
| Consolidation fails silently | Database locked | Stop MCP clients before running |
| High memory usage during consolidation | Large store size | Increase MCP_MAX_RESPONSE_CHARS |
| Scheduler not running | Service not started | Check systemctl --user status mcp-memory.service |
Verification
# Check scheduler status
curl https://localhost:8443/api/consolidate/status
# Manual consolidation (dry-run)
curl -X POST https://localhost:8443/api/consolidate \
  -H "Content-Type: application/json" \
  -d '{"time_horizon": "weekly", "dry_run": true}'
Further Reference
- Memory Consolidation Guide - Detailed usage documentation
- API Reference - Full API specification
- CLI Commands - Command-line consolidation tools
Quality Scoring System
Related topics: Memory Consolidation Engine, System Architecture, Quick Start Guide
The Quality Scoring System is a multi-layered evaluation framework within the MCP Memory Service that assesses, ranks, and maintains memory content quality. It provides both automatic quality assessment through machine learning models and explicit user feedback mechanisms to ensure that memories retain high-value information over time.
Overview
The system addresses a fundamental challenge in semantic memory systems: not all stored memories have equal importance or relevance. Over time, memory stores can become cluttered with transient information, low-value summaries, and outdated content that degrades the overall utility of the memory service.
The Quality Scoring System solves this by implementing:
- Automatic Quality Assessment: Uses ONNX-based ML models to evaluate content quality without manual intervention
- Implicit Signal Detection: Analyzes behavioral patterns to infer memory importance
- Manual Rating Support: Allows users to explicitly rate memory quality
- Quality-Aware Search: Boosts high-quality memories in search results
- Maintenance Operations: Provides tools for quality-based memory management
Architecture
The Quality Scoring System follows a modular architecture with four primary components that work together to provide comprehensive quality evaluation.
graph TD
A[Memory Content] --> B[QualityScorer]
B --> C[ONNXRankerModel]
B --> D[QualityEvaluator]
B --> E[ImplicitSignalsEvaluator]
C --> F[CompactSearchResult]
D --> F
E --> F
G[User Rating] --> H[handle_rate_memory]
H --> B
I[Search Query] --> J[quality_boost Parameter]
J --> F
K[Maintenance Operations] --> L[handle_maintain]
L --> B
Core Components
QualityConfig
The QualityConfig class provides centralized configuration for the quality scoring system. It defines thresholds, weights, and behavioral parameters that control how quality is evaluated.
| Parameter | Type | Default | Description |
|---|---|---|---|
| min_quality | float | 0.0 | Minimum quality threshold for inclusion |
| max_quality | float | 1.0 | Maximum possible quality score |
| boost_weight | float | varies | Weight given to quality in search ranking |
| implicit_weight | float | varies | Weight for implicit signal evaluation |
Sources: src/mcp_memory_service/quality/config.py
QualityScorer
The QualityScorer is the main orchestrator class that coordinates quality evaluation across all sub-components. It aggregates scores from different evaluation methods and produces a unified quality score.
class QualityScorer:
    def __init__(self, config: QualityConfig):
        self.config = config
        self.ranker = ONNXRankerModel()
        self.evaluator = QualityEvaluator()
        self.implicit_evaluator = ImplicitSignalsEvaluator()
Key Responsibilities:
- Aggregates scores from ONNX ranker, AI evaluator, and implicit signals
- Provides a unified get_score() interface
- Handles caching and optimization for repeated evaluations
- Manages configuration propagation to sub-components
Sources: src/mcp_memory_service/quality/scorer.py
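The aggregate-and-cache behaviour described above can be sketched without the real models. The example assumes aggregation is a weighted mean and that the cache is keyed by a SHA-256 digest of the content. `CachingScorer`, the stand-in evaluators, and the weights are all hypothetical; the actual QualityScorer may combine scores differently.

```python
import hashlib
from typing import Callable, Dict, Tuple

Evaluator = Callable[[str], float]


class CachingScorer:
    """Illustrative aggregator: weighted mean of evaluator scores, cached by content hash."""

    def __init__(self, evaluators: Dict[str, Tuple[Evaluator, float]]):
        self.evaluators = evaluators  # name -> (scoring function, weight)
        self._cache: Dict[str, float] = {}
        self.evaluations = 0          # counts cache misses, for demonstration

    def get_score(self, content: str) -> float:
        key = hashlib.sha256(content.encode("utf-8")).hexdigest()
        if key in self._cache:
            return self._cache[key]
        self.evaluations += 1
        total_weight = sum(weight for _, weight in self.evaluators.values())
        weighted = sum(fn(content) * weight for fn, weight in self.evaluators.values())
        score = max(0.0, min(1.0, weighted / total_weight))
        self._cache[key] = score
        return score


# Stand-in evaluators returning fixed scores; the real components are model-backed.
scorer = CachingScorer({
    "ranker": (lambda text: 0.8, 0.5),
    "ai": (lambda text: 0.6, 0.3),
    "implicit": (lambda text: 0.4, 0.2),
})
first = scorer.get_score("Use SQLite-vec for local storage")
second = scorer.get_score("Use SQLite-vec for local storage")  # served from cache
```

Keying the cache on a content digest means identical text is scored once, which matters when the evaluators are expensive model calls.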
ONNXRankerModel
The ONNXRankerModel provides fast, offline-capable quality ranking using ONNX Runtime. This model evaluates content based on learned patterns of what constitutes high-quality memory content.
Advantages of ONNX-based ranking:
- Runs entirely offline without external API dependencies
- Fast inference suitable for real-time evaluation
- Portable across different platforms and hardware
- No per-token costs unlike cloud-based alternatives
Sources: src/mcp_memory_service/quality/onnx_ranker.py
QualityEvaluator
The QualityEvaluator provides AI-based quality assessment, likely utilizing more sophisticated language model analysis for nuanced content quality determination.
Evaluation Criteria:
- Content specificity and detail level
- Actionability of information
- Temporal relevance
- Uniqueness of content
Sources: src/mcp_memory_service/quality/ai_evaluator.py
ImplicitSignalsEvaluator
The ImplicitSignalsEvaluator analyzes behavioral patterns to infer memory importance without explicit user feedback. This component detects signals that suggest a memory's value based on how it's accessed and used.
Implicit Signals:
- Retrieval frequency
- Retrieval timing patterns
- Context of retrieval requests
- Cross-referencing with other memories
Sources: src/mcp_memory_service/quality/implicit_signals.py
Module Exports
All quality scoring components are exported through the main quality module interface:
from mcp_memory_service.quality import (
QualityScorer,
ONNXRankerModel,
QualityEvaluator,
ImplicitSignalsEvaluator,
QualityConfig
)
Sources: src/mcp_memory_service/quality/__init__.py
API Integration
Quality-Aware Search
The quality scoring system integrates with the search API through the quality_boost parameter. When performing semantic or hybrid searches, memories with higher quality scores receive ranking boosts.
Search Parameters Related to Quality:
| Parameter | Type | Default | Description |
|---|---|---|---|
| quality_boost | float | 0.0 | Weight for quality-based ranking (0.0-1.0) |
| min_quality | float | 0.0 | Minimum quality threshold filter |
| include_debug | bool | false | Include quality scoring details in response |
Implementation in Memory Handler:
quality_boost=arguments.get("quality_boost", 0.0),
limit=limit,
include_debug=arguments.get("include_debug", False),
Sources: src/mcp_memory_service/server/handlers/memory.py:26-32
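To make the effect of quality_boost concrete, the sketch below re-ranks candidates with a linear blend of relevance and quality. The blend formula, the `Candidate` type, and the `rerank` function are assumptions for illustration; the source does not state how the service combines the two scores.

```python
from typing import List, NamedTuple, Tuple


class Candidate(NamedTuple):
    """Hypothetical search candidate with separate relevance and quality scores."""
    content_hash: str
    relevance: float  # semantic similarity, 0-1
    quality: float    # quality score, 0-1


def rerank(candidates: List[Candidate], quality_boost: float = 0.0) -> List[Tuple[str, float]]:
    """Blend relevance and quality linearly, then sort by the blended score."""
    if not 0.0 <= quality_boost <= 1.0:
        raise ValueError("quality_boost must be between 0.0 and 1.0")
    if quality_boost == 0.0:
        # No boost: keep the original relevance ordering untouched.
        return [(c.content_hash, c.relevance) for c in candidates]
    scored = [
        (c.content_hash, (1.0 - quality_boost) * c.relevance + quality_boost * c.quality)
        for c in candidates
    ]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)


candidates = [
    Candidate("aaaa1111", relevance=0.90, quality=0.20),
    Candidate("bbbb2222", relevance=0.80, quality=0.90),
]
plain = rerank(candidates)                       # relevance order preserved
boosted = rerank(candidates, quality_boost=0.5)  # higher-quality memory moves up
```

With a boost of 0.5, the slightly less relevant but much higher-quality memory overtakes the first result, which is the trade-off the parameter is meant to expose.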
Quality Scoring in Results
The CompactSearchResult type returns its memories as CompactMemory entries. Each entry has a score field holding the relevance score, which incorporates quality assessment when quality_boost is greater than 0.
CompactMemory Score Field:
from typing import NamedTuple

class CompactMemory(NamedTuple):
    hash: str              # 8-character content hash
    preview: str           # First 200 characters
    tags: tuple[str, ...]  # Immutable tags tuple
    created: float         # Unix timestamp
    score: float           # Relevance score 0-1
Sources: src/mcp_memory_service/api/types.py:48-56
Quality Actions Handler
The quality system exposes several actions through the memory_quality tool handler, providing programmatic access to quality operations.
Action Types
| Action | Description | Required Parameters |
|---|---|---|
| rate | Manually rate a memory's quality | content_hash, rating |
| analyze | Analyze quality distribution | min_quality, max_quality |
| maintain | Run quality-based maintenance | varies |
| maintain_status | Check maintenance status | none |
Rate Memory Action
Allows explicit user feedback on memory quality:
async def handle_rate_memory(server, arguments: dict) -> List[types.TextContent]:
    content_hash = arguments.get("content_hash")
    rating = arguments.get("rating")
    feedback = arguments.get("feedback", "")
Parameters:
| Parameter | Type | Required | Description |
|---|---|---|---|
| content_hash | string | Yes | Hash of the memory to rate |
| rating | integer/string | Yes | Quality rating (converted to integer) |
| feedback | string | No | Optional feedback text |
Sources: src/mcp_memory_service/server/handlers/quality.py:47-59
Analyze Quality Distribution
Provides statistical analysis of memory quality across the store:
elif action == "analyze":
    return await handle_analyze_quality_distribution(server, {
        "min_quality": arguments.get("min_quality", 0.0),
        "max_quality": arguments.get("max_quality", 1.0)
    })
Maintenance Operations
The maintain action provides automated quality-based memory management:
elif action == "maintain":
    return await handle_maintain(server, arguments)
elif action == "maintain_status":
    return await handle_maintain_status()
Sources: src/mcp_memory_service/server/handlers/quality.py:30-38
Workflow Diagrams
Quality Evaluation Flow
graph LR
A[Incoming Memory] --> B{Is cached?}
B -->|No| C[Run ONNX Ranker]
C --> D[Run AI Evaluator]
D --> E[Run Implicit Signals]
E --> F[Aggregate Scores]
F --> G[Compute Final Quality]
G --> H[Cache Result]
H --> I[Return Score]
B -->|Yes| I
Quality-Aware Search Flow
graph TD
A[Search Query] --> B[Semantic Search]
B --> C[Initial Results]
C --> D{quality_boost > 0?}
D -->|Yes| E[Fetch Quality Scores]
D -->|No| H[Return Results]
E --> F[Apply Quality Boost]
F --> G[Re-rank Results]
G --> H
Usage Examples
Basic Quality Evaluation
from mcp_memory_service.quality import QualityScorer, QualityConfig
config = QualityConfig(min_quality=0.5, boost_weight=0.3)
scorer = QualityScorer(config)
quality_score = scorer.get_score(memory_content)
print(f"Quality score: {quality_score}")
Quality-Boosted Search
When searching with quality consideration:
results = await search_memories(
query="architecture decisions",
quality_boost=0.5, # Apply quality weighting
limit=10
)
Manual Rating
result = await handle_rate_memory(server, {
"content_hash": "abc12345",
"rating": 4,
"feedback": "Important architectural decision"
})
Configuration Best Practices
Performance vs Accuracy
| Use Case | Configuration | Rationale |
|---|---|---|
| Real-time search | quality_boost=0.2-0.3 | Balance relevance with performance |
| High-precision retrieval | quality_boost=0.5-0.7 | Prioritize quality over recall |
| Maintenance/cleanup | min_quality=0.3-0.5 | Filter low-value memories |
Thresholds
| Memory Type | Recommended min_quality |
|---|---|
| Session summaries | 0.4 |
| Implementation details | 0.5 |
| Architectural decisions | 0.6 |
| Bug fixes | 0.3 |
| Reference documentation | 0.5 |
Summary
The Quality Scoring System provides a comprehensive framework for evaluating and managing memory content quality in the MCP Memory Service. By combining ONNX-based offline ranking, AI-powered evaluation, and implicit behavioral signals, the system ensures that high-value memories are surfaced first while providing tools for ongoing quality maintenance.
The modular architecture allows for:
- Scalability: ONNX Runtime enables efficient inference at scale
- Flexibility: Configurable weights and thresholds for different use cases
- Offline capability: Core ranking works without network dependencies
- User control: Manual rating provides explicit feedback pathways
- Maintenance: Built-in tools for quality-based memory management
REST API Reference
Related topics: Agent Framework Integration, Quick Start Guide
Overview
The MCP Memory Service exposes a comprehensive REST API for managing semantic memories, performing advanced searches, and subscribing to real-time events. The API serves as the primary interface for external clients, web applications, and programmatic integrations that need to interact with the memory storage backend without using the MCP (Model Context Protocol) tools.
The REST API provides four major functional areas:
| Area | Description |
|---|---|
| Memory Management | Store, retrieve, list, and delete semantic memories |
| Search Operations | Semantic similarity, tag-based, and time-based search |
| Real-time Events | Server-Sent Events (SSE) for live updates |
| MCP Protocol | JSON-RPC interface compatible with MCP clients |
Sources: src/mcp_memory_service/web/app.py
Architecture
The REST API is built on FastAPI and serves as a thin layer over the core MemoryService and storage backends. Requests flow through the API router to the appropriate handlers, which delegate business logic to shared services.
graph TD
A[HTTP Client] --> B[FastAPI Router]
B --> C["/api/memories"]
B --> D["/api/search"]
B --> E["/api/events"]
B --> F["/api/mcp"]
C --> G[Memory Handler]
D --> H[Search Handler]
E --> I[SSE Publisher]
F --> J[MCP Handler]
G --> K[MemoryService]
H --> K
K --> L[(SQLite + sqlite_vec)]
Sources: src/mcp_memory_service/web/api/mcp.py:1-50
Base Configuration
Server Address
| Environment Variable | Default | Description |
|---|---|---|
| MCP_HOST | 0.0.0.0 | Bind address |
| MCP_PORT | 8080 | HTTP port |
| MCP_MAX_RESPONSE_CHARS | 0 (unlimited) | Response truncation limit |
Sources: src/mcp_memory_service/server/utils/response_limiter.py:1-40
Memory Management Endpoints
Store a Memory
Creates a new memory with automatic embedding generation.
POST /api/memories
| Parameter | Type | Location | Required | Description |
|---|---|---|---|---|
| content | string | body | Yes | Memory content text |
| tags | string[] | body | No | List of tags |
| memory_type | string | body | No | Classification type (default: "note") |
| metadata | object | body | No | Custom metadata key-value pairs |
Response:
{
"content_hash": "abc12345",
"message": "Memory stored successfully"
}
Sources: src/mcp_memory_service/api/operations.py:50-100
List All Memories
Retrieves memories with pagination and optional filtering.
GET /api/memories
| Parameter | Type | Location | Required | Default | Description |
|---|---|---|---|---|---|
| page | integer | query | No | 1 | 1-based page number |
| page_size | integer | query | No | 20 | Results per page (max: 100) |
| tags | string | query | No | - | Comma-separated tag filter |
| tag_match | string | query | No | any | Match logic: any (OR) or all (AND) |
| memory_type | string | query | No | - | Filter by memory type |
| stale_days | integer | query | No | - | Filter memories not accessed in N days |
Response:
{
"memories": [
{
"content": "Memory content here",
"content_hash": "abc12345",
"tags": ["tag1", "tag2"],
"memory_type": "note",
"created_at": "2025-01-15T10:30:00Z",
"updated_at": "2025-01-15T10:30:00Z"
}
],
"pagination": {
"page": 1,
"page_size": 20,
"total_count": 150
}
}
Sources: src/mcp_memory_service/server/handlers/memory.py:1-50
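A client paging through the list endpoint needs to build query strings and know when to stop. The sketch below uses only the parameters documented above. The helper names and the `http://localhost:8080` base URL are assumptions (the port follows the MCP_PORT default), and no request is actually sent.

```python
import math
from urllib.parse import urlencode


def total_pages(total_count: int, page_size: int) -> int:
    """Number of pages needed to cover total_count items."""
    return math.ceil(total_count / page_size) if total_count else 0


def list_url(base_url, page, page_size=20, tags=None, tag_match="any"):
    """Build a GET /api/memories URL; page_size is clamped to the documented max of 100."""
    params = {"page": page, "page_size": min(page_size, 100)}
    if tags:
        params["tags"] = ",".join(tags)  # the endpoint takes a comma-separated string
        params["tag_match"] = tag_match
    return f"{base_url}/api/memories?{urlencode(params)}"


pages = total_pages(total_count=150, page_size=20)
url = list_url("http://localhost:8080", page=2)
```

For the sample response above (150 memories, 20 per page), the last page is page 8 and holds the remaining 10 items.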
Retrieve a Memory
Retrieves a specific memory by its content hash.
GET /api/memories/{hash}
| Parameter | Type | Location | Required | Description |
|---|---|---|---|---|
| hash | string | path | Yes | 8-character content hash |
Response:
{
"content": "Memory content here",
"content_hash": "abc12345",
"tags": ["architecture", "database"],
"memory_type": "reference",
"created_at": "2025-01-15T10:30:00Z",
"updated_at": "2025-01-15T10:30:00Z",
"metadata": {}
}
Sources: src/mcp_memory_service/web/app.py
Delete a Memory
Removes a memory and its associated embeddings.
DELETE /api/memories/{hash}
| Parameter | Type | Location | Required | Description |
|---|---|---|---|---|
| hash | string | path | Yes | 8-character content hash |
Response:
{
"success": true,
"message": "Memory deleted successfully"
}
Sources: src/mcp_memory_service/web/app.py
Search Operations
Semantic Search
Performs vector similarity search using text embeddings.
POST /api/search
| Parameter | Type | Location | Required | Default | Description |
|---|---|---|---|---|---|
| query | string | body | Yes | - | Search query text |
| limit | integer | body | No | 5 | Maximum results (1-100) |
| tags | string[] | body | No | - | Filter by tags |
| threshold | float | body | No | 0.0 | Minimum relevance score |
| hybrid | boolean | body | No | false | Enable BM25 fallback at threshold 0.4 |
Response:
{
"memories": [
{
"content_hash": "abc12345",
"content": "Memory content...",
"tags": ["tag1"],
"relevance_score": 0.87,
"match_method": "vector"
}
],
"found": 1,
"shown": 1
}
Sources: src/mcp_memory_service/api/operations.py:100-150
Tag-Based Search
Searches memories using tags with AND/OR logic.
POST /api/search/by-tag
| Parameter | Type | Location | Required | Description |
|---|---|---|---|---|
| tags | string[] | body | Yes | List of tags to search |
| match_all | boolean | body | No | true for AND, false for OR |
Response:
{
"memories": [
{
"content": "Memory content",
"content_hash": "abc12345",
"tags": ["python", "reference"]
}
],
"total_found": 5
}
Sources: src/mcp_memory_service/server/handlers/memory.py:50-100
Time-Based Search
Natural language time-based queries for temporal memory retrieval.
POST /api/search/by-time
| Parameter | Type | Location | Required | Description |
|---|---|---|---|---|
| time_query | string | body | Yes | Natural language time expression |
| time_start | float | body | No | Unix timestamp start |
| time_end | float | body | No | Unix timestamp end |
Example Queries:
| Expression | Interpretation |
|---|---|
| last week | 7 days ago to now |
| yesterday's architectural discussions | Previous day |
| between 2025-01-01 and 2025-01-31 | Date range |
Sources: scripts/sync/README.md
Find Similar Memories
Finds memories semantically similar to a specific memory by hash.
GET /api/search/similar/{hash}
| Parameter | Type | Location | Required | Description |
|---|---|---|---|---|
| hash | string | path | Yes | Content hash of reference memory |
| limit | integer | query | No | Maximum similar memories to return |
Response:
{
"reference_hash": "abc12345",
"similar": [
{
"content_hash": "def67890",
"content": "Similar memory content...",
"similarity_score": 0.92
}
]
}
Real-time Events (SSE)
Subscribe to Memory Events
Server-Sent Events stream for live memory activity.
GET /api/events
Event Types:
| Event | Description |
|---|---|
| memory_stored | New memory added |
| memory_deleted | Memory removed |
| memory_updated | Memory modified |
| embedding_complete | Async embedding finished |
Example SSE Payload:
event: memory_stored
data: {"content_hash": "abc12345", "timestamp": "2025-01-15T10:30:00Z"}
event: memory_deleted
data: {"content_hash": "def67890", "timestamp": "2025-01-15T10:35:00Z"}
Sources: src/mcp_memory_service/web/app.py
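A client consuming this stream has to split it into events and decode each data payload. The sketch below parses a captured stream, assuming events are separated by a blank line as the Server-Sent Events format requires and that every data field holds JSON. `parse_sse` is an illustrative helper that handles only the `event:` and `data:` fields.

```python
import json

SAMPLE_STREAM = (
    "event: memory_stored\n"
    'data: {"content_hash": "abc12345", "timestamp": "2025-01-15T10:30:00Z"}\n'
    "\n"
    "event: memory_deleted\n"
    'data: {"content_hash": "def67890", "timestamp": "2025-01-15T10:35:00Z"}\n'
    "\n"
)


def parse_sse(stream_text: str) -> list:
    """Parse SSE text into (event_name, data_dict) pairs."""
    events = []
    event_name, data_lines = "message", []  # "message" is the SSE default event type
    for line in stream_text.splitlines():
        if line == "":
            # A blank line dispatches the event collected so far.
            if data_lines:
                events.append((event_name, json.loads("\n".join(data_lines))))
            event_name, data_lines = "message", []
        elif line.startswith("event:"):
            event_name = line[len("event:"):].strip()
        elif line.startswith("data:"):
            data_lines.append(line[len("data:"):].strip())
    return events


events = parse_sse(SAMPLE_STREAM)
```

A production client would read the stream incrementally and also handle `id:` and `retry:` fields and reconnection, which this sketch leaves out.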
SSE Statistics
View connection statistics for SSE endpoints.
GET /api/events/stats
Response:
{
"active_connections": 2,
"total_events_sent": 1450,
"uptime_seconds": 86400
}
MCP Protocol Endpoint
The MCP-compatible JSON-RPC endpoint enables integration with MCP clients.
POST /api/mcp
Supported Methods
| Method | Description |
|---|---|
| initialize | Initialize MCP session, returns server capabilities |
| tools/list | List available MCP tools |
| tools/call | Execute a named MCP tool |
Initialize Request
{
"jsonrpc": "2.0",
"id": 1,
"method": "initialize",
"params": {
"protocolVersion": "2024-11-05",
"capabilities": {}
}
}
Response:
{
"jsonrpc": "2.0",
"id": 1,
"result": {
"protocolVersion": "2024-11-05",
"capabilities": {"tools": {}},
"serverInfo": {
"name": "mcp-memory-service",
"version": "4.1.1"
}
}
}
Tools List
Returns all available MCP tools with their schemas.
{
"jsonrpc": "2.0",
"id": 2,
"method": "tools/list"
}
Tool Call
Execute a named tool with arguments.
{
"jsonrpc": "2.0",
"id": 3,
"method": "tools/call",
"params": {
"name": "memory_search",
"arguments": {
"query": "architecture decisions",
"limit": 5
}
}
}
Sources: src/mcp_memory_service/web/api/mcp.py:20-60
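The three request shapes above differ only in method and params, so a client can build them with one helper. This is a minimal sketch that constructs the JSON-RPC 2.0 envelopes without sending them; `jsonrpc_request` and the module-level id counter are illustrative choices.

```python
import itertools
import json
from typing import Optional

_ids = itertools.count(1)  # monotonically increasing request ids


def jsonrpc_request(method: str, params: Optional[dict] = None) -> dict:
    """Build a JSON-RPC 2.0 request; params is omitted when not needed."""
    request = {"jsonrpc": "2.0", "id": next(_ids), "method": method}
    if params is not None:
        request["params"] = params
    return request


init = jsonrpc_request("initialize", {"protocolVersion": "2024-11-05", "capabilities": {}})
listing = jsonrpc_request("tools/list")
call = jsonrpc_request("tools/call", {
    "name": "memory_search",
    "arguments": {"query": "architecture decisions", "limit": 5},
})
body = json.dumps(call)  # the string to POST to /api/mcp
```

Each response carries the same id as its request, which lets a client match replies when several calls are in flight.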
Response Limiting
The API implements automatic response truncation to prevent context window overflow.
Behavior
| Scenario | Action |
|---|---|
| MCP_MAX_RESPONSE_CHARS=0 | No limit (backward compatible) |
| MCP_MAX_RESPONSE_CHARS>0 | Truncate at memory boundaries |
| Response truncated | Include warning header |
Truncation Metadata
When responses are truncated, the following metadata is included:
{
"warning": "RESPONSE TRUNCATED: Showing 5 of 25 results (15000 of 75000 chars)",
"meta": {
"truncated": true,
"shown_results": 5,
"total_results": 25,
"shown_chars": 15000,
"total_chars": 75000,
"omitted_count": 20
}
}
Sources: src/mcp_memory_service/server/utils/response_limiter.py:60-120
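Truncating at memory boundaries means whole results are kept until the character budget runs out, and no result is ever cut in half. The sketch below reproduces the metadata shape shown above under that reading. `truncate_results` is a hypothetical function, not the implementation in response_limiter.py.

```python
def truncate_results(results: list, max_chars: int) -> dict:
    """Keep whole results until the character budget is spent; 0 means unlimited."""
    total_chars = sum(len(r) for r in results)
    if max_chars <= 0 or total_chars <= max_chars:
        return {"results": results, "meta": {"truncated": False}}
    shown, used = [], 0
    for text in results:
        if used + len(text) > max_chars:
            break  # never cut inside a memory
        shown.append(text)
        used += len(text)
    meta = {
        "truncated": True,
        "shown_results": len(shown),
        "total_results": len(results),
        "shown_chars": used,
        "total_chars": total_chars,
        "omitted_count": len(results) - len(shown),
    }
    warning = (f"RESPONSE TRUNCATED: Showing {len(shown)} of {len(results)} results "
               f"({used} of {total_chars} chars)")
    return {"results": shown, "warning": warning, "meta": meta}


# 25 results of 3000 chars each with a 15000-char budget matches the numbers above.
out = truncate_results(["x" * 3000] * 25, max_chars=15000)
```

Note that omitted_count is always total_results minus shown_results, which gives a quick consistency check on any truncation message.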
Data Models
Memory Object
| Field | Type | Description |
|---|---|---|
| content | string | Memory text content |
| content_hash | string | 8-character SHA256 hash |
| tags | string[] | Associated tags |
| memory_type | string | Classification (note, reference, decision, etc.) |
| created_at | ISO8601 | Creation timestamp |
| updated_at | ISO8601 | Last modification timestamp |
| metadata | object | Custom key-value pairs |
| created_at_iso | string | ISO format creation time |
| updated_at_iso | string | ISO format modification time |
Search Result
| Field | Type | Description |
|---|---|---|
| memories | Memory[] | Matching memories |
| found | integer | Total matches |
| shown | integer | Results returned (after limit) |
| query_time_ms | float | Search duration |
Export Endpoints
Export Memories
Export memories in text or JSON format for backup and migration.
GET /api/export
| Parameter | Type | Location | Description |
|---|---|---|---|
| format | string | query | text or json (default: text) |
JSON Export Format:
{
"export_metadata": {
"source_machine": "machine-name",
"export_timestamp": "2025-08-12T10:30:00Z",
"total_memories": 450,
"database_path": "/path/to/sqlite_vec.db",
"platform": "Windows",
"exporter_version": "5.0.0"
},
"memories": [
{
"content": "Memory content here",
"content_hash": "sha256hash",
"tags": ["tag1", "tag2"],
"created_at": 1673545200.0,
"updated_at": 1673545200.0,
"memory_type": "note",
"metadata": {},
"export_source": "machine-name"
}
]
}
Sources: scripts/sync/README.md
Performance Characteristics
| Operation | First Call | Subsequent Calls |
|---|---|---|
| Search | ~50ms | ~5-10ms |
| Store | ~50ms | ~10-20ms |
| Health Check | ~50ms | ~5ms |
Cost Estimate: At $0.15/1M tokens: ~$16.43/year per 10-user deployment.
Sources: src/mcp_memory_service/api/__init__.py
API Client Library
For programmatic access, use the Python client:
from mcp_memory_service.api import search, store, health
# Store a memory (avoid naming the result `hash`, which shadows the built-in)
content_hash = store("New memory", tags=["note", "important"])
# Search memories
results = search("architecture decisions", limit=5)
for m in results.memories:
    print(f"{m.hash}: {m.preview[:50]}...")
# Health check
info = health()
print(f"Backend: {info.backend}, Count: {info.count}")
Sources: src/mcp_memory_service/api/__init__.py
Error Handling
Standard Error Response
{
"error": {
"code": -32601,
"message": "Method not found: {method}"
}
}
Error Codes
| Code | Meaning |
|---|---|
| -32600 | Invalid Request |
| -32601 | Method not found |
| -32602 | Invalid params |
| -32603 | Internal error |
| -32000 | Storage unavailable |
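On the client side, the error table maps naturally onto an exception. The sketch below assumes responses carry an `error` object in the shape shown above. `MemoryServiceError`, `raise_for_error`, and the rule that only storage outages are retryable are hypothetical choices for illustration.

```python
ERROR_MEANINGS = {
    -32600: "Invalid Request",
    -32601: "Method not found",
    -32602: "Invalid params",
    -32603: "Internal error",
    -32000: "Storage unavailable",
}


class MemoryServiceError(Exception):
    """Hypothetical client-side exception wrapping an error response."""

    def __init__(self, code: int, message: str):
        super().__init__(f"[{code}] {message}")
        self.code = code
        # Assumption: only storage outages are worth retrying.
        self.retryable = code == -32000


def raise_for_error(response: dict) -> dict:
    """Return the response unchanged, or raise if it carries an `error` object."""
    error = response.get("error")
    if error is None:
        return response
    code = error.get("code", -32603)
    message = error.get("message") or ERROR_MEANINGS.get(code, "Unknown error")
    raise MemoryServiceError(code, message)
```

Calling `raise_for_error` on every decoded response keeps error handling in one place and lets callers decide whether to retry based on the `retryable` flag.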
Documentation
Interactive API documentation is available at:
| URL | Format |
|---|---|
| /api/docs | Swagger UI |
| /api/redoc | ReDoc |
Sources: src/mcp_memory_service/web/app.py
Agent Framework Integration
Related topics: REST API Reference, Quick Start Guide
The MCP Memory Service provides comprehensive integration capabilities with various AI agent frameworks, enabling persistent semantic memory storage and retrieval across multi-agent architectures. This integration layer allows autonomous agents to maintain contextual awareness, share knowledge, and persist learning across sessions.
Overview
The MCP Memory Service functions as a central knowledge backbone for AI agent frameworks. Rather than requiring each agent to maintain isolated memory stores, the service provides a unified semantic memory layer that can be accessed via:
- Model Context Protocol (MCP) Tools: Native integration for MCP-compatible agents
- REST API: HTTP-based access for any framework with HTTP client capabilities
- Direct Python API: Programmatic access for Python-native frameworks
This architecture enables agents to:
- Store discovered facts and learned patterns
- Retrieve contextually relevant memories during reasoning
- Share knowledge across agent teams
- Maintain persistent state across sessions
Sources: src/mcp_memory_service/api/__init__.py:1-50
Supported Agent Frameworks
The MCP Memory Service integrates with the following agent frameworks through dedicated documentation and examples:
| Framework | Integration Type | Documentation |
|---|---|---|
| LangGraph | SDK/Graph-based | docs/agents/langgraph.md |
| CrewAI | Team-based agents | docs/agents/crewai.md |
| AutoGen | Multi-agent conversations | docs/agents/autogen.md |
| Custom Frameworks | HTTP/REST | docs/agents/http-generic.md |
Sources: docs/agents/README.md
Architecture for Multi-Agent Systems
graph TD
A[Agent Framework] --> B[MCP Memory Service API]
B --> C[Memory Management Layer]
C --> D[(SQLite-vec Storage)]
C --> E[(Cloudflare D1)]
F[Agent 1] --> B
G[Agent 2] --> B
H[Agent N] --> B
I[Embedding Generation] --> C
J[all-MiniLM-L6-v2] --> IMemory Flow in Agent Workflows
sequenceDiagram
participant Agent
participant MCP as MCP Memory Service
participant Storage as Vector Storage
Agent->>MCP: Store Memory (content, tags)
MCP->>MCP: Generate Embedding
MCP->>Storage: Store + Index
Storage-->>MCP: Confirm
MCP-->>Agent: Content Hash
Agent->>MCP: Semantic Search (query)
MCP->>MCP: Generate Query Embedding
MCP->>Storage: Similarity Search
Storage-->>MCP: Top-K Results
MCP-->>Agent: Relevant Memories
Sources: docs/guides/AGENTS.md
MCP Prompt Integration
The service exposes specialized prompts for agent workflows beyond basic memory operations:
Available Agent Prompts
| Prompt Name | Purpose | Required Arguments |
|---|---|---|
| knowledge_export | Export memories in specific formats | format (json/markdown/text) |
| memory_cleanup | Remove duplicates/outdated memories | older_than, similarity_threshold |
| learning_session | Store structured learning notes | topic, key_points |
Sources: src/mcp_memory_service/server_impl.py:150-200
Prompt Argument Specifications
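from mcp import types  # MCP Python SDK

# knowledge_export: the output format must be supplied
types.PromptArgument(
    name="format",
    description="Export format (json, markdown, text)",
    required=True,
)
# memory_cleanup: age filter, optional
types.PromptArgument(
    name="older_than",
    description="Remove memories older than (e.g., '6 months', '1 year')",
    required=False,
)
# memory_cleanup: duplicate-detection threshold, optional
types.PromptArgument(
    name="similarity_threshold",
    description="Similarity threshold for duplicates (0.0-1.0)",
    required=False,
)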
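The argument descriptions above imply some input handling for `older_than` and `similarity_threshold`. The following is a minimal sketch of what such validation could look like. `parse_older_than` and `validate_similarity_threshold` are hypothetical helpers, not functions from the service, and the day counts for months and years are deliberate approximations.

```python
import re
from datetime import timedelta

# Approximate day counts; calendar-accurate arithmetic is out of scope here.
_UNIT_DAYS = {"day": 1, "week": 7, "month": 30, "year": 365}


def parse_older_than(value: str) -> timedelta:
    """Parse strings such as '6 months' or '1 year' into an approximate timedelta."""
    match = re.fullmatch(r"\s*(\d+)\s+(day|week|month|year)s?\s*", value.lower())
    if match is None:
        raise ValueError(f"unrecognized age expression: {value!r}")
    amount, unit = int(match.group(1)), match.group(2)
    return timedelta(days=amount * _UNIT_DAYS[unit])


def validate_similarity_threshold(value: float) -> float:
    """Accept only thresholds inside the documented 0.0-1.0 range."""
    if not 0.0 <= value <= 1.0:
        raise ValueError(f"similarity_threshold must be within 0.0-1.0, got {value}")
    return value


six_months = parse_older_than("6 months")
one_year = parse_older_than("1 year")
threshold = validate_similarity_threshold(0.95)
```

Rejecting malformed values before a cleanup runs is a cautious choice for a destructive operation; the actual prompt handler may be more lenient.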
REST API Integration
For agent frameworks that prefer HTTP-based communication, the service provides a comprehensive REST API:
Core Endpoints
| Method | Endpoint | Purpose |
|---|---|---|
| POST | /api/memories | Store new memory |
| GET | /api/memories | List memories with pagination |
| GET | /api/memories/{hash} | Retrieve specific memory |
| DELETE | /api/memories/{hash} | Delete memory |
| POST | /api/search | Semantic similarity search |
| GET | /api/search/similar/{hash} | Find similar memories |
Sources: src/mcp_memory_service/web/app.py:50-100
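As a rough illustration of how a client might address the endpoints in the table, the sketch below builds requests with the standard library without sending them. The helper names are hypothetical, the host is the placeholder used elsewhere on this page, and the `query` field in the search body is an assumption; only the `content` and `tags` fields appear in this page's own example.

```python
import json
from typing import Optional
from urllib.parse import quote
from urllib.request import Request

BASE_URL = "https://your-endpoint:8443"  # placeholder host, replace with your own


def build_request(method: str, path: str, payload: Optional[dict] = None) -> Request:
    """Build (but do not send) a request for one of the endpoints in the table."""
    data = json.dumps(payload).encode("utf-8") if payload is not None else None
    headers = {"Content-Type": "application/json"} if data is not None else {}
    return Request(f"{BASE_URL}{path}", data=data, headers=headers, method=method)


def store_memory(content: str, tags: list) -> Request:
    return build_request("POST", "/api/memories", {"content": content, "tags": tags})


def get_memory(content_hash: str) -> Request:
    return build_request("GET", f"/api/memories/{quote(content_hash, safe='')}")


def delete_memory(content_hash: str) -> Request:
    return build_request("DELETE", f"/api/memories/{quote(content_hash, safe='')}")


def search_memories(query: str) -> Request:
    # Assumption: the search body carries the text under a "query" key.
    return build_request("POST", "/api/search", {"query": query})


def find_similar(content_hash: str) -> Request:
    return build_request("GET", f"/api/search/similar/{quote(content_hash, safe='')}")


request = store_memory("decision: use SQLite-vec", ["architecture"])
```

Any of these objects can be passed to `urllib.request.urlopen` to perform the call. Keeping construction separate from sending makes the routing easy to check without a running server.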
Response Truncation for Agent Context
To prevent context overflow in agent prompts, the response limiter truncates large result sets and reports how much was shown and omitted:
[!] RESPONSE TRUNCATED: Showing 5 of 20 results
(1500 of 8000 chars).
3 result(s) omitted to prevent context overflow.
Use specific queries or hash-based retrieval for full content.
Sources: src/mcp_memory_service/server/utils/response_limiter.py:30-60
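One way to picture this behavior is a character-budget loop that keeps whole results until the budget is spent and then adds a notice. This is a hypothetical sketch, not the code in `response_limiter.py`: the function name, the 1500-character default, and the rule that the omitted count equals total minus shown are all assumptions.

```python
def truncate_results(results: list, max_chars: int = 1500) -> str:
    """Keep whole results until the character budget is used up, then add a notice."""
    kept = []
    used = 0
    for text in results:
        if used + len(text) > max_chars:
            break  # the next result would exceed the budget
        kept.append(text)
        used += len(text)

    if len(kept) == len(results):
        return "\n".join(kept)  # everything fit, no notice needed

    total_chars = sum(len(text) for text in results)
    omitted = len(results) - len(kept)
    notice = (
        f"[!] RESPONSE TRUNCATED: Showing {len(kept)} of {len(results)} results "
        f"({used} of {total_chars} chars). "
        f"{omitted} result(s) omitted to prevent context overflow."
    )
    return "\n".join([notice, *kept])


sample = ["x" * 400] * 20
output = truncate_results(sample)
```

Truncating on whole-result boundaries avoids handing an agent a memory that has been cut mid-sentence, at the cost of sometimes leaving part of the budget unused.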
Claude Code Integration
The MCP Memory Service includes specialized hooks for Claude Code CLI integration:
Available Claude Commands
| Command | Purpose |
|---|---|
| `/memory-recall` | Time-based memory retrieval using natural language |
| `/memory-search` | Tag and content search |
| `/memory-context` | Session context integration with machine source ID |
| `/memory-health` | Service health diagnostics |
Sources: claude_commands/README.md
Session Hook Workflow
graph LR
A[Session Start] --> B[Load Relevant Memories]
B --> C[Inject into Context]
D[During Session] --> E[Track Decisions]
E --> F[Store Key Insights]
G[Session End] --> H[Save Decisions]
H --> I[Archive Insights]
I --> J[Persistent Storage]
Performance Characteristics for Agents
| Metric | Value | Notes |
|---|---|---|
| First Call Latency | ~50ms | Includes storage initialization |
| Subsequent Calls | ~5-10ms | Connection reused |
| Memory Overhead | <10MB | Per agent instance |
| Embedding Model | all-MiniLM-L6-v2 | 384-dimensional vectors |
| Cost (10 users) | $16.43/year | At $0.15/1M tokens |
Sources: src/mcp_memory_service/api/__init__.py:10-20
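The cost row can be turned back into an implied token volume. The arithmetic below is a back-of-the-envelope sketch that assumes the yearly figure is purely token-priced at the quoted rate and spread evenly over 365 days; the table itself does not state either assumption.

```python
annual_cost_usd = 16.43          # from the table, for 10 users
price_per_million_tokens = 0.15  # from the table
users = 10

# Tokens per year that the quoted cost would buy at the quoted rate.
tokens_per_year = annual_cost_usd / price_per_million_tokens * 1_000_000

# Spread evenly across users and days (assumption: uniform usage).
tokens_per_user_per_day = tokens_per_year / users / 365

print(f"{tokens_per_year / 1e6:.1f}M tokens per year in total")
print(f"{tokens_per_user_per_day:,.0f} tokens per user per day")
```

Under these assumptions the figure corresponds to roughly 110 million tokens per year in total, or about 30,000 tokens per user per day.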
Installation and Setup
Automatic Installation
# Install with commands (detects Claude Code CLI)
python scripts/installation/install.py
# Force install commands
python scripts/installation/install.py --install-claude-commands
Manual HTTP Integration
For custom agent frameworks:
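import httpx

async def store_agent_memory(content: str, tags: list[str]) -> dict:
    """POST a memory to the service and return the decoded JSON response."""
    async with httpx.AsyncClient() as client:
        response = await client.post(
            "https://your-endpoint:8443/api/memories",
            json={"content": content, "tags": tags},
        )
        # Raise on 4xx/5xx instead of silently returning an error payload
        response.raise_for_status()
        return response.json()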
Sources: docs/agents/http-generic.md
Troubleshooting
| Issue | Solution |
|---|---|
| Hooks not detected | Run `ls ~/.claude/settings.json` to confirm the settings file exists, then reinstall |
| JSON parse errors | Update to the latest version, which includes the Python dict conversion |
| Connection failed | Verify the endpoint with `curl -k https://your-endpoint:8443/api/health` |
| Wrong directory | Move `~/.claude-code/hooks/*` to `~/.claude/hooks/` |
Sources: claude-hooks/README.md
Doramagic Pitfall Log
Source-linked risks stay visible on the manual page so the preview does not read like a recommendation.
The project may affect permissions, credentials, data exposure, or host boundaries.
Developers may fail before the first successful local run: [Bug]: hardcoded port in memory-client.js, breaking HTTP/HTTPS tunnels (e.g., Cloudflare)
Developers may fail before the first successful local run: chore(milvus): track optional BaseStorage overrides + test coverage gaps
Developers may fail before the first successful local run: fix(hooks): PR #952 missed `core/session-end.js` — same Cloudflare Tunnel port-fallback bug
Doramagic extracted 16 source-linked risk signals. Review them before installing or handing real data to the project.
1. Security or permission risk: [Bug]: hardcoded port in memory-client.js, breaking HTTP/HTTPS tunnels (e.g., Cloudflare)
- Severity: high
- Finding: Security or permission risk is backed by a source signal: [Bug]: hardcoded port in memory-client.js, breaking HTTP/HTTPS tunnels (e.g., Cloudflare). Treat it as a review item until the current version is checked.
- User impact: The project may affect permissions, credentials, data exposure, or host boundaries.
- Recommended check: Open the linked source, confirm whether it still applies to the current version, and keep the first run isolated.
- Evidence: Source-linked evidence: https://github.com/doobidoo/mcp-memory-service/issues/950
2. Installation risk: Developers should check this installation risk before relying on the project: [Bug]: hardcoded port in memory-client.js, breaking HTTP/HTTPS tunnels (e.g., Cloudflare)
- Severity: medium
- Finding: Developers should check this installation risk before relying on the project: [Bug]: hardcoded port in memory-client.js, breaking HTTP/HTTPS tunnels (e.g., Cloudflare)
- User impact: Developers may fail before the first successful local run: [Bug]: hardcoded port in memory-client.js, breaking HTTP/HTTPS tunnels (e.g., Cloudflare)
- Recommended check: Before packaging this project, run the relevant install/config/quickstart check for: [Bug]: hardcoded port in memory-client.js, breaking HTTP/HTTPS tunnels (e.g., Cloudflare). Context: Observed when using node, python, docker, macos
- Evidence: failure_mode_cluster:github_issue | fmev_dd89642370c2dba2d6aacf12756658a6 | https://github.com/doobidoo/mcp-memory-service/issues/950 | [Bug]: hardcoded port in memory-client.js, breaking HTTP/HTTPS tunnels (e.g., Cloudflare)
3. Installation risk: Developers should check this installation risk before relying on the project: chore(milvus): track optional BaseStorage overrides + test coverage gaps
- Severity: medium
- Finding: Developers should check this installation risk before relying on the project: chore(milvus): track optional BaseStorage overrides + test coverage gaps
- User impact: Developers may fail before the first successful local run: chore(milvus): track optional BaseStorage overrides + test coverage gaps
- Recommended check: Before packaging this project, run the relevant install/config/quickstart check for: chore(milvus): track optional BaseStorage overrides + test coverage gaps. Context: Observed when using docker
- Evidence: failure_mode_cluster:github_issue | fmev_74209176888c160a35483f3156117496 | https://github.com/doobidoo/mcp-memory-service/issues/888 | chore(milvus): track optional BaseStorage overrides + test coverage gaps
4. Installation risk: Developers should check this installation risk before relying on the project: fix(hooks): PR #952 missed `core/session-end.js` — same Cloudflare Tunnel port-fallback bug
- Severity: medium
- Finding: Developers should check this installation risk before relying on the project: fix(hooks): PR #952 missed `core/session-end.js` — same Cloudflare Tunnel port-fallback bug
- User impact: Developers may fail before the first successful local run: fix(hooks): PR #952 missed `core/session-end.js` — same Cloudflare Tunnel port-fallback bug
- Recommended check: Before packaging this project, run the relevant install/config/quickstart check for: fix(hooks): PR #952 missed `core/session-end.js` — same Cloudflare Tunnel port-fallback bug. Context: Observed when using windows
- Evidence: failure_mode_cluster:github_issue | fmev_b14a35b730602b08a29e3abbdfa0c377 | https://github.com/doobidoo/mcp-memory-service/issues/957 | fix(hooks): PR #952 missed `core/session-end.js` — same Cloudflare Tunnel port-fallback bug
5. Installation risk: Developers should check this installation risk before relying on the project: v10.59.0 — OAuth PEM key files, IDE redirect URI schemes, memory-scorer affinity fix
- Severity: medium
- Finding: Developers should check this installation risk before relying on the project: v10.59.0 — OAuth PEM key files, IDE redirect URI schemes, memory-scorer affinity fix
- User impact: Upgrade or migration may change expected behavior: v10.59.0 — OAuth PEM key files, IDE redirect URI schemes, memory-scorer affinity fix
- Recommended check: Before packaging this project, run the relevant install/config/quickstart check for: v10.59.0 — OAuth PEM key files, IDE redirect URI schemes, memory-scorer affinity fix. Context: Observed when using python
- Evidence: failure_mode_cluster:github_release | fmev_d0ce94252816336aa4ecbd45eeb73603 | https://github.com/doobidoo/mcp-memory-service/releases/tag/v10.59.0 | v10.59.0 — OAuth PEM key files, IDE redirect URI schemes, memory-scorer affinity fix
6. Installation risk: Developers should check this installation risk before relying on the project: v10.59.1 — OAuth state parameter RFC 6749 compliance fix
- Severity: medium
- Finding: Developers should check this installation risk before relying on the project: v10.59.1 — OAuth state parameter RFC 6749 compliance fix
- User impact: Upgrade or migration may change expected behavior: v10.59.1 — OAuth state parameter RFC 6749 compliance fix
- Recommended check: Before packaging this project, run the relevant install/config/quickstart check for: v10.59.1 — OAuth state parameter RFC 6749 compliance fix. Context: Observed when using python
- Evidence: failure_mode_cluster:github_release | fmev_a0594e0fe855897f4612a17f520e81d4 | https://github.com/doobidoo/mcp-memory-service/releases/tag/v10.59.1 | v10.59.1 — OAuth state parameter RFC 6749 compliance fix
7. Installation risk: Developers should check this installation risk before relying on the project: v10.60.2 — fix(milvus): brute-force query() for semantic dedup growing-segment visibility
- Severity: medium
- Finding: Developers should check this installation risk before relying on the project: v10.60.2 — fix(milvus): brute-force query() for semantic dedup growing-segment visibility
- User impact: Upgrade or migration may change expected behavior: v10.60.2 — fix(milvus): brute-force query() for semantic dedup growing-segment visibility
- Recommended check: Before packaging this project, run the relevant install/config/quickstart check for: v10.60.2 — fix(milvus): brute-force query() for semantic dedup growing-segment visibility. Context: Observed when using python
- Evidence: failure_mode_cluster:github_release | fmev_8a1390fb930fb3d5c55aee894e53c0e3 | https://github.com/doobidoo/mcp-memory-service/releases/tag/v10.60.2 | v10.60.2 — fix(milvus): brute-force query() for semantic dedup growing-segment visibility
8. Installation risk: Developers should check this installation risk before relying on the project: v10.63.0 — Milvus Issue #888 Complete + Kiro CLI Harvest Fix
- Severity: medium
- Finding: Developers should check this installation risk before relying on the project: v10.63.0 — Milvus Issue #888 Complete + Kiro CLI Harvest Fix
- User impact: Upgrade or migration may change expected behavior: v10.63.0 — Milvus Issue #888 Complete + Kiro CLI Harvest Fix
- Recommended check: Before packaging this project, run the relevant install/config/quickstart check for: v10.63.0 — Milvus Issue #888 Complete + Kiro CLI Harvest Fix. Context: Observed when using python
- Evidence: failure_mode_cluster:github_release | fmev_cce458b1322f9ccb8db2498d3499650d | https://github.com/doobidoo/mcp-memory-service/releases/tag/v10.63.0 | v10.63.0 — Milvus Issue #888 Complete + Kiro CLI Harvest Fix
9. Installation risk: Quality trends endpoint AttributeError on sqlite_vec backend: 'SqliteVecMemoryStorage' object has no attribute 'search_…
- Severity: medium
- Finding: Installation risk is backed by a source signal: Quality trends endpoint AttributeError on sqlite_vec backend: 'SqliteVecMemoryStorage' object has no attribute 'search_…. Treat it as a review item until the current version is checked.
- User impact: First-time setup may fail or require extra isolation and rollback planning.
- Recommended check: Open the linked source, confirm whether it still applies to the current version, and keep the first run isolated.
- Evidence: Source-linked evidence: https://github.com/doobidoo/mcp-memory-service/issues/981
10. Installation risk: chore(milvus): track optional BaseStorage overrides + test coverage gaps
- Severity: medium
- Finding: Installation risk is backed by a source signal: chore(milvus): track optional BaseStorage overrides + test coverage gaps. Treat it as a review item until the current version is checked.
- User impact: First-time setup may fail or require extra isolation and rollback planning.
- Recommended check: Open the linked source, confirm whether it still applies to the current version, and keep the first run isolated.
- Evidence: Source-linked evidence: https://github.com/doobidoo/mcp-memory-service/issues/888
11. Installation risk: fix(hooks): PR #952 missed `core/session-end.js` — same Cloudflare Tunnel port-fallback bug
- Severity: medium
- Finding: Installation risk is backed by a source signal: fix(hooks): PR #952 missed `core/session-end.js` — same Cloudflare Tunnel port-fallback bug. Treat it as a review item until the current version is checked.
- User impact: First-time setup may fail or require extra isolation and rollback planning.
- Recommended check: Open the linked source, confirm whether it still applies to the current version, and keep the first run isolated.
- Evidence: Source-linked evidence: https://github.com/doobidoo/mcp-memory-service/issues/957
12. Installation risk: fix(milvus): test_semantic_dedup_blocks_near_duplicate still fails after consistency_level=Session fix
- Severity: medium
- Finding: Installation risk is backed by a source signal: fix(milvus): test_semantic_dedup_blocks_near_duplicate still fails after consistency_level=Session fix. Treat it as a review item until the current version is checked.
- User impact: First-time setup may fail or require extra isolation and rollback planning.
- Recommended check: Open the linked source, confirm whether it still applies to the current version, and keep the first run isolated.
- Evidence: Source-linked evidence: https://github.com/doobidoo/mcp-memory-service/issues/938
Source: Doramagic discovery, validation, and Project Pack records
Community Discussion Evidence
These external discussion links are review inputs, not standalone proof that the project is production-ready.
Open the linked issues or discussions before treating the pack as ready for your environment.
Doramagic exposes project-level community discussion separately from official documentation. Review these links before using mcp-memory-service with real data or production workflows.
- Quality trends endpoint AttributeError on sqlite_vec backend: 'SqliteVec - github / github_issue
- [[automated] Contributor activity digest](https://github.com/doobidoo/mcp-memory-service/issues/937) - github / github_issue
- chore(milvus): track optional BaseStorage overrides + test coverage gaps - github / github_issue
- bug(harvest): Kiro CLI parser misses 80% of messages — wrong kind mappin - github / github_issue
- fix(hooks): PR #952 missed `core/session-end.js` — same Cloudflare Tunne - github / github_issue
- fix(milvus): test_semantic_dedup_blocks_near_duplicate still fails after - github / github_issue
- [[Bug]: hardcoded port in memory-client.js, breaking HTTP/HTTPS tunnels (e.g., Cloudflare)](https://github.com/doobidoo/mcp-memory-service/issues/950) - GitHub / issue
- Developers should check this installation risk before relying on the project: v10.59.0 — OAuth PEM key files, IDE redirect URI schemes, memory-scorer affinity fix - GitHub / issue
- Developers should check this installation risk before relying on the project: v10.59.1 — OAuth state parameter RFC 6749 compliance fix - GitHub / issue
- Developers should check this installation risk before relying on the project: v10.60.2 — fix(milvus): brute-force query() for semantic dedup growing-segment visibility - GitHub / issue
- Developers should check this installation risk before relying on the project: v10.63.0 — Milvus Issue #888 Complete + Kiro CLI Harvest Fix - GitHub / issue
- v10.54.0 — AND/OR tag filtering for memory_search - GitHub / issue
Source: Project Pack community evidence and pitfall evidence