Doramagic Project Pack · Human Manual
superlocalmemory
SuperLocalMemory (SLM) solves a fundamental problem in AI agent development: persistent, context-aware memory that survives across sessions without relying on cloud services. Unlike cloud-...
Home
Related topics: System Architecture, Modes Explained (A/B/C), Installation
SuperLocalMemory is an information-geometric agent memory system designed for AI assistants, providing mathematical guarantees for retrieval accuracy, zero-LLM inference mode, and EU AI Act compliance. The system stores, retrieves, and manages memories locally on your machine, ensuring complete data sovereignty while enabling seamless integration with Claude, Cursor, Windsurf, and 17+ AI tools.
Source: package.json
Source: https://github.com/qualixar/superlocalmemory / Human Manual
Installation
Related topics: Home, CLI Reference
This page covers the complete installation process for SuperLocalMemory V3, including prerequisites, installation methods, configuration, and troubleshooting. SuperLocalMemory is a local-first AI memory system that stores all data in ~/.superlocalmemory/memory.db by default, ensuring complete data sovereignty with zero cloud dependencies.
Overview
SuperLocalMemory supports four installation methods:
| Method | Platform | Command |
|---|---|---|
| NPM (Recommended) | macOS, Linux, Windows | `npm install -g superlocalmemory` |
| Pip | macOS, Linux, Windows | `pip install superlocalmemory` |
| Shell Script | macOS, Linux | `curl -fsSL https://raw.githubusercontent.com/qualixar/superlocalmemory/main/install.sh \| bash` |
| PowerShell | Windows | `irm https://raw.githubusercontent.com/qualixar/superlocalmemory/main/scripts/install.ps1 \| iex` |
Source: package.json:1-50
Prerequisites
System Requirements
| Component | Minimum | Recommended |
|---|---|---|
| Python | 3.9+ | 3.12+ |
| Node.js | 18.0.0 | 20.x LTS |
| npm | 9.0.0 | 10.x |
| Disk Space | 500 MB | 2 GB |
| RAM | 4 GB | 8 GB |
Required Dependencies
- Ollama (optional, for Mode A): Download from ollama.ai for fully local LLM inference
- SQLite: Bundled with Python; no separate installation needed
- Embedding Model: Automatically downloaded on first run (nomic-ai/nomic-embed-text-v1.5)
- Reranker Model: Automatically downloaded on first run (cross-encoder/ms-marco-MiniLM-L-12-v2)
Source: src/superlocalmemory/cli/setup_wizard.py:24-26
Installation Methods
Method 1: NPM Installation (Recommended)
The NPM package provides a cross-platform CLI with automatic post-install configuration.
# Install globally
npm install -g superlocalmemory
# Verify installation
slm status
What happens during installation:
- The postinstall.js script runs after npm installation completes
- Detects existing V2 installation and prompts for migration
- Triggers setup wizard for new installations
- Downloads required embedding and reranker models
Source: scripts/postinstall.js:1-50
Method 2: Pip Installation
For Python-native environments:
# Install from PyPI
pip install superlocalmemory
# Or install from source
pip install git+https://github.com/qualixar/superlocalmemory.git
First-run behavior:
The first time any slm command is executed, the setup wizard runs automatically if the .setup-complete marker is missing from the data directory.
Source: src/superlocalmemory/cli/setup_wizard.py:50-70
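The first-run check described above can be pictured as a simple marker-file test. The following is a minimal sketch, assuming the marker is an empty file named .setup-complete at the top level of the data directory; the helper names `needs_setup` and `mark_setup_complete` are hypothetical and are not taken from setup_wizard.py.

```python
from pathlib import Path

# Assumption: the marker is an empty file named ".setup-complete"
# inside the data directory (default ~/.superlocalmemory).
DEFAULT_DATA_DIR = Path.home() / ".superlocalmemory"
MARKER_NAME = ".setup-complete"


def needs_setup(data_dir: Path = DEFAULT_DATA_DIR) -> bool:
    """Return True when the setup marker is absent (hypothetical helper)."""
    return not (data_dir / MARKER_NAME).exists()


def mark_setup_complete(data_dir: Path = DEFAULT_DATA_DIR) -> None:
    """Create the data directory if needed and write the marker file."""
    data_dir.mkdir(parents=True, exist_ok=True)
    (data_dir / MARKER_NAME).touch()
```

Writing the marker only as the final step means an interrupted setup is retried on the next command rather than being silently treated as complete.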
Method 3: Shell Script (Linux/macOS)
curl -fsSL https://raw.githubusercontent.com/qualixar/superlocalmemory/main/install.sh | bash
Method 4: PowerShell (Windows)
irm https://raw.githubusercontent.com/qualixar/superlocalmemory/main/scripts/install.ps1 | iex
Note: Windows users previously encountered issues cloning repositories with special characters in filenames. This was fixed in v2.8.2. See Issue #7.
Installation Flow
flowchart TD
A[User runs install command] --> B{Installation method?}
B -->|NPM| C[postinstall.js executes]
B -->|Pip| D[pip install completes]
B -->|Shell| E[install.sh executes]
C --> F{V2 installation detected?}
D --> G{First run?}
E --> H{First run?}
F -->|Yes| I[Run V2 Migrator]
F -->|No| J[Check .setup-complete]
G -->|.setup-complete missing| K[Run Setup Wizard]
H -->|.setup-complete missing| K
G -->|.setup-complete exists| L[Ready to use]
H -->|.setup-complete exists| L
I --> M[Start Setup Wizard]
J -->|Missing| K
J -->|Exists| L
K --> N[Download embedding model]
N --> O[Download reranker model]
O --> P[Configure mode: A or B]
P --> Q[Create .setup-complete marker]
Q --> L
Setup Wizard
The interactive setup wizard runs automatically on first use or via slm setup. It performs the following steps:
Step 1: Environment Detection
The wizard detects the runtime environment:
import os
import sys

def is_interactive() -> bool:
    """True if running in a terminal (not CI, not piped, not MCP)."""
    if os.environ.get("CI"):
        return False
    if os.environ.get("SLM_NON_INTERACTIVE"):
        return False
    return sys.stdin.isatty() and sys.stdout.isatty()
Source: src/superlocalmemory/cli/setup_wizard.py:36-43
Step 2: Model Download
Two models are downloaded automatically:
| Model | Purpose | Size |
|---|---|---|
| nomic-ai/nomic-embed-text-v1.5 | Text embeddings for semantic search | ~275 MB |
| cross-encoder/ms-marco-MiniLM-L-12-v2 | Reranking for improved recall | ~90 MB |
Source: src/superlocalmemory/cli/setup_wizard.py:24-26
Step 3: Mode Configuration
Choose between two operating modes:
| Mode | Description | LLM Required |
|---|---|---|
| Mode A | Fully local with Ollama | Ollama running locally |
| Mode B | OpenAI-compatible API | API key or local proxy |
The setup wizard writes the configuration to ~/.superlocalmemory/config.json.
Source: src/superlocalmemory/core/config.py:1-100
Data Directory Configuration
Default Location
By default, all data is stored in ~/.superlocalmemory/:
~/.superlocalmemory/
├── memory.db # Main SQLite database
├── config.json # Configuration file
├── .setup-complete # Setup marker
├── models/ # Cached embedding models
└── logs/ # Application logs
Source: src/superlocalmemory/cli/setup_wizard.py:20-21
Custom Data Directory
You can customize the data directory using the SL_MEMORY_PATH environment variable:
# Linux/macOS
export SL_MEMORY_PATH=/mnt/data/slm
# Windows PowerShell
$env:SL_MEMORY_PATH="D:\data\slm"
# Run slm commands
slm remember "My custom data location"
Note: The SLM_DATA_DIR environment variable was requested in Issue #10, but the implementation uses SL_MEMORY_PATH instead. This allows storing memory data on custom paths, including external drives or network mounts.
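To illustrate how the override interacts with the default location, here is a minimal sketch of the lookup order: SL_MEMORY_PATH if set, otherwise ~/.superlocalmemory/. The function names are hypothetical, and SuperLocalMemory's own resolution logic may differ in details such as path validation.

```python
import os
from pathlib import Path
from typing import Mapping, Optional


def resolve_data_dir(env: Optional[Mapping[str, str]] = None) -> Path:
    """Pick the data directory: SL_MEMORY_PATH if set, else the default.

    Hypothetical helper; SLM's own resolution logic may differ.
    """
    env = os.environ if env is None else env
    custom = env.get("SL_MEMORY_PATH", "").strip()
    if custom:
        return Path(custom).expanduser()
    return Path.home() / ".superlocalmemory"


def database_path(env: Optional[Mapping[str, str]] = None) -> Path:
    """Assumes memory.db sits at the top level of the data directory."""
    return resolve_data_dir(env) / "memory.db"
```

Passing the environment as a parameter keeps the lookup easy to test without modifying the real process environment.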
Upgrading from V2
Users upgrading from V2 are detected automatically:
from superlocalmemory.storage.v2_migrator import V2Migrator

migrator = V2Migrator()
if migrator.detect_v2() and not migrator.is_already_migrated():
    # Run migration logic
    migrator.migrate()
Source: src/superlocalmemory/cli/post_install.py:30-45
Migration Process
- Detect V2 installation at ~/.superlocalmemory/
- Back up existing database
- Run schema migrations for V3
- Copy user profiles and settings
- Mark migration complete with version marker
Source: src/superlocalmemory/server/unified_daemon.py:50-80
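The backup step in the list above can be sketched with the online backup API in Python's standard sqlite3 module, which produces a consistent copy even while another connection is writing. This is an illustration only: the function name and the ".bak-<timestamp>" naming are assumptions, not the migrator's actual behavior.

```python
import sqlite3
import time
from pathlib import Path


def backup_database(db_path: Path) -> Path:
    """Copy a SQLite database to a timestamped sibling file.

    Uses sqlite3's online backup API so the copy is consistent even if
    another connection is writing. The ".bak-<timestamp>" naming is an
    assumption for this sketch, not SLM's convention.
    """
    stamp = time.strftime("%Y%m%d-%H%M%S")
    backup_path = db_path.with_name(f"{db_path.name}.bak-{stamp}")
    src = sqlite3.connect(db_path)
    dst = sqlite3.connect(backup_path)
    try:
        src.backup(dst)  # copies every page of the source database
    finally:
        dst.close()
        src.close()
    return backup_path
```

A plain file copy also works when nothing else has the database open; the backup API is the safer choice if a daemon may still be running.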
Post-Installation Verification
After installation, verify everything is working:
# Check installation status
slm status
# View configuration
slm config
# Test memory operations
slm remember "Test memory from installation verification"
slm recall "installation verification"
Expected output from slm status:
SuperLocalMemory V3.x.x
━━━━━━━━━━━━━━━━━━━━━━━
Mode: A
Provider: ollama
Model: llama3.2
Database: ~/.superlocalmemory/memory.db
Status: Running
Docker Installation
For containerized environments, see Issue #26 for known considerations:
Dockerfile Example
FROM python:3.12-slim
# Install Node.js for npm-based installation
RUN apt-get update && apt-get install -y curl
RUN curl -fsSL https://deb.nodesource.com/setup_20.x | bash -
RUN apt-get install -y nodejs
# Install SuperLocalMemory
RUN npm install -g superlocalmemory
# Set data directory
ENV SL_MEMORY_PATH=/data/slm
# Create data directory
RUN mkdir -p /data/slm
# Default command
CMD ["slm", "daemon"]
Docker Compose
version: '3.8'
services:
  superlocalmemory:
    build: .  # build the Dockerfile above; the stock python:3.12-slim image does not include slm
    environment:
      - SL_MEMORY_PATH=/data/slm
    volumes:
      - slm-data:/data/slm
    command: slm daemon
    ports:
      - "8765:8765"
volumes:
  slm-data:
Troubleshooting
Installation Hangs
If slm remember hangs with no response, you are likely running a version older than v3.3.19, which fixed this issue. Upgrade to the latest version:
npm install -g superlocalmemory@latest
See Issue #11 for details.
Model Download Failures
If model downloads fail, manually download using Ollama:
ollama pull nomic-embed-text
Permission Errors (Linux/macOS)
# Fix npm global directory permissions
mkdir -p ~/.npm-global
npm config set prefix '~/.npm-global'
export PATH=~/.npm-global/bin:$PATH
# Or use sudo (not recommended)
sudo npm install -g superlocalmemory
Windows PATH Issues
If slm command is not recognized after installation:
- Find npm global bin directory: npm config get prefix
- Add to System PATH
- Restart terminal
API Key Not Working (Mode B)
If api_key is silently dropped in Mode B, check Issue #9. The workaround is to ensure api_key is properly set in config.json:
{
  "llm": {
    "provider": "openai",
    "model": "gpt-4",
    "api_key": "your-api-key",
    "api_base": "https://api.openai.com/v1"
  }
}
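Because the symptom is a silently dropped key, it can help to check the file directly rather than rely on the CLI. The sketch below reads config.json and reports missing or empty fields in the "llm" section; it assumes the key names shown in the example above, and `llm_config_problems` is a hypothetical helper that is not part of the slm CLI.

```python
import json
from pathlib import Path


def llm_config_problems(config_path: Path) -> list:
    """Return a list of human-readable problems with the "llm" section."""
    config = json.loads(config_path.read_text(encoding="utf-8"))
    llm = config.get("llm") or {}
    problems = []
    # Key names mirror the example config; adjust if your file differs.
    for key in ("provider", "model", "api_key", "api_base"):
        value = llm.get(key)
        if not isinstance(value, str) or not value.strip():
            problems.append(f"llm.{key} is missing or empty")
    return problems
```

An empty list means all four fields are present; anything else points at the field to restore before restarting the daemon.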
Network Configuration
By default, the daemon binds to 127.0.0.1 for security. For multi-machine setups (as requested in Issue #23), consider:
- WireGuard mesh: Recommended for trusted networks
- slm-mesh: Part of the Qualixar stack for distributed memory
- Custom proxy: Forward ports through your own reverse proxy
Note: The SLM_HOST feature request for configurable bind addresses is tracked in Issue #23.
Quick Reference
| Command | Description |
|---|---|
| npm install -g superlocalmemory | Install via npm |
| slm setup | Run setup wizard |
| slm status | Check installation status |
| slm config | View/edit configuration |
| slm daemon | Start the daemon manually |
| slm restart | Restart daemon after config changes |
See Also
- Universal Architecture - System architecture overview
- MCP Integration - IDE integration setup
- Home - Main wiki page with feature overview
- FAQ - Common questions and answers
Migration from V2
Related topics: Home, Installation
This page documents the migration process from SuperLocalMemory V2 to V3, including how data is transferred, what changes occur during migration, and how to troubleshoot common issues.
Overview
SuperLocalMemory V3 introduces a complete architectural redesign while maintaining backward compatibility with your existing V2 data. The migration system automatically detects V2 installations, preserves all memories, and performs necessary schema transformations to ensure a seamless upgrade experience.
Why Migrate?
The V2 to V3 migration delivers significant improvements:
| Feature | V2 | V3 |
|---|---|---|
| Memory Engine | SQLite-based | V3 Engine with Fisher-Rao similarity |
| Trust System | Basic | 4-channel retrieval with trust scores |
| Learning | Adaptive ranking | Behavioral learning with zero-LLM inference |
| Compliance | Basic audit | EU AI Act compliant with immutable trails |
| Multi-Agent | Flat namespace | Multi-scope memory (personal/shared/global) |
| LLM Dependency | Required | Mode A (LLM) and Mode B (zero-LLM) |
Source: package.json | Community: RFC Multi-Scope Memory #20
Migration Architecture
Component Overview
The migration system consists of three primary components:
graph TD
A[V2 Installation<br/>~/.superlocalmemory/] --> B[V2Migrator<br/>Detection & Analysis]
B --> C{Migration Status}
C -->|Not Migrated| D[Run Migration]
C -->|Already Migrated| E[Skip Migration]
D --> F[Schema Transformation<br/>migrations.py]
F --> G[V3 Database<br/>Preserved Data]
E --> G
H[Post-Install Script<br/>post_install.py] --> I[User Prompt<br/>if V2 detected]
I --> J[Confirm Migration]
J --> D
Source: src/superlocalmemory/cli/post_install.py:1-50
Data Preservation
During migration, the system preserves the following V2 data:
| V2 Data Type | V3 Preservation | Transformation |
|---|---|---|
| Memories | ✅ Complete | Schema migration |
| Profiles | ✅ Complete | Enhanced with trust |
| Learning data | ✅ Complete | Behavioral learning format |
| Configuration | ⚠️ Partial | Recommended review |
| Chat histories | ✅ Via integrations | LlamaIndex/LangChain adapters |
Source: src/superlocalmemory/storage/v2_migrator.py
Migration Detection
Automatic Detection
The V3 installation automatically detects existing V2 installations during the post-install phase. This detection runs through the post_install.py script which is triggered by both npm and pip installations.
sequenceDiagram
participant User
participant PostInstall as post_install.py
participant Migrator as V2Migrator
participant Daemon as unified_daemon.py
User->>PostInstall: npm install -g superlocalmemory
PostInstall->>Migrator: detect_v2()
Migrator-->>PostInstall: V2 installation found
PostInstall->>Migrator: is_already_migrated()
Migrator-->>PostInstall: False
PostInstall->>User: Prompt for migration
User->>PostInstall: Confirm
PostInstall->>Migrator: migrate()
Migrator-->>PostInstall: Success
PostInstall->>Daemon: Mark as migrated
Source: src/superlocalmemory/cli/post_install.py:30-60
Detection Logic
The V2Migrator class implements two key detection methods:
| Method | Purpose | Source |
|---|---|---|
| detect_v2() | Checks for existence of V2 data directory | v2_migrator.py |
| is_already_migrated() | Prevents re-migration of already migrated data | v2_migrator.py |
Version Marker System
V3 uses a version marker system to track upgrades and prevent duplicate migration attempts. This marker is written only after successful migration completion.
# From unified_daemon.py - version marker logic
_want_write_marker = _prev != _slm_version
if _want_write_marker:
    if _prev is None:
        logger.info(
            "[slm] first boot on v%s — run `slm status` to see your "
            "memory overview. Changelog: "
            "https://github.com/qualixar/superlocalmemory/blob/598b2fc1ce9af40b8b58ac24d2db4827513300b0/CHANGELOG.md",
            _slm_version,
        )
    else:
        logger.info(
            "[slm] upgraded %s → %s. Data migrations run in a moment; "
            "your 18k+ atomic facts are preserved.",
            _prev, _slm_version,
        )
Source: src/superlocalmemory/server/unified_daemon.py:1-40
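The marker logic reduces to three cases: no marker (first boot), a marker with a different version (upgrade), and a matching marker (nothing to do). The sketch below shows that classification in isolation. The marker file name and all function names are hypothetical, since the excerpt does not show how `_prev` is read or where the marker is stored.

```python
from pathlib import Path
from typing import Optional


def read_marker(marker: Path) -> Optional[str]:
    """Return the previously recorded version, or None on first boot."""
    try:
        return marker.read_text(encoding="utf-8").strip() or None
    except FileNotFoundError:
        return None


def classify_boot(marker: Path, current: str) -> str:
    """Classify startup as 'first_boot', 'upgraded', or 'unchanged'."""
    prev = read_marker(marker)
    if prev == current:
        return "unchanged"
    return "first_boot" if prev is None else "upgraded"


def write_marker(marker: Path, current: str) -> None:
    """Record the running version; call only after migrations succeed."""
    marker.parent.mkdir(parents=True, exist_ok=True)
    marker.write_text(current + "\n", encoding="utf-8")
```

Deferring `write_marker` until migrations have finished matches the rule stated above: a failed migration leaves the old marker in place, so the next start retries.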
Migration Workflow
Step-by-Step Process
- Detection Phase
  - Post-install script runs V2Migrator.detect_v2()
  - Checks for V2 data directory at ~/.superlocalmemory/
- Confirmation Phase
  - If V2 detected and not already migrated, prompt user for confirmation
  - Display migration summary and estimated duration
- Schema Migration Phase
  - Run additive schema migrations via migrations.py
  - Transform V2 memories to V3 format
  - Preserve all metadata and importance scores
- Verification Phase
  - Verify all memories transferred correctly
  - Check profile integrity
  - Validate learning data
- Completion Phase
  - Set migration marker to prevent re-migration
  - Display upgrade banner with changelog link
Source: src/superlocalmemory/storage/migrations.py
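An additive schema migration only adds tables or columns and never drops or rewrites existing data, which is what makes it safe to re-run. The following is a minimal sketch of one such step in SQLite; the `memories` table and `trust_score` column used in the usage note are illustrative assumptions, not the actual V3 schema or the contents of migrations.py.

```python
import sqlite3


def add_column_if_missing(conn, table, column, decl):
    """Add a column only when it is absent; return True if it was added.

    Table and column names are interpolated into SQL, so they must come
    from trusted code, never from user input.
    """
    # PRAGMA table_info yields one row per column; index 1 is the name.
    existing = {row[1] for row in conn.execute(f"PRAGMA table_info({table})")}
    if column in existing:
        return False
    conn.execute(f"ALTER TABLE {table} ADD COLUMN {column} {decl}")
    conn.commit()
    return True
```

Calling `add_column_if_missing(conn, "memories", "trust_score", "REAL DEFAULT 1.0")` twice adds the column once and is a no-op the second time, so existing rows keep their content and simply pick up the default.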
Migration Data Flow
graph LR
subgraph V2_Data["V2 Data (~/.superlocalmemory/)"]
A2[memories.db]
B2[profiles.json]
C2[learning_data.json]
end
subgraph Migration["Migration Layer"]
D[V2Migrator]
E[Schema Migrations]
end
subgraph V3_Data["V3 Data"]
A3[memory.db<br/>V3 Schema]
B3[Profiles<br/>Enhanced]
C3[Behavioral Learning]
end
A2 --> D
B2 --> D
C2 --> D
D --> E
E --> A3
E --> B3
E --> C3
Configuration After Migration
Required Configuration Review
After migration, certain V2 configuration options may require manual review:
| Config Option | V2 Behavior | V3 Behavior | Action Required |
|---|---|---|---|
| LLM_BACKBONE | Ollama only | Multiple providers | Verify if using non-Ollama |
| SLM_DATA_DIR | Not implemented | Now supported | Optional relocation |
| SLM_HOST | Hardcoded 127.0.0.1 | Configurable | Review for multi-machine setups |
| api_key | Dropped silently | Now preserved | Verify Mode B providers |
Source: Community Issue #9 | Community Issue #10 | Community Issue #23
Mode Configuration
V3 introduces dual-mode operation:
| Mode | Description | LLM Required |
|---|---|---|
| Mode A | LLM-powered retrieval with Fisher-Rao similarity | Yes |
| Mode B | Zero-LLM mode using embedding similarity | No |
The setup wizard (setup_wizard.py) guides new users through mode selection. Existing V2 users maintain their configuration but should verify it after migration.
Source: src/superlocalmemory/cli/setup_wizard.py:1-30
Common Issues and Troubleshooting
Issue: Migration Fails Silently
Symptom: Post-install completes without prompting for migration.
Diagnosis:
# Check migration status
python -c "from superlocalmemory.storage.v2_migrator import V2Migrator; m = V2Migrator(); print(f'V2: {m.detect_v2()}, Migrated: {m.is_already_migrated()}')"
Resolution: If migration marker exists but data wasn't transferred, manually run:
slm migrate --force
Source: src/superlocalmemory/storage/v2_migrator.py
Issue: api_key Dropped for Mode B
Symptom: Mode B configured with OpenAI-compatible API, but LLM unavailable.
Affected versions: V2.8.0 - V3.3.x (fixed in V3.4+)
Diagnosis:
# Check LLM availability
slm status
Resolution: Reconfigure the API key after migration using:
slm config set llm.api_key YOUR_API_KEY
Source: Community Issue #9
Issue: Docker/Linux Memory Consolidation
Symptom: Memories not appearing after Docker restart or on different Linux machines.
Diagnosis: Check that data directory is properly mounted or configured for multi-machine access.
Resolution:
- Configure the SLM_DATA_DIR environment variable
- Use slm mesh for cross-machine sync
- Verify data persistence in Docker volume
Source: Community Issue #26
Issue: Long Wait Times on First `slm remember`
Symptom: slm remember command hangs without response.
Affected versions: Pre-V3.3.19
Resolution: Upgrade to v3.3.19 or later, which includes a fix for the streaming response handling.
Source: Community Issue #11
Manual Migration
For advanced users who prefer manual control:
Backup V2 Data
# Backup before migration
cp -r ~/.superlocalmemory ~/.superlocalmemory.backup
Force Migration
# Force migration (will re-run even if already done)
python -m superlocalmemory.storage.v2_migrator --force
Skip Migration
# Start fresh with V3 (loses V2 data)
export SLM_SKIP_MIGRATION=1
slm setup
Integration Adapters
After migration, your existing integrations continue to work:
LlamaIndex
The LlamaIndex integration provides a SuperLocalMemoryChatStore compatible with V3:
from llama_index.storage.chat_store.superlocalmemory import SuperLocalMemoryChatStore
chat_store = SuperLocalMemoryChatStore() # Uses V3 database
Source: ide/integrations/llamaindex/README.md
LangChain
from langchain_superlocalmemory import SuperLocalMemoryChatMessageHistory
history = SuperLocalMemoryChatMessageHistory(session_id="my-session")
Source: ide/integrations/langchain/README.md
Rollback Procedure
If migration causes issues:
- Restore from Backup
rm -rf ~/.superlocalmemory
cp -r ~/.superlocalmemory.backup ~/.superlocalmemory
- Reinstall V2
npm uninstall -g superlocalmemory
npm install -g superlocalmemory@<v2-version>
- Report Issue
  - Create issue at GitHub Issues
  - Include migration logs from post-install
See Also
- Home - Project overview and features
- Universal Architecture - V3 architecture details
- Installation - Installation guide
- FAQ - Common questions
- Community Issues - Bug reports and feature requests
System Architecture
Related topics: Home, Retrieval Pipeline, Modes Explained (A/B/C)
SuperLocalMemory is a local-first AI memory system designed for privacy-conscious users who want persistent, semantic memory capabilities for AI assistants without relying on cloud services. The architecture follows a multi-layered design that separates storage, retrieval, API serving, and user interface concerns while maintaining tight integration between components.
Architecture Overview
The system consists of four primary layers that work together to provide memory persistence and retrieval capabilities:
graph TD
subgraph CLI["CLI Layer"]
CLI_CMD["slm remember<br/>slm recall<br/>slm forget"]
SETUP_WIZARD["Setup Wizard"]
POST_INSTALL["Post-Install"]
end
subgraph Server["API Server Layer"]
DAEMON["Unified Daemon<br/>(unified_daemon.py)"]
API["FastAPI Server<br/>(api.py)"]
UI["UI Server<br/>(ui.py)"]
ROUTES["Route Handlers<br/>(routes/*)"]
end
subgraph Core["Core Engine Layer"]
MEMORY["Memory Engine"]
CONFIG["Configuration<br/>(config.py)"]
MIGRATOR["V2 Migrator"]
end
subgraph Storage["Storage Layer"]
DB["SQLite Database<br/>(~/.superlocalmemory/memory.db)"]
CACHE["Context Cache"]
end
CLI_CMD -->|Command| DAEMON
SETUP_WIZARD -->|Initialize| DB
POST_INSTALL -->|Migrate| MIGRATOR
DAEMON -->|Manage| DB
API -->|Serve| UI
API -->|Routes| ROUTES
ROUTES -->|Query| MEMORY
MEMORY -->|Read/Write| DB
MEMORY -->|Cache| CACHE
CONFIG -->|Configure| MEMORY
The CLI layer provides the primary interface for human users and AI agents, while the API server layer handles both programmatic access and the web-based dashboard. The core engine layer implements the memory operations, and the storage layer persists all data locally in SQLite.
Core Components
Unified Daemon
The unified daemon (unified_daemon.py) serves as the central process manager for SuperLocalMemory. It handles version tracking, data migrations, and system initialization.
Key Responsibilities:
- Version banner display on startup and upgrades
- Additive schema migrations before engine initialization
- Non-blocking startup (failures in version tracking do not prevent daemon operation)
# Source: src/superlocalmemory/server/unified_daemon.py:1-50
# LLD-06 §7.3 / LLD-07 §4.1 — run additive schema migrations BEFORE
# engine init so later queries see the expected columns/tables.
# Non-fatal: any failure here is logged and the daemon still starts.
The daemon implements a fail-safe approach where version banner errors are caught and logged without blocking startup, ensuring the system remains operational even when version tracking encounters issues.
FastAPI API Server
The API server (api.py) provides REST endpoints for memory operations and serves the web-based user interface. It uses FastAPI with several middleware components for security and performance.
Middleware Stack (in order):
| Layer | Middleware | Purpose | Source |
|---|---|---|---|
| 1 (outermost) | SecurityHeadersMiddleware | Security headers | api.py:1-30 |
| 2 | GZipMiddleware | Response compression (min 1000 bytes) | api.py:1-30 |
| 3 | CORSMiddleware | Cross-origin resource sharing | ui.py:1-50 |
| 4 (innermost) | RateLimiter | Request throttling | ui.py:1-50 |
CORS Configuration:
The API server allows requests from localhost origins on two ports:
- http://localhost:8765 and http://127.0.0.1:8765
- http://localhost:8417 and http://127.0.0.1:8417
Allowed methods include GET, POST, PUT, DELETE, PATCH, and OPTIONS. Headers allowed include Content-Type, Authorization, and X-SLM-API-Key.
Rate Limiting:
- Write operations: 30 requests per 60 seconds
- Read operations: 120 requests per 60 seconds
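One common way to enforce limits of this shape is a sliding-window counter per client. The sketch below is a generic illustration of that technique using the figures above; it is not the RateLimiter class from ui.py, and keying by client identifier (for example, an IP address) is an assumption.

```python
import time
from collections import defaultdict, deque


class SlidingWindowLimiter:
    """Allow at most `limit` calls per `window` seconds for each key."""

    def __init__(self, limit, window=60.0, clock=time.monotonic):
        self.limit = limit
        self.window = window
        self.clock = clock
        self._hits = defaultdict(deque)

    def allow(self, key):
        now = self.clock()
        hits = self._hits[key]
        # Drop timestamps that have aged out of the window.
        while hits and now - hits[0] >= self.window:
            hits.popleft()
        if len(hits) >= self.limit:
            return False
        hits.append(now)
        return True


# Limits taken from the figures above; keying by client is an assumption.
write_limiter = SlidingWindowLimiter(limit=30)
read_limiter = SlidingWindowLimiter(limit=120)
```

A request handler would call `write_limiter.allow(client_id)` before a write and reject the request (typically with HTTP 429) when it returns False.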
UI Server
The UI server (ui.py) serves the web-based memory dashboard and is integrated into the FastAPI application. It searches for the UI directory in two locations:
# Source: src/superlocalmemory/server/ui.py:1-20
# V3.3.21: UI shipped inside the package for pip/npm installs.
_PKG_UI = Path(__file__).resolve().parent.parent / "ui"
_REPO_UI = Path(__file__).resolve().parent.parent.parent.parent / "ui"
UI_DIR = _PKG_UI if (_PKG_UI / "index.html").exists() else _REPO_UI
This dual-location search supports both package-installed and repository-clone deployments.
CLI Architecture
The command-line interface provides the primary interaction method for users and AI agents. The CLI is organized around commands that map to memory operations.
graph LR
subgraph Commands
REMEMBER["remember<br/>Store new memory"]
RECALL["recall<br/>Semantic search"]
FORGET["forget<br/>Delete by query"]
DELETE["delete<br/>Delete by ID"]
UPDATE["update<br/>Modify memory"]
end
subgraph Output
JSON["--json flag<br/>Agent-native format"]
HUMAN["Human readable"]
end
REMEMBER -->|result| JSON
RECALL -->|result| JSON
RECALL -->|result| HUMAN
Command Reference
| Command | Purpose | Key Options | Output |
|---|---|---|---|
| slm remember | Store a new memory | --importance, --tags, --project | Confirmation or JSON |
| slm recall | Semantic search | --limit, --json, --fast | Results list |
| slm forget | Delete by query | --dry-run, --yes, --json | Confirmation |
| slm delete | Delete by ID | --yes, --json | Confirmation |
| slm update | Modify existing memory | Various | Updated memory |
Source: src/superlocalmemory/cli/main.py:1-100
The `--fast` Flag
The recall command supports a --fast option that skips the SpreadingActivation 5th channel for sub-second response times. When enabled, only four channels execute:
- Semantic similarity
- Lexical matching
- Temporal proximity
- Structural relevance
This trade-off is recommended when you need recall results before making a tool call (e.g., before WebSearch).
JSON Output Format
The CLI supports an agent-native JSON output format with a consistent envelope structure:
{
  "success": true,
  "command": "recall",
  "version": "3.4.58",
  "data": [...],
  "next_actions": [...]
}
The version is read from package.json (npm installs), pyproject.toml (pip installs), or falls back to importlib.metadata.
Source: src/superlocalmemory/cli/json_output.py:1-60
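An agent or script can consume this envelope by invoking the CLI and decoding its standard output. The sketch below uses only the flags listed in the command reference (--json, --limit, --fast) and the envelope fields shown above. It assumes --limit takes a numeric value as a separate argument and that the envelope is the only content written to stdout; the function names are hypothetical.

```python
import json
import subprocess


def build_recall_argv(query, limit=5, fast=True):
    """Assemble the argument list for `slm recall` with JSON output."""
    argv = ["slm", "recall", query, "--json", "--limit", str(limit)]
    if fast:
        argv.append("--fast")
    return argv


def parse_envelope(raw):
    """Decode the JSON envelope and return its data list.

    Raises RuntimeError when "success" is not true.
    """
    envelope = json.loads(raw)
    if not envelope.get("success"):
        raise RuntimeError(f"{envelope.get('command')} failed: {raw}")
    return envelope.get("data", [])


def recall(query, limit=5, fast=True):
    """Run the CLI and return the decoded results (requires slm on PATH)."""
    done = subprocess.run(
        build_recall_argv(query, limit, fast),
        capture_output=True, text=True, check=True,
    )
    return parse_envelope(done.stdout)
```

Checking the "success" field before touching "data" keeps error envelopes from being mistaken for empty result sets.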
Setup and Initialization
Setup Wizard
The setup wizard (setup_wizard.py) runs automatically on first use or via slm setup. It handles:
- Model downloads (embedding model: nomic-ai/nomic-embed-text-v1.5)
- Reranker model downloads (cross-encoder/ms-marco-MiniLM-L-12-v2)
- Mode configuration
- Installation verification
Source: src/superlocalmemory/cli/setup_wizard.py:1-50
The wizard detects non-interactive environments (CI, piped input, MCP calls) and skips interactive prompts in those contexts.
Post-Install Process
For npm installations, a post-install script runs after npm install -g superlocalmemory. It performs:
- Version banner check (detects upgrades from prior versions)
- V2 installation detection
- Migration prompt if V2 data exists
- Setup wizard invocation for new users
Source: src/superlocalmemory/cli/post_install.py:1-50
Data Directory
The default data directory is ~/.superlocalmemory/, configurable via:
- Environment variable: SL_MEMORY_PATH (Python layer)
- Environment variable: SLM_DATA_DIR (documented but noted as potentially unused in some versions)
All memories are stored in memory.db within this directory.
Route Architecture
The API server includes multiple route modules that handle different aspects of memory operations:
Registered Routers
| Router | Purpose | Source |
|---|---|---|
| memories_router | Core memory CRUD operations | routes/memories.py |
| stats_router | Statistics and analytics | routes/stats.py |
| profiles_router | Profile management | routes/profiles.py |
| backup_router | Backup and restore | routes/backup.py |
| events_router | Audit trail events | routes/events.py |
| v3_router | V3 dashboard and advanced features | routes/v3_api.py |
| chat_router | Chat with memory context (SSE) | routes/chat.py |
Chat Route (SSE Streaming)
The chat route implements server-sent events for streaming LLM responses with memory context and citation detection:
sequenceDiagram
participant Client
participant Server
participant Memory
participant LLM
Client->>Server: POST /chat with query
Server->>Memory: Retrieve relevant memories
Memory-->>Server: List of memories with trust scores
Server->>Server: Build context with citation markers
Server->>LLM: Stream response request
LLM-->>Server: Token stream
Server-->>Client: SSE events (token, done, error)
Source: src/superlocalmemory/server/routes/chat.py:1-100
The system prompt instructs the LLM to cite memories using markers like [MEM-1], [MEM-2], etc., enabling traceable responses.
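The citation scheme has two halves: numbering the retrieved memories when building the prompt context, and mapping markers in the response back to those memories. The sketch below shows both under the assumption that markers follow the [MEM-n] pattern with 1-based numbering; the function names are hypothetical and are not taken from chat.py.

```python
import re

CITATION_RE = re.compile(r"\[MEM-(\d+)\]")


def build_context(memories):
    """Number each memory so the model can cite it as [MEM-n]."""
    return "\n".join(f"[MEM-{i}] {text}" for i, text in enumerate(memories, start=1))


def cited_memories(response, memories):
    """Return the memories a response cites, in first-citation order."""
    seen = []
    for match in CITATION_RE.finditer(response):
        index = int(match.group(1))
        # Ignore markers that point outside the retrieved set.
        if 1 <= index <= len(memories) and index not in seen:
            seen.append(index)
    return [memories[i - 1] for i in seen]
```

Discarding out-of-range markers guards against a model citing a memory number that was never supplied.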
Optional Feature Routers
Several routers are loaded gracefully and do not block startup if unavailable:
- learning - Adaptive learning from user feedback
- lifecycle - Memory lifecycle management
- behavioral - Behavioral pattern recognition
- compliance - Enterprise compliance features
# Source: src/superlocalmemory/server/ui.py:100-120
for _module_name in ("learning", "lifecycle", "behavioral", "compliance"):
    try:
        _mod = __import__(f"superlocalmemory.server.routes.{_module_name}", fromlist=["router"])
        application.include_router(_mod.router)
    except (ImportError, Exception):
        pass
Multi-Profile Architecture
SuperLocalMemory supports multiple isolated profiles, where each profile maintains its own:
- Memory entries
- Learning data
- Preferences
- Feedback
Profile isolation ensures that memories from one profile cannot leak to another, a security feature introduced in v2.6.0.
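At the storage level, isolation of this kind usually means every query is scoped by a profile identifier. The sketch below illustrates the idea with a parameterized SQLite query. The `memories` table and its `profile_id` and `content` columns are assumptions made for illustration and do not reflect the actual V3 schema.

```python
import sqlite3


def recall_for_profile(conn, profile_id, term):
    """Return memory contents matching `term` within a single profile."""
    rows = conn.execute(
        "SELECT content FROM memories "
        "WHERE profile_id = ? AND content LIKE ? ORDER BY id",
        (profile_id, f"%{term}%"),
    )
    return [content for (content,) in rows]
```

Because the profile filter is part of the query itself rather than applied afterwards, rows belonging to other profiles are never loaded into the caller's result set.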
Data Flow
graph TD
subgraph Ingestion
USER["User Input<br/>(CLI/API)"]
AGENT["AI Agent<br/>(MCP/Tools)"]
INTEGRATION["Integrations<br/>(LangChain/LlamaIndex)"]
end
subgraph Processing
ROUTE["Route Handler"]
VALIDATE["Validation"]
STORE["Storage Engine"]
end
subgraph Retrieval
SEARCH["Search Engine"]
RERANK["Reranker"]
FUSE["Result Fusion"]
end
USER -->|slm remember| ROUTE
AGENT -->|MCP Tools| ROUTE
INTEGRATION -->|Chat History| ROUTE
ROUTE --> VALIDATE
VALIDATE --> STORE
STORE -->|Query| SEARCH
SEARCH --> RERANK
RERANK --> FUSE
FUSE -->|Results| USER
FUSE -->|Results| AGENT
Security Considerations
Hardcoded Bind Address
As noted in GitHub Issue #23, both the SLM daemon and related components (like slm-mesh broker) currently hardcode 127.0.0.1 as the bind address. This is appropriate for single-machine usage but limits multi-machine deployments over trusted networks.
Profile Isolation
Memory queries enforce profile isolation at the API layer, preventing cross-profile data leakage.
Rate Limiting
The API implements per-endpoint rate limiting to prevent abuse, with stricter limits on write operations.
Common Architecture Patterns
Lazy Import Pattern
Several modules use lazy imports to keep module-level imports fast:
# Source: src/superlocalmemory/server/routes/prewarm.py:1-30
def _compute_topic_sig(prompt: str) -> str:
    """Lazy import so module import is free of hot-path SLM modules."""
    from superlocalmemory.core.topic_signature import compute_topic_signature
    return compute_topic_signature(prompt)
Graceful Degradation
Optional features are loaded with try/except blocks, ensuring the core system remains functional even when optional components fail:
# Source: src/superlocalmemory/server/routes/prewarm.py:30-50
def _upsert_cache(...) -> None:
    from superlocalmemory.core.context_cache import CacheEntry, ContextCache
    cache = ContextCache()
    try:
        cache.upsert(CacheEntry(...))
    finally:
        cache.close()
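The try/except form of graceful degradation can be sketched as follows. This is an illustrative pattern, not code from the repository: `load_optional`, `rerank`, and the module name `slm_optional_reranker` are made up for the example.

```python
import importlib
from typing import Any, Optional


def load_optional(module_name: str) -> Optional[Any]:
    """Import an optional module, returning None instead of raising if it is absent."""
    try:
        return importlib.import_module(module_name)
    except ImportError:
        return None


# "slm_optional_reranker" is a made-up name that is not expected to be installed.
reranker = load_optional("slm_optional_reranker")
json_mod = load_optional("json")  # a module that is always available


def rerank(results: list[str]) -> list[str]:
    """Use the optional reranker when present; otherwise keep the original order."""
    if reranker is None:
        return results
    return reranker.rerank(results)
```

The core path (`rerank` returning its input unchanged) keeps working when the optional component is missing, which is the behavior the section describes.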
Fail-Safe Version Tracking
Version banner errors are caught without blocking startup:
# Source: src/superlocalmemory/server/unified_daemon.py:50-70
try:
    ...  # version tracking logic
except Exception as _exc:  # pragma: no cover - never block startup
    logger.debug("version-banner skipped: %s", _exc)
    _want_write_marker = False
Installation Modes
SuperLocalMemory supports multiple installation methods, each with slightly different directory structures:
| Method | UI Location | Version Source | Post-Install |
|---|---|---|---|
| npm global | _PKG_UI (package) | package.json | post_install.js |
| pip | _PKG_UI (package) | pyproject.toml | First-run wizard |
| Repository clone | _REPO_UI (repo root) | Dynamic | Manual setup |
The UI directory detection falls back from package location to repository location if the package UI is not found.
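The fallback order can be sketched as below. The function name `find_ui_dir` and the directory layout in the demo are assumptions for illustration; only the "package first, then repository" order comes from the text above.

```python
import tempfile
from pathlib import Path
from typing import Optional


def find_ui_dir(pkg_ui: Path, repo_ui: Path) -> Optional[Path]:
    """Return the first existing UI directory, preferring the packaged copy."""
    for candidate in (pkg_ui, repo_ui):
        if candidate.is_dir():
            return candidate
    return None


# Demo: the packaged UI is missing, so detection falls back to the repo copy.
with tempfile.TemporaryDirectory() as tmp:
    repo_ui = Path(tmp) / "ui"
    repo_ui.mkdir()
    missing_pkg_ui = Path(tmp) / "site-packages" / "ui"
    fell_back = find_ui_dir(missing_pkg_ui, repo_ui) == repo_ui
    nothing_found = find_ui_dir(missing_pkg_ui, Path(tmp) / "absent") is None
```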
See Also
- Home — Project overview and feature summary
- Installation — Detailed installation instructions
- MCP Integration — Model Context Protocol setup for AI IDEs
- Universal Skills — CLI slash commands and automation
- FAQ — Common questions and troubleshooting
Modes Explained (A/B/C)
Related topics: Home, System Architecture
Continue reading this section for the full explanation and source context.
Modes Explained (A/B/C)
SuperLocalMemory V3 operates in distinct operational modes that determine how memory storage, retrieval, and LLM integration function. Understanding these modes is essential for configuring the system for your specific use case—whether you prioritize complete data sovereignty with local-only processing or require cloud-based LLM capabilities.
Overview
SuperLocalMemory V3 provides three primary operational modes that govern how the memory system processes queries and integrates with Large Language Model backends:
| Mode | Description | LLM Required | Data Location |
|---|---|---|---|
| Mode A | Local-only semantic search | No | Always local |
| Mode B | Cloud LLM with local memory | Yes (external) | Memory local, inference cloud |
| Mode C | (Documentation pending) | Varies | Varies |
The mode system is central to SuperLocalMemory's architecture, enabling deployment flexibility from fully air-gapped environments to cloud-integrated workflows. Source: src/superlocalmemory/core/config.py
Mode A: Local-Only Semantic Search
Mode A is the default operational mode for SuperLocalMemory when no external LLM provider is configured. In this mode, the system performs all memory operations using local embedding models and SQLite-based retrieval without requiring any external API calls.
How Mode A Works
When a user issues a slm recall command in Mode A, the system performs semantic search using locally-hosted embedding models. The retrieval pipeline includes:
- Query embedding: The user's search query is embedded using a local model (typically nomic-ai/nomic-embed-text-v1.5)
- Vector similarity search: Embeddings are compared against stored memory vectors in SQLite
- Multi-channel retrieval: Results are ranked using semantic, lexical, temporal, and structural signals
- Raw result presentation: Memory cards are returned directly without LLM synthesis
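The vector similarity step can be illustrated with a small, dependency-free sketch. It assumes embeddings are plain lists of floats keyed by fact ID; the real storage layout and scoring code are not shown in this manual, so `cosine` and `top_k` are illustrative names only.

```python
import math


def cosine(a: list[float], b: list[float]) -> float:
    """Cosine similarity between two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def top_k(query_vec: list[float], memories: dict[str, list[float]], k: int = 10):
    """Score every stored vector against the query and keep the k best."""
    scored = [(fact_id, cosine(query_vec, vec)) for fact_id, vec in memories.items()]
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:k]


# Toy two-dimensional "embeddings" standing in for real model output.
memories = {
    "fact_a": [1.0, 0.0],
    "fact_b": [0.0, 1.0],
    "fact_c": [0.7, 0.7],
}
ranked = top_k([1.0, 0.1], memories, k=2)
```

A brute-force scan like this is only practical for small stores; it is meant to show the ranking idea, not the indexing strategy.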
Behavior Without LLM Provider
When no LLM provider is configured, the chat API endpoint explicitly falls back to Mode A behavior:
if not provider:
    yield _sse_event("token", "No LLM provider configured. Showing raw results instead.\n\n")
    async for event in _stream_mode_a(query, memories):
        yield event
Source: src/superlocalmemory/server/routes/chat.py:58-61
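The excerpt relies on helpers that are not shown. The following self-contained sketch reproduces the same fallback shape with stand-in implementations; `sse_event`, `stream_mode_a`, and `chat_stream` are assumptions written for this example, and the JSON-encoded data line is one possible framing choice rather than the project's wire format.

```python
import asyncio
import json


def sse_event(event: str, data: str) -> str:
    """Format one Server-Sent Event; JSON-encoding keeps newlines out of the data line."""
    return f"event: {event}\ndata: {json.dumps(data)}\n\n"


async def stream_mode_a(memories: list[str]):
    """Yield each retrieved memory as its own event, with no LLM involved."""
    for index, text in enumerate(memories, start=1):
        yield sse_event("memory", f"[MEM-{index}] {text}")


async def chat_stream(provider, memories: list[str]):
    """Fall back to raw results when no provider is configured."""
    if not provider:
        yield sse_event("token", "No LLM provider configured. Showing raw results instead.")
        async for event in stream_mode_a(memories):
            yield event
        return
    yield sse_event("token", f"(a real handler would call {provider} here)")


async def collect(provider, memories: list[str]) -> list[str]:
    return [event async for event in chat_stream(provider, memories)]


events = asyncio.run(collect(None, ["Uses Python 3.12"]))
```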
Fast Mode Option
Mode A supports a --fast flag for sub-second response times by skipping the Spreading Activation fifth channel:
recall_p.add_argument(
    "--fast", action="store_true",
    help="Skip SpreadingActivation 5th channel for sub-second response. "
         "Other 4 channels (semantic, lexical, temporal, structural) still run. "
         "Use when you need recall before a tool call (e.g. before WebSearch).",
)
Source: src/superlocalmemory/cli/main.py
When to Use Mode A
- Air-gapped environments — Systems without internet connectivity
- Maximum privacy — When no data should leave the local machine under any circumstances
- Maximum speed — When sub-second retrieval is prioritized over synthesis
- Resource-constrained deployments — When GPU/CPU resources cannot support inference
Mode B: Cloud LLM with Local Memory
Mode B extends Mode A's local memory foundation with cloud-based LLM synthesis. In this mode, memory embeddings and storage remain entirely local, but query synthesis and response generation use external LLM providers.
Mode B Architecture
graph TD
A[User Query] --> B[Local Embedding]
B --> C[SQLite Memory Store]
C --> D[Retrieved Memories]
D --> E[Context Construction]
E --> F[Cloud LLM Provider]
F --> G[Synthesized Response]
G --> H[Local Display]
H --> I[Trust Scoring]
I --> J[Memory Update]
style F fill:#ffcccc
style C fill:#ccffcc
Supported Providers
Mode B supports any OpenAI-compatible API endpoint, including:
| Provider | Configuration | Notes |
|---|---|---|
| Ollama | provider: ollama | Local LLM option |
| LM Studio | OpenAI-compatible | Local inference |
| Groq | Cloud | Fast inference |
| OpenAI | api_key required | Standard OpenAI |
| OpenRouter | api_key required | Aggregated models |
| Custom endpoints | api_base configurable | Self-hosted |
Source: src/superlocalmemory/llm/backbone.py
Known Issue: api_key Handling
A documented issue affects Mode B configuration where the api_key field may be silently dropped:
Symptom: Any configured api_key is ignored in Mode B, causing LLMBackbone.is_available() to return False for non-Ollama providers.
Source: GitHub Issue #9
This occurs in the SLMConfig.for_mode() method's Mode B branch, where API credentials may not be properly propagated to the LLM backbone initialization.
When to Use Mode B
- Complex synthesis — When memory retrieval benefits from natural language explanation
- Multi-modal reasoning — When combining memory with real-time information
- Multi-language support — When working with non-English content requiring advanced language models
- Balanced privacy — When memory data must remain local but inference can be external
Mode Selection and Configuration
Viewing Current Mode
Check the active mode using the CLI:
slm status
The status command displays the current mode, LLM configuration, and memory statistics.
Switching Modes
Switch between modes using the setup wizard or direct configuration:
slm mode # Interactive mode selection
slm provider # Configure LLM provider settings
Source: src/superlocalmemory/cli/main.py:80-85
Configuration File Structure
The SLMConfig class manages mode-specific settings:
class SLMConfig:
    @classmethod
    def for_mode(cls, mode: str) -> "SLMConfig":
        """Factory method that returns mode-specific configuration."""
        ...
Source: src/superlocalmemory/core/config.py
Setup Wizard Integration
The setup wizard handles initial mode selection during first-time installation:
_SLM_HOME = Path(os.environ.get("SL_MEMORY_PATH", Path.home() / ".superlocalmemory"))
_SETUP_MARKER = _SLM_HOME / ".setup-complete"
_EMBED_MODEL = "nomic-ai/nomic-embed-text-v1.5"
_RERANKER_MODEL = "cross-encoder/ms-marco-MiniLM-L-12-v2"
Source: src/superlocalmemory/cli/setup_wizard.py
The wizard:
- Detects whether an LLM provider is available
- Offers mode selection based on detected capabilities
- Downloads required embedding models for Mode A
- Configures provider credentials for Mode B
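The first-run check implied by the constants above can be sketched as follows. The helper names `slm_home` and `needs_setup` are hypothetical; the environment variable and marker file name are taken from the excerpt.

```python
import os
import tempfile
from pathlib import Path


def slm_home(env=None) -> Path:
    """Resolve the data directory, honoring SL_MEMORY_PATH when it is set."""
    env = os.environ if env is None else env
    return Path(env.get("SL_MEMORY_PATH", Path.home() / ".superlocalmemory"))


def needs_setup(home: Path) -> bool:
    """The wizard should run when the .setup-complete marker is missing."""
    return not (home / ".setup-complete").exists()


# Demo against a throwaway directory instead of the real home directory.
with tempfile.TemporaryDirectory() as tmp:
    home = slm_home({"SL_MEMORY_PATH": tmp})
    before = needs_setup(home)
    (home / ".setup-complete").touch()
    after = needs_setup(home)
```

Passing the environment as a parameter keeps the sketch testable without modifying the process environment.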
Data Flow Comparison
graph LR
subgraph Mode A
A1[Query] --> A2[Local Embed]
A2 --> A3[Local Search]
A3 --> A4[Raw Results]
end
subgraph Mode B
B1[Query] --> B2[Local Embed]
B2 --> B3[Local Search]
B3 --> B4[Context Build]
B4 --> B5[Cloud LLM]
B5 --> B6[Synthesized]
end
subgraph Mode C
C1[Query] --> C2[Context]
C2 --> C3[Distributed]
C3 --> C4[Collaborative]
end
Troubleshooting Mode Selection
Long Wait Times Without Response
If slm remember commands hang without response, this may indicate:
- Mode B is configured but the LLM provider is unreachable
- Network connectivity issues with cloud endpoints
- Model download still in progress for first-time setup
Resolution: This was addressed in v3.3.19. Ensure you are running the latest version.
Source: GitHub Issue #11
Provider Not Detected
When Mode B features are unavailable:
- Verify LLM configuration in settings
- Check api_key is not empty in config
- Test provider connectivity with slm status
- Consider falling back to Mode A if cloud access is unreliable
SLM_DATA_DIR Not Honored
The SLM_DATA_DIR environment variable is documented but may not be used everywhere. This affects all modes equally. The recommended workaround is to ensure the default ~/.superlocalmemory directory has appropriate permissions.
Source: GitHub Issue #10
Security Considerations
Profile Isolation
All modes enforce profile isolation at the query endpoint level:
Memories from one profile can never leak to another.
Source: v2.6.0 Release Notes
This security guarantee applies regardless of which mode is active.
Mode B Data Privacy
In Mode B:
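- The memory database and its embeddings stay on the local machine
- The query and the context built from retrieved memories are sent to the LLM provider, and the synthesized response comes back over the network
- The cloud LLM receives only the constructed context for the current query, not the full memory store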
For maximum privacy in Mode B, ensure your LLM provider's data handling policies meet your compliance requirements.
Community Feature Requests
The community has proposed several mode-related enhancements:
Configurable Embedding Endpoints (Issue #16)
Users have requested support for configurable local embedding endpoints to improve non-English language support:
"Support configurable local embedding endpoints (e.g., OpenAI-compatible API) to unlock non-English language potential."
Source: GitHub Issue #16
This would allow Mode B users to specify custom embedding services that better handle their language requirements.
Multi-Scope Memory (Issue #20)
An RFC proposes scope-aware retrieval that could work across all modes:
"Currently, SLM stores all memories in a flat namespace... no distinction between private memories and shared knowledge."
Source: GitHub Issue #20
See Also
- Installation Guide — Setting up SuperLocalMemory with your preferred mode
- Configuration Reference — Detailed config file documentation
- CLI Commands — Mode-related command documentation
- Security Hardening — Profile isolation and access control
- GitHub Issues — Current mode-related discussions
Retrieval Pipeline
Related topics: System Architecture, Memory Lifecycle
Continue reading this section for the full explanation and source context.
Retrieval Pipeline
The Retrieval Pipeline is the core information retrieval system in SuperLocalMemory V3, responsible for finding the most relevant memories in response to user queries. It implements a multi-channel retrieval architecture that combines semantic similarity, lexical matching, temporal proximity, structural relationships, and spreading activation into a unified retrieval operation. The pipeline is invoked through the slm recall CLI command, the REST API endpoints, and the MCP tools for programmatic access.
Overview
SuperLocalMemory stores memories as atomic facts in a SQLite database, each tagged with semantic embeddings, temporal metadata, structural relationships (parent-child, project, profile), and trust scores. When a query arrives, the Retrieval Pipeline must efficiently locate the most relevant facts across these dimensions.
The pipeline is designed around the following principles:
- Multi-channel retrieval: Combines 4-5 orthogonal retrieval channels to capture different similarity aspects
- Trust-aware ranking: Results are filtered and weighted by trust scores computed from provenance chains
- Sub-second response: The --fast mode delivers results in under 1 second for time-sensitive tool calls
- Zero-LLM inference: Core retrieval runs without calling an LLM, enabling fully local operation (Mode A)
- Fisher-Rao similarity: Mathematical framework ensuring information-geometric guarantees on retrieval quality
The CLI entry point is slm recall, which wraps the WorkerPool's recall method for thread-safe concurrent execution. Source: cli/main.py:recall_p
Architecture
High-Level Data Flow
graph TD
A["User Query<br/>'slm recall <query>'"] --> B["WorkerPool.recall<br/>Entry Point"]
B --> C["Semantic Channel<br/>Embedding Similarity"]
B --> D["Lexical Channel<br/>BM25 / Keyword Match"]
B --> E["Temporal Channel<br/>Recency Weighting"]
B --> F["Structural Channel<br/>Project / Profile Graph"]
B --> G["Spreading Activation<br/>5th Channel (optional)"]
C --> H["Parallel Channel Execution"]
D --> H
E --> H
F --> H
G -.-> H
H --> I["Reranker<br/>Cross-Encoder Scoring"]
I --> J["Trust Filter<br/>Minimum Threshold"]
J --> K["Ranked Results<br/>JSON / Markdown Output"]
style G fill:#f9f,stroke:#333,stroke-width:2px
style K fill:#bf9,stroke:#333,stroke-width:2px
Channel Architecture
The retrieval engine combines five distinct channels, each capturing a different dimension of relevance:
| Channel | Purpose | Input | Output |
|---|---|---|---|
| Semantic | Meaning-based similarity via embeddings | Query text | Top-K fact IDs with cosine similarity scores |
| Lexical | Keyword and exact match | Query terms | Fact IDs with BM25 scores |
| Temporal | Recency-weighted relevance | Timestamp metadata | Time-decay weighted scores |
| Structural | Graph-based relationships | Parent-child, project graph | Transitive closure scores |
| Spreading Activation | Neural-style associative retrieval | Active facts from other channels | Propagated activation scores |
The first four channels run in parallel for maximum throughput. The Spreading Activation channel is optional and skipped when --fast mode is enabled. Source: cli/main.py:recall_p.add_argument --fast
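A sketch of running channels in parallel and merging their rankings is shown below. The manual does not state how channel results are fused, so this example swaps in reciprocal rank fusion (RRF) as a stand-in; the channel functions are toy lambdas and `run_channels` and `rrf` are illustrative names.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import Callable

Channel = Callable[[str], list[str]]


def run_channels(query: str, channels: dict[str, Channel]) -> dict[str, list[str]]:
    """Run every channel concurrently and collect each ranked list of fact IDs."""
    with ThreadPoolExecutor(max_workers=len(channels)) as pool:
        futures = {name: pool.submit(fn, query) for name, fn in channels.items()}
        return {name: future.result() for name, future in futures.items()}


def rrf(rankings: dict[str, list[str]], k: int = 60) -> list[str]:
    """Reciprocal rank fusion: each channel adds 1 / (k + rank) to a fact's score."""
    scores: dict[str, float] = {}
    for ranked in rankings.values():
        for rank, fact_id in enumerate(ranked, start=1):
            scores[fact_id] = scores.get(fact_id, 0.0) + 1.0 / (k + rank)
    # Sort by descending score; break ties by fact ID for determinism.
    return sorted(scores, key=lambda fact_id: (-scores[fact_id], fact_id))


# Toy channels returning fixed rankings; real channels would query the store.
channels: dict[str, Channel] = {
    "semantic": lambda q: ["f1", "f2", "f3"],
    "lexical": lambda q: ["f2", "f1"],
    "temporal": lambda q: ["f3", "f2"],
    "structural": lambda q: ["f2"],
}
fused = rrf(run_channels("postgres port", channels))
```

Skipping a channel, as `--fast` does for Spreading Activation, amounts to leaving it out of the `channels` mapping.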
CLI Interface
The slm recall command provides the primary interface to the retrieval pipeline:
slm recall <query> [--limit N] [--json] [--fast]
| Argument | Default | Description |
|---|---|---|
| query | (required) | Search query string |
| --limit, -l | 10 | Maximum number of results to return |
| --json | false | Output structured JSON for agent consumption |
| --fast | false | Skip SpreadingActivation 5th channel for sub-second response |
The --fast flag is recommended when recall must complete before a subsequent tool call (e.g., before WebSearch). It still executes all four primary channels: semantic, lexical, temporal, and structural. Source: cli/main.py:recall_p
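The argument table maps naturally onto an `argparse` parser. The sketch below is a standalone reconstruction from that table, not a copy of cli/main.py; `build_recall_parser` is a hypothetical name.

```python
import argparse


def build_recall_parser() -> argparse.ArgumentParser:
    """Parser mirroring the documented recall arguments and defaults."""
    parser = argparse.ArgumentParser(prog="slm recall")
    parser.add_argument("query", help="Search query string")
    parser.add_argument("--limit", "-l", type=int, default=10,
                        help="Maximum number of results to return")
    parser.add_argument("--json", action="store_true",
                        help="Output structured JSON for agent consumption")
    parser.add_argument("--fast", action="store_true",
                        help="Skip the SpreadingActivation 5th channel")
    return parser


# Equivalent to: slm recall "database pooling" --fast -l 5
args = build_recall_parser().parse_args(["database pooling", "--fast", "-l", "5"])
defaults = build_recall_parser().parse_args(["anything"])
```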
REST API Endpoint
The REST API exposes retrieval through the memories router:
GET /api/memories/recall?query=<query>&limit=<limit>
The endpoint delegates to the same WorkerPool shared instance used by the CLI, ensuring consistent behavior across all access methods. Results are returned as a JSON array of fact objects with content, trust scores, and provenance metadata. Source: server/routes/memories.py
WorkerPool Integration
Shared Execution Context
The Retrieval Pipeline runs inside a WorkerPool singleton that manages concurrent access to the SQLite database. This design ensures thread safety and allows both the CLI daemon and HTTP server to share the same retrieval engine.
# Internal flow in chat.py
from superlocalmemory.core.worker_pool import WorkerPool
pool = WorkerPool.shared()
result = pool.recall(query, limit=limit)
The WorkerPool.shared() method returns a process-global singleton that initializes the V3 MemoryEngine on first access. All subsequent recall operations reuse this instance, avoiding repeated engine initialization overhead. Source: server/routes/chat.py:_recall_memories
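A process-global singleton with lazy initialization can be sketched as below. `WorkerPoolSketch` is a hypothetical stand-in: it shows one common way to make `shared()` safe under concurrent first access (double-checked locking) and does not claim to match the real WorkerPool internals.

```python
import threading


class WorkerPoolSketch:
    """Stand-in for a shared pool that builds its engine once, on first access."""

    _instance = None
    _lock = threading.Lock()
    init_count = 0  # counts constructions, for demonstration only

    def __init__(self) -> None:
        type(self).init_count += 1
        self.engine = object()  # placeholder for an expensive engine object

    @classmethod
    def shared(cls) -> "WorkerPoolSketch":
        # Double-checked locking: take the lock only while the instance is unset.
        if cls._instance is None:
            with cls._lock:
                if cls._instance is None:
                    cls._instance = cls()
        return cls._instance


# Several threads race to obtain the pool; only one construction should happen.
threads = [threading.Thread(target=WorkerPoolSketch.shared) for _ in range(8)]
for thread in threads:
    thread.start()
for thread in threads:
    thread.join()
```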
Result Structure
The recall operation returns a dictionary with the following structure:
{
"ok": true,
"results": [
{
"fact_id": "fact_abc123",
"content": "The project uses Python 3.12 for type safety",
"trust_score": 0.92,
"provenance": ["user_feedback:thumbs_up", "automatic_verification"],
"tags": ["project:backend", "profile:default"],
"created_at": 1704067200
}
]
}
Context Caching
Cache Architecture
The Retrieval Pipeline integrates with a context cache layer to accelerate repeated queries. When memories are retrieved, they can be cached with a topic signature for future use:
graph LR
A["Query: 'Python best practices'"] --> B["Compute Topic Signature<br/>hash(query)"]
B --> C["ContextCache Lookup"]
C -->|Cache Hit| D["Return Cached Results"]
C -->|Cache Miss| E["Run Full Retrieval Pipeline"]
E --> F["Upsert Cache Entry"]
F --> D
The ContextCache stores entries with:
- session_id: Conversation or agent session identifier
- topic_sig: Hash of the query for fast lookup
- content: Serialized memory context
- fact_ids: Tuple of referenced fact IDs
- provenance: How the cache entry was computed
- computed_at: Unix timestamp for TTL decisions
Source: server/routes/prewarm.py:_upsert_cache
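The entry fields and the hit/miss flow can be sketched with an in-memory cache. Everything here is an assumption for illustration: the `Sketch` classes, the SHA-256 based `topic_sig`, the whitespace and case normalization, and the 300-second TTL are not taken from the project source.

```python
import hashlib
import time
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class CacheEntrySketch:
    session_id: str
    topic_sig: str
    content: str
    fact_ids: tuple[str, ...]
    provenance: str
    computed_at: float


def topic_sig(prompt: str) -> str:
    """Hash a normalized prompt so trivially different spellings share a key."""
    normalized = " ".join(prompt.lower().split())
    return hashlib.sha256(normalized.encode("utf-8")).hexdigest()[:16]


class ContextCacheSketch:
    def __init__(self, ttl_seconds: float = 300.0) -> None:
        self._entries: dict[tuple[str, str], CacheEntrySketch] = {}
        self._ttl = ttl_seconds

    def upsert(self, entry: CacheEntrySketch) -> None:
        self._entries[(entry.session_id, entry.topic_sig)] = entry

    def lookup(self, session_id: str, prompt: str,
               now: Optional[float] = None) -> Optional[CacheEntrySketch]:
        """Return a fresh entry, or None on a miss or when the entry has expired."""
        now = time.time() if now is None else now
        entry = self._entries.get((session_id, topic_sig(prompt)))
        if entry is None or now - entry.computed_at > self._ttl:
            return None
        return entry


cache = ContextCacheSketch(ttl_seconds=300.0)
cache.upsert(CacheEntrySketch(
    session_id="session_xyz",
    topic_sig=topic_sig("Python best practices"),
    content="[MEM-1] The project uses Python 3.12",
    fact_ids=("fact_abc123",),
    provenance="prewarm",
    computed_at=1000.0,
))
hit = cache.lookup("session_xyz", "  python BEST practices ", now=1100.0)
expired = cache.lookup("session_xyz", "Python best practices", now=2000.0)
```

On a miss, the caller would run the full retrieval pipeline and `upsert` the result, matching the flow in the diagram.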
Prewarm Endpoint
The /api/prewarm endpoint allows agents to eagerly populate the cache before a conversation starts:
POST /api/prewarm
{
"session_id": "session_xyz",
"prompt": "Tell me about the backend architecture"
}
This triggers retrieval for the provided prompt and stores results in the context cache, reducing latency when the user later asks related questions. Source: server/routes/prewarm.py
Topic Signature Computation
Topic signatures are computed lazily to keep module imports fast:
def _compute_topic_sig(prompt: str) -> str:
    """Lazy import so module import is free of hot-path SLM modules."""
    from superlocalmemory.core.topic_signature import compute_topic_signature
    return compute_topic_signature(prompt)
This lazy import pattern ensures that importing the prewarm module does not trigger loading of the heavier retrieval dependencies until actually needed. Source: server/routes/prewarm.py:_compute_topic_sig
Chat Integration with Memory Context
Streaming Response Flow
When the /api/chat endpoint receives a message, it retrieves relevant memories and streams them alongside the LLM response:
sequenceDiagram
participant User
participant API as /api/chat
participant Recall as Retrieval Pipeline
participant LLM as LLM Provider
participant User as Client (SSE)
User->>API: POST /api/chat {message}
API->>Recall: _recall_memories(message, limit=10)
Recall-->>API: memories[]
API->>API: Build context with [MEM-1], [MEM-2] markers
API->>LLM: Stream completion(messages + context)
LLM-->>User: SSE tokens with memory citations
The retrieved memories are formatted with citation markers that the LLM can reference:
[MEM-1] The project uses Python 3.12 (trust: 0.92)
[MEM-2] PostgreSQL is the primary database (trust: 0.88)
Source: server/routes/chat.py:_build_context
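A formatter that produces the citation lines shown above might look like the following. This is a sketch assuming each memory is a dict with `content` and `trust_score` keys; `build_context` here is an illustrative function and not the body of `_build_context`.

```python
def build_context(memories: list[dict]) -> str:
    """Render memories as numbered [MEM-n] lines with two-decimal trust scores."""
    lines = [
        f"[MEM-{index}] {memory['content']} (trust: {memory['trust_score']:.2f})"
        for index, memory in enumerate(memories, start=1)
    ]
    return "\n".join(lines)


context = build_context([
    {"content": "The project uses Python 3.12", "trust_score": 0.92},
    {"content": "PostgreSQL is the primary database", "trust_score": 0.88},
])
```

Numbering from 1 in retrieval order lets the LLM cite a memory by its marker and lets the client map the marker back to a fact.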
Trust Scoring in Results
Each retrieved memory carries a trust_score between 0.0 and 1.0, computed from the provenance chain:
- Memories with positive user feedback (thumbs up) receive higher trust
- Memories auto-verified against external sources score higher
- Memories from recent sessions with the same profile are weighted more heavily
- Imported memories without provenance chains receive lower default trust
The trust score appears in both CLI output and the SSE stream, allowing clients to display visual indicators of memory reliability.
Performance Considerations
Fast Mode Trade-offs
When --fast is specified, the Spreading Activation channel is disabled. This channel provides neural-style associative retrieval by propagating activation through the memory graph. Disabling it trades recall quality for speed:
| Mode | Latency | Channels Active | Best For |
|---|---|---|---|
| Default | ~500-2000ms | 5 (all) | Comprehensive research, agent planning |
| --fast | ~100-500ms | 4 (no spreading) | Tool calls before web search, real-time autocomplete |
For most interactive use cases, --fast provides sufficient accuracy. The full pipeline is recommended for final answer synthesis or when recall quality is paramount. Source: cli/main.py:recall_p.add_argument --fast
Concurrent Access
The WorkerPool handles concurrent requests safely through Python's concurrent.futures thread pool executor. Multiple simultaneous slm recall invocations or API calls are serialized at the database level but execute channel operations in parallel within each request.
Docker and Linux Considerations
Community reports indicate potential issues with retrieval latency in Docker containers and certain Linux distributions. These may relate to:
- SQLite file locking behavior in overlay filesystems
- Thread pool sizing in containerized environments
- Model loading times for embedding models (Mode B)
Users experiencing long wait times (reported as "wait for a long time but seems no response" in issue #11, fixed in v3.3.19) should verify:
- The daemon is running (slm status)
- Sufficient memory is available for embedding models
- The database file is on a local filesystem (not network storage)
Configuration
Data Directory
By default, SuperLocalMemory stores all retrieval data in ~/.superlocalmemory/. The SLM_DATA_DIR environment variable can relocate this, though note this feature had a bug (issue #10) that has since been corrected:
export SLM_DATA_DIR=/path/to/custom/data
slm recall "my query"
Mode A vs Mode B
The retrieval pipeline operates in two modes:
| Aspect | Mode A (Local) | Mode B (API) |
|---|---|---|
| Embeddings | Local model (nomic-embed-text) | Remote API (OpenAI-compatible) |
| LLM | Local (Ollama) | Remote (OpenAI, Anthropic, etc.) |
| Latency | Higher (model loading) | Lower (API calls) |
| Privacy | Maximum (fully offline) | High (data stays on configured server) |
| Cost | Free (compute only) | API token costs |
Mode B supports OpenAI-compatible embedding endpoints, enabling users to configure custom embedding providers for non-English languages (feature request #16). Source: src/superlocalmemory/core/config.py
Common Issues
No Results Returned
If slm recall returns an empty result set:
- Verify memories exist: slm list-recent
- Check profile isolation: memories are scoped to the current profile
- Inspect database: sqlite3 ~/.superlocalmemory/memory.db "SELECT COUNT(*) FROM facts"
- Enable debug logging: SLM_LOG_LEVEL=DEBUG slm recall <query>
Long Response Times
Long retrieval times (beyond 2-3 seconds) may indicate:
- First-run embedding model download (Setup Wizard should handle this)
- Large fact database without proper indexing
- Mode B embedding endpoint unreachable
- Insufficient system memory for local models
API Key Not Used (Mode B)
Issue #9 reported that api_key is silently dropped in Mode B configurations. Ensure the API key is correctly set in the configuration file or environment variable, and that the provider field is set to a non-Ollama value to trigger the correct code path.
See Also
- Home — Project overview and getting started
- Universal Architecture — Complete system architecture
- MCP Integration — Programmatic access via MCP protocol
- Universal Skills — Slash command reference
- Installation — Setup guide including embedding models
CLI Reference
Related topics: Home, MCP Tools
Continue reading this section for the full explanation and source context.
CLI Reference
The SuperLocalMemory CLI (slm) provides a command-line interface for managing AI memory operations. The CLI serves as the primary interface for agents and developers to store, retrieve, update, and delete memories from the local SQLite database. All operations are local-first, with data stored in ~/.superlocalmemory/memory.db by default.
Overview
The CLI follows the 2026 agent-native CLI standard, providing consistent JSON envelopes for programmatic consumption alongside human-readable output for interactive use. Source: src/superlocalmemory/cli/json_output.py:1-50
graph TD
A[slm CLI] --> B[Commands]
B --> C[remember - Store facts]
B --> D[recall - Semantic search]
B --> E[forget - Fuzzy delete]
B --> D2[delete - Precise delete]
B --> F[update - Modify memory]
A --> G[Global Options]
G --> H[--json Agent-native output]
G --> I[--profile Profile isolation]
Installation
The CLI is available via both npm and pip package managers.
# npm installation (recommended for global access)
npm install -g superlocalmemory
# pip installation
pip install superlocalmemory
After installation, the slm command becomes available globally. The post-install script automatically detects V2 installations and prompts for migration. Source: src/superlocalmemory/cli/post_install.py:1-40
For first-time users, the setup wizard runs automatically on first slm command when ~/.superlocalmemory/.setup-complete is missing. Source: src/superlocalmemory/cli/setup_wizard.py:1-45
Commands
slm remember
Stores new facts and memories in the local database.
slm remember <content> [options]
| Argument | Description |
|---|---|
| content | The fact or memory to store |
Options:
| Option | Description |
|---|---|
--json | Output structured JSON (agent-native) |
--profile <name> | Store in specific profile (default: current profile) |
Example:
# Interactive mode
slm remember "The PostgreSQL connection pool should use max 20 connections"
# JSON output for scripting
slm remember "Project deadline is March 15th" --json
The remember command processes the content through the V3 MemoryEngine, computing topic signatures and extracting entities for optimized retrieval. Source: src/superlocalmemory/server/routes/prewarm.py:1-50
slm recall
Performs semantic search with 4-channel retrieval across stored memories.
slm recall <query> [options]
| Argument | Description |
|---|---|
| query | Semantic search query |
Options:
| Option | Default | Description |
|---|---|---|
| --limit <n> | 10 | Maximum number of results |
| --json | false | Output structured JSON |
| --fast | false | Skip 5th channel (SpreadingActivation) for sub-second response |
| --profile <name> | current | Search within specific profile |
4-Channel Retrieval Architecture:
The recall command uses four retrieval channels:
- Semantic - Vector embedding similarity
- Lexical - Keyword and phrase matching
- Temporal - Time-based relevance scoring
- Structural - Graph topology influence
The optional 5th channel (SpreadingActivation) performs network propagation for deeper context discovery. Use --fast to skip this channel when speed is critical, such as before a tool call. Source: src/superlocalmemory/cli/main.py:1-80
graph LR
A[Query] --> B[Semantic Channel]
A --> C[Lexical Channel]
A --> D[Temporal Channel]
A --> E[Structural Channel]
B --> F[Ranking Engine]
C --> F
D --> F
E --> F
F --> G[Results]
Example:
# Standard recall
slm recall "database connection pooling settings"
# Fast recall for pre-tool use
slm recall "API endpoint for users" --fast
# JSON output with higher limit
slm recall "authentication token handling" --limit 20 --json
Performance Note: If slm recall appears to hang with no response, this was a known issue fixed in v3.3.19. Ensure you are running a recent version. Source: GitHub Issue #11
slm forget
Deletes memories matching a query using fuzzy matching.
slm forget <query> [options]
| Argument | Description |
|---|---|
| query | Query to match for deletion |
Options:
| Option | Description |
|---|---|
| --dry-run | Preview matches without deleting |
| --yes, -y | Skip confirmation prompt |
| --json | Output structured JSON |
| --profile <name> | Operate within specific profile |
Example:
# Preview what would be deleted
slm forget "old project notes" --dry-run
# Delete without the confirmation prompt (-y)
slm forget "duplicate entry about config" -y
slm delete
Deletes a specific memory by its exact fact ID.
slm delete <fact_id> [options]
| Argument | Description |
|---|---|
| fact_id | Exact fact ID to delete |
Options:
| Option | Description |
|---|---|
| --yes, -y | Skip confirmation prompt |
| --json | Output structured JSON |
Example:
# Delete by exact ID
slm delete fact_abc123def456 -y
Unlike forget, this command requires the precise fact ID, making it suitable for programmatic deletion workflows.
slm update
Modifies an existing memory entry.
slm update [options]
Options:
| Option | Description |
|---|---|
| --fact-id <id> | Fact ID to update |
| --content <text> | New content for the memory |
| --json | Output structured JSON |
Example:
# Update memory content
slm update --fact-id fact_xyz789 --content "Updated project requirements"
# JSON output for automation
slm update --fact-id fact_xyz789 --content "New content here" --json
Global Options
These options work with any command.
| Option | Description |
|---|---|
| --json | Enable agent-native JSON output format |
| --profile <name> | Specify which profile to use |
| --help, -h | Show help message |
| --version, -v | Show version information |
JSON Output Format
When --json is specified, commands return a structured envelope following the 2026 agent-native CLI standard:
{
"success": true,
"command": "recall",
"version": "3.4.58",
"data": {
"results": [
{
"fact_id": "fact_abc123",
"content": "The app uses port 5432 for PostgreSQL",
"trust_score": 0.87,
"importance": 7,
"tags": ["database", "config"],
"created_at": "2026-02-01T10:30:00Z"
}
],
"total": 1,
"query": "postgres port",
"channel_used": ["semantic", "lexical"]
},
"metadata": {
"profile": "default",
"execution_time_ms": 234
}
}
Source: src/superlocalmemory/cli/json_output.py:1-80
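An agent consuming this envelope would typically check `success` before reading `data`. The sketch below parses a hand-written sample shaped like the envelope above; `extract_results` and the `min_trust` filter are illustrative additions, and the behavior of a failed envelope is an assumption.

```python
import json


def extract_results(raw: str, min_trust: float = 0.0) -> list[dict]:
    """Parse a CLI JSON envelope and return results at or above a trust threshold."""
    envelope = json.loads(raw)
    if not envelope.get("success"):
        raise RuntimeError(f"command failed: {envelope.get('command', 'unknown')}")
    results = envelope["data"]["results"]
    return [r for r in results if r.get("trust_score", 0.0) >= min_trust]


# Hand-written sample mirroring the documented envelope shape.
sample = json.dumps({
    "success": True,
    "command": "recall",
    "data": {
        "results": [
            {"fact_id": "fact_abc123", "content": "The app uses port 5432 for PostgreSQL",
             "trust_score": 0.87},
            {"fact_id": "fact_low", "content": "Unverified note", "trust_score": 0.2},
        ],
        "total": 2,
    },
    "metadata": {"profile": "default"},
})
trusted = extract_results(sample, min_trust=0.5)
```

In practice `raw` would be the standard output of a command such as `slm recall "postgres port" --json`.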
Version Detection:
The JSON envelope includes version information read from:
- package.json (npm installs)
- pyproject.toml (pip installs)
- importlib.metadata fallback
Source: src/superlocalmemory/cli/json_output.py:20-50
Environment Variables
| Variable | Default | Description |
|---|---|---|
| SL_MEMORY_PATH | ~/.superlocalmemory | Base directory for data storage |
| CI | (not set) | Set to disable interactive prompts |
| SLM_NON_INTERACTIVE | (not set) | Disable interactive mode |
| SLM_HOST | 127.0.0.1 | Daemon bind address |
| SLM_PORT | 8765 | Daemon port |
Note: The SLM_DATA_DIR environment variable was requested in the community (Issue #10) for custom data directories. Verify current support by checking src/superlocalmemory/core/config.py.
Profile Isolation
All CLI commands respect profile isolation. Memories from one profile cannot leak to another. This is enforced at the query endpoint level and applies to all operations including recall, remember, forget, and delete. Source: src/superlocalmemory/server/ui.py:1-50
graph TD
A[CLI Command] --> B{Profile Specified?}
B -->|No| C[Use Current Profile]
B -->|Yes| D[Use Named Profile]
C --> E[Query Engine]
D --> E
E --> F{Isolation Check}
F -->|Pass| G[Return Results]
F -->|Fail| H[Empty Results]
Integration with AI Tools
The CLI is designed to work seamlessly with AI coding assistants and IDE integrations:
Supported Integrations
| Integration | Package | Description |
|---|---|---|
| LlamaIndex | llamaindex | Chat store adapter for conversation history |
| LangChain | langchain-superlocalmemory | Chat message history implementation |
| Claude Desktop | MCP | Model Context Protocol server |
| Cursor | MCP | AI IDE integration |
| Windsurf | MCP | AI-powered code editor |
Source: ide/integrations/llamaindex/README.md:1-40 Source: ide/integrations/langchain/README.md:1-60
LangChain Example
from langchain_core.messages import HumanMessage, AIMessage
from langchain_superlocalmemory import SuperLocalMemoryChatMessageHistory
# Session-isolated chat history
history = SuperLocalMemoryChatMessageHistory(session_id="debug-session-42")
history.add_messages([
    HumanMessage(content="The login API returns 500 on production"),
    AIMessage(content="Let me check the error logs..."),
])
# Messages persist locally and are accessible via slm recall
Common Issues
Recall Hangs with No Response
Symptom: slm remember or slm recall hangs indefinitely.
Solution: This was fixed in v3.3.19. Upgrade to the latest version:
npm install -g superlocalmemory@latest
# or
pip install --upgrade superlocalmemory
Source: GitHub Issue #11
Windows Clone Issues
Symptom: git clone fails with "invalid path" error due to files with colons in bin/.
Solution: Fixed in v2.8.2. Update to a newer version, or clone without checking out the bin/ directory:
git clone --no-checkout https://github.com/qualixar/superlocalmemory.git
cd superlocalmemory
git checkout HEAD -- . ':!bin/*'
Source: GitHub Issue #7
Profile Isolation Concerns
For multi-agent setups where users want distinct memory scopes (personal/global/shared), this is tracked in the Multi-Scope Memory RFC. Current implementation stores all memories in a flat namespace with profile-based filtering. Source: GitHub Issue #20
See Also
- Home - Project overview and features
- Universal Architecture - System architecture details
- MCP Integration - Model Context Protocol setup
- Installation - Detailed installation guide
- FAQ - Common questions and troubleshooting
- CHANGELOG.md - Version history
Source: https://github.com/qualixar/superlocalmemory / Human Manual
MCP Tools
Related topics: CLI Reference
SuperLocalMemory V3 provides a comprehensive Model Context Protocol (MCP) server implementation that enables AI tools and integrated development environments (IDEs) to interact with the local memory system. The MCP Tools layer serves as the primary integration mechanism for Claude Desktop, Cursor, Windsurf, and 17+ other AI-powered development tools.
Overview
The MCP Tools module implements the Model Context Protocol specification, providing a standardized interface for AI assistants to store, retrieve, and manage memories without leaving the development environment. This integration eliminates context window friction by making relevant memories available automatically during coding sessions.
Key capabilities:
- Semantic memory storage and retrieval via 4-channel SpreadingActivation
- Real-time memory context injection into AI conversations
- Profile-scoped memory isolation for multi-agent workflows
- Trust-aware retrieval with provenance tracking
- Zero-LLM inference mode (Mode A) for local-only operation
Source: src/superlocalmemory/mcp/server.py
Architecture
The MCP implementation follows a client-server architecture where the SLM daemon acts as the MCP host and AI tools act as MCP clients. The architecture supports both local-only operation and OpenAI-compatible API backends.
graph TD
subgraph "MCP Client Layer"
Claude["Claude Desktop"]
Cursor["Cursor IDE"]
Windsurf["Windsurf"]
Continue["Continue.dev"]
Cody["Cody by Sourcegraph"]
end
subgraph "MCP Protocol Layer"
JSONRPC["JSON-RPC 2.0 Transport"]
Tools["Tool Handlers"]
Resources["Resource Handlers"]
Prompts["Prompt Templates"]
end
subgraph "SLM Core Layer"
MemoryEngine["V3 MemoryEngine"]
SpreadingActivation["4-Channel Retrieval"]
ContextCache["Context Cache"]
TrustEngine["Trust Scoring Engine"]
end
subgraph "Storage Layer"
SQLite["SQLite Database<br/>~/.superlocalmemory/memory.db"]
VFS["Vector Store<br/>nomic-embed-text-v1.5"]
GraphDB["Graph Store<br/>Temporal + Structural"]
end
Claude --> JSONRPC
Cursor --> JSONRPC
Windsurf --> JSONRPC
Continue --> JSONRPC
Cody --> JSONRPC
JSONRPC --> Tools
JSONRPC --> Resources
JSONRPC --> Prompts
Tools --> MemoryEngine
Resources --> MemoryEngine
Prompts --> MemoryEngine
MemoryEngine --> SpreadingActivation
SpreadingActivation --> ContextCache
SpreadingActivation --> TrustEngine
SpreadingActivation --> SQLite
SpreadingActivation --> VFS
SpreadingActivation --> GraphDB
Component Responsibilities
| Component | Responsibility | File Reference |
|---|---|---|
| MCP Server | Protocol implementation, transport handling | server.py:1-150 |
| Tool Handlers | Execute memory operations via MCP protocol | tools.py:1-200 |
| Memory Engine | Core retrieval and storage logic | V3 Engine integration |
| Context Cache | Pre-warm cache for low-latency retrieval | prewarm.py:1-100 |
| API Server | REST endpoint for dashboard and external access | api.py:1-150 |
Available MCP Tools
SuperLocalMemory V3 exposes 6 primary MCP tools for memory management operations. Each tool corresponds to a CLI command and provides equivalent functionality through the protocol.
Tool Reference
| Tool Name | Description | Parameters | Return Type |
|---|---|---|---|
| slm_remember | Store new memory with automatic importance scoring | content, project, tags, importance | fact_id, topic_sig |
| slm_recall | Semantic search with 4-channel retrieval | query, limit, profile | List of memory entries |
| slm_list_recent | List recently accessed memories | limit, profile | List of memory entries |
| slm_status | System health and memory statistics | None | Status object |
| slm_build_graph | Generate relationship graph for memories | query, depth | Graph data |
| slm_switch_profile | Change active memory profile | profile_name | Confirmation |
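On the wire, an MCP client invokes any of these tools with a JSON-RPC 2.0 tools/call request. The sketch below only builds such a message; the helper name build_tool_call is hypothetical, and the argument values are illustrative.

```python
import json
from typing import Any


def build_tool_call(request_id: int, tool: str, arguments: dict[str, Any]) -> str:
    """Serialize an MCP tools/call request as a JSON-RPC 2.0 message."""
    message = {
        "jsonrpc": "2.0",
        "id": request_id,
        "method": "tools/call",
        "params": {"name": tool, "arguments": arguments},
    }
    return json.dumps(message)


# Example: ask slm_recall for at most five matches.
request = build_tool_call(1, "slm_recall", {"query": "async patterns", "limit": 5})
print(request)
```

In practice the IDE or assistant constructs this message itself; the sketch is mainly useful when debugging the transport by hand.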
Tool Specifications
#### slm_remember
Stores new information in the memory system with automatic topic signature computation and importance scoring.
# MCP tool signature
slm_remember(
    content: str,         # The memory content to store
    project: str = "",    # Project identifier (optional)
    tags: list[str] = [], # Custom tags for filtering
    importance: int = 5   # 1-10 importance score
) -> {
    "fact_id": str,
    "topic_sig": str,
    "created_at": int
}
Behavior:
- Computes topic signature from content using local embeddings
- Applies automatic importance scoring based on content analysis
- Stores in SQLite with profile isolation
- Updates vector index for semantic retrieval
- Triggers cognitive consolidation check
Source: tools.py
#### slm_recall
Performs semantic search using the 4-channel SpreadingActivation algorithm.
# MCP tool signature
slm_recall(
    query: str,         # Search query
    limit: int = 10,    # Maximum results
    profile: str = ""   # Profile scope (empty = current)
) -> {
    "results": [
        {
            "fact_id": str,
            "content": str,
            "trust_score": float,
            "relevance": float,
            "channel_scores": {...}
        }
    ],
    "total": int,
    "query_time_ms": float
}
Retrieval Channels:
- Semantic — Vector similarity using nomic-embed-text-v1.5
- Lexical — BM25 keyword matching
- Temporal — Time-decay weighted retrieval
- Structural — Topic graph proximity scoring
Source: src/superlocalmemory/core/retrieval.py
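To make the role of channel_scores concrete, here is a small sketch that fuses the four per-channel scores with a weighted average. This is a stand-in for illustration only, not the SpreadingActivation algorithm: the weights, and the names fuse_scores and rank, are assumptions that do not come from the source.

```python
CHANNELS = ("semantic", "lexical", "temporal", "structural")

# Hypothetical weights chosen for illustration; the source does not state them.
DEFAULT_WEIGHTS = {"semantic": 0.4, "lexical": 0.3, "temporal": 0.15, "structural": 0.15}


def fuse_scores(channel_scores: dict[str, float],
                weights: dict[str, float] = DEFAULT_WEIGHTS) -> float:
    """Combine per-channel scores into one relevance value via a weighted average."""
    total_weight = sum(weights[c] for c in CHANNELS)
    # A channel that returned nothing for this memory contributes 0.0.
    weighted = sum(weights[c] * channel_scores.get(c, 0.0) for c in CHANNELS)
    return weighted / total_weight


def rank(results: list[dict]) -> list[dict]:
    """Sort results by fused relevance, highest first."""
    return sorted(results, key=lambda r: fuse_scores(r["channel_scores"]), reverse=True)
```

A memory with a modest semantic match can still rank first when the temporal and structural channels agree, which is the practical benefit of combining several signals.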
#### slm_status
Returns system health, memory statistics, and configuration state.
# Return type
{
    "version": str,
    "mode": "A" | "B",
    "daemon_running": bool,
    "profile": str,
    "stats": {
        "total_memories": int,
        "total_facts": int,
        "total_projects": int,
        "cache_size": int
    },
    "llm": {
        "provider": str,
        "model": str,
        "available": bool
    }
}
Source: src/superlocalmemory/cli/main.py
MCP Resources
MCP Resources provide read-only access to memory data, suitable for context injection without tool execution overhead.
| Resource URI | Description | Refresh Policy |
|---|---|---|
| slm://memories/recent | Last 20 accessed memories | On-demand |
| slm://memories/by-project/{project} | Memories for specific project | On-demand |
| slm://profile/current | Current active profile | On-demand |
| slm://system/status | System health snapshot | 30-second cache |
MCP Prompts
Pre-defined prompt templates for common memory operations:
| Prompt Name | Description | Variables |
|---|---|---|
| summarize-project | Generate project summary from memories | project_name |
| find-related | Find memories related to current context | current_topic |
| learning-summary | Summarize learned patterns | time_range |
Configuration
MCP Server Settings
The MCP server is configured through slm config or environment variables:
| Setting | Environment Variable | Default | Description |
|---|---|---|---|
| Server Host | SLM_HOST | 127.0.0.1 | Bind address for MCP connections |
| Server Port | SLM_PORT | 8765 | Port for MCP JSON-RPC transport |
| Transport | SLM_TRANSPORT | stdio | Transport mode (stdio/sse) |
Note: The SLM_HOST feature request (Issue #23) addresses the limitation of hardcoded 127.0.0.1 for multi-machine deployments in trusted networks.
IDE-Specific Configuration
Each supported IDE requires a configuration file pointing to the MCP server:
{
  "mcpServers": {
    "superlocalmemory": {
      "command": "npx",
      "args": ["-y", "superlocalmemory@latest", "mcp"]
    }
  }
}
Supported IDEs and configuration locations:
| IDE | Configuration File |
|---|---|
| Claude Desktop | ~/.claude_desktop_config.json |
| Cursor | .cursor/mcp.json in project |
| Windsurf | .windsurf/mcp_config.json |
| Continue.dev | .continue/config.json |
| Cody | Sourcegraph dashboard settings |
| ChatGPT | ChatGPT Desktop settings |
| Perplexity | Perplexity Desktop settings |
| Zed | .zed/mcp.json |
| OpenCode | OpenCode MCP settings |
| Antigravity | Antigravity MCP config |
| Aider | ~/.aider.conf.yml |
Source: ide/configs
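For tools that read the mcpServers JSON shape shown earlier in this section, the entry can be merged into an existing config file without disturbing other servers. The sketch below is an illustration under that assumption; register_slm is a hypothetical helper, and tools with a different config format (for example YAML-based ones) are out of scope.

```python
import json
from pathlib import Path

# The server entry from the configuration example in this section.
SLM_ENTRY = {"command": "npx", "args": ["-y", "superlocalmemory@latest", "mcp"]}


def register_slm(config_path: Path) -> dict:
    """Add the superlocalmemory server to an MCP config file, keeping other entries."""
    config = {}
    if config_path.is_file():
        config = json.loads(config_path.read_text(encoding="utf-8"))
    # setdefault preserves any servers that are already configured.
    config.setdefault("mcpServers", {})["superlocalmemory"] = SLM_ENTRY
    config_path.parent.mkdir(parents=True, exist_ok=True)
    config_path.write_text(json.dumps(config, indent=2) + "\n", encoding="utf-8")
    return config
```

Calling register_slm(Path(".cursor/mcp.json")) from a project root would create or update the Cursor file listed in the table; running it twice leaves the file unchanged.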
Integration with LangChain and LlamaIndex
Beyond native MCP support, SuperLocalMemory provides direct integrations with popular AI development frameworks.
LangChain Integration
The langchain-superlocalmemory package implements BaseChatMessageHistory for storing conversation history:
from langchain_core.messages import AIMessage, HumanMessage
from langchain_superlocalmemory import SuperLocalMemoryChatMessageHistory
history = SuperLocalMemoryChatMessageHistory(session_id="my-session")
history.add_messages([
    HumanMessage(content="What is SuperLocalMemory?"),
    AIMessage(content="It's a local-first memory system for AI assistants."),
])
Features:
- Session isolation via session_id
- All messages stored in ~/.superlocalmemory/memory.db
- Compatible with LangChain chains and agents
- Messages visible via CLI and MCP tools
Source: ide/integrations/langchain/README.md
LlamaIndex Integration
The llamaindex-superlocalmemory package provides chat storage:
from llamaindex_superlocalmemory import SuperLocalMemoryChatStore
chat_store = SuperLocalMemoryChatStore(
    session_key="user-session-123",
    db_path="/path/to/custom/memory.db"
)
Features:
- Async support via BaseChatStore
- Tag-based session isolation: llamaindex:chat:<session_key>
- Messages queryable via SLM CLI and MCP
Source: ide/integrations/llamaindex/README.md
Common Usage Patterns
Context Pre-warming
The system automatically pre-warms context for known tool calls to reduce latency:
# From prewarm.py - automatic context caching
def _upsert_cache(
    session_id: str,
    topic_sig: str,
    content: str,
    fact_ids: list[str],
) -> None:
    cache = ContextCache()
    cache.upsert(CacheEntry(
        session_id=session_id,
        topic_sig=topic_sig,
        content=content,
        fact_ids=tuple(fact_ids),
        provenance="prewarm_post_tool",
        computed_at=int(time.time()),
    ))
This pattern ensures memories related to active sessions are immediately available without triggering full retrieval.
Multi-Profile Workflows
For teams running multiple specialized agents, profile isolation ensures memory separation:
# Create profile for specialized agent
slm profile create coding-agent
# Store memory scoped to this profile
slm remember "Python asyncio best practices" --project python-tips
# Recall uses current profile context automatically
slm recall "async patterns"
Note: Multi-scope memory (personal/global/shared) is a requested feature (Issue #20) that would extend the current flat namespace model.
Troubleshooting
Long Response Times
Slow or hanging slm remember and slm recall calls were a known issue, fixed in v3.3.19 (Issue #11). Upgrade to the latest version:
npm install -g superlocalmemory@latest
Mode B API Key Not Working
When using OpenAI-compatible providers in Mode B, the api_key may be silently dropped. Check the configuration in src/superlocalmemory/core/config.py — SLMConfig.for_mode() in the Mode B branch (Issue #9).
Docker/Linux Container Issues
For cognitive consolidation issues on Linux/Docker setups, refer to Issue #26 for environment-specific guidance.
Security Considerations
The MCP server enforces profile isolation on all query endpoints. Memories from one profile cannot leak to another, ensuring compliance with enterprise security requirements introduced in v2.6.0.
Key security features:
- Profile-scoped access control on all MCP tools
- API key validation for external providers (Mode B)
- Trust scoring to flag potentially unreliable memories
- Audit trail for compliance (v2.8.0 enterprise compliance)
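The idea behind profile-scoped access can be sketched with an in-memory SQLite table in which every query carries a profile filter. The schema and the recall helper are hypothetical and only illustrate the principle; the real memory.db layout may differ.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical schema for illustration; the real memory.db layout may differ.
conn.execute(
    "CREATE TABLE memories (fact_id TEXT PRIMARY KEY, profile TEXT NOT NULL, content TEXT NOT NULL)"
)
conn.executemany(
    "INSERT INTO memories VALUES (?, ?, ?)",
    [
        ("f1", "coding-agent", "Python asyncio best practices"),
        ("f2", "research-agent", "asyncio paper reading list"),
    ],
)


def recall(conn: sqlite3.Connection, profile: str, term: str) -> list[str]:
    """Return matching content for one profile only."""
    # The profile filter is part of every query, so rows from other profiles
    # are never returned, even when the search term matches them.
    rows = conn.execute(
        "SELECT content FROM memories WHERE profile = ? AND content LIKE ? ORDER BY fact_id",
        (profile, f"%{term}%"),
    )
    return [content for (content,) in rows]
```

Both rows match the term "asyncio", yet each profile sees only its own row, and an unknown profile gets an empty result rather than an error.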
See Also
- Home — Project overview and feature summary
- Installation — Detailed setup for MCP integrations
- Universal Architecture — 7-layer system architecture
- Universal Skills — CLI slash commands reference
- FAQ — Common questions and troubleshooting
Source: https://github.com/qualixar/superlocalmemory / Human Manual
Memory Lifecycle
Related topics: Retrieval Pipeline
Memory Lifecycle is a core system in SuperLocalMemory that automatically organizes memories over time based on usage patterns, ensuring the memory system remains fast, relevant, and scalable. Introduced in v2.8.0, this feature manages the complete journey of a memory from creation through consolidation to potential archival or deletion.
Overview
As you interact with SuperLocalMemory, the system accumulates memories at different rates depending on your workflow. Without lifecycle management, a growing memory database leads to degraded recall performance and storage bloat. The Memory Lifecycle system addresses these challenges by:
- Automatically organizing memories based on usage frequency and recency
- Consolidating related memories into coherent knowledge units
- Managing storage efficiency through intelligent quantization
- Maintaining relevance by prioritizing frequently accessed memories
- Preserving user privacy through profile isolation during all operations
Source: CHANGELOG.md - v2.8.0
Architecture
The Memory Lifecycle system consists of several interconnected components that work together to manage memories throughout their lifetime.
graph TD
A[Memory Created<br/>slm remember] --> B[Initial Storage<br/>memory.db]
B --> C[Usage Tracking<br/>Access Count + Recency]
C --> D{Lifecycle Decision<br/>Engine}
D -->|Frequently Used| E[Active Memory<br/>Priority Queue]
D -->|Infrequent Access| F[Consolidation Queue]
D -->|Stale + Redundant| G[Archive/Prune]
E --> H[Fast Recall Path]
F --> I[Consolidation Worker]
I --> J[Quantized Storage]
J --> K[Compressed Facts]
G --> L[Audit Trail]
H --> M[Context Cache]
M --> N[Instant Recall]
style A fill:#e1f5fe
style H fill:#c8e6c9
style I fill:#fff3e0
style G fill:#ffcdd2
Core Components
| Component | File | Purpose |
|---|---|---|
| ConsolidationEngine | core/consolidation_engine.py | Orchestrates lifecycle decisions and manages memory states |
| Consolidator | encoding/consolidator.py | Performs the actual memory merging and deduplication |
| ConsolidationWorker | learning/consolidation_worker.py | Background worker for async consolidation tasks |
| QuantizedStore | storage/quantized_store.py | Storage layer with compression and quantization |
| ContextCache | core/context_cache.py | Caches frequently accessed contexts for fast recall |
Source: src/superlocalmemory/core/consolidation_engine.py
Lifecycle States
Memories transition through distinct states during their lifecycle. Understanding these states helps you troubleshoot recall issues and optimize memory management.
stateDiagram-v2
[*] --> Created: slm remember
Created --> Active: First access
Active --> Active: Regular access
Active --> Consolidated: Consolidation trigger
Active --> Archived: Extended inactivity
Consolidated --> Active: Referenced again
Consolidated --> Archived: Further decay
Archived --> Active: Re-accessed
Archived --> Purged: TTL exceeded
Purged --> [*]: Deleted
note right of Active: Hot path<br/>Full indexing
note right of Consolidated: Compressed<br/>Quantized storage
note right of Archived: Minimal footprint<br/>Audit preserved
State Definitions
| State | Description | Storage Format | Recall Speed |
|---|---|---|---|
| Created | Newly added via slm remember | Full text + embeddings | Fast |
| Active | Recently accessed memories | Indexed, full fidelity | Fastest |
| Consolidated | Merged with similar memories | Quantized, compressed | Moderate |
| Archived | Inactive for extended period | Minimal metadata | Slower |
| Purged | Removed from active storage | Audit trail only | N/A |
Source: src/superlocalmemory/encoding/consolidator.py
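The states and arrows in the diagram above translate directly into a transition table. The sketch below is a transcription of that diagram for illustration; the State enum and transition function are hypothetical names, not the project's API.

```python
from enum import Enum


class State(Enum):
    CREATED = "Created"
    ACTIVE = "Active"
    CONSOLIDATED = "Consolidated"
    ARCHIVED = "Archived"
    PURGED = "Purged"


# Allowed transitions, transcribed from the state diagram.
TRANSITIONS: dict[State, set[State]] = {
    State.CREATED: {State.ACTIVE},
    State.ACTIVE: {State.ACTIVE, State.CONSOLIDATED, State.ARCHIVED},
    State.CONSOLIDATED: {State.ACTIVE, State.ARCHIVED},
    State.ARCHIVED: {State.ACTIVE, State.PURGED},
    State.PURGED: set(),  # terminal: only the audit trail remains
}


def transition(current: State, target: State) -> State:
    """Return the target state, or raise if the diagram does not allow the move."""
    if target not in TRANSITIONS[current]:
        raise ValueError(f"illegal transition: {current.value} -> {target.value}")
    return target
```

Note that a memory cannot jump from Created or Active straight to Purged; it has to pass through Archived first.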
Consolidation Process
Consolidation is the core mechanism that keeps your memory system efficient. It runs automatically based on configurable triggers.
Trigger Conditions
Consolidation is triggered when specific conditions are met:
- Frequency Threshold: Memory accessed less often than frequency_threshold allows (default: 3 accesses per week)
- Semantic Redundancy: Multiple memories with high Fisher-Rao similarity
- Temporal Clustering: Memories created within the same session/context
- Topic Signature Collision: Memories sharing similar topic signatures
Source: src/superlocalmemory/core/topic_signature.py
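A minimal sketch of how the first two triggers might be evaluated, using the documented defaults for frequency_threshold and similarity_threshold. The MemoryStats class and is_candidate function are hypothetical, the similarity value is assumed to be precomputed elsewhere, and combining the two conditions with "or" is an assumption the source does not confirm.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class MemoryStats:
    fact_id: str
    accesses_last_7_days: int
    max_similarity_to_other: float  # precomputed similarity in [0, 1]


def is_candidate(
    stats: MemoryStats,
    frequency_threshold: int = 3,
    similarity_threshold: float = 0.85,
) -> bool:
    """Flag a memory when it is rarely accessed or nearly duplicates another."""
    rarely_used = stats.accesses_last_7_days < frequency_threshold
    redundant = stats.max_similarity_to_other >= similarity_threshold
    # Assumption: either condition alone is enough to queue the memory.
    return rarely_used or redundant
```

Under these defaults, a memory read ten times this week is still queued if it is at least 0.85 similar to another one, while a frequently used, distinctive memory stays active.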
Consolidation Workflow
sequenceDiagram
participant U as User/Agent
participant DA as Daemon
participant CE as ConsolidationEngine
participant CW as ConsolidationWorker
participant QS as QuantizedStore
participant DB as memory.db
U->>DA: slm remember "fact"
DA->>DB: Store memory
Note over CE: Periodic check<br/>(configurable interval)
CE->>DB: Query usage patterns
CE->>CE: Evaluate consolidation candidates
CE->>CW: Queue consolidation task
CW->>QS: Read candidate memories
QS-->>CW: Decompressed data
CW->>CW: Merge & deduplicate
CW->>QS: Write consolidated memory
QS->>DB: Update storage
Note over DB: Original audit trail<br/>preserved
Consolidation Worker
The ConsolidationWorker runs as a background task, processing consolidation in batches to avoid blocking the main application.
# From learning/consolidation_worker.py - Batch processing pattern
def process_batch(self, candidates: list[str]) -> ConsolidationResult:
    """
    Process a batch of memories for consolidation.
    Returns result with merged facts and storage savings.
    """
Key behaviors:
- Processes memories in configurable batch sizes
- Runs during idle periods to minimize performance impact
- Maintains full audit trail of consolidation operations
- Supports rollback if consolidation fails
Source: src/superlocalmemory/learning/consolidation_worker.py
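The batching behavior can be sketched as a generator that slices the candidate list into chunks. This is an illustration only: the batches helper is a hypothetical name, and the default of 50 mirrors the documented batch_size setting.

```python
from collections.abc import Iterator, Sequence


def batches(candidates: Sequence[str], batch_size: int = 50) -> Iterator[list[str]]:
    """Yield consecutive slices of at most batch_size fact IDs."""
    if batch_size < 1:
        raise ValueError("batch_size must be at least 1")
    for start in range(0, len(candidates), batch_size):
        yield list(candidates[start:start + batch_size])


# Example: 120 candidates are processed as three batches.
ids = [f"fact-{i}" for i in range(120)]
print([len(batch) for batch in batches(ids)])
```

Because the generator yields one slice at a time, a worker can pause between batches and give the main application a chance to run.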
Context Cache Integration
The Memory Lifecycle system integrates with the Context Cache to provide instant recall for active memories.
Cache Entry Structure
# From core/context_cache.py
class CacheEntry:
    session_id: str            # Which session created this cache
    topic_sig: str             # Topic signature for the context
    content: str               # Cached context content
    fact_ids: tuple[str, ...]  # Memory IDs included in this cache
    provenance: str            # How this was computed
    computed_at: int           # Unix timestamp
Source: src/superlocalmemory/core/context_cache.py
Prewarm Mechanism
The prewarm system proactively caches contexts before they're needed:
- After slm remember: caches the new memory in context
- After successful recall: caches retrieved facts
- During session start: warms cache based on topic signatures
graph LR
A[slm remember] --> B[Compute topic_sig]
B --> C[Upsert cache entry]
C --> D[Session warm start]
E[slm recall] --> F[Retrieve memories]
F --> G[Update cache]
G --> D
Source: src/superlocalmemory/server/routes/prewarm.py
Cache Management
Clearing Cache DBs
The slm clear-cache command removes regenerable cache databases while preserving user memories:
# Remove only cache databases
slm clear-cache
# Output:
# Removed cache DBs:
# - active_brain_cache.db
# - context_cache.db
# - entity_trigram_cache.db
# memory.db / learning.db preserved (user memories are safe)
Cache databases removed:
| Database | Purpose | Regenerated? |
|---|---|---|
| active_brain_cache.db | Active memory indexing | Yes |
| context_cache.db | Context prewarm data | Yes |
| entity_trigram_cache.db | Lexical search index | Yes |
Protected databases (never removed):
- memory.db: User memories
- learning.db: User preferences and feedback
- audit.db: Compliance audit trail
- audit_chain.db: Immutable audit chain
Source: src/superlocalmemory/cli/escape_hatch.py
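The split between removable and protected databases suggests an allow-list approach, sketched below. This is not the code in escape_hatch.py: clear_cache and the two constants are hypothetical names, with the file names taken from the lists above.

```python
from pathlib import Path

CACHE_DBS = ("active_brain_cache.db", "context_cache.db", "entity_trigram_cache.db")
PROTECTED_DBS = ("memory.db", "learning.db", "audit.db", "audit_chain.db")


def clear_cache(data_dir: Path) -> list[str]:
    """Delete only the regenerable cache databases and return the names removed."""
    removed = []
    # Iterating over an explicit allow-list means protected files are never touched.
    for name in CACHE_DBS:
        target = data_dir / name
        if target.is_file():
            target.unlink()
            removed.append(name)
    return removed
```

Deleting by explicit name, rather than by a pattern such as *.db, is what keeps memory.db and the audit databases safe.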
Quantized Storage
The QuantizedStore provides storage optimization for consolidated memories, reducing disk usage while maintaining retrieval quality.
Storage Format
| Memory Type | Format | Compression |
|---|---|---|
| Active | Full text + vectors | None |
| Consolidated | Semantic tokens only | 60-80% size reduction |
| Archived | Metadata only | 90%+ size reduction |
Source: src/superlocalmemory/storage/quantized_store.py
Retrieval Behavior
When a consolidated memory is accessed:
- Decompress from quantized format
- Reconstruct full semantic representation
- Update access statistics (may move back to Active)
- Return to requestor
Configuration
Lifecycle Configuration Options
| Setting | Default | Description |
|---|---|---|
| consolidation_interval | 24 hours | How often consolidation runs |
| batch_size | 50 | Memories processed per batch |
| frequency_threshold | 3 accesses/week | Below this triggers consolidation |
| similarity_threshold | 0.85 | Fisher-Rao similarity for merging |
| archive_after_days | 90 | Days inactive before archival |
| purge_after_days | 365 | Days archived before purge |
Cache Configuration
| Setting | Default | Description |
|---|---|---|
| cache_ttl_seconds | 3600 | Context cache entry TTL |
| max_cache_entries | 1000 | Maximum cached contexts |
| prewarm_on_remember | true | Auto-cache after slm remember |
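To show how cache_ttl_seconds and max_cache_entries interact, here is a small cache with per-entry expiry and least-recently-used eviction. It is a sketch under assumptions: the TtlCache class is hypothetical, and LRU eviction is one plausible policy, not necessarily the one the project uses.

```python
import time
from collections import OrderedDict
from typing import Callable, Optional


class TtlCache:
    """Small cache with per-entry expiry and least-recently-used eviction."""

    def __init__(self, ttl_seconds: float = 3600, max_entries: int = 1000,
                 clock: Callable[[], float] = time.monotonic) -> None:
        self._ttl = ttl_seconds
        self._max = max_entries
        self._clock = clock
        self._entries: OrderedDict[str, tuple[float, str]] = OrderedDict()

    def put(self, key: str, content: str) -> None:
        self._entries[key] = (self._clock() + self._ttl, content)
        self._entries.move_to_end(key)
        while len(self._entries) > self._max:
            self._entries.popitem(last=False)  # evict the least recently used

    def get(self, key: str) -> Optional[str]:
        entry = self._entries.get(key)
        if entry is None:
            return None
        expires_at, content = entry
        if self._clock() >= expires_at:
            del self._entries[key]  # expired: drop it and report a miss
            return None
        self._entries.move_to_end(key)
        return content
```

The injectable clock exists only to make expiry easy to exercise without waiting; production code would rely on the default.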
Troubleshooting
Common Issues
#### Issue: Slow Recall Despite Many Memories
Symptoms: slm recall takes longer than expected
Possible Causes:
- Consolidation not running: check daemon logs
- Cache cleared or stale: after checking the daemon logs, run slm clear-cache to force a rebuild
- Too many active memories: consider raising the frequency threshold so that more memories qualify for consolidation
Resolution:
# Check daemon status
slm serve status
# Verify consolidation is running
# Look for "consolidation_worker" in logs
# Force cache rebuild
slm clear-cache # Removes only cache DBs
# Daemon will rebuild on next access
#### Issue: Memories Not Consolidating
Symptoms: Memory count keeps growing, consolidation never reduces it
Possible Causes:
- Memories are too diverse (low similarity)
- Access frequency above threshold
- Consolidation worker disabled
Resolution: Verify settings in ~/.superlocalmemory/config.json
#### Issue: Linux/Docker Memory Lifecycle Issues
Symptoms: Reported in Issue #26 — consolidation and trace issues on Linux/Docker
Known Workaround: Ensure the daemon has write access to the data directory and that the user running the container matches the file ownership of ~/.superlocalmemory/.
# Fix ownership on Linux/Docker
chown -R $(id -u):$(id -g) ~/.superlocalmemory
Related Features
The Memory Lifecycle system works closely with these related systems:
| Feature | Integration Point | Documentation |
|---|---|---|
| Behavioral Learning | Learns from consolidation outcomes to improve future decisions | Behavioral Learning |
| Trust System | Lifecycle decisions influenced by trust scores | Trust System |
| Profile Isolation | Lifecycle respects profile boundaries | Profile Management |
| Audit Trail | All lifecycle changes logged for compliance | Enterprise Compliance |
See Also
- Universal Architecture — System architecture overview
- 4-Channel Retrieval — How memories are retrieved
- Behavioral Learning — Learning from action outcomes
- Enterprise Compliance — Audit and compliance features
- MCP Integration — Using memories via MCP protocol
- CLI Reference - slm command documentation
Source: https://github.com/qualixar/superlocalmemory / Human Manual
Multi-Machine Mesh
Related topics: CLI Reference
SuperLocalMemory Multi-Machine Mesh enables distributed memory synchronization across multiple machines on a trusted network. This feature extends the local-first memory architecture to a fleet of machines, allowing AI agents running on different hosts to share and access consolidated memory.
Community Note: This feature addresses a highly requested capability from users running the Qualixar stack across home lab environments. See GitHub Issue #23 for the original feature request from users running SLM on 7-machine WireGuard meshes.
Overview
The mesh architecture consists of interconnected SuperLocalMemory instances that communicate to maintain a synchronized view of shared memories. Each machine in the mesh operates as an independent memory node while selectively sharing designated memories with peer nodes.
| Component | Role |
|---|---|
| Mesh Broker | Coordinates communication between mesh nodes |
| Remote Sync | Handles memory synchronization protocol |
| Mesh MCP Tools | Exposes mesh operations via MCP interface |
| Unified Daemon | Manages daemon lifecycle and mesh bindings |
Architecture
graph TD
A[Machine A - SLM Node] -->|Mesh Broker| B[WireGuard/VPN Network]
C[Machine B - SLM Node] -->|Mesh Broker| B
D[Machine C - SLM Node] -->|Mesh Broker| B
B -->|Sync Protocol| A
B -->|Sync Protocol| C
B -->|Sync Protocol| D
A -->|Local Memory| A1[(Local SQLite)]
C -->|Local Memory| C1[(Local SQLite)]
D -->|Local Memory| D1[(Local SQLite)]
A1 -->|Remote Sync| Shared[(Shared Memory Pool)]
C1 -->|Remote Sync| Shared
D1 -->|Remote Sync| Shared
Design Principles
The mesh system is built on three core principles derived from the local-first architecture:
- Selective Sharing — Only explicitly marked memories propagate across nodes
- Conflict Resolution — Last-write-wins with provenance tracking
- Trust Boundaries — Mesh operates within authenticated network boundaries (e.g., WireGuard VPN)
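Last-write-wins with provenance can be sketched as picking the version with the newest timestamp while keeping the originating node on the record. The Version class and resolve function are hypothetical, and breaking timestamp ties by node_id is an assumption added so that every node reaches the same answer.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Version:
    fact_id: str
    content: str
    updated_at: int  # Unix timestamp of the write
    node_id: str     # provenance: which node produced this version


def resolve(local: Version, remote: Version) -> Version:
    """Pick the newer write; break exact timestamp ties by node_id."""
    if local.fact_id != remote.fact_id:
        raise ValueError("versions describe different memories")
    # Assumption: ties are broken deterministically so all nodes converge.
    return max(local, remote, key=lambda v: (v.updated_at, v.node_id))
```

The result does not depend on argument order, which matters because each node evaluates the same pair from its own point of view.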
Configuration
Environment Variables
| Variable | Default | Description |
|---|---|---|
| SLM_HOST | 127.0.0.1 | Bind address for SLM daemon and mesh broker |
| SLM_MESH_PORT | 8766 | Port for mesh broker communication |
| SLM_MESH_TOKEN | (generated) | Authentication token for mesh nodes |
| SLM_DATA_DIR | ~/.superlocalmemory | Base directory for memory storage |
Known Limitation: As of the current release, SLM_DATA_DIR is documented but may not be fully implemented in all components. See GitHub Issue #10 for tracking.
Daemon Binding
The unified daemon binds to a configurable host address to enable mesh connectivity:
# From unified_daemon.py
# The daemon can be configured to bind to non-localhost addresses
# enabling cross-machine communication on trusted networks
Source: src/superlocalmemory/server/unified_daemon.py
Mesh Components
Mesh Broker
The mesh broker (broker.py) serves as the central coordination point for mesh operations:
# Conceptual structure based on available source references
class MeshBroker:
    def __init__(self, host: str, port: int, token: str):
        ...

    def register_node(self, node_id: str, endpoint: str) -> bool:
        """Register a new node in the mesh."""
        ...

    def broadcast(self, message: MeshMessage) -> None:
        """Broadcast a message to all registered nodes."""
        ...
Source: src/superlocalmemory/mesh/broker.py
Remote Synchronization
The remote sync module (remote_sync.py) implements the synchronization protocol between nodes:
| Method | Purpose |
|---|---|
| sync_to_peers() | Push local changes to connected peers |
| pull_from_peers() | Pull remote changes into local store |
| resolve_conflicts() | Handle concurrent modifications |
| get_sync_status() | Return current synchronization state |
Source: src/superlocalmemory/mesh/remote_sync.py
MCP Mesh Tools
Mesh operations are exposed through the MCP interface for AI tool integration:
# MCP tools available for mesh operations
tools = [
    "mesh_list_nodes",    # List connected mesh peers
    "mesh_sync_memory",   # Trigger synchronization
    "mesh_share_memory",  # Share a specific memory across mesh
    "mesh_get_status"     # Get mesh connectivity status
]
Source: src/superlocalmemory/mcp/tools_mesh.py
Setup Procedures
Prerequisites
- SuperLocalMemory installed on all machines (~/.superlocalmemory/)
- Network connectivity between machines (WireGuard VPN recommended)
- Unique machine identifiers for each node
- Mesh authentication token shared across the fleet
Basic Setup
# 1. Generate mesh token on primary machine
slm mesh token generate
# 2. Join secondary machines to mesh
slm mesh join <primary-host>:<port> --token <generated-token>
# 3. Verify connectivity
slm mesh status
Network Configuration
For mesh communication across machines, configure the bind address:
# Set SLM_HOST to allow external connections
export SLM_HOST=0.0.0.0
# Or configure in slm config
slm config set mesh.host 0.0.0.0
slm config set mesh.port 8766
Security Consideration: Binding to 0.0.0.0 exposes the SLM daemon to all network interfaces. Only use in trusted environments such as a private VPN.
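Before starting the daemon, the configured bind address can be classified to make the exposure explicit. The sketch below uses only the standard ipaddress module; bind_exposure is a hypothetical helper, not part of the SLM CLI.

```python
import ipaddress


def bind_exposure(host: str) -> str:
    """Classify a bind address as 'loopback', 'all-interfaces', or 'specific'."""
    address = ipaddress.ip_address(host)  # raises ValueError for non-IP input
    if address.is_loopback:
        return "loopback"
    if address.is_unspecified:  # 0.0.0.0 or ::
        return "all-interfaces"
    return "specific"
```

A deployment script could refuse to continue when the result is "all-interfaces" unless the operator confirms that the machine sits behind a VPN or firewall.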
Multi-Scope Memory Integration
The mesh system integrates with the planned multi-scope memory architecture (see RFC #20):
graph LR
subgraph "Personal Scope"
P1[Machine A Memory]
P2[Machine B Memory]
end
subgraph "Shared Scope"
S1[Shared Knowledge Base]
end
P1 -->|Personal| S1
P2 -->|Personal| S1
| Scope | Visibility | Sync Behavior |
|---|---|---|
| Personal | Local machine only | Never sync |
| Shared | All mesh nodes | Automatic sync |
| Team | Selected nodes | Selective sync |
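The scope table can be read as a rule for choosing sync targets. Since multi-scope memory is still planned, the following is only a sketch of the intended behavior: Scope and sync_targets are hypothetical names, and the set of team nodes is assumed to be configured elsewhere.

```python
from enum import Enum


class Scope(Enum):
    PERSONAL = "personal"
    SHARED = "shared"
    TEAM = "team"


def sync_targets(scope: Scope, peers: set[str], team_nodes: set[str]) -> set[str]:
    """Return the peers a memory should be sent to, following the scope table."""
    if scope is Scope.PERSONAL:
        return set()           # never leaves the local machine
    if scope is Scope.SHARED:
        return set(peers)      # every connected mesh node
    return peers & team_nodes  # team: only selected nodes that are connected
```

Intersecting with the connected peers means a team node that is currently offline is simply skipped instead of causing a sync error.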
Troubleshooting
Connection Issues
| Symptom | Possible Cause | Resolution |
|---|---|---|
| Connection refused on port 8766 | Mesh broker not running | Run slm daemon start |
| Authentication failed | Token mismatch | Verify token on all nodes |
| Timeout waiting for peers | Network/firewall issue | Check VPN connectivity |
Docker/Linux Environments
Users running SLM in Docker containers report challenges with network binding (Issue #26):
# For Docker deployments, expose mesh ports explicitly
docker run -p 8765:8765 -p 8766:8766 superlocalmemory
# Ensure SLM_HOST is set to container IP or 0.0.0.0
docker run -e SLM_HOST=0.0.0.0 superlocalmemory
Performance Considerations
| Factor | Impact | Recommendation |
|---|---|---|
| Network latency | Sync delay | Use low-latency VPN |
| Memory size | Sync duration | Batch large memories |
| Node count | Coordination overhead | Limit to trusted fleet |
CLI Commands
# Mesh management commands
slm mesh status # Show mesh connectivity status
slm mesh list # List connected peer nodes
slm mesh sync # Trigger immediate synchronization
slm mesh share <id> # Share a memory to mesh peers
slm mesh revoke <id> # Revoke mesh sharing for a memory
slm mesh leave # Disconnect from mesh
Security Model
The mesh system implements trust boundaries based on network topology:
- Node Authentication — Mesh tokens verify node identity
- Profile Isolation — Memories from one profile cannot leak to another
- Scope Enforcement — Only shared-scope memories traverse the mesh
For production deployments, combine with:
- WireGuard VPN for encrypted transport
- Firewall rules restricting mesh ports
- Regular token rotation via slm rotate-token
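Token-based node authentication usually comes down to generating an unpredictable secret and comparing it safely. The sketch below uses the Python standard library to show both steps. `new_mesh_token` and `token_matches` are hypothetical helpers, and the token format is an assumption rather than the one SLM uses.

```python
import hmac
import secrets


def new_mesh_token(nbytes=32):
    """Generate a random URL-safe token (format is an assumption)."""
    return secrets.token_urlsafe(nbytes)


def token_matches(presented, expected):
    """Compare tokens in constant time to avoid timing side channels."""
    return hmac.compare_digest(
        presented.encode("utf-8"), expected.encode("utf-8")
    )
```

After a rotation, every node needs the new value. A node still presenting the old token would fail the comparison, which matches the "Token mismatch" symptom in the troubleshooting table.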
See Also
- Universal Architecture — 7-layer system overview
- MCP Integration — MCP tools including mesh operations
- Installation Guide — Initial setup and configuration
- GitHub: SLM_HOST Feature Request #23
- GitHub: Multi-Scope Memory RFC #20
- slm-mesh Repository — Standalone mesh broker
Source: https://github.com/qualixar/superlocalmemory / Human Manual
Doramagic Pitfall Log
Source-linked risks stay visible on the manual page so the preview does not read like a recommendation.
Found 11 structured pitfall item(s), including 0 high/blocking item(s). Top priority: Installation risk - Installation risk requires verification.
1. Installation risk: Installation risk requires verification
- Severity: medium
- Finding: Project evidence flags an installation risk. Review the linked source before relying on this workflow.
- User impact: May increase setup, validation, or first-run risk for the user.
- Recommended check: Reproduce the official install and quickstart path in an isolated environment.
- Evidence: community_evidence:github | cevd_8628e403827148fcb5f3b537c1af2263 | https://github.com/qualixar/superlocalmemory/issues/26
2. Installation risk: Installation risk requires verification
- Severity: medium
- Finding: Project evidence flags an installation risk. Review the linked source before relying on this workflow.
- User impact: May increase setup, validation, or first-run risk for the user.
- Recommended check: Reproduce the official install and quickstart path in an isolated environment.
- Evidence: community_evidence:github | cevd_c74d27ed1bf2462585e76845639adfd5 | https://github.com/qualixar/superlocalmemory/issues/23
3. Installation risk: Installation risk requires verification
- Severity: medium
- Finding: Project evidence flags an installation risk. Review the linked source before relying on this workflow.
- User impact: May increase setup, validation, or first-run risk for the user.
- Recommended check: Reproduce the official install and quickstart path in an isolated environment.
- Evidence: community_evidence:github | cevd_ba05e614f5a8499da175aa7ba09ac343 | https://github.com/qualixar/superlocalmemory/issues/20
4. Configuration risk: Configuration risk requires verification
- Severity: medium
- Finding: Project evidence flags a configuration risk. Review the linked source before relying on this workflow.
- User impact: May increase setup, validation, or first-run risk for the user.
- Recommended check: Reproduce the official install and quickstart path in an isolated environment.
- Evidence: capability.host_targets | github_repo:1150546081 | https://github.com/qualixar/superlocalmemory
5. Capability evidence risk: Capability evidence risk requires verification
- Severity: medium
- Finding: README/documentation is current enough for a first validation pass.
- User impact: May increase setup, validation, or first-run risk for the user.
- Recommended check: Reproduce the official install and quickstart path in an isolated environment.
- Evidence: capability.assumptions | github_repo:1150546081 | https://github.com/qualixar/superlocalmemory
6. Maintenance risk: Maintenance risk requires verification
- Severity: medium
- Finding: Project evidence flags a maintenance risk. Review the linked source before relying on this workflow.
- User impact: May increase setup, validation, or first-run risk for the user.
- Recommended check: Reproduce the official install and quickstart path in an isolated environment.
- Evidence: evidence.maintainer_signals | github_repo:1150546081 | https://github.com/qualixar/superlocalmemory
7. Security or permission risk: Security or permission risk requires verification
- Severity: medium
- Finding: no_demo
- User impact: May increase setup, validation, or first-run risk for the user.
- Recommended check: Reproduce the official install and quickstart path in an isolated environment.
- Evidence: downstream_validation.risk_items | github_repo:1150546081 | https://github.com/qualixar/superlocalmemory
8. Security or permission risk: Security or permission risk requires verification
- Severity: medium
- Finding: no_demo
- User impact: May increase setup, validation, or first-run risk for the user.
- Recommended check: Reproduce the official install and quickstart path in an isolated environment.
- Evidence: risks.scoring_risks | github_repo:1150546081 | https://github.com/qualixar/superlocalmemory
9. Security or permission risk: Security or permission risk requires verification
- Severity: medium
- Finding: Project evidence flags a security or permission risk. Review the linked source before relying on this workflow.
- User impact: May increase setup, validation, or first-run risk for the user.
- Recommended check: Reproduce the official install and quickstart path in an isolated environment.
- Evidence: community_evidence:github | cevd_be8654b4ef434c37bedf0a453e65f5d6 | https://github.com/qualixar/superlocalmemory/issues/7
10. Maintenance risk: Maintenance risk requires verification
- Severity: low
- Finding: issue_or_pr_quality=unknown.
- User impact: May increase setup, validation, or first-run risk for the user.
- Recommended check: Reproduce the official install and quickstart path in an isolated environment.
- Evidence: evidence.maintainer_signals | github_repo:1150546081 | https://github.com/qualixar/superlocalmemory
11. Maintenance risk: Maintenance risk requires verification
- Severity: low
- Finding: release_recency=unknown.
- User impact: May increase setup, validation, or first-run risk for the user.
- Recommended check: Reproduce the official install and quickstart path in an isolated environment.
- Evidence: evidence.maintainer_signals | github_repo:1150546081 | https://github.com/qualixar/superlocalmemory
Source: Doramagic discovery, validation, and Project Pack records
Community Discussion Evidence
These external discussion links are review inputs, not standalone proof that the project is production-ready.
Open the linked issues or discussions before treating the pack as ready for your environment.
Doramagic exposes project-level community discussion separately from official documentation. Review these links before using superlocalmemory with real data or production workflows.
- Cognitive consolidation and trace issues on Linux/Docker setup - github / github_issue
- GitHub issue body — qualixar SLM_HOST feature request - github / github_issue
- RFC: Multi-Scope Memory — personal/global/shared scopes with scope-aware - github / github_issue
- Feature Request: Support configurable local embedding endpoints (e.g., O - github / github_issue
- slm remember xxx wait for a long time but seems no response - github / github_issue
- SLM_DATA_DIR - github / github_issue
- api_key silently dropped for Mode B LLM config - github / github_issue
- Can't clone on Windows- repo contains bin - github / github_issue
- v2.8.0 — Memory Lifecycle, Behavioral Learning, Enterprise Compliance - github / github_release
- SuperLocalMemory v2.7.4 - github / github_release
- v2.7.0 — Your AI Learns You - github / github_release
- v2.6.0 — Security Hardening & Performance - github / github_release
Source: Project Pack community evidence and pitfall evidence