superlocalmemory Manual

Doramagic Project Pack · Human Manual

superlocalmemory

SuperLocalMemory (SLM) solves a fundamental problem in AI agent development: persistent, context-aware memory that survives across sessions without relying on cloud services. Unlike cloud-...

Home

Related topics: System Architecture, Modes Explained (A/B/C), Installation

SuperLocalMemory is an information-geometric agent memory system designed for AI assistants, providing mathematical guarantees for retrieval accuracy, zero-LLM inference mode, and EU AI Act compliance. The system stores, retrieves, and manages memories locally on your machine, ensuring complete data sovereignty while enabling seamless integration with Claude, Cursor, Windsurf, and 17+ AI tools.

Source: package.json

Source: https://github.com/qualixar/superlocalmemory / Human Manual

Installation

This page covers the complete installation process for SuperLocalMemory V3, including prerequisites, installation methods, configuration, and troubleshooting. SuperLocalMemory is a local-first AI memory system that stores all data in ~/.superlocalmemory/memory.db by default, ensuring complete data sovereignty with zero cloud dependencies.

Overview

SuperLocalMemory supports four installation methods:

Method	Platform	Command
NPM (Recommended)	macOS, Linux, Windows	`npm install -g superlocalmemory`
Pip	macOS, Linux, Windows	`pip install superlocalmemory`
Shell Script	macOS, Linux	`curl -fsSL https://raw.githubusercontent.com/qualixar/superlocalmemory/main/install.sh | bash`
PowerShell	Windows	`irm https://raw.githubusercontent.com/qualixar/superlocalmemory/main/scripts/install.ps1 | iex`

Source: package.json:1-50

Prerequisites

System Requirements

Component	Minimum	Recommended
Python	3.9+	3.12+
Node.js	18.0.0	20.x LTS
npm	9.0.0	10.x
Disk Space	500 MB	2 GB
RAM	4 GB	8 GB

Required Dependencies

Ollama (optional, for Mode A): Download from ollama.ai for fully local LLM inference
SQLite: Bundled with Python; no separate installation needed
Embedding Model: Automatically downloaded on first run (nomic-ai/nomic-embed-text-v1.5)
Reranker Model: Automatically downloaded on first run (cross-encoder/ms-marco-MiniLM-L-12-v2)

Source: src/superlocalmemory/cli/setup_wizard.py:24-26

Installation Methods

Method 1: NPM Installation (Recommended)

The NPM package provides a cross-platform CLI with automatic post-install configuration.

# Install globally
npm install -g superlocalmemory

# Verify installation
slm status

What happens during installation:

postinstall.js script runs after npm installation completes
Detects existing V2 installation and prompts for migration
Triggers setup wizard for new installations
Downloads required embedding and reranker models

Source: scripts/postinstall.js:1-50

Method 2: Pip Installation

For Python-native environments:

# Install from PyPI
pip install superlocalmemory

# Or install from source
pip install git+https://github.com/qualixar/superlocalmemory.git

First-run behavior:

The first time any slm command runs, the setup wizard starts automatically if the .setup-complete marker is missing from the data directory.

Source: src/superlocalmemory/cli/setup_wizard.py:50-70
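
The marker check described above can be sketched as follows. This is a minimal illustration, not the wizard's actual code: the function names needs_setup and mark_setup_complete are hypothetical, and only the .setup-complete file name comes from this manual.

```python
from pathlib import Path

# File name taken from this manual; everything else here is illustrative.
SETUP_MARKER = ".setup-complete"


def needs_setup(data_dir: Path) -> bool:
    """Return True when the setup marker is absent from the data directory."""
    return not (data_dir / SETUP_MARKER).exists()


def mark_setup_complete(data_dir: Path) -> None:
    """Create the data directory if needed and write an empty marker file."""
    data_dir.mkdir(parents=True, exist_ok=True)
    (data_dir / SETUP_MARKER).touch()
```

A marker file keeps the check cheap: every command can test for a single path before deciding whether to launch the wizard.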

Method 3: Shell Script (Linux/macOS)

curl -fsSL https://raw.githubusercontent.com/qualixar/superlocalmemory/main/install.sh | bash

Method 4: PowerShell (Windows)

irm https://raw.githubusercontent.com/qualixar/superlocalmemory/main/scripts/install.ps1 | iex

Note: Windows users previously encountered issues cloning repositories with special characters in filenames. This was fixed in v2.8.2. See Issue #7.

Installation Flow

flowchart TD
    A[User runs install command] --> B{Installation method?}
    B -->|NPM| C[postinstall.js executes]
    B -->|Pip| D[pip install completes]
    B -->|Shell| E[install.sh executes]
    
    C --> F{V2 installation detected?}
    D --> G{First run?}
    E --> H{First run?}
    
    F -->|Yes| I[Run V2 Migrator]
    F -->|No| J[Check .setup-complete]
    G -->|.setup-complete missing| K[Run Setup Wizard]
    H -->|.setup-complete missing| K
    G -->|.setup-complete exists| L[Ready to use]
    H -->|.setup-complete exists| L
    
    I --> M[Start Setup Wizard]
    J -->|Missing| K
    J -->|Exists| L
    
    K --> N[Download embedding model]
    N --> O[Download reranker model]
    O --> P[Configure mode: A or B]
    P --> Q[Create .setup-complete marker]
    Q --> L

Setup Wizard

The interactive setup wizard runs automatically on first use or via slm setup. It performs the following steps:

Step 1: Environment Detection

The wizard detects the runtime environment:

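import os
import sys

def is_interactive() -> bool:
    """True if running in a terminal (not CI, not piped, not MCP)."""
    # CI systems conventionally export CI; SLM_NON_INTERACTIVE is an explicit opt-out.
    if os.environ.get("CI"):
        return False
    if os.environ.get("SLM_NON_INTERACTIVE"):
        return False
    # Piped or redirected streams are not TTYs, so prompts are skipped there too.
    return sys.stdin.isatty() and sys.stdout.isatty()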

Source: src/superlocalmemory/cli/setup_wizard.py:36-43

Step 2: Model Download

Two models are downloaded automatically:

Model	Purpose	Size
`nomic-ai/nomic-embed-text-v1.5`	Text embeddings for semantic search	~275 MB
`cross-encoder/ms-marco-MiniLM-L-12-v2`	Reranking for improved recall	~90 MB

Source: src/superlocalmemory/cli/setup_wizard.py:24-26

Step 3: Mode Configuration

Choose between two operating modes:

Mode	Description	LLM Required
Mode A	Fully local with Ollama	Ollama running locally
Mode B	OpenAI-compatible API	API key or local proxy

The setup wizard writes the configuration to ~/.superlocalmemory/config.json.

Source: src/superlocalmemory/core/config.py:1-100

Data Directory Configuration

Default Location

By default, all data is stored in ~/.superlocalmemory/:

~/.superlocalmemory/
├── memory.db          # Main SQLite database
├── config.json        # Configuration file
├── .setup-complete    # Setup marker
├── models/            # Cached embedding models
└── logs/              # Application logs

Source: src/superlocalmemory/cli/setup_wizard.py:20-21

Custom Data Directory

You can customize the data directory using the SL_MEMORY_PATH environment variable:

# Linux/macOS
export SL_MEMORY_PATH=/mnt/data/slm

# Windows PowerShell
$env:SL_MEMORY_PATH="D:\data\slm"

# Run slm commands
slm remember "My custom data location"

Note: The SLM_DATA_DIR environment variable was requested in Issue #10 but the implementation uses SL_MEMORY_PATH instead. This allows storing memory data on custom paths, including external drives or network mounts.
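
One plausible way to resolve the data directory is shown below. It is a sketch under stated assumptions: resolve_data_dir and database_path are hypothetical helper names, and the real package may apply extra rules (for example, validation or path creation). Only the SL_MEMORY_PATH variable, the default directory, and the memory.db file name come from this manual.

```python
import os
from pathlib import Path

DEFAULT_DATA_DIR = Path.home() / ".superlocalmemory"


def resolve_data_dir(env=None) -> Path:
    """Use SL_MEMORY_PATH when it is set and non-empty, else the default."""
    env = os.environ if env is None else env
    override = env.get("SL_MEMORY_PATH", "").strip()
    return Path(override).expanduser() if override else DEFAULT_DATA_DIR


def database_path(env=None) -> Path:
    """Location of the main SQLite database inside the data directory."""
    return resolve_data_dir(env) / "memory.db"
```

Passing the environment as a parameter makes the lookup easy to test without modifying the real process environment.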

Upgrading from V2

Users upgrading from V2 are detected automatically:

from superlocalmemory.storage.v2_migrator import V2Migrator

migrator = V2Migrator()

if migrator.detect_v2() and not migrator.is_already_migrated():
    # Run migration logic
    migrator.migrate()

Source: src/superlocalmemory/cli/post_install.py:30-45

Migration Process

Detect V2 installation at ~/.superlocalmemory/
Back up existing database
Run schema migrations for V3
Copy user profiles and settings
Mark migration complete with version marker

Source: src/superlocalmemory/server/unified_daemon.py:50-80

Post-Installation Verification

After installation, verify everything is working:

# Check installation status
slm status

# View configuration
slm config

# Test memory operations
slm remember "Test memory from installation verification"
slm recall "installation verification"

Expected output from slm status:

SuperLocalMemory V3.x.x
━━━━━━━━━━━━━━━━━━━━━━━
Mode: A
Provider: ollama
Model: llama3.2
Database: ~/.superlocalmemory/memory.db
Status: Running

Docker Installation

For containerized environments, see Issue #26 for known considerations:

Dockerfile Example

FROM python:3.12-slim

# Install Node.js for npm-based installation
RUN apt-get update && apt-get install -y curl
RUN curl -fsSL https://deb.nodesource.com/setup_20.x | bash -
RUN apt-get install -y nodejs

# Install SuperLocalMemory
RUN npm install -g superlocalmemory

# Set data directory
ENV SL_MEMORY_PATH=/data/slm

# Create data directory
RUN mkdir -p /data/slm

# Default command
CMD ["slm", "daemon"]

Docker Compose

version: '3.8'
services:
  superlocalmemory:
    build: .  # build from the Dockerfile above; the bare python image does not include the slm CLI
    environment:
      - SL_MEMORY_PATH=/data/slm
    volumes:
      - slm-data:/data/slm
    command: slm daemon
    ports:
      - "8765:8765"

volumes:
  slm-data:

Troubleshooting

Installation Hangs

If slm remember hangs with no response, upgrade to the latest version. The hang was fixed in v3.3.19:

npm install -g superlocalmemory@latest

See Issue #11 for details.

Model Download Failures

If model downloads fail, manually download using Ollama:

ollama pull nomic-embed-text

Permission Errors (Linux/macOS)

# Fix npm global directory permissions
mkdir -p ~/.npm-global
npm config set prefix '~/.npm-global'
export PATH=~/.npm-global/bin:$PATH

# Or use sudo (not recommended)
sudo npm install -g superlocalmemory

Windows PATH Issues

If the slm command is not recognized after installation:

Find npm global bin directory: npm config get prefix
Add to System PATH
Restart terminal

API Key Not Working (Mode B)

If api_key is silently dropped in Mode B, check Issue #9. The workaround is to ensure api_key is properly set in config.json:

{
  "llm": {
    "provider": "openai",
    "model": "gpt-4",
    "api_key": "your-api-key",
    "api_base": "https://api.openai.com/v1"
  }
}
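
To confirm that the key actually reached the file, a small check like the one below may help. It is an illustrative sketch, not part of the slm CLI: missing_llm_fields is a hypothetical helper, and the list of required keys simply mirrors the example configuration above.

```python
import json
from pathlib import Path


def missing_llm_fields(
    config_path: Path,
    required=("provider", "model", "api_key", "api_base"),
):
    """Return the required keys that are absent or empty under "llm"."""
    config = json.loads(config_path.read_text(encoding="utf-8"))
    llm = config.get("llm") or {}
    return [key for key in required if not llm.get(key)]
```

Calling it with the path to ~/.superlocalmemory/config.json returns an empty list when every field is present, and a list such as ["api_key"] when the key was dropped.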

Network Configuration

By default, the daemon binds to 127.0.0.1 for security. For multi-machine setups (as requested in Issue #23), consider:

WireGuard mesh: Recommended for trusted networks
slm-mesh: Part of the Qualixar stack for distributed memory
Custom proxy: Forward ports through your own reverse proxy

Note: The SLM_HOST feature request for configurable bind addresses is tracked in Issue #23.

Quick Reference

Command	Description
`npm install -g superlocalmemory`	Install via npm
`slm setup`	Run setup wizard
`slm status`	Check installation status
`slm config`	View/edit configuration
`slm daemon`	Start the daemon manually
`slm restart`	Restart daemon after config changes

Migration from V2

This page documents the migration process from SuperLocalMemory V2 to V3, including how data is transferred, what changes occur during migration, and how to troubleshoot common issues.

Overview

SuperLocalMemory V3 introduces a complete architectural redesign while maintaining backward compatibility with your existing V2 data. The migration system automatically detects V2 installations, preserves all memories, and performs necessary schema transformations to ensure a seamless upgrade experience.

Why Migrate?

The V2 to V3 migration delivers significant improvements:

Feature	V2	V3
Memory Engine	SQLite-based	V3 Engine with Fisher-Rao similarity
Trust System	Basic	4-channel retrieval with trust scores
Learning	Adaptive ranking	Behavioral learning with zero-LLM inference
Compliance	Basic audit	EU AI Act compliant with immutable trails
Multi-Agent	Flat namespace	Multi-scope memory (personal/shared/global)
LLM Dependency	Required	Mode A (LLM) and Mode B (zero-LLM)

Source: package.json | Community: RFC Multi-Scope Memory #20

Migration Architecture

Component Overview

The migration system consists of three primary components:

graph TD
    A[V2 Installation<br/>~/.superlocalmemory/] --> B[V2Migrator<br/>Detection & Analysis]
    B --> C{Migration Status}
    C -->|Not Migrated| D[Run Migration]
    C -->|Already Migrated| E[Skip Migration]
    D --> F[Schema Transformation<br/>migrations.py]
    F --> G[V3 Database<br/>Preserved Data]
    E --> G
    H[Post-Install Script<br/>post_install.py] --> I[User Prompt<br/>if V2 detected]
    I --> J[Confirm Migration]
    J --> D

Source: src/superlocalmemory/cli/post_install.py:1-50

Data Preservation

During migration, the system preserves the following V2 data:

V2 Data Type	V3 Preservation	Transformation
Memories	✅ Complete	Schema migration
Profiles	✅ Complete	Enhanced with trust
Learning data	✅ Complete	Behavioral learning format
Configuration	⚠️ Partial	Recommended review
Chat histories	✅ Via integrations	LlamaIndex/LangChain adapters

Source: src/superlocalmemory/storage/v2_migrator.py

Migration Detection

Automatic Detection

The V3 installation automatically detects existing V2 installations during the post-install phase. This detection runs through the post_install.py script which is triggered by both npm and pip installations.

sequenceDiagram
    participant User
    participant PostInstall as post_install.py
    participant Migrator as V2Migrator
    participant Daemon as unified_daemon.py

    User->>PostInstall: npm install -g superlocalmemory
    PostInstall->>Migrator: detect_v2()
    Migrator-->>PostInstall: V2 installation found
    PostInstall->>Migrator: is_already_migrated()
    Migrator-->>PostInstall: False
    PostInstall->>User: Prompt for migration
    User->>PostInstall: Confirm
    PostInstall->>Migrator: migrate()
    Migrator-->>PostInstall: Success
    PostInstall->>Daemon: Mark as migrated

Source: src/superlocalmemory/cli/post_install.py:30-60

Detection Logic

The V2Migrator class implements two key detection methods:

Method	Purpose	Source
`detect_v2()`	Checks for existence of V2 data directory	v2_migrator.py
`is_already_migrated()`	Prevents re-migration of already migrated data	v2_migrator.py

Version Marker System

V3 uses a version marker system to track upgrades and prevent duplicate migration attempts. This marker is written only after successful migration completion.

# From unified_daemon.py - version marker logic
_want_write_marker = _prev != _slm_version
if _want_write_marker:
    if _prev is None:
        logger.info(
            "[slm] first boot on v%s — run `slm status` to see your "
            "memory overview. Changelog: "
            "https://github.com/qualixar/superlocalmemory/blob/598b2fc1ce9af40b8b58ac24d2db4827513300b0/CHANGELOG.md",
            _slm_version,
        )
    else:
        logger.info(
            "[slm] upgraded %s → %s. Data migrations run in a moment; "
            "your 18k+ atomic facts are preserved.",
            _prev, _slm_version,
        )

Source: src/superlocalmemory/server/unified_daemon.py:1-40
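
The excerpt above omits how the previous version is read and stored. A self-contained sketch of the same idea follows. The marker file name .slm-version and all function names are assumptions made for illustration; the daemon's real storage location for the marker is not shown in this manual.

```python
from pathlib import Path
from typing import Optional

MARKER_NAME = ".slm-version"  # hypothetical file name, not confirmed by the source


def read_marker(data_dir: Path) -> Optional[str]:
    """Return the previously recorded version, or None on first boot."""
    marker = data_dir / MARKER_NAME
    if not marker.exists():
        return None
    return marker.read_text(encoding="utf-8").strip() or None


def classify_boot(previous: Optional[str], current: str) -> str:
    """Mirror the three cases in the excerpt: first boot, upgrade, unchanged."""
    if previous is None:
        return "first_boot"
    if previous != current:
        return "upgrade"
    return "unchanged"


def write_marker(data_dir: Path, version: str) -> None:
    """Record the running version; call only after migrations succeed."""
    data_dir.mkdir(parents=True, exist_ok=True)
    (data_dir / MARKER_NAME).write_text(version, encoding="utf-8")
```

Writing the marker last means a failed migration is retried on the next start rather than being recorded as complete.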

Migration Workflow

Step-by-Step Process

Detection Phase

Post-install script runs V2Migrator.detect_v2()
Checks for V2 data directory at ~/.superlocalmemory/

Confirmation Phase

If V2 detected and not already migrated, prompt user for confirmation
Display migration summary and estimated duration

Schema Migration Phase

Run additive schema migrations via migrations.py
Transform V2 memories to V3 format
Preserve all metadata and importance scores

Verification Phase

Verify all memories transferred correctly
Check profile integrity
Validate learning data

Completion Phase

Set migration marker to prevent re-migration
Display upgrade banner with changelog link

Source: src/superlocalmemory/storage/migrations.py

Migration Data Flow

graph LR
    subgraph V2_Data["V2 Data (~/.superlocalmemory/)"]
        A2[memories.db]
        B2[profiles.json]
        C2[learning_data.json]
    end

    subgraph Migration["Migration Layer"]
        D[V2Migrator]
        E[Schema Migrations]
    end

    subgraph V3_Data["V3 Data"]
        A3[memory.db<br/>V3 Schema]
        B3[Profiles<br/>Enhanced]
        C3[Behavioral Learning]
    end

    A2 --> D
    B2 --> D
    C2 --> D
    D --> E
    E --> A3
    E --> B3
    E --> C3

Configuration After Migration

Required Configuration Review

After migration, certain V2 configuration options may require manual review:

Config Option	V2 Behavior	V3 Behavior	Action Required
`LLM_BACKBONE`	Ollama only	Multiple providers	Verify if using non-Ollama
`SLM_DATA_DIR`	Not implemented	Now supported	Optional relocation
`SLM_HOST`	Hardcoded 127.0.0.1	Configurable	Review for multi-machine setups
`api_key`	Dropped silently	Now preserved	Verify Mode B providers

Source: Community Issue #9 | Community Issue #10 | Community Issue #23

Mode Configuration

V3 introduces dual-mode operation:

Mode	Description	LLM Required
Mode A	LLM-powered retrieval with Fisher-Rao similarity	Yes
Mode B	Zero-LLM mode using embedding similarity	No

The setup wizard (setup_wizard.py) guides new users through mode selection. Existing V2 users maintain their configuration but should verify it after migration.

Source: src/superlocalmemory/cli/setup_wizard.py:1-30

Common Issues and Troubleshooting

Issue: Migration Fails Silently

Symptom: Post-install completes without prompting for migration.

Diagnosis:

# Check migration status
python -c "from superlocalmemory.storage.v2_migrator import V2Migrator; m = V2Migrator(); print(f'V2: {m.detect_v2()}, Migrated: {m.is_already_migrated()}')"

Resolution: If migration marker exists but data wasn't transferred, manually run:

slm migrate --force

Source: src/superlocalmemory/storage/v2_migrator.py

Issue: api_key Dropped for Mode B

Symptom: Mode B configured with OpenAI-compatible API, but LLM unavailable.

Affected versions: V2.8.0 - V3.3.x (fixed in V3.4+)

Diagnosis:

# Check LLM availability
slm status

Resolution: Reconfigure the API key after migration using:

slm config set llm.api_key YOUR_API_KEY

Source: Community Issue #9

Issue: Docker/Linux Memory Consolidation

Symptom: Memories not appearing after Docker restart or on different Linux machines.

Diagnosis: Check that data directory is properly mounted or configured for multi-machine access.

Resolution:

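Set the data directory environment variable (the installation notes name SL_MEMORY_PATH as the implemented variable; SLM_DATA_DIR was the name requested in Issue #10)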
Use slm mesh for cross-machine sync
Verify data persistence in Docker volume

Source: Community Issue #26

Issue: Long Wait Times on First `slm remember`

Symptom: slm remember command hangs without response.

Affected versions: Pre-V3.3.19

Resolution: Upgrade to v3.3.19 or later, which includes a fix for streaming response handling.

Source: Community Issue #11

Manual Migration

For advanced users who prefer manual control:

Backup V2 Data

# Backup before migration
cp -r ~/.superlocalmemory ~/.superlocalmemory.backup
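
For scripted upgrades, the same backup can be taken from Python with a timestamped name so that repeated runs do not overwrite each other. This is a sketch only: backup_data_dir is a hypothetical helper and is not part of the slm CLI. Copying while the daemon is writing to memory.db may produce an inconsistent copy, so it is safer to stop the daemon first.

```python
import shutil
import time
from pathlib import Path


def backup_data_dir(source: Path, stamp: str = "") -> Path:
    """Copy the data directory to a sibling '<name>.backup-<stamp>' directory."""
    stamp = stamp or time.strftime("%Y%m%d-%H%M%S")
    target = source.with_name(f"{source.name}.backup-{stamp}")
    # copytree refuses to overwrite an existing target, which protects older backups.
    shutil.copytree(source, target)
    return target
```

For example, backup_data_dir(Path.home() / ".superlocalmemory") returns the path of the new backup directory.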

Force Migration

# Force migration (will re-run even if already done)
python -m superlocalmemory.storage.v2_migrator --force

Skip Migration

# Start fresh with V3 (loses V2 data)
export SLM_SKIP_MIGRATION=1
slm setup

Integration Adapters

After migration, your existing integrations continue to work:

LlamaIndex

The LlamaIndex integration provides a SuperLocalMemoryChatStore compatible with V3:

from llama_index.storage.chat_store.superlocalmemory import SuperLocalMemoryChatStore

chat_store = SuperLocalMemoryChatStore()  # Uses V3 database

Source: ide/integrations/llamaindex/README.md

LangChain

from langchain_superlocalmemory import SuperLocalMemoryChatMessageHistory

history = SuperLocalMemoryChatMessageHistory(session_id="my-session")

Source: ide/integrations/langchain/README.md

Rollback Procedure

If migration causes issues:

Restore from Backup

rm -rf ~/.superlocalmemory
cp -r ~/.superlocalmemory.backup ~/.superlocalmemory

Reinstall V2

# Replace <v2-version> with the V2 release you were running
npm uninstall -g superlocalmemory
npm install -g superlocalmemory@<v2-version>

Report Issue

Create issue at GitHub Issues
Include migration logs from post-install

System Architecture

Related topics: Home, Retrieval Pipeline, Modes Explained (A/B/C)

SuperLocalMemory is a local-first AI memory system designed for privacy-conscious users who want persistent, semantic memory capabilities for AI assistants without relying on cloud services. The architecture follows a multi-layered design that separates storage, retrieval, API serving, and user interface concerns while maintaining tight integration between components.

Architecture Overview

The system consists of four primary layers that work together to provide memory persistence and retrieval capabilities:

graph TD
    subgraph CLI["CLI Layer"]
        CLI_CMD["slm remember<br/>slm recall<br/>slm forget"]
        SETUP_WIZARD["Setup Wizard"]
        POST_INSTALL["Post-Install"]
    end

    subgraph Server["API Server Layer"]
        DAEMON["Unified Daemon<br/>(unified_daemon.py)"]
        API["FastAPI Server<br/>(api.py)"]
        UI["UI Server<br/>(ui.py)"]
        ROUTES["Route Handlers<br/>(routes/*)"]
    end

    subgraph Core["Core Engine Layer"]
        MEMORY["Memory Engine"]
        CONFIG["Configuration<br/>(config.py)"]
        MIGRATOR["V2 Migrator"]
    end

    subgraph Storage["Storage Layer"]
        DB["SQLite Database<br/>(~/.superlocalmemory/memory.db)"]
        CACHE["Context Cache"]
    end

    CLI_CMD -->|Command| DAEMON
    SETUP_WIZARD -->|Initialize| DB
    POST_INSTALL -->|Migrate| MIGRATOR
    DAEMON -->|Manage| DB
    API -->|Serve| UI
    API -->|Routes| ROUTES
    ROUTES -->|Query| MEMORY
    MEMORY -->|Read/Write| DB
    MEMORY -->|Cache| CACHE
    CONFIG -->|Configure| MEMORY

The CLI layer provides the primary interface for human users and AI agents, while the API server layer handles both programmatic access and the web-based dashboard. The core engine layer implements the memory operations, and the storage layer persists all data locally in SQLite.

Core Components

Unified Daemon

The unified daemon (unified_daemon.py) serves as the central process manager for SuperLocalMemory. It handles version tracking, data migrations, and system initialization.

Key Responsibilities:

Version banner display on startup and upgrades
Additive schema migrations before engine initialization
Non-blocking startup (failures in version tracking do not prevent daemon operation)

# Source: src/superlocalmemory/server/unified_daemon.py:1-50

# LLD-06 §7.3 / LLD-07 §4.1 — run additive schema migrations BEFORE
# engine init so later queries see the expected columns/tables.
# Non-fatal: any failure here is logged and the daemon still starts.

The daemon implements a fail-safe approach where version banner errors are caught and logged without blocking startup, ensuring the system remains operational even when version tracking encounters issues.

FastAPI API Server

The API server (api.py) provides REST endpoints for memory operations and serves the web-based user interface. It uses FastAPI with several middleware components for security and performance.

Middleware Stack (in order):

Layer	Middleware	Purpose	Source
1 (outermost)	SecurityHeadersMiddleware	Security headers	api.py:1-30
2	GZipMiddleware	Response compression (min 1000 bytes)	api.py:1-30
3	CORSMiddleware	Cross-origin resource sharing	ui.py:1-50
4 (innermost)	RateLimiter	Request throttling	ui.py:1-50

CORS Configuration:

The API server allows requests from localhost origins on two ports:

http://localhost:8765 and http://127.0.0.1:8765
http://localhost:8417 and http://127.0.0.1:8417

Allowed methods include GET, POST, PUT, DELETE, PATCH, and OPTIONS. Headers allowed include Content-Type, Authorization, and X-SLM-API-Key.

Rate Limiting:

Write operations: 30 requests per 60 seconds
Read operations: 120 requests per 60 seconds
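
The manual does not say which algorithm the RateLimiter uses. As an illustration of how limits of this shape behave, here is a minimal sliding-window sketch. The class name, the per-client keying, and the injectable clock are all assumptions; only the 30/60 and 120/60 figures come from the text above.

```python
import time
from collections import defaultdict, deque


class SlidingWindowLimiter:
    """Allow at most max_requests per client within the trailing window."""

    def __init__(self, max_requests: int, window_seconds: float, clock=time.monotonic):
        self.max_requests = max_requests
        self.window_seconds = window_seconds
        self._clock = clock
        self._hits = defaultdict(deque)  # client id -> timestamps of accepted requests

    def allow(self, client_id: str) -> bool:
        now = self._clock()
        hits = self._hits[client_id]
        # Drop timestamps that have aged out of the window.
        while hits and now - hits[0] >= self.window_seconds:
            hits.popleft()
        if len(hits) >= self.max_requests:
            return False
        hits.append(now)
        return True


# Limits quoted in this manual: stricter for writes than for reads.
write_limiter = SlidingWindowLimiter(max_requests=30, window_seconds=60)
read_limiter = SlidingWindowLimiter(max_requests=120, window_seconds=60)
```

A rejected request would typically be answered with HTTP 429, although the manual does not state which status code the server returns.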

UI Server

The UI server (ui.py) serves the web-based memory dashboard and is integrated into the FastAPI application. It searches for the UI directory in two locations:

# Source: src/superlocalmemory/server/ui.py:1-20

# V3.3.21: UI shipped inside the package for pip/npm installs.
_PKG_UI = Path(__file__).resolve().parent.parent / "ui"
_REPO_UI = Path(__file__).resolve().parent.parent.parent.parent / "ui"
UI_DIR = _PKG_UI if (_PKG_UI / "index.html").exists() else _REPO_UI

This dual-location search supports both package-installed and repository-clone deployments.

CLI Architecture

The command-line interface provides the primary interaction method for users and AI agents. The CLI is organized around commands that map to memory operations.

graph LR
    subgraph Commands
        REMEMBER["remember<br/>Store new memory"]
        RECALL["recall<br/>Semantic search"]
        FORGET["forget<br/>Delete by query"]
        DELETE["delete<br/>Delete by ID"]
        UPDATE["update<br/>Modify memory"]
    end

    subgraph Output
        JSON["--json flag<br/>Agent-native format"]
        HUMAN["Human readable"]
    end

    REMEMBER -->|result| JSON
    RECALL -->|result| JSON
    RECALL -->|result| HUMAN

Command Reference

Command	Purpose	Key Options	Output
`slm remember`	Store a new memory	`--importance`, `--tags`, `--project`	Confirmation or JSON
`slm recall`	Semantic search	`--limit`, `--json`, `--fast`	Results list
`slm forget`	Delete by query	`--dry-run`, `--yes`, `--json`	Confirmation
`slm delete`	Delete by ID	`--yes`, `--json`	Confirmation
`slm update`	Modify existing memory	Various	Updated memory

Source: src/superlocalmemory/cli/main.py:1-100

The `--fast` Flag

The recall command supports a --fast option that skips the SpreadingActivation 5th channel for sub-second response times. When enabled, only four channels execute:

Semantic similarity
Lexical matching
Temporal proximity
Structural relevance

This trade-off is recommended when you need recall results before making a tool call (e.g., before WebSearch).
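
An agent that shells out to the CLI might assemble the command as sketched below. The helper name build_recall_command is hypothetical, and the assumption that --limit takes an integer value is not confirmed by this manual; the flag names themselves come from the command reference above.

```python
def build_recall_command(query, fast=False, limit=None, as_json=True):
    """Build an argv list for `slm recall` suitable for subprocess.run."""
    argv = ["slm", "recall", query]
    if fast:
        argv.append("--fast")  # skip the fifth (SpreadingActivation) channel
    if limit is not None:
        argv += ["--limit", str(limit)]  # assumed to accept an integer
    if as_json:
        argv.append("--json")  # request the agent-native envelope
    return argv
```

Building a list rather than a shell string avoids quoting problems when the query contains spaces or shell metacharacters.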

JSON Output Format

The CLI supports an agent-native JSON output format with a consistent envelope structure:

{
  "success": true,
  "command": "recall",
  "version": "3.4.58",
  "data": [...],
  "next_actions": [...]
}

The version is read from package.json (npm installs), pyproject.toml (pip installs), or falls back to importlib.metadata.

Source: src/superlocalmemory/cli/json_output.py:1-60
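
The envelope and the importlib.metadata fallback can be sketched as follows. This is an illustrative reconstruction, not the contents of json_output.py: the function names are hypothetical, and the set of keys treated as required is an assumption based on the example above.

```python
import json
from importlib import metadata


def detect_version(package: str = "superlocalmemory", default: str = "unknown") -> str:
    """Fallback lookup via installed package metadata."""
    try:
        return metadata.version(package)
    except metadata.PackageNotFoundError:
        return default


def make_envelope(command, data, next_actions=(), success=True, version=None):
    """Wrap a command result in the envelope shape shown above."""
    return {
        "success": success,
        "command": command,
        "version": version or detect_version(),
        "data": data,
        "next_actions": list(next_actions),
    }


def parse_envelope(raw: str) -> dict:
    """Parse CLI output and check that the expected keys are present."""
    envelope = json.loads(raw)
    missing = {"success", "command", "version", "data"} - envelope.keys()
    if missing:
        raise ValueError(f"envelope is missing keys: {sorted(missing)}")
    return envelope
```

A consumer can branch on envelope["success"] before touching envelope["data"], which keeps error handling uniform across commands.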

Setup and Initialization

Setup Wizard

The setup wizard (setup_wizard.py) runs automatically on first use or via slm setup. It handles:

Model downloads (embedding model: nomic-ai/nomic-embed-text-v1.5)
Reranker model downloads (cross-encoder/ms-marco-MiniLM-L-12-v2)
Mode configuration
Installation verification

Source: src/superlocalmemory/cli/setup_wizard.py:1-50

The wizard detects non-interactive environments (CI, piped input, MCP calls) and skips interactive prompts in those contexts.

Post-Install Process

For npm installations, a post-install script runs after npm install -g superlocalmemory. It performs:

Version banner check (detects upgrades from prior versions)
V2 installation detection
Migration prompt if V2 data exists
Setup wizard invocation for new users

Source: src/superlocalmemory/cli/post_install.py:1-50

Data Directory

The default data directory is ~/.superlocalmemory/, configurable via:

Environment variable: SL_MEMORY_PATH (Python layer)
Environment variable: SLM_DATA_DIR (documented but noted as potentially unused in some versions)

All memories are stored in memory.db within this directory.

Route Architecture

The API server includes multiple route modules that handle different aspects of memory operations:

Registered Routers

Router	Purpose	Source
`memories_router`	Core memory CRUD operations	routes/memories.py
`stats_router`	Statistics and analytics	routes/stats.py
`profiles_router`	Profile management	routes/profiles.py
`backup_router`	Backup and restore	routes/backup.py
`events_router`	Audit trail events	routes/events.py
`v3_router`	V3 dashboard and advanced features	routes/v3_api.py
`chat_router`	Chat with memory context (SSE)	routes/chat.py

Chat Route (SSE Streaming)

The chat route implements server-sent events for streaming LLM responses with memory context and citation detection:

sequenceDiagram
    participant Client
    participant Server
    participant Memory
    participant LLM

    Client->>Server: POST /chat with query
    Server->>Memory: Retrieve relevant memories
    Memory-->>Server: List of memories with trust scores
    Server->>Server: Build context with citation markers
    Server->>LLM: Stream response request
    LLM-->>Server: Token stream
    Server-->>Client: SSE events (token, done, error)

Source: src/superlocalmemory/server/routes/chat.py:1-100
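
The wire format of the SSE events in the diagram can be illustrated with a short sketch. The event names token and done come from the diagram; the payload fields (such as "text") and the helper names are assumptions, since the manual does not show the actual payloads.

```python
import json


def format_sse(event: str, payload: dict) -> str:
    """Serialise one server-sent event: an event line, a data line, a blank line."""
    return f"event: {event}\ndata: {json.dumps(payload)}\n\n"


def stream_events(tokens):
    """Yield one 'token' event per LLM token, then a final 'done' event."""
    for token in tokens:
        yield format_sse("token", {"text": token})
    yield format_sse("done", {})
```

Encoding the payload with json.dumps keeps it on one line, which matters because a raw newline inside a data field would end the SSE field early.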

The system prompt instructs the LLM to cite memories using markers like [MEM-1], [MEM-2], etc., enabling traceable responses.
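
A minimal sketch of the citation round trip follows: number the retrieved memories when building the context, then extract the markers from the model's reply. The function names are hypothetical and the real prompt format is not shown in this manual; only the [MEM-n] marker style comes from the text above.

```python
import re

CITATION = re.compile(r"\[MEM-(\d+)\]")


def build_context(memories):
    """Prefix each memory with its citation marker, numbered from 1."""
    return "\n".join(f"[MEM-{i}] {text}" for i, text in enumerate(memories, start=1))


def cited_indices(response: str, memory_count: int):
    """Return cited memory numbers in first-seen order, ignoring invalid ones."""
    seen = []
    for match in CITATION.finditer(response):
        number = int(match.group(1))
        if 1 <= number <= memory_count and number not in seen:
            seen.append(number)
    return seen
```

Discarding out-of-range numbers guards against a model citing a memory that was never supplied.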

Optional Feature Routers

Several routers are loaded gracefully and do not block startup if unavailable:

learning - Adaptive learning from user feedback
lifecycle - Memory lifecycle management
behavioral - Behavioral pattern recognition
compliance - Enterprise compliance features

# Source: src/superlocalmemory/server/ui.py:100-120

for _module_name in ("learning", "lifecycle", "behavioral", "compliance"):
    try:
        _mod = __import__(f"superlocalmemory.server.routes.{_module_name}", fromlist=["router"])
        application.include_router(_mod.router)
    except (ImportError, Exception):
        pass

Multi-Profile Architecture

SuperLocalMemory supports multiple isolated profiles, where each profile maintains its own:

Memory entries
Learning data
Preferences
Feedback

Profile isolation ensures that memories from one profile cannot leak to another, a security feature introduced in v2.6.0.
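
One common way to enforce this kind of isolation is to scope every query by profile, as in the sketch below. The table layout and function names are hypothetical and use an in-memory SQLite database; the real schema of memory.db is not documented here.

```python
import sqlite3


def open_demo_db() -> sqlite3.Connection:
    """Create an in-memory database with an illustrative schema."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE memories ("
        "id INTEGER PRIMARY KEY, profile TEXT NOT NULL, content TEXT NOT NULL)"
    )
    return conn


def remember(conn, profile: str, content: str) -> None:
    conn.execute(
        "INSERT INTO memories (profile, content) VALUES (?, ?)", (profile, content)
    )


def recall(conn, profile: str, needle: str):
    """Search only within the given profile; other profiles are never scanned."""
    rows = conn.execute(
        "SELECT content FROM memories WHERE profile = ? AND content LIKE ? ORDER BY id",
        (profile, f"%{needle}%"),
    )
    return [row[0] for row in rows]
```

Because the profile is a bound parameter in every statement, a caller cannot reach another profile's rows by manipulating the search text.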

Data Flow

graph TD
    subgraph Ingestion
        USER["User Input<br/>(CLI/API)"]
        AGENT["AI Agent<br/>(MCP/Tools)"]
        INTEGRATION["Integrations<br/>(LangChain/LlamaIndex)"]
    end

    subgraph Processing
        ROUTE["Route Handler"]
        VALIDATE["Validation"]
        STORE["Storage Engine"]
    end

    subgraph Retrieval
        SEARCH["Search Engine"]
        RERANK["Reranker"]
        FUSE["Result Fusion"]
    end

    USER -->|slm remember| ROUTE
    AGENT -->|MCP Tools| ROUTE
    INTEGRATION -->|Chat History| ROUTE
    ROUTE --> VALIDATE
    VALIDATE --> STORE
    STORE -->|Query| SEARCH
    SEARCH --> RERANK
    RERANK --> FUSE
    FUSE -->|Results| USER
    FUSE -->|Results| AGENT

Security Considerations

Hardcoded Bind Address

As noted in GitHub Issue #23, both the SLM daemon and related components (like slm-mesh broker) currently hardcode 127.0.0.1 as the bind address. This is appropriate for single-machine usage but limits multi-machine deployments over trusted networks.

Profile Isolation

Memory queries enforce profile isolation at the API layer, preventing cross-profile data leakage.

Rate Limiting

The API implements per-endpoint rate limiting to prevent abuse, with stricter limits on write operations.

Common Architecture Patterns

Lazy Import Pattern

Several modules use lazy imports to keep module-level imports fast:

# Source: src/superlocalmemory/server/routes/prewarm.py:1-30

def _compute_topic_sig(prompt: str) -> str:
    """Lazy import so module import is free of hot-path SLM modules."""
    from superlocalmemory.core.topic_signature import compute_topic_signature
    return compute_topic_signature(prompt)

Graceful Degradation

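Optional features are loaded inside try/except blocks (as in the optional router loop) so that the core system keeps working when an optional component fails. Helpers also release resources in finally blocks, so a failed operation does not leak a connection: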

# Source: src/superlocalmemory/server/routes/prewarm.py:30-50

def _upsert_cache(...) -> None:
    from superlocalmemory.core.context_cache import CacheEntry, ContextCache
    cache = ContextCache()
    try:
        cache.upsert(CacheEntry(...))
    finally:
        cache.close()

Fail-Safe Version Tracking

Version banner errors are caught without blocking startup:

# Source: src/superlocalmemory/server/unified_daemon.py:50-70

try:
    # version tracking logic
except Exception as _exc:  # pragma: no cover - never block startup
    logger.debug("version-banner skipped: %s", _exc)
    _want_write_marker = False

Installation Modes

SuperLocalMemory supports multiple installation methods, each with slightly different directory structures:

Method	UI Location	Version Source	Post-Install
npm global	`_PKG_UI` (package)	`package.json`	`postinstall.js`
pip	`_PKG_UI` (package)	`pyproject.toml`	First-run wizard
Repository clone	`_REPO_UI` (repo root)	Dynamic	Manual setup

The UI directory detection falls back from package location to repository location if the package UI is not found.

Modes Explained (A/B/C)

Related topics: Home, System Architecture

SuperLocalMemory V3 runs in one of several operational modes that determine how memory storage, retrieval, and LLM integration behave. Which mode to configure depends on whether you prioritize complete data sovereignty with local-only processing or need cloud-based LLM capabilities.

Overview

SuperLocalMemory V3 provides three primary operational modes that govern how the memory system processes queries and integrates with Large Language Model backends:

Mode	Description	LLM Required	Data Location
Mode A	Local-only semantic search	No	Always local
Mode B	Cloud LLM with local memory	Yes (external)	Memory local, inference cloud
Mode C	(Documentation pending)	Varies	Varies

The mode system is central to SuperLocalMemory's architecture, enabling deployment flexibility from fully air-gapped environments to cloud-integrated workflows. Source: src/superlocalmemory/core/config.py

Mode A: Local-Only Semantic Search

Mode A is the default operational mode for SuperLocalMemory when no external LLM provider is configured. In this mode, the system performs all memory operations using local embedding models and SQLite-based retrieval without requiring any external API calls.

How Mode A Works

When a user issues an slm recall command in Mode A, the system performs semantic search using locally hosted embedding models. The retrieval pipeline includes:

Query embedding — The user's search query is embedded using a local model (typically nomic-ai/nomic-embed-text-v1.5)
Vector similarity search — Embeddings are compared against stored memory vectors in SQLite
Multi-channel retrieval — Results are ranked using semantic, lexical, temporal, and structural signals
Raw result presentation — Memory cards are returned directly without LLM synthesis
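
The first two steps above can be illustrated with a toy example. This is a minimal sketch rather than SLM's implementation: the three-dimensional vectors stand in for real embeddings, and `cosine` and `rank_by_similarity` are hypothetical helper names that do not appear in the source.

```python
import math

def cosine(a: list[float], b: list[float]) -> float:
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0

def rank_by_similarity(query_vec, stored, top_k=2):
    """Return the top_k (fact_id, score) pairs, highest similarity first."""
    scored = [(fact_id, cosine(query_vec, vec)) for fact_id, vec in stored.items()]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)[:top_k]

# Toy stand-ins for the embeddings stored alongside each memory.
stored_vectors = {
    "fact_a": [1.0, 0.0, 0.0],
    "fact_b": [0.7, 0.7, 0.0],
    "fact_c": [0.0, 0.0, 1.0],
}
top = rank_by_similarity([1.0, 0.1, 0.0], stored_vectors)
print(top)
```

In a real deployment the query vector would come from the local embedding model and the candidates from the SQLite store; only the ranking idea carries over.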

Behavior Without LLM Provider

When no LLM provider is configured, the chat API endpoint explicitly falls back to Mode A behavior:

if not provider:
    yield _sse_event("token", "No LLM provider configured. Showing raw results instead.\n\n")
    async for event in _stream_mode_a(query, memories):
        yield event

Source: src/superlocalmemory/server/routes/chat.py:58-61
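
The fallback can be exercised in isolation with a small async generator. The sketch below is an assumption-laden illustration: the bodies of `sse_event` and `stream_raw_results` are invented here (the source shows only the call sites of `_sse_event` and `_stream_mode_a`), and the event names `memory` and `done` are hypothetical.

```python
import asyncio
import json

def sse_event(event: str, data) -> str:
    """Format one Server-Sent Events frame; JSON keeps the payload on one line."""
    return f"event: {event}\ndata: {json.dumps(data)}\n\n"

async def stream_raw_results(memories):
    """Assumed Mode A behaviour: emit each memory as its own event."""
    for memory in memories:
        yield sse_event("memory", memory)
    yield sse_event("done", {"count": len(memories)})

async def chat_stream(provider, memories):
    """With no provider configured, announce the fallback and stream raw results."""
    if not provider:
        yield sse_event("token", "No LLM provider configured. Showing raw results instead.\n\n")
        async for frame in stream_raw_results(memories):
            yield frame

async def collect(provider, memories):
    return [frame async for frame in chat_stream(provider, memories)]

frames = asyncio.run(collect(None, [{"fact_id": "fact_abc123", "content": "example"}]))
print("".join(frames))
```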

Fast Mode Option

Mode A supports a --fast flag for sub-second response times by skipping the Spreading Activation fifth channel:

recall_p.add_argument(
    "--fast", action="store_true",
    help="Skip SpreadingActivation 5th channel for sub-second response. "
         "Other 4 channels (semantic, lexical, temporal, structural) still run. "
         "Use when you need recall before a tool call (e.g. before WebSearch).",
)

Source: src/superlocalmemory/cli/main.py

When to Use Mode A

Air-gapped environments — Systems without internet connectivity
Maximum privacy — When no data should leave the local machine under any circumstances
Maximum speed — When sub-second retrieval is prioritized over synthesis
Resource-constrained deployments — When GPU/CPU resources cannot support inference

Mode B: Cloud LLM with Local Memory

Mode B extends Mode A's local memory foundation with cloud-based LLM synthesis. In this mode, memory embeddings and storage remain entirely local, but query synthesis and response generation use external LLM providers.

Mode B Architecture

graph TD
    A[User Query] --> B[Local Embedding]
    B --> C[SQLite Memory Store]
    C --> D[Retrieved Memories]
    D --> E[Context Construction]
    E --> F[Cloud LLM Provider]
    F --> G[Synthesized Response]
    G --> H[Local Display]
    
    H --> I[Trust Scoring]
    I --> J[Memory Update]
    
    style F fill:#ffcccc
    style C fill:#ccffcc

Supported Providers

Mode B supports any OpenAI-compatible API endpoint, including:

Provider	Configuration	Notes
Ollama	`provider: ollama`	Local LLM option
LM Studio	OpenAI-compatible	Local inference
Groq	Cloud	Fast inference
OpenAI	`api_key` required	Standard OpenAI
OpenRouter	`api_key` required	Aggregated models
Custom endpoints	`api_base` configurable	Self-hosted

Source: src/superlocalmemory/llm/backbone.py

Known Issue: api_key Handling

A documented issue affects Mode B configuration where the api_key field may be silently dropped:

Symptom: Any configured api_key is ignored in Mode B, causing LLMBackbone.is_available() to return False for non-Ollama providers.

Source: GitHub Issue #9

This occurs in the SLMConfig.for_mode() method's Mode B branch, where API credentials may not be properly propagated to the LLM backbone initialization.

When to Use Mode B

Complex synthesis — When memory retrieval benefits from natural language explanation
Multi-modal reasoning — When combining memory with real-time information
Multi-language support — When working with non-English content requiring advanced language models
Balanced privacy — When memory data must remain local but inference can be external

Mode Selection and Configuration

Viewing Current Mode

Check the active mode using the CLI:

slm status

The status command displays the current mode, LLM configuration, and memory statistics.

Switching Modes

Switch between modes using the setup wizard or direct configuration:

slm mode          # Interactive mode selection
slm provider      # Configure LLM provider settings

Source: src/superlocalmemory/cli/main.py:80-85

Configuration File Structure

The SLMConfig class manages mode-specific settings:

class SLMConfig:
    @classmethod
    def for_mode(cls, mode: str) -> "SLMConfig":
        """Factory method that returns mode-specific configuration."""
        ...

Source: src/superlocalmemory/core/config.py
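
Because the source elides the body of for_mode, the following is only a sketch of what a mode factory of this shape might look like. `ModeConfig` and its fields (`provider`, `api_key`) are hypothetical names, not SLMConfig's actual attributes, and Mode C is omitted because it is not documented. The Mode B branch passes credentials through explicitly, which is the propagation step that Issue #9 concerns.

```python
from dataclasses import dataclass
from typing import Optional

@dataclass(frozen=True)
class ModeConfig:
    """Hypothetical stand-in for SLMConfig; field names are illustrative."""
    mode: str
    provider: Optional[str] = None
    api_key: Optional[str] = None

    @classmethod
    def for_mode(cls, mode: str, *, provider=None, api_key=None) -> "ModeConfig":
        mode = mode.lower()
        if mode == "a":
            # Local-only: no provider or credentials are kept.
            return cls(mode="a")
        if mode == "b":
            # Pass credentials through explicitly so they are not dropped.
            return cls(mode="b", provider=provider, api_key=api_key)
        raise ValueError(f"unsupported mode: {mode!r}")

cfg = ModeConfig.for_mode("B", provider="openai", api_key="sk-example")
print(cfg)
```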

Setup Wizard Integration

The setup wizard handles initial mode selection during first-time installation:

import os
from pathlib import Path

_SLM_HOME = Path(os.environ.get("SL_MEMORY_PATH", Path.home() / ".superlocalmemory"))
_SETUP_MARKER = _SLM_HOME / ".setup-complete"
_EMBED_MODEL = "nomic-ai/nomic-embed-text-v1.5"
_RERANKER_MODEL = "cross-encoder/ms-marco-MiniLM-L-12-v2"

Source: src/superlocalmemory/cli/setup_wizard.py

The wizard:

Detects whether an LLM provider is available
Offers mode selection based on detected capabilities
Downloads required embedding models for Mode A
Configures provider credentials for Mode B

Data Flow Comparison

graph LR
    subgraph Mode A
        A1[Query] --> A2[Local Embed]
        A2 --> A3[Local Search]
        A3 --> A4[Raw Results]
    end
    
    subgraph Mode B
        B1[Query] --> B2[Local Embed]
        B2 --> B3[Local Search]
        B3 --> B4[Context Build]
        B4 --> B5[Cloud LLM]
        B5 --> B6[Synthesized]
    end
    
    subgraph Mode C
        C1[Query] --> C2[Context]
        C2 --> C3[Distributed]
        C3 --> C4[Collaborative]
    end

Troubleshooting Mode Selection

Long Wait Times Without Response

If slm remember commands hang without response, this may indicate:

Mode B is configured but the LLM provider is unreachable
Network connectivity issues with cloud endpoints
Model download still in progress for first-time setup

Resolution: This was addressed in v3.3.19. Ensure you are running the latest version.

Source: GitHub Issue #11

Provider Not Detected

When Mode B features are unavailable:

Verify LLM configuration in settings
Check api_key is not empty in config
Test provider connectivity with slm status
Consider falling back to Mode A if cloud access is unreliable

SLM_DATA_DIR Not Honored

The SLM_DATA_DIR environment variable is documented but may not be used everywhere. This affects all modes equally. The recommended workaround is to ensure the default ~/.superlocalmemory directory has appropriate permissions.

Source: GitHub Issue #10

Security Considerations

Profile Isolation

All modes enforce profile isolation at the query endpoint level:

Memories from one profile can never leak to another.

Source: v2.6.0 Release Notes

This security guarantee applies regardless of which mode is active.

Mode B Data Privacy

In Mode B:

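The memory database and its embeddings stay on the local machine
The context constructed from retrieved memories is sent to the configured LLM provider, and the synthesized response comes back over the network
The provider therefore sees the text of the memories selected for a query, but never the full memory store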

For maximum privacy in Mode B, ensure your LLM provider's data handling policies meet your compliance requirements.

Community Feature Requests

The community has proposed several mode-related enhancements:

Configurable Embedding Endpoints (Issue #16)

Users have requested support for configurable local embedding endpoints to improve non-English language support:

"Support configurable local embedding endpoints (e.g., OpenAI-compatible API) to unlock non-English language potential."

Source: GitHub Issue #16

This would allow Mode B users to specify custom embedding services that better handle their language requirements.

Multi-Scope Memory (Issue #20)

An RFC proposes scope-aware retrieval that could work across all modes:

"Currently, SLM stores all memories in a flat namespace... no distinction between private memories and shared knowledge."

Source: GitHub Issue #20

Retrieval Pipeline

Related topics: System Architecture, Memory Lifecycle

The Retrieval Pipeline is the core information retrieval system in SuperLocalMemory V3, responsible for finding the most relevant memories in response to user queries. It implements a multi-channel retrieval architecture that combines semantic similarity, lexical matching, temporal proximity, structural relationships, and spreading activation into a unified retrieval operation. The pipeline is invoked through the slm recall CLI command, the REST API endpoints, and the MCP tools for programmatic access.

Overview

SuperLocalMemory stores memories as atomic facts in a SQLite database, each tagged with semantic embeddings, temporal metadata, structural relationships (parent-child, project, profile), and trust scores. When a query arrives, the Retrieval Pipeline must efficiently locate the most relevant facts across these dimensions.

The pipeline is designed around the following principles:

Multi-channel retrieval: Combines 4-5 orthogonal retrieval channels to capture different similarity aspects
Trust-aware ranking: Results are filtered and weighted by trust scores computed from provenance chains
Sub-second response: The --fast mode delivers results in under 1 second for time-sensitive tool calls
Zero-LLM inference: Core retrieval runs without calling an LLM, enabling fully local operation (Mode A)
Fisher-Rao similarity: Mathematical framework ensuring information-geometric guarantees on retrieval quality

The CLI entry point is slm recall, which wraps the WorkerPool's recall method for thread-safe concurrent execution. Source: cli/main.py:recall_p

Architecture

High-Level Data Flow

graph TD
    A["User Query<br/>'slm recall &lt;query&gt;'"] --> B["WorkerPool.recall<br/>Entry Point"]
    B --> C["Semantic Channel<br/>Embedding Similarity"]
    B --> D["Lexical Channel<br/>BM25 / Keyword Match"]
    B --> E["Temporal Channel<br/>Recency Weighting"]
    B --> F["Structural Channel<br/>Project / Profile Graph"]
    B --> G["Spreading Activation<br/>5th Channel (optional)"]
    C --> H["Parallel Channel Execution"]
    D --> H
    E --> H
    F --> H
    G -.-> H
    H --> I["Reranker<br/>Cross-Encoder Scoring"]
    I --> J["Trust Filter<br/>Minimum Threshold"]
    J --> K["Ranked Results<br/>JSON / Markdown Output"]
    
    style G fill:#f9f,stroke:#333,stroke-width:2px
    style K fill:#bf9,stroke:#333,stroke-width:2px

Channel Architecture

The retrieval engine combines five distinct channels, each capturing a different dimension of relevance:

Channel	Purpose	Input	Output
Semantic	Meaning-based similarity via embeddings	Query text	Top-K fact IDs with cosine similarity scores
Lexical	Keyword and exact match	Query terms	Fact IDs with BM25 scores
Temporal	Recency-weighted relevance	Timestamp metadata	Time-decay weighted scores
Structural	Graph-based relationships	Parent-child, project graph	Transitive closure scores
Spreading Activation	Neural-style associative retrieval	Active facts from other channels	Propagated activation scores

The first four channels run in parallel for maximum throughput. The Spreading Activation channel is optional and skipped when --fast mode is enabled. Source: cli/main.py:recall_p.add_argument --fast
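
A minimal sketch of parallel channel execution follows. The channel functions and their scores are invented placeholders, and summing per-fact scores is an assumed fusion rule chosen for brevity; the source does not state how SLM fuses channel outputs.

```python
from concurrent.futures import ThreadPoolExecutor

# Hypothetical channels: each maps a query to {fact_id: score}.
def semantic(query):
    return {"fact_1": 0.9, "fact_2": 0.4}

def lexical(query):
    return {"fact_2": 0.8}

def temporal(query):
    return {"fact_1": 0.2, "fact_3": 0.6}

def structural(query):
    return {"fact_3": 0.3}

PRIMARY_CHANNELS = (semantic, lexical, temporal, structural)

def run_channels(query, channels=PRIMARY_CHANNELS):
    """Run every channel concurrently, then sum the per-fact scores."""
    fused = {}
    with ThreadPoolExecutor(max_workers=len(channels)) as pool:
        for scores in pool.map(lambda channel: channel(query), channels):
            for fact_id, score in scores.items():
                fused[fact_id] = fused.get(fact_id, 0.0) + score
    return sorted(fused.items(), key=lambda item: item[1], reverse=True)

ranked = run_channels("database settings")
print(ranked)
```

A fast-mode variant would pass only the four primary channels, as here, while the default path would append a fifth channel that consumes the facts the others activated.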

CLI Interface

The slm recall command provides the primary interface to the retrieval pipeline:

slm recall <query> [--limit N] [--json] [--fast]

Argument	Default	Description
`query`	(required)	Search query string
`--limit`, `-l`	10	Maximum number of results to return
`--json`	false	Output structured JSON for agent consumption
`--fast`	false	Skip SpreadingActivation 5th channel for sub-second response

The --fast flag is recommended when recall must complete before a subsequent tool call (e.g., before WebSearch). It still executes all four primary channels: semantic, lexical, temporal, and structural. Source: cli/main.py:recall_p
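
The argument table can be reproduced with a short argparse definition. This is a sketch built from the table above, not a copy of cli/main.py; `build_parser` is a hypothetical name and the help strings are abbreviated.

```python
import argparse

def build_parser() -> argparse.ArgumentParser:
    """Define a recall subcommand matching the documented arguments."""
    parser = argparse.ArgumentParser(prog="slm")
    sub = parser.add_subparsers(dest="command", required=True)
    recall_p = sub.add_parser("recall", help="Search stored memories")
    recall_p.add_argument("query", help="Search query string")
    recall_p.add_argument("--limit", "-l", type=int, default=10,
                          help="Maximum number of results to return")
    recall_p.add_argument("--json", action="store_true",
                          help="Output structured JSON")
    recall_p.add_argument("--fast", action="store_true",
                          help="Skip the SpreadingActivation channel")
    return parser

args = build_parser().parse_args(["recall", "postgres port", "--fast", "-l", "5"])
print(args.command, args.query, args.limit, args.json, args.fast)
```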

REST API Endpoint

The REST API exposes retrieval through the memories router:

GET /api/memories/recall?query=<query>&limit=<limit>

The endpoint delegates to the same WorkerPool shared instance used by the CLI, ensuring consistent behavior across all access methods. Results are returned as a JSON array of fact objects with content, trust scores, and provenance metadata. Source: server/routes/memories.py
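
When calling the endpoint from a script, the query text needs to be URL-encoded. The sketch below only builds the URL; it assumes the daemon is reachable at the default host and port listed under Environment Variables (127.0.0.1:8765), and `recall_url` is a hypothetical helper.

```python
from urllib.parse import urlencode

def recall_url(query: str, limit: int = 10,
               host: str = "127.0.0.1", port: int = 8765) -> str:
    """Build the recall URL; urlencode escapes spaces and special characters."""
    params = urlencode({"query": query, "limit": limit})
    return f"http://{host}:{port}/api/memories/recall?{params}"

url = recall_url("database connection pooling", limit=5)
print(url)
```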

WorkerPool Integration

Shared Execution Context

The Retrieval Pipeline runs inside a WorkerPool singleton that manages concurrent access to the SQLite database. This design ensures thread safety and allows both the CLI daemon and HTTP server to share the same retrieval engine.

# Internal flow in chat.py
from superlocalmemory.core.worker_pool import WorkerPool

pool = WorkerPool.shared()
result = pool.recall(query, limit=limit)

The WorkerPool.shared() method returns a process-global singleton that initializes the V3 MemoryEngine on first access. All subsequent recall operations reuse this instance, avoiding repeated engine initialization overhead. Source: server/routes/chat.py:_recall_memories
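
A process-global singleton of this kind is commonly written with a lock and a double check. The class below is a hypothetical stand-in that shows only the pattern; it makes no claim about how WorkerPool.shared() is actually implemented.

```python
import threading

class SharedPool:
    """Hypothetical stand-in for WorkerPool, showing only the shared() pattern."""
    _instance = None
    _lock = threading.Lock()

    def __init__(self):
        # Expensive engine initialization would happen here, once per process.
        self.engine_ready = True

    @classmethod
    def shared(cls) -> "SharedPool":
        if cls._instance is None:          # fast path, no lock taken
            with cls._lock:
                if cls._instance is None:  # re-check while holding the lock
                    cls._instance = cls()
        return cls._instance

first = SharedPool.shared()
second = SharedPool.shared()
print(first is second)
```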

Result Structure

The recall operation returns a dictionary with the following structure:

{
  "ok": true,
  "results": [
    {
      "fact_id": "fact_abc123",
      "content": "The project uses Python 3.12 for type safety",
      "trust_score": 0.92,
      "provenance": ["user_feedback:thumbs_up", "automatic_verification"],
      "tags": ["project:backend", "profile:default"],
      "created_at": 1704067200
    }
  ]
}
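
A consumer of this structure might check the ok flag and then filter on trust_score. The following sketch uses the documented field names; the second record and the 0.5 threshold are invented for illustration.

```python
import json

raw = """
{"ok": true, "results": [
  {"fact_id": "fact_abc123", "content": "The project uses Python 3.12 for type safety",
   "trust_score": 0.92, "tags": ["project:backend"], "created_at": 1704067200},
  {"fact_id": "fact_def456", "content": "Unverified imported note",
   "trust_score": 0.35, "tags": [], "created_at": 1704070800}
]}
"""

def trusted_facts(payload: dict, min_trust: float = 0.5) -> list[dict]:
    """Return results at or above min_trust; an unsuccessful payload yields none."""
    if not payload.get("ok"):
        return []
    return [r for r in payload.get("results", [])
            if r.get("trust_score", 0.0) >= min_trust]

facts = trusted_facts(json.loads(raw))
print([f["fact_id"] for f in facts])
```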

Context Caching

Cache Architecture

The Retrieval Pipeline integrates with a context cache layer to accelerate repeated queries. When memories are retrieved, they can be cached with a topic signature for future use:

graph LR
    A["Query: 'Python best practices'"] --> B["Compute Topic Signature<br/>hash(query)"]
    B --> C["ContextCache Lookup"]
    C -->|Cache Hit| D["Return Cached Results"]
    C -->|Cache Miss| E["Run Full Retrieval Pipeline"]
    E --> F["Upsert Cache Entry"]
    F --> D

The ContextCache stores entries with:

session_id: Conversation or agent session identifier
topic_sig: Hash of the query for fast lookup
content: Serialized memory context
fact_ids: Tuple of referenced fact IDs
provenance: How the cache entry was computed
computed_at: Unix timestamp for TTL decisions

Source: server/routes/prewarm.py:_upsert_cache
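
The lookup flow and entry fields listed above can be sketched with an in-memory cache. Everything beyond the field names is an assumption: `Entry` and `TinyCache` are hypothetical classes, the signature is a truncated SHA-256 of the normalized prompt (the real compute_topic_signature may work differently), and the 300-second TTL is arbitrary.

```python
import hashlib
import time
from dataclasses import dataclass, field

@dataclass(frozen=True)
class Entry:
    session_id: str
    topic_sig: str
    content: str
    fact_ids: tuple
    provenance: str
    computed_at: float = field(default_factory=time.time)

def topic_sig(prompt: str) -> str:
    """Assumed signature: SHA-256 of the case- and whitespace-normalized prompt."""
    normalized = " ".join(prompt.lower().split())
    return hashlib.sha256(normalized.encode("utf-8")).hexdigest()[:16]

class TinyCache:
    def __init__(self, ttl_seconds: float = 300.0):
        self._ttl = ttl_seconds
        self._entries = {}

    def upsert(self, entry: Entry) -> None:
        self._entries[(entry.session_id, entry.topic_sig)] = entry

    def get(self, session_id: str, sig: str, now=None):
        entry = self._entries.get((session_id, sig))
        now = time.time() if now is None else now
        if entry is None or now - entry.computed_at > self._ttl:
            return None  # cache miss, or entry older than the TTL
        return entry

cache = TinyCache()
sig = topic_sig("Python best practices")
cache.upsert(Entry("session_xyz", sig, "cached context", ("fact_abc123",), "prewarm"))
hit = cache.get("session_xyz", topic_sig("  python   BEST practices "))
print(hit is not None)
```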

Prewarm Endpoint

The /api/prewarm endpoint allows agents to eagerly populate the cache before a conversation starts:

POST /api/prewarm
{
  "session_id": "session_xyz",
  "prompt": "Tell me about the backend architecture"
}

This triggers retrieval for the provided prompt and stores results in the context cache, reducing latency when the user later asks related questions. Source: server/routes/prewarm.py
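
An agent could prepare this call with the standard library alone. The sketch below builds the request without sending it; the base URL assumes the default daemon address from the Environment Variables table, and `prewarm_request` is a hypothetical helper.

```python
import json
from urllib.request import Request

def prewarm_request(session_id: str, prompt: str,
                    base_url: str = "http://127.0.0.1:8765") -> Request:
    """Build (but do not send) the POST request for /api/prewarm."""
    body = json.dumps({"session_id": session_id, "prompt": prompt}).encode("utf-8")
    return Request(
        f"{base_url}/api/prewarm",
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )

req = prewarm_request("session_xyz", "Tell me about the backend architecture")
print(req.get_method(), req.full_url)
```

With the daemon running, passing `req` to `urllib.request.urlopen` would submit it.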

Topic Signature Computation

Topic signatures are computed lazily to keep module imports fast:

def _compute_topic_sig(prompt: str) -> str:
    """Lazy import so module import is free of hot-path SLM modules."""
    from superlocalmemory.core.topic_signature import compute_topic_signature
    return compute_topic_signature(prompt)

This lazy import pattern ensures that importing the prewarm module does not trigger loading of the heavier retrieval dependencies until actually needed. Source: server/routes/prewarm.py:_compute_topic_sig

Chat Integration with Memory Context

Streaming Response Flow

When the /api/chat endpoint receives a message, it retrieves relevant memories and streams them alongside the LLM response:

sequenceDiagram
    participant User as Client (SSE)
    participant API as /api/chat
    participant Recall as Retrieval Pipeline
    participant LLM as LLM Provider
    
    User->>API: POST /api/chat {message}
    API->>Recall: _recall_memories(message, limit=10)
    Recall-->>API: memories[]
    API->>API: Build context with [MEM-1], [MEM-2] markers
    API->>LLM: Stream completion(messages + context)
    LLM-->>User: SSE tokens with memory citations

The retrieved memories are formatted with citation markers that the LLM can reference:

[MEM-1] The project uses Python 3.12 (trust: 0.92)
[MEM-2] PostgreSQL is the primary database (trust: 0.88)

Source: server/routes/chat.py:_build_context
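
The marker format shown above is simple to generate. This sketch reproduces the two example lines; `build_context` here is an illustrative re-creation, not the body of the function in chat.py.

```python
def build_context(memories: list[dict]) -> str:
    """Render memories as numbered [MEM-n] lines, in retrieval order."""
    lines = []
    for index, memory in enumerate(memories, start=1):
        lines.append(
            f"[MEM-{index}] {memory['content']} (trust: {memory['trust_score']:.2f})"
        )
    return "\n".join(lines)

context = build_context([
    {"content": "The project uses Python 3.12", "trust_score": 0.92},
    {"content": "PostgreSQL is the primary database", "trust_score": 0.88},
])
print(context)
```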

Trust Scoring in Results

Each retrieved memory carries a trust_score between 0.0 and 1.0, computed from the provenance chain:

Memories with positive user feedback (thumbs up) receive higher trust
Memories auto-verified against external sources score higher
Memories from recent sessions with the same profile are weighted more heavily
Imported memories without provenance chains receive lower default trust

The trust score appears in both CLI output and the SSE stream, allowing clients to display visual indicators of memory reliability.

Performance Considerations

Fast Mode Trade-offs

When --fast is specified, the Spreading Activation channel is disabled. This channel provides neural-style associative retrieval by propagating activation through the memory graph. Disabling it trades recall quality for speed:

Mode	Latency	Channels Active	Best For
Default	~500-2000ms	5 (all)	Comprehensive research, agent planning
`--fast`	~100-500ms	4 (no spreading)	Tool calls before web search, real-time autocomplete

For most interactive use cases, --fast provides sufficient accuracy. The full pipeline is recommended for final answer synthesis or when recall quality is paramount. Source: cli/main.py:recall_p.add_argument --fast

Concurrent Access

The WorkerPool handles concurrent requests safely through Python's concurrent.futures thread pool executor. Multiple simultaneous slm recall invocations or API calls are serialized at the database level but execute channel operations in parallel within each request.

Docker and Linux Considerations

Community reports indicate potential issues with retrieval latency in Docker containers and certain Linux distributions. These may relate to:

SQLite file locking behavior in overlay filesystems
Thread pool sizing in containerized environments
Model loading times for local embedding models

Users experiencing long wait times (reported as "wait for a long time but seems no response" in issue #11, fixed in v3.3.19) should verify:

The daemon is running (slm status)
Sufficient memory is available for embedding models
The database file is on a local filesystem (not network storage)

Configuration

Data Directory

By default, SuperLocalMemory stores all retrieval data in ~/.superlocalmemory/. The SLM_DATA_DIR environment variable is intended to relocate this directory. Issue #10 reported that it was not honored everywhere, so confirm that your installed version respects it:

export SLM_DATA_DIR=/path/to/custom/data
slm recall "my query"

Mode A vs Mode B

The retrieval pipeline operates in two modes:

Aspect	Mode A (Local)	Mode B (API)
Embeddings	Local model (nomic-embed-text)	Remote API (OpenAI-compatible)
LLM	Local (Ollama)	Remote (OpenAI, Anthropic, etc.)
Latency	Higher (model loading)	Lower (API calls)
Privacy	Maximum (fully offline)	High (data stays on configured server)
Cost	Free (compute only)	API token costs

Configurable OpenAI-compatible embedding endpoints, which would let users choose embedding providers better suited to non-English languages, are tracked as feature request #16. Source: src/superlocalmemory/core/config.py

Common Issues

No Results Returned

If slm recall returns an empty result set:

Verify memories exist: slm list-recent
Check profile isolation: memories are scoped to the current profile
Inspect database: sqlite3 ~/.superlocalmemory/memory.db "SELECT COUNT(*) FROM facts"
Enable debug logging: SLM_LOG_LEVEL=DEBUG slm recall <query>

Long Response Times

Long retrieval times (beyond 2-3 seconds) may indicate:

First-run embedding model download (Setup Wizard should handle this)
Large fact database without proper indexing
Mode B embedding endpoint unreachable
Insufficient system memory for local models

API Key Not Used (Mode B)

Issue #9 reported that api_key is silently dropped in Mode B configurations. Ensure the API key is correctly set in the configuration file or environment variable, and that the provider field is set to a non-Ollama value to trigger the correct code path.

CLI Reference

The SuperLocalMemory CLI (slm) provides a command-line interface for managing AI memory operations. The CLI serves as the primary interface for agents and developers to store, retrieve, update, and delete memories from the local SQLite database. All operations are local-first, with data stored in ~/.superlocalmemory/memory.db by default.

Overview

The CLI follows the 2026 agent-native CLI standard, providing consistent JSON envelopes for programmatic consumption alongside human-readable output for interactive use. Source: src/superlocalmemory/cli/json_output.py:1-50

graph TD
    A[slm CLI] --> B[Commands]
    B --> C[remember - Store facts]
    B --> D[recall - Semantic search]
    B --> E[forget - Fuzzy delete]
    B --> D2[delete - Precise delete]
    B --> F[update - Modify memory]
    A --> G[Global Options]
    G --> H[--json Agent-native output]
    G --> I[--profile Profile isolation]

Installation

The CLI is available via both npm and pip package managers.

# npm installation (recommended for global access)
npm install -g superlocalmemory

# pip installation
pip install superlocalmemory

After installation, the slm command becomes available globally. The post-install script automatically detects V2 installations and prompts for migration. Source: src/superlocalmemory/cli/post_install.py:1-40

For first-time users, the setup wizard runs automatically on first slm command when ~/.superlocalmemory/.setup-complete is missing. Source: src/superlocalmemory/cli/setup_wizard.py:1-45

Commands

slm remember

Stores new facts and memories in the local database.

slm remember <content> [options]

Argument	Description
`content`	The fact or memory to store

Options:

Option	Description
`--json`	Output structured JSON (agent-native)
`--profile <name>`	Store in specific profile (default: current profile)

Example:

# Interactive mode
slm remember "The PostgreSQL connection pool should use max 20 connections"

# JSON output for scripting
slm remember "Project deadline is March 15th" --json

The remember command processes the content through the V3 MemoryEngine, computing topic signatures and extracting entities for optimized retrieval. Source: src/superlocalmemory/server/routes/prewarm.py:1-50

slm recall

Performs semantic search with 4-channel retrieval across stored memories.

slm recall <query> [options]

Argument	Description
`query`	Semantic search query

Options:

Option	Default	Description
`--limit <n>`	10	Maximum number of results
`--json`	false	Output structured JSON
`--fast`	false	Skip 5th channel (SpreadingActivation) for sub-second response
`--profile <name>`	current	Search within specific profile

4-Channel Retrieval Architecture:

The recall command uses four retrieval channels:

Semantic - Vector embedding similarity
Lexical - Keyword and phrase matching
Temporal - Time-based relevance scoring
Structural - Graph topology influence

The optional 5th channel (SpreadingActivation) performs network propagation for deeper context discovery. Use --fast to skip this channel when speed is critical, such as before a tool call. Source: src/superlocalmemory/cli/main.py:1-80

graph LR
    A[Query] --> B[Semantic Channel]
    A --> C[Lexical Channel]
    A --> D[Temporal Channel]
    A --> E[Structural Channel]
    B --> F[Ranking Engine]
    C --> F
    D --> F
    E --> F
    F --> G[Results]

Example:

# Standard recall
slm recall "database connection pooling settings"

# Fast recall for pre-tool use
slm recall "API endpoint for users" --fast

# JSON output with higher limit
slm recall "authentication token handling" --limit 20 --json

Performance Note: If slm recall appears to hang with no response, this was a known issue fixed in v3.3.19. Ensure you are running a recent version. Source: GitHub Issue #11

slm forget

Deletes memories matching a query using fuzzy matching.

slm forget <query> [options]

Argument	Description
`query`	Query to match for deletion

Options:

Option	Description
`--dry-run`	Preview matches without deleting
`--yes, -y`	Skip confirmation prompt
`--json`	Output structured JSON
`--profile <name>`	Operate within specific profile

Example:

# Preview what would be deleted
slm forget "old project notes" --dry-run

# Delete without the confirmation prompt (-y)
slm forget "duplicate entry about config" -y

slm delete

Deletes a specific memory by its exact fact ID.

slm delete <fact_id> [options]

Argument	Description
`fact_id`	Exact fact ID to delete

Options:

Option	Description
`--yes, -y`	Skip confirmation prompt
`--json`	Output structured JSON

Example:

# Delete by exact ID
slm delete fact_abc123def456 -y

Unlike forget, this command requires the precise fact ID, making it suitable for programmatic deletion workflows.

slm update

Modifies an existing memory entry.

slm update [options]

Options:

Option	Description
`--fact-id <id>`	Fact ID to update
`--content <text>`	New content for the memory
`--json`	Output structured JSON

Example:

# Update memory content
slm update --fact-id fact_xyz789 --content "Updated project requirements"

# JSON output for automation
slm update --fact-id fact_xyz789 --content "New content here" --json

Global Options

These options work with any command.

Option	Description
`--json`	Enable agent-native JSON output format
`--profile <name>`	Specify which profile to use
`--help, -h`	Show help message
`--version, -v`	Show version information

JSON Output Format

When --json is specified, commands return a structured envelope following the 2026 agent-native CLI standard:

{
  "success": true,
  "command": "recall",
  "version": "3.4.58",
  "data": {
    "results": [
      {
        "fact_id": "fact_abc123",
        "content": "The app uses port 5432 for PostgreSQL",
        "trust_score": 0.87,
        "importance": 7,
        "tags": ["database", "config"],
        "created_at": "2026-02-01T10:30:00Z"
      }
    ],
    "total": 1,
    "query": "postgres port",
    "channel_used": ["semantic", "lexical"]
  },
  "metadata": {
    "profile": "default",
    "execution_time_ms": 234
  }
}

Source: src/superlocalmemory/cli/json_output.py:1-80

Version Detection:

The JSON envelope includes version information read from:

package.json (npm installs)
pyproject.toml (pip installs)
importlib.metadata fallback

Source: src/superlocalmemory/cli/json_output.py:20-50
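
The last step in that list, the importlib.metadata fallback, can be sketched as follows. `resolve_version` and its "unknown" default are assumptions for illustration; the package.json and pyproject.toml lookups are not shown.

```python
from importlib.metadata import PackageNotFoundError, version

def resolve_version(dist_name: str = "superlocalmemory",
                    default: str = "unknown") -> str:
    """Return the installed distribution's version, or default if it is absent."""
    try:
        return version(dist_name)
    except PackageNotFoundError:
        return default

print(resolve_version())
```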

Environment Variables

Variable	Default	Description
`SL_MEMORY_PATH`	`~/.superlocalmemory`	Base directory for data storage
`CI`	(not set)	Set to disable interactive prompts
`SLM_NON_INTERACTIVE`	(not set)	Disable interactive mode
`SLM_HOST`	`127.0.0.1`	Daemon bind address
`SLM_PORT`	`8765`	Daemon port

Note: The SLM_DATA_DIR environment variable was requested in the community (Issue #10) for custom data directories. Verify current support by checking src/superlocalmemory/core/config.py.
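
Resolving these variables with their documented defaults might look like the sketch below. How the real CLI interprets each variable is an assumption here, in particular that any non-empty CI or SLM_NON_INTERACTIVE value disables prompts; `daemon_settings` is a hypothetical helper.

```python
import os
from pathlib import Path

def daemon_settings(env=None) -> dict:
    """Resolve settings from the environment, using the table's defaults."""
    env = os.environ if env is None else env
    return {
        "home": Path(env.get("SL_MEMORY_PATH", Path.home() / ".superlocalmemory")),
        "host": env.get("SLM_HOST", "127.0.0.1"),
        "port": int(env.get("SLM_PORT", "8765")),
        "interactive": not (env.get("CI") or env.get("SLM_NON_INTERACTIVE")),
    }

settings = daemon_settings({"SLM_PORT": "9000", "CI": "true"})
print(settings)
```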

Profile Isolation

All CLI commands respect profile isolation. Memories from one profile cannot leak to another. This is enforced at the query endpoint level and applies to all operations including recall, remember, forget, and delete. Source: src/superlocalmemory/server/ui.py:1-50

graph TD
    A[CLI Command] --> B{Profile Specified?}
    B -->|No| C[Use Current Profile]
    B -->|Yes| D[Use Named Profile]
    C --> E[Query Engine]
    D --> E
    E --> F{Isolation Check}
    F -->|Pass| G[Return Results]
    F -->|Fail| H[Empty Results]
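
One way to obtain the behaviour in the diagram is to attach a profile filter to every query. The sketch below demonstrates the idea on an in-memory SQLite database; the `facts` schema with a `profile` column is hypothetical and may not match the real memory.db layout.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical schema: the real tables and columns may differ.
conn.execute(
    "CREATE TABLE facts (fact_id TEXT PRIMARY KEY, profile TEXT NOT NULL, content TEXT)"
)
conn.executemany(
    "INSERT INTO facts VALUES (?, ?, ?)",
    [
        ("fact_1", "default", "Pool size is 20"),
        ("fact_2", "work", "Deadline is March 15th"),
    ],
)

def recall_for_profile(conn, profile: str, term: str) -> list:
    """Every query carries the profile filter, so other profiles never match."""
    rows = conn.execute(
        "SELECT fact_id FROM facts WHERE profile = ? AND content LIKE ?",
        (profile, f"%{term}%"),
    )
    return [row[0] for row in rows]

print(recall_for_profile(conn, "default", "Deadline"))
print(recall_for_profile(conn, "work", "Deadline"))
```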

Integration with AI Tools

The CLI is designed to work seamlessly with AI coding assistants and IDE integrations:

Supported Integrations

Integration	Package	Description
LlamaIndex	`llamaindex`	Chat store adapter for conversation history
LangChain	`langchain-superlocalmemory`	Chat message history implementation
Claude Desktop	MCP	Model Context Protocol server
Cursor	MCP	AI IDE integration
Windsurf	MCP	AI-powered code editor

Source: ide/integrations/llamaindex/README.md:1-40

Source: ide/integrations/langchain/README.md:1-60

LangChain Example

from langchain_core.messages import HumanMessage, AIMessage
from langchain_superlocalmemory import SuperLocalMemoryChatMessageHistory

# Session-isolated chat history
history = SuperLocalMemoryChatMessageHistory(session_id="debug-session-42")
history.add_messages([
    HumanMessage(content="The login API returns 500 on production"),
    AIMessage(content="Let me check the error logs..."),
])

# Messages persist locally and are accessible via slm recall

Common Issues

Recall Hangs with No Response

Symptom: slm remember or slm recall hangs indefinitely.

Solution: This was fixed in v3.3.19. Upgrade to the latest version:

npm install -g superlocalmemory@latest
# or
pip install --upgrade superlocalmemory

Source: GitHub Issue #11

Windows Clone Issues

Symptom: git clone fails with "invalid path" error due to files with colons in bin/.

Solution: Fixed in v2.8.2. Update to a newer version, or clone without checking out files and then check out everything except bin/:

git clone --no-checkout https://github.com/qualixar/superlocalmemory.git
cd superlocalmemory
git checkout HEAD -- ':!bin/*'

Source: GitHub Issue #7

Profile Isolation Concerns

For multi-agent setups where users want distinct memory scopes (personal/global/shared), this is tracked in the Multi-Scope Memory RFC. Current implementation stores all memories in a flat namespace with profile-based filtering. Source: GitHub Issue #20

MCP Tools

Related topics: CLI Reference

SuperLocalMemory V3 provides a comprehensive Model Context Protocol (MCP) server implementation that enables AI tools and integrated development environments (IDEs) to interact with the local memory system. The MCP Tools layer serves as the primary integration mechanism for Claude Desktop, Cursor, Windsurf, and 17+ other AI-powered development tools.

Overview

The MCP Tools module implements the Model Context Protocol specification, providing a standardized interface for AI assistants to store, retrieve, and manage memories without leaving the development environment. This integration eliminates context window friction by making relevant memories available automatically during coding sessions.

Key capabilities:

Semantic memory storage and retrieval via 4-channel SpreadingActivation
Real-time memory context injection into AI conversations
Profile-scoped memory isolation for multi-agent workflows
Trust-aware retrieval with provenance tracking
Zero-LLM inference mode (Mode A) for local-only operation

Source: src/superlocalmemory/mcp/server.py

Architecture

The MCP implementation follows a client-server architecture in which the SLM daemon acts as the MCP server and AI tools act as MCP clients. The architecture supports both local-only operation and OpenAI-compatible API backends.

graph TD
    subgraph "MCP Client Layer"
        Claude["Claude Desktop"]
        Cursor["Cursor IDE"]
        Windsurf["Windsurf"]
        Continue["Continue.dev"]
        Cody["Cody by Sourcegraph"]
    end

    subgraph "MCP Protocol Layer"
        JSONRPC["JSON-RPC 2.0 Transport"]
        Tools["Tool Handlers"]
        Resources["Resource Handlers"]
        Prompts["Prompt Templates"]
    end

    subgraph "SLM Core Layer"
        MemoryEngine["V3 MemoryEngine"]
        SpreadingActivation["4-Channel Retrieval"]
        ContextCache["Context Cache"]
        TrustEngine["Trust Scoring Engine"]
    end

    subgraph "Storage Layer"
        SQLite["SQLite Database<br/>~/.superlocalmemory/memory.db"]
        VFS["Vector Store<br/>nomic-embed-text-v1.5"]
        GraphDB["Graph Store<br/>Temporal + Structural"]
    end

    Claude --> JSONRPC
    Cursor --> JSONRPC
    Windsurf --> JSONRPC
    Continue --> JSONRPC
    Cody --> JSONRPC

    JSONRPC --> Tools
    JSONRPC --> Resources
    JSONRPC --> Prompts

    Tools --> MemoryEngine
    Resources --> MemoryEngine
    Prompts --> MemoryEngine

    MemoryEngine --> SpreadingActivation
    SpreadingActivation --> ContextCache
    SpreadingActivation --> TrustEngine

    SpreadingActivation --> SQLite
    SpreadingActivation --> VFS
    SpreadingActivation --> GraphDB

Component Responsibilities

Component	Responsibility	File Reference
MCP Server	Protocol implementation, transport handling	server.py:1-150
Tool Handlers	Execute memory operations via MCP protocol	tools.py:1-200
Memory Engine	Core retrieval and storage logic	V3 Engine integration
Context Cache	Pre-warm cache for low-latency retrieval	prewarm.py:1-100
API Server	REST endpoint for dashboard and external access	api.py:1-150

Available MCP Tools

SuperLocalMemory V3 exposes 6 primary MCP tools for memory management operations. Each tool corresponds to a CLI command and provides equivalent functionality through the protocol.

Tool Reference

Tool Name	Description	Parameters	Return Type
`slm_remember`	Store new memory with automatic importance scoring	`content`, `project`, `tags`, `importance`	`fact_id`, `topic_sig`
`slm_recall`	Semantic search with 4-channel retrieval	`query`, `limit`, `profile`	List of memory entries
`slm_list_recent`	List recently accessed memories	`limit`, `profile`	List of memory entries
`slm_status`	System health and memory statistics	None	Status object
`slm_build_graph`	Generate relationship graph for memories	`query`, `depth`	Graph data
`slm_switch_profile`	Change active memory profile	`profile_name`	Confirmation

Tool Specifications

#### slm_remember

Stores new information in the memory system with automatic topic signature computation and importance scoring.

# MCP tool signature
slm_remember(
    content: str,      # The memory content to store
    project: str = "", # Project identifier (optional)
    tags: list[str] = [], # Custom tags for filtering
    importance: int = 5  # 1-10 importance score
) -> {
    "fact_id": str,
    "topic_sig": str,
    "created_at": int
}

Behavior:

Computes topic signature from content using local embeddings
Applies automatic importance scoring based on content analysis
Stores in SQLite with profile isolation
Updates vector index for semantic retrieval
Triggers cognitive consolidation check

Source: tools.py
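
As a rough illustration of how an MCP client could invoke this tool, the sketch below builds a JSON-RPC 2.0 `tools/call` request, which is the method the MCP specification defines for tool invocation. The helper name `build_remember_request` and the sample values are hypothetical; only the argument names come from the signature above.

```python
import json


def build_remember_request(request_id, content, project="", tags=None, importance=5):
    """Build a JSON-RPC 2.0 request that calls the slm_remember MCP tool."""
    if not 1 <= importance <= 10:
        raise ValueError("importance must be between 1 and 10")
    return {
        "jsonrpc": "2.0",
        "id": request_id,
        "method": "tools/call",  # MCP method for invoking a tool
        "params": {
            "name": "slm_remember",
            "arguments": {
                "content": content,
                "project": project,
                "tags": list(tags or []),  # copy so the caller's list is not shared
                "importance": importance,
            },
        },
    }


if __name__ == "__main__":
    request = build_remember_request(
        1, "The login API returns 500 on production", tags=["bug"]
    )
    print(json.dumps(request, indent=2))
```

In practice an MCP client library would assemble and send this message; the sketch only shows the shape of the payload.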

#### slm_recall

Performs semantic search using the 4-channel SpreadingActivation algorithm.

# MCP tool signature
slm_recall(
    query: str,        # Search query
    limit: int = 10,   # Maximum results
    profile: str = ""  # Profile scope (empty = current)
) -> {
    "results": [
        {
            "fact_id": str,
            "content": str,
            "trust_score": float,
            "relevance": float,
            "channel_scores": {...}
        }
    ],
    "total": int,
    "query_time_ms": float
}

Retrieval Channels:

Semantic — Vector similarity using nomic-embed-text-v1.5
Lexical — BM25 keyword matching
Temporal — Time-decay weighted retrieval
Structural — Topic graph proximity scoring

Source: src/superlocalmemory/core/retrieval.py
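
One way to picture how four channel scores could become a single `relevance` value is a weighted sum. The sketch below is an assumption for illustration only: the weights, the function names, and the weighted-sum fusion itself are hypothetical and are not taken from core/retrieval.py.

```python
# Hypothetical channel weights; the real fusion logic may differ.
CHANNEL_WEIGHTS = {
    "semantic": 0.4,
    "lexical": 0.3,
    "temporal": 0.15,
    "structural": 0.15,
}


def fuse_channel_scores(channel_scores, weights=CHANNEL_WEIGHTS):
    """Combine per-channel scores in [0, 1] into one relevance value."""
    total_weight = sum(weights.values())
    weighted = sum(w * channel_scores.get(name, 0.0) for name, w in weights.items())
    return weighted / total_weight


def rank(results, limit=10):
    """Attach a fused relevance to each result and return the top `limit`."""
    scored = [
        dict(result, relevance=fuse_channel_scores(result["channel_scores"]))
        for result in results
    ]
    scored.sort(key=lambda result: result["relevance"], reverse=True)
    return scored[:limit]
```

A missing channel counts as 0.0 here, so a memory that matches on only one channel can still surface, just with a lower rank.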

#### slm_status

Returns system health, memory statistics, and configuration state.

# Return type
{
    "version": str,
    "mode": "A" | "B" | "C",
    "daemon_running": bool,
    "profile": str,
    "stats": {
        "total_memories": int,
        "total_facts": int,
        "total_projects": int,
        "cache_size": int
    },
    "llm": {
        "provider": str,
        "model": str,
        "available": bool
    }
}

Source: src/superlocalmemory/cli/main.py

MCP Resources

MCP Resources provide read-only access to memory data, suitable for context injection without tool execution overhead.

Resource URI	Description	Refresh Policy
`slm://memories/recent`	Last 20 accessed memories	On-demand
`slm://memories/by-project/{project}`	Memories for specific project	On-demand
`slm://profile/current`	Current active profile	On-demand
`slm://system/status`	System health snapshot	30-second cache
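
To make the URI scheme concrete, the following sketch routes the four resource URIs from the table using only the standard library. The function name `parse_resource_uri` and the returned labels are hypothetical; they are not part of the SLM codebase.

```python
from urllib.parse import unquote, urlparse


def parse_resource_uri(uri):
    """Map an slm:// resource URI to a (resource_kind, argument) pair."""
    parsed = urlparse(uri)
    if parsed.scheme != "slm":
        raise ValueError(f"not an slm:// URI: {uri!r}")
    # urlparse puts the first segment in netloc and the rest in path.
    segments = [parsed.netloc] + [unquote(s) for s in parsed.path.split("/") if s]
    if segments == ["memories", "recent"]:
        return ("recent_memories", None)
    if len(segments) == 3 and segments[:2] == ["memories", "by-project"]:
        return ("project_memories", segments[2])
    if segments == ["profile", "current"]:
        return ("current_profile", None)
    if segments == ["system", "status"]:
        return ("system_status", None)
    raise ValueError(f"unknown resource URI: {uri!r}")
```

Percent-encoded project names are decoded, so a project called `my app` can be addressed as `slm://memories/by-project/my%20app`.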

MCP Prompts

Pre-defined prompt templates for common memory operations:

Prompt Name	Description	Variables
`summarize-project`	Generate project summary from memories	`project_name`
`find-related`	Find memories related to current context	`current_topic`
`learning-summary`	Summarize learned patterns	`time_range`

Configuration

MCP Server Settings

The MCP server is configured through slm config or environment variables:

Setting	Environment Variable	Default	Description
Server Host	`SLM_HOST`	`127.0.0.1`	Bind address for MCP connections
Server Port	`SLM_PORT`	`8765`	Port for MCP JSON-RPC transport
Transport	`SLM_TRANSPORT`	`stdio`	Transport mode (stdio/sse)

Note: SLM_HOST was requested in Issue #23 to remove the hardcoded 127.0.0.1 bind address, which prevented multi-machine deployments on trusted networks.
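
A minimal sketch of how these three settings could be resolved from the environment, using the defaults from the table. The `ServerSettings` class and `load_server_settings` function are hypothetical names for illustration, not SLM's actual configuration loader.

```python
import os
from dataclasses import dataclass


@dataclass(frozen=True)
class ServerSettings:
    host: str
    port: int
    transport: str


def load_server_settings(env=None):
    """Read MCP server settings from environment variables with documented defaults."""
    env = os.environ if env is None else env
    transport = env.get("SLM_TRANSPORT", "stdio")
    if transport not in ("stdio", "sse"):
        raise ValueError(f"unsupported transport: {transport!r}")
    return ServerSettings(
        host=env.get("SLM_HOST", "127.0.0.1"),
        port=int(env.get("SLM_PORT", "8765")),  # env values are strings
        transport=transport,
    )
```

Passing a plain dict as `env` makes the function easy to exercise in tests without touching the real process environment.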

IDE-Specific Configuration

Each supported IDE requires a configuration file pointing to the MCP server:

{
  "mcpServers": {
    "superlocalmemory": {
      "command": "npx",
      "args": ["-y", "superlocalmemory@latest", "mcp"]
    }
  }
}

Supported IDEs and configuration locations:

IDE	Configuration File
Claude Desktop	`~/.claude_desktop_config.json`
Cursor	`.cursor/mcp.json` in project
Windsurf	`.windsurf/mcp_config.json`
Continue.dev	`.continue/config.json`
Cody	Sourcegraph dashboard settings
ChatGPT	ChatGPT Desktop settings
Perplexity	Perplexity Desktop settings
Zed	`.zed/mcp.json`
OpenCode	OpenCode MCP settings
Antigravity	Antigravity MCP config
Aider	`~/.aider.conf.yml`

Source: ide/configs
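
If an IDE already has other MCP servers configured, the entry has to be merged rather than overwritten. The sketch below shows one way to do that; the function name `add_slm_server` is hypothetical, and the server entry is copied from the JSON example above.

```python
import json
from pathlib import Path


def add_slm_server(config_path):
    """Add the superlocalmemory entry to an MCP config file, keeping other servers."""
    path = Path(config_path)
    config = json.loads(path.read_text(encoding="utf-8")) if path.exists() else {}
    servers = config.setdefault("mcpServers", {})
    servers["superlocalmemory"] = {
        "command": "npx",
        "args": ["-y", "superlocalmemory@latest", "mcp"],
    }
    path.parent.mkdir(parents=True, exist_ok=True)
    path.write_text(json.dumps(config, indent=2) + "\n", encoding="utf-8")
    return config
```

For example, `add_slm_server(".cursor/mcp.json")` would create or update the project-level Cursor configuration listed in the table.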

Integration with LangChain and LlamaIndex

Beyond native MCP support, SuperLocalMemory provides direct integrations with popular AI development frameworks.

LangChain Integration

The langchain-superlocalmemory package implements BaseChatMessageHistory for storing conversation history:

from langchain_core.messages import AIMessage, HumanMessage
from langchain_superlocalmemory import SuperLocalMemoryChatMessageHistory

history = SuperLocalMemoryChatMessageHistory(session_id="my-session")
history.add_messages([
    HumanMessage(content="What is SuperLocalMemory?"),
    AIMessage(content="It's a local-first memory system for AI assistants."),
])

Features:

Session isolation via session_id
All messages stored in ~/.superlocalmemory/memory.db
Compatible with LangChain chains and agents
Messages visible via CLI and MCP tools

Source: ide/integrations/langchain/README.md

LlamaIndex Integration

The llamaindex-superlocalmemory package provides chat storage:

from llamaindex_superlocalmemory import SuperLocalMemoryChatStore

chat_store = SuperLocalMemoryChatStore(
    session_key="user-session-123",
    db_path="/path/to/custom/memory.db"
)

Features:

Async support via BaseChatStore
Tag-based session isolation: llamaindex:chat:<session_key>
Messages queryable via SLM CLI and MCP

Source: ide/integrations/llamaindex/README.md

Common Usage Patterns

Context Pre-warming

The system automatically pre-warms context for known tool calls to reduce latency:

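# From prewarm.py - automatic context caching
# Imports added for completeness; the module path follows core/context_cache.py.
import time

from superlocalmemory.core.context_cache import CacheEntry, ContextCache


def _upsert_cache(
    session_id: str,
    topic_sig: str,
    content: str,
    fact_ids: list[str],
) -> None:
    # Store (or overwrite) the pre-computed context for this session and topic
    cache = ContextCache()
    cache.upsert(CacheEntry(
        session_id=session_id,
        topic_sig=topic_sig,
        content=content,
        fact_ids=tuple(fact_ids),
        provenance="prewarm_post_tool",
        computed_at=int(time.time()),
    ))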

This pattern ensures memories related to active sessions are immediately available without triggering full retrieval.

Multi-Profile Workflows

For teams running multiple specialized agents, profile isolation ensures memory separation:

# Create profile for specialized agent
slm profile create coding-agent

# Store a memory; it is scoped to whichever profile is currently active
slm remember "Python asyncio best practices" --project python-tips

# Recall uses current profile context automatically
slm recall "async patterns"

Note: Multi-scope memory (personal/global/shared) is a requested feature (Issue #20) that would extend the current flat namespace model.

Troubleshooting

Long Response Times

If slm remember or slm recall hangs or takes an excessive amount of time, you may be hitting a known issue that was fixed in v3.3.19 (Issue #11). Upgrade to the latest version:

npm install -g superlocalmemory@latest

Mode B API Key Not Working

When using OpenAI-compatible providers in Mode B, the api_key may be silently dropped. Check the configuration in src/superlocalmemory/core/config.py — SLMConfig.for_mode() in the Mode B branch (Issue #9).

Docker/Linux Container Issues

For cognitive consolidation issues on Linux/Docker setups, refer to Issue #26 for environment-specific guidance.

Security Considerations

The MCP server enforces profile isolation on all query endpoints, so a query issued under one profile does not return memories from another. This supports the enterprise security requirements introduced in v2.6.0.

Key security features:

Profile-scoped access control on all MCP tools
API key validation for external providers (Mode B)
Trust scoring to flag potentially unreliable memories
Audit trail for compliance (v2.8.0 enterprise compliance)

Memory Lifecycle

Memory Lifecycle is a core system in SuperLocalMemory that automatically organizes memories over time based on usage patterns, ensuring the memory system remains fast, relevant, and scalable. Introduced in v2.8.0, this feature manages the complete journey of a memory from creation through consolidation to potential archival or deletion.

Overview

As you interact with SuperLocalMemory, the system accumulates memories at different rates depending on your workflow. Without lifecycle management, a growing memory database leads to degraded recall performance and storage bloat. The Memory Lifecycle system addresses these challenges by:

Automatically organizing memories based on usage frequency and recency
Consolidating related memories into coherent knowledge units
Managing storage efficiency through intelligent quantization
Maintaining relevance by prioritizing frequently accessed memories
Preserving user privacy through profile isolation during all operations

Source: CHANGELOG.md - v2.8.0

Architecture

The Memory Lifecycle system consists of several interconnected components that work together to manage memories throughout their lifetime.

graph TD
    A[Memory Created<br/>slm remember] --> B[Initial Storage<br/>memory.db]
    B --> C[Usage Tracking<br/>Access Count + Recency]
    C --> D{Lifecycle Decision<br/>Engine}
    D -->|Frequently Used| E[Active Memory<br/>Priority Queue]
    D -->|Infrequent Access| F[Consolidation Queue]
    D -->|Stale + Redundant| G[Archive/Prune]
    E --> H[Fast Recall Path]
    F --> I[Consolidation Worker]
    I --> J[Quantized Storage]
    J --> K[Compressed Facts]
    G --> L[Audit Trail]
    H --> M[Context Cache]
    M --> N[Instant Recall]
    
    style A fill:#e1f5fe
    style H fill:#c8e6c9
    style I fill:#fff3e0
    style G fill:#ffcdd2

Core Components

Component	File	Purpose
ConsolidationEngine	`core/consolidation_engine.py`	Orchestrates lifecycle decisions and manages memory states
Consolidator	`encoding/consolidator.py`	Performs the actual memory merging and deduplication
ConsolidationWorker	`learning/consolidation_worker.py`	Background worker for async consolidation tasks
QuantizedStore	`storage/quantized_store.py`	Storage layer with compression and quantization
ContextCache	`core/context_cache.py`	Caches frequently accessed contexts for fast recall

Source: src/superlocalmemory/core/consolidation_engine.py

Lifecycle States

Memories transition through distinct states during their lifecycle. Understanding these states helps you troubleshoot recall issues and optimize memory management.

stateDiagram-v2
    [*] --> Created: slm remember
    Created --> Active: First access
    Active --> Active: Regular access
    Active --> Consolidated: Consolidation trigger
    Active --> Archived: Extended inactivity
    Consolidated --> Active: Referenced again
    Consolidated --> Archived: Further decay
    Archived --> Active: Re-accessed
    Archived --> Purged: TTL exceeded
    Purged --> [*]: Deleted
    
    note right of Active: Hot path<br/>Full indexing
    note right of Consolidated: Compressed<br/>Quantized storage
    note right of Archived: Minimal footprint<br/>Audit preserved

State Definitions

State	Description	Storage Format	Recall Speed
Created	Newly added via `slm remember`	Full text + embeddings	Fast
Active	Recently accessed memories	Indexed, full fidelity	Fastest
Consolidated	Merged with similar memories	Quantized, compressed	Moderate
Archived	Inactive for extended period	Minimal metadata	Slower
Purged	Removed from active storage	Audit trail only	N/A

Source: src/superlocalmemory/encoding/consolidator.py
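
The state diagram above can be read as a transition table. The sketch below transcribes it into Python as an aid to reasoning about which moves are legal; the names `TRANSITIONS`, `can_transition`, and `apply_transition` are hypothetical and do not reflect how the consolidator stores state.

```python
# Allowed transitions, transcribed from the lifecycle state diagram.
TRANSITIONS = {
    "Created": {"Active"},
    "Active": {"Active", "Consolidated", "Archived"},
    "Consolidated": {"Active", "Archived"},
    "Archived": {"Active", "Purged"},
    "Purged": set(),  # terminal state
}


def can_transition(current, target):
    """Return True if the diagram allows moving from `current` to `target`."""
    return target in TRANSITIONS.get(current, set())


def apply_transition(current, target):
    """Return the new state, or raise if the transition is not in the diagram."""
    if not can_transition(current, target):
        raise ValueError(f"illegal lifecycle transition: {current} -> {target}")
    return target
```

Note that a memory can return to Active from either Consolidated or Archived, but nothing leaves Purged.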

Consolidation Process

Consolidation is the core mechanism that keeps your memory system efficient. It runs automatically based on configurable triggers.

Trigger Conditions

Consolidation is triggered when specific conditions are met:

Frequency Threshold: Memory accessed less often than frequency_threshold (default: 3 accesses per week)
Semantic Redundancy: Multiple memories whose Fisher-Rao similarity meets or exceeds similarity_threshold (default: 0.85)
Temporal Clustering: Memories created within the same session/context
Topic Signature Collision: Memories sharing similar topic signatures

Source: src/superlocalmemory/core/topic_signature.py
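
The first two trigger conditions lend themselves to a small worked example. The sketch below checks a memory against the default frequency and similarity thresholds listed under Lifecycle Configuration Options; the `MemoryStats` class and `consolidation_reasons` function are hypothetical, and the similarity value is assumed to be precomputed rather than derived here.

```python
from dataclasses import dataclass


@dataclass
class MemoryStats:
    fact_id: str
    accesses_last_7_days: int
    # Precomputed similarity to the closest other memory (assumed input).
    max_similarity_to_other: float


def consolidation_reasons(stats, frequency_threshold=3, similarity_threshold=0.85):
    """Return the trigger conditions this memory satisfies (empty list if none)."""
    reasons = []
    if stats.accesses_last_7_days < frequency_threshold:
        reasons.append("low_frequency")
    if stats.max_similarity_to_other >= similarity_threshold:
        reasons.append("semantic_redundancy")
    return reasons
```

Temporal clustering and topic signature collisions would need session and signature data, so they are left out of this sketch.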

Consolidation Workflow

sequenceDiagram
    participant U as User/Agent
    participant DA as Daemon
    participant CE as ConsolidationEngine
    participant CW as ConsolidationWorker
    participant QS as QuantizedStore
    participant DB as memory.db
    
    U->>DA: slm remember "fact"
    DA->>DB: Store memory
    Note over CE: Periodic check<br/>(configurable interval)
    CE->>DB: Query usage patterns
    CE->>CE: Evaluate consolidation candidates
    CE->>CW: Queue consolidation task
    CW->>QS: Read candidate memories
    QS-->>CW: Decompressed data
    CW->>CW: Merge & deduplicate
    CW->>QS: Write consolidated memory
    QS->>DB: Update storage
    Note over DB: Original audit trail<br/>preserved

Consolidation Worker

The ConsolidationWorker runs as a background task, processing consolidation in batches to avoid blocking the main application.

# From learning/consolidation_worker.py - Batch processing pattern
def process_batch(self, candidates: list[str]) -> ConsolidationResult:
    """
    Process a batch of memories for consolidation.
    Returns result with merged facts and storage savings.
    """

Key behaviors:

Processes memories in configurable batch sizes
Runs during idle periods to minimize performance impact
Maintains full audit trail of consolidation operations
Supports rollback if consolidation fails

Source: src/superlocalmemory/learning/consolidation_worker.py
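
Batch processing of this kind usually starts by slicing the candidate list into fixed-size chunks. The helper below is a generic sketch of that step, not code from consolidation_worker.py; `iter_batches` is a hypothetical name and the default of 50 mirrors the documented `batch_size`.

```python
from typing import Iterator, Sequence


def iter_batches(candidates: Sequence[str], batch_size: int = 50) -> Iterator[list[str]]:
    """Yield consecutive slices of `candidates`, each at most `batch_size` long."""
    if batch_size <= 0:
        raise ValueError("batch_size must be positive")
    for start in range(0, len(candidates), batch_size):
        yield list(candidates[start:start + batch_size])
```

A worker could then call something like `process_batch` once per yielded slice, which keeps each unit of work small enough to interleave with foreground requests.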

Context Cache Integration

The Memory Lifecycle system integrates with the Context Cache to provide instant recall for active memories.

Cache Entry Structure

# From core/context_cache.py (field layout; the @dataclass decorator is an assumption)
from dataclasses import dataclass


@dataclass
class CacheEntry:
    session_id: str            # Which session created this cache
    topic_sig: str             # Topic signature for the context
    content: str               # Cached context content
    fact_ids: tuple[str, ...]  # Memory IDs included in this cache
    provenance: str            # How this was computed
    computed_at: int           # Unix timestamp

Source: src/superlocalmemory/core/context_cache.py

Prewarm Mechanism

The prewarm system proactively caches contexts before they're needed:

After slm remember — caches the new memory in context
After successful recall — caches retrieved facts
During session start — warms cache based on topic signatures

graph LR
    A[slm remember] --> B[Compute topic_sig]
    B --> C[Upsert cache entry]
    C --> D[Session warm start]
    
    E[slm recall] --> F[Retrieve memories]
    F --> G[Update cache]
    G --> D

Source: src/superlocalmemory/server/routes/prewarm.py

Cache Management

Clearing Cache DBs

The slm clear-cache command removes regenerable cache databases while preserving user memories:

# Remove only cache databases
slm clear-cache

# Output:
# Removed cache DBs:
#   - active_brain_cache.db
#   - context_cache.db
#   - entity_trigram_cache.db
# memory.db / learning.db preserved (user memories are safe)

Cache databases removed:

Database	Purpose	Regenerated?
`active_brain_cache.db`	Active memory indexing	Yes
`context_cache.db`	Context prewarm data	Yes
`entity_trigram_cache.db`	Lexical search index	Yes

Protected databases (never removed):

memory.db — User memories
learning.db — User preferences and feedback
audit.db — Compliance audit trail
audit_chain.db — Immutable audit chain

Source: src/superlocalmemory/cli/escape_hatch.py
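
The split between regenerable and protected databases can be expressed as an explicit allowlist. The sketch below deletes only files named in the cache list and never touches anything else; it is an illustration of the idea, not the code in escape_hatch.py, and `clear_cache_dbs` is a hypothetical name.

```python
from pathlib import Path

CACHE_DBS = ("active_brain_cache.db", "context_cache.db", "entity_trigram_cache.db")
PROTECTED_DBS = ("memory.db", "learning.db", "audit.db", "audit_chain.db")


def clear_cache_dbs(data_dir):
    """Delete only the regenerable cache databases; return the names removed."""
    removed = []
    for name in CACHE_DBS:
        if name in PROTECTED_DBS:
            continue  # defensive: a protected file must never be deleted
        db_file = Path(data_dir) / name
        if db_file.is_file():
            db_file.unlink()
            removed.append(name)
    return removed
```

Working from an allowlist rather than a pattern such as `*_cache.db` means a new protected database cannot be removed by accident.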

Quantized Storage

The QuantizedStore provides storage optimization for consolidated memories, reducing disk usage while maintaining retrieval quality.

Storage Format

Memory Type	Format	Compression
Active	Full text + vectors	None
Consolidated	Semantic tokens only	60-80% size reduction
Archived	Metadata only	90%+ size reduction

Source: src/superlocalmemory/storage/quantized_store.py

Retrieval Behavior

When a consolidated memory is accessed:

Decompress from quantized format
Reconstruct full semantic representation
Update access statistics (may move back to Active)
Return to requestor

Configuration

Lifecycle Configuration Options

Setting	Default	Description
`consolidation_interval`	24 hours	How often consolidation runs
`batch_size`	50	Memories processed per batch
`frequency_threshold`	3 accesses/week	Below this triggers consolidation
`similarity_threshold`	0.85	Fisher-Rao similarity for merging
`archive_after_days`	90	Days inactive before archival
`purge_after_days`	365	Days archived before purge
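
To show how the two day-based settings interact, the sketch below decides whether a memory should be archived or purged from its timestamps. The function name `next_state` and its parameters are hypothetical; only the default values of 90 and 365 days come from the table.

```python
from datetime import datetime, timedelta


def next_state(state, last_accessed, archived_at, now,
               archive_after_days=90, purge_after_days=365):
    """Return the lifecycle state a memory should be in at time `now`."""
    if state == "Archived" and archived_at is not None:
        # purge_after_days counts from the moment of archival
        if now - archived_at >= timedelta(days=purge_after_days):
            return "Purged"
        return state
    if state in ("Active", "Consolidated"):
        # archive_after_days counts from the last access
        if now - last_accessed >= timedelta(days=archive_after_days):
            return "Archived"
    return state
```

With the defaults, a memory that is never touched again is archived after 90 days of inactivity and purged 365 days after that.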

Cache Configuration

Setting	Default	Description
`cache_ttl_seconds`	3600	Context cache entry TTL
`max_cache_entries`	1000	Maximum cached contexts
`prewarm_on_remember`	true	Auto-cache after slm remember
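
The interplay of `cache_ttl_seconds` and `max_cache_entries` can be sketched with a small in-memory cache. This is a generic TTL cache with oldest-first eviction, written for illustration; the class name `TTLContextCache` is hypothetical and the real ContextCache is backed by a database.

```python
import time
from collections import OrderedDict


class TTLContextCache:
    """In-memory cache with a per-entry TTL and a cap on the number of entries."""

    def __init__(self, ttl_seconds=3600, max_entries=1000, clock=time.time):
        self._ttl = ttl_seconds
        self._max = max_entries
        self._clock = clock
        self._entries = OrderedDict()  # key -> (stored_at, value)

    def upsert(self, key, value):
        self._entries.pop(key, None)  # re-inserting moves the key to the end
        self._entries[key] = (self._clock(), value)
        while len(self._entries) > self._max:
            self._entries.popitem(last=False)  # evict the oldest-written entry

    def get(self, key):
        item = self._entries.get(key)
        if item is None:
            return None
        stored_at, value = item
        if self._clock() - stored_at > self._ttl:
            del self._entries[key]  # expired: drop it and report a miss
            return None
        return value
```

Injecting the clock keeps the expiry logic testable without sleeping.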

Troubleshooting

Common Issues

#### Issue: Slow Recall With Many Memories

Symptoms: slm recall takes longer than expected

Possible Causes:

Consolidation not running: check the daemon logs
Stale cache: after checking the daemon, run slm clear-cache so the cache is rebuilt on next access
Too many active memories: consider raising frequency_threshold so that more memories qualify for consolidation

Resolution:

# Check daemon status
slm serve status

# Verify consolidation is running
# Look for "consolidation_worker" in logs

# Force cache rebuild
slm clear-cache  # Removes only cache DBs
# Daemon will rebuild on next access

#### Issue: Memories Not Consolidating

Symptoms: Memory count keeps growing, consolidation never reduces it

Possible Causes:

Memories are too diverse (low similarity)
Access frequency above threshold
Consolidation worker disabled

Resolution: Verify settings in ~/.superlocalmemory/config.json

#### Issue: Linux/Docker Memory Lifecycle Issues

Symptoms: Reported in Issue #26 — consolidation and trace issues on Linux/Docker

Known Workaround: Ensure the daemon has write access to the data directory and that the user running the container matches the file ownership of ~/.superlocalmemory/.

# Fix ownership on Linux/Docker
chown -R $(id -u):$(id -g) ~/.superlocalmemory

The Memory Lifecycle system works closely with these related systems:

Feature	Integration Point	Documentation
Behavioral Learning	Learns from consolidation outcomes to improve future decisions	Behavioral Learning
Trust System	Lifecycle decisions influenced by trust scores	Trust System
Profile Isolation	Lifecycle respects profile boundaries	Profile Management
Audit Trail	All lifecycle changes logged for compliance	Enterprise Compliance

Multi-Machine Mesh

Related topics: CLI Reference

SuperLocalMemory Multi-Machine Mesh enables distributed memory synchronization across multiple machines on a trusted network. This feature extends the local-first memory architecture to a fleet of machines, allowing AI agents running on different hosts to share and access consolidated memory.

Community Note: This feature addresses a highly requested capability from users running the Qualixar stack across home lab environments. See GitHub Issue #23 for the original feature request from users running SLM on 7-machine WireGuard meshes.

Overview

The mesh architecture consists of interconnected SuperLocalMemory instances that communicate to maintain a synchronized view of shared memories. Each machine in the mesh operates as an independent memory node while selectively sharing designated memories with peer nodes.

Component	Role
Mesh Broker	Coordinates communication between mesh nodes
Remote Sync	Handles memory synchronization protocol
Mesh MCP Tools	Exposes mesh operations via MCP interface
Unified Daemon	Manages daemon lifecycle and mesh bindings

Architecture

graph TD
    A[Machine A - SLM Node] -->|Mesh Broker| B[WireGuard/VPN Network]
    C[Machine B - SLM Node] -->|Mesh Broker| B
    D[Machine C - SLM Node] -->|Mesh Broker| B
    
    B -->|Sync Protocol| A
    B -->|Sync Protocol| C
    B -->|Sync Protocol| D
    
    A -->|Local Memory| A1[(Local SQLite)]
    C -->|Local Memory| C1[(Local SQLite)]
    D -->|Local Memory| D1[(Local SQLite)]
    
    A1 -->|Remote Sync| Shared[(Shared Memory Pool)]
    C1 -->|Remote Sync| Shared
    D1 -->|Remote Sync| Shared

Design Principles

The mesh system is built on three core principles derived from the local-first architecture:

Selective Sharing — Only explicitly marked memories propagate across nodes
Conflict Resolution — Last-write-wins with provenance tracking
Trust Boundaries — Mesh operates within authenticated network boundaries (e.g., WireGuard VPN)
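
Last-write-wins with provenance can be sketched in a few lines. The example below is a generic illustration of the technique under assumed field names (`updated_at`, `origin_node`); `MemoryVersion` and `resolve_last_write_wins` are hypothetical and are not taken from the mesh source.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class MemoryVersion:
    fact_id: str
    content: str
    updated_at: float  # Unix timestamp of the write
    origin_node: str   # provenance: the node that produced this version


def resolve_last_write_wins(local, remote):
    """Return the newer version; ties break on origin_node for determinism."""
    if local.fact_id != remote.fact_id:
        raise ValueError("versions refer to different memories")
    return max(local, remote, key=lambda v: (v.updated_at, v.origin_node))
```

Breaking timestamp ties on the node name means every node picks the same winner regardless of the order in which it sees the two versions. Last-write-wins depends on reasonably synchronized clocks across the fleet.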

Configuration

Environment Variables

Variable	Default	Description
`SLM_HOST`	`127.0.0.1`	Bind address for SLM daemon and mesh broker
`SLM_MESH_PORT`	`8766`	Port for mesh broker communication
`SLM_MESH_TOKEN`	(generated)	Authentication token for mesh nodes
`SLM_DATA_DIR`	`~/.superlocalmemory`	Base directory for memory storage

Known Limitation: As of the current release, SLM_DATA_DIR is documented but may not be fully implemented in all components. See GitHub Issue #10 for tracking.

Daemon Binding

The unified daemon binds to a configurable host address to enable mesh connectivity:

# From unified_daemon.py
# The daemon can be configured to bind to non-localhost addresses
# enabling cross-machine communication on trusted networks

Source: src/superlocalmemory/server/unified_daemon.py

Mesh Components

Mesh Broker

The mesh broker (broker.py) serves as the central coordination point for mesh operations:

# Conceptual structure based on available source references
class MeshBroker:
    def __init__(self, host: str, port: int, token: str):
        ...
    
    def register_node(self, node_id: str, endpoint: str) -> bool:
        """Register a new node in the mesh."""
        ...
    
    def broadcast(self, message: MeshMessage) -> None:
        """Broadcast a message to all registered nodes."""
        ...

Source: src/superlocalmemory/mesh/broker.py

Remote Synchronization

The remote sync module (remote_sync.py) implements the synchronization protocol between nodes:

Method	Purpose
`sync_to_peers()`	Push local changes to connected peers
`pull_from_peers()`	Pull remote changes into local store
`resolve_conflicts()`	Handle concurrent modifications
`get_sync_status()`	Return current synchronization state

Source: src/superlocalmemory/mesh/remote_sync.py

MCP Mesh Tools

Mesh operations are exposed through the MCP interface for AI tool integration:

# MCP tools available for mesh operations
tools = [
    "mesh_list_nodes",      # List connected mesh peers
    "mesh_sync_memory",     # Trigger synchronization
    "mesh_share_memory",    # Share a specific memory across mesh
    "mesh_get_status"       # Get mesh connectivity status
]

Source: src/superlocalmemory/mcp/tools_mesh.py

Setup Procedures

Prerequisites

SuperLocalMemory installed on all machines (~/.superlocalmemory/)
Network connectivity between machines (WireGuard VPN recommended)
Unique machine identifiers for each node
Mesh authentication token shared across the fleet

Basic Setup

# 1. Generate mesh token on primary machine
slm mesh token generate

# 2. Join secondary machines to mesh
slm mesh join <primary-host>:<port> --token <generated-token>

# 3. Verify connectivity
slm mesh status

Network Configuration

For mesh communication across machines, configure the bind address:

# Set SLM_HOST to allow external connections
export SLM_HOST=0.0.0.0

# Or configure in slm config
slm config set mesh.host 0.0.0.0
slm config set mesh.port 8766

Security Consideration: Binding to 0.0.0.0 exposes the SLM daemon to all network interfaces. Only use in trusted environments such as a private VPN.
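
A quick way to tell whether a bind address is reachable from other machines is to check whether it is a loopback address. The helper below is a small sketch using the standard `ipaddress` module; `is_externally_reachable` is a hypothetical name, and the check says nothing about firewalls or VPN routing.

```python
import ipaddress


def is_externally_reachable(bind_host):
    """Return True if binding to this address accepts non-local connections."""
    if bind_host == "localhost":
        return False
    address = ipaddress.ip_address(bind_host)  # raises ValueError for other hostnames
    return not address.is_loopback
```

Both `0.0.0.0` and a VPN interface address such as `10.8.0.2` return True, while `127.0.0.1` and `::1` return False.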

Multi-Scope Memory Integration

The mesh system integrates with the planned multi-scope memory architecture (see RFC #20):

graph LR
    subgraph "Personal Scope"
        P1[Machine A Memory]
        P2[Machine B Memory]
    end
    
    subgraph "Shared Scope"
        S1[Shared Knowledge Base]
    end
    
    P1 -->|Personal| S1
    P2 -->|Personal| S1

Scope	Visibility	Sync Behavior
Personal	Local machine only	Never sync
Shared	All mesh nodes	Automatic sync
Team	Selected nodes	Selective sync
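
The scope table implies a simple decision rule for each memory and target node. Since multi-scope memory is still planned, the sketch below is purely illustrative: the function name `should_sync`, the lowercase scope strings, and the `team_nodes` parameter are all assumptions.

```python
def should_sync(scope, target_node, team_nodes=frozenset()):
    """Decide whether a memory with the given scope is sent to `target_node`."""
    if scope == "personal":
        return False  # never leaves the local machine
    if scope == "shared":
        return True   # propagates to every mesh node
    if scope == "team":
        return target_node in team_nodes  # only to explicitly selected nodes
    raise ValueError(f"unknown scope: {scope!r}")
```

Raising on an unknown scope, rather than defaulting to sync, keeps a typo from leaking a memory across the mesh.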

Troubleshooting

Connection Issues

Symptom	Possible Cause	Resolution
`Connection refused` on port 8766	Mesh broker not running	Run `slm daemon start`
`Authentication failed`	Token mismatch	Verify token on all nodes
`Timeout waiting for peers`	Network/firewall issue	Check VPN connectivity

Docker/Linux Environments

Users running SLM in Docker containers report challenges with network binding (Issue #26):

# For Docker deployments, publish the daemon and mesh ports and bind to all
# interfaces inside the container; published ports are unreachable if the
# daemon only listens on 127.0.0.1
docker run -e SLM_HOST=0.0.0.0 -p 8765:8765 -p 8766:8766 superlocalmemory

Performance Considerations

Factor	Impact	Recommendation
Network latency	Sync delay	Use low-latency VPN
Memory size	Sync duration	Batch large memories
Node count	Coordination overhead	Limit to trusted fleet

CLI Commands

# Mesh management commands
slm mesh status          # Show mesh connectivity status
slm mesh list            # List connected peer nodes
slm mesh sync            # Trigger immediate synchronization
slm mesh share <id>      # Share a memory to mesh peers
slm mesh revoke <id>     # Revoke mesh sharing for a memory
slm mesh leave           # Disconnect from mesh

Security Model

The mesh system implements trust boundaries based on network topology:

Node Authentication — Mesh tokens verify node identity
Profile Isolation — Memories from one profile cannot leak to another
Scope Enforcement — Only shared-scope memories traverse the mesh

For production deployments, combine with:

WireGuard VPN for encrypted transport
Firewall rules restricting mesh ports
Regular token rotation via slm rotate-token
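
Shared-token authentication generally needs two pieces: a token with enough entropy and a comparison that does not leak timing information. The sketch below shows both with the standard library; the function names are hypothetical and this is not the mesh broker's actual handshake.

```python
import hmac
import secrets


def generate_mesh_token(num_bytes=32):
    """Return a URL-safe random token built from `num_bytes` of entropy."""
    return secrets.token_urlsafe(num_bytes)


def token_matches(expected, presented):
    """Compare two tokens in constant time to avoid timing side channels."""
    return hmac.compare_digest(expected.encode("utf-8"), presented.encode("utf-8"))
```

Rotating a token then amounts to generating a new one and distributing it to every node before the old one is rejected.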

Doramagic Pitfall Log

Source-linked risks stay visible on the manual page so the preview does not read like a recommendation.

Found 11 structured pitfall item(s), including 0 high/blocking item(s). Top priority: Installation risk - Installation risk requires verification.

1. Installation risk: Installation risk requires verification

Severity: medium
Finding: Project evidence flags an installation risk. Review the linked source before relying on this workflow.
User impact: May increase setup, validation, or first-run risk for the user.
Recommended check: Reproduce the official install and quickstart path in an isolated environment.
Evidence: community_evidence:github | cevd_8628e403827148fcb5f3b537c1af2263 | https://github.com/qualixar/superlocalmemory/issues/26

2. Installation risk: Installation risk requires verification

Severity: medium
Finding: Project evidence flags an installation risk. Review the linked source before relying on this workflow.
User impact: May increase setup, validation, or first-run risk for the user.
Recommended check: Reproduce the official install and quickstart path in an isolated environment.
Evidence: community_evidence:github | cevd_c74d27ed1bf2462585e76845639adfd5 | https://github.com/qualixar/superlocalmemory/issues/23

3. Installation risk: Installation risk requires verification

Severity: medium
Finding: Project evidence flags an installation risk. Review the linked source before relying on this workflow.
User impact: May increase setup, validation, or first-run risk for the user.
Recommended check: Reproduce the official install and quickstart path in an isolated environment.
Evidence: community_evidence:github | cevd_ba05e614f5a8499da175aa7ba09ac343 | https://github.com/qualixar/superlocalmemory/issues/20

4. Configuration risk: Configuration risk requires verification

Severity: medium
Finding: Project evidence flags a configuration risk. Review the linked source before relying on this workflow.
User impact: May increase setup, validation, or first-run risk for the user.
Recommended check: Reproduce the official install and quickstart path in an isolated environment.
Evidence: capability.host_targets | github_repo:1150546081 | https://github.com/qualixar/superlocalmemory

5. Capability evidence risk: Capability evidence risk requires verification

Severity: medium
Finding: README/documentation is current enough for a first validation pass.
User impact: May increase setup, validation, or first-run risk for the user.
Recommended check: Reproduce the official install and quickstart path in an isolated environment.
Evidence: capability.assumptions | github_repo:1150546081 | https://github.com/qualixar/superlocalmemory

6. Maintenance risk: Maintenance risk requires verification

Severity: medium
Finding: Project evidence flags a maintenance risk. Review the linked source before relying on this workflow.
User impact: May increase setup, validation, or first-run risk for the user.
Recommended check: Reproduce the official install and quickstart path in an isolated environment.
Evidence: evidence.maintainer_signals | github_repo:1150546081 | https://github.com/qualixar/superlocalmemory

7. Security or permission risk: Security or permission risk requires verification

Severity: medium
Finding: no_demo
User impact: May increase setup, validation, or first-run risk for the user.
Recommended check: Reproduce the official install and quickstart path in an isolated environment.
Evidence: downstream_validation.risk_items | github_repo:1150546081 | https://github.com/qualixar/superlocalmemory

8. Security or permission risk: Security or permission risk requires verification

Severity: medium
Finding: no_demo
User impact: May increase setup, validation, or first-run risk for the user.
Recommended check: Reproduce the official install and quickstart path in an isolated environment.
Evidence: risks.scoring_risks | github_repo:1150546081 | https://github.com/qualixar/superlocalmemory

9. Security or permission risk: Security or permission risk requires verification

Severity: medium
Finding: Project evidence flags a security or permission risk. Review the linked source before relying on this workflow.
User impact: May increase setup, validation, or first-run risk for the user.
Recommended check: Reproduce the official install and quickstart path in an isolated environment.
Evidence: community_evidence:github | cevd_be8654b4ef434c37bedf0a453e65f5d6 | https://github.com/qualixar/superlocalmemory/issues/7

10. Maintenance risk: Maintenance risk requires verification

Severity: low
Finding: issue_or_pr_quality=unknown.
User impact: May increase setup, validation, or first-run risk for the user.
Recommended check: Reproduce the official install and quickstart path in an isolated environment.
Evidence: evidence.maintainer_signals | github_repo:1150546081 | https://github.com/qualixar/superlocalmemory

11. Maintenance risk: Maintenance risk requires verification

Severity: low
Finding: release_recency=unknown.
User impact: May increase setup, validation, or first-run risk for the user.
Recommended check: Reproduce the official install and quickstart path in an isolated environment.
Evidence: evidence.maintainer_signals | github_repo:1150546081 | https://github.com/qualixar/superlocalmemory

Source: Doramagic discovery, validation, and Project Pack records
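
The recommended check repeated in the items above ("reproduce the official install and quickstart path in an isolated environment") can be sketched in Python as follows. This is a minimal illustration, not part of the project or of Doramagic tooling: the helper names `isolated_install_plan` and `run_isolated_install` are hypothetical, the package name comes from the `pip install superlocalmemory` command on the Installation page, and redirecting `HOME`/`USERPROFILE` only helps under the assumption that the tool resolves `~/.superlocalmemory` through the user's home directory.

```python
import os
import subprocess
import venv
from pathlib import Path


def isolated_install_plan(workdir: Path, package: str = "superlocalmemory") -> dict:
    """Describe an isolated install: a fresh venv plus a throwaway home directory.

    Hypothetical helper for illustration; nothing is created or executed here.
    """
    env_dir = workdir / "venv"
    # venv places executables in "Scripts" on Windows and "bin" elsewhere.
    bin_dir = env_dir / ("Scripts" if os.name == "nt" else "bin")
    python = bin_dir / ("python.exe" if os.name == "nt" else "python")
    return {
        "env_dir": env_dir,
        "home": workdir / "home",
        "install_cmd": [str(python), "-m", "pip", "install", package],
    }


def run_isolated_install(workdir: Path) -> int:
    """Create the venv and run pip inside it; returns pip's exit code."""
    plan = isolated_install_plan(workdir)
    plan["home"].mkdir(parents=True, exist_ok=True)
    venv.create(plan["env_dir"], with_pip=True)
    # Assumption: pointing HOME/USERPROFILE at a scratch directory keeps any
    # files written under "~" away from the real user profile.
    env = dict(os.environ, HOME=str(plan["home"]), USERPROFILE=str(plan["home"]))
    return subprocess.run(plan["install_cmd"], env=env).returncode
```

Calling `run_isolated_install(Path(tempfile.mkdtemp()))` (after `import tempfile`) would perform the install in a scratch directory that can be deleted afterwards. A container or virtual machine gives stronger isolation if the linked issues concern operating-system-level behavior.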

Community Discussion Evidence

These external discussion links are review inputs, not standalone proof that the project is production-ready.

Sources: 12 project-level external discussion links are exposed on this manual page.

Use: Review before install. Open the linked issues or discussions before treating the pack as ready for your environment.

Doramagic exposes project-level community discussion separately from official documentation. Review these links before using superlocalmemory with real data or production workflows.

Cognitive consolidation and trace issues on Linux/Docker setup - github / github_issue
GitHub issue body — qualixar SLM_HOST feature request - github / github_issue
RFC: Multi-Scope Memory — personal/global/shared scopes with scope-aware - github / github_issue
Feature Request: Support configurable local embedding endpoints (e.g., O - github / github_issue
slm remember xxx wait for a long time but seems no response - github / github_issue
SLM_DATA_DIR - github / github_issue
api_key silently dropped for Mode B LLM config - github / github_issue
Can't clone on Windows- repo contains bin - github / github_issue
v2.8.0 — Memory Lifecycle, Behavioral Learning, Enterprise Compliance - github / github_release
SuperLocalMemory v2.7.4 - github / github_release
v2.7.0 — Your AI Learns You - github / github_release
v2.6.0 — Security Hardening & Performance - github / github_release

Source: Project Pack community evidence and pitfall evidence