Doramagic Project Pack · Human Manual

superlocalmemory

SuperLocalMemory (SLM) solves a fundamental problem in AI agent development: persistent, context-aware memory that survives across sessions without relying on cloud services. Unlike cloud-...

Home

Related topics: System Architecture, Modes Explained (A/B/C), Installation


SuperLocalMemory is an information-geometric agent memory system designed for AI assistants, providing mathematical guarantees for retrieval accuracy, zero-LLM inference mode, and EU AI Act compliance. The system stores, retrieves, and manages memories locally on your machine, ensuring complete data sovereignty while enabling seamless integration with Claude, Cursor, Windsurf, and 17+ AI tools.

Source: package.json

Source: https://github.com/qualixar/superlocalmemory / Human Manual

Installation

Related topics: Home, CLI Reference


This page covers the complete installation process for SuperLocalMemory V3, including prerequisites, installation methods, configuration, and troubleshooting. SuperLocalMemory is a local-first AI memory system that stores all data in ~/.superlocalmemory/memory.db by default, ensuring complete data sovereignty with zero cloud dependencies.

Overview

SuperLocalMemory supports four installation methods:

| Method | Platform | Command |
| --- | --- | --- |
| NPM (Recommended) | macOS, Linux, Windows | `npm install -g superlocalmemory` |
| Pip | macOS, Linux, Windows | `pip install superlocalmemory` |
| Shell Script | macOS, Linux | `curl -fsSL https://raw.githubusercontent.com/qualixar/superlocalmemory/main/install.sh \| bash` |
| PowerShell | Windows | `irm https://raw.githubusercontent.com/qualixar/superlocalmemory/main/scripts/install.ps1 \| iex` |

Source: package.json:1-50

Prerequisites

System Requirements

| Component | Minimum | Recommended |
| --- | --- | --- |
| Python | 3.9+ | 3.12+ |
| Node.js | 18.0.0 | 20.x LTS |
| npm | 9.0.0 | 10.x |
| Disk Space | 500 MB | 2 GB |
| RAM | 4 GB | 8 GB |

Required Dependencies

  • Ollama (optional, for Mode A): Download from ollama.ai for fully local LLM inference
  • SQLite: Bundled with Python; no separate installation needed
  • Embedding Model: Automatically downloaded on first run (nomic-ai/nomic-embed-text-v1.5)
  • Reranker Model: Automatically downloaded on first run (cross-encoder/ms-marco-MiniLM-L-12-v2)

Source: src/superlocalmemory/cli/setup_wizard.py:24-26

Installation Methods

Method 1: NPM Installation (Recommended)

The NPM package provides a cross-platform CLI with automatic post-install configuration.

# Install globally
npm install -g superlocalmemory

# Verify installation
slm status

What happens during installation:

  1. postinstall.js script runs after npm installation completes
  2. Detects existing V2 installation and prompts for migration
  3. Triggers setup wizard for new installations
  4. Downloads required embedding and reranker models

Source: scripts/postinstall.js:1-50

Method 2: Pip Installation

For Python-native environments:

# Install from PyPI
pip install superlocalmemory

# Or install from source
pip install git+https://github.com/qualixar/superlocalmemory.git

First-run behavior:

The setup wizard runs automatically on the first slm command when the .setup-complete marker is missing from the data directory.

Source: src/superlocalmemory/cli/setup_wizard.py:50-70
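The first-run check described above can be sketched as follows. This is a minimal illustration rather than the wizard's actual code: `needs_setup` and `mark_setup_complete` are hypothetical helper names, and only the `.setup-complete` marker name comes from this page.

```python
from pathlib import Path

# Marker file name taken from this page; the helpers below are hypothetical.
SETUP_MARKER = ".setup-complete"


def needs_setup(data_dir: Path) -> bool:
    """Return True when the setup marker is absent from data_dir."""
    return not (data_dir / SETUP_MARKER).exists()


def mark_setup_complete(data_dir: Path) -> None:
    """Create the data directory if needed and write an empty marker file."""
    data_dir.mkdir(parents=True, exist_ok=True)
    (data_dir / SETUP_MARKER).touch()
```

Writing the marker only after every setup step has succeeded means an interrupted setup is retried on the next command.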

Method 3: Shell Script (Linux/macOS)

curl -fsSL https://raw.githubusercontent.com/qualixar/superlocalmemory/main/install.sh | bash

Method 4: PowerShell (Windows)

irm https://raw.githubusercontent.com/qualixar/superlocalmemory/main/scripts/install.ps1 | iex

Note: Windows users previously encountered issues cloning repositories with special characters in filenames. This was fixed in v2.8.2. See Issue #7.

Installation Flow

flowchart TD
    A[User runs install command] --> B{Installation method?}
    B -->|NPM| C[postinstall.js executes]
    B -->|Pip| D[pip install completes]
    B -->|Shell| E[install.sh executes]
    
    C --> F{V2 installation detected?}
    D --> G{First run?}
    E --> H{First run?}
    
    F -->|Yes| I[Run V2 Migrator]
    F -->|No| J[Check .setup-complete]
    G -->|.setup-complete missing| K[Run Setup Wizard]
    H -->|.setup-complete missing| K
    G -->|.setup-complete exists| L[Ready to use]
    H -->|.setup-complete exists| L
    
    I --> M[Start Setup Wizard]
    J -->|Missing| K
    J -->|Exists| L
    
    K --> N[Download embedding model]
    N --> O[Download reranker model]
    O --> P[Configure mode: A or B]
    P --> Q[Create .setup-complete marker]
    Q --> L

Setup Wizard

The interactive setup wizard runs automatically on first use or via slm setup. It performs the following steps:

Step 1: Environment Detection

The wizard detects the runtime environment:

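import os
import sys


def is_interactive() -> bool:
    """True if running in a terminal (not CI, not piped, not MCP)."""
    # CI systems conventionally export CI; SLM_NON_INTERACTIVE is an explicit opt-out.
    if os.environ.get("CI"):
        return False
    if os.environ.get("SLM_NON_INTERACTIVE"):
        return False
    # Piped or redirected streams are not TTYs, so prompts are skipped.
    return sys.stdin.isatty() and sys.stdout.isatty()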

Source: src/superlocalmemory/cli/setup_wizard.py:36-43

Step 2: Model Download

Two models are downloaded automatically:

| Model | Purpose | Size |
| --- | --- | --- |
| nomic-ai/nomic-embed-text-v1.5 | Text embeddings for semantic search | ~275 MB |
| cross-encoder/ms-marco-MiniLM-L-12-v2 | Reranking for improved recall | ~90 MB |

Source: src/superlocalmemory/cli/setup_wizard.py:24-26

Step 3: Mode Configuration

Choose between two operating modes:

| Mode | Description | LLM Required |
| --- | --- | --- |
| Mode A | Fully local with Ollama | Ollama running locally |
| Mode B | OpenAI-compatible API | API key or local proxy |

The setup wizard writes the configuration to ~/.superlocalmemory/config.json.

Source: src/superlocalmemory/core/config.py:1-100

Data Directory Configuration

Default Location

By default, all data is stored in ~/.superlocalmemory/:

~/.superlocalmemory/
├── memory.db          # Main SQLite database
├── config.json        # Configuration file
├── .setup-complete    # Setup marker
├── models/            # Cached embedding models
└── logs/              # Application logs

Source: src/superlocalmemory/cli/setup_wizard.py:20-21

Custom Data Directory

You can customize the data directory using the SL_MEMORY_PATH environment variable:

# Linux/macOS
export SL_MEMORY_PATH=/mnt/data/slm

# Windows PowerShell
$env:SL_MEMORY_PATH="D:\data\slm"

# Run slm commands
slm remember "My custom data location"

Note: Issue #10 requested an SLM_DATA_DIR environment variable, but the implementation uses SL_MEMORY_PATH instead. Setting it lets you store memory data on a custom path, including external drives or network mounts.
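As a rough sketch of the lookup this implies, the data directory can be resolved from the environment with a fallback to the default location. The function names are hypothetical and the real resolution logic may differ; only `SL_MEMORY_PATH`, `~/.superlocalmemory/`, and `memory.db` come from this page.

```python
import os
from pathlib import Path

DEFAULT_DATA_DIR = Path.home() / ".superlocalmemory"


def resolve_data_dir(env=None) -> Path:
    """Return the data directory, honoring SL_MEMORY_PATH when it is set."""
    env = os.environ if env is None else env
    override = env.get("SL_MEMORY_PATH", "").strip()
    return Path(override).expanduser() if override else DEFAULT_DATA_DIR


def database_path(env=None) -> Path:
    """memory.db lives directly inside the data directory."""
    return resolve_data_dir(env) / "memory.db"
```

Passing a plain dict as `env` keeps the helper easy to test without touching the real environment.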

Upgrading from V2

Users upgrading from V2 are detected automatically:

from superlocalmemory.storage.v2_migrator import V2Migrator

migrator = V2Migrator()

if migrator.detect_v2() and not migrator.is_already_migrated():
    # Run migration logic
    migrator.migrate()

Source: src/superlocalmemory/cli/post_install.py:30-45

Migration Process

  1. Detect V2 installation at ~/.superlocalmemory/
  2. Back up existing database
  3. Run schema migrations for V3
  4. Copy user profiles and settings
  5. Mark migration complete with version marker

Source: src/superlocalmemory/server/unified_daemon.py:50-80
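Step 2 (backing up the existing database) can also be done by hand before upgrading. The sketch below uses Python's standard `sqlite3` backup API; the function name and the backup file name are illustrative assumptions, not part of the migrator.

```python
import sqlite3
from pathlib import Path


def backup_database(db_path: Path, backup_path: Path) -> None:
    """Copy a SQLite database using the online backup API."""
    src = sqlite3.connect(db_path)
    dst = sqlite3.connect(backup_path)
    try:
        # Connection.backup copies page by page, so it stays consistent
        # even if another process has the source database open.
        src.backup(dst)
    finally:
        dst.close()
        src.close()
```

For example, `backup_database(Path.home() / ".superlocalmemory" / "memory.db", Path("memory.db.bak"))` would copy the default database into the current directory.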

Post-Installation Verification

After installation, verify everything is working:

# Check installation status
slm status

# View configuration
slm config

# Test memory operations
slm remember "Test memory from installation verification"
slm recall "installation verification"

Expected output from slm status:

SuperLocalMemory V3.x.x
━━━━━━━━━━━━━━━━━━━━━━━
Mode: A
Provider: ollama
Model: llama3.2
Database: ~/.superlocalmemory/memory.db
Status: Running
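If you want to check this output from a script, a small parser for the `Key: value` lines is enough. This is a sketch that assumes the layout shown above; the exact text may vary between versions, and `parse_status` is a hypothetical helper.

```python
def parse_status(output: str) -> dict:
    """Collect 'Key: value' lines from status text into a dict."""
    fields = {}
    for line in output.splitlines():
        # Split on the first colon only, so values may contain colons.
        key, sep, value = line.partition(":")
        if sep and key.strip():
            fields[key.strip()] = value.strip()
    return fields
```

Lines without a colon, such as the version banner and the rule beneath it, are ignored.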

Docker Installation

For containerized environments, see Issue #26 for known considerations:

Dockerfile Example

FROM python:3.12-slim

# Install Node.js for npm-based installation
RUN apt-get update && apt-get install -y curl
RUN curl -fsSL https://deb.nodesource.com/setup_20.x | bash -
RUN apt-get install -y nodejs

# Install SuperLocalMemory
RUN npm install -g superlocalmemory

# Set data directory
ENV SL_MEMORY_PATH=/data/slm

# Create data directory
RUN mkdir -p /data/slm

# Default command
CMD ["slm", "daemon"]

Docker Compose

version: '3.8'
services:
  superlocalmemory:
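    build: .  # build from the Dockerfile above; the stock python image does not include slm
    environment:
      - SL_MEMORY_PATH=/data/slm
    volumes:
      - slm-data:/data/slm
    command: slm daemon
    ports:
      # Reachable from the host only if the daemon listens on a non-loopback
      # address inside the container; by default it binds to 127.0.0.1.
      - "8765:8765"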

volumes:
  slm-data:

Troubleshooting

Installation Hangs

If slm remember hangs with no response, you may be running a version older than v3.3.19, which fixed this problem. Upgrade to the latest version:

npm install -g superlocalmemory@latest

See Issue #11 for details.

Model Download Failures

If model downloads fail, manually download using Ollama:

ollama pull nomic-embed-text

Permission Errors (Linux/macOS)

# Fix npm global directory permissions
mkdir -p ~/.npm-global
npm config set prefix '~/.npm-global'
export PATH=~/.npm-global/bin:$PATH

# Or use sudo (not recommended)
sudo npm install -g superlocalmemory

Windows PATH Issues

If the slm command is not recognized after installation:

  1. Find npm global bin directory: npm config get prefix
  2. Add to System PATH
  3. Restart terminal

API Key Not Working (Mode B)

If api_key is silently dropped in Mode B, check Issue #9. The workaround is to ensure api_key is properly set in config.json:

{
  "llm": {
    "provider": "openai",
    "model": "gpt-4",
    "api_key": "your-api-key",
    "api_base": "https://api.openai.com/v1"
  }
}
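A quick way to confirm that the file on disk still contains every field is to load it and list whatever is missing. The helper below is a hypothetical sketch that assumes the `llm` layout shown in the example above.

```python
import json


def missing_llm_fields(config_text: str) -> list:
    """Return the llm.* keys from the example above that are absent or empty."""
    llm = json.loads(config_text).get("llm", {})
    required = ("provider", "model", "api_key", "api_base")
    return [key for key in required if not llm.get(key)]
```

Reading `~/.superlocalmemory/config.json` and passing its text to this function returns an empty list when all four fields are present.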

Network Configuration

By default, the daemon binds to 127.0.0.1 for security. For multi-machine setups (as requested in Issue #23), consider:

  1. WireGuard mesh: Recommended for trusted networks
  2. slm-mesh: Part of the Qualixar stack for distributed memory
  3. Custom proxy: Forward ports through your own reverse proxy

Note: The SLM_HOST feature request for configurable bind addresses is tracked in Issue #23.

Quick Reference

| Command | Description |
| --- | --- |
| npm install -g superlocalmemory | Install via npm |
| slm setup | Run setup wizard |
| slm status | Check installation status |
| slm config | View/edit configuration |
| slm daemon | Start the daemon manually |
| slm restart | Restart daemon after config changes |


Migration from V2

Related topics: Home, Installation


This page documents the migration process from SuperLocalMemory V2 to V3, including how data is transferred, what changes occur during migration, and how to troubleshoot common issues.

Overview

SuperLocalMemory V3 introduces a complete architectural redesign while maintaining backward compatibility with your existing V2 data. The migration system automatically detects V2 installations, preserves all memories, and performs necessary schema transformations to ensure a seamless upgrade experience.

Why Migrate?

The V2 to V3 migration delivers significant improvements:

| Feature | V2 | V3 |
| --- | --- | --- |
| Memory Engine | SQLite-based | V3 Engine with Fisher-Rao similarity |
| Trust System | Basic | 4-channel retrieval with trust scores |
| Learning | Adaptive ranking | Behavioral learning with zero-LLM inference |
| Compliance | Basic audit | EU AI Act compliant with immutable trails |
| Multi-Agent | Flat namespace | Multi-scope memory (personal/shared/global) |
| LLM Dependency | Required | Mode A (LLM) and Mode B (zero-LLM) |

Source: package.json | Community: RFC Multi-Scope Memory #20

Migration Architecture

Component Overview

The migration system consists of three primary components:

graph TD
    A[V2 Installation<br/>~/.superlocalmemory/] --> B[V2Migrator<br/>Detection & Analysis]
    B --> C{Migration Status}
    C -->|Not Migrated| D[Run Migration]
    C -->|Already Migrated| E[Skip Migration]
    D --> F[Schema Transformation<br/>migrations.py]
    F --> G[V3 Database<br/>Preserved Data]
    E --> G
    H[Post-Install Script<br/>post_install.py] --> I[User Prompt<br/>if V2 detected]
    I --> J[Confirm Migration]
    J --> D

Source: src/superlocalmemory/cli/post_install.py:1-50

Data Preservation

During migration, the system preserves the following V2 data:

| V2 Data Type | V3 Preservation | Transformation |
| --- | --- | --- |
| Memories | ✅ Complete | Schema migration |
| Profiles | ✅ Complete | Enhanced with trust |
| Learning data | ✅ Complete | Behavioral learning format |
| Configuration | ⚠️ Partial | Recommended review |
| Chat histories | ✅ Via integrations | LlamaIndex/LangChain adapters |

Source: src/superlocalmemory/storage/v2_migrator.py

Migration Detection

Automatic Detection

The V3 installation automatically detects existing V2 installations during the post-install phase. Detection runs through the post_install.py script, which is triggered by both npm and pip installations.

sequenceDiagram
    participant User
    participant PostInstall as post_install.py
    participant Migrator as V2Migrator
    participant Daemon as unified_daemon.py

    User->>PostInstall: npm install -g superlocalmemory
    PostInstall->>Migrator: detect_v2()
    Migrator-->>PostInstall: V2 installation found
    PostInstall->>Migrator: is_already_migrated()
    Migrator-->>PostInstall: False
    PostInstall->>User: Prompt for migration
    User->>PostInstall: Confirm
    PostInstall->>Migrator: migrate()
    Migrator-->>PostInstall: Success
    PostInstall->>Daemon: Mark as migrated

Source: src/superlocalmemory/cli/post_install.py:30-60

Detection Logic

The V2Migrator class implements two key detection methods:

| Method | Purpose | Source |
| --- | --- | --- |
| detect_v2() | Checks for existence of V2 data directory | v2_migrator.py |
| is_already_migrated() | Prevents re-migration of already migrated data | v2_migrator.py |

Version Marker System

V3 uses a version marker system to track upgrades and prevent duplicate migration attempts. This marker is written only after successful migration completion.

# From unified_daemon.py - version marker logic
_want_write_marker = _prev != _slm_version
if _want_write_marker:
    if _prev is None:
        logger.info(
            "[slm] first boot on v%s — run `slm status` to see your "
            "memory overview. Changelog: "
            "https://github.com/qualixar/superlocalmemory/blob/598b2fc1ce9af40b8b58ac24d2db4827513300b0/CHANGELOG.md",
            _slm_version,
        )
    else:
        logger.info(
            "[slm] upgraded %s → %s. Data migrations run in a moment; "
            "your 18k+ atomic facts are preserved.",
            _prev, _slm_version,
        )

Source: src/superlocalmemory/server/unified_daemon.py:1-40
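The three cases in the excerpt (first boot, version change, unchanged) can be reproduced with a small file-based marker. This is a sketch under stated assumptions: the marker file location and the helper names are hypothetical, and only the comparison of the previous version against the current one follows the excerpt.

```python
from pathlib import Path
from typing import Optional


def read_marker(marker: Path) -> Optional[str]:
    """Return the previously recorded version, or None on first boot."""
    try:
        return marker.read_text(encoding="utf-8").strip() or None
    except FileNotFoundError:
        return None


def classify_boot(prev: Optional[str], current: str) -> str:
    """Mirror the excerpt's cases: first boot, version change, unchanged."""
    if prev is None:
        return "first-boot"
    return "upgrade" if prev != current else "unchanged"


def write_marker(marker: Path, current: str) -> None:
    """Record the running version; call only after migrations succeed."""
    marker.parent.mkdir(parents=True, exist_ok=True)
    marker.write_text(current + "\n", encoding="utf-8")
```

Deferring `write_marker` until migration has finished matches the rule above that the marker is written only after successful completion, so a failed run is retried on the next start.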

Migration Workflow

Step-by-Step Process

  1. Detection Phase
     • Post-install script runs V2Migrator.detect_v2()
     • Checks for V2 data directory at ~/.superlocalmemory/
  2. Confirmation Phase
     • If V2 detected and not already migrated, prompt user for confirmation
     • Display migration summary and estimated duration
  3. Schema Migration Phase
     • Run additive schema migrations via migrations.py
     • Transform V2 memories to V3 format
     • Preserve all metadata and importance scores
  4. Verification Phase
     • Verify all memories transferred correctly
     • Check profile integrity
     • Validate learning data
  5. Completion Phase
     • Set migration marker to prevent re-migration
     • Display upgrade banner with changelog link

Source: src/superlocalmemory/storage/migrations.py
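An "additive" schema migration only adds tables or columns and never drops or rewrites existing ones, which is why it is safe to re-run. The sketch below shows the idea for a single column in SQLite; the table name, column name, and helper are illustrative assumptions and are not taken from migrations.py.

```python
import sqlite3


def add_column_if_missing(conn: sqlite3.Connection, table: str,
                          column: str, decl: str) -> bool:
    """Add a column only when absent; return True if the schema changed."""
    # PRAGMA table_info yields one row per column; index 1 is the column name.
    existing = {row[1] for row in conn.execute(f"PRAGMA table_info({table})")}
    if column in existing:
        return False
    # Identifiers cannot be bound as parameters, so pass trusted names only.
    conn.execute(f"ALTER TABLE {table} ADD COLUMN {column} {decl}")
    return True
```

Because the function checks before altering, calling it on every startup is harmless: the second call finds the column and does nothing.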

Migration Data Flow

graph LR
    subgraph V2_Data["V2 Data (~/.superlocalmemory/)"]
        A2[memories.db]
        B2[profiles.json]
        C2[learning_data.json]
    end

    subgraph Migration["Migration Layer"]
        D[V2Migrator]
        E[Schema Migrations]
    end

    subgraph V3_Data["V3 Data"]
        A3[memory.db<br/>V3 Schema]
        B3[Profiles<br/>Enhanced]
        C3[Behavioral Learning]
    end

    A2 --> D
    B2 --> D
    C2 --> D
    D --> E
    E --> A3
    E --> B3
    E --> C3

Configuration After Migration

Required Configuration Review

After migration, certain V2 configuration options may require manual review:

| Config Option | V2 Behavior | V3 Behavior | Action Required |
| --- | --- | --- | --- |
| LLM_BACKBONE | Ollama only | Multiple providers | Verify if using non-Ollama |
| SLM_DATA_DIR | Not implemented | Now supported | Optional relocation |
| SLM_HOST | Hardcoded 127.0.0.1 | Configurable | Review for multi-machine setups |
| api_key | Dropped silently | Now preserved | Verify Mode B providers |

Source: Community Issue #9 | Community Issue #10 | Community Issue #23

Mode Configuration

V3 introduces dual-mode operation:

| Mode | Description | LLM Required |
| --- | --- | --- |
| Mode A | LLM-powered retrieval with Fisher-Rao similarity | Yes |
| Mode B | Zero-LLM mode using embedding similarity | No |

The setup wizard (setup_wizard.py) guides new users through mode selection. Existing V2 users maintain their configuration but should verify it after migration.

Source: src/superlocalmemory/cli/setup_wizard.py:1-30

Common Issues and Troubleshooting

Issue: Migration Fails Silently

Symptom: Post-install completes without prompting for migration.

Diagnosis:

# Check migration status
python -c "from superlocalmemory.storage.v2_migrator import V2Migrator; m = V2Migrator(); print(f'V2: {m.detect_v2()}, Migrated: {m.is_already_migrated()}')"

Resolution: If migration marker exists but data wasn't transferred, manually run:

slm migrate --force

Source: src/superlocalmemory/storage/v2_migrator.py

Issue: api_key Dropped for Mode B

Symptom: Mode B configured with OpenAI-compatible API, but LLM unavailable.

Affected versions: V2.8.0 - V3.3.x (fixed in V3.4+)

Diagnosis:

# Check LLM availability
slm status

Resolution: Reconfigure the API key after migration using:

slm config set llm.api_key YOUR_API_KEY

Source: Community Issue #9

Issue: Docker/Linux Memory Consolidation

Symptom: Memories not appearing after Docker restart or on different Linux machines.

Diagnosis: Check that data directory is properly mounted or configured for multi-machine access.

Resolution:

  1. Configure SLM_DATA_DIR environment variable
  2. Use slm-mesh for cross-machine sync
  3. Verify data persistence in Docker volume

Source: Community Issue #26

Issue: Long Wait Times on First `slm remember`

Symptom: slm remember command hangs without response.

Affected versions: Pre-V3.3.19

Resolution: Upgrade to v3.3.19 or later, which includes a fix for streaming response handling.

Source: Community Issue #11

Manual Migration

For advanced users who prefer manual control:

Backup V2 Data

# Backup before migration
cp -r ~/.superlocalmemory ~/.superlocalmemory.backup

Force Migration

# Force migration (will re-run even if already done)
python -m superlocalmemory.storage.v2_migrator --force

Skip Migration

# Start fresh with V3 (loses V2 data)
export SLM_SKIP_MIGRATION=1
slm setup

Integration Adapters

After migration, your existing integrations continue to work:

LlamaIndex

The LlamaIndex integration provides a SuperLocalMemoryChatStore compatible with V3:

from llama_index.storage.chat_store.superlocalmemory import SuperLocalMemoryChatStore

chat_store = SuperLocalMemoryChatStore()  # Uses V3 database

Source: ide/integrations/llamaindex/README.md

LangChain

from langchain_superlocalmemory import SuperLocalMemoryChatMessageHistory

history = SuperLocalMemoryChatMessageHistory(session_id="my-session")

Source: ide/integrations/langchain/README.md

Rollback Procedure

If migration causes issues:

  1. Restore from Backup

     ```bash
     rm -rf ~/.superlocalmemory
     cp -r ~/.superlocalmemory.backup ~/.superlocalmemory
     ```

  2. Reinstall V2

     ```bash
     npm uninstall -g superlocalmemory
     npm install -g superlocalmemory@<v2-version>
     ```

  3. Report Issue
     • Create issue at GitHub Issues
     • Include migration logs from post-install


System Architecture

Related topics: Home, Retrieval Pipeline, Modes Explained (A/B/C)


SuperLocalMemory is a local-first AI memory system designed for privacy-conscious users who want persistent, semantic memory capabilities for AI assistants without relying on cloud services. The architecture follows a multi-layered design that separates storage, retrieval, API serving, and user interface concerns while maintaining tight integration between components.

Architecture Overview

The system consists of four primary layers that work together to provide memory persistence and retrieval capabilities:

graph TD
    subgraph CLI["CLI Layer"]
        CLI_CMD["slm remember<br/>slm recall<br/>slm forget"]
        SETUP_WIZARD["Setup Wizard"]
        POST_INSTALL["Post-Install"]
    end

    subgraph Server["API Server Layer"]
        DAEMON["Unified Daemon<br/>(unified_daemon.py)"]
        API["FastAPI Server<br/>(api.py)"]
        UI["UI Server<br/>(ui.py)"]
        ROUTES["Route Handlers<br/>(routes/*)"]
    end

    subgraph Core["Core Engine Layer"]
        MEMORY["Memory Engine"]
        CONFIG["Configuration<br/>(config.py)"]
        MIGRATOR["V2 Migrator"]
    end

    subgraph Storage["Storage Layer"]
        DB["SQLite Database<br/>(~/.superlocalmemory/memory.db)"]
        CACHE["Context Cache"]
    end

    CLI_CMD -->|Command| DAEMON
    SETUP_WIZARD -->|Initialize| DB
    POST_INSTALL -->|Migrate| MIGRATOR
    DAEMON -->|Manage| DB
    API -->|Serve| UI
    API -->|Routes| ROUTES
    ROUTES -->|Query| MEMORY
    MEMORY -->|Read/Write| DB
    MEMORY -->|Cache| CACHE
    CONFIG -->|Configure| MEMORY

The CLI layer provides the primary interface for human users and AI agents, while the API server layer handles both programmatic access and the web-based dashboard. The core engine layer implements the memory operations, and the storage layer persists all data locally in SQLite.

Core Components

Unified Daemon

The unified daemon (unified_daemon.py) serves as the central process manager for SuperLocalMemory. It handles version tracking, data migrations, and system initialization.

Key Responsibilities:

  • Version banner display on startup and upgrades
  • Additive schema migrations before engine initialization
  • Non-blocking startup (failures in version tracking do not prevent daemon operation)

# Source: src/superlocalmemory/server/unified_daemon.py:1-50

# LLD-06 §7.3 / LLD-07 §4.1 — run additive schema migrations BEFORE
# engine init so later queries see the expected columns/tables.
# Non-fatal: any failure here is logged and the daemon still starts.

The daemon implements a fail-safe approach where version banner errors are caught and logged without blocking startup, ensuring the system remains operational even when version tracking encounters issues.

FastAPI API Server

The API server (api.py) provides REST endpoints for memory operations and serves the web-based user interface. It uses FastAPI with several middleware components for security and performance.

Middleware Stack (in order):

| Layer | Middleware | Purpose | Source |
| --- | --- | --- | --- |
| 1 (outermost) | SecurityHeadersMiddleware | Security headers | api.py:1-30 |
| 2 | GZipMiddleware | Response compression (min 1000 bytes) | api.py:1-30 |
| 3 | CORSMiddleware | Cross-origin resource sharing | ui.py:1-50 |
| 4 (innermost) | RateLimiter | Request throttling | ui.py:1-50 |

CORS Configuration:

The API server allows requests from localhost origins on two ports:

  • http://localhost:8765 and http://127.0.0.1:8765
  • http://localhost:8417 and http://127.0.0.1:8417

Allowed methods include GET, POST, PUT, DELETE, PATCH, and OPTIONS. Headers allowed include Content-Type, Authorization, and X-SLM-API-Key.
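To make the allow-list concrete, the four origins can be generated from the two hosts and two ports listed above. This is an illustrative sketch only: the constant and function names are hypothetical, and the server's own CORS setup is handled by its middleware rather than by code like this.

```python
ALLOWED_PORTS = (8765, 8417)
ALLOWED_HOSTS = ("localhost", "127.0.0.1")

# One http:// origin per host and port combination, four in total.
ALLOWED_ORIGINS = [
    f"http://{host}:{port}" for port in ALLOWED_PORTS for host in ALLOWED_HOSTS
]


def is_allowed_origin(origin: str) -> bool:
    """Exact-match check against the fixed allow-list."""
    return origin in ALLOWED_ORIGINS
```

Origins are compared as whole strings, so a different scheme or port (for example `https://localhost:8765`) does not match.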

Rate Limiting:

  • Write operations: 30 requests per 60 seconds
  • Read operations: 120 requests per 60 seconds
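The limits above can be pictured with a sliding-window counter. The source does not say which algorithm the RateLimiter uses, so the class below is an assumption chosen for clarity; only the numbers (30 writes and 120 reads per 60 seconds) come from this page.

```python
import time
from collections import defaultdict, deque


class SlidingWindowLimiter:
    """Allow at most `limit` calls per `window` seconds for each key."""

    def __init__(self, limit: int, window: float = 60.0, clock=time.monotonic):
        self.limit = limit
        self.window = window
        self.clock = clock
        self._hits = defaultdict(deque)

    def allow(self, key: str) -> bool:
        now = self.clock()
        hits = self._hits[key]
        # Drop timestamps that have aged out of the window.
        while hits and now - hits[0] >= self.window:
            hits.popleft()
        if len(hits) >= self.limit:
            return False
        hits.append(now)
        return True


# Limits quoted on this page.
write_limiter = SlidingWindowLimiter(limit=30)
read_limiter = SlidingWindowLimiter(limit=120)
```

A request handler would call `write_limiter.allow(client_id)` before a write and reject the request when it returns False; the injectable `clock` exists only to make the class testable.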

UI Server

The UI server (ui.py) serves the web-based memory dashboard and is integrated into the FastAPI application. It searches for the UI directory in two locations:

# Source: src/superlocalmemory/server/ui.py:1-20

# V3.3.21: UI shipped inside the package for pip/npm installs.
_PKG_UI = Path(__file__).resolve().parent.parent / "ui"
_REPO_UI = Path(__file__).resolve().parent.parent.parent.parent / "ui"
UI_DIR = _PKG_UI if (_PKG_UI / "index.html").exists() else _REPO_UI

This dual-location search supports both package-installed and repository-clone deployments.

CLI Architecture

The command-line interface provides the primary interaction method for users and AI agents. The CLI is organized around commands that map to memory operations.

graph LR
    subgraph Commands
        REMEMBER["remember<br/>Store new memory"]
        RECALL["recall<br/>Semantic search"]
        FORGET["forget<br/>Delete by query"]
        DELETE["delete<br/>Delete by ID"]
        UPDATE["update<br/>Modify memory"]
    end

    subgraph Output
        JSON["--json flag<br/>Agent-native format"]
        HUMAN["Human readable"]
    end

    REMEMBER -->|result| JSON
    RECALL -->|result| JSON
    RECALL -->|result| HUMAN

Command Reference

| Command | Purpose | Key Options | Output |
| --- | --- | --- | --- |
| slm remember | Store a new memory | --importance, --tags, --project | Confirmation or JSON |
| slm recall | Semantic search | --limit, --json, --fast | Results list |
| slm forget | Delete by query | --dry-run, --yes, --json | Confirmation |
| slm delete | Delete by ID | --yes, --json | Confirmation |
| slm update | Modify existing memory | Various | Updated memory |

Source: src/superlocalmemory/cli/main.py:1-100

The `--fast` Flag

The recall command supports a --fast option that skips the SpreadingActivation 5th channel for sub-second response times. When enabled, only four channels execute:

  1. Semantic similarity
  2. Lexical matching
  3. Temporal proximity
  4. Structural relevance

This trade-off is recommended when you need recall results before making a tool call (e.g., before WebSearch).

JSON Output Format

The CLI supports an agent-native JSON output format with a consistent envelope structure:

{
  "success": true,
  "command": "recall",
  "version": "3.4.58",
  "data": [...],
  "next_actions": [...]
}

The version is read from package.json (npm installs), pyproject.toml (pip installs), or falls back to importlib.metadata.

Source: src/superlocalmemory/cli/json_output.py:1-60
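An agent consuming this envelope typically checks `success` first and then reads `data` and `next_actions`. The helper below is a hypothetical sketch that relies only on the top-level keys shown above; the shape of the items inside `data` is not documented here, so it is left untouched.

```python
import json


def unwrap_envelope(raw: str):
    """Return (data, next_actions) from a JSON envelope, or raise on failure."""
    envelope = json.loads(raw)
    if not envelope.get("success"):
        raise RuntimeError(f"{envelope.get('command', 'slm')} reported failure")
    return envelope.get("data", []), envelope.get("next_actions", [])
```

The `raw` string would usually be the standard output of a command run with `--json`, for example `slm recall` captured through `subprocess.run`.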

Setup and Initialization

Setup Wizard

The setup wizard (setup_wizard.py) runs automatically on first use or via slm setup. It handles:

  • Model downloads (embedding model: nomic-ai/nomic-embed-text-v1.5)
  • Reranker model downloads (cross-encoder/ms-marco-MiniLM-L-12-v2)
  • Mode configuration
  • Installation verification

Source: src/superlocalmemory/cli/setup_wizard.py:1-50

The wizard detects non-interactive environments (CI, piped input, MCP calls) and skips interactive prompts in those contexts.

Post-Install Process

For npm installations, a post-install script runs after npm install -g superlocalmemory. It performs:

  1. Version banner check (detects upgrades from prior versions)
  2. V2 installation detection
  3. Migration prompt if V2 data exists
  4. Setup wizard invocation for new users

Source: src/superlocalmemory/cli/post_install.py:1-50

Data Directory

The default data directory is ~/.superlocalmemory/, configurable via:

  • Environment variable: SL_MEMORY_PATH (Python layer)
  • Environment variable: SLM_DATA_DIR (documented but noted as potentially unused in some versions)

All memories are stored in memory.db within this directory.

Route Architecture

The API server includes multiple route modules that handle different aspects of memory operations:

Registered Routers

| Router | Purpose | Source |
| --- | --- | --- |
| memories_router | Core memory CRUD operations | routes/memories.py |
| stats_router | Statistics and analytics | routes/stats.py |
| profiles_router | Profile management | routes/profiles.py |
| backup_router | Backup and restore | routes/backup.py |
| events_router | Audit trail events | routes/events.py |
| v3_router | V3 dashboard and advanced features | routes/v3_api.py |
| chat_router | Chat with memory context (SSE) | routes/chat.py |

Chat Route (SSE Streaming)

The chat route implements server-sent events for streaming LLM responses with memory context and citation detection:

sequenceDiagram
    participant Client
    participant Server
    participant Memory
    participant LLM

    Client->>Server: POST /chat with query
    Server->>Memory: Retrieve relevant memories
    Memory-->>Server: List of memories with trust scores
    Server->>Server: Build context with citation markers
    Server->>LLM: Stream response request
    LLM-->>Server: Token stream
    Server-->>Client: SSE events (token, done, error)

Source: src/superlocalmemory/server/routes/chat.py:1-100

The system prompt instructs the LLM to cite memories using markers like [MEM-1], [MEM-2], etc., enabling traceable responses.
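On the client side, these markers can be pulled out of a response with a regular expression so each claim can be traced back to a memory. The function name is hypothetical; the `[MEM-n]` marker format is the one described above.

```python
import re

CITATION = re.compile(r"\[MEM-(\d+)\]")


def cited_indices(text: str) -> list:
    """Return the distinct memory numbers cited in text, in first-seen order."""
    seen = []
    for match in CITATION.finditer(text):
        index = int(match.group(1))
        if index not in seen:
            seen.append(index)
    return seen
```

The returned numbers refer to the position of each memory in the context that was sent to the LLM.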

Optional Feature Routers

Several routers are loaded gracefully and do not block startup if unavailable:

  • learning - Adaptive learning from user feedback
  • lifecycle - Memory lifecycle management
  • behavioral - Behavioral pattern recognition
  • compliance - Enterprise compliance features

# Source: src/superlocalmemory/server/ui.py:100-120

for _module_name in ("learning", "lifecycle", "behavioral", "compliance"):
    try:
        _mod = __import__(f"superlocalmemory.server.routes.{_module_name}", fromlist=["router"])
        application.include_router(_mod.router)
    except (ImportError, Exception):
        pass

Multi-Profile Architecture

SuperLocalMemory supports multiple isolated profiles, where each profile maintains its own:

  • Memory entries
  • Learning data
  • Preferences
  • Feedback

Profile isolation ensures that memories from one profile cannot leak to another, a security feature introduced in v2.6.0.

Data Flow

graph TD
    subgraph Ingestion
        USER["User Input<br/>(CLI/API)"]
        AGENT["AI Agent<br/>(MCP/Tools)"]
        INTEGRATION["Integrations<br/>(LangChain/LlamaIndex)"]
    end

    subgraph Processing
        ROUTE["Route Handler"]
        VALIDATE["Validation"]
        STORE["Storage Engine"]
    end

    subgraph Retrieval
        SEARCH["Search Engine"]
        RERANK["Reranker"]
        FUSE["Result Fusion"]
    end

    USER -->|slm remember| ROUTE
    AGENT -->|MCP Tools| ROUTE
    INTEGRATION -->|Chat History| ROUTE
    ROUTE --> VALIDATE
    VALIDATE --> STORE
    STORE -->|Query| SEARCH
    SEARCH --> RERANK
    RERANK --> FUSE
    FUSE -->|Results| USER
    FUSE -->|Results| AGENT

Security Considerations

Hardcoded Bind Address

As noted in GitHub Issue #23, both the SLM daemon and related components (like slm-mesh broker) currently hardcode 127.0.0.1 as the bind address. This is appropriate for single-machine usage but limits multi-machine deployments over trusted networks.

Profile Isolation

Memory queries enforce profile isolation at the API layer, preventing cross-profile data leakage.
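One common way to enforce this kind of isolation is to make the profile a mandatory filter in every query. The source does not show how SuperLocalMemory implements it, so the schema (`memories` table with `profile` and `content` columns) and the function below are purely illustrative assumptions.

```python
import sqlite3


def recall_for_profile(conn: sqlite3.Connection, profile: str, term: str) -> list:
    """Return contents matching term, restricted to a single profile."""
    rows = conn.execute(
        # Both values are bound parameters, so user input is never
        # interpolated into the SQL text.
        "SELECT content FROM memories WHERE profile = ? AND content LIKE ?",
        (profile, f"%{term}%"),
    )
    return [row[0] for row in rows]
```

Because the profile is a required argument, a caller cannot issue a query that spans profiles by accident.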

Rate Limiting

The API implements per-endpoint rate limiting to prevent abuse, with stricter limits on write operations.

Common Architecture Patterns

Lazy Import Pattern

Several modules use lazy imports to keep module-level imports fast:

# Source: src/superlocalmemory/server/routes/prewarm.py:1-30

def _compute_topic_sig(prompt: str) -> str:
    """Lazy import so module import is free of hot-path SLM modules."""
    from superlocalmemory.core.topic_signature import compute_topic_signature
    return compute_topic_signature(prompt)

Graceful Degradation

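Optional features are loaded inside try/except blocks so that the core system keeps working when an optional component fails. Resource handling follows the same defensive style: in the helper below, try/finally guarantees that the cache is closed even if the upsert raises: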

# Source: src/superlocalmemory/server/routes/prewarm.py:30-50

def _upsert_cache(...) -> None:
    from superlocalmemory.core.context_cache import CacheEntry, ContextCache
    cache = ContextCache()
    try:
        cache.upsert(CacheEntry(...))
    finally:
        cache.close()

Fail-Safe Version Tracking

Version banner errors are caught without blocking startup:

# Source: src/superlocalmemory/server/unified_daemon.py:50-70

try:
    ...  # version tracking logic
except Exception as _exc:  # pragma: no cover (never block startup)
    logger.debug("version-banner skipped: %s", _exc)
    _want_write_marker = False

Installation Modes

SuperLocalMemory supports multiple installation methods, each with slightly different directory structures:

| Method | UI Location | Version Source | Post-Install |
| --- | --- | --- | --- |
| npm global | _PKG_UI (package) | package.json | post_install.js |
| pip | _PKG_UI (package) | pyproject.toml | First-run wizard |
| Repository clone | _REPO_UI (repo root) | Dynamic | Manual setup |

The UI directory detection falls back from package location to repository location if the package UI is not found.
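The fallback described above can be sketched as follows. This is a minimal illustration rather than the project's code: `resolve_ui_dir` is a hypothetical helper, and only the names `_PKG_UI` and `_REPO_UI` come from the table.

```python
from pathlib import Path
from typing import Optional


def resolve_ui_dir(pkg_ui: Path, repo_ui: Path) -> Optional[Path]:
    """Return the first existing UI directory, preferring the packaged one.

    Hypothetical helper: pkg_ui stands in for _PKG_UI, repo_ui for _REPO_UI.
    """
    for candidate in (pkg_ui, repo_ui):
        if candidate.is_dir():
            return candidate
    return None  # neither location exists; the caller decides how to report it
```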

See Also

  • Home — Project overview and feature summary
  • Installation — Detailed installation instructions
  • MCP Integration — Model Context Protocol setup for AI IDEs
  • Universal Skills — CLI slash commands and automation
  • FAQ — Common questions and troubleshooting


Modes Explained (A/B/C)

Related topics: Home, System Architecture

SuperLocalMemory V3 operates in distinct operational modes that determine how memory storage, retrieval, and LLM integration function. Understanding these modes is essential for configuring the system for your specific use case—whether you prioritize complete data sovereignty with local-only processing or require cloud-based LLM capabilities.

Overview

SuperLocalMemory V3 provides three primary operational modes that govern how the memory system processes queries and integrates with Large Language Model backends:

| Mode | Description | LLM Required | Data Location |
| --- | --- | --- | --- |
| Mode A | Local-only semantic search | No | Always local |
| Mode B | Cloud LLM with local memory | Yes (external) | Memory local, inference cloud |
| Mode C | (Documentation pending) | Varies | Varies |

The mode system is central to SuperLocalMemory's architecture, enabling deployment flexibility from fully air-gapped environments to cloud-integrated workflows. Source: src/superlocalmemory/core/config.py

Mode A: Local-Only Semantic Search

Mode A is the default operational mode for SuperLocalMemory when no external LLM provider is configured. In this mode, the system performs all memory operations using local embedding models and SQLite-based retrieval without requiring any external API calls.

How Mode A Works

When a user issues a slm recall command in Mode A, the system performs semantic search using locally-hosted embedding models. The retrieval pipeline includes:

  1. Query embedding — The user's search query is embedded using a local model (typically nomic-ai/nomic-embed-text-v1.5)
  2. Vector similarity search — Embeddings are compared against stored memory vectors in SQLite
  3. Multi-channel retrieval — Results are ranked using semantic, lexical, temporal, and structural signals
  4. Raw result presentation — Memory cards are returned directly without LLM synthesis
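Steps 1 and 2 above can be illustrated with a small similarity ranking. This is a sketch under stated assumptions: the toy three-dimensional vectors stand in for real embeddings, and `cosine` and `rank_by_cosine` are hypothetical helpers, not functions from the SuperLocalMemory codebase.

```python
import math
from typing import Dict, List, Sequence, Tuple


def cosine(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine similarity; returns 0.0 if either vector has zero length."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def rank_by_cosine(
    query_vec: Sequence[float],
    stored: Dict[str, Sequence[float]],
    limit: int = 10,
) -> List[Tuple[str, float]]:
    """Score every stored vector against the query and keep the top `limit`."""
    scored = [(fact_id, cosine(query_vec, vec)) for fact_id, vec in stored.items()]
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:limit]


# Toy vectors standing in for embedded memories (illustrative data only).
memories = {
    "fact_a": [1.0, 0.0, 0.0],
    "fact_b": [0.7, 0.7, 0.0],
    "fact_c": [0.0, 0.0, 1.0],
}
top = rank_by_cosine([1.0, 0.1, 0.0], memories, limit=2)
print([fact_id for fact_id, _ in top])  # fact_a ranks ahead of fact_b
```

A real deployment would compare against vectors stored in SQLite and then combine this score with the lexical, temporal, and structural signals from step 3.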

Behavior Without LLM Provider

When no LLM provider is configured, the chat API endpoint explicitly falls back to Mode A behavior:

if not provider:
    yield _sse_event("token", "No LLM provider configured. Showing raw results instead.\n\n")
    async for event in _stream_mode_a(query, memories):
        yield event

Source: src/superlocalmemory/server/routes/chat.py:58-61

Fast Mode Option

Mode A supports a --fast flag for sub-second response times by skipping the Spreading Activation fifth channel:

recall_p.add_argument(
    "--fast", action="store_true",
    help="Skip SpreadingActivation 5th channel for sub-second response. "
         "Other 4 channels (semantic, lexical, temporal, structural) still run. "
         "Use when you need recall before a tool call (e.g. before WebSearch).",
)

Source: src/superlocalmemory/cli/main.py

When to Use Mode A

  • Air-gapped environments — Systems without internet connectivity
  • Maximum privacy — When no data should leave the local machine under any circumstances
  • Maximum speed — When sub-second retrieval is prioritized over synthesis
  • Resource-constrained deployments — When GPU/CPU resources cannot support inference

Mode B: Cloud LLM with Local Memory

Mode B extends Mode A's local memory foundation with cloud-based LLM synthesis. In this mode, memory embeddings and storage remain entirely local, but query synthesis and response generation use external LLM providers.

Mode B Architecture

graph TD
    A[User Query] --> B[Local Embedding]
    B --> C[SQLite Memory Store]
    C --> D[Retrieved Memories]
    D --> E[Context Construction]
    E --> F[Cloud LLM Provider]
    F --> G[Synthesized Response]
    G --> H[Local Display]
    
    H --> I[Trust Scoring]
    I --> J[Memory Update]
    
    style F fill:#ffcccc
    style C fill:#ccffcc

Supported Providers

Mode B supports any OpenAI-compatible API endpoint, including:

| Provider | Configuration | Notes |
| --- | --- | --- |
| Ollama | provider: ollama | Local LLM option |
| LM Studio | OpenAI-compatible | Local inference |
| Groq | Cloud | Fast inference |
| OpenAI | api_key required | Standard OpenAI |
| OpenRouter | api_key required | Aggregated models |
| Custom endpoints | api_base configurable | Self-hosted |

Source: src/superlocalmemory/llm/backbone.py

Known Issue: api_key Handling

A documented issue affects Mode B configuration where the api_key field may be silently dropped:

Symptom: Any configured api_key is ignored in Mode B, causing LLMBackbone.is_available() to return False for non-Ollama providers.

Source: GitHub Issue #9

This occurs in the SLMConfig.for_mode() method's Mode B branch, where API credentials may not be properly propagated to the LLM backbone initialization.

When to Use Mode B

  • Complex synthesis — When memory retrieval benefits from natural language explanation
  • Multi-modal reasoning — When combining memory with real-time information
  • Multi-language support — When working with non-English content requiring advanced language models
  • Balanced privacy — When memory data must remain local but inference can be external

Mode Selection and Configuration

Viewing Current Mode

Check the active mode using the CLI:

slm status

The status command displays the current mode, LLM configuration, and memory statistics.

Switching Modes

Switch between modes using the setup wizard or direct configuration:

slm mode          # Interactive mode selection
slm provider      # Configure LLM provider settings

Source: src/superlocalmemory/cli/main.py:80-85

Configuration File Structure

The SLMConfig class manages mode-specific settings:

class SLMConfig:
    @classmethod
    def for_mode(cls, mode: str) -> "SLMConfig":
        """Factory method that returns mode-specific configuration."""
        ...

Source: src/superlocalmemory/core/config.py

Setup Wizard Integration

The setup wizard handles initial mode selection during first-time installation:

import os
from pathlib import Path

# Base directory; overridable through the SL_MEMORY_PATH environment variable
_SLM_HOME = Path(os.environ.get("SL_MEMORY_PATH", Path.home() / ".superlocalmemory"))
_SETUP_MARKER = _SLM_HOME / ".setup-complete"
_EMBED_MODEL = "nomic-ai/nomic-embed-text-v1.5"
_RERANKER_MODEL = "cross-encoder/ms-marco-MiniLM-L-12-v2"

Source: src/superlocalmemory/cli/setup_wizard.py

The wizard:

  1. Detects whether an LLM provider is available
  2. Offers mode selection based on detected capabilities
  3. Downloads required embedding models for Mode A
  4. Configures provider credentials for Mode B

Data Flow Comparison

graph LR
    subgraph Mode A
        A1[Query] --> A2[Local Embed]
        A2 --> A3[Local Search]
        A3 --> A4[Raw Results]
    end
    
    subgraph Mode B
        B1[Query] --> B2[Local Embed]
        B2 --> B3[Local Search]
        B3 --> B4[Context Build]
        B4 --> B5[Cloud LLM]
        B5 --> B6[Synthesized]
    end
    
    subgraph Mode C
        C1[Query] --> C2[Context]
        C2 --> C3[Distributed]
        C3 --> C4[Collaborative]
    end

Troubleshooting Mode Selection

Long Wait Times Without Response

If slm remember commands hang without response, this may indicate:

  • Mode B is configured but the LLM provider is unreachable
  • Network connectivity issues with cloud endpoints
  • Model download still in progress for first-time setup

Resolution: This was addressed in v3.3.19. Ensure you are running the latest version.

Source: GitHub Issue #11

Provider Not Detected

When Mode B features are unavailable:

  1. Verify LLM configuration in settings
  2. Check api_key is not empty in config
  3. Test provider connectivity with slm status
  4. Consider falling back to Mode A if cloud access is unreliable

SLM_DATA_DIR Not Honored

The SLM_DATA_DIR environment variable is documented but may not be used everywhere. This affects all modes equally. The recommended workaround is to ensure the default ~/.superlocalmemory directory has appropriate permissions.

Source: GitHub Issue #10

Security Considerations

Profile Isolation

All modes enforce profile isolation at the query endpoint level:

Memories from one profile can never leak to another.

Source: v2.6.0 Release Notes

This security guarantee applies regardless of which mode is active.

Mode B Data Privacy

In Mode B:

  • The memory database and its embeddings stay on the local machine
  • The context sent to the cloud LLM is built from the memories retrieved for the query, so their text is part of the request
  • The synthesized response returns over the network and is displayed locally

For maximum privacy in Mode B, ensure your LLM provider's data handling policies meet your compliance requirements.

Community Feature Requests

The community has proposed several mode-related enhancements:

Configurable Embedding Endpoints (Issue #16)

Users have requested support for configurable local embedding endpoints to improve non-English language support:

"Support configurable local embedding endpoints (e.g., OpenAI-compatible API) to unlock non-English language potential."

Source: GitHub Issue #16

This would allow Mode B users to specify custom embedding services that better handle their language requirements.

Multi-Scope Memory (Issue #20)

An RFC proposes scope-aware retrieval that could work across all modes:

"Currently, SLM stores all memories in a flat namespace... no distinction between private memories and shared knowledge."

Source: GitHub Issue #20


Retrieval Pipeline

Related topics: System Architecture, Memory Lifecycle

The Retrieval Pipeline is the core information retrieval system in SuperLocalMemory V3, responsible for finding the most relevant memories in response to user queries. It implements a multi-channel retrieval architecture that combines semantic similarity, lexical matching, temporal proximity, structural relationships, and spreading activation into a unified retrieval operation. The pipeline is invoked through the slm recall CLI command, the REST API endpoints, and the MCP tools for programmatic access.

Overview

SuperLocalMemory stores memories as atomic facts in a SQLite database, each tagged with semantic embeddings, temporal metadata, structural relationships (parent-child, project, profile), and trust scores. When a query arrives, the Retrieval Pipeline must efficiently locate the most relevant facts across these dimensions.

The pipeline is designed around the following principles:

  • Multi-channel retrieval: Combines 4-5 orthogonal retrieval channels to capture different similarity aspects
  • Trust-aware ranking: Results are filtered and weighted by trust scores computed from provenance chains
  • Sub-second response: The --fast mode delivers results in under 1 second for time-sensitive tool calls
  • Zero-LLM inference: Core retrieval runs without calling an LLM, enabling fully local operation (Mode A)
  • Fisher-Rao similarity: Mathematical framework ensuring information-geometric guarantees on retrieval quality

The CLI entry point is slm recall, which wraps the WorkerPool's recall method for thread-safe concurrent execution. Source: cli/main.py:recall_p

Architecture

High-Level Data Flow

graph TD
    A["User Query<br/>'slm recall &lt;query&gt;'"] --> B["WorkerPool.recall<br/>Entry Point"]
    B --> C["Semantic Channel<br/>Embedding Similarity"]
    B --> D["Lexical Channel<br/>BM25 / Keyword Match"]
    B --> E["Temporal Channel<br/>Recency Weighting"]
    B --> F["Structural Channel<br/>Project / Profile Graph"]
    B --> G["Spreading Activation<br/>5th Channel (optional)"]
    C --> H["Parallel Channel Execution"]
    D --> H
    E --> H
    F --> H
    G -.-> H
    H --> I["Reranker<br/>Cross-Encoder Scoring"]
    I --> J["Trust Filter<br/>Minimum Threshold"]
    J --> K["Ranked Results<br/>JSON / Markdown Output"]
    
    style G fill:#f9f,stroke:#333,stroke-width:2px
    style K fill:#bf9,stroke:#333,stroke-width:2px

Channel Architecture

The retrieval engine combines five distinct channels, each capturing a different dimension of relevance:

| Channel | Purpose | Input | Output |
| --- | --- | --- | --- |
| Semantic | Meaning-based similarity via embeddings | Query text | Top-K fact IDs with cosine similarity scores |
| Lexical | Keyword and exact match | Query terms | Fact IDs with BM25 scores |
| Temporal | Recency-weighted relevance | Timestamp metadata | Time-decay weighted scores |
| Structural | Graph-based relationships | Parent-child, project graph | Transitive closure scores |
| Spreading Activation | Neural-style associative retrieval | Active facts from other channels | Propagated activation scores |

The first four channels run in parallel for maximum throughput. The Spreading Activation channel is optional and skipped when --fast mode is enabled. Source: cli/main.py:recall_p.add_argument --fast
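The parallel execution and the optional fifth channel can be sketched as below. This is a hedged illustration, not the engine's actual code: `recall_sketch` and `merge` are hypothetical names, each channel is modeled as a plain callable, and summing scores is an assumed fusion rule chosen for simplicity.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import Callable, Dict, List

Scores = Dict[str, float]           # fact_id -> score
Channel = Callable[[str], Scores]   # takes the query text


def merge(into: Scores, scores: Scores) -> None:
    """Add one channel's scores into the running totals (assumed fusion rule)."""
    for fact_id, score in scores.items():
        into[fact_id] = into.get(fact_id, 0.0) + score


def recall_sketch(
    query: str,
    primary: List[Channel],
    spreading: Callable[[Scores], Scores],
    fast: bool = False,
) -> Scores:
    merged: Scores = {}
    # The primary channels are independent, so they can run concurrently.
    with ThreadPoolExecutor(max_workers=max(1, len(primary))) as pool:
        for scores in pool.map(lambda channel: channel(query), primary):
            merge(merged, scores)
    # Spreading activation starts from the facts the other channels surfaced,
    # so it runs afterwards and is skipped entirely in fast mode.
    if not fast:
        merge(merged, spreading(dict(merged)))
    return merged
```

Because the fifth channel depends on the output of the other four, it cannot overlap with them, which is consistent with skipping it being the main latency saving of --fast.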

CLI Interface

The slm recall command provides the primary interface to the retrieval pipeline:

slm recall <query> [--limit N] [--json] [--fast]

| Argument | Default | Description |
| --- | --- | --- |
| query | (required) | Search query string |
| --limit, -l | 10 | Maximum number of results to return |
| --json | false | Output structured JSON for agent consumption |
| --fast | false | Skip SpreadingActivation 5th channel for sub-second response |

The --fast flag is recommended when recall must complete before a subsequent tool call (e.g., before WebSearch). It still executes all four primary channels: semantic, lexical, temporal, and structural. Source: cli/main.py:recall_p

REST API Endpoint

The REST API exposes retrieval through the memories router:

GET /api/memories/recall?query=<query>&limit=<limit>

The endpoint delegates to the same WorkerPool shared instance used by the CLI, ensuring consistent behavior across all access methods. Results are returned as a JSON array of fact objects with content, trust scores, and provenance metadata. Source: server/routes/memories.py
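A client needs to URL-encode the query before calling this endpoint. The helper below is a sketch: `recall_url` is a hypothetical name, the host and port defaults mirror the SLM_HOST and SLM_PORT defaults listed in the CLI Reference, and the plain http scheme is an assumption.

```python
from urllib.parse import urlencode


def recall_url(
    query: str,
    limit: int = 10,
    host: str = "127.0.0.1",  # assumed default, per SLM_HOST
    port: int = 8765,         # assumed default, per SLM_PORT
) -> str:
    """Build the GET URL for /api/memories/recall with an encoded query string."""
    params = urlencode({"query": query, "limit": limit})
    return f"http://{host}:{port}/api/memories/recall?{params}"


print(recall_url("postgres port", limit=5))
# http://127.0.0.1:8765/api/memories/recall?query=postgres+port&limit=5
```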

WorkerPool Integration

Shared Execution Context

The Retrieval Pipeline runs inside a WorkerPool singleton that manages concurrent access to the SQLite database. This design ensures thread safety and allows both the CLI daemon and HTTP server to share the same retrieval engine.

# Internal flow in chat.py
from superlocalmemory.core.worker_pool import WorkerPool

pool = WorkerPool.shared()
result = pool.recall(query, limit=limit)

The WorkerPool.shared() method returns a process-global singleton that initializes the V3 MemoryEngine on first access. All subsequent recall operations reuse this instance, avoiding repeated engine initialization overhead. Source: server/routes/chat.py:_recall_memories

Result Structure

The recall operation returns a dictionary with the following structure:

{
  "ok": true,
  "results": [
    {
      "fact_id": "fact_abc123",
      "content": "The project uses Python 3.12 for type safety",
      "trust_score": 0.92,
      "provenance": ["user_feedback:thumbs_up", "automatic_verification"],
      "tags": ["project:backend", "profile:default"],
      "created_at": 1704067200
    }
  ]
}
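A consumer of this structure will typically check the ok flag and then filter on trust_score. The following sketch assumes only the fields shown above; `trusted_contents` and the 0.9 threshold are illustrative choices, not part of the project.

```python
import json
from typing import List


def trusted_contents(payload: str, min_trust: float = 0.9) -> List[str]:
    """Return the content of each result whose trust_score meets the threshold."""
    response = json.loads(payload)
    if not response.get("ok"):
        return []  # treat a failed recall as "no usable memories"
    return [
        result["content"]
        for result in response.get("results", [])
        if result.get("trust_score", 0.0) >= min_trust
    ]
```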

Context Caching

Cache Architecture

The Retrieval Pipeline integrates with a context cache layer to accelerate repeated queries. When memories are retrieved, they can be cached with a topic signature for future use:

graph LR
    A["Query: 'Python best practices'"] --> B["Compute Topic Signature<br/>hash(query)"]
    B --> C["ContextCache Lookup"]
    C -->|Cache Hit| D["Return Cached Results"]
    C -->|Cache Miss| E["Run Full Retrieval Pipeline"]
    E --> F["Upsert Cache Entry"]
    F --> D

The ContextCache stores entries with:

  • session_id: Conversation or agent session identifier
  • topic_sig: Hash of the query for fast lookup
  • content: Serialized memory context
  • fact_ids: Tuple of referenced fact IDs
  • provenance: How the cache entry was computed
  • computed_at: Unix timestamp for TTL decisions

Source: server/routes/prewarm.py:_upsert_cache
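The lookup-or-recompute flow and the entry fields can be modeled with a small in-memory cache. This is a sketch, not the real ContextCache: the class names are hypothetical, the SHA-256 signature over a normalized query is an assumed stand-in for compute_topic_signature, and the 300-second TTL is an arbitrary example value.

```python
import hashlib
import time
from dataclasses import dataclass
from typing import Dict, Optional, Tuple


@dataclass(frozen=True)
class CacheEntrySketch:
    session_id: str
    topic_sig: str
    content: str
    fact_ids: Tuple[str, ...]
    provenance: str
    computed_at: float  # Unix timestamp used for the TTL check


def topic_sig(query: str) -> str:
    """Assumed signature: SHA-256 of the lower-cased, whitespace-normalized query."""
    normalized = " ".join(query.lower().split())
    return hashlib.sha256(normalized.encode("utf-8")).hexdigest()


class CacheSketch:
    def __init__(self, ttl_seconds: float = 300.0) -> None:
        self._ttl = ttl_seconds
        self._entries: Dict[Tuple[str, str], CacheEntrySketch] = {}

    def upsert(self, entry: CacheEntrySketch) -> None:
        self._entries[(entry.session_id, entry.topic_sig)] = entry

    def lookup(
        self, session_id: str, query: str, now: Optional[float] = None
    ) -> Optional[CacheEntrySketch]:
        now = time.time() if now is None else now
        entry = self._entries.get((session_id, topic_sig(query)))
        if entry is None or now - entry.computed_at > self._ttl:
            return None  # miss or expired: the caller runs the full pipeline
        return entry
```

Keying on (session_id, topic_sig) keeps one session's cached context from being served to another.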

Prewarm Endpoint

The /api/prewarm endpoint allows agents to eagerly populate the cache before a conversation starts:

POST /api/prewarm
{
  "session_id": "session_xyz",
  "prompt": "Tell me about the backend architecture"
}

This triggers retrieval for the provided prompt and stores results in the context cache, reducing latency when the user later asks related questions. Source: server/routes/prewarm.py

Topic Signature Computation

Topic signatures are computed lazily to keep module imports fast:

def _compute_topic_sig(prompt: str) -> str:
    """Lazy import so module import is free of hot-path SLM modules."""
    from superlocalmemory.core.topic_signature import compute_topic_signature
    return compute_topic_signature(prompt)

This lazy import pattern ensures that importing the prewarm module does not trigger loading of the heavier retrieval dependencies until actually needed. Source: server/routes/prewarm.py:_compute_topic_sig

Chat Integration with Memory Context

Streaming Response Flow

When the /api/chat endpoint receives a message, it retrieves relevant memories and streams them alongside the LLM response:

sequenceDiagram
    participant User as Client (SSE)
    participant API as /api/chat
    participant Recall as Retrieval Pipeline
    participant LLM as LLM Provider
    
    User->>API: POST /api/chat {message}
    API->>Recall: _recall_memories(message, limit=10)
    Recall-->>API: memories[]
    API->>API: Build context with [MEM-1], [MEM-2] markers
    API->>LLM: Stream completion(messages + context)
    LLM-->>User: SSE tokens with memory citations

The retrieved memories are formatted with citation markers that the LLM can reference:

[MEM-1] The project uses Python 3.12 (trust: 0.92)
[MEM-2] PostgreSQL is the primary database (trust: 0.88)

Source: server/routes/chat.py:_build_context
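The citation format shown above could be produced by a helper along these lines. This is a sketch that reproduces only the visible format: the function name matches the cited `_build_context` loosely, but the body is an assumption and may differ from the project's implementation.

```python
from typing import Any, Dict, List


def build_context(memories: List[Dict[str, Any]]) -> str:
    """Number each memory from 1 and append its trust score to two decimals."""
    lines = [
        f"[MEM-{index}] {memory['content']} (trust: {memory['trust_score']:.2f})"
        for index, memory in enumerate(memories, start=1)
    ]
    return "\n".join(lines)


print(build_context([
    {"content": "The project uses Python 3.12", "trust_score": 0.92},
    {"content": "PostgreSQL is the primary database", "trust_score": 0.88},
]))
```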

Trust Scoring in Results

Each retrieved memory carries a trust_score between 0.0 and 1.0, computed from the provenance chain:

  • Memories with positive user feedback (thumbs up) receive higher trust
  • Memories auto-verified against external sources score higher
  • Memories from recent sessions with the same profile are weighted more heavily
  • Imported memories without provenance chains receive lower default trust

The trust score appears in both CLI output and the SSE stream, allowing clients to display visual indicators of memory reliability.

Performance Considerations

Fast Mode Trade-offs

When --fast is specified, the Spreading Activation channel is disabled. This channel provides neural-style associative retrieval by propagating activation through the memory graph. Disabling it trades recall quality for speed:

| Mode | Latency | Channels Active | Best For |
| --- | --- | --- | --- |
| Default | ~500-2000ms | 5 (all) | Comprehensive research, agent planning |
| --fast | ~100-500ms | 4 (no spreading) | Tool calls before web search, real-time autocomplete |

For most interactive use cases, --fast provides sufficient accuracy. The full pipeline is recommended for final answer synthesis or when recall quality is paramount. Source: cli/main.py:recall_p.add_argument --fast

Concurrent Access

The WorkerPool handles concurrent requests safely through Python's concurrent.futures thread pool executor. Multiple simultaneous slm recall invocations or API calls are serialized at the database level but execute channel operations in parallel within each request.

Docker and Linux Considerations

Community reports indicate potential issues with retrieval latency in Docker containers and certain Linux distributions. These may relate to:

  • SQLite file locking behavior in overlay filesystems
  • Thread pool sizing in containerized environments
  • Model loading times for local embedding models

Users experiencing long wait times (reported as "wait for a long time but seems no response" in issue #11, fixed in v3.3.19) should verify:

  1. The daemon is running (slm status)
  2. Sufficient memory is available for embedding models
  3. The database file is on a local filesystem (not network storage)

Configuration

Data Directory

By default, SuperLocalMemory stores all retrieval data in ~/.superlocalmemory/. The SLM_DATA_DIR environment variable can relocate this, though note this feature had a bug (issue #10) that has since been corrected:

export SLM_DATA_DIR=/path/to/custom/data
slm recall "my query"

Mode A vs Mode B

The retrieval pipeline operates in two modes:

| Aspect | Mode A (Local) | Mode B (API) |
| --- | --- | --- |
| Embeddings | Local model (nomic-embed-text) | Remote API (OpenAI-compatible) |
| LLM | Local (Ollama) | Remote (OpenAI, Anthropic, etc.) |
| Latency | Higher (model loading) | Lower (API calls) |
| Privacy | Maximum (fully offline) | High (data stays on configured server) |
| Cost | Free (compute only) | API token costs |

Mode B supports OpenAI-compatible embedding endpoints, enabling users to configure custom embedding providers for non-English languages (feature request #16). Source: src/superlocalmemory/core/config.py

Common Issues

No Results Returned

If slm recall returns an empty result set:

  1. Verify memories exist: slm list-recent
  2. Check profile isolation: memories are scoped to the current profile
  3. Inspect database: sqlite3 ~/.superlocalmemory/memory.db "SELECT COUNT(*) FROM facts"
  4. Enable debug logging: SLM_LOG_LEVEL=DEBUG slm recall <query>

Long Response Times

Long retrieval times (beyond 2-3 seconds) may indicate:

  • First-run embedding model download (Setup Wizard should handle this)
  • Large fact database without proper indexing
  • Mode B embedding endpoint unreachable
  • Insufficient system memory for local models

API Key Not Used (Mode B)

Issue #9 reported that api_key is silently dropped in Mode B configurations. Ensure the API key is correctly set in the configuration file or environment variable, and that the provider field is set to a non-Ollama value to trigger the correct code path.


CLI Reference

Related topics: Home, MCP Tools

The SuperLocalMemory CLI (slm) provides a command-line interface for managing AI memory operations. The CLI serves as the primary interface for agents and developers to store, retrieve, update, and delete memories from the local SQLite database. All operations are local-first, with data stored in ~/.superlocalmemory/memory.db by default.

Overview

The CLI follows the 2026 agent-native CLI standard, providing consistent JSON envelopes for programmatic consumption alongside human-readable output for interactive use. Source: src/superlocalmemory/cli/json_output.py:1-50

graph TD
    A[slm CLI] --> B[Commands]
    B --> C[remember - Store facts]
    B --> D[recall - Semantic search]
    B --> E[forget - Fuzzy delete]
    B --> D2[delete - Precise delete]
    B --> F[update - Modify memory]
    A --> G[Global Options]
    G --> H[--json Agent-native output]
    G --> I[--profile Profile isolation]

Installation

The CLI is available via both npm and pip package managers.

# npm installation (recommended for global access)
npm install -g superlocalmemory

# pip installation
pip install superlocalmemory

After installation, the slm command becomes available globally. The post-install script automatically detects V2 installations and prompts for migration. Source: src/superlocalmemory/cli/post_install.py:1-40

For first-time users, the setup wizard runs automatically on first slm command when ~/.superlocalmemory/.setup-complete is missing. Source: src/superlocalmemory/cli/setup_wizard.py:1-45

Commands

slm remember

Stores new facts and memories in the local database.

slm remember <content> [options]

| Argument | Description |
| --- | --- |
| content | The fact or memory to store |

Options:

| Option | Description |
| --- | --- |
| --json | Output structured JSON (agent-native) |
| --profile <name> | Store in specific profile (default: current profile) |

Example:

# Interactive mode
slm remember "The PostgreSQL connection pool should use max 20 connections"

# JSON output for scripting
slm remember "Project deadline is March 15th" --json

The remember command processes the content through the V3 MemoryEngine, computing topic signatures and extracting entities for optimized retrieval. Source: src/superlocalmemory/server/routes/prewarm.py:1-50

slm recall

Performs semantic search with 4-channel retrieval across stored memories.

slm recall <query> [options]

| Argument | Description |
| --- | --- |
| query | Semantic search query |

Options:

| Option | Default | Description |
| --- | --- | --- |
| --limit <n> | 10 | Maximum number of results |
| --json | false | Output structured JSON |
| --fast | false | Skip 5th channel (SpreadingActivation) for sub-second response |
| --profile <name> | current | Search within specific profile |

4-Channel Retrieval Architecture:

The recall command uses four retrieval channels:

  1. Semantic - Vector embedding similarity
  2. Lexical - Keyword and phrase matching
  3. Temporal - Time-based relevance scoring
  4. Structural - Graph topology influence

The optional 5th channel (SpreadingActivation) performs network propagation for deeper context discovery. Use --fast to skip this channel when speed is critical, such as before a tool call. Source: src/superlocalmemory/cli/main.py:1-80

graph LR
    A[Query] --> B[Semantic Channel]
    A --> C[Lexical Channel]
    A --> D[Temporal Channel]
    A --> E[Structural Channel]
    B --> F[Ranking Engine]
    C --> F
    D --> F
    E --> F
    F --> G[Results]

Example:

# Standard recall
slm recall "database connection pooling settings"

# Fast recall for pre-tool use
slm recall "API endpoint for users" --fast

# JSON output with higher limit
slm recall "authentication token handling" --limit 20 --json

Performance Note: If slm recall appears to hang with no response, this was a known issue fixed in v3.3.19. Ensure you are running a recent version. Source: GitHub Issue #11

slm forget

Deletes memories matching a query using fuzzy matching.

slm forget <query> [options]

| Argument | Description |
| --- | --- |
| query | Query to match for deletion |

Options:

| Option | Description |
| --- | --- |
| --dry-run | Preview matches without deleting |
| --yes, -y | Skip confirmation prompt |
| --json | Output structured JSON |
| --profile <name> | Operate within specific profile |

Example:

# Preview what would be deleted
slm forget "old project notes" --dry-run

# Delete without the confirmation prompt
slm forget "duplicate entry about config" -y

slm delete

Deletes a specific memory by its exact fact ID.

slm delete <fact_id> [options]

| Argument | Description |
| --- | --- |
| fact_id | Exact fact ID to delete |

Options:

| Option | Description |
| --- | --- |
| --yes, -y | Skip confirmation prompt |
| --json | Output structured JSON |

Example:

# Delete by exact ID
slm delete fact_abc123def456 -y

Unlike forget, this command requires the precise fact ID, making it suitable for programmatic deletion workflows.

slm update

Modifies an existing memory entry.

slm update [options]

Options:

| Option | Description |
| --- | --- |
| --fact-id <id> | Fact ID to update |
| --content <text> | New content for the memory |
| --json | Output structured JSON |

Example:

# Update memory content
slm update --fact-id fact_xyz789 --content "Updated project requirements"

# JSON output for automation
slm update --fact-id fact_xyz789 --content "New content here" --json

Global Options

These options work with any command.

| Option | Description |
| --- | --- |
| --json | Enable agent-native JSON output format |
| --profile <name> | Specify which profile to use |
| --help, -h | Show help message |
| --version, -v | Show version information |

JSON Output Format

When --json is specified, commands return a structured envelope following the 2026 agent-native CLI standard:

{
  "success": true,
  "command": "recall",
  "version": "3.4.58",
  "data": {
    "results": [
      {
        "fact_id": "fact_abc123",
        "content": "The app uses port 5432 for PostgreSQL",
        "trust_score": 0.87,
        "importance": 7,
        "tags": ["database", "config"],
        "created_at": "2026-02-01T10:30:00Z"
      }
    ],
    "total": 1,
    "query": "postgres port",
    "channel_used": ["semantic", "lexical"]
  },
  "metadata": {
    "profile": "default",
    "execution_time_ms": 234
  }
}

Source: src/superlocalmemory/cli/json_output.py:1-80
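An agent or script can drive the CLI and unpack this envelope as sketched below. The flags are the ones documented for slm recall; the helper names are hypothetical, and the sketch assumes slm is on PATH and prints the envelope to standard output.

```python
import json
import subprocess
from typing import Any, Dict, List


def build_recall_argv(query: str, limit: int = 10, fast: bool = False) -> List[str]:
    """Assemble the argument list for `slm recall ... --json`."""
    argv = ["slm", "recall", query, "--limit", str(limit), "--json"]
    if fast:
        argv.append("--fast")
    return argv


def parse_envelope(raw: str) -> List[Dict[str, Any]]:
    """Return data.results, raising if the envelope reports failure."""
    envelope = json.loads(raw)
    if not envelope.get("success"):
        raise RuntimeError(f"{envelope.get('command', 'slm')} reported failure")
    return envelope["data"]["results"]


def recall(query: str, limit: int = 10, fast: bool = False) -> List[Dict[str, Any]]:
    """Run the CLI and return its results (assumes slm is installed)."""
    completed = subprocess.run(
        build_recall_argv(query, limit, fast),
        capture_output=True, text=True, check=True,
    )
    return parse_envelope(completed.stdout)
```

Passing the query as a separate list element avoids shell quoting problems with spaces or special characters.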

Version Detection:

The JSON envelope includes version information read from:

  1. package.json (npm installs)
  2. pyproject.toml (pip installs)
  3. importlib.metadata fallback

Source: src/superlocalmemory/cli/json_output.py:20-50
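The three-step lookup order can be sketched as follows. This is an illustration of the fallback chain only: `detect_version` is a hypothetical helper, the regular expression is a simplification of real TOML parsing, and the distribution name is assumed to match the package name used for installation.

```python
import json
import re
from importlib import metadata
from pathlib import Path
from typing import Optional


def detect_version(root: Path, dist_name: str = "superlocalmemory") -> Optional[str]:
    """Try package.json, then pyproject.toml, then installed package metadata."""
    package_json = root / "package.json"
    if package_json.is_file():
        version = json.loads(package_json.read_text(encoding="utf-8")).get("version")
        if version:
            return version

    pyproject = root / "pyproject.toml"
    if pyproject.is_file():
        # Simplified: matches a top-level line such as  version = "3.4.58"
        match = re.search(
            r'^version\s*=\s*"([^"]+)"',
            pyproject.read_text(encoding="utf-8"),
            re.MULTILINE,
        )
        if match:
            return match.group(1)

    try:
        return metadata.version(dist_name)
    except metadata.PackageNotFoundError:
        return None
```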

Environment Variables

| Variable | Default | Description |
| --- | --- | --- |
| SL_MEMORY_PATH | ~/.superlocalmemory | Base directory for data storage |
| CI | (not set) | Set to disable interactive prompts |
| SLM_NON_INTERACTIVE | (not set) | Disable interactive mode |
| SLM_HOST | 127.0.0.1 | Daemon bind address |
| SLM_PORT | 8765 | Daemon port |

Note: The SLM_DATA_DIR environment variable was requested in the community (Issue #10) for custom data directories. Verify current support by checking src/superlocalmemory/core/config.py.

Profile Isolation

All CLI commands respect profile isolation. Memories from one profile cannot leak to another. This is enforced at the query endpoint level and applies to all operations including recall, remember, forget, and delete. Source: src/superlocalmemory/server/ui.py:1-50

graph TD
    A[CLI Command] --> B{Profile Specified?}
    B -->|No| C[Use Current Profile]
    B -->|Yes| D[Use Named Profile]
    C --> E[Query Engine]
    D --> E
    E --> F{Isolation Check}
    F -->|Pass| G[Return Results]
    F -->|Fail| H[Empty Results]
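The isolation check in the diagram amounts to filtering every result set by the active profile. The sketch below assumes that facts carry a tag of the form profile:<name>, as in the sample recall result; the function name and that storage detail are assumptions, not confirmed internals.

```python
from typing import Any, Dict, Iterable, List


def visible_to_profile(facts: Iterable[Dict[str, Any]], profile: str) -> List[Dict[str, Any]]:
    """Keep only facts tagged for the given profile; anything else is dropped."""
    marker = f"profile:{profile}"
    return [fact for fact in facts if marker in fact.get("tags", ())]
```

A fact with no matching tag is excluded rather than raising an error, which corresponds to the "Empty Results" branch.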

Integration with AI Tools

The CLI is designed to work seamlessly with AI coding assistants and IDE integrations:

Supported Integrations

| Integration | Package | Description |
|---|---|---|
| LlamaIndex | llamaindex-superlocalmemory | Chat store adapter for conversation history |
| LangChain | langchain-superlocalmemory | Chat message history implementation |
| Claude Desktop | MCP | Model Context Protocol server |
| Cursor | MCP | AI IDE integration |
| Windsurf | MCP | AI-powered code editor |

Source: ide/integrations/llamaindex/README.md:1-40

Source: ide/integrations/langchain/README.md:1-60

LangChain Example

from langchain_core.messages import HumanMessage, AIMessage
from langchain_superlocalmemory import SuperLocalMemoryChatMessageHistory

# Session-isolated chat history
history = SuperLocalMemoryChatMessageHistory(session_id="debug-session-42")
history.add_messages([
    HumanMessage(content="The login API returns 500 on production"),
    AIMessage(content="Let me check the error logs..."),
])

# Messages persist locally and are accessible via slm recall

Common Issues

Recall Hangs with No Response

Symptom: slm remember or slm recall hangs indefinitely.

Solution: This was fixed in v3.3.19. Upgrade to the latest version:

npm install -g superlocalmemory@latest
# or
pip install --upgrade superlocalmemory

Source: GitHub Issue #11

Windows Clone Issues

Symptom: git clone fails with "invalid path" error due to files with colons in bin/.

Solution: Fixed in v2.8.2. Update to a newer version, or clone without checking out the files under bin/:

git clone --no-checkout https://github.com/qualixar/superlocalmemory.git
cd superlocalmemory
git checkout HEAD -- ':!bin/*'

Source: GitHub Issue #7

Profile Isolation Concerns

For multi-agent setups where users want distinct memory scopes (personal/global/shared), this is tracked in the Multi-Scope Memory RFC. Current implementation stores all memories in a flat namespace with profile-based filtering. Source: GitHub Issue #20

MCP Tools

Related topics: CLI Reference

SuperLocalMemory V3 provides a comprehensive Model Context Protocol (MCP) server implementation that enables AI tools and integrated development environments (IDEs) to interact with the local memory system. The MCP Tools layer serves as the primary integration mechanism for Claude Desktop, Cursor, Windsurf, and 17+ other AI-powered development tools.

Overview

The MCP Tools module implements the Model Context Protocol specification, providing a standardized interface for AI assistants to store, retrieve, and manage memories without leaving the development environment. This integration eliminates context window friction by making relevant memories available automatically during coding sessions.

Key capabilities:

  • Semantic memory storage and retrieval via 4-channel SpreadingActivation
  • Real-time memory context injection into AI conversations
  • Profile-scoped memory isolation for multi-agent workflows
  • Trust-aware retrieval with provenance tracking
  • Zero-LLM inference mode (Mode A) for local-only operation

Source: src/superlocalmemory/mcp/server.py

Architecture

The MCP implementation follows a client-server architecture where the SLM daemon acts as the MCP host and AI tools act as MCP clients. The architecture supports both local-only operation and OpenAI-compatible API backends.

graph TD
    subgraph "MCP Client Layer"
        Claude["Claude Desktop"]
        Cursor["Cursor IDE"]
        Windsurf["Windsurf"]
        Continue["Continue.dev"]
        Cody["Cody by Sourcegraph"]
    end

    subgraph "MCP Protocol Layer"
        JSONRPC["JSON-RPC 2.0 Transport"]
        Tools["Tool Handlers"]
        Resources["Resource Handlers"]
        Prompts["Prompt Templates"]
    end

    subgraph "SLM Core Layer"
        MemoryEngine["V3 MemoryEngine"]
        SpreadingActivation["4-Channel Retrieval"]
        ContextCache["Context Cache"]
        TrustEngine["Trust Scoring Engine"]
    end

    subgraph "Storage Layer"
        SQLite["SQLite Database<br/>~/.superlocalmemory/memory.db"]
        VFS["Vector Store<br/>nomic-embed-text-v1.5"]
        GraphDB["Graph Store<br/>Temporal + Structural"]
    end

    Claude --> JSONRPC
    Cursor --> JSONRPC
    Windsurf --> JSONRPC
    Continue --> JSONRPC
    Cody --> JSONRPC

    JSONRPC --> Tools
    JSONRPC --> Resources
    JSONRPC --> Prompts

    Tools --> MemoryEngine
    Resources --> MemoryEngine
    Prompts --> MemoryEngine

    MemoryEngine --> SpreadingActivation
    SpreadingActivation --> ContextCache
    SpreadingActivation --> TrustEngine

    SpreadingActivation --> SQLite
    SpreadingActivation --> VFS
    SpreadingActivation --> GraphDB

Component Responsibilities

| Component | Responsibility | File Reference |
|---|---|---|
| MCP Server | Protocol implementation, transport handling | server.py:1-150 |
| Tool Handlers | Execute memory operations via MCP protocol | tools.py:1-200 |
| Memory Engine | Core retrieval and storage logic | V3 Engine integration |
| Context Cache | Pre-warm cache for low-latency retrieval | prewarm.py:1-100 |
| API Server | REST endpoint for dashboard and external access | api.py:1-150 |

Available MCP Tools

SuperLocalMemory V3 exposes 6 primary MCP tools for memory management operations. Each tool corresponds to a CLI command and provides equivalent functionality through the protocol.

Tool Reference

| Tool Name | Description | Parameters | Return Type |
|---|---|---|---|
| slm_remember | Store new memory with automatic importance scoring | content, project, tags, importance | fact_id, topic_sig |
| slm_recall | Semantic search with 4-channel retrieval | query, limit, profile | List of memory entries |
| slm_list_recent | List recently accessed memories | limit, profile | List of memory entries |
| slm_status | System health and memory statistics | None | Status object |
| slm_build_graph | Generate relationship graph for memories | query, depth | Graph data |
| slm_switch_profile | Change active memory profile | profile_name | Confirmation |

Tool Specifications

#### slm_remember

Stores new information in the memory system with automatic topic signature computation and importance scoring.

# MCP tool signature
slm_remember(
    content: str,      # The memory content to store
    project: str = "", # Project identifier (optional)
    tags: list[str] = [], # Custom tags for filtering
    importance: int = 5  # 1-10 importance score
) -> {
    "fact_id": str,
    "topic_sig": str,
    "created_at": int
}

Behavior:

  1. Computes topic signature from content using local embeddings
  2. Applies automatic importance scoring based on content analysis
  3. Stores in SQLite with profile isolation
  4. Updates vector index for semantic retrieval
  5. Triggers cognitive consolidation check

Source: tools.py
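
On the wire, an MCP client invokes a tool with a JSON-RPC 2.0 `tools/call` request. The sketch below builds such a request for slm_remember using the parameters from the signature above; the helper name `build_tool_call` is hypothetical, and the argument values are examples only.

```python
import json
from itertools import count

_request_ids = count(1)  # JSON-RPC requests need a unique id per call


def build_tool_call(name: str, arguments: dict) -> str:
    """Serialize an MCP tools/call request as a JSON-RPC 2.0 message."""
    return json.dumps(
        {
            "jsonrpc": "2.0",
            "id": next(_request_ids),
            "method": "tools/call",
            "params": {"name": name, "arguments": arguments},
        }
    )


request = build_tool_call(
    "slm_remember",
    {
        "content": "The app uses port 5432 for PostgreSQL",
        "project": "backend",
        "tags": ["database", "config"],
        "importance": 7,
    },
)
print(request)
```

MCP clients such as the IDEs listed in this section normally construct these messages themselves, so a hand-built request like this is mainly useful for debugging a server over stdio.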

#### slm_recall

Performs semantic search using the 4-channel SpreadingActivation algorithm.

# MCP tool signature
slm_recall(
    query: str,        # Search query
    limit: int = 10,   # Maximum results
    profile: str = ""  # Profile scope (empty = current)
) -> {
    "results": [
        {
            "fact_id": str,
            "content": str,
            "trust_score": float,
            "relevance": float,
            "channel_scores": {...}
        }
    ],
    "total": int,
    "query_time_ms": float
}

Retrieval Channels:

  1. Semantic — Vector similarity using nomic-embed-text-v1.5
  2. Lexical — BM25 keyword matching
  3. Temporal — Time-decay weighted retrieval
  4. Structural — Topic graph proximity scoring

Source: src/superlocalmemory/core/retrieval.py
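
How the four channel scores are combined into a single relevance value is not specified in this section. Purely as an illustration, the sketch below uses a weighted average with hypothetical weights; the actual SpreadingActivation algorithm may work quite differently.

```python
CHANNELS = ("semantic", "lexical", "temporal", "structural")

# Hypothetical weights, chosen only for illustration.
DEFAULT_WEIGHTS = {"semantic": 0.4, "lexical": 0.3, "temporal": 0.15, "structural": 0.15}


def fuse(channel_scores: dict, weights: dict = DEFAULT_WEIGHTS) -> float:
    """Weighted average of per-channel scores; a missing channel counts as 0."""
    total_weight = sum(weights[c] for c in CHANNELS)
    weighted = sum(weights[c] * channel_scores.get(c, 0.0) for c in CHANNELS)
    return weighted / total_weight


def rank(candidates: list, limit: int = 10) -> list:
    """Attach a fused relevance to each candidate and return the top `limit`."""
    scored = [{**c, "relevance": fuse(c["channel_scores"])} for c in candidates]
    return sorted(scored, key=lambda c: c["relevance"], reverse=True)[:limit]


candidates = [
    {"fact_id": "fact_b", "channel_scores": {"temporal": 1.0}},
    {"fact_id": "fact_a", "channel_scores": {"semantic": 0.9, "lexical": 0.8}},
]
for hit in rank(candidates):
    print(hit["fact_id"], round(hit["relevance"], 3))
```

A weighted average keeps the result in the same range as the inputs, which makes the relevance field comparable across queries even when some channels return nothing.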

#### slm_status

Returns system health, memory statistics, and configuration state.

# Return type
{
    "version": str,
    "mode": "A" | "B" | "C",
    "daemon_running": bool,
    "profile": str,
    "stats": {
        "total_memories": int,
        "total_facts": int,
        "total_projects": int,
        "cache_size": int
    },
    "llm": {
        "provider": str,
        "model": str,
        "available": bool
    }
}

Source: src/superlocalmemory/cli/main.py

MCP Resources

MCP Resources provide read-only access to memory data, suitable for context injection without tool execution overhead.

| Resource URI | Description | Refresh Policy |
|---|---|---|
| slm://memories/recent | Last 20 accessed memories | On-demand |
| slm://memories/by-project/{project} | Memories for specific project | On-demand |
| slm://profile/current | Current active profile | On-demand |
| slm://system/status | System health snapshot | 30-second cache |

MCP Prompts

Pre-defined prompt templates for common memory operations:

| Prompt Name | Description | Variables |
|---|---|---|
| summarize-project | Generate project summary from memories | project_name |
| find-related | Find memories related to current context | current_topic |
| learning-summary | Summarize learned patterns | time_range |

Configuration

MCP Server Settings

The MCP server is configured through slm config or environment variables:

| Setting | Environment Variable | Default | Description |
|---|---|---|---|
| Server Host | SLM_HOST | 127.0.0.1 | Bind address for MCP connections |
| Server Port | SLM_PORT | 8765 | Port for MCP JSON-RPC transport |
| Transport | SLM_TRANSPORT | stdio | Transport mode (stdio/sse) |

Note: The SLM_HOST feature request (Issue #23) addresses the limitation of hardcoded 127.0.0.1 for multi-machine deployments in trusted networks.

IDE-Specific Configuration

Each supported IDE requires a configuration file pointing to the MCP server:

{
  "mcpServers": {
    "superlocalmemory": {
      "command": "npx",
      "args": ["-y", "superlocalmemory@latest", "mcp"]
    }
  }
}

Supported IDEs and configuration locations:

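| IDE | Configuration File |
|---|---|
| Claude Desktop | ~/Library/Application Support/Claude/claude_desktop_config.json (macOS), %APPDATA%\Claude\claude_desktop_config.json (Windows) |
| Cursor | .cursor/mcp.json in project |
| Windsurf | .windsurf/mcp_config.json |
| Continue.dev | .continue/config.json |
| Cody | Sourcegraph dashboard settings |
| ChatGPT | ChatGPT Desktop settings |
| Perplexity | Perplexity Desktop settings |
| Zed | .zed/mcp.json |
| OpenCode | OpenCode MCP settings |
| Antigravity | Antigravity MCP config |
| Aider | ~/.aider.conf.yml |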

Source: ide/configs
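
When a configuration file already lists other MCP servers, the superlocalmemory entry should be merged in rather than overwriting the file. The following sketch shows one way to do that; the helper name `add_slm_server` is hypothetical, and it assumes the file, if present, contains valid JSON with the `mcpServers` layout shown above.

```python
import json
from pathlib import Path

# The server entry from the configuration example above.
SLM_ENTRY = {"command": "npx", "args": ["-y", "superlocalmemory@latest", "mcp"]}


def add_slm_server(config_path: Path) -> dict:
    """Add the superlocalmemory entry to an MCP config, keeping other servers."""
    config = json.loads(config_path.read_text()) if config_path.exists() else {}
    config.setdefault("mcpServers", {})["superlocalmemory"] = SLM_ENTRY
    config_path.parent.mkdir(parents=True, exist_ok=True)
    config_path.write_text(json.dumps(config, indent=2) + "\n")
    return config
```

For example, `add_slm_server(Path(".cursor/mcp.json"))` would create or update a project-level Cursor configuration. Backing up the original file first is advisable, since the rewrite normalizes formatting.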

Integration with LangChain and LlamaIndex

Beyond native MCP support, SuperLocalMemory provides direct integrations with popular AI development frameworks.

LangChain Integration

The langchain-superlocalmemory package implements BaseChatMessageHistory for storing conversation history:

from langchain_core.messages import AIMessage, HumanMessage
from langchain_superlocalmemory import SuperLocalMemoryChatMessageHistory

history = SuperLocalMemoryChatMessageHistory(session_id="my-session")
history.add_messages([
    HumanMessage(content="What is SuperLocalMemory?"),
    AIMessage(content="It's a local-first memory system for AI assistants."),
])

Features:

  • Session isolation via session_id
  • All messages stored in ~/.superlocalmemory/memory.db
  • Compatible with LangChain chains and agents
  • Messages visible via CLI and MCP tools

Source: ide/integrations/langchain/README.md

LlamaIndex Integration

The llamaindex-superlocalmemory package provides chat storage:

from llamaindex_superlocalmemory import SuperLocalMemoryChatStore

chat_store = SuperLocalMemoryChatStore(
    session_key="user-session-123",
    db_path="/path/to/custom/memory.db"
)

Features:

  • Async support via BaseChatStore
  • Tag-based session isolation: llamaindex:chat:<session_key>
  • Messages queryable via SLM CLI and MCP

Source: ide/integrations/llamaindex/README.md

Common Usage Patterns

Context Pre-warming

The system automatically pre-warms context for known tool calls to reduce latency:

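# From prewarm.py - automatic context caching
import time

# Assumed import path, inferred from the file name core/context_cache.py
from superlocalmemory.core.context_cache import CacheEntry, ContextCache


def _upsert_cache(
    session_id: str,
    topic_sig: str,
    content: str,
    fact_ids: list[str],
) -> None:
    # Store (or refresh) the cached context for this session and topic
    cache = ContextCache()
    cache.upsert(CacheEntry(
        session_id=session_id,
        topic_sig=topic_sig,
        content=content,
        fact_ids=tuple(fact_ids),
        provenance="prewarm_post_tool",
        computed_at=int(time.time()),
    ))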

This pattern ensures memories related to active sessions are immediately available without triggering full retrieval.

Multi-Profile Workflows

For teams running multiple specialized agents, profile isolation ensures memory separation:

# Create profile for specialized agent
slm profile create coding-agent

# Store memory scoped to this profile
slm remember "Python asyncio best practices" --project python-tips

# Recall uses current profile context automatically
slm recall "async patterns"

Note: Multi-scope memory (personal/global/shared) is a requested feature (Issue #20) that would extend the current flat namespace model.

Troubleshooting

Long Response Times

If slm remember or slm recall takes excessive time, this was a known issue fixed in v3.3.19 (Issue #11). Upgrade to the latest version:

npm install -g superlocalmemory@latest

Mode B API Key Not Working

When using OpenAI-compatible providers in Mode B, the api_key may be silently dropped. Check the Mode B branch of SLMConfig.for_mode() in src/superlocalmemory/core/config.py (Issue #9).

Docker/Linux Container Issues

For cognitive consolidation issues on Linux/Docker setups, refer to Issue #26 for environment-specific guidance.

Security Considerations

The MCP server enforces profile isolation on all query endpoints. Memories from one profile cannot leak to another, ensuring compliance with enterprise security requirements introduced in v2.6.0.

Key security features:

  • Profile-scoped access control on all MCP tools
  • API key validation for external providers (Mode B)
  • Trust scoring to flag potentially unreliable memories
  • Audit trail for compliance (v2.8.0 enterprise compliance)

Memory Lifecycle

Related topics: Retrieval Pipeline

Memory Lifecycle is a core system in SuperLocalMemory that automatically organizes memories over time based on usage patterns, ensuring the memory system remains fast, relevant, and scalable. Introduced in v2.8.0, this feature manages the complete journey of a memory from creation through consolidation to potential archival or deletion.

Overview

As you interact with SuperLocalMemory, the system accumulates memories at different rates depending on your workflow. Without lifecycle management, a growing memory database leads to degraded recall performance and storage bloat. The Memory Lifecycle system addresses these challenges by:

  • Automatically organizing memories based on usage frequency and recency
  • Consolidating related memories into coherent knowledge units
  • Managing storage efficiency through intelligent quantization
  • Maintaining relevance by prioritizing frequently accessed memories
  • Preserving user privacy through profile isolation during all operations

Source: CHANGELOG.md - v2.8.0

Architecture

The Memory Lifecycle system consists of several interconnected components that work together to manage memories throughout their lifetime.

graph TD
    A[Memory Created<br/>slm remember] --> B[Initial Storage<br/>memory.db]
    B --> C[Usage Tracking<br/>Access Count + Recency]
    C --> D{Lifecycle Decision<br/>Engine}
    D -->|Frequently Used| E[Active Memory<br/>Priority Queue]
    D -->|Infrequent Access| F[Consolidation Queue]
    D -->|Stale + Redundant| G[Archive/Prune]
    E --> H[Fast Recall Path]
    F --> I[Consolidation Worker]
    I --> J[Quantized Storage]
    J --> K[Compressed Facts]
    G --> L[Audit Trail]
    H --> M[Context Cache]
    M --> N[Instant Recall]
    
    style A fill:#e1f5fe
    style H fill:#c8e6c9
    style I fill:#fff3e0
    style G fill:#ffcdd2

Core Components

| Component | File | Purpose |
|---|---|---|
| ConsolidationEngine | core/consolidation_engine.py | Orchestrates lifecycle decisions and manages memory states |
| Consolidator | encoding/consolidator.py | Performs the actual memory merging and deduplication |
| ConsolidationWorker | learning/consolidation_worker.py | Background worker for async consolidation tasks |
| QuantizedStore | storage/quantized_store.py | Storage layer with compression and quantization |
| ContextCache | core/context_cache.py | Caches frequently accessed contexts for fast recall |

Source: src/superlocalmemory/core/consolidation_engine.py

Lifecycle States

Memories transition through distinct states during their lifecycle. Understanding these states helps you troubleshoot recall issues and optimize memory management.

stateDiagram-v2
    [*] --> Created: slm remember
    Created --> Active: First access
    Active --> Active: Regular access
    Active --> Consolidated: Consolidation trigger
    Active --> Archived: Extended inactivity
    Consolidated --> Active: Referenced again
    Consolidated --> Archived: Further decay
    Archived --> Active: Re-accessed
    Archived --> Purged: TTL exceeded
    Purged --> [*]: Deleted
    
    note right of Active: Hot path<br/>Full indexing
    note right of Consolidated: Compressed<br/>Quantized storage
    note right of Archived: Minimal footprint<br/>Audit preserved

State Definitions

| State | Description | Storage Format | Recall Speed |
|---|---|---|---|
| Created | Newly added via slm remember | Full text + embeddings | Fast |
| Active | Recently accessed memories | Indexed, full fidelity | Fastest |
| Consolidated | Merged with similar memories | Quantized, compressed | Moderate |
| Archived | Inactive for extended period | Minimal metadata | Slower |
| Purged | Removed from active storage | Audit trail only | N/A |

Source: src/superlocalmemory/encoding/consolidator.py
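
The state diagram can be expressed as a transition table, which is handy when reasoning about whether a given state change is legal. This is a sketch derived only from the diagram above; the `State` enum and `transition` function are hypothetical names, not part of the SuperLocalMemory API.

```python
from enum import Enum


class State(Enum):
    CREATED = "created"
    ACTIVE = "active"
    CONSOLIDATED = "consolidated"
    ARCHIVED = "archived"
    PURGED = "purged"


# Allowed transitions, copied from the state diagram.
TRANSITIONS = {
    State.CREATED: {State.ACTIVE},
    State.ACTIVE: {State.ACTIVE, State.CONSOLIDATED, State.ARCHIVED},
    State.CONSOLIDATED: {State.ACTIVE, State.ARCHIVED},
    State.ARCHIVED: {State.ACTIVE, State.PURGED},
    State.PURGED: set(),  # terminal: a purged memory is deleted
}


def transition(current: State, target: State) -> State:
    """Return the target state, or raise if the diagram does not allow the move."""
    if target not in TRANSITIONS[current]:
        raise ValueError(f"illegal transition: {current.name} -> {target.name}")
    return target


state = transition(State.CREATED, State.ACTIVE)
state = transition(state, State.CONSOLIDATED)
print(state.name)
```

Note that the diagram has no direct path from Created to Archived or from Consolidated to Purged, so the table rejects both.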

Consolidation Process

Consolidation is the core mechanism that keeps your memory system efficient. It runs automatically based on configurable triggers.

Trigger Conditions

Consolidation is triggered when specific conditions are met:

  1. Frequency Threshold: Memory accessed fewer than X times in Y days
  2. Semantic Redundancy: Multiple memories with high Fisher-Rao similarity
  3. Temporal Clustering: Memories created within the same session/context
  4. Topic Signature Collision: Memories sharing similar topic signatures

Source: src/superlocalmemory/core/topic_signature.py

Consolidation Workflow

sequenceDiagram
    participant U as User/Agent
    participant DA as Daemon
    participant CE as ConsolidationEngine
    participant CW as ConsolidationWorker
    participant QS as QuantizedStore
    participant DB as memory.db
    
    U->>DA: slm remember "fact"
    DA->>DB: Store memory
    Note over CE: Periodic check<br/>(configurable interval)
    CE->>DB: Query usage patterns
    CE->>CE: Evaluate consolidation candidates
    CE->>CW: Queue consolidation task
    CW->>QS: Read candidate memories
    QS-->>CW: Decompressed data
    CW->>CW: Merge & deduplicate
    CW->>QS: Write consolidated memory
    QS->>DB: Update storage
    Note over DB: Original audit trail<br/>preserved

Consolidation Worker

The ConsolidationWorker runs as a background task, processing consolidation in batches to avoid blocking the main application.

# From learning/consolidation_worker.py - Batch processing pattern
def process_batch(self, candidates: list[str]) -> ConsolidationResult:
    """
    Process a batch of memories for consolidation.
    Returns result with merged facts and storage savings.
    """

Key behaviors:

  • Processes memories in configurable batch sizes
  • Runs during idle periods to minimize performance impact
  • Maintains full audit trail of consolidation operations
  • Supports rollback if consolidation fails

Source: src/superlocalmemory/learning/consolidation_worker.py

Context Cache Integration

The Memory Lifecycle system integrates with the Context Cache to provide instant recall for active memories.

Cache Entry Structure

# From core/context_cache.py
# The @dataclass decorator is an assumption; the excerpt shows only the fields.
from dataclasses import dataclass


@dataclass
class CacheEntry:
    session_id: str             # Which session created this cache
    topic_sig: str              # Topic signature for the context
    content: str                # Cached context content
    fact_ids: tuple[str, ...]   # Memory IDs included in this cache
    provenance: str             # How this was computed
    computed_at: int            # Unix timestamp

Source: src/superlocalmemory/core/context_cache.py

Prewarm Mechanism

The prewarm system proactively caches contexts before they're needed:

  1. After slm remember — caches the new memory in context
  2. After successful recall — caches retrieved facts
  3. During session start — warms cache based on topic signatures

graph LR
    A[slm remember] --> B[Compute topic_sig]
    B --> C[Upsert cache entry]
    C --> D[Session warm start]
    
    E[slm recall] --> F[Retrieve memories]
    F --> G[Update cache]
    G --> D

Source: src/superlocalmemory/server/routes/prewarm.py

Cache Management

Clearing Cache DBs

The slm clear-cache command removes regenerable cache databases while preserving user memories:

# Remove only cache databases
slm clear-cache

# Output:
# Removed cache DBs:
#   - active_brain_cache.db
#   - context_cache.db
#   - entity_trigram_cache.db
# memory.db / learning.db preserved (user memories are safe)

Cache databases removed:

| Database | Purpose | Regenerated? |
|---|---|---|
| active_brain_cache.db | Active memory indexing | Yes |
| context_cache.db | Context prewarm data | Yes |
| entity_trigram_cache.db | Lexical search index | Yes |

Protected databases (never removed):

  • memory.db — User memories
  • learning.db — User preferences and feedback
  • audit.db — Compliance audit trail
  • audit_chain.db — Immutable audit chain

Source: src/superlocalmemory/cli/escape_hatch.py
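
The allow-list behavior described above can be sketched as follows. The file names come from the two lists in this section; the function `clear_cache` is a hypothetical stand-in and is not the code in escape_hatch.py.

```python
from pathlib import Path

CACHE_DBS = ("active_brain_cache.db", "context_cache.db", "entity_trigram_cache.db")
PROTECTED_DBS = ("memory.db", "learning.db", "audit.db", "audit_chain.db")


def clear_cache(data_dir: Path) -> list[str]:
    """Delete only the regenerable cache databases; return the names removed."""
    removed = []
    for name in CACHE_DBS:
        # Guard against an editing mistake that puts a protected DB on the list.
        assert name not in PROTECTED_DBS
        path = data_dir / name
        if path.exists():
            path.unlink()
            removed.append(name)
    return removed
```

Deleting by explicit name, rather than by a pattern such as `*.db`, is what keeps memory.db and the audit databases safe. A real implementation would probably also need to handle SQLite's `-wal` and `-shm` sidecar files if write-ahead logging is in use.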

Quantized Storage

The QuantizedStore provides storage optimization for consolidated memories, reducing disk usage while maintaining retrieval quality.

Storage Format

| Memory Type | Format | Compression |
|---|---|---|
| Active | Full text + vectors | None |
| Consolidated | Semantic tokens only | 60-80% size reduction |
| Archived | Metadata only | 90%+ size reduction |

Source: src/superlocalmemory/storage/quantized_store.py

Retrieval Behavior

When a consolidated memory is accessed:

  1. Decompress from quantized format
  2. Reconstruct full semantic representation
  3. Update access statistics (may move back to Active)
  4. Return to requestor

Configuration

Lifecycle Configuration Options

| Setting | Default | Description |
|---|---|---|
| consolidation_interval | 24 hours | How often consolidation runs |
| batch_size | 50 | Memories processed per batch |
| frequency_threshold | 3 accesses/week | Below this triggers consolidation |
| similarity_threshold | 0.85 | Fisher-Rao similarity for merging |
| archive_after_days | 90 | Days inactive before archival |
| purge_after_days | 365 | Days archived before purge |
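
To make the interaction between these thresholds concrete, the sketch below applies the documented defaults to a single memory. The order of the checks, the inclusive comparisons, and the names `LifecycleConfig` and `decide` are all assumptions made for illustration; similarity-based merging is left out.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class LifecycleConfig:
    frequency_threshold: int = 3    # accesses per week
    archive_after_days: int = 90    # days inactive before archival
    purge_after_days: int = 365     # days archived before purge


def decide(
    state: str,
    accesses_last_week: int,
    days_inactive: int,
    days_archived: int = 0,
    cfg: Optional[LifecycleConfig] = None,
) -> str:
    """Return the lifecycle action suggested by the thresholds (assumed ordering)."""
    cfg = cfg or LifecycleConfig()
    if state == "archived":
        return "purge" if days_archived >= cfg.purge_after_days else "keep"
    if days_inactive >= cfg.archive_after_days:
        return "archive"
    if accesses_last_week < cfg.frequency_threshold:
        return "consolidation_candidate"
    return "keep"


print(decide("active", accesses_last_week=1, days_inactive=10))
```

Under these assumptions, raising frequency_threshold makes more memories eligible for consolidation, while lowering archive_after_days moves idle memories out of the active set sooner.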

Cache Configuration

| Setting | Default | Description |
|---|---|---|
| cache_ttl_seconds | 3600 | Context cache entry TTL |
| max_cache_entries | 1000 | Maximum cached contexts |
| prewarm_on_remember | true | Auto-cache after slm remember |
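
The effect of cache_ttl_seconds and max_cache_entries can be illustrated with a small in-memory cache. This is a sketch of the general technique (time-based expiry plus oldest-first eviction), not the ContextCache implementation; the class name `TTLCache` and the choice to evict the oldest entry are assumptions.

```python
import time
from collections import OrderedDict


class TTLCache:
    """Size-bounded cache whose entries expire after ttl_seconds."""

    def __init__(self, ttl_seconds=3600, max_entries=1000, clock=time.time):
        self.ttl_seconds = ttl_seconds
        self.max_entries = max_entries
        self._clock = clock
        self._entries = OrderedDict()  # key -> (stored_at, value)

    def upsert(self, key, value):
        self._entries.pop(key, None)           # re-inserting moves the key to the end
        self._entries[key] = (self._clock(), value)
        while len(self._entries) > self.max_entries:
            self._entries.popitem(last=False)  # evict the oldest entry

    def get(self, key):
        item = self._entries.get(key)
        if item is None:
            return None
        stored_at, value = item
        if self._clock() - stored_at > self.ttl_seconds:
            del self._entries[key]             # expired
            return None
        return value


cache = TTLCache()
cache.upsert(("session-1", "topic-abc"), "cached context")
print(cache.get(("session-1", "topic-abc")))
```

The injectable `clock` argument exists only so that expiry can be tested without waiting; the key shown pairs a session ID with a topic signature, mirroring the fields of CacheEntry.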

Troubleshooting

Common Issues

#### Issue: Slow Recall Despite Many Memories

Symptoms: slm recall takes longer than expected

Possible Causes:

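  1. Consolidation not running: check the daemon logs
  2. Cache recently cleared: the first recalls after slm clear-cache are slower while the daemon rebuilds the cache DBs
  3. Too many active memories: consider raising frequency_threshold so that more memories qualify for consolidation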

Resolution:

# Check daemon status
slm serve status

# Verify consolidation is running
# Look for "consolidation_worker" in logs

# Force cache rebuild
slm clear-cache  # Removes only cache DBs
# Daemon will rebuild on next access

#### Issue: Memories Not Consolidating

Symptoms: Memory count keeps growing, consolidation never reduces it

Possible Causes:

  1. Memories are too diverse (low similarity)
  2. Access frequency above threshold
  3. Consolidation worker disabled

Resolution: Verify settings in ~/.superlocalmemory/config.json

#### Issue: Linux/Docker Memory Lifecycle Issues

Symptoms: Reported in Issue #26 — consolidation and trace issues on Linux/Docker

Known Workaround: Ensure the daemon has write access to the data directory and that the user running the container matches the file ownership of ~/.superlocalmemory/.

# Fix ownership on Linux/Docker
chown -R $(id -u):$(id -g) ~/.superlocalmemory

The Memory Lifecycle system works closely with these related systems:

| Feature | Integration Point | Documentation |
|---|---|---|
| Behavioral Learning | Learns from consolidation outcomes to improve future decisions | Behavioral Learning |
| Trust System | Lifecycle decisions influenced by trust scores | Trust System |
| Profile Isolation | Lifecycle respects profile boundaries | Profile Management |
| Audit Trail | All lifecycle changes logged for compliance | Enterprise Compliance |

Multi-Machine Mesh

Related topics: CLI Reference

SuperLocalMemory Multi-Machine Mesh enables distributed memory synchronization across multiple machines on a trusted network. This feature extends the local-first memory architecture to a fleet of machines, allowing AI agents running on different hosts to share and access consolidated memory.

Community Note: This feature addresses a highly requested capability from users running the Qualixar stack across home lab environments. See GitHub Issue #23 for the original feature request from users running SLM on 7-machine WireGuard meshes.

Overview

The mesh architecture consists of interconnected SuperLocalMemory instances that communicate to maintain a synchronized view of shared memories. Each machine in the mesh operates as an independent memory node while selectively sharing designated memories with peer nodes.

| Component | Role |
|---|---|
| Mesh Broker | Coordinates communication between mesh nodes |
| Remote Sync | Handles memory synchronization protocol |
| Mesh MCP Tools | Exposes mesh operations via MCP interface |
| Unified Daemon | Manages daemon lifecycle and mesh bindings |

Architecture

graph TD
    A[Machine A - SLM Node] -->|Mesh Broker| B[WireGuard/VPN Network]
    C[Machine B - SLM Node] -->|Mesh Broker| B
    D[Machine C - SLM Node] -->|Mesh Broker| B
    
    B -->|Sync Protocol| A
    B -->|Sync Protocol| C
    B -->|Sync Protocol| D
    
    A -->|Local Memory| A1[(Local SQLite)]
    C -->|Local Memory| C1[(Local SQLite)]
    D -->|Local Memory| D1[(Local SQLite)]
    
    A1 -->|Remote Sync| Shared[(Shared Memory Pool)]
    C1 -->|Remote Sync| Shared
    D1 -->|Remote Sync| Shared

Design Principles

The mesh system is built on three core principles derived from the local-first architecture:

  1. Selective Sharing — Only explicitly marked memories propagate across nodes
  2. Conflict Resolution — Last-write-wins with provenance tracking
  3. Trust Boundaries — Mesh operates within authenticated network boundaries (e.g., WireGuard VPN)
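
Last-write-wins with provenance tracking can be sketched as below. The `Version` record and its field names are hypothetical; the tie-break on node ID is an added assumption so that all nodes deterministically pick the same winner when timestamps are equal.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Version:
    content: str
    updated_at: float  # Unix timestamp of the write
    node_id: str       # provenance: which node produced this write


def resolve(local: Version, remote: Version) -> Version:
    """Last-write-wins; equal timestamps fall back to comparing node_id."""
    return max(local, remote, key=lambda v: (v.updated_at, v.node_id))


a = Version("Postgres runs on port 5432", 100.0, "machine-a")
b = Version("Postgres runs on port 5433", 105.0, "machine-b")
winner = resolve(a, b)
print(winner.content, "from", winner.node_id)
```

Last-write-wins depends on reasonably synchronized clocks across the mesh; keeping the node ID on every version means a surprising result can at least be traced back to the machine that wrote it.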

Configuration

Environment Variables

| Variable | Default | Description |
|---|---|---|
| SLM_HOST | 127.0.0.1 | Bind address for SLM daemon and mesh broker |
| SLM_MESH_PORT | 8766 | Port for mesh broker communication |
| SLM_MESH_TOKEN | (generated) | Authentication token for mesh nodes |
| SLM_DATA_DIR | ~/.superlocalmemory | Base directory for memory storage |

Known Limitation: As of the current release, SLM_DATA_DIR is documented but may not be fully implemented in all components. See GitHub Issue #10 for tracking.

Daemon Binding

The unified daemon binds to a configurable host address to enable mesh connectivity:

# From unified_daemon.py
# The daemon can be configured to bind to non-localhost addresses
# enabling cross-machine communication on trusted networks

Source: src/superlocalmemory/server/unified_daemon.py

Mesh Components

Mesh Broker

The mesh broker (broker.py) serves as the central coordination point for mesh operations:

# Conceptual structure based on available source references
class MeshBroker:
    def __init__(self, host: str, port: int, token: str):
        ...
    
    def register_node(self, node_id: str, endpoint: str) -> bool:
        """Register a new node in the mesh."""
        ...
    
    def broadcast(self, message: MeshMessage) -> None:
        """Broadcast a message to all registered nodes."""
        ...

Source: src/superlocalmemory/mesh/broker.py

Remote Synchronization

The remote sync module (remote_sync.py) implements the synchronization protocol between nodes:

| Method | Purpose |
|---|---|
| sync_to_peers() | Push local changes to connected peers |
| pull_from_peers() | Pull remote changes into local store |
| resolve_conflicts() | Handle concurrent modifications |
| get_sync_status() | Return current synchronization state |

Source: src/superlocalmemory/mesh/remote_sync.py

MCP Mesh Tools

Mesh operations are exposed through the MCP interface for AI tool integration:

# MCP tools available for mesh operations
tools = [
    "mesh_list_nodes",      # List connected mesh peers
    "mesh_sync_memory",     # Trigger synchronization
    "mesh_share_memory",    # Share a specific memory across mesh
    "mesh_get_status"       # Get mesh connectivity status
]

Source: src/superlocalmemory/mcp/tools_mesh.py

Setup Procedures

Prerequisites

  1. SuperLocalMemory installed on all machines (~/.superlocalmemory/)
  2. Network connectivity between machines (WireGuard VPN recommended)
  3. Unique machine identifiers for each node
  4. Mesh authentication token shared across the fleet

Basic Setup

# 1. Generate mesh token on primary machine
slm mesh token generate

# 2. Join secondary machines to mesh
slm mesh join <primary-host>:<port> --token <generated-token>

# 3. Verify connectivity
slm mesh status

Network Configuration

For mesh communication across machines, configure the bind address:

# Set SLM_HOST to allow external connections
export SLM_HOST=0.0.0.0

# Or configure in slm config
slm config set mesh.host 0.0.0.0
slm config set mesh.port 8766

Security Consideration: Binding to 0.0.0.0 exposes the SLM daemon to all network interfaces. Only use in trusted environments such as a private VPN.

Multi-Scope Memory Integration

The mesh system integrates with the planned multi-scope memory architecture (see RFC #20):

graph LR
    subgraph "Personal Scope"
        P1[Machine A Memory]
        P2[Machine B Memory]
    end
    
    subgraph "Shared Scope"
        S1[Shared Knowledge Base]
    end
    
    P1 -->|Personal| S1
    P2 -->|Personal| S1

| Scope | Visibility | Sync Behavior |
|---|---|---|
| Personal | Local machine only | Never sync |
| Shared | All mesh nodes | Automatic sync |
| Team | Selected nodes | Selective sync |
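
Since multi-scope memory is still a planned feature, the following is only a sketch of how the sync behavior in the table might translate into code. The function `sync_targets`, the lowercase scope names, and the idea of passing team membership as a set of node IDs are all hypothetical.

```python
def sync_targets(scope: str, peers: set[str], team: frozenset[str] = frozenset()) -> set[str]:
    """Return the peer nodes a memory with the given scope should be sent to."""
    if scope == "personal":
        return set()          # never leaves the local machine
    if scope == "shared":
        return set(peers)     # every connected mesh node
    if scope == "team":
        return peers & team   # only selected nodes that are currently connected
    raise ValueError(f"unknown scope: {scope!r}")


peers = {"machine-a", "machine-b", "machine-c"}
print(sorted(sync_targets("team", peers, frozenset({"machine-b", "machine-z"}))))
```

Rejecting unknown scopes, rather than defaulting to shared, keeps a typo from unintentionally broadcasting a memory to the whole mesh.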

Troubleshooting

Connection Issues

| Symptom | Possible Cause | Resolution |
|---|---|---|
| Connection refused on port 8766 | Mesh broker not running | Run slm daemon start |
| Authentication failed | Token mismatch | Verify token on all nodes |
| Timeout waiting for peers | Network/firewall issue | Check VPN connectivity |

Docker/Linux Environments

Users running SLM in Docker containers report challenges with network binding (Issue #26):

# For Docker deployments, expose mesh ports explicitly
docker run -p 8765:8765 -p 8766:8766 superlocalmemory

# Ensure SLM_HOST is set to container IP or 0.0.0.0
docker run -e SLM_HOST=0.0.0.0 superlocalmemory

Performance Considerations

| Factor | Impact | Recommendation |
|---|---|---|
| Network latency | Sync delay | Use low-latency VPN |
| Memory size | Sync duration | Batch large memories |
| Node count | Coordination overhead | Limit to trusted fleet |

CLI Commands

# Mesh management commands
slm mesh status          # Show mesh connectivity status
slm mesh list            # List connected peer nodes
slm mesh sync            # Trigger immediate synchronization
slm mesh share <id>      # Share a memory to mesh peers
slm mesh revoke <id>     # Revoke mesh sharing for a memory
slm mesh leave           # Disconnect from mesh

Security Model

The mesh system implements trust boundaries based on network topology:

  1. Node Authentication — Mesh tokens verify node identity
  2. Profile Isolation — Memories from one profile cannot leak to another
  3. Scope Enforcement — Only shared-scope memories traverse the mesh

For production deployments, combine with:

  • WireGuard VPN for encrypted transport
  • Firewall rules restricting mesh ports
  • Regular token rotation via slm rotate-token
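
As one way to restrict mesh ports, the sketch below prints `ufw` rules that admit a VPN subnet and reject everyone else. It assumes a Linux host using ufw, that the mesh speaks TCP on port 8766 (the port configured earlier), and a placeholder subnet of 10.8.0.0/24. `mesh_firewall_rules` is a hypothetical helper that only prints the rules.

```shell
#!/bin/sh
# Hypothetical helper: print ufw rules that limit a mesh port to one subnet.
# ufw applies the first matching rule, so the allow rule must be added
# before the general deny rule.
mesh_firewall_rules() {
  subnet="$1"
  port="${2:-8766}"
  echo "ufw allow from $subnet to any port $port proto tcp"
  echo "ufw deny $port/tcp"
}

# Review the output, then run the printed commands with sudo.
mesh_firewall_rules 10.8.0.0/24
```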

Source: https://github.com/qualixar/superlocalmemory / Human Manual

Doramagic Pitfall Log

Source-linked risks stay visible on the manual page so the preview does not read like a recommendation.

Found 11 structured pitfall item(s), including 0 high/blocking item(s). Top priority: Installation risk - Installation risk requires verification.

1. Installation risk: Installation risk requires verification

  • Severity: medium
  • Finding: Project evidence flags an installation risk. Review the linked source before relying on this workflow.
  • User impact: May increase setup, validation, or first-run risk for the user.
  • Recommended check: Reproduce the official install and quickstart path in an isolated environment.
  • Evidence: community_evidence:github | cevd_8628e403827148fcb5f3b537c1af2263 | https://github.com/qualixar/superlocalmemory/issues/26

2. Installation risk: Installation risk requires verification

  • Severity: medium
  • Finding: Project evidence flags an installation risk. Review the linked source before relying on this workflow.
  • User impact: May increase setup, validation, or first-run risk for the user.
  • Recommended check: Reproduce the official install and quickstart path in an isolated environment.
  • Evidence: community_evidence:github | cevd_c74d27ed1bf2462585e76845639adfd5 | https://github.com/qualixar/superlocalmemory/issues/23

3. Installation risk: Installation risk requires verification

  • Severity: medium
  • Finding: Project evidence flags an installation risk. Review the linked source before relying on this workflow.
  • User impact: May increase setup, validation, or first-run risk for the user.
  • Recommended check: Reproduce the official install and quickstart path in an isolated environment.
  • Evidence: community_evidence:github | cevd_ba05e614f5a8499da175aa7ba09ac343 | https://github.com/qualixar/superlocalmemory/issues/20

4. Configuration risk: Configuration risk requires verification

  • Severity: medium
  • Finding: Project evidence flags a configuration risk. Review the linked source before relying on this workflow.
  • User impact: May increase setup, validation, or first-run risk for the user.
  • Recommended check: Reproduce the official install and quickstart path in an isolated environment.
  • Evidence: capability.host_targets | github_repo:1150546081 | https://github.com/qualixar/superlocalmemory

5. Capability evidence risk: Capability evidence risk requires verification

  • Severity: medium
  • Finding: README/documentation is current enough for a first validation pass.
  • User impact: May increase setup, validation, or first-run risk for the user.
  • Recommended check: Reproduce the official install and quickstart path in an isolated environment.
  • Evidence: capability.assumptions | github_repo:1150546081 | https://github.com/qualixar/superlocalmemory

6. Maintenance risk: Maintenance risk requires verification

  • Severity: medium
  • Finding: Project evidence flags a maintenance risk. Review the linked source before relying on this workflow.
  • User impact: May increase setup, validation, or first-run risk for the user.
  • Recommended check: Reproduce the official install and quickstart path in an isolated environment.
  • Evidence: evidence.maintainer_signals | github_repo:1150546081 | https://github.com/qualixar/superlocalmemory

7. Security or permission risk: Security or permission risk requires verification

  • Severity: medium
  • Finding: no_demo
  • User impact: May increase setup, validation, or first-run risk for the user.
  • Recommended check: Reproduce the official install and quickstart path in an isolated environment.
  • Evidence: downstream_validation.risk_items | github_repo:1150546081 | https://github.com/qualixar/superlocalmemory

8. Security or permission risk: Security or permission risk requires verification

  • Severity: medium
  • Finding: no_demo
  • User impact: May increase setup, validation, or first-run risk for the user.
  • Recommended check: Reproduce the official install and quickstart path in an isolated environment.
  • Evidence: risks.scoring_risks | github_repo:1150546081 | https://github.com/qualixar/superlocalmemory

9. Security or permission risk: Security or permission risk requires verification

  • Severity: medium
  • Finding: Project evidence flags a security or permission risk. Review the linked source before relying on this workflow.
  • User impact: May increase setup, validation, or first-run risk for the user.
  • Recommended check: Reproduce the official install and quickstart path in an isolated environment.
  • Evidence: community_evidence:github | cevd_be8654b4ef434c37bedf0a453e65f5d6 | https://github.com/qualixar/superlocalmemory/issues/7

10. Maintenance risk: Maintenance risk requires verification

  • Severity: low
  • Finding: issue_or_pr_quality=unknown.
  • User impact: May increase setup, validation, or first-run risk for the user.
  • Recommended check: Reproduce the official install and quickstart path in an isolated environment.
  • Evidence: evidence.maintainer_signals | github_repo:1150546081 | https://github.com/qualixar/superlocalmemory

11. Maintenance risk: Maintenance risk requires verification

  • Severity: low
  • Finding: release_recency=unknown.
  • User impact: May increase setup, validation, or first-run risk for the user.
  • Recommended check: Reproduce the official install and quickstart path in an isolated environment.
  • Evidence: evidence.maintainer_signals | github_repo:1150546081 | https://github.com/qualixar/superlocalmemory

Source: Doramagic discovery, validation, and Project Pack records

Community Discussion Evidence

These external discussion links are review inputs, not standalone proof that the project is production-ready.

Sources: 12. Count of project-level external discussion links exposed on this manual page.

Use: Review before install. Open the linked issues or discussions before treating the pack as ready for your environment.

Doramagic exposes project-level community discussion separately from official documentation. Review these links before using superlocalmemory with real data or production workflows.

Source: Project Pack community evidence and pitfall evidence