memsearch Manual - Doramagic.ai

Doramagic Project Pack · Human Manual

memsearch


Introduction to memsearch

Related topics: System Architecture, Design Philosophy, Quick Start Guide


memsearch is a cross-platform semantic memory system for AI coding agents — markdown-first, backed by Milvus vector database. It enables AI coding assistants to recall relevant context from past sessions, maintain project-specific knowledge, and provide more informed assistance based on historical decisions and conversations.

Overview

memsearch solves the context window limitation problem by creating an external, searchable memory layer. When an agent processes a new request, memsearch retrieves relevant past context through hybrid semantic and keyword search, allowing the agent to understand what was previously discussed, decided, or implemented.

The system operates on one principle: markdown files are the source of truth. All memory is stored as human-readable markdown in a local .memsearch/memory/ directory, and Milvus serves as a rebuildable shadow index for fast semantic search. Source: README.md

Key Capabilities

Capability	Description
Semantic Search	Find relevant memories using natural language queries via vector similarity
Keyword Search	BM25-based keyword matching for exact term matching
Hybrid Search	Combines semantic and keyword search for optimal recall
File Watching	Auto-index markdown changes as they occur
Plugin Integration	Native plugins for Claude Code, Codex, OpenClaw, and OpenCode
Local Embeddings	ONNX-based BGE-M3 embeddings without API keys

Architecture

memsearch follows a layered architecture where markdown files serve as the durable storage and Milvus provides fast search capabilities.

graph TB
    subgraph "Plugin Layer"
        ClaudeCode["Claude Code Plugin"]
        Codex["Codex Plugin"]
        OpenClaw["OpenClaw Plugin"]
        OpenCode["OpenCode Plugin"]
    end
    
    subgraph "CLI / API Layer"
        CLI["memsearch CLI"]
        API["MemSearch Python API"]
    end
    
    subgraph "Core Engine"
        Scanner["scanner.py<br/>File Discovery"]
        Chunker["chunker.py<br/>Markdown Splitting"]
        Embedder["Embedding Provider<br/>ONNX / OpenAI"]
        MilvusStore["MilvusStore<br/>Vector Storage"]
    end
    
    subgraph "Storage Layer"
        MemoryFiles[".memsearch/memory/<br/>Markdown Files"]
        MilvusDB["Milvus<br/>Vector Index"]
    end
    
    ClaudeCode --> CLI
    Codex --> CLI
    OpenClaw --> CLI
    OpenCode --> CLI
    CLI --> API
    API --> Scanner
    API --> Chunker
    Chunker --> Embedder
    Embedder --> MilvusStore
    API --> MilvusStore
    MemoryFiles --> Scanner
    MilvusStore --> MilvusDB

Core Components

#### 1. MemSearch Core (src/memsearch/__init__.py)

The main entry point for programmatic usage:

from memsearch import MemSearch

ms = MemSearch(["./docs", "./memory"])
ms.index()
results = ms.search("batch size optimization")

The MemSearch class orchestrates scanning, chunking, embedding, and storage operations. Source: src/memsearch/__init__.py:1-7

#### 2. File Scanner (src/memsearch/scanner.py)

Recursively discovers markdown files across multiple paths:

from memsearch.scanner import scan_paths

files = scan_paths(["./docs", "./memory"])
# Returns list of ScannedFile(path, mtime, size)

Key features:

Recursive directory traversal
Hidden file/directory filtering (. prefix)
Deduplication via absolute path tracking
Supports .md and .markdown extensions

Source: src/memsearch/scanner.py:1-47

#### 3. Markdown Chunker (src/memsearch/chunker.py)

Splits markdown files into semantically meaningful chunks by headings:

from memsearch.chunker import clean_content_for_embedding

cleaned = clean_content_for_embedding(chunk_text)
# Removes HTML comments, collapses blank lines

Features:

Splits by heading levels (H1-H6)
Content cleaning for embeddings (removes HTML comments)
Minimum content length filtering
Metadata noise removal

Source: src/memsearch/chunker.py:1-47

#### 4. CLI (src/memsearch/cli.py)

Command-line interface with subcommands:

Command	Purpose
`memsearch index`	Index markdown files to Milvus
`memsearch search`	Query semantic memory
`memsearch expand`	Show full context around a chunk
`memsearch watch`	Monitor and auto-index changes
`memsearch stats`	Display collection statistics
`memsearch config`	Manage configuration

Source: src/memsearch/cli.py:1-50

Data Flow

Capture Flow (Writing Memory)

When a conversation turn completes, the plugin captures and stores the interaction:

sequenceDiagram
    participant User
    participant Agent
    participant Plugin as Stop Hook
    participant Summarizer as LLM (haiku)
    participant FS as Memory Files
    participant Indexer as memsearch index
    participant Milvus

    User->>Agent: Asks question
    Agent->>User: Responds
    User->>Plugin: Conversation ends
    Plugin->>Plugin: Parse last turn
    Plugin->>Summarizer: Summarize turn
    Summarizer-->>Plugin: "- User asked X<br/>- Agent did Y"
    Plugin->>FS: Append to memory/YYYY-MM-DD.md<br/>with session anchor
    Plugin->>Indexer: Trigger indexing
    Indexer->>FS: Read memory files
    Indexer->>Milvus: Store chunks

Source: README.md
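
The append step in the capture flow above can be sketched as follows. This is a minimal illustration, not the plugin's actual code: the helper name `append_turn_summary`, the `## Session` heading, and the `<!-- session:... -->` anchor format are all assumptions. The source only states that a summary is appended to memory/YYYY-MM-DD.md together with a session anchor.

```python
from __future__ import annotations

from datetime import date
from pathlib import Path


def append_turn_summary(
    memory_dir: Path,
    summary: str,
    session_id: str,
    today: date | None = None,
) -> Path:
    """Append one summarized turn to memory/YYYY-MM-DD.md (hypothetical helper)."""
    today = today or date.today()
    memory_dir.mkdir(parents=True, exist_ok=True)
    target = memory_dir / f"{today.isoformat()}.md"
    # Assumed anchor format: an HTML comment that carries the session id.
    entry = (
        f"\n## Session {session_id}\n"
        f"<!-- session:{session_id} -->\n"
        f"{summary.strip()}\n"
    )
    # Append-only write: earlier entries in the daily file are never rewritten.
    with target.open("a", encoding="utf-8") as fh:
        fh.write(entry)
    return target
```

Because the file is only ever appended to, re-running `memsearch index` afterwards is what brings the new entry into the Milvus index.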

Recall Flow (Reading Memory)

When searching for past context, the system uses a 3-layer progressive approach:

graph LR
    A["User Query"] --> B["L1: memsearch search<br/>Ranked chunks by relevance"]
    B --> C{"Need more?"}
    C -->|Yes| D["L2: memsearch expand<br/>Full markdown section"]
    D --> E{"Need original?"}
    E -->|Yes| F["L3: Parse transcript<br/>Raw dialogue from session"]
    C -->|No| G["Use chunk directly"]
    E -->|No| G
    F --> G

Source: README.md
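
The three-layer decision logic in the diagram above can be sketched with injected callables. Everything here is illustrative: `progressive_recall` and its parameters are hypothetical names, and the `text` and `chunk_hash` result keys are assumed from the Python API examples elsewhere in this manual. In practice the three callables would wrap `memsearch search`, `memsearch expand`, and transcript parsing.

```python
from __future__ import annotations

from typing import Callable


def progressive_recall(
    query: str,
    search: Callable[[str], list[dict]],
    expand: Callable[[str], str],
    transcript: Callable[[str], str],
    need_section: Callable[[dict], bool],
    need_original: Callable[[str], bool],
) -> str | None:
    """Walk L1 -> L2 -> L3 and stop as soon as the context is sufficient."""
    hits = search(query)  # L1: ranked chunks
    if not hits:
        return None
    top = hits[0]
    if not need_section(top):
        return top["text"]  # the chunk alone answers the question
    section = expand(top["chunk_hash"])  # L2: full markdown section
    if not need_original(section):
        return section
    # L3: the transcript callable receives the expanded section, from which
    # it is assumed to be able to locate the original session.
    return transcript(section)
```

Stopping at the earliest sufficient layer keeps the amount of text injected into the agent's context as small as possible.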

Plugin Memory Recall Workflow

Each plugin provides specialized memory recall skills:

memory_search: Initial semantic search returning chunk summaries with chunk_hash identifiers
memory_get: Expands a specific chunk to show the full section context, which may include transcript anchors
memory_transcript: Parses the original session transcript for the exact conversation dialogue

Source: plugins/claude-code/skills/memory-recall/SKILL.md:1-30

Configuration

memsearch supports flexible configuration at multiple levels:

Configuration Options

Setting	Default	Description
`embedding.provider`	`onnx`	Embedding provider: `onnx` or `openai`
`embedding.model`	`bge-m3`	Model for embeddings
`milvus.uri`	`*.db` (local)	Milvus connection URI
`milvus.token`	-	Authentication token
`llm.providers.*`	-	LLM provider configuration
`plugins.<plugin>.summarize.*`	-	Plugin-specific settings

Embedding Providers

ONNX (Default)

Runs locally, no API key required
Uses BGE-M3 model
Suitable for most use cases

memsearch config set embedding.provider onnx

OpenAI

Requires OPENAI_API_KEY
Potentially better quality

memsearch config set embedding.provider openai
memsearch config set embedding.model text-embedding-3-small

Source: plugins/codex/README.md:1-35

Milvus Backends

Milvus Lite (Default)

Local .db file
Zero configuration
Best for single-user, local development

memsearch config set milvus.uri ./memory.db

Remote Milvus Server

For larger memory stores
Team sharing capabilities

memsearch config set milvus.uri http://localhost:19530

Known Limitations and Issues

Based on community reports, be aware of these limitations:

Issue	Severity	Workaround
CLI search fails on Milvus Lite with "released" state	High	Use `memsearch watch` to keep collection loaded
Duplicate primary keys error during indexing	Medium	Avoid re-indexing same chunks in single batch
`memsearch stats` shows 0 chunks	Medium	Explicitly pass `-c` flag for collection name
Memory leak (~9 MB/chunk) during large indexing	High	Process in batches or restart between large corpus indexing
Codex plugin uses deprecated `features.codex_hooks`	Low	Manual configuration update pending
Remote Milvus 2.5+ upsert durability	Medium	Ensure explicit flush after upsert operations

Source: Community Issues #540, #539, #538, #533, #535, #534

Project Memory Format

memsearch maintains memories in daily markdown files with structured sections:

# Project Memory

## Current Direction
- Active development focus areas

## Active Threads
- Ongoing tasks and discussions

## Recent Progress
- Completed work and decisions

## Decisions
- Architectural and design choices

## Open Questions
- Items requiring further investigation

## Risks and Constraints
- Known limitations and blockers

## Next Steps
- Upcoming planned work

## Cold Items
- Deferred or abandoned items

The system includes a review prompt template that guides the LLM to keep project memory concise and actionable, preferring targeted additions over broad rewrites. Source: src/memsearch/prompts/project_review.txt:1-35

Quick Start

Installation

pip install memsearch
# or
uvx memsearch --help

Basic Usage

# Initialize configuration
memsearch config init

# Index current directory
memsearch index .

# Search for relevant memories
memsearch search "batch size optimization"

# Watch for changes (auto-index)
memsearch watch .

# Check stats
memsearch stats

Python API

from memsearch import MemSearch

# Initialize
ms = MemSearch(["./memory"], description="project-memory")

# Index files
ms.index()

# Search
results = ms.search("API design decisions", top_k=5)
for result in results:
    print(result["text"])
    print(f"Score: {result['score']}")

# Cleanup
ms.close()

Feature Roadmap

Community feature requests highlight future directions:

Personal/Global Memory (#337): Cross-project memory layer alongside project-scoped memory
Automatic Memory Refinement (#523): Periodic "dreaming" to prevent document rot in append-only memory
CJK Tokenization (#102): Jieba-based Chinese tokenization for improved CJK search quality

Source: Community Issues #337, #523, #102

Quick Start Guide

Related topics: Introduction to memsearch, Memory Storage


This guide walks you through installing, configuring, and using memsearch for the first time. By the end, you'll have a working semantic memory system that indexes markdown files and enables hybrid search across your project knowledge base.

What is Memsearch?

Memsearch is a cross-platform semantic memory system for AI coding agents. It provides semantic and keyword hybrid search over markdown knowledge bases, with Milvus as the backing vector database.

Source: mkdocs.yml:2

The system has three core properties:

Property	Description
Markdown-first	All memory is stored as `.md` files in a `memory/` directory
Milvus-backed	A vector index (Milvus Lite, local server, or cloud) enables fast semantic search
Plugin-native	First-class plugins for Claude Code, Codex, OpenCode, and OpenClaw agents

Source: README.md:1

Architecture Overview

Before diving in, understand the three layers that power memsearch:

graph TD
    A[Markdown Files] --> B[Scanner & Chunker]
    B --> C[Milvus Vector Store]
    D[CLI / Python API] --> C
    E[Agent Plugins] --> D
    F[memsearch search] --> C
    G[memsearch expand] --> F

Component	Role
Scanner (`scanner.py`)	Recursively discovers `.md` and `.markdown` files
Chunker (`chunker.py`)	Splits files into semantic chunks by headings
Milvus Store	Stores chunks with embeddings for hybrid search
CLI/API	Exposes `index`, `search`, `expand`, and `stats` commands

Source: src/memsearch/scanner.py:1-30 Source: src/memsearch/chunker.py:1-25

Prerequisites

Requirement	Details
Python	3.10 or later
Package manager	pip, uv, or poetry
Milvus backend	Milvus Lite (default, no setup) or remote Milvus server

Installation

Using pip

pip install memsearch

Using uv (recommended)

uv pip install memsearch

Verify Installation

memsearch --version

Source: src/memsearch/cli.py:1-50

Initial Configuration

After installation, run the interactive setup:

memsearch config init

This creates a global configuration file at ~/.config/memsearch/config.toml.

Project-Scoped Configuration

For per-project settings:

memsearch config init --project

This creates .memsearch/config.toml within the current directory, allowing project-specific collections and providers.

Source: plugins/claude-code/skills/memory-config/SKILL.md

Configuration Options

Setting	Default	Description
`embedding.provider`	`onnx`	Embedding provider (onnx, openai)
`milvus.uri`	`milvus_lite`	Milvus connection URI
`collection.name`	auto-derived	Collection name for the index

# Switch to OpenAI embeddings
memsearch config set embedding.provider openai

# Use remote Milvus server
memsearch config set milvus.uri http://localhost:19530

# Set collection name explicitly
memsearch config set collection.name my_project_memory

Source: plugins/codex/README.md:15-30

Core Workflow: Indexing and Searching

Step 1: Index Your Markdown Files

memsearch index <paths...>

Example:

memsearch index ./docs ./memory

This command:

Scans directories for .md and .markdown files
Chunks content by heading structure
Generates embeddings using the configured provider
Upserts chunks into Milvus

graph LR
    A[Markdown Files] --> B[scan_paths]
    B --> C[chunk_by_heading]
    C --> D[clean_content_for_embedding]
    D --> E[embed_chunks]
    E --> F[upsert_to_milvus]

Source: src/memsearch/chunker.py:15-25

Step 2: Search Semantic Content

memsearch search "your query here"

Example:

memsearch search "batch size optimization"

The search uses hybrid search combining:

Vector similarity (semantic meaning)
BM25 keyword matching (exact terms)

Source: plugins/openclaw/skills/memory-recall/SKILL.md
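
To make the idea of combining two rankings concrete, here is a minimal sketch of reciprocal rank fusion (RRF), a common way to merge a vector-similarity ranking with a BM25 ranking. Treat it as an assumption: the source does not state which fusion method memsearch uses, and `reciprocal_rank_fusion` and the constant `k = 60` are illustrative choices.

```python
from __future__ import annotations


def reciprocal_rank_fusion(rankings: list[list[str]], k: int = 60) -> list[str]:
    """Fuse several ranked lists of chunk ids into one ranking."""
    scores: dict[str, float] = {}
    for ranking in rankings:
        for rank, chunk_id in enumerate(ranking, start=1):
            # Each list contributes 1 / (k + rank); ids missing from a list add nothing.
            scores[chunk_id] = scores.get(chunk_id, 0.0) + 1.0 / (k + rank)
    # Highest fused score first; ties are broken by id for a stable order.
    return sorted(scores, key=lambda chunk_id: (-scores[chunk_id], chunk_id))


semantic_hits = ["a", "b", "c"]  # ranked by vector similarity
keyword_hits = ["b", "d", "a"]   # ranked by BM25
fused = reciprocal_rank_fusion([semantic_hits, keyword_hits])
print(fused)  # ['b', 'a', 'd', 'c']
```

Chunks that appear in both lists ("a" and "b") rise above chunks found by only one method, which is the practical benefit of hybrid search.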

Step 3: Expand Results for Full Context

When search returns a chunk_hash, retrieve the full section:

memsearch expand <chunk_hash>

This returns the markdown section with surrounding context, which may include session anchors.

Source: plugins/opencode/index.ts

Step 4: Check Index Statistics

memsearch stats

Note: If the collection name is set only in a project-scoped config file, pass the collection name explicitly with -c (--collection):

```bash
memsearch stats -c <collection_name>
```

This works around a known issue where the collection name is resolved correctly only when it is specified on the command line. Source: #538

Using the Python API

For programmatic access, use the MemSearch class:

from memsearch import MemSearch

# Initialize with paths to index
ms = MemSearch(["./docs", "./memory"])

# Index files
chunk_count = ms.index()
print(f"Indexed {chunk_count} chunks")

# Search
results = ms.search("batch size optimization")
for result in results:
    print(f"- {result['chunk_hash']}: {result['text'][:100]}...")

# Expand a chunk
full_content = ms.expand("<chunk_hash>")

# Cleanup
ms.close()

Source: src/memsearch/__init__.py:1-8

Python API Reference

Method	Parameters	Returns	Description
`MemSearch`	`paths: list[str]`, `**kwargs`	`MemSearch`	Initialize with paths to index
`.index()`	None	`int`	Index all discovered files
`.search()`	`query: str`, `top_k: int`	`list[dict]`	Hybrid search results
`.expand()`	`chunk_hash: str`	`str`	Full section content
`.watch()`	`on_event: callable`, `debounce_ms: int`	`Watcher`	File system watcher
`.close()`	None	None	Release resources

Watch Mode for Continuous Indexing

Monitor directories for changes and auto-index:

memsearch watch <paths...> --description "my project"

This is useful when:

Running as a background process during development
Using with IDE plugins that auto-save markdown files

graph TD
    A[Watch Mode] --> B{File Change?}
    B -->|Yes| C[Debounce]
    C --> D[Index Changed File]
    D --> B
    B -->|No| E[Continue]
    E --> B

Known Issue: When using Milvus Lite, search operations may fail with "Collection is in state 'released'" if run outside an active watch session. Keep the watcher running or manually load the collection before searching. Source: #540

Agent Plugins

Memsearch provides first-class plugins for popular AI coding agents:

Plugin	Agent	Features
`plugins/claude-code`	Claude Code	Stop hook capture, SessionStart injection, memory recall tools
`plugins/codex`	OpenAI Codex	Summarization routing, project review, user profiling
`plugins/opencode`	OpenCode	TypeScript memory tools, transcript retrieval
`plugins/openclaw`	OpenClaw	Memory recall skill, session anchors

Installing a Plugin

# Claude Code
memsearch plugins install claude-code

# Codex
memsearch plugins install codex

Plugin Memory Workflow

graph LR
    A[User: Ask Question] --> B[Agent Responds]
    B --> C[Stop Hook Fires]
    C --> D[Summarize Turn]
    D --> E[Append to memory/YYYY-MM-DD.md]
    E --> F[Index to Milvus]
    
    G[User: Recall] --> H[memsearch search]
    H --> I[memsearch expand]
    I --> J[Optional: Parse Transcript]

Source: README.md:40-70

Configuring Plugin Summarization

# Use native (built-in) model for summarization
memsearch config set plugins.claude-code.summarize.provider native

# Use external provider
memsearch config set plugins.claude-code.summarize.provider openai
memsearch config set plugins.claude-code.summarize.model gpt-5-mini

Source: plugins/codex/README.md:40-50

Troubleshooting Common Issues

Collection in 'released' State (Milvus Lite)

Problem: memsearch search fails outside watch mode with "Collection is in state 'released'"

Solution: Keep the watcher running, or load the collection manually:

memsearch watch ./memory &
# Now search in a separate terminal
memsearch search "query"

Source: #540

Duplicate Primary Key Errors During Indexing

Problem: memsearch index fails with "duplicate primary keys are not allowed in the same batch"

Solution: This occurs when the same chunk content is indexed multiple times. Ensure files haven't been indexed before, or clear and re-index:

memsearch maintenance reset
memsearch index .

Source: #539

Stats Shows 0 Chunks

Problem: memsearch stats reports 0 chunks despite successful indexing

Solution: Pass the collection name explicitly with -c (--collection):

memsearch stats -c <collection_name>

Source: #538

Memory Usage During Large Indexing Jobs

Problem: High memory usage (~9 MB per chunk) causes OOM on large corpora

Solution: Process in smaller batches or restart the process periodically:

# Index subdirectories separately
memsearch index ./docs/part1
# ... restart ...
memsearch index ./docs/part2

Source: #533
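
The batching workaround above can be scripted. The sketch below only builds one `memsearch index` command per top-level subdirectory; `build_index_commands` is a hypothetical helper, and the approach assumes that the corpus is already split into subdirectories of manageable size.

```python
from __future__ import annotations

from pathlib import Path


def build_index_commands(corpus_root: str | Path) -> list[list[str]]:
    """Return one `memsearch index <subdir>` command per visible subdirectory."""
    root = Path(corpus_root)
    subdirs = sorted(
        p for p in root.iterdir() if p.is_dir() and not p.name.startswith(".")
    )
    # Markdown files directly under the root are not covered by this sketch.
    return [["memsearch", "index", str(p)] for p in subdirs]
```

Running each command with `subprocess.run(cmd, check=True)` starts a fresh process per batch, so memory held by one batch is returned to the operating system before the next one begins.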

Next Steps

Topic	Description
Configuration Reference	Full configuration options
CLI Reference	Complete CLI command documentation
Python API	Detailed API documentation
Architecture	System design internals
Platform Comparison	Plugin feature matrix
Troubleshooting	Extended troubleshooting guide

Environment Variables

Variable	Description
`OPENAI_API_KEY`	API key for OpenAI embeddings/summarization
`ANTHROPIC_API_KEY`	API key for Claude summarization
`MEMSEARCH_DIR`	Override default config directory

Source: plugins/claude-code/skills/memory-config/SKILL.md

Source: https://github.com/zilliztech/memsearch / Human Manual

System Architecture

Related topics: Introduction to memsearch, Progressive Retrieval, Memory Storage, Milvus Integration


MemSearch is a cross-platform semantic memory system for AI coding agents. It combines markdown-first local storage with vector search backed by Milvus to provide persistent, searchable memory across coding sessions.

Overview

The architecture follows a layered design where markdown files are the source of truth and Milvus serves as a rebuildable shadow index. This design prioritizes durability and portability—memory persists in human-readable markdown that can survive across different tools and systems.

graph TB
    subgraph "Agent Layer"
        A[Claude Code] --> P1[Plugin]
        B[Codex] --> P2[Plugin]
        C[OpenCode] --> P3[Plugin]
        D[OpenClaw] --> P4[Plugin]
    end

    subgraph "CLI / API Layer"
        CLI[memsearch CLI]
        API[Python API]
    end

    subgraph "Core Engine"
        SC[Scanner]
        CH[Chunker]
        EMB[Embedding Provider]
        STORE[MilvusStore]
    end

    subgraph "Storage Layer"
        DB[(Milvus<br/>Local .db or<br/>Remote)]
        FS[(Markdown Files<br/>memory/*.md)]
    end

    P1 --> CLI
    P2 --> CLI
    P3 --> CLI
    P4 --> CLI

    CLI --> API
    API --> MemSearch[MemSearch Core]

    MemSearch --> SC
    MemSearch --> CH
    MemSearch --> EMB
    MemSearch --> STORE

    SC --> FS
    CH --> FS
    STORE --> DB

    EMB --> STORE

Source: README.md:1-50

Core Components

MemSearch Core (`src/memsearch/__init__.py`)

The main entry point is the MemSearch class exported from src/memsearch/__init__.py:

from .core import MemSearch

Source: src/memsearch/__init__.py:1-5

The MemSearch class orchestrates all operations:

Component	Responsibility
Scanner	Discovers markdown files in specified paths
Chunker	Splits files into semantic chunks by headings
Embedding Provider	Generates vector embeddings for chunks
MilvusStore	Persists chunks and vectors to Milvus

Scanner (`src/memsearch/scanner.py`)

The scanner module handles file discovery across multiple paths:

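from dataclasses import dataclass
from pathlib import Path


@dataclass(frozen=True)
class ScannedFile:
    """Metadata for a discovered markdown file."""
    path: Path    # location of the markdown file
    mtime: float  # file modification time
    size: int     # file size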

Source: src/memsearch/scanner.py:8-12

Key behavior:

Recursively walks directories
Filters by extension (.md, .markdown)
Skips hidden files/directories by default
Deduplicates by resolved path
Sorts results alphabetically

def scan_paths(
    paths: list[str | Path],
    *,
    extensions: tuple[str, ...] = (".md", ".markdown"),
    ignore_hidden: bool = True,
) -> list[ScannedFile]:
    ...  # body omitted in this excerpt

Source: src/memsearch/scanner.py:16-26
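
The behaviors listed above (recursive walk, extension filter, hidden-path skipping, deduplication by resolved path, sorted output) can be sketched with pathlib alone. This is an illustrative reimplementation, not the code in scanner.py: the name `scan_markdown` is hypothetical, and the choice to test only the path components below each scan root for a leading dot is an assumption.

```python
from __future__ import annotations

from pathlib import Path


def scan_markdown(
    paths: list[str | Path],
    extensions: tuple[str, ...] = (".md", ".markdown"),
    ignore_hidden: bool = True,
) -> list[Path]:
    """Discover markdown files under the given paths (illustrative sketch)."""
    seen: set[Path] = set()
    for raw in paths:
        root = Path(raw)
        if root.is_file():
            candidates = [(root, (root.name,))]
        else:
            # Keep the parts relative to the scan root, so that a hidden
            # ancestor of the root (such as .memsearch/) excludes nothing.
            candidates = [(p, p.relative_to(root).parts) for p in root.rglob("*")]
        for path, rel_parts in candidates:
            if not path.is_file() or path.suffix.lower() not in extensions:
                continue
            if ignore_hidden and any(part.startswith(".") for part in rel_parts):
                continue
            seen.add(path.resolve())  # resolved paths deduplicate overlapping roots
    return sorted(seen)
```

Passing both a directory and one of its subdirectories returns each file once, because both scans resolve to the same absolute path.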

Chunker (`src/memsearch/chunker.py`)

The chunker splits markdown files into semantically meaningful units by heading structure:

_HEADING_RE = re.compile(r"^(#{1,6})\s+(.+)$", re.MULTILINE)
_HTML_COMMENT_RE = re.compile(r"<!--.*?-->", re.DOTALL)

Source: src/memsearch/chunker.py:1-10
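
As a rough illustration of how a heading pattern like the one above can drive chunking, the sketch below cuts a document at every heading match. The function `split_by_headings` and the returned dictionary keys are hypothetical; the real chunker may also enforce size limits or track heading hierarchy, which this sketch does not attempt.

```python
from __future__ import annotations

import re

_HEADING_RE = re.compile(r"^(#{1,6})\s+(.+)$", re.MULTILINE)


def split_by_headings(text: str) -> list[dict]:
    """Split markdown into one chunk per heading (illustrative sketch)."""
    matches = list(_HEADING_RE.finditer(text))
    # Each chunk runs from its heading to the start of the next heading.
    boundaries = [m.start() for m in matches] + [len(text)]
    chunks: list[dict] = []
    preamble = text[: boundaries[0]].strip()
    if preamble:
        # Text before the first heading becomes a chunk without a heading.
        chunks.append({"level": 0, "heading": "", "content": preamble})
    for match, end in zip(matches, boundaries[1:]):
        chunks.append(
            {
                "level": len(match.group(1)),
                "heading": match.group(2).strip(),
                "content": text[match.start():end].strip(),
            }
        )
    return chunks
```

One limitation of this simple approach: a line starting with `#` inside a fenced code block is also treated as a heading.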

Content Cleaning

Before embedding, content is cleaned to improve vector quality:

def clean_content_for_embedding(text: str) -> str:
    """Strip metadata noise from chunk content before embedding."""
    cleaned = _HTML_COMMENT_RE.sub("", text)
    cleaned = re.sub(r"\n{3,}", "\n\n", cleaned)
    return cleaned.strip()

Source: src/memsearch/chunker.py:13-22

This removes HTML comments (often containing session UUIDs and transcript paths) that dilute embedding quality. The original file content remains unchanged—cleaning only affects the text sent to the embedding model.
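
A short worked example shows the effect. The function and pattern are copied from the excerpts above so that the snippet runs on its own; the sample text and the anchor format inside the HTML comment are made up for illustration.

```python
import re

_HTML_COMMENT_RE = re.compile(r"<!--.*?-->", re.DOTALL)


def clean_content_for_embedding(text: str) -> str:
    """Strip metadata noise from chunk content before embedding."""
    cleaned = _HTML_COMMENT_RE.sub("", text)
    cleaned = re.sub(r"\n{3,}", "\n\n", cleaned)
    return cleaned.strip()


# Hypothetical chunk: a heading, an anchor comment, then the actual note.
raw = "## Decisions\n<!-- session:1234 transcript:/tmp/t.jsonl -->\n\n\n- Use batch size 32\n"
cleaned = clean_content_for_embedding(raw)
print(repr(cleaned))  # '## Decisions\n\n- Use batch size 32'
```

The comment and the extra blank lines are gone from `cleaned`, while `raw` (standing in for the file on disk) still carries the anchor.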

Content Validation

Chunks below a minimum meaningful length threshold are dropped:

_MIN_MEANINGFUL_LEN = 2

def _has_meaningful_content(text: str) -> bool:
    """Return True if *text* has enough substance to be worth indexing."""

Source: src/memsearch/chunker.py:6-25

Embedding Providers

MemSearch supports multiple embedding backends:

Provider	Model	Description
`onnx`	bge-m3	Local inference, no API key required (default)
`openai`	text-embedding-3-*	OpenAI API-based embeddings
OpenAI-compatible	Configurable	Supports custom API endpoints

Source: README.md:60-70

Configuration is exposed via CLI:

memsearch config set embedding.provider onnx
memsearch config set embedding.provider openai

Source: README.md:65-68

CLI Architecture

The CLI provides a command-line interface to the MemSearch engine:

graph LR
    A[memsearch CLI] --> B[index]
    A --> C[search]
    A --> D[expand]
    A --> E[stats]
    A --> F[watch]
    A --> G[config]

Source: src/memsearch/cli.py:1-100

Command Structure

The CLI uses Click for command parsing and shares common options across commands:

def _common_options(f):
    """Shared options for commands that create a MemSearch instance."""
    f = click.option("--provider", "-p", default=None, help="Embedding provider.")(f)
    f = click.option("--model", "-m", default=None, help="Override embedding model.")(f)
    f = click.option("--batch-size", default=None, type=int, help="Embedding batch size.")(f)
    f = click.option("--base-url", default=None, help="OpenAI-compatible API base URL.")(f)
    f = click.option("--api-key", default=None, help="API key for the embedding provider.")(f)
    f = click.option("--collection", "-c", default=None, help="Milvus collection name.")(f)
    f = click.option("--milvus-uri", default=None, help="Milvus connection URI.")(f)
    f = click.option("--milvus-token", default=None, help="Milvus auth token.")(f)
    return f

Source: src/memsearch/cli.py:80-91

Watch Mode

The watch command provides live file monitoring with debounced re-indexing:

@cli.command()
def watch(
    paths: list[str],
    *,
    debounce_ms: int = 500,
    max_chunk_size: int = 1024,
    ...
):
    """Watch PATHS for markdown changes and auto-index."""

Source: src/memsearch/cli.py:200-250
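
Debouncing means waiting until a file has been quiet for `debounce_ms` before re-indexing it, so a burst of saves triggers one index run instead of many. The sketch below shows that logic in isolation. The `Debouncer` class is hypothetical and takes timestamps as arguments to stay deterministic; it is not the watcher used by memsearch.

```python
from __future__ import annotations


class Debouncer:
    """Coalesce rapid change events per path (illustrative sketch)."""

    def __init__(self, debounce_ms: int = 500) -> None:
        self.debounce_s = debounce_ms / 1000.0
        self._last_event: dict[str, float] = {}

    def record(self, path: str, now: float) -> None:
        # A newer event for the same path restarts its quiet period.
        self._last_event[path] = now

    def due(self, now: float) -> list[str]:
        """Return paths that have been quiet for at least the debounce window."""
        ready = sorted(
            path
            for path, last in self._last_event.items()
            if now - last >= self.debounce_s
        )
        for path in ready:
            del self._last_event[path]  # each burst is reported once
        return ready
```

A watcher loop would call `record` from its file-system callback and poll `due` periodically, indexing whatever paths it returns.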

Plugin Architecture

Plugins integrate MemSearch with specific AI coding agents. Each plugin provides:

Capture — Append summaries to markdown files after each conversation turn
Recall — Search and retrieve relevant memories during conversations
Transcript anchoring — Link search results back to original conversations

graph TB
    subgraph "Agent Session"
        U[User Input]
        A[Agent Response]
        S[Stop Hook]
    end

    subgraph "Capture Flow"
        P[Parse Turn]
        L[LLM Summarize]
        F[Append to memory/*.md]
        I[Index to Milvus]
    end

    U --> A
    A --> S
    S --> P
    P --> L
    L --> F
    F --> I

Source: README.md:55-80

Memory Recall Flow

Recall follows a progressive 3-layer strategy:

Layer	Action	Returns
L1	`memsearch search`	Ranked chunks by hybrid search
L2	`memsearch expand`	Full markdown section
L3	Parse transcript	Raw conversation dialogue

Source: README.md:82-90

Plugin Tools

Each plugin exposes tools for memory operations:

memory_search: tool({
    description: "Find relevant memory chunks using semantic search",
    args: {
        query: tool.schema.string().describe("Search query"),
    }
}),

memory_get: tool({
    description: "Expand a memory chunk to see the full markdown section",
    args: {
        chunk_hash: tool.schema.string().describe("Hash from search result"),
    }
}),

memory_transcript: tool({
    description: "Retrieve original conversation from transcript",
    args: {
        session_id: tool.schema.string(),
        turn_id: tool.schema.string().optional(),
    }
})

Source: plugins/opencode/index.ts:1-60

Storage Architecture

Milvus Backend

MemSearch uses Milvus for vector storage with two deployment modes:

Mode	URI	Use Case
Milvus Lite	`~/.memsearch/memsearch.db`	Single-user, local development
Remote Milvus	`http://host:19530`	Team sharing, production scale

Source: README.md:70-75

Data Model

Each indexed chunk contains:

Field	Type	Description
`chunk_hash`	string	SHA-based hash as primary key
`content`	string	Full markdown text
`clean_content`	string	HTML-comments stripped (for embedding)
`file_path`	string	Original file location
`heading`	string	Parent heading hierarchy
`mtime`	float	File modification time
`vector`	float[]	Embedding vector

Source: src/memsearch/chunker.py:1-30
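
The table above maps naturally onto a small record type. The sketch below is an assumption-laden illustration: `ChunkRecord` is a hypothetical name, and hashing the file path, heading, and content with SHA-256 is a guess, since the source only says the primary key is "SHA-based".

```python
from __future__ import annotations

import hashlib
from dataclasses import dataclass, field


@dataclass
class ChunkRecord:
    """One indexed chunk, mirroring the fields in the table above."""

    content: str
    clean_content: str
    file_path: str
    heading: str
    mtime: float
    vector: list[float] = field(default_factory=list)

    @property
    def chunk_hash(self) -> str:
        # Assumed inputs: path, heading, and content, separated by NUL bytes
        # so that adjacent fields cannot run together.
        payload = "\x00".join((self.file_path, self.heading, self.content))
        return hashlib.sha256(payload.encode("utf-8")).hexdigest()
```

A content-derived key makes upserts idempotent: re-indexing an unchanged chunk produces the same `chunk_hash` and overwrites the existing row instead of adding a new one.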

Collection Naming

Collections are derived from the project directory to ensure isolation:

~/.memsearch/collections/
├── ms_project_abc12345/    # project root hash
├── ms_home_user_docs/       # ~/docs/ hash
└── ...

Source: plugins/opencode/index.ts:20-30
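
One way to derive such a name is to combine a readable slug with a short hash of the resolved project path. The sketch below is hypothetical: `derive_collection_name`, the `ms` prefix handling, and the 8-character SHA-256 suffix are assumptions modeled on the example names above, not the plugin's actual algorithm.

```python
from __future__ import annotations

import hashlib
import re
from pathlib import Path


def derive_collection_name(project_dir: str | Path, prefix: str = "ms") -> str:
    """Build a stable, collision-resistant collection name for a project."""
    resolved = Path(project_dir).expanduser().resolve()
    # Restrict the slug to letters, digits, and underscores.
    slug = re.sub(r"[^A-Za-z0-9]+", "_", resolved.name).strip("_").lower() or "root"
    # Hash the full path so two projects with the same folder name still differ.
    digest = hashlib.sha256(str(resolved).encode("utf-8")).hexdigest()[:8]
    return f"{prefix}_{slug}_{digest}"
```

Because the name depends only on the resolved path, every command run from the same project lands in the same collection without extra configuration.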

Known Issue: The memsearch stats command may report 0 chunks when the collection name is only set in the config file, requiring explicit -c flag. Source: Issue #538

Configuration System

Configuration is hierarchical and supports multiple scopes:

graph TD
    C[CLI Overrides] --> P[Project Config]
    C --> G[Global Config]
    P --> D[Defaults]
    G --> D

Source: src/memsearch/cli.py:80-120
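
Layered configuration of this kind is usually resolved by merging dictionaries from lowest to highest priority. The sketch below assumes the order defaults, global config, project config, CLI overrides; the diagram does not state whether project or global settings win, so that ordering and the helper name `merge_layers` are assumptions.

```python
from __future__ import annotations

from copy import deepcopy


def merge_layers(*layers: dict) -> dict:
    """Merge config layers; later layers win, nested tables merge key by key."""
    merged: dict = {}
    for layer in layers:
        for key, value in layer.items():
            if isinstance(value, dict) and isinstance(merged.get(key), dict):
                merged[key] = merge_layers(merged[key], value)
            else:
                merged[key] = deepcopy(value)  # never alias the input layers
    return merged


defaults = {"embedding": {"provider": "onnx", "model": "bge-m3"}}
global_cfg = {"embedding": {"provider": "openai"}}
project_cfg = {"milvus": {"uri": "http://localhost:19530"}}
cli_overrides = {"embedding": {"model": "text-embedding-3-small"}}

effective = merge_layers(defaults, global_cfg, project_cfg, cli_overrides)
print(effective["embedding"])  # {'provider': 'openai', 'model': 'text-embedding-3-small'}
```

Merging nested tables key by key lets a higher layer override a single setting, such as the model, without having to repeat the rest of the table.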

Configuration Keys

Category	Keys	Description
`embedding.*`	provider, model, batch_size	Embedding configuration
`milvus.*`	uri, token, collection	Milvus connection
`plugins.<plugin>.summarize.*`	enabled, provider, model	Per-plugin summarization
`prompts.*`	System prompt customization	Memory generation prompts

Source: src/memsearch/cli.py:200-260

Known Architectural Limitations

Based on community-reported issues:

Issue	Impact	Workaround
Collection "released" state on Milvus Lite	Search fails outside active watch session	Use `watch` mode or call `load()` explicitly
Duplicate primary keys in batch	Index fails with 1100 error	Deduplicate chunks before indexing
Memory leak during indexing (~9MB/chunk)	OOM on large corpora with ONNX	Restart process periodically

Source: Issue #540, Issue #533, Issue #539
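
The deduplication workaround in the table above amounts to keeping one chunk per primary key before a batch is sent to Milvus. A minimal sketch follows, assuming chunks are dictionaries with a `chunk_hash` key; the helper name `dedupe_by_primary_key` is hypothetical.

```python
from __future__ import annotations


def dedupe_by_primary_key(chunks: list[dict], key: str = "chunk_hash") -> list[dict]:
    """Keep the first chunk seen for each primary key, preserving order."""
    unique: dict[str, dict] = {}
    for chunk in chunks:
        # setdefault leaves the first occurrence in place and skips repeats.
        unique.setdefault(chunk[key], chunk)
    return list(unique.values())
```

Duplicates typically arise when identical sections appear in several files, so filtering per batch avoids the error without changing what ends up in the index.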

Documentation Structure

The project documentation is organized as follows (from mkdocs.yml):

nav:
  - Home: Overview, Why memsearch, For Users, For Developers
  - Plugins: Claude Code, Codex, OpenCode, OpenClaw
  - Reference: CLI, Python API, Integrations
  - Platform Comparison

Source: mkdocs.yml:10-35

Key documentation files:

architecture.md — System architecture details
design-philosophy.md — Design decisions and rationale
python-api.md — Python API reference
cli.md — CLI command reference


Design Philosophy

Related topics: System Architecture, Memory Storage


Overview

Memsearch is designed around a markdown-first, rebuildable-index philosophy for AI coding agents. The system treats human-readable markdown files as the authoritative source of truth, with Milvus serving as a secondary semantic index that can be rebuilt at any time from the markdown corpus.

This design prioritizes:

Transparency — memory is readable and editable by humans
Durability — Milvus failures do not lose data
Portability — memory works across platforms and agent implementations
Progressive recall — from quick search to deep conversation drill-down

Core Principles

1. Markdown as Source of Truth

Every piece of memory stored by memsearch exists as readable markdown in the local filesystem. Milvus is a "rebuildable shadow index" — it stores embeddings and enables fast semantic search, but the ground truth remains in .md files.

graph LR
    A[Plugins Append] --> B[memory/YYYY-MM-DD.md]
    B --> C[memsearch index]
    C --> D[Milvus Vector DB]
    
    E[Search Query] --> D
    D --> F[Chunk Hash + Content]
    F --> G[Full .md Section]
    
    style B fill:#f9f,stroke:#333
    style D fill:#ccf,stroke:#333

This approach provides several guarantees:

Guarantee	Benefit
Human-readable	Users can inspect, edit, or delete memory directly
Milvus-independent	No data loss if Milvus storage fails or is cleared
Cross-tool	Memory files work without memsearch installed
Versionable	Standard git workflows apply to memory

Source: README.md:1-50

2. Project-Scoped Isolation

Memsearch uses a per-project memory model. Each project has its own .memsearch/memory/ directory and derives its own Milvus collection name. This isolation ensures:

Memory from Project A does not pollute searches in Project B
Users can clear one project's memory without affecting others
Collection names prevent collisions in shared Milvus deployments

graph TD
    P1[Project A: .memsearch/] --> M1[memory/]
    P1 --> C1[Collection: ms_project_A]
    
    P2[Project B: .memsearch/] --> M2[memory/]
    P2 --> C2[Collection: ms_project_B]
    
    M1 -.-> C1
    M2 -.-> C2
    
    style M1 fill:#f9f,stroke:#333
    style M2 fill:#f9f,stroke:#333

Source: plugins/claude-code/prompts/project_review.txt:1-30
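A minimal sketch of per-project collection naming, assuming a scheme of `ms_<directory name>_<short hash>`. The function name and the exact scheme are hypothetical; the real derivation lives in the plugin's derive-collection.sh script and may differ.

```python
import hashlib
import re
from pathlib import Path


def derive_collection_name(project_path: str) -> str:
    """Hypothetical naming scheme: ms_<sanitized dir name>_<8 hex chars>."""
    resolved = Path(project_path).expanduser().resolve()
    # Milvus collection names allow only letters, digits, and underscores.
    slug = re.sub(r"[^A-Za-z0-9_]", "_", resolved.name) or "root"
    # Hashing the full path keeps same-named projects in different
    # locations from sharing a collection.
    digest = hashlib.sha256(str(resolved).encode("utf-8")).hexdigest()[:8]
    return f"ms_{slug}_{digest}"
```

Because the name depends only on the resolved path, repeated calls for one project always land on the same collection, while two projects map to different ones.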

3. Three-Layer Progressive Recall

The recall system implements a three-layer progressive search pattern, moving from broad semantic matching to exact conversation retrieval:

Layer	Command	Purpose	Returns
L1	`memsearch search`	Semantic + keyword hybrid search	Ranked chunk list with hashes
L2	`memsearch expand`	Full markdown section around match	Complete context
L3	`memsearch transcript`	Original session transcript	Raw dialogue with turns

graph LR
    Q[User Question] --> L1[Layer 1: search]
    L1 -->|Need details?| R1[Ranked Chunks]
    R1 --> L2[Layer 2: expand]
    L2 -->|Need original?| R2[Full Section + Anchors]
    R2 --> L3[Layer 3: transcript]
    L3 --> R3[Raw Dialogue]
    
    R1 -.->|Quick answer| E[End]
    R2 -.->|Sufficient| E

This pattern balances speed and depth — most queries resolve at L1, while complex debugging or review tasks drill to L3.

Source: plugins/openclaw/skills/memory-recall/SKILL.md:1-60

Architecture Layers

Layer Hierarchy

Memsearch architecture follows a strict layer hierarchy where each layer depends only on layers below it:

graph TB
    subgraph Plugins
        P1[Claude Code]
        P2[Codex]
        P3[OpenCode]
        P4[OpenClaw]
    end
    
    subgraph CLI
        C[memsearch CLI]
    end
    
    subgraph Core
        M[MemSearch API]
        S[Scanner]
        K[Chunker]
    end
    
    subgraph Storage
        MD[Markdown Files]
        MV[Milvus]
    end
    
    P1 --> C
    P2 --> C
    P3 --> C
    P4 --> C
    C --> M
    M --> S
    M --> K
    M --> MV
    M --> MD
    
    style MD fill:#9f9,stroke:#333
    style MV fill:#ccf,stroke:#333

Layer	Responsibility	Public API
Plugins	Agent integration, hook handling, capture automation	Agent-specific
CLI	User-facing commands, config management	`memsearch` commands
Core	Indexing, search, watching	`MemSearch` class
Storage	File I/O, Milvus sync	Internal

Source: src/memsearch/__init__.py:1-10

Capture-Recall Workflow

The fundamental workflow in memsearch consists of two phases that mirror the agent interaction pattern:

Capture Phase (after each conversation turn):

Agent responds → Stop hook fires → Parse last turn → LLM summarize
    → Append to memory/YYYY-MM-DD.md → Index to Milvus

Recall Phase (when user queries):

User asks → memsearch search → Ranked chunks → memsearch expand
    → Full context → memsearch transcript (optional) → Original dialogue

Source: README.md:50-100

Semantic Chunking Design

Chunking Philosophy

The chunker splits markdown files into semantic units before indexing. This design reflects several constraints:

Constraint	Design Decision
Heading context lost in chunks	Chunks preserve parent heading text
Meaningless sections waste search	Minimum meaningful content threshold enforced
HTML comments dilute embeddings	Metadata stripped before embedding, stored separately
Oversized chunks reduce precision	Maximum chunk size with rollback splitting

Source: src/memsearch/chunker.py:1-60

Content Cleaning

Before embedding, chunk content is cleaned to remove noise that would dilute vector quality:

HTML comments (<!-- ... -->) are stripped; these often contain session UUIDs and transcript paths
Runs of blank lines left by removed comments are collapsed
Original content in Milvus remains unchanged; cleaning only affects embedding text

Source: src/memsearch/chunker.py:20-35

Configuration Philosophy

Progressive Configuration

Memsearch supports configuration at multiple levels with predictable precedence. The levels are listed here from broadest to narrowest scope:

Level	Scope	File Location
Global	All projects for current user	`~/.config/memsearch/config.yaml`
Project	Specific project only	`.memsearch/config.yaml`
Environment	Per-invocation overrides	CLI flags, env vars

Configuration options follow a consistent naming scheme organized by domain:

embedding.* — embedding provider settings
milvus.* — Milvus connection settings
plugins.* — plugin-specific configuration
llm.providers.* — LLM provider definitions

Source: src/memsearch/cli.py:100-200
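The layering can be illustrated with a small deep merge. This sketch assumes that narrower scopes override broader ones (global, then project, then per-invocation overrides); that order is an assumption, and `merge_config` is a hypothetical helper rather than part of memsearch.

```python
from typing import Any


def merge_config(*layers: dict[str, Any]) -> dict[str, Any]:
    """Deep-merge config layers; later layers override earlier ones."""
    merged: dict[str, Any] = {}
    for layer in layers:
        for key, value in layer.items():
            if isinstance(value, dict) and isinstance(merged.get(key), dict):
                # Merge nested sections such as embedding.* key by key.
                merged[key] = merge_config(merged[key], value)
            else:
                merged[key] = value
    return merged


global_cfg = {"embedding": {"provider": "onnx"}, "milvus": {"uri": "milvus_lite.db"}}
project_cfg = {"milvus": {"uri": "http://localhost:19530"}}
overrides = {"embedding": {"provider": "openai"}}

effective = merge_config(global_cfg, project_cfg, overrides)
```

Merging per key rather than per section means a project file that sets only `milvus.uri` leaves every other global setting in place.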

Default Backend: Milvus Lite

The default Milvus backend is Milvus Lite — a local .db file requiring no server setup. This enables:

Zero-configuration first-run experience
Full functionality without network dependencies
Easy migration to remote Milvus for larger deployments

Remote Milvus is configured by setting milvus.uri to a server URL.

Source: plugins/codex/README.md:1-50

File Scanner Design

Scanning Strategy

The file scanner discovers markdown files across multiple paths with configurable behavior:

Parameter	Default	Purpose
`extensions`	`(.md, .markdown)`	File type filter
`ignore_hidden`	`True`	Skip dotfiles/dotdirs
`paths`	User-specified	Multiple root directories

Results are sorted by path and deduplicated to ensure consistent chunk ordering across invocations.

Source: src/memsearch/scanner.py:1-50
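A sketch of the scanning behavior described above, under stated assumptions: it reuses the `scan_paths` name and parameters from the table, but the body is illustrative and not the project's implementation. In particular, applying the hidden-file check only below each scan root is an assumption.

```python
from dataclasses import dataclass
from pathlib import Path


@dataclass(frozen=True)
class ScannedFile:
    path: Path
    mtime: float
    size: int


def scan_paths(paths, *, extensions=(".md", ".markdown"), ignore_hidden=True):
    """Illustrative scan: recurse, filter, deduplicate, and sort by path."""
    found: dict[Path, ScannedFile] = {}
    for root in map(Path, paths):
        candidates = [root] if root.is_file() else root.rglob("*")
        for p in candidates:
            if not p.is_file() or p.suffix.lower() not in extensions:
                continue
            # Hidden check applies to path components below the scan root.
            rel_parts = p.relative_to(root).parts if p != root else ()
            if ignore_hidden and any(part.startswith(".") for part in rel_parts):
                continue
            resolved = p.resolve()
            stat = resolved.stat()
            # Keying by resolved path removes duplicates from overlapping roots.
            found[resolved] = ScannedFile(resolved, stat.st_mtime, stat.st_size)
    return [found[key] for key in sorted(found)]
```

Checking hidden components relative to the root lets a root such as `.memsearch/memory/` be scanned even though it sits inside a dot-directory.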

Known Design Considerations

Memory Growth in Large Corpora

Indexing large markdown corpora with the ONNX embedding provider can consume significant memory (~9 MB per chunk in some cases). For corpus sizes exceeding available RAM, consider:

Batching index operations
Using a remote Milvus server with more resources
Reducing max_chunk_size to create smaller, more memory-efficient chunks

Source: Community Issue #533

Document Rot Prevention

As daily markdown files accumulate, the append-only stream becomes difficult to navigate. The project review and user profile maintenance tasks (configurable via plugins.*.project_review and plugins.*.user_profile) provide periodic refinement to keep memory organized.

Source: Community Issue #523

Summary

Memsearch's design philosophy centers on making memory human-readable, durable, and agent-aware:

Markdown-first — all memory exists as editable .md files
Rebuildable index — Milvus is a shadow that can be regenerated
Project isolation — each project has its own memory store
Progressive recall — from quick search to deep transcript drill-down
Plugin extensibility — unified across Claude Code, Codex, OpenCode, and OpenClaw

This philosophy ensures that memory remains accessible even when infrastructure fails, while enabling the sophisticated semantic search that AI coding agents need.

Progressive Retrieval

Related topics: Hybrid Search and Deduplication, System Architecture

Progressive Retrieval is memsearch's multi-tier approach to memory recall, designed to balance speed, context depth, and precision. Instead of returning all available information at once, it enables agents to progressively drill into memories through three distinct layers, each providing more context than the last.

Overview

Memory recall in memsearch follows a progressive escalation pattern where each layer provides richer context than the previous one. This approach optimizes for the common case (quick recall) while making detailed investigation available on demand.

The system is built on three core principles:

Layered depth — Start with lightweight semantic search and escalate to full transcripts only when needed
Anchor-based linking — Chunks contain metadata anchors that enable navigation between layers
Agent-driven — The agent decides when to escalate based on user intent

Source: README.md:conceptual-overview

The Three-Layer Architecture

graph TD
    A[User Query] --> B[L1: memsearch search]
    B --> C[Ranked Chunks]
    C --> D{Need more context?}
    D -->|No| Z[Use results]
    D -->|Yes| E[L2: memsearch expand]
    E --> F[Full .md Section]
    F --> G{Need original dialogue?}
    G -->|No| Z
    G -->|Yes| H[L3: memsearch transcript]
    H --> I[Raw Conversation]
    
    style B fill:#4a90d9
    style E fill:#f5a623
    style H fill:#d0021b

Layer 1: Semantic Search

The first layer performs hybrid search combining semantic vector similarity with keyword-based BM25 ranking. This returns ranked chunks that match the user's query intent.

Command: memsearch search "<query>" --top-k 5 --json-output

Output: A list of ranked memory chunks with metadata including chunk_hash, relevance scores, and anchor comments.

memsearch search "batch size optimization" --top-k 5 --json-output --collection ms_project_abc12345

When to use:

Quick factual recall ("did we discuss X?")
Initial exploration of a topic
Questions about past decisions or discussions

Source: plugins/openclaw/skills/memory-recall/SKILL.md:decision-guide

Layer 2: Section Expansion

When a chunk contains relevant information but needs more surrounding context, the agent escalates to expand the full markdown section.

Command: memsearch expand <chunk_hash>

What it retrieves:

The complete markdown section containing the matched chunk
Surrounding headings and context
Session anchor comments (<!-- session:... -->)

Output format from skill file:

Returns: full section text, may include transcript anchors
Use for: "Show me the details", "I need more context on that result"

Source: plugins/openclaw/index.ts:memory_get

Layer 3: Transcript Deep Drill

The deepest layer retrieves the original conversation transcript from the session archive. This is used when the agent needs to see exactly what was said, including tool calls and exact wording.

Trigger: When the expanded chunk contains a session anchor in a format such as <!-- session:UUID turn:TURN_ID db:PATH --> (see Anchor Format).

Implementation:

memory_transcript: tool({
  description: "Retrieve the original conversation from a past OpenCode session.",
  args: {
    session_id: tool.schema.string().describe("The session ID from the anchor comment"),
    turn_id: ...
  }
})

Output: Formatted dialogue with [User] and [Assistant] labels and tool calls.

Source: plugins/openclaw/index.ts:memory_transcript

Decision Framework

The decision of which layer to use depends on the user's intent:

User Intent	Required Layers	Tools
Quick recall ("did we discuss X?")	L1 only	`memory_search`
Need details ("what was the solution?")	L1 → L2	`memory_search` → `memory_get`
Need original dialogue ("show me the exact conversation")	L1 → L2 → L3	`memory_search` → `memory_get` → `memory_transcript`

Source: plugins/openclaw/skills/memory-recall/SKILL.md:decision-guide
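The escalation in the table can be sketched as a small driver. The three callables are stand-ins for the search, expand, and transcript tools; `progressive_recall` and its flags are hypothetical names used only for illustration.

```python
def progressive_recall(query, search, expand, transcript, *,
                       want_details=False, want_dialogue=False):
    """Run L1 always; run L2 and L3 only when the caller asks for them."""
    chunks = search(query)                      # L1: ranked chunks
    result = {"chunks": chunks}
    if not chunks or not (want_details or want_dialogue):
        return result
    section = expand(chunks[0]["chunk_hash"])   # L2: full markdown section
    result["section"] = section
    # L3 is only possible when the section carries a session anchor.
    if want_dialogue and "<!-- session:" in section:
        result["dialogue"] = transcript(section)
    return result
```

Keeping L2 and L3 behind explicit flags mirrors the agent-driven design: the cheap search always runs, and the more expensive layers run only when the user's intent calls for them.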

Anchor System

Progressive retrieval relies on anchor comments embedded in memory chunks. These anchors provide the linkage between layers.

Anchor Format

<!-- session:UUID turn:TURN_ID db:PATH -->

Or for Codex transcripts:

<!-- session:UUID transcript:PATH -->

Supported Anchor Prefixes

Prefix	Description
`session:`	Session UUID identifier
`turn:`	Specific turn within the session
`transcript:`	Path to the original transcript file
`rollout:`	Alternative transcript reference (OpenCode variant)
`db:`	Database path for session storage

Note: If the anchor format is unfamiliar (e.g., rollout:, turn:, db: instead of transcript:), try reading the referenced file directly to explore its structure and locate the relevant conversation by the session or turn identifiers.

Source: plugins/openclaw/index.ts:memory_transcript
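A minimal parser for the anchor formats listed above, assuming fields are separated by whitespace and values contain no spaces. `parse_anchor` is a hypothetical helper, not a memsearch function.

```python
import re

ANCHOR_RE = re.compile(r"<!--\s*(.*?)\s*-->", re.DOTALL)
KNOWN_PREFIXES = {"session", "turn", "transcript", "rollout", "db"}


def parse_anchor(comment: str) -> dict[str, str]:
    """Extract key:value fields from the first HTML comment in `comment`."""
    match = ANCHOR_RE.search(comment)
    if not match:
        return {}
    fields = {}
    for token in match.group(1).split():
        # Split on the first colon only, so paths keep any later colons.
        key, sep, value = token.partition(":")
        if sep and key in KNOWN_PREFIXES:
            fields[key] = value
    return fields
```

A caller can then branch on which keys are present, for example reading `transcript` when it exists and falling back to `db` plus `turn` otherwise.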

Agent Integration

Progressive retrieval is designed for agentic use. The agent's memory-recall skill orchestrates the flow:

Claude Code Integration

## Steps

1. **Search**: Run `memsearch search "<query>" --top-k 5 --json-output --collection <collection>`
2. If results are relevant, proceed to expand
3. If anchor comments exist, offer to retrieve transcripts

Source: plugins/claude-code/skills/memory-recall/SKILL.md:steps

Codex Integration

The Codex plugin's memory-recall skill follows the same pattern:

1. **Search**: Run `memsearch search "<query>" --top-k 5 --json-output --collection <collection>`
   - If `memsearch` is not found, try `uvx memsearch` instead
2. Choose a search query that captures the intent
3. Expand results as needed

Source: plugins/codex/skills/memory-recall/SKILL.md

Configuration Options

Collection Derivation

Progressive retrieval tools automatically derive the correct collection name based on context:

# From MEMSEARCH_DIR if set
bash "${CLAUDE_PLUGIN_ROOT}/scripts/derive-collection.sh" "$MEMSEARCH_DIR"

# From git root if available
bash "${CLAUDE_PLUGIN_ROOT}/scripts/derive-collection.sh" "$(git rev-parse --show-toplevel)"

# Fallback to default
bash "${CLAUDE_PLUGIN_ROOT}/scripts/derive-collection.sh"

Search Parameters

Parameter	Default	Description
`--top-k`	5	Number of chunks to return
`--json-output`	false	Return machine-readable JSON
`--collection`	auto	Collection name (auto-derived if not specified)

Usage Tips

Start shallow — Begin with semantic search and escalate only when necessary
Check anchor comments — Expanded sections reveal if transcripts are available
Rephrase if needed — If L1 returns no results, try different keywords
Results are sorted by relevance — Hybrid BM25 + vector search determines ranking

Source: plugins/openclaw/skills/memory-recall/SKILL.md:tips

Related concepts:

Memory Capture — The complementary process that stores memories in the first place
Session Anchors — Metadata format enabling transcript retrieval
Project Review — Periodic memory consolidation

For information about configuring the search backend or embedding provider, see the main Configuration documentation.

Hybrid Search and Deduplication

Overview

memsearch implements a hybrid search architecture that combines semantic vector similarity with traditional keyword (BM25) matching to provide robust information retrieval. The deduplication layer operates at multiple levels: during indexing through content fingerprinting and at search time through result ranking that deprioritizes semantically similar chunks.

The system uses Milvus as its vector database backend, with Milvus Lite for local development and embedded use, and supports remote Milvus deployments for production workloads. Source: README.md

Architecture

Search Pipeline

memsearch's recall mechanism follows a three-layer progressive disclosure model:

graph TD
    A[User Query] --> B[L1: memsearch search<br/>Hybrid BM25 + Vector]
    B --> C{Relevant<br/>Results?}
    C -->|Yes| D[Return ranked chunks<br/>with chunk_hash]
    C -->|Need more context| E[L2: memsearch expand<br/>Full markdown section]
    E --> F{Raw<br/>dialogue needed?}
    F -->|Yes| G[L3: Parse transcript<br/>Original conversation]
    F -->|No| H[Use expanded content]
    G --> I[Formatted dialogue<br/>with session anchors]
    
    style B fill:#e1f5fe
    style E fill:#fff3e0
    style G fill:#e8f5e9

This architecture allows users to progressively drill from quick recall to original conversation context. Source: README.md

Indexing Pipeline

The indexing pipeline transforms markdown files into searchable chunks with deduplication at multiple stages:

graph LR
    A[Markdown Files] --> B[Scanner<br/>Discovers .md files]
    B --> C[Chunker<br/>Split by headings]
    C --> D[Content Cleaning<br/>Remove noise, validate]
    D --> E{Duplicate<br/>Check?}
    E -->|New| F[Generate Embedding<br/>ONNX or OpenAI]
    E -->|Duplicate| G[Skip/Update<br/>Existing chunk]
    F --> H[Upsert to Milvus<br/>with chunk_hash PK]
    
    style E fill:#ffebee
    style H fill:#e8f5e9

Content Chunking

Markdown-Aware Splitting

The chunker splits markdown files by heading hierarchy, preserving semantic context. Source: src/memsearch/chunker.py:1-15

Heading Level	Split Behavior
`# H1`	New top-level section, becomes separate chunk
`## H2`	Subsection chunk, includes parent heading
`### H3+`	Nested chunk with full heading path
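A simplified sketch of heading-aware splitting that records the full heading path for each chunk. It is not the project's chunker: it ignores size limits and does not special-case `#` lines inside code blocks.

```python
import re

HEADING_RE = re.compile(r"^(#{1,6})\s+(.*)$")


def chunk_by_headings(markdown: str) -> list[dict]:
    """Split on headings; each chunk records its full heading path."""
    chunks, path, body = [], [], []

    def flush():
        text = "\n".join(body).strip()
        if text:  # heading-only sections produce no chunk
            chunks.append({"heading_path": " > ".join(t for _, t in path),
                           "content": text})
        body.clear()

    for line in markdown.splitlines():
        m = HEADING_RE.match(line)
        if m:
            flush()  # body so far belongs to the previous heading
            level = len(m.group(1))
            # Drop headings at the same or deeper level, then push this one.
            while path and path[-1][0] >= level:
                path.pop()
            path.append((level, m.group(2).strip()))
        else:
            body.append(line)
    flush()
    return chunks
```

Storing the heading path with each chunk is what lets a nested `###` section stay understandable when it is retrieved on its own.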

Content Cleaning

Before embedding, content passes through a cleaning pipeline that removes metadata noise:

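import re

# Assumed definition; the excerpt does not show how _HTML_COMMENT_RE is built.
_HTML_COMMENT_RE = re.compile(r"<!--.*?-->", re.DOTALL)

def clean_content_for_embedding(text: str) -> str:
    """Strip metadata noise from chunk content before embedding."""
    cleaned = _HTML_COMMENT_RE.sub("", text)  # Remove <!-- session:UUID -->
    cleaned = re.sub(r"\n{3,}", "\n\n", cleaned)  # Collapse blank lines
    return cleaned.strip()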

This cleaning:

Removes HTML comments containing session UUIDs and transcript paths
Collapses excessive blank lines left by removed comments
Preserves original content in Milvus (only affects embedding input) Source: src/memsearch/chunker.py:19-30

Content Validation

Chunks undergo minimum meaningful length validation to filter out heading-only sections:

Check	Threshold	Behavior
Minimum meaningful text	2 characters	Chunks below threshold are dropped
Heading lines	Stripped	Not counted toward length
HTML comments	Stripped	Not counted toward length

Source: src/memsearch/chunker.py:32-45
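The validation rule can be sketched as follows, assuming heading lines are those starting with `#` and that the 2-character threshold is inclusive. Both helper names and the comment pattern are illustrative assumptions.

```python
import re

_HTML_COMMENT_RE = re.compile(r"<!--.*?-->", re.DOTALL)  # assumed pattern
MIN_MEANINGFUL_CHARS = 2  # threshold from the table above


def meaningful_length(chunk: str) -> int:
    """Count characters left after dropping comments and heading lines."""
    text = _HTML_COMMENT_RE.sub("", chunk)
    kept = [ln for ln in text.splitlines() if not ln.lstrip().startswith("#")]
    return len("".join(kept).strip())


def is_meaningful(chunk: str) -> bool:
    """True when the chunk has enough body text to be worth indexing."""
    return meaningful_length(chunk) >= MIN_MEANINGFUL_CHARS
```

A section that contains only a heading and a session anchor therefore has length 0 and is dropped before embedding.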

Deduplication Strategy

Primary Key Architecture

memsearch uses chunk_hash as the Milvus primary key, ensuring each unique content fingerprint exists exactly once. Source: src/memsearch/chunker.py:10

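from dataclasses import dataclass
from pathlib import Path

@dataclass(frozen=True)
class ScannedFile:
    """Metadata for a discovered markdown file."""
    path: Path
    mtime: float  # modification time, used with size for file-level checks
    size: int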

Deduplication Levels

Level	Mechanism	Scope
File scanning	Skip files with identical mtime+size	Repository scan
Content hashing	SHA256 of normalized chunk text	Cross-file
Milvus upsert	Primary key collision prevention	Database
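A sketch of content hashing for the second level in the table. The document states that the hash is a SHA256 of normalized chunk text but does not define the normalization, so collapsing whitespace here is an assumption.

```python
import hashlib


def chunk_hash(text: str) -> str:
    """SHA-256 over whitespace-normalized text (normalization is assumed)."""
    normalized = " ".join(text.split())
    return hashlib.sha256(normalized.encode("utf-8")).hexdigest()
```

With this choice, two chunks that differ only in spacing or trailing newlines share one hash, so they occupy a single primary key in Milvus.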

Common Deduplication Issues

#### Duplicate Primary Keys in Batch

When indexing produces duplicate content hashes within a single batch, Milvus rejects the operation:

pymilvus.exceptions.MilvusException: (code=1100, 
message=duplicate primary keys are not allowed in the same batch)

This occurs when:

The same markdown section exists in multiple files
Re-indexing without content changes
File symlinks or hard links creating duplicates Source: Community Issue #539

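Resolution: The chunk_hash deduplication should prevent this across runs, but the batching logic must also ensure that each chunk_hash appears at most once within a batch, for example by normalizing input order and dropping repeats before the upsert.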
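One way to avoid the batch error above is to drop repeated hashes before each upsert. This is a sketch of that idea with a hypothetical helper; it assumes each row is a dict carrying a `chunk_hash` key.

```python
def dedupe_batch(rows: list[dict]) -> list[dict]:
    """Keep the first row for each chunk_hash, preserving input order."""
    seen: set[str] = set()
    unique = []
    for row in rows:
        if row["chunk_hash"] not in seen:
            seen.add(row["chunk_hash"])
            unique.append(row)
    return unique
```

Keeping the first occurrence makes the result deterministic as long as the scanner returns files in a stable order.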

Hybrid Search Implementation

BM25 + Vector Combination

memsearch combines keyword and semantic search for improved relevance: Source: plugins/openclaw/skills/memory-recall/SKILL.md

Search Type	Strength	Use Case
BM25	Exact keyword matching, rare terms	Technical names, exact phrases
Vector	Semantic similarity, synonyms	Conceptual queries, vague recall
Hybrid	Balanced relevance	Most queries
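This section does not state how the two rankings are combined. Reciprocal rank fusion (RRF) is one widely used method, sketched here for illustration; the constant `k = 60` is the conventional default, not a memsearch setting.

```python
def rrf_fuse(rankings: list[list[str]], k: int = 60) -> list[tuple[str, float]]:
    """Reciprocal rank fusion: score(d) = sum over lists of 1 / (k + rank)."""
    scores: dict[str, float] = {}
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] = scores.get(doc_id, 0.0) + 1.0 / (k + rank)
    # Highest fused score first; ties broken by id for stable output.
    return sorted(scores.items(), key=lambda item: (-item[1], item[0]))


bm25_hits = ["a", "b", "c"]    # keyword ranking, best first
vector_hits = ["b", "c", "a"]  # semantic ranking, best first
fused = rrf_fuse([bm25_hits, vector_hits])
```

RRF uses only rank positions, so BM25 scores and vector similarities never need to be put on a common scale.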

Search Result Ranking

Results are ranked by combined relevance scores. The system returns:

{
  "chunk_hash": "sha256...",
  "content": "markdown section...",
  "score": 0.85,
  "session_anchors": "<!-- session:UUID -->"
}

Expand Operation

The memsearch expand command retrieves the full markdown section surrounding a matched chunk: Source: plugins/opencode/index.ts

Parameter	Type	Description
`chunk_hash`	string	Hash from search result
`collection`	string	Milvus collection name

Returns the complete section with heading context for full understanding.

Configuration Options

Embedding Provider Selection

Provider	Model	API Key Required	Performance
`onnx` (default)	bge-m3	No	Local, fast
`openai`	text-embedding-3-small	Yes	Cloud, accurate

# Use local ONNX embeddings (default)
memsearch config set embedding.provider onnx

# Use OpenAI embeddings
memsearch config set embedding.provider openai

Milvus Backend Configuration

Mode	URI	Use Case
Milvus Lite	`.db` file	Local/dev, single user
Remote Milvus	`http://host:19530`	Production, team sharing

# Local Milvus Lite (default)
memsearch config set milvus.uri ./memory.db

# Remote Milvus server
memsearch config set milvus.uri http://localhost:19530
memsearch config set milvus.token "your-token"

CLI Commands

Search Command

memsearch search <query> [options]

Option	Default	Description
`--collection`, `-c`	auto-derived	Collection name
`--limit`	10	Max results
`--rerank`	true	Enable hybrid reranking

Index Command

memsearch index <paths...> [options]

Option	Default	Description
`--batch-size`	100	Chunks per batch
`--max-chunk-size`	512	Max tokens per chunk
`--collection`	auto	Target collection

Source: src/memsearch/cli.py

Stats Command

memsearch stats [options]

Known Issue: When the collection name is set only in the config file (not via the -c flag), stats may report 0 chunks. Verify the count by passing the collection explicitly with -c. Source: Community Issue #538

Watch Mode and Auto-Indexing

Watch mode monitors paths for changes and auto-indexes updates:

graph LR
    A[File Change] --> B[Debounce<br/>cfg.watch.debounce_ms]
    B --> C[Re-chunk<br/>Modified file]
    C --> D[Update Milvus<br/>Upsert changed chunks]
    D --> E[Collection<br/>auto-load]

Watch Lifecycle

Initial index: All existing files indexed before watching begins
Change detection: File system events trigger re-index
Milvus collection state: Collection must be loaded for search Source: src/memsearch/cli.py
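The debounce step can be sketched with a small class that waits for a quiet period before releasing a changed path. `Debouncer` is a hypothetical name, and timestamps are passed in explicitly so the logic can be exercised without a real file watcher.

```python
class Debouncer:
    """Collect changed paths and release them after a quiet period."""

    def __init__(self, debounce_ms: int):
        self.delay = debounce_ms / 1000.0
        self.pending: dict[str, float] = {}  # path -> time of last event

    def notify(self, path: str, now: float) -> None:
        self.pending[path] = now  # a newer event restarts the wait

    def due(self, now: float) -> list[str]:
        """Return and clear the paths whose quiet period has elapsed."""
        ready = sorted(p for p, t in self.pending.items()
                       if now - t >= self.delay)
        for p in ready:
            del self.pending[p]
        return ready
```

Several quick saves of the same file then trigger a single re-chunk and upsert rather than one per write.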

Common Issues and Resolutions

Collection in 'Released' State

When using Milvus Lite without an active watch process:

MilvusException: Collection 'ms_project_abc12345' is in state 'released'

Cause: Milvus Lite collections are unloaded when the watch process exits.

Solutions:

Keep watch process running for persistent search
Call load() before search if using programmatic API
Use remote Milvus for stateless search operations Source: Community Issue #540
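For the programmatic route, a minimal sketch of loading before searching. It assumes `client` behaves like pymilvus `MilvusClient`, with `load_collection` and `search` methods; `search_with_load` is a hypothetical wrapper, not part of memsearch.

```python
def search_with_load(client, collection: str, **search_kwargs):
    """Load the collection, then search it.

    `client` is assumed to expose MilvusClient-style load_collection()
    and search() methods.
    """
    # Loading first avoids the 'released' state error on a fresh process.
    client.load_collection(collection_name=collection)
    return client.search(collection_name=collection, **search_kwargs)
```

The wrapper loads on every call for simplicity; a longer-lived process could load once at startup instead.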

Durability on Remote Milvus 2.5+

MilvusStore.upsert() may report success without durable writes:

Indexed 219 chunks  # Reported
row_count: 0        # Actual after refresh

Cause: Missing flush() call after upsert on remote Milvus 2.5+.

Fix: Ensure flush is called after batch upsert operations. Source: Community Issue #534
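A sketch of the flush-after-upsert pattern, assuming `client` is a pymilvus `MilvusClient`-style object with `upsert`, `flush`, and `get_collection_stats`. `upsert_durably` is a hypothetical helper and not the project's `MilvusStore` code.

```python
def upsert_durably(client, collection: str, rows: list[dict]) -> int:
    """Upsert rows, flush, and return the row count reported afterwards."""
    client.upsert(collection_name=collection, data=rows)
    # Flushing seals the written data so that it is persisted.
    client.flush(collection_name=collection)
    stats = client.get_collection_stats(collection_name=collection)
    return int(stats["row_count"])
```

Comparing the returned count with the number of chunks indexed gives a quick check for the mismatch shown above.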

Memory Leak During Indexing

Large corpus indexing may cause OOM issues with ONNX provider:

Growth per chunk	Growth pattern	Reported failure point
~9 MB	Linear	OOM kill on a 15 GB host

Workaround: Process in smaller batches or restart periodically. Source: Community Issue #533
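The batching workaround can be sketched with a simple generator. `batched` here is a local helper; how each batch is then indexed (for example in a fresh process, so leaked memory is returned to the system) is left to the caller and is an assumption.

```python
from typing import Iterable, Iterator


def batched(items: Iterable, size: int) -> Iterator[list]:
    """Yield successive lists of at most `size` items."""
    if size < 1:
        raise ValueError("size must be >= 1")
    batch = []
    for item in items:
        batch.append(item)
        if len(batch) == size:
            yield batch
            batch = []
    if batch:  # final partial batch
        yield batch
```

Since the reported growth is per chunk within a single process, the batch size roughly bounds the peak memory of each run.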

Plugin Integration

Claude Code Plugin

The plugin hooks into conversation lifecycle for automatic memory capture:

graph TD
    A[User Turn] --> B[Agent Response]
    B --> C[Stop Hook Fires]
    C --> D[Parse Last Turn]
    D --> E[LLM Summarize<br/>haiku model]
    E --> F[Append to memory/YYYY-MM-DD.md]
    F --> G[memsearch index]
    
    style C fill:#fff3e0
    style E fill:#e8f5e9
    style G fill:#e1f5fe

Source: AGENT.md

Memory Recall Tools

Tool	Input	Output	Use Case
`memory_search`	Query string	Ranked chunks	Quick recall
`memory_get`	chunk_hash	Full section	Need details

Summary

Hybrid search in memsearch combines BM25 keyword matching with vector similarity for robust recall. Deduplication operates through content hashing at the chunking stage, with Milvus primary keys preventing database-level duplicates. The three-layer search architecture (search → expand → transcript) provides progressive disclosure from quick recall to original context.

Key operational considerations:

Use watch mode or explicitly load collections for Milvus Lite search
Monitor memory usage during large corpus indexing with ONNX
Ensure collection flushing on remote Milvus 2.5+ for durability
Pass the collection name explicitly with -c when running stats; a name set only in the config file may be reported as 0 chunks

Memory Storage

Related topics: Milvus Integration, Indexing Pipeline, System Architecture

Overview

Memory Storage in memsearch is the system responsible for persisting, indexing, and retrieving conversational context and project knowledge. The architecture follows a dual-storage model where Markdown files serve as the source of truth and Milvus provides a rebuildable shadow index for fast semantic search.

Source: README.md:1-50

The storage system is designed to capture memory from AI coding agent sessions, organize it into structured markdown files, and enable rapid retrieval through hybrid vector and keyword search.

Architecture

Dual-Storage Model

graph TD
    subgraph "Source of Truth"
        MD[Markdown Files<br/>memory/YYYY-MM-DD.md]
    end
    
    subgraph "Shadow Index"
        MV[Milvus Collection<br/>Vector + Text Fields]
    end
    
    subgraph "Capture and Indexing Pipeline"
        CAP[Capture<br/>Stop Hook]
        SCR[Scanner]
        CHK[Chunker]
        EMB[Embedding Provider]
        IDX[Indexer]
    end
    
    CAP --> MD
    MD --> SCR
    SCR --> CHK
    CHK --> EMB
    EMB --> IDX
    IDX --> MV
    
    MV -.->|rebuildable from| MD

The architecture enforces a clear principle: Markdown files are always the source of truth, and Milvus is a secondary index that can be rebuilt entirely from the markdown files.

Source: README.md:40-55

Storage Components

Component	Role	Key Files
File Scanner	Discovers markdown files recursively	`src/memsearch/scanner.py`
Chunker	Splits content into searchable units	`src/memsearch/chunker.py`
Embedding Provider	Generates vector representations	ONNX (local) or OpenAI (API)
Milvus Store	Persists vectors and searchable text	`MilvusStore` class
Compactor	Summarizes chunks via LLM	`src/memsearch/compact.py`

Source: src/memsearch/scanner.py:1-60

Memory Capture Flow

Memory capture occurs at the end of each conversation turn through plugin hooks (Stop hooks). The process transforms raw agent interactions into structured, searchable memories.

Capture Workflow

sequenceDiagram
    participant User
    participant Agent
    participant Plugin as Stop Hook
    participant LLM as Summarizer
    participant FS as File System
    participant IDX as memsearch index

    User->>Agent: Conversation turn
    Agent->>User: Response
    Plugin->>Plugin: Detect turn end
    Plugin->>Plugin: Parse last turn
    Plugin->>LLM: Summarize turn
    LLM-->>Plugin: "User asked about X.<br/>Agent did Y."
    Plugin->>FS: Append to memory/YYYY-MM-DD.md
    Note over FS: <!-- session:UUID turn:ID db:PATH -->
    Plugin->>IDX: memsearch index
    IDX->>IDX: Chunk → Embed → Store

The summarization uses lightweight models (e.g., haiku) to create concise bullet-point summaries that preserve key facts, decisions, and actionable insights.

Source: README.md:56-80

Session Anchors

Each captured memory entry includes metadata anchors that enable deep drill into original conversations:

<!-- session:abc12345-6789 turn:3 db:/path/to/memory.db -->

The anchor format supports:

session:ID — Session identifier
turn:ID — Turn number within session
db:PATH — Path to the database file

Source: plugins/opencode/index.ts:1-30
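A sketch of the append step of capture, assuming each entry is written as an anchor line followed by the summary text. The entry layout and the `append_memory` helper are illustrative; only the file naming (memory/YYYY-MM-DD.md) and the anchor fields come from this document.

```python
from datetime import date
from pathlib import Path


def append_memory(memory_dir, summary, session_id, turn_id, db_path, today=None):
    """Append one summarized turn, with its anchor, to the daily journal file."""
    day = today or date.today()
    target = Path(memory_dir) / f"{day.isoformat()}.md"
    target.parent.mkdir(parents=True, exist_ok=True)
    anchor = f"<!-- session:{session_id} turn:{turn_id} db:{db_path} -->"
    # Append mode keeps earlier entries from the same day intact.
    with target.open("a", encoding="utf-8") as fh:
        fh.write(f"{anchor}\n{summary.strip()}\n\n")
    return target
```

Because the file is only ever appended to, the journal stays readable in chronological order and can be re-indexed at any time.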

Memory Organization

Directory Structure

Memsearch stores memory in a project-local directory structure:

project/
├── .memsearch/
│   ├── memory/
│   │   ├── 2026-03-27.md
│   │   ├── 2026-03-26.md
│   │   └── ...
│   └── config.yaml
└── .git/

Each markdown file represents one day's worth of conversation memory, following a journal-style organization pattern.

Source: README.md:30-45

Project Review Format

The project review prompt defines the expected structure for consolidated memory entries:

# Project Memory
## Current Direction
## Active Threads
## Recent Progress
## Decisions
## Open Questions
## Risks and Constraints
## Next Steps
## Cold Items

Source: src/memsearch/prompts/project_review.txt:1-25

Retrieval System

Memsearch implements a 3-layer progressive retrieval system for memory recall:

Retrieval Layers

graph LR
    A[User Query] --> B[L1 Search]
    B --> C[Ranked Chunks]
    C --> D{More details needed?}
    D -->|Yes| E[L2 Expand]
    E --> F[Full Section]
    F --> G{Original dialogue?}
    G -->|Yes| H[L3 Transcript]
    H --> I[Raw Conversation]
    
    D -->|No| J[Use Results]
    G -->|No| J

Layer 1: Semantic Search

The first layer uses hybrid search combining:

Vector similarity — Semantic matching via embeddings
BM25 keyword matching — Exact term matching

Returns ranked chunk summaries with chunk_hash identifiers.

Source: plugins/claude-code/skills/memory-recall/SKILL.md:1-30

Layer 2: Chunk Expansion

When search results lack sufficient detail, memory_get expands a specific chunk_hash to show:

Full markdown section with surrounding context
Transcript anchors (<!-- session:... -->)

Source: plugins/opencode/index.ts:5-35

Layer 3: Transcript Parsing

For exact original conversations, memory_transcript parses the raw session transcript referenced in anchors:

Returns formatted dialogue with [User] and [Assistant] labels
Includes tool calls and system responses
Supports surrounding context via turn_id

Source: plugins/opencode/index.ts:35-55

Embedding Providers

Memsearch supports multiple embedding backends:

Provider Configuration

Provider	Type	API Key Required	Default Model
`onnx`	Local CPU/GPU	No	bge-m3
`openai`	Remote API	Yes	text-embedding-3-small

To switch providers:

# Use OpenAI embeddings
memsearch config set embedding.provider openai

# Use local ONNX (default)
memsearch config set embedding.provider onnx

Source: plugins/codex/README.md:1-40

ONNX Provider Notes

The ONNX provider runs locally and is the default choice for privacy-sensitive environments. However, community issues have reported memory usage concerns with large corpora:

Indexing leaks ~9 MB anon-rss per chunk in single process; OOM-kills mid-corpus on 15 GB host

This is particularly relevant when indexing multi-MB markdown corpora.

Source: Community Issue #533

Milvus Backend Configuration

Storage Modes

Mode	Use Case	URI Example
Milvus Lite	Local/single-user	`milvus_lite.db`
Remote Server	Team/shared	`http://localhost:19530`

Configuration:

# Use a remote Milvus server
memsearch config set milvus.uri http://localhost:19530

Source: plugins/codex/README.md:25-30

Collection Management

Collections are derived from project paths using the derive-collection.sh script:

bash __INSTALL_DIR__/scripts/derive-collection.sh <project_path>

The collection name follows the pattern: ms_<project_identifier>

Source: plugins/codex/skills/memory-config/SKILL.md:1-20

Known Issues

Collection Released State: With Milvus Lite, CLI commands run outside an active session may fail because the collection is in the 'released' state:

pymilvus.exceptions.MilvusException: Collection 'ms_project_abc12345' 
is in state 'released'; call load() before search/get/query

This requires calling load() before search operations in standalone CLI usage.

Source: Community Issue #540

Memory Compaction

For long-running sessions, memsearch provides compaction to summarize and compress memory chunks using an LLM.

Supported Providers

OpenAI (default)
Anthropic
Gemini

API keys are read from environment variables:

OPENAI_API_KEY / OPENAI_BASE_URL
ANTHROPIC_API_KEY
GOOGLE_API_KEY

Source: src/memsearch/compact.py:1-30

Compaction Prompt

The default prompt instructs the LLM to preserve:

Key facts
Decisions
Code patterns
Actionable insights
Technical details and code snippets

COMPACT_PROMPT = """\
You are a knowledge compression assistant. Given the following chunks of text \
from a knowledge base, create a concise but comprehensive summary that preserves \
all key facts, decisions, code patterns, and actionable insights.
...
"""

Source: src/memsearch/compact.py:10-25

Pre-Compaction Transcript Capture

A known limitation exists where the Stop hook's transcript parsing runs after compaction:

Pre-compaction transcript not captured by Stop hook? — A long agentic run that compacts mid-stream seems to lose that detail before memsearch ever sees it.

Source: Community Issue #537

Configuration

Configuration Hierarchy

Settings can be configured at multiple levels:

Level	Command	Scope
Global	`memsearch config init`	All projects
Project	`memsearch config init --project`	Current project only
Session	Environment variable `MEMSEARCH_DIR`	Active session

Source: plugins/codex/skills/memory-config/SKILL.md:1-15
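
The table above does not spell out precedence between levels. As a sketch only, assuming narrower scopes override broader ones (an assumption, not confirmed by the source), layered settings can be resolved with a recursive merge. `deep_merge` and `get_option` are hypothetical helpers, not memsearch functions.

```python
def deep_merge(base, override):
    """Return a new dict where values from `override` win over `base`."""
    merged = dict(base)
    for key, value in override.items():
        if isinstance(value, dict) and isinstance(merged.get(key), dict):
            merged[key] = deep_merge(merged[key], value)  # merge nested sections
        else:
            merged[key] = value
    return merged


def get_option(config, dotted_key, default=None):
    """Look up a dotted key such as 'embedding.provider' in a nested dict."""
    node = config
    for part in dotted_key.split("."):
        if not isinstance(node, dict) or part not in node:
            return default
        node = node[part]
    return node


# Example values only; real files are written by `memsearch config init`.
global_cfg = {"embedding": {"provider": "onnx"}, "milvus": {"uri": "milvus_lite.db"}}
project_cfg = {"embedding": {"provider": "openai"}}
effective = deep_merge(global_cfg, project_cfg)
print(get_option(effective, "embedding.provider"))
print(get_option(effective, "milvus.uri"))
```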

Key Configuration Options

Option	Description	Default
`embedding.provider`	Embedding backend	`onnx`
`milvus.uri`	Milvus connection URI	`milvus_lite.db`
`plugins.codex.summarize.model`	Summarization model	Native
`plugins.codex.summarize.provider`	Summarization provider	`native`
`llm.providers.openai.model`	OpenAI model	`gpt-5-mini`

Source: plugins/codex/README.md:30-55

File Scanning

The scanner discovers markdown files for indexing:

def scan_paths(
    paths: list[str | Path],
    *,
    extensions: tuple[str, ...] = (".md", ".markdown"),
    ignore_hidden: bool = True,
) -> list[ScannedFile]:

Source: src/memsearch/scanner.py:20-45

Scanning Behavior

Recursively walks directories
Filters by extension (.md, .markdown)
Skips hidden files/directories by default (configurable)
Deduplicates results by resolved path
Returns sorted list of ScannedFile objects

Source: src/memsearch/scanner.py:45-70
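
The behavior listed above can be approximated with the standard library alone. This is an illustrative sketch, not the code in scanner.py: `scan_markdown` is a hypothetical name, and it returns plain `Path` objects rather than `ScannedFile` records.

```python
from pathlib import Path


def scan_markdown(paths, extensions=(".md", ".markdown"), ignore_hidden=True):
    """Illustrative stand-in for a markdown scanner; not memsearch's code."""
    found = set()
    for raw in paths:
        root = Path(raw).resolve()
        # A single file is its own candidate; a directory is walked recursively.
        candidates = [root] if root.is_file() else root.rglob("*")
        for candidate in candidates:
            if not candidate.is_file() or candidate.suffix.lower() not in extensions:
                continue
            # Only inspect parts below the scan root, so a hidden ancestor
            # of the root itself does not exclude everything.
            parts = candidate.relative_to(root).parts if candidate != root else ()
            if ignore_hidden and any(part.startswith(".") for part in parts):
                continue
            found.add(candidate.resolve())  # the set deduplicates resolved paths
    return sorted(found)
```

Passing overlapping roots (for example a directory and one of its subdirectories) yields each file once, because results are keyed by resolved path before sorting.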

Troubleshooting

Common Issues

Issue	Symptom	Resolution
Collection released	Search fails with "Collection in state 'released'"	Ensure collection is loaded before search
Duplicate primary keys	Index fails with "duplicate primary keys"	Deduplicate content before indexing
Zero chunks reported	`memsearch stats` shows 0	Pass `-c` flag explicitly or check config
Non-durable writes	Remote Milvus 2.5+ reports 0 row_count	Verify flush behavior on remote servers

Source: Community Issues #540, #539, #538, #534

CLI Reference — Command-line interface for storage operations
Python API — Programmatic access to storage layer
Architecture — System-level architecture overview
Plugin Installation — Setting up plugins

Source: https://github.com/zilliztech/memsearch / Human Manual

Milvus Integration

Overview

Milvus is the vector search backend powering memsearch's semantic memory capabilities. Memsearch uses Milvus to store, index, and search high-dimensional embedding vectors generated from markdown content chunks. The integration supports both local file-based storage (Milvus Lite) and remote Milvus server deployments, enabling a range of use cases from personal single-user workflows to team-shared memory stores.

Architecture

Memsearch implements a dual-layer architecture where markdown files serve as the source of truth and Milvus acts as a rebuildable shadow index:

graph TD
    A[Markdown Files] -->|Index| B[memsearch API]
    B -->|Chunk & Embed| C[Milvus Vector Store]
    D[Search Query] -->|Semantic Search| C
    C -->|Ranked Results| E[User/Agent]
    A -->|Read for Expand| F[Full Section Retrieval]
    
    C -.->|Rebuildable| A

The MemSearch class in src/memsearch/__init__.py exposes the primary Python interface for interacting with Milvus:

from memsearch import MemSearch

Source: src/memsearch/__init__.py:1-5

Data Model

Chunk Schema

Each indexed markdown chunk produces a Milvus record with the following fields:

Field	Type	Description
`chunk_hash`	VARCHAR	Primary key; SHA-256 hash of the chunk
`content`	VARCHAR	Full chunk text
`path`	VARCHAR	Source file path
`heading`	VARCHAR	Parent heading for context
`chunk_order`	INT	Position within document
`mtime`	FLOAT	File modification time
`vector`	FLOAT_VECTOR	Embedding vector (dimension depends on provider)

Primary Key Constraints

The chunk_hash field serves as the primary key for upsert operations. Milvus enforces uniqueness within a batch: duplicate primary keys in the same upsert batch fail with error code 1100:

duplicate primary keys are not allowed in the same batch: invalid parameter

Source: src/memsearch/chunker.py:1-15
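
One way to avoid this error is to drop repeated primary keys from a batch before it is upserted. The sketch below is an assumption about how a caller might do that; `dedupe_batch` is a hypothetical helper and is not part of memsearch.

```python
def dedupe_batch(records, key="chunk_hash"):
    """Keep the first record for each primary key, preserving input order."""
    seen = set()
    unique = []
    for record in records:
        if record[key] in seen:
            continue  # a later duplicate would trigger the batch error
        seen.add(record[key])
        unique.append(record)
    return unique


batch = [
    {"chunk_hash": "aaa", "content": "Use Milvus Lite locally."},
    {"chunk_hash": "bbb", "content": "Add tests."},
    {"chunk_hash": "aaa", "content": "Use Milvus Lite locally."},
]
print(len(dedupe_batch(batch)))
```

Keeping the first occurrence is an arbitrary choice; since identical hashes imply identical hashed content, either copy carries the same text.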

Embedding Providers

Memsearch supports multiple embedding providers for vector generation:

Local ONNX (Default)

The default provider uses the BGE-M3 model via ONNX runtime, requiring no API keys:

memsearch config set embedding.provider onnx

OpenAI

Requires OPENAI_API_KEY environment variable or direct configuration:

memsearch config set embedding.provider openai

Configuration Options

Parameter	Default	Description
`embedding.provider`	`onnx`	Embedding provider selection
`embedding.model`	Provider-specific	Model identifier

Source: plugins/codex/README.md:1-40

Backend Configuration

Milvus Lite (Local)

Default configuration for single-user environments. Stores data in a local .db file:

memsearch config set milvus.uri ./memory/milvus.db

Remote Milvus Server

For team or production deployments:

memsearch config set milvus.uri http://localhost:19530
memsearch config set milvus.token <token>

Source: README.md:1-100

Collection Management

Collection Naming

Collections are automatically derived from the project directory name with an ms_ prefix. The full derivation follows this pattern:

Environment	Collection Name
Project `~/code/myapp`	`ms_myapp`
Global memory	`ms_global`

Collection State

Collections may enter a released state when using Milvus Lite outside an active session. When this occurs, search operations fail with:

Collection 'ms_project_abc12345' is in state 'released'; call load() before search/get/query

This commonly affects CLI search commands run outside of an active watch process. Source: Community Issue #540

Stats Reporting

The memsearch stats command retrieves row counts from Milvus:

memsearch stats --collection ms_project

Note: When collection name is set only in the config file, stats may report 0 chunks despite successful indexing. Use the explicit -c flag for reliable results. Source: Community Issue #538

CLI Operations

Index Command

Indexes markdown files and writes chunks to Milvus:

memsearch index <paths> [options]

Option	Description
`--collection`, `-c`	Target collection name
`--provider`	Embedding provider override
`--model`	Model override
`--batch-size`	Upsert batch size
`--base-url`	API base URL for remote providers
`--api-key`	API key for remote providers

Source: src/memsearch/cli.py:1-100

Search Command

Performs hybrid semantic + keyword search against the Milvus collection:

memsearch search <query> [options]

Watch Mode

Monitors paths for changes and automatically re-indexes:

memsearch watch <paths> [options]

Watch mode maintains collection load state, preventing the released state issue in active sessions.

Source: src/memsearch/cli.py:200-280

Content Processing Pipeline

Markdown Chunking

Chunking starts from the files that the scanner.py module discovers:

from memsearch.scanner import scan_paths, ScannedFile

files = scan_paths(["./memory"], extensions=(".md", ".markdown"))

Source: src/memsearch/scanner.py:1-50

Pre-Embedding Cleaning

Before vectorization, content is cleaned to improve embedding quality:

from memsearch.chunker import clean_content_for_embedding

cleaned = clean_content_for_embedding(text)

This function:

Removes HTML comments (<!-- ... -->) containing session UUIDs
Collapses multiple blank lines
Strips metadata noise

Source: src/memsearch/chunker.py:15-30
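
The cleaning steps listed above can be sketched with two regular expressions. This is an approximation for illustration, assuming that "collapsing" means reducing runs of three or more newlines to a single blank line; `clean_for_embedding` is a hypothetical name, and the real clean_content_for_embedding() may differ in detail.

```python
import re

HTML_COMMENT = re.compile(r"<!--.*?-->", re.DOTALL)
BLANK_RUN = re.compile(r"\n{3,}")


def clean_for_embedding(text: str) -> str:
    """Strip HTML comments, then collapse runs of blank lines."""
    without_comments = HTML_COMMENT.sub("", text)
    collapsed = BLANK_RUN.sub("\n\n", without_comments)
    return collapsed.strip()


sample = (
    "## Notes\n\n- Chose Milvus Lite\n\n"
    "<!-- session:abc123 turn:5 -->\n\n\n"
    "- Next: add tests\n"
)
print(clean_for_embedding(sample))
```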

Chunk Validation

Chunks must meet minimum meaningful content thresholds to be indexed:

from memsearch.chunker import _has_meaningful_content

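chunk_text = "## Decisions\n\nUse Milvus Lite locally."  # example chunk
indexable = []
if _has_meaningful_content(chunk_text):
    indexable.append(chunk_text)  # include in index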

Sections consisting only of headings without body content are rejected during chunking.

Source: src/memsearch/chunker.py:30-50

Known Issues

Durability on Remote Milvus 2.5+

MilvusStore.upsert() may report success while writes are not immediately durable on remote Milvus 2.5+ instances. The method returns len(chunks) as a success count even when underlying upserts are not flushed:

memsearch CLI reports Indexed 219 chunks while get_collection_stats() shows row_count: 0

A flush operation may be required for guaranteed durability. Source: Community Issue #534
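
To check whether indexed rows are actually visible, a script can flush the collection and read back the row count. This is a sketch under assumptions: `flush_and_count` is a hypothetical helper, and it expects a client exposing `flush(collection_name)` and `get_collection_stats(collection_name)`, as pymilvus's `MilvusClient` does.

```python
def flush_and_count(client, collection_name):
    """Flush pending writes, then return the row count Milvus reports."""
    client.flush(collection_name)
    stats = client.get_collection_stats(collection_name)
    return int(stats.get("row_count", 0))


# Usage sketch (assumes pymilvus is installed and the server is reachable):
# from pymilvus import MilvusClient
# client = MilvusClient(uri="http://localhost:19530")
# print(flush_and_count(client, "ms_project"))
```

Comparing this count with the number the CLI printed after indexing gives a quick signal of whether the mismatch described above is present.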

Memory Usage During Indexing

Large corpus indexing with the ONNX provider exhibits linear memory growth (~9 MB per chunk in anon-rss). For memory-constrained environments, consider batching or restarting the indexing process periodically. Source: Community Issue #533

Plugin Integration

Claude Code Plugin

The Claude Code plugin automatically indexes memory after each conversation turn:

memsearch config set plugins.claude-code.summarize.model haiku

Source: plugins/claude-code/skills/memory-config/SKILL.md:1-20

OpenCode Plugin

The OpenCode plugin provides memory_search, memory_get, and memory_transcript tools that query Milvus through the memsearch CLI:

const result = spawnSync("memsearch", ["search", query, "--collection", col]);

Source: plugins/opencode/index.ts:1-80

Configuration Reference

Environment Variables

Variable	Description
`OPENAI_API_KEY`	OpenAI API key for embeddings
`MILVUS_URI`	Milvus connection URI
`MILVUS_TOKEN`	Milvus authentication token

Config File Structure

milvus:
  uri: ./memory/milvus.db
  token: ""

embedding:
  provider: onnx
  model: ""

plugins:
  claude-code:
    summarize:
      provider: native
      model: haiku

Source: mkdocs.yml:1-50

Troubleshooting

Symptom	Cause	Solution
`released` state error	Milvus Lite collection unloaded	Run `memsearch watch` first, or call `load()`
0 chunks in stats	Collection name in config not recognized	Use `-c` flag explicitly
Duplicate key errors	Re-indexing without cleaning	Delete collection or use upsert with new hashes
Search returns no results	Content not indexed	Run `memsearch index` before searching

Indexing Pipeline

Related topics: Memory Storage, Milvus Integration

The indexing pipeline is the core mechanism by which memsearch transforms markdown files into searchable vector embeddings stored in Milvus. It serves as the bridge between human-editable markdown files and the Milvus vector database, ensuring that every change to memory files is reflected in search results.

Overview

At a high level, the pipeline runs five stages: it scans filesystem paths for markdown files, splits them into semantic chunks by heading structure, generates content hashes for deduplication, embeds the text of new or changed chunks, and upserts the results to Milvus.

graph TD
    A[Markdown Files] --> B[Scanner]
    B --> C[Chunker]
    C --> D[Hash Generation]
    D --> E{Hash Changed?}
    E -->|No| F[Skip - no Milvus call]
    E -->|Yes| G[Embed Content]
    G --> H[Upsert to Milvus]
    H --> I[Milvus Shadow Index]
    
    F -.-> J[(Sync)]
    I -.-> J

Key design principle: Markdown files are always the source of truth. Milvus is a rebuildable shadow index. If the Milvus collection is lost or corrupted, running memsearch index rebuilds it entirely from the markdown files.

Pipeline Components

File Scanner

The scanner module (src/memsearch/scanner.py) discovers markdown files across specified paths.

Source: scanner.py:1-50

@dataclass(frozen=True)
class ScannedFile:
    """Metadata for a discovered markdown file."""
    path: Path
    mtime: float
    size: int

Parameter	Type	Default	Description
`paths`	`list[str | Path]`	required	Files/directories to scan
`extensions`	`tuple[str, ...]`	`(".md", ".markdown")`	File extensions to include
`ignore_hidden`	`bool`	`True`	Skip files/directories starting with `.`

The scanner performs recursive directory traversal, collecting ScannedFile entries containing the absolute path, modification time, and file size. Results are sorted by path for deterministic ordering.

Semantic Chunker

The chunker (src/memsearch/chunker.py) splits markdown files into semantically meaningful units based on heading structure.

Source: chunker.py:1-30

The chunking logic uses regular expressions to detect headings:

_HEADING_RE = re.compile(r"^(#{1,6})\s+(.+)$", re.MULTILINE)
_HTML_COMMENT_RE = re.compile(r"<!--.*?-->", re.DOTALL)

Function	Purpose
`clean_content_for_embedding()`	Strips HTML comments and collapses blank lines before embedding
`_has_meaningful_content()`	Filters out chunks with insufficient text content

Minimum meaningful length: Chunks with fewer than 2 characters of substantive text after stripping metadata are dropped during indexing. This prevents indexing empty headings or purely decorative sections.

Content cleaning for embeddings: The clean_content_for_embedding() function removes HTML comments (which often contain session UUIDs and transcript paths) from text sent to the embedding model. The original Milvus-stored content remains unchanged.
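
As an illustration of heading-based chunking with the two patterns above, the sketch below splits a document at each heading and drops sections whose body is empty once comments are removed. It is not the chunker's actual code: `split_by_heading` and `has_body_text` are hypothetical names, text before the first heading is ignored for brevity, and the threshold of 2 characters follows the minimum described above.

```python
import re

HEADING_RE = re.compile(r"^(#{1,6})\s+(.+)$", re.MULTILINE)
HTML_COMMENT_RE = re.compile(r"<!--.*?-->", re.DOTALL)


def split_by_heading(markdown: str):
    """Return (heading, body) pairs, one per heading-delimited section."""
    matches = list(HEADING_RE.finditer(markdown))
    sections = []
    for index, match in enumerate(matches):
        # A section runs until the next heading, or to the end of the text.
        end = matches[index + 1].start() if index + 1 < len(matches) else len(markdown)
        body = markdown[match.end():end].strip()
        sections.append((match.group(2).strip(), body))
    return sections


def has_body_text(body: str, minimum: int = 2) -> bool:
    """True when at least `minimum` characters remain after removing comments."""
    return len(HTML_COMMENT_RE.sub("", body).strip()) >= minimum


doc = (
    "# Decisions\n\nUse Milvus Lite locally.\n\n"
    "## Empty\n\n<!-- session:abc123 -->\n\n"
    "## Next\n\nAdd tests.\n"
)
kept = [(h, b) for h, b in split_by_heading(doc) if has_body_text(b)]
print([heading for heading, _ in kept])
```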

Hash Generation and Deduplication

Each chunk receives a SHA-256 hash derived from its content. This serves two purposes:

Deduplication: Skip unchanged chunks when re-indexing
Content integrity: Detect modified chunks requiring re-embedding

graph LR
    A[Chunk Content] --> B[SHA-256 Hash]
    B --> C{Exists in Milvus?}
    C -->|Yes, hash matches| D[Skip]
    C -->|No or hash differs| E[Embed → Upsert]

Source: README.md — "hash each chunk (SHA-256) → hash unchanged → skip (no API call)"
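
The skip-if-unchanged decision can be sketched with `hashlib`. This is a simplified illustration: a plain set stands in for the hashes already stored in Milvus, the hash covers chunk content only, and `chunk_hash` and `plan_indexing` are hypothetical helper names.

```python
import hashlib


def chunk_hash(content: str) -> str:
    """SHA-256 hex digest of the chunk text."""
    return hashlib.sha256(content.encode("utf-8")).hexdigest()


def plan_indexing(chunks, indexed_hashes):
    """Split chunks into those needing embedding and those that can be skipped."""
    to_embed, skipped = [], []
    for content in chunks:
        digest = chunk_hash(content)
        if digest in indexed_hashes:
            skipped.append(digest)  # unchanged: no embedding call needed
        else:
            to_embed.append((digest, content))
    return to_embed, skipped


first_run = ["Use Milvus Lite locally.", "Add tests."]
to_embed, skipped = plan_indexing(first_run, indexed_hashes=set())
known = {digest for digest, _ in to_embed}

# On the second run only the edited chunk needs embedding again.
second_run = ["Use Milvus Lite locally.", "Add integration tests."]
to_embed_again, skipped_again = plan_indexing(second_run, known)
print(len(to_embed_again), len(skipped_again))
```

A full implementation would also delete rows whose hashes no longer appear in the source files; that step is omitted here.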

Embedding

Chunk text is converted to vector embeddings using the configured provider:

Provider	Description	API Key Required
`onnx`	BGE-M3 model, runs locally	No
`openai`	OpenAI embeddings	Yes (`OPENAI_API_KEY`)

Configuration:

# Switch to OpenAI embeddings
memsearch config set embedding.provider openai

# Use local ONNX (default)
memsearch config set embedding.provider onnx

Milvus Upsert

Embedded chunks are upserted to the Milvus collection with fields:

Field	Type	Description
`chunk_hash`	`VARCHAR`	Primary key (SHA-256)
`content`	`VARCHAR`	Full chunk text
`file_path`	`VARCHAR`	Source file path
`heading`	`VARCHAR`	Nearest heading title
`vector`	`FLOAT_VECTOR`	Embedding vector

Indexing Workflows

One-Time Indexing

Run memsearch index to scan and index all markdown files in specified paths:

# Index current directory
memsearch index .

# Index specific paths
memsearch index ./docs ./memory

# With custom collection
memsearch index . -c my_collection

Source: cli.py — index command implementation

Watch Mode (Live Indexing)

The memsearch watch command monitors filesystem paths for changes and automatically re-indexes modified files:

memsearch watch ./memory ./docs

Source: cli.py — watch command with event handler

def _on_event(event_type: str, summary: str, file_path) -> None:
    click.echo(summary)

Workflow:

graph TD
    A[memsearch watch] --> B[Initial Index]
    B --> C[Start Filesystem Watcher]
    C --> D{File Change Detected}
    D --> E[Debounce]
    E --> F[Re-chunk Changed File]
    F --> G[Compare Hashes]
    G --> H{Hash Changed?}
    H -->|Yes| I[Embed & Upsert]
    H -->|No| J[Skip]
    I --> C
    J --> C

Debounce: The watcher applies configurable debouncing (default: 500ms) to batch rapid file changes.
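
Debouncing can be sketched as "release a path only after it has been quiet for the delay". The class below is an illustration with explicit timestamps so that it is easy to test; `Debouncer` is a hypothetical name and does not reflect how memsearch's watcher is implemented.

```python
class Debouncer:
    """Collect change events and release a path once it has been quiet for delay_ms."""

    def __init__(self, delay_ms=500):
        self.delay_ms = delay_ms
        self._last_seen = {}  # path -> timestamp (ms) of the most recent event

    def record(self, path, now_ms):
        """Register a change event; repeated events reset the quiet period."""
        self._last_seen[path] = now_ms

    def pop_ready(self, now_ms):
        """Return and forget paths with no events for at least delay_ms."""
        ready = sorted(
            path for path, seen in self._last_seen.items()
            if now_ms - seen >= self.delay_ms
        )
        for path in ready:
            del self._last_seen[path]
        return ready


debouncer = Debouncer(delay_ms=500)
for timestamp in (0, 100, 400):  # three rapid saves of the same file
    debouncer.record("memory/2025-01-01.md", timestamp)
print(debouncer.pop_ready(600))
print(debouncer.pop_ready(900))
```

With this approach a burst of saves results in a single re-index of the file rather than one per save.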

Configuration Options

Option	Default	Description
`embedding.provider`	`onnx`	Embedding provider
`embedding.max_chunk_size`	varies	Maximum chunk size in tokens
`milvus.uri`	`milvus_lite.db`	Milvus connection URI
`watch.debounce_ms`	`500`	Debounce delay for file watchers

Source: README.md — embedding provider configuration

Known Issues and Limitations

Memory Leak During Large Indexing Jobs

Issue #533: Indexing large corpora with the ONNX provider exhibits memory growth of approximately 9 MB per chunk (RSS), which can cause OOM kills on memory-constrained systems.

Workaround: Process in smaller batches or restart the process mid-corpus.
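
The batching workaround can be scripted by invoking `memsearch index` once per group of files, so that each group runs in a fresh process whose memory is returned to the system when it exits. This is a sketch only: `batched` and `index_in_batches` are hypothetical helpers, the batch size of 50 is arbitrary, and only the `memsearch index <paths> -c <collection>` form documented above is used.

```python
import subprocess
from itertools import islice


def batched(items, size):
    """Yield successive lists of at most `size` items."""
    iterator = iter(items)
    while batch := list(islice(iterator, size)):
        yield batch


def index_in_batches(paths, size=50, collection=None, run=subprocess.run):
    """Invoke `memsearch index` once per batch, each in a fresh process."""
    for batch in batched(paths, size):
        command = ["memsearch", "index", *map(str, batch)]
        if collection:
            command += ["-c", collection]
        run(command, check=True)  # raises if the CLI exits with an error


# Usage sketch (requires the memsearch CLI on PATH):
# from pathlib import Path
# index_in_batches(sorted(Path("./memory").rglob("*.md")), size=50)
```

The `run` parameter exists so the command runner can be replaced in tests; by default it is `subprocess.run`.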

Collection State on Milvus Lite

Issue #540: CLI search fails on Milvus Lite when the collection is in a "released" state. The collection must be loaded before search operations.

pymilvus.exceptions.MilvusException: (code=101, message=Collection '...' is in state 'released'; call load() before search/get/query)

Duplicate Primary Key Errors

Issue #539: Duplicate primary keys within a single upsert batch cause indexing failures:

duplicate primary keys are not allowed in the same batch: invalid parameter

This occurs when multiple chunks produce identical SHA-256 hashes. The chunker includes metadata (file path, heading) in hash computation to minimize collisions.

Stats Shows Zero Chunks

Issue #538: memsearch stats reports 0 chunks when the collection name is set only in the config file, not via the -c flag.

Write Durability on Remote Milvus

Issue #534: MilvusStore.upsert() may report success without ensuring durable writes on remote Milvus 2.5+ deployments. The upsert may return success even when rows are not persisted.

Plugin Integration

Agent plugins interact with the indexing pipeline through shell commands:

Source: plugins/opencode/index.ts

memsearch expand '${chunk_hash}' --collection ${col}

The indexing pipeline supports plugin hooks for:

SessionStart: Inject memories at session beginning
Stop: Capture and index conversation summaries
Project review: Periodic memory consolidation

Source: plugins/openclaw/skills/memory-recall/SKILL.md — memory recall workflow

Troubleshooting

Symptom	Likely Cause	Solution
`memsearch search` fails with "collection released"	Milvus Lite collection unloaded	Run `memsearch index` to reload
0 chunks in stats	Missing `-c` flag	Explicitly pass collection name
Memory growth during indexing	ONNX provider memory leak	Batch process, restart mid-corpus
Duplicate primary key error	Identical chunks hashing to the same key in one batch	Deduplicate content, then re-run `memsearch index`

Rebuild from scratch: Delete the .db file or drop the collection, then run memsearch index to rebuild the entire index from markdown files.

Claude Code Plugin

The Claude Code Plugin integrates memsearch's semantic memory capabilities directly into the Claude Code CLI environment. It enables automatic capture of conversation turns, memory-driven search across project history, and intelligent memory consolidation for coding sessions.

Overview

The plugin operates as a hook-based extension to Claude Code, intercepting conversation lifecycle events (session start, stop, message processing) to build and maintain a searchable semantic memory index. Unlike standalone memsearch usage, the plugin automates memory capture after each conversation turn and provides Claude Code-native memory recall tools.

Source: README.md:1-50

Architecture

Component Layers

┌─────────────────────────────────────────────────────────────┐
│                    Claude Code CLI                          │
├─────────────────────────────────────────────────────────────┤
│  SessionStart Hook  │  Stop Hook  │  Memory Recall Tools    │
├─────────────────────────────────────────────────────────────┤
│                    memsearch CLI/API                        │
├─────────────────────────────────────────────────────────────┤
│  Milvus Lite (.db)  │  Embedder (ONNX/API)  │  Chunker     │
└─────────────────────────────────────────────────────────────┘

Data Flow

graph TD
    A[User Question] --> B[Claude Code Response]
    B --> C[Stop Hook Fires]
    C --> D[Parse Last Turn]
    D --> E[Summarize with haiku]
    E --> F[Append to memory YYYY-MM-DD.md]
    F --> G[Session Anchor Added]
    G --> H[memsearch index]
    H --> I[Milvus Collection Updated]

    J[User Query: 'What about X?'] --> K[memory_search Tool]
    K --> L[memsearch search]
    L --> M[Ranked Chunks]
    M --> N[memory_get for Full Section]
    N --> O[memory_transcript for Raw Dialogue]

Source: README.md:50-100

Capture Workflow

The capture workflow executes after each conversation turn via the Stop hook. This automation ensures that every meaningful exchange is preserved without manual intervention.

Capture Steps

Turn Parsing — Extract the last user query and agent response from the session transcript
Summarization — Generate a concise bullet-point summary using a lightweight model (haiku)
Markdown Appending — Write the summary to memory/YYYY-MM-DD.md with session anchors
Indexing — Invoke memsearch index to update the Milvus vector store

Session Anchors

Each captured turn includes metadata anchors in HTML comment format:

## Session Summary

- User asked about batch size optimization
- Claude explained learning rate scheduling

<!-- session:abc123 turn:5 db:/path/to/memory.db -->

The session: anchor enables tracing back to the original conversation transcript. Source: plugins/claude-code/prompts/project_review.txt:1-30
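
An anchor in the format shown above can be read back with a small parser. This is a sketch that assumes anchors are space-separated `key:value` tokens inside an HTML comment, as in the example; `parse_anchors` is a hypothetical helper, and values containing spaces (such as some file paths) would need a stricter format.

```python
import re

ANCHOR_RE = re.compile(r"<!--\s*(.*?)\s*-->", re.DOTALL)


def parse_anchors(markdown: str):
    """Return one dict per HTML comment that carries a session: token."""
    anchors = []
    for body in ANCHOR_RE.findall(markdown):
        # Split each token on the first colon only, so paths keep their colons.
        fields = dict(token.split(":", 1) for token in body.split() if ":" in token)
        if "session" in fields:
            anchors.append(fields)
    return anchors


note = (
    "## Session Summary\n\n"
    "- User asked about batch size optimization\n\n"
    "<!-- session:abc123 turn:5 db:/path/to/memory.db -->\n"
)
print(parse_anchors(note))
```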

Memory Recall Tools

The plugin provides three-tier progressive memory search tools for retrieving past context.

Tool Comparison

Tool	Purpose	Input	Output
`memory_search`	Semantic + keyword search	Query string	Ranked chunks with relevance scores
`memory_get`	Expand full markdown section	`chunk_hash`	Complete section with context
`memory_transcript`	Retrieve raw dialogue	`session_id`, `turn_id`	Formatted conversation transcript

Decision Guide

User Intent	Recommended Tools
Quick recall ("did we discuss X?")	`memory_search` only
Need details ("what was the solution?")	`memory_search` → `memory_get`
Need original dialogue ("show me the exact conversation")	`memory_search` → `memory_get` → `memory_transcript`

Source: plugins/claude-code/skills/memory-recall/SKILL.md:1-50

Configuration

Configuration Schema

The plugin reads configuration from ~/.config/memsearch/config.yaml or project-local .memsearch/config.yaml. Configuration changes take effect on the next Claude Code session start. Source: plugins/claude-code/skills/memory-config/SKILL.md:1-20

Summarization Settings

Parameter	Description	Default
`plugins.claude-code.summarize.enabled`	Enable automatic turn summarization	`true`
`plugins.claude-code.summarize.provider`	LLM provider for summarization	`native`
`plugins.claude-code.summarize.model`	Model used for turn summaries	`haiku`

Project Review Settings

Parameter	Description
`plugins.claude-code.project_review.*`	Controls periodic project memory review and update behavior

Embedding and Milvus

Standard memsearch embedding and Milvus settings apply:

Parameter	Description
`embedding.provider`	Embedding model provider (`onnx`, `openai`)
`milvus.uri`	Milvus backend (a local `.db` file path for Milvus Lite, or a remote URL for a server)

Source: src/memsearch/cli.py:200-250

Installation

Prerequisites

memsearch v0.4.0 or later
Claude Code CLI installed and authenticated
Python 3.9+ with pip or uv package manager

Installation Steps

Install memsearch globally:

uv pip install memsearch

Install the Claude Code plugin:

claude plugin install zilliztech/memsearch

Initialize configuration:

memsearch config init

Start a new Claude Code session (required for hooks to activate)

Source: mkdocs.yml:30-45

Known Issues and Limitations

Stop Hook SessionStart Pollution

The Stop hook's claude -p summarize invocation triggers the SessionStart hook in certain configurations, causing duplicate or nested session entries in daily logs. This occurs even when --no-session-persistence is set.

Workaround: Clear the CLAUDECODE environment variable before the summarize call.

Source: GitHub Issue #520

Rate Limit Error String as Content

When the user's Anthropic account hits API rate limits during summarization, the error message string is written to the memory file instead of being handled gracefully.

Impact: Memory files may contain error text that pollutes future search results.

Source: GitHub Issue #527

Milvus Lite Collection Released State

When using Milvus Lite (local .db file) without an active watch process, CLI search operations may fail with:

MilvusException: Collection 'ms_project_xxx' is in state 'released'

Fix: Call memsearch index or memsearch watch to reload the collection before searching.

Source: GitHub Issue #540

Pre-Compaction Transcript Loss

The Stop hook's transcript parsing runs after compaction has collapsed pre-compact transcripts into summary markers. Long agentic runs that compact mid-stream may lose original dialogue detail.

Source: GitHub Issue #537

Troubleshooting

Plugin Not Activating

Verify installation: claude plugin list
Check hook files exist: ls ~/.claude/plugins/memsearch/hooks/
Start a fresh Claude Code session (hooks load at session start)

Memory Search Returns No Results

Ensure indexing completed: memsearch stats
Check collection exists: memsearch stats --collection <name>
Keep the collection loaded (Milvus Lite): run memsearch watch in a separate terminal

Configuration Not Applied

Configuration changes require a new Claude Code session to take effect. The current session loads plugin state at startup. Source: plugins/claude-code/skills/memory-config/SKILL.md:10-15

Project Memory Review

The plugin supports periodic project memory review to consolidate and clean up accumulated daily notes. The review process uses structured prompts that:

Keep project-focused content in project memory files
Separate user preferences to USER.md
Apply targeted additions rather than full rewrites

Suggested Project Memory Structure

# Project Memory

## Current Direction
## Active Threads
## Recent Progress
## Decisions
## Open Questions
## Risks and Constraints
## Next Steps
## Cold Items

Source: plugins/claude-code/prompts/project_review.txt:20-40

Uninstallation

To remove the plugin:

claude plugin uninstall memsearch

Remove local memory files if desired:

rm -rf .memsearch/

Source: mkdocs.yml:40-50

Doramagic Pitfall Log

Source-linked risks stay visible on the manual page so the preview does not read like a recommendation.

Found 23 structured pitfall item(s), including 6 high/blocking item(s). Top priority: Installation risk - Installation risk requires verification.

1. Installation risk: Installation risk requires verification

Severity: high
Finding: Project evidence flags an installation risk. Review the linked source before relying on this workflow.
User impact: May increase setup, validation, or first-run risk for the user.
Recommended check: Reproduce the official install and quickstart path in an isolated environment.
Evidence: community_evidence:github | cevd_e448ba5c18f74fbcac5193643cf9bf00 | https://github.com/zilliztech/memsearch/issues/534

2. Installation risk: Installation risk requires verification

Severity: high
Finding: Project evidence flags an installation risk. Review the linked source before relying on this workflow.
User impact: May increase setup, validation, or first-run risk for the user.
Recommended check: Reproduce the official install and quickstart path in an isolated environment.
Evidence: community_evidence:github | cevd_6a8adfef7558499e973bb2775db7368a | https://github.com/zilliztech/memsearch/issues/552

3. Installation risk: Installation risk requires verification

Severity: high
Finding: Project evidence flags an installation risk. Review the linked source before relying on this workflow.
User impact: May increase setup, validation, or first-run risk for the user.
Recommended check: Reproduce the official install and quickstart path in an isolated environment.
Evidence: community_evidence:github | cevd_f7c0535d74d940bba792903b440f1808 | https://github.com/zilliztech/memsearch/issues/520

4. Security or permission risk: Security or permission risk requires verification

Severity: high
Finding: Project evidence flags a security or permission risk. Review the linked source before relying on this workflow.
User impact: May increase setup, validation, or first-run risk for the user.
Recommended check: Reproduce the official install and quickstart path in an isolated environment.
Evidence: community_evidence:github | cevd_bcaeec224c444d9484142ebf19f6ecc8 | https://github.com/zilliztech/memsearch/issues/102

5. Security or permission risk: Security or permission risk requires verification

Severity: high
Finding: Project evidence flags a security or permission risk. Review the linked source before relying on this workflow.
User impact: May increase setup, validation, or first-run risk for the user.
Recommended check: Reproduce the official install and quickstart path in an isolated environment.
Evidence: community_evidence:github | cevd_b8e6b6226cfa4953ab4ffffc71e575e8 | https://github.com/zilliztech/memsearch/issues/533

6. Security or permission risk: Security or permission risk requires verification

Severity: high
Finding: Project evidence flags a security or permission risk. Review the linked source before relying on this workflow.
User impact: May increase setup, validation, or first-run risk for the user.
Recommended check: Reproduce the official install and quickstart path in an isolated environment.
Evidence: community_evidence:github | cevd_05b51b5d9b1145fca49076b5f0a3e9d2 | https://github.com/zilliztech/memsearch/issues/527

7. Installation risk: Installation risk requires verification

Severity: medium
Finding: Project evidence flags an installation risk. Review the linked source before relying on this workflow.
User impact: May increase setup, validation, or first-run risk for the user.
Recommended check: Reproduce the official install and quickstart path in an isolated environment.
Evidence: community_evidence:github | cevd_c7024063d9bc4202803e0629e7ba02bc | https://github.com/zilliztech/memsearch/issues/535

8. Installation risk: Installation risk requires verification

Severity: medium
Finding: Project evidence flags an installation risk. Review the linked source before relying on this workflow.
User impact: May increase setup, validation, or first-run risk for the user.
Recommended check: Reproduce the official install and quickstart path in an isolated environment.
Evidence: community_evidence:github | cevd_9276552f302c48309d9a27db91487d0d | https://github.com/zilliztech/memsearch/issues/537

9. Installation risk: Installation risk requires verification

Severity: medium
Finding: Project evidence flags an installation risk. Review the linked source before relying on this workflow.
User impact: May increase setup, validation, or first-run risk for the user.
Recommended check: Reproduce the official install and quickstart path in an isolated environment.
Evidence: community_evidence:github | cevd_231ab841119e432ca1bf0cd072ff5b6a | https://github.com/zilliztech/memsearch/releases/tag/v0.3.1

10. Installation risk: Installation risk requires verification

Severity: medium
Finding: Project evidence flags an installation risk. Review the linked source before relying on this workflow.
User impact: May increase setup, validation, or first-run risk for the user.
Recommended check: Reproduce the official install and quickstart path in an isolated environment.
Evidence: community_evidence:github | cevd_0803c1e9097f43d8800b01f21f8a9c5a | https://github.com/zilliztech/memsearch/releases/tag/v0.4.2

11. Configuration risk: Configuration risk requires verification

Severity: medium
Finding: Project evidence flags a configuration risk. Review the linked source before relying on this workflow.
User impact: May increase setup, validation, or first-run risk for the user.
Recommended check: Reproduce the official install and quickstart path in an isolated environment.
Evidence: capability.host_targets | github_repo:1153190876 | https://github.com/zilliztech/memsearch

12. Configuration risk: Configuration risk requires verification

Severity: medium
Finding: Project evidence flags a configuration risk. Review the linked source before relying on this workflow.
User impact: May increase setup, validation, or first-run risk for the user.
Recommended check: Reproduce the official install and quickstart path in an isolated environment.
Evidence: community_evidence:github | cevd_5b382c2552e449c698d3757e72374f04 | https://github.com/zilliztech/memsearch/issues/540

Source: Doramagic discovery, validation, and Project Pack records
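
The recommended check repeated in the findings above (reproducing the install and quickstart path in an isolated environment) can be sketched with Python's standard-library venv module. This is a minimal sketch, not the project's documented procedure: the helper names create_isolated_env and install_command are hypothetical, and the package name "memsearch" is an assumption that should be confirmed against the official install instructions before running anything.

```python
import os
import tempfile
import venv
from pathlib import Path


def create_isolated_env(env_dir=None, with_pip=True):
    """Create a fresh virtual environment and return its Python executable.

    If env_dir is None, a throwaway directory is created so the check
    cannot touch an existing project environment.
    """
    if env_dir is None:
        env_dir = tempfile.mkdtemp(prefix="memsearch-check-")
    # clear=True wipes any previous contents so every run starts clean.
    venv.EnvBuilder(with_pip=with_pip, clear=True).create(env_dir)
    bin_dir = "Scripts" if os.name == "nt" else "bin"
    exe = "python.exe" if os.name == "nt" else "python"
    return Path(env_dir) / bin_dir / exe


def install_command(python_path, package="memsearch"):
    """Build (but do not run) the pip command for the isolated interpreter.

    The default package name is an assumption; pin the release you are
    evaluating if you want to reproduce a specific report.
    """
    return [str(python_path), "-m", "pip", "install", package]
```

Calling create_isolated_env() and then passing the result of install_command(...) to subprocess.run keeps the installation confined to the throwaway directory, which can be deleted once the linked issues have been checked against the observed behavior.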

Community Discussion Evidence

These external discussion links are review inputs, not standalone proof that the project is production-ready.

Sources: 12

Count of project-level external discussion links exposed on this manual page.

Use: Review before install

Open the linked issues or discussions before treating the pack as ready for your environment.

Doramagic exposes project-level community discussion separately from official documentation. Review these links before using memsearch with real data or production workflows.

Reorganize memory - github / github_issue
OpenCode plugin installation silently fails - github / github_issue
memsearch stats shows 0 chunks when collection name is only set in confi - github / github_issue
Stop hook writes Anthropic API rate-limit error string as memory summary - github / github_issue
Feature request: automatic memory refinement (dreaming) and knowledge wi - github / github_issue
CLI search fails on Milvus Lite: collection in 'released' state - github / github_issue
duplicate primary keys are not allowed in the same batch: invalid parame - github / github_issue
Installation risk requires verification - GitHub / issue
Installation risk requires verification - GitHub / issue
Security or permission risk requires verification - GitHub / issue
Security or permission risk requires verification - GitHub / issue
Installation risk requires verification - GitHub / issue

Source: Project Pack community evidence and pitfall evidence

memsearch

Introduction to memsearch

Related Pages

Introduction to memsearch

Overview

Key Capabilities

Architecture

Core Components

Data Flow

Capture Flow (Writing Memory)

Recall Flow (Reading Memory)

Plugin Memory Recall Workflow

Configuration

Configuration Options

Embedding Providers

Milvus Backends

Known Limitations and Issues

Project Memory Format

Quick Start

Installation

Basic Usage

Python API

Feature Roadmap

See Also

Quick Start Guide

Related Pages

Quick Start Guide

What is Memsearch?

Architecture Overview

Prerequisites

Installation

Using pip

Using uv (recommended)

Verify Installation

Initial Configuration

Project-Scoped Configuration

Configuration Options

Core Workflow: Indexing and Searching

Step 1: Index Your Markdown Files

Step 2: Search Semantic Content

Step 3: Expand Results for Full Context

Step 4: Check Index Statistics

Using the Python API

Python API Reference

Watch Mode for Continuous Indexing

Agent Plugins

Installing a Plugin

Plugin Memory Workflow

Configuring Plugin Summarization

Troubleshooting Common Issues

Collection in 'released' State (Milvus Lite)

Duplicate Primary Key Errors During Indexing

Stats Shows 0 Chunks

Memory Usage During Large Indexing Jobs

Next Steps

Environment Variables

System Architecture

Related Pages

System Architecture

Overview

Core Components

MemSearch Core (`src/memsearch/__init__.py`)

Scanner (`src/memsearch/scanner.py`)

Chunker (`src/memsearch/chunker.py`)

Embedding Providers

CLI Architecture

Command Structure

Watch Mode

Plugin Architecture

Memory Recall Flow

Plugin Tools

Storage Architecture

Milvus Backend

Data Model

Collection Naming

Configuration System

Configuration Keys

Known Architectural Limitations

Documentation Structure

Design Philosophy

MemSearch Core (`src/memsearch/__init__.py`)