# https://github.com/sverklo/sverklo Project Manual

Generated at: 2026-05-31 02:38:03 UTC

## Table of Contents

- [Overview](#overview)
- [Installation](#installation)
- [Quick Start Guide](#quickstart)
- [System Architecture](#architecture)
- [MCP Server Design](#mcp-server)
- [Search and Retrieval System](#search-system)
- [Bi-Temporal Memory Layer](#memory-layer)
- [Indexing System](#indexing)
- [Search Tools Reference](#search-tools)
- [Impact and Reference Tools](#impact-tools)

<a id='overview'></a>

## Overview

### Related Pages

Related topics: [Installation](#installation), [Quick Start Guide](#quickstart), [System Architecture](#architecture)

<details>
<summary>Related Source Files</summary>

The following source files were used to generate this page:

- [package.json](https://github.com/sverklo/sverklo/blob/main/package.json)
- [src/server/mcp-server.ts](https://github.com/sverklo/sverklo/blob/main/src/server/mcp-server.ts)
- [src/server/tool-overrides.ts](https://github.com/sverklo/sverklo/blob/main/src/server/tool-overrides.ts)
- [src/server/tools/context.ts](https://github.com/sverklo/sverklo/blob/main/src/server/tools/context.ts)
- [src/server/prompts.ts](https://github.com/sverklo/sverklo/blob/main/src/server/prompts.ts)
- [skill/README.md](https://github.com/sverklo/sverklo/blob/main/skill/README.md)
- [action/README.md](https://github.com/sverklo/sverklo/blob/main/action/README.md)
</details>


Sverklo is a local-first MCP (Model Context Protocol) server that provides repository memory and code intelligence for AI coding agents. It enables persistent context, semantic search, dependency graphs, blast-radius analysis, diff-aware review, and git-pinned decisions across coding sessions.

**Version:** 0.29.0  
**License:** MIT  
**Repository:** [sverklo/sverklo](https://github.com/sverklo/sverklo)  
**Website:** [https://sverklo.com](https://sverklo.com)

## Purpose and Scope

Sverklo transforms a codebase into a queryable knowledge graph that AI agents can interact with across sessions. Unlike cloud-based solutions, sverklo runs entirely locally—no API keys or code upload required. The system indexes source files, builds dependency graphs, computes PageRank scores, and maintains persistent memories.

**Key capabilities include:**

| Category | Capabilities |
|----------|--------------|
| **Search** | Semantic embeddings, BM25 full-text search, hybrid retrieval, PageRank-weighted ranking |
| **Graph** | Dependency analysis, blast-radius computation, impact analysis |
| **Memory** | Persistent context across sessions, core project invariants, categorized memories |
| **Review** | Diff-aware PR review, risk scoring, structural heuristics |
| **Audit** | Codebase health scoring, architecture diagrams, Obsidian-compatible exports |

Source: [package.json:5-33](https://github.com/sverklo/sverklo/blob/main/package.json#L5-L33)

## Architecture Overview

```mermaid
graph TD
    subgraph "Client Layer"
        IDE[Claude Code / Cursor / Windsurf / Codex CLI]
    end
    
    subgraph "MCP Server"
        MCP[MCP Protocol Handler]
        Tools[Tool Router]
        Hints[Intent Hints Engine]
    end
    
    subgraph Indexer["Indexer Subsystem"]
        Files[File Indexer]
        Code[Code Parser]
        Graph[Dependency Graph]
        Memory[Memory Store]
        Search[Hybrid Search Engine]
    end
    
    subgraph Stores["Stores"]
        FS[File Store]
        GS[Graph Store]
        MS[Memory Store]
        DS[Doc Edge Store]
    end
    
    IDE <--> MCP
    MCP <--> Tools
    Tools <--> Hints
    Tools <--> Indexer
    Indexer <--> Stores
```

The MCP server (`src/server/mcp-server.ts`) implements the Model Context Protocol, exposing tools and resources that IDE clients consume. The indexer subsystem coordinates file scanning, AST-based code parsing, graph construction, and search indexing into multiple backing stores.

Source: [src/server/mcp-server.ts:1-50](https://github.com/sverklo/sverklo/blob/main/src/server/mcp-server.ts#L1-L50)

## MCP Tools

Sverklo exposes a comprehensive set of code intelligence tools via the MCP protocol. Tools are organized into presets that optimize the available surface area for different agent workflows.

### Tool Presets

| Preset | Purpose | Tools Included |
|--------|---------|----------------|
| `default` | Balanced overview + search | search, lookup, overview, refs, impact |
| `nav` | Navigation focus | search, lookup, overview, refs, impact, deps, context, status |
| `lean` | Minimal footprint | search, lookup, overview, refs, impact, deps, context, status, remember, recall, review_diff |
| `research` | Code exploration | search, search_iterative, investigate, ask, lookup, overview, refs, impact, deps, concepts, patterns, clusters, verify, critique, ctx_slice, ctx_grep, ctx_stats, status |
| `review` | PR/MR review | review_diff, diff_search, test_map, impact, refs, lookup, search, investigate, verify, status |

Source: [src/server/tool-overrides.ts:1-60](https://github.com/sverklo/sverklo/blob/main/src/server/tool-overrides.ts#L1-L60)

### Core Tools

**Context Bundle** (`context`) — An umbrella tool that returns a curated bundle in a single call: codebase overview, semantically relevant code, related symbols, and matching memories. This is the recommended first call for unfamiliar tasks.

```typescript
inputSchema: {
  task: string,           // Free-form task description
  detail_level: enum,     // "minimal" | "normal" | "full"
  scope: string,          // Optional path prefix filter
  budget: number          // PageRank-pruned token budget
}
```

Source: [src/server/tools/context.ts:1-30](https://github.com/sverklo/sverklo/blob/main/src/server/tools/context.ts#L1-L30)

**Search** (`search`, `search_iterative`) — Hybrid retrieval combining:
- Full-text search via BM25
- Semantic embeddings via ONNX runtime
- PageRank-weighted ranking
- Reciprocal rank fusion

Source: [src/server/tools/context.ts:45-55](https://github.com/sverklo/sverklo/blob/main/src/server/tools/context.ts#L45-L55)
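
The fusion step listed above can be illustrated with a minimal sketch of reciprocal rank fusion. The function name `fuseRankings`, the constant `k = 60`, and the sample file names are assumptions made for illustration; this is not sverklo's actual implementation.

```typescript
// Reciprocal rank fusion (RRF): combine several ranked lists into one.
// Each list contributes 1 / (k + rank) for every item it contains.
// `fuseRankings` and k = 60 are illustrative assumptions, not sverklo's API.
function fuseRankings(rankings: string[][], k: number = 60): string[] {
  const scores = new Map<string, number>();
  for (const ranking of rankings) {
    ranking.forEach((id, index) => {
      const rank = index + 1; // ranks are 1-based
      scores.set(id, (scores.get(id) ?? 0) + 1 / (k + rank));
    });
  }
  // Sort by fused score, highest first.
  return Array.from(scores.entries())
    .sort((a, b) => b[1] - a[1])
    .map(([id]) => id);
}

// Hypothetical result lists from three retrievers.
const bm25 = ["auth.ts", "login.ts", "session.ts"];
const semantic = ["login.ts", "auth.ts", "rate-limit.ts"];
const pagerank = ["auth.ts", "db.ts", "login.ts"];

const fused = fuseRankings([bm25, semantic, pagerank]);
console.log(fused);
```

Items that rank well in several lists rise to the top even when no single retriever places them first, which is why this kind of fusion suits hybrid search.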

**Critique** (`critique`) — Validates an agent's answer by checking cited evidence for staleness, detecting missed high-PageRank hubs, and flagging undocumented symbols. Returns structured critique without LLM calls.

Source: [src/server/tools/critique.ts:1-60](https://github.com/sverklo/sverklo/blob/main/src/server/tools/critique.ts#L1-L60)

**Review Diff** (`review_diff`) — Diff-aware code review with risk scoring and structural heuristics. Emits structured GitHub PR review JSON for CI integration.

Source: [src/server/tools/review-format.ts:1-40](https://github.com/sverklo/sverklo/blob/main/src/server/tools/review-format.ts#L1-L40)

### Intent-Aware Hints

The hint engine tracks recent tool-call trajectories and appends "next steps" suggestions. It classifies intent into categories:

| Intent | Trigger Patterns |
|--------|------------------|
| `exploring` | Search, lookup, investigate sequences |
| `reviewing-diff` | Diff tools followed by refs |
| `tracing-impact` | Impact analysis after symbol lookups |
| `debugging` | Grep, lookup, investigate patterns |
| `onboarding` | Context, overview, status calls |
| `memory-curating` | Remember, recall sequences |

Source: [src/server/hints.ts:1-50](https://github.com/sverklo/sverklo/blob/main/src/server/hints.ts#L1-L50)
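
As a rough sketch of how a trajectory could be mapped to one of these intents, the example below counts how many recent calls match each intent's trigger tools. The trigger sets paraphrase the table above; the counting rule, the `classifyIntent` name, and the use of `ctx_grep` for "Grep" are assumptions, not the logic in `hints.ts`.

```typescript
// Illustrative intent classifier over a recent tool-call trajectory.
// Trigger sets paraphrase the table above; the counting rule is an assumption.
type Intent =
  | "exploring"
  | "reviewing-diff"
  | "tracing-impact"
  | "debugging"
  | "onboarding"
  | "memory-curating";

const triggers: Record<Intent, string[]> = {
  "exploring": ["search", "lookup", "investigate"],
  "reviewing-diff": ["review_diff", "diff_search", "refs"],
  "tracing-impact": ["impact", "lookup"],
  "debugging": ["ctx_grep", "lookup", "investigate"],
  "onboarding": ["context", "overview", "status"],
  "memory-curating": ["remember", "recall"],
};

function classifyIntent(recentCalls: string[]): Intent | null {
  let best: Intent | null = null;
  let bestScore = 0;
  for (const intent of Object.keys(triggers) as Intent[]) {
    // Score = number of recent calls that belong to this intent's trigger set.
    const score = recentCalls.filter((c) => triggers[intent].indexOf(c) !== -1).length;
    if (score > bestScore) {
      best = intent;
      bestScore = score;
    }
  }
  return best; // null when no recent call matches any intent
}

console.log(classifyIntent(["context", "overview", "status"]));
```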

## Memory System

Sverklo maintains persistent context across coding sessions through a tiered memory architecture:

```mermaid
graph LR
    Core[Core Memories<br/>Tier: core]
    Recent[Recent Memories<br/>Tier: session]
    Stale[Stale Flagging<br/>is_stale flag]
    
    Core --> Session[Auto-injected on session start]
    Recent --> Session
    Stale --> Session
```

**Memory Tiers:**
- **Core** — Project invariants, always auto-injected at session start
- **Recent** — Session-scoped memories, last N entries
- **Stale** — Flagged when underlying code changes (detected via graph analysis)

Source: [src/server/mcp-server.ts:80-100](https://github.com/sverklo/sverklo/blob/main/src/server/mcp-server.ts#L80-L100)
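
The tiering above can be sketched as a selection function that runs at session start. The field names (`tier`, `isStale`, `createdAt`), the `recentLimit` parameter, and the `[stale]` marker are assumptions for illustration and may differ from sverklo's storage schema.

```typescript
// Sketch of session-start memory selection. Field names and the
// `recentLimit` default are assumptions for illustration.
interface Memory {
  id: string;
  tier: "core" | "session";
  text: string;
  createdAt: number; // epoch milliseconds
  isStale: boolean;  // set when the underlying code has changed
}

function selectForSession(memories: Memory[], recentLimit: number = 5): Memory[] {
  // Core memories are always included.
  const core = memories.filter((m) => m.tier === "core");
  // Session memories: newest first, capped at recentLimit.
  const recent = memories
    .filter((m) => m.tier === "session")
    .sort((a, b) => b.createdAt - a.createdAt)
    .slice(0, recentLimit);
  return core.concat(recent);
}

function render(m: Memory): string {
  // Stale memories are still shown, but marked so the agent can re-verify them.
  return (m.isStale ? "[stale] " : "") + m.text;
}

const injected = selectForSession(
  [
    { id: "m1", tier: "core", text: "All DB access goes through the repository layer", createdAt: 1, isStale: false },
    { id: "m2", tier: "session", text: "Switched auth to JWT", createdAt: 2, isStale: true },
    { id: "m3", tier: "session", text: "Renamed UserStore to AccountStore", createdAt: 3, isStale: false },
  ],
  1
);
console.log(injected.map(render));
```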

The `sverklo://context` resource is auto-injected on every session start, providing the agent with project context without requiring explicit tool calls.

Source: [src/server/mcp-server.ts:55-78](https://github.com/sverklo/sverklo/blob/main/src/server/mcp-server.ts#L55-L78)

## Audit and Reporting

### HTML Audit Report

Generates self-contained HTML reports with sverklo.com dark theme branding. Includes:
- Dimension grade cards (A/B/C/D/F color-coded)
- Section content cards with formatted bodies
- SEO metadata and Open Graph tags
- Responsive styling with JetBrains Mono and Public Sans fonts

Source: [src/server/audit-html.ts:1-30](https://github.com/sverklo/sverklo/blob/main/src/server/audit-html.ts#L1-L30)

### Obsidian Export

Generates Obsidian-compatible markdown with `[[wikilinks]]` for clickable dependency navigation in the Obsidian knowledge base.

Source: [src/server/audit-obsidian.ts:1-30](https://github.com/sverklo/sverklo/blob/main/src/server/audit-obsidian.ts#L1-L30)

### Architecture Diagram

Generates self-contained HTML architecture diagrams showing:
- Layer groupings (Frontend, API, Storage, Search, Indexer)
- File distribution by PageRank
- Cross-layer dependency edges
- Color-coded directory patterns

Source: [src/server/audit-arch.ts:1-50](https://github.com/sverklo/sverklo/blob/main/src/server/audit-arch.ts#L1-L50)

## Workflow Prompts

Sverklo defines prompt templates for common code-intelligence tasks that encode the optimal order of tool calls:

| Prompt | Purpose |
|--------|---------|
| `sverklo/onboarding` | New team member context injection |
| `sverklo/premerge` | Pre-merge review checklist |
| `sverklo/premerge-full` | Comprehensive pre-merge review |
| `sverklo/investigate` | Root-cause debugging workflow |
| `sverklo/map-feature` | Feature tracing across codebase |

Source: [src/server/prompts.ts:1-50](https://github.com/sverklo/sverklo/blob/main/src/server/prompts.ts#L1-L50)

Example prompt structure for feature mapping:

```typescript
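// Sketch: `mapFeature` is a hypothetical wrapper, and how `scopeArg`
// is derived from `scope` is an assumption.
const mapFeature = {
  build: ({ feature, scope }: { feature: string; scope?: string }): string => {
    const scopeArg = scope ? ` scope:"${scope}"` : "";
    return `
  1. investigate query:"${feature}"${scopeArg}
  2. Pick top 3-5 symbols → refs on each
  3. impact on most-referenced symbols
  4. Call verify to validate assumptions
`;
  },
};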
```

Source: [src/server/prompts.ts:20-45](https://github.com/sverklo/sverklo/blob/main/src/server/prompts.ts#L20-L45)

## GitHub Action Integration

The GitHub Action at `sverklo/sverklo/action` provides CI-integrated code review:

```yaml
- uses: sverklo/sverklo/action@main
  with:
    github-token: ${{ secrets.GITHUB_TOKEN }}
    fail-on: high        # Fail build on risk threshold
    max-files: 25        # Max files to review
    inline-comments: true # Post inline comments
```

The action posts a PR review containing:
- Sticky summary comment with risk-scored files
- Up to 30 inline comments anchored to flagged lines
- JSON payload for direct `pulls.createReview` API posting

Source: [action/README.md:1-30](https://github.com/sverklo/sverklo/blob/main/action/README.md#L1-L30)
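
To make the shape of that JSON payload concrete, here is a minimal sketch that turns findings into a body suitable for GitHub's "create a review for a pull request" endpoint. The `Finding` type and `buildReviewPayload` function are hypothetical. Only the payload field names (`body`, `event`, `comments[].path`, `comments[].line`, `comments[].body`) come from GitHub's API, and the cap of 30 comes from the list above.

```typescript
// Sketch: build a pull-request review payload from risk findings.
// `Finding` and `buildReviewPayload` are hypothetical names.
interface Finding {
  path: string;
  line: number;
  risk: "low" | "medium" | "high" | "critical";
  message: string;
}

interface ReviewPayload {
  body: string;
  event: "COMMENT";
  comments: { path: string; line: number; body: string }[];
}

const MAX_INLINE_COMMENTS = 30; // cap stated in the action's description

function buildReviewPayload(findings: Finding[]): ReviewPayload {
  const inline = findings.slice(0, MAX_INLINE_COMMENTS);
  return {
    body: `Sverklo review: ${findings.length} finding(s), ${inline.length} posted inline.`,
    event: "COMMENT",
    comments: inline.map((f) => ({
      path: f.path,
      line: f.line,
      body: `[${f.risk}] ${f.message}`,
    })),
  };
}

console.log(
  JSON.stringify(
    buildReviewPayload([{ path: "src/auth.ts", line: 42, risk: "high", message: "Widely imported symbol changed" }]),
    null,
    2
  )
);
```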

## CLI Commands

| Command | Description |
|---------|-------------|
| `sverklo init` | Initialize project with index and CLAUDE.md |
| `sverklo register <path>` | Register a repository |
| `sverklo unregister <name>` | Unregister a repository |
| `sverklo list` | List registered repositories |
| `sverklo reindex` | Rebuild index for a repository |
| `sverklo status` | Show current repository status |
| `sverklo doctor` | Diagnose installation health |

Source: [skill/README.md:1-20](https://github.com/sverklo/sverklo/blob/main/skill/README.md#L1-L20)

## Dependencies

**Runtime Dependencies:**

| Package | Version | Purpose |
|---------|---------|---------|
| `@modelcontextprotocol/sdk` | ^1.12.0 | MCP protocol implementation |
| `chokidar` | ^4.0.0 | File watching |
| `ignore` | ^7.0.0 | Gitignore pattern matching |
| `onnxruntime-node` | ^1.21.0 | Local embedding inference |
| `picomatch` | ^4.0.4 | Glob pattern matching |
| `yaml` | ^2.8.3 | YAML parsing |

**Optional Dependencies:**

| Package | Version | Purpose |
|---------|---------|---------|
| `web-tree-sitter` | ^0.24.0 | AST parsing (optional) |

Source: [package.json:35-55](https://github.com/sverklo/sverklo/blob/main/package.json#L35-L55)

**Engine Requirements:** Node.js >= 24.0.0

## Supported Environments

Sverklo integrates with:

- Claude Code (primary)
- Cursor
- Windsurf
- Codex CLI
- Zed editor

Source: [package.json:4](https://github.com/sverklo/sverklo/blob/main/package.json#L4)

## Known Limitations

Based on community feedback:

| Issue | Status | Reference |
|-------|--------|-----------|
| Windows path handling | User-reported issues | Issue #20 |
| `AGENTS.md` not respected by `sverklo init` | Known | Issue #19 |
| MCP tool double-prefixing (`sverklo_sverklo_*`) | Known | Issue #71 |
| `reindex` does not update `lastIndexed` timestamp | Bug | Issue #74 |
| Stale MCP server binary after upgrade | Known | Issue #17 |

Source: [package.json:1-10](https://github.com/sverklo/sverklo/blob/main/package.json#L1-L10)

## Getting Started

```bash
# Install globally
npm install -g sverklo

# Initialize in your project
cd your-project
sverklo init

# Register a repository
sverklo register .

# Start using MCP tools from your IDE
```

The initialization creates a `CLAUDE.md` file with project context and builds the initial index. The MCP server then becomes available to any connected IDE.

Source: [skill/README.md:20-35](https://github.com/sverklo/sverklo/blob/main/skill/README.md#L20-L35)

---

<a id='installation'></a>

## Installation

### Related Pages

Related topics: [Quick Start Guide](#quickstart)

<details>
<summary>Related Source Files</summary>

The following source files were used to generate this page:

- [src/init.ts](https://github.com/sverklo/sverklo/blob/main/src/init.ts)
- [src/indexer/grammars-install.ts](https://github.com/sverklo/sverklo/blob/main/src/indexer/grammars-install.ts)
- [src/doctor.ts](https://github.com/sverklo/sverklo/blob/main/src/doctor.ts)
- [src/cli.ts](https://github.com/sverklo/sverklo/blob/main/src/cli.ts)
- [src/indexer/indexer.ts](https://github.com/sverklo/sverklo/blob/main/src/indexer/indexer.ts)
- [package.json](https://github.com/sverklo/sverklo/blob/main/package.json)
</details>


Sverklo is a local-first MCP (Model Context Protocol) server for code intelligence. This guide covers the complete installation process, from prerequisites through post-install verification.

## Prerequisites

| Requirement | Version | Notes |
|-------------|---------|-------|
| Node.js | >= 24.0.0 | Required runtime. Earlier versions lack needed ESM and import metadata support. |
| npm | Any recent version | Used for global installation |
| Git | Any recent version | Required for repository operations during init |
| OS | Linux, macOS, Windows | Windows has known path normalization quirks (see [Platform-Specific Considerations](#platform-specific-considerations)) |

Verify your Node version before proceeding:

```bash
node --version
```
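
If you want to script this check, for example in a setup script, a version comparison along these lines would do. `meetsMinimum` is a hypothetical helper, not part of sverklo; in a Node.js script you would pass it `process.version`.

```typescript
// Compare a Node.js version string such as "v24.1.0" against a minimum.
// `meetsMinimum` is a hypothetical helper, not part of sverklo.
function meetsMinimum(version: string, minimum: string): boolean {
  const parse = (v: string): number[] =>
    v.replace(/^v/, "").split(".").map((part) => parseInt(part, 10) || 0);
  const a = parse(version);
  const b = parse(minimum);
  // Compare major, minor, patch in order; the first difference decides.
  for (let i = 0; i < 3; i++) {
    const x = a[i] ?? 0;
    const y = b[i] ?? 0;
    if (x !== y) return x > y;
  }
  return true; // equal versions satisfy ">="
}

console.log(meetsMinimum("v24.1.0", "24.0.0"));
console.log(meetsMinimum("v22.11.0", "24.0.0"));
```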

## Installing via npm

Sverklo is distributed as a global npm package:

```bash
npm install -g sverklo
```

This installs the `sverklo` CLI binary globally, making it available from any directory. Source: [package.json:9-11](https://github.com/sverklo/sverklo/blob/main/package.json#L9-L11)

The installation includes:

- **CLI binary** (`sverklo`) — Command-line interface for all operations
- **MCP server** — Model Context Protocol server for IDE and agent integration
- **Tree-sitter support** — AST parsing for indexing; the language grammars themselves are installed during `sverklo init`
- **ONNX runtime** — Local embeddings for semantic search

Verify the installation:

```bash
sverklo --version
```

## Project Initialization

Each codebase you want sverklo to manage requires initialization. Navigate to your project directory and run:

```bash
cd /path/to/your/project
sverklo init
```

The `init` command performs the following setup:

### What `sverklo init` Does

```mermaid
graph TD
    A[sverklo init] --> B{Is ~/.sverklo dir present?}
    B -->|No| C[Create ~/.sverklo directory]
    B -->|Yes| D[Skip creation]
    C --> E[Create registry.json]
    D --> E
    E --> F[Create CLAUDE.md in project]
    F --> G[Parse existing docs/ADRs]
    G --> H[Install tree-sitter grammars]
    H --> I[Index codebase files]
    I --> J[Build symbol graph]
    J --> K[Compute PageRank scores]
    K --> L[Generate initial embeddings]
    L --> M[Write project metadata]
```
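
The "Compute PageRank scores" step in the diagram can be pictured with a small power-iteration sketch over a file dependency graph. The damping factor of 0.85, the 50 iterations, the edge direction (importer to imported file), and the file names are conventional choices made for this example; the values sverklo actually uses are not stated here.

```typescript
// Minimal PageRank by power iteration over a file dependency graph.
// Edges point from an importing file to the file it imports, so heavily
// imported files accumulate rank. Parameters are assumptions for this sketch.
function pageRank(
  edges: Array<[string, string]>,
  damping: number = 0.85,
  iterations: number = 50
): Map<string, number> {
  const nodes = new Set<string>();
  const outgoing = new Map<string, string[]>();
  for (const [from, to] of edges) {
    nodes.add(from);
    nodes.add(to);
    outgoing.set(from, (outgoing.get(from) ?? []).concat(to));
  }
  const ids = Array.from(nodes);
  const n = ids.length;
  let rank = new Map<string, number>();
  ids.forEach((id) => rank.set(id, 1 / n));

  for (let step = 0; step < iterations; step++) {
    const next = new Map<string, number>();
    ids.forEach((id) => next.set(id, (1 - damping) / n));
    let dangling = 0; // rank held by nodes with no outgoing edges
    for (const id of ids) {
      const targets = outgoing.get(id);
      const r = rank.get(id) ?? 0;
      if (!targets) {
        dangling += r;
        continue;
      }
      for (const t of targets) {
        next.set(t, (next.get(t) ?? 0) + (damping * r) / targets.length);
      }
    }
    // Spread dangling rank evenly so the scores still sum to 1.
    ids.forEach((id) => next.set(id, (next.get(id) ?? 0) + (damping * dangling) / n));
    rank = next;
  }
  return rank;
}

const ranks = pageRank([
  ["cli.ts", "indexer.ts"],
  ["mcp-server.ts", "indexer.ts"],
  ["indexer.ts", "store.ts"],
]);
console.log(ranks);
```

In this toy graph, `store.ts` ends up with the highest score because everything else depends on it, directly or transitively.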

### Files Created

| File | Location | Purpose |
|------|----------|---------|
| `CLAUDE.md` | Project root | Agent instructions for code intelligence |
| `registry.json` | `~/.sverklo/` | Project registration with name, path, last indexed timestamp |

Source: [src/init.ts:1-50](https://github.com/sverklo/sverklo/blob/main/src/init.ts#L1-L50)

### Grammar Installation

During initialization, sverklo installs tree-sitter grammars for supported languages. Grammars enable precise AST-based parsing for accurate symbol extraction.

```mermaid
graph LR
    A[Init starts] --> B[Check installed grammars]
    B --> C{Grammars exist?}
    C -->|Yes| D[Use cached]
    C -->|No| E[Download from npm]
    E --> F[Build with node-gyp]
    F --> G[Store in ~/.sverklo/grammars/]
```

Source: [src/indexer/grammars-install.ts:1-40](https://github.com/sverklo/sverklo/blob/main/src/indexer/grammars-install.ts#L1-L40)

Supported languages include TypeScript, JavaScript, Python, Go, Rust, and more. The grammars are installed once and reused across projects.

## Post-Install Verification

After installation and initialization, verify everything works correctly:

```bash
sverklo doctor
```

The `doctor` command performs health checks on:

| Check | Purpose |
|-------|---------|
| **Installation** | Verifies CLI is reachable |
| **Node version** | Confirms >= 24.0.0 |
| **Grammar binaries** | Checks tree-sitter parsers are compiled |
| **Project registry** | Validates `~/.sverklo/registry.json` |
| **MCP server** | Tests server startup |

Source: [src/doctor.ts:1-60](https://github.com/sverklo/sverklo/blob/main/src/doctor.ts#L1-L60)

### Known Issue: Version Mismatch

If you have multiple sverklo installations (e.g., a stale global binary), `sverklo doctor` may report a different version than the one you're actually running. This occurs when the doctor check uses `sverklo` from `$PATH` instead of the embedded version.

## Platform-Specific Considerations

### Windows

Windows users may encounter path-related issues due to backslash vs forward slash handling. The codebase normalizes paths using:

```javascript
path.replace(/\\/g, "/").split("/")
```

This approach converts Windows paths to Unix-style before processing. Source: [Issue #20](https://github.com/sverklo/sverklo/issues/20)
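
A self-contained version of that expression, wrapped in a function, might look like the sketch below. `toSegments` is a hypothetical helper name, and the extra `filter` step (dropping empty segments) goes beyond the quoted expression.

```typescript
// Normalize a Windows or POSIX path to forward slashes and split it into
// segments. `toSegments` is a hypothetical helper; the filter that drops
// empty segments is an addition to the quoted expression.
function toSegments(filePath: string): string[] {
  return filePath
    .replace(/\\/g, "/") // backslashes -> forward slashes
    .split("/")
    .filter((segment) => segment.length > 0); // drop empties from "//" or a trailing slash
}

console.log(toSegments("C:\\Users\\dev\\project\\src\\index.ts"));
console.log(toSegments("src/server/mcp-server.ts"));
```

Note that a drive letter such as `C:` survives as its own segment, which is one place where Windows-specific edge cases can arise.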

### Fresh Git Repositories

When running `sverklo init` in a fresh repository with no commits, you may see a spurious git warning:

```
Use '--' to separate paths from revisions, like this: 'git <command> [<revision>...] -- [<file>...]'
```

This warning is cosmetic and does not affect functionality. Source: [Issue #3](https://github.com/sverklo/sverklo/issues/3)

## IDE Integration

After installation, configure your IDE to use sverklo as an MCP server:

### Claude Code

Sverklo ships with a Claude Skill package for Claude Code. After installation, Claude Code automatically discovers the MCP tools.

### Cursor / Windsurf / Other MCP Clients

Register the MCP server by adding to your IDE's MCP configuration:

```json
{
  "mcpServers": {
    "sverklo": {
      "command": "sverklo",
      "args": ["mcp", "serve"]
    }
  }
}
```

Source: [src/server/mcp-server.ts:1-30](https://github.com/sverklo/sverklo/blob/main/src/server/mcp-server.ts#L1-L30)

## Troubleshooting

### MCP Tools Not Appearing

1. Run `sverklo doctor` to verify the server starts correctly
2. Restart your IDE to pick up the newly registered MCP server
3. Check that the project is registered: `sverklo list`

### Stale Server Binary After Upgrade

When upgrading via `npm install -g`, any running MCP server subprocess continues serving from the old binary until restarted. Restart your IDE after upgrading. Source: [Issue #17](https://github.com/sverklo/sverklo/issues/17)

### Grammar Compilation Failures

If tree-sitter grammar compilation fails:

1. Ensure `node-gyp` is available: `npm install -g node-gyp`
2. Verify build tools are installed (Python, C++ compiler)
3. On macOS, install Xcode Command Line Tools: `xcode-select --install`

## Configuration Reference

### Environment Variables

| Variable | Default | Description |
|----------|---------|-------------|
| `SVERKLO_PROFILE` | `full` | Tool profile: `core`, `nav`, `lean`, `full`, `research`, `review` |
| `SVERKLO_DISABLED_TOOLS` | (none) | Comma-separated list of tools to hide |
| `SVERKLO_TOOL_<NAME>_DESCRIPTION` | (none) | Override tool description |

Source: [src/server/tool-overrides.ts:1-30](https://github.com/sverklo/sverklo/blob/main/src/server/tool-overrides.ts#L1-L30)
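
The sketch below shows one way such variables could be applied to a tool list. The variable names come from the table above. The parsing rules (trimming whitespace, upper-casing the tool name for `<NAME>`) and the `applyOverrides` function are assumptions, not a copy of `tool-overrides.ts`.

```typescript
// Sketch: apply SVERKLO_DISABLED_TOOLS and SVERKLO_TOOL_<NAME>_DESCRIPTION
// to a tool list. Parsing rules here are assumptions for illustration.
interface Tool {
  name: string;
  description: string;
}

function applyOverrides(tools: Tool[], env: Record<string, string | undefined>): Tool[] {
  const disabled = (env["SVERKLO_DISABLED_TOOLS"] ?? "")
    .split(",")
    .map((s) => s.trim())
    .filter((s) => s.length > 0);
  return tools
    .filter((t) => disabled.indexOf(t.name) === -1) // hide disabled tools
    .map((t) => {
      const override = env[`SVERKLO_TOOL_${t.name.toUpperCase()}_DESCRIPTION`];
      return override ? { name: t.name, description: override } : t;
    });
}

const visible = applyOverrides(
  [
    { name: "search", description: "Hybrid search" },
    { name: "audit", description: "Codebase health scoring" },
  ],
  { SVERKLO_DISABLED_TOOLS: "audit, clusters", SVERKLO_TOOL_SEARCH_DESCRIPTION: "Find code" }
);
console.log(visible);
```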

### Registry Structure

Projects are registered in `~/.sverklo/registry.json`:

```json
{
  "projects": [
    {
      "name": "my-project",
      "path": "/home/user/code/my-project",
      "lastIndexed": "2025-01-15T10:30:00Z",
      "version": "0.29.0"
    }
  ]
}
```
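
For scripts that read the registry, the structure above maps onto types like the following. The field names are taken from the example; the `indexAgeHours` helper is hypothetical and simply shows how an "age" could be derived from `lastIndexed`.

```typescript
// Types mirroring the registry example above, plus a hypothetical helper
// that reports how long ago a project was indexed.
interface RegistryEntry {
  name: string;
  path: string;
  lastIndexed: string; // ISO 8601 timestamp
  version: string;
}

interface Registry {
  projects: RegistryEntry[];
}

function indexAgeHours(entry: RegistryEntry, now: Date): number {
  const indexedAt = new Date(entry.lastIndexed).getTime();
  return (now.getTime() - indexedAt) / (1000 * 60 * 60);
}

const registry: Registry = JSON.parse(`{
  "projects": [
    {
      "name": "my-project",
      "path": "/home/user/code/my-project",
      "lastIndexed": "2025-01-15T10:30:00Z",
      "version": "0.29.0"
    }
  ]
}`);

const age = indexAgeHours(registry.projects[0], new Date("2025-01-15T12:30:00Z"));
console.log(`${registry.projects[0].name} indexed ${age} hours ago`);
```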

## Next Steps

After successful installation:

1. **Keep the index fresh**: `sverklo init` builds the initial index; run `sverklo reindex` after significant code changes
2. **Explore the codebase**: ask your assistant to call the `overview` tool
3. **Enable IDE integration**: Configure MCP server in your IDE
4. **Read the CLI reference**: Explore available commands with `sverklo --help`

---

<a id='quickstart'></a>

## Quick Start Guide

### Related Pages

Related topics: [Overview](#overview), [Installation](#installation)

<details>
<summary>Related Source Files</summary>

The following source files were used to generate this page:

- [FIRST_RUN.md](https://github.com/sverklo/sverklo/blob/main/FIRST_RUN.md)
- [package.json](https://github.com/sverklo/sverklo/blob/main/package.json)
- [skill/README.md](https://github.com/sverklo/sverklo/blob/main/skill/README.md)
- [agents/README.md](https://github.com/sverklo/sverklo/blob/main/agents/README.md)
- [action/README.md](https://github.com/sverklo/sverklo/blob/main/action/README.md)
- [src/server/tools/context.ts](https://github.com/sverklo/sverklo/blob/main/src/server/tools/context.ts)
- [src/server/mcp-server.ts](https://github.com/sverklo/sverklo/blob/main/src/server/mcp-server.ts)
</details>


Sverklo is a local-first MCP (Model Context Protocol) server that provides code intelligence for AI coding assistants. It delivers symbol graphs, blast-radius analysis, diff-aware review, and persistent memory across sessions—without requiring API keys or uploading code. Source: [package.json](https://github.com/sverklo/sverklo/blob/main/package.json)

This guide walks you through installing sverklo, initializing it for your project, and getting started with its core capabilities.

## Prerequisites

| Requirement | Version/Details |
|-------------|-----------------|
| Node.js | >= 24.0.0 |
| Package Manager | npm, pnpm, or yarn |
| IDE/Client | Claude Code, Cursor, Windsurf, or Codex CLI |
| Git | Required for version-aware features |

Sverklo uses ONNX Runtime for embeddings and tree-sitter for AST parsing. Both are declared in `package.json`: `onnxruntime-node` as a runtime dependency and `web-tree-sitter` as an optional one. Source: [package.json:38-43](https://github.com/sverklo/sverklo/blob/main/package.json)

## Installation

Install sverklo globally via npm:

```bash
npm install -g sverklo
```

Verify the installation:

```bash
sverklo --version
# or
sverklo doctor
```

The `doctor` command checks your environment and reports any configuration issues. Source: [skill/README.md](https://github.com/sverklo/sverklo/blob/main/skill/README.md)

## Project Initialization

Navigate to your project directory and run the initialization:

```bash
cd your-project
sverklo init
```

The `init` command performs the following setup:

1. **Indexes your codebase** — Scans source files, builds a symbol graph, and computes dependency relationships
2. **Detects existing agent instructions** — Checks for `CLAUDE.md`, `AGENTS.md`, or other agent configuration files
3. **Creates context files** — Generates or updates documentation for AI assistants
4. **Registers the project** — Adds the project to your local registry for quick access

Source: [skill/README.md](https://github.com/sverklo/sverklo/blob/main/skill/README.md)

> **Note:** In fresh git repositories with no commits, you may see a stray git warning. This is cosmetic—the init still succeeds. Source: [GitHub Issue #3](https://github.com/sverklo/sverklo/issues/3)

### Initialization for Windows

If you're on Windows and encounter path-related issues, note that sverklo normalizes file paths to forward slashes internally, but some edge cases may still arise. Source: [GitHub Issue #20](https://github.com/sverklo/sverklo/issues/20)

### Global Setup Option

If you want one-time machine setup without per-project boilerplate, use the global initialization flow. This imports memories once and allows quick registration for subsequent projects. Source: [GitHub Issue #72](https://github.com/sverklo/sverklo/issues/72)

## Core Commands

### Register a Project

Register an existing project (if not done during init):

```bash
sverklo register .
```

### List Registered Projects

View all registered projects and their status:

```bash
sverklo list
```

### Reindex a Project

After significant code changes, refresh the index:

```bash
sverklo reindex .
```

> **Known Issue:** The `reindex` command may not update the `lastIndexed` timestamp in `~/.sverklo/registry.json`, causing `sverklo list` to show stale ages. Source: [GitHub Issue #74](https://github.com/sverklo/sverklo/issues/74)

### Unregister a Project

Remove a project from the registry:

```bash
sverklo unregister <project-name>
```

To unregister by path (useful for agent-driven workflows):

```bash
sverklo unregister --by-path /path/to/project
```

Source: [GitHub Issue #73](https://github.com/sverklo/sverklo/issues/73)

## MCP Tools Overview

Once initialized, sverklo provides these tools to your AI assistant:

| Tool | Purpose |
|------|---------|
| `search` | Hybrid semantic code search (BM25 + embeddings + PageRank) |
| `lookup` | Find symbol definitions and references |
| `overview` | Get codebase statistics and top files by importance |
| `impact` | Calculate blast radius for proposed changes |
| `refs` | Find all references to a symbol |
| `deps` | Show dependency graph for a file |
| `context` | Umbrella tool—returns curated context bundle in one call |
| `remember` | Save decisions and context for future sessions |
| `recall` | Retrieve previously saved memories |
| `review_diff` | Risk-scored PR review with inline comments |
| `audit` | Codebase health scoring |
| `investigate` | Fan-out search across multiple signals |
| `critique` | Verify claims against codebase evidence |

Source: [src/server/mcp-server.ts](https://github.com/sverklo/sverklo/blob/main/src/server/mcp-server.ts)

## First Session Workflow

When your AI assistant starts a session, sverklo automatically provides context resources:

```mermaid
graph TD
    A[Session Start] --> B[MCP Server Initializes]
    B --> C[Load Core Memories]
    C --> D[Load Recent Memories]
    D --> E[Build sverklo://context Resource]
    E --> F[Auto-inject into Session]
```

The `sverklo://context` resource includes:

- **Core project context** — Tier-1 project invariants
- **Key memories** — Previously saved decisions
- **Top files** — High PageRank files for orientation
- **Language stats** — File counts by language

Source: [src/server/mcp-server.ts:37-67](https://github.com/sverklo/sverklo/blob/main/src/server/mcp-server.ts)

## Using the Context Tool

The `context` tool is the recommended starting point for new tasks:

```json
{
  "task": "add rate limiting to the login endpoint",
  "detail_level": "normal",
  "budget": 4000
}
```

| Parameter | Type | Description |
|-----------|------|-------------|
| `task` | string | Description of what you're working on |
| `detail_level` | minimal \| normal \| full | How much context to return |
| `scope` | string | Optional path prefix to constrain search |
| `budget` | number | Token budget for PageRank-pruned repo map |

- **minimal** — Fast/cheap: overview header + top 3 search hits + top 2 memories
- **normal** — Balanced: header + top 5 search hits + top 5 memories + symbol table
- **full** — Normal + dependency neighbors of top results
- **budget** — Returns PageRank-pruned repo map fit to token budget

Source: [src/server/tools/context.ts](https://github.com/sverklo/sverklo/blob/main/src/server/tools/context.ts)
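
The levels and the budget option above can be summarized in a short sketch. The per-level counts are taken from the list. The `ContextPlan` shape, the `planContext` and `pruneToBudget` names, and the greedy highest-rank-first selection are assumptions about how such pruning could work.

```typescript
// Limits per detail level, taken from the list above. Type and function
// names are assumptions for illustration.
type DetailLevel = "minimal" | "normal" | "full";

interface ContextPlan {
  searchHits: number;
  memories: number;
  symbolTable: boolean;
  dependencyNeighbors: boolean;
}

function planContext(level: DetailLevel): ContextPlan {
  switch (level) {
    case "minimal":
      return { searchHits: 3, memories: 2, symbolTable: false, dependencyNeighbors: false };
    case "normal":
      return { searchHits: 5, memories: 5, symbolTable: true, dependencyNeighbors: false };
    case "full":
      return { searchHits: 5, memories: 5, symbolTable: true, dependencyNeighbors: true };
  }
}

interface RankedFile {
  path: string;
  rank: number;   // PageRank score
  tokens: number; // estimated token cost of including this file
}

// Greedy pruning: take files in descending rank, skipping any that would
// push the total past the budget.
function pruneToBudget(files: RankedFile[], budget: number): RankedFile[] {
  const byRank = files.slice().sort((a, b) => b.rank - a.rank);
  const kept: RankedFile[] = [];
  let used = 0;
  for (const f of byRank) {
    if (used + f.tokens > budget) continue;
    kept.push(f);
    used += f.tokens;
  }
  return kept;
}

const repoMap = pruneToBudget(
  [
    { path: "a.ts", rank: 0.5, tokens: 3000 },
    { path: "b.ts", rank: 0.3, tokens: 2000 },
    { path: "c.ts", rank: 0.2, tokens: 900 },
  ],
  4000
);
console.log(repoMap.map((f) => f.path));
```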

## GitHub Actions Integration

For automated code review on pull requests, use the sverklo action:

```yaml
name: Sverklo Review
on: [pull_request]

jobs:
  review:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
        with:
          fetch-depth: 0
      - uses: sverklo/sverklo/action@main
        with:
          github-token: ${{ secrets.GITHUB_TOKEN }}
          fail-on: high
          inline-comments: true
```

| Input | Default | Description |
|-------|---------|-------------|
| `github-token` | `${{ github.token }}` | GitHub token for posting comments |
| `fail-on` | `none` | Risk threshold: `critical`, `high`, `medium`, `low`, `none` |
| `ref` | auto-detected | Git ref range (e.g., `main..HEAD`) |
| `max-files` | `25` | Maximum files to review |
| `inline-comments` | `true` | Post inline comments at flagged lines |

Source: [action/README.md](https://github.com/sverklo/sverklo/blob/main/action/README.md)
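
One way to read the `fail-on` input is as a threshold on an ordered scale of risk levels. The sketch below follows the level names from the table. The `shouldFail` function and its rule (fail when any reviewed file meets or exceeds the threshold) are assumptions about how such a gate could behave.

```typescript
// Sketch of a `fail-on` gate. Level names come from the table above;
// the comparison rule is an assumption.
type Risk = "low" | "medium" | "high" | "critical";
type FailOn = Risk | "none";

const riskOrder: Risk[] = ["low", "medium", "high", "critical"];

function shouldFail(fileRisks: Risk[], failOn: FailOn): boolean {
  if (failOn === "none") return false; // never fail the build
  const threshold = riskOrder.indexOf(failOn);
  return fileRisks.some((r) => riskOrder.indexOf(r) >= threshold);
}

console.log(shouldFail(["low", "medium"], "high"));
console.log(shouldFail(["low", "critical"], "high"));
```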

## Claude Code Subagents

Replace Claude Code's built-in subagents with sverklo-enhanced versions:

```bash
mkdir -p .claude/agents
curl -o .claude/agents/sverklo-explore.md \
  https://raw.githubusercontent.com/sverklo/sverklo/main/agents/sverklo-explore.md
```

The sverklo-explore subagent uses hybrid retrieval (BM25 + ONNX embeddings + PageRank) and answers questions in ~150-800 tokens, versus ~14,200 tokens for the default approach. Source: [agents/README.md](https://github.com/sverklo/sverklo/blob/main/agents/README.md)

## Claude Skill Package

Sverklo ships a Claude Skill for Claude Code:

```bash
# The skill is included in the npm package
ls skill/
# Contains: sverklo-skill.zip and skill definitions
```

Tools available via the skill include `sverklo_search`, `sverklo_review_diff`, `sverklo_audit`, and memory tools. Source: [skill/README.md](https://github.com/sverklo/sverklo/blob/main/skill/README.md)

## Troubleshooting

### Version Mismatch Warning

When upgrading sverklo via npm, a running MCP server subprocess may continue using the old binary. Restart your IDE or MCP client to pick up the new version. Source: [GitHub Issue #17](https://github.com/sverklo/sverklo/issues/17)

### AGENTS.md Not Respected

If your project uses `AGENTS.md` instead of `CLAUDE.md`, the init command may still add context to the wrong file. Manually migrate the content or file an issue. Source: [GitHub Issue #19](https://github.com/sverklo/sverklo/issues/19)

### MCP Tool Name Prefix

When registering the MCP server under the key `"sverklo"`, tool names may appear as `sverklo_sverklo_*` due to double-prefixing. Register under a different key (e.g., `"io.github.sverklo"`) to avoid this. Source: [GitHub Issue #71](https://github.com/sverklo/sverklo/issues/71)

## Next Steps

- Review the [sverklo prompts documentation](https://github.com/sverklo/sverklo/blob/main/src/server/prompts.ts) for workflow templates
- Explore [architecture mapping](https://github.com/sverklo/sverklo/blob/main/src/server/prompts.ts) to understand your codebase
- Set up [PR review automation](https://github.com/sverklo/sverklo/blob/main/action/README.md) for your CI/CD pipeline

---

<a id='architecture'></a>

## System Architecture

### Related Pages

Related topics: [MCP Server Design](#mcp-server), [Search and Retrieval System](#search-system), [Indexing System](#indexing), [Bi-Temporal Memory Layer](#memory-layer)

<details>
<summary>Related Source Files</summary>

The following source files were used to generate this page:

- [src/server/mcp-server.ts](https://github.com/sverklo/sverklo/blob/main/src/server/mcp-server.ts)
- [src/server/tools/context.ts](https://github.com/sverklo/sverklo/blob/main/src/server/tools/context.ts)
- [src/server/tools/critique.ts](https://github.com/sverklo/sverklo/blob/main/src/server/tools/critique.ts)
- [src/server/tools/find-references.ts](https://github.com/sverklo/sverklo/blob/main/src/server/tools/find-references.ts)
- [src/server/prompts.ts](https://github.com/sverklo/sverklo/blob/main/src/server/prompts.ts)
- [src/server/tool-overrides.ts](https://github.com/sverklo/sverklo/blob/main/src/server/tool-overrides.ts)
- [src/server/tools/review-format.ts](https://github.com/sverklo/sverklo/blob/main/src/server/tools/review-format.ts)
- [src/server/tools/wakeup.ts](https://github.com/sverklo/sverklo/blob/main/src/server/tools/wakeup.ts)
- [src/server/audit-html.ts](https://github.com/sverklo/sverklo/blob/main/src/server/audit-html.ts)
- [src/server/audit-obsidian.ts](https://github.com/sverklo/sverklo/blob/main/src/server/audit-obsidian.ts)
- [package.json](https://github.com/sverklo/sverklo/blob/main/package.json)
</details>

# System Architecture

Sverklo is a local-first code intelligence platform designed as an MCP (Model Context Protocol) server. It provides persistent memory, semantic code search, dependency graphs, blast-radius analysis, and diff-aware review for AI coding assistants. The architecture follows a layered design with clear separation between indexing, storage, search, and tool delivery layers.

## High-Level Architecture Overview

Sverklo operates as a long-running MCP server process that serves code intelligence tools to IDE-integrated AI clients. The system indexes code once and serves multiple query types across sessions.

```mermaid
graph TD
    subgraph "Client Layer"
        A["Claude Code / Cursor / Windsurf / Codex CLI"]
    end
    
    subgraph "MCP Server Layer"
        B["mcp-server.ts<br/>MCP Protocol Handler"]
        C["Tool Handlers"]
        D["Prompt Templates"]
        E["Resource Provider"]
    end
    
    subgraph "Search & Query Layer"
        F["hybrid-search.ts<br/>BM25 + ONNX Embeddings + PageRank"]
        G["investigate.ts<br/>Multi-signal Fan-out"]
        H["Tool Profiles<br/>core, nav, lean, research, review"]
    end
    
    subgraph "Indexing Layer"
        I["Index Files"]
        J["Index Code (AST)"]
        K["Index Graph (Dependencies)"]
        L["Index Memory"]
    end
    
    subgraph "Storage Layer"
        M["SQLite Database"]
        N["Vector Store"]
        O["File Registry"]
    end
    
    A --> B
    B --> C
    B --> D
    B --> E
    C --> H
    D --> C
    H --> I
    H --> J
    H --> K
    H --> L
    I --> M
    J --> M
    K --> M
    L --> M
```

## Core Design Principles

### Local-First Architecture

Sverklo stores all data locally in `~/.sverklo/` and per-project `.sverklo/` directories. No API keys or cloud services are required. The system runs entirely on the developer's machine.

**Dependencies supporting local-first operation:**
- `chokidar` for file system watching
- `picomatch` for glob pattern matching
- `ignore` for `.gitignore` compatible filtering
- `onnxruntime-node` for local embedding inference
- `web-tree-sitter` (optional) for AST parsing

Source: [package.json:1-50](https://github.com/sverklo/sverklo/blob/main/package.json)

### MCP Protocol Integration

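The server is built on MCP SDK 1.12.0 and exposes three kinds of MCP primitives: resources, prompts, and tools. The resource list handler registers a single resource, `sverklo://context`: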

```typescript
server.setRequestHandler(ListResourcesRequestSchema, async () => ({
  resources: [{
    uri: "sverklo://context",
    name: "Sverklo Project Context",
    description: "Key memories and codebase overview...",
    mimeType: "text/plain",
  }],
}));
```

The server exposes resources that are auto-injected at session start, prompts for workflow templates, and tools for all code intelligence operations.

Source: [src/server/mcp-server.ts:1-50](https://github.com/sverklo/sverklo/blob/main/src/server/mcp-server.ts)

## Tool Architecture

### Tool Registration System

All tools follow a standardized handler pattern. The MCP server maintains a tool registry that supports dynamic configuration:

```typescript
export const contextTool = {
  name: "context",
  description: "Umbrella context bundler...",
  inputSchema: {
    type: "object" as const,
    properties: {
      task: { type: "string", description: "..." },
      detail_level: { type: "string", enum: ["minimal", "normal", "full"] },
      scope: { type: "string" },
      budget: { type: "number" },
    },
  },
};
```

Source: [src/server/tools/context.ts:1-40](https://github.com/sverklo/sverklo/blob/main/src/server/tools/context.ts)

### Tool Profiles

The system provides pre-defined tool subsets called profiles to control the MCP tool surface:

| Profile | Tools | Use Case |
|---------|-------|----------|
| `core` | search, lookup, overview, refs, impact | Hot path only |
| `nav` | core + deps, context, status | Navigation focus |
| `lean` | nav + remember, recall, review_diff | Memory + diff |
| `research` | search, investigate, ask, concepts, patterns, clusters, verify, critique | Code research |
| `review` | review_diff, diff_search, test_map, impact, refs | PR/MR review |

Source: [src/server/tool-overrides.ts:1-80](https://github.com/sverklo/sverklo/blob/main/src/server/tool-overrides.ts)
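
The composition in the table above (each of `nav` and `lean` extending the previous profile) can be pictured with a small sketch. The names `PROFILES` and `toolsForProfile` are hypothetical and only mirror the table; they are not taken from `tool-overrides.ts`.

```typescript
// Hypothetical sketch: tool lists mirror the profile table above, but the
// constant and function names are illustrative, not sverklo's API.
const CORE = ["search", "lookup", "overview", "refs", "impact"];
const NAV = [...CORE, "deps", "context", "status"];
const LEAN = [...NAV, "remember", "recall", "review_diff"];

const PROFILES = new Map<string, readonly string[]>([
  ["core", CORE],
  ["nav", NAV],
  ["lean", LEAN],
  ["research", ["search", "investigate", "ask", "concepts", "patterns", "clusters", "verify", "critique"]],
  ["review", ["review_diff", "diff_search", "test_map", "impact", "refs"]],
]);

// Returns the tools a profile exposes, or undefined for an unknown profile
// so the caller can decide whether to fall back to the full tool surface.
function toolsForProfile(name: string): readonly string[] | undefined {
  return PROFILES.get(name);
}
```

Returning `undefined` for an unknown name, rather than an empty list, keeps a typo in the profile name from silently hiding every tool.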

### Runtime Configuration

Tools can be customized via environment variables:

| Variable | Purpose |
|----------|---------|
| `SVERKLO_TOOL_<NAME>_DESCRIPTION` | Override tool description text |
| `SVERKLO_DISABLED_TOOLS` | Comma-separated list of tools to hide |
| `SVERKLO_PROFILE` | Apply a named profile (core, nav, lean, research, review) |
| `SVERKLO_ZILLIZ_COMPAT` | Enable compatibility aliases for Zilliz claude-context tool names |

Source: [src/server/tool-overrides.ts:80-120](https://github.com/sverklo/sverklo/blob/main/src/server/tool-overrides.ts)
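
As a rough illustration of how two of these variables might be applied, the sketch below filters a tool list by `SVERKLO_DISABLED_TOOLS` and swaps in description overrides. The `ToolInfo` type, the `applyOverrides` function, and the assumption that `<NAME>` is the upper-cased tool name are all hypothetical.

```typescript
interface ToolInfo {
  name: string;
  description: string;
}

// Hypothetical sketch of applying the variables in the table above.
// `env` is passed in (for example process.env) so the function stays pure.
function applyOverrides(
  tools: ToolInfo[],
  env: Record<string, string | undefined>
): ToolInfo[] {
  const disabled = new Set(
    (env["SVERKLO_DISABLED_TOOLS"] ?? "")
      .split(",")
      .map((entry) => entry.trim())
      .filter((entry) => entry.length > 0)
  );
  return tools
    .filter((tool) => !disabled.has(tool.name))
    .map((tool) => {
      // Assumption: <NAME> is the upper-cased tool name.
      const override = env[`SVERKLO_TOOL_${tool.name.toUpperCase()}_DESCRIPTION`];
      return override ? { ...tool, description: override } : tool;
    });
}
```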

## Search Architecture

### Hybrid Search Pipeline

Sverklo combines multiple retrieval signals to maximize result quality:

```mermaid
graph LR
    A["Query"] --> B["BM25 Keyword Search"]
    A --> C["ONNX Bi-encoder Embeddings"]
    A --> D["PageRank Centrality"]
    B --> E["Reciprocal Rank Fusion"]
    C --> E
    D --> E
    E --> F["Ranked Results"]
```

The search layer coordinates BM25 keyword matching, semantic embedding similarity via ONNX, and PageRank-based importance scoring. Reciprocal Rank Fusion combines these signals into a unified ranking.

Source: [src/server/mcp-server.ts:100-150](https://github.com/sverklo/sverklo/blob/main/src/server/mcp-server.ts)

### Investigation Engine

The `investigate` tool performs multi-signal fan-out in a single call:

1. Executes full-text search
2. Queries vector embeddings
3. Resolves symbol references
4. Checks documentation mentions
5. Aggregates results with `found_by` tags indicating which signals matched

Results agreed on by multiple retrievers are tagged as higher-signal than single-source hits.

Source: [src/server/prompts.ts:1-50](https://github.com/sverklo/sverklo/blob/main/src/server/prompts.ts)
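
The aggregation step can be sketched as follows. This is a minimal illustration, assuming each retriever returns hits identified by a shared `id`; the `Signal`, `Hit`, and `aggregate` names are hypothetical, and only the `found_by` field name comes from the description above.

```typescript
type Signal = "fts" | "vector" | "symbol" | "doc";

interface Hit {
  id: string;
  signal: Signal;
}

interface Aggregated {
  id: string;
  found_by: Signal[];
}

// Hypothetical sketch: merge per-retriever hits into one row per result and
// sort so that results found by more signals come first.
function aggregate(hits: Hit[]): Aggregated[] {
  const byId = new Map<string, Set<Signal>>();
  for (const hit of hits) {
    const signals = byId.get(hit.id) ?? new Set<Signal>();
    signals.add(hit.signal);
    byId.set(hit.id, signals);
  }
  return Array.from(byId.entries())
    .map(([id, signals]) => ({ id, found_by: Array.from(signals) }))
    .sort((a, b) => b.found_by.length - a.found_by.length);
}
```

Using a `Set` per result means a retriever that returns the same chunk twice still counts as a single signal.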

### PageRank Integration

Dependency graph analysis produces PageRank scores used to:
- Rank files by architectural importance
- Prune large repos to fit token budgets
- Identify load-bearing modules
- Surface high-centrality files in overview

Source: [src/server/tools/wakeup.ts:1-50](https://github.com/sverklo/sverklo/blob/main/src/server/tools/wakeup.ts)
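
For readers unfamiliar with PageRank, the following is a minimal power-iteration sketch over a file dependency graph. The edge direction (importer to imported file), the damping factor of 0.85, and the fixed iteration count are assumptions for illustration; sverklo's actual parameters may differ.

```typescript
// Minimal PageRank by power iteration over a file dependency graph.
// Assumptions: edges point from importer to imported file, damping is 0.85,
// and a fixed iteration count is used.
function pageRank(
  edges: Array<[string, string]>,
  damping = 0.85,
  iterations = 50
): Map<string, number> {
  const nodes = new Set<string>();
  const out = new Map<string, string[]>();
  for (const [from, to] of edges) {
    nodes.add(from);
    nodes.add(to);
    out.set(from, [...(out.get(from) ?? []), to]);
  }
  const n = nodes.size;
  let rank = new Map<string, number>();
  for (const node of nodes) rank.set(node, 1 / n);

  for (let i = 0; i < iterations; i++) {
    // Rank held by files with no outgoing edges is spread evenly.
    let dangling = 0;
    for (const node of nodes) {
      if (!out.has(node)) dangling += rank.get(node) ?? 0;
    }
    const next = new Map<string, number>();
    for (const node of nodes) {
      next.set(node, (1 - damping) / n + (damping * dangling) / n);
    }
    for (const [from, targets] of out) {
      const share = (rank.get(from) ?? 0) / targets.length;
      for (const to of targets) {
        next.set(to, (next.get(to) ?? 0) + damping * share);
      }
    }
    rank = next;
  }
  return rank;
}
```

With this edge direction, a file imported by many others accumulates rank, which matches the idea of a load-bearing module.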

## Code Analysis Engine

### Symbol Indexing

The indexing pipeline extracts symbols from AST-aware parsing:

- Function and class definitions
- Import/export relationships
- Type annotations
- Documentation comments

Symbol data enables `refs` (find references), `impact` (blast radius), and symbol-based search.

### Dependency Graph

Edges between files capture:
- Import statements (ES modules, CommonJS, TypeScript imports)
- Re-exports and re-typed symbols
- Cross-reference relationships

The graph supports cycle detection, fan-in/fan-out analysis, and impact propagation.

Source: [src/server/audit-obsidian.ts:1-50](https://github.com/sverklo/sverklo/blob/main/src/server/audit-obsidian.ts)
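
Impact propagation over such a graph can be sketched as a breadth-first walk along reversed import edges. The `blastRadius` function and the edge representation are hypothetical; they illustrate the idea rather than the implementation.

```typescript
// Hypothetical sketch of impact propagation: given import edges
// (importer -> imported), collect every file that transitively depends on
// a changed file by walking the edges in reverse.
function blastRadius(edges: Array<[string, string]>, changed: string): string[] {
  const importers = new Map<string, string[]>();
  for (const [from, to] of edges) {
    importers.set(to, [...(importers.get(to) ?? []), from]);
  }
  const seen = new Set<string>([changed]);
  const queue = [changed];
  const affected: string[] = [];
  while (queue.length > 0) {
    const current = queue.shift() as string;
    for (const dependent of importers.get(current) ?? []) {
      if (seen.has(dependent)) continue; // also guards against import cycles
      seen.add(dependent);
      affected.push(dependent);
      queue.push(dependent);
    }
  }
  return affected;
}
```

The `seen` set is what keeps the walk finite when the graph contains cycles.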

### Critique System

The `critique` tool validates claims against indexed evidence:

```typescript
function formatCritique(c: CritiqueData): string {
  const parts: string[] = [];
  parts.push(c.claim ? `## critique — "${c.claim}"` : "## critique");
  // Verifies citations point to actual source files
  // Checks for undocumented symbols
  // Detects stale or moved references
}
```

Critique verifies that:
- Cited evidence actually exists
- Symbols are documented in `.md`, `.markdown`, or `.mdx` files
- References haven't become stale or moved

Source: [src/server/tools/critique.ts:1-60](https://github.com/sverklo/sverklo/blob/main/src/server/tools/critique.ts)

## Review and Audit Output

### GitHub PR Review Format

Review output supports structured GitHub API payloads:

```typescript
export interface InlineComment {
  /** Repo-relative path of the file being commented on */
  path: string;
  /** 1-based file line number */
  line: number;
  severity: "info" | "warning" | "error";
  body: string;
}
```

Risk levels are classified as: `critical`, `high`, `medium`, `low`.

Source: [src/server/tools/review-format.ts:1-40](https://github.com/sverklo/sverklo/blob/main/src/server/tools/review-format.ts)

### HTML Audit Reports

Self-contained HTML reports with dark theme branding are generated for codebase health analysis:

- Dimension cards with letter grades (A-F)
- Section cards with formatted content
- SEO meta tags and Open Graph support
- Google Fonts (JetBrains Mono, Public Sans)

Source: [src/server/audit-html.ts:1-60](https://github.com/sverklo/sverklo/blob/main/src/server/audit-html.ts)

### Obsidian Export

Audit reports can be exported as Obsidian-compatible markdown with `[[wikilinks]]` for clickable navigation between files and symbols.

Source: [src/server/audit-obsidian.ts:1-50](https://github.com/sverklo/sverklo/blob/main/src/server/audit-obsidian.ts)

## Prompt Templates

### Workflow Orchestration

Prompts encode the *order* of sverklo tool calls for common tasks:

| Prompt | Purpose |
|--------|---------|
| `sverklo/map-feature` | Map a feature across codebase entry points, symbols, tests, docs |
| `sverklo/architecture-map` | Generate architecture map using overview, deps, PageRank, recall |
| `sverklo/onboard` | New developer onboarding with conventions and project index |
| `sverklo/premerge` | Pre-merge review checklist |
| `sverklo/debug` | Systematic debugging using symbol graphs and references |

Each prompt uses the `PromptDefinition` interface:

```typescript
interface PromptDefinition {
  name: string;
  description: string;
  arguments: { name: string; description: string; required: boolean }[];
  build: (args: Record<string, string>) => string;
}
```

Source: [src/server/prompts.ts:1-80](https://github.com/sverklo/sverklo/blob/main/src/server/prompts.ts)
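
To make the interface concrete, here is a hypothetical prompt definition. The prompt name `example/review-changes`, its wording, and the default of `HEAD` are invented for illustration; only the `PromptDefinition` shape is taken from the excerpt above.

```typescript
interface PromptDefinition {
  name: string;
  description: string;
  arguments: { name: string; description: string; required: boolean }[];
  build: (args: Record<string, string>) => string;
}

// Hypothetical prompt: the name, wording, and step order are illustrative only.
const exampleReviewPrompt: PromptDefinition = {
  name: "example/review-changes",
  description: "Walk through a diff-aware review in a fixed order",
  arguments: [{ name: "ref", description: "Git ref or range to review", required: false }],
  build: (args) => {
    // Assumption: fall back to HEAD when no ref is supplied.
    const ref = args["ref"] || "HEAD";
    return [
      `1. Call review_diff with ref "${ref}" for risk-scored findings.`,
      `2. Call test_map with ref "${ref}" to check test coverage.`,
      `3. Call impact on the riskiest symbols before approving.`,
    ].join("\n");
  },
};
```

Because `build` returns plain text, the order of tool calls is encoded in the prompt itself rather than enforced by the server.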

## Context Injection

### Session Startup

On session start, the MCP server injects context via the `sverklo://context` resource:

```typescript
const coreMemories = indexer.memoryStore.getCore(15);
const recentMemories = indexer.memoryStore.getRecent(10);
const projectMemories = indexer.memoryStore.getByCategory("project");
const conventions = indexer.memoryStore.getByCategory("convention");
```

Context tiers:
- **Core** — Project invariants (always injected)
- **Recent** — Latest saved memories
- **Category** — Organized by type (project, convention, architecture)

Source: [src/server/mcp-server.ts:50-100](https://github.com/sverklo/sverklo/blob/main/src/server/mcp-server.ts)

### Wakeup Generation

The `wakeup` tool produces quick orientation summaries:

```typescript
export function generateWakeup(
  indexer: IndexFiles & IndexMemory,
  options: { maxTokens?: number; format?: "markdown" | "plain" }
): string
```

Includes project status, core files by dependency rank, and project invariants.

Source: [src/server/tools/wakeup.ts:1-50](https://github.com/sverklo/sverklo/blob/main/src/server/tools/wakeup.ts)

## References Lookup

### Symbol Resolution

The `refs` tool finds all references to a symbol and separates:
- **Structural inclusions** — where the symbol is defined/included
- **Associative references** — "see also" mentions

Deduplication keeps one row per logical doc location, keyed by file path, breadcrumb, and match kind:

```typescript
const seen = new Set<string>();
for (const m of docMentions) {
  const key = `${m.doc_file_path}|${m.doc_breadcrumb ?? ""}|${m.match_kind}`;
  if (seen.has(key)) continue;
  seen.add(key);
  dedupedAll.push(m);
}
```

Source: [src/server/tools/find-references.ts:1-60](https://github.com/sverklo/sverklo/blob/main/src/server/tools/find-references.ts)

## Known Architectural Considerations

### Windows Path Handling

Path normalization uses forward-slash conversion:
```typescript
path.replace(/\\/g, "/").split("/").pop()
```

This is noted in community discussions as a workaround rather than a comprehensive fix. See [Issue #20](https://github.com/sverklo/sverklo/issues/20).
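
The behavior of that expression can be checked with a small wrapper. The helper name `baseName` is hypothetical; the body is the same expression as above.

```typescript
// Sketch of the forward-slash normalization described above, wrapped in a
// helper. The function name is hypothetical.
function baseName(filePath: string): string {
  // Convert Windows separators, then take the last path segment.
  return filePath.replace(/\\/g, "/").split("/").pop() ?? "";
}

baseName("src\\server\\mcp-server.ts"); // "mcp-server.ts"
baseName("src/server/tools/context.ts"); // "context.ts"
baseName("C:\\repo\\"); // "" (a trailing separator leaves an empty last segment)
```

The last case shows one limit of the expression: a path that ends in a separator yields an empty name.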

### MCP Tool Name Prefixing

Before the v0.28.0 rename, every tool name carried a `sverklo_` prefix (e.g., `sverklo_impact`, `sverklo_search`). Registering the server under the key `"sverklo"` then produced double-prefixed names in clients (e.g., `sverklo_sverklo_impact`). See [Issue #71](https://github.com/sverklo/sverklo/issues/71).

### Retrieval Architecture Evolution

Community discussions ([Issue #29](https://github.com/sverklo/sverklo/issues/29)) have raised evaluating ColBERT/PLAID-style multi-vector rerankers against the current bi-encoder + BM25 + PageRank approach.

## Technology Stack

| Component | Technology | Purpose |
|-----------|------------|---------|
| Runtime | Node.js >= 24.0.0 | Server runtime |
| Protocol | MCP SDK 1.12.0 | Client-server communication |
| Embeddings | ONNX Runtime Node 1.21.0 | Local vector inference |
| Parsing | Tree-sitter (optional) | AST extraction |
| File Watching | Chokidar 4.0.0 | Live reload support |
| Matching | Picomatch 4.0.4 | Glob pattern support |
| Config | YAML 2.8.3 | Configuration files |

Source: [package.json:30-60](https://github.com/sverklo/sverklo/blob/main/package.json)

## Summary

Sverklo's architecture separates indexing, storage, search, and delivery into distinct layers. The MCP protocol enables integration with multiple AI coding clients, while the hybrid search pipeline combines keyword, semantic, and graph-based signals. Tool profiles allow runtime customization of the available tool surface, and the prompt system encodes recommended workflows. The local-first design keeps code and memories on the developer's machine and avoids any dependence on external services.

---

<a id='mcp-server'></a>

## MCP Server Design

### Related Pages

Related topics: [Search Tools Reference](#search-tools), [Impact and Reference Tools](#impact-tools)

<details>
<summary>Related Source Files</summary>

The following source files were used to generate this page:

- [src/server/mcp-server.ts](https://github.com/sverklo/sverklo/blob/main/src/server/mcp-server.ts)
- [src/server/tool-overrides.ts](https://github.com/sverklo/sverklo/blob/main/src/server/tool-overrides.ts)
- [src/server/tools/_validation.ts](https://github.com/sverklo/sverklo/blob/main/src/server/tools/_validation.ts)
- [src/server/prompts.ts](https://github.com/sverklo/sverklo/blob/main/src/server/prompts.ts)
- [src/server/tools/context.ts](https://github.com/sverklo/sverklo/blob/main/src/server/tools/context.ts)
- [src/server/tools/critique.ts](https://github.com/sverklo/sverklo/blob/main/src/server/tools/critique.ts)
- [src/server/tools/rename-aliases.test.ts](https://github.com/sverklo/sverklo/blob/main/src/server/tools/rename-aliases.test.ts)
</details>

# MCP Server Design

Sverklo implements a Model Context Protocol (MCP) server that provides code intelligence capabilities to AI coding agents. The MCP server layer sits between the indexer (which maintains the code graph, embeddings, and memories) and the AI client (Claude Code, Cursor, Windsurf, or Codex CLI). This design enables persistent, local-first code understanding without API keys or code upload.

## Architecture Overview

The MCP server is built on top of the `@modelcontextprotocol/sdk` and exposes sverklo's indexer capabilities as tools, resources, and prompts. The architecture follows a layered design:

```mermaid
graph TD
    subgraph "AI Client Layer"
        A["Claude Code / Cursor / Windsurf"]
    end
    
    subgraph "MCP Server Layer"
        B["startMcpServer()"]
        C["startGlobalMcpServer()"]
        D["Tool Handlers"]
        E["Resource Handlers"]
        F["Prompt Handlers"]
    end
    
    subgraph "Core Indexer Layer"
        G["Indexer<br/>(IndexFiles + IndexCode + IndexGraph + IndexMemory)"]
        H["Vector Store"]
        I["Graph Store"]
        J["Memory Store"]
    end
    
    A --> B
    A --> C
    B --> D
    B --> E
    B --> F
    D --> G
    E --> G
    F --> G
    G --> H
    G --> I
    G --> J
```

### Key Design Principles

1. **Local-first**: All indexing happens on-disk; no data leaves the machine
2. **Git-aware**: Tools understand branches, commits, and diffs
3. **Multi-signal retrieval**: Combines FTS, embeddings, symbol graphs, and PageRank
4. **Backward compatibility**: Legacy tool names are aliased to canonical names with deprecation warnings

## Server Initialization

### Per-Project MCP Server

The `startMcpServer()` function initializes an MCP server for a single repository:

```typescript
// src/server/mcp-server.ts:180
export async function startMcpServer(rootPath: string): Promise<void> {
  const config = getProjectConfig(rootPath);
  // ... initializes indexer, registers handlers, starts server
}
```

Server configuration includes:

| Parameter | Source | Description |
|-----------|--------|-------------|
| `rootPath` | CLI argument | Absolute path to the project root |
| `serverName` | `server.json` | MCP server identifier (default: `io.github.sverklo/sverklo`) |
| `serverVersion` | `package.json` | Inherited from npm package version |
| `instructions` | Static string | Server capabilities description for AI clients |

### Server Capabilities

The MCP server declares three capability categories:

```typescript
// src/server/mcp-server.ts:190
const server = new Server(
  { name: "sverklo", version: serverVersion },
  {
    capabilities: {
      tools: {},
      resources: {},
      prompts: {},
    },
    instructions: /* string */,
  }
);
```

| Capability | Purpose | Handler |
|------------|---------|---------|
| `tools` | Code intelligence operations (search, lookup, impact, etc.) | `server.setRequestHandler(CallToolRequestSchema, ...)` |
| `resources` | Static project context at session start | `server.setRequestHandler(ListResourcesRequestSchema, ...)` |
| `prompts` | Reusable workflow templates | `server.setRequestHandler(ListPromptsRequestSchema, ...)` |

## Tool System Architecture

### Tool Registration

Tools are registered through the MCP SDK's `server.tool()` method. Each tool declares:

```typescript
// src/server/mcp-server.ts (pattern)
server.tool(
  "search",           // canonical tool name
  "Natural language search across indexed code",  // description
  { query: { type: "string" } },                 // input schema
  async (args, extra) => { /* handler */ }       // implementation
);
```

### Tool Naming Convention (v0.28.0+)

Following issue [#71](https://github.com/sverklo/sverklo/issues/71), canonical tool names drop the `sverklo_` prefix and use short lowercase identifiers, with underscores separating words (for example `review_diff`):

| Canonical Name | Description |
|----------------|-------------|
| `search` | Full-text and semantic search |
| `lookup` | Symbol lookup by name |
| `impact` | Blast radius analysis |
| `refs` | Find references to a symbol |
| `investigate` | Multi-signal fan-out investigation |
| `context` | Umbrella context bundler |
| `review_diff` | Diff-aware code review |
| `status` | Indexing status |

### Legacy Tool Aliases

For backward compatibility, the server maintains a `LEGACY_TOOL_ALIASES` map:

```typescript
// src/server/mcp-server.ts
export const LEGACY_TOOL_ALIASES: Record<string, string> = {
  "sverklo_search": "search",
  "sverklo_lookup": "lookup",
  // ... ≥30 entries
};
```

The `resolveToolName()` function routes legacy names to canonical names and emits a single deprecation warning per legacy name per server instance:

```typescript
// src/server/tools/rename-aliases.test.ts:16
it("resolveToolName routes legacy → canonical correctly", () => { ... });

// src/server/tools/rename-aliases.test.ts:22
it("deprecation warning fires exactly once per legacy name", () => { ... });
```
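
The behavior those two tests describe can be sketched as below. This is a hypothetical reconstruction, not the source of `resolveToolName()`: the `createResolver` factory, the injected `warn` callback, and the warning text are assumptions, and only two alias entries are shown.

```typescript
// Hypothetical sketch of alias resolution with a once-per-name warning.
// Only two alias entries are shown; the warning text is illustrative.
const LEGACY_ALIASES: Record<string, string> = {
  sverklo_search: "search",
  sverklo_lookup: "lookup",
};

function createResolver(warn: (message: string) => void) {
  const warned = new Set<string>(); // one set per server instance
  return function resolveToolName(name: string): string {
    const canonical = Object.prototype.hasOwnProperty.call(LEGACY_ALIASES, name)
      ? LEGACY_ALIASES[name]
      : undefined;
    if (canonical === undefined) return name; // already canonical or unknown
    if (!warned.has(name)) {
      warned.add(name);
      warn(`Tool "${name}" is deprecated; use "${canonical}" instead.`);
    }
    return canonical;
  };
}
```

Keeping the `warned` set inside the factory closure gives each server instance its own warning state, which is one way to get the "once per legacy name per server instance" behavior.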

### Tool Presets

Tool availability is controlled through named presets defined in `tool-overrides.ts`:

```typescript
// src/server/tool-overrides.ts
export const TOOL_PRESETS = {
  // Minimal: only essential tools
  minimal: ["search", "lookup", "overview", "refs", "impact", "status"],
  
  // Standard: balanced for most use cases
  standard: ["search", "lookup", "overview", "refs", "impact", "deps", "context", "status"],
  
  // Lean: adds memory tools for recall/remember
  lean: ["search", "lookup", "overview", "refs", "impact", "deps", "context", "status", "remember", "recall", "review_diff"],
  
  // Research: full investigation surface for code onboarding
  research: ["search", "search_iterative", "investigate", "ask", "lookup", "overview", "refs", "impact", "deps", "concepts", "patterns", "clusters", "verify", "critique", "ctx_slice", "ctx_grep", "ctx_stats", "status"],
  
  // Review: PR/MR focus with diff tools front-and-center
  review: ["review_diff", "diff_search", "test_map", "impact", "refs", "lookup", "search", "investigate", "verify", "status"],
};
```

Presets are configured in `server.json`:

```json
{
  "mcp": {
    "preset": "research",
    "env": {
      "OVERRIDE_TOOLS": "standard,context"
    }
  }
}
```

## Input Validation

### Server-Side Validation Layer

The `_validation.ts` module provides shared validators used by all tool handlers:

```typescript
// src/server/tools/_validation.ts
export function validateEnum<T extends string>(
  raw: unknown,
  allowed: readonly T[],
  argName: string,
  fallback: T
): T | Error { ... }

export function requireString(
  raw: unknown,
  argName: string,
  usage: string
): { ok: true; value: string } | { ok: false; message: string } { ... }
```

### Why Server-Side Validation?

The MCP wrapper declares JSON schemas, but agents sometimes pass values outside the declared enums. Without server-side guards, an invalid value falls through to a silent type-cast path and returns a wrong but successful-looking result. Source: [src/server/tools/_validation.ts:1-15](https://github.com/sverklo/sverklo/blob/main/src/server/tools/_validation.ts)
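
One plausible shape for the two validators is sketched below. The signatures follow the excerpt in the previous subsection, but the bodies are assumptions: in particular, returning the fallback only for a missing value (and an `Error` for any other out-of-enum value) is a guess at the intended behavior, as is the error wording.

```typescript
// Possible bodies for the validateEnum and requireString signatures. The
// fallback-on-missing behaviour and the error wording are assumptions.
function validateEnum<T extends string>(
  raw: unknown,
  allowed: readonly T[],
  argName: string,
  fallback: T
): T | Error {
  if (raw === undefined || raw === null) return fallback;
  const match = allowed.find((value) => value === raw);
  return match ?? new Error(`${argName} must be one of: ${allowed.join(", ")}`);
}

function requireString(
  raw: unknown,
  argName: string,
  usage: string
): { ok: true; value: string } | { ok: false; message: string } {
  if (typeof raw === "string" && raw.trim().length > 0) {
    return { ok: true, value: raw };
  }
  return { ok: false, message: `${argName} is required. Usage: ${usage}` };
}
```

Returning an explicit `Error` for an out-of-enum value is what turns a silent wrong answer into a visible failure the agent can correct.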

### Git Parameter Validation

Git parameters (refs, paths) are validated against injection patterns:

```typescript
// src/utils/git-validation.ts
export function validateGitRef(ref: string): boolean {
  // Allows: branch names, tags, SHAs, ranges (A..B, A...B), HEAD~N, HEAD^N
  // Rejects: spaces, semicolons, backticks, pipes, dollar signs, parentheses
  return /^[a-zA-Z0-9_.\/@{}\-~^:]+(\.\.[a-zA-Z0-9_.\/@{}\-~^:]+)?$/.test(ref);
}
```

This prevents command injection (CWE-78) when git commands are executed via `execSync` or `spawnSync`.
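
The sketch below exercises the pattern on a few inputs and shows a guard a handler might apply before building git arguments. `safeDiffArgs` is a hypothetical helper, and the `diff --stat` arguments are only an example.

```typescript
// Same pattern as above, repeated so the example is self-contained.
function validateGitRef(ref: string): boolean {
  return /^[a-zA-Z0-9_.\/@{}\-~^:]+(\.\.[a-zA-Z0-9_.\/@{}\-~^:]+)?$/.test(ref);
}

// Hypothetical guard a tool handler might apply before building git arguments.
function safeDiffArgs(ref: string): string[] {
  if (!validateGitRef(ref)) {
    throw new Error(`Invalid git ref: ${JSON.stringify(ref)}`);
  }
  // Arguments stay as an array so no shell ever parses the ref.
  return ["diff", "--stat", ref];
}

validateGitRef("main..feature"); // true
validateGitRef("HEAD~2"); // true
validateGitRef("main; rm -rf /"); // false (space and semicolon)
validateGitRef("$(whoami)"); // false (dollar sign and parentheses)
```

Note that the character class includes `-`, so a value such as `--force` passes the pattern. A caller may want to reject a leading dash or place `--` before user-supplied refs as an additional safeguard.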

## Resource System

### Auto-Injected Project Context

The MCP server registers a single resource `sverklo://context` that AI clients read at session start:

```typescript
// src/server/mcp-server.ts:200
server.setRequestHandler(ListResourcesRequestSchema, async () => ({
  resources: [
    {
      uri: "sverklo://context",
      name: "Sverklo Project Context",
      description:
        "Key memories and codebase overview. Read this at session start to understand the project.",
      mimeType: "text/plain",
    },
  ],
}));
```

### Context Content

When `sverklo://context` is read, the server returns a markdown document containing:

| Section | Content | Selection Criteria |
|---------|---------|-------------------|
| Core Project Context | Project-invariant memories (tier='core') | Top 15 by recency |
| Stale Memories | Memories flagged as outdated | Any with `is_stale: true` |
| Recent Memories | Recent context entries | Top 5 by recency |
| Top Files | Files sorted by PageRank | Top 5 files |

## Prompt Templates

The MCP server exposes reusable workflow prompts via the prompts protocol:

```typescript
// src/server/prompts.ts
export interface PromptDefinition {
  name: string;
  description: string;
  arguments: PromptArgument[];
  build: (args: Record<string, string | undefined>) => string;
}
```

### Available Prompts

| Prompt Name | Description | Arguments |
|-------------|-------------|-----------|
| `sverklo/review-changes` | Diff-aware code review workflow | `ref` (optional) |
| `sverklo/map-feature` | Map a feature across codebase entry points, symbols, tests, docs | `feature` (required) |

### Prompt Workflow Example

The `sverklo/review-changes` prompt guides the model through a structured review:

```markdown
# Review changes workflow (simplified)

1. Call `review_diff ref:"<ref>"` for risk-scored findings
2. Call `diff_search query:"<risk keywords>"` to surface related changes
3. Call `test_map ref:"<ref>"` to check test coverage
4. Call `impact ref:"<ref>"` for blast radius before approving
```

## Context Tool (`context`)

The `context` tool is an umbrella bundler that provides codebase overview in a single call:

```typescript
// src/server/tools/context.ts
const contextTool = {
  name: "context",
  description:
    "Umbrella context bundler. Give a task description and get a single curated bundle: " +
    "codebase overview header, semantically relevant code, related symbols, and matching " +
    "saved memories — in one round trip.",
  inputSchema: {
    type: "object",
    properties: {
      task: { type: "string" },
      detail_level: { type: "string", enum: ["minimal", "normal", "full"] },
      scope: { type: "string" },
      budget: { type: "number" },  // PageRank-pruned token budget
    },
  },
};
```

### Context Parameters

| Parameter | Type | Default | Description |
|-----------|------|---------|-------------|
| `task` | string | - | Free-form task description |
| `detail_level` | enum | `"normal"` | `minimal`=fast/cheap, `normal`=balanced, `full`=adds dep neighbors |
| `scope` | string | - | Path prefix to constrain search (e.g., `src/api/`) |
| `budget` | number | - | When set, returns PageRank-pruned repo map fit to token budget |
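
The `budget` behavior can be pictured as a greedy selection by rank. This is a sketch under assumptions: the `RankedFile` shape, the per-file token counts, and the greedy "skip what does not fit" strategy are illustrative and not confirmed by the source.

```typescript
interface RankedFile {
  path: string;
  pageRank: number;
  tokens: number;
}

// Hypothetical sketch of PageRank-pruned selection: take files in descending
// rank order and keep each one that still fits in the remaining token budget.
function pruneToBudget(files: RankedFile[], budget: number): string[] {
  const kept: string[] = [];
  let remaining = budget;
  const byRank = [...files].sort((a, b) => b.pageRank - a.pageRank);
  for (const file of byRank) {
    if (file.tokens <= remaining) {
      kept.push(file.path);
      remaining -= file.tokens;
    }
  }
  return kept;
}
```

Skipping an oversized file instead of stopping lets smaller, lower-ranked files use the leftover budget.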

## Critique Tool

The `critique` tool evaluates whether an AI's answer properly cites sverklo's evidence:

```typescript
// src/server/tools/critique.ts
interface CritiqueData {
  claim: string | null;
  verify: VerifyResult[];
  stale: VerifyResult[];
  moved: VerifyResult[];
  hubsCited: string[];
  missedHubs: string[];
  undefinedSymbols: string[];
  undocumentedSymbols: string[];
  totalSymbols: number;
}
```

### What Critique Checks

| Check | Description |
|-------|-------------|
| Symbol verification | Are cited symbols actually defined in the codebase? |
| Stale memory | Are referenced memories marked as outdated? |
| Moved code | Do citations point to symbols that have been relocated? |
| Hub citation | Does the answer cite high PageRank hub files? |
| Undefined symbols | Does the answer mention symbols that don't exist? |
| Undocumented symbols | Are important symbols missing `.md`/`.markdown`/`.mdx` documentation? |

## Zilliz Compatibility Layer

Sverklo provides aliases for Zilliz claude-context MCP server tools:

```typescript
// src/server/mcp-server.ts (Zilliz compat tools)
const zillizTools = [
  {
    name: "search_code",
    description: "[Zilliz claude-context compat] Alias for sverklo's search tool.",
    inputSchema: {
      type: "object",
      properties: {
        query: { type: "string" },
        path: { type: "string" },
        limit: { type: "number" },
      },
      required: ["query"],
    },
  },
  {
    name: "clear_index",
    description: "[Zilliz claude-context compat] Delete the index database and rebuild from scratch.",
  },
  {
    name: "get_indexing_status",
    description: "[Zilliz claude-context compat] Alias for sverklo's `status` tool.",
  },
];
```

## Global MCP Server (Multi-Repo Mode)

The `startGlobalMcpServer()` function serves multiple repositories from a single MCP server:

```typescript
// src/server/mcp-server.ts:290
export async function startGlobalMcpServer(): Promise<void> {
  const pool = new IndexerPool();
  const hints = new HintEngine();
  // ... initializes server with list_repos tool
}
```

### Global Mode Features

| Feature | Description |
|---------|-------------|
| `list_repos` | List all registered repositories with path, name, and status |
| `repo` parameter | All tools accept optional `repo` parameter to target specific repo |
| Single-repo shortcut | If only one repo is registered, `repo` parameter is optional |

### Server Instructions (Global Mode)

```
Sverklo (global mode): code intelligence serving multiple repos.
Use the list_repos tool to see available repositories, then pass the
repo name to any tool. If only one repo is registered, the repo
parameter is optional.
```
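
The single-repo shortcut described in these instructions can be sketched as a small resolution function. `resolveRepo`, the `RepoResolution` type, and the message wording are hypothetical.

```typescript
type RepoResolution =
  | { ok: true; repo: string }
  | { ok: false; message: string };

// Hypothetical sketch of the single-repo shortcut: the repo argument may be
// omitted only when exactly one repository is registered.
function resolveRepo(registered: string[], requested?: string): RepoResolution {
  if (requested !== undefined) {
    return registered.includes(requested)
      ? { ok: true, repo: requested }
      : { ok: false, message: `Unknown repo "${requested}". Call list_repos to see options.` };
  }
  const [only] = registered;
  if (registered.length === 1 && only !== undefined) {
    return { ok: true, repo: only };
  }
  return {
    ok: false,
    message: `${registered.length} repos registered; pass the repo parameter.`,
  };
}
```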

## Wakeup Generation

The `generateWakeup()` function creates a compact project summary:

```typescript
// src/server/tools/wakeup.ts
export function generateWakeup(
  indexer: IndexFiles & IndexMemory,
  options: { maxTokens?: number; format?: "markdown" | "plain" } = {}
): string
```

### Wakeup Output Structure

```markdown
# {projectName}
{fileCount} files · {languages}

## Core files (by dependency rank)
- `path/to/high-pagerank-file.ts`
- ...

## Project invariants (or Recent context)
- [{category}] {memory content}
- ...
```

## Version Management

The server reads its version from `package.json` at startup by probing up to three parent directories relative to `here`:

```typescript
// src/server/mcp-server.ts:305
for (const rel of ["..", "../..", "../../.."]) {
  try {
    const pkg = JSON.parse(readFileSync(join(here, rel, "package.json"), "utf-8"));
    if (pkg.name === "sverklo" && pkg.version) {
      serverVersion = pkg.version;
      break;
    }
  } catch {}
}
```

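Because the lookup is relative to `here` rather than the current working directory, and accepts only a `package.json` whose `name` is `sverklo`, the server reports the installed package's version even when invoked from different working directories. If none of the three candidates matches, `serverVersion` keeps the value it had before the loop.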

---

<a id='search-system'></a>

## Search and Retrieval System

### Related Pages

Related topics: [Search Tools Reference](#search-tools), [Indexing System](#indexing), [System Architecture](#architecture)

<details>
<summary>Related Source Files</summary>

The following source files were used to generate this page:

- [src/search/investigate.ts](https://github.com/sverklo/sverklo/blob/main/src/search/investigate.ts)
- [src/search/hybrid-search.ts](https://github.com/sverklo/sverklo/blob/main/src/search/hybrid-search.ts)
- [src/search/boost.ts](https://github.com/sverklo/sverklo/blob/main/src/search/boost.ts)
- [src/search/rerank.ts](https://github.com/sverklo/sverklo/blob/main/src/search/rerank.ts)
- [src/search/pagerank.ts](https://github.com/sverklo/sverklo/blob/main/src/search/pagerank.ts)
- [src/storage/embedding-store.ts](https://github.com/sverklo/sverklo/blob/main/src/storage/embedding-store.ts)
- [src/server/tools/context.ts](https://github.com/sverklo/sverklo/blob/main/src/server/tools/context.ts)
- [src/server/tools/search-iterative.ts](https://github.com/sverklo/sverklo/blob/main/src/server/tools/search-iterative.ts)
</details>

# Search and Retrieval System

## Overview

The Search and Retrieval System in sverklo provides multi-signal code search capabilities for coding agents. It combines full-text search (BM25), semantic embeddings (bi-encoder), symbol-based lookup, and graph-based PageRank scoring to surface relevant code chunks for a given query.

The system is designed to be local-first with no API keys required, using ONNX runtime for embedding inference and in-memory data structures for fast retrieval. It powers tools like `search`, `investigate`, `context`, and `review_diff` across the MCP server interface.

Source: [src/search/hybrid-search.ts](https://github.com/sverklo/sverklo/blob/main/src/search/hybrid-search.ts)

## Architecture

The retrieval architecture combines multiple rankers into a unified pipeline:

```mermaid
graph TD
    A[Query] --> B[BM25 FTS]
    A --> C[Bi-Encoder Embeddings]
    A --> D[Symbol Lookup]
    B --> E[Reciprocal Rank Fusion]
    C --> E
    D --> E
    E --> F[PageRank Boost]
    F --> G[Reranker]
    G --> H[Final Results]
```

The system uses **Reciprocal Rank Fusion (RRF)** to combine signals from multiple retrievers, followed by **PageRank-based boosting** to prioritize centrally-important files, and an optional **reranking pass** to refine results based on query-document affinity.

Source: [src/search/rerank.ts](https://github.com/sverklo/sverklo/blob/main/src/search/rerank.ts)

Source: [src/search/pagerank.ts](https://github.com/sverklo/sverklo/blob/main/src/search/pagerank.ts)

## Core Components

### Hybrid Search

The `hybridSearch` function orchestrates multi-signal retrieval by:

1. Executing parallel searches across FTS, embeddings, symbols, and references
2. Collecting results with source attribution (`found_by` tags)
3. Applying Reciprocal Rank Fusion to merge ranked lists
4. Boosting results from high-PageRank files

```typescript
export async function hybridSearch(
  indexer: Indexer,
  query: string,
  options?: HybridSearchOptions
): Promise<SearchResult[]>
```

The function returns results annotated with `found_by` arrays, allowing callers to identify multi-source agreement:

> Results agreed on by multiple retrievers are higher-signal than single-source hits.

Source: [src/search/hybrid-search.ts](https://github.com/sverklo/sverklo/blob/main/src/search/hybrid-search.ts)

Source: [src/server/prompts.ts](https://github.com/sverklo/sverklo/blob/main/src/server/prompts.ts)
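The `found_by` idea can be illustrated with a small sketch. It assumes each retriever emits raw hits tagged with a `source` (the four values match the `source` union shown for reranker results); the names `RawHit`, `collectFoundBy`, and `multiSourceIds` are hypothetical and are not taken from sverklo's code.

```typescript
// Hypothetical sketch: group raw hits by chunk and record which retrievers found each one.
type Retriever = "fts" | "embedding" | "symbol" | "ref";

interface RawHit {
  chunk_id: number;
  source: Retriever;
}

function collectFoundBy(hits: RawHit[]): Map<number, Retriever[]> {
  const foundBy = new Map<number, Retriever[]>();
  for (const hit of hits) {
    const sources = foundBy.get(hit.chunk_id) ?? [];
    // Record each retriever at most once per chunk
    if (sources.indexOf(hit.source) === -1) sources.push(hit.source);
    foundBy.set(hit.chunk_id, sources);
  }
  return foundBy;
}

// Chunks found by more than one retriever (the higher-signal hits)
function multiSourceIds(foundBy: Map<number, Retriever[]>): number[] {
  const ids: number[] = [];
  foundBy.forEach((sources, id) => {
    if (sources.length > 1) ids.push(id);
  });
  return ids;
}

const agreement = collectFoundBy([
  { chunk_id: 7, source: "fts" },
  { chunk_id: 7, source: "embedding" },
  { chunk_id: 9, source: "symbol" },
  { chunk_id: 7, source: "fts" },
]);
const highSignal = multiSourceIds(agreement);
```

In this toy input only chunk 7 is reported by two different retrievers, so it is the only id returned by `multiSourceIds`.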

### Reciprocal Rank Fusion

RRF combines ranked lists using the formula:

```
RRF_score(doc) = Σ 1 / (k + rank_i(doc))
```

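Where `rank_i(doc)` is the position of `doc` in the ranked list from retriever `i`, and `k` is a smoothing constant (typically 60) that damps the advantage of top-ranked items: a larger `k` flattens the difference between adjacent ranks. Because RRF uses only ranks and never raw scores, it needs no score normalization across rankers with different score distributions.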

Source: [src/search/hybrid-search.ts](https://github.com/sverklo/sverklo/blob/main/src/search/hybrid-search.ts)
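A minimal sketch of the fusion formula, assuming 1-based ranks and ranked lists of chunk ids as input. The function name `fuseRankings` and its input shape are hypothetical; only the formula and the constant 60 come from the text above.

```typescript
// Sketch of Reciprocal Rank Fusion; not sverklo's actual implementation.
const RRF_K = 60;

function fuseRankings(
  rankings: number[][],
  k: number = RRF_K
): { id: number; score: number }[] {
  const scores = new Map<number, number>();
  for (const ranking of rankings) {
    ranking.forEach((id, index) => {
      const rank = index + 1; // assumption: ranks start at 1
      scores.set(id, (scores.get(id) ?? 0) + 1 / (k + rank));
    });
  }
  const fused: { id: number; score: number }[] = [];
  scores.forEach((score, id) => fused.push({ id, score }));
  // Highest fused score first
  return fused.sort((a, b) => b.score - a.score);
}

const fused = fuseRankings([
  [101, 102, 103], // e.g. order returned by full-text search
  [102, 104, 101], // e.g. order returned by embedding search
]);
```

Chunk 102 ranks second and first in the two lists, so it edges out chunk 101 (first and third); chunks that appear in only one list fall behind both.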

### PageRank Boost

PageRank scores are computed from the import dependency graph during indexing. The `pagerankBoost` function adjusts search scores based on file centrality:

```typescript
export function pagerankBoost(
  results: SearchResult[],
  fileStore: FileStore,
  factor?: number
): SearchResult[]
```

High-PageRank files (core libraries, entry points) receive a multiplicative boost, so frequently imported code ranks higher even when the query shares few terms with it.

Source: [src/search/pagerank.ts](https://github.com/sverklo/sverklo/blob/main/src/search/pagerank.ts)
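The exact boost formula is not shown here, so the following is only one plausible shape for a multiplicative boost: `score * (1 + factor * pagerank / maxPagerank)`. The helper `boostByPagerank`, the `ScoredHit` type, and the normalization by the maximum PageRank are all assumptions made for illustration.

```typescript
// Hypothetical multiplicative PageRank boost; the real formula may differ.
interface ScoredHit {
  path: string;
  score: number;
}

function boostByPagerank(
  hits: ScoredHit[],
  pagerankByPath: Map<string, number>,
  factor: number = 0.5
): ScoredHit[] {
  let maxPr = 0;
  pagerankByPath.forEach((pr) => {
    if (pr > maxPr) maxPr = pr;
  });
  return hits
    .map((hit) => {
      const pr = pagerankByPath.get(hit.path) ?? 0;
      const normalized = maxPr > 0 ? pr / maxPr : 0; // scale to [0, 1]
      return { ...hit, score: hit.score * (1 + factor * normalized) };
    })
    .sort((a, b) => b.score - a.score);
}

const boosted = boostByPagerank(
  [
    { path: "src/util/helper.ts", score: 1.0 },
    { path: "src/core/index.ts", score: 0.8 },
  ],
  new Map<string, number>([
    ["src/util/helper.ts", 0.01],
    ["src/core/index.ts", 0.2],
  ])
);
```

With these toy numbers the central file overtakes the peripheral one despite a lower base score, which is the behavior the paragraph above describes.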

### Embedding Store

The `EmbeddingStore` manages vector embeddings for semantic search:

```typescript
export class EmbeddingStore {
  get(query: string, k?: number): EmbeddingResult[]
  upsert(records: EmbeddingRecord[]): void
  prune(ids: Set<number>): void
}
```

Embeddings are computed using ONNX runtime and stored in memory. The store supports:
- **Top-k retrieval** by cosine similarity
- **Pruning** to remove embeddings for deleted files
- **Batch upsert** for incremental index updates

Source: [src/storage/embedding-store.ts](https://github.com/sverklo/sverklo/blob/main/src/storage/embedding-store.ts)
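Top-k retrieval by cosine similarity can be sketched as a brute-force scan. This is an illustration under simple assumptions (plain `number[]` vectors, a precomputed query vector); `StoredVector`, `cosineSimilarity`, and `topK` are hypothetical names, not the `EmbeddingStore` API.

```typescript
// Brute-force cosine top-k over in-memory vectors (illustrative only).
interface StoredVector {
  id: number;
  vector: number[];
}

function cosineSimilarity(a: number[], b: number[]): number {
  let dot = 0;
  let normA = 0;
  let normB = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    normA += a[i] * a[i];
    normB += b[i] * b[i];
  }
  const denom = Math.sqrt(normA) * Math.sqrt(normB);
  return denom === 0 ? 0 : dot / denom; // guard against zero vectors
}

function topK(
  query: number[],
  records: StoredVector[],
  k: number
): { id: number; similarity: number }[] {
  return records
    .map((r) => ({ id: r.id, similarity: cosineSimilarity(query, r.vector) }))
    .sort((x, y) => y.similarity - x.similarity)
    .slice(0, k);
}

const nearest = topK(
  [1, 0],
  [
    { id: 1, vector: [0, 1] },
    { id: 2, vector: [1, 1] },
    { id: 3, vector: [2, 0] },
  ],
  2
);
```

Cosine similarity ignores vector length, so record 3 (same direction as the query) scores highest, followed by record 2 at 45 degrees.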

### Reranking

The reranker refines initial results using a cross-encoder approach. It takes the top-N candidates from hybrid search and reorders them based on finer-grained query-document matching:

```typescript
export interface RerankerResult {
  chunk_id: number;
  score: number;
  rerank_score: number;
  source: "fts" | "embedding" | "symbol" | "ref";
  path: string;
  lines: string;
}
```

Source: [src/search/rerank.ts](https://github.com/sverklo/sverklo/blob/main/src/search/rerank.ts)
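The reorder-the-top-N step can be sketched independently of any particular model. In this sketch the pair scorer is passed in as a function; `rerankTopN`, `Candidate`, and the toy `termOverlap` scorer are hypothetical stand-ins, and a real reranker would substitute model inference for the scorer.

```typescript
// Sketch: rescore only the first topN candidates, leave the rest in place.
interface Candidate {
  chunk_id: number;
  score: number;
  text: string;
}

interface Reranked extends Candidate {
  rerank_score?: number; // only set for candidates that were rescored
}

function rerankTopN(
  query: string,
  candidates: Candidate[],
  topN: number,
  scorePair: (query: string, text: string) => number
): Reranked[] {
  const head: Reranked[] = candidates
    .slice(0, topN)
    .map((c) => ({ ...c, rerank_score: scorePair(query, c.text) }));
  head.sort((a, b) => (b.rerank_score ?? 0) - (a.rerank_score ?? 0));
  // Candidates beyond topN keep their original order
  const tail: Reranked[] = candidates.slice(topN);
  return head.concat(tail);
}

// Toy scorer: counts how many query terms occur in the text
function termOverlap(query: string, text: string): number {
  const haystack = text.toLowerCase();
  return query
    .toLowerCase()
    .split(/\s+/)
    .filter((term) => haystack.indexOf(term) !== -1).length;
}

const reranked = rerankTopN(
  "rank fusion",
  [
    { chunk_id: 1, score: 0.9, text: "parse config file" },
    { chunk_id: 2, score: 0.8, text: "rank fusion score" },
    { chunk_id: 3, score: 0.1, text: "unrelated" },
  ],
  2,
  termOverlap
);
```

Limiting the rescoring to the top N keeps the cost of the more expensive scorer bounded regardless of how many candidates the first stage returns.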

## Investigation Tool

The `investigate` tool provides single-pass fan-out over all retrieval signals:

```typescript
export async function handleInvestigate(
  indexer: Indexer,
  args: { query: string; scope?: string; limit?: number }
): Promise<string>
```

It returns structured results showing which retrievers found each result, enabling agents to:
- Identify high-confidence hits (multi-source agreement)
- Discover unexpected code locations
- Build confidence in retrieved evidence

Source: [src/search/investigate.ts](https://github.com/sverklo/sverklo/blob/main/src/search/investigate.ts)

## Search Iterative Tool

For complex queries requiring refinement, `searchIterative` supports multi-turn search with context accumulation:

```typescript
export const searchIterativeTool = {
  name: "search_iterative",
  description: "Multi-turn search that builds on previous results...",
  inputSchema: {
    type: "object",
    properties: {
      query: { type: "string", description: "Search query" },
      refine: { type: "string", description: "Refinement to previous results" },
      // ...
    }
  }
}
```

The tool maintains trajectory state across calls, allowing progressive narrowing of search space.

Source: [src/server/tools/search-iterative.ts](https://github.com/sverklo/sverklo/blob/main/src/server/tools/search-iterative.ts)

## Context Tool

The `context` tool is an umbrella bundler that combines search with memory recall:

```typescript
export const contextTool = {
  name: "context",
  description: "Umbrella context bundler. Give a task description and get a single curated bundle..."
}
```

It supports a `budget` parameter for PageRank-pruned repo maps that fit a token budget—ideal for giving agents a complete mental model of an unfamiliar codebase in one call.

Source: [src/server/tools/context.ts](https://github.com/sverklo/sverklo/blob/main/src/server/tools/context.ts)

## Retrieval Signal Types

| Signal | Source | Strength |
|--------|--------|----------|
| BM25 FTS | Full-text indexing | Exact term matching |
| Bi-encoder embeddings | ONNX inference | Semantic similarity |
| Symbol lookup | AST parsing | Definition/expression finding |
| Reference graph | Import analysis | Call-site discovery |
| PageRank | Dependency graph | Architectural importance |

## Configuration

### Boost Factor

The `BOOST_FACTOR` constant (default: `0.5`) controls PageRank influence on final scores:

```typescript
export function pagerankBoost(
  results: SearchResult[],
  fileStore: FileStore,
  factor: number = BOOST_FACTOR
): SearchResult[]
```

### RRF Constant

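The `RRF_K` constant (default: `60`) controls how steeply a result's contribution falls off with its rank: larger values flatten the curve, so lower-ranked results from each retriever carry relatively more weight in the fusion: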

```typescript
const RRF_K = 60;
function rrfScore(rank: number): number {
  return 1 / (RRF_K + rank);
}
```

## Community Considerations

### Multi-Vector Reranker Evaluation (Issue #29)

The community has discussed evaluating ColBERT/PLAID-style multi-vector rerankers against the current bi-encoder approach. Multi-vector models encode each query and document as one embedding vector per token rather than a single pooled vector, potentially capturing finer-grained relevance signals for code search.

Current architecture uses bi-encoder embeddings where both query and document are encoded independently. The enhancement would involve:
- Late interaction between query and document token vectors
- Potential improvement in recall for partial matches
- Trade-off consideration: latency vs. accuracy

### Benchmark Performance (Issue #28)

Parser improvements (string/comment-aware brace counting) recovered P1 performance from 0.30 → 0.73 on the 90-task benchmark, though P2/P4 categories showed slight regression. This highlights the sensitivity of retrieval quality to underlying code parsing accuracy.

## Tool Summary

| Tool | Purpose | Signals Used |
|------|---------|--------------|
| `search` | Basic hybrid search | FTS + Embeddings + Symbols |
| `investigate` | Multi-source fan-out | All signals + agreement analysis |
| `search_iterative` | Refinement search | Trajectory-aware multi-turn |
| `context` | Bundle + memories | Hybrid search + recall |
| `ctx_grep` | Grep within results | Post-filter FTS |
| `ctx_slice` | Dependency slice | Graph-based filtering |

## Dependencies

The search system depends on:
- **ONNX Runtime Node** (`onnxruntime-node`) for embedding inference
- **Tree-sitter** (`web-tree-sitter`) for AST parsing and symbol extraction
- **Picomatch** for file path matching patterns

Source: [package.json](https://github.com/sverklo/sverklo/blob/main/package.json)

---

<a id='memory-layer'></a>

## Bi-Temporal Memory Layer

### Related Pages

Related topics: [Search and Retrieval System](#search-system)

<details>
<summary>Related Source Files</summary>

The following source files were used to generate this page:

- [src/server/tools/memories.ts](https://github.com/sverklo/sverklo/blob/main/src/server/tools/memories.ts)
- [src/server/tools/remember.ts](https://github.com/sverklo/sverklo/blob/main/src/server/tools/remember.ts)
- [src/server/tools/pin.ts](https://github.com/sverklo/sverklo/blob/main/src/server/tools/pin.ts)
- [src/server/mcp-server.ts](https://github.com/sverklo/sverklo/blob/main/src/server/mcp-server.ts)
- [src/types/index.ts](https://github.com/sverklo/sverklo/blob/main/src/types/index.ts)
</details>

# Bi-Temporal Memory Layer

The Bi-Temporal Memory Layer is sverklo's persistent knowledge system that preserves coding decisions, conventions, and context across agent sessions. Unlike traditional memory stores that overwrite previous entries, the bi-temporal model maintains both the current state and historical validity of each memory, enabling conflict detection without data loss.

## Core Concepts

### What Makes It "Bi-Temporal"

The bi-temporal architecture tracks two independent time dimensions:

1. **Valid Time**: When a memory was or will be true in the real world (e.g., "we deprecated X in v2.0")
2. **Record Time**: When the memory was recorded in the system (e.g., "I learned about X's deprecation today")

This separation allows agents to reason about both what was historically true and when that knowledge was acquired. Source: [src/server/tools/memories.ts:1-20](https://github.com/sverklo/sverklo/blob/main/src/server/tools/memories.ts#L1-L20)
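To make the two time axes concrete, here is a small sketch of a bi-temporal query. The field names `valid_from`, `valid_to`, and `recorded_at`, the numeric timestamps, and the function `knownAndValidAt` are all hypothetical; sverklo's actual schema is not shown in this section.

```typescript
// Hypothetical bi-temporal record: valid time and record time are tracked separately.
interface TemporalMemory {
  content: string;
  valid_from: number; // valid time: when the fact became true
  valid_to: number | null; // valid time: when it stopped being true (null = still true)
  recorded_at: number; // record time: when the system learned it
}

// Which facts were true at `validAt`, given only what had been recorded by `knownAt`?
function knownAndValidAt(
  memories: TemporalMemory[],
  validAt: number,
  knownAt: number
): TemporalMemory[] {
  return memories.filter(
    (m) =>
      m.recorded_at <= knownAt &&
      m.valid_from <= validAt &&
      (m.valid_to === null || validAt < m.valid_to)
  );
}

const timeline: TemporalMemory[] = [
  { content: "use REST", valid_from: 10, valid_to: 50, recorded_at: 12 },
  { content: "use gRPC", valid_from: 50, valid_to: null, recorded_at: 55 },
];

const early = knownAndValidAt(timeline, 20, 60); // REST era, full knowledge
const gap = knownAndValidAt(timeline, 60, 52); // switch happened but was not yet recorded
const current = knownAndValidAt(timeline, 60, 60); // switch recorded
```

The `gap` query returns nothing: at record time 52 the system knew the REST decision had ended at 50 only in hindsight, and the gRPC decision was not recorded until 55. Keeping both axes is what lets that distinction be expressed.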

### Memory Categories

Memories are classified into categories that determine their behavior:

| Category | Purpose | Default Kind |
|----------|---------|--------------|
| `decision` | Architectural choices, API contracts | episodic |
| `preference` | Coding style, team conventions | semantic |
| `pattern` | Reusable solutions to recurring problems | semantic |
| `context` | Project-specific information (default) | episodic |
| `todo` | Outstanding work items | episodic |
| `procedural` | Step-by-step processes | procedural |
| `correction` | Fixes for prior model mistakes | episodic |

Source: [src/server/tools/remember.ts:15-23](https://github.com/sverklo/sverklo/blob/main/src/server/tools/remember.ts#L15-L23)

### Memory Kinds (Cognitive Axis)

The cognitive axis determines how memories are retrieved and prioritized:

- **episodic**: Moment-bound events or decisions tied to specific contexts
- **semantic**: Timeless facts or rules that apply universally
- **procedural**: How-to knowledge for executing tasks

Source: [src/server/tools/remember.ts:47-50](https://github.com/sverklo/sverklo/blob/main/src/server/tools/remember.ts#L47-L50)

### Memory Tiers

| Tier | Behavior |
|------|----------|
| `core` | Auto-injected at every session start (up to 15 memories) |
| `archive` | Searched on demand, not auto-injected |

Source: [src/server/mcp-server.ts:45-55](https://github.com/sverklo/sverklo/blob/main/src/server/mcp-server.ts#L45-L55)

## Architecture

### Storage Components

```
┌─────────────────────────────────────────────────────────────────┐
│                     Bi-Temporal Memory Layer                     │
├─────────────────────────────────────────────────────────────────┤
│  ┌─────────────┐    ┌──────────────────────┐                   │
│  │memory-store │    │memory-embedding-store │                   │
│  │  (SQLite)   │    │   (Vector + SQLite)   │                   │
│  └─────────────┘    └──────────────────────┘                   │
│         │                     │                                │
│         └──────────┬──────────┘                                │
│                     ▼                                           │
│         ┌─────────────────────┐                                │
│         │  staleness.ts       │  (file-change tracking)         │
│         └─────────────────────┘                                │
│                     │                                           │
│         ┌─────────────────────┐                                │
│         │  prune.ts           │  (lifecycle management)         │
│         └─────────────────────┘                                │
└─────────────────────────────────────────────────────────────────┘
```

### Memory Lifecycle

```mermaid
graph TD
    A[remember tool] --> B[Check for conflicting memories]
    B --> C{Similarity > 0.85?}
    C -->|Yes| D[Mark old memory as STALE]
    C -->|No| E[Keep both memories active]
    D --> F[Save new memory with git state]
    E --> F
    F --> G[Store in memory-store]
    F --> H[Generate embeddings in memory-embedding-store]
    G --> I[Auto-inject if tier=core]
```

### Conflict Detection

The system compares each new memory against existing ones and treats a similarity above the conflict threshold (0.85, defined as a constant) as a potential contradiction:

```typescript
const CONFLICT_THRESHOLD = 0.85;
```

When a new memory conflicts with an existing one above this threshold, the older memory is marked as stale rather than deleted. Both records are preserved, enabling agents to review the conflict. Source: [src/server/tools/remember.ts:8](https://github.com/sverklo/sverklo/blob/main/src/server/tools/remember.ts#L8)
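The mark-stale-instead-of-delete behavior can be sketched as follows. How similarity is computed is not shown in this section, so the sketch takes precomputed similarities as input; `StoredMemory` and `markConflictsStale` are hypothetical names.

```typescript
// Sketch: invalidate conflicting memories without deleting them.
const CONFLICT_THRESHOLD = 0.85;

interface StoredMemory {
  id: number;
  content: string;
  is_stale: boolean;
}

// `similarityToNew` maps existing memory id -> similarity with the incoming memory
// (assumed to be computed elsewhere, e.g. from embeddings).
function markConflictsStale(
  existing: StoredMemory[],
  similarityToNew: Map<number, number>
): number[] {
  const invalidated: number[] = [];
  for (const memory of existing) {
    const similarity = similarityToNew.get(memory.id) ?? 0;
    if (!memory.is_stale && similarity > CONFLICT_THRESHOLD) {
      memory.is_stale = true; // the record is kept, only flagged
      invalidated.push(memory.id);
    }
  }
  return invalidated;
}

const stored: StoredMemory[] = [
  { id: 1, content: "API responses use snake_case", is_stale: false },
  { id: 2, content: "Tests live next to sources", is_stale: false },
];
const invalidatedIds = markConflictsStale(
  stored,
  new Map<number, number>([
    [1, 0.91],
    [2, 0.4],
  ])
);
```

After the call both records still exist; only memory 1 carries the stale flag, so it remains available for review.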

## MCP Tools

### remember — Save Persistent Memory

```typescript
rememberTool = {
  name: "remember",
  inputSchema: {
    content: string,           // Required: the memory content
    category: MemoryCategory,  // decision|preference|pattern|context|todo|procedural|correction
    tags: string[],            // Optional metadata tags
    related_files: string[],   // Files this memory relates to
    confidence: number,        // 0.0-1.0, affects retrieval ranking
    tier: "core" | "archive",   // core=auto-inject, archive=search-on-demand
    kind: "episodic" | "semantic" | "procedural",
    scope: "project" | "workspace",  // project=repo-local, workspace=cross-repo
  }
}
```

**Key behaviors:**
- Tied to git state — memories are associated with the current commit/branch
- Auto-invalidates conflicting prior memories above the conflict threshold
- `procedural` category defaults to `procedural` kind
- `preference`/`pattern` categories default to `semantic` kind
- Other categories default to `episodic` kind

Source: [src/server/tools/remember.ts:12-55](https://github.com/sverklo/sverklo/blob/main/src/server/tools/remember.ts#L12-L55)
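The three default-kind rules listed under "Key behaviors" translate directly into a small function. The name `defaultKindFor` is hypothetical; the category and kind values are the ones listed in the schema above.

```typescript
// Sketch of the category -> default kind rules described above.
type MemoryCategory =
  | "decision"
  | "preference"
  | "pattern"
  | "context"
  | "todo"
  | "procedural"
  | "correction";

type MemoryKind = "episodic" | "semantic" | "procedural";

function defaultKindFor(category: MemoryCategory): MemoryKind {
  if (category === "procedural") return "procedural";
  if (category === "preference" || category === "pattern") return "semantic";
  return "episodic"; // every other category
}

const kinds = [
  defaultKindFor("procedural"),
  defaultKindFor("pattern"),
  defaultKindFor("todo"),
];
```

An explicit `kind` passed to `remember` would presumably take precedence over this default; that override is an assumption and is not modeled here.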

### recall — Retrieve Relevant Memories

The recall tool searches memories semantically and returns results ranked by relevance. Memories linked to files that have changed since recording may be marked as potentially stale.

### memories — List and Audit Memories

```typescript
memoriesTool = {
  name: "memories",
  inputSchema: {
    mode: "list" | "conflicts",  // list=show all, conflicts=show contradictory pairs
    category: MemoryCategory,    // Filter by category
    limit: number,              // Max results (default: 50)
    stale_only: boolean,         // Only show stale memories
  }
}
```

**Conflict mode**: Surfaces pairs of active memories sharing a pin that may contradict. The bi-temporal model preserves both, presenting this as a review prompt rather than auto-resolving.

Source: [src/server/tools/memories.ts:8-30](https://github.com/sverklo/sverklo/blob/main/src/server/tools/memories.ts#L8-L30)

### pin / unpin — Anchor Memories to Code Locations

```typescript
pinTool = {
  name: "pin",
  inputSchema: {
    memory_id: number,  // From recall/memories results
    target: string,     // File path or symbol name
  }
}
```

Pinned memories surface automatically when recalling by that file path or symbol name, without requiring semantic search. This enables location-specific knowledge injection.

Source: [src/server/tools/pin.ts:1-30](https://github.com/sverklo/sverklo/blob/main/src/server/tools/pin.ts#L1-L30)
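Pinning amounts to an exact-match index from a target (file path or symbol name) to memory ids, which is why no semantic search is needed at recall time. The `PinIndex` class below is a hypothetical in-memory sketch; the `unpin` signature in particular is assumed, since only the `pin` schema is shown above.

```typescript
// Hypothetical in-memory pin index keyed by file path or symbol name.
class PinIndex {
  private byTarget = new Map<string, number[]>();

  pin(memoryId: number, target: string): void {
    const ids = this.byTarget.get(target) ?? [];
    if (ids.indexOf(memoryId) === -1) ids.push(memoryId); // pinning twice is a no-op
    this.byTarget.set(target, ids);
  }

  // Assumed signature: remove one memory from one target
  unpin(memoryId: number, target: string): void {
    const ids = (this.byTarget.get(target) ?? []).filter((id) => id !== memoryId);
    if (ids.length > 0) this.byTarget.set(target, ids);
    else this.byTarget.delete(target);
  }

  // Exact-match lookup; returns a copy so callers cannot mutate the index
  lookup(target: string): number[] {
    return (this.byTarget.get(target) ?? []).slice();
  }
}

const pins = new PinIndex();
pins.pin(12, "src/server/mcp-server.ts");
pins.pin(15, "src/server/mcp-server.ts");
pins.pin(12, "hybridSearch");
pins.unpin(15, "src/server/mcp-server.ts");
```

A single memory can be pinned to several targets, as memory 12 is here, so it surfaces both for the file and for the symbol.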

## Core Memories Auto-Injection

On every MCP session start, the server auto-injects **core tier** memories into the context:

```typescript
// From mcp-server.ts
const coreMemories = indexer.memoryStore.getCore(15);
for (const m of coreMemories) {
  const stale = m.is_stale ? " [STALE]" : "";
  parts.push(`- [${m.category}]${stale} ${m.content}`);
}
```

These memories appear in the `sverklo://context` resource and serve as project invariants that agents should always consider. Source: [src/server/mcp-server.ts:45-55](https://github.com/sverklo/sverklo/blob/main/src/server/mcp-server.ts#L45-L55)

## Staleness Detection

### File-Based Staleness

When `related_files` are provided with a memory, sverklo tracks file changes:

```typescript
interface Memory {
  related_files: string[];  // Files this memory relates to
  is_stale: boolean;        // Set when related files change
}
```

Stale memories are flagged with `[STALE]` in the context output, alerting agents to re-evaluate whether the memory is still valid. Source: [src/server/mcp-server.ts:52](https://github.com/sverklo/sverklo/blob/main/src/server/mcp-server.ts#L52)
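File-based staleness can be sketched as a set intersection between a memory's `related_files` and the files that changed since it was recorded. How sverklo detects changed files is not described here, so the sketch takes the changed paths as input; `TrackedMemory` and `flagStale` are hypothetical names.

```typescript
// Sketch: flag memories whose related files appear in a list of changed paths.
interface TrackedMemory {
  id: number;
  related_files: string[];
  is_stale: boolean;
}

function flagStale(memories: TrackedMemory[], changedFiles: string[]): TrackedMemory[] {
  const changed = new Set(changedFiles);
  return memories.map((m) => ({
    ...m,
    // Once stale, a memory stays stale; otherwise check for any overlap
    is_stale: m.is_stale || m.related_files.some((f) => changed.has(f)),
  }));
}

const checked = flagStale(
  [
    { id: 1, related_files: ["src/search/boost.ts"], is_stale: false },
    { id: 2, related_files: ["src/search/rerank.ts"], is_stale: false },
    { id: 3, related_files: [], is_stale: false },
  ],
  ["src/search/boost.ts", "README.md"]
);
```

A memory without `related_files` (id 3) can never become stale through this path, which is one reason to supply the list when saving a memory.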

### Conflict-Based Staleness

Memories that conflict with newer entries (similarity > 0.85) are automatically marked stale, preserving the historical record while surfacing the current best knowledge.

## Workspace Scope

Memories can be saved at two scopes:

| Scope | Storage Location | Visibility |
|-------|-----------------|-------------|
| `project` | `{repo}/.sverklo/memories.db` | Current repository only |
| `workspace` | `~/.sverklo/workspaces/{name}/memories.db` | All repos in workspace |

The workspace scope enables cross-repository decisions (e.g., "we use Postgres everywhere") to be shared across projects. Source: [src/server/tools/remember.ts:55-62](https://github.com/sverklo/sverklo/blob/main/src/server/tools/remember.ts#L55-L62)

## Community Considerations

### Global Memory Setup (Issue #72)

Users requesting `sverklo init --global` want one-time workspace-level memory setup that doesn't require per-project initialization. The bi-temporal layer's `workspace` scope partially addresses this by enabling cross-repo memories, but the initialization workflow remains per-project.

### MCP Tool Name Conflicts (Issue #71)

Memory tools (`remember`, `recall`, `memories`) may be double-prefixed (`sverklo_sverklo_remember`) when registered under the `sverklo` key, though this is a naming convention issue rather than a bi-temporal architecture concern.

## Summary

The Bi-Temporal Memory Layer provides sverklo's agents with persistent, conflict-aware knowledge that survives across sessions. Key design decisions:

- **Preservation over deletion**: Conflicting memories are marked stale, not removed
- **Dual temporal axes**: Valid time and record time enable historical reasoning
- **Tiered retrieval**: Core memories auto-inject; archive memories search on demand
- **Location pinning**: Memories anchor to files/symbols for context-sensitive recall
- **Git integration**: Memories are tied to repository state for traceability

---

<a id='indexing'></a>

## Indexing System

### Related Pages

Related topics: [Search and Retrieval System](#search-system), [System Architecture](#architecture)

<details>
<summary>Related Source Files</summary>

The following source files were used to generate this page:

- [src/indexer/index-files.ts](https://github.com/sverklo/sverklo/blob/main/src/indexer/index-files.ts)
- [src/indexer/index-code.ts](https://github.com/sverklo/sverklo/blob/main/src/indexer/index-code.ts)
- [src/indexer/index-graph.ts](https://github.com/sverklo/sverklo/blob/main/src/indexer/index-graph.ts)
- [src/indexer/index-memory.ts](https://github.com/sverklo/sverklo/blob/main/src/indexer/index-memory.ts)
- [src/indexer/indexer.ts](https://github.com/sverklo/sverklo/blob/main/src/indexer/indexer.ts)
- [src/search/hybrid-search.ts](https://github.com/sverklo/sverklo/blob/main/src/search/hybrid-search.ts)
- [src/server/tools/context.ts](https://github.com/sverklo/sverklo/blob/main/src/server/tools/context.ts)
- [src/server/tools/find-references.ts](https://github.com/sverklo/sverklo/blob/main/src/server/tools/find-references.ts)
</details>

# Indexing System

The indexing system is the core data pipeline that powers sverklo's code intelligence capabilities. It transforms source code into a queryable, multi-dimensional index that combines file metadata, code symbols, dependency graphs, semantic embeddings, and persistent memory.

## Overview

Sverklo's indexer builds a **four-layer index** that feeds all downstream tools:

| Layer | Purpose | Data Structure |
|-------|---------|----------------|
| Files | Track all indexed source files with metadata | `fileStore` |
| Code | Extract symbols, chunks, and documentation edges | `codeStore` / `docEdgeStore` |
| Graph | Build dependency relationships for PageRank | `graphStore` |
| Memory | Persist context and decisions across sessions | `memoryStore` |

Source: [src/server/tools/context.ts](https://github.com/sverklo/sverklo/blob/main/src/server/tools/context.ts)

## Architecture

```mermaid
graph TD
    A[Source Files] --> B[File Indexer]
    B --> C[fileStore]
    
    D[Code Parsing] --> E[Symbol Extractor]
    E --> F[codeStore]
    F --> G[Doc Edge Store]
    G --> H[docEdgeStore]
    
    I[File Metadata] --> J[Graph Builder]
    J --> K[graphStore]
    K --> L[PageRank Compute]
    L --> K
    
    M[Memory Store] --> N[memoryStore]
    
    C --> O[Hybrid Search]
    F --> O
    K --> O
    G --> O
```

## Indexer Interfaces

The indexer exposes a unified interface composed of four specialized interfaces, one per index layer:

```typescript
type IndexFiles = {
  fileStore: FileStore;
  getStatus(): ProjectStatus;
};

type IndexCode = {
  codeStore: CodeStore;
  docEdgeStore: DocEdgeStore;
};

type IndexGraph = {
  graphStore: GraphStore;
};

type IndexMemory = {
  memoryStore: MemoryStore;
};
```

Source: [src/server/tools/context.ts](https://github.com/sverklo/sverklo/blob/main/src/server/tools/context.ts)

## File Store

The file store maintains a registry of all indexed source files with their metadata:

```typescript
interface FileRecord {
  id: number;
  path: string;
  pagerank: number;
  // ... other metadata
}
```

The store provides:
- `getAll()` - Retrieve all indexed files
- `getById(id)` - Lookup by file ID
- Path-to-ID mapping for graph edge resolution

Source: [src/server/audit-obsidian.ts](https://github.com/sverklo/sverklo/blob/main/src/server/audit-obsidian.ts)

### Build Lookup Maps

```typescript
const idToPath = new Map<number, string>();
for (const f of files) idToPath.set(f.id, f.path);
```

Source: [src/server/audit-obsidian.ts](https://github.com/sverklo/sverklo/blob/main/src/server/audit-obsidian.ts)

## Code Store

The code store indexes code symbols and documentation references:

### Symbol Extraction

Symbols are extracted from parsed code and stored with:
- Symbol name and type (function, class, type, interface, method, variable)
- Source file location
- Chunk boundaries for code retrieval

Source: [src/server/tools/find-references.ts](https://github.com/sverklo/sverklo/blob/main/src/server/tools/find-references.ts)

### Documentation Edge Store

Documentation edges connect code symbols to their documentation:

```typescript
interface DocMention {
  doc_file_path: string;
  doc_breadcrumb?: string;
  match_kind: string;
  edge_kind: "includes" | "reference";
  confidence: number;
}
```

The store supports:
- `getBySymbol(symbol, limit)` - Find documentation mentions of a symbol
- Edge kind filtering: `"includes"` for structural inclusions, `"reference"` for associative mentions
- Deduplication by file path + breadcrumb + match kind

Source: [src/server/tools/find-references.ts](https://github.com/sverklo/sverklo/blob/main/src/server/tools/find-references.ts)
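Deduplication by file path, breadcrumb, and match kind can be sketched with a composite key. The `DocMention` shape is copied from the interface above; the function `dedupeMentions` and the choice to keep the first occurrence are assumptions.

```typescript
// Sketch: drop repeated mentions that share path + breadcrumb + match kind.
interface DocMention {
  doc_file_path: string;
  doc_breadcrumb?: string;
  match_kind: string;
  edge_kind: "includes" | "reference";
  confidence: number;
}

function dedupeMentions(mentions: DocMention[]): DocMention[] {
  const seen = new Set<string>();
  const unique: DocMention[] = [];
  for (const m of mentions) {
    // JSON-encode the tuple so separators inside values cannot collide
    const key = JSON.stringify([m.doc_file_path, m.doc_breadcrumb ?? null, m.match_kind]);
    if (!seen.has(key)) {
      seen.add(key);
      unique.push(m); // assumption: keep the first occurrence
    }
  }
  return unique;
}

const uniqueMentions = dedupeMentions([
  { doc_file_path: "docs/search.md", doc_breadcrumb: "Search > RRF", match_kind: "exact", edge_kind: "reference", confidence: 0.9 },
  { doc_file_path: "docs/search.md", doc_breadcrumb: "Search > RRF", match_kind: "exact", edge_kind: "includes", confidence: 0.7 },
  { doc_file_path: "docs/search.md", match_kind: "exact", edge_kind: "reference", confidence: 0.8 },
]);
```

The first two mentions share all three key fields and collapse into one; the third has no breadcrumb and is kept as a separate entry.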

## Graph Store

The graph store builds and maintains the dependency graph:

### Edge Structure

```typescript
interface GraphEdge {
  source_file_id: number;
  target_file_id: number;
}
```

### Import/Dependency Maps

```typescript
const imports = new Map<string, string[]>();      // file -> files it imports
const importedBy = new Map<string, string[]>();   // file -> files that import it
```

Source: [src/server/audit-obsidian.ts](https://github.com/sverklo/sverklo/blob/main/src/server/audit-obsidian.ts)
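Putting the lookup map and the two adjacency maps together, a sketch of how they could be built from file records and edges looks like this. It assumes `source_file_id` is the importing file and `target_file_id` the imported one; the function `buildImportMaps` is hypothetical.

```typescript
// Sketch: derive imports / importedBy maps from id-based graph edges.
interface FileRecord {
  id: number;
  path: string;
}

interface GraphEdge {
  source_file_id: number; // assumption: the importing file
  target_file_id: number; // assumption: the imported file
}

function buildImportMaps(files: FileRecord[], edges: GraphEdge[]) {
  const idToPath = new Map<number, string>();
  for (const f of files) idToPath.set(f.id, f.path);

  const imports = new Map<string, string[]>(); // file -> files it imports
  const importedBy = new Map<string, string[]>(); // file -> files that import it
  for (const e of edges) {
    const from = idToPath.get(e.source_file_id);
    const to = idToPath.get(e.target_file_id);
    if (from === undefined || to === undefined) continue; // skip dangling edges
    imports.set(from, (imports.get(from) ?? []).concat(to));
    importedBy.set(to, (importedBy.get(to) ?? []).concat(from));
  }
  return { imports, importedBy };
}

const maps = buildImportMaps(
  [
    { id: 1, path: "src/a.ts" },
    { id: 2, path: "src/b.ts" },
    { id: 3, path: "src/core.ts" },
  ],
  [
    { source_file_id: 1, target_file_id: 3 },
    { source_file_id: 2, target_file_id: 3 },
    { source_file_id: 1, target_file_id: 99 }, // unknown file id, ignored
  ]
);
```

The `importedBy` map is the one that answers blast-radius questions: every file listed for `src/core.ts` is affected when it changes.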

### PageRank Computation

Files receive PageRank scores based on their position in the dependency graph. High PageRank files are considered "load-bearing" modules.

Source: [src/server/tools/wakeup.ts](https://github.com/sverklo/sverklo/blob/main/src/server/tools/wakeup.ts)
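For readers unfamiliar with PageRank, the following is a generic power-iteration sketch over an import graph. It is the textbook algorithm, not sverklo's implementation: the damping factor 0.85, the iteration count, the dangling-node handling, and the name `pageRank` are all assumptions. Edges point from the importing file to the imported file, so heavily imported files accumulate rank.

```typescript
// Textbook PageRank by power iteration; nodes are 0..nodeCount-1.
function pageRank(
  nodeCount: number,
  edges: [number, number][], // [importer, imported]
  damping: number = 0.85,
  iterations: number = 50
): number[] {
  const outDegree = new Array<number>(nodeCount).fill(0);
  for (const [from] of edges) outDegree[from]++;

  let rank = new Array<number>(nodeCount).fill(1 / nodeCount);
  for (let it = 0; it < iterations; it++) {
    const next = new Array<number>(nodeCount).fill((1 - damping) / nodeCount);
    // Nodes with no outgoing edges spread their rank evenly over all nodes
    let dangling = 0;
    for (let i = 0; i < nodeCount; i++) {
      if (outDegree[i] === 0) dangling += rank[i];
    }
    for (let i = 0; i < nodeCount; i++) {
      next[i] += (damping * dangling) / nodeCount;
    }
    // Each importer passes an equal share of its rank to every file it imports
    for (const [from, to] of edges) {
      next[to] += (damping * rank[from]) / outDegree[from];
    }
    rank = next;
  }
  return rank;
}

// Files 0 and 1 both import file 2
const ranks = pageRank(3, [
  [0, 2],
  [1, 2],
]);
```

In this three-file graph the shared dependency (file 2) ends up with the highest score, matching the "load-bearing module" intuition.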

## Hybrid Search

The hybrid search combines multiple retrieval signals:

```typescript
const { hybridSearch } = require("../../search/hybrid-search.js");
```

### Search Signals

| Signal | Description |
|--------|-------------|
| BM25 | Traditional keyword matching |
| Vector/Embeddings | Semantic similarity via bi-encoder |
| PageRank | Graph-based importance |
| Symbol Match | Exact symbol references |

Source: [src/server/tools/context.ts](https://github.com/sverklo/sverklo/blob/main/src/server/tools/context.ts)

## Memory Store

The memory store provides persistent context across sessions:

### Memory Tiers

| Tier | Usage |
|------|-------|
| `core` | Project invariants, auto-injected on session start |
| `archive` | General memories and decisions, searched on demand |

### Memory Categories

- Conventions
- Architecture decisions
- Project-specific patterns

Source: [src/server/mcp-server.ts](https://github.com/sverklo/sverklo/blob/main/src/server/mcp-server.ts)

## Status Reporting

The indexer provides project status through `getStatus()`:

```typescript
const status = indexer.getStatus();
// Returns: { projectName, fileCount, languages, ... }
```

Source: [src/server/tools/wakeup.ts](https://github.com/sverklo/sverklo/blob/main/src/server/tools/wakeup.ts)

## Related Community Issues

### Index Timestamp Bug (Issue #74)

`sverklo reindex` does not update the `lastIndexed` field in `registry.json`, causing stale age displays after reindexing.

**Reproduction:**
```
sverklo register .
sverklo reindex .
sverklo list  # Still shows stale age
```

### Parser Regression (Issue #28)

A string/comment-aware brace counter fix in the parser recovered P1 from 0.30 → 0.73 on the 90-task benchmark but caused slight regressions in P2/P4 categories.

### Retrieval Architecture (Issue #29)

Community discussion on evaluating ColBERT/PLAID-style multi-vector rerankers against the current bi-encoder + BM25 + PageRank architecture.

## Configuration

The indexer supports path-based filtering:

| Option | Description |
|--------|-------------|
| `scope` | Path prefix to constrain indexing |
| `ignore` | Patterns to exclude from indexing |

## Wakeup Generation

The wakeup system generates a quick project orientation:

```typescript
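function generateWakeup(indexer, options) {
  const status = indexer.getStatus();                    // project name, file count, languages
  const coreMemories = indexer.memoryStore.getCore(10);  // up to 10 core-tier memories
  // Top 5 files; this relies on getAll() returning files ordered by PageRank
  const topFiles = indexer.fileStore.getAll().slice(0, 5);
  // ... formatting of the wakeup text omitted
}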
```

Output includes:
- Project name and file count
- Top 5 files by PageRank
- Core memories (tier='core')

Source: [src/server/tools/wakeup.ts](https://github.com/sverklo/sverklo/blob/main/src/server/tools/wakeup.ts)

---

<a id='search-tools'></a>

## Search Tools Reference

### Related Pages

Related topics: [Impact and Reference Tools](#impact-tools), [Search and Retrieval System](#search-system)

<details>
<summary>Related Source Files</summary>

The following source files were used to generate this page:

- [src/server/tools/context.ts](https://github.com/sverklo/sverklo/blob/main/src/server/tools/context.ts)
- [src/server/prompts.ts](https://github.com/sverklo/sverklo/blob/main/src/server/prompts.ts)
- [src/server/mcp-server.ts](https://github.com/sverklo/sverklo/blob/main/src/server/mcp-server.ts)
- [src/server/tool-overrides.ts](https://github.com/sverklo/sverklo/blob/main/src/server/tool-overrides.ts)
- [src/server/tools/critique.ts](https://github.com/sverklo/sverklo/blob/main/src/server/tools/critique.ts)
- [src/server/tools/find-references.ts](https://github.com/sverklo/sverklo/blob/main/src/server/tools/find-references.ts)
- [src/server/tools/review-format.ts](https://github.com/sverklo/sverklo/blob/main/src/server/tools/review-format.ts)
- [src/server/tools/wakeup.ts](https://github.com/sverklo/sverklo/blob/main/src/server/tools/wakeup.ts)
</details>

# Search Tools Reference

Sverklo provides a layered suite of search and investigation tools designed to help coding agents navigate, understand, and reason about codebases. The tools span from fast keyword lookups to deep semantic analysis, with reciprocal rank fusion combining multiple retrieval signals for high-quality results.

## Architecture Overview

Sverklo's search system uses a hybrid retrieval architecture that combines three signals:

| Signal | Mechanism | Purpose |
|--------|-----------|---------|
| **BM25** | Sparse keyword matching | Exact term matches and code identifiers |
| **Bi-encoder embeddings** | Dense vector similarity | Semantic/code-similarity search |
| **PageRank** | Graph-based importance | Prioritize high-centrality files |

These signals are merged using reciprocal rank fusion (RRF) to produce ranked results that balance precision with recall. Source: `package.json` (keywords: `"reciprocal-rank-fusion"`, `"bm25"`, `"pagerank"`)

```mermaid
graph TD
    subgraph "Retrieval Layer"
        BM25[BM25 Keyword Search]
        EMB[Bi-encoder Embeddings]
        PGR[PageRank Scorer]
    end
    
    subgraph "Fusion"
        RRF[Reciprocal Rank Fusion]
    end
    
    subgraph "Post-Processing"
        VERIFY[Verify Results]
        REFINE[Refine & Deduplicate]
    end
    
    BM25 --> RRF
    EMB --> RRF
    PGR --> RRF
    RRF --> VERIFY
    VERIFY --> REFINE
    REFINE --> RESULTS[Final Results]
```

## Tool Categories

Sverklo organizes search tools into four functional categories, each targeting a specific stage of code investigation.

### Discovery Tools

These tools help locate code and understand what's in the codebase.

| Tool | Purpose |
|------|---------|
| `search` | Primary semantic search across code chunks |
| `search_iterative` | Multi-turn search with result refinement |
| `investigate` | Fan-out search across FTS, embeddings, symbols, and refs simultaneously |
| `grep` | Exact string/pattern matching |
| `head` | View beginning of files |

Source: `tool-overrides.ts` (tool lists in `research` and `lean` profiles)

### Context Tools

These tools bundle information for rapid orientation.

| Tool | Purpose |
|------|---------|
| `context` | Umbrella tool returning codebase overview + search + symbols + memories in one call |
| `overview` | Structural summary with file/chunk counts and PageRank rankings |
| `ask` | Free-form question answering over indexed content |
| `wakeup` | Quick orientation summary for new sessions |

The `context` tool is designed as the "first call" when starting work on a new task, returning a curated bundle in a single round trip:

> Give a task description and get a single curated bundle: codebase overview header, semantically relevant code, related symbols, and matching saved memories — in one round trip.

Source: `src/server/tools/context.ts` (tool description)

### Navigation Tools

These tools help traverse code structure and relationships.

| Tool | Purpose |
|------|---------|
| `lookup` | Retrieve full code chunks by path |
| `refs` | Find references to symbols, functions, and variables |
| `deps` | Explore dependency relationships |
| `clusters` | Group similar code patterns |
| `patterns` | Identify recurring code idioms |
| `concepts` | Extract and map high-level concepts |

### Verification Tools

These tools validate findings and assess code quality.

| Tool | Purpose |
|------|---------|
| `verify` | Check if cited evidence actually supports claims |
| `critique` | Multi-dimensional analysis of response quality |
| `review_diff` | Risk-scored PR review with heuristic finding detection |
| `audit` | Codebase health scoring |

Source: `tool-overrides.ts` (verification tools in `research` and `review` profiles)

## Core Search Tool: `search`

The primary search tool uses hybrid retrieval to find semantically relevant code chunks.

```typescript
const searchTool = {
  name: "search",
  description: "Semantic code search using FTS + embeddings + PageRank"
};
```

Results include `found_by` tags that indicate which retrievers agreed on each result — results marked by multiple signals are higher-signal than single-source hits.

Source: `prompts.ts` (investigate prompt instructs: "Read the `found_by` tags — results agreed on by multiple retrievers are higher-signal than single-source hits")
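
One way an agent might act on the `found_by` tags is to sort hits by how many retrievers agreed on them. The sketch below is illustrative only: `found_by` is the only field named in the source, while the `SearchHit` shape, the tag values, and `rankByAgreement` are assumptions.

```typescript
// Hypothetical result shape: only `found_by` comes from the source.
interface SearchHit {
  path: string;
  score: number;
  found_by: string[]; // assumed tag values, e.g. ["fts", "vector"]
}

// Put hits agreed on by more retrievers first; break ties by score.
function rankByAgreement(hits: SearchHit[]): SearchHit[] {
  return hits
    .slice()
    .sort((a, b) => b.found_by.length - a.found_by.length || b.score - a.score);
}

const ranked = rankByAgreement([
  { path: "src/a.ts", score: 0.9, found_by: ["vector"] },
  { path: "src/b.ts", score: 0.7, found_by: ["fts", "vector"] },
  { path: "src/c.ts", score: 0.8, found_by: ["fts", "vector"] },
]);
console.log(ranked.map((hit) => hit.path)); // ["src/c.ts", "src/b.ts", "src/a.ts"]
```

The single-source hit sorts last even though it has the highest raw score, which is the behavior the prompt guidance suggests.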

## Iterative Search: `search_iterative`

For complex queries requiring refinement, the iterative search tool supports multi-turn exploration where results from one search inform subsequent queries.

Source: `tool-overrides.ts` (included in `research` profile)

## Investigation Tool: `investigate`

The `investigate` tool performs a single-pass fan-out across all retrieval signals simultaneously:

1. Full-text search (BM25)
2. Vector embeddings (semantic similarity)
3. Symbol index (function/class names)
4. Reference graph (who calls whom)

This is the recommended starting point for feature mapping and deep code exploration.

Source: `prompts.ts` (map-feature prompt)
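
The merge step of such a fan-out can be sketched as follows. This is a minimal illustration, assuming each of the four signals has already returned a list of chunk identifiers. The `Signal` names, `MergedHit` shape, and `mergeSignals` function are hypothetical and do not reflect sverklo's internal API.

```typescript
// Signal names mirror the list above; everything else is an assumption.
type Signal = "fts" | "vector" | "symbol" | "refs";

interface MergedHit {
  chunkId: string;
  found_by: Signal[];
}

// Merge per-signal hit lists into one list keyed by chunk, recording
// which signals surfaced each chunk.
function mergeSignals(bySignal: Record<Signal, string[]>): MergedHit[] {
  const merged = new Map<string, MergedHit>();
  (Object.keys(bySignal) as Signal[]).forEach((signal) => {
    bySignal[signal].forEach((chunkId) => {
      const hit: MergedHit = merged.get(chunkId) ?? { chunkId, found_by: [] };
      if (hit.found_by.indexOf(signal) === -1) hit.found_by.push(signal);
      merged.set(chunkId, hit);
    });
  });
  // Chunks surfaced by more signals sort first.
  return Array.from(merged.values()).sort(
    (a, b) => b.found_by.length - a.found_by.length
  );
}

const mergedHits = mergeSignals({
  fts: ["chunk-a", "chunk-b"],
  vector: ["chunk-b"],
  symbol: ["chunk-b", "chunk-c"],
  refs: [],
});
console.log(mergedHits[0]); // chunk-b, found by fts, vector, and symbol
```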

## Context Bundler: `context`

The `context` tool is an umbrella that combines multiple searches into a single curated response.

### Parameters

| Parameter | Type | Description |
|-----------|------|-------------|
| `task` | string | Free-form task description |
| `detail_level` | "minimal" \| "normal" \| "full" | Amount of detail to return |
| `scope` | string | Optional path prefix constraint |
| `budget` | number | PageRank-pruned token budget |

### Detail Levels

| Level | Contents |
|-------|----------|
| `minimal` | Overview header + top 3 search hits + top 2 memories |
| `normal` | Overview header + top 5 search hits + top 5 memories + symbol table |
| `full` | Normal + dependency neighbors of top results |

When `budget` is set, the tool returns a PageRank-pruned repo map that is filled greedily up to the token limit.

Source: `src/server/tools/context.ts` (input schema and implementation)
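
The detail levels and the greedy budget fill can be sketched as below. The limits are copied from the Detail Levels table. The object shape, the `RankedFile` record, and the rule of skipping a file that does not fit and moving on to the next one are assumptions, since the source only says the map is "greedily filled".

```typescript
// Limits taken from the Detail Levels table; the object shape is an assumption.
const DETAIL_LIMITS = {
  minimal: { searchHits: 3, memories: 2, symbolTable: false, neighbors: false },
  normal: { searchHits: 5, memories: 5, symbolTable: true, neighbors: false },
  full: { searchHits: 5, memories: 5, symbolTable: true, neighbors: true },
};

// Hypothetical per-file record: PageRank score plus an estimated token cost.
interface RankedFile {
  path: string;
  pagerank: number;
  tokens: number;
}

// Greedy fill: visit files from highest to lowest PageRank and keep each
// one that still fits in the remaining budget.
function fillBudget(files: RankedFile[], budget: number): string[] {
  const picked: string[] = [];
  let used = 0;
  const byRank = files.slice().sort((a, b) => b.pagerank - a.pagerank);
  for (const file of byRank) {
    if (used + file.tokens <= budget) {
      picked.push(file.path);
      used += file.tokens;
    }
  }
  return picked;
}

const repoMap = fillBudget(
  [
    { path: "a.ts", pagerank: 0.5, tokens: 600 },
    { path: "b.ts", pagerank: 0.3, tokens: 500 },
    { path: "c.ts", pagerank: 0.2, tokens: 300 },
  ],
  1000
);
console.log(DETAIL_LIMITS.minimal.searchHits, repoMap); // 3 ["a.ts", "c.ts"]
```

Here `b.ts` is skipped because it would push the total to 1100 tokens, while the lower-ranked `c.ts` still fits.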

## Finding References: `refs`

The `refs` tool finds all references to a symbol, including:

- Call sites
- Type definitions
- Documentation mentions
- Import/export relationships

A significant enhancement in recent versions separates structural inclusions from associative references:

> Sprint 9: split structural inclusions from associative references so callers see "this is where the symbol is documented" separately from "see also" mentions.

Results include deduplication to avoid emitting near-identical lines when both an outer fenced chunk and its inner content resolve to the same symbol.

Source: `src/server/tools/find-references.ts` (deduplication logic and comment)
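
A deduplication pass of this kind might look like the sketch below. The `RefRow` shape and the choice of key (file, line, and whitespace-normalized text) are assumptions. The actual logic in `find-references.ts` may compare rows differently.

```typescript
// Hypothetical reference row; the real row type is not shown in the source.
interface RefRow {
  file: string;
  line: number;
  text: string;
}

// Drop rows whose file, line, and whitespace-normalized text match a row
// that was already emitted, keeping the first occurrence.
function dedupeRefRows(rows: RefRow[]): RefRow[] {
  const seen = new Set<string>();
  return rows.filter((row) => {
    const normalized = row.text.replace(/\s+/g, " ").trim();
    const key = `${row.file}:${row.line}:${normalized}`;
    if (seen.has(key)) return false;
    seen.add(key);
    return true;
  });
}

const uniqueRows = dedupeRefRows([
  { file: "docs/api.md", line: 12, text: "Call parseConfig() first" },
  { file: "docs/api.md", line: 12, text: "  Call parseConfig()   first " },
  { file: "README.md", line: 3, text: "parseConfig" },
]);
console.log(uniqueRows.length); // 2
```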

### Documentation Detection

The `refs` tool detects when markdown or README files reference a symbol by backtick or fenced code:

> If any markdown / README / ADR chunks reference this symbol by backtick or fenced code, surface them so the agent sees both the code and its documentation together.

This helps agents identify both the implementation and its documentation in one view.

Source: `src/server/tools/find-references.ts` (doc mention detection)

## Verification: `verify` and `critique`

These tools validate that search results actually support agent claims.

### `verify`

Checks that each piece of cited evidence points to a real file location and that the content at that location matches the claim.

### `critique`

Performs multi-dimensional analysis including:

- Claim verification against cited evidence
- Stale memory detection
- Moved symbol tracking
- Hub file citation analysis
- Undefined symbol detection
- Undocumented symbol identification

```typescript
interface CritiqueData {
  claim: string | null;
  verify: VerifyResult[];
  stale: VerifyResult[];
  moved: VerifyResult[];
  hubsCited: string[];
  missedHubs: string[];
  undefinedSymbols: string[];
  undocumentedSymbols: string[];
  totalSymbols: number;
}
```

The `critique` tool specifically checks if documentation files (`.md`, `.markdown`, `.mdx`) cite the symbols — if not, the symbol is flagged as undocumented.

Source: `src/server/tools/critique.ts` (formatCritique function and verification logic)
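
The documentation check described above can be illustrated with a small sketch. The three extensions come from the text. The `citations` input (symbol name mapped to the files that mention it) and both function names are assumptions made for the example.

```typescript
// Extensions listed in the text above.
const DOC_EXTENSIONS = [".md", ".markdown", ".mdx"];

function isDocFile(path: string): boolean {
  const lower = path.toLowerCase();
  return DOC_EXTENSIONS.some((ext) => lower.endsWith(ext));
}

// `citations` maps a symbol to the files that mention it (hypothetical
// input shape). A symbol with no documentation file among its mentions
// is reported as undocumented.
function findUndocumented(citations: Record<string, string[]>): string[] {
  return Object.keys(citations).filter(
    (symbol) => !citations[symbol].some((file) => isDocFile(file))
  );
}

const undocumented = findUndocumented({
  parseConfig: ["src/config.ts", "README.md"],
  runIndexer: ["src/indexer.ts"],
});
console.log(undocumented); // ["runIndexer"]
```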

## Tool Profiles

Sverklo organizes tools into named profiles for different workflows:

| Profile | Tools Included | Use Case |
|---------|----------------|----------|
| `full` | All tools | Complete access |
| `minimal` | search, lookup, overview, refs, impact | Quick lookups |
| `lean` | search, lookup, overview, refs, impact, deps, context, status, remember, recall, review_diff | Development workflow |
| `research` | search, search_iterative, investigate, ask, lookup, overview, refs, impact, deps, concepts, patterns, clusters, verify, critique, ctx_slice, ctx_grep, ctx_stats, status | Code investigation |
| `review` | review_diff, diff_search, test_map, impact, refs, lookup, search, investigate, verify, status | PR/MR review |

Source: `tool-overrides.ts` (PROFILES constant)

## Environment Configuration

### Tool Descriptions

Override tool descriptions via environment variables:

```bash
SVERKLO_TOOL_<NAME>_DESCRIPTION="custom description"
```

For example:
```bash
SVERKLO_TOOL_SEARCH_DESCRIPTION="Custom search override"
```

### Tool Profiles

Select which tools are available via `SVERKLO_PROFILE`:

```bash
SVERKLO_PROFILE=research  # Enable investigation tools
SVERKLO_PROFILE=review    # Enable PR review tools
```

### Tool Disabling

Disable specific tools via `SVERKLO_DISABLED_TOOLS`:

```bash
SVERKLO_DISABLED_TOOLS=search,investigate
```

Source: `tool-overrides.ts` (environment variable handling)
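
How the three variables could combine is sketched below. This is an illustration under stated assumptions, not the logic in `tool-overrides.ts`: the fallback to `minimal` for an unset or unknown profile, the comma-splitting and trimming, and the upper-casing of tool names are all guesses. Only two profiles from the table above are reproduced.

```typescript
type Env = Record<string, string | undefined>;

// Two profiles copied from the Tool Profiles table; the rest are omitted.
const PROFILES: Record<string, string[]> = {
  minimal: ["search", "lookup", "overview", "refs", "impact"],
  review: [
    "review_diff", "diff_search", "test_map", "impact", "refs",
    "lookup", "search", "investigate", "verify", "status",
  ],
};

interface ResolvedTool {
  name: string;
  description: string | undefined; // undefined means "use the built-in text"
}

function resolveTools(env: Env): ResolvedTool[] {
  // Assumption: an unset or unknown profile falls back to `minimal`.
  const names = PROFILES[env.SVERKLO_PROFILE ?? "minimal"] ?? PROFILES.minimal;
  const disabled = (env.SVERKLO_DISABLED_TOOLS ?? "")
    .split(",")
    .map((name) => name.trim())
    .filter((name) => name.length > 0);
  return names
    .filter((name) => disabled.indexOf(name) === -1)
    .map((name) => ({
      name,
      description: env[`SVERKLO_TOOL_${name.toUpperCase()}_DESCRIPTION`],
    }));
}

const tools = resolveTools({
  SVERKLO_PROFILE: "minimal",
  SVERKLO_DISABLED_TOOLS: "search, impact",
  SVERKLO_TOOL_REFS_DESCRIPTION: "Custom refs description",
});
console.log(tools.map((tool) => tool.name)); // ["lookup", "overview", "refs"]
```

In a Node.js process the function could be called as `resolveTools(process.env)`.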

## Prompt Templates

Sverklo includes prompt templates that encode the recommended order of tool calls for common tasks.

### Map Feature Prompt

Maps a feature across the codebase using a recommended workflow:

1. `investigate` — single-pass fan-out over all retrieval signals
2. `refs` — expand surface area for top symbols
3. `impact` — assess blast radius before proposing changes
4. `overview` — structural summary for architectural context
5. `search` — keyword/diff-aware search for recent changes

### Architecture Map Prompt

Generates an architecture map using:

1. `overview` — structural summary with PageRank
2. `deps` — dependency analysis for top files
3. `recall` — any saved design decisions
4. `search` — find entry points

Source: `src/server/prompts.ts` (prompt definitions)

## GitHub Action Integration

The search tools integrate with GitHub Actions for automated PR review:

```yaml
uses: sverklo/sverklo/action@main
with:
  github-token: ${{ secrets.GITHUB_TOKEN }}
  fail-on: high
  max-files: 25
```

The action uses heuristic finding detection to identify risky code patterns and posts results as GitHub PR review comments with inline annotations.

Source: `action/README.md` (usage documentation)

## Community Considerations

### Retrieval Architecture Evaluation

The community has discussed evaluating alternative retrieval architectures:

> LinkedIn discussion... raised two concrete pushes against sverklo's current retrieval architecture, both worth taking seriously...

Specifically, [issue #29](https://github.com/sverklo/sverklo/issues/29) discusses evaluating ColBERT/PLAID-style multi-vector rerankers against the current bi-encoder + BM25 + PageRank approach.

### Regressions After Parser Fix

[Benchmark issue #28](https://github.com/sverklo/sverklo/issues/28) documents a regression in P2/P4 performance after a parser brace-counter fix. This affected search quality for certain code patterns, highlighting the importance of the indexer's parsing quality for retrieval accuracy.

---

<a id='impact-tools'></a>

## Impact and Reference Tools

### Related Pages

Related topics: [Search Tools Reference](#search-tools), [Indexing System](#indexing)

<details>
<summary>Related Source Files</summary>

The following source files were used to generate this page:

- [src/server/mcp-server.ts](https://github.com/sverklo/sverklo/blob/main/src/server/mcp-server.ts)
- [src/server/tools/find-references.ts](https://github.com/sverklo/sverklo/blob/main/src/server/tools/find-references.ts)
- [src/server/tools/diff-search.ts](https://github.com/sverklo/sverklo/blob/main/src/server/tools/diff-search.ts)
- [src/server/tools/critique.ts](https://github.com/sverklo/sverklo/blob/main/src/server/tools/critique.ts)
- [src/server/tools/context.ts](https://github.com/sverklo/sverklo/blob/main/src/server/tools/context.ts)
- [src/server/tool-overrides.ts](https://github.com/sverklo/sverklo/blob/main/src/server/tool-overrides.ts)
- [src/server/prompts.ts](https://github.com/sverklo/sverklo/blob/main/src/server/prompts.ts)
</details>

# Impact and Reference Tools

The Impact and Reference Tools form the graph-analysis layer of sverklo's code intelligence system. These tools enable agents to understand how code entities relate to each other, assess the blast radius of proposed changes, and verify that code modifications haven't introduced broken references or undocumented dependencies.

## Overview

Sverklo maintains a dependency graph built from import/export relationships parsed during indexing. The Impact and Reference Tools query this graph to answer two fundamental questions:

1. **What depends on this code?** — Reference analysis traces consumers and dependencies
2. **What would break if this changed?** — Impact analysis calculates blast radius and risk scores

`refs` and `impact` are available in every tool profile (core, nav, lean, research, review), while `deps` and the supporting tools are limited to some of them (see the Tool Profiles table in this section). All are considered essential for safe refactoring and architectural decisions. Source: [src/server/tool-overrides.ts:1-51](src/server/tool-overrides.ts)

## Tool Inventory

### Core Graph Tools

| Tool | Purpose | Primary Use Case |
|------|---------|------------------|
| `refs` | Find all references to a symbol | Understanding usage patterns before refactoring |
| `impact` | Calculate blast radius of changes | Risk assessment for proposed modifications |
| `deps` | Show dependency graph for a file/symbol | Understanding architectural layers |

### Supporting Tools

| Tool | Purpose | Integration |
|------|---------|-------------|
| `investigate` | Fan-out search across FTS, vectors, symbols, refs | Research workflow entry point |
| `verify` | Cross-reference claims against codebase | Code review and documentation validation |
| `critique` | Structured verification of code claims | PR review and architectural review |

Source: [src/server/mcp-server.ts](src/server/mcp-server.ts)

## Reference Analysis

### Symbol Reference Finding

The `refs` tool traces both direct code references and documentation mentions of symbols. It builds bidirectional maps from the graph store to answer "who imports this?" and "what does this import?" queries.

```mermaid
graph LR
    A[Symbol Query] --> B[Graph Store]
    B --> C[Import Map<br/>file → files it imports]
    B --> D[Imported-By Map<br/>file → files importing it]
    C --> E[Direct Dependencies]
    D --> F[Direct Consumers]
    E --> G[Transitive Dependencies<br/>N hops]
    F --> H[Transitive Consumers<br/>N hops]
```

The tool separates structural inclusions from associative references, surfacing "this is where the symbol is documented" separately from "see also" mentions. Source: [src/server/tools/find-references.ts:1-40](src/server/tools/find-references.ts)
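
The two maps in the diagram can be built from a flat list of import edges, as in the sketch below. The edge format `[importer, imported]`, the file names, and all function names are assumptions for illustration and are not sverklo's graph store API.

```typescript
// Each edge is [importer, imported].
type ImportEdge = [string, string];

interface ImportMaps {
  imports: Map<string, Set<string>>;    // file -> files it imports
  importedBy: Map<string, Set<string>>; // file -> files importing it
}

function addEdge(map: Map<string, Set<string>>, from: string, to: string): void {
  const targets = map.get(from) ?? new Set<string>();
  targets.add(to);
  map.set(from, targets);
}

// Record every edge in both directions so either question is one lookup.
function buildImportMaps(edges: ImportEdge[]): ImportMaps {
  const maps: ImportMaps = { imports: new Map(), importedBy: new Map() };
  edges.forEach(([importer, imported]) => {
    addEdge(maps.imports, importer, imported);
    addEdge(maps.importedBy, imported, importer);
  });
  return maps;
}

function neighbors(map: Map<string, Set<string>>, file: string): string[] {
  const found = map.get(file);
  return found ? Array.from(found) : [];
}

const importMaps = buildImportMaps([
  ["src/cli.ts", "src/indexer.ts"],
  ["src/server.ts", "src/indexer.ts"],
  ["src/indexer.ts", "src/parser.ts"],
]);
// Direct consumers: "who imports this?"
console.log(neighbors(importMaps.importedBy, "src/indexer.ts"));
// Direct dependencies: "what does this import?"
console.log(neighbors(importMaps.imports, "src/indexer.ts"));
```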

### Documentation Citation Tracking

The `refs` tool detects when documentation (`.md`, `.markdown`, `.mdx` files) references a symbol. This helps agents identify:

- Architecture Decision Records (ADRs) that document the symbol's design
- Usage examples in README files
- Related documentation that should be updated alongside code changes

Reference rows are deduplicated to prevent emitting near-identical lines when both an outer fenced chunk and inner fence resolve to the same symbol. Source: [src/server/tools/find-references.ts:25-38](src/server/tools/find-references.ts)

### Verify Results Format

Reference findings are returned with file path, line information, and match kind:

```typescript
interface VerifyResult {
  file?: string;           // Repo-relative path
  line?: number;           // 1-based line number
  match_kind: string;      // 'import' | 'call' | 'type_ref' | 'doc_mention'
  confidence?: number;     // 0-1 for heuristic matches
}
```

## Impact Analysis

### Blast Radius Calculation

The `impact` tool calculates how many and which files would be affected by changes to a given symbol. It uses PageRank scores to prioritize the most important affected files.

```mermaid
graph TD
    A[Changed Symbol] --> B[Direct Consumers]
    B --> C[Test Files]
    B --> D[Direct Importers]
    C --> E[High Risk<br/>No Alternative Path]
    D --> F[Transitive Consumers]
    F --> G[Indirect Dependencies]
    G --> H[Risk Score by<br/>PageRank Weight]
```

### Risk Scoring Factors

Impact analysis considers multiple factors:

| Factor | Weight | Description |
|--------|--------|-------------|
| PageRank Score | High | Files with higher centrality are riskier |
| Test Coverage | Medium | Files with tests are safer to change |
| Fan-out Count | Medium | Files whose changes fan out to many other files affect more of the codebase |
| Circular Dependencies | High | Changes in cycles affect all members |

Source: [src/server/mcp-server.ts:40-70](src/server/mcp-server.ts)

### Partition Plans

For large blast radii, the `impact` tool returns partition plans that break the change into buckets. Agents should pick one bucket and drill in rather than attempting to read the full list. This is especially important for monorepos with hundreds of affected files. Source: [src/server/prompts.ts:20-35](src/server/prompts.ts)
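
The source does not say how the buckets are formed. As one plausible illustration, the sketch below groups affected files by their leading directory segments. The grouping rule, the `depth` parameter, and the function name are assumptions.

```typescript
// Group affected files by their first `depth` directory segments.
// Files at the repository root go into the "." bucket.
function partitionByDirectory(
  files: string[],
  depth: number = 2
): Record<string, string[]> {
  const buckets: Record<string, string[]> = {};
  files.forEach((file) => {
    const parts = file.split("/");
    const dirDepth = Math.min(depth, parts.length - 1);
    const key = parts.slice(0, dirDepth).join("/") || ".";
    if (!buckets[key]) buckets[key] = [];
    buckets[key].push(file);
  });
  return buckets;
}

const plan = partitionByDirectory([
  "src/server/mcp-server.ts",
  "src/server/tools/context.ts",
  "src/indexer/parser.ts",
  "README.md",
]);
console.log(Object.keys(plan)); // ["src/server", "src/indexer", "."]
```

An agent following the guidance above would then pick one key, for example `src/server`, and read only that bucket.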

## Diff-Aware Analysis

### Diff Search

The `diff_search` tool combines semantic search with git diff awareness. It searches only the files changed between two refs, which is useful for understanding what changed in a PR:

```mermaid
graph LR
    A[Query + Ref Range] --> B[git diff]
    B --> C[Changed Paths]
    C --> D[Filtered Search<br/>Only Changed Files]
    D --> E[Impact on Changed Code]
```

**Parameters:**

| Parameter | Type | Default | Description |
|-----------|------|---------|-------------|
| `query` | string | required | Search query |
| `ref` | string | `main..HEAD` | Git ref range |
| `include_callers` | number | 0 | Include N-hop transitive callers |
| `token_budget` | number | 3000 | Max tokens to return |
| `type` | enum | `any` | Filter by symbol type |

Source: [src/server/tools/diff-search.ts:1-80](src/server/tools/diff-search.ts)

### Include Callers Mode

When `include_callers` is set, the tool includes files that import the changed files. This answers "what uses these changed files?" rather than just "what changed?":

- `0`: only the changed files
- `1`: changed files plus their direct callers (one hop)
- `2`: changed files plus callers up to two hops away (direct callers and the files that import them)

Source: [src/server/tools/diff-search.ts:20-30](src/server/tools/diff-search.ts)
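
The parameter defaults and the caller expansion can be sketched together. The defaults are taken from the parameter table. The `importedBy` input, the hop-by-hop expansion, and the function names are assumptions about how `include_callers` might widen the search scope.

```typescript
interface DiffSearchParams {
  query: string;
  ref?: string;
  include_callers?: number;
  token_budget?: number;
  type?: string;
}

// Fill in the defaults listed in the parameter table.
function withDefaults(params: DiffSearchParams): Required<DiffSearchParams> {
  return {
    query: params.query,
    ref: params.ref ?? "main..HEAD",
    include_callers: params.include_callers ?? 0,
    token_budget: params.token_budget ?? 3000,
    type: params.type ?? "any",
  };
}

// Widen the set of changed files by `hops` levels of importers.
// `importedBy` maps a file to the files that import it (hypothetical input).
function searchScope(
  changed: string[],
  importedBy: Record<string, string[]>,
  hops: number
): string[] {
  const scope = new Set<string>(changed);
  let frontier = changed;
  for (let hop = 0; hop < hops && frontier.length > 0; hop++) {
    const next: string[] = [];
    frontier.forEach((file) => {
      (importedBy[file] ?? []).forEach((caller) => {
        if (!scope.has(caller)) {
          scope.add(caller);
          next.push(caller);
        }
      });
    });
    frontier = next;
  }
  return Array.from(scope);
}

const callers = { "src/a.ts": ["src/b.ts"], "src/b.ts": ["src/c.ts"] };
const resolvedParams = withDefaults({ query: "auth", include_callers: 1 });
console.log(searchScope(["src/a.ts"], callers, resolvedParams.include_callers));
// ["src/a.ts", "src/b.ts"]
```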

## Critique and Verification

### Claim Verification

The `critique` tool validates code claims by cross-referencing them against the indexed codebase. It checks:

1. **Stale references** — Symbols that no longer exist or have moved
2. **Undefined symbols** — References to unindexed or non-existent code
3. **Documentation coverage** — Whether cited symbols have documentation mentions
4. **Hub citation** — Whether high-centrality files are referenced

```mermaid
graph TD
    A[Code Claim] --> B[Verify Against Index]
    B --> C{Still Exists?}
    C -->|No| D[Moved Symbol]
    C -->|No| E[Undefined Symbol]
    C -->|Yes| F{Has Docs?}
    F -->|No| G[Undocumented Symbol]
    F -->|Yes| H[Verified Claim]
    D --> I[Critique Report]
    E --> I
    G --> I
```

The tool detects when none of the cited evidence points at documentation files, suggesting the answer skipped documentation. Source: [src/server/tools/critique.ts:1-50](src/server/tools/critique.ts)

### Critique Data Structure

```typescript
interface CritiqueData {
  claim: string | null;
  verify: VerifyResult[];
  stale: VerifyResult[];
  moved: VerifyResult[];
  hubsCited: string[];
  missedHubs: string[];
  undefinedSymbols: string[];
  undocumentedSymbols: string[];
  totalSymbols: number;
}
```

## Tool Profiles

Tool availability varies by profile. Only `refs` and `impact` are available in every profile:

| Tool | core | nav | lean | research | review |
|------|------|-----|------|----------|--------|
| `refs` | ✓ | ✓ | ✓ | ✓ | ✓ |
| `impact` | ✓ | ✓ | ✓ | ✓ | ✓ |
| `deps` | — | ✓ | ✓ | ✓ | ✓ |
| `investigate` | — | — | — | ✓ | ✓ |
| `verify` | — | — | — | ✓ | ✓ |
| `critique` | — | — | — | ✓ | ✓ |

Source: [src/server/tool-overrides.ts:10-50](src/server/tool-overrides.ts)

## Integration with Memory

The graph tools integrate with sverklo's memory layer:

- **Core memories** (tier='core') are project invariants auto-injected on every session start
- Impact analysis can be saved as memories using `remember`
- Reference findings can be recalled in future sessions

This enables agents to build institutional knowledge about risky changes and their outcomes. Source: [src/server/mcp-server.ts:75-100](src/server/mcp-server.ts)

## Usage Patterns

### Safe Refactoring Workflow

1. **Identify the symbol** to refactor
2. **Call `refs`** to understand all consumers
3. **Call `impact`** to assess blast radius
4. **Review partition plans** if blast radius is large
5. **Save decisions** with `remember` for future reference
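
The workflow above can be sketched as a sequence of tool calls. The `callTool` function stands in for whatever MCP client the agent uses. Its signature and every argument name (`symbol`, `content`) are assumptions, since this page does not document the tools' input schemas.

```typescript
// Stand-in for an MCP client call; signature and argument names are hypothetical.
type CallTool = (name: string, args: Record<string, unknown>) => Promise<string>;

async function safeRefactorSketch(
  callTool: CallTool,
  symbol: string
): Promise<string[]> {
  const outputs: string[] = [];
  // Step 2: list every consumer of the symbol.
  outputs.push(await callTool("refs", { symbol }));
  // Step 3: assess the blast radius.
  outputs.push(await callTool("impact", { symbol }));
  // Step 4 (reviewing partition plans) is a judgment call left to the agent.
  // Step 5: persist the decision for later sessions.
  outputs.push(
    await callTool("remember", {
      content: `Reviewed refs and impact for ${symbol} before refactoring`,
    })
  );
  return outputs;
}

// Usage with a fake client that only echoes the tool name.
const echoTool: CallTool = async (name) => `${name}: ok`;
safeRefactorSketch(echoTool, "parseConfig").then((outputs) => console.log(outputs));
```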

### PR Review Workflow

1. **Call `investigate`** to understand changed components
2. **Call `diff_search`** on the PR branch
3. **Call `verify`** to check for broken references
4. **Call `critique`** to validate architectural claims

Source: [src/server/prompts.ts:1-30](src/server/prompts.ts)

## Performance Considerations

- Reference lookups use in-memory graph stores for sub-millisecond response
- PageRank scores are precomputed during indexing
- Transitive dependency traversal is bounded by default to prevent runaway queries
- Token budgets prevent oversized responses in large codebases

## Related Documentation

- [Audit Tools](audit-tools.md): Codebase health scoring and architecture analysis
- [Search Tools](#search-tools): Semantic and full-text search
- [Memory Tools](#memory-layer): Persistent context across sessions
- [Git Integration](git-integration.md): Diff-aware analysis and ref validation

---


## Pitfall Log

Project: sverklo/sverklo

Summary: Found 19 structured pitfall items, including 1 high-severity (blocking) item. Top priority: Configuration risk (requires verification).

## 1. Configuration risk - Configuration risk requires verification

- Severity: high
- Evidence strength: source_linked
- Finding: Project evidence flags a configuration risk. Review the linked source before relying on this workflow.
- User impact: May increase setup, validation, or first-run risk for the user.
- Suggested check: Reproduce the official install and quickstart path in an isolated environment.
- Evidence: packet_text.keyword_scan | github_repo:1203034717 | https://github.com/sverklo/sverklo

## 2. Installation risk - Installation risk requires verification

- Severity: medium
- Evidence strength: source_linked
- Finding: Project evidence flags an installation risk. Review the linked source before relying on this workflow.
- User impact: May increase setup, validation, or first-run risk for the user.
- Suggested check: Reproduce the official install and quickstart path in an isolated environment.
- Evidence: community_evidence:github | cevd_7a50e3a046d2438db185ba21d580ec9e | https://github.com/sverklo/sverklo/issues/71

## 3. Installation risk - Installation risk requires verification

- Severity: medium
- Evidence strength: source_linked
- Finding: Project evidence flags an installation risk. Review the linked source before relying on this workflow.
- User impact: May increase setup, validation, or first-run risk for the user.
- Suggested check: Reproduce the official install and quickstart path in an isolated environment.
- Evidence: community_evidence:github | cevd_13e1bc9ab7fa41a0866eb6c4f814875c | https://github.com/sverklo/sverklo/issues/60

## 4. Installation risk - Installation risk requires verification

- Severity: medium
- Evidence strength: source_linked
- Finding: Project evidence flags an installation risk. Review the linked source before relying on this workflow.
- User impact: May increase setup, validation, or first-run risk for the user.
- Suggested check: Reproduce the official install and quickstart path in an isolated environment.
- Evidence: community_evidence:github | cevd_a8bdc3779b264243b8362d6e57096e25 | https://github.com/sverklo/sverklo/issues/61

## 5. Installation risk - Installation risk requires verification

- Severity: medium
- Evidence strength: source_linked
- Finding: Project evidence flags an installation risk. Review the linked source before relying on this workflow.
- User impact: May increase setup, validation, or first-run risk for the user.
- Suggested check: Reproduce the official install and quickstart path in an isolated environment.
- Evidence: community_evidence:github | cevd_c0c1f6a71a764af596178de506d0b2c3 | https://github.com/sverklo/sverklo/issues/58

## 6. Installation risk - Installation risk requires verification

- Severity: medium
- Evidence strength: source_linked
- Finding: Project evidence flags an installation risk. Review the linked source before relying on this workflow.
- User impact: May increase setup, validation, or first-run risk for the user.
- Suggested check: Reproduce the official install and quickstart path in an isolated environment.
- Evidence: community_evidence:github | cevd_6be83aacd98c4e3abb6ae6361bf81940 | https://github.com/sverklo/sverklo/issues/69

## 7. Installation risk - Installation risk requires verification

- Severity: medium
- Evidence strength: source_linked
- Finding: Project evidence flags an installation risk. Review the linked source before relying on this workflow.
- User impact: May increase setup, validation, or first-run risk for the user.
- Suggested check: Reproduce the official install and quickstart path in an isolated environment.
- Evidence: community_evidence:github | cevd_fc3cc34d92454d5a92ab4a196b178799 | https://github.com/sverklo/sverklo/issues/72

## 8. Installation risk - Installation risk requires verification

- Severity: medium
- Evidence strength: source_linked
- Finding: Project evidence flags an installation risk. Review the linked source before relying on this workflow.
- User impact: May increase setup, validation, or first-run risk for the user.
- Suggested check: Reproduce the official install and quickstart path in an isolated environment.
- Evidence: community_evidence:github | cevd_bf329d5553724c3281773c6aee96cae5 | https://github.com/sverklo/sverklo/issues/74

## 9. Installation risk - Installation risk requires verification

- Severity: medium
- Evidence strength: source_linked
- Finding: Project evidence flags an installation risk. Review the linked source before relying on this workflow.
- User impact: May increase setup, validation, or first-run risk for the user.
- Suggested check: Reproduce the official install and quickstart path in an isolated environment.
- Evidence: community_evidence:github | cevd_42920ecfbbc54f4f8b207e386dfc9ebd | https://github.com/sverklo/sverklo/issues/73

## 10. Configuration risk - Configuration risk requires verification

- Severity: medium
- Evidence strength: source_linked
- Finding: Project evidence flags a configuration risk. Review the linked source before relying on this workflow.
- User impact: May increase setup, validation, or first-run risk for the user.
- Suggested check: Reproduce the official install and quickstart path in an isolated environment.
- Evidence: capability.host_targets | github_repo:1203034717 | https://github.com/sverklo/sverklo

## 11. Capability evidence risk - Capability evidence risk requires verification

- Severity: medium
- Evidence strength: source_linked
- Finding: README/documentation is current enough for a first validation pass.
- User impact: May increase setup, validation, or first-run risk for the user.
- Suggested check: Reproduce the official install and quickstart path in an isolated environment.
- Evidence: capability.assumptions | github_repo:1203034717 | https://github.com/sverklo/sverklo

## 12. Maintenance risk - Maintenance risk requires verification

- Severity: medium
- Evidence strength: source_linked
- Finding: Project evidence flags a maintenance risk. Review the linked source before relying on this workflow.
- User impact: May increase setup, validation, or first-run risk for the user.
- Suggested check: Reproduce the official install and quickstart path in an isolated environment.
- Evidence: evidence.maintainer_signals | github_repo:1203034717 | https://github.com/sverklo/sverklo

## 13. Security or permission risk - Security or permission risk requires verification

- Severity: medium
- Evidence strength: source_linked
- Finding: no_demo
- User impact: May increase setup, validation, or first-run risk for the user.
- Suggested check: Reproduce the official install and quickstart path in an isolated environment.
- Evidence: downstream_validation.risk_items | github_repo:1203034717 | https://github.com/sverklo/sverklo

## 14. Security or permission risk - Security or permission risk requires verification

- Severity: medium
- Evidence strength: source_linked
- Finding: no_demo
- User impact: May increase setup, validation, or first-run risk for the user.
- Suggested check: Reproduce the official install and quickstart path in an isolated environment.
- Evidence: risks.scoring_risks | github_repo:1203034717 | https://github.com/sverklo/sverklo

## 15. Security or permission risk - Security or permission risk requires verification

- Severity: medium
- Evidence strength: source_linked
- Finding: Project evidence flags a security or permission risk. Review the linked source before relying on this workflow.
- User impact: May increase setup, validation, or first-run risk for the user.
- Suggested check: Reproduce the official install and quickstart path in an isolated environment.
- Evidence: community_evidence:github | cevd_6256d97c525b460c92ca9c7c5c3d6e70 | https://github.com/sverklo/sverklo/issues/59

## 16. Security or permission risk - Security or permission risk requires verification

- Severity: medium
- Evidence strength: source_linked
- Finding: Project evidence flags a security or permission risk. Review the linked source before relying on this workflow.
- User impact: May increase setup, validation, or first-run risk for the user.
- Suggested check: Reproduce the official install and quickstart path in an isolated environment.
- Evidence: community_evidence:github | cevd_1c794119f13c4355bc54d0cab37b3cf9 | https://github.com/sverklo/sverklo/issues/53

## 17. Security or permission risk - Security or permission risk requires verification

- Severity: medium
- Evidence strength: source_linked
- Finding: Project evidence flags a security or permission risk. Review the linked source before relying on this workflow.
- User impact: May increase setup, validation, or first-run risk for the user.
- Suggested check: Reproduce the official install and quickstart path in an isolated environment.
- Evidence: community_evidence:github | cevd_690412bdd0374be1b2850ff4124171fd | https://github.com/sverklo/sverklo/issues/66

## 18. Maintenance risk - Maintenance risk requires verification

- Severity: low
- Evidence strength: source_linked
- Finding: issue_or_pr_quality=unknown.
- User impact: May increase setup, validation, or first-run risk for the user.
- Suggested check: Reproduce the official install and quickstart path in an isolated environment.
- Evidence: evidence.maintainer_signals | github_repo:1203034717 | https://github.com/sverklo/sverklo

## 19. Maintenance risk - Maintenance risk requires verification

- Severity: low
- Evidence strength: source_linked
- Finding: release_recency=unknown.
- User impact: May increase setup, validation, or first-run risk for the user.
- Suggested check: Reproduce the official install and quickstart path in an isolated environment.
- Evidence: evidence.maintainer_signals | github_repo:1203034717 | https://github.com/sverklo/sverklo

