Doramagic Project Pack · Human Manual

sverklo

Related topics: Installation, Quick Start Guide, System Architecture

Overview

Sverklo is a local-first MCP (Model Context Protocol) server that provides repository memory and code intelligence for AI coding agents. It enables persistent context, semantic search, dependency graphs, blast-radius analysis, diff-aware review, and git-pinned decisions across coding sessions.

Version: 0.29.0
License: MIT
Repository: sverklo/sverklo
Website: https://sverklo.com

Purpose and Scope

Sverklo transforms a codebase into a queryable knowledge graph that AI agents can interact with across sessions. Unlike cloud-based solutions, sverklo runs entirely locally—no API keys or code upload required. The system indexes source files, builds dependency graphs, computes PageRank scores, and maintains persistent memories.

Key capabilities include:

| Category | Capabilities |
| --- | --- |
| Search | Semantic embeddings, BM25 full-text search, hybrid retrieval, PageRank-weighted ranking |
| Graph | Dependency analysis, blast-radius computation, impact analysis |
| Memory | Persistent context across sessions, core project invariants, categorized memories |
| Review | Diff-aware PR review, risk scoring, structural heuristics |
| Audit | Codebase health scoring, architecture diagrams, Obsidian-compatible exports |

Source: package.json:5-33

Architecture Overview

graph TD
    subgraph Client["Client Layer"]
        IDE[Claude Code / Cursor / Windsurf / Codex CLI]
    end

    subgraph Server["MCP Server"]
        MCP[MCP Protocol Handler]
        Tools[Tool Router]
        Hints[Intent Hints Engine]
    end

    subgraph Indexer["Indexer Subsystem"]
        Files[File Indexer]
        Code[Code Parser]
        Graph[Dependency Graph]
        Memory[Memory Store]
        Search[Hybrid Search Engine]
    end

    subgraph Stores["Stores"]
        FS[File Store]
        GS[Graph Store]
        MS[Memory Store]
        DS[Doc Edge Store]
    end

    IDE <--> MCP
    MCP <--> Tools
    Tools <--> Hints
    Tools <--> Indexer
    Indexer <--> Stores

The MCP server (src/server/mcp-server.ts) implements the Model Context Protocol, exposing tools and resources that IDE clients consume. The indexer subsystem coordinates file scanning, AST-based code parsing, graph construction, and search indexing into multiple backing stores.

Source: src/server/mcp-server.ts:1-50

MCP Tools

Sverklo exposes a comprehensive set of code intelligence tools via the MCP protocol. Tools are organized into presets that optimize the available surface area for different agent workflows.

Tool Presets

| Preset | Purpose | Tools Included |
| --- | --- | --- |
| core | Balanced overview + search | search, lookup, overview, refs, impact |
| nav | Navigation focus | search, lookup, overview, refs, impact, deps, context, status |
| lean | Minimal footprint | search, lookup, overview, refs, impact, deps, context, status, remember, recall, review_diff |
| research | Code exploration | search, search_iterative, investigate, ask, lookup, overview, refs, impact, deps, concepts, patterns, clusters, verify, critique, ctx_slice, ctx_grep, ctx_stats, status |
| review | PR/MR review | review_diff, diff_search, test_map, impact, refs, lookup, search, investigate, verify, status |

Source: src/server/tool-overrides.ts:1-60

Core Tools

Context Bundle (context) — An umbrella tool that returns a curated bundle in a single call: codebase overview, semantically relevant code, related symbols, and matching memories. This is the recommended first call for unfamiliar tasks.

inputSchema: {
  task: string,           // Free-form task description
  detail_level: enum,     // "minimal" | "normal" | "full"
  scope: string,          // Optional path prefix filter
  budget: number          // PageRank-pruned token budget
}

Source: src/server/tools/context.ts:1-30

Search (search, search_iterative) — Hybrid retrieval combining:

  • Full-text search via BM25
  • Semantic embeddings via ONNX runtime
  • PageRank-weighted ranking
  • Reciprocal rank fusion

Source: src/server/tools/context.ts:45-55

Critique (critique) — Validates an agent's answer by checking cited evidence for staleness, detecting missed high-PageRank hubs, and flagging undocumented symbols. Returns structured critique without LLM calls.

Source: src/server/tools/critique.ts:1-60

Review Diff (review_diff) — Diff-aware code review with risk scoring and structural heuristics. Emits structured GitHub PR review JSON for CI integration.

Source: src/server/tools/review-format.ts:1-40

Intent-Aware Hints

The hint engine tracks recent tool-call trajectories and appends "next steps" suggestions. It classifies intent into categories:

| Intent | Trigger Patterns |
| --- | --- |
| exploring | Search, lookup, investigate sequences |
| reviewing-diff | Diff tools followed by refs |
| tracing-impact | Impact analysis after symbol lookups |
| debugging | Grep, lookup, investigate patterns |
| onboarding | Context, overview, status calls |
| memory-curating | Remember, recall sequences |

Source: src/server/hints.ts:1-50
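
The classification above can be sketched as a simple scoring pass over recent tool calls. This is an illustrative sketch only: the `INTENT_TOOLS` mapping and the `classifyIntent` function are hypothetical, and unlike the real engine (which tracks call sequences) this version ignores call order.

```typescript
type Intent =
  | "exploring"
  | "reviewing-diff"
  | "tracing-impact"
  | "debugging"
  | "onboarding"
  | "memory-curating";

// Hypothetical mapping from each intent to the tools that count as evidence for it.
const INTENT_TOOLS: Array<[Intent, string[]]> = [
  ["exploring", ["search", "lookup", "investigate"]],
  ["reviewing-diff", ["review_diff", "diff_search", "refs"]],
  ["tracing-impact", ["impact", "lookup"]],
  ["debugging", ["ctx_grep", "lookup", "investigate"]],
  ["onboarding", ["context", "overview", "status"]],
  ["memory-curating", ["remember", "recall"]],
];

// Score each intent by how many recent calls fall in its tool set.
// The first intent with the highest non-zero score wins.
function classifyIntent(recentCalls: string[]): Intent | null {
  let best: Intent | null = null;
  let bestScore = 0;
  for (const [intent, tools] of INTENT_TOOLS) {
    const score = recentCalls.filter((call) => tools.indexOf(call) !== -1).length;
    if (score > bestScore) {
      best = intent;
      bestScore = score;
    }
  }
  return best;
}

console.log(classifyIntent(["context", "overview", "status"])); // onboarding
```

A trajectory-aware version would also need to look at ordering, for example to distinguish "diff tools followed by refs" from the reverse.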

Memory System

Sverklo maintains persistent context across coding sessions through a tiered memory architecture:

graph LR
    Core[Core Memories<br/>Tier: core]
    Recent[Recent Memories<br/>Tier: session]
    Stale[Stale Flagging<br/>is_stale flag]
    
    Core --> Session[Auto-injected on session start]
    Recent --> Session
    Stale --> Session

Memory Tiers:

  • Core — Project invariants, always auto-injected at session start
  • Recent — Session-scoped memories, last N entries
  • Stale — Flagged when underlying code changes (detected via graph analysis)

Source: src/server/mcp-server.ts:80-100

The sverklo://context resource is auto-injected on every session start, providing the agent with project context without requiring explicit tool calls.

Source: src/server/mcp-server.ts:55-78
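
To make the tiering concrete, here is a minimal sketch of how memories might be selected for injection. The `Memory` shape and the `selectForSession` function are assumptions for illustration; the actual memory store API is not shown in this manual.

```typescript
// Hypothetical memory record; field names are illustrative only.
interface Memory {
  text: string;
  tier: "core" | "session";
  isStale: boolean;
  createdAt: number; // epoch milliseconds
}

// Keep every core memory, plus the N newest session memories.
// Stale entries are kept but visibly flagged.
function selectForSession(memories: Memory[], recentLimit: number): string[] {
  const core = memories.filter((m) => m.tier === "core");
  const recent = memories
    .filter((m) => m.tier === "session")
    .sort((a, b) => b.createdAt - a.createdAt)
    .slice(0, recentLimit);
  return core.concat(recent).map((m) => (m.isStale ? "[stale] " : "") + m.text);
}

const injected = selectForSession(
  [
    { text: "All DB access goes through src/db", tier: "core", isStale: false, createdAt: 1 },
    { text: "Tried caching in auth.ts", tier: "session", isStale: true, createdAt: 2 },
    { text: "Chose RRF for ranking", tier: "session", isStale: false, createdAt: 3 },
    { text: "Old note", tier: "session", isStale: false, createdAt: 0 },
  ],
  2,
);
console.log(injected.join("\n"));
```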

Audit and Reporting

HTML Audit Report

Generates self-contained HTML reports with sverklo.com dark theme branding. Includes:

  • Dimension grade cards (A/B/C/D/F color-coded)
  • Section content cards with formatted bodies
  • SEO metadata and Open Graph tags
  • Responsive styling with JetBrains Mono and Public Sans fonts

Source: src/server/audit-html.ts:1-30

Obsidian Export

Generates Obsidian-compatible markdown with [[wikilinks]] for clickable dependency navigation in the Obsidian knowledge base.

Source: src/server/audit-obsidian.ts:1-30

Architecture Diagram

Generates self-contained HTML architecture diagrams showing:

  • Layer groupings (Frontend, API, Storage, Search, Indexer)
  • File distribution by PageRank
  • Cross-layer dependency edges
  • Color-coded directory patterns

Source: src/server/audit-arch.ts:1-50

Workflow Prompts

Sverklo defines prompt templates for common code-intelligence tasks that encode the optimal order of tool calls:

| Prompt | Purpose |
| --- | --- |
| sverklo/onboarding | New team member context injection |
| sverklo/premerge | Pre-merge review checklist |
| sverklo/premerge-full | Comprehensive pre-merge review |
| sverklo/investigate | Root-cause debugging workflow |
| sverklo/map-feature | Feature tracing across codebase |

Source: src/server/prompts.ts:1-50

Example prompt structure for feature mapping:

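// Excerpt from the map-feature prompt. scopeArg is not defined in this snippet;
// it is assumed to hold the optional scope argument derived from `scope`.
build: ({ feature, scope }) => `
  1. investigate query:"${feature}"${scopeArg}
  2. Pick top 3-5 symbols → refs on each
  3. impact on most-referenced symbols
  4. Call verify to validate assumptions
`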

Source: src/server/prompts.ts:20-45

GitHub Action Integration

The sverklo/sverklo/action GitHub Action provides CI-integrated code review:

- uses: sverklo/sverklo/action@main
  with:
    github-token: ${{ secrets.GITHUB_TOKEN }}
    fail-on: high        # Fail build on risk threshold
    max-files: 25        # Max files to review
    inline-comments: true # Post inline comments

The action posts a PR review containing:

  • Sticky summary comment with risk-scored files
  • Up to 30 inline comments anchored to flagged lines
  • JSON payload for direct pulls.createReview API posting

Source: action/README.md:1-30

CLI Commands

| Command | Description |
| --- | --- |
| sverklo init | Initialize project with index and CLAUDE.md |
| sverklo register <path> | Register a repository |
| sverklo unregister <name> | Unregister a repository |
| sverklo list | List registered repositories |
| sverklo reindex | Rebuild index for a repository |
| sverklo status | Show current repository status |
| sverklo doctor | Diagnose installation health |

Source: skill/README.md:1-20

Dependencies

Runtime Dependencies:

| Package | Version | Purpose |
| --- | --- | --- |
| @modelcontextprotocol/sdk | ^1.12.0 | MCP protocol implementation |
| chokidar | ^4.0.0 | File watching |
| ignore | ^7.0.0 | Gitignore pattern matching |
| onnxruntime-node | ^1.21.0 | Local embedding inference |
| picomatch | ^4.0.4 | Glob pattern matching |
| yaml | ^2.8.3 | YAML parsing |

Optional Dependencies:

| Package | Version | Purpose |
| --- | --- | --- |
| web-tree-sitter | ^0.24.0 | AST parsing (optional) |

Source: package.json:35-55

Engine Requirements: Node.js >= 24.0.0

Supported Environments

Sverklo integrates with:

  • Claude Code (primary)
  • Cursor
  • Windsurf
  • Codex CLI
  • Zed editor

Source: package.json:4

Known Limitations

Based on community feedback:

| Issue | Status | Reference |
| --- | --- | --- |
| Windows path handling | User-reported issues | Issue #20 |
| AGENTS.md not respected by sverklo init | Known | Issue #19 |
| MCP tool double-prefixing (sverklo_sverklo_*) | Known | Issue #71 |
| reindex does not update lastIndexed timestamp | Bug | Issue #74 |
| Stale MCP server binary after upgrade | Known | Issue #17 |

Source: package.json:1-10

Getting Started

# Install globally
npm install -g sverklo

# Initialize in your project
cd your-project
sverklo init

# Register a repository
sverklo register .

# Start using MCP tools from your IDE

The initialization creates a CLAUDE.md file with project context and builds the initial index. The MCP server then becomes available to any connected IDE.

Source: skill/README.md:20-35

Source: https://github.com/sverklo/sverklo / Human Manual

Installation

Related topics: Quick Start Guide

Sverklo is a local-first MCP (Model Context Protocol) server for code intelligence. This guide covers the complete installation process, from prerequisites through post-install verification.

Prerequisites

| Requirement | Version | Notes |
| --- | --- | --- |
| Node.js | >= 24.0.0 | Required runtime; matches the engines requirement in package.json |
| npm | Any recent version | Used for global installation |
| Git | Any recent version | Required for repository operations during init |
| OS | Linux, macOS, Windows | Windows has known path normalization quirks (see Platform-Specific Considerations) |

Verify your Node version before proceeding:

node --version

Installing via npm

Sverklo is distributed as a global npm package:

npm install -g sverklo

This installs the sverklo CLI binary globally, making it available from any directory. Source: package.json:9-11

The installation includes:

  • CLI binary (sverklo): Command-line interface for all operations
  • MCP server: Model Context Protocol server for IDE and agent integration
  • Tree-sitter support: AST parsing via the optional web-tree-sitter dependency; grammars are installed during sverklo init
  • ONNX runtime: Local embeddings for semantic search

Verify the installation:

sverklo --version

Project Initialization

Each codebase you want sverklo to manage requires initialization. Navigate to your project directory and run:

cd /path/to/your/project
sverklo init

The init command performs the following setup:

What `sverklo init` Does

graph TD
    A[sverklo init] --> B{Is ~/.sverklo dir present?}
    B -->|No| C[Create ~/.sverklo directory]
    B -->|Yes| D[Skip creation]
    C --> E[Create registry.json]
    D --> E
    E --> F[Create CLAUDE.md in project]
    F --> G[Parse existing docs/ADRs]
    G --> H[Install tree-sitter grammars]
    H --> I[Index codebase files]
    I --> J[Build symbol graph]
    J --> K[Compute PageRank scores]
    K --> L[Generate initial embeddings]
    L --> M[Write project metadata]

Files Created

| File | Location | Purpose |
| --- | --- | --- |
| CLAUDE.md | Project root | Agent instructions for code intelligence |
| registry.json | ~/.sverklo/ | Project registration with name, path, last indexed timestamp |

Source: src/init.ts:1-50

Grammar Installation

During initialization, sverklo installs tree-sitter grammars for supported languages. Grammars enable precise AST-based parsing for accurate symbol extraction.

graph LR
    A[Init starts] --> B[Check installed grammars]
    B --> C{Grammars exist?}
    C -->|Yes| D[Use cached]
    C -->|No| E[Download from npm]
    E --> F[Build with node-gyp]
    F --> G[Store in ~/.sverklo/grammars/]

Source: src/indexer/grammars-install.ts:1-40

Supported languages include TypeScript, JavaScript, Python, Go, Rust, and more. The grammars are installed once and reused across projects.

Post-Install Verification

After installation and initialization, verify everything works correctly:

sverklo doctor

The doctor command performs health checks on:

| Check | Purpose |
| --- | --- |
| Installation | Verifies CLI is reachable |
| Node version | Confirms >= 24.0.0 |
| Grammar binaries | Checks tree-sitter parsers are compiled |
| Project registry | Validates ~/.sverklo/registry.json |
| MCP server | Tests server startup |

Source: src/doctor.ts:1-60

Known Issue: Version Mismatch

If you have multiple sverklo installations (e.g., a stale global binary), sverklo doctor may report a different version than the one you're actually running. This occurs when the doctor check uses sverklo from $PATH instead of the embedded version.

Platform-Specific Considerations

Windows

Windows users may encounter path-related issues due to backslash vs forward slash handling. The codebase normalizes paths using:

path.replace(/\\/g, "/").split("/")

This approach converts Windows paths to Unix-style before processing. Source: Issue #20
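
As a minimal sketch of that normalization, the expression can be wrapped in a small helper. The name `toPosixSegments` is hypothetical and not taken from the sverklo source.

```typescript
// Hypothetical helper wrapping the normalization expression quoted above.
function toPosixSegments(path: string): string[] {
  // Convert every backslash to a forward slash, then split into segments.
  return path.replace(/\\/g, "/").split("/");
}

const segments = toPosixSegments("C:\\Users\\dev\\project\\src\\index.ts");
console.log(segments.join("/")); // C:/Users/dev/project/src/index.ts
```

Paths that already use forward slashes pass through unchanged, so the same helper works on Linux and macOS.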

Fresh Git Repositories

When running sverklo init in a fresh repository with no commits, you may see a spurious git warning:

Use '--' to separate paths from revisions, like this: 'git <command> [<revision>...] -- [<file>...]'

This warning is cosmetic and does not affect functionality. Source: Issue #3

IDE Integration

After installation, configure your IDE to use sverklo as an MCP server:

Claude Code

Sverklo ships with a Claude Skill package for Claude Code. After installation, Claude Code automatically discovers the MCP tools.

Cursor / Windsurf / Other MCP Clients

Register the MCP server by adding to your IDE's MCP configuration:

{
  "mcpServers": {
    "sverklo": {
      "command": "sverklo",
      "args": ["mcp", "serve"]
    }
  }
}

Source: src/server/mcp-server.ts:1-30

Troubleshooting

MCP Tools Not Appearing

  1. Run sverklo doctor to verify the server starts correctly
  2. Restart your IDE to pick up the newly registered MCP server
  3. Check that the project is registered: sverklo list

Stale Server Binary After Upgrade

When upgrading via npm install -g, any running MCP server subprocess continues serving from the old binary until restarted. Restart your IDE after upgrading. Source: Issue #17

Grammar Compilation Failures

If tree-sitter grammar compilation fails:

  1. Ensure node-gyp is available: npm install -g node-gyp
  2. Verify build tools are installed (Python, C++ compiler)
  3. On macOS, install Xcode Command Line Tools: xcode-select --install

Configuration Reference

Environment Variables

| Variable | Default | Description |
| --- | --- | --- |
| SVERKLO_PROFILE | full | Tool profile: core, nav, lean, full, research, review |
| SVERKLO_DISABLED_TOOLS | (none) | Comma-separated list of tools to hide |
| SVERKLO_TOOL_<NAME>_DESCRIPTION | (none) | Override tool description |

Source: src/server/tool-overrides.ts:1-30

Registry Structure

Projects are registered in ~/.sverklo/registry.json:

{
  "projects": [
    {
      "name": "my-project",
      "path": "/home/user/code/my-project",
      "lastIndexed": "2025-01-15T10:30:00Z",
      "version": "0.29.0"
    }
  ]
}
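
The registry shape above maps naturally onto a pair of TypeScript interfaces. The sketch below assumes exactly the fields shown in the example; the `indexAgeInDays` helper is hypothetical and only illustrates how a tool such as sverklo list could derive an age from lastIndexed.

```typescript
// Types mirror the example registry.json above; they are not copied from the source.
interface RegisteredProject {
  name: string;
  path: string;
  lastIndexed: string; // ISO 8601 timestamp
  version: string;
}

interface Registry {
  projects: RegisteredProject[];
}

// Hypothetical helper: whole days elapsed since the project was last indexed.
function indexAgeInDays(project: RegisteredProject, now: Date): number {
  const elapsedMs = now.getTime() - new Date(project.lastIndexed).getTime();
  return Math.floor(elapsedMs / (24 * 60 * 60 * 1000));
}

const registry = JSON.parse(`{
  "projects": [
    {
      "name": "my-project",
      "path": "/home/user/code/my-project",
      "lastIndexed": "2025-01-15T10:30:00Z",
      "version": "0.29.0"
    }
  ]
}`) as Registry;

const now = new Date("2025-01-20T12:00:00Z");
for (const project of registry.projects) {
  console.log(`${project.name}: indexed ${indexAgeInDays(project, now)} days ago`);
}
```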

Next Steps

After successful installation:

  1. Index your project: sverklo index (automatically run by init)
  2. Explore the codebase: sverklo overview
  3. Enable IDE integration: Configure MCP server in your IDE
  4. Read the CLI reference: Explore available commands with sverklo --help

Quick Start Guide

Related topics: Overview, Installation

Sverklo is a local-first MCP (Model Context Protocol) server that provides code intelligence for AI coding assistants. It delivers symbol graphs, blast-radius analysis, diff-aware review, and persistent memory across sessions—without requiring API keys or uploading code. Source: package.json

This guide walks you through installing sverklo, initializing it for your project, and getting started with its core capabilities.

Prerequisites

| Requirement | Version/Details |
| --- | --- |
| Node.js | >= 24.0.0 |
| Package Manager | npm, pnpm, or yarn |
| IDE/Client | Claude Code, Cursor, Windsurf, or Codex CLI |
| Git | Required for version-aware features |

Sverklo uses ONNX runtime for embeddings and tree-sitter for AST parsing. These are included as dependencies. Source: package.json:38-43

Installation

Install sverklo globally via npm:

npm install -g sverklo

Verify the installation:

sverklo --version
# or
sverklo doctor

The doctor command checks your environment and reports any configuration issues. Source: skill/README.md

Project Initialization

Navigate to your project directory and run the initialization:

cd your-project
sverklo init

The init command performs the following setup:

  1. Indexes your codebase — Scans source files, builds a symbol graph, and computes dependency relationships
  2. Detects existing agent instructions — Checks for CLAUDE.md, AGENTS.md, or other agent configuration files
  3. Creates context files — Generates or updates documentation for AI assistants
  4. Registers the project — Adds the project to your local registry for quick access

Source: skill/README.md

Note: In fresh git repositories with no commits, you may see a stray git warning. This is cosmetic—the init still succeeds. Source: GitHub Issue #3

Initialization for Windows

On Windows, you may encounter issues with file paths that use backslashes. Sverklo normalizes paths to forward slashes internally, but some edge cases may still arise. Source: GitHub Issue #20

Global Setup Option

If you want one-time machine setup without per-project boilerplate, use the global initialization flow. This imports memories once and allows quick registration for subsequent projects. Source: GitHub Issue #72

Core Commands

Register a Project

Register an existing project (if not done during init):

sverklo register .

List Registered Projects

View all registered projects and their status:

sverklo list

Reindex a Project

After significant code changes, refresh the index:

sverklo reindex .

Known Issue: The reindex command may not update the lastIndexed timestamp in ~/.sverklo/registry.json, causing sverklo list to show stale ages. Source: GitHub Issue #74

Unregister a Project

Remove a project from the registry:

sverklo unregister <project-name>

To unregister by path (useful for agent-driven workflows):

sverklo unregister --by-path /path/to/project

Source: GitHub Issue #73

MCP Tools Overview

Once initialized, sverklo provides these tools to your AI assistant:

| Tool | Purpose |
| --- | --- |
| search | Hybrid semantic code search (BM25 + embeddings + PageRank) |
| lookup | Find symbol definitions and references |
| overview | Get codebase statistics and top files by importance |
| impact | Calculate blast radius for proposed changes |
| refs | Find all references to a symbol |
| deps | Show dependency graph for a file |
| context | Umbrella tool: returns curated context bundle in one call |
| remember | Save decisions and context for future sessions |
| recall | Retrieve previously saved memories |
| review_diff | Risk-scored PR review with inline comments |
| audit | Codebase health scoring |
| investigate | Fan-out search across multiple signals |
| critique | Verify claims against codebase evidence |

Source: src/server/mcp-server.ts

First Session Workflow

When your AI assistant starts a session, sverklo automatically provides context resources:

graph TD
    A[Session Start] --> B[MCP Server Initializes]
    B --> C[Load Core Memories]
    C --> D[Load Recent Memories]
    D --> E[Build sverklo://context Resource]
    E --> F[Auto-inject into Session]

The sverklo://context resource includes:

  • Core project context — Tier-1 project invariants
  • Key memories — Previously saved decisions
  • Top files — High PageRank files for orientation
  • Language stats — File counts by language

Source: src/server/mcp-server.ts:37-67

Using the Context Tool

The context tool is the recommended starting point for new tasks:

{
  "task": "add rate limiting to the login endpoint",
  "detail_level": "normal",
  "budget": 4000
}

| Parameter | Type | Description |
| --- | --- | --- |
| task | string | Description of what you're working on |
| detail_level | "minimal", "normal", or "full" | How much context to return |
| scope | string | Optional path prefix to constrain search |
| budget | number | Token budget for PageRank-pruned repo map |

  • minimal (fast/cheap): overview header + top 3 search hits + top 2 memories
  • normal (balanced): header + top 5 search hits + top 5 memories + symbol table
  • full: everything in normal, plus dependency neighbors of top results
  • budget: returns a PageRank-pruned repo map fit to the token budget

Source: src/server/tools/context.ts
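
The detail levels described above can be summarized as a lookup table. This is a sketch under stated assumptions: the `BundlePlan` shape and `planBundle` function are hypothetical, and defaulting to "normal" when detail_level is omitted is an assumption rather than documented behavior.

```typescript
type DetailLevel = "minimal" | "normal" | "full";

// Mirrors the documented context tool parameters.
interface ContextArgs {
  task: string;
  detail_level?: DetailLevel;
  scope?: string;
  budget?: number;
}

// Hypothetical description of what a bundle contains at each level.
interface BundlePlan {
  searchHits: number;
  memories: number;
  symbolTable: boolean;
  dependencyNeighbors: boolean;
}

const PLANS: Record<DetailLevel, BundlePlan> = {
  minimal: { searchHits: 3, memories: 2, symbolTable: false, dependencyNeighbors: false },
  normal: { searchHits: 5, memories: 5, symbolTable: true, dependencyNeighbors: false },
  full: { searchHits: 5, memories: 5, symbolTable: true, dependencyNeighbors: true },
};

function planBundle(args: ContextArgs): BundlePlan {
  // Assumption: "normal" is used when no detail level is given.
  return PLANS[args.detail_level ?? "normal"];
}

const plan = planBundle({ task: "add rate limiting to the login endpoint", detail_level: "minimal" });
console.log(plan.searchHits, plan.memories); // 3 2
```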

GitHub Actions Integration

For automated code review on pull requests, use the sverklo action:

name: Sverklo Review
on: [pull_request]

jobs:
  review:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
        with:
          fetch-depth: 0
      - uses: sverklo/sverklo/action@main
        with:
          github-token: ${{ secrets.GITHUB_TOKEN }}
          fail-on: high
          inline-comments: true

| Input | Default | Description |
| --- | --- | --- |
| github-token | ${{ github.token }} | GitHub token for posting comments |
| fail-on | none | Risk threshold: critical, high, medium, low, none |
| ref | auto-detected | Git ref range (e.g., main..HEAD) |
| max-files | 25 | Maximum files to review |
| inline-comments | true | Post inline comments at flagged lines |

Source: action/README.md
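
The fail-on input implies an ordering over risk levels. The sketch below shows one plausible reading, in which the build fails when any reviewed file reaches the threshold or higher. The `shouldFail` function is hypothetical, and the exact comparison the action performs is an assumption.

```typescript
type Risk = "low" | "medium" | "high" | "critical";
type FailOn = Risk | "none";

// Risk levels ordered from least to most severe.
const RISK_ORDER: Risk[] = ["low", "medium", "high", "critical"];

// Assumed semantics: fail when any file's risk is at or above the threshold.
function shouldFail(failOn: FailOn, fileRisks: Risk[]): boolean {
  if (failOn === "none") return false;
  const threshold = RISK_ORDER.indexOf(failOn);
  return fileRisks.some((risk) => RISK_ORDER.indexOf(risk) >= threshold);
}

console.log(shouldFail("high", ["low", "medium"])); // false
console.log(shouldFail("high", ["low", "critical"])); // true
```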

Claude Code Subagents

Replace Claude Code's built-in subagents with sverklo-enhanced versions:

mkdir -p .claude/agents
curl -o .claude/agents/sverklo-explore.md \
  https://raw.githubusercontent.com/sverklo/sverklo/main/agents/sverklo-explore.md

The sverklo-explore subagent uses hybrid retrieval (BM25 + ONNX embeddings + PageRank) and answers questions in ~150-800 tokens, versus ~14,200 tokens for the default approach. Source: agents/README.md

Claude Skill Package

Sverklo ships a Claude Skill for Claude Code:

# The skill is included in the npm package
ls skill/
# Contains: sverklo-skill.zip and skill definitions

Tools available via the skill include sverklo_search, sverklo_review_diff, sverklo_audit, and memory tools. Source: skill/README.md

Troubleshooting

Version Mismatch Warning

When upgrading sverklo via npm, a running MCP server subprocess may continue using the old binary. Restart your IDE or MCP client to pick up the new version. Source: GitHub Issue #17

AGENTS.md Not Respected

If your project uses AGENTS.md instead of CLAUDE.md, the init command may still add context to the wrong file. Manually migrate the content or file an issue. Source: GitHub Issue #19

MCP Tool Name Prefix

When registering the MCP server under the key "sverklo", tool names may appear as sverklo_sverklo_* due to double-prefixing. Register under a different key (e.g., "io.github.sverklo") to avoid this. Source: GitHub Issue #71

System Architecture

Related topics: MCP Server Design, Search and Retrieval System, Indexing System, Bi-Temporal Memory Layer

Sverklo is a local-first code intelligence platform designed as an MCP (Model Context Protocol) server. It provides persistent memory, semantic code search, dependency graphs, blast-radius analysis, and diff-aware review for AI coding assistants. The architecture follows a layered design with clear separation between indexing, storage, search, and tool delivery layers.

High-Level Architecture Overview

Sverklo operates as a long-running MCP server process that serves code intelligence tools to IDE-integrated AI clients. The system indexes code once and serves multiple query types across sessions.

graph TD
    subgraph "Client Layer"
        A["Claude Code / Cursor / Windsurf / Codex CLI"]
    end
    
    subgraph "MCP Server Layer"
        B["mcp-server.ts<br/>MCP Protocol Handler"]
        C["Tool Handlers"]
        D["Prompt Templates"]
        E["Resource Provider"]
    end
    
    subgraph "Search & Query Layer"
        F["hybrid-search.ts<br/>BM25 + ONNX Embeddings + PageRank"]
        G["investigate.ts<br/>Multi-signal Fan-out"]
        H["Tool Profiles<br/>core, nav, lean, research, review"]
    end
    
    subgraph "Indexing Layer"
        I["Index Files"]
        J["Index Code (AST)"]
        K["Index Graph (Dependencies)"]
        L["Index Memory"]
    end
    
    subgraph "Storage Layer"
        M["SQLite Database"]
        N["Vector Store"]
        O["File Registry"]
    end
    
    A --> B
    B --> C
    B --> D
    B --> E
    C --> H
    D --> C
    H --> I
    H --> J
    H --> K
    H --> L
    I --> M
    J --> M
    K --> M
    L --> M

Core Design Principles

Local-First Architecture

Sverklo stores all data locally in ~/.sverklo/ and per-project .sverklo/ directories. No API keys or cloud services are required. The system runs entirely on the developer's machine.

Dependencies supporting local-first operation:

  • chokidar for file system watching
  • picomatch for glob pattern matching
  • ignore for .gitignore compatible filtering
  • onnxruntime-node for local embedding inference
  • web-tree-sitter (optional) for AST parsing

Source: package.json:1-50

MCP Protocol Integration

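The server is built on @modelcontextprotocol/sdk (^1.12.0) and exposes three kinds of MCP primitives: resources, prompts, and tools. The resource handler registers a single resource, sverklo://context: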

server.setRequestHandler(ListResourcesRequestSchema, async () => ({
  resources: [{
    uri: "sverklo://context",
    name: "Sverklo Project Context",
    description: "Key memories and codebase overview...",
    mimeType: "text/plain",
  }],
}));

The server exposes resources that are auto-injected at session start, prompts for workflow templates, and tools for all code intelligence operations.

Source: src/server/mcp-server.ts:1-50

Tool Architecture

Tool Registration System

All tools follow a standardized handler pattern. The MCP server maintains a tool registry that supports dynamic configuration:

export const contextTool = {
  name: "context",
  description: "Umbrella context bundler...",
  inputSchema: {
    type: "object" as const,
    properties: {
      task: { type: "string", description: "..." },
      detail_level: { type: "string", enum: ["minimal", "normal", "full"] },
      scope: { type: "string" },
      budget: { type: "number" },
    },
  },
};

Source: src/server/tools/context.ts:1-40

Tool Profiles

The system provides pre-defined tool subsets called profiles to control the MCP tool surface:

| Profile | Tools | Use Case |
| --- | --- | --- |
| core | search, lookup, overview, refs, impact | Hot path only |
| nav | core + deps, context, status | Navigation focus |
| lean | nav + remember, recall, review_diff | Memory + diff |
| research | search, investigate, ask, concepts, patterns, clusters, verify, critique | Code research |
| review | review_diff, diff_search, test_map, impact, refs | PR/MR review |

Source: src/server/tool-overrides.ts:1-80
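
Because nav and lean are defined as extensions of smaller profiles, they can be modeled as composed lists. This is a minimal sketch, not the contents of tool-overrides.ts; the `PROFILES` map and `toolsForProfile` function are hypothetical, and only the three layered profiles are included.

```typescript
// Each profile extends the previous one, as in the table above.
const CORE = ["search", "lookup", "overview", "refs", "impact"];
const NAV = CORE.concat(["deps", "context", "status"]);
const LEAN = NAV.concat(["remember", "recall", "review_diff"]);

const PROFILES = new Map<string, string[]>([
  ["core", CORE],
  ["nav", NAV],
  ["lean", LEAN],
]);

// Hypothetical lookup that rejects unknown profile names.
function toolsForProfile(name: string): string[] {
  const tools = PROFILES.get(name);
  if (!tools) throw new Error(`Unknown profile: ${name}`);
  return tools;
}

console.log(toolsForProfile("nav").length); // 8
console.log(toolsForProfile("lean").length); // 11
```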

Runtime Configuration

Tools can be customized via environment variables:

| Variable | Purpose |
| --- | --- |
| SVERKLO_TOOL_<NAME>_DESCRIPTION | Override tool description text |
| SVERKLO_DISABLED_TOOLS | Comma-separated list of tools to hide |
| SVERKLO_PROFILE | Apply a named profile (core, nav, lean, full, research, review) |
| SVERKLO_ZILLIZ_COMPAT | Enable Zilliz Claude context compatibility aliases |

Source: src/server/tool-overrides.ts:80-120
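
A rough sketch of how these variables could be applied to a tool list follows. The `applyOverrides` function and `ToolDef` shape are hypothetical, and it is an assumption that `<NAME>` is the upper-cased tool name. The environment is passed in as a plain object so the example does not depend on Node's process.env.

```typescript
interface ToolDef {
  name: string;
  description: string;
}

type Env = Record<string, string | undefined>;

function applyOverrides(tools: ToolDef[], env: Env): ToolDef[] {
  // Parse the comma-separated disable list, ignoring whitespace and empty entries.
  const disabled = (env["SVERKLO_DISABLED_TOOLS"] ?? "")
    .split(",")
    .map((name) => name.trim())
    .filter((name) => name.length > 0);

  return tools
    .filter((tool) => disabled.indexOf(tool.name) === -1)
    .map((tool) => {
      // Assumption: <NAME> in the variable is the upper-cased tool name.
      const override = env[`SVERKLO_TOOL_${tool.name.toUpperCase()}_DESCRIPTION`];
      return override ? { name: tool.name, description: override } : tool;
    });
}

const visible = applyOverrides(
  [
    { name: "search", description: "Hybrid search" },
    { name: "audit", description: "Health scoring" },
    { name: "recall", description: "Retrieve memories" },
  ],
  {
    SVERKLO_DISABLED_TOOLS: "audit, recall",
    SVERKLO_TOOL_SEARCH_DESCRIPTION: "Search this repo first",
  },
);
console.log(visible.map((t) => `${t.name}: ${t.description}`).join("\n"));
```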

Search Architecture

Hybrid Search Pipeline

Sverklo combines multiple retrieval signals to maximize result quality:

graph LR
    A["Query"] --> B["BM25 Keyword Search"]
    A --> C["ONNX Bi-encoder Embeddings"]
    A --> D["PageRank Centrality"]
    B --> E["Reciprocal Rank Fusion"]
    C --> E
    D --> E
    E --> F["Ranked Results"]

The search layer coordinates BM25 keyword matching, semantic embedding similarity via ONNX, and PageRank-based importance scoring. Reciprocal Rank Fusion combines these signals into a unified ranking.

Source: src/server/mcp-server.ts:100-150
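
For readers unfamiliar with Reciprocal Rank Fusion, the following sketch shows the standard formula, where each document scores the sum of 1 / (k + rank) over the rankings that contain it. The constant k = 60 is the value commonly used in the literature; the value sverklo uses is not stated here, and treating PageRank as a third ranked list is an assumption based on the diagram above.

```typescript
// Reciprocal Rank Fusion over several ranked lists of document ids.
function reciprocalRankFusion(rankings: string[][], k = 60): Array<[string, number]> {
  const scores = new Map<string, number>();
  for (const ranking of rankings) {
    ranking.forEach((id, index) => {
      const rank = index + 1; // ranks are 1-based
      scores.set(id, (scores.get(id) ?? 0) + 1 / (k + rank));
    });
  }
  // Highest fused score first.
  return Array.from(scores.entries()).sort((a, b) => b[1] - a[1]);
}

// Illustrative rankings; file names are made up.
const bm25 = ["auth.ts", "login.ts", "session.ts"];
const embeddings = ["login.ts", "auth.ts", "rate-limit.ts"];
const pagerank = ["auth.ts", "db.ts", "login.ts"];

const fused = reciprocalRankFusion([bm25, embeddings, pagerank]);
console.log(fused.map(([id]) => id).slice(0, 2).join(", ")); // auth.ts, login.ts
```

Because RRF uses only ranks, it needs no score normalization across retrievers whose raw scores live on different scales.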

Investigation Engine

The investigate tool performs multi-signal fan-out in a single call:

  1. Executes full-text search
  2. Queries vector embeddings
  3. Resolves symbol references
  4. Checks documentation mentions
  5. Aggregates results with found_by tags indicating which signals matched

Results agreed on by multiple retrievers are tagged as higher-signal than single-source hits.

Source: src/server/prompts.ts:1-50
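
The aggregation step can be sketched as a merge keyed by result id. Only the found_by tag name comes from the description above; the signal names, the `Hit` shape, and the `mergeSignals` function are hypothetical.

```typescript
// Hypothetical signal names for the four retrievers listed above.
type Signal = "fts" | "vector" | "symbol" | "doc";

interface Hit {
  id: string;
  found_by: Signal[];
}

// Merge per-signal result lists and record which signals matched each id.
function mergeSignals(results: Array<[Signal, string[]]>): Hit[] {
  const byId = new Map<string, Hit>();
  const hits: Hit[] = [];
  for (const [signal, ids] of results) {
    for (const id of ids) {
      let hit = byId.get(id);
      if (!hit) {
        hit = { id, found_by: [] };
        byId.set(id, hit);
        hits.push(hit);
      }
      if (hit.found_by.indexOf(signal) === -1) hit.found_by.push(signal);
    }
  }
  // Hits confirmed by more retrievers sort first.
  return hits.sort((a, b) => b.found_by.length - a.found_by.length);
}

const merged = mergeSignals([
  ["fts", ["rateLimiter", "loginHandler"]],
  ["vector", ["loginHandler", "sessionStore"]],
  ["symbol", ["loginHandler"]],
  ["doc", ["rateLimiter"]],
]);
for (const hit of merged) {
  console.log(`${hit.id} found_by=${hit.found_by.join(",")}`);
}
```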

PageRank Integration

Dependency graph analysis produces PageRank scores used to:

  • Rank files by architectural importance
  • Prune large repos to fit token budgets
  • Identify load-bearing modules
  • Surface high-centrality files in overview

Source: src/server/tools/wakeup.ts:1-50
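
As background, PageRank over an import graph can be computed with a short power iteration. This is a textbook sketch, not sverklo's implementation: the damping factor of 0.85 and the iteration count are conventional defaults, and the graph is assumed to list every file as a key.

```typescript
// imports maps each file to the files it imports; rank flows from importer to imported.
function pageRank(
  imports: Record<string, string[]>,
  damping = 0.85,
  iterations = 50,
): Map<string, number> {
  const nodes = Object.keys(imports);
  const n = nodes.length;
  let rank = new Map<string, number>();
  for (const node of nodes) rank.set(node, 1 / n);

  for (let i = 0; i < iterations; i++) {
    const next = new Map<string, number>();
    for (const node of nodes) next.set(node, (1 - damping) / n);
    for (const node of nodes) {
      const share = damping * (rank.get(node) ?? 0);
      const targets = imports[node] ?? [];
      // A file with no imports spreads its rank evenly over all files.
      const receivers = targets.length > 0 ? targets : nodes;
      for (const receiver of receivers) {
        next.set(receiver, (next.get(receiver) ?? 0) + share / receivers.length);
      }
    }
    rank = next;
  }
  return rank;
}

// Illustrative graph: two entry points importing one shared utility module.
const ranks = pageRank({
  "api.ts": ["utils.ts"],
  "cli.ts": ["utils.ts"],
  "utils.ts": [],
});
console.log(ranks);
```

In this example the shared utility module ends up with the highest score, which matches the intuition that heavily imported files are load-bearing.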

Code Analysis Engine

Symbol Indexing

The indexing pipeline extracts symbols from AST-aware parsing:

  • Function and class definitions
  • Import/export relationships
  • Type annotations
  • Documentation comments

Symbol data enables refs (find references), impact (blast radius), and symbol-based search.

Dependency Graph

Edges between files capture:

  • Import statements (ES modules, CommonJS, TypeScript imports)
  • Re-exports and re-typed symbols
  • Cross-reference relationships

The graph supports cycle detection, fan-in/fan-out analysis, and impact propagation.

Source: src/server/audit-obsidian.ts:1-50
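
Impact propagation over such a graph amounts to a breadth-first walk along reversed import edges. The sketch below is illustrative; the `blastRadius` function and the sample graph are hypothetical.

```typescript
// graph maps each file to the files it imports.
function blastRadius(graph: Record<string, string[]>, changed: string): string[] {
  // Build reverse edges: file -> files that import it.
  const dependents = new Map<string, string[]>();
  for (const file of Object.keys(graph)) {
    for (const imported of graph[file] ?? []) {
      const list = dependents.get(imported) ?? [];
      list.push(file);
      dependents.set(imported, list);
    }
  }

  // Breadth-first traversal; the visited set also guards against import cycles.
  const seen = new Set<string>([changed]);
  const queue: string[] = [changed];
  const affected: string[] = [];
  while (queue.length > 0) {
    const current = queue.shift() as string;
    for (const dependent of dependents.get(current) ?? []) {
      if (!seen.has(dependent)) {
        seen.add(dependent);
        queue.push(dependent);
        affected.push(dependent);
      }
    }
  }
  return affected;
}

const affected = blastRadius(
  {
    "api.ts": ["service.ts"],
    "service.ts": ["db.ts"],
    "cli.ts": ["service.ts"],
    "db.ts": [],
    "docs-gen.ts": [],
  },
  "db.ts",
);
console.log(affected.join(", ")); // service.ts, api.ts, cli.ts
```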

Critique System

The critique tool validates claims against indexed evidence:

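// Minimal assumed shape for this excerpt; fields other than claim are omitted.
interface CritiqueData {
  claim?: string;
}

function formatCritique(c: CritiqueData): string {
  const parts: string[] = [];
  parts.push(c.claim ? `## critique — "${c.claim}"` : "## critique");
  // Verifies citations point to actual source files
  // Checks for undocumented symbols
  // Detects stale or moved references
  return parts.join("\n"); // assumed: sections are joined into one report string
}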

Critique verifies that:

  • Cited evidence actually exists
  • Symbols are documented in .md, .markdown, or .mdx files
  • References haven't become stale or moved

Source: src/server/tools/critique.ts:1-60

Review and Audit Output

GitHub PR Review Format

Review output supports structured GitHub API payloads:

export interface InlineComment {
  /** Repo-relative path of the file being commented on */
  path: string;
  /** 1-based file line number */
  line: number;
  severity: "info" | "warning" | "error";
  body: string;
}

Risk levels are classified as: critical, high, medium, low.

Source: src/server/tools/review-format.ts:1-40

HTML Audit Reports

Self-contained HTML reports with dark theme branding are generated for codebase health analysis:

  • Dimension cards with letter grades (A-F)
  • Section cards with formatted content
  • SEO meta tags and Open Graph support
  • Google Fonts (JetBrains Mono, Public Sans)

Source: src/server/audit-html.ts:1-60

Obsidian Export

Audit reports can be exported as Obsidian-compatible markdown with [[wikilinks]] for clickable navigation between files and symbols.

Source: src/server/audit-obsidian.ts:1-50

Prompt Templates

Workflow Orchestration

Prompts encode the *order* of sverklo tool calls for common tasks:

| Prompt | Purpose |
|---|---|
| sverklo/map-feature | Map a feature across codebase entry points, symbols, tests, docs |
| sverklo/architecture-map | Generate architecture map using overview, deps, PageRank, recall |
| sverklo/onboard | New developer onboarding with conventions and project index |
| sverklo/premerge | Pre-merge review checklist |
| sverklo/debug | Systematic debugging using symbol graphs and references |

Each prompt uses the PromptDefinition interface:

interface PromptDefinition {
  name: string;
  description: string;
  arguments: { name: string; description: string; required: boolean }[];
  build: (args: Record<string, string>) => string;
}

Source: src/server/prompts.ts:1-80

Context Injection

Session Startup

On session start, the MCP server injects context via the sverklo://context resource:

const coreMemories = indexer.memoryStore.getCore(15);
const recentMemories = indexer.memoryStore.getRecent(10);
const projectMemories = indexer.memoryStore.getByCategory("project");
const conventions = indexer.memoryStore.getByCategory("convention");

Context tiers:

  • Core — Project invariants (always injected)
  • Recent — Latest saved memories
  • Category — Organized by type (project, convention, architecture)

Source: src/server/mcp-server.ts:50-100

Wakeup Generation

The wakeup tool produces quick orientation summaries:

export function generateWakeup(
  indexer: IndexFiles & IndexMemory,
  options: { maxTokens?: number; format?: "markdown" | "plain" }
): string

The output includes project status, core files by dependency rank, and project invariants.

Source: src/server/tools/wakeup.ts:1-50

References Lookup

Symbol Resolution

The refs tool finds all references to a symbol and separates:

  • Structural inclusions — where the symbol is defined/included
  • Associative references — "see also" mentions

Dedup logic prevents near-identical rows from the same logical doc location:

const seen = new Set<string>();
for (const m of docMentions) {
  const key = `${m.doc_file_path}|${m.doc_breadcrumb ?? ""}|${m.match_kind}`;
  if (seen.has(key)) continue;
  seen.add(key);
  dedupedAll.push(m);
}

Source: src/server/tools/find-references.ts:1-60

Known Architectural Considerations

Windows Path Handling

Path normalization uses forward-slash conversion:

path.replace(/\\/g, "/").split("/").pop()

This is noted in community discussions as a workaround rather than a comprehensive fix. See Issue #20.

MCP Tool Name Prefixing

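Before v0.28.0, all tools carried a sverklo_ prefix (e.g., sverklo_impact, sverklo_search). When the server was registered under the key "sverklo", this produced double-prefixing (e.g., sverklo_sverklo_impact). Since v0.28.0 the canonical names drop the prefix, and the legacy names remain available as deprecated aliases. See Issue #71.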

Retrieval Architecture Evolution

Community discussions (Issue #29) have raised evaluating ColBERT/PLAID-style multi-vector rerankers against the current bi-encoder + BM25 + PageRank approach.

Technology Stack

| Component | Technology | Purpose |
|---|---|---|
| Runtime | Node.js >= 24.0.0 | Server runtime |
| Protocol | MCP SDK 1.12.0 | Client-server communication |
| Embeddings | ONNX Runtime Node 1.21.0 | Local vector inference |
| Parsing | Tree-sitter (optional) | AST extraction |
| File Watching | Chokidar 4.0.0 | Live reload support |
| Matching | Picomatch 4.0.4 | Glob pattern support |
| Config | YAML 2.8.3 | Configuration files |

Source: package.json:30-60

Summary

Sverklo's architecture implements a clean separation between indexing, storage, search, and delivery layers. The MCP protocol enables integration with multiple AI coding clients while the hybrid search pipeline combines keyword, semantic, and graph-based signals. Tool profiles allow runtime customization of the available surface, and the prompt system encodes best-practice workflows. The local-first design ensures data privacy and eliminates external dependencies.

Source: https://github.com/sverklo/sverklo / Human Manual

MCP Server Design

Related topics: Search Tools Reference, Impact and Reference Tools


Sverklo implements a Model Context Protocol (MCP) server that provides code intelligence capabilities to AI coding agents. The MCP server layer sits between the indexer (which maintains the code graph, embeddings, and memories) and the AI client (Claude Code, Cursor, Windsurf, or Codex CLI). This design enables persistent, local-first code understanding without API keys or code upload.

Architecture Overview

The MCP server is built on top of the @modelcontextprotocol/sdk and exposes sverklo's indexer capabilities as tools, resources, and prompts. The architecture follows a layered design:

graph TD
    subgraph "AI Client Layer"
        A["Claude Code / Cursor / Windsurf"]
    end
    
    subgraph "MCP Server Layer"
        B["startMcpServer()"]
        C["startGlobalMcpServer()"]
        D["Tool Handlers"]
        E["Resource Handlers"]
        F["Prompt Handlers"]
    end
    
    subgraph "Core Indexer Layer"
        G["Indexer<br/>(IndexFiles + IndexCode + IndexGraph + IndexMemory)"]
        H["Vector Store"]
        I["Graph Store"]
        J["Memory Store"]
    end
    
    A --> B
    A --> C
    B --> D
    B --> E
    B --> F
    D --> G
    E --> G
    F --> G
    G --> H
    G --> I
    G --> J

Key Design Principles

  1. Local-first: All indexing happens on-disk; no data leaves the machine
  2. Git-aware: Tools understand branches, commits, and diffs
  3. Multi-signal retrieval: Combines FTS, embeddings, symbol graphs, and PageRank
  4. Backward compatibility: Legacy tool names are aliased to canonical names with deprecation warnings

Server Initialization

Per-Project MCP Server

The startMcpServer() function initializes an MCP server for a single repository:

// src/server/mcp-server.ts:180
export async function startMcpServer(rootPath: string): Promise<void> {
  const config = getProjectConfig(rootPath);
  // ... initializes indexer, registers handlers, starts server
}

Server configuration includes:

| Parameter | Source | Description |
|---|---|---|
| rootPath | CLI argument | Absolute path to the project root |
| serverName | server.json | MCP server identifier (default: io.github.sverklo/sverklo) |
| serverVersion | package.json | Inherited from npm package version |
| instructions | Static string | Server capabilities description for AI clients |

Server Capabilities

The MCP server declares three capability categories:

// src/server/mcp-server.ts:190
const server = new Server(
  { name: "sverklo", version: serverVersion },
  {
    capabilities: {
      tools: {},
      resources: {},
      prompts: {},
    },
    instructions: /* string */,
  }
);

| Capability | Purpose | Handler |
|---|---|---|
| tools | Code intelligence operations (search, lookup, impact, etc.) | server.setRequestHandler(CallToolRequestSchema, ...) |
| resources | Static project context at session start | server.setRequestHandler(ListResourcesRequestSchema, ...) |
| prompts | Reusable workflow templates | server.setRequestHandler(ListPromptsRequestSchema, ...) |

Tool System Architecture

Tool Registration

Tools are registered through the MCP SDK's server.tool() method. Each tool declares:

// src/server/mcp-server.ts (pattern)
server.tool(
  "search",           // canonical tool name
  "Natural language search across indexed code",  // description
  { query: { type: "string" } },                 // input schema
  async (args, extra) => { /* handler */ }       // implementation
);

Tool Naming Convention (v0.28.0+)

Following issue #71, tool names use the format <verb>_<noun> without the sverklo_ prefix:

| Canonical Name | Description |
|---|---|
| search | Full-text and semantic search |
| lookup | Symbol lookup by name |
| impact | Blast radius analysis |
| refs | Find references to a symbol |
| investigate | Multi-signal fan-out investigation |
| context | Umbrella context bundler |
| review_diff | Diff-aware code review |
| status | Indexing status |

Legacy Tool Aliases

For backward compatibility, the server maintains a LEGACY_TOOL_ALIASES map:

// src/server/mcp-server.ts
export const LEGACY_TOOL_ALIASES: Record<string, string> = {
  "sverklo_search": "search",
  "sverklo_lookup": "lookup",
  // ... ≥30 entries
};

The resolveToolName() function routes legacy names to canonical names and emits a single deprecation warning per legacy name per server instance:

// src/server/tools/rename-aliases.test.ts:16
it("resolveToolName routes legacy → canonical correctly", () => { ... });

// src/server/tools/rename-aliases.test.ts:22
it("deprecation warning fires exactly once per legacy name", () => { ... });
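
The once-per-name behavior those tests describe can be sketched like this. It is a hypothetical reconstruction from the test names, not the source: `createResolver`, the injected `warn` callback, and the warning text are assumptions, and only two alias entries are reproduced.

```typescript
// Two entries copied from the alias map above; the real map has more.
const LEGACY_ALIASES = new Map<string, string>([
  ["sverklo_search", "search"],
  ["sverklo_lookup", "lookup"],
]);

// Returns a resolver bound to one server instance.
// `warn` is injectable so callers (and tests) can capture the output.
function createResolver(warn: (message: string) => void): (name: string) => string {
  const warned = new Set<string>(); // legacy names already reported by this instance
  return (name: string): string => {
    const canonical = LEGACY_ALIASES.get(name);
    if (canonical === undefined) return name; // already canonical, or unknown
    if (!warned.has(name)) {
      warned.add(name);
      warn(`Tool "${name}" is deprecated; use "${canonical}" instead.`);
    }
    return canonical;
  };
}
```

Keeping the `warned` set inside the closure scopes the deduplication to one resolver, which matches the "per server instance" wording above.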

Tool Presets

Tool availability is controlled through named presets defined in tool-overrides.ts:

// src/server/tool-overrides.ts
export const TOOL_PRESETS = {
  // Minimal: only essential tools
  minimal: ["search", "lookup", "overview", "refs", "impact", "status"],
  
  // Standard: balanced for most use cases
  standard: ["search", "lookup", "overview", "refs", "impact", "deps", "context", "status"],
  
  // Lean: adds memory tools for recall/remember
  lean: ["search", "lookup", "overview", "refs", "impact", "deps", "context", "status", "remember", "recall", "review_diff"],
  
  // Research: full investigation surface for code onboarding
  research: ["search", "search_iterative", "investigate", "ask", "lookup", "overview", "refs", "impact", "deps", "concepts", "patterns", "clusters", "verify", "critique", "ctx_slice", "ctx_grep", "ctx_stats", "status"],
  
  // Review: PR/MR focus with diff tools front-and-center
  review: ["review_diff", "diff_search", "test_map", "impact", "refs", "lookup", "search", "investigate", "verify", "status"],
};

Presets are configured in server.json:

// server.json
{
  "mcp": {
    "preset": "research",
    "env": {
      "OVERRIDE_TOOLS": "standard,context"
    }
  }
}

Input Validation

Server-Side Validation Layer

The _validation.ts module provides shared validators used by all tool handlers:

// src/server/tools/_validation.ts
export function validateEnum<T extends string>(
  raw: unknown,
  allowed: readonly T[],
  argName: string,
  fallback: T
): T | Error { ... }

export function requireString(
  raw: unknown,
  argName: string,
  usage: string
): { ok: true; value: string } | { ok: false; message: string } { ... }

Why Server-Side Validation?

The MCP wrapper declares JSON schemas, but Claude/agents sometimes pass values outside declared enums. Without server-side guards, invalid values fall through to silent type-cast paths, returning wrong but successful-looking results. Source: src/server/tools/_validation.ts:1-15
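
A minimal sketch of how validators with these signatures might behave is given below. The bodies are assumptions: in particular, treating an omitted value as the fallback and rejecting blank strings are guesses, and the error messages are invented for illustration.

```typescript
function validateEnum<T extends string>(
  raw: unknown,
  allowed: readonly T[],
  argName: string,
  fallback: T
): T | Error {
  // Assumed: an omitted argument silently takes the default.
  if (raw === undefined || raw === null) return fallback;
  if (typeof raw === "string" && (allowed as readonly string[]).indexOf(raw) !== -1) {
    return raw as T;
  }
  // Anything else is reported instead of being cast to T.
  return new Error(
    `Invalid ${argName}: ${JSON.stringify(raw)}. Allowed: ${allowed.join(", ")}`
  );
}

function requireString(
  raw: unknown,
  argName: string,
  usage: string
): { ok: true; value: string } | { ok: false; message: string } {
  if (typeof raw === "string" && raw.trim() !== "") {
    return { ok: true, value: raw };
  }
  return { ok: false, message: `Missing required argument "${argName}". Usage: ${usage}` };
}
```

Returning an `Error` value rather than throwing lets a handler turn the problem into a visible tool response, which addresses the "wrong but successful-looking" failure mode described above.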

Git Parameter Validation

Git parameters (refs, paths) are validated against injection patterns:

// src/utils/git-validation.ts
export function validateGitRef(ref: string): boolean {
  // Allows: branch names, tags, SHAs, ranges (A..B, A...B), HEAD~N, HEAD^N
  // Rejects: spaces, semicolons, backticks, pipes, dollar signs, parentheses
  return /^[a-zA-Z0-9_.\/@{}\-~^:]+(\.\.[a-zA-Z0-9_.\/@{}\-~^:]+)?$/.test(ref);
}

This prevents command injection (CWE-78) when git commands are executed via execSync or spawnSync.
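
To make the accepted and rejected shapes concrete, the snippet below runs the pattern shown above against a few sample inputs. The sample strings are illustrative and not taken from the project's test suite.

```typescript
// Same pattern as in the excerpt above, repeated so the snippet is self-contained.
function validateGitRef(ref: string): boolean {
  return /^[a-zA-Z0-9_.\/@{}\-~^:]+(\.\.[a-zA-Z0-9_.\/@{}\-~^:]+)?$/.test(ref);
}

const samples: string[] = [
  "main",            // branch name
  "HEAD~3",          // relative ref
  "v1.2.0..HEAD",    // range
  "feature/login",   // branch with a slash
  "main; rm -rf /",  // shell command separator
  "$(whoami)",       // command substitution
  "a|b",             // pipe
  "`id`",            // backticks
  "",                // empty string
];

const accepted = samples.filter((s) => validateGitRef(s));
const rejected = samples.filter((s) => !validateGitRef(s));
// accepted: the first four samples; rejected: the remaining five
```

As written, the pattern also accepts values that begin with a dash, such as `--force`, because `-` is in the allowed character class. Callers may therefore want an additional leading-dash check before passing a ref to git.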

Resource System

Auto-Injected Project Context

The MCP server registers a single resource sverklo://context that AI clients read at session start:

// src/server/mcp-server.ts:200
server.setRequestHandler(ListResourcesRequestSchema, async () => ({
  resources: [
    {
      uri: "sverklo://context",
      name: "Sverklo Project Context",
      description:
        "Key memories and codebase overview. Read this at session start to understand the project.",
      mimeType: "text/plain",
    },
  ],
}));

Context Content

When sverklo://context is read, the server returns a markdown document containing:

| Section | Content | Selection Criteria |
|---|---|---|
| Core Project Context | Project-invariant memories (tier='core') | Top 15 by recency |
| Stale Memories | Memories flagged as outdated | Any with is_stale: true |
| Recent Memories | Recent context entries | Top 5 by recency |
| Top Files | Files sorted by PageRank | Top 5 files |

Prompt Templates

The MCP server exposes reusable workflow prompts via the prompts protocol:

// src/server/prompts.ts
export interface PromptDefinition {
  name: string;
  description: string;
  arguments: PromptArgument[];
  build: (args: Record<string, string | undefined>) => string;
}

Available Prompts

| Prompt Name | Description | Arguments |
|---|---|---|
| sverklo/review-changes | Diff-aware code review workflow | ref (optional) |
| sverklo/map-feature | Map a feature across codebase entry points, symbols, tests, docs | feature (required) |

Prompt Workflow Example

The sverklo/review-changes prompt guides the model through a structured review:

# Review changes workflow (simplified)

1. Call `review_diff ref:"<ref>"` for risk-scored findings
2. Call `diff_search query:"<risk keywords>"` to surface related changes
3. Call `test_map ref:"<ref>"` to check test coverage
4. Call `impact ref:"<ref>"` for blast radius before approving

Context Tool (`context`)

The context tool is an umbrella bundler that returns a codebase overview and task-relevant material in a single call:

// src/server/tools/context.ts
const contextTool = {
  name: "context",
  description:
    "Umbrella context bundler. Give a task description and get a single curated bundle: " +
    "codebase overview header, semantically relevant code, related symbols, and matching " +
    "saved memories — in one round trip.",
  inputSchema: {
    type: "object",
    properties: {
      task: { type: "string" },
      detail_level: { type: "string", enum: ["minimal", "normal", "full"] },
      scope: { type: "string" },
      budget: { type: "number" },  // PageRank-pruned token budget
    },
  },
};

Context Parameters

| Parameter | Type | Default | Description |
|---|---|---|---|
| task | string | - | Free-form task description |
| detail_level | enum | "normal" | minimal=fast/cheap, normal=balanced, full=adds dep neighbors |
| scope | string | - | Path prefix to constrain search (e.g., src/api/) |
| budget | number | - | When set, returns PageRank-pruned repo map fit to token budget |
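
One plausible reading of the budget parameter is a greedy selection by PageRank. The sketch below is an assumption about the general approach, not the tool's actual algorithm; `FileEntry`, `pruneToBudget`, and the per-file token counts are hypothetical.

```typescript
interface FileEntry {
  path: string;
  pagerank: number; // centrality score from the dependency graph
  tokens: number;   // estimated token cost of including this file's summary
}

// Keep the most central files whose combined cost fits the budget.
function pruneToBudget(files: FileEntry[], budget: number): string[] {
  const ranked = files.slice().sort((a, b) => b.pagerank - a.pagerank);
  const kept: string[] = [];
  let used = 0;
  for (const file of ranked) {
    // Skip a file that does not fit; a smaller, lower-ranked one still might.
    if (used + file.tokens > budget) continue;
    kept.push(file.path);
    used += file.tokens;
  }
  return kept;
}
```

Skipping oversized files instead of stopping at the first one that overflows trades strict rank order for better use of the budget.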

Critique Tool

The critique tool evaluates whether an AI's answer properly cites sverklo's evidence:

// src/server/tools/critique.ts
interface CritiqueData {
  claim: string | null;
  verify: VerifyResult[];
  stale: VerifyResult[];
  moved: VerifyResult[];
  hubsCited: string[];
  missedHubs: string[];
  undefinedSymbols: string[];
  undocumentedSymbols: string[];
  totalSymbols: number;
}

What Critique Checks

| Check | Description |
|---|---|
| Symbol verification | Are cited symbols actually defined in the codebase? |
| Stale memory | Are referenced memories marked as outdated? |
| Moved code | Do citations point to symbols that have been relocated? |
| Hub citation | Does the answer cite high PageRank hub files? |
| Undefined symbols | Does the answer mention symbols that don't exist? |
| Undocumented symbols | Are important symbols missing .md/.markdown/.mdx documentation? |

Zilliz Compatibility Layer

Sverklo provides aliases for Zilliz claude-context MCP server tools:

// src/server/mcp-server.ts (Zilliz compat tools)
const zillizTools = [
  {
    name: "search_code",
    description: "[Zilliz claude-context compat] Alias for sverklo's search tool.",
    inputSchema: {
      type: "object",
      properties: {
        query: { type: "string" },
        path: { type: "string" },
        limit: { type: "number" },
      },
      required: ["query"],
    },
  },
  {
    name: "clear_index",
    description: "[Zilliz claude-context compat] Delete the index database and rebuild from scratch.",
  },
  {
    name: "get_indexing_status",
    description: "[Zilliz claude-context compat] Alias for sverklo's `status` tool.",
  },
];

Global MCP Server (Multi-Repo Mode)

The startGlobalMcpServer() function serves multiple repositories from a single MCP server:

// src/server/mcp-server.ts:290
export async function startGlobalMcpServer(): Promise<void> {
  const pool = new IndexerPool();
  const hints = new HintEngine();
  // ... initializes server with list_repos tool
}

Global Mode Features

| Feature | Description |
|---|---|
| list_repos | List all registered repositories with path, name, and status |
| repo parameter | All tools accept optional repo parameter to target specific repo |
| Single-repo shortcut | If only one repo is registered, repo parameter is optional |

Server Instructions (Global Mode)

Sverklo (global mode): code intelligence serving multiple repos.
Use the list_repos tool to see available repositories, then pass the
repo name to any tool. If only one repo is registered, the repo
parameter is optional.
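
The repo-selection rule in those instructions can be sketched as a small resolver. `resolveRepo` and its error messages are hypothetical; the real server resolves repositories through its IndexerPool, whose API is not shown here.

```typescript
// Decide which registered repo a tool call should target.
function resolveRepo(registered: string[], requested?: string): string | Error {
  if (requested !== undefined) {
    return registered.indexOf(requested) !== -1
      ? requested
      : new Error(`Unknown repo "${requested}". Registered: ${registered.join(", ")}`);
  }
  // Single-repo shortcut: the repo parameter may be omitted.
  if (registered.length === 1) return registered[0];
  if (registered.length === 0) return new Error("No repos are registered.");
  return new Error("Multiple repos are registered; pass the repo parameter (see list_repos).");
}
```

Returning an error that names list_repos gives the agent a concrete next step when the call is ambiguous.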

Wakeup Generation

The generateWakeup() function creates a compact project summary:

// src/server/tools/wakeup.ts
export function generateWakeup(
  indexer: IndexFiles & IndexMemory,
  options: { maxTokens?: number; format?: "markdown" | "plain" } = {}
): string

Wakeup Output Structure

# {projectName}
{fileCount} files · {languages}

## Core files (by dependency rank)
- `path/to/high-pagerank-file.ts`
- ...

## Project invariants (or Recent context)
- [{category}] {memory content}
- ...

Version Management

The server reads its version from package.json at startup by probing up to three parent directories of its own module location:

// src/server/mcp-server.ts:305
for (const rel of ["..", "../..", "../../.."]) {
  try {
    const pkg = JSON.parse(readFileSync(join(here, rel, "package.json"), "utf-8"));
    if (pkg.name === "sverklo" && pkg.version) {
      serverVersion = pkg.version;
      break;
    }
  } catch {}
}

This ensures the MCP server always reports the version of the installed npm package, even when invoked from different working directories.

Search and Retrieval System

Related topics: Search Tools Reference, Indexing System, System Architecture


Overview

The Search and Retrieval System in sverklo provides multi-signal code search capabilities for coding agents. It combines full-text search (BM25), semantic embeddings (bi-encoder), symbol-based lookup, and graph-based PageRank scoring to surface relevant code chunks for a given query.

The system is designed to be local-first with no API keys required, using ONNX runtime for embedding inference and in-memory data structures for fast retrieval. It powers tools like search, investigate, context, and review_diff across the MCP server interface.

Source: src/search/hybrid-search.ts

Architecture

The retrieval architecture combines multiple rankers into a unified pipeline:

graph TD
    A[Query] --> B[BM25 FTS]
    A --> C[Bi-Encoder Embeddings]
    A --> D[Symbol Lookup]
    B --> E[Reciprocal Rank Fusion]
    C --> E
    D --> E
    E --> F[PageRank Boost]
    F --> G[Reranker]
    G --> H[Final Results]

The system uses Reciprocal Rank Fusion (RRF) to combine signals from multiple retrievers, followed by PageRank-based boosting to prioritize centrally-important files, and an optional reranking pass to refine results based on query-document affinity.

Source: src/search/rerank.ts, src/search/pagerank.ts

Core Components

Hybrid Search

The hybridSearch function orchestrates multi-signal retrieval by:

  1. Executing parallel searches across FTS, embeddings, symbols, and references
  2. Collecting results with source attribution (found_by tags)
  3. Applying Reciprocal Rank Fusion to merge ranked lists
  4. Boosting results from high-PageRank files

export async function hybridSearch(
  indexer: Indexer,
  query: string,
  options?: HybridSearchOptions
): Promise<SearchResult[]>

The function returns results annotated with found_by arrays, so callers can identify multi-source agreement: results agreed on by multiple retrievers are higher-signal than single-source hits.

Source: src/search/hybrid-search.ts, src/server/prompts.ts

Reciprocal Rank Fusion

RRF combines ranked lists using the formula:

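RRF_score(doc) = Σ_i 1 / (k + rank_i(doc))

Here rank_i(doc) is the position of doc in retriever i's ranked list (1-based in the standard formulation), and the sum runs over the retrievers that returned it. The constant k (typically 60) dampens the advantage of top-ranked results, so larger values let lower-ranked results contribute relatively more. Because RRF uses only ranks and ignores raw scores, it needs no score normalization across retrievers whose scores are on different scales.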
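
A small worked example makes the formula concrete. The `fuse` function and the chunk ids below are illustrative only; ranks are taken as 1-based, which is the standard convention and an assumption about sverklo's code.

```typescript
const RRF_K = 60;

// Each input list is one retriever's results, best first, identified by chunk id.
function fuse(rankedLists: string[][], k: number = RRF_K): Array<[string, number]> {
  const scores = new Map<string, number>();
  for (const list of rankedLists) {
    list.forEach((id, index) => {
      const rank = index + 1; // 1-based rank within this retriever's list
      scores.set(id, (scores.get(id) ?? 0) + 1 / (k + rank));
    });
  }
  // Highest fused score first.
  return Array.from(scores.entries()).sort((a, b) => b[1] - a[1]);
}

const fused = fuse([
  ["a", "b", "c"], // e.g. full-text results
  ["b", "d", "a"], // e.g. embedding results
]);
// "b" scores 1/62 + 1/61, "a" scores 1/61 + 1/63, "d" scores 1/62, "c" scores 1/63,
// so the fused order is b, a, d, c.
```

Both "a" and "b" appear in two lists, so they outrank "c" and "d"; "b" edges ahead because its two ranks (2 and 1) are better than those of "a" (1 and 3).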

Source: src/search/hybrid-search.ts

PageRank Boost

PageRank scores are computed from the import dependency graph during indexing. The pagerankBoost function adjusts search scores based on file centrality:

export function pagerankBoost(
  results: SearchResult[],
  fileStore: FileStore,
  factor?: number
): SearchResult[]

High-PageRank files (core libraries, entry points) receive a multiplicative boost, so frequently imported code surfaces first even when query terms are sparse.
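
The exact boost formula is not shown in this manual. The sketch below assumes one common form, score × (1 + factor × normalized PageRank), purely for illustration; `boostByPageRank`, `ScoredResult`, and the plain Map standing in for the FileStore lookup are hypothetical.

```typescript
interface ScoredResult {
  path: string;
  score: number;
}

function boostByPageRank(
  results: ScoredResult[],
  pagerank: Map<string, number>, // hypothetical stand-in for the FileStore lookup
  factor = 0.5
): ScoredResult[] {
  // Normalize against the largest PageRank so the boost stays within [1, 1 + factor].
  let maxRank = 0;
  pagerank.forEach((value) => {
    if (value > maxRank) maxRank = value;
  });
  return results
    .map((r) => {
      const normalized = maxRank > 0 ? (pagerank.get(r.path) ?? 0) / maxRank : 0;
      return { path: r.path, score: r.score * (1 + factor * normalized) };
    })
    .sort((a, b) => b.score - a.score);
}
```

Under this assumed formula, a file with no PageRank entry keeps its original score, and the most central file gains at most `factor` times its score.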

Source: src/search/pagerank.ts

Embedding Store

The EmbeddingStore manages vector embeddings for semantic search:

export class EmbeddingStore {
  get(query: string, k?: number): EmbeddingResult[]
  upsert(records: EmbeddingRecord[]): void
  prune(ids: Set<number>): void
}

Embeddings are computed using ONNX runtime and stored in memory. The store supports:

  • Top-k retrieval by cosine similarity
  • Pruning to remove embeddings for deleted files
  • Batch upsert for incremental index updates
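
Top-k retrieval by cosine similarity can be sketched with a brute-force scan. This illustrates the technique only; `StoredVector`, `cosine`, and `topK` are hypothetical names, and the real store may use a different layout or index.

```typescript
interface StoredVector {
  id: number;       // chunk id
  vector: number[]; // embedding for that chunk
}

// Cosine similarity of two equal-length vectors; zero vectors score 0.
function cosine(a: number[], b: number[]): number {
  let dot = 0;
  let normA = 0;
  let normB = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    normA += a[i] * a[i];
    normB += b[i] * b[i];
  }
  const denom = Math.sqrt(normA) * Math.sqrt(normB);
  return denom === 0 ? 0 : dot / denom;
}

// Score every stored vector against the query and keep the k best.
function topK(
  query: number[],
  stored: StoredVector[],
  k: number
): Array<{ id: number; score: number }> {
  return stored
    .map((s) => ({ id: s.id, score: cosine(query, s.vector) }))
    .sort((x, y) => y.score - x.score)
    .slice(0, k);
}
```

A linear scan is simple and exact; it is a reasonable starting point for an in-memory store, though larger indexes often move to approximate search.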

Source: src/storage/embedding-store.ts

Reranking

The reranker refines initial results using a cross-encoder approach. It takes the top-N candidates from hybrid search and reorders them based on finer-grained query-document matching:

export interface RerankerResult {
  chunk_id: number;
  score: number;
  rerank_score: number;
  source: "fts" | "embedding" | "symbol" | "ref";
  path: string;
  lines: string;
}

Source: src/search/rerank.ts

Investigation Tool

The investigate tool provides single-pass fan-out over all retrieval signals:

export async function handleInvestigate(
  indexer: Indexer,
  args: { query: string; scope?: string; limit?: number }
): Promise<string>

It returns structured results showing which retrievers found each result, enabling agents to:

  • Identify high-confidence hits (multi-source agreement)
  • Discover unexpected code locations
  • Build confidence in retrieved evidence

Source: src/search/investigate.ts

Search Iterative Tool

For complex queries requiring refinement, searchIterative supports multi-turn search with context accumulation:

export const searchIterativeTool = {
  name: "search_iterative",
  description: "Multi-turn search that builds on previous results...",
  inputSchema: {
    type: "object",
    properties: {
      query: { type: "string", description: "Search query" },
      refine: { type: "string", description: "Refinement to previous results" },
      // ...
    }
  }
}

The tool maintains trajectory state across calls, allowing progressive narrowing of search space.

Source: src/server/tools/search-iterative.ts

Context Tool

The context tool is an umbrella bundler that combines search with memory recall:

export const contextTool = {
  name: "context",
  description: "Umbrella context bundler. Give a task description and get a single curated bundle..."
}

It supports a budget parameter for PageRank-pruned repo maps that fit a token budget—ideal for giving agents a complete mental model of an unfamiliar codebase in one call.

Source: src/server/tools/context.ts

Retrieval Signal Types

| Signal | Source | Strength |
|---|---|---|
| BM25 FTS | Full-text indexing | Exact term matching |
| Bi-encoder embeddings | ONNX inference | Semantic similarity |
| Symbol lookup | AST parsing | Definition/expression finding |
| Reference graph | Import analysis | Call-site discovery |
| PageRank | Dependency graph | Architectural importance |

Configuration

Boost Factor

The BOOST_FACTOR constant (default: 0.5) controls PageRank influence on final scores:

export function pagerankBoost(
  results: SearchResult[],
  fileStore: FileStore,
  factor: number = BOOST_FACTOR
): SearchResult[]

RRF Constant

The RRF_K constant (default: 60) controls how strongly lower-ranked results from each retriever influence fusion:

const RRF_K = 60;
function rrfScore(rank: number): number {
  return 1 / (RRF_K + rank);
}

Community Considerations

Multi-Vector Reranker Evaluation (Issue #29)

The community has discussed evaluating ColBERT/PLAID-style multi-vector rerankers against the current bi-encoder approach. Multi-vector models encode each token of the query and the document as its own embedding vector, potentially capturing finer-grained relevance signals for code search.

Current architecture uses bi-encoder embeddings where both query and document are encoded independently. The enhancement would involve:

  • Late interaction between query and document token vectors
  • Potential improvement in recall for partial matches
  • Trade-off consideration: latency vs. accuracy

Benchmark Performance (Issue #28)

Parser improvements (string/comment-aware brace counting) recovered P1 performance from 0.30 → 0.73 on the 90-task benchmark, though P2/P4 categories showed slight regression. This highlights the sensitivity of retrieval quality to underlying code parsing accuracy.

Tool Summary

| Tool | Purpose | Signals Used |
|---|---|---|
| search | Basic hybrid search | FTS + Embeddings + Symbols |
| investigate | Multi-source fan-out | All signals + agreement analysis |
| search_iterative | Refinement search | Trajectory-aware multi-turn |
| context | Bundle + memories | Hybrid search + recall |
| ctx_grep | Grep within results | Post-filter FTS |
| ctx_slice | Dependency slice | Graph-based filtering |

Dependencies

The search system depends on:

  • ONNX Runtime Node (onnxruntime-node) for embedding inference
  • Tree-sitter (web-tree-sitter) for AST parsing and symbol extraction
  • Picomatch for file path matching patterns

Source: package.json

Bi-Temporal Memory Layer

Related topics: Search and Retrieval System


The Bi-Temporal Memory Layer is sverklo's persistent knowledge system that preserves coding decisions, conventions, and context across agent sessions. Unlike traditional memory stores that overwrite previous entries, the bi-temporal model maintains both the current state and historical validity of each memory, enabling conflict detection without data loss.

Core Concepts

What Makes It "Bi-Temporal"

The bi-temporal architecture tracks two independent time dimensions:

  1. Valid Time: When a memory was or will be true in the real world (e.g., "we deprecated X in v2.0")
  2. Record Time: When the memory was recorded in the system (e.g., "I learned about X's deprecation today")

This separation allows agents to reason about both what was historically true and when that knowledge was acquired. Source: src/server/tools/memories.ts:1-20

Memory Categories

Memories are classified into categories that determine their behavior:

| Category | Purpose | Default Kind |
|---|---|---|
| decision | Architectural choices, API contracts | semantic |
| preference | Coding style, team conventions | semantic |
| pattern | Reusable solutions to recurring problems | semantic |
| context | Project-specific information (default) | episodic |
| todo | Outstanding work items | episodic |
| procedural | Step-by-step processes | procedural |
| correction | Fixes for prior model mistakes | episodic |

Source: src/server/tools/remember.ts:15-23

Memory Kinds (Cognitive Axis)

The cognitive axis determines how memories are retrieved and prioritized:

  • episodic: Moment-bound events or decisions tied to specific contexts
  • semantic: Timeless facts or rules that apply universally
  • procedural: How-to knowledge for executing tasks

Source: src/server/tools/remember.ts:47-50

Memory Tiers

| Tier | Behavior |
|---|---|
| core | Auto-injected at every session start (up to 15 memories) |
| archive | Searched on demand, not auto-injected |

Source: src/server/mcp-server.ts:45-55

Architecture

Storage Components

┌─────────────────────────────────────────────────────────────────┐
│                     Bi-Temporal Memory Layer                     │
├─────────────────────────────────────────────────────────────────┤
│  ┌─────────────┐    ┌──────────────────────┐                   │
│  │memory-store │    │memory-embedding-store │                   │
│  │  (SQLite)   │    │   (Vector + SQLite)   │                   │
│  └─────────────┘    └──────────────────────┘                   │
│         │                     │                                │
│         └──────────┬──────────┘                                │
│                     ▼                                           │
│         ┌─────────────────────┐                                │
│         │  staleness.ts       │  (file-change tracking)         │
│         └─────────────────────┘                                │
│                     │                                           │
│         ┌─────────────────────┐                                │
│         │  prune.ts           │  (lifecycle management)         │
│         └─────────────────────┘                                │
└─────────────────────────────────────────────────────────────────┘

Memory Lifecycle

graph TD
    A[remember tool] --> B[Check for conflicting memories]
    B --> C{Conflict threshold > 0.85?}
    C -->|Yes| D[Mark old memory as STALE]
    C -->|No| E[Keep both memories active]
    D --> F[Save new memory with git state]
    E --> F
    F --> G[Store in memory-store]
    F --> H[Generate embeddings in memory-embedding-store]
    G --> I[Auto-inject if tier=core]

Conflict Detection

The system uses a configurable conflict threshold (default: 0.85) to identify potentially contradictory memories:

const CONFLICT_THRESHOLD = 0.85;

When a new memory conflicts with an existing one above this threshold, the older memory is marked as stale rather than deleted. Both records are preserved, enabling agents to review the conflict. Source: src/server/tools/remember.ts:8
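
The mark-stale-instead-of-delete behavior can be sketched as follows. This is a simplified illustration: `MemoryRow`, `saveWithConflictCheck`, and the injected `similarity` callback are hypothetical, and the real implementation presumably compares embeddings from the memory-embedding-store.

```typescript
const CONFLICT_THRESHOLD = 0.85;

interface MemoryRow {
  id: number;
  content: string;
  is_stale: boolean;
}

// `similarity` is a hypothetical callback returning a score in [0, 1].
function saveWithConflictCheck(
  existing: MemoryRow[],
  incoming: { id: number; content: string },
  similarity: (a: string, b: string) => number
): MemoryRow[] {
  const updated = existing.map((m) =>
    !m.is_stale && similarity(m.content, incoming.content) > CONFLICT_THRESHOLD
      ? { id: m.id, content: m.content, is_stale: true } // superseded, but kept for review
      : m
  );
  // The new memory is appended as active; nothing is deleted.
  return updated.concat([{ id: incoming.id, content: incoming.content, is_stale: false }]);
}
```

Because superseded rows are flagged and retained, an agent can later list the stale entries next to their replacements and decide which one still holds.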

MCP Tools

remember — Save Persistent Memory

rememberTool = {
  name: "remember",
  inputSchema: {
    content: string,           // Required: the memory content
    category: MemoryCategory,  // decision|preference|pattern|context|todo|procedural|correction
    tags: string[],            // Optional metadata tags
    related_files: string[],   // Files this memory relates to
    confidence: number,        // 0.0-1.0, affects retrieval ranking
    tier: "core" | "archive",   // core=auto-inject, archive=search-on-demand
    kind: "episodic" | "semantic" | "procedural",
    scope: "project" | "workspace",  // project=repo-local, workspace=cross-repo
  }
}

Key behaviors:

  • Tied to git state — memories are associated with the current commit/branch
  • Auto-invalidates conflicting prior memories above the conflict threshold
  • procedural category defaults to procedural kind
  • preference/pattern categories default to semantic kind
  • Other categories default to episodic kind

Source: src/server/tools/remember.ts:12-55
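
The default-kind rules listed above can be written as a small mapping. This is a sketch of the documented behavior only; the `defaultKind` function name is an assumption, and the real logic in remember.ts may be structured differently.

```typescript
type MemoryCategory =
  | "decision" | "preference" | "pattern" | "context"
  | "todo" | "procedural" | "correction";
type MemoryKind = "episodic" | "semantic" | "procedural";

// Derives the kind used when the caller does not pass `kind` explicitly.
function defaultKind(category: MemoryCategory): MemoryKind {
  switch (category) {
    case "procedural":
      return "procedural";
    case "preference":
    case "pattern":
      return "semantic";
    default:
      return "episodic"; // decision, context, todo, correction
  }
}
```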

recall — Retrieve Relevant Memories

The recall tool searches memories semantically and returns results ranked by relevance. Memories linked to files that have changed since recording may be marked as potentially stale.

memories — List and Audit Memories

memoriesTool = {
  name: "memories",
  inputSchema: {
    mode: "list" | "conflicts",  // list=show all, conflicts=show contradictory pairs
    category: MemoryCategory,    // Filter by category
    limit: number,              // Max results (default: 50)
    stale_only: boolean,         // Only show stale memories
  }
}

Conflict mode: Surfaces pairs of active memories sharing a pin that may contradict. The bi-temporal model preserves both, presenting this as a review prompt rather than auto-resolving.

Source: src/server/tools/memories.ts:8-30

pin / unpin — Anchor Memories to Code Locations

pinTool = {
  name: "pin",
  inputSchema: {
    memory_id: number,  // From recall/memories results
    target: string,     // File path or symbol name
  }
}

Pinned memories surface automatically when recalling by that file path or symbol name, without requiring semantic search. This enables location-specific knowledge injection.

Source: src/server/tools/pin.ts:1-30

Core Memories Auto-Injection

On every MCP session start, the server auto-injects core tier memories into the context:

// From mcp-server.ts
const coreMemories = indexer.memoryStore.getCore(15);
for (const m of coreMemories) {
  const stale = m.is_stale ? " [STALE]" : "";
  parts.push(`- [${m.category}]${stale} ${m.content}`);
}

These memories appear in the sverklo://context resource and serve as project invariants that agents should always consider. Source: src/server/mcp-server.ts:45-55

Staleness Detection

File-Based Staleness

When related_files are provided with a memory, sverklo tracks file changes:

interface Memory {
  related_files: string[];  // Files this memory relates to
  is_stale: boolean;        // Set when related files change
}

Stale memories are flagged with [STALE] in the context output, alerting agents to re-evaluate whether the memory is still valid. Source: src/server/mcp-server.ts:52
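
A minimal sketch of the file-based check, assuming the caller already knows which paths changed (how sverklo detects changes is not described here, so the `changedFiles` input is an assumption). The `id` field and the `markStale` helper are hypothetical additions for illustration.

```typescript
interface Memory {
  id: number;               // hypothetical identifier, added for the example
  related_files: string[];  // files this memory relates to
  is_stale: boolean;        // set when related files change
}

// Flags every active memory that references at least one changed file
// and returns the ids that were newly flagged.
function markStale(memories: Memory[], changedFiles: string[]): number[] {
  const changed = new Set(changedFiles);
  const flagged: number[] = [];
  for (const m of memories) {
    if (!m.is_stale && m.related_files.some((f) => changed.has(f))) {
      m.is_stale = true;
      flagged.push(m.id);
    }
  }
  return flagged;
}
```

Memories without `related_files` are never flagged by this path, which is why supplying the list when saving a memory matters.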

Conflict-Based Staleness

Memories that conflict with newer entries (similarity > 0.85) are automatically marked stale, preserving the historical record while surfacing the current best knowledge.

Workspace Scope

Memories can be saved at two scopes:

| Scope | Storage Location | Visibility |
|---|---|---|
| project | {repo}/.sverklo/memories.db | Current repository only |
| workspace | ~/.sverklo/workspaces/{name}/memories.db | All repos in workspace |

The workspace scope enables cross-repository decisions (e.g., "we use Postgres everywhere") to be shared across projects. Source: src/server/tools/remember.ts:55-62
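
The two storage locations can be illustrated with a small path resolver. This is a sketch under stated assumptions: the `memoryDbPath` function and the `ScopeLocation` shape are hypothetical, the home directory is passed in rather than read from the environment, and paths are joined with `/` for simplicity.

```typescript
type MemoryScope = "project" | "workspace";

// Hypothetical inputs needed to resolve a database path.
interface ScopeLocation {
  repoRoot: string;        // absolute path of the current repository
  homeDir: string;         // what "~" expands to
  workspaceName?: string;  // required for workspace scope
}

function memoryDbPath(scope: MemoryScope, loc: ScopeLocation): string {
  if (scope === "project") {
    return `${loc.repoRoot}/.sverklo/memories.db`;
  }
  if (!loc.workspaceName) {
    throw new Error("workspace scope requires a workspace name");
  }
  return `${loc.homeDir}/.sverklo/workspaces/${loc.workspaceName}/memories.db`;
}
```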

Community Considerations

Global Memory Setup (Issue #72)

Users requesting sverklo init --global want one-time workspace-level memory setup that doesn't require per-project initialization. The bi-temporal layer's workspace scope partially addresses this by enabling cross-repo memories, but the initialization workflow remains per-project.

MCP Tool Name Conflicts (Issue #71)

Memory tools (remember, recall, memories) may be double-prefixed (sverklo_sverklo_remember) when registered under the sverklo key, though this is a naming convention issue rather than a bi-temporal architecture concern.

Summary

The Bi-Temporal Memory Layer provides sverklo's agents with persistent, conflict-aware knowledge that survives across sessions. Key design decisions:

  • Preservation over deletion: Conflicting memories are marked stale, not removed
  • Dual temporal axes: Valid time and record time enable historical reasoning
  • Tiered retrieval: Core memories auto-inject; archive memories search on demand
  • Location pinning: Memories anchor to files/symbols for context-sensitive recall
  • Git integration: Memories are tied to repository state for traceability

Source: https://github.com/sverklo/sverklo / Human Manual

Indexing System

Related topics: Search and Retrieval System, System Architecture

The indexing system is the core data pipeline that powers sverklo's code intelligence capabilities. It transforms source code into a queryable, multi-dimensional index that combines file metadata, code symbols, dependency graphs, semantic embeddings, and persistent memory.

Overview

Sverklo's indexer builds a four-layer index that feeds all downstream tools:

| Layer | Purpose | Data Structure |
|---|---|---|
| Files | Track all indexed source files with metadata | fileStore |
| Code | Extract symbols, chunks, and documentation edges | codeStore / docEdgeStore |
| Graph | Build dependency relationships for PageRank | graphStore |
| Memory | Persist context and decisions across sessions | memoryStore |

Source: src/server/tools/context.ts

Architecture

graph TD
    A[Source Files] --> B[File Indexer]
    B --> C[fileStore]
    
    D[Code Parsing] --> E[Symbol Extractor]
    E --> F[codeStore]
    F --> G[Doc Edge Store]
    G --> H[docEdgeStore]
    
    I[File Metadata] --> J[Graph Builder]
    J --> K[graphStore]
    K --> L[PageRank Compute]
    L --> K
    
    M[Memory Store] --> N[memoryStore]
    
    C --> O[Hybrid Search]
    F --> O
    K --> O
    G --> O

Indexer Interfaces

The indexer exposes a unified interface combining four specialized stores:

type IndexFiles = {
  fileStore: FileStore;
  getStatus(): ProjectStatus;
};

type IndexCode = {
  codeStore: CodeStore;
  docEdgeStore: DocEdgeStore;
};

type IndexGraph = {
  graphStore: GraphStore;
};

type IndexMemory = {
  memoryStore: MemoryStore;
};

Source: src/server/tools/context.ts

File Store

The file store maintains a registry of all indexed source files with their metadata:

interface FileRecord {
  id: number;
  path: string;
  pagerank: number;
  // ... other metadata
}

The store provides:

  • getAll() - Retrieve all indexed files
  • getById(id) - Lookup by file ID
  • Path-to-ID mapping for graph edge resolution

Source: src/server/audit-obsidian.ts

Build Lookup Maps

const idToPath = new Map<number, string>();
for (const f of files) idToPath.set(f.id, f.path);

Source: src/server/audit-obsidian.ts

Code Store

The code store indexes code symbols and documentation references:

Symbol Extraction

Symbols are extracted from parsed code and stored with:

  • Symbol name and type (function, class, type, interface, method, variable)
  • Source file location
  • Chunk boundaries for code retrieval

Source: src/server/tools/find-references.ts

Documentation Edge Store

Documentation edges connect code symbols to their documentation:

interface DocMention {
  doc_file_path: string;
  doc_breadcrumb?: string;
  match_kind: string;
  edge_kind: "includes" | "reference";
  confidence: number;
}

The store supports:

  • getBySymbol(symbol, limit) - Find documentation mentions of a symbol
  • Edge kind filtering: "includes" for structural inclusions, "reference" for associative mentions
  • Deduplication by file path + breadcrumb + match kind

Source: src/server/tools/find-references.ts
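
The deduplication key and the edge-kind split described above can be sketched like this. The helper names are hypothetical, and keeping the first occurrence of each key is an assumption; find-references.ts may resolve duplicates differently.

```typescript
interface DocMention {
  doc_file_path: string;
  doc_breadcrumb?: string;
  match_kind: string;
  edge_kind: "includes" | "reference";
  confidence: number;
}

// Drops mentions that share file path + breadcrumb + match kind,
// keeping the first one seen (assumed tie-break).
function dedupeMentions(mentions: DocMention[]): DocMention[] {
  const seen = new Set<string>();
  const out: DocMention[] = [];
  for (const m of mentions) {
    const key = JSON.stringify([m.doc_file_path, m.doc_breadcrumb ?? "", m.match_kind]);
    if (seen.has(key)) continue;
    seen.add(key);
    out.push(m);
  }
  return out;
}

// Separates structural inclusions from associative "see also" mentions.
function splitByEdgeKind(mentions: DocMention[]) {
  return {
    includes: mentions.filter((m) => m.edge_kind === "includes"),
    references: mentions.filter((m) => m.edge_kind === "reference"),
  };
}
```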

Graph Store

The graph store builds and maintains the dependency graph:

Edge Structure

interface GraphEdge {
  source_file_id: number;
  target_file_id: number;
}

Import/Dependency Maps

const imports = new Map<string, string[]>();      // file -> files it imports
const importedBy = new Map<string, string[]>();   // file -> files that import it

Source: src/server/audit-obsidian.ts
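
Putting the lookup map and the two import maps together, a minimal sketch could look like the following. It assumes each edge points from the importing file (`source_file_id`) to the imported file (`target_file_id`); the `buildImportMaps` helper is hypothetical and only mirrors the shapes shown above.

```typescript
interface FileRecord { id: number; path: string; }
interface GraphEdge { source_file_id: number; target_file_id: number; }

function buildImportMaps(files: FileRecord[], edges: GraphEdge[]) {
  // Resolve numeric file ids to repo-relative paths.
  const idToPath = new Map<number, string>();
  for (const f of files) idToPath.set(f.id, f.path);

  const imports = new Map<string, string[]>();     // file -> files it imports
  const importedBy = new Map<string, string[]>();  // file -> files that import it
  for (const e of edges) {
    const from = idToPath.get(e.source_file_id);
    const to = idToPath.get(e.target_file_id);
    if (from === undefined || to === undefined) continue; // edge touches an unindexed file
    if (!imports.has(from)) imports.set(from, []);
    imports.get(from)!.push(to);
    if (!importedBy.has(to)) importedBy.set(to, []);
    importedBy.get(to)!.push(from);
  }
  return { imports, importedBy };
}
```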

PageRank Computation

Files receive PageRank scores based on their position in the dependency graph. High PageRank files are considered "load-bearing" modules.

Source: src/server/tools/wakeup.ts
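
For readers unfamiliar with PageRank, here is a textbook power-iteration sketch over a file-level import graph. It is not sverklo's implementation: the damping factor of 0.85, the fixed iteration count, the edge direction (importer to imported), and the even redistribution of rank from files without outgoing edges are all assumptions chosen for illustration. Every edge endpoint is expected to appear in `nodes`.

```typescript
function pageRank(
  nodes: number[],
  edges: Array<[number, number]>, // [importer, imported]
  damping = 0.85,
  iterations = 50,
): Map<number, number> {
  const n = nodes.length;
  const rank = new Map<number, number>();
  const outDegree = new Map<number, number>();
  for (const id of nodes) {
    rank.set(id, 1 / n);
    outDegree.set(id, 0);
  }
  for (const [src] of edges) outDegree.set(src, (outDegree.get(src) ?? 0) + 1);

  for (let i = 0; i < iterations; i++) {
    // Rank held by files with no outgoing edges is spread evenly over all files.
    let dangling = 0;
    for (const id of nodes) {
      if (outDegree.get(id) === 0) dangling += rank.get(id)!;
    }
    const base = (1 - damping) / n + (damping * dangling) / n;
    const next = new Map<number, number>();
    for (const id of nodes) next.set(id, base);
    // Each importer passes an equal share of its rank to the files it imports.
    for (const [src, dst] of edges) {
      next.set(dst, next.get(dst)! + (damping * rank.get(src)!) / outDegree.get(src)!);
    }
    for (const id of nodes) rank.set(id, next.get(id)!);
  }
  return rank;
}
```

With this edge direction, a file imported by many others accumulates rank, which matches the "load-bearing" reading above.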

Hybrid Search

The hybrid search combines multiple retrieval signals:

const { hybridSearch } = require("../../search/hybrid-search.js");

Search Signals

| Signal | Description |
|---|---|
| BM25 | Traditional keyword matching |
| Vector/Embeddings | Semantic similarity via bi-encoder |
| PageRank | Graph-based importance |
| Symbol Match | Exact symbol references |

Source: src/server/tools/context.ts

Memory Store

The memory store provides persistent context across sessions:

Memory Tiers

| Tier | Usage |
|---|---|
| core | Project invariants, auto-injected on session start |
| archive | General memories and decisions, searched on demand |

Memory Categories

  • Conventions
  • Architecture decisions
  • Project-specific patterns

Source: src/server/mcp-server.ts

Status Reporting

The indexer provides project status through getStatus():

const status = indexer.getStatus();
// Returns: { projectName, fileCount, languages, ... }

Source: src/server/tools/wakeup.ts

Index Timestamp Bug (Issue #74)

sverklo reindex does not update the lastIndexed field in registry.json, causing stale age displays after reindexing.

Reproduction:

sverklo register .
sverklo reindex .
sverklo list  # Still shows stale age

Parser Regression (Issue #28)

A string/comment-aware brace counter fix in the parser recovered P1 from 0.30 → 0.73 on the 90-task benchmark but caused slight regressions in P2/P4 categories.

Retrieval Architecture (Issue #29)

Community discussion on evaluating ColBERT/PLAID-style multi-vector rerankers against the current bi-encoder + BM25 + PageRank architecture.

Configuration

The indexer supports path-based filtering:

| Option | Description |
|---|---|
| scope | Path prefix to constrain indexing |
| ignore | Patterns to exclude from indexing |

Wakeup Generation

The wakeup system generates a quick project orientation:

function generateWakeup(indexer, options) {
  const status = indexer.getStatus();
  const coreMemories = indexer.memoryStore.getCore(10);
  const topFiles = indexer.fileStore.getAll().slice(0, 5);
}

Output includes:

  • Project name and file count
  • Top 5 files by PageRank
  • Core memories (tier='core')

Source: src/server/tools/wakeup.ts

Source: https://github.com/sverklo/sverklo / Human Manual

Search Tools Reference

Related topics: Impact and Reference Tools, Search and Retrieval System

Sverklo provides a layered suite of search and investigation tools designed to help coding agents navigate, understand, and reason about codebases. The tools span from fast keyword lookups to deep semantic analysis, with reciprocal rank fusion combining multiple retrieval signals for high-quality results.

Architecture Overview

Sverklo's search system uses a hybrid retrieval architecture that combines three signals:

| Signal | Mechanism | Purpose |
|---|---|---|
| BM25 | Sparse keyword matching | Exact term matches and code identifiers |
| Bi-encoder embeddings | Dense vector similarity | Semantic/code-similarity search |
| PageRank | Graph-based importance | Prioritize high-centrality files |

These signals are merged using reciprocal rank fusion (RRF) to produce ranked results that balance precision with recall. Source: package.json (keywords: "reciprocal-rank-fusion", "bm25", "pagerank")
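
Reciprocal rank fusion itself is simple to state: each document scores the sum of 1 / (k + rank) over every ranked list that contains it. The sketch below shows the standard formulation; the constant k = 60 is a commonly used default and an assumption here, as are the function name and the use of plain string ids. Sverklo's actual fusion may weight signals differently.

```typescript
// rankings: one ordered list of ids per retriever, best match first.
function reciprocalRankFusion(
  rankings: string[][],
  k = 60,
): Array<{ id: string; score: number }> {
  const scores = new Map<string, number>();
  for (const ranking of rankings) {
    ranking.forEach((id, index) => {
      // index is 0-based, so the rank used in the formula is index + 1.
      scores.set(id, (scores.get(id) ?? 0) + 1 / (k + index + 1));
    });
  }
  return Array.from(scores.entries())
    .map(([id, score]) => ({ id, score }))
    .sort((a, b) => b.score - a.score);
}
```

Because only ranks are used, the three signals never need to be normalized onto a common score scale.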

graph TD
    subgraph "Retrieval Layer"
        BM25[BM25 Keyword Search]
        EMB[Bi-encoder Embeddings]
        PGR[PageRank Scorer]
    end
    
    subgraph "Fusion"
        RRF[Reciprocal Rank Fusion]
    end
    
    subgraph "Post-Processing"
        VERIFY[Verify Results]
        REFINE[Refine & Deduplicate]
    end
    
    BM25 --> RRF
    EMB --> RRF
    PGR --> RRF
    RRF --> VERIFY
    VERIFY --> REFINE
    REFINE --> RESULTS[Final Results]

Tool Categories

Sverklo organizes search tools into four functional categories, each targeting a specific stage of code investigation.

Discovery Tools

These tools help locate code and understand what's in the codebase.

| Tool | Purpose |
|---|---|
| search | Primary semantic search across code chunks |
| search_iterative | Multi-turn search with result refinement |
| investigate | Fan-out search across FTS, embeddings, symbols, and refs simultaneously |
| grep | Exact string/pattern matching |
| head | View beginning of files |

Source: tool-overrides.ts (tool lists in research and lean profiles)

Context Tools

These tools bundle information for rapid orientation.

| Tool | Purpose |
|---|---|
| context | Umbrella tool returning codebase overview + search + symbols + memories in one call |
| overview | Structural summary with file/chunk counts and PageRank rankings |
| ask | Free-form question answering over indexed content |
| wakeup | Quick orientation summary for new sessions |

The context tool is designed as the "first call" when starting work on a new task, returning a curated bundle in a single round trip:

Give a task description and get a single curated bundle: codebase overview header, semantically relevant code, related symbols, and matching saved memories — in one round trip.

Source: src/server/tools/context.ts (tool description)

Navigation Tools

These tools help traverse code structure and relationships.

| Tool | Purpose |
|---|---|
| lookup | Retrieve full code chunks by path |
| refs | Find references to symbols, functions, and variables |
| deps | Explore dependency relationships |
| clusters | Group similar code patterns |
| patterns | Identify recurring code idioms |
| concepts | Extract and map high-level concepts |

Verification Tools

These tools validate findings and assess code quality.

| Tool | Purpose |
|---|---|
| verify | Check if cited evidence actually supports claims |
| critique | Multi-dimensional analysis of response quality |
| review_diff | Risk-scored PR review with heuristic finding detection |
| audit | Codebase health scoring |

Source: tool-overrides.ts (verification tools in research and review profiles)

Primary Search: `search`

The primary search tool uses hybrid retrieval to find semantically relevant code chunks.

const searchTool = {
  name: "search",
  description: "Semantic code search using FTS + embeddings + PageRank"
};

Results include found_by tags that indicate which retrievers agreed on each result — results marked by multiple signals are higher-signal than single-source hits.

Source: prompts.ts (investigate prompt instructs: "Read the found_by tags — results agreed on by multiple retrievers are higher-signal than single-source hits")

Iterative Search: `search_iterative`

For complex queries requiring refinement, the iterative search tool supports multi-turn exploration where results from one search inform subsequent queries.

Source: tool-overrides.ts (included in research profile)

Investigation Tool: `investigate`

The investigate tool performs a single-pass fan-out across all retrieval signals simultaneously:

  1. Full-text search (BM25)
  2. Vector embeddings (semantic similarity)
  3. Symbol index (function/class names)
  4. Reference graph (who calls whom)

This is the recommended starting point for feature mapping and deep code exploration.

Source: prompts.ts (map-feature prompt)

Context Bundler: `context`

The context tool is an umbrella that combines multiple searches into a single curated response.

Parameters

| Parameter | Type | Description |
|---|---|---|
| task | string | Free-form task description |
| detail_level | "minimal" \| "normal" \| "full" | Amount of detail to return |
| scope | string | Optional path prefix constraint |
| budget | number | PageRank-pruned token budget |

Detail Levels

| Level | Contents |
|---|---|
| minimal | Overview header + top 3 search hits + top 2 memories |
| normal | Overview header + top 5 search hits + top 5 memories + symbol table |
| full | Normal + dependency neighbors of top results |

When budget is set, returns a PageRank-pruned repo map greedily filled to the token limit.

Source: src/server/tools/context.ts (input schema and implementation)
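
The budget behavior can be pictured as a greedy fill ordered by PageRank. This is a sketch, not the code in context.ts: the `RepoMapEntry` shape, the precomputed `tokens` cost per entry, and the choice to skip an entry that does not fit (rather than stop at the first overflow) are all assumptions.

```typescript
// Hypothetical entry: one file with its rank and estimated token cost.
interface RepoMapEntry { path: string; pagerank: number; tokens: number; }

function fillBudget(entries: RepoMapEntry[], budget: number): RepoMapEntry[] {
  // Highest-PageRank files are considered first.
  const byRank = entries.slice().sort((a, b) => b.pagerank - a.pagerank);
  const picked: RepoMapEntry[] = [];
  let used = 0;
  for (const e of byRank) {
    if (used + e.tokens > budget) continue; // does not fit, try lower-ranked entries
    picked.push(e);
    used += e.tokens;
  }
  return picked;
}
```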

Finding References: `refs`

The refs tool finds all references to a symbol, including:

  • Call sites
  • Type definitions
  • Documentation mentions
  • Import/export relationships

A significant enhancement in recent versions separates structural inclusions from associative references:

Sprint 9: split structural inclusions from associative references so callers see "this is where the symbol is documented" separately from "see also" mentions.

Results include deduplication to avoid emitting near-identical lines when both an outer fenced chunk and its inner content resolve to the same symbol.

Source: src/server/tools/find-references.ts (deduplication logic and comment)

Documentation Detection

The refs tool detects when markdown or README files reference a symbol by backtick or fenced code:

If any markdown / README / ADR chunks reference this symbol by backtick or fenced code, surface them so the agent sees both the code and its documentation together.

This helps agents identify both the implementation and its documentation in one view.

Source: src/server/tools/find-references.ts (doc mention detection)

Verification: `verify` and `critique`

These tools validate that search results actually support agent claims.

`verify`

Checks if cited evidence points to actual file locations and content matches the claim.

`critique`

Performs multi-dimensional analysis including:

  • Claim verification against cited evidence
  • Stale memory detection
  • Moved symbol tracking
  • Hub file citation analysis
  • Undefined symbol detection
  • Undocumented symbol identification

interface CritiqueData {
  claim: string | null;
  verify: VerifyResult[];
  stale: VerifyResult[];
  moved: VerifyResult[];
  hubsCited: string[];
  missedHubs: string[];
  undefinedSymbols: string[];
  undocumentedSymbols: string[];
  totalSymbols: number;
}

The critique tool specifically checks if documentation files (.md, .markdown, .mdx) cite the symbols — if not, the symbol is flagged as undocumented.

Source: src/server/tools/critique.ts (formatCritique function and verification logic)
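
The undocumented-symbol check can be sketched as a filter over the files that cite each symbol. The extension list comes from the description above; the helper names, the `citations` map shape, and the case-insensitive extension match are assumptions rather than details confirmed from critique.ts.

```typescript
const DOC_EXTENSIONS = [".md", ".markdown", ".mdx"];

function isDocFile(path: string): boolean {
  const lower = path.toLowerCase();
  return DOC_EXTENSIONS.some((ext) => lower.endsWith(ext));
}

// citations: symbol name -> paths of files that mention it.
// A symbol with no documentation file among its citations is undocumented.
function findUndocumented(citations: Map<string, string[]>): string[] {
  const undocumented: string[] = [];
  citations.forEach((files, symbol) => {
    if (!files.some(isDocFile)) undocumented.push(symbol);
  });
  return undocumented;
}
```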

Tool Profiles

Sverklo organizes tools into named profiles for different workflows:

| Profile | Tools Included | Use Case |
|---|---|---|
| full | All tools | Complete access |
| minimal | search, lookup, overview, refs, impact | Quick lookups |
| lean | search, lookup, overview, refs, impact, deps, context, status, remember, recall, review_diff | Development workflow |
| research | search, search_iterative, investigate, ask, lookup, overview, refs, impact, deps, concepts, patterns, clusters, verify, critique, ctx_slice, ctx_grep, ctx_stats, status | Code investigation |
| review | review_diff, diff_search, test_map, impact, refs, lookup, search, investigate, verify, status | PR/MR review |

Source: tool-overrides.ts (PROFILES constant)

Environment Configuration

Tool Descriptions

Override tool descriptions via environment variables:

SVERKLO_TOOL_<NAME>_DESCRIPTION="custom description"

For example:

SVERKLO_TOOL_SEARCH_DESCRIPTION="Custom search override"

Tool Profiles

Select which tools are available via SVERKLO_PROFILE:

SVERKLO_PROFILE=research  # Enable investigation tools
SVERKLO_PROFILE=review    # Enable PR review tools

Tool Disabling

Disable specific tools via SVERKLO_DISABLED_TOOLS:

SVERKLO_DISABLED_TOOLS=search,investigate

Source: tool-overrides.ts (environment variable handling)
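
How the three variables combine can be sketched as below. This is an illustration, not the logic in tool-overrides.ts: the `resolveTools` function, the fallback to all tools when `SVERKLO_PROFILE` is unset or unknown, and the mapping of a tool name to its upper-cased form inside `SVERKLO_TOOL_<NAME>_DESCRIPTION` are assumptions. The environment is passed in as a plain object so the example has no runtime dependencies.

```typescript
type Env = Record<string, string | undefined>;

interface ResolvedTool { name: string; description?: string; }

function resolveTools(
  env: Env,
  allTools: string[],
  profiles: Record<string, string[]>,
): ResolvedTool[] {
  // Assumed default: no profile (or an unknown one) means every tool.
  const profile = env.SVERKLO_PROFILE ?? "full";
  const base = profile === "full" ? allTools : (profiles[profile] ?? allTools);

  // Comma-separated list; blanks and surrounding spaces are ignored.
  const disabled = new Set(
    (env.SVERKLO_DISABLED_TOOLS ?? "")
      .split(",")
      .map((s) => s.trim())
      .filter((s) => s.length > 0),
  );

  return base
    .filter((name) => !disabled.has(name))
    .map((name) => ({
      name,
      // undefined means "keep the built-in description".
      description: env[`SVERKLO_TOOL_${name.toUpperCase()}_DESCRIPTION`],
    }));
}
```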

Prompt Templates

Sverklo includes prompt templates that encode the recommended order of tool calls for common tasks.

Map Feature Prompt

Maps a feature across the codebase using a recommended workflow:

  1. investigate — single-pass fan-out over all retrieval signals
  2. refs — expand surface area for top symbols
  3. impact — assess blast radius before proposing changes
  4. overview — structural summary for architectural context
  5. search — keyword/diff-aware search for recent changes

Architecture Map Prompt

Generates an architecture map using:

  1. overview — structural summary with PageRank
  2. deps — dependency analysis for top files
  3. recall — any saved design decisions
  4. search — find entry points

Source: src/server/prompts.ts (prompt definitions)

GitHub Action Integration

The search tools integrate with GitHub Actions for automated PR review:

uses: sverklo/sverklo/action@main
with:
  github-token: ${{ secrets.GITHUB_TOKEN }}
  fail-on: high
  max-files: 25

The action uses heuristic finding detection to identify risky code patterns and posts results as GitHub PR review comments with inline annotations.

Source: action/README.md (usage documentation)

Community Considerations

Retrieval Architecture Evaluation

The community has discussed evaluating alternative retrieval architectures:

LinkedIn discussion... raised two concrete pushes against sverklo's current retrieval architecture, both worth taking seriously...

Specifically, issue #29 discusses evaluating ColBERT/PLAID-style multi-vector rerankers against the current bi-encoder + BM25 + PageRank approach.

Regressions After Parser Fix

Benchmark issue #28 documents a regression in P2/P4 performance after a parser brace-counter fix. This affected search quality for certain code patterns, highlighting the importance of the indexer's parsing quality for retrieval accuracy.

Source: https://github.com/sverklo/sverklo / Human Manual

Impact and Reference Tools

Related topics: Search Tools Reference, Indexing System

The Impact and Reference Tools form the graph-analysis layer of sverklo's code intelligence system. These tools enable agents to understand how code entities relate to each other, assess the blast radius of proposed changes, and verify that code modifications haven't introduced broken references or undocumented dependencies.

Overview

Sverklo maintains a dependency graph built from import/export relationships parsed during indexing. The Impact and Reference Tools query this graph to answer two fundamental questions:

  1. What depends on this code? — Reference analysis traces consumers and dependencies
  2. What would break if this changed? — Impact analysis calculates blast radius and risk scores

These tools are available across all tool profiles (core, nav, lean, research, review) and are considered essential for safe refactoring and architectural decisions. Source: src/server/tool-overrides.ts:1-51

Tool Inventory

Core Graph Tools

| Tool | Purpose | Primary Use Case |
|---|---|---|
| refs | Find all references to a symbol | Understanding usage patterns before refactoring |
| impact | Calculate blast radius of changes | Risk assessment for proposed modifications |
| deps | Show dependency graph for a file/symbol | Understanding architectural layers |

Supporting Tools

| Tool | Purpose | Integration |
|---|---|---|
| investigate | Fan-out search across FTS, vectors, symbols, refs | Research workflow entry point |
| verify | Cross-reference claims against codebase | Code review and documentation validation |
| critique | Structured verification of code claims | PR review and architectural review |

Source: src/server/mcp-server.ts

Reference Analysis

Symbol Reference Finding

The refs tool traces both direct code references and documentation mentions of symbols. It builds bidirectional maps from the graph store to answer "who imports this?" and "what does this import?" queries.

graph LR
    A[Symbol Query] --> B[Graph Store]
    B --> C[Import Map<br/>file → files it imports]
    B --> D[Imported-By Map<br/>file → files importing it]
    C --> E[Direct Dependencies]
    D --> F[Direct Consumers]
    E --> G[Transitive Dependencies<br/>N hops]
    F --> H[Transitive Consumers<br/>N hops]

The tool separates structural inclusions from associative references, surfacing "this is where the symbol is documented" separately from "see also" mentions. Source: src/server/tools/find-references.ts:1-40

Documentation Citation Tracking

The refs tool detects when documentation (.md, .markdown, .mdx files) references a symbol. This helps agents identify:

  • Architecture Decision Records (ADRs) that document the symbol's design
  • Usage examples in README files
  • Related documentation that should be updated alongside code changes

Reference rows are deduplicated to prevent emitting near-identical lines when both an outer fenced chunk and inner fence resolve to the same symbol. Source: src/server/tools/find-references.ts:25-38

Verify Results Format

Reference findings are returned with file path, line information, and match kind:

interface VerifyResult {
  file?: string;           // Repo-relative path
  line?: number;           // 1-based line number
  match_kind: string;      // 'import' | 'call' | 'type_ref' | 'doc_mention'
  confidence?: number;     // 0-1 for heuristic matches
}

Impact Analysis

Blast Radius Calculation

The impact tool calculates how many and which files would be affected by changes to a given symbol. It uses PageRank scores to prioritize the most important affected files.

graph TD
    A[Changed Symbol] --> B[Direct Consumers]
    B --> C[Test Files]
    B --> D[Direct Importers]
    C --> E[High Risk<br/>No Alternative Path]
    D --> F[Transitive Consumers]
    F --> G[Indirect Dependencies]
    G --> H[Risk Score by<br/>PageRank Weight]
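
A file-level approximation of the blast-radius walk is sketched below. It assumes an `importedBy` map (file to the files that import it) and precomputed PageRank scores are available; the `blastRadius` function is hypothetical and works on files, whereas the real tool operates on symbols and applies additional risk factors.

```typescript
interface RankedFile { path: string; pagerank: number; }

function blastRadius(
  target: string,
  importedBy: Map<string, string[]>,
  pagerank: Map<string, number>,
): RankedFile[] {
  const seen = new Set<string>([target]); // also guards against import cycles
  const queue = [target];
  const affected: RankedFile[] = [];
  // Breadth-first walk over direct and transitive consumers.
  while (queue.length > 0) {
    const file = queue.shift()!;
    for (const consumer of importedBy.get(file) ?? []) {
      if (seen.has(consumer)) continue;
      seen.add(consumer);
      queue.push(consumer);
      affected.push({ path: consumer, pagerank: pagerank.get(consumer) ?? 0 });
    }
  }
  // Most central files first, so they are reviewed before the long tail.
  return affected.sort((a, b) => b.pagerank - a.pagerank);
}
```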

Risk Scoring Factors

Impact analysis considers multiple factors:

| Factor | Weight | Description |
|---|---|---|
| PageRank Score | High | Files with higher centrality are riskier |
| Test Coverage | Medium | Files with tests are safer to change |
| Fan-out Count | Medium | Files importing many things affect more |
| Circular Dependencies | High | Changes in cycles affect all members |

Source: src/server/mcp-server.ts:40-70

Partition Plans

For large blast radii, the impact tool returns partition plans that break the change into buckets. Agents should pick one bucket and drill in rather than attempting to read the full list. This is especially important for monorepos with hundreds of affected files. Source: src/server/prompts.ts:20-35

Diff-Aware Analysis

The diff_search tool combines semantic search with git diff awareness. It searches only changed files between two refs, useful for understanding what changed in a PR:

graph LR
    A[Query + Ref Range] --> B[git diff]
    B --> C[Changed Paths]
    C --> D[Filtered Search<br/>Only Changed Files]
    D --> E[Impact on Changed Code]

Parameters:

| Parameter | Type | Default | Description |
|---|---|---|---|
| query | string | required | Search query |
| ref | string | main..HEAD | Git ref range |
| include_callers | number | 0 | Include N-hop transitive callers |
| token_budget | number | 3000 | Max tokens to return |
| type | enum | any | Filter by symbol type |

Source: src/server/tools/diff-search.ts:1-80

Include Callers Mode

When include_callers is set, the tool includes files that import the changed files. This answers "what uses these changed files?" rather than just "what changed?":

  • 0 — Only changed files
  • 1 — Changed files + direct callers
  • 2 — Changed files + transitive callers

Source: src/server/tools/diff-search.ts:20-30
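
The hop semantics can be sketched as a depth-limited expansion over an `importedBy` map (file to the files that import it). The `expandWithCallers` helper and the map shape are assumptions for illustration; diff-search.ts may compute the caller set differently.

```typescript
function expandWithCallers(
  changed: string[],
  importedBy: Map<string, string[]>,
  hops: number,
): Set<string> {
  const result = new Set<string>(changed);
  let frontier = changed;
  // Each pass adds the files that import anything found in the previous pass.
  for (let depth = 0; depth < hops && frontier.length > 0; depth++) {
    const next: string[] = [];
    for (const file of frontier) {
      for (const caller of importedBy.get(file) ?? []) {
        if (!result.has(caller)) {
          result.add(caller);
          next.push(caller);
        }
      }
    }
    frontier = next;
  }
  return result;
}
```

With `hops` set to 0 the loop never runs, so only the changed files are searched, which matches the default.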

Critique and Verification

Claim Verification

The critique tool validates code claims by cross-referencing them against the indexed codebase. It checks:

  1. Stale references — Symbols that no longer exist or have moved
  2. Undefined symbols — References to unindexed or non-existent code
  3. Documentation coverage — Whether cited symbols have documentation mentions
  4. Hub citation — Whether high-centrality files are referenced

graph TD
    A[Code Claim] --> B[Verify Against Index]
    B --> C{Still Exists?}
    C -->|No| D[Moved Symbol]
    C -->|No| E[Undefined Symbol]
    C -->|Yes| F{Has Docs?}
    F -->|No| G[Undocumented Symbol]
    F -->|Yes| H[Verified Claim]
    D --> I[Critique Report]
    E --> I
    G --> I

The tool detects when none of the cited evidence points at documentation files, suggesting the answer skipped documentation. Source: src/server/tools/critique.ts:1-50

Critique Data Structure

interface CritiqueData {
  claim: string | null;
  verify: VerifyResult[];
  stale: VerifyResult[];
  moved: VerifyResult[];
  hubsCited: string[];
  missedHubs: string[];
  undefinedSymbols: string[];
  undocumentedSymbols: string[];
  totalSymbols: number;
}

Tool Profiles

Tool availability varies by profile. The Impact and Reference tools are available in all profiles:

Toolcorenavleanresearchreview
refs
impact
deps
investigate
verify
critique

Source: src/server/tool-overrides.ts:10-50

Integration with Memory

The graph tools integrate with sverklo's memory layer:

  • Core memories (tier='core') are project invariants auto-injected on every session start
  • Impact analysis can be saved as memories using remember
  • Reference findings can be recalled in future sessions

This enables agents to build institutional knowledge about risky changes and their outcomes. Source: src/server/mcp-server.ts:75-100

Usage Patterns

Safe Refactoring Workflow

  1. Identify the symbol to refactor
  2. Call refs to understand all consumers
  3. Call impact to assess blast radius
  4. Review partition plans if blast radius is large
  5. Save decisions with remember for future reference

PR Review Workflow

  1. Call investigate to understand changed components
  2. Call diff_search on the PR branch
  3. Call verify to check for broken references
  4. Call critique to validate architectural claims

Source: src/server/prompts.ts:1-30

Performance Considerations

  • Reference lookups use in-memory graph stores for sub-millisecond response
  • PageRank scores are precomputed during indexing
  • Transitive dependency traversal is bounded by default to prevent runaway queries
  • Token budgets prevent oversized responses in large codebases

Source: https://github.com/sverklo/sverklo / Human Manual

Doramagic Pitfall Log

Source-linked risks stay visible on the manual page so the preview does not read like a recommendation.

Found 19 structured pitfall item(s), including 1 high/blocking item(s). Top priority: Configuration risk - Configuration risk requires verification.

1. Configuration risk: Configuration risk requires verification

  • Severity: high
  • Finding: Project evidence flags a configuration risk. Review the linked source before relying on this workflow.
  • User impact: May increase setup, validation, or first-run risk for the user.
  • Recommended check: Reproduce the official install and quickstart path in an isolated environment.
  • Evidence: packet_text.keyword_scan | github_repo:1203034717 | https://github.com/sverklo/sverklo

2. Installation risk: Installation risk requires verification

  • Severity: medium
  • Finding: Project evidence flags an installation risk. Review the linked source before relying on this workflow.
  • User impact: May increase setup, validation, or first-run risk for the user.
  • Recommended check: Reproduce the official install and quickstart path in an isolated environment.
  • Evidence: community_evidence:github | cevd_7a50e3a046d2438db185ba21d580ec9e | https://github.com/sverklo/sverklo/issues/71

3. Installation risk: Installation risk requires verification

  • Severity: medium
  • Finding: Project evidence flags an installation risk. Review the linked source before relying on this workflow.
  • User impact: May increase setup, validation, or first-run risk for the user.
  • Recommended check: Reproduce the official install and quickstart path in an isolated environment.
  • Evidence: community_evidence:github | cevd_13e1bc9ab7fa41a0866eb6c4f814875c | https://github.com/sverklo/sverklo/issues/60

4. Installation risk: Installation risk requires verification

  • Severity: medium
  • Finding: Project evidence flags an installation risk. Review the linked source before relying on this workflow.
  • User impact: May increase setup, validation, or first-run risk for the user.
  • Recommended check: Reproduce the official install and quickstart path in an isolated environment.
  • Evidence: community_evidence:github | cevd_a8bdc3779b264243b8362d6e57096e25 | https://github.com/sverklo/sverklo/issues/61

5. Installation risk: Installation risk requires verification

  • Severity: medium
  • Finding: Project evidence flags an installation risk. Review the linked source before relying on this workflow.
  • User impact: May increase setup, validation, or first-run risk for the user.
  • Recommended check: Reproduce the official install and quickstart path in an isolated environment.
  • Evidence: community_evidence:github | cevd_c0c1f6a71a764af596178de506d0b2c3 | https://github.com/sverklo/sverklo/issues/58

6. Installation risk: Installation risk requires verification

  • Severity: medium
  • Finding: Project evidence flags an installation risk. Review the linked source before relying on this workflow.
  • User impact: May increase setup, validation, or first-run risk for the user.
  • Recommended check: Reproduce the official install and quickstart path in an isolated environment.
  • Evidence: community_evidence:github | cevd_6be83aacd98c4e3abb6ae6361bf81940 | https://github.com/sverklo/sverklo/issues/69

7. Installation risk: Installation risk requires verification

  • Severity: medium
  • Finding: Project evidence flags an installation risk. Review the linked source before relying on this workflow.
  • User impact: May increase setup, validation, or first-run risk for the user.
  • Recommended check: Reproduce the official install and quickstart path in an isolated environment.
  • Evidence: community_evidence:github | cevd_fc3cc34d92454d5a92ab4a196b178799 | https://github.com/sverklo/sverklo/issues/72

8. Installation risk: Installation risk requires verification

  • Severity: medium
  • Finding: Project evidence flags an installation risk. Review the linked source before relying on this workflow.
  • User impact: May increase setup, validation, or first-run risk for the user.
  • Recommended check: Reproduce the official install and quickstart path in an isolated environment.
  • Evidence: community_evidence:github | cevd_bf329d5553724c3281773c6aee96cae5 | https://github.com/sverklo/sverklo/issues/74

9. Installation risk: Installation risk requires verification

  • Severity: medium
  • Finding: Project evidence flags an installation risk. Review the linked source before relying on this workflow.
  • User impact: May increase setup, validation, or first-run risk for the user.
  • Recommended check: Reproduce the official install and quickstart path in an isolated environment.
  • Evidence: community_evidence:github | cevd_42920ecfbbc54f4f8b207e386dfc9ebd | https://github.com/sverklo/sverklo/issues/73

10. Configuration risk: Configuration risk requires verification

  • Severity: medium
  • Finding: Project evidence flags a configuration risk. Review the linked source before relying on this workflow.
  • User impact: May increase setup, validation, or first-run risk for the user.
  • Recommended check: Reproduce the official install and quickstart path in an isolated environment.
  • Evidence: capability.host_targets | github_repo:1203034717 | https://github.com/sverklo/sverklo

11. Capability evidence risk: Capability evidence risk requires verification

  • Severity: medium
  • Finding: README/documentation is current enough for a first validation pass.
  • User impact: May increase setup, validation, or first-run risk for the user.
  • Recommended check: Reproduce the official install and quickstart path in an isolated environment.
  • Evidence: capability.assumptions | github_repo:1203034717 | https://github.com/sverklo/sverklo

12. Maintenance risk: Maintenance risk requires verification

  • Severity: medium
  • Finding: Project evidence flags a maintenance risk. Review the linked source before relying on this workflow.
  • User impact: May increase setup, validation, or first-run risk for the user.
  • Recommended check: Reproduce the official install and quickstart path in an isolated environment.
  • Evidence: evidence.maintainer_signals | github_repo:1203034717 | https://github.com/sverklo/sverklo

Source: Doramagic discovery, validation, and Project Pack records

Community Discussion Evidence

These external discussion links are review inputs, not standalone proof that the project is production-ready.

Sources: 12 (count of project-level external discussion links exposed on this manual page)

Use: Review before install. Open the linked issues or discussions before treating the pack as ready for your environment.

Source: Project Pack community evidence and pitfall evidence