Doramagic Project Pack · Human Manual
sverklo
Related topics: Installation, Quick Start Guide, System Architecture
Overview
Sverklo is a local-first MCP (Model Context Protocol) server that provides repository memory and code intelligence for AI coding agents. It enables persistent context, semantic search, dependency graphs, blast-radius analysis, diff-aware review, and git-pinned decisions across coding sessions.
Version: 0.29.0 · License: MIT · Repository: sverklo/sverklo · Website: https://sverklo.com
Purpose and Scope
Sverklo transforms a codebase into a queryable knowledge graph that AI agents can interact with across sessions. Unlike cloud-based solutions, sverklo runs entirely locally—no API keys or code upload required. The system indexes source files, builds dependency graphs, computes PageRank scores, and maintains persistent memories.
Key capabilities include:
| Category | Capabilities |
|---|---|
| Search | Semantic embeddings, BM25 full-text search, hybrid retrieval, PageRank-weighted ranking |
| Graph | Dependency analysis, blast-radius computation, impact analysis |
| Memory | Persistent context across sessions, core project invariants, categorized memories |
| Review | Diff-aware PR review, risk scoring, structural heuristics |
| Audit | Codebase health scoring, architecture diagrams, Obsidian-compatible exports |
Source: package.json:5-33
Architecture Overview
graph TD
subgraph "Client Layer"
IDE[Claude Code / Cursor / Windsurf / Codex CLI]
end
subgraph "MCP Server"
MCP[MCP Protocol Handler]
Tools[Tool Router]
Hints[Intent Hints Engine]
end
subgraph "Indexer Subsystem"
Files[File Indexer]
Code[Code Parser]
Graph[Dependency Graph]
Memory[Memory Store]
Search[Hybrid Search Engine]
end
subgraph "Stores"
FS[File Store]
GS[Graph Store]
MS[Memory Store]
DS[Doc Edge Store]
end
IDE <--> MCP
MCP <--> Tools
Tools <--> Hints
Tools <--> Indexer
Indexer <--> Stores
The MCP server (src/server/mcp-server.ts) implements the Model Context Protocol, exposing tools and resources that IDE clients consume. The indexer subsystem coordinates file scanning, AST-based code parsing, graph construction, and search indexing into multiple backing stores.
Source: src/server/mcp-server.ts:1-50
MCP Tools
Sverklo exposes a comprehensive set of code intelligence tools via the MCP protocol. Tools are organized into presets that optimize the available surface area for different agent workflows.
Tool Presets
| Preset | Purpose | Tools Included |
|---|---|---|
| `default` | Balanced overview + search | search, lookup, overview, refs, impact |
| `nav` | Navigation focus | search, lookup, overview, refs, impact, deps, context, status |
| `lean` | Minimal footprint | search, lookup, overview, refs, impact, deps, context, status, remember, recall, review_diff |
| `research` | Code exploration | search, search_iterative, investigate, ask, lookup, overview, refs, impact, deps, concepts, patterns, clusters, verify, critique, ctx_slice, ctx_grep, ctx_stats, status |
| `review` | PR/MR review | review_diff, diff_search, test_map, impact, refs, lookup, search, investigate, verify, status |
Source: src/server/tool-overrides.ts:1-60
Core Tools
Context Bundle (context) — An umbrella tool that returns a curated bundle in a single call: codebase overview, semantically relevant code, related symbols, and matching memories. This is the recommended first call for unfamiliar tasks.
inputSchema: {
task: string, // Free-form task description
detail_level: enum, // "minimal" | "normal" | "full"
scope: string, // Optional path prefix filter
budget: number // PageRank-pruned token budget
}
Source: src/server/tools/context.ts:1-30
Search (search, search_iterative) — Hybrid retrieval combining:
- Full-text search via BM25
- Semantic embeddings via ONNX runtime
- PageRank-weighted ranking
- Reciprocal rank fusion
Source: src/server/tools/context.ts:45-55
Critique (critique) — Validates an agent's answer by checking cited evidence for staleness, detecting missed high-PageRank hubs, and flagging undocumented symbols. Returns structured critique without LLM calls.
Source: src/server/tools/critique.ts:1-60
Review Diff (review_diff) — Diff-aware code review with risk scoring and structural heuristics. Emits structured GitHub PR review JSON for CI integration.
Source: src/server/tools/review-format.ts:1-40
Intent-Aware Hints
The hint engine tracks recent tool-call trajectories and appends "next steps" suggestions. It classifies intent into categories:
| Intent | Trigger Patterns |
|---|---|
| `exploring` | Search, lookup, investigate sequences |
| `reviewing-diff` | Diff tools followed by refs |
| `tracing-impact` | Impact analysis after symbol lookups |
| `debugging` | Grep, lookup, investigate patterns |
| `onboarding` | Context, overview, status calls |
| `memory-curating` | Remember, recall sequences |
Source: src/server/hints.ts:1-50
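One way to picture the trajectory-based classification is a scoring pass over the most recent tool calls. The sketch below is illustrative only: `classifyIntent`, the `INTENT_TOOLS` table, and the "highest match count wins" rule are assumptions, not the logic in src/server/hints.ts.

```typescript
// Hypothetical sketch: not the actual hint engine in src/server/hints.ts.
type Intent =
  | "exploring" | "reviewing-diff" | "tracing-impact"
  | "debugging" | "onboarding" | "memory-curating";

// Assumed mapping from each intent to the tools that signal it,
// loosely following the trigger patterns in the table above.
const INTENT_TOOLS: Record<Intent, string[]> = {
  "exploring": ["search", "lookup", "investigate"],
  "reviewing-diff": ["review_diff", "diff_search", "refs"],
  "tracing-impact": ["impact", "lookup"],
  "debugging": ["ctx_grep", "lookup", "investigate"],
  "onboarding": ["context", "overview", "status"],
  "memory-curating": ["remember", "recall"],
};

// Count how many recent calls match each intent; the highest count wins.
// Ties resolve to the intent listed first in INTENT_TOOLS.
function classifyIntent(recentCalls: string[]): Intent | null {
  let best: Intent | null = null;
  let bestScore = 0;
  for (const intent of Object.keys(INTENT_TOOLS) as Intent[]) {
    const tools = INTENT_TOOLS[intent];
    const score = recentCalls.filter((call) => tools.indexOf(call) !== -1).length;
    if (score > bestScore) {
      best = intent;
      bestScore = score;
    }
  }
  return best;
}
```

A real engine would likely also weigh call order (for example, "diff tools followed by refs"), which this count-based sketch ignores.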
Memory System
Sverklo maintains persistent context across coding sessions through a tiered memory architecture:
graph LR
Core[Core Memories<br/>Tier: core]
Recent[Recent Memories<br/>Tier: session]
Stale[Stale Flagging<br/>is_stale flag]
Core --> Session[Auto-injected on session start]
Recent --> Session
Stale --> Session
Memory Tiers:
- Core — Project invariants, always auto-injected at session start
- Recent — Session-scoped memories, last N entries
- Stale — Flagged when underlying code changes (detected via graph analysis)
Source: src/server/mcp-server.ts:80-100
The sverklo://context resource is auto-injected on every session start, providing the agent with project context without requiring explicit tool calls.
Source: src/server/mcp-server.ts:55-78
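The tier rules above can be sketched as a small selection function. The `Memory` shape, the `buildSessionContext` name, and the `[stale]` prefix are assumptions for illustration; the manual does not describe the actual store schema.

```typescript
// Hypothetical sketch of tiered memory selection; field names are assumptions.
interface Memory {
  id: number;
  tier: "core" | "session";
  text: string;
  isStale: boolean;   // set when the code a memory refers to has changed
  createdAt: number;  // epoch milliseconds
}

// Core memories are always included; session memories are limited to the
// most recent `recentLimit` entries. Stale entries are kept but marked.
function buildSessionContext(memories: Memory[], recentLimit: number): string[] {
  const core = memories.filter((m) => m.tier === "core");
  const recent = memories
    .filter((m) => m.tier === "session")
    .sort((a, b) => b.createdAt - a.createdAt)
    .slice(0, recentLimit);
  return core.concat(recent).map((m) => (m.isStale ? "[stale] " : "") + m.text);
}
```

Keeping stale memories visible but flagged, rather than dropping them, lets the agent decide whether a decision still applies after the underlying code has moved.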
Audit and Reporting
HTML Audit Report
Generates self-contained HTML reports with sverklo.com dark theme branding. Includes:
- Dimension grade cards (A/B/C/D/F color-coded)
- Section content cards with formatted bodies
- SEO metadata and Open Graph tags
- Responsive styling with JetBrains Mono and Public Sans fonts
Source: src/server/audit-html.ts:1-30
Obsidian Export
Generates Obsidian-compatible markdown with [[wikilinks]] for clickable dependency navigation in the Obsidian knowledge base.
Source: src/server/audit-obsidian.ts:1-30
Architecture Diagram
Generates self-contained HTML architecture diagrams showing:
- Layer groupings (Frontend, API, Storage, Search, Indexer)
- File distribution by PageRank
- Cross-layer dependency edges
- Color-coded directory patterns
Source: src/server/audit-arch.ts:1-50
Workflow Prompts
Sverklo defines prompt templates for common code-intelligence tasks that encode the optimal order of tool calls:
| Prompt | Purpose |
|---|---|
| `sverklo/onboarding` | New team member context injection |
| `sverklo/premerge` | Pre-merge review checklist |
| `sverklo/premerge-full` | Comprehensive pre-merge review |
| `sverklo/investigate` | Root-cause debugging workflow |
| `sverklo/map-feature` | Feature tracing across codebase |
Source: src/server/prompts.ts:1-50
Example prompt structure for feature mapping:
build: ({ feature, scope }) => `
1. investigate query:"${feature}"${scopeArg}
2. Pick top 3-5 symbols → refs on each
3. impact on most-referenced symbols
4. Call verify to validate assumptions
`
Source: src/server/prompts.ts:20-45
GitHub Action Integration
The sverklo/sverklo/action provides CI-integrated code review:
- uses: sverklo/sverklo/action@main
  with:
    github-token: ${{ secrets.GITHUB_TOKEN }}
    fail-on: high            # Fail build on risk threshold
    max-files: 25            # Max files to review
    inline-comments: true    # Post inline comments
The action posts a PR review containing:
- Sticky summary comment with risk-scored files
- Up to 30 inline comments anchored to flagged lines
- JSON payload for direct `pulls.createReview` API posting
Source: action/README.md:1-30
CLI Commands
| Command | Description |
|---|---|
| `sverklo init` | Initialize project with index and CLAUDE.md |
| `sverklo register <path>` | Register a repository |
| `sverklo unregister <name>` | Unregister a repository |
| `sverklo list` | List registered repositories |
| `sverklo reindex` | Rebuild index for a repository |
| `sverklo status` | Show current repository status |
| `sverklo doctor` | Diagnose installation health |
Source: skill/README.md:1-20
Dependencies
Runtime Dependencies:
| Package | Version | Purpose |
|---|---|---|
| `@modelcontextprotocol/sdk` | ^1.12.0 | MCP protocol implementation |
| `chokidar` | ^4.0.0 | File watching |
| `ignore` | ^7.0.0 | Gitignore pattern matching |
| `onnxruntime-node` | ^1.21.0 | Local embedding inference |
| `picomatch` | ^4.0.4 | Glob pattern matching |
| `yaml` | ^2.8.3 | YAML parsing |
Optional Dependencies:
| Package | Version | Purpose |
|---|---|---|
| `web-tree-sitter` | ^0.24.0 | AST parsing (optional) |
Source: package.json:35-55
Engine Requirements: Node.js >= 24.0.0
Supported Environments
Sverklo integrates with:
- Claude Code (primary)
- Cursor
- Windsurf
- Codex CLI
- Zed editor
Source: package.json:4
Known Limitations
Based on community feedback:
| Issue | Status | Reference |
|---|---|---|
| Windows path handling | User-reported issues | Issue #20 |
| `AGENTS.md` not respected by `sverklo init` | Known | Issue #19 |
| MCP tool double-prefixing (`sverklo_sverklo_*`) | Known | Issue #71 |
| `reindex` does not update `lastIndexed` timestamp | Bug | Issue #74 |
| Stale MCP server binary after upgrade | Known | Issue #17 |
Source: package.json:1-10
Getting Started
# Install globally
npm install -g sverklo
# Initialize in your project
cd your-project
sverklo init
# Register a repository
sverklo register .
# Start using MCP tools from your IDE
The initialization creates a CLAUDE.md file with project context and builds the initial index. The MCP server then becomes available to any connected IDE.
Source: skill/README.md:20-35
Source: https://github.com/sverklo/sverklo / Human Manual
Installation
Related topics: Quick Start Guide
Sverklo is a local-first MCP (Model Context Protocol) server for code intelligence. This guide covers the complete installation process, from prerequisites through post-install verification.
Prerequisites
| Requirement | Version | Notes |
|---|---|---|
| Node.js | >= 24.0.0 | Required runtime. Earlier versions lack needed ESM and import metadata support. |
| npm | Any recent version | Used for global installation |
| Git | Any recent version | Required for repository operations during init |
| OS | Linux, macOS, Windows | Windows has known path normalization quirks (see Platform Notes) |
Verify your Node version before proceeding:
node --version
Installing via npm
Sverklo is distributed as a global npm package:
npm install -g sverklo
This installs the sverklo CLI binary globally, making it available from any directory. Source: package.json:9-11
The installation includes:
- CLI binary (`sverklo`): Command-line interface for all operations
- MCP server: Language server for IDE integration
- Tree-sitter grammars: Language parsers for AST indexing
- ONNX runtime: Local embeddings for semantic search
Verify the installation:
sverklo --version
Project Initialization
Each codebase you want sverklo to manage requires initialization. Navigate to your project directory and run:
cd /path/to/your/project
sverklo init
The init command performs the following setup:
What `sverklo init` Does
graph TD
A[sverklo init] --> B{Is .sverklo dir present?}
B -->|No| C[Create ~/.sverklo directory]
B -->|Yes| D[Skip creation]
C --> E[Create registry.json]
D --> E
E --> F[Create CLAUDE.md in project]
F --> G[Parse existing docs/ADRs]
G --> H[Install tree-sitter grammars]
H --> I[Index codebase files]
I --> J[Build symbol graph]
J --> K[Compute PageRank scores]
K --> L[Generate initial embeddings]
L --> M[Write project metadata]
Files Created
| File | Location | Purpose |
|---|---|---|
| `CLAUDE.md` | Project root | Agent instructions for code intelligence |
| `registry.json` | `~/.sverklo/` | Project registration with name, path, last indexed timestamp |
Source: src/init.ts:1-50
Grammar Installation
During initialization, sverklo installs tree-sitter grammars for supported languages. Grammars enable precise AST-based parsing for accurate symbol extraction.
graph LR
A[Init starts] --> B[Check installed grammars]
B --> C{Grammars exist?}
C -->|Yes| D[Use cached]
C -->|No| E[Download from npm]
E --> F[Build with node-gyp]
F --> G[Store in ~/.sverklo/grammars/]
Source: src/indexer/grammars-install.ts:1-40
Supported languages include TypeScript, JavaScript, Python, Go, Rust, and more. The grammars are installed once and reused across projects.
Post-Install Verification
After installation and initialization, verify everything works correctly:
sverklo doctor
The doctor command performs health checks on:
| Check | Purpose |
|---|---|
| Installation | Verifies CLI is reachable |
| Node version | Confirms >= 24.0.0 |
| Grammar binaries | Checks tree-sitter parsers are compiled |
| Project registry | Validates ~/.sverklo/registry.json |
| MCP server | Tests server startup |
Source: src/doctor.ts:1-60
Known Issue: Version Mismatch
If you have multiple sverklo installations (e.g., a stale global binary), sverklo doctor may report a different version than the one you're actually running. This occurs when the doctor check uses sverklo from $PATH instead of the embedded version.
Platform-Specific Considerations
Windows
Windows users may encounter path-related issues due to backslash vs forward slash handling. The codebase normalizes paths using:
path.replace(/\\/g, "/").split("/")
This approach converts Windows paths to Unix-style before processing. Source: Issue #20
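The normalization expression above can be wrapped in small helpers. This is a minimal sketch: `toPosixPath` and `toPosixSegments` are hypothetical names, and the filtering of empty segments is an addition that the quoted expression does not perform.

```typescript
// Minimal sketch; both helper names are hypothetical.

// Convert Windows backslashes to forward slashes, leaving everything else as-is.
function toPosixPath(filePath: string): string {
  return filePath.replace(/\\/g, "/");
}

// Split a path into segments after normalization.
// Unlike the bare expression above, empty segments produced by doubled
// or trailing slashes are dropped here.
function toPosixSegments(filePath: string): string[] {
  return toPosixPath(filePath)
    .split("/")
    .filter((segment) => segment.length > 0);
}
```

For example, `toPosixSegments("src\\server\\mcp-server.ts")` yields the segments `src`, `server`, and `mcp-server.ts`, the same result a Unix-style path would give.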
Fresh Git Repositories
When running sverklo init in a fresh repository with no commits, you may see a spurious git warning:
Use '--' to separate paths from revisions, like this: 'git <command> [<revision>...] -- [<file>...]'
This warning is cosmetic and does not affect functionality. Source: Issue #3
IDE Integration
After installation, configure your IDE to use sverklo as an MCP server:
Claude Code
Sverklo ships with a Claude Skill package for Claude Code. After installation, Claude Code automatically discovers the MCP tools.
Cursor / Windsurf / Other MCP Clients
Register the MCP server by adding to your IDE's MCP configuration:
{
  "mcpServers": {
    "sverklo": {
      "command": "sverklo",
      "args": ["mcp", "serve"]
    }
  }
}
Source: src/server/mcp-server.ts:1-30
Troubleshooting
MCP Tools Not Appearing
- Run `sverklo doctor` to verify the server starts correctly
- Restart your IDE to pick up the newly registered MCP server
- Check that the project is registered: `sverklo list`
Stale Index After Upgrade
When upgrading via npm install -g, any running MCP server subprocess continues serving from the old binary until restarted. Restart your IDE after upgrading. Source: Issue #17
Grammar Compilation Failures
If tree-sitter grammar compilation fails:
- Ensure `node-gyp` is available: `npm install -g node-gyp`
- Verify build tools are installed (Python, C++ compiler)
- On macOS, install Xcode Command Line Tools: `xcode-select --install`
Configuration Reference
Environment Variables
| Variable | Default | Description |
|---|---|---|
| `SVERKLO_PROFILE` | `full` | Tool profile: core, nav, lean, full, research, review |
| `SVERKLO_DISABLED_TOOLS` | (none) | Comma-separated list of tools to hide |
| `SVERKLO_TOOL_<NAME>_DESCRIPTION` | (none) | Override tool description |
Source: src/server/tool-overrides.ts:1-30
Registry Structure
Projects are registered in ~/.sverklo/registry.json:
{
  "projects": [
    {
      "name": "my-project",
      "path": "/home/user/code/my-project",
      "lastIndexed": "2025-01-15T10:30:00Z",
      "version": "0.29.0"
    }
  ]
}
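For scripts that read the registry, the structure above could be modeled as follows. The type names and both helper functions are hypothetical; only the field names come from the example.

```typescript
// Hypothetical types mirroring the registry.json example above.
interface RegistryProject {
  name: string;
  path: string;
  lastIndexed: string; // ISO 8601 timestamp
  version: string;
}

interface Registry {
  projects: RegistryProject[];
}

// Find a registered project by name, or undefined if it is not registered.
function findProject(registry: Registry, name: string): RegistryProject | undefined {
  return registry.projects.filter((p) => p.name === name)[0];
}

// Age of a project's index in whole hours, relative to `now`.
function indexAgeHours(project: RegistryProject, now: Date): number {
  const indexedAt = new Date(project.lastIndexed).getTime();
  return Math.floor((now.getTime() - indexedAt) / (60 * 60 * 1000));
}
```

Passing `now` explicitly, instead of calling `new Date()` inside the function, keeps the age calculation easy to test.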
Next Steps
After successful installation:
- Index your project: `sverklo index` (automatically run by `init`)
- Explore the codebase: `sverklo overview`
- Enable IDE integration: Configure MCP server in your IDE
- Read the CLI reference: Explore available commands with `sverklo --help`
Quick Start Guide
Related topics: Overview, Installation
Sverklo is a local-first MCP (Model Context Protocol) server that provides code intelligence for AI coding assistants. It delivers symbol graphs, blast-radius analysis, diff-aware review, and persistent memory across sessions—without requiring API keys or uploading code. Source: package.json
This guide walks you through installing sverklo, initializing it for your project, and getting started with its core capabilities.
Prerequisites
| Requirement | Version/Details |
|---|---|
| Node.js | >= 24.0.0 |
| Package Manager | npm, pnpm, or yarn |
| IDE/Client | Claude Code, Cursor, Windsurf, or Codex CLI |
| Git | Required for version-aware features |
Sverklo uses ONNX runtime for embeddings and tree-sitter for AST parsing. These are included as dependencies. Source: package.json:38-43
Installation
Install sverklo globally via npm:
npm install -g sverklo
Verify the installation:
sverklo --version
# or
sverklo doctor
The doctor command checks your environment and reports any configuration issues. Source: skill/README.md
Project Initialization
Navigate to your project directory and run the initialization:
cd your-project
sverklo init
The init command performs the following setup:
- Indexes your codebase: Scans source files, builds a symbol graph, and computes dependency relationships
- Detects existing agent instructions: Checks for `CLAUDE.md`, `AGENTS.md`, or other agent configuration files
- Creates context files: Generates or updates documentation for AI assistants
- Registers the project: Adds the project to your local registry for quick access
Source: skill/README.md
Note: In fresh git repositories with no commits, you may see a stray git warning. This is cosmetic—the init still succeeds. Source: GitHub Issue #3
Initialization for Windows
If you're on Windows, you may encounter path-related issues. The tool normalizes paths to forward slashes internally, but some edge cases may still arise. Source: GitHub Issue #20
Global Setup Option
If you want one-time machine setup without per-project boilerplate, use the global initialization flow. This imports memories once and allows quick registration for subsequent projects. Source: GitHub Issue #72
Core Commands
Register a Project
Register an existing project (if not done during init):
sverklo register .
List Registered Projects
View all registered projects and their status:
sverklo list
Reindex a Project
After significant code changes, refresh the index:
sverklo reindex .
Known Issue: The `reindex` command may not update the `lastIndexed` timestamp in `~/.sverklo/registry.json`, causing `sverklo list` to show stale ages. Source: GitHub Issue #74
Unregister a Project
Remove a project from the registry:
sverklo unregister <project-name>
To unregister by path (useful for agent-driven workflows):
sverklo unregister --by-path /path/to/project
Source: GitHub Issue #73
MCP Tools Overview
Once initialized, sverklo provides these tools to your AI assistant:
| Tool | Purpose |
|---|---|
| `search` | Hybrid semantic code search (BM25 + embeddings + PageRank) |
| `lookup` | Find symbol definitions and references |
| `overview` | Get codebase statistics and top files by importance |
| `impact` | Calculate blast radius for proposed changes |
| `refs` | Find all references to a symbol |
| `deps` | Show dependency graph for a file |
| `context` | Umbrella tool: returns a curated context bundle in one call |
| `remember` | Save decisions and context for future sessions |
| `recall` | Retrieve previously saved memories |
| `review_diff` | Risk-scored PR review with inline comments |
| `audit` | Codebase health scoring |
| `investigate` | Fan-out search across multiple signals |
| `critique` | Verify claims against codebase evidence |
Source: src/server/mcp-server.ts
First Session Workflow
When your AI assistant starts a session, sverklo automatically provides context resources:
graph TD
A[Session Start] --> B[MCP Server Initializes]
B --> C[Load Core Memories]
C --> D[Load Recent Memories]
D --> E[Build sverklo://context Resource]
E --> F[Auto-inject into Session]
The sverklo://context resource includes:
- Core project context — Tier-1 project invariants
- Key memories — Previously saved decisions
- Top files — High PageRank files for orientation
- Language stats — File counts by language
Source: src/server/mcp-server.ts:37-67
Using the Context Tool
The context tool is the recommended starting point for new tasks:
{
  "task": "add rate limiting to the login endpoint",
  "detail_level": "normal",
  "budget": 4000
}
| Parameter | Type | Description |
|---|---|---|
| `task` | string | Description of what you're working on |
| `detail_level` | `minimal` \| `normal` \| `full` | How much context to return |
| `scope` | string | Optional path prefix to constrain search |
| `budget` | number | Token budget for PageRank-pruned repo map |
- minimal — Fast/cheap: overview header + top 3 search hits + top 2 memories
- normal — Balanced: header + top 5 search hits + top 5 memories + symbol table
- full — Normal + dependency neighbors of top results
- budget — Returns PageRank-pruned repo map fit to token budget
Source: src/server/tools/context.ts
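The detail levels above can be read as a mapping from level to result limits. The sketch below encodes that reading; `ContextInput` mirrors the documented parameters, while `ContextLimits`, `limitsFor`, and the choice of `normal` as the default are assumptions.

```typescript
// Sketch only: `limitsFor` and its return shape are hypothetical.
type DetailLevel = "minimal" | "normal" | "full";

interface ContextInput {
  task: string;
  detail_level?: DetailLevel;
  scope?: string;
  budget?: number;
}

interface ContextLimits {
  searchHits: number;
  memories: number;
  symbolTable: boolean;
  dependencyNeighbors: boolean;
}

// Translate a detail level into the result counts described in the list above.
// Defaulting to "normal" when no level is given is an assumption.
function limitsFor(input: ContextInput): ContextLimits {
  const level = input.detail_level ?? "normal";
  if (level === "minimal") {
    return { searchHits: 3, memories: 2, symbolTable: false, dependencyNeighbors: false };
  }
  // "normal" and "full" share the same counts; "full" adds dependency neighbors.
  return { searchHits: 5, memories: 5, symbolTable: true, dependencyNeighbors: level === "full" };
}
```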
GitHub Actions Integration
For automated code review on pull requests, use the sverklo action:
name: Sverklo Review
on: [pull_request]
jobs:
  review:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
        with:
          fetch-depth: 0
      - uses: sverklo/sverklo/action@main
        with:
          github-token: ${{ secrets.GITHUB_TOKEN }}
          fail-on: high
          inline-comments: true
| Input | Default | Description |
|---|---|---|
| `github-token` | `${{ github.token }}` | GitHub token for posting comments |
| `fail-on` | `none` | Risk threshold: critical, high, medium, low, none |
| `ref` | auto-detected | Git ref range (e.g., main..HEAD) |
| `max-files` | `25` | Maximum files to review |
| `inline-comments` | `true` | Post inline comments at flagged lines |
Source: action/README.md
Claude Code Subagents
Replace Claude Code's built-in subagents with sverklo-enhanced versions:
mkdir -p .claude/agents
curl -o .claude/agents/sverklo-explore.md \
https://raw.githubusercontent.com/sverklo/sverklo/main/agents/sverklo-explore.md
The sverklo-explore subagent uses hybrid retrieval (BM25 + ONNX embeddings + PageRank) and answers questions in ~150-800 tokens, versus ~14,200 tokens for the default approach. Source: agents/README.md
Claude Skill Package
Sverklo ships a Claude Skill for Claude Code:
# The skill is included in the npm package
ls skill/
# Contains: sverklo-skill.zip and skill definitions
Tools available via the skill include sverklo_search, sverklo_review_diff, sverklo_audit, and memory tools. Source: skill/README.md
Troubleshooting
Version Mismatch Warning
When upgrading sverklo via npm, a running MCP server subprocess may continue using the old binary. Restart your IDE or MCP client to pick up the new version. Source: GitHub Issue #17
AGENTS.md Not Respected
If your project uses AGENTS.md instead of CLAUDE.md, the init command may still add context to the wrong file. Manually migrate the content or file an issue. Source: GitHub Issue #19
MCP Tool Name Prefix
When registering the MCP server under the key "sverklo", tool names may appear as sverklo_sverklo_* due to double-prefixing. Register under a different key (e.g., "io.github.sverklo") to avoid this. Source: GitHub Issue #71
Next Steps
- Review the sverklo prompts documentation for workflow templates
- Explore architecture mapping to understand your codebase
- Set up PR review automation for your CI/CD pipeline
System Architecture
Related topics: MCP Server Design, Search and Retrieval System, Indexing System, Bi-Temporal Memory Layer
Sverklo is a local-first code intelligence platform designed as an MCP (Model Context Protocol) server. It provides persistent memory, semantic code search, dependency graphs, blast-radius analysis, and diff-aware review for AI coding assistants. The architecture follows a layered design with clear separation between indexing, storage, search, and tool delivery layers.
High-Level Architecture Overview
Sverklo operates as a long-running MCP server process that serves code intelligence tools to IDE-integrated AI clients. The system indexes code once and serves multiple query types across sessions.
graph TD
subgraph "Client Layer"
A["Claude Code / Cursor / Windsurf / Codex CLI"]
end
subgraph "MCP Server Layer"
B["mcp-server.ts<br/>MCP Protocol Handler"]
C["Tool Handlers"]
D["Prompt Templates"]
E["Resource Provider"]
end
subgraph "Search & Query Layer"
F["hybrid-search.ts<br/>BM25 + ONNX Embeddings + PageRank"]
G["investigate.ts<br/>Multi-signal Fan-out"]
H["Tool Profiles<br/>core, nav, lean, research, review"]
end
subgraph "Indexing Layer"
I["Index Files"]
J["Index Code (AST)"]
K["Index Graph (Dependencies)"]
L["Index Memory"]
end
subgraph "Storage Layer"
M["SQLite Database"]
N["Vector Store"]
O["File Registry"]
end
A --> B
B --> C
B --> D
B --> E
C --> H
D --> C
H --> I
H --> J
H --> K
H --> L
I --> M
J --> M
K --> M
L --> M
Core Design Principles
Local-First Architecture
Sverklo stores all data locally in ~/.sverklo/ and per-project .sverklo/ directories. No API keys or cloud services are required. The system runs entirely on the developer's machine.
Dependencies supporting local-first operation:
- `chokidar` for file system watching
- `picomatch` for glob pattern matching
- `ignore` for `.gitignore`-compatible filtering
- `onnxruntime-node` for local embedding inference
- `web-tree-sitter` (optional) for AST parsing
Source: package.json:1-50
MCP Protocol Integration
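The server is built on the MCP SDK (`@modelcontextprotocol/sdk` ^1.12.0) and exposes three kinds of MCP primitives: resources, prompts, and tools. The resource list handler registers the sverklo://context resource: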
server.setRequestHandler(ListResourcesRequestSchema, async () => ({
resources: [{
uri: "sverklo://context",
name: "Sverklo Project Context",
description: "Key memories and codebase overview...",
mimeType: "text/plain",
}],
}));
The server exposes resources that are auto-injected at session start, prompts for workflow templates, and tools for all code intelligence operations.
Source: src/server/mcp-server.ts:1-50
Tool Architecture
Tool Registration System
All tools follow a standardized handler pattern. The MCP server maintains a tool registry that supports dynamic configuration:
export const contextTool = {
name: "context",
description: "Umbrella context bundler...",
inputSchema: {
type: "object" as const,
properties: {
task: { type: "string", description: "..." },
detail_level: { type: "string", enum: ["minimal", "normal", "full"] },
scope: { type: "string" },
budget: { type: "number" },
},
},
};
Source: src/server/tools/context.ts:1-40
Tool Profiles
The system provides pre-defined tool subsets called profiles to control the MCP tool surface:
| Profile | Tools | Use Case |
|---|---|---|
| `core` | search, lookup, overview, refs, impact | Hot path only |
| `nav` | core + deps, context, status | Navigation focus |
| `lean` | nav + remember, recall, review_diff | Memory + diff |
| `research` | search, investigate, ask, concepts, patterns, clusters, verify, critique | Code research |
| `review` | review_diff, diff_search, test_map, impact, refs | PR/MR review |
Source: src/server/tool-overrides.ts:1-80
Runtime Configuration
Tools can be customized via environment variables:
| Variable | Purpose |
|---|---|
| `SVERKLO_TOOL_<NAME>_DESCRIPTION` | Override tool description text |
| `SVERKLO_DISABLED_TOOLS` | Comma-separated list of tools to hide |
| `SVERKLO_PROFILE` | Apply a named profile (core, nav, lean, research, review) |
| `SVERKLO_ZILLIZ_COMPAT` | Enable Zilliz Claude context compatibility aliases |
Source: src/server/tool-overrides.ts:80-120
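To make the interaction between profiles and these variables concrete, the sketch below resolves a visible tool list from environment-style settings. It is not the code in src/server/tool-overrides.ts: `resolveTools`, `describeTool`, the fallback to `core` for unknown profile names, and the upper-casing of the tool name in the override variable are all assumptions, and only three of the profiles are modeled.

```typescript
// Hypothetical sketch of profile resolution.
const CORE = ["search", "lookup", "overview", "refs", "impact"];
const NAV = CORE.concat(["deps", "context", "status"]);
const LEAN = NAV.concat(["remember", "recall", "review_diff"]);

const PROFILES: Record<string, string[] | undefined> = { core: CORE, nav: NAV, lean: LEAN };

type Env = Record<string, string | undefined>;

// Resolve the visible tool names from environment-style settings.
// Unknown or unset profile names fall back to `core` in this sketch only.
function resolveTools(env: Env): string[] {
  const profile = PROFILES[env["SVERKLO_PROFILE"] ?? ""] ?? CORE;
  const disabled = (env["SVERKLO_DISABLED_TOOLS"] ?? "")
    .split(",")
    .map((t) => t.trim())
    .filter((t) => t.length > 0);
  return profile.filter((tool) => disabled.indexOf(tool) === -1);
}

// Look up a description override such as SVERKLO_TOOL_SEARCH_DESCRIPTION.
function describeTool(env: Env, tool: string, fallback: string): string {
  return env["SVERKLO_TOOL_" + tool.toUpperCase() + "_DESCRIPTION"] ?? fallback;
}
```

Taking the environment as a plain object instead of reading `process.env` directly keeps the sketch free of Node-specific types; a caller could pass `process.env` as the argument.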
Search Architecture
Hybrid Search Pipeline
Sverklo combines multiple retrieval signals to maximize result quality:
graph LR
A["Query"] --> B["BM25 Keyword Search"]
A --> C["ONNX Bi-encoder Embeddings"]
A --> D["PageRank Centrality"]
B --> E["Reciprocal Rank Fusion"]
C --> E
D --> E
E --> F["Ranked Results"]
The search layer coordinates BM25 keyword matching, semantic embedding similarity via ONNX, and PageRank-based importance scoring. Reciprocal Rank Fusion combines these signals into a unified ranking.
Source: src/server/mcp-server.ts:100-150
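Reciprocal Rank Fusion scores each result by summing `1 / (k + rank)` over every ranked list in which it appears. The sketch below shows the general technique; the function name is hypothetical, and `k = 60` is the value commonly used in the literature, not a constant confirmed for sverklo.

```typescript
// Reciprocal Rank Fusion over several ranked lists of document ids.
// Each list is ordered best-first; ids missing from a list contribute nothing.
function reciprocalRankFusion(rankings: string[][], k: number = 60): string[] {
  const scores = new Map<string, number>();
  for (const ranking of rankings) {
    ranking.forEach((id, index) => {
      const rank = index + 1; // ranks are 1-based
      scores.set(id, (scores.get(id) ?? 0) + 1 / (k + rank));
    });
  }
  // Highest fused score first.
  return Array.from(scores.keys()).sort(
    (a, b) => (scores.get(b) ?? 0) - (scores.get(a) ?? 0)
  );
}
```

Because only ranks are used, the three signals do not need comparable score scales, which is one reason RRF is a common choice for combining keyword, embedding, and graph-based rankings.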
Investigation Engine
The investigate tool performs multi-signal fan-out in a single call:
- Executes full-text search
- Queries vector embeddings
- Resolves symbol references
- Checks documentation mentions
- Aggregates results with `found_by` tags indicating which signals matched
Results agreed on by multiple retrievers are tagged as higher-signal than single-source hits.
Source: src/server/prompts.ts:1-50
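The aggregation step can be sketched as a merge keyed by file path. Only the `found_by` tag comes from the description above; the `Hit` shape, the `mergeSignals` name, and the signal names used in the usage note are assumptions.

```typescript
// Hypothetical sketch of multi-signal aggregation.
interface Hit {
  path: string;
  found_by: string[]; // names of the signals that returned this path
}

// Merge per-signal result lists, recording which signals found each path.
// Paths found by more signals sort first.
function mergeSignals(results: Record<string, string[]>): Hit[] {
  const byPath = new Map<string, Hit>();
  for (const signal of Object.keys(results)) {
    for (const path of results[signal]) {
      const hit: Hit = byPath.get(path) ?? { path, found_by: [] };
      if (hit.found_by.indexOf(signal) === -1) hit.found_by.push(signal);
      byPath.set(path, hit);
    }
  }
  return Array.from(byPath.values()).sort(
    (a, b) => b.found_by.length - a.found_by.length
  );
}
```

Called with `{ fts: ["a.ts", "b.ts"], vector: ["b.ts"], symbols: ["b.ts", "c.ts"] }`, the sketch places `b.ts` first because three signals agree on it.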
PageRank Integration
Dependency graph analysis produces PageRank scores used to:
- Rank files by architectural importance
- Prune large repos to fit token budgets
- Identify load-bearing modules
- Surface high-centrality files in overview
Source: src/server/tools/wakeup.ts:1-50
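As background, PageRank over a file dependency graph can be computed by power iteration. The sketch below is a generic textbook version, not sverklo's implementation; the function name is hypothetical, and the damping factor of 0.85 and 50 iterations are conventional defaults.

```typescript
// Minimal PageRank by power iteration over a file dependency graph.
// `edges[a]` lists the files that `a` imports, so a file gains rank when
// other files import it. Every file must appear as a key, even with an
// empty import list.
function pageRank(
  edges: Record<string, string[]>,
  damping: number = 0.85,
  iterations: number = 50
): Record<string, number> {
  const nodes = Object.keys(edges);
  const n = nodes.length;
  let rank: Record<string, number> = {};
  for (const node of nodes) rank[node] = 1 / n;

  for (let i = 0; i < iterations; i++) {
    // Rank held by files with no imports (dangling nodes) is spread evenly.
    let dangling = 0;
    for (const node of nodes) {
      if (edges[node].length === 0) dangling += rank[node];
    }
    const next: Record<string, number> = {};
    for (const node of nodes) {
      next[node] = (1 - damping) / n + (damping * dangling) / n;
    }
    // Each file passes its rank, split evenly, to the files it imports.
    for (const node of nodes) {
      for (const target of edges[node]) {
        next[target] += (damping * rank[node]) / edges[node].length;
      }
    }
    rank = next;
  }
  return rank;
}
```

In a graph where `a.ts` and `b.ts` both import `util.ts`, the shared utility ends up with the highest score, which matches the idea of surfacing load-bearing modules.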
Code Analysis Engine
Symbol Indexing
The indexing pipeline extracts symbols from AST-aware parsing:
- Function and class definitions
- Import/export relationships
- Type annotations
- Documentation comments
Symbol data enables refs (find references), impact (blast radius), and symbol-based search.
Dependency Graph
Edges between files capture:
- Import statements (ES modules, CommonJS, TypeScript imports)
- Re-exports and re-typed symbols
- Cross-reference relationships
The graph supports cycle detection, fan-in/fan-out analysis, and impact propagation.
Source: src/server/audit-obsidian.ts:1-50
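Impact propagation over such a graph can be illustrated as a breadth-first walk along reverse import edges. This is a sketch of the general idea under assumed names (`blastRadius`, `imports`); it does not reflect how sverklo stores or traverses its graph.

```typescript
// Hypothetical sketch: blast radius as a breadth-first walk over reverse
// import edges. `imports[a]` lists the files that `a` imports.
function blastRadius(imports: Record<string, string[]>, changed: string): string[] {
  // Invert the graph: for each file, which files import it?
  const importedBy: Record<string, string[]> = {};
  for (const file of Object.keys(imports)) {
    for (const dep of imports[file]) {
      (importedBy[dep] = importedBy[dep] || []).push(file);
    }
  }

  const seen = new Set<string>([changed]);
  const queue = [changed];
  const affected: string[] = [];
  while (queue.length > 0) {
    const current = queue.shift() as string;
    for (const dependent of importedBy[current] || []) {
      if (!seen.has(dependent)) {
        seen.add(dependent); // the visited set also makes import cycles safe
        affected.push(dependent);
        queue.push(dependent);
      }
    }
  }
  return affected; // direct dependents first, then transitive ones
}
```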
Critique System
The critique tool validates claims against indexed evidence:
function formatCritique(c: CritiqueData): string {
  const parts: string[] = [];
  parts.push(c.claim ? `## critique — "${c.claim}"` : "## critique");
  // Verifies citations point to actual source files
  // Checks for undocumented symbols
  // Detects stale or moved references
  // (excerpt: the sections produced by each check are elided here)
  return parts.join("\n");
}
Critique verifies that:
- Cited evidence actually exists
- Symbols are documented in `.md`, `.markdown`, or `.mdx` files
- References haven't become stale or moved
Source: src/server/tools/critique.ts:1-60
Review and Audit Output
GitHub PR Review Format
Review output supports structured GitHub API payloads:
export interface InlineComment {
/** Repo-relative path of the file being commented on */
path: string;
/** 1-based file line number */
line: number;
severity: "info" | "warning" | "error";
body: string;
}
Risk levels are classified as: critical, high, medium, low.
Source: src/server/tools/review-format.ts:1-40
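To show how inline comments could map onto a GitHub review request, the sketch below builds a body in the shape accepted by GitHub's "create a review for a pull request" endpoint. `InlineComment` is repeated from the excerpt above so the block is self-contained; `ReviewPayload`, `toReviewPayload`, and the rule that any error-level comment requests changes are assumptions.

```typescript
// Sketch only; not the code in src/server/tools/review-format.ts.
interface InlineComment {
  path: string;
  line: number;
  severity: "info" | "warning" | "error";
  body: string;
}

interface ReviewPayload {
  event: "COMMENT" | "REQUEST_CHANGES";
  body: string;
  comments: { path: string; line: number; side: "RIGHT"; body: string }[];
}

// Build the review request body from a summary and a list of inline comments.
// Requesting changes on any error-level comment is an assumption for illustration.
function toReviewPayload(summary: string, comments: InlineComment[]): ReviewPayload {
  const hasError = comments.some((c) => c.severity === "error");
  return {
    event: hasError ? "REQUEST_CHANGES" : "COMMENT",
    body: summary,
    comments: comments.map((c) => ({
      path: c.path,
      line: c.line,
      side: "RIGHT" as const, // comment on the new version of the file
      body: "[" + c.severity + "] " + c.body,
    })),
  };
}
```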
HTML Audit Reports
Self-contained HTML reports with dark theme branding are generated for codebase health analysis:
- Dimension cards with letter grades (A-F)
- Section cards with formatted content
- SEO meta tags and Open Graph support
- Google Fonts (JetBrains Mono, Public Sans)
Source: src/server/audit-html.ts:1-60
Obsidian Export
Audit reports can be exported as Obsidian-compatible markdown with [[wikilinks]] for clickable navigation between files and symbols.
Source: src/server/audit-obsidian.ts:1-50
Prompt Templates
Workflow Orchestration
Prompts encode the *order* of sverklo tool calls for common tasks:
| Prompt | Purpose |
|---|---|
| sverklo/map-feature | Map a feature across codebase entry points, symbols, tests, docs |
| sverklo/architecture-map | Generate architecture map using overview, deps, PageRank, recall |
| sverklo/onboard | New developer onboarding with conventions and project index |
| sverklo/premerge | Pre-merge review checklist |
| sverklo/debug | Systematic debugging using symbol graphs and references |
Each prompt uses the PromptDefinition interface:
interface PromptDefinition {
name: string;
description: string;
arguments: { name: string; description: string; required: boolean }[];
build: (args: Record<string, string>) => string;
}
Source: src/server/prompts.ts:1-80
Context Injection
Session Startup
On session start, the MCP server injects context via the sverklo://context resource:
const coreMemories = indexer.memoryStore.getCore(15);
const recentMemories = indexer.memoryStore.getRecent(10);
const projectMemories = indexer.memoryStore.getByCategory("project");
const conventions = indexer.memoryStore.getByCategory("convention");
Context tiers:
- Core — Project invariants (always injected)
- Recent — Latest saved memories
- Category — Organized by type (project, convention, architecture)
Source: src/server/mcp-server.ts:50-100
Wakeup Generation
The wakeup tool produces quick orientation summaries:
export function generateWakeup(
indexer: IndexFiles & IndexMemory,
options: { maxTokens?: number; format?: "markdown" | "plain" }
): string
Includes project status, core files by dependency rank, and project invariants.
Source: src/server/tools/wakeup.ts:1-50
References Lookup
Symbol Resolution
The refs tool finds all references to a symbol and separates:
- Structural inclusions — where the symbol is defined/included
- Associative references — "see also" mentions
Dedup logic prevents near-identical rows from the same logical doc location:
const seen = new Set<string>();
for (const m of docMentions) {
const key = `${m.doc_file_path}|${m.doc_breadcrumb ?? ""}|${m.match_kind}`;
if (seen.has(key)) continue;
seen.add(key);
dedupedAll.push(m);
}
Source: src/server/tools/find-references.ts:1-60
Known Architectural Considerations
Windows Path Handling
Path normalization uses forward-slash conversion:
path.replace(/\\/g, "/").split("/").pop()
This is noted in community discussions as a workaround rather than a comprehensive fix. See Issue #20.
MCP Tool Name Prefixing
Before v0.28.0, all tools carried a sverklo_ prefix (e.g., sverklo_impact, sverklo_search). When the server was registered under the key "sverklo", this produced double-prefixed names (e.g., sverklo_sverklo_impact). See Issue #71.
Retrieval Architecture Evolution
Community discussions (Issue #29) have raised evaluating ColBERT/PLAID-style multi-vector rerankers against the current bi-encoder + BM25 + PageRank approach.
Technology Stack
| Component | Technology | Purpose |
|---|---|---|
| Runtime | Node.js >= 24.0.0 | Server runtime |
| Protocol | MCP SDK 1.12.0 | Client-server communication |
| Embeddings | ONNX Runtime Node 1.21.0 | Local vector inference |
| Parsing | Tree-sitter (optional) | AST extraction |
| File Watching | Chokidar 4.0.0 | Live reload support |
| Matching | Picomatch 4.0.4 | Glob pattern support |
| Config | YAML 2.8.3 | Configuration files |
Source: package.json:30-60
Summary
Sverklo's architecture implements a clean separation between indexing, storage, search, and delivery layers. The MCP protocol enables integration with multiple AI coding clients while the hybrid search pipeline combines keyword, semantic, and graph-based signals. Tool profiles allow runtime customization of the available surface, and the prompt system encodes best-practice workflows. The local-first design ensures data privacy and eliminates external dependencies.
Source: https://github.com/sverklo/sverklo / Human Manual
MCP Server Design
Related topics: Search Tools Reference, Impact and Reference Tools
Sverklo implements a Model Context Protocol (MCP) server that provides code intelligence capabilities to AI coding agents. The MCP server layer sits between the indexer (which maintains the code graph, embeddings, and memories) and the AI client (Claude Code, Cursor, Windsurf, or Codex CLI). This design enables persistent, local-first code understanding without API keys or code upload.
Architecture Overview
The MCP server is built on top of the @modelcontextprotocol/sdk and exposes sverklo's indexer capabilities as tools, resources, and prompts. The architecture follows a layered design:
graph TD
subgraph "AI Client Layer"
A["Claude Code / Cursor / Windsurf"]
end
subgraph "MCP Server Layer"
B["startMcpServer()"]
C["startGlobalMcpServer()"]
D["Tool Handlers"]
E["Resource Handlers"]
F["Prompt Handlers"]
end
subgraph "Core Indexer Layer"
G["Indexer<br/>(IndexFiles + IndexCode + IndexGraph + IndexMemory)"]
H["Vector Store"]
I["Graph Store"]
J["Memory Store"]
end
A --> B
A --> C
B --> D
B --> E
B --> F
D --> G
E --> G
F --> G
G --> H
G --> I
G --> J
Key Design Principles
- Local-first: All indexing happens on-disk; no data leaves the machine
- Git-aware: Tools understand branches, commits, and diffs
- Multi-signal retrieval: Combines FTS, embeddings, symbol graphs, and PageRank
- Backward compatibility: Legacy tool names are aliased to canonical names with deprecation warnings
Server Initialization
Per-Project MCP Server
The startMcpServer() function initializes an MCP server for a single repository:
// src/server/mcp-server.ts:180
export async function startMcpServer(rootPath: string): Promise<void> {
const config = getProjectConfig(rootPath);
// ... initializes indexer, registers handlers, starts server
}
Server configuration includes:
| Parameter | Source | Description |
|---|---|---|
| rootPath | CLI argument | Absolute path to the project root |
| serverName | server.json | MCP server identifier (default: io.github.sverklo/sverklo) |
| serverVersion | package.json | Inherited from npm package version |
| instructions | Static string | Server capabilities description for AI clients |
Server Capabilities
The MCP server declares three capability categories:
// src/server/mcp-server.ts:190
const server = new Server(
{ name: "sverklo", version: serverVersion },
{
capabilities: {
tools: {},
resources: {},
prompts: {},
},
instructions: /* string */,
}
);
| Capability | Purpose | Handler |
|---|---|---|
| tools | Code intelligence operations (search, lookup, impact, etc.) | server.setRequestHandler(CallToolRequestSchema, ...) |
| resources | Static project context at session start | server.setRequestHandler(ListResourcesRequestSchema, ...) |
| prompts | Reusable workflow templates | server.setRequestHandler(ListPromptsRequestSchema, ...) |
Tool System Architecture
Tool Registration
Tools are registered through the MCP SDK's server.tool() method. Each tool declares:
// src/server/mcp-server.ts (pattern)
server.tool(
"search", // canonical tool name
"Natural language search across indexed code", // description
{ query: { type: "string" } }, // input schema
async (args, extra) => { /* handler */ } // implementation
);
Tool Naming Convention (v0.28.0+)
Following issue #71, tool names use the format <verb>_<noun> without the sverklo_ prefix:
| Canonical Name | Description |
|---|---|
| search | Full-text and semantic search |
| lookup | Symbol lookup by name |
| impact | Blast radius analysis |
| refs | Find references to a symbol |
| investigate | Multi-signal fan-out investigation |
| context | Umbrella context bundler |
| review_diff | Diff-aware code review |
| status | Indexing status |
Legacy Tool Aliases
For backward compatibility, the server maintains a LEGACY_TOOL_ALIASES map:
// src/server/mcp-server.ts
export const LEGACY_TOOL_ALIASES: Record<string, string> = {
"sverklo_search": "search",
"sverklo_lookup": "lookup",
// ... ≥30 entries
};
The resolveToolName() function routes legacy names to canonical names and emits a single deprecation warning per legacy name per server instance:
// src/server/tools/rename-aliases.test.ts:16
it("resolveToolName routes legacy → canonical correctly", () => { ... });
// src/server/tools/rename-aliases.test.ts:22
it("deprecation warning fires exactly once per legacy name", () => { ... });
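The behavior those two tests describe can be sketched as follows. This is a hypothetical reconstruction, not the project's resolveToolName(): the factory function, the injected `warn` callback, the warning text, and the two-entry alias map are assumptions made for illustration.

```typescript
// Hypothetical sketch; the real resolveToolName() may be structured differently.
const LEGACY_ALIASES = new Map<string, string>([
  ["sverklo_search", "search"],
  ["sverklo_lookup", "lookup"],
]);

function createToolNameResolver(warn: (message: string) => void) {
  // Kept per resolver instance, so each server instance warns independently.
  const warned = new Set<string>();
  return function resolveToolName(name: string): string {
    const canonical = LEGACY_ALIASES.get(name);
    if (canonical === undefined) return name; // already canonical, or unknown
    if (!warned.has(name)) {
      warned.add(name);
      warn(`Tool "${name}" is deprecated; use "${canonical}" instead.`);
    }
    return canonical;
  };
}
```

Holding the `warned` set inside the closure rather than at module level is one way to get "once per legacy name per server instance" without shared global state.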
Tool Presets
Tool availability is controlled through named presets defined in tool-overrides.ts:
// src/server/tool-overrides.ts
export const TOOL_PRESETS = {
// Minimal: only essential tools
minimal: ["search", "lookup", "overview", "refs", "impact", "status"],
// Standard: balanced for most use cases
standard: ["search", "lookup", "overview", "refs", "impact", "deps", "context", "status"],
// Lean: adds memory tools for recall/remember
lean: ["search", "lookup", "overview", "refs", "impact", "deps", "context", "status", "remember", "recall", "review_diff"],
// Research: full investigation surface for code onboarding
research: ["search", "search_iterative", "investigate", "ask", "lookup", "overview", "refs", "impact", "deps", "concepts", "patterns", "clusters", "verify", "critique", "ctx_slice", "ctx_grep", "ctx_stats", "status"],
// Review: PR/MR focus with diff tools front-and-center
review: ["review_diff", "diff_search", "test_map", "impact", "refs", "lookup", "search", "investigate", "verify", "status"],
};
Presets are configured in server.json:
// server.json
{
"mcp": {
"preset": "research",
"env": {
"OVERRIDE_TOOLS": "standard,context"
}
}
}
Input Validation
Server-Side Validation Layer
The _validation.ts module provides shared validators used by all tool handlers:
// src/server/tools/_validation.ts
export function validateEnum<T extends string>(
raw: unknown,
allowed: readonly T[],
argName: string,
fallback: T
): T | Error { ... }
export function requireString(
raw: unknown,
argName: string,
usage: string
): { ok: true; value: string } | { ok: false; message: string } { ... }
Why Server-Side Validation?
The MCP wrapper declares JSON schemas, but Claude/agents sometimes pass values outside declared enums. Without server-side guards, invalid values fall through to silent type-cast paths, returning wrong but successful-looking results. Source: src/server/tools/_validation.ts:1-15
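As an illustration of what such guards might look like, the sketch below fills in plausible bodies for the two signatures shown above. The bodies are assumptions (the source elides them), including the choice to treat a missing value as "use the fallback" and the wording of the error messages.

```typescript
// Hypothetical bodies for the signatures above; the project's actual logic may differ.
function validateEnum<T extends string>(
  raw: unknown,
  allowed: readonly T[],
  argName: string,
  fallback: T
): T | Error {
  // Assumption: an omitted argument silently takes the default.
  if (raw === undefined || raw === null) return fallback;
  if (typeof raw === "string" && (allowed as readonly string[]).includes(raw)) {
    return raw as T;
  }
  // Anything else is reported instead of being cast into the enum type.
  return new Error(
    `Invalid ${argName}: "${String(raw)}". Expected one of: ${allowed.join(", ")}`
  );
}

function requireString(
  raw: unknown,
  argName: string,
  usage: string
): { ok: true; value: string } | { ok: false; message: string } {
  if (typeof raw === "string" && raw.trim().length > 0) {
    return { ok: true, value: raw };
  }
  return { ok: false, message: `Missing required argument "${argName}". Usage: ${usage}` };
}
```

Returning an Error value (rather than throwing) lets a handler turn the problem into a readable tool response, so the agent sees why its argument was rejected instead of receiving a plausible but wrong result.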
Git Parameter Validation
Git parameters (refs, paths) are validated against injection patterns:
// src/utils/git-validation.ts
export function validateGitRef(ref: string): boolean {
// Allows: branch names, tags, SHAs, ranges (A..B, A...B), HEAD~N, HEAD^N
// Rejects: spaces, semicolons, backticks, pipes, dollar signs, parentheses
return /^[a-zA-Z0-9_.\/@{}\-~^:]+(\.\.[a-zA-Z0-9_.\/@{}\-~^:]+)?$/.test(ref);
}
This prevents command injection (CWE-78) when git commands are executed via execSync or spawnSync.
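One detail of the pattern above is worth spelling out: "-" is part of the allowed character class, so a value such as "--all" passes the regex even though git would read it as an option. The sketch below copies the regex from validateGitRef and adds a leading-dash check; that extra check and the helper names `isSafeGitRef` and `buildDiffArgs` are assumptions for illustration, not part of the source.

```typescript
// Regex copied from validateGitRef above; everything else here is a hypothetical addition.
const GIT_REF_PATTERN = /^[a-zA-Z0-9_.\/@{}\-~^:]+(\.\.[a-zA-Z0-9_.\/@{}\-~^:]+)?$/;

function isSafeGitRef(ref: string): boolean {
  // "-" is in the allowed class, so "-n" or "--all" would match the pattern
  // and could be parsed by git as an option. Reject a leading dash separately.
  if (ref.startsWith("-")) return false;
  return GIT_REF_PATTERN.test(ref);
}

function buildDiffArgs(ref: string): string[] {
  if (!isSafeGitRef(ref)) {
    throw new Error(`Rejected git ref: ${JSON.stringify(ref)}`);
  }
  // Returned as an argument array so that no shell ever parses the ref.
  return ["diff", "--name-only", ref];
}
```

The resulting array can be handed to `spawnSync("git", args)` from `node:child_process`, which runs the command without a shell; combining validation with argument arrays gives two independent layers of protection.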
Resource System
Auto-Injected Project Context
The MCP server registers a single resource sverklo://context that AI clients read at session start:
// src/server/mcp-server.ts:200
server.setRequestHandler(ListResourcesRequestSchema, async () => ({
resources: [
{
uri: "sverklo://context",
name: "Sverklo Project Context",
description:
"Key memories and codebase overview. Read this at session start to understand the project.",
mimeType: "text/plain",
},
],
}));
Context Content
When sverklo://context is read, the server returns a markdown document containing:
| Section | Content | Selection Criteria |
|---|---|---|
| Core Project Context | Project-invariant memories (tier='core') | Top 15 by recency |
| Stale Memories | Memories flagged as outdated | Any with is_stale: true |
| Recent Memories | Recent context entries | Top 5 by recency |
| Top Files | Files sorted by PageRank | Top 5 files |
Prompt Templates
The MCP server exposes reusable workflow prompts via the prompts protocol:
// src/server/prompts.ts
export interface PromptDefinition {
name: string;
description: string;
arguments: PromptArgument[];
build: (args: Record<string, string | undefined>) => string;
}
Available Prompts
| Prompt Name | Description | Required Args |
|---|---|---|
| sverklo/review-changes | Diff-aware code review workflow | ref (optional) |
| sverklo/map-feature | Map a feature across codebase entry points, symbols, tests, docs | feature |
Prompt Workflow Example
The sverklo/review-changes prompt guides the model through a structured review:
# Review changes workflow (simplified)
1. Call `review_diff ref:"<ref>"` for risk-scored findings
2. Call `diff_search query:"<risk keywords>"` to surface related changes
3. Call `test_map ref:"<ref>"` to check test coverage
4. Call `impact ref:"<ref>"` for blast radius before approving
Context Tool (`context`)
The context tool is an umbrella bundler that provides codebase overview in a single call:
// src/server/tools/context.ts
const contextTool = {
name: "context",
description:
"Umbrella context bundler. Give a task description and get a single curated bundle: " +
"codebase overview header, semantically relevant code, related symbols, and matching " +
"saved memories — in one round trip.",
inputSchema: {
type: "object",
properties: {
task: { type: "string" },
detail_level: { type: "string", enum: ["minimal", "normal", "full"] },
scope: { type: "string" },
budget: { type: "number" }, // PageRank-pruned token budget
},
},
};
Context Parameters
| Parameter | Type | Default | Description |
|---|---|---|---|
| task | string | - | Free-form task description |
| detail_level | enum | "normal" | minimal=fast/cheap, normal=balanced, full=adds dep neighbors |
| scope | string | - | Path prefix to constrain search (e.g., src/api/) |
| budget | number | - | When set, returns PageRank-pruned repo map fit to token budget |
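The budget parameter can be pictured as a greedy selection over files sorted by PageRank. The sketch below is only one plausible reading: the `RankedFile` shape, the `pruneToBudget` name, and the rough four-characters-per-token estimate are assumptions, and sverklo's actual pruning strategy may differ.

```typescript
// Hypothetical sketch of budget-based pruning; field names and the
// 4-characters-per-token estimate are assumptions.
interface RankedFile {
  path: string;
  pagerank: number;
  summary: string;
}

const estimateTokens = (text: string): number => Math.ceil(text.length / 4);

function pruneToBudget(files: RankedFile[], budget: number): RankedFile[] {
  // Visit the most central files first (copy so the input is not mutated).
  const byRank = [...files].sort((a, b) => b.pagerank - a.pagerank);
  const kept: RankedFile[] = [];
  let used = 0;
  for (const file of byRank) {
    const cost = estimateTokens(`${file.path}\n${file.summary}`);
    if (used + cost > budget) continue; // too large, but a smaller file may still fit
    kept.push(file);
    used += cost;
  }
  return kept;
}
```

Skipping an oversized entry instead of stopping at it keeps the map as full as the budget allows, at the cost of occasionally including a lower-ranked file ahead of a higher-ranked one.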
Critique Tool
The critique tool evaluates whether an AI's answer properly cites sverklo's evidence:
// src/server/tools/critique.ts
interface CritiqueData {
claim: string | null;
verify: VerifyResult[];
stale: VerifyResult[];
moved: VerifyResult[];
hubsCited: string[];
missedHubs: string[];
undefinedSymbols: string[];
undocumentedSymbols: string[];
totalSymbols: number;
}
What Critique Checks
| Check | Description |
|---|---|
| Symbol verification | Are cited symbols actually defined in the codebase? |
| Stale memory | Are referenced memories marked as outdated? |
| Moved code | Do citations point to symbols that have been relocated? |
| Hub citation | Does the answer cite high PageRank hub files? |
| Undefined symbols | Does the answer mention symbols that don't exist? |
| Undocumented symbols | Are important symbols missing .md/.markdown/.mdx documentation? |
Zilliz Compatibility Layer
Sverklo provides aliases for Zilliz claude-context MCP server tools:
// src/server/mcp-server.ts (Zilliz compat tools)
const zillizTools = [
{
name: "search_code",
description: "[Zilliz claude-context compat] Alias for sverklo's search tool.",
inputSchema: {
type: "object",
properties: {
query: { type: "string" },
path: { type: "string" },
limit: { type: "number" },
},
required: ["query"],
},
},
{
name: "clear_index",
description: "[Zilliz claude-context compat] Delete the index database and rebuild from scratch.",
},
{
name: "get_indexing_status",
description: "[Zilliz claude-context compat] Alias for sverklo's `status` tool.",
},
];
Global MCP Server (Multi-Repo Mode)
The startGlobalMcpServer() function serves multiple repositories from a single MCP server:
// src/server/mcp-server.ts:290
export async function startGlobalMcpServer(): Promise<void> {
const pool = new IndexerPool();
const hints = new HintEngine();
// ... initializes server with list_repos tool
}
Global Mode Features
| Feature | Description |
|---|---|
| list_repos | List all registered repositories with path, name, and status |
| repo parameter | All tools accept optional repo parameter to target specific repo |
| Single-repo shortcut | If only one repo is registered, repo parameter is optional |
Server Instructions (Global Mode)
Sverklo (global mode): code intelligence serving multiple repos.
Use the list_repos tool to see available repositories, then pass the
repo name to any tool. If only one repo is registered, the repo
parameter is optional.
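The repo-selection rule in these instructions can be sketched as a small resolver. The `RegisteredRepo` shape, the `resolveRepo` name, and the error messages are hypothetical; only the rule itself (the repo parameter is optional when exactly one repository is registered) comes from the text above.

```typescript
// Hypothetical sketch of the repo-selection rule in global mode.
interface RegisteredRepo {
  name: string;
  path: string;
}

function resolveRepo(repos: RegisteredRepo[], requested?: string): RegisteredRepo {
  if (requested !== undefined) {
    const match = repos.find((r) => r.name === requested);
    if (!match) {
      throw new Error(`Unknown repo "${requested}". Call list_repos to see registered repositories.`);
    }
    return match;
  }
  // No repo given: only unambiguous when exactly one repository is registered.
  const [only, ...rest] = repos;
  if (only !== undefined && rest.length === 0) return only;
  throw new Error("The repo parameter is required unless exactly one repository is registered.");
}
```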
Wakeup Generation
The generateWakeup() function creates a compact project summary:
// src/server/tools/wakeup.ts
export function generateWakeup(
indexer: IndexFiles & IndexMemory,
options: { maxTokens?: number; format?: "markdown" | "plain" } = {}
): string
Wakeup Output Structure
# {projectName}
{fileCount} files · {languages}
## Core files (by dependency rank)
- `path/to/high-pagerank-file.ts`
- ...
## Project invariants (or Recent context)
- [{category}] {memory content}
- ...
Version Management
The server reads its version from package.json at startup by walking up through parent directories until it finds a package.json whose name is sverklo:
// src/server/mcp-server.ts:305
for (const rel of ["..", "../..", "../../.."]) {
try {
const pkg = JSON.parse(readFileSync(join(here, rel, "package.json"), "utf-8"));
if (pkg.name === "sverklo" && pkg.version) {
serverVersion = pkg.version;
break;
}
} catch {}
}
This ensures the MCP server always reports the version of the installed npm package, even when invoked from different working directories.
Search and Retrieval System
Related topics: Search Tools Reference, Indexing System, System Architecture
Overview
The Search and Retrieval System in sverklo provides multi-signal code search capabilities for coding agents. It combines full-text search (BM25), semantic embeddings (bi-encoder), symbol-based lookup, and graph-based PageRank scoring to surface relevant code chunks for a given query.
The system is designed to be local-first with no API keys required, using ONNX runtime for embedding inference and in-memory data structures for fast retrieval. It powers tools like search, investigate, context, and review_diff across the MCP server interface.
Source: src/search/hybrid-search.ts
Architecture
The retrieval architecture combines multiple rankers into a unified pipeline:
graph TD
A[Query] --> B[BM25 FTS]
A --> C[Bi-Encoder Embeddings]
A --> D[Symbol Lookup]
B --> E[Reciprocal Rank Fusion]
C --> E
D --> E
E --> F[PageRank Boost]
F --> G[Reranker]
G --> H[Final Results]
The system uses Reciprocal Rank Fusion (RRF) to combine signals from multiple retrievers, followed by PageRank-based boosting to prioritize centrally-important files, and an optional reranking pass to refine results based on query-document affinity.
Source: src/search/rerank.ts Source: src/search/pagerank.ts
Core Components
Hybrid Search
The hybridSearch function orchestrates multi-signal retrieval by:
- Executing parallel searches across FTS, embeddings, symbols, and references
- Collecting results with source attribution (found_by tags)
- Applying Reciprocal Rank Fusion to merge ranked lists
- Boosting results from high-PageRank files
export async function hybridSearch(
indexer: Indexer,
query: string,
options?: HybridSearchOptions
): Promise<SearchResult[]>
The function returns results annotated with found_by arrays, allowing callers to identify multi-source agreement:
Results agreed on by multiple retrievers are higher-signal than single-source hits.
Source: src/search/hybrid-search.ts Source: src/server/prompts.ts
Reciprocal Rank Fusion
RRF combines ranked lists using the formula:
RRF_score(doc) = Σ 1 / (k + rank_i(doc))
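Here rank_i(doc) is the document's position in the i-th retriever's list, and k is a smoothing constant (typically 60) that damps the advantage of the top positions so that no single retriever's first hit dominates the fused score. Because RRF uses only ranks, it needs no score normalization and copes with the different score distributions of the individual rankers.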
Source: src/search/hybrid-search.ts
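A small worked example may help with the RRF formula above. The sketch assumes 1-based ranks and k = 60; the function name `reciprocalRankFusion`, the result shape, and the file names in the three lists are made up for illustration.

```typescript
// Hypothetical sketch of Reciprocal Rank Fusion with k = 60 and 1-based ranks.
function reciprocalRankFusion(
  rankings: string[][],
  k = 60
): Array<{ id: string; score: number }> {
  const scores = new Map<string, number>();
  for (const ranking of rankings) {
    ranking.forEach((id, index) => {
      const rank = index + 1; // position in this retriever's list
      scores.set(id, (scores.get(id) ?? 0) + 1 / (k + rank));
    });
  }
  return Array.from(scores.entries())
    .map(([id, score]) => ({ id, score }))
    .sort((a, b) => b.score - a.score);
}

const fused = reciprocalRankFusion([
  ["auth.ts", "session.ts", "db.ts"],    // e.g. BM25 results
  ["session.ts", "auth.ts", "cache.ts"], // e.g. embedding results
  ["session.ts", "token.ts"],            // e.g. symbol results
]);
```

Here `session.ts` scores 1/62 + 1/61 + 1/61 because all three lists contain it, so it outranks `auth.ts` (1/61 + 1/62), which in turn outranks files that appear in only one list. Agreement between retrievers matters more than any single top position.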
PageRank Boost
PageRank scores are computed from the import dependency graph during indexing. The pagerankBoost function adjusts search scores based on file centrality:
export function pagerankBoost(
results: SearchResult[],
fileStore: FileStore,
factor?: number
): SearchResult[]
High-PR files (core libraries, entry points) receive a multiplicative boost, ensuring frequently-imported code surfaces first even when query terms are sparse.
Source: src/search/pagerank.ts
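The source does not give the exact boost formula, so the following is only a sketch of one common way to apply a multiplicative boost: score × (1 + factor × normalized PageRank). The formula, the `ScoredHit` shape, and the `boostByPageRank` name are assumptions; only the idea of a multiplicative, centrality-based boost and the default factor of 0.5 come from this document.

```typescript
// Hypothetical sketch: multiplicative boost of score * (1 + factor * normalized PageRank).
interface ScoredHit {
  path: string;
  score: number;
}

function boostByPageRank(
  hits: ScoredHit[],
  pagerank: Map<string, number>,
  factor = 0.5
): ScoredHit[] {
  // Normalize against the largest PageRank so the boost stays within [1, 1 + factor].
  const max = Math.max(0, ...Array.from(pagerank.values()));
  return hits
    .map((hit) => {
      const normalized = max > 0 ? (pagerank.get(hit.path) ?? 0) / max : 0;
      return { ...hit, score: hit.score * (1 + factor * normalized) };
    })
    .sort((a, b) => b.score - a.score);
}
```

With this form, a file with no PageRank keeps its original score, and the most central file gains at most 50% at the default factor, so centrality can reorder close results without overriding a strong textual match.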
Embedding Store
The EmbeddingStore manages vector embeddings for semantic search:
export class EmbeddingStore {
get(query: string, k?: number): EmbeddingResult[]
upsert(records: EmbeddingRecord[]): void
prune(ids: Set<number>): void
}
Embeddings are computed using ONNX runtime and stored in memory. The store supports:
- Top-k retrieval by cosine similarity
- Pruning to remove embeddings for deleted files
- Batch upsert for incremental index updates
Source: src/storage/embedding-store.ts
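Top-k retrieval by cosine similarity can be sketched with a brute-force scan. This is an illustration only: the `StoredVector` shape and the function names are hypothetical, and the real EmbeddingStore may use a different layout or index.

```typescript
// Hypothetical sketch: brute-force top-k retrieval by cosine similarity.
interface StoredVector {
  id: number;
  vector: number[];
}

function cosineSimilarity(a: number[], b: number[]): number {
  if (a.length !== b.length) throw new Error("Vector dimensions differ");
  let dot = 0;
  let normA = 0;
  let normB = 0;
  for (let i = 0; i < a.length; i++) {
    const x = a[i] ?? 0;
    const y = b[i] ?? 0;
    dot += x * y;
    normA += x * x;
    normB += y * y;
  }
  const denom = Math.sqrt(normA) * Math.sqrt(normB);
  return denom === 0 ? 0 : dot / denom; // treat zero vectors as dissimilar
}

function topK(
  query: number[],
  store: StoredVector[],
  k = 10
): Array<{ id: number; similarity: number }> {
  return store
    .map((record) => ({ id: record.id, similarity: cosineSimilarity(query, record.vector) }))
    .sort((a, b) => b.similarity - a.similarity)
    .slice(0, k);
}
```

A linear scan like this costs one dot product per stored vector per query, which is usually acceptable for a single repository held in memory.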
Reranking
The reranker refines initial results using a cross-encoder approach. It takes the top-N candidates from hybrid search and reorders them based on finer-grained query-document matching:
export interface RerankerResult {
chunk_id: number;
score: number;
rerank_score: number;
source: "fts" | "embedding" | "symbol" | "ref";
path: string;
lines: string;
}
Source: src/search/rerank.ts
Investigation Tool
The investigate tool provides single-pass fan-out over all retrieval signals:
export async function handleInvestigate(
indexer: Indexer,
args: { query: string; scope?: string; limit?: number }
): Promise<string>
It returns structured results showing which retrievers found each result, enabling agents to:
- Identify high-confidence hits (multi-source agreement)
- Discover unexpected code locations
- Build confidence in retrieved evidence
Source: src/search/investigate.ts
Search Iterative Tool
For complex queries requiring refinement, searchIterative supports multi-turn search with context accumulation:
export const searchIterativeTool = {
name: "search_iterative",
description: "Multi-turn search that builds on previous results...",
inputSchema: {
type: "object",
properties: {
query: { type: "string", description: "Search query" },
refine: { type: "string", description: "Refinement to previous results" },
// ...
}
}
}
The tool maintains trajectory state across calls, allowing progressive narrowing of search space.
Source: src/server/tools/search-iterative.ts
Context Tool
The context tool is an umbrella bundler that combines search with memory recall:
export const contextTool = {
name: "context",
description: "Umbrella context bundler. Give a task description and get a single curated bundle..."
}
It supports a budget parameter for PageRank-pruned repo maps that fit a token budget—ideal for giving agents a complete mental model of an unfamiliar codebase in one call.
Source: src/server/tools/context.ts
Retrieval Signal Types
| Signal | Source | Strength |
|---|---|---|
| BM25 FTS | Full-text indexing | Exact term matching |
| Bi-encoder embeddings | ONNX inference | Semantic similarity |
| Symbol lookup | AST parsing | Definition/expression finding |
| Reference graph | Import analysis | Call-site discovery |
| PageRank | Dependency graph | Architectural importance |
Configuration
Boost Factor
The BOOST_FACTOR constant (default: 0.5) controls PageRank influence on final scores:
export function pagerankBoost(
results: SearchResult[],
fileStore: FileStore,
factor: number = BOOST_FACTOR
): SearchResult[]
RRF Constant
The RRF_K constant (default: 60) controls how quickly a result's contribution decays with its rank; larger values flatten the difference between high and low positions:
const RRF_K = 60;
function rrfScore(rank: number): number {
return 1 / (RRF_K + rank);
}
Community Considerations
Multi-Vector Reranker Evaluation (Issue #29)
The community has discussed evaluating ColBERT/PLAID-style multi-vector rerankers against the current bi-encoder approach. Multi-vector models tokenize queries and documents into multiple embedding vectors, potentially capturing finer-grained relevance signals for code search.
Current architecture uses bi-encoder embeddings where both query and document are encoded independently. The enhancement would involve:
- Late interaction between query and document token vectors
- Potential improvement in recall for partial matches
- Trade-off consideration: latency vs. accuracy
Benchmark Performance (Issue #28)
Parser improvements (string/comment-aware brace counting) recovered P1 performance from 0.30 → 0.73 on the 90-task benchmark, though P2/P4 categories showed slight regression. This highlights the sensitivity of retrieval quality to underlying code parsing accuracy.
Tool Summary
| Tool | Purpose | Signals Used |
|---|---|---|
| search | Basic hybrid search | FTS + Embeddings + Symbols |
| investigate | Multi-source fan-out | All signals + agreement analysis |
| search_iterative | Refinement search | Trajectory-aware multi-turn |
| context | Bundle + memories | Hybrid search + recall |
| ctx_grep | Grep within results | Post-filter FTS |
| ctx_slice | Dependency slice | Graph-based filtering |
Dependencies
The search system depends on:
- ONNX Runtime Node (onnxruntime-node) for embedding inference
- Tree-sitter (web-tree-sitter) for AST parsing and symbol extraction
- Picomatch for file path matching patterns
Source: package.json
Bi-Temporal Memory Layer
Related topics: Search and Retrieval System
The Bi-Temporal Memory Layer is sverklo's persistent knowledge system that preserves coding decisions, conventions, and context across agent sessions. Unlike traditional memory stores that overwrite previous entries, the bi-temporal model maintains both the current state and historical validity of each memory, enabling conflict detection without data loss.
Core Concepts
What Makes It "Bi-Temporal"
The bi-temporal architecture tracks two independent time dimensions:
- Valid Time: When a memory was or will be true in the real world (e.g., "we deprecated X in v2.0")
- Record Time: When the memory was recorded in the system (e.g., "I learned about X's deprecation today")
This separation allows agents to reason about both what was historically true and when that knowledge was acquired. Source: src/server/tools/memories.ts:1-20
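The two time axes can be made concrete with a small "as of" query. The record shape below is hypothetical (field names such as `validFrom` and `recordedAt` are not taken from sverklo's schema), and timestamps are plain numbers to keep the sketch short.

```typescript
// Hypothetical record shape; field names are illustrative, not the project's schema.
interface BiTemporalMemory {
  id: number;
  content: string;
  validFrom: number;      // valid time: when the fact became true
  validTo: number | null; // valid time: when it stopped being true (null = still true)
  recordedAt: number;     // record time: when the system learned it
}

// What did the system know at `knownAt` about the state of the world at `validAt`?
function asOf(
  memories: BiTemporalMemory[],
  validAt: number,
  knownAt: number
): BiTemporalMemory[] {
  return memories.filter(
    (m) =>
      m.recordedAt <= knownAt &&
      m.validFrom <= validAt &&
      (m.validTo === null || validAt < m.validTo)
  );
}
```

Filtering on both axes is what distinguishes "the API used REST at that time" from "we had not yet recorded the switch to gRPC at that time"; a single timestamp cannot express the second statement.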
Memory Categories
Memories are classified into categories that determine their behavior:
| Category | Purpose | Default Kind |
|---|---|---|
| decision | Architectural choices, API contracts | semantic |
| preference | Coding style, team conventions | semantic |
| pattern | Reusable solutions to recurring problems | semantic |
| context | Project-specific information (default) | episodic |
| todo | Outstanding work items | episodic |
| procedural | Step-by-step processes | procedural |
| correction | Fixes for prior model mistakes | episodic |
Source: src/server/tools/remember.ts:15-23
Memory Kinds (Cognitive Axis)
The cognitive axis determines how memories are retrieved and prioritized:
- episodic: Moment-bound events or decisions tied to specific contexts
- semantic: Timeless facts or rules that apply universally
- procedural: How-to knowledge for executing tasks
Source: src/server/tools/remember.ts:47-50
Memory Tiers
| Tier | Behavior |
|---|---|
| core | Auto-injected at every session start (up to 15 memories) |
| archive | Searched on demand, not auto-injected |
Source: src/server/mcp-server.ts:45-55
Architecture
Storage Components
┌─────────────────────────────────────────────────────────────────┐
│ Bi-Temporal Memory Layer │
├─────────────────────────────────────────────────────────────────┤
│ ┌─────────────┐ ┌──────────────────────┐ │
│ │memory-store │ │memory-embedding-store │ │
│ │ (SQLite) │ │ (Vector + SQLite) │ │
│ └─────────────┘ └──────────────────────┘ │
│ │ │ │
│ └──────────┬──────────┘ │
│ ▼ │
│ ┌─────────────────────┐ │
│ │ staleness.ts │ (file-change tracking) │
│ └─────────────────────┘ │
│ │ │
│ ┌─────────────────────┐ │
│ │ prune.ts │ (lifecycle management) │
│ └─────────────────────┘ │
└─────────────────────────────────────────────────────────────────┘
Memory Lifecycle
graph TD
A[remember tool] --> B[Check for conflicting memories]
B --> C{Similarity > conflict threshold 0.85?}
C -->|Yes| D[Mark old memory as STALE]
C -->|No| E[Keep both memories active]
D --> F[Save new memory with git state]
E --> F
F --> G[Store in memory-store]
F --> H[Generate embeddings in memory-embedding-store]
G --> I[Auto-inject if tier=core]
Conflict Detection
The system uses a configurable conflict threshold (default: 0.85) to identify potentially contradictory memories:
const CONFLICT_THRESHOLD = 0.85;
When a new memory conflicts with an existing one above this threshold, the older memory is marked as stale rather than deleted. Both records are preserved, enabling agents to review the conflict. Source: src/server/tools/remember.ts:8
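The mark-stale-instead-of-delete rule can be sketched as follows. Only the 0.85 threshold comes from the source; the `StoredMemory` shape, the use of cosine similarity over embeddings as the conflict measure, and the function names are assumptions made for illustration.

```typescript
// Threshold value from the source; everything else in this sketch is hypothetical.
const CONFLICT_THRESHOLD = 0.85;

interface StoredMemory {
  id: number;
  content: string;
  embedding: number[];
  isStale: boolean;
}

function similarity(a: number[], b: number[]): number {
  let dot = 0;
  let normA = 0;
  let normB = 0;
  for (let i = 0; i < Math.min(a.length, b.length); i++) {
    const x = a[i] ?? 0;
    const y = b[i] ?? 0;
    dot += x * y;
    normA += x * x;
    normB += y * y;
  }
  const denom = Math.sqrt(normA) * Math.sqrt(normB);
  return denom === 0 ? 0 : dot / denom;
}

function saveWithConflictCheck(
  existing: StoredMemory[],
  incoming: StoredMemory
): StoredMemory[] {
  const updated = existing.map((m) =>
    !m.isStale && similarity(m.embedding, incoming.embedding) > CONFLICT_THRESHOLD
      ? { ...m, isStale: true } // superseded, but kept for history
      : m
  );
  return [...updated, incoming];
}
```

Because the older record is flagged rather than removed, a later audit can still show both sides of the conflict and let a human or agent decide which one holds.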
MCP Tools
remember — Save Persistent Memory
rememberTool = {
name: "remember",
inputSchema: {
content: string, // Required: the memory content
category: MemoryCategory, // decision|preference|pattern|context|todo|procedural|correction
tags: string[], // Optional metadata tags
related_files: string[], // Files this memory relates to
confidence: number, // 0.0-1.0, affects retrieval ranking
tier: "core" | "archive", // core=auto-inject, archive=search-on-demand
kind: "episodic" | "semantic" | "procedural",
scope: "project" | "workspace", // project=repo-local, workspace=cross-repo
}
}
Key behaviors:
- Tied to git state — memories are associated with the current commit/branch
- Auto-invalidates conflicting prior memories above the conflict threshold
- procedural category defaults to procedural kind
- preference/pattern categories default to semantic kind
- Other categories default to episodic kind
Source: src/server/tools/remember.ts:12-55
recall — Retrieve Relevant Memories
The recall tool searches memories semantically and returns results ranked by relevance. Memories linked to files that have changed since recording may be marked as potentially stale.
memories — List and Audit Memories
memoriesTool = {
name: "memories",
inputSchema: {
mode: "list" | "conflicts", // list=show all, conflicts=show contradictory pairs
category: MemoryCategory, // Filter by category
limit: number, // Max results (default: 50)
stale_only: boolean, // Only show stale memories
}
}
Conflict mode: Surfaces pairs of active memories sharing a pin that may contradict. The bi-temporal model preserves both, presenting this as a review prompt rather than auto-resolving.
Source: src/server/tools/memories.ts:8-30
pin / unpin — Anchor Memories to Code Locations
pinTool = {
name: "pin",
inputSchema: {
memory_id: number, // From recall/memories results
target: string, // File path or symbol name
}
}
Pinned memories surface automatically when recalling by that file path or symbol name, without requiring semantic search. This enables location-specific knowledge injection.
Source: src/server/tools/pin.ts:1-30
Core Memories Auto-Injection
On every MCP session start, the server auto-injects core tier memories into the context:
// From mcp-server.ts
const coreMemories = indexer.memoryStore.getCore(15);
for (const m of coreMemories) {
const stale = m.is_stale ? " [STALE]" : "";
parts.push(`- [${m.category}]${stale} ${m.content}`);
}
These memories appear in the sverklo://context resource and serve as project invariants that agents should always consider. Source: src/server/mcp-server.ts:45-55
Staleness Detection
File-Based Staleness
When related_files are provided with a memory, sverklo tracks file changes:
interface Memory {
related_files: string[]; // Files this memory relates to
is_stale: boolean; // Set when related files change
}
Stale memories are flagged with [STALE] in the context output, alerting agents to re-evaluate whether the memory is still valid. Source: src/server/mcp-server.ts:52
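As an illustration of file-based staleness, the sketch below flags any memory whose related_files intersect a set of changed paths. It assumes the changed paths are already known (for example from a git diff); the `id` field and the helper name `flagStaleByFiles` are hypothetical and not taken from the source.

```typescript
interface Memory {
  id: number; // assumed identifier, not shown in the excerpt above
  related_files: string[];
  is_stale: boolean;
}

// Returns copies of the memories with is_stale set when any related
// file appears in the changed-file set. Already-stale memories stay stale.
function flagStaleByFiles(memories: Memory[], changedFiles: string[]): Memory[] {
  const changed = new Set(changedFiles);
  return memories.map((m) => ({
    ...m,
    is_stale: m.is_stale || m.related_files.some((f) => changed.has(f)),
  }));
}
```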
Conflict-Based Staleness
Memories that conflict with newer entries (similarity > 0.85) are automatically marked stale, preserving the historical record while surfacing the current best knowledge.
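A minimal sketch of the threshold check follows. It assumes that "similarity" means cosine similarity between memory embeddings, which the source does not state; the type `EmbeddedMemory` and both function names are hypothetical.

```typescript
interface EmbeddedMemory {
  id: number;
  embedding: number[]; // assumed: one dense vector per memory
  is_stale: boolean;
}

function cosineSimilarity(a: number[], b: number[]): number {
  let dot = 0;
  let normA = 0;
  let normB = 0;
  for (let i = 0; i < Math.min(a.length, b.length); i++) {
    dot += a[i] * b[i];
    normA += a[i] * a[i];
    normB += b[i] * b[i];
  }
  if (normA === 0 || normB === 0) return 0;
  return dot / (Math.sqrt(normA) * Math.sqrt(normB));
}

// Marks active memories stale when they exceed the similarity threshold
// against the incoming memory. Nothing is deleted; returns the affected ids.
function markConflictsStale(
  existing: EmbeddedMemory[],
  incoming: EmbeddedMemory,
  threshold = 0.85,
): number[] {
  const invalidated: number[] = [];
  for (const m of existing) {
    if (!m.is_stale && cosineSimilarity(m.embedding, incoming.embedding) > threshold) {
      m.is_stale = true;
      invalidated.push(m.id);
    }
  }
  return invalidated;
}
```

Because older entries are only flagged, the historical record stays queryable, which matches the preservation behavior described above.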
Workspace Scope
Memories can be saved at two scopes:
| Scope | Storage Location | Visibility |
|---|---|---|
| project | {repo}/.sverklo/memories.db | Current repository only |
| workspace | ~/.sverklo/workspaces/{name}/memories.db | All repos in workspace |
The workspace scope enables cross-repository decisions (e.g., "we use Postgres everywhere") to be shared across projects. Source: src/server/tools/remember.ts:55-62
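The storage locations in the table can be expressed as a simple path resolver. This is a sketch under the assumption that paths are joined with forward slashes; `memoryDbPath` and its options object are hypothetical names, not part of sverklo's API.

```typescript
type MemoryScope = "project" | "workspace";

interface PathOptions {
  repoRoot: string;       // absolute path of the repository
  homeDir: string;        // the user's home directory (the "~" in the table)
  workspaceName?: string; // required for workspace scope
}

// Resolves the memories.db location for a scope, following the table above.
function memoryDbPath(scope: MemoryScope, opts: PathOptions): string {
  if (scope === "project") {
    return `${opts.repoRoot}/.sverklo/memories.db`;
  }
  if (!opts.workspaceName) {
    throw new Error("workspace scope requires a workspace name");
  }
  return `${opts.homeDir}/.sverklo/workspaces/${opts.workspaceName}/memories.db`;
}
```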
Community Considerations
Global Memory Setup (Issue #72)
Users requesting sverklo init --global want one-time workspace-level memory setup that doesn't require per-project initialization. The bi-temporal layer's workspace scope partially addresses this by enabling cross-repo memories, but the initialization workflow remains per-project.
MCP Tool Name Conflicts (Issue #71)
Memory tools (remember, recall, memories) may be double-prefixed (sverklo_sverklo_remember) when registered under the sverklo key, though this is a naming convention issue rather than a bi-temporal architecture concern.
Summary
The Bi-Temporal Memory Layer provides sverklo's agents with persistent, conflict-aware knowledge that survives across sessions. Key design decisions:
- Preservation over deletion: Conflicting memories are marked stale, not removed
- Dual temporal axes: Valid time and record time enable historical reasoning
- Tiered retrieval: Core memories auto-inject; archive memories search on demand
- Location pinning: Memories anchor to files/symbols for context-sensitive recall
- Git integration: Memories are tied to repository state for traceability
Source: https://github.com/sverklo/sverklo / Human Manual
Indexing System
Related topics: Search and Retrieval System, System Architecture
The indexing system is the core data pipeline that powers sverklo's code intelligence capabilities. It transforms source code into a queryable, multi-dimensional index that combines file metadata, code symbols, dependency graphs, semantic embeddings, and persistent memory.
Overview
Sverklo's indexer builds a four-layer index that feeds all downstream tools:
| Layer | Purpose | Data Structure |
|---|---|---|
| Files | Track all indexed source files with metadata | fileStore |
| Code | Extract symbols, chunks, and documentation edges | codeStore / docEdgeStore |
| Graph | Build dependency relationships for PageRank | graphStore |
| Memory | Persist context and decisions across sessions | memoryStore |
Source: src/server/tools/context.ts
Architecture
graph TD
A[Source Files] --> B[File Indexer]
B --> C[fileStore]
D[Code Parsing] --> E[Symbol Extractor]
E --> F[codeStore]
F --> G[Doc Edge Store]
G --> H[docEdgeStore]
I[File Metadata] --> J[Graph Builder]
J --> K[graphStore]
K --> L[PageRank Compute]
L --> K
M[Memory Store] --> N[memoryStore]
C --> O[Hybrid Search]
F --> O
K --> O
G --> O
Indexer Interfaces
The indexer exposes a unified interface composed of four capability types, which together cover five stores:
type IndexFiles = {
fileStore: FileStore;
getStatus(): ProjectStatus;
};
type IndexCode = {
codeStore: CodeStore;
docEdgeStore: DocEdgeStore;
};
type IndexGraph = {
graphStore: GraphStore;
};
type IndexMemory = {
memoryStore: MemoryStore;
};
Source: src/server/tools/context.ts
File Store
The file store maintains a registry of all indexed source files with their metadata:
interface FileRecord {
id: number;
path: string;
pagerank: number;
// ... other metadata
}
The store provides:
- getAll() - Retrieve all indexed files
- getById(id) - Lookup by file ID
- Path-to-ID mapping for graph edge resolution
Source: src/server/audit-obsidian.ts
Build Lookup Maps
const idToPath = new Map<number, string>();
for (const f of files) idToPath.set(f.id, f.path);
Source: src/server/audit-obsidian.ts
Code Store
The code store indexes code symbols and documentation references:
Symbol Extraction
Symbols are extracted from parsed code and stored with:
- Symbol name and type (function, class, type, interface, method, variable)
- Source file location
- Chunk boundaries for code retrieval
Source: src/server/tools/find-references.ts
Documentation Edge Store
Documentation edges connect code symbols to their documentation:
interface DocMention {
doc_file_path: string;
doc_breadcrumb?: string;
match_kind: string;
edge_kind: "includes" | "reference";
confidence: number;
}
The store supports:
- getBySymbol(symbol, limit) - Find documentation mentions of a symbol
- Edge kind filtering: "includes" for structural inclusions, "reference" for associative mentions
- Deduplication by file path + breadcrumb + match kind
Source: src/server/tools/find-references.ts
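The deduplication and edge-kind split described above might look like the following. This is an illustrative sketch only: the helper names `dedupeMentions` and `splitByEdgeKind` are hypothetical, and keeping the first occurrence of each key is an assumption.

```typescript
interface DocMention {
  doc_file_path: string;
  doc_breadcrumb?: string;
  match_kind: string;
  edge_kind: "includes" | "reference";
  confidence: number;
}

// Drops mentions that share file path + breadcrumb + match kind,
// keeping the first occurrence of each key.
function dedupeMentions(mentions: DocMention[]): DocMention[] {
  const seen = new Set<string>();
  const out: DocMention[] = [];
  for (const m of mentions) {
    const key = JSON.stringify([m.doc_file_path, m.doc_breadcrumb ?? "", m.match_kind]);
    if (seen.has(key)) continue;
    seen.add(key);
    out.push(m);
  }
  return out;
}

// Separates structural inclusions from associative "see also" mentions.
function splitByEdgeKind(mentions: DocMention[]) {
  return {
    includes: mentions.filter((m) => m.edge_kind === "includes"),
    references: mentions.filter((m) => m.edge_kind === "reference"),
  };
}
```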
Graph Store
The graph store builds and maintains the dependency graph:
Edge Structure
interface GraphEdge {
source_file_id: number;
target_file_id: number;
}
Import/Dependency Maps
const imports = new Map<string, string[]>(); // file -> files it imports
const importedBy = new Map<string, string[]>(); // file -> files that import it
Source: src/server/audit-obsidian.ts
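To show how the two maps relate to the edge records, here is a minimal sketch that derives both from a list of edges and an id-to-path lookup. It assumes an edge means "source file imports target file"; `buildImportMaps` is a hypothetical helper name.

```typescript
interface GraphEdge {
  source_file_id: number;
  target_file_id: number;
}

// Builds forward (imports) and reverse (importedBy) adjacency maps keyed by path.
function buildImportMaps(edges: GraphEdge[], idToPath: Map<number, string>) {
  const imports = new Map<string, string[]>();
  const importedBy = new Map<string, string[]>();
  for (const e of edges) {
    const src = idToPath.get(e.source_file_id);
    const dst = idToPath.get(e.target_file_id);
    if (src === undefined || dst === undefined) continue; // skip edges to unknown files
    imports.set(src, [...(imports.get(src) ?? []), dst]);
    importedBy.set(dst, [...(importedBy.get(dst) ?? []), src]);
  }
  return { imports, importedBy };
}
```

Keeping both directions makes "what does this file import?" and "who imports this file?" single map lookups.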
PageRank Computation
Files receive PageRank scores based on their position in the dependency graph. High PageRank files are considered "load-bearing" modules.
Source: src/server/tools/wakeup.ts
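The source does not show how the scores are computed. As a rough illustration, the sketch below runs standard PageRank power iteration over the file graph. The damping factor of 0.85, the fixed iteration count, the handling of files with no imports, and the edge direction (rank flows from the importing file to the imported file) are all assumptions, and `pageRank` is a hypothetical function name.

```typescript
interface GraphEdge {
  source_file_id: number; // importing file
  target_file_id: number; // imported file
}

// Power-iteration PageRank. Edges must only reference ids in nodeIds.
function pageRank(
  nodeIds: number[],
  edges: GraphEdge[],
  damping = 0.85,
  iterations = 50,
): Map<number, number> {
  const n = nodeIds.length;
  const rank = new Map<number, number>();
  const outDegree = new Map<number, number>();
  for (const id of nodeIds) {
    rank.set(id, 1 / n);
    outDegree.set(id, 0);
  }
  for (const e of edges) {
    outDegree.set(e.source_file_id, (outDegree.get(e.source_file_id) ?? 0) + 1);
  }

  for (let i = 0; i < iterations; i++) {
    // Rank held by files with no outgoing imports is spread evenly.
    let dangling = 0;
    for (const id of nodeIds) {
      if (outDegree.get(id) === 0) dangling += rank.get(id) ?? 0;
    }
    const base = (1 - damping) / n + (damping * dangling) / n;
    const next = new Map<number, number>();
    for (const id of nodeIds) next.set(id, base);
    for (const e of edges) {
      const share = (rank.get(e.source_file_id) ?? 0) / (outDegree.get(e.source_file_id) ?? 1);
      next.set(e.target_file_id, (next.get(e.target_file_id) ?? 0) + damping * share);
    }
    for (const id of nodeIds) rank.set(id, next.get(id) ?? 0);
  }
  return rank;
}
```

With this direction convention, a file imported by many others accumulates rank, which fits the "load-bearing" description.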
Hybrid Search
The hybrid search combines multiple retrieval signals:
const { hybridSearch } = require("../../search/hybrid-search.js");
Search Signals
| Signal | Description |
|---|---|
| BM25 | Traditional keyword matching |
| Vector/Embeddings | Semantic similarity via bi-encoder |
| PageRank | Graph-based importance |
| Symbol Match | Exact symbol references |
Source: src/server/tools/context.ts
Memory Store
The memory store provides persistent context across sessions:
Memory Tiers
| Tier | Usage |
|---|---|
| Core | Project invariants, auto-injected on session start |
| Archive | General memories and decisions, retrieved by search on demand |
Memory Categories
- Conventions
- Architecture decisions
- Project-specific patterns
Source: src/server/mcp-server.ts
Status Reporting
The indexer provides project status through getStatus():
const status = indexer.getStatus();
// Returns: { projectName, fileCount, languages, ... }
Source: src/server/tools/wakeup.ts
Related Community Issues
Index Timestamp Bug (Issue #74)
sverklo reindex does not update the lastIndexed field in registry.json, causing stale age displays after reindexing.
Reproduction:
sverklo register .
sverklo reindex .
sverklo list # Still shows stale age
Parser Regression (Issue #28)
A string/comment-aware brace counter fix in the parser recovered P1 from 0.30 → 0.73 on the 90-task benchmark but caused slight regressions in P2/P4 categories.
Retrieval Architecture (Issue #29)
Community discussion on evaluating ColBERT/PLAID-style multi-vector rerankers against the current bi-encoder + BM25 + PageRank architecture.
Configuration
The indexer supports path-based filtering:
| Option | Description |
|---|---|
| scope | Path prefix to constrain indexing |
| ignore | Patterns to exclude from indexing |
Wakeup Generation
The wakeup system generates a quick project orientation:
function generateWakeup(indexer, options) {
const status = indexer.getStatus();
const coreMemories = indexer.memoryStore.getCore(10);
const topFiles = indexer.fileStore.getAll().slice(0, 5);
}
Output includes:
- Project name and file count
- Top 5 files by PageRank
- Core memories (tier='core')
Source: src/server/tools/wakeup.ts
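The excerpt above takes the first five entries of getAll() without showing a sort, so whether the store already returns files in PageRank order is not visible here. A minimal sketch that makes the ordering explicit is shown below; `topFilesByPageRank` is a hypothetical helper, not part of the source.

```typescript
interface FileRecord {
  id: number;
  path: string;
  pagerank: number;
}

// Returns the n highest-PageRank files without mutating the input array.
function topFilesByPageRank(files: FileRecord[], n = 5): FileRecord[] {
  return [...files].sort((a, b) => b.pagerank - a.pagerank).slice(0, n);
}
```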
Search Tools Reference
Related topics: Impact and Reference Tools, Search and Retrieval System
Sverklo provides a layered suite of search and investigation tools designed to help coding agents navigate, understand, and reason about codebases. The tools span from fast keyword lookups to deep semantic analysis, with reciprocal rank fusion combining multiple retrieval signals for high-quality results.
Architecture Overview
Sverklo's search system uses a hybrid retrieval architecture that combines three signals:
| Signal | Mechanism | Purpose |
|---|---|---|
| BM25 | Sparse keyword matching | Exact term matches and code identifiers |
| Bi-encoder embeddings | Dense vector similarity | Semantic/code-similarity search |
| PageRank | Graph-based importance | Prioritize high-centrality files |
These signals are merged using reciprocal rank fusion (RRF) to produce ranked results that balance precision with recall. Source: package.json (keywords: "reciprocal-rank-fusion", "bm25", "pagerank")
graph TD
subgraph "Retrieval Layer"
BM25[BM25 Keyword Search]
EMB[Bi-encoder Embeddings]
PGR[PageRank Scorer]
end
subgraph "Fusion"
RRF[Reciprocal Rank Fusion]
end
subgraph "Post-Processing"
VERIFY[Verify Results]
REFINE[Refine & Deduplicate]
end
BM25 --> RRF
EMB --> RRF
PGR --> RRF
RRF --> VERIFY
VERIFY --> REFINE
REFINE --> RESULTS[Final Results]
Tool Categories
Sverklo organizes search tools into four functional categories, each targeting a specific stage of code investigation.
Discovery Tools
These tools help locate code and understand what's in the codebase.
| Tool | Purpose |
|---|---|
| search | Primary semantic search across code chunks |
| search_iterative | Multi-turn search with result refinement |
| investigate | Fan-out search across FTS, embeddings, symbols, and refs simultaneously |
| grep | Exact string/pattern matching |
| head | View beginning of files |
Source: tool-overrides.ts (tool lists in research and lean profiles)
Context Tools
These tools bundle information for rapid orientation.
| Tool | Purpose |
|---|---|
| context | Umbrella tool returning codebase overview + search + symbols + memories in one call |
| overview | Structural summary with file/chunk counts and PageRank rankings |
| ask | Free-form question answering over indexed content |
| wakeup | Quick orientation summary for new sessions |
The context tool is designed as the "first call" when starting work on a new task, returning a curated bundle in a single round trip:
Give a task description and get a single curated bundle: codebase overview header, semantically relevant code, related symbols, and matching saved memories — in one round trip.
Source: src/server/tools/context.ts (tool description)
Navigation Tools
These tools help traverse code structure and relationships.
| Tool | Purpose |
|---|---|
| lookup | Retrieve full code chunks by path |
| refs | Find references to symbols, functions, and variables |
| deps | Explore dependency relationships |
| clusters | Group similar code patterns |
| patterns | Identify recurring code idioms |
| concepts | Extract and map high-level concepts |
Verification Tools
These tools validate findings and assess code quality.
| Tool | Purpose |
|---|---|
| verify | Check if cited evidence actually supports claims |
| critique | Multi-dimensional analysis of response quality |
| review_diff | Risk-scored PR review with heuristic finding detection |
| audit | Codebase health scoring |
Source: tool-overrides.ts (verification tools in research and review profiles)
Core Search Tool: `search`
The primary search tool uses hybrid retrieval to find semantically relevant code chunks.
const searchTool = {
name: "search",
description: "Semantic code search using FTS + embeddings + PageRank"
};
Results include found_by tags that indicate which retrievers agreed on each result — results marked by multiple signals are higher-signal than single-source hits.
Source: prompts.ts (investigate prompt instructs: "Read the found_by tags — results agreed on by multiple retrievers are higher-signal than single-source hits")
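To make the fusion step and the found_by tags concrete, here is a minimal sketch of reciprocal rank fusion over per-retriever rankings. The constant k = 60 is the value conventionally used for RRF, not one confirmed for sverklo; the function name, the result shape, and the retriever labels are hypothetical. Each ranking is assumed to list an id at most once.

```typescript
interface FusedResult {
  id: string;
  score: number;
  found_by: string[]; // retrievers that returned this result
}

// rankings maps a retriever name to its result ids, best first.
function reciprocalRankFusion(rankings: Record<string, string[]>, k = 60): FusedResult[] {
  const fused = new Map<string, FusedResult>();
  for (const [retriever, ids] of Object.entries(rankings)) {
    ids.forEach((id, index) => {
      const entry = fused.get(id) ?? { id, score: 0, found_by: [] };
      entry.score += 1 / (k + index + 1); // index is 0-based, rank is 1-based
      entry.found_by.push(retriever);
      fused.set(id, entry);
    });
  }
  return Array.from(fused.values()).sort((a, b) => b.score - a.score);
}
```

A result returned by two retrievers collects two reciprocal-rank terms, so agreement between signals raises its position in the fused list.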
Iterative Search: `search_iterative`
For complex queries requiring refinement, the iterative search tool supports multi-turn exploration where results from one search inform subsequent queries.
Source: tool-overrides.ts (included in research profile)
Investigation Tool: `investigate`
The investigate tool performs a single-pass fan-out across all retrieval signals simultaneously:
- Full-text search (BM25)
- Vector embeddings (semantic similarity)
- Symbol index (function/class names)
- Reference graph (who calls whom)
This is the recommended starting point for feature mapping and deep code exploration.
Source: prompts.ts (map-feature prompt)
Context Bundler: `context`
The context tool is an umbrella that combines multiple searches into a single curated response.
Parameters
| Parameter | Type | Description |
|---|---|---|
| task | string | Free-form task description |
| detail_level | "minimal" \| "normal" \| "full" | Amount of detail to return |
| scope | string | Optional path prefix constraint |
| budget | number | PageRank-pruned token budget |
Detail Levels
| Level | Contents |
|---|---|
| minimal | Overview header + top 3 search hits + top 2 memories |
| normal | Overview header + top 5 search hits + top 5 memories + symbol table |
| full | Normal + dependency neighbors of top results |
When budget is set, returns a PageRank-pruned repo map greedily filled to the token limit.
Source: src/server/tools/context.ts (input schema and implementation)
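One way to read "greedily filled to the token limit" is sketched below: visit files in descending PageRank order and keep each one that still fits. The per-file token counts, the choice to skip oversized entries rather than stop at the first one, and the name `fillBudget` are assumptions for illustration.

```typescript
interface RankedFile {
  path: string;
  pagerank: number;
  tokens: number; // assumed: precomputed token estimate for the file's map entry
}

// Greedy fill: highest PageRank first, skipping entries that would exceed the budget.
function fillBudget(files: RankedFile[], budget: number): RankedFile[] {
  const picked: RankedFile[] = [];
  let used = 0;
  const byRank = [...files].sort((a, b) => b.pagerank - a.pagerank);
  for (const f of byRank) {
    if (used + f.tokens > budget) continue; // does not fit, try smaller entries
    picked.push(f);
    used += f.tokens;
  }
  return picked;
}
```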
Finding References: `refs`
The refs tool finds all references to a symbol, including:
- Call sites
- Type definitions
- Documentation mentions
- Import/export relationships
A significant enhancement in recent versions separates structural inclusions from associative references:
Sprint 9: split structural inclusions from associative references so callers see "this is where the symbol is documented" separately from "see also" mentions.
Results include deduplication to avoid emitting near-identical lines when both an outer fenced chunk and its inner content resolve to the same symbol.
Source: src/server/tools/find-references.ts (deduplication logic and comment)
Documentation Detection
The refs tool detects when markdown or README files reference a symbol by backtick or fenced code:
If any markdown / README / ADR chunks reference this symbol by backtick or fenced code, surface them so the agent sees both the code and its documentation together.
This helps agents identify both the implementation and its documentation in one view.
Source: src/server/tools/find-references.ts (doc mention detection)
Verification: `verify` and `critique`
These tools validate that search results actually support agent claims.
`verify`
Checks if cited evidence points to actual file locations and content matches the claim.
`critique`
Performs multi-dimensional analysis including:
- Claim verification against cited evidence
- Stale memory detection
- Moved symbol tracking
- Hub file citation analysis
- Undefined symbol detection
- Undocumented symbol identification
interface CritiqueData {
claim: string | null;
verify: VerifyResult[];
stale: VerifyResult[];
moved: VerifyResult[];
hubsCited: string[];
missedHubs: string[];
undefinedSymbols: string[];
undocumentedSymbols: string[];
totalSymbols: number;
}
The critique tool specifically checks if documentation files (.md, .markdown, .mdx) cite the symbols — if not, the symbol is flagged as undocumented.
Source: src/server/tools/critique.ts (formatCritique function and verification logic)
Tool Profiles
Sverklo organizes tools into named profiles for different workflows:
| Profile | Tools Included | Use Case |
|---|---|---|
| full | All tools | Complete access |
| minimal | search, lookup, overview, refs, impact | Quick lookups |
| lean | search, lookup, overview, refs, impact, deps, context, status, remember, recall, review_diff | Development workflow |
| research | search, search_iterative, investigate, ask, lookup, overview, refs, impact, deps, concepts, patterns, clusters, verify, critique, ctx_slice, ctx_grep, ctx_stats, status | Code investigation |
| review | review_diff, diff_search, test_map, impact, refs, lookup, search, investigate, verify, status | PR/MR review |
Source: tool-overrides.ts (PROFILES constant)
Environment Configuration
Tool Descriptions
Override tool descriptions via environment variables:
SVERKLO_TOOL_<NAME>_DESCRIPTION="custom description"
For example:
SVERKLO_TOOL_SEARCH_DESCRIPTION="Custom search override"
Tool Profiles
Select which tools are available via SVERKLO_PROFILE:
SVERKLO_PROFILE=research # Enable investigation tools
SVERKLO_PROFILE=review # Enable PR review tools
Tool Disabling
Disable specific tools via SVERKLO_DISABLED_TOOLS:
SVERKLO_DISABLED_TOOLS=search,investigate
Source: tool-overrides.ts (environment variable handling)
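The three environment variables above could be resolved roughly as follows. This is a sketch, not the logic in tool-overrides.ts: falling back to all tools when the profile is unset or unknown, trimming whitespace in the disabled list, and upper-casing the tool name for the description variable are assumptions, and the function names are hypothetical. Only two profiles from the table are reproduced to keep the example short.

```typescript
// Subset of the profile table; "full" is treated as "all tools".
const PROFILES: Record<string, string[]> = {
  minimal: ["search", "lookup", "overview", "refs", "impact"],
  review: [
    "review_diff", "diff_search", "test_map", "impact", "refs",
    "lookup", "search", "investigate", "verify", "status",
  ],
};

type Env = Record<string, string | undefined>;

// Applies SVERKLO_PROFILE, then removes anything in SVERKLO_DISABLED_TOOLS.
function enabledTools(env: Env, allTools: string[]): string[] {
  const profile = env.SVERKLO_PROFILE ?? "full";
  const base = profile === "full" ? allTools : (PROFILES[profile] ?? allTools);
  const disabled = new Set(
    (env.SVERKLO_DISABLED_TOOLS ?? "")
      .split(",")
      .map((s) => s.trim())
      .filter((s) => s.length > 0),
  );
  return base.filter((name) => !disabled.has(name));
}

// Returns the SVERKLO_TOOL_<NAME>_DESCRIPTION override if set, else the fallback.
function toolDescription(env: Env, name: string, fallback: string): string {
  return env[`SVERKLO_TOOL_${name.toUpperCase()}_DESCRIPTION`] ?? fallback;
}
```

In a Node process, the env argument would typically be process.env.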
Prompt Templates
Sverklo includes prompt templates that encode the recommended order of tool calls for common tasks.
Map Feature Prompt
Maps a feature across the codebase using a recommended workflow:
1. investigate: single-pass fan-out over all retrieval signals
2. refs: expand surface area for top symbols
3. impact: assess blast radius before proposing changes
4. overview: structural summary for architectural context
5. search: keyword/diff-aware search for recent changes
Architecture Map Prompt
Generates an architecture map using:
1. overview: structural summary with PageRank
2. deps: dependency analysis for top files
3. recall: any saved design decisions
4. search: find entry points
Source: src/server/prompts.ts (prompt definitions)
GitHub Action Integration
The search tools integrate with GitHub Actions for automated PR review:
uses: sverklo/sverklo/action@main
with:
github-token: ${{ secrets.GITHUB_TOKEN }}
fail-on: high
max-files: 25
The action uses heuristic finding detection to identify risky code patterns and posts results as GitHub PR review comments with inline annotations.
Source: action/README.md (usage documentation)
Community Considerations
Retrieval Architecture Evaluation
The community has discussed evaluating alternative retrieval architectures:
LinkedIn discussion... raised two concrete pushes against sverklo's current retrieval architecture, both worth taking seriously...
Specifically, issue #29 discusses evaluating ColBERT/PLAID-style multi-vector rerankers against the current bi-encoder + BM25 + PageRank approach.
Regressions After Parser Fix
Benchmark issue #28 documents a regression in P2/P4 performance after a parser brace-counter fix. This affected search quality for certain code patterns, highlighting the importance of the indexer's parsing quality for retrieval accuracy.
Impact and Reference Tools
Related topics: Search Tools Reference, Indexing System
The Impact and Reference Tools form the graph-analysis layer of sverklo's code intelligence system. These tools enable agents to understand how code entities relate to each other, assess the blast radius of proposed changes, and verify that code modifications haven't introduced broken references or undocumented dependencies.
Overview
Sverklo maintains a dependency graph built from import/export relationships parsed during indexing. The Impact and Reference Tools query this graph to answer two fundamental questions:
- What depends on this code? — Reference analysis traces consumers and dependencies
- What would break if this changed? — Impact analysis calculates blast radius and risk scores
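The refs and impact tools are available in every tool profile (core, nav, lean, research, review), while the remaining tools are limited to the profiles listed in the Tool Profiles table of this section. All of them are considered essential for safe refactoring and architectural decisions. Source: src/server/tool-overrides.ts:1-51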
Tool Inventory
Core Graph Tools
| Tool | Purpose | Primary Use Case |
|---|---|---|
| refs | Find all references to a symbol | Understanding usage patterns before refactoring |
| impact | Calculate blast radius of changes | Risk assessment for proposed modifications |
| deps | Show dependency graph for a file/symbol | Understanding architectural layers |
Supporting Tools
| Tool | Purpose | Integration |
|---|---|---|
| investigate | Fan-out search across FTS, vectors, symbols, refs | Research workflow entry point |
| verify | Cross-reference claims against codebase | Code review and documentation validation |
| critique | Structured verification of code claims | PR review and architectural review |
Source: src/server/mcp-server.ts
Reference Analysis
Symbol Reference Finding
The refs tool traces both direct code references and documentation mentions of symbols. It builds bidirectional maps from the graph store to answer "who imports this?" and "what does this import?" queries.
graph LR
A[Symbol Query] --> B[Graph Store]
B --> C[Import Map<br/>file → files it imports]
B --> D[Imported-By Map<br/>file → files importing it]
C --> E[Direct Dependencies]
D --> F[Direct Consumers]
E --> G[Transitive Dependencies<br/>N hops]
F --> H[Transitive Consumers<br/>N hops]
The tool separates structural inclusions from associative references, surfacing "this is where the symbol is documented" separately from "see also" mentions. Source: src/server/tools/find-references.ts:1-40
Documentation Citation Tracking
The refs tool detects when documentation (.md, .markdown, .mdx files) references a symbol. This helps agents identify:
- Architecture Decision Records (ADRs) that document the symbol's design
- Usage examples in README files
- Related documentation that should be updated alongside code changes
Reference rows are deduplicated to prevent emitting near-identical lines when both an outer fenced chunk and inner fence resolve to the same symbol. Source: src/server/tools/find-references.ts:25-38
Verify Results Format
Reference findings are returned with file path, line information, and match kind:
interface VerifyResult {
file?: string; // Repo-relative path
line?: number; // 1-based line number
match_kind: string; // 'import' | 'call' | 'type_ref' | 'doc_mention'
confidence?: number; // 0-1 for heuristic matches
}
Impact Analysis
Blast Radius Calculation
The impact tool calculates how many and which files would be affected by changes to a given symbol. It uses PageRank scores to prioritize the most important affected files.
graph TD
A[Changed Symbol] --> B[Direct Consumers]
B --> C[Test Files]
B --> D[Direct Importers]
C --> E[High Risk<br/>No Alternative Path]
D --> F[Transitive Consumers]
F --> G[Indirect Dependencies]
G --> H[Risk Score by<br/>PageRank Weight]
Risk Scoring Factors
Impact analysis considers multiple factors:
| Factor | Weight | Description |
|---|---|---|
| PageRank Score | High | Files with higher centrality are riskier |
| Test Coverage | Medium | Files with tests are safer to change |
| Fan-out Count | Medium | Files importing many things affect more |
| Circular Dependencies | High | Changes in cycles affect all members |
Source: src/server/mcp-server.ts:40-70
Partition Plans
For large blast radii, the impact tool returns partition plans that break the change into buckets. Agents should pick one bucket and drill in rather than attempting to read the full list. This is especially important for monorepos with hundreds of affected files. Source: src/server/prompts.ts:20-35
Diff-Aware Analysis
Diff Search
The diff_search tool combines semantic search with git diff awareness. It searches only changed files between two refs, useful for understanding what changed in a PR:
graph LR
A[Query + Ref Range] --> B[git diff]
B --> C[Changed Paths]
C --> D[Filtered Search<br/>Only Changed Files]
D --> E[Impact on Changed Code]
Parameters:
| Parameter | Type | Default | Description |
|---|---|---|---|
| query | string | required | Search query |
| ref | string | main..HEAD | Git ref range |
| include_callers | number | 0 | Include N-hop transitive callers |
| token_budget | number | 3000 | Max tokens to return |
| type | enum | any | Filter by symbol type |
Source: src/server/tools/diff-search.ts:1-80
Include Callers Mode
When include_callers is set, the tool includes files that import the changed files. This answers "what uses these changed files?" rather than just "what changed?":
- 0: Only changed files
- 1: Changed files + direct callers
- 2: Changed files + transitive callers
Source: src/server/tools/diff-search.ts:20-30
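The hop levels above amount to a bounded breadth-first walk over the reverse import graph. The sketch below assumes an importedBy map of the shape shown in the Indexing System section (file to the files that import it); `expandCallers` is a hypothetical name, not the function in diff-search.ts.

```typescript
// Returns the changed files plus every file reachable within `hops`
// steps along importedBy edges.
function expandCallers(
  changed: string[],
  importedBy: Map<string, string[]>,
  hops: number,
): Set<string> {
  const result = new Set<string>(changed);
  let frontier = [...changed];
  for (let depth = 0; depth < hops && frontier.length > 0; depth++) {
    const next: string[] = [];
    for (const file of frontier) {
      for (const caller of importedBy.get(file) ?? []) {
        if (!result.has(caller)) {
          result.add(caller); // the visited set also guards against import cycles
          next.push(caller);
        }
      }
    }
    frontier = next;
  }
  return result;
}
```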
Critique and Verification
Claim Verification
The critique tool validates code claims by cross-referencing them against the indexed codebase. It checks:
- Stale references — Symbols that no longer exist or have moved
- Undefined symbols — References to unindexed or non-existent code
- Documentation coverage — Whether cited symbols have documentation mentions
- Hub citation — Whether high-centrality files are referenced
graph TD
A[Code Claim] --> B[Verify Against Index]
B --> C{Still Exists?}
C -->|No| D[Moved Symbol]
C -->|No| E[Undefined Symbol]
C -->|Yes| F{Has Docs?}
F -->|No| G[Undocumented Symbol]
F -->|Yes| H[Verified Claim]
D --> I[Critique Report]
E --> I
G --> I
The tool detects when none of the cited evidence points at documentation files, suggesting the answer skipped documentation. Source: src/server/tools/critique.ts:1-50
Critique Data Structure
interface CritiqueData {
claim: string | null;
verify: VerifyResult[];
stale: VerifyResult[];
moved: VerifyResult[];
hubsCited: string[];
missedHubs: string[];
undefinedSymbols: string[];
undocumentedSymbols: string[];
totalSymbols: number;
}
Tool Profiles
Tool availability varies by profile. Only refs and impact are available in every profile:
| Tool | core | nav | lean | research | review |
|---|---|---|---|---|---|
| refs | ✓ | ✓ | ✓ | ✓ | ✓ |
| impact | ✓ | ✓ | ✓ | ✓ | ✓ |
| deps | — | ✓ | ✓ | ✓ | ✓ |
| investigate | — | — | — | ✓ | ✓ |
| verify | — | — | — | ✓ | ✓ |
| critique | — | — | — | ✓ | ✓ |
Source: src/server/tool-overrides.ts:10-50
Integration with Memory
The graph tools integrate with sverklo's memory layer:
- Core memories (tier='core') are project invariants auto-injected on every session start
- Impact analysis can be saved as memories using remember
- Reference findings can be recalled in future sessions
This enables agents to build institutional knowledge about risky changes and their outcomes. Source: src/server/mcp-server.ts:75-100
Usage Patterns
Safe Refactoring Workflow
- Identify the symbol to refactor
- Call refs to understand all consumers
- Call impact to assess blast radius
- Review partition plans if blast radius is large
- Save decisions with remember for future reference
PR Review Workflow
- Call investigate to understand changed components
- Call diff_search on the PR branch
- Call verify to check for broken references
- Call critique to validate architectural claims
Source: src/server/prompts.ts:1-30
Performance Considerations
- Reference lookups use in-memory graph stores for sub-millisecond response
- PageRank scores are precomputed during indexing
- Transitive dependency traversal is bounded by default to prevent runaway queries
- Token budgets prevent oversized responses in large codebases
Related Documentation
- Audit Tools — Codebase health scoring and architecture analysis
- Search Tools — Semantic and full-text search
- Memory Tools — Persistent context across sessions
- Git Integration — Diff-aware analysis and ref validation
Doramagic Pitfall Log
Source-linked risks stay visible on the manual page so the preview does not read like a recommendation.
Found 19 structured pitfall item(s), including 1 high/blocking item(s). Top priority: Configuration risk - Configuration risk requires verification.
1. Configuration risk: Configuration risk requires verification
- Severity: high
- Finding: Project evidence flags a configuration risk. Review the linked source before relying on this workflow.
- User impact: May increase setup, validation, or first-run risk for the user.
- Recommended check: Reproduce the official install and quickstart path in an isolated environment.
- Evidence: packet_text.keyword_scan | github_repo:1203034717 | https://github.com/sverklo/sverklo
2. Installation risk: Installation risk requires verification
- Severity: medium
- Finding: Project evidence flags an installation risk. Review the linked source before relying on this workflow.
- User impact: May increase setup, validation, or first-run risk for the user.
- Recommended check: Reproduce the official install and quickstart path in an isolated environment.
- Evidence: community_evidence:github | cevd_7a50e3a046d2438db185ba21d580ec9e | https://github.com/sverklo/sverklo/issues/71
3. Installation risk: Installation risk requires verification
- Severity: medium
- Finding: Project evidence flags an installation risk. Review the linked source before relying on this workflow.
- User impact: May increase setup, validation, or first-run risk for the user.
- Recommended check: Reproduce the official install and quickstart path in an isolated environment.
- Evidence: community_evidence:github | cevd_13e1bc9ab7fa41a0866eb6c4f814875c | https://github.com/sverklo/sverklo/issues/60
4. Installation risk: Installation risk requires verification
- Severity: medium
- Finding: Project evidence flags an installation risk. Review the linked source before relying on this workflow.
- User impact: May increase setup, validation, or first-run risk for the user.
- Recommended check: Reproduce the official install and quickstart path in an isolated environment.
- Evidence: community_evidence:github | cevd_a8bdc3779b264243b8362d6e57096e25 | https://github.com/sverklo/sverklo/issues/61
5. Installation risk: Installation risk requires verification
- Severity: medium
- Finding: Project evidence flags an installation risk. Review the linked source before relying on this workflow.
- User impact: May increase setup, validation, or first-run risk for the user.
- Recommended check: Reproduce the official install and quickstart path in an isolated environment.
- Evidence: community_evidence:github | cevd_c0c1f6a71a764af596178de506d0b2c3 | https://github.com/sverklo/sverklo/issues/58
6. Installation risk: Installation risk requires verification
- Severity: medium
- Finding: Project evidence flags an installation risk. Review the linked source before relying on this workflow.
- User impact: May increase setup, validation, or first-run risk for the user.
- Recommended check: Reproduce the official install and quickstart path in an isolated environment.
- Evidence: community_evidence:github | cevd_6be83aacd98c4e3abb6ae6361bf81940 | https://github.com/sverklo/sverklo/issues/69
7. Installation risk: Installation risk requires verification
- Severity: medium
- Finding: Project evidence flags an installation risk. Review the linked source before relying on this workflow.
- User impact: May increase setup, validation, or first-run risk for the user.
- Recommended check: Reproduce the official install and quickstart path in an isolated environment.
- Evidence: community_evidence:github | cevd_fc3cc34d92454d5a92ab4a196b178799 | https://github.com/sverklo/sverklo/issues/72
8. Installation risk: Installation risk requires verification
- Severity: medium
- Finding: Project evidence flags an installation risk. Review the linked source before relying on this workflow.
- User impact: May increase setup, validation, or first-run risk for the user.
- Recommended check: Reproduce the official install and quickstart path in an isolated environment.
- Evidence: community_evidence:github | cevd_bf329d5553724c3281773c6aee96cae5 | https://github.com/sverklo/sverklo/issues/74
9. Installation risk: Installation risk requires verification
- Severity: medium
- Finding: Project evidence flags an installation risk. Review the linked source before relying on this workflow.
- User impact: May increase setup, validation, or first-run risk for the user.
- Recommended check: Reproduce the official install and quickstart path in an isolated environment.
- Evidence: community_evidence:github | cevd_42920ecfbbc54f4f8b207e386dfc9ebd | https://github.com/sverklo/sverklo/issues/73
10. Configuration risk: Configuration risk requires verification
- Severity: medium
- Finding: Project evidence flags a configuration risk. Review the linked source before relying on this workflow.
- User impact: May increase setup, validation, or first-run risk for the user.
- Recommended check: Reproduce the official install and quickstart path in an isolated environment.
- Evidence: capability.host_targets | github_repo:1203034717 | https://github.com/sverklo/sverklo
11. Capability evidence risk: Capability evidence risk requires verification
- Severity: medium
- Finding: README/documentation is current enough for a first validation pass.
- User impact: May increase setup, validation, or first-run risk for the user.
- Recommended check: Reproduce the official install and quickstart path in an isolated environment.
- Evidence: capability.assumptions | github_repo:1203034717 | https://github.com/sverklo/sverklo
12. Maintenance risk: Maintenance risk requires verification
- Severity: medium
- Finding: Project evidence flags a maintenance risk. Review the linked source before relying on this workflow.
- User impact: May increase setup, validation, or first-run risk for the user.
- Recommended check: Reproduce the official install and quickstart path in an isolated environment.
- Evidence: evidence.maintainer_signals | github_repo:1203034717 | https://github.com/sverklo/sverklo
Source: Doramagic discovery, validation, and Project Pack records
Community Discussion Evidence
These external discussion links are review inputs, not standalone proof that the project is production-ready.
Open the linked issues or discussions before treating the pack as ready for your environment.
Doramagic exposes project-level community discussion separately from official documentation. Review these links before using sverklo with real data or production workflows.
- MCP tool names double-prefixed (sverklo_sverklo_*) when server registere - github / github_issue
- sverklo init --global: one-time setup with memory import, skip per-proje - github / github_issue
- sverklo unregister should accept --by-path for agent-driven worktree tea - github / github_issue
- sverklo reindex does not update registry.json lastIndexed timestamp - github / github_issue
- fingerprintOf is defined but never called — provider-change auto-rebuild - github / github_issue
- v0.25.1: Ollama reindex still stores 384d vectors despite 1024d config; - github / github_issue
- Configuration risk requires verification - GitHub / issue
- Installation risk requires verification - GitHub / issue
- Installation risk requires verification - GitHub / issue
- Installation risk requires verification - GitHub / issue
- Security or permission risk requires verification - GitHub / issue
- Security or permission risk requires verification - GitHub / issue
Source: Project Pack community evidence and pitfall evidence
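One of the linked issues reports MCP tool names appearing double-prefixed (`sverklo_sverklo_*`). When reproducing the setup in an isolated environment, a quick check of the tool names your client reports can confirm whether you are affected. The following is a minimal sketch, not part of sverklo: the helper name `find_double_prefixed` and the sample tool names are hypothetical, and it assumes you have already collected the tool names as plain strings from your MCP client.

```python
def find_double_prefixed(tool_names, prefix="sverklo"):
    """Return the tool names that start with a repeated prefix.

    Hypothetical helper for illustration only; it is not part of sverklo.
    """
    doubled = f"{prefix}_{prefix}_"
    return [name for name in tool_names if name.startswith(doubled)]


if __name__ == "__main__":
    # Sample names are made up; replace them with the names your client lists.
    sample = ["sverklo_sverklo_search", "sverklo_search", "other_tool"]
    for name in find_double_prefixed(sample):
        print(f"double-prefixed tool name: {name}")
```

An empty result only means the pattern was not found in the names you supplied; open the linked issue for the conditions under which the double prefix was reported.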