# https://github.com/RNBBarrett/thought-mcp Project Documentation

Generated: 2026-05-16 09:34:00 UTC

## Table of Contents

- [Introduction to THOUGHT](#page-introduction)
- [Quickstart Guide](#page-quickstart)
- [Installation and Setup](#page-installation)
- [System Architecture](#page-architecture)
- [Storage and Database Layer](#page-storage-layer)
- [Memory Model and Data Structures](#page-memory-model)
- [Query and Retrieval System](#page-query-system)
- [Multi-Language Code Parsing](#page-code-parsing)
- [Git History Integration](#page-git-integration)
- [Agent Adapters and SDK Integration](#page-agent-adapters)

<a id='page-introduction'></a>

## Introduction to THOUGHT

### Related Pages

Related topics: [Quickstart Guide](#page-quickstart), [Installation and Setup](#page-installation), [System Architecture](#page-architecture)

<details>
<summary>Relevant source files</summary>

The following source files were used to generate this page:

- [README.md](https://github.com/RNBBarrett/thought-mcp/blob/main/README.md)
- [docs/comparison.md](https://github.com/RNBBarrett/thought-mcp/blob/main/docs/comparison.md)
- [src/thought/models.py](https://github.com/RNBBarrett/thought-mcp/blob/main/src/thought/models.py)
- [src/thought/demo.py](https://github.com/RNBBarrett/thought-mcp/blob/main/src/thought/demo.py)
- [src/thought/cli.py](https://github.com/RNBBarrett/thought-mcp/blob/main/src/thought/cli.py)
- [src/thought/storage/sqlite/backend.py](https://github.com/RNBBarrett/thought-mcp/blob/main/src/thought/storage/sqlite/backend.py)
- [src/thought/layers/code.py](https://github.com/RNBBarrett/thought-mcp/blob/main/src/thought/layers/code.py)
</details>

# Introduction to THOUGHT

THOUGHT is a local AI memory tool designed to help developers, researchers, writers, and investigators maintain persistent, queryable knowledge graphs of their work. It combines graph database technology with natural language processing to create a bi-temporal knowledge base that tracks information across time, answering questions like "what was true on date X" and "what did the system know on date X." Source: [README.md](https://github.com/RNBBarrett/thought-mcp/blob/main/README.md)

## What is THOUGHT?

THOUGHT operates as a self-hosted memory layer that runs entirely on your local machine. Unlike cloud-based AI memory solutions, THOUGHT stores everything in a local SQLite database, giving you full control over your data while still supporting queries in natural language or as Cypher graph queries. Source: [src/thought/cli.py:1-50](https://github.com/RNBBarrett/thought-mcp/blob/main/src/thought/cli.py)

The core philosophy is to treat memory as a first-class citizen in the development workflow—something that persists across sessions, understands context, and can be queried like a real database rather than a simple key-value store.

## Core Architecture

THOUGHT's architecture consists of several interconnected layers that work together to provide a complete memory solution.

```mermaid
graph TD
    A[CLI / MCP Server] --> B[Query Layer]
    B --> C[Graph Layer]
    B --> D[Code Layer]
    C --> E[Storage Backend]
    D --> E
    E --> F[SQLite Database]
    B --> G[LLM Providers]
    G --> H[Ollama / LM Studio / OpenAI]
```

### Storage Layer

The storage layer uses SQLite with a schema that supports bi-temporal modeling. Every entity and edge in the knowledge graph carries timestamps that record when a fact became valid and when it was learned. Source: [src/thought/storage/sqlite/backend.py:1-100](https://github.com/RNBBarrett/thought-mcp/blob/main/src/thought/storage/sqlite/backend.py)

| Component | Purpose |
|-----------|---------|
| `SQLiteBackend` | Core database operations with upsert, query, and embedding storage |
| WAL Mode | Write-Ahead Logging for crash recovery and concurrent reads |
| Migration System | Tracks applied migrations in `applied_migrations` table |
| Bi-temporal Columns | `valid_from`, `valid_until`, `learned_at`, `unlearned_at` |
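
As a rough illustration of how WAL mode, upserts, and the bi-temporal columns could fit together, the sketch below uses Python's standard `sqlite3` module. The `entities` table layout and the `open_db` and `upsert_entity` helpers are assumptions made for this example; the real schema lives in `backend.py` and may differ.

```python
import sqlite3
import tempfile
from pathlib import Path

# Hypothetical, simplified schema; only the four time columns come from the table above.
SCHEMA = """
CREATE TABLE IF NOT EXISTS entities (
    id           TEXT PRIMARY KEY,
    name         TEXT NOT NULL,
    valid_from   TEXT NOT NULL,
    valid_until  TEXT,
    learned_at   TEXT NOT NULL,
    unlearned_at TEXT
)
"""


def open_db(path: Path) -> sqlite3.Connection:
    """Open a file-backed database and switch it to write-ahead logging."""
    conn = sqlite3.connect(path)
    conn.execute("PRAGMA journal_mode=WAL")
    conn.execute(SCHEMA)
    return conn


def upsert_entity(conn, entity_id, name, valid_from, learned_at):
    """Insert a row, or update the name if the id already exists."""
    conn.execute(
        """
        INSERT INTO entities (id, name, valid_from, learned_at)
        VALUES (?, ?, ?, ?)
        ON CONFLICT(id) DO UPDATE SET name = excluded.name
        """,
        (entity_id, name, valid_from, learned_at),
    )
    conn.commit()


with tempfile.TemporaryDirectory() as tmp:
    conn = open_db(Path(tmp) / "demo.db")
    journal_mode = conn.execute("PRAGMA journal_mode").fetchone()[0]
    upsert_entity(conn, "e1", "authenticate_user", "2024-01-01", "2024-02-01")
    upsert_entity(conn, "e1", "authenticate", "2024-01-01", "2024-02-01")
    rows = conn.execute("SELECT id, name FROM entities").fetchall()
    conn.close()
```

The second `upsert_entity` call hits the same primary key, so the table still holds a single row afterwards.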

### Query Layer

The query layer provides multiple interfaces for accessing your memory:

- **Natural Language**: Ask questions in plain English, translated to Cypher
- **Code Queries**: Find callers, callees, and impact sets
- **Recall**: Semantic search using embeddings
- **Cypher Direct**: Execute graph queries directly. Source: [src/thought/query/ask.py:1-50](https://github.com/RNBBarrett/thought-mcp/blob/main/src/thought/query/ask.py)

### Graph Layer

The graph layer provides the core graph operations that power all THOUGHT functionality. It handles entity and edge management with support for scopes (shared/private) and owner-based access control. Source: [src/thought/layers/graph.py](https://github.com/RNBBarrett/thought-mcp/blob/main/src/thought/layers/graph.py)

## Entity Model

THOUGHT uses a flexible entity model that can represent code elements, prose content, legal documents, and research claims.

```mermaid
classDiagram
    class Entity {
        +str id
        +str type
        +str name
        +str canonical_name
        +ScopeName scope
        +Tier tier
        +float importance
        +datetime valid_from
        +datetime valid_until
        +datetime learned_at
        +dict~str, object~ attrs
    }
    
    class Edge {
        +str id
        +str source_id
        +str target_id
        +str relation_type
    }
    
    Entity "1" --> "*" Edge : source
    Entity "1" --> "*" Edge : target
```

Source: [src/thought/models.py:50-100](https://github.com/RNBBarrett/thought-mcp/blob/main/src/thought/models.py)

### Entity Attributes

| Field | Type | Description |
|-------|------|-------------|
| `id` | str | Unique identifier |
| `type` | str | Entity type (function, class, module, claim, etc.) |
| `name` | str | Human-readable name |
| `canonical_name` | str | Fully qualified name for disambiguation |
| `scope` | ScopeName | "shared" or "private" |
| `owner_id` | str | Owner for private entities |
| `tier` | Tier | "hot", "warm", or "cold" |
| `valid_from` | datetime | When this fact became true |
| `valid_until` | datetime | When this fact stopped being true (null = current) |
| `learned_at` | datetime | When THOUGHT learned this fact |
| `attrs` | dict | Additional type-specific metadata |

### Edge Relations

Edges represent relationships between entities with the following relation types:

| Relation Type | Description |
|---------------|-------------|
| `CALLS` | Function/method invocation |
| `INHERITS_FROM` | Class inheritance |
| `DEFINES` | Container defines member |
| `IMPORTS` | Module import statement |
| `CONTRADICTS` | Logical contradiction between facts |
| `CITES` | Source citation relationship |

## Audience Verticals

THOUGHT is designed to serve multiple audiences, each with specialized commands and entity taxonomies optimized for their use case. Source: [src/thought/demo.py:1-80](https://github.com/RNBBarrett/thought-mcp/blob/main/src/thought/demo.py)

```mermaid
graph LR
    A[THOUGHT] --> B[Code Developers]
    A --> C[Writers]
    A --> D[Legal Investigators]
    A --> E[Researchers]
    
    B --> B1[thought scan]
    B --> B2[thought impact]
    B --> B3[thought callers]
    
    C --> C1[thought ingest-prose]
    C --> C2[thought timeline]
    C --> C3[contradiction-check]
    
    D --> D1[thought ingest-legal]
    D --> D2[unique_predicates]
    D --> D3[contradiction-graph]
    
    E --> E1[thought ingest-claim]
    E --> E2[citation-analysis]
    E --> E3[reliability-filter]
```

### Code Developers

The code vertical provides tools for understanding, navigating, and analyzing source code:

- **`thought scan`**: Incremental code scanning with change detection
- **`thought impact <name>`**: Transitive impact set—what's affected if I change this?
- **`thought callers <name>`**: Direct callers ranked by Personalized PageRank
- **`thought recall`**: Semantic search across code by intent. Source: [src/thought/layers/code.py:1-50](https://github.com/RNBBarrett/thought-mcp/blob/main/src/thought/layers/code.py)
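
To make the idea of a transitive impact set concrete, here is a minimal sketch that walks `CALLS` edges backwards with a breadth-first search. The `impact_set` function and the edge-list representation are hypothetical; THOUGHT's own implementation may consider other relation types or rank the results.

```python
from collections import defaultdict, deque


def impact_set(calls, target):
    """Return every entity that directly or transitively calls `target`.

    `calls` is an iterable of (caller, callee) pairs.
    """
    callers_of = defaultdict(set)
    for caller, callee in calls:
        callers_of[callee].add(caller)

    seen = set()
    queue = deque([target])
    while queue:
        current = queue.popleft()
        for caller in callers_of[current]:
            # Skip nodes already visited and guard against call cycles.
            if caller not in seen and caller != target:
                seen.add(caller)
                queue.append(caller)
    return seen


calls = [
    ("login_view", "authenticate_user"),
    ("api_handler", "login_view"),
    ("cli_main", "api_handler"),
    ("report_job", "format_report"),
]
affected = impact_set(calls, "authenticate_user")
```

In this example `report_job` is not part of the result because it never reaches `authenticate_user`.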

### Writers

The writing vertical supports fiction and academic prose:

- Ingest chapter/section facts about characters
- Detect contradictions via the bi-temporal model
- Query chronological mentions across documents
- Time-travel `as_of` recall for historical consistency

### Legal Investigators

The legal vertical is designed for investigation workflows:

- **`thought ingest-legal`**: Ingest witness statements with unique predicates
- **`thought contradiction-graph`**: Trigger CONTRADICTS edges between testimonies
- Query the contradiction graph for investigation leads

### Researchers

The research vertical supports academic workflows:

- **`thought ingest-claim`**: Ingest claim/source pairs
- Cypher queries to find uncited claims
- Most-cited source identification
- Citation reliability filtering

## CLI Commands Overview

| Command | Description |
|---------|-------------|
| `thought init` | Create database file + config + CLAUDE.md |
| `thought recall <query>` | Semantic recall with embeddings |
| `thought ask <question>` | Natural language query → Cypher → results |
| `thought scan <repo>` | Incremental code scan with change detection |
| `thought callers <name>` | Find direct callers ranked by PageRank |
| `thought impact <name>` | Transitive impact set |
| `thought db size` | Disk usage + entity/edge counts |
| `thought db flush` | Wipe the knowledge base |
| `thought db backup <file>` | SQLite online-backup snapshot |
| `thought db load <file>` | Load backup file |
| `thought hook install` | Install Claude Code hooks |
| `thought diff --from <sha1> --to <sha2>` | Entity diff between commits |

Source: [src/thought/cli.py:50-150](https://github.com/RNBBarrett/thought-mcp/blob/main/src/thought/cli.py)

## Database Lifecycle Management

THOUGHT provides comprehensive database management commands under `thought db`:

### Backup and Restore

```mermaid
graph LR
    A[Production DB] -->|thought db backup| B[backup.db]
    B -->|thought db load| C[Production DB]
    B -->|thought db inspect| D[Inspection Report]
```

The backup system uses SQLite's online backup API, ensuring consistent snapshots even during active writes. Date filters can produce clean, self-contained subset files. Source: [src/thought/storage/sqlite/backend.py:100-200](https://github.com/RNBBarrett/thought-mcp/blob/main/src/thought/storage/sqlite/backend.py)
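
For readers unfamiliar with SQLite's online backup API, the following sketch shows how it is exposed in Python through `sqlite3.Connection.backup`. The `online_backup` helper and the `facts` table are illustrative assumptions, not THOUGHT's actual code.

```python
import sqlite3
import tempfile
from pathlib import Path


def online_backup(source_path: Path, target_path: Path) -> None:
    """Copy a live SQLite database into a snapshot file page by page."""
    source = sqlite3.connect(source_path)
    target = sqlite3.connect(target_path)
    try:
        source.backup(target)
    finally:
        target.close()
        source.close()


with tempfile.TemporaryDirectory() as tmp:
    live = Path(tmp) / "live.db"
    snapshot = Path(tmp) / "backup.db"

    # Build a tiny database to stand in for the active knowledge base.
    conn = sqlite3.connect(live)
    conn.execute("CREATE TABLE facts (id INTEGER PRIMARY KEY, body TEXT)")
    conn.execute("INSERT INTO facts (body) VALUES ('first fact')")
    conn.commit()
    conn.close()

    online_backup(live, snapshot)

    check = sqlite3.connect(snapshot)
    copied = check.execute("SELECT body FROM facts").fetchall()
    check.close()
```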

### Flush Operations

Flush commands support date-bounded deletion:
- `--before X`: Delete facts valid before date X
- `--since X`: Delete facts learned since date X
- `--time-axis valid|learned|created`: Choose which time axis to filter

All destructive operations automatically back up to `<db>.bak.<timestamp>` before proceeding.

## Git History Integration

THOUGHT can ingest git repositories with two modes:

| Mode | Behavior | Use Case |
|------|----------|----------|
| `snapshot` (default) | Ingest HEAD only, stamp with HEAD SHA | Fast code analysis |
| `full` | Walk every commit, stamp with commit SHA | Bi-temporal historical queries |

The `GitWalker` class shells out to `git` commands rather than using native libraries, avoiding C extension dependencies while maintaining cross-platform compatibility. Source: [src/thought/ingest/code/git_walker.py:1-50](https://github.com/RNBBarrett/thought-mcp/blob/main/src/thought/ingest/code/git_walker.py)
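
The shell-out approach can be sketched with the standard `subprocess` module. The functions `list_commits` and `parse_rev_list` below are hypothetical and are not taken from `git_walker.py`; they only show the general pattern of running `git` and parsing its text output.

```python
import subprocess
from pathlib import Path


def parse_rev_list(output: str) -> list[str]:
    """Turn the text output of `git rev-list` into a list of SHAs."""
    return [line.strip() for line in output.splitlines() if line.strip()]


def list_commits(repo: Path) -> list[str]:
    """Return commit SHAs oldest-first by invoking the git executable."""
    result = subprocess.run(
        ["git", "-C", str(repo), "rev-list", "--reverse", "HEAD"],
        capture_output=True,
        text=True,
        check=True,  # raise CalledProcessError if git exits non-zero
    )
    return parse_rev_list(result.stdout)


# Parsing is separated from the subprocess call so it can be exercised
# without a repository on disk.
sample_output = "abc1234\n\ndef5678\n"
parsed = parse_rev_list(sample_output)
```

Keeping the parser separate from the subprocess call is a design choice of this sketch: it lets the parsing logic be tested without `git` being installed.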

```mermaid
graph TD
    A[thought ingest-git] --> B{Snapshot Mode?}
    B -->|Yes| C[Ingest HEAD only]
    B -->|No| D[Walk all commits]
    C --> E[Stamp with HEAD SHA]
    D --> F[Stamp each entity with commit SHA]
    E --> G[Enable as_of queries]
    F --> G
```

## Bi-temporal Model

THOUGHT's bi-temporal model tracks two independent timelines for every fact:

| Time Axis | Description | Question Answered |
|-----------|-------------|-------------------|
| **Valid Time** | When a fact was true in reality | "What was true on date X?" |
| **Learned Time** | When THOUGHT learned the fact | "What did the system know on date X?" |

This distinction enables sophisticated queries like:

```cypher
MATCH (e:Entity)
WHERE e.valid_from <= date('2024-01-01')
  AND (e.valid_until IS NULL OR e.valid_until > date('2024-01-01'))
RETURN e
```

Contradictions surface as `CONTRADICTS` edges. They are treated as data rather than warnings, so you can query them directly. Source: [src/thought/cli.py:1-50](https://github.com/RNBBarrett/thought-mcp/blob/main/src/thought/cli.py)

## LLM Provider Integration

THOUGHT supports multiple LLM providers for natural language processing:

| Provider | Features |
|----------|----------|
| **Ollama** | Native `/api/embed` (batched), OpenAI-compatible fallback |
| **LM Studio** | OpenAI-compatible API |
| **Any OpenAI-compatible server** | Standard embedding endpoints |

The embedder selection defaults to `auto`, which probes for `sentence_transformers` and falls back to a deterministic embedder when the optional dependency is unavailable. Source: [src/thought/storage/sqlite/backend.py:200-300](https://github.com/RNBBarrett/thought-mcp/blob/main/src/thought/storage/sqlite/backend.py)
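
One plausible way to implement such a probe is shown below. Both `pick_embedder` and `deterministic_embedding` are hypothetical names, and the SHA-256 based vector is only a stand-in for whatever deterministic scheme THOUGHT actually uses.

```python
import hashlib
import importlib.util


def deterministic_embedding(text: str, dim: int = 8) -> list[float]:
    """Hash-based stand-in: the same text always maps to the same vector.

    `dim` must not exceed 32, the length of a SHA-256 digest in bytes.
    """
    digest = hashlib.sha256(text.encode("utf-8")).digest()
    return [digest[i] / 255.0 for i in range(dim)]


def pick_embedder(requested: str = "auto") -> str:
    """Resolve 'auto' to a concrete embedder name without importing the package."""
    if requested != "auto":
        return requested
    has_st = importlib.util.find_spec("sentence_transformers") is not None
    return "sentence-transformers" if has_st else "deterministic"


vector = deterministic_embedding("authentication middleware")
chosen = pick_embedder()
```

Using `importlib.util.find_spec` checks for the package without paying the cost of importing it.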

## Code Extraction Support

THOUGHT can parse and extract entities from multiple programming languages:

| Language | Extractor | Key Features |
|----------|-----------|--------------|
| Python | `python_extractor.py` | AST-based import tracking, class/function detection |
| TypeScript | `typescript_extractor.py` | Tree-sitter parsing, heritage analysis |
| Rust | `rust_extractor.py` | Module system, impl block handling |
| PHP | `php_extractor.py` | Namespace handling, method visibility |

All extractors produce consistent `CodeEntity` and `CodeEdge` objects that integrate with the unified graph model. Source: [src/thought/ingest/code/python_extractor.py:1-50](https://github.com/RNBBarrett/thought-mcp/blob/main/src/thought/ingest/code/python_extractor.py)

## Getting Started

### Initialization

```bash
thought init --db-path .thought/thought.db --embedder auto
```

This creates:
1. The SQLite database file
2. A `thought.toml` configuration file
3. A `CLAUDE.md` file for MCP client integration

### Quick Start Commands

```bash
# Ingest a git repository
thought ingest-git ./my-project --mode snapshot

# Recall something semantically
thought recall "authentication middleware"

# Ask a natural language question
thought ask "what calls the authenticate_user function?"

# Find impact of changing a function
thought impact MyClass.my_method
```

## Configuration

THOUGHT uses a `thought.toml` file for configuration:

| Section | Option | Default | Description |
|---------|--------|---------|-------------|
| `database` | `path` | `.thought/thought.db` | Database file path |
| `llm` | `provider` | `auto` | LLM provider selection |
| `embedder` | `model` | `auto` | Embedding model |
| `scopes` | `default` | `shared` | Default scope for new entities |

Configuration can be overridden via CLI flags or environment variables.

---

<a id='page-quickstart'></a>

## Quickstart Guide

### Related Pages

Related topics: [Introduction to THOUGHT](#page-introduction), [Installation and Setup](#page-installation)

<details>
<summary>Relevant source files</summary>

The following source files were used to generate this page:

- [src/thought/demo.py](https://github.com/RNBBarrett/thought-mcp/blob/main/src/thought/demo.py)
- [src/thought/cli.py](https://github.com/RNBBarrett/thought-mcp/blob/main/src/thought/cli.py)
- [src/thought/ingest/code/pipeline.py](https://github.com/RNBBarrett/thought-mcp/blob/main/src/thought/ingest/code/pipeline.py)
- [src/thought/layers/code.py](https://github.com/RNBBarrett/thought-mcp/blob/main/src/thought/layers/code.py)
- [src/thought/models.py](https://github.com/RNBBarrett/thought-mcp/blob/main/src/thought/models.py)
- [CHANGELOG.md](https://github.com/RNBBarrett/thought-mcp/blob/main/CHANGELOG.md)
</details>

# Quickstart Guide

## Overview

**THOUGHT** is a local AI memory tool for managing knowledge bases. It runs on local models and can be queried in natural language or with graph queries. Its CLI covers ingesting information, recalling facts, and analyzing code through graph-based relationships.

Source: [CHANGELOG.md](https://github.com/RNBBarrett/thought-mcp/blob/main/CHANGELOG.md)

## Architecture Overview

```mermaid
graph TD
    subgraph "THOUGHT Core"
        CLI[CLI Interface]
        DB[(SQLite Database)]
        EMB[Embedder Layer]
        GRAPH[Graph Layer]
    end
    
    subgraph "Ingestion Sources"
        CODE[Code Ingest]
        PROSE[Prose Ingest]
        LEGAL[Legal Ingest]
    end
    
    subgraph "Query Interface"
        RECALL[Recall Command]
        REPL[Interactive REPL]
        MCP[MCP Server]
    end
    
    CLI --> DB
    CLI --> EMB
    EMB --> DB
    CODE --> CLI
    PROSE --> CLI
    LEGAL --> CLI
    RECALL --> GRAPH
    REPL --> GRAPH
    MCP --> GRAPH
    GRAPH --> DB
```

Source: [src/thought/cli.py](https://github.com/RNBBarrett/thought-mcp/blob/main/src/thought/cli.py)

## Installation and Initialization

### Initial Setup

Run the `init` command to create the database, configuration file, and CLAUDE.md helper:

```bash
thought init
```

The init command accepts several options:

| Option | Default | Description |
|--------|---------|-------------|
| `--config` | `thought.toml` | Path to configuration file |
| `--db-path` | `.thought/thought.db` | SQLite database path |
| `--embedder` | `auto` | Embedder type: `auto`, `sentence-transformers`, or `deterministic` |
| `--write-claude-md` | `true` | Drop a CLAUDE.md for MCP clients |
| `--quick` | `false` | Skip first-run embedder warmup |

Source: [src/thought/cli.py:57-78](https://github.com/RNBBarrett/thought-mcp/blob/main/src/thought/cli.py)

### Configuration File

The init command creates a `thought.toml` configuration file with the following structure:

```toml
[database]
path = ".thought/thought.db"

[embedder]
type = "auto"  # or "ollama", "lm_studio", "openai_compatible"

[llm]
provider = "auto"
```

## Core Commands

### Ingest Commands

THOUGHT supports multiple ingestion modes:

| Command | Purpose |
|---------|---------|
| `thought ingest TEXT` | One-shot remember from command line |
| `thought ingest --file PATH` | Ingest a single file |
| `thought ingest --glob PAT` | Bulk-ingest matching files |
| `thought ingest --stdin` | Bulk-ingest one line-per-item from stdin |

Source: [src/thought/cli.py:30-42](https://github.com/RNBBarrett/thought-mcp/blob/main/src/thought/cli.py)

### Code Ingestion

The code ingest pipeline extracts entities and relationships from source files:

```bash
thought ingest --file src/main.py
thought ingest --glob "**/*.py"
```

The code extractor produces:

- **Entities**: modules, functions, classes, methods
- **Edges**: `IMPORTS`, `INHERITS_FROM`, `DEFINES`, `OVERRIDES`

Source: [src/thought/ingest/code/pipeline.py](https://github.com/RNBBarrett/thought-mcp/blob/main/src/thought/ingest/code/pipeline.py)
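
To give a feel for what AST-based extraction of these entities and edges might look like, here is a small sketch built on Python's standard `ast` module. The `extract_python` function, the tuple shapes, and the module-prefixed naming are assumptions for illustration; the real extractor emits `CodeEntity` and `CodeEdge` objects and handles many more cases.

```python
import ast


def extract_python(source: str, module_name: str):
    """Collect (type, name) entities and (source, relation, target) edges."""
    tree = ast.parse(source)
    entities = [("module", module_name)]
    edges = []
    for node in tree.body:
        if isinstance(node, ast.Import):
            for alias in node.names:
                edges.append((module_name, "IMPORTS", alias.name))
        elif isinstance(node, ast.ImportFrom) and node.module:
            edges.append((module_name, "IMPORTS", node.module))
        elif isinstance(node, (ast.FunctionDef, ast.AsyncFunctionDef)):
            name = f"{module_name}.{node.name}"
            entities.append(("function", name))
            edges.append((module_name, "DEFINES", name))
        elif isinstance(node, ast.ClassDef):
            cls = f"{module_name}.{node.name}"
            entities.append(("class", cls))
            edges.append((module_name, "DEFINES", cls))
            # Only simple base names are handled in this sketch.
            for base in node.bases:
                if isinstance(base, ast.Name):
                    edges.append((cls, "INHERITS_FROM", base.id))
            for item in node.body:
                if isinstance(item, (ast.FunctionDef, ast.AsyncFunctionDef)):
                    method = f"{cls}.{item.name}"
                    entities.append(("method", method))
                    edges.append((cls, "DEFINES", method))
    return entities, edges


SAMPLE = '''
import os
from pathlib import Path

class Base:
    pass

class Child(Base):
    def run(self):
        return os.getcwd()

def main():
    Child().run()
'''

entities, edges = extract_python(SAMPLE, "app")
```

This sketch only inspects top-level statements and does not attempt `OVERRIDES` detection, which would require resolving base classes across the class hierarchy.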

### Git-Aware Ingest

For bi-temporal code analysis:

```bash
thought ingest-git <repo> --mode snapshot  # Fast: HEAD only
thought ingest-git <repo> --mode full      # Walk every commit
```

This enables `as_of` queries against historical commits.

Source: [CHANGELOG.md](https://github.com/RNBBarrett/thought-mcp/blob/main/CHANGELOG.md)

### Recall and Query

```bash
thought recall "what did I learn about authentication?"
thought repl
```

The `recall` command returns up to 10 results ranked by relevance. Use `as_of` and `scope` to narrow results further.

Source: [src/thought/cli.py](https://github.com/RNBBarrett/thought-mcp/blob/main/src/thought/cli.py)

### Database Management

| Command | Description |
|---------|-------------|
| `thought db size` | Disk usage + entity/edge counts |
| `thought db flush` | Wipe the KB (with backup) |
| `thought db backup <file>` | SQLite backup snapshot |
| `thought db load <file>` | Load a backup file |
| `thought db inspect <file>` | Inspect backup without loading |

Source: [CHANGELOG.md](https://github.com/RNBBarrett/thought-mcp/blob/main/CHANGELOG.md)

## Code Analysis Commands

### Callers and Impact Analysis

```bash
# Find who calls a function (ranked by PageRank)
thought callers authenticate_user

# Transitive impact: what's affected if I change this?
thought impact JWTValidator
```

Source: [src/thought/layers/code.py](https://github.com/RNBBarrett/thought-mcp/blob/main/src/thought/layers/code.py)
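
Personalized PageRank itself can be sketched in a few lines of power iteration. The version below is a generic textbook formulation, not THOUGHT's implementation: the `personalized_pagerank` function, the damping factor, and the choice to walk `CALLS` edges in reverse (from callee to caller) are all assumptions.

```python
def personalized_pagerank(edges, seed, damping=0.85, iterations=50):
    """Power-iteration PageRank whose restart mass always returns to `seed`."""
    nodes = sorted({n for edge in edges for n in edge} | {seed})
    out_links = {n: [] for n in nodes}
    for src, dst in edges:
        out_links[src].append(dst)

    rank = {n: (1.0 if n == seed else 0.0) for n in nodes}
    for _ in range(iterations):
        nxt = {n: 0.0 for n in nodes}
        for node, score in rank.items():
            targets = out_links[node]
            if targets:
                share = damping * score / len(targets)
                for t in targets:
                    nxt[t] += share
            else:
                # Dangling node: hand its mass back to the seed.
                nxt[seed] += damping * score
        # Restart step; keeps the total rank mass at 1.
        nxt[seed] += 1.0 - damping
        rank = nxt
    return rank


# CALLS edges reversed (callee -> caller) so rank flows toward callers.
reversed_calls = [
    ("authenticate_user", "login_view"),
    ("authenticate_user", "api_handler"),
    ("login_view", "cli_main"),
]
scores = personalized_pagerank(reversed_calls, seed="authenticate_user")
ranked = sorted(
    (n for n in scores if n != "authenticate_user"),
    key=scores.get,
    reverse=True,
)
```

Callers closer to the seed receive more rank than callers further up the chain, which is the property that makes the score useful for ordering results.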

### Diff Between Commits

```bash
thought diff --from abc1234 --to def5678
```

This shows entities added/removed between two ingested commits.
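
Conceptually, such a diff reduces to two set differences over the entity names visible at each commit. The `entity_diff` helper below is a hypothetical sketch of that idea and says nothing about how THOUGHT stores or looks up per-commit entities.

```python
def entity_diff(before: set[str], after: set[str]) -> dict[str, list[str]]:
    """Compare the entity names visible at two commits."""
    return {
        "added": sorted(after - before),
        "removed": sorted(before - after),
    }


at_abc1234 = {"auth.login", "auth.logout", "auth.legacy_check"}
at_def5678 = {"auth.login", "auth.logout", "auth.JWTValidator"}
diff = entity_diff(at_abc1234, at_def5678)
```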

## Built-in Demos

Run audience-specific walkthroughs:

```bash
thought demo code        # Agent/developer flow (14-stage walkthrough)
thought demo writer       # Novelist/paper author
thought demo legal        # Investigator/paralegal
thought demo researcher   # Academic use case
thought demo all          # Run all demos sequentially
```

Each demo runs end-to-end in a self-cleaning temporary directory and produces a structured `DemoReport`.

Source: [src/thought/demo.py](https://github.com/RNBBarrett/thought-mcp/blob/main/src/thought/demo.py)

## Entity Data Model

```python
@dataclass(frozen=True)
class CodeEntity:
    name: str           # Qualified name (e.g., "ClassName.method_name")
    type_: CodeEntityType  # "module" | "function" | "class" | "method" | "file"
    language: str       # Programming language
    file_path: str      # POSIX-style relative path
    line_start: int     # Starting line number
    line_end: int       # Ending line number
    signature: str      # Function/class signature
    docstring: str | None
    visibility: Literal["public", "private"]
```

Source: [src/thought/ingest/code/types.py](https://github.com/RNBBarrett/thought-mcp/blob/main/src/thought/ingest/code/types.py)

## Supported Languages

The code ingestion pipeline supports:

| Language | Extractor | File Extension |
|----------|-----------|----------------|
| Python | `python_extractor.py` | `.py` |
| TypeScript | `typescript_extractor.py` | `.ts`, `.tsx` |
| PHP | `php_extractor.py` | `.php` |
| Rust | `rust_extractor.py` | `.rs` |

## MCP Server

Start the MCP server for integration with Claude Code:

```bash
thought serve                          # stdio transport (default)
thought serve --transport streamable-http  # HTTP transport
```

Source: [src/thought/cli.py](https://github.com/RNBBarrett/thought-mcp/blob/main/src/thought/cli.py)

## Utility Commands

| Command | Description |
|---------|-------------|
| `thought stats` | Display knowledge base statistics |
| `thought forget PATTERN` | Soft-delete entities matching SQL LIKE pattern |
| `thought consolidate` | Run one consolidation cycle |
| `thought doctor` | Environment health check |

## Bi-Temporal Model

THOUGHT uses a bi-temporal model for knowledge tracking:

- **`valid_from` / `valid_until`**: When facts were true in reality
- **`learned_at` / `unlearned_at`**: When the system learned/corrected facts

Query variants:
- `as_of_kind='valid'` — "what was true on date X"
- `as_of_kind='learned'` — "what did the system know on date X"

Source: [src/thought/models.py](https://github.com/RNBBarrett/thought-mcp/blob/main/src/thought/models.py)
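
The two query variants can be illustrated with a small filter function. The `Fact` dataclass and `visible_as_of` helper are hypothetical, and the half-open interval (start inclusive, end exclusive) is an assumption of this sketch rather than documented behavior.

```python
from dataclasses import dataclass
from datetime import datetime
from typing import Optional


@dataclass(frozen=True)
class Fact:
    name: str
    valid_from: datetime
    valid_until: Optional[datetime]
    learned_at: datetime
    unlearned_at: Optional[datetime] = None


def visible_as_of(fact: Fact, when: datetime, as_of_kind: str = "valid") -> bool:
    """Check one time axis; a missing end timestamp means 'still open'."""
    if as_of_kind == "valid":
        start, end = fact.valid_from, fact.valid_until
    elif as_of_kind == "learned":
        start, end = fact.learned_at, fact.unlearned_at
    else:
        raise ValueError(f"unknown as_of_kind: {as_of_kind!r}")
    return start <= when and (end is None or when < end)


fact = Fact(
    name="alice_lives_in_paris",
    valid_from=datetime(2020, 1, 1),
    valid_until=datetime(2023, 1, 1),
    learned_at=datetime(2024, 6, 1),
)
true_in_2021 = visible_as_of(fact, datetime(2021, 1, 1), "valid")
known_in_2021 = visible_as_of(fact, datetime(2021, 1, 1), "learned")
```

Here the fact was true in 2021 but was not yet known to the system at that time, which is exactly the distinction the two axes are meant to capture.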

---

<a id='page-installation'></a>

## Installation and Setup

### Related Pages

Related topics: [Quickstart Guide](#page-quickstart), [System Architecture](#page-architecture)

<details>
<summary>Relevant source files</summary>

The following source files were used to generate this page:

- [src/thought/cli.py](https://github.com/RNBBarrett/thought-mcp/blob/main/src/thought/cli.py)
- [src/thought/clients.py](https://github.com/RNBBarrett/thought-mcp/blob/main/src/thought/clients.py)
- [src/thought/hooks/install.py](https://github.com/RNBBarrett/thought-mcp/blob/main/src/thought/hooks/install.py)
- [src/thought/demo.py](https://github.com/RNBBarrett/thought-mcp/blob/main/src/thought/demo.py)
- [CHANGELOG.md](https://github.com/RNBBarrett/thought-mcp/blob/main/CHANGELOG.md)
- [CONTRIBUTING.md](https://github.com/RNBBarrett/thought-mcp/blob/main/CONTRIBUTING.md)
</details>

# Installation and Setup

## Overview

The `thought-mcp` project provides a comprehensive CLI tool and MCP (Model Context Protocol) server for AI-powered memory and knowledge management. The installation and setup process involves initializing the local SQLite database, configuring MCP clients (Claude Code, Cursor, etc.), and optionally setting up Claude Code hooks for automated memory operations.

The setup system is designed with idempotency in mind — installations can be safely re-run without disrupting existing configurations.

## System Architecture

```mermaid
graph TD
    A[User] --> B[thought CLI]
    B --> C[init command]
    C --> D[SQLite Database]
    C --> E[thought.toml Config]
    C --> F[CLAUDE.md Agent Hint]
    B --> G[MCP Server]
    G --> D
    B --> H[Client Install]
    H --> I[Claude Code]
    H --> J[Cursor]
    H --> K[VS Code]
    B --> L[Hook Install]
    L --> M[.claude/settings.json]
```

## Prerequisites

| Component | Requirement | Notes |
|-----------|-------------|-------|
| Python | >= 3.10 | Core runtime |
| Git | On PATH | Used by git pipeline for code ingestion |
| SQLite | 3.x | Bundled with Python stdlib |
| pip/pipx | Latest | Package installation |

Source: [CONTRIBUTING.md](https://github.com/RNBBarrett/thought-mcp/blob/main/CONTRIBUTING.md)

## Installation Methods

### Standard Installation

```bash
pip install thought-mcp
```

### Development Installation

```bash
git clone https://github.com/RNBBarrett/thought-mcp.git
cd thought-mcp
pip install -e ".[dev]"
```

## CLI Initialization

The `thought init` command establishes the complete working environment. It creates three essential components in sequence.

### Init Command Signature

```python
@app.command()
def init(
    config: Path = typer.Option("thought.toml", help="Path to config file."),
    db_path: str = typer.Option(".thought/thought.db", help="SQLite database path."),
    embedder: str = typer.Option(
        "auto", help="'auto' picks sentence-transformers if available, else deterministic.",
    ),
    write_claude_md: bool = typer.Option(
        True, "--write-claude-md/--no-claude-md",
        help="Drop a CLAUDE.md so MCP clients learn how to use the tool.",
    ),
    quick: bool = typer.Option(
        False, "--quick", help="Skip first-run embedder warmup.",
    ),
) -> None:
```

Source: [src/thought/cli.py:35-56](https://github.com/RNBBarrett/thought-mcp/blob/main/src/thought/cli.py)

### What Init Creates

```mermaid
graph LR
    A[thought init] --> B[Create .thought/ directory]
    A --> C[Create SQLite DB file]
    A --> D[Write thought.toml config]
    A --> E[Write CLAUDE.md]
    
    B --> F[parents=True<br/>exist_ok=True]
    C --> G[DB auto-backed up<br/>before destructive ops]
```

#### 1. Database Initialization

The command creates the SQLite database at the specified path. Parent directories are created automatically using `parents=True` to ensure the path exists.

```python
Path(db_path).parent.mkdir(parents=True, exist_ok=True)
```

Source: [src/thought/cli.py:52-53](https://github.com/RNBBarrett/thought-mcp/blob/main/src/thought/cli.py)

#### 2. Configuration File

The `thought.toml` file contains runtime configuration including embedder settings and database paths.

#### 3. CLAUDE.md Agent Hint

When `write_claude_md=True` (default), the init command drops a `CLAUDE.md` file that teaches MCP clients how to interact with the tool.

### Embedder Configuration

The init command supports three embedder modes:

| Mode | Behavior | Dependencies |
|------|----------|--------------|
| `auto` (default) | Uses `sentence-transformers` if available, falls back to deterministic embeddings | Optional: sentence-transformers |
| `sentence-transformers` | Uses local transformer models for embeddings | Required: sentence-transformers |
| `deterministic` | Uses hash-based embeddings, no ML dependencies | None |

The `--quick` flag skips the first-run embedder warmup process.

## MCP Client Installation

The `thought clients install` command merges a `thought` MCP server entry into your client's configuration file.

### Supported Clients

| Client | Config Location |
|--------|-----------------|
| Claude Code | `.claude/settings.json` |
| Cursor | `~/.cursor/settings.json` |
| VS Code | `~/.cursor/settings.json` |

### Installation Workflow

```mermaid
graph TD
    A[thought clients install] --> B{Check config exists?}
    B -->|No| C[Create new config file]
    B -->|Yes| D[Read existing JSON]
    C --> E{Valid JSON object?}
    D --> E
    E -->|Yes| F[Merge mcpServers entry]
    E -->|No| G[Return error]
    F --> H{Backup enabled?}
    H -->|Yes| I[Create .thought.bak backup]
    H -->|No| J[Write merged config]
    I --> J
    J --> K[Return ClientInstallResult]
```

### Client Install Result States

```python
@dataclass(frozen=True)
class ClientInstallResult:
    client: ClientName
    path: Path
    status: Literal["installed", "already_present", "no_path", "error"]
    detail: str = ""
```

Source: [src/thought/clients.py](https://github.com/RNBBarrett/thought-mcp/blob/main/src/thought/clients.py)
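
The idempotent merge described by the workflow can be approximated as follows. This is a simplified sketch: `merge_server_entry`, the shape of the server block, and the exact backup file name are assumptions, and it returns a plain status string instead of a `ClientInstallResult`.

```python
import json
import shutil
import tempfile
from pathlib import Path


def merge_server_entry(config_path: Path, name: str, block: dict,
                       backup: bool = True) -> str:
    """Merge one MCP server entry into a JSON config, preserving other keys."""
    if config_path.exists():
        try:
            config = json.loads(config_path.read_text(encoding="utf-8"))
        except json.JSONDecodeError:
            return "error"
        if not isinstance(config, dict):
            return "error"
    else:
        config = {}

    servers = config.setdefault("mcpServers", {})
    if not isinstance(servers, dict):
        return "error"
    if servers.get(name) == block:
        return "already_present"  # re-running is a no-op

    if backup and config_path.exists():
        backup_path = config_path.with_name(config_path.name + ".thought.bak")
        shutil.copy2(config_path, backup_path)

    servers[name] = block
    config_path.parent.mkdir(parents=True, exist_ok=True)
    config_path.write_text(json.dumps(config, indent=2), encoding="utf-8")
    return "installed"


with tempfile.TemporaryDirectory() as tmp:
    path = Path(tmp) / "settings.json"
    path.write_text('{"theme": "dark"}', encoding="utf-8")
    block = {"command": "thought", "args": ["serve"]}  # hypothetical entry

    first = merge_server_entry(path, "thought", block)
    second = merge_server_entry(path, "thought", block)
    merged = json.loads(path.read_text(encoding="utf-8"))
    backup_exists = path.with_name("settings.json.thought.bak").exists()
```

Returning early when the entry already matches is what makes a repeated install safe: the file is not rewritten and no new backup is created.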

### Server Block Structure

The MCP server configuration block includes:

- Server name (`thought`)
- Command to execute
- Server arguments
- Environment variables for database path

## Claude Code Hook Installation

The `thought hooks install` command adds hook entries to Claude Code's settings for automated memory operations.

### Hook Types

| Hook Kind | Claude Code Event | Command |
|-----------|-------------------|---------|
| `recall` | UserPromptSubmit | `thought hook recall` |
| `write` | Stop | `thought hook write` |
| `context` | SessionStart | `thought hook context` |

Source: [src/thought/hooks/install.py:17-22](https://github.com/RNBBarrett/thought-mcp/blob/main/src/thought/hooks/install.py)

### Hook Installation Options

```python
def settings_path(*, scope: Literal["project", "user"] = "project") -> Path:
    """Return the ``.claude/settings.json`` path for the requested scope.

    Project scope is the recommended default — it travels with the repo and
    is what most users actually want for THOUGHT-flavoured auto-memory.
    """
    if scope == "project":
        return Path.cwd() / ".claude" / "settings.json"
```

### Hook Install Process

```mermaid
graph TD
    A[thought hooks install recall] --> B{Backup enabled?}
    B -->|Yes| C[Create settings.json.thought.bak]
    B -->|No| D[Read settings.json]
    C --> D
    D --> E{Valid JSON?}
    E -->|Yes| F[Merge recall hook entry]
    E -->|No| G[Return error]
    F --> H{Entry exists?}
    H -->|Yes| I[Return already_present]
    H -->|No| J[Write updated settings.json]
    J --> K[Return HookInstallResult]
```

### Hook Install Result

```python
@dataclass(frozen=True)
class HookInstallResult:
    kind: HookKind
    path: Path
    status: Literal["installed", "already_present", "error"]
    detail: str = ""
```

Source: [src/thought/hooks/install.py:28-32](https://github.com/RNBBarrett/thought-mcp/blob/main/src/thought/hooks/install.py)

## Quick Start Guide

### Step 1: Initialize the Environment

```bash
# Standard initialization
thought init

# Skip embedder warmup for faster startup
thought init --quick

# Custom database location
thought init --db-path /path/to/custom.db
```

### Step 2: Install MCP Client

```bash
# Install for Claude Code
thought clients install claude_code

# Install for Cursor
thought clients install cursor
```

### Step 3: Install Claude Code Hooks (Optional)

```bash
# Install recall hook (automatic memory on user input)
thought hooks install recall

# Install write hook (save memory on session stop)
thought hooks install write

# Install context hook (load memory on session start)
thought hooks install context

# Install all hooks
thought hooks install recall --kind write --kind context
```

## Database Lifecycle Management

### Database Size Check

```bash
thought db size
```

Shows disk usage of main + WAL + SHM sidecars plus entity/edge counts.

### Database Backup

```bash
thought db backup <file>
```

Creates an SQLite online-backup snapshot. Date filters produce a clean, self-contained subset file with DELETE + VACUUM after backup.

### Database Restore

```bash
thought db load <file>
```

Atomically replaces the active database with the backup file. Use `--merge` to add rows from the snapshot with `INSERT OR IGNORE` instead of replacing the database.

### Database Flush

```bash
# Full flush with confirmation
thought db flush

# Skip confirmation
thought db flush --yes

# Date-bounded flush
thought db flush --before 2024-01-01
thought db flush --since 2024-06-01
```

> **Note**: All destructive operations auto-backup to `<db>.bak.<timestamp>` before proceeding.

## Verifying Installation

### Run the Demo

```bash
# Run code audience demo
thought demo code

# Run all demos
thought demo all
```

The demo runs an audience-specific walkthrough end-to-end in a self-cleaning temporary directory, verifying the installation works correctly.

### Health Check

```bash
thought doctor
```

Performs an environment health check to verify all dependencies and configurations are correct.

## Configuration File Format

### thought.toml

```toml
[database]
path = ".thought/thought.db"

[embedder]
type = "auto"  # or "sentence-transformers", "deterministic"

[server]
name = "thought"
transport = "stdio"  # or "streamable-http"
```

## Troubleshooting

### Common Issues

| Issue | Solution |
|-------|----------|
| Config file not found | Run `thought init` first |
| Database locked | Check for other `thought` processes |
| Embedder initialization slow | Use `--quick` flag or `deterministic` embedder |
| MCP client not connecting | Verify client config has correct server entry |

### Reset Installation

```bash
# Backup current database
thought db backup /path/to/backup.db

# Flush and reinitialize
thought db flush --yes
thought init --db-path .thought/thought.db
```

## Next Steps

After installation and setup, users typically:

1. **Ingest code**: `thought ingest-git <repo>` to analyze repository code
2. **Recall information**: `thought recall <query>` to query the knowledge base
3. **Run agents**: Use reference agents like the vulnerability scanner or OSINT aggregator

---

<a id='page-architecture'></a>

## System Architecture

### Related Pages

Related topics: [Introduction to THOUGHT](#page-introduction), [Storage and Database Layer](#page-storage-layer), [Memory Model and Data Structures](#page-memory-model)

<details>
<summary>Relevant Source Files</summary>

The following source files were used to generate this page:

- [src/thought/server.py](https://github.com/RNBBarrett/thought-mcp/blob/main/src/thought/server.py)
- [src/thought/ingest/code/pipeline.py](https://github.com/RNBBarrett/thought-mcp/blob/main/src/thought/ingest/code/pipeline.py)
- [src/thought/ingest/code/git_pipeline.py](https://github.com/RNBBarrett/thought-mcp/blob/main/src/thought/ingest/code/git_pipeline.py)
- [src/thought/query/ask.py](https://github.com/RNBBarrett/thought-mcp/blob/main/src/thought/query/ask.py)
- [src/thought/layers/code.py](https://github.com/RNBBarrett/thought-mcp/blob/main/src/thought/layers/code.py)
- [src/thought/cli.py](https://github.com/RNBBarrett/thought-mcp/blob/main/src/thought/cli.py)
- [src/thought/demo.py](https://github.com/RNBBarrett/thought-mcp/blob/main/src/thought/demo.py)
- [src/thought/hooks/install.py](https://github.com/RNBBarrett/thought-mcp/blob/main/src/thought/hooks/install.py)
- [src/thought/clients.py](https://github.com/RNBBarrett/thought-mcp/blob/main/src/thought/clients.py)
</details>

# System Architecture

## Overview

The thought-mcp project is a Model Context Protocol (MCP) server implementation that provides an intelligent memory and code analysis system for AI-assisted development. The system combines semantic memory storage with code graph analysis, enabling natural language queries against codebases through a bi-temporal knowledge graph.

## High-Level Architecture

```mermaid
graph TD
    subgraph "Client Layer"
        MCP[MCP Client]
        CLI[Thought CLI]
        Hooks[Claude Code Hooks]
    end
    
    subgraph "Server Layer"
        Server[MCP Server]
        Router[Query Router]
        Classifier[Query Classifier]
    end
    
    subgraph "Memory Layer"
        Memory[Memory Manager]
        Recall[Recall Engine]
        Ask[Ask - NL to Cypher]
    end
    
    subgraph "Storage Layer"
        Backend[SQLite Backend]
        Entities[Entity Store]
        Edges[Edge Store]
        Embeddings[Vector Embeddings]
    end
    
    subgraph "Ingest Layer"
        CodePipeline[Code Pipeline]
        GitPipeline[Git Pipeline]
        Extractors[Language Extractors]
    end
    
    MCP --> Server
    CLI --> Server
    Hooks --> Server
    Server --> Router
    Router --> Classifier
    Classifier --> Memory
    Memory --> Backend
    CodePipeline --> Backend
    GitPipeline --> Backend
    Ask --> Recall
```

## Core Components

### MCP Server (`src/thought/server.py`)

The MCP server exposes the primary tool interface for AI clients. It implements async tool handlers that delegate to the memory layer.

**Key Tools:**

| Tool | Purpose |
|------|---------|
| `recall` | Semantic recall of entities using embeddings |
| `ask` | Natural language queries translated to Cypher |
| `working_context` | Context primitive for agent awareness |
| `scan` | Incremental code scanning with change detection |

Source: [src/thought/server.py:1-100]()

### Query Router and Classifier

The system routes queries through a classification system that detects:

- **CODE** queries: Triggered by code-shaped keywords (`function`, `class`, `caller`, `callee`, file extensions) plus camelCase/snake_case identifiers
- **CHANGE** queries: Historical or diff-based queries
- **HYBRID** combinations: CODE × CHANGE patterns like "what changed in auth.middleware since v1.0"

```mermaid
graph LR
    Q[Query] --> C[Classifier]
    C --> |CODE| CR[Code Route]
    C --> |CHANGE| CH[Change Route]
    C --> |HYBRID| HY[Hybrid Route]
    C --> |DEFAULT| DF[Default Recall]
```

Source: [CHANGELOG.md:1-80]()
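A keyword-and-pattern classifier of the kind described above might look like the following sketch. The keyword sets, the regular expressions, and the `classify` function are illustrative assumptions, not the project's actual rules. In particular, the dotted-name pattern is a stand-in for both file-extension and qualified-name detection.

```python
import re

# Keyword lists are illustrative assumptions, not the project's actual lists.
CODE_KEYWORDS = {"function", "class", "caller", "callee"}
CHANGE_KEYWORDS = {"changed", "changes", "since", "diff", "history"}

DOTTED = re.compile(r"\b[A-Za-z_]\w*\.[A-Za-z_]\w*\b")  # auth.middleware, utils.py
CAMEL = re.compile(r"\b[a-z]+[A-Z]\w*\b")               # parseConfig
SNAKE = re.compile(r"\b[a-z]+_[a-z0-9_]+\b")            # authenticate_user


def classify(query: str) -> str:
    """Return CODE, CHANGE, HYBRID, or DEFAULT for a natural-language query."""
    words = set(re.findall(r"[a-z]+", query.lower()))
    looks_like_code = bool(
        words & CODE_KEYWORDS
        or DOTTED.search(query)
        or CAMEL.search(query)
        or SNAKE.search(query)
    )
    looks_like_change = bool(words & CHANGE_KEYWORDS)
    if looks_like_code and looks_like_change:
        return "HYBRID"
    if looks_like_code:
        return "CODE"
    if looks_like_change:
        return "CHANGE"
    return "DEFAULT"
```

With these rules, the example query "what changed in auth.middleware since v1.0" is classified as `HYBRID` because it contains both a dotted identifier and change keywords.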

### Code Layer (`src/thought/layers/code.py`)

The code layer provides a high-level API for code-specific graph queries against the currently-valid view of the code graph.

```python
class CodeLayer:
    # Signatures abbreviated; bodies omitted.
    def callers_of(self, name): ...       # Who calls this function
    def callees_of(self, name): ...       # What this function calls
    def impact_set(self, name): ...       # Transitive callers, ranked
    def defines_in_file(self, path): ...  # Entities in a given file
```

Source: [src/thought/layers/code.py:1-60]()
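To make the three call-graph queries concrete, here is a small in-memory sketch. `TinyCallGraph` is a hypothetical stand-in, not the project's `CodeLayer`: it works on a plain list of `(caller, callee)` pairs and orders the impact set by breadth-first distance rather than by the ranking the real layer uses.

```python
from collections import defaultdict, deque


class TinyCallGraph:
    """In-memory stand-in for the code graph; not the project's CodeLayer."""

    def __init__(self, calls: list[tuple[str, str]]):
        self._callers = defaultdict(set)  # callee -> direct callers
        self._callees = defaultdict(set)  # caller -> direct callees
        for caller, callee in calls:
            self._callers[callee].add(caller)
            self._callees[caller].add(callee)

    def callers_of(self, name: str) -> list[str]:
        return sorted(self._callers.get(name, ()))

    def callees_of(self, name: str) -> list[str]:
        return sorted(self._callees.get(name, ()))

    def impact_set(self, name: str) -> list[str]:
        """Transitive callers, nearest first (BFS distance, then name)."""
        distance = {name: 0}
        queue = deque([name])
        while queue:
            current = queue.popleft()
            for caller in sorted(self._callers.get(current, ())):
                if caller not in distance:
                    distance[caller] = distance[current] + 1
                    queue.append(caller)
        del distance[name]  # the entity itself is not part of its impact set
        return sorted(distance, key=lambda n: (distance[n], n))
```

The impact set answers "what might break if this function changes": every function that reaches it through one or more `CALLS` edges.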

## Storage Architecture

### SQLite Backend

The system uses SQLite as its primary storage with the following schema features:

- **Bi-temporal model**: Tracks `valid_from`/`valid_until` (business time) and `learned_at` (system knowledge time)
- **Entity/Edge tables** with code-specific columns (`code_file`, `code_language`, `code_commit_sha`)
- **Partial indexes** for efficient queries
- **WAL mode** with checkpointing for consistent backups

### Data Models

**Entity Structure:**
```python
from dataclasses import dataclass


@dataclass
class CodeEntity:
    name: str
    type_: str           # function, class, module, method
    language: str        # python, typescript, rust, php
    file_path: str
    line_start: int
    line_end: int
    signature: str
    docstring: str
    visibility: str      # public, private, protected
    attrs: dict
```

**Edge Types:**
- `CALLS` - Function/method invocations
- `INHERITS_FROM` - Class inheritance
- `IMPORTS` - Module imports
- `DEFINES` - Member definitions within classes
- `OVERRIDES` - Method overrides (TypeScript)

Source: [src/thought/ingest/code/pipeline.py:1-100]()

## Code Ingestion Pipeline

### Language Extractors

The system uses tree-sitter parsers for multi-language code extraction:

| Language | File | Capabilities |
|----------|------|--------------|
| Python | `python_extractor.py` | Functions, classes, imports, inheritance |
| TypeScript | `typescript_extractor.py` | Functions, classes, imports, exports, inheritance, overrides |
| Rust | `rust_extractor.py` | Functions, impl blocks, traits |
| PHP | `php_extractor.py` | Functions, classes, methods, namespaces |

All extractors output `CodeEntity` and `CodeEdge` tuples parsed from AST nodes.

Source: [src/thought/ingest/code/python_extractor.py:1-80]()

### Code Pipeline Flow

```mermaid
graph TD
    F[File Input] --> LD[Language Detection]
    LD --> EX[Extract Entities/Edges]
    EX --> SI[Upsert Source]
    SI --> WE[_write_entities]
    WE --> EE[Embed Signatures]
    EE --> WEd[_write_edges]
    WEd --> CM[Commit Transaction]
    
    subgraph "Entities Processing"
        WE --> |"name_to_id map"| WEd
    end
```

Source: [src/thought/ingest/code/pipeline.py:100-200]()
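The ordering in the flow above matters: entities are written first so that edges can be resolved through the `name_to_id` map. The sketch below shows that ordering with the standard `sqlite3` module. The two-table `SCHEMA`, the `ingest_file` function, and the dictionary shapes are assumptions for illustration; the embedding step is omitted, and unresolved edge endpoints are simply skipped.

```python
import sqlite3
import uuid

# Minimal schema for the sketch (assumption; the real schema has more columns).
SCHEMA = """
CREATE TABLE entities (id TEXT PRIMARY KEY, name TEXT NOT NULL, type TEXT NOT NULL);
CREATE TABLE edges (
    id TEXT PRIMARY KEY,
    source_id TEXT NOT NULL,
    target_id TEXT NOT NULL,
    relation_type TEXT NOT NULL
);
"""


def ingest_file(conn: sqlite3.Connection, entities: list[dict],
                edges: list[dict]) -> dict[str, str]:
    """Write entities, then edges, inside one transaction."""
    name_to_id: dict[str, str] = {}
    with conn:  # commits on success, rolls back on error
        for entity in entities:
            entity_id = str(uuid.uuid4())
            conn.execute(
                "INSERT INTO entities (id, name, type) VALUES (?, ?, ?)",
                (entity_id, entity["name"], entity["type"]),
            )
            name_to_id[entity["name"]] = entity_id
        for edge in edges:
            source_id = name_to_id.get(edge["source"])
            target_id = name_to_id.get(edge["target"])
            if source_id is None or target_id is None:
                continue  # unresolved endpoints are skipped in this sketch
            conn.execute(
                "INSERT INTO edges (id, source_id, target_id, relation_type) "
                "VALUES (?, ?, ?, ?)",
                (str(uuid.uuid4()), source_id, target_id, edge["relation"]),
            )
    return name_to_id
```

Wrapping both loops in a single transaction means a file is either fully ingested or not at all.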

### Git Pipeline (`src/thought/ingest/code/git_pipeline.py`)

The git pipeline enables historical code analysis with two modes:

| Mode | Behavior |
|------|----------|
| `snapshot` | Fast - ingest HEAD only, stamp entities with HEAD SHA |
| `full` | Walk every commit chronologically, stamp each entity with its commit SHA |

The `full` mode enables bi-temporal `as_of` queries against historical commits.

Source: [src/thought/ingest/code/git_pipeline.py:1-50]()
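The difference between the two modes comes down to which commits are visited and which SHA each entity is stamped with. The following sketch models that on plain data; the `Commit` class and `plan_ingest` function are hypothetical, and no Git access is performed.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Commit:
    sha: str
    entities: tuple[str, ...]  # entity names present at this commit


def plan_ingest(history: list[Commit], mode: str) -> list[tuple[str, str]]:
    """Return (entity_name, commit_sha) stamps; history is oldest-first."""
    if not history:
        return []
    if mode == "snapshot":
        commits = [history[-1]]  # HEAD only
    elif mode == "full":
        commits = history        # every commit, chronologically
    else:
        raise ValueError(f"unknown mode: {mode!r}")
    return [(name, commit.sha) for commit in commits for name in commit.entities]
```

In `full` mode the same entity name appears once per commit it exists in, which is what makes later `as_of` queries against a specific commit possible.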

## Query System

### Recall Engine

Semantic recall uses vector embeddings to find entities by intent rather than exact name:

```python
def recall(
    query: str,
    scope: str = "all",
    owner_id: str | None = None,
    limit: int = 10,
) -> list[RecallHit]: ...  # signature only; body omitted
```

The system embeds entity signatures and docstrings during ingestion, enabling natural queries like "who calls authenticate_user".

### Ask Engine (`src/thought/query/ask.py`)

Natural language to Cypher translation with validation:

```mermaid
graph LR
    NL[Natural Language] --> PROMPT[Build Prompt]
    PROMPT --> LLM[LLM Provider]
    LLM --> CY[Cypher Query]
    CY --> VAL[Validate]
    VAL --> |Valid| EXE[Execute]
    VAL --> |Invalid| FB[Fallback to Recall]
```

**Constraint System:**
- Read-only Cypher features only (MATCH, WHERE, RETURN)
- Validates against actual schema before execution
- Falls back to `recall()` on translation failures

Source: [src/thought/query/ask.py:1-80]()
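The validate-then-fall-back behaviour can be sketched as follows. This is a deliberately conservative illustration, not the validator in `ask.py`: the `WRITE_CLAUSES` list, `is_read_only`, and the callable parameters of `ask` are assumptions, and the schema check mentioned above is not modelled. A keyword check like this also rejects harmless queries that merely contain a word such as `SET` inside a string literal.

```python
import re
from typing import Callable

# Clauses that write data or call procedures; the list is an assumption.
WRITE_CLAUSES = re.compile(
    r"\b(CREATE|MERGE|DELETE|DETACH|SET|REMOVE|DROP|CALL|LOAD)\b", re.IGNORECASE
)


def is_read_only(cypher: str) -> bool:
    """Accept only queries that start with MATCH, contain RETURN, and write nothing."""
    text = cypher.strip()
    if not re.match(r"(?i)MATCH\b", text):
        return False
    if not re.search(r"(?i)\bRETURN\b", text):
        return False
    return WRITE_CLAUSES.search(text) is None


def ask(question: str,
        translate: Callable[[str], str],
        execute: Callable[[str], list],
        recall: Callable[[str], list]) -> list:
    """Translate to Cypher and execute it, or fall back to recall."""
    try:
        cypher = translate(question)
    except Exception:
        return recall(question)  # translation failure
    if not is_read_only(cypher):
        return recall(question)  # validation failure
    return execute(cypher)
```

Note that the word-boundary match keeps a `:CALLS` relationship type from being mistaken for a `CALL` clause.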

## Integration Points

### MCP Client Installation (`src/thought/clients.py`)

The system installs as an MCP server for AI coding tools:

```python
def install(client: ClientName, *, server_name: str = "thought"): ...  # signature only
```

Supported clients include Claude Code, Cursor, and other MCP-compatible tools. Installation merges the server entry into the client's configuration without disturbing existing settings.

### Claude Code Hooks (`src/thought/hooks/install.py`)

Hooks provide automatic memory integration:

| Hook | Event | Action |
|------|-------|--------|
| `recall` | UserPromptSubmit | Memory recall on user input |
| `write` | Stop | Context capture on completion |
| `context` | SessionStart | Session initialization |

Source: [src/thought/hooks/install.py:1-50]()

## CLI Architecture (`src/thought/cli.py`)

The command-line interface provides database lifecycle management:

| Command | Function |
|---------|----------|
| `thought init` | Create database + config + CLAUDE.md |
| `thought db size` | Disk usage + entity/edge counts |
| `thought db flush` | Wipe KB with backup |
| `thought db backup` | SQLite online-backup snapshot |
| `thought db load` | Load snapshot atomically |
| `thought db inspect` | Count + schema summary |
| `thought ingest-git` | Git-history-aware ingestion |
| `thought callers` | Direct callers, ranked by PageRank |
| `thought impact` | Transitive impact set |
| `thought diff` | Entity diff between commits |

Source: [src/thought/cli.py:1-100]()

## Demo System (`src/thought/demo.py`)

The built-in demo provides audience-specific walkthroughs:

| Audience | Purpose |
|----------|---------|
| `code` | Agent/developer flow - 14-stage code vertical |
| `writer` | Novelist/paper author - bi-temporal recall |
| `legal` | Investigator - contradiction detection |
| `researcher` | Academic - claim/source relationships |

Source: [src/thought/demo.py:1-50]()

## Configuration

### Database Initialization

```toml
# thought.toml
[database]
path = ".thought/thought.db"

[llm]
provider = "anthropic"  # or ollama, lmstudio, openai-compat

[embedder]
type = "auto"  # sentence-transformers if available, else deterministic
```

Source: [src/thought/cli.py:50-80]()

## Summary

The thought-mcp architecture combines:

1. **MCP Server** - Tool interface for AI clients
2. **Bi-temporal Storage** - SQLite with code-specific schema
3. **Multi-language Extractors** - Tree-sitter based AST parsing
4. **Git Integration** - Historical code analysis
5. **Query Routing** - Classification-based query dispatch
6. **Natural Language Interface** - NL to Cypher translation

This design enables both real-time code assistance and deep historical analysis of codebases through a unified query interface.

---

<a id='page-storage-layer'></a>

## Storage and Database Layer

### Related Pages

Related topics: [System Architecture](#page-architecture), [Memory Model and Data Structures](#page-memory-model)

<details>
<summary>Relevant Source Files</summary>

The following source files were used to generate this page:

- [src/thought/storage/__init__.py](https://github.com/RNBBarrett/thought-mcp/blob/main/src/thought/storage/__init__.py)
- [src/thought/storage/base.py](https://github.com/RNBBarrett/thought-mcp/blob/main/src/thought/storage/base.py)
- [src/thought/storage/sqlite/backend.py](https://github.com/RNBBarrett/thought-mcp/blob/main/src/thought/storage/sqlite/backend.py)
- [src/thought/storage/sqlite/migrations/0001_initial.sql](https://github.com/RNBBarrett/thought-mcp/blob/main/src/thought/storage/sqlite/migrations/0001_initial.sql)
- [src/thought/storage/sqlite/migrations/0002_v0.2_code.sql](https://github.com/RNBBarrett/thought-mcp/blob/main/src/thought/storage/sqlite/migrations/0002_v0.2_code.sql)
- [src/thought/storage/sqlite/migrations/0003_views.sql](https://github.com/RNBBarrett/thought-mcp/blob/main/src/thought/storage/sqlite/migrations/0003_views.sql)
- [src/thought/storage/sqlite/migrations/0004_agents.sql](https://github.com/RNBBarrett/thought-mcp/blob/main/src/thought/storage/sqlite/migrations/0004_agents.sql)
</details>

# Storage and Database Layer

## Overview

The Storage and Database Layer is the persistence backbone of the THOUGHT system, providing a structured SQLite-based knowledge base (KB) for storing entities, edges, embeddings, and operational metadata. This layer abstracts database operations through a modular backend interface, enabling CRUD operations, bi-temporal data tracking, and specialized queries for code analysis.

The architecture supports:
- **Entity/Edge persistence** with bi-temporal validity tracking (valid_from, valid_until, learned_at)
- **Vector embeddings** for semantic recall operations
- **Source tracking** for ingested content provenance
- **Code-specific metadata** including language, file path, and commit SHA
- **Agent and scan logging** for operational auditability

Source: [src/thought/storage/__init__.py](https://github.com/RNBBarrett/thought-mcp/blob/main/src/thought/storage/__init__.py)

---

## Architecture

### Layer Stack

```mermaid
graph TD
    A[CLI / API Layer] --> B[GraphLayer / CodeLayer]
    B --> C[BaseBackend Interface]
    C --> D[SQLiteBackend]
    D --> E[(SQLite Database)]
    
    F[Ingest Pipelines] --> C
    G[Recall Queries] --> B
```

### Backend Interface Hierarchy

The system defines a base interface (`BaseBackend`) that all storage implementations must satisfy, with `SQLiteBackend` as the primary concrete implementation.

| Component | Responsibility |
|-----------|-----------------|
| `BaseBackend` | Abstract interface defining all storage operations |
| `SQLiteBackend` | Concrete SQLite implementation with WAL mode and migrations |
| `GraphLayer` | Query layer wrapping backend for graph traversal operations |
| `CodeLayer` | Specialized wrapper for code-specific queries (callers, callees, impact) |

Source: [src/thought/storage/base.py](https://github.com/RNBBarrett/thought-mcp/blob/main/src/thought/storage/base.py), [src/thought/layers/code.py](https://github.com/RNBBarrett/thought-mcp/blob/main/src/thought/layers/code.py)

---

## Data Models

### Entity Model

The `Entity` model represents atomic units of knowledge stored in the database.

| Field | Type | Description |
|-------|------|-------------|
| `id` | `str` | Unique identifier (UUID) |
| `type` | `str` | Entity type (e.g., "function", "class", "claim") |
| `name` | `str` | Display name |
| `canonical_name` | `str` | Normalized identifier for lookups |
| `owner_id` | `str \| None` | Owner for private entities |
| `scope` | `ScopeName` | "shared" or "private" visibility |
| `tier` | `Tier` | Importance tier ("hot", "warm", "cold") |
| `importance` | `float` | 0.0-1.0 importance score |
| `valid_from` | `datetime` | When entity became valid |
| `valid_until` | `datetime \| None` | When entity expired (NULL = currently valid) |
| `learned_at` | `datetime` | When system learned this fact |
| `unlearned_at` | `datetime \| None` | When system discarded this fact |
| `created_at` | `datetime` | Record creation timestamp |
| `last_accessed_at` | `datetime` | Last access timestamp |
| `access_count` | `int` | Query frequency counter |
| `attrs` | `dict[str, object]` | Extensible metadata |
| `code_file` | `str \| None` | Source file path (code entities) |
| `code_language` | `str \| None` | Programming language (code entities) |
| `code_commit_sha` | `str \| None` | Git commit SHA (code entities) |

Source: [src/thought/models.py](https://github.com/RNBBarrett/thought-mcp/blob/main/src/thought/models.py)

### Edge Model

The `Edge` model represents relationships between entities.

| Field | Type | Description |
|-------|------|-------------|
| `id` | `str` | Unique identifier |
| `source_id` | `str` | Source entity ID |
| `target_id` | `str` | Target entity ID |
| `relation_type` | `str` | Relationship type (e.g., "CALLS", "INHERITS_FROM") |
| `attrs` | `dict[str, object]` | Edge metadata |

Source: [src/thought/models.py](https://github.com/RNBBarrett/thought-mcp/blob/main/src/thought/models.py)

---

## Database Schema

### Core Tables

```mermaid
erDiagram
    entities ||--o{ edges : source
    entities ||--o{ edges : target
    entities ||--o{ embeddings : ""
    entities ||--o{ sources : references
    sources ||--o{ entities : ""
    
    entities {
        string id PK
        string type
        string name
        string canonical_name
        string scope
        string owner_id FK
        string tier
        float importance
        datetime valid_from
        datetime valid_until
        datetime learned_at
        datetime unlearned_at
        datetime created_at
        datetime last_accessed_at
        int access_count
        json attrs
        string code_file
        string code_language
        string code_commit_sha
    }
    
    edges {
        string id PK
        string source_id FK
        string target_id FK
        string relation_type
        json attrs
        datetime valid_from
        datetime valid_until
    }
    
    embeddings {
        string id PK
        string entity_id FK
        string model_name
        string model_version
        int dim
        blob vector
    }
    
    sources {
        string id PK
        string content
        string mime_type
    }
```

### Migration System

The database uses a migration-based schema evolution system. Migrations are tracked in the `applied_migrations` table to ensure idempotent execution.

| Migration | Purpose |
|-----------|---------|
| `0001_initial.sql` | Core schema: entities, edges, sources, embeddings |
| `0002_v0.2_code.sql` | Code-specific columns: code_file, code_language, code_commit_sha |
| `0003_views.sql` | Database views for common queries |
| `0004_agents.sql` | Agent management and scan logging tables |

Source: [src/thought/storage/sqlite/migrations/0001_initial.sql](https://github.com/RNBBarrett/thought-mcp/blob/main/src/thought/storage/sqlite/migrations/0001_initial.sql) - [0004_agents.sql](https://github.com/RNBBarrett/thought-mcp/blob/main/src/thought/storage/sqlite/migrations/0004_agents.sql)
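Idempotent migration tracking of this kind can be sketched with the standard `sqlite3` module. The `apply_migrations` function, the in-memory mapping of file names to SQL, and the column layout of `applied_migrations` are assumptions for illustration; only the table name comes from the description above.

```python
import sqlite3


def apply_migrations(conn: sqlite3.Connection, migrations: dict[str, str]) -> list[str]:
    """Apply pending migrations in file-name order; return the names applied."""
    # Assumed column layout; only the table name is taken from the text above.
    conn.execute(
        "CREATE TABLE IF NOT EXISTS applied_migrations ("
        "name TEXT PRIMARY KEY, applied_at TEXT DEFAULT CURRENT_TIMESTAMP)"
    )
    done = {row[0] for row in conn.execute("SELECT name FROM applied_migrations")}
    applied = []
    for name in sorted(migrations):  # 0001_..., 0002_..., in order
        if name in done:
            continue  # already applied: skipping makes reruns harmless
        conn.executescript(migrations[name])
        conn.execute("INSERT INTO applied_migrations (name) VALUES (?)", (name,))
        conn.commit()
        applied.append(name)
    return applied
```

Running the function a second time with the same migrations applies nothing, which is the idempotency property the tracking table exists to provide.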

---

## Backend Operations

### Core CRUD Operations

| Operation | Description |
|-----------|-------------|
| `upsert_entity()` | Insert or update an entity with identity `(name, scope, owner_id, code_file, code_commit_sha)` |
| `upsert_edge()` | Insert or update an edge between entities |
| `find_entity()` | Retrieve entity by ID |
| `find_code_entity()` | Fast lookup by canonical name with optional code_file/code_commit_sha disambiguation |
| `list_entities()` | Paginated entity listing with scope filtering |

### Backup and Restore Operations

| Operation | Description |
|-----------|-------------|
| `backup_to(path)` | Create SQLite online-backup snapshot |
| `merge_from(path)` | INSERT-OR-IGNORE rows from backup file |
| `open_readonly(path)` | Classmethod for read-only inspection of backup files |
| Auto-backup | Automatic `.bak.<timestamp>` backup before destructive operations |

### WAL Checkpointing

The backend performs WAL checkpointing on `close()` to ensure backups always see a consistent database state.

Source: [src/thought/storage/sqlite/backend.py](https://github.com/RNBBarrett/thought-mcp/blob/main/src/thought/storage/sqlite/backend.py)
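The backup, checkpoint, and merge operations map onto standard SQLite features that Python's `sqlite3` module exposes directly. The functions below are a minimal sketch under stated assumptions, not the methods of `SQLiteBackend`: they are free functions rather than methods, `close_with_checkpoint` is a hypothetical name, and `merge_from` handles a single table whose name is trusted input.

```python
import sqlite3
from pathlib import Path


def backup_to(source_path: Path, target_path: Path) -> None:
    """Copy a live database with SQLite's online-backup API."""
    source = sqlite3.connect(source_path)
    target = sqlite3.connect(target_path)
    try:
        source.backup(target)
    finally:
        target.close()
        source.close()


def close_with_checkpoint(conn: sqlite3.Connection) -> None:
    """Fold the WAL back into the main file before closing."""
    conn.execute("PRAGMA wal_checkpoint(TRUNCATE)")
    conn.close()


def merge_from(conn: sqlite3.Connection, snapshot_path: Path,
               table: str = "entities") -> int:
    """INSERT OR IGNORE rows from a snapshot; return the number of rows added."""
    before = conn.total_changes
    conn.execute("ATTACH DATABASE ? AS snap", (str(snapshot_path),))
    try:
        with conn:  # commit on success, roll back on error
            # The table name is interpolated, so it must come from trusted code.
            conn.execute(f"INSERT OR IGNORE INTO main.{table} SELECT * FROM snap.{table}")
    finally:
        conn.execute("DETACH DATABASE snap")
    return conn.total_changes - before
```

Rows whose primary key already exists in the active database are left untouched, so merging the same snapshot twice adds nothing the second time.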

---

## Bi-Temporal Data Model

The storage layer implements bi-temporal tracking, enabling both temporal validity and temporal knowledge queries.

```mermaid
graph LR
    A[valid_from] --> B[Entity Validity Period]
    B --> C[valid_until]
    
    D[learned_at] --> E[Knowledge Period]
    E --> F[unlearned_at]
```

| Time Axis | Purpose | Query Use |
|-----------|---------|-----------|
| `valid_from` / `valid_until` | Temporal validity | `as_of_kind='valid'`: "what was true on date X" |
| `learned_at` / `unlearned_at` | System knowledge | `as_of_kind='learned'`: "what did the system know on date X" |

This separation allows querying historical states even after facts are corrected.

Source: [src/thought/storage/__init__.py](https://github.com/RNBBarrett/thought-mcp/blob/main/src/thought/storage/__init__.py)
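A query along either time axis reduces to an interval test on the corresponding pair of columns. The sketch below assumes an `entities` table with `name`, `valid_from`, `valid_until`, `learned_at`, and `unlearned_at` columns holding ISO-8601 strings in one consistent format, and it treats each interval as half-open (start inclusive, end exclusive). The `entities_as_of` function is hypothetical; only the `as_of_kind` values come from the table above.

```python
import sqlite3


def entities_as_of(conn: sqlite3.Connection, when: str,
                   as_of_kind: str = "valid") -> list[str]:
    """Names of entities visible at an ISO-8601 timestamp on the chosen axis."""
    columns = {
        "valid": ("valid_from", "valid_until"),
        "learned": ("learned_at", "unlearned_at"),
    }
    start, end = columns[as_of_kind]  # KeyError for an unknown axis
    # Column names come from the fixed mapping above, never from user input.
    rows = conn.execute(
        f"SELECT name FROM entities "
        f"WHERE {start} <= ? AND ({end} IS NULL OR {end} > ?) ORDER BY name",
        (when, when),
    )
    return [row[0] for row in rows]
```

The two axes can disagree for the same date: a fact may already have been true in the world on date X while the system only learned it later, so it appears under `valid` but not under `learned`.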

---

## Code-Specific Features

### Entity Identity for Code

Code entities use a composite identity to prevent method name collisions across files and commits:

```
(name, scope, owner_id, code_file, code_commit_sha)
```

This ensures that `auth.middleware.authenticate_user` and `billing.middleware.authenticate_user` remain distinct entities, and methods from different commits are preserved separately.

### Fast Code Entity Lookup

The `find_code_entity()` method provides optimized lookup with:

- Primary: exact `canonical_name` match
- Secondary: `code_file` disambiguation
- Tertiary: `code_commit_sha` filtering for historical queries

### Call Graph Resolution

The `call_graph.py` module implements multi-stage resolution:

1. Intra-file qualified match (`file.method`)
2. Unique qualified suffix match
3. Cross-file bare-name match
4. Stub creation for unresolved references

Source: [src/thought/ingest/code/call_graph.py](https://github.com/RNBBarrett/thought-mcp/blob/main/src/thought/ingest/code/call_graph.py)
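One possible reading of the four resolution stages is sketched below. The `resolve_call` function, its stage labels, and the exact matching rules are assumptions made for illustration; `call_graph.py` may interpret each stage differently.

```python
def resolve_call(callee: str, caller_module: str, known: set[str]) -> tuple[str, str]:
    """Return (resolved_name, stage) for a call target seen in caller_module."""
    # 1. Intra-file qualified match: the caller's own module defines the callee.
    qualified = f"{caller_module}.{callee}"
    if qualified in known:
        return qualified, "intra_file"

    # 2. Unique qualified suffix: exactly one known name ends with ".<callee>".
    suffix_matches = [name for name in known if name.endswith(f".{callee}")]
    if len(suffix_matches) == 1:
        return suffix_matches[0], "unique_suffix"

    # 3. Cross-file bare-name match: an unqualified entity with this exact name.
    if callee in known:
        return callee, "bare_name"

    # 4. Nothing matched: the caller is expected to create a stub entity.
    return callee, "stub"
```

Trying the most specific match first keeps a call to a local helper from being attributed to a same-named function elsewhere, while the stub stage ensures every `CALLS` edge still has a target.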

---

## CLI Commands

The storage layer is accessible via the `thought db` command group.

| Command | Description |
|---------|-------------|
| `thought db size` | Disk usage of main + WAL + SHM files + entity/edge counts |
| `thought db flush` | Wipe KB with optional date/scope filters |
| `thought db backup <file>` | Create SQLite snapshot with optional date filters |
| `thought db load <file>` | Replace active DB or merge from snapshot |
| `thought db inspect <file>` | Inspect backup without loading |

```bash
# Flush entities valid before a date
thought db flush --before 2025-01-01

# Flush with time-axis filter
thought db flush --time-axis valid --since 2024-06-01

# Backup with date subset
thought db backup backup.db --since 2024-01-01
```

Source: [src/thought/cli.py](https://github.com/RNBBarrett/thought-mcp/blob/main/src/thought/cli.py)

---

## Query Layer Wrappers

### GraphLayer

Base wrapper providing graph traversal operations including:

- Personalized PageRank for ranking entities
- Transitive reachability queries
- Edge type filtering

### CodeLayer

Specialized wrapper for code-specific operations:

| Method | Description |
|--------|-------------|
| `callers_of(name)` | Direct callers ranked by PageRank |
| `callees_of(name)` | Direct callees within package |
| `impact_set(name)` | Transitive callers (all affected entities) |
| `defines_in_file(path)` | All entities in a source file |

Both layers support `as_of=` parameter for historical queries against the bi-temporal git ingest data.

Source: [src/thought/layers/graph.py](https://github.com/RNBBarrett/thought-mcp/blob/main/src/thought/layers/graph.py), [src/thought/layers/code.py](https://github.com/RNBBarrett/thought-mcp/blob/main/src/thought/layers/code.py)
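Personalized PageRank, which the layers above use for ranking, can be computed by plain power iteration with restarts to a set of seed nodes. The sketch below is a textbook formulation, not the implementation in `graph.py`: the function name, the damping factor of 0.85, the fixed iteration count, and the handling of dangling nodes are assumptions.

```python
def personalized_pagerank(
    edges: list[tuple[str, str]],
    seeds: set[str],
    damping: float = 0.85,
    iterations: int = 50,
) -> dict[str, float]:
    """Power iteration with restarts to the seed nodes."""
    if not seeds:
        raise ValueError("at least one seed is required")
    nodes = sorted({node for edge in edges for node in edge} | set(seeds))
    out_links: dict[str, list[str]] = {node: [] for node in nodes}
    for source, target in edges:
        out_links[source].append(target)

    # The restart distribution is uniform over the seeds.
    restart = {node: (1.0 / len(seeds) if node in seeds else 0.0) for node in nodes}
    scores = dict(restart)
    for _ in range(iterations):
        updated = {node: (1.0 - damping) * restart[node] for node in nodes}
        for node in nodes:
            share = damping * scores[node]
            targets = out_links[node]
            if targets:
                for target in targets:
                    updated[target] += share / len(targets)
            else:
                # Dangling node: return its mass to the seeds.
                for seed in seeds:
                    updated[seed] += share / len(seeds)
        scores = updated
    return scores
```

To rank the callers of a function, one option is to run the walk over reversed `CALLS` edges (callee to caller) with that function as the only seed, so that closer callers receive higher scores.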

---

## Configuration

### Database Path

Configured via `thought.toml`:

```toml
[database]
path = ".thought/thought.db"
```

### Initialization Options

| Option | Default | Description |
|--------|---------|-------------|
| `config` | `"thought.toml"` | Config file path |
| `db_path` | `".thought/thought.db"` | SQLite database path |
| `embedder` | `"auto"` | Embedder selection |
| `write_claude_md` | `true` | Generate CLAUDE.md for MCP clients |
| `quick` | `false` | Skip embedder warmup |

Source: [src/thought/cli.py](https://github.com/RNBBarrett/thought-mcp/blob/main/src/thought/cli.py)

---

## Summary

The Storage and Database Layer provides THOUGHT's persistent memory through:

1. **SQLite with WAL mode** for reliable concurrent access
2. **Bi-temporal modeling** for historical queries and fact correction tracking
3. **Migration-based schema evolution** with idempotent execution
4. **Code-aware entity identity** preventing cross-file and cross-commit collisions
5. **Specialized query layers** (GraphLayer, CodeLayer) for semantic and structural queries
6. **Backup/restore operations** with atomic replacements and auto-backup safety

This architecture enables the "second brain" functionality, allowing agents to recall, query, and analyze information across code, prose, legal, and research domains.

---

<a id='page-memory-model'></a>

## Memory Model and Data Structures

### Related Pages

Related topics: [System Architecture](#page-architecture), [Storage and Database Layer](#page-storage-layer), [Query and Retrieval System](#page-query-system)

<details>
<summary>Relevant Source Files</summary>

The following source files were used to generate this page:

- [src/thought/models.py](https://github.com/RNBBarrett/thought-mcp/blob/main/src/thought/models.py)
- [src/thought/ingest/entities.py](https://github.com/RNBBarrett/thought-mcp/blob/main/src/thought/ingest/entities.py)
- [src/thought/consolidation/engine.py](https://github.com/RNBBarrett/thought-mcp/blob/main/src/thought/consolidation/engine.py)
- [src/thought/layers/vector.py](https://github.com/RNBBarrett/thought-mcp/blob/main/src/thought/layers/vector.py)
- [src/thought/layers/graph.py](https://github.com/RNBBarrett/thought-mcp/blob/main/src/thought/layers/graph.py)
- [src/thought/layers/temporal.py](https://github.com/RNBBarrett/thought-mcp/blob/main/src/thought/layers/temporal.py)
</details>

# Memory Model and Data Structures

## Overview

The thought-mcp repository implements a multi-layered memory architecture designed for AI-assisted knowledge management. The memory model combines vector embeddings for semantic search, graph relationships for structural querying, and temporal versioning for historical analysis. This hybrid approach enables both intuitive natural-language recall and precise code-intent queries.

The core memory system operates as a knowledge base (KB) with bi-temporal semantics, tracking when facts became true (`valid_from`) versus when the system learned them (`learned_at`). This design supports time-travel queries that answer "what was true on date X" or "what did the system know on date X".

## Architecture Layers

The memory system is organized into three distinct but interconnected layers:

```mermaid
graph TD
    A[User Input] --> B[Memory Layer]
    B --> C[Vector Layer]
    B --> D[Graph Layer]
    B --> E[Temporal Layer]
    C --> F[SQLite Backend]
    D --> F
    E --> F
    G[Query/Recall] --> B
```

### Vector Layer (`src/thought/layers/vector.py`)

The vector layer handles semantic embedding and similarity search. It stores dense vector representations of entities enabling natural-language recall based on meaning rather than exact keyword matching.

**Core Responsibilities:**

- Embed text content (entity names, signatures, docstrings) into high-dimensional vectors
- Store embeddings with model metadata (name, version, dimensions)
- Perform similarity searches against the embedded corpus
- Support fallback to deterministic embeddings when ML models are unavailable

**Key Components:**

| Component | Purpose |
|-----------|---------|
| `VectorStore` | Persists embeddings in SQLite with metadata |
| `Embedder` | Base protocol for embedding models |
| `OllamaEmbedder` | Integration with Ollama's `/api/embed` endpoint |
| `DeterministicEmbedder` | Fallback using hash-based vectors |

Source: [src/thought/layers/vector.py](https://github.com/RNBBarrett/thought-mcp/blob/main/src/thought/layers/vector.py)
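A hash-based fallback embedder can be as simple as the hashing trick: each token is hashed into one of a fixed number of buckets. The sketch below is one such scheme, offered as an assumption about what "hash-based vectors" could mean; it is not the project's `DeterministicEmbedder`, and the function names, the 64-dimension default, and the whitespace tokenizer are all illustrative.

```python
import hashlib
import math


def deterministic_embed(text: str, dim: int = 64) -> list[float]:
    """Hash each token into a signed bucket; return an L2-normalized vector."""
    vector = [0.0] * dim
    for token in text.lower().split():
        digest = hashlib.sha256(token.encode("utf-8")).digest()
        bucket = int.from_bytes(digest[:4], "big") % dim
        sign = 1.0 if digest[4] % 2 == 0 else -1.0
        vector[bucket] += sign
    norm = math.sqrt(sum(v * v for v in vector))
    # An empty or fully cancelled input stays a zero vector.
    return [v / norm for v in vector] if norm else vector


def cosine(a: list[float], b: list[float]) -> float:
    """Cosine similarity for vectors that are already L2-normalized."""
    return sum(x * y for x, y in zip(a, b))
```

Such vectors capture token overlap rather than meaning, which is why they serve only as a fallback, but they need no model download and always produce the same output for the same input.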

### Graph Layer (`src/thought/layers/graph.py`)

The graph layer manages entity-relationship data structures and supports Cypher-style traversals. It maintains the structural knowledge of how entities connect to each other.

**Entity Types Supported:**

| Type | Description |
|------|-------------|
| `module` | Source file or namespace unit |
| `class` | Class or type declarations |
| `function` | Function definitions |
| `method` | Class methods |
| `fact` | General knowledge facts |
| `claim` | Academic/research claims |
| `source` | Citation or reference |
| `witness` | Legal testimony statements |

**Edge Relation Types:**

| Relation | Meaning |
|----------|---------|
| `IMPORTS` | Module dependency relationship |
| `INHERITS_FROM` | Class inheritance |
| `DEFINES` | Container defines a member |
| `OVERRIDES` | Method overrides parent |
| `CALLS` | Function invocation |
| `REFERS_TO` | General reference |
| `CONTRADICTS` | Logical opposition between facts |

Source: [src/thought/layers/graph.py](https://github.com/RNBBarrett/thought-mcp/blob/main/src/thought/layers/graph.py)

### Temporal Layer (`src/thought/layers/temporal.py`)

The temporal layer implements bi-temporal data modeling, tracking both valid time and learned time for all entities. This enables sophisticated time-travel queries and contradiction detection.

**Bi-Temporal Model:**

```mermaid
graph LR
    A[Entity] --> B[valid_from<br/>When fact became true]
    A --> C[learned_at<br/>When KB learned fact]
    D[as_of valid] --> E[Historical state query]
    D --> F[as_of learned<br/>System knowledge query]
```

**Key Temporal Features:**

- `valid_from`: Timestamp when the fact became true in reality
- `learned_at`: Timestamp when the system recorded the fact
- `valid_until`: Optional expiration of fact validity
- `CONTRADICTS` edges: Automatically surface when facts conflict across time axes

Source: [src/thought/layers/temporal.py](https://github.com/RNBBarrett/thought-mcp/blob/main/src/thought/layers/temporal.py)

## Core Data Models

### Entity Model (`src/thought/models.py`)

The base `Entity` model represents all stored knowledge items in the system.

```python
from datetime import datetime


class Entity:
    id: str                     # Unique identifier
    name: str                   # Display name
    type: str                   # Entity type (see table above)
    scope: str                  # "shared" or "private"
    owner_id: str | None        # Owner for private entities
    valid_from: datetime        # When fact became true
    learned_at: datetime        # When system learned it
    source_ref: str | None      # Reference to source document
    tier: str                   # "hot", "warm", "cold" (access frequency)
    attrs: dict                 # Type-specific attributes

**Entity Attributes by Type:**

| Entity Type | Key Attributes |
|-------------|----------------|
| `code_*` | `code_file`, `code_language`, `code_commit_sha`, `signature`, `visibility`, `line_start`, `line_end` |
| `fact` | `predicates`, `unique_predicates`, `source_doc` |
| `claim` | `citation_key`, `reliability_score` |

Source: [src/thought/models.py](https://github.com/RNBBarrett/thought-mcp/blob/main/src/thought/models.py)

### Code Entities (`src/thought/ingest/entities.py`)

Code-specific entities extend the base model with language-aware attributes:

```python
class CodeEntity:
    name: str
    type_: str                  # "module", "class", "function", "method"
    language: str               # "python", "typescript", "rust", "php"
    file_path: str
    line_start: int
    line_end: int
    signature: str              # Function/class signature
    visibility: str             # "public", "private", "protected"
    docstring: str | None
    attrs: dict                 # Language-specific (e.g., `class` for methods)
```

### Edge Model (`src/thought/ingest/entities.py`)

Relationships between entities are modeled as typed, directed edges:

```python
class CodeEdge:
    source_name: str             # Origin entity
    target_name: str             # Destination entity
    relation_type: str          # IMPORTS, DEFINES, INHERITS_FROM, etc.
    line_number: int | None
    attrs: dict
```

Source: [src/thought/ingest/entities.py](https://github.com/RNBBarrett/thought-mcp/blob/main/src/thought/ingest/entities.py)

## Consolidation Engine (`src/thought/consolidation/engine.py`)

The consolidation engine handles fact deduplication, merging, and contradiction detection. It processes incoming data through a pipeline that ensures data quality and consistency.

```mermaid
graph TD
    A[Raw Input] --> B[Jaccard Deduplication]
    B --> C[Fact Extraction]
    C --> D[Predicate Matching]
    D --> E{Conflict?}
    E -->|Yes| F[Create CONTRADICTS Edge]
    E -->|No| G[Merge into KB]
    F --> G
```

**Consolidation Pipeline Steps:**

1. **Jaccard Deduplication**: Skip content that overlaps an existing fact by more than 50%
2. **Fact Extraction**: Parse structured predicates from unstructured text
3. **Predicate Matching**: Match against existing knowledge using unique predicates
4. **Contradiction Detection**: Create `CONTRADICTS` edges when facts conflict
5. **Entity Merging**: Upsert with identity `(name, code_file, code_commit_sha)`

Source: [src/thought/consolidation/engine.py](https://github.com/RNBBarrett/thought-mcp/blob/main/src/thought/consolidation/engine.py)
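The deduplication step can be sketched as follows. This is an illustration only: tokenizing on whitespace and the helper names `jaccard` and `is_duplicate` are assumptions, and the engine may compute overlap differently. Only the 50% threshold comes from the step list above.

```python
def jaccard(a: str, b: str) -> float:
    """Jaccard similarity over lowercase, whitespace-separated tokens."""
    set_a, set_b = set(a.lower().split()), set(b.lower().split())
    if not set_a and not set_b:
        return 1.0  # two empty texts are treated as identical
    return len(set_a & set_b) / len(set_a | set_b)


def is_duplicate(candidate: str, existing: list[str], threshold: float = 0.5) -> bool:
    """True when the candidate overlaps any existing fact by more than the threshold."""
    return any(jaccard(candidate, fact) > threshold for fact in existing)


known = ["the api uses oauth2 for authentication"]
dup = is_duplicate("the api uses oauth2 for user authentication", known)
new = is_duplicate("deployments run on kubernetes", known)
```

The first candidate shares six of seven distinct tokens with the known fact and is skipped; the second shares none and proceeds to fact extraction.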

## Storage Backend

The system uses SQLite as its primary storage engine with the following schema:

```mermaid
graph TD
    A[SQLite Database] --> B[entities table]
    A --> C[edges table]
    A --> D[embeddings table]
    A --> E[applied_migrations table]
    B --> F[code_file<br/>code_language<br/>code_commit_sha]
    C --> G[relation_type<br/>source_name<br/>target_name]
    D --> H[model_name<br/>model_version<br/>vector BLOB]
```

**Key Backend Classes and Methods:**

| Class / Method | Responsibility |
|----------------|----------------|
| `Backend` | Core CRUD operations on entities/edges |
| `find_code_entity()` | Fast lookup by name + file/commit disambiguators |
| `upsert_entity()` | Insert or update with identity awareness |
| `store_embedding()` | Persist vectors with model metadata |

Source: [src/thought/storage/sqlite/backend.py](https://github.com/RNBBarrett/thought-mcp/blob/main/src/thought/storage/sqlite/backend.py) (inferred from CHANGELOG.md)

## Query Pathways

The memory system supports multiple query mechanisms:

### Recall (Vector Search)

```python
def recall(
    query: str,
    scope: str = "all",
    owner_id: str | None = None,
    max_results: int = 10,
) -> list[RecallResult]
```

Returns up to `max_results` semantically similar entities (10 by default), ranked by embedding similarity.
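Embedding-based ranking of this kind can be sketched with plain cosine similarity. The names `cosine` and `top_k` and the in-memory `index` dictionary are hypothetical; the actual `recall()` reads vectors from the SQLite embeddings table and may use a different similarity measure.

```python
import math


def cosine(u: list[float], v: list[float]) -> float:
    """Cosine similarity; returns 0.0 when either vector has zero length."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def top_k(query_vec: list[float], index: dict[str, list[float]],
          max_results: int = 10) -> list[tuple[str, float]]:
    """Score every stored vector against the query and keep the best matches."""
    scored = [(name, cosine(query_vec, vec)) for name, vec in index.items()]
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:max_results]


index = {"a": [1.0, 0.0], "b": [0.0, 1.0], "c": [1.0, 1.0]}
hits = top_k([1.0, 0.0], index, max_results=2)
hit_names = [name for name, _ in hits]
```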

### Ask (Natural Language to Cypher)

Routes natural-language questions through an LLM to generate Cypher queries:

```
QUESTION: "who calls authenticate_user"
→ CYPHER: MATCH (caller)-[:CALLS]->(f:Function {name: 'authenticate_user'}) 
          RETURN caller.name
```

Source: [src/thought/query/ask.py](https://github.com/RNBBarrett/thought-mcp/blob/main/src/thought/query/ask.py)

### Code Intelligence Queries

| Command | Purpose |
|---------|---------|
| `thought callers <name>` | Direct callers, ranked by Personalized PageRank |
| `thought impact <name>` | Transitive impact set (what breaks if changed) |
| `thought diff --from SHA1 --to SHA2` | Entity diff between commits |

## Ingest Pipelines

Code ingestion follows a standardized pipeline:

```mermaid
graph TD
    A[Source File] --> B[Language Detection]
    B --> C[AST Parser<br/>tree-sitter]
    C --> D[Extractor<br/>Language-specific]
    D --> E[CodeEntity list]
    D --> F[CodeEdge list]
    E --> G[Embedding]
    G --> H[Backend upsert]
    F --> H
    H --> I[Call Graph Builder<br/>optional]
```

**Supported Languages:**

- Python (`.py`) - via tree-sitter-python
- TypeScript (`.ts`, `.tsx`) - via tree-sitter-typescript
- Rust (`.rs`) - via tree-sitter-rust
- PHP (`.php`) - via tree-sitter-php

**Extracted Metadata:**

- Module/namespace names
- Class declarations with heritage (extends, implements)
- Function and method definitions
- Import/use declarations
- Visibility modifiers (public, private, protected)

Source: [src/thought/ingest/code/python_extractor.py](https://github.com/RNBBarrett/thought-mcp/blob/main/src/thought/ingest/code/python_extractor.py), [src/thought/ingest/code/typescript_extractor.py](https://github.com/RNBBarrett/thought-mcp/blob/main/src/thought/ingest/code/typescript_extractor.py)

## Auto-Memory Hooks

The system integrates with Claude Code via hooks for automatic memory management:

| Hook | Event | Action |
|------|-------|--------|
| `recall` | `UserPromptSubmit` | Embeds prompt, recalls relevant context |
| `write` | `Stop` | Extracts facts from session transcript |
| `context` | `SessionStart` | Loads relevant context for new session |

Source: [src/thought/hooks/install.py](https://github.com/RNBBarrett/thought-mcp/blob/main/src/thought/hooks/install.py)

## Versioning and Snapshots

The storage layer supports full database lifecycle management:

| Operation | Description |
|-----------|-------------|
| `db size` | Disk usage + entity/edge counts |
| `db flush` | Wipe KB with date-bounded options |
| `db backup <file>` | SQLite online backup snapshot |
| `db load <file>` | Restore or merge from snapshot |
| `db inspect <file>` | Preview backup without loading |

WAL (Write-Ahead Logging) checkpoints ensure consistent backups.
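For readers unfamiliar with SQLite's online backup, the sketch below shows the mechanism using Python's standard `sqlite3.Connection.backup`. Whether `thought db backup` is built on this exact call is an assumption; the `backup_database` helper and the toy `entities` table are hypothetical.

```python
import os
import sqlite3
import tempfile


def backup_database(src: sqlite3.Connection, dest_path: str) -> None:
    """Copy a live database into dest_path using SQLite's online backup API."""
    dest = sqlite3.connect(dest_path)
    try:
        src.backup(dest)  # consistent snapshot, even while src stays open
    finally:
        dest.close()


# Toy source database standing in for the knowledge base.
src = sqlite3.connect(":memory:")
src.execute("CREATE TABLE entities (id TEXT PRIMARY KEY, name TEXT)")
src.execute("INSERT INTO entities VALUES ('e1', 'authenticate_user')")
src.commit()

snapshot_path = os.path.join(tempfile.mkdtemp(), "snapshot.db")
backup_database(src, snapshot_path)

# Inspect the snapshot without touching the source database.
restored = sqlite3.connect(snapshot_path)
restored_count = restored.execute("SELECT COUNT(*) FROM entities").fetchone()[0]
restored.close()
```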

## Summary

The thought-mcp memory model implements a production-grade knowledge management system with:

1. **Three-layer architecture**: Vector for semantics, Graph for structure, Temporal for history
2. **Bi-temporal semantics**: Tracks both validity and knowledge acquisition times
3. **Code-aware extraction**: AST-based parsing for multiple programming languages
4. **Contradiction detection**: Automatic `CONTRADICTS` edges between conflicting facts
5. **Multiple query pathways**: Semantic recall, natural-language Cypher, and code-intelligence commands
6. **Git-aware versioning**: Commits can be stamped on entities for historical queries

This architecture enables sophisticated AI memory capabilities while maintaining query performance through strategic use of SQLite with proper indexing.

---

<a id='page-query-system'></a>

## Query and Retrieval System

### Related Pages

Related topics: [Memory Model and Data Structures](#page-memory-model), [Storage and Database Layer](#page-storage-layer), [Agent Adapters and SDK Integration](#page-agent-adapters)

<details>
<summary>Related Source Files</summary>

The following source files were used to generate this page:

- [src/thought/query/__init__.py](https://github.com/RNBBarrett/thought-mcp/blob/main/src/thought/query/__init__.py)
- [src/thought/query/ask.py](https://github.com/RNBBarrett/thought-mcp/blob/main/src/thought/query/ask.py)
- [src/thought/query/cypher.py](https://github.com/RNBBarrett/thought-mcp/blob/main/src/thought/query/cypher.py)
- [src/thought/query/views.py](https://github.com/RNBBarrett/thought-mcp/blob/main/src/thought/query/views.py)
- [src/thought/hooks/recall.py](https://github.com/RNBBarrett/thought-mcp/blob/main/src/thought/hooks/recall.py)
</details>

# Query and Retrieval System

The Query and Retrieval System is a core subsystem within the thought-mcp project that enables users to query the knowledge graph using natural language. It translates human-readable questions into structured Cypher queries or SQL statements, executes them against the underlying SQLite backend, and returns ranked, relevant results. The system serves as the primary interface for retrieving facts, code entities, relationships, and historical data stored in the memory database.

## Architecture Overview

The Query and Retrieval System is composed of several interconnected modules that work together to process, route, and execute queries. At its core, the system leverages a Router to classify incoming queries into semantic categories, then delegates processing to specialized handlers based on the query type.

```mermaid
graph TD
    A[User Query] --> B[Router]
    B --> C{Code Query?}
    B --> D{Natural Language?}
    B --> E{Search Query?}
    C --> F[Code Layer]
    D --> G[Ask Module]
    G --> H[Cypher Translator]
    H --> I[Query Validator]
    I --> J[SQLite Backend]
    E --> K[Recall Hook]
    K --> J
    J --> L[Results]
    F --> L
```

The system follows a layered approach where queries are first classified by intent, then transformed into appropriate database queries. Natural language queries are translated to Cypher through an LLM-based translator, while code-specific queries bypass translation and directly execute predefined graph traversal operations.

## Query Classification

The Router module plays a critical role in determining how each query should be processed. Based on keyword detection and pattern matching, queries are classified into distinct types that trigger different handling paths.

### Query Types

| Query Type | Trigger Keywords | Handler | Use Case |
|------------|------------------|---------|----------|
| CODE | `function`, `class`, `caller`, `callee`, `impact`, file extensions, camelCase identifiers | CodeLayer | Code graph traversal |
| CHANGE | `since v1.0`, `before this commit`, `diff` | GitIngestReport | Version-aware queries |
| HYBRID | CODE × CHANGE combinations | GraphLayer + GitWalker | Historical code analysis |
| SEARCH | General text | Recall Hook | Semantic search |
| ASK | Natural language questions | Ask Module | Natural language to Cypher |

Source: [src/thought/query/ask.py:1-30](https://github.com/RNBBarrett/thought-mcp/blob/main/src/thought/query/ask.py)

### CODE Query Detection

The CODE query class is triggered by code-shaped keywords and identifier patterns: function names, class declarations, caller/callee relationships, and file extensions such as `.py` or `.ts`. Version-related phrases like `since v1.0` or `before this commit` belong to the CHANGE class in the table above; when they appear together with code-shaped terms, the query is treated as HYBRID. camelCase and snake_case identifiers also route to the CODE handler, so a query like "who calls authenticate_user" is processed through the call-graph machinery without explicit CLI invocation.

Source: [src/thought/query/ask.py:1-30](https://github.com/RNBBarrett/thought-mcp/blob/main/src/thought/query/ask.py)
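Keyword and pattern routing of this kind could look roughly like the sketch below. The keyword set is taken from the Query Types table, but the regular expressions, the `classify` function, and the decision order are assumptions rather than the Router's actual rules; the ASK class is omitted for brevity.

```python
import re

CODE_KEYWORDS = {"function", "class", "caller", "callee", "impact"}
CODE_EXTENSIONS = re.compile(r"\.(py|ts|tsx|rs|php)\b")
CAMEL_CASE = re.compile(r"\b[a-z]+[A-Z][A-Za-z0-9]*\b")
SNAKE_CASE = re.compile(r"\b[a-z][a-z0-9]*(?:_[a-z0-9]+)+\b")
CHANGE_PHRASES = re.compile(r"\bsince v\d|\bbefore this commit\b|\bdiff\b", re.IGNORECASE)


def classify(query: str) -> str:
    """Return CODE, CHANGE, HYBRID, or SEARCH for a free-text query."""
    words = set(re.findall(r"[a-z]+", query.lower()))
    is_code = bool(
        words & CODE_KEYWORDS
        or CODE_EXTENSIONS.search(query)
        or CAMEL_CASE.search(query)
        or SNAKE_CASE.search(query)
    )
    is_change = bool(CHANGE_PHRASES.search(query))
    if is_code and is_change:
        return "HYBRID"  # code-shaped terms plus a version phrase
    if is_code:
        return "CODE"
    if is_change:
        return "CHANGE"
    return "SEARCH"      # everything else goes to semantic search


routes = [
    classify("who calls authenticate_user"),
    classify("what changed since v1.0"),
    classify("callers of parseConfig since v1.0"),
    classify("notes about the quarterly budget"),
]
```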

## Natural Language to Cypher Translation

The Ask module (`src/thought/query/ask.py`) is responsible for translating natural language questions into Cypher queries. This translation is performed by an LLM provider configured in the `[llm]` section of the configuration file, supporting multiple backends through a unified interface.

### Translation Process

```mermaid
sequenceDiagram
    participant U as User
    participant A as Ask Module
    participant L as LLM Provider
    participant V as Cypher Validator
    participant B as SQLite Backend
    
    U->>A: "What functions call authenticate_user?"
    A->>A: Build Prompt with Schema
    A->>L: Send Prompt
    L-->>A: Cypher Query
    A->>V: Validate Cypher
    alt Valid
        V->>B: Execute Query
        B-->>V: Results
        V-->>U: Ranked Results
    else Invalid
        A->>A: Fallback to Recall
        A-->>U: Semantic Search Results
    end
```

The translation process begins with constructing a prompt that includes the database schema, entity types, and relationship types. The LLM generates a Cypher query that is then validated against a parser before execution. If validation fails or the query cannot be executed, the system gracefully falls back to a plain `recall()` call, ensuring the user always receives some response.

### Prompt Constraints

The Ask module enforces strict constraints on generated queries to maintain system safety and performance:

- Only read-only Cypher features are permitted: `MATCH`, `WHERE`, `RETURN`, `LIMIT`, and `AS_OF`
- `MERGE`, `CREATE`, `DELETE`, `SET`, and `WITH` are explicitly forbidden
- All entity types and relationship types must come from the defined schema
- The output must be a single Cypher query, with no explanation or markdown formatting

Source: [src/thought/query/ask.py:1-50](https://github.com/RNBBarrett/thought-mcp/blob/main/src/thought/query/ask.py)
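A guard enforcing the forbidden-clause rule might be sketched as below. The `check_read_only` function is hypothetical, and the requirement that a query start with `MATCH` is an added assumption; the project's validator is grammar-based and likely stricter. This keyword scan would also reject the legitimate `STARTS WITH` operator, which is one reason a real parser is preferable.

```python
import re

FORBIDDEN = ("MERGE", "CREATE", "DELETE", "SET", "WITH")
_FORBIDDEN_RE = re.compile(r"\b(" + "|".join(FORBIDDEN) + r")\b", re.IGNORECASE)
_STRING_LITERAL_RE = re.compile(r"'[^']*'|\"[^\"]*\"")


def check_read_only(cypher: str) -> tuple[bool, str]:
    """Return (ok, reason); reason is empty when the query passes."""
    # Blank out string literals so keywords inside values are ignored.
    stripped = _STRING_LITERAL_RE.sub("''", cypher)
    match = _FORBIDDEN_RE.search(stripped)
    if match:
        return False, f"forbidden clause: {match.group(1).upper()}"
    if not re.match(r"\s*MATCH\b", stripped, re.IGNORECASE):
        return False, "query must start with MATCH"
    return True, ""


ok_read = check_read_only(
    "MATCH (c)-[:CALLS]->(f:Function {name: 'authenticate_user'}) RETURN c.name"
)
bad_write = check_read_only("MATCH (n) SET n.name = 'x' RETURN n")
ok_literal = check_read_only("MATCH (n {name: 'create table'}) RETURN n")
```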

### AskResult Data Model

The `AskResult` dataclass encapsulates the outcome of a query translation and execution attempt:

| Field | Type | Description |
|-------|------|-------------|
| `cypher` | `str \| None` | The generated Cypher query |
| `sql` | `str \| None` | Alternative SQL query if applicable |
| `rows` | `list[dict[str, Any]] \| None` | Query results |
| `fallback_used` | `bool` | Whether fallback to recall was triggered |
| `fallback_reason` | `str` | Explanation if fallback occurred |

Source: [src/thought/query/ask.py:1-50](https://github.com/RNBBarrett/thought-mcp/blob/main/src/thought/query/ask.py)
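The sketch below shows how a result object with these fields could carry the fallback path described earlier. The field names follow the table; the default values, the `ask` function, and its callable parameters (`translate`, `execute`, `recall`) are hypothetical stand-ins for the LLM translator, the Cypher engine, and `recall()`.

```python
from dataclasses import dataclass
from typing import Any, Callable, Optional

Rows = list[dict[str, Any]]


@dataclass
class AskResult:
    cypher: Optional[str] = None
    sql: Optional[str] = None
    rows: Optional[Rows] = None
    fallback_used: bool = False
    fallback_reason: str = ""


def ask(question: str,
        translate: Callable[[str], str],
        execute: Callable[[str], Rows],
        recall: Callable[[str], Rows]) -> AskResult:
    """Try translation and execution; on any failure, fall back to recall."""
    cypher: Optional[str] = None
    try:
        cypher = translate(question)
        return AskResult(cypher=cypher, rows=execute(cypher))
    except Exception as exc:
        return AskResult(cypher=cypher, rows=recall(question),
                         fallback_used=True, fallback_reason=str(exc))


def _failing_execute(cypher: str) -> Rows:
    raise ValueError("unsupported clause")


result = ask(
    "who calls login",
    translate=lambda q: "MATCH (n) RETURN n",
    execute=_failing_execute,
    recall=lambda q: [{"name": "login"}],
)
```

Recording `fallback_reason` alongside the rows lets a caller distinguish graph answers from semantic-search answers without inspecting the rows themselves.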

## Recall Hook

The recall hook (`src/thought/hooks/recall.py`) provides semantic search functionality as a fallback mechanism and primary retrieval method. It uses embedding vectors to find semantically similar entities in the knowledge graph, supporting the core recall operation used throughout the system.

### Recall Behavior

Recall operations are bounded by design to prevent overwhelming the user with too many results. The system never returns more than 10 hits regardless of knowledge base size, encouraging users to narrow their queries using `as_of` and `scope` parameters for more targeted retrieval.

The recall mechanism supports bi-temporal queries through the `as_of_kind` parameter:

- `valid`: Returns what was true on a given date, answering "what was true on date X"
- `learned`: Returns what the system knew on a given date, answering "what did the system know on date X"

These two modes differ when facts are corrected after the fact, enabling users to perform historical analysis of their knowledge graph.

Source: [src/thought/query/views.py](https://github.com/RNBBarrett/thought-mcp/blob/main/src/thought/query/views.py)

## Code Layer

The Code Layer (`src/thought/layers/code.py`) provides a specialized interface for code-specific graph queries. It wraps the GraphLayer with operations native to programmers, operating against the currently-valid view of the code graph using the `valid_until IS NULL` filter.

### Core Operations

| Method | Description | Use Case |
|--------|-------------|----------|
| `callers_of(name)` | Direct callers ranked by PageRank | Finding who uses a function |
| `callees_of(name)` | Direct callees within the package | Finding what a function calls |
| `impact_set(name)` | Transitive callers ranked | Dependency analysis |
| `defines_in_file()` | All entities in a file | File-level inspection |

All four operations support optional `as_of` parameters to query historical snapshots when bi-temporal git ingest has been configured. The `code_commit_sha` field enables time-travel queries against the code graph.

Source: [src/thought/layers/code.py:1-50](https://github.com/RNBBarrett/thought-mcp/blob/main/src/thought/layers/code.py)
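The idea behind a transitive impact set can be sketched as a breadth-first walk over reversed `CALLS` edges. This is a simplified illustration: the standalone `impact_set` function and the in-memory `callers` mapping are hypothetical, and the result is in discovery order rather than ranked as the table describes.

```python
from collections import deque


def impact_set(callers: dict[str, set[str]], name: str) -> list[str]:
    """Collect every direct and indirect caller of `name`.

    `callers` maps a callee to the set of functions that call it directly.
    """
    seen: set[str] = set()
    order: list[str] = []
    queue = deque([name])
    while queue:
        current = queue.popleft()
        for caller in sorted(callers.get(current, ())):
            if caller not in seen and caller != name:
                seen.add(caller)
                order.append(caller)
                queue.append(caller)
    return order


callers = {
    "hash_password": {"authenticate_user"},
    "authenticate_user": {"api_login", "login_view"},
    "api_login": {"hash_password"},  # cycle, to show the walk terminates
}
affected = impact_set(callers, "hash_password")
```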

### Entity Resolution

The `_resolve_entity_id` method handles name resolution with multiple fallback strategies:

1. Intra-file match with exact name
2. Cross-file match with unique qualified suffix
3. Cross-file bare-name match for top-level functions
4. Stub creation for unresolved references

This multi-stage resolution lets a call such as `obj.method()` resolve to `ClassName.method` when that name is unique in the knowledge base, and lets bare function names be found across files.

Source: [src/thought/query/cypher.py](https://github.com/RNBBarrett/thought-mcp/blob/main/src/thought/query/cypher.py)

## Cypher Query Engine

The Cypher module (`src/thought/query/cypher.py`) handles the parsing, validation, and execution of Cypher queries against the SQLite backend. It provides a bridge between the graph query language and the relational database storage.

### Query Validation

Before executing any Cypher query, the system validates it against the defined grammar to prevent malformed queries from reaching the database. This validation step catches syntax errors, unsupported features, and schema violations before they can cause runtime errors.

### Execution Model

Cypher queries are translated into equivalent SQL statements that operate against the SQLite schema. The translation preserves the semantic meaning of graph patterns while adapting them to the relational storage model used by the backend.
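To make the mapping concrete, the sketch below pairs one Cypher pattern with a hand-written SQL equivalent. The two-table schema is a simplified assumption loosely based on the column names mentioned in this documentation, not the real schema, and the SQL is written by hand rather than produced by the project's translator.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE entities (name TEXT PRIMARY KEY, type TEXT, valid_until TEXT);
CREATE TABLE edges (source_name TEXT, target_name TEXT, relation_type TEXT);
INSERT INTO entities VALUES ('login_view', 'function', NULL),
                            ('authenticate_user', 'function', NULL);
INSERT INTO edges VALUES ('login_view', 'authenticate_user', 'CALLS');
""")

# Cypher:
#   MATCH (caller)-[:CALLS]->(f:Function {name: 'authenticate_user'})
#   RETURN caller.name
SQL = """
SELECT caller.name
FROM edges AS r
JOIN entities AS caller ON caller.name = r.source_name  -- (caller)
JOIN entities AS f ON f.name = r.target_name            -- (f:Function)
WHERE r.relation_type = 'CALLS'                         -- [:CALLS]
  AND f.type = 'function'
  AND f.name = ?                                        -- {name: ...}
  AND caller.valid_until IS NULL
  AND f.valid_until IS NULL
"""
caller_names = [row[0] for row in conn.execute(SQL, ("authenticate_user",))]
```

Each node in the pattern becomes a join against the entity table, and the relationship becomes a row in the edge table filtered by type.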

## Views and Data Models

The views module (`src/thought/query/views.py`) defines the data structures and return formats used throughout the Query and Retrieval System.

### Entity Model

The `Entity` model represents nodes in the knowledge graph with the following key attributes:

| Attribute | Type | Description |
|-----------|------|-------------|
| `id` | `str` | Unique identifier |
| `type` | `str` | Entity type (PERSON, function, class, etc.) |
| `name` | `str` | Display name |
| `canonical_name` | `str` | Normalized name for matching |
| `scope` | `ScopeName` | shared, private, or all |
| `tier` | `Tier` | hot, warm, or cold |
| `valid_from` | `datetime` | Start of validity period |
| `valid_until` | `datetime \| None` | End of validity period |
| `attrs` | `dict[str, object]` | Additional attributes |

### Scope Filter

The `ScopeFilter` class determines visibility of entities based on ownership and scope:

- `shared`: All entities with scope = "shared"
- `private`: Entities matching both scope = "private" AND owner_id
- `all`: Shared entities plus private entities owned by the requesting user

The scope filter generates SQL fragments that join against the entity table aliased as `e`, enabling fine-grained access control across the query system.

Source: [src/thought/models.py:1-80](https://github.com/RNBBarrett/thought-mcp/blob/main/src/thought/models.py)
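The three scope rules can be expressed as parameterized SQL fragments, as in the sketch below. The `scope_fragment` function and the toy table are hypothetical; only the `e` alias and the `scope` and `owner_id` semantics come from the description above.

```python
import sqlite3
from typing import Optional


def scope_fragment(scope: str, owner_id: Optional[str] = None) -> tuple[str, list]:
    """Return a WHERE fragment (for alias `e`) and its bound parameters."""
    if scope == "shared":
        return "e.scope = 'shared'", []
    if scope == "private":
        return "(e.scope = 'private' AND e.owner_id = ?)", [owner_id]
    if scope == "all":
        return ("(e.scope = 'shared' OR "
                "(e.scope = 'private' AND e.owner_id = ?))"), [owner_id]
    raise ValueError(f"unknown scope: {scope!r}")


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE entities (name TEXT, scope TEXT, owner_id TEXT)")
conn.executemany(
    "INSERT INTO entities VALUES (?, ?, ?)",
    [("style guide", "shared", None),
     ("alice notes", "private", "alice"),
     ("bob notes", "private", "bob")],
)

where, params = scope_fragment("all", "alice")
visible = [row[0] for row in conn.execute(
    f"SELECT e.name FROM entities AS e WHERE {where} ORDER BY e.name", params
)]
```

Binding `owner_id` as a parameter, instead of formatting it into the SQL string, keeps the fragment safe to compose with other query parts.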

## CLI Commands

The Query and Retrieval System is exposed through several CLI commands under the `thought` command group:

| Command | Description |
|---------|-------------|
| `thought recall <query>` | Semantic search across the knowledge graph |
| `thought ask <question>` | Natural language query with Cypher translation |
| `thought callers <name>` | Find direct callers ranked by PageRank |
| `thought callees <name>` | Find direct callees within the package |
| `thought impact <name>` | Transitive impact set analysis |
| `thought browse <name>` | Drill into a topic with PPR-ranked neighborhood |
| `thought diff --from <sha1> --to <sha2>` | Compare entities between commits |

### Browse Command

The browse command (`mcp__thought__browse_topic`) implements a two-step resolution process. First, the name is matched against entity types for a type facet. If no type matches, the name is resolved as an entity using canonical-name matching, and the PPR-ranked neighborhood is returned. The `via` field in results indicates whether the hit came from `type_facet`, `ppr`, or `bfs` matching.

Source: [src/thought/cli.py](https://github.com/RNBBarrett/thought-mcp/blob/main/src/thought/cli.py)

## Configuration

The Query and Retrieval System respects configuration from the `thought.toml` file and environment variables:

| Option | Default | Description |
|--------|---------|-------------|
| `embedder` | `auto` | Embedder selection: auto, sentence-transformers, or deterministic |
| `llm.provider` | `openai` | LLM provider for Ask module |
| `llm.model` | varies | Model name for translation |
| `db_path` | `.thought/thought.db` | SQLite database path |

The `auto` embedder selector probes the `sentence_transformers` package via `importlib.util.find_spec` before returning the wrapper, falling back to the deterministic embedder when the optional dependency is missing.
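The probing step can be sketched as follows. `importlib.util.find_spec` is the standard-library call named above; the `select_embedder` function and its string return values are hypothetical simplifications, since the real selector returns embedder objects.

```python
import importlib.util


def select_embedder(choice: str = "auto") -> str:
    """Resolve the `embedder` option to a concrete embedder name."""
    if choice != "auto":
        return choice  # an explicit setting wins
    # find_spec checks availability without importing the heavy package.
    if importlib.util.find_spec("sentence_transformers") is not None:
        return "sentence-transformers"
    return "deterministic"


resolved = select_embedder()
explicit = select_embedder("deterministic")
```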

## Integration Points

The Query and Retrieval System integrates with several other subsystems:

- **Storage Layer**: SQLite backend provides entity and edge persistence
- **Ingest System**: Code extractors populate entities that are later queried
- **Memory Module**: Coordinates between recall, browse, and scan operations
- **Server**: Exposes query functionality via MCP protocol

The bidirectional relationship between the Code Layer and the Cypher query engine enables both natural language queries like "who calls authenticate_user" and structured queries using the CODE query class, providing flexibility for different user interaction patterns.

## Error Handling

The system implements graceful degradation throughout the query pipeline. If Cypher translation fails or validation rejects the generated query, execution falls back to the recall hook, ensuring users always receive results. Bounded result sets prevent resource exhaustion, and the contradiction detection mechanism surfaces conflicts as `CONTRADICTS` edges in the graph rather than throwing errors, allowing downstream applications to handle them as data.


---

<a id='page-code-parsing'></a>

## Multi-Language Code Parsing

### Related Pages

Related topics: [Git History Integration](#page-git-integration), [Storage and Database Layer](#page-storage-layer)

<details>
<summary>Related Source Files</summary>

The following source files were used to generate this page:

- [src/thought/ingest/code/__init__.py](https://github.com/RNBBarrett/thought-mcp/blob/main/src/thought/ingest/code/__init__.py)
- [src/thought/ingest/code/ast_extractor.py](https://github.com/RNBBarrett/thought-mcp/blob/main/src/thought/ingest/code/ast_extractor.py)
- [src/thought/ingest/code/call_graph.py](https://github.com/RNBBarrett/thought-mcp/blob/main/src/thought/ingest/code/call_graph.py)
- [src/thought/ingest/code/pipeline.py](https://github.com/RNBBarrett/thought-mcp/blob/main/src/thought/ingest/code/pipeline.py)
- [src/thought/ingest/code/python_extractor.py](https://github.com/RNBBarrett/thought-mcp/blob/main/src/thought/ingest/code/python_extractor.py)
- [src/thought/ingest/code/typescript_extractor.py](https://github.com/RNBBarrett/thought-mcp/blob/main/src/thought/ingest/code/typescript_extractor.py)
- [src/thought/ingest/code/go_extractor.py](https://github.com/RNBBarrett/thought-mcp/blob/main/src/thought/ingest/code/go_extractor.py)
- [src/thought/ingest/code/rust_extractor.py](https://github.com/RNBBarrett/thought-mcp/blob/main/src/thought/ingest/code/rust_extractor.py)
- [src/thought/ingest/code/java_extractor.py](https://github.com/RNBBarrett/thought-mcp/blob/main/src/thought/ingest/code/java_extractor.py)
- [src/thought/ingest/code/php_extractor.py](https://github.com/RNBBarrett/thought-mcp/blob/main/src/thought/ingest/code/php_extractor.py)
- [src/thought/layers/code.py](https://github.com/RNBBarrett/thought-mcp/blob/main/src/thought/layers/code.py)
</details>

# Multi-Language Code Parsing

The Multi-Language Code Parsing system is the foundational code-vertical layer in THOUGHT. It provides language-agnostic AST extraction across six programming languages using tree-sitter grammars, produces standardized code entities and relationship edges, and enables downstream features like caller analysis, impact queries, and cross-file call-graph resolution.

## Overview

The parsing system operates in two phases:

1. **Phase 1 – AST Extraction**: Each language has a dedicated extractor that walks the tree-sitter parse tree and emits `CodeEntity` and `CodeEdge` objects.
2. **Phase 2 – Call Graph Resolution**: After all files are ingested, a separate pass resolves `CALLS` edges by matching callee names against the entity index.

Source: [src/thought/ingest/code/ast_extractor.py:1-15](https://github.com/RNBBarrett/thought-mcp/blob/main/src/thought/ingest/code/ast_extractor.py)

## Supported Languages

The system supports six languages through language-specific extractors:

| Language | Extractor File | Tree-sitter Grammar |
|----------|----------------|---------------------|
| Python | `python_extractor.py` | `tree-sitter-python` |
| TypeScript / TSX / JSX | `typescript_extractor.py` | `tree-sitter-typescript` |
| Go | `go_extractor.py` | `tree-sitter-go` |
| Rust | `rust_extractor.py` | `tree-sitter-rust` |
| Java | `java_extractor.py` | `tree-sitter-java` |
| PHP | `php_extractor.py` | `tree-sitter-php` |

Source: [src/thought/ingest/code/ast_extractor.py:30-55](https://github.com/RNBBarrett/thought-mcp/blob/main/src/thought/ingest/code/ast_extractor.py)

## Architecture

```mermaid
graph TD
    A[Code File] --> B[Language Detection]
    B --> C[ast_extractor.py Dispatcher]
    C --> D{Python?}
    C --> E{TypeScript?}
    C --> F{Go?}
    C --> G{Rust?}
    C --> H{Java?}
    C --> I{PHP?}
    D --> J[python_extractor.extract]
    E --> K[typescript_extractor.extract]
    F --> L[go_extractor.extract]
    G --> M[rust_extractor.extract]
    H --> N[java_extractor.extract]
    I --> O[php_extractor.extract]
    J --> P[(CodeEntity, CodeEdge)]
    K --> P
    L --> P
    M --> P
    N --> P
    O --> P
    P --> Q[CodeIngestPipeline]
    Q --> R[build_call_graph]
    R --> S[(CALLS Edges)]
```

### Dispatcher Pattern

The `ast_extractor.py` module uses lazy loading to avoid importing heavy tree-sitter C extensions at module load time:

```python
_REGISTRY: dict[str, Callable[[str, str], tuple[list[CodeEntity], list[CodeEdge]]]] = {}

def _python_extractor():
    from . import python_extractor
    return python_extractor.extract
```

Each language loader is registered in `_LOADERS` and invoked only when that language is first requested. Source: [src/thought/ingest/code/ast_extractor.py:9-35](https://github.com/RNBBarrett/thought-mcp/blob/main/src/thought/ingest/code/ast_extractor.py)
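How a loader table and a cache fit together can be sketched as below. The names `_LOADERS` and `_REGISTRY` appear in the source; everything else (`get_extractor`, `_load_python`, `load_log`) is hypothetical, and the loader returns a stub instead of importing a real tree-sitter extractor.

```python
from typing import Callable

Extractor = Callable[[str, str], tuple[list, list]]

load_log: list[str] = []  # records each time a loader actually runs


def _load_python() -> Extractor:
    load_log.append("python")  # stands in for the deferred, expensive import

    def extract(source: str, file_path: str) -> tuple[list, list]:
        return [], []  # stub: a real extractor returns entities and edges

    return extract


_LOADERS: dict[str, Callable[[], Extractor]] = {"python": _load_python}
_REGISTRY: dict[str, Extractor] = {}


def get_extractor(language: str) -> Extractor:
    """Load the extractor on first use, then serve it from the cache."""
    if language not in _REGISTRY:
        try:
            loader = _LOADERS[language]
        except KeyError:
            raise ValueError(f"unsupported language: {language}") from None
        _REGISTRY[language] = loader()
    return _REGISTRY[language]


first = get_extractor("python")
second = get_extractor("python")
```

The second lookup returns the cached callable, so the cost of loading a grammar is paid at most once per language per process.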

## Data Models

### CodeEntity

Represents a code element extracted from the AST:

| Field | Type | Description |
|-------|------|-------------|
| `name` | `str` | Canonical identifier (module, function, class, method) |
| `type_` | `str` | Entity kind: `module`, `function`, `class`, `method` |
| `language` | `str` | Source language: `python`, `typescript`, `go`, `rust`, `java`, `php` |
| `file_path` | `str` | Path to source file (relative to repo root) |
| `line_start` | `int` | 1-indexed start line |
| `line_end` | `int` | 1-indexed end line |
| `signature` | `str` | Declaration signature (e.g., `module foo`, `def bar(self, x)`) |
| `docstring` | `str \| None` | Extracted docstring text |
| `visibility` | `str` | `public` or `private` based on naming conventions |
| `attrs` | `dict` | Language-specific metadata |

Source: [src/thought/ingest/code/python_extractor.py:14-25](https://github.com/RNBBarrett/thought-mcp/blob/main/src/thought/ingest/code/python_extractor.py)

### CodeEdge

Represents a relationship between entities:

| Field | Type | Description |
|-------|------|-------------|
| `source_name` | `str` | Entity that is the subject of the relation |
| `target_name` | `str` | Entity that is the object of the relation |
| `relation_type` | `str` | One of: `IMPORTS`, `INHERITS_FROM`, `DEFINES`, `OVERRIDES`, `CALLS` |
| `line_number` | `int` | Source line where the relationship was discovered |
| `attrs` | `dict` | Additional metadata (e.g., `from_import: true`) |

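Source: [src/thought/ingest/code/typescript_extractor.py:110-115](https://github.com/RNBBarrett/thought-mcp/blob/main/src/thought/ingest/code/typescript_extractor.py)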

## Extractor Interface

All language extractors share a common signature:

```python
def extract(source: str, file_path: str) -> tuple[list[CodeEntity], list[CodeEdge]]:
    ...
```

This uniform interface allows the dispatcher to route to any language without knowing implementation details. Source: [src/thought/ingest/code/python_extractor.py:28-40](https://github.com/RNBBarrett/thought-mcp/blob/main/src/thought/ingest/code/python_extractor.py)
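To show what an implementation of this interface produces, here is a toy extractor built on Python's standard `ast` module instead of tree-sitter. It is a substitute technique for illustration: the trimmed-down `CodeEntity` and `CodeEdge` dataclasses, the `module.name` qualification scheme, and the handling of only top-level definitions and plain `import` statements are all assumptions.

```python
import ast
from dataclasses import dataclass


@dataclass
class CodeEntity:  # reduced to a few of the documented fields
    name: str
    type_: str
    language: str
    file_path: str
    line_start: int
    line_end: int


@dataclass
class CodeEdge:
    source_name: str
    target_name: str
    relation_type: str
    line_number: int


def extract(source: str, file_path: str) -> tuple[list[CodeEntity], list[CodeEdge]]:
    """Emit a module entity plus top-level functions, classes, and imports."""
    tree = ast.parse(source)
    stem = file_path[:-3] if file_path.endswith(".py") else file_path
    module = stem.replace("/", ".")
    last_line = max(1, len(source.splitlines()))
    entities = [CodeEntity(module, "module", "python", file_path, 1, last_line)]
    edges: list[CodeEdge] = []
    for node in tree.body:
        if isinstance(node, ast.Import):
            edges.extend(CodeEdge(module, alias.name, "IMPORTS", node.lineno)
                         for alias in node.names)
            continue
        if isinstance(node, (ast.FunctionDef, ast.AsyncFunctionDef)):
            kind = "function"
        elif isinstance(node, ast.ClassDef):
            kind = "class"
        else:
            continue
        qualified = f"{module}.{node.name}"
        entities.append(CodeEntity(qualified, kind, "python", file_path,
                                   node.lineno, node.end_lineno or node.lineno))
        edges.append(CodeEdge(module, qualified, "DEFINES", node.lineno))
    return entities, edges


sample = "import os\n\nclass Greeter:\n    pass\n\ndef hello():\n    return 'hi'\n"
entities, edges = extract(sample, "pkg/greet.py")
```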

## Supported Edge Types

| Relation | Source | Target | Languages |
|----------|--------|--------|-----------|
| `IMPORTS` | module | imported module | Python, TypeScript, PHP, Go, Rust, Java |
| `INHERITS_FROM` | class | parent class | Python, TypeScript, Java, PHP |
| `DEFINES` | class/module | contained member | All languages |
| `OVERRIDES` | method | overridden method | TypeScript (currently) |
| `CALLS` | function/method | called function | All (via call-graph pass) |

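Source: [src/thought/ingest/code/python_extractor.py:1-15](https://github.com/RNBBarrett/thought-mcp/blob/main/src/thought/ingest/code/python_extractor.py), [src/thought/ingest/code/typescript_extractor.py:1-20](https://github.com/RNBBarrett/thought-mcp/blob/main/src/thought/ingest/code/typescript_extractor.py)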

## Language-Specific Extractors

### Python Extractor

The Python extractor uses `tree-sitter-python` and handles:

- Module entities as the root node
- Function definitions (`function_definition`)
- Class declarations (`class_definition`)
- Method definitions within classes
- Import statements (`import_from_statement`, `import_statement`)
- Class inheritance via the `superclasses` field

```python
def extract(source: str, file_path: str) -> tuple[list[CodeEntity], list[CodeEdge]]:
    parser = _get_parser()
    source_bytes = source.encode("utf-8")
    tree = parser.parse(source_bytes)
    root = tree.root_node

    module_name = _module_name_from_path(file_path)
    entities: list[CodeEntity] = []
    edges: list[CodeEdge] = []

    # The module itself is always the first entity emitted.
    entities.append(CodeEntity(
        name=module_name,
        type_="module",
        language="python",
        # ... remaining fields elided in this excerpt
    ))
```

Source: [src/thought/ingest/code/python_extractor.py:28-50](https://github.com/RNBBarrett/thought-mcp/blob/main/src/thought/ingest/code/python_extractor.py)

### TypeScript Extractor

The TypeScript extractor parses `.ts` files with the TypeScript grammar and switches to the TSX grammar for `.tsx` and `.jsx` files:

```python
def extract(source: str, file_path: str) -> tuple[list[CodeEntity], list[CodeEdge]]:
    use_tsx = file_path.endswith((".tsx", ".jsx"))
    parser = _get_parser(use_tsx=use_tsx)
    ...
```

Node types processed include `function_declaration`, `arrow_function`, `class_declaration`, `method_definition`, `import_statement`, and `export_statement`. Source: [src/thought/ingest/code/typescript_extractor.py:120-145](https://github.com/RNBBarrett/thought-mcp/blob/main/src/thought/ingest/code/typescript_extractor.py)

### PHP Extractor

The PHP extractor handles files starting with `<?php` and recursively scans for definitions nested under `namespace_definition` blocks:

```python
def _scan(node: Node) -> None:
    for child in node.named_children:
        ...
```

Source: [src/thought/ingest/code/php_extractor.py:45-60](https://github.com/RNBBarrett/thought-mcp/blob/main/src/thought/ingest/code/php_extractor.py)

### Rust Extractor

The Rust extractor uses `tree-sitter-rust`. For each method, it records the enclosing `impl` type in the `impl_type` attribute and derives visibility with `_rust_visibility`:

```python
out_entities.append(CodeEntity(
    name=qualified, type_="method", language="rust",
    visibility=_rust_visibility(child, source_bytes),
    attrs={"impl_type": type_name},
))
```

Source: [src/thought/ingest/code/rust_extractor.py:1-30](https://github.com/RNBBarrett/thought-mcp/blob/main/src/thought/ingest/code/rust_extractor.py)

## Call Graph Resolution

The call graph is built in a separate Phase 2 pass after all files are ingested. The `build_call_graph` function resolves callee references using a cascade of strategies:

1. **Exact match within same file** — direct intra-file resolution
2. **Qualified suffix match** — `obj.method()` resolves to `ClassName.method`
3. **Cross-file bare-name match** — top-level functions defined elsewhere
4. **Stub creation** — synthetic stub for unknown callees (filtered from impact graphs)

```python
tgt_id = backend.find_code_entity(
    canonical_name=callee_name, scope_filter=sf, code_file=file_path,
)
if tgt_id is None and "." not in callee_name:
    # Unique qualified suffix match.
    rows = backend._conn.execute(
        "SELECT id FROM entities "
        "WHERE type IN ('method','function') AND valid_until IS NULL "
        "AND canonical_name LIKE ? ...",
        (f"%.{callee_name.lower()}", commit_sha),
    ).fetchall()
```

Source: [src/thought/ingest/code/call_graph.py:1-60](https://github.com/RNBBarrett/thought-mcp/blob/main/src/thought/ingest/code/call_graph.py)
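The four-step cascade can be sketched over an in-memory list instead of the SQLite backend. The `resolve_callee` function, its strategy labels, and the `(qualified_name, file_path)` tuples are hypothetical; the real pass also filters by commit and by `valid_until IS NULL`, as the excerpt above shows.

```python
def resolve_callee(callee: str, file_path: str,
                   entities: list[tuple[str, str]]) -> tuple[str, str]:
    """Return (resolved_name, strategy) for a callee seen in file_path.

    `entities` holds (qualified_name, file_path) pairs.
    """
    # 1. Exact name defined in the same file.
    for name, path in entities:
        if name == callee and path == file_path:
            return name, "intra_file"
    # 2. Unique qualified suffix: "close" -> "Session.close".
    suffix_hits = [name for name, _ in entities if name.endswith("." + callee)]
    if len(suffix_hits) == 1:
        return suffix_hits[0], "qualified_suffix"
    # 3. Unique bare name defined in another file.
    bare_hits = [name for name, path in entities
                 if name == callee and path != file_path]
    if len(bare_hits) == 1:
        return bare_hits[0], "cross_file"
    # 4. Nothing matched: fall back to a synthetic stub.
    return callee, "stub"


entities = [("helper", "a.py"), ("Session.close", "b.py"), ("parse", "c.py")]
resolutions = [
    resolve_callee("helper", "a.py", entities),
    resolve_callee("close", "a.py", entities),
    resolve_callee("parse", "a.py", entities),
    resolve_callee("missing", "a.py", entities),
]
```

Requiring a unique hit in steps 2 and 3 avoids linking a call to the wrong target when several candidates share a name.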

## CodeIngestPipeline

The `CodeIngestPipeline` orchestrates the full ingest workflow:

1. Reads source file content
2. Detects or validates language
3. Calls the appropriate extractor
4. Creates a source reference record
5. Writes entities within a single transaction
6. Embeds entity signatures and docstrings for VIBE recall
7. Writes edges and resolves call graph

```mermaid
graph LR
    A[Source File] --> B[detect_language]
    B --> C[extract entities/edges]
    C --> D[upsert_source]
    D --> E[begin transaction]
    E --> F[_write_entities + embed]
    F --> G[_write_edges]
    G --> H[build_call_graph]
    H --> I[commit]
```

The pipeline embeds entity signatures and docstrings so that queries like "who calls authenticate_user" can find functions by intent rather than exact name. Source: [src/thought/ingest/code/pipeline.py:1-80]()

## CodeLayer API

The `CodeLayer` provides a high-level interface for code graph queries:

| Method | Description |
|--------|-------------|
| `callers_of(name)` | Direct callers, ranked by Personalized PageRank |
| `callees_of(name)` | Direct callees (intra-package) |
| `impact_set(name)` | Transitive callers, ranked — for `thought impact` command |
| `defines_in_file(path)` | All entities discovered in a file |

All methods operate against the currently-valid view (`valid_until IS NULL`). Pass `as_of=` for historical snapshots. Source: [src/thought/layers/code.py:1-40]()
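
To make "transitive callers" concrete, here is a small sketch of an impact set computed by breadth-first search over a reversed call graph. It is an illustration only: the standalone `impact_set` function and its dictionary input are assumptions, and it orders results by distance from the target rather than by the Personalized PageRank ranking the real `CodeLayer` is described as using.

```python
from collections import deque


def impact_set(calls: dict[str, set[str]], target: str) -> list[str]:
    """Transitive callers of `target`, nearest first (breadth-first order)."""
    # Invert caller -> callees into callee -> callers.
    callers: dict[str, set[str]] = {}
    for caller, callees in calls.items():
        for callee in callees:
            callers.setdefault(callee, set()).add(caller)

    seen = {target}
    order: list[str] = []
    queue = deque([target])
    while queue:
        current = queue.popleft()
        # Sort for a deterministic order among callers at the same depth.
        for caller in sorted(callers.get(current, ())):
            if caller not in seen:
                seen.add(caller)
                order.append(caller)
                queue.append(caller)
    return order


calls = {
    "route": {"login"},
    "login": {"authenticate_user"},
    "cli": {"authenticate_user"},
    "other": {"helper"},
}
impacted = impact_set(calls, "authenticate_user")
```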

## Git-Aware Ingest

The `GitWalker` enables two ingestion modes:

| Mode | Behavior |
|------|----------|
| `snapshot` (default) | Ingest HEAD only, stamp every entity with HEAD SHA |
| `full` | Walk every commit chronologically, stamp each entity with its commit SHA |

This enables bi-temporal `as_of` queries against historical commits. Source: [src/thought/ingest/code/git_pipeline.py:1-50]()

## Configuration

Language is auto-detected by file extension when `language=None`:

| Extension | Language |
|-----------|----------|
| `.py` | `python` |
| `.ts`, `.tsx`, `.js`, `.jsx` | `typescript` |
| `.go` | `go` |
| `.rs` | `rust` |
| `.java` | `java` |
| `.php` | `php` |

Pass `language=` explicitly to override detection. Source: [src/thought/ingest/code/pipeline.py:25-35]()
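
Extension-based detection with an explicit override can be sketched as a dictionary lookup. The mapping is copied from the table above; the function name `detect_language`, the case-insensitive suffix handling, and returning `None` for unknown extensions are assumptions rather than confirmed behavior.

```python
from pathlib import Path

# Extension table copied from the documentation above.
EXTENSION_TO_LANGUAGE = {
    ".py": "python",
    ".ts": "typescript", ".tsx": "typescript",
    ".js": "typescript", ".jsx": "typescript",
    ".go": "go", ".rs": "rust", ".java": "java", ".php": "php",
}


def detect_language(path, language=None):
    """Return the explicit override if given, else look up the file extension."""
    if language is not None:
        return language
    # Assumption: unknown extensions yield None instead of raising.
    return EXTENSION_TO_LANGUAGE.get(Path(path).suffix.lower())
```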

---

<a id='page-git-integration'></a>

## Git History Integration

### Related pages

Related topics: [Multi-Language Code Parsing](#page-code-parsing), [Memory Model and Data Structures](#page-memory-model)

<details>
<summary>Relevant source files</summary>

The following source files were used to generate this page:

- [src/thought/ingest/code/git_pipeline.py](https://github.com/RNBBarrett/thought-mcp/blob/main/src/thought/ingest/code/git_pipeline.py)
- [src/thought/ingest/code/git_walker.py](https://github.com/RNBBarrett/thought-mcp/blob/main/src/thought/ingest/code/git_walker.py)
- [src/thought/cli.py](https://github.com/RNBBarrett/thought-mcp/blob/main/src/thought/cli.py)
- [src/thought/layers/code.py](https://github.com/RNBBarrett/thought-mcp/blob/main/src/thought/layers/code.py)
- [CHANGELOG.md](https://github.com/RNBBarrett/thought-mcp/blob/main/CHANGELOG.md)
</details>

# Git History Integration

## Overview

Git History Integration enables thought-mcp to ingest source code with full commit-level provenance, allowing bi-temporal queries that can reconstruct what a codebase looked like at any point in its history. This feature stamps every extracted code entity (functions, classes, modules) with the exact git commit SHA where it was discovered, creating a temporal graph that supports "as-of" queries.

The system provides two ingestion modes: a fast **snapshot mode** for current-state analysis and a comprehensive **full-history mode** for complete historical reconstruction.

Source: [CHANGELOG.md]()

## Architecture

### Component Overview

```mermaid
graph TD
    subgraph "Git History Integration"
        CLI["thought ingest-git CLI"]
        Pipeline["GitIngestPipeline"]
        Walker["GitWalker"]
        Storage["SQLite Backend"]
    end
    
    subgraph "Git Operations"
        Git["git executable"]
        RevParse["rev-parse HEAD"]
        Log["log --format"]
        LsTree["ls-tree -r"]
        Show["show <sha>:<path>"]
    end
    
    CLI --> Pipeline
    Pipeline --> Walker
    Walker --> Git
    Git --> RevParse
    Git --> Log
    Git --> LsTree
    Git --> Show
    Pipeline --> Storage
```

### Data Flow

```mermaid
sequenceDiagram
    participant User
    participant CLI
    participant Pipeline
    participant Walker
    participant Extractor
    participant Backend
    
    User->>CLI: thought ingest-git /repo --mode full
    CLI->>Pipeline: run(repo_path, mode)
    
    alt snapshot mode
        Pipeline->>Walker: get_head_commit()
        Walker->>Git: rev-parse HEAD
        Git-->>Walker: sha
        Pipeline->>Pipeline: ingest single snapshot
    else full mode
        Pipeline->>Walker: get_all_commits()
        Walker->>Git: log --format
        Git-->>Walker: commit list
        loop for each commit
            Pipeline->>Git: ls-tree -r sha
            Pipeline->>Git: show sha:path
            Git-->>Pipeline: file content
            Pipeline->>Extractor: extract(entities, edges)
            Extractor-->>Pipeline: CodeEntity[], CodeEdge[]
            Pipeline->>Backend: upsert with commit_sha
        end
    end
    
    Pipeline-->>User: GitIngestReport
```

Source: [src/thought/ingest/code/git_pipeline.py:1-95]()
Source: [src/thought/ingest/code/git_walker.py:1-60]()

## Core Components

### GitWalker

The `GitWalker` class provides a read-only interface to git repositories using pure subprocess calls. It deliberately avoids native dependencies like `pygit2` to minimize installation footprint.

| Method | Git Command | Purpose |
|--------|-------------|---------|
| `get_head_sha()` | `rev-parse HEAD` | Get current HEAD commit SHA |
| `get_all_commits()` | `log --format=...` | List all commits chronologically |
| `get_files_at_commit(sha)` | `ls-tree -r <sha>` | List files in tree at commit |
| `get_file_at_commit(sha, path)` | `show <sha>:<path>` | Get file content at commit |

#### Commit Data Model

```python
@dataclass(frozen=True)
class Commit:
    sha: str                    # Full commit SHA
    author: str                 # Author name
    author_email: str           # Author email
    author_date: datetime       # Author timestamp
    subject: str                # Commit message first line
```

Source: [src/thought/ingest/code/git_walker.py:24-31]()
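
As an illustration of how subprocess output could be turned into `Commit` records, the sketch below parses `git log` output that uses an ASCII unit separator between fields. The format string, the `parse_log` and `git_log` helpers, and the use of `--reverse` for chronological order are assumptions; the actual format string used by `GitWalker` is elided in the table above.

```python
import subprocess
from dataclasses import dataclass
from datetime import datetime

# Assumed format: hash, author name, author email, ISO author date, subject,
# separated by the ASCII unit separator (%x1f).
LOG_FORMAT = "%H%x1f%an%x1f%ae%x1f%aI%x1f%s"


@dataclass(frozen=True)
class Commit:
    sha: str
    author: str
    author_email: str
    author_date: datetime
    subject: str


def parse_log(output: str) -> list[Commit]:
    """Parse one commit per line from `git log --format=<LOG_FORMAT>` output."""
    commits = []
    for line in output.split("\n"):
        if not line:
            continue
        sha, author, email, date, subject = line.split("\x1f", 4)
        commits.append(Commit(sha, author, email, datetime.fromisoformat(date), subject))
    return commits


def git_log(repo: str) -> list[Commit]:
    """Run git in `repo` and return commits oldest first."""
    result = subprocess.run(
        ["git", "-C", repo, "log", "--reverse", f"--format={LOG_FORMAT}"],
        capture_output=True, text=True, check=True,
    )
    return parse_log(result.stdout)
```

Using a control character as the separator avoids ambiguity when a commit subject contains commas, pipes, or tabs.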

#### Initialization Validation

```python
def __init__(self, repo_path: Path | str) -> None:
    self.repo = Path(repo_path)
    if shutil.which("git") is None:
        raise RuntimeError("git executable not on PATH")
    if not (self.repo / ".git").exists():
        raise ValueError(f"not a git repository: {self.repo}")
```

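The walker validates that:
1. The `git` executable is on `PATH`; otherwise it raises `RuntimeError`.
2. The target path contains a `.git` entry; otherwise it raises `ValueError`. Because the check uses `exists()`, it only tests for presence and does not verify that the repository is intact.

Source: [src/thought/ingest/code/git_walker.py:35-42]()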

### GitIngestPipeline

The pipeline orchestrates the complete ingestion process, coordinating between git history traversal and code extraction.

| Parameter | Type | Description |
|-----------|------|-------------|
| `repo_path` | `Path` | Path to git repository |
| `mode` | `GitMode` | `"snapshot"` (HEAD only) or `"full"` (all commits) |
| `patterns` | `tuple[str, ...]` | Glob patterns to filter files (e.g., `*.py`) |

#### Ingestion Report

```python
@dataclass(frozen=True)
class GitIngestReport:
    head_sha: str           # SHA of HEAD at time of ingest
    mode: GitMode           # Mode used for ingestion
    commits_visited: int    # Number of commits processed
    files_ingested: int     # Total files ingested
    call_edges: int         # Call graph edges created
```

Source: [src/thought/ingest/code/git_pipeline.py:35-41]()

## Ingestion Modes

### Snapshot Mode (Default)

Snapshot mode ingests only the current HEAD commit. This is the recommended mode for:

- Initial repository ingestion
- Quick code analysis workflows
- When historical queries are not needed

**Performance characteristics:**
- Single-pass through current tree
- No duplicate processing
- Typical runtime: seconds to minutes depending on repository size

**Entity stamping:**
All extracted entities receive the HEAD SHA as their `code_commit_sha` attribute, enabling queries like "what did auth.middleware look like at HEAD?" or future comparisons.

Source: [src/thought/ingest/code/git_pipeline.py:7-16]()

### Full History Mode

Full mode walks every commit in chronological order, ingesting the file tree at each point. This enables:

- Historical queries: "what did function X look like at commit Y?"
- Diff analysis between any two commits
- Complete temporal reconstruction of code evolution

**Performance considerations:**

| Repository Size | Estimated Commits | Estimated Time |
|-----------------|-------------------|----------------|
| Small (<100 files) | ~100 | ~30 seconds |
| Medium (500 files) | ~1000 | ~5 minutes |
| Large (1000+ files) | ~5000+ | ~25+ minutes |

> Note: Full-history ingest is bounded by file count × commits. The per-commit cost is dominated by tree-sitter parsing, not git operations.

Source: [src/thought/ingest/code/git_pipeline.py:16-25]()

## CLI Usage

### Command Syntax

```bash
thought ingest-git <repo_path> [OPTIONS]
```

#### Options

| Option | Example | Default | Description |
|--------|-------|---------|-------------|
| `--mode` | `--mode snapshot` or `--mode full` | `snapshot` | Ingestion mode |
| `--paths` | `--paths "*.py,*.js"` | `*.py` | Comma-separated glob patterns |
| `--config` | `--config path/to/config` | `thought.toml` | Configuration file |

### Examples

```bash
# Ingest current directory as git repo (HEAD only)
thought ingest-git .

# Ingest specific repository with full history
thought ingest-git /path/to/repo --mode full

# Ingest Python and TypeScript files only
thought ingest-git . --paths "*.py,*.ts,*.tsx"

# Ingest with full git history, multiple file types
thought ingest-git /project --mode full --paths "*.py,*.js,*.go"
```

Source: [src/thought/cli.py:90-120]()
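
The `--paths` option takes comma-separated globs. A plausible way to turn that string into the `patterns` tuple and apply it to a file list is sketched below. The helper names are hypothetical, and matching against the file's base name (rather than the full path) is an assumption about the CLI's behavior.

```python
from fnmatch import fnmatchcase
from pathlib import PurePosixPath


def parse_paths(option: str) -> tuple[str, ...]:
    """Split a comma-separated --paths value into glob patterns."""
    return tuple(part.strip() for part in option.split(",") if part.strip())


def matches(path: str, patterns: tuple[str, ...]) -> bool:
    """True if the file's base name matches any pattern (case-sensitive)."""
    name = PurePosixPath(path).name
    return any(fnmatchcase(name, pattern) for pattern in patterns)


patterns = parse_paths("*.py, *.ts,*.tsx")
tree = ["src/app/main.py", "web/index.tsx", "README.md"]
selected = [path for path in tree if matches(path, patterns)]
```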

## Code Commit Stamping

Every extracted code entity receives metadata linking it to its source commit:

```python
eid = self._backend.upsert_entity(
    # ... other fields ...
    code_file=ent.file_path,
    code_language=language,
    code_commit_sha=commit_sha,  # Links entity to specific commit
)
```

The database schema includes:

| Column | Type | Purpose |
|--------|------|---------|
| `code_file` | `TEXT` | File path relative to repo root |
| `code_language` | `TEXT` | Programming language detected |
| `code_commit_sha` | `TEXT` | Git commit where entity was found |

These columns have partial indexes for fast lookups by commit.

Source: [CHANGELOG.md]()
Source: [src/thought/ingest/code/pipeline.py:60-75]()
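
A partial index covers only the rows that satisfy its `WHERE` clause, which keeps the index small when many entities are not code entities. The sketch below shows the idea in SQLite. The table layout and index name are illustrative assumptions; only the three `code_*` column names come from the table above.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE entities (
    id INTEGER PRIMARY KEY,
    canonical_name TEXT NOT NULL,
    code_file TEXT,
    code_language TEXT,
    code_commit_sha TEXT
);
-- Partial index: only rows that carry a commit stamp are indexed.
CREATE INDEX idx_entities_commit
    ON entities (code_commit_sha)
    WHERE code_commit_sha IS NOT NULL;
""")
conn.executemany(
    "INSERT INTO entities (canonical_name, code_file, code_language, code_commit_sha) "
    "VALUES (?, ?, ?, ?)",
    [
        ("auth.login", "auth.py", "python", "abc123"),
        ("auth.logout", "auth.py", "python", "def456"),
        ("note:meeting", None, None, None),  # non-code entity, left out of the index
    ],
)
rows = conn.execute(
    "SELECT canonical_name FROM entities WHERE code_commit_sha = ?", ("abc123",)
).fetchall()
```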

## CodeLayer Query Interface

The `CodeLayer` class provides convenience methods for querying the code graph with temporal awareness:

```python
class CodeLayer:
    # Signature summary only; bodies are elided.
    def callers_of(self, name, *, code_commit_sha=None): ...  # Find who calls this function
    def callees_of(self, name, *, code_commit_sha=None): ...  # Find what this function calls
    def impact_set(self, name): ...                            # Transitive callers, ranked
    def defines_in_file(self, path): ...                       # Entities in a file
```

### Temporal Queries

All lookups operate against the currently-valid view of the code graph. To query historical snapshots, pass the `as_of` parameter or filter by `code_commit_sha`:

```python
# Query current state
impact = code_layer.impact_set("authenticate_user")

# Query historical state (when full-history ingest was used)
callers_historical = code_layer.callers_of(
    "authenticate_user",
    code_commit_sha="abc123..."
)
```

Source: [src/thought/layers/code.py:1-50]()
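
The difference between the currently-valid view and an `as_of` snapshot can be sketched with a validity interval per row. Only `valid_until` is named in this document; the `valid_from` column, the table layout, and the `names_valid` helper are assumptions made for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE entities (canonical_name TEXT, valid_from TEXT, valid_until TEXT)"
)
conn.executemany(
    "INSERT INTO entities VALUES (?, ?, ?)",
    [
        ("auth.login", "2024-01-01", "2024-06-01"),        # superseded version
        ("auth.login", "2024-06-01", None),                # currently valid version
        ("auth.legacy_login", "2024-01-01", "2024-03-01"), # removed later
    ],
)


def names_valid(as_of=None):
    """Entity names valid now (as_of=None) or at an ISO date string."""
    if as_of is None:
        sql = "SELECT DISTINCT canonical_name FROM entities WHERE valid_until IS NULL"
        params = ()
    else:
        sql = (
            "SELECT DISTINCT canonical_name FROM entities "
            "WHERE valid_from <= ? AND (valid_until IS NULL OR valid_until > ?)"
        )
        params = (as_of, as_of)
    return sorted(row[0] for row in conn.execute(sql, params))
```

ISO-formatted date strings compare correctly as text, which is why the sketch can store them in `TEXT` columns.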

## Diff Between Commits

The system supports computing the difference between any two ingested commits:

```bash
thought diff --from <sha1> --to <sha2>
```

This returns:
- **Added entities**: Entities present at `--to` but not at `--from`
- **Removed entities**: Entities present at `--from` but not at `--to`

The diff operates on the set of entities by name, comparing their commit stamps.

Source: [CHANGELOG.md]()
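
The added/removed semantics amount to two set differences over entity names. The sketch below assumes a hypothetical mapping from entity name to the commit SHAs at which it was seen; it does not reflect how `thought diff` reads the database.

```python
def diff_entities(stamps: dict[str, set[str]], sha_from: str, sha_to: str) -> dict:
    """Compare the entity names present at two commits."""
    at_from = {name for name, shas in stamps.items() if sha_from in shas}
    at_to = {name for name, shas in stamps.items() if sha_to in shas}
    return {
        "added": sorted(at_to - at_from),      # present at --to only
        "removed": sorted(at_from - at_to),    # present at --from only
    }


stamps = {
    "auth.login": {"sha1", "sha2"},
    "auth.legacy_login": {"sha1"},
    "auth.refresh_token": {"sha2"},
}
result = diff_entities(stamps, "sha1", "sha2")
```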

## Supported Languages

The git ingestion pipeline uses language-specific extractors:

| Language | Extractor | Extensions |
|----------|-----------|------------|
| Python | `python_extractor.py` | `.py` |
| Rust | `rust_extractor.py` | `.rs` |
| TypeScript | `typescript_extractor.py` | `.ts`, `.tsx` |
| PHP | `php_extractor.py` | `.php` |

Each extractor uses tree-sitter for AST parsing, extracting:
- **Entities**: modules, functions, classes, methods
- **Edges**: IMPORTS, DEFINES, CALLS, INHERITS_FROM, OVERRIDES

Source: [src/thought/ingest/code/python_extractor.py]()
Source: [src/thought/ingest/code/rust_extractor.py]()
Source: [src/thought/ingest/code/typescript_extractor.py]()
Source: [src/thought/ingest/code/php_extractor.py]()

## Configuration

### Thought Configuration (thought.toml)

```toml
[embedder]
type = "auto"  # or "ollama", "openai", "deterministic"

[storage]
path = "thought.db"
```

### Environment Variables

| Variable | Description |
|----------|-------------|
| `OLLAMA_BASE_URL` | Ollama server URL for embeddings |
| `OPENAI_API_KEY` | OpenAI API key for embeddings |

## Best Practices

### Initial Ingestion

1. Start with **snapshot mode** to verify the setup works
2. Run `thought stats` to confirm entities were created
3. Query a function to verify call graph edges exist

### Full History Ingestion

1. Ensure adequate disk space (full mode creates temporary copies)
2. Use `--paths` to filter to relevant file types on large repos
3. Consider running during off-peak hours for large repositories

### Query Optimization

- Use `code_file` filter when querying specific files
- Use `code_commit_sha` filter for historical lookups
- Combine with vector similarity for intent-based queries

## Troubleshooting

### "git executable not on PATH"

**Solution**: Install git or ensure it's in your system PATH.

```bash
# Verify git is available
git --version
```

### "not a git repository"

**Solution**: Ensure the path contains a `.git` directory:

```bash
# Initialize if needed
git init
```

### Slow Full-History Ingestion

**Mitigation**:
- Use `--paths` to filter file types
- Use snapshot mode for initial setup
- Consider parallelizing with multiple `--paths` passes

## Summary

Git History Integration transforms thought-mcp from a current-state code analysis tool into a full temporal code repository that can answer questions about code at any point in history. By combining git's commit tracking with bi-temporal database queries, users can reconstruct how functions evolved, who called what across commits, and the complete impact chain of changes over time.

The architecture prioritizes:
- **No native dependencies**: Pure subprocess git operations
- **Two-mode flexibility**: Fast snapshots or complete history
- **Temporal provenance**: Every entity stamped with its commit SHA
- **Language generality**: Support for multiple programming languages via tree-sitter

---

<a id='page-agent-adapters'></a>

## Agent Adapters and SDK Integration

### Related pages

Related topics: [Query and Retrieval System](#page-query-system)

<details>
<summary>Relevant source files</summary>

The following source files were used to generate this page:

- [src/thought/adapters/__init__.py](https://github.com/RNBBarrett/thought-mcp/blob/main/src/thought/adapters/__init__.py)
- [src/thought/adapters/claude_sdk.py](https://github.com/RNBBarrett/thought-mcp/blob/main/src/thought/adapters/claude_sdk.py)
- [src/thought/clients.py](https://github.com/RNBBarrett/thought-mcp/blob/main/src/thought/clients.py)
- [src/thought/server.py](https://github.com/RNBBarrett/thought-mcp/blob/main/src/thought/server.py)
- [src/thought/hooks/install.py](https://github.com/RNBBarrett/thought-mcp/blob/main/src/thought/hooks/install.py)
</details>

# Agent Adapters and SDK Integration

## Overview

The Agent Adapters and SDK Integration subsystem provides a seamless bridge between THOUGHT's knowledge base and external AI agent frameworks. This system enables any Claude-Agent-SDK-shaped agent to interact with THOUGHT's memory, context retrieval, and code analysis capabilities through a standardized adapter interface.

The integration layer consists of three primary components:

1. **Claude SDK Adapter** (`ThoughtMemoryProvider`) — A drop-in memory adapter for Claude Agent SDK
2. **MCP Server Surface** — Exposes core primitives via the Model Context Protocol
3. **Claude Code Hook Installer** — Integrates THOUGHT directly into Claude Code's event loop

Source: [CHANGELOG.md](https://github.com/RNBBarrett/thought-mcp/blob/main/CHANGELOG.md)

## Architecture Overview

```mermaid
graph TD
    subgraph "Agent Frameworks"
        ClaudeSDK[Claude Agent SDK]
        ClaudeCode[Claude Code CLI]
        MCPClients[MCP-Compatible Clients]
    end

    subgraph "THOUGHT Integration Layer"
        ClaudeSDKAdapter[ThoughtMemoryProvider]
        MCPServer[MCP Server Surface]
        HookInstaller[Claude Code Hook Installer]
    end

    subgraph "Core THOUGHT"
        Memory[Memory / Knowledge Base]
        Embedder[Embedder Service]
        CodeAnalysis[Code Analysis Engine]
        Backend[SQLite Backend]
    end

    ClaudeSDK --> ClaudeSDKAdapter
    ClaudeSDKAdapter --> Memory
    ClaudeSDKAdapter --> Embedder
    
    ClaudeCode --> HookInstaller
    HookInstaller --> Memory
    
    MCPClients --> MCPServer
    MCPServer --> Memory
    MCPServer --> CodeAnalysis
    MCPServer --> Backend

    Memory --> Backend
    Embedder --> Backend
    CodeAnalysis --> Backend
```

## The Claude SDK Adapter

### Purpose and Scope

The `ThoughtMemoryProvider` class serves as a drop-in memory adapter for any Claude-Agent-SDK-shaped agent. It wraps THOUGHT's core memory primitives and exposes them through a familiar interface that agent developers expect.

Source: [src/thought/adapters/claude_sdk.py](https://github.com/RNBBarrett/thought-mcp/blob/main/src/thought/adapters/claude_sdk.py)

### Core Methods

The adapter implements four primary methods that cover the complete agent loop:

| Method | Purpose | Returns |
|--------|---------|---------|
| `context_for(target, role)` | Returns a working-context dict for a specific target entity and role | `dict` with anchor, neighbours, recent_contradictions, role_view |
| `render_context(target)` | Returns the same payload as a plain-text system-prompt augmentation | `str` formatted for LLM consumption |
| `record(content)` | Persists what the agent learned to the knowledge base | `str` — source ID of recorded content |
| `scan(repo_path)` | Runs an incremental scan under the agent's name | `dict` with scan results |

Source: [src/thought/adapters/claude_sdk.py](https://github.com/RNBBarrett/thought-mcp/blob/main/src/thought/adapters/claude_sdk.py)

### Working Context Structure

The `context_for()` method returns a ranked, role-aware payload containing:

```python
{
    "anchor": "<entity-name>",           # The target entity
    "neighbours": [...],                  # Top-K related entities
    "recent_contradictions": [...],       # Entities that contradict this one
    "role_view": "<saved-view-name>"      # Optional named view for the role
}
```

The context is token-budgeted to prevent overwhelming the agent's context window.

Source: [CHANGELOG.md](https://github.com/RNBBarrett/thought-mcp/blob/main/CHANGELOG.md)
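
One simple way to token-budget a ranked neighbour list is to keep the highest-scoring items until the budget is spent. The sketch below is an assumption about the general technique, not THOUGHT's implementation: the `score` and `summary` fields, the four-characters-per-token estimate, and both helper names are hypothetical.

```python
def estimate_tokens(text: str) -> int:
    """Crude stand-in for a tokenizer: roughly four characters per token."""
    return max(1, len(text) // 4)


def fit_to_budget(neighbours: list[dict], budget_tokens: int) -> list[dict]:
    """Keep the highest-scoring neighbours whose summaries fit the budget."""
    kept, used = [], 0
    for item in sorted(neighbours, key=lambda n: n["score"], reverse=True):
        cost = estimate_tokens(item["summary"])
        if used + cost > budget_tokens:
            break  # stop at the first item that would exceed the budget
        kept.append(item)
        used += cost
    return kept


neighbours = [
    {"name": "a", "score": 0.9, "summary": "x" * 40},
    {"name": "b", "score": 0.5, "summary": "y" * 40},
    {"name": "c", "score": 0.7, "summary": "z" * 40},
]
trimmed = fit_to_budget(neighbours, budget_tokens=20)
```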

### Integration Flow

```mermaid
sequenceDiagram
    participant Agent as Claude Agent SDK
    participant Adapter as ThoughtMemoryProvider
    participant Memory as THOUGHT Memory
    participant Embedder as Embedder Service
    participant Backend as SQLite Backend

    Agent->>Adapter: context_for("authenticate", role="code")
    Adapter->>Memory: working_context(target, role, budget_tokens)
    Memory->>Embedder: embed("authenticate")
    Embedder->>Memory: vector embedding
    Memory->>Backend: query similar entities
    Backend-->>Memory: ranked entity results
    Memory-->>Adapter: structured context dict
    Adapter-->>Agent: context payload

    Agent->>Adapter: record("Learned: auth uses JWT")
    Adapter->>Backend: upsert_source(content, mime_type)
    Adapter->>Backend: store entity + edges
    Backend-->>Adapter: source_id
    Adapter-->>Agent: source_id
```

## MCP Server Surface

The MCP (Model Context Protocol) server exposes THOUGHT's primitives as tools that any MCP-compatible client can invoke.

Source: [src/thought/server.py](https://github.com/RNBBarrett/thought-mcp/blob/main/src/thought/server.py)

### Available Tools

#### `working_context`

Universal "what does my agent need to know about X right now" primitive.

```python
@app.tool()
async def working_context(
    target: str,           # "function:authenticate" / "chapter:5" / entity name
    role: str = "default", # Contextual role for view filtering
    budget_tokens: int = 1024,
    scope: str | None = None,
    owner_id: str | None = None,
) -> dict: ...  # signature only; body elided
```

Returns:
```python
{
    "anchor": str,
    "neighbours": list[dict],
    "recent_contradictions": list[dict],
    "role_view": str | None
}
```

Source: [src/thought/server.py:48-63](https://github.com/RNBBarrett/thought-mcp/blob/main/src/thought/server.py)

#### `scan`

Incremental code-scan primitive for keeping the knowledge base current.

```python
@app.tool()
async def scan(
    repo_path: str,           # Repository to scan
    agent: str | None = None, # Agent name for scan attribution
    since: str | None = None, # Only files changed since this time/commit
    max_files: int | None = None,
    note: str | None = None,
) -> dict: ...  # signature only; body elided
```

Source: [src/thought/server.py:65-78](https://github.com/RNBBarrett/thought-mcp/blob/main/src/thought/server.py)

#### `scan_log_list`

Lists recent scan runs for tracking incremental progress.

```python
@app.tool()
async def scan_log_list(
    agent: str | None = None,
    limit: int = 10,
) -> dict: ...  # signature only; body elided
```

Returns:
```python
{
    "scans": [
        {
            "scan_id": str,
            "agent": str,
            "timestamp": str,
            "files_processed": int,
            "note": str | None
        },
        ...
    ]
}
```

Source: [src/thought/server.py:80-91](https://github.com/RNBBarrett/thought-mcp/blob/main/src/thought/server.py)

## Client Installation

THOUGHT supports installation into multiple MCP-compatible clients. The installation process merges a `thought` MCP server entry into the client's configuration file.

Source: [src/thought/clients.py](https://github.com/RNBBarrett/thought-mcp/blob/main/src/thought/clients.py)

### Supported Clients

| Scope | Configuration Path |
|--------|-------------------|
| Project | `.claude/settings.json` |
| User | `~/.claude/settings.json` |

Source: [src/thought/clients.py](https://github.com/RNBBarrett/thought-mcp/blob/main/src/thought/clients.py)

### Installation Function

```python
def install(
    client: ClientName,
    *,
    server_name: str = "thought",
    block: dict | None = None,
    backup: bool = True,
) -> ClientInstallResult: ...  # signature only; body elided
```

**Parameters:**

| Parameter | Type | Default | Description |
|-----------|------|---------|-------------|
| `client` | `ClientName` | Required | Target client name |
| `server_name` | `str` | `"thought"` | Name for the server entry |
| `block` | `dict \| None` | `None` | Custom server block; defaults to `server_block()` |
| `backup` | `bool` | `True` | Backup existing config before modification |

**Return Type:** `ClientInstallResult`

```python
@dataclass
class ClientInstallResult:
    client: ClientName
    path: Path | None
    status: Literal["installed", "already_present", "error", "no_path"]
    detail: str = ""
```

Source: [src/thought/clients.py](https://github.com/RNBBarrett/thought-mcp/blob/main/src/thought/clients.py)

### Installation Behavior

The `install()` function performs the following:

1. **Read existing config**: parses the client's JSON configuration and returns `error` if it is not valid JSON
2. **Idempotency check**: returns `already_present` if the entry exists and matches
3. **Backup**: creates `settings.json.thought.bak` before any write
4. **Merge server entry**: adds the `thought` server block under `mcpServers` and writes the merged config

Source: [src/thought/clients.py](https://github.com/RNBBarrett/thought-mcp/blob/main/src/thought/clients.py)

```mermaid
graph TD
    A[install called] --> B{Config exists?}
    B -->|No| C[Create new config]
    B -->|Yes| D{Valid JSON?}
    D -->|No| E[Return error]
    D -->|Yes| F{Server entry exists?}
    F -->|Yes, matches| G[Return already_present]
    F -->|Yes, differs| H[Backup config]
    F -->|No| I[Add server entry]
    H --> J[Write merged config]
    I --> J
    C --> J
    J --> K[Return installed]
```
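
The flow in the diagram can be approximated with a short idempotent JSON merge. This is a sketch under stated assumptions, not the code in `clients.py`: the `merge_server` helper is hypothetical, it returns plain status strings instead of a `ClientInstallResult`, and it backs up whenever an existing file is about to be rewritten.

```python
import json
import shutil
from pathlib import Path


def merge_server(config_path: Path, server_name: str, block: dict) -> str:
    """Merge `block` under mcpServers and return a status string."""
    if config_path.exists():
        try:
            config = json.loads(config_path.read_text(encoding="utf-8"))
        except json.JSONDecodeError:
            return "error"
        if not isinstance(config, dict):
            return "error"
        if config.get("mcpServers", {}).get(server_name) == block:
            return "already_present"  # nothing to write
        # Back up the existing file before it is modified.
        backup = config_path.with_name(config_path.name + ".thought.bak")
        shutil.copy2(config_path, backup)
    else:
        config = {}
        config_path.parent.mkdir(parents=True, exist_ok=True)
    config.setdefault("mcpServers", {})[server_name] = block
    config_path.write_text(json.dumps(config, indent=2) + "\n", encoding="utf-8")
    return "installed"
```

Comparing the stored entry with the requested block before writing is what makes repeated installs safe: a second call with the same block leaves the file untouched.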

## Claude Code Hook Integration

The hook installer provides Claude Code event-driven integration, enabling THOUGHT to automatically capture context at key points in the development workflow.

Source: [src/thought/hooks/install.py](https://github.com/RNBBarrett/thought-mcp/blob/main/src/thought/hooks/install.py)

### Hook Kinds

| Hook Kind | Claude Code Event | Command | Trigger |
|-----------|------------------|---------|---------|
| `recall` | `UserPromptSubmit` | `thought hook recall` | After user submits a prompt |
| `write` | `Stop` | `thought hook write` | After agent completes work |
| `context` | `SessionStart` | `thought hook context` | When session begins |

Source: [src/thought/hooks/install.py:15-22](https://github.com/RNBBarrett/thought-mcp/blob/main/src/thought/hooks/install.py)
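
The table maps each hook kind to an event and a command. The sketch below shows how such a mapping might be merged into a settings dictionary without creating duplicates. The nested shape (`hooks`, then event name, then groups containing `{"type": "command", ...}` entries) is an assumption about the settings format, and `add_hook` is a hypothetical helper, not the installer's API.

```python
# Mapping taken from the Hook Kinds table above.
HOOK_EVENTS = {
    "recall": ("UserPromptSubmit", "thought hook recall"),
    "write": ("Stop", "thought hook write"),
    "context": ("SessionStart", "thought hook context"),
}


def add_hook(settings: dict, kind: str) -> bool:
    """Add the hook for `kind` in place; return False if already present."""
    event, command = HOOK_EVENTS[kind]
    groups = settings.setdefault("hooks", {}).setdefault(event, [])
    for group in groups:
        if any(h.get("command") == command for h in group.get("hooks", [])):
            return False
    # Assumed entry shape: a group holding one command-type hook.
    groups.append({"hooks": [{"type": "command", "command": command}]})
    return True


settings = {}
first = add_hook(settings, "recall")
second = add_hook(settings, "recall")
```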

### Hook Installation Result

```python
@dataclass(frozen=True)
class HookInstallResult:
    kind: HookKind
    path: Path
    status: Literal["installed", "already_present", "error"]
    detail: str = ""
```

### Settings Path Resolution

```python
def settings_path(*, scope: Literal["project", "user"] = "project") -> Path: ...  # signature only
```

- **Project scope** — `.claude/settings.json` (recommended default)
- **User scope** — `~/.claude/settings.json`

Source: [src/thought/hooks/install.py:41-50](https://github.com/RNBBarrett/thought-mcp/blob/main/src/thought/hooks/install.py)

## Demo Integration

The `code` audience of the `thought demo` command covers the Claude Agent SDK adapter as one of its stages. Its description in the demo module reads:

```text
- ``code``  Agent / developer flow — the 14-stage code-vertical
            walkthrough including agent identity, ``thought scan``,
            ``working_context``, 4 new-language extractors, and the
            Claude Agent SDK adapter.
```

Source: [src/thought/demo.py](https://github.com/RNBBarrett/thought-mcp/blob/main/src/thought/demo.py)

### Demo Audiences

| Audience | Purpose | Key Features |
|----------|---------|--------------|
| `code` | Agent/developer flow | SDK adapter, scan, working_context |
| `writer` | Novelist/paper author | Bi-temporal model, contradiction detection |
| `legal` | Investigator/paralegal | `unique_predicates`, CONTRADICTS edges |
| `researcher` | Academic | Claim/source pairs, Cypher queries |
| `all` | Sequential all audiences | Full demonstration suite |

## Configuration

### Environment Variables

The integration layer respects the following environment variables for database and embedder configuration:

| Variable | Purpose |
|----------|---------|
| `THOUGHT_DB_PATH` | Override database path |
| `THOUGHT_EMBEDDER` | Embedder choice (`auto`, `sentence-transformers`, etc.) |
| `THOUGHT_OLLAMA_HOST` | Ollama server host |
| `THOUGHT_OLLAMA_MODEL` | Ollama model name |
| `THOUGHT_LMSTUDIO_URL` | LM Studio server URL |
| `THOUGHT_LMSTUDIO_MODEL` | LM Studio model name |
| `THOUGHT_OPENAI_COMPAT_URL` | OpenAI-compatible API URL |
| `THOUGHT_OPENAI_COMPAT_MODEL` | OpenAI-compatible model name |
| `THOUGHT_OPENAI_COMPAT_API_KEY` | API key for OpenAI-compatible endpoints |

Source: [src/thought/config.py](https://github.com/RNBBarrett/thought-mcp/blob/main/src/thought/config.py)

### Config File (`thought.toml`)

```toml
[embedding]
choice = "auto"  # or specific embedder name

[db]
path = ".thought/thought.db"
```

## Dependencies

The adapter package requires the following extras:

```toml
[project.optional-dependencies]
adapters = ["httpx>=0.27"]
```

Source: [CHANGELOG.md](https://github.com/RNBBarrett/thought-mcp/blob/main/CHANGELOG.md)

## Usage Example

```python
from thought.adapters.claude_sdk import ThoughtMemoryProvider

# Initialize adapter
memory = ThoughtMemoryProvider()

# Get working context for a function
context = memory.context_for(
    target="authenticate_user",
    role="security-reviewer",
    budget_tokens=2048,
)

# Record what the agent learned
source_id = memory.record(
    "Session token validation happens in this function. "
    "Uses HMAC-SHA256 for signature verification."
)

# Run incremental scan
result = memory.scan(
    repo_path="/path/to/project",
    agent="security-audit",
    note="Weekly security review scan"
)
```

## Summary

The Agent Adapters and SDK Integration system provides three complementary pathways for integrating THOUGHT with external agents:

1. **Direct SDK Integration** — `ThoughtMemoryProvider` for Claude Agent SDK agents
2. **MCP Protocol** — Standard tool interface for any MCP-compatible client
3. **Claude Code Hooks** — Event-driven integration for Claude Code CLI users

All pathways share the same underlying memory primitives, ensuring consistent behavior regardless of how the agent connects to THOUGHT.

---


## Doramagic Pitfall Log

Project: RNBBarrett/thought-mcp

Summary: Found 8 potential pitfall items; 0 are high/blocking. Highest priority: configuration - may modify host AI configuration.

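## 1. configuration · May modify host AI configuration

- Severity: medium
- Evidence strength: source_linked
- Finding: The project targets hosts such as Claude/Cursor/Codex/Gemini/OpenCode, or its install commands touch user configuration directories.
- User impact: Installation may change the behavior of local AI tools; users need to know where files are written and how to roll back.
- Suggested check: List the configuration files and directories that will be written, along with the uninstall/rollback steps.
- Guardrail action: When host configuration directories are involved, a rollback path must be provided, not just the install command.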
- Evidence: capability.host_targets | github_repo:1238261514 | https://github.com/RNBBarrett/thought-mcp | host_targets=mcp_host, claude, claude_code, chatgpt

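## 2. capability · Capability assessment depends on assumptions

- Severity: medium
- Evidence strength: source_linked
- Finding: README/documentation is current enough for a first validation pass.
- User impact: If the assumption does not hold, users will not get the promised capability.
- Suggested check: Convert the assumptions into a downstream verification checklist.
- Guardrail action: Assumptions must be converted into verification items; they must not be stated as fact before verification results exist.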
- Evidence: capability.assumptions | github_repo:1238261514 | https://github.com/RNBBarrett/thought-mcp | README/documentation is current enough for a first validation pass.

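## 3. maintenance · Source evidence: v0.2.1 — thought upgrade + mcp-extras fix

- Severity: medium
- Evidence strength: source_linked
- Finding: GitHub community evidence shows the project has a maintenance/version-related issue pending verification: v0.2.1 — thought upgrade + mcp-extras fix
- User impact: May increase the cost of trial and production adoption for new users.
- Suggested check: The source suggests a fix, workaround, or version change may already exist; the manual must state the applicable version.
- Guardrail action: Do not inflate this into a definitive conclusion detached from the source link; state the applicable version and review status.
- Evidence: community_evidence:github | cevd_dc6434cd71064812ae14d01e7f5d9ef0 | https://github.com/RNBBarrett/thought-mcp/releases/tag/v0.2.1 | Usage condition pending verification, surfaced by source type github_release.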

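## 4. maintenance · Maintenance activity unknown

- Severity: medium
- Evidence strength: source_linked
- Finding: last_activity_observed is not recorded.
- User impact: New, abandoned, and active projects become indistinguishable, which lowers confidence in the recommendation.
- Suggested check: Add signals for recent GitHub commits, releases, and issue/PR responsiveness.
- Guardrail action: When maintenance activity is unknown, the recommendation must not be marked high-trust.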
- Evidence: evidence.maintainer_signals | github_repo:1238261514 | https://github.com/RNBBarrett/thought-mcp | last_activity_observed missing

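## 5. security_permissions · Downstream validation found risk items

- Severity: medium
- Evidence strength: source_linked
- Finding: no_demo
- User impact: Downstream has already requested a review; this must not be downplayed on the page.
- Suggested check: Enter the security/permissions governance review queue.
- Guardrail action: While downstream risks exist, the review/recommendation must remain downgraded.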
- Evidence: downstream_validation.risk_items | github_repo:1238261514 | https://github.com/RNBBarrett/thought-mcp | no_demo; severity=medium

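## 6. security_permissions · Scoring risk present

- Severity: medium
- Evidence strength: source_linked
- Finding: no_demo
- User impact: The risk affects whether the project is suitable for ordinary users to install.
- Suggested check: Record the risk in the boundary card and confirm whether manual review is needed.
- Guardrail action: Scoring risks must be recorded in the boundary card, not kept only as an internal score.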
- Evidence: risks.scoring_risks | github_repo:1238261514 | https://github.com/RNBBarrett/thought-mcp | no_demo; severity=medium

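## 7. maintenance · Issue/PR response quality unknown

- Severity: low
- Evidence strength: source_linked
- Finding: issue_or_pr_quality=unknown.
- User impact: Users cannot tell whether anyone will respond when they run into a problem.
- Suggested check: Sample recent issues/PRs to see whether they go unhandled for long periods.
- Guardrail action: When issue/PR responsiveness is unknown, the maintenance risk must be flagged.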
- Evidence: evidence.maintainer_signals | github_repo:1238261514 | https://github.com/RNBBarrett/thought-mcp | issue_or_pr_quality=unknown

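## 8. maintenance · Release cadence unclear

- Severity: low
- Evidence strength: source_linked
- Finding: release_recency=unknown.
- User impact: Install commands and documentation may lag behind the code, making pitfalls more likely.
- Suggested check: Confirm that the latest release/tag matches the README install commands.
- Guardrail action: When release cadence is unknown or stale, the install instructions must note possible drift.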
- Evidence: evidence.maintainer_signals | github_repo:1238261514 | https://github.com/RNBBarrett/thought-mcp | release_recency=unknown

