# https://github.com/abhigyanpatwari/GitNexus Project Documentation

Generated: 2026-05-16 01:16:25 UTC

## Table of Contents

- [Introduction to GitNexus](#introduction)
- [Quick Start Guide](#quick-start)
- [Key Concepts](#key-concepts)
- [System Architecture](#system-architecture)
- [Package Structure](#package-structure)
- [MCP Integration](#mcp-integration)
- [Multi-Repo Registry Architecture](#multi-repo-registry)
- [Indexing Pipeline](#indexing-pipeline)
- [Knowledge Graph](#knowledge-graph)
- [Search System](#search-system)

<a id='introduction'></a>

## Introduction to GitNexus

### Related Pages

Related topics: [System Architecture](#system-architecture), [Quick Start Guide](#quick-start), [Key Concepts](#key-concepts)

<details>
<summary>Relevant source files</summary>

The following source files were used to generate this page:

- [src/storage/git.ts](https://github.com/abhigyanpatwari/GitNexus/blob/main/src/storage/git.ts)
- [src/core/group/PIPELINE.md](https://github.com/abhigyanpatwari/GitNexus/blob/main/src/core/group/PIPELINE.md)
- [src/core/tree-sitter/parser-loader.ts](https://github.com/abhigyanpatwari/GitNexus/blob/main/src/core/tree-sitter/parser-loader.ts)
- [src/core/ingestion/scope-extractor.ts](https://github.com/abhigyanpatwari/GitNexus/blob/main/src/core/ingestion/scope-extractor.ts)
- [src/core/ingestion/heritage-processor.ts](https://github.com/abhigyanpatwari/GitNexus/blob/main/src/core/ingestion/heritage-processor.ts)
- [src/core/ingestion/emit-references.ts](https://github.com/abhigyanpatwari/GitNexus/blob/main/src/core/ingestion/emit-references.ts)
- [src/core/ingestion/scope-resolution/workspace-index.ts](https://github.com/abhigyanpatwari/GitNexus/blob/main/src/core/ingestion/scope-resolution/workspace-index.ts)
- [src/mcp/resources.ts](https://github.com/abhigyanpatwari/GitNexus/blob/main/src/mcp/resources.ts)
- [src/storage/repo-manager.ts](https://github.com/abhigyanpatwari/GitNexus/blob/main/src/storage/repo-manager.ts)
- [src/core/group/extractors/grpc-patterns/node.ts](https://github.com/abhigyanpatwari/GitNexus/blob/main/src/core/group/extractors/grpc-patterns/node.ts)
- [src/core/group/extractors/http-patterns/php.ts](https://github.com/abhigyanpatwari/GitNexus/blob/main/src/core/group/extractors/http-patterns/php.ts)
- [gitnexus-web/src/components/OnboardingGuide.tsx](https://github.com/abhigyanpatwari/GitNexus/blob/main/gitnexus-web/src/components/OnboardingGuide.tsx)
- [gitnexus-web/src/components/HelpPanel.tsx](https://github.com/abhigyanpatwari/GitNexus/blob/main/gitnexus-web/src/components/HelpPanel.tsx)
</details>


GitNexus is a local-first code intelligence platform that transforms source code repositories into interactive, queryable knowledge graphs. It provides developers with deep visibility into code relationships, symbol dependencies, execution flows, and cross-repository impact analysis.

## What is GitNexus?

GitNexus is a graph-based code analysis tool that:

- **Clones and indexes GitHub repositories** locally on your machine
- **Parses source code** across multiple programming languages using tree-sitter grammars
- **Builds a knowledge graph** representing symbols, imports, inheritance, and call relationships
- **Provides AI-powered tools** for querying, refactoring, and understanding code impact

Source: [gitnexus-web/src/components/AnalyzeOnboarding.tsx:12-14]()

The system operates with a clear data-privacy guarantee: public repositories are cloned locally and parsed entirely on your machine, and no data leaves your environment. Source: [gitnexus-web/src/components/AnalyzeOnboarding.tsx:26-27]()

## Installation and Setup

### Prerequisites

| Requirement | Version |
|-------------|---------|
| Node.js | 18+ |
| npm | Latest stable |
| Git | Configured with remote origin |

### Quick Start

```bash
npm install -g gitnexus && gitnexus serve
```

Source: [gitnexus-web/src/components/OnboardingGuide.tsx:20-21]()

After starting the server, the web interface automatically detects when the server is ready and opens the graph visualization without requiring a page refresh. Source: [gitnexus-web/src/components/OnboardingGuide.tsx:39-41]()

## Architecture Overview

GitNexus follows a modular architecture with distinct phases for code ingestion, graph construction, and query execution.

```mermaid
graph TD
    subgraph Ingestion["Ingestion Pipeline"]
        CLONE[Git Clone] --> PARSE[Code Parsing]
        PARSE --> EXTRACT[Symbol Extraction]
        EXTRACT --> EMIT[Graph Emission]
    end
    
    subgraph Analysis["Analysis Engine"]
        EMIT --> INDEX[Workspace Index]
        INDEX --> RESOLVE[Scope Resolution]
        RESOLVE --> MODEL[Semantic Model]
    end
    
    subgraph Query["Query Layer"]
        MODEL --> MCP[Model Context Protocol]
        MCP --> TOOLS[AI Tools]
    end
```

### Supported Languages

GitNexus uses tree-sitter for language-agnostic parsing. Languages supported through grammar loaders include (source: [src/core/tree-sitter/parser-loader.ts:31-48]()):

| Language | Grammar Package | Notes |
|----------|-----------------|-------|
| JavaScript | tree-sitter-javascript | Core language |
| TypeScript | tree-sitter-typescript | Full type support |
| TSX | tree-sitter-typescript | React JSX syntax |
| Python | tree-sitter-python | Indentation-based parsing |
| PHP | tree-sitter-php | Laravel patterns supported |

Grammar loading is centralized in a single registry table. Adding or removing a grammar requires changing only one entry, with no scattered conditional spreads or per-grammar branches. Source: [src/core/tree-sitter/parser-loader.ts:21-30]()

## Core Pipeline Phases

### 1. Manifest Extraction

The manifest extraction phase processes `group.yaml` configuration files to identify cross-repository symbol references.

```mermaid
flowchart TD
    LINKS[group.yaml links] --> ME[ManifestExtractor]
    ME --> LOOP{for each link}
    LOOP --> RES[resolveSymbol]
    RES --> OK{found?}
    OK -->|yes| REF[Real symbol uid + ref]
    OK -->|no| SYN[Synthetic uid<br/>manifest::repo::cid]
    REF --> EMIT[Emit Contract + CrossLink]
    SYN --> EMIT
    EMIT --> BRIDGE[Bridge query #795]
```

Label-scoped queries in `resolveSymbol` prevent accidental cross-matches by restricting each link kind to context-appropriate node labels (source: [src/core/group/PIPELINE.md:14-24]()):

- `topic` → `(n:Function|Method|Class|Interface)`
- `grpc` method → `(n:Function|Method)`, service → `(n:Class|Interface)`
- `lib` → `(n:Package|Module)`
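
As a rough illustration of the label scoping above, the sketch below maps each link kind to its allowed labels and builds a Cypher `MATCH` clause from them. The `LinkKind` names, the `name` and `uid` properties, and both helper functions are assumptions made for this example, not the actual `resolveSymbol` code; only the label sets and the `manifest::repo::cid` fallback shape come from the pipeline description.

```typescript
// Hypothetical link kinds; the label sets mirror the list above.
type LinkKind = 'topic' | 'grpc-method' | 'grpc-service' | 'lib';

const LABELS_BY_KIND: Record<LinkKind, readonly string[]> = {
  topic: ['Function', 'Method', 'Class', 'Interface'],
  'grpc-method': ['Function', 'Method'],
  'grpc-service': ['Class', 'Interface'],
  lib: ['Package', 'Module'],
};

// Builds a label-scoped lookup so that, for example, a `lib` link
// can never match a Function node that happens to share its name.
// The `name` and `uid` properties are assumed, not taken from the real schema.
function buildResolveQuery(kind: LinkKind): string {
  const labels = LABELS_BY_KIND[kind].join('|');
  return `MATCH (n:${labels}) WHERE n.name = $name RETURN n.uid AS uid`;
}

// Fallback when no symbol is found: a synthetic uid in the
// manifest::repo::cid shape shown in the flowchart.
function syntheticUid(repo: string, contractId: string): string {
  return `manifest::${repo}::${contractId}`;
}
```

Keeping the label sets in one table means a new link kind only needs one new entry rather than another branch in the query builder.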

### 2. Language-Specific Pattern Extraction

GitNexus includes specialized extractors for framework-specific patterns:

#### gRPC Patterns (Node.js/TypeScript)

The gRPC pattern extractor compiles tree-sitter queries for common protobuf service patterns (source: [src/core/group/extractors/grpc-patterns/node.ts:1-38]()):

| Pattern | Purpose |
|---------|---------|
| `grpcMethod` | Detects RPC method definitions |
| `grpcClient` | Identifies `@GrpcClient(...)` decorator usage |
| `getService` | Finds `getService()` calls |
| `newSimpleCtor` | Matches simple constructor patterns |
| `newQualifiedCtor` | Matches qualified constructor patterns |
| `loadPackageDefinition` | Detects `loadPackageDefinition()` usage |

Bundles are pre-compiled for JavaScript, TypeScript, and TSX. Source: [src/core/group/extractors/grpc-patterns/node.ts:40-42]()

#### HTTP Patterns (PHP)

PHP extraction supports common web frameworks (source: [src/core/group/extractors/http-patterns/php.ts:1-45]()):

| Pattern | Framework | Query Target |
|---------|-----------|--------------|
| `laravelRoute` | Laravel | Route definitions |
| `httpFacade` | Laravel | HTTP facade calls |
| `guzzleMember` | Guzzle | HTTP client usage |
| `fileGetContents` | PHP Core | File read operations |

### 3. Scope Extraction

The scope extraction phase builds the fundamental parse representation used throughout the pipeline.

```mermaid
flowchart LR
    subgraph Input["LanguageProvider Hooks"]
        RSK[resolveScopeKind]
        BSF[bindingScopeFor]
        II[interpretImport]
        ITB[interpretTypeBinding]
        CC[classifyCallForm]
    end
    
    subgraph Process["ScopeExtractor"]
        Input --> EXTRACT[extract function]
        EXTRACT --> PF[ParsedFile]
    end
    
    PF --> OUTPUT[ownedDefs<br/>referenceSites<br/>parsedImports]
```

The `ScopeExtractorHooks` interface declares the exact subset of `LanguageProvider` methods used, enabling targeted testing and explicit dependency contracts. Source: [src/core/ingestion/scope-extractor.ts:38-56]()
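
One way to express such a subset in TypeScript is with `Pick`, as in the sketch below. The five hook names come from the diagram above; every signature, the extra `formatDiagnostics` member, and the `stubHooks` object are assumptions made for illustration and will differ from the real interfaces.

```typescript
// Assumed shapes; the real LanguageProvider has more members and richer types.
interface LanguageProvider {
  resolveScopeKind(nodeType: string): string | undefined;
  bindingScopeFor(nodeType: string): string | undefined;
  interpretImport(text: string): { source: string } | undefined;
  interpretTypeBinding(text: string): { name: string; type: string } | undefined;
  classifyCallForm(text: string): 'free' | 'member' | 'constructor';
  formatDiagnostics(): string; // stand-in for members the extractor never touches
}

// The extractor depends only on the five hooks it actually calls.
type ScopeExtractorHooks = Pick<
  LanguageProvider,
  | 'resolveScopeKind'
  | 'bindingScopeFor'
  | 'interpretImport'
  | 'interpretTypeBinding'
  | 'classifyCallForm'
>;

// A test double only has to implement the subset, not the whole provider.
const stubHooks: ScopeExtractorHooks = {
  resolveScopeKind: (nodeType) =>
    nodeType === 'function_declaration' ? 'Function' : undefined,
  bindingScopeFor: () => undefined,
  interpretImport: (text) => ({ source: text }),
  interpretTypeBinding: () => undefined,
  classifyCallForm: (text) =>
    text.startsWith('new ') ? 'constructor' : text.includes('.') ? 'member' : 'free',
};
```

Because the dependency is a named type, the compiler flags any new hook the extractor starts using that the contract does not list.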

### 4. Heritage Processing (Inheritance Analysis)

The heritage processor resolves inheritance relationships across the codebase:

```mermaid
graph LR
    IMP[implements] --> MAP[heritage-map.ts]
    EXT[extends] --> MAP
    MAP --> RESOLVE[resolveExtendsType]
    RESOLVE --> EMIT[implements relationship<br/>to implementor files]
```

Inheritance handling is language-aware with configurable strategies. For `extends` relationships, the system determines whether to treat them as implementation relationships based on language-specific conventions. Source: [src/core/ingestion/heritage-processor.ts:18-29]()

### 5. Reference Emission

References connect code locations to the symbols they use:

```mermaid
flowchart TD
    REF[Reference] --> RESOLVE[Resolve caller def]
    RESOLVE --> WALK[Walk up scope tree]
    WALK --> FOUND{Found Function-like?}
    FOUND -->|yes| EMIT[Emit edge to caller]
    FOUND -->|no| FALLBACK[Use innermost<br/>ancestor scope]
    FALLBACK --> SKIP[Skip if no def found]
```

Reference emission optionally flushes scope trees when `INGESTION_EMIT_SCOPES=1` is set (source: [src/core/ingestion/emit-references.ts:26-35]()), creating:

- `Scope` nodes for every scope in the tree
- `CONTAINS` edges from parent to child scope
- `DEFINES` edges from scope to owned definitions
- `IMPORTS` edges from scope to target modules
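
The caller-resolution walk in the flowchart above can be sketched as follows. The `Scope` shape, its field names, and `resolveCallerDef` are hypothetical; the sketch only mirrors the described order of preference (nearest function-like owner, then the innermost ancestor that owns any definition, otherwise skip).

```typescript
// Hypothetical scope shape; field names are assumptions for this sketch.
interface Scope {
  id: string;
  kind: 'Module' | 'Class' | 'Function' | 'Method' | 'Block';
  parent?: Scope;
  ownerDefId?: string; // id of the definition that owns this scope, if any
}

const FUNCTION_LIKE = new Set<Scope['kind']>(['Function', 'Method']);

// Returns the definition to use as the edge source for a reference found
// in `site`, or undefined when the reference should be skipped.
function resolveCallerDef(site: Scope): string | undefined {
  let fallback: string | undefined;
  for (let s: Scope | undefined = site; s !== undefined; s = s.parent) {
    if (s.ownerDefId === undefined) continue;
    // Preferred: the nearest enclosing function-like definition.
    if (FUNCTION_LIKE.has(s.kind)) return s.ownerDefId;
    // Otherwise remember the innermost ancestor that owns a definition.
    if (fallback === undefined) fallback = s.ownerDefId;
  }
  return fallback;
}
```

A call inside a method body resolves to the method, while a reference in a class field initializer falls back to the class definition.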

### 6. Workspace Index Construction

The workspace index provides efficient reverse-lookups for scope-based queries:

| Map | Purpose |
|-----|---------|
| `classScopeByDefId` | Class def `nodeId` → class `Scope` |
| `classScopeIdToDefId` | Class `Scope.id` → class def `nodeId` |

The `classScopeIdToDefId` inverse map enables the implicit-`this` overload picker to skip O(C) reverse scans. Source: [src/core/ingestion/scope-resolution/workspace-index.ts:18-27]()
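
A minimal sketch of maintaining both directions at build time is shown below, assuming a simplified `ClassScope` shape. The two map names match the table; `ClassScope`, `ClassScopeIndex`, and `buildClassScopeIndex` are illustrative names, not the module's real exports.

```typescript
// Simplified stand-in for a class scope: its own id plus the id of the
// class definition node it belongs to.
interface ClassScope {
  id: string;
  defNodeId: string;
}

interface ClassScopeIndex {
  classScopeByDefId: ReadonlyMap<string, ClassScope>;
  classScopeIdToDefId: ReadonlyMap<string, string>;
}

// Fills both maps in one pass so either direction is a constant-time lookup.
function buildClassScopeIndex(scopes: readonly ClassScope[]): ClassScopeIndex {
  const classScopeByDefId = new Map<string, ClassScope>();
  const classScopeIdToDefId = new Map<string, string>();
  for (const scope of scopes) {
    classScopeByDefId.set(scope.defNodeId, scope);
    classScopeIdToDefId.set(scope.id, scope.defNodeId);
  }
  return { classScopeByDefId, classScopeIdToDefId };
}
```

Without the inverse map, finding the definition for a scope id would mean scanning every entry of `classScopeByDefId`, which is the O(C) cost the text refers to.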

## MCP Tools (Model Context Protocol)

GitNexus exposes AI tooling through the Model Context Protocol (source: [src/mcp/resources.ts:15-27]()):

| Tool | Capability |
|------|------------|
| `query` | Process-grouped code intelligence—execution flows related to a concept |
| `context` | 360-degree symbol view—categorized refs, processes it participates in |
| `impact` | Symbol blast radius—what breaks at depth 1/2/3 with confidence |
| `detect_changes` | Git-diff impact—what do your current changes affect |
| `rename` | Multi-file coordinated rename with confidence-tagged edits |
| `cypher` | Raw graph queries |
| `list_repos` | Discover indexed repositories |

### MCP Resources

Each indexed repository exposes structured resources:

```
gitnexus://repo/{name}/context   → Stats, staleness check
gitnexus://repo/{name}/clusters  → All functional areas
gitnexus://repo/{name}/processes → All execution flows
gitnexus://repo/{name}/schema    → Graph schema for Cypher
```

Source: [src/mcp/resources.ts:29-33]()
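
The URI scheme above is regular enough to build and parse with a few lines of code. The helpers below are a sketch written for this page, not functions from `resources.ts`; in particular, percent-encoding the repository name is an assumption.

```typescript
const RESOURCE_KINDS = ['context', 'clusters', 'processes', 'schema'] as const;
type ResourceKind = (typeof RESOURCE_KINDS)[number];

// Builds a resource URI for one indexed repository.
// Percent-encoding the name is an assumption made for this sketch.
function resourceUri(repoName: string, kind: ResourceKind): string {
  return `gitnexus://repo/${encodeURIComponent(repoName)}/${kind}`;
}

// Parses a URI back into its parts; returns undefined for anything that
// does not match the four documented resource kinds.
function parseResourceUri(
  uri: string,
): { repoName: string; kind: ResourceKind } | undefined {
  const match = /^gitnexus:\/\/repo\/([^/]+)\/([a-z]+)$/.exec(uri);
  if (!match) return undefined;
  const kind = RESOURCE_KINDS.find((k) => k === match[2]);
  return kind ? { repoName: decodeURIComponent(match[1]), kind } : undefined;
}
```

For example, `resourceUri('GitNexus', 'schema')` yields `gitnexus://repo/GitNexus/schema`, the resource a client would read before writing a `cypher` query.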

## Repository Management

The `repo-manager.ts` module handles repository registration and working directory matching:

```mermaid
graph TD
    REG[Register Repo] --> CWDMATCH[CwdMatch Check]
    CWDMATCH --> MATCH{Result}
    MATCH -->|path| SAME[Same directory tree]
    MATCH -->|sibling-by-remote| SIBLING[Different clone<br/>same remote]
    MATCH -->|none| NONE[No relationship]
    SIBLING --> DRIFT[Calculate commit drift]
    DRIFT --> HINT[Optional warning]
```

The `CwdMatch` interface captures (source: [src/storage/repo-manager.ts:14-37]()):

- Whether `cwd` is the registered path, a sibling clone, or unrelated
- The git toplevel when `cwd` is inside a work tree
- HEAD commit information for drift calculation
- Human-readable warnings when drift is detected
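
The three outcomes in the diagram can be modeled as a small classification function. This is a sketch under stated assumptions: the `RegisteredRepo` and `CwdInfo` shapes, the `classifyCwd` name, and the POSIX-style path comparison are all invented for illustration; only the result labels `path`, `sibling-by-remote`, and `none` come from the text above.

```typescript
type CwdMatchKind = 'path' | 'sibling-by-remote' | 'none';

// Hypothetical inputs: what the registry stores and what Git reports for cwd.
interface RegisteredRepo {
  path: string;
  remoteUrl?: string;
}
interface CwdInfo {
  toplevel?: string; // git toplevel when cwd is inside a work tree
  remoteUrl?: string; // remote.origin.url of that work tree, if any
}

// Assumes normalized, POSIX-style absolute paths.
function classifyCwd(repo: RegisteredRepo, cwd: string, info: CwdInfo): CwdMatchKind {
  // Same directory tree: cwd is the registered path or lives beneath it.
  if (cwd === repo.path || cwd.startsWith(repo.path + '/')) return 'path';
  // Different clone of the same remote: a candidate for a drift warning.
  if (
    info.toplevel !== undefined &&
    repo.remoteUrl !== undefined &&
    info.remoteUrl === repo.remoteUrl
  ) {
    return 'sibling-by-remote';
  }
  return 'none';
}
```

Only the `sibling-by-remote` branch would go on to compare HEAD commits, since drift is meaningless for the other two outcomes.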

## Git Integration

GitNexus integrates with Git for repository identification and naming (source: [src/storage/git.ts:14-26]()):

```typescript
hasGitDir(dirPath)     // Check for .git directory
getRemoteOriginUrl(repoPath)  // Read remote.origin.url
sanitizeRepoName(name) // Prevent argument injection
```

The `sanitizeRepoName` function:
1. Strips leading dashes to prevent git command-line argument injection
2. Replaces filesystem-unsafe characters across Windows/macOS/Linux

Source: [src/storage/git.ts:45-52]()
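
The two sanitization steps could look roughly like the sketch below. It is deliberately named `sanitizeRepoNameSketch` because the exact character set and replacement character used by the real `sanitizeRepoName` are not shown on this page; both are assumptions here.

```typescript
// A sketch of the two documented steps, not the real implementation.
function sanitizeRepoNameSketch(name: string): string {
  // Step 1: strip leading dashes so the name can never be read as a
  // command-line option when passed to git.
  const withoutLeadingDashes = name.replace(/^-+/, '');
  // Step 2: replace characters that at least one of Windows, macOS, or
  // Linux rejects in file names, plus ASCII control characters.
  // The character set and the "_" replacement are assumptions.
  return withoutLeadingDashes.replace(/[<>:"/\\|?*\u0000-\u001f]/g, '_');
}
```

With these rules, a hostile name such as `--upload-pack=evil` becomes `upload-pack=evil`, and `org/repo:v1` becomes `org_repo_v1`.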

## Web Interface

### Graph Visualization

The interactive graph displays code relationships with visual encoding (source: [gitnexus-web/src/components/HelpPanel.tsx:7-9]()):

| Visual Element | Meaning |
|----------------|---------|
| Node size | Connection count—larger nodes are depended on by more files |
| Edge direction | Points from importer → imported |
| Edge color | Relationship type (import, call, inheritance) |

### Status Bar

The status bar displays indexing progress and repository state:

- Progress bar during analysis phases
- "Ready" indicator when idle
- Sponsor link for community support

Source: [gitnexus-web/src/components/StatusBar.tsx:7-18]()

## Cross-Impact Analysis

When analyzing symbol changes, GitNexus computes affected code across repositories:

```mermaid
flowchart TD
    CHANGE[User changes symbol S<br/>in repo R] --> LOCAL[Local impact engine<br/>per-repo uid expansion]
    LOCAL --> IDS[Affected uid set]
    IDS --> BRIDGE[Bridge query<br/>MATCH Contract WHERE uid IN ids]
    BRIDGE --> CL[CrossLink traversal]
    CL --> OTHER[Matching contract in<br/>other repo]
    OTHER --> FDO[Fan-out impact<br/>to consuming repo]
```

This shows how a change in one service propagates to dependent services through contract relationships. Source: [src/core/group/PIPELINE.md:31-40]()
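
To make the bridge step concrete, the sketch below runs the same traversal over in-memory arrays instead of a graph database. The `Contract` and `CrossLink` shapes, their field names, and `fanOutRepos` are assumptions for illustration; the real pipeline performs this step as a graph query.

```typescript
// Hypothetical in-memory shapes standing in for graph nodes and edges.
interface Contract {
  uid: string;
  symbolUid: string; // symbol that provides or consumes the contract
  repo: string;
}
interface CrossLink {
  fromContract: string;
  toContract: string;
}

// Given the symbol uids affected inside one repo, return the repos whose
// contracts are linked to any contract backed by those symbols.
function fanOutRepos(
  affectedSymbolUids: ReadonlySet<string>,
  contracts: readonly Contract[],
  links: readonly CrossLink[],
): Set<string> {
  const byUid = new Map(contracts.map((c) => [c.uid, c] as const));
  // Bridge step: contracts whose backing symbol is in the affected set.
  const touched = new Set(
    contracts.filter((c) => affectedSymbolUids.has(c.symbolUid)).map((c) => c.uid),
  );
  const repos = new Set<string>();
  for (const link of links) {
    // Links are followed in both directions in this sketch.
    const otherUid = touched.has(link.fromContract)
      ? link.toContract
      : touched.has(link.toContract)
        ? link.fromContract
        : undefined;
    const other = otherUid === undefined ? undefined : byUid.get(otherUid);
    if (other) repos.add(other.repo);
  }
  return repos;
}
```

Each returned repository would then run its own local impact expansion starting from the matching contract.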

## Semantic Model Contract

The scope resolution pipeline enforces a single source of truth (source: [src/core/ingestion/scope-resolution/contract/scope-resolver.ts:48-57]()):

> `ParsedFile` is the single semantic model consumed by both the legacy DAG and the scope-resolution pipeline.

Key invariants:
- Symbol-indexed lookups live on `SemanticModel` for the entire codebase
- Scope-shaped lookups (which `SemanticModel` doesn't carry) live on `WorkspaceResolutionIndex`
- Edges from `runScopeResolution` and legacy DAG are indistinguishable to downstream consumers

## Summary

GitNexus provides a comprehensive code intelligence platform with:

- **Multi-language parsing** via tree-sitter grammars
- **Framework-specific pattern detection** for gRPC, Laravel, Guzzle
- **Scope-aware reference resolution** with heritage tracking
- **AI-accessible tooling** through MCP
- **Cross-repository impact analysis** through contract graphs
- **Local-first architecture** ensuring data privacy

The system is designed for extensibility—adding new language support requires only grammar registration, and new framework patterns can be dropped into the extractors without modifying the orchestration pipeline.

---

<a id='quick-start'></a>

## Quick Start Guide

### Related Pages

Related topics: [Introduction to GitNexus](#introduction), [MCP Integration](#mcp-integration)

<details>
<summary>Relevant source files</summary>

The following source files were used to generate this page:

- [RUNBOOK.md](https://github.com/abhigyanpatwari/GitNexus/blob/main/RUNBOOK.md)
- [gitnexus/README.md](https://github.com/abhigyanpatwari/GitNexus/blob/main/gitnexus/README.md)
- [gitnexus-web/src/components/OnboardingGuide.tsx](https://github.com/abhigyanpatwari/GitNexus/blob/main/gitnexus-web/src/components/OnboardingGuide.tsx)
- [gitnexus-web/src/components/AnalyzeOnboarding.tsx](https://github.com/abhigyanpatwari/GitNexus/blob/main/gitnexus-web/src/components/AnalyzeOnboarding.tsx)
- [gitnexus-web/src/components/DropZone.tsx](https://github.com/abhigyanpatwari/GitNexus/blob/main/gitnexus-web/src/components/DropZone.tsx)
- [gitnexus-web/src/components/Header.tsx](https://github.com/abhigyanpatwari/GitNexus/blob/main/gitnexus-web/src/components/Header.tsx)
</details>


## Overview

GitNexus is a local-first code analysis tool that clones GitHub repositories, parses source code, and builds an interactive knowledge graph directly in your browser. The tool operates entirely on your machine—no data leaves your computer during analysis.

Source: [gitnexus/README.md](https://github.com/abhigyanpatwari/GitNexus/blob/main/gitnexus/README.md)

## System Requirements

| Requirement | Minimum Version | Notes |
|-------------|-----------------|-------|
| Node.js | 18.0.0+ | Runtime for the backend server |
| npm | 9.0.0+ | Package manager |
| Supported OS | macOS, Linux, Windows | Platform for running the server |
| Web Browser | Modern browser | Chrome, Firefox, Edge, or Safari |

Source: [gitnexus-web/src/components/OnboardingGuide.tsx](https://github.com/abhigyanpatwari/GitNexus/blob/main/gitnexus-web/src/components/OnboardingGuide.tsx)

## Installation Options

GitNexus supports two installation methods depending on your workflow preferences.

### Global Installation (Recommended)

Install GitNexus globally using npm for command-line access from any directory:

```bash
npm install -g gitnexus && gitnexus serve
```

Source: [gitnexus-web/src/components/OnboardingGuide.tsx](https://github.com/abhigyanpatwari/GitNexus/blob/main/gitnexus-web/src/components/OnboardingGuide.tsx)

### Local Installation

For project-specific usage, install locally within your project directory:

```bash
npm install gitnexus
npx gitnexus serve
```

## Getting Started Workflow

The onboarding process follows a three-step workflow that automatically connects the web interface to the backend server.

```mermaid
flowchart TD
    A[Install GitNexus] --> B[Start Server]
    B --> C[Auto-detect Server]
    C --> D[Open Graph UI]
    
    A1[Global: npm install -g] --> B
    A2[Local: npm install] --> B
    
    B --> B1[gitnexus serve]
    B1 --> B2[Server running on port]
    
    C --> C1[Backend Polling]
    C1 --> C2[Connection Established]
```

Source: [gitnexus-web/src/components/OnboardingGuide.tsx](https://github.com/abhigyanpatwari/GitNexus/blob/main/gitnexus-web/src/components/OnboardingGuide.tsx)

### Step 1: Install and Run

Open a terminal at your project root, then install GitNexus and start the server:

```bash
npm install -g gitnexus && gitnexus serve
```

The server will initialize and begin listening for connections from the web interface.

Source: [gitnexus-web/src/components/OnboardingGuide.tsx](https://github.com/abhigyanpatwari/GitNexus/blob/main/gitnexus-web/src/components/OnboardingGuide.tsx)

### Step 2: Wait for Server Startup

The web interface automatically detects when the server is ready through backend polling. A progress indicator shows the connection status while waiting.

| State | Description |
|-------|-------------|
| `onboarding` | Initial state, awaiting server connection |
| `analyze` | Server detected, ready for repository analysis |
| `landing` | Server connected with indexed repositories |
| `loading` | Processing in progress |
| `success` | Analysis complete |

Source: [gitnexus-web/src/components/DropZone.tsx](https://github.com/abhigyanpatwari/GitNexus/blob/main/gitnexus-web/src/components/DropZone.tsx)
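
The states in the table above can be read as a function of what the last poll reported. The sketch below is one plausible reading, not the logic in `DropZone.tsx`: the `ServerStatus` shape, the `viewFor` name, and the precedence between states are all assumptions.

```typescript
type DropZoneView = 'onboarding' | 'analyze' | 'landing' | 'loading' | 'success';

// Hypothetical summary of what a poll of the backend returned.
interface ServerStatus {
  reachable: boolean;
  indexedRepoCount: number;
  analysis: 'idle' | 'running' | 'done';
}

// Maps a poll result to the view to show. The ordering of the checks is
// an assumption; the real component may prioritize states differently.
function viewFor(status: ServerStatus): DropZoneView {
  if (!status.reachable) return 'onboarding';
  if (status.analysis === 'running') return 'loading';
  if (status.analysis === 'done') return 'success';
  return status.indexedRepoCount > 0 ? 'landing' : 'analyze';
}
```

Deriving the view from the poll result, rather than storing it separately, keeps the UI from showing `analyze` after the server has gone away.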

### Step 3: Auto-Connect

The page automatically detects the running server and establishes a WebSocket connection. No page refresh is required—the graph UI loads once the connection is established.

## Analyzing a Repository

After connecting to the server, you can analyze any public GitHub repository.

```mermaid
flowchart LR
    A[Enter GitHub URL] --> B[Server Clones Repo]
    B --> C[Parse Code Structure]
    C --> D[Build Knowledge Graph]
    D --> E[Interactive UI]
```

### Supported Repository Types

| Type | Support | Notes |
|------|---------|-------|
| Public Repositories | ✅ Full | Clone and analyze available |
| Private Repositories | ❌ Not supported | Requires authentication (future) |
| Local Repositories | ✅ Via server path | Mounted directories on server machine |

Source: [gitnexus-web/src/components/AnalyzeOnboarding.tsx](https://github.com/abhigyanpatwari/GitNexus/blob/main/gitnexus-web/src/components/AnalyzeOnboarding.tsx)

### Analysis Workflow

1. **Enter Repository URL**: Paste a GitHub repository URL into the input field
2. **Start Analysis**: Click the analyze button to begin processing
3. **Monitor Progress**: View real-time progress with phase indicators
4. **Cancel if Needed**: Stop analysis at any point before completion

```typescript
// Analysis phases observed in the codebase
type AnalysisPhase = 
  | 'cloning'      // Repository being cloned
  | 'parsing'      // Code being parsed
  | 'indexing'     // Building graph index
  | 'complete';    // Analysis finished
```

Source: [gitnexus-web/src/components/StatusBar.tsx](https://github.com/abhigyanpatwari/GitNexus/blob/main/gitnexus-web/src/components/StatusBar.tsx)

### Privacy and Data Handling

> Public repos only · Cloned locally by the server · No data leaves your machine

Repositories are cloned directly to your local machine. Code parsing and graph generation occur locally without external data transmission.

Source: [gitnexus-web/src/components/AnalyzeOnboarding.tsx](https://github.com/abhigyanpatwari/GitNexus/blob/main/gitnexus-web/src/components/AnalyzeOnboarding.tsx)

## Managing Multiple Repositories

When multiple repositories have been analyzed, you can switch between them using the repository dropdown in the header.

```mermaid
flowchart TD
    A[Repository Dropdown] --> B{Select Repo}
    B -->|Same Repo| C[Stay on Current View]
    B -->|Different Repo| D[Switch Repository]
    D --> E[Load New Graph]
    
    F[Active Repository] --> G[Highlighted with accent]
    H[Other Repositories] --> I[Normal styling]
```

### Repository Selection UI

| Element | Visual Indicator | Behavior |
|---------|------------------|----------|
| Active repo | `border-l-2 border-accent` with accent background | Current selection |
| Other repos | Hover state with `hover:bg-hover` | Clickable to switch |
| Re-analyze button | Available per repository | Trigger fresh analysis |

Source: [gitnexus-web/src/components/Header.tsx](https://github.com/abhigyanpatwari/GitNexus/blob/main/gitnexus-web/src/components/Header.tsx)

### Switching Repositories

1. Open the repository dropdown menu
2. Click on any repository name
3. The graph view updates automatically to show the selected repository's knowledge graph
4. The dropdown closes after selection

```typescript
// Header component handles repo switching
onClick={() => {
  if (repo.name !== projectName) onSwitchRepo?.(repo.name);
  setIsRepoDropdownOpen(false);
}}
```

## Troubleshooting

### Server Detection Issues

If the web interface fails to detect the running server:

| Symptom | Solution |
|---------|----------|
| "Connecting..." message persists | Verify server is running with `gitnexus serve` |
| Server on different port | Check which port the server is listening on (`--port`, default 3001) and make sure it is not blocked |
| Browser console errors | Check WebSocket connection in developer tools |

### Analysis Failures

| Error | Cause | Resolution |
|-------|-------|------------|
| Validation error | Invalid GitHub URL format | Enter a valid `https://github.com/user/repo` URL |
| Network timeout | Large repository | Retry or clone locally and use local path |
| Parse errors | Unsupported language | Check supported languages list |

Source: [gitnexus-web/src/components/RepoAnalyzer.tsx](https://github.com/abhigyanpatwari/GitNexus/blob/main/gitnexus-web/src/components/RepoAnalyzer.tsx)
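
The validation error in the table above corresponds to a URL check along the lines of the sketch below. The `parseGitHubRepoUrl` helper and its regular expression are assumptions written for this page; the rules `RepoAnalyzer.tsx` actually applies may be stricter or looser.

```typescript
// Accepts https://github.com/<owner>/<repo>, optionally ending in ".git"
// or a trailing slash. Returns undefined for anything else.
function parseGitHubRepoUrl(
  input: string,
): { owner: string; repo: string } | undefined {
  const pattern =
    /^https:\/\/github\.com\/([A-Za-z0-9-]+)\/([A-Za-z0-9._-]+?)(?:\.git)?\/?$/;
  const match = pattern.exec(input.trim());
  return match ? { owner: match[1], repo: match[2] } : undefined;
}
```

Checking the shape on the client avoids a round trip to the server for inputs that could never be cloned, such as a bare user profile URL.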

## Configuration

### Backend Server Configuration

The server can be configured via environment variables or command-line arguments:

| Option | Default | Description |
|--------|---------|-------------|
| `--port` | 3001 | Server listening port |
| `--host` | localhost | Server bind address |
| `--data-dir` | ~/.gitnexus | Local storage for cloned repos |

Source: [RUNBOOK.md](https://github.com/abhigyanpatwari/GitNexus/blob/main/RUNBOOK.md)

### Frontend Settings

The web interface provides configuration options through the Settings panel:

- **API Provider Selection**: Choose between Ollama and other providers
- **Model Configuration**: Set the language model for Graph RAG features
- **Connection Testing**: Verify backend connectivity

## Next Steps

After completing the quick start:

| Task | Description |
|------|-------------|
| Explore the Graph | Navigate nodes, zoom, and pan to explore code relationships |
| Search Symbols | Use the search feature to find functions, classes, and imports |
| View Dependencies | Click nodes to see import/export relationships |
| Enable AI Features | Configure Ollama in settings for Graph RAG capabilities |
| Re-analyze Repos | Use the re-analyze button to refresh repository data |

Source: [gitnexus-web/src/components/HelpPanel.tsx](https://github.com/abhigyanpatwari/GitNexus/blob/main/gitnexus-web/src/components/HelpPanel.tsx)

---

<a id='key-concepts'></a>

## Key Concepts

### Related Pages

Related topics: [Knowledge Graph](#knowledge-graph), [Indexing Pipeline](#indexing-pipeline), [MCP Integration](#mcp-integration)

<details>
<summary>Relevant source files</summary>

The following source files were used to generate this page:

- [gitnexus/src/core/graph/types.ts](https://github.com/abhigyanpatwari/GitNexus/blob/main/gitnexus/src/core/graph/types.ts)
- [gitnexus/src/core/embeddings/types.ts](https://github.com/abhigyanpatwari/GitNexus/blob/main/gitnexus/src/core/embeddings/types.ts)
- [gitnexus/src/core/ingestion/model/heritage-map.ts](https://github.com/abhigyanpatwari/GitNexus/blob/main/gitnexus/src/core/ingestion/model/heritage-map.ts)
- [gitnexus/src/core/group/PIPELINE.md](https://github.com/abhigyanpatwari/GitNexus/blob/main/gitnexus/src/core/group/PIPELINE.md)
- [gitnexus/src/core/ingestion/call-processor.ts](https://github.com/abhigyanpatwari/GitNexus/blob/main/gitnexus/src/core/ingestion/call-processor.ts)
- [gitnexus/src/core/ingestion/heritage-processor.ts](https://github.com/abhigyanpatwari/GitNexus/blob/main/gitnexus/src/core/ingestion/heritage-processor.ts)
- [gitnexus/src/core/ingestion/utils/ast-helpers.ts](https://github.com/abhigyanpatwari/GitNexus/blob/main/gitnexus/src/core/ingestion/utils/ast-helpers.ts)
- [gitnexus/src/core/tree-sitter/parser-loader.ts](https://github.com/abhigyanpatwari/GitNexus/blob/main/gitnexus/src/core/tree-sitter/parser-loader.ts)
- [gitnexus/src/core/group/extractors/grpc-patterns/node.ts](https://github.com/abhigyanpatwari/GitNexus/blob/main/gitnexus/src/core/group/extractors/grpc-patterns/node.ts)
- [gitnexus/src/core/group/extractors/http-patterns/php.ts](https://github.com/abhigyanpatwari/GitNexus/blob/main/gitnexus/src/core/group/extractors/http-patterns/php.ts)
</details>

GitNexus is a code repository analysis platform that combines AST-based parsing, dependency graph construction, and semantic embeddings to expose a codebase's architecture. This page covers the core concepts and data structures behind that analysis.

---

## Graph Data Model

The graph model forms the core of GitNexus's representation of codebases. Every symbol, import, and relationship in a repository is abstracted into graph nodes and edges.

### Node Types

The graph supports multiple node types representing different code constructs:

| Node Type | Description | Examples |
|-----------|-------------|----------|
| `Class` | Class/struct definitions | `class UserService`, `struct Point` |
| `Interface` | Interface/type definitions | `interface IRepository`, `trait EventHandler` |
| `Function` | Function/method definitions | `function calculateTotal()`, `def process()` |
| `Method` | Class/instance methods | `void sendEmail()`, `async def fetch()` |
| `Module` | File-level modules/packages | `namespace App\Services`, `module db` |
| `Package` | Package/module declarations | `package main`, `module foo` |

### Edge Types

Edges represent relationships between nodes:

| Edge Type | Direction | Description |
|-----------|-----------|-------------|
| `imports` | importer → imported | Direct import/require statements |
| `calls` | caller → callee | Function/method invocations |
| `extends` | subclass → parent | Inheritance relationships |
| `implements` | impl → interface | Interface implementation |
| `uses` | consumer → provider | Framework/annotation usage |
| `contains` | container → contained | Structural containment |

### Node Identifiers

Nodes are identified using a deterministic ID scheme:

```typescript
generateId(nodeType: string, uniqueKey: string): string
```

The `uniqueKey` typically combines the file path with the symbol name, ensuring uniqueness across the codebase.

Source: [gitnexus/src/core/graph/types.ts]()
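
A deterministic scheme of this kind can be as simple as joining the parts with a separator. The sketch below is illustrative only: the `:` separator, the `symbolKey` helper, and the `generateIdSketch` name are assumptions, and the real `generateId` may format or hash its inputs differently.

```typescript
// Combines a file path and symbol name into a key that is unique within
// the codebase. The ":" separator is an assumption for this sketch.
function symbolKey(filePath: string, symbolName: string): string {
  return `${filePath}:${symbolName}`;
}

// Same inputs always yield the same id, so re-indexing a file produces
// ids that line up with the previous run.
function generateIdSketch(nodeType: string, uniqueKey: string): string {
  return `${nodeType}:${uniqueKey}`;
}
```

For instance, `generateIdSketch('Function', symbolKey('src/cart.ts', 'calculateTotal'))` produces `Function:src/cart.ts:calculateTotal`. Including the node type keeps a class and a function with the same name in the same file from colliding.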

---

## Ingestion Pipeline

The ingestion pipeline is a multi-phase system that transforms raw source code into structured graph data.

### Phase Overview

```mermaid
flowchart TD
    subgraph Phase1[Phase 1: Discovery]
        SC[Scan Files] --> PL[Parse Languages]
        PL --> ID[Identify Code Units]
    end
    
    subgraph Phase2[Phase 2: Extraction]
        ID --> HA[Heritage Analysis]
        HA --> CA[Call Analysis]
    end
    
    subgraph Phase3[Phase 3: Linking]
        CA --> RI[Resolve Imports]
        RI --> XR[Cross-Repo Links]
    end
    
    subgraph Phase4[Phase 4: Indexing]
        XR --> EMB[Generate Embeddings]
        EMB --> STORE[Persist to Database]
    end
```

### Heritage Processing

Heritage processing extracts inheritance relationships from source code. The `HeritageMap` maintains parent-child relationships across multiple inheritance scenarios:

```typescript
interface ParentEntry {
  readonly kind: 'extends' | 'implements';
  readonly parentName: string;
  readonly filePath: string;
}
```

The heritage processor handles:

- **Simple inheritance**: Single class extension
- **Multiple inheritance**: Multiple parent classes (with C3 linearization for languages supporting it)
- **Interface implementation**: Distinguishing between `extends` and `implements` based on language semantics

Source: [gitnexus/src/core/ingestion/heritage-processor.ts:1-50]()
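
For readers unfamiliar with C3 linearization, the sketch below applies the standard algorithm to a map of `ParentEntry` lists. It is a generic illustration, not code from the heritage processor: the `HeritageSketch` type, `c3Merge`, and `linearize` are hypothetical names, and the sketch assumes an acyclic hierarchy keyed by class name.

```typescript
interface ParentEntry {
  readonly kind: 'extends' | 'implements';
  readonly parentName: string;
  readonly filePath: string;
}

// Hypothetical stand-in for a heritage map: child name -> direct parents,
// listed in declaration order.
type HeritageSketch = ReadonlyMap<string, readonly ParentEntry[]>;

// Standard C3 merge: repeatedly take the first head that does not appear
// in the tail of any remaining sequence.
function c3Merge(sequences: string[][]): string[] {
  const result: string[] = [];
  let pending = sequences.filter((s) => s.length > 0);
  while (pending.length > 0) {
    const head = pending
      .map((s) => s[0])
      .find((c) => pending.every((s) => !s.slice(1).includes(c)));
    if (head === undefined) {
      throw new Error('Inconsistent hierarchy: no C3 linearization exists');
    }
    result.push(head);
    pending = pending
      .map((s) => (s[0] === head ? s.slice(1) : s))
      .filter((s) => s.length > 0);
  }
  return result;
}

// Linearization of a class: itself, then the merge of its parents'
// linearizations and the list of direct parents. Assumes no cycles.
function linearize(name: string, heritage: HeritageSketch): string[] {
  const parents = (heritage.get(name) ?? []).map((p) => p.parentName);
  return [name, ...c3Merge([...parents.map((p) => linearize(p, heritage)), parents])];
}
```

For a diamond where `D` extends `B` and `C`, and both extend `A`, `linearize('D', ...)` returns `D, B, C, A`: each class appears once and before its own parents.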

### Call Processing

The call processor analyzes function invocations and return types across file boundaries:

| Phase | Purpose |
|-------|---------|
| P (per-file) | Build type environment and resolve local calls |
| E1-E3 (cross-file) | Accumulate imported bindings, return types, and type maps |
| Registry Primary | Languages using built-in registry skip certain phases |

The processor uses a local-first resolution strategy, preferring local symbol matches over imported ones:

```typescript
// Consulted ONLY when SymbolTable has no unambiguous match
importedReturnTypesMap?: ReadonlyMap<string, ReadonlyMap<string, string>>
```

Source: [gitnexus/src/core/ingestion/call-processor.ts:1-80]()
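
The local-first rule can be sketched as a two-step lookup. Everything apart from the `importedReturnTypesMap` type is an assumption here: the `localCandidates` map stands in for the `SymbolTable`, the outer map key is taken to be a file path and the inner key a callee name, and `resolveReturnType` is a hypothetical helper.

```typescript
type ReturnTypesByFile = ReadonlyMap<string, ReadonlyMap<string, string>>;

function resolveReturnType(
  callee: string,
  // Stand-in for the SymbolTable: callee name -> candidate return types.
  localCandidates: ReadonlyMap<string, readonly string[]>,
  // Files the calling file imports from, in import order.
  importedFrom: readonly string[],
  importedReturnTypesMap?: ReturnTypesByFile,
): string | undefined {
  // Step 1: a single local candidate is unambiguous and always wins.
  const local = localCandidates.get(callee) ?? [];
  if (local.length === 1) return local[0];
  // Step 2: with zero or several local candidates, consult imported
  // bindings and take the first file that knows the callee.
  for (const file of importedFrom) {
    const imported = importedReturnTypesMap?.get(file)?.get(callee);
    if (imported !== undefined) return imported;
  }
  return undefined;
}
```

Making the imported map optional mirrors the signature above: a per-file pass can run the same resolver before any cross-file data has been accumulated.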

---

## Pattern Extraction System

GitNexus uses a pattern-based extraction system that allows framework and language-specific analyzers to be added without modifying the core orchestrator.

### Pattern Bundle Architecture

Each language/framework combination defines a `PatternBundle` containing compiled tree-sitter queries:

```typescript
interface GrpcPatternBundle {
  grpcMethod: CompiledPatterns<Record<string, never>>;
  grpcClient: CompiledPatterns<Record<string, never>>;
  getService: CompiledPatterns<Record<string, never>>;
  newSimpleCtor: CompiledPatterns<Record<string, never>>;
  newQualifiedCtor: CompiledPatterns<Record<string, never>>;
  loadPackageDefinition: CompiledPatterns<Record<string, never>>;
}
```

### Language-Specific Extractors

The system supports multiple language extractors:

#### JavaScript/TypeScript gRPC Patterns

Extracts gRPC service definitions from Node.js/TypeScript codebases:

- `grpcMethod`: gRPC method declarations
- `grpcClient`: Client instantiation patterns
- `getService`: Service retrieval patterns

#### PHP HTTP Patterns

Handles PHP-specific HTTP interactions:

| Pattern | Tree-sitter Query Target |
|---------|--------------------------|
| `laravelRoute` | Laravel route definitions |
| `httpFacade` | HTTP facade usage |
| `guzzleMember` | Guzzle HTTP client patterns |
| `fileGetContents` | File reading operations |

```typescript
const LARAVEL_ROUTE_SPEC: PatternSpec<Record<string, never>> = {
  meta: {},
  query: `
    (function_call_expression
      function: (name) @fn (#eq? @fn "Route::get")
      arguments: (arguments . (argument (string) @path)))
  `,
};
```

Source: [gitnexus/src/core/group/extractors/http-patterns/php.ts:1-60]()

### Extensibility Model

Adding new language support requires only dropping a new file in `*-patterns/` and registering it in `index.ts`. The orchestrator never imports individual grammars, ensuring loose coupling.

Source: [gitnexus/src/core/group/PIPELINE.md:1-20]()
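
One way to picture this decoupling is a plugin registry that the orchestrator queries by file extension. The sketch below is an assumption-laden illustration: `PatternPlugin`, `registerPlugin`, and `extractContracts` are hypothetical names, and a regular expression stands in for a compiled tree-sitter query.

```typescript
// Hypothetical sketch of a plugin-style registry; all names are illustrative.
interface PatternPlugin {
  readonly name: string;
  readonly extensions: readonly string[];
  extract(source: string): string[]; // returns extracted contract ids
}

const registry = new Map<string, PatternPlugin>();

// What a per-language file would call when it is registered in index.ts.
function registerPlugin(plugin: PatternPlugin): void {
  for (const ext of plugin.extensions) registry.set(ext, plugin);
}

// The orchestrator only sees the registry, never a concrete grammar.
function extractContracts(filePath: string, source: string): string[] {
  const ext = filePath.slice(filePath.lastIndexOf("."));
  return registry.get(ext)?.extract(source) ?? [];
}

registerPlugin({
  name: "php-http",
  extensions: [".php"],
  // Stand-in for a tree-sitter query: a regex that finds Route::get paths.
  extract: (src) => {
    const re = /Route::get\(\s*'([^']+)'/g;
    const out: string[] = [];
    let m: RegExpExecArray | null;
    while ((m = re.exec(src)) !== null) out.push(`GET ${m[1]}`);
    return out;
  },
});

const found = extractContracts("routes/web.php", "Route::get('/users', fn() => 1);");
const skipped = extractContracts("main.rs", "fn main() {}");
```

Supporting another language in this sketch means one more `registerPlugin` call; `extractContracts` stays unchanged.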

---

## Tree-Sitter Parser Integration

GitNexus leverages tree-sitter for robust, error-tolerant parsing across multiple programming languages.

### Supported Languages

| Language | Package | Notes |
|----------|---------|-------|
| TypeScript | `tree-sitter-typescript` | Reuses TSX parser |
| TSX | `tree-sitter-typescript` | Shares binding with TS |
| Python | `tree-sitter-python` | Standard Python 3 syntax |
| Java | `tree-sitter-java` | Java 17+ features |
| C# | `tree-sitter-c-sharp` | Uses explicit subpath for Node 22 compatibility |
| C++ | `tree-sitter-cpp` | C++20 features supported |
| PHP | `tree-sitter-php` | PHP 8.x syntax |

### Parser Loader

The parser loader provides dynamic language loading with graceful fallback:

```typescript
[SupportedLanguages.CSharp]: {
  load: () => _require('tree-sitter-c-sharp/bindings/node/index.js'),
  unavailableNote: 'C# parsing requires `tree-sitter-c-sharp/bindings/node/index.js`'
}
```

The explicit subpath import for C# bypasses Node 22's DEP0151 deprecation warning on bare-package imports.

### AST Caching

Parsed ASTs are cached to avoid redundant parsing:

```typescript
let tree = astCache.get(file.path);
if (!tree) {
  const parseContent = provider.preprocessSource?.(file.content, file.path) ?? file.content;
  tree = parseSourceSafe(parser, parseContent, undefined, {
    bufferSize: getTreeSitterBufferSize(parseContent)
  });
  astCache.set(file.path, tree);
}
```

Source: [gitnexus/src/core/tree-sitter/parser-loader.ts:1-80]()

---

## Semantic Embeddings

GitNexus provides semantic search capabilities through code embeddings, enabling natural language queries against the codebase.

### WebGPU Acceleration

Embedding generation can utilize WebGPU for GPU-accelerated computation:

```typescript
interface EmbeddingConfig {
  backend: 'webgpu' | 'wasm' | 'cpu';
  model: string;
  dimensions: number;
}
```

### Fallback Strategy

When WebGPU is unavailable, the system offers graceful degradation:

| Option | Performance | Description |
|--------|-------------|-------------|
| CPU | Slow | Works everywhere; large codebases can take minutes |
| WebAssembly | Moderate | Cross-browser compatibility |
| WebGPU | Fast | Best performance on supported browsers |

The fallback dialog provides estimated processing time based on repository size:

```tsx
<span>~{estimatedMinutes} min for {nodeCount} nodes</span>
```

Source: [gitnexus/src/core/embeddings/types.ts]()
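
The degradation order can be sketched as a small selection function. Everything below is hypothetical: the `Capabilities` shape, `pickBackend`, and the linear cost model in `estimateMinutes` are assumptions for illustration, and the per-node rate is a caller-supplied parameter rather than a measured figure.

```typescript
// Hypothetical sketch: pick the fastest available embedding backend.
type Backend = "webgpu" | "wasm" | "cpu";

interface Capabilities {
  webgpu: boolean;
  wasm: boolean;
}

function pickBackend(caps: Capabilities): Backend {
  if (caps.webgpu) return "webgpu"; // fastest, needs browser support
  if (caps.wasm) return "wasm";     // moderate
  return "cpu";                     // always available, slowest
}

// Assumed linear cost model: the caller supplies the seconds-per-node rate.
function estimateMinutes(nodeCount: number, secondsPerNode: number): number {
  return Math.ceil((nodeCount * secondsPerNode) / 60);
}

const best = pickBackend({ webgpu: true, wasm: true });
const degraded = pickBackend({ webgpu: false, wasm: true });
const last = pickBackend({ webgpu: false, wasm: false });
const minutes = estimateMinutes(1200, 0.5);
```

Rounding the estimate up avoids showing "0 min" for small repositories.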

---

## Cross-Repository Impact Analysis

GitNexus supports tracking dependencies and impacts across multiple repositories.

### Contract System

Contracts define explicit dependencies between repositories:

```mermaid
flowchart TD
    Y[group.yaml links] --> ME[ManifestExtractor]
    ME --> LOOP{for each link}
    LOOP --> RES[resolveSymbol label-scoped Cypher]
    RES --> OK{found?}
    OK -->|yes| REF[real symbol uid + ref]
    OK -->|no| SYN[synthetic uid manifest::repo::cid]
```

### Cross-Impact Query Engine

When a symbol changes in one repository, GitNexus can identify affected contracts in dependent repositories:

```mermaid
flowchart TD
    U[User changes symbol S in repo R] --> LI[Local impact engine per-repo uid expansion]
    LI --> IDS[Affected uid set]
    IDS --> BR[Bridge query MATCH Contract WHERE uid IN ids]
    BR --> CL[CrossLink traversal]
    CL --> OTHER[Matching contract in other repo]
    OTHER --> FE[Fan-out impact to consuming repo]
```

### Label-Scoped Resolution

Queries use label-scoped matching to prevent cross-contamination:

| Topic | Node Pattern |
|-------|--------------|
| `topic` | `(n:Function\|Method\|Class\|Interface)` |
| `grpc` method | `(n:Function\|Method)` |
| `grpc` service | `(n:Class\|Interface)` |
| `lib` | `(n:Package\|Module)` |

Source: [gitnexus/src/core/group/PIPELINE.md:40-80]()
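
As a rough illustration of label scoping, the table above can be turned into a lookup that produces the node pattern for a query. The `ContractKind` names, the `buildResolveQuery` helper, and the `name` and `uid` properties are assumptions; only the label sets come from the table.

```typescript
// Hypothetical sketch: map a contract kind to the node labels it may match.
type ContractKind = "topic" | "grpc-method" | "grpc-service" | "lib";

const LABELS: Record<ContractKind, readonly string[]> = {
  "topic": ["Function", "Method", "Class", "Interface"],
  "grpc-method": ["Function", "Method"],
  "grpc-service": ["Class", "Interface"],
  "lib": ["Package", "Module"],
};

// Builds a parameterized query; the property names `name` and `uid` are assumed.
function buildResolveQuery(kind: ContractKind): string {
  const pattern = `(n:${LABELS[kind].join("|")})`;
  return `MATCH ${pattern} WHERE n.name = $name RETURN n.uid LIMIT 1`;
}

const grpcMethodQuery = buildResolveQuery("grpc-method");
const libQuery = buildResolveQuery("lib");
```

Because the label set is fixed per kind, a `lib` link can never resolve to a `Function` node that happens to share its name.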

---

## Symbol Resolution

### Binding Accumulator

The binding accumulator tracks symbol-to-type mappings across the codebase:

```typescript
/** Phase 14 E1: imported bindings to seed into buildTypeEnv */
importedBindingsMap?: ReadonlyMap<string, ReadonlyMap<string, string>>
```

### Type Environment

Type resolution follows a strict priority order:

1. Local symbol table matches (highest priority)
2. Imported return types map
3. Imported raw return types (for iteration)
4. Cross-file heritage map fallback

### AST Helpers

The AST helper utilities provide common operations across different language grammars:

```typescript
interface ClassInfo {
  classId: string;
  className: string;
}

// Handles language-specific node traversal
const nameNode = children.find((c: SyntaxNode) =>
  c.type === 'type_identifier' ||
  c.type === 'identifier' ||
  c.type === 'name' ||
  c.type === 'constant'
);
```

Special handling exists for:
- Kotlin's anonymous `interface` keyword in class declarations
- For-loop element type extraction
- Scoped type identifiers

Source: [gitnexus/src/core/ingestion/utils/ast-helpers.ts:1-80]()

---

## Architecture Summary

GitNexus's architecture follows a clean separation of concerns:

```mermaid
graph TB
    subgraph Ingestion["Ingestion Layer"]
        PS[Parser Selector] --> TP[TypeScript Parser]
        PS --> PP[Python Parser]
        PS --> JP[Java Parser]
        TP --> QE[Query Engine]
        PP --> QE
        JP --> QE
        QE --> H[Heritage Map]
        QE --> C[Call Processor]
    end
    
    subgraph Pattern["Pattern Extraction"]
        G[gRPC Bundle] --> E[Extractor]
        HTTP[PHP HTTP] --> E
        E --> SG[Symbol Graph]
    end
    
    subgraph Storage["Storage & Query"]
        SG --> DB[(Graph DB)]
        H --> DB
        C --> DB
    end
    
    subgraph UI["Web Interface"]
        DZ[DropZone] --> HX[Header]
        HX --> SB[StatusBar]
        SB --> GP[Graph Panel]
    end
    
    DB --> GP
    EMB[Embeddings] --> GP
```

### Key Design Principles

1. **Registry Pattern**: Languages using built-in registries (e.g., Python, Java) skip certain processing phases
2. **Event Loop Yielding**: Long-running operations yield to the event loop every 20 files to prevent blocking
3. **Error-Tolerant Parsing**: Parse failures are logged and skipped without halting the entire process
4. **Lazy Language Loading**: Tree-sitter language parsers are loaded on-demand to minimize memory footprint

---

## See Also

- [Architecture Overview](../architecture/overview.md)
- [Pattern Extractors](../extractors/index.md)
- [API Reference](../api/index.md)

---

<a id='system-architecture'></a>

## System Architecture

### Related Pages

Related topics: [Package Structure](#package-structure), [Indexing Pipeline](#indexing-pipeline), [MCP Integration](#mcp-integration), [Multi-Repo Registry Architecture](#multi-repo-registry)

<details>
<summary>Relevant Source Files</summary>

The following source files were used to generate this documentation:

- [gitnexus/src/core/group/PIPELINE.md](https://github.com/abhigyanpatwari/GitNexus/blob/main/gitnexus/src/core/group/PIPELINE.md)
- [gitnexus/src/core/ingestion/scope-resolution/contract/scope-resolver.ts](https://github.com/abhigyanpatwari/GitNexus/blob/main/gitnexus/src/core/ingestion/scope-resolution/contract/scope-resolver.ts)
- [gitnexus/src/core/ingestion/scope-extractor.ts](https://github.com/abhigyanpatwari/GitNexus/blob/main/gitnexus/src/core/ingestion/scope-extractor.ts)
- [gitnexus/src/core/ingestion/heritage-processor.ts](https://github.com/abhigyanpatwari/GitNexus/blob/main/gitnexus/src/core/ingestion/heritage-processor.ts)
- [gitnexus/src/core/ingestion/call-processor.ts](https://github.com/abhigyanpatwari/GitNexus/blob/main/gitnexus/src/core/ingestion/call-processor.ts)
- [gitnexus/src/core/tree-sitter/parser-loader.ts](https://github.com/abhigyanpatwari/GitNexus/blob/main/gitnexus/src/core/tree-sitter/parser-loader.ts)
- [gitnexus/src/core/ingestion/scope-resolution/workspace-index.ts](https://github.com/abhigyanpatwari/GitNexus/blob/main/gitnexus/src/core/ingestion/scope-resolution/workspace-index.ts)
- [gitnexus-shared/src/scope-resolution/parsed-file.ts](https://github.com/abhigyanpatwari/GitNexus/blob/main/gitnexus-shared/src/scope-resolution/parsed-file.ts)
- [gitnexus/src/core/ingestion/model/heritage-map.ts](https://github.com/abhigyanpatwari/GitNexus/blob/main/gitnexus/src/core/ingestion/model/heritage-map.ts)
- [gitnexus/src/storage/git.ts](https://github.com/abhigyanpatwari/GitNexus/blob/main/gitnexus/src/storage/git.ts)
</details>

GitNexus is a code analysis and visualization platform that provides dependency graph generation, scope-aware symbol resolution, and cross-repository impact analysis. The system architecture is designed around a multi-stage ingestion pipeline that transforms source code into a queryable knowledge graph.

## Overview

GitNexus operates as a client-server application:

- **Backend Server** (`gitnexus/src/`): A Node.js server responsible for code ingestion, AST parsing, scope resolution, and graph construction
- **Web Frontend** (`gitnexus-web/src/`): A React-based visualization layer that renders the interactive dependency graph

The architecture emphasizes separation of concerns: language-specific extraction logic is isolated from the core orchestration, enabling extensibility without modifying the main pipeline.

Source: [gitnexus/src/core/group/PIPELINE.md](https://github.com/abhigyanpatwari/GitNexus/blob/main/gitnexus/src/core/group/PIPELINE.md)

## Core Architectural Principles

### Single Source of Truth

The `ParsedFile` model (defined in `gitnexus-shared/src/scope-resolution/parsed-file.ts`) serves as the single semantic model consumed by both the legacy DAG and the scope-resolution pipeline. Scope-resolution passes must not build a parallel parse representation.

Source: [gitnexus-shared/src/scope-resolution/parsed-file.ts:1-40](https://github.com/abhigyanpatwari/GitNexus/blob/main/gitnexus-shared/src/scope-resolution/parsed-file.ts)

### Same-Graph Guarantee

Edges emitted by `runScopeResolution` and edges emitted by the legacy DAG are indistinguishable to downstream consumers:

| Aspect | Legacy DAG | Scope Resolution |
|--------|------------|-------------------|
| Node Identity | `generateId()` with same qualified-name keyspace | Identical |
| Edge Vocabulary | `'import-resolved' \| 'global' \| 'local-call' \| 'same-file' \| 'interface-dispatch' \| 'read' \| 'write'` | Identical |

This ensures transparent interoperability between both code paths.

Source: [gitnexus/src/core/ingestion/scope-resolution/contract/scope-resolver.ts](https://github.com/abhigyanpatwari/GitNexus/blob/main/gitnexus/src/core/ingestion/scope-resolution/contract/scope-resolver.ts)

### Language Extensibility

The orchestrator never imports a grammar directly. Adding support for a new language or framework requires only:

1. Dropping one file in `*-patterns/` directory
2. Registering it in `index.ts`

No orchestrator edits are required.

Source: [gitnexus/src/core/group/PIPELINE.md](https://github.com/abhigyanpatwari/GitNexus/blob/main/gitnexus/src/core/group/PIPELINE.md)

## Ingestion Pipeline

The ingestion pipeline processes source code through a series of stages to build the dependency graph.

```mermaid
graph TD
    subgraph Ingestion
        FS[File System] --> SCOPE[Scope Extractor]
        SCOPE --> PARSE[Parsing Phase]
        PARSE --> CALLS[Call Processor]
        CALLS --> HERITAGE[Heritage Processor]
        HERITAGE --> FINALIZE[Graph Finalization]
    end
    
    subgraph Language Support
        PARSER[Parser Loader] --> TS[Tree-sitter Grammars]
        TS --> LANG[Language Registry]
        LANG --> PROVIDER[Language Providers]
    end
    
    SCOPE -.-> PROVIDER
    PARSE -.-> PROVIDER
    CALLS -.-> PROVIDER
    HERITAGE -.-> PROVIDER
```

### Stage 1: Scope Extraction

The `ScopeExtractor` is the central, source-agnostic driver that converts language provider captures into `ParsedFile` artifacts.

**Key responsibilities:**
- Build scope tree from `@scope.*` matches
- Maintain structural invariants (non-module has parent; parent contains child)
- Produce `ParsedFile` containing: `scopes`, `parsedImports`, `localDefs`, `referenceSites`

**Design principles:**
- Source-agnostic: Consumes `CaptureMatch[]` from providers
- One AST walk per language: Providers handle AST traversal internally
- Pure-ish: Same matches produce same `ParsedFile` output
- Centralized invariant enforcement via `buildScopeTree`

Source: [gitnexus/src/core/ingestion/scope-extractor.ts:1-50](https://github.com/abhigyanpatwari/GitNexus/blob/main/gitnexus/src/core/ingestion/scope-extractor.ts)
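
The two structural invariants named above can be checked in a few lines. This is a sketch under assumptions: `ScopeSketch` and `checkInvariants` are hypothetical, and containment is modeled with byte offsets, which may differ from how `buildScopeTree` represents ranges.

```typescript
// Hypothetical sketch of the two structural invariants named above.
interface ScopeSketch {
  id: string;
  kind: "module" | "class" | "function" | "block";
  parentId?: string;
  start: number; // offset where the scope begins
  end: number;   // offset where the scope ends (exclusive)
}

function checkInvariants(scopes: readonly ScopeSketch[]): string[] {
  const byId = new Map<string, ScopeSketch>();
  scopes.forEach((s) => byId.set(s.id, s));
  const errors: string[] = [];
  for (const s of scopes) {
    if (s.kind === "module") continue; // modules are roots
    const parent = s.parentId === undefined ? undefined : byId.get(s.parentId);
    if (!parent) {
      errors.push(`${s.id}: non-module scope has no parent`);
    } else if (s.start < parent.start || s.end > parent.end) {
      errors.push(`${s.id}: not contained in parent ${parent.id}`);
    }
  }
  return errors;
}

const valid = checkInvariants([
  { id: "m", kind: "module", start: 0, end: 100 },
  { id: "c", kind: "class", parentId: "m", start: 10, end: 60 },
]);
const invalid = checkInvariants([
  { id: "m", kind: "module", start: 0, end: 100 },
  { id: "f", kind: "function", parentId: "m", start: 90, end: 120 },
  { id: "g", kind: "function", start: 5, end: 8 },
]);
```

Keeping the check in one place means every provider's output passes through the same validation, whichever language produced the captures.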

### Stage 2: Call Resolution

The call processor resolves function and method calls to their definitions.

```mermaid
graph LR
    subgraph Input
        FILES[Source Files] --> LANG[Language Detection]
    end
    
    subgraph Processing
        LANG --> PARSE[Parse with Tree-sitter]
        PARSE --> QUERY[Execute Queries]
        QUERY --> MATCHES[Capture Matches]
        MATCHES --> RESOLVE[Resolve Calls to Defs]
    end
    
    subgraph Output
        RESOLVE --> EDGES[Call Edges in Graph]
    end
```

**Registry-Primary Language Gate:**

For languages marked as `isRegistryPrimary`, the scope-based phase owns CALLS processing, so the legacy call processor skips them:

```typescript
// Registry-primary gate: scope-based phase owns CALLS for this lang.
if (isRegistryPrimary(language)) continue;
```

Source: [gitnexus/src/core/ingestion/call-processor.ts:1-50](https://github.com/abhigyanpatwari/GitNexus/blob/main/gitnexus/src/core/ingestion/call-processor.ts)

### Stage 3: Heritage Processing

Heritage processing resolves class inheritance relationships (`extends` and `implements`).

**Heritage types:**
- `extends` → creates `EXTENDS` relationship
- `implements` → creates `IMPLEMENTS` relationship

**Processing logic:**
```typescript
if (h.kind === 'extends') {
  const { type: relType, idPrefix } = resolveExtendsType(
    h.parentName,
    h.filePath,
    ctx,
    getHeritageStrategyForLanguage(fileLanguage),
  );
  // Create relationship with confidence score
  graph.addRelationship({
    type: relType,
    confidence: Math.sqrt(child.confidence * parent.confidence),
  });
}
```

Source: [gitnexus/src/core/ingestion/heritage-processor.ts:1-60](https://github.com/abhigyanpatwari/GitNexus/blob/main/gitnexus/src/core/ingestion/heritage-processor.ts)
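
The confidence expression in the snippet above is a geometric mean of the two endpoint confidences. A small worked example shows its behavior; the helper name `combineConfidence` is hypothetical.

```typescript
// Sketch of the geometric-mean rule; the helper name is hypothetical.
function combineConfidence(child: number, parent: number): number {
  return Math.sqrt(child * parent);
}

const bothCertain = combineConfidence(1, 1);   // sqrt(1) = 1
const oneWeak = combineConfidence(1, 0.25);    // sqrt(0.25) = 0.5
const arithmeticMean = (1 + 0.25) / 2;         // 0.625, for comparison
const oneUnknown = combineConfidence(1, 0);    // 0: an unresolved endpoint zeroes the edge
```

Compared with an arithmetic mean, the geometric mean is pulled further toward the weaker endpoint, so an edge is only as trustworthy as its least certain end.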

### Stage 4: Heritage Map Construction

The heritage map tracks class/interface inheritance for Method Resolution Order (MRO) calculations.

**Key data structures:**

| Structure | Purpose |
|-----------|---------|
| `directParents` | `Map<nodeId, ParentEntry[]>` — parent relationships by kind |
| `implementorFiles` | `Map<parentName, Set<filePath>>` — tracking implementing files |

**Public API:**
- `getParentEntries(childNodeId)` → returns `readonly ParentEntry[]`
- `getParents(childNodeId)` → returns deduplicated `string[]`

Source: [gitnexus/src/core/ingestion/model/heritage-map.ts:1-60](https://github.com/abhigyanpatwari/GitNexus/blob/main/gitnexus/src/core/ingestion/model/heritage-map.ts)
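
A minimal sketch of the two lookups follows. The class `HeritageMapSketch`, its `add` method, and the `ParentEntry` fields are assumptions made for illustration; only the method names `getParentEntries` and `getParents` and the `directParents` map come from the description above.

```typescript
// Hypothetical sketch of the lookup API described above.
type HeritageKind = "extends" | "implements";

interface ParentEntry {
  parentId: string;
  kind: HeritageKind;
}

class HeritageMapSketch {
  private readonly directParents = new Map<string, ParentEntry[]>();

  add(childId: string, entry: ParentEntry): void {
    const list = this.directParents.get(childId) ?? [];
    list.push(entry);
    this.directParents.set(childId, list);
  }

  // Every recorded entry, including its kind.
  getParentEntries(childId: string): readonly ParentEntry[] {
    return this.directParents.get(childId) ?? [];
  }

  // Parent ids without duplicates, in first-seen order.
  getParents(childId: string): string[] {
    return Array.from(new Set(this.getParentEntries(childId).map((e) => e.parentId)));
  }
}

const heritage = new HeritageMapSketch();
heritage.add("Dog", { parentId: "Animal", kind: "extends" });
heritage.add("Dog", { parentId: "Pet", kind: "implements" });
heritage.add("Dog", { parentId: "Animal", kind: "extends" }); // duplicate edge

const entries = heritage.getParentEntries("Dog");
const parents = heritage.getParents("Dog");
```

Callers that need the relationship kind, for example to order candidates in an MRO walk, would use `getParentEntries`; callers that only need reachability can use the deduplicated `getParents`.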

## Parser Architecture

### Tree-sitter Integration

GitNexus uses Tree-sitter for universal AST parsing across supported languages.

**Grammar loading strategy:**

```typescript
const SOURCES: Record<string, GrammarSource> = {
  [SupportedLanguages.JavaScript]: {
    load: () => _require('tree-sitter-javascript'),
    unavailableNote: 'JavaScript parsing requires `tree-sitter-javascript`...',
  },
  [SupportedLanguages.TypeScript]: {
    load: () => _require('tree-sitter-typescript').typescript,
  },
  [`${SupportedLanguages.TypeScript}:tsx`]: {
    load: () => _require('tree-sitter-typescript').tsx,
  },
  [SupportedLanguages.Python]: {
    load: () => _require('tree-sitter-python'),
  },
  // Additional languages...
};
```

**Configuration rules:**
- Grammars must be in `dependencies` (not `optionalDependencies`)
- Failures indicate real install problems and should never be hidden

Source: [gitnexus/src/core/tree-sitter/parser-loader.ts:1-80](https://github.com/abhigyanpatwari/GitNexus/blob/main/gitnexus/src/core/tree-sitter/parser-loader.ts)
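
The loading strategy can be approximated as a memoized loader that surfaces failures instead of swallowing them. This is a sketch, not the loader's actual code: `createLoader` is a hypothetical helper, and fake `load` functions replace the real `_require` calls so the example runs without native grammar packages.

```typescript
// Hypothetical sketch: load each grammar at most once and fail loudly.
interface GrammarSource {
  load: () => unknown;
  unavailableNote?: string;
}

function createLoader(sources: Record<string, GrammarSource>) {
  const loaded = new Map<string, unknown>();
  return function getGrammar(language: string): unknown {
    if (loaded.has(language)) return loaded.get(language);
    const source = sources[language];
    if (!source) throw new Error(`No grammar registered for ${language}`);
    try {
      const grammar = source.load();
      loaded.set(language, grammar);
      return grammar;
    } catch (err) {
      // Surface the install problem instead of hiding it.
      throw new Error(source.unavailableNote ?? `Failed to load ${language}: ${String(err)}`);
    }
  };
}

let loads = 0;
const getGrammar = createLoader({
  // Fake loaders keep the sketch runnable without native packages.
  python: { load: () => { loads += 1; return { name: "python" }; } },
  broken: {
    load: () => { throw new Error("missing binary"); },
    unavailableNote: "broken grammar is not installed",
  },
});

getGrammar("python");
getGrammar("python"); // second call returns the memoized value
```

Only successful loads are memoized, so a failed load is retried (and reported again) on the next request.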

### Supported Languages

The system supports multiple programming languages through a registry pattern:

| Language | Grammar Package | Notes |
|----------|-----------------|-------|
| JavaScript | `tree-sitter-javascript` | Registry-primary |
| TypeScript | `tree-sitter-typescript` | Registry-primary |
| TSX | `tree-sitter-typescript` | Re-uses TS binding |
| Python | `tree-sitter-python` | Pattern-based extraction |
| PHP | `tree-sitter-php` | Framework patterns (Laravel, Guzzle) |
| C# | `tree-sitter-c-sharp` | Using directive decomposition |
| Go | `tree-sitter-go` | gRPC pattern support |
| Ruby | `tree-sitter-ruby` | MRO strategies |

## Workspace Index

The workspace index provides O(totalScopes) build-time indexing for efficient runtime lookups.

**Interface:**

```typescript
export interface WorkspaceResolutionIndex {
  /** Class def `nodeId` → that class's `Scope`. */
  readonly classScopeByDefId: ReadonlyMap<string, Scope>;
  
  /** Inverse: class `Scope.id` → class def `nodeId`. */
  readonly classScopeIdToDefId: ReadonlyMap<string, string>;
}
```

**Semantic model lookups:**

| Lookup Type | Source | Purpose |
|-------------|--------|---------|
| Owner-keyed method | `model.methods.lookupAllByOwner` | Registry + scope-resolution |
| Name-keyed callable | `model.symbols.lookupCallableByName` | Symbol resolution |
| File-indexed symbol | `model.symbols.lookupExactAll` | Exact matching |
| Scope-shaped | `WorkspaceResolutionIndex` | Implicit `this` overload picker |

This split preserves the single-source-of-truth invariant: symbol-indexed lookups live on `SemanticModel`; scope-shaped lookups live on `WorkspaceResolutionIndex`.

Source: [gitnexus/src/core/ingestion/scope-resolution/workspace-index.ts:1-50](https://github.com/abhigyanpatwari/GitNexus/blob/main/gitnexus/src/core/ingestion/scope-resolution/workspace-index.ts)
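
To make the O(totalScopes) claim concrete, the sketch below builds both maps in a single pass over all scopes. `ScopeLite`, `IndexSketch`, and `buildIndex` are hypothetical stand-ins; in particular, the `defNodeId` field on a scope is an assumption about how a class scope is tied to its definition.

```typescript
// Hypothetical sketch: one pass over all scopes builds both directions.
interface ScopeLite {
  id: string;
  kind: string;
  defNodeId?: string; // assumed link from a class scope to its def node
}

interface IndexSketch {
  readonly classScopeByDefId: ReadonlyMap<string, ScopeLite>;
  readonly classScopeIdToDefId: ReadonlyMap<string, string>;
}

function buildIndex(scopes: readonly ScopeLite[]): IndexSketch {
  const byDef = new Map<string, ScopeLite>();
  const toDef = new Map<string, string>();
  for (const scope of scopes) { // visits each scope exactly once
    if (scope.kind !== "class" || !scope.defNodeId) continue;
    byDef.set(scope.defNodeId, scope);
    toDef.set(scope.id, scope.defNodeId);
  }
  return { classScopeByDefId: byDef, classScopeIdToDefId: toDef };
}

const index = buildIndex([
  { id: "s1", kind: "module" },
  { id: "s2", kind: "class", defNodeId: "Class:User" },
  { id: "s3", kind: "function" },
]);

const userScope = index.classScopeByDefId.get("Class:User");
const userDef = index.classScopeIdToDefId.get("s2");
```

After the build, both directions are plain map lookups, which is what makes the index cheap to consult during resolution.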

## Pattern Extraction System

### Language Pattern Bundles

Each language has pattern bundles compiled from Tree-sitter query specifications:

**Example: PHP HTTP Patterns**

```typescript
const PHP_PATTERNS: PhpPatternBundle = {
  laravelRoute: mk(LARAVEL_ROUTE_SPEC, 'laravel-route'),
  httpFacade: mk(HTTP_FACADE_SPEC, 'http-facade'),
  guzzleMember: mk(GUZZLE_MEMBER_SPEC, 'guzzle-member'),
  fileGetContents: mk(FILE_GET_CONTENTS_SPEC, 'file-get-contents'),
};
```

**Example: gRPC Patterns (Node.js/TypeScript)**

```typescript
const JAVASCRIPT_BUNDLE = compileBundle(JavaScript, 'javascript-grpc');
const TYPESCRIPT_BUNDLE = compileBundle(TypeScript.typescript, 'typescript-grpc');
const TSX_BUNDLE = compileBundle(TypeScript.tsx, 'tsx-grpc');
```

**Pattern categories:**
- HTTP route handlers
- gRPC service clients
- Framework-specific patterns (Laravel, Guzzle, etc.)
- Import/export declarations
- Call expressions

Source: [gitnexus/src/core/group/extractors/http-patterns/php.ts:1-50](https://github.com/abhigyanpatwari/GitNexus/blob/main/gitnexus/src/core/group/extractors/http-patterns/php.ts)

## Data Models

### ParsedFile Structure

```typescript
interface ParsedFile {
  readonly scopes: Scope[];
  readonly parsedImports: ParsedImport[];
  readonly localDefs: LocalDef[];
  readonly referenceSites: ReferenceSite[];
}
```

| Field | Description |
|-------|-------------|
| `scopes` | Every `Scope` created for this file, in tree-topological order (module first, then children) |
| `parsedImports` | Raw `ParsedImport[]` — finalize phase resolves each to concrete `ImportEdge` |
| `localDefs` | Defs structurally declared in this file; superset of `Scope.ownedDefs` |
| `referenceSites` | Pre-resolution usage facts; populated by resolution phase |

**What `ParsedFile` deliberately does NOT carry:**
- Linked `ImportEdge`s (finalize output)
- `ScopeTree` instance (callers build from `scopes`)

Source: [gitnexus-shared/src/scope-resolution/parsed-file.ts:1-60](https://github.com/abhigyanpatwari/GitNexus/blob/main/gitnexus-shared/src/scope-resolution/parsed-file.ts)

## Git Integration

The system integrates with Git repositories for analysis:

**Key functions:**

| Function | Purpose |
|----------|---------|
| `hasGitDir(dirPath)` | Check for `.git` directory presence |
| `getRemoteOriginUrl(repoPath)` | Read `remote.origin.url` for repo naming |
| `sanitizeRepoName(name)` | Prevent argument injection, ensure cross-platform compatibility |

**Security considerations:**
- Leading dashes are stripped to prevent git command-line argument injection
- Characters unsafe for directory names are replaced with underscores
- `execSync` uses `stdio: ['ignore', 'pipe', 'ignore']` to prevent information leakage

Source: [gitnexus/src/storage/git.ts:1-80](https://github.com/abhigyanpatwari/GitNexus/blob/main/gitnexus/src/storage/git.ts)
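
The first two security rules can be sketched as follows. This is not the body of `sanitizeRepoName`; the function name `sanitizeRepoNameSketch`, the exact allow-list of characters, and the `"repo"` fallback for empty results are all assumptions.

```typescript
// Hypothetical sketch of the two rules above; the real function may differ.
function sanitizeRepoNameSketch(name: string): string {
  // Rule 1: a leading dash could be parsed by git as a command-line option.
  const noLeadingDashes = name.replace(/^-+/, "");
  // Rule 2: keep a conservative allow-list; everything else becomes "_".
  const safe = noLeadingDashes.replace(/[^A-Za-z0-9._-]/g, "_");
  return safe.length > 0 ? safe : "repo"; // assumed fallback for empty names
}

const plain = sanitizeRepoNameSketch("GitNexus");
const injected = sanitizeRepoNameSketch("--upload-pack=evil");
const spaced = sanitizeRepoNameSketch("my repo: v2");
```

An allow-list is used here rather than a deny-list because the set of characters that are unsafe in directory names differs between platforms.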

## Manifest Extraction

The manifest extraction system links contracts across repositories:

```mermaid
flowchart TD
    Y[group.yaml links] --> ME[ManifestExtractor]
    ME --> LOOP{for each link}
    LOOP --> RES[resolveSymbol<br/>label-scoped Cypher]
    RES --> OK{found?}
    OK -->|yes| REF[real symbol uid + ref]
    OK -->|no| SYN[synthetic uid<br/>manifest::repo::cid]
    
    REF --> EMIT[emit provider + consumer<br/>Contract objects + CrossLink]
    SYN --> EMIT
    
    EMIT --> BRIDGE[(bridge.lbug<br/>#795)]
```

**Label-scoped queries** prevent accidental cross-matches:
- `topic` → `(n:Function|Method|Class|Interface)`
- `grpc` method → `(n:Function|Method)`, service → `(n:Class|Interface)`
- `lib` → `(n:Package|Module)`

Source: [gitnexus/src/core/group/PIPELINE.md](https://github.com/abhigyanpatwari/GitNexus/blob/main/gitnexus/src/core/group/PIPELINE.md)

## Cross-Impact Query System

The cross-impact query system tracks dependencies across repository boundaries:

```mermaid
flowchart TD
    U[User changes symbol S<br/>in repo R] --> LI[Local impact engine<br/>per-repo uid expansion]
    LI --> IDS[Affected uid set]
    
    IDS --> BR[Bridge query<br/>MATCH Contract WHERE uid IN ids]
    BR --> CL[CrossLink traversal]
    CL --> OTHER[Matching contract in<br/>other repo]
    
    OTHER --> FE[Fan-out impact<br/>to consuming repo]
```

This enables developers to understand the blast radius of changes across service boundaries.

Source: [gitnexus/src/core/group/PIPELINE.md](https://github.com/abhigyanpatwari/GitNexus/blob/main/gitnexus/src/core/group/PIPELINE.md)
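
The flow in the diagram can be approximated in memory, which may help when reasoning about what the bridge query returns. The `Contract` and `CrossLink` shapes, the provider-to-consumer link direction, and the `crossImpact` function are assumptions for illustration; the real system runs these steps as graph queries.

```typescript
// Hypothetical in-memory sketch of the bridge query and CrossLink traversal.
interface Contract {
  id: string;
  repo: string;
  symbolUid: string;
}

interface CrossLink {
  providerId: string;
  consumerId: string;
}

function crossImpact(
  affectedUids: ReadonlySet<string>,
  contracts: readonly Contract[],
  links: readonly CrossLink[],
): string[] {
  const byId = new Map<string, Contract>();
  contracts.forEach((c) => byId.set(c.id, c));
  // Bridge step: contracts whose symbol is in the affected set.
  const hit = new Set<string>();
  contracts.forEach((c) => {
    if (affectedUids.has(c.symbolUid)) hit.add(c.id);
  });
  // Traversal step: follow links from hit providers to their consumers.
  const repos = new Set<string>();
  for (const link of links) {
    if (!hit.has(link.providerId)) continue;
    const consumer = byId.get(link.consumerId);
    if (consumer) repos.add(consumer.repo);
  }
  return Array.from(repos).sort();
}

const impacted = crossImpact(
  new Set(["uid:orders.create"]),
  [
    { id: "c1", repo: "orders", symbolUid: "uid:orders.create" },
    { id: "c2", repo: "billing", symbolUid: "uid:billing.onOrder" },
    { id: "c3", repo: "search", symbolUid: "uid:search.index" },
  ],
  [{ providerId: "c1", consumerId: "c2" }],
);
```

In this example a change to `uid:orders.create` reaches the `billing` repository through one link, while `search` is untouched because no link connects it to an affected contract.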

## Frontend Architecture

The web frontend provides the visualization layer:

**Key components:**

| Component | Purpose |
|-----------|---------|
| `Header.tsx` | Repository switching, analysis triggers |
| `DropZone.tsx` | Server connection and repo loading |
| `HelpPanel.tsx` | Interactive legend and search |
| `StatusBar.tsx` | Progress tracking and sponsor links |
| `OnboardingGuide.tsx` | First-time user setup flow |

**Connection states:**
- `onboarding` — initial setup phase
- `analyze` — server up but no repos indexed
- `landing` — server up with indexed repos
- `success` / `loading` — operation in progress

Source: [gitnexus-web/src/components/DropZone.tsx:1-50](https://github.com/abhigyanpatwari/GitNexus/blob/main/gitnexus-web/src/components/DropZone.tsx)
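
How the first three states might follow from a server probe is sketched below. The `ProbeResult` shape and `initialPhase` are hypothetical, and mapping an unreachable server to `onboarding` is an assumption; only the phase names come from the list above.

```typescript
// Hypothetical sketch: derive the initial phase from a server probe result.
type ConnectionPhase = "onboarding" | "analyze" | "landing" | "success" | "loading";

interface ProbeResult {
  serverReachable: boolean;
  indexedRepoCount: number;
}

function initialPhase(probe: ProbeResult): ConnectionPhase {
  // Assumption: an unreachable server means the user still needs setup.
  if (!probe.serverReachable) return "onboarding";
  // Server is up: the phase depends on whether anything is indexed yet.
  return probe.indexedRepoCount === 0 ? "analyze" : "landing";
}

const noServer = initialPhase({ serverReachable: false, indexedRepoCount: 0 });
const emptyServer = initialPhase({ serverReachable: true, indexedRepoCount: 0 });
const readyServer = initialPhase({ serverReachable: true, indexedRepoCount: 3 });
```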

## Error Handling and Resilience

**Language unavailability:**
```typescript
if (!isLanguageAvailable(language)) {
  if (skippedByLang) {
    skippedByLang.set(language, (skippedByLang.get(language) ?? 0) + 1);
  }
  continue;
}
```

**Parse error handling:**
```typescript
try {
  tree = parseSourceSafe(parser, parseContent, undefined, {
    bufferSize: getTreeSitterBufferSize(parseContent),
  });
} catch (parseError) {
  continue; // Skip malformed files
}
```

**Event loop yielding:**
```typescript
if (i % 20 === 0) await yieldToEventLoop();
```

The loop yields to the event loop once every 20 files so that analyzing a large repository does not block the thread for long stretches.

Source: [gitnexus/src/core/ingestion/call-processor.ts:1-60](https://github.com/abhigyanpatwari/GitNexus/blob/main/gitnexus/src/core/ingestion/call-processor.ts)
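
The yielding pattern can be reproduced in isolation. In this sketch `yieldToEventLoop` is a stand-in built on `setTimeout`; how the project implements it is not shown in the source, and `processAll` is a hypothetical driver.

```typescript
// Hypothetical sketch; this `yieldToEventLoop` is a stand-in built on setTimeout.
function yieldToEventLoop(): Promise<void> {
  return new Promise<void>((resolve) => setTimeout(resolve, 0));
}

async function processAll(
  files: readonly string[],
  handle: (file: string) => void,
): Promise<number> {
  let yields = 0;
  for (let i = 0; i < files.length; i++) {
    if (i % 20 === 0) {
      await yieldToEventLoop(); // lets pending timers and I/O callbacks run
      yields += 1;
    }
    handle(files[i]);
  }
  return yields; // number of times control was handed back
}

const handled: string[] = [];
const demoFiles = new Array<string>(45).fill("file.ts");
const done = processAll(demoFiles, (f) => handled.push(f));
```

Because the check is `i % 20 === 0`, the first yield happens before file 0, so 45 files yield at indices 0, 20, and 40.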

## Performance Considerations

| Optimization | Mechanism |
|-------------|-----------|
| AST Caching | `astCache` stores parsed trees keyed by file path |
| Preprocessing | Per-language source preprocessing (e.g., UE macro stripping for C++) |
| Batch Processing | 20-file batches with `yieldToEventLoop()` |
| Workspace Index | O(totalScopes) build-time indexing for O(1) runtime lookups |
| Confidence Scoring | Geometric mean (`Math.sqrt(child.confidence * parent.confidence)`) for inheritance |

## Summary

GitNexus implements a modular, extensible code analysis architecture:

1. **Pipeline-driven ingestion** with clearly delineated stages
2. **Language-agnostic core** with pluggable pattern extractors
3. **Single semantic model** ensuring consistency across analysis paths
4. **Cross-repository impact tracking** for change propagation analysis
5. **Tree-sitter-based parsing** with efficient grammar loading

The architecture prioritizes extensibility (new languages = new pattern files), performance (caching, batching, indexing), and correctness (invariant enforcement, same-graph guarantees).

---

<a id='package-structure'></a>

## Package Structure

### Related Pages

Related topics: [System Architecture](#system-architecture)

<details>
<summary>Relevant Source Files</summary>

The following source files were used to generate this documentation:

- [gitnexus/package.json](https://github.com/abhigyanpatwari/GitNexus/blob/main/gitnexus/package.json)
- [gitnexus-web/package.json](https://github.com/abhigyanpatwari/GitNexus/blob/main/gitnexus-web/package.json)
- [gitnexus-shared/package.json](https://github.com/abhigyanpatwari/GitNexus/blob/main/gitnexus-shared/package.json)
- [eval/package.json](https://github.com/abhigyanpatwari/GitNexus/blob/main/eval/package.json)
</details>

GitNexus is organized as a **monorepo** with four distinct packages, each serving a specific role in the codebase intelligence platform. The architecture separates concerns between the backend analysis engine, web interface, shared type definitions, and evaluation tooling.

## Monorepo Overview

```mermaid
graph TD
    subgraph "GitNexus Monorepo"
        Web[gitnexus-web<br/>React Frontend]
        Core[gitnexus<br/>Core Engine]
        Shared[gitnexus-shared<br/>Shared Types]
        Eval[eval<br/>Evaluation]
    end
    
    Web <--> Shared
    Core <--> Shared
    Core <--> Web
    Eval --> Core
```

## Package Breakdown

| Package | Role | Technology |
|---------|------|------------|
| `gitnexus` | Core analysis engine, CLI, ingestion pipeline | Node.js, TypeScript |
| `gitnexus-web` | Web UI, visualization, user interactions | React, TypeScript |
| `gitnexus-shared` | Shared types, utilities, interfaces | TypeScript |
| `eval` | Evaluation harness, benchmarking | TypeScript |

## gitnexus — Core Engine

The core package is the backbone of GitNexus, handling all code analysis, parsing, and knowledge graph construction.

### Directory Structure

```
gitnexus/src/
├── cli/                    # Command-line interface commands
│   ├── index.ts           # CLI entry point with command registration
│   ├── wiki.js            # Wiki generation command
│   ├── augment.js         # Search pattern augmentation
│   └── publish.js         # Registry notification
├── core/
│   ├── ingestion/         # Code parsing and analysis
│   │   ├── model/        # Heritage map generation
│   │   ├── scope-extractor.ts
│   │   ├── heritage-processor.ts
│   │   └── scope-resolution/
│   │       └── workspace-index.ts
│   ├── tree-sitter/      # Parser loader and language support
│   │   └── parser-loader.ts
│   └── group/            # Module extraction and contracts
│       ├── extractors/
│       │   └── grpc-patterns/
│       │       └── node.ts
│       └── PIPELINE.md
└── storage/
    └── git.ts            # Git operations utility
```

### Key Modules

#### Ingestion Pipeline

The ingestion system processes source code and builds the knowledge graph:

```mermaid
flowchart TD
    A[Source Files] --> B[Language Detection]
    B --> C[Tree-sitter Parsing]
    C --> D[AST Analysis]
    D --> E[Scope Extraction]
    E --> F[Heritage Processing]
    F --> G[Knowledge Graph]
```

- **Scope Extractor** (`scope-extractor.ts`): Extracts bindings, imports, and symbol references using language-specific providers.
- **Heritage Processor** (`heritage-processor.ts`): Resolves class inheritance and interface implementation relationships.
- **Workspace Index** (`workspace-index.ts`): Builds O(totalScopes) index for class scope lookups.

#### Tree-sitter Parser Loader

The parser loader (`parser-loader.ts`) provides language-specific parsing through tree-sitter grammars:

| Language | Grammar Package | Notes |
|----------|-----------------|-------|
| JavaScript | `tree-sitter-javascript` | Standard JS parsing |
| TypeScript | `tree-sitter-typescript` | TS and TSX supported |
| Python | `tree-sitter-python` | Python source parsing |
| Java | `tree-sitter-java` | Java class analysis |
| C# | `tree-sitter-c-sharp` | Uses subpath export |
| C++ | `tree-sitter-cpp` | C++ source parsing |
| Go | `tree-sitter-go` | Go package parsing |
| Rust | `tree-sitter-rust` | Rust module parsing |
| PHP | `tree-sitter-php` | Uses `php_only` export |
| Ruby | `tree-sitter-ruby` | Ruby parsing |
| Vue | `tree-sitter-typescript` | Reuses TypeScript grammar |
| C | `tree-sitter-c` | Required, ABI-sensitive |

Source: [parser-loader.ts:26-90](gitnexus/src/core/tree-sitter/parser-loader.ts)

#### CLI Commands

The CLI (`index.ts`) provides multiple commands:

- `serve` — Start the backend server
- `ingest <repoPath>` — Analyze and index a repository
- `query <search_query>` — Search the knowledge graph
- `wiki` — Generate documentation from module structure
- `augment <pattern>` — Add knowledge graph context to search
- `publish [path]` — Notify registry of fresh index

Source: [index.ts](gitnexus/src/cli/index.ts)

## gitnexus-web — Frontend Application

The web package provides the interactive UI for visualizing and exploring the knowledge graph.

### Component Architecture

```
gitnexus-web/src/
└── components/
    ├── Header.tsx          # Navigation, repo switching
    ├── DropZone.tsx        # Server connection, initial probe
    ├── OnboardingGuide.tsx # Setup wizard
    ├── AnalyzeOnboarding.tsx # GitHub repo analyzer
    ├── HelpPanel.tsx       # Contextual help, AI queries
    └── StatusBar.tsx       # Progress, status indicators
```

### Key Components

| Component | Purpose |
|-----------|---------|
| `Header` | Repository switching, analysis triggers, navigation |
| `DropZone` | Server detection, backend polling, connection states |
| `OnboardingGuide` | Step-by-step setup instructions |
| `AnalyzeOnboarding` | GitHub URL input and cloning |
| `HelpPanel` | Contextual help, Nexus AI integration |
| `StatusBar` | Progress bar, ready state, sponsor link |

#### Connection States

The `DropZone` component manages backend connection through polling:

```typescript
type ConnectionPhase = 'onboarding' | 'analyze' | 'landing' | 'success' | 'loading';
```

Source: [DropZone.tsx](gitnexus-web/src/components/DropZone.tsx)

#### Help Panel Modes

The `HelpPanel` provides contextual guidance for different exploration modes:

- **Graph view** — Node size meaning, edge direction, click interactions
- **Search & filter** — Query syntax guidance
- **Nexus AI** — Semantic query examples, "Semantic Ready" status

Source: [HelpPanel.tsx](gitnexus-web/src/components/HelpPanel.tsx)

## gitnexus-shared — Shared Type System

The shared package defines TypeScript interfaces, types, and utilities used across both the core engine and web frontend.

### Core Types

| Type | Description |
|------|-------------|
| `ParsedFile` | Parsed source file with imports and definitions |
| `ParsedImport` | Raw import statement |
| `BindingRef` | Reference to a symbol binding |
| `ReferenceSite` | Usage location of a symbol |
| `Scope` | Lexical scope with owned definitions |
| `ScopeKind` | Scope classification (module, class, function, etc.) |
| `SymbolDefinition` | Definition of a symbol |
| `TypeRef` | Reference to a type |

Source: [scope-extractor.ts:1-25](gitnexus/src/core/ingestion/scope-extractor.ts)

### Utility Functions

The shared package exports utility functions used throughout the codebase:

- `buildPositionIndex` — Position indexing for source locations
- `buildScopeTree` — Scope hierarchy construction
- `canParentScope` — Scope relationship validation
- `makeScopeId` — Scope identifier generation
- `isClassLike` — Class-like scope detection

## eval — Evaluation Harness

The eval package provides benchmarking and evaluation tooling for the analysis engine. It validates correctness of ingestion, querying, and relationship detection.

## Dependency Flow

```mermaid
graph LR
    Shared --> Core
    Shared --> Web
    Core --> Web
    Eval --> Core
    
    subgraph "gitnexus-shared"
        Shared
    end
    
    subgraph "gitnexus"
        Core
    end
    
    subgraph "gitnexus-web"
        Web
    end
    
    subgraph "eval"
        Eval
    end
```

## Data Models

### LanguageProvider Interface

The `LanguageProvider` interface defines hooks for language-specific analysis. The scope extractor depends only on the subset named by `ScopeExtractorHooks`:

```typescript
type ScopeExtractorHooks = Pick<
  LanguageProvider,
  | 'resolveScopeKind'
  | 'bindingScopeFor'
  | 'interpretImport'
  | 'interpretTypeBinding'
  | 'classifyCallForm'
>;
```

Source: [scope-extractor.ts:55-66](gitnexus/src/core/ingestion/scope-extractor.ts)

### WorkspaceResolutionIndex

A build-time index for fast class and module scope lookups:

```typescript
interface WorkspaceResolutionIndex {
  readonly classScopeByDefId: ReadonlyMap<string, Scope>;
  readonly classScopeIdToDefId: ReadonlyMap<ScopeId, string>;
  readonly moduleScopeByFile: ReadonlyMap<string, Scope>;
}
```

Source: [workspace-index.ts](gitnexus/src/core/ingestion/scope-resolution/workspace-index.ts)

## Adding New Language Support

To add support for a new language:

1. Add grammar to `SOURCES` in `parser-loader.ts`:
   ```typescript
   [SupportedLanguages.NewLang]: {
     load: () => _require('tree-sitter-newlang'),
     unavailableNote: 'NewLang parsing requires tree-sitter-newlang.',
   },
   ```

2. Register patterns in the group extractors if needed for module extraction.

3. Implement language-specific hooks in `LanguageProvider` if the default extraction is insufficient.

Source: [parser-loader.ts](gitnexus/src/core/tree-sitter/parser-loader.ts)

---

<a id='mcp-integration'></a>

## MCP Integration

### Related Pages

Related topics: [System Architecture](#system-architecture), [Multi-Repo Registry Architecture](#multi-repo-registry)

<details>
<summary>Related source files</summary>

The following source files were used to generate this page:

- [gitnexus/src/mcp/server.ts](https://github.com/abhigyanpatwari/GitNexus/blob/main/gitnexus/src/mcp/server.ts)
- [gitnexus/src/mcp/tools.ts](https://github.com/abhigyanpatwari/GitNexus/blob/main/gitnexus/src/mcp/tools.ts)
- [gitnexus/src/mcp/resources.ts](https://github.com/abhigyanpatwari/GitNexus/blob/main/gitnexus/src/mcp/resources.ts)
- [gitnexus/src/cli/setup.ts](https://github.com/abhigyanpatwari/GitNexus/blob/main/gitnexus/src/cli/setup.ts)
- [gitnexus/README.md](https://github.com/abhigyanpatwari/GitNexus/blob/main/README.md)
- [gitnexus/hooks/claude/gitnexus-hook.cjs](https://github.com/abhigyanpatwari/GitNexus/blob/main/gitnexus/hooks/claude/gitnexus-hook.cjs)
</details>

GitNexus provides a Model Context Protocol (MCP) server that exposes the knowledge graph as a set of tools and resources for AI code assistants like Claude. This integration enables AI assistants to query repository structure, understand code relationships, and perform impact analysis directly within their workflow.

## Overview

The MCP server operates over stdio (standard input/output), making it compatible with any editor or IDE that supports the MCP protocol. When connected, AI assistants can leverage GitNexus's indexed knowledge graph to provide context-aware responses about your codebase.

```mermaid
graph TD
    A["AI Editor<br/>(Claude, Cursor, etc.)"] -->|"MCP Protocol<br/>stdio"| B["GitNexus MCP Server"]
    B --> C["gitnexus mcp"]
    C --> D["Knowledge Graph DB<br/>(Indexed Repos)"]
    
    E["Tool Requests"] --> B
    B --> F["Query Results"]
    
    G["Resource Requests"] --> B
    B --> H["Repo Context<br/>Clusters<br/>Processes"]
```

Source: [gitnexus/README.md:17](https://github.com/abhigyanpatwari/GitNexus/blob/main/README.md#L17)

## Starting the MCP Server

### Command-Line Usage

```bash
gitnexus mcp
```

This starts the MCP server in stdio mode, serving all indexed repositories. The server listens for JSON-RPC requests from connected clients and returns structured responses.

Source: [gitnexus/README.md:17](https://github.com/abhigyanpatwari/GitNexus/blob/main/README.md#L17)

### Startup Behavior

The setup code (`setup.ts`) decides which command an editor should use to launch the MCP server:

1. **Preferred**: Uses the globally-installed `gitnexus` binary (starts in ~1 second)
2. **Fallback**: Uses `npx -y gitnexus@<version>` when the binary isn't on PATH

The npx fallback is slower because a cold cache forces installation of native dependencies. That can take more than 30 seconds, long enough to hit Claude Code's MCP connection timeout. The binary path is resolved at module load time and persisted in user config.

```typescript
function getMcpEntry() {
  const bin = resolveGitnexusBin();

  if (bin) {
    return { command: bin, args: ['mcp'] };
  }

  // Fallback: npx (works without a global install, but slow cold-start)
  if (process.platform === 'win32') {
    return {
      command: 'cmd',
      args: ['/c', 'npx', '-y', NPX_REF, 'mcp'],
    };
  }
  // ...
}
```

Source: [gitnexus/src/cli/setup.ts:45-62](https://github.com/abhigyanpatwari/GitNexus/blob/main/gitnexus/src/cli/setup.ts#L45-L62)

## MCP Tools

The MCP server exposes tools for querying the knowledge graph and for applying graph-informed edits such as coordinated renames. Each tool corresponds to a specific capability within GitNexus.

### Available Tools

| Tool | Purpose | Description |
|------|---------|-------------|
| `query` | Process-grouped code intelligence | Find execution flows related to a concept |
| `context` | 360-degree symbol view | View categorized refs and processes a symbol participates in |
| `impact` | Symbol blast radius analysis | Determine what breaks at depth 1/2/3 with confidence scores |
| `detect_changes` | Git-diff impact analysis | Analyze what your current changes affect |
| `rename` | Multi-file coordinated rename | Perform confidence-tagged edits across files |
| `cypher` | Raw graph queries | Execute direct Cypher queries against the graph |
| `list_repos` | Repository discovery | List all indexed repositories |

Source: [gitnexus/src/mcp/resources.ts:47-56](https://github.com/abhigyanpatwari/GitNexus/blob/main/gitnexus/src/mcp/resources.ts#L47-L56)

### Tool Execution Flow

```mermaid
sequenceDiagram
    participant Editor as AI Editor
    participant MCP as MCP Server
    participant Graph as Knowledge Graph
    
    Editor->>MCP: tool_request(query, "auth module")
    MCP->>MCP: Parse request & validate
    MCP->>Graph: Execute graph query
    Graph-->>MCP: Query results with execution flows
    MCP-->>Editor: Structured JSON response
```

## MCP Resources

Resources provide read-only access to repository metadata and graph schema information. They follow a `gitnexus://` URI scheme for addressing.

### Resource Types

| Resource | URI Pattern | Description |
|----------|-------------|-------------|
| Repository Stats | `gitnexus://repo/{name}/context` | Stats, staleness check, symbol counts |
| Functional Clusters | `gitnexus://repo/{name}/clusters` | All functional areas and groupings |
| Execution Flows | `gitnexus://repo/{name}/processes` | All execution flows in the repo |
| Graph Schema | `gitnexus://repo/{name}/schema` | Graph schema for Cypher queries |
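
The URI patterns in the table can be taken apart with a small parser. The following is a minimal sketch for illustration only: `parseResourceUri`, `ResourceKind`, and `ParsedResourceUri` are hypothetical names, not GitNexus exports, and the sketch assumes repository names contain no `/`.

```typescript
// Hypothetical helper: splits a gitnexus:// resource URI into its parts.
type ResourceKind = 'context' | 'clusters' | 'processes' | 'schema';

interface ParsedResourceUri {
  repo: string;
  kind: ResourceKind;
}

const RESOURCE_KINDS: readonly string[] = ['context', 'clusters', 'processes', 'schema'];

function parseResourceUri(uri: string): ParsedResourceUri | null {
  // Expect exactly: gitnexus://repo/<name>/<kind>
  const match = /^gitnexus:\/\/repo\/([^/]+)\/([^/]+)$/.exec(uri);
  if (!match) return null;
  // Reject resource kinds that are not listed in the table.
  if (RESOURCE_KINDS.indexOf(match[2]) === -1) return null;
  return { repo: decodeURIComponent(match[1]), kind: match[2] as ResourceKind };
}
```

A client could use a helper like this to validate a URI before requesting it, instead of relying on the server to reject unknown resource kinds.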

### Resource Content Format

Each resource returns structured Markdown documentation. For example, the `context` resource includes:

```
This project is indexed by GitNexus as **{repo.name}** ({stats.nodes || 0} symbols, {stats.edges || 0} relationships, {stats.processes || 0} execution flows).
```

Source: [gitnexus/src/mcp/resources.ts:30-36](https://github.com/abhigyanpatwari/GitNexus/blob/main/gitnexus/src/mcp/resources.ts#L30-L36)

## Editor Integration

### Claude Desktop Configuration

For Claude Desktop integration, add the GitNexus MCP server to your configuration:

```json
{
  "mcpServers": {
    "gitnexus": {
      "command": "gitnexus",
      "args": ["mcp"]
    }
  }
}
```

### Editor Hook System

GitNexus includes Claude-specific hooks that integrate with the MCP server:

```mermaid
graph LR
    A["gitnexus-hook.cjs"] -->|Integration| B["Claude Code"]
    A -->|Query| C["Knowledge Graph"]
    B -->|Requests| C
```

The hook enables real-time code context awareness during editing sessions, allowing Claude to reference repository structure without manual context switching.

Source: [gitnexus/hooks/claude/gitnexus-hook.cjs](https://github.com/abhigyanpatwari/GitNexus/blob/main/gitnexus/hooks/claude/gitnexus-hook.cjs)

## Architecture

### Component Overview

```mermaid
graph TD
    subgraph "MCP Layer"
        A["server.ts<br/>Request Handler"] --> B["tools.ts<br/>Tool Registry"]
        A --> C["resources.ts<br/>Resource Provider"]
    end
    
    subgraph "Core Services"
        B --> D["runFullAnalysis"]
        B --> E["WikiGenerator"]
        C --> F["RepoRegistry"]
    end
    
    subgraph "Data Layer"
        D --> G["LadybugDB<br/>(graph database)"]
        F --> G
    end
```

### Request Processing

1. **Connection**: Client establishes stdio connection to `gitnexus mcp`
2. **Initialization**: Client sends an `initialize` request and the server responds with its capabilities
3. **Tool Invocation**: Client sends `tools/call` requests
4. **Query Execution**: Server routes to appropriate handler (graph DB, file system)
5. **Response**: Structured JSON-RPC response returned via stdio
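
The tool invocation step can be sketched as a JSON-RPC 2.0 message. The `tools/call` method and the `name`/`arguments` parameter shape come from the MCP specification, and the stdio transport delimits messages with newlines. The helper names (`buildToolCall`, `frame`) are hypothetical, and the `query` argument key in the usage note below is an assumption about the tool's input schema.

```typescript
// Shape of an MCP tool invocation as a JSON-RPC 2.0 request.
interface ToolCallRequest {
  jsonrpc: '2.0';
  id: number;
  method: 'tools/call';
  params: { name: string; arguments: Record<string, unknown> };
}

// Hypothetical helper: builds the request object for a named tool.
function buildToolCall(
  id: number,
  name: string,
  args: Record<string, unknown>,
): ToolCallRequest {
  return { jsonrpc: '2.0', id, method: 'tools/call', params: { name, arguments: args } };
}

// Hypothetical helper: serializes one message for the stdio transport.
// JSON.stringify without an indent argument never emits raw newlines,
// so the trailing "\n" is the only message delimiter.
function frame(request: ToolCallRequest): string {
  return JSON.stringify(request) + '\n';
}
```

For example, `frame(buildToolCall(1, 'query', { query: 'auth module' }))` yields a single line that a client could write to the server's stdin.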

## Use Cases

### Semantic Code Understanding

When asking Claude questions like "Which files depend on the auth module?", the MCP server:

1. Receives the `query` tool call
2. Searches the knowledge graph for import relationships
3. Returns execution flows and dependency chains

### Impact Analysis

Before making changes, use the `impact` tool to understand the blast radius:

- **Depth 1**: Direct dependencies
- **Depth 2**: Transitive dependencies
- **Depth 3**: Full impact tree with confidence scores
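
The depth levels above amount to a breadth-first traversal of the graph. The following is a minimal sketch under stated assumptions: the in-memory `neighbors` map and the `impactByDepth` name are hypothetical, confidence scoring is omitted, and this is not the actual `impact` implementation.

```typescript
// Hypothetical sketch: groups symbols reachable from `start` by hop count.
// `neighbors` maps a symbol to the symbols directly connected to it.
function impactByDepth(
  neighbors: ReadonlyMap<string, readonly string[]>,
  start: string,
  maxDepth: number,
): Map<number, string[]> {
  const byDepth = new Map<number, string[]>();
  const seen = new Set<string>([start]);
  let frontier: string[] = [start];

  for (let depth = 1; depth <= maxDepth && frontier.length > 0; depth++) {
    const next: string[] = [];
    for (const symbol of frontier) {
      for (const neighbor of neighbors.get(symbol) ?? []) {
        if (seen.has(neighbor)) continue; // report each symbol at its shallowest depth only
        seen.add(neighbor);
        next.push(neighbor);
      }
    }
    if (next.length > 0) byDepth.set(depth, next);
    frontier = next;
  }
  return byDepth;
}
```

Tracking `seen` keeps cycles in the graph from producing repeated or unbounded results.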

### Change Detection

The `detect_changes` tool compares current working directory state against the indexed snapshot, returning:

- Modified files
- Added imports/exports
- Removed dependencies

## Configuration

### Multi-Repository Support

The MCP server serves all indexed repositories registered in the global registry. Use `gitnexus list` to see available repos:

```bash
gitnexus list                    # List all indexed repositories
gitnexus status                  # Show index status for current repo
```

Source: [gitnexus/README.md:20-24](https://github.com/abhigyanpatwari/GitNexus/blob/main/README.md#L20-L24)

### Binaries on Windows

On Windows, `where gitnexus` may return multiple entries:
- POSIX shell script (not directly executable)
- `.cmd`/`.bat` wrapper (preferred for `spawn()`)

The setup code prefers the wrapper to ensure reliable execution:

```typescript
const cmdLine = lines.find((l) => /\.(cmd|bat)$/i.test(l));
return cmdLine || lines[0] || null;
```

Source: [gitnexus/src/cli/setup.ts:21-24](https://github.com/abhigyanpatwari/GitNexus/blob/main/gitnexus/src/cli/setup.ts#L21-L24)
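
To show the selection rule in context, here is a self-contained sketch that takes the raw text printed by `where` and applies the same preference. The function name `pickWindowsBinary` and the line-splitting step are assumptions added for illustration; only the final two lines mirror the excerpt above.

```typescript
// Hypothetical wrapper around the selection rule shown above.
function pickWindowsBinary(whereOutput: string): string | null {
  // `where` prints one candidate path per line; tolerate CRLF and blank lines.
  const lines = whereOutput
    .split(/\r?\n/)
    .map((l) => l.trim())
    .filter((l) => l.length > 0);

  // Prefer a .cmd/.bat wrapper; otherwise fall back to the first candidate.
  const cmdLine = lines.find((l) => /\.(cmd|bat)$/i.test(l));
  return cmdLine || lines[0] || null;
}
```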

## Summary

The MCP Integration layer turns GitNexus from a standalone analysis tool into a context provider for AI assistants. By exposing the knowledge graph through the Model Context Protocol, it enables:

- **Deep Context**: AI understands codebase structure, not just file names
- **Impact Awareness**: Changes can be validated against dependency graphs
- **Cross-Repo Intelligence**: Multi-repository analysis through unified queries
- **Workflow Integration**: Seamless incorporation into daily editing tasks

The architecture prioritizes fast startup (global binary preference) and reliable cross-platform execution, which matters most in interactive development, where editors enforce connection timeouts.

---

<a id='multi-repo-registry'></a>

## Multi-Repo Registry Architecture

### Related Pages

Related topics: [MCP Integration](#mcp-integration), [Knowledge Graph](#knowledge-graph)

<details>
<summary>Related source files</summary>

The following source files were used to generate this page:

- [gitnexus/src/storage/repo-manager.ts](https://github.com/abhigyanpatwari/GitNexus/blob/main/gitnexus/src/storage/repo-manager.ts)
- [gitnexus/src/cli/list.ts](https://github.com/abhigyanpatwari/GitNexus/blob/main/gitnexus/src/cli/list.ts)
- [gitnexus/src/storage/git.ts](https://github.com/abhigyanpatwari/GitNexus/blob/main/gitnexus/src/storage/git.ts)
- [gitnexus/src/cli/analyze.ts](https://github.com/abhigyanpatwari/GitNexus/blob/main/gitnexus/src/cli/analyze.ts)
- [gitnexus/src/core/run-analyze.ts](https://github.com/abhigyanpatwari/GitNexus/blob/main/gitnexus/src/core/run-analyze.ts)
</details>

## Overview

The Multi-Repo Registry is GitNexus's core system for managing multiple indexed repositories within a single global registry. It provides the foundation for repository discovery, identity resolution, and multi-repo querying across the CLI, MCP server, and web interface.

The registry persists as `~/.gitnexus/registry.json`, storing metadata for each indexed repository including its path, name, last indexed commit, and statistics about the generated knowledge graph (nodes, edges, processes).

## Registry Data Model

### RegistryEntry Interface

Each entry in the registry represents a single indexed repository:

| Field | Type | Description |
|-------|------|-------------|
| `name` | `string` | The resolved repository name (see Name Resolution Precedence) |
| `path` | `string` | Absolute filesystem path to the repository |
| `indexedAt` | `number` | Unix timestamp when the repo was last indexed |
| `lastCommit` | `string \| undefined` | Git commit SHA of the last analyzed state |
| `stats` | `RegistryStats \| undefined` | Graph statistics (files, nodes, edges, communities, processes, embeddings) |
| `remoteUrl` | `string \| undefined` | The `remote.origin.url` for the repo |

Source: [gitnexus/src/storage/repo-manager.ts:60-70]()

### CwdMatch Interface

The registry provides a matching interface to resolve the current working directory against indexed repositories:

```typescript
export interface CwdMatch {
  match: 'path' | 'sibling-by-remote' | 'none';
  entry?: RegistryEntry;
  cwdGitRoot?: string;
  cwdHead?: string;
  drift?: number;
  hint?: string;
}
```

The `CwdMatch` interface supports three match scenarios:

| Match Type | Description |
|------------|-------------|
| `path` | Exact path match — `cwd` is a subdirectory of a registered repo |
| `sibling-by-remote` | Same remote URL — `cwd` is a different on-disk clone of a registered repo |
| `none` | No relationship found |

Source: [gitnexus/src/storage/repo-manager.ts:200-220]()
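
The three match scenarios can be sketched as a pure function. This is an illustration under assumptions, not the registry's code: `matchCwd`, `SketchEntry`, and `SketchMatch` are hypothetical, paths are assumed to be normalized POSIX-style strings, and the fields `cwdHead`, `drift`, and `hint` are left out.

```typescript
// Hypothetical, simplified stand-ins for RegistryEntry and CwdMatch.
interface SketchEntry {
  name: string;
  path: string;
  remoteUrl?: string;
}

interface SketchMatch {
  match: 'path' | 'sibling-by-remote' | 'none';
  entry?: SketchEntry;
}

// True when `child` equals `parent` or lies beneath it.
// The trailing-slash check keeps "/a/repository" from matching "/a/repo".
function isAtOrUnder(child: string, parent: string): boolean {
  const prefix = parent.endsWith('/') ? parent : parent + '/';
  return child === parent || child.startsWith(prefix);
}

function matchCwd(
  cwd: string,
  cwdRemoteUrl: string | undefined,
  entries: readonly SketchEntry[],
): SketchMatch {
  // 1. Path match: choose the deepest registered repo that contains cwd.
  let best: SketchEntry | undefined;
  for (const entry of entries) {
    if (isAtOrUnder(cwd, entry.path) && (!best || entry.path.length > best.path.length)) {
      best = entry;
    }
  }
  if (best) return { match: 'path', entry: best };

  // 2. Sibling clone: a different directory that shares the same remote URL.
  if (cwdRemoteUrl) {
    const sibling = entries.find((e) => e.remoteUrl === cwdRemoteUrl);
    if (sibling) return { match: 'sibling-by-remote', entry: sibling };
  }

  // 3. No relationship found.
  return { match: 'none' };
}
```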

## Name Resolution Precedence

GitNexus employs a four-tier name resolution strategy to determine the registry name for an indexed repository:

```mermaid
graph TD
    A[Start: Determine registry name] --> B{"Explicit --name provided?"}
    B -->|Yes| C[Use explicit name]
    B -->|No| D{Existing entry with preserved alias?}
    D -->|Yes| E[Use preserved alias]
    D -->|No| F{remote.origin.url available?}
    F -->|Yes| G[Derive name from remote URL]
    F -->|No| H[Use path.basename]
    
    C --> I[Store in registry]
    E --> I
    G --> I
    H --> I
```

### Resolution Tiers

| Priority | Source | Rationale |
|----------|--------|-----------|
| 1 | `--name <alias>` CLI flag | Explicit user-provided alias |
| 2 | Preserved alias on existing entry | Maintains name across re-analyzes |
| 3 | `git config --get remote.origin.url` | Recovers meaningful names for monorepo subprojects, git worktrees, and Gas-Town-style layouts |
| 4 | `path.basename(repoPath)` | Original default behavior |

Source: [gitnexus/src/storage/repo-manager.ts:290-310]()
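
The four tiers reduce to a chain of fallbacks. The sketch below illustrates that ordering only: `resolveRegistryName`, `nameFromRemote`, and `NameInputs` are hypothetical names, and the way a name is derived from a remote URL (last path segment, `.git` stripped) is an assumption rather than the documented rule.

```typescript
// Hypothetical inputs gathered before name resolution.
interface NameInputs {
  explicitName?: string;   // tier 1: --name <alias>
  preservedAlias?: string; // tier 2: alias stored on an existing entry
  remoteUrl?: string;      // tier 3: remote.origin.url
  repoPath: string;        // tier 4: fallback source
}

// Assumed derivation: last segment of the URL, without a ".git" suffix.
// Handles both "https://host/org/repo.git" and "git@host:org/repo.git".
function nameFromRemote(url: string): string | undefined {
  const trimmed = url.trim().replace(/\/+$/, '').replace(/\.git$/, '');
  const last = trimmed.split(/[/:]/).pop();
  return last ? last : undefined;
}

// Minimal POSIX-style basename, kept local so the sketch has no imports.
function sketchBasename(p: string): string {
  const parts = p.replace(/\/+$/, '').split('/');
  return parts[parts.length - 1] ?? p;
}

function resolveRegistryName(inputs: NameInputs): string {
  if (inputs.explicitName) return inputs.explicitName;
  if (inputs.preservedAlias) return inputs.preservedAlias;
  const fromRemote = inputs.remoteUrl ? nameFromRemote(inputs.remoteUrl) : undefined;
  if (fromRemote) return fromRemote;
  return sketchBasename(inputs.repoPath);
}
```

With this ordering, a worktree directory such as `/worktrees/wt-feature` still resolves to the repository's name whenever a remote URL is available.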

### Git Worktree Support

A key architectural concern is handling git worktrees correctly. When running `gitnexus analyze` inside a linked worktree, the system uses `git rev-parse --git-common-dir` to derive the **canonical repository root**, preventing worktrees from being registered under their directory slug (e.g., `wt-feature`) instead of the actual repo name (e.g., `repo`).

```typescript
export const resolveRepoIdentityRoot = (fromPath: string): string => {
  const resolved = path.resolve(fromPath);
  const canonical = getCanonicalRepoRoot(resolved);
  if (!canonical) return resolved;
  if (canonical === resolved) return canonical;
  if (hasGitDir(resolved)) return canonical; // linked worktree
  return resolved; // arbitrary subdir → preserve as-is
};
```

Without this, each worktree would re-register as a "different" project, causing AGENTS.md to be rewritten with the wrong MCP URI and silently accumulating duplicate registry entries.

Source: [gitnexus/src/storage/git.ts:90-110]()

## Registry Operations

### Registration Flow

```mermaid
sequenceDiagram
    participant CLI as gitnexus analyze
    participant RM as repo-manager
    participant FS as filesystem
    participant Git as git utilities
    
    CLI->>RM: registerRepo(repoPath, opts)
    RM->>Git: getRemoteOriginUrl(repoPath)
    RM->>Git: resolveRepoIdentityRoot(repoPath)
    RM->>RM: resolveName(precedence tiers)
    RM->>RM: checkDuplicateName()
    alt duplicate found & !allowDuplicateName
        RM-->>CLI: throw RegistryNameCollisionError
    else allowed or no duplicate
        RM->>FS: persist to registry.json
        RM-->>CLI: return resolved name
    end
```

#### Registration Options

| Option | Type | Description |
|--------|------|-------------|
| `name` | `string \| undefined` | Explicit alias for registry name |
| `allowDuplicateName` | `boolean` | Bypass collision guard, allowing two paths to share the same name |

The duplicate-name guard only fires when the user explicitly passes a `name`; un-aliased basename collisions continue to register silently so existing users see no behavior change.

Source: [gitnexus/src/core/run-analyze.ts:45-60]()
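
The guard's behaviour can be illustrated with a small decision function. This is a sketch under assumptions: `checkNameGuard` and its types are hypothetical, it returns a result instead of throwing `RegistryNameCollisionError`, and the case-insensitive comparison is borrowed from how the `list` command counts collisions rather than confirmed for registration.

```typescript
// Hypothetical, simplified shapes for this sketch.
interface GuardEntry {
  name: string;
  path: string;
}

interface GuardOptions {
  name?: string;
  allowDuplicateName?: boolean;
}

type GuardResult = { ok: true } | { ok: false; conflictPath: string };

function checkNameGuard(
  entries: readonly GuardEntry[],
  repoPath: string,
  resolvedName: string,
  opts: GuardOptions,
): GuardResult {
  // The guard fires only for an explicit alias, and only without the opt-out.
  if (opts.name === undefined || opts.allowDuplicateName) return { ok: true };

  const wanted = resolvedName.toLowerCase();
  // Re-registering the same path under its own name is not a collision.
  const clash = entries.find((e) => e.name.toLowerCase() === wanted && e.path !== repoPath);
  return clash ? { ok: false, conflictPath: clash.path } : { ok: true };
}
```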

### Listing Repositories

The `gitnexus list` command displays all indexed repositories with collision-aware formatting:

```typescript
export const listCommand = async () => {
  const entries = await listRegisteredRepos({ validate: true });
  const nameCounts = new Map<string, number>();
  
  for (const entry of entries) {
    const key = entry.name.toLowerCase();
    nameCounts.set(key, (nameCounts.get(key) ?? 0) + 1);
  }
  
  for (const entry of entries) {
    const hasCollision = (nameCounts.get(entry.name.toLowerCase()) ?? 0) > 1;
    const header = hasCollision ? `${entry.name}  (${entry.path})` : entry.name;
    // display with path disambiguation for collisions
  }
};
```

Entries with name collisions display their path to disambiguate:

```
  myapp  (/projects/myapp)
  myapp  (/worktrees/myapp-feature)
```

Source: [gitnexus/src/cli/list.ts:15-50]()

### Repository Switching (Web Interface)

The web interface supports switching between indexed repositories via a dropdown menu. The Header component renders available repos with an "active" indicator for the current project:

```tsx
{availableRepos.map((repo) => (
  <div key={repo.name} className="...">
    <button onClick={() => onSwitchRepo?.(repo.name)}>
      <span className="font-mono text-sm">{repo.name}</span>
      {repo.name === projectName && (
        <span className="text-[10px] text-accent">active</span>
      )}
    </button>
  </div>
))}
```

Source: [gitnexus-web/src/components/Header.tsx:1-50]()

## Error Handling

### RegistryNameCollisionError

When an explicit `--name <alias>` would register a repository under a name that a different path already uses, and `--allow-duplicate-name` is not set, the system throws `RegistryNameCollisionError` with actionable guidance:

```
Registry name collision:
  "myapp" is already used by "/projects/myapp".

Options:
  • Pick a different alias:  gitnexus analyze --name <alias>
  • Allow the duplicate:     gitnexus analyze --allow-duplicate-name
```

Source: [gitnexus/src/cli/analyze.ts:30-45]()

### AnalysisNotFinalizedError

If the analysis pipeline fails to complete, users receive diagnostic guidance:

1. Re-run `gitnexus analyze` — transient native errors often clear on retry
2. Inspect the storage path for leftover `lbug.wal` indicating an aborted write
3. Run with `NODE_OPTIONS="--max-old-space-size=8192 --trace-exit"` for detailed tracing

Source: [gitnexus/src/cli/analyze.ts:50-70]()

## MCP Server Integration

The MCP server exposes registry information through the `gitnexus://repo/{name}/*` URI scheme:

| Resource URI | Content |
|--------------|---------|
| `gitnexus://repo/{name}/context` | Stats, staleness check |
| `gitnexus://repo/{name}/clusters` | All functional areas |
| `gitnexus://repo/{name}/processes` | All execution flows |
| `gitnexus://repo/{name}/schema` | Graph schema for Cypher |

Source: [gitnexus/src/mcp/resources.ts:40-55]()

## Configuration Storage

Registry metadata is stored alongside the index in the GitNexus storage directory:

```typescript
const { registryPath, indexPath, metaPath } = getStoragePaths(repoName, repoPath);
```

The registry file (`registry.json`) maintains the authoritative list of all indexed repositories and is the single source of truth for repository discovery across CLI, MCP, and web components.

## Summary

The Multi-Repo Registry Architecture provides:

- **Centralized repository tracking** via `~/.gitnexus/registry.json`
- **Intelligent name resolution** with support for aliases, git remotes, and worktrees
- **Collision detection** with an explicit opt-out via `--allow-duplicate-name`
- **Unified discovery** across CLI (`list`, `query`), MCP (`list_repos`), and web interfaces
- **Cross-repo intelligence** enabling queries and context to span multiple repositories

---

<a id='indexing-pipeline'></a>

## Indexing Pipeline

### Related Pages

Related topics: [Knowledge Graph](#knowledge-graph), [Search System](#search-system)

<details>
<summary>Related source files</summary>

The following source files were used to generate this page:

- [gitnexus/src/core/ingestion/pipeline-phases/parse.ts](https://github.com/abhigyanpatwari/GitNexus/blob/main/gitnexus/src/core/ingestion/pipeline-phases/parse.ts)
- [gitnexus-shared/src/scope-resolution/parsed-file.ts](https://github.com/abhigyanpatwari/GitNexus/blob/main/gitnexus-shared/src/scope-resolution/parsed-file.ts)
- [gitnexus/src/core/ingestion/scope-resolution/workspace-index.ts](https://github.com/abhigyanpatwari/GitNexus/blob/main/gitnexus/src/core/ingestion/scope-resolution/workspace-index.ts)
- [gitnexus/src/core/ingestion/scope-resolution/contract/scope-resolver.ts](https://github.com/abhigyanpatwari/GitNexus/blob/main/gitnexus/src/core/ingestion/scope-resolution/contract/scope-resolver.ts)
- [gitnexus/src/core/ingestion/heritage-processor.ts](https://github.com/abhigyanpatwari/GitNexus/blob/main/gitnexus/src/core/ingestion/heritage-processor.ts)
- [gitnexus/src/core/ingestion/scope-extractor.ts](https://github.com/abhigyanpatwari/GitNexus/blob/main/gitnexus/src/core/ingestion/scope-extractor.ts)
- [gitnexus/src/core/ingestion/parsing-processor.ts](https://github.com/abhigyanpatwari/GitNexus/blob/main/gitnexus/src/core/ingestion/parsing-processor.ts)
- [gitnexus/src/core/ingestion/call-processor.ts](https://github.com/abhigyanpatwari/GitNexus/blob/main/gitnexus/src/core/ingestion/call-processor.ts)
- [gitnexus/src/core/group/PIPELINE.md](https://github.com/abhigyanpatwari/GitNexus/blob/main/gitnexus/src/core/group/PIPELINE.md)
</details>

The Indexing Pipeline is the core ingestion system that transforms source code repositories into a graph-based representation. It scans, parses, extracts semantic relationships, and builds the knowledge graph that powers GitNexus's dependency visualization and impact analysis capabilities.

## Overview

The pipeline operates as a multi-phase execution engine where each phase depends on the output of previous phases. It processes source files through sequential stages, extracting code symbols, relationships, and metadata to construct the semantic graph.

```mermaid
graph TD
    A[Source Files] --> B[Structure Phase]
    B --> C[Parse Phase]
    C --> D[Scope Extraction]
    D --> E[Finalize Phase]
    E --> F[Call Processing]
    E --> G[Heritage Processing]
    D --> H[Communities Detection]
    F --> I[Graph Completion]
    G --> I
    H --> I
    I --> J[Indexed Graph]
```

Source: [gitnexus/src/core/group/PIPELINE.md]()

## Pipeline Architecture

### Phase Execution Model

The pipeline is built on a `PipelinePhase<T>` abstraction where each phase declares its dependencies and implements an `execute` function:

```typescript
const pipelinePhase: PipelinePhase<ParseOutput> = {
  name: 'parse',
  deps: ['structure', 'markdown', 'cobol'],
  async execute(
    ctx: PipelineContext,
    deps: ReadonlyMap<string, PhaseResult<unknown>>,
  ): Promise<ParseOutput> { /* ... */ }
};
```

Source: [gitnexus/src/core/ingestion/pipeline-phases/parse.ts:1-24]()
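
Because each phase declares its `deps`, a runner can derive a valid execution order with a topological sort. The sketch below shows one way to do that. It is not the pipeline's actual scheduler: `orderPhases` and `PhaseSpec` are hypothetical, and only the `name`/`deps` fields are modelled.

```typescript
// Hypothetical minimal view of a phase: just its name and dependencies.
interface PhaseSpec {
  name: string;
  deps: readonly string[];
}

// Depth-first topological sort: every phase appears after all of its deps.
function orderPhases(phases: readonly PhaseSpec[]): string[] {
  const byName = new Map<string, PhaseSpec>();
  for (const phase of phases) byName.set(phase.name, phase);

  const order: string[] = [];
  const state = new Map<string, 'visiting' | 'done'>();

  const visit = (name: string): void => {
    const current = state.get(name);
    if (current === 'done') return;
    if (current === 'visiting') throw new Error(`Dependency cycle at phase "${name}"`);
    const phase = byName.get(name);
    if (!phase) throw new Error(`Unknown phase "${name}"`);

    state.set(name, 'visiting');
    for (const dep of phase.deps) visit(dep);
    state.set(name, 'done');
    order.push(name);
  };

  for (const phase of phases) visit(phase.name);
  return order;
}
```

Detecting cycles and unknown dependency names up front lets a misconfigured phase list fail before any phase starts executing.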

### Pipeline Context

The `PipelineContext` carries shared state across phases:

| Property | Type | Purpose |
|----------|------|---------|
| `graph` | `SemanticGraph` | The graph being built |
| `repoPath` | `string` | Absolute path to the repository |
| `pipelineStart` | `number` | Timestamp when pipeline started |
| `onProgress` | `ProgressCallback` | Progress reporting callback |
| `options` | `PipelineOptions` | Configuration for the pipeline |

## Phase Breakdown

### Structure Phase

The entry point that discovers and classifies files in the repository. It builds the initial file inventory with metadata including paths, sizes, and detected languages.

### Parse Phase

The parse phase transforms source files into structured AST representations using tree-sitter parsers. It processes files in a chunked, parallel manner to handle large repositories efficiently.

```typescript
const result = await runChunkedParseAndResolve(
  ctx.graph,
  scannedFiles,
  allPaths,
  totalFiles,
  ctx.repoPath,
  ctx.pipelineStart,
  ctx.onProgress,
  ctx.options,
);
```

Source: [gitnexus/src/core/ingestion/pipeline-phases/parse.ts:17-24]()

#### Parsing Processor

`parsing-processor.ts` performs the per-file parsing and extraction, then merges the per-chunk results into the shared graph and symbol table:

```typescript
const merged = mergeChunkResults(graph, symbolTable, chunkResults);
```

Source: [gitnexus/src/core/ingestion/parsing-processor.ts:1-50]()

### Scope Extraction Phase

Scope extraction produces `ParsedFile` objects that represent the semantic structure of each source file. This is the per-file, parallelizable boundary between extraction and cross-file resolution.

#### ParsedFile Structure

```typescript
interface ParsedFile {
  readonly scopes: Scope[];           // All scopes in the file
  readonly parsedImports: ParsedImport[];  // Raw imports before resolution
  readonly localDefs: SymbolDefinition[];  // Structurally declared definitions
  readonly referenceSites: ReferenceSite[]; // Pre-resolution usage facts
}
```

Source: [gitnexus-shared/src/scope-resolution/parsed-file.ts:1-30]()

The `ParsedFile` deliberately does NOT carry:
- Linked `ImportEdge`s (those are finalize output)
- A `ScopeTree` instance (callers build one from `scopes`)
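
To illustrate what building a tree from a flat `scopes` list involves, here is a minimal sketch. It makes assumptions throughout: the real `Scope` fields are not shown in this document, so `FlatScope` (with `id`, `parent`, `kind`) is a hypothetical stand-in, and `buildTreeSketch` is not the shared package's `buildScopeTree`.

```typescript
// Hypothetical flat scope record; real Scope fields may differ.
interface FlatScope {
  readonly id: string;
  readonly parent: string | null; // null marks a top-level (module) scope
  readonly kind: string;
}

interface ScopeTreeSketch {
  readonly roots: readonly string[];
  readonly children: ReadonlyMap<string, readonly string[]>;
}

// Groups scope ids under their parent id, preserving input order.
function buildTreeSketch(scopes: readonly FlatScope[]): ScopeTreeSketch {
  const roots: string[] = [];
  const children = new Map<string, string[]>();

  for (const scope of scopes) {
    if (scope.parent === null) {
      roots.push(scope.id);
      continue;
    }
    const siblings = children.get(scope.parent);
    if (siblings) siblings.push(scope.id);
    else children.set(scope.parent, [scope.id]);
  }
  return { roots, children };
}
```

Keeping `ParsedFile` flat means the tree is derived data that each consumer can rebuild cheaply from the serialized scopes.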

#### Scope Extraction Hooks

Language providers implement these hooks for scope extraction:

```typescript
export type ScopeExtractorHooks = Pick<
  LanguageProvider,
  | 'resolveScopeKind'
  | 'bindingScopeFor'
  | 'interpretImport'
  | 'interpretTypeBinding'
  | 'classifyCallForm'
>;
```

Source: [gitnexus/src/core/ingestion/scope-extractor.ts:30-48]()

### Call Processing Phase

Call processing extracts function/method invocations and builds dispatch information. It works with language providers to query ASTs using tree-sitter queries:

```typescript
interface PreparedFile {
  language: SupportedLanguages;
  provider: ReturnType<typeof getProvider>;
  tree: ReturnType<typeof parser.parse>;
  matches: ReturnType<Parser.Query['matches']>;
  parentMap: ReadonlyMap<string, readonly string[]>;
  typeEnv: ReturnType<typeof buildTypeEnv>;
}
```

Source: [gitnexus/src/core/ingestion/call-processor.ts:1-20]()

The call processor skips registry-primary languages since scope-based phases own CALLS extraction for those languages.

### Heritage Processing Phase

Heritage processing extracts inheritance relationships (extends/implements) and resolves them into graph edges:

```mermaid
graph TD
    H[Heritage Record] -->|extends| E{Extends Type Check}
    H -->|implements| I{Is Implements}
    E -->|IMPLEMENTS| I
    E -->|CLASS| C[Add Extends Edge]
    I -->|Yes| Impl[Add Implements Edge]
```

For `extends` relationships, the processor determines the relationship type based on the language:

```typescript
const { type: relType, idPrefix } = resolveExtendsType(
  h.parentName,
  h.filePath,
  ctx,
  getHeritageStrategyForLanguage(fileLanguage),
);
```

Source: [gitnexus/src/core/ingestion/heritage-processor.ts:1-50]()

### Communities Detection Phase

This phase detects code communities and modules within the codebase, which enables grouped visualization and analysis.

## Scope Resolution

### Resolution Indexes

After the main parsing phases, scope resolution builds cross-file indexes:

```typescript
export interface ScopeResolutionIndexes {
  readonly scopeTree: ScopeTree;
  readonly defs: DefIndex;
  readonly qualifiedNames: QualifiedNameIndex;
  readonly moduleScopes: ModuleScopeIndex;
  readonly methodDispatch: MethodDispatchIndex;
  readonly imports: ImportEdge[];
  readonly bindings: BindingIndex;
  readonly referenceSites: ReferenceSite[];
  readonly stats: FinalizeStats;
}
```

Source: [gitnexus/src/core/ingestion/scope-resolution-indexes.ts:1-35]()

### Workspace Resolution Index

The workspace index provides fast lookups for class and module scopes:

```typescript
export interface WorkspaceResolutionIndex {
  /** Class def `nodeId` → that class's `Scope`. */
  readonly classScopeByDefId: ReadonlyMap<string, Scope>;

  /** Class `Scope.id` → class def `nodeId`. */
  readonly classScopeIdToDefId: ReadonlyMap<ScopeId, string>;

  /** Module scope by file path. */
  readonly moduleScopeByFile: ReadonlyMap<string, Scope>;
}
```

Source: [gitnexus/src/core/ingestion/scope-resolution/workspace-index.ts:1-30]()

### Building the Index

```typescript
export function buildWorkspaceResolutionIndex(
  parsedFiles: readonly ParsedFile[],
): WorkspaceResolutionIndex {
  const classScopeByDefId = new Map<string, Scope>();
  const classScopeIdToDefId = new Map<ScopeId, string>();
  const moduleScopeByFile = new Map<string, Scope>();

  for (const parsed of parsedFiles) {
    const moduleScope = parsed.scopes.find((s) => s.kind === 'Module');
    if (moduleScope !== undefined) moduleScopeByFile.set(parsed.filePath, moduleScope);

    for (const scope of parsed.scopes) {
      if (scope.kind !== 'Class') continue;
      const cd = scope.ownedDefs.find((d) => isClassLike(d.type));
      if (cd !== undefined) {
        classScopeByDefId.set(cd.nodeId, scope);
        classScopeIdToDefId.set(scope.id, cd.nodeId);
      }
    }
  }

  return { classScopeByDefId, classScopeIdToDefId, moduleScopeByFile };
}
```

Source: [gitnexus/src/core/ingestion/scope-resolution/workspace-index.ts:30-55]()

## Language Support

The pipeline supports multiple languages through provider-specific implementations:

### PHP Import Resolution

```typescript
export function resolvePhpImportTargetInternal(
  targetRaw: string,
  _fromFile: string,
  allFilePaths: ReadonlySet<string>,
  resolutionConfig?: unknown,
): string | null
```

Source: [gitnexus/src/core/ingestion/languages/php/import-target.ts:1-30]()

### C Header Scanning

```typescript
export function scanHeaderFiles(repoPath: string): ReadonlySet<string> {
  const headers = new Set<string>();
  walk(repoPath, repoPath, headers);
  return headers;
}
```

Source: [gitnexus/src/core/ingestion/languages/c/header-scan.ts:1-20]()

### C# Import Decomposition

```typescript
type ImportKind = 'namespace' | 'alias' | 'static';

interface ImportSpec {
  readonly kind: ImportKind;
  readonly source: string;
  readonly name: string;
  readonly alias?: string;
  readonly atNode: SyntaxNode;
}
```

Source: [gitnexus/src/core/ingestion/languages/csharp/import-decomposer.ts:1-30]()

## Semantic Model Contract

The `ParsedFile` is the single semantic model consumed by both the legacy DAG and the scope-resolution pipeline. Key invariants:

1. **Scope-resolution passes MUST NOT build a parallel parse representation** — they should reuse the orchestrator's `treeCache`
2. **Edges from `runScopeResolution` and the legacy DAG are indistinguishable** to downstream consumers
3. **Node identity uses the same `generateId()` helper** across both paths

Source: [gitnexus/src/core/ingestion/scope-resolution/contract/scope-resolver.ts:1-30]()

## Performance Considerations

### Chunked Processing

Long-running loops yield to the event loop every 20 files, so progress callbacks and other pending work stay responsive:

```typescript
for (let i = 0; i < files.length; i++) {
  const file = files[i];
  onProgress?.(i + 1, files.length);
  if (i % 20 === 0) await yieldToEventLoop();
}
```

Source: [gitnexus/src/core/ingestion/heritage-processor.ts:10-15]()
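
The yielding pattern above can be sketched as a small standalone helper. `yieldToEventLoop` is named in the excerpt, but its body is not shown there; the `setTimeout`-based version below is an assumption, and `processInChunks` is a hypothetical wrapper introduced only for illustration.

```typescript
// Hypothetical helper: resolves on a later macrotask so timers and I/O can run.
// The real yieldToEventLoop in GitNexus may be implemented differently.
function yieldToEventLoop(): Promise<void> {
  return new Promise<void>((resolve) => setTimeout(resolve, 0));
}

// Hypothetical wrapper around the loop shown in the excerpt.
// Returns how many times the loop yielded, which makes the cadence observable.
async function processInChunks<T>(
  files: readonly T[],
  handle: (file: T) => void,
  onProgress?: (done: number, total: number) => void,
  yieldEvery = 20,
): Promise<number> {
  let yields = 0;
  for (let i = 0; i < files.length; i++) {
    handle(files[i] as T);
    onProgress?.(i + 1, files.length);
    // Yields at i = 0, 20, 40, ... which mirrors `i % 20 === 0` in the excerpt.
    if (i % yieldEvery === 0) {
      await yieldToEventLoop();
      yields += 1;
    }
  }
  return yields;
}
```

Because the check is `i % 20 === 0`, the first yield happens on the very first file rather than after the twentieth; a loop over 45 files yields three times.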

### Worker Pool Dispatch

The parsing processor uses a worker pool for parallel file processing:

```typescript
const chunkResults = await workerPool.dispatch<ParseWorkerInput, ParseWorkerResult>(
  parseableFiles,
  (filesProcessed) => {
    onFileProgress?.(Math.min(filesProcessed, total), total, 'Parsing...');
  },
);
```

Source: [gitnexus/src/core/ingestion/parsing-processor.ts:1-50]()

### Cache Management

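Each AST is cached by file path after parsing. The tree-sitter buffer size is computed from the (optionally preprocessed) source text, which accommodates large files: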

```typescript
const parseContent = provider.preprocessSource?.(file.content, file.path) ?? file.content;
tree = parseSourceSafe(parser, parseContent, undefined, {
  bufferSize: getTreeSitterBufferSize(parseContent),
});
astCache.set(file.path, tree);
```

## Configuration

### Language Availability

```typescript
const language = getLanguageFromFilename(file.path);
if (!language) continue;
if (!isLanguageAvailable(language)) {
  if (skippedByLang) {
    skippedByLang.set(language, (skippedByLang.get(language) ?? 0) + 1);
  }
  continue;
}
```

### Tree-sitter Queries

Each language provider supplies tree-sitter queries for extraction:

```typescript
const provider = getProvider(language);
const queryStr = provider.treeSitterQueries;
if (!queryStr) continue;
```

## Summary

The Indexing Pipeline transforms source code into a queryable graph through:

1. **Structure Discovery** — File enumeration and classification
2. **Parallel Parsing** — Tree-sitter AST generation with caching
3. **Scope Extraction** — Per-file semantic structure extraction
4. **Cross-file Resolution** — Import and reference resolution
5. **Relationship Building** — Heritage, calls, and dependencies
6. **Index Materialization** — Fast lookup indexes for the graph

The pipeline is designed for incremental updates and supports both full repository ingestion and targeted re-indexing of changed files.

---

<a id='knowledge-graph'></a>

## Knowledge Graph

### Related Pages

Related topics: [Indexing Pipeline](#indexing-pipeline), [Search System](#search-system)

<details>
<summary>Related source files</summary>

The following source files were used to generate this page:

- [gitnexus/src/core/graph/graph.ts](https://github.com/abhigyanpatwari/GitNexus/blob/main/gitnexus/src/core/graph/graph.ts)
- [gitnexus/src/core/graph/types.ts](https://github.com/abhigyanpatwari/GitNexus/blob/main/gitnexus/src/core/graph/types.ts)
- [gitnexus-shared/src/graph/types.ts](https://github.com/abhigyanpatwari/GitNexus/blob/main/gitnexus-shared/src/graph/types.ts)
- [gitnexus/src/core/ingestion/model/semantic-model.ts](https://github.com/abhigyanpatwari/GitNexus/blob/main/gitnexus/src/core/ingestion/model/semantic-model.ts)
- [gitnexus/src/core/ingestion/emit-references.ts](https://github.com/abhigyanpatwari/GitNexus/blob/main/gitnexus/src/core/ingestion/emit-references.ts)
- [gitnexus/src/core/ingestion/structure-processor.ts](https://github.com/abhigyanpatwari/GitNexus/blob/main/gitnexus/src/core/ingestion/structure-processor.ts)
- [gitnexus/src/core/ingestion/parsing-processor.ts](https://github.com/abhigyanpatwari/GitNexus/blob/main/gitnexus/src/core/ingestion/parsing-processor.ts)
- [gitnexus/src/core/ingestion/pipeline-phases/wildcard-synthesis.ts](https://github.com/abhigyanpatwari/GitNexus/blob/main/gitnexus/src/core/ingestion/pipeline-phases/wildcard-synthesis.ts)
</details>

The Knowledge Graph is the core data structure in GitNexus that represents a codebase as an interconnected network of nodes and relationships. It serves as the semantic backbone for code exploration, dependency analysis, and AI-powered queries.

## Overview

The Knowledge Graph transforms source code into a queryable graph where:

- **Nodes** represent code entities (files, functions, classes, interfaces, modules)
- **Edges** represent relationships (imports, calls, containment, inheritance)

This abstraction enables powerful navigation patterns like tracing execution flows, finding circular dependencies, and identifying highly-connected components.

Source: [gitnexus/src/core/graph/types.ts]()

## Core Architecture

### Graph Interface

The `KnowledgeGraph` interface provides the fundamental operations for building and querying the graph:

```typescript
interface KnowledgeGraph {
  addNode(node: GraphNode): void;
  addRelationship(relationship: GraphRelationship): void;
  removeNode(nodeId: string): boolean;
  removeNodesByFile(filePath: string): number;
  removeRelationship(relationshipId: string): boolean;
  getNode(id: string): GraphNode | undefined;
  forEachNode(fn: (node: GraphNode) => void): void;
  forEachRelationship(fn: (rel: GraphRelationship) => void): void;
  nodeCount: number;
  relationshipCount: number;
}
```

Source: [gitnexus/src/core/graph/types.ts]()
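
To make the contract concrete, here is a minimal in-memory sketch of the interface above. It is not the GitNexus implementation: the `GraphNode` and `GraphRelationship` shapes are simplified stand-ins, and two behaviors are assumptions of this sketch (duplicate ids are ignored on insert, and removing a node also removes the relationships that touch it).

```typescript
// Simplified stand-ins for the real types (assumptions for this sketch).
interface GraphNode {
  id: string;
  label: string;
  properties: Record<string, unknown>;
}
interface GraphRelationship {
  id: string;
  type: string;
  sourceId: string;
  targetId: string;
  confidence: number;
  reason: string;
}

class InMemoryGraph {
  private readonly nodes = new Map<string, GraphNode>();
  private readonly rels = new Map<string, GraphRelationship>();

  addNode(node: GraphNode): void {
    // Assumed: first write wins, so re-adding a shared folder is a no-op.
    if (!this.nodes.has(node.id)) this.nodes.set(node.id, node);
  }

  addRelationship(rel: GraphRelationship): void {
    if (!this.rels.has(rel.id)) this.rels.set(rel.id, rel);
  }

  removeNode(nodeId: string): boolean {
    if (!this.nodes.delete(nodeId)) return false;
    // Assumed: drop incident edges so no relationship points at a missing node.
    this.rels.forEach((rel, id) => {
      if (rel.sourceId === nodeId || rel.targetId === nodeId) this.rels.delete(id);
    });
    return true;
  }

  removeNodesByFile(filePath: string): number {
    const ids: string[] = [];
    this.nodes.forEach((node) => {
      if (node.properties['filePath'] === filePath) ids.push(node.id);
    });
    ids.forEach((id) => this.removeNode(id));
    return ids.length;
  }

  removeRelationship(relationshipId: string): boolean {
    return this.rels.delete(relationshipId);
  }

  getNode(id: string): GraphNode | undefined {
    return this.nodes.get(id);
  }

  forEachNode(fn: (node: GraphNode) => void): void {
    this.nodes.forEach((node) => fn(node));
  }

  forEachRelationship(fn: (rel: GraphRelationship) => void): void {
    this.rels.forEach((rel) => fn(rel));
  }

  get nodeCount(): number {
    return this.nodes.size;
  }

  get relationshipCount(): number {
    return this.rels.size;
  }
}
```

`removeNodesByFile` is the operation that makes re-indexing a single changed file possible: every node whose `filePath` property matches is dropped before the file is parsed again.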

### Node Structure

Graph nodes contain:

| Field | Type | Description |
|-------|------|-------------|
| `id` | `string` | Unique identifier (e.g., `File:src/index.ts`) |
| `label` | `NodeLabel` | Entity type (File, Function, Class, etc.) |
| `properties` | `Record<string, unknown>` | Metadata (name, filePath, line numbers, language) |

Source: [gitnexus-shared/src/graph/types.ts]()

### Relationship Structure

Relationships connect nodes with semantic meaning:

```typescript
interface GraphRelationship {
  id: string;
  type: RelationshipType;
  sourceId: string;
  targetId: string;
  confidence: number;      // 0.0 - 1.0
  reason: string;          // Human-readable explanation
}
```

Source: [gitnexus-shared/src/graph/types.ts]()

## Node Types

The system recognizes multiple node labels representing different code entities:

| Node Label | Description |
|------------|-------------|
| `File` | Source file node |
| `Folder` | Directory node |
| `Function` | Function or method |
| `Class` | Class definition |
| `Interface` | Interface definition |
| `Struct` | Struct definition (Go, Rust, C) |
| `Enum` | Enumeration |
| `Trait` | Trait definition |
| `Module` | Module (COBOL programs, packages) |
| `Property` | Data item or property |
| `TypeAlias` | Type alias definition |
| `Const` | Constant declaration |
| `Static` | Static member |
| `Record` | Record type |
| `Union` | Union type |
| `Typedef` | Type definition |
| `Macro` | Preprocessor macro |

Source: [gitnexus/src/core/ingestion/pipeline-phases/wildcard-synthesis.ts]()

## Relationship Types

### Core Relationships

| Type | Description | Direction |
|------|-------------|-----------|
| `CONTAINS` | Parent-child hierarchy | Parent → Child |
| `IMPORTS` | Import/require statements | Importer → Imported |
| `CALLS` | Function invocations | Caller → Callee |
| `DECLARES` | Symbol definitions | Scope → Symbol |
| `EXTENDS` | Inheritance (parent class) | Child → Parent |
| `IMPLEMENTS` | Interface implementation | Class → Interface |
| `USES` | Variable/type usage | User → Used |

Source: [gitnexus/src/core/ingestion/emit-references.ts]()

### Confidence Scoring

Relationships include a `confidence` score:

- `1.0` - Resolved reference (fully verified)
- `0.5` - Unresolved reference (symbol not found in index)

```typescript
graph.addRelationship({
  id: `rel:imports:${scopeId}->${targetModule}:${localName}`,
  sourceId: scopeId,
  targetId: targetModule,
  type: 'IMPORTS',
  confidence: edge.linkStatus === 'unresolved' ? 0.5 : 1,
  reason: `import ${edge.kind} ${edge.localName}`,
});
```

Source: [gitnexus/src/core/ingestion/emit-references.ts]()
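
A consumer can use the score to separate verified edges from tentative ones. The sketch below assumes only the two confidence values listed above; `ScoredEdge` and `partitionByConfidence` are hypothetical names, not part of the GitNexus API.

```typescript
// Minimal stand-in for GraphRelationship; only the fields used here.
interface ScoredEdge {
  id: string;
  type: string;
  confidence: number; // 1 = resolved, 0.5 = unresolved
}

// Splits edges into those at or above the threshold and those below it.
function partitionByConfidence(
  edges: readonly ScoredEdge[],
  minConfidence = 1,
): { trusted: ScoredEdge[]; tentative: ScoredEdge[] } {
  const trusted: ScoredEdge[] = [];
  const tentative: ScoredEdge[] = [];
  for (const edge of edges) {
    (edge.confidence >= minConfidence ? trusted : tentative).push(edge);
  }
  return { trusted, tentative };
}
```

Impact analysis might traverse only `trusted` edges, while an exploratory view could render `tentative` edges in a weaker style.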
 
## Build Pipeline

The Knowledge Graph is constructed through a multi-phase pipeline:

```mermaid
graph TD
    A[Source Files] --> B[Structure Processor]
    B --> C[File/Folder Nodes]
    C --> D[Parsing Processor]
    D --> E[Symbol Nodes]
    E --> F[Emit References]
    F --> G[Import Edges]
    G --> H[Wildcard Synthesis]
    H --> I[Final Graph]
```

### Phase 1: Structure Processing

The structure processor creates nodes for the file system hierarchy:

```typescript
const processStructure = (graph: KnowledgeGraph, paths: string[]) => {
  paths.forEach((path) => {
    const parts = path.split('/');
    let currentPath = '';
    let parentId = '';

    parts.forEach((part, index) => {
      const isFile = index === parts.length - 1;
      const label = isFile ? 'File' : 'Folder';
      currentPath = currentPath ? `${currentPath}/${part}` : part;
      const nodeId = generateId(label, currentPath);
      
      graph.addNode({
        id: nodeId,
        label: label,
        properties: { name: part, filePath: currentPath },
      });

      if (parentId) {
        graph.addRelationship({
          id: generateId('CONTAINS', `${parentId}->${nodeId}`),
          type: 'CONTAINS',
          sourceId: parentId,
          targetId: nodeId,
          confidence: 1.0,
          reason: '',
        });
      }
      parentId = nodeId;
    });
  });
};
```

Source: [gitnexus/src/core/ingestion/structure-processor.ts]()

### Phase 2: Parsing and Symbol Extraction

The parsing processor extracts code symbols and relationships from source files:

```typescript
export const mergeChunkResults = (
  graph: KnowledgeGraph,
  symbolTable: SymbolTableWriter,
  chunkResults: readonly ParseWorkerResult[],
): WorkerExtractedData => {
  for (const result of chunkResults) {
    for (const node of result.nodes) {
      graph.addNode({
        id: node.id,
        label: node.label as NodeLabel,
        properties: node.properties,
      });
    }
    for (const rel of result.relationships) {
      graph.addRelationship(rel);
    }
    for (const sym of result.symbols) {
      symbolTable.add(sym.filePath, sym.name, sym.nodeId, sym.type, {
        parameterCount: sym.parameterCount,
        requiredParameterCount: sym.requiredParameterCount,
        parameterTypes: sym.parameterTypes,
        returnType: sym.returnType,
        declaredType: sym.declaredType,
      });
    }
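  }
  // Excerpt ends here. The full function also builds and returns the
  // WorkerExtractedData value declared in the signature.
};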
```

Source: [gitnexus/src/core/ingestion/parsing-processor.ts]()

### Phase 3: Reference Emission

References connect code entities across scopes and files:

```typescript
export interface EmitStats {
  readonly edgesEmitted: number;
  readonly skippedNoCaller: number;        // No caller def resolved
  readonly skippedMissingTarget: number;  // Target not in DefIndex
  readonly scopeNodesEmitted: number;      // Only if INGESTION_EMIT_SCOPES=1
  readonly scopeEdgesEmitted: number;
}
```

Source: [gitnexus/src/core/ingestion/emit-references.ts]()

### Phase 4: Wildcard Import Synthesis

For languages with whole-module import semantics (Go, Ruby, C/C++, Swift), wildcard imports are expanded into per-symbol bindings:

```typescript
const IMPORTABLE_SYMBOL_LABELS = new Set([
  'Function', 'Class', 'Interface', 'Struct', 'Enum',
  'Trait', 'TypeAlias', 'Const', 'Static', 'Record',
  'Union', 'Typedef', 'Macro',
]);

/** Max synthetic bindings per importing file — prevents memory bloat */
const MAX_SYNTHETIC_BINDINGS_PER_FILE = 1000;
```

Source: [gitnexus/src/core/ingestion/pipeline-phases/wildcard-synthesis.ts]()

The synthesis process:

1. Collects IMPORTS edges for wildcard languages
2. Retrieves exported symbols per file
3. Creates named import bindings up to the limit
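
The three steps can be sketched as follows. The set of importable labels and the per-file cap of 1000 come from the excerpt above; everything else (`WildcardImport`, `ExportedSymbol`, `SyntheticBinding`, and `synthesizeWildcardBindings`) is a hypothetical shape chosen for illustration, not the real phase implementation.

```typescript
// Hypothetical input and output shapes for this sketch.
interface WildcardImport { importerFile: string; importedFile: string; }
interface ExportedSymbol { name: string; label: string; nodeId: string; }
interface SyntheticBinding { importerFile: string; localName: string; targetNodeId: string; }

// Labels taken from IMPORTABLE_SYMBOL_LABELS in the excerpt.
const IMPORTABLE_LABELS = new Set([
  'Function', 'Class', 'Interface', 'Struct', 'Enum',
  'Trait', 'TypeAlias', 'Const', 'Static', 'Record',
  'Union', 'Typedef', 'Macro',
]);

function synthesizeWildcardBindings(
  wildcardImports: readonly WildcardImport[],                          // step 1: collected IMPORTS edges
  exportsByFile: ReadonlyMap<string, readonly ExportedSymbol[]>,       // step 2: exported symbols per file
  maxPerFile = 1000,                                                   // cap from the excerpt
): SyntheticBinding[] {
  const bindings: SyntheticBinding[] = [];
  const countByImporter = new Map<string, number>();

  for (const edge of wildcardImports) {
    const symbols = exportsByFile.get(edge.importedFile) ?? [];
    for (const symbol of symbols) {
      if (!IMPORTABLE_LABELS.has(symbol.label)) continue;
      const used = countByImporter.get(edge.importerFile) ?? 0;
      if (used >= maxPerFile) break; // step 3: stop once the importer hits the limit
      bindings.push({
        importerFile: edge.importerFile,
        localName: symbol.name,
        targetNodeId: symbol.nodeId,
      });
      countByImporter.set(edge.importerFile, used + 1);
    }
  }
  return bindings;
}
```

The cap is tracked per importing file, so one file that pulls in several large modules cannot grow the binding list without bound.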

## Language-Specific Processors

### COBOL Processor

COBOL files undergo specialized processing with unique node types:

```typescript
// PROGRAM-ID -> Module node
const moduleId = generateId('Module', `${filePath}:${extracted.programName}`);

// SECTIONs -> Namespace nodes
// Paragraphs -> Property nodes
// Data items -> Property nodes
graph.addNode({
  id: moduleId,
  label: 'Module',
  properties: {
    name: extracted.programName,
    filePath,
    language: SupportedLanguages.Cobol,
    isExported: true,
  },
});
```

Nested programs are linked to their enclosing programs:

```typescript
const enclosing = extracted.programs.find(
  (p) => p.startLine < prog.startLine &&
         p.endLine > prog.endLine &&
         p.nestingDepth < prog.nestingDepth,
);
const nestedParent = enclosing
  ? programModuleIds.get(enclosing.name.toUpperCase()) ?? moduleId
  : moduleId;
```

Source: [gitnexus/src/core/ingestion/cobol-processor.ts]()

## Symbol Table Integration

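The Knowledge Graph integrates with a symbol table for cross-reference resolution. The call shape is shown below, with each metadata field annotated by its type rather than by a value: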

```typescript
symbolTable.add(
  filePath,
  symbolName,
  nodeId,
  symbolType,
  {
    parameterCount: number,
    requiredParameterCount: number,
    parameterTypes: string[],
    returnType: string,
    declaredType: string,
  }
);
```

Source: [gitnexus/src/core/ingestion/parsing-processor.ts]()

## Query Patterns

The graph supports various query patterns for code exploration:

| Pattern | Description |
|---------|-------------|
| File dependencies | Traverse IMPORTS edges from a File node |
| Call graph | Follow CALLS edges from a Function node |
| Impact analysis | Reverse-traverse USES edges |
| Circular dependencies | Detect cycles in IMPORTS graph |
| Component boundaries | Find connected components |
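
As an illustration of the circular-dependency pattern, the sketch below runs a depth-first search over `IMPORTS` edges and returns the first cycle it finds. `EdgeLite` and `findImportCycle` are hypothetical names; the function works on a plain edge list rather than on the real `KnowledgeGraph` API.

```typescript
// Only the relationship fields needed for traversal.
interface EdgeLite { type: string; sourceId: string; targetId: string; }

// Returns the node ids of one import cycle (first id repeated at the end),
// or null when the IMPORTS subgraph is acyclic.
function findImportCycle(edges: readonly EdgeLite[]): string[] | null {
  const adjacency = new Map<string, string[]>();
  for (const edge of edges) {
    if (edge.type !== 'IMPORTS') continue;
    const targets = adjacency.get(edge.sourceId) ?? [];
    targets.push(edge.targetId);
    adjacency.set(edge.sourceId, targets);
  }

  const state = new Map<string, 'visiting' | 'done'>();
  const path: string[] = [];

  const visit = (node: string): string[] | null => {
    state.set(node, 'visiting');
    path.push(node);
    for (const next of adjacency.get(node) ?? []) {
      const nextState = state.get(next);
      // A node that is still on the current path closes a cycle.
      if (nextState === 'visiting') return [...path.slice(path.indexOf(next)), next];
      if (nextState === undefined) {
        const found = visit(next);
        if (found) return found;
      }
    }
    path.pop();
    state.set(node, 'done');
    return null;
  };

  for (const start of Array.from(adjacency.keys())) {
    if (state.has(start)) continue;
    const cycle = visit(start);
    if (cycle) return cycle;
  }
  return null;
}
```

The recursion depth equals the longest import chain, so a very deep graph would call for an explicit stack instead. Impact analysis follows the same traversal idea with the edge direction reversed.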

## Web Interface Visualization

The GitNexus web UI visualizes the Knowledge Graph with:

- **Node size** reflects connection count — larger nodes are depended on by more files
- **Edges** point from importer to imported
- **Click interactions** open detail panels showing imports, exports, and reverse dependencies

Source: [gitnexus-web/src/components/HelpPanel.tsx]()

## Configuration

### Environment Variables

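| Variable | Default | Description |
|----------|---------|-------------|
| `INGESTION_EMIT_SCOPES` | unset (disabled) | Enables scope tree nodes in the graph. Accepted values are `true`, `1`, and `yes` (case-insensitive, surrounding whitespace trimmed). |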

```typescript
function isScopeEmissionEnabled(): boolean {
  const TRUTHY = new Set(['true', '1', 'yes']);
  const raw = process.env['INGESTION_EMIT_SCOPES'];
  return raw !== undefined && TRUTHY.has(raw.trim().toLowerCase());
}
```

Source: [gitnexus/src/core/ingestion/emit-references.ts]()

## Summary

The Knowledge Graph is a fundamental abstraction in GitNexus that transforms codebase structure into a queryable graph. By modeling files, symbols, and their relationships with confidence scores, it enables:

- Semantic code navigation
- Dependency analysis
- AI-powered queries over code structure
- Documentation generation

The multi-phase build pipeline progressively enriches the graph from raw file structure through parsing, reference resolution, and language-specific processing to produce a comprehensive semantic model of the codebase.

---

<a id='search-system'></a>

## Search System

### Related Pages

Related topics: [Indexing Pipeline](#indexing-pipeline), [Knowledge Graph](#knowledge-graph)

<details>
<summary>Related source files</summary>

The following source files were used to generate this page:

- [gitnexus/src/core/search/hybrid-search.ts](https://github.com/abhigyanpatwari/GitNexus/blob/main/gitnexus/src/core/search/hybrid-search.ts)
- [gitnexus/src/core/search/bm25-index.ts](https://github.com/abhigyanpatwari/GitNexus/blob/main/gitnexus/src/core/search/bm25-index.ts)
- [gitnexus/src/core/search/fts-indexes.ts](https://github.com/abhigyanpatwari/GitNexus/blob/main/gitnexus/src/core/search/fts-indexes.ts)
- [gitnexus/src/core/embeddings/embedding-pipeline.ts](https://github.com/abhigyanpatwari/GitNexus/blob/main/gitnexus/src/core/embeddings/embedding-pipeline.ts)
- [gitnexus-web/src/components/HelpPanel.tsx](https://github.com/abhigyanpatwari/GitNexus/blob/main/gitnexus-web/src/components/HelpPanel.tsx)
- [gitnexus-web/src/components/QueryFAB.tsx](https://github.com/abhigyanpatwari/GitNexus/blob/main/gitnexus-web/src/components/QueryFAB.tsx)
- [gitnexus-web/src/components/WebGPUFallbackDialog.tsx](https://github.com/abhigyanpatwari/GitNexus/blob/main/gitnexus-web/src/components/WebGPUFallbackDialog.tsx)
- [gitnexus-web/src/components/FileTreePanel.tsx](https://github.com/abhigyanpatwari/GitNexus/blob/main/gitnexus-web/src/components/FileTreePanel.tsx)
</details>

GitNexus provides a multi-layered search system that enables users to explore code repositories through various search modalities. The system combines traditional keyword-based search with semantic understanding powered by embeddings, supporting navigation by filename, function name, import path, and natural language queries.

## Architecture Overview

The search system in GitNexus follows a hybrid architecture that leverages both classical information retrieval techniques and modern neural embedding approaches. This dual-strategy design allows users to perform fast exact-match searches while also enabling semantic understanding of code relationships.

```mermaid
graph TD
    subgraph "Search Entry Points"
        CMD[⌘K / Ctrl K - Global Search]
        FAB[QueryFAB - Natural Language]
        FILE[FileTreePanel Search]
    end
    
    subgraph "Search Pipeline"
        HY[Hybrid Search Engine]
        BM[BM25 Index]
        FTS[Full-Text Search Index]
        EMB[Embedding Pipeline]
    end
    
    subgraph "Result Handling"
        RES[Results Aggregation]
        HL[Highlight Matching Nodes]
        NAV[Navigate to Graph]
    end
    
    CMD --> HY
    FAB --> HY
    FILE --> BM
    HY --> BM
    HY --> EMB
    HY --> FTS
    BM --> RES
    FTS --> RES
    EMB --> RES
    RES --> HL
    HL --> NAV
```

### Core Components

| Component | File | Purpose |
|-----------|------|---------|
| Hybrid Search Engine | `hybrid-search.ts` | Orchestrates search across multiple backends |
| BM25 Index | `bm25-index.ts` | Classical keyword-based ranking |
| FTS Indexes | `fts-indexes.ts` | Full-text search infrastructure |
| Embedding Pipeline | `embedding-pipeline.ts` | Neural embedding generation and similarity |

Source: [gitnexus/src/core/search/hybrid-search.ts]() [gitnexus/src/core/search/bm25-index.ts]()

## Search Modalities

GitNexus supports three distinct search modalities that address different user needs:

### 1. Global Quick Search (⌘K / Ctrl+K)

The primary search interface is accessible via keyboard shortcut `⌘K` on macOS or `Ctrl+K` on Windows/Linux. This modal provides instant access to search nodes across the entire repository.

**Features:**

- Search by filename
- Search by function name
- Search by import path
- Live highlighting of matching nodes in the graph
- Fuzzy matching support for typo tolerance

**UI Behavior:**
When activated, the search modal overlays the main interface with a centered input field. Results appear in real time as the user types, with matches highlighted across the knowledge graph visualization.

Source: [gitnexus-web/src/components/HelpPanel.tsx]() [gitnexus-web/src/components/FileTreePanel.tsx]()

### 2. Natural Language Query (Nexus AI)

GitNexus integrates an AI-powered search capability that allows users to ask questions in natural language about the codebase. This feature relies on the semantic embedding pipeline.

**Example Queries:**

- "Which files depend on the auth module?"
- "Find circular dependencies in this repo"
- "What are the most connected components?"
- "Show me all files that import useEffect"

**Prerequisites:**
The repository must be indexed and ready for semantic queries. The system checks for "Semantic Ready" status before allowing natural language queries.

Source: [gitnexus-web/src/components/HelpPanel.tsx]()

### 3. File Tree Search

A dedicated search interface within the file tree panel allows users to filter and locate specific files in the repository structure. This is particularly useful for large codebases with deep directory hierarchies.

**Implementation Details:**

```tsx
<input
  type="text"
  placeholder="Search files..."
  value={searchQuery}
  onChange={(e) => setSearchQuery(e.target.value)}
  className="w-full rounded border border-border-subtle bg-elevated py-1.5 pr-3 pl-8 text-xs text-text-primary placeholder:text-text-muted focus:border-accent focus:outline-none"
/>
```

Source: [gitnexus-web/src/components/FileTreePanel.tsx]()

## Hybrid Search Engine

The `HybridSearchEngine` class serves as the central orchestrator for search operations, combining multiple ranking strategies to produce relevant results.

### Search Strategy

The hybrid approach combines:

1. **BM25 Ranking**: Classical TF-IDF variant optimized for keyword matching
2. **Full-Text Search (FTS)**: Index for phrase, prefix, and field-filtered queries
3. **Embedding Similarity**: Vector-based semantic matching

### Result Aggregation

Results from multiple search backends are aggregated using a weighted scoring system. The final ranking considers:

- Keyword match score (BM25)
- Semantic similarity score (embeddings)
- Graph position relevance
- Connection count to other nodes

```mermaid
graph LR
    A[User Query] --> B[Query Parser]
    B --> C[BM25 Search]
    B --> D[Embedding Query]
    B --> E[FTS Query]
    C --> F[Score Normalization]
    D --> F
    E --> F
    F --> G[Weighted Aggregation]
    G --> H[Result Ranking]
    H --> I[Graph Highlighting]
```

Source: [gitnexus/src/core/search/hybrid-search.ts]()
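
The normalization and weighted-aggregation stages in the diagram can be sketched as below. This is an illustration under stated assumptions, not the code in `hybrid-search.ts`: min-max normalization, the `Backend` names, and the weights are all hypothetical, and the graph-position and connection-count signals mentioned above are left out.

```typescript
type Backend = 'bm25' | 'fts' | 'embedding';
interface Hit { nodeId: string; score: number; }

// Min-max normalization: rescales one backend's scores into [0, 1].
function normalizeScores(hits: readonly Hit[]): Map<string, number> {
  const normalized = new Map<string, number>();
  if (hits.length === 0) return normalized;
  let min = Infinity;
  let max = -Infinity;
  for (const hit of hits) {
    min = Math.min(min, hit.score);
    max = Math.max(max, hit.score);
  }
  for (const hit of hits) {
    // When every score is equal, treat each hit as a full match.
    normalized.set(hit.nodeId, max === min ? 1 : (hit.score - min) / (max - min));
  }
  return normalized;
}

// Weighted sum of normalized scores, sorted from best to worst.
function aggregateResults(
  results: Record<Backend, readonly Hit[]>,
  weights: Record<Backend, number>, // hypothetical weights supplied by the caller
): Hit[] {
  const combined = new Map<string, number>();
  (Object.keys(results) as Backend[]).forEach((backend) => {
    normalizeScores(results[backend]).forEach((score, nodeId) => {
      combined.set(nodeId, (combined.get(nodeId) ?? 0) + weights[backend] * score);
    });
  });
  return Array.from(combined, ([nodeId, score]) => ({ nodeId, score }))
    .sort((a, b) => b.score - a.score);
}
```

Normalizing per backend before summing matters because raw BM25 scores are unbounded while cosine similarities are not, so an unnormalized sum would let one backend dominate.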

## BM25 Index

The BM25 (Best Matching 25) algorithm provides the foundation for keyword-based search ranking in GitNexus. This classical information retrieval technique offers predictable, fast matching for exact and partial term queries.

### Key Features

| Feature | Description |
|---------|-------------|
| Term Frequency Normalization | Prevents bias toward longer documents |
| Document Length Scaling | Adaptive ranking based on field length |
| Saturation Function | Diminishing returns for repeated terms |
| IDF Weighting | Downweights common terms |

### Index Structure

The BM25 index maintains:

- **Inverted index**: Maps terms to document positions
- **Document statistics**: Length, term counts, field weights
- **IDF table**: Global term importance values

Source: [gitnexus/src/core/search/bm25-index.ts]()
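
For reference, the standard BM25 scoring function can be written in a few lines. The sketch below is a generic textbook version (with the non-negative IDF variant) and not the contents of `bm25-index.ts`; `TinyBm25`, the tokenizer, and the defaults `k1 = 1.2` and `b = 0.75` are assumptions chosen because they are conventional.

```typescript
// Naive tokenizer for this sketch: lowercase, split on non-identifier characters.
function tokenize(text: string): string[] {
  return text.toLowerCase().split(/[^a-z0-9_]+/).filter((t) => t.length > 0);
}

class TinyBm25 {
  private readonly docs = new Map<string, string[]>();   // document id -> tokens
  private readonly docFreq = new Map<string, number>();  // term -> number of docs containing it
  private totalLength = 0;
  private readonly k1: number; // term frequency saturation
  private readonly b: number;  // document length normalization

  constructor(k1 = 1.2, b = 0.75) {
    this.k1 = k1;
    this.b = b;
  }

  // Assumes each id is added at most once.
  add(id: string, text: string): void {
    const tokens = tokenize(text);
    this.docs.set(id, tokens);
    this.totalLength += tokens.length;
    new Set(tokens).forEach((term) => {
      this.docFreq.set(term, (this.docFreq.get(term) ?? 0) + 1);
    });
  }

  score(id: string, query: string): number {
    const tokens = this.docs.get(id);
    if (tokens === undefined) return 0;
    const docCount = this.docs.size;
    const avgLength = this.totalLength / docCount;
    let total = 0;
    for (const term of tokenize(query)) {
      const tf = tokens.filter((t) => t === term).length;
      if (tf === 0) continue;
      const df = this.docFreq.get(term) ?? 0;
      // IDF: rare terms weigh more than common ones.
      const idf = Math.log(1 + (docCount - df + 0.5) / (df + 0.5));
      // Length normalization: longer-than-average documents are penalized.
      const lengthNorm = 1 - this.b + this.b * (tokens.length / avgLength);
      // Saturation: each repeat of a term adds less than the previous one.
      total += idf * (tf * (this.k1 + 1)) / (tf + this.k1 * lengthNorm);
    }
    return total;
  }
}
```

The four features in the table map directly onto this function: `lengthNorm` covers length scaling, the `tf / (tf + ...)` ratio provides saturation, and `idf` downweights common terms.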

## Full-Text Search Indexes

The FTS (Full-Text Search) subsystem provides structured indexing capabilities for code-specific queries. Unlike BM25, which focuses on term matching, FTS supports phrase queries, proximity searches, and structured field filtering.

### Index Types

1. **Token Index**: Standard word tokenization
2. **Code Token Index**: Language-aware tokenization preserving syntax
3. **Symbol Index**: Function/class/variable name index

### Query Capabilities

| Query Type | Syntax | Example |
|------------|--------|---------|
| Exact phrase | `"term1 term2"` | `"useState hook"` |
| Prefix match | `term*` | `use*` |
| Field filter | `field:value` | `type:function` |
| Boolean | `AND / OR / NOT` | `useState AND hook` |

Source: [gitnexus/src/core/search/fts-indexes.ts]()

## Embedding Pipeline

The embedding pipeline transforms code entities into dense vector representations that capture semantic meaning. This enables similarity-based search and natural language understanding.

```mermaid
graph TD
    subgraph "Embedding Pipeline"
        SRC[Source Code] --> PRE[Preprocessor]
        PRE --> TOK[Tokenization]
        TOK --> ENC[Encoder Model]
        ENC --> VEC[Vector Output]
    end
    
    subgraph "Storage"
        VEC --> ANN[Approximate NN Index]
        VEC --> DIM[Dimension Reduction]
    end
    
    subgraph "Query Flow"
        Q[Query Text] --> QENC[Query Encoder]
        QENC --> DIST[Distance Calculation]
        ANN --> DIST
        DIST --> TOP[Top-K Results]
    end
```
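
The query flow in the diagram (encode the query, compute distances, keep the top K) can be illustrated with an exact brute-force search. This sketch assumes cosine similarity as the distance measure and uses plain number arrays; `cosineSimilarity` and `topK` are hypothetical helpers, and a production system would typically replace the linear scan with an approximate nearest-neighbor index.

```typescript
// Cosine similarity between two vectors of equal dimension.
function cosineSimilarity(a: readonly number[], b: readonly number[]): number {
  if (a.length !== b.length) throw new Error('dimension mismatch');
  let dot = 0;
  let normA = 0;
  let normB = 0;
  for (let i = 0; i < a.length; i++) {
    const x = a[i] as number;
    const y = b[i] as number;
    dot += x * y;
    normA += x * x;
    normB += y * y;
  }
  // A zero vector has no direction, so report no similarity.
  return normA === 0 || normB === 0 ? 0 : dot / Math.sqrt(normA * normB);
}

// Scores every stored vector against the query and returns the k best matches.
function topK(
  query: readonly number[],
  vectors: ReadonlyMap<string, readonly number[]>,
  k: number,
): { nodeId: string; score: number }[] {
  const scored: { nodeId: string; score: number }[] = [];
  vectors.forEach((vector, nodeId) => {
    scored.push({ nodeId, score: cosineSimilarity(query, vector) });
  });
  return scored.sort((p, q) => q.score - p.score).slice(0, k);
}
```

The scan is linear in the number of stored vectors, which is the cost an approximate index trades away in exchange for some recall.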

### WebGPU Acceleration

Embedding generation is computationally intensive. GitNexus leverages WebGPU for GPU-accelerated embedding computation when available. If WebGPU is not supported, the system falls back to CPU-based computation with a warning about reduced performance.

**Performance Characteristics:**

- WebGPU: Real-time embedding generation
- CPU: Batch processing with estimated completion time
- Memory: Handles large codebases with streaming processing

Source: [gitnexus/src/core/embeddings/embedding-pipeline.ts]() [gitnexus-web/src/components/WebGPUFallbackDialog.tsx]()

### Model Configuration

The embedding pipeline supports configurable embedding models. Users can select from available models in the settings panel or specify custom model identifiers.

**Configuration Options:**

| Parameter | Description | Default |
|-----------|-------------|---------|
| Model ID | Embedding model identifier | varies |
| Dimension | Vector dimensionality | 768 |
| Batch Size | Documents per batch | 32 |
| Max Sequence | Maximum input length | 512 |

Source: [gitnexus-web/src/components/SettingsPanel.tsx]()

## Query Interface (QueryFAB)

The QueryFAB (Floating Action Button) component provides persistent access to the natural language query interface. It appears as a floating button in the UI and expands to reveal the query input.

### Features

- Natural language query input
- Real-time result preview
- Execution time display
- Result count indicators
- Query result highlighting on graph
- Clear highlight functionality

### Result Display

```tsx
<div className="flex items-center gap-3 text-xs">
  <span className="text-text-secondary">
    <span className="font-semibold text-cyan-400">{queryResult.rows.length}</span> rows
  </span>
  {queryResult.nodeIds.length > 0 && (
    <span className="text-text-secondary">
      <span className="font-semibold text-cyan-400">{queryResult.nodeIds.length}</span> highlighted
    </span>
  )}
  <span className="text-text-muted">{queryResult.executionTime.toFixed(1)}ms</span>
</div>
```

**Result Metrics:**

| Metric | Display | Color |
|--------|---------|-------|
| Row count | `{count} rows` | cyan-400 |
| Highlighted nodes | `{count} highlighted` | cyan-400 |
| Execution time | `{time}ms` | text-muted |

Source: [gitnexus-web/src/components/QueryFAB.tsx]()

## Graph Integration

The search system is tightly integrated with the knowledge graph visualization. Search results directly influence the graph display:

### Highlighting

Matching nodes are highlighted in the graph with visual indicators:

- **Primary matches**: Full opacity, accent color border
- **Secondary matches**: Reduced opacity, subtle highlight
- **Related nodes**: Connected nodes shown with edge highlighting

### Navigation

Clicking a search result navigates the graph to focus on the corresponding node, opening the detail panel with:

- Imports and exports
- Reverse dependencies
- Node metadata
- Code preview (where applicable)

### Keyboard Navigation

| Shortcut | Action |
|----------|--------|
| `⌘K` / `Ctrl+K` | Open global search |
| `⌘↵` / `Ctrl+↵` | Execute query |
| `Escape` | Close search modal |
| `↑` / `↓` | Navigate results |

Source: [gitnexus-web/src/components/HelpPanel.tsx]()

## Performance Considerations

### Index Build Time

Initial index construction scales with repository size:

| Repository Size | BM25 Index | FTS Index | Embedding Index |
|-----------------|------------|-----------|-----------------|
| Small (<100 files) | < 1s | < 2s | ~30s |
| Medium (100-1000 files) | 2-10s | 5-30s | 2-10 min |
| Large (>1000 files) | 10-60s | 30-120s | 10+ min |

### Query Latency

| Search Type | Typical Latency |
|-------------|-----------------|
| BM25 exact | < 10ms |
| FTS structured | < 20ms |
| Semantic (cached) | < 50ms |
| Semantic (uncached) | 100-500ms |

### Optimization Strategies

1. **Incremental indexing**: Only re-index changed files
2. **Query caching**: Cache recent semantic query results
3. **Approximate nearest neighbor**: Trade accuracy for speed in embedding search
4. **Streaming processing**: Handle large files without memory overflow

## Configuration

### Server Settings

The search system can be configured through the settings panel:

```typescript
interface SearchConfig {
  // BM25 parameters
  k1: number;        // Term frequency saturation
  b: number;         // Document length normalization
  
  // Embedding settings
  modelId: string;   // Embedding model
  device: 'webgpu' | 'cpu';
  
  // FTS settings
  analyzer: 'standard' | 'code';
  minTokenLength: number;
}
```

### Ollama Integration

For self-hosted embedding models, GitNexus supports integration with Ollama:

```typescript
const checkOllamaStatus = async (baseUrl: string): Promise<{
  ok: boolean;
  error: string | null;
}> => {
  const response = await fetch(`${baseUrl}/api/tags`, {
    method: 'GET',
    headers: { 'Content-Type': 'application/json' }
  });
  // ...
};
```

Source: [gitnexus-web/src/components/SettingsPanel.tsx]()

## Summary

The GitNexus Search System provides a comprehensive, multi-layered approach to code exploration:

1. **Fast keyword search** via BM25 and FTS indexes for exact matching
2. **Semantic understanding** through embedding-based similarity
3. **Natural language queries** powered by AI integration
4. **Deep graph integration** for visual exploration of search results

The hybrid architecture ensures that users can find exactly what they're looking for through traditional search while also discovering unexpected relationships through semantic queries.

---

## Doramagic Pitfall Log

Project: abhigyanpatwari/GitNexus

Summary: 38 potential pitfalls were identified, 10 of them high/blocking. Highest priority: Installation pitfall - Source evidence: Bug: Local path analysis not working in Docker version - always throws "path must be an absolute path" error.

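## 1. Installation pitfall · Source evidence: Bug: Local path analysis not working in Docker version - always throws "path must be an absolute path" error

- Severity: high
- Evidence strength: source_linked
- Finding: GitHub community evidence points to an installation-related issue that still needs verification: Bug: Local path analysis not working in Docker version - always throws "path must be an absolute path" error
- User impact: May raise the cost of trial and production adoption for new users.
- Suggested check: The source issue is still open; the Pack Agent needs to re-verify whether it still affects the current version.
- Safeguard: Do not escalate this into a definitive conclusion detached from the source link; note the applicable version and the verification status.
- Evidence: community_evidence:github | cevd_f5cfc2c1ce264d6ab4928a417c66e389 | https://github.com/abhigyanpatwari/GitNexus/issues/1518 | The source discussion mentions node-related conditions, which should be re-verified before installation or trial.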

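## 2. Installation pitfall · Source evidence: GitNexus Embedding performance on large Java projects (900k+ edges, 8k+ files): Is a 10-hour runtime expected? How to o…

- Severity: high
- Evidence strength: source_linked
- Finding: GitHub community evidence points to an installation-related issue that still needs verification: GitNexus Embedding performance on large Java projects (900k+ edges, 8k+ files): Is a 10-hour runtime expected? How to optimize to within 30 minutes?
- User impact: May raise the cost of trial and production adoption for new users.
- Suggested check: The source issue is still open; the Pack Agent needs to re-verify whether it still affects the current version.
- Safeguard: Do not escalate this into a definitive conclusion detached from the source link; note the applicable version and the verification status.
- Evidence: community_evidence:github | cevd_f9b5bf45777e4195a5b65bec8eb05125 | https://github.com/abhigyanpatwari/GitNexus/issues/1444 | The source discussion mentions node-related conditions, which should be re-verified before installation or trial.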

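## 3. Installation pitfall · Source evidence: MCP error with Claude: Failed to reconnect to gitnexus: -32000

- Severity: high
- Evidence strength: source_linked
- Finding: GitHub community evidence points to an unverified installation-related issue in this project: MCP error with Claude: Failed to reconnect to gitnexus: -32000
- User impact: May raise the cost of trial and production adoption for new users.
- Recommended check: The source issue is still open; Pack Agent needs to re-check whether it still affects the current version.
- Safeguard: Do not escalate this into a definitive conclusion detached from the source link; state the applicable version and review status.
- Evidence: community_evidence:github | cevd_36bbed40ba504142b4564a0335aa0b91 | https://github.com/abhigyanpatwari/GitNexus/issues/1683 | The source discussion mentions npm-related conditions; re-check before installing or trying the project.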

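## 4. Installation pitfall · Source evidence: Unable to install GitNexus in Mac

- Severity: high
- Evidence strength: source_linked
- Finding: GitHub community evidence points to an unverified installation-related issue in this project: Unable to install GitNexus in Mac
- User impact: May affect upgrades, migrations, or version selection.
- Recommended check: The source issue is still open; Pack Agent needs to re-check whether it still affects the current version.
- Safeguard: Do not escalate this into a definitive conclusion detached from the source link; state the applicable version and review status.
- Evidence: community_evidence:github | cevd_fbf96b8de8644eb9b2c948cdc732d96c | https://github.com/abhigyanpatwari/GitNexus/issues/1164 | The source discussion mentions node-related conditions; re-check before installing or trying the project.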

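## 5. Installation pitfall · Source evidence: Windows + Node v24.14.0: gitnexus@1.6.3 analyze crashes at LadybugDB persist (60%)

- Severity: high
- Evidence strength: source_linked
- Finding: GitHub community evidence points to an unverified installation-related issue in this project: Windows + Node v24.14.0: gitnexus@1.6.3 analyze crashes at LadybugDB persist (60%)
- User impact: May block installation or the first run.
- Recommended check: The source issue is still open; Pack Agent needs to re-check whether it still affects the current version.
- Safeguard: Do not escalate this into a definitive conclusion detached from the source link; state the applicable version and review status.
- Evidence: community_evidence:github | cevd_f7ed64eb318442009c0c45bd394f15e7 | https://github.com/abhigyanpatwari/GitNexus/issues/1674 | The source discussion mentions node-related conditions; re-check before installing or trying the project.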

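## 6. Installation pitfall · Source evidence: [Bug] Incorrect edge relationships when duplicate package, class, and method names exist across different modules

- Severity: high
- Evidence strength: source_linked
- Finding: GitHub community evidence points to an unverified installation-related issue in this project: [Bug] Incorrect edge relationships when duplicate package, class, and method names exist across different modules
- User impact: May raise the cost of trial and production adoption for new users.
- Recommended check: The source issue is still open; Pack Agent needs to re-check whether it still affects the current version.
- Safeguard: Do not escalate this into a definitive conclusion detached from the source link; state the applicable version and review status.
- Evidence: community_evidence:github | cevd_8498a2d89f154c749aac4e7486f4d1f6 | https://github.com/abhigyanpatwari/GitNexus/issues/1680 | Unverified usage condition surfaced by source type github_issue.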

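## 7. Installation pitfall · Source evidence: bug: analyze --embeddings crashes on ARM64 with UNREACHABLE_CODE in wal_record.cpp (ladybugdb 0.16.1 VECTOR extension r…

- Severity: high
- Evidence strength: source_linked
- Finding: GitHub community evidence points to an unverified installation-related issue in this project: bug: analyze --embeddings crashes on ARM64 with UNREACHABLE_CODE in wal_record.cpp (ladybugdb 0.16.1 VECTOR extension regression)
- User impact: May block installation or the first run.
- Recommended check: The source issue is still open; Pack Agent needs to re-check whether it still affects the current version.
- Safeguard: Do not escalate this into a definitive conclusion detached from the source link; state the applicable version and review status.
- Evidence: community_evidence:github | cevd_4b1ddd7fb9104e66b60f9d726b19d298 | https://github.com/abhigyanpatwari/GitNexus/issues/1472 | The source discussion mentions node-related conditions; re-check before installing or trying the project.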

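## 8. Configuration pitfall · Source evidence: Schema creation warning: Runtime exception: Corrupted wal file. Read out invalid WAL record type.

- Severity: high
- Evidence strength: source_linked
- Finding: GitHub community evidence points to an unverified configuration-related issue in this project: Schema creation warning: Runtime exception: Corrupted wal file. Read out invalid WAL record type.
- User impact: May raise the cost of trial and production adoption for new users.
- Recommended check: The source issue is still open; Pack Agent needs to re-check whether it still affects the current version.
- Safeguard: Do not escalate this into a definitive conclusion detached from the source link; state the applicable version and review status.
- Evidence: community_evidence:github | cevd_a2a4f1ac62124129844a1b1664fecdff | https://github.com/abhigyanpatwari/GitNexus/issues/1611 | The source discussion mentions node-related conditions; re-check before installing or trying the project.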

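## 9. Configuration pitfall · Source evidence: analyze: generated CLAUDE.md examples omit repo parameter in multi-repo environments

- Severity: high
- Evidence strength: source_linked
- Finding: GitHub community evidence points to an unverified configuration-related issue in this project: analyze: generated CLAUDE.md examples omit repo parameter in multi-repo environments
- User impact: May raise the cost of trial and production adoption for new users.
- Recommended check: The source issue is still open; Pack Agent needs to re-check whether it still affects the current version.
- Safeguard: Do not escalate this into a definitive conclusion detached from the source link; state the applicable version and review status.
- Evidence: community_evidence:github | cevd_d6aff7caa0db4beda6fe67cb86b7ccdd | https://github.com/abhigyanpatwari/GitNexus/issues/1542 | Unverified usage condition surfaced by source type github_issue.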

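## 10. Security/permissions pitfall · Source evidence: Windows: FTS skip-guard too aggressive when extension is locally installed (BM25 returns 0 results despite present bina…

- Severity: high
- Evidence strength: source_linked
- Finding: GitHub community evidence points to an unverified security/permissions-related issue in this project: Windows: FTS skip-guard too aggressive when extension is locally installed (BM25 returns 0 results despite present binary)
- User impact: May block installation or the first run.
- Recommended check: The source issue is still open; Pack Agent needs to re-check whether it still affects the current version.
- Safeguard: Do not escalate this into a definitive conclusion detached from the source link; state the applicable version and review status.
- Evidence: community_evidence:github | cevd_b7e053347355476db324ef874ef9991c | https://github.com/abhigyanpatwari/GitNexus/issues/1690 | The source discussion mentions node-related conditions; re-check before installing or trying the project.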

## 11. Installation pitfall · Failure mode: installation: 1.6.4-rc.94 on Windows 11 + WSL FTS indexes missing

- Severity: medium
- Evidence strength: source_linked
- Finding: Developers should check this installation risk before relying on the project: 1.6.4-rc.94 on Windows 11 + WSL FTS indexes missing
- User impact: Developers may fail before the first successful local run: 1.6.4-rc.94 on Windows 11 + WSL FTS indexes missing
- Recommended check: Before packaging this project, run the relevant install/config/quickstart check for: 1.6.4-rc.94 on Windows 11 + WSL FTS indexes missing. Context: Observed when using windows
- Safeguard: State this as source-backed community evidence, not as Doramagic reproduction.
- Evidence: failure_mode_cluster:github_issue | fmev_1438bc0fa83941fc6074de2c5412cd25 | https://github.com/abhigyanpatwari/GitNexus/issues/1440 | 1.6.4-rc.94 on Windows 11 + WSL FTS indexes missing

## 12. Installation pitfall · Failure mode: installation: MCP error with Claude: Failed to reconnect to gitnexus: -32000

- Severity: medium
- Evidence strength: source_linked
- Finding: Developers should check this installation risk before relying on the project: MCP error with Claude: Failed to reconnect to gitnexus: -32000
- User impact: Developers may fail before the first successful local run: MCP error with Claude: Failed to reconnect to gitnexus: -32000
- Recommended check: Before packaging this project, run the relevant install/config/quickstart check for: MCP error with Claude: Failed to reconnect to gitnexus: -32000. Context: Observed when using node, windows
- Safeguard: State this as source-backed community evidence, not as Doramagic reproduction.
- Evidence: failure_mode_cluster:github_issue | fmev_ebb5d6efaf48e8d879678b8a170aacf8 | https://github.com/abhigyanpatwari/GitNexus/issues/1683 | MCP error with Claude: Failed to reconnect to gitnexus: -32000, failure_mode_cluster:github_issue | fmev_de5b96783d1677ac281ee6c73055380b | https://github.com/abhigyanpatwari/GitNexus/issues/1683 | MCP error with Claude: Failed to reconnect to gitnexus: -32000

## 13. Installation pitfall · Failure mode: installation: Release Candidate v1.6.6-rc.12

- Severity: medium
- Evidence strength: source_linked
- Finding: Developers should check this installation risk before relying on the project: Release Candidate v1.6.6-rc.12
- User impact: Upgrade or migration may change expected behavior: Release Candidate v1.6.6-rc.12
- Recommended check: Before packaging this project, run the relevant install/config/quickstart check for: Release Candidate v1.6.6-rc.12. Context: Observed when using node, windows
- Safeguard: State this as source-backed community evidence, not as Doramagic reproduction.
- Evidence: failure_mode_cluster:github_release | fmev_ad68a701c98e79e3531d2d8292fd6acb | https://github.com/abhigyanpatwari/GitNexus/releases/tag/v1.6.6-rc.12 | Release Candidate v1.6.6-rc.12

## 14. Installation pitfall · Failure mode: installation: Release Candidate v1.6.6-rc.13

- Severity: medium
- Evidence strength: source_linked
- Finding: Developers should check this installation risk before relying on the project: Release Candidate v1.6.6-rc.13
- User impact: Upgrade or migration may change expected behavior: Release Candidate v1.6.6-rc.13
- Recommended check: Before packaging this project, run the relevant install/config/quickstart check for: Release Candidate v1.6.6-rc.13. Context: Observed when using node, windows
- Safeguard: State this as source-backed community evidence, not as Doramagic reproduction.
- Evidence: failure_mode_cluster:github_release | fmev_691d71a9a9f7c1da63532e7f4913d598 | https://github.com/abhigyanpatwari/GitNexus/releases/tag/v1.6.6-rc.13 | Release Candidate v1.6.6-rc.13

## 15. Installation pitfall · Failure mode: installation: Release Candidate v1.6.6-rc.14

- Severity: medium
- Evidence strength: source_linked
- Finding: Developers should check this installation risk before relying on the project: Release Candidate v1.6.6-rc.14
- User impact: Upgrade or migration may change expected behavior: Release Candidate v1.6.6-rc.14
- Recommended check: Before packaging this project, run the relevant install/config/quickstart check for: Release Candidate v1.6.6-rc.14. Context: Observed when using node, windows
- Safeguard: State this as source-backed community evidence, not as Doramagic reproduction.
- Evidence: failure_mode_cluster:github_release | fmev_fe26658ac0f7dbfc1389758a44d9f3ac | https://github.com/abhigyanpatwari/GitNexus/releases/tag/v1.6.6-rc.14 | Release Candidate v1.6.6-rc.14

## 16. Installation pitfall · Failure mode: installation: Unable to install GitNexus in Mac

- Severity: medium
- Evidence strength: source_linked
- Finding: Developers should check this installation risk before relying on the project: Unable to install GitNexus in Mac
- User impact: Developers may fail before the first successful local run: Unable to install GitNexus in Mac
- Recommended check: Before packaging this project, run the relevant install/config/quickstart check for: Unable to install GitNexus in Mac. Context: Observed when using node, macos
- Safeguard: State this as source-backed community evidence, not as Doramagic reproduction.
- Evidence: failure_mode_cluster:github_issue | fmev_e8a618cc1ed2879b84e81b16d9f8d995 | https://github.com/abhigyanpatwari/GitNexus/issues/1164 | Unable to install GitNexus in Mac

## 17. Installation pitfall · Failure mode: installation: WSL2 + 1.6.4-rc.88 + ladybug 0.16.1: list/status/--version persistently SIGSEGV (~2.5GB dumps...

- Severity: medium
- Evidence strength: source_linked
- Finding: Developers should check this installation risk before relying on the project: WSL2 + 1.6.4-rc.88 + ladybug 0.16.1: list/status/--version persistently SIGSEGV (~2.5GB dumps each, exit 0 masks crash) — related to #1427
- User impact: Developers may fail before the first successful local run: WSL2 + 1.6.4-rc.88 + ladybug 0.16.1: list/status/--version persistently SIGSEGV (~2.5GB dumps each, exit 0 masks crash) — related to #1427
- Recommended check: Before packaging this project, run the relevant install/config/quickstart check for: WSL2 + 1.6.4-rc.88 + ladybug 0.16.1: list/status/--version persistently SIGSEGV (~2.5GB dumps each, exit 0 masks crash) — related to #1427. Context: Observed when using node, python, windows, linux
- Safeguard: State this as source-backed community evidence, not as Doramagic reproduction.
- Evidence: failure_mode_cluster:github_issue | fmev_9ec12d39ffa06f5c08b0ad85182bddd7 | https://github.com/abhigyanpatwari/GitNexus/issues/1431 | WSL2 + 1.6.4-rc.88 + ladybug 0.16.1: list/status/--version persistently SIGSEGV (~2.5GB dumps each, exit 0 masks crash) — related to #1427

## 18. Installation pitfall · Failure mode: installation: Windows + Node v24.14.0: gitnexus@1.6.3 analyze crashes at LadybugDB persist (60%)

- Severity: medium
- Evidence strength: source_linked
- Finding: Developers should check this installation risk before relying on the project: Windows + Node v24.14.0: gitnexus@1.6.3 analyze crashes at LadybugDB persist (60%)
- User impact: Developers may fail before the first successful local run: Windows + Node v24.14.0: gitnexus@1.6.3 analyze crashes at LadybugDB persist (60%)
- Recommended check: Before packaging this project, run the relevant install/config/quickstart check for: Windows + Node v24.14.0: gitnexus@1.6.3 analyze crashes at LadybugDB persist (60%). Context: Observed when using node, python, windows
- Safeguard: State this as source-backed community evidence, not as Doramagic reproduction.
- Evidence: failure_mode_cluster:github_issue | fmev_122d12d9d7145c261627cb5c59179ac1 | https://github.com/abhigyanpatwari/GitNexus/issues/1674 | Windows + Node v24.14.0: gitnexus@1.6.3 analyze crashes at LadybugDB persist (60%), failure_mode_cluster:github_issue | fmev_6c9acc8d3c792d067bb848a0833634a3 | https://github.com/abhigyanpatwari/GitNexus/issues/1674 | Windows + Node v24.14.0: gitnexus@1.6.3 analyze crashes at LadybugDB persist (60%)

## 19. Installation pitfall · Failure mode: installation: Windows: FTS skip-guard too aggressive when extension is locally installed (BM25 returns 0 re...

- Severity: medium
- Evidence strength: source_linked
- Finding: Developers should check this installation risk before relying on the project: Windows: FTS skip-guard too aggressive when extension is locally installed (BM25 returns 0 results despite present binary)
- User impact: Developers may fail before the first successful local run: Windows: FTS skip-guard too aggressive when extension is locally installed (BM25 returns 0 results despite present binary)
- Recommended check: Before packaging this project, run the relevant install/config/quickstart check for: Windows: FTS skip-guard too aggressive when extension is locally installed (BM25 returns 0 results despite present binary). Context: Observed when using node, windows
- Safeguard: State this as source-backed community evidence, not as Doramagic reproduction.
- Evidence: failure_mode_cluster:github_issue | fmev_623b7791e3c0dc9791ade56a9bde72a3 | https://github.com/abhigyanpatwari/GitNexus/issues/1690 | Windows: FTS skip-guard too aggressive when extension is locally installed (BM25 returns 0 results despite present binary)

## 20. Installation pitfall · Failure mode: installation: [Bug] Incorrect edge relationships when duplicate package, class, and method names exist acro...

- Severity: medium
- Evidence strength: source_linked
- Finding: Developers should check this installation risk before relying on the project: [Bug] Incorrect edge relationships when duplicate package, class, and method names exist across different modules
- User impact: Developers may fail before the first successful local run: [Bug] Incorrect edge relationships when duplicate package, class, and method names exist across different modules
- Recommended check: Before packaging this project, run the relevant install/config/quickstart check for: [Bug] Incorrect edge relationships when duplicate package, class, and method names exist across different modules. Context: Observed when using python
- Safeguard: State this as source-backed community evidence, not as Doramagic reproduction.
- Evidence: failure_mode_cluster:github_issue | fmev_006addf62ce5971f44bd372e453a3fe2 | https://github.com/abhigyanpatwari/GitNexus/issues/1680 | [Bug] Incorrect edge relationships when duplicate package, class, and method names exist across different modules, failure_mode_cluster:github_issue | fmev_d73fddaa3691a761b14cb569d5acbb58 | https://github.com/abhigyanpatwari/GitNexus/issues/1680 | [Bug] Incorrect edge relationships when duplicate package, class, and method names exist across different modules, failure_mode_cluster:github_issue | fmev_d4e580f0f3653876a2e755ee3ca34259 | https://github.com/abhigyanpatwari/GitNexus/issues/1680 | [Bug] Incorrect edge relationships when duplicate package, class, and method names exist across different modules

## 21. Installation pitfall · Failure mode: installation: bug: analyze --embeddings crashes on ARM64 with UNREACHABLE_CODE in wal_record.cpp (ladybugdb...

- Severity: medium
- Evidence strength: source_linked
- Finding: Developers should check this installation risk before relying on the project: bug: analyze --embeddings crashes on ARM64 with UNREACHABLE_CODE in wal_record.cpp (ladybugdb 0.16.1 VECTOR extension regression)
- User impact: Developers may fail before the first successful local run: bug: analyze --embeddings crashes on ARM64 with UNREACHABLE_CODE in wal_record.cpp (ladybugdb 0.16.1 VECTOR extension regression)
- Recommended check: Before packaging this project, run the relevant install/config/quickstart check for: bug: analyze --embeddings crashes on ARM64 with UNREACHABLE_CODE in wal_record.cpp (ladybugdb 0.16.1 VECTOR extension regression). Context: Observed when using node, python, docker, macos
- Safeguard: State this as source-backed community evidence, not as Doramagic reproduction.
- Evidence: failure_mode_cluster:github_issue | fmev_78c629867f92802da91b458f6e7bc14f | https://github.com/abhigyanpatwari/GitNexus/issues/1472 | bug: analyze --embeddings crashes on ARM64 with UNREACHABLE_CODE in wal_record.cpp (ladybugdb 0.16.1 VECTOR extension regression)

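## 22. Installation pitfall · Source evidence: 1.6.4-rc.94 on Windows 11 + WSL FTS indexes missing

- Severity: medium
- Evidence strength: source_linked
- Finding: GitHub community evidence points to an unverified installation-related issue in this project: 1.6.4-rc.94 on Windows 11 + WSL FTS indexes missing
- User impact: May affect upgrades, migrations, or version selection.
- Recommended check: The source suggests a fix, workaround, or version change may already exist; the manual must state the applicable version.
- Safeguard: Do not escalate this into a definitive conclusion detached from the source link; state the applicable version and review status.
- Evidence: community_evidence:github | cevd_26a9d2c26b004c6ba8a99533ff1d6ac5 | https://github.com/abhigyanpatwari/GitNexus/issues/1440 | The source discussion mentions node-related conditions; re-check before installing or trying the project.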

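## 23. Installation pitfall · Source evidence: WSL2 + 1.6.4-rc.88 + ladybug 0.16.1: list/status/--version persistently SIGSEGV (~2.5GB dumps each, exit 0 masks crash)…

- Severity: medium
- Evidence strength: source_linked
- Finding: GitHub community evidence points to an unverified installation-related issue in this project: WSL2 + 1.6.4-rc.88 + ladybug 0.16.1: list/status/--version persistently SIGSEGV (~2.5GB dumps each, exit 0 masks crash) — related to #1427
- User impact: May block installation or the first run.
- Recommended check: The source suggests a fix, workaround, or version change may already exist; the manual must state the applicable version.
- Safeguard: Do not escalate this into a definitive conclusion detached from the source link; state the applicable version and review status.
- Evidence: community_evidence:github | cevd_ec175ee0cb4d4d31bae2f29144c7d619 | https://github.com/abhigyanpatwari/GitNexus/issues/1431 | The source discussion mentions python-related conditions; re-check before installing or trying the project.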

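## 24. Installation pitfall · Source evidence: analyze hangs or stalls on microsoft/TypeScript repo root (raised file-size limits)

- Severity: medium
- Evidence strength: source_linked
- Finding: GitHub community evidence points to an unverified installation-related issue in this project: analyze hangs or stalls on microsoft/TypeScript repo root (raised file-size limits)
- User impact: May block installation or the first run.
- Recommended check: The source issue is still open; Pack Agent needs to re-check whether it still affects the current version.
- Safeguard: Do not escalate this into a definitive conclusion detached from the source link; state the applicable version and review status.
- Evidence: community_evidence:github | cevd_6c4087a57d5e4c949662a9721355ac8f | https://github.com/abhigyanpatwari/GitNexus/issues/1684 | The source discussion mentions node-related conditions; re-check before installing or trying the project.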

## 25. Configuration pitfall · Failure mode: configuration: GitNexus Embedding performance on large Java projects (900k+ edges, 8k+ files): Is a 10-hour...

- Severity: medium
- Evidence strength: source_linked
- Finding: Developers should check this configuration risk before relying on the project: GitNexus Embedding performance on large Java projects (900k+ edges, 8k+ files): Is a 10-hour runtime expected? How to optimize to within 30 minutes?
- User impact: Developers may misconfigure credentials, environment, or host setup: GitNexus Embedding performance on large Java projects (900k+ edges, 8k+ files): Is a 10-hour runtime expected? How to optimize to within 30 minutes?
- Recommended check: Before packaging this project, run the relevant install/config/quickstart check for: GitNexus Embedding performance on large Java projects (900k+ edges, 8k+ files): Is a 10-hour runtime expected? How to optimize to within 30 minutes? Context: Observed when using node, docker
- Safeguard: State this as source-backed community evidence, not as Doramagic reproduction.
- Evidence: failure_mode_cluster:github_issue | fmev_53da1aec043bc339e2d1e3db40239192 | https://github.com/abhigyanpatwari/GitNexus/issues/1444 | GitNexus Embedding performance on large Java projects (900k+ edges, 8k+ files): Is a 10-hour runtime expected? How to optimize to within 30 minutes?

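## 26. Capability pitfall · Capability assessment depends on assumptions

- Severity: medium
- Evidence strength: source_linked
- Finding: README/documentation is current enough for a first validation pass.
- User impact: If the assumption does not hold, users will not get the promised capability.
- Recommended check: Turn the assumption into a downstream verification checklist.
- Safeguard: Assumptions must become verification items; they must not be stated as fact before verification results exist.
- Evidence: capability.assumptions | github_repo:1031059905 | https://github.com/abhigyanpatwari/GitNexus | README/documentation is current enough for a first validation pass.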

## 27. Runtime pitfall · Failure mode: runtime: GitNexus Enterprise for open source

- Severity: medium
- Evidence strength: source_linked
- Finding: Developers should check this runtime risk before relying on the project: GitNexus Enterprise for open source
- User impact: Developers may hit a documented source-backed failure mode: GitNexus Enterprise for open source
- Recommended check: Before packaging this project, run the relevant install/config/quickstart check for: GitNexus Enterprise for open source. Context: Source discussion did not expose a precise runtime context.
- Safeguard: State this as source-backed community evidence, not as Doramagic reproduction.
- Evidence: failure_mode_cluster:github_issue | fmev_f47b7a16378d6157c47cb32462d648cd | https://github.com/abhigyanpatwari/GitNexus/issues/1685 | GitNexus Enterprise for open source, failure_mode_cluster:github_issue | fmev_b0d64cb50e22bb1f6f82577f2f9468ed | https://github.com/abhigyanpatwari/GitNexus/issues/1685 | GitNexus Enterprise for open source

## 28. Runtime pitfall · Failure mode: runtime: Schema creation warning: Runtime exception: Corrupted wal file. Read out invalid WAL record t...

- Severity: medium
- Evidence strength: source_linked
- Finding: Developers should check this runtime risk before relying on the project: Schema creation warning: Runtime exception: Corrupted wal file. Read out invalid WAL record type.
- User impact: Developers may hit a documented source-backed failure mode: Schema creation warning: Runtime exception: Corrupted wal file. Read out invalid WAL record type.
- Recommended check: Before packaging this project, run the relevant install/config/quickstart check for: Schema creation warning: Runtime exception: Corrupted wal file. Read out invalid WAL record type. Context: Observed when using macos
- Safeguard: State this as source-backed community evidence, not as Doramagic reproduction.
- Evidence: failure_mode_cluster:github_issue | fmev_83c386b31413aad9103fe675ded0d557 | https://github.com/abhigyanpatwari/GitNexus/issues/1611 | Schema creation warning: Runtime exception: Corrupted wal file. Read out invalid WAL record type., failure_mode_cluster:github_issue | fmev_ef20949c8067cc8c69adefcfe3cac1b3 | https://github.com/abhigyanpatwari/GitNexus/issues/1611 | Schema creation warning: Runtime exception: Corrupted wal file. Read out invalid WAL record type.

## 29. Runtime pitfall · Failure mode: runtime: analyze hangs or stalls on microsoft/TypeScript repo root (raised file-size limits)

- Severity: medium
- Evidence strength: source_linked
- Finding: Developers should check this runtime risk before relying on the project: analyze hangs or stalls on microsoft/TypeScript repo root (raised file-size limits)
- User impact: Developers may hit a documented source-backed failure mode: analyze hangs or stalls on microsoft/TypeScript repo root (raised file-size limits)
- Recommended check: Before packaging this project, run the relevant install/config/quickstart check for: analyze hangs or stalls on microsoft/TypeScript repo root (raised file-size limits). Context: Source discussion did not expose a precise runtime context.
- Safeguard: State this as source-backed community evidence, not as Doramagic reproduction.
- Evidence: failure_mode_cluster:github_issue | fmev_d05c00b4ab6372cd518729b619d7a572 | https://github.com/abhigyanpatwari/GitNexus/issues/1684 | analyze hangs or stalls on microsoft/TypeScript repo root (raised file-size limits)

## 30. Runtime pitfall · Failure mode: runtime: cpp SFINAE: expand type_traits predicate registry (Tier A)

- Severity: medium
- Evidence strength: source_linked
- Finding: Developers should check this runtime risk before relying on the project: cpp SFINAE: expand type_traits predicate registry (Tier A)
- User impact: Developers may hit a documented source-backed failure mode: cpp SFINAE: expand type_traits predicate registry (Tier A)
- Recommended check: Before packaging this project, run the relevant install/config/quickstart check for: cpp SFINAE: expand type_traits predicate registry (Tier A). Context: Source discussion did not expose a precise runtime context.
- Safeguard: State this as source-backed community evidence, not as Doramagic reproduction.
- Evidence: failure_mode_cluster:github_issue | fmev_356daa9356830c10f89e67ea3f087c37 | https://github.com/abhigyanpatwari/GitNexus/issues/1629 | cpp SFINAE: expand type_traits predicate registry (Tier A)

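## 31. Runtime pitfall · Source evidence: GitNexus Enterprise for open source

- Severity: medium
- Evidence strength: source_linked
- Finding: GitHub community evidence points to an unverified runtime-related issue in this project: GitNexus Enterprise for open source
- User impact: May raise the cost of trial and production adoption for new users.
- Recommended check: The source issue is still open; Pack Agent needs to re-check whether it still affects the current version.
- Safeguard: Do not escalate this into a definitive conclusion detached from the source link; state the applicable version and review status.
- Evidence: community_evidence:github | cevd_761efc4dc902430dbff83585f209b270 | https://github.com/abhigyanpatwari/GitNexus/issues/1685 | Unverified usage condition surfaced by source type github_issue.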

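## 32. Runtime pitfall · Source evidence: cpp SFINAE: expand type_traits predicate registry (Tier A)

- Severity: medium
- Evidence strength: source_linked
- Finding: GitHub community evidence points to an unverified runtime-related issue in this project: cpp SFINAE: expand type_traits predicate registry (Tier A)
- User impact: May block installation or the first run.
- Recommended check: The source suggests a fix, workaround, or version change may already exist; the manual must state the applicable version.
- Safeguard: Do not escalate this into a definitive conclusion detached from the source link; state the applicable version and review status.
- Evidence: community_evidence:github | cevd_e0576badcd5a477bb2b988a8a0afa3d6 | https://github.com/abhigyanpatwari/GitNexus/issues/1629 | Unverified usage condition surfaced by source type github_issue.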

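## 33. Maintenance pitfall · Source evidence: GitNexus Enterprise for open source

- Severity: medium
- Evidence strength: source_linked
- Finding: GitHub community evidence points to an unverified maintenance/version-related issue in this project: GitNexus Enterprise for open source
- User impact: May raise the cost of trial and production adoption for new users.
- Recommended check: The source issue is still open; Pack Agent needs to re-check whether it still affects the current version.
- Safeguard: Do not escalate this into a definitive conclusion detached from the source link; state the applicable version and review status.
- Evidence: community_evidence:github | cevd_8dd127cfb66644c49407761777f40c18 | https://github.com/abhigyanpatwari/GitNexus/issues/1685 | Unverified usage condition surfaced by source type github_issue.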

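## 34. Maintenance pitfall · Maintenance activity unknown

- Severity: medium
- Evidence strength: source_linked
- Finding: last_activity_observed is not recorded.
- User impact: New, abandoned, and active projects become indistinguishable, which lowers recommendation confidence.
- Recommended check: Add signals from recent GitHub commits, releases, and issue/PR responses.
- Safeguard: When maintenance activity is unknown, the recommendation must not be marked as high trust.
- Evidence: evidence.maintainer_signals | github_repo:1031059905 | https://github.com/abhigyanpatwari/GitNexus | last_activity_observed missing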

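## 35. Security/permissions pitfall · Downstream validation found risk items

- Severity: medium
- Evidence strength: source_linked
- Finding: no_demo
- User impact: Downstream has already requested a review; this must not be downplayed on the page.
- Recommended check: Add to the security/permissions governance review queue.
- Safeguard: While downstream risk exists, the review/recommendation must stay downgraded.
- Evidence: downstream_validation.risk_items | github_repo:1031059905 | https://github.com/abhigyanpatwari/GitNexus | no_demo; severity=medium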

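## 36. Security/permissions pitfall · Scoring risk present

- Severity: medium
- Evidence strength: source_linked
- Finding: no_demo
- User impact: The risk affects whether the project is suitable for ordinary users to install.
- Recommended check: Record the risk in the boundary card and confirm whether manual review is needed.
- Safeguard: Scoring risks must go into the boundary card, not remain only an internal score.
- Evidence: risks.scoring_risks | github_repo:1031059905 | https://github.com/abhigyanpatwari/GitNexus | no_demo; severity=medium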

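## 37. Maintenance pitfall · Issue/PR response quality unknown

- Severity: low
- Evidence strength: source_linked
- Finding: issue_or_pr_quality=unknown.
- User impact: Users cannot tell whether anyone will respond when they hit a problem.
- Recommended check: Sample recent issues/PRs to judge whether they go unhandled for long periods.
- Safeguard: When issue/PR responsiveness is unknown, a maintenance-risk notice is required.
- Evidence: evidence.maintainer_signals | github_repo:1031059905 | https://github.com/abhigyanpatwari/GitNexus | issue_or_pr_quality=unknown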

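## 38. Maintenance pitfall · Release cadence unclear

- Severity: low
- Evidence strength: source_linked
- Finding: release_recency=unknown.
- User impact: Install commands and docs may lag behind the code, making pitfalls more likely.
- Recommended check: Confirm that the latest release/tag matches the README install commands.
- Safeguard: When release cadence is unknown or stale, installation instructions must be flagged as possibly out of date.
- Evidence: evidence.maintainer_signals | github_repo:1031059905 | https://github.com/abhigyanpatwari/GitNexus | release_recency=unknown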

<!-- canonical_name: abhigyanpatwari/GitNexus; human_manual_source: deepwiki_human_wiki -->
