# https://github.com/potpie-ai/potpie Project Manual

Generated at: 2026-06-26 17:35:45 UTC

## Table of Contents

- [Overview & System Architecture](#page-1)
- [CLI Commands, Setup, and Daemon Workflows](#page-2)
- [Context Graph Domain, Backends, and Reconciliation](#page-3)
- [Integrations, Source Connectors, and Sandbox Runtime](#page-4)

<a id='page-1'></a>

## Overview & System Architecture

### Related Pages

Related topics: [CLI Commands, Setup, and Daemon Workflows](#page-2), [Context Graph Domain, Backends, and Reconciliation](#page-3)

<details>
<summary>Related Source Files</summary>

The following source files were used to generate this page:

- [README.md](https://github.com/potpie-ai/potpie/blob/main/README.md)
- [legacy/README.md](https://github.com/potpie-ai/potpie/blob/main/legacy/README.md)
- [legacy/app/modules/intelligence/tools/confluence_tools/README.md](https://github.com/potpie-ai/potpie/blob/main/legacy/app/modules/intelligence/tools/confluence_tools/README.md)
- [legacy/app/modules/intelligence/tools/linear_tools/README.md](https://github.com/potpie-ai/potpie/blob/main/legacy/app/modules/intelligence/tools/linear_tools/README.md)
- [legacy/app/modules/intelligence/tools/registry/README.md](https://github.com/potpie-ai/potpie/blob/main/legacy/app/modules/intelligence/tools/registry/README.md)
- [legacy/deploy/observability/README.md](https://github.com/potpie-ai/potpie/blob/main/legacy/deploy/observability/README.md)
- [potpie/context-engine/adapters/inbound/http/ui/frontend/src/App.tsx](https://github.com/potpie-ai/potpie/blob/main/potpie/context-engine/adapters/inbound/http/ui/frontend/src/App.tsx)
- [potpie/parsing/src/tag_extract.rs](https://github.com/potpie-ai/potpie/blob/main/potpie/parsing/src/tag_extract.rs)
- [potpie/sandbox/sandbox/api/client.py](https://github.com/potpie-ai/potpie/blob/main/potpie/sandbox/sandbox/api/client.py)
</details>

## System Purpose and Scope

Potpie is an open-source platform that turns code repositories into queryable, agent-driven knowledge. It indexes source code, tickets, and documentation, then exposes pre-built and custom AI agents that can answer architectural questions, generate specifications, write or refactor code, and create pull requests. The v2.0.0 release notes describe the goal as a **local-first context engine** with "cleaner CLI and daemon workflows, stronger agent runtime support, faster indexing, deeper observability, and expanded context-graph integrations." Source: [README.md:1-10]()

The project is published as two PyPI packages: `potpie==2.0.0b3` as the main CLI/daemon entry point and `potpie-context-engine==0.1.0b3` as the context-graph runtime. Beta packaging makes the new architecture installable via `uv tool install potpie` outside the development repository. Source: [README.md:1-30]()

## High-Level Architecture

The repository is split between a legacy FastAPI/Celery service tree (`legacy/app/`) and a newer v2 monorepo under `potpie/` that organizes capabilities as discrete Python packages plus a Rust parsing crate. The diagram below summarizes the layered design.

```mermaid
flowchart TB
    User[User / CLI / Coding Harness]
    CLI[potpie CLI + Daemon]
    CE[potpie-context-engine<br/>Hexagonal Adapters]
    Sandbox[potpie/sandbox<br/>Multi-backend Runtime]
    Parsing[potpie/parsing<br/>Rust + tree-sitter]
    Tools[Tool Registry & Integrations]
    Ext[GitHub / Linear / Jira /<br/>Confluence / Sentry]
    Obs[Loki + Grafana +<br/>Prometheus + Langfuse]

    User --> CLI
    CLI --> CE
    CLI --> Sandbox
    CE --> Parsing
    CE --> Tools
    Sandbox --> Ext
    Tools --> Ext
    CE --> Obs
    Sandbox --> Obs
    CLI --> Obs
```

The user-facing surface is a CLI plus coding-harness skills for Claude Code, OpenAI Codex, Cursor, and OpenCode, all of which install Potpie instructions and skills rather than running the server themselves. Source: [README.md:1-50]()

## Core Subsystems

### Context Engine

The context engine lives under `potpie/context-engine/` and follows hexagonal/clean-architecture boundaries. Inbound adapters (HTTP/UI) sit alongside outbound adapters for intelligence, persistence, and search. A React/TypeScript frontend renders the graph UI, where nodes are sorted by property rank and timestamp values are normalized to readable local dates. Source: [potpie/context-engine/adapters/inbound/http/ui/frontend/src/App.tsx:1-50]()

Community issue #909 exposes a concrete coupling inside this layer: the OSS `HashingEmbedder` emits 256 dimensions, while every vector-index creator defaults to 1536 (the size of OpenAI `text-embedding-3-small`). The context engine therefore has to read the embedder's actual dimensionality through `getattr(self._embedder, ...)` lookups instead of trusting the default. Source: [README.md (community evidence, issue #909)]()
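The coupling described above can be illustrated with a minimal sketch. It is not the context engine's code: the attribute name `dimensions`, the `HashingEmbedderStub` class, and both helper functions are assumptions made for the example. Only the 256 and 1536 figures come from the text.

```python
from dataclasses import dataclass

# Default index size cited in the text (OpenAI text-embedding-3-small).
DEFAULT_INDEX_DIMENSIONS = 1536


@dataclass
class HashingEmbedderStub:
    """Hypothetical stand-in for the OSS embedder; only its size matters here."""

    dimensions: int = 256  # attribute name is an assumption


def resolve_index_dimensions(embedder: object, default: int = DEFAULT_INDEX_DIMENSIONS) -> int:
    """Prefer the size the embedder declares; fall back to the index default."""
    dims = getattr(embedder, "dimensions", None)
    if isinstance(dims, int) and dims > 0:
        return dims
    return default


def check_vector(vector: list, index_dimensions: int) -> None:
    """Fail loudly instead of letting a mismatched vector reach the index."""
    if len(vector) != index_dimensions:
        raise ValueError(
            f"embedding has {len(vector)} dimensions, index expects {index_dimensions}"
        )
```

Under these assumptions, sizing the index from the embedder avoids the mismatch, and the explicit length check turns a silent failure into an immediate error.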

### Parsing Layer

Repository parsing is implemented as a Rust crate at `potpie/parsing/src/tag_extract.rs` that uses `tree-sitter` and `rayon` to extract tags, nodes, and relationships from source files in parallel. It produces `TagPayload`, `NodePayload`, and `RelationshipPayload` structs that downstream stages consume to build the code graph. Source: [potpie/parsing/src/tag_extract.rs:1-40]()

Earlier releases (v1.0.2) replaced the original Rust module `create_graph_rs` with a parallelized Python parsing package that performs hybrid ColBERT/BM25 indexing through Qdrant. Issue #235 ("Fails to parse local repository") and issue #517 ("Consolidate `repo_name` and `repo_path` into a Single Repository Identifier") both stem from this dual Python/Rust split, where the legacy Python flow still expects two separate identifiers while the new engine has unified them. Source: [README.md (community evidence, issues #235 and #517)]()

### Sandbox & Agent Runtime

The sandbox package (`potpie/sandbox/sandbox/api/client.py`) isolates agent execution so agents no longer operate directly on the host filesystem. The `client` exposes operations such as `create_pull_request` and `comment_on_pull_request`, which delegate to a configured `GitPlatformProvider`. Inline comments require both `path` and `line`, while top-level PR comments only need `body` and `pr_number`, allowing review-only agents like `review-pr` to run without a writable workspace handle. Source: [potpie/sandbox/sandbox/api/client.py:1-50]()
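The two comment shapes can be sketched as a small validation step. This is an illustration under stated assumptions, not the sandbox client's implementation: the `PullRequestComment` dataclass and `classify_comment` function are hypothetical, and only the field names `body`, `pr_number`, `path`, and `line` come from the text.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class PullRequestComment:
    """Hypothetical request shape for a PR comment."""

    pr_number: int
    body: str
    path: Optional[str] = None  # file path, inline comments only
    line: Optional[int] = None  # line number, inline comments only


def classify_comment(comment: PullRequestComment) -> str:
    """Return "inline" or "top_level", rejecting half-specified inline comments."""
    if not comment.body.strip():
        raise ValueError("body must not be empty")
    # Exactly one of path/line being set is the invalid case.
    if (comment.path is None) != (comment.line is None):
        raise ValueError("inline comments need both path and line")
    return "inline" if comment.path is not None else "top_level"
```

Neither shape needs a workspace handle in this sketch, which mirrors why a review-only agent can post comments without a writable worktree.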

The v1.1.0 release introduced "sandbox-native agent execution with multi-backend support, bare-repo caching, and a revamped toolset for dramatic speed and token efficiency gains," formalizing this layer as a first-class runtime. Source: [README.md (community evidence, v1.1.0 release notes)]()

### Tool Registry and Integrations

The legacy tool registry (`legacy/app/modules/intelligence/tools/registry/`) is the single source of truth for tool metadata and agent–tool binding. Each `ToolMetadata` entry exposes a `tier` (`low`/`medium`/`high`), a `category` such as `search` or `integration_jira`, and Phase-4 flags like `read_only`, `destructive`, `idempotent`, and `requires_confirmation`. Agents resolve tools via allow-lists or categories, and per-request flags `add_when_non_local`, `exclude_in_local`, and `local_mode_only` control which tools ship to local clients such as the VS Code extension. Source: [legacy/app/modules/intelligence/tools/registry/README.md:1-25]()
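A rough sketch of category-based resolution with local-mode filtering follows. The field names mirror the text, but the dataclass layout, the `tools_for_request` function, and the exact semantics of `exclude_in_local` and `local_mode_only` are assumptions. The registry README remains the authority on how these flags behave.

```python
from dataclasses import dataclass
from typing import Iterable, List, Set


@dataclass(frozen=True)
class ToolMetadata:
    """Simplified, hypothetical view of a registry entry."""

    name: str
    tier: str  # "low", "medium", or "high"
    category: str  # e.g. "search" or "integration_jira"
    read_only: bool = True
    destructive: bool = False
    exclude_in_local: bool = False  # assumed: drop when serving a local client
    local_mode_only: bool = False  # assumed: drop when serving a hosted client


def tools_for_request(
    tools: Iterable[ToolMetadata], allowed_categories: Set[str], local_mode: bool
) -> List[str]:
    """Resolve tool names by category, then apply the local-mode filters."""
    selected = []
    for tool in tools:
        if tool.category not in allowed_categories:
            continue
        if local_mode and tool.exclude_in_local:
            continue
        if not local_mode and tool.local_mode_only:
            continue
        selected.append(tool.name)
    return selected
```

The tool names used to exercise this sketch would be placeholders as well; the point is only that category resolution and local-mode filtering are independent steps.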

Integration tools cover four external systems today, each implemented behind the same Pydantic + LangChain `StructuredTool` factory pattern: GitHub (issues, PRs, reviews, source history), Linear (issues, projects), Jira (issues, changelog), and Confluence (spaces, pages, runbooks). The Confluence tools, for example, use OAuth 2.0 (3LO), call the v2 REST API at `https://api.atlassian.com/ex/confluence/{cloud_id}/wiki/api/v2/`, and always verify that an integration exists before executing. Source: [legacy/app/modules/intelligence/tools/confluence_tools/README.md:1-50]() and [legacy/app/modules/intelligence/tools/linear_tools/README.md:1-30]().

Issue #911 ("Bitbucket foundation for unified Atlassian login") pushes this layer toward a single Atlassian onboarding path that covers Jira, Confluence, and Bitbucket together rather than treating each as a separate integration story. Source: [README.md (community evidence, issue #911)]()

## Distribution, Operations, and Known Gaps

Potpie v2 ships as installable PyPI packages plus a local daemon, which together form the developer-facing CLI contract. Issue #924 requests a `potpie --version` command for first-pass install verification with `uv tool`, and issue #926 asks that `potpie status` default to the agent-readiness contract (context/pot/backend/skill readiness plus a recommended next action) rather than integration auth status. Source: [README.md (community evidence, issues #924 and #926)]()

Observability is wired for Loki + Grafana on top of Prometheus and Langfuse; the legacy observability README specifies a low-cardinality label set (`service`, `env`, `level`, `container`) and high-cardinality structured metadata (`request_id`, `conversation_id`, `run_id`, `user_id`, `task_id`, `project_id`, `logger`) that Promtail promotes from the JSON log output. The FastAPI and Celery containers must be tagged with those labels by the deployment layer; Loki retention is not bounded by default. Source: [legacy/deploy/observability/README.md:1-30]()
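The label/metadata split can be pictured with a short sketch. It only illustrates which keys land where. It is not Promtail configuration, and the `split_log_record` helper is hypothetical. The key names are the ones listed above.

```python
import json
from typing import Dict, Tuple

# Low-cardinality keys that are safe to use as Loki labels.
LABEL_KEYS = ("service", "env", "level", "container")

# High-cardinality keys kept as structured metadata instead of labels.
METADATA_KEYS = (
    "request_id",
    "conversation_id",
    "run_id",
    "user_id",
    "task_id",
    "project_id",
    "logger",
)


def split_log_record(raw_line: str) -> Tuple[Dict[str, str], Dict[str, str]]:
    """Split one JSON log line into (labels, structured_metadata)."""
    record = json.loads(raw_line)
    labels = {k: str(record[k]) for k in LABEL_KEYS if k in record}
    metadata = {k: str(record[k]) for k in METADATA_KEYS if k in record}
    return labels, metadata
```

Keeping identifiers such as `request_id` out of the label set is what keeps stream cardinality low while still leaving them searchable.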

Finally, two architectural cleanups are tracked against the context-graph domain. Issue #899 asks to remove or reimplement the dead `context_resolve` stub left behind after `_ResolutionServiceShim` was torn down in `legacy/app/modules/intelligence/tools/context_tools/agent_context_tools.py`. Issue #902 requires that destructive semantic mutations on the agent-facing write tier (`domain/semantic_mutations.py`) cannot auto-commit past review, even when their `auto_commit` flag is set. Source: [README.md (community evidence, issues #899 and #902)]()

## See Also

- Context Graph Architecture (deep dive)
- CLI & Daemon Workflow
- Tool Registry and Agent–Tool Binding
- Parsing and Indexing Pipeline
- Integrations: GitHub, Linear, Jira, Confluence

---

<a id='page-2'></a>

## CLI Commands, Setup, and Daemon Workflows

### Related Pages

Related topics: [Overview & System Architecture](#page-1), [Context Graph Domain, Backends, and Reconciliation](#page-3)

<details>
<summary>Related Source Files</summary>

The following source files were used to generate this page:

- [README.md](https://github.com/potpie-ai/potpie/blob/main/README.md)
- [legacy/README.md](https://github.com/potpie-ai/potpie/blob/main/legacy/README.md)
- [legacy/deploy/observability/README.md](https://github.com/potpie-ai/potpie/blob/main/legacy/deploy/observability/README.md)
- [legacy/app/modules/intelligence/tools/confluence_tools/README.md](https://github.com/potpie-ai/potpie/blob/main/legacy/app/modules/intelligence/tools/confluence_tools/README.md)
- [legacy/app/modules/intelligence/tools/jira_tools/README.md](https://github.com/potpie-ai/potpie/blob/main/legacy/app/modules/intelligence/tools/jira_tools/README.md)
- [legacy/app/modules/intelligence/tools/linear_tools/README.md](https://github.com/potpie-ai/potpie/blob/main/legacy/app/modules/intelligence/tools/linear_tools/README.md)
- [legacy/app/modules/intelligence/tools/registry/README.md](https://github.com/potpie-ai/potpie/blob/main/legacy/app/modules/intelligence/tools/registry/README.md)
- [potpie/integrations/README.md](https://github.com/potpie-ai/potpie/blob/main/potpie/integrations/README.md)
- [potpie/sandbox/sandbox/api/client.py](https://github.com/potpie-ai/potpie/blob/main/potpie/sandbox/sandbox/api/client.py)
</details>

## Overview and Purpose

Potpie exposes its functionality through a layered CLI that bridges a **local-first context engine** with the **integration surface** (GitHub, Linear, Jira, Confluence, Bitbucket) and the **agent runtime**. According to the v2.0.0 pre-release notes, the CLI is being reworked to provide "cleaner CLI and daemon workflows, stronger agent runtime support, faster indexing, deeper observability, and expanded context-graph integrations" (Source: [README.md](https://github.com/potpie-ai/potpie/blob/main/README.md)).

The legacy system centers on a FastAPI service at `localhost:8001` that receives user requests, a Celery worker with Redis broker that performs async repository parsing, and an Agent Router that dispatches prompts to pre-built or custom agents (Source: [legacy/README.md](https://github.com/potpie-ai/potpie/blob/main/legacy/README.md)). The new v2 CLI replaces this with a daemon-driven model where context is built locally and queried without round-tripping to a hosted backend for every operation.

## High-Level Architecture

```mermaid
flowchart LR
    A[potpie CLI] --> B[Setup Wizard]
    A --> C[Daemon Process]
    A --> D[Command Dispatcher]
    B --> E[Repository Parse<br/>Neo4j Knowledge Graph]
    C --> F[Local Context Engine]
    D --> G[Integration Auth<br/>GitHub/Linear/Jira/Confluence]
    D --> H[Agent Invocation<br/>pre-built + custom]
    E --> I[Qdrant Vector Index]
    F --> I
    H --> J[Tool Registry]
    J --> K[Sandbox API<br/>git, terminal, PRs]
```

The diagram captures the three orthogonal concerns a CLI user touches: **setup** (parsing repos into a knowledge graph), **daemon** (local context engine + agent runtime), and **commands** (auth, status, query, ledger).

## Setup Workflow

The setup flow ingests one or more codebases and materializes them as a queryable knowledge graph. In the legacy stack this is orchestrated by Celery with Redis as the broker, performing "cloning, AST extraction, and knowledge graph construction" entirely in the background (Source: [legacy/README.md](https://github.com/potpie-ai/potpie/blob/main/legacy/README.md)). The graph is persisted in Neo4j and captures "every file, function, class, and the relationships between them" (Source: [legacy/README.md](https://github.com/potpie-ai/potpie/blob/main/legacy/README.md)).

The v1.0.2 release replaced the Rust `create_graph_rs` module with `app/src/parsing/` and introduced ColBERT/BM25 hybrid vector indexing via Qdrant (Source: [README.md](https://github.com/potpie-ai/potpie/blob/main/README.md)). Because cloning, parsing, and indexing all run in the background, setup is a long-running operation: users poll the parse job until it completes before they can issue agent queries.

### Setup Steps

1. Authenticate with chosen provider(s) — see the Atlassian CLI unification tracked in issue #911.
2. Point Potpie at a repository (local path or remote URL — see issue #517 on consolidating `repo_name` and `repo_path`).
3. Trigger indexing; CLI returns a job handle.
4. Wait for completion (parsing-failure alerts are delivered by email, a feature added in v1.0.1).
5. Verify readiness via `potpie status` (see "Common Failure Modes" below).
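Steps 3 and 4 amount to a poll-until-terminal loop, sketched below. Everything here is hypothetical: the status strings, the `wait_for_parse` function, and the idea that a caller supplies a `get_status` callable are assumptions for illustration. The text does not specify how the CLI exposes the job handle.

```python
import time
from typing import Callable

# Placeholder status names; the real job states are not given in the text.
TERMINAL_STATES = frozenset({"ready", "error"})


def wait_for_parse(
    get_status: Callable[[], str],
    interval_s: float = 2.0,
    timeout_s: float = 600.0,
    sleep: Callable[[float], None] = time.sleep,
    clock: Callable[[], float] = time.monotonic,
) -> str:
    """Poll get_status() until it reports a terminal state or the timeout passes."""
    deadline = clock() + timeout_s
    while True:
        status = get_status()
        if status in TERMINAL_STATES:
            return status
        if clock() >= deadline:
            raise TimeoutError(f"parse job still '{status}' after {timeout_s}s")
        sleep(interval_s)
```

Injecting `sleep` and `clock` keeps the loop testable without real delays; a caller would pass a function that queries the job handle returned in step 3.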

## Daemon Workflow

The daemon is the long-lived process that hosts the local context engine and serves agent invocations. In v2.0.0b3 the daemon ships as part of the `potpie-context-engine` PyPI package alongside the main `potpie` distribution, making the daemon installable through `uv tool` or `pip` outside the development repo (Source: [README.md](https://github.com/potpie-ai/potpie/blob/main/README.md)).

The sandbox layer the daemon depends on is defined in [potpie/sandbox/sandbox/api/client.py](https://github.com/potpie-ai/potpie/blob/main/potpie/sandbox/sandbox/api/client.py). Its `comment_on_pull_request` method documents two operation shapes — top-level conversation comments and inline review comments — and explicitly notes that review comments are commonly issued "from analysis flows (the `review-pr` agent that has no worktree at all)" (Source: [potpie/sandbox/sandbox/api/client.py](https://github.com/potpie-ai/potpie/blob/main/potpie/sandbox/sandbox/api/client.py)). This is the kind of operation a daemon-orchestrated agent issues without direct filesystem access.

The integrations module mirrors the hexagonal layout of `context-engine`, with `integrations/domain/` for the provider registry, `integrations/application/` for services and provider bootstrap, and `integrations/adapters/outbound/` for persistence, OAuth clients, and crypto (Source: [potpie/integrations/README.md](https://github.com/potpie-ai/potpie/blob/main/potpie/integrations/README.md)). The CLI loads providers from this package at startup, so adding a new integration (Bitbucket, Sentry) is a domain registration plus an outbound adapter.

## CLI Command Surface

### Integration Commands

Each integration ships a parallel toolset wired through the tool registry:

| Integration | Tool Group | Auth Model |
| --- | --- | --- |
| GitHub | repos, PRs, issues, reviews | OAuth / PAT |
| Linear | teams, issues, projects, documents | API key (env or per-user secret) |
| Jira | projects, issues, status, changelog | OAuth 2.0 (3LO) |
| Confluence | spaces, pages, runbooks, decisions | OAuth 2.0 (3LO) |

The Linear integration explicitly supports two API-key configuration modes — a global env var (`LINEAR_API_KEY`) and per-user keys stored via `SecretManager.create_integration_keys` (Source: [legacy/app/modules/intelligence/tools/linear_tools/README.md](https://github.com/potpie-ai/potpie/blob/main/legacy/app/modules/intelligence/tools/linear_tools/README.md)). Confluence requires `CONFLUENCE_CLIENT_ID`, `CONFLUENCE_CLIENT_SECRET`, and `CONFLUENCE_REDIRECT_URI` (Source: [legacy/app/modules/intelligence/tools/confluence_tools/README.md](https://github.com/potpie-ai/potpie/blob/main/legacy/app/modules/intelligence/tools/confluence_tools/README.md)). All Jira tools verify integration presence via `check_jira_integration_exists` before execution (Source: [legacy/app/modules/intelligence/tools/jira_tools/README.md](https://github.com/potpie-ai/potpie/blob/main/legacy/app/modules/intelligence/tools/jira_tools/README.md)).
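The two Linear key modes could be combined as in the sketch below. The precedence (per-user key first, then the global variable) is an assumption, as is the `resolve_linear_api_key` helper and the plain mapping that stands in for whatever `SecretManager` returns. Only the `LINEAR_API_KEY` variable name comes from the text.

```python
import os
from typing import Mapping, Optional


def resolve_linear_api_key(
    user_id: str,
    user_secrets: Mapping[str, str],
    environ: Mapping[str, str] = os.environ,
) -> Optional[str]:
    """Return the per-user key if present, else the global env var, else None."""
    # user_secrets is a hypothetical stand-in for the per-user secret store.
    per_user = user_secrets.get(user_id)
    if per_user:
        return per_user
    return environ.get("LINEAR_API_KEY") or None
```

Returning `None` rather than raising lets the caller decide whether a missing key should surface as a "no integration found" style message.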

### Tool Discovery

The registry supports both **upfront** and **search-flow** tool loading. In search-flow mode, agents receive `search_tools`, `describe_tool`, and `execute_tool` discovery primitives and resolve tool names via allow-lists or categories at runtime (Source: [legacy/app/modules/intelligence/tools/registry/README.md](https://github.com/potpie-ai/potpie/blob/main/legacy/app/modules/intelligence/tools/registry/README.md)). `ChatContext.use_tool_search_flow: bool = False` toggles the mode per request for A/B testing, and cache keys for `AgentFactory` include this flag so the daemon serves the right tool payload.
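Why the flag belongs in the cache key can be shown with a small sketch. The `agent_cache_key` and `tool_payload` functions and the key layout are hypothetical; only the three discovery tool names and the `use_tool_search_flow` flag come from the text.

```python
from typing import Iterable, List, Tuple

# Discovery primitives named in the registry README.
DISCOVERY_TOOLS = ("search_tools", "describe_tool", "execute_tool")


def agent_cache_key(agent_id: str, user_id: str, use_tool_search_flow: bool = False) -> Tuple[str, str, bool]:
    """Include the flag so the two loading modes never share a cached agent."""
    return (agent_id, user_id, use_tool_search_flow)


def tool_payload(all_tools: Iterable[str], use_tool_search_flow: bool) -> List[str]:
    """Upfront mode ships every tool; search-flow mode ships only discovery tools."""
    if use_tool_search_flow:
        return list(DISCOVERY_TOOLS)
    return list(all_tools)
```

If the flag were left out of the key, a request in one mode could be served an agent built with the other mode's tool payload.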

## Common Failure Modes

| Symptom | Root Cause | Resolution |
| --- | --- | --- |
| `potpie --version` errors with `No such option` | Version flag not wired on root parser (issue #924) | Track the PR fixing the root CLI parser |
| `potpie status` returns auth status instead of readiness | Default output misaligned with `context_status` contract (issue #926) | Use the explicit `context_status` invocation |
| Local repo parse fails | `repo_name` / `repo_path` dual-variable handling (issue #517) | Issue tracks consolidation |
| Vector dimension mismatch | `HashingEmbedder` (256d) vs default index (1536d) — issue #909 | Override `_DEFAULT_DIMENSIONS` or change index creator to read embedder dims |
| Confluence tool returns "No integration found" | OAuth connection not established | Set `CONFLUENCE_CLIENT_ID`, `CONFLUENCE_CLIENT_SECRET`, and `CONFLUENCE_REDIRECT_URI`, then complete the OAuth flow |
| Jira `version conflict` on update | Concurrent edit since last fetch | Re-fetch page/issue and retry |

The observability layer that surfaces these failures uses Promtail + Loki with high-cardinality fields (`request_id`, `conversation_id`, `run_id`, `user_id`, `task_id`, `project_id`) promoted as structured metadata rather than labels (Source: [legacy/deploy/observability/README.md](https://github.com/potpie-ai/potpie/blob/main/legacy/deploy/observability/README.md)). Container-level labels (`service`, `env`, `level`, `container`) still need to be set in the deployment manifest — they are not applied by the repository `dockerfile`.

## See Also

- Tool Registry and Discovery (`legacy/app/modules/intelligence/tools/registry/`)
- Integration Onboarding (`potpie/integrations/`)
- Sandbox API (`potpie/sandbox/sandbox/api/`)
- Observability Deployment (`legacy/deploy/observability/`)
- Confluence, Jira, and Linear tool READMEs under `legacy/app/modules/intelligence/tools/`

---

<a id='page-3'></a>

## Context Graph Domain, Backends, and Reconciliation

### Related Pages

Related topics: [Overview & System Architecture](#page-1), [CLI Commands, Setup, and Daemon Workflows](#page-2), [Integrations, Source Connectors, and Sandbox Runtime](#page-4)

<details>
<summary>Related Source Files</summary>

The following source files were used to generate this page:

- [potpie/context-engine/domain/semantic_mutations.py](https://github.com/potpie-ai/potpie/blob/main/potpie/context-engine/domain/semantic_mutations.py)
- [potpie/context-engine/domain/graph_mutations.py](https://github.com/potpie-ai/potpie/blob/main/potpie/context-engine/domain/graph_mutations.py)
- [potpie/context-engine/domain/graph_plans.py](https://github.com/potpie-ai/potpie/blob/main/potpie/context-engine/domain/graph_plans.py)
- [potpie/context-engine/domain/context_records.py](https://github.com/potpie-ai/potpie/blob/main/potpie/context-engine/domain/context_records.py)
- [potpie/context-engine/domain/graph_workbench.py](https://github.com/potpie-ai/potpie/blob/main/potpie/context-engine/domain/graph_workbench.py)
- [potpie/context-engine/domain/graph_workbench_ontology.py](https://github.com/potpie-ai/potpie/blob/main/potpie/context-engine/domain/graph_workbench_ontology.py)
- [potpie/context-engine/domain/ports/policy.py](https://github.com/potpie-ai/potpie/blob/main/potpie/context-engine/domain/ports/policy.py)
- [potpie/context-engine/domain/ingestion_event_models.py](https://github.com/potpie-ai/potpie/blob/main/potpie/context-engine/domain/ingestion_event_models.py)
- [potpie/context-engine/domain/ingestion_kinds.py](https://github.com/potpie-ai/potpie/blob/main/potpie/context-engine/domain/ingestion_kinds.py)
- [potpie/context-engine/adapters/inbound/http/api/v1/context/router.py](https://github.com/potpie-ai/potpie/blob/main/potpie/context-engine/adapters/inbound/http/api/v1/context/router.py)
- [potpie/context-engine/adapters/outbound/postgres/ledger.py](https://github.com/potpie-ai/potpie/blob/main/potpie/context-engine/adapters/outbound/postgres/ledger.py)
- [potpie/context-engine/adapters/outbound/postgres/reconciliation_ledger.py](https://github.com/potpie-ai/potpie/blob/main/potpie/context-engine/adapters/outbound/postgres/reconciliation_ledger.py)
- [potpie/context-engine/bootstrap/ingestion_server.py](https://github.com/potpie-ai/potpie/blob/main/potpie/context-engine/bootstrap/ingestion_server.py)
- [potpie/context-engine/application/use_cases/hard_reset_pot.py](https://github.com/potpie-ai/potpie/blob/main/potpie/context-engine/application/use_cases/hard_reset_pot.py)
- [potpie/context-engine/application/use_cases/record_durable_context.py](https://github.com/potpie-ai/potpie/blob/main/potpie/context-engine/application/use_cases/record_durable_context.py)
- [potpie/context-engine/application/use_cases/report_status.py](https://github.com/potpie-ai/potpie/blob/main/potpie/context-engine/application/use_cases/report_status.py)
- [potpie/context-engine/application/use_cases/submit_raw_episode.py](https://github.com/potpie-ai/potpie/blob/main/potpie/context-engine/application/use_cases/submit_raw_episode.py)
- [legacy/app/modules/context_graph/README.md](https://github.com/potpie-ai/potpie/blob/main/legacy/app/modules/context_graph/README.md)
</details>

The **Context Graph** is Potpie's durable, queryable memory layer for agent runs. It stores episodes, plans, semantic nodes, and reconciliation events, then exposes them through a hexagonal API used by the CLI, the daemon, and LangChain tools. The v2 architecture splits portable business logic into the `potpie/context-engine` package and keeps Potpie-specific glue (Celery tasks, user-scoped wiring, FastAPI mount points) in `legacy/app/modules/context_graph`. Source: [legacy/app/modules/context_graph/README.md:5-19]().

## 1. Domain Layer: Mutations, Plans, and Records

The context-engine domain is organized around three concerns: durable **records**, forward-looking **plans**, and state-changing **mutations**.

| Module | Responsibility |
|--------|---------------|
| `domain/context_records.py` | Immutable facts committed to the graph (decisions, conventions, episodes) |
| `domain/graph_plans.py` | Forward-looking intentions: tasks the agent intends to perform against a pot |
| `domain/graph_mutations.py` | Lower-level graph edits (add node, link, remove) |
| `domain/semantic_mutations.py` | High-level semantic edits that combine mutations into meaningful operations |

The workbench modules expose agent-friendly views on top of those primitives:

- `domain/graph_workbench.py` provides the read-side surface (queries, traversal helpers, projection of records/plans/mutations into something an agent can navigate). Source: [potpie/context-engine/domain/graph_workbench.py](https://github.com/potpie-ai/potpie/blob/main/potpie/context-engine/domain/graph_workbench.py)
- `domain/graph_workbench_ontology.py` defines the node/edge ontology (types, allowed relationships, validation rules) so every write stays consistent with the rest of the graph. Source: [potpie/context-engine/domain/graph_workbench_ontology.py](https://github.com/potpie-ai/potpie/blob/main/potpie/context-engine/domain/graph_workbench_ontology.py)

### Security: Destructive mutations

The agent-facing write tier exposes both an approval field and an auto-commit flag. Destructive kinds on `semantic_mutations` cannot auto-commit past review: they require explicit human approval regardless of the auto-commit setting, which prevents the agent from deleting or rewriting semantic state in a single shot. Source: [potpie/context-engine/domain/semantic_mutations.py](https://github.com/potpie-ai/potpie/blob/main/potpie/context-engine/domain/semantic_mutations.py) (see community issue #902).
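The rule above reduces to a small commit gate, sketched here under assumptions. The mutation kind names in `DESTRUCTIVE_KINDS`, the `SemanticMutation` dataclass, and the `may_commit` function are hypothetical. The text only establishes that an approval field and an auto-commit flag exist and that destructive kinds need approval.

```python
from dataclasses import dataclass

# Placeholder kind names; the real destructive kinds are not listed in the text.
DESTRUCTIVE_KINDS = frozenset({"delete_node", "rewrite_node"})


@dataclass(frozen=True)
class SemanticMutation:
    """Hypothetical, simplified mutation request."""

    kind: str
    auto_commit: bool = False
    approved: bool = False


def may_commit(mutation: SemanticMutation) -> bool:
    """Destructive kinds need explicit approval; auto_commit cannot override that."""
    if mutation.kind in DESTRUCTIVE_KINDS:
        return mutation.approved
    return mutation.auto_commit or mutation.approved
```

The important property is that `auto_commit` is never consulted on the destructive branch, so no combination of agent-supplied flags can skip review.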

## 2. Ports, Policy, and Application Use Cases

The domain defines **ports** that the rest of the system depends on. `domain/ports/policy.py` enumerates the action vocabulary used by the authorization layer: `ACTION_POT_INGEST_EPISODE`, `ACTION_POT_READ`, `ACTION_POT_RECORD`, `ACTION_POT_RESET`, and others. Every inbound request resolves an `Actor` (with `ActorSurface` and `normalize_surface`) and checks policy before mutating state. Source: [potpie/context-engine/domain/ports/policy.py:15-40](), [potpie/context-engine/domain/actor.py](https://github.com/potpie-ai/potpie/blob/main/potpie/context-engine/domain/actor.py).
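A policy check of this kind might look like the sketch below. The constant names come from the text, but their string values, the `Actor` fields, the body of `normalize_surface`, the `PolicyDenied` exception, and the surface-keyed grants table are all assumptions made for illustration.

```python
from dataclasses import dataclass
from typing import FrozenSet, Mapping

# Names from the text; the string values are placeholders.
ACTION_POT_INGEST_EPISODE = "pot.ingest_episode"
ACTION_POT_READ = "pot.read"
ACTION_POT_RECORD = "pot.record"
ACTION_POT_RESET = "pot.reset"


def normalize_surface(surface: str) -> str:
    """Hypothetical normalization: trim whitespace and lowercase."""
    return surface.strip().lower()


@dataclass(frozen=True)
class Actor:
    user_id: str
    surface: str  # e.g. "cli" or "http"; values are assumptions


class PolicyDenied(PermissionError):
    """Raised when an actor attempts an action its surface is not granted."""


def authorize(actor: Actor, action: str, grants: Mapping[str, FrozenSet[str]]) -> None:
    """Check the action against the grants for the actor's normalized surface."""
    allowed = grants.get(normalize_surface(actor.surface), frozenset())
    if action not in allowed:
        raise PolicyDenied(f"{actor.user_id} may not perform {action} via {actor.surface}")
```

Calling `authorize` before any state change keeps the decision in one place, which is the property the port is meant to provide.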

The application layer orchestrates the domain into named **use cases**:

- `hard_reset_pot` — wipes all state for a pot (records, plans, embeddings). Source: [potpie/context-engine/application/use_cases/hard_reset_pot.py](https://github.com/potpie-ai/potpie/blob/main/potpie/context-engine/application/use_cases/hard_reset_pot.py)
- `record_durable_context` — accepts a `DurableContextPayload` and persists a context record. Source: [potpie/context-engine/application/use_cases/record_durable_context.py](https://github.com/potpie-ai/potpie/blob/main/potpie/context-engine/application/use_cases/record_durable_context.py)
- `submit_raw_episode` — pushes an unparsed episode onto the ingestion queue. Source: [potpie/context-engine/application/use_cases/submit_raw_episode.py](https://github.com/potpie-ai/potpie/blob/main/potpie/context-engine/application/use_cases/submit_raw_episode.py)
- `report_status` — cheap context/pot/backend/skill readiness probe consumed by `potpie status`. Source: [potpie/context-engine/application/use_cases/report_status.py](https://github.com/potpie-ai/potpie/blob/main/potpie/context-engine/application/use_cases/report_status.py)

## 3. Backends: Ledgers, Vectors, and Dimensionality

The outbound adapters persist state to Postgres. Two ledgers cooperate:

```mermaid
flowchart LR
    A[Agent / CLI] -->|submit_raw_episode| B[IngestionLedger<br/>postgres/ledger.py]
    B -->|event| C[Worker / Use Case]
    C -->|reconcile| D[ReconciliationLedger<br/>postgres/reconciliation_ledger.py]
    C -->|embed| E[(Vector Index<br/>embedder.dim)]
    D --> F[(Graph Store<br/>Neo4j / Postgres)]
    E --> F
```

- `adapters/outbound/postgres/ledger.py` — `SqlAlchemyIngestionLedger` records ingestion events keyed by `IngestionEventStatus` and filterable through `EventListFilters`. Source: [potpie/context-engine/adapters/outbound/postgres/ledger.py](https://github.com/potpie-ai/potpie/blob/main/potpie/context-engine/adapters/outbound/postgres/ledger.py), [potpie/context-engine/domain/ingestion_event_models.py](https://github.com/potpie-ai/potpie/blob/main/potpie/context-engine/domain/ingestion_event_models.py)
- `adapters/outbound/postgres/reconciliation_ledger.py` — `SqlAlchemyReconciliationLedger` tracks reconciliation runs, including `INGESTION_KIND_AGENT_RECONCILIATION`. Source: [potpie/context-engine/adapters/outbound/postgres/reconciliation_ledger.py](https://github.com/potpie-ai/potpie/blob/main/potpie/context-engine/adapters/outbound/postgres/reconciliation_ledger.py), [potpie/context-engine/domain/ingestion_kinds.py](https://github.com/potpie-ai/potpie/blob/main/potpie/context-engine/domain/ingestion_kinds.py)

**Embedder dimensionality** is a real failure mode: the bundled `HashingEmbedder` emits 256 dimensions, while vector-index creators default to 1536 (OpenAI `text-embedding-3-small`). Mismatched dims cause silent index errors. The reconciliation use case must reconcile the embedder's actual output with the index configuration (see community issue #909). Source: [potpie/context-engine/adapters/outbound/intelligence/local_embedder.py:38](), [potpie/context-engine/adapters/outbound/postgres/ledger.py](https://github.com/potpie-ai/potpie/blob/main/potpie/context-engine/adapters/outbound/postgres/ledger.py).

## 4. Reconciliation and HTTP Surface

Reconciliation is the loop that turns raw episodes into durable records: workers consume events from the ingestion ledger, run the parsing/embedding pipeline, write to the graph, and record outcomes in the reconciliation ledger. Failed events remain queryable through `EventListFilters` for replay.
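The loop can be sketched in a few lines. This is a simplified in-memory model rather than the engine's worker: the `Event` dataclass, the status strings, and the `reconcile` and `list_events` functions are hypothetical stand-ins for the ledgers and `EventListFilters`.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List, Optional


@dataclass
class Event:
    """Hypothetical ingestion event; status names are placeholders."""

    event_id: str
    payload: str
    status: str = "queued"
    error: Optional[str] = None


def reconcile(events: List[Event], process: Callable[[str], None]) -> Dict[str, str]:
    """Process queued events and record each outcome on the event itself."""
    outcomes = {}
    for event in events:
        if event.status != "queued":
            continue
        try:
            process(event.payload)  # stands in for parse, embed, and graph write
        except Exception as exc:
            event.status = "failed"
            event.error = str(exc)
        else:
            event.status = "done"
        outcomes[event.event_id] = event.status
    return outcomes


def list_events(events: List[Event], status: str) -> List[str]:
    """Filter by status, so failed events stay discoverable for replay."""
    return [e.event_id for e in events if e.status == status]
```

A replay in this model would reset a failed event to the queued status and run `reconcile` again; one bad payload does not stop the rest of the batch.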

The HTTP surface is a single FastAPI router mounted at `/api/v1/context`:

- `POST /api/v1/context` — submit raw episodes (`submit_raw_episode`)
- `POST /api/v1/context/durable` — record durable context (`record_durable_context`)
- `GET /api/v1/context/status` — `report_status` (consumed by `potpie status`)
- `POST /api/v1/context/pots/{id}/reset` — `hard_reset_pot`

All routes share a common dependency layer (`get_container_or_503`, `get_db`, `get_db_optional`, `require_api_key`) defined in `adapters/inbound/http/deps.py`. The container is `IngestionServerContainer` from `bootstrap/ingestion_server.py`, built either standalone (for the daemon) or scoped to a user session. Source: [potpie/context-engine/adapters/inbound/http/api/v1/context/router.py:1-90](), [potpie/context-engine/bootstrap/ingestion_server.py](https://github.com/potpie-ai/potpie/blob/main/potpie/context-engine/bootstrap/ingestion_server.py).
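For orientation, the sketch below builds (but does not send) requests against two of the routes listed above using only the standard library. The `X-API-Key` header name, the base URL passed by the caller, and the helper names are assumptions. The text confirms only the route paths and that an API key dependency exists.

```python
import json
import urllib.parse
import urllib.request
from typing import Optional

CONTEXT_PREFIX = "/api/v1/context"


def _request(base_url: str, path: str, api_key: str, method: str, body: Optional[dict] = None):
    """Build a urllib Request for a context route without sending it."""
    data = json.dumps(body).encode("utf-8") if body is not None else None
    headers = {"X-API-Key": api_key}  # header name is an assumption
    if data is not None:
        headers["Content-Type"] = "application/json"
    return urllib.request.Request(
        url=base_url.rstrip("/") + CONTEXT_PREFIX + path,
        data=data,
        headers=headers,
        method=method,
    )


def status_request(base_url: str, api_key: str):
    """GET /api/v1/context/status"""
    return _request(base_url, "/status", api_key, "GET")


def reset_request(base_url: str, api_key: str, pot_id: str):
    """POST /api/v1/context/pots/{id}/reset, with the pot id URL-encoded."""
    safe_id = urllib.parse.quote(pot_id, safe="")
    return _request(base_url, f"/pots/{safe_id}/reset", api_key, "POST")
```

Passing the result to `urllib.request.urlopen` would send it; the host and port depend on how the daemon or host application is deployed.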

The Potpie host wires this together with Celery (`legacy/app/modules/context_graph/celery_job_queue.py`, `tasks.py`), user-scoped pots (`context_graph_pot_model.py`, `wiring.py`), and a FastAPI mount (`context_pot_routes.py`) so each user gets isolated `context_graph_pots`, members, and repositories. Source: [legacy/app/modules/context_graph/README.md:21-33]().

## See Also

- [Tool Registry & Allow-Lists](https://github.com/potpie-ai/potpie/blob/main/legacy/app/modules/intelligence/tools/registry/README.md) — how agents bind to graph tools
- [Integrations (OAuth, Linear, sources)](https://github.com/potpie-ai/potpie/blob/main/potpie/integrations/README.md) — how external systems feed the graph
- [Parsing crate](https://github.com/potpie-ai/potpie/blob/main/potpie/parsing/README.md) — how source files become nodes/edges
- [Context graph host (`legacy/app/modules/context_graph`)](https://github.com/potpie-ai/potpie/blob/main/legacy/app/modules/context_graph/README.md) — Potpie-specific glue around the engine

---

<a id='page-4'></a>

## Integrations, Source Connectors, and Sandbox Runtime

### Related Pages

Related topics: [Overview & System Architecture](#page-1), [Context Graph Domain, Backends, and Reconciliation](#page-3)

<details>
<summary>Related Source Files</summary>

The following source files were used to generate this page:

- [README.md](https://github.com/potpie-ai/potpie/blob/main/README.md)
- [legacy/README.md](https://github.com/potpie-ai/potpie/blob/main/legacy/README.md)
- [potpie/integrations/README.md](https://github.com/potpie-ai/potpie/blob/main/potpie/integrations/README.md)
- [legacy/app/modules/intelligence/tools/confluence_tools/README.md](https://github.com/potpie-ai/potpie/blob/main/legacy/app/modules/intelligence/tools/confluence_tools/README.md)
- [legacy/app/modules/intelligence/tools/linear_tools/README.md](https://github.com/potpie-ai/potpie/blob/main/legacy/app/modules/intelligence/tools/linear_tools/README.md)
- [legacy/app/modules/intelligence/tools/registry/README.md](https://github.com/potpie-ai/potpie/blob/main/legacy/app/modules/intelligence/tools/registry/README.md)
- [potpie/sandbox/sandbox/api/client.py](https://github.com/potpie-ai/potpie/blob/main/potpie/sandbox/sandbox/api/client.py)
- [potpie/parsing/src/tag_extract.rs](https://github.com/potpie-ai/potpie/blob/main/potpie/parsing/src/tag_extract.rs)
</details>

# Integrations, Source Connectors, and Sandbox Runtime

## Overview

Potpie is an agentic platform that grounds AI agents in real software repositories. To deliver that grounding, the system depends on three cooperating layers:

1. **Source Connectors** — pull repository content, issues, pages, and tickets from external systems (GitHub, Linear, Jira, Confluence) so the context engine has something to index.
2. **Tool Layer / Registry** — exposes callable operations (`get_confluence_page`, `update_linear_issue`, `create_pull_request`, etc.) to LangChain agents through allow-lists, categories, and tier metadata.
3. **Sandbox Runtime** — provides an isolated execution surface where agents can clone repos, edit files, run terminals, and open pull requests without touching the host filesystem.

Per the top-level README, Potpie currently ships four canonical integrations (GitHub, Linear, Jira, Confluence) and four coding harnesses (Claude Code, OpenAI Codex, Cursor, OpenCode), with more being added on demand. Source: [README.md](https://github.com/potpie-ai/potpie/blob/main/README.md).

## Source Connectors and Integration Architecture

### Canonical integrations

The integration catalog is documented in the top-level README and falls into two categories:

| Category     | Members                                     | Purpose                                                                 |
| ------------ | ------------------------------------------- | ----------------------------------------------------------------------- |
| Integrations | GitHub, Linear, Jira, Confluence            | Index repos, PRs, issues, reviews, tickets, pages, and design documents |
| Harnesses    | Claude Code, OpenAI Codex, Cursor, OpenCode | Install Potpie instructions/skills inside the user's coding environment |

Source: [README.md](https://github.com/potpie-ai/potpie/blob/main/README.md).

### Hexagonal layout

The `potpie-integrations` Python package follows a hexagonal (ports-and-adapters) architecture that mirrors the context engine. Its layout is explicitly described in `potpie/integrations/README.md`:

- `integrations/domain/` — registry, provider definitions, shared schemas
- `integrations/application/` — services, provider bootstrap
- `integrations/adapters/outbound/` — persistence models, OAuth clients, Linear GraphQL, crypto
- `integrations/adapters/inbound/http/` — FastAPI routers mounted by the main app

The package root is named `integrations` (rather than top-level `domain`/`application`) so that the editable install does not collide with `context-engine`. Source: [potpie/integrations/README.md](https://github.com/potpie-ai/potpie/blob/main/potpie/integrations/README.md).
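
To make the ports-and-adapters split concrete, here is a minimal sketch of how the layers relate. All names (`TokenStore`, `InMemoryTokenStore`, `ConnectionService`) are hypothetical and do not come from the package; the sketch only shows the dependency direction a hexagonal layout implies.

```python
from typing import Optional, Protocol


class TokenStore(Protocol):
    """Port: what the application layer needs from persistence."""

    def save(self, user_id: str, provider: str, token: str) -> None: ...

    def load(self, user_id: str, provider: str) -> Optional[str]: ...


class InMemoryTokenStore:
    """Outbound adapter: one concrete implementation of the port."""

    def __init__(self) -> None:
        self._tokens: dict[tuple[str, str], str] = {}

    def save(self, user_id: str, provider: str, token: str) -> None:
        self._tokens[(user_id, provider)] = token

    def load(self, user_id: str, provider: str) -> Optional[str]:
        return self._tokens.get((user_id, provider))


class ConnectionService:
    """Application service: depends only on the port, never on an adapter."""

    def __init__(self, store: TokenStore) -> None:
        self._store = store

    def connect(self, user_id: str, provider: str, token: str) -> None:
        self._store.save(user_id, provider, token)

    def is_connected(self, user_id: str, provider: str) -> bool:
        return self._store.load(user_id, provider) is not None


service = ConnectionService(InMemoryTokenStore())
service.connect("user-1", "linear", "token-abc")
print(service.is_connected("user-1", "linear"))
print(service.is_connected("user-1", "jira"))
```

Because the service only sees the port, a database-backed or encrypted adapter could replace the in-memory one without touching application code.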

### Atlassian unification effort

A long-standing community request is to treat Jira, Confluence, and Bitbucket as a single Atlassian onboarding flow rather than separate integrations. Issue #911 ("Bitbucket foundation for unified Atlassian login") tracks the first step toward this consolidation, and the `AtlassianOAuthBase` pattern is the intended common parent for Jira, Confluence, and future Bitbucket OAuth clients.

## Tool Layer and Registry

### Per-integration tools

Each integration ships its own tool module under `legacy/app/modules/intelligence/tools/`. The tool modules follow a uniform three-layer pattern:

1. **Pydantic input schema** for parameter validation
2. **Tool class** containing the `run()` business logic
3. **Factory function** that returns a LangChain `StructuredTool` bound to the current user/DB session

For example, the Confluence tools module exposes seven tools — `get_confluence_spaces`, `get_confluence_page`, `search_confluence_pages`, `get_confluence_space_pages`, `create_confluence_page`, `update_confluence_page`, `add_confluence_comment` — all wired through `ConfluenceClient`, which handles OAuth 2.0 (3LO) Bearer tokens, automatic refresh, and storage-format HTML conversion. Source: [legacy/app/modules/intelligence/tools/confluence_tools/README.md](https://github.com/potpie-ai/potpie/blob/main/legacy/app/modules/intelligence/tools/confluence_tools/README.md).
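
The three-layer pattern can be sketched with standard-library stand-ins: a dataclass takes the place of the Pydantic model, and a plain bound method takes the place of the LangChain `StructuredTool`. `GetPageInput`, `GetPageTool`, `make_get_page_tool`, and `fake_fetch` are hypothetical names, not the module's real classes.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass(frozen=True)
class GetPageInput:
    """Layer 1: input schema (dataclass standing in for a Pydantic model)."""
    page_id: str

    def __post_init__(self) -> None:
        if not self.page_id.strip():
            raise ValueError("page_id must not be empty")


class GetPageTool:
    """Layer 2: tool class holding the run() business logic."""

    def __init__(self, user_id: str, fetch_page: Callable[[str, str], dict]) -> None:
        self._user_id = user_id
        self._fetch_page = fetch_page  # hypothetical client call

    def run(self, page_id: str) -> dict:
        args = GetPageInput(page_id=page_id)  # validate before doing any work
        return self._fetch_page(self._user_id, args.page_id)


def make_get_page_tool(user_id: str,
                       fetch_page: Callable[[str, str], dict]) -> Callable[[str], dict]:
    """Layer 3: factory that binds the tool to the current user."""
    return GetPageTool(user_id, fetch_page).run


def fake_fetch(user_id: str, page_id: str) -> dict:
    """Fake client used only for this example."""
    return {"requested_by": user_id, "page_id": page_id, "title": "Example page"}


tool = make_get_page_tool("user-1", fake_fetch)
print(tool("12345"))
```

Keeping validation in the schema layer means the business logic never sees malformed arguments, and the factory is the only place that knows about the current user.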

The Linear integration supports two key-resolution modes: a global `LINEAR_API_KEY` environment variable and per-user keys stored via `SecretManager.create_integration_keys`. Two tools (`get_linear_issue`, `update_linear_issue`) are provided in the v0.1.5 release. Source: [legacy/app/modules/intelligence/tools/linear_tools/README.md](https://github.com/potpie-ai/potpie/blob/main/legacy/app/modules/intelligence/tools/linear_tools/README.md).
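
A minimal sketch of the two key-resolution modes follows. The precedence shown here (per-user key first, then the global `LINEAR_API_KEY`) is an assumption; the source README does not state the order. The `user_keys` mapping is a hypothetical stand-in for whatever `SecretManager` stores.

```python
import os
from typing import Mapping, Optional


def resolve_linear_api_key(user_id: str,
                           user_keys: Mapping[str, str],
                           env: Mapping[str, str] = os.environ) -> Optional[str]:
    """Return the key to use for this user, or None if neither mode is configured."""
    per_user = user_keys.get(user_id)
    if per_user:
        return per_user  # assumption: a per-user key overrides the global one
    return env.get("LINEAR_API_KEY") or None


stored = {"alice": "lin_user_key"}
fake_env = {"LINEAR_API_KEY": "lin_global_key"}
print(resolve_linear_api_key("alice", stored, fake_env))
print(resolve_linear_api_key("bob", stored, fake_env))
print(resolve_linear_api_key("bob", stored, {}))
```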

### Tool registry metadata

The tool registry is the single source of truth that agents consult at runtime. `ToolMetadata` carries `id`, `name`, `description`, `tier` (`low|medium|high`), `category` (e.g. `search`, `code_changes`, `integration_jira`), and several conditional flags:

- `defer_loading` — exclude rarely used tools from the initial payload
- `local_mode_only` / `non_local_only` — gate terminal-class tools to the VS Code extension
- `read_only`, `destructive`, `idempotent`, `requires_confirmation` — safety hints used by the agent guardrails (especially relevant to issue #902 on destructive semantic mutations)

`AllowListDefinition` aggregates named sets of `tool_names` and/or `categories`, with optional `add_when_non_local` / `exclude_in_local` modifiers. The registry also supports an on-demand "search_tools → describe_tool → execute_tool" flow gated by `ChatContext.use_tool_search_flow`, intended for A/B testing against the all-upfront resolver. Source: [legacy/app/modules/intelligence/tools/registry/README.md](https://github.com/potpie-ai/potpie/blob/main/legacy/app/modules/intelligence/tools/registry/README.md).
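
The sketch below shows one plausible way such metadata and allow-lists could combine. The field names come from the description above, but the resolution order in `resolve_allow_list`, the `run_terminal` tool, and the `integration_linear` and `terminal` category names are assumptions for illustration rather than the registry's actual behavior.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ToolMetadata:
    id: str
    name: str
    description: str
    tier: str  # "low" | "medium" | "high"
    category: str
    defer_loading: bool = False
    local_mode_only: bool = False
    non_local_only: bool = False
    read_only: bool = False
    destructive: bool = False
    idempotent: bool = False
    requires_confirmation: bool = False


@dataclass(frozen=True)
class AllowListDefinition:
    tool_names: frozenset[str] = frozenset()
    categories: frozenset[str] = frozenset()
    add_when_non_local: frozenset[str] = frozenset()
    exclude_in_local: frozenset[str] = frozenset()


def resolve_allow_list(registry: list[ToolMetadata], allow: AllowListDefinition,
                       local_mode: bool) -> list[str]:
    """Expand an allow-list into tool names valid for the current mode."""
    names = set(allow.tool_names)
    names |= {t.name for t in registry if t.category in allow.categories}
    if local_mode:
        names -= allow.exclude_in_local
    else:
        names |= allow.add_when_non_local
    by_name = {t.name: t for t in registry}
    resolved = []
    for name in sorted(names):
        tool = by_name.get(name)
        if tool is None:
            continue  # allow-list names an unregistered tool
        if tool.local_mode_only and not local_mode:
            continue
        if tool.non_local_only and local_mode:
            continue
        resolved.append(name)
    return resolved


registry = [
    ToolMetadata(id="t1", name="get_linear_issue", description="Read an issue",
                 tier="low", category="integration_linear", read_only=True),
    ToolMetadata(id="t2", name="update_linear_issue", description="Edit an issue",
                 tier="medium", category="integration_linear"),
    ToolMetadata(id="t3", name="create_pull_request", description="Open a PR",
                 tier="high", category="code_changes", requires_confirmation=True),
    ToolMetadata(id="t4", name="run_terminal", description="Run a command",
                 tier="high", category="terminal", local_mode_only=True),
]
allow = AllowListDefinition(
    tool_names=frozenset({"run_terminal"}),
    categories=frozenset({"integration_linear"}),
    add_when_non_local=frozenset({"create_pull_request"}),
)
print(resolve_allow_list(registry, allow, local_mode=True))
print(resolve_allow_list(registry, allow, local_mode=False))
```

In this sketch the per-tool mode flags act as a final filter, so an allow-list cannot expose a `local_mode_only` tool outside local mode even when it names the tool explicitly.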

## Sandbox Runtime

Release v1.1.0 introduced sandbox-native agent execution so that agents no longer operate directly on the host filesystem. The sandbox API abstracts over Git platform providers and bare-repo caching, a design intended to improve speed and token efficiency.

Notable operations and behaviors of the sandbox API client include:

- `create_pull_request(...)` — accepts `RepoIdentity(repo_name, repo_url)`, `head_branch`, `base_branch`, `reviewers`, `labels`, and `auth_token`, then delegates to the configured `GitPlatformProvider`.
- `comment_on_pull_request(...)` — supports two shapes: top-level conversation comments (`body` + `pr_number`) and inline review comments (`path` + `line`, optional `commit_id`). Inline review comments do not require a writable workspace handle, which is why the `review-pr` analysis flow can post comments without owning a worktree.
- Mutators enforce argument invariants (e.g. requiring both `path` and `line` together) by raising `SandboxOpError`.

Source: [potpie/sandbox/sandbox/api/client.py](https://github.com/potpie-ai/potpie/blob/main/potpie/sandbox/sandbox/api/client.py).

The sandbox runtime is therefore the bridge between the agent's tool calls and the real world: code edits land inside the sandbox, and PR/auth-aware calls flow through the `GitPlatformProvider` chain rather than touching the host directly.
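
The argument invariant described for `comment_on_pull_request` can be sketched as follows. `SandboxOpError` is redefined locally so the example is self-contained, `build_pr_comment` is a hypothetical helper, and the rule that `commit_id` only applies to inline comments is an assumption rather than something the source states.

```python
from typing import Optional


class SandboxOpError(Exception):
    """Local stand-in for the sandbox error type named above."""


def build_pr_comment(pr_number: int, body: str,
                     path: Optional[str] = None,
                     line: Optional[int] = None,
                     commit_id: Optional[str] = None) -> dict:
    """Validate arguments and describe which comment shape would be posted."""
    # Invariant from the text: path and line must be supplied together.
    if (path is None) != (line is None):
        raise SandboxOpError("path and line must be provided together")
    # Assumption: commit_id is meaningful only for inline review comments.
    if commit_id is not None and path is None:
        raise SandboxOpError("commit_id requires path and line")
    if path is None:
        return {"kind": "conversation", "pr_number": pr_number, "body": body}
    comment = {"kind": "inline", "pr_number": pr_number, "body": body,
               "path": path, "line": line}
    if commit_id is not None:
        comment["commit_id"] = commit_id
    return comment


print(build_pr_comment(7, "Looks good overall."))
print(build_pr_comment(7, "Possible off-by-one here.", path="src/app.py", line=42))
```

Rejecting a half-specified inline comment up front avoids a confusing provider-side error after the request has already been made.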

## Data Flow and Component View

```mermaid
flowchart LR
    subgraph Ext[External Systems]
        GH[GitHub]
        LI[Linear]
        JR[Jira]
        CF[Confluence]
        BB[Bitbucket<br/>planned]
    end

    subgraph Int[Integrations Package]
        DOM[integrations/domain<br/>registry & schemas]
        APP[integrations/application<br/>services & bootstrap]
        OAUTH[integrations/adapters/outbound/oauth<br/>AtlassianOAuthBase & clients]
        HTTP[integrations/adapters/inbound/http<br/>FastAPI routers]
    end

    subgraph Tools[Tool Layer]
        TR[Tool Registry<br/>allow-lists & tiers]
        CL[ConfluenceClient]
        LL[Linear API]
        SB[Sandbox API Client]
    end

    subgraph Agents[Agents]
        AG[LangChain agents]
        PR[Parsing → Neo4j<br/>tag_extract.rs]
    end

    Ext --> OAUTH
    OAUTH --> DOM
    DOM --> APP
    APP --> HTTP
    HTTP --> TR
    TR --> AG
    CL --> CF
    LL --> LI
    SB --> GH
    SB --> BB
    AG --> SB
    AG --> PR
```

The repository parsing path that feeds the knowledge graph runs through `potpie/parsing/src/tag_extract.rs`, which uses tree-sitter queries and rayon-parallel iteration over `CodeFile` records to produce `GraphPayload` (nodes + relationships) for downstream ingestion. Source: [potpie/parsing/src/tag_extract.rs](https://github.com/potpie-ai/potpie/blob/main/potpie/parsing/src/tag_extract.rs).
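
The shape of that pipeline can be illustrated with a Python analogue. This is not the Rust implementation: a regular expression stands in for tree-sitter queries, `ThreadPoolExecutor` stands in for rayon, and the fields of `CodeFile` and `GraphPayload` as well as the `CONTAINS` relationship type are assumptions.

```python
import re
from concurrent.futures import ThreadPoolExecutor
from dataclasses import dataclass, field


@dataclass(frozen=True)
class CodeFile:
    path: str
    source: str


@dataclass
class GraphPayload:
    nodes: list[dict] = field(default_factory=list)
    relationships: list[dict] = field(default_factory=list)


# Regex stand-in for a tree-sitter query that captures function definitions.
DEF_PATTERN = re.compile(r"^\s*def\s+(\w+)\s*\(", re.MULTILINE)


def extract_tags(code_file: CodeFile) -> tuple[list[dict], list[dict]]:
    """Produce one file node plus a node and an edge per function found."""
    nodes = [{"id": code_file.path, "kind": "file"}]
    relationships = []
    for name in DEF_PATTERN.findall(code_file.source):
        node_id = f"{code_file.path}::{name}"
        nodes.append({"id": node_id, "kind": "function"})
        relationships.append({"from": code_file.path, "to": node_id,
                              "type": "CONTAINS"})
    return nodes, relationships


def build_payload(files: list[CodeFile]) -> GraphPayload:
    """Extract tags from files in parallel and merge them into one payload."""
    payload = GraphPayload()
    with ThreadPoolExecutor() as pool:
        # map() yields results in input order, so the merge is deterministic.
        for nodes, relationships in pool.map(extract_tags, files):
            payload.nodes.extend(nodes)
            payload.relationships.extend(relationships)
    return payload


files = [
    CodeFile("a.py", "def foo():\n    pass\n\ndef bar():\n    pass\n"),
    CodeFile("b.py", "x = 1\n"),
]
payload = build_payload(files)
print(len(payload.nodes), len(payload.relationships))
```

Each file is processed independently and merged afterwards, which is the property that makes the per-file work straightforward to parallelize.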

## Common Failure Modes and Community Issues

Several community threads surface recurring operational concerns worth documenting alongside the architecture:

- **#235 — Fails to parse local repository.** Parsing relies on Celery workers cloning the repo into a managed location; the dual `repo_name`/`repo_path` handling flagged in #517 compounds this. After the v1.0.2 release moved parsing into a dedicated `parsing/` package with parallelized ColBERT + BM25 indexing, local-repo parsing became more reliable but still requires the worker to be reachable.
- **#517 — Consolidate `repo_name` and `repo_path`.** The legacy code treats these as separate variables throughout the parse/ingest flow; consolidating them into a single `RepoIdentity` (already used by the sandbox API at [potpie/sandbox/sandbox/api/client.py](https://github.com/potpie-ai/potpie/blob/main/potpie/sandbox/sandbox/api/client.py)) is the proposed direction.
- **#902 — Destructive semantic mutations.** The agent-facing write tier in the context graph (`domain/semantic_mutations.py`) carries both an approval field and an auto-commit flag. Destructive mutation kinds must never auto-commit past review; the registry's `destructive` / `requires_confirmation` metadata is the enforcement surface.
- **#909 — Embedder vector dimensionality.** The bundled `HashingEmbedder` emits 256 dims while every vector-index creator defaults to 1536. Any new integration that ships its own embedder must reconcile dimensions with the index it writes to, or search/retrieval will silently degrade.
- **#911 — Bitbucket / unified Atlassian login.** Confirms that `AtlassianOAuthBase` is the intended parent for any new Atlassian-flavored integration so that Jira, Confluence, and Bitbucket share one onboarding path.
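
The dimensionality concern in #909 lends itself to a fail-fast check at write time. The sketch below is illustrative only: `hashing_embed` is a generic feature-hashing stand-in and not the algorithm used by the bundled `HashingEmbedder`, and `check_dimensions` and `DimensionMismatchError` are hypothetical names. The 256 and 1536 figures come from the issue summary above.

```python
import hashlib


class DimensionMismatchError(ValueError):
    """Raised when a vector does not match the index it is written to."""


def hashing_embed(text: str, dims: int = 256) -> list[float]:
    """Generic feature hashing: count each token in a hash-selected bucket."""
    vector = [0.0] * dims
    for token in text.lower().split():
        digest = hashlib.sha256(token.encode("utf-8")).digest()
        bucket = int.from_bytes(digest[:4], "big") % dims
        vector[bucket] += 1.0
    return vector


def check_dimensions(vector: list[float], index_dims: int) -> None:
    """Fail loudly instead of letting retrieval degrade silently."""
    if len(vector) != index_dims:
        raise DimensionMismatchError(
            f"embedder produced {len(vector)} dims, index expects {index_dims}"
        )


vector = hashing_embed("hello world")
check_dimensions(vector, 256)  # passes: embedder and index agree
try:
    check_dimensions(vector, 1536)
except DimensionMismatchError as exc:
    print(exc)
```

Running a check like this when an index is created or first written to would turn a silent quality regression into an immediate configuration error.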

## See Also

- [Architecture Overview](Architecture-Overview) — FastAPI / Celery / Neo4j layering
- [Agent Tool Registry](Agent-Tool-Registry) — allow-list resolution and defer-loading
- [Sandbox Runtime](Sandbox-Runtime) — multi-backend sandbox backends and bare-repo caching
- [Atlassian OAuth Setup](Atlassian-OAuth-Setup) — `CONFLUENCE_CLIENT_ID` / `JIRA_CLIENT_ID` / Bitbucket
- [Context Graph Mutations](Context-Graph-Mutations) — destructive-kind approval flow

---


## Pitfall Log

Project: potpie-ai/potpie

Summary: Found 9 structured pitfall items, including 1 high/blocking item. Top priority: Installation risk - Installation risk requires verification.

## 1. Installation risk - Installation risk requires verification

- Severity: high
- Evidence strength: source_linked
- Finding: Project evidence flags an installation risk. Review the linked source before relying on this workflow.
- User impact: May increase setup, validation, or first-run risk for the user.
- Evidence: community_evidence:github | https://github.com/potpie-ai/potpie/issues/946

## 2. Installation risk - Installation risk requires verification

- Severity: medium
- Evidence strength: source_linked
- Finding: Project evidence flags an installation risk. Review the linked source before relying on this workflow.
- User impact: May increase setup, validation, or first-run risk for the user.
- Evidence: community_evidence:github | https://github.com/potpie-ai/potpie/issues/924

## 3. Configuration risk - Configuration risk requires verification

- Severity: medium
- Evidence strength: source_linked
- Finding: Project evidence flags a configuration risk. Review the linked source before relying on this workflow.
- User impact: May increase setup, validation, or first-run risk for the user.
- Evidence: capability.host_targets | https://github.com/potpie-ai/potpie

## 4. Capability evidence risk - Capability evidence risk requires verification

- Severity: medium
- Evidence strength: source_linked
- Finding: README/documentation is current enough for a first validation pass.
- User impact: May increase setup, validation, or first-run risk for the user.
- Evidence: capability.assumptions | https://github.com/potpie-ai/potpie

## 5. Maintenance risk - Maintenance risk requires verification

- Severity: medium
- Evidence strength: source_linked
- Finding: Project evidence flags a maintenance risk. Review the linked source before relying on this workflow.
- User impact: May increase setup, validation, or first-run risk for the user.
- Evidence: evidence.maintainer_signals | https://github.com/potpie-ai/potpie

## 6. Security or permission risk - Security or permission risk requires verification

- Severity: medium
- Evidence strength: source_linked
- Finding: no_demo
- User impact: May increase setup, validation, or first-run risk for the user.
- Evidence: downstream_validation.risk_items | https://github.com/potpie-ai/potpie

## 7. Security or permission risk - Security or permission risk requires verification

- Severity: medium
- Evidence strength: source_linked
- Finding: no_demo
- User impact: May increase setup, validation, or first-run risk for the user.
- Evidence: risks.scoring_risks | https://github.com/potpie-ai/potpie

## 8. Maintenance risk - Maintenance risk requires verification

- Severity: low
- Evidence strength: source_linked
- Finding: issue_or_pr_quality=unknown.
- User impact: May increase setup, validation, or first-run risk for the user.
- Evidence: evidence.maintainer_signals | https://github.com/potpie-ai/potpie

## 9. Maintenance risk - Maintenance risk requires verification

- Severity: low
- Evidence strength: source_linked
- Finding: release_recency=unknown.
- User impact: May increase setup, validation, or first-run risk for the user.
- Evidence: evidence.maintainer_signals | https://github.com/potpie-ai/potpie

<!-- canonical_name: potpie-ai/potpie; human_manual_source: deepwiki_human_wiki -->
