# https://github.com/gptme/gptme Project Manual

Generated at: 2026-06-24 22:44:42 UTC

## Table of Contents

- [Project Overview and Architecture](#page-1)
- [Tools, LLM Providers, and RAG](#page-2)
- [Server, Web UI, and Desktop](#page-3)
- [Extensibility, Plugins, and Autonomous Agents](#page-4)

<a id='page-1'></a>

## Project Overview and Architecture

### Related Pages

Related topics: [Tools, LLM Providers, and RAG](#page-2), [Server, Web UI, and Desktop](#page-3), [Extensibility, Plugins, and Autonomous Agents](#page-4)

<details>
<summary>Related Source Files</summary>

The following source files were used to generate this page:

- [tauri/README.md](https://github.com/gptme/gptme/blob/main/tauri/README.md)
- [tauri/package.json](https://github.com/gptme/gptme/blob/main/tauri/package.json)
- [webui/extension/README.md](https://github.com/gptme/gptme/blob/main/webui/extension/README.md)
- [webui/extension/package.json](https://github.com/gptme/gptme/blob/main/webui/extension/package.json)
- [gptme/server/cli.py](https://github.com/gptme/gptme/blob/main/gptme/server/cli.py)
- [gptme/server/api_v2.py](https://github.com/gptme/gptme/blob/main/gptme/server/api_v2.py)
- [gptme/server/api_v2_common.py](https://github.com/gptme/gptme/blob/main/gptme/server/api_v2_common.py)
- [gptme/server/tools_api.py](https://github.com/gptme/gptme/blob/main/gptme/server/tools_api.py)
- [gptme/server/openapi_docs.py](https://github.com/gptme/gptme/blob/main/gptme/server/openapi_docs.py)
- [gptme/server/artifacts_api.py](https://github.com/gptme/gptme/blob/main/gptme/server/artifacts_api.py)
- [gptme/eval/dspy/README.md](https://github.com/gptme/gptme/blob/main/gptme/eval/dspy/README.md)
</details>

# Project Overview and Architecture

## Purpose and Scope

gptme is an open-source personal AI agent delivered as a monorepo that bundles a Python agent core, a Flask-based HTTP server, a React/TypeScript web UI, a Tauri desktop wrapper, and a Chrome MV3 browser extension. The architecture is intentionally multi-surface: the same agent and tool registry can be driven from a CLI, a browser tab, a native desktop window, or an HTTP API, while sharing the same on-disk conversation store.

The latest tracked release, `v0.31.1.dev20260622`, ships 88 new features in a single cycle, reflecting an active pace of development across the UI, server, tooling, and evaluation layers. Recent community interest has focused on three areas: extending the `patch` tool with hash-anchored editing to survive line drift (issue [#2667](https://github.com/gptme/gptme/issues/2667)), enabling RAG over code and past conversations (issue [#59](https://github.com/gptme/gptme/issues/59)), and adding Windows support (issue [#73](https://github.com/gptme/gptme/issues/73)).

## System Architecture

The gptme monorepo is composed of several cooperating subprojects, each with its own build and packaging metadata.

| Component | Path | Stack | Role |
| --- | --- | --- | --- |
| Agent core | `gptme/` | Python | LLM orchestration, tool execution, conversation log |
| HTTP server | `gptme/server/` | Flask | REST API v2, OpenAPI docs, auth, telemetry |
| Web UI | `webui/` | React + Vite | In-browser chat client |
| Desktop app | `tauri/` | Tauri + Rust + Node | Bundled webui + sidecar `gptme-server` |
| Browser extension | `webui/extension/` | Chrome MV3 + esbuild | Side-panel chat, content-script capture |
| Evaluation | `gptme/eval/dspy/` | DSPy | Prompt optimization and behavioral evals |

The desktop package is declared in [`tauri/package.json`](https://github.com/gptme/gptme/blob/main/tauri/package.json) (Tauri CLI `^2`) and built via the root `Makefile` targets `make tauri-dev` and `make tauri-build`. The server binary itself is produced by `tauri/scripts/build-sidecar.sh`, as documented in [`tauri/README.md`](https://github.com/gptme/gptme/blob/main/tauri/README.md), and bundled into `tauri/bins/`.

```mermaid
flowchart LR
    User[User] --> CLI[gptme CLI]
    User --> WebUI[webui / React]
    User --> Ext[Chrome Extension]
    User --> Desktop[Tauri Desktop App]
    WebUI -->|HTTP/JSON| Server[gptme-server Flask]
    Ext -->|HTTP/JSON| Server
    Desktop -->|bundled sidecar| Server
    CLI --> Agent[gptme agent core]
    Server --> Agent
    Agent --> Tools[Tool Registry]
    Agent --> Eval[gptme/eval/dspy]
    Server --> Log[(Conversation logs)]
```

## The Server Layer

The server entry point is the `gptme-server` Click command implemented in [`gptme/server/cli.py`](https://github.com/gptme/gptme/blob/main/gptme/server/cli.py). It initializes the agent via `init(...)`, optionally in degraded mode when no API keys are configured, wires up telemetry with `init_telemetry(...)`, prints an authentication token via `init_auth(...)`, installs a SIGTERM handler for clean container shutdown, and finally runs the Flask app.

The V2 API is split into focused modules under `gptme/server/`:

- [`api_v2.py`](https://github.com/gptme/gptme/blob/main/gptme/server/api_v2.py) — conversation CRUD, model/provider metadata, and config endpoints.
- [`api_v2_common.py`](https://github.com/gptme/gptme/blob/main/gptme/server/api_v2_common.py) — shared validators for `conversation_id` and `branch` names, with explicit path-traversal protection and a `NAME_MAX`-aware byte budget so on-disk paths like `branches/{name}.jsonl` never exceed filesystem limits.
- [`tools_api.py`](https://github.com/gptme/gptme/blob/main/gptme/server/tools_api.py) — exposes `GET /api/v2/tools` so the webui can render a `FunctionBrowser` panel describing each tool's name, description, block types, MCP provenance, and parameter schema.
- [`artifacts_api.py`](https://github.com/gptme/gptme/blob/main/gptme/server/artifacts_api.py) — classifies artifacts by extension and MIME type, derives stable `art_<sha1[:12]>` ids from logdir-relative paths, and reconstructs file-writing operations from both Markdown code blocks (`save`, `append`, `patch`, `morph`, `patch_many`) and XML `<tool_use>` payloads to produce unified diffs.
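The artifact handling described in the last bullet can be illustrated with a minimal sketch. It assumes the id is the first 12 hex characters of a SHA-1 over the UTF-8 encoded logdir-relative path; the function names, the extension table, and the kind labels are hypothetical and are not taken from `artifacts_api.py`.

```python
import hashlib
import mimetypes
from pathlib import PurePosixPath

# Hypothetical extension-to-kind table; the real mapping may differ.
_KIND_BY_EXT = {".py": "code", ".md": "document", ".png": "image", ".svg": "image"}


def artifact_id(relative_path: str) -> str:
    """Derive a stable id of the form art_<sha1[:12]> from a logdir-relative path."""
    digest = hashlib.sha1(relative_path.encode("utf-8")).hexdigest()
    return f"art_{digest[:12]}"


def artifact_kind(relative_path: str) -> str:
    """Classify by extension first, then fall back to the guessed MIME type."""
    ext = PurePosixPath(relative_path).suffix.lower()
    if ext in _KIND_BY_EXT:
        return _KIND_BY_EXT[ext]
    mime, _ = mimetypes.guess_type(relative_path)
    if mime and mime.startswith("image/"):
        return "image"
    if mime and mime.startswith("text/"):
        return "text"
    return "other"
```

Because the id depends only on the relative path, the same file keeps the same id across requests, which is what lets a client refer back to an artifact later.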

OpenAPI documentation is generated automatically. The `api_doc` decorator in [`gptme/server/openapi_docs.py`](https://github.com/gptme/gptme/blob/main/gptme/server/openapi_docs.py) infers summary/description from docstrings, response/request types from annotations, and tags from the module path, then walks Flask routes at spec-generation time to emit a complete `openapi: 3.0.3` document served under `/openapi.json`.
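The inference step can be sketched without Flask. The example below is an illustration only: `api_doc_sketch`, `_DOCS`, and `build_spec` are hypothetical names, every route is assumed to be a `GET`, and the real `api_doc` decorator also handles request and response types, which this sketch omits.

```python
import inspect
from typing import Callable

_DOCS: dict[str, dict] = {}  # hypothetical registry keyed by function name


def api_doc_sketch(func: Callable) -> Callable:
    """Record OpenAPI-style metadata inferred from the function itself."""
    doc = inspect.getdoc(func) or ""
    # First docstring line becomes the summary, the rest the description.
    summary, _, description = doc.partition("\n")
    _DOCS[func.__name__] = {
        "summary": summary.strip(),
        "description": description.strip(),
        # Tag from the last component of the defining module path.
        "tags": [func.__module__.rsplit(".", 1)[-1]],
    }
    return func


def build_spec(routes: dict[str, Callable]) -> dict:
    """Compose a minimal OpenAPI 3.0.3 document from path -> handler pairs."""
    paths = {path: {"get": _DOCS.get(fn.__name__, {})} for path, fn in routes.items()}
    return {"openapi": "3.0.3", "info": {"title": "sketch", "version": "0"}, "paths": paths}


@api_doc_sketch
def list_tools():
    """List tools.

    Return every tool the agent can currently use.
    """
    return []


spec = build_spec({"/api/v2/tools": list_tools})
```

Keeping the metadata next to the handler means the spec cannot silently drift from the docstrings, which is the main appeal of decorator-driven documentation.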

## Client Surfaces

The web UI and browser extension share a single Vite build configured to emit two entries: a main app and a panel entry that wraps `ExtensionChat`. As described in [`webui/extension/README.md`](https://github.com/gptme/gptme/blob/main/webui/extension/README.md), the service worker (`background.ts`) and content script (`content/content.ts`) are compiled separately with esbuild because they cannot ship React, while the side panel reuses the shared component library. The extension's [`package.json`](https://github.com/gptme/gptme/blob/main/webui/extension/package.json) pins `typescript ^5.6.0` and `esbuild ^0.24.0` for that lightweight pipeline.

The desktop app, documented in [`tauri/README.md`](https://github.com/gptme/gptme/blob/main/tauri/README.md), packages the built webui together with a PyInstaller sidecar of `gptme-server`. The Rust backend (`src-tauri/`) owns the app lifecycle and IPC; the actual agent runs in the bundled Python process so end users never need to install Python or manage dependencies themselves.

## Evaluation and Prompt Optimization

The `gptme/eval/dspy/` module, described in [`gptme/eval/dspy/README.md`](https://github.com/gptme/gptme/blob/main/gptme/eval/dspy/README.md), applies the DSPy framework to automatically improve gptme's system prompts. It evaluates across task success rate, tool-usage effectiveness, and LLM-judged response quality, then uses MIPROv2 (Bayesian instruction search plus few-shot bootstrapping) or BootstrapFewShot to propose improvements. Known findings documented there include a tool-usage metric that always reports `0.000` despite successful task completion, and the observation that MIPROv2 can lift scores without changing the final prompt text, suggesting the optimization sometimes improves the process rather than the artifact.

## See Also

- Tools Reference — every block type the agent can invoke
- Server API Reference — full OpenAPI schema and authentication model
- Tauri Desktop Build Guide — sidecar bundling details
- Browser Extension Guide — side-panel development workflow
- DSPy Prompt Optimization — evaluation harness and known issues

---

<a id='page-2'></a>

## Tools, LLM Providers, and RAG

### Related Pages

Related topics: [Project Overview and Architecture](#page-1), [Extensibility, Plugins, and Autonomous Agents](#page-4)

<details>
<summary>Related Source Files</summary>

The following source files were used to generate this page:

- [gptme/server/tools_api.py](https://github.com/gptme/gptme/blob/main/gptme/server/tools_api.py)
- [gptme/server/api_v2.py](https://github.com/gptme/gptme/blob/main/gptme/server/api_v2.py)
- [gptme/server/cli.py](https://github.com/gptme/gptme/blob/main/gptme/server/cli.py)
- [gptme/server/artifacts_api.py](https://github.com/gptme/gptme/blob/main/gptme/server/artifacts_api.py)
- [gptme/server/openapi_docs.py](https://github.com/gptme/gptme/blob/main/gptme/server/openapi_docs.py)
- [gptme/server/api_v2_common.py](https://github.com/gptme/gptme/blob/main/gptme/server/api_v2_common.py)
- [webui/README.md](https://github.com/gptme/gptme/blob/main/webui/README.md)
- [gptme/eval/dspy/README.md](https://github.com/gptme/gptme/blob/main/gptme/eval/dspy/README.md)
</details>

# Tools, LLM Providers, and RAG

gptme is an agent runtime built around three cooperating subsystems: a **tool registry** that defines what the agent can do in the local environment, an **LLM provider layer** that abstracts model selection across vendors, and **retrieval-augmented generation (RAG)** support that is still taking shape through community contributions. This page describes each subsystem as it is exposed in the server, CLI, and web UI, and links the most-discussed community issues behind its evolution.

## Tool System

The tool registry is the agent's capability palette. Each tool declares a name, a one-line description, full instructions shown to the model, the code-block types it handles, and a typed parameter schema. The server exposes that catalog as a JSON API so the web UI can render a *FunctionBrowser* panel that mirrors the agent's live capabilities.

### Tool API surface

The V2 server mounts a Flask blueprint that serves the tool list with authentication enforced by `require_auth`. Source: [gptme/server/tools_api.py:18-38](https://github.com/gptme/gptme/blob/main/gptme/server/tools_api.py).

| Field | Type | Meaning |
|-------|------|---------|
| `name` | `str` | Tool identifier; also the block-type prefix |
| `desc` | `str` | One-line summary shown in pickers |
| `instructions` | `str` | Full system-prompt text for the tool |
| `block_types` | `list[str]` | Markdown code-block tags (e.g. `shell`, `bash`) |
| `is_mcp` | `bool` | Provided by an MCP server |
| `is_available` | `bool` | Currently usable in this session |
| `disabled_by_default` | `bool` | Excluded from default sessions |

Parameter introspection is modeled by `ToolParameterOut` (name, type, description, required). Source: [gptme/server/tools_api.py:21-26](https://github.com/gptme/gptme/blob/main/gptme/server/tools_api.py).
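To make the payload shape concrete, here is a sketch that models the fields from the table with standard-library dataclasses. The field names come from the table above; the use of dataclasses, the `parameters` attribute name, and the example values are assumptions, and the server's actual models may be defined differently.

```python
from dataclasses import asdict, dataclass, field


@dataclass
class ToolParameterOut:
    name: str
    type: str
    description: str
    required: bool


@dataclass
class ToolOut:
    name: str
    desc: str
    instructions: str
    block_types: list[str]
    is_mcp: bool = False
    is_available: bool = True
    disabled_by_default: bool = False
    # Attribute name is an assumption; the table above does not list it.
    parameters: list[ToolParameterOut] = field(default_factory=list)


# Illustrative values only, not copied from the real shell tool.
shell = ToolOut(
    name="shell",
    desc="Run a shell command",
    instructions="Use a shell code block to execute commands.",
    block_types=["shell", "bash"],
    parameters=[ToolParameterOut("command", "string", "Command to run", True)],
)
payload = asdict(shell)  # plain dict, ready for JSON serialization
```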

### File-writing tools and the `patch_anchored` proposal

The artifact pipeline distinguishes a closed set of file-writing tools: `save`, `append`, `patch`, `morph`, and `patch_many`. Source: [gptme/server/artifacts_api.py:5-11](https://github.com/gptme/gptme/blob/main/gptme/server/artifacts_api.py). Of these, only `save` creates a new file and only `patch_many` may target multiple paths in a single tool invocation. The artifact API recovers these operations from both fenced markdown blocks and XML tool-use tokens so it can render unified diffs in the chat timeline. Source: [gptme/server/artifacts_api.py:21-39](https://github.com/gptme/gptme/blob/main/gptme/server/artifacts_api.py).

A community proposal ([#2667](https://github.com/gptme/gptme/issues/2667)) asks for a sibling `patch_anchored` tool whose edit blocks are addressed by content hashes rather than by position. The motivation cited in the issue is that the current `patch` tool re-matches every `ORIGINAL` block from the top of the file, so a sequence of edits to the same file can drift as earlier hunks shift the lines that follow.
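The hash-anchored idea can be sketched in a few lines. This is not the design from the issue: the anchor format (first 8 hex characters of a SHA-1 over the stripped line), the function names, and the one-line-per-anchor restriction are all assumptions chosen to keep the illustration small.

```python
import hashlib


def line_hash(line: str) -> str:
    """Short content hash used as an anchor for a single line."""
    return hashlib.sha1(line.strip().encode("utf-8")).hexdigest()[:8]


def apply_anchored(text: str, edits: list[tuple[str, str]]) -> str:
    """Apply (anchor_hash, replacement) edits atomically.

    Every anchor must match exactly one line of the original text before
    anything is changed; a replacement may span several lines.
    """
    lines = text.splitlines()
    targets: dict[int, str] = {}
    for anchor, replacement in edits:
        matches = [i for i, line in enumerate(lines) if line_hash(line) == anchor]
        if len(matches) != 1:
            raise ValueError(f"anchor {anchor} matched {len(matches)} lines")
        if matches[0] in targets:
            raise ValueError(f"anchor {anchor} targets an already edited line")
        targets[matches[0]] = replacement
    # Apply from the bottom up so earlier indices stay valid.
    for index in sorted(targets, reverse=True):
        lines[index:index + 1] = targets[index].splitlines()
    return "\n".join(lines)
```

Resolving every anchor against the unmodified text and only then applying the edits is what removes the drift: an edit that inserts lines near the top cannot change which line a later edit refers to.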

### Tool selection from the CLI

The `gptme-server` CLI accepts a comma-separated `--tools` allowlist and forwards it to `init_tools` as `tool_allowlist`, together with `tool_format="markdown"`. Source: [gptme/server/cli.py:62-71](https://github.com/gptme/gptme/blob/main/gptme/server/cli.py). When running headless, this flag is how users disable risky tools or scope the agent to a narrow workflow.
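A comma-separated allowlist of this kind is typically normalized before use. The helper below is a hypothetical sketch of that step; whether the real CLI trims whitespace or drops empty entries is an assumption, not something the source confirms.

```python
from __future__ import annotations


def parse_tool_allowlist(raw: str | None) -> list[str] | None:
    """Split a comma-separated tool list; None means no restriction was given."""
    if raw is None:
        return None
    names = [name.strip() for name in raw.split(",")]
    # Drop empty entries produced by stray commas or whitespace.
    return [name for name in names if name]
```

For example, `parse_tool_allowlist("shell, patch,,save")` yields `["shell", "patch", "save"]`, while `None` is passed through so the caller can fall back to the default tool set.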

## LLM Providers

Provider configuration lives in `gptme/llm/` and is surfaced through the V2 API. The LLM layer exports a `PROVIDERS` registry, helpers to enumerate available providers, and functions to get or set the default model and to recommend a model when the user has not chosen one. Source: [gptme/server/api_v2.py:41-54](https://github.com/gptme/gptme/blob/main/gptme/server/api_v2.py).

Key abstractions:

- `list_available_providers()` — filtered to those with valid credentials in the user's config.
- `get_default_model()` / `set_default_model()` — read and persist the chosen model id.
- `get_recommended_model()` — heuristic selection when the user has not configured one.
- `_apply_model_filters()` / `_get_models_for_provider()` — internal filtering applied to remote catalogs.
- `OPENROUTER_APP_HEADERS` — constant identifying the OpenRouter integration. Source: [gptme/server/api_v2.py:39](https://github.com/gptme/gptme/blob/main/gptme/server/api_v2.py).

Credentials are loaded via `gptme.credentials.get_stored_api_key`, and the API root handler reports `list_available_providers()` so the web UI can populate its provider picker. Source: [gptme/server/api_v2.py:38-42](https://github.com/gptme/gptme/blob/main/gptme/server/api_v2.py). When the server is launched with no keys configured, the CLI still starts in *degraded mode* so the user can configure a provider through the UI. Source: [gptme/server/cli.py:65-71](https://github.com/gptme/gptme/blob/main/gptme/server/cli.py).
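The relationship between stored credentials, available providers, and degraded mode can be sketched as follows. The provider-to-variable table and both function names are hypothetical; the real lookup goes through `get_stored_api_key` and may consult config files as well as the environment.

```python
from __future__ import annotations

import os
from collections.abc import Mapping

# Hypothetical provider -> environment variable table, for illustration only.
_ENV_KEYS = {
    "openai": "OPENAI_API_KEY",
    "anthropic": "ANTHROPIC_API_KEY",
    "openrouter": "OPENROUTER_API_KEY",
}


def available_providers(env: Mapping[str, str] | None = None) -> list[str]:
    """Return providers whose key is present and non-empty."""
    env = os.environ if env is None else env
    return [name for name, var in _ENV_KEYS.items() if env.get(var)]


def is_degraded(env: Mapping[str, str] | None = None) -> bool:
    """Degraded mode: the server runs, but no provider is usable yet."""
    return not available_providers(env)
```

Passing the environment in explicitly keeps the functions easy to test and makes the degraded-mode condition a pure function of the configured keys.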

### Provider data flow

```mermaid
flowchart LR
  UI[Web UI Provider Picker] -->|/api/v2/providers| API[api_v2.py]
  API -->|list_available_providers| LLM[llm/models.py]
  LLM -->|PROVIDERS + filters| API
  API -->|JSON list| UI
  User[User config / env] --> Creds[credentials.get_stored_api_key]
  Creds --> LLM
```

The web UI ships as a multi-backend client that connects to one or more `gptme-server` instances, each with its own provider configuration. Source: [webui/README.md:9-21](https://github.com/gptme/gptme/blob/main/webui/README.md).

## RAG and Conversation Search

RAG support is one of the most-discussed topics in the community. Issue #59 requests retrieval over project folders, plain-text notes, past conversations, and previously web-retrieved documents, with the stated goal of returning gptme to its roots as an agent that has long-term context about the user and their projects.

In the `v0.31.1.dev20260622` release, gptme ships a first step in that direction: a new `context search-conversations` CLI subcommand that performs RAG-style search over stored conversations, making conversation-level retrieval a first-class surface.

The DSPy evaluation module complements this direction: it uses `MIPROv2` and `BootstrapFewShot` to optimize gptme's system prompts against task-success and tool-usage metrics, and stores the results as artifacts that future RAG indexes could consume. Source: [gptme/eval/dspy/README.md:60-75](https://github.com/gptme/gptme/blob/main/gptme/eval/dspy/README.md). The README documents a known issue where the tool-usage metric always reads `0.000`, indicating the metric in `metrics.py` needs further work — a useful pointer for anyone extending RAG evaluation.

## See Also

- [gptme/server/openapi_docs.py](https://github.com/gptme/gptme/blob/main/gptme/server/openapi_docs.py) — OpenAPI schema generation for the V2 endpoints.
- [gptme/server/api_v2_common.py](https://github.com/gptme/gptme/blob/main/gptme/server/api_v2_common.py) — shared validators for conversation ids and branch names.
- [gptme/eval/dspy/README.md](https://github.com/gptme/gptme/blob/main/gptme/eval/dspy/README.md) — prompt optimization harness used to benchmark tool behavior.
- [webui/README.md](https://github.com/gptme/gptme/blob/main/webui/README.md) — deployment modes for the web UI, including the multi-backend configuration relevant to swapping providers and RAG backends.
- Community: [Issue #59](https://github.com/gptme/gptme/issues/59) (Add RAG for code, personal files, and conversations), [Issue #2667](https://github.com/gptme/gptme/issues/2667) (`patch_anchored` tool proposal).

---

<a id='page-3'></a>

## Server, Web UI, and Desktop

### Related Pages

Related topics: [Project Overview and Architecture](#page-1), [Extensibility, Plugins, and Autonomous Agents](#page-4)

<details>
<summary>Related Source Files</summary>

The following source files were used to generate this page:

- [tauri/README.md](https://github.com/gptme/gptme/blob/main/tauri/README.md)
- [tauri/package.json](https://github.com/gptme/gptme/blob/main/tauri/package.json)
- [webui/README.md](https://github.com/gptme/gptme/blob/main/webui/README.md)
- [webui/extension/README.md](https://github.com/gptme/gptme/blob/main/webui/extension/README.md)
- [webui/extension/package.json](https://github.com/gptme/gptme/blob/main/webui/extension/package.json)
- [gptme/server/cli.py](https://github.com/gptme/gptme/blob/main/gptme/server/cli.py)
- [gptme/server/api_v2.py](https://github.com/gptme/gptme/blob/main/gptme/server/api_v2.py)
- [gptme/server/api_v2_common.py](https://github.com/gptme/gptme/blob/main/gptme/server/api_v2_common.py)
- [gptme/server/tools_api.py](https://github.com/gptme/gptme/blob/main/gptme/server/tools_api.py)
- [gptme/server/artifacts_api.py](https://github.com/gptme/gptme/blob/main/gptme/server/artifacts_api.py)
- [gptme/server/openapi_docs.py](https://github.com/gptme/gptme/blob/main/gptme/server/openapi_docs.py)
</details>

# Server, Web UI, and Desktop

gptme ships as a monorepo with three coordinated delivery surfaces: a Flask-based HTTP **server** that exposes the agent's conversation, tool, and artifact APIs; a React/Vite **web UI** that consumes those APIs; and a Tauri-based native **desktop app** that bundles both into a standalone binary. A fourth surface, a Chrome MV3 extension, reuses the same web UI for in-browser use. Together they let users run gptme locally, in containers, on remote VMs, or as a packaged desktop application against a shared HTTP backend.

## Architecture at a Glance

All client surfaces share a single source of truth: the server's V2 REST API. The web UI, Tauri shell, and Chrome extension are thin presentation layers; for them, the agent runtime, tool registry, and conversation log live entirely behind the server. The CLI is the exception and drives the agent runtime in-process.

```mermaid
flowchart LR
  subgraph Clients
    UI[webui<br/>React + Vite]
    Tauri[tauri shell<br/>Rust + WebView]
    Ext[Chrome extension<br/>MV3 side panel]
    CLI[gptme CLI]
  end
  subgraph Server[gptme-server · Flask]
    API[v2 API<br/>api_v2.py]
    Tools[tools_api]
    Artifacts[artifacts_api]
    OpenAPI[openapi_docs]
    Auth[auth + token]
  end
  subgraph Agent[Agent Runtime]
    LLM[LLM providers]
    ToolsReg[Tool registry]
    Log[Conversation log<br/>JSONL]
  end
  UI -- "REST + Bearer" --> API
  Tauri -- "REST + Bearer" --> API
  Ext -- "REST + Bearer" --> API
  CLI -- "in-process" --> Agent
  API --> Auth
  API --> Tools
  API --> Artifacts
  API --> Agent
  ToolsReg --> Agent
  OpenAPI -. documents .-> API
```

## gptme-server (Backend)

The server is a Flask application started via the `gptme-server` CLI entry point. It can boot in two modes: a normal mode, in which at least one LLM provider API key is configured, and a "degraded" mode that boots without keys so the SetupWizard can prompt the user to configure one through the UI. Source: [gptme/server/cli.py:50-90](https://github.com/gptme/gptme/blob/main/gptme/server/cli.py)

The V2 API surface is defined in [gptme/server/api_v2.py](https://github.com/gptme/gptme/blob/main/gptme/server/api_v2.py) and provides conversation CRUD, session management, and tool execution. Common helpers enforce filesystem-safety rules that are reused across blueprints; for example, `_validate_conversation_id` enforces a 255-byte limit and rejects path-traversal attempts, returning `400 Bad Request` with a stable message. Source: [gptme/server/api_v2_common.py:24-40](https://github.com/gptme/gptme/blob/main/gptme/server/api_v2_common.py)
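A validator with these properties might look like the sketch below. It is not the code from `api_v2_common.py`: the function name, the error strings, and the exact set of rejected characters are assumptions. The point it illustrates is that the limit applies to encoded bytes, not characters.

```python
from __future__ import annotations

MAX_ID_BYTES = 255  # byte limit stated above


def validate_conversation_id(conversation_id: str) -> str | None:
    """Return an error message for an unsafe id, or None when it is acceptable."""
    if not conversation_id:
        return "conversation_id must not be empty"
    # Filesystem limits count bytes, so measure the UTF-8 encoding.
    if len(conversation_id.encode("utf-8")) > MAX_ID_BYTES:
        return "conversation_id too long"
    if conversation_id in {".", ".."}:
        return "invalid conversation_id"
    # Separators and NUL would let the id escape or corrupt the log directory.
    if any(ch in conversation_id for ch in ("/", "\\", "\x00")):
        return "invalid conversation_id"
    return None
```

A route handler would map a non-`None` result to a `400 Bad Request` response carrying that message.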

Auxiliary blueprints expose specific subsystems:

- **Tools registry**: `GET /api/v2/tools` returns a `ToolOut` payload describing the agent's current tool palette (name, description, block types, parameter schema, MCP origin, availability, default-disabled flag) so the webui can render a `FunctionBrowser` panel. Source: [gptme/server/tools_api.py:14-65](https://github.com/gptme/gptme/blob/main/gptme/server/tools_api.py)
- **Artifacts**: File writes recovered from message history (markdown code blocks or XML tool-use blocks for `save`, `append`, `patch`, `morph`, `patch_many`) are classified into kinds by extension and MIME type, hashed into stable ids, and indexed by the originating message. Source: [gptme/server/artifacts_api.py:60-140](https://github.com/gptme/gptme/blob/main/gptme/server/artifacts_api.py)
- **OpenAPI docs**: The `api_doc` decorator auto-infers summary, description, request/response schemas, and tags from docstrings, type hints, and Flask route metadata, then composes a live OpenAPI spec for the UI and external clients. Source: [gptme/server/openapi_docs.py:30-160](https://github.com/gptme/gptme/blob/main/gptme/server/openapi_docs.py)

Authentication is delegated to `init_auth`, which issues a bearer token on first run; `gptme-server token` prints it for use in an `Authorization: Bearer` header. Source: [gptme/server/cli.py:100-130](https://github.com/gptme/gptme/blob/main/gptme/server/cli.py)
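From a client's point of view, the token is simply attached to each request. The sketch below builds, but does not send, an authenticated request with the standard library; the helper name, the base URL, and the token value are placeholders.

```python
import urllib.request


def tools_request(base_url: str, token: str) -> urllib.request.Request:
    """Build (but do not send) an authenticated GET for the tool list."""
    return urllib.request.Request(
        f"{base_url.rstrip('/')}/api/v2/tools",
        headers={"Authorization": f"Bearer {token}"},
    )


# Placeholder values; substitute the address and token of a running server.
req = tools_request("http://127.0.0.1:5700", "example-token")
```

Passing `req` to `urllib.request.urlopen` would perform the call against a running server.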

A SIGTERM handler routes service and container shutdowns (`systemctl stop`, Kubernetes scale-down) through the same clean-shutdown path as Ctrl+C, so telemetry flushing and the Flask `finally` block always run. Source: [gptme/server/cli.py:88-95](https://github.com/gptme/gptme/blob/main/gptme/server/cli.py)

## gptme-webui (Frontend)

The web UI is a Vite-built React SPA that talks to one or more `gptme-server` instances. Its standout capability is **multi-backend mode**: a single browser tab can connect to several servers (local dev on `http://127.0.0.1:5700`, a remote workstation, a hosted `chat.gptme.org` instance, or the managed `gptme.ai` service) and unify their conversations. Source: [webui/README.md:8-22](https://github.com/gptme/gptme/blob/main/webui/README.md)

A demo mode reads bundled conversation logs without a live server, useful for sharing reproducible artifacts; live fetches are suppressed in this mode. Source: [webui/README.md:12-20](https://github.com/gptme/gptme/blob/main/webui/README.md)

A SetupWizard is launched from a "disconnected" banner on first visit when the UI cannot reach a configured server, guiding the user through API-key provisioning and provider selection. Source: [webui/README.md:14-22](https://github.com/gptme/gptme/blob/main/webui/README.md)

## gptme-tauri (Desktop)

`tauri/` wraps the web UI and a bundled `gptme-server` PyInstaller sidecar into a native desktop binary. The Rust shell handles app lifecycle, IPC, and sidecar process management, so end users do not need to install Python or manage dependencies. Source: [tauri/README.md:6-16](https://github.com/gptme/gptme/blob/main/tauri/README.md)

Development and build are driven from the repo root:

```bash
make tauri-dev           # builds webui, starts Tauri dev server
make tauri-build-sidecar # PyInstaller → bins/gptme-server (gitignored)
make tauri-build         # produces bundle in src-tauri/target/release/bundle/
```

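Source: [tauri/README.md:22-48](https://github.com/gptme/gptme/blob/main/tauri/README.md). The Tauri CLI is pinned to `@tauri-apps/cli ^2` in [tauri/package.json](https://github.com/gptme/gptme/blob/main/tauri/package.json).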

## Chrome Extension

A Manifest V3 extension lives at `webui/extension/`. The side panel reuses the shared React components via the webui's multi-entry Vite build (`panel.html` → `ExtensionChat.tsx`), while the service worker and content script are standalone TypeScript compiled with esbuild. A selection-capture content script lets users highlight text on any page and ask gptme about it via the side panel. Source: [webui/extension/README.md:6-28](https://github.com/gptme/gptme/blob/main/webui/extension/README.md)

## Deployment Modes

| Mode | Backend | Client | Use case |
|------|---------|--------|----------|
| Local dev | `gptme-server` on `127.0.0.1:5700` | `webui` Vite dev server | Contributor workflow |
| Desktop | Bundled sidecar | `tauri` shell | End-user without Python |
| Hosted (open) | User-supplied server | `chat.gptme.org` | Share a server, keep keys |
| Cloud (managed) | Auto-discovered via `gptme.ai` auth | `gptme.ai` | No server setup |
| Custom remote | VM/workstation | `webui` multi-backend | Aggregate remote agents |
| Demo | None (bundled JSONL) | `webui` demo mode | Shareable reproductions |
| Browser | Any reachable server | Chrome extension side panel | Inline web assistance |

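Source: [webui/README.md:10-22](https://github.com/gptme/gptme/blob/main/webui/README.md), [tauri/README.md:6-16](https://github.com/gptme/gptme/blob/main/tauri/README.md), [webui/extension/README.md:6-12](https://github.com/gptme/gptme/blob/main/webui/extension/README.md).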

## Common Failure Modes

- **No API keys at boot**: the server starts in degraded mode and the UI must guide the user through configuration; until a key is set, requests that need an LLM fail. Source: [gptme/server/cli.py:50-75](https://github.com/gptme/gptme/blob/main/gptme/server/cli.py)
- **Path-traversal / oversized ids**: rejected at the API boundary with `400 Bad Request` and a stable message; clients should surface this verbatim. Source: [gptme/server/api_v2_common.py:24-50](https://github.com/gptme/gptme/blob/main/gptme/server/api_v2_common.py)
- **Windows compatibility**: historically blocked by `readline`; the desktop bundle via PyInstaller + Tauri is the recommended path on Windows until upstream fixes land. Source: community issue [#73](https://github.com/gptme/gptme/issues/73)
- **Long patch sequences drift**: the `patch` tool re-matches every `ORIGINAL` block from the top of the file, so sequential edits to one file can drift; this motivated the proposed `patch_anchored` (hash-anchored, atomic verify-apply) tool. Source: community issue [#2667](https://github.com/gptme/gptme/issues/2667)

## See Also

- Conversation & session model — `gptme/server/api_v2.py`, `gptme/server/api_v2_common.py`
- Tools & artifacts — `gptme/server/tools_api.py`, `gptme/server/artifacts_api.py`
- API reference — auto-generated by `gptme/server/openapi_docs.py`
- DSPy-based prompt optimization — `gptme/eval/dspy/README.md`

---

<a id='page-4'></a>

## Extensibility, Plugins, and Autonomous Agents

### Related Pages

Related topics: [Project Overview and Architecture](#page-1), [Tools, LLM Providers, and RAG](#page-2), [Server, Web UI, and Desktop](#page-3)

<details>
<summary>Related Source Files</summary>

The following source files were used to generate this page:

- [gptme/plugins/__init__.py](https://github.com/gptme/gptme/blob/main/gptme/plugins/__init__.py)
- [gptme/plugins/registry.py](https://github.com/gptme/gptme/blob/main/gptme/plugins/registry.py)
- [gptme/plugins/entrypoints.py](https://github.com/gptme/gptme/blob/main/gptme/plugins/entrypoints.py)
- [gptme/hooks/__init__.py](https://github.com/gptme/gptme/blob/main/gptme/hooks/__init__.py)
- [gptme/hooks/registry.py](https://github.com/gptme/gptme/blob/main/gptme/hooks/registry.py)
- [gptme/hooks/confirm.py](https://github.com/gptme/gptme/blob/main/gptme/hooks/confirm.py)
- [gptme/server/tools_api.py](https://github.com/gptme/gptme/blob/main/gptme/server/tools_api.py)
- [gptme/server/session_step.py](https://github.com/gptme/gptme/blob/main/gptme/server/session_step.py)
- [gptme/server/tasks_api.py](https://github.com/gptme/gptme/blob/main/gptme/server/tasks_api.py)
- [gptme/server/api_v2.py](https://github.com/gptme/gptme/blob/main/gptme/server/api_v2.py)
- [gptme/eval/dspy/README.md](https://github.com/gptme/gptme/blob/main/gptme/eval/dspy/README.md)
</details>

# Extensibility, Plugins, and Autonomous Agents

gptme is designed as an extensible agent runtime where the tool palette, lifecycle behavior, and execution loop can be customized without modifying core code. This page documents the three cooperating subsystems that make that possible: the **plugin/tool registry**, the **hooks system**, and the **autonomous session/tasks engine** that drives the agent loop.

## 1. Purpose and Scope

The extensibility layer serves three goals:

- **Tool composition**: Allow new capabilities (shell, patch, browser, MCP-provided, third-party) to be added to an agent session without forking the runtime. Source: [gptme/server/tools_api.py:1-25](https://github.com/gptme/gptme/blob/main/gptme/server/tools_api.py).
- **Lifecycle control**: Let host applications intercept and gate sensitive operations (file edits, network calls, confirmations) through declarative hook points. Source: [gptme/hooks/confirm.py](https://github.com/gptme/gptme/blob/main/gptme/hooks/confirm.py) — referenced from [gptme/server/session_step.py:18-30](https://github.com/gptme/gptme/blob/main/gptme/server/session_step.py).
- **Autonomous execution**: Provide a managed loop where the agent can run multi-step workflows, parallelize them via the Tasks API, and recover or branch when prompts or contexts shift. Source: [gptme/server/session_step.py:1-60](https://github.com/gptme/gptme/blob/main/gptme/server/session_step.py); [gptme/server/tasks_api.py:1-60](https://github.com/gptme/gptme/blob/main/gptme/server/tasks_api.py).

```mermaid
flowchart LR
    A[Plugin entrypoint] --> B[Plugin registry]
    B --> C[Tool registry<br/>get_available_tools]
    C --> D[Tools API<br/>GET /api/v2/tools]
    H[Hooks registry] --> S[Session step executor]
    T[Tasks API] --> S
    S --> R[Conversation log<br/>+ branches]
    D --> UI[WebUI / Tauri / Extension]
```

## 2. Plugins and the Tool Registry

### 2.1 Plugin discovery

Plugins are loaded through Python entry points and assembled in the registry. Source: [gptme/plugins/__init__.py](https://github.com/gptme/gptme/blob/main/gptme/plugins/__init__.py); [gptme/plugins/registry.py](https://github.com/gptme/gptme/blob/main/gptme/plugins/registry.py); [gptme/plugins/entrypoints.py](https://github.com/gptme/gptme/blob/main/gptme/plugins/entrypoints.py). A plugin typically contributes one or more `Tool` subclasses that are registered under a name (e.g. `shell`, `patch`, `patch_many`, `morph`, `save`, `append`). The registry is the single source of truth consumed by both the CLI and the server. Source: [gptme/server/tools_api.py:9-13](https://github.com/gptme/gptme/blob/main/gptme/server/tools_api.py) — `from ..tools import get_available_tools`.

### 2.2 Tool metadata exposed to clients

Each tool is described by a `ToolOut` model so the WebUI, Tauri shell, and Chrome MV3 extension can render the same palette. Source: [gptme/server/tools_api.py:30-50](https://github.com/gptme/gptme/blob/main/gptme/server/tools_api.py).

| Field | Meaning |
| --- | --- |
| `name` | Tool name, also the block-type prefix the agent emits (e.g. `shell`). |
| `desc` | One-line summary shown in the Function Browser. |
| `instructions` | Full usage text given to the model. |
| `block_types` | Code-block language tags the tool handles (`['shell', 'bash']`). |
| `is_mcp` | Tool is provided via Model Context Protocol. |
| `is_available` | Whether the tool is usable in the current session (e.g. missing credentials). |
| `disabled_by_default` | Excluded from default sessions; user must opt in. |

Source: [gptme/server/tools_api.py:30-50](https://github.com/gptme/gptme/blob/main/gptme/server/tools_api.py).

### 2.3 File-editing tools and the patch family

File-writing tools form a small, well-defined set the rest of the system relies on: `save`, `append`, `patch`, `patch_many`, `morph`. Source: [gptme/server/artifacts_api.py:1-80](https://github.com/gptme/gptme/blob/main/gptme/server/artifacts_api.py) — `_FILE_WRITE_TOOLS = {"save", "append", "patch", "morph", "patch_many"}`. The community has been actively requesting a `patch_anchored` tool — hash-anchored editing that survives line-number drift between sequential edits — illustrating the extensibility pressure on this surface area. Source: [GitHub issue #2667](https://github.com/gptme/gptme/issues/2667).

## 3. Hooks: Gating and Lifecycle Interception

The hooks subsystem lets hosts observe and modify the agent loop. Two concrete entry points are visible in the runtime:

- `HookType` enumeration and `trigger_hook` dispatch — the canonical place where event-style observers attach. Source: [gptme/hooks/__init__.py](https://github.com/gptme/gptme/blob/main/gptme/hooks/__init__.py); referenced from [gptme/server/session_step.py:18-30](https://github.com/gptme/gptme/blob/main/gptme/server/session_step.py).
- `ConfirmationResult` from `gptme.hooks.confirm` — gates tool execution (e.g. shell commands, file edits) with a user-prompted allow/deny verdict. Source: [gptme/hooks/confirm.py](https://github.com/gptme/gptme/blob/main/gptme/hooks/confirm.py); imported in [gptme/server/session_step.py:18-30](https://github.com/gptme/gptme/blob/main/gptme/server/session_step.py).

Hook handlers are registered centrally so plugin authors and embedding hosts (server, WebUI, Tauri) share the same lifecycle. Source: [gptme/hooks/registry.py](https://github.com/gptme/gptme/blob/main/gptme/hooks/registry.py).
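
The pattern of a central handler registry with a confirmation gate can be sketched as follows. The names `Event`, `Verdict`, `register`, `dispatch`, and `is_allowed` are hypothetical stand-ins chosen so as not to imply anything about the real signatures of `HookType`, `trigger_hook`, or `ConfirmationResult`.

```python
from collections import defaultdict
from dataclasses import dataclass
from enum import Enum
from typing import Callable


class Event(Enum):
    TOOL_CONFIRM = "tool_confirm"  # hypothetical event name


@dataclass
class Verdict:
    allow: bool
    reason: str = ""


Handler = Callable[[str, str], Verdict]
_handlers: dict[Event, list[Handler]] = defaultdict(list)


def register(event: Event, handler: Handler) -> None:
    """Attach a handler to an event in the shared registry."""
    _handlers[event].append(handler)


def dispatch(event: Event, tool: str, content: str) -> list[Verdict]:
    """Call every handler registered for `event` and collect the verdicts."""
    return [handler(tool, content) for handler in _handlers[event]]


def is_allowed(tool: str, content: str) -> bool:
    """A tool call proceeds only if no registered handler denies it."""
    return all(v.allow for v in dispatch(Event.TOOL_CONFIRM, tool, content))


def deny_recursive_delete(tool: str, content: str) -> Verdict:
    if tool == "shell" and "rm -rf" in content:
        return Verdict(False, "recursive delete blocked")
    return Verdict(True)


register(Event.TOOL_CONFIRM, deny_recursive_delete)
print(is_allowed("shell", "ls -la"), is_allowed("shell", "rm -rf build"))
```

With a single registry, a host such as a server and a plugin can both attach handlers to the same event without knowing about each other.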

## 4. Autonomous Agent Execution

### 4.1 Session step loop

The autonomous loop lives in `gptme.server.session_step`. It separates *how* steps are generated and tools executed (this module) from the Flask routes in `api_v2_sessions.py` and the data models in `session_models.py`. Source: [gptme/server/session_step.py:1-12](https://github.com/gptme/gptme/blob/main/gptme/server/session_step.py). The module wires together:

- LLM streaming/chat completion (`_chat_complete`, `_stream`).
- Message preparation (`prepare_messages`) and `LogManager` for conversation state.
- Tool acquisition (`get_tools`) and an ACP runtime for external agent processes.
- Telemetry (`trace_function`) and a background health monitor for ACP. Source: [gptme/server/session_step.py:32-60](https://github.com/gptme/gptme/blob/main/gptme/server/session_step.py).

The workspace is materialized via `prepare_execution_environment` and the shell tool's `set_workspace_cwd` so each step runs in a deterministic directory. Source: [gptme/server/session_step.py:18-30](https://github.com/gptme/gptme/blob/main/gptme/server/session_step.py).

### 4.2 Tasks for parallel autonomy

Long-running or parallel work is modeled as `Task` metadata that references one or more conversations. Source: [gptme/server/tasks_api.py:30-55](https://github.com/gptme/gptme/blob/main/gptme/server/tasks_api.py). Status transitions (`pending → active → completed | failed`) let the WebUI surface background progress, and `archived` keeps the registry clean without losing provenance. Workspace and git context are *derived* from the active conversation rather than duplicated, keeping Tasks lightweight. Source: [gptme/server/tasks_api.py:55-80](https://github.com/gptme/gptme/blob/main/gptme/server/tasks_api.py).
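
The status lifecycle above can be expressed as a small transition table. This is a sketch rather than the code in `tasks_api.py`: the names `TaskStatus`, `ALLOWED`, and `transition` are hypothetical, and the rule that `archived` is reachable only from `completed` or `failed` is an assumption.

```python
from enum import Enum


class TaskStatus(Enum):
    PENDING = "pending"
    ACTIVE = "active"
    COMPLETED = "completed"
    FAILED = "failed"
    ARCHIVED = "archived"


# Assumption: `archived` is only reachable from a finished task.
ALLOWED: dict[TaskStatus, set[TaskStatus]] = {
    TaskStatus.PENDING: {TaskStatus.ACTIVE},
    TaskStatus.ACTIVE: {TaskStatus.COMPLETED, TaskStatus.FAILED},
    TaskStatus.COMPLETED: {TaskStatus.ARCHIVED},
    TaskStatus.FAILED: {TaskStatus.ARCHIVED},
    TaskStatus.ARCHIVED: set(),
}


def transition(current: TaskStatus, target: TaskStatus) -> TaskStatus:
    """Return `target` if the move is legal, otherwise raise ValueError."""
    if target not in ALLOWED[current]:
        raise ValueError(f"illegal transition: {current.value} -> {target.value}")
    return target


status = TaskStatus.PENDING
status = transition(status, TaskStatus.ACTIVE)
status = transition(status, TaskStatus.COMPLETED)
print(status.value)
```

Rejecting illegal moves in one place keeps a UI from ever having to render a task that jumped straight from `pending` to `completed`.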

### 4.3 Server CLI wiring

The server CLI ties everything together: it initializes the tool allowlist, registers a fallback model for degraded startup, configures telemetry, and sets up authentication for the HTTP layer before calling `app.run`. Source: [gptme/server/cli.py:1-50](https://github.com/gptme/gptme/blob/main/gptme/server/cli.py). It also routes SIGTERM through the same clean-shutdown path as Ctrl+C, so that telemetry is flushed and session state stays consistent when a container is scaled down. Source: [gptme/server/cli.py:30-50](https://github.com/gptme/gptme/blob/main/gptme/server/cli.py).

### 4.4 Prompt optimization for autonomy

Autonomous behavior quality depends on the system prompt. The DSPy integration under `gptme/eval/dspy/` provides automated prompt optimization with MIPROv2 (Bayesian instruction search) and BootstrapFewShot (example bootstrapping), driven by composite metrics covering task success, tool effectiveness, and LLM-judged quality. Source: [gptme/eval/dspy/README.md](https://github.com/gptme/gptme/blob/main/gptme/eval/dspy/README.md). Recent releases also expand RAG-style context — e.g. a `context search-conversations` subcommand — directly addressing the long-standing community request to give the agent memory of past projects and notes. Source: [v0.31.1.dev20260622 release notes](https://github.com/gptme/gptme/releases/tag/v0.31.1.dev20260622); [GitHub issue #59](https://github.com/gptme/gptme/issues/59).
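
A composite metric of the kind mentioned above can be pictured as a weighted average of sub-scores. The sketch below is plain Python and does not use the DSPy API; the function name and the default weights are assumptions made for illustration, not values from the project.

```python
def composite_score(
    task_success: float,
    tool_effectiveness: float,
    judge_quality: float,
    weights: tuple[float, float, float] = (0.5, 0.25, 0.25),  # assumed weights
) -> float:
    """Weighted average of three sub-scores, each expected in [0, 1]."""
    scores = (task_success, tool_effectiveness, judge_quality)
    if any(not 0.0 <= s <= 1.0 for s in scores):
        raise ValueError("sub-scores must lie in [0, 1]")
    total = sum(weights)
    if total <= 0:
        raise ValueError("weights must sum to a positive value")
    return sum(w * s for w, s in zip(weights, scores)) / total


# A task that succeeded but whose tool score was reported as zero.
print(composite_score(1.0, 0.0, 0.8))
```

Under these assumed weights, a tool-effectiveness score wrongly reported as zero lowers the composite even for a successful run, which is why a faulty sub-metric matters to an optimizer that maximizes the combined value.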

## Common Failure Modes

- **Plugin not loaded**: Tool missing from `GET /api/v2/tools` — check entrypoint registration and `disabled_by_default`. Source: [gptme/server/tools_api.py:30-50](https://github.com/gptme/gptme/blob/main/gptme/server/tools_api.py).
- **Patch drift in long sessions**: Sequential `patch` operations can shift line numbers; this is the motivation behind the proposed `patch_anchored` tool. Source: [GitHub issue #2667](https://github.com/gptme/gptme/issues/2667).
- **Stuck ACP process**: The health monitor thread restarts unhealthy ACP runtimes on a 30-second cadence. Source: [gptme/server/session_step.py:50-60](https://github.com/gptme/gptme/blob/main/gptme/server/session_step.py).
- **DSPy metric bug**: Known issue where tool usage score reports `0.000` despite successful tool-using tasks — under investigation. Source: [gptme/eval/dspy/README.md — Known Issues](https://github.com/gptme/gptme/blob/main/gptme/eval/dspy/README.md).

## See Also

- [Tools and Capabilities](Tools-and-Capabilities.md)
- [Server API Reference](Server-API-Reference.md)
- [Hooks and Lifecycle Events](Hooks-and-Lifecycle-Events.md)
- [Sessions, Tasks, and Branches](Sessions-Tasks-and-Branches.md)
- [DSPy Prompt Optimization](DSPy-Prompt-Optimization.md)

---


## Pitfall Log

Project: gptme/gptme

Summary: Found 21 structured pitfall items, including 1 high-severity (blocking) item. Top priority: a security or permission risk that requires verification.

## 1. Security or permission risk - Security or permission risk requires verification

- Severity: high
- Evidence strength: source_linked
- Finding: Project evidence flags a security or permission risk. Review the linked source before relying on this workflow.
- User impact: May increase setup, validation, or first-run risk for the user.
- Evidence: community_evidence:github | https://github.com/gptme/gptme/issues/2667

## 2. Installation risk - Installation risk requires verification

- Severity: medium
- Evidence strength: source_linked
- Finding: Developers should check this installation risk before relying on the project: v0.31.1.dev20260525
- User impact: Upgrade or migration may change expected behavior: v0.31.1.dev20260525
- Evidence: failure_mode_cluster:github_release | https://github.com/gptme/gptme/releases/tag/v0.31.1.dev20260525

## 3. Installation risk - Installation risk requires verification

- Severity: medium
- Evidence strength: source_linked
- Finding: Developers should check this installation risk before relying on the project: v0.31.1.dev20260604
- User impact: Upgrade or migration may change expected behavior: v0.31.1.dev20260604
- Evidence: failure_mode_cluster:github_release | https://github.com/gptme/gptme/releases/tag/v0.31.1.dev20260604

## 4. Installation risk - Installation risk requires verification

- Severity: medium
- Evidence strength: source_linked
- Finding: Project evidence flags an installation risk. Review the linked source before relying on this workflow.
- User impact: May increase setup, validation, or first-run risk for the user.
- Evidence: community_evidence:github | https://github.com/gptme/gptme/issues/2982

## 5. Configuration risk - Configuration risk requires verification

- Severity: medium
- Evidence strength: source_linked
- Finding: Project evidence flags a configuration risk. Review the linked source before relying on this workflow.
- User impact: May increase setup, validation, or first-run risk for the user.
- Evidence: capability.host_targets | https://github.com/gptme/gptme

## 6. Configuration risk - Configuration risk requires verification

- Severity: medium
- Evidence strength: source_linked
- Finding: Developers should check this configuration risk before relying on the project: feat(patch): add patch_anchored tool — hash-anchored editing with atomic verify-apply
- User impact: Developers may misconfigure credentials, environment, or host setup: feat(patch): add patch_anchored tool — hash-anchored editing with atomic verify-apply
- Evidence: failure_mode_cluster:github_issue | https://github.com/gptme/gptme/issues/2667

## 7. Configuration risk - Configuration risk requires verification

- Severity: medium
- Evidence strength: source_linked
- Finding: Developers should check this configuration risk before relying on the project: v0.31.1.dev20260504
- User impact: Upgrade or migration may change expected behavior: v0.31.1.dev20260504
- Evidence: failure_mode_cluster:github_release | https://github.com/gptme/gptme/releases/tag/v0.31.1.dev20260504

## 8. Configuration risk - Configuration risk requires verification

- Severity: medium
- Evidence strength: source_linked
- Finding: Developers should check this configuration risk before relying on the project: v0.31.1.dev20260511
- User impact: Upgrade or migration may change expected behavior: v0.31.1.dev20260511
- Evidence: failure_mode_cluster:github_release | https://github.com/gptme/gptme/releases/tag/v0.31.1.dev20260511

## 9. Configuration risk - Configuration risk requires verification

- Severity: medium
- Evidence strength: source_linked
- Finding: Developers should check this configuration risk before relying on the project: v0.31.1.dev20260518
- User impact: Upgrade or migration may change expected behavior: v0.31.1.dev20260518
- Evidence: failure_mode_cluster:github_release | https://github.com/gptme/gptme/releases/tag/v0.31.1.dev20260518

## 10. Configuration risk - Configuration risk requires verification

- Severity: medium
- Evidence strength: source_linked
- Finding: Developers should check this configuration risk before relying on the project: v0.31.1.dev20260521
- User impact: Upgrade or migration may change expected behavior: v0.31.1.dev20260521
- Evidence: failure_mode_cluster:github_release | https://github.com/gptme/gptme/releases/tag/v0.31.1.dev20260521

## 11. Configuration risk - Configuration risk requires verification

- Severity: medium
- Evidence strength: source_linked
- Finding: Developers should check this configuration risk before relying on the project: v0.31.1.dev20260601
- User impact: Upgrade or migration may change expected behavior: v0.31.1.dev20260601
- Evidence: failure_mode_cluster:github_release | https://github.com/gptme/gptme/releases/tag/v0.31.1.dev20260601

## 12. Capability evidence risk - Capability evidence risk requires verification

- Severity: medium
- Evidence strength: source_linked
- Finding: README/documentation is current enough for a first validation pass.
- User impact: May increase setup, validation, or first-run risk for the user.
- Evidence: capability.assumptions | https://github.com/gptme/gptme

## 13. Runtime risk - Runtime risk requires verification

- Severity: medium
- Evidence strength: source_linked
- Finding: Developers should check this runtime risk before relying on the project: v0.31.1.dev20260608
- User impact: Upgrade or migration may change expected behavior: v0.31.1.dev20260608
- Evidence: failure_mode_cluster:github_release | https://github.com/gptme/gptme/releases/tag/v0.31.1.dev20260608

## 14. Maintenance risk - Maintenance risk requires verification

- Severity: medium
- Evidence strength: source_linked
- Finding: Developers should check this migration risk before relying on the project: v0.31.1.dev20260507
- User impact: Upgrade or migration may change expected behavior: v0.31.1.dev20260507
- Evidence: failure_mode_cluster:github_release | https://github.com/gptme/gptme/releases/tag/v0.31.1.dev20260507

## 15. Maintenance risk - Maintenance risk requires verification

- Severity: medium
- Evidence strength: source_linked
- Finding: Project evidence flags a maintenance risk. Review the linked source before relying on this workflow.
- User impact: May increase setup, validation, or first-run risk for the user.
- Evidence: evidence.maintainer_signals | https://github.com/gptme/gptme

## 16. Security or permission risk - Security or permission risk requires verification

- Severity: medium
- Evidence strength: source_linked
- Finding: no_demo
- User impact: May increase setup, validation, or first-run risk for the user.
- Evidence: downstream_validation.risk_items | https://github.com/gptme/gptme

## 17. Security or permission risk - Security or permission risk requires verification

- Severity: medium
- Evidence strength: source_linked
- Finding: no_demo
- User impact: May increase setup, validation, or first-run risk for the user.
- Evidence: risks.scoring_risks | https://github.com/gptme/gptme

## 18. Security or permission risk - Security or permission risk requires verification

- Severity: medium
- Evidence strength: source_linked
- Finding: Project evidence flags a security or permission risk. Review the linked source before relying on this workflow.
- User impact: May increase setup, validation, or first-run risk for the user.
- Evidence: community_evidence:github | https://github.com/gptme/gptme/issues/2949

## 19. Maintenance risk - Maintenance risk requires verification

- Severity: low
- Evidence strength: source_linked
- Finding: `issue_or_pr_quality=unknown`.
- User impact: May increase setup, validation, or first-run risk for the user.
- Evidence: evidence.maintainer_signals | https://github.com/gptme/gptme

## 20. Maintenance risk - Maintenance risk requires verification

- Severity: low
- Evidence strength: source_linked
- Finding: `release_recency=unknown`.
- User impact: May increase setup, validation, or first-run risk for the user.
- Evidence: evidence.maintainer_signals | https://github.com/gptme/gptme

## 21. Maintenance risk - Maintenance risk requires verification

- Severity: low
- Evidence strength: source_linked
- Finding: Developers should check this maintenance risk before relying on the project: v0.31.1.dev20260514
- User impact: Upgrade or migration may change expected behavior: v0.31.1.dev20260514
- Evidence: failure_mode_cluster:github_release | https://github.com/gptme/gptme/releases/tag/v0.31.1.dev20260514

<!-- canonical_name: gptme/gptme; human_manual_source: deepwiki_human_wiki -->
