# https://github.com/vstorm-co/pydantic-ai-backend Project Manual

Generated at: 2026-06-27 10:25:10 UTC

## Table of Contents

- [Project Overview & Architecture](#page-overview)
- [Backends & Protocol Reference](#page-backends)
- [Console Toolset, Permissions & Capabilities](#page-toolsets)
- [Sandboxes, Runtimes & Session Management](#page-sandboxes)

<a id='page-overview'></a>

## Project Overview & Architecture

### Related Pages

Related topics: [Backends & Protocol Reference](#page-backends), [Console Toolset, Permissions & Capabilities](#page-toolsets), [Sandboxes, Runtimes & Session Management](#page-sandboxes)

<details>
<summary>Related Source Files</summary>

The following source files were used to generate this page:

- [README.md](https://github.com/vstorm-co/pydantic-ai-backend/blob/main/README.md)
- [CLAUDE.md](https://github.com/vstorm-co/pydantic-ai-backend/blob/main/CLAUDE.md)
- [mkdocs.yml](https://github.com/vstorm-co/pydantic-ai-backend/blob/main/mkdocs.yml)
- [src/pydantic_ai_backends/__init__.py](https://github.com/vstorm-co/pydantic-ai-backend/blob/main/src/pydantic_ai_backends/__init__.py)
- [src/pydantic_ai_backends/protocol.py](https://github.com/vstorm-co/pydantic-ai-backend/blob/main/src/pydantic_ai_backends/protocol.py)
- [src/pydantic_ai_backends/types.py](https://github.com/vstorm-co/pydantic-ai-backend/blob/main/src/pydantic_ai_backends/types.py)
- [src/pydantic_ai_backends/backends/state.py](https://github.com/vstorm-co/pydantic-ai-backend/blob/main/src/pydantic_ai_backends/backends/state.py)
- [src/pydantic_ai_backends/hashline.py](https://github.com/vstorm-co/pydantic-ai-backend/blob/main/src/pydantic_ai_backends/hashline.py)
- [examples/predictive_analytics/README.md](https://github.com/vstorm-co/pydantic-ai-backend/blob/main/examples/predictive_analytics/README.md)
</details>

# Project Overview & Architecture

## 1. Purpose and Scope

`pydantic-ai-backend` is a library that provides **file storage backends and sandboxed execution environments** for AI agents built on top of [pydantic-ai](https://github.com/pydantic/pydantic-ai). The project supplies a uniform abstraction over local filesystems, in-memory stores, Docker containers, Kubernetes pods, Daytona cloud sandboxes, and composite routings, so an agent can read, edit, list, and execute code interchangeably against any of them. Source: [CLAUDE.md](https://github.com/vstorm-co/pydantic-ai-backend/blob/main/CLAUDE.md).

The repository's documented identity, as rendered by mkdocs, is "File Storage & Sandbox Backends for Pydantic AI", with a subtitle of "Console Toolset, Docker Sandbox, and Permission System for AI Agents". Source: [mkdocs.yml](https://github.com/vstorm-co/pydantic-ai-backend/blob/main/mkdocs.yml).

The central design rule is **protocol-based backends**: every backend implements `BackendProtocol`, which lets tools, sandboxes, and composite routers depend on a single, minimal contract. Source: [CLAUDE.md](https://github.com/vstorm-co/pydantic-ai-backend/blob/main/CLAUDE.md). The library is published under MIT, requires 100% test coverage, and ships optional extras (`docker`, `daytona`, `kubernetes`, `pydantic-ai`) so that consumers only pull in the dependencies they need. Source: [README.md](https://github.com/vstorm-co/pydantic-ai-backend/blob/main/README.md).

## 2. High-Level Architecture

The codebase is organized as a thin **protocol layer** sitting on top of several concrete backend implementations, with a parallel **toolset layer** that exposes backend operations to LLM agents.

```mermaid
flowchart TB
    Agent["Pydantic AI Agent<br/>(LLM + tools)"]
    Tools["Console Toolset<br/>create_console_toolset()"]
    Perms["Permissions<br/>PermissionChecker / Rulesets"]
    Proto["BackendProtocol<br/>SandboxProtocol"]
    Backends["Concrete Backends<br/>State · Local · Filesystem · Docker · Kubernetes · Daytona"]
    Composite["Composite / AsyncComposite"]
    Adapter["ensure_async() adapter"]
    Hash["hashline.py<br/>content-hash line editing"]

    Agent --> Tools
    Tools --> Perms
    Tools --> Proto
    Proto --> Backends
    Proto --> Composite
    Backends --> Adapter
    Composite --> Adapter
    Tools -.uses.-> Hash
```

The lazy-import registry in `__init__.py` exposes a single namespace to users while deferring the import of optional-extra modules until they are actually requested. Source: [src/pydantic_ai_backends/__init__.py](https://github.com/vstorm-co/pydantic-ai-backend/blob/main/src/pydantic_ai_backends/__init__.py).
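The deferred-import idea can be illustrated with a small, self-contained sketch. It is not the package's actual registry: the `_LAZY_IMPORTS` contents and the `lazy_getattr` helper are hypothetical, and stdlib modules stand in for the optional-extra modules. In a real package, a function like this would typically be bound to the module-level `__getattr__` hook (PEP 562).

```python
import importlib
from typing import Any

# Hypothetical registry: public name -> (module path, attribute name).
# Stdlib modules stand in for optional-extra modules in this sketch.
_LAZY_IMPORTS: dict[str, tuple[str, str]] = {
    "dumps": ("json", "dumps"),
    "sqrt": ("math", "sqrt"),
}


def lazy_getattr(name: str) -> Any:
    """Import the owning module on first access and cache the attribute."""
    try:
        module_path, attr = _LAZY_IMPORTS[name]
    except KeyError:
        raise AttributeError(f"module has no attribute {name!r}") from None
    value = getattr(importlib.import_module(module_path), attr)
    globals()[name] = value  # later lookups find the cached value directly
    return value
```

The point of the design is that importing the package stays cheap and never fails because an optional dependency is missing; the import cost (or error) only appears when the name is first used.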

### 2.1 Protocol Layer

`BackendProtocol` is `@runtime_checkable` and declares the minimum surface area every backend must provide: `ls_info`, `read`, `read_bytes`, `write`, `edit`, `glob_info`, and `grep_raw`, plus the `exists(path) -> bool` predicate introduced in 0.2.8 so callers no longer need to inspect private state. Source: [src/pydantic_ai_backends/protocol.py](https://github.com/vstorm-co/pydantic-ai-backend/blob/main/src/pydantic_ai_backends/protocol.py). `SandboxProtocol` extends this with `execute`, background-process handles, and lifecycle hooks.
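To show what `@runtime_checkable` buys in practice, here is a minimal sketch using a hypothetical two-method protocol (`MiniBackend`) rather than the real `BackendProtocol`. The class names are illustrative assumptions; only `typing.Protocol` and `typing.runtime_checkable` are real APIs.

```python
from typing import Protocol, runtime_checkable


@runtime_checkable
class MiniBackend(Protocol):
    """Hypothetical two-method subset of a backend contract."""

    def exists(self, path: str) -> bool: ...
    def read_bytes(self, path: str) -> bytes: ...


class DictBackend:
    """Satisfies MiniBackend structurally; it never inherits from it."""

    def __init__(self) -> None:
        self._files: dict[str, bytes] = {}

    def exists(self, path: str) -> bool:
        return path in self._files

    def read_bytes(self, path: str) -> bytes:
        return self._files.get(path, b"")


class NotABackend:
    """Missing read_bytes, so the isinstance check below fails."""

    def exists(self, path: str) -> bool:
        return False
```

Note that `isinstance()` against a runtime-checkable protocol only verifies that the methods are present; it does not validate signatures or return types, so static type checking remains the stronger guarantee.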

### 2.2 Backends Layer

Concrete backends live under `src/pydantic_ai_backends/backends/`, and each provides storage for one target (memory, local disk, a container, and so on). `StateBackend` is the canonical in-memory implementation, used both for tests and as a reference example of how a backend should populate the `FileData` store. Source: [src/pydantic_ai_backends/backends/state.py](https://github.com/vstorm-co/pydantic-ai-backend/blob/main/src/pydantic_ai_backends/backends/state.py). The concrete sandboxes (`DockerSandbox`, `LocalSandbox`, `KubernetesPodSandbox`, `DaytonaSandbox`) inherit from the shared `BaseSandbox` class defined in the same package. Source: [src/pydantic_ai_backends/backends/docker/sandbox.py](https://github.com/vstorm-co/pydantic-ai-backend/blob/main/src/pydantic_ai_backends/backends/docker/sandbox.py).

### 2.3 Toolset Layer

The console toolset wraps the protocol methods into LLM-callable tools (`ls`, `read_file`, `write_file`, `edit_file`, `hashline_edit`, `glob`, `grep`, `execute`) and attaches the permission checker. Tools emit standardized descriptions such as `READ_FILE_DESCRIPTION`, `WRITE_FILE_DESCRIPTION`, and `EDIT_FILE_DESCRIPTION` that steer the model toward safe defaults. Source: [src/pydantic_ai_backends/toolsets/console.py](https://github.com/vstorm-co/pydantic-ai-backend/blob/main/src/pydantic_ai_backends/toolsets/console.py).

## 3. Module Layout

The top-level layout, as documented in `CLAUDE.md` and reflected in the package, is summarized below. Source: [CLAUDE.md](https://github.com/vstorm-co/pydantic-ai-backend/blob/main/CLAUDE.md).

| Module | Responsibility |
|---|---|
| `types.py` | TypedDicts and dataclasses: `FileData`, `FileInfo`, `WriteResult`, `EditResult`, `ExecuteResponse`, `BackgroundHandle`. |
| `protocol.py` | `BackendProtocol`, `SandboxProtocol`, runtime-checkable contracts. |
| `state.py` | `StateBackend` — in-memory dict-backed storage. |
| `filesystem.py` | `FilesystemBackend` — real filesystem with sandboxing. |
| `composite.py` | `CompositeBackend` / `AsyncCompositeBackend` — path-prefix routing. |
| `sandbox.py` | `BaseSandbox`, `DockerSandbox`, `LocalSandbox` lifecycle. |
| `session.py` | `SessionManager` for multi-session agent state. |
| `runtimes.py` | `RuntimeConfig`, `BUILTIN_RUNTIMES` registry. |
| `adapter.py` | `ensure_async()` async-wrapping shim. |
| `hashline.py` | 2-char MD5-prefix content-hash line editing format. |
| `toolsets/console.py` | `create_console_toolset()` and system-prompt fragments. |
| `__init__.py` | Public API surface with lazy imports. |

Source: [src/pydantic_ai_backends/__init__.py](https://github.com/vstorm-co/pydantic-ai-backend/blob/main/src/pydantic_ai_backends/__init__.py), [src/pydantic_ai_backends/types.py](https://github.com/vstorm-co/pydantic-ai-backend/blob/main/src/pydantic_ai_backends/types.py).

## 4. Data Flow: Agent to Backend

The request lifecycle is straightforward and protocol-driven:

1. The agent receives a user prompt and decides which tool to call (e.g. `read_file`, `execute`).
2. The console toolset resolves the tool's `BackendProtocol` reference from the agent's `RunContext` dependencies (e.g. `ConsoleDeps.backend`) and validates the call against `PermissionChecker`.
3. The call is dispatched to the backend's protocol method (`read_bytes`, `write`, `execute`, …), which returns a typed result (`WriteResult`, `EditResult`, `ExecuteResponse`).
4. For PDFs and images, `read_file` may return `pydantic_ai.BinaryContent` so document-understanding models can ingest them directly. Source: [src/pydantic_ai_backends/toolsets/console.py](https://github.com/vstorm-co/pydantic-ai-backend/blob/main/src/pydantic_ai_backends/toolsets/console.py).
5. For edits, the agent may choose `str_replace` (exact string match) or `hashline` (2-char MD5 line tags) — the latter eliminates whitespace-matching errors and reduces tokens. Source: [src/pydantic_ai_backends/hashline.py](https://github.com/vstorm-co/pydantic-ai-backend/blob/main/src/pydantic_ai_backends/hashline.py).
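Steps 2 and 3 of the lifecycle above can be sketched as follows. Every name here (`Deps`, `MemoryBackend`, `WriteOutcome`, `write_file_tool`) is a hypothetical stand-in, not the library's `ConsoleDeps`, `WriteResult`, or tool implementation; the sketch only shows the order of operations: resolve the backend from the dependencies, check permission, then dispatch.

```python
from dataclasses import dataclass
from typing import Callable, Optional


@dataclass
class WriteOutcome:
    """Hypothetical typed result, loosely modelled on a write result."""
    path: str
    error: Optional[str] = None


class MemoryBackend:
    """Stand-in backend that keeps file contents in a dict."""

    def __init__(self) -> None:
        self.files: dict[str, str] = {}

    def write(self, path: str, content: str) -> WriteOutcome:
        self.files[path] = content
        return WriteOutcome(path=path)


@dataclass
class Deps:
    """Stand-in for the dependencies object carried by the run context."""
    backend: MemoryBackend
    is_allowed: Callable[[str, str], bool]  # (operation, path) -> verdict


def write_file_tool(deps: Deps, path: str, content: str) -> WriteOutcome:
    # Step 2: validate the call before touching the backend.
    if not deps.is_allowed("write", path):
        return WriteOutcome(path=path, error="permission denied")
    # Step 3: dispatch to the backend method and return its typed result.
    return deps.backend.write(path, content)
```

Because the tool only talks to `deps.backend`, swapping the in-memory stand-in for any other object with the same `write` method would not change the tool code.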

A representative end-to-end example is the **predictive analytics demo**, where the main agent delegates to a sub-agent that owns a `DockerSandbox`, executes Python with sklearn/pandas inside the container, and returns structured chart data. Source: [examples/predictive_analytics/README.md](https://github.com/vstorm-co/pydantic-ai-backend/blob/main/examples/predictive_analytics/README.md).

## 5. Recent Architectural Evolution

The community-visible changelog between 0.2.6 and 0.2.15 illustrates how the architecture has been steadily extended without breaking the protocol contract:

- **0.2.6** — `CompositeBackend` fixed trailing-slash route matching (`/foo` == `/foo/`). Source: [release notes](https://github.com/vstorm-co/pydantic-ai-backend/releases/tag/0.2.6).
- **0.2.7** — `LocalBackend.async_execute()` for cancellable shell execution via `asyncio.create_subprocess_exec`. Source: [release notes](https://github.com/vstorm-co/pydantic-ai-backend/releases/tag/0.2.7).
- **0.2.8** — `BackendProtocol.exists(path) -> bool` predicate adopted as a first-class method. Source: [release notes](https://github.com/vstorm-co/pydantic-ai-backend/releases/tag/0.2.8).
- **0.2.11** — `read_file` returns PDFs as `BinaryContent` for vision/document models. Source: [release notes](https://github.com/vstorm-co/pydantic-ai-backend/releases/tag/0.2.11).
- **0.2.12** — `KubernetesPodSandbox` joins the sandbox family as a synchronous counterpart to `DockerSandbox`. Source: [release notes](https://github.com/vstorm-co/pydantic-ai-backend/releases/tag/0.2.12).
- **0.2.13** — `LocalBackend.write()`/`edit()` no longer double `\r` on Windows (text-mode fix). Source: [release notes](https://github.com/vstorm-co/pydantic-ai-backend/releases/tag/0.2.13).
- **0.2.14 / 0.2.15** — `ensure_async()` adapter and `AsyncCompositeBackend` extend the protocol uniformly to mixed sync/async stacks. Source: [release notes 0.2.14](https://github.com/vstorm-co/pydantic-ai-backend/releases/tag/0.2.14), [0.2.15](https://github.com/vstorm-co/pydantic-ai-backend/releases/tag/0.2.15).

A live example of why this evolution matters is [issue #54](https://github.com/vstorm-co/pydantic-ai-backend/issues/54), where the async adapter originally called a wrapped backend's private `_read_bytes`; the 0.2.14 adapter redesign targets the public `read_bytes()` instead, so third-party wrappers like `pydantic-deep`'s `BranchOverlay` work without monkey-patching.

## See Also

- [`BackendProtocol` & `SandboxProtocol` Reference](./protocol.md)
- [Backends Catalog](./backends.md)
- [Sandbox Execution & Docker](./sandbox.md)
- [Composite & AsyncComposite Routing](./composite.md)
- [Console Toolset & Hashline Editing](./toolsets.md)
- [Permissions System](./permissions.md)

---

<a id='page-backends'></a>

## Backends & Protocol Reference

### Related Pages

Related topics: [Project Overview & Architecture](#page-overview), [Sandboxes, Runtimes & Session Management](#page-sandboxes)

<details>
<summary>Related Source Files</summary>

The following source files were used to generate this page:

- [src/pydantic_ai_backends/protocol.py](https://github.com/vstorm-co/pydantic-ai-backend/blob/main/src/pydantic_ai_backends/protocol.py)
- [src/pydantic_ai_backends/types.py](https://github.com/vstorm-co/pydantic-ai-backend/blob/main/src/pydantic_ai_backends/types.py)
- [src/pydantic_ai_backends/__init__.py](https://github.com/vstorm-co/pydantic-ai-backend/blob/main/src/pydantic_ai_backends/__init__.py)
- [src/pydantic_ai_backends/backends/state.py](https://github.com/vstorm-co/pydantic-ai-backend/blob/main/src/pydantic_ai_backends/backends/state.py)
- [src/pydantic_ai_backends/backends/local.py](https://github.com/vstorm-co/pydantic-ai-backend/blob/main/src/pydantic_ai_backends/backends/local.py)
- [src/pydantic_ai_backends/backends/composite.py](https://github.com/vstorm-co/pydantic-ai-backend/blob/main/src/pydantic_ai_backends/backends/composite.py)
- [src/pydantic_ai_backends/adapter.py](https://github.com/vstorm-co/pydantic-ai-backend/blob/main/src/pydantic_ai_backends/adapter.py)
- [CLAUDE.md](https://github.com/vstorm-co/pydantic-ai-backend/blob/main/CLAUDE.md)
</details>

# Backends & Protocol Reference

## Overview

`pydantic-ai-backend` is built around a **protocol-based backend abstraction**: every concrete storage backend implements the same [`BackendProtocol`](https://github.com/vstorm-co/pydantic-ai-backend/blob/main/src/pydantic_ai_backends/protocol.py), so an AI agent can target in-memory, filesystem, composite, or sandboxed storage with the same call surface. The project exposes this contract, together with its types and helpers, through a [lazy-loaded public API](https://github.com/vstorm-co/pydantic-ai-backend/blob/main/src/pydantic_ai_backends/__init__.py): optional integrations (Docker, Daytona, Kubernetes, pydantic-ai toolsets) are imported only when actually requested.

The reference architecture below summarises the layered design described in [`CLAUDE.md`](https://github.com/vstorm-co/pydantic-ai-backend/blob/main/CLAUDE.md):

```mermaid
graph TD
    A[Agent / Toolset] --> B[BackendProtocol]
    B --> C[StateBackend]
    B --> D[LocalBackend]
    B --> E[CompositeBackend]
    B --> F[Sandbox backends<br/>Docker / Daytona / K8s]
    G[ensure_async / AsyncAdapter] --> B
    G --> H[AsyncCompositeBackend]
    I[BackendProtocol.exists] --> B
```

## BackendProtocol Contract

`BackendProtocol` is a `@runtime_checkable` `Protocol` defining the minimum surface every backend must satisfy ([`src/pydantic_ai_backends/protocol.py`](https://github.com/vstorm-co/pydantic-ai-backend/blob/main/src/pydantic_ai_backends/protocol.py)). Core methods include:

- `exists(path: str) -> bool` — returns `True` only for paths resolving to an existing file. Added in **0.2.8** ([#37](https://github.com/vstorm-co/pydantic-ai-backend/pull/37)) as a first-class predicate to replace pattern-matching empty-byte returns from `read_bytes()` or peeking into private state such as `StateBackend._files`.
- `ls_info(path: str) -> list[FileInfo]` — directory listing returning `FileInfo` TypedDicts (`name`, `path`, `is_dir`, `size`).
- `read_bytes(path: str) -> bytes` and `read(path: str) -> str` — raw and text reads (text adds line-number prefixes).
- `write`, `edit`, `grep_raw`, `glob_info` — content mutation and search.

Return types are defined as TypedDicts and dataclasses in [`src/pydantic_ai_backends/types.py`](https://github.com/vstorm-co/pydantic-ai-backend/blob/main/src/pydantic_ai_backends/types.py), including `WriteResult`, `EditResult` (with an `occurrences` counter), and `ExecuteResponse` (`output`, `exit_code`, `truncated`). Because `exists()` returns `False` for both directories and missing paths, callers must use `ls_info()` to tell the two apart; this contract is documented directly on the protocol method.
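The file-only semantics of `exists()` can be made concrete with a small in-memory sketch. `TinyStore`, `Entry`, and `classify` are hypothetical names, and the behaviour of `ls_info()` on a missing path (an empty list) is an assumption of this sketch rather than a documented guarantee of the library.

```python
from typing import TypedDict


class Entry(TypedDict):
    """Hypothetical listing entry with the four fields named above."""
    name: str
    path: str
    is_dir: bool
    size: int


class TinyStore:
    """In-memory stand-in where directories exist only implicitly."""

    def __init__(self, files: dict[str, bytes]) -> None:
        self._files = files

    def exists(self, path: str) -> bool:
        # True only for files; directories and missing paths both give False.
        return path in self._files

    def ls_info(self, path: str) -> list[Entry]:
        prefix = path.rstrip("/") + "/"
        seen: dict[str, Entry] = {}
        for file_path, data in self._files.items():
            if not file_path.startswith(prefix):
                continue
            head, _, rest = file_path[len(prefix):].partition("/")
            child = prefix + head
            if rest:  # something lives below `head`, so it is a directory
                seen[child] = Entry(name=head, path=child, is_dir=True, size=0)
            else:
                seen[child] = Entry(name=head, path=child, is_dir=False, size=len(data))
        return sorted(seen.values(), key=lambda e: e["path"])


def classify(store: TinyStore, path: str) -> str:
    """Combine both calls to separate file, directory, and missing."""
    if store.exists(path):
        return "file"
    return "directory" if store.ls_info(path) else "missing"
```

In this sketch an empty directory cannot be represented at all, which is one reason a real backend may need more than a listing check to answer the "is this a directory" question.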

## Concrete Backend Implementations

| Backend | Storage | Notes |
|---------|---------|-------|
| `StateBackend` | In-memory `dict[str, FileData]` | Ephemeral; `files` property exposes the dict for inspection. |
| `LocalBackend` | Real filesystem under `root_dir` | Permission-checked via `permissions` ruleset. |
| `CompositeBackend` | Routes by path prefix to child backends | One default backend + per-prefix entries. |
| `AsyncCompositeBackend` | Async counterpart of `CompositeBackend` | Added in **0.2.15** ([#57](https://github.com/vstorm-co/pydantic-ai-backend/pull/57)). |

`StateBackend` initialises with an optional file map and uses ISO-8601 timestamps for `created_at` / `modified_at` ([`src/pydantic_ai_backends/backends/state.py`](https://github.com/vstorm-co/pydantic-ai-backend/blob/main/src/pydantic_ai_backends/backends/state.py)). `CompositeBackend` performs **path-prefix routing**: requests resolve to the most specific prefix, falling back to a default backend ([`src/pydantic_ai_backends/backends/composite.py`](https://github.com/vstorm-co/pydantic-ai-backend/blob/main/src/pydantic_ai_backends/backends/composite.py)).
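A minimal sketch of most-specific-prefix routing is shown below. `PrefixRouter` is a hypothetical class, and plain strings stand in for child backends; the normalisation step illustrates one way to make `/foo` and `/foo/` equivalent, without claiming this is how `CompositeBackend` implements it.

```python
class PrefixRouter:
    """Hypothetical router: longest matching prefix wins, else the default."""

    def __init__(self, default: str, routes: dict[str, str]) -> None:
        self._default = default
        # Normalise so "/foo" and "/foo/" register the same route.
        self._routes = {self._norm(p): name for p, name in routes.items()}

    @staticmethod
    def _norm(path: str) -> str:
        return path.rstrip("/") or "/"

    def resolve(self, path: str) -> str:
        target = self._norm(path)
        best = ""
        for prefix in self._routes:
            # Match the prefix itself or anything below it, never "/foobar".
            below = target.startswith(prefix.rstrip("/") + "/")
            if (target == prefix or below) and len(prefix) > len(best):
                best = prefix
        return self._routes[best] if best else self._default
```

Matching on `prefix + "/"` rather than on the bare prefix is what keeps a route registered for `/foo` from accidentally capturing `/foobar`.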

## Async Adapter

To consume a synchronous backend uniformly with `await`, `pydantic-ai-backend` provides the `ensure_async()` adapter in [`src/pydantic_ai_backends/adapter.py`](https://github.com/vstorm-co/pydantic-ai-backend/blob/main/src/pydantic_ai_backends/adapter.py) (added in **0.2.14**, [#55](https://github.com/vstorm-co/pydantic-ai-backend/pull/55), closes [#54](https://github.com/vstorm-co/pydantic-ai-backend/issues/54)). The adapter delegates I/O to the wrapped backend so consumers do not need to branch on sync vs async availability.

`AsyncCompositeBackend` extends this idea to routing: a single async object can dispatch to a mix of sync and async child backends while exposing one `await`able surface.
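The adapter idea can be sketched with the standard library alone. `AsyncFacade` and `SyncStore` are hypothetical names, and offloading synchronous calls with `asyncio.to_thread` is an assumption of this sketch; the source does not state how `ensure_async()` schedules blocking work. The sketch does mirror one documented property: it only ever calls public methods on the wrapped object.

```python
import asyncio
import inspect
from typing import Any


class SyncStore:
    """Stand-in synchronous backend exposing only public methods."""

    def __init__(self) -> None:
        self._data = {"/a.txt": b"hello"}

    def read_bytes(self, path: str) -> bytes:
        return self._data.get(path, b"")


class AsyncFacade:
    """Hypothetical wrapper: every public method becomes awaitable."""

    def __init__(self, inner: Any) -> None:
        self._inner = inner

    def __getattr__(self, name: str) -> Any:
        if name.startswith("_"):
            raise AttributeError(name)  # never reach into private state
        method = getattr(self._inner, name)

        async def call(*args: Any, **kwargs: Any) -> Any:
            if inspect.iscoroutinefunction(method):
                return await method(*args, **kwargs)
            # Run blocking I/O in a worker thread to keep the loop free.
            return await asyncio.to_thread(method, *args, **kwargs)

        return call


async def demo() -> bytes:
    return await AsyncFacade(SyncStore()).read_bytes("/a.txt")
```

Running `asyncio.run(demo())` reads the file through the awaitable surface; a wrapped object whose methods are already coroutines is awaited directly instead of being sent to a thread.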

## Common Failure Modes

- **Windows CR doubling.** `LocalBackend.write()` previously used `Path.write_text()` in text mode. On Windows, text mode converts every `\n` to `\r\n`, so content that already contained `\r\n` was written as `\r\r\n`. Fixed in **0.2.13** ([#51](https://github.com/vstorm-co/pydantic-ai-backend/issues/51)) in [`src/pydantic_ai_backends/backends/local.py`](https://github.com/vstorm-co/pydantic-ai-backend/blob/main/src/pydantic_ai_backends/backends/local.py).
- **Trailing-slash route mismatch.** Before **0.2.6**, `CompositeBackend` treated `/foo` and `/foo/` as distinct prefixes, so `ls /foo` could silently fall through to the default backend instead of matching a registered `/foo/` route. Fixed to align with shell semantics (`ls /tmp` ≡ `ls /tmp/`).
- **Async adapter calling private methods.** Issue [#54](https://github.com/vstorm-co/pydantic-ai-backend/issues/54) reports that wrappers like `pydantic-deep`'s `BranchOverlay` expose a public `read_bytes()` but no `_read_bytes()`; the adapter now targets the public symbol.
- **Cancellable subprocesses.** `LocalBackend.async_execute()` (added in **0.2.7**, [#36](https://github.com/vstorm-co/pydantic-ai-backend/pull/36)) uses `asyncio.create_subprocess_exec` so that cancelling the calling task immediately terminates the child.
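The CR-doubling failure mode above can be reproduced on any operating system with the standard library. The sketch passes `newline="\r\n"` to `open()` to imitate Windows text-mode translation, then writes the same content as bytes for comparison. It demonstrates the general mechanism only and is not a copy of the library's fix.

```python
import tempfile
from pathlib import Path

content = "first\r\nsecond\r\n"  # text that already carries CRLF endings

with tempfile.TemporaryDirectory() as tmp:
    text_path = Path(tmp) / "text_mode.txt"
    bytes_path = Path(tmp) / "byte_mode.txt"

    # newline="\r\n" reproduces Windows text-mode translation on any OS:
    # every "\n" written becomes "\r\n", so "\r\n" turns into "\r\r\n".
    with open(text_path, "w", encoding="utf-8", newline="\r\n") as fh:
        fh.write(content)

    # Writing encoded bytes bypasses newline translation entirely.
    bytes_path.write_bytes(content.encode("utf-8"))

    text_mode_bytes = text_path.read_bytes()
    byte_mode_bytes = bytes_path.read_bytes()
```

After this runs, `text_mode_bytes` contains the doubled `\r\r\n` sequences while `byte_mode_bytes` matches the input exactly. Opening the file with `newline=""` is another standard way to disable the translation.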

## See Also

- [Console Toolset & Permissions](./console-toolset.md) — `create_console_toolset`, permission rulesets, image/document support.
- [Sandbox Backends (Docker / Daytona / Kubernetes)](./sandbox-backends.md) — `BaseSandbox`, `DockerSandbox`, `KubernetesPodSandbox`, runtimes.
- [Release Notes (0.2.6 – 0.2.15)](https://github.com/vstorm-co/pydantic-ai-backend/releases) — protocol additions and bug fixes referenced above.

---

<a id='page-toolsets'></a>

## Console Toolset, Permissions & Capabilities

### Related Pages

Related topics: [Project Overview & Architecture](#page-overview), [Backends & Protocol Reference](#page-backends)

<details>
<summary>Related Source Files</summary>

The following source files were used to generate this page:

- [src/pydantic_ai_backends/toolsets/console.py](https://github.com/vstorm-co/pydantic-ai-backend/blob/main/src/pydantic_ai_backends/toolsets/console.py)
- [src/pydantic_ai_backends/toolsets/__init__.py](https://github.com/vstorm-co/pydantic-ai-backend/blob/main/src/pydantic_ai_backends/toolsets/__init__.py)
- [src/pydantic_ai_backends/permissions/__init__.py](https://github.com/vstorm-co/pydantic-ai-backend/blob/main/src/pydantic_ai_backends/permissions/__init__.py)
- [src/pydantic_ai_backends/permissions/checker.py](https://github.com/vstorm-co/pydantic-ai-backend/blob/main/src/pydantic_ai_backends/permissions/checker.py)
- [src/pydantic_ai_backends/permissions/presets.py](https://github.com/vstorm-co/pydantic-ai-backend/blob/main/src/pydantic_ai_backends/permissions/presets.py)
- [src/pydantic_ai_backends/permissions/types.py](https://github.com/vstorm-co/pydantic-ai-backend/blob/main/src/pydantic_ai_backends/permissions/types.py)
- [src/pydantic_ai_backends/capability.py](https://github.com/vstorm-co/pydantic-ai-backend/blob/main/src/pydantic_ai_backends/capability.py)
- [src/pydantic_ai_backends/hashline.py](https://github.com/vstorm-co/pydantic-ai-backend/blob/main/src/pydantic_ai_backends/hashline.py)
- [src/pydantic_ai_backends/__init__.py](https://github.com/vstorm-co/pydantic-ai-backend/blob/main/src/pydantic_ai_backends/__init__.py)
</details>

# Console Toolset, Permissions & Capabilities

## Overview

The `pydantic-ai-backend` library exposes three tightly coupled layers that turn a Pydantic AI agent into a workspace-aware coding assistant:

1. **Console Toolset** — the set of `FunctionTool`s the LLM can call (file I/O, search, shell).
2. **Permission System** — a pattern-based gatekeeper that decides whether a given tool call is `allow`, `deny`, or `ask`.
3. **ConsoleCapability** — a Pydantic AI `AbstractCapability` that wires (1) and (2) into an `Agent` and hides denied tools from the model entirely.

All three are exported from the top-level package via lazy loading so that consumers that only need the permission system do not pull in `pydantic-ai` ([src/pydantic_ai_backends/__init__.py:__LAZY_IMPORTS__](https://github.com/vstorm-co/pydantic-ai-backend/blob/main/src/pydantic_ai_backends/__init__.py)).

## Console Toolset

The factory `create_console_toolset()` returns a `FunctionToolset[ConsoleDeps]` that can be attached to any Pydantic AI agent. It works against any backend that implements `BackendProtocol` — `LocalBackend`, `StateBackend`, `DockerSandbox`, etc. — so the same toolset drives both local and sandboxed execution ([src/pydantic_ai_backends/toolsets/console.py:create_console_toolset](https://github.com/vstorm-co/pydantic-ai-backend/blob/main/src/pydantic_ai_backends/toolsets/console.py)).

### Tools exposed

| Tool | Purpose | Source |
|------|---------|--------|
| `ls` | List directory entries | `LS_DESCRIPTION` |
| `read_file` | Read text or return `BinaryContent` for media | `READ_FILE_DESCRIPTION` |
| `write_file` | Create or overwrite a file | `WRITE_FILE_DESCRIPTION` |
| `edit_file` | Targeted `str_replace` edit | `EDIT_FILE_DESCRIPTION` |
| `hashline_edit` | Hash-tagged line edit (Can Bölük-style) | `HASHLINE_EDIT_DESCRIPTION` |
| `glob` / `grep` | Path / content search | `GLOB_DESCRIPTION`, `GREP_DESCRIPTION` |
| `execute` | Synchronous shell command | `EXECUTE_DESCRIPTION` |

Edit behavior is selected by the `edit_format` parameter (`"str_replace"` by default, or `"hashline"`) ([src/pydantic_ai_backends/toolsets/console.py:create_console_toolset](https://github.com/vstorm-co/pydantic-ai-backend/blob/main/src/pydantic_ai_backends/toolsets/console.py)).

### Multimodal and document support

Two independent flags switch the toolset into multimodal mode:

- `image_support=True` causes `read_file` to return `pydantic_ai.BinaryContent` for `IMAGE_EXTENSIONS` (`.png`, `.jpg`, `.jpeg`, `.gif`, `.webp`) up to `max_image_bytes`.
- `document_support=True` extends that behavior to PDFs (`DOCUMENT_EXTENSIONS` / `DOCUMENT_MEDIA_TYPES`) up to `max_document_bytes` ([src/pydantic_ai_backends/toolsets/console.py:create_console_toolset](https://github.com/vstorm-co/pydantic-ai-backend/blob/main/src/pydantic_ai_backends/toolsets/console.py); release 0.2.11 added PDF support, release notes [0.2.11](https://github.com/vstorm-co/pydantic-ai-backend/releases/tag/0.2.11)).

### Hashline format

When `edit_format="hashline"`, file content is rendered as `N:HH|line` triples, where `HH` is the first 2 hex chars of the MD5 digest of the line. Models then reference lines by `number:hash` instead of reproducing exact whitespace, which is the primary motivation for this mode ([src/pydantic_ai_backends/hashline.py:line_hash](https://github.com/vstorm-co/pydantic-ai-backend/blob/main/src/pydantic_ai_backends/hashline.py)).
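A rough sketch of the format follows. The helper names (`line_tag`, `render`, `replace_line`) are hypothetical, and two details are assumptions of this sketch: that the hash is taken over the UTF-8 bytes of the line without its trailing newline, and that a stale tag is rejected with an error.

```python
import hashlib


def line_tag(line: str) -> str:
    """First two hex characters of the MD5 digest of the line text."""
    return hashlib.md5(line.encode("utf-8")).hexdigest()[:2]


def render(text: str) -> str:
    """Render text as `N:HH|line` rows with 1-based line numbers."""
    return "\n".join(
        f"{n}:{line_tag(line)}|{line}"
        for n, line in enumerate(text.splitlines(), start=1)
    )


def replace_line(text: str, number: int, tag: str, new_line: str) -> str:
    """Replace one line, but only if the caller's tag still matches it."""
    lines = text.splitlines()
    if not 1 <= number <= len(lines):
        raise ValueError(f"line {number} out of range")
    if line_tag(lines[number - 1]) != tag:
        raise ValueError(f"stale hash for line {number}")
    lines[number - 1] = new_line
    return "\n".join(lines)
```

The tag acts as a cheap staleness check: the model addresses a line by number, and the two-character hash confirms that the line it saw is still the line being edited, without the model having to reproduce indentation exactly.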

## Permission System

The permission layer is a small, dependency-free rule engine defined in `src/pydantic_ai_backends/permissions/`. It is consumed both by `create_console_toolset` (for approval gates) and by `ConsoleCapability` (for tool hiding) ([src/pydantic_ai_backends/permissions/__init__.py](https://github.com/vstorm-co/pydantic-ai-backend/blob/main/src/pydantic_ai_backends/permissions/__init__.py)).

### Data model

```mermaid
flowchart LR
    Rule[PermissionRule<br/>pattern + action] --> Op[OperationPermissions]
    Op --> Ruleset[PermissionRuleset<br/>read/write/edit/execute/ls/glob/grep]
    Ruleset --> Checker[PermissionChecker]
    Checker -->|allow / deny / ask| Tool[Console Tool]
```

A `PermissionRule` pairs a glob pattern with an `action` in `{"allow", "deny", "ask"}`. Rules are evaluated **in order — first match wins**. The `pattern` syntax extends `fnmatch` with `**` for recursive directory matches ([src/pydantic_ai_backends/permissions/types.py:PermissionRule](https://github.com/vstorm-co/pydantic-ai-backend/blob/main/src/pydantic_ai_backends/permissions/types.py)).
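First-match-wins evaluation can be sketched with the standard library's `fnmatch` module. The `Rule` dataclass, `resolve_action` helper, and the fallback `default` action are hypothetical. Note also that stdlib `fnmatch` lets a single `*` cross `/` boundaries, so this sketch does not reproduce the library's exact `**` semantics; it only shows why rule order matters.

```python
from dataclasses import dataclass
from fnmatch import fnmatchcase


@dataclass(frozen=True)
class Rule:
    """Hypothetical stand-in for a pattern/action pair."""
    pattern: str
    action: str  # "allow", "deny", or "ask"


def resolve_action(rules: list[Rule], path: str, default: str = "ask") -> str:
    """Return the action of the first rule whose pattern matches the path."""
    for rule in rules:
        if fnmatchcase(path, rule.pattern):
            return rule.action
    return default  # assumed fallback when no rule matches


rules = [
    Rule("**/.env", "deny"),         # secrets first, so they win over later rules
    Rule("/workspace/**", "allow"),
    Rule("**", "ask"),
]
```

With this ordering, `/workspace/.env` resolves to `deny` even though the broader `/workspace/**` rule would allow it; reversing the list flips that outcome, which is the practical consequence of first-match-wins.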

### Presets

Four named rulesets are exported for common safety profiles ([src/pydantic_ai_backends/permissions/presets.py](https://github.com/vstorm-co/pydantic-ai-backend/blob/main/src/pydantic_ai_backends/permissions/presets.py)):

| Preset | Behavior |
|--------|----------|
| `READONLY_RULESET` | Read-only — writes/edits/execute denied |
| `DEFAULT_RULESET` | Reads allowed (except secrets), writes/execute require `ask` |
| `PERMISSIVE_RULESET` | Allow most operations, deny only dangerous commands |
| `STRICT_RULESET` | All non-read operations require `ask` |

`SECRETS_PATTERNS` and `SYSTEM_PATTERNS` are reusable glob lists you can splice into custom rulesets via `create_ruleset()`.

### PermissionChecker

`PermissionChecker` is the runtime gate. It exposes `check_sync(operation, path)` returning the resolved action and integrates with the toolset via an `AskCallback` that fires for `ask` actions; if the callback is not wired, the action is recorded as `PermissionAskError` ([src/pydantic_ai_backends/permissions/checker.py:PermissionChecker](https://github.com/vstorm-co/pydantic-ai-backend/blob/main/src/pydantic_ai_backends/permissions/checker.py)).

## ConsoleCapability

`ConsoleCapability` is a `pydantic_ai.capabilities.AbstractCapability` that bundles the toolset, the system prompt, and the permission checker into a single object. It performs two distinct kinds of gating ([src/pydantic_ai_backends/capability.py:ConsoleCapability](https://github.com/vstorm-co/pydantic-ai-backend/blob/main/src/pydantic_ai_backends/capability.py)):

- `prepare_tools()` — removes tools whose operation is `deny` from the model-visible list.
- `before_tool_execute()` — runs a per-path `check_sync` for `path`/`file_path`-bound operations and raises `PermissionDeniedError` / `PermissionAskError` on a negative result.

The capability also exposes a serialization name (`"ConsoleCapability"`) for use with `AgentSpec` YAML/JSON, and a stable `get_toolset()` accessor.
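The two kinds of gating described in this section can be pictured as two separate functions. This is a conceptual sketch only: the policy tables, the `Denied` exception, and the function names are hypothetical and do not reflect the signatures of `prepare_tools()` or `before_tool_execute()`.

```python
class Denied(Exception):
    """Stand-in for a permission-denied error."""


# Hypothetical policy: a default action per operation, plus per-path denials.
OPERATION_DEFAULTS = {"read": "allow", "write": "deny", "execute": "deny"}
TOOL_OPERATIONS = {"read_file": "read", "write_file": "write", "execute": "execute"}
DENIED_PATH_SUFFIXES = (".env", ".pem")


def visible_tools(tool_names: list[str]) -> list[str]:
    """Phase 1: drop tools whose whole operation is denied."""
    return [t for t in tool_names if OPERATION_DEFAULTS[TOOL_OPERATIONS[t]] != "deny"]


def check_call(tool_name: str, args: dict[str, str]) -> None:
    """Phase 2: per-path check just before the tool runs."""
    path = args.get("path") or args.get("file_path")
    if path is not None and path.endswith(DENIED_PATH_SUFFIXES):
        raise Denied(f"{tool_name} blocked for {path}")
```

Hiding a tool keeps the model from even attempting a forbidden operation, while the per-call check covers the case where an operation is generally allowed but a specific path is not.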

## Usage example

```python
from pydantic_ai import Agent
from pydantic_ai_backends import ConsoleCapability, LocalBackend
from pydantic_ai_backends.permissions import READONLY_RULESET

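agent = Agent(
    "openai:gpt-4.1",
    capabilities=[
        ConsoleCapability(
            # File tools operate on the real filesystem under this root.
            backend=LocalBackend(root_dir="/workspace"),
            # Read-only preset: writes, edits, and execute are denied.
            permissions=READONLY_RULESET,
            # Let read_file return images as BinaryContent.
            image_support=True,
        )
    ],
)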
```

The same capability can be constructed without a permission ruleset to disable the gate entirely; the checker is short-circuited and every tool is offered to the model ([src/pydantic_ai_backends/capability.py:ConsoleCapability.before_tool_execute](https://github.com/vstorm-co/pydantic-ai-backend/blob/main/src/pydantic_ai_backends/capability.py)).

## Common failure modes

- **Windows line-ending corruption.** Early versions of `LocalBackend.write()` doubled `\r` because `Path.write_text()` runs in text mode on Windows. Fixed in 0.2.13 ([#51](https://github.com/vstorm-co/pydantic-ai-backend/issues/51)). Cross-platform consumers should still treat the backend as byte-stable and avoid assuming LF normalization.
- **Trailing-slash route mismatch in `CompositeBackend`.** Paths without a trailing `/` did not match `/foo/`-registered prefixes, causing silent fall-through to the default backend. Fixed in 0.2.6; if you maintain a composite backend, prefer the `AsyncCompositeBackend` added in 0.2.15 ([#57](https://github.com/vstorm-co/pydantic-ai-backend/pull/57)).
- **Async adapter private-method coupling.** `ensure_async()` historically delegated to a private `_read_bytes`, which breaks for wrapper backends that only expose public `read_bytes()`. See issue [#54](https://github.com/vstorm-co/pydantic-ai-backend/issues/54); resolved in 0.2.14 for the read path.

## See Also

- [Backends](https://github.com/vstorm-co/pydantic-ai-backend/blob/main/src/pydantic_ai_backends/backends) — `LocalBackend`, `StateBackend`, `CompositeBackend`, `AsyncCompositeBackend`
- [Sandboxes](https://github.com/vstorm-co/pydantic-ai-backend/blob/main/src/pydantic_ai_backends/backends) — `DockerSandbox`, `KubernetesPodSandbox`, `DaytonaSandbox`
- [Changelog](https://github.com/vstorm-co/pydantic-ai-backend/releases) — version-by-version notes
- [Examples](https://github.com/vstorm-co/pydantic-ai-backend/tree/main/examples) — `basic_capability.py`, `readonly_agent.py`, `multi_agent_permissions.py`

---

<a id='page-sandboxes'></a>

## Sandboxes, Runtimes & Session Management

### Related Pages

Related topics: [Project Overview & Architecture](#page-overview), [Backends & Protocol Reference](#page-backends)

<details>
<summary>Related Source Files</summary>

The following source files were used to generate this page:

- [src/pydantic_ai_backends/backends/base.py](https://github.com/vstorm-co/pydantic-ai-backend/blob/main/src/pydantic_ai_backends/backends/base.py)
- [src/pydantic_ai_backends/backends/docker/sandbox.py](https://github.com/vstorm-co/pydantic-ai-backend/blob/main/src/pydantic_ai_backends/backends/docker/sandbox.py)
- [src/pydantic_ai_backends/backends/docker/session.py](https://github.com/vstorm-co/pydantic-ai-backend/blob/main/src/pydantic_ai_backends/backends/docker/session.py)
- [src/pydantic_ai_backends/backends/docker/runtimes.py](https://github.com/vstorm-co/pydantic-ai-backend/blob/main/src/pydantic_ai_backends/backends/docker/runtimes.py)
- [src/pydantic_ai_backends/backends/daytona.py](https://github.com/vstorm-co/pydantic-ai-backend/blob/main/src/pydantic_ai_backends/backends/daytona.py)
- [src/pydantic_ai_backends/backends/kubernetes.py](https://github.com/vstorm-co/pydantic-ai-backend/blob/main/src/pydantic_ai_backends/backends/kubernetes.py)
- [src/pydantic_ai_backends/backends/local.py](https://github.com/vstorm-co/pydantic-ai-backend/blob/main/src/pydantic_ai_backends/backends/local.py)
- [examples/web_production/README.md](https://github.com/vstorm-co/pydantic-ai-backend/blob/main/examples/web_production/README.md)
</details>

# Sandboxes, Runtimes & Session Management

## Purpose and Scope

The sandbox subsystem provides **isolated execution environments** for AI agents. A plain `BackendProtocol` backend (e.g. `StateBackend`, `LocalBackend`) only stores and manipulates files, in memory or on the host filesystem; sandboxes additionally run shell commands and arbitrary code inside a separate runtime: a Docker container, a Kubernetes pod, a Daytona workspace, or the host shell via `LocalSandbox`.

Three concepts work together:

- **Sandboxes** — concrete environments that execute shell commands and expose a `BackendProtocol`-compatible file interface.
- **Runtimes** — declarative image/dependency presets (e.g. `python-datascience`, `node-react`) that materialise into a sandbox image.
- **Session Management** — `SessionManager` creates, reuses, and garbage-collects per-user sandboxes for multi-user web apps.

Source: [src/pydantic_ai_backends/backends/base.py:1-50]()
Source: [src/pydantic_ai_backends/backends/docker/session.py:1-30]()

## BaseSandbox and the Sandbox Hierarchy

All sandboxes derive from the `BaseSandbox` abstract class declared in `src/pydantic_ai_backends/backends/base.py`. It extends `ABC` and declares the synchronous shell/file contract that concrete implementations must provide (`start`, `stop`, `execute`, plus the `BackendProtocol` file operations such as `read`, `write`, `ls_info`, `grep_raw`, `glob_info`).
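The contract described above can be illustrated with a minimal sketch. `SketchSandbox` and `InMemorySandbox` are hypothetical names invented for this example, and the method signatures and return types are assumptions; the real `BaseSandbox` declares more operations and may use different types.

```python
from abc import ABC, abstractmethod


class SketchSandbox(ABC):
    """Hypothetical stand-in for the contract described above (not the library class)."""

    @abstractmethod
    def start(self) -> None: ...

    @abstractmethod
    def stop(self) -> None: ...

    @abstractmethod
    def execute(self, command: str) -> str: ...

    @abstractmethod
    def read(self, path: str) -> str: ...

    @abstractmethod
    def write(self, path: str, content: str) -> None: ...


class InMemorySandbox(SketchSandbox):
    """Toy implementation: files live in a dict and execute() only understands `cat`."""

    def __init__(self) -> None:
        self.files: dict[str, str] = {}
        self.running = False

    def start(self) -> None:
        self.running = True

    def stop(self) -> None:
        self.running = False

    def execute(self, command: str) -> str:
        if not self.running:
            raise RuntimeError("sandbox is not started")
        verb, _, arg = command.partition(" ")
        if verb == "cat":
            return self.read(arg)
        raise NotImplementedError(f"unsupported command: {verb}")

    def read(self, path: str) -> str:
        return self.files[path]

    def write(self, path: str, content: str) -> None:
        self.files[path] = content


def run_agent_step(sandbox: SketchSandbox) -> str:
    # Caller code depends only on the abstract contract, not on the concrete class.
    sandbox.write("/notes.txt", "hello")
    return sandbox.execute("cat /notes.txt")
```

Because `run_agent_step` is typed against the abstract class, any subclass that fills in the abstract methods can be passed in, and instantiating `SketchSandbox` directly raises `TypeError`.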

Concrete implementations shipped by the project:

| Class | File | Runtime target |
|---|---|---|
| `DockerSandbox` | `backends/docker/sandbox.py` | Docker container |
| `DaytonaSandbox` | `backends/daytona.py` | Daytona workspace |
| `KubernetesPodSandbox` | `backends/kubernetes.py` | Kubernetes pod (added in 0.2.12) |
| `LocalSandbox` | `backends/local.py` | Host shell (development only) |

Because every sandbox implements `BackendProtocol`, agents use the same `read` / `write` / `execute` calls whether they are talking to a container or the host.

Two `LocalBackend` changes are worth noting for host-side use. A Windows-specific bug in `LocalBackend.write()` / `edit()`, where `Path.write_text()` opened files in text mode and doubled `\r` when LLM output already contained `\r\n`, was fixed in 0.2.13 ([issue #51](https://github.com/vstorm-co/pydantic-ai-backend/issues/51)). The `async_execute()` method on `LocalBackend` (added in 0.2.7, [#36](https://github.com/vstorm-co/pydantic-ai-backend/pull/36)) uses `asyncio.create_subprocess_exec`, so cancelling the calling task immediately kills the subprocess.
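The carriage-return doubling can be reproduced on any platform by forcing the newline translation that Windows text mode applies. The sketch below uses only the standard library; the helper names are hypothetical, and it illustrates the failure mode rather than the fix that shipped in 0.2.13, which may differ.

```python
import io


def text_mode_write(content: str) -> bytes:
    # newline="\r\n" makes the text layer translate every "\n" to "\r\n" on write,
    # which is what default text mode does on Windows.
    raw = io.BytesIO()
    wrapper = io.TextIOWrapper(raw, encoding="utf-8", newline="\r\n")
    wrapper.write(content)
    wrapper.flush()
    data = raw.getvalue()
    wrapper.detach()  # release the buffer without closing it
    return data


def untranslated_write(content: str) -> bytes:
    # newline="" disables translation, so the bytes written match the input.
    raw = io.BytesIO()
    wrapper = io.TextIOWrapper(raw, encoding="utf-8", newline="")
    wrapper.write(content)
    wrapper.flush()
    data = raw.getvalue()
    wrapper.detach()
    return data


llm_output = "line one\r\nline two\r\n"
doubled = text_mode_write(llm_output)    # each "\r\n" becomes "\r\r\n"
clean = untranslated_write(llm_output)   # "\r\n" is preserved as-is
```

Writing bytes directly, or opening the file with `newline=""`, are two common ways to avoid the translation.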

Source: [src/pydantic_ai_backends/backends/base.py:1-80]()
Source: [src/pydantic_ai_backends/backends/local.py:1-30]()
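The cancellation behaviour attributed to `async_execute()` can be sketched with the standard library alone. `run_until_cancelled` is a hypothetical helper, not the library's implementation; it shows one way a coroutine built on `asyncio.create_subprocess_exec` can make sure the child process does not outlive a cancelled task.

```python
import asyncio
import sys


async def run_until_cancelled(argv: list[str]) -> int:
    proc = await asyncio.create_subprocess_exec(*argv)
    try:
        return await proc.wait()
    except asyncio.CancelledError:
        # The calling task was cancelled: kill the child and reap it before re-raising.
        proc.kill()
        await proc.wait()
        raise


async def demo() -> float:
    loop = asyncio.get_running_loop()
    started = loop.time()
    # A child process that would otherwise sleep for 30 seconds.
    argv = [sys.executable, "-c", "import time; time.sleep(30)"]
    task = asyncio.create_task(run_until_cancelled(argv))
    await asyncio.sleep(0.5)  # give the child time to start
    task.cancel()
    try:
        await task
    except asyncio.CancelledError:
        pass
    return loop.time() - started  # far less than 30 s if the child was killed
```

Running `asyncio.run(demo())` should return well before the 30-second sleep would have finished.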

## Runtime Configurations

A `RuntimeConfig` declares the base image, packages, package manager, and working directory for a sandbox. Built-in runtimes live in `src/pydantic_ai_backends/backends/docker/runtimes.py`:

- `python-datascience` — Python with NumPy, pandas, scikit-learn, matplotlib, seaborn.
- `python-web` — Python with FastAPI, SQLAlchemy, httpx, uvicorn.
- `node-minimal` — Clean Node.js 20.
- `node-react` — Node.js 20 with TypeScript, Vite, React.

`get_runtime(name)` resolves a name into a `RuntimeConfig`, raising `KeyError` with an "Available: …" message when the name is unknown. Constructing a `DockerSandbox` with a runtime name bakes the declared dependencies into a generated Dockerfile on first start.
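The name lookup and Dockerfile generation described above might look roughly like the following. Everything here is a hypothetical sketch: `SketchRuntime`, its field names, the base image tags, and `render_dockerfile` are assumptions made for illustration and are not taken from `runtimes.py`.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SketchRuntime:
    """Hypothetical stand-in for RuntimeConfig; field names are assumptions."""
    name: str
    base_image: str
    packages: tuple[str, ...] = ()
    package_manager: str = "pip"
    work_dir: str = "/workspace"


# Two illustrative entries; the image tags are assumed, not the project's.
SKETCH_RUNTIMES = {
    "python-datascience": SketchRuntime(
        name="python-datascience",
        base_image="python:3.12-slim",
        packages=("numpy", "pandas", "scikit-learn", "matplotlib", "seaborn"),
    ),
    "node-minimal": SketchRuntime(
        name="node-minimal", base_image="node:20-slim", package_manager="npm"
    ),
}


def lookup_runtime(name: str) -> SketchRuntime:
    try:
        return SKETCH_RUNTIMES[name]
    except KeyError:
        # List the valid names so a typo is easy to correct.
        available = ", ".join(sorted(SKETCH_RUNTIMES))
        raise KeyError(f"Unknown runtime {name!r}. Available: {available}") from None


def render_dockerfile(runtime: SketchRuntime) -> str:
    lines = [f"FROM {runtime.base_image}", f"WORKDIR {runtime.work_dir}"]
    if runtime.packages:
        joined = " ".join(runtime.packages)
        if runtime.package_manager == "pip":
            lines.append(f"RUN pip install --no-cache-dir {joined}")
        else:
            lines.append(f"RUN npm install -g {joined}")
    return "\n".join(lines) + "\n"
```

Keeping the runtime declarative means the same config can be validated, cached by name, and rendered to a Dockerfile without touching the Docker daemon.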

When generating Dockerfiles, `DockerSandbox` enforces strict validation: package names match `^[A-Za-z0-9@][A-Za-z0-9._@/+=<>~!\[\]-]*$` and environment-variable names match `^[A-Za-z_][A-Za-z0-9_]*$`. Values containing shell metacharacters (`` ;&|`$()<>\n\r ``) are rejected outright by `_reject_metacharacters`, preventing an LLM-supplied value from breaking out of a `RUN` instruction.

Source: [src/pydantic_ai_backends/backends/docker/runtimes.py:1-120]()
Source: [src/pydantic_ai_backends/backends/docker/sandbox.py:30-90]()
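The validation rules above can be exercised with a short sketch. The two patterns and the metacharacter set are copied from the text; the function names `check_package` and `check_env`, and the choice to apply the metacharacter guard to environment values, are assumptions made for this example rather than the behaviour of `_reject_metacharacters` itself.

```python
import re

# Patterns quoted in the text above.
PACKAGE_RE = re.compile(r"^[A-Za-z0-9@][A-Za-z0-9._@/+=<>~!\[\]-]*$")
ENV_NAME_RE = re.compile(r"^[A-Za-z_][A-Za-z0-9_]*$")
# Characters listed in the text above as shell metacharacters.
SHELL_METACHARACTERS = set(";&|`$()<>\n\r")


def check_package(name: str) -> str:
    # fullmatch (rather than match) also rejects a trailing newline,
    # which "$" alone would tolerate.
    if not PACKAGE_RE.fullmatch(name):
        raise ValueError(f"invalid package name: {name!r}")
    return name


def check_env(name: str, value: str) -> tuple[str, str]:
    if not ENV_NAME_RE.fullmatch(name):
        raise ValueError(f"invalid environment variable name: {name!r}")
    bad = sorted(SHELL_METACHARACTERS.intersection(value))
    if bad:
        raise ValueError(f"environment value for {name} contains {bad!r}")
    return name, value
```

Version specifiers such as `pandas>=2.0` and scoped npm names such as `@types/node` pass the package pattern, while anything containing a space, a semicolon, or a leading `-` does not.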

## Session Management

`SessionManager` (in `src/pydantic_ai_backends/backends/docker/session.py`) is the multi-user coordinator. It keeps a dictionary of `session_id → sandbox` and exposes:

- `await get_or_create(session_id)`: returns the existing sandbox for a returning user, or calls the factory to spin up a new one.
- `await release(session_id)`: explicit teardown.
- `await cleanup_idle(max_idle=1800)`: reaper that stops sandboxes idle longer than `max_idle` seconds.

By default the factory returns a `DockerSandbox` built from `default_runtime`; passing a `sandbox_factory` callable lets you plug in `DaytonaSandbox`, `KubernetesPodSandbox`, or any custom subclass — the contract is simply `start()` / `stop()` / `is_alive()` plus a `_last_activity` attribute.

```python
import asyncio

from pydantic_ai_backends import DaytonaSandbox, SessionManager

def daytona_factory(session_id: str) -> DaytonaSandbox:
    # One Daytona sandbox per session ID.
    return DaytonaSandbox(sandbox_id=session_id)

async def main() -> None:
    manager = SessionManager(sandbox_factory=daytona_factory)
    sandbox = await manager.get_or_create("user-123")  # created once, then reused
    try:
        result = sandbox.execute("python script.py")
    finally:
        await manager.release("user-123")  # explicit teardown, even on failure

asyncio.run(main())
```

The web production example (`examples/web_production/`) wires `SessionManager` straight into a FastAPI server so each browser session owns an isolated Docker container.

Source: [src/pydantic_ai_backends/backends/docker/session.py:30-150]()
Source: [examples/web_production/README.md:1-30]()
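The bookkeeping behind `get_or_create`, `release`, and `cleanup_idle` can be sketched in a few dozen lines. `FakeSandbox` and `SketchSessionManager` are hypothetical classes written for this example; the use of `time.monotonic()`, the return values, and the restart-if-dead behaviour are assumptions and may not match `session.py`.

```python
import asyncio
import time
from typing import Callable


class FakeSandbox:
    """Hypothetical sandbox exposing start/stop/is_alive and _last_activity."""

    def __init__(self, session_id: str) -> None:
        self.session_id = session_id
        self.alive = False
        self._last_activity = time.monotonic()

    def start(self) -> None:
        self.alive = True

    def stop(self) -> None:
        self.alive = False

    def is_alive(self) -> bool:
        return self.alive


class SketchSessionManager:
    def __init__(self, sandbox_factory: Callable[[str], FakeSandbox]) -> None:
        self._factory = sandbox_factory
        self._sessions: dict[str, FakeSandbox] = {}

    async def get_or_create(self, session_id: str) -> FakeSandbox:
        sandbox = self._sessions.get(session_id)
        if sandbox is None or not sandbox.is_alive():
            sandbox = self._factory(session_id)
            sandbox.start()
            self._sessions[session_id] = sandbox
        sandbox._last_activity = time.monotonic()  # every access counts as activity
        return sandbox

    async def release(self, session_id: str) -> bool:
        sandbox = self._sessions.pop(session_id, None)
        if sandbox is None:
            return False
        sandbox.stop()
        return True

    async def cleanup_idle(self, max_idle: float = 1800) -> int:
        now = time.monotonic()
        idle = [sid for sid, sb in self._sessions.items()
                if now - sb._last_activity > max_idle]
        for sid in idle:
            await self.release(sid)
        return len(idle)  # number of sandboxes stopped


async def demo() -> int:
    manager = SketchSessionManager(FakeSandbox)
    first = await manager.get_or_create("user-1")
    again = await manager.get_or_create("user-1")
    assert first is again  # a returning user gets the same sandbox
    stale = await manager.get_or_create("user-2")
    stale._last_activity -= 3600  # pretend user-2 has been idle for an hour
    return await manager.cleanup_idle(max_idle=1800)
```

In this sketch only `user-2` exceeds the 1800-second threshold, so `asyncio.run(demo())` stops a single sandbox and leaves `user-1` running.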

## Architecture Overview

```mermaid
graph TD
    Agent[AI Agent] --> Tools[ConsoleToolset]
    Tools -->|execute / read / write| SB[BaseSandbox]
    SB --> DS[DockerSandbox]
    SB --> DT[DaytonaSandbox]
    SB --> KS[KubernetesPodSandbox]
    SB --> LS[LocalSandbox]
    DS -.uses.-> RC[RuntimeConfig]
    RC -.python-datascience / node-react / ... .-> IMG[Docker Image]
    SM[SessionManager] -->|factory| DS
    SM -->|factory| DT
    SM -->|factory| KS
    SM -->|cleanup_idle| SB
```

## Common Failure Modes

1. **Unknown runtime name**: `get_runtime("py-ds")` raises `KeyError`. Use one of the names listed in `BUILTIN_RUNTIMES`.
2. **Daemon unreachable**: `DockerSandbox.start()` requires a running Docker daemon; verify with `docker ps`.
3. **Host bind-mount UID mismatch**: configure a `user=` field on `RuntimeConfig` if files written by the container are owned by root on the host.
4. **Idle sessions leaking resources**: schedule a periodic `await manager.cleanup_idle(max_idle=1800)` task on the event loop.
5. **Shell metacharacters in LLM-supplied package names or values**: `DockerSandbox` rejects them via `_reject_metacharacters`; do not bypass this guard when constructing images manually.
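For item 4, a periodic reaper can be scheduled as a background task. The sketch below assumes `cleanup_idle` is awaitable, as that item suggests; `StubManager` and `reap_forever` are hypothetical names, with the stub standing in for a real `SessionManager` so the loop can be run in isolation.

```python
import asyncio


class StubManager:
    """Hypothetical stand-in for SessionManager that only counts reaper calls."""

    def __init__(self) -> None:
        self.calls = 0

    async def cleanup_idle(self, max_idle: float = 1800) -> int:
        self.calls += 1
        return 0


async def reap_forever(manager: StubManager, interval: float, max_idle: float) -> None:
    while True:
        await asyncio.sleep(interval)
        try:
            await manager.cleanup_idle(max_idle=max_idle)
        except Exception as exc:  # keep the loop alive if one sweep fails
            print(f"cleanup_idle failed: {exc!r}")


async def demo() -> int:
    manager = StubManager()
    reaper = asyncio.create_task(reap_forever(manager, interval=0.01, max_idle=1800))
    await asyncio.sleep(0.1)  # the application would serve requests here
    reaper.cancel()  # stop the loop on shutdown
    try:
        await reaper
    except asyncio.CancelledError:
        pass
    return manager.calls
```

In a web application the task would typically be created at startup and cancelled at shutdown, with an interval of minutes rather than milliseconds.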

## See Also

- [Backends Overview](backends.md)
- [Docker Sandbox API](docker.md)
- [Kubernetes Sandbox API](kubernetes.md)
- [Daytona Sandbox API](daytona.md)
- [Multi-User App Example](../examples/multi-user.md)
- [Web Production Example](https://github.com/vstorm-co/pydantic-ai-backend/tree/main/examples/web_production)
- Release notes: [0.2.12 (KubernetesPodSandbox)](https://github.com/vstorm-co/pydantic-ai-backend/releases/tag/0.2.12), [0.2.13 (Windows CRLF fix)](https://github.com/vstorm-co/pydantic-ai-backend/releases/tag/0.2.13), [0.2.15 (AsyncCompositeBackend)](https://github.com/vstorm-co/pydantic-ai-backend/releases/tag/0.2.15)

---


## Pitfall Log

Project: vstorm-co/pydantic-ai-backend

Summary: Found 20 structured pitfall items, none of them high-severity or blocking. Top priority: installation risk, which requires verification.

## 1. Installation risk - Installation risk requires verification

- Severity: medium
- Evidence strength: source_linked
- Finding: Developers should review release 0.2.10 for installation risk before relying on the project.
- User impact: Upgrading or migrating to 0.2.10 may change expected behavior.
- Evidence: failure_mode_cluster:github_release | https://github.com/vstorm-co/pydantic-ai-backend/releases/tag/0.2.10

## 2. Installation risk - Installation risk requires verification

- Severity: medium
- Evidence strength: source_linked
- Finding: Developers should review release 0.2.12 for installation risk before relying on the project.
- User impact: Upgrading or migrating to 0.2.12 may change expected behavior.
- Evidence: failure_mode_cluster:github_release | https://github.com/vstorm-co/pydantic-ai-backend/releases/tag/0.2.12

## 3. Installation risk - Installation risk requires verification

- Severity: medium
- Evidence strength: source_linked
- Finding: Developers should review release 0.2.9 for installation risk before relying on the project.
- User impact: Upgrading or migrating to 0.2.9 may change expected behavior.
- Evidence: failure_mode_cluster:github_release | https://github.com/vstorm-co/pydantic-ai-backend/releases/tag/0.2.9

## 4. Installation risk - Installation risk requires verification

- Severity: medium
- Evidence strength: source_linked
- Finding: Developers should review issue #51 ("Bug Report: `LocalBackend.write()` doubles carriage returns on Windows") for installation risk before relying on the project.
- User impact: Developers affected by this issue may fail before the first successful local run.
- Evidence: failure_mode_cluster:github_issue | https://github.com/vstorm-co/pydantic-ai-backend/issues/51

## 5. Installation risk - Installation risk requires verification

- Severity: medium
- Evidence strength: source_linked
- Finding: Developers should review issue #41 ("Dependency Dashboard") for installation risk before relying on the project.
- User impact: Developers affected by this issue may fail before the first successful local run.
- Evidence: failure_mode_cluster:github_issue | https://github.com/vstorm-co/pydantic-ai-backend/issues/41

## 6. Installation risk - Installation risk requires verification

- Severity: medium
- Evidence strength: source_linked
- Finding: Project evidence flags an installation risk. Review the linked source before relying on this workflow.
- User impact: May increase setup, validation, or first-run risk for the user.
- Evidence: community_evidence:github | https://github.com/vstorm-co/pydantic-ai-backend/issues/51

## 7. Installation risk - Installation risk requires verification

- Severity: medium
- Evidence strength: source_linked
- Finding: Project evidence flags an installation risk. Review the linked source before relying on this workflow.
- User impact: May increase setup, validation, or first-run risk for the user.
- Evidence: community_evidence:github | https://github.com/vstorm-co/pydantic-ai-backend/issues/41

## 8. Configuration risk - Configuration risk requires verification

- Severity: medium
- Evidence strength: source_linked
- Finding: Developers should review release 0.2.8 for configuration risk before relying on the project.
- User impact: Upgrading or migrating to 0.2.8 may change expected behavior.
- Evidence: failure_mode_cluster:github_release | https://github.com/vstorm-co/pydantic-ai-backend/releases/tag/0.2.8

## 9. Capability evidence risk - Capability evidence risk requires verification

- Severity: medium
- Evidence strength: source_linked
- Finding: README/documentation is current enough for a first validation pass.
- User impact: May increase setup, validation, or first-run risk for the user.
- Evidence: capability.assumptions | https://github.com/vstorm-co/pydantic-ai-backend

## 10. Runtime risk - Runtime risk requires verification

- Severity: medium
- Evidence strength: source_linked
- Finding: Developers should review release 0.2.14 for runtime risk before relying on the project.
- User impact: Upgrading or migrating to 0.2.14 may change expected behavior.
- Evidence: failure_mode_cluster:github_release | https://github.com/vstorm-co/pydantic-ai-backend/releases/tag/0.2.14

## 11. Runtime risk - Runtime risk requires verification

- Severity: medium
- Evidence strength: source_linked
- Finding: Developers should review release 0.2.15 for runtime risk before relying on the project.
- User impact: Upgrading or migrating to 0.2.15 may change expected behavior.
- Evidence: failure_mode_cluster:github_release | https://github.com/vstorm-co/pydantic-ai-backend/releases/tag/0.2.15

## 12. Runtime risk - Runtime risk requires verification

- Severity: medium
- Evidence strength: source_linked
- Finding: Developers should review release 0.2.7 for runtime risk before relying on the project.
- User impact: Upgrading or migrating to 0.2.7 may change expected behavior.
- Evidence: failure_mode_cluster:github_release | https://github.com/vstorm-co/pydantic-ai-backend/releases/tag/0.2.7

## 13. Maintenance risk - Maintenance risk requires verification

- Severity: medium
- Evidence strength: source_linked
- Finding: Developers should review release 0.2.6 for migration risk before relying on the project.
- User impact: Upgrading or migrating to 0.2.6 may change expected behavior.
- Evidence: failure_mode_cluster:github_release | https://github.com/vstorm-co/pydantic-ai-backend/releases/tag/0.2.6

## 14. Maintenance risk - Maintenance risk requires verification

- Severity: medium
- Evidence strength: source_linked
- Finding: Project evidence flags a maintenance risk. Review the linked source before relying on this workflow.
- User impact: May increase setup, validation, or first-run risk for the user.
- Evidence: evidence.maintainer_signals | https://github.com/vstorm-co/pydantic-ai-backend

## 15. Security or permission risk - Security or permission risk requires verification

- Severity: medium
- Evidence strength: source_linked
- Finding: No demo is available (`no_demo`).
- User impact: May increase setup, validation, or first-run risk for the user.
- Evidence: downstream_validation.risk_items | https://github.com/vstorm-co/pydantic-ai-backend

## 16. Security or permission risk - Security or permission risk requires verification

- Severity: medium
- Evidence strength: source_linked
- Finding: No demo is available (`no_demo`).
- User impact: May increase setup, validation, or first-run risk for the user.
- Evidence: risks.scoring_risks | https://github.com/vstorm-co/pydantic-ai-backend

## 17. Maintenance risk - Maintenance risk requires verification

- Severity: low
- Evidence strength: source_linked
- Finding: `issue_or_pr_quality` is unknown.
- User impact: May increase setup, validation, or first-run risk for the user.
- Evidence: evidence.maintainer_signals | https://github.com/vstorm-co/pydantic-ai-backend

## 18. Maintenance risk - Maintenance risk requires verification

- Severity: low
- Evidence strength: source_linked
- Finding: `release_recency` is unknown.
- User impact: May increase setup, validation, or first-run risk for the user.
- Evidence: evidence.maintainer_signals | https://github.com/vstorm-co/pydantic-ai-backend

## 19. Maintenance risk - Maintenance risk requires verification

- Severity: low
- Evidence strength: source_linked
- Finding: Developers should review release 0.2.11 for maintenance risk before relying on the project.
- User impact: Upgrading or migrating to 0.2.11 may change expected behavior.
- Evidence: failure_mode_cluster:github_release | https://github.com/vstorm-co/pydantic-ai-backend/releases/tag/0.2.11

## 20. Maintenance risk - Maintenance risk requires verification

- Severity: low
- Evidence strength: source_linked
- Finding: Developers should review release 0.2.13 for maintenance risk before relying on the project.
- User impact: Upgrading or migrating to 0.2.13 may change expected behavior.
- Evidence: failure_mode_cluster:github_release | https://github.com/vstorm-co/pydantic-ai-backend/releases/tag/0.2.13

<!-- canonical_name: vstorm-co/pydantic-ai-backend; human_manual_source: deepwiki_human_wiki -->