headroom Manual - Doramagic.ai

Doramagic Project Pack · Human Manual

headroom

Compress tool outputs, logs, files, and RAG chunks before they reach the LLM. 60-95% fewer tokens, same answers. Library, proxy, MCP server.

Overview

Headroom is a context-compression layer for LLM-powered software. It sits between an agent (or application) and a model provider, shrinking tool outputs, file reads, log streams, and prior conversation turns before they reach the model, and shipping the compressed prompt to Anthropic, OpenAI, Bedrock, or any compatible endpoint. The goal is fewer tokens, lower cost, and longer-running coding sessions without losing information that matters — errors, anomalies, and structural context are preserved while bulk noise is removed. Source: README.md:1-120

What Headroom Does

At a high level, Headroom is built around four ideas:

ContentRouter — detects whether a payload is structured JSON, source code, or natural-language prose, and picks the right compressor for it. Source: README.md:60-75
SmartCrusher / CodeCompressor / Kompress-base — a family of compressors. SmartCrusher handles generic JSON, CodeCompressor works on AST-shaped tool output, and the Kompress-base model (hosted on HuggingFace) shrinks prose. Source: README.md:65-72
CacheAligner — stabilizes prompt prefixes so provider KV caches actually hit, giving compounding savings across turns. Source: README.md:73-75
CCR (Compressible Context Retrieval) — stores originals locally and exposes a headroom_retrieve MCP tool, so any content the LLM needs back is recoverable rather than discarded. Source: README.md:76-78

Headroom also relies on the RTK binary to rewrite shell output (git show --short, scoped ls, summarized installers) before it ever reaches the proxy, and can alternatively use lean-ctx as the CLI context tool via HEADROOM_CONTEXT_TOOL=lean-ctx. Source: README.md:1-30

Architecture

Headroom is a polyglot system: a Rust core (headroom-core) holds the hot-path compression logic, a Python package wraps it via PyO3 and exposes the CLI, and a TypeScript SDK targets Vercel AI SDK, OpenAI, and Anthropic clients. The Rust core is reachable from Python through crates/headroom-py/src/lib.rs, which exposes content_has_error_indicators, the keyword_registry_snapshot, and a search_compressor bridge that consumes the signals::LineImportanceDetector trait. Source: crates/headroom-py/src/lib.rs:1-60

```mermaid
flowchart LR
    A[Agent / App] -->|HTTP / SDK| B[Headroom Proxy]
    B --> C[ContentRouter]
    C --> D[SmartCrusher]
    C --> E[CodeCompressor]
    C --> F[Kompress-base]
    B --> G[CacheAligner]
    B --> H[CCR Store]
    B -->|compressed prompt| I[LLM Provider]
    H -.->|on demand| A
```

The TypeScript SDK speaks four wire formats and always converts them to OpenAI-style messages before handing them to the proxy. Detection is structural rather than heuristic: Gemini is identified by the parts field, Vercel AI SDK by hyphenated part types (tool-call, tool-result), Anthropic by underscored block types (tool_use, tool_result), and OpenAI by tool_calls / tool_call_id. Source: sdk/typescript/src/utils/format.ts:1-60

Integration Modes

Headroom can be adopted at three levels of intrusiveness, all documented in the README: wrapping an agent CLI, running a sidecar proxy, or calling the library inline. Source: README.md:80-100

| Mode | Entry point | When to use |
| --- | --- | --- |
| Wrap | `headroom wrap <agent>` | Launching a CLI coding agent (Claude, Codex, …) with routing through the proxy |
| Proxy | `headroom proxy --port 8787` | Drop-in sidecar for any HTTP-speaking client, zero code change |
| Library | `from headroom import compress` | Python projects that want inline compression in their own pipeline |

The wrap command family is implemented in headroom/cli/wrap.py. It snapshots the agent's config file before mutating it, installs the chosen CLI context tool (RTK or lean-ctx), injects Headroom proxy instructions into the agent's AGENTS.md, and registers the Headroom MCP server so the agent can call headroom_retrieve on compression markers. All edits are wrapped in markers so headroom unwrap <agent> can restore the user's pre-wrap state byte-for-byte. Source: headroom/cli/wrap.py:1-200

For production deployment, the headroom install group manages persistent profiles: Docker-backed presets, supervised systemd/launchd services, and detached agents. A profile carries a DeploymentManifest that the planner builds, the runtime starts, and the supervisor keeps alive; headroom install re-uses the same manifest format across all three. Source: headroom/cli/install.py:1-80

Configuration and Extensibility

Two canonical roots are honored everywhere: HEADROOM_CONFIG_DIR (read-mostly configuration) and HEADROOM_WORKSPACE_DIR (read-write state such as the savings log, TOIN store, and subscription state). Per-resource env vars override these, and explicit arguments override env vars — a precedence the TypeScript SDK mirrors in paths.ts. Source: sdk/typescript/src/paths.ts:1-50

The TypeScript SDK ships extension points that match the Python ones. CompressionHooks exposes preCompress, computeBiases (per-message compression bias, where values >1 preserve more and <1 compress more), and postCompress — the last of which is observe-only. Source: sdk/typescript/src/hooks.ts:1-70

Errors are typed: HeadroomError is the root, with HeadroomConnectionError, HeadroomAuthError, HeadroomCompressError (carrying statusCode and errorType), ConfigurationError, ProviderError, and StorageError defined for catchable failure modes. Source: sdk/typescript/src/errors.ts:1-60
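The hierarchy above can be pictured with a short Python sketch. It is illustrative only: the class names follow the list above (a subset is shown), but direct inheritance from the root, the snake_case attribute names status_code and error_type, the docstring meanings, and the describe_failure helper are all assumptions made for this example, not the SDK's actual definitions.

```python
from __future__ import annotations


class HeadroomError(Exception):
    """Root of the hierarchy; catching it catches every Headroom failure."""


class HeadroomConnectionError(HeadroomError):
    """Assumed meaning: the proxy could not be reached."""


class HeadroomAuthError(HeadroomError):
    """Assumed meaning: credentials were rejected."""


class HeadroomCompressError(HeadroomError):
    """Compression failure carrying an HTTP status and an error-type tag."""

    def __init__(self, message: str, status_code: int, error_type: str) -> None:
        super().__init__(message)
        self.status_code = status_code  # assumed twin of the TS statusCode
        self.error_type = error_type    # assumed twin of the TS errorType


def describe_failure(exc: HeadroomError) -> str:
    """Hypothetical helper: branch from the most specific class to the root."""
    if isinstance(exc, HeadroomCompressError):
        return f"compress failed ({exc.status_code}, {exc.error_type})"
    if isinstance(exc, HeadroomConnectionError):
        return "proxy unreachable"
    if isinstance(exc, HeadroomAuthError):
        return "authentication rejected"
    return "unexpected Headroom error"
```

Checking the most specific classes first keeps the root as a catch-all, which is the usual reason for giving a typed hierarchy a single base class.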

Pattern learning lives in TOIN (Tool Optimization Intelligence Network), exposed through ToolPattern, TOINStats, and TOINPattern types. Per-request hinting was retired in PR-B5; recommendations are now published offline via python -m headroom.cli.toin_publish into a sorted recommendations.toml that the Rust proxy loads once at startup from $HEADROOM_RECOMMENDATIONS_PATH. Source: headroom/cli/toin_publish.py:1-90, sdk/typescript/src/types/models.ts:1-60

Real-World Savings and Community Direction

The LangChain demo reports 74% aggregate token savings across five verbose tools, with 100% preservation of ERROR entries and 100% anomaly detection on the test runs. Source: examples/langchain_demo/README.md:1-50

The community is actively shaping the next integrations. Open issues request headroom wrap opencode (a CLI wrapper for the 30k-star OpenCode assistant) and a deeper headroom-opencode npm plugin for plugin-aware context compression and a stats dashboard. Other requests include Hermes agent support and Homebrew packaging. Issue #488 specifically asks for a Copilot CLI subscription mode that compresses client-side before Copilot packages prompts — a direction the v0.23.0 release partially addresses with the new "GitHub Copilot subscription mode through Headroom" feature, alongside a CCR fix that scopes proactive expansion by workspace to stop cross-project leaks. Source: community issues #74, #76, #488, #526, #527

Src

The Headroom repository is a polyglot project that ships a context-compression layer for AI agents. Its source is split across three language trees (a Python CLI and library, a TypeScript SDK, and a Rust core exposed to Python through PyO3), together with an examples/ directory that documents how each integration is meant to be driven. This page walks through the layout, what each module is responsible for, and how they cooperate at runtime.

Repository Layout

| Tree | Purpose | Entry points |
| --- | --- | --- |
| `headroom/cli/` | User-facing Click commands (`wrap`, `init`, `install`, `toin_publish`) | headroom/cli/wrap.py, headroom/cli/init.py, headroom/cli/install.py, headroom/cli/toin_publish.py |
| `sdk/typescript/src/` | Node/TS client SDK with provider adapters, errors, hooks, and path helpers | sdk/typescript/src/client.ts, sdk/typescript/src/errors.ts, sdk/typescript/src/hooks.ts, sdk/typescript/src/paths.ts |
| `crates/headroom-py/src/` | Rust implementations of log/search/etc. compressors, exposed as a Python extension | crates/headroom-py/src/lib.rs |
| `examples/` | Runnable demos (LangChain, MCP, Strands+Bedrock, raw provider clients) | examples/README.md, examples/langchain_demo/README.md |

The README positions Headroom as a "context compression layer for AI agents" with three delivery shapes — library, proxy, and MCP — and advertises 60–95% token savings, six compression algorithms, and a local-first, reversible design (README.md).

Python CLI (`headroom/cli/`)

The cli/ tree is the orchestration layer that most users meet first. Each file owns one Click sub-command and its helpers.

`wrap.py` — Launch agents through the proxy

headroom wrap <agent> is the agent-bootstrapper. It edits the target agent's config so its API calls are routed through the local Headroom proxy, optionally registers the Headroom MCP retrieve tool, and configures a CLI context rewriter (RTK or lean-ctx). For example, headroom wrap codex snapshots ~/.codex/config.toml into a .headroom-backup file, injects a model_provider = "headroom" block and a Headroom MCP server entry, and can be undone exactly via headroom unwrap codex (headroom/cli/wrap.py). The same module also injects RTK shell-rewriting instructions into the project's AGENTS.md and the global ~/.codex/AGENTS.md so Codex learns to use rtk git, rtk gh, rtk docker etc. (headroom/cli/wrap.py). This is the same surface that community issue #74 proposes to extend to OpenCode, and the structure of the codex markers (_CODEX_TOP_LEVEL_MARKER, _CODEX_MCP_MARKER, _MEMORY_MCP_MARKER) makes it clear that a sibling wrap opencode would plug in here.

`init.py` — Provider and hook initialization

init provisions Headroom's *init* provider inside supported agents — currently claude, copilot, codex, and openclaw — and emits headroom init hook ensure invocations so the agent's hook chain compresses requests at the source (headroom/cli/init.py). The module distinguishes local targets (claude, codex) from global ones and is the natural extension point for any new wrapper requested by the community, including the OpenCode wrapper in #74, the Hermes agent integration in #526, and the Copilot subscription-mode work in #488.

`install.py` — Persistent deployments

For long-running setups, headroom install ships a deployment manifest (preset, scope, supervisor kind, runtime kind) that can be rendered into a foreground process, a detached agent, a supervised service, or a persistent Docker container (headroom/cli/install.py). wait_ready() blocks until the deployment reports healthy, which matters because the CLI then hands the user a stable proxy URL to point their agents at.

`toin_publish.py` — Offline recommendation export

TOIN learns which compression strategy works best for each (auth_mode, model_family, structure_hash) slice. toin_publish.py walks the on-disk TOIN store, filters by --min-observations, and emits a deterministic recommendations.toml that the Rust proxy loads once at startup via $HEADROOM_RECOMMENDATIONS_PATH (headroom/cli/toin_publish.py). The module deliberately documents that this is an *offline* deploy-time step — per-request mutation of the recommendation set was retired in PR-B5 — and the TOML output is sorted so successive publishes diff cleanly.

TypeScript SDK (`sdk/typescript/src/`)

The npm package is an HTTP client for the proxy, plus a small set of parity modules that mirror the Python API.

Errors. errors.ts mirrors headroom.exceptions with HeadroomError, HeadroomConnectionError, HeadroomAuthError, HeadroomCompressError, ConfigurationError, ProviderError, and StorageError (sdk/typescript/src/errors.ts). HeadroomCompressError carries the HTTP statusCode and an errorType discriminator so callers can branch on compression failures.
Hooks. hooks.ts exposes a CompressionHooks base class with preCompress, computeBiases, and postCompress lifecycle methods, plus CompressContext and CompressEvent shapes that match the Python headroom.hooks module (sdk/typescript/src/hooks.ts). postCompress is observe-only — by design, it cannot mutate the compressed result.
Paths. paths.ts is a parity shell for headroom/paths.py. It defines HEADROOM_CONFIG_DIR and HEADROOM_WORKSPACE_DIR (and the per-resource env vars) but returns the empty string in browser contexts, signalling that file-system helpers are Node-only (sdk/typescript/src/paths.ts).
Adapters & client. Provider adapters (adapters/anthropic.ts, adapters/openai.ts, adapters/gemini.ts, adapters/vercel-ai.ts) wrap each SDK's request shape into a Headroom-friendly payload; the central client.ts POSTs to the proxy and returns the decompressed response.

Rust Core (`crates/headroom-py/src/lib.rs`)

Performance-sensitive transforms (log compression, search compression, error-indicator detection, keyword registries) live in Rust and are bridged into Python with PyO3. lib.rs registers functions such as content_has_error_indicators, exposes the default keyword registry as a PyDict snapshot, and defines a RustLogConfig / LogCompressionResult pair that mirrors the Python LogCompressor knobs (max_errors, keep_first_error, enable_ccr, min_compression_ratio_for_ccr, etc.) (crates/headroom-py/src/lib.rs). The search_compressor bridge deliberately accepts a CCR-persistence callback rather than holding a long-lived CompressionStore reference, keeping the Rust crate free of Python-owned state.

Examples & Demos

examples/README.md indexes runnable scripts that double as integration tests: basic_usage.py, anthropic_example.py, streaming_example.py, smart_vs_naive_eval.py, real_world_eval.py, real_world_openai_eval.py, and the larger langchain_demo/, mcp_demo/, strands_bedrock_demo/ directories (examples/README.md). The LangChain demo is the most cited — its run_comparison reports 74% aggregate token savings on five tool families and 100% preservation of ERROR entries (examples/langchain_demo/README.md). Examples are intentionally runnable from the repo root with PYTHONPATH=. so they exercise the same code paths the production CLI does.

Component Relationships

```mermaid
flowchart LR
    User --> CLI[headroom/cli/*<br/>Click commands]
    CLI --> Proxy[Headroom Proxy<br/>Rust core]
    Proxy --> SDK[TypeScript SDK<br/>sdk/typescript/src]
    Proxy --> PyExt[headroom-py<br/>PyO3 extension]
    CLI --> Store[(TOIN store<br/>recommendations.toml)]
    Examples[examples/*] --> SDK
    Examples --> PyExt
```

The CLI is the only surface that mutates user-facing config files and emits recommendations.toml; the proxy and PyO3 extension are the runtime hot path; and the examples sit on top of both to validate end-to-end behavior.

SDK

The Headroom SDK is the language-level surface that lets application code compress and route LLM context without standing up the full proxy. The project ships first-class libraries for both Python (headroom-ai on PyPI) and TypeScript / JavaScript (headroom-ai on npm), exposing the same compress(messages) primitive alongside higher-level adapters for popular agent frameworks. As described in the top-level README, the SDK is one of four entry points into Headroom — alongside the proxy, the headroom wrap CLI, and the MCP server — and is positioned as the "inline" integration path. Source: README.md

High-Level Architecture

The TypeScript SDK is, by design, a thin HTTP client that talks to a running Headroom proxy, plus a set of adapters that slot into the messaging conventions of each provider. The repo's own source comment makes this explicit: "The TypeScript SDK is an HTTP client today and does not touch the filesystem directly." Source: sdk/typescript/src/paths.ts. Even so, paths.ts mirrors the Python package's filesystem contract so future local features (cache, log co-location) can land on the same paths.

```mermaid
flowchart LR
    App[Application / Agent] --> SDK[headroom-ai SDK]
    SDK -->|HTTP| Proxy[Headroom Proxy]
    Proxy -->|compressed request| LLM[(LLM Provider)]
    LLM -->|response| Proxy
    Proxy --> SDK
    SDK --> App

    subgraph Adapters
        Vercel[Vercel AI SDK middleware]
        OAI[OpenAI native adapter]
        ANT[Anthropic native adapter]
    end
    App -.uses.-> Vercel
    App -.uses.-> OAI
    App -.uses.-> ANT
    Vercel --> SDK
    OAI --> SDK
    ANT --> SDK
```

The compress call itself is format-agnostic: the SDK detects which of four wire formats a message array uses and normalizes it to the OpenAI format internally before sending it to the proxy.

Core Modules

Message Format Detection

sdk/typescript/src/utils/format.ts defines a detectFormat function that distinguishes four message conventions structurally — no heuristic sniffing. Source: sdk/typescript/src/utils/format.ts

| Format | Structural marker |
| --- | --- |
| OpenAI | assistant messages carry `tool_calls`; tool messages carry `tool_call_id` |
| Anthropic | content blocks use underscored `tool_use` / `tool_result` |
| Vercel AI SDK | content parts use hyphenated `tool-call` / `tool-result` |
| Google Gemini | messages use `parts` instead of `content` |

Conversion always targets the OpenAI shape, described in the source as "the proxy's lingua franca." Source: sdk/typescript/src/utils/format.ts. This is what allows the same compress() call to work for gpt-4o, Claude, Gemini, and Vercel-wrapped models without callers having to translate.
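To make the structural markers concrete, here is a minimal Python sketch of the same idea. It is not a port of detectFormat: the function name, the order of the checks, the returned labels, and the fallback to "openai" for plain-text conversations are assumptions chosen for illustration.

```python
from __future__ import annotations

from typing import Any

ANTHROPIC_TYPES = {"tool_use", "tool_result"}   # underscored block types
VERCEL_TYPES = {"tool-call", "tool-result"}     # hyphenated part types


def detect_format(messages: list[dict[str, Any]]) -> str:
    """Classify a message array by structural markers only."""
    for msg in messages:
        # Gemini: messages carry `parts` instead of `content`.
        if "parts" in msg and "content" not in msg:
            return "gemini"
        # OpenAI: assistant `tool_calls` or tool-role `tool_call_id`.
        if "tool_calls" in msg or "tool_call_id" in msg:
            return "openai"
        content = msg.get("content")
        if isinstance(content, list):
            types = {p.get("type") for p in content if isinstance(p, dict)}
            if types & ANTHROPIC_TYPES:
                return "anthropic"
            if types & VERCEL_TYPES:
                return "vercel"
    # Assumption: with no marker present, treat the array as OpenAI-shaped.
    return "openai"
```

Because every check looks at field names or block types rather than message text, the result does not depend on what the conversation says, which is the property the table above describes.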

Filesystem Contract

sdk/typescript/src/paths.ts is a parity shell that mirrors headroom/paths.py. It defines two canonical roots — HEADROOM_CONFIG_DIR (read-mostly configuration) and HEADROOM_WORKSPACE_DIR (read-write state) — plus per-resource env vars for savings logs, the TOIN store, and subscription state. Source: sdk/typescript/src/paths.ts

The precedence for every per-resource helper is identical to Python: explicit argument → per-resource env var → derived from canonical root → default. The module also guards against browser use: when process is undefined, helpers return an empty string, so a misconfigured client fails cleanly instead of crashing at import time. Source: sdk/typescript/src/paths.ts
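The four-step precedence can be sketched in Python as follows. This is a hedged illustration rather than the code in headroom/paths.py: the helper name, the injectable env mapping, and the savings.json file name are hypothetical, while the variable names and the `~/.headroom` workspace default come from this manual.

```python
from __future__ import annotations

import os
from pathlib import Path
from typing import Mapping, Optional

SAVINGS_FILENAME = "savings.json"  # hypothetical file name, for illustration


def resolve_savings_path(
    explicit: Optional[str] = None,
    env: Optional[Mapping[str, str]] = None,
) -> Path:
    """Resolve the savings-log path: argument > env var > root > default."""
    env = os.environ if env is None else env
    if explicit:                                   # 1. explicit argument
        return Path(explicit)
    per_resource = env.get("HEADROOM_SAVINGS_PATH")
    if per_resource:                               # 2. per-resource env var
        return Path(per_resource)
    root = env.get("HEADROOM_WORKSPACE_DIR")
    if root:                                       # 3. derived from canonical root
        return Path(root) / SAVINGS_FILENAME
    return Path.home() / ".headroom" / SAVINGS_FILENAME  # 4. default
```

Passing the environment as a mapping keeps the helper easy to test without mutating os.environ.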

Integration Surfaces

The TypeScript SDK ships three integration shapes, all documented in sdk/typescript/examples/README.md. Source: sdk/typescript/examples/README.md

Vercel AI SDK middleware — withHeadroom(openai('gpt-4o')) is the one-liner; additional examples cover streaming (streaming-chat.ts), tool-calling agents (tool-calling-agent.ts), structured output (structured-output.ts), middleware composition with extractReasoningMiddleware (middleware-composition.ts), and cross-provider parity (multi-provider.ts). Source: sdk/typescript/examples/with-headroom-vercel.ts
Core SDK primitives — compress() for in-app use (basic-compress.ts), simulate() to preview compression without calling an LLM (simulation-dry-run.ts), CompressionHooks for pre/post hooks and per-message biases (hooks-custom-compression.ts), SharedContext for compressed handoff between agents (shared-context-multi-agent.ts), and CCR retrieval to fetch the original content losslessly after compression (ccr-retrieve.ts). Source: sdk/typescript/examples/README.md
Native SDK adapters — withHeadroom works with the native OpenAI and Anthropic SDKs for callers who do not want to adopt Vercel's AI SDK (openai-anthropic-adapters.ts). Source: sdk/typescript/examples/openai-anthropic-adapters.ts

The README's prerequisite block clarifies the runtime contract: any caller of these examples needs Node 18+, a running headroom proxy (installable with pip install "headroom-ai[proxy]"), and at minimum an OPENAI_API_KEY for most examples. Source: sdk/typescript/examples/README.md

Examples and Reference Implementations

Beyond the SDK's own example folder, the repo's examples/ directory demonstrates SDK-adjacent integrations. The LangChain demo reports concrete savings from running Headroom on a LangChain agent: 74% total token reduction across six tool calls while preserving 100% of ERROR entries and anomaly indicators. Source: examples/langchain_demo/README.md. Other Python examples in the same directory (smart_vs_naive_eval.py, real_world_eval.py, real_world_openai_eval.py) show how the Python equivalent of the SDK is used to compare compression strategies end-to-end. Source: examples/README.md

CLI

The headroom CLI is the primary user-facing surface of the project. It is a Click-based command group registered under headroom.cli.main and exposed as the headroom console-script entry point. The CLI orchestrates the Headroom proxy, configures downstream AI coding agents, manages bundled developer tools, and ships operational subcommands for TOIN recommendation publishing and persistent deployment.

Command Group Architecture

The CLI is organized into several top-level groups, each implemented in its own module under headroom/cli/. Each subcommand module imports main from headroom.cli.main and registers its group via Click's @main.group() decorator. Source: headroom/cli/install.py:24-26, headroom/cli/mcp.py:18-20, headroom/cli/init.py:40-45.

```mermaid
graph TD
    A[headroom] --> B[wrap]
    A --> C[mcp]
    A --> D[init]
    A --> E[install]
    A --> F[tools]
    A --> G[sg/diff/loc passthrough]
    A --> H[toin_publish]
    B --> B1[claude]
    B --> B2[codex]
    B --> B3[copilot]
    B --> B4[openclaw]
    B --> B5[continue]
    C --> C1[serve]
    C --> C2[install/remove/status]
    D --> D1[claude/copilot/codex/openclaw]
    E --> E1[plan/apply/start/stop]
    F --> F1[install/doctor/list]
```

The `wrap` Command

The wrap group adapts Headroom to specific AI coding CLIs. It snapshots pre-wrap configuration, sets up the selected context tool (RTK by default, or lean-ctx when HEADROOM_CONTEXT_TOOL=lean-ctx), injects provider blocks, and finally launches the wrapped agent with all API calls routed through the local Headroom proxy. Source: headroom/cli/wrap.py:1-15, README.md:1-10.

Wrap Targets and Mechanics

| Target | Config file touched | Notable behavior |
| --- | --- | --- |
| `codex` | `~/.codex/config.toml`, `AGENTS.md` | Snapshots config before mutation, restores on `unwrap`. Injects `[model_providers.headroom]` block and registers the `headroom` MCP server. Source: headroom/cli/wrap.py:18-90 |
| `claude` | `~/.claude/mcp.json` | Adds the headroom MCP server entry used by CCR. Source: headroom/cli/mcp.py:30-50 |
| `copilot` | Copilot config | In v0.23.0 gained subscription mode support (no BYOK required). Source: README.md:1-5 (release notes) |
| `openclaw` | Per-tool config | First-class wrap target alongside `claude`, `copilot`, `codex`. Source: headroom/cli/init.py:6-8 |
| `continue` | `.continue/config.json` | Extends top-level and per-model `systemMessage`; warns on non-string values. Source: headroom/cli/wrap.py:30-60 |

The Codex wrap flow uses _snapshot_codex_config_if_unwrapped to record a byte-for-byte backup, then calls _inject_codex_provider_config and _setup_headroom_mcp(CodexRegistrar(), port, ..., force=True). The force=True flag matters: Codex starts a long-lived local MCP subprocess from config.toml, so a stale port from a prior wrap would silently misroute retrieval traffic while model traffic used the correct one. Source: headroom/cli/wrap.py:8-20.

Uninstall via `unwrap`

The _restore_codex_provider_config function implements four outcomes based on filesystem state: "restored" (pre-wrap backup present), "cleaned" (markers stripped from a config that lacked a backup), "removed" (config file deleted when it only contained Headroom content), and "noop" (no markers and no backup). It also defensively strips orphaned top-level keys (model_provider, openai_base_url) and orphan [model_providers.headroom] tables left by older or crashed wrap runs, while preserving any user-defined headroom provider whose base_url does not match a Headroom proxy port. Source: headroom/cli/wrap.py:90-130.
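The four outcomes can be sketched as a pure decision function. This is a simplified illustration, not the code in wrap.py: the marker strings, the function names, and the choice to model files as optional strings (None meaning "file absent") are assumptions, and the orphan-key cleanup described above is left out.

```python
from __future__ import annotations

from typing import Optional

# Hypothetical marker lines; the real markers live in headroom/cli/wrap.py.
BEGIN = "# >>> headroom (hypothetical marker) >>>"
END = "# <<< headroom (hypothetical marker) <<<"


def strip_marked_blocks(text: str) -> str:
    """Drop every line between BEGIN and END, markers included."""
    kept, inside = [], False
    for line in text.splitlines():
        if line.strip() == BEGIN:
            inside = True
        elif line.strip() == END:
            inside = False
        elif not inside:
            kept.append(line)
    return "\n".join(kept).strip()


def plan_unwrap(config: Optional[str], backup: Optional[str]) -> tuple[str, Optional[str]]:
    """Return (outcome, new config text); None means the file is absent."""
    if backup is not None:
        return "restored", backup        # pre-wrap snapshot wins outright
    if config is None or BEGIN not in config:
        return "noop", config            # no markers and no backup
    remaining = strip_marked_blocks(config)
    if not remaining:
        return "removed", None           # file held only Headroom content
    return "cleaned", remaining + "\n"   # user content kept, markers stripped
```

Keeping the decision separate from file I/O makes each of the four outcomes straightforward to exercise in isolation.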

The `mcp` Command

headroom mcp provides the Model Context Protocol server used by Claude Code and other agents to call headroom_retrieve on compression markers produced by the proxy. get_headroom_command() returns the canonical invocation (["headroom", "mcp", "serve"]) used when writing mcp.json entries. Source: headroom/cli/mcp.py:20-30.

load_mcp_config and save_mcp_config operate on ~/.claude/mcp.json with a top-level mcpServers object. A corrupt or unreadable existing file falls back to {"mcpServers": {}} rather than raising, so wrap flows degrade gracefully when the user has a partially-written config. Source: headroom/cli/mcp.py:32-50.
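A minimal sketch of that fallback behavior, using only the standard library. The function names are deliberately different from the real ones, and the handling of a non-object JSON root or a missing mcpServers key is an assumption beyond what the source states.

```python
from __future__ import annotations

import json
from pathlib import Path
from typing import Any


def parse_mcp_config(raw: str) -> dict[str, Any]:
    """Parse mcp.json text, degrading to an empty server map on bad input."""
    try:
        data = json.loads(raw)
    except ValueError:              # json.JSONDecodeError subclasses ValueError
        return {"mcpServers": {}}
    if not isinstance(data, dict):  # assumption: a non-object root is corrupt
        return {"mcpServers": {}}
    data.setdefault("mcpServers", {})
    return data


def load_mcp_config_sketch(path: Path) -> dict[str, Any]:
    """Read the file if possible; unreadable files behave like empty ones."""
    try:
        raw = path.read_text(encoding="utf-8")
    except (OSError, ValueError):   # missing file, permissions, bad encoding
        return {"mcpServers": {}}
    return parse_mcp_config(raw)
```

Returning a fresh dictionary on every fallback avoids sharing one mutable default between callers that go on to add server entries.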

The `init` Command

init provisions Headroom for supported agent targets: claude, copilot, codex, openclaw. The local-only set is {"claude", "codex"}; the global set adds copilot and openclaw. The command hooks into shell events by emitting hook commands (built via _command_string for cross-platform compatibility — subprocess.list2cmdline on Windows, shlex.join elsewhere) that resolve to headroom init hook ensure <target>. Source: headroom/cli/init.py:5-20.
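The platform split described for _command_string can be sketched with the two standard-library calls it names. The function below is an illustration under that description, not the module's code; the use of sys.platform as the Windows check is an assumption.

```python
from __future__ import annotations

import shlex
import subprocess
import sys


def command_string(argv: list[str]) -> str:
    """Join argv into one shell-safe string for the current platform."""
    if sys.platform == "win32":
        # Windows: quote using the MS C runtime rules.
        return subprocess.list2cmdline(argv)
    # POSIX: quote using shell rules.
    return shlex.join(argv)


hook = command_string(["headroom", "init", "hook", "ensure", "claude"])
```

For arguments without spaces or special characters, both branches produce the same plain space-separated string; they differ only in how awkward arguments get quoted.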

_enable_verbose_logging attaches a single stderr DEBUG handler to the init logger, with an idempotency guard keyed on a sentinel attribute so nested subcommand invocations do not stack handlers. The handler writes to stderr, not stdout, so headroom init remains composable in pipes that consume stdout. Source: headroom/cli/init.py:22-30.
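A sentinel-guarded handler of this kind can be sketched with the standard logging module. The logger name, the sentinel attribute name, and the function name are hypothetical; only the pattern (one stderr DEBUG handler, attached at most once) follows the description above.

```python
from __future__ import annotations

import logging
import sys

_SENTINEL = "_headroom_verbose_enabled"  # hypothetical attribute name


def enable_verbose_logging(logger_name: str = "headroom.init") -> logging.Logger:
    """Attach one stderr DEBUG handler; repeated calls are no-ops."""
    logger = logging.getLogger(logger_name)
    if getattr(logger, _SENTINEL, False):
        return logger                           # already configured
    handler = logging.StreamHandler(sys.stderr)  # stderr keeps stdout pipe-clean
    handler.setLevel(logging.DEBUG)
    logger.addHandler(handler)
    logger.setLevel(logging.DEBUG)
    setattr(logger, _SENTINEL, True)
    return logger
```

Without the guard, each nested invocation would add another handler and every debug line would be printed once per handler.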

The `tools` Command

tools manages bundled developer binaries (ast-grep, difftastic, scc) that agents can invoke verbatim. The headroom sg, headroom diff, and headroom loc passthrough commands forward every argument, stdin, stdout, stderr, and exit code to the underlying binary. They use a Click context with ignore_unknown_options=True, allow_extra_args=True, and help_option_names=[] so --help reaches the wrapped tool rather than Click. Source: headroom/cli/tools.py:12-25.

_is_windows() and _exec_tool() resolve binaries through headroom.binaries.resolve and translate PlatformNotSupported and OfflineError into colored stderr messages with a hint to run headroom tools install. Source: headroom/cli/tools.py:25-45.
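The forwarding half of a passthrough command can be sketched with subprocess alone. This deliberately omits the Click layer and headroom.binaries.resolve; the function name and the 127 exit code for a missing binary (a common shell convention) are assumptions for illustration.

```python
from __future__ import annotations

import subprocess
import sys
from typing import Sequence


def passthrough(binary: str, args: Sequence[str]) -> int:
    """Run binary with args verbatim and return its exit code.

    stdin, stdout, and stderr are inherited, so the tool reads and writes
    the caller's streams directly.
    """
    try:
        completed = subprocess.run([binary, *args], check=False)
    except FileNotFoundError:
        # Assumed behavior: report on stderr and hint at the install command.
        print(f"{binary}: not found; try `headroom tools install`", file=sys.stderr)
        return 127
    return completed.returncode
```

A caller would typically end with `sys.exit(passthrough(...))` so the wrapped tool's exit code becomes the command's own.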

The `install` Command

install manages persistent Headroom deployments. It uses dataclass-driven manifest types (DeploymentManifest, InstallPreset, RuntimeKind, SupervisorKind, ConfigScope, ProviderSelectionMode) and dispatches on preset to start the right runtime: start_persistent_docker for PERSISTENT_DOCKER, start_supervisor for SERVICE supervisor kind, or start_detached_agent for the fallback. _start_deployment blocks on wait_ready(manifest, timeout_seconds=45) and raises ClickException if the deployment fails health-check within the budget. Source: headroom/cli/install.py:15-50.
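The dispatch and the bounded health wait can be sketched as below. Only PERSISTENT_DOCKER, SERVICE, the three starter names, and the 45-second budget come from the text; the other enum members, the enum string values, the two-field manifest, and the probe-based wait_until_healthy helper are hypothetical stand-ins.

```python
from __future__ import annotations

import time
from dataclasses import dataclass
from enum import Enum
from typing import Callable


class InstallPreset(Enum):
    PERSISTENT_DOCKER = "persistent-docker"
    OTHER = "other"      # hypothetical stand-in for the remaining presets


class SupervisorKind(Enum):
    SERVICE = "service"
    NONE = "none"        # hypothetical


@dataclass(frozen=True)
class DeploymentManifest:
    preset: InstallPreset
    supervisor_kind: SupervisorKind


def pick_starter(manifest: DeploymentManifest) -> str:
    """Name the starter that the dispatch described above would call."""
    if manifest.preset is InstallPreset.PERSISTENT_DOCKER:
        return "start_persistent_docker"
    if manifest.supervisor_kind is SupervisorKind.SERVICE:
        return "start_supervisor"
    return "start_detached_agent"  # fallback


def wait_until_healthy(probe: Callable[[], bool], timeout_seconds: float = 45,
                       interval: float = 0.5) -> bool:
    """Poll probe until it succeeds or the time budget runs out."""
    deadline = time.monotonic() + timeout_seconds
    while True:
        if probe():
            return True
        if time.monotonic() >= deadline:
            return False
        time.sleep(interval)
```

In the real command a failed wait surfaces as a ClickException; here the boolean result leaves that choice to the caller.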

install reuses submodules headroom.install.{planner,providers,runtime,state,supervisors,health,models} for the plan/apply/start/stop lifecycle and round-trips manifests through save_manifest / load_manifest. Source: headroom/cli/install.py:1-15.

The `toin_publish` Command

python -m headroom.cli.toin_publish emits a static recommendations.toml the Rust proxy loads at startup. It walks the on-disk TOIN store, aggregates one row per (auth_mode, model_family, structure_hash) slice with at least --min-observations recorded events, and writes a sorted TOML file that diffs cleanly across publishes. Source: headroom/cli/toin_publish.py:1-25.

The TOML schema is:

```toml
[[recommendation]]
auth_mode = "payg"
model_family = "claude-3-5"
structure_hash = "deadbeef..."
strategy_hint = "smart_crusher"
confidence = 0.87
observations = 142
```
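A deterministic writer for this schema might look like the following sketch. It assumes rows are already aggregated into dictionaries with the six keys shown, sorts by the (auth_mode, model_family, structure_hash) slice, and uses JSON string quoting as a stand-in for TOML basic strings; the sort key, the function name, and the quoting shortcut are assumptions, not details taken from toin_publish.py.

```python
from __future__ import annotations

import json
from typing import Any, Iterable

FIELDS = ("auth_mode", "model_family", "structure_hash",
          "strategy_hint", "confidence", "observations")


def render_recommendations(rows: Iterable[dict[str, Any]], min_observations: int) -> str:
    """Filter, sort, and render rows as [[recommendation]] tables."""
    kept = [r for r in rows if r["observations"] >= min_observations]
    # A stable key makes the output independent of input order.
    kept.sort(key=lambda r: (r["auth_mode"], r["model_family"], r["structure_hash"]))
    blocks = []
    for row in kept:
        lines = ["[[recommendation]]"]
        for key in FIELDS:
            value = row[key]
            # Strings are quoted; numbers are written bare.
            rendered = json.dumps(value) if isinstance(value, str) else repr(value)
            lines.append(f"{key} = {rendered}")
        blocks.append("\n".join(lines))
    return "\n\n".join(blocks) + ("\n" if blocks else "")
```

Because both the row order and the field order are fixed, publishing twice from the same store yields byte-identical output, which is what lets successive files diff cleanly.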

Strategy selection in _select_strategy prefers an explicit optimal_strategy (when not the "default" placeholder), falls back to the highest-success entry in strategy_success_rates, and finally returns "default". Publishing is deliberately a CLI, not a request-time hook — recommendations are a deploy-boundary artifact, never an in-flight mutation. Source: headroom/cli/toin_publish.py:25-60.
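The three-step fallback can be written out as a small function. This is a sketch of the rule as described, not the module's _select_strategy: the signature is assumed, "kompress" in the usage lines is a made-up strategy label, and the alphabetical tie-break is an added assumption to keep the result deterministic.

```python
from __future__ import annotations

from typing import Mapping, Optional


def select_strategy(
    optimal_strategy: Optional[str],
    strategy_success_rates: Mapping[str, float],
) -> str:
    """Pick a strategy hint using the three-step fallback."""
    # 1. An explicit choice wins unless it is the "default" placeholder.
    if optimal_strategy and optimal_strategy != "default":
        return optimal_strategy
    # 2. Otherwise take the best-performing strategy on record.
    if strategy_success_rates:
        # Sorting first means ties resolve to the alphabetically first name.
        return max(sorted(strategy_success_rates), key=strategy_success_rates.__getitem__)
    # 3. Nothing to go on.
    return "default"


explicit = select_strategy("smart_crusher", {"kompress": 0.99})
learned = select_strategy("default", {"kompress": 0.9, "smart_crusher": 0.7})
```

Here `explicit` is "smart_crusher" because an explicit choice outranks success rates, and `learned` is "kompress" because the placeholder falls through to the highest rate.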

Cross-Cutting Configuration

Several environment variables flow across CLI subcommands:

| Variable | Effect | Source |
| --- | --- | --- |
| `HEADROOM_CONTEXT_TOOL` | Selects between RTK and `lean-ctx` for shell-output rewriting | README.md:1-5, headroom/cli/wrap.py:10-12 |
| `HEADROOM_CONFIG_DIR` | Read-mostly configuration root (default `~/.headroom/config`) | sdk/typescript/src/paths.ts:20-30 |
| `HEADROOM_WORKSPACE_DIR` | Read-write state root (default `~/.headroom`) | sdk/typescript/src/paths.ts:20-30 |
| `HEADROOM_SAVINGS_PATH` / `HEADROOM_TOIN_PATH` / `HEADROOM_SUBSCRIPTION_STATE_PATH` | Per-resource overrides for storage paths | sdk/typescript/src/paths.ts:25-35 |
| `HEADROOM_RECOMMENDATIONS_PATH` | Where the Rust proxy looks for `recommendations.toml` | headroom/cli/toin_publish.py:1-10 |
| `HEADROOM_ANTHROPIC_PRE_UPSTREAM_CONCURRENCY` | Pre-upstream semaphore for Anthropic HTTP path | scripts/README.md:1-15 |

The TypeScript SDK mirrors this contract in paths.ts so future local features (cache, log co-location) land on the same precedence rule: explicit argument > per-resource env var > derived from canonical root > default. Source: sdk/typescript/src/paths.ts:10-25.

Community-Driven Extensions

Several open issues map directly to CLI features. Issue #74 requests headroom wrap opencode to route OpenCode's 75+ Vercel AI SDK providers through the proxy, and #76 proposes a deeper headroom-opencode npm plugin for automatic context compression and a stats dashboard. Issue #488 (8 comments) calls for Copilot CLI subscription-mode compression that works *before* Copilot packages prompts, which is partially addressed in v0.23.0. Issue #527 requests a brew install headroom path, and #526 requests Hermes agent support. Each of these would land as a new wrap subcommand or an additional supported target in init and wrap registries. Source: headroom/cli/init.py:6-8 (target registry that new agents would extend).

Doramagic Pitfall Log

Source-linked risks stay visible on the manual page so the preview does not read like a recommendation.

Found 30 structured pitfall item(s), including 3 high/blocking item(s). Top priority: Configuration risk - Configuration risk requires verification.

1. Configuration risk: Configuration risk requires verification

Severity: high
Finding: Project evidence flags a configuration risk. Review the linked source before relying on this workflow.
User impact: May increase setup, validation, or first-run risk for the user.
Recommended check: Reproduce the official install and quickstart path in an isolated environment.
Evidence: community_evidence:github | https://github.com/chopratejas/headroom/issues/1177

2. Security or permission risk: Security or permission risk requires verification

Severity: high
Finding: Project evidence flags a security or permission risk. Review the linked source before relying on this workflow.
User impact: May increase setup, validation, or first-run risk for the user.
Recommended check: Reproduce the official install and quickstart path in an isolated environment.
Evidence: community_evidence:github | https://github.com/chopratejas/headroom/issues/1132

3. Security or permission risk: Security or permission risk requires verification

Severity: high
Finding: Project evidence flags a security or permission risk. Review the linked source before relying on this workflow.
User impact: May increase setup, validation, or first-run risk for the user.
Recommended check: Reproduce the official install and quickstart path in an isolated environment.
Evidence: community_evidence:github | https://github.com/chopratejas/headroom/issues/488

4. Installation risk: Installation risk requires verification

Severity: medium
Finding: Check this installation risk before relying on the project: "[BUG] MCP/direct compression hangs on larger JSON payloads on Windows; proxy /mcp does not behave as documented".
User impact: Developers may fail before the first successful local run.
Recommended check: Before packaging this project, run the relevant install, config, and quickstart checks for this issue. Context: observed with Python on Windows.
Evidence: failure_mode_cluster:github_issue | https://github.com/chopratejas/headroom/issues/600

5. Installation risk: Installation risk requires verification

Severity: medium
Finding: Check this installation risk before relying on the project: "[FEATURE] Strands SDK Integration".
User impact: Developers may fail before the first successful local run.
Recommended check: Before packaging this project, run the relevant install, config, and quickstart checks for this issue. Context: observed with Python.
Evidence: failure_mode_cluster:github_issue | https://github.com/chopratejas/headroom/issues/14

6. Configuration risk: Configuration risk requires verification

Severity: medium
Finding: Project evidence flags a configuration risk. Review the linked source before relying on this workflow.
User impact: May increase setup, validation, or first-run risk for the user.
Recommended check: Reproduce the official install and quickstart path in an isolated environment.
Evidence: capability.host_targets | https://github.com/chopratejas/headroom

7. Configuration risk: Configuration risk requires verification

Severity: medium
Finding: Check this configuration risk before relying on the project: "Antigravity 2.0 support".
User impact: Developers may misconfigure credentials, environment, or host setup.
Recommended check: Before packaging this project, run the relevant install, config, and quickstart checks for this issue. Context: observed on Windows.
Evidence: failure_mode_cluster:github_issue | https://github.com/chopratejas/headroom/issues/566

8. Configuration risk: Configuration risk requires verification

Severity: medium
Finding: Check this configuration risk before relying on the project: Release v0.22.0.
User impact: Upgrading or migrating to this release may change expected behavior.
Recommended check: Before packaging this project, run the relevant install, config, and quickstart checks against Release v0.22.0. Context: the source discussion did not expose a precise runtime context.
Evidence: failure_mode_cluster:github_release | https://github.com/chopratejas/headroom/releases/tag/v0.22.0

9. Configuration risk: Configuration risk requires verification

Severity: medium
Finding: Check this configuration risk before relying on the project: Release v0.23.0.
User impact: Upgrading or migrating to this release may change expected behavior.
Recommended check: Before packaging this project, run the relevant install, config, and quickstart checks against Release v0.23.0. Context: observed with Python, Docker, and Linux.
Evidence: failure_mode_cluster:github_release | https://github.com/chopratejas/headroom/releases/tag/v0.23.0

10. Capability evidence risk: Capability evidence risk requires verification

Severity: medium
Finding: README/documentation is current enough for a first validation pass.
User impact: May increase setup, validation, or first-run risk for the user.
Recommended check: Reproduce the official install and quickstart path in an isolated environment.
Evidence: capability.assumptions | https://github.com/chopratejas/headroom

11. Maintenance risk: Maintenance risk requires verification

Severity: medium
Finding: Project evidence flags a maintenance risk. Review the linked source before relying on this workflow.
User impact: May increase setup, validation, or first-run risk for the user.
Recommended check: Reproduce the official install and quickstart path in an isolated environment.
Evidence: evidence.maintainer_signals | https://github.com/chopratejas/headroom

12. Security or permission risk: Security or permission risk requires verification

Severity: medium
Finding: The downstream validation record carries the `no_demo` flag.
User impact: May increase setup, validation, or first-run risk for the user.
Recommended check: Reproduce the official install and quickstart path in an isolated environment.
Evidence: downstream_validation.risk_items | https://github.com/chopratejas/headroom

Source: Doramagic discovery, validation, and Project Pack records
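
Several items above recommend reproducing the official install and quickstart path in an isolated environment. One way to get such an environment, sketched here with Python's standard `venv` module, is to create a throwaway virtual environment and run the project's documented steps with its interpreter. The function name `create_isolated_env` and the directory name `headroom-check-env` are hypothetical, chosen only for illustration; the actual install command is deliberately left out and should be taken from the project's README.

```python
import sys
import venv
from pathlib import Path


def create_isolated_env(root: Path, with_pip: bool = True) -> Path:
    """Create a fresh virtual environment under `root`.

    Returns the path to the environment's Python interpreter so that the
    official install and quickstart steps can be run against it instead of
    the system interpreter.
    """
    # Hypothetical directory name; any empty location works.
    env_dir = root / "headroom-check-env"

    # clear=True wipes a previous environment at the same path, so every
    # reproduction starts from a clean state.
    venv.create(env_dir, with_pip=with_pip, clear=True)

    # Virtual environments place executables in "Scripts" on Windows
    # and in "bin" elsewhere.
    if sys.platform == "win32":
        return env_dir / "Scripts" / "python.exe"
    return env_dir / "bin" / "python"
```

Calling `create_isolated_env(Path(tempfile.mkdtemp()))` returns an interpreter path; running the README's install command through that interpreter keeps any failure, such as the Windows-specific issues listed above, away from the system Python.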

Community Discussion Evidence

These external discussion links are review inputs, not standalone proof that the project is production-ready.

Sources: 12 (count of project-level external discussion links exposed on this manual page).

Use: Review before install. Open the linked issues or discussions before treating the pack as ready for your environment.

Doramagic exposes project-level community discussion separately from official documentation. Review these links before using headroom with real data or production workflows.

[[BUG] Historical tab in the dashboard does not include RTK stats](https://github.com/chopratejas/headroom/issues/1177) - github / github_issue
[[BUG] Windows: proxy.log RotatingFileHandler rollover fails (WinError 32](https://github.com/chopratejas/headroom/issues/1184) - github / github_issue
Bedrock streaming: message_start emits input_tokens=0, breaking downstre - github / github_issue
[[BUG]](https://github.com/chopratejas/headroom/issues/1179) - github / github_issue
[[BUG] headroom wrap claude corrupts model name with ANSI escape codes](https://github.com/chopratejas/headroom/issues/626) - github / github_issue
[[FEATURE] Support Copilot CLI subscription mode (no BYOK/API key)](https://github.com/chopratejas/headroom/issues/488) - github / github_issue
[Antigravity 2.0 support](https://github.com/chopratejas/headroom/issues/566) - github / github_issue
[[BUG] MCP/direct compression hangs on larger JSON payloads on Windows; proxy /mcp does not behave as documented](https://github.com/chopratejas/headroom/issues/600) - github / github_issue
[[FEATURE] Export Compression Analytics and Savings Report](https://github.com/chopratejas/headroom/issues/599) - github / github_issue
[[FEATURE] Strands SDK Integration](https://github.com/chopratejas/headroom/issues/14) - github / github_issue
Release v0.26.0 - github / github_release
Release v0.25.0 - github / github_release

Source: Project Pack community evidence and pitfall evidence
