honcho Manual - Doramagic.ai

Doramagic Project Pack · Human Manual

honcho

Memory library for building stateful agents

Overview and System Architecture

Related topics: Self-Hosting, Configuration, and LLM Provider Setup, Reasoning Pipeline: Deriver, Dreamer, Summarizer, and Retrieval

Honcho is an open-source conversational memory platform positioned as a "memory layer" for LLM-based agents. Rather than returning semantically matched chunks like a retrieval-augmented generation (RAG) system, Honcho extracts reasoned conclusions about peers (users, agents, groups, projects) and serves them through a single FastAPI server. The project is split across multiple repositories: this one hosts the core service logic and the Python and TypeScript SDKs in sdks/, the CLI in honcho-cli/, and optional managed hosting at api.honcho.dev (README.md:1-40).

What Honcho Is For

Honcho targets two main audiences: developers who want to give coding agents persistent memory, and product teams who want to add memory to LLM-powered applications. According to the README, "Using Honcho as your memory system will earn your agents higher retention, more trust, and help you build data moats" (README.md:14-18). Capability highlights include reasoning-first memory extraction, a peer-centric data model, multi-peer perspective (modelling what peer X knows about peer Y), and support for both managed and self-hosted deployments.

System Architecture

The runtime is composed of an HTTP API server, a background deriver worker, a CLI, and two language SDKs. The following diagram summarizes the request flow:

flowchart LR
    subgraph Clients
        SDK_P[Python SDK<br/>honcho-ai]
        SDK_T[TypeScript SDK<br/>@honcho-ai/sdk]
        CLI[honcho-cli]
        MCP[MCP / Agent tools]
    end
    subgraph Server
        API[FastAPI Server<br/>routers + middleware]
        QM[QueueManager<br/>deriver process]
        WH[Webhook Delivery]
    end
    subgraph Storage
        DB[(Postgres<br/>workspaces, peers,<br/>sessions, messages)]
        VS[(Vector Store<br/>pgvector / turbopuffer / lancedb)]
        CACHE[(Redis cache)]
    end
    LLM[LLM provider<br/>OpenAI / Anthropic /<br/>OpenRouter / vLLM / NIM]
    EMB[Embedding provider]

    SDK_P -->|HTTP| API
    SDK_T -->|HTTP| API
    CLI -->|HTTP| API
    MCP -->|HTTP| API

    API --> DB
    API --> VS
    API --> CACHE
    API --> EMB
    QM --> DB
    QM --> VS
    QM --> LLM
    QM --> CACHE
    QM --> WH
    WH -->|signed POST| Subscribers

The README confirms the layered model: workspaces hold peers, peers participate in sessions, messages live on sessions, and Honcho builds a per-peer representation that callers query through the Chat Endpoint (README.md:62-66). The webhooks subsystem uses the same QueueManager process that powers the deriver to deliver signed HTTP POSTs to subscriber URLs (src/webhooks/README.md:9-19).

The Honcho Loop

Honcho's API contract is captured in a four-step loop (README.md:68-78):

Store — conversations, events, documents, or tool traces are appended as messages on a session.
Reason — the deriver consumes queue items in the background and updates peer representations, producing conclusions and refreshing peer cards.
Query — callers ask Honcho for context, search results, peer representations, or a natural-language answer through the Chat Endpoint.
Inject — the result is dropped into any LLM call or agent framework.
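The four steps can be sketched with a toy in-memory model. This is an illustration only, not Honcho's implementation: `ToyMemory`, its methods, and the trivial "reasoning" step are hypothetical stand-ins for the server, the deriver, and the LLM call.

```python
from collections import defaultdict


class ToyMemory:
    """Hypothetical in-memory stand-in for the store/reason/query/inject loop."""

    def __init__(self) -> None:
        self.messages = defaultdict(list)     # session_id -> [(peer_id, text)]
        self.queue = []                       # pending (session_id, peer_id, text)
        self.conclusions = defaultdict(list)  # peer_id -> [conclusion text]

    def store(self, session_id: str, peer_id: str, text: str) -> None:
        # Store: append the message and enqueue it for background reasoning.
        self.messages[session_id].append((peer_id, text))
        self.queue.append((session_id, peer_id, text))

    def reason(self) -> None:
        # Reason: drain the queue. A real deriver would call an LLM here;
        # this toy simply records the message as a "conclusion".
        while self.queue:
            _, peer_id, text = self.queue.pop(0)
            self.conclusions[peer_id].append(f"{peer_id} said: {text}")

    def query(self, peer_id: str) -> str:
        # Query: return whatever has been derived about the peer so far.
        return "\n".join(self.conclusions[peer_id])


def inject(context: str, question: str) -> str:
    # Inject: place the retrieved context into a prompt for any LLM call.
    return f"Known context:\n{context}\n\nQuestion: {question}"


memory = ToyMemory()
memory.store("s1", "alice", "I had oatmeal for breakfast.")
before = memory.query("alice")  # empty: reasoning has not run yet
memory.reason()
prompt = inject(memory.query("alice"), "What did alice eat?")
print(prompt)
```

Querying before `reason()` runs returns nothing, which is the toy equivalent of asking for derived memory before the background worker has processed the queue.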

This loop is visible at the SDK level. In Python, peer.message(...) produces a MessageCreateParams that is then passed to session.add_messages(...) (sdks/python/src/honcho/peer.py:1-60). The TypeScript SDK exposes the same primitives through session.addMessages([...]) and peer.chat(...) (sdks/typescript/README.md:14-30). The Conclusion resource, exposed in both SDKs, represents the atom Honcho derives from messages (sdks/typescript/src/conclusions.ts:1-40).

Key Components and Services

Component	Location	Role
FastAPI server	`src/` (routers, middleware)	Public HTTP surface for workspaces, peers, sessions, messages, conclusions, chat, webhooks
Deriver worker	background process	Polls queue, drives reasoning, summary, peer card, and dream jobs
CLI	`honcho-cli/src/honcho_cli/main.py`	Terminal interface; `honcho conclusion ...` and `honcho session ...` are two of the most-used command groups
Python SDK	`sdks/python/src/honcho/`	Pydantic-validated async client; ships its own typed `api_types.py`
TypeScript SDK	`sdks/typescript/src/`	Zod-validated client; published as `@honcho-ai/sdk` v2.1.2
Webhook delivery	`src/webhooks/`	Signs and dispatches events to user-configured URLs

Both SDKs share the same domain model. The Python side defines Pydantic models for ReasoningConfiguration, PeerCardConfiguration, SummaryConfiguration, DreamConfiguration, and the umbrella WorkspaceConfiguration (sdks/python/src/honcho/api_types.py:1-80). The TypeScript side mirrors this with Zod schemas and a snake_case api.ts (sdks/typescript/src/types/api.ts:1-60, sdks/typescript/src/validation.ts:1-60). Pagination is implemented symmetrically: the Python SyncPage/AsyncPage generics (sdks/python/src/honcho/pagination.py:1-60) correspond to the TypeScript Page<T, U> with Symbol.asyncIterator (sdks/typescript/src/pagination.ts:1-50).

Configuration Surfaces

Honcho accepts configuration in priority order: environment variables > .env > config.toml > defaults (README.md:128-150). The TOML file is organized into sections: [app], [db], [auth], [cache], [llm], [deriver], [peer_card], [dialectic], [summary], [dream], [webhook], [metrics], [telemetry], [vector_store], and [sentry]. The [vector_store] section explicitly supports pgvector, turbopuffer, or lancedb backends — a detail that aligns with the community request to add TurboQuant/turbovec as an optional compressed vector backend (see issue #781).

Several community-reported issues map directly to configuration gaps. Self-hosters have repeatedly found that the deriver process must be started separately for derived memory to appear (issue #494). The OpenAI-compatible provider path (LLM_OPENAI_API_KEY + OPENAI_BASE_URL) requires both keys to be set, otherwise AsyncOpenAI falls back to api.openai.com and fails with 401 (issue #641). Local embedding providers (Ollama, llama.cpp, TEI) currently require source-level changes in src/embedding_client.py because the model name and base URL are hardcoded for two of three providers (issue #578, issue #443).

Self-Hosting, Configuration, and LLM Provider Setup

Related topics: Overview and System Architecture, Reasoning Pipeline: Deriver, Dreamer, Summarizer, and Retrieval

Overview

Honcho is a memory infrastructure service for stateful agents, distributed as a FastAPI server that can be run managed at api.honcho.dev or self-hosted locally. Source: README.md:1-40. The self-hosted deployment path is targeted at users who want full control over their peer memory pipelines — especially when wiring OpenAI-compatible providers, local embeddings, or alternative vector stores. The project is licensed AGPL-3.0 and ships reference Docker configuration plus TOML-based configuration files.

The core runtime topology consists of a web/API service and a background deriver process that drains a QueueManager and turns ingested messages into conclusions and peer representations. Source: src/webhooks/README.md:14-19. This two-process model is the single most common source of self-hosting confusion — see "Common Failure Modes" below.

Self-Hosting Topology

A self-hosted Honcho deployment typically runs the following services side-by-side:

Component	Role	Configuration surface
FastAPI web/API	Accepts writes, queries, chat endpoint	`APP_`, `AUTH_` env vars
Deriver worker	Polls `QueueManager`, derives conclusions, fires webhooks	`DERIVER_`, `DREAM_`
Postgres + pgvector	Stores messages, documents, vector embeddings	`DB_*`
Redis	Caching, peer card cache	`CACHE_*`
Optional Turbopuffer / LanceDB	External vector store backend	`VECTOR_STORE_*`

Webhooks specifically require the deriver process to be running to facilitate delivery — otherwise queue items accumulate without ever being dispatched. Source: src/webhooks/README.md:20-22. The README explicitly states that the project ships with docker-compose.yml.example and supports deploying to Fly.io via documented guides. Source: README.md:1-10.

Configuration System

Honcho uses a flexible configuration system supporting both TOML files and environment variables. Configuration values are loaded in priority order: environment variables > .env file > config.toml > defaults. Source: README.md:42-46.
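The precedence order can be modelled with a layered lookup. The sketch below assumes four plain dictionaries standing in for the four sources; the setting names and values are hypothetical, and only the ordering comes from the README.

```python
from collections import ChainMap

# Hypothetical values for each layer; only the precedence order is from the manual.
defaults = {"DB_POOL_SIZE": "10", "LOG_LEVEL": "INFO"}
config_toml = {"LOG_LEVEL": "WARNING", "CACHE_ENABLED": "false"}
dotenv_file = {"CACHE_ENABLED": "true"}
environment = {"LOG_LEVEL": "DEBUG"}

# ChainMap searches its maps left to right, so the first map has highest priority.
settings = ChainMap(environment, dotenv_file, config_toml, defaults)

print(settings["LOG_LEVEL"])      # resolved from the environment layer
print(settings["CACHE_ENABLED"])  # resolved from the .env layer
print(settings["DB_POOL_SIZE"])   # falls through to the defaults
```

A value set in a higher-priority layer shadows the same key in every layer below it, which is why a stray environment variable can silently override what `config.toml` says.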

To begin, copy the example configuration file:

cp config.toml.example config.toml

Source: README.md:50-52. The TOML file is organized into clearly-named sections including [app], [db], [auth], [cache], [llm], [deriver], [peer_card], [dialectic], [summary], [dream], [webhook], [metrics], [telemetry], [vector_store], and [sentry]. Source: README.md:54-70.

Workspace Configuration Surface

At runtime, workspaces themselves carry configuration that controls reasoning, peer cards, summaries, and dream processing. The Python SDK exposes these as Pydantic models with extra="forbid" semantics, ensuring unknown fields are rejected rather than silently ignored. Source: sdks/python/src/honcho/api_types.py:20-90.

Representative models include:

ReasoningConfiguration — toggle reasoning and pass custom instructions.
PeerCardConfiguration — toggle creation/use of peer cards.
SummaryConfiguration — enable summarization and tune messages_per_short_summary / messages_per_long_summary.
DreamConfiguration — enable dream processing.
WorkspaceConfiguration — bundles the above into a workspace-scoped container.

Source: sdks/python/src/honcho/api_types.py:20-90.

LLM Provider Setup

Honcho's LLM configuration is namespaced under [llm] in TOML and under LLM_* environment variables. The model config for each subsystem (deriver, peer card, dialectic, summary, dream) accepts a transport field plus a model identifier. Source: README.md:54-70.

A representative self-hosted setup against an OpenAI-compatible provider looks like this:

LLM_OPENAI_API_KEY=<provider-key>
OPENAI_BASE_URL=https://integrate.api.nvidia.com/v1
DERIVER_MODEL_CONFIG__TRANSPORT=openai
DERIVER_MODEL_CONFIG__MODEL=nvidia/nemotron-3-nano-omni-3

The double-underscore (__) is Honcho's nested-config delimiter, mapping DERIVER_MODEL_CONFIG__TRANSPORT to deriver.model_config.transport in the TOML schema. Source: README.md:54-70.
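As a rough illustration of that mapping, the helper below converts an environment variable name into a dotted config path. `env_to_path` and the `SECTIONS` subset are hypothetical and written for this example; Honcho's actual settings loader may resolve names differently, and `PEER_CARD_MODEL_CONFIG__MODEL` is an assumed variable name used only to exercise a two-word section.

```python
# Subset of the TOML section names listed in the manual.
SECTIONS = ["app", "db", "auth", "cache", "llm", "deriver", "peer_card",
            "dialectic", "summary", "dream", "webhook", "vector_store"]


def env_to_path(name: str) -> tuple[str, ...]:
    """Map an env var name to a config path (illustrative helper)."""
    lowered = name.lower()
    # Try longer section names first so multi-word sections such as
    # "peer_card" are matched as a whole.
    for section in sorted(SECTIONS, key=len, reverse=True):
        prefix = section + "_"
        if lowered.startswith(prefix):
            rest = lowered[len(prefix):]
            # "__" separates nesting levels inside the section.
            return (section, *rest.split("__"))
    raise ValueError(f"no known section prefix in {name!r}")


print(".".join(env_to_path("DERIVER_MODEL_CONFIG__TRANSPORT")))
print(".".join(env_to_path("PEER_CARD_MODEL_CONFIG__MODEL")))
```

Note that the single underscore after the section prefix and the single underscore inside `MODEL_CONFIG` are part of the names; only the double underscore introduces a new nesting level.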

SDK Configuration Surface

Both SDKs consume the same configuration shape. The TypeScript SDK exposes HonchoConfig alongside validation types such as ChatQuery, ContextParams, and GetRepresentationParams. Source: sdks/typescript/src/index.ts:1-70. The Python SDK provides a Honcho client and per-peer helpers like peer.message(...) (which builds MessageCreateParams with optional configuration and metadata) and peer.search(...) (which scopes semantic search to the peer-as-author). Source: sdks/python/src/honcho/peer.py:1-90.

Embeddings Provider

Honcho's embedding client (referenced in README.md:54-70 under [app] embedding settings) supports multiple providers. Community-reported issues indicate that self-hosted embeddings (Ollama, llama.cpp, TEI, Infinity) currently require source-level changes, because the embedding client hardcodes model names and base URLs for two of three providers; see "Common Failure Modes" below.

Common Failure Modes

Several recurring self-hosting issues have surfaced in community discussions:

1. Deriver not running → derived memory stays empty

In self-hosted/local deployments, messages are written and queue items are created, but derived memory does not appear unless the deriver is started manually. Source: community issue #494. The fix is operational rather than code-level: ensure the deriver container/process is running alongside the web/API service, since conclusions, peer cards, and search-derived context all depend on it. Source: src/webhooks/README.md:20-22.

2. OpenAI-compatible provider returns 401

When LLM_OPENAI_API_KEY is set to a non-OpenAI key but OPENAI_BASE_URL is not propagated through to every LLM client, dialectic/deriver/summary calls fail with openai.AuthenticationError: 401 against api.openai.com. Source: community issue #641. The workaround is to explicitly set OPENAI_BASE_URL (or the per-subsystem equivalent under [llm]) for each provider.
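A small preflight check can catch the simplest form of this misconfiguration before the server starts. The function below is a hypothetical helper, not part of Honcho; it only inspects the two variable names quoted in this manual and cannot confirm that the base URL actually reaches every LLM client.

```python
def check_openai_compatible(env: dict[str, str]) -> list[str]:
    """Return warnings for the 401 scenario described above (illustrative)."""
    warnings = []
    has_key = bool(env.get("LLM_OPENAI_API_KEY"))
    base_url = env.get("OPENAI_BASE_URL", "")
    if has_key and not base_url:
        warnings.append(
            "LLM_OPENAI_API_KEY is set but OPENAI_BASE_URL is not; "
            "requests may go to api.openai.com"
        )
    if base_url and not base_url.startswith(("http://", "https://")):
        warnings.append("OPENAI_BASE_URL should be an absolute http(s) URL")
    return warnings


# Example with an explicit mapping; pass dict(os.environ) to check a real process.
print(check_openai_compatible({"LLM_OPENAI_API_KEY": "provider-key"}))
```

An empty list means neither warning applies; it does not guarantee that the provider accepts the key.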

3. Local embeddings provider unusable without code changes

Users have reported having to modify embedding_client.py to use a local embeddings provider because certain branches fall back to a hardcoded OpenAI endpoint. Source: community issue #443. A related feature request (issue #578) asks for a configurable embedding model name plus custom base URL for self-hosted embeddings.

4. honcho-cli missing `click` dependency

Installing honcho-cli via uv tool install honcho-cli can result in ModuleNotFoundError: No module named 'click' until a release containing the fix from PR #786 is published. Source: community issues #786 and #808. Until the PyPI release is updated, install from source or pin a fixed version.

SDKs, CLI, and Agent Integrations

Related topics: Overview and System Architecture, Self-Hosting, Configuration, and LLM Provider Setup

Honcho ships four surface areas on top of its FastAPI memory server: the Python SDK (honcho-ai), the TypeScript SDK (@honcho-ai/sdk), the honcho-cli terminal client, and a growing set of agent integrations including a hosted MCP server. This page documents the role, structure, and configuration of each.

Overview and High-Level Architecture

The core service is a FastAPI server that ingests messages, runs the deriver to update peer representations, and serves the Chat Endpoint, conclusions, peer cards, and session summaries. The SDKs, CLI, and integrations are thin client layers that translate native idioms (Python objects, TypeScript classes, shell commands, MCP tool calls) into HTTP requests against that server.

flowchart LR
  subgraph Clients
    A[Python SDK<br/>honcho-ai]
    B[TypeScript SDK<br/>@honcho-ai/sdk]
    C[honcho-cli<br/>Typer]
    D[Claude Code / OpenCode / OpenClaw / Hermes]
    E[MCP Server<br/>mcp/src/config.ts]
  end
  F[(Honcho FastAPI<br/>api.honcho.dev or :8000)]
  G[(Deriver Worker)]
  F <--> G
  A --> F
  B --> F
  C --> F
  D --> E
  E --> F

The same Honcho client class in both SDKs exposes peer(), session(), add_messages(), chat(), and representation() methods. Configuration is loaded with the same precedence in every surface: environment variables → .env → config.toml → defaults (see README.md).

Python SDK (`honcho-ai`)

The Python package exposes the full domain model as Pydantic-typed objects. Configuration types in sdks/python/src/honcho/api_types.py include ReasoningConfiguration, PeerCardConfiguration, SummaryConfiguration, DreamConfiguration, and the umbrella WorkspaceConfiguration. These mirror the server's Pydantic schemas with extra="forbid" so callers cannot smuggle in unknown keys.

A typical peer-message workflow is built around helpers in sdks/python/src/honcho/peer.py. peer.message(content, ...) validates that the content is non-empty, parses optional created_at timestamps, and wraps everything in a MessageCreateParams object:

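import asyncio
from honcho import Honcho

async def main() -> None:
    honcho = Honcho(workspace_id="acme")
    alice = honcho.peer("alice")
    session = honcho.session("s1")
    await session.add_peers([alice])
    # peer.message(...) builds the MessageCreateParams that add_messages expects
    await session.add_messages(alice.message("I had oatmeal for breakfast."))
    resp = await alice.chat("what did alice have for breakfast today?")
    print(resp)

asyncio.run(main())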

The search() method in peer.py uses pydantic's validate_call and Field to type its query, filters, and limit parameters, enforcing 1 ≤ limit ≤ 100, and returns a list of Message objects (Source: sdks/python/src/honcho/peer.py). Install via pip install honcho-ai, uv add honcho-ai, or poetry add honcho-ai.

TypeScript SDK (`@honcho-ai/sdk`)

The TypeScript SDK is a DX-optimized, isomorphic client published as @honcho-ai/sdk (see sdks/typescript/package.json, version 2.1.2, Apache-2.0). It depends only on zod ^4.0.0 for runtime validation — no HTTP client dependency is hardcoded, allowing the package to run under Bun, Node, and edge runtimes.

The barrel export in sdks/typescript/src/index.ts re-exports the domain classes (Honcho, Peer, Session, Message, Conclusion, SessionContext, SessionSummaries, Summary), the streaming types (DialecticStreamChunk, DialecticStreamResponse), and all error classes. API response shapes live in sdks/typescript/src/types/api.ts and are deliberately snake_case to match the server's Pydantic schemas.

Input validation is centralized in sdks/typescript/src/validation.ts. WorkspaceIdSchema constrains IDs to ^[a-zA-Z0-9_-]+$ and ≤ 512 characters; HonchoConfigSchema is .strict() and caps maxRetries at 3 with a positive timeout. The README example (sdks/typescript/README.md) demonstrates the canonical peer/session/message/chat loop.
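For readers working on the Python side, the ID rule can be rendered as follows. This is a sketch of the constraint quoted above (pattern and 512-character cap), not the SDK's own validator; the function name is hypothetical.

```python
import re

# Pattern and length cap as quoted in the manual for WorkspaceIdSchema.
ID_PATTERN = re.compile(r"^[a-zA-Z0-9_-]+$")
MAX_ID_LENGTH = 512


def is_valid_workspace_id(value: str) -> bool:
    """Python rendering of the ID rule; not the SDK's own validator."""
    # fullmatch is used so that "$" cannot match before a trailing newline.
    return len(value) <= MAX_ID_LENGTH and ID_PATTERN.fullmatch(value) is not None


print(is_valid_workspace_id("acme-prod_01"))
print(is_valid_workspace_id("has space"))
```

Using `fullmatch` rather than `match` matters in Python, where `$` also matches just before a final newline; with `match`, an ID such as `"abc\n"` would slip through.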

`honcho-cli` Terminal Client

The CLI is a Typer application whose entry point is honcho-cli/src/honcho_cli/main.py. It uses a custom HonchoTyperGroup (defined in honcho-cli/src/honcho_cli/_help.py) to replace Click's terse usage line with a themed, brand-colored panel. A --json flag (or the HONCHO_JSON env var) switches the global output renderer to JSON; a --version/-V callback prints the banner and exits eagerly.

The CLI is organized into subcommand modules, one per resource:

Module	Subcommands	Source
`peer.py`	list, card, chat, search, create, metadata, representation	honcho-cli/src/honcho_cli/commands/peer.py
`session.py`	list, inspect, context, summaries, peers, search, representation, metadata	honcho-cli/src/honcho_cli/commands/session.py
`conclusion.py`	list, search, create, delete (Honcho's memory atoms)	honcho-cli/src/honcho_cli/commands/conclusion.py

Every command resolves its resource ID from the CLI flag or from the resolved config (workspace, peer, session), validates the ID, and emits a structured print_error("NO_SCOPE", ...) if neither is set. The CLI calls the Python SDK under the hood, so its query semantics match peer.chat() exactly.
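The resolution rule can be pictured as a two-step fallback. The sketch below is hypothetical: `resolve_id` and `NoScopeError` are names invented for this example, and giving the flag priority over the config is an assumption about ordering rather than something confirmed by the CLI source.

```python
from typing import Optional


class NoScopeError(Exception):
    """Hypothetical stand-in for the CLI's structured NO_SCOPE error."""
    code = "NO_SCOPE"


def resolve_id(flag_value: Optional[str], config_value: Optional[str], kind: str) -> str:
    # Assumed order: an explicit CLI flag wins, then the resolved config.
    resolved = flag_value or config_value
    if not resolved:
        raise NoScopeError(f"no {kind} given via flag or config")
    return resolved


print(resolve_id("s1", "s2", "session"))   # flag value is used
print(resolve_id(None, "s2", "session"))   # falls back to config
```

Raising a dedicated error with a stable code, rather than returning an empty string, lets a caller render the failure consistently in both text and JSON output modes.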

Known issue: Versions of honcho-cli on PyPI prior to the merge of fix #786 shipped without a click dependency, causing ModuleNotFoundError on a clean uv tool install honcho-cli — see #808 for the publish-status discussion.

Agent Integrations and MCP

Honcho exposes memory as Model Context Protocol tools, used by Claude Code, Cursor, Windsurf, OpenCode, OpenClaw, Hermes, and any MCP-compatible client. The MCP server's config layer lives in mcp/src/config.ts. It reads an Authorization: Bearer <key> header and an X-Honcho-User-Name header from every request, constructs an @honcho-ai/sdk Honcho client, and routes the request. The Honcho API base URL is taken from the HONCHO_API_URL env var on the Worker (intentionally not a request header, to avoid leaking internal URLs to public clients) so the same binary can target api.honcho.dev or a self-hosted instance.
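The header handling described above can be sketched in Python as follows. This is an illustration of the two headers named in the manual, not a port of mcp/src/config.ts; `McpIdentity` and `parse_identity` are hypothetical names, and treating the user-name header as mandatory is an assumption.

```python
from typing import Mapping, NamedTuple


class McpIdentity(NamedTuple):
    api_key: str
    user_name: str


def parse_identity(headers: Mapping[str, str]) -> McpIdentity:
    # HTTP header names are case-insensitive, so normalise before lookup.
    normalized = {key.lower(): value for key, value in headers.items()}
    scheme, _, token = normalized.get("authorization", "").partition(" ")
    if scheme.lower() != "bearer" or not token.strip():
        raise ValueError("expected 'Authorization: Bearer <key>'")
    user_name = normalized.get("x-honcho-user-name", "").strip()
    if not user_name:
        # Assumption: the user name is required on every request.
        raise ValueError("missing X-Honcho-User-Name header")
    return McpIdentity(token.strip(), user_name)


identity = parse_identity({
    "Authorization": "Bearer hch-your-key-here",
    "X-Honcho-User-Name": "YourName",
})
print(identity.user_name)
```

The API base URL is deliberately absent from this function: per the text above, it comes from server-side configuration rather than from anything the client sends.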

Per the README.md, the canonical MCP install is:

claude mcp add honcho \
  --transport http \
  --url "https://mcp.honcho.dev" \
  --header "Authorization: Bearer hch-your-key-here" \
  --header "X-Honcho-User-Name: YourName"

For deeper Claude Code integration there is a plugin (/plugin marketplace add plastic-labs/claude-honcho), and dedicated plugins for OpenCode (@honcho-ai/opencode-honcho) and OpenClaw (@honcho-ai/openclaw-honcho). All of these ultimately forward to the same FastAPI surface, so the SDK and CLI can be used interchangeably to inspect what an agent has written to Honcho.

Configuration, Self-Hosting, and Common Failure Modes

All of these surfaces accept the same config.toml / .env / environment variable stack described in README.md. The TOML file is partitioned into [app], [db], [auth], [cache], [llm], [deriver], [peer_card], [dialectic], [summary], [dream], [webhook], [metrics], [telemetry], [vector_store], and [sentry] sections. When self-hosting, the SDK and CLI default baseURL to http://localhost:8000; the MCP Worker's HONCHO_API_URL env var must be set to the same address.

Recurring community-reported failure modes that affect every surface:

Deriver not running (#494): Messages are persisted and queue items are created, but no conclusions, peer cards, or representations appear. The fix is to start the deriver worker process; webhooks also require it (src/webhooks/README.md).
OpenAI-compatible providers returning 401 (#641): The default AsyncOpenAI client lacks a base_url, so a non-OpenAI key hits api.openai.com. Set OPENAI_BASE_URL and (if needed) override DERIVER_MODEL_CONFIG__TRANSPORT/DERIVER_MODEL_CONFIG__MODEL per the WSL/NVIDIA NIM example in #789.
Local embeddings provider (#443, #578): src/embedding_client.py hardcodes model name and base URL for two of three providers; until that becomes configurable, self-hosters must patch the source or proxy a compatible endpoint.
Self-reinforcing conclusions (#725): A stale conclusion attached to a UserPromptSubmit hook can survive document deletion and a Redis flush because the deriver re-derives it. The CLI's honcho conclusion delete (honcho-cli/src/honcho_cli/commands/conclusion.py) removes the stored conclusion, but deletion alone does not prevent re-derivation, so purge it and then re-derive from a clean document set.
DeepSeek tool continuation (#723): reasoning_content (str) vs. reasoning_details (list) is a transport-level mismatch that surfaces in dialectic calls; a model_config override per provider is the current workaround.

Reasoning Pipeline: Deriver, Dreamer, Summarizer, and Retrieval

Related topics: Overview and System Architecture, Self-Hosting, Configuration, and LLM Provider Setup

Overview and Scope

Honcho is a conversational memory platform that turns raw messages into a layered, queryable representation of each *peer*. The reasoning pipeline is the system responsible for ingesting messages, distilling them into atomic observations (Conclusions), maintaining compact identity snapshots (Peer Cards), producing progressive conversation digests (Session Summaries), and exposing the resulting memory through semantic Retrieval. The public surface for these capabilities is the *Workspace Configuration* object, which toggles each stage independently. Source: README.md and sdks/python/src/honcho/api_types.py

The pipeline is implemented as a FastAPI server backed by a queue, with two long-running worker processes (commonly referenced in the codebase as the *deriver* and *dreamer*). The honcho-cli and the Python/TypeScript SDKs are thin clients that enqueue messages and read derived memory. Source: honcho-cli/README.md and sdks/typescript/src/index.ts

Pipeline Stages

1. Deriver (Conclusion Generation)

The deriver is the first stage of the pipeline. When a peer adds a message to a session, the server persists the message and enqueues a work item. The deriver consumes the queue, calls a language model, and writes zero or more *conclusions* — atomic facts derived from the exchange. A Conclusion is the public memory atom: it has an id, content, observerId, observedId, optional sessionId, and a createdAt timestamp. Source: sdks/typescript/src/conclusions.ts:13-37

Conclusions are scoped by the (observer, observed) peer pair, which lets a workspace model both self-representation (observer == observed) and asymmetric cross-peer views (peer X's model of peer Y). Source: README.md (Internal storage section)
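One way to picture this scoping is an index keyed by the peer pair. The sketch below is a toy data structure, not Honcho's storage layer: the field names follow the list above in snake_case (with the timestamp left out), and `by_pair` and `record` are hypothetical.

```python
from collections import defaultdict
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Conclusion:
    # Fields follow the manual's list in snake_case; created_at is omitted for brevity.
    id: str
    content: str
    observer_id: str
    observed_id: str
    session_id: Optional[str] = None


# Index: (observer_id, observed_id) -> conclusions that observer holds about that peer.
by_pair: dict[tuple[str, str], list[Conclusion]] = defaultdict(list)


def record(conclusion: Conclusion) -> None:
    by_pair[(conclusion.observer_id, conclusion.observed_id)].append(conclusion)


record(Conclusion("c1", "Prefers oatmeal for breakfast", "alice", "alice"))
record(Conclusion("c2", "Alice seems to be a morning person", "bob", "alice", "s1"))

self_view = by_pair[("alice", "alice")]  # alice's self-representation
bobs_view = by_pair[("bob", "alice")]    # bob's model of alice
print(len(self_view), len(bobs_view))
```

Because the key is an ordered pair, bob's view of alice and alice's view of bob are separate entries, which is what makes the cross-peer views asymmetric.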

2. Peer Card (Identity Snapshot)

The peer card is a compact, low-latency identity summary built from a peer's conclusions. It is regenerated on a schedule and cached for fast retrieval. It is controlled by the peer_card configuration block, which exposes two independent flags: use (read pre-built cards into context) and create (run the card generation job). Source: sdks/python/src/honcho/api_types.py (PeerCardConfiguration) and sdks/typescript/src/validation.ts (PeerCardConfigSchema)

3. Summarizer (Progressive Context Compression)

The summarizer produces two rolling summaries of a session: a *short* summary and a *long* summary. Thresholds are user-tunable via summary.messages_per_short_summary (minimum 10) and summary.messages_per_long_summary (minimum 20). Summaries are exposed through the CLI as honcho session summaries and surface in the SDK as SessionSummaries / Summary objects. Source: sdks/typescript/src/validation.ts (SummaryConfigSchema), honcho-cli/src/honcho_cli/commands/session.py, and sdks/typescript/src/index.ts

4. Dreamer (Background Consolidation)

The dreamer is an offline consolidation pass. It runs less frequently than the deriver, traverses a workspace's conclusions, and re-organizes or prunes the representation. It is gated by the boolean dream.enabled flag in workspace and session configuration. Source: sdks/python/src/honcho/api_types.py (DreamConfiguration) and sdks/typescript/src/validation.ts (DreamConfigSchema)

5. Retrieval (Read Path)

Read-side retrieval is served by three endpoints: peer.context, peer.card, and conclusion.search. The Python SDK exposes these as peer.chat(), peer.context, and peer.search(); the TypeScript SDK mirrors them on the Peer class. The CLI surfaces them as honcho conclusion search <query> and honcho session context. Source: honcho-cli/src/honcho_cli/commands/conclusion.py and sdks/python/src/honcho/peer.py

flowchart LR
  A[Client SDK / CLI] -->|add messages| B[FastAPI Server]
  B -->|enqueue| C[Deriver Worker]
  C -->|LLM call| D[(Conclusions Store)]
  D -->|embed + index| E[(Vector Store)]
  C --> F[Peer Card Job]
  C --> G[Summarizer Job]
  D -->|periodic| H[Dreamer Worker]
  A -->|chat / search / context| B
  B -->|read| E
  B -->|read| F
  B -->|read| G

Configuration Surface

The pipeline is configured per workspace and per session, and a session-level config can override individual workspace settings. A field that is null or omitted at the session level means "inherit from the parent scope."
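The inheritance rule can be sketched as a small merge function. This is a hypothetical illustration of the behaviour described above, not server code; in particular, merging nested blocks field by field (rather than replacing a whole block) is an assumption.

```python
from typing import Any, Optional


def effective_config(workspace: dict[str, Any],
                     session: Optional[dict[str, Any]]) -> dict[str, Any]:
    """Resolve session-over-workspace settings; None or a missing key inherits."""
    resolved = dict(workspace)
    for key, value in (session or {}).items():
        if value is None:
            continue  # explicit null: keep the workspace value
        if isinstance(value, dict) and isinstance(resolved.get(key), dict):
            # Assumed: nested blocks merge field by field.
            resolved[key] = effective_config(resolved[key], value)
        else:
            resolved[key] = value
    return resolved


workspace_cfg = {
    "reasoning": {"enabled": True},
    "summary": {"enabled": True, "messages_per_short_summary": 20},
    "dream": {"enabled": True},
}
session_cfg = {
    "reasoning": None,                               # inherit
    "summary": {"messages_per_short_summary": 30},   # override one field
    "dream": {"enabled": False},                     # override
}
result = effective_config(workspace_cfg, session_cfg)
print(result)
```

Under these assumptions, a session can switch off one stage or retune one threshold without restating the rest of the workspace configuration.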

Config Block	Key Field(s)	Effect	Source
`reasoning`	`enabled`, `customInstructions`	Toggles LLM-driven derivation and lets callers inject system-prompt guidance	sdks/python/src/honcho/api_types.py
`peer_card`	`use`, `create`	Independently control card consumption and card generation	sdks/typescript/src/validation.ts
`summary`	`enabled`, `messages_per_short_summary` (≥10), `messages_per_long_summary` (≥20)	Enable rolling summaries and tune compression thresholds	sdks/typescript/src/validation.ts
`dream`	`enabled`	Enable background consolidation pass	sdks/python/src/honcho/api_types.py

The validation layer in both SDKs (extra="forbid" in Pydantic, .strict() in Zod) rejects unknown keys, making the contract additive. Source: sdks/python/src/honcho/api_types.py and sdks/typescript/src/validation.ts
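The effect of strict validation can be shown without either SDK. The class below is a plain-Python illustration, not the Pydantic or Zod model: `SummaryConfig` and `from_dict` are hypothetical, the default values are placeholders, and only the minimums (10 and 20) and the reject-unknown-keys behaviour are taken from this manual.

```python
from dataclasses import dataclass, fields


@dataclass
class SummaryConfig:
    """Plain-Python illustration of a strict config block (not the SDK model)."""
    enabled: bool = True
    messages_per_short_summary: int = 20  # placeholder default
    messages_per_long_summary: int = 60   # placeholder default

    def __post_init__(self) -> None:
        # Minimums as stated in the manual: short >= 10, long >= 20.
        if self.messages_per_short_summary < 10:
            raise ValueError("messages_per_short_summary must be >= 10")
        if self.messages_per_long_summary < 20:
            raise ValueError("messages_per_long_summary must be >= 20")

    @classmethod
    def from_dict(cls, data: dict) -> "SummaryConfig":
        allowed = {f.name for f in fields(cls)}
        unknown = set(data) - allowed
        if unknown:
            # Same effect as extra="forbid" / .strict(): fail instead of ignoring.
            raise ValueError(f"unknown keys: {sorted(unknown)}")
        return cls(**data)


cfg = SummaryConfig.from_dict({"messages_per_short_summary": 15})
print(cfg)
```

Failing on unknown keys means a misspelled field such as `messages_per_short_summmary` is reported at once rather than silently leaving the default in place.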

Known Failure Modes (Community-Reported)

Several community-reported issues map directly to the reasoning pipeline:

Deriver not auto-starting in self-hosted deployments (issue #494): messages are persisted and queued, but peer.card, peer.context, and conclusion.search return empty until the deriver worker is started manually. Self-hosters should confirm the deriver process is running alongside the API.
Self-reinforcing conclusions (issue #725): a conclusion with no grounding in source messages can re-derive itself after deletion and cache flush, and can be reinforced by subsequent interactions. The CLI's honcho conclusion delete <id> removes the row but does not prevent re-derivation in the current architecture.
OpenAI-compatible provider misconfiguration (issue #641): when LLM_OPENAI_API_KEY points to OpenRouter, vLLM, or similar, the deriver's AsyncOpenAI client falls back to api.openai.com and returns 401. The fix is to set OPENAI_BASE_URL to the compatible endpoint's URL.
Custom embedding endpoints (issues #443, #578): the embedding client hardcodes model names and base URLs for two of three providers. Self-hosted embedding backends (Ollama, llama.cpp, TEI) may require code modifications or future configuration hooks.

Operational Notes

The reasoning pipeline is asynchronous: add_messages returns before conclusions are written, and queue depth is observable via the session queue-status endpoint. The SDK paginates all list responses (SyncPage in Python, Page<T> in TypeScript) so that large conclusion or session sets can be streamed. Source: sdks/python/src/honcho/pagination.py and sdks/typescript/src/pagination.ts
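The streaming behaviour of a paginated list can be sketched with a generic page object. This is a hypothetical model written for illustration; the SDK's SyncPage may differ in fields and method names, and `fetch` here reads from a local list rather than an HTTP endpoint.

```python
from dataclasses import dataclass
from typing import Callable, Generic, Iterator, Optional, TypeVar

T = TypeVar("T")


@dataclass
class Page(Generic[T]):
    """Hypothetical page wrapper; the SDK's SyncPage may differ in shape."""
    items: list[T]
    page: int
    total_pages: int
    fetch: Callable[[int], "Page[T]"]  # loads another page by number

    def next_page(self) -> Optional["Page[T]"]:
        return self.fetch(self.page + 1) if self.page < self.total_pages else None

    def __iter__(self) -> Iterator[T]:
        # Yield items from this page, then follow later pages lazily.
        current: Optional[Page[T]] = self
        while current is not None:
            yield from current.items
            current = current.next_page()


DATA = list(range(1, 8))  # seven fake records
SIZE = 3


def fetch(page: int) -> Page[int]:
    start = (page - 1) * SIZE
    total = -(-len(DATA) // SIZE)  # ceiling division
    return Page(DATA[start:start + SIZE], page, total, fetch)


all_items = list(fetch(1))
print(all_items)
```

Iterating the first page transparently walks every later page, so a caller can process a large result set without holding more than one page in memory at a time.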

Honcho's evals (LongMemEval, LoCoMo, and others) are the primary correctness signal for the pipeline; the README links to the evals page and the benchmarking blog post for reproducible methodology. Source: README.md

Doramagic Pitfall Log

Source-linked risks stay visible on the manual page so the preview does not read like a recommendation.

Found 20 structured pitfall item(s), including 3 high/blocking item(s). Top priority: Installation risk - Installation risk requires verification.

1. Installation risk: Installation risk requires verification

Severity: high
Finding: Project evidence flags an installation risk. Review the linked source before relying on this workflow.
User impact: May increase setup, validation, or first-run risk for the user.
Recommended check: Reproduce the official install and quickstart path in an isolated environment.
Evidence: community_evidence:github | https://github.com/plastic-labs/honcho/issues/725

2. Configuration risk: Configuration risk requires verification

Severity: high
Finding: Project evidence flags a configuration risk. Review the linked source before relying on this workflow.
User impact: May increase setup, validation, or first-run risk for the user.
Recommended check: Reproduce the official install and quickstart path in an isolated environment.
Evidence: community_evidence:github | https://github.com/plastic-labs/honcho/issues/494

3. Security or permission risk: Security or permission risk requires verification

Severity: high
Finding: Project evidence flags a security or permission risk. Review the linked source before relying on this workflow.
User impact: May increase setup, validation, or first-run risk for the user.
Recommended check: Reproduce the official install and quickstart path in an isolated environment.
Evidence: community_evidence:github | https://github.com/plastic-labs/honcho/issues/789

4. Installation risk: Installation risk requires verification

Severity: medium
Finding: Developers should check this installation risk before relying on the project: [Bug] ModuleNotFoundError: No module named 'click'
User impact: Developers may fail before the first successful local run: [Bug] ModuleNotFoundError: No module named 'click'
Recommended check: Before packaging this project, run the relevant install/config/quickstart check for: [Bug] ModuleNotFoundError: No module named 'click'. Context: Observed when using python
Evidence: failure_mode_cluster:github_issue | https://github.com/plastic-labs/honcho/issues/786

5. Installation risk: Installation risk requires verification

Severity: medium
Finding: Developers should check this installation risk before relying on the project: [Feature] Support TurboQuant/turbovec as optional vector store backend for memory compression
User impact: Developers may fail before the first successful local run: [Feature] Support TurboQuant/turbovec as optional vector store backend for memory compression
Recommended check: Before packaging this project, run the relevant install/config/quickstart check for: [Feature] Support TurboQuant/turbovec as optional vector store backend for memory compression. Context: Observed when using node, python, linux
Evidence: failure_mode_cluster:github_issue | https://github.com/plastic-labs/honcho/issues/781

6. Installation risk: Installation risk requires verification

Severity: medium
Finding: Developers should check this installation risk before relying on the project: honcho-cli 0.1.0 on PyPI still missing click — please publish release with #786 fix
User impact: Developers may fail before the first successful local run: honcho-cli 0.1.0 on PyPI still missing click — please publish release with #786 fix
Recommended check: Before packaging this project, run the relevant install/config/quickstart check for: honcho-cli 0.1.0 on PyPI still missing click — please publish release with #786 fix. Context: Observed when using python, macos
Evidence: failure_mode_cluster:github_issue | https://github.com/plastic-labs/honcho/issues/808

7. Installation risk: Installation risk requires verification

Severity: medium
Finding: Project evidence flags an installation risk. Review the linked source before relying on this workflow.
User impact: May increase setup, validation, or first-run risk for the user.
Recommended check: Reproduce the official install and quickstart path in an isolated environment.
Evidence: community_evidence:github | https://github.com/plastic-labs/honcho/issues/786

8. Installation risk: Installation risk requires verification

Severity: medium
Finding: Project evidence flags an installation risk. Review the linked source before relying on this workflow.
User impact: May increase setup, validation, or first-run risk for the user.
Recommended check: Reproduce the official install and quickstart path in an isolated environment.
Evidence: community_evidence:github | https://github.com/plastic-labs/honcho/issues/781

9. Installation risk: Installation risk requires verification

Severity: medium
Finding: Project evidence flags an installation risk. Review the linked source before relying on this workflow.
User impact: May increase setup, validation, or first-run risk for the user.
Recommended check: Reproduce the official install and quickstart path in an isolated environment.
Evidence: community_evidence:github | https://github.com/plastic-labs/honcho/issues/808

10. Configuration risk: Configuration risk requires verification

Severity: medium
Finding: Project evidence flags a configuration risk. Review the linked source before relying on this workflow.
User impact: May increase setup, validation, or first-run risk for the user.
Recommended check: Reproduce the official install and quickstart path in an isolated environment.
Evidence: capability.host_targets | https://github.com/plastic-labs/honcho

11. Configuration risk: Configuration risk requires verification

Severity: medium
Finding: Developers should check this configuration risk before relying on the project: I need help setting up honcho
User impact: Developers may misconfigure credentials, environment, or host setup: I need help setting up honcho
Recommended check: Before packaging this project, run the relevant install/config/quickstart check for: I need help setting up honcho. Context: Observed when using docker, windows, cuda
Evidence: failure_mode_cluster:github_issue | https://github.com/plastic-labs/honcho/issues/789

12. Configuration risk: Configuration risk requires verification

Severity: medium
Finding: Developers should check this configuration risk before relying on the project: Self-hosted/local Honcho ingests messages but derived memory does not appear automatically; peer card/context/search remain empty
User impact: Developers may misconfigure credentials, environment, or host setup: Self-hosted/local Honcho ingests messages but derived memory does not appear automatically; peer card/context/search remain empty
Recommended check: Before packaging this project, run the relevant install/config/quickstart check for: Self-hosted/local Honcho ingests messages but derived memory does not appear automatically; peer card/context/search remain empty. Context: Observed when using python
Evidence: failure_mode_cluster:github_issue | https://github.com/plastic-labs/honcho/issues/494

Source: Doramagic discovery, validation, and Project Pack records
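Pitfall items 4 and 6 above both report a `ModuleNotFoundError` for `click`. A minimal sketch of a pre-flight check follows, assuming you run it with the same interpreter that honcho-cli was installed into. The helper name `missing_modules` is hypothetical and not part of honcho or honcho-cli; it relies only on the standard-library `importlib.util.find_spec`.

```python
import importlib.util


def missing_modules(names):
    """Return the top-level module names that cannot be located.

    Hypothetical helper for illustration; not part of honcho or honcho-cli.
    """
    missing = []
    for name in names:
        try:
            spec = importlib.util.find_spec(name)
        except (ImportError, ValueError):
            # Treat a broken or half-installed module as missing.
            spec = None
        if spec is None:
            missing.append(name)
    return missing


if __name__ == "__main__":
    # `click` is the module named in the reported ModuleNotFoundError.
    absent = missing_modules(["click"])
    if absent:
        print("Missing modules:", ", ".join(absent))
    else:
        print("All checked modules are importable.")
```

Running this inside a fresh virtual environment, right after installing the CLI, matches the "isolated environment" check recommended in the log. An empty result only shows that the module can be found, not that the CLI itself works.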

Community Discussion Evidence

These external discussion links are review inputs, not standalone proof that the project is production-ready.

Sources: 8

Count of project-level external discussion links exposed on this manual page.

Use: Review before install

Open the linked issues or discussions before treating the pack as ready for your environment.

Doramagic exposes project-level community discussion separately from official documentation. Review these links before using honcho with real data or production workflows.

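[[Feature] Support TurboQuant/turbovec as optional vector store backend for memory compression](https://github.com/plastic-labs/honcho/issues/781) - github / github_issue
[Self-hosted/local Honcho ingests messages but derived memory does not appear automatically; peer card/context/search remain empty](https://github.com/plastic-labs/honcho/issues/494) - github / github_issue
Conclusions re-derive after delete + cache flush (self-reinforcing) - github / github_issue
[honcho-cli 0.1.0 on PyPI still missing click — please publish release with #786 fix](https://github.com/plastic-labs/honcho/issues/808) - github / github_issue
[I need help setting up honcho](https://github.com/plastic-labs/honcho/issues/789) - github / github_issue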
Test issue for honcho_conclude parameters - github / github_issue
[[Bug] ModuleNotFoundError: No module named 'click'](https://github.com/plastic-labs/honcho/issues/786) - github / github_issue
Configuration risk requires verification - GitHub / issue

Source: Project Pack community evidence and pitfall evidence
