Doramagic Project Pack · Human Manual
honcho
Memory library for building stateful agents
Overview and System Architecture
Related topics: Self-Hosting, Configuration, and LLM Provider Setup, Reasoning Pipeline: Deriver, Dreamer, Summarizer, and Retrieval
Honcho is an open-source conversational memory platform positioned as a "memory layer" for LLM-based agents. Rather than returning semantically matched chunks like a retrieval-augmented generation (RAG) system, Honcho extracts reasoned conclusions about peers (users, agents, groups, projects) and serves them through a single FastAPI server. The project is split across multiple repositories. This one hosts the core service logic, the Python and TypeScript SDKs in sdks/, and the CLI in honcho-cli/; optional managed hosting is available at api.honcho.dev (README.md:1-40).
What Honcho Is For
Honcho targets two main audiences: developers who want to give coding agents persistent memory, and product teams who want to add memory to LLM-powered applications. According to the README, "Using Honcho as your memory system will earn your agents higher retention, more trust, and help you build data moats" (README.md:14-18). Capability highlights include reasoning-first memory extraction, a peer-centric data model, multi-peer perspective (modelling what peer X knows about peer Y), and support for both managed and self-hosted deployments.
System Architecture
The runtime is composed of an HTTP API server, a background deriver worker, a CLI, and two language SDKs. The following diagram summarizes the request flow:
flowchart LR
subgraph Clients
SDK_P[Python SDK<br/>honcho-ai]
SDK_T[TypeScript SDK<br/>@honcho-ai/sdk]
CLI[honcho-cli]
MCP[MCP / Agent tools]
end
subgraph Server
API[FastAPI Server<br/>routers + middleware]
QM[QueueManager<br/>deriver process]
WH[Webhook Delivery]
end
subgraph Storage
DB[(Postgres<br/>workspaces, peers,<br/>sessions, messages)]
VS[(Vector Store<br/>pgvector / turbopuffer / lancedb)]
CACHE[(Redis cache)]
end
LLM[LLM provider<br/>OpenAI / Anthropic /<br/>OpenRouter / vLLM / NIM]
EMB[Embedding provider]
SDK_P -->|HTTP| API
SDK_T -->|HTTP| API
CLI -->|HTTP| API
MCP -->|HTTP| API
API --> DB
API --> VS
API --> CACHE
API --> EMB
QM --> DB
QM --> VS
QM --> LLM
QM --> CACHE
QM --> WH
WH -->|signed POST| Subscribers

The README confirms the layered model: workspaces hold peers, peers participate in sessions, messages live on sessions, and Honcho builds a per-peer representation that callers query through the Chat Endpoint (README.md:62-66). The webhooks subsystem uses the same QueueManager process that powers the deriver to deliver signed HTTP POSTs to subscriber URLs (src/webhooks/README.md:9-19).
The Honcho Loop
Honcho's API contract is captured in a four-step loop (README.md:68-78):
- Store — conversations, events, documents, or tool traces are appended as messages on a session.
- Reason — the deriver consumes queue items in the background and updates peer representations, producing conclusions and refreshing peer cards.
- Query — callers ask Honcho for context, search results, peer representations, or a natural-language answer through the Chat Endpoint.
- Inject — the result is dropped into any LLM call or agent framework.
This loop is visible at the SDK level. In Python, peer.message(...) produces a MessageCreateParams that is then passed to session.add_messages(...) (sdks/python/src/honcho/peer.py:1-60). The TypeScript SDK exposes the same primitives through session.addMessages([...]) and peer.chat(...) (sdks/typescript/README.md:14-30). The Conclusion resource, exposed in both SDKs, represents the atom Honcho derives from messages (sdks/typescript/src/conclusions.ts:1-40).
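The four steps can be illustrated with a toy, in-memory model. This is only a sketch of the control flow: `ToyMemory` and `inject` are hypothetical names, not Honcho APIs, and the "reasoning" step simply echoes each message where a real deriver would call an LLM.

```python
from collections import deque
from dataclasses import dataclass, field


@dataclass
class ToyMemory:
    """Hypothetical in-memory stand-in for the Store/Reason/Query steps."""
    messages: list = field(default_factory=list)      # (peer, text) pairs
    queue: deque = field(default_factory=deque)       # work awaiting reasoning
    conclusions: dict = field(default_factory=dict)   # peer -> list of strings

    def store(self, peer: str, text: str) -> None:
        # Store: append the message and enqueue it for background reasoning.
        self.messages.append((peer, text))
        self.queue.append((peer, text))

    def reason(self) -> None:
        # Reason: drain the queue. A real deriver would call an LLM here;
        # this toy only records each message as a "conclusion".
        while self.queue:
            peer, text = self.queue.popleft()
            self.conclusions.setdefault(peer, []).append(f"{peer} said: {text}")

    def query(self, peer: str) -> list:
        # Query: return whatever has been derived so far for this peer.
        return list(self.conclusions.get(peer, []))


def inject(question: str, context: list) -> str:
    # Inject: splice the retrieved context into a prompt for any LLM call.
    lines = "\n".join(f"- {item}" for item in context)
    return f"Known context:\n{lines}\n\nQuestion: {question}"


memory = ToyMemory()
memory.store("alice", "I had oatmeal for breakfast.")
before = memory.query("alice")   # nothing derived yet: reasoning is asynchronous
memory.reason()
after = memory.query("alice")
prompt = inject("What did alice have for breakfast?", after)
```

The point of the sketch is the ordering: a query issued before the reasoning step has run sees no derived memory, which mirrors the behaviour self-hosters report when the deriver is not running.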
Key Components and Services
| Component | Location | Role |
|---|---|---|
| FastAPI server | src/ (routers, middleware) | Public HTTP surface for workspaces, peers, sessions, messages, conclusions, chat, webhooks |
| Deriver worker | background process | Polls queue, drives reasoning, summary, peer card, and dream jobs |
| CLI | honcho-cli/src/honcho_cli/main.py | Terminal interface; honcho conclusion ... and honcho session ... are two of the most-used command groups |
| Python SDK | sdks/python/src/honcho/ | Pydantic-validated async client; ships its own typed api_types.py |
| TypeScript SDK | sdks/typescript/src/ | Zod-validated client; published as @honcho-ai/sdk v2.1.2 |
| Webhook delivery | src/webhooks/ | Signs and dispatches events to user-configured URLs |
Both SDKs share the same domain model. The Python side defines Pydantic models for ReasoningConfiguration, PeerCardConfiguration, SummaryConfiguration, DreamConfiguration, and the umbrella WorkspaceConfiguration (sdks/python/src/honcho/api_types.py:1-80). The TypeScript side mirrors this with Zod schemas and a snake_case api.ts (sdks/typescript/src/types/api.ts:1-60, sdks/typescript/src/validation.ts:1-60). Pagination is implemented symmetrically: the Python SyncPage/AsyncPage generics (sdks/python/src/honcho/pagination.py:1-60) correspond to the TypeScript Page<T, U> with Symbol.asyncIterator (sdks/typescript/src/pagination.ts:1-50).
Configuration Surfaces
Honcho accepts configuration in priority order: environment variables > .env > config.toml > defaults (README.md:128-150). The TOML file is organized into sections: [app], [db], [auth], [cache], [llm], [deriver], [peer_card], [dialectic], [summary], [dream], [webhook], [metrics], [telemetry], [vector_store], and [sentry]. The [vector_store] section explicitly supports pgvector, turbopuffer, or lancedb backends — a detail that aligns with the community request to add TurboQuant/turbovec as an optional compressed vector backend (see issue #781).
Several community-reported issues map directly to configuration gaps. Self-hosters have repeatedly found that the deriver process must be started separately for derived memory to appear (issue #494). The OpenAI-compatible provider path (LLM_OPENAI_API_KEY + OPENAI_BASE_URL) requires both keys to be set, otherwise AsyncOpenAI falls back to api.openai.com and fails with 401 (issue #641). Local embedding providers (Ollama, llama.cpp, TEI) currently require source-level changes in src/embedding_client.py because the model name and base URL are hardcoded for two of three providers (issue #578, issue #443).
See Also
- Core Concepts and Data Model: workspaces, peers, sessions, messages, conclusions
- Configuration Reference: the full [app], [db], [llm], [deriver], and [vector_store] sections
- Self-hosting Guide: Docker Compose, Fly.io deployment
- SDK Quickstarts: Python (honcho-ai) and TypeScript (@honcho-ai/sdk) walkthroughs
- Webhooks: event types, signing, QueueManager integration
- CLI Reference: honcho conclusion, honcho session, and related command groups
Source: https://github.com/plastic-labs/honcho / Human Manual
Self-Hosting, Configuration, and LLM Provider Setup
Related topics: Overview and System Architecture, Reasoning Pipeline: Deriver, Dreamer, Summarizer, and Retrieval
Overview
Honcho is a memory infrastructure service for stateful agents, distributed as a FastAPI server that can be run managed at api.honcho.dev or self-hosted locally. Source: README.md:1-40. The self-hosted deployment path is targeted at users who want full control over their peer memory pipelines — especially when wiring OpenAI-compatible providers, local embeddings, or alternative vector stores. The project is licensed AGPL-3.0 and ships reference Docker configuration plus TOML-based configuration files.
The core runtime topology consists of a web/API service and a background deriver process that drains a QueueManager and turns ingested messages into conclusions and peer representations. Source: src/webhooks/README.md:14-19. This two-process model is the single most common source of self-hosting confusion — see "Common Failure Modes" below.
Self-Hosting Topology
A self-hosted Honcho deployment typically runs the following services side-by-side:
| Component | Role | Configuration surface |
|---|---|---|
| FastAPI web/API | Accepts writes, queries, chat endpoint | APP_*, AUTH_* env vars |
| Deriver worker | Polls QueueManager, derives conclusions, fires webhooks | DERIVER_*, DREAM_* |
| Postgres + pgvector | Stores messages, documents, vector embeddings | DB_* |
| Redis | Caching, peer card cache | CACHE_* |
| Optional Turbopuffer / LanceDB | External vector store backend | VECTOR_STORE_* |
Webhooks specifically require the deriver process to be running to facilitate delivery — otherwise queue items accumulate without ever being dispatched. Source: src/webhooks/README.md:20-22. The README explicitly states that the project ships with docker-compose.yml.example and supports deploying to Fly.io via documented guides. Source: README.md:1-10.
Configuration System
Honcho uses a flexible configuration system supporting both TOML files and environment variables. Configuration values are loaded in priority order: environment variables > .env file > config.toml > defaults. Source: README.md:42-46.
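The priority order can be pictured as a layered lookup in which the first layer that defines a key wins. The sketch below uses Python's `collections.ChainMap` to show the idea; the function name and the keys (`log_level`, `port`, `cache_enabled`) are made up for illustration and do not describe how Honcho's loader is actually implemented.

```python
from collections import ChainMap


def resolve_settings(env: dict, dotenv: dict, toml: dict, defaults: dict) -> ChainMap:
    # ChainMap searches its mappings left to right, so earlier layers
    # shadow later ones: env > .env > config.toml > defaults.
    return ChainMap(env, dotenv, toml, defaults)


# Hypothetical values for each layer.
defaults = {"log_level": "INFO", "port": "8000", "cache_enabled": "false"}
toml_file = {"log_level": "WARNING", "port": "9000"}
dotenv_file = {"log_level": "DEBUG"}
environment = {"port": "8080"}

settings = resolve_settings(environment, dotenv_file, toml_file, defaults)
```

Here `port` resolves from the environment, `log_level` from the .env layer, and `cache_enabled` falls through to the default because no higher layer sets it.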
To begin, copy the example configuration file:
cp config.toml.example config.toml
Source: README.md:50-52. The TOML file is organized into clearly-named sections including [app], [db], [auth], [cache], [llm], [deriver], [peer_card], [dialectic], [summary], [dream], [webhook], [metrics], [telemetry], [vector_store], and [sentry]. Source: README.md:54-70.
Workspace Configuration Surface
At runtime, workspaces themselves carry configuration that controls reasoning, peer cards, summaries, and dream processing. The Python SDK exposes these as Pydantic models with extra="forbid" semantics, ensuring unknown fields are rejected rather than silently ignored. Source: sdks/python/src/honcho/api_types.py:20-90.
Representative models include:
- ReasoningConfiguration: toggle reasoning and pass custom instructions.
- PeerCardConfiguration: toggle creation/use of peer cards.
- SummaryConfiguration: enable summarization and tune messages_per_short_summary / messages_per_long_summary.
- DreamConfiguration: enable dream processing.
- WorkspaceConfiguration: bundles the above into a workspace-scoped container.
Source: sdks/python/src/honcho/api_types.py:20-90.
LLM Provider Setup
Honcho's LLM configuration is namespaced under [llm] in TOML and under LLM_* environment variables. The model config for each subsystem (deriver, peer card, dialectic, summary, dream) accepts a transport field plus a model identifier. Source: README.md:54-70.
A representative self-hosted setup against an OpenAI-compatible provider looks like this:
LLM_OPENAI_API_KEY=<provider-key>
OPENAI_BASE_URL=https://integrate.api.nvidia.com/v1
DERIVER_MODEL_CONFIG__TRANSPORT=openai
DERIVER_MODEL_CONFIG__MODEL=nvidia/nemotron-3-nano-omni-3
The double-underscore (__) is Honcho's nested-config delimiter, mapping DERIVER_MODEL_CONFIG__TRANSPORT to deriver.model_config.transport in the TOML schema. Source: README.md:54-70.
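As a rough illustration of that mapping, the sketch below converts an environment variable name into a dotted TOML path. It assumes the leading segment of the variable matches one of the TOML section names listed earlier and that each `__` introduces one nesting level; `env_to_path` is a hypothetical helper, not part of Honcho.

```python
# Section names taken from the TOML layout described above.
SECTIONS = (
    "app", "db", "auth", "cache", "llm", "deriver", "peer_card", "dialectic",
    "summary", "dream", "webhook", "metrics", "telemetry", "vector_store", "sentry",
)


def env_to_path(name: str, sections: tuple = SECTIONS) -> str:
    """Map e.g. DERIVER_MODEL_CONFIG__TRANSPORT to deriver.model_config.transport."""
    lowered = name.lower()
    # Try longer section names first so a multi-word section is matched whole.
    for section in sorted(sections, key=len, reverse=True):
        prefix = section + "_"
        if lowered.startswith(prefix):
            rest = lowered[len(prefix):]
            # Each double underscore marks one level of nesting.
            return ".".join([section, *rest.split("__")])
    raise ValueError(f"{name} does not start with a known section prefix")


transport_path = env_to_path("DERIVER_MODEL_CONFIG__TRANSPORT")
model_path = env_to_path("DERIVER_MODEL_CONFIG__MODEL")
```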
SDK Configuration Surface
Both SDKs consume the same configuration shape. The TypeScript SDK exposes HonchoConfig alongside validation types such as ChatQuery, ContextParams, and GetRepresentationParams. Source: sdks/typescript/src/index.ts:1-70. The Python SDK provides a Honcho client and per-peer helpers like peer.message(...) (which builds MessageCreateParams with optional configuration and metadata) and peer.search(...) (which scopes semantic search to the peer-as-author). Source: sdks/python/src/honcho/peer.py:1-90.
Embeddings Provider
Honcho's embedding client (referenced in README.md:54-70 under [app] embedding settings) supports multiple providers. Community-reported issues indicate that for self-hosted embeddings (Ollama, llama.cpp, TEI, Infinity), additional environment-level configuration is typically required because the embedding client hardcodes model names and base URLs for two of three providers — see "Common Failure Modes" below.
Common Failure Modes
Several recurring self-hosting issues have surfaced in community discussions:
1. Deriver not running → derived memory stays empty
In self-hosted/local deployments, messages are written and queue items are created, but derived memory does not appear unless the deriver is started manually. Source: community issue #494. The fix is operational rather than code-level: ensure the deriver container/process is running alongside the web/API service, since conclusions, peer cards, and search-derived context all depend on it. Source: src/webhooks/README.md:20-22.
2. OpenAI-compatible provider returns 401
When LLM_OPENAI_API_KEY is set to a non-OpenAI key but OPENAI_BASE_URL is not propagated through to every LLM client, dialectic/deriver/summary calls fail with openai.AuthenticationError: 401 against api.openai.com. Source: community issue #641. The workaround is to explicitly set OPENAI_BASE_URL (or the per-subsystem equivalent under [llm]) for each provider.
3. Local embeddings provider unusable without code changes
Users have reported having to modify embedding_client.py to use a local embeddings provider because certain branches fall back to a hardcoded OpenAI endpoint. Source: community issue #443. A related feature request (issue #578) asks for a configurable embedding model name plus custom base URL for self-hosted embeddings.
4. honcho-cli missing `click` dependency
Installing honcho-cli via uv tool install honcho-cli can result in ModuleNotFoundError: No module named 'click' until a release containing the fix from PR #786 is published. Source: community issues #786 and #808. Until the PyPI release is updated, install from source or pin a fixed version.
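Failure mode 2 lends itself to a small preflight check before starting the services. The following is a minimal sketch, assuming only the two variable names mentioned above; `check_openai_compatible_env` is a hypothetical helper and the messages are illustrative. A key without a base URL is legitimate when the key really belongs to OpenAI, so the result is best treated as a warning.

```python
import os


def check_openai_compatible_env(env: dict) -> list:
    """Return human-readable warnings about the OpenAI-compatible settings."""
    warnings = []
    has_key = bool(env.get("LLM_OPENAI_API_KEY"))
    has_url = bool(env.get("OPENAI_BASE_URL"))
    if has_key and not has_url:
        warnings.append(
            "LLM_OPENAI_API_KEY is set but OPENAI_BASE_URL is not; "
            "a non-OpenAI key would be sent to api.openai.com and rejected with 401"
        )
    if has_url and not has_key:
        warnings.append("OPENAI_BASE_URL is set but LLM_OPENAI_API_KEY is not")
    return warnings


# Check the current process environment; an empty list means both or neither are set.
current_warnings = check_openai_compatible_env(dict(os.environ))
```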
See Also
- Honcho MCP Server — Cloudflare Worker MCP integration.
- Webhooks — Event delivery subsystem and deriver dependency.
- Honcho TypeScript SDK — Client surface and configuration types.
SDKs, CLI, and Agent Integrations
Related topics: Overview and System Architecture, Self-Hosting, Configuration, and LLM Provider Setup
Honcho ships four surface areas on top of its FastAPI memory server: the Python SDK (honcho-ai), the TypeScript SDK (@honcho-ai/sdk), the honcho-cli terminal client, and a growing set of agent integrations, including a hosted MCP server. This page documents the role, structure, and configuration of each.
Overview and High-Level Architecture
The core service is a FastAPI server that ingests messages, runs the deriver to update peer representations, and serves the Chat Endpoint, conclusions, peer cards, and session summaries. The SDKs, CLI, and integrations are thin client layers that translate native idioms (Python objects, TypeScript classes, shell commands, MCP tool calls) into HTTP requests against that server.
flowchart LR
subgraph Clients
A[Python SDK<br/>honcho-ai]
B[TypeScript SDK<br/>@honcho-ai/sdk]
C[honcho-cli<br/>Typer]
D[Claude Code / OpenCode / OpenClaw / Hermes]
E[MCP Server<br/>mcp/src/config.ts]
end
F[(Honcho FastAPI<br/>api.honcho.dev or :8000)]
G[(Deriver Worker)]
F <--> G
A --> F
B --> F
C --> F
D --> E
E --> F

The same Honcho client class in both SDKs exposes peer(), session(), add_messages(), chat(), and representation() methods. Configuration is loaded with the same precedence in every surface: environment variables → .env → config.toml → defaults (see README.md).
Python SDK (`honcho-ai`)
The Python package exposes the full domain model as Pydantic-typed objects. Configuration types in sdks/python/src/honcho/api_types.py include ReasoningConfiguration, PeerCardConfiguration, SummaryConfiguration, DreamConfiguration, and the umbrella WorkspaceConfiguration. These mirror the server's Pydantic schemas with extra="forbid" so callers cannot smuggle in unknown keys.
A typical peer-message workflow is built around helpers in sdks/python/src/honcho/peer.py. peer.message(content, ...) validates that the content is non-empty, parses optional created_at timestamps, and wraps everything in a MessageCreateParams object:
import asyncio

from honcho import Honcho


async def main() -> None:
    # await is only valid inside a coroutine, so the calls live in main().
    honcho = Honcho(workspace_id="acme")
    alice = honcho.peer("alice")
    session = honcho.session("s1")
    await session.add_peers([alice])
    await session.add_messages(alice.message("I had oatmeal for breakfast."))
    resp = await alice.chat("what did alice have for breakfast today?")
    print(resp)


asyncio.run(main())
The search() method in peer.py relies on pydantic's validate_call and Field to type its query, filters, and limit arguments and to enforce 1 ≤ limit ≤ 100; it returns a list of Message objects (Source: sdks/python/src/honcho/peer.py). Install via pip install honcho-ai, uv add honcho-ai, or poetry add honcho-ai.
TypeScript SDK (`@honcho-ai/sdk`)
The TypeScript SDK is a DX-optimized, isomorphic client published as @honcho-ai/sdk (see sdks/typescript/package.json, version 2.1.2, Apache-2.0). It depends only on zod ^4.0.0 for runtime validation — no HTTP client dependency is hardcoded, allowing the package to run under Bun, Node, and edge runtimes.
The barrel export in sdks/typescript/src/index.ts re-exports the domain classes (Honcho, Peer, Session, Message, Conclusion, SessionContext, SessionSummaries, Summary), the streaming types (DialecticStreamChunk, DialecticStreamResponse), and all error classes. API response shapes live in sdks/typescript/src/types/api.ts and are deliberately snake_case to match the server's Pydantic schemas.
Input validation is centralized in sdks/typescript/src/validation.ts. WorkspaceIdSchema constrains IDs to ^[a-zA-Z0-9_-]+$ and ≤ 512 characters; HonchoConfigSchema is .strict() and caps maxRetries at 3 with a positive timeout. The README example (sdks/typescript/README.md) demonstrates the canonical peer/session/message/chat loop.
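The ID constraint is easy to reproduce outside the SDK, for example to validate input before it reaches a client. The sketch below re-implements the rule described above in plain Python (the SDK itself uses Zod); the pattern and the 512-character limit come from the text, while the function name is hypothetical.

```python
import re

# Same character class as the pattern quoted for WorkspaceIdSchema.
WORKSPACE_ID_RE = re.compile(r"^[a-zA-Z0-9_-]+$")
MAX_ID_LENGTH = 512


def is_valid_workspace_id(value: str) -> bool:
    # fullmatch requires the whole string to match, so a trailing newline
    # or any other stray character is rejected.
    return len(value) <= MAX_ID_LENGTH and WORKSPACE_ID_RE.fullmatch(value) is not None
```

For instance, `is_valid_workspace_id("acme")` is true, while an ID containing a space or an empty string is rejected.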
`honcho-cli` Terminal Client
The CLI is a Typer application whose entry point is honcho-cli/src/honcho_cli/main.py. It uses a custom HonchoTyperGroup (defined in honcho-cli/src/honcho_cli/_help.py) to replace Click's terse usage line with a themed, brand-colored panel. A --json flag (or the HONCHO_JSON env var) switches the global output renderer to JSON; a --version/-V callback prints the banner and exits eagerly.
The CLI is organized into subcommand modules, one per resource:
| Module | Subcommands | Source |
|---|---|---|
| peer.py | list, card, chat, search, create, metadata, representation | honcho-cli/src/honcho_cli/commands/peer.py |
| session.py | list, inspect, context, summaries, peers, search, representation, metadata | honcho-cli/src/honcho_cli/commands/session.py |
| conclusion.py | list, search, create, delete (Honcho's memory atoms) | honcho-cli/src/honcho_cli/commands/conclusion.py |
Every command resolves its resource ID from the CLI flag or from the resolved config (workspace, peer, session), validates the ID, and emits a structured print_error("NO_SCOPE", ...) if neither is set. The CLI calls into the Python SDK under the hood, so its query semantics match peer.chat() exactly.
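The flag-then-config resolution described here can be sketched in a few lines. This is an illustration of the pattern only: `ScopeError` and `resolve_id` are hypothetical names, and the real CLI reports the problem through print_error rather than an exception.

```python
from typing import Optional


class ScopeError(Exception):
    """Hypothetical error carrying a machine-readable code such as NO_SCOPE."""

    def __init__(self, code: str, message: str) -> None:
        super().__init__(message)
        self.code = code


def resolve_id(kind: str, flag_value: Optional[str], config_value: Optional[str]) -> str:
    # An explicit CLI flag takes precedence over the value from resolved config.
    resolved = flag_value or config_value
    if not resolved:
        raise ScopeError("NO_SCOPE", f"no {kind} given via flag or config")
    return resolved


peer_from_flag = resolve_id("peer", "alice", "bob")
peer_from_config = resolve_id("peer", None, "bob")
```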
Known issue: Versions of honcho-cli on PyPI prior to the merge of fix #786 shipped without a click dependency, causing ModuleNotFoundError on a clean uv tool install honcho-cli — see #808 for the publish-status discussion.
Agent Integrations and MCP
Honcho exposes memory as Model Context Protocol tools, used by Claude Code, Cursor, Windsurf, OpenCode, OpenClaw, Hermes, and any MCP-compatible client. The MCP server's config layer lives in mcp/src/config.ts. It reads an Authorization: Bearer <key> header and an X-Honcho-User-Name header from every request, constructs an @honcho-ai/sdk Honcho client, and routes the request. The Honcho API base URL is taken from the HONCHO_API_URL env var on the Worker (intentionally not a request header, to avoid leaking internal URLs to public clients) so the same binary can target api.honcho.dev or a self-hosted instance.
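The header handling can be sketched as follows. The actual config layer is TypeScript (mcp/src/config.ts); this Python version only illustrates the extraction logic described above, and `parse_mcp_headers` and its error messages are assumptions made for the example.

```python
def parse_mcp_headers(headers: dict) -> tuple:
    """Return (api_key, user_name) from request headers, or raise ValueError."""
    # HTTP header names are case-insensitive, so normalise before lookup.
    normalized = {name.lower(): value for name, value in headers.items()}

    scheme, _, token = normalized.get("authorization", "").partition(" ")
    token = token.strip()
    if scheme.lower() != "bearer" or not token:
        raise ValueError("missing or malformed Authorization: Bearer <key> header")

    user_name = normalized.get("x-honcho-user-name", "").strip()
    if not user_name:
        raise ValueError("missing X-Honcho-User-Name header")
    return token, user_name


api_key, user_name = parse_mcp_headers(
    {"Authorization": "Bearer hch-your-key-here", "X-Honcho-User-Name": "YourName"}
)
```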
Per the README.md, the canonical MCP install is:
claude mcp add honcho \
--transport http \
--url "https://mcp.honcho.dev" \
--header "Authorization: Bearer hch-your-key-here" \
--header "X-Honcho-User-Name: YourName"
For deeper Claude Code integration there is a plugin (/plugin marketplace add plastic-labs/claude-honcho), and dedicated plugins for OpenCode (@honcho-ai/opencode-honcho) and OpenClaw (@honcho-ai/openclaw-honcho). All of these ultimately forward to the same FastAPI surface, so the SDK and CLI can be used interchangeably to inspect what an agent has written to Honcho.
Configuration, Self-Hosting, and Common Failure Modes
All three surfaces accept the same config.toml / .env / environment variable stack described in README.md. The TOML file is partitioned into [app], [db], [auth], [cache], [llm], [deriver], [peer_card], [dialectic], [summary], [dream], [webhook], [metrics], [telemetry], [vector_store], and [sentry] sections. When self-hosting, the SDK and CLI default baseURL to http://localhost:8000; the MCP Worker's HONCHO_API_URL env var must be set to the same address.
Recurring community-reported failure modes that affect every surface:
- Deriver not running (#494): Messages are persisted and queue items are created, but no conclusions, peer cards, or representations appear. The fix is to start the deriver worker process; webhooks also require it (src/webhooks/README.md).
- OpenAI-compatible providers returning 401 (#641): The default AsyncOpenAI client lacks a base_url, so a non-OpenAI key hits api.openai.com. Set OPENAI_BASE_URL and (if needed) override DERIVER_MODEL_CONFIG__TRANSPORT / DERIVER_MODEL_CONFIG__MODEL per the WSL/NVIDIA NIM example in #789.
- Local embeddings provider (#443, #578): src/embedding_client.py hardcodes model name and base URL for two of three providers; until that becomes configurable, self-hosters must patch the source or proxy a compatible endpoint.
- Self-reinforcing conclusions (#725): A stale conclusion attached to a UserPromptSubmit hook can survive document deletion and Redis flush because the deriver re-derives it. Use the CLI's honcho conclusion delete (honcho-cli/src/honcho_cli/commands/conclusion.py) to purge it and re-derive from a clean document set.
- DeepSeek tool continuation (#723): reasoning_content (str) vs. reasoning_details (list) is a transport-level mismatch that surfaces in dialectic calls; a model_config override per provider is the current workaround.
Reasoning Pipeline: Deriver, Dreamer, Summarizer, and Retrieval
Related topics: Overview and System Architecture, Self-Hosting, Configuration, and LLM Provider Setup
Overview and Scope
Honcho is a conversational memory platform that turns raw messages into a layered, queryable representation of each *peer*. The reasoning pipeline is the system responsible for ingesting messages, distilling them into atomic observations (Conclusions), maintaining compact identity snapshots (Peer Cards), producing progressive conversation digests (Session Summaries), and exposing the resulting memory through semantic Retrieval. The public surface for these capabilities is the *Workspace Configuration* object, which toggles each stage independently. Source: README.md and sdks/python/src/honcho/api_types.py
The pipeline is implemented as a FastAPI server backed by a queue, with two long-running worker processes (commonly referenced in the codebase as the *deriver* and *dreamer*). The honcho-cli and the Python/TypeScript SDKs are thin clients that enqueue messages and read derived memory. Source: honcho-cli/README.md and sdks/typescript/src/index.ts
Pipeline Stages
1. Deriver (Conclusion Generation)
The deriver is the first stage of the pipeline. When a peer adds a message to a session, the server persists the message and enqueues a work item. The deriver consumes the queue, calls a language model, and writes zero or more *conclusions* — atomic facts derived from the exchange. A Conclusion is the public memory atom: it has an id, content, observerId, observedId, optional sessionId, and a createdAt timestamp. Source: sdks/typescript/src/conclusions.ts:13-37
Conclusions are scoped by the (observer, observed) peer pair, which lets a workspace model both self-representation (observer == observed) and asymmetric cross-peer views (peer X's model of peer Y). Source: README.md (Internal storage section)
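The (observer, observed) scoping can be made concrete with a small data model. The field names follow the Conclusion shape listed above, rendered in snake_case; the `view` helper and the sample data are hypothetical and say nothing about how the server stores or queries conclusions.

```python
from dataclasses import dataclass
from datetime import datetime, timezone
from typing import Optional


@dataclass(frozen=True)
class Conclusion:
    id: str
    content: str
    observer_id: str
    observed_id: str
    created_at: datetime
    session_id: Optional[str] = None


def view(conclusions: list, observer_id: str, observed_id: str) -> list:
    # A "view" is every conclusion one peer holds about another.
    # observer_id == observed_id selects the peer's self-representation.
    return [
        c for c in conclusions
        if c.observer_id == observer_id and c.observed_id == observed_id
    ]


now = datetime.now(timezone.utc)
store = [
    Conclusion("c1", "Prefers oatmeal for breakfast", "alice", "alice", now),
    Conclusion("c2", "Alice seems to be a morning person", "bob", "alice", now, "s1"),
    Conclusion("c3", "Works night shifts", "bob", "bob", now),
]

alice_self_view = view(store, "alice", "alice")
bobs_view_of_alice = view(store, "bob", "alice")
```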
2. Peer Card (Identity Snapshot)
The peer card is a static, low-latency identity summary built from a peer's conclusions. It is regenerated on a schedule and cached for fast retrieval. It is controlled by the peer_card configuration block, which exposes two independent flags: use (read pre-built cards into context) and create (run the card generation job). Source: sdks/python/src/honcho/api_types.py (PeerCardConfiguration) and sdks/typescript/src/validation.ts (PeerCardConfigSchema)
3. Summarizer (Progressive Context Compression)
The summarizer produces two rolling summaries of a session: a *short* summary and a *long* summary. Thresholds are user-tunable via summary.messages_per_short_summary (minimum 10) and summary.messages_per_long_summary (minimum 20). Summaries are exposed through the CLI as honcho session summaries and surface in the SDK as SessionSummaries / Summary objects. Source: sdks/typescript/src/validation.ts (SummaryConfigSchema), honcho-cli/src/honcho_cli/commands/session.py, and sdks/typescript/src/index.ts
4. Dreamer (Background Consolidation)
The dreamer is an offline consolidation pass. It runs less frequently than the deriver, traverses a workspace's conclusions, and re-organizes or prunes the representation. It is gated by the boolean dream.enabled flag in workspace and session configuration. Source: sdks/python/src/honcho/api_types.py (DreamConfiguration) and sdks/typescript/src/validation.ts (DreamConfigSchema)
5. Retrieval (Read Path)
Read-side retrieval is served by three endpoints: peer.context, peer.card, and conclusion.search. The Python SDK exposes these as peer.chat(), peer.context, and peer.search(); the TypeScript SDK mirrors them on the Peer class. The CLI surfaces them as honcho conclusion search <query> and honcho session context. Source: honcho-cli/src/honcho_cli/commands/conclusion.py and sdks/python/src/honcho/peer.py
flowchart LR
A[Client SDK / CLI] -->|add messages| B[FastAPI Server]
B -->|enqueue| C[Deriver Worker]
C -->|LLM call| D[(Conclusions Store)]
D -->|embed + index| E[(Vector Store)]
C --> F[Peer Card Job]
C --> G[Summarizer Job]
D -->|periodic| H[Dreamer Worker]
A -->|chat / search / context| B
B -->|read| E
B -->|read| F
B -->|read| G
Configuration Surface
The pipeline is configured per workspace and per session, with a session-level config able to override or null out workspace settings. Both null and omission are valid and mean "inherit from the parent scope."
| Config Block | Key Field(s) | Effect | Source |
|---|---|---|---|
| reasoning | enabled, customInstructions | Toggles LLM-driven derivation and lets callers inject system-prompt guidance | sdks/python/src/honcho/api_types.py |
| peer_card | use, create | Independently control card *consumption* and card *generation* | sdks/typescript/src/validation.ts |
| summary | enabled, messages_per_short_summary (≥10), messages_per_long_summary (≥20) | Enable rolling summaries and tune compression thresholds | sdks/typescript/src/validation.ts |
| dream | enabled | Enable background consolidation pass | sdks/python/src/honcho/api_types.py |
The validation layer in both SDKs (extra="forbid" in Pydantic, .strict() in Zod) rejects unknown keys, making the contract additive. Source: sdks/python/src/honcho/api_types.py and sdks/typescript/src/validation.ts
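The inheritance rule (a session value overrides the workspace value, while null or an omitted key inherits) can be sketched as a recursive merge. This is an assumption about the semantics drawn from the description above, in particular that merging happens field by field inside each block; `effective_config` is a hypothetical helper, not the server's implementation.

```python
def effective_config(workspace: dict, session: dict) -> dict:
    """Overlay session settings on workspace settings; None means inherit."""
    merged = dict(workspace)
    for key, value in session.items():
        if value is None:
            continue  # explicit null: keep the workspace value
        if isinstance(value, dict) and isinstance(merged.get(key), dict):
            merged[key] = effective_config(merged[key], value)
        else:
            merged[key] = value
    return merged


workspace_cfg = {
    "reasoning": {"enabled": True},
    "summary": {"enabled": True, "messages_per_short_summary": 20},
    "dream": {"enabled": True},
}
session_cfg = {
    "reasoning": None,                                   # inherit whole block
    "summary": {"enabled": None, "messages_per_short_summary": 10},
    "dream": {"enabled": False},                         # override
}

resolved = effective_config(workspace_cfg, session_cfg)
```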
Known Failure Modes (Community-Reported)
Several community-reported issues map directly to the reasoning pipeline:
- Deriver not auto-starting in self-hosted deployments (issue #494): messages are persisted and queued, but peer.card, peer.context, and conclusion.search return empty until the deriver worker is started manually. Self-hosters should confirm the deriver process is running alongside the API.
- Self-reinforcing conclusions (issue #725): a conclusion with no grounding in source messages can re-derive itself after deletion and cache flush, and can be reinforced by subsequent interactions. The CLI's honcho conclusion delete <id> removes the row but does not prevent re-derivation in the current architecture.
- OpenAI-compatible provider misconfiguration (issue #641): when LLM_OPENAI_API_KEY points to OpenRouter, vLLM, or similar, the deriver's AsyncOpenAI client falls back to api.openai.com and returns 401. The fix is to set OPENAI_BASE_URL to the compatible endpoint's URL.
- Custom embedding endpoints (issues #443, #578): the embedding client hardcodes model names and base URLs for two of three providers. Self-hosted embedding backends (Ollama, llama.cpp, TEI) may require code modifications or future configuration hooks.
Operational Notes
The reasoning pipeline is asynchronous: add_messages returns before conclusions are written, and queue depth is observable via the session queue-status endpoint. The SDK paginates all list responses (SyncPage in Python, Page<T> in TypeScript) so that large conclusion or session sets can be streamed. Source: sdks/python/src/honcho/pagination.py and sdks/typescript/src/pagination.ts
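Streaming a paginated list usually comes down to fetching pages until the last one is reached and yielding items as they arrive. The sketch below shows that pattern in isolation; the `Page` fields (`items`, `page`, `pages`) and the `iterate_all` helper are assumptions for illustration and are not the SDK's SyncPage or AsyncPage types.

```python
from dataclasses import dataclass
from typing import Callable, Iterator


@dataclass
class Page:
    items: list   # records on this page
    page: int     # 1-based page number
    pages: int    # total number of pages


def iterate_all(fetch_page: Callable[[int], Page]) -> Iterator:
    """Yield every item across all pages, fetching lazily one page at a time."""
    page_number = 1
    while True:
        page = fetch_page(page_number)
        yield from page.items
        if page_number >= page.pages:
            break
        page_number += 1


# Fake backend: five records served two per page.
RECORDS = ["c1", "c2", "c3", "c4", "c5"]
PAGE_SIZE = 2


def fake_fetch(page_number: int) -> Page:
    start = (page_number - 1) * PAGE_SIZE
    total_pages = -(-len(RECORDS) // PAGE_SIZE)  # ceiling division
    return Page(RECORDS[start:start + PAGE_SIZE], page_number, total_pages)


all_records = list(iterate_all(fake_fetch))
```

Because the generator only requests the next page when the caller keeps iterating, a consumer that stops early never fetches the remaining pages.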
Honcho's evals (LongMemEval, LoCoMo, and others) are the primary correctness signal for the pipeline; the README links to the evals page and the benchmarking blog post for reproducible methodology. Source: README.md
See Also
- Core Concepts: Workspaces, Peers, Sessions, Messages
- Configuration Reference
- Self-Hosting Guide
- Honcho CLI Reference
- Embedding and Vector Store Backends
Doramagic Pitfall Log
Source-linked risks stay visible on the manual page so the preview does not read like a recommendation.
Found 20 structured pitfall item(s), including 3 high/blocking item(s). Top priority: Installation risk - Installation risk requires verification.
1. Installation risk: Installation risk requires verification
- Severity: high
- Finding: Project evidence flags an installation risk. Review the linked source before relying on this workflow.
- User impact: May increase setup, validation, or first-run risk for the user.
- Recommended check: Reproduce the official install and quickstart path in an isolated environment.
- Evidence: community_evidence:github | https://github.com/plastic-labs/honcho/issues/725
2. Configuration risk: Configuration risk requires verification
- Severity: high
- Finding: Project evidence flags a configuration risk. Review the linked source before relying on this workflow.
- User impact: May increase setup, validation, or first-run risk for the user.
- Recommended check: Reproduce the official install and quickstart path in an isolated environment.
- Evidence: community_evidence:github | https://github.com/plastic-labs/honcho/issues/494
3. Security or permission risk: Security or permission risk requires verification
- Severity: high
- Finding: Project evidence flags a security or permission risk. Review the linked source before relying on this workflow.
- User impact: May increase setup, validation, or first-run risk for the user.
- Recommended check: Reproduce the official install and quickstart path in an isolated environment.
- Evidence: community_evidence:github | https://github.com/plastic-labs/honcho/issues/789
4. Installation risk: Installation risk requires verification
- Severity: medium
- Finding: Developers should check this installation risk before relying on the project: [Bug] ModuleNotFoundError: No module named 'click'
- User impact: Developers may fail before the first successful local run: [Bug] ModuleNotFoundError: No module named 'click'
- Recommended check: Before packaging this project, run the relevant install/config/quickstart check for: [Bug] ModuleNotFoundError: No module named 'click'. Context: Observed when using python
- Evidence: failure_mode_cluster:github_issue | https://github.com/plastic-labs/honcho/issues/786
5. Installation risk: Installation risk requires verification
- Severity: medium
- Finding: Developers should check this installation risk before relying on the project: [Feature] Support TurboQuant/turbovec as optional vector store backend for memory compression
- User impact: Developers may fail before the first successful local run: [Feature] Support TurboQuant/turbovec as optional vector store backend for memory compression
- Recommended check: Before packaging this project, run the relevant install/config/quickstart check for: [Feature] Support TurboQuant/turbovec as optional vector store backend for memory compression. Context: Observed when using node, python, linux
- Evidence: failure_mode_cluster:github_issue | https://github.com/plastic-labs/honcho/issues/781
6. Installation risk: Installation risk requires verification
- Severity: medium
- Finding: Developers should check this installation risk before relying on the project: honcho-cli 0.1.0 on PyPI still missing click — please publish release with #786 fix
- User impact: Developers may fail before the first successful local run: honcho-cli 0.1.0 on PyPI still missing click — please publish release with #786 fix
- Recommended check: Before packaging this project, run the relevant install/config/quickstart check for: honcho-cli 0.1.0 on PyPI still missing click — please publish release with #786 fix. Context: Observed when using python, macos
- Evidence: failure_mode_cluster:github_issue | https://github.com/plastic-labs/honcho/issues/808
7. Installation risk: Installation risk requires verification
- Severity: medium
- Finding: Project evidence flags an installation risk. Review the linked source before relying on this workflow.
- User impact: May increase setup, validation, or first-run risk for the user.
- Recommended check: Reproduce the official install and quickstart path in an isolated environment.
- Evidence: community_evidence:github | https://github.com/plastic-labs/honcho/issues/786
8. Installation risk: Installation risk requires verification
- Severity: medium
- Finding: Project evidence flags an installation risk. Review the linked source before relying on this workflow.
- User impact: May increase setup, validation, or first-run risk for the user.
- Recommended check: Reproduce the official install and quickstart path in an isolated environment.
- Evidence: community_evidence:github | https://github.com/plastic-labs/honcho/issues/781
9. Installation risk: Installation risk requires verification
- Severity: medium
- Finding: Project evidence flags an installation risk. Review the linked source before relying on this workflow.
- User impact: May increase setup, validation, or first-run risk for the user.
- Recommended check: Reproduce the official install and quickstart path in an isolated environment.
- Evidence: community_evidence:github | https://github.com/plastic-labs/honcho/issues/808
10. Configuration risk: Configuration risk requires verification
- Severity: medium
- Finding: Project evidence flags a configuration risk. Review the linked source before relying on this workflow.
- User impact: May increase setup, validation, or first-run risk for the user.
- Recommended check: Reproduce the official install and quickstart path in an isolated environment.
- Evidence: capability.host_targets | https://github.com/plastic-labs/honcho
11. Configuration risk: Configuration risk requires verification
- Severity: medium
- Finding: Developers should check this configuration risk before relying on the project: I need help setting up honcho
- User impact: Developers may misconfigure credentials, environment, or host setup: I need help setting up honcho
- Recommended check: Before packaging this project, run the relevant install/config/quickstart check for: I need help setting up honcho. Context: Observed when using docker, windows, cuda
- Evidence: failure_mode_cluster:github_issue | https://github.com/plastic-labs/honcho/issues/789
12. Configuration risk: Configuration risk requires verification
- Severity: medium
- Finding: Developers should check this configuration risk before relying on the project: Self-hosted/local Honcho ingests messages but derived memory does not appear automatically; peer card/context/search remain empty
- User impact: Developers may misconfigure credentials, environment, or host setup: Self-hosted/local Honcho ingests messages but derived memory does not appear automatically; peer card/context/search remain empty
- Recommended check: Before packaging this project, run the relevant install/config/quickstart check for: Self-hosted/local Honcho ingests messages but derived memory does not appear automatically; peer card/context/search remain empty. Context: Observed when using python
- Evidence: failure_mode_cluster:github_issue | https://github.com/plastic-labs/honcho/issues/494
Source: Doramagic discovery, validation, and Project Pack records
Community Discussion Evidence
These external discussion links are review inputs, not standalone proof that the project is production-ready.
Open the linked issues or discussions before treating the pack as ready for your environment.
Doramagic exposes project-level community discussion separately from official documentation. Review these links before using honcho with real data or production workflows.
- [[Feature] Support TurboQuant/turbovec as optional vector store backend f](https://github.com/plastic-labs/honcho/issues/781) - github / github_issue
- Self-hosted/local Honcho ingests messages but derived memory does not ap - github / github_issue
- Conclusions re-derive after delete + cache flush (self-reinforcing) - github / github_issue
- honcho-cli 0.1.0 on PyPI still missing click — please publish release wi - github / github_issue
- I need help setting up honcho - github / github_issue
- Test issue for honcho_conclude parameters - github / github_issue
- [[Bug] ModuleNotFoundError: No module named 'click'](https://github.com/plastic-labs/honcho/issues/786) - github / github_issue
- Configuration risk requires verification - GitHub / issue
Source: Project Pack community evidence and pitfall evidence