# https://github.com/memtomem/memtomem-stm Project Manual

Generated at: 2026-07-04 03:22:06 UTC

## Table of Contents

- [Introduction & System Architecture](#page-1)
- [Proxy Pipeline: Clean, Compress, Cache](#page-2)
- [Memory Surfacing & LTM Integration](#page-3)
- [CLI, Configuration, Daemon & Known Limitations](#page-4)

<a id='page-1'></a>

## Introduction & System Architecture

### Related Pages

Related topics: [Proxy Pipeline: Clean, Compress, Cache](#page-2), [Memory Surfacing & LTM Integration](#page-3)

<details>
<summary>Related Source Files</summary>

The following source files were used to generate this page:

- [README.md](https://github.com/memtomem/memtomem-stm/blob/main/README.md)
- [SECURITY.md](https://github.com/memtomem/memtomem-stm/blob/main/SECURITY.md)
- [src/memtomem_stm/server.py](https://github.com/memtomem/memtomem-stm/blob/main/src/memtomem_stm/server.py)
- [src/memtomem_stm/proxy/manager.py](https://github.com/memtomem/memtomem-stm/blob/main/src/memtomem_stm/proxy/manager.py)
- [src/memtomem_stm/proxy/pipeline_stages.py](https://github.com/memtomem/memtomem-stm/blob/main/src/memtomem_stm/proxy/pipeline_stages.py)
- [src/memtomem_stm/proxy/config.py](https://github.com/memtomem/memtomem-stm/blob/main/src/memtomem_stm/proxy/config.py)
- [src/memtomem_stm/proxy/relevance.py](https://github.com/memtomem/memtomem-stm/blob/main/src/memtomem_stm/proxy/relevance.py)
- [src/memtomem_stm/cli/proxy.py](https://github.com/memtomem/memtomem-stm/blob/main/src/memtomem_stm/cli/proxy.py)
- [src/memtomem_stm/observability/tracing.py](https://github.com/memtomem/memtomem-stm/blob/main/src/memtomem_stm/observability/tracing.py)
- [src/memtomem_stm/surfacing/feedback_store.py](https://github.com/memtomem/memtomem-stm/blob/main/src/memtomem_stm/surfacing/feedback_store.py)

</details>

# Introduction & System Architecture

## Purpose and Scope

**memtomem-stm** is an MCP-launched memory and tool-proxy service that exposes short-term memory (STM) surfacing, retrieval, and tool routing to Model Context Protocol clients. The bundled entry point is the `mms` server, which serves both an MCP stdio interface for upstream tools and an operator-facing CLI for inspection and tuning. The project targets two kinds of consumers: (1) agents that call `call_tool` against upstream MCP servers through the proxy, and (2) operators who tune `stm_proxy.json` and inspect on-disk state via `mms` subcommands such as `stats`, `health`, `tune`, and `config validate`. As of **v0.1.31**, the release notes emphasize surfacing-stat safeguards (the `stm_surfacing_stats` flat-score warning introduced in #573) and accumulated ops-stability hardening.

Scope covers: proxying upstream `call_tool` requests through a multi-stage pipeline, persisting feedback/surfacing signals into SQLite-backed stores, providing relevance scoring via embeddings, offering observability through tracing and a stats surface, and protecting external-LLM boundaries with a privacy scan.

## System Components

The codebase is organized around a small number of load-bearing modules:

- **`server.py`** — boots the MCP stdio server, configures `logging.basicConfig` (stderr-only, default `WARNING`, gated by `MEMTOMEM_STM_LOG_LEVEL` at `server.py:1845`) and exposes `stm_tuning_recommendations` (at `server.py:1802`), which produces per-tool `max_result_chars` / strategy / retention-floor suggestions but currently prints "apply manually to stm_proxy.json" (follow-up tracked in #615).
- **`proxy/manager.py`** — the `ProxyManager` orchestrates the upstream connection lifecycle (connect, reconnect, cleanup) and constructs the pipeline. At `manager.py:415-450` the manager emits a startup warning that `auto_index`/`extraction` config keys are accepted but inert because no `index_engine` is wired into the bundled server (#288, #616).
- **`proxy/pipeline_stages.py`** — defines the ordered transformation stages applied to outgoing tool calls and incoming results (see next section).
- **`proxy/config.py`** — Pydantic models for `stm_proxy.json`. The models use the default `extra="ignore"` semantics rather than `extra="forbid"`, which is why unknown/typo'd keys silently disappear (#611). `proxy/config.py:135,144` is one of four files currently flagged by mypy (#617).
- **`proxy/relevance.py`** — owns the `RelevanceScorer` and its embedding providers. `_embed_ollama` (`relevance.py:196`) and `_embed_openai` (`relevance.py:208`) issue **synchronous** `httpx.post` calls while their callers run on the asyncio event loop, blocking it for up to the full timeout (#618).
- **`observability/tracing.py`** — Langfuse wiring (~140 LOC); one of the modules with no direct test file (#619).
- **`surfacing/feedback_store.py`** — feedback persistence layer; `surfacing/feedback_store.py:366` (missing `event_ids_by_memory` annotation) is a remaining mypy gap (#617).
- **`cli/proxy.py`** — Click-based operator CLI. `mms stats` (at `cli/proxy.py:833`) reads only on-disk DBs and silently diverges from live `stm_proxy_stats`; observability tools lack a discoverability hint (#613).

## Request Pipeline & Data Flow

Every `call_tool` request that flows through `ProxyManager` traverses a fixed pipeline. The current stage set, ordered as the bundle ships, is illustrated below.

```mermaid
flowchart LR
    Client[MCP Client] -->|call_tool| Mgr[ProxyManager]
    Mgr --> Privacy[PRIVACY scan]
    Privacy --> Index[INDEX stage]
    Index --> Compress[COMPRESS context]
    Compress --> Extract[EXTRACT memories]
    Extract --> Rerank[RERANK / Relevance]
    Rerank --> Upstream[(Upstream MCP server)]
    Upstream --> Rerank
    Rerank --> Persist[Feedback / Surfacing store]
    Persist --> Client
```

Note the *documented* vs *actual* gap on the INDEX and EXTRACT stages: the manager emits a warning at `manager.py:415-450` because the bundled server constructs `ProxyManager` without an `index_engine`, so the INDEX stage does not run and the `auto_index`/`extraction` config keys are inert (#616). Several other community-tracked characteristics of this pipeline are worth noting up front:

- **Circuit breaker claim.** `SECURITY.md` advertises per-upstream circuit-breaker isolation; #608 reports the upstream `call_tool` path has no breaker, and the SECURITY.md claim does not hold against the code.
- **Credential redaction.** A redaction sweep landed in #605/#606, but #622 reports that a mid-loop reconnect-failure log site in `_fetch_upstream` still leaks the credentialed upstream URL.
- **Privacy boundary.** The privacy scan is wired into LLM compression and extraction (#289/#454), but disabling `privacy_scan_enabled` silently sends raw upstream text to external LLM providers (#610).
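
To make the circuit-breaker gap from #608 concrete, the following is a minimal sketch of what per-upstream breaker isolation generally looks like. The `CircuitBreaker` class, the `breaker_for` helper, the thresholds, and the injectable clock are all hypothetical illustrations and are not taken from the memtomem-stm code.

```python
import time
from typing import Callable, Dict, Optional


class CircuitBreaker:
    """Hypothetical breaker: opens after N consecutive failures."""

    def __init__(self, failure_threshold: int = 3, reset_after: float = 30.0,
                 clock: Callable[[], float] = time.monotonic) -> None:
        self.failure_threshold = failure_threshold
        self.reset_after = reset_after
        self._clock = clock
        self._failures = 0
        self._opened_at: Optional[float] = None

    def allow(self) -> bool:
        """Return True if a call to the upstream may be attempted."""
        if self._opened_at is None:
            return True
        # Half-open: permit trial calls once the cool-down has elapsed.
        return self._clock() - self._opened_at >= self.reset_after

    def record_success(self) -> None:
        self._failures = 0
        self._opened_at = None

    def record_failure(self) -> None:
        self._failures += 1
        if self._failures >= self.failure_threshold:
            self._opened_at = self._clock()


# One breaker per upstream name keeps a failing server from affecting others.
breakers: Dict[str, CircuitBreaker] = {}


def breaker_for(upstream: str) -> CircuitBreaker:
    return breakers.setdefault(upstream, CircuitBreaker())
```

A caller would check `breaker_for(name).allow()` before each upstream call and report the outcome with `record_success()` or `record_failure()`, so repeated failures on one upstream short-circuit quickly without touching the others.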

## Configuration, Logging, and Observability

Configuration is loaded from `stm_proxy.json` via Pydantic models in `proxy/config.py`. Two failure modes recur in operator reports: unknown keys are silently dropped (`extra="ignore"` is the project default) and parse failures silently fall back to defaults — #611 proposes a `mms config validate` command and louder failure semantics. The recent mypy sweep (#617) reports 9 errors across 4 files and is configured as `continue-on-error` in CI.

Server-side logging is stderr-only (see `server.py:1845`), which works for stdio-launched MCP processes whose host captures stderr, but is invisible to operators running detached; #612 proposes an optional rotating file log under `~/.memtomem/`.
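
As a rough illustration of the stderr-only setup and the optional file log proposed in #612, the sketch below resolves the level from `MEMTOMEM_STM_LOG_LEVEL` and adds a rotating file handler only when a directory is supplied. The function names, the `stm.log` file name, and the rotation limits are assumptions made for this example, not the project's actual code.

```python
import logging
import os
import sys
from logging.handlers import RotatingFileHandler
from pathlib import Path
from typing import Optional


def build_handlers(log_dir: Optional[Path] = None) -> list:
    """Return a stderr handler, plus a rotating file handler if log_dir is set."""
    handlers: list = [logging.StreamHandler(sys.stderr)]
    if log_dir is not None:
        log_dir.mkdir(parents=True, exist_ok=True)
        # File name and rotation limits are illustrative assumptions.
        handlers.append(
            RotatingFileHandler(log_dir / "stm.log", maxBytes=1_000_000, backupCount=3)
        )
    return handlers


def resolve_level(default: str = "WARNING") -> int:
    """Map MEMTOMEM_STM_LOG_LEVEL to a logging level, falling back to WARNING."""
    name = os.environ.get("MEMTOMEM_STM_LOG_LEVEL", default).upper()
    level = logging.getLevelName(name)
    # getLevelName returns a string such as "Level FOO" for unknown names.
    return level if isinstance(level, int) else logging.WARNING


def configure_logging(log_dir: Optional[Path] = None) -> None:
    logging.basicConfig(level=resolve_level(), handlers=build_handlers(log_dir), force=True)
```

Calling `configure_logging()` with no argument reproduces the stderr-only behavior described above; passing a directory such as `Path.home() / ".memtomem"` would add the file log for detached runs.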

Observability rests on three surfaces:

1. **Tracing** — Langfuse integration in `observability/tracing.py`. ~140 LOC; currently lacks a direct test (#619).
2. **Stats** — `stm_surfacing_stats` (zero-variance warning, #573) and `mms stats`. The two diverge by construction: `cli/proxy.py:833` reads only on-disk DBs while `stm_proxy_stats` reflects live in-memory counters (#613).
3. **MCP tools + CLI health** — 13 MCP tools plus `mms health`; the discoverability gap is that several tools remain hidden (#613).

A practical operator loop today is: run `mms tune` → read `stm_tuning_recommendations` (`server.py:1802`) → hand-edit `stm_proxy.json` → run `mms stats` / `mms health`. The two obvious rough edges in that loop — automated application of recommendations (#615) and config validation (#611) — are tracked as separate follow-ups from the 2026-07-04 service review that also flagged the relevance-scoring blocking I/O (#618), the missing circuit breaker (#608), and the `mms` ↔ live-stats divergence (#613).

---

<a id='page-2'></a>

## Proxy Pipeline: Clean, Compress, Cache

### Related Pages

Related topics: [Introduction & System Architecture](#page-1), [Memory Surfacing & LTM Integration](#page-3), [CLI, Configuration, Daemon & Known Limitations](#page-4)

<details>
<summary>Related Source Files</summary>

The following source files were used to generate this page:

- [src/memtomem_stm/proxy/manager.py](https://github.com/memtomem/memtomem-stm/blob/main/src/memtomem_stm/proxy/manager.py)
- [src/memtomem_stm/proxy/cleaning.py](https://github.com/memtomem/memtomem-stm/blob/main/src/memtomem_stm/proxy/cleaning.py)
- [src/memtomem_stm/proxy/compression.py](https://github.com/memtomem/memtomem-stm/blob/main/src/memtomem_stm/proxy/compression.py)
- [src/memtomem_stm/proxy/cache.py](https://github.com/memtomem/memtomem-stm/blob/main/src/memtomem_stm/proxy/cache.py)
- [src/memtomem_stm/proxy/tool_eligibility.py](https://github.com/memtomem/memtomem-stm/blob/main/src/memtomem_stm/proxy/tool_eligibility.py)
- [src/memtomem_stm/proxy/tool_relevance.py](https://github.com/memtomem/memtomem-stm/blob/main/src/memtomem_stm/proxy/tool_relevance.py)
</details>

# Proxy Pipeline: Clean, Compress, Cache

The Proxy Pipeline is the core request-processing path inside `memtomem-stm`'s STM proxy server. Every `tools/call` arriving from an MCP client flows through a deterministic sequence of gates — eligibility, relevance, cleaning, compression, and cache — before reaching an upstream MCP server, and the response flows back through the same stages. The pipeline's purpose is to reduce token spend, deduplicate repeat calls, prevent credential or PII leakage, and surface tuning feedback without changing the MCP wire protocol observed by the client.

## Pipeline overview

`ProxyManager` orchestrates the request lifecycle for each upstream tool call. Eligibility filtering removes tools that the local configuration (per-tool regexes, call counters, retention floors) has marked as non-routable. Surviving candidates are scored by the relevance stage; only those above the configured `min_score` proceed Source: [src/memtomem_stm/proxy/manager.py:415-450](). The remaining request is then passed sequentially to the cleaning, compression, and cache stages. Errors at any stage produce a structured `ErrorDetails` payload rather than raising to the MCP transport Source: [src/memtomem_stm/proxy/config.py:135,144]().

```mermaid
flowchart LR
    A[Inbound tools/call] --> B[tool_eligibility]
    B --> C[tool_relevance<br/>min_score gate]
    C --> D[cleaning<br/>privacy scan + normalize]
    D --> E[compression<br/>LLM or char-budget]
    E --> F[cache<br/>read-through]
    F -->|miss| G[upstream fetch]
    G --> H[cache write]
    H --> I[tool_relevance record]
    I --> J[Return to client]
    F -->|hit| J
```
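
The sequential flow and the structured-error behavior described above can be sketched as an ordered list of stage callables. This is only an illustration: the `run_pipeline` function and the fields of `ErrorDetails` are assumptions, since this manual names the `ErrorDetails` type but does not show its shape.

```python
from dataclasses import dataclass
from typing import Callable, List, Union

Stage = Callable[[str], str]


@dataclass
class ErrorDetails:
    """Hypothetical shape; the real ErrorDetails fields are not shown here."""
    stage: str
    message: str


def run_pipeline(payload: str, stages: List[Stage]) -> Union[str, ErrorDetails]:
    """Apply stages in order; convert any exception into an ErrorDetails value."""
    for stage in stages:
        try:
            payload = stage(payload)
        except Exception as exc:  # deliberately broad: nothing should reach the transport
            return ErrorDetails(stage=stage.__name__, message=str(exc))
    return payload
```

Returning a value instead of raising keeps the failure visible to the client as data, which matches the stated goal of not surfacing exceptions to the MCP transport.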

## Cleaning stage

The cleaning stage normalizes inbound payloads and runs the privacy scan. Normalization strips control characters, collapses whitespace, and produces a canonical representation used for cache-key derivation and relevance scoring. The privacy scan masks secrets and PII before any text leaves the proxy Source: [src/memtomem_stm/proxy/cleaning.py:1-120](). When `privacy_scan_enabled` is `False`, however, raw upstream text is forwarded to external LLM providers without a prominent warning; issue #610 tracks this gap and proposes a loud warning plus an entropy-based fallback heuristic.
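
The normalization step can be illustrated with a short sketch that drops control characters, collapses whitespace, and hashes the result into a cache key. The function names and the choice of SHA-256 are assumptions for illustration; the actual logic in `cleaning.py` may differ.

```python
import hashlib
import re
import unicodedata

_WHITESPACE = re.compile(r"\s+")


def normalize(text: str) -> str:
    """Drop control characters, then collapse runs of whitespace to one space."""
    kept = []
    for ch in text:
        # Unicode category "Cc" covers control characters; keep the whitespace
        # ones so tabs and newlines still separate words before collapsing.
        if unicodedata.category(ch) == "Cc" and not ch.isspace():
            continue
        kept.append(ch)
    return _WHITESPACE.sub(" ", "".join(kept)).strip()


def cache_key(tool: str, text: str) -> str:
    """Derive a stable key from the tool name and the normalized payload."""
    digest = hashlib.sha256(f"{tool}\x00{normalize(text)}".encode("utf-8"))
    return digest.hexdigest()
```

Because the key is computed from the normalized form, two requests that differ only in spacing or stray control bytes map to the same cache entry.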

## Compression stage

Compression enforces the per-tool `max_result_chars` budget declared in `stm_proxy.json`. When the upstream payload exceeds the budget, the compressor selects between a deterministic char-trim and an LLM-assisted summarization path. The LLM path delegates to a configured provider and receives text only after the privacy scan from the previous stage, so secrets are masked before they reach the summarizer as long as `privacy_scan_enabled` is left on Source: [src/memtomem_stm/proxy/compression.py:1-180](). Tuning recommendations produced by `stm_tuning_recommendations` suggest per-tool `max_result_chars` adjustments, strategy switches, and retention floors; today these are emitted as text and applied by hand, which issue #615 proposes to automate behind `mms tune --apply` Source: [src/memtomem_stm/server.py:1802]().
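
A deterministic char-trim of the kind described here might look like the sketch below. The `trim_to_budget` name and the truncation marker are hypothetical; only the `max_result_chars` budget comes from the text above.

```python
def trim_to_budget(text: str, max_result_chars: int, marker: str = "\n[truncated]") -> str:
    """Deterministically cut text so the result never exceeds max_result_chars."""
    if max_result_chars < 0:
        raise ValueError("max_result_chars must be non-negative")
    if len(text) <= max_result_chars:
        return text
    if max_result_chars <= len(marker):
        # Budget too small to fit the marker: fall back to a plain cut.
        return text[:max_result_chars]
    return text[: max_result_chars - len(marker)] + marker
```

The marker is counted against the budget, so the returned string is always at most `max_result_chars` long and the same input always yields the same output.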

## Cache stage

The cache stage is read-through. On a hit, the stored payload is replayed through the compression stage using the *current* budget — so a cached result that was acceptable yesterday can still be re-trimmed if a config change lowered the cap — and the relevance-scoring recorder is incremented without contacting the upstream Source: [src/memtomem_stm/proxy/cache.py:1-160](). On a miss, the cleaned and compressed payload is persisted with the normalized request as the key, and a relevance feedback row is written so that future `stm_surfacing_stats` queries can detect zero-variance score distributions Source: [src/memtomem_stm/surfacing/feedback_store.py:366]().
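
The read-through behavior, including re-trimming a hit against the current budget, can be sketched with a small in-memory class. `ReadThroughCache` is a hypothetical stand-in; the real cache persists entries and records relevance feedback, which this simplified example leaves out.

```python
from typing import Callable, Dict


class ReadThroughCache:
    """Hypothetical in-memory stand-in for the proxy cache."""

    def __init__(self, fetch: Callable[[str], str]) -> None:
        self._fetch = fetch
        self._store: Dict[str, str] = {}
        self.hits = 0
        self.misses = 0

    def get(self, key: str, max_result_chars: int) -> str:
        if key in self._store:
            self.hits += 1
            payload = self._store[key]
        else:
            self.misses += 1
            payload = self._fetch(key)  # only a miss contacts the upstream
            self._store[key] = payload
        # Apply the *current* budget on every read, so a lowered cap
        # also shortens entries that were cached under the old cap.
        return payload[:max_result_chars]
```

With this shape, lowering the cap between two reads of the same key shortens the second result without a second upstream call.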

## Cross-cutting concerns

Three concerns touch every stage. First, observability: the relevance score recorded on each call feeds both the surfacing stats tool and the tuning recommender; #618 notes that the embedding providers in `_embed_ollama` and `_embed_openai` issue synchronous `httpx.post` calls, blocking the event loop for up to the full timeout Source: [src/memtomem_stm/proxy/relevance.py:196,208](). Second, resilience: #608 documents that the `call_tool` retry loop lacks the per-upstream circuit breaker that `SECURITY.md` advertises, while #622 reports that a mid-loop reconnect-failure log still leaks the credentialed upstream URL despite the #605/#606 redaction sweep. Third, configuration: no Pydantic model sets `extra="forbid"`, so unknown keys are silently ignored and parse failures silently fall back, which #611 proposes to surface via `mms config validate` Source: [src/memtomem_stm/proxy/config.py:135,144]().
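
For the credential-leak concern in #622, one common mitigation is to strip the userinfo part of a URL before it is logged. The `redact_url` helper below is a hypothetical sketch built on the standard library, not the redaction code that landed in #605/#606.

```python
from urllib.parse import urlsplit, urlunsplit


def redact_url(url: str) -> str:
    """Replace any userinfo (user:password@) in a URL with a fixed placeholder."""
    parts = urlsplit(url)
    if parts.username is None and parts.password is None:
        return url
    host = parts.hostname or ""
    if ":" in host:  # IPv6 literal: restore the brackets that urlsplit strips
        host = f"[{host}]"
    if parts.port is not None:
        host = f"{host}:{parts.port}"
    return urlunsplit((parts.scheme, f"***@{host}", parts.path, parts.query, parts.fragment))
```

This only covers credentials embedded in the authority part of the URL; tokens passed as query parameters would need separate handling.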

Together the three stages — clean, compress, cache — turn each upstream `tools/call` into a bounded, auditable, deduplicated transaction while leaving the MCP surface observed by the client unchanged.

---

<a id='page-3'></a>

## Memory Surfacing & LTM Integration

### Related Pages

Related topics: [Introduction & System Architecture](#page-1), [Proxy Pipeline: Clean, Compress, Cache](#page-2), [CLI, Configuration, Daemon & Known Limitations](#page-4)

<details>
<summary>Related Source Files</summary>

The following source files were used to generate this page:

- [src/memtomem_stm/surfacing/engine.py](https://github.com/memtomem/memtomem-stm/blob/main/src/memtomem_stm/surfacing/engine.py)
- [src/memtomem_stm/surfacing/relevance.py](https://github.com/memtomem/memtomem-stm/blob/main/src/memtomem_stm/surfacing/relevance.py)
- [src/memtomem_stm/surfacing/context_extractor.py](https://github.com/memtomem/memtomem-stm/blob/main/src/memtomem_stm/surfacing/context_extractor.py)
- [src/memtomem_stm/surfacing/formatter.py](https://github.com/memtomem/memtomem-stm/blob/main/src/memtomem_stm/surfacing/formatter.py)
- [src/memtomem_stm/surfacing/mcp_client.py](https://github.com/memtomem/memtomem-stm/blob/main/src/memtomem_stm/surfacing/mcp_client.py)
- [src/memtomem_stm/surfacing/feedback.py](https://github.com/memtomem/memtomem-stm/blob/main/src/memtomem_stm/surfacing/feedback.py)
- [src/memtomem_stm/surfacing/feedback_store.py](https://github.com/memtomem/memtomem-stm/blob/main/src/memtomem_stm/surfacing/feedback_store.py)
- [src/memtomem_stm/server.py](https://github.com/memtomem/memtomem-stm/blob/main/src/memtomem_stm/server.py)
</details>

# Memory Surfacing & LTM Integration

## Purpose & Scope

The surfacing subsystem retrieves relevant entries from long-term memory (LTM) and injects them into the live conversation context held by the proxy. It is the bridge between the on-disk LTM store and the LLM tool-call loop: it scores candidate memories against the active turn, extracts and trims their content to fit a per-call budget, formats the result for the model, and records the outcome so future surfacing decisions can be tuned.

The subsystem lives entirely under `src/memtomem_stm/surfacing/` and is invoked by `proxy/` as a library (`Source: [src/memtomem_stm/surfacing/engine.py]()`). The design intent is that a surfacing failure should not fail the request path, although the synchronous embedding calls described under Relevance Scoring can still stall it (#618). Its six modules (`engine`, `relevance`, `context_extractor`, `formatter`, `mcp_client`, and `feedback`) are deliberately kept independent so each stage can be tested and replaced in isolation.

## Architecture & Pipeline

The surfacing flow has four ordered stages. The `engine` orchestrates them; each stage has a single module responsible for its behavior.

```mermaid
flowchart LR
    A[Active turn + LTM candidates] --> B[relevance.py<br/>RelevanceScorer]
    B --> C[context_extractor.py<br/>slice & budget]
    C --> D[formatter.py<br/>markup render]
    D --> E[Injected LTM context]
    B -.-> F[feedback.py<br/>score record]
    D -.-> F
    F --> G[stm_surfacing_stats<br/>MCP tool]
```

1. **Score** — `RelevanceScorer` in `relevance.py` ranks candidates against the current prompt (`Source: [src/memtomem_stm/surfacing/relevance.py:1-50]()`).
2. **Extract** — `context_extractor.py` slices each survivor to the per-tool `max_result_chars` budget (`Source: [src/memtomem_stm/surfacing/context_extractor.py]()`).
3. **Format** — `formatter.py` renders the slices into stable markup (`Source: [src/memtomem_stm/surfacing/formatter.py]()`).
4. **Record** — `feedback.py` persists the outcome to `feedback_store.py` for later analysis (`Source: [src/memtomem_stm/surfacing/feedback.py]()`).

Cross-store retrieval is delegated to `mcp_client.py`, which talks to the external memory backend on the scorer's behalf (`Source: [src/memtomem_stm/surfacing/mcp_client.py]()`).
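
The four stages listed above can be sketched end to end in a few lines. Everything in this example is illustrative: the `surface` function, its argument shapes, and the bullet-style output format are assumptions, and plain lists stand in for real embeddings and for the feedback store.

```python
import math
from typing import Dict, List, Sequence, Tuple


def cosine(a: Sequence[float], b: Sequence[float]) -> float:
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def surface(prompt_vec: Sequence[float],
            candidates: Dict[str, Tuple[Sequence[float], str]],
            min_score: float, top_k: int, max_result_chars: int,
            feedback: List[Tuple[str, float]]) -> str:
    # 1. Score: rank candidates against the prompt, drop those below min_score.
    scored = [(cosine(prompt_vec, vec), mem_id, text)
              for mem_id, (vec, text) in candidates.items()]
    kept = sorted((s for s in scored if s[0] >= min_score), reverse=True)[:top_k]
    # 2. Extract: slice each survivor to the character budget.
    slices = [(mem_id, score, text[:max_result_chars]) for score, mem_id, text in kept]
    # 3. Format: render the slices as stable, line-oriented markup.
    rendered = "\n".join(f"- [{mem_id}] {text}" for mem_id, _, text in slices)
    # 4. Record: persist what was surfaced and how it scored.
    feedback.extend((mem_id, score) for mem_id, score, _ in slices)
    return rendered
```

Keeping each stage as a separate expression mirrors the module split described above and makes it easy to test one step without the others.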

## Relevance Scoring

`RelevanceScorer` is the ranking layer. It supports two embedding providers, both reached over HTTP:

- **Ollama** — local embeddings from a running Ollama instance (`Source: [src/memtomem_stm/surfacing/relevance.py:196]()`).
- **OpenAI-compatible** — remote embeddings via the OpenAI HTTP API (`Source: [src/memtomem_stm/surfacing/relevance.py:208]()`).

The scorer computes cosine similarity between the prompt embedding and each candidate, applies `min_score` and top-k filtering, and hands survivors downstream. A known limitation in this module is that both `_embed_ollama` and `_embed_openai` issue **synchronous** `httpx.post(...)` calls (`Source: [src/memtomem_stm/surfacing/relevance.py:196, 208]()`), while every consumer — `ProxyManager`, the surfacing engine — lives on the asyncio event loop. A slow or unreachable embedding endpoint can therefore block the loop for up to the full request timeout. Tracked in #618.
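
The effect of a blocking call on the event loop, and one standard way to avoid it, can be shown with the standard library alone. In the sketch below, `blocking_embed` uses `time.sleep` as a stand-in for a synchronous HTTP request, and a heartbeat task counts how often the loop gets to run. All names are hypothetical, and offloading with `asyncio.to_thread` is just one possible remedy (an async HTTP client such as `httpx.AsyncClient` is another); the fix chosen for #618 is not stated in this manual.

```python
import asyncio
import time


def blocking_embed(text: str, delay: float = 0.2) -> list:
    """Stand-in for a synchronous HTTP embedding call."""
    time.sleep(delay)
    return [float(len(text))]


async def heartbeat(ticks: list, interval: float = 0.02) -> None:
    """Record a tick whenever the event loop gets a chance to run."""
    while True:
        ticks.append(time.monotonic())
        await asyncio.sleep(interval)


async def measure(offload: bool) -> int:
    ticks: list = []
    task = asyncio.create_task(heartbeat(ticks))
    await asyncio.sleep(0)  # let the heartbeat record its first tick
    if offload:
        await asyncio.to_thread(blocking_embed, "hello")  # loop stays responsive
    else:
        blocking_embed("hello")  # loop is frozen for the whole delay
    task.cancel()
    return len(ticks)
```

Running `asyncio.run(measure(offload=False))` and `asyncio.run(measure(offload=True))` side by side shows far fewer heartbeat ticks in the blocking case, which is the stall that other coroutines on the same loop would experience.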

## Feedback & Statistics

`feedback.py` and its backing `feedback_store.py` record every surfacing outcome: which memories were surfaced, what they scored, and which downstream event they ultimately fed. The store tracks events per memory in `event_ids_by_memory`, a dict that mypy currently flags for a missing type annotation (`Source: [src/memtomem_stm/surfacing/feedback_store.py:366]()`) and that is part of the 9-error burn-down tracked in #617.

The MCP tool `stm_surfacing_stats` reads from this store. As of v0.1.31, it warns on a zero-variance score distribution (#573): when every recorded score for a memory is identical, the `min_score` filter carries no ranking information even though aggregate stats alone would not flag the degeneracy.
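
A zero-variance check of this kind is straightforward to express. The sketch below is an assumption about how such a warning could be computed, using a plain dict of scores per memory; the `flat_score_warnings` name, the minimum sample count, and the tolerance are hypothetical.

```python
from statistics import pvariance
from typing import Dict, List


def flat_score_warnings(scores_by_memory: Dict[str, List[float]],
                        min_samples: int = 2,
                        tolerance: float = 1e-12) -> List[str]:
    """Return ids of memories whose recorded scores show no spread at all."""
    flat = []
    for mem_id, scores in scores_by_memory.items():
        # A single sample always has zero variance, so require at least two.
        if len(scores) >= min_samples and pvariance(scores) <= tolerance:
            flat.append(mem_id)
    return sorted(flat)
```

An average alone would not reveal the problem: a memory with scores `[0.5, 0.5, 0.5]` and one with `[0.1, 0.9]` share the same mean, but only the first is flat.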

Surfacing also feeds `stm_tuning_recommendations` (`Source: [src/memtomem_stm/server.py:1802]()`), which produces per-tool `max_result_chars` and strategy suggestions. There is no `mms tune --apply` yet — recommendations still have to be hand-applied to `stm_proxy.json` (#615).

## Known Limitations

- **Synchronous embedding HTTP** — `httpx.post` in both relevance providers blocks the asyncio loop (#618).
- **Inert INDEX stage** — the bundled `mms` server constructs `ProxyManager` without an `index_engine`, so the index branch of the pipeline is wired but never runs (#616).
- **Mypy debt** — the surfacing module carries one of nine remaining mypy errors (`feedback_store.py:366`), gating re-enable of the `attr-defined` suppression (#617).
- **Direct test coverage gaps** — relevance embedding paths and the feedback store have incidental rather than direct test coverage, on the burn-down list (#619).

---

<a id='page-4'></a>

## CLI, Configuration, Daemon & Known Limitations

### Related Pages

Related topics: [Introduction & System Architecture](#page-1), [Proxy Pipeline: Clean, Compress, Cache](#page-2), [Memory Surfacing & LTM Integration](#page-3)

<details>
<summary>Related Source Files</summary>

The following source files were used to generate this page:

- [src/memtomem_stm/cli/proxy.py](https://github.com/memtomem/memtomem-stm/blob/main/src/memtomem_stm/cli/proxy.py)
- [src/memtomem_stm/cli/daemon_cmd.py](https://github.com/memtomem/memtomem-stm/blob/main/src/memtomem_stm/cli/daemon_cmd.py)
- [src/memtomem_stm/cli/hook_cmd.py](https://github.com/memtomem/memtomem-stm/blob/main/src/memtomem_stm/cli/hook_cmd.py)
- [src/memtomem_stm/cli/hook_adapter.py](https://github.com/memtomem/memtomem-stm/blob/main/src/memtomem_stm/cli/hook_adapter.py)
- [src/memtomem_stm/cli/hook_hosts.py](https://github.com/memtomem/memtomem-stm/blob/main/src/memtomem_stm/cli/hook_hosts.py)
- [src/memtomem_stm/cli/mms_project.py](https://github.com/memtomem/memtomem-stm/blob/main/src/memtomem_stm/cli/mms_project.py)
- [src/memtomem_stm/server.py](https://github.com/memtomem/memtomem-stm/blob/main/src/memtomem_stm/server.py)
- [src/memtomem_stm/manager.py](https://github.com/memtomem/memtomem-stm/blob/main/src/memtomem_stm/manager.py)
</details>

# CLI, Configuration, Daemon & Known Limitations

This page documents the operational surface of memtomem-stm: the Click-based `mms` CLI, the daemon and project subsystems, the configuration model, and the known limitations surfaced during the 2026-07-04 service review. It is aimed at operators and integrators who need to run, observe, and tune the proxy in production.

## CLI Surface (`mms`)

The `mms` entry point is a Click application defined in `src/memtomem_stm/cli/proxy.py`. It groups subcommands for proxy lifecycle (`start`, `stop`, `run`), observability (`health`, `stats`, `logs`), configuration (`config`, `validate`, `tune`), and developer ergonomics. Daemon, hook, and per-project behavior lives in sibling modules:

- `src/memtomem_stm/cli/daemon_cmd.py` — PID-file lifecycle, backgrounding, and process supervision.
- `src/memtomem_stm/cli/hook_cmd.py`, `hook_adapter.py`, `hook_hosts.py` — hook installation into host agents (Claude Code, Cursor, etc.).
- `src/memtomem_stm/cli/mms_project.py` — per-project scoping helpers used by `mms config` and `mms stats`.

The CLI delegates most heavy lifting to the bundled `mms` server; long-running commands spawn or attach to the daemon process. `mms stats` currently reads on-disk DBs only and does not consult live `stm_proxy_stats`. Source: [src/memtomem_stm/cli/proxy.py:833]().

## Configuration Model

Configuration is loaded from JSON files (notably `stm_proxy.json`) and parsed into Pydantic models. The CLI exposes `mms config` for inspection; `mms config validate` for pre-flight checks is the proposal tracked in #611.

Two configuration behaviors are documented as the highest-impact usability gaps and are tracked as known limitations:

1. **Typo'd keys vanish silently.** No Pydantic model sets `extra="forbid"`, so the default `extra="ignore"` is relied upon and unknown keys are dropped without warning. Source: issue #611 (`mms config validate` proposal).
2. **Parse failures fall back silently.** Invalid values are swallowed and the daemon starts with defaults, masking operator errors.
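
To show what louder failure semantics could look like, here is a standard-library sketch that reports unknown keys with a spelling hint and lets JSON parse errors propagate. It is not the `mms config validate` proposal itself: the `ToolConfig` dataclass, its default values, and the `validate_keys` function are hypothetical, and the real models are Pydantic classes in `proxy/config.py`.

```python
import difflib
import json
from dataclasses import dataclass, fields
from typing import List


@dataclass
class ToolConfig:
    """Hypothetical subset of a config model; real fields and defaults may differ."""
    max_result_chars: int = 4000
    privacy_scan_enabled: bool = True


def validate_keys(raw_json: str, model=ToolConfig) -> List[str]:
    """Return one readable problem per unknown key; raise on invalid JSON."""
    data = json.loads(raw_json)  # a parse failure propagates instead of falling back
    known = [f.name for f in fields(model)]
    problems = []
    for key in data:
        if key not in known:
            hint = difflib.get_close_matches(key, known, n=1)
            suffix = f" (did you mean {hint[0]!r}?)" if hint else ""
            problems.append(f"unknown key {key!r}{suffix}")
    return problems
```

With Pydantic itself, a similar effect for unknown keys is available by setting `extra="forbid"` on a model, which turns them into validation errors instead of dropping them.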

Two further configuration behaviors are easy to miss:

- `auto_index` and `extraction` config keys are parsed, but the bundled `mms` server constructs `ProxyManager` without an `index_engine`, so the INDEX stage never runs. A startup warning is emitted, but the keys have no runtime effect. Source: [src/memtomem_stm/proxy/manager.py:415-450](); issue #616.
- `privacy_scan_enabled=false` causes raw upstream text to be sent to external LLM providers with no console warning beyond startup logs. Source: issue #610.

The CLI's `mms tune` command surfaces `stm_tuning_recommendations` (per-tool `max_result_chars`, strategy, and retention-floor suggestions) but currently prints "apply manually to stm_proxy.json". Source: [src/memtomem_stm/server.py:1802](); issue #615 proposes `mms tune --apply`.

## Daemon, Logging & Observability

The daemon is launched by `mms start` (background) or `mms run` (foreground). Process supervision uses a PID file under the `~/.memtomem/` directory. The server entry point configures logging via `logging.basicConfig` to stderr only, with a default level of WARNING, controlled by the `MEMTOMEM_STM_LOG_LEVEL` environment variable. Source: [src/memtomem_stm/server.py:1845](); issue #612 proposes an optional rotating file log under `~/.memtomem/`.

Observability is exposed through 13 MCP tools plus `mms health`. Two discoverability gaps are tracked:

- `mms stats` reads only on-disk DBs, silently diverging from live `stm_proxy_stats`. Source: [src/memtomem_stm/cli/proxy.py:833](); issue #613.
- Hidden observability tools lack a discoverability hint in the CLI help text. Source: issue #613.

In `v0.1.31`, `stm_surfacing_stats` was extended to warn on a zero-variance score distribution — a flat-score state that previously required manual inspection of `stm_feedback.db`.

## Known Limitations Summary

The following table consolidates the limitations referenced above and their tracking issues.

| Area | Limitation | Tracking |
| --- | --- | --- |
| Circuit breaker | SECURITY.md claims per-upstream breaker isolation; `call_tool` path has none | #608 |
| Logging | stderr-only, no rotating file | #612 |
| Config | Unknown keys silently ignored; parse failures fall back | #611 |
| Config (inert) | `auto_index` / `extraction` accepted but unused | #616 |
| Privacy | `privacy_scan_enabled=false` silently bypasses scan | #610 |
| Stats | `mms stats` reads on-disk only, diverges from live stats | #613 |
| Tune | `mms tune` cannot apply recommendations | #615 |
| Relevance | Embedding providers use synchronous `httpx.post` (blocks loop) | #618 |
| Types | 9 mypy errors across 4 files (advisory in CI) | #617 |
| Reconnect log | Mid-loop reconnect-failure log may leak credentialed URL | #622 |
| Test gaps | No direct tests for `cli/daemon_cmd`, `observability/tracing`, `mms/secrets`, `mms/detect`, relevance embedding paths | #619 |

### CLI / Docs Polish (issue #614)

A batch of low-severity consistency items is tracked together: `--json` flag coverage is uneven across commands, `daemon run` vs `daemon start` naming overlaps, `status` and `list` subcommands overlap, and the CHANGELOG lacks structured upgrade notes.

### Mitigation Guidance

Operators working around current limitations should:

- Run `mms config validate` after any hand edit of `stm_proxy.json` (once available); until then, diff against the Pydantic schema in `proxy/config.py`.
- Set `MEMTOMEM_STM_LOG_LEVEL=DEBUG` and capture stderr explicitly when running under an MCP stdio host, since logs are otherwise lost.
- Treat `auto_index` / `extraction` as no-ops in the bundled server until #616 is resolved.
- Apply `mms tune` recommendations by hand-editing `stm_proxy.json` until `mms tune --apply` ships.
- Treat `mms stats` as a snapshot of persisted state and prefer live MCP stats tools for current counters.

Together, these surfaces — CLI, configuration, daemon, and known limitations — define the operational contract of memtomem-stm as of `v0.1.31`.

---


## Pitfall Log

Project: memtomem/memtomem-stm

Summary: Found 36 structured pitfall item(s), including 3 high/blocking item(s). Top priority: Security or permission risk - Security or permission risk requires verification.

## 1. Security or permission risk - Security or permission risk requires verification

- Severity: high
- Evidence strength: source_linked
- Finding: Developers should check this security_permissions risk before relying on the project: ci: supply-chain hardening — dependency audit, Dependabot, action SHA pinning, top-level workflow permissions
- User impact: Developers may expose sensitive permissions or credentials: ci: supply-chain hardening — dependency audit, Dependabot, action SHA pinning, top-level workflow permissions
- Evidence: failure_mode_cluster:github_issue | https://github.com/memtomem/memtomem-stm/issues/609

## 2. Security or permission risk - Security or permission risk requires verification

- Severity: high
- Evidence strength: source_linked
- Finding: Developers should check this security_permissions risk before relying on the project: privacy: disabling privacy_scan_enabled silently sends raw upstream text to external LLM providers — warn loudly; consider entropy heuristic
- User impact: Developers may expose sensitive permissions or credentials: privacy: disabling privacy_scan_enabled silently sends raw upstream text to external LLM providers — warn loudly; consider entropy heuristic
- Evidence: failure_mode_cluster:github_issue | https://github.com/memtomem/memtomem-stm/issues/610

## 3. Security or permission risk - Security or permission risk requires verification

- Severity: high
- Evidence strength: source_linked
- Finding: Developers should check this security_permissions risk before relying on the project: tests: fill direct-coverage gaps — observability/tracing, mms/secrets, mms/detect, cli/daemon_cmd, relevance embedding paths
- User impact: Developers may expose sensitive permissions or credentials: tests: fill direct-coverage gaps — observability/tracing, mms/secrets, mms/detect, cli/daemon_cmd, relevance embedding paths
- Evidence: failure_mode_cluster:github_issue | https://github.com/memtomem/memtomem-stm/issues/619

## 4. Installation risk - Installation risk requires verification

- Severity: medium
- Evidence strength: source_linked
- Finding: Developers should check this installation risk before relying on the project: proxy: upstream call_tool path has no circuit breaker — SECURITY.md claims per-upstream breaker isolation
- User impact: Developers may fail before the first successful local run: proxy: upstream call_tool path has no circuit breaker — SECURITY.md claims per-upstream breaker isolation
- Evidence: failure_mode_cluster:github_issue | https://github.com/memtomem/memtomem-stm/issues/608

## 5. Installation risk - Installation risk requires verification

- Severity: medium
- Evidence strength: source_linked
- Finding: Developers should check this installation risk before relying on the project: v0.1.29
- User impact: Upgrade or migration may change expected behavior: v0.1.29
- Evidence: failure_mode_cluster:github_release | https://github.com/memtomem/memtomem-stm/releases/tag/v0.1.29

## 6. Installation risk - Installation risk requires verification

- Severity: medium
- Evidence strength: source_linked
- Finding: Project evidence flags an installation risk. Review the linked source before relying on this workflow.
- User impact: May increase setup, validation, or first-run risk for the user.
- Evidence: community_evidence:github | https://github.com/memtomem/memtomem-stm/issues/611

## 7. Installation risk - Installation risk requires verification

- Severity: medium
- Evidence strength: source_linked
- Finding: Project evidence flags an installation risk. Review the linked source before relying on this workflow.
- User impact: May increase setup, validation, or first-run risk for the user.
- Evidence: community_evidence:github | https://github.com/memtomem/memtomem-stm/issues/601

## 8. Installation risk - Installation risk requires verification

- Severity: medium
- Evidence strength: source_linked
- Finding: Project evidence flags an installation risk. Review the linked source before relying on this workflow.
- User impact: May increase setup, validation, or first-run risk for the user.
- Evidence: community_evidence:github | https://github.com/memtomem/memtomem-stm/issues/612

## 9. Configuration risk - Configuration risk requires verification

- Severity: medium
- Evidence strength: source_linked
- Finding: Project evidence flags a configuration risk. Review the linked source before relying on this workflow.
- User impact: May increase setup, validation, or first-run risk for the user.
- Evidence: capability.host_targets | https://github.com/memtomem/memtomem-stm

## 10. Configuration risk - Configuration risk requires verification

- Severity: medium
- Evidence strength: source_linked
- Finding: Developers should check this configuration risk before relying on the project: Harden non-startup exc_info cleanup logs against credential leaks (follow-up to #580/#593)
- User impact: Developers may misconfigure credentials, environment, or host setup: Harden non-startup exc_info cleanup logs against credential leaks (follow-up to #580/#593)
- Evidence: failure_mode_cluster:github_issue | https://github.com/memtomem/memtomem-stm/issues/605
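
One general way to keep credentials out of `exc_info` output is to redact the fully formatted log line, traceback included, before it is written. The sketch below uses only the standard `logging` module. The `SECRET_RE` pattern and the `RedactingFormatter` name are assumptions for illustration and do not describe how the project handles this.

```python
import logging
import re

# Assumed pattern: key=value pairs whose key looks like a credential.
SECRET_RE = re.compile(r"(?i)\b(token|api[_-]?key|password)=([^\s&]+)")


class RedactingFormatter(logging.Formatter):
    """Hypothetical formatter that masks secrets in messages and tracebacks."""

    def format(self, record: logging.LogRecord) -> str:
        # super().format() already appends the formatted exc_info text,
        # so a single substitution covers both message and traceback.
        return SECRET_RE.sub(r"\1=***", super().format(record))
```

Attaching this formatter to every handler covers `logger.exception(...)` and `exc_info=True` calls alike. A real pattern list would need to match the credential shapes the deployment actually uses.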

## 11. Configuration risk - Configuration risk requires verification

- Severity: medium
- Evidence strength: source_linked
- Finding: Developers should check this configuration risk before relying on the project: cli/docs polish batch: --json coverage, daemon run/start naming, status/list overlap, CHANGELOG upgrade-notes
- User impact: Developers may misconfigure credentials, environment, or host setup: cli/docs polish batch: --json coverage, daemon run/start naming, status/list overlap, CHANGELOG upgrade-notes
- Evidence: failure_mode_cluster:github_issue | https://github.com/memtomem/memtomem-stm/issues/614

## 12. Configuration risk - Configuration risk requires verification

- Severity: medium
- Evidence strength: source_linked
- Finding: Developers should check this configuration risk before relying on the project: cli: mms tune --apply — apply stm_tuning_recommendations to stm_proxy.json instead of hand-editing
- User impact: Developers may misconfigure credentials, environment, or host setup: cli: mms tune --apply — apply stm_tuning_recommendations to stm_proxy.json instead of hand-editing
- Evidence: failure_mode_cluster:github_issue | https://github.com/memtomem/memtomem-stm/issues/615

## 13. Configuration risk - Configuration risk requires verification

- Severity: medium
- Evidence strength: source_linked
- Finding: Developers should check this configuration risk before relying on the project: config: unknown keys are silently ignored and parse failures silently fall back — add `mms config validate` and louder failure
- User impact: Developers may misconfigure credentials, environment, or host setup: config: unknown keys are silently ignored and parse failures silently fall back — add `mms config validate` and louder failure
- Evidence: failure_mode_cluster:github_issue | https://github.com/memtomem/memtomem-stm/issues/611
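
The failure mode named here, unknown keys ignored and parse errors swallowed, can be illustrated with a strict check that reports problems instead of falling back to defaults. This is a sketch only: `KNOWN_KEYS` is a made-up key set and `validate_config_text` is a hypothetical helper, not the behavior of `mms config validate`.

```python
import json

# Hypothetical top-level keys; the real stm_proxy.json schema may differ.
KNOWN_KEYS = {"upstreams", "cache", "compression"}


def validate_config_text(text: str) -> list:
    """Return a list of problems; an empty list means the text passed."""
    try:
        data = json.loads(text)
    except json.JSONDecodeError as exc:
        # Surface the parse failure rather than silently using defaults.
        return [f"parse error at line {exc.lineno}: {exc.msg}"]
    if not isinstance(data, dict):
        return ["top-level value must be a JSON object"]
    # Any key outside the known set is reported, e.g. a typo such as "cahce".
    return [f"unknown key: {key}" for key in sorted(data.keys() - KNOWN_KEYS)]
```

For example, `validate_config_text('{"cahce": {}}')` returns `["unknown key: cahce"]`, which is the kind of typo that would otherwise go unnoticed.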

## 14. Configuration risk - Configuration risk requires verification

- Severity: medium
- Evidence strength: source_linked
- Finding: Developers should check this configuration risk before relying on the project: decide: wire an index engine into the bundled mms server, or retire the inert auto_index/extraction config surface (follow-up to #288)
- User impact: Developers may misconfigure credentials, environment, or host setup: decide: wire an index engine into the bundled mms server, or retire the inert auto_index/extraction config surface (follow-up to #288)
- Evidence: failure_mode_cluster:github_issue | https://github.com/memtomem/memtomem-stm/issues/616

## 15. Configuration risk - Configuration risk requires verification

- Severity: medium
- Evidence strength: source_linked
- Finding: Developers should check this configuration risk before relying on the project: proxy: ProxyManager.stop() never closes the sqlite pending stores (leaked connection on config-change rebuild)
- User impact: Developers may misconfigure credentials, environment, or host setup: proxy: ProxyManager.stop() never closes the sqlite pending stores (leaked connection on config-change rebuild)
- Evidence: failure_mode_cluster:github_issue | https://github.com/memtomem/memtomem-stm/issues/601
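
The leak described in this finding follows a common pattern: a manager is rebuilt on config change, but its stop path never closes the sqlite connections it owns. A minimal sketch of the usual remedy is shown below. `PendingStore` and `Manager` are hypothetical stand-ins, not the project's `ProxyManager` or its store classes.

```python
import sqlite3
from typing import Optional


class PendingStore:
    """Hypothetical stand-in for a sqlite-backed pending store."""

    def __init__(self, path: str = ":memory:") -> None:
        self.conn: Optional[sqlite3.Connection] = sqlite3.connect(path)

    def close(self) -> None:
        # Idempotent: closing twice is a no-op.
        if self.conn is not None:
            self.conn.close()
            self.conn = None


class Manager:
    """Hypothetical owner of several stores."""

    def __init__(self, stores) -> None:
        self.stores = list(stores)

    def stop(self) -> None:
        # Attempt to close every store even if one close() fails,
        # then re-raise the first error so it is not hidden.
        errors = []
        for store in self.stores:
            try:
                store.close()
            except sqlite3.Error as exc:
                errors.append(exc)
        self.stores.clear()
        if errors:
            raise errors[0]
```

Because `stop()` empties the store list, calling it again during a rebuild is harmless.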

## 16. Configuration risk - Configuration risk requires verification

- Severity: medium
- Evidence strength: source_linked
- Finding: Developers should check this configuration risk before relying on the project: server: stderr-only logging — add optional rotating file log under ~/.memtomem/
- User impact: Developers may misconfigure credentials, environment, or host setup: server: stderr-only logging — add optional rotating file log under ~/.memtomem/
- Evidence: failure_mode_cluster:github_issue | https://github.com/memtomem/memtomem-stm/issues/612
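
An optional rotating file log of the kind this issue requests can be built with the standard library's `RotatingFileHandler`. The sketch below is an illustration under assumptions: the helper name, the `mms.log` file name, and the size limits are invented here, and only the `~/.memtomem/` location comes from the issue title.

```python
import logging
from logging.handlers import RotatingFileHandler
from pathlib import Path


def add_rotating_file_log(logger, log_dir, max_bytes=1_000_000, backups=3):
    """Attach a size-rotated file handler to `logger` (hypothetical helper)."""
    log_dir = Path(log_dir).expanduser()  # supports paths such as "~/.memtomem"
    log_dir.mkdir(parents=True, exist_ok=True)
    handler = RotatingFileHandler(
        log_dir / "mms.log",  # assumed file name
        maxBytes=max_bytes,   # rotate once the file reaches this size
        backupCount=backups,  # keep this many rotated files
        encoding="utf-8",
    )
    handler.setFormatter(
        logging.Formatter("%(asctime)s %(levelname)s %(name)s: %(message)s")
    )
    logger.addHandler(handler)
    return handler
```

The existing stderr handler can stay in place, since a logger may have several handlers at once. Any redaction applied to stderr output would need to be applied to this handler as well.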

## 17. Configuration risk - Configuration risk requires verification

- Severity: medium
- Evidence strength: source_linked
- Finding: Developers should check this configuration risk before relying on the project: types: mypy burn-down — 9 errors in 4 files, then re-enable the attr-defined suppression
- User impact: Developers may misconfigure credentials, environment, or host setup: types: mypy burn-down — 9 errors in 4 files, then re-enable the attr-defined suppression
- Evidence: failure_mode_cluster:github_issue | https://github.com/memtomem/memtomem-stm/issues/617

## 18. Configuration risk - Configuration risk requires verification

- Severity: medium
- Evidence strength: source_linked
- Finding: Developers should check this configuration risk before relying on the project: v0.1.27
- User impact: Upgrade or migration may change expected behavior: v0.1.27
- Evidence: failure_mode_cluster:github_release | https://github.com/memtomem/memtomem-stm/releases/tag/v0.1.27

## 19. Configuration risk - Configuration risk requires verification

- Severity: medium
- Evidence strength: source_linked
- Finding: Developers should check this configuration risk before relying on the project: v0.1.30
- User impact: Upgrade or migration may change expected behavior: v0.1.30
- Evidence: failure_mode_cluster:github_release | https://github.com/memtomem/memtomem-stm/releases/tag/v0.1.30

## 20. Configuration risk - Configuration risk requires verification

- Severity: medium
- Evidence strength: source_linked
- Finding: Developers should check this configuration risk before relying on the project: v0.1.31
- User impact: Upgrade or migration may change expected behavior: v0.1.31
- Evidence: failure_mode_cluster:github_release | https://github.com/memtomem/memtomem-stm/releases/tag/v0.1.31

## 21. Configuration risk - Configuration risk requires verification

- Severity: medium
- Evidence strength: source_linked
- Finding: Project evidence flags a configuration risk. Review the linked source before relying on this workflow.
- User impact: May increase setup, validation, or first-run risk for the user.
- Evidence: community_evidence:github | https://github.com/memtomem/memtomem-stm/issues/614

## 22. Configuration risk - Configuration risk requires verification

- Severity: medium
- Evidence strength: source_linked
- Finding: Project evidence flags a configuration risk. Review the linked source before relying on this workflow.
- User impact: May increase setup, validation, or first-run risk for the user.
- Evidence: community_evidence:github | https://github.com/memtomem/memtomem-stm/issues/615

## 23. Configuration risk - Configuration risk requires verification

- Severity: medium
- Evidence strength: source_linked
- Finding: Project evidence flags a configuration risk. Review the linked source before relying on this workflow.
- User impact: May increase setup, validation, or first-run risk for the user.
- Evidence: community_evidence:github | https://github.com/memtomem/memtomem-stm/issues/616

## 24. Configuration risk - Configuration risk requires verification

- Severity: medium
- Evidence strength: source_linked
- Finding: Project evidence flags a configuration risk. Review the linked source before relying on this workflow.
- User impact: May increase setup, validation, or first-run risk for the user.
- Evidence: community_evidence:github | https://github.com/memtomem/memtomem-stm/issues/619

## 25. Configuration risk - Configuration risk requires verification

- Severity: medium
- Evidence strength: source_linked
- Finding: Project evidence flags a configuration risk. Review the linked source before relying on this workflow.
- User impact: May increase setup, validation, or first-run risk for the user.
- Evidence: community_evidence:github | https://github.com/memtomem/memtomem-stm/issues/617

## 26. Capability evidence risk - Capability evidence risk requires verification

- Severity: medium
- Evidence strength: source_linked
- Finding: README/documentation is current enough for a first validation pass.
- User impact: May increase setup, validation, or first-run risk for the user.
- Evidence: capability.assumptions | https://github.com/memtomem/memtomem-stm

## 27. Runtime risk - Runtime risk requires verification

- Severity: medium
- Evidence strength: source_linked
- Finding: Project evidence flags a runtime risk. Review the linked source before relying on this workflow.
- User impact: May increase setup, validation, or first-run risk for the user.
- Evidence: community_evidence:github | https://github.com/memtomem/memtomem-stm/issues/618

## 28. Maintenance risk - Maintenance risk requires verification

- Severity: medium
- Evidence strength: source_linked
- Finding: Project evidence flags a maintenance risk. Review the linked source before relying on this workflow.
- User impact: May increase setup, validation, or first-run risk for the user.
- Evidence: evidence.maintainer_signals | https://github.com/memtomem/memtomem-stm

## 29. Security or permission risk - Security or permission risk requires verification

- Severity: medium
- Evidence strength: source_linked
- Finding: No runnable demo was found for this project (`no_demo`).
- User impact: May increase setup, validation, or first-run risk for the user.
- Evidence: downstream_validation.risk_items | https://github.com/memtomem/memtomem-stm

## 30. Security or permission risk - Security or permission risk requires verification

- Severity: medium
- Evidence strength: source_linked
- Finding: No runnable demo was found for this project (`no_demo`).
- User impact: May increase setup, validation, or first-run risk for the user.
- Evidence: risks.scoring_risks | https://github.com/memtomem/memtomem-stm

## 31. Security or permission risk - Security or permission risk requires verification

- Severity: medium
- Evidence strength: source_linked
- Finding: Project evidence flags a security or permission risk. Review the linked source before relying on this workflow.
- User impact: May increase setup, validation, or first-run risk for the user.
- Evidence: community_evidence:github | https://github.com/memtomem/memtomem-stm/issues/605

## 32. Security or permission risk - Security or permission risk requires verification

- Severity: medium
- Evidence strength: source_linked
- Finding: Project evidence flags a security or permission risk. Review the linked source before relying on this workflow.
- User impact: May increase setup, validation, or first-run risk for the user.
- Evidence: community_evidence:github | https://github.com/memtomem/memtomem-stm/issues/609

## 33. Security or permission risk - Security or permission risk requires verification

- Severity: medium
- Evidence strength: source_linked
- Finding: Project evidence flags a security or permission risk. Review the linked source before relying on this workflow.
- User impact: May increase setup, validation, or first-run risk for the user.
- Evidence: community_evidence:github | https://github.com/memtomem/memtomem-stm/issues/610

## 34. Security or permission risk - Security or permission risk requires verification

- Severity: medium
- Evidence strength: source_linked
- Finding: Project evidence flags a security or permission risk. Review the linked source before relying on this workflow.
- User impact: May increase setup, validation, or first-run risk for the user.
- Evidence: community_evidence:github | https://github.com/memtomem/memtomem-stm/issues/608

## 35. Maintenance risk - Maintenance risk requires verification

- Severity: low
- Evidence strength: source_linked
- Finding: issue_or_pr_quality=unknown.
- User impact: May increase setup, validation, or first-run risk for the user.
- Evidence: evidence.maintainer_signals | https://github.com/memtomem/memtomem-stm

## 36. Maintenance risk - Maintenance risk requires verification

- Severity: low
- Evidence strength: source_linked
- Finding: release_recency=unknown.
- User impact: May increase setup, validation, or first-run risk for the user.
- Evidence: evidence.maintainer_signals | https://github.com/memtomem/memtomem-stm

<!-- canonical_name: memtomem/memtomem-stm; human_manual_source: deepwiki_human_wiki -->
