# https://github.com/NevaMind-AI/memU Project Manual

Generated at: 2026-06-24 00:12:59 UTC

## Table of Contents

- [System Overview & Architecture](#page-overview)
- [Storage Backends & Data Model](#page-storage)
- [LLM, Embedding & VLM Providers and Routing](#page-llm-embedding)
- [Memorize & Retrieve Workflows, Preprocessing and Prompts](#page-memorize-retrieve)

<a id='page-overview'></a>

## System Overview & Architecture

### Related Pages

Related topics: [Storage Backends & Data Model](#page-storage), [LLM, Embedding & VLM Providers and Routing](#page-llm-embedding), [Memorize & Retrieve Workflows, Preprocessing and Prompts](#page-memorize-retrieve)

<details>
<summary>Related Source Files</summary>

The following source files were used to generate this page:

- [README.md](https://github.com/NevaMind-AI/memU/blob/main/README.md)
- [src/memu/prompts/memory_type/skill.py](https://github.com/NevaMind-AI/memU/blob/main/src/memu/prompts/memory_type/skill.py)
- [src/memu/prompts/memory_type/knowledge.py](https://github.com/NevaMind-AI/memU/blob/main/src/memu/prompts/memory_type/knowledge.py)
- [src/memu/prompts/memory_type/behavior.py](https://github.com/NevaMind-AI/memU/blob/main/src/memu/prompts/memory_type/behavior.py)
- [src/memu/prompts/memory_type/tool.py](https://github.com/NevaMind-AI/memU/blob/main/src/memu/prompts/memory_type/tool.py)
- [src/memu/prompts/preprocess/document.py](https://github.com/NevaMind-AI/memU/blob/main/src/memu/prompts/preprocess/document.py)
- [src/memu/prompts/category_summary/category.py](https://github.com/NevaMind-AI/memU/blob/main/src/memu/prompts/category_summary/category.py)
- [src/memu/prompts/category_summary/category_with_refs.py](https://github.com/NevaMind-AI/memU/blob/main/src/memu/prompts/category_summary/category_with_refs.py)
</details>

# System Overview & Architecture

## Purpose and Scope

memU is an agent memory framework that turns raw sources (chat logs, documents, deployment logs, agent traces) into a structured, self-organizing **compiled workspace** that can be queried on demand. The framework exposes two runtime operations: `memorize()` for writing new sources into the workspace, and `retrieve()` for serving only the relevant layers back to an agent. Source: [README.md]().

The framework targets three concrete needs of long-running agents:

- **Context** — inject the right facts, preferences, and source material instead of a cold prompt.
- **Continuity** — the workspace persists and self-organizes across sessions, sources, and tasks.
- **Control** — every record is structured and inspectable, tracing back to its raw source for auditing and editing.

Source: [README.md]().

The framework is delivered as a Python package (with both Linux x86_64 and ARM64 wheel targets added in v1.1.0) and offers a pluggable storage layer supporting in-memory, SQLite, and Postgres backends behind a shared repository contract. Source: [README.md](), [v1.1.0 release notes]().

## The Compiled Workspace

The primary output of memU is a navigable workspace composed of three layered records. Source: [README.md]().

| Record | Role | Contents |
|--------|------|----------|
| `MemoryCategory` | **Folder** — a topic with an evolving summary | `name`, `description`, `summary`, `embedding`, child `MemoryItem[]` |
| `MemoryItem` | **File** — a typed atomic memory | `memory_type` ∈ {profile, event, knowledge, behavior, skill, tool}, `summary`, `extra`, `happened_at`, `embedding` |
| `Resource` | **Source** — the raw artifact behind the memory | `url`, `modality`, `local_path`, `caption`, `embedding` |

The on-disk projection of this workspace is a tree of `INDEX.md`, `MEMORY.md`, and per-skill `SKILL.md` files, persisted through the configured storage backend. Source: [README.md]().

### Runtime Operations

```
WRITE — memorize()                                   READ — retrieve()
─────────────────────────────────────────            ─────────────────────────────────────────
raw files → extract → files + folders               query → walk folders → ranked files
persist via repository contracts                     return scoped, ranked context
```

Source: [README.md]().

Both operations are exposed through the `MemoryService` class. The `memorize()` method accepts a `resource_url` and a `modality` (e.g. `conversation`, `document`) along with scoping metadata such as `user_id`. The `retrieve()` method accepts a list of query items and a `where` filter for scoped lookup. Source: [README.md]().

## Typed Memory Extraction

memU classifies extracted memory into six canonical types, each with a dedicated prompt module that enforces type-specific rules. Source: [README.md]().

```mermaid
flowchart LR
    A[Raw Resource] --> B[Preprocess]
    B --> C{Type Router}
    C -->|profile| P[profile.py]
    C -->|event| E[event.py]
    C -->|knowledge| K[knowledge.py]
    C -->|behavior| Bp[behavior.py]
    C -->|skill| S[skill.py]
    C -->|tool| T[tool.py]
    P --> M[MemoryItem]
    E --> M
    K --> M
    Bp --> M
    S --> M
    T --> M
    M --> Cat[MemoryCategory]
```

- **Knowledge** — declarative facts, concepts, and explanations; forbids opinions, personal experiences, and user-specific traits. Items must be self-contained, under ~50 words, and merged when redundant. Source: [src/memu/prompts/memory_type/knowledge.py]().
- **Behavior** — recurring patterns, routines, and solutions; forbids one-time actions unless they reveal a significant pattern, and forbids assistant-only turns. Items must use the word "user" consistently to attribute the subject. Source: [src/memu/prompts/memory_type/behavior.py]().
- **Skill** — comprehensive skill profiles with YAML frontmatter (`name`, `description`, `category`, `demonstrated-in`) and full sections (Core Principles, When to Use, Implementation Guide, Success Patterns, Common Pitfalls, Key Takeaways). Each profile must be at least 300 words to ensure depth and actionability. Source: [src/memu/prompts/memory_type/skill.py]().
- **Tool** — tool usage patterns including tool name, scenario, outcome, and a `when_to_use` retrieval hint, capturing both successful patterns and failure lessons. Source: [src/memu/prompts/memory_type/tool.py]().

### Preprocessing Stage

Before typed extraction, document inputs are condensed by a dedicated preprocess prompt that produces two outputs: a `processed_content` block that preserves all key information while removing verbosity, and a single-sentence `caption` summarizing the document. Source: [src/memu/prompts/preprocess/document.py]().

## Category Organization and Retrieval

After extraction, memory items are sorted into `MemoryCategory` folders, cross-linked, embedded, and summarized into a browsable tree. Source: [README.md]().

Two summary prompt variants are used to keep the folder summaries current:

- **Plain category summary** — merges original content with new memory items, preserves Markdown hierarchy, and excludes one-off actions that lack long-term value (e.g. "ate Malatang today"). Source: [src/memu/prompts/category_summary/category.py]().
- **Reference-annotated category summary** — adds inline `[ref:ITEM_ID]` references pointing back to the specific memory items that contributed each fact, enabling auditability and inline citation. Source: [src/memu/prompts/category_summary/category_with_refs.py]().

During `retrieve()`, the framework navigates these folders and returns only the files relevant to the current user, agent, session, or task, returning scoped, ranked context that can be injected into any agent workflow. Source: [README.md]().

## Community Notes and Known Limitations

Several issues reported by the community are worth noting for anyone reading this overview:

- **Storage column bug** — the `memory_items.happened_at` column has been observed to store `NULL` even when proper timestamps are provided in conversational input. This affects the optional temporal metadata on memory items added in v1.3.0. Source: [issue #428]().
- **SQLite embedding storage** — the SQLite backend has had issues storing embedding fields as native lists, raising `ValueError: <class 'list'> has no matching SQLAlchemy type` in affected test scripts. Source: [issue #382]().
- **Server entry point** — the `memu-server` console script has been reported to point to a missing `memu.server.cli` module on `main`; users packaging the server should verify the entry point exists in their installed version. Source: [issue #354]().

These limitations are tracked in the issue tracker and the release notes (currently at v1.5.1, which includes an alembic URL interpolation fix). Source: [v1.5.1 release notes]().

## See Also

- Storage backends and repository contracts (in-memory, SQLite, Postgres)
- LLM profile configuration and routing
- `MemoryService.memorize()` / `retrieve()` API reference
- Custom memory type prompt registration
- Category summary reference format and inline citations

---

<a id='page-storage'></a>

## Storage Backends & Data Model

### Related Pages

Related topics: [System Overview & Architecture](#page-overview), [Memorize & Retrieve Workflows, Preprocessing and Prompts](#page-memorize-retrieve)

<details>
<summary>Related Source Files</summary>

The following source files were used to generate this page:

- [src/memu/database/factory.py](https://github.com/NevaMind-AI/memU/blob/main/src/memu/database/factory.py)
- [src/memu/database/interfaces.py](https://github.com/NevaMind-AI/memU/blob/main/src/memu/database/interfaces.py)
- [src/memu/database/models.py](https://github.com/NevaMind-AI/memU/blob/main/src/memu/database/models.py)
- [src/memu/database/state.py](https://github.com/NevaMind-AI/memU/blob/main/src/memu/database/state.py)
- [src/memu/database/inmemory/repo.py](https://github.com/NevaMind-AI/memU/blob/main/src/memu/database/inmemory/repo.py)
- [src/memu/database/inmemory/models.py](https://github.com/NevaMind-AI/memU/blob/main/src/memu/database/inmemory/models.py)
- [src/memu/database/sqlite/sqlite.py](https://github.com/NevaMind-AI/memU/blob/main/src/memu/database/sqlite/sqlite.py)
- [src/memu/app/settings.py](https://github.com/NevaMind-AI/memU/blob/main/src/memu/app/settings.py)
- [README.md](https://github.com/NevaMind-AI/memU/blob/main/README.md)
</details>

# Storage Backends & Data Model

The storage layer is the persistence core of memU. It defines the canonical data model — `Resource`, `MemoryItem`, `MemoryCategory`, and the join record `CategoryItem` — and provides pluggable repository implementations behind a shared contract. The runtime is decoupled from any specific engine: callers go through `DatabaseFactory`, which returns a backend that satisfies the same interfaces used during `memorize()` and `retrieve()`. The README explicitly advertises "in-memory, SQLite, or Postgres backends with the same repository contracts" as a core feature. Source: [README.md]()

## Data Model

The model treats memory as a workspace hierarchy rather than a flat table.

| Record | Role | Key Fields |
|--------|------|------------|
| `Resource` | Raw source artifact (chat log, document, image caption) | `url`, `modality`, `local_path`, `caption`, `embedding` |
| `MemoryItem` | Typed atomic memory extracted from a resource | `memory_type`, `summary`, `extra`, `happened_at`, `embedding` |
| `MemoryCategory` | Folder grouping items by topic with an evolving summary | `name`, `description`, `summary`, `embedding` |
| `CategoryItem` | Many-to-many relation linking items to categories | `category_id`, `item_id` |

`memory_type` is an enum-like string drawn from a fixed set: `profile`, `event`, `knowledge`, `behavior`, `skill`, `tool`. The README and the per-type prompt modules (`src/memu/prompts/memory_type/*.py`) confirm this set. Each prompt module defines extraction rules for its type — for example, `event.py` requires a declarative sentence with a timestamp, while `behavior.py` insists on recurring patterns rather than one-off actions.

```mermaid
flowchart TB
    R[Resource] -->|extract| MI[MemoryItem]
    MI -->|belongs to| CI[CategoryItem]
    MC[MemoryCategory] -->|groups| CI
    MC -.embedding.-> VS[(Vector Store)]
    MI -.embedding.-> VS
    R -.embedding.-> VS
```

All three primary records carry an `embedding` field, which the retrieve pipeline uses for semantic ranking across both category-level (broad context) and item-level (precise facts) queries.

## Backend Implementations

### In-Memory

The default backend stores everything in Python dictionaries and is intended for tests, demos, and ephemeral agent runs. Its repositories (`src/memu/database/inmemory/repo.py`) cache `Resource`, `MemoryItem`, `MemoryCategory`, and `CategoryItem` records in instance attributes and write through synchronously. No process restart can recover state.

### SQLite

Introduced in v1.2.0 ([release notes]()), the SQLite backend uses SQLAlchemy models and a per-record repository layout. The `SQLiteMemoryStore` defined in `src/memu/database/sqlite/sqlite.py` composes four repositories — `ResourceRepo`, `MemoryCategoryRepo`, `MemoryItemRepo`, and `CategoryItemRepo` — and caches each in a dict for read access. The constructor accepts a Pydantic `scope_model` for user scoping and optional custom model overrides, so application code can extend `Resource` or `MemoryItem` without subclassing the store. Source: [src/memu/database/sqlite/sqlite.py:23-58]()

The DSN is passed as a plain SQLAlchemy URL (`sqlite:///path/to/db.sqlite`). As of v1.5.1 the alembic migration URL interpolation was fixed ([commit fd87ceb]()), so URL templating with environment variables now behaves correctly.

### Postgres

The Postgres backend shares the same repository contracts and is the recommended option for production deployments where embeddings and high-volume writes need to coexist with ACID guarantees.

## Configuration

The `DatabaseConfig` block in `src/memu/app/settings.py` selects the backend and its connection parameters. The `BlobConfig` controls where raw resources are materialized on disk (`resources_dir`, default `./data/resources`), which is independent of the relational store.

The `MemoryFilesConfig` block — disabled by default — toggles the on-disk "memory file system" rendering that writes `INDEX.md`, `MEMORY.md`, and per-skill `SKILL.md` files under `output_dir`. When `synthesize=True`, those markdown files are generated via an LLM call using the configured `synthesis_llm_profile`; otherwise they are rendered deterministically from already-extracted records. Source: [src/memu/app/settings.py:25-60]()

The retrieve-side options live in `RetrieveCategoryConfig` and `RetrieveItemConfig`: `top_k` controls how many categories (default 5) and how many items are returned per query. Both are enabled by default.

## Common Failure Modes

The community has surfaced three recurring issues that map directly onto the storage layer:

1. **`happened_at` NULL on conversational input** — When `MemoryService.memorize()` ingests chat messages with explicit timestamps, the resulting `memory_items.happened_at` column is sometimes stored as `NULL` even though the source carried a valid time. This indicates a propagation gap between the message envelope and the typed `event` item writer. Tracked in [issue #428]().

2. **SQLite embedding serialization** — Earlier snapshots of the SQLite backend raised `ValueError: <class 'list'> has no matching SQLAlchemy type` when persisting embeddings, because the `embedding` column was declared without a SQLAlchemy type that understands Python `list`. Reported in [issue #382]().

3. **`memu-server` entry point** — `pyproject.toml` configured `memu-server = "memu.server.cli:main"`, but `src/memu/server/cli.py` was missing from the tree, so the installed console script failed at import time. Reported in [issue #354](); fixed in a later release.

For all three, the mitigation is the same: pin to a release at or after v1.5.1, where the alembic interpolation fix and prompt/item fallback improvements have landed.

## See Also

- [Memory Extraction Prompts](https://github.com/NevaMind-AI/memU/blob/main/src/memu/prompts/memory_type/) — per-type rules for `profile`, `event`, `knowledge`, `behavior`, `skill`, `tool`.
- [Configuration Reference](https://github.com/NevaMind-AI/memU/blob/main/src/memu/app/settings.py) — full list of `DatabaseConfig`, `BlobConfig`, and retrieve options.

---

<a id='page-llm-embedding'></a>

## LLM, Embedding & VLM Providers and Routing

### Related Pages

Related topics: [System Overview & Architecture](#page-overview), [Memorize & Retrieve Workflows, Preprocessing and Prompts](#page-memorize-retrieve)

<details>
<summary>Related Source Files</summary>

The following source files were used to generate this page:

- [src/memu/llm/__init__.py](https://github.com/NevaMind-AI/memU/blob/main/src/memu/llm/__init__.py)
- [src/memu/llm/base.py](https://github.com/NevaMind-AI/memU/blob/main/src/memu/llm/base.py)
- [src/memu/llm/gateway.py](https://github.com/NevaMind-AI/memU/blob/main/src/memu/llm/gateway.py)
- [src/memu/llm/wrapper.py](https://github.com/NevaMind-AI/memU/blob/main/src/memu/llm/wrapper.py)
- [src/memu/llm/openai_client.py](https://github.com/NevaMind-AI/memU/blob/main/src/memu/llm/openai_client.py)
- [src/memu/llm/anthropic_client.py](https://github.com/NevaMind-AI/memU/blob/main/src/memu/llm/anthropic_client.py)
- [src/memu/llm/lazyllm_client.py](https://github.com/NevaMind-AI/memU/blob/main/src/memu/llm/lazyllm_client.py)
- [src/memu/llm/http_client.py](https://github.com/NevaMind-AI/memU/blob/main/src/memu/llm/http_client.py)
- [src/memu/llm/backends/base.py](https://github.com/NevaMind-AI/memU/blob/main/src/memu/llm/backends/base.py)
- [src/memu/llm/backends/openrouter.py](https://github.com/NevaMind-AI/memU/blob/main/src/memu/llm/backends/openrouter.py)
- [src/memu/vlm/__init__.py](https://github.com/NevaMind-AI/memU/blob/main/src/memu/vlm/__init__.py)
- [src/memu/vlm/base.py](https://github.com/NevaMind-AI/memU/blob/main/src/memu/vlm/base.py)
- [src/memu/vlm/gateway.py](https://github.com/NevaMind-AI/memU/blob/main/src/memu/vlm/gateway.py)
- [src/memu/vlm/backends/base.py](https://github.com/NevaMind-AI/memU/blob/main/src/memu/vlm/backends/base.py)
- [src/memu/vlm/openai_client.py](https://github.com/NevaMind-AI/memU/blob/main/src/memu/vlm/openai_client.py)
- [src/memu/vlm/anthropic_client.py](https://github.com/NevaMind-AI/memU/blob/main/src/memu/vlm/anthropic_client.py)
- [src/memu/vlm/http_client.py](https://github.com/NevaMind-AI/memU/blob/main/src/memu/vlm/http_client.py)
- [README.md](https://github.com/NevaMind-AI/memU/blob/main/README.md)
</details>

# LLM, Embedding & VLM Providers and Routing

## 1. Purpose and Scope

memU needs to talk to a heterogeneous set of model providers for three distinct capabilities: **text chat / summarization (LLM)**, **vector embeddings (Embedding)**, and **vision-language analysis (VLM)**. The "Providers and Routing" subsystem is the abstraction layer that decouples the rest of the memory pipeline from any single vendor.

Its responsibilities are:

- Normalize the request/response shape of every supported provider behind a small, capability-scoped interface.
- Let the rest of memU call `client.summarize(...)` or `client.vision(...)` without knowing whether the request actually went to OpenAI, Anthropic, OpenRouter, or a self-hosted HTTP endpoint.
- Allow new providers to be added by registering a **backend module** and (optionally) a **transport client**, without editing the service composition root.

The README states the project is "**Profile-Based LLM Routing**" — chat, embedding, vision, and transcription work are routed through configurable LLM profiles, which is implemented via the per-capability `gateway.py` files.

> Source: [README.md]()

## 2. Package Layout

The two text/image capabilities each live in their own sibling package and follow the same internal shape:

```
src/memu/llm/                  # text / chat / summarization
├── base.py                    # LLMClient interface
├── gateway.py                 # builds a client from settings.LLMConfig
├── wrapper.py                 # higher-level façade used by services
├── openai_client.py           # OpenAI SDK transport
├── anthropic_client.py        # Anthropic SDK transport
├── lazyllm_client.py          # LazyLLM transport
├── http_client.py             # raw HTTP transport
├── defaults.py                # per-provider default model picks
└── backends/                  # per-provider request/response shape
    ├── base.py
    ├── openrouter.py
    └── ...

src/memu/vlm/                  # image / video understanding (mirrors llm/)
├── base.py                    # VLMClient interface + encode_image()
├── gateway.py                 # builds a VLM client from settings.VLMConfig
├── openai_client.py
├── anthropic_client.py
├── http_client.py
├── defaults.py
└── backends/
    ├── base.py
    └── ...
```

The package docstrings spell this out explicitly: "`backends/`: per-provider vision request/response shapes (HTTP transport). `http_client`/`openai_client`/`anthropic_client`: transport clients. `gateway`: build a client from a `:class:`memu.app.settings.VLMConfig`."

> Source: [src/memu/vlm/__init__.py:1-20]()

## 3. LLM Provider System

### 3.1 The backend contract

Every LLM provider is described by a subclass of `LLMBackend` defined in `src/memu/llm/backends/base.py`. The base class exposes three customization points:

- `default_headers(api_key)` — returns the auth header set. Defaults to OpenAI-style `Authorization: Bearer …`; Anthropic overrides this to use `x-api-key`.
- `build_summary_payload(...)` — converts `(text, system_prompt, chat_model, max_tokens)` into the provider-specific JSON body.
- `parse_summary_response(data)` — extracts the assistant text from the provider's response envelope.
- `build_vision_payload(...)` — builds the vision request body for text+image prompts (reused from the VLM path on providers that share an endpoint).

Concrete backends such as `OpenRouterLLMBackend` set `summary_endpoint = "/api/v1/chat/completions"` and emit OpenAI-compatible message arrays, then parse the response with `data["choices"][0]["message"]["content"]`. This means many OpenAI-shaped providers can be supported by a single small backend module.

> Source: [src/memu/llm/backends/base.py:1-44](), [src/memu/llm/backends/openrouter.py:1-25]()

### 3.2 Transports

The transport layer is independent of the backend layer. For each capability there is a small set of clients:

- `openai_client.py` — uses the official OpenAI SDK.
- `anthropic_client.py` — uses the official Anthropic SDK; the gateway strips a stale OpenAI `base_url` default and falls back to `https://api.anthropic.com`.
- `http_client.py` — a raw HTTP/httpx transport for any OpenAI-compatible endpoint, useful for self-hosted models and proxies.
- `lazyllm_client.py` — LazyLLM transport.

### 3.3 The LLM gateway

`src/memu/llm/gateway.py` exposes a `build_llm_client(cfg)` function that inspects `LLMConfig` (base URL, API key, provider name, model) and returns a fully-wired `LLMClient`. Adding a new transport is a single-line registration in the gateway; adding a new provider that reuses an existing transport is a new backend module.

The `LLMClient` base class in `src/memu/llm/base.py` defines the methods the rest of memU calls (`summarize`, `vision`, etc.); concrete clients implement them by calling the active backend.

> Source: [src/memu/llm/base.py](), [src/memu/llm/gateway.py]()

```mermaid
flowchart LR
    Settings["LLMConfig<br/>(base_url, api_key, model)"] --> Gateway["llm/gateway.py<br/>build_llm_client()"]
    Gateway -->|provider=openai| OpenAI["openai_client.py"]
    Gateway -->|provider=anthropic| Anthropic["anthropic_client.py"]
    Gateway -->|provider=http| HTTP["http_client.py"]
    Gateway -->|provider=lazyllm| Lazy["lazyllm_client.py"]
    OpenAI --> Backend["backends/*<br/>payload + response shape"]
    Anthropic --> Backend
    HTTP --> Backend
    Lazy --> Backend
    Backend --> Memory["MemoryService<br/>memorize / retrieve"]
```

## 4. VLM Provider System (Mirror of LLM)

The vision subsystem is intentionally a near-clone of the LLM subsystem. `VLMClient` exposes a single multimodal capability, `vision(prompt, image_path, ...)`, and `encode_image(image_path)` base64-encodes a file and infers the MIME type from the extension (`.jpg/.jpeg → image/jpeg`, `.png → image/png`, etc.).

The VLM gateway builds the client from a `VLMConfig` and registers three builders by default — SDK, Anthropic SDK, and raw HTTP — and shares the same OpenAI-default `base_url` stripping logic as the LLM gateway so an LLM-targeted config does not leak into a vision call.

> Source: [src/memu/vlm/base.py:1-44](), [src/memu/vlm/gateway.py:1-40]()

### 4.1 Embeddings

Although embedding is a separate logical capability, it is wired into the same routing model: the v1.5.0 release added HTTP proxy support for "LLM & embedding clients" together, and v1.0.1 explicitly fixed "get embedding client," indicating that embedding clients are constructed through the same gateway pattern. Embedding providers therefore plug in by adding a backend module and (if needed) a transport — no service-layer changes are required.

> Source: [README release notes for v1.5.0 and v1.0.1]()

## 5. Routing, Configuration, and Common Failure Modes

### 5.1 Profile-based routing

The README summarizes the goal: "**Profile-Based LLM Routing** — Route chat, embedding, vision, and transcription work through configurable LLM profiles." In practice this means:

- A `LLMConfig` (and its VLM / embedding siblings) is declared once in `app/settings.py`.
- `MemoryService` and friends never import a specific provider — they only depend on `LLMClient` / `VLMClient` interfaces.
- Swapping providers is a config change, not a code change.

### 5.2 Provider-customization table

| Concern | Where it lives | What you change |
|---------|---------------|----------------|
| Auth header scheme | `LLMBackend.default_headers` / `VLMBackend.default_headers` | Providers that don't use `Authorization: Bearer …` (e.g. Anthropic's `x-api-key`) |
| Request body shape | `build_summary_payload` / `build_vision_payload` | Providers whose message/tool format diverges from OpenAI |
| Response parsing | `parse_summary_response` / `parse_vision_response` | Providers that wrap `choices[0].message.content` differently |
| Endpoint path | `summary_endpoint` / `vision_endpoint` class attribute | Providers whose path differs from `/chat/completions` |
| Transport (SDK vs HTTP) | `llm/gateway.py` / `vlm/gateway.py` | Self-hosted endpoints, proxies, LazyLLM |
| Default model | `defaults.py` | Picking the latest recommended model per provider |

> Source: [src/memu/llm/backends/base.py:1-44](), [src/memu/vlm/backends/base.py:1-40]()

### 5.3 Known limitations and community-reported issues

- **Stale entry point on a release line** — `memu-server` was configured to point to a `memu.server.cli` module that did not exist on `main`, breaking startup. The fix lives in the packaging layer (`pyproject.toml`), not in the LLM/VLM code, but it is the kind of regression users hit immediately and is worth being aware of when upgrading. See issue [#354](https://github.com/NevaMind-AI/memU/issues/354).
- **SQLite + embeddings** — a community-reported bug ("sqlite backend embding issue", [#382](https://github.com/NevaMind-AI/memU/issues/382)) raised `ValueError: <class 'list'> has no matching SQLAlchemy type` because the `embedding` column is stored as a list/JSON shape. When the embedding provider returns vectors, the persistence layer must serialize them; ensure your backend writes JSON, not Python `list`, when the target dialect does not support array types natively.
- **HTTP proxy support** — added in v1.5.0 across LLM and embedding clients. If you sit behind a corporate proxy, set the new proxy option on the relevant `*Config` rather than patching transports.
- **Embedding client wiring** — v1.0.1's "get embedding client" fix is a reminder that embedding construction goes through the gateway; if a custom embedding provider is added, the corresponding builder must be registered there or `build_*_client` will return `None`.

> Source: community issues [#354](https://github.com/NevaMind-AI/memU/issues/354), [#382](https://github.com/NevaMind-AI/memU/issues/382); release notes v1.0.1, v1.5.0.

## See Also

- [Storage Backends (in-memory, SQLite, Postgres)]()
- [MemoryService: `memorize` and `retrieve`]()
- [Configuration: `app/settings.py` and `*Config` types]()
- [Multimodal Pipeline: image and video ingestion]()

---

<a id='page-memorize-retrieve'></a>

## Memorize & Retrieve Workflows, Preprocessing and Prompts

### Related Pages

Related topics: [System Overview & Architecture](#page-overview), [Storage Backends & Data Model](#page-storage), [LLM, Embedding & VLM Providers and Routing](#page-llm-embedding)

<details>
<summary>Related Source Files</summary>

The following source files were used to generate this page:

- [src/memu/app/memorize.py](https://github.com/NevaMind-AI/memU/blob/main/src/memu/app/memorize.py)
- [src/memu/app/retrieve.py](https://github.com/NevaMind-AI/memU/blob/main/src/memu/app/retrieve.py)
- [src/memu/app/crud.py](https://github.com/NevaMind-AI/memU/blob/main/src/memu/app/crud.py)
- [src/memu/preprocess/base.py](https://github.com/NevaMind-AI/memU/blob/main/src/memu/preprocess/base.py)
- [src/memu/preprocess/conversation.py](https://github.com/NevaMind-AI/memU/blob/main/src/memu/preprocess/conversation.py)
- [src/memu/preprocess/document.py](https://github.com/NevaMind-AI/memU/blob/main/src/memu/preprocess/document.py)
- [src/memu/prompts/preprocess/document.py](https://github.com/NevaMind-AI/memU/blob/main/src/memu/prompts/preprocess/document.py)
- [src/memu/prompts/memory_type/event.py](https://github.com/NevaMind-AI/memU/blob/main/src/memu/prompts/memory_type/event.py)
- [src/memu/prompts/memory_type/knowledge.py](https://github.com/NevaMind-AI/memU/blob/main/src/memu/prompts/memory_type/knowledge.py)
- [src/memu/prompts/memory_type/behavior.py](https://github.com/NevaMind-AI/memU/blob/main/src/memu/prompts/memory_type/behavior.py)
- [src/memu/prompts/memory_type/skill.py](https://github.com/NevaMind-AI/memU/blob/main/src/memu/prompts/memory_type/skill.py)
- [src/memu/prompts/memory_type/tool.py](https://github.com/NevaMind-AI/memU/blob/main/src/memu/prompts/memory_type/tool.py)
- [src/memu/prompts/category_summary/category.py](https://github.com/NevaMind-AI/memU/blob/main/src/memu/prompts/category_summary/category.py)
- [README.md](https://github.com/NevaMind-AI/memU/blob/main/README.md)
</details>

# Memorize & Retrieve Workflows, Preprocessing and Prompts

## 1. Overview and Compiled Workspace

memU exposes a two-operation API to agents: `memorize()` and `retrieve()`. Internally, these operations compile raw sources (chat logs, documents, logs, images) into a navigable **workspace** and then serve only the layers that match a query. The README frames the design as a "compiled workspace" that emulates a file system: `MemoryCategory` records act as folders, `MemoryItem` records act as files, and `Resource` records preserve the original artifact that produced each memory ([README.md](https://github.com/NevaMind-AI/memU/blob/main/README.md)).

The compiled workspace follows a fixed shape:

```text
MemoryCategory                       (folder: topic with evolving summary)
├── name, description, summary, embedding
└── MemoryItem[]                     (files: typed, atomic memories)
    ├── memory_type: profile | event | knowledge | behavior | skill | tool
    ├── summary, extra, happened_at, embedding
    └── Resource                     (source: raw file)
        └── url, modality, local_path, caption, embedding
```

`memorize()` accepts a `resource_url` and a `modality` (e.g. `conversation`, `document`) and returns a dictionary describing what was written. `retrieve()` accepts `queries` and a `where` filter (typically `user_id`) and returns only the folders and files relevant to the request, ranked for prompt injection. The README's "WRITE — memorize()" / "READ — retrieve()" diagram describes the full flow as: raw files → extract → files + folders → persist; and query → walk folders → ranked files ([README.md](https://github.com/NevaMind-AI/memU/blob/main/README.md)).

## 2. The Memorize Pipeline

The memorize path lives under `src/memu/app/memorize.py`, which orchestrates preprocessing, type-specific extraction, category routing, embedding, and persistence through the CRUD layer in `src/memu/app/crud.py`. The pipeline has five logical stages ([README.md](https://github.com/NevaMind-AI/memU/blob/main/README.md)):

1. **Preprocess** — convert the raw resource into a clean text representation
2. **Extract** — run one or more memory-type extractors to produce typed items
3. **Categorize & link** — place items into `MemoryCategory` folders, cross-link, and embed
4. **Summarize** — produce/refresh per-category summaries
5. **Persist** — write items, relations, embeddings, and summaries through the repository contract

```mermaid
flowchart LR
    A[Raw resource<br/>conversation / document] --> B[Preprocess]
    B --> C[Memory-type extractors<br/>profile · event · knowledge · behavior · skill · tool]
    C --> D[MemoryItem + MemoryCategory]
    D --> E[Embedding + Category Summary]
    E --> F[(Repository<br/>in-memory / SQLite / Postgres)]
    F --> G[retrieve<br/>walk folders → ranked files]
```

### 2.1 Preprocessing

`src/memu/preprocess/base.py` defines the abstract preprocessor contract, and concrete implementations live in `src/memu/preprocess/conversation.py` and `src/memu/preprocess/document.py`. The conversation preprocessor normalizes message lists and timestamps (relevant to the `happened_at` column), while the document preprocessor delegates to the LLM via the prompt in `src/memu/prompts/preprocess/document.py` to produce a condensed version plus a one-sentence caption ([src/memu/prompts/preprocess/document.py](https://github.com/NevaMind-AI/memU/blob/main/src/memu/prompts/preprocess/document.py)).

> **Community note:** Issue [#428](https://github.com/NevaMind-AI/memU/issues/428) reports that the `memory_items.happend_at` column stores `NULL` even when messages carry proper timestamps. This is a known data-flow bug in the conversation preprocessing → extract path; verify preprocessor output before relying on `happened_at`.

### 2.2 Type-Specific Extraction

The `src/memu/prompts/memory_type/` package contains one prompt module per supported memory type. Each module follows a similar block-based structure: an `OBJECTIVE` block identifying the role (e.g. "professional User Memory Extractor"), a `WORKFLOW` block, a `RULES` block, a `CATEGORY` block listing the target categories, an `OUTPUT` block specifying the response shape, and an `EXAMPLES` block.

| Memory type | Prompt module | Extraction target | Output shape |
|---|---|---|---|
| `event` | [src/memu/prompts/memory_type/event.py](https://github.com/NevaMind-AI/memU/blob/main/src/memu/prompts/memory_type/event.py) | Time-bounded happenings involving the user (with time, place, participants) | One declarative sentence per item |
| `knowledge` | [src/memu/prompts/memory_type/knowledge.py](https://github.com/NevaMind-AI/memU/blob/main/src/memu/prompts/memory_type/knowledge.py) | Objective facts, concepts, definitions, explanations | Single-line plain text, `< 50 words` per item |
| `behavior` | [src/memu/prompts/memory_type/behavior.py](https://github.com/NevaMind-AI/memU/blob/main/src/memu/prompts/memory_type/behavior.py) | Recurring patterns, routines, solutions (not one-time events) | Single or multi-line record of a pattern |
| `skill` | [src/memu/prompts/memory_type/skill.py](https://github.com/NevaMind-AI/memU/blob/main/src/memu/prompts/memory_type/skill.py) | Actionable skill profiles with frontmatter and sectioned markdown | Full SKILL.md body, `≥ 300 words` |
| `tool` | [src/memu/prompts/memory_type/tool.py](https://github.com/NevaMind-AI/memU/blob/main/src/memu/prompts/memory_type/tool.py) | Tool usage patterns: name, use case, outcome, retrieval hint | XML `<memory>` with `<when_to_use>` |

All five type prompts share these guard rails: items must be in the same language as the source, items must be self-contained without context, identical/similar items must be merged into a single category, and forbidden content includes illegal/harmful topics, opinions without factual basis (in `knowledge`), and assistant-only turns (in `behavior`) ([src/memu/prompts/memory_type/knowledge.py](https://github.com/NevaMind-AI/memU/blob/main/src/memu/prompts/memory_type/knowledge.py), [src/memu/prompts/memory_type/behavior.py](https://github.com/NevaMind-AI/memU/blob/main/src/memu/prompts/memory_type/behavior.py)).

The `skill` prompt is the most demanding: it requires a YAML frontmatter (`name`, `description`, `category`, `demonstrated-in`) followed by `Core Principles`, `When to Use This Skill`, `Implementation Guide` (with `Prerequisites`, `Techniques and Approaches`, `Example from Resource`), `Success Patterns`, `Common Pitfalls`, and `Key Takeaways`, with a 300-word minimum to guarantee reusability ([src/memu/prompts/memory_type/skill.py](https://github.com/NevaMind-AI/memU/blob/main/src/memu/prompts/memory_type/skill.py)). The `tool` prompt produces a lighter record but adds an explicit `<when_to_use>` field so retrieval can match the right tool to the right task ([src/memu/prompts/memory_type/tool.py](https://github.com/NevaMind-AI/memU/blob/main/src/memu/prompts/memory_type/tool.py)).

### 2.3 Category Routing and Summarization

After extraction, items are placed into `MemoryCategory` folders, and `src/memu/prompts/category_summary/category.py` is invoked to maintain a per-category user markdown profile. The category prompt implements a three-step workflow: parse the initial profile and new items, perform `Update` (conflict detection, validity priority, overwrite/supplement) and `Add` (deduplication, category matching, insertion) operations, and re-render the result as a Markdown hierarchy with `H1` category titles and `H2` sub-categories ([src/memu/prompts/category_summary/category.py](https://github.com/NevaMind-AI/memU/blob/main/src/memu/prompts/category_summary/category.py)).

The prompt also enforces explicit exclusions: vague or non-user items are dropped, one-time events without long-term relevance (e.g. "ate Malatang today") are removed, and assistant-introduced content is rejected. The final output is **only** the updated Markdown profile — no explanations or operation traces.

## 3. The Retrieve Pipeline

`src/memu/app/retrieve.py` consumes a `queries` list (each a role/content pair) and a `where` filter (typically `{"user_id": "..."}`). It walks the `MemoryCategory` tree, scores items against the query embeddings, and returns only the relevant files and folder summaries as a dictionary suitable for prompt injection ([README.md](https://github.com/NevaMind-AI/memU/blob/main/README.md), [src/memu/app/retrieve.py](https://github.com/NevaMind-AI/memU/blob/main/src/memu/app/retrieve.py)). The persisted shape is identical to what `memorize()` returns, so downstream agents can treat both responses uniformly.

## 4. Common Failure Modes

Several community-reported issues map directly onto this workflow:

- **`happened_at` is NULL for conversation memories** — [issue #428](https://github.com/NevaMind-AI/memU/issues/428). The conversation preprocessor must propagate per-message timestamps into the extracted event items; check `src/memu/preprocess/conversation.py` and the `event` extractor in `src/memu/prompts/memory_type/event.py` when diagnosing.
- **SQLite backend rejects the `embedding` column** — [issue #382](https://github.com/NevaMind-AI/memU/issues/382). The SQLAlchemy type for `embedding` (a `list`) is not registered, raising `ValueError: <class 'list'> has no matching SQLAlchemy type`. The fix typically lives in the SQLite repository under `src/memu/app/crud.py`.
- **`memu-server` entry point points to a missing module** — [issue #354](https://github.com/NevaMind-AI/memU/issues/354). `pyproject.toml` declares `memu-server = "memu.server.cli:main"` but `src/memu/server/cli.py` is missing on `main`; the entry point and the module must be kept in sync.
- **Alembic URL interpolation** — fixed in v1.5.1 ([fd87ceb](https://github.com/NevaMind-AI/memU/commit/fd87ceb558eaa800aeb694b045847971958f23a3)) to use correct interpolation, which affects migration-driven schema updates for the `MemoryItem` and `MemoryCategory` tables.

## See Also

- [Storage Backends (in-memory, SQLite, Postgres)](./storage-backends.md)
- [LLM Profiles and Routing](./llm-profiles-and-routing.md)
- [Category Summary Generation](./category-summary.md)
- [Memory Types Overview](./memory-types.md)

---

<!-- evidence_pipeline_checked: true -->
<!-- evidence_injected: true -->

---

## Pitfall Log

Project: NevaMind-AI/memU

Summary: Found 10 structured pitfall item(s), including 1 high/blocking item(s). Top priority: Configuration risk - Configuration risk requires verification.

## 1. Configuration risk - Configuration risk requires verification

- Severity: high
- Evidence strength: source_linked
- Finding: Project evidence flags a configuration risk. Review the linked source before relying on this workflow.
- User impact: May increase setup, validation, or first-run risk for the user.
- Evidence: community_evidence:github | https://github.com/NevaMind-AI/memU/issues/428

## 2. Installation risk - Installation risk requires verification

- Severity: medium
- Evidence strength: source_linked
- Finding: Project evidence flags a installation risk. Review the linked source before relying on this workflow.
- User impact: May increase setup, validation, or first-run risk for the user.
- Evidence: community_evidence:github | https://github.com/NevaMind-AI/memU/issues/354

## 3. Installation risk - Installation risk requires verification

- Severity: medium
- Evidence strength: source_linked
- Finding: Project evidence flags a installation risk. Review the linked source before relying on this workflow.
- User impact: May increase setup, validation, or first-run risk for the user.
- Evidence: community_evidence:github | https://github.com/NevaMind-AI/memU/issues/382

## 4. Configuration risk - Configuration risk requires verification

- Severity: medium
- Evidence strength: source_linked
- Finding: Project evidence flags a configuration risk. Review the linked source before relying on this workflow.
- User impact: May increase setup, validation, or first-run risk for the user.
- Evidence: capability.host_targets | https://github.com/NevaMind-AI/memU

## 5. Capability evidence risk - Capability evidence risk requires verification

- Severity: medium
- Evidence strength: source_linked
- Finding: README/documentation is current enough for a first validation pass.
- User impact: May increase setup, validation, or first-run risk for the user.
- Evidence: capability.assumptions | https://github.com/NevaMind-AI/memU

## 6. Maintenance risk - Maintenance risk requires verification

- Severity: medium
- Evidence strength: source_linked
- Finding: Project evidence flags a maintenance risk. Review the linked source before relying on this workflow.
- User impact: May increase setup, validation, or first-run risk for the user.
- Evidence: evidence.maintainer_signals | https://github.com/NevaMind-AI/memU

## 7. Security or permission risk - Security or permission risk requires verification

- Severity: medium
- Evidence strength: source_linked
- Finding: no_demo
- User impact: May increase setup, validation, or first-run risk for the user.
- Evidence: downstream_validation.risk_items | https://github.com/NevaMind-AI/memU

## 8. Security or permission risk - Security or permission risk requires verification

- Severity: medium
- Evidence strength: source_linked
- Finding: no_demo
- User impact: May increase setup, validation, or first-run risk for the user.
- Evidence: risks.scoring_risks | https://github.com/NevaMind-AI/memU

## 9. Maintenance risk - Maintenance risk requires verification

- Severity: low
- Evidence strength: source_linked
- Finding: issue_or_pr_quality=unknown。
- User impact: May increase setup, validation, or first-run risk for the user.
- Evidence: evidence.maintainer_signals | https://github.com/NevaMind-AI/memU

## 10. Maintenance risk - Maintenance risk requires verification

- Severity: low
- Evidence strength: source_linked
- Finding: release_recency=unknown。
- User impact: May increase setup, validation, or first-run risk for the user.
- Evidence: evidence.maintainer_signals | https://github.com/NevaMind-AI/memU

<!-- canonical_name: NevaMind-AI/memU; human_manual_source: deepwiki_human_wiki -->